gm
from gemma import gm
Kauldron API for Gemma.
All symbols
Module
Kauldron API for Gemma. |
|
Checkpoints API. |
|
Data pipeline ops. |
|
Evaluators for Gemma. |
|
Losses. |
|
Math utils (attention masks, positional embeddings, …). |
|
Gemma models. |
|
Symbols needed to build new |
|
Sharding utilities. |
|
Testing utilities (dummy models, tokenizer, …). |
|
Text processing utilities. |
|
Tools. |
|
Common types for Gemma. |
Class
Loader for |
|
Hardcoded paths to Gemma checkpoints. |
|
Loads weights from a Gemma checkpoint. |
|
Wraps a partial loader to not restore the LoRA weights. |
|
Adds the model |
|
Creates the contrastive model inputs for DPO-like loss. |
|
Decode |
|
Equivalent to |
|
Replace each int by a new value. |
|
Add zeros to the end of the sequence to reach the max length. |
|
Parquet(*, _fake_refs: ‘type[_FakeRefsUnset] |
|
Sequence-to-sequence task. |
|
Tokenize a string into token IDs. |
|
Sampling evaluator. |
|
DPO loss. |
|
NPO loss. |
|
Wrapper around a model to compute policy and anchor outputs. |
|
Output of the |
|
Attention module. |
|
Transformer block. |
|
Einsum is a convenience module for parameterized tensor multiplication. |
|
Embedder module. |
|
Feed forward module. |
|
Gemma2 transformer architecture. |
|
Gemma2 transformer architecture. |
|
Gemma2 transformer architecture. |
|
Gemma3 transformer architecture. |
|
Gemma3 transformer architecture. |
|
Gemma3 transformer architecture. |
|
Gemma3 transformer architecture. |
|
Gemma3 transformer architecture. |
|
Gemma3n E2B transformer architecture. |
|
Gemma3n E4B transformer architecture. |
|
Gemma 4 26B_A4B MoE model. |
|
Gemma 4 31B model. |
|
Gemma 4 E2B model. |
|
Gemma 4 E4B model. |
|
Wrapper around a Gemma model to enable int4 inference. |
|
Wrapper around a Gemma model to enable LoRA. |
|
Output of the Gemma model. |
|
Wrapper around a Gemma model to enable quantization aware training. |
|
RMSNorm layer. |
|
SigLIP vision encoder forward pass from PatchifiedMedia. |
|
Base transformer class. |
|
Protocol for a transformer model to be used with a Sampler. |
|
Dummy transformer architecture, for testing. |
|
Dummy tokenizer. |
|
Chat sampler. |
|
Tokenizer for Gemma 2. |
|
Tokenizer for Gemma 3. |
|
Tokenizer for Gemma3n. |
|
Stateless sampler for Gemma4 with variable-aspect-ratio image support. |
|
Tokenizer for Gemma 4. |
|
Greedy sampling. |
|
Simple random sampling. |
|
Sampler. |
|
Base class for sampling methods. |
|
Special token IDs. |
|
Base class for tokenizers. |
|
Sampler with tool support. |
|
Top-p (nucleus) sampling. |
|
Top-k sampling. |
|
MCP tool handler. |
|
Base class to orchestrate tools. |
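
The class table above lists several sampling methods (greedy, simple random, top-k, and top-p). As a rough illustration of how these techniques differ, here is a minimal pure-Python sketch. It is not the library's implementation: the function names (`softmax`, `greedy`, `top_k`, `top_p`) and the list-of-floats logits format are assumptions made for this example only.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(logits):
    """Greedy sampling: always pick the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def top_k(logits, k, rng):
    """Top-k sampling: sample among the k highest-scoring tokens."""
    kept = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in kept])
    return rng.choices(kept, weights=probs, k=1)[0]


def top_p(logits, p, rng):
    """Top-p (nucleus) sampling: sample from the smallest set of tokens
    whose cumulative probability reaches p."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    # random.choices accepts unnormalized weights.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


# Hypothetical logits over a 3-token vocabulary.
example_logits = [0.1, 2.0, -1.0]
rng = random.Random(0)
print(greedy(example_logits))
print(top_k(example_logits, k=2, rng=rng))
print(top_p(example_logits, p=0.5, rng=rng))
```

With `k=1`, or with a small `p`, both truncated samplers reduce to greedy decoding, which is a quick sanity check when comparing them.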
Function
Restore the params from a checkpoint. |
|
Save the params to a checkpoint. |
|
Create the model |
|
Add zeros to the end of the sequence to reach the max length. |
|
Applies RoPE. |
|
Counts consecutive identical elements in a list. |
|
Use the local tokenizer, to avoid TFHub calls. |
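
The padding entry in the function table above ("Add zeros to the end of the sequence to reach the max length") can be sketched in a few lines. This is only an illustration of the idea on plain Python lists: the name `pad_to_length` and the `truncate` flag are hypothetical, not the library's signature.

```python
def pad_to_length(ids, max_length, *, truncate=False):
    """Right-pad a token-id sequence with zeros up to max_length.

    Assumption for this sketch: 0 is the padding id, and sequences longer
    than max_length raise an error unless truncate=True.
    """
    if len(ids) > max_length:
        if not truncate:
            raise ValueError(
                f"Sequence of length {len(ids)} exceeds max_length={max_length}."
            )
        return list(ids[:max_length])
    return list(ids) + [0] * (max_length - len(ids))


print(pad_to_length([5, 6, 7], 5))                  # [5, 6, 7, 0, 0]
print(pad_to_length([5, 6, 7], 2, truncate=True))   # [5, 6]
```

Padding to a fixed length keeps every example in a batch the same shape, which is what allows sequences of different lengths to be stacked into a single array.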