# gm

[[[Source]]](https://github.com/google-deepmind/gemma/tree/main/gemma/gm/__init__.py)

```{code-block}
from gemma import gm
```

```{eval-rst}
.. automodule:: gemma.gm
  :no-members:
```

## All symbols


### Module

|  |  |
--- | ---
[gm](index) | Kauldron API for Gemma.
[gm.ckpts](ckpts/index) | Checkpoints API.
[gm.data](data/index) | Data pipeline ops.
[gm.evals](evals/index) | Evaluators for Gemma.
[gm.losses](losses/index) | Losses.
[gm.math](math/index) | Math utils (attention masks, positional embeddings, ...).
[gm.nn](nn/index) | Gemma models.
[gm.nn.config](nn/config/index) | Symbols needed to build new `TransformerConfig`.
[gm.sharding](sharding/index) | Sharding utilities.
[gm.testing](testing/index) | Testing utilities (dummy models, tokenizer, ...).
[gm.text](text/index) | Text processing utilities.
[gm.tools](tools/index) | Tools.
[gm.typing](typing/index) | Common types for Gemma.

### Class

|  |  |
--- | ---
[gm.ckpts.AnchoredPolicyLoader](ckpts/AnchoredPolicyLoader) | Loader for `gm.nn.AnchoredPolicy` models.
[gm.ckpts.CheckpointPath](ckpts/CheckpointPath) | Hardcoded paths to Gemma checkpoints.
[gm.ckpts.LoadCheckpoint](ckpts/LoadCheckpoint) | Loads weights from a Gemma checkpoint.
[gm.ckpts.SkipLoRA](ckpts/SkipLoRA) | Wraps a partial loader to not restore the LoRA weights.
[gm.data.AddSeq2SeqFields](data/AddSeq2SeqFields) | Adds the model `input`, `target` and `loss_mask`.
[gm.data.ContrastiveTask](data/ContrastiveTask) | Creates the contrastive model inputs for DPO-like loss.
[gm.data.DecodeBytes](data/DecodeBytes) | Decode `bytes` to `str`.
[gm.data.FormatText](data/FormatText) | Equivalent to `template.format(text=my_string)`.
[gm.data.MapInts](data/MapInts) | Replace each int by a new value.
[gm.data.Pad](data/Pad) | Add zeros to the end of the sequence to reach the max length.
[gm.data.Parquet](data/Parquet) | Parquet data source (`path`, `shuffle`, `batch_size`, `seed`, `transforms`, `num_epochs`, `num_workers`, ...).
[gm.data.Seq2SeqTask](data/Seq2SeqTask) | Sequence-to-sequence task.
[gm.data.Tokenize](data/Tokenize) | Tokenize a string to ids.
[gm.evals.SamplerEvaluator](evals/SamplerEvaluator) | Sampling evaluator.
[gm.losses.DpoLoss](losses/DpoLoss) | DPO loss.
[gm.losses.NpoLoss](losses/NpoLoss) | NPO loss.
[gm.nn.AnchoredPolicy](nn/AnchoredPolicy) | Wrapper around a model to compute policy and anchor outputs.
[gm.nn.AnchoredPolicyOutput](nn/AnchoredPolicyOutput) | Output of the `gm.nn.AnchoredPolicy`.
[gm.nn.Attention](nn/Attention) | Attention module.
[gm.nn.AttentionType](nn/AttentionType) | 
[gm.nn.Block](nn/Block) | Transformer block.
[gm.nn.Einsum](nn/Einsum) | Einsum is a convenience module for parameterized tensor multiplication.
[gm.nn.Embedder](nn/Embedder) | Embedder module.
[gm.nn.FeedForward](nn/FeedForward) | Feed forward module.
[gm.nn.Gemma2_27B](nn/Gemma2_27B) | Gemma2 transformer architecture.
[gm.nn.Gemma2_2B](nn/Gemma2_2B) | Gemma2 transformer architecture.
[gm.nn.Gemma2_9B](nn/Gemma2_9B) | Gemma2 transformer architecture.
[gm.nn.Gemma3_12B](nn/Gemma3_12B) | Gemma3 transformer architecture.
[gm.nn.Gemma3_1B](nn/Gemma3_1B) | Gemma3 transformer architecture.
[gm.nn.Gemma3_270M](nn/Gemma3_270M) | Gemma3 transformer architecture.
[gm.nn.Gemma3_27B](nn/Gemma3_27B) | Gemma3 transformer architecture.
[gm.nn.Gemma3_4B](nn/Gemma3_4B) | Gemma3 transformer architecture.
[gm.nn.Gemma3n_E2B](nn/Gemma3n_E2B) | Gemma3n E2B transformer architecture.
[gm.nn.Gemma3n_E4B](nn/Gemma3n_E4B) | Gemma3n E4B transformer architecture.
[gm.nn.Gemma4_26B_A4B](nn/Gemma4_26B_A4B) | Gemma 4 26B_A4B MoE model.
[gm.nn.Gemma4_31B](nn/Gemma4_31B) | Gemma 4 31B model.
[gm.nn.Gemma4_E2B](nn/Gemma4_E2B) | Gemma 4 E2B model.
[gm.nn.Gemma4_E4B](nn/Gemma4_E4B) | Gemma 4 E4B model.
[gm.nn.IntWrapper](nn/IntWrapper) | Wrapper around a Gemma model to enable int4 inference.
[gm.nn.LoRA](nn/LoRA) | Wrapper around a Gemma model to enable LoRA.
[gm.nn.Output](nn/Output) | Output of the Gemma model.
[gm.nn.QuantizationAwareWrapper](nn/QuantizationAwareWrapper) | Wrapper around a Gemma model to enable quantization aware training.
[gm.nn.RMSNorm](nn/RMSNorm) | RMSNorm layer.
[gm.nn.SigLiPFromPatches](nn/SigLiPFromPatches) | SigLIP vision encoder forward pass from PatchifiedMedia.
[gm.nn.Transformer](nn/Transformer) | Base transformer class.
[gm.nn.TransformerLike](nn/TransformerLike) | Protocol for a transformer model to be used with a Sampler.
[gm.testing.DummyGemma](testing/DummyGemma) | Dummy transformer architecture, for testing.
[gm.testing.DummyTokenizer](testing/DummyTokenizer) | Dummy tokenizer.
[gm.text.ChatSampler](text/ChatSampler) | Chat sampler.
[gm.text.Gemma2Tokenizer](text/Gemma2Tokenizer) | Tokenizer for Gemma 2.
[gm.text.Gemma3Tokenizer](text/Gemma3Tokenizer) | Tokenizer for Gemma 3.
[gm.text.Gemma3nTokenizer](text/Gemma3nTokenizer) | Tokenizer for Gemma3n.
[gm.text.Gemma4Sampler](text/Gemma4Sampler) | Stateless sampler for Gemma4 with variable-aspect-ratio image support.
[gm.text.Gemma4Tokenizer](text/Gemma4Tokenizer) | Tokenizer for Gemma 4.
[gm.text.Greedy](text/Greedy) | Greedy sampling.
[gm.text.RandomSampling](text/RandomSampling) | Simple random sampling.
[gm.text.Sampler](text/Sampler) | Sampler.
[gm.text.SamplingMethod](text/SamplingMethod) | Base class for sampling methods.
[gm.text.SpecialTokens](text/SpecialTokens) | Special token ids.
[gm.text.Tokenizer](text/Tokenizer) | Base class for tokenizers.
[gm.text.ToolSampler](text/ToolSampler) | Sampler with tool support.
[gm.text.TopPSampling](text/TopPSampling) | Top-p (Nucleus) Sampling.
[gm.text.TopkSampling](text/TopkSampling) | Top-k sampling.
[gm.tools.McpToolHandler](tools/McpToolHandler) | MCP tool handler.
[gm.tools.ToolHandlerBase](tools/ToolHandlerBase) | Base class to orchestrate tools.

### Function

|  |  |
--- | ---
[gm.ckpts.load_params](ckpts/load_params) | Restore the params from a checkpoint.
[gm.ckpts.save_params](ckpts/save_params) | Save the params to a checkpoint.
[gm.data.make_seq2seq_fields](data/make_seq2seq_fields) | Create the model `input`, `target` and `loss_mask`.
[gm.data.pad](data/pad) | Add zeros to the end of the sequence to reach the max length.
[gm.math.apply_rope](math/apply_rope) | Applies RoPE.
[gm.math.count_consecutive](math/count_consecutive) | Counts consecutive identical elements in a list.
[gm.testing.use_hermetic_tokenizer](testing/use_hermetic_tokenizer) | Use the local tokenizer, to avoid TFHub calls.
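
The padding behaviour described for `gm.data.pad` ("add zeros to the end of the sequence to reach the max length") can be illustrated with a minimal, dependency-free sketch. The helper below is hypothetical and is not the library's implementation: the name `pad_to_length`, its signature, and the choice to raise on over-long inputs are all assumptions made for illustration. See the [gm.data.pad](data/pad) page for the actual signature and options.

```python
from typing import Sequence


def pad_to_length(ids: Sequence[int], max_length: int) -> list[int]:
  """Hypothetical helper: right-pads `ids` with zeros up to `max_length`.

  This is an illustration only, not the `gm.data.pad` implementation.
  """
  if len(ids) > max_length:
    # Assumption: over-long inputs are rejected rather than truncated.
    raise ValueError(
        f'Sequence of length {len(ids)} exceeds max_length={max_length}.'
    )
  # Append as many zeros as needed to reach the target length.
  return list(ids) + [0] * (max_length - len(ids))


padded = pad_to_length([5, 8, 13], max_length=6)
print(padded)  # [5, 8, 13, 0, 0, 0]
```

Padding every example to a fixed length is what lets variable-length token sequences be stacked into a single batch.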

### Typing

|  |  |
--- | ---
[gm.nn.Cache](nn/Cache) | 
[gm.typing.Params](typing/Params) | 


```{toctree}
:hidden:

ckpts/index
data/index
evals/index
losses/index
math/index
nn/index
sharding/index
testing/index
text/index
tools/index
typing/index
```