gm.nn.Block

class gemma.gm.nn.Block(
num_heads: int,
num_kv_heads: int,
embed_dim: int,
head_dim: int,
hidden_dim: int,
use_post_attn_norm: bool,
use_post_ffw_norm: bool,
attn_type: gemma.gm.nn._modules.AttentionType,
query_pre_attn_scalar: float,
transpose_gating_einsum: bool,
rope_base_frequency: int = 10000,
rope_scale_factor: float = 1.0,
attn_logits_soft_cap: float | None = None,
sliding_window_size: int | None = None,
use_qk_norm: bool = False,
parent: flax.linen.module.Module | flax.core.scope.Scope | flax.linen.module._Sentinel | None = <flax.linen.module._Sentinel object>,
name: str | None = None,
)

Bases: flax.linen.module.Module

Transformer block.

num_heads: int
num_kv_heads: int
embed_dim: int
head_dim: int
hidden_dim: int
use_post_attn_norm: bool
use_post_ffw_norm: bool
attn_type: gemma.gm.nn._modules.AttentionType
query_pre_attn_scalar: float
transpose_gating_einsum: bool
rope_base_frequency: int = 10000
rope_scale_factor: float = 1.0
attn_logits_soft_cap: float | None = None
sliding_window_size: int | None = None
use_qk_norm: bool = False
setup()

Initializes a Module lazily (similar to a lazy __init__).

setup is called once lazily on a module instance when a module is bound, immediately before any other methods like __call__ are invoked, or before a setup-defined attribute on self is accessed.

This can happen in three cases:

  1. Immediately when invoking apply(), init() or init_and_output().

  2. Once the module is given a name by being assigned to an attribute of another module inside the other module’s setup method (see __setattr__()):

    >>> class MyModule(nn.Module):
    ...   def setup(self):
    ...     submodule = nn.Conv(...)
    ...
    ...     # Accessing `submodule` attributes does not yet work here.
    ...
    ...     # The following line invokes `self.__setattr__`, which gives
    ...     # `submodule` the name "conv1".
    ...     self.conv1 = submodule
    ...
    ...     # Accessing `submodule` attributes or methods is now safe and
    ...     # causes setup() to be called once.
    
  3. Once a module is constructed inside a method wrapped with compact(), immediately before another method is called or a setup-defined attribute is accessed.

name: str | None = None
parent: flax.linen.module.Module | flax.core.scope.Scope | flax.linen.module._Sentinel | None = None
scope: flax.core.scope.Scope | None = None