gm.nn.QuantizationAwareWrapper

class gemma.gm.nn.QuantizationAwareWrapper(
*,
method: gemma.peft._quantization_utils.QuantizationMethod = QuantizationMethod.NONE,
model: flax.linen.module.Module,
parent: flax.linen.module.Module | flax.core.scope.Scope | flax.linen.module._Sentinel | None = <flax.linen.module._Sentinel object>,
name: str | None = None,
)

Bases: flax.linen.module.Module

Wrapper around a Gemma model that enables quantization-aware training.

Every nn.Dense, nn.Einsum, … layer in the wrapped model is replaced by its quantization-aware training counterpart. See the gemma.peft documentation for more details.
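To give a feel for what a quantization-aware layer does in its forward pass, here is a generic, dependency-free sketch of "fake quantization": weights are rounded to a low-precision integer grid and immediately mapped back to floats, so training sees the rounding error. This is not the gemma.peft implementation. The function name `fake_quantize`, the symmetric per-tensor scaling, and the 4-bit default are all assumptions chosen for illustration.

```python
def fake_quantize(weights, num_bits=4):
    """Quantize-dequantize a list of floats on a symmetric signed integer grid.

    Hypothetical helper for illustration only; not part of gemma.
    """
    qmax = 2 ** (num_bits - 1) - 1          # largest integer level, e.g. 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)                # nothing to scale
    scale = max_abs / qmax                  # float value of one integer step
    quantized = []
    for w in weights:
        level = round(w / scale)            # nearest integer level
        level = max(-qmax, min(qmax, level))  # clamp to the representable range
        quantized.append(level * scale)     # map back to float
    return quantized


weights = [0.7, -0.33, 0.12, 0.0]
quantized = fake_quantize(weights)
# Each value is snapped to the nearest multiple of the step size (about 0.1 here).
print(quantized)
```

In actual quantization-aware training the rounding step is usually paired with a gradient approximation such as a straight-through estimator, which this sketch leaves out.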

method
    The quantization method to use.
    Type: gemma.peft._quantization_utils.QuantizationMethod

model
    The model to wrap.
    Type: flax.linen.module.Module

method: gemma.peft._quantization_utils.QuantizationMethod = 'none'
model: flax.linen.module.Module
name: str | None = None
parent: flax.linen.module.Module | flax.core.scope.Scope | flax.linen.module._Sentinel | None = None
scope: flax.core.scope.Scope | None = None