peft.QuantizationMethod

class gemma.peft.QuantizationMethod(*values)

Bases: etils.epy.py_utils.StrEnum

Quantization methods.

NONE

No quantization.

INT4

4-bit, per-channel quantization.

Q4_0

4-bit, per-block quantization.

Q4_0_TRANSPOSE

4-bit, per-block quantization (transpose first MLP layer).

SFP8

8-bit floating point.

NONE = 'none'
INT4 = 'int4'
INT8 = 'int8'
Q4_0 = 'q4_0'
Q4_0_TRANSPOSE = 'q4_0_transpose'
SFP8 = 'sfp8'
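
Because the members are string-valued, a method can be looked up from its string value, for example when it comes from a config file or a command-line flag. The sketch below illustrates that pattern with a local stand-in built on the standard library `enum` module. It mirrors the values listed above but is not the `gemma` class itself, and `parse_method` is a hypothetical helper introduced only for this example. Any extra behavior provided by `etils.epy.py_utils.StrEnum` is not modeled here.

```python
import enum


class QuantizationMethod(str, enum.Enum):
    """Local stand-in mirroring the values listed above (not the gemma class)."""

    NONE = 'none'
    INT4 = 'int4'
    INT8 = 'int8'
    Q4_0 = 'q4_0'
    Q4_0_TRANSPOSE = 'q4_0_transpose'
    SFP8 = 'sfp8'


def parse_method(name: str) -> QuantizationMethod:
    """Hypothetical helper: look up a member by its string value."""
    try:
        return QuantizationMethod(name)
    except ValueError:
        # Report the accepted values instead of the bare enum error.
        valid = ', '.join(m.value for m in QuantizationMethod)
        raise ValueError(
            f'Unknown quantization method {name!r}; expected one of: {valid}'
        ) from None


method = parse_method('q4_0')
print(method.name)   # member name
print(method.value)  # underlying string value
```

With the real class, the same value lookup would presumably be written as `gemma.peft.QuantizationMethod('q4_0')`, assuming the package is installed and importable under that path.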