peft.QuantizationMethod
- class gemma.peft.QuantizationMethod(*values)
Bases: etils.epy.py_utils.StrEnum
Quantization methods.
- NONE
No quantization.
- INT4
4 bits per-channel.
- Q4_0
4 bits per-block.
- Q4_0_TRANSPOSE
4 bits per-block (transpose first MLP layer).
- SFP8
8-bit floating point.
- NONE = 'none'
- INT4 = 'int4'
- INT8 = 'int8'
- Q4_0 = 'q4_0'
- Q4_0_TRANSPOSE = 'q4_0_transpose'
- SFP8 = 'sfp8'
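As a rough illustration of how a string-valued enum with these members behaves, the sketch below defines a stand-in class using only the standard library. It is an assumption-laden approximation: it copies the member names and values listed above but does not import `gemma`, and it may not reproduce any extra behavior that `etils.epy.py_utils.StrEnum` adds on top of a plain string enum.

```python
import enum


class QuantizationMethod(str, enum.Enum):
    """Stand-in for illustration only; not the gemma.peft class."""

    NONE = 'none'
    INT4 = 'int4'
    INT8 = 'int8'
    Q4_0 = 'q4_0'
    Q4_0_TRANSPOSE = 'q4_0_transpose'
    SFP8 = 'sfp8'


# Look a member up from its string value, e.g. one read from a config file.
method = QuantizationMethod('q4_0')
print(method.name)        # Q4_0

# Members compare equal to their string values.
print(method == 'q4_0')   # True

# Iterating yields the members in definition order.
all_values = [m.value for m in QuantizationMethod]
```

Looking a member up by value raises `ValueError` for strings that match no member, which makes the enum a convenient validation point for user-supplied settings.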