gm.data.AddSeq2SeqFields
- class gemma.gm.data.AddSeq2SeqFields(*, in_prompt: typing.Annotated[typing.Any, kauldron.kontext.annotate._KeyToken], in_response: typing.Annotated[typing.Any, kauldron.kontext.annotate._KeyToken], out_input: typing.Annotated[typing.Any, kauldron.kontext.annotate._KeyToken], out_target: typing.Annotated[typing.Any, kauldron.kontext.annotate._KeyToken], out_target_mask: typing.Annotated[typing.Any, kauldron.kontext.annotate._KeyToken])
Bases:
grain._src.core.transforms.Map
Adds the model input, target and loss_mask.
From prompt and response token ids, generate the model input, target and loss_mask.
Example:
# Input:
{
    'prompt': [10, 11, 12, 13],
    'response': [20, 21, 1],  # Here, response ends with EOS token.
}
# Output:
{
    'input':       [10, 11, 12, 13, 20, 21],
    'target':      [11, 12, 13, 20, 21,  1],
    'target_mask': [ 0,  0,  0,  1,  1,  1],
}
Note
Input and target are the same sequence shifted by one token.
The last token of the target is absent from the input, as there is no further token to predict after it.
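The shift and mask described in the note can be reproduced in a few lines of plain Python. This is only a sketch of the documented behavior, not the library's implementation; `make_seq2seq_fields` is a hypothetical helper name, and the token ids are those of the example above.

```python
def make_seq2seq_fields(prompt, response):
    """Builds (input, target, target_mask) from prompt and response token ids.

    Hypothetical helper for illustration only; not part of gemma.
    """
    sequence = list(prompt) + list(response)
    # Drop the last token from the input: nothing follows it to predict.
    model_input = sequence[:-1]
    # The target is the same sequence shifted left by one token.
    target = sequence[1:]
    # Mark response tokens with 1, then shift the mask like the target,
    # so the loss covers only positions whose target is a response token.
    full_mask = [0] * len(prompt) + [1] * len(response)
    target_mask = full_mask[1:]
    return model_input, target, target_mask


model_input, target, target_mask = make_seq2seq_fields(
    prompt=[10, 11, 12, 13],
    response=[20, 21, 1],  # Response ends with the EOS token (id 1 here).
)
print(model_input)
print(target)
print(target_mask)
```

Building the mask over the full sequence and shifting it alongside the target keeps the three outputs the same length, including when the prompt is empty.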
- in_prompt
Input key
- Type:
Any
- in_response
Input key
- Type:
Any
- out_input
Output key (will be added to the example dict)
- Type:
Any
- out_target
Output key (will be added to the example dict)
- Type:
Any
- out_target_mask
Output key (will be added to the example dict)
- Type:
Any
- map(element)
Maps a single element.