core.training_args_base
Base Axolotl training mixins shared across the various trainer configs.
Classes
| Name | Description |
|---|---|
| AxolotlTrainingMixins | Mixin class for the Axolotl training args. |
AxolotlTrainingMixins
core.training_args_base.AxolotlTrainingMixins(
model_type=None,
lr_quadratic_warmup=False,
pretraining=False,
sample_packing=False,
sample_packing_sequentially=False,
sample_packing_mp_start_method=None,
sample_packing_drop_attention_mask=False,
multipack_real_batches=False,
include_tkps=True,
eval_sample_packing=None,
sample_packing_efficiency=1.0,
sample_packing_bin_size=200,
sample_packing_group_size=100000,
max_seq_length=2048,
dataset_num_proc=None,
relora_prune_ratio=None,
relora_prune_method=None,
jagged_restart_steps=None,
jagged_restart_warmup_steps=None,
jagged_restart_anneal_steps=None,
bench_split='eval',
bench_dataset='pharaouk/dharma-1/dharma_1_mini.json',
do_bench_eval=False,
do_causal_lm_eval=False,
max_bench_samples=None,
bench_source_max_len=2048,
dataloader_prefetch_factor=None,
cosine_min_lr_ratio=None,
cosine_constant_lr_ratio=None,
loraplus_lr_ratio=None,
loraplus_lr_embedding=1e-06,
embedding_lr_scale=None,
lr_groups=None,
embedding_lr=None,
qlora=False,
orpo_alpha=None,
lisa_n_layers=None,
lisa_step_interval=None,
lisa_layers_attribute=None,
curriculum_sampling=None,
alternate_lr_scheduler_type=None,
chat_template=None,
adam_beta3=None,
adam_epsilon2=None,
activation_offloading=None,
layer_offloading=None,
image_size=None,
image_resize_algorithm=None,
dion_learning_rate=None,
dion_momentum=None,
dion_rank_fraction=None,
dion_rank_multiple_of=None,
)
Mixin class for the Axolotl training args.
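The sketch below illustrates how a mixin of this kind might be shared by a trainer config. It does not import Axolotl: `TrainingMixinSketch`, `BaseArgsSketch`, and `CombinedArgsSketch` are hypothetical stand-ins, and it assumes the mixin is a dataclass-style class. Only a few fields from the signature above are reproduced, with the same defaults; the type annotations are assumptions inferred from those defaults.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class TrainingMixinSketch:
    """Hypothetical stand-in for AxolotlTrainingMixins (a few fields only)."""

    # Defaults copied from the signature above; types are assumed.
    sample_packing: bool = False
    sample_packing_efficiency: float = 1.0
    sample_packing_bin_size: int = 200
    max_seq_length: int = 2048
    cosine_min_lr_ratio: Optional[float] = None


@dataclass
class BaseArgsSketch:
    """Hypothetical stand-in for a trainer's own argument class."""

    output_dir: str = "./out"
    learning_rate: float = 5e-5


@dataclass
class CombinedArgsSketch(TrainingMixinSketch, BaseArgsSketch):
    """Trainer config that inherits the shared mixin fields."""


# Override two mixin fields; every other field keeps its default.
args = CombinedArgsSketch(sample_packing=True, max_seq_length=4096)
field_names = [f.name for f in fields(args)]
```

Because every field carries a default, the mixin can be placed ahead of the base class in the inheritance list without violating the dataclass rule that fields without defaults cannot follow fields with defaults.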