core.training_args_base

Base Axolotl training mixins shared across the various trainer configs.

Classes

Name Description
AxolotlTrainingMixins Mixin class for the Axolotl training args.

AxolotlTrainingMixins

core.training_args_base.AxolotlTrainingMixins(
    model_type=None,
    lr_quadratic_warmup=False,
    pretraining=False,
    sample_packing=False,
    sample_packing_sequentially=False,
    sample_packing_mp_start_method=None,
    sample_packing_drop_attention_mask=False,
    multipack_real_batches=False,
    include_tkps=True,
    eval_sample_packing=None,
    sample_packing_efficiency=1.0,
    sample_packing_bin_size=200,
    sample_packing_group_size=100000,
    max_seq_length=2048,
    dataset_num_proc=None,
    relora_prune_ratio=None,
    relora_prune_method=None,
    jagged_restart_steps=None,
    jagged_restart_warmup_steps=None,
    jagged_restart_anneal_steps=None,
    bench_split='eval',
    bench_dataset='pharaouk/dharma-1/dharma_1_mini.json',
    do_bench_eval=False,
    do_causal_lm_eval=False,
    max_bench_samples=None,
    bench_source_max_len=2048,
    dataloader_prefetch_factor=None,
    cosine_min_lr_ratio=None,
    cosine_constant_lr_ratio=None,
    loraplus_lr_ratio=None,
    loraplus_lr_embedding=1e-06,
    embedding_lr_scale=None,
    lr_groups=None,
    embedding_lr=None,
    qlora=False,
    orpo_alpha=None,
    lisa_n_layers=None,
    lisa_step_interval=None,
    lisa_layers_attribute=None,
    curriculum_sampling=None,
    alternate_lr_scheduler_type=None,
    chat_template=None,
    adam_beta3=None,
    adam_epsilon2=None,
    activation_offloading=None,
    layer_offloading=None,
    image_size=None,
    image_resize_algorithm=None,
    dion_learning_rate=None,
    dion_momentum=None,
    dion_rank_fraction=None,
    dion_rank_multiple_of=None,
)

Mixin class for the Axolotl training args.
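
The mixin pattern itself can be illustrated with a minimal, self-contained sketch. It assumes the mixin is a dataclass whose fields all carry defaults and that it is combined with a trainer's argument class through multiple inheritance. `ToyTrainingMixins`, `ToyBaseArgs`, and `ToyTrainingArguments` are hypothetical stand-ins, not Axolotl classes; only the four mixin field names and their defaults are copied from the signature above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ToyTrainingMixins:
    """Hypothetical stand-in for the mixin, with four fields from the signature."""

    sample_packing: bool = False
    sample_packing_efficiency: float = 1.0
    max_seq_length: int = 2048
    cosine_min_lr_ratio: Optional[float] = None


@dataclass
class ToyBaseArgs:
    """Hypothetical stand-in for a trainer's own argument class."""

    output_dir: str = "out"
    learning_rate: float = 5e-5


@dataclass
class ToyTrainingArguments(ToyTrainingMixins, ToyBaseArgs):
    """Concrete argument class: base fields plus the shared mixin fields."""


# Override one base field and one mixin field; everything else keeps its default.
args = ToyTrainingArguments(output_dir="runs/demo", sample_packing=True)
field_names = [f.name for f in fields(args)]

print(args.sample_packing, args.max_seq_length)
print(sorted(field_names))
```

Because every field has a default, the same mixin can be layered onto several trainer argument classes without changing how those classes are constructed, which is the kind of sharing the module description refers to.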