loaders.model

Model loader class implementation for loading, configuring, and patching various models.

Classes

Name	Description
ModelLoader	Manages model configuration, initialization and application of patches during model loading.

ModelLoader

loaders.model.ModelLoader(
    cfg,
    tokenizer,
    *,
    inference=False,
    reference_model=False,
    **kwargs,
)

Manages model configuration, initialization and application of patches during model loading.

This class orchestrates the entire process of loading a model from configuration to final preparation. It handles device mapping, quantization, attention mechanisms, adapter integration, and various optimizations.

The loading process includes:

- Loading and validating model configuration
- Applying monkey patches for optimizations / fixes
- Setting up device mapping (including multi-GPU configurations)
- Configuring quantization
- Setting attention mechanisms (Flash Attention, SDPA, etc.)
- Loading and initializing the model
- Applying adapters (LoRA, QLoRA, etc.)
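
The sequence above can be pictured with a small stand-in class. This is a minimal sketch, not the library's implementation: `SketchLoader`, its `_step` helper, the step names, and the `adapter` key in the toy `cfg` dict are all hypothetical, and each step only records that it ran. The sketch mirrors the documented constructor signature and the documented return shape of `load()`.

```python
class SketchLoader:
    """Hypothetical stand-in mirroring the documented ModelLoader interface."""

    def __init__(self, cfg, tokenizer, *, inference=False, reference_model=False, **kwargs):
        self.cfg = cfg
        self.tokenizer = tokenizer
        self.inference = inference
        self.reference_model = reference_model
        self.model_kwargs = {}  # would hold kwargs passed to model initialization
        self.model = None       # only available after load() is called
        self.steps_run = []     # bookkeeping for this sketch only

    def _step(self, name):
        # Placeholder: a real loader would do the actual work here.
        self.steps_run.append(name)

    def load(self):
        # Run the documented stages in the documented order.
        for name in (
            "load_config",
            "apply_patches",
            "set_device_map",
            "set_quantization",
            "set_attention",
            "build_model",
        ):
            self._step(name)
        self.model = object()  # stands in for a PreTrainedModel

        # Adapters come last; the "adapter" key is an assumed toy config field.
        peft_config = None
        if self.cfg.get("adapter"):
            self._step("apply_adapter")
            peft_config = {"adapter": self.cfg["adapter"]}
        return self.model, peft_config


loader = SketchLoader({"adapter": "lora"}, tokenizer=None, inference=True)
model, peft_config = loader.load()
print(loader.steps_run)
print(peft_config)
```

Because of the bare `*` in the signature, `inference` and `reference_model` must be passed by keyword; a call such as `SketchLoader(cfg, tokenizer, True)` raises `TypeError`.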

Attributes

Name	Type	Description
model	PreTrainedModel | PeftModel | PeftMixedModel	The loaded model instance (available after `load()` is called).
model_kwargs	dict[str, Any]	Dictionary of keyword arguments passed to model initialization.
base_model		Name or path of the base model to load.
model_type		Type of model to load (e.g., `AutoModelForCausalLM`).
model_config		Configuration object for the model.
auto_model_loader		Class used for loading the model (default: `AutoModelForCausalLM`).

Methods

Name	Description
load	Load and prepare the model with all configurations and patches.

load

loaders.model.ModelLoader.load()

Load and prepare the model with all configurations and patches.

Returns

Name	Type	Description
	tuple[PreTrainedModel | PeftModelForCausalLM, PeftConfig | None]	A tuple of the loaded model and its LoRA configuration, or `None` in place of the configuration when no adapter applies.
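
Since the second element of the returned tuple may be `None`, callers need to unpack the tuple and check it before using the adapter configuration. The helper below is a minimal sketch of that pattern: `describe_load_result`, `FakeModel`, and `FakeLoraConfig` are hypothetical names used only for illustration, and the fake classes stand in for real model and configuration objects.

```python
def describe_load_result(result):
    """Summarize a (model, peft_config) tuple shaped like the one load() returns."""
    model, peft_config = result
    model_name = type(model).__name__
    if peft_config is None:
        # No adapter was applied, so there is no configuration to inspect.
        return f"{model_name} loaded without an adapter"
    return f"{model_name} loaded with {type(peft_config).__name__}"


class FakeModel:
    """Hypothetical placeholder for a loaded model."""


class FakeLoraConfig:
    """Hypothetical placeholder for a PeftConfig instance."""


# With the real class, the tuple would come from a call of the documented form:
#   model, peft_config = ModelLoader(cfg, tokenizer).load()
print(describe_load_result((FakeModel(), None)))
print(describe_load_result((FakeModel(), FakeLoraConfig())))
```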