loaders.model
Model loader class implementation for loading, configuring, and patching various models.
Classes
| Name | Description |
|---|---|
| ModelLoader | Manages model configuration, initialization and application of patches during model loading. |
ModelLoader
loaders.model.ModelLoader(
    cfg,
    tokenizer,
    *,
    inference=False,
    reference_model=False,
    **kwargs,
)
Manages model configuration, initialization and application of patches during model loading.
This class orchestrates the entire process of loading a model from configuration to final preparation. It handles device mapping, quantization, attention mechanisms, adapter integration, and various optimizations.
The loading process includes:
- Loading and validating model configuration
- Applying monkey patches for optimizations / fixes
- Setting up device mapping (including multi-GPU configurations)
- Configuring quantization
- Setting attention mechanisms (Flash Attention, SDPA, etc.)
- Loading and initializing the model
- Applying adapters (LoRA, QLoRA, etc.)
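A minimal usage sketch follows. It assumes a validated configuration object and a tokenizer are built elsewhere; `build_cfg` and the config file path are hypothetical placeholders, not part of this module.

```python
from transformers import AutoTokenizer

from loaders.model import ModelLoader

# Placeholder: building the configuration object is outside this module;
# `build_cfg` is a hypothetical helper standing in for the project's config loader.
cfg = build_cfg("examples/config.yaml")

# Tokenizer for the same base model the loader will load.
tokenizer = AutoTokenizer.from_pretrained(cfg.base_model)

# `inference` and `reference_model` are keyword-only flags (see the signature above).
loader = ModelLoader(cfg, tokenizer, inference=False, reference_model=False)
```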
Attributes
| Name | Type | Description |
|---|---|---|
| model | PreTrainedModel \| PeftModel \| PeftMixedModel | The loaded model instance (available after load() is called). |
| model_kwargs | dict[str, Any] | Dictionary of keyword arguments passed to model initialization. |
| base_model | | Name or path of the base model to load. |
| model_type | | Type of model to load (e.g., AutoModelForCausalLM). |
| model_config | | Configuration object for the model. |
| auto_model_loader | | Class used for loading the model (default: AutoModelForCausalLM). |
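Continuing the sketch above, the attributes listed in the table can be inspected directly on the loader instance; the printed values are illustrative only:

```python
# Presumably populated from the configuration when the loader is constructed
# (exact timing is an assumption).
print(loader.base_model)         # e.g. "meta-llama/Llama-2-7b-hf"
print(loader.model_type)         # e.g. "AutoModelForCausalLM"
print(loader.auto_model_loader)  # defaults to AutoModelForCausalLM

# `loader.model` only exists after `load()` has been called (see Methods below).
```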
Methods
| Name | Description |
|---|---|
| load | Load and prepare the model with all configurations and patches. |
load
loaders.model.ModelLoader.load()
Load and prepare the model with all configurations and patches.
Returns
| Name | Type | Description |
|---|---|---|
| | tuple[PreTrainedModel \| PeftModelForCausalLM, PeftConfig \| None] | A tuple with the loaded model and its LoRA configuration (if applicable). |
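Continuing the sketch above, a hedged example of calling load() and unpacking its return value:

```python
# Runs the full pipeline described earlier: patches, device mapping,
# quantization, attention setup, and adapter application.
model, lora_config = loader.load()

# `lora_config` is None when no LoRA/QLoRA adapter is configured.
if lora_config is not None:
    print("Loaded PEFT model with adapter config:", lora_config)
else:
    print("Loaded base model:", type(model).__name__)
```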