cli.utils.lora_merge
Functions
| Name | Description |
|---|---|
| copy_non_model_files | Copy all non-model files to the output directory. |
| find_lora_weights | Find corresponding LoRA A and B weights for a given key. |
| get_model_shards | Find all model shards in the given path. |
| merge_lora_sharded_efficient | Memory-efficient LoRA merging that processes shards individually without loading the full model into memory. |
copy_non_model_files
cli.utils.lora_merge.copy_non_model_files(input_path, output_path, model_shards)

Copy all non-model files to the output directory.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| input_path | Path | Source directory | required |
| output_path | Path | Destination directory | required |
| model_shards | list[Path] | List of model shard files to skip | required |
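The behavior described by these parameters can be sketched as follows. This is a minimal illustration, not the library's implementation: `copy_non_model_files_sketch` is a hypothetical name, and the sketch assumes that only top-level files are copied and that shards are matched by file name.

```python
import shutil
import tempfile
from pathlib import Path


def copy_non_model_files_sketch(
    input_path: Path, output_path: Path, model_shards: list[Path]
) -> list[str]:
    """Copy every top-level file that is not a model shard; return the copied names."""
    output_path.mkdir(parents=True, exist_ok=True)
    shard_names = {shard.name for shard in model_shards}
    copied = []
    for item in sorted(input_path.iterdir()):
        # Assumption: subdirectories are ignored and shards are skipped by name.
        if item.is_file() and item.name not in shard_names:
            shutil.copy2(item, output_path / item.name)
            copied.append(item.name)
    return copied


# Build a throwaway model directory with two config files and one (empty) shard.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp) / "base", Path(tmp) / "merged"
    src.mkdir()
    (src / "config.json").write_text("{}")
    (src / "tokenizer.json").write_text("{}")
    (src / "model-00001-of-00001.safetensors").write_bytes(b"")
    shards = [src / "model-00001-of-00001.safetensors"]
    copied_files = copy_non_model_files_sketch(src, dst, shards)
    shard_was_copied = (dst / "model-00001-of-00001.safetensors").exists()

print(copied_files)
```

In this toy run the tokenizer and config files reach the destination while the weight shard is left for the merge step to write.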
find_lora_weights
cli.utils.lora_merge.find_lora_weights(lora_state, key, weight_renamings=None)

Find corresponding LoRA A and B weights for a given key.
Also tries keys after applying weight renamings (from transformers v5 conversion mappings) in case the checkpoint key names differ from the runtime model key names used by the LoRA adapter.
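The lookup with a renaming fallback might look roughly like the sketch below. Several details are assumptions rather than documented behavior: the function name `find_lora_weights_sketch` is hypothetical, `weight_renamings` is treated as a dict of regex pattern to replacement string, and adapter keys are assumed to follow the common PEFT layout `base_model.model.<module>.lora_A.weight`.

```python
import re


def find_lora_weights_sketch(lora_state, key, weight_renamings=None):
    """Return (lora_A, lora_B) for a base-model weight key, or (None, None)."""
    base = key[: -len(".weight")] if key.endswith(".weight") else key
    candidates = [base]
    # Assumed format: {regex_pattern: replacement}. The real mapping type may differ.
    for pattern, replacement in (weight_renamings or {}).items():
        renamed = re.sub(pattern, replacement, base)
        if renamed != base:
            candidates.append(renamed)
    for name in candidates:
        a_key = f"base_model.model.{name}.lora_A.weight"
        b_key = f"base_model.model.{name}.lora_B.weight"
        if a_key in lora_state and b_key in lora_state:
            return lora_state[a_key], lora_state[b_key]
    return None, None


# Placeholder strings stand in for tensors so the lookup logic is easy to follow.
prefix = "base_model.model.model.language_model.layers.0.self_attn.q_proj"
lora_state = {f"{prefix}.lora_A.weight": "A0", f"{prefix}.lora_B.weight": "B0"}
renamings = {r"^model\.layers\.": "model.language_model.layers."}

checkpoint_key = "model.layers.0.self_attn.q_proj.weight"
without_renaming = find_lora_weights_sketch(lora_state, checkpoint_key)
with_renaming = find_lora_weights_sketch(lora_state, checkpoint_key, renamings)
print(without_renaming, with_renaming)
```

Here the checkpoint key only matches the adapter once the renaming is applied, which is the situation the fallback is meant to cover.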
get_model_shards
cli.utils.lora_merge.get_model_shards(model_path)

Find all model shards in the given path.
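As a rough illustration, shard discovery can be done with glob patterns. The patterns below (`model*.safetensors` and `pytorch_model*.bin`) are assumptions based on common Hugging Face checkpoint naming, and `get_model_shards_sketch` is a hypothetical name; the actual function may match files differently.

```python
import tempfile
from pathlib import Path


def get_model_shards_sketch(model_path: Path) -> list[Path]:
    """Return weight shard files found directly under model_path, sorted by name."""
    # Assumed naming patterns for safetensors and PyTorch .bin checkpoints.
    patterns = ["model*.safetensors", "pytorch_model*.bin"]
    shards = []
    for pattern in patterns:
        shards.extend(model_path.glob(pattern))
    return sorted(shards)


# Build a throwaway directory that mixes shards with non-shard files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in (
        "model-00002-of-00002.safetensors",
        "model-00001-of-00002.safetensors",
        "model.safetensors.index.json",
        "config.json",
    ):
        (root / name).write_bytes(b"")
    shard_names = [p.name for p in get_model_shards_sketch(root)]

print(shard_names)
```

The index file and config are excluded because they do not end in a weight-file extension.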
merge_lora_sharded_efficient
cli.utils.lora_merge.merge_lora_sharded_efficient(
base_model_path,
lora_adapter_path,
output_path,
device='cpu',
safe_tensors=True,
simulate_nf4=False,
simulate_nf4_experts=False,
nf4_blocksize=None,
nf4_double_quant=True,
trust_remote_code=False,
)

Memory-efficient LoRA merging that processes shards individually without loading the full model into memory.
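The idea of merging one shard at a time can be sketched with plain Python lists standing in for tensors. This is a toy under stated assumptions: it uses the standard LoRA update W + scale * (B @ A), where `scale` is typically `lora_alpha / r`; the helper names `matmul` and `merge_shard_sketch`, the key names, and the in-memory dict shards are all hypothetical. The real function reads and writes checkpoint files.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_shard_sketch(shard, lora_pairs, scale):
    """Return a new shard where each key with LoRA weights becomes W + scale * (B @ A)."""
    merged = {}
    for key, weight in shard.items():
        if key in lora_pairs:
            lora_a, lora_b = lora_pairs[key]
            delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
            weight = [
                [w + scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(weight, delta)
            ]
        merged[key] = weight
    return merged


# Two toy "shards"; only the first contains a weight with a LoRA adapter.
shards = [
    {"layer0.q_proj.weight": [[1.0, 0.0], [0.0, 1.0]]},
    {"layer1.norm.weight": [[1.0, 1.0]]},
]
# Rank-1 adapter: A has shape (1, 2), B has shape (2, 1).
lora_pairs = {"layer0.q_proj.weight": ([[1.0, 2.0]], [[0.5], [1.0]])}
scale = 2.0 / 1  # assumed lora_alpha / r

merged_shards = []
for shard in shards:
    # Only one shard is processed per iteration, so peak memory tracks the
    # largest shard rather than the whole model.
    merged_shards.append(merge_shard_sketch(shard, lora_pairs, scale))

print(merged_shards[0]["layer0.q_proj.weight"])
```

In a file-based version, each merged shard would be written to the output directory before the next one is loaded, instead of being collected in a list.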
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| simulate_nf4 | bool | Apply NF4 roundtrip to ALL weight tensors (for QLoRA) | False |
| simulate_nf4_experts | bool | Apply NF4 roundtrip only to MoE expert tensors (for quantize_moe_experts). Expert tensors are identified by having “expert” in the key name and ndim >= 3. | False |
| trust_remote_code | bool | Whether to trust remote code when loading model config for layer-type introspection. Defaults to False for safety. | False |
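How the two NF4 flags select tensors can be expressed as a small predicate. The expert rule ("expert" in the key name and ndim >= 3) comes from the table above; the helper name `should_nf4_roundtrip` and the choice to let `simulate_nf4` take precedence when both flags are set are assumptions made for this sketch.

```python
def should_nf4_roundtrip(key, ndim, simulate_nf4=False, simulate_nf4_experts=False):
    """Decide whether a tensor would get an NF4 quantize/dequantize roundtrip."""
    if simulate_nf4:
        # QLoRA case: every weight tensor is roundtripped.
        return True
    if simulate_nf4_experts:
        # MoE case: only stacked expert tensors are roundtripped.
        return "expert" in key and ndim >= 3
    return False


# Hypothetical key names paired with their tensor rank.
tensors = {
    "model.layers.0.mlp.experts.gate_up_proj": 3,
    "model.layers.0.mlp.experts.bias": 2,
    "model.layers.0.self_attn.q_proj.weight": 2,
}

experts_only = {
    key: should_nf4_roundtrip(key, ndim, simulate_nf4_experts=True)
    for key, ndim in tensors.items()
}
all_weights = {
    key: should_nf4_roundtrip(key, ndim, simulate_nf4=True)
    for key, ndim in tensors.items()
}
print(experts_only)
```

With `simulate_nf4_experts=True`, only the 3-D expert tensor is selected; the 2-D tensor whose name contains "expert" and the attention weight are left untouched.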