Paper: "DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling" (arXiv 2406.11617)
Instruct template: Mistral V7
This model was merged with the Linear DELLA merge method (della_linear), using ConicCat/Mistral-Small-3.2-AntiRep-24B as the base.
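The following models were included in the merge:
- zerofata/MS3.2-PaintedFantasy-v2-24B
- Gryphe/Codex-24B-Small-3.2
- CrucibleLab/M3.2-24B-Loki-V1.3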
The following YAML configuration was used to produce this model:
models:
  - model: zerofata/MS3.2-PaintedFantasy-v2-24B
    parameters:
      weight: 0.2
      density: 0.5
      epsilon: 0.4
  - model: Gryphe/Codex-24B-Small-3.2
    parameters:
      weight: 0.2
      density: 0.5
      epsilon: 0.4
  - model: CrucibleLab/M3.2-24B-Loki-V1.3
    parameters:
      weight: 0.4
      density: 0.4
      epsilon: 0.3
merge_method: della_linear
base_model: ConicCat/Mistral-Small-3.2-AntiRep-24B
parameters:
  lambda: 0.9
  normalize: true
dtype: bfloat16
tokenizer:
  source: union
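
To make the roles of weight, density, epsilon, lambda, and normalize more concrete, here is a minimal pure-Python sketch of a DELLA-style linear merge on flat lists of numbers. It is not mergekit's implementation. The function names are hypothetical, ranking is done over the whole list rather than per row, and two behaviours are assumptions: that keep probabilities span density - epsilon to density + epsilon by magnitude rank, and that normalize divides each weight by the sum of all weights.

```python
import random


def keep_probabilities(deltas, density, epsilon):
    """Give each delta a keep probability based on its magnitude rank.

    Assumption: the smallest magnitude gets density - epsilon and the
    largest gets density + epsilon, with linear spacing in between.
    """
    n = len(deltas)
    if n == 1:
        return [density]
    order = sorted(range(n), key=lambda i: abs(deltas[i]))
    probs = [0.0] * n
    for rank, i in enumerate(order):
        probs[i] = (density - epsilon) + 2 * epsilon * rank / (n - 1)
    return probs


def drop_and_rescale(deltas, density, epsilon, rng):
    """Randomly drop deltas, then rescale survivors by 1 / keep probability."""
    probs = keep_probabilities(deltas, density, epsilon)
    return [d / p if rng.random() < p else 0.0 for d, p in zip(deltas, probs)]


def della_linear(base, models, lam=1.0, normalize=True, seed=0):
    """Merge fine-tuned weights into `base` (hypothetical helper).

    Each entry of `models` is a dict with keys "weights", "weight",
    "density", and "epsilon", mirroring the YAML parameters.
    """
    rng = random.Random(seed)
    # Assumption: normalize=True divides each weight by the sum of weights.
    total = sum(m["weight"] for m in models) if normalize else 1.0
    merged_delta = [0.0] * len(base)
    for m in models:
        # Delta (task vector): fine-tuned weights minus base weights.
        deltas = [w - b for w, b in zip(m["weights"], base)]
        pruned = drop_and_rescale(deltas, m["density"], m["epsilon"], rng)
        for i, d in enumerate(pruned):
            merged_delta[i] += (m["weight"] / total) * d
    # lambda scales the merged delta before it is added back to the base.
    return [b + lam * d for b, d in zip(base, merged_delta)]
```

Under the normalization assumption above, the configured weights 0.2, 0.2, and 0.4 sum to 0.8, so the three models would contribute effective weights of 0.25, 0.25, and 0.5, and lambda: 0.9 would then shrink the combined delta by 10% before it is applied to the base model.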