Paper: Resolving Interference When Merging Models (arXiv:2306.01708)
This is a merge of pre-trained language models created using mergekit. Arbitrary update, because I know people would request it. Didn't have much time to test it, tbh, but it feels nice enough? It's up to y'all to decide if it's an upgrade, sidegrade, or downgrade. At least now both models have ChatML trained, there's that.
Static GGUF (by Mradermacher)
Imatrix GGUF (by Mradermacher)
EXL2 (by kingbri of RoyalLab)
This model was merged using the TIES merge method, with nothingiisreal/MN-12B-Celeste-V1.9 as the base.
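For orientation, here is a minimal sketch of the three TIES steps from the paper linked above (trim, elect sign, disjoint merge), written as plain PyTorch over flat tensors. This is only an illustration, not mergekit's actual implementation; the function names and the exact weighting/normalization details are assumptions.

import torch

def trim(delta: torch.Tensor, density: float) -> torch.Tensor:
    """Zero out all but the top `density` fraction of entries by magnitude."""
    k = max(1, int(density * delta.numel()))
    # k-th largest magnitude == (numel - k + 1)-th smallest magnitude
    threshold = delta.abs().flatten().kthvalue(delta.numel() - k + 1).values
    return torch.where(delta.abs() >= threshold, delta, torch.zeros_like(delta))

def ties_merge(base, finetuned, densities, weights):
    # Task vectors: differences between each fine-tune and the shared base,
    # trimmed to their configured density and scaled by their weight.
    deltas = torch.stack([trim(ft - base, d) * w
                          for ft, d, w in zip(finetuned, densities, weights)])
    # Elect a sign per parameter by majority (sign of the summed deltas).
    elected = torch.sign(deltas.sum(dim=0))
    # Disjoint merge: average only the deltas whose sign agrees with the vote.
    agree = (torch.sign(deltas) == elected).float()
    merged_delta = (deltas * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged_delta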
The following models were included in the merge:
- anthracite-org/magnum-12b-v2
- nothingiisreal/MN-12B-Celeste-V1.9
The following YAML configuration was used to produce this model:
models:
  - model: anthracite-org/magnum-12b-v2
    parameters:
      density: 0.3  # fraction of this model's delta weights to keep (rest are trimmed)
      weight: 0.5   # relative contribution of this model's task vector
  - model: nothingiisreal/MN-12B-Celeste-V1.9
    parameters:
      density: 0.7
      weight: 0.5
merge_method: ties
base_model: nothingiisreal/MN-12B-Celeste-V1.9
parameters:
  normalize: true   # rescale weights so contributions sum to 1
  int8_mask: true   # store intermediate masks in int8 to save memory
dtype: bfloat16
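To reproduce the merge from this config, mergekit can be driven from its CLI or from Python. The snippet below follows the Python usage shown in mergekit's README; treat the class and option names (MergeConfiguration, MergeOptions, run_merge) as assumptions that may shift between mergekit versions, and "merge-config.yml" / "./merged-model" as placeholder paths.

import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the YAML config shown above (placeholder filename).
with open("merge-config.yml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the merge; the options mirror common mergekit flags and may vary by version.
run_merge(
    merge_config,
    out_path="./merged-model",
    options=MergeOptions(cuda=True, copy_tokenizer=True, lazy_unpickle=True),
)

The CLI equivalent is roughly: mergekit-yaml merge-config.yml ./merged-model --cuda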