metadata
license: llama3.3
base_model:
- allura-forge/Llama-3.3-8B-Instruct
pipeline_tag: text-generation
model-index:
- name: shb777/Llama-3.3-8B-Instruct-128K
  results:
  - task:
      type: text-generation
    dataset:
      name: BBH
      type: leaderboard
    metrics:
    - type: accuracy
      value: 54.1
      name: acc_norm
  - task:
      type: text-generation
    dataset:
      name: GPQA
      type: leaderboard
    metrics:
    - type: accuracy
      value: 29.9
      name: acc_norm
  - task:
      type: text-generation
    dataset:
      name: MMLU Pro
      type: leaderboard
    metrics:
    - type: accuracy
      value: 38
      name: acc
  - task:
      type: text-generation
    dataset:
      name: MuSR
      type: leaderboard
    metrics:
    - type: accuracy
      value: 37.8
      name: acc_norm
  - task:
      type: text-generation
    dataset:
      name: IFEval
      type: leaderboard
    metrics:
    - type: accuracy
      value: 85.2
      name: avg(prompt_strict + inst_strict)
  - task:
      type: text-generation
    dataset:
      name: MATH Hard
      type: leaderboard
    metrics:
    - type: accuracy
      value: 27.3
      name: exact_match
Llama 3.3 8B 128K Instruct (Fixed)
Original model: allura-forge/Llama-3.3-8B-Instruct. Thanks!
Additional Fixes:
- Added rope_scaling
- Added chat template (Unsloth) in tokenizer config
- Updated generation config
- Enabled full context length
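
To illustrate what a rope_scaling entry does for long context, here is a minimal sketch of "llama3"-style RoPE frequency scaling in plain Python. The values in ROPE_SCALING, the head dimension, and the RoPE base are assumptions taken from the publicly documented Llama 3.1 defaults; they have not been read from this repository's config.json, so check the actual file before relying on them. The function name llama3_scaled_inv_freq is hypothetical.

```python
import math

# ASSUMPTION: these mirror the published Llama 3.1 defaults, not values
# verified against this repository's config.json.
ROPE_SCALING = {
    "rope_type": "llama3",
    "factor": 8.0,
    "low_freq_factor": 1.0,
    "high_freq_factor": 4.0,
    "original_max_position_embeddings": 8192,
}


def llama3_scaled_inv_freq(head_dim=128, base=500000.0, cfg=ROPE_SCALING):
    """Return the per-pair RoPE inverse frequencies after llama3-style scaling.

    head_dim and base are assumed values (hypothetical defaults for this sketch).
    """
    factor = cfg["factor"]
    low = cfg["low_freq_factor"]
    high = cfg["high_freq_factor"]
    old_len = cfg["original_max_position_embeddings"]

    low_freq_wavelen = old_len / low    # wavelengths above this are fully scaled
    high_freq_wavelen = old_len / high  # wavelengths below this are left alone

    scaled = []
    for i in range(0, head_dim, 2):
        inv_freq = 1.0 / (base ** (i / head_dim))
        wavelen = 2 * math.pi / inv_freq
        if wavelen < high_freq_wavelen:
            # High-frequency band: keep the original frequency.
            scaled.append(inv_freq)
        elif wavelen > low_freq_wavelen:
            # Low-frequency band: slow the rotation by the scaling factor.
            scaled.append(inv_freq / factor)
        else:
            # Middle band: interpolate smoothly between the two regimes.
            smooth = (old_len / wavelen - low) / (high - low)
            scaled.append((1 - smooth) * inv_freq / factor + smooth * inv_freq)
    return scaled


if __name__ == "__main__":
    freqs = llama3_scaled_inv_freq()
    print(len(freqs), freqs[0], freqs[-1])
```

Under these assumed values, the fastest-rotating dimensions are untouched while the slowest ones rotate 8 times more slowly, which is what lets positions beyond the original 8192-token window remain distinguishable.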