# LFM2-700M-GRPO-NuminaMath-10K-GGUF
GGUF quantized versions of LFM2-700M-GRPO-NuminaMath-10K for efficient CPU and mixed CPU/GPU inference.
## Model Overview
This is a quantized version of LFM2-700M-GRPO-NuminaMath-10K, a 700M parameter model fine-tuned using Group Relative Policy Optimization (GRPO) on the NuminaMath-CoT dataset for mathematical reasoning tasks.
## Key Features

- **Mathematical Reasoning**: Optimized for step-by-step math problem solving
- **GRPO Training**: Uses reinforcement learning with verifiable rewards (see the sketch after this list)
- **Efficient Inference**: Quantized for fast CPU/GPU inference
- **Wide Compatibility**: Works with Ollama, llama.cpp, LM Studio, and more
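To give a rough idea of what "verifiable rewards" means here: the reward for each sampled completion can be computed mechanically by checking its final answer against a known reference. The sketch below is hypothetical and not the actual training code; `math_reward` is an illustrative helper.

```python
import re

# Hypothetical sketch of a verifiable reward for math RL:
# score 1.0 when the completion's final numeric answer
# matches the reference answer, else 0.0.
def math_reward(completion: str, reference_answer: str) -> float:
    # Treat the last number-like token in the completion as the final answer.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0
    return 1.0 if numbers[-1] == reference_answer else 0.0

print(math_reward("15% of 80 is 0.15 * 80 = 12", "12"))  # 1.0
```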
## Available Quantizations

| Quantization | File | Size | Description |
|---|---|---|---|
| Q4_K_M | lfm2-700m-grpo-numina-10k-q4_k_m.gguf | ~40% of original | Best balance of quality and size |
## Quick Start

### Using Ollama

```bash
# Pull and run directly from HuggingFace
ollama pull hf.co/ermiaazarkhalili/LFM2-700M-GRPO-NuminaMath-10K-GGUF:Q4_K_M
ollama run hf.co/ermiaazarkhalili/LFM2-700M-GRPO-NuminaMath-10K-GGUF:Q4_K_M "Solve step by step: What is 15% of 80?"
```
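Ollama also exposes a local REST API (default port 11434), so you can query the pulled model programmatically. A minimal sketch using the `requests` package, assuming Ollama is running locally on the default port:

```python
import requests

# Call the local Ollama server's generate endpoint (default port 11434).
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "hf.co/ermiaazarkhalili/LFM2-700M-GRPO-NuminaMath-10K-GGUF:Q4_K_M",
        "prompt": "Solve step by step: What is 15% of 80?",
        "stream": False,  # return a single JSON object instead of a stream
    },
)
print(response.json()["response"])
```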
### Alternative: Create Custom Modelfile

```bash
# Download the GGUF file first
huggingface-cli download ermiaazarkhalili/LFM2-700M-GRPO-NuminaMath-10K-GGUF \
  lfm2-700m-grpo-numina-10k-q4_k_m.gguf --local-dir ./models

# Create Modelfile with custom system prompt
cat > Modelfile << 'EOF'
FROM ./models/lfm2-700m-grpo-numina-10k-q4_k_m.gguf
SYSTEM "You are a helpful math tutor. When given a math problem, solve it step by step, showing your reasoning clearly. Always verify your final answer."
PARAMETER temperature 0.7
PARAMETER top_p 0.9
EOF

# Create and run the model
ollama create lfm2-700m-grpo-numina-10k -f Modelfile
ollama run lfm2-700m-grpo-numina-10k
```
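If you prefer a Python client over the CLI, the `ollama` package (`pip install ollama`) can chat with the custom model created above; a minimal sketch:

```python
import ollama

# Chat with the custom model registered via `ollama create` above.
reply = ollama.chat(
    model="lfm2-700m-grpo-numina-10k",
    messages=[
        {"role": "user", "content": "Solve step by step: If 3x + 7 = 22, what is x?"},
    ],
)
print(reply["message"]["content"])
```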
### Using llama.cpp

```bash
# Download the GGUF file
huggingface-cli download ermiaazarkhalili/LFM2-700M-GRPO-NuminaMath-10K-GGUF \
  lfm2-700m-grpo-numina-10k-q4_k_m.gguf --local-dir ./models

# Run inference
./llama-cli -m ./models/lfm2-700m-grpo-numina-10k-q4_k_m.gguf \
  -p "Solve step by step: If a train travels at 60 mph for 2.5 hours, how far does it travel?" \
  -n 256

# Or start a server
./llama-server -m ./models/lfm2-700m-grpo-numina-10k-q4_k_m.gguf \
  --host 0.0.0.0 --port 8080
```
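`llama-server` speaks an OpenAI-compatible API, so once it is running you can query it over HTTP. A minimal sketch with `requests`, using the port from the flags above (adjust host/port if you changed them):

```python
import requests

# Query the llama-server started above via its
# OpenAI-compatible chat completions endpoint.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful math tutor."},
            {"role": "user", "content": "Solve step by step: What is 23 × 17?"},
        ],
        "max_tokens": 256,
        "temperature": 0.7,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```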
### Using llama-cpp-python

```python
from llama_cpp import Llama

# Load the model
llm = Llama(
    model_path="./models/lfm2-700m-grpo-numina-10k-q4_k_m.gguf",
    n_ctx=2048,
    n_gpu_layers=-1,  # use all GPU layers if available
)

# Generate response
prompt = '''Solve step by step:
A store has a 25% off sale. If an item originally costs $80, what is the sale price?
Solution:'''

output = llm(
    prompt,
    max_tokens=256,
    temperature=0.7,
    top_p=0.9,
    echo=False,
)
print(output['choices'][0]['text'])
```
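llama-cpp-python also provides a chat-style API that applies the model's chat template for you. A short sketch, reusing the `llm` object from the snippet above:

```python
# Chat-style generation; reuses `llm` from the example above.
output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "Solve step by step: If 3x + 7 = 22, what is x?"},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(output["choices"][0]["message"]["content"])
```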
### Using LM Studio

- Download the GGUF file from this repository
- Open LM Studio and navigate to the Models tab
- Click "Import Model" and select the downloaded GGUF file
- Load the model and start chatting about math problems, either in the chat UI or via the local server (see the sketch after this list)
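LM Studio can also serve the loaded model through its OpenAI-compatible local server. A minimal sketch with the `openai` package, assuming the server runs on its default port 1234; the model identifier is whatever LM Studio displays for the loaded model:

```python
from openai import OpenAI

# LM Studio's local server is OpenAI-compatible; the API key is a placeholder.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

completion = client.chat.completions.create(
    model="lfm2-700m-grpo-numina-10k",  # use the identifier LM Studio shows
    messages=[
        {"role": "user", "content": "Solve step by step: What is 15% of 80?"},
    ],
)
print(completion.choices[0].message.content)
```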
## Example Prompts

Here are some example prompts that work well with this model:

```
Solve step by step: What is 23 × 17?

Solve step by step: A rectangle has a length of 12 cm and a width of 8 cm. What is its area and perimeter?

Solve step by step: If 3x + 7 = 22, what is the value of x?

Solve step by step: A car travels 150 miles in 2.5 hours. What is its average speed in miles per hour?
```
## Source Model
This is a quantized version of LFM2-700M-GRPO-NuminaMath-10K.
## Training Details
| Property | Value |
|---|---|
| Base Model | LiquidAI/LFM2-700M |
| Training Method | GRPO (Group Relative Policy Optimization) |
| Dataset | AI-MO/NuminaMath-CoT |
| Training Samples | 10,000 |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
See the source model card for full training details and usage examples with Transformers.
## Hardware Requirements
| Quantization | RAM Required | GPU VRAM (optional) |
|---|---|---|
| Q4_K_M | ~1-2 GB | ~1-2 GB |
## Conversion Details
| Property | Value |
|---|---|
| Source Model | ermiaazarkhalili/LFM2-700M-GRPO-NuminaMath-10K |
| Conversion Date | 2025-12-29 |
| Quantization | Q4_K_M |
| Converter | llama.cpp |
## License
CC-BY-NC-4.0 (same as source model)
## Acknowledgments

- **Liquid AI** for the LFM2 base model
- **AI-MO** for the NuminaMath-CoT dataset
- **llama.cpp** for quantization tools
- **ermiaazarkhalili** for training and quantization

*Quantized using the HF-TRL GGUF conversion pipeline on Compute Canada infrastructure.*