# Qwen3-1.7B Medical Fine-tuned v3 (GGUF)
This is a GGUF-quantized version of bisonnetworking/bison-medical-v3-1.7b, a medically fine-tuned variant of Qwen/Qwen3-1.7B.
## Model Details
- Base Model: Qwen/Qwen3-1.7B
- Fine-tuned Model: bisonnetworking/bison-medical-v3-1.7b
- Format: GGUF (llama.cpp compatible)
- Use Cases: Medical question answering, medical information retrieval
## Available Quantizations
| Quantization | File Size | Use Case |
|---|---|---|
| Q8_0 | ~1.8 GB | High quality, good for GPU |
| Q6_K | ~1.4 GB | Very good quality, balanced |
| Q4_K_M | ~1.0 GB | Good quality/size tradeoff, optimized for CPU |
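If you want to compare quantizations locally, you can download every GGUF file in one command (roughly 4.2 GB total, per the table above). A minimal sketch using `huggingface-cli`'s glob filter:

```bash
# Fetch all .gguf files from this repository into ./models
huggingface-cli download bisonnetworking/bison-medical-v3-1.7b-gguf \
  --include "*.gguf" \
  --local-dir ./models
```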
## Usage
### With llama.cpp
```bash
# Download a quantized version
huggingface-cli download bisonnetworking/bison-medical-v3-1.7b-gguf model-Q4_K_M.gguf --local-dir ./models

# Run inference
./llama.cpp/llama-cli -m ./models/model-Q4_K_M.gguf -p "What are the symptoms of diabetes?"
```
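For programmatic use, llama.cpp's `llama-server` exposes an OpenAI-compatible HTTP API on top of the same GGUF file. A minimal sketch; the port and context size below are illustrative choices, not values prescribed by this repository:

```bash
# Serve the model with an OpenAI-compatible API on port 8080
./llama.cpp/llama-server -m ./models/model-Q4_K_M.gguf -c 4096 --port 8080

# Query it from another shell
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What are the symptoms of diabetes?"}]}'
```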
### With Ollama
```bash
# Create Modelfile
cat > Modelfile <<EOF
FROM ./models/model-Q4_K_M.gguf
SYSTEM "You are a medical expert assistant. Provide accurate, evidence-based medical information."
EOF

# Create model
ollama create bison-medical-v3 -f Modelfile

# Run
ollama run bison-medical-v3 "What are the symptoms of diabetes?"
```
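Once created, the model can also be queried through Ollama's local REST API (port 11434 by default). A minimal sketch:

```bash
# Non-streaming generation via Ollama's HTTP API
curl http://localhost:11434/api/generate -d '{
  "model": "bison-medical-v3",
  "prompt": "What are the symptoms of diabetes?",
  "stream": false
}'
```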
### With LM Studio
- Download any `.gguf` file from this repository
- Open LM Studio
- Load the model from the downloaded file
- Start chatting!
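LM Studio can also serve the loaded model through its built-in OpenAI-compatible local server (port 1234 by default). A minimal sketch, assuming the server is enabled and this model is loaded; the exact model identifier shown in LM Studio may differ:

```bash
# Query LM Studio's local OpenAI-compatible server
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a medical expert assistant."},
      {"role": "user", "content": "What are the symptoms of diabetes?"}
    ]
  }'
```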
## Training Details
See bisonnetworking/bison-medical-v3-1.7b for full training details.
## License
Apache 2.0 (same as base model)