Qwen3-1.7B Medical Fine-tuned v3 (GGUF)

This is a GGUF-quantized version of bisonnetworking/bison-medical-v3-1.7b, a medically fine-tuned variant of Qwen/Qwen3-1.7B.

Model Details

  • Base Model: Qwen/Qwen3-1.7B
  • Fine-tuned Model: bisonnetworking/bison-medical-v3-1.7b
  • Format: GGUF (llama.cpp compatible)
  • Use Cases: Medical question answering, medical information retrieval

Available Quantizations

Quantization   File Size   Use Case
Q8_0           ~1.8 GB     High quality, good for GPU
Q6_K           ~1.4 GB     Very good quality, balanced
Q4_K_M         ~1.0 GB     Good quality/size tradeoff, optimized for CPU
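
To confirm the exact filenames and sizes before downloading, the repository's file listing can be fetched from the Hugging Face API; a minimal sketch (the endpoint returns a JSON array with each file's path and size in bytes):

# List the files in this repository, with sizes
curl -s https://huggingface.co/api/models/bisonnetworking/bison-medical-v3-1.7b-gguf/tree/main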

Usage

With llama.cpp

# Download a quantized version
huggingface-cli download bisonnetworking/bison-medical-v3-1.7b-gguf model-Q4_K_M.gguf --local-dir ./models

# Run inference
./llama.cpp/llama-cli -m ./models/model-Q4_K_M.gguf -p "What are the symptoms of diabetes?"
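
llama.cpp also ships llama-server, which exposes an OpenAI-compatible HTTP endpoint for programmatic use. A minimal sketch, assuming the same local build and model path as above (the binary may live elsewhere, e.g. ./llama.cpp/build/bin/llama-server):

# Start the server on port 8080
./llama.cpp/llama-server -m ./models/model-Q4_K_M.gguf --port 8080

# Query the OpenAI-compatible chat endpoint
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a medical expert assistant."},
      {"role": "user", "content": "What are the symptoms of diabetes?"}
    ]
  }'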

With Ollama

# Create Modelfile
cat > Modelfile <<EOF
FROM ./models/model-Q4_K_M.gguf
SYSTEM "You are a medical expert assistant. Provide accurate, evidence-based medical information."
EOF

# Create model
ollama create bison-medical-v3 -f Modelfile

# Run
ollama run bison-medical-v3 "What are the symptoms of diabetes?"
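
Once created, the model can also be called from scripts through Ollama's local REST API (default port 11434). A minimal sketch:

# Generate a completion via the REST API; "stream": false returns a single JSON object
curl http://localhost:11434/api/generate -d '{
  "model": "bison-medical-v3",
  "prompt": "What are the symptoms of diabetes?",
  "stream": false
}'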

With LM Studio

  1. Download any .gguf file from this repository
  2. Open LM Studio
  3. Load the model from the downloaded file
  4. Start chatting!
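
LM Studio can also expose the loaded model through its local OpenAI-compatible server (default port 1234; enable the server in the app first). A minimal sketch, where the "model" identifier below is an assumption; use whatever name LM Studio assigns to the loaded file:

# Query LM Studio's local server
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bison-medical-v3-1.7b",
    "messages": [{"role": "user", "content": "What are the symptoms of diabetes?"}]
  }'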

Training Details

See bisonnetworking/bison-medical-v3-1.7b for full training details.

License

Apache 2.0 (same as base model)
