gpt-oss-120b-Distill-Qwen3-1.7B-Thinking-GGUF - GGUF

  • gpt-oss-120b-Distill-Qwen3-1.7B-Thinking is a distilled version of GPT-oss-120B. It reduces resource use by pruning parameters and optimizing the inference path while preserving the GPT style and natural language processing capabilities. In conversation, it handles context smoothly, avoids overly verbose output, and gives concise yet logically rigorous answers. It also introduces table-based problem categorization, which improves structured task representation and helps the model adapt to multi-domain knowledge integration and resource-constrained environments. The compact inference path improves dialogue response efficiency while retaining the natural-language output style of GPT-oss, making the model suitable for customer service, Q&A APIs, and educational chatbots.

Available Model files:

  • qwen3-1.7b.Q8_0.gguf
  • qwen3-1.7b.Q4_K_M.gguf

Ollama

An Ollama Modelfile is included for easy deployment.
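
The contents of the bundled Modelfile are not reproduced here. As an illustration only, the sketch below generates a minimal Modelfile that points Ollama at one of the GGUF files listed above. The helper names `build_modelfile` and `write_modelfile` are hypothetical, and the optional temperature setting is an assumption, not a value taken from this repository.

```python
from pathlib import Path
from typing import Optional


def build_modelfile(gguf_name: str, temperature: Optional[float] = None) -> str:
    """Return the text of a minimal Ollama Modelfile (hypothetical helper).

    FROM points at a local GGUF file; PARAMETER lines are optional.
    """
    lines = [f"FROM ./{gguf_name}"]
    if temperature is not None:
        # Assumed example setting; the bundled Modelfile may differ.
        lines.append(f"PARAMETER temperature {temperature}")
    return "\n".join(lines) + "\n"


def write_modelfile(directory: str, gguf_name: str,
                    temperature: Optional[float] = None) -> Path:
    """Write the Modelfile into the directory that holds the GGUF file."""
    path = Path(directory) / "Modelfile"
    path.write_text(build_modelfile(gguf_name, temperature), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Print a Modelfile for the 4-bit file listed above.
    print(build_modelfile("qwen3-1.7b.Q4_K_M.gguf"), end="")
```

With a Modelfile placed next to the GGUF file, running `ollama create <name> -f Modelfile` and then `ollama run <name>` should register and start the model, where `<name>` is a tag of your choosing.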

Model details:

  • Format: GGUF
  • Model size: 2B params
  • Architecture: qwen3