---
license: other
language:
- zh
- en
pipeline_tag: text-generation
tags:
- fortune-telling
- qwen
- qwen2.5
- qwen3
- gguf
- ollama
---
# FortuneQwen3_4b

This is a 4B-parameter model fine-tuned on the Qwen3 architecture, designed specifically for fortune-telling tasks. This repository provides merged Safetensors weights, GGUF quantized files, and a Modelfile for Ollama.
## About This Repository
This repository contains the model in three formats:
- **GGUF Quantized Model (Recommended)**
  - Filename: `FortuneQwen3_4b_q8_0.gguf` (or other quantized versions)
  - Description: Pre-converted GGUF format (Int8 quantized), ready for use with `llama.cpp` or Ollama.
- **Modelfile**
  - Filename: `Modelfile`
  - Description: Configuration file for Ollama import, defining the system prompt and generation parameters.
- **Hugging Face Safetensors**
  - Filename: `model.safetensors`, etc.
  - Description: Full model parameters with the LoRA weights already merged, suitable for Transformers-based inference (see the sketch after this list), further fine-tuning, or exporting custom GGUF files.
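If you want to run the merged Safetensors weights with Transformers, a minimal sketch might look like the following. The repository id is taken from the clone URL below, and the chat-template call assumes the standard Qwen3 tokenizer configuration shipped with the weights; adjust dtype and device for your hardware.

```python
# Minimal Transformers inference sketch (assumes the merged weights in this repo
# and the standard Qwen3 chat template; adjust dtype/device for your hardware).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tbata7/FortuneQwen3_4b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Your question here..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```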
## Quick Start
### Option 1: Using Ollama (Recommended)
You can quickly create an Ollama model using the pre-converted GGUF file found in this repository.
1. Clone this repository:

   ```bash
   git clone https://huggingface.co/Tbata7/FortuneQwen3_4b
   cd FortuneQwen3_4b
   ```

2. Create the model:

   ```bash
   # This uses the local Modelfile and GGUF file
   ollama create FortuneQwen3_q8:4b -f Modelfile
   ```

3. Run the model:

   ```bash
   ollama run FortuneQwen3_q8:4b
   ```
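Once the model is created, you can also query it programmatically through Ollama's local REST API. A minimal sketch, assuming Ollama is running on its default port 11434 and the model name matches the one created above:

```python
# Minimal sketch: query the locally created Ollama model via its REST API.
# Assumes Ollama is running on the default port 11434.
import json
import urllib.request

payload = {
    "model": "FortuneQwen3_q8:4b",
    "prompt": "Your question here...",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```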
### Option 2: Using llama.cpp
If you prefer to use the GGUF file directly with `llama.cpp`:

```bash
./llama-cli -m FortuneQwen3_4b_q8_0.gguf -p "Your question here..." -n 512
```
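If you would rather load the GGUF file from Python, the third-party `llama-cpp-python` binding (not part of this repository, so treat this as an assumption about your environment) can be used along these lines:

```python
# Minimal sketch using the llama-cpp-python binding (pip install llama-cpp-python).
# Paths and parameters are illustrative; adjust n_ctx / n_gpu_layers for your hardware.
from llama_cpp import Llama

llm = Llama(model_path="FortuneQwen3_4b_q8_0.gguf", n_ctx=4096)
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Your question here..."}],
    max_tokens=512,
)
print(result["choices"][0]["message"]["content"])
```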
## Advanced Usage: Exporting Custom GGUF

If you wish to use a different quantization level (e.g., `q4_k`, `q6_k`, `fp16`), you can export a custom GGUF from the Safetensors weights using `llama.cpp`.
1. Prepare the environment: ensure you have the `llama.cpp` Python dependencies installed.

2. Convert the model using the `convert_hf_to_gguf.py` script. You must specify the `--outtype` parameter to control the output type.

   Export as FP16 (no quantization):

   ```bash
   python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_fp16.gguf --outtype f16
   ```

   Export as Int8 (q8_0):

   ```bash
   python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_q8_0.gguf --outtype q8_0
   ```

3. Other quantizations: first export as f16, then use the `llama-quantize` tool:

   ```bash
   ./llama-quantize FortuneQwen3_4b_fp16.gguf FortuneQwen3_4b_q4_k_m.gguf q4_k_m
   ```
## Model Information
- Base Architecture: Qwen3 (4B)
- Task: Fortune Telling / I-Ching Interpretation
- Context Window: 32,768 tokens
- Fine-tuning Framework: LLaMA-Factory
## Disclaimer
This model is intended for entertainment and research purposes only. Please do not rely on its output for real-world decisions, and trust science.