---
license: other
language:
  - zh
  - en
pipeline_tag: text-generation
tags:
  - fortune-telling
  - qwen
  - qwen2.5
  - qwen3
  - gguf
  - ollama
---

FortuneQwen3_4b

中文 | English

This is a 4B-parameter model fine-tuned from the Qwen3 architecture, designed specifically for fortune-telling tasks. This repository provides the merged Safetensors weights, GGUF quantized files, and a Modelfile for Ollama.

About This Repository

This repository contains the model in three formats:

  1. GGUF Quantized Model (Recommended):
    • Filename: FortuneQwen3_4b_q8_0.gguf (or other versions)
    • Description: Pre-converted GGUF format (Int8 quantized), ready for use with llama.cpp or Ollama.
  2. Modelfile:
    • Filename: Modelfile
    • Description: Configuration file for Ollama import, defining system prompts and parameters.
  3. Hugging Face Safetensors:
    • Filename: model.safetensors, etc.
    • Description: Full model parameters with the LoRA weights already merged, suitable for Transformers-based inference (see the example below), further fine-tuning, or exporting custom GGUF files.
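
For Transformers-based inference from the merged Safetensors weights, a minimal sketch is shown below. It assumes the transformers, torch, and accelerate packages are installed and loads the model by the repository id Tbata7/FortuneQwen3_4b (a local path to the cloned repository works as well); the prompt and generation settings are illustrative, not the author's recommended values.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Repository id on Hugging Face; a local path to the cloned repo also works
    model_id = "Tbata7/FortuneQwen3_4b"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

    # Illustrative prompt; phrase your own question as a chat message
    messages = [{"role": "user", "content": "Your question here..."}]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, skipping the prompt
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))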

Quick Start

Option 1: Using Ollama (Recommended)

You can quickly create an Ollama model using the pre-converted GGUF file found in this repository.

  1. Clone this repository (Git LFS is required to download the model weights):

    git lfs install
    git clone https://huggingface.co/Tbata7/FortuneQwen3_4b
    cd FortuneQwen3_4b
    
  2. Create the model:

    # This uses the local Modelfile and GGUF file
    ollama create FortuneQwen3_q8:4b -f Modelfile
    
  3. Run the model:

    ollama run FortuneQwen3_q8:4b
    

Option 2: Using llama.cpp

If you prefer to use the GGUF file directly with llama.cpp:

./llama-cli -m FortuneQwen3_4b_q8_0.gguf -p "Your question here..." -n 512
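
If you would rather call the GGUF file from Python instead of the CLI, a minimal sketch using the llama-cpp-python bindings is shown below. It assumes the package is installed (pip install llama-cpp-python); the prompt and token limit simply mirror the command above.

    from llama_cpp import Llama

    # Load the quantized GGUF; n_ctx can be raised up to the model's 32768-token window
    llm = Llama(model_path="./FortuneQwen3_4b_q8_0.gguf", n_ctx=4096)

    # Plain completion call, mirroring the llama-cli example above
    output = llm("Your question here...", max_tokens=512)
    print(output["choices"][0]["text"])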

Advanced Usage: Exporting Custom GGUF

If you wish to use a different quantization level (e.g., q4_k, q6_k, fp16), you can export a custom GGUF from the Safetensors weights using llama.cpp.

  1. Prepare Environment: Ensure the llama.cpp Python dependencies are installed (e.g., pip install -r llama.cpp/requirements.txt).

  2. Convert Model: Use the convert_hf_to_gguf.py script. You must specify the --outtype parameter to control the output type.

    • Export as FP16 (No quantization):

      python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_fp16.gguf --outtype f16
      
    • Export as Int8 (q8_0):

      python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_q8_0.gguf --outtype q8_0
      
    • Other Quantizations: First export as f16, then use the llama-quantize tool:

      ./llama-quantize FortuneQwen3_4b_fp16.gguf FortuneQwen3_4b_q4_k_m.gguf q4_k_m
      

Model Information

  • Base Architecture: Qwen3-4B
  • Task: Fortune Telling / I-Ching Interpretation
  • Context Window: 32,768 tokens
  • Fine-tuning Framework: LLaMA-Factory

Disclaimer

This model is intended for entertainment and research purposes only. Please trust in science and do not treat its output as real-world guidance.