---
license: other
language:
- zh
- en
pipeline_tag: text-generation
tags:
- fortune-telling
- qwen
- qwen2.5
- qwen3
- gguf
- ollama
---
<div align="center">
# FortuneQwen3_4b
[**中文**](./README.md) | [**English**](./README_EN.md)
</div>
This is a 4B-parameter model fine-tuned from Qwen3, designed specifically for fortune-telling tasks. This repository provides merged Safetensors weights, GGUF quantized files, and a Modelfile for Ollama.
## About This Repository
This repository contains the model in three formats:
1. **GGUF Quantized Model (Recommended)**:
* Filename: `FortuneQwen3_4b_q8_0.gguf` (or other versions)
* Description: Pre-converted GGUF format (Int8 quantized), ready for use with `llama.cpp` or `Ollama`.
2. **Modelfile**:
* Filename: `Modelfile`
* Description: Configuration file for Ollama import, defining system prompts and parameters.
3. **Hugging Face Safetensors**:
* Filename: `model.safetensors`, etc.
* Description: Full model parameters with the LoRA weights already merged, suitable for Transformers-based inference (a minimal sketch follows this list), further fine-tuning, or exporting custom GGUF files.
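As a quick illustration of Transformers-based inference with the merged weights, here is a minimal sketch (assumptions: a standard `transformers` + `torch` environment with `accelerate` for `device_map="auto"`; the repo id below can be replaced with a local path to a clone of this repository; adjust device and dtype settings to your hardware):
```python
# Minimal Transformers inference sketch (assumption: standard transformers/torch
# setup; replace the repo id with a local path if you have cloned this repo).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tbata7/FortuneQwen3_4b"  # or "./FortuneQwen3_4b" after cloning
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Please interpret the Qian hexagram for me."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```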
## Quick Start
### Option 1: Using Ollama (Recommended)
You can quickly create an Ollama model using the pre-converted GGUF file found in this repository.
1. **Clone this repository**:
```bash
git clone https://huggingface.co/Tbata7/FortuneQwen3_4b
cd FortuneQwen3_4b
```
2. **Create the model**:
```bash
# This uses the local Modelfile and GGUF file
ollama create FortuneQwen3_q8:4b -f Modelfile
```
3. **Run the model**:
```bash
ollama run FortuneQwen3_q8:4b
```
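In addition to the interactive CLI, the created model can be queried through Ollama's local HTTP API (default endpoint `http://localhost:11434`). A minimal Python sketch, assuming the `requests` package is installed:
```python
# Minimal sketch: call the locally created Ollama model over its HTTP API.
# Assumptions: Ollama is running on its default port (11434) and `requests`
# is installed (pip install requests).
import requests

response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "FortuneQwen3_q8:4b",
        "messages": [
            {"role": "user", "content": "Please interpret the Qian hexagram for me."}
        ],
        "stream": False,  # return a single JSON object instead of a token stream
    },
)
print(response.json()["message"]["content"])
```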
### Option 2: Using llama.cpp
If you prefer to use the GGUF file directly with `llama.cpp`:
```bash
./llama-cli -m FortuneQwen3_4b_q8_0.gguf -p "Your question here..." -n 512
```
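If you would rather call the GGUF file from Python than from the CLI, one option (not part of this repository's tooling) is the `llama-cpp-python` bindings. A minimal sketch, assuming `pip install llama-cpp-python`:
```python
# Minimal sketch using llama-cpp-python (an assumption, not covered by this
# repo's instructions). Loads the q8_0 GGUF file directly and runs one chat turn.
from llama_cpp import Llama

llm = Llama(model_path="FortuneQwen3_4b_q8_0.gguf", n_ctx=32768)
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Please interpret the Qian hexagram for me."}],
    max_tokens=512,
)
print(result["choices"][0]["message"]["content"])
```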
## Advanced Usage: Exporting Custom GGUF
If you wish to use a different quantization level (e.g., q4_k_m, q6_k, or f16), you can export a custom GGUF from the Safetensors weights using `llama.cpp`.
1. **Prepare Environment**:
Ensure the `llama.cpp` Python dependencies are installed (e.g., `pip install -r requirements.txt` inside the `llama.cpp` repository).
2. **Convert Model**:
Use the `convert_hf_to_gguf.py` script, specifying the `--outtype` parameter to control the output precision.
* **Export as FP16 (No quantization)**:
```bash
python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_fp16.gguf --outtype f16
```
* **Export as Int8 (q8_0)**:
```bash
python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_q8_0.gguf --outtype q8_0
```
* **Other Quantizations**:
First export as f16, then use the `llama-quantize` tool:
```bash
./llama-quantize FortuneQwen3_4b_fp16.gguf FortuneQwen3_4b_q4_k_m.gguf q4_k_m
```
## Model Information
- **Base Architecture**: Qwen3-4B
- **Task**: Fortune Telling / I-Ching Interpretation
- **Context Window**: 32768 tokens
- **Fine-tuning Framework**: LLaMA-Factory
## Disclaimer
This model is intended for entertainment and research purposes only; please do not treat its outputs as factual predictions or real-world advice.