---
license: other
language:
- zh
- en
pipeline_tag: text-generation
tags:
- fortune-telling
- qwen
- qwen2.5
- qwen3
- gguf
- ollama
---

<div align="center">

# FortuneQwen3_4b

[**中文**](./README.md) | [**English**](./README_EN.md)

</div>

This is a 4B-parameter model fine-tuned from the Qwen3 architecture, designed specifically for fortune-telling tasks. This repository provides merged Safetensors weights, GGUF quantized files, and a Modelfile for Ollama.

## About This Repository

This repository contains the model in three formats:

1. **GGUF Quantized Model (Recommended)**:
    * Filename: `FortuneQwen3_4b_q8_0.gguf` (or other versions)
    * Description: Pre-converted GGUF format (Int8 quantized), ready for use with `llama.cpp` or `Ollama`.
2. **Modelfile**:
    * Filename: `Modelfile`
    * Description: Configuration file for Ollama import, defining system prompts and parameters.
3. **Hugging Face Safetensors**:
    * Filename: `model.safetensors`, etc.
    * Description: Full model parameters with the LoRA weights already merged, suitable for Transformers-based inference, further fine-tuning, or exporting custom GGUF files (a minimal inference sketch follows this list).
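
If you want to run the merged Safetensors weights directly, the short Python sketch below uses the standard Transformers text-generation workflow. It is only a sketch: the repo id is taken from the clone URL shown later in this README, and the generation settings are illustrative rather than the settings the model was tuned with.

```python
# Minimal sketch: chat with the merged Safetensors weights via Transformers.
# The repo id below is assumed from the clone URL in this README; use a local path if preferred.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tbata7/FortuneQwen3_4b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the `accelerate` package; drop it to load on CPU.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Example question (illustrative); the model targets fortune-telling / I-Ching questions.
messages = [{"role": "user", "content": "Please interpret the Qian hexagram for me."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```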

## Quick Start

### Option 1: Using Ollama (Recommended)

You can quickly create an Ollama model using the pre-converted GGUF file found in this repository.

1. **Clone this repository**:
    ```bash
    git clone https://huggingface.co/Tbata7/FortuneQwen3_4b
    cd FortuneQwen3_4b
    ```

2. **Create the model**:
    ```bash
    # This uses the local Modelfile and GGUF file
    ollama create FortuneQwen3_q8:4b -f Modelfile
    ```

3. **Run the model**:
    ```bash
    ollama run FortuneQwen3_q8:4b
    ```
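
`ollama run` starts an interactive chat in the terminal. If you would rather call the model from code, Ollama also serves a local REST API on its default port 11434. The sketch below uses the third-party `requests` package and is illustrative only; it is not part of this repository.

```python
# Minimal sketch: send a prompt to the locally created Ollama model over its REST API.
# Assumes the Ollama server is running with its default settings (port 11434).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "FortuneQwen3_q8:4b",  # the tag created with `ollama create` above
        "prompt": "Please interpret the Qian hexagram for me.",  # example question
        "stream": False,  # return a single JSON object instead of a token stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```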

### Option 2: Using llama.cpp

If you prefer to use the GGUF file directly with `llama.cpp`:

```bash
./llama-cli -m FortuneQwen3_4b_q8_0.gguf -p "Your question here..." -n 512
```
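
If you prefer to drive the same GGUF file from Python instead of the `llama-cli` binary, the third-party `llama-cpp-python` bindings can load it (install with `pip install llama-cpp-python`). The sketch below is an assumption-based example, not part of this repository; paths and parameters are illustrative.

```python
# Minimal sketch: load the q8_0 GGUF with the llama-cpp-python bindings.
# Adjust n_ctx / max_tokens to your hardware and needs.
from llama_cpp import Llama

llm = Llama(model_path="./FortuneQwen3_4b_q8_0.gguf", n_ctx=4096)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Please interpret the Qian hexagram for me."}],
    max_tokens=512,
)
print(result["choices"][0]["message"]["content"])
```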

## Advanced Usage: Exporting a Custom GGUF

If you wish to use a different quantization level (e.g., q4_k, q6_k, fp16), you can export a custom GGUF from the Safetensors weights using `llama.cpp`.

1. **Prepare the environment**:
    Ensure you have the `llama.cpp` Python dependencies installed.

2. **Convert the model**:
    Use the `convert_hf_to_gguf.py` script. You must specify the `--outtype` parameter to control the output type.

    * **Export as FP16 (no quantization)**:
        ```bash
        python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_fp16.gguf --outtype f16
        ```
    * **Export as Int8 (q8_0)**:
        ```bash
        python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_q8_0.gguf --outtype q8_0
        ```
    * **Other quantizations**:
        First export as f16, then use the `llama-quantize` tool:
        ```bash
        ./llama-quantize FortuneQwen3_4b_fp16.gguf FortuneQwen3_4b_q4_k_m.gguf q4_k_m
        ```

## Model Information

- **Base Architecture**: Qwen3 4B
- **Task**: Fortune Telling / I-Ching Interpretation
- **Context Window**: 32,768 tokens
- **Fine-tuning Framework**: LLaMA-Factory

## Disclaimer

This model is intended for entertainment and research purposes only. Please trust in science.