---
license: other
language:
- zh
- en
pipeline_tag: text-generation
tags:
- fortune-telling
- qwen
- qwen2.5
- qwen3
- gguf
- ollama
---
<div align="center">
# FortuneQwen3_4b
[**中文**](./README.md) | [**English**](./README_EN.md)
</div>
This is a 4B-parameter model fine-tuned from Qwen3, designed specifically for fortune-telling tasks. This repository provides merged Safetensors weights, GGUF quantized files, and a Modelfile for Ollama.
## About This Repository
This repository contains the model in three formats:
1. **GGUF Quantized Model (Recommended)**:
* Filename: `FortuneQwen3_4b_q8_0.gguf` (or other versions)
* Description: Pre-converted GGUF format (Int8 quantized), ready for use with `llama.cpp` or `Ollama`.
2. **Modelfile**:
* Filename: `Modelfile`
* Description: Configuration file for Ollama import, defining system prompts and parameters.
3. **Hugging Face Safetensors**:
* Filename: `model.safetensors`, etc.
* Description: Full model parameters with the LoRA weights already merged, suitable for Transformers-based inference (a minimal sketch follows this list), further fine-tuning, or exporting custom GGUF files.
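As a quick illustration of Transformers-based inference with the merged weights, here is a minimal sketch (assumptions: a standard `transformers` + `torch` environment with `accelerate` for `device_map="auto"`; the repo id below can be replaced with a local path to a clone of this repository; adjust device and dtype settings to your hardware):
```python
# Minimal Transformers inference sketch (assumption: standard transformers/torch
# setup; replace the repo id with a local path if you have cloned this repo).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tbata7/FortuneQwen3_4b"  # or "./FortuneQwen3_4b" after cloning
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Please interpret the Qian hexagram for me."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```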
## Quick Start
### Option 1: Using Ollama (Recommended)
You can quickly create an Ollama model using the pre-converted GGUF file found in this repository.
1. **Clone this repository**:
```bash
git clone https://huggingface.co/Tbata7/FortuneQwen3_4b
cd FortuneQwen3_4b
```
2. **Create the model**:
```bash
# This uses the local Modelfile and GGUF file
ollama create FortuneQwen3_q8:4b -f Modelfile
```
3. **Run the model**:
```bash
ollama run FortuneQwen3_q8:4b
```
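In addition to the interactive CLI, the created model can be queried through Ollama's local HTTP API (default endpoint `http://localhost:11434`). A minimal Python sketch, assuming the `requests` package is installed:
```python
# Minimal sketch: call the locally created Ollama model over its HTTP API.
# Assumptions: Ollama is running on its default port (11434) and `requests`
# is installed (pip install requests).
import requests

response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "FortuneQwen3_q8:4b",
        "messages": [
            {"role": "user", "content": "Please interpret the Qian hexagram for me."}
        ],
        "stream": False,  # return a single JSON object instead of a token stream
    },
)
print(response.json()["message"]["content"])
```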
### Option 2: Using llama.cpp
If you prefer to use the GGUF file directly with `llama.cpp`:
```bash
./llama-cli -m FortuneQwen3_4b_q8_0.gguf -p "Your question here..." -n 512
```
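If you would rather call the GGUF file from Python than from the CLI, one option (not part of this repository's tooling) is the `llama-cpp-python` bindings. A minimal sketch, assuming `pip install llama-cpp-python`:
```python
# Minimal sketch using llama-cpp-python (an assumption, not covered by this
# repo's instructions). Loads the q8_0 GGUF file directly and runs one chat turn.
from llama_cpp import Llama

llm = Llama(model_path="FortuneQwen3_4b_q8_0.gguf", n_ctx=32768)
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Please interpret the Qian hexagram for me."}],
    max_tokens=512,
)
print(result["choices"][0]["message"]["content"])
```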
## Advanced Usage: Exporting Custom GGUF
If you wish to use a different quantization level (e.g., q4_k_m, q6_k, or f16), you can export a custom GGUF from the Safetensors weights using `llama.cpp`.
1. **Prepare Environment**:
Ensure the `llama.cpp` Python dependencies are installed (e.g., `pip install -r requirements.txt` inside the `llama.cpp` repository).
2. **Convert Model**:
Use the `convert_hf_to_gguf.py` script, specifying the `--outtype` parameter to control the output precision.
* **Export as FP16 (No quantization)**:
```bash
python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_fp16.gguf --outtype f16
```
* **Export as Int8 (q8_0)**:
```bash
python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_q8_0.gguf --outtype q8_0
```
* **Other Quantizations**:
First export as f16, then use the `llama-quantize` tool:
```bash
./llama-quantize FortuneQwen3_4b_fp16.gguf FortuneQwen3_4b_q4_k_m.gguf q4_k_m
```
## Model Information
- **Base Architecture**: Qwen3-4B
- **Task**: Fortune Telling / I-Ching Interpretation
- **Context Window**: 32768 tokens
- **Fine-tuning Framework**: LLaMA-Factory
## Disclaimer
This model is intended for entertainment and research purposes only; please do not treat its outputs as factual predictions or real-world advice.