---
license: other
language:
- zh
- en
pipeline_tag: text-generation
tags:
- fortune-telling
- qwen
- qwen2.5
- qwen3
- gguf
- ollama
---


<div align="center">

# FortuneQwen3_4b



[**中文**](./README.md) | [**English**](./README_EN.md)



</div>



This is a 4B-parameter model fine-tuned from the Qwen3 architecture, specialized for fortune-telling tasks. This repository provides merged Safetensors weights, GGUF quantized files, and a Modelfile for Ollama.



## About This Repository



This repository contains the model in three formats:



1.  **GGUF Quantized Model (Recommended)**:
    *   Filename: `FortuneQwen3_4b_q8_0.gguf` (or other versions)
    *   Description: Pre-converted GGUF format (Int8 quantized), ready for use with `llama.cpp` or `Ollama`.
2.  **Modelfile**:
    *   Filename: `Modelfile`
    *   Description: Configuration file for Ollama import, defining system prompts and parameters.
3.  **Hugging Face Safetensors**:
    *   Filename: `model.safetensors`, etc.
    *   Description: Full model parameters with LoRA weights already merged, suitable for Transformers-based inference (see the sketch below), further fine-tuning, or exporting custom GGUF files.
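
For the Safetensors route, a minimal inference sketch using the `transformers` library (the generation settings here are illustrative, not tuned defaults):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tbata7/FortuneQwen3_4b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Build a chat-formatted prompt and generate a short reply.
messages = [{"role": "user", "content": "What does the hexagram Qian suggest about a new venture?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```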

## Quick Start

### Option 1: Using Ollama (Recommended)

You can quickly create an Ollama model using the pre-converted GGUF file found in this repository.

1.  **Clone this repository**:
    ```bash
    git clone https://huggingface.co/Tbata7/FortuneQwen3_4b
    cd FortuneQwen3_4b
    ```


2.  **Create the model**:
    ```bash
    # This uses the local Modelfile and GGUF file
    ollama create FortuneQwen3_q8:4b -f Modelfile
    ```
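
    The repository's `Modelfile` points Ollama at the GGUF file and defines the system prompt and sampling parameters. Its exact contents are not reproduced here; an illustrative sketch of the typical shape (prompt and parameters are assumptions):
    ```
    FROM ./FortuneQwen3_4b_q8_0.gguf
    PARAMETER temperature 0.7
    SYSTEM """You are a fortune-telling assistant specializing in I-Ching interpretation."""
    ```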


3.  **Run the model**:
    ```bash
    ollama run FortuneQwen3_q8:4b
    ```
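
    Once created, you can also query the model through Ollama's local REST API; a minimal example (the prompt is illustrative):
    ```bash
    curl http://localhost:11434/api/generate -d '{
      "model": "FortuneQwen3_q8:4b",
      "prompt": "What does the hexagram Qian suggest about starting a new venture?",
      "stream": false
    }'
    ```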


### Option 2: Using llama.cpp

If you prefer to use the GGUF file directly with `llama.cpp`:

```bash
./llama-cli -m FortuneQwen3_4b_q8_0.gguf -p "Your question here..." -n 512
```
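
`llama.cpp` also includes `llama-server`, which serves the model over an OpenAI-compatible HTTP API; a minimal sketch (port and prompt are illustrative):

```bash
./llama-server -m FortuneQwen3_4b_q8_0.gguf --port 8080
# In another shell:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Interpret the hexagram Kun for me."}]}'
```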

## Advanced Usage: Exporting Custom GGUF

If you wish to use a different quantization level (e.g., q4_k_m, q6_k, or f16), you can export a custom GGUF from the Safetensors weights using `llama.cpp`.

1.  **Prepare Environment**:
    Ensure you have `llama.cpp` python dependencies installed.
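
    A typical setup, assuming you clone `llama.cpp` from GitHub:
    ```bash
    git clone https://github.com/ggerganov/llama.cpp
    pip install -r llama.cpp/requirements.txt
    ```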


2.  **Convert Model**:
    Use the `convert_hf_to_gguf.py` script. You must specify the `--outtype` parameter to control the output type.


    *   **Export as FP16 (No quantization)**:
        ```bash
        python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_fp16.gguf --outtype f16
        ```


    *   **Export as Int8 (q8_0)**:
        ```bash
        python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_q8_0.gguf --outtype q8_0
        ```



    *   **Other Quantizations**:
        First export as f16, then use the `llama-quantize` tool:
        ```bash
        ./llama-quantize FortuneQwen3_4b_fp16.gguf FortuneQwen3_4b_q4_k_m.gguf q4_k_m
        ```
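
        To sanity-check the exported file, run a quick smoke test with `llama-cli` (prompt and token count are illustrative):
        ```bash
        ./llama-cli -m FortuneQwen3_4b_q4_k_m.gguf -p "Test question" -n 64
        ```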



## Model Information



- **Base Architecture**: Qwen3-4B
- **Task**: Fortune Telling / I-Ching Interpretation
- **Context Window**: 32,768 tokens
- **Fine-tuning Framework**: LLaMA-Factory



## Disclaimer



This model is intended for entertainment and research purposes only; do not treat its outputs as real guidance. Please trust science.