---
license: other
language:
  - zh
  - en
pipeline_tag: text-generation
tags:
  - fortune-telling
  - qwen
  - qwen2.5
  - qwen3
  - gguf
  - ollama
---

FortuneQwen3_4b

中文 | English

This is a 4B-parameter model fine-tuned from the Qwen3 architecture, designed specifically for fortune-telling tasks. This repository provides the merged Safetensors weights, GGUF quantized files, and a Modelfile for Ollama.

About This Repository

This repository contains the model in three formats:

  1. GGUF Quantized Model (Recommended):
    • Filename: FortuneQwen3_4b_q8_0.gguf (or other versions)
    • Description: Pre-converted GGUF format (Int8 quantized), ready for use with llama.cpp or Ollama.
  2. Modelfile:
    • Filename: Modelfile
    • Description: Configuration file for Ollama import, defining system prompts and parameters.
  3. Hugging Face Safetensors:
    • Filename: model.safetensors, etc.
    • Description: Full model parameters with the LoRA weights already merged, suitable for Transformers-based inference (see the example below), further fine-tuning, or exporting custom GGUF files.
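
For Transformers-based inference from the merged Safetensors weights, a minimal sketch is shown below. It assumes the transformers, torch, and accelerate packages are installed and loads the model by the repository id Tbata7/FortuneQwen3_4b (a local path to the cloned repository works as well); the prompt and generation settings are illustrative, not the author's recommended values.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Repository id on Hugging Face; a local path to the cloned repo also works
    model_id = "Tbata7/FortuneQwen3_4b"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

    # Illustrative prompt; phrase your own question as a chat message
    messages = [{"role": "user", "content": "Your question here..."}]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)

    outputs = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, skipping the prompt
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))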

Quick Start

Option 1: Using Ollama (Recommended)

You can quickly create an Ollama model using the pre-converted GGUF file found in this repository.

  1. Clone this repository (Git LFS is required to download the model weights):

    git lfs install
    git clone https://huggingface.co/Tbata7/FortuneQwen3_4b
    cd FortuneQwen3_4b
    
  2. Create the model:

    # This uses the local Modelfile and GGUF file
    ollama create FortuneQwen3_q8:4b -f Modelfile
    
  3. Run the model:

    ollama run FortuneQwen3_q8:4b
    

Option 2: Using llama.cpp

If you prefer to use the GGUF file directly with llama.cpp:

./llama-cli -m FortuneQwen3_4b_q8_0.gguf -p "Your question here..." -n 512
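
If you would rather call the GGUF file from Python instead of the CLI, a minimal sketch using the llama-cpp-python bindings is shown below. It assumes the package is installed (pip install llama-cpp-python); the prompt and token limit simply mirror the command above.

    from llama_cpp import Llama

    # Load the quantized GGUF; n_ctx can be raised up to the model's 32768-token window
    llm = Llama(model_path="./FortuneQwen3_4b_q8_0.gguf", n_ctx=4096)

    # Plain completion call, mirroring the llama-cli example above
    output = llm("Your question here...", max_tokens=512)
    print(output["choices"][0]["text"])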

Advanced Usage: Exporting Custom GGUF

If you wish to use a different quantization level (e.g., q4_k, q6_k, fp16), you can export a custom GGUF from the Safetensors weights using llama.cpp.

  1. Prepare Environment: Ensure the llama.cpp Python dependencies are installed (e.g., pip install -r llama.cpp/requirements.txt).

  2. Convert Model: Use the convert_hf_to_gguf.py script. You must specify the --outtype parameter to control the output type.

    • Export as FP16 (No quantization):

      python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_fp16.gguf --outtype f16
      
    • Export as Int8 (q8_0):

      python llama.cpp/convert_hf_to_gguf.py ./FortuneQwen3_4b --outfile FortuneQwen3_4b_q8_0.gguf --outtype q8_0
      
    • Other Quantizations: First export as f16, then use the llama-quantize tool:

      ./llama-quantize FortuneQwen3_4b_fp16.gguf FortuneQwen3_4b_q4_k_m.gguf q4_k_m
      

Model Information

  • Base Architecture: Qwen3-4B
  • Task: Fortune Telling / I-Ching Interpretation
  • Context Window: 32,768 tokens
  • Fine-tuning Framework: LLaMA-Factory

Disclaimer

This model is intended for entertainment and research purposes only. Please trust in science and do not treat its output as real-world guidance.