# Qwen3-4b-thinking-abliterated (GGUF)
This is a surgically modified version of Qwen3-4B-Thinking-2507. It combines native reasoning (chain-of-thought) with full compliance (uncensored) through mathematical ablation.
## 🧬 Model DNA
- Base Model: Qwen3-4B-Thinking-2507
- Architecture: Qwen 3 (Native Thinking)
- Modification: Orthogonal Projection Abliteration
- Surgery Details: Targeted layers 10–28, scrubbing `o_proj` (attention output) and `down_proj` (MLP output) to remove refusal vectors while preserving 99.9% of reasoning logic
- Format: GGUF (Q4_K_M, FP16)
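The orthogonal-projection step described above can be sketched as follows. This is a minimal NumPy illustration, not the actual surgery script: in practice the refusal direction `r` is estimated from model activations on contrasting (harmful vs. harmless) prompts, and the projection is applied to the output side of each targeted `o_proj`/`down_proj` matrix.

```python
import numpy as np

def ablate_weight(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Project the refusal direction r out of a weight matrix W.

    W writes into the residual stream (like o_proj or down_proj);
    removing the component of its output along r means the layer can
    no longer emit the "refusal" feature, leaving everything
    orthogonal to r untouched.
    """
    r = r / np.linalg.norm(r)            # unit refusal direction
    # Orthogonal projector (I - r r^T) applied to W's output side
    return W - np.outer(r, r) @ W

# Toy example: 4-dim hidden state, random weight and refusal direction
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
r = rng.standard_normal(4)
W_ablated = ablate_weight(W, r)

# Any output of the ablated matrix has (near-)zero component along r
x = rng.standard_normal(4)
component = np.dot(r / np.linalg.norm(r), W_ablated @ x)
```

Because the projection only removes one direction per matrix, the rest of the learned transformation is preserved — which is why reasoning quality survives the surgery.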
## 🚀 Why this model is special
- Native Thinking: Unlike older models that mimic reasoning, this model possesses a native logic engine. It will automatically generate `<think>` blocks to plan complex code, catch logic bugs, and reason through edge cases.
- Zero Refusals: The refusal mechanism has been mathematically removed. The model will not lecture you on ethics or refuse technical requests, including penetration-testing scripts or controversial logic.
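Since the model emits its chain of thought inside a `<think>…</think>` block before the final answer, downstream code usually wants to separate the two. A small helper sketch (the tag format follows Qwen3's thinking output; the sample string is illustrative):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split a Qwen3-style response into (reasoning, answer).

    The chain of thought sits inside a <think>...</think> block before
    the final answer; strip it for display, keep it for debugging.
    If no block is present, reasoning is returned empty.
    """
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        return "", text.strip()
    reasoning = m.group(1).strip()
    answer = text[m.end():].strip()
    return reasoning, answer

# Illustrative model output
sample = "<think>Edge case: empty list returns 0.</think>Use sum(xs)."
reasoning, answer = split_thinking(sample)
```

Note that with some chat frontends the opening `<think>` tag may already be injected by the template, so the raw completion can start mid-block; handle the tag-absent case accordingly.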
## 💻 Optimal Usage (llama.cpp)
To run this on a 4GB card with a large context window, use these specific flags to enable KV Cache quantization:
```bash
./llama-server \
  -m qwen3-4b-thinking-abliterated-Q4_K_M.gguf \
  --ctx-size 8192 \
  --parallel 1 \
  -ctk q8_0 \
  -ctv q8_0 \
  --n-gpu-layers 100 \
  --temp 0.6 \
  --repeat-penalty 1.0 \
  -fa
```
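Once `llama-server` is running, it exposes an OpenAI-compatible `/v1/chat/completions` endpoint. A minimal stdlib client sketch (host and port assume llama-server's default `127.0.0.1:8080`; adjust if you passed `--host`/`--port`):

```python
import json
import urllib.request

def build_request(prompt: str, temperature: float = 0.6) -> dict:
    """Payload for llama-server's OpenAI-compatible chat endpoint."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,  # matches the --temp 0.6 flag above
        "max_tokens": 2048,
    }

def ask(prompt: str, host: str = "http://127.0.0.1:8080") -> str:
    """POST a prompt to a locally running llama-server, return the reply."""
    req = urllib.request.Request(
        f"{host}/v1/chat/completions",
        data=json.dumps(build_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The reply text will contain the model's `<think>` block ahead of the final answer, so pair this with a think-block splitter if you only want the answer.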