Qwopus3.5-9B-v3-abliterated-TQ3_4S
Pure TQ3_4S GGUF built locally from huihui-ai/Huihui-Qwopus3.5-9B-v3-abliterated.
Base source:
- huihui-ai/Huihui-Qwopus3.5-9B-v3-abliterated
- upstream family: Jackrong/Qwopus3.5-9B-v3
Artifacts:
- pure GGUF size: 4,491,580,736 bytes (~4.18 GiB)
- source F16 GGUF size: 17,920,693,568 bytes (~16.69 GiB)
Workflow used:
- Download safetensors from Hugging Face.
- Convert to F16 GGUF with convert_hf_to_gguf.py from turbo-tan/llama.cpp-tq3 (see the sketch below).
- Quantize with llama-quantize --pure from the same repo.
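A minimal sketch of the download and conversion steps, assuming the upstream llama.cpp interface for huggingface-cli and convert_hf_to_gguf.py; the turbo-tan/llama.cpp-tq3 fork may use different flags or paths:

# Illustrative only: local directory and output path are placeholders.
huggingface-cli download huihui-ai/Huihui-Qwopus3.5-9B-v3-abliterated \
  --local-dir ./Huihui-Qwopus3.5-9B-v3-abliterated
python3 convert_hf_to_gguf.py ./Huihui-Qwopus3.5-9B-v3-abliterated \
  --outtype f16 \
  --outfile /path/to/Qwopus3.5-9B-v3-abliterated-f16.gguf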
Important implementation note:
- The public llama.cpp-tq3 checkout needed local fixes so llama-quantize could actually quantize TQ3_4S end-to-end:
  - expose TQ3_1S and TQ3_4S in tools/quantize/quantize.cpp
  - map LLAMA_FTYPE_MOSTLY_TQ3_1S / TQ3_4S in src/llama-quant.cpp
  - wire GGML_TYPE_TQ3_4S quantization in ggml/src/ggml.c
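As a quick sanity check that those local patches took effect: upstream llama-quantize prints its list of allowed quantization types when invoked without valid arguments, so assuming the fork keeps that behavior the new type should show up in the list:

# Illustrative check only, not part of the original build log.
./build/bin/llama-quantize 2>&1 | grep -i "TQ3"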
Exact quantization command used:
./build/bin/llama-quantize --pure \
/path/to/Qwopus3.5-9B-v3-abliterated-f16.gguf \
/path/to/Qwopus3.5-9B-v3-abliterated-TQ3_4S.gguf \
TQ3_4S \
16
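A plain file-size check is enough to confirm the output matches the artifact size listed above:

ls -l /path/to/Qwopus3.5-9B-v3-abliterated-TQ3_4S.gguf
# expected size: 4491580736 bytes (~4.18 GiB)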
Runtime:
- target runtime: turbo-tan/llama.cpp-tq3
Example server command:
./build/bin/llama-server \
-m /path/to/Qwopus3.5-9B-v3-abliterated-TQ3_4S.gguf \
--host 127.0.0.1 --port 8080 \
-ngl 99 -np 2 --kv-unified -c 32768 \
-ctk q8_0 -ctv q8_0 -fa on \
--jinja --reasoning on --reasoning-format deepseek --reasoning-budget 2048 \
--alias qwopus-local
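Once the server is running, any OpenAI-compatible client can talk to it. A minimal curl chat request against the alias above, following the standard /v1/chat/completions schema:

# Example request; adjust host/port to match the server flags above.
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwopus-local",
    "messages": [{"role": "user", "content": "Write only the word ok."}],
    "max_tokens": 16
  }'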
Local smoke test:
- OpenAI-compatible server started successfully.
- /health returned {"status":"ok"}.
- /v1/models returned model id qwopus-local.
- Completion prompt "Write only the word ok." returned "ok".
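The same checks can be reproduced against a running server with curl; the completion endpoint below is an assumption, as the original check may have gone through the chat endpoint instead:

curl -s http://127.0.0.1:8080/health
# expect {"status":"ok"}
curl -s http://127.0.0.1:8080/v1/models
# expect an entry with "id": "qwopus-local"
curl -s http://127.0.0.1:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwopus-local", "prompt": "Write only the word ok.", "max_tokens": 4}'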
Perplexity:
- dataset: wikitext/wiki.test.raw
- successful evaluation command:
./build/bin/llama-perplexity \
-m /path/to/Qwopus3.5-9B-v3-abliterated-TQ3_4S.gguf \
-f /path/to/wiki.test.raw \
-ngl 0 -c 2048 -b 512 -ub 512 -fa off --no-kv-offload
- final estimate: PPL = 10.7488 +/- 0.07717
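To reproduce the measurement, wiki.test.raw is the WikiText-2 raw test split; assuming the fork still carries upstream llama.cpp's helper script, it can be fetched like this:

# Assumption: scripts/get-wikitext-2.sh is present as in upstream llama.cpp.
./scripts/get-wikitext-2.sh
# the extracted wiki.test.raw file is then passed to -f as in the command above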
Notes:
- Lower perplexity is better.
- The first high-throughput GPU eval attempt crashed late in the run with a CUDA sync error; the conservative settings above completed successfully and produced the perplexity reported here.