# granite-3.3-8b-base - AWQ (4-bit)
Source model: ibm-granite/granite-3.3-8b-base
This model was quantized to 4-bit using [LLM Compressor](https://github.com/vllm-project/llm-compressor) (the `llm-compressor` library from the vLLM project).
Quantization parameters: 4-bit weights, symmetric quantization scheme.
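The exact recipe used to produce this checkpoint is not published here. As a rough illustration, a matching AWQ run with `llm-compressor` could look like the sketch below; the calibration dataset (`open_platypus`), sample count, sequence length, and output directory are assumptions, not the settings actually used.

```python
# pip install llmcompressor
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier

MODEL_ID = "ibm-granite/granite-3.3-8b-base"

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# AWQ recipe: 4-bit symmetric weight-only quantization (W4A16),
# applied to all Linear layers except the output head.
recipe = [AWQModifier(ignore=["lm_head"], scheme="W4A16", targets=["Linear"])]

# One-shot calibration; dataset and sample count are illustrative.
oneshot(
    model=model,
    dataset="open_platypus",
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=256,
)

# Save in compressed-tensors format so vLLM can load it directly.
model.save_pretrained("granite-3.3-8b-base-awq-int4", save_compressed=True)
tokenizer.save_pretrained("granite-3.3-8b-base-awq-int4")
```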
## Usage
```python
# pip install vllm
from vllm import LLM

model = LLM("iproskurina/granite-3.3-8b-base-awq-int4")

# generate() returns a list of RequestOutput objects, one per prompt;
# the generated text lives in the nested .outputs[0].text field.
outputs = model.generate("The capital of France is")
print(outputs[0].outputs[0].text)
```