# Llama-3.1-8B - AWQ (4-bit)
Source model: meta-llama/Llama-3.1-8B
This model was quantized to 4-bit using [llm-compressor](https://github.com/vllm-project/llm-compressor).
Quantization parameters: 4-bit weights, symmetric scheme.
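
The exact recipe is not included with this card; the snippet below is a minimal sketch of how a comparable checkpoint could be produced with llm-compressor's `AWQModifier`. The `W4A16` scheme (4-bit symmetric weights, 16-bit activations), the calibration dataset, and the sample counts are assumptions, not the settings actually used here.

```python
# pip install llmcompressor
# Sketch only: scheme, calibration dataset, and sample counts are assumed.
from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier

recipe = AWQModifier(
    scheme="W4A16",       # 4-bit symmetric weight quantization (assumed preset)
    targets=["Linear"],   # quantize all Linear layers ...
    ignore=["lm_head"],   # ... except the output head
)

oneshot(
    model="meta-llama/Llama-3.1-8B",
    dataset="open_platypus",      # assumed calibration set
    recipe=recipe,
    output_dir="Llama-3.1-8B-awq-int4",
    max_seq_length=2048,          # assumed
    num_calibration_samples=512,  # assumed
)
```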
## Usage
```python
# pip install vllm
from vllm import LLM, SamplingParams

# vLLM reads the quantization config from the checkpoint automatically.
llm = LLM("iproskurina/Llama-3.1-8B-awq-int4")

# generate() returns a list of RequestOutput objects, one per prompt.
outputs = llm.generate("The capital of France is", SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```