Latxa-Llama-3.1-70B-Instruct-GGUF
This repository contains GGUF format model files for HiTZ/Latxa-Llama-3.1-70B-Instruct.
Latxa is a family of Large Language Models (LLMs) specialized in Basque, built by the HiTZ Center. This model is based on Llama-3.1-70B-Instruct.
Model Details
- Original Model: HiTZ/Latxa-Llama-3.1-70B-Instruct
- Quantized by: xezpeleta
- Format: GGUF (compatible with llama.cpp, LM Studio, Ollama, etc.)
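After downloading, you can sanity-check that a file really is GGUF by inspecting its fixed-size header. The sketch below is plain Python with no dependencies and assumes the standard little-endian GGUF header layout (4-byte `GGUF` magic, uint32 version, uint64 tensor count, uint64 metadata key/value count); it is an illustration, not part of any official tooling.

```python
import struct

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed 24-byte GGUF header from the start of a file.

    Layout (little-endian): 4-byte magic b"GGUF", uint32 version,
    uint64 tensor count, uint64 metadata key/value count.
    """
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# Synthetic header for demonstration; with a real file you would pass
# open("Latxa-Llama-3.1-70B-Instruct.gguf", "rb").read(24) instead.
header = struct.pack("<4sIQQ", b"GGUF", 3, 723, 42)
print(read_gguf_header(header))
```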
Provided Files
| File Name | Quant Method | Size | Description |
|---|---|---|---|
| Latxa-Llama-3.1-70B-Instruct.gguf | Q8_0 | ~75 GB | High fidelity, very low quality loss. Requires ~80 GB RAM/VRAM. |
Note on Hardware: A 70B parameter model is large. To run the Q8_0 version, you need approximately 75-80 GB of available RAM (System RAM if running on CPU, or VRAM if running on GPU).
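The ~75 GB figure follows from the quantization width: Q8_0 stores each block of 32 weights as 32 int8 values plus one fp16 scale (34 bytes per 32 weights, i.e. 8.5 bits per weight). A back-of-the-envelope sketch, assuming 70e9 parameters:

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough model-file size in GB (10^9 bytes) for a given quantization."""
    return n_params * bits_per_weight / 8 / 1e9

# Q8_0: 32 int8 weights + one fp16 scale per block -> 34 bytes / 32 weights
q8_0_bits = 34 * 8 / 32  # 8.5 bits per weight
print(round(gguf_size_gb(70e9, q8_0_bits)))  # ~74, matching the ~75 GB above
```

The extra headroom in the ~80 GB recommendation covers the KV cache and runtime buffers, which grow with context size.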
Usage
You can use this model with llama.cpp or any compatible UI.
1. CLI Command (llama-cli)
Run the model in interactive mode using the command line:
```bash
./llama-cli -m Latxa-Llama-3.1-70B-Instruct.gguf \
    -p "Kaixo, azaldu iezadazu zer den adimen artifiziala labur." \
    -n 512 \
    -c 8192 \
    --color \
    -i
```
2. Server Command (llama-server)
This exposes an OpenAI-compatible API endpoint that you can use with other tools (like web UIs, VS Code extensions, or Python scripts).
```bash
./llama-server -hf itzune/Latxa-Llama-3.1-70B-Instruct-GGUF \
    --port 8080 \
    --host 0.0.0.0 \
    -c 8192 \
    --n-gpu-layers 80
```
- `--host 0.0.0.0`: Makes the server accessible to other devices on your local network.
- `--n-gpu-layers 80`: Offloads layers to the GPU. If you don't have enough VRAM, reduce this number or remove the flag to run on CPU.
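If you are unsure how many layers fit in VRAM, a rough rule of thumb is to divide usable VRAM by the per-layer size: Llama-3.1-70B has 80 transformer layers, so at Q8_0 each layer is roughly 75 GB / 80 ≈ 0.94 GB, ignoring embeddings, the KV cache, and activation buffers. The helper below is a hypothetical sketch of that arithmetic, not part of llama.cpp:

```python
def estimate_gpu_layers(vram_gb: float, model_gb: float = 75.0,
                        n_layers: int = 80, reserve_gb: float = 2.0) -> int:
    """Rough starting value for --n-gpu-layers.

    Divides VRAM (minus a reserve for the KV cache and runtime buffers)
    by the average per-layer size. It ignores the non-repeating parts of
    the model (embeddings, output head), so treat the result as a
    starting point and adjust down if you hit out-of-memory errors.
    """
    per_layer = model_gb / n_layers
    usable = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable / per_layer))

print(estimate_gpu_layers(24.0))  # a single 24 GB card: ~23 layers
print(estimate_gpu_layers(96.0))  # 96 GB: all 80 layers fit
```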
3. Python (using openai library)
Since llama-server is OpenAI compatible, you can query it using the standard Python libraries:
```python
from openai import OpenAI

# llama-server ignores the API key, but the client library requires a value
client = OpenAI(base_url="http://localhost:8080/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="Latxa-70B",
    messages=[
        {"role": "system", "content": "Laguntzaile erabilgarria zara."},
        {"role": "user", "content": "Idatzi poema labur bat itsasoari buruz."}
    ],
    temperature=0.7,
)

print(response.choices[0].message.content)
```
Credits
- Original model by HiTZ Center.
- GGUF conversion by xezpeleta.
- Base model: meta-llama/Llama-3.1-70B