Latxa-Llama-3.1-70B-Instruct-GGUF

This repository contains GGUF format model files for HiTZ/Latxa-Llama-3.1-70B-Instruct.

Latxa is a family of Large Language Models (LLMs) specialized in Basque, built by the HiTZ Center. This model is based on Llama-3.1-70B-Instruct.

Model Details

Provided Files

File Name | Quant Method | Size | Description
Latxa-Llama-3.1-70B-Instruct.gguf | Q8_0 | ~75 GB | High fidelity, very low quality loss. Requires ~80 GB RAM/VRAM.

Note on Hardware: A 70B parameter model is large. To run the Q8_0 version, you need approximately 75-80 GB of available memory (system RAM if running on CPU, or VRAM if running on GPU).
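As a rough sanity check on that figure: Q8_0 stores roughly 8.5 bits per weight (8-bit values plus per-block scales). Both the 8.5-bit figure and the 71B parameter count below are approximations, not exact file-format accounting:

```python
# Rough Q8_0 size estimate: ~8.5 bits per weight (8-bit values plus block scales).
params = 71e9
bits_per_param = 8.5
size_gb = params * bits_per_param / 8 / 1e9
print(f"~{size_gb:.0f} GB")  # prints ~75 GB, in line with the file size above
```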


Usage

You can use this model with llama.cpp or any compatible UI.
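If you don't have the file locally yet, one way to fetch it is the Hugging Face CLI (a sketch; assumes the `huggingface_hub` package and downloads ~75 GB to the current directory):

```shell
# Install the CLI, then download the quantized file from this repository
pip install -U "huggingface_hub[cli]"
huggingface-cli download itzune/Latxa-Llama-3.1-70B-Instruct-GGUF \
  Latxa-Llama-3.1-70B-Instruct.gguf --local-dir .
```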

1. CLI Command (llama-cli)

Run the model in interactive mode using the command line:

./llama-cli -m Latxa-Llama-3.1-70B-Instruct.gguf \
  -p "Kaixo, azaldu iezadazu zer den adimen artifiziala labur." \
  -n 512 \
  -c 8192 \
  --color \
  -i
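Note that `llama-cli -p` sends a raw prompt. For instruct-style turns with a system message, you may want to format the prompt with the Llama 3.1 chat template first. A minimal sketch (the template shown follows the upstream Llama 3.1 format; verify against the model's tokenizer configuration):

```python
def llama31_prompt(system: str, user: str) -> str:
    """Format one system + user turn in the Llama 3.1 chat template."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = llama31_prompt(
    "Laguntzaile erabilgarria zara.",
    "Kaixo, azaldu iezadazu zer den adimen artifiziala labur.",
)
print(prompt)
```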

2. Server Command (llama-server)

This exposes an OpenAI-compatible API endpoint that you can use with other tools (like web UIs, VS Code extensions, or Python scripts).

./llama-server -m Latxa-Llama-3.1-70B-Instruct.gguf \
  --port 8080 \
  --host 0.0.0.0 \
  -c 8192 \
  --n-gpu-layers 80

  • --host 0.0.0.0: Makes the server accessible to other devices on your local network.
  • --n-gpu-layers 80: Offloads layers to the GPU. If you don't have enough VRAM, reduce this number or remove the flag to run on CPU.
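With the server running, a quick smoke test against the OpenAI-compatible chat endpoint (port as configured above; the model name is cosmetic for a single-model server):

```shell
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Latxa-70B",
    "messages": [
      {"role": "user", "content": "Kaixo! Nola zaude?"}
    ],
    "temperature": 0.7
  }'
```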

3. Python (using openai library)

Since llama-server is OpenAI compatible, you can query it with the standard openai Python library:

from openai import OpenAI

# llama-server does not check the API key, but the client requires one
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="Latxa-70B",
    messages=[
        {"role": "system", "content": "Laguntzaile erabilgarria zara."},
        {"role": "user", "content": "Idatzi poema labur bat itsasoari buruz."}
    ],
    temperature=0.7,
)

print(response.choices[0].message.content)
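For long generations, you can also stream tokens as they arrive. A sketch using the same client (assumes the server from section 2 is running on localhost:8080):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# stream=True yields chunks as the server generates tokens
stream = client.chat.completions.create(
    model="Latxa-70B",
    messages=[{"role": "user", "content": "Idatzi poema labur bat itsasoari buruz."}],
    temperature=0.7,
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```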

Credits

  • Original model by HiTZ Center.
  • GGUF conversion by xezpeleta.