Latxa-Llama-3.1-70B-Instruct-GGUF

This repository contains GGUF format model files for HiTZ/Latxa-Llama-3.1-70B-Instruct.

Latxa is a family of Large Language Models (LLMs) specialized in Basque, built by the HiTZ Center. This model is based on Llama-3.1-70B-Instruct.

Model Details

Provided Files

File Name | Quant Method | Size | Description
Latxa-Llama-3.1-70B-Instruct.gguf | Q8_0 | ~75 GB | High fidelity, very low quality loss. Requires ~80 GB RAM/VRAM.

Note on Hardware: A 70B parameter model is large. To run the Q8_0 version, you need approximately 75-80 GB of available memory (system RAM if running on CPU, or VRAM if running on GPU).
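As a rough sanity check on that figure: Q8_0 stores roughly 8.5 bits per weight (8-bit values plus per-block scales). Both the 8.5-bit figure and the 71B parameter count below are approximations, not exact file-format accounting:

```python
# Rough Q8_0 size estimate: ~8.5 bits per weight (8-bit values plus block scales).
params = 71e9
bits_per_param = 8.5
size_gb = params * bits_per_param / 8 / 1e9
print(f"~{size_gb:.0f} GB")  # prints ~75 GB, in line with the file size above
```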


Usage

You can use this model with llama.cpp or any compatible UI.
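If you don't have the file locally yet, one way to fetch it is the Hugging Face CLI (a sketch; assumes the `huggingface_hub` package and downloads ~75 GB to the current directory):

```shell
# Install the CLI, then download the quantized file from this repository
pip install -U "huggingface_hub[cli]"
huggingface-cli download itzune/Latxa-Llama-3.1-70B-Instruct-GGUF \
  Latxa-Llama-3.1-70B-Instruct.gguf --local-dir .
```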

1. CLI Command (llama-cli)

Run the model in interactive mode using the command line:

./llama-cli -m Latxa-Llama-3.1-70B-Instruct.gguf \
  -p "Kaixo, azaldu iezadazu zer den adimen artifiziala labur." \
  -n 512 \
  -c 8192 \
  --color \
  -i
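Note that `llama-cli -p` sends a raw prompt. For instruct-style turns with a system message, you may want to format the prompt with the Llama 3.1 chat template first. A minimal sketch (the template shown follows the upstream Llama 3.1 format; verify against the model's tokenizer configuration):

```python
def llama31_prompt(system: str, user: str) -> str:
    """Format one system + user turn in the Llama 3.1 chat template."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = llama31_prompt(
    "Laguntzaile erabilgarria zara.",
    "Kaixo, azaldu iezadazu zer den adimen artifiziala labur.",
)
print(prompt)
```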

2. Server Command (llama-server)

This exposes an OpenAI-compatible API endpoint that you can use with other tools (like web UIs, VS Code extensions, or Python scripts).

./llama-server -m Latxa-Llama-3.1-70B-Instruct.gguf \
  --port 8080 \
  --host 0.0.0.0 \
  -c 8192 \
  --n-gpu-layers 80

  • --host 0.0.0.0: Makes the server accessible to other devices on your local network.
  • --n-gpu-layers 80: Offloads layers to the GPU. If you don't have enough VRAM, reduce this number or remove the flag to run on CPU.
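With the server running, a quick smoke test against the OpenAI-compatible chat endpoint (port as configured above; the model name is cosmetic for a single-model server):

```shell
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Latxa-70B",
    "messages": [
      {"role": "user", "content": "Kaixo! Nola zaude?"}
    ],
    "temperature": 0.7
  }'
```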

3. Python (using openai library)

Since llama-server is OpenAI compatible, you can query it with the standard openai Python library:

from openai import OpenAI

# llama-server does not check the API key, but the client requires one
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="Latxa-70B",
    messages=[
        {"role": "system", "content": "Laguntzaile erabilgarria zara."},
        {"role": "user", "content": "Idatzi poema labur bat itsasoari buruz."}
    ],
    temperature=0.7,
)

print(response.choices[0].message.content)
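For long generations, you can also stream tokens as they arrive. A sketch using the same client (assumes the server from section 2 is running on localhost:8080):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# stream=True yields chunks as the server generates tokens
stream = client.chat.completions.create(
    model="Latxa-70B",
    messages=[{"role": "user", "content": "Idatzi poema labur bat itsasoari buruz."}],
    temperature=0.7,
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```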

Credits

  • Original model by HiTZ Center.
  • GGUF conversion by xezpeleta.