# Kanoonu AI – Phi-3 GGUF (Q4_K_M)

Quantized GGUF version of Kanoonu AI, ready for local deployment.
## Overview

This is the GGUF-quantized version of tejasgowda05/Kanoonu-AI-Phi3-Finetuned, a Phi-3-mini model fine-tuned on 23,370 Indian law Q&A pairs covering the Indian Penal Code (IPC), Code of Criminal Procedure (CrPC), Constitution of India, and other statutes.

The GGUF format allows the model to run locally on CPU or GPU without a high-end machine, making Indian legal information accessible to everyone.
## Available Files

| File | Quantization | Size | Use Case |
|---|---|---|---|
| `phi-3-mini-4k-instruct.Q4_K_M.gguf` | Q4_K_M | ~2.2 GB | Recommended – best balance of size and quality |
### What is Q4_K_M?

Q4_K_M is a 4-bit quantization scheme that compresses the model to ~2.2 GB with only a small quality loss relative to the full-precision weights. It runs comfortably on most modern laptops.
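As a rough sanity check on the file size, the arithmetic below assumes about 4.85 effective bits per weight for Q4_K_M (4-bit values plus per-block scale metadata; the exact average varies by tensor, so treat this as a sketch, not a spec):

```python
# Back-of-envelope size check for the Q4_K_M file.
# Assumption: ~4.85 effective bits per weight on average for Q4_K_M.
params = 3.8e9            # Phi-3-mini parameter count
bits_per_weight = 4.85    # approximate, includes quantization metadata
size_bytes = params * bits_per_weight / 8
print(f"{size_bytes / 1024**3:.2f} GiB")   # lands close to the ~2.2 GB quoted above
```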
## Quick Start

### Option 1 – Ollama (Easiest)

```bash
# Step 1 – Install Ollama from https://ollama.com/download
# Step 2 – Pull and run directly
ollama run hf.co/tejasgowda05/Kanoonu-AI-Phi3-GGUF:Q4_K_M
```
### Option 2 – llama.cpp CLI

```bash
llama-cli -hf tejasgowda05/Kanoonu-AI-Phi3-GGUF --jinja
```
### Option 3 – llama-cpp-python

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./kanoonu_model/phi-3-mini-4k-instruct.Q4_K_M.gguf",
    n_ctx=2048,      # context window size
    n_threads=4,     # CPU threads to use
)

response = llm(
    "<|system|>\nYou are Kanoonu AI, an expert Indian legal assistant.\n<|end|>\n"
    "<|user|>\nWhat is an FIR and how is it filed in India?<|end|>\n"
    "<|assistant|>\n",
    max_tokens=200,
    stop=["<|end|>", "<|endoftext|>"],
)
print(response["choices"][0]["text"])
```
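The raw prompt string in the llama-cpp-python example follows Phi-3's chat template. If you prefer building prompts from a message list, a small helper like the hypothetical `build_phi3_prompt` below (not part of this repo) can assemble the same format:

```python
def build_phi3_prompt(messages):
    """Assemble a Phi-3-style prompt from chat messages.

    Hypothetical helper: `messages` is a list of {"role", "content"}
    dicts with roles "system", "user", or "assistant", mirroring the
    template used in the llama-cpp-python example above.
    """
    parts = []
    for msg in messages:
        parts.append(f"<|{msg['role']}|>\n{msg['content']}<|end|>\n")
    parts.append("<|assistant|>\n")  # cue the model to start answering
    return "".join(parts)

prompt = build_phi3_prompt([
    {"role": "system", "content": "You are Kanoonu AI, an expert Indian legal assistant."},
    {"role": "user", "content": "What is an FIR and how is it filed in India?"},
])
```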
### Option 4 – Python with ctransformers

```python
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "tejasgowda05/Kanoonu-AI-Phi3-GGUF",
    model_file="phi-3-mini-4k-instruct.Q4_K_M.gguf",
    model_type="mistral",
)
print(llm("What are the fundamental rights in the Indian Constitution?"))
```
## Hardware Requirements

| Setup | Minimum Memory | Expected Speed |
|---|---|---|
| CPU only | 8 GB RAM | Slow (~1–2 tokens/sec) |
| CPU + 8 GB RAM | 8 GB RAM | Moderate (~3–5 tokens/sec) |
| GPU (4 GB VRAM) | 4 GB VRAM | Fast (~15–20 tokens/sec) |
| GPU (8 GB VRAM) | 8 GB VRAM | Very fast (~30+ tokens/sec) |
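A rough way to see where the 8 GB RAM figure comes from: resident memory is roughly the quantized weights plus the KV cache, plus runtime overhead. The sketch below assumes Phi-3-mini's published architecture (32 layers, hidden size 3072) and an fp16 KV cache; real usage will be somewhat higher:

```python
# Rough lower-bound RAM estimate for running the Q4_K_M file.
# Assumptions: ~2.2 GB of weights, fp16 KV cache, 32 layers, hidden 3072.
GiB = 1024**3
model_bytes = 2.2e9
n_layers, hidden, n_ctx = 32, 3072, 2048
kv_cache_bytes = 2 * n_layers * n_ctx * hidden * 2  # K and V, 2 bytes each (fp16)
total = model_bytes + kv_cache_bytes
print(f"KV cache: {kv_cache_bytes / GiB:.2f} GiB, total ≈ {total / GiB:.1f} GiB")
```

The total comes out well under 8 GB, which is why the model fits on an ordinary laptop once OS and runtime overhead are accounted for.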
## How This Was Created

```
microsoft/Phi-3-mini-4k-instruct (3.8B base model)
        ↓
QLoRA fine-tuning (24,607 Indian law Q&A pairs)
        ↓
tejasgowda05/Kanoonu-AI-Phi3-Finetuned (LoRA adapters)
        ↓
Merge LoRA → Convert to GGUF → Quantize to Q4_K_M (via Unsloth)
        ↓
tejasgowda05/Kanoonu-AI-Phi3-GGUF  ← you are here
```
| Training Metric | Value |
|---|---|
| Final Train Loss | 0.3478 |
| Best Eval Loss | 0.6568 |
| Training Examples | 23,370 |
| Training Time | ~270 minutes |
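Since the eval loss is a mean token-level cross-entropy (in nats), its exponential gives the model's perplexity on the held-out set, which can be a more intuitive way to read the number:

```python
import math

# Perplexity is exp(mean cross-entropy loss) when the loss is in nats.
eval_loss = 0.6568
perplexity = math.exp(eval_loss)
print(f"Eval perplexity: {perplexity:.2f}")  # ≈ 1.93
```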
## Related Resources

| Resource | Link |
|---|---|
| LoRA Adapter | tejasgowda05/Kanoonu-AI-Phi3-Finetuned |
| GGUF Model (this repo) | tejasgowda05/Kanoonu-AI-Phi3-GGUF |
| Formatted Dataset | tejasgowda05/Indian-Kanoonu-Dataset |
| Base Model | microsoft/Phi-3-mini-4k-instruct |
| Original Dataset | viber1/indian-law-dataset |
## Limitations & Disclaimer
- This model is intended for educational and informational purposes only
- It is not a substitute for professional legal advice
- Always consult a qualified lawyer for legal matters
- The model may occasionally produce inaccurate or outdated legal information
## Author

Tejas Gowda N – tejasgowda05

Built as part of the Kanoonu AI project: making Indian legal information accessible through conversational AI.
## License

Apache 2.0, inherited from microsoft/Phi-3-mini-4k-instruct and the original dataset.