Llama 3, Meta's collection of pre-trained and instruction-tuned generative text models, has been making waves in the AI community. In this article, we'll explore how to fine-tune Llama 3 on a medical dataset and set it up for local use via the Jan application.

Understanding Llama 3
Llama 3 is an auto-regressive language model built on an optimized transformer architecture. It comes in two sizes, Llama 3 8B and Llama 3 70B, each available in pre-trained and instruction-tuned variants. Both have an 8K-token context length and offer strong performance for their size.
- Llama 3 8B: One of the most popular LLMs on Hugging Face, with an instruction-tuned version that outperforms Google's Gemma 7B-It and Mistral 7B Instruct.
- Llama 3 70B: Surpasses Gemini Pro 1.5 and Claude 3 Sonnet on several benchmarks.
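To get a feel for the model before fine-tuning, here is a minimal sketch of chatting with the instruction-tuned 8B model through Hugging Face transformers. It assumes you have been granted access to the gated meta-llama/Meta-Llama-3-8B-Instruct repository, are logged in via huggingface-cli login, and have a GPU with enough memory; recent transformers versions accept chat-format messages directly.

```python
# Minimal sketch: chat with Llama 3 8B Instruct via transformers.
# Assumes access to the gated repo and a login via `huggingface-cli login`.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "What are common symptoms of hypertension?"}]
out = pipe(messages, max_new_tokens=128)
print(out[0]["generated_text"][-1]["content"])
```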
Fine-Tuning Llama 3
- Dataset: We’ll use the ruslanmv/ai-medical-chatbot dataset, which contains 250k dialogues between patients and doctors (a loading sketch follows this list).
- Setup:
- Fill out the Meta download form with your Kaggle email address.
- Go to the Llama 3 model page on Kaggle and accept the agreement (approval may take 1-2 days).
- Kaggle Notebook:
- Launch a new Kaggle Notebook.
- Add the Llama 3 8B-Chat model.
- Fine-tune it on the medical dataset using Kaggle's free GPUs (see the training sketch after this list).
- Local Use with the Jan Application:
- Convert the fine-tuned model files into llama.cpp's GGUF format (see the conversion sketch after this list).
- Quantize the GGUF model and push it to the Hugging Face Hub.
- Now you’re ready to use your fine-tuned Llama 3 model locally!
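Here is how the pieces fit together in code. First, loading the dataset: a minimal sketch using the datasets library. The column names (Description, Patient, Doctor) reflect the dataset card, the 1,000-row subsample is only to keep a free Kaggle session fast, and the prompt template is a simplified stand-in rather than Llama 3's official chat template.

```python
# Sketch: load and subsample the medical dialogue dataset.
from datasets import load_dataset

dataset = load_dataset("ruslanmv/ai-medical-chatbot", split="train")
print(dataset)  # ~250k rows; columns: Description, Patient, Doctor

# Shuffle and take a small slice so fine-tuning fits a free Kaggle GPU.
dataset = dataset.shuffle(seed=42).select(range(1000))

# Format each row as one training text (simplified template; you may
# prefer to apply the model's own chat template instead).
def to_text(row):
    return {"text": f"<|user|>\n{row['Patient']}\n<|assistant|>\n{row['Doctor']}"}

dataset = dataset.map(to_text)
```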
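Next, training. An 8B model won't fit a free 16 GB Kaggle GPU in full precision, so this sketch uses QLoRA (4-bit base weights plus LoRA adapters) with TRL's SFTTrainer. This is one common recipe rather than the only option, and argument names have moved between SFTTrainer and SFTConfig across trl releases, so match them to your installed version.

```python
# Sketch: QLoRA fine-tuning of Llama 3 8B on the formatted dataset above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTTrainer, SFTConfig

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# Load the base model in 4-bit to fit a 16 GB GPU.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# Train small LoRA adapters on the attention projections only.
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,  # the formatted dataset from the sketch above
    peft_config=lora,
    args=SFTConfig(
        output_dir="llama3-medical",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        dataset_text_field="text",
        max_seq_length=512,
    ),
)
trainer.train()
trainer.save_model("llama3-medical")  # saves the LoRA adapter
```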
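Finally, conversion for Jan. The sketch below merges the LoRA adapter into a full-precision copy of the base model, converts it to GGUF with llama.cpp's helper script, quantizes it to Q4_K_M, and uploads the result. The script and binary names follow recent llama.cpp releases (older checkouts use convert-hf-to-gguf.py and quantize), and the Hub repo id is a placeholder for your own.

```python
# Sketch: merge the LoRA adapter, convert to GGUF, quantize, and upload.
# llama.cpp script/binary names vary by release; the repo id is a placeholder.
import subprocess
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from huggingface_hub import HfApi

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# 1. Reload the base model in fp16 and merge the trained LoRA adapter.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="float16")
merged = PeftModel.from_pretrained(base, "llama3-medical").merge_and_unload()
merged.save_pretrained("llama3-medical-merged")
AutoTokenizer.from_pretrained(base_id).save_pretrained("llama3-medical-merged")

# 2. Convert the merged checkpoint to GGUF (requires a llama.cpp clone).
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", "llama3-medical-merged",
     "--outfile", "llama3-medical-f16.gguf", "--outtype", "f16"],
    check=True,
)

# 3. Quantize to 4-bit so the model fits comfortably in local RAM.
subprocess.run(
    ["llama.cpp/llama-quantize", "llama3-medical-f16.gguf",
     "llama3-medical-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)

# 4. Push the quantized file to the Hugging Face Hub (placeholder repo id).
api = HfApi()
api.create_repo("your-username/llama3-medical-gguf", exist_ok=True)
api.upload_file(
    path_or_fileobj="llama3-medical-Q4_K_M.gguf",
    path_in_repo="llama3-medical-Q4_K_M.gguf",
    repo_id="your-username/llama3-medical-gguf",
)
```

Once the quantized file is on the Hub (or simply on disk), Jan can import the GGUF and run it fully offline.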
Conclusion
Llama 3 opens up exciting possibilities for natural language understanding and generation. Whether you’re a researcher, developer, or curious enthusiast, give Llama 3 a spin—it’s more than just a quirky name!