Llama 3, Meta's family of pre-trained and instruction-tuned generative text models, has been making waves in the AI community. In this article, we’ll explore how to fine-tune Llama 3 on a medical dataset and set it up for local use via the Jan application.


Understanding Llama 3

Llama 3 is an auto-regressive language model built on an optimized transformer architecture. It comes in two sizes, Llama 3 8B and Llama 3 70B, both with an 8K-token context window and impressive performance.

  • Llama 3 8B: One of the most downloaded LLMs on Hugging Face, with an instruction-tuned version that Meta reports outperforms Google’s Gemma 7B-It and Mistral 7B Instruct on common benchmarks.
  • Llama 3 70B: Reported to surpass Gemini Pro 1.5 and Claude 3 Sonnet on several benchmarks.

Fine-Tuning Llama 3

  1. Dataset: We’ll use the ruslanmv/ai-medical-chatbot dataset, which contains roughly 250k dialogues between patients and doctors (a loading sketch follows this list).
  2. Setup:
    • Fill out the Meta download form with your Kaggle email address.
    • Go to the Llama 3 model page on Kaggle and accept the agreement (approval may take 1-2 days).
  3. Kaggle Notebook:
    • Launch a new Kaggle Notebook.
    • Add the Llama 3 8B-Chat model.
    • Fine-tune it on the medical dataset using the free GPUs (see the QLoRA sketch after this list).
  4. Local Use with Jan Application:
    • Convert the model files to llama.cpp’s GGUF format.
    • Quantize the GGUF model and push it to the Hugging Face Hub (a conversion sketch follows this list).
    • Import the quantized file into Jan, and you’re ready to use your fine-tuned Llama 3 model locally!
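
As a quick reference for step 1, here’s a minimal sketch of loading the dataset with the Hugging Face datasets library. The column names (Description, Patient, Doctor) follow the dataset card, and the 1,000-row slice is just to keep a free-GPU run short.

```python
# Minimal sketch of step 1: load and slice the medical dialogue dataset.
# Requires the Hugging Face `datasets` library (pip install datasets).
from datasets import load_dataset

dataset = load_dataset("ruslanmv/ai-medical-chatbot", split="train")

# A small shuffled slice keeps a free-GPU experiment short; scale up later.
dataset = dataset.shuffle(seed=42).select(range(1000))
print(dataset[0])  # fields per the dataset card: Description, Patient, Doctor
```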
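
For step 3, here is a QLoRA-style fine-tuning sketch. It follows the trl 0.8-era SFTTrainer API; argument names have shifted across transformers/peft/trl releases, so verify them against the versions in your Kaggle environment, and treat the hyperparameters as illustrative defaults rather than tuned values.

```python
# A minimal QLoRA fine-tuning sketch for step 3 (Kaggle Notebook).
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from peft import LoraConfig
from trl import SFTTrainer

base_model = "meta-llama/Meta-Llama-3-8B-Instruct"  # or the Kaggle model path

# 4-bit quantization so the 8B model fits on a free Kaggle GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# Flatten each patient/doctor exchange into a single text column.
def to_text(row):
    return {"text": f"Patient: {row['Patient']}\nDoctor: {row['Doctor']}"}

dataset = load_dataset("ruslanmv/ai-medical-chatbot", split="train")
dataset = dataset.shuffle(seed=42).select(range(1000)).map(to_text)

# LoRA adapters on the attention projections keep trainable weights tiny.
peft_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=512,
    tokenizer=tokenizer,
    args=TrainingArguments(
        output_dir="llama3-medical",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=25,
    ),
)
trainer.train()
trainer.model.save_pretrained("llama3-medical-adapter")
```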
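
And for step 4, a hedged sketch of the conversion and upload. It assumes you have already merged the LoRA adapter into the base weights (for example with peft’s merge_and_unload()) and saved the result to a hypothetical llama3-medical-merged directory; the llama.cpp script and binary names below (convert_hf_to_gguf.py, llama-quantize) match recent checkouts but have been renamed over time, so check your clone.

```python
# Hedged sketch of step 4: convert the merged model to GGUF, quantize it,
# and upload the result. Assumes a local clone of llama.cpp; script and
# binary names follow recent layouts and may differ in older checkouts.
import subprocess
from huggingface_hub import HfApi

# 1. Convert the merged Hugging Face checkpoint to a full-precision GGUF file.
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", "llama3-medical-merged",
     "--outfile", "llama3-medical-f16.gguf", "--outtype", "f16"],
    check=True,
)

# 2. Quantize to 4-bit (Q4_K_M) so the model runs comfortably on a laptop.
subprocess.run(
    ["llama.cpp/llama-quantize", "llama3-medical-f16.gguf",
     "llama3-medical-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)

# 3. Push the quantized file to the Hugging Face Hub for use in Jan.
repo_id = "your-username/llama3-medical-gguf"  # hypothetical repo name
api = HfApi()
api.create_repo(repo_id=repo_id, exist_ok=True)  # no-op if it already exists
api.upload_file(
    path_or_fileobj="llama3-medical-Q4_K_M.gguf",
    path_in_repo="llama3-medical-Q4_K_M.gguf",
    repo_id=repo_id,
)
```

Once the upload finishes, download the Q4_K_M file from the Hub (or point Jan at the local copy) and import it through Jan’s model hub to chat with your fine-tuned model offline.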

Conclusion

Llama 3 opens up exciting possibilities for natural language understanding and generation. Whether you’re a researcher, developer, or curious enthusiast, give Llama 3 a spin—it’s more than just a quirky name!