# 🦥 Finetune Gemma 3 with Unsloth (simple 4-bit LoRA)
Minimal example to finetune Google's Gemma 3 Instruct models with Unsloth using 4-bit loading + LoRA. Small, readable, and runnable on a CUDA GPU.
- Models: 270M, 1B, 4B, 12B, 27B
- Dataset: FineTome-100k (ShareGPT-style multi-turn chats)
- Method: Parameter-efficient LoRA (not full FT)
Reference: Unsloth's Gemma 3 notes at [unsloth.ai/blog/gemma3](https://unsloth.ai/blog/gemma3).
## Install

```bash
pip install -r requirements.txt
```

Or install the latest Unsloth per their guidance:

```bash
pip install --upgrade --force-reinstall --no-cache-dir unsloth unsloth_zoo
```
## Run

```bash
python finetune_gemma3.py
```

Outputs are saved to `finetuned_model/`.
## What the script does

- Loads Gemma 3 with 4-bit quantization via Unsloth's `FastModel`.
- Attaches LoRA adapters to the attention/MLP projections.
- Prepares FineTome-100k by applying the Gemma 3 chat template.
- Trains with TRL's `SFTTrainer` for a few demo steps.
- Saves the finetuned weights.
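For orientation, here is a condensed sketch of that flow, modeled on Unsloth's public Gemma 3 examples. The helper names (`FastModel.from_pretrained`, `FastModel.get_peft_model`, `get_chat_template`, `standardize_data_formats`) follow those examples, and every hyperparameter value below is an illustrative assumption, not a copy of the repo's script:

```python
# Condensed sketch of the flow above, modeled on Unsloth's public Gemma 3
# examples. All hyperparameters here are illustrative; the real
# finetune_gemma3.py may differ.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastModel
from unsloth.chat_templates import get_chat_template, standardize_data_formats

# 1) Load Gemma 3 with 4-bit quantization (CUDA required).
model, tokenizer = FastModel.from_pretrained(
    model_name="unsloth/gemma-3-270m-it",
    max_seq_length=2048,        # assumed demo value
    load_in_4bit=True,
    full_finetuning=False,      # LoRA, not full finetuning
)

# 2) Attach LoRA adapters to the attention/MLP projections.
model = FastModel.get_peft_model(
    model,
    r=8,                        # assumed LoRA rank
    lora_alpha=8,
    lora_dropout=0.0,
    finetune_attention_modules=True,
    finetune_mlp_modules=True,
)

# 3) Prepare FineTome-100k with the Gemma 3 chat template.
tokenizer = get_chat_template(tokenizer, chat_template="gemma-3")
dataset = load_dataset("mlabonne/FineTome-100k", split="train")
dataset = standardize_data_formats(dataset)  # ShareGPT -> role/content turns

def to_text(batch):
    # Render each multi-turn chat to one training string; Unsloth's examples
    # strip the leading <bos> because the tokenizer re-adds it at train time.
    texts = [
        tokenizer.apply_chat_template(c, tokenize=False).removeprefix("<bos>")
        for c in batch["conversations"]
    ]
    return {"text": texts}

dataset = dataset.map(to_text, batched=True)

# 4) Train for a few demo steps with TRL's SFTTrainer.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=30,           # demo-length run
        learning_rate=2e-4,
        optim="adamw_8bit",
        output_dir="finetuned_model",
    ),
)
trainer.train()

# 5) Save the finetuned weights (LoRA adapters + tokenizer).
model.save_pretrained("finetuned_model")
tokenizer.save_pretrained("finetuned_model")
```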
## Change model or settings

Edit the constants at the top of `finetune_gemma3.py`:

- `MODEL_NAME` (e.g., `unsloth/gemma-3-270m-it`, `unsloth/gemma-3-1b-it`)
- `MAX_SEQ_LEN`, `LOAD_IN_4BIT`, `FULL_FINETUNING`
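For reference, that block might look like the following; the default values shown are assumptions, not copied from the repo:

```python
# Top of finetune_gemma3.py (illustrative defaults, not the repo's exact values)
MODEL_NAME = "unsloth/gemma-3-270m-it"   # or "unsloth/gemma-3-1b-it", 4B, 12B, 27B
MAX_SEQ_LEN = 2048                        # assumed; longer contexts cost more VRAM
LOAD_IN_4BIT = True                       # 4-bit quantized loading (CUDA only)
FULL_FINETUNING = False                   # False = parameter-efficient LoRA
```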
Note: 4-bit/8-bit loading requires a CUDA GPU. On a Mac (M1/M2), run on CPU/MPS without quantization, or use a machine with a CUDA GPU.
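If you want the script to pick that fallback automatically, a minimal guard (a sketch, not part of the shipped script) could be:

```python
import torch

# Only request 4-bit loading when a CUDA GPU is available; elsewhere
# (e.g., Apple Silicon) fall back to unquantized weights.
LOAD_IN_4BIT = torch.cuda.is_available()
```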