Fine-Tuning LLMs with Unsloth: A Practical Guide

Introduction

Fine-tuning large language models (LLMs) requires significant GPU memory and compute resources. This makes it challenging for researchers or small teams to adapt large models on limited hardware. Unsloth is a library that optimizes training, reducing memory consumption and accelerating fine-tuning.

In this post, we will look at the main advantages of Unsloth for fine-tuning, followed by a practical implementation of Parameter-Efficient Fine-Tuning (PEFT) with LoRA.


Why Use Unsloth for Fine-Tuning?

Unsloth provides several advantages:

  1. Memory Efficiency – Fine-tune large models with smaller GPUs by leveraging 4-bit quantization (see the sketch after this list).

  2. Faster Training – Optimized kernels deliver 2–5x speed improvements.

  3. Parameter-Efficient Fine-Tuning (PEFT) – Supports LoRA and QLoRA, so only a small subset of parameters is updated.

  4. Compatibility – Works seamlessly with Hugging Face transformers, datasets, and peft.

  5. Flexibility – Can be applied to models such as GPT-2, LLaMA, Mistral, and Falcon.
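
To make the memory and parameter claims above concrete, here is a rough back-of-the-envelope sketch (plain Python, not Unsloth code). The model size, layer count, hidden size, and LoRA rank are illustrative assumptions, not measurements.

def weight_memory_gb(n_params, bits_per_param):
    # Approximate memory needed just to store the model weights
    return n_params * bits_per_param / 8 / 1e9

def lora_trainable_params(n_layers, hidden_size, r, matrices_per_layer=4):
    # Each adapted d x d weight gets two low-rank factors: A (r x d) and B (d x r)
    return n_layers * matrices_per_layer * 2 * hidden_size * r

n_params = 7e9  # assume a 7B-parameter model
print(f"fp16 weights: ~{weight_memory_gb(n_params, 16):.1f} GB")
print(f"4-bit weights: ~{weight_memory_gb(n_params, 4):.1f} GB")

# Assume 32 layers, hidden size 4096, LoRA rank r=16 on 4 attention matrices
trainable = lora_trainable_params(32, 4096, 16)
print(f"LoRA trainable params: ~{trainable / 1e6:.1f}M ({trainable / n_params:.2%} of the model)")

Even under these rough assumptions, 4-bit storage cuts weight memory by roughly 4x compared with fp16, and the LoRA adapter trains well under 1% of the model's parameters.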


Setup

Install the required packages:

pip install unsloth transformers datasets accelerate bitsandbytes peft
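
Unsloth targets NVIDIA GPUs, so a quick sanity check before training can save time. This optional check uses standard PyTorch calls:

import torch

# Verify a CUDA GPU is visible before attempting 4-bit training
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    print("VRAM (GB):", torch.cuda.get_device_properties(0).total_memory / 1e9)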

Step 1: Load the Dataset

We will use the Alpaca dataset for demonstration.

from datasets import load_dataset

dataset = load_dataset("yahma/alpaca-cleaned")
print(dataset["train"][0])
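
The training split contains roughly 50k examples, so for a quick first experiment you may want to work with a subset. This is optional and uses the standard datasets API:

# Optional: iterate on a smaller slice before training on the full dataset
small_train = dataset["train"].select(range(1000))
print(len(small_train))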

Step 2: Format the Dataset

We need to convert the instruction, input, and output fields into a single text sequence.

def format_alpaca(example):
    if example["input"]:
        return {
            "text": f"### Instruction:\n{example['instruction']}\n### Input:\n{example['input']}\n### Response:\n{example['output']}"
        }
    else:
        return {
            "text": f"### Instruction:\n{example['instruction']}\n### Response:\n{example['output']}"
        }

dataset = dataset.map(format_alpaca)
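
It is worth printing one formatted example to confirm the prompt template looks as intended before training:

# Inspect a single formatted example
print(dataset["train"][0]["text"])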

Step 3: Load Model with Unsloth

We load GPT-2 with 4-bit quantization using Unsloth. GPT-2 is small enough to fine-tune almost anywhere, but the same workflow applies to larger models such as LLaMA or Mistral.

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="gpt2",
    load_in_4bit=True,
    max_seq_length=512,
)

# GPT-2 has no pad token by default, so reuse the end-of-sequence token
tokenizer.pad_token = tokenizer.eos_token
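
To see the effect of 4-bit loading, you can check the model's memory footprint. This assumes the returned model exposes the standard transformers get_memory_footprint() method:

# Approximate memory used by the loaded model weights, in GB
print(f"Model memory footprint: {model.get_memory_footprint() / 1e9:.2f} GB")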

Step 4: Apply PEFT with LoRA

Using Unsloth’s integration with peft, we apply LoRA for parameter-efficient fine-tuning.

model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
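
Since the returned object is a PEFT model, you can confirm how few parameters are actually trainable. This assumes the standard peft print_trainable_parameters() method is available on the wrapped model:

# Report trainable vs. total parameters after attaching the LoRA adapter
model.print_trainable_parameters()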

Step 5: Tokenize the Dataset

def tokenize_function(examples):
    return tokenizer(
        examples["text"],
        truncation=True,
        padding="max_length",
        max_length=512,
    )

tokenized_dataset = dataset.map(tokenize_function, batched=True)

Step 6: Fine-Tune with Hugging Face Trainer

from transformers import TrainingArguments, Trainer, DataCollatorForLanguageModeling

training_args = TrainingArguments(
    output_dir="./gpt2-unsloth",
    per_device_train_batch_size=2,
    num_train_epochs=2,
    learning_rate=2e-5,
    fp16=True,
    logging_steps=50,
    save_steps=200,
)

# For causal language modeling, the collator copies input_ids into labels
# (with padding masked out) so the Trainer can compute a loss.
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    tokenizer=tokenizer,
    data_collator=data_collator,
)

trainer.train()
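
Step 7 below loads the model from the ./gpt2-unsloth directory, so it helps to explicitly save the final weights and tokenizer there after training (trainer.save_model() writes to the given directory):

# Persist the final fine-tuned model and tokenizer for later use
trainer.save_model("./gpt2-unsloth")
tokenizer.save_pretrained("./gpt2-unsloth")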

Step 7: Test the Fine-Tuned Model

from transformers import pipeline

generator = pipeline("text-generation", model="./gpt2-unsloth", tokenizer=tokenizer)

prompt = "### Instruction:\nExplain why the sky is blue.\n### Response:\n"
output = generator(prompt, max_length=200, num_return_sequences=1)
print(output[0]["generated_text"])
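
If you prefer to test without reloading from disk, you can also generate directly with the in-memory model. This is a minimal sketch using the standard transformers generate() call; Unsloth additionally offers FastLanguageModel.for_inference(model) to speed up generation, but the plain call below works without it.

import torch

prompt = "### Instruction:\nExplain why the sky is blue.\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate up to 200 new tokens from the fine-tuned model
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=200)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))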

Conclusion

With Unsloth and PEFT, we can fine-tune models like GPT-2 efficiently, even on limited GPU resources. By combining 4-bit quantization and LoRA-based PEFT, training becomes both memory-efficient and significantly faster.

Unsloth makes it feasible for individuals and small teams to experiment with instruction tuning and other fine-tuning methods without requiring high-end hardware.
