PEFT: Efficient Fine-Tuning with Hyperparameters

Parameter-Efficient Fine-Tuning (PEFT) is a game-changer when it comes to adapting large language models to your specific domain without the headache of full model fine-tuning. It’s especially popular when training compute-heavy models like LLaMA, Falcon, or Mistral on tasks like legal reasoning, healthcare Q&A, or document extraction.

In this blog, we’ll focus on:

  • What exactly PEFT is
  • Why it works
  • The most important hyperparameters
  • What changing each hyperparameter actually does
  • Code examples

What is PEFT?

PEFT methods train a small number of parameters added to a pretrained model — keeping the base frozen. You get:

  • Minimal GPU memory usage
  • Fast training
  • High performance on domain-specific tasks

Popular PEFT methods:

  • LoRA (Low-Rank Adaptation)
  • Prefix Tuning
  • Adapters
  • IA3 (Infused Adapter by Inhibiting and Amplifying Inner Activations)

We’ll focus on LoRA, the most widely used.
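
Before going deeper, a quick way to see how small the trainable part really is: wrap any causal LM with a default LoRA config and print the parameter counts. A minimal sketch (the small OPT checkpoint is only illustrative; any causal LM works):

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Wrap a small base model with LoRA and compare trainable vs. total parameters.
base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
peft_model = get_peft_model(base, LoraConfig(task_type="CAUSAL_LM", r=8))
peft_model.print_trainable_parameters()
# Roughly: trainable params ~0.8M || all params ~331M || trainable% ~0.24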

Key Components of LoRA

LoRA works by injecting trainable low-rank matrices into the linear layers of a transformer. Instead of updating the full weight matrix W, it keeps W frozen and learns a low-rank update ΔW = B @ A (scaled by lora_alpha / r), where A and B are much smaller trainable matrices.

This saves memory and allows faster updates. Combined with 8-bit or 4-bit quantization of the frozen base weights, you can fine-tune models like LLaMA-7B on a single 16GB GPU with LoRA.
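
A minimal numeric sketch of that idea (the dimensions are illustrative, and the scaling follows the usual lora_alpha / r convention):

import torch

d, r, lora_alpha = 4096, 8, 16
W = torch.randn(d, d)              # frozen pretrained weight
A = torch.randn(r, d) * 0.01       # trainable, r x d
B = torch.zeros(d, r)              # trainable, d x r (starts at zero, so training begins from W)

delta_W = (lora_alpha / r) * (B @ A)   # rank-r update
x = torch.randn(d)
y = W @ x + delta_W @ x                # equivalent to (W + delta_W) @ x

print(W.numel())               # 16,777,216 parameters in the full matrix
print(A.numel() + B.numel())   # 65,536 trainable parameters (~0.4% of W)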

Important LoRA Hyperparameters:

Hyperparameter | Description | Effect of Increasing
r | LoRA rank (dimensionality of the low-rank adapters) | More capacity, more memory usage
lora_alpha | Scaling factor for the LoRA weights | Affects learning dynamics; higher = stronger adapter signal
target_modules | Specific layers to inject LoRA into (e.g., q_proj, v_proj) | More modules = more learnable capacity
lora_dropout | Dropout applied between A and B during training | Helps regularization, especially on small datasets
bias | Whether to train bias weights (none, all, or lora_only) | none = fewer params, all = more flexibility
task_type | CAUSAL_LM or SEQ_CLS; selects which head is used for adaptation | Set appropriately for generation or classification
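
If you are unsure which module names your model exposes for target_modules, you can list its linear layers before building the config. A quick sketch, using the same Llama checkpoint as in the full example further below:

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
# Collect the short names of all nn.Linear layers, i.e. the valid target_modules candidates.
linear_names = {name.split(".")[-1] for name, module in model.named_modules()
                if isinstance(module, torch.nn.Linear)}
print(linear_names)  # e.g. {'q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj', 'lm_head'}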

Key TrainingArguments Hyperparameters:

Parameter | Description | Tip
num_train_epochs | Total number of training epochs | Use 3–5 for small to medium datasets
per_device_train_batch_size | Batch size per GPU | Lower = less memory needed; increase accumulation to compensate
gradient_accumulation_steps | Number of steps to accumulate before each optimizer step | Use a higher value when the batch size is low
fp16 | Enables 16-bit (half-precision) training | Great for memory efficiency on supported hardware
learning_rate | Initial learning rate | Use 3e-5 to 2e-4 for LoRA; lower = more stable training
logging_steps | Frequency of logging | Use 10–50 for closer tracking
save_steps | Frequency of saving model checkpoints | Use 100+ to save periodically
eval_strategy | When to run evaluation (steps or epoch) | Use steps for tight monitoring
eval_steps | Evaluation interval in steps | Should match save_steps, or be smaller for more frequent validation
report_to | Where to log metrics (wandb, tensorboard, etc.) | Use wandb for collaborative tracking
run_name | Name of the experiment run | Useful for organizing runs in experiment-tracking tools
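
One relationship worth keeping in mind when reading this table: the effective batch size the optimizer sees is per_device_train_batch_size × gradient_accumulation_steps × the number of GPUs. A quick check with the values used in the example below (assuming a single GPU):

per_device_train_batch_size = 1
gradient_accumulation_steps = 16
num_gpus = 1

effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 16 samples per optimizer step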

Code Example: TrainingArguments

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./deepseek_finetuned",
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    fp16=True,
    logging_steps=10,
    save_steps=100,
    eval_strategy="steps",
    eval_steps=100,
    learning_rate=3e-5,
    logging_dir="./logs",
    report_to="wandb",
    run_name="DeepSeek_FineTuning_Experiment",
)

Code Example: LoRA Fine-Tuning with PEFT

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForLanguageModeling,
    Trainer,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from datasets import load_dataset

# Load the base model quantized to 4-bit (requires bitsandbytes) so a 7B model
# fits on a single 16GB GPU, then prepare it for k-bit (quantized) training.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default

dataset = load_dataset("text", data_files={"train": "law_qa.txt"})
tokenized = dataset.map(lambda x: tokenizer(x["text"], truncation=True), batched=True)
split = tokenized["train"].train_test_split(test_size=0.1)  # held-out split for eval_strategy="steps"

trainer = Trainer(
    model=model,
    args=training_args,  # defined in the previous example
    train_dataset=split["train"],
    eval_dataset=split["test"],
    # For causal LM training, the collator pads batches and copies input_ids into labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)

trainer.train()
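
After training, only the small adapter weights need to be saved, not the full model. A minimal sketch of saving the adapter, re-attaching it to the frozen base model for inference, and optionally merging it for deployment (the path is illustrative):

# Save only the LoRA adapter weights (a few megabytes).
model.save_pretrained("./law_qa_adapter")

# Later: load the frozen base model and attach the trained adapter.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
inference_model = PeftModel.from_pretrained(base, "./law_qa_adapter")

# Optionally fold the adapter into the base weights for a single deployable model.
merged = inference_model.merge_and_unload()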

Tips for Hyperparameter Tuning

  • Start small with r=4 and increase only if performance plateaus.
  • Use q_proj and v_proj as your initial target_modules. Add k_proj or o_proj if needed.
  • If overfitting, increase lora_dropout to 0.2–0.3.
  • Use lora_alpha values between 8–32 depending on the dataset size.
  • If training fails to converge, lower the learning rate to 1e-4 or 5e-5.
  • If using very small batches (e.g., batch_size=1), increase gradient_accumulation_steps to maintain effective batch size.
  • Use fp16=True to reduce memory usage when supported by your hardware.
  • Use logging_steps = 10–50 to monitor training progress without slowing it down.
  • Set save_steps and eval_steps to 100–200 for regular checkpoints and validation.
  • If training is unstable, try a lower learning_rate and reduce lora_alpha slightly.
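
If you want to compare a few of these settings systematically, a small grid over r and lora_alpha usually answers the question. The sketch below assumes a hypothetical helper train_and_eval() that attaches the given LoraConfig to a fresh copy of the base model, trains it with the TrainingArguments above, and returns the final validation loss:

# Sketch of a small hyperparameter grid for LoRA.
# train_and_eval() is a hypothetical helper, not part of transformers or peft.
results = {}
for r, alpha in [(4, 8), (8, 16), (16, 32)]:
    config = LoraConfig(
        r=r,
        lora_alpha=alpha,
        lora_dropout=0.1,
        bias="none",
        target_modules=["q_proj", "v_proj"],
        task_type="CAUSAL_LM",
    )
    results[(r, alpha)] = train_and_eval(config, run_name=f"lora_r{r}_alpha{alpha}")

best = min(results, key=results.get)
print("Best (r, lora_alpha):", best)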

Conclusion

PEFT, especially with LoRA, gives you production-grade model fine-tuning with minimal overhead. Instead of spending days on full model updates, you can run fine-tuning jobs in hours with excellent results.

And remember: fine-tuning is just the start. With PEFT adapters, you can swap tasks dynamically, experiment faster, and deploy smarter.
