Gemma 4 E2B Korean SFT v3 (LoRA)

This is a LoRA adapter obtained by SFT (Supervised Fine-Tuning) of the Google Gemma 4 E2B-IT model (5.1B params) on 9 Korean datasets.

v3 adds the NVIDIA Nemotron-Personas-Korea dataset (3,000 samples) to strengthen Korean persona-based multi-turn conversation.

Model Details

| Item | Value |
|---|---|
| Base Model | google/gemma-4-e2b-it (5.1B params) |
| Method | LoRA (r=16, alpha=32, dropout=0.05) |
| Trainable Params | 24.2M / 5.1B (0.47%) |
| Training Data | 13,521 samples from 9 Korean datasets |
| Epochs | 1 |
| Training Time | 13h 14m on Apple M4 Pro 48GB (MPS) |
| Framework | TRL 1.2.0 + PEFT 0.19.1 + Transformers 5.5.4 |

Quick Start

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load base model + LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-e2b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "hoin1218/gemma-4-e2b-korean-sft")
tokenizer = AutoTokenizer.from_pretrained("hoin1218/gemma-4-e2b-korean-sft")

# Generate
messages = [{"role": "user", "content": "한국의 사계절에 대해 설명해주세요."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
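
For deployment without a PEFT dependency at inference time, the adapter can also be merged into the base weights. A minimal sketch continuing from the code above; the output directory name is illustrative:

# Optional: merge the LoRA adapter into the base weights for standalone use
merged_model = model.merge_and_unload()  # returns a plain Transformers model
merged_model.save_pretrained("gemma-4-e2b-korean-sft-merged")
tokenizer.save_pretrained("gemma-4-e2b-korean-sft-merged")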

Training Data

A total of 13,521 samples were selected from 9 Korean datasets for training.

| Dataset | Samples | Format | Description |
|---|---|---|---|
| heegyu/open-korean-instructions | 2,500 | usr_bot | General Korean instructions |
| nlpai-lab/kullm-v2 | 1,491 | instruction_input_output | KULLM v2 Korean |
| llami-team/Korean-OpenThoughts-114k-Normalized | 1,500 | question_response | Korean reasoning/thinking |
| kuotient/orca-math-word-problems-193k-korean | 1,500 | question_response | Korean math word problems |
| kyujinpy/KOR-OpenOrca-Platypus-v3 | 1,000 | instruction_input_output | Korean translation of OpenOrca |
| changpt/ko-lima-vicuna | 1,030 | sharegpt | GPT-4-generated Korean (high quality) |
| beomi/KoAlpaca-v1.1a | 1,000 | instruction_output | KoAlpaca Korean |
| heegyu/namuwiki-extracted | 500 | title_text | Namuwiki knowledge |
| nvidia/Nemotron-Personas-Korea | 3,000 | nemotron_persona | Korean persona multi-turn dialogue |
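
The datasets use several different schemas (instruction_input_output, question_response, sharegpt, and so on). Below is a minimal sketch of how such records could be normalized into chat messages before tokenization; the field names and the helper itself are illustrative, not the actual preprocessing script:

# Hypothetical normalization of mixed-schema SFT records into chat messages
def to_messages(example: dict) -> list[dict]:
    if "instruction" in example:  # instruction_(input_)output style
        prompt = example["instruction"]
        if example.get("input"):
            prompt += "\n\n" + example["input"]
        return [{"role": "user", "content": prompt},
                {"role": "assistant", "content": example["output"]}]
    if "question" in example:  # question_response style
        return [{"role": "user", "content": example["question"]},
                {"role": "assistant", "content": example["response"]}]
    # sharegpt-style multi-turn conversations
    roles = {"human": "user", "gpt": "assistant"}
    return [{"role": roles[turn["from"]], "content": turn["value"]}
            for turn in example["conversations"]]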

Training Results

Final Metrics

| Metric | Value |
|---|---|
| Average Loss | 12.38 |
| Last Step Loss | 12.46 |
| Best Loss | 10.36 (at 87% of the epoch) |
| Best Token Accuracy | 69.5% (at 89% of the epoch) |
| Average Token Accuracy | 61.8% |
| Total Steps | 1,691 |
| Total Tokens | 5.08M |

Loss Curve

Loss
33 |*
   | *
27 |
   |
21 |  *
   |
15 |    *
   |
12 |      * * * * * * * * * * * * * * * * * * * * * * * * * * * *
   |        * * * * * * * * * * * * * * * * * * * * * * * * * * * *
10 |                                              *
   |___________________________________________________________________________
   0%    10%   20%   30%   40%   50%   60%   70%   80%   90%   100%  epoch

Detailed Training Log (every 10%)

| Epoch | Loss | Token Acc | Learning Rate |
|---|---|---|---|
| 1.2% | 32.71 | 48.7% | 1.65e-05 |
| 4.7% | 14.61 | 60.2% | 8.59e-05 |
| 10.1% | 12.39 | 65.3% | 9.94e-05 |
| 20.1% | 11.37 | 67.6% | 9.42e-05 |
| 30.2% | 13.16 | 63.3% | 8.42e-05 |
| 40.2% | 12.14 | 65.1% | 7.04e-05 |
| 50.3% | 11.23 | 67.2% | 5.44e-05 |
| 59.8% | 11.66 | 66.6% | 3.89e-05 |
| 69.8% | 11.97 | 65.8% | 2.36e-05 |
| 79.9% | 11.27 | 67.2% | 1.11e-05 |
| 89.9% | 11.28 | 67.2% | 3.00e-06 |
| 100.0% | 12.46 | 64.9% | 6.12e-09 |

v2 vs v3 Comparison

| Metric | v2 (8 datasets) | v3 (9 datasets) |
|---|---|---|
| Training Data | 10,521 samples | 13,521 samples |
| Avg Loss | 12.80 | 12.38 |
| Best Loss | 10.24 | 10.36 |
| Best Accuracy | 69.8% | 69.5% |
| Total Tokens | 3.54M | 5.08M |
| Training Time | 16h 27m | 13h 14m |

Training Configuration

# LoRA
r: 16
lora_alpha: 32
lora_dropout: 0.05
target_modules: ".*language_model.*?(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj)"
bias: none
task_type: CAUSAL_LM

# Training
epochs: 1
batch_size: 1
gradient_accumulation_steps: 8
effective_batch_size: 8
learning_rate: 1.0e-4
lr_scheduler: cosine
warmup_ratio: 0.05
max_seq_length: 512
optimizer: adamw_torch
precision: fp16
gradient_checkpointing: true
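
Assuming a standard TRL + PEFT training script (the script itself is not published here), the settings above map roughly to the following objects. Argument names follow the public LoraConfig/SFTConfig APIs; the sequence-length argument has been renamed across TRL releases, and how fp16 is enabled on MPS is backend-specific, so treat those details as assumptions:

from peft import LoraConfig
from trl import SFTConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=r".*language_model.*?(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj)",
    bias="none",
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="gemma-4-e2b-korean-sft",  # illustrative
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    max_length=512,  # called max_seq_length in older TRL releases
    optim="adamw_torch",
    gradient_checkpointing=True,
)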

Hardware

  • Device: Apple M4 Pro (48GB Unified Memory)
  • Backend: MPS (Metal Performance Shaders)
  • Precision: fp16 (MPS does not support bf16)
  • Attention: SDPA (Scaled Dot-Product Attention)
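
A small sketch of the corresponding device/dtype selection when loading the model locally, assuming a PyTorch build with MPS enabled:

import torch

# Per the note above, bf16 is avoided on MPS; fall back to float16 there.
device = "mps" if torch.backends.mps.is_available() else "cpu"
dtype = torch.float16 if device == "mps" else torch.bfloat16
print(device, dtype)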

LoRA Target Modules

Gemma 4 is a multimodal model: its vision_tower/audio_tower contain Gemma4ClippableLinear modules that PEFT does not support. To work around this, target_modules is given as a regex that matches only submodules under language_model:

.*language_model.*?(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj)
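
When target_modules is a single string, PEFT interprets it as a regular expression and matches it against full module names (via re.fullmatch), so only the language-model projections are wrapped while the vision/audio towers are skipped. A quick, illustrative way to inspect what the pattern selects:

import re
from transformers import AutoModelForCausalLM

pattern = r".*language_model.*?(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj)"
model = AutoModelForCausalLM.from_pretrained("google/gemma-4-e2b-it")

# Names containing vision_tower/audio_tower fail the match and keep their original modules.
matched = [name for name, _ in model.named_modules() if re.fullmatch(pattern, name)]
print(len(matched), matched[:3])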

Known Issues

  • Gemma 4 requires mm_token_type_ids even for text-only training; during training a custom Gemma4DataCollator injects it automatically (a minimal sketch follows after this list).
  • transformers>=5.5.0 is required (5.4.x does not recognize the gemma4 model type).
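
The actual Gemma4DataCollator is not included in this repository. A minimal sketch of the idea, assuming the model only needs a zero-filled mm_token_type_ids tensor shaped like input_ids for text-only batches:

import torch
from transformers import DataCollatorForLanguageModeling

class Gemma4DataCollator(DataCollatorForLanguageModeling):
    # Illustrative only: inject mm_token_type_ids for text-only batches.
    def __call__(self, features):
        batch = super().__call__(features)
        if "mm_token_type_ids" not in batch:
            # Assumption: zeros mean "not a multimodal token" at every position.
            batch["mm_token_type_ids"] = torch.zeros_like(batch["input_ids"])
        return batch

# Usage sketch: causal-LM collation, so masked-LM is disabled
collator = Gemma4DataCollator(tokenizer=tokenizer, mlm=False)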

License

This model inherits the Gemma license from the base model.

Citation

@misc{gemma4-korean-sft-2026,
  title={Gemma 4 E2B Korean SFT v3},
  author={hoin1218},
  year={2026},
  url={https://huggingface.co/hoin1218/gemma-4-e2b-korean-sft}
}