# Gemma 4 E2B Korean SFT v3 (LoRA)
A LoRA adapter produced by SFT (Supervised Fine-Tuning) of Google Gemma 4 E2B-IT (5.1B params) on 9 Korean datasets.

v3 adds the NVIDIA Nemotron-Personas-Korea dataset (3,000 samples) to strengthen Korean persona-based multi-turn conversation ability.
## Model Details
| Item | Value |
|---|---|
| Base Model | google/gemma-4-e2b-it (5.1B params) |
| Method | LoRA (r=16, alpha=32, dropout=0.05) |
| Trainable Params | 24.2M / 5.1B (0.47%) |
| Training Data | 13,521 samples from 9 Korean datasets |
| Epochs | 1 |
| Training Time | 13h 14m on Apple M4 Pro 48GB (MPS) |
| Framework | TRL 1.2.0 + PEFT 0.19.1 + Transformers 5.5.4 |
## Quick Start
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load base model + LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-e2b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "hoin1218/gemma-4-e2b-korean-sft")
tokenizer = AutoTokenizer.from_pretrained("hoin1218/gemma-4-e2b-korean-sft")

# Generate
messages = [{"role": "user", "content": "한국의 사계절에 대해 설명해주세요."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
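For standalone inference without the PEFT wrapper, the adapter can optionally be merged into the base weights (standard PEFT API; the output path below is illustrative):

```python
# Merge LoRA weights into the base model and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("gemma-4-e2b-korean-sft-merged")
tokenizer.save_pretrained("gemma-4-e2b-korean-sft-merged")
```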
## Training Data
A total of 13,521 samples were curated from 9 Korean datasets for training.
| Dataset | Samples | Format | Description |
|---|---|---|---|
| heegyu/open-korean-instructions | 2,500 | usr_bot | General Korean instructions |
| nlpai-lab/kullm-v2 | 1,491 | instruction_input_output | KULLM v2 Korean |
| llami-team/Korean-OpenThoughts-114k-Normalized | 1,500 | question_response | Korean reasoning/thinking |
| kuotient/orca-math-word-problems-193k-korean | 1,500 | question_response | Korean math word problems |
| kyujinpy/KOR-OpenOrca-Platypus-v3 | 1,000 | instruction_input_output | Korean translation of OpenOrca |
| changpt/ko-lima-vicuna | 1,030 | sharegpt | GPT-4-generated Korean (high quality) |
| beomi/KoAlpaca-v1.1a | 1,000 | instruction_output | KoAlpaca Korean |
| heegyu/namuwiki-extracted | 500 | title_text | Namuwiki knowledge |
| nvidia/Nemotron-Personas-Korea | 3,000 | nemotron_persona | Korean persona multi-turn dialogue |
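The nine source datasets arrive in different schemas (the Format column above). A minimal sketch of the kind of per-format normalization this implies; the function and field names are illustrative assumptions, not the actual preprocessing code:

```python
def to_messages(example: dict, fmt: str) -> list[dict]:
    """Normalize one raw sample into chat messages (field names are assumed)."""
    if fmt == "instruction_input_output":
        user = example["instruction"]
        if example.get("input"):  # optional context field
            user += "\n\n" + example["input"]
        return [{"role": "user", "content": user},
                {"role": "assistant", "content": example["output"]}]
    if fmt == "question_response":
        return [{"role": "user", "content": example["question"]},
                {"role": "assistant", "content": example["response"]}]
    if fmt == "sharegpt":
        role_map = {"human": "user", "gpt": "assistant"}
        return [{"role": role_map[turn["from"]], "content": turn["value"]}
                for turn in example["conversations"]]
    raise ValueError(f"unhandled format: {fmt}")
```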
## Training Results

### Final Metrics
| Metric | Value |
|---|---|
| Average Loss | 12.38 |
| Last Step Loss | 12.46 |
| Best Loss | 10.36 (at 87% of epoch) |
| Best Token Accuracy | 69.5% (at 89% of epoch) |
| Average Token Accuracy | 61.8% |
| Total Steps | 1,691 |
| Total Tokens | 5.08M |
### Loss Curve

```
Loss
 33 |*
    |  *
 27 |
    |
 21 |    *
    |
 15 |      *
    |
 12 |        * * * * * * * * * * * * * * * * * * * * * * * * * * * *
    |         * * * * * * * * * * * * * * * * * * * * * * * * * * *
 10 |                                                      *
    |___________________________________________________________________
     0%    10%   20%   30%   40%   50%   60%   70%   80%   90%   100%  epoch
```
### Detailed Training Log (roughly every 10% of the epoch)
| Epoch | Loss | Token Acc | Learning Rate |
|---|---|---|---|
| 1.2% | 32.71 | 48.7% | 1.65e-05 |
| 4.7% | 14.61 | 60.2% | 8.59e-05 |
| 10.1% | 12.39 | 65.3% | 9.94e-05 |
| 20.1% | 11.37 | 67.6% | 9.42e-05 |
| 30.2% | 13.16 | 63.3% | 8.42e-05 |
| 40.2% | 12.14 | 65.1% | 7.04e-05 |
| 50.3% | 11.23 | 67.2% | 5.44e-05 |
| 59.8% | 11.66 | 66.6% | 3.89e-05 |
| 69.8% | 11.97 | 65.8% | 2.36e-05 |
| 79.9% | 11.27 | 67.2% | 1.11e-05 |
| 89.9% | 11.28 | 67.2% | 3.00e-06 |
| 100.0% | 12.46 | 64.9% | 6.12e-09 |
## v2 vs v3 Comparison
| Metric | v2 (8 datasets) | v3 (9 datasets) |
|---|---|---|
| Training Data | 10,521 samples | 13,521 samples |
| Avg Loss | 12.80 | 12.38 |
| Best Loss | 10.24 | 10.36 |
| Best Accuracy | 69.8% | 69.5% |
| Total Tokens | 3.54M | 5.08M |
| Training Time | 16h 27m | 13h 14m |
## Training Configuration

```yaml
# LoRA
r: 16
lora_alpha: 32
lora_dropout: 0.05
target_modules: ".*language_model.*?(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj)"
bias: none
task_type: CAUSAL_LM

# Training
epochs: 1
batch_size: 1
gradient_accumulation_steps: 8
effective_batch_size: 8
learning_rate: 1.0e-4
lr_scheduler: cosine
warmup_ratio: 0.05
max_seq_length: 512
optimizer: adamw_torch
precision: fp16
gradient_checkpointing: true
```
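A minimal Python equivalent of the YAML above, assuming the standard PEFT/TRL APIs (the actual training script is not published here; note that recent TRL versions rename `max_seq_length` to `max_length`):

```python
from peft import LoraConfig, get_peft_model
from trl import SFTConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    # A string is treated as a regex and matched against module names
    target_modules=r".*language_model.*?(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj)",
    bias="none",
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size 8
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    max_seq_length=512,
    optim="adamw_torch",
    fp16=True,
    gradient_checkpointing=True,
)

model = get_peft_model(base_model, lora_config)  # base_model from Quick Start
model.print_trainable_parameters()  # ~24.2M trainable / 5.1B total (0.47%)
```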
## Hardware
- Device: Apple M4 Pro (48GB Unified Memory)
- Backend: MPS (Metal Performance Shaders)
- Precision: fp16 (MPS does not support bf16)
- Attention: SDPA (Scaled Dot-Product Attention)
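For reference, a small sketch of the device/precision selection this setup implies; the fallback ordering is an assumption, and fp16-on-MPS follows the note above:

```python
import torch

# Prefer MPS on Apple silicon; fall back to CUDA, then CPU
if torch.backends.mps.is_available():
    device, dtype = torch.device("mps"), torch.float16  # bf16 not supported here
elif torch.cuda.is_available():
    device, dtype = torch.device("cuda"), torch.bfloat16
else:
    device, dtype = torch.device("cpu"), torch.float32
```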
## LoRA Target Modules

Gemma 4 is a multimodal model: its vision_tower/audio_tower contain Gemma4ClippableLinear modules that PEFT does not support. To work around this, target_modules is given as a regex that matches only submodules under language_model:

```
.*language_model.*?(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj)
```
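PEFT treats a string target_modules as a regex and matches it against each module name with re.fullmatch. A quick sanity check of what the pattern captures (sketch, reusing base_model from the Quick Start snippet):

```python
import re

PATTERN = r".*language_model.*?(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj)"

# List the module names the LoRA pattern would wrap; none should live in
# vision_tower or audio_tower
matched = [name for name, _ in base_model.named_modules()
           if re.fullmatch(PATTERN, name)]
print(len(matched), matched[:3])
```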
## Known Issues

- Gemma 4 requires `mm_token_type_ids` even for text-only training; a custom `Gemma4DataCollator` injects them automatically during training (see the sketch below).
- `transformers>=5.5.0` is required (5.4.x does not recognize the `gemma4` model type).
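A minimal sketch of the collator idea described above; the actual Gemma4DataCollator is not published here, and this version simply zero-fills mm_token_type_ids for text-only batches:

```python
import torch
from transformers import DataCollatorForLanguageModeling

class Gemma4DataCollator(DataCollatorForLanguageModeling):
    """Sketch: add mm_token_type_ids so text-only batches pass Gemma 4's forward."""

    def __call__(self, features):
        batch = super().__call__(features)
        # No image/audio tokens in text-only SFT, so every position gets type 0
        batch["mm_token_type_ids"] = torch.zeros_like(batch["input_ids"])
        return batch

# Usage: Gemma4DataCollator(tokenizer=tokenizer, mlm=False)
```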
## License
This model inherits the Gemma license from the base model.
## Citation

```bibtex
@misc{gemma4-korean-sft-2026,
  title={Gemma 4 E2B Korean SFT v3},
  author={hoin1218},
  year={2026},
  url={https://huggingface.co/hoin1218/gemma-4-e2b-korean-sft}
}
```