Preface

This script shows how to use SFTTrainer to fine-tune a model or an adapter on a target dataset.

src link: https://github.com/huggingface/trl/blob/main/examples/scripts/sft.py

Operating System: Ubuntu 22.04.4 LTS

References

  1. TRL - Transformer Reinforcement Learning
  2. TRL - Examples (huggingface)
  3. examples/scripts/sft.py

Training script

"""
# regular:
python examples/scripts/sft.py \
--model_name_or_path="facebook/opt-350m" \
--dataset_text_field="text" \
--report_to="wandb" \
--learning_rate=1.41e-5 \
--per_device_train_batch_size=64 \
--gradient_accumulation_steps=16 \
--output_dir="sft_openassistant-guanaco" \
--logging_steps=1 \
--num_train_epochs=3 \
--max_steps=-1 \
--push_to_hub \
--gradient_checkpointing

# peft:
python examples/scripts/sft.py \
--model_name_or_path="facebook/opt-350m" \
--dataset_text_field="text" \
--report_to="wandb" \
--learning_rate=1.41e-5 \
--per_device_train_batch_size=64 \
--gradient_accumulation_steps=16 \
--output_dir="sft_openassistant-guanaco" \
--logging_steps=1 \
--num_train_epochs=3 \
--max_steps=-1 \
--push_to_hub \
--gradient_checkpointing \
--use_peft \
--lora_r=64 \
--lora_alpha=16
"""

from datasets import load_dataset
from transformers import AutoTokenizer

from trl import (
    ModelConfig,
    SFTConfig,
    SFTScriptArguments,
    SFTTrainer,
    TrlParser,
    get_kbit_device_map,
    get_peft_config,
    get_quantization_config,
)


if __name__ == "__main__":
    parser = TrlParser((SFTScriptArguments, SFTConfig, ModelConfig))
    script_args, training_args, model_config = parser.parse_args_and_config()

    ################
    # Model init kwargs & Tokenizer
    ################
    quantization_config = get_quantization_config(model_config)
    model_kwargs = dict(
        revision=model_config.model_revision,
        trust_remote_code=model_config.trust_remote_code,
        attn_implementation=model_config.attn_implementation,
        torch_dtype=model_config.torch_dtype,
        use_cache=False if training_args.gradient_checkpointing else True,
        device_map=get_kbit_device_map() if quantization_config is not None else None,
        quantization_config=quantization_config,
    )
    training_args.model_init_kwargs = model_kwargs
    tokenizer = AutoTokenizer.from_pretrained(
        model_config.model_name_or_path, trust_remote_code=model_config.trust_remote_code, use_fast=True
    )
    tokenizer.pad_token = tokenizer.eos_token

    ################
    # Dataset
    ################
    dataset = load_dataset(script_args.dataset_name)

    ################
    # Training
    ################
    trainer = SFTTrainer(
        model=model_config.model_name_or_path,
        args=training_args,
        train_dataset=dataset[script_args.dataset_train_split],
        eval_dataset=dataset[script_args.dataset_test_split],
        tokenizer=tokenizer,
        peft_config=get_peft_config(model_config),
    )

    trainer.train()

    # Save and push to hub
    trainer.save_model(training_args.output_dir)
    if training_args.push_to_hub:
        trainer.push_to_hub(dataset_name=script_args.dataset_name)
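
The conditional entries of model_kwargs in the script are easier to follow in isolation. The sketch below mirrors that logic with plain Python values, so it runs without trl or transformers installed. The helper name build_model_kwargs, its parameters, and the stand-in config values are hypothetical and introduced only for illustration; they are not part of TRL.

```python
from typing import Any, Dict, Optional


def build_model_kwargs(
    gradient_checkpointing: bool,
    quantization_config: Optional[dict],
    kbit_device_map: Optional[dict] = None,
) -> Dict[str, Any]:
    """Hypothetical helper mirroring the conditional entries of model_kwargs."""
    return {
        # The KV cache is switched off whenever gradient checkpointing is enabled.
        "use_cache": not gradient_checkpointing,
        # A device map is only passed when a quantization config is present.
        "device_map": kbit_device_map if quantization_config is not None else None,
        "quantization_config": quantization_config,
    }


# Case 1: gradient checkpointing on, no quantization.
full_precision = build_model_kwargs(gradient_checkpointing=True, quantization_config=None)

# Case 2: quantization on; both dicts are stand-ins for the real objects
# that get_quantization_config() and get_kbit_device_map() would return.
quantized = build_model_kwargs(
    gradient_checkpointing=False,
    quantization_config={"load_in_4bit": True},
    kbit_device_map={"": 0},
)

print(full_precision)
print(quantized)
```

In the first case use_cache is False and device_map is None; in the second, use_cache is True and the stand-in device map is passed through.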

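Two flag combinations in the example commands at the top of the script imply numbers that can be checked by hand. The sketch below computes the effective batch size from --per_device_train_batch_size and --gradient_accumulation_steps, and the LoRA scaling factor from --lora_alpha and --lora_r. It assumes the standard LoRA scaling of lora_alpha / r (rank-stabilized scaling is assumed to be off). The function names are hypothetical, and the number of devices is an assumption, since the post does not state it.

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step (num_devices is an assumption)."""
    return per_device_batch * grad_accum_steps * num_devices


def lora_scaling(lora_alpha: float, lora_r: int) -> float:
    """Factor applied to the low-rank update, assuming the standard alpha / r rule."""
    return lora_alpha / lora_r


# Values taken from the example commands in the script's docstring.
print(effective_batch_size(64, 16))  # 1024 on a single device
print(lora_scaling(16, 64))          # 0.25
```

With these flags, each optimizer step on one device sees 1024 samples, and the LoRA update is scaled by 0.25.
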
Closing

Blog post number 176 is done. Happy!

Today is another day full of hope.