Preface
A brief introduction to getting Llama up and running quickly.
src link: https://github.com/LuYF-Lemon-love/fork-huggingface-llama-recipes
Operating System: Ubuntu 22.04.4 LTS
References
Getting Started
- Install transformers:
$ pip install -U transformers
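Before moving on, it can help to verify that the install succeeded and that a GPU is actually visible to PyTorch; a minimal sanity check:

# Optional sanity check: print the installed transformers version
# and whether PyTorch can see a CUDA device.
import torch
import transformers

print(f"transformers version: {transformers.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")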
- Let's chat with an instruction-tuned model.
import torch
from transformers import pipeline
from modelscope import snapshot_download

# Download the weights from ModelScope and get the local path.
model_dir = snapshot_download('LLM-Research/Llama-3.2-3B-Instruct')

device = "cuda" if torch.cuda.is_available() else "cpu"

prompt = [
    {"role": "system", "content": "You are a helpful assistant, that responds as a pirate."},
    {"role": "user", "content": "What's Deep Learning?"},
]

generator = pipeline(task="text-generation", model=model_dir, device=device, torch_dtype=torch.bfloat16)

# With do_sample=False decoding is greedy, so temperature and top_p have no effect here.
generation = generator(
    prompt,
    do_sample=False,
    temperature=1.0,
    top_p=1,
    max_new_tokens=50
)

print(f"Generation: {generation[0]['generated_text']}")
# Generation:
# [
# {'role': 'system', 'content': 'You are a helpful assistant, that responds as a pirate.'},
# {'role': 'user', 'content': "What's Deep Learning?"},
# {'role': 'assistant', 'content': "Yer lookin' fer a treasure trove o'
# knowledge on Deep Learnin', eh? Alright then, listen close and
# I'll tell ye about it.\n\nDeep Learnin' be a type o' machine
# learnin' that uses neural networks"}
# ]
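If you prefer working below the `pipeline` abstraction, the same conversation can be run with `AutoTokenizer` and `AutoModelForCausalLM`. A minimal sketch, reusing `model_dir` and `prompt` from the snippet above; `device_map="auto"` assumes the `accelerate` package is installed:

# Lower-level equivalent of the pipeline call above.
# Assumes model_dir and prompt are already defined as in the previous snippet.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype=torch.bfloat16, device_map="auto"
)

# apply_chat_template renders the message list into Llama's chat format and
# appends the assistant header so generation continues as the assistant.
input_ids = tokenizer.apply_chat_template(
    prompt, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=50, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt portion.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))

This mirrors what the pipeline does internally when it receives a list of chat messages.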
Conclusion
My 198th blog post is done, happy!!!!
Today is another day full of hope.