I tried out Granite 3.0

Granite 3.0

Granite 3.0 is a family of open-source, lightweight generative language models designed for a range of enterprise-level tasks. It natively supports multilingual functionality, coding, reasoning, and tool usage, making it well suited to enterprise environments.

I ran these models to see which tasks they can handle.

Environment Setup

I set up the Granite 3.0 environment in Google Colab and installed the required libraries with the following commands:

!pip install torch torchvision torchaudio
!pip install accelerate
!pip install -U transformers

Execution

I tested both the 2B and 8B models of Granite 3.0.

2B Model

I ran the 2B model first. Here is the sample code for the 2B model:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "auto"
model_path = "ibm-granite/granite-3.0-2b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()

chat = [
    { "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
]
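# Render the chat into the model's prompt format, ending with the assistant turn marker
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# Move the inputs to the device the model was placed on by device_map="auto"
input_tokens = tokenizer(chat, return_tensors="pt").to(model.device)
output = model.generate(**input_tokens, max_new_tokens=100)
# batch_decode keeps special tokens, so the role markers appear in the printed text
output = tokenizer.batch_decode(output)
print(output[0])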

Output

<|start_of_role|>user<|end_of_role|>Please list one IBM Research laboratory located in the United States. You should only output its name and location.<|end_of_text|>
<|start_of_role|>assistant<|end_of_role|>1. IBM Research - Austin, Texas<|end_of_text|>
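
Because batch_decode keeps the special tokens, the raw text above still carries the role markers. As a minimal sketch, the assistant's reply could be pulled out with a regular expression, assuming the marker layout shown in this output. The helper name extract_assistant_reply is hypothetical and not part of transformers:

```python
import re

# Matches the assistant turn in the raw decoded text; assumes the
# <|start_of_role|>...<|end_of_role|> layout shown in the output above.
ASSISTANT_TURN = re.compile(
    r"<\|start_of_role\|>assistant<\|end_of_role\|>(.*?)(?:<\|end_of_text\|>|$)",
    re.DOTALL,
)

def extract_assistant_reply(raw: str) -> str:
    """Return the text of the last assistant turn, or '' if none is found."""
    matches = ASSISTANT_TURN.findall(raw)
    return matches[-1].strip() if matches else ""

raw_output = (
    "<|start_of_role|>user<|end_of_role|>Please list one IBM Research laboratory "
    "located in the United States. You should only output its name and location."
    "<|end_of_text|>\n"
    "<|start_of_role|>assistant<|end_of_role|>1. IBM Research - Austin, Texas"
    "<|end_of_text|>"
)
print(extract_assistant_reply(raw_output))
```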

8B Model

The 8B model can be used by replacing 2b with 8b in the model path. Here is sample code for the 8B model that prints the reply without the role fields and the user input:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "auto"
model_path = "ibm-granite/granite-3.0-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()

chat = [
    { "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

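# The chat template has already inserted the special tokens, so do not add them again
input_tokens = tokenizer(chat, add_special_tokens=False, return_tensors="pt").to(model.device)
output = model.generate(**input_tokens, max_new_tokens=100)
# Decode only the newly generated tokens, skipping the prompt and the special tokens
generated_text = tokenizer.decode(output[0][input_tokens["input_ids"].shape[1]:], skip_special_tokens=True)
print(generated_text)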

Output

1. IBM Almaden Research Center - San Jose, California

Function Calling

I explored the Function Calling feature, testing it with a dummy function. Here, get_current_weather is defined to return mock weather data.

Dummy Function

import json

def get_current_weather(location: str) -> dict:
    """
    Returns mock weather information for the specified location.
    Args:
        location (str): Name of the city to retrieve weather data for.
    Returns:
        dict: Dictionary containing weather information (description, temperature, humidity).
    """
    print(f"Getting current weather for {location}")

    try:
        weather_description = "sample"
        temperature = "20.0"
        humidity = "80.0"

        return {
            "description": weather_description,
            "temperature": temperature,
            "humidity": humidity
        }
    except Exception as e:
        print(f"Error fetching weather data: {e}")
        return {"weather": "NA"}

Prompt Creation

I created a prompt that asks the model to call the function:

functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and country code, e.g. San Francisco, US",
                }
            },
            "required": ["location"],
        },
    },
]
query = "What's the weather like in Boston?"
payload = {
    "functions_str": [json.dumps(x) for x in functions]
}
chat = [
    {"role":"system","content": f"You are a helpful assistant with access to the following function calls. Your task is to produce a sequence of function calls necessary to generate response to the user utterance. Use the following function calls as required.{payload}"},
    {"role": "user", "content": query }
]

Response Generation

I generated a response with the following code:

instruction_1 = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
input_tokens = tokenizer(instruction_1, return_tensors="pt").to(model.device)
output = model.generate(**input_tokens, max_new_tokens=1024)
generated_text = tokenizer.decode(output[0][input_tokens["input_ids"].shape[1]:], skip_special_tokens=True)
print(generated_text)

Output

{'name': 'get_current_weather', 'arguments': {'location': 'Boston'}}

This confirms that the model can generate the correct function call for the city named in the query.
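
The model only emits the call; the application still has to run the named function itself. Here is a minimal sketch of that dispatch step, assuming the call has already been parsed into a dict of the shape shown in the output. TOOL_REGISTRY and dispatch_call are hypothetical names, and the get_current_weather here is a shortened stand-in for the dummy function:

```python
from typing import Any, Callable, Dict

def get_current_weather(location: str) -> dict:
    # Shortened stand-in for the dummy function; returns fixed sample values.
    return {"description": "sample", "temperature": "20.0", "humidity": "80.0"}

# Hypothetical registry mapping function names the model may emit to callables.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {
    "get_current_weather": get_current_weather,
}

def dispatch_call(call: Dict[str, Any]) -> Any:
    """Look up the requested function and call it with the model's arguments."""
    name = call.get("name")
    if name not in TOOL_REGISTRY:
        raise ValueError(f"Unknown function requested: {name!r}")
    arguments = call.get("arguments") or {}
    if not isinstance(arguments, dict):
        raise ValueError("Arguments must be a mapping of parameter names to values")
    return TOOL_REGISTRY[name](**arguments)

call = {"name": "get_current_weather", "arguments": {"location": "Boston"}}
print(dispatch_call(call))
```

Keeping an explicit registry means that only functions listed there can ever be executed, whatever name the model produces.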

Format Specification for Enhanced Interaction Flow

Granite 3.0 allows the output format to be specified, which makes it easier to get responses in a structured form. This section describes using [UTTERANCE] for spoken responses and [THINK] for inner thoughts.

On the other hand, because function calls are emitted as plain text, a separate mechanism may be needed to distinguish function calls from ordinary text responses.
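
One possible mechanism is to try parsing the generated text as a dict literal and to treat anything that fails as an ordinary reply. The sketch below assumes that calls look like the function-calling output earlier in this article (a Python-style dict with name and arguments keys). classify_output is a hypothetical helper, not something Granite or transformers provides:

```python
import ast
import json
from typing import Any, Dict, Tuple, Union

def classify_output(text: str) -> Tuple[str, Union[Dict[str, Any], str]]:
    """Return ("call", parsed_dict) for a function call, else ("text", text)."""
    candidate = text.strip()
    parsed = None
    # Try strict JSON first, then a Python literal: the sample output uses
    # single quotes, which json.loads rejects but ast.literal_eval accepts.
    for parser in (json.loads, ast.literal_eval):
        try:
            parsed = parser(candidate)
            break
        except (ValueError, SyntaxError, TypeError):
            continue
    if isinstance(parsed, dict) and "name" in parsed and "arguments" in parsed:
        return "call", parsed
    return "text", text

kind, value = classify_output(
    "{'name': 'get_current_weather', 'arguments': {'location': 'Boston'}}"
)
print(kind, value)
print(classify_output("Hello! How can I assist you today?"))
```

ast.literal_eval only evaluates literals, so unlike eval it does not execute code contained in the model output.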

Specifying the Output Format

Here is an example prompt that guides the AI's output:

prompt = """You are a conversational AI assistant that deepens interactions by alternating between responses and inner thoughts.
<Constraints>
* Record spoken responses after the [UTTERANCE] tag and inner thoughts after the [THINK] tag.
* Use [UTTERANCE] as a start marker to begin outputting an utterance.
* After [THINK], describe your internal reasoning or strategy for the next response. This may include insights on the user's reaction, adjustments to improve interaction, or further goals to deepen the conversation.
* Important: **Use [UTTERANCE] and [THINK] as a start signal without needing a closing tag.**
</Constraints>

Follow these instructions, alternating between [UTTERANCE] and [THINK] formats for responses.
<output example>
example1:
  [UTTERANCE]Hello! How can I assist you today?[THINK]I’ll start with a neutral tone to understand their needs. Preparing to offer specific suggestions based on their response.[UTTERANCE]Thank you! In that case, I have a few methods I can suggest![THINK]Since I now know what they’re looking for, I'll move on to specific suggestions, maintaining a friendly and approachable tone.
...
</output example>

Please respond to the following user_input.
<user_input>
Hello! What can you do?
</user_input>
"""

Example Execution Code

Here is the code to generate a response:

chat = [
    { "role": "user", "content": prompt },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

input_tokens = tokenizer(chat, return_tensors="pt").to(model.device)
output = model.generate(**input_tokens, max_new_tokens=1024)
generated_text = tokenizer.decode(output[0][input_tokens["input_ids"].shape[1]:], skip_special_tokens=True)
print(generated_text)

Example Output

The output is as follows:

[UTTERANCE]Hello! I'm here to provide information, answer questions, and assist with various tasks. I can help with a wide range of topics, from general knowledge to specific queries. How can I assist you today?
[THINK]I've introduced my capabilities and offered assistance, setting the stage for the user to share their needs or ask questions.

The [UTTERANCE] and [THINK] tags were used as instructed, so the response can be formatted effectively.

Depending on the prompt, closing tags (such as [/UTTERANCE] or [/THINK]) may occasionally appear in the output, but overall the output format can be specified successfully.
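
To consume such output in an application, the tagged text has to be split back into segments. The sketch below is one way to do that, assuming the two tag names used in the prompt and simply discarding any stray closing tags. split_segments is a hypothetical helper:

```python
import re
from typing import List, Tuple

# Opening tags start a segment; optional closing tags are simply discarded.
OPEN_TAG = re.compile(r"\[(UTTERANCE|THINK)\]")
CLOSE_TAG = re.compile(r"\[/(?:UTTERANCE|THINK)\]")

def split_segments(text: str) -> List[Tuple[str, str]]:
    """Split model output into (tag, content) pairs in order of appearance."""
    cleaned = CLOSE_TAG.sub("", text)
    # re.split with one capturing group yields [prefix, tag, body, tag, body, ...]
    parts = OPEN_TAG.split(cleaned)
    segments = []
    for tag, body in zip(parts[1::2], parts[2::2]):
        body = body.strip()
        if body:
            segments.append((tag, body))
    return segments

sample = "[UTTERANCE]Hello! How can I assist you today?[/UTTERANCE]\n[THINK]I've offered assistance."
for tag, body in split_segments(sample):
    print(f"{tag}: {body}")
```

Only the UTTERANCE segments would normally be shown to the user, while the THINK segments can be logged or dropped.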

Streaming Code Example

Let's also look at how to output a streaming response.

Streaming is done with the asyncio and threading libraries, which together deliver the response from Granite 3.0 asynchronously.
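
What follows is a minimal sketch of the asyncio-plus-threading pattern, not the article's own code. A blocking token producer runs in a worker thread and hands each piece to an async consumer through a queue. The fake_token_source generator is a stand-in assumption so the example runs without a model. In real use the producer would be the model's generation, for example by iterating a transformers TextIteratorStreamer that was passed to model.generate running in a thread. The names stream_tokens and main are hypothetical:

```python
import asyncio
import threading
from typing import AsyncIterator, Iterable, Iterator

def fake_token_source(text: str) -> Iterator[str]:
    # Stand-in for a blocking token producer (e.g. a streamer fed by model.generate).
    for word in text.split(" "):
        yield word + " "

async def stream_tokens(source: Iterable[str]) -> AsyncIterator[str]:
    """Run a blocking token iterator in a worker thread and yield tokens asynchronously."""
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()
    done = object()  # sentinel marking the end of the stream

    def worker() -> None:
        try:
            for token in source:
                # asyncio.Queue is not thread-safe, so hand items over via the loop.
                loop.call_soon_threadsafe(queue.put_nowait, token)
        finally:
            loop.call_soon_threadsafe(queue.put_nowait, done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = await queue.get()
        if item is done:
            break
        yield item

async def main() -> str:
    pieces = []
    async for token in stream_tokens(fake_token_source("IBM Research - Austin, Texas")):
        print(token, end="", flush=True)  # show each piece as soon as it arrives
        pieces.append(token)
    print()
    return "".join(pieces)

result = asyncio.run(main())
```

The worker thread keeps the blocking generation off the event loop, so other coroutines can keep running while tokens arrive.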

Example Output

Running the streaming code produces the response asynchronously, piece by piece. The example demonstrates successful streaming: each token is generated asynchronously and displayed in sequence, so the user can watch the generation process in real time.

Summary

Granite 3.0 gives reasonably strong responses even with the 8B model. The Function Calling and Format Specification features also worked well, which shows its potential for a wide range of applications.