


Efficiently merging a PEFT LoRA adapter with the base model
Introduction: why merge a PEFT LoRA adapter with the base model
After fine-tuning a large language model with a parameter-efficient fine-tuning (PEFT) technique, especially LoRA (Low-Rank Adaptation), we usually end up with a lightweight adapter. The adapter contains only the small set of weights modified during fine-tuning, and it must be combined with the original base model to run inference. When deploying or sharing a model, it is a common requirement to merge the adapter and the base model into a single, self-contained model: this simplifies loading and use, since you no longer need to manage two model components at once.
However, beginners often run into trouble when attempting the merge, for example by trying to load a PEFT adapter directly with AutoModel.from_pretrained from the transformers library, or by manually averaging model weights. These approaches fail because PEFT adapters have their own specific structure and loading mechanism.
Incorrect merge attempts and why they fail
A common mistaken attempt is to load the PEFT adapter with transformers.AutoModel.from_pretrained and then merge the weights by manual averaging, as shown below:
from transformers import AutoModel

# Incorrect approach: trying to load the PEFT adapter directly
# pretrained_model = AutoModel.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v0.6")
# lora_adapter = AutoModel.from_pretrained("ArcturusAI/Crystalline-1.1B-v23.12-tagger")  # raises an error
# ... the subsequent weight-merging logic is also incorrect ...
Executing lora_adapter = AutoModel.from_pretrained("ArcturusAI/Crystalline-1.1B-v23.12-tagger") usually raises an OSError reporting that standard model weight files such as pytorch_model.bin or tf_model.h5 are missing from the model path. This happens because a PEFT adapter typically contains only the adapter-layer weights, not a complete set of model weights, and transformers.AutoModel does not recognize that format. Moreover, a PEFT model does not work by simply averaging the weights of the base model and the adapter; it modifies the base model's behavior by injecting adapter layers into specific layers. Manually averaging weights is therefore logically wrong as well.
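To make the distinction concrete, here is a minimal sketch of what merging a LoRA adapter actually computes for a single adapted linear layer: the scaled low-rank update W' = W + (alpha/r)·B·A is added to the frozen base weight, layer by layer. The dimensions and tensors below are hypothetical, chosen purely for illustration:

import torch

# Hypothetical dimensions for one adapted linear layer.
d_out, d_in, r, alpha = 64, 64, 8, 16

W = torch.randn(d_out, d_in)      # frozen base weight
A = torch.randn(r, d_in) * 0.01   # trained LoRA down-projection
B = torch.randn(d_out, r) * 0.01  # trained LoRA up-projection

# Merging adds the scaled low-rank update to the base weight;
# it is NOT an average of two full weight matrices.
W_merged = W + (alpha / r) * (B @ A)
print(W_merged.shape)  # torch.Size([64, 64])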
Correct merge strategy: use the merge_and_unload method of the peft library
The peft library itself provides an official, efficient way to merge an adapter into the base model: merge_and_unload(). This method correctly folds the adapter weights into the corresponding layers of the base model and returns a standard transformers model instance.
1. Load the PEFT adapter model
First, use the class in the peft library designed for loading PEFT models, AutoPeftModelForCausalLM, to load the trained adapter. This class automatically recognizes the PEFT adapter and loads it together with the associated base model configuration.
from peft import AutoPeftModelForCausalLM
import torch

# Local path or Hugging Face model ID of the PEFT adapter.
# Here we assume the adapter has already been downloaded locally;
# it can also be loaded directly from the Hugging Face Hub.
model_id = "./ArcturusAI/Crystalline-1.1B-v23.12-tagger"  # example path

# Load the PEFT adapter model.
# Note: the base model and the adapter weights are loaded together.
peft_model = AutoPeftModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16  # choose a dtype suited to your hardware and model size
)

print(f"Model type after loading: {type(peft_model)}")
# Expected output: <class 'peft.peft_model.PeftModelForCausalLM'>
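If you prefer explicit control over how the base model is loaded (for example its dtype or device placement), an equivalent two-step approach is to load the base model with transformers and then attach the adapter with peft's PeftModel class. The sketch below assumes the same example model IDs and paths as above:

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model explicitly, then attach the adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v0.6",
    torch_dtype=torch.bfloat16,
)
peft_model = PeftModel.from_pretrained(base_model, "./ArcturusAI/Crystalline-1.1B-v23.12-tagger")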
2. Perform model merging
After loading, peft_model is a PeftModelForCausalLM instance. Calling its merge_and_unload() method makes the peft library fold the adapter weights into the base model and return a standard transformers model instance.
# Perform the merge
merged_model = peft_model.merge_and_unload()

print(f"Merged model type: {type(merged_model)}")
# Expected output: <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>
# (or whichever class corresponds to your base model)
At this point, merged_model is already a complete model with all the necessary weights and can be used and saved like any other transformers model.
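As a quick, optional sanity check (not part of the required workflow), you can confirm that no LoRA modules remain attached after the merge by inspecting the module names of the returned model:

# After merge_and_unload(), no module names should contain "lora".
has_lora = any("lora" in name for name, _ in merged_model.named_modules())
print(f"LoRA modules remaining: {has_lora}")  # expected: False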
3. Save the merged model
The merged model can be saved locally with the standard transformers save_pretrained method, ready for later loading and deployment.
# Define the save path
save_directory = "./ArcturusAI/Crystalline-1.1B-v23.12-tagger-fullmodel"

# Save the merged model
merged_model.save_pretrained(save_directory)
print(f"The merged model has been saved to: {save_directory}")
Tokenizer
Note that merge_and_unload() only handles the model weights; it does not touch the tokenizer. The tokenizer is a component independent of the model weights, responsible for converting text into the numeric sequences the model understands. You therefore need to load the base model's tokenizer separately and save it into the same directory as the merged model so that the saved model is complete.
from transformers import AutoTokenizer

# Load the tokenizer of the base model
base_model_name = "TinyLlama/TinyLlama-1.1B-Chat-v0.6"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Save the tokenizer to the same directory as the merged model
tokenizer.save_pretrained(save_directory)
print(f"Tokenizer saved to: {save_directory}")
After completing the above steps, the ./ArcturusAI/Crystalline-1.1B-v23.12-tagger-fullmodel directory contains a complete, directly loadable and usable model, including both the weights and the tokenizer.
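As a final check, the merged directory can be loaded like any ordinary transformers model. The short generation below is a minimal sketch; the prompt is arbitrary and save_directory is the path defined above:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

save_directory = "./ArcturusAI/Crystalline-1.1B-v23.12-tagger-fullmodel"

# Load the merged model and tokenizer from the same directory.
model = AutoModelForCausalLM.from_pretrained(save_directory, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(save_directory)

inputs = tokenizer("Hello, world!", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))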
Note: PEFT version compatibility
When working with PEFT models, you may run into compatibility issues between adapters trained with different versions of the peft library. For example, newer versions of peft may introduce new configuration keys (such as loftq_config, megatron_config, megatron_core) in the adapter_config.json file, and an older peft version may fail to load the adapter because it does not recognize those keys.
If you hit this kind of problem, one workaround is to manually edit the adapter_config.json file and remove the incompatible configuration keys. This typically happens when you try to load an adapter trained with a newer peft version using an older one.
Example (assuming you have downloaded the adapter locally and need to modify it; a small script that automates these steps follows the list):
- Download the model: make sure the PEFT adapter has been downloaded to a local path.
- Locate adapter_config.json: find the adapter_config.json file in the adapter directory.
- Edit the file: open adapter_config.json in a text editor.
- Remove the incompatible keys: find and delete key-value pairs such as "loftq_config": null, "megatron_config": {}, and "megatron_core": {}.
- Save the file: save the modified adapter_config.json.
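The following sketch automates the steps above with Python's standard json module. It assumes the adapter lives at the example path used earlier in this article; back up the original file before running anything like this:

import json
from pathlib import Path

# Hypothetical path to the locally downloaded adapter.
config_path = Path("./ArcturusAI/Crystalline-1.1B-v23.12-tagger") / "adapter_config.json"

config = json.loads(config_path.read_text())

# Keys introduced by newer peft versions that older versions may reject.
for key in ("loftq_config", "megatron_config", "megatron_core"):
    config.pop(key, None)  # remove only if present

config_path.write_text(json.dumps(config, indent=2))
print(f"Cleaned config written to: {config_path}")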
Important: manually editing configuration files should be treated as a temporary workaround, used only when you know exactly which keys are causing the problem. Best practice is to keep the peft library version consistent, or to take the deployment environment's peft version into account at training time.
Summary
Merging a PEFT LoRA adapter with its base model is a straightforward process as long as you use the right tools from the peft library: load the adapter with AutoPeftModelForCausalLM, then call merge_and_unload() to complete the merge efficiently. Don't forget to handle the tokenizer separately and save it alongside the merged model so that the deployed model is complete and convenient to use. Finally, when working with adapters trained under different peft versions, watch out for compatibility issues and address them as described above.