Neural network super-body? NUS LV Lab proposes a new network cloning technique

In the movie, as the heroine Lucy’s brain power gradually develops, she acquires the following abilities:

10%: Able to control the body's autonomic nervous system, improving coordination and reaction speed.
30%: Able to predict the future and people's actions, improving insight and judgment.
50%: Able to predict future changes by sensing small changes in the surrounding environment.
70%: Able to control the movement of her body and of objects, with extraordinary movement and combat skills.
90%: Able to connect with the universe and time, with the power of inspiration and intuition.
100%: Able to wield supernatural power beyond the limits of human cognition.

At the end of the movie, the heroine gradually dissolves into a pure energy form and becomes one with the universe and time. The human super-body is realized by connecting to the outside world and thereby gaining unlimited value. Carrying this idea over to neural networks: if a network can be connected to the entire set of available networks, a network super-body can be realized as well, and in theory it would gain unbounded prediction capability.

In other words, a network that exists as a physical entity is inevitably limited in how far its performance can grow. Once the target network is connected to the Model Zoo, it no longer has an entity of its own; it becomes a super-body form made up of connections between networks.

[Figure: the difference between a super-body network and an entity network. The super-body network has no entity; it is a form of connectivity between networks.]

The idea of a network super-body is explored in the CVPR 2023 paper "Partial Network Cloning", which this article introduces. In that paper, the LV Lab at the National University of Singapore proposes a new network cloning technique.


Link: https://arxiv.org/abs/2303.10597

01 Problem Definition

In the paper, the authors note that using network cloning to dematerialize networks brings several advantages.

The super-body network is built on the rapidly expanding Model Zoo, which offers a large number of pre-trained models. For any task T, we can therefore always find one or more models whose tasks can be composed into the required task.

[Equation image not preserved in the source; in its example, three networks are selected for connection.]

[Figure: the framework for constructing the super-body network M_c.]

As shown in the figure above, to construct the super-body network M_c for task T, the paper proposes the following construction framework:

Step 1: Locate the most appropriate ontology network M_t, the one whose task set T_t has the largest intersection T⋂T_t with the required task set T. This network becomes the main network;
Step 2: Select the correction networks M_s^1 and M_s^2 to supply the tasks missing from the ontology network;
Step 3: Use network cloning to locate the required parts of the correction networks M_s^1 and M_s^2 and connect them to the ontology network M_t;
Step 4: Use part of the correction data to fine-tune the network's connection module and prediction module.
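
Steps 1 and 2 can be sketched in a few lines. The snippet below is a minimal illustration rather than the paper's procedure: it assumes every model in the zoo is described by a set of task labels, picks the ontology network by largest overlap with T, and then adds correction networks greedily. The greedy rule, the function name `plan_superbody`, and the toy zoo are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def plan_superbody(
    required: Set[str], zoo: Dict[str, Set[str]]
) -> Tuple[str, List[str], Set[str]]:
    """Pick an ontology network, then correction networks, by task overlap."""
    # Step 1: the ontology network maximizes |T ∩ T_t| (ties: first by name).
    ontology = max(sorted(zoo), key=lambda name: len(required & zoo[name]))
    missing = required - zoo[ontology]

    # Step 2 (assumed greedy rule): add the network covering most missing tasks.
    corrections: List[str] = []
    candidates = sorted(name for name in zoo if name != ontology)
    while missing and candidates:
        best = max(candidates, key=lambda name: len(missing & zoo[name]))
        if not missing & zoo[best]:
            break  # nothing left in the zoo covers the remaining tasks
        corrections.append(best)
        candidates.remove(best)
        missing = missing - zoo[best]
    return ontology, corrections, missing


# Hypothetical zoo: model name -> set of tasks (here, class labels) it handles.
zoo = {
    "net_a": {"cat", "dog", "bird"},
    "net_b": {"car", "truck"},
    "net_c": {"ship", "cat"},
}
ontology, corrections, uncovered = plan_superbody(
    {"cat", "dog", "car", "ship"}, zoo
)
print(ontology, corrections, uncovered)
```

In this toy case one ontology network and two correction networks are selected, which mirrors the three-network setting used in the steps above.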

In summary, the network cloning technique that this paper proposes for building a network super-body can be expressed as:

[Equation image not preserved in the source.]

Here M_s denotes the set of correction networks, so a network super-body takes the form of one ontology network plus one or more correction networks. Network cloning clones the required part of each correction network and embeds it into the ontology network.

Specifically, the network cloning framework proposed in this paper involves two technical points:

[Equation image not preserved in the source.]

For a clone involving P correction networks, the first technical point is key part localization, Local(∙). Because a correction network may carry task information that is irrelevant to the task set T, Local(∙) aims to locate the parts of the correction network that relate to the tasks T⋂T_s. The localization parameter is denoted M^ρ; implementation details are given in Section 2.1. The second technical point is network module embedding, Insert(∙), which must select a suitable embedding point R^ρ for each correction network. Implementation details are given in Section 2.2.

02 Method Overview

To simplify the description of the network cloning method, we set the number of correction networks to P=1 (and therefore omit the superscript ρ on the correction network); that is, we connect one ontology network and one correction network to build the required super-body network.

As mentioned above, network cloning consists of key part localization and network module embedding. To aid understanding, we introduce an intermediate transferable module M_f: network cloning locates the key parts of the correction network to form the transferable module M_f, and then embeds that module into the ontology network M_t through soft connections. The goal of network cloning is therefore to locate and embed a transferable module that has both portability and local fidelity.


2.1 Locating key parts of the network

The goal of locating the key parts of the network is to learn the selection function M, defined here as a mask applied to the filters of each layer of the network. The transferable module can then be expressed as:

[Equation image not preserved in the source.]

In this formula, the correction network M_s is written as L layers (the per-layer symbol is not preserved in the source). Note that extracting the transferable module does not modify the correction network in any way.
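
To make the idea of a filter mask concrete, here is a minimal sketch under simplifying assumptions: a network is a list of layers, each layer is a list of flattened filters, and the mask holds one 0/1 entry per filter. The representation and the name `extract_module` are illustrative and not taken from the paper. The sketch only shows that the module is read out of the correction network while the source weights stay untouched.

```python
import copy
from typing import List

Filter = List[float]   # one filter's weights, flattened
Layer = List[Filter]   # a layer is a list of filters


def extract_module(network: List[Layer], mask: List[List[int]]) -> List[Layer]:
    """Return the filters selected by a 0/1 mask, leaving `network` untouched."""
    module: List[Layer] = []
    for layer, layer_mask in zip(network, mask):
        if len(layer) != len(layer_mask):
            raise ValueError("mask must have one entry per filter")
        # Copy the kept filters so later edits cannot alias the source weights.
        module.append([list(f) for f, keep in zip(layer, layer_mask) if keep])
    return module


# Toy correction network with L = 2 layers (weights are made up).
correction_net = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.0, -1.0], [2.0, -2.0]],
]
snapshot = copy.deepcopy(correction_net)

mask = [[1, 0, 1], [0, 1]]
module = extract_module(correction_net, mask)
print(module)
print(correction_net == snapshot)  # the correction network is unchanged
```

A real convolutional network would also need the input channels of the next layer to match the kept filters; that bookkeeping is left out here.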

To obtain a suitable transferable module M_f, we explicitly locate the part of the correction network M_s that contributes most to the final prediction. Before doing so, given the black-box nature of neural networks and the fact that we need only part of the network's predictions, we use LIME to fit the correction network and model its local behavior on the required task (see the paper for details).

The local modeling result is denoted G; it is defined over D_t, the training data set for the required partial predictions, which is smaller than the training set of the original network.

The selection function M can therefore be optimized with the following objective:

[Equation image not preserved in the source.]

In this objective, the located key parts are fitted to the local model G.
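
The exact objective is not reproduced here, so the following is only a toy stand-in for the idea of fitting a mask to a local model. It assumes a one-layer "network" whose output is the sum of the kept filters' responses, a local model G given as a plain function, and an exhaustive search over 0/1 masks with a squared-error criterion. All of these choices, and the names `masked_output` and `fit_mask`, are assumptions for illustration; the paper optimizes the mask rather than enumerating it.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple


def masked_output(weights: Sequence[float], mask: Sequence[int], x: float) -> float:
    """Toy 'network': the sum of the kept filters' responses w_k * x."""
    return sum(w * x for w, keep in zip(weights, mask) if keep)


def fit_mask(
    weights: Sequence[float],
    local_model: Callable[[float], float],
    data: Sequence[float],
) -> Tuple[Optional[Tuple[int, ...]], float]:
    """Exhaustively pick the 0/1 mask whose output best matches the local model."""
    best_mask: Optional[Tuple[int, ...]] = None
    best_err = float("inf")
    for mask in product((0, 1), repeat=len(weights)):
        err = sum(
            (masked_output(weights, mask, x) - local_model(x)) ** 2 for x in data
        )
        if err < best_err:
            best_mask, best_err = mask, err
    return best_mask, best_err


def local_g(x: float) -> float:
    """Hypothetical local model G for the task to be cloned."""
    return 3.5 * x


weights = [0.5, 2.0, -1.0, 1.5]   # made-up filter responses
d_t = [-1.0, 0.5, 2.0]            # made-up samples standing in for D_t
mask, err = fit_mask(weights, local_g, d_t)
print(mask, err)
```

Only the filters with weights 2.0 and 1.5 reproduce the local model on these samples, so the search keeps exactly those two.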

2.2 Network module embedding

The transferable module M_f is extracted directly from the correction network M_s with the selection function M, without modifying its weights. The next step is to decide where in the ontology network M_t to embed M_f so as to obtain the best cloning performance.

The embedding of the network module is controlled by the position parameter R. Following most model reuse settings, network cloning keeps the first few layers of the ontology model as a generic feature extractor, so the embedding process reduces to finding the best embedding position (i.e., embedding the transferable module M_f at layer R). The search for the embedding position can be expressed as:

[Equation image not preserved in the source.]

See the paper for a detailed explanation of the formula. In general, the search-based embedding involves the following points:

The search for the best position parameter R proceeds from the deep layers of the network to the shallow layers;
After the transferable module is embedded into the super-body network at layer R, an adapter A must additionally be introduced at the embedding position and the F_c layer must be fine-tuned again (for classification networks), but the parameter counts of both are negligible compared with the entire Model Zoo;
As the connection is tried from layer L-1 down to layer 0 of the network, we roughly estimate the embedding performance from the converged loss of each fine-tuning run and select the position with the lowest converged loss as the final embedding point.
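
The search described in these points can be sketched as a simple loop. This is a hedged outline, not the paper's implementation: `finetune_loss` is a hypothetical callback standing in for "embed M_f at layer r, add the adapter A, fine-tune A and F_c, and return the converged loss", and the loss values below are invented purely to exercise the loop.

```python
from typing import Callable, Dict, Tuple


def search_embedding_position(
    num_layers: int, finetune_loss: Callable[[int], float]
) -> Tuple[int, Dict[int, float]]:
    """Try positions L-1, L-2, ..., 0 and keep the one with the lowest loss."""
    losses: Dict[int, float] = {}
    best_r, best_loss = -1, float("inf")
    for r in range(num_layers - 1, -1, -1):  # deep -> shallow
        # Stand-in for: embed M_f at layer r, add adapter A, fine-tune A and F_c.
        losses[r] = finetune_loss(r)
        if losses[r] < best_loss:
            best_r, best_loss = r, losses[r]
    return best_r, losses


# Made-up converged losses per position, only to exercise the search.
fake_losses = {0: 0.9, 1: 0.7, 2: 0.4, 3: 0.55, 4: 0.8}
best_r, losses = search_embedding_position(5, fake_losses.__getitem__)
print(best_r, list(losses))
```

Because only A and F_c are tuned at each candidate position, every trial leaves the pre-trained weights of both networks as they were.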

03 Practical application of network cloning technology

The core of the network cloning technique proposed in this paper is to establish connection paths between pre-trained networks without modifying any of their parameters. It serves not only as the key technology for building a network super-body but can also be applied flexibly in a range of practical scenarios.

Scenario 1: Network cloning makes it possible to use the Model Zoo online. When resources are limited, users can draw on the online Model Zoo flexibly without downloading pre-trained networks to local storage.

Note that M_t and M_s remain fixed and unchanged throughout the process that determines the cloned model. Model cloning does not modify the pre-trained models in any way, nor does it introduce a new model. It makes any combination of functions in the Model Zoo possible, and it helps keep the Model Zoo ecosystem healthy, because establishing a connection with M and R is a simple masking and positioning operation that is easy to undo. The proposed network cloning technique therefore supports building a sustainable online inference platform on top of the Model Zoo.

Scenario 2: Networks produced by network cloning are better suited to transmission. The technique can reduce delay and loss when a network is transmitted.

When transmitting a network, we only need to send a small set of components (the set's exact contents are not preserved in the source). Combined with the public Model Zoo, the receiver can then restore the original network. This set is very small compared with the entire cloned network, so transmission delay is reduced. If A and F_c still suffer some transmission loss, the receiver can easily repair them by fine-tuning on the data set. Network cloning thus offers a new form of network for efficient transmission.
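
A rough sketch of this transmission scenario follows. It assumes the transmitted set consists of the mask M, the position R, the adapter A and the classifier F_c; this is inferred from the surrounding text, not confirmed by it. The class `ClonePayload`, the helpers, and all sizes are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

Network = List[List[List[float]]]  # layers -> filters -> weights


@dataclass
class ClonePayload:
    """Assumed payload: mask M, position R, adapter A, classifier F_c."""
    mask: List[List[int]]
    position: int
    adapter: List[float]
    classifier: List[float]

    def num_values(self) -> int:
        mask_entries = sum(len(layer) for layer in self.mask)
        return mask_entries + 1 + len(self.adapter) + len(self.classifier)


def count_weights(net: Network) -> int:
    return sum(len(f) for layer in net for f in layer)


def rebuild(payload: ClonePayload, zoo: Dict[str, Network],
            ontology: str, correction: str) -> Dict[str, object]:
    """Receiver side: combine public zoo weights with the small payload."""
    module = [
        [f for f, keep in zip(layer, layer_mask) if keep]
        for layer, layer_mask in zip(zoo[correction], payload.mask)
    ]
    return {"ontology": zoo[ontology], "module": module,
            "position": payload.position, "adapter": payload.adapter,
            "classifier": payload.classifier}


def toy_network(layers: int, filters: int, width: int) -> Network:
    return [[[0.0] * width for _ in range(filters)] for _ in range(layers)]


# Public zoo that both sender and receiver can already access.
zoo = {"m_t": toy_network(3, 4, 100), "m_s": toy_network(3, 4, 100)}
payload = ClonePayload(
    mask=[[1, 0, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0]],
    position=2,
    adapter=[0.0] * 8,
    classifier=[0.0] * 10,
)
cloned = rebuild(payload, zoo, "m_t", "m_s")
# Backbone weights the receiver would otherwise have to receive directly.
full_size = count_weights(zoo["m_t"]) + count_weights(cloned["module"])
print(payload.num_values(), full_size)
```

Even in this toy setting the payload is a few dozen values while the cloned backbone holds well over a thousand, which is the gap the scenario relies on.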

04 Experimental results

We validated the method experimentally on classification tasks. To evaluate how well the transferable module represents local performance, we introduce a conditional similarity index:

[Equation image not preserved in the source.]

where Sim_cos(∙) denotes cosine similarity.
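
For reference, cosine similarity itself is the dot product of two vectors divided by the product of their norms. The helper below is a generic sketch of that standard definition; how the paper's conditional similarity index feeds features into it is not shown here, and the function name is illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """cos(u, v) = <u, v> / (||u|| * ||v||); undefined for zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
parallel = cosine_similarity([3.0, 4.0], [6.0, 8.0])
print(orthogonal, parallel)
```

Values near 1 indicate feature vectors pointing in the same direction, and values near 0 indicate unrelated directions.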

[Table: experimental results on MNIST, CIFAR-10, CIFAR-100 and Tiny-ImageNet; image not preserved in the source.]

The table above gives the experimental results on MNIST, CIFAR-10, CIFAR-100 and Tiny-ImageNet. The model obtained by network cloning (PNC) shows the most significant performance improvement. Fine-tuning the entire network (PNC-F) does not improve performance; on the contrary, it increases the model's bias.

[Figure: similarity matrices over the sub-datasets, for the source network (left) and for the transferable module (right).]

In addition, we evaluated the quality of the transferable modules (figure above). The left panel shows that the features learned from the different sub-datasets are all correlated to some degree, which underlines the importance of extracting and localizing local features from the correction network. For the transferable modules, we compute their similarity Sim(∙). The right panel shows that a transferable module is highly similar to the sub-dataset to be cloned, while its relationship with the remaining sub-datasets is weakened (the off-diagonal areas are lighter than in the matrix plot of the source network). We therefore conclude that the transferable module successfully reproduces the local performance on the task set to be cloned, which supports the localization strategy.

05 Summary

This paper studies a new knowledge transfer task called Partial Network Cloning (PNC), which clones a parameter module from a correction network in a copy-and-paste manner and embeds it into the ontology network. Unlike previous knowledge transfer settings, which rely on updating network parameters, our approach leaves the parameters of all pre-trained models unchanged. The core of PNC is to locate the key parts of the network and embed the transferable module jointly; the two steps reinforce each other.

We demonstrate outstanding results of our method on accuracy and transferability metrics on multiple datasets.
