WO2022252694A1 - Neural network optimization method and device thereof - Google Patents
- Publication number
- WO2022252694A1 (PCT/CN2022/076556)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- neural network
- target
- network architecture
- architecture
- target neural
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/0985—Hyperparameter optimisation; Meta-learning; Learning-to-learn
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
Definitions
- the present application relates to the technical field of artificial intelligence (AI), in particular to a neural network optimization method and a device thereof.
- AI artificial intelligence
- neural networks have been applied to more and more fields.
- a neural network development platform such as an automatic machine learning (AutoML) platform
- AutoML automatic machine learning
- When users use an existing neural network, they may be dissatisfied with its performance.
- In this case, the user can also use the neural network development platform to optimize the existing neural network to obtain a neural network with better performance.
- the neural network development platform can perform optimization operations such as graph optimization and operator fusion on the existing neural network to obtain a neural network that can realize the functions of the existing neural network and has better performance.
- the optimization operation has limited space for improving the performance of the neural network, resulting in poor performance improvement of the neural network.
- the present application provides a neural network optimization method and device thereof, which can effectively improve the performance of the optimized neural network.
- the technical scheme provided by this application is as follows:
- the present application provides a neural network optimization method
- The neural network optimization method includes: receiving a model file of the neural network to be optimized; obtaining, based on that model file, a search space of the target neural network architecture, where the search space includes the value range of each attribute of each neuron in the target neural network architecture; obtaining the target neural network architecture based on the search space; training the target neural network architecture based on the model file of the neural network to be optimized to obtain a model file of the target neural network; and providing the model file of the target neural network to the user.
- The neural network to be optimized can be mapped to a relatively similar search space based on its model file; the target neural network architecture is then determined based on the search space and trained to obtain a target neural network with greatly improved performance, and the model file of the target neural network is provided to the user.
- the method can greatly improve the performance of the optimized neural network, and can use the optimized neural network to solve more complex tasks, thereby ensuring the scope of application of the optimized neural network.
- The neural network optimization method further includes: receiving target information input by the user, the target information including one or more of the following: information on the hardware that runs the target neural network, and information indicating the user's performance requirements for the target neural network.
- obtaining the search space of the target neural network architecture based on the model file of the neural network to be optimized includes: obtaining the search space of the target neural network architecture based on the model file of the neural network to be optimized and target information.
- By providing the target information, the user helps the optimization system determine what kind of neural network is needed, so that the optimization process is more targeted and yields a target neural network that better fits the user's needs.
- The evaluation index values stored in the knowledge base usually cover multiple types of indexes, so the knowledge base must be searched using the corresponding multi-type evaluation index values; however, the user's input may not include all of these values.
- Obtaining the target neural network architecture based on the search space includes: obtaining, based on the search space, specified information that reflects the user's performance requirements for the target neural network; searching the knowledge base based on the specified information; and, when the knowledge base contains an existing neural network architecture that satisfies the specified information, determining that existing neural network architecture as the target neural network architecture.
- The specified information may include multi-type evaluation index values, and obtaining, based on the search space, the specified information that reflects the user's performance requirements for the target neural network includes: obtaining at least one candidate neural network architecture based on the search space; and performing training and inference on each candidate neural network architecture to obtain the multi-type evaluation index values of each candidate neural network architecture.
- Searching the knowledge base based on the specified information includes: searching the knowledge base based on the multi-type evaluation index values of each candidate neural network architecture; and, when any neural network architecture in the knowledge base satisfies the multi-type evaluation index values of any candidate neural network architecture, determining that the knowledge base contains an existing neural network architecture that satisfies the specified information.
- obtaining the target neural network architecture based on the search space also includes: when there is no existing neural network architecture satisfying the specified information in the knowledge base, using an architecture search strategy to search to obtain the target neural network architecture.
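- The retrieve-first, search-as-fallback flow described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the `Entry` record layout, the two threshold parameters standing in for the "specified information", and the `search` callback are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class Entry:
    """One knowledge-base record (hypothetical layout)."""
    architecture: str
    latency_ms: float  # a first-type (hardware-related) index value
    accuracy: float    # a second-type (hardware-independent) index value


def find_target_architecture(
    knowledge_base: Iterable[Entry],
    max_latency_ms: float,
    min_accuracy: float,
    search: Callable[[], str],
) -> Tuple[str, bool]:
    """Return (architecture, reused_from_knowledge_base)."""
    # Step 1: look for an existing architecture that satisfies the requirements.
    for entry in knowledge_base:
        if entry.latency_ms <= max_latency_ms and entry.accuracy >= min_accuracy:
            return entry.architecture, True
    # Step 2: nothing suitable is stored, so fall back to architecture search.
    return search(), False
```

Reusing a stored architecture skips the expensive search entirely, which is the efficiency gain the text attributes to the knowledge base.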
- When the knowledge base contains an existing neural network architecture that satisfies the specified information, that architecture can be determined directly as the target neural network architecture without running the architecture search strategy, which improves the optimization efficiency for the neural network to be optimized and reduces the resources consumed by the optimization.
- The neural network optimization method further includes: storing, in the knowledge base, intermediate data generated during the process of searching for the target neural network architecture, the intermediate data including one or more of the following: the search space of the target neural network architecture, the candidate neural network architectures obtained based on the search space, the first type of evaluation index value related to hardware, the second type of evaluation index value independent of hardware, and the information of the hardware used to run the target neural network.
- Storing the intermediate data of the search process in the knowledge base makes it possible to use that data to serve other users. By searching the knowledge base first, when the knowledge base already contains a neural network architecture suitable for the neural network to be optimized, there is no need to run the architecture search strategy, which improves the optimization efficiency for the neural network to be optimized and reduces the resources consumed by the optimization.
- Training the target neural network architecture based on the model file of the neural network to be optimized to obtain the model file of the target neural network includes: training the target neural network architecture with a model cloning method, based on the model file of the neural network to be optimized, to obtain the model file of the target neural network.
- the target neural network architecture is trained by the model cloning method.
- The model cloning method clones the inference behavior of the neural network to be optimized onto the target neural network architecture, which ensures that the inference behavior of the target neural network is consistent with that of the neural network to be optimized.
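- One way to picture model cloning is as training a "student" to reproduce the outputs of the existing network rather than ground-truth labels. The sketch below is an assumption-laden toy, not the method claimed in the application: `teacher` is a hypothetical stand-in for the network to be optimized, the student is a single linear unit, and the loss is the mean squared difference between the two outputs.

```python
from typing import Sequence, Tuple


def teacher(x: float) -> float:
    """Hypothetical stand-in for the neural network to be optimized."""
    return 3.0 * x + 1.0


def clone(inputs: Sequence[float], steps: int = 2000, lr: float = 0.05) -> Tuple[float, float]:
    """Fit a student y = w*x + b to the teacher's outputs by gradient descent."""
    w, b = 0.0, 0.0
    n = len(inputs)
    # The teacher's predictions, not labelled data, serve as training targets.
    targets = [teacher(x) for x in inputs]
    for _ in range(steps):
        # Gradients of the mean squared error between student and teacher outputs.
        grad_w = sum(2 * (w * x + b - t) * x for x, t in zip(inputs, targets)) / n
        grad_b = sum(2 * (w * x + b - t) for x, t in zip(inputs, targets)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

Because the student is trained only against the teacher's outputs, its inference behavior converges toward the teacher's, which is the consistency property the text describes.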
- Obtaining the search space of the target neural network architecture includes: inputting the model file of the neural network to be optimized into a pre-trained artificial intelligence model, and obtaining the search space of the target neural network architecture output by the artificial intelligence model.
- The artificial intelligence model can predict the search space and automatically detect the task type of the target neural network, so the user does not need to inform the optimization system of the task type of the neural network to be optimized, which simplifies the operations the user must perform when optimizing the neural network.
- The neural network optimization method further includes: optimizing the speed of the target neural network using an optimization strategy, where the optimization strategy includes a graph optimization strategy and an operator optimization strategy; in this case, providing the model file of the target neural network to the user includes: providing the user with the model file of the speed-optimized target neural network.
- The neural network optimization method further includes: receiving a speed optimization request sent by the user; optimizing, based on the speed optimization request, the speed of the target neural network using an optimization strategy, where the optimization strategy includes a graph optimization strategy and an operator optimization strategy; and providing the user with the model file of the speed-optimized target neural network.
- the calculation amount or other system overhead (such as memory access overhead) of the target neural network can be reduced, and the inference speed of the target neural network can be improved.
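- As a simple illustration of why operator-level optimization reduces computation and memory access, two consecutive affine operators can be fused into one. This is only one elementary instance of operator fusion, chosen for the example; the application does not specify which fusions its optimization strategy applies, and `fuse_affine` is a hypothetical helper.

```python
from typing import Tuple


def fuse_affine(a1: float, b1: float, a2: float, b2: float) -> Tuple[float, float]:
    """Return (a, b) such that a*x + b == a2*(a1*x + b1) + b2 for every x."""
    return a2 * a1, a2 * b1 + b2


def run_unfused(x: float, a1: float, b1: float, a2: float, b2: float) -> float:
    # Two operators: the intermediate result is written and read back once.
    hidden = a1 * x + b1
    return a2 * hidden + b2


def run_fused(x: float, a: float, b: float) -> float:
    # One operator: a single multiply-add and no intermediate result.
    return a * x + b
```

The fused form computes the same output with half the multiply-add operations and without materializing the intermediate value.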
- The present application provides a neural network optimization device, which includes: an interaction module, used to receive a model file of the neural network to be optimized; an architecture determination module, used to obtain, based on the model file of the neural network to be optimized, a search space of the target neural network architecture, where the search space includes the value range of each attribute of each neuron in the target neural network architecture, and further used to obtain the target neural network architecture based on the search space; and a training module, used to train the target neural network architecture based on the model file of the neural network to be optimized to obtain a model file of the target neural network. The interaction module is also used to provide the model file of the target neural network to the user.
- the interaction module also receives target information input by the user, and the target information includes one or more of the following information: information about the hardware running the target neural network, and information indicating the performance requirements of the user for the target neural network;
- the architecture determination module is specifically used to: obtain the search space of the target neural network architecture based on the model file and target information of the neural network to be optimized.
- The architecture determination module is specifically used to: obtain, based on the search space, specified information reflecting the user's performance requirements for the target neural network; search the knowledge base based on the specified information; and, when the knowledge base contains an existing neural network architecture that satisfies the specified information, determine that existing neural network architecture as the target neural network architecture.
- the specified information includes multi-type evaluation index values
- The architecture determination module is specifically used to: obtain at least one candidate neural network architecture based on the search space; and perform training and inference on each candidate neural network architecture to obtain the multi-type evaluation index values of each candidate neural network architecture.
- The architecture determination module is specifically used to: search the knowledge base based on the multi-type evaluation index values of each candidate neural network architecture; and, when any neural network architecture in the knowledge base satisfies the multi-type evaluation index values of any candidate neural network architecture, determine that the knowledge base contains an existing neural network architecture that satisfies the specified information.
- the architecture determining module is further specifically configured to: when there is no existing neural network architecture satisfying specified information in the knowledge base, use an architecture search strategy to search for a target neural network architecture.
- The neural network optimization device further includes: a storage module, configured to store, in the knowledge base, intermediate data generated during the process of searching for the target neural network architecture, the intermediate data including one or more of the following: the search space of the target neural network architecture, the candidate neural network architectures obtained based on the search space, the first type of evaluation index value related to hardware, the second type of evaluation index value independent of hardware, and the information of the hardware used to run the target neural network.
- the training module is specifically configured to: use a model cloning device to train the target neural network architecture based on the model file of the neural network to be optimized to obtain the model file of the target neural network.
- the architecture determination module is specifically configured to: input the model file of the neural network to be optimized into the pre-trained artificial intelligence model, and obtain the search space of the target neural network architecture output by the artificial intelligence model.
- The neural network optimization device further includes: a reasoning module, configured to optimize the speed of the target neural network using an optimization strategy, where the optimization strategy includes a graph optimization strategy and an operator optimization strategy; the interaction module is specifically used to provide the user with the model file of the speed-optimized target neural network.
- the interaction module is also used to receive the speed optimization request sent by the user; correspondingly, the neural network optimization device further includes: a reasoning module, used to optimize the speed of the target neural network by using an optimization strategy based on the speed optimization request,
- the optimization strategy includes: graph optimization strategy and operator optimization strategy;
- the interaction module is also used to provide the user with the model file of the speed-optimized target neural network.
- the present application provides a computer device.
- The computer device includes a processor and a memory in which a computer program is stored; when the processor executes the computer program, the computer device implements the method provided in the first aspect of the present application and any optional implementation thereof.
- the present application provides a non-transitory computer-readable storage medium.
- When the instructions in the computer-readable storage medium are executed by a processor, the method provided in the first aspect of the present application and any optional implementation thereof is implemented.
- the present application provides a computer program product including instructions, which, when the computer program product is run on a computer, cause the computer to execute the method provided in the first aspect of the present application and any optional implementation manner.
- FIG. 1 is a schematic diagram of an optimization system involved in a neural network optimization method provided in an embodiment of the present application
- Fig. 2 is a schematic diagram of an optimization system involved in another neural network optimization method provided by an embodiment of the present application
- Fig. 3 is a schematic diagram of an optimization system involved in another neural network optimization method provided by an embodiment of the present application.
- FIG. 4 is a schematic diagram of an application scenario involved in a neural network optimization method provided by an embodiment of the present application
- FIG. 5 is a flow chart of a neural network optimization method provided in an embodiment of the present application.
- FIG. 6 is a schematic diagram of an operation process of a neural network to be optimized provided by an embodiment of the present application.
- FIG. 7 is a schematic diagram of an implementation process of a gradient used to control the parameters of the neural network architecture through the loss function provided by the embodiment of the present application;
- Fig. 8 is a flow chart of another neural network optimization method provided by the embodiment of the present application.
- FIG. 9 is a schematic diagram of a knowledge base provided by an embodiment of the present application.
- Fig. 10 is a schematic diagram of another knowledge base provided by the embodiment of the present application.
- Fig. 11 is a schematic diagram of an optimization system involved in another neural network optimization method provided by the embodiment of the present application.
- FIG. 12 is a schematic diagram of a search process using the MBNAS method provided in the embodiment of the present application.
- FIG. 13 is a schematic structural diagram of a neural network optimization device provided in an embodiment of the present application.
- Fig. 14 is a schematic structural diagram of another neural network optimization device provided by the embodiment of the present application.
- FIG. 15 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
- Deep learning: a machine learning technology based on deep neural network algorithms, whose main feature is the use of multiple nonlinear transformation structures to process and analyze data. It is mainly used in perception, decision-making, and other scenarios in the field of artificial intelligence, such as image and speech recognition, natural language translation, and computer games.
- Automatic machine learning (AutoML): an advanced control framework for machine learning models that can automatically search for the optimal parameter configuration of a machine learning model without human intervention.
- Neural network: a mathematical model that imitates the neural network of the human brain in order to realize artificial-intelligence-like capabilities. A neural network can also be called a neural network model. It usually uses a plurality of connected neurons (also called nodes) to simulate the neural network of the human brain.
- The connection mode and/or connection structure of the neurons in a neural network is referred to as the neural network architecture of that neural network.
- Typical neural network architectures include recurrent neural network (RNN) architecture, convolutional neural network (CNN) architecture, and so on.
- RNN recurrent neural network
- CNN convolutional neural network
- Neural network architectures can be represented by directed graphs such as directed acyclic graphs. Each edge in a directed graph has a weight, which is used to represent the importance of the input node in an edge relative to the output node in the edge.
- the parameters of the neural network include the above weights. It should be noted that the weights can usually be obtained by using sample data to train the neural network.
- Obtaining the neural network model from the neural network architecture consists of two stages.
- One stage is to perform weight initialization on the neural network architecture to obtain the initial neural network model, also called the initial sub-model.
- the weight initialization refers to the initialization of the weights (in some cases, biases) of each edge in the neural network architecture.
- weight initialization can be realized by generating weight initial values through Gaussian distribution.
- Another stage is to use the sample data to update the weights of the initial sub-model to obtain a neural network model, also called a child model.
- the sample data is input into the initial sub-model, and the initial sub-model can determine a loss value according to the predicted value of the sample data by the initial sub-model and the true value carried by the sample data, and update the weight of the initial sub-model based on the loss value.
- a sub-model can be obtained. This sub-model is a trained neural network model that can be used for a specific application.
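- The two stages described above, Gaussian weight initialization followed by loss-driven weight updates, can be sketched for a single linear neuron. The squared-error loss, the learning rate, the standard deviation, and the function names are assumptions chosen for the illustration; the application does not prescribe them.

```python
import random
from typing import List, Sequence, Tuple


def init_weights(n: int, std: float = 0.1, seed: int = 0) -> List[float]:
    """Stage 1: draw initial weights from a zero-mean Gaussian distribution."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(n)]


def train_step(
    weights: Sequence[float], x: Sequence[float], y_true: float, lr: float = 0.1
) -> Tuple[List[float], float]:
    """Stage 2: one weight update from the loss on a single sample."""
    y_pred = sum(w * xi for w, xi in zip(weights, x))
    error = y_pred - y_true
    loss = error ** 2  # loss computed before the update
    # Gradient of the squared error with respect to each weight is 2 * error * x_i.
    new_weights = [w - lr * 2 * error * xi for w, xi in zip(weights, x)]
    return new_weights, loss
```

Calling `train_step` repeatedly on sample data turns the initial sub-model into the trained sub-model.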
- Measuring the pros and cons of a sub-model can be achieved through the evaluation index value of the sub-model.
- the evaluation index value is a measurement value obtained by evaluating the sub-model from at least one dimension.
- the evaluation index values of sub-models can be divided into two categories, one type of evaluation index value changes with hardware changes, and the other type of evaluation index values remains unchanged with hardware changes.
- the evaluation index value that changes with hardware changes is called the first type of evaluation index value
- the evaluation index value that remains unchanged with the hardware change is called the second type of evaluation index value.
- the first type of evaluation index values are evaluation index values related to hardware, including performance values related to hardware.
- the hardware-related performance value includes any one or more of model inference latency (latency), activation amount, throughput, power consumption (power), and video memory occupancy.
- the second type of evaluation index values are evaluation index values that are not related to hardware, including precision values that are not related to hardware.
- the precision value includes any one or more of accuracy (accuracy), precision (precision) and recall (recall).
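- For a binary classifier, the three precision-type values named above follow from the counts of true positives (tp), false positives (fp), false negatives (fn), and true negatives (tn). The helper below is an illustrative sketch; its name and the choice to return 0.0 for an empty denominator are assumptions.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted positives that are actually positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of actual positives that are predicted positive.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

These values depend only on the model's predictions, which is why the text classes them as hardware-independent.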
- the evaluation index value not related to hardware also includes parameter quantity and computing power, and computing power specifically includes floating-point operations per second (FLOPs).
- the industry proposes to use neural network development platforms (such as AutoML platform) to design and train neural networks for users.
- When using an existing neural network, users may be dissatisfied with its performance.
- In this case, the user can also use the neural network development platform to optimize the existing neural network to obtain a neural network with better performance.
- The main process includes: determining the search space of the optimized neural network according to the existing neural network, searching the search space for the neural network architecture of the optimized neural network, and then training the obtained neural network architecture to obtain the optimized neural network.
- the search space includes the value range of each attribute of each neuron.
- the search space defines a search range for the neural network architecture, and a set of searchable neural network architectures can be provided based on the range defined by the search space.
- the search space can be divided into chain architecture space, multi-branch architecture space, and block-based search space.
- Different search spaces can be characterized by the value range of each attribute.
- the search space can be characterized by the value range of the two attributes of neuron identification and neuron execution operations.
- the search space can also be characterized in combination with at least one of the number of layers included in the neural network architecture, the unit block data included in each layer, and the number of neurons included in each unit block.
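- A search space defined by value ranges can be pictured as a mapping from attribute name to allowed values, with each combination of values being one searchable architecture. The attribute names and values below are hypothetical, and the sketch simplifies by using one range per attribute for the whole architecture rather than one per neuron.

```python
from itertools import product
from typing import Dict, List, Sequence

# Hypothetical search space: each attribute maps to its value range.
SEARCH_SPACE: Dict[str, Sequence] = {
    "num_layers": [2, 3],
    "neurons_per_block": [1, 2],
    "operation": ["conv3x3", "conv5x5", "max_pool"],
}


def enumerate_architectures(space: Dict[str, Sequence]) -> List[dict]:
    """List every architecture the search space allows."""
    keys = list(space)
    # The Cartesian product of the value ranges is the set of searchable architectures.
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]
```

Real search spaces are far too large to enumerate, which is why an architecture search strategy samples from them instead; the enumeration here only makes the "set of searchable architectures" concrete.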
- optimization operations such as graph optimization and operator fusion are usually performed on the existing neural network.
- the current optimization operation has limited room for improving the performance of the neural network, resulting in poor performance of the optimized neural network.
- the embodiment of the present application provides a neural network optimization method.
- This method first maps the neural network to be optimized to a similar search space according to its model file, then determines the target neural network architecture based on the search space, and trains the target neural network architecture.
- A target neural network with greatly improved performance is thereby obtained, and the model file of the target neural network is provided to the user. Therefore, the performance of a neural network optimized through the neural network optimization method provided in the embodiment of the present application can be greatly improved.
- the neural network optimization method provided in the embodiment of the present application can be applied to an optimization system.
- the optimization system is used to implement the neural network optimization method provided in the embodiment of the present application.
- the optimization system may be implemented by one or more devices such as terminals, physical machines, bare metal servers, cloud servers, virtual machines, or containers.
- the optimization system 1 may include the following functional modules: an interaction module 11 , an architecture determination module 12 and a training module 13 .
- the interaction module 11 is used to receive the model file of the neural network to be optimized, and provide the optimized model file of the target neural network to the user.
- the architecture determination module 12 is used to obtain the search space of the target neural network architecture according to the model file of the neural network to be optimized, and obtain the target neural network architecture based on the search space.
- the training module 13 is used to train the target neural network architecture based on the model file of the neural network to be optimized, and obtain the model file of the target neural network.
- For convenience of description, the neural network provided to the user and the model file of the neural network provided to the user are treated as the same thing and are not distinguished.
- the optimization system 1 may further include a reasoning module 14 .
- the reasoning module 14 is used to provide reasoning functions.
- The architecture determination module 12 can generate multiple candidate neural network architectures so as to obtain the target neural network architecture from them; the reasoning module 14 can then perform inference on the models corresponding to the multiple candidate neural network architectures and obtain first-type evaluation index values, such as the inference latency of the candidate neural network architectures running on the hardware.
- the training module 13 is also used to train the models corresponding to the multiple candidate neural network architectures generated by the architecture determination module 12, and obtain second-type evaluation index values such as accuracy values of the multiple candidate neural network architectures.
- the architecture determination module 12 is specifically configured to obtain the target neural network architecture according to the first-type evaluation index values and the second-type evaluation index values of multiple candidate neural network architectures.
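- as a minimal sketch of how the two kinds of evaluation index values might be combined, the following Python example picks, among candidates whose inference latency stays within a budget, the one with the highest accuracy. The selection rule, the `Candidate` type, the function name and the sample numbers are illustrative assumptions; the embodiment does not prescribe a specific rule.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one candidate neural network architecture."""
    name: str
    latency_ms: float  # first-type index value (depends on the hardware)
    accuracy: float    # second-type index value (independent of the hardware)


def select_target(candidates: Sequence[Candidate],
                  max_latency_ms: float) -> Optional[Candidate]:
    """Return the most accurate candidate within the latency budget, if any."""
    feasible = [c for c in candidates if c.latency_ms <= max_latency_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.accuracy)


# Made-up values, purely for illustration.
candidates = [
    Candidate("arch_a", latency_ms=12.0, accuracy=0.951),
    Candidate("arch_b", latency_ms=8.5, accuracy=0.943),
    Candidate("arch_c", latency_ms=20.0, accuracy=0.962),
]
target = select_target(candidates, max_latency_ms=15.0)
```

- in this sketch the reasoning module would supply `latency_ms` and the training module would supply `accuracy`; other combinations, such as a weighted score, would fit the same description.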
- the reasoning module 14 is also used to optimize the speed of the target neural network.
- the interaction module 11 is specifically configured to provide the user with the model file of the target neural network optimized for speed.
- the reasoning module 14 is also used to obtain the second type of evaluation index value of the target neural network, so as to provide the user with the evaluation index value of the target neural network when providing the model file of the target neural network.
- the optimization system 1 may further include a storage module 15 for storing intermediate data in the process of obtaining the target neural network architecture, so as to improve the optimization efficiency of the neural network to be optimized.
- each of the above modules may also have other functions, which are not listed here.
- The parts of the optimization system 1 can all be deployed on any one of a terminal, a physical machine, a bare metal server, a cloud server, a virtual machine, and a container. Alternatively, the parts of the optimization system 1 may be distributed across one or more of multiple terminals, multiple physical machines, multiple bare metal servers, multiple cloud servers, multiple virtual machines, and multiple containers.
- the training module 13 may be located at the service side, for example, the training module 13 may be provided by a neural network optimization service provider.
- the training module 13 may also be located at the user side, that is, the training module 13 may be provided by the user who needs to use the neural network optimization service.
- the neural network optimization service is used to provide the functions realized by the neural network optimization method provided in the embodiment of the present application.
- because the training module 13 is used both to train the target neural network architecture and to train the models corresponding to the multiple candidate neural network architectures generated by the architecture determination module 12, the training module 13 can contain at least two parts.
- the first part of the at least two parts is used for training the target neural network architecture
- the second part of the at least two parts is used for training the models corresponding to the multiple candidate neural network architectures generated by the architecture determining module 12 .
- the first part may be located on the user side, so as to use the training data on the user side to execute the training process
- the second part may be located on the service side.
- the reasoning module 14 may be located at the service side, or, as shown in FIG. 3 , the reasoning module 14 may also be located at the user side.
- when the reasoning module 14 is located at the user side, the reasoning module 14 provided by the user performs inference on the models corresponding to the multiple candidate neural network architectures generated by the architecture determination module 12, so the models do not need to be uploaded to the service side, which avoids model leakage and protects model privacy.
- the reasoning module 14 can be used to run, on the corresponding hardware, inference on the models corresponding to the multiple candidate neural network architectures generated by the architecture determination module 12, and can also be used to obtain the second-type evaluation index value of the target neural network. Similar to the training module 13, the reasoning module 14 may therefore include at least two parts.
- the first part of the at least two parts is used to obtain the second-type evaluation index value of the target neural network
- the second part of the at least two parts is used to obtain first-type evaluation index values, such as the inference delay of the multiple candidate neural network architectures running on the hardware.
- the first part may be located on the user side, so as to use the training data on the user side to obtain the second-type evaluation index value of the target neural network
- the second part may be located on the service side.
- some or all modules in the optimization system 1 can be realized by resources in the cloud platform.
- Basic resources owned by the cloud service provider are deployed on the cloud platform, such as computing resources, storage resources, and network resources.
- the computing resources may be a large number of computer devices (such as servers).
- the optimization system 1 can use the basic resources deployed in the cloud platform to implement the neural network optimization method provided in the embodiment of the present application to realize the optimization of the neural network.
- when the interaction module 11 and the architecture determination module 12 are located on the service side, and the training module 13 and the reasoning module 14 are located on the user side, the interaction module 11 and the architecture determination module 12 can be deployed in a public cloud platform, and the training module 13 and the reasoning module 14 can be deployed in a private cloud platform, so that the neural network optimization method provided by the embodiment of the present application can be realized on the hybrid cloud platform formed by the public cloud platform and the private cloud platform.
- the optimization system 1 shown in FIG. 1 and FIG. 2 may be all deployed on a public cloud platform or all deployed on a private cloud platform.
- the neural network optimization method provided by the embodiment of the present application can be abstracted by the cloud service provider into a neural network optimization cloud service on the cloud platform and provided to users. After the user purchases the neural network optimization cloud service on the cloud platform, the cloud platform can use the optimization system 1 to provide that service for the neural network supplied by the user.
- the cloud platform can provide users with different neural network optimization cloud services. For example, for the different deployment methods of the above-mentioned training module 13 and reasoning module 14, the cloud platform can provide users with at least the following two kinds of neural network optimization cloud services:
- in the first kind of cloud service, the training module 13 and the reasoning module 14 are deployed on the service side; for example, each part of the optimization system 1 is deployed in a cloud computing cluster of the public cloud platform. After purchasing the neural network optimization cloud service, the user can send the neural network to be optimized to the public cloud platform, and the public cloud platform can use the neural network optimization cloud service provided by the optimization system 1 to optimize the neural network to be optimized and provide the user with the optimized neural network.
- in the second kind of cloud service, the training module 13 and the reasoning module 14 are deployed on the user side, and the interaction module 11 and the architecture determination module 12 are deployed on the service side; for example, the interaction module 11 and the architecture determination module 12 are deployed in a cloud computing cluster of the public cloud platform, and the training module 13 and the reasoning module 14 are deployed in a cloud computing cluster of the private cloud platform.
- after purchasing the neural network optimization cloud service, the user needs to send the neural network to be optimized to the public cloud platform, and the public cloud platform uses the architecture determination module 12 to provide the user, based on the neural network to be optimized, with multiple candidate neural network architectures or with the models corresponding to the multiple candidate neural network architectures.
- the user uses the training module 13 and the reasoning module 14 to obtain the evaluation index values of multiple candidate neural network architectures, and sends the evaluation index values to the public cloud platform.
- the public cloud platform determines the target neural network architecture among the multiple candidate neural network architectures according to the evaluation index values. The public cloud platform then provides the target neural network architecture, or the model corresponding to that architecture, to the user, so that the user uses the training module 13 to train it and obtain the optimized target neural network.
- the cloud platform may be the cloud platform of a central cloud, the cloud platform of an edge cloud, or a cloud platform including both a central cloud and an edge cloud, which is not specifically limited in this embodiment of the application.
- the optimization system may be partially deployed on the cloud platform of the edge cloud and partially deployed on the cloud platform of the central cloud.
- Fig. 1 to Fig. 3 are only some specific examples of the optimization system provided by the embodiment of the present application. Moreover, the above division and deployment of functional modules in the optimization system are only illustrative examples. This application does not limit the division of functional modules in the optimization system or the deployment of each functional module. In actual applications, the deployment can be adapted to the computing power of the computing devices on which the optimization system is deployed or to specific application requirements.
- FIG. 5 is a flow chart of a neural network optimization method provided by an embodiment of the present application. As shown in Figure 5, the neural network optimization method includes the following steps:
- Step 501 Receive the model file of the neural network to be optimized.
- the model file of the neural network to be optimized needs to be provided to the optimization system, so that the optimization system can determine which neural network needs to be optimized.
- the model file of the neural network to be optimized is used to indicate the neural network to be optimized.
- the neural network to be optimized is essentially a directed graph, and the directed graph can be represented by a model file, and the model file can be a file with a suffix of .ph.
- the operation process of the neural network to be optimized represented by the model file is: first perform a convolution operation (conv) on the input data (input), then apply the rectified linear unit (ReLU) to the result of the convolution operation, then perform a sum operation (add) on the ReLU output and the result of the convolution operation, and finally output the sum (output).
- the rectified linear unit, also called the linear rectification function, is a commonly used activation function in artificial neural networks, and usually refers to the nonlinear function represented by the ramp function and its variants.
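- the conv, ReLU, add, output sequence described above can be sketched in plain Python. The one-dimensional "valid" convolution, the kernel and the input values are assumptions chosen only to keep the example small; the source does not specify shapes or weights.

```python
from typing import List


def conv1d(x: List[float], kernel: List[float]) -> List[float]:
    """'Valid' 1-D convolution in the cross-correlation form used by most frameworks."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def relu(values: List[float]) -> List[float]:
    """Ramp function: negative values become zero."""
    return [max(0.0, v) for v in values]


def forward(x: List[float], kernel: List[float]) -> List[float]:
    """input -> conv -> ReLU, then add the ReLU output to the conv output."""
    conv_out = conv1d(x, kernel)
    relu_out = relu(conv_out)
    return [a + b for a, b in zip(relu_out, conv_out)]


output = forward([1.0, -2.0, 3.0, -4.0], [1.0, 1.0])
# conv_out = [-1.0, 1.0, -1.0], relu_out = [0.0, 1.0, 0.0]
# output   = [-1.0, 2.0, -1.0]
```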
- Step 502 Receive the target information input by the user.
- the target information includes one or more of the following: information about the hardware running the target neural network (also referred to as hardware information), and information indicating the user's performance requirements for the target neural network (also referred to as performance requirement information).
- in addition to the model file, the user can also provide the optimization system with requirements for optimizing the neural network to be optimized; these requirements can be expressed by the target information.
- the target information may include one or more of the following: information about the hardware that the user expects to run the target neural network, and information indicating the performance requirements of the user for the target neural network.
- the performance requirement information may also indicate the performance requirement of the target neural network on a specified data set, and the performance requirement information may include information indicating at least one of the first-type evaluation index value and the second-type evaluation index value of the target neural network.
- for example, the hardware information indicates that the hardware on which the user expects to run the target neural network is a certain type of graphics processing unit (GPU), and the performance requirement information indicates that the user expects the inference accuracy of the target neural network to be 95.94%, or indicates that the user expects the target neural network to have an inference accuracy of 95.94% on the cifar10 dataset.
- the target neural network is a neural network obtained after the optimization system optimizes the neural network to be optimized.
- step 502 is an optional execution step.
- the user can choose whether to provide target information to the optimization system.
- when the user provides target information, the optimization of the neural network by the optimization system can be more targeted, so as to provide the user with a target neural network that better meets the user's needs.
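- one way to picture the optional target information is as a small record with two optional fields. The class names, field names and the placeholder hardware identifier `gpu-model-x` below are hypothetical; only the 95.94% accuracy and the cifar10 dataset come from the example in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PerformanceRequirement:
    """Hypothetical performance requirement information."""
    metric: str                    # e.g. "accuracy" or "latency_ms"
    value: float
    dataset: Optional[str] = None  # None when no data set is specified


@dataclass
class TargetInfo:
    """Hypothetical target information; both parts are optional (step 502)."""
    hardware: Optional[str] = None
    performance: Optional[PerformanceRequirement] = None

    def is_empty(self) -> bool:
        """True when the user supplied no target information at all."""
        return self.hardware is None and self.performance is None


info = TargetInfo(
    hardware="gpu-model-x",  # placeholder, not a real product name
    performance=PerformanceRequirement("accuracy", 0.9594, dataset="cifar10"),
)
no_info = TargetInfo()
```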
- Step 503 based on the model file and target information of the neural network to be optimized, obtain the search space of the target neural network architecture.
- as described above, step 502 is optional. When step 502 is not executed, that is, when the optimization system does not receive the target information, the search space of the target neural network architecture is obtained without the target information; in that case, the search space is obtained based on the model file of the neural network to be optimized.
- the implementation process of obtaining the search space of the target neural network architecture based on the model file of the neural network to be optimized and the target information is described below by taking the execution of step 502 as an example.
- the search space includes the value range of each attribute of each neuron in the target neural network architecture.
- a pre-trained artificial intelligence model may be used to predict the search space.
- the artificial intelligence model can output the search space of the target neural network architecture according to the input model file and target information of the neural network to be optimized.
- the process of predicting the search space through the artificial intelligence model may include the following. The artificial intelligence model analyzes the network architecture characteristics of the neural network to be optimized based on its model file and obtains the possible task types of the target neural network. Based on the target information, it then determines the task type of the target neural network among these possible task types; for example, when the possible task types are a classification task and a detection task, it determines which of the two the target neural network performs. Finally, according to the correspondence between task types and search spaces, it outputs the search space corresponding to the task type of the target neural network, which is the search space of the target neural network architecture.
- there can be an association between the target information and the task type of a neural network, and this association can be used to further determine the task type of the target neural network among the possible task types based on the target information.
- for example, suppose the association is as follows: a GPU of model A1 is usually used for tasks of task type A2, and a GPU of model B1 is usually used for tasks of task type B2. Then, when the target information indicates that the hardware running the target neural network is a GPU of model A1, it can be determined according to the association that the task type of the target neural network is task type A2 among the possible task types.
- that is, when the optimization system receives the target information, the target information can be used to further filter the possible task types in the process of determining the search space. Therefore, whether or not the optimization system receives the target information affects the process of determining the search space only in whether the target information is used for this further screening. Moreover, when the optimization system receives the target information, the further screening allows it to determine a search space that better matches the target neural network, which can improve the performance of the target neural network obtained according to the search space.
- because the artificial intelligence model can automatically detect the task type of the target neural network, the user does not need to inform the optimization system of the task type of the neural network to be optimized, which simplifies the actions the user needs to perform when optimizing the neural network.
- the artificial intelligence model may be a classification model, such as a support vector machine (SVM).
- before the model file is input into the artificial intelligence model, the data type of the model file of the neural network to be optimized can also be converted into a data type that the artificial intelligence model can recognize; for example, the model file of the neural network to be optimized is converted into one-dimensional feature data, and then the one-dimensional feature data is input into the artificial intelligence model.
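- the A1/A2 and B1/B2 example can be sketched as a lookup that narrows the possible task types. The table contents mirror the placeholders used in the text; the function name and the behaviour when the hardware is unknown (leave the possible task types unchanged) are assumptions.

```python
from typing import Dict, List, Optional

# Association between hardware model and the task type usually run on it,
# using the placeholder names from the text.
HARDWARE_TO_TASK: Dict[str, str] = {"A1": "A2", "B1": "B2"}


def refine_task_type(possible: List[str], hardware: Optional[str]) -> List[str]:
    """Narrow the possible task types using the hardware in the target information."""
    if hardware is None:          # step 502 not executed: nothing to filter with
        return possible
    hinted = HARDWARE_TO_TASK.get(hardware)
    if hinted in possible:        # the association points at one of the possibilities
        return [hinted]
    return possible               # unknown hardware: keep all possibilities
```

- for instance, `refine_task_type(["A2", "B2"], "A1")` returns `["A2"]`, while passing `None` as the hardware leaves both task types in place.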
- the one-dimensional feature data is used to represent the type of each node in the neural network to be optimized and the relationship between nodes.
- a graph algorithm (such as a graph kernel algorithm) can be used to convert the model file of the neural network to be optimized into one-dimensional feature data.
- the result of converting the model file into one-dimensional feature data is shown in Table 1. For each row of data in Table 1: "t#N" indicates the Nth graph, for example "t#0" indicates the 0th graph; "v M L" indicates that the label of the Mth vertex in the graph is L, for example "v 0 1" indicates that the label of the 0th vertex in the graph is 1; "e P Q" indicates that the Pth vertex and the Qth vertex are connected by an edge, for example "e 0 1" indicates that the 0th vertex and the first vertex are connected by an edge; "t#-1" indicates the end of the model file.
- the order of vertices in Fig. 6 is obtained by arranging from top to bottom and from left to right.
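- the row format described above can be illustrated with a short serializer. This sketch only writes the "t#", "v", "e" rows; it is not a graph kernel algorithm. The integer label assigned to each node type and the example graph (the conv, ReLU, add network described for step 501) are assumptions and need not match Table 1.

```python
from typing import Dict, List, Tuple

# Hypothetical label codes for node types.
LABELS: Dict[str, int] = {"input": 0, "conv": 1, "relu": 2, "add": 3, "output": 4}


def serialize_graph(graph_id: int,
                    node_types: List[str],
                    edges: List[Tuple[int, int]]) -> str:
    """Write one graph as 't#N', 'v M L' and 'e P Q' rows, ending with 't#-1'."""
    lines = [f"t#{graph_id}"]
    lines += [f"v {i} {LABELS[t]}" for i, t in enumerate(node_types)]
    lines += [f"e {p} {q}" for p, q in edges]
    lines.append("t#-1")
    return "\n".join(lines)


nodes = ["input", "conv", "relu", "add", "output"]
# conv feeds both relu and add; relu feeds add; add feeds output.
edges = [(0, 1), (1, 2), (1, 3), (2, 3), (3, 4)]
text = serialize_graph(0, nodes, edges)
```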
- Step 504 Obtain the target neural network architecture based on the search space and target information of the target neural network architecture.
- as described above, step 502 is optional. When step 502 is not executed, that is, when the optimization system does not receive the target information, the target neural network architecture is obtained without the target information; in that case, the target neural network architecture is obtained based on the model file of the neural network to be optimized.
- the implementation process of obtaining the target neural network architecture based on the model file and target information of the neural network to be optimized is described below by taking the execution of step 502 as an example.
- the target neural network architecture can be obtained based on the search space and target information.
- an architecture search strategy can be used to search for the target neural network architecture.
- the optimization system can also be configured with a knowledge base, in which case the target neural network architecture can be obtained by searching in the knowledge base based on the search space and target information.
- when the optimization system is equipped with a knowledge base, it can first search in the knowledge base based on the search space and target information.
- when the knowledge base contains an existing neural network architecture that uses the search space and satisfies the target information, the existing neural network architecture is determined as the target neural network architecture.
- otherwise, the target neural network architecture is obtained by using the architecture search strategy based on the search space and target information.
- the implementation process of searching in the knowledge base and using the architecture search strategy is not described here temporarily, and will be introduced in the following content.
- when such an existing neural network architecture is found in the knowledge base, it can be directly determined as the target neural network architecture without searching for a neural network architecture with the architecture search strategy, which improves the efficiency of optimizing the neural network to be optimized and reduces the resources consumed by the optimization.
- the intermediate data generated during the search for the target neural network architecture can also be stored in the knowledge base to improve the efficiency of optimizing the neural network to be optimized.
- the intermediate data includes one or more of the following: the search space of the target neural network architecture, the candidate neural network architectures obtained based on the search space, hardware-related first-type evaluation index values, hardware-independent second-type evaluation index values, and information about the hardware used to run the target neural network.
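- the "knowledge base first, architecture search as fallback" order can be sketched as follows. The list-based knowledge base, the `matches` predicate, the `search` callback and the sample entries are all hypothetical stand-ins; how matching and searching actually work is described elsewhere in the text.

```python
from typing import Any, Callable, Dict, List, Optional

Architecture = Dict[str, Any]


def find_in_knowledge_base(kb: List[Architecture],
                           matches: Callable[[Architecture], bool]
                           ) -> Optional[Architecture]:
    """Return the first stored architecture accepted by the predicate."""
    for arch in kb:
        if matches(arch):
            return arch
    return None


def obtain_target_architecture(kb: List[Architecture],
                               matches: Callable[[Architecture], bool],
                               search: Callable[[], Architecture]
                               ) -> Architecture:
    """Reuse an existing architecture when possible, otherwise run the search."""
    existing = find_in_knowledge_base(kb, matches)
    if existing is not None:
        return existing       # no architecture search needed
    found = search()          # stand-in for the architecture search strategy
    kb.append(found)          # keep the result for later optimizations
    return found


kb = [{"name": "kb_arch", "accuracy": 0.96}]
search_calls = []


def fake_search() -> Architecture:
    search_calls.append(1)
    return {"name": "searched_arch", "accuracy": 0.95}


hit = obtain_target_architecture(kb, lambda a: a["accuracy"] >= 0.9594, fake_search)
calls_after_hit = len(search_calls)
miss = obtain_target_architecture(kb, lambda a: a["accuracy"] >= 0.99, fake_search)
```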
- the first type of evaluation index value includes an inference delay running on hardware, etc.
- the second type of evaluation index value includes an accuracy value, etc.
- Step 505 Train the target neural network architecture based on the model file of the neural network to be optimized to obtain the model file of the target neural network.
- a model cloning method provided in the embodiment of the present application may be used to train the target neural network architecture to obtain a model file of the target neural network.
- the model cloning method is described below:
- the basic principle of the model cloning method is to train the target neural network architecture with the goal of fitting its output for any training data to the output of the neural network to be optimized for the same training data, thereby obtaining the trained target neural network. That is, when the model cloning method is used to train the target neural network architecture, whether training is complete can be determined by judging whether the difference between the output of the target neural network architecture for any training data and the output of the neural network to be optimized for that training data tends to a minimum.
- fitting means that the weight coefficients in the target neural network architecture are adjusted so that the difference between the output of the adjusted neural network architecture for any training data and the output of the neural network to be optimized for that training data tends to a minimum.
- the implementation process includes: for any training data, obtaining the target parameters of the target neural network architecture and of the neural network to be optimized for that training data; obtaining the loss value of a specified loss function from these two sets of target parameters; passing the loss value back to the target neural network architecture to determine the gradient used to tune the target neural network architecture; and adjusting the weight parameters of the target neural network architecture according to the determined gradient until the training goal is reached.
- the target parameter of a neural network for training data is the logits, that is, the logarithm of the ratio of the number of times event A occurs to the number of times event A does not occur in the neural network for the training data.
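- in symbols, writing n_A for the number of times event A occurs, n_Ā for the number of times it does not occur, and p = n_A / (n_A + n_Ā) for the corresponding probability (these symbols are introduced here only for illustration), the definition above reads:

```latex
\operatorname{logit}(p) \;=\; \ln\frac{n_A}{n_{\bar{A}}} \;=\; \ln\frac{p}{1-p}
```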
- Figure 7 is a schematic diagram of the process of training the target neural network architecture when the target parameter is logits.
- the loss value of the specified loss function can be obtained as follows: the target parameters of the target neural network architecture for any training data and the target parameters of the neural network to be optimized for the same training data are both used as inputs of the specified loss function, and the output of the specified loss function is the loss value.
- the specific implementation form of the specified loss function may be designed according to application requirements, and is not specifically limited in this embodiment of the present application.
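- a minimal sketch of the cloning loop, under strong simplifying assumptions: both networks are stood in for by single-output linear models, the specified loss function is taken to be the squared difference of the two logits, and plain per-sample gradient descent is used. None of these choices is prescribed by the embodiment. Note that no labels are needed; only the output of the neural network to be optimized is used.

```python
import random
from typing import List


def logits(weights: List[float], x: List[float]) -> float:
    """Single-output linear model standing in for a network's logit."""
    return sum(w * xi for w, xi in zip(weights, x))


def clone(teacher_w: List[float], data: List[List[float]],
          lr: float = 0.1, epochs: int = 200) -> List[float]:
    """Fit the student's logits to the teacher's logits on unlabeled data."""
    student_w = [0.0] * len(teacher_w)
    for _ in range(epochs):
        for x in data:
            diff = logits(student_w, x) - logits(teacher_w, x)
            # loss = diff ** 2, so d(loss)/d(w_i) = 2 * diff * x_i
            student_w = [w - lr * 2.0 * diff * xi
                         for w, xi in zip(student_w, x)]
    return student_w


random.seed(0)
data = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(32)]
teacher = [2.0, -1.0]          # stands in for the neural network to be optimized
student = clone(teacher, data)  # stands in for the target neural network
```

- after training, the student's weights are close to the teacher's, so the two models produce nearly the same logits on the training data, which is the training goal described above.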
- public data sets can be used as training sets.
- public data sets such as the ImageNet dataset can be used. Since the ImageNet dataset has rich image information, using the ImageNet dataset as a training set can effectively guarantee the training effect of the neural network.
- when public datasets are used, users do not need to upload training data, which solves the problem that users cannot provide data for training models due to data privacy, legal or transmission restrictions.
- generative adversarial network (GAN)
- with the model cloning method, the reasoning behavior of the neural network to be optimized can be cloned to the target neural network architecture, which guarantees that the reasoning behavior of the target neural network is consistent with that of the neural network to be optimized.
- the target neural network architecture obtained by optimization can differ from the architecture of the neural network to be optimized, and the accuracy of the target neural network can be further guaranteed.
- Step 506 providing the model file of the target neural network to the user.
- the model file of the target neural network can be provided to the user, so that the user can use the target neural network.
- the optimization system may further optimize the speed of the target neural network, and provide the user with the speed-optimized neural network.
- the speed optimization may include offline optimization and online optimization. Offline optimization refers to optimizing the speed of the target neural network when the user is not using the target neural network for reasoning. Online optimization refers to optimizing the speed of the target neural network while the user is using the target neural network for reasoning.
- the optimization strategy can be directly used to optimize the speed of the target neural network.
- the implementation process of step 506 includes: providing the user with the model file of the target neural network after speed optimization.
- the optimization strategy includes: graph optimization strategy and operator optimization strategy.
- the graph optimization strategy may include: performing an equivalent transformation on the graph used to represent the target neural network and then adjusting the structure of the target neural network according to the transformed graph, and/or fusing the operators implemented on multiple nodes of the target neural network into one node and then adjusting the structure of the target neural network according to the operator fusion result.
- the operator optimization strategy may include: using an operator search technique to search, according to the type and parameters of an operator, for the optimal implementation among the various algorithms that can implement the operator. Here, a collection of one or more operations performed on an operand is called an operator.
- by optimizing the speed in this way, the calculation amount or other system overhead (such as memory access overhead) of the target neural network can be reduced, and the inference speed of the target neural network can be improved.
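- operator fusion can be illustrated on a linear chain of operators. The fusion table, the fused operator name `conv_relu` and the list representation are assumptions; the sketch also assumes that the intermediate result of each fused pair has no other consumer, which a real graph optimizer would have to check (in the conv, ReLU, add example of step 501 the convolution result is also read by the add node, so that particular pair could not be fused this way).

```python
from typing import Dict, List, Tuple

# Hypothetical table of adjacent operator pairs that can run as one node.
FUSABLE: Dict[Tuple[str, ...], str] = {("conv", "relu"): "conv_relu"}


def fuse_operators(ops: List[str]) -> List[str]:
    """Replace each fusable adjacent pair in a linear chain with one fused node."""
    fused: List[str] = []
    i = 0
    while i < len(ops):
        pair = tuple(ops[i:i + 2])
        if pair in FUSABLE:
            fused.append(FUSABLE[pair])
            i += 2          # both operators are consumed by the fused node
        else:
            fused.append(ops[i])
            i += 1
    return fused


optimized = fuse_operators(["conv", "relu", "conv", "relu", "add"])
# optimized == ["conv_relu", "conv_relu", "add"]
```

- fewer nodes means fewer intermediate results written to and read from memory, which is one way the memory access overhead mentioned above can be reduced.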
- the speed optimization of the target neural network may be performed according to the user's request.
- the neural network optimization method also includes:
- Step 507 receiving the speed optimization request sent by the user.
- if the user considers that the speed of the target neural network needs to be optimized, the user can send a speed optimization request to the optimization system to request that the optimization system optimize the speed of the target neural network by using an optimization strategy.
- Step 508 Based on the speed optimization request, optimize the speed of the target neural network by using an optimization strategy.
- the optimization strategy includes: graph optimization strategy and operator optimization strategy.
- Step 509 providing the user with the model file of the target neural network optimized for speed.
- the knowledge base can be viewed as a database that stores information related to the neural network model architecture.
- the knowledge base includes at least three types of information, and the at least three types of information include: neural network pre-training parameters, evaluation index values, and neural network architecture.
- the evaluation index values include the first type of evaluation index value and the second type of evaluation index value and other evaluation index values.
- the first type of evaluation index value includes the inference delay and power consumption running on the hardware.
- the second type of evaluation index value includes the accuracy value etc.
- the pre-training parameters include: verification data used to verify accuracy in the training set of the neural network and other data (such as training data, etc.) used for knowledge transfer.
- Knowledge transfer refers to the generation of data used to obtain the neural network required by the user according to the user's needs and the existing data in the knowledge base.
- the pre-training parameters and evaluation index values can be regarded as the label information of the neural network architecture.
- FIG. 9 is a schematic diagram of the knowledge base including the pre-training parameters of the neural network, evaluation index values and neural network architecture.
- the information in the knowledge base is classified according to the source, and can be divided into initial information and later information.
- This initial information can come from public datasets and public model repositories.
- This late information can be accumulated during the process of optimizing the neural network.
- the later information may be intermediate data obtained during the process of searching the neural network architecture by using the search strategy.
- the implementation of searching in the knowledge base includes: according to the search space and target information, querying whether the knowledge base stores an existing neural network architecture that uses the search space and satisfies the target information. When such an existing neural network architecture is stored in the knowledge base, it matches the task requirements of the target neural network, so it can be determined as the target neural network architecture.
- because the pre-training parameters and evaluation index values serve as label information of the neural network architectures, querying whether the knowledge base stores an existing neural network architecture that uses the search space and satisfies the target information can be regarded as querying, based on the search space and target information, whether the pre-training parameters and evaluation index values in the knowledge base contain label information that matches the target information for an architecture that uses the search space.
- when such label information exists, the neural network architecture that uses the search space and carries the label information matching the target information is determined to be the existing neural network architecture that uses the search space and satisfies the target information.
- the search space and the label information matching the target information can jointly indicate the training set for training the target neural network, and the label information matching the target information can indicate evaluation index values of the target neural network, such as the first-type evaluation index value and the second-type evaluation index value.
- The search space used by a neural network architecture can indicate that the neural network architecture and the neural network to be optimized belong to the same type of neural network.
- Querying whether the knowledge base stores an existing neural network architecture that uses the search space means analyzing the network structure of each neural network architecture stored in the knowledge base to determine whether the value of each attribute of each neuron in the architecture falls within the range included in the search space. When the value of every attribute of every neuron in a neural network architecture falls within the range included in the search space, the neural network architecture is determined to use the search space.
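- The check described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiments: the representation of a search space as a mapping from attribute name to an inclusive range, the representation of an architecture as a list of per-neuron attribute dictionaries, and all names (`uses_search_space`, `kernel_size`, `channels`) are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical representations (not defined in the source text):
# a search space maps each neuron attribute to an inclusive (low, high) range,
# and an architecture is a list of neurons, each a dict of attribute values.
SearchSpace = Dict[str, Tuple[float, float]]
Architecture = List[Dict[str, float]]


def uses_search_space(arch: Architecture, space: SearchSpace) -> bool:
    """Return True only if every attribute of every neuron lies in the space."""
    for neuron in arch:
        for attr, value in neuron.items():
            if attr not in space:
                # Assumption: an attribute unknown to the space counts as outside it.
                return False
            low, high = space[attr]
            if not (low <= value <= high):
                return False
    return True


resnet_like_space = {"kernel_size": (1, 7), "channels": (16, 512)}
arch_ok = [{"kernel_size": 3, "channels": 64}, {"kernel_size": 1, "channels": 128}]
arch_bad = [{"kernel_size": 9, "channels": 64}]  # kernel_size is out of range
```

- Here `uses_search_space(arch_ok, resnet_like_space)` holds, while `arch_bad` is rejected because one attribute value falls outside the range.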
- The evaluation index values stored in the knowledge base usually include multiple index values, for example, the first type of evaluation index value and the second type of evaluation index value. Therefore, when querying whether the knowledge base contains label information matching the target information, each of the index values must be matched against the target information. When all of the index values match the target information, the label information is determined to match the target information. Correspondingly, the target information must then include multiple categories of evaluation index values corresponding to those index values.
- The target information input by the user may include only some categories of the multi-category evaluation index values.
- In that case, the categories of evaluation index values not included in the target information can be obtained, and the obtained categories together with the categories in the target information form the multi-category evaluation index values.
- When the optimization system does not receive target information, it can obtain the multi-category evaluation index values corresponding to the various index values according to the search space.
- The set consisting of the evaluation index values obtained according to the search space and the evaluation index values included in the target information is called specified information. The specified information reflects the user's performance requirements for the target neural network and includes multi-category evaluation index values corresponding to the various index values in the knowledge base.
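- How the specified information is assembled can be sketched as follows. This is a hedged illustration only: the category names (`accuracy`, `latency_ms`, `power_w`), the function name `build_specified_info`, and the numeric values other than the 95.94% accuracy used later in the text are hypothetical.

```python
from typing import Dict, Iterable


def build_specified_info(
    target_info: Dict[str, float],
    derived_from_space: Dict[str, float],
    categories: Iterable[str],
) -> Dict[str, float]:
    """Combine user-supplied index values with values obtained from the search space.

    Values given in the target information take precedence; any category the
    user did not supply is filled from the values obtained via the search space.
    """
    specified = {}
    for category in categories:
        if category in target_info:
            specified[category] = target_info[category]
        elif category in derived_from_space:
            specified[category] = derived_from_space[category]
        else:
            raise KeyError(f"no value available for category {category!r}")
    return specified


# Categories assumed to be stored in the knowledge base (hypothetical names).
categories = ["accuracy", "latency_ms", "power_w"]
target_info = {"accuracy": 0.9594}  # the user only states an accuracy requirement
derived = {"accuracy": 0.93, "latency_ms": 4.0, "power_w": 55.0}  # illustrative values
specified = build_specified_info(target_info, derived, categories)
```

- With no target information at all, the same call returns only the values obtained from the search space.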
- Obtaining the specified information based on the search space of the target neural network architecture includes: obtaining at least one candidate neural network architecture based on the search space, and performing training and inference on each candidate neural network architecture to obtain the multi-category evaluation index values of each candidate neural network architecture.
- The value of each attribute of each neuron in each candidate neural network architecture can be obtained by sampling the search space, so as to generate at least one candidate neural network architecture. For this process, refer to step 5041.
- The training module can be used to initialize the weights of each candidate neural network architecture to obtain multiple initial sub-models, and then to train the initial sub-models with the training data to obtain the second-type evaluation index values of multiple candidate sub-models. For this process, refer to step 5042.
- The inference module can be used to initialize the weights of each candidate neural network architecture on the hardware to obtain an initial sub-model, and then to run inference on the initial sub-model to obtain its first-type evaluation index values on the hardware. For this process, refer to step 5043.
- When the knowledge base stores multiple existing neural network architectures that use the search space and satisfy the target information, one of them can be determined as the target neural network architecture according to a specified screening strategy.
- The specified screening strategy can be determined according to application requirements. For example, one of the existing neural network architectures can be selected at random, or the best-performing one can be determined as the target neural network architecture; this is not specifically limited in the embodiments of the present application.
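- The two screening strategies mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: "best performance" is taken here to mean the highest accuracy, which the text does not specify, and the function name and entry fields are hypothetical.

```python
import random
from typing import Dict, List, Optional


def select_target_architecture(
    matches: List[Dict],
    strategy: str = "best",
    rng: Optional[random.Random] = None,
) -> Dict:
    """Pick one architecture from several matching knowledge-base entries."""
    if not matches:
        raise ValueError("no existing architecture matched")
    if strategy == "random":
        return (rng or random.Random()).choice(matches)
    if strategy == "best":
        # Assumption: performance is ranked by accuracy only.
        return max(matches, key=lambda entry: entry["accuracy"])
    raise ValueError(f"unknown screening strategy: {strategy}")


matches = [
    {"name": "arch_a", "accuracy": 0.951},
    {"name": "arch_b", "accuracy": 0.963},
]
best = select_target_architecture(matches)
```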
- the information in the knowledge base can be stored according to the search space.
- the knowledge base may be divided into multiple sub-knowledge bases in units of search spaces, and the information stored in the sub-knowledge bases corresponding to any search space is the information obtained under the search space.
- the knowledge base can be divided into three sub-knowledge bases in units of search spaces spaceA, spaceB and spaceC.
- the information stored in the sub-knowledge base corresponding to the search space spaceA includes: the pre-training parameters, evaluation index values and neural network architecture of the neural network obtained under the search space spaceA.
- the information stored in the sub-knowledge base corresponding to the search space spaceB includes: the pre-training parameters, evaluation index values and neural network architecture of the neural network obtained under the search space spaceB.
- the information stored in the sub-knowledge base corresponding to the search space spaceC includes: the pre-training parameters, evaluation index values and neural network architecture of the neural network obtained under the search space spaceC.
- When searching, the sub-knowledge base corresponding to the search space of the target neural network architecture can first be located in the knowledge base, and the search is then performed within that sub-knowledge base.
- Once the search space of the target neural network architecture is determined, there is no need to search the sub-knowledge bases corresponding to other search spaces. This narrows the scope of the search, shortens the time spent searching the knowledge base, and reduces the resources consumed by the search.
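- Partitioning the knowledge base by search space can be sketched as follows. This is an illustrative sketch only; the class `KnowledgeBase`, its methods, and the entry fields are hypothetical, while the search space names follow the spaceA/spaceB/spaceC example above.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class KnowledgeBase:
    """Knowledge base split into sub-knowledge bases keyed by search space."""

    def __init__(self) -> None:
        self._sub_bases: Dict[str, List[dict]] = defaultdict(list)

    def add(self, space_id: str, entry: dict) -> None:
        self._sub_bases[space_id].append(entry)

    def search(self, space_id: str, matches: Callable[[dict], bool]) -> List[dict]:
        # Only the sub-knowledge base of the given search space is scanned;
        # entries stored under other search spaces are never visited.
        return [e for e in self._sub_bases.get(space_id, []) if matches(e)]


kb = KnowledgeBase()
kb.add("spaceA", {"name": "net_a1", "accuracy": 0.91})
kb.add("spaceA", {"name": "net_a2", "accuracy": 0.95})
kb.add("spaceB", {"name": "net_b1", "accuracy": 0.97})
hits = kb.search("spaceA", lambda e: e["accuracy"] >= 0.94)
```

- Keying the storage by search space means the cost of a lookup depends on the size of one sub-knowledge base rather than on the whole knowledge base.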
- For example, the model file provided by the user indicates that the neural network to be optimized is a ResNet34 neural network.
- The target information provided by the user indicates that the optimized target neural network must reach an inference accuracy of at least 95.94% on the cifar10 dataset, and that the user wants to improve the inference performance of the target neural network on a model A GPU.
- The search space of the target neural network architecture is then the search space of the ResNet series of neural network models.
- If the evaluation indicators stored in the knowledge base are inference accuracy, inference delay, and power consumption, then, because the target information indicates only the inference accuracy, the inference delay and power consumption required by the user must be obtained based on the search space.
- Based on the search space of the ResNet series of neural network models, multiple candidate neural network architectures can be generated, and inference can be run on the model A GPU for the neural network models corresponding to these candidate architectures to obtain the inference delay and power consumption of each candidate neural network architecture.
- The knowledge base is then searched according to the search space, the inference accuracy indicated by the target information, and the inference delay and power consumption of each candidate neural network architecture.
- Suppose the search finds a ResNet18 neural network in the knowledge base.
- The ResNet18 neural network uses the search space and meets the inference delay and power consumption determined from the candidate neural network architectures, and its inference accuracy on the cifar10 dataset is 96.01%.
- That is, the ResNet18 neural network satisfies the search space determined according to the neural network ResNet34 to be optimized, the target information provided by the user, and the inference delay and power consumption determined according to the candidate neural network architectures. Therefore, the ResNet18 neural network can be determined as the target neural network architecture obtained by optimizing the neural network ResNet34.
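- The ResNet example can be walked through in code as follows. This is a sketch under assumptions: only the accuracy figures (95.94% required, 96.01% for ResNet18) come from the text; the latency and power numbers, the field names, and the rule that accuracy is a lower bound while latency and power are upper bounds are illustrative choices.

```python
from typing import Dict, List, Optional


def satisfies(entry: Dict, required: Dict) -> bool:
    """Assumed matching rule: accuracy is a floor, latency and power are ceilings."""
    return (
        entry["accuracy"] >= required["accuracy"]
        and entry["latency_ms"] <= required["latency_ms"]
        and entry["power_w"] <= required["power_w"]
    )


def find_existing(
    knowledge_base: List[Dict],
    space_id: str,
    min_accuracy: float,
    candidates: List[Dict],
) -> Optional[Dict]:
    """Return the first entry in the search space that meets any candidate's values."""
    for entry in knowledge_base:
        if entry["space"] != space_id:
            continue
        for cand in candidates:
            required = {
                "accuracy": min_accuracy,
                "latency_ms": cand["latency_ms"],
                "power_w": cand["power_w"],
            }
            if satisfies(entry, required):
                return entry
    return None


knowledge_base = [
    {"name": "ResNet18", "space": "resnet", "accuracy": 0.9601,
     "latency_ms": 2.1, "power_w": 48.0},  # latency and power are made up
    {"name": "OtherNet", "space": "other", "accuracy": 0.97,
     "latency_ms": 1.0, "power_w": 30.0},
]
# Hypothetical measurements of two candidate architectures on the model A GPU.
candidates = [
    {"latency_ms": 1.5, "power_w": 40.0},
    {"latency_ms": 3.0, "power_w": 60.0},
]
found = find_existing(knowledge_base, "resnet", 0.9594, candidates)
```

- With these illustrative numbers, ResNet18 misses the first candidate's latency but meets the second candidate's values, so it is returned as the existing architecture.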
- A traditional neural network architecture search algorithm, such as a neural architecture search (NAS) algorithm or an efficient neural architecture search (ENAS) algorithm, may be used for searching.
- Alternatively, a model-based neural architecture search (MBNAS) method may be used for searching.
- the implementation of the MBNAS method can be realized through multiple functional modules.
- the embodiment of the present application takes the implementation of the MBNAS method through multiple functional modules shown in Figure 11 as an example, and describes the implementation process of searching using the MBNAS method:
- the functional modules used to realize the MBNAS method include an architecture determination module 12, a training module 13, and an inference module 14, and the architecture determination module 12 includes a generation submodule 121 and a search submodule 122, and the search submodule 122 includes an evaluation unit 1221 and control unit 1222.
- the process of searching using the MBNAS method is described below, and the process includes the following steps:
- Step 5041: the generation submodule generates multiple neural network architectures according to the search space, and provides the multiple neural network architectures to the training module and the inference module.
- The search space includes the value range of each attribute of each neuron in the neural network architecture of the optimized neural network.
- By sampling the search space, the generation submodule can obtain the value of each attribute of each neuron in the neural network architecture of the optimized neural network, thereby generating multiple neural network architectures.
- The generation submodule can sample the search space randomly, which helps keep the samples used to train the evaluation unit balanced.
- the generation submodule may also use other methods to sample the search space, which is not specifically limited in this embodiment of the present application.
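- Random sampling of the search space in step 5041 can be sketched as follows. This is a minimal illustration: integer-valued attributes with inclusive ranges, a fixed number of neurons per architecture, and the names `sample_architectures`, `kernel_size`, and `channels` are assumptions made for the example.

```python
import random
from typing import Dict, List, Tuple


def sample_architectures(
    space: Dict[str, Tuple[int, int]],
    num_neurons: int,
    count: int,
    rng: random.Random,
) -> List[List[Dict[str, int]]]:
    """Generate `count` architectures by uniformly sampling every neuron attribute."""
    return [
        [
            # randint is inclusive on both ends, matching the inclusive ranges.
            {attr: rng.randint(low, high) for attr, (low, high) in space.items()}
            for _ in range(num_neurons)
        ]
        for _ in range(count)
    ]


space = {"kernel_size": (1, 7), "channels": (16, 512)}
architectures = sample_architectures(space, num_neurons=4, count=3, rng=random.Random(0))
```

- Passing a seeded `random.Random` instance keeps the sampling reproducible, which is convenient when the sampled architectures later serve as training samples for the evaluation unit.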
- Step 5042: the training module trains multiple candidate sub-models according to the multiple neural network architectures, obtains the second-type evaluation index values of the multiple candidate sub-models, and provides the search submodule with the multiple neural network architectures and the second-type evaluation index values of the multiple candidate sub-models.
- The training module can initialize the weights of the neural network architectures to obtain multiple initial sub-models, and then train the initial sub-models with the training data to obtain the second-type evaluation index values of multiple candidate sub-models.
- The second type of evaluation index value may include an accuracy value.
- the training data used to train the initial sub-model can be public datasets such as the ImageNet dataset, or datasets provided by users.
- the training module can train multiple initial sub-models concurrently, which can shorten the training time and improve the training efficiency.
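- Concurrent training of several initial sub-models can be sketched with the Python standard library as follows. This is only a scheduling sketch: `train_sub_model` is a hypothetical placeholder that returns a made-up accuracy instead of performing real training, and the architecture fields are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def train_sub_model(architecture: Dict) -> Dict:
    """Placeholder for training one sub-model (no real training happens here)."""
    # Hypothetical stand-in: pretend that wider networks reach higher accuracy.
    accuracy = 0.5 + architecture["channels"] / 1000
    return {"name": architecture["name"], "accuracy": accuracy}


architectures: List[Dict] = [
    {"name": "a", "channels": 64},
    {"name": "b", "channels": 128},
    {"name": "c", "channels": 256},
]

# Submit all sub-models at once; map() returns results in input order.
with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(train_sub_model, architectures))
```

- In a real system the workers would more likely be separate processes or devices; threads are used here only to keep the sketch self-contained.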
- Step 5043: the inference module runs inference on the hardware for the multiple sub-models corresponding to the multiple neural network architectures provided by the generation submodule, obtains the first-type evaluation index values of the multiple sub-models on the hardware, and provides the search submodule with the multiple neural network architectures and the first-type evaluation index values of the multiple sub-models.
- The first type of evaluation index value may include inference delay and power consumption.
- When the target information includes hardware information, the hardware used to run inference on the sub-models may be the hardware indicated by the hardware information or, among the hardware configured for the inference module, hardware whose performance is highly similar to that of the hardware indicated by the hardware information.
- When the target information provided by the user does not include hardware information, that is, when the user does not specify the hardware for running the target neural network, the hardware used to run inference on the sub-models may be the hardware configured for the inference module.
- The multiple sub-models on which the inference module runs inference on the hardware may be multiple initial sub-models that the inference module obtains by initializing the weights of the neural network architectures provided by the generation submodule, or multiple candidate sub-models obtained by the training module through training.
- The training module and the generation submodule provide the search submodule with the multiple neural network architectures generated by the generation submodule.
- the inference module can perform inference on multiple sub-models on the hardware in parallel, and obtain the first-type evaluation index values of the multiple sub-models on the hardware.
- Step 5044 the search submodule determines the target neural network architecture according to the multiple neural network architectures, the second-type evaluation index values of multiple candidate sub-models, and the first-type evaluation index values of multiple sub-models.
- the search submodule can train the evaluation unit according to multiple neural network architectures, second-type evaluation index values of multiple candidate sub-models, and first-type evaluation index values of multiple sub-models. Then, the search sub-module uses the trained evaluation unit to predict the neural network architecture provided by the control unit, obtains the evaluation index value corresponding to the neural network architecture provided by the control unit, and uses the evaluation index value as feedback for training the control unit, The trained control unit is then used to determine the target neural network architecture.
- The evaluation unit is a neural network. Its training process includes: inputting into the evaluation unit the neural network architectures generated by the generation submodule together with their corresponding first-type and second-type evaluation index values, and using the first-type and second-type evaluation index values as the labels of the input neural network architectures for supervised learning, so that a loss value is determined from the evaluation index values that the evaluation unit predicts for a neural network architecture and the corresponding labels, and the evaluation unit is updated according to the loss value.
- the evaluation unit may be a recurrent neural network.
- A set of training data used to train the evaluation unit includes: any one of the multiple neural network architectures generated by the generation submodule, the second-type evaluation index value of the candidate sub-model corresponding to that neural network architecture, and the first-type evaluation index value of the sub-model corresponding to that neural network architecture.
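- The supervised training of the evaluation unit can be sketched as follows. This is a deliberately simplified stand-in: the text says the evaluation unit may be a recurrent neural network, whereas this sketch swaps in a linear model trained by stochastic gradient descent on a single index value. The two-number architecture encoding and the synthetic labels are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[List[float], float]  # (encoded architecture, labelled index value)


def predict(weights: List[float], bias: float, x: List[float]) -> float:
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def mse(weights: List[float], bias: float, samples: List[Sample]) -> float:
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in samples) / len(samples)


def train_evaluator(samples: List[Sample], epochs: int = 500, lr: float = 0.2):
    """Fit the predictor: the loss gradient of each sample updates the weights."""
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for x, y in samples:
            err = predict(weights, bias, x) - y  # prediction minus label
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Hypothetical encoding: (normalised depth, normalised width) per architecture.
encoded = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Synthetic labels standing in for a measured index value such as inference delay.
samples = [(x, 2.0 * x[0] + 1.0 * x[1] + 0.5) for x in encoded]

loss_before = mse([0.0, 0.0], 0.0, samples)
weights, bias = train_evaluator(samples)
loss_after = mse(weights, bias, samples)
```

- Once trained, `predict` can score an architecture without training or running it, which is the role the evaluation unit plays in step 5044.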
- the control unit is also a type of neural network.
- the control unit can generate a neural network architecture according to the search space, and the evaluation unit after training can predict the evaluation index value of the neural network architecture generated by the control unit (for example, including at least one of the first type of evaluation index value and the second type of evaluation index value) , the predicted evaluation index value is used as the incentive (reward) for training the control unit, so as to adjust the weight parameters of the control unit according to the incentive, until the training end condition of the control unit is met, such as the control unit tends to converge.
- the control unit may be a recurrent neural network.
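- Using predicted evaluation index values as a reward for the control unit can be sketched as follows. This is a heavily simplified stand-in and not the control unit of the embodiments: the controller here is a softmax policy over the options of a single attribute, updated with the exact gradient of the expected reward rather than a recurrent network trained on sampled architectures. The option values and the reward table are hypothetical.

```python
import math
from typing import Callable, List


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_controller(
    options: List[int],
    predicted_reward: Callable[[int], float],
    steps: int = 200,
    lr: float = 0.5,
) -> List[float]:
    """Raise the probability of options whose predicted reward beats the average."""
    logits = [0.0] * len(options)
    rewards = [predicted_reward(o) for o in options]
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, rewards))
        # Gradient of the expected reward with respect to each logit.
        logits = [l + lr * p * (r - expected) for l, p, r in zip(logits, probs, rewards)]
    return softmax(logits)


def predicted_reward(channels: int) -> float:
    # Hypothetical evaluator output, e.g. accuracy minus a latency penalty.
    return {16: 0.2, 32: 0.5, 64: 0.9, 128: 0.6}[channels]


options = [16, 32, 64, 128]
probs = train_controller(options, predicted_reward)
preferred = options[probs.index(max(probs))]
```

- After training, the policy concentrates probability on the option with the highest predicted reward, mirroring how the trained control unit is expected to favour architectures that the evaluation unit scores well.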
- the control unit after training is used to generate multiple candidate neural network architectures according to the search space of the target neural network.
- The search submodule can obtain the multiple candidate neural network architectures, obtain multiple initial neural networks according to them, and then train the initial neural networks separately to obtain multiple candidate neural networks corresponding to the multiple candidate neural network architectures. Then, according to the evaluation index values of the candidate neural networks, the candidate neural networks that meet the specified filter condition are screened out, and the neural network architecture of a candidate neural network that meets the specified filter condition is determined as the target neural network architecture.
- The specified filter condition may be determined according to at least one of the search space and the target information.
- For example, the specified filter condition indicates that the target neural network architecture must meet the user's performance requirements for the target neural network indicated by the target information. If several of the candidate neural networks meet the performance requirements, the neural network architecture of the best-performing candidate neural network is determined as the target neural network architecture.
- a model cloning method provided in the embodiment of the present application may be used to train multiple initial neural networks
- the training set used for training may be a public data set such as ImageNet data set.
- the search submodule may use the training module to train multiple initial neural networks. And when the training module is deployed on the user side, multiple initial neural networks need to be sent to the training module deployed on the user side, so as to use the training module to perform the training process.
- The above description takes as an example the case in which, during the search using the MBNAS method, the training module 13 provides all the required training functions and the inference module 14 provides all the required inference functions.
- Alternatively, the training function required in the search process may not be provided by the training module 13, and the required inference function may not be provided by the inference module 14.
- For example, the architecture determination module further includes a training submodule and an inference submodule. The training submodule provides the training function required by the search process, and the inference submodule provides the inference function required by the search process.
- The knowledge base includes at least three types of information: the pre-training parameters of the neural network, the evaluation index values, and the neural network architecture. The pre-training parameters and evaluation index values can be regarded as the label information of the neural network architecture.
- When the user specifies the training set of the target neural network, retrieving the knowledge base according to the search space and target information has at least the following six results: no match with any of the at least three types of information (that is, no match with the training set or with any of the target information); a match with the pre-training parameters only (that is, only the training set matches); a match with only some of the index values in the evaluation index values; a match with all of the index values in the evaluation index values only; a match with the pre-training parameters and some of the evaluation index values; and a match with the pre-training parameters and all of the evaluation index values.
- When the user does not specify the training set of the target neural network, retrieving the knowledge base according to the search space and target information has at least the following three results: none of the evaluation index values match, only some of the evaluation index values match, and all of the evaluation index values match.
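- The retrieval results listed above can be enumerated in code as follows. This is an illustrative sketch; the function name, the argument encoding (`"none"`, `"some"`, `"all"`), and the result labels are hypothetical.

```python
def classify_retrieval(
    training_set_specified: bool,
    pretraining_match: bool,
    index_match: str,
) -> str:
    """Name the retrieval result; index_match is one of "none", "some", "all"."""
    if index_match not in ("none", "some", "all"):
        raise ValueError(f"unexpected index_match: {index_match}")
    if not training_set_specified:
        # Without a specified training set only the index values are compared.
        return {
            "none": "no index values match",
            "some": "some index values match",
            "all": "all index values match",
        }[index_match]
    if pretraining_match:
        return {
            "none": "pre-training parameters only",
            "some": "pre-training parameters and some index values",
            "all": "pre-training parameters and all index values",
        }[index_match]
    return {
        "none": "no match",
        "some": "some index values only",
        "all": "all index values only",
    }[index_match]


with_training_set = {
    classify_retrieval(True, p, m)
    for p in (True, False)
    for m in ("none", "some", "all")
}
without_training_set = {
    classify_retrieval(False, False, m) for m in ("none", "some", "all")
}
```

- Enumerating the inputs yields six distinct results when a training set is specified and three when it is not, matching the counts given above.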
- When the retrieval result matches the pre-training parameters and all evaluation index values, or, where no training set is specified, matches all evaluation index values, the existing neural network architecture in the knowledge base can be used directly. In the other matching situations, the MBNAS method needs to be used for searching.
- Depending on the matching situation, the implementation process of searching with the MBNAS method differs slightly. Specifically, when the result matches none of the at least three types of information, matches only the pre-training parameters, matches only some of the evaluation index values with a specified training set, matches only all of the evaluation index values with a specified training set, or matches none of the evaluation index values, the search follows the implementation process described in steps 5041 to 5044 above.
- In the remaining partial-match situations, if the matched part of the evaluation index values is the second type of evaluation index value, the evaluation unit can be trained directly using that second-type evaluation index value. If the matched part of the evaluation index values is the first type of evaluation index value, step 5043 above need not be performed, and the evaluation unit can be trained directly using that first-type evaluation index value.
- Because the first-type evaluation index values, the second-type evaluation index values, and the search space can also be used when the optimization system optimizes neural networks for other users, storing this intermediate data in the knowledge base enriches the knowledge base and improves the efficiency of optimizing the neural network to be optimized.
- In the neural network optimization method provided by the embodiments of the present application, the neural network to be optimized can be mapped to a relatively similar search space based on its model file, the target neural network architecture can then be determined based on the search space, and a target neural network with greatly improved performance can be obtained by training the target neural network architecture. The model file of the target neural network is then provided to the user.
- The method can greatly improve the performance of the optimized neural network, so that the optimized neural network can be used to solve more complex tasks, which ensures its scope of application.
- Because this method does not require users to upload training data, it avoids the problem that users cannot provide training data to the platform due to data privacy, legal, or transmission restrictions. This protects user data and improves the applicability of the neural network optimization method.
- Moreover, because this method only requires the user to provide the neural network to be optimized, and optionally target information, to complete the optimization, it does not require the user to have a certain reserve of model-optimization knowledge before optimization work can be carried out, as an advanced automatic machine learning platform does. This lowers the threshold for using the neural network optimization method and expands its range of application.
- In addition, the intermediate data can be used to provide services for other users. By searching the knowledge base first, when the knowledge base contains an existing neural network architecture that matches the neural network to be optimized, there is no need to use the architecture search strategy to search for a neural network architecture, which improves the efficiency of optimizing the neural network to be optimized and reduces the resources consumed by the optimization.
- the embodiment of the present application also provides a neural network optimization device.
- The neural network optimization device can realize part or all of the functions of the aforementioned optimization system 1.
- When the neural network optimization device is a software device, it may be part or all of the aforementioned optimization system.
- the neural network optimization device 130 includes:
- the interaction module 1301 is used to receive the model file of the neural network to be optimized.
- the architecture determination module 1302 is configured to obtain the search space of the target neural network architecture based on the model file of the neural network to be optimized, and the search space includes the value range of each attribute of each neuron in the target neural network architecture.
- the architecture determination module 1302 is also used to obtain the target neural network architecture based on the search space.
- the training module 1303 is configured to train the target neural network architecture based on the model file of the neural network to be optimized to obtain the model file of the target neural network.
- the interaction module 1301 is also used to provide the model file of the target neural network to the user.
- the interaction module 1301 also receives target information input by the user, and the target information includes one or more of the following information: information about the hardware running the target neural network, and information indicating the performance requirements of the user for the target neural network .
- the architecture determining module 1302 is specifically configured to: obtain the search space of the target neural network architecture based on the model file and target information of the neural network to be optimized.
- The architecture determination module 1302 is specifically configured to: obtain, based on the search space, specified information reflecting the user's performance requirements for the target neural network; search the knowledge base based on the specified information; and, when the knowledge base contains an existing neural network architecture that satisfies the specified information, determine the existing neural network architecture as the target neural network architecture.
- The specified information includes multiple types of evaluation index values, and the architecture determination module 1302 is specifically configured to: obtain at least one candidate neural network architecture based on the search space; and perform training and inference on each candidate neural network architecture to obtain the multi-category evaluation index values of each candidate neural network architecture.
- The architecture determination module 1302 is specifically configured to: search the knowledge base based on the multi-category evaluation index values of each candidate neural network architecture; and, when any neural network architecture in the knowledge base satisfies the multi-category evaluation index values of any candidate neural network architecture, determine that the knowledge base contains an existing neural network architecture that satisfies the specified information.
- the architecture determination module 1302 is further specifically configured to: when there is no existing neural network architecture satisfying specified information in the knowledge base, use an architecture search strategy to search for a target neural network architecture.
- The neural network optimization device 130 further includes a storage module 1304, configured to store in the knowledge base the intermediate data generated during the process of searching for the target neural network architecture. The intermediate data includes one or more of the following: the search space of the target neural network architecture, the candidate neural network architectures obtained based on the search space, the first type of evaluation index value related to the hardware, the second type of evaluation index value independent of the hardware, and information about the hardware used to run the target neural network.
- The training module 1303 is specifically configured to: use a model cloning method to train the target neural network architecture based on the model file of the neural network to be optimized, to obtain the model file of the target neural network.
- the architecture determination module 1302 is specifically configured to: input the model file of the neural network to be optimized into the pre-trained artificial intelligence model, and obtain the search space of the target neural network architecture output by the artificial intelligence model.
- The neural network optimization device 130 further includes an inference module 1305, configured to optimize the speed of the target neural network by using an optimization strategy. The optimization strategy includes a graph optimization strategy and an operator optimization strategy.
- the interaction module 1301 is specifically configured to: provide the user with the model file of the target neural network optimized for speed.
- the neural network to be optimized can be mapped to a relatively similar search space based on the model file of the neural network to be optimized, and then the target can be determined based on the search space.
- Neural network architecture, and by training the target neural network architecture the target neural network with greatly improved performance can be obtained, and then the model file of the target neural network is provided to the user.
- the neural network optimization device can greatly improve the performance of the optimized neural network, can use the optimized neural network to solve more complex tasks, and ensure the scope of application of the optimized neural network.
- the neural network optimization device since the neural network optimization device does not require users to upload training data, it can avoid the problem that users cannot provide training data to the platform due to data privacy, legal or transmission restrictions, and realize the protection of user data. The applicability of the neural network optimization device is improved.
- since the neural network optimization device only needs the user to provide the neural network to be optimized, and optionally target information, to complete the optimization, the user does not need the background knowledge of model optimization that advanced AutoML platforms require; this lowers the threshold for using the neural network optimization device and widens its range of use.
- the intermediate data can be used to provide services for other users; and because the knowledge base is searched first, when it contains an existing neural network architecture that matches the neural network to be optimized, there is no need to use the architecture search strategy to search for a neural network architecture, which improves optimization efficiency and reduces the resources consumed by optimization.
- FIG. 15 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
- the optimization system shown in Fig. 1, Fig. 2 or Fig. 3 can be deployed in the computer device.
- the computer device 150 includes a memory 1501 , a processor 1502 , a communication interface 1503 and a bus 1504 .
- the memory 1501 , the processor 1502 , and the communication interface 1503 are connected to each other through a bus 1504 .
- the computer device 150 may include multiple processors 1502, so that different processors may be used to realize the functions of the above-mentioned different functional modules.
- the memory 1501 may be a read only memory (read only memory, ROM), a static storage device, a dynamic storage device or a random access memory (random access memory, RAM).
- the memory 1501 may store executable code sequences. When the executable code stored in the memory 1501 is executed by the processor 1502, the processor 1502 and the communication interface 1503 are used to execute the neural network optimization method provided by the embodiment of the present application.
- the memory 1501 may also include software modules and data required by other running processes such as an operating system. And the operating system can be LINUX, UNIX, WINDOWS TM and so on.
- the processor 1502 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits.
- the processor 1502 may also be an integrated circuit chip with signal processing capabilities. During implementation, part or all of the functions of the neural network optimization method of the present application may be implemented by hardware integrated logic circuits in the processor 1502 or instructions in the form of software.
- the above-mentioned processor 1502 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- Various methods, steps, and logic block diagrams disclosed in the embodiments of the present application may be implemented or executed.
- a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
- the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a storage medium that is well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register.
- the storage medium is located in the memory 1501, and the processor 1502 reads the information in the memory 1501, and combines its hardware to complete the neural network optimization method of the embodiment of the present application.
- the communication interface 1503 uses a transceiver module such as but not limited to a transceiver to implement communication between the computer device 150 and other devices or communication networks.
- the communication interface 1503 may be any one or any combination of the following devices: a network interface (such as an Ethernet interface), a wireless network card and other devices with network access functions.
- Bus 1504 may include pathways for transferring information between various components of computer device 150 (eg, memory 1501 , processor 1502 , communication interface 1503 ).
- a communication path is established between each of the above-mentioned computer devices 150 through a communication network.
- Each computer device 150 is used to realize some functions of the neural network optimization method provided by the embodiment of the present application.
- Any computer device 150 may be a computer device (for example: a server) in a cloud data center, or a computer device in an edge data center, or the like.
- all or part of them may be implemented by software, hardware, firmware or any combination thereof.
- when implemented using software, it may be implemented in whole or in part in the form of a computer program product.
- the computer program product that provides the data synchronization cloud service includes one or more computer instructions. When these computer program instructions are loaded and executed on the computer device, the process or function of the neural network optimization method provided by the embodiment of the present application is fully or partially realized.
- the computer equipment can be a general purpose computer, special purpose computer, a computer network, or other programmable apparatus.
- the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example from one website, computer, server, or data center to another by wired means (such as coaxial cable, optical fiber, or digital subscriber line) or wireless means (such as infrared, radio, or microwave).
- the computer-readable storage medium stores the computer program instructions that provide the data synchronization cloud service.
- the embodiment of the present application also provides a storage medium, which is a non-volatile computer-readable storage medium.
- the embodiment of the present application also provides a computer program product containing instructions, and when the computer program product is run on a computer, the computer is made to execute the neural network optimization method provided in the embodiment of the present application.
- the program can be stored in a computer-readable storage medium.
- the above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.
- the terms “first”, “second” and “third” are used for description purposes only, and cannot be understood as indicating or implying relative importance.
- the term “at least one” means one or more, and the term “plurality” means two or more, unless otherwise clearly defined.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
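This application discloses a neural network optimization method and device, belonging to the field of artificial intelligence (AI) technology. The method includes: receiving a model file of a neural network to be optimized; obtaining, based on the model file of the neural network to be optimized, a search space of a target neural network architecture, where the search space includes the value range of each attribute of each neuron in the target neural network architecture; obtaining the target neural network architecture based on the search space; training the target neural network architecture based on the model file of the neural network to be optimized to obtain a model file of a target neural network; and providing the model file of the target neural network to a user. This application can effectively improve the performance of the optimized neural network.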
Description
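This application claims priority to Chinese Patent Application No. 202110596002.1, filed on May 29, 2021 and entitled "Neural Network Optimization Method and Device", the entire contents of which are incorporated herein by reference.
This application relates to the field of artificial intelligence (AI) technology, and in particular to a neural network optimization method and device.
As neural networks have become widespread, they have been applied in more and more fields. Before a user can use a neural network, a neural network development platform (such as an automated machine learning (AutoML) platform) is usually needed to design and train it. A user may also be dissatisfied with the performance of an existing neural network, in which case the user can use the neural network development platform to optimize that network and obtain one with better performance.
In the related art, a neural network development platform can apply optimization operations such as graph optimization and operator fusion to an existing neural network to obtain a neural network that provides the same functions with better performance.
However, such optimization operations leave limited room for improving the performance of the neural network, so the resulting improvement is small.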
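Summary of the Invention
This application provides a neural network optimization method and device that can effectively improve the performance of the optimized neural network. The technical solutions provided by this application are as follows:
In a first aspect, this application provides a neural network optimization method, including: receiving a model file of a neural network to be optimized; obtaining, based on the model file of the neural network to be optimized, a search space of a target neural network architecture, where the search space includes the value range of each attribute of each neuron in the target neural network architecture; obtaining the target neural network architecture based on the search space; training the target neural network architecture based on the model file of the neural network to be optimized to obtain a model file of a target neural network; and providing the model file of the target neural network to a user.
In the neural network optimization method provided by this application, the neural network to be optimized is first mapped, based on its model file, to a closely matching search space. The target neural network architecture is then determined based on that search space and trained, which yields a target neural network with substantially improved performance, and the model file of the target neural network is provided to the user. Because the performance of the optimized neural network improves substantially, the optimized neural network can be used for more complex tasks, which secures its range of application.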
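Optionally, the neural network optimization method further includes: receiving target information input by the user, where the target information includes one or more of the following: information about the hardware that runs the target neural network, and information indicating the user's performance requirements for the target neural network. Correspondingly, obtaining the search space of the target neural network architecture based on the model file of the neural network to be optimized includes: obtaining the search space of the target neural network architecture based on the model file of the neural network to be optimized and the target information.
When the user chooses to provide the target information to the optimization system, the optimization system can more easily determine what kind of neural network it is optimizing. This makes the optimization more targeted and helps provide the user with a target neural network that better meets the user's needs.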
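The evaluation metric values stored in the knowledge base usually include several kinds of metric values, so the knowledge base needs to be searched using the corresponding categories of evaluation metric values; however, the user's input may not include all of these categories. Therefore, in a possible implementation, obtaining the target neural network architecture based on the search space includes: obtaining, based on the search space, specified information that reflects the user's performance requirements for the target neural network; searching the knowledge base based on the specified information; and when the knowledge base contains an existing neural network architecture that satisfies the specified information, determining the existing neural network architecture as the target neural network architecture.
In one implementation, the specified information may include multiple categories of evaluation metric values. In that case, obtaining, based on the search space, the specified information that reflects the user's performance requirements for the target neural network includes: obtaining at least one alternative neural network architecture based on the search space; and performing training and inference on each alternative neural network architecture to obtain the multiple categories of evaluation metric values of each alternative neural network architecture.
Correspondingly, searching the knowledge base based on the specified information includes: searching the knowledge base based on the multiple categories of evaluation metric values of each alternative neural network architecture; and when any neural network architecture in the knowledge base satisfies the multiple categories of evaluation metric values of any alternative neural network architecture, determining that the knowledge base contains an existing neural network architecture that satisfies the specified information.
Further, obtaining the target neural network architecture based on the search space also includes: when the knowledge base contains no existing neural network architecture that satisfies the specified information, searching for the target neural network architecture using an architecture search strategy.
Because the knowledge base is searched first, when it contains an existing neural network architecture that matches the search space and the target information, that architecture can be used directly as the target neural network architecture without running the architecture search strategy. This improves the efficiency of optimizing the neural network to be optimized and reduces the resources consumed by the optimization.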
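In one implementation, the neural network optimization method further includes: storing, in the knowledge base, intermediate data produced while searching for the target neural network architecture, where the intermediate data includes one or more of the following: the search space of the target neural network architecture, the alternative neural network architectures obtained based on the search space, hardware-dependent first-type evaluation metric values, hardware-independent second-type evaluation metric values, and information about the hardware used to run the target neural network.
Storing the intermediate data of the search process in the knowledge base makes it possible to use that data to serve other users. In addition, because the knowledge base is searched first, when it contains an existing neural network architecture that matches the neural network to be optimized, there is no need to run the architecture search strategy, which improves the efficiency of optimizing the neural network to be optimized and reduces the resources consumed by the optimization.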
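Optionally, training the target neural network architecture based on the model file of the neural network to be optimized to obtain the model file of the target neural network includes: training the target neural network architecture with a model cloning method, based on the model file of the neural network to be optimized, to obtain the model file of the target neural network.
Training the target neural network architecture with the model cloning method clones the inference behavior of the neural network to be optimized onto the target neural network architecture, which keeps the inference behavior of the target neural network consistent with that of the neural network to be optimized.
Obtaining the search space of the target neural network architecture based on the model file of the neural network to be optimized includes: inputting the model file of the neural network to be optimized into a pre-trained artificial intelligence model, and obtaining the search space of the target neural network architecture output by the artificial intelligence model.
When the search space is predicted by an artificial intelligence model, the model can automatically detect the task type of the target neural network, so the user does not need to tell the optimization system the task type of the neural network to be optimized. This simplifies what the user has to do when optimizing a neural network.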
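Optionally, after training the target neural network architecture based on the model file of the neural network to be optimized to obtain the model file of the target neural network, the neural network optimization method further includes: performing speed optimization on the target neural network using an optimization strategy, where the optimization strategy includes a graph optimization strategy and an operator optimization strategy. Providing the model file of the target neural network to the user then includes: providing the user with the model file of the speed-optimized target neural network.
Alternatively, after providing the model file of the target neural network to the user, the neural network optimization method further includes: receiving a speed optimization request sent by the user; performing, based on the speed optimization request, speed optimization on the target neural network using an optimization strategy, where the optimization strategy includes a graph optimization strategy and an operator optimization strategy; and providing the user with the model file of the speed-optimized target neural network.
Speed optimization with the optimization strategy reduces the amount of computation of the target neural network or other system overhead (such as memory access overhead), and thereby increases the inference speed of the target neural network.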
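In a second aspect, this application provides a neural network optimization device, including: an interaction module, configured to receive a model file of a neural network to be optimized; an architecture determination module, configured to obtain, based on the model file of the neural network to be optimized, a search space of a target neural network architecture, where the search space includes the value range of each attribute of each neuron in the target neural network architecture; the architecture determination module being further configured to obtain the target neural network architecture based on the search space; a training module, configured to train the target neural network architecture based on the model file of the neural network to be optimized to obtain a model file of a target neural network; and the interaction module being further configured to provide the model file of the target neural network to a user.
Optionally, the interaction module is further configured to receive target information input by the user, where the target information includes one or more of the following: information about the hardware that runs the target neural network, and information indicating the user's performance requirements for the target neural network; and the architecture determination module is specifically configured to: obtain the search space of the target neural network architecture based on the model file of the neural network to be optimized and the target information.
Optionally, the architecture determination module is specifically configured to: obtain, based on the search space, specified information that reflects the user's performance requirements for the target neural network; search the knowledge base based on the specified information; and when the knowledge base contains an existing neural network architecture that satisfies the specified information, determine the existing neural network architecture as the target neural network architecture.
Optionally, the specified information includes multiple categories of evaluation metric values, and the architecture determination module is specifically configured to: obtain at least one alternative neural network architecture based on the search space; and perform training and inference on each alternative neural network architecture to obtain the multiple categories of evaluation metric values of each alternative neural network architecture.
Optionally, the architecture determination module is specifically configured to: search the knowledge base based on the multiple categories of evaluation metric values of each alternative neural network architecture; and when any neural network architecture in the knowledge base satisfies the multiple categories of evaluation metric values of any alternative neural network architecture, determine that the knowledge base contains an existing neural network architecture that satisfies the specified information.
Optionally, the architecture determination module is further specifically configured to: when the knowledge base contains no existing neural network architecture that satisfies the specified information, search for the target neural network architecture using an architecture search strategy.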
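Optionally, the neural network optimization device further includes: a storage module, configured to store, in the knowledge base, intermediate data produced while searching for the target neural network architecture, where the intermediate data includes one or more of the following: the search space of the target neural network architecture, the alternative neural network architectures obtained based on the search space, hardware-dependent first-type evaluation metric values, hardware-independent second-type evaluation metric values, and information about the hardware used to run the target neural network.
Optionally, the training module is specifically configured to: train the target neural network architecture with a model cloning method, based on the model file of the neural network to be optimized, to obtain the model file of the target neural network.
Optionally, the architecture determination module is specifically configured to: input the model file of the neural network to be optimized into a pre-trained artificial intelligence model, and obtain the search space of the target neural network architecture output by the artificial intelligence model.
Optionally, the neural network optimization device further includes: an inference module, configured to perform speed optimization on the target neural network using an optimization strategy, where the optimization strategy includes a graph optimization strategy and an operator optimization strategy; and the interaction module is specifically configured to: provide the user with the model file of the speed-optimized target neural network.
Optionally, the interaction module is further configured to receive a speed optimization request sent by the user; correspondingly, the neural network optimization device further includes: an inference module, configured to perform, based on the speed optimization request, speed optimization on the target neural network using an optimization strategy, where the optimization strategy includes a graph optimization strategy and an operator optimization strategy; and the interaction module is further configured to provide the user with the model file of the speed-optimized target neural network.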
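In a third aspect, this application provides a computer device, including a processor and a memory, where the memory stores a computer program; when the processor executes the computer program, the computer device implements the method provided by the first aspect of this application or any optional implementation thereof.
In a fourth aspect, this application provides a non-transitory computer-readable storage medium; when instructions in the computer-readable storage medium are executed by a processor, the method provided by the first aspect of this application or any optional implementation thereof is implemented.
In a fifth aspect, this application provides a computer program product containing instructions; when the computer program product runs on a computer, the computer is caused to execute the method provided by the first aspect of this application or any optional implementation thereof.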
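FIG. 1 is a schematic diagram of an optimization system involved in a neural network optimization method provided by an embodiment of this application;
FIG. 2 is a schematic diagram of an optimization system involved in another neural network optimization method provided by an embodiment of this application;
FIG. 3 is a schematic diagram of an optimization system involved in yet another neural network optimization method provided by an embodiment of this application;
FIG. 4 is a schematic diagram of an application scenario involved in a neural network optimization method provided by an embodiment of this application;
FIG. 5 is a flowchart of a neural network optimization method provided by an embodiment of this application;
FIG. 6 is a schematic diagram of the computation performed by a neural network to be optimized, provided by an embodiment of this application;
FIG. 7 is a schematic diagram of a process, provided by an embodiment of this application, in which a loss function controls the gradient used to tune the parameters of a neural network architecture;
FIG. 8 is a flowchart of another neural network optimization method provided by an embodiment of this application;
FIG. 9 is a schematic diagram of a knowledge base provided by an embodiment of this application;
FIG. 10 is a schematic diagram of another knowledge base provided by an embodiment of this application;
FIG. 11 is a schematic diagram of an optimization system involved in still another neural network optimization method provided by an embodiment of this application;
FIG. 12 is a schematic diagram of a search process using the MBNAS method, provided by an embodiment of this application;
FIG. 13 is a schematic structural diagram of a neural network optimization device provided by an embodiment of this application;
FIG. 14 is a schematic structural diagram of another neural network optimization device provided by an embodiment of this application;
FIG. 15 is a schematic structural diagram of a computer device provided by an embodiment of this application.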
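To make the objectives, technical solutions, and advantages of this application clearer, the implementations of this application are described in further detail below with reference to the accompanying drawings.
To make the technical solutions of this application easier to understand, some technical terms used in this application are introduced below.
Deep learning: a class of machine learning techniques based on deep neural network algorithms, whose main characteristic is that data is processed and analyzed through multiple nonlinear transformations. It is mainly applied to perception and decision-making scenarios in the field of artificial intelligence, such as image and speech recognition, natural language translation, and computer game playing.
Automated machine learning (AutoML): a high-level control framework for machine learning models that can automatically search for the optimal parameter configuration of a machine learning model without human intervention.
A neural network (NN) is a mathematical model that imitates the neural network of the human brain in order to achieve human-like artificial intelligence; a neural network may also be called a neural network model. A neural network usually uses multiple connected neurons (also called nodes) to imitate the neural network of the human brain.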
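The way the neurons in a neural network are connected, and/or the structure of those connections, is called the neural network architecture of that network. Typical neural network architectures include the recurrent neural network (RNN) architecture, the convolutional neural network (CNN) architecture, and so on. A neural network architecture can be represented by a directed graph (such as a directed acyclic graph). Each edge in the directed graph has a weight, which represents the importance of the edge's input node relative to its output node. The parameters of a neural network include these weights. Note that the weights are usually obtained by training the neural network on sample data.
Obtaining a neural network model from a neural network architecture involves two stages. In the first stage, weight initialization is performed on the neural network architecture to obtain an initial neural network model, also called an initial sub-model. Weight initialization means initializing the weight (and, in some cases, the bias) of each edge in the neural network architecture; in practice, the initial weight values can be generated from a Gaussian distribution. In the second stage, the weights of the initial sub-model are updated using sample data to obtain a neural network model, also called a sub-model (child model). Specifically, the sample data is input into the initial sub-model, a loss value is determined from the initial sub-model's predictions for the sample data and the ground truth carried by the sample data, and the weights of the initial sub-model are updated based on that loss value. After multiple rounds of weight updates, a sub-model is obtained. This sub-model is a trained neural network model that can be used for a specific application.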
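The quality of a sub-model can be measured by its evaluation metric values. An evaluation metric value is a measurement obtained by evaluating the sub-model along at least one dimension. The evaluation metric values of a sub-model fall into two types: one type changes when the hardware changes, and the other stays the same when the hardware changes. For ease of description, the embodiments of this application call the evaluation metric values that change with the hardware first-type evaluation metric values, and those that stay the same second-type evaluation metric values.
First-type evaluation metric values are hardware-dependent and include hardware-dependent performance values. In some implementations, the hardware-dependent performance values include any one or more of model inference latency, activation volume, throughput, power consumption, and GPU memory usage. Second-type evaluation metric values are hardware-independent and include hardware-independent accuracy values. In some implementations, the accuracy values include any one or more of accuracy, precision, and recall. The hardware-independent evaluation metric values also include the number of parameters and the computing power, where the computing power specifically includes floating-point operations per second (FLOPs).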
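As an illustration of the accuracy-type (second-type) metrics named above, the following minimal sketch computes accuracy, precision, and recall for binary labels. The function name `classification_metrics` and the example labels are assumptions made for this illustration; the application does not prescribe how the metrics are computed.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision and recall for binary labels (illustrative only)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical labels: 5 samples, 3 predicted correctly.
metrics = classification_metrics([1, 0, 1, 1, 0], [1, 1, 0, 1, 0])
```

These values depend only on the model's predictions, not on the device that produced them, which is why they are treated as hardware-independent.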
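As the computing power of computer devices has grown in recent years, the industry has proposed using neural network development platforms (such as AutoML platforms) to design and train neural networks for users. A user may also be dissatisfied with the performance of an existing neural network, in which case the user can use the neural network development platform to optimize that network and obtain one with better performance. The main process is: determine the search space of the optimized neural network from the existing neural network, search the search space for the neural network architecture of the optimized neural network, and then train the resulting architecture to obtain the optimized neural network.
The search space includes the value range of each attribute of each neuron. It defines the scope within which neural network architectures are searched, and that scope provides a set of neural network architectures available for search. Depending on the type of neural network to be built, search spaces can be divided into several types, such as chain-structured architecture spaces, multi-branch architecture spaces, and block-based search spaces. Every kind of search space can be characterized by the value ranges of its attributes. For example, a search space can be characterized by the value ranges of two attributes: the identifier of a neuron and the operation the neuron performs. In some cases, a search space can additionally be characterized by at least one of the number of layers in the neural network architecture, the number of blocks in each layer, and the number of neurons in each block.
At present, neural networks are usually optimized by applying operations such as graph optimization and operator fusion to the existing neural network. However, these operations leave limited room for improving performance, so the performance of the optimized neural network remains poor.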
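A minimal sketch of a search space expressed as per-attribute value ranges, together with one way to draw an architecture from it. The attribute names (`op`, `channels`), the candidate values, and the helper `sample_architecture` are all hypothetical; the application only states that a search space holds the value range of each attribute of each neuron.

```python
import random

# Hypothetical search space: allowed values for each neuron attribute.
SEARCH_SPACE = {
    "op": ["conv3x3", "conv5x5", "maxpool", "identity"],
    "channels": [16, 32, 64],
}
# Hypothetical range for the number of neurons in an architecture.
NUM_NEURONS = [2, 3, 4]


def sample_architecture(space, neuron_counts, rng):
    """Draw one architecture: a list of neurons with an id and one value per attribute."""
    count = rng.choice(neuron_counts)
    return [
        {"id": i, **{attr: rng.choice(values) for attr, values in space.items()}}
        for i in range(count)
    ]


arch = sample_architecture(SEARCH_SPACE, NUM_NEURONS, random.Random(0))
```

Every architecture produced this way lies inside the search space by construction, so the space acts as the set of architectures available to a search.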
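The embodiments of this application provide a neural network optimization method. The method first maps the neural network to be optimized, based on its model file, to a closely matching search space, then determines the target neural network architecture based on that search space and trains it to obtain a target neural network with substantially improved performance, and then provides the model file of the target neural network to the user. A neural network optimized with this method can therefore achieve a substantial performance improvement.
The neural network optimization method provided by the embodiments of this application can be applied to an optimization system, which executes the method. Optionally, the optimization system can be implemented by one or more devices such as a terminal, a physical machine, a bare-metal server, a cloud server, a virtual machine, or a container.
The optimization system can be logically divided into multiple parts, each with a different function. For example, as shown in FIG. 1, the optimization system 1 may include the following functional modules: an interaction module 11, an architecture determination module 12, and a training module 13. The interaction module 11 receives the model file of the neural network to be optimized and provides the user with the model file of the optimized target neural network. The architecture determination module 12 obtains the search space of the target neural network architecture from the model file of the neural network to be optimized, and obtains the target neural network architecture based on the search space. The training module 13 trains the target neural network architecture based on the model file of the neural network to be optimized to obtain the model file of the target neural network. Because a neural network is delivered to the user in the form of a model file, "providing a neural network to the user" and "providing the model file of a neural network to the user" mean the same thing below and, for ease of description, are not distinguished.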
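Optionally, as shown in FIG. 2, the optimization system 1 may further include an inference module 14, which provides inference functions. For example, while obtaining the target neural network architecture, the architecture determination module 12 may generate multiple alternative neural network architectures from which the target neural network architecture is derived. The inference module 14 can then run inference on hardware for the models corresponding to these alternative architectures and obtain first-type evaluation metric values, such as the inference latency of the alternative architectures on that hardware. In this case, the training module 13 is also used to train the models corresponding to the alternative architectures generated by the architecture determination module 12 and to obtain second-type evaluation metric values, such as their accuracy values. Correspondingly, the architecture determination module 12 is specifically configured to obtain the target neural network architecture based on the first-type and second-type evaluation metric values of the alternative architectures.
Further, the inference module 14 is also used to perform speed optimization on the target neural network. In this case, the interaction module 11 is specifically configured to provide the user with the model file of the speed-optimized target neural network. The inference module 14 is also used to obtain the second-type evaluation metric values of the target neural network, so that these evaluation metric values can be provided to the user together with the model file of the target neural network.
In addition, as shown in FIG. 2, the optimization system 1 may further include a storage module 15, which stores the intermediate data produced while obtaining the target neural network architecture in order to improve the efficiency of optimizing the neural network to be optimized. The modules above may also have other functions, which are not listed here one by one.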
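The parts of the optimization system 1 may all be deployed on any one of a terminal, a physical machine, a bare-metal server, a cloud server, a virtual machine, or a container. Alternatively, they may be deployed in a distributed manner on one or more of multiple terminals, multiple physical machines, multiple bare-metal servers, multiple cloud servers, multiple virtual machines, and multiple containers.
As shown in FIG. 1 and FIG. 2, the training module 13 may be located on the service side; for example, it may be provided by the provider of the neural network optimization service. In other embodiments, as shown in FIG. 3, the training module 13 may instead be located on the user side, that is, provided by the user who needs the neural network optimization service. Training sub-models with a training module 13 provided by the user prevents the training data from being leaked and thus protects data security. Here, the neural network optimization service provides the functions implemented by the neural network optimization method of the embodiments of this application. Alternatively, because the training module 13 is used both to train the target neural network architecture and to train the models corresponding to the multiple alternative neural network architectures generated by the architecture determination module 12, the training module 13 may include at least two parts. The first part trains the target neural network architecture, and the second part trains the models corresponding to the alternative architectures generated by the architecture determination module 12. In one deployment, the first part is located on the user side, so that the training process can use the training data on the user side, and the second part is located on the service side.
Similarly, the inference module 14 may be located on the service side, as shown in FIG. 2, or on the user side, as shown in FIG. 3. When the inference module 14 is on the user side, inference on the models corresponding to the alternative architectures generated by the architecture determination module 12 is run with the user's own inference module 14, so the models do not need to be uploaded to the service side, which prevents model leakage and protects model privacy. Alternatively, because the inference module 14 is used both to run inference on hardware for the models corresponding to the alternative architectures and to obtain the second-type evaluation metric values of the target neural network, the inference module 14 may, like the training module 13, include at least two parts. The first part obtains the second-type evaluation metric values of the target neural network, and the second part obtains first-type evaluation metric values such as the inference latency of the alternative architectures on hardware. In one deployment, the first part is located on the user side, so that the second-type evaluation metric values of the target neural network can be obtained using the training data on the user side, and the second part is located on the service side.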
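In one implementation, some or all of the modules in the optimization system 1 can be implemented with resources in a cloud platform. The cloud platform hosts basic resources owned by the cloud service provider, such as computing resources, storage resources, and network resources; the computing resources may be a large number of computer devices (such as servers). The optimization system 1 can use these basic resources to implement the neural network optimization method provided by the embodiments of this application. For example, when the interaction module 11 and the architecture determination module 12 are on the service side and the training module 13 and the inference module 14 are on the user side, as shown in FIG. 3, the interaction module 11 and the architecture determination module 12 can be deployed in a public cloud platform, and the training module 13 and the inference module 14 can be deployed in a private cloud platform, so that the method is implemented on a hybrid cloud platform formed by the public and private cloud platforms. Alternatively, the optimization system 1 shown in FIG. 1 or FIG. 2 can be deployed entirely in a public cloud platform or entirely in a private cloud platform.
When some or all of the modules in the optimization system 1 are implemented with cloud platform resources, as shown in FIG. 4, the cloud service provider can abstract the neural network optimization method provided by the embodiments of this application into a neural network optimization cloud service offered to users on the cloud platform. After a user purchases the neural network optimization cloud service, the cloud platform can use the optimization system 1 to optimize the neural network provided by the user. Depending on how the modules of the optimization system 1 are deployed, the cloud platform can offer different neural network optimization cloud services. For example, for the different deployments of the training module 13 and the inference module 14 described above, the cloud platform can offer at least the following two neural network optimization cloud services: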
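In the first neural network optimization cloud service, the training module 13 and the inference module 14 are both deployed on the service side; for example, all parts of the optimization system 1 are deployed in one cloud computing cluster of a public cloud platform. After purchasing the service, the user sends the neural network to be optimized to the public cloud platform, and the public cloud platform uses the neural network optimization cloud service provided by the optimization system 1 to optimize it and provides the optimized neural network to the user.
In the second neural network optimization cloud service, the training module 13 and the inference module 14 are both deployed on the user side, and the interaction module 11 and the architecture determination module 12 are deployed on the service side; for example, the interaction module 11 and the architecture determination module 12 are deployed in a cloud computing cluster of a public cloud platform, and the training module 13 and the inference module 14 are deployed in a cloud computing cluster of a private cloud platform. After purchasing the service, the user sends the neural network to be optimized to the public cloud platform, which uses the architecture determination module 12 to provide the user with multiple alternative neural network architectures, or the models corresponding to them, based on the neural network to be optimized. The user then uses the training module 13 and the inference module 14 to obtain the evaluation metric values of the alternative architectures and sends them to the public cloud platform. The public cloud platform determines the target neural network architecture among the alternative architectures according to the evaluation metric values, and then provides the user with the target neural network architecture or its corresponding model, which the user trains with the training module 13 to obtain the optimized target neural network.
Optionally, in the embodiments of this application, the cloud platform may be a central cloud platform, an edge cloud platform, or a cloud platform that includes both a central cloud and an edge cloud; this is not specifically limited in the embodiments of this application. When the cloud platform includes both a central cloud and an edge cloud, the optimization system can be deployed partly in the edge cloud platform and partly in the central cloud platform.
Note that FIG. 1 to FIG. 3 are only some specific examples of the optimization system provided by the embodiments of this application. The division into functional modules and the deployments described above are merely illustrative. This application does not limit how the optimization system is divided into functional modules or how those modules are deployed; in practice, the deployment can be adapted to the computing capability of the computing devices used or to the specific application requirements.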
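To make the technical solutions of the embodiments of this application easier to understand, a neural network optimization method of the embodiments is described next, taking as an example its implementation with the optimization system shown in FIG. 2. FIG. 5 is a flowchart of a neural network optimization method provided by an embodiment of this application. As shown in FIG. 5, the method includes the following steps:
Step 501: Receive the model file of the neural network to be optimized.
When a user wants the optimization system to optimize a neural network, the user needs to provide the model file of that neural network so that the optimization system can determine what kind of neural network it is optimizing. The model file of the neural network to be optimized indicates the neural network to be optimized. For example, the neural network to be optimized is essentially a directed graph, which can be represented by a model file, and the model file may be a file with the suffix .ph. As shown in FIG. 6, the computation of the neural network represented by the model file is: first perform a convolution (conv) on the input data (input), then apply a rectified linear unit (ReLU) to the convolution result, then add (add) the ReLU output to the convolution result, and finally output (output) the sum. The rectified linear unit is an activation function commonly used in artificial neural networks and usually refers to the nonlinear functions represented by the ramp function and its variants. Commonly used rectified linear functions include the ramp function f(x)=max(0,x) and the leaky rectified linear unit (leaky ReLU), where x is the input of the neuron.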
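The computation described for FIG. 6 (conv, then ReLU, then add with the convolution result) can be sketched in a few lines. This is an illustration under stated assumptions: a one-dimensional "valid" convolution stands in for the unspecified convolution, and the names `conv1d`, `relu`, `leaky_relu`, and `forward` as well as the leaky slope of 0.01 are choices made here, not taken from the application.

```python
def conv1d(x, kernel):
    """1-D 'valid' convolution (cross-correlation form), standing in for conv."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) for i in range(len(x) - k + 1)]


def relu(v):
    """Ramp function f(x) = max(0, x)."""
    return max(0.0, v)


def leaky_relu(v, slope=0.01):
    """Leaky ReLU: keeps a small slope for negative inputs (slope is an assumed value)."""
    return v if v > 0 else slope * v


def forward(x, kernel):
    """input -> conv -> ReLU -> add(ReLU output, conv output) -> output."""
    conv_out = conv1d(x, kernel)
    relu_out = [relu(v) for v in conv_out]
    return [r + c for r, c in zip(relu_out, conv_out)]


out = forward([1.0, -2.0, 3.0, -4.0], [1.0, 1.0])
```

Positive convolution outputs are doubled by the add node, while negative ones pass through unchanged because ReLU contributes zero for them.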
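Step 502: Receive target information input by the user, where the target information includes one or more of the following: information about the hardware that runs the target neural network (also called hardware information), and information indicating the user's performance requirements for the target neural network (also called performance requirement information).
When using the optimization system, in addition to the model file of the neural network to be optimized, the user can also provide the requirements for the optimization, expressed as target information. Optionally, the target information may include one or more of the following: information about the hardware on which the user expects to run the target neural network, and information indicating the user's performance requirements for the target neural network. The performance requirement information may also indicate the required performance of the target neural network on a specified data set, and it may include information indicating at least one of the first-type and second-type evaluation metric values of the target neural network. For example, the hardware information indicates that the user expects to run the target neural network on a certain model of graphics processing unit (GPU), and the performance requirement information indicates that the user expects the target neural network to reach an inference accuracy of 95.94%, or an inference accuracy of 95.94% on the cifar10 data set. The target neural network is the neural network obtained after the optimization system has optimized the neural network to be optimized.
Note that step 502 is optional. The user can choose whether to provide target information to the optimization system. When the user does provide it, the optimization of the neural network to be optimized becomes more targeted, which helps provide the user with a target neural network that better meets the user's needs.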
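Step 503: Obtain the search space of the target neural network architecture based on the model file of the neural network to be optimized and the target information.
As noted above, step 502 is optional. When step 502 is not performed, that is, when the optimization system has not received target information, the search space is obtained without the target information, based only on the model file of the neural network to be optimized. The following description takes the case in which step 502 is performed as an example and explains how the search space of the target neural network architecture is obtained based on the model file of the neural network to be optimized and the target information. The search space includes the value range of each attribute of each neuron in the target neural network architecture.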
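In one implementation, a pre-trained artificial intelligence model is used to predict the search space. Given the model file of the neural network to be optimized and the target information as input, the artificial intelligence model outputs the search space of the target neural network architecture. The prediction process may include the following: the artificial intelligence model analyzes the architectural features of the neural network to be optimized based on its model file and obtains the possible task types of the target neural network; it then uses the target information to determine the task type of the target neural network among these possible task types, for example deciding between a classification task and a detection task; finally, according to the correspondence between task types and search spaces, it outputs the search space corresponding to the task type of the target neural network, which is the search space of the target neural network architecture.
Because the target information is associated to some extent with the task type of a neural network, this association can be used to determine the task type of the target neural network among the possible task types. For example, suppose the association is that a GPU of model A1 is usually used for tasks of type A2 and a GPU of model B1 is usually used for tasks of type B2. Then, when the target information indicates that the hardware running the target neural network is a GPU of model A1, the task type of the target neural network can be determined to be A2 among the possible task types.
It can be seen from this prediction process that when the optimization system receives target information, it can use that information to further filter the possible task types determined from the model file of the neural network to be optimized. The difference between receiving and not receiving target information therefore lies in whether this further filtering takes place. When target information is received, the further filtering allows a better-matched search space to be determined for the target neural network, which improves the performance of the target neural network obtained from that search space.
In this implementation, because the artificial intelligence model can automatically detect the task type of the target neural network, the user does not need to tell the optimization system the task type of the neural network to be optimized, which simplifies what the user has to do when optimizing a neural network. Optionally, the artificial intelligence model may be a classification model, for example a support vector machine (SVM).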
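Note that before the model file of the neural network to be optimized is input into the artificial intelligence model, its data type can be converted into one that the artificial intelligence model can recognize. For example, the model file can be converted into one-dimensional feature data, which is then input into the artificial intelligence model; the one-dimensional feature data represents the type of each node in the neural network to be optimized and the relationships between nodes. Optionally, a graph algorithm (such as a graph kernel algorithm) can be used to convert the model file into one-dimensional feature data.
For example, continuing with the model file from step 501, the result of converting it into one-dimensional feature data is shown in Table 1. In each row of Table 1, "t#N" denotes the N-th graph (for example, "t#0" denotes graph 0); "v M L" means that the M-th vertex of the graph has label L (for example, "v 0 1" means that vertex 0 has label 1); "e P Q" means that vertex P and vertex Q are connected by an edge (for example, "e 0 1" means that vertex 0 and vertex 1 are connected by an edge); and "t#-1" marks the end of the model file. The vertices in FIG. 6 are ordered from top to bottom and from left to right.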
Table 1
| Line | Feature data |
| --- | --- |
| 01 | t#0 |
| 02 | v 0 1 |
| 03 | v 1 2 |
| 04 | v 2 3 |
| 05 | e 0 1 |
| 06 | e 0 2 |
| 07 | e 1 2 |
| 08 | t#-1 |
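The rows of Table 1 can be produced mechanically from a list of vertex labels and a list of edges. The sketch below does only this serialization and is not the graph kernel algorithm mentioned in the text; the function name `to_feature_lines` and the reading of labels 1, 2, 3 as the conv, ReLU, and add nodes of FIG. 6 are assumptions made for illustration.

```python
def to_feature_lines(vertex_labels, edges, graph_id=0):
    """Serialize a labeled graph into 't#N' / 'v M L' / 'e P Q' lines ending with 't#-1'."""
    lines = [f"t#{graph_id}"]
    lines += [f"v {index} {label}" for index, label in enumerate(vertex_labels)]
    lines += [f"e {src} {dst}" for src, dst in edges]
    lines.append("t#-1")  # end-of-file marker
    return lines


# Assumed reading of FIG. 6: vertex 0 = conv (label 1), vertex 1 = ReLU (label 2),
# vertex 2 = add (label 3); conv feeds both ReLU and add, and ReLU feeds add.
FIG6_LINES = to_feature_lines([1, 2, 3], [(0, 1), (0, 2), (1, 2)])
```

Under that reading, `FIG6_LINES` reproduces the eight rows of Table 1 in order.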
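Step 504: Obtain the target neural network architecture based on the search space of the target neural network architecture and the target information.
As noted above, step 502 is optional. When step 502 is not performed, that is, when the optimization system has not received target information, the target neural network architecture is obtained without the target information, based only on the search space. The following description takes the case in which step 502 is performed as an example and explains how the target neural network architecture is obtained based on the search space and the target information.
After the search space is determined, the target neural network architecture can be obtained based on the search space and the target information. Optionally, an architecture search strategy can be used to search for the target neural network architecture according to the search space and the target information. Alternatively, the optimization system can be configured with a knowledge base, which is searched according to the search space and the target information to obtain the target neural network architecture. Or, when the optimization system is configured with a knowledge base, the knowledge base can be searched first based on the search space and the target information: when it contains an existing neural network architecture that matches the search space and the target information, that existing architecture is determined as the target neural network architecture; when it does not, an architecture search strategy is used to search for the target neural network architecture based on the search space and the target information. For readability, searching the knowledge base and searching with an architecture search strategy are not explained here; they are described later.
Because the knowledge base is searched first, when it contains an existing neural network architecture that matches the search space and the target information, that architecture can be used directly as the target neural network architecture without running the architecture search strategy. This improves the efficiency of optimizing the neural network to be optimized and reduces the resources consumed by the optimization.
In some embodiments, so that the intermediate data obtained during the search can be used to serve other users, the intermediate data produced while searching for the target neural network architecture can also be stored in the knowledge base, which improves the efficiency of optimizing neural networks. Optionally, the intermediate data includes one or more of the following: the search space of the target neural network architecture, the alternative neural network architectures obtained based on the search space, hardware-dependent first-type evaluation metric values, hardware-independent second-type evaluation metric values, and information about the hardware used to run the target neural network. Optionally, the first-type evaluation metric values include the inference latency on hardware and the like, and the second-type evaluation metric values include accuracy values and the like.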
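Step 505: Train the target neural network architecture based on the model file of the neural network to be optimized to obtain the model file of the target neural network.
Optionally, the target neural network architecture can be trained with a model cloning method provided by the embodiments of this application to obtain the model file of the target neural network. The model cloning method is described below.
The basic principle of the model cloning method is to train the target neural network architecture with the goal of making its output for any piece of training data fit the output of the neural network to be optimized for the same training data, and thereby obtain the trained target neural network. In other words, when the model cloning method is used, whether training is complete can be determined by checking whether the difference between the output of the target neural network architecture and the output of the neural network to be optimized, for any piece of training data, tends to a minimum. Here, fitting means adjusting the weight coefficients in the target neural network architecture so that the difference between the output of the adjusted architecture and the output of the neural network to be optimized, for any piece of training data, tends to a minimum.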
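In one implementation, making the output of the target neural network architecture for any piece of training data fit the output of the neural network to be optimized for that training data can be achieved by using a loss function to control the gradient used to tune the parameters of the neural network architecture. As shown in FIG. 7, the process is as follows: for any piece of training data, obtain the target parameters produced for it by the target neural network architecture and by the neural network to be optimized; compute the loss value of a specified loss function from these two sets of target parameters; propagate the loss value back to the target neural network architecture to determine the gradient used for parameter tuning; and adjust the weight parameters of the target neural network architecture according to that gradient, until the training goal is reached. Optionally, the target parameter of a neural network for a piece of training data is the logarithm of the ratio of the number of times event A occurs to the number of times event A does not occur for that training data, that is, the logits. FIG. 7 shows the training process of the target neural network architecture when the target parameters are logits. The loss value of the specified loss function can optionally be obtained as follows: the target parameters of the target neural network architecture and of the neural network to be optimized for the training data are both taken as inputs to the specified loss function, and the output of the function is its loss value. The specific form of the specified loss function can be designed according to application requirements and is not specifically limited in the embodiments of this application.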
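The loop in FIG. 7 can be sketched on a toy problem. The application leaves the specified loss function open; the sketch below assumes a mean squared error between the two sets of logits, and it uses single-layer linear models for both networks so that the gradient can be written by hand. All names (`logits`, `clone_step`, `TEACHER`) and the data are hypothetical.

```python
def logits(weights, x):
    """Logits of a single linear layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def clone_step(student, teacher, x, lr):
    """One update that pulls the student's logits toward the teacher's for input x.

    Loss (assumed): mean squared error between student and teacher logits.
    """
    diff = [s - t for s, t in zip(logits(student, x), logits(teacher, x))]
    loss = sum(d * d for d in diff) / len(diff)
    for k, row in enumerate(student):
        for j in range(len(row)):
            # d(loss)/d(student[k][j]) = (2 / K) * diff[k] * x[j]
            row[j] -= lr * (2.0 / len(diff)) * diff[k] * x[j]
    return loss


TEACHER = [[1.0, -2.0], [0.5, 0.3]]      # stands in for the network to be optimized
student = [[0.0, 0.0], [0.0, 0.0]]       # stands in for the target architecture
data = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0]]  # unlabeled inputs only

first_loss = clone_step(student, TEACHER, data[0], lr=0.1)
for _ in range(200):
    for x in data:
        last_loss = clone_step(student, TEACHER, x, lr=0.1)
```

No ground-truth labels appear anywhere in the loop: the only supervision is the output of the network being cloned, which is why unlabeled public data is enough for this kind of training.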
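In addition, a public data set can be used as the training set when training the target neural network architecture, for example the ImageNet data set. Because the ImageNet data set contains rich image information, using it as the training set helps ensure that the neural network is trained effectively. Using a public data set also means that the user does not need to upload training data, which solves the problem of users being unable to provide data for model training because of data privacy, legal, or transmission restrictions. Moreover, compared with related techniques that generate training data with a generative adversarial network (GAN) or similar models, using a public data set avoids the effect of their shortcomings on the training result, such as unstable training, difficult parameter tuning, high training cost, and difficulty achieving good results on complex tasks or high-resolution data sets.
As described above, training the target neural network architecture with the model cloning method clones the inference behavior of the neural network to be optimized onto the target neural network architecture, which keeps the inference behavior of the target neural network consistent with that of the neural network to be optimized. Furthermore, by using the neural network optimization method provided by the embodiments of this application together with a public data set and a designed loss function, the architecture of the optimized target neural network can differ from that of the neural network to be optimized, which further safeguards the accuracy of the target neural network.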
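Step 506: Provide the model file of the target neural network to the user.
After the target neural network is obtained through training, its model file can be provided to the user so that the user can use the target neural network.
Optionally, after the target neural network is obtained through training, the optimization system can also perform speed optimization on it and provide the speed-optimized neural network to the user. Speed optimization may include offline optimization and online optimization. Offline optimization means optimizing the speed of the target neural network while the user is not using it for inference; online optimization means optimizing its speed while the user is using it for inference.
For offline optimization, the optimization strategy can be applied directly to the target neural network after it has been trained. Correspondingly, step 506 then consists of providing the user with the model file of the speed-optimized target neural network. The optimization strategy includes a graph optimization strategy and an operator optimization strategy. The graph optimization strategy may include: applying an equivalent transformation to the graph that represents the target neural network and adjusting the structure of the target neural network according to the transformed graph; and/or fusing the operators implemented by multiple nodes of the target neural network into a single node and adjusting the structure of the target neural network according to the fusion result. The operator optimization strategy may include: using operator search to find, among the algorithms available for implementing an operator, the best implementation for the operator's type and parameters. Here, a set of one or more operations performed on an operand is called an operator.
Speed optimization with the optimization strategy reduces the amount of computation of the target neural network or other system overhead (such as memory access overhead), and thereby increases the inference speed of the target neural network.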
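As a toy illustration of fusing the operators of several nodes into one, the sketch below collapses a chain of element-wise affine operators y = a*x + b into a single equivalent operator. This is a deliberately simple stand-in chosen for this example; the application does not specify which operators are fused, and the names `fuse_affine_chain` and `run_chain` are hypothetical.

```python
def run_chain(ops, x):
    """Apply each affine operator (a, b) in turn: x -> a * x + b."""
    for a, b in ops:
        x = a * x + b
    return x


def fuse_affine_chain(ops):
    """Fold a chain of affine operators into one (scale, shift) pair.

    Composing a2 * (a1 * x + b1) + b2 gives (a2 * a1) * x + (a2 * b1 + b2).
    """
    scale, shift = 1.0, 0.0
    for a, b in ops:
        scale, shift = a * scale, a * shift + b
    return scale, shift


chain = [(2.0, 1.0), (0.5, -3.0), (4.0, 0.0)]   # three nodes in the original graph
fused_scale, fused_shift = fuse_affine_chain(chain)  # one node after fusion
```

The fused node computes the same result with one multiply-add instead of three, and without writing intermediate results back to memory, which is the kind of saving in computation and memory access that the text attributes to graph optimization.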
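For online optimization, speed optimization can be performed on the target neural network at the user's request after the model file of the target neural network has been provided to the user. Correspondingly, as shown in FIG. 8, the neural network optimization method further includes:
Step 507: Receive a speed optimization request sent by the user.
While using the target neural network, if the user finds that its speed needs to be optimized, the user can send a speed optimization request to the optimization system, asking it to perform speed optimization on the target neural network using the optimization strategy.
Step 508: Perform speed optimization on the target neural network using the optimization strategy, based on the speed optimization request.
The optimization strategy includes a graph optimization strategy and an operator optimization strategy. For its implementation, refer to the description above; details are not repeated here.
Step 509: Provide the user with the model file of the speed-optimized target neural network.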
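The following describes how the knowledge base is searched to obtain the target neural network architecture. For ease of understanding, the knowledge base is described first, followed by the search process.
The knowledge base can be regarded as a database that stores information related to neural network model architectures. Optionally, the knowledge base includes at least three kinds of information: pre-training parameters of neural networks, evaluation metric values, and neural network architectures. The evaluation metric values include first-type and second-type evaluation metric values, among others; the first-type evaluation metric values include the inference latency and power consumption on hardware and the like, and the second-type evaluation metric values include accuracy values and the like. The pre-training parameters include validation data used to verify a neural network's accuracy on the training set, and other data used for knowledge transfer (such as training data). Knowledge transfer means generating, from the user's requirements and the data already in the knowledge base, the data needed to obtain the neural network the user requires. The pre-training parameters and evaluation metric values can be regarded as the label information of a neural network architecture. FIG. 9 is a schematic diagram of a knowledge base that includes pre-training parameters, evaluation metric values, and neural network architectures.
The information in the knowledge base can be classified by source into initial information and later information. The initial information may come from public data sets and public model libraries. The later information is accumulated while neural networks are being optimized; for example, it may be the intermediate data obtained while searching for a neural network architecture with a search strategy. Storing the intermediate data obtained while searching for a neural network architecture for one user makes it possible to use that data to serve other users, which improves the efficiency of optimizing neural networks and reduces the resources consumed by optimization.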
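Optionally, searching the knowledge base according to the search space of the target neural network architecture and the target information includes: querying whether the knowledge base stores an existing neural network architecture that uses the search space and satisfies the target information. When it does, the existing architecture matches the task requirements of the target neural network and can be determined as the target neural network architecture.
Because the pre-training parameters and evaluation metric values can be regarded as the label information of a neural network architecture, querying whether the knowledge base stores an existing architecture that uses the search space and satisfies the target information can be viewed as querying, according to the search space and the target information, whether the pre-training parameters and evaluation metric values in the knowledge base contain label information that belongs to an architecture using the search space and matches the target information. When such label information exists, the neural network architecture that uses the search space and carries the matching label information is determined as the existing architecture that uses the search space and satisfies the target information. In addition, the search space and the label information that matches the target information can together indicate the training set used to train the target neural network; the label information that matches the target information can indicate evaluation metric values of the target neural network, such as its first-type and second-type evaluation metric values; and the fact that a neural network architecture uses the search space can indicate that it is of the same type as the neural network to be optimized.
Querying whether the knowledge base stores an existing neural network architecture that uses the search space means analyzing the network structure of a neural network architecture stored in the knowledge base to determine whether the value of each attribute of each of its neurons falls within the range covered by the search space. When all of them do, the neural network architecture is determined to use the search space.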
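The membership test described above, whether every attribute of every neuron falls within the range allowed by the search space, can be sketched as follows. The attribute names and values are hypothetical, and a set of allowed values per attribute is only one possible encoding of a value range.

```python
def uses_search_space(architecture, space):
    """True when every attribute of every neuron takes a value allowed by the space."""
    return all(
        neuron.get(attr) in allowed
        for neuron in architecture
        for attr, allowed in space.items()
    )


# Hypothetical search space and two stored architectures.
space = {"op": {"conv3x3", "maxpool"}, "channels": {16, 32}}
inside = [{"op": "conv3x3", "channels": 16}, {"op": "maxpool", "channels": 32}]
outside = [{"op": "conv7x7", "channels": 16}]  # conv7x7 is not in the space
```

Here `inside` would be treated as using the search space, while `outside` would not, because a single out-of-range attribute is enough to fail the check.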
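The evaluation metric values stored in the knowledge base usually include several kinds of metric values, for example first-type and second-type evaluation metric values. When querying whether the knowledge base contains label information that matches the target information, each of these metric values needs to be matched against the target information; only when all of them match is it determined that the knowledge base contains matching label information. Correspondingly, the target information then needs to include the categories of evaluation metric values that correspond to these metric values.
However, the target information input by the user may include only some of these categories. In that case, the missing categories of evaluation metric values can be obtained from the search space of the target neural network architecture, and together with the categories contained in the target information they make up the full set of categories. Similarly, when the optimization system has received no target information, all the categories of evaluation metric values that correspond to the metric values in the knowledge base can be obtained from the search space. For ease of description, the set formed by the evaluation values obtained from the search space and the evaluation values contained in the target information is called the specified information. The specified information reflects the user's performance requirements for the target neural network and includes the categories of evaluation metric values that correspond to the metric values in the knowledge base.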
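In one implementation, obtaining the specified information based on the search space of the target neural network architecture includes: obtaining at least one alternative neural network architecture based on the search space, and performing training and inference on each alternative architecture to obtain its multiple categories of evaluation metric values. The search space can be sampled to obtain the value of each attribute of each neuron in each alternative architecture, thereby generating at least one alternative architecture; for this process, refer to step 5041. The training module can perform weight initialization on each alternative architecture to obtain multiple initial sub-models and then train them with training data to obtain the second-type evaluation metric values of multiple alternative sub-models; for this process, refer to step 5042. The inference module can run inference on hardware for the initial sub-models obtained by weight initialization of each alternative architecture and obtain their first-type evaluation metric values on that hardware; for this process, refer to step 5043.
Because multiple categories of evaluation metric values are then available for each alternative architecture, the knowledge base can be searched based on them, and when any neural network architecture in the knowledge base satisfies the multiple categories of evaluation metric values of any alternative architecture, it is determined that the knowledge base contains an existing neural network architecture that satisfies the specified information.
Note that when multiple alternative architectures are obtained from the search space, the knowledge base may contain several existing architectures that satisfy the specified information. In that case, one of them can be determined as the target neural network architecture according to a specified selection strategy. The selection strategy can be chosen according to application requirements; for example, one architecture can be selected at random, or the best-performing one can be selected. This is not specifically limited in the embodiments of this application.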
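Optionally, the information in the knowledge base can be stored by search space. In one implementation, the knowledge base is divided into multiple sub-knowledge bases, one per search space, and the sub-knowledge base of a search space stores the information obtained under that search space. For example, as shown in FIG. 10, the knowledge base can be divided into three sub-knowledge bases for the search spaces spaceA, spaceB, and spaceC. The sub-knowledge base of spaceA stores the pre-training parameters, evaluation metric values, and neural network architectures obtained under spaceA; the sub-knowledge base of spaceB stores those obtained under spaceB; and the sub-knowledge base of spaceC stores those obtained under spaceC.
Correspondingly, when the knowledge base is searched, the sub-knowledge base of the target neural network architecture's search space can be located first and the search carried out only within it. Once the search space of the target neural network architecture is known, the sub-knowledge bases of other search spaces do not need to be searched, which narrows the scope of the search, shortens the time it takes, and reduces the resources it consumes.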
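For example, suppose the model file provided by the user indicates that the neural network to be optimized is a ResNet34 neural network, and the target information indicates that the optimized target neural network must reach an inference accuracy of at least 95.94% on the cifar10 data set and that the user wants to improve its inference performance on a GPU of model A. From the model file, the search space of the target neural network architecture can be determined to be the search space of the ResNet family of neural network models. Suppose the evaluation metric values stored in the knowledge base include inference accuracy, inference latency, and power consumption. Because the target information specifies only the inference accuracy, the inference latency and power consumption required by the user have to be obtained from the search space. Multiple alternative neural network architectures can therefore be generated from the ResNet search space, and inference can be run on a GPU of model A for the corresponding neural network models to obtain their inference latency and power consumption. The knowledge base is then searched according to the search space, the inference accuracy indicated by the target information, and the inference latency and power consumption of each alternative architecture. The search finds a ResNet18 neural network in the knowledge base that fits the search space, satisfies the inference latency and power consumption determined from the alternative architectures, and has an inference accuracy of 96.01% on the cifar10 data set. In other words, the ResNet18 neural network fits the search space determined from the neural network to be optimized (ResNet34), satisfies the target information provided by the user, and satisfies the inference latency and power consumption determined from the alternative architectures. The ResNet18 neural network can therefore be determined as the target neural network architecture for the neural network to be optimized, ResNet34.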
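The ResNet example can be mirrored with a knowledge base partitioned by search space. In the sketch below, the 96.01% accuracy of ResNet18 and the 95.94% requirement come from the text, while the latency and power figures, the second entry, the key names, and the function `retrieve` are hypothetical values introduced only to make the example runnable.

```python
# Knowledge base keyed by search space (sub-knowledge bases as in FIG. 10).
KNOWLEDGE_BASE = {
    "resnet_space": [
        # accuracy from the text; latency_ms and power_w are made-up placeholders
        {"name": "ResNet18", "accuracy": 0.9601, "latency_ms": 3.0, "power_w": 40.0},
        # entirely hypothetical entry that fails the latency bound used below
        {"name": "hypothetical_arch", "accuracy": 0.9700, "latency_ms": 9.0, "power_w": 70.0},
    ],
    "other_space": [],
}


def retrieve(kb, space_id, min_accuracy, max_latency_ms, max_power_w):
    """Search only the sub-knowledge base of the given search space.

    Returns the most accurate entry that meets every requirement, or None.
    """
    matches = [
        entry for entry in kb.get(space_id, [])
        if entry["accuracy"] >= min_accuracy
        and entry["latency_ms"] <= max_latency_ms
        and entry["power_w"] <= max_power_w
    ]
    return max(matches, key=lambda entry: entry["accuracy"]) if matches else None


# Accuracy bound from the target information; latency and power bounds stand in for
# the values measured on the alternative architectures.
hit = retrieve(KNOWLEDGE_BASE, "resnet_space", 0.9594, 5.0, 50.0)
miss = retrieve(KNOWLEDGE_BASE, "other_space", 0.9594, 5.0, 50.0)
```

A `None` result corresponds to the case in which no existing architecture satisfies the specified information, so the system would fall back to an architecture search. Picking the most accurate match is one possible selection strategy among those the text allows.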
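The following describes how the target neural network architecture is obtained by searching with an architecture search strategy. In one implementation, a conventional neural architecture search algorithm can be used, such as the neural architecture search (NAS) algorithm or the efficient neural architecture search (ENAS) algorithm. In another implementation, the model-based neural architecture search (MBNAS) method provided by the embodiments of this application can be used. The MBNAS method can be implemented by multiple functional modules. For ease of understanding, the embodiments of this application describe the MBNAS search process using the functional modules shown in FIG. 11 as an example:
As shown in FIG. 11, the functional modules used to implement the MBNAS method include the architecture determination module 12, the training module 13, and the inference module 14. The architecture determination module 12 includes a generation submodule 121 and a search submodule 122, and the search submodule 122 includes an evaluation unit 1221 and a control unit 1222. The search process with the MBNAS method is described below with reference to the schematic diagram in FIG. 12 and includes the following steps: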
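Step 5041: The generation submodule generates multiple neural network architectures from the search space and provides them to the training module and the inference module.
As described earlier, the search space includes the value range of each attribute of each neuron in the neural network architecture of the optimized neural network. The generation submodule can therefore sample the search space to obtain the value of each attribute of each neuron, and thereby generate multiple neural network architectures. When sampling the search space, the generation submodule can sample at random, which helps keep the samples later used to train the evaluation unit balanced. The generation submodule can also sample the search space in other ways; this is not specifically limited in the embodiments of this application.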
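Step 5042: The training module trains multiple alternative sub-models from the multiple neural network architectures, obtains the second-type evaluation metric values of the alternative sub-models, and provides the search submodule with the neural network architectures and the second-type evaluation metric values of the alternative sub-models.
The training module can perform weight initialization on the neural network architectures to obtain multiple initial sub-models, and then train the initial sub-models with training data to obtain the second-type evaluation metric values of multiple alternative sub-models. Optionally, the second-type evaluation metric values may include accuracy values. The training data used to train the initial sub-models may be a public data set such as the ImageNet data set, or a data set provided by the user.
Because the training processes of the initial sub-models do not depend on one another, the training module can train them concurrently, which shortens the training time and improves training efficiency.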
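Because the sub-models are independent, their training jobs can be dispatched concurrently. The sketch below uses Python's standard `concurrent.futures.ThreadPoolExecutor`; `train_sub_model` is a placeholder that does no real training, and all names are assumptions made for illustration rather than part of the described system.

```python
from concurrent.futures import ThreadPoolExecutor


def train_sub_model(architecture):
    """Placeholder for training one initial sub-model (no real training happens here)."""
    return {"architecture": architecture["name"], "trained": True}


def train_all(architectures, max_workers=4):
    """Train independent sub-models concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(train_sub_model, architectures))


results = train_all([{"name": "arch_a"}, {"name": "arch_b"}, {"name": "arch_c"}])
```

`Executor.map` returns results in input order, so each result can be paired with the architecture that produced it. For compute-bound training, a process pool or separate machines would be the more realistic choice; the thread pool is used here only to keep the sketch short.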
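Step 5043: The inference module runs inference on hardware for the multiple sub-models corresponding to the neural network architectures provided by the generation submodule, obtains the first-type evaluation metric values of the sub-models on that hardware, and provides the search submodule with the neural network architectures and the first-type evaluation metric values of the sub-models.
Optionally, the first-type evaluation metric values may include inference latency and power consumption. When the target information provided by the user includes hardware information, the hardware used for sub-model inference may be the hardware indicated by that information or, among the hardware already configured for the inference module, hardware whose performance is highly similar to it. When the target information does not include hardware information, that is, when the user has not specified the hardware for running the target neural network, the hardware used for sub-model inference may be the hardware already configured for the inference module.
The sub-models on which the inference module runs inference may be the initial sub-models that the inference module obtains by weight initialization of the neural network architectures provided by the generation submodule, or the alternative sub-models that the training module obtains by training the initial sub-models. In addition, it is sufficient for any one of the inference module, the training module, and the generation submodule to provide the search submodule with the neural network architectures generated by the generation submodule.
Like the training module, the inference module can run inference for the multiple sub-models on hardware in parallel to obtain their first-type evaluation metric values on that hardware.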
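Step 5044: The search submodule determines the target neural network architecture based on the multiple neural network architectures, the second-type evaluation metric values of the alternative sub-models, and the first-type evaluation metric values of the sub-models.
The search submodule can train the evaluation unit with the neural network architectures, the second-type evaluation metric values of the alternative sub-models, and the first-type evaluation metric values of the sub-models. It then uses the trained evaluation unit to predict the evaluation metric values of the neural network architectures proposed by the control unit, uses these predicted values as feedback to train the control unit, and finally uses the trained control unit to determine the target neural network architecture.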
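The evaluation unit is a neural network. Its training process is as follows: the neural network architectures generated by the generation submodule, together with their first-type and second-type evaluation metric values, are input into the evaluation unit, and supervised learning is performed with the input evaluation metric values as the labels of the input architectures. The evaluation unit determines a loss value from the labels and from the evaluation metric values it predicts for each architecture, and its weight parameters are updated according to that loss value until its training end condition is met, for example until the evaluation unit converges or its loss value falls below a preset loss value. The evaluation unit may be a recurrent neural network. One set of training data for the evaluation unit consists of any one of the neural network architectures generated by the generation submodule, the second-type evaluation metric values of the alternative sub-model corresponding to that architecture, and the first-type evaluation metric values of the sub-model corresponding to that architecture.
The control unit is also a neural network. The control unit can generate neural network architectures from the search space, and the trained evaluation unit can predict the evaluation metric values of those architectures (for example, at least one of the first-type and second-type evaluation metric values). The predicted evaluation metric values serve as the reward for training the control unit, and the weight parameters of the control unit are adjusted according to this reward until the training end condition of the control unit is met, for example until the control unit converges. The control unit may be a recurrent neural network.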
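The trained control unit is used to generate multiple candidate neural network architectures from the search space of the target neural network. The search submodule obtains these candidate architectures, derives multiple initial neural networks from them, and trains each initial neural network to obtain the candidate neural networks corresponding to the candidate architectures. It then runs inference on each candidate neural network to obtain its evaluation metric values, selects the candidate neural network that satisfies a specified selection condition according to these values, and determines the architecture of that candidate neural network as the target neural network architecture. The specified selection condition can be determined from at least one of the search space and the target information. For example, the condition may require the target neural network architecture to meet the user's performance requirements for the target neural network as indicated by the target information; if several candidate neural networks meet those requirements, the architecture of the best-performing one can be determined as the target neural network architecture. The initial neural networks can be trained with the model cloning method provided by the embodiments of this application, and the training set can be a public data set such as the ImageNet data set. In some possible implementations, the search submodule uses the training module to train the initial neural networks. When the training module is deployed on the user side, the initial neural networks need to be sent to that training module so that it can carry out the training. The description above assumes that all training required during the MBNAS search is provided by the training module 13 and all inference is provided by the inference module 14. In a possible implementation, the training and inference required during the search may instead be provided elsewhere. For example, the architecture determination module may further include a training submodule and an inference submodule, which provide the training and inference functions required by the search, respectively.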
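It should be noted that, as described above, the knowledge base includes at least three types of information: pre-training parameters of neural networks, evaluation metric values, and neural network architectures. The pre-training parameters and the evaluation metric values may be regarded as label information of the neural network architectures in the knowledge base. If the user specifies a training set for the target neural network, retrieving the knowledge base based on the search space and the target information yields at least the following six results: no match with any of the at least three types of information (that is, neither the training set nor any of the target information matches); a match only with the pre-training parameters (that is, only the training set matches); a match only with some of the multiple evaluation metric values; a match only with all of the evaluation metric values; a match with the pre-training parameters and with some of the evaluation metric values; and a match with the pre-training parameters and with all of the evaluation metric values. If the user does not specify a training set for the target neural network, the retrieval yields at least the following three results: no match with any evaluation metric value, a match with only some of the evaluation metric values, and a match with all of the evaluation metric values.
When the retrieval matches the pre-training parameters and all of the evaluation metric values, or, when no training set is specified, matches all of the evaluation metric values, the existing neural network architecture in the knowledge base can be used directly. In all other cases, the MBNAS method needs to be used for searching. The search procedure differs slightly across these cases, as follows. When there is no match with any of the at least three types of information, a match only with the pre-training parameters, a match only with some of the evaluation metric values when a training set is specified, a match only with all of the evaluation metric values when a training set is specified, or no match with any evaluation metric value, the search follows the procedure described in steps 5041 to 5044. When there is a match with the pre-training parameters and with some of the evaluation metric values, or a match only with some of the evaluation metric values when no training set is specified: if the matched values are second-type evaluation metric values, step 5042 need not be performed, and the second-type evaluation metric values from the knowledge base can be used directly to train the evaluation unit; if the matched values are first-type evaluation metric values, step 5043 need not be performed, and the first-type evaluation metric values from the knowledge base can be used directly to train the evaluation unit. Skipping step 5042 or step 5043 improves the efficiency of optimizing the to-be-optimized neural network. Moreover, because the first-type evaluation metric values, the second-type evaluation metric values, the search spaces, and the like are intermediate data produced while the optimization system optimizes neural networks for other users, storing this intermediate data in the knowledge base enriches the knowledge base and improves the efficiency of optimizing the to-be-optimized neural network.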
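The case analysis above amounts to a small decision table, sketched below. This is one reading of the text, not a definitive implementation: it assumes exactly two metric types ("first" and "second"), so that a partial match means exactly one type matched, and it assumes that when a training set is specified, matched metric values are only usable if the pre-training parameters also match. The function name `plan_search` and the string step labels are hypothetical.

```python
ALL_STEPS = ["5041", "5042", "5043", "5044"]


def plan_search(training_set_specified, params_match, matched_metric_types):
    """Decide which steps to run for a given knowledge-base retrieval result.

    matched_metric_types: iterable containing "first" and/or "second",
    naming the metric types for which the knowledge base has a match.
    Returns ["reuse_existing_architecture"] or the list of steps to execute.
    """
    matched = set(matched_metric_types)
    # With a user-specified training set, metric matches count only if the
    # pre-training parameters (that is, the training set) match as well.
    metrics_usable = params_match if training_set_specified else True
    if metrics_usable and matched == {"first", "second"}:
        return ["reuse_existing_architecture"]
    steps = list(ALL_STEPS)
    if metrics_usable:
        if "second" in matched:
            steps.remove("5042")  # second-type values come from the knowledge base
        if "first" in matched:
            steps.remove("5043")  # first-type values come from the knowledge base
    return steps


plan_full_match = plan_search(True, True, {"first", "second"})
plan_partial = plan_search(True, True, {"second"})
plan_untrusted = plan_search(True, False, {"first", "second"})
```

Encoding the rule as a pure function makes each of the six (or three) retrieval results easy to check against the text one by one.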
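In conclusion, in the neural network optimization method provided in the embodiments of this application, the to-be-optimized neural network is first mapped to a relatively similar search space based on its model file, the target neural network architecture is then determined based on that search space, and the target neural network architecture is trained to obtain a target neural network with substantially improved performance, whose model file is then provided to the user. Because the method substantially improves the performance of the optimized neural network, the optimized neural network can be used for relatively complex tasks, which ensures its range of application.
Moreover, because the method does not require the user to upload training data, it avoids the problem that the user cannot provide training data to the platform because of privacy, legal, or transmission restrictions on the data. This protects user data and improves the applicability of the neural network optimization method.
Furthermore, the method requires the user to provide only the to-be-optimized neural network, and optionally the target information, to complete the optimization. Unlike an advanced automated machine learning platform, it does not require the user to have prior knowledge of model optimization, which lowers the barrier to using the neural network optimization method and broadens its scope of use.
In addition, storing the intermediate data of the search process in the knowledge base allows that data to be used to serve other users. And because the knowledge base is searched first, when it contains an existing neural network architecture that matches the to-be-optimized neural network, no architecture search strategy needs to be used to search for a neural network architecture, which improves the efficiency of optimizing the to-be-optimized neural network and reduces the resource consumption of the optimization.
It should be noted that the order of the steps of the neural network optimization method provided in the embodiments of this application may be adjusted appropriately, and steps may be added or removed as required. Any variation readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application, and details are not described here.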
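An embodiment of this application further provides a neural network optimization apparatus. The neural network optimization apparatus may implement some or all functions of the foregoing optimization system 1. When the neural network optimization apparatus is a software apparatus, it may be part or all of the foregoing optimization system. As shown in FIG. 13, the neural network optimization apparatus 130 includes:
An interaction module 1301, configured to receive a model file of a to-be-optimized neural network.
An architecture determining module 1302, configured to obtain a search space of a target neural network architecture based on the model file of the to-be-optimized neural network, where the search space includes a value range of each attribute of each neuron in the target neural network architecture.
The architecture determining module 1302 is further configured to obtain the target neural network architecture based on the search space.
A training module 1303, configured to train the target neural network architecture based on the model file of the to-be-optimized neural network, to obtain a model file of a target neural network.
The interaction module 1301 is further configured to provide the model file of the target neural network to a user.
Optionally, the interaction module 1301 is further configured to receive target information input by the user, where the target information includes one or more of the following: information about the hardware that runs the target neural network, and information indicating the user's performance requirement for the target neural network.
Correspondingly, the architecture determining module 1302 is specifically configured to obtain the search space of the target neural network architecture based on the model file of the to-be-optimized neural network and the target information.
Optionally, the architecture determining module 1302 is specifically configured to: obtain, based on the search space, specified information that reflects the user's performance requirement for the target neural network; search a knowledge base based on the specified information; and when the knowledge base contains an existing neural network architecture that satisfies the specified information, determine the existing neural network architecture as the target neural network architecture.
Optionally, the specified information includes multiple types of evaluation metric values, and the architecture determining module 1302 is specifically configured to: obtain at least one alternative neural network architecture based on the search space; and perform training and inference on each alternative neural network architecture, to obtain the multiple types of evaluation metric values of each alternative neural network architecture.
Optionally, the architecture determining module 1302 is specifically configured to: search the knowledge base based on the multiple types of evaluation metric values of each alternative neural network architecture; and when any neural network architecture in the knowledge base satisfies the multiple types of evaluation metric values of any alternative neural network architecture, determine that the knowledge base contains an existing neural network architecture that satisfies the specified information.
Optionally, the architecture determining module 1302 is further specifically configured to: when the knowledge base does not contain an existing neural network architecture that satisfies the specified information, obtain the target neural network architecture through searching by using an architecture search strategy.
Optionally, as shown in FIG. 14, the neural network optimization apparatus 130 further includes a storage module 1304, configured to store, in the knowledge base, intermediate data generated in the process of searching for the target neural network architecture, where the intermediate data includes one or more of the following: the search space of the target neural network architecture, the alternative neural network architectures obtained based on the search space, hardware-related first-type evaluation metric values, hardware-independent second-type evaluation metric values, and information about the hardware used to run the target neural network.
Optionally, the training module 1303 is specifically configured to train the target neural network architecture by using a model cloning method based on the model file of the to-be-optimized neural network, to obtain the model file of the target neural network.
Optionally, the architecture determining module 1302 is specifically configured to input the model file of the to-be-optimized neural network into a pre-trained artificial intelligence model, to obtain the search space of the target neural network architecture output by the artificial intelligence model.
Optionally, as shown in FIG. 14, the neural network optimization apparatus 130 further includes an inference module 1305, configured to perform speed optimization on the target neural network by using an optimization strategy, where the optimization strategy includes a graph optimization strategy and an operator optimization strategy.
Correspondingly, the interaction module 1301 is specifically configured to provide the user with the model file of the speed-optimized target neural network.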
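The knowledge-base lookup with fallback performed by the architecture determining module 1302 can be sketched as follows. The source does not define what it means for a stored architecture to "satisfy" a set of evaluation metric values; this sketch assumes that accuracy-like metrics must be at least the required value and all other metrics at most the required value. The dictionary layout of the knowledge base, the metric names, and the `search_fn` callback standing in for the architecture search strategy are hypothetical.

```python
def satisfies(entry_metrics, required, higher_is_better=("accuracy",)):
    """Assumed rule: every required metric is present and at least as good."""
    for name, target in required.items():
        if name not in entry_metrics:
            return False
        value = entry_metrics[name]
        if name in higher_is_better:
            if value < target:
                return False
        elif value > target:
            return False
    return True


def obtain_target_architecture(knowledge_base, required_list, search_fn):
    """Reuse a stored architecture if any entry satisfies any requirement set.

    knowledge_base: list of {"architecture": ..., "metrics": {...}} entries.
    required_list: metric values of each alternative architecture.
    search_fn: zero-argument callable that runs the architecture search.
    Returns (architecture, source) where source names where it came from.
    """
    for required in required_list:
        for entry in knowledge_base:
            if satisfies(entry["metrics"], required):
                return entry["architecture"], "knowledge_base"
    return search_fn(), "architecture_search"


# Made-up knowledge base and requirements.
knowledge_base = [
    {"architecture": "kb_arch_1", "metrics": {"accuracy": 0.74, "latency_ms": 9.0}},
    {"architecture": "kb_arch_2", "metrics": {"accuracy": 0.80, "latency_ms": 13.0}},
]
hit = obtain_target_architecture(
    knowledge_base,
    [{"accuracy": 0.78, "latency_ms": 15.0}],
    search_fn=lambda: "searched_arch",
)
miss = obtain_target_architecture(
    knowledge_base,
    [{"accuracy": 0.90, "latency_ms": 15.0}],
    search_fn=lambda: "searched_arch",
)
```

Passing the search strategy as a callback keeps the expensive path lazy: it only runs when the knowledge base has no satisfying entry.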
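A person skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the apparatus and modules described above, reference may be made to the corresponding content in the foregoing method embodiments, and details are not described here again.
In conclusion, in the neural network optimization apparatus provided in the embodiments of this application, the to-be-optimized neural network is first mapped to a relatively similar search space based on its model file, the target neural network architecture is then determined based on that search space, and the target neural network architecture is trained to obtain a target neural network with substantially improved performance, whose model file is then provided to the user. Because the apparatus substantially improves the performance of the optimized neural network, the optimized neural network can be used for relatively complex tasks, which ensures its range of application.
Moreover, because the neural network optimization apparatus does not require the user to upload training data, it avoids the problem that the user cannot provide training data to the platform because of privacy, legal, or transmission restrictions on the data. This protects user data and improves the applicability of the apparatus.
Furthermore, the neural network optimization apparatus requires the user to provide only the to-be-optimized neural network, and optionally the target information, to complete the optimization. Unlike an advanced deep learning AutoML platform, it does not require the user to have prior knowledge of model optimization, which lowers the barrier to using the apparatus and broadens its scope of use.
In addition, storing the intermediate data of the search process in the knowledge base allows that data to be used to serve other users. And because the knowledge base is searched first, when it contains an existing neural network architecture that matches the to-be-optimized neural network, no architecture search strategy needs to be used to search for a neural network architecture, which improves the efficiency of optimizing the to-be-optimized neural network and reduces the resource consumption of the optimization.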
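FIG. 15 is a schematic structural diagram of a computer device according to an embodiment of this application. The optimization system shown in FIG. 1, FIG. 2, or FIG. 3 may be deployed on the computer device. As shown in FIG. 15, the computer device 150 includes a memory 1501, a processor 1502, a communication interface 1503, and a bus 1504. The memory 1501, the processor 1502, and the communication interface 1503 are communicatively connected to each other through the bus 1504. In addition, the computer device 150 may include multiple processors 1502, so that different processors implement the functions of the different functional modules described above.
The memory 1501 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 1501 may store executable program code. When the executable code stored in the memory 1501 is executed by the processor 1502, the processor 1502 and the communication interface 1503 are configured to perform the neural network optimization method provided in the embodiments of this application. The memory 1501 may further include an operating system and other software modules and data required for running processes. The operating system may be LINUX, UNIX, WINDOWS™, or the like.
The processor 1502 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits.
The processor 1502 may alternatively be an integrated circuit chip with signal processing capability. In an implementation process, some or all functions of the neural network optimization method in this application may be completed by an integrated logic circuit of hardware in the processor 1502 or by instructions in the form of software. The processor 1502 may alternatively be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed with reference to the embodiments of this application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. A software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1501; the processor 1502 reads information from the memory 1501 and completes the neural network optimization method in the embodiments of this application in combination with its hardware.
The communication interface 1503 uses a transceiver module, for example but not limited to a transceiver, to implement communication between the computer device 150 and another device or a communication network. For example, the communication interface 1503 may be any one or any combination of the following components that have a network access function: a network interface (such as an Ethernet interface), a wireless network interface card, and the like.
The bus 1504 may include a path for transmitting information between the components of the computer device 150 (for example, the memory 1501, the processor 1502, and the communication interface 1503).
A communication path is established between the foregoing computer devices 150 through a communication network. Each computer device 150 is configured to implement some functions of the neural network optimization method provided in the embodiments of this application. Any computer device 150 may be a computer device (for example, a server) in a cloud data center, a computer device in an edge data center, or the like.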
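The descriptions of the procedures corresponding to the foregoing accompanying drawings each have their own focus. For a part that is not described in detail in one procedure, refer to the related descriptions of other procedures.
All or some of the foregoing embodiments may be implemented by software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented fully or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer device, all or some of the procedures or functions of the neural network optimization method provided in the embodiments of this application are implemented.
The computer device may be a general-purpose computer, a dedicated computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or may be transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, a coaxial cable, an optical fiber, or a digital subscriber line) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium stores the computer program instructions.
An embodiment of this application further provides a storage medium. The storage medium is a non-volatile computer-readable storage medium. When instructions in the storage medium are executed by a processor, the neural network optimization method provided in the embodiments of this application is implemented.
An embodiment of this application further provides a computer program product including instructions. When the computer program product runs on a computer, the computer is enabled to perform the neural network optimization method provided in the embodiments of this application.
A person of ordinary skill in the art may understand that all or some of the steps of the foregoing embodiments may be completed by hardware, or by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
In the embodiments of this application, the terms "first", "second", and "third" are used only for description and shall not be understood as indicating or implying relative importance. The term "at least one" means one or more, and the term "multiple" means two or more, unless otherwise expressly limited.
The term "and/or" in this application describes only an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate the following three cases: only A exists, both A and B exist, and only B exists. In addition, the character "/" in this specification generally indicates an "or" relationship between the associated objects.
The foregoing descriptions are merely optional embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, or improvement made within the concept and principle of this application shall fall within the protection scope of this application.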
Claims (25)
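- A neural network optimization method, wherein the method comprises: receiving a model file of a to-be-optimized neural network; obtaining a search space of a target neural network architecture based on the model file of the to-be-optimized neural network, wherein the search space comprises a value range of each attribute of each neuron in the target neural network architecture; obtaining the target neural network architecture based on the search space; training the target neural network architecture based on the model file of the to-be-optimized neural network, to obtain a model file of a target neural network; and providing the model file of the target neural network to a user.
- The method according to claim 1, wherein the method further comprises: receiving target information input by the user, wherein the target information comprises one or more of the following: information about hardware that runs the target neural network, and information indicating a performance requirement of the user for the target neural network; and the obtaining a search space of a target neural network architecture based on the model file of the to-be-optimized neural network comprises: obtaining the search space of the target neural network architecture based on the model file of the to-be-optimized neural network and the target information.
- The method according to claim 1 or 2, wherein the obtaining the target neural network architecture based on the search space comprises: obtaining, based on the search space, specified information that reflects the performance requirement of the user for the target neural network; searching a knowledge base based on the specified information; and when an existing neural network architecture that satisfies the specified information exists in the knowledge base, determining the existing neural network architecture as the target neural network architecture.
- The method according to claim 3, wherein the specified information comprises multiple types of evaluation metric values, and the obtaining, based on the search space, specified information that reflects the performance requirement of the user for the target neural network comprises: obtaining at least one alternative neural network architecture based on the search space; and performing training and inference on each alternative neural network architecture, to obtain the multiple types of evaluation metric values of each alternative neural network architecture.
- The method according to claim 4, wherein the searching a knowledge base based on the specified information comprises: searching the knowledge base based on the multiple types of evaluation metric values of each alternative neural network architecture; and when any neural network architecture in the knowledge base satisfies the multiple types of evaluation metric values of any alternative neural network architecture, determining that an existing neural network architecture that satisfies the specified information exists in the knowledge base.
- The method according to any one of claims 3 to 5, wherein the obtaining the target neural network architecture based on the search space further comprises: when no existing neural network architecture that satisfies the specified information exists in the knowledge base, obtaining the target neural network architecture through searching by using an architecture search strategy.
- The method according to claim 6, wherein the method further comprises: storing, in the knowledge base, intermediate data generated in the process of searching for the target neural network architecture, wherein the intermediate data comprises one or more of the following: the search space of the target neural network architecture, the alternative neural network architectures obtained based on the search space, hardware-related first-type evaluation metric values, hardware-independent second-type evaluation metric values, and information about the hardware used to run the target neural network.
- The method according to any one of claims 1 to 7, wherein the training the target neural network architecture based on the model file of the to-be-optimized neural network, to obtain a model file of a target neural network comprises: training the target neural network architecture by using a model cloning method based on the model file of the to-be-optimized neural network, to obtain the model file of the target neural network.
- The method according to any one of claims 1 to 8, wherein the obtaining a search space of a target neural network architecture based on the model file of the to-be-optimized neural network comprises: inputting the model file of the to-be-optimized neural network into a pre-trained artificial intelligence model, to obtain the search space of the target neural network architecture output by the artificial intelligence model.
- The method according to any one of claims 1 to 9, wherein after the training the target neural network architecture based on the model file of the to-be-optimized neural network, to obtain a model file of a target neural network, the method further comprises: performing speed optimization on the target neural network by using an optimization strategy, wherein the optimization strategy comprises a graph optimization strategy and an operator optimization strategy; and the providing the model file of the target neural network to a user comprises: providing the user with a model file of the speed-optimized target neural network.
- The method according to any one of claims 1 to 9, wherein after the providing the model file of the target neural network to a user, the method further comprises: receiving a speed optimization request sent by the user; performing, based on the speed optimization request, speed optimization on the target neural network by using an optimization strategy, wherein the optimization strategy comprises a graph optimization strategy and an operator optimization strategy; and providing the user with a model file of the speed-optimized target neural network.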
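- A neural network optimization apparatus, wherein the apparatus comprises: an interaction module, configured to receive a model file of a to-be-optimized neural network; an architecture determining module, configured to obtain a search space of a target neural network architecture based on the model file of the to-be-optimized neural network, wherein the search space comprises a value range of each attribute of each neuron in the target neural network architecture; the architecture determining module being further configured to obtain the target neural network architecture based on the search space; and a training module, configured to train the target neural network architecture based on the model file of the to-be-optimized neural network, to obtain a model file of a target neural network; the interaction module being further configured to provide the model file of the target neural network to a user.
- The apparatus according to claim 12, wherein the interaction module is further configured to receive target information input by the user, wherein the target information comprises one or more of the following: information about hardware that runs the target neural network, and information indicating a performance requirement of the user for the target neural network; and the architecture determining module is specifically configured to obtain the search space of the target neural network architecture based on the model file of the to-be-optimized neural network and the target information.
- The apparatus according to claim 12 or 13, wherein the architecture determining module is specifically configured to: obtain, based on the search space, specified information that reflects the performance requirement of the user for the target neural network; search a knowledge base based on the specified information; and when an existing neural network architecture that satisfies the specified information exists in the knowledge base, determine the existing neural network architecture as the target neural network architecture.
- The apparatus according to claim 14, wherein the specified information comprises multiple types of evaluation metric values, and the architecture determining module is specifically configured to: obtain at least one alternative neural network architecture based on the search space; and perform training and inference on each alternative neural network architecture, to obtain the multiple types of evaluation metric values of each alternative neural network architecture.
- The apparatus according to claim 15, wherein the architecture determining module is specifically configured to: search the knowledge base based on the multiple types of evaluation metric values of each alternative neural network architecture; and when any neural network architecture in the knowledge base satisfies the multiple types of evaluation metric values of any alternative neural network architecture, determine that an existing neural network architecture that satisfies the specified information exists in the knowledge base.
- The apparatus according to any one of claims 14 to 16, wherein the architecture determining module is further specifically configured to: when no existing neural network architecture that satisfies the specified information exists in the knowledge base, obtain the target neural network architecture through searching by using an architecture search strategy.
- The apparatus according to claim 17, wherein the apparatus further comprises a storage module, configured to store, in the knowledge base, intermediate data generated in the process of searching for the target neural network architecture, wherein the intermediate data comprises one or more of the following: the search space of the target neural network architecture, the alternative neural network architectures obtained based on the search space, hardware-related first-type evaluation metric values, hardware-independent second-type evaluation metric values, and information about the hardware used to run the target neural network.
- The apparatus according to any one of claims 12 to 18, wherein the training module is specifically configured to train the target neural network architecture by using a model cloning apparatus based on the model file of the to-be-optimized neural network, to obtain the model file of the target neural network.
- The apparatus according to any one of claims 12 to 19, wherein the architecture determining module is specifically configured to input the model file of the to-be-optimized neural network into a pre-trained artificial intelligence model, to obtain the search space of the target neural network architecture output by the artificial intelligence model.
- The apparatus according to any one of claims 12 to 20, wherein the apparatus further comprises an inference module, configured to perform speed optimization on the target neural network by using an optimization strategy, wherein the optimization strategy comprises a graph optimization strategy and an operator optimization strategy; and the interaction module is specifically configured to provide the user with a model file of the speed-optimized target neural network.
- The apparatus according to any one of claims 12 to 20, wherein the interaction module is further configured to receive a speed optimization request sent by the user; the apparatus further comprises an inference module, configured to perform, based on the speed optimization request, speed optimization on the target neural network by using an optimization strategy, wherein the optimization strategy comprises a graph optimization strategy and an operator optimization strategy; and the interaction module is further configured to provide the user with a model file of the speed-optimized target neural network.
- A computer device, wherein the computer device comprises a processor and a memory, the memory stores a computer program, and when the processor executes the computer program, the computer device implements the method according to any one of claims 1 to 11.
- A non-transitory computer-readable storage medium, wherein when instructions in the computer-readable storage medium are executed by a processor, the processor performs the method according to any one of claims 1 to 11.
- A computer program product comprising instructions, wherein when the instructions in the computer program product are run on a computer, the computer performs the method according to any one of claims 1 to 11.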
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP22814743.5A EP4339843A4 (en) | 2021-05-29 | 2022-02-17 | METHOD AND APPARATUS FOR OPTIMIZING NEURAL NETWORK |
| US18/521,152 US20240095529A1 (en) | 2021-05-29 | 2023-11-28 | Neural Network Optimization Method and Apparatus |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110596002.1 | 2021-05-29 | | |
| CN202110596002.1A CN115409168A (zh) | 2021-05-29 | 2021-05-29 | 神经网络优化方法及其装置 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US18/521,152 Continuation US20240095529A1 (en) | 2021-05-29 | 2023-11-28 | Neural Network Optimization Method and Apparatus |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022252694A1 (zh) | 2022-12-08 |
Family
ID=84155966
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2022/076556 Ceased WO2022252694A1 (zh) | 2021-05-29 | 2022-02-17 | 神经网络优化方法及其装置 |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20240095529A1 (zh) |
| EP (1) | EP4339843A4 (zh) |
| CN (1) | CN115409168A (zh) |
| WO (1) | WO2022252694A1 (zh) |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118353779B (zh) * | 2024-05-20 | 2025-01-24 | 广州楚晨网络科技有限公司 | 一种优化策略的物联网网络确定方法 |
| CN118502926B (zh) * | 2024-07-19 | 2024-12-24 | 阿里云飞天(杭州)云计算技术有限公司 | 端侧算法模型的优化方法、设备、介质和程序产品 |
| CN120163196B (zh) * | 2025-05-19 | 2025-08-29 | 中国科学技术大学苏州高等研究院 | 神经网络和硬件的联合搜索方法及装置 |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111814966A (zh) * | 2020-08-24 | 2020-10-23 | 国网浙江省电力有限公司 | 神经网络架构搜索方法、神经网络应用方法、设备及存储介质 |
| CN112101525A (zh) * | 2020-09-08 | 2020-12-18 | 南方科技大学 | 一种通过nas设计神经网络的方法、装置和系统 |
| CN112561027A (zh) * | 2019-09-25 | 2021-03-26 | 华为技术有限公司 | 神经网络架构搜索方法、图像处理方法、装置和存储介质 |
| CN113128678A (zh) * | 2020-01-15 | 2021-07-16 | 华为技术有限公司 | 神经网络的自适应搜索方法及装置 |
- 2021-05-29 CN CN202110596002.1A patent/CN115409168A/zh active Pending
- 2022-02-17 EP EP22814743.5A patent/EP4339843A4/en active Pending
- 2022-02-17 WO PCT/CN2022/076556 patent/WO2022252694A1/zh not_active Ceased
- 2023-11-28 US US18/521,152 patent/US20240095529A1/en active Pending
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4339843A4 |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117236426A (zh) * | 2022-12-20 | 2023-12-15 | 北京九章云极科技有限公司 | 一种数据处理方法及系统 |
| CN117313704A (zh) * | 2023-11-28 | 2023-12-29 | 江西师范大学 | 基于公有与私有特征分解的混合可读性评估方法与系统 |
| CN117313704B (zh) * | 2023-11-28 | 2024-02-23 | 江西师范大学 | 基于公有与私有特征分解的混合可读性评估方法与系统 |
| CN121094525A (zh) * | 2025-08-06 | 2025-12-09 | 北京建工集团有限责任公司 | 一种复杂地层高水压大直径盾构隧道施工管理方法及系统 |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4339843A4 (en) | 2024-11-20 |
| CN115409168A (zh) | 2022-11-29 |
| US20240095529A1 (en) | 2024-03-21 |
| EP4339843A1 (en) | 2024-03-20 |
Legal Events
| Code | Title | Description |
|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22814743; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | Wipo information: entry into national phase | Ref document number: 2022814743; Country of ref document: EP |
| ENP | Entry into the national phase | Ref document number: 2022814743; Country of ref document: EP; Effective date: 20231212 |
| NENP | Non-entry into the national phase | Ref country code: DE |