EP4599366A1 - Merging adversarially-robust neural networks - Google Patents
Merging adversarially-robust neural networks
- Publication number
- EP4599366A1 (application EP23805946.3A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- training
- network
- adversarial
- neural network
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
Definitions
- This specification relates to processing inputs using neural networks.
- Neural networks are machine learning models that employ one or more layers of nonlinear units to predict an output for a received input.
- Some neural networks include one or more hidden layers in addition to an output layer. The output of each hidden layer is used as input to the next layer in the network, i.e., the next hidden layer or the output layer.
- Each layer of the network generates an output from a received input in accordance with current values of a respective set of parameters.
- This specification describes a system implemented as computer programs on one or more computers in one or more locations that trains a neural network to be resistant to adversarial attacks. That is, the system generates, by training the neural network, final values for the parameters of the neural network (“network parameters”) that will be used to perform a target task.
- the neural network becomes more secure by virtue of being less susceptible to adversarial attacks.
- An adversarial attack occurs when a malicious attacker intentionally submits inputs to the neural network that cause undesired behavior, i.e., incorrect outputs to be generated by the neural network.
- the security of the computer system that includes the neural network is improved because the system becomes more resistant to these types of attacks.
- the neural network can generalize to inference time and test time inputs that have a “distribution shift” relative to the training inputs used to train the neural network, improving the performance of the neural network on a variety of real-world tasks where distribution shift may be likely and without requiring any re-training.
- FIG. 1A shows an example training system.
- FIG. 1B shows an example configuration system.
- FIG. 2 is a flow diagram of an example process for merging adversarially-robust neural networks.
- FIG. 4 is a flow diagram for configuring the neural network after training.
- FIG. 5 shows an example of the results achieved by making use of the described techniques on eight different image classification tasks.
- FIG. 6 shows an example of robust accuracy of the described techniques for various weights.
- By training the neural network 118 to be resistant to adversarial attacks, the system 100 obtains final values 110 of the network parameters that cause the neural network 118 to have robust, rather than brittle, performance at inference time, i.e., when processing network inputs 112 to generate network outputs 114 for the target task after training.
- the output generated by the neural network 118 for a given image may be an image classification output that includes scores for each of a set of object categories, with each score representing an estimated likelihood that the image contains an image of an object belonging to the category.
- the output generated by the neural network 118 for a given image may be an object detection output that identifies positions of objects within the given image.
- the output generated by the neural network 118 for a given image may be an image segmentation output that identifies, for each pixel of the given input image, a category from a set of possible categories that the scene depicted at the pixel belongs to.
- the output generated by the neural network can be a control policy for controlling the agent, e.g., data defining a probability distribution over possible actions that can be performed by the agent.
- the environment may be a real world environment, and the agent may be a physical agent operating in the real world environment.
- the sensor data can be data from an image, distance, or position sensor or from an actuator.
- the sensor data may include data characterizing the current state of the robot, e.g., one or more of: joint position, joint velocity, joint force, torque or acceleration, e.g., gravity-compensated torque feedback, and global or relative pose of an item held by the robot.
- the sensor data may also include, for example, sensed electronic signals such as motor current or a temperature signal; and/or image or video data for example from a camera or a LIDAR sensor, e.g., data from sensors of the agent or data from sensors that are located separately from the agent in the environment.
- the neural network 118 can have any appropriate architecture that allows the neural network 118 to perform the target task, i.e., to map network inputs of the type and dimensions required by the task to network outputs of the type and dimensions required by the task. That is, when the task is a classification task, the neural network 118 maps the input to the classification task to a set of scores, one for each possible class for the task. When the task is a regression task, the neural network 118 maps the input to the regression task to a set of regressed values, one for each value that needs to be generated in order to perform the regression task.
- the neural network 118 can be a convolutional neural network, e.g., a neural network having a ResNet architecture, an Inception architecture, an EfficientNet architecture, and so on, or a Transformer neural network, e.g., a vision Transformer.
- the neural network 118 can be a feed-forward neural network, e.g., an MLP, that includes multiple fully-connected layers.
- the system obtains data specifying a plurality of different adversarial training schemes 132.
- Each adversarial training scheme 132 trains the neural network 118 to be robust to a different type of adversarial attack.
- each adversarial training scheme 132 can have a different corresponding adversarial training loss function.
- Some or all of the adversarial training schemes can train the neural network 118 to be robust to $\ell_p$-norm bounded perturbations for different values of p.
- Each of these adversarial training schemes is associated with a corresponding function A that characterizes a corresponding threat model for the adversarial training scheme and that maps an input x to a set A(x) of possible perturbed versions of the input x.
- $\ell_p$-norm bounded perturbations with budget $\epsilon > 0$ can be described by: $A(x) = \{x + \delta : \delta \in \mathbb{R}^d, \|\delta\|_p \le \epsilon\}$, where d is the number of elements in the input x and $\|\delta\|_p$ is the $\ell_p$-norm of the perturbation $\delta$.
- the possible values of p can include two or more of: 1, 2, or infinity.
- The adversarial training loss function that is associated with a given value of p can satisfy: $\max_{\|\delta\|_p \le \epsilon} L(f(\theta, x + \delta), y)$, where y is the target (“ground truth”) output for the input x, $\theta$ are the values of the network parameters, $\delta$ is a perturbation within the budget $\epsilon$, $f(\theta, x + \delta)$ is the network output generated by the neural network for the perturbed input $x + \delta$ given $\theta$, and L is a loss function for the target task, e.g., a cross-entropy loss function or other appropriate loss function.
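The norm-bounded threat model described above can be illustrated with a minimal NumPy sketch. The function names `project_lp` and `in_threat_set` are hypothetical and do not come from the specification, and only p = 2 and p = infinity are handled; projecting onto the l_1 ball needs a separate routine that is omitted here.

```python
import numpy as np

def project_lp(delta, eps, p):
    """Project a perturbation onto the l_p ball of radius eps (p = 2 or inf only)."""
    if p == np.inf:
        # Clip each element independently to [-eps, eps].
        return np.clip(delta, -eps, eps)
    if p == 2:
        norm = np.linalg.norm(delta)
        # Rescale only if the perturbation lies outside the ball.
        return delta if norm <= eps else delta * (eps / norm)
    raise ValueError("this sketch only supports p = 2 and p = inf")

def in_threat_set(x, x_perturbed, eps, p, tol=1e-9):
    """Return True if x_perturbed is x plus a perturbation with l_p norm at most eps."""
    delta = (x_perturbed - x).ravel()
    return bool(np.linalg.norm(delta, ord=p) <= eps + tol)

x = np.zeros(4)
raw = np.array([0.5, -0.2, 0.05, 0.0])  # an arbitrary, illustrative perturbation
x_inf = x + project_lp(raw, eps=0.1, p=np.inf)
x_l2 = x + project_lp(raw, eps=0.1, p=2)
```

An attack that searches for the worst-case perturbation would typically apply such a projection after every update so that the perturbed input stays inside the allowed set.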
- The set of adversarial training schemes 132 can also include a “nominal” scheme that does not apply any perturbation to the training inputs and simply trains using the loss function L. While the above describes the plurality of adversarial training schemes including $\ell_p$-norm based schemes, other adversarial training schemes can also be included in the set of schemes 132.
- the plurality of adversarial training schemes can include one or more schemes that have a corresponding loss function that has a regularizer that encourages the loss to behave linearly in the vicinity of the training data.
- Qin et al., Adversarial Robustness through Local Linearization, arXiv:1907.02610.
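As a rough illustration of what "behaving linearly in the vicinity of the training data" means, the sketch below measures how far a loss deviates from its first-order Taylor expansion around an input. It is not the procedure of the cited paper: the linear logistic model, the function names, and the use of random sampling in place of an inner maximization are all assumptions made for brevity.

```python
import numpy as np

def logistic_loss(w, x, y):
    """Loss L for a linear model with label y in {-1, +1}."""
    return np.log1p(np.exp(-y * np.dot(w, x)))

def input_gradient(w, x, y):
    """Analytic gradient of logistic_loss with respect to the input x."""
    return -y * w / (1.0 + np.exp(y * np.dot(w, x)))

def linearity_violation(w, x, y, eps, num_samples=64, seed=0):
    """Largest gap between the loss and its first-order Taylor expansion
    over randomly sampled l_inf perturbations of size at most eps."""
    rng = np.random.default_rng(seed)
    base, grad = logistic_loss(w, x, y), input_gradient(w, x, y)
    worst = 0.0
    for _ in range(num_samples):
        delta = rng.uniform(-eps, eps, size=x.shape)
        gap = abs(logistic_loss(w, x + delta, y) - base - np.dot(delta, grad))
        worst = max(worst, gap)
    return worst

w = np.array([1.0, -2.0])
x = np.array([0.5, 0.25])
v_small = linearity_violation(w, x, y=1.0, eps=0.05)
v_large = linearity_violation(w, x, y=1.0, eps=0.2)
```

A regularized training loss could then add a multiple of this violation to the task loss L, so that a smaller violation is rewarded during training.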
- each of the plurality of adversarial training schemes trains an instance 118 A-N of the neural network on a respective set of training data for the target task (with different schemes optionally having different respective sets of training data) using the adversarial training scheme to determine respective trained values 116A-N for each of the plurality of network parameters.
- each set of training data can be a respective subset of a larger set of training data 130 for the target task or can include a respective number of epochs of training on the set of training data 130.
- the system 100 trains each instance 118A-118N of the neural network from scratch, e.g., on the entire larger set of training data 130.
- the system 100 first trains one instance (“a first instance”) of the neural network from scratch or from a pre-trained checkpoint and then fine-tunes one or more of the other instances from the parameter values determined by training the first instance.
- Training the instances 118A-N of the neural network is described in more detail below.
- the system 100 obtains respective trained values 116A-N of the network parameters for each instance 118A-N and, therefore, respective trained values 116A-N of the network parameters corresponding to each of the adversarial training schemes.
- distribution shift may occur when inference-time images are drawn from a distribution that differs from the distribution of training images.
- the neural network may be trained on images of one real-world region and the inference images may be images of another region that is similar to the real-world region but has different properties, e.g., different objects, different lighting conditions, and so on.
- the neural network may be trained on images of one real-world region and the inference images may be images of the same real-world region but under different imaging conditions, e.g., different weather or lighting or other conditions.
- the training images may be images of one set of patients and the inference images may be images of another set of patients that has different characteristics from the training set.
- the training system 100 (or the inference system 170) “merges” the parameter values for the multiple different instances trained using the multiple different adversarial training schemes.
- the final neural network 118 smoothly trades off robustness to different adversaries by modifying how the parameter values are combined and without any additional training.
- the final neural network 118 can achieve robustness to many different threats without jointly training on all of them, thereby reducing training time.
- the resulting neural network 118 is more robust to a given adversary than the constituent instance specialized against that same adversary.
- the resulting neural network 118 is significantly more robust to a range of adversarial attacks than models trained using conventional techniques and can generalize to be robust to attacks that were unknown at training time.
- the neural network 118 can generalize to inference time and test time inputs that have a “distribution shift” relative to the training inputs used to train the neural network, improving the performance of the neural network 118 on a variety of real-world tasks where distribution shift may be likely and without requiring any re-training.
- the system 100 or the inference system 170 generates a final value 110 for the network parameter by combining the respective trained values 116A-N for the network parameter for each of the plurality of adversarial training schemes.
- the final values 110 are a “merged” version of the trained values 116A-N.
- the system 100 or the inference system 170 uses the neural network 118 in accordance with the final values 110 of the network parameters to perform the target task on new network inputs 112, provides the final values to another system for use in performing the target task on new network inputs, or both.
- FIG. 1B shows an example configuration system 180.
- the configuration system 180 can be implemented as part of the training system 100 or the inference system 170 of FIG. 1A.
- the configuration system 180 can determine new final network parameter values 110 to be used by the neural network 118 at deployment time without needing to retrain any of the instances 118A-N of the neural network 118.
- the configuration system 180 receives the trained network parameter values 116A-N for the instances 118A-N, e.g., generated as described above.
- the configuration system 180 also receives test data for the target task.
- the test data includes test examples that match the likely distribution of the inference inputs that will be processed by the neural network 118 after the neural network 118 has been deployed.
- the system 180 can obtain the test data 182 after the neural network 118 has already been deployed in a given environment, e.g., as a result of monitoring the inference inputs and determining that the final network parameters 110 need to be updated.
- The configuration system 180 uses the test data 182 and the trained network parameter values 116A-N to determine new final values 110 for the network parameters, i.e., to determine final values 110 that adapt the neural network 118 to the distribution represented by the test data 182, without retraining the neural network 118.
- Determining the new final values 110 is described in more detail below with reference to FIGS. 2-4.
- the inference system 170 or another system can then use the neural network 118 to perform inference in accordance with the new final values 110.
- FIG. 2 is a flow diagram of an example process 200 for training a neural network to be robust to adversarial attack.
- the process 200 will be described as being performed by a system of one or more computers located in one or more locations.
- A training system, e.g., the training system 100 of FIG. 1A, appropriately programmed, can perform the process 200.
- The system obtains data specifying a plurality of different adversarial training schemes (step 202).
- each adversarial training scheme trains the neural network to be robust to a different type of adversarial attack.
- The type of adversarial attack for the adversarial training scheme can be an $\ell_p$-norm bounded attack for a corresponding value of p, with each of two or more of the adversarial training schemes having a different corresponding value of p.
- the value of p can define the adversarial training loss function used to train the neural network under the adversarial training scheme.
- the system trains an instance of the neural network on a respective set of training data for the target task using the adversarial training scheme to determine respective trained values for each of the plurality of network parameters (step 204).
- each adversarial training scheme will generally be associated with a different adversarial training loss function from each other adversarial training scheme.
- The system trains the instance of the neural network on the loss function corresponding to the adversarial training scheme using an appropriate machine learning technique, e.g., a gradient-based technique with an appropriate optimizer, e.g., Adam, RMSProp, SGD, and so on.
- the system trains each of the instances independently, e.g., so that each instance is trained on the same set of training data, starting from randomly initialized values of the network parameters or from pre-trained values of the network parameters as generated by pre-training the neural network, e.g., by a different training system or using a different training objective.
- the system first trains the instance of the neural network using one adversarial training scheme and then uses the trained instance to “bootstrap” the training of the other instances using the other adversarial training schemes.
- After training the instances and for each of the plurality of network parameters, the system generates a final value for the network parameter by combining the respective trained values for the network parameter for each of the plurality of adversarial training schemes (step 206).
- the system combines the respective trained values so that the resulting neural network, i.e., that uses the final values of the network parameters, will have improved performance on the target task relative to any one of the trained instances.
- the system can determine a respective weight for each of the adversarial training schemes.
- the system can then, for each of the plurality of network parameters, compute a weighted sum of the respective trained values for the network parameter for each of the plurality of adversarial training schemes in accordance with the respective weights for each of the adversarial training schemes.
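The weighted sum described above can be sketched as follows. The dictionary layout, the parameter names, and the helper `merge_parameters` are hypothetical, and choosing weights that sum to one is an assumption of this example rather than a stated requirement.

```python
import numpy as np

def merge_parameters(trained_values, weights):
    """Weighted sum of per-scheme parameter values, one entry per network parameter."""
    names = trained_values[0].keys()
    return {
        name: sum(w * values[name] for w, values in zip(weights, trained_values))
        for name in names
    }

# Illustrative trained values for three schemes (made-up numbers).
theta_linf = {"dense/kernel": np.array([[1.0, 2.0]]), "dense/bias": np.array([0.0])}
theta_l2 = {"dense/kernel": np.array([[3.0, 0.0]]), "dense/bias": np.array([1.0])}
theta_l1 = {"dense/kernel": np.array([[-1.0, 4.0]]), "dense/bias": np.array([2.0])}

final_values = merge_parameters(
    [theta_linf, theta_l2, theta_l1], weights=[0.5, 0.25, 0.25]
)
# final_values["dense/kernel"] is [[1.0, 2.0]] and final_values["dense/bias"] is [0.75]
```

Because the merge is a plain element-wise operation on already-trained values, changing the weights yields a new set of final values without running any further training steps.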
- the system can use the final values to perform inference or can provide the final values to another system for use in performing inference. That is, the system (or the other system) can receive a new network input for the target task and then process the new network input using the neural network and in accordance with the final values of the network parameters, i.e., with the network parameters set to the final values, to generate network output for a target task for the new network input.
- FIG. 3 is a flow diagram of an example process 300 for training the instances of the neural network.
- the process 300 will be described as being performed by a system of one or more computers located in one or more locations.
- A training system, e.g., the training system 100 of FIG. 1A, appropriately programmed, can perform the process 300.
- the system obtains a set of training data for the target task (step 302).
- the set of training data generally includes multiple training examples, with each training example including a training input and a corresponding target output for the training input, i.e., the output that should be generated by performing the target task on the training input.
- the system also obtains initial values of the network parameters of the neural network (step 304).
- the system can initialize the initial values using a random parameter initialization technique, e.g., Glorot initialization, He initialization, or another parameter initialization technique.
- the neural network can have been pre-trained, e.g., through unsupervised learning or on another task, and the system can set the initial values equal to the pre-trained values.
- the system can select the first scheme at random from the set of multiple schemes.
- the system can receive an input identifying which scheme in the set of multiple schemes should be the first scheme.
- the system trains an instance of the neural network on a corresponding set of training data for the target task using the second adversarial training scheme and starting from the respective trained values of the network parameters for the first adversarial training scheme to determine respective trained values for each of the plurality of network parameters (step 304). That is, the system “fine-tunes” the instance of the neural network corresponding to the first scheme using the second scheme and starting from the trained values of the instance of the neural network corresponding to the first scheme, i.e., rather than from the randomly initialized or pre-trained values that were used at the beginning of the training of the instance of the neural network corresponding to the first scheme.
- The system trains the instance of the neural network (i) for fewer training iterations, (ii) on fewer training examples, or both, than were used in the training for the first adversarial training scheme. For example, the system can train each second instance for only one epoch or, more generally, fewer than five epochs while training the first instance for at least ten epochs.
- the system can train the first instance for ten training epochs, while training each of the other two instances for only three epochs starting from the trained values of the first instance.
- The system achieves comparable performance but only trains for sixteen total epochs (ten for the first instance plus three for each of the other two). Because training large neural networks is computationally expensive, the system makes the training significantly more computationally efficient while still obtaining instances that are high-performing with respect to their corresponding type of adversarial attack.
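The "train one instance, then fine-tune the others from it" schedule can be sketched on a toy problem. Everything below is an assumption made for illustration: the model is a linear classifier (for which the worst-case norm-bounded perturbation has a closed form), an "epoch" is a single full-batch gradient step, the initialization is all zeros rather than random, and the data are synthetic.

```python
import numpy as np

def worst_case_inputs(w, x, y, eps, p):
    """Exact worst-case l_p perturbation of each input for a linear classifier."""
    if p == np.inf:
        direction = np.sign(w)
    elif p == 2:
        direction = w / (np.linalg.norm(w) + 1e-12)
    else:
        raise ValueError("this sketch only supports p = 2 and p = inf")
    # Move each input against its own label along the most damaging direction.
    return x - eps * y[:, None] * direction[None, :]

def adversarial_train(w, b, x, y, eps, p, epochs, lr=0.1):
    """Full-batch gradient descent on the logistic loss of the perturbed inputs."""
    for _ in range(epochs):
        x_adv = worst_case_inputs(w, x, y, eps, p)
        margin = y * (x_adv @ w + b)
        dloss_dlogit = -y / (1.0 + np.exp(margin))
        w = w - lr * (x_adv.T @ dloss_dlogit) / len(y)
        b = b - lr * dloss_dlogit.mean()
    return w, b

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 5))
y = np.sign(x @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]))  # labels in {-1, +1}

# First instance: trained from the initial values for ten epochs.
w_inf, b_inf = adversarial_train(np.zeros(5), 0.0, x, y, eps=0.1, p=np.inf, epochs=10)
# Second instance: fine-tuned for three epochs from the first instance's values.
w_l2, b_l2 = adversarial_train(w_inf, b_inf, x, y, eps=0.1, p=2, epochs=3)
```

The second call starts from `w_inf` and `b_inf` rather than from the initial values, which is what lets it run for far fewer epochs than the first.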
- FIG. 4 is a flow diagram of an example process 400 for determining the final values of the network parameters.
- the process 400 will be described as being performed by a system of one or more computers located in one or more locations.
- A training system, e.g., the training system 100 of FIG. 1A, or an inference system, e.g., the inference system 170 of FIG. 1A, appropriately programmed, can perform the process 400.
- the system obtains respective trained values of the network parameters for each of the multiple adversarial training schemes (step 402), e.g., as determined by performing the training described above with reference to FIGS. 2 and 3.
- the system obtains test data for the target task (step 404).
- the distribution of the network inputs in the test data can differ from a distribution of network inputs in the respective sets of training data for the adversarial training schemes.
- the system determines, using the test data, a respective weight for each of the adversarial training schemes (step 406).
- the system can use the test data to determine the weights for each of the adversarial training schemes in any of a variety of ways.
- the system can determine a plurality of candidate sets of weights, and for each of the plurality of candidate sets of weights, generate, using the candidate set of weights, respective candidate final parameter values for the network parameters.
- The system can determine a performance metric on the test data of an instance of the neural network having the candidate final parameter values and select one of the candidate sets of weights based on the performance metrics, e.g., by selecting the candidate set of weights that has the best performance metric.
- the performance metric can measure an accuracy on the test data of the instance of the neural network having the candidate final parameter values.
- the system can use the test data to adapt the neural network to perform better on inputs having the distribution reflected by the test data.
- the performance metric can measure a robustness of the instance of the neural network having the candidate final parameter values to one or more particular types of adversarial attack on network inputs in the test data.
- one or more of the particular types of adversarial attack are different from the type of adversarial attack for any of the plurality of adversarial training schemes.
- The system can use the test data to adapt the neural network to perform better on one or more new types of adversarial attacks that were not encountered during training.
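A robustness metric of this kind can be sketched for the special case of a linear classifier, where the strongest norm-bounded attack is known in closed form. This is an illustrative assumption: for a deep neural network the worst case generally has to be approximated by running an attack on each test input. The function name and the data are hypothetical.

```python
import numpy as np

# Dual norm orders: an l_p attack on a linear model is governed by the l_q norm
# of the weights, where 1/p + 1/q = 1.
DUAL_ORDER = {np.inf: 1, 2: 2, 1: np.inf}

def robust_accuracy(w, b, x, y, eps, p):
    """Fraction of examples a linear classifier still gets right under the
    worst-case l_p perturbation of size eps (labels y are in {-1, +1})."""
    clean_margin = y * (x @ w + b)
    # The strongest attack lowers every margin by eps times the dual norm of w.
    worst_margin = clean_margin - eps * np.linalg.norm(w, ord=DUAL_ORDER[p])
    return float(np.mean(worst_margin > 0))

w, b = np.array([1.0, -1.0]), 0.0
x = np.array([[2.0, 0.0], [0.3, 0.0], [0.0, 1.5], [-1.0, 0.2]])
y = np.array([1.0, 1.0, -1.0, -1.0])

clean = robust_accuracy(w, b, x, y, eps=0.0, p=np.inf)     # 1.0
under_linf = robust_accuracy(w, b, x, y, eps=0.2, p=np.inf)  # 0.75
under_l2 = robust_accuracy(w, b, x, y, eps=0.2, p=2)         # 1.0
```

The same candidate parameter values can score differently under different attack types, which is why the choice of metric influences which set of weights is selected.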
- the system can determine the candidate sets of weights using any appropriate type of technique for searching through the space of possible sets of weights, e.g., grid search, random search, evolutionary search, gradient-descent based search, and so on.
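A grid search over candidate weights can be sketched as below for the case of two trained instances. The helper names, the flattened parameter vectors, the toy test set, and the use of plain accuracy as the performance metric are all assumptions of this example.

```python
import numpy as np

def merge(theta_a, theta_b, w):
    """Candidate final values: weight w on instance a and (1 - w) on instance b."""
    return w * theta_a + (1.0 - w) * theta_b

def accuracy(theta, x, y):
    """Accuracy of a linear classifier with parameters theta (labels in {-1, +1})."""
    return float(np.mean(np.sign(x @ theta) == y))

def select_weight(theta_a, theta_b, x_test, y_test, candidates):
    """Evaluate every candidate weight on the test data and keep the best one."""
    scores = [accuracy(merge(theta_a, theta_b, w), x_test, y_test) for w in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]

theta_a = np.array([1.0, 0.0])  # illustrative trained values for one scheme
theta_b = np.array([0.0, 1.0])  # illustrative trained values for another scheme
x_test = np.array([[1.0, -0.5], [-0.5, 1.0], [-1.0, -1.0], [1.0, 1.0]])
y_test = np.array([1.0, 1.0, -1.0, 1.0])

best_w, best_score = select_weight(
    theta_a, theta_b, x_test, y_test, candidates=[0.0, 0.25, 0.5, 0.75, 1.0]
)
# best_w is 0.5 with accuracy 1.0; either instance alone reaches only 0.75
```

No gradient step is taken anywhere in the search: each candidate only requires forming the merged values and evaluating them on the test data.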
- the system generates final values for the network parameters using the respective weights (step 408).
- For each of the plurality of network parameters, the system generates a final value for the network parameter by computing a weighted sum of the respective trained values for the network parameter for each of the plurality of adversarial training schemes in accordance with the respective weights for the adversarial training schemes.
- FIG. 5 shows an example 500 of the results achieved by making use of the described techniques on eight different image classification tasks.
- the soups are at least comparable to the baselines on each of the tasks and, for some of the tasks, show significant improvement in accuracy without requiring any additional training relative to any of the baselines. Moreover, in addition to the improved accuracy shown in FIG. 5, the soups are significantly more robust to a wide range of adversarial attacks relative to any of the baselines.
- FIG. 6 shows an example 600 of the robust accuracy of the described techniques for various weights.
- FIG. 6 shows an example 600 of the $\ell_\infty$ robust accuracy on two image classification tasks (CIFAR-10 and ImageNet) of a “soup” that includes a combination of two instances of the neural network, one trained from scratch using the $\ell_\infty$-norm attack (referred to in FIG. 6 as $\theta^{\infty}$) and the other fine-tuned on the $\ell_2$-norm attack starting from the trained values of the instance that was trained on the $\ell_\infty$-norm attack (referred to in FIG. 6 as $\theta^{\infty \to 2}$).
- The example 600 plots the weight w assigned to the values of the network parameters of the instance trained using the $\ell_\infty$-norm attack, with the values of the network parameters of the instance trained using the $\ell_2$-norm attack being assigned a weight of (1 - w).
- When w is equal to one, the soup consists only of the instance trained using the $\ell_\infty$-norm attack, since the weight assigned to the instance trained using the $\ell_2$-norm attack is zero.
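Written out, the two-instance soup described for FIG. 6 is a linear interpolation between the two sets of trained values. In the expression below, $\theta_{\text{soup}}$ is a hypothetical name for the merged parameter values, $\theta^{\infty}$ and $\theta^{\infty \to 2}$ denote the values of the instance trained with the $\ell_\infty$-norm attack and of the instance fine-tuned with the $\ell_2$-norm attack, and restricting w to the interval from zero to one is an assumption.

```latex
\theta_{\text{soup}}(w) = w\,\theta^{\infty} + (1 - w)\,\theta^{\infty \to 2},
\qquad 0 \le w \le 1,
\qquad \theta_{\text{soup}}(1) = \theta^{\infty},
\qquad \theta_{\text{soup}}(0) = \theta^{\infty \to 2}
```

Intermediate values of w mix the two instances element by element, which is the regime in which the robust accuracy can be compared against that of either endpoint.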
- the soup exceeds the performance of the instance trained from scratch using the $\ell_\infty$-norm attack on both tasks in terms of being robust to the $\ell_\infty$-norm attack.
- the inclusion of the other instance in the soup helps the soup be more robust to the $\ell_\infty$-norm attack, even though the other instance is not trained to specifically counteract the $\ell_\infty$-norm attack.
- This specification uses the term “configured” in connection with systems and computer program components.
- For a system of one or more computers to be configured to perform particular operations or actions means that the system has installed on it software, firmware, hardware, or a combination of them that in operation cause the system to perform the operations or actions.
- For one or more computer programs to be configured to perform particular operations or actions means that the one or more programs include instructions that, when executed by data processing apparatus, cause the apparatus to perform the operations or actions.
- Embodiments of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly-embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
- Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory storage medium for execution by, or to control the operation of, data processing apparatus.
- the computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
- the program instructions can be encoded on an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus for execution by a data processing apparatus.
- a program can be stored in a portion of a file that holds other programs or data, e.g., one or more scripts stored in a markup language document, in a single file dedicated to the program in question, or in multiple coordinated files, e.g., files that store one or more modules, sub-programs, or portions of code.
- a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a data communication network.
- the processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform functions by operating on input data and generating output.
- the processes and logic flows can also be performed by special purpose logic circuitry, e.g., an FPGA or an ASIC, or by a combination of special purpose logic circuitry and one or more programmed computers.
- Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
- embodiments of the subject matter described in this specification can be implemented on a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer.
- Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input.
- Machine learning models can be implemented and deployed using a machine learning framework, e.g., a TensorFlow framework or a JAX framework.
- Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front end component, e.g., a client computer having a graphical user interface, a web browser, or an app through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back end, middleware, or front end components.
- The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (LAN) and a wide area network (WAN), e.g., the Internet.
- The computing system can include clients and servers.
- A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- A server transmits data, e.g., an HTML page, to a user device, e.g., for purposes of displaying data to and receiving user input from a user interacting with the device, which acts as a client.
- Data generated at the user device, e.g., a result of the user interaction, can be received at the server from the device.
- Example 1 A method of training a neural network having a plurality of network parameters to perform a target task, the method comprising: obtaining data specifying a plurality of different adversarial training schemes, wherein each adversarial training scheme trains the neural network to be robust to a different type of adversarial attack; for each of the plurality of adversarial training schemes: training an instance of the neural network on a respective set of training data for the target task using the adversarial training scheme to determine respective trained values for each of the plurality of network parameters; and for each of the plurality of network parameters, generating a final value for the network parameter by combining the respective trained values for the network parameter for each of the plurality of adversarial training schemes.
- Example 2 The method of example 1, wherein each adversarial training scheme has a corresponding loss function that is different from each other adversarial training scheme and wherein training an instance of the neural network on a respective set of training data for the target task using the adversarial training scheme to determine respective trained values for each of the plurality of network parameters comprises training the instance of the neural network on the loss function corresponding to the adversarial training scheme.
- Example 3 The method of example 1 or example 2, wherein, for each of two or more of the adversarial training schemes, the type of adversarial attack for the adversarial training scheme is an ℓp-norm bounded attack for a corresponding value of p, and wherein each of the two or more of the adversarial training schemes has a different corresponding value of p.
- Example 4 The method of any one of examples 1-3, wherein, for a first adversarial training scheme of the plurality of adversarial training schemes, training an instance of the neural network on a respective set of training data for the target task using the first adversarial training scheme to determine respective trained values for each of the plurality of network parameters comprises: training the instance of the neural network on the respective set of training data for the target task using the first adversarial training scheme and starting from initial values of the network parameters to determine respective trained values for each of the plurality of network parameters.
- Example 5 The method of example 4, wherein the initial values of the network parameters are determined using a random parameter initialization technique.
- Example 6 The method of example 4, wherein the initial values of the network parameters are determined by pre-training the neural network.
- Example 8 The method of example 7, wherein training the instance of the neural network on the respective set of training data for the target task using the second adversarial training scheme comprises training the instance of the neural network for (i) fewer training iterations, (ii) on fewer training examples, or both than were used in the training for the first adversarial training scheme.
- Example 9 The method of any preceding example, further comprising: determining a respective weight for each of the adversarial training schemes, wherein for each of the plurality of network parameters, generating a final value for the network parameter by combining the respective trained values for the network parameter for each of the plurality of adversarial training schemes comprises: computing a weighted sum of the respective trained values for the network parameter for each of the plurality of adversarial training schemes in accordance with the respective weights for each of the adversarial training schemes.
- Example 11 The method of example 10, wherein a distribution of the network inputs in the test data differs from a distribution of network inputs in the respective sets of training data for the adversarial training schemes.
- Example 13 The method of example 12, wherein the performance metric measures an accuracy on the test data of the instance of the neural network having the candidate final parameter values.
- Example 14 The method of example 12, wherein the performance metric measures a robustness of the instance of the neural network having the candidate final parameter values to one or more particular types of adversarial attack on network inputs in the test data.
- Example 15 The method of example 14, wherein one or more of the particular types of adversarial attack are different from the type of adversarial attack for any of the plurality of adversarial training schemes.
- Example 17 A method of configuring a neural network having a plurality of network parameters to perform a target task, the method comprising: obtaining data specifying, for each of a plurality of different adversarial training schemes, respective trained values for each of the plurality of network parameters, wherein each adversarial training scheme trains the neural network to be robust to a different type of adversarial attack, and wherein the respective trained values for each of the plurality of network parameters for each adversarial training scheme have been determined by training an instance of the neural network on a respective set of training data for the target task using the adversarial training scheme; obtaining test data for the target task; determining, using the test data, a respective weight for each of the adversarial training schemes; and for each of the plurality of network parameters, generating a final value for the network parameter by computing a weighted sum of the respective trained values for the network parameter for each of the plurality of adversarial training schemes in accordance with the respective weights for the adversarial training schemes.
- Example 18 A method performed by one or more computers, the method comprising: receiving a new network input; and processing the new network input using a neural network in accordance with final values of a plurality of network parameters of the neural network to generate network output for a target task for the new network input, wherein the final values of the plurality of network parameters have been generated by performing the operations of the respective method of any preceding example.
- Example 19 The method of example 18, wherein: the new network input comprises an image and the network output comprises classification data, the classification data comprising a respective score for each of a plurality of categories; or the new network input comprises an image and the network output comprises object detection data, the object detection data comprising an identification of a position of an object within the image; or the new network input comprises an image and the network output comprises segmentation data, the segmentation data comprising, for at least one pixel of the input image, a category from a set of possible categories that a scene depicted at the at least one pixel belongs to; or the new network input comprises sensor data characterizing a state of an environment being interacted with by an agent, and the network output comprises control policy data, the control policy data comprising a control policy for controlling the agent.
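
The parameter combination recited in Examples 1, 9 and 17 can be illustrated with a minimal sketch. It is not the applicant's implementation. It assumes a hypothetical two-parameter linear classifier, made-up trained values standing in for three adversarial training schemes (the names `linf`, `l2` and `l1` are illustrative only), accuracy as the performance metric, and a simple grid search over weights that sum to 1 as one possible way of determining the weights from test data. The examples above do not prescribe any particular search procedure.

```python
from itertools import product


def merge_parameters(trained, weights):
    """Weighted sum of the per-scheme trained values, one parameter at a time."""
    names = next(iter(trained.values())).keys()
    return {
        name: sum(weights[scheme] * trained[scheme][name] for scheme in trained)
        for name in names
    }


def predict(params, x):
    """Hypothetical linear classifier: label 1 if w * x + b > 0, else 0."""
    return 1 if params["w"] * x + params["b"] > 0 else 0


def accuracy(params, test_data):
    """Fraction of (input, label) pairs in the test data classified correctly."""
    correct = sum(predict(params, x) == y for x, y in test_data)
    return correct / len(test_data)


def select_weights(trained, test_data, steps=10):
    """Assumed procedure: grid search over weights summing to 1, keeping the
    candidate whose merged parameters score best on the test data."""
    schemes = list(trained)
    best_score, best_weights = None, None
    for combo in product(range(steps + 1), repeat=len(schemes)):
        if sum(combo) != steps:
            continue  # keep only weight vectors that sum to 1
        weights = {s: c / steps for s, c in zip(schemes, combo)}
        candidate = merge_parameters(trained, weights)
        score = accuracy(candidate, test_data)
        if best_score is None or score > best_score:
            best_score, best_weights = score, weights
    return best_weights, best_score


# Made-up trained values, standing in for instances of the same network
# trained under three different adversarial training schemes.
trained = {
    "linf": {"w": 1.0, "b": -2.0},
    "l2": {"w": 1.0, "b": 0.0},
    "l1": {"w": 1.0, "b": 2.0},
}

# Made-up test data for the target task: (network input, label) pairs.
test_data = [(-1.5, 0), (-0.5, 0), (0.5, 1), (1.5, 1)]

weights, score = select_weights(trained, test_data)
final_params = merge_parameters(trained, weights)
```

In a real network each parameter would be a tensor rather than a scalar, but the merge remains the same element-wise weighted sum. The metric could equally be a robustness measure, as in Example 14, instead of accuracy.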
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202263424770P | 2022-11-11 | 2022-11-11 | |
| PCT/EP2023/081621 WO2024100305A1 (fr) | 2022-11-11 | 2023-11-13 | Merging adversarially-robust neural networks |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| EP4599366A1 (fr) | 2025-08-13 |
Family
ID=88833856
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23805946.3A Pending EP4599366A1 (fr) | 2022-11-11 | 2023-11-13 | Merging adversarially-robust neural networks |
Country Status (3)
| Country | Link |
|---|---|
| EP (1) | EP4599366A1 (fr) |
| CN (1) | CN120188169A (fr) |
| WO (1) | WO2024100305A1 (fr) |
- 2023
  - 2023-11-13 WO PCT/EP2023/081621 patent/WO2024100305A1/fr not_active Ceased
  - 2023-11-13 CN CN202380078266.5A patent/CN120188169A/zh active Pending
  - 2023-11-13 EP EP23805946.3A patent/EP4599366A1/fr active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| WO2024100305A1 (fr) | 2024-05-16 |
| CN120188169A (zh) | 2025-06-20 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Shen et al. | | BBAS: Towards large scale effective ensemble adversarial attacks against deep neural network learning |
| US12333433B2 (en) | | Training neural networks using priority queues |
| US11775830B2 (en) | | Training more secure neural networks by using local linearity regularization |
| US11341364B2 (en) | | Using simulation and domain adaptation for robotic control |
| EP4425383B1 (fr) | | Neural network system |
| CN111727441A (zh) | | Neural network system implementing conditional neural processes for efficient learning |
| US20220156585A1 (en) | | Training point cloud processing neural networks using pseudo-element - based data augmentation |
| WO2018093926A1 (fr) | | Semi-supervised training of neural networks |
| WO2019202073A1 (fr) | | Neural networks for scalable continual learning in domains with sequentially learned tasks |
| CN110546653A (zh) | | Action selection for reinforcement learning using neural networks |
| US20250182439A1 (en) | | Unsupervised learning of object keypoint locations in images through temporal transport or spatio-temporal transport |
| CN121809726A (zh) | | A method for adversarial attack and for generating adversarial examples |
| US20240338387A1 (en) | | Input data item classification using memory data item embeddings |
| CN120937017A (zh) | | Multimodal neural network with a decoder-only language model |
| US11354574B2 (en) | | Increasing security of neural networks by discretizing neural network inputs |
| US11676033B1 (en) | | Training machine learning models to be robust against label noise |
| Yin et al. | | Adversarial attack, defense, and applications with deep learning frameworks |
| EP4599366A1 (fr) | | Merging adversarially-robust neural networks |
| EP4695778A1 (fr) | | Performing image processing tasks based on demonstration examples |
| US20250139959A1 (en) | | Detecting objects in images by generating sequences of tokens |
| CN116992937A (zh) | | Repair method for a neural network model and related device |
| US20250292099A1 (en) | | System and Method for Transformation of Discrete Input for Adversarial Robustness |
| Canady et al. | | Applying DDDAS Principles for Realizing Optimized and Robust Deep Learning Models at the Edge |
| Ranjie | | Adversarial Attacks Against DNNs Towards Real-World Threat |
| Attar et al. | | Multi-label Object Detection Using Multi-model R-CNN |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: UNKNOWN |
| | STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
| | PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
| | STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
| | 17P | Request for examination filed | Effective date: 20250509 |
| | AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR |
| | RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: GDM HOLDING LLC |
| | DAV | Request for validation of the European patent (deleted) | |
| | DAX | Request for extension of the European patent (deleted) | |