WO2025006729A2 - System, method, and computer program product for incremental learning - Google Patents

System, method, and computer program product for incremental learning

Info

Publication number
WO2025006729A2
Authority
WO
WIPO (PCT)
Prior art keywords
machine
learning model
model
requests
replace
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2024/035789
Other languages
English (en)
Other versions
WO2025006729A3 (fr
Inventor
Runxin HE
Mingji Lou
Nicholas Stephen KERSTING
Songshan LI
Iat Kei HO
Raphael OKOCHU
Yu Gu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Visa International Service Association
Original Assignee
Visa International Service Association
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Visa International Service Association
Priority to EP24832921.1A (EP4736072A2)
Priority to CN202480043551.8A (CN121548827A)
Publication of WO2025006729A2
Publication of WO2025006729A3

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Definitions

  • This disclosure relates generally to machine-learning and, in some non-limiting embodiments or aspects, to systems, methods, and computer program products for incremental learning for a model in a production environment.
  • Inference platforms that monitor a machine-learning model in production have several technical limitations. For example, training is a separate process, and retraining the model may take a considerable amount of resources (e.g., memory, processing, time, etc.). The production and training environments each carry their own workloads and overheads. Further, during training of such models there are limited opportunities for production tests. Moreover, the training process is often slow to respond to changes in data distribution that arise while inferences are generated in production.
  • a system comprising: at least one data storage device; and at least one processor programmed or configured to: execute a first machine-learning model for each input of a plurality of inputs associated with a plurality of requests in a production environment, the first machine-learning model configured to output an inference for each input; store, in the at least one data storage device, model data for each execution of the first machine-learning model; determine to train the first machine-learning model based on at least one rule; in response to determining to train the first machine-learning model, creating a second machine-learning model comprising weights from the first machine-learning model; train the second machine-learning model with the model data stored in the at least one data storage device; determine whether to replace the first machine-learning model with the second machine-learning model; and in response to determining to replace the first machine-learning model with the second machine-learning model, replace the first machine-learning model with the second machine-learning model in the production environment such that a next plurality of requests are input to
  • determining whether to replace the first machine-learning model with the second machine-learning model comprises comparing the first machine-learning model to the second machine-learning model. In non-limiting embodiments or aspects, determining whether to replace the first machine-learning model with the second machine-learning model is based on at least one of computation efficiency and accuracy.
  • the at least one rule is based on at least one of model score and feature distribution.
  • the inference comprises a score.
  • a computer-implemented method comprising: executing, with at least one processor, a first machine-learning model for each input of a plurality of inputs associated with a plurality of requests in a production environment, the first machine-learning model configured to output an inference for each input; storing, in at least one data storage device, model data for each execution of the first machine-learning model; determining, with at least one processor, to train the first machine-learning model based on at least one rule; in response to determining to train the first machine-learning model, creating a second machine-learning model comprising weights from the first machine-learning model; training, with at least one processor, the second machine-learning model with the model data stored in the at least one data storage device; determining, with at least one processor, whether to replace the first machine
  • the plurality of requests are processed as a batch by executing the first machine-learning model for each input of the plurality of inputs associated with the plurality of requests.
  • the plurality of requests are processed by a batch processor, and the method includes generating a dashboard interface configured to communicate with the batch processor.
  • determining whether to replace the first machine-learning model with the second machine-learning model comprises comparing the first machine-learning model to the second machine-learning model. In non-limiting embodiments or aspects, determining whether to replace the first machine-learning model with the second machine-learning model is based on at least one of computation efficiency and accuracy.
  • the at least one rule is based on at least one of model score and feature distribution.
  • the inference comprises a score.
  • a computer program product comprising at least one non-transitory computer-readable medium including program instructions that, when executed by at least one processor, cause the at least one processor to: execute a first machine-learning model for each input of a plurality of inputs associated with a plurality of requests in a production environment, the first machine-learning model configured to output an inference for each input; store, in at least one data storage device, model data for each execution of the first machine-learning model; determine to train the first machine-learning model based on at least one rule; in response to determining to train the first machine-learning model, create a second machine-learning model comprising weights from the first machine-learning model; train the second machine-learning model with the model data stored in the at least one data storage device; determine whether to replace the first machine-learning model
  • a system comprising: at least one data storage device; and at least one processor programmed or configured to: execute a first machine-learning model for each input of a plurality of inputs associated with a plurality of requests in a production environment, the first machine-learning model configured to output an inference for each input; store, in the at least one data storage device, model data for each execution of the first machine-learning model; determine to train the first machine-learning model based on at least one rule; in response to determining to train the first machine-learning model, creating a second machine-learning model comprising weights from the first machine-learning model; train the second machine-learning model with the model data stored in the at least one data storage device; determine whether to replace the first machine-learning model with the second machine-learning model; and in response to determining to replace the first machine-learning model with the second machine-learning model, replace the first machine-learning model with the second machine-learning model in the production environment such that a next plurality of requests are input to the second machine-learning model.
  • Clause 3 The system of clause 1 or 2, wherein the plurality of requests are processed by a batch processor, and wherein the at least one processor is further programmed or configured to generate a dashboard interface configured to communicate with the batch processor.
  • Clause 4 The system of any of clauses 1-3, wherein determining whether to replace the first machine-learning model with the second machine-learning model comprises comparing the first machine-learning model to the second machine-learning model.
  • Clause 5 The system of any of clauses 1-4, wherein determining whether to replace the first machine-learning model with the second machine-learning model is based on at least one of computation efficiency and accuracy.
  • Clause 6 The system of any of clauses 1-5, wherein the at least one rule is based on at least one of model score and feature distribution.
  • Clause 7 The system of any of clauses 1-6, wherein the inference comprises a score.
  • a computer-implemented method comprising: executing, with at least one processor, a first machine-learning model for each input of a plurality of inputs associated with a plurality of requests in a production environment, the first machine-learning model configured to output an inference for each input; storing, in at least one data storage device, model data for each execution of the first machine-learning model; determining, with at least one processor, to train the first machine-learning model based on at least one rule; in response to determining to train the first machine-learning model, creating a second machine-learning model comprising weights from the first machine-learning model; training, with at least one processor, the second machine-learning model with the model data stored in the at least one data storage device; determining, with at least one processor, whether to replace the first machine-learning model with the second machine-learning model; and in response to determining to replace the first machine-learning model with the second machine-learning model, replacing the first machine-learning model with the second machine-learning model in the production environment such that a next
  • Clause 9 The method of clause 8, where the plurality of requests are processed as a batch by executing the first machine-learning model for each input of the plurality of inputs associated with the plurality of requests.
  • Clause 10 The method of clause 8 or 9, wherein the plurality of requests are processed by a batch processor, further comprising generating a dashboard interface configured to communicate with the batch processor.
  • Clause 11 The method of any of clauses 8-10, wherein determining whether to replace the first machine-learning model with the second machine-learning model comprises comparing the first machine-learning model to the second machine-learning model.
  • Clause 12 The method of any of clauses 8-11, wherein determining whether to replace the first machine-learning model with the second machine-learning model is based on at least one of computation efficiency and accuracy.
  • Clause 13 The method of any of clauses 8-12, wherein the at least one rule is based on at least one of model score and feature distribution.
  • Clause 14 The method of any of clauses 8-13, wherein the inference comprises a score.
  • a computer program product comprising at least one non-transitory computer-readable medium including program instructions that, when executed by at least one processor, cause the at least one processor to: execute a first machine-learning model for each input of a plurality of inputs associated with a plurality of requests in a production environment, the first machine-learning model configured to output an inference for each input; store, in at least one data storage device, model data for each execution of the first machine-learning model; determine to train the first machine-learning model based on at least one rule; in response to determining to train the first machine-learning model, create a second machine-learning model comprising weights from the first machine-learning model; train the second machine-learning model with the model data stored in the at least one data storage device; determine whether to replace the first machine-learning model with the second machine-learning model; and in response to determining to replace the first machine-learning model with the second machine-learning model, replace the first machine-learning model with the second machine-learning model in the production environment such that a next
  • Clause 16 The computer program product of clause 15, where the plurality of requests are processed as a batch by executing the first machine-learning model for each input of the plurality of inputs associated with the plurality of requests.
  • Clause 17 The computer program product of clause 15 or 16, wherein the plurality of requests are processed by a batch processor, and wherein the at least one processor is further programmed or configured to generate a dashboard interface configured to communicate with the batch processor.
  • Clause 18 The computer program product of any of clauses 15-17, wherein determining whether to replace the first machine-learning model with the second machine-learning model comprises comparing the first machine-learning model to the second machine-learning model.
  • Clause 19 The computer program product of any of clauses 15-18, wherein determining whether to replace the first machine-learning model with the second machine-learning model is based on at least one of computation efficiency and accuracy.
  • Clause 20 The computer program product of any of clauses 15-19, wherein the at least one rule is based on at least one of model score and feature distribution.
  • FIG. 1 is a schematic diagram of a system for incremental learning according to some non-limiting embodiments or aspects.
  • FIG. 2 is a schematic diagram of a system for incremental learning according to some non-limiting embodiments or aspects.
  • FIG. 3 is a flow diagram of a method for incremental learning according to some non-limiting embodiments or aspects.
  • FIG. 4 is a schematic diagram of example components of one or more devices according to some non-limiting embodiments or aspects.
  • The term “account identifier” may include one or more primary account numbers (PANs), tokens, or other identifiers associated with a customer account.
  • The term “token” may refer to an identifier that is used as a substitute or replacement identifier for an original account identifier, such as a PAN.
  • Account identifiers may be alphanumeric or any combination of characters and/or symbols.
  • Tokens may be associated with a PAN or other original account identifier in one or more data structures (e.g., one or more databases, and/or the like) such that they may be used to conduct a transaction without directly using the original account identifier.
  • an original account identifier such as a PAN, may be associated with a plurality of tokens for different individuals or purposes.
  • An “application program interface” refers to computer code or other data stored on a computer-readable medium that may be executed by a processor to facilitate the interaction between software components, such as a client-side front-end and/or server-side back-end for receiving data from the client.
  • An “interface” refers to a generated display, such as one or more graphical user interfaces (GUIs) with which a user may interact, either directly or indirectly (e.g., through a keyboard, mouse, etc.).
  • the term “communication” may refer to the reception, receipt, transmission, transfer, provision, and/or the like of data (e.g., information, signals, messages, instructions, commands, and/or the like).
  • For one unit (e.g., a device, a system, a component of a device or system, combinations thereof, and/or the like) to be in communication with another unit means that the one unit is able to directly or indirectly receive information from and/or transmit information to the other unit.
  • This may refer to a direct or indirect connection (e.g., a direct communication connection, an indirect communication connection, and/or the like) that is wired and/or wireless in nature.
  • two units may be in communication with each other even though the information transmitted may be modified, processed, relayed, and/or routed between the first and second unit.
  • a first unit may be in communication with a second unit even though the first unit passively receives information and does not actively transmit information to the second unit.
  • a first unit may be in communication with a second unit if at least one intermediary unit processes information received from the first unit and communicates the processed information to the second unit.
  • the term “computing device” may refer to one or more electronic devices configured to process data.
  • a computing device may, in some examples, include the necessary components to receive, process, and output data, such as a processor, a display, a memory, an input device, a network interface, and/or the like.
  • a computing device may be a mobile device.
  • a mobile device may include a cellular phone (e.g., a smartphone or standard cellular phone), a portable computer, a wearable device (e.g., watches, glasses, lenses, clothing, and/or the like), a personal digital assistant (PDA), and/or other like devices.
  • a computing device may also be a desktop computer or other form of non-mobile computer.
  • The term “issuer institution” may refer to one or more entities, such as a bank, that provide accounts to customers for conducting transactions (e.g., payment transactions), such as initiating credit and/or debit payments.
  • An issuer institution may provide an account identifier, such as a PAN, to a customer that uniquely identifies one or more accounts associated with that customer.
  • the account identifier may be embodied on a portable financial device, such as a physical financial instrument, e.g., a payment card, and/or may be electronic and used for electronic payments.
  • The term “issuer system” refers to one or more computer devices operated by or on behalf of an issuer institution, such as a server computer executing one or more software applications.
  • an issuer system may include one or more authorization servers for authorizing a transaction.
  • the term “merchant” may refer to an individual or entity that provides goods and/or services, or access to goods and/or services, to customers based on a transaction, such as a payment transaction.
  • the term “merchant” or “merchant system” may also refer to one or more computer systems operated by or on behalf of a merchant, such as a server computer executing one or more software applications.
  • The term “client device” may refer to one or more client-side devices or systems (e.g., remote from a transaction service provider) used to initiate or facilitate a transaction (e.g., a payment transaction).
  • A “client device” may refer to one or more POS devices used by a merchant, one or more acquirer host computers used by an acquirer, one or more mobile devices used by a user, and/or the like.
  • a client device may be an electronic device configured to communicate with one or more networks and initiate or facilitate transactions.
  • a client device may include one or more computers, portable computers, laptop computers, tablet computers, mobile devices, cellular phones, wearable devices (e.g., watches, glasses, lenses, clothing, and/or the like), PDAs, and/or the like.
  • a “client” may also refer to an entity (e.g., a merchant, an acquirer, and/or the like) that owns, utilizes, and/or operates a client device for initiating transactions (e.g., for initiating transactions with a transaction service provider).
  • the term “payment device” may refer to a payment card (e.g., a credit or debit card), a gift card, a smartcard, smart media, a payroll card, a healthcare card, a wristband, a machine-readable medium containing account information, a keychain device or fob, an RFID transponder, a retailer discount or loyalty card, a cellular phone, an electronic wallet mobile application, a personal digital assistant (PDA), a pager, a security card, a computing device, an access card, a wireless terminal, a transponder, and/or the like.
  • the payment device may include volatile or non-volatile memory to store information (e.g., an account identifier, a name of the account holder, and/or the like).
  • the term “payment gateway” may refer to an entity and/or a payment processing system operated by or on behalf of such an entity (e.g., a merchant service provider, a payment service provider, a payment facilitator, a payment facilitator that contracts with an acquirer, a payment aggregator, and/or the like), which provides payment services (e.g., transaction service provider payment services, payment processing services, and/or the like) to one or more merchants.
  • the payment services may be associated with the use of payment devices managed by a transaction service provider.
  • the term “payment gateway system” may refer to one or more computer systems, computer devices, servers, groups of servers, and/or the like, operated by or on behalf of a payment gateway.
  • the term “server” may refer to or include one or more computing devices that are operated by or facilitate communication and processing for multiple parties in a network environment, such as the internet, although it will be appreciated that communication may be facilitated over one or more public or private network environments and that various other arrangements are possible. Further, multiple computing devices (e.g., servers, point-of-sale (POS) devices, mobile devices, etc.) directly or indirectly communicating in the network environment may constitute a “system.”
  • Reference to “a server” or “a processor,” as used herein, may refer to a previously-recited server and/or processor that is recited as performing a previous step or function, a different server and/or processor, and/or a combination of servers and/or processors.
  • a first server and/or a first processor that is recited as performing a first step or function may refer to the same or different server and/or a processor recited as performing a second step or function.
  • The term “transaction service provider” may refer to an entity that receives transaction authorization requests from merchants or other entities and provides guarantees of payment, in some cases through an agreement between the transaction service provider and an issuer institution.
  • a transaction service provider may include a payment network such as Visa® or any other entity that processes transactions.
  • The term “transaction processing system” may refer to one or more computer systems operated by or on behalf of a transaction service provider, such as a transaction processing server executing one or more software applications.
  • a transaction processing server may include one or more processors and, in some non-limiting embodiments or aspects, may be operated by or on behalf of a transaction service provider.
  • Non-limiting embodiments or aspects of the disclosed subject matter are directed to systems, methods, and computer program products for incremental learning.
  • Non-limiting embodiments allow for an inference model to be trained while it is being used to execute real-time and/or batched processing requests.
  • non-limiting embodiments allow for a model to be trained while deployed in a production environment, using smaller subsets of data each time as opposed to training at a longer interval with a larger amount of data.
  • Non-limiting embodiments provide for the model improvement in a production environment, thus avoiding disruption of service, delay, and other undesirable effects of training a model.
  • Non-limiting embodiments may be used in conjunction with a batch inference platform as described in PCT Application No. PCT/US2024/35341, filed June 25, 2024, which is incorporated herein by reference in its entirety.
  • Non-limiting embodiments may be implemented within an electronic payment processing network, although it will be appreciated that non-limiting embodiments may be implemented within various different types of environments, not limited to payment networks, where one or more models are executed as a service and benefit from regular and/or periodic training.
  • a server computer 100 may include one or more computing devices, such as a web server, private server, authentication server (e.g., a 3D Secure system), and/or any other computing device configured to communicate with one or more client devices (e.g., 114, 116, 118).
  • the client devices 114, 116, 118 may include computing devices that make requests from the server computer 100, such as processing requests (e.g., transaction processing and/or the like).
  • client devices 114, 116, 118 may each include issuer systems, and server computer 100 may be associated with a transaction processing system and/or payment gateway. It will be appreciated, however, that various entities may control and/or operate the devices and systems shown in FIG. 1 in various non-limiting embodiments.
  • the client devices 114, 116, 118 may communicate with the server computer 100 through one or more APIs, as an example, although various communication methods may be used.
  • the server computer 100 may be in communication with an inference engine 102.
  • the inference engine 102 may include one or more computing devices and/or software applications executed by one or more computing devices.
  • the inference engine 102 may be part of the server computer 100.
  • the inference engine 102 may be separate and/or remote from the server computer 100.
  • the inference engine 102 may be configured to control execution of one or more machine-learning models stored as model data 104 in one or more data storage devices.
  • the inference engine 102 may also be in communication with a database 107, which may include an audit log for each execution of a model, model metric data, and/or the like.
  • the machine-learning models may include scoring models (e.g., fraud or risk scoring models) used by issuer systems to make decisions (e.g., authorization decisions).
  • An incremental training engine 101 is also in communication with the server computer 100.
  • the incremental training engine 101 may include one or more computing devices and/or software applications executed by one or more computing devices.
  • the inference engine 102 may execute a first machine-learning model (e.g., a production model) for inputs received in real-time or in batch from the server computer 100 based on requests from client devices 114, 116, 118.
  • the inference and model data resulting from the model execution may be stored in the database 107.
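As a minimal sketch of the per-execution logging described above, the following wraps a scoring function so that each call appends a record to an in-memory audit log. The names `AuditRecord` and `AuditedModel`, the recorded fields, and the linear scoring function are illustrative assumptions; the disclosure does not specify the schema of the database 107.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


@dataclass
class AuditRecord:
    """One hypothetical audit-log row: the input, the inference, and when it ran."""
    features: Sequence[float]
    score: float
    timestamp: float


@dataclass
class AuditedModel:
    """Wraps a scoring function so that every execution is recorded."""
    predict: Callable[[Sequence[float]], float]
    audit_log: List[AuditRecord] = field(default_factory=list)

    def execute(self, features: Sequence[float]) -> float:
        score = self.predict(features)
        # Store the model data for this execution (assumed fields).
        self.audit_log.append(AuditRecord(list(features), score, time.time()))
        return score


# Hypothetical production model: a fixed linear score over two features.
model = AuditedModel(predict=lambda x: 0.5 * x[0] + 0.25 * x[1])
scores = [model.execute(x) for x in [(1.0, 2.0), (0.0, 4.0)]]
```

A real deployment would presumably write to durable storage rather than a Python list; the list only keeps the sketch self-contained.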
  • the server computer 100 determines when to train the first machine-learning model. For example, training may be performed based on a predetermined time interval, a number of transactions processed, user input, and/or the like. In non-limiting embodiments, determining to train the model is based on one or more rules, which may include thresholds for model scores, feature distribution, and/or the like.
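The training triggers listed above could be combined along the following lines. This is a sketch under stated assumptions: the `RetrainRules` fields, the use of a shift in the mean as the drift measure, and all threshold values are hypothetical, since the disclosure names the kinds of rules but not how they are computed.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class RetrainRules:
    """Hypothetical thresholds; any one of them can trigger training."""
    max_seconds_since_training: float   # predetermined time interval
    max_requests_since_training: int    # number of transactions processed
    max_mean_score_shift: float         # threshold on model-score drift
    max_feature_mean_shift: float       # threshold on feature-distribution drift


def should_retrain(rules: RetrainRules,
                   seconds_elapsed: float,
                   requests_seen: int,
                   baseline_scores: Sequence[float],
                   recent_scores: Sequence[float],
                   baseline_feature: Sequence[float],
                   recent_feature: Sequence[float]) -> bool:
    """Return True if any rule says the production model should be retrained."""
    if seconds_elapsed >= rules.max_seconds_since_training:
        return True
    if requests_seen >= rules.max_requests_since_training:
        return True
    # Assumed drift measure: absolute change in the mean (a simple stand-in
    # for whatever distribution comparison a real system would use).
    score_shift = abs(mean(recent_scores) - mean(baseline_scores))
    feature_shift = abs(mean(recent_feature) - mean(baseline_feature))
    return (score_shift > rules.max_mean_score_shift
            or feature_shift > rules.max_feature_mean_shift)


rules = RetrainRules(3600.0, 10_000, 0.1, 5.0)
stable = should_retrain(rules, 60.0, 100, [0.2, 0.2], [0.2, 0.2], [10.0], [10.0])
drifted = should_retrain(rules, 60.0, 100, [0.2, 0.2], [0.6, 0.6], [10.0], [10.0])
```

Here `stable` evaluates to `False` and `drifted` to `True`, because only the second call moves the mean score by more than the assumed 0.1 threshold.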
  • the term “real-time” may refer to performance of a task or tasks during another process or before another process is completed.
  • the inference engine 102 may execute the first machine-learning model in real-time relative to processing a transaction (e.g., before or during authorization of a transaction or the like), during processing and/or communication of messages related to an event, at the time of making a decision (e.g., authorization decision, authentication decision and/or the like) related to an event (e.g., receiving an authorization request, at least a portion of which is included in the data sample, and determining an authorization decision based thereon), and/or the like.
  • Non-limiting embodiments provide for a faster, more computationally efficient training process by adding a smaller (e.g., incremental) subset of data as compared to non-incremental approaches, and training on top of the previous (e.g., current) model weights.
  • the server computer 100 may communicate with the incremental training engine 101.
  • the incremental training engine 101 may be separate from and/or a part of the server computer 100.
  • the incremental training engine 101 may then create a second machine-learning model based on the first machine-learning model that is in production.
  • the second machine-learning model may be stored as model data 104 and/or be stored in temporary memory.
  • the second machine-learning model may include weights from the first machine-learning model.
  • the second machine-learning model may be duplicated from the first machine-learning model.
  • the incremental training engine 101 may then train the second machine-learning model based on the audit log stored in the database 107 from prior executions within a time period.
  • the second machine-learning model may be trained while the first machine-learning model is being executed in the production environment.
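The duplicate-then-train step can be illustrated with a small logistic model. This is only a sketch: the model type, the plain stochastic gradient descent update, the learning rate, and the labeled `batch` are assumptions chosen to keep the example short, and the disclosure does not tie the approach to any particular model or optimizer. The point it shows is that the second model starts from the first model's weights and that the first model is left untouched.

```python
import copy
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class LogisticModel:
    """Tiny stand-in for a production scoring model."""
    weights: List[float]
    bias: float = 0.0

    def predict(self, x: Sequence[float]) -> float:
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return 1.0 / (1.0 + math.exp(-z))


def train_incrementally(production: LogisticModel,
                        batch: Sequence[Tuple[Sequence[float], float]],
                        lr: float = 0.1,
                        epochs: int = 50) -> LogisticModel:
    """Copy the production model and continue training the copy on new data."""
    candidate = copy.deepcopy(production)  # second model starts from the first model's weights
    for _ in range(epochs):
        for x, y in batch:
            error = candidate.predict(x) - y  # gradient of log loss w.r.t. the logit
            candidate.weights = [w - lr * error * xi
                                 for w, xi in zip(candidate.weights, x)]
            candidate.bias -= lr * error
    return candidate


production = LogisticModel(weights=[0.0, 0.0])
# Hypothetical incremental batch drawn from recent audit-log entries.
batch = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
candidate = train_incrementally(production, batch)
```

Because training operates on a deep copy, the production model keeps serving with its original weights until a separate decision replaces it.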
  • the incremental training engine 101 and/or server computer 100 may then determine whether to replace the first machine-learning model with the second machine-learning model.
  • the incremental training engine 101 and/or server computer 100 may then replace the first machine-learning model with the second machine-learning model in the production environment such that a next plurality of requests are input to the second machine-learning model.
  • the first machine-learning model in the model data 104 may be replaced, removed, and/or the like such that the second machine-learning model is used moving forward.
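One way to make the replacement take effect for the next requests without interrupting service is to route every inference through a single swappable reference. The `ModelSlot` class below is a hypothetical illustration, not a component named in the disclosure, and it assumes models are plain callables.

```python
import threading
from typing import Callable, Sequence

Model = Callable[[Sequence[float]], float]


class ModelSlot:
    """Holds the model currently serving production traffic (hypothetical)."""

    def __init__(self, model: Model) -> None:
        self._model = model
        self._lock = threading.Lock()

    def infer(self, features: Sequence[float]) -> float:
        with self._lock:
            model = self._model  # take a snapshot of the reference, then release the lock
        return model(features)

    def replace(self, new_model: Model) -> None:
        """Swap in the newly trained model; later requests use it."""
        with self._lock:
            self._model = new_model


slot = ModelSlot(lambda x: 0.0)   # first (production) model
before = slot.infer([1.0])
slot.replace(lambda x: 1.0)       # second (newly trained) model
after = slot.infer([1.0])
```

Requests already in flight finish on the model they started with, while any request arriving after `replace` returns is scored by the new model.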
  • determining whether to replace the first machine-learning model with the second machine-learning model involves comparing the two models (e.g., comparing efficiency, accuracy, and/or the like). Replacing the first model may occur if differences in model metrics between the newly trained model and the production model satisfy a predetermined threshold (e.g., meet or exceed a threshold). It will be appreciated that other parameters may be considered to determine when to replace the existing model with a newly trained model.
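The comparison described above might look like the following sketch. The metric names, the choice of mean latency as the efficiency measure, and both default thresholds are assumptions made for illustration; the disclosure only states that differences in model metrics are checked against a predetermined threshold.

```python
from dataclasses import dataclass


@dataclass
class ModelMetrics:
    """Hypothetical evaluation summary for one model."""
    accuracy: float          # e.g., fraction of correct decisions on held-out data
    mean_latency_ms: float   # stand-in for computation efficiency


def should_replace(production: ModelMetrics,
                   candidate: ModelMetrics,
                   min_accuracy_gain: float = 0.01,
                   max_latency_increase_ms: float = 5.0) -> bool:
    """Replace only if the candidate is sufficiently more accurate and not much slower."""
    accuracy_gain = candidate.accuracy - production.accuracy
    latency_increase = candidate.mean_latency_ms - production.mean_latency_ms
    return (accuracy_gain >= min_accuracy_gain
            and latency_increase <= max_latency_increase_ms)


current = ModelMetrics(accuracy=0.90, mean_latency_ms=12.0)
better = ModelMetrics(accuracy=0.93, mean_latency_ms=13.0)
slower = ModelMetrics(accuracy=0.95, mean_latency_ms=30.0)
```

With these assumed numbers, `should_replace(current, better)` is `True`, while `should_replace(current, slower)` is `False` because the latency increase exceeds the allowed 5 ms.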
  • An inference engine 202 may include one or more computing devices and/or software applications executed by one or more computing devices for executing one or more machine-learning models in response to a request.
  • the inference engine 202 may be a real-time inference engine and/or a batch inference engine.
  • a payload may include input data for a risk scoring model.
  • the model inference engine 202 may execute one or more models upon request.
  • in response to receiving multiple requests to provide reasoning associated with an output (e.g., a request for explainable machine-learning or artificial intelligence metrics), the model inference engine 202 may batch process the requests.
  • An audit log 201 may store model metadata resulting from execution of the machine-learning model(s) such that the model metadata is available to the inference engine 202.
  • the inference engine 202 may output explainable machine-learning or artificial intelligence metrics relating to the model and reasons (e.g., influences) on the model output.
  • a monitoring system dashboard 206 may present one or more GUIs to users to show graphical representations of data in the audit log 201 in addition to explainable metrics.
  • the monitoring system dashboard 206 may display alerts, notifications, and/or the like based on real-time and/or batched executions of the machine-learning model(s).
  • the monitoring system dashboard 206 is in communication with the inference engine 202 and audit log 201.
  • the dashboard 206 may display some or all of the model metric data and/or facilitate user interaction with the inference engine 202 and/or audit log 201.
  • a user may access the dashboard 206 through a web browser or application and use it to configure a training interval time, view model metric data, and/or the like.
  • the inference engine and/or dashboard 206 may communicate with a feature generation engine 209.
  • the feature generation engine 209 may include one or more computing devices and/or software applications executed by one or more computing devices for generating features (e.g., feature vectors) from the model data stored in the audit log 201 from prior executions.
  • an incremental training engine 208 may be configured to train a second model that is based on one or more of the models in production by the inference engine 202.
  • the incremental training engine 208 outputs a challenger model 205, which is analyzed by a model evaluation engine 207 to compare it with the model currently in production.
  • the model evaluation engine 207 may include one or more computing devices and/or software applications executed by one or more computing devices for performing such a comparison.
  • the production model(s) executed by the inference engine 202 may be replaced with the challenger model 205 in response to the model evaluation engine determining to replace it based on a comparison of model metrics.
  • the machine-learning models discussed herein may be unsupervised models. However, it will be appreciated that in non-limiting embodiments, supervised learning models may also be used, for example in instances in which the model data (e.g., labels generated with the model, observations, and/or the like) is updated.
  • Non-limiting embodiments provide for incremental learning that avoids a need to retrain a model from a beginning state (e.g., “from scratch”) each time it is trained. By iterating through multiple versions of models over time as the model data grows and provides more training data, incremental training is provided for each stage and/or threshold.
  • a step may be automatically performed in response to performance and/or completion of a prior step.
  • a first machine-learning model may be executed in a production environment.
  • the first machine-learning model may be configured in an electronic payment network to provide real-time inferences that are used by the payment network, such as but not limited to fraud determinations used for authorization and/or the like.
  • model data from the inference processed at step 300 may be stored in a data structure, such as a model database, audit log, and/or the like.
  • the model data may include a model input, a model output, model parameters (e.g., weights of nodes and/or edges of a network, transformations, and/or the like), and/or model metrics (e.g., Shapley values or other metrics representing how one or more features affect an individual prediction and/or how much that feature affected that prediction score compared to other features).
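One way to picture the stored model data is as a record appended to an audit log after each inference. The sketch below is illustrative only: the `AuditRecord` fields, the `AuditLog` class, and the JSON-lines encoding are assumptions. The source lists the kinds of data (input, output, parameters, metrics) but not a schema. Model parameters are represented here only by a hypothetical `model_version` reference rather than by full weights.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class AuditRecord:
    """Hypothetical per-inference record; field names are not from the source."""
    request_id: str
    model_version: str                 # reference to the parameters that produced the output
    model_input: Dict[str, float]
    model_output: float
    # Per-feature attribution (e.g., Shapley values), if computed.
    feature_attributions: Dict[str, float] = field(default_factory=dict)


class AuditLog:
    """In-memory stand-in for the model database / audit log."""

    def __init__(self) -> None:
        self._lines: List[str] = []

    def append(self, record: AuditRecord) -> None:
        # Serialize each record as one JSON line.
        self._lines.append(json.dumps(asdict(record), sort_keys=True))

    def records(self) -> List[AuditRecord]:
        # Rebuild records so a training engine could consume them later.
        return [AuditRecord(**json.loads(line)) for line in self._lines]
```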
  • at step 304, it is determined whether to train the machine-learning model based on the model data stored at step 302. Such a determination may be based on one or more rules, such as one or more thresholds associated with model performance metrics. For example, the score and/or feature distribution may be compared with one or more thresholds to determine whether training and/or retraining should be triggered at step 304. In some examples, feature distribution drift, score distribution drift, and/or the like may be used to trigger training and/or retraining at step 304. In some non-limiting embodiments, Shapley values or other metrics representing how one or more features affect an individual prediction (e.g., a score) and/or how much that feature affected (e.g., increased and/or decreased) that prediction score compared to other features may be used to monitor the model and trigger and/or retrigger retraining. Inputs continue to be provided to the first machine-learning model that is live in the production environment.
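A drift-based trigger of the kind described for step 304 might look like the sketch below. The source does not name a drift statistic; the population stability index (PSI) is substituted here as one common choice, and the function names, bin edges, and the 0.2 threshold are all assumptions.

```python
import math
from typing import List, Sequence


def histogram_fractions(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Fraction of values per bin [edges[i], edges[i+1]); the last bin includes its right edge."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break  # values outside all bins are simply not counted
    return [c / len(values) for c in counts]


def population_stability_index(baseline: Sequence[float], recent: Sequence[float],
                               edges: Sequence[float], eps: float = 1e-6) -> float:
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    expected = histogram_fractions(baseline, edges)
    actual = histogram_fractions(recent, edges)
    psi = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # avoid log(0) for empty bins
        psi += (a - e) * math.log(a / e)
    return psi


def should_trigger_training(baseline: Sequence[float], recent: Sequence[float],
                            edges: Sequence[float], threshold: float = 0.2) -> bool:
    """Trigger (re)training when score or feature drift meets the assumed threshold."""
    return population_stability_index(baseline, recent, edges) >= threshold
```

The same check could be applied to a single feature's values or to the model's output scores, matching the feature-distribution and score-distribution drift mentioned above.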
  • if training is triggered at step 304, the method may proceed to step 306 and a second model (e.g., a replica model) may be created (e.g., generated) based on the first model. For example, a copy of the model with the same weights and other parameters may be generated so that it can be trained while the first model is still being executed in the production environment.
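Creating the replica at step 306 can be illustrated with a deep copy, which gives the second model its own weights so that training it cannot disturb the live model. The `LinearModel` class is a toy stand-in and `create_replica` is a hypothetical helper; neither comes from the source.

```python
import copy
from typing import List, Sequence


class LinearModel:
    """Toy model standing in for the first (production) model."""

    def __init__(self, weights: Sequence[float], bias: float = 0.0) -> None:
        self.weights: List[float] = list(weights)
        self.bias = bias

    def predict(self, x: Sequence[float]) -> float:
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias


def create_replica(production_model: LinearModel) -> LinearModel:
    """Return an independent copy with the same weights and other parameters."""
    return copy.deepcopy(production_model)
```

Because the copy shares no mutable state with the original, the production model keeps serving requests with unchanged weights while the replica is trained.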
  • the second model is trained based on the model data stored at step 302.
  • after the second model is trained using the model data, at step 310 it may be determined whether to replace the first model that is currently live in the production environment with the second model.
  • the trained second model becomes a challenger model that is compared to the first model.
  • the models may be evaluated by computation efficiency and/or accuracy based on a cross-validation data set. Other comparison methods may be performed.
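The replica, training, and comparison steps, together with the replacement that follows, can be sketched end to end as below. This is a toy illustration under stated assumptions: a linear model trained by stochastic gradient descent, mean squared error on a held-out validation set as the only comparison metric, and hypothetical names throughout (`LinearModel`, `train_incrementally`, `retrain_and_maybe_swap`, `min_improvement`).

```python
import copy
from typing import List, Sequence, Tuple

Example = Tuple[List[float], float]  # (features, target)


class LinearModel:
    """Toy stand-in for the production model."""

    def __init__(self, weights: Sequence[float], bias: float = 0.0) -> None:
        self.weights = list(weights)
        self.bias = bias

    def predict(self, x: Sequence[float]) -> float:
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias


def mean_squared_error(model: LinearModel, data: Sequence[Example]) -> float:
    return sum((model.predict(x) - y) ** 2 for x, y in data) / len(data)


def train_incrementally(model: LinearModel, data: Sequence[Example],
                        lr: float = 0.1, epochs: int = 500) -> None:
    """SGD that starts from the model's existing weights rather than from scratch."""
    for _ in range(epochs):
        for x, y in data:
            err = model.predict(x) - y
            model.weights = [w - lr * err * xi for w, xi in zip(model.weights, x)]
            model.bias -= lr * err


def retrain_and_maybe_swap(production: LinearModel, new_data: Sequence[Example],
                           validation: Sequence[Example],
                           min_improvement: float = 0.0) -> LinearModel:
    """Return the model that should be live after one incremental-training cycle."""
    challenger = copy.deepcopy(production)      # replica with the same weights
    train_incrementally(challenger, new_data)   # train on logged model data
    improvement = (mean_squared_error(production, validation)
                   - mean_squared_error(challenger, validation))
    # Swap only if the challenger beats the production model by the margin.
    return challenger if improvement > min_improvement else production
```

When the function returns the original object, the first model simply stays in production, which corresponds to returning to step 300.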
  • if it is determined to replace the first model, the method proceeds to step 312 and the first model is replaced with the second model as the live model in the production environment. If it is determined not to replace the model, for example if the second model does not perform as well as the first model, the method may proceed back to step 300 and continue with the first model still in the production environment. [0068] Referring now to FIG.
  • Device 400 may correspond to at least one of the computing devices (e.g., server computer 100, inference engine 102, incremental training engine 101, and/or the like) in FIG. 1.
  • such systems or devices may include at least one device 400 and/or at least one component of device 400.
  • the number and arrangement of components shown in FIG. 4 are provided as an example.
  • device 400 may include additional components, fewer components, different components, or differently arranged components than those shown in FIG. 4.
  • a set of components (e.g., one or more components) of device 400 may perform one or more functions described as being performed by another set of components of device 400.
  • device 400 may include bus 402, processor 404, memory 406, storage component 408, input component 410, output component 412, and communication interface 414.
  • Bus 402 may include a component that permits communication among the components of device 400.
  • processor 404 may be implemented in hardware, firmware, or a combination of hardware and software.
  • processor 404 may include a processor (e.g., a central processing unit (CPU), a GPU, an accelerated processing unit (APU), etc.), a microprocessor, a digital signal processor (DSP), and/or any processing component (e.g., a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), etc.) that can be programmed to perform a function.
  • Memory 406 may include random access memory (RAM), read only memory (ROM), and/or another type of dynamic or static storage device (e.g., flash memory, magnetic memory, optical memory, etc.) that stores information and/or instructions for use by processor 404.
  • storage component 408 may store information and/or software related to the operation and use of device 400.
  • storage component 408 may include a hard disk (e.g., a magnetic disk, an optical disk, a magneto-optic disk, a solid state disk, etc.) and/or another type of computer-readable medium.
  • Input component 410 may include a component that permits device 400 to receive information, such as via user input (e.g., a touch screen display, a keyboard, a keypad, a mouse, a button, a switch, a microphone, etc.).
  • input component 410 may include a sensor for sensing information (e.g., a global positioning system (GPS) component, an accelerometer, a gyroscope, an actuator, etc.).
  • Output component 412 may include a component that provides output information from device 400 (e.g., a display, a speaker, one or more light-emitting diodes (LEDs), etc.).
  • Communication interface 414 may include a transceiver-like component (e.g., a transceiver, a separate receiver and transmitter, etc.) that enables device 400 to communicate with other devices, such as via a wired connection, a wireless connection, or a combination of wired and wireless connections.
  • Communication interface 414 may permit device 400 to receive information from another device and/or provide information to another device.
  • communication interface 414 may include an Ethernet interface, an optical interface, a coaxial interface, an infrared interface, a radio frequency (RF) interface, a universal serial bus (USB) interface, a Wi-Fi® interface, a cellular network interface, and/or the like.
  • Device 400 may perform one or more processes described herein. Device 400 may perform these processes based on processor 404 executing software instructions stored by a computer-readable medium, such as memory 406 and/or storage component 408.
  • a computer-readable medium may include any non-transitory memory device.
  • a memory device includes memory space located inside of a single physical storage device or memory space spread across multiple physical storage devices.
  • Software instructions may be read into memory 406 and/or storage component 408 from another computer-readable medium or from another device via communication interface 414. When executed, software instructions stored in memory 406 and/or storage component 408 may cause processor 404 to perform one or more processes described herein. Additionally, or alternatively, hardwired circuitry may be used in place of or in combination with software instructions to perform one or more processes described herein.
  • embodiments described herein are not limited to any specific combination of hardware circuitry and software.
  • the term “configured to,” as used herein, may refer to an arrangement of software, device(s), and/or hardware for performing and/or enabling one or more functions (e.g., actions, processes, steps of a process, and/or the like).
  • a processor “configured to” perform a function may refer to a processor that executes software instructions (e.g., program code) that cause the processor to perform one or more functions.

Abstract

Disclosed are systems, methods, and computer program products for incremental learning. A system includes at least one processor programmed or configured to: execute a first machine-learning model for each input of a plurality of inputs associated with a plurality of requests in a production environment; determine to train the first machine-learning model based on at least one rule; in response to determining to train the first machine-learning model, create a second machine-learning model comprising weights from the first machine-learning model; train the second machine-learning model with the model data stored in the at least one data storage device; determine whether to replace the first machine-learning model with the second machine-learning model; and, in response to determining to replace the first machine-learning model with the second machine-learning model, replace the first machine-learning model with the second machine-learning model in the production environment.
PCT/US2024/035789 2023-06-29 2024-06-27 System, method, and computer program product for incremental learning Ceased WO2025006729A2 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP24832921.1A EP4736072A2 (fr) 2023-06-29 2024-06-27 System, method, and computer program product for incremental learning
CN202480043551.8A CN121548827A (zh) 2023-06-29 2024-06-27 System, method, and computer program product for incremental learning

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363523956P 2023-06-29 2023-06-29
US63/523,956 2023-06-29

Publications (2)

Publication Number Publication Date
WO2025006729A2 true WO2025006729A2 (fr) 2025-01-02
WO2025006729A3 WO2025006729A3 (fr) 2025-03-27

Family

ID=93940191

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/035789 Ceased WO2025006729A2 (fr) 2023-06-29 2024-06-27 System, method, and computer program product for incremental learning

Country Status (3)

Country Link
EP (1) EP4736072A2 (fr)
CN (1) CN121548827A (fr)
WO (1) WO2025006729A2 (fr)

Also Published As

Publication number Publication date
EP4736072A2 (fr) 2026-05-06
CN121548827A (zh) 2026-02-17
WO2025006729A3 (fr) 2025-03-27

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 2024832921

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 11202508363Q

Country of ref document: SG

WWP Wipo information: published in national office

Ref document number: 11202508363Q

Country of ref document: SG

ENP Entry into the national phase

Ref document number: 2024832921

Country of ref document: EP

Effective date: 20260129

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24832921

Country of ref document: EP

Kind code of ref document: A2