WO2022227176A1 - Drug information push method and apparatus, computer device, and storage medium - Google Patents
Drug information push method and apparatus, computer device, and storage medium
- Publication number
- WO2022227176A1 PCT/CN2021/096712 CN2021096712W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- reward
- parameter
- user
- target
- drug
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/10—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Definitions
- the present application relates to the technical field of artificial intelligence, and in particular, to a method, device, computer equipment and storage medium for pushing drug information.
- DRL: deep reinforcement learning.
- the inventors realized that long-term outcomes and short-term outcomes differ in an essential way, mainly in the time span over which drugs act on them (for example, short-term outcomes are mainly affected by the most recent drug, whereas long-term outcomes are mainly affected by drugs taken over a longer period beforehand), which results in poor scalability of the DRL model.
- the embodiments of the present application provide a drug information push method, device, computer equipment and storage medium, which can enhance the scalability of a drug reward prediction model, thereby improving the accuracy of drug information push.
- the application provides a method for pushing drug information, the method comprising:
- acquiring target user attribute information of the target user and inputting the target user attribute information into the drug reward prediction model, wherein the target user attribute information includes at least one of demographic information, health indicators of medication for the target disease, and historical medication information;
- outputting, through the drug reward prediction model, each first target reward parameter and each second target reward parameter of the target user under the action of each drug, wherein the drug reward prediction model includes a first network parameter and a second network parameter, the first network parameter is used to determine the first reward parameter of any user with any user attribute information under the action of various drugs, the second network parameter is used to determine the second reward parameter of any user under the action of various drugs, any user corresponds to one first reward parameter and one second reward parameter under the action of one drug, and the drug action duration corresponding to the first reward parameter is greater than the drug action duration corresponding to the second reward parameter;
- determining, based on each first target reward parameter and/or each second target reward parameter of the target user, each user reward parameter of the target user under the action of each drug, wherein the target user corresponds to one user reward parameter under the action of one drug;
- determining the maximum user reward parameter from the user reward parameters, and outputting the drug information of the target drug with the maximum user reward parameter to the user interface to display the target drug to the target user.
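The final two steps above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `select_target_drug`, the example values, and the choice to combine the two predictions as a weighted sum are assumptions made for the sketch (the description later mentions weighting coefficients for the two target reward parameters, with 1 as an example value). The drug names are the examples given later in the text.

```python
from typing import Dict, Tuple


def select_target_drug(
    q_long: Dict[str, float],
    q_short: Dict[str, float],
    w_long: float = 1.0,
    w_short: float = 1.0,
) -> Tuple[str, float]:
    """Return the drug with the maximum user reward parameter, and that value.

    q_long / q_short map each drug to the first (long-term) and second
    (short-term) target reward parameter predicted for the target user.
    Assumption: the user reward parameter is the weighted sum of the two.
    """
    if not q_long or q_long.keys() != q_short.keys():
        raise ValueError("both predictions must cover the same non-empty set of drugs")
    user_reward = {
        drug: w_long * q_long[drug] + w_short * q_short[drug] for drug in q_long
    }
    target = max(user_reward, key=user_reward.get)
    return target, user_reward[target]


# Illustrative values only; they are not taken from the application.
q_long = {"biguanide": 0.80, "sulfonylurea": 0.60}
q_short = {"biguanide": 0.50, "sulfonylurea": 0.90}
target_drug, reward = select_target_drug(q_long, q_short)
print(target_drug, reward)
```

Setting `w_short=0.0` in this sketch ranks drugs by the long-term prediction alone, which corresponds to the "and/or" wording of the determining step.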
- the above-mentioned device further includes:
- a data acquisition module configured to acquire sample data of at least two users, wherein the sample data of one user includes the user attribute information and sample drug information of that user;
- a sample input module configured to obtain each first sample reward parameter and each second sample reward parameter of each user under the action of the sample drug indicated by the sample drug information, and to input the sample data of the at least two users, each first sample reward parameter and each second sample reward parameter into the drug reward prediction model;
- a parameter training module configured to train the first network parameters and the second network parameters of the drug reward prediction model based on the user attribute information of the at least two users, each first sample reward parameter and each second sample reward parameter, so as to obtain the ability to predict, based on the user attribute information of any user, the first reward parameter and the second reward parameter of that user under the action of each drug.
- the present application provides a computer device, including: a processor, a memory, and a network interface;
- the processor is connected to the memory and the network interface, wherein the network interface is used to provide a data communication function, the memory is used to store a computer program, and the processor is used to call the computer program to execute the drug information push method of the first aspect of the embodiments of the present application, whose steps are those set out above for the method.
- the present application provides a computer-readable storage medium storing a computer program, the computer program including program instructions that, when executed by a processor, carry out the drug information push method of the first aspect of the present application, whose steps are those set out above for the method.
- the embodiment of the present application enhances the scalability of the drug reward prediction model, and improves the interpretability, security, selectivity and traceability of the model, thereby improving the accuracy of drug information push and having strong applicability.
- FIG. 1 is a schematic structural diagram of a network architecture provided by the application.
- FIG. 2 is a schematic flowchart of the drug information push method provided by the present application.
- FIG. 3 is a schematic structural diagram of the drug reward prediction model provided by the present application.
- FIG. 4 is a schematic structural diagram of a drug information push device provided by the present application.
- FIG. 5 is a schematic structural diagram of a computer device provided by the present application.
- the technical solutions of the present application may relate to the technical field of artificial intelligence, and may be applied to smart healthcare scenarios such as medical information push, so as to realize digital healthcare and promote the construction of smart cities.
- the data involved in this application such as attribute information and/or target drug information, may be stored in a database, or may be stored in a blockchain, such as distributed storage through a blockchain, which is not limited in this application.
- FIG. 1 is a schematic structural diagram of a network architecture provided by the present application.
- the network architecture may include a server 10 and a user terminal cluster, and the user terminal cluster may include multiple user terminals, as shown in FIG. 1: user terminal 100a, user terminal 100b, user terminal 100c, ..., and user terminal 100n.
- the server 10 may be an independent physical server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms.
- Each user terminal in the user terminal cluster may include, but is not limited to, smart terminals such as smart phones, tablet computers, notebook computers, desktop computers, smart speakers, and smart watches.
- the computer device in this application may be a physical terminal with a drug information push function; the physical terminal may be the server 10 shown in FIG. 1 or a user terminal, which is not limited herein.
- the user terminal 100a, the user terminal 100b, the user terminal 100c, . . . , and the user terminal 100n can be respectively connected to the above-mentioned server 10 through a network, so that each user terminal can exchange data with the server 10 through the network connection.
- the server 10 may output the drug information of the target drug to the user interface corresponding to the user terminal of the target user, so that the target user can view the target drug on the user interface, wherein the user terminal of the target user may be any one of the user terminals in the user terminal cluster (e.g., user terminal 100a).
- the drugs determined based on the drug reward prediction model and used for pushing to target users may be collectively referred to as target drugs, and the functional model used to determine them is called the drug reward prediction model.
- the drug information push method provided in this application can be applied to a drug information push scenario for any disease, such as a diabetes drug information push scenario, a hypertension drug information push scenario, or a drug information push scenario for other diseases.
- when the target user is a doctor, the doctor can input the patient's basic information into the drug reward prediction model, which outputs the drug information of the pushed target drug to the user interface based on the patient's basic information.
- the doctor can view the target drug on the user interface (here the target drug can serve as a preliminary diagnosis result), and then combine it with the further diagnosis results of the patient to determine the appropriate drug for the patient (such as the above-mentioned target drug).
- the patient can also input their basic information into a self-service terminal (or simply a self-service machine, etc.) provided by medical institutions such as hospitals, health stations or social health institutions.
- the self-service machine contains the above-mentioned drug reward prediction model, which can output the drug information of the recommended target drug to the user interface of the self-service machine based on the basic information of the patient.
- the patient can view the target drug in the user interface of the self-service machine and purchase the target drug directly, or a doctor can make a further diagnosis and determine the drug suitable for the patient (such as the above-mentioned target drug).
- FIG. 2 is a schematic flowchart of a method for pushing drug information provided by an embodiment of the present application. As shown in FIG. 2, the method may include the following steps S101-S104:
- Step S101: acquire target user attribute information of the target user, and input the target user attribute information into a drug reward prediction model.
- the computer device can first train the model parameters of the drug reward prediction model using the sample data of at least two users and the actual reward parameters of each user, so as to obtain a drug reward prediction model that outputs the first reward parameter and the second reward parameter of any user under the action of each drug.
- the drug reward prediction model here can be a deep Q-network (DQN) model, which is a deep reinforcement learning model.
- the reinforcement learning method of the DQN model is an artificial intelligence method that takes actions and optimizes the policy through the expected reward obtained after taking those actions.
- the parameter value corresponding to the expected reward may be the expected reward parameter (such as the following first expected reward parameter and second expected reward parameter), in other words, the value of the expected reward parameter is used to represent the expected reward.
- the policy refers to the method in which an action should be taken in a specific state to maximize the expected reward.
- the computer device may acquire sample data of at least two users, wherein the sample data of the at least two users may be used to train the drug reward prediction model, one user corresponds to one piece of sample data, and one piece of sample data may include the user attribute information and sample medication information of the user.
- the user attribute information here may include at least one of demographic information, health indicators of medication for the target disease, and historical medication information (i.e., medication history), and the medication indicated by the sample medication information is the sample drug.
- the demographic information may include gender, age, health status, occupation, marital status, education level, income, and other information, and a health indicator may be understood as an examination indicator corresponding to the target disease.
- the sample drugs used by different users for the target disease can be the same or different.
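One possible in-memory layout for a piece of sample data is sketched below. The class and field names (`UserAttributes`, `SampleData`, `demographics`, and so on) and the example values are hypothetical and chosen only for illustration; the check in `__post_init__` mirrors the "at least one of" wording above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class UserAttributes:
    """User attribute information; at least one field must be provided."""
    demographics: Optional[Dict[str, object]] = None      # e.g. gender, age
    health_indicators: Optional[Dict[str, float]] = None  # indicators for the target disease
    medication_history: Optional[List[str]] = None        # historical medication information

    def __post_init__(self) -> None:
        if not (self.demographics or self.health_indicators or self.medication_history):
            raise ValueError("user attribute information needs at least one populated field")


@dataclass
class SampleData:
    """One user's sample data: attribute information plus the sample drug."""
    attributes: UserAttributes
    sample_drug: str


# Illustrative values only.
sample = SampleData(
    attributes=UserAttributes(
        demographics={"age": 58, "gender": "F"},
        health_indicators={"hba1c": 8.1},
    ),
    sample_drug="biguanide",
)
print(sample)
```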
- the computer device can obtain each first sample reward parameter and each second sample reward parameter of each user under the action of the sample drug, and input the sample data of the at least two users, each first sample reward parameter, and each second sample reward parameter into the drug reward prediction model.
- the actual long-term reward parameter of the user under the action of the sample drug may be referred to as the first sample reward parameter.
- the actual short-term reward parameter of the user under the action of the sample drug may also be referred to as the second sample reward parameter.
- the drug action duration corresponding to the first sample reward parameter is greater than the drug action duration corresponding to the second sample reward parameter.
- the reward here can be understood as the degree to which the user's own health indicators are affected after the user has taken the sample drug for a period of time, and the value of the reward parameter is used to represent that degree of influence.
- for example, if reward parameter 1 represents influence degree 1 and reward parameter 2 represents influence degree 2, then reward parameter 1 being greater than reward parameter 2 indicates that influence degree 1 is greater than influence degree 2.
- the computer device can train the first network parameters and the second network parameters of the drug reward prediction model based on the user attribute information of at least two users, each first sample reward parameter and each second sample reward parameter, so that the model acquires the ability to predict, based on the user attribute information of any user (e.g., the target user attribute information of the target user), the first reward parameter and the second reward parameter of that user under the action of each drug.
- the first network parameter can be used to determine the first reward parameter (also called the long-term reward parameter) of any user with any user attribute information under the action of various drugs, and the second network parameter can be used to determine the second reward parameter (also called the short-term reward parameter) of any user under the action of various drugs.
- the drug action duration corresponding to the first reward parameter is greater than the drug action duration corresponding to the second reward parameter.
- the first network parameter here may include the first model parameter and the first return parameter
- the second network parameter may include the second model parameter and the second return parameter.
- the parameters that are iteratively updated based on the loss value in the drug reward prediction model may be collectively referred to as model parameters (eg, the first model parameter and the second model parameter).
- the application may refer to the return parameter corresponding to the first reward parameter in the first network parameter as the first return parameter (also referred to as the first return factor), and to the return parameter corresponding to the second reward parameter in the second network parameter as the second return parameter (also referred to as the second return factor).
- the return parameters here can be understood as parameters that remain unchanged during the training process of the drug reward prediction model.
- the first return parameter is greater than the second return parameter; for example, the first return parameter is 0.9 or another value, and the second return parameter is 0.2 or another value.
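To see why a larger return parameter suits long-term outcomes, the sketch below assumes the return parameter plays the role of the discount factor in standard Q-learning, where a reward received k steps in the future is weighted by the factor raised to the power k. The text does not state this explicitly, so it is an interpretation; the function name and the reward sequence are illustrative.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of rewards, with the reward k steps ahead weighted by gamma**k."""
    total = 0.0
    for k, reward in enumerate(rewards):
        total += (gamma ** k) * reward
    return total


# A single reward that arrives three steps (e.g. three follow-ups) in the future.
rewards = [0.0, 0.0, 0.0, 1.0]

long_view = discounted_return(rewards, 0.9)   # about 0.729
short_view = discounted_return(rewards, 0.2)  # about 0.008
print(long_view, short_view)
```

Under this reading, the head trained with 0.9 still gives substantial weight to an outcome several steps away, while the head trained with 0.2 is dominated by the most recent step.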
- the computer device may determine each first expected reward parameter of each user under the action of the sample drug based on the first model parameter and the first return parameter, and based on the second model parameter and the second return parameter Each second expected reward parameter of each user under the action of the sample drug is determined.
- a user corresponds to a first expected reward parameter under the action of a sample drug
- a user corresponds to a second expected reward parameter under the action of a sample drug.
- the computer device can use a loss function to determine, from the first return parameter, the second return parameter, each first sample reward parameter, each second sample reward parameter, each first expected reward parameter, and each second expected reward parameter, the loss value corresponding to each user's sample data.
- one first sample reward parameter, one second sample reward parameter, one first expected reward parameter, and one second expected reward parameter together correspond to one loss value for one user's sample data.
- the computer device can determine the loss value l_loss corresponding to the user's sample data according to formula (1), in which:
- a_t can represent the sample drug input into the drug reward prediction model at the current time t (i.e., the sample drug in the sample data);
- s_t can represent the user attribute information input into the drug reward prediction model at the current time t (i.e., the user attribute information in the sample data);
- s_{t+1} can represent the user attribute information input into the drug reward prediction model at the next time t+1;
- Q_long(s_t, a_t) can represent the user's first expected reward parameter at the current time t;
- Q_short(s_t, a_t) can represent the user's second expected reward parameter at the current time t;
- r_long can represent the user's first sample reward parameter at the current time t;
- r_short can represent the user's second sample reward parameter at the current time t;
- γ_long can represent the first return coefficient;
- γ_short can represent the second return coefficient;
- Q_long(s_{t+1}, a) can represent the user's first expected reward parameter at the next time t+1;
- Q_short(s_{t+1}, a) can represent the user's second expected reward parameter at the next time t+1.
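Formula (1) itself is not reproduced in this text. One form consistent with the symbol definitions above is the standard DQN temporal-difference loss, applied to the long-term and short-term outputs separately and then summed. The maximum over actions a at time t+1, the squared error, and the unweighted sum are all assumptions here, and the original formula may differ. Writing γ_long and γ_short for the first and second return coefficients:

```latex
l_{loss} =
  \left( r_{long} + \gamma_{long} \max_{a} Q_{long}(s_{t+1}, a) - Q_{long}(s_t, a_t) \right)^{2}
+ \left( r_{short} + \gamma_{short} \max_{a} Q_{short}(s_{t+1}, a) - Q_{short}(s_t, a_t) \right)^{2}
```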
- the computer device may iteratively update the parameter value of the first model parameter and the parameter value of the second model parameter based on each loss value until the loss value remains unchanged, and then stop training the drug reward prediction model, using the iteratively updated first model parameters as the final first model parameters of the drug reward prediction model and the iteratively updated second model parameters as the final second model parameters of the drug reward prediction model.
- the drug reward prediction model has the ability to predict the first reward parameter and the second reward parameter of any user under the action of each drug based on the user attribute information of any user.
- the drug reward prediction model may include multiple convolutional layers (eg, convolutional layers 10a to 10c) and multiple fully connected layers (eg, fully connected layer 20a and fully connected layer 20b).
- the input of the drug reward prediction model is the user attribute information of the user
- the output of the drug reward prediction model is the first reward parameter (e.g., Q_long) and the second reward parameter (e.g., Q_short) of any user under the action of each drug.
- alternatively, the drug reward prediction model may include the fully connected layer 20a and the fully connected layer 20b, but not the convolutional layers 10a to 10c.
- the drug reward prediction model here includes a first network parameter and a second network parameter, wherein the fully connected layer 20b (ie, the second fully connected layer) can be configured with the first network parameter and the second network parameter, as shown in FIG. 3 .
- the fully connected layer 20b may include two fully connected layers (such as the fully connected layer 200b and the fully connected layer 201b), wherein the fully connected layer 200b is configured with the first network parameters and processes the user attribute information based on the first network parameters to output the first reward parameter Q_long of any user under the action of each drug; the fully connected layer 201b is configured with the second network parameters and processes the user attribute information based on the second network parameters to output the second reward parameter Q_short of any user under the action of each drug.
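The two-output structure described above can be sketched in plain Python as a shared layer followed by two separate output layers. This is only an illustration of the layout under stated assumptions: the class name `TwoHeadRewardModel`, the layer sizes, the ReLU activation, the random initialisation, and the omission of bias terms and convolutional layers are all choices made for the sketch. It further assumes that layer 20a feeds both 200b and 201b, which the text does not confirm.

```python
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def init_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random weights; the initialisation scheme is arbitrary here."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights: Matrix, x: Vector) -> Vector:
    """Matrix-vector product (bias terms omitted for brevity)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(v: Vector) -> Vector:
    return [max(0.0, z) for z in v]


class TwoHeadRewardModel:
    """Shared layer plus one output layer per reward horizon."""

    def __init__(self, n_features: int, n_hidden: int, n_drugs: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.shared = init_matrix(n_hidden, n_features, rng)   # stands in for layer 20a
        self.head_long = init_matrix(n_drugs, n_hidden, rng)   # stands in for layer 200b
        self.head_short = init_matrix(n_drugs, n_hidden, rng)  # stands in for layer 201b

    def forward(self, attributes: Vector) -> Tuple[Vector, Vector]:
        """Return (Q_long, Q_short), each with one value per drug."""
        hidden = relu(linear(self.shared, attributes))
        return linear(self.head_long, hidden), linear(self.head_short, hidden)


model = TwoHeadRewardModel(n_features=4, n_hidden=8, n_drugs=3)
q_long, q_short = model.forward([0.58, 1.0, 0.81, 0.3])  # made-up encoded attributes
print(len(q_long), len(q_short))
```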
- the computer device can obtain sample data of at least two users, where the sample data of the at least two users can be long-term follow-up data of a large number of diabetic patients, and one piece of sample data can include the data of one follow-up visit of one patient.
- the sample data here may include user attribute information, and the user attribute information may include, but is not limited to, age, gender, medication history, sample drugs (that is, drugs prescribed by doctors or drugs actually taken by patients, such as biguanides or sulfonylureas), HbA1c value, creatinine value, and other health indicators for diabetes.
- the computer device can obtain each first sample reward parameter and each second sample reward parameter of each user under the action of the sample drug, and input the user attribute information of each user, each first sample reward parameter and each second sample reward parameter into the above drug reward prediction model.
- the first sample reward parameter may indicate whether diabetic complications occurred at the last follow-up after taking the drug.
- the first sample reward parameter is 0 when the diabetic patient develops diabetic complications, and 1 when the diabetic patient does not develop diabetic complications.
- the second sample reward parameter can indicate whether the glycated hemoglobin value of the diabetic patient reaches the target at the next follow-up after taking the drug. When the second sample reward parameter is 0.
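The two sample reward parameters for the diabetes example could be derived from follow-up records roughly as follows. The first function follows the encoding described above (0 when complications occur, 1 otherwise). For the second, the encoding (1 when the HbA1c target is reached, 0 otherwise), the comparison direction, and the default target of 7.0 are assumptions made by analogy, since the text does not spell them out; the function and parameter names are hypothetical.

```python
def first_sample_reward(complication_at_last_follow_up: bool) -> int:
    """Long-term reward: 0 if diabetic complications occurred, 1 otherwise."""
    return 0 if complication_at_last_follow_up else 1


def second_sample_reward(hba1c_at_next_follow_up: float, hba1c_target: float = 7.0) -> int:
    """Short-term reward: 1 if HbA1c is at or below the target, 0 otherwise.

    Assumption: both this 0/1 encoding and the 7.0 default are placeholders,
    not values stated in the application.
    """
    return 1 if hba1c_at_next_follow_up <= hba1c_target else 0


r_long = first_sample_reward(complication_at_last_follow_up=False)
r_short = second_sample_reward(hba1c_at_next_follow_up=8.1)
print(r_long, r_short)
```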
- the computer device may output each first expected reward parameter of each user under the action of the sample drug based on the above-mentioned fully connected layer 200b, and output each second expected reward parameter of each user under the action of the sample drug based on the above-mentioned fully connected layer 201b.
- the computer device can use the above-mentioned loss function to compute, from the first return parameter, the second return parameter, each first sample reward parameter, each second sample reward parameter, each first expected reward parameter, and each second expected reward parameter, the loss value corresponding to the sample data of each user.
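A per-sample loss of this kind can be sketched as below. The exact loss function is not reproduced in this text, so the sketch assumes the standard DQN temporal-difference form (squared error against the reward plus the discounted best next-step value) applied to each output and summed. The function names and the numeric inputs are illustrative; 0.9 and 0.2 are the example return parameters given earlier.

```python
from typing import Sequence


def td_squared_error(
    reward: float, gamma: float, q_now: float, q_next_all: Sequence[float]
) -> float:
    """Squared temporal-difference error for one output of the model."""
    target = reward + gamma * max(q_next_all)
    return (target - q_now) ** 2


def sample_loss(
    r_long: float,
    r_short: float,
    q_long_now: float,
    q_short_now: float,
    q_long_next: Sequence[float],
    q_short_next: Sequence[float],
    gamma_long: float = 0.9,
    gamma_short: float = 0.2,
) -> float:
    """Assumed loss for one user's sample: long-term error plus short-term error."""
    return (
        td_squared_error(r_long, gamma_long, q_long_now, q_long_next)
        + td_squared_error(r_short, gamma_short, q_short_now, q_short_next)
    )


loss = sample_loss(
    r_long=1.0,
    r_short=0.0,
    q_long_now=0.5,           # expected long-term reward for the sample drug at time t
    q_short_now=0.4,          # expected short-term reward for the sample drug at time t
    q_long_next=[0.2, 0.6],   # per-drug long-term predictions at time t+1
    q_short_next=[0.1, 0.3],  # per-drug short-term predictions at time t+1
)
print(loss)
```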
- the computer device can iteratively update the parameter value of the first model parameter and the parameter value of the second model parameter according to the loss values corresponding to all the sample data until the loss value is basically unchanged (for example, the loss value reaches its minimum), which indicates that training of the drug reward prediction model has been completed (i.e., the drug reward prediction model has converged).
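The stopping rule "until the loss value is basically unchanged" can be sketched as a loop that halts once two consecutive loss values differ by less than a tolerance. The tolerance, the iteration cap, and the toy one-parameter problem standing in for the real model update are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple


def train_until_stable(
    step: Callable[[], float], tol: float = 1e-6, max_iters: int = 10_000
) -> Tuple[int, float]:
    """Call step() (one parameter update returning the loss) until the loss
    changes by less than tol between consecutive updates.

    Returns the number of updates performed and the last loss value.
    """
    previous = step()
    for i in range(1, max_iters):
        current = step()
        if abs(previous - current) < tol:
            return i + 1, current
        previous = current
    return max_iters, previous


def make_toy_step(target: float = 1.0, lr: float = 0.1) -> Tuple[Callable[[], float], Dict[str, float]]:
    """Toy stand-in for a model update: fit a single weight w to target."""
    state = {"w": 0.0}

    def step() -> float:
        error = state["w"] - target
        loss = error ** 2
        state["w"] -= lr * 2.0 * error  # gradient descent on error**2
        return loss

    return step, state


step, state = make_toy_step()
iterations, final_loss = train_until_stable(step)
print(iterations, final_loss, state["w"])
```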
- the first network parameters configured in the fully connected layer 200b include the first return parameter and the iteratively updated first model parameters, and the second network parameters configured in the fully connected layer 201b include the second return parameter and the iteratively updated second model parameters.
- the first return parameter and the iteratively updated first model parameters in the fully connected layer 200b can be used to predict the first reward parameter of any user under the action of each drug, and the second return parameter and the iteratively updated second model parameters in the fully connected layer 201b can be used to predict the second reward parameter of any user under the action of each drug. It can be seen that the drug reward prediction model at this time has the ability to predict the first reward parameter and the second reward parameter of any user under the action of each drug based on the user attribute information of any user.
- the computer device can acquire the target user attribute information of the target user based on an input instruction, and input the target user attribute information into the drug reward prediction model.
- the target user can enter the target user attribute information in the attribute information input area of the user interface and click the OK button in the user interface once the input is complete.
- the computer device can detect the input instruction on the attribute information input area, so as to obtain the target user attribute information of the target user, which may include at least one of demographic information, health indicators of medication for the target disease, and historical medication information.
- Step S102 outputting each first target reward parameter and each second target reward parameter of the target user under the action of each drug through the drug reward prediction model.
- the computer device may determine each first target reward parameter of the target user under the action of each drug based on the first network parameters (i.e., the first return parameter and the iteratively updated first model parameter); for example, the first network parameters may be the first network parameters in the fully connected layer 200b after the drug reward prediction model converges.
- the target user corresponds to a first target reward parameter under the action of a drug.
- the computer device may determine each second target reward parameter of the target user under the action of each drug based on the second network parameters (i.e., the second return parameter and the iteratively updated second model parameter); for example, the second network parameters may be the second network parameters in the fully connected layer 201b after the drug reward prediction model converges.
- the target user corresponds to a second target reward parameter under the action of a drug.
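Step S102 can be pictured as one forward pass that yields two reward values per drug. The sketch below is a hedged illustration, not the patented network: it assumes each head is a single linear layer with one weight vector per drug (hypothetical stand-ins for the fully connected layers 200b and 201b), and the name `predict_rewards` is invented for this example.

```python
def predict_rewards(user_features, first_head, second_head):
    """Return (first_target_rewards, second_target_rewards), one entry per drug.

    first_head / second_head: one weight vector per drug (assumed linear heads).
    """
    def dot(w, x):
        return sum(wi * xi for wi, xi in zip(w, x))

    # Long-term head: one first target reward parameter per drug.
    first = [dot(w, user_features) for w in first_head]
    # Short-term head: one second target reward parameter per drug.
    second = [dot(w, user_features) for w in second_head]
    return first, second


# Usage with two drugs and two user features (illustrative numbers only).
first, second = predict_rewards(
    [1.0, 2.0],
    first_head=[[0.5, 0.25], [0.0, 1.0]],
    second_head=[[1.0, 0.0], [0.5, 0.5]],
)
```

Both lists are indexed by drug, so position `i` in each list refers to the same drug.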
- Step S103: determine each user reward parameter of the target user under the action of each drug based on each first target reward parameter of the target user and/or each second target reward parameter of the target user.
- the computer device may determine a first weighting coefficient for the first target reward parameter and a second weighting coefficient for the second target reward parameter.
- the first weighting coefficient (e.g., 1 or another numerical value) and the second weighting coefficient (e.g., 1 or another numerical value) here may be weighting coefficients set by the user or weighting coefficients configured by default in the drug reward prediction model.
- the computer device may determine each first weighted reward parameter corresponding to each first target reward parameter based on the first weighting coefficient and each first target reward parameter of the target user, and determine each second weighted reward parameter corresponding to each second target reward parameter based on the second weighting coefficient and each second target reward parameter of the target user.
- the computer device can sum each first weighted reward parameter and the corresponding second weighted reward parameter to obtain each user reward parameter of the target user under the action of each drug, where one first weighted reward parameter and one second weighted reward parameter correspond to one user reward parameter.
- the computer device can also directly sum each first target reward parameter and the corresponding second target reward parameter to obtain each user reward parameter of the target user under the action of each drug, where one first target reward parameter and one second target reward parameter correspond to one user reward parameter.
- the computer device may determine each first target reward parameter of the target user as each user reward parameter of the target user under each drug action.
- the computer device may determine each second target reward parameter of the target user as each user reward parameter of the target user under the action of each drug; the choice can be made according to the actual application scenario and is not restricted here.
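The alternatives described for step S103 (weighted sum, direct sum, first-only, second-only) can be sketched in a few lines of Python. The function name `user_rewards` and the `mode` strings are hypothetical; the default weights of 1 follow the example values given in the text, so the direct sum is simply the weighted mode with both coefficients left at 1.

```python
def user_rewards(first, second, w1=1.0, w2=1.0, mode="weighted"):
    """Combine per-drug reward lists into one user reward parameter per drug."""
    if len(first) != len(second):
        raise ValueError("each drug needs one first and one second reward")
    if mode == "weighted":
        # w1 = w2 = 1 reproduces the direct sum of the two target rewards.
        return [w1 * a + w2 * b for a, b in zip(first, second)]
    if mode == "first_only":
        # Only the long-term (first) target reward parameters are used.
        return list(first)
    if mode == "second_only":
        # Only the short-term (second) target reward parameters are used.
        return list(second)
    raise ValueError(f"unknown mode: {mode}")
```

Keeping all four options behind one function makes the "and/or" in step S103 explicit: the caller picks the combination rule that matches the application scenario.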
- Step S104: determine the maximum user reward parameter from the user reward parameters, and output the drug information of the target drug with the maximum user reward parameter to the user interface to display the target drug to the target user.
- the computer device may sort the user reward parameters (for example, from largest to smallest or from smallest to largest) to obtain a sequence of user reward parameters, and take the first user reward parameter (for a descending sequence) or the last user reward parameter (for an ascending sequence) as the maximum user reward parameter. Further, the computer device may output the drug information of the target drug with the maximum user reward parameter to the user interface to present the target drug to the target user. Taking the scenario of diabetes drug information push as an example, when the target user's drug action requirement is that no complications of diabetes occur in the long term, the maximum user reward parameter can be the largest first target reward parameter among the first target reward parameters. In this case, the computer device may output the drug information of the target drug having the largest first target reward parameter to the user interface.
- the maximum user reward parameter may be the largest second target reward parameter among the second target reward parameters, and the computer device may output the drug information of the target drug having the largest second target reward parameter to the user interface.
- each user reward parameter can be determined from each first weighted reward parameter and each second weighted reward parameter, and the computer device can output the drug information of the target drug with the maximum user reward parameter to the user interface.
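The selection in step S104 amounts to sorting the user reward parameters and taking the top entry. A minimal Python sketch follows; the function name and the drug labels are hypothetical, and ties are resolved in favour of the drug listed first, which is an assumption of this example rather than something the text specifies.

```python
def select_target_drug(drug_names, rewards):
    """Return (target_drug, max_user_reward) after sorting from largest to smallest."""
    if not rewards or len(drug_names) != len(rewards):
        raise ValueError("need one user reward parameter per drug")
    # sorted() is stable, so equal rewards keep their original order.
    ranked = sorted(zip(rewards, drug_names), key=lambda pair: pair[0], reverse=True)
    best_reward, best_drug = ranked[0]
    return best_drug, best_reward


# Usage with illustrative labels and values.
target = select_target_drug(["drug_A", "drug_B", "drug_C"], [2.0, 3.5, 1.0])
```

Sorting keeps the full ranking available if more than one candidate is ever to be shown; when only the single best drug is needed, `max` over the same pairs gives the same result.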
- the target user can view the target drug on the user interface at this time, and send feedback information for the target drug to the computer device.
- the feedback information may include that the target drug is different from the historical drug previously taken by the target user, or the effect of the target user taking the target drug is not as good as the effect of taking the historical drug.
- the computer device can adjust the first network parameter and the second network parameter of the drug reward prediction model so that the model can better predict the first reward parameter and the second reward parameter of any user (such as the target user) under the action of each drug, and then push appropriate drug information to the target user.
- the computer device may input the target user attribute information into the drug reward prediction model, and output each first target reward parameter and each second target reward parameter of the target user under the action of each drug through the drug reward prediction model.
- the drug reward prediction model can output the first target reward parameter and the second target reward parameter at the same time, where the long-term outcome is evaluated by the first target reward parameter and the short-term outcome is evaluated by the second target reward parameter.
- this enhances the scalability of the drug reward prediction model and improves the interpretability, safety, selectivity and traceability of the model.
- the computer device may determine each user reward parameter of the target user under the action of each drug based on each first target reward parameter of the target user and/or each second target reward parameter of the target user. The computer device can then determine the maximum user reward parameter from the user reward parameters, and output the drug information of the target drug with the maximum user reward parameter to the user interface, so as to display the target drug to the target user, thereby improving the accuracy of drug information pushing, with strong applicability.
- FIG. 4 is a schematic structural diagram of a drug information push device provided by an embodiment of the present application.
- the drug information push device may be a computer program (including program code) running in a computer device; for example, the drug information push device is application software. The drug information push device may be used to execute the corresponding steps in the method provided by the embodiments of the present application.
- the drug information pushing apparatus 1 may run on a computer device, and the computer device may be the server 10 in the embodiment corresponding to FIG. 1 above.
- the drug information pushing device 1 may include: a data acquisition module 10, a sample input module 20, a parameter training module 30, an information input module 40, a parameter output module 50, a parameter determination module 60 and an information display module 70.
- the information input module 40 is used to obtain the target user attribute information of the target user, and input the target user attribute information into the drug reward prediction model, where the target user attribute information includes at least one of demographic information, health indicators of medication for the target disease, and historical medication information.
- the user interface includes an attribute information input area.
- the above-mentioned information input module 40 includes: an information acquisition unit 401.
- the information acquisition unit 401 is configured to acquire target user attribute information of the target user based on the input instruction when an input instruction on the attribute information input area is detected.
- for the specific implementation of the information acquisition unit 401, reference may be made to the description of step S101 in the embodiment corresponding to FIG. 2, which will not be repeated here.
- the parameter output module 50 is configured to output each first target reward parameter and each second target reward parameter of the target user under the action of each drug through a drug reward prediction model, wherein the drug reward prediction model includes a first network parameter and a second network parameter, the first network parameter is used to determine the first reward parameter of any user with any user attribute information under the action of various drugs, and the second network parameter is used to determine the second reward parameter of any user under the action of various drugs.
- any user corresponds to one first reward parameter and one second reward parameter under the action of one drug, and the drug action duration corresponding to the first reward parameter is greater than the drug action duration corresponding to the second reward parameter.
- the parameter determination module 60 is configured to determine each user reward parameter of the target user under the action of each drug based on each first target reward parameter of the target user and/or each second target reward parameter of the target user, wherein the target user corresponds to one user reward parameter under the action of one drug.
- the parameter determination module 60 includes: a weighting coefficient determination unit 601, a first reward parameter determination unit 602 and a second reward parameter determination unit 603.
- the weighting coefficient determination unit 601 is configured to determine the first weighting coefficient of the first target reward parameter and the second weighting coefficient of the second target reward parameter;
- the first reward parameter determination unit 602 is configured to determine each first weighted reward parameter corresponding to each first target reward parameter based on the first weighting coefficient and each first target reward parameter of the target user, and to determine each second weighted reward parameter corresponding to each second target reward parameter based on the second weighting coefficient and each second target reward parameter of the target user;
- the second reward parameter determination unit 603 is configured to determine, based on each first weighted reward parameter and each second weighted reward parameter, each user reward parameter of the target user under the action of each drug, where one first weighted reward parameter and one second weighted reward parameter correspond to one user reward parameter.
- for the specific implementations of the weighting coefficient determination unit 601, the first reward parameter determination unit 602, and the second reward parameter determination unit 603, reference may be made to the description of step S103 in the embodiment corresponding to FIG. 2, which will not be repeated here.
- the above parameter determination module 60 further includes: a third reward parameter determination unit 604.
- the third reward parameter determining unit 604 is configured to determine each first target reward parameter of the target user as each user reward parameter of the target user under the action of each drug;
- the maximum user reward parameter is the maximum first target reward parameter among the first target reward parameters.
- for the specific implementation of the third reward parameter determination unit 604, reference may be made to the description of step S103 in the above-mentioned embodiment corresponding to FIG. 2, which will not be repeated here.
- the above parameter determination module 60 further includes: a fourth reward parameter determination unit 605.
- the fourth reward parameter determination unit 605 is configured to determine each second target reward parameter of the target user as each user reward parameter of the target user under the action of each drug;
- the maximum user reward parameter is the maximum second target reward parameter among the second target reward parameters.
- for the specific implementation of the fourth reward parameter determination unit 605, reference may be made to the description of step S103 in the above-mentioned embodiment corresponding to FIG. 2, which will not be repeated here.
- the information display module 70 is used for determining the maximum user reward parameter from each user reward parameter, and outputting the drug information of the target drug with the maximum user reward parameter to the user interface to display the target drug to the target user.
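How the inference-side modules hand data to one another can be sketched as a small Python class. This is only an illustration of the wiring: the class name `DrugInfoPusher`, its fields, and the returned dictionary are all hypothetical, and the prediction model is passed in as an opaque callable rather than implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class DrugInfoPusher:
    """Hypothetical wiring of modules 40, 50, 60 and 70."""
    drug_names: Sequence[str]
    # Stand-in for the drug reward prediction model: attributes -> (first, second).
    predict: Callable[[Sequence[float]], Tuple[List[float], List[float]]]
    w1: float = 1.0  # first weighting coefficient
    w2: float = 1.0  # second weighting coefficient

    def push(self, target_user_attributes: Sequence[float]) -> dict:
        # Information input module 40 and parameter output module 50.
        first, second = self.predict(target_user_attributes)
        # Parameter determination module 60: weighted sum per drug.
        rewards = [self.w1 * a + self.w2 * b for a, b in zip(first, second)]
        # Information display module 70: pick the drug with the maximum reward.
        best = max(range(len(rewards)), key=rewards.__getitem__)
        return {"drug": self.drug_names[best], "user_reward": rewards[best]}


# Usage with a stub model that returns fixed rewards for two drugs.
pusher = DrugInfoPusher(["drug_A", "drug_B"], lambda attrs: ([1.0, 2.0], [0.5, 0.0]))
result = pusher.push([0.0])
```

Passing the model in as a callable keeps the training-side modules (10, 20, 30) out of this sketch; any trained predictor with the same input and output shape could be substituted.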
- the above-mentioned drug information push device 1 further includes:
- the data acquisition module 10 is used for acquiring sample data of at least two users, and the sample data of one user includes user attribute information and sample drug information of the user;
- the sample input module 20 is used to obtain each first sample reward parameter and each second sample reward parameter of each user under the action of the sample drug indicated by the sample drug information, and to input the sample data of the at least two users, each first sample reward parameter and each second sample reward parameter into the drug reward prediction model;
- the parameter training module 30 is used for training the first network parameters and the second network parameters of the drug reward prediction model based on the user attribute information of the at least two users, each first sample reward parameter and each second sample reward parameter, so as to obtain the ability to predict, based on the user attribute information of any user, the first reward parameter and the second reward parameter of that user under the action of each drug.
- the first network parameters include first model parameters and first return parameters.
- the second network parameters include second model parameters and second return parameters.
- the above-mentioned parameter training module 30 includes: an expected parameter determination unit 301, a loss value determination unit 302 and a parameter update unit 303.
- the expected parameter determination unit 301 is configured to determine each first expected reward parameter of each user under the action of the sample drug based on the first model parameter and the first return parameter, and to determine each second expected reward parameter of each user under the action of the sample drug based on the second model parameter and the second return parameter;
- the loss value determination unit 302 is configured to determine each loss value corresponding to each user's sample data based on the first return parameter, the second return parameter, each first sample reward parameter, each second sample reward parameter, each first expected reward parameter and each second expected reward parameter;
- the parameter update unit 303 is configured to iteratively update the parameter value of the first model parameter and the parameter value of the second model parameter based on each loss value until the loss value remains unchanged, so as to obtain the ability to predict, based on the user attribute information of any user, the first reward parameter and the second reward parameter of that user under the action of each drug.
- for the specific implementations of the expected parameter determination unit 301, the loss value determination unit 302 and the parameter update unit 303, reference may be made to the description of the model training of the drug reward prediction model in step S101 of the above-mentioned embodiment corresponding to FIG. 2, which will not be repeated here.
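- for the implementation of the drug information pushing device 1, reference may be made to step S101 to step S104 in the above-mentioned embodiment corresponding to FIG. 2, which will not be repeated here.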
- the description of the beneficial effects of using the same method will not be repeated.
- FIG. 5 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
- the computer device may include a processor, memory, and a network interface.
- the computer device may also include a user interface.
- the computer device 1000 may be the server 10 in the above-mentioned embodiment corresponding to FIG. 1, and the computer device 1000 may include: at least one processor 1001 (such as a CPU), at least one network interface 1004, a user interface 1003, a memory 1005, and at least one communication bus 1002.
- the communication bus 1002 is used to realize the connection and communication between these components.
- the user interface 1003 may include a display screen and a keyboard, and the network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface).
- the memory 1005 may be high-speed RAM memory or non-volatile memory, such as at least one disk memory.
- the memory 1005 may optionally also be at least one storage device located remotely from the aforementioned processor 1001.
- the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a device control application program.
- the network interface 1004 is mainly used for network communication with the user terminal;
- the user interface 1003 is mainly used to provide an input interface for the user;
- the processor 1001 can be used to call the device control application program stored in the memory 1005 to achieve:
- acquiring target user attribute information of the target user, and inputting the target user attribute information into the drug reward prediction model, where the target user attribute information includes at least one of demographic information, health indicators for drug use for the target disease, and historical drug use information;
- outputting each first target reward parameter and each second target reward parameter of the target user under the action of each drug through the drug reward prediction model, wherein the drug reward prediction model includes the first network parameter and the second network parameter, the first network parameter is used to determine the first reward parameter of any user with any user attribute information under the action of various drugs, and the second network parameter is used to determine the second reward parameter of any user under the action of various drugs; any user corresponds to one first reward parameter and one second reward parameter under the action of one drug, and the drug action duration corresponding to the first reward parameter is greater than the drug action duration corresponding to the second reward parameter;
- determining each user reward parameter of the target user under the action of each drug based on each first target reward parameter of the target user and/or each second target reward parameter of the target user, wherein the target user corresponds to one user reward parameter under the action of one drug;
- determining the maximum user reward parameter from the user reward parameters, and outputting the drug information of the target drug with the maximum user reward parameter to the user interface to display the target drug to the target user.
- the computer device 1000 described in the embodiment of the present application can execute the description of the method for pushing drug information in the embodiment corresponding to FIG. 2 above, and can also execute the description of the drug information pushing device 1 in the embodiment corresponding to FIG. 4 above, which will not be repeated here.
- the description of the beneficial effects of using the same method will not be repeated.
- the embodiments of the present application further provide a computer-readable storage medium, and the computer-readable storage medium stores the computer program executed by the aforementioned drug information pushing device 1.
- the computer program includes program instructions, and when the processor executes the program instructions, it can execute the description of the drug information pushing method in the embodiment corresponding to FIG. 2 above, which will therefore not be repeated here. In addition, the description of the beneficial effects of using the same method will not be repeated.
- the storage medium involved in this application such as a computer-readable storage medium, may be non-volatile or volatile.
- program instructions may be deployed to execute on one computing device, or on multiple computing devices located at one site, or alternatively, on multiple computing devices distributed across multiple sites and interconnected by a communications network
- multiple computing devices distributed in multiple locations and interconnected by a communication network can form a blockchain system.
- the embodiments of the present application further provide a computer program product or computer program including computer instructions stored in a computer-readable storage medium.
- the processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the method for pushing drug information provided in the embodiments of the present application.
- the above-mentioned storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM) or a random access memory (RAM) or the like.
- the above-mentioned computer-readable storage medium may be an internal storage unit of the drug information pushing apparatus provided in any of the foregoing embodiments or of the above-mentioned device, such as a hard disk or a memory of an electronic device.
- the computer-readable storage medium can also be an external storage device of the electronic device, such as a pluggable hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the electronic device.
- the above-mentioned computer-readable storage medium may also include a magnetic disk, an optical disk, a read-only memory (ROM) or a random access memory (RAM), and the like.
- the computer-readable storage medium may also include both an internal storage unit of the electronic device and an external storage device.
- the computer-readable storage medium is used to store the computer program and other programs and data required by the electronic device.
- the computer-readable storage medium can also be used to temporarily store data that has been or will be output.
Abstract
A drug information pushing method and apparatus, a computer device, and a storage medium are disclosed in embodiments of the present application. The method is applicable to the field of digital medicine and comprises: obtaining target user attribute information of a target user, and inputting the target user attribute information into a drug reward prediction model; outputting, by means of the drug reward prediction model, first target reward parameters and second target reward parameters of the target user under the action of drugs; determining, on the basis of the first target reward parameters of the target user and/or the second target reward parameters of the target user, user reward parameters of the target user under the action of the drugs; determining the maximum user reward parameter among the user reward parameters, and outputting drug information of the target drug having the maximum user reward parameter to a user interface so as to display the target drug to the target user. By using the embodiments of the present application, the scalability of the drug reward prediction model can be enhanced, thereby improving the accuracy of drug information pushing.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110473086.XA CN113076486B (zh) | 2021-04-29 | 2021-04-29 | Drug information pushing method and apparatus, computer device, and storage medium |
| CN202110473086.X | 2021-04-29 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022227176A1 (fr) | 2022-11-03 |
Family
ID=76616011
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2021/096712 Ceased WO2022227176A1 (fr) | 2021-04-29 | 2021-05-28 | Drug information pushing method and apparatus, computer device, and storage medium |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN113076486B (fr) |
| WO (1) | WO2022227176A1 (fr) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116779096B (zh) * | 2023-06-28 | 2024-04-16 | 南栖仙策(南京)高新技术有限公司 | Medication strategy determination method, apparatus, device, and storage medium |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20160279329A1 (en) * | 2013-11-07 | 2016-09-29 | Impreal Innovations Limited | System and method for drug delivery |
| CN110289068A (zh) * | 2019-06-20 | 2019-09-27 | 北京百度网讯科技有限公司 | Drug recommendation method and device |
| CN111666494A (zh) * | 2020-05-13 | 2020-09-15 | 平安科技(深圳)有限公司 | Grouping decision model generation method, grouping processing method, apparatus, device, and medium |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112561554B (zh) * | 2019-09-26 | 2023-07-28 | 腾讯科技(深圳)有限公司 | Method, apparatus, server, and storage medium for determining multimedia resources to be displayed |
| CN111144949A (zh) * | 2019-12-30 | 2020-05-12 | 北京每日优鲜电子商务有限公司 | Reward data issuing method and apparatus, computer device, and storage medium |
| CN111933302B (zh) * | 2020-10-09 | 2021-01-05 | 平安科技(深圳)有限公司 | Drug recommendation method and apparatus, computer device, and storage medium |
2021
- 2021-04-29 CN CN202110473086.XA patent/CN113076486B/zh active Active
- 2021-05-28 WO PCT/CN2021/096712 patent/WO2022227176A1/fr not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| CN113076486A (zh) | 2021-07-06 |
| CN113076486B (zh) | 2023-07-25 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21938646; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21938646; Country of ref document: EP; Kind code of ref document: A1 |