WO2021014823A1 - Information processing device, information processing method, and information processing program - Google Patents
- Publication number
- WO2021014823A1 (application PCT/JP2020/023497)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- variable
- user
- intervention
- calculation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0282—Rating or review of business operators or products
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
- G06N5/046—Forward inferencing; Production systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
Definitions
- This disclosure relates to an information processing device, an information processing method, and an information processing program.
- There is a technology for calculating the causal effect that occurs in a target variable when the value of a variable is intentionally changed (that is, when the variable is intervened on); this causal effect is called the intervention effect.
- If the data of each variable were infinite (or at least sufficient), a highly accurate calculation result could be output. In reality, however, the data of each variable is finite and the number of data points varies from variable to variable, so the accuracy of the calculation result may decrease. Presenting a low-accuracy calculation result cannot be said to present appropriate information to the user, so there is room for improvement.
- Therefore, this disclosure proposes an information processing device, an information processing method, and an information processing program that can present appropriate information to the user according to the state of the data used for calculating the intervention effect.
- The information processing device of one form according to the present disclosure includes a determination unit and a presentation unit.
- When the intervention effect that occurs in the objective variable by intervening in any one of a plurality of variables is calculated based on causal information showing the causal relationships between the plurality of variables, the determination unit determines whether or not the data of the variables required for the calculation is insufficient.
- The presentation unit presents information to the user based on the determination result of the determination unit.
- FIG. 1 is a diagram showing an outline of the information processing method according to the embodiment of this disclosure. FIG. 2 is a block diagram showing the configuration of the information processing device according to the embodiment. FIG. 3 is a diagram showing an example of the customer information. FIGS. 4 to 16 are diagrams showing the information presented by the presentation unit.
- a plurality of components having substantially the same functional configuration may be distinguished by adding different numbers after the same reference numerals. However, if it is not necessary to distinguish each of the plurality of components having substantially the same functional configuration, only the same reference numerals are given.
- The information processing device, information processing method, and information processing program according to the embodiment are not limited to corporate data analysis, that is, data analysis in the business field; they can also be applied to data analysis in various other fields, such as the medical field and the education field.
- the information processing apparatus, information processing method, and information processing program according to the embodiment can be applied to data analysis that handles a plurality of variables.
- FIG. 1 is a diagram showing an outline of an information processing method according to an embodiment of the present disclosure.
- FIG. 1 shows causal information showing a causal relationship between a plurality of variables, a so-called causal graph.
- In the causal graph shown in FIG. 1, the direction of the causal effect between the four variables X, Y, Z1 and Z2 is indicated by arrows (cause → effect). That is, the causal graph shown in FIG. 1 is a directed graph.
- the causal information shown in FIG. 1 is, in other words, information of a graphical model with a probability distribution in which probabilistic / statistical cause and effect variables are connected by arrows.
- The variables may be of any type, such as categorical or continuous, but categorical variables are preferable.
- A continuous variable can be converted into a categorical variable by dividing its distribution function into n equal parts (n is a natural number of 2 or more).
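The conversion of a continuous variable into a categorical one can be sketched as quantile binning. This is a minimal illustration and not the patent's implementation; the function name `quantile_bins` and the sample ages are hypothetical, and the cut points are taken from the empirical distribution of the observed values.

```python
from bisect import bisect_right


def quantile_bins(values, n):
    """Map each value to one of n categories (0 .. n-1) of roughly equal size."""
    if n < 2:
        raise ValueError("n must be a natural number of 2 or more")
    ordered = sorted(values)
    # Cut points at the k/n quantiles of the empirical distribution (k = 1 .. n-1).
    cuts = [ordered[(k * len(ordered)) // n] for k in range(1, n)]
    # A value's category is the number of cut points that are <= the value.
    return [bisect_right(cuts, v) for v in values]


# Hypothetical customer ages, split into 4 equal-probability categories.
ages = [21, 25, 33, 38, 42, 47, 55, 61]
categories = quantile_bins(ages, 4)
```

With eight values and n = 4, each category receives two values.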
- The causal graph shown in FIG. 1 is only one example of causal information. The causal information may instead be a list of the causal relationships between variables, or any other information from which the causal relationships between the plurality of variables can be grasped.
- Such causal information is generated using, for example, customer attribute data (age, gender, address, etc.) possessed by the company, questionnaire data conducted for each customer, and the like.
- FIG. 1 shows causal information generated using data of a company that provides service A and service B to customers who are members (current and past).
- the variable Z1 shown in FIG. 1 is the data of the address of the customer who is a member.
- the variable Z2 is questionnaire data indicating the satisfaction level of the service B.
- the variable X is questionnaire data indicating the satisfaction level of the service A.
- the variable Y is data indicating whether or not the customer has withdrawn from the service.
- service A and service B are not limited to the case where they are provided by the same company, and may be provided by different companies. That is, the causal information (causal graph) is not limited to the case where it is generated using the data of one company, and may be generated using the data of a plurality of companies.
- Hereinafter, the variable X to be intervened on may be referred to as the intervention variable X, and the target variable Y may be referred to as the objective variable Y.
- The causal information shown in FIG. 1 is generated in advance based on the customer data of the company. Further, in FIG. 1, it is assumed that the variable X on which the user (for example, a person in charge at the company) wants to intervene (the intervention variable X) and the objective variable Y for which the user wants to see the intervention effect have been received from the user. That is, in FIG. 1, an intervention is made on the satisfaction data of service A in order to see, as the intervention effect, the change in whether or not members withdraw.
- In the information processing method according to the embodiment, when the intervention effect that occurs in the objective variable Y by intervening in one of the plurality of variables X, Y, Z1 and Z2 (the intervention variable X) is calculated based on the causal information, it is first determined whether or not the data of the variables X, Y, Z1 and Z2 required for the calculation is insufficient (step S1).
- The variable data referred to here includes, in addition to the data of each individual variable, data obtained by combining a plurality of variables; the details will be described later.
- Information based on the determination result in step S1 is then presented to the user (step S2).
- For example, when it is determined that the data of the variables X, Y, Z1 and Z2 required for the intervention effect calculation is sufficient, the content of the intervention is accepted from the user, the intervention effect is calculated based on that content, and the calculation result is presented to the user.
- On the other hand, when it is determined that the data of the variables X, Y, Z1 and Z2 required for the intervention effect calculation is insufficient, the information processing method presents to the user, for example, information indicating that the intervention effect calculation is impossible, information indicating the variables for which data is insufficient, and information indicating countermeasures for making up the data of the variables required for the intervention effect calculation. The details of the information presented to the user will be described later.
- FIG. 2 is a block diagram showing a configuration of the information processing device 1 according to the embodiment.
- the information processing device 1 is communicably connected to the user terminal 11 via a predetermined network (not shown).
- Here, the case where the information processing device 1 and the user terminal 11 are configured separately is shown, but in other embodiments a terminal device in which the functions of the information processing device 1 and the user terminal 11 are integrated may be adopted.
- the user terminal 11 is, for example, a terminal device used by a person in charge of a company or a user such as an individual.
- the user terminal 11 is realized by, for example, a smartphone, a tablet terminal, a notebook PC (Personal Computer), a desktop PC, a mobile phone, a PDA (Personal Digital Assistant), or the like.
- the information processing device 1 includes a communication unit 2, a control unit 3, and a storage unit 4.
- the communication unit 2 is realized by, for example, a NIC (Network Interface Card) or the like. Then, the communication unit 2 transmits / receives information to / from the user terminal 11 via a predetermined network.
- The control unit 3 includes a reception unit 31, an extraction unit 32, a determination unit 33, a presentation unit 34, a countermeasure plan execution unit 35, and an intervention calculation unit 36.
- the storage unit 4 stores the causal information 41 and the customer information 42.
- the information processing device 1 includes, for example, a computer having a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), a data flash, an input / output port, and various circuits.
- The CPU of the computer functions as the reception unit 31, the extraction unit 32, the determination unit 33, the presentation unit 34, the countermeasure plan execution unit 35, and the intervention calculation unit 36 of the control unit 3 by, for example, reading and executing a program stored in the ROM.
- At least one or all of the reception unit 31, the extraction unit 32, the determination unit 33, the presentation unit 34, the countermeasure plan execution unit 35, and the intervention calculation unit 36 of the control unit 3 can also be configured with hardware such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).
- the storage unit 4 corresponds to, for example, RAM or data flash.
- the RAM or data flash can store causal information 41, customer information 42, information on various programs, and the like.
- the information processing device 1 may acquire the above-mentioned program and various information via another computer or a portable recording medium connected by a wired or wireless network.
- the causal information 41 is information showing a probabilistic or statistical causal relationship between a plurality of variables.
- The causal information 41 may be created, for example, based on a statistically estimated model such as a causal Bayesian network or a causal structural equation (see, e.g., Patent Document 1). Alternatively, causal relationships between variables that have been associated by an expert or the user may be stored as the causal information 41.
- Customer information 42 is customer data owned by the company to which the user belongs.
- FIG. 3 is a diagram showing an example of customer information 42.
- the customer information 42 may be generated for each company, or customer data owned by a plurality of companies may be collected.
- The customer information 42 includes items such as "customer ID", "age", "gender", "address", and "questionnaire data".
- "Customer ID" is identification information that identifies the customer.
- "Age" is information indicating the age of the customer.
- The "age" may be the exact age as shown in FIG. 3, or may be abstracted, for example as "20's".
- "Gender" is information indicating the gender of the customer.
- The "address" is information indicating the customer's address.
- The "address" may be a detailed address, or may be abstracted, for example as "Tokyo" or "Kanto region".
- "Questionnaire data" is information indicating the answers to questionnaires conducted by companies for customers.
- "Unanswered" in the "questionnaire data" shown in FIG. 3 indicates that the questionnaire was not conducted, or that it was conducted but not answered.
- Next, the functional units of the control unit 3 (the reception unit 31, the extraction unit 32, the determination unit 33, the presentation unit 34, the countermeasure plan execution unit 35, and the intervention calculation unit 36) will be described.
- The reception unit 31 receives various information from the user via the user terminal 11. For example, the reception unit 31 receives from the user the information needed for calculating the intervention effect. Specifically, the reception unit 31 accepts the user's selection, from among the variables in the causal information 41, of the intervention variable (the variable to be intervened on) and the objective variable (the variable for which the intervention effect is to be seen). Further, the reception unit 31 accepts the user's selection operations for the options presented by the information processing device 1. Specifically, when a variable extracted by the extraction unit 32, which will be described later, has missing data, the reception unit 31 accepts a selection operation that designates how the missing variable is to be processed.
- the reception unit 31 accepts a selection operation for designating a countermeasure plan to be executed by the countermeasure plan execution unit 35, which will be described later, from a plurality of countermeasure plans presented to the user.
- the reception unit 31 receives the intervention content (change content) for the intervention variable when the intervention effect calculation can be performed.
- The extraction unit 32 extracts the variables necessary for the intervention effect calculation from the variables included in the causal information 41. Specifically, the extraction unit 32 extracts variables (for example, confounding variables) that have a direct or indirect causal relationship with the intervention variable, based on the causal information 41. In other words, the extraction unit 32 extracts the variables (variables Z1 and Z2) whose causal arrows in FIG. 1 are connected to the intervention variable X, as well as variables that are connected to those variables (variables Z1 and Z2) but are not connected to the intervention variable X.
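One way to picture this extraction is a walk up the arrows of the causal graph starting from the intervention variable. The sketch below is an assumption-laden illustration, not the patent's extraction rule: the edge list is a guess at the arrows of FIG. 1, the variable `W` is a hypothetical extra variable added to show an indirect relationship, and only upstream variables (ancestors) are collected.

```python
def ancestors(edges, target):
    """Return every variable with a directed path (cause -> effect) into target."""
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)
    found, stack = set(), [target]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


# Assumed arrows: Z1 and Z2 affect both X and Y, X affects Y.
# "W" is a hypothetical variable connected to Z1 but not directly to X.
edges = [("Z1", "X"), ("Z2", "X"), ("Z1", "Y"), ("Z2", "Y"), ("X", "Y"), ("W", "Z1")]
needed = sorted(ancestors(edges, "X"))
```

Here `needed` contains Z1 and Z2 (directly connected to X) and W (connected only through Z1).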
- The determination unit 33 determines whether or not the variable data required for the intervention effect calculation is insufficient when the intervention effect calculation is performed based on the causal information 41. Before determining whether data is insufficient, the determination unit 33 first determines whether the variables extracted by the extraction unit 32, the intervention variable, and the objective variable include any variable for which data is missing.
- a variable for which data is missing refers to a case where there is no data for some of the values that a variable can take. For example, regarding the frequency distribution data of continuous variables, there is no data (missing) of some values (classes) among the continuous values (classes).
- When there is a variable whose data is missing (hereinafter, a missing variable), the determination unit 33 presents the user with ways of handling the missing variable and accepts the user's selection.
- The determination unit 33 presents, for example, the following three ways of handling the missing variable: (1) exclude data containing missing values; (2) complement missing values; (3) treat missing values as one categorical value.
- When (2) is selected, the determination unit 33 complements the missing data of the missing variable. For example, when the missing variable is a variable having continuous values (a continuous variable), the determination unit 33 complements the missing values using, for example, the mean or the median of the other values of that variable. When the missing variable is a categorical variable, the determination unit 33 complements the missing values using, for example, a representative value of the missing variable.
- When (3) is selected, the determination unit 33 handles the missing data as it is. Specifically, the determination unit 33 attaches information indicating that the value is missing to each missing value of the missing variable, and then performs the subsequent aggregation process.
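The three ways of handling a missing variable can be sketched for a single column of values as follows. This is only an illustration under stated assumptions: the function name `handle_missing`, the strategy names, the use of `None` as the missing marker, and the choice of the mode as the "representative value" of a categorical variable are all hypothetical.

```python
from statistics import mean, mode


def handle_missing(values, strategy, categorical=False):
    """Apply one of three missing-value policies to a list that uses None for gaps."""
    observed = [v for v in values if v is not None]
    if strategy == "exclude":
        # (1) Drop the entries that have no value.
        return observed
    if strategy == "complement":
        # (2) Fill gaps with the mean (continuous) or the mode (categorical).
        fill = mode(observed) if categorical else mean(observed)
        return [fill if v is None else v for v in values]
    if strategy == "as_category":
        # (3) Keep the gaps, but label them as their own categorical value.
        return ["missing" if v is None else v for v in values]
    raise ValueError(f"unknown strategy: {strategy}")


filled = handle_missing([1, None, 3], "complement")
labelled = handle_missing([1, None, 3], "as_category")
```

In a real table, policy (1) would drop whole customer records rather than single cells; the per-column form is used here only to keep the sketch short.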
- The determination unit 33 aggregates the data of these variables when no variable data is missing, or once the processing according to the handling method selected by the user for the missing variables has been completed. Specifically, the determination unit 33 performs an aggregation process that counts the data for each combination of the intervention variable, the objective variable, and the variables extracted by the extraction unit 32.
- the determination unit 33 determines whether or not the number of data is equal to or greater than a predetermined threshold value for each combination of variables aggregated by the aggregation process. When the number of data is equal to or greater than the threshold value for all combinations of variables, the determination unit 33 determines that the data of the variables required for the intervention effect calculation is sufficient (not insufficient). That is, the determination unit 33 determines that the intervention effect calculation is possible.
- On the other hand, when the number of data is less than the threshold value for any combination of variables, the determination unit 33 determines that the data of the variables necessary for calculating the intervention effect is insufficient. That is, the determination unit 33 determines that the intervention effect calculation is impossible.
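The aggregation and threshold check can be sketched as a count per combination of variable values. This is a minimal sketch, not the patent's implementation: the record layout (a list of dicts), the function name `find_shortages`, the sample counts, and the threshold of 3 are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def find_shortages(records, variables, threshold):
    """Return {combination: count} for every value combination below the threshold."""
    counts = Counter(tuple(r[v] for v in variables) for r in records)
    # Domains are taken from the observed data, so a combination that never
    # occurs still shows up here with a count of 0.
    domains = [sorted({r[v] for r in records}) for v in variables]
    return {c: counts[c] for c in product(*domains) if counts[c] < threshold}


# Hypothetical data: X = satisfaction level, Y = withdrawal flag.
records = (
    [{"X": 1, "Y": 0}] * 5
    + [{"X": 1, "Y": 1}] * 4
    + [{"X": 2, "Y": 0}] * 6
    + [{"X": 2, "Y": 1}] * 1
)
shortages = find_shortages(records, ["X", "Y"], threshold=3)
can_calculate = not shortages
```

An empty result means every combination has enough data and the intervention effect calculation can proceed; otherwise the returned combinations are the ones to report to the user.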
- the determination unit 33 notifies the presentation unit 34 and the intervention calculation unit 36 of the determination result.
- The determination result notified to the presentation unit 34 includes information on whether or not the data of the variables necessary for calculating the intervention effect is insufficient, information on the combinations of variables for which the data is insufficient, and the like.
- After the countermeasure plan execution unit 35 has executed a countermeasure plan, the determination unit 33 again determines whether or not the variable data required for the intervention effect calculation is insufficient. That is, the determination unit 33 determines whether or not the data shortage has been resolved by the execution of the countermeasure plan.
- the presentation unit 34 presents information to the user based on the determination result of the determination unit 33. For example, the presentation unit 34 presents to the user information indicating that the intervention effect calculation is possible when the determination unit 33 determines that the data of the variables required for the intervention effect calculation is sufficient.
- On the other hand, when the determination unit 33 determines that the data of the variables required for the intervention effect calculation is insufficient, the presentation unit 34 presents information indicating that the intervention effect calculation is impossible to the user.
- Together with the information indicating that the intervention effect calculation is impossible, the presentation unit 34 presents information indicating which data is lacking and a countermeasure plan for making up the data. The details of the information presented by the presentation unit 34 will be described later with reference to FIGS. 4 to 16.
- The countermeasure plan execution unit 35 executes the countermeasure plan presented to the user by the presentation unit 34. For example, when a plurality of countermeasure plans are presented by the presentation unit 34, the reception unit 31 accepts the selection of one countermeasure plan from the user, and further receives an execution instruction for the selected countermeasure plan. Then, when the reception unit 31 receives the execution instruction, the countermeasure plan execution unit 35 executes the countermeasure plan. Specific examples of the countermeasure plans executed by the countermeasure plan execution unit 35 will be described later with reference to FIGS. 4 to 16.
- the intervention calculation unit 36 executes the intervention effect calculation when the determination unit 33 determines that the data of the variables required for the intervention effect calculation is sufficient. Specifically, the intervention calculation unit 36 calculates the intervention effect in the objective variable based on the content of the intervention in the intervention variable received from the user by the reception unit 31.
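The source does not state the formula used for the intervention effect calculation. One standard choice, assuming the extracted variables Z form a valid adjustment set for X and Y, is the adjustment formula P(Y = y | do(X = x)) = Σ_z P(Y = y | X = x, Z = z) P(Z = z). The sketch below applies it to hypothetical records; the function name, record layout, and data are assumptions, and it is not presented as the patent's method.

```python
from collections import Counter


def intervention_effect(records, x_value, y_value):
    """Estimate P(Y = y_value | do(X = x_value)) by adjusting for Z."""
    n = len(records)
    z_counts = Counter(r["Z"] for r in records)
    total = 0.0
    for z, n_z in z_counts.items():
        stratum = [r for r in records if r["Z"] == z and r["X"] == x_value]
        if not stratum:
            # No data for this (X, Z) combination: the estimate is undefined,
            # which is exactly the kind of data shortage checked beforehand.
            raise ValueError(f"no data for X={x_value!r}, Z={z!r}")
        p_y = sum(r["Y"] == y_value for r in stratum) / len(stratum)
        total += p_y * n_z / n
    return total


# Hypothetical records: Z = region, X = satisfaction (0/1), Y = withdrawal (0/1).
rows = [("a", 1, 1), ("a", 1, 0), ("a", 0, 0), ("a", 0, 0),
        ("b", 1, 1), ("b", 1, 1), ("b", 0, 1), ("b", 0, 0)]
records = [{"Z": z, "X": x, "Y": y} for z, x, y in rows]
effect = intervention_effect(records, x_value=1, y_value=1)
```

The explicit error for an empty stratum shows why the sufficiency check has to precede the calculation: with too few records per combination, the conditional probabilities are either undefined or unreliable.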
- the intervention calculation unit 36 notifies the presentation unit 34 of the calculation result of the intervention effect calculation, and the presentation unit 34 presents the intervention information based on the calculation result to the user.
- the details of the intervention information presented to the user will be described later.
- FIGS. 4 to 16 are diagrams showing information presented by the presentation unit 34.
- The presentation unit 34 presents, for example, the messages "Error!" and "Intervention calculation cannot be performed", which indicate that the intervention effect calculation is impossible, together with information indicating that the data is insufficient.
- As the information to be presented to the user, the presentation unit 34 may display only the information shown in the lower part of FIG. 4 (text information), or may additionally display the information shown in the upper part of FIG. 4 (the graphical model of the causal information 41).
- The presentation unit 34 may present, for example, information indicating that the intervention effect calculation is impossible together with a countermeasure plan for making up the variable data required for the intervention effect calculation.
- variable ⁇ is an arbitrary variable related to this intervention effect calculation, and in the example of FIG. 4, it is variable X, variable Z1 or variable Z2. Further, in the following, "Zi" is assumed to be a variable Z1 or a variable Z2.
- In FIG. 5, one of the countermeasure plans ("(2) remove the Zi having a weak influence") is dimmed. That is, for a countermeasure plan that cannot resolve the data shortage, the presentation unit 34 uses a display mode different from that of the other countermeasure plans. As a result, it is possible to prevent the user from erroneously selecting a meaningless countermeasure plan. Note that countermeasure plans that cannot resolve the data shortage may instead be hidden.
- When presenting a plurality of countermeasure plans, the presentation unit 34 presents, for each countermeasure plan, recommendation information (the black star marks shown in FIG. 5) based on the user's skill level.
- The user's skill level is a measure of how much experience the user has in data analysis.
- For example, the higher the user's skill level, the higher the presentation unit 34 sets the recommendation level of "changing the direction of the arrow from X to Zi". That is, because the reliability of the intervention effect calculation would be lowered if a user with a low skill level carried out the countermeasure plan of "changing the direction of the arrow from X to Zi", the presentation unit 34 dynamically changes its recommendation level according to the user's skill level. Further, as shown in FIG. 6, the presentation unit 34 displays auxiliary information such as "You are an expert, so the recommendation level is higher than usual." This makes it possible to convey that a high degree of skill is required when implementing the countermeasure plan of "changing the direction of the arrow from X to Zi".
- the reception unit 31 accepts the selection of at least one countermeasure (check the check box) from the user.
- the presentation unit 34 presents specific information as shown in FIGS. 7 to 14 for each of the countermeasure proposals received by the reception unit 31.
- the above-mentioned countermeasures (1) to (5) will be specifically described.
- FIG. 7 shows a specific example of the countermeasure "combining the categorical values of the variable ⁇ ".
- X among the candidates X, Z1 and Z2 of the variable ⁇ is selected by the judgment of the system or by the user.
- When the countermeasure "combine the categorical values of the variable ⁇ " is selected by the user, then among the plurality of values (categorical values) of the variable ⁇ , the data of a value for which data is insufficient is combined with the data of another value.
- FIG. 7 shows the distribution of the number of data (number of customers) at each value of the intervention variable X, “satisfaction with service A”.
- The values (1 to 5) of "satisfaction with service A" are "1: very dissatisfied", "2: slightly dissatisfied", "3: neither", "4: slightly satisfied", and "5: very satisfied".
- The presentation unit 34 proposes, for example, combining the data of "1: very dissatisfied" and "2: slightly dissatisfied". That is, the presentation unit 34 proposes treating "1: very dissatisfied", a value for which the number of data is insufficient, and "2: slightly dissatisfied", a value for which it is not, as a single value and adding up their data counts. In other words, it proposes that two values with relatively similar meanings be regarded as one value and their data counts be summed.
- the countermeasure plan execution unit 35 executes the countermeasure plan presented by the presentation unit 34. That is, the countermeasure plan execution unit 35 combines "1: very dissatisfied” and “2: slightly dissatisfied” and adds up the number of data of each value.
- the combined value is preferably expressed as a value that can identify the combination of "1: very dissatisfied” and "2: slightly dissatisfied”.
- the number of data of the value regarded as one is the total value of the number of data of the two values, so that the number of data is not insufficient.
- the possible values of the variable X are reduced from 5 values to 4 values, and the data shortage is solved. This makes it possible for the intervention calculation unit 36, which will be described later, to execute the intervention effect calculation.
- The combination is not limited to combining a value with insufficient data and a value without insufficient data; values that each lack data may be combined with each other, as long as the data shortage is resolved. Further, although FIG. 7 shows a case where two values are combined, three or more values may be combined.
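Combining categorical values amounts to relabelling the merged values and recounting. The following is a minimal sketch with hypothetical counts (the actual numbers in FIG. 7 are not given in the text); the function name `combine_values` and the merged label "1-2" are assumptions, the label being chosen so that the merged value still identifies its sources.

```python
from collections import Counter


def combine_values(values, to_merge, merged_label):
    """Replace every value in to_merge with merged_label."""
    return [merged_label if v in to_merge else v for v in values]


# Hypothetical answer counts for satisfaction levels 1 to 5.
satisfaction = [1] * 2 + [2] * 30 + [3] * 50 + [4] * 40 + [5] * 20
before = Counter(satisfaction)
after = Counter(combine_values(satisfaction, {1, 2}, "1-2"))
```

After the merge the variable has four values instead of five, and the merged value carries the sum of the two original counts.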
- FIG. 7 shows the combining process for the case where the number of data of the single variable X is insufficient, but the combining process can also be performed when the data for a combination of a plurality of variables is insufficient. This point will be described with reference to FIG.
- the presentation unit 34 visualizes the information indicating the variable for which the data is insufficient and presents it to the user. Specifically, the presentation unit 34 displays a list of the number of data in a table format by combining variables including variables for which data is insufficient. Further, the presentation unit 34 changes the background color information (shading and RGB) of each item displayed in the table format according to the number of data. This makes it possible for the user to easily grasp which combination of values of data is missing in the plurality of variables.
- the presentation unit 34 is not limited to changing the background color information of each item, and may change the display mode (character size, etc.) according to the number of data of each item.
- the presentation unit 34 presents the combination of the values of the variables together with the information in the table described above.
- To resolve the data shortage for the combination of "Hokkaido" and "very dissatisfied", the presentation unit 34 presents two combination methods. Specifically, the presentation unit 34 presents a method of combining "Hokkaido" and "Tohoku" and a method of combining "very dissatisfied" and "somewhat dissatisfied". That is, the presentation unit 34 proposes either combining values of the variable "address" to reduce it from 5 values to 4, or combining values of the variable "satisfaction with service A" to reduce it from 5 values to 4, either of which resolves the data shortage.
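Whether a proposed combination actually resolves a shortage in the two-variable table can be checked by relabelling and recounting the cells. The sketch below uses a reduced, hypothetical table with two regions and two satisfaction values; the helper names, counts, merged labels, and threshold of 5 are assumptions for illustration only.

```python
from collections import Counter
from itertools import product


def cells_below(pairs, threshold):
    """List the (address, satisfaction) cells whose count is below the threshold."""
    counts = Counter(pairs)
    rows = sorted({a for a, _ in pairs})
    cols = sorted({s for _, s in pairs})
    return [cell for cell in product(rows, cols) if counts[cell] < threshold]


def relabel(pairs, position, mapping):
    """Rename values of one variable (position 0 = address, 1 = satisfaction)."""
    out = []
    for pair in pairs:
        items = list(pair)
        items[position] = mapping.get(items[position], items[position])
        out.append(tuple(items))
    return out


pairs = (
    [("Hokkaido", "very dissatisfied")] * 1
    + [("Hokkaido", "somewhat dissatisfied")] * 5
    + [("Tohoku", "very dissatisfied")] * 6
    + [("Tohoku", "somewhat dissatisfied")] * 7
)
short_before = cells_below(pairs, 5)
by_address = relabel(pairs, 0, {"Hokkaido": "Hokkaido+Tohoku", "Tohoku": "Hokkaido+Tohoku"})
by_answer = relabel(pairs, 1, {"very dissatisfied": "dissatisfied",
                               "somewhat dissatisfied": "dissatisfied"})
```

A check of this kind could also be used to decide which candidate plans are worth offering, since a plan that leaves cells below the threshold does not resolve the shortage.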
- the countermeasure plan execution unit 35 executes the selected countermeasure plan.
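The table of data counts and the two joining proposals described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names `count_table`, `short_cells`, and `merge_values`, the sample records, and the threshold of 2 are all hypothetical.

```python
from collections import Counter

def count_table(records, var_a, var_b):
    """Count the records for every observed combination of two variables."""
    return Counter((r[var_a], r[var_b]) for r in records)

def short_cells(table, values_a, values_b, threshold):
    """List the combinations whose number of data is below the threshold."""
    return [(a, b) for a in values_a for b in values_b
            if table.get((a, b), 0) < threshold]

def merge_values(records, var, old_values, new_value):
    """Join several categorical values of one variable into a single value."""
    return [{**r, var: new_value} if r[var] in old_values else r
            for r in records]

# Hypothetical sample data (not taken from the source).
records = (
    [{"address": "Hokkaido", "satisfaction": "very dissatisfied"}] * 1
    + [{"address": "Tohoku", "satisfaction": "very dissatisfied"}] * 2
    + [{"address": "Hokkaido", "satisfaction": "satisfied"}] * 2
    + [{"address": "Tohoku", "satisfaction": "satisfied"}] * 2
)
levels = ["very dissatisfied", "satisfied"]

before = short_cells(count_table(records, "address", "satisfaction"),
                     ["Hokkaido", "Tohoku"], levels, threshold=2)
merged = merge_values(records, "address", {"Hokkaido", "Tohoku"}, "Hokkaido+Tohoku")
after = short_cells(count_table(merged, "address", "satisfaction"),
                    ["Hokkaido+Tohoku"], levels, threshold=2)
print(before)  # [('Hokkaido', 'very dissatisfied')]
print(after)   # []
```

Joining the two values of "satisfaction" instead would be expressed with the same `merge_values` call applied to the other variable.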
- the presentation unit 34 may combine the values (categorical values) of the variables X and Z, either for each variable or across the variables, so as to eliminate the areas (items of the combination of variables) that have a small number of data.
- clustering may be performed on the combination patterns of the variables, and the joining process may be performed internally, without being presented to the user, so that the intervention effect becomes calculable.
- the presentation unit 34 internally executes a joining process that eliminates areas where the number of data is insufficient by clustering items having similar distributions of variable values. More specifically, in the example shown in FIG. 8, the presentation unit 34 joins items of the combination of the variable X and the variable Z.
- FIG. 10 shows a specific example of the countermeasure plan “remove Zi having a weak influence”.
- the presentation unit 34 proposes to the user that, among the variables required for the intervention effect calculation, the variables that have a small influence on the intervention effect be excluded from the intervention effect calculation.
- the degree of influence on the intervention effect can be calculated based on, for example, a change in the mutual information amount (interdependent amount between the two variables) of the intervention variable X and the objective variable Y.
- the above-mentioned degree of influence can be calculated as the difference between the mutual information amount of the intervention variable X and the objective variable Y when conditioned on each variable Z1 (or variable Z2) and the mutual information amount when not conditioned on that variable. That is, the smaller the difference, the smaller the degree of influence.
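The degree of influence described above can be sketched for categorical data as follows. This is a minimal sketch under assumptions: the function names and the sample data are hypothetical, the probabilities are estimated by simple frequency counts, and the absolute value of the difference is used as the degree of influence.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """I(X;Y) in bits, estimated from paired categorical samples."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def conditional_mutual_information(xs, ys, zs):
    """I(X;Y|Z): the average of I(X;Y) within each stratum of Z."""
    n = len(xs)
    total = 0.0
    for z, nz in Counter(zs).items():
        idx = [i for i in range(n) if zs[i] == z]
        total += nz / n * mutual_information([xs[i] for i in idx],
                                             [ys[i] for i in idx])
    return total

def influence(xs, ys, zs):
    """Difference between conditioned and unconditioned mutual information."""
    return abs(conditional_mutual_information(xs, ys, zs)
               - mutual_information(xs, ys))

# Hypothetical data: Y copies X, Z1 is unrelated to X, Z2 equals X.
x = [0, 0, 1, 1, 0, 0, 1, 1]
y = list(x)
z1 = [0, 1, 0, 1, 0, 1, 0, 1]
z2 = list(x)
print(influence(x, y, z1))  # 0.0 -> Z1 is a candidate for removal
print(influence(x, y, z2))  # 1.0 -> Z2 has a large influence
```

A threshold on `influence` would then select the variables offered for removal.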
- variable Z1 "address” has a small influence on the calculation of the intervention effect when the intervention variable X and the objective variable Y are used. In other words, even if the variable Z1 "address" is excluded from the intervention effect calculation, the reliability of the calculation result does not decrease so much.
- the presentation unit 34 proposes to the user that the variable Z1 be removed from the intervention effect calculation.
- by removing the variable Z1, the countermeasure plan execution unit 35 makes it possible to calculate the intervention effect while minimizing the decrease in the reliability of the calculation result.
- the presentation unit 34 may present only the variable having the smallest degree of influence as the variable to be removed from the intervention effect calculation, or may display all the variables in a list in order of degree of influence and let the user choose the variable to be removed from the calculation.
- the presentation unit 34 may present a variable whose degree of influence is smaller than a predetermined threshold value, that is, a variable whose difference in the amount of mutual information described above is smaller than a predetermined threshold value.
- the presentation unit 34 may, for example, present the variable Z1 as a variable to be removed from the calculation when the mutual information amount of the variable X and the variable Z1, or the mutual information amount of the variable Y and the variable Z1, is smaller than a predetermined threshold value.
- the above-mentioned degree of influence is not limited to being calculated based on the mutual information amount of two variables. It may be calculated by other methods, for example, a function for calculating the correlation between two variables, a function for calculating the similarity between two variables, or a propensity score. That is, it can be calculated by any method of measuring the distance between two variables.
- the presentation unit 34 indicates that the above-mentioned degree of influence is smaller than the predetermined threshold value and there is no influence even if it is removed from the calculation.
- the process of removing from the calculation may be performed internally without presenting the countermeasure plan to the user.
- FIG. 11 shows a specific example of the countermeasure plan “changing the direction of the arrow from X to Zi”.
- the presentation unit 34 presents to the user that when the countermeasure "change the direction of the arrow from X to Zi" is selected by the user, the direction of the causality between the variables required for the calculation of the intervention effect is changed.
- variable Z2 → variable X before the change is changed to variable X → variable Z1 after the change.
- the variable whose direction of causality is to be changed may be arbitrarily selected by the user, or the control unit 3 may automatically select, for example, a variable having a high correlation or a high similarity between the two variables.
- the presentation unit 34 presents the user with a countermeasure plan for changing the direction of causality, and also presents supplementary information regarding the countermeasure plan.
- the presentation unit 34 displays "Tips", which is supplementary information, together with the countermeasure plan.
- when the reception unit 31 receives the execution instruction of the countermeasure plan from the user, the countermeasure plan execution unit 35 executes the countermeasure plan.
- the direction of causality between the two variables is reversed, and as shown in FIG. 12, the variables required for calculating the intervention effect change.
- FIG. 12 shows a case where the causal directions of the variables X and Z2 are changed.
- the variables Z1 and Z2 are variables necessary for calculating the intervention effect.
- the intervention effect cannot be calculated in the causal relationship shown in the upper part of FIG. Therefore, as shown in the lower part of FIG. 12, by changing the direction of the causality of the variable X and the variable Z2, the variable Z2 is no longer a variable necessary for calculating the intervention effect. That is, in the case of the causal relationship shown in the lower part of FIG. 12, the intervention effect can be calculated when the intervention variable X and the objective variable Y are used. In other words, by changing the direction of causality, the variables required for intervention effect calculation change (decrease), so the variables lacking data are no longer necessary for calculation, and as a result, the data shortage can be resolved.
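How reversing one causal direction shrinks the set of variables required for the calculation can be sketched as follows. This is a minimal sketch under a stated assumption: the source does not say how the required variables are derived, so the sketch takes them to be the direct causes (parents) of the intervention variable X, which is one valid adjustment set when all variables are observed. The graph and the function names are hypothetical.

```python
def parents(edges, node):
    """Direct causes of a node in a graph given as (cause, effect) pairs."""
    return {src for src, dst in edges if dst == node}

def reverse_edge(edges, src, dst):
    """Return a new edge set in which src -> dst is replaced by dst -> src."""
    if (src, dst) not in edges:
        raise ValueError(f"no edge {src} -> {dst}")
    return (edges - {(src, dst)}) | {(dst, src)}

# Hypothetical causal graph before the change: Z1 and Z2 both cause X.
before = {("Z1", "X"), ("Z2", "X"), ("Z1", "Y"), ("Z2", "Y"), ("X", "Y")}
after = reverse_edge(before, "Z2", "X")

print(sorted(parents(before, "X")))  # ['Z1', 'Z2']
print(sorted(parents(after, "X")))   # ['Z1']
```

After the reversal, Z2 is no longer among the variables needed for the calculation, so a shortage of Z2 data no longer blocks it.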
- the presentation unit 34 changes the intervening variable from the variable X to the variable Z2 when the correlation (or the degree of influence described above) between the variable X and the variable Z2 is equal to or higher than a predetermined threshold value.
- the presentation unit 34 presents that if there is a variable that has a high correlation with the intervention variable before the change, the intervention variable is changed to the variable. That is, the presentation unit 34 presents to the user as a countermeasure plan that the intervening variable is changed without changing the objective variable Y.
- when the variable Z1 lacks data, setting the variable Z2 as the intervention variable means that the variable Z1 is no longer a variable required for the intervention effect calculation, and as a result, the data shortage of the variables required for the intervention effect calculation can be resolved.
- the presentation unit 34 presents the user with a countermeasure plan for changing the intervention variable, and also presents supplementary information regarding the countermeasure plan.
- the presentation unit 34 displays "Tips", which is supplementary information, together with the countermeasure plan.
- FIG. 14 shows a specific example of the countermeasure plan "continue the calculation although the accuracy drops". That is, when the countermeasure plan "continue the calculation although the accuracy drops" is selected by the user, the presentation unit 34 presents that the accuracy of the intervention effect calculation (the reliability shown in the lower part of FIG. 14) is reduced, and proposes that the user continue the intervention effect calculation.
- the presentation unit 34 displays information for the user to input the content of the intervention in the intervention variable X. Specifically, the presentation unit 34 displays the data of the variable X before the intervention and the data of the variable X after the intervention according to the user operation. That is, the user performs an operation of changing the distribution of the data of the variable X after the intervention. As shown in FIG. 14, among the values of the variable X, it is preferable that the value in which the user intervenes is displayed differently from the value in which the user has not intervened. In FIG. 14, the data of the value in which the user intervened is shown by hatching.
- the intervention calculation unit 36 calculates the intervention effect when the user operates the "decision" button after the intervention operation, and the presentation unit 34 displays the intervention information based on the calculation result of the intervention calculation unit 36.
- the intervention information presented by the presentation unit 34 includes the data of the objective variable Y before and after the intervention, and information on the change in the ratio of the number of data, that is, the intervention effect (for example, "The ratio of "1" increased by XX%!").
- the presentation unit 34 presents to the user intervention information including the reliability of the intervention effect, such as "However, the reliability of the result is ○○% because an approximate calculation is performed." It is preferable that the presentation unit 34 presents the reliability of the intervention effect only when the determination unit 33 determines that the data is insufficient and the calculation of the intervention effect is originally impossible. That is, when the determination unit 33 determines that the data of the variables required for the intervention effect calculation is sufficient, the presentation unit 34 does not present information indicating the reliability of the intervention effect.
- when calculating the above-mentioned reliability (accuracy of the intervention effect calculation), the intervention calculation unit 36, for example, repeatedly executes random sampling of the data followed by the intervention effect calculation, using the data of the variables required for the intervention effect calculation, and calculates the reliability from the variance of the calculation results. Alternatively, the intervention calculation unit 36 may calculate an index normalized according to the number of data for each combination of variables required for the intervention effect calculation, and use the minimum value as the reliability.
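The two reliability calculations described above can be sketched as follows. This is a minimal sketch under assumptions: the source does not specify the sampling scheme, the effect function, or the normalization, so resampling with replacement, the toy effect `rate_difference`, and the index `min(count / required, 1)` are all hypothetical choices, as are the function names and the sample data.

```python
import random
from collections import Counter
from statistics import pvariance

def bootstrap_variance(records, effect_fn, n_rounds=200, seed=0):
    """Repeat 'random sampling -> effect calculation' and return the variance."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_rounds):
        sample = rng.choices(records, k=len(records))  # sample with replacement
        results.append(effect_fn(sample))
    return pvariance(results)

def count_based_reliability(records, keys, combos, required):
    """Minimum over all combinations of min(count / required, 1)."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return min(min(counts.get(c, 0) / required, 1.0) for c in combos)

def rate_difference(sample):
    """Toy effect: P(y=1 | x=1) - P(y=1 | x=0); 0.0 if a group is empty."""
    g1 = [r["y"] for r in sample if r["x"] == 1]
    g0 = [r["y"] for r in sample if r["x"] == 0]
    if not g1 or not g0:
        return 0.0
    return sum(g1) / len(g1) - sum(g0) / len(g0)

# Hypothetical sample data (not taken from the source).
records = [
    {"x": 1, "y": 1}, {"x": 1, "y": 1}, {"x": 1, "y": 0},
    {"x": 0, "y": 0}, {"x": 0, "y": 0}, {"x": 0, "y": 1},
]
# A larger variance suggests a less reliable result.
print(bootstrap_variance(records, rate_difference))
print(count_based_reliability(records, ("x",), [(0,), (1,)], required=5))
```

How the variance is mapped to a percentage shown to the user is left open here, because the source does not state it.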
- the above describes the content presented when the determination unit 33 determines that the data required for the intervention effect calculation is insufficient and the intervention effect calculation is continued. When, for example, the determination unit 33 determines that the data required for the intervention effect calculation is sufficient, basically the same information as in FIG. 14 is presented except for the presentation of reliability.
- FIG. 15 is a diagram showing information presented by the presentation unit 34.
- when displaying the distribution of the data of the variable X after the intervention, the presentation unit 34 superimposes the display of "input not possible" on the data of the value "1" so that the data of the value "1" cannot be changed.
- the presentation unit 34 prohibits acceptance of intervention from the user for a value whose data is less than a predetermined number among the plurality of values included in the variable X.
- in this example, the display of "input not possible" is superimposed on the data of the value "1" to prohibit the intervention, but the display mode may be arbitrary. For example, the data of a value for which intervention is prohibited may be dimmed or made transparent. Alternatively, the display mode may be left unchanged, and the data may simply remain unchanged when the user performs an intervention operation.
- the presentation unit 34 may display the reason why "input is not possible" together with the screen. For example, the presentation unit 34 may display that the data for the value "1" cannot be changed because the number of data is insufficient, or that the data cannot be changed because the reliability of the calculation result of the intervention effect would be low due to the insufficient number of data.
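The prohibition of intervention on values with too few data can be sketched as follows. This is a minimal sketch: the function names, the message text, the sample data, and the minimum count of 3 are hypothetical.

```python
from collections import Counter

def locked_values(x_data, min_count):
    """Values of variable X whose number of data is below the minimum."""
    counts = Counter(x_data)
    return {v for v, c in counts.items() if c < min_count}

def accept_intervention(x_data, value, min_count):
    """Return (accepted, reason); interventions on locked values are refused."""
    if value in locked_values(x_data, min_count):
        return False, f'input not possible: value "{value}" has too few data'
    return True, ""

# Hypothetical data of variable X: value "1" appears only once.
x_data = ["1"] + ["2"] * 5 + ["3"] * 4
print(locked_values(x_data, min_count=3))             # {'1'}
print(accept_intervention(x_data, "1", min_count=3))
print(accept_intervention(x_data, "2", min_count=3))  # (True, '')
```

The returned reason string corresponds to the optional explanation shown next to the "input not possible" display.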
- FIG. 16 shows a specific example of the countermeasure plan “Retrieving data”.
- the presentation unit 34 presents information indicating the variable for which the data is insufficient, and proposes that the user re-acquire the data.
- when proposing that the data be re-acquired, the presentation unit 34 is not limited to displaying the variable lacking data as text, and may also display, for example, the information of the table (visualized information) as shown in FIG. 8.
- FIG. 17 is a flowchart showing an information processing procedure executed by the information processing apparatus 1 according to the embodiment. In FIG. 17, it is assumed that the causal information 41 is generated in advance.
- the reception unit 31 receives the selection of the intervention variable and the objective variable from the variables in the causal information 41 from the user (step S101).
- the extraction unit 32 extracts variables (confounding variables, etc.) necessary for calculating the intervention effect from the variables in the causal information 41 (step S102).
- the determination unit 33 determines whether or not the data of the variables extracted by the extraction unit 32 contains missing values (step S103).
- the presentation unit 34 presents the user with information including the following options (1) to (3), and the reception unit 31 accepts the selection of one of the options from the user (step S104): (1) exclude data containing missing values; (2) complement missing values; (3) treat missing values as one categorical value.
- the determination unit 33 performs processing according to the option selected from (1) to (3) on the variables including the missing values, and aggregates the data for each combination of the variables necessary for calculating the intervention effect (step S105).
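The three options of step S104 and the aggregation of step S105 can be sketched as follows. This is a minimal sketch under assumptions: missing values are represented by `None`, option (2) fills in the most frequent value because the source does not specify the complement method, and the function names and sample data are hypothetical.

```python
from collections import Counter

def handle_missing(records, variables, option):
    """Apply one of the three options to records whose missing values are None."""
    if option == "exclude":      # (1) drop records containing missing values
        return [r for r in records
                if all(r[v] is not None for v in variables)]
    if option == "complement":   # (2) fill with the most frequent value (assumption)
        modes = {v: Counter(r[v] for r in records
                            if r[v] is not None).most_common(1)[0][0]
                 for v in variables}
        return [{v: modes[v] if r[v] is None else r[v] for v in variables}
                for r in records]
    if option == "category":     # (3) treat "missing" as one categorical value
        return [{v: "missing" if r[v] is None else r[v] for v in variables}
                for r in records]
    raise ValueError(f"unknown option: {option}")

def aggregate(records, variables):
    """Step S105: number of data for each combination of variable values."""
    return Counter(tuple(r[v] for v in variables) for r in records)

# Hypothetical sample data (not taken from the source).
records = [
    {"x": "a", "z": "p"}, {"x": "a", "z": None},
    {"x": "b", "z": "p"}, {"x": None, "z": "q"},
]
for option in ("exclude", "complement", "category"):
    print(option, dict(aggregate(handle_missing(records, ["x", "z"], option),
                                 ["x", "z"])))
```

The aggregated counts are what the threshold check of step S106 would then inspect.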
- the determination unit 33 determines whether or not the total number of data is equal to or greater than the threshold value (step S106). That is, the determination unit 33 determines the lack of variable data necessary for calculating the intervention effect.
- when the total number of data is equal to or greater than the threshold value, that is, when the data of the variables required for the intervention effect calculation is sufficient (step S106: Yes), the presentation unit 34 presents to the user that the intervention effect calculation is possible (step S107).
- the reception unit 31 receives a change in the distribution of the intervention variable from the user, and the intervention calculation unit 36 executes the intervention effect calculation in the objective variable based on the intervention content received from the user (step S108).
- the presentation unit 34 presents the change in the distribution of the objective variable based on the calculation result of the intervention calculation unit 36 to the user (step S109), and ends the process.
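Steps S108 and S109 can be illustrated with a small calculation. The source does not give the formula used by the intervention calculation unit 36, so the sketch below assumes a standard adjustment over a single confounding variable Z: the distribution of Y after the intervention is the sum over x and z of P(y | x, z) P(z) P'(x), where P'(x) is the distribution of X entered by the user. The function name, the variable keys, and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def post_intervention_distribution(records, new_px, x="x", y="y", z="z"):
    """Distribution of Y when X is set to follow the user-specified new_px."""
    n = len(records)
    pz = {k: c / n for k, c in Counter(r[z] for r in records).items()}
    cell = Counter((r[x], r[z]) for r in records)
    joint = Counter((r[x], r[z], r[y]) for r in records)
    py = defaultdict(float)
    for xv, px_new in new_px.items():
        for zv, p_z in pz.items():
            if cell[(xv, zv)] == 0:
                # This is the data shortage case: P(y | x, z) cannot be estimated.
                raise ValueError(f"no data for combination x={xv}, z={zv}")
            for (jx, jz, yv), c in joint.items():
                if jx == xv and jz == zv:
                    py[yv] += px_new * p_z * c / cell[(xv, zv)]
    return dict(py)

# Hypothetical sample data: two records for most (z, x, y) patterns.
records = (
    [{"z": 0, "x": 0, "y": 0}] * 2 + [{"z": 0, "x": 1, "y": 1}] * 2
    + [{"z": 1, "x": 0, "y": 0}, {"z": 1, "x": 0, "y": 1}]
    + [{"z": 1, "x": 1, "y": 1}] * 2
)
result = post_intervention_distribution(records, {0: 0.5, 1: 0.5})
print(result[1])  # 0.625
```

Comparing this result with the observed distribution of Y gives the kind of before-and-after change presented in step S109.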
- in step S103, if the data of the extracted variables has no missing values (step S103: No), the control unit 3 advances the process to step S105.
- in step S106, when the total number of data is less than the threshold value, that is, when the data of the variables required for the intervention effect calculation is insufficient (step S106: No), the presentation unit 34 presents to the user that the intervention effect cannot be calculated (step S110).
- the presentation unit 34 displays a list of countermeasure plans together with recommended information indicating the degree of recommendation (step S111). Subsequently, the presentation unit 34 accepts the selection of one countermeasure plan from the user and displays the information regarding the accepted countermeasure plan (step S112).
- the countermeasure plan execution unit 35 determines whether or not the user has performed an operation permitting the execution of the countermeasure plan (step S113), and ends the process when the user has performed an operation not permitting the execution of the countermeasure plan (step S113: No).
- in step S113, when the user performs an operation permitting the execution of the countermeasure plan (step S113: Yes), the countermeasure plan execution unit 35 executes the countermeasure plan (step S114) and proceeds to step S105.
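The loop formed by steps S105, S106, and S110 to S114 can be sketched as follows. This is a minimal sketch under assumptions: the threshold check is read as "every aggregated count reaches the threshold", user interaction is replaced by a callback, and the function names, the round limit, and the sample data are hypothetical.

```python
from collections import Counter

def run_procedure(records, aggregate, threshold, pick_countermeasure, max_rounds=10):
    """Aggregate, check, and apply countermeasures until the data suffice."""
    for _ in range(max_rounds):
        counts = aggregate(records)                     # step S105
        if min(counts.values()) >= threshold:           # step S106
            return "calculable", records                # step S107
        countermeasure = pick_countermeasure(counts)    # steps S110 to S113
        if countermeasure is None:                      # execution not permitted
            return "ended", records
        records = countermeasure(records)               # step S114
    return "ended", records

def merge_b_into_a(values):
    """Hypothetical countermeasure: join value 'b' into value 'a'."""
    return ["a" if v == "b" else v for v in values]

data = ["a", "a", "a", "b"]  # value 'b' has too few data for threshold 2
print(run_procedure(data, Counter, 2, lambda counts: merge_b_into_a))
print(run_procedure(data, Counter, 2, lambda counts: None))
```

The first call ends with the status "calculable" after one joining step; the second ends immediately because no countermeasure is permitted.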
- FIG. 18 is a block diagram showing an example of the hardware configuration of the information processing device 1 according to the present embodiment.
- the information processing device 1 includes a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 902, a RAM (Random Access Memory) 903, a host bus 905, a bridge 907, an external bus 906, an interface 908, an input device 911, an output device 912, a storage device 913, a drive 914, a connection port 915, and a communication device 916.
- the information processing device 1 may include a processing circuit such as an electric circuit, a DSP, or an ASIC in place of or in combination with the CPU 901.
- the CPU 901 functions as an arithmetic processing device and a control device, and controls the overall operation in the information processing device 1 according to various programs. Further, the CPU 901 may be a microprocessor.
- the ROM 902 stores programs, calculation parameters, and the like used by the CPU 901.
- the RAM 903 temporarily stores a program used in the execution of the CPU 901, parameters that are appropriately changed in the execution, and the like.
- the CPU 901 may execute the functions of the reception unit 31, the extraction unit 32, the determination unit 33, the presentation unit 34, the countermeasure proposal execution unit 35, and the intervention calculation unit 36, for example.
- the CPU 901, ROM 902 and RAM 903 are connected to each other by a host bus 905 including a CPU bus and the like.
- the host bus 905 is connected to an external bus 906 such as a PCI (Peripheral Component Interconnect / Interface) bus via a bridge 907.
- the host bus 905, the bridge 907, and the external bus 906 do not necessarily have to be separately configured, and these functions may be implemented on one bus.
- the input device 911 is a device through which the user inputs information, such as a mouse, keyboard, touch panel, button, microphone, switch, or lever.
- the input device 911 may be a remote control device using infrared rays or other radio waves, or may be an externally connected device such as a mobile phone or a PDA that supports the operation of the information processing device 1.
- the input device 911 may include, for example, an input control circuit that generates an input signal based on the information input by the user using the above input means.
- the output device 912 is a device capable of visually or audibly notifying the user of information.
- the output device 912 is, for example, a display device such as a CRT (Cathode Ray Tube) display device, a liquid crystal display device, a plasma display device, an EL (Electroluminescence) display device, a laser projector, an LED (Light Emitting Diode) projector, or a lamp, or an audio output device such as a speaker or headphones.
- the output device 912 may output the results obtained by various processes by the information processing device 1, for example. Specifically, the output device 912 may visually display the results obtained by various processes by the information processing device 1 in various formats such as texts, images, tables, and graphs. Alternatively, the output device 912 may convert an audio signal such as audio data or acoustic data into an analog signal and output it audibly.
- the input device 911 and the output device 912 may, for example, perform an interface function.
- the storage device 913 is a data storage device formed as an example of the storage unit 4 of the information processing device 1.
- the storage device 913 may be realized by, for example, a magnetic storage device such as an HDD (Hard Disk Drive), a semiconductor storage device, an optical storage device, a magneto-optical storage device, or the like.
- the storage device 913 may include a storage medium, a recording device that records data on the storage medium, a reading device that reads data from the storage medium, a deleting device that deletes the data recorded on the storage medium, and the like.
- the storage device 913 may store a program executed by the CPU 901, various data, various data acquired from the outside, and the like.
- the storage device 913 may execute a function of storing the causal information 41 and the customer information 42, for example.
- the drive 914 is a reader / writer for a storage medium, and is built in or externally attached to the information processing device 1.
- the drive 914 reads the information recorded in the removable storage medium such as the mounted magnetic disk, optical disk, magneto-optical disk, or semiconductor memory, and outputs the information to the RAM 903.
- the drive 914 can also write information to the removable storage medium.
- the connection port 915 is an interface connected to an external device.
- the connection port 915 is a connection port capable of transmitting data with an external device, and may be, for example, USB (Universal Serial Bus).
- the communication device 916 is, for example, an interface formed by a communication device or the like for connecting to the network 920.
- the communication device 916 may be, for example, a communication card for a wired or wireless LAN (Local Area Network), LTE (Long Term Evolution), Bluetooth (registered trademark), WUSB (Wireless USB), or the like.
- the communication device 916 may be a router for optical communication, a router for ADSL (Asymmetric Digital Subscriber Line), a modem for various communications, or the like.
- the communication device 916 can send and receive signals and the like to and from the Internet or other communication devices in accordance with a predetermined protocol such as TCP / IP.
- the network 920 is a wired or wireless transmission line for information.
- the network 920 may include a public network such as the Internet, a telephone line network, or a satellite communication network, various LANs (Local Area Network) including Ethernet (registered trademark), a WAN (Wide Area Network), and the like.
- the network 920 may include a dedicated line network such as IP-VPN (Internet Protocol-Virtual Private Network).
- it is also possible to create a computer program that causes the hardware built in the information processing device 1, such as the CPU, ROM, and RAM, to exert the same functions as each configuration of the information processing device 1 according to the above-described embodiment. It is also possible to provide a storage medium in which the computer program is stored.
- each component of each device shown in the figure is a functional concept, and does not necessarily have to be physically configured as shown in the figure. That is, the specific form of distribution and integration of each device is not limited to the one shown in the figure, and all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions.
- the information processing device 1 includes a determination unit 33 and a presentation unit 34.
- in the calculation of the intervention effect that occurs in the objective variable by intervening in any one of the plurality of variables based on the causal information 41 showing the causal relationship between the plurality of variables, the determination unit 33 determines whether or not the data of the variables required for the calculation is insufficient.
- the presentation unit 34 presents information to the user based on the determination result of the determination unit 33.
- the presentation unit 34 presents the information indicating that the calculation is impossible to the user.
- the presentation unit 34 presents the information indicating the variables for which the data is insufficient to the user.
- the presentation unit 34 visualizes the information indicating the variable for which the data is insufficient and presents it to the user.
- the user can intuitively grasp which combination of data is lacking among the data obtained by combining a plurality of variables.
- the presentation unit 34 presents to the user a countermeasure plan for satisfying the variable data required for the calculation.
- the presentation unit 34 presents to the user that the data of the value for which the data is insufficient and the data of the other value are to be combined with respect to the plurality of values included in the variable.
- the presentation unit 34 presents to the user as a countermeasure plan to exclude from the calculation the variables that have a small effect on the intervention effect among the variables necessary for the calculation.
- the presentation unit 34 presents to the user as a countermeasure plan to change the direction of causality between the variables required for calculation.
- the presentation unit 34 presents, as a countermeasure plan, that the accuracy of the calculation is reduced, and proposes that the user continue the calculation.
- a user who does not want to change the content of the data can calculate the intervention effect.
- the presentation unit 34 presents information indicating the variables for which data is insufficient, and proposes that the user re-acquire the data.
- the presentation unit 34 presents to the user as a countermeasure plan to change the intervening variable without changing the objective variable.
- the presentation unit 34 presents supplementary information regarding the countermeasure plan to the user together with the countermeasure plan.
- when presenting a plurality of countermeasure plans, the presentation unit 34 presents recommended information based on the skill level of the user for each countermeasure plan.
- the information processing device 1 further includes an intervention calculation unit 36.
- the intervention calculation unit 36 calculates the intervention effect in the objective variable based on the intervention content received from the user.
- the presentation unit 34 presents the intervention information to the user based on the calculation result of the intervention calculation unit 36.
- the presentation unit 34 prohibits acceptance of intervention from the user for a value whose data is less than a predetermined number among the plurality of values included in the variable.
- the presentation unit 34 presents the intervention information including the intervention effect and the reliability of the intervention effect to the user.
- the presentation unit 34 presents the intervention information including the data before and after the intervention of the intervening variable and the objective variable to the user.
- the present technology can also have the following configurations.
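- (1) An information processing device including: a determination unit that, in the calculation of the intervention effect that occurs in the objective variable by intervening in any one of a plurality of variables based on causal information showing the causal relationship between the plurality of variables, determines whether or not the data of the variable required for the calculation is insufficient; and a presentation unit that presents information based on the determination result of the determination unit to the user.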
- (2) The information processing device according to (1) above, wherein the presentation unit presents to the user information indicating that the calculation is impossible when the determination unit determines that the data of the variable required for the calculation is insufficient.
- (3) The information processing device according to (1) or (2) above, wherein the presentation unit presents to the user information indicating the variable for which data is insufficient when the determination unit determines that the data of the variable required for the calculation is insufficient.
- (4) The information processing device according to (3) above, wherein, when data is insufficient due to a combination of a plurality of variables, the presentation unit visualizes information indicating the variable for which data is insufficient and presents it to the user.
- (5) The information processing device according to any one of (1) to (4) above, wherein the presentation unit presents to the user a countermeasure plan for satisfying the data of the variable required for the calculation when the determination unit determines that the data of the variable required for the calculation is insufficient.
- (6) The information processing device according to (5) above, wherein the presentation unit presents to the user, as the countermeasure plan, combining the data of a value for which data is insufficient with the data of another value among the plurality of values included in the variable.
- (7) The information processing device according to (5) or (6) above, wherein the presentation unit presents to the user, as the countermeasure plan, excluding from the calculation a variable having a small influence on the intervention effect among the variables required for the calculation.
- (8) The information processing device according to any one of (5) to (7) above, wherein the presentation unit presents to the user, as the countermeasure plan, changing the direction of causality between the variables required for the calculation.
- (9) The information processing device according to any one of (5) to (8) above, wherein the presentation unit presents, as the countermeasure plan, that the accuracy of the calculation is lowered and proposes that the user continue the calculation.
- (10) The information processing device according to any one of (5) to (9) above, wherein the presentation unit presents, as the countermeasure plan, information indicating the variable for which data is insufficient and proposes that the user re-acquire the data.
- (11) The information processing device according to any one of (5) to (10) above, wherein the presentation unit presents to the user, as the countermeasure plan, changing the intervening variable without changing the objective variable.
- (12) The information processing device according to any one of (5) to (11) above, wherein the presentation unit presents supplementary information about the countermeasure plan to the user together with the countermeasure plan.
- (13) The information processing device according to any one of (5) to (12) above, wherein the presentation unit presents recommended information based on the user's skill level for each of a plurality of the countermeasure plans.
- (14) The information processing device according to any one of (1) to (13) above, further including an intervention calculation unit that calculates the intervention effect in the objective variable based on the intervention content received from the user, wherein the presentation unit presents intervention information based on the calculation result of the intervention calculation unit to the user.
- (15) The information processing device according to (14) above, wherein, when an intervention for the intervention calculation unit to calculate the intervention effect is accepted from the user, the presentation unit prohibits acceptance of the intervention for a value whose data is less than a predetermined number among the plurality of values included in the variable.
- (18) An information processing method including: a determination process of determining, when an intervention effect produced in an objective variable by intervening in any one of a plurality of variables is calculated based on causal information indicating causal relationships among the plurality of variables, whether data of the variables required for the calculation is insufficient; and a presentation process of presenting information based on a determination result of the determination process to a user.
- (19) An information processing program that, by being read by a computer, causes the computer to function as: a determination unit that, when an intervention effect produced in an objective variable by intervening in any one of a plurality of variables is calculated based on causal information indicating causal relationships among the plurality of variables, determines whether data of the variables required for the calculation is insufficient; and a presentation unit that presents information based on a determination result of the determination unit to a user.
- 1 Information processing device, 2 Communication unit, 3 Control unit, 4 Storage unit, 11 User terminal, 31 Reception unit, 32 Extraction unit, 33 Determination unit, 34 Presentation unit, 35 Countermeasure execution unit, 36 Intervention calculation unit, 41 Causal information, 42 Customer information
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Theoretical Computer Science (AREA)
- Development Economics (AREA)
- Physics & Mathematics (AREA)
- Entrepreneurship & Innovation (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- General Physics & Mathematics (AREA)
- Economics (AREA)
- Data Mining & Analysis (AREA)
- General Business, Economics & Management (AREA)
- Marketing (AREA)
- Game Theory and Decision Science (AREA)
- Human Resources & Organizations (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- Operations Research (AREA)
- Educational Administration (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
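First, an overview of the information processing method according to the embodiment is described with reference to FIG. 1. FIG. 1 is a diagram showing an overview of the information processing method according to the embodiment of the present disclosure. FIG. 1 shows causal information indicating the causal relationships among a plurality of variables, that is, a so-called causal graph. In the example of FIG. 1, the causal graph indicates, for four variables X, Y, Z1, and Z2, the direction of causality between variables by arrows (cause → effect). In other words, the causal graph shown in FIG. 1 is a directed graph. Put another way, the causal information shown in FIG. 1 is information of a graphical model with probability distributions, in which probabilistic and statistical cause and effect variables are connected by arrows.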
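As an illustration of the directed causal graph described above, the following is a minimal sketch and not the representation used in the patent. The variable names X, Y, Z1, and Z2 come from the text, but the concrete arrows are an assumption, because this excerpt does not list them. The helper names `parents` and `is_acyclic` are likewise hypothetical.

```python
from typing import Dict, List

# Each key is a cause and each list holds its direct effects (cause -> effect).
# ASSUMPTION: these arrows are illustrative only; the excerpt names the
# variables X, Y, Z1, Z2 but does not state which arrows FIG. 1 contains.
CAUSAL_GRAPH: Dict[str, List[str]] = {
    "Z1": ["X", "Y"],
    "Z2": ["X", "Y"],
    "X": ["Y"],
    "Y": [],
}


def parents(graph: Dict[str, List[str]], node: str) -> List[str]:
    """Return the direct causes of `node`, sorted by name."""
    return sorted(cause for cause, effects in graph.items() if node in effects)


def is_acyclic(graph: Dict[str, List[str]]) -> bool:
    """Check that the directed graph has no cycle (Kahn's algorithm).

    Every effect is expected to appear as a key of `graph` as well.
    """
    indegree = {node: 0 for node in graph}
    for effects in graph.values():
        for effect in effects:
            indegree[effect] += 1
    ready = [node for node, degree in indegree.items() if degree == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for effect in graph[node]:
            indegree[effect] -= 1
            if indegree[effect] == 0:
                ready.append(effect)
    return visited == len(graph)
```

Under these assumed arrows, `parents(CAUSAL_GRAPH, "X")` lists Z1 and Z2, which are the variables whose data would be needed alongside X when estimating an effect on Y.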
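Next, the configuration of the information processing device 1 according to the embodiment is described with reference to FIG. 2. FIG. 2 is a block diagram showing the configuration of the information processing device 1 according to the embodiment. As shown in FIG. 2, the information processing device 1 is communicably connected to a user terminal 11 via a predetermined network (not shown). Although this embodiment shows the case where the information processing device 1 and the user terminal 11 are configured as separate units, in other embodiments a terminal device in which the functions of the information processing device 1 and the user terminal 11 are integrated may be adopted.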
(1) Exclude data that contains missing values
(2) Impute missing values
(3) Treat missing values as one categorical value
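When "Exclude data that contains missing values" is selected, the determination unit 33 performs the subsequent aggregation processing on the assumption that, for a variable containing missing values (a missing variable), the missing data does not exist.
When "Impute missing values" is selected, the determination unit 33 imputes the missing data of the missing variable. For example, when the missing variable takes continuous values (a continuous variable), the determination unit 33 imputes the missing values using, for example, the mean or median of the data of the other values in the missing variable. When the missing variable is a categorical variable, the determination unit 33 imputes the missing values using, for example, a representative value of the missing variable.
When "Treat missing values as one categorical value" is selected, the determination unit 33 handles the missing data as it is. Specifically, the determination unit 33 attaches, to each missing value of the missing variable, information indicating that it is missing, and then performs the subsequent aggregation processing.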
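The three missing-value options above can be sketched as follows. This is a minimal illustration under stated assumptions, not the device's actual implementation: rows are plain dictionaries, a missing value is represented by `None`, the "representative value" for a categorical variable is taken to be the most frequent value, and the function names and the `MISSING_LABEL` marker are hypothetical.

```python
from statistics import mean, median, mode
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical marker used when a missing value is kept as its own category.
MISSING_LABEL = "(missing)"


def drop_missing(rows: List[Row], column: str) -> List[Row]:
    """Option (1): exclude rows whose value for `column` is missing."""
    return [row for row in rows if row.get(column) is not None]


def impute_missing(rows: List[Row], column: str, continuous: bool,
                   use_median: bool = False) -> List[Row]:
    """Option (2): fill missing values from the observed values.

    Continuous variables use the mean (or median); categorical variables
    use the most frequent value, which is an assumed reading of
    "representative value".
    """
    observed = [row[column] for row in rows if row.get(column) is not None]
    if not observed:
        raise ValueError(f"no observed values for {column!r}")
    if continuous:
        fill = median(observed) if use_median else mean(observed)
    else:
        fill = mode(observed)
    return [{**row, column: fill if row.get(column) is None else row[column]}
            for row in rows]


def missing_as_category(rows: List[Row], column: str) -> List[Row]:
    """Option (3): keep missing values, tagged as a separate category."""
    return [{**row, column: MISSING_LABEL if row.get(column) is None
             else row[column]} for row in rows]
```

Each function returns new rows rather than mutating its input, so the three options can be compared on the same data before the aggregation step.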
Next, the details of the information presented to the user by the presentation unit 34 are described with reference to FIGS. 4 to 16.
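(1) Combine categorical values of variable ○
(2) Remove those Zi with weak influence
(3) Change the arrow direction to X→Zi
(4) Continue the calculation although accuracy decreases
(5) Re-acquire the data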
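FIG. 7 shows a specific example of the countermeasure "Combine categorical values of variable ○". In the example of FIG. 7, it is assumed that X, among the candidates X, Z1, and Z2 for variable ○, has been selected either by the system's judgment or by the user. When the user selects the countermeasure "Combine categorical values of variable ○", the presentation unit 34 combines, among the plurality of values (categorical values) included in variable ○, the data of a value for which data is insufficient with the data of another value.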
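Combining an under-populated categorical value with another value can be sketched as below. This is only an illustration: the threshold `min_count`, the function names, and the age-band labels in the usage note are assumptions and do not come from the text or from FIG. 7.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Set


def sparse_values(values: Iterable[Hashable], min_count: int) -> List[Hashable]:
    """Return the categorical values observed fewer than `min_count` times."""
    counts = Counter(values)
    return sorted(value for value, count in counts.items() if count < min_count)


def combine_values(values: Iterable[Hashable], to_combine: Set[Hashable],
                   combined_label: Hashable) -> List[Hashable]:
    """Relabel every value in `to_combine` with one combined value."""
    return [combined_label if value in to_combine else value for value in values]
```

For example, with hypothetical age bands where "40s" has too few records, `combine_values(ages, {"30s", "40s"}, "30s-40s")` pools the two bands so that the combined value has enough data for the later aggregation.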
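Next, FIG. 10 shows a specific example of the countermeasure "Remove those Zi with weak influence". When the user selects this countermeasure, the presentation unit 34 presents to the user the exclusion, from the intervention effect calculation, of a variable that has a small influence on the intervention effect among the variables required for the intervention effect calculation.
Next, FIG. 11 shows a specific example of the countermeasure "Change the arrow direction to X→Zi". When the user selects this countermeasure, the presentation unit 34 presents to the user a change in the direction of causality between the variables required for the intervention effect calculation.
Next, FIG. 14 shows a specific example of the countermeasure "Continue the calculation although accuracy decreases". Specifically, when the user selects this countermeasure, the presentation unit 34 presents that the accuracy of the intervention effect calculation (the reliability shown in the lower part of FIG. 14) will decrease, and presents to the user continuation of the intervention effect calculation.
Next, FIG. 16 shows a specific example of the countermeasure "Re-acquire the data". When the user selects this countermeasure, the presentation unit 34 presents information indicating the variable for which data is insufficient, and presents to the user re-acquisition of the data.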
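Next, the procedure of the information processing executed by the information processing device 1 according to the embodiment is described with reference to FIG. 17. FIG. 17 is a flowchart showing the procedure of the information processing executed by the information processing device 1 according to the embodiment. In FIG. 17, the causal information 41 is assumed to have been generated in advance.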
(1) Exclude data that contains missing values
(2) Impute missing values
(3) Treat missing values as one categorical value
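Next, an example of the hardware configuration of the information processing device 1 and the like according to this embodiment is described with reference to FIG. 18. FIG. 18 is a block diagram showing an example of the hardware configuration of the information processing device 1 according to this embodiment.
As described above, according to an embodiment of the present disclosure, the information processing device 1 includes the determination unit 33 and the presentation unit 34. When calculating, based on the causal information 41 indicating the causal relationships among a plurality of variables, the intervention effect produced in an objective variable by intervening in any one of the plurality of variables, the determination unit 33 determines whether the data of the variables required for the calculation is insufficient. The presentation unit 34 presents information based on the determination result of the determination unit 33 to the user.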
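One way to picture the determination step summarized above is to count the records available for every combination of values of the variables needed for the calculation and to report the combinations that fall short. The sketch below assumes this count-based criterion; the text only says that data is "insufficient" and, elsewhere, refers to a "predetermined number". The names `insufficient_combinations`, `is_data_insufficient`, `levels`, and `min_count` are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def insufficient_combinations(rows: List[Row], variables: Sequence[str],
                              levels: Dict[str, Sequence[Any]],
                              min_count: int) -> Dict[Tuple[Any, ...], int]:
    """Map each under-populated value combination to its record count.

    `levels` lists the possible values of each variable so that
    combinations with zero records are reported as well.
    """
    counts = Counter(tuple(row[name] for name in variables) for row in rows)
    shortages: Dict[Tuple[Any, ...], int] = {}
    for combo in product(*(levels[name] for name in variables)):
        if counts[combo] < min_count:
            shortages[combo] = counts[combo]
    return shortages


def is_data_insufficient(rows: List[Row], variables: Sequence[str],
                         levels: Dict[str, Sequence[Any]],
                         min_count: int) -> bool:
    """Assumed determination rule: any short combination means insufficient."""
    return bool(insufficient_combinations(rows, variables, levels, min_count))
```

The returned dictionary could feed the presentation step, for instance to show the user which variable combinations lack data before offering countermeasures.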
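(1)
An information processing device comprising:
a determination unit that, when an intervention effect produced in an objective variable by intervening in any one of a plurality of variables is calculated based on causal information indicating causal relationships among the plurality of variables, determines whether data of the variables required for the calculation is insufficient; and
a presentation unit that presents information based on a determination result of the determination unit to a user.
(2)
The information processing device according to (1) above, wherein,
when the determination unit determines that the data of the variables required for the calculation is insufficient, the presentation unit presents to the user information indicating that the calculation is impossible.
(3)
The information processing device according to (1) or (2) above, wherein,
when the determination unit determines that the data of the variables required for the calculation is insufficient, the presentation unit presents to the user information indicating the variable for which data is insufficient.
(4)
The information processing device according to (3) above, wherein,
when data for a combination of a plurality of variables is insufficient, the presentation unit visualizes information indicating the variables for which data is insufficient and presents it to the user.
(5)
The information processing device according to any one of (1) to (4) above, wherein,
when the determination unit determines that the data of the variables required for the calculation is insufficient, the presentation unit presents to the user a countermeasure for making the data of the variables required for the calculation sufficient.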
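(6)
The information processing device according to (5) above, wherein
the presentation unit presents to the user, as the countermeasure, combining, among the plurality of values included in the variable, the data of a value for which data is insufficient with the data of another value.
(7)
The information processing device according to (5) or (6) above, wherein
the presentation unit presents to the user, as the countermeasure, excluding from the calculation a variable that has a small influence on the intervention effect among the variables required for the calculation.
(8)
The information processing device according to any one of (5) to (7) above, wherein
the presentation unit presents to the user, as the countermeasure, changing the direction of causality between the variables required for the calculation.
(9)
The information processing device according to any one of (5) to (8) above, wherein
the presentation unit presents, as the countermeasure, that the accuracy of the calculation will decrease, and presents to the user continuing the calculation.
(10)
The information processing device according to any one of (5) to (9) above, wherein
the presentation unit presents, as the countermeasure, information indicating the variable for which data is insufficient, and presents to the user re-acquiring the data.
(11)
The information processing device according to any one of (5) to (10) above, wherein
the presentation unit presents to the user, as the countermeasure, changing the variable to be intervened in without changing the objective variable.
(12)
The information processing device according to any one of (5) to (11) above, wherein
the presentation unit presents to the user, together with the countermeasure, supplementary information about the countermeasure.
(13)
The information processing device according to any one of (5) to (12) above, wherein,
when presenting a plurality of the countermeasures, the presentation unit presents, for each countermeasure, recommendation information based on the user's skill level.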
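(14)
The information processing device according to any one of (1) to (13) above, further comprising
an intervention calculation unit that, when the determination unit determines that the data of the variables required for the calculation is sufficient, calculates the intervention effect on the objective variable based on intervention content received from the user, wherein
the presentation unit presents to the user intervention information based on a calculation result of the intervention calculation unit.
(15)
The information processing device according to (14) above, wherein,
when the intervention calculation unit accepts from the user an intervention for calculating the intervention effect, the presentation unit prohibits acceptance of the intervention for any value, among the plurality of values included in the variable, whose data count is less than a predetermined number.
(16)
The information processing device according to (14) or (15) above, wherein
the presentation unit presents to the user the intervention information including the intervention effect and the reliability of the intervention effect.
(17)
The information processing device according to any one of (14) to (16) above, wherein
the presentation unit presents to the user the intervention information including data, before and after the intervention, of the intervened variable and the objective variable.
(18)
An information processing method including:
a determination process of determining, when an intervention effect produced in an objective variable by intervening in any one of a plurality of variables is calculated based on causal information indicating causal relationships among the plurality of variables, whether data of the variables required for the calculation is insufficient; and
a presentation process of presenting information based on a determination result of the determination process to a user.
(19)
An information processing program that, by being read by a computer, causes the computer to function as:
a determination unit that, when an intervention effect produced in an objective variable by intervening in any one of a plurality of variables is calculated based on causal information indicating causal relationships among the plurality of variables, determines whether data of the variables required for the calculation is insufficient; and
a presentation unit that presents information based on a determination result of the determination unit to a user.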
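2 Communication unit
3 Control unit
4 Storage unit
11 User terminal
31 Reception unit
32 Extraction unit
33 Determination unit
34 Presentation unit
35 Countermeasure execution unit
36 Intervention calculation unit
41 Causal information
42 Customer information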
Claims (19)
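- 1. An information processing device comprising: a determination unit that, when an intervention effect produced in an objective variable by intervening in any one of a plurality of variables is calculated based on causal information indicating causal relationships among the plurality of variables, determines whether data of the variables required for the calculation is insufficient; and a presentation unit that presents information based on a determination result of the determination unit to a user.
- 2. The information processing device according to claim 1, wherein, when the determination unit determines that the data of the variables required for the calculation is insufficient, the presentation unit presents to the user information indicating that the calculation is impossible.
- 3. The information processing device according to claim 1, wherein, when the determination unit determines that the data of the variables required for the calculation is insufficient, the presentation unit presents to the user information indicating the variable for which data is insufficient.
- 4. The information processing device according to claim 3, wherein, when data for a combination of a plurality of variables is insufficient, the presentation unit visualizes information indicating the variables for which data is insufficient and presents it to the user.
- 5. The information processing device according to claim 1, wherein, when the determination unit determines that the data of the variables required for the calculation is insufficient, the presentation unit presents to the user a countermeasure for making the data of the variables required for the calculation sufficient.
- 6. The information processing device according to claim 5, wherein the presentation unit presents to the user, as the countermeasure, combining, among the plurality of values included in the variable, the data of a value for which data is insufficient with the data of another value.
- 7. The information processing device according to claim 5, wherein the presentation unit presents to the user, as the countermeasure, excluding from the calculation a variable that has a small influence on the intervention effect among the variables required for the calculation.
- 8. The information processing device according to claim 5, wherein the presentation unit presents to the user, as the countermeasure, changing the direction of causality between the variables required for the calculation.
- 9. The information processing device according to claim 5, wherein the presentation unit presents, as the countermeasure, that the accuracy of the calculation will decrease, and presents to the user continuing the calculation.
- 10. The information processing device according to claim 5, wherein the presentation unit presents, as the countermeasure, information indicating the variable for which data is insufficient, and presents to the user re-acquiring the data.
- 11. The information processing device according to claim 5, wherein the presentation unit presents to the user, as the countermeasure, changing the variable to be intervened in without changing the objective variable.
- 12. The information processing device according to claim 5, wherein the presentation unit presents to the user, together with the countermeasure, supplementary information about the countermeasure.
- 13. The information processing device according to claim 5, wherein, when presenting a plurality of the countermeasures, the presentation unit presents, for each countermeasure, recommendation information based on the user's skill level.
- 14. The information processing device according to claim 1, further comprising an intervention calculation unit that, when the determination unit determines that the data of the variables required for the calculation is sufficient, calculates the intervention effect on the objective variable based on intervention content received from the user, wherein the presentation unit presents to the user intervention information based on a calculation result of the intervention calculation unit.
- 15. The information processing device according to claim 14, wherein, when the intervention calculation unit accepts from the user an intervention for calculating the intervention effect, the presentation unit prohibits acceptance of the intervention for any value, among the plurality of values included in the variable, whose data count is less than a predetermined number.
- 16. The information processing device according to claim 14, wherein the presentation unit presents to the user the intervention information including the intervention effect and the reliability of the intervention effect.
- 17. The information processing device according to claim 14, wherein the presentation unit presents to the user the intervention information including data, before and after the intervention, of the intervened variable and the objective variable.
- 18. An information processing method including: a determination process of determining, when an intervention effect produced in an objective variable by intervening in any one of a plurality of variables is calculated based on causal information indicating causal relationships among the plurality of variables, whether data of the variables required for the calculation is insufficient; and a presentation process of presenting information based on a determination result of the determination process to a user.
- 19. An information processing program that, by being read by a computer, causes the computer to function as: a determination unit that, when an intervention effect produced in an objective variable by intervening in any one of a plurality of variables is calculated based on causal information indicating causal relationships among the plurality of variables, determines whether data of the variables required for the calculation is insufficient; and a presentation unit that presents information based on a determination result of the determination unit to a user.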
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202080048856.XA CN114175082B (zh) | 2019-07-24 | 2020-06-16 | 信息处理设备、信息处理方法和信息处理程序 |
| EP20843646.9A EP4006806A4 (en) | 2019-07-24 | 2020-06-16 | INFORMATION PROCESSING DEVICE, METHOD AND PROGRAM |
| JP2021533863A JP7505495B2 (ja) | 2019-07-24 | 2020-06-16 | 情報処理装置、情報処理方法および情報処理プログラム |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2019136273 | 2019-07-24 | ||
| JP2019-136273 | 2019-07-24 | ||
| JP2019-201039 | 2019-11-05 | ||
| JP2019201039 | 2019-11-05 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021014823A1 (ja) | 2021-01-28 |
Family
ID=74193376
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2020/023497 Ceased WO2021014823A1 (ja) | 2019-07-24 | 2020-06-16 | 情報処理装置、情報処理方法および情報処理プログラム |
Country Status (4)
| Country | Link |
|---|---|
| EP (1) | EP4006806A4 (ja) |
| JP (1) | JP7505495B2 (ja) |
| CN (1) | CN114175082B (ja) |
| WO (1) | WO2021014823A1 (ja) |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115544333A (zh) * | 2022-10-21 | 2022-12-30 | 北京安信天行科技有限公司 | 一种数据展示方法、系统及电子设备 |
| JPWO2023152897A1 (ja) * | 2022-02-10 | 2023-08-17 | ||
| WO2025099976A1 (ja) * | 2023-11-10 | 2025-05-15 | ソニーグループ株式会社 | 情報処理装置及び情報処理方法、並びにコンピュータプログラム |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2014228991A (ja) | 2013-05-21 | 2014-12-08 | ソニー株式会社 | 情報処理装置および方法、並びにプログラム |
| JP2015060259A (ja) * | 2013-09-17 | 2015-03-30 | 株式会社日立製作所 | データ分析支援システム |
| JP2016133895A (ja) * | 2015-01-16 | 2016-07-25 | キヤノン株式会社 | 情報処理装置、情報処理方法、及びプログラム |
| JP2018190140A (ja) * | 2017-05-01 | 2018-11-29 | オムロン株式会社 | 学習装置、学習方法、及び学習プログラム |
Family Cites Families (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5130936A (en) * | 1990-09-14 | 1992-07-14 | Arinc Research Corporation | Method and apparatus for diagnostic testing including a neural network for determining testing sufficiency |
| US8498879B2 (en) * | 2006-04-27 | 2013-07-30 | Wellstat Vaccines, Llc | Automated systems and methods for obtaining, storing, processing and utilizing immunologic information of individuals and populations for various uses |
| US20110214050A1 (en) * | 2006-09-29 | 2011-09-01 | Stambaugh Thomas M | Virtual systems for spatial organization, navigation, and presentation of information |
| EP1967996A1 (en) * | 2007-03-09 | 2008-09-10 | Omron Corporation | Factor estimating support device and method of controlling the same, and factor estimating support program |
| JP5388060B2 (ja) * | 2009-07-13 | 2014-01-15 | 東芝エレベータ株式会社 | エレベータの部品改善計画システム及びその部品改善計画方法 |
| JP5402375B2 (ja) * | 2009-08-07 | 2014-01-29 | ソニー株式会社 | 情報処理装置、基準値決定方法およびプログラム |
| JP5702115B2 (ja) * | 2010-11-09 | 2015-04-15 | ダイコク電機株式会社 | 遊技情報表示装置 |
| US20130166188A1 (en) * | 2011-12-21 | 2013-06-27 | Microsoft Corporation | Determine Spatiotemporal Causal Interactions In Data |
| KR20160042987A (ko) * | 2013-08-14 | 2016-04-20 | 노파르티스 아게 | 산발성 봉입체 근염을 치료하는 방법 |
| US10133791B1 (en) * | 2014-09-07 | 2018-11-20 | DataNovo, Inc. | Data mining and analysis system and method for legal documents |
| CN104537418A (zh) * | 2014-12-11 | 2015-04-22 | 广东工业大学 | 一种自底向上的高维数据因果网络学习方法 |
| WO2016160734A1 (en) * | 2015-03-27 | 2016-10-06 | Beyondcore, Inc. | Analyzing variations within and/or between data sets |
| CN106874589A (zh) * | 2017-02-10 | 2017-06-20 | 泉州装备制造研究所 | 一种基于数据驱动的报警根源寻找方法 |
| CN108684051B (zh) * | 2018-05-11 | 2021-11-19 | 广东南方通信建设有限公司 | 一种基于因果诊断的无线网络性能优化方法、电子设备及存储介质 |
| CN109271488B (zh) * | 2018-10-08 | 2021-08-27 | 广东工业大学 | 一种结合行为序列和文本信息的社交网络用户间因果关系发现方法及系统 |
-
2020
- 2020-06-16 CN CN202080048856.XA patent/CN114175082B/zh active Active
- 2020-06-16 EP EP20843646.9A patent/EP4006806A4/en active Pending
- 2020-06-16 JP JP2021533863A patent/JP7505495B2/ja active Active
- 2020-06-16 WO PCT/JP2020/023497 patent/WO2021014823A1/ja not_active Ceased
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4006806A4 |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPWO2023152897A1 (ja) * | 2022-02-10 | 2023-08-17 | ||
| WO2023152897A1 (ja) * | 2022-02-10 | 2023-08-17 | 富士通株式会社 | 情報処理プログラム、情報処理装置及び情報処理方法 |
| JP7705072B2 (ja) | 2022-02-10 | 2025-07-09 | 富士通株式会社 | 情報処理プログラム、情報処理装置及び情報処理方法 |
| CN115544333A (zh) * | 2022-10-21 | 2022-12-30 | 北京安信天行科技有限公司 | 一种数据展示方法、系统及电子设备 |
| WO2025099976A1 (ja) * | 2023-11-10 | 2025-05-15 | ソニーグループ株式会社 | 情報処理装置及び情報処理方法、並びにコンピュータプログラム |
Also Published As
| Publication number | Publication date |
|---|---|
| JP7505495B2 (ja) | 2024-06-25 |
| CN114175082A (zh) | 2022-03-11 |
| JPWO2021014823A1 (ja) | 2021-01-28 |
| EP4006806A4 (en) | 2022-08-24 |
| EP4006806A1 (en) | 2022-06-01 |
| CN114175082B (zh) | 2025-05-06 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US10389662B2 (en) | Aggregation and visualization of multiple chat room information | |
| CN109937415B (zh) | 使用多个路径在图形数据库中进行相关性评分的装置、方法和系统 | |
| US20200042646A1 (en) | Descriptive text generation for data visualizations | |
| US20130304469A1 (en) | Information processing method and apparatus, computer program and recording medium | |
| US20100165396A1 (en) | Information communication system, user terminal and information communication method | |
| CN113157947A (zh) | 知识图谱的构建方法、工具、装置和服务器 | |
| WO2021014823A1 (ja) | 情報処理装置、情報処理方法および情報処理プログラム | |
| JPWO2020004154A1 (ja) | 情報処理装置、情報処理方法及びプログラム | |
| US10019508B1 (en) | Keeping up with the joneses | |
| US9105036B2 (en) | Visualization of user sentiment for product features | |
| CN117971661A (zh) | 大模型测试方法、装置、电子设备及存储介质 | |
| CN112528158B (zh) | 课程推荐方法、装置、设备及存储介质 | |
| WO2022156534A1 (zh) | 视频质量评估方法和装置 | |
| CN107729424B (zh) | 一种数据可视化方法及设备 | |
| JP2018073191A (ja) | プロジェクト管理項目評価システム及びプロジェクト管理項目評価方法 | |
| CN117813617A (zh) | 基于人工智能的编排 | |
| CN102663004B (zh) | 图形化呈现条件组合的方法和装置 | |
| CN114416801A (zh) | 用于输出信息的方法和装置 | |
| US9846742B2 (en) | Apparatus and method for providing community service | |
| US20190384505A1 (en) | Information processing device, parts selection method, and computer-readable recording medium | |
| CN113032251B (zh) | 应用程序服务质量的确定方法、设备和存储介质 | |
| EP4105789A1 (en) | Information processing device, information processing method, and information processing program | |
| JP2024169110A (ja) | 情報処理装置、情報処理方法、記録媒体、プログラム | |
| CN116468479A (zh) | 确定页面质量评估维度方法、页面质量的评估方法和装置 | |
| CN113343090B (zh) | 用于推送信息的方法、装置、设备、介质和产品 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20843646; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2021533863; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWE | Wipo information: entry into national phase | Ref document number: 2020843646; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2020843646; Country of ref document: EP; Effective date: 20220224 |
| | WWG | Wipo information: grant in national office | Ref document number: 202080048856.X; Country of ref document: CN |