WO2022227355A1 - Method and apparatus for acquiring knowledge - Google Patents
- Publication number
- WO2022227355A1 (application PCT/CN2021/115192)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- knowledge
- task
- machine learning
- module
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
Definitions
- the present application relates to the field of computers, and more particularly, to a method and apparatus for acquiring knowledge.
- the model is the most important part. However, since training the model takes a lot of time, with the continuous expansion of application scenarios, the model can be stored in the model knowledge base. Machine learning developers can find the required pre-trained models and use them directly with simple modifications, which greatly reduces the manpower and material consumption caused by repeated training.
- the existing model knowledge base supports only simple model search by name. The application scope of each model is unclear, as is which model to use under which circumstances, so searches of the model knowledge base are inaccurate.
- the present application provides a method and device for acquiring knowledge, which can automatically acquire corresponding knowledge from a knowledge base according to parameters, so as to realize accurate search of knowledge in the knowledge base.
- a method for acquiring knowledge, comprising: acquiring one or more first knowledges from a knowledge base according to parameters, where the parameters include any one or a combination of the following: knowledge within a machine learning task, attributes of the machine learning task, and knowledge among multiple machine learning tasks; and providing the one or more first knowledges to the user.
- the corresponding knowledge can be automatically obtained from the knowledge base according to the parameters, so as to realize the accurate search of knowledge in the knowledge base.
- the method further includes: acquiring the parameter input by the user; or acquiring the parameter from another system.
- the knowledge in the machine learning task includes a sample set and a model of the machine learning task, and the model is obtained by training according to the sample set; or
- the attributes of the machine learning tasks include constraints and application scopes of the machine learning tasks; or the knowledge among the multiple machine learning tasks includes association relationships between the multiple machine learning tasks.
- the method further includes: acquiring second knowledge related to the first knowledge from the knowledge base; and providing the second knowledge to the user.
- a corresponding knowledge similarity comparison method is determined according to the first knowledge; a similar knowledge list is obtained from the task knowledge base according to the knowledge similarity comparison method; and the second knowledge is determined from the similar knowledge list according to a similarity threshold.
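The threshold step can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the application: the Jaccard overlap used as the similarity measure, the function names `jaccard` and `select_second_knowledge`, and the sample task attributes are all hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Assumed similarity measure: overlap of two attribute sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_second_knowledge(
    first: Set[str],
    knowledge_base: Dict[str, Set[str]],
    threshold: float,
    similarity: Callable[[Set[str], Set[str]], float] = jaccard,
) -> List[Tuple[str, float]]:
    """Build a similar-knowledge list, then keep entries at or above the threshold."""
    similar = [(name, similarity(first, attrs)) for name, attrs in knowledge_base.items()]
    similar.sort(key=lambda item: item[1], reverse=True)
    return [(name, score) for name, score in similar if score >= threshold]


# Hypothetical task attributes, used only for illustration.
kb = {
    "task_a": {"ash", "sulfur", "moisture"},
    "task_b": {"ash", "sulfur"},
    "task_c": {"temperature"},
}
result = select_second_knowledge({"ash", "sulfur", "moisture"}, kb, threshold=0.5)
```

Passing the similarity function as an argument mirrors the idea that the comparison method is chosen according to the first knowledge; a different knowledge type could supply a different function.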
- the method further includes: providing the user with configuration information of the first knowledge.
- the method further includes: acquiring target knowledge selected by the user, where the target knowledge is the first knowledge and/or the second knowledge.
- the target knowledge is used in any one of the following scenarios:
- the method further includes: updating the task knowledge base according to the first knowledge and the second knowledge.
- a method for comparing knowledge similarities and differences is determined according to the first knowledge and the second knowledge;
- a similarity-and-difference comparison result between the first knowledge and the second knowledge is obtained; according to that comparison result and the update rules, any one or a combination of the following knowledge in the task knowledge base is updated: knowledge within the machine learning task, attributes of the machine learning task, and knowledge among a plurality of the machine learning tasks.
- the method further includes: the edge device synchronizing the knowledge in the knowledge base to the cloud device; or the cloud device synchronizing the knowledge in the knowledge base to the edge device.
- an apparatus for acquiring knowledge including an acquiring module and a display module.
- the obtaining module is used to obtain one or more first knowledges from the knowledge base according to parameters, and the parameters include any one or a combination of the following: knowledge within a machine learning task, attributes of the machine learning task, and knowledge among multiple machine learning tasks;
- the display module is used to provide the one or more first knowledges to the user.
- the obtaining module is further configured to: obtain the parameter input by the user; or obtain the parameter from other systems.
- the knowledge in the machine learning task includes a sample set and a model of the machine learning task, and the model is obtained by training according to the sample set; or
- the attributes of the machine learning tasks include constraints and application scopes of the machine learning tasks; or the knowledge among the multiple machine learning tasks includes association relationships between the multiple machine learning tasks.
- the obtaining module is further configured to obtain second knowledge related to the first knowledge from the knowledge base; the display module is further configured to provide the second knowledge to the user.
- the display module is further configured to: provide the user with configuration information of the first knowledge.
- the acquiring module is further configured to acquire target knowledge selected by the user, where the target knowledge is the first knowledge and/or the second knowledge.
- the target knowledge is used in any one of the following scenarios:
- the apparatus further includes: a synchronization module for the edge device to synchronize the knowledge in the knowledge base to the cloud device; or for the cloud device to synchronize the knowledge in the knowledge base to the edge device.
- a device for acquiring knowledge, comprising an input and output interface, a processor and a memory, wherein the processor is used to control the input and output interface to send and receive information, the memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that the method described in the first aspect or any one of the possible implementations of the first aspect is performed.
- the processor may be a general-purpose processor, which may be implemented by hardware or software.
- when implemented by hardware, the processor can be a logic circuit, an integrated circuit, etc.; when implemented by software, the processor can be a general-purpose processor, implemented by reading software code stored in a memory, which can be integrated in the processor or located outside the processor and exist independently.
- in a fourth aspect, a chip is provided, which obtains an instruction and executes the instruction to implement the method in the first aspect or any one of the implementation manners of the first aspect.
- the chip includes a processor and a data interface;
- the processor reads the instructions stored in the memory through the data interface, and executes the method in the first aspect or any one of the implementation manners of the first aspect.
- the chip may further include a memory, the memory stores an instruction, the processor is used to execute the instruction stored in the memory, and when the instruction is executed, the processor is used to execute the method in the first aspect or any one of the implementation manners of the first aspect.
- a computer program product, comprising computer program code; when the computer program code is run on a computer, the computer is made to execute the method in the first aspect or any one of the implementation manners of the first aspect.
- a computer-readable storage medium including instructions; the instructions are used to implement the first aspect and the method in any one of the implementation manners of the first aspect.
- the above-mentioned storage medium may specifically be a non-volatile storage medium.
- FIG. 1 is a schematic flowchart of a method for acquiring knowledge provided by an embodiment of the present application.
- FIG. 2 is a schematic block diagram of a knowledge base initialization provided by an embodiment of the present application.
- FIG. 3 is a schematic block diagram of another knowledge base initialization provided by an embodiment of the present application.
- FIG. 4 is a schematic block diagram of an operation phase of a knowledge base provided by an embodiment of the present application.
- FIG. 5 is a schematic block diagram of another knowledge base operation stage provided by an embodiment of the present application.
- FIG. 6 is a schematic interface diagram of a knowledge base initialization parameter configuration provided by an embodiment of the present application.
- FIG. 7 is a schematic diagram of a parameter configuration interface for querying a model in a knowledge base provided by an embodiment of the present application.
- FIG. 8 is a schematic diagram of a parameter configuration interface for querying a task in a knowledge base provided by an embodiment of the present application.
- FIG. 9 is a schematic diagram of a parameter configuration interface for querying the application scope of a model in a knowledge base provided by an embodiment of the present application.
- FIG. 10 is a schematic block diagram of an apparatus 1000 for acquiring knowledge provided by an embodiment of the present application.
- FIG. 11 is a schematic block diagram of a device 1100 for acquiring knowledge provided by an embodiment of the present application.
- the network architecture and service scenarios described in the embodiments of the present application are for the purpose of illustrating the technical solutions of the embodiments of the present application more clearly, and do not constitute a limitation on the technical solutions provided by the embodiments of the present application.
- with the evolution of the network architecture and the emergence of new service scenarios, the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.
- references in this specification to “one embodiment” or “some embodiments” and the like mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
- appearances of the phrases “in one embodiment,” “in some embodiments,” “in other embodiments,” etc. in various places in this specification do not necessarily all refer to the same embodiment, but mean “one or more but not all embodiments” unless specifically emphasized otherwise.
- the terms “including”, “comprising”, “having” and their variants mean “including but not limited to” unless specifically emphasized otherwise.
- “At least one” means one or more, and “plurality” means two or more.
- “And/or” describes the relationship of the associated objects and means that three relationships are possible; for example, “A and/or B” can mean: A alone, both A and B, or B alone, where A and B can be singular or plural.
- the character “/” generally indicates that the associated objects are in an “or” relationship.
- “At least one of the following item(s)” or similar expressions refer to any combination of these items, including any combination of single item(s) or plural item(s).
- for example, at least one of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
- the model is the most important part.
- the model represented by deep neural network has achieved good results in many machine learning related applications, such as image classification and speech recognition.
- An example is the TensorFlow Hub provided by Google, which is a model repository for storing reusable machine learning assets.
- Machine learning developers can find the required pre-trained models and use them directly with simple modifications, which greatly reduces the manpower and material consumption caused by repeated training.
- the existing model knowledge base mainly has the following problems:
- the embodiments of the present application provide a method for acquiring knowledge, which can automatically acquire corresponding knowledge from a knowledge base according to parameters, so as to realize accurate search of knowledge in the knowledge base.
- a method for acquiring knowledge provided by an embodiment of the present application will be described in detail below with reference to FIG. 1 .
- the method includes steps 110-120, and the steps 110-120 will be described in detail below respectively.
- Step 110: Acquire one or more first knowledges from the knowledge base according to parameters, where the parameters include any one or a combination of the following: knowledge within the machine learning task, attributes of the machine learning task, and knowledge among multiple machine learning tasks.
- the parameters may also be acquired.
- the parameters input by the user may be obtained, or the parameters may be obtained from other systems (for example, other intelligent platforms), which are not specifically limited in this application.
- the knowledge in the above-mentioned machine learning task includes a sample set and a model of the machine learning task, and the model is obtained by training according to the sample set.
- the attributes of the above machine learning tasks include constraints and application scopes of the machine learning tasks.
- the knowledge among the above-mentioned multiple machine learning tasks includes the association relationship between the multiple machine learning tasks. The above knowledge will be explained in detail below with reference to specific examples, and will not be repeated here.
- Step 120: Provide the one or more first knowledges to the user.
- second knowledge related to the first knowledge may also be acquired from a knowledge base, and the second knowledge is provided to the user.
- configuration information related to the first knowledge may also be provided to the user, for example, an introduction to the first knowledge, benefits, and the like.
- configuration information related to the second knowledge may also be provided to the user.
- target knowledge selected by the user may also be acquired, where the target knowledge is the above-mentioned first knowledge and/or second knowledge.
- the target knowledge is used in any one of the following scenarios:
- mutual synchronization between the knowledge base of the edge device and the knowledge base of the cloud device can also be implemented.
- the edge device synchronizes the knowledge in the knowledge base to the cloud device; or the cloud device synchronizes the knowledge in the knowledge base to the edge device.
- FIG. 2 is a schematic block diagram of a knowledge base initialization provided by an embodiment of the present application.
- the knowledge base initialization process may include: a knowledge base initialization module 210 , an edge knowledge base 220 and a cloud knowledge base 230 .
- the functions of the above modules are described in detail below.
- the edge knowledge base 220 is a knowledge base located on the edge-side device, and is used to store multi-task knowledge, a multi-task knowledge index table and a multi-task knowledge extractor.
- the device on the edge side may be, for example, a server close to the user side.
- the cloud knowledge base 230 is a knowledge base located on the device on the cloud side, and is used for storing multi-task knowledge, multi-task knowledge index table and multi-task knowledge extractor.
- the device on the cloud side may be, for example, a server on the cloud side.
- the knowledge base initialization module 210 is used to construct the above-mentioned knowledge base, and can also be understood as initializing the above-mentioned knowledge base.
- the knowledge base initialization module 210 is further configured to realize the storage and synchronization of the multi-task knowledge between the above-mentioned edge knowledge base 220 and the cloud knowledge base 230 .
- the input of the knowledge base initialization module 210 is multi-task knowledge and its extractor (which may be input by the system or may come from the cloud), and the output is multi-task knowledge, a multi-task knowledge index table, and a multi-task knowledge extractor.
- based on the multi-task knowledge and its extractor, the knowledge base initialization module 210 can store the multi-task knowledge index table and the multi-task knowledge extractor in the edge knowledge base 220 or the cloud knowledge base 230, and complete the storage and synchronization of multi-task knowledge between the edge knowledge base 220 and the cloud knowledge base 230.
- the knowledge base initialization module 210 may include the following two sub-modules: a multi-task knowledge initialization module 211 and an edge-cloud knowledge synchronization module 212 . The functions of these two sub-modules will be described in detail below, and will not be described in detail here.
- Multi-task knowledge can include three levels of knowledge: intra-task knowledge, task knowledge, and inter-task knowledge.
- the multi-task knowledge may also include a fourth level of knowledge: a task group. The knowledge of each of the above-mentioned levels will be described in detail below.
- the knowledge of the first level is intra-task knowledge, which belongs to the knowledge that may be stored in traditional lifelong learning, including: samples and models.
- the sample can also be called the task sample, which refers to the sample set under a certain task after the task is defined. It is a data record used for model training or testing. Each record includes different data items, which can refer to labeled samples and unlabeled samples. For example, a sample of coal blending is shown in Table 1.
- a model can also be called a task model, which refers to a model trained separately for each task.
- the information of the model may include, but is not limited to: training method of the model, model hyperparameters, model parameters, and the like.
- the training method of the model may include, but is not limited to, machine learning algorithms, such as neural networks, support vector machines, and the like.
- Model hyperparameters refer to configuration items of the machine learning algorithm used to train the model, such as the learning rate in a neural network.
- Model parameters refer to the configuration of the model mapping function, which is obtained by training machine learning algorithms, such as weights in neural networks, support vectors in support vector machines, etc. For example, a coal blending prediction model trained according to coal blending samples is shown in Table 2.
- the knowledge of the second level is task knowledge, which belongs to the knowledge that is not stored in traditional lifelong learning, including: task constraints, task attributes, and the like.
- the task constraint refers to the configuration item of the task definition algorithm used to divide the task, for example, the lower limit of the task sample size is the minimum value of the number of samples included in a task.
- Task properties are used to define the data items or feature columns that the model applies to. Table 3 shows a coal blending task.
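The first two knowledge levels described above can be pictured as a small data structure. The sketch below is only an illustration: the class names `TaskModel` and `Task`, the field names, the `min_samples` constraint key, and the example values are assumptions, and the contents of Tables 1 to 3 are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TaskModel:
    training_method: str              # e.g. "neural network" or "support vector machine"
    hyperparameters: Dict[str, Any]   # configuration of the training algorithm
    parameters: Dict[str, Any]        # values learned by training, such as weights


@dataclass
class Task:
    name: str
    samples: List[Dict[str, Any]]     # intra-task knowledge: data records
    model: TaskModel                  # intra-task knowledge: model trained on the samples
    constraints: Dict[str, Any] = field(default_factory=dict)  # task knowledge
    attributes: Dict[str, Any] = field(default_factory=dict)   # task knowledge

    def satisfies_min_sample_size(self) -> bool:
        """Check the lower limit on the task sample size, if one is set."""
        return len(self.samples) >= self.constraints.get("min_samples", 0)


# Hypothetical example values.
example = Task(
    name="example_task",
    samples=[{"x": 1.0, "label": 0}, {"x": 2.0, "label": 1}],
    model=TaskModel("neural network", {"learning_rate": 0.01}, {"weights": [0.1, 0.2]}),
    constraints={"min_samples": 2},
    attributes={"feature_columns": ["x"]},
)
```

Keeping constraints and attributes next to the samples and model lets a query on task attributes return the matching model together with its stated scope.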
- the knowledge of the third level is inter-task knowledge, which belongs to the knowledge that is not stored in traditional lifelong learning, including the association relationship between multiple tasks, which can be used for the discrimination and optimization of unknown tasks.
- the inter-task knowledge may include the task list and the degree of association between different tasks.
- the task list is the task list stored in the knowledge base, and is the input of task relationship discovery.
- the degree of association between different tasks can take two forms: the semantic subordination relationship and the transferability relationship in transfer learning. The semantic subordination relationship expresses the degree to which different tasks are subordinate to one another, and can be output by the task definition method. Table 4 shows the subordination matrix of a coal blending task.
- the transferability relationship in transfer learning is used to express the degree of mutual transferability between different tasks, which can be output by the task transfer relationship discovery method.
- Table 5 shows the transferability matrix of a coal blending task.
- the task transfer relationship discovery method refers to measuring the degree of transferability between tasks by means of similarity and other methods according to characteristics such as task samples.
- metadata-based task transfer relationship discovery can be used to extract knowledge at the inter-task level. The basic principles are as follows: weight the metadata items and data items in the distance function separately; use the metadata as a constraint to construct a prior clustering, and then use the data to further construct the posterior clustering on the basis of the prior clustering.
- Table 5: The transferability matrix of a coal blending task
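The two principles of metadata-based discovery (separately weighted metadata and data terms, and a prior clustering refined into a posterior clustering) can be sketched as below. Everything concrete here is an assumption made for illustration: the mismatch fraction used for metadata, the Euclidean distance used for data, the greedy grouping by `data_radius`, and the example tasks.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

TaskRecord = Tuple[str, Tuple[str, ...], Sequence[float]]  # (name, metadata, data features)


def weighted_distance(meta_a, meta_b, data_a, data_b, w_meta=0.5, w_data=0.5) -> float:
    """Distance with separately weighted metadata and data terms."""
    meta_term = sum(x != y for x, y in zip(meta_a, meta_b)) / max(len(meta_a), 1)
    data_term = math.dist(data_a, data_b)
    return w_meta * meta_term + w_data * data_term


def two_stage_clusters(tasks: List[TaskRecord], data_radius: float) -> List[List[str]]:
    # Stage 1 (prior): metadata acts as a constraint; identical metadata share a group.
    prior: Dict[Tuple[str, ...], List[TaskRecord]] = defaultdict(list)
    for task in tasks:
        prior[task[1]].append(task)
    # Stage 2 (posterior): refine each prior group using the data features.
    clusters: List[List[str]] = []
    for group in prior.values():
        refined: List[List[TaskRecord]] = []
        for task in group:
            for cluster in refined:
                if math.dist(cluster[0][2], task[2]) <= data_radius:
                    cluster.append(task)
                    break
            else:
                refined.append([task])
        clusters.extend([t[0] for t in cluster] for cluster in refined)
    return clusters


# Hypothetical tasks: (name, metadata, data features).
tasks = [
    ("t1", ("plant_a",), [0.0, 0.0]),
    ("t2", ("plant_a",), [0.1, 0.0]),
    ("t3", ("plant_a",), [5.0, 5.0]),
    ("t4", ("plant_b",), [0.0, 0.0]),
]
groups = two_stage_clusters(tasks, data_radius=1.0)
```

In this sketch, tasks with different metadata never share a cluster even when their data coincide, which is one simple way to read "metadata as a constraint".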
- the knowledge of the fourth level is a task group, which belongs to the knowledge that is not stored in traditional lifelong learning. It refers to aggregating tasks with similar relationships into groups, which can be used to accelerate the optimization module of unknown tasks.
- the task group may include: task group constraints, task group attributes, task lists, task group samples, and task group models.
- the task group constraint refers to the configuration item of the task group division algorithm used to divide the task group, such as the lower limit of the sample size of the task group: the minimum value of the number of samples included in a task group.
- the task group attribute is used to define the data item or characteristic column of the application scope of the task group.
- the task list refers to the list of tasks stored in the knowledge base, which is the input of the task group division algorithm.
- Task group samples refer to all samples in the task group.
- a task group model refers to a model obtained by training all samples in a task group, or a model constructed based on the knowledge of multiple tasks in each task group.
- the multi-task knowledge initialization module 211 initializes the knowledge base based on the input multi-task knowledge and its extractor, stores the multi-task knowledge index table and the multi-task knowledge extractor in the edge knowledge base 220, feeds back the initialization status after completing the initialization, and synchronizes the result to the cloud knowledge base 230.
- the multi-task knowledge initialization module 211 may include: an intra-task knowledge and its index table initialization module 2111 , a task knowledge and its index table initialization module 2112 , and an inter-task knowledge and its index table initialization module 2113 .
- the multi-task knowledge initialization module 211 may choose to call one or more of the above sub-modules according to different levels of multi-task knowledge.
- for example, if the input multi-task knowledge contains only samples and task attributes, and no inter-task knowledge such as task transfer relationships, only the first two submodules (the intra-task knowledge and its index table initialization module 2111 and the task knowledge and its index table initialization module 2112) need to be called to complete the initialization.
- the input of the intra-task knowledge and its index table initialization module 2111 is multi-task knowledge and its extractor, and its output is intra-task knowledge, an intra-task knowledge index table and an intra-task knowledge extractor.
- the intra-task knowledge and its index table initialization module 2111 generates intra-task knowledge, an intra-task knowledge index table and an intra-task knowledge extractor based on the input multi-task knowledge and its extractor, and initializes them into the edge knowledge base 220. If the multi-task knowledge is a sample set and its task attributes, the module 2111 takes the task attribute as an index of the sample set and updates this knowledge into the intra-task knowledge index table.
- the input of the task knowledge and its index table initialization module 2112 is multi-task knowledge and its extractor, and its output is task knowledge, a task knowledge index table and a task knowledge extractor. Specifically, the module 2112 generates task knowledge, a task knowledge index table and a task knowledge extractor based on the input multi-task knowledge and its extractor, and initializes them into the edge knowledge base 220. If the multi-task knowledge is a task, including task samples, a task model and their task attributes, the module 2112 uses the task attributes as an index of the task and updates this knowledge into the task knowledge index table.
- the input of the inter-task knowledge and its index table initialization module 2113 is multi-task knowledge and its extractor, and its output is inter-task knowledge, an inter-task knowledge index table, and an inter-task knowledge extractor.
- the inter-task knowledge and its index table initialization module 2113 generates inter-task knowledge, an inter-task knowledge index table and an inter-task knowledge extractor based on the input multi-task knowledge and its extractor, and initializes them into the edge knowledge base 220.
- if the inter-task knowledge is the transferability relationship between a task and other tasks and the task group to which it belongs, the inter-task knowledge and its index table initialization module 2113 uses the task attribute as the index of the task and updates this knowledge into the inter-task knowledge index table.
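All three initialization modules use the task attribute as the index of the stored knowledge. A minimal sketch of such an index table follows; the dictionary layout, the `frozenset` key, and the function names `attribute_key` and `init_index_table` are assumptions for illustration, and attribute values are assumed to be hashable.

```python
from typing import Any, Dict, FrozenSet, List, Tuple

AttrKey = FrozenSet[Tuple[str, Any]]


def attribute_key(attributes: Dict[str, Any]) -> AttrKey:
    """Turn a task-attribute dict into a hashable index key."""
    return frozenset(attributes.items())


def init_index_table(tasks: List[Dict[str, Any]]) -> Dict[AttrKey, List[str]]:
    """Map each task-attribute key to the names of the tasks that carry it."""
    index: Dict[AttrKey, List[str]] = {}
    for task in tasks:
        index.setdefault(attribute_key(task["attributes"]), []).append(task["name"])
    return index


# Hypothetical tasks, used only for illustration.
tasks = [
    {"name": "task_1", "attributes": {"site": "A", "grade": "high"}},
    {"name": "task_2", "attributes": {"grade": "high", "site": "A"}},
    {"name": "task_3", "attributes": {"site": "B", "grade": "low"}},
]
index_table = init_index_table(tasks)
matches = index_table[attribute_key({"site": "A", "grade": "high"})]
```

Using an order-independent key means two tasks with the same attributes listed in a different order land in the same index entry.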
- the edge-cloud knowledge synchronization module 212 is used for bidirectional transmission of multi-task knowledge between the edge knowledge base 220 and the cloud knowledge base 230 to keep their knowledge synchronized. This avoids the problem that the edge knowledge base 220, with its limited computing resources, can hardly support the training of a large number of models.
- the input of the edge-cloud knowledge synchronization module 212 is multi-task knowledge, and the output is multi-task knowledge and initialization state feedback. Specifically, the edge-cloud knowledge synchronization module 212 initializes the edge knowledge base 220 based on the multi-task knowledge of the cloud knowledge base 230, and feeds back the initialization state after the initialization is completed.
- the knowledge output by the multi-task knowledge initialization module 211 to the edge knowledge base 220 also needs to be synchronized to the cloud knowledge base 230 .
- this includes handling knowledge conflicts, for example by comparing timestamps, by comparing the confidence of the knowledge or the amount of supporting evidence, or by having the knowledge base administrator resolve conflicts manually.
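The conflict-handling options can be sketched as a small policy function. This is an illustrative sketch only: the `KnowledgeEntry` fields, the policy names, and the convention of returning `None` to hand a conflict to the administrator are assumptions, not details given in the application.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnowledgeEntry:
    key: str
    value: str
    timestamp: float    # a larger value means a newer entry
    confidence: float   # assumed score in [0, 1]


def resolve_conflict(
    edge: KnowledgeEntry, cloud: KnowledgeEntry, policy: str = "timestamp"
) -> Optional[KnowledgeEntry]:
    """Pick the winning entry; None means the administrator must decide."""
    if policy == "timestamp":
        if edge.timestamp == cloud.timestamp:
            return None
        return edge if edge.timestamp > cloud.timestamp else cloud
    if policy == "confidence":
        if edge.confidence == cloud.confidence:
            return None
        return edge if edge.confidence > cloud.confidence else cloud
    if policy == "manual":
        return None
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical conflicting entries for the same key.
edge_entry = KnowledgeEntry("model_x", "v2", timestamp=200.0, confidence=0.6)
cloud_entry = KnowledgeEntry("model_x", "v1", timestamp=100.0, confidence=0.9)
by_time = resolve_conflict(edge_entry, cloud_entry, "timestamp")
by_confidence = resolve_conflict(edge_entry, cloud_entry, "confidence")
```

The two automatic policies can disagree, as in this example, so the policy would need to be fixed consistently on both the edge and the cloud side.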
- this running phase may include: an edge knowledge base 220 , a cloud knowledge base 230 , a knowledge base search module 310 , and a candidate knowledge cache 330 .
- the operation phase may further include a knowledge base incremental maintenance module 320 . The functions of each of the above modules will be described in detail below.
- the knowledge base (eg, the edge knowledge base 220 ) is searched according to the runtime data and the target knowledge type query command, and different levels of knowledge and indexes are extracted and fed back by extractors at different levels in the knowledge base.
- the target knowledge type query command is parsed to obtain a query knowledge item list; based on that list, the knowledge base is searched and the results are sorted according to the task index table, and the results are output as the target knowledge and as candidate knowledge to be filled.
- the knowledge base search module 310 includes the following sub-modules: a knowledge query module 311 , a multi-task knowledge and index extraction module 312 , and a multi-task index query and sorting module 313 .
- the functions of the sub-modules included in the knowledge base search module 310 will be described in detail below with reference to FIG. 5 , which will not be described in detail here.
- the knowledge base incremental maintenance module 320 first integrates the historical knowledge to be filled in the candidate knowledge cache 330 with the filled knowledge obtained after annotation and model filling into complete candidate update knowledge, then combines it with the existing multi-task knowledge to obtain an update policy, and finally updates the knowledge into the knowledge base.
- the knowledge base incremental maintenance module 320 may include the following sub-modules: a knowledge asynchronous integration module 321 , a knowledge strategy discrimination module 322 , and a knowledge and index update module 323 .
- each sub-module included in the knowledge base search module 310 and the knowledge base incremental maintenance module 320 is described in detail below with reference to the specific example in FIG. 5 .
- the knowledge base search module 310 includes the following sub-modules: a knowledge query module 311 , a multi-task knowledge and index extraction module 312 , and a multi-task index query and sorting module 313 .
- the knowledge query module 311 queries related knowledge based on the user's target knowledge type query command and feeds back query results. Specifically, the knowledge query module 311 parses the user's target knowledge type query command into a query knowledge item list, which is used for the knowledge query, outputs the query result to the user as the target knowledge, and also puts it into the candidate knowledge cache 330 as candidate knowledge to be filled. As an example, as shown in FIG. 5, the knowledge query module 311 includes the following sub-modules: a query command parsing module 3111 and a knowledge feedback module 3112.
- the query command parsing module 3111 parses the user's target knowledge type query command into query knowledge items, which are used for the knowledge query. Specifically, its input is a target knowledge type query command, and its output is a list of query knowledge items and the recipient.
- the query command may be in a dictionary-like format, such as the query format in Table 6 below, where the query knowledge type may include knowledge at different levels such as intra-task, task, and inter-task.
- the user may ask questions about the runtime data, such as "User A wants to know the task knowledge related to the 03 runtime data set". The user sends a target knowledge type query command to the system according to the content of the question, and the query command parsing module 3111 parses it into a query knowledge item list, which can be in list format. For example, when the query knowledge type in the command is a task, the query knowledge item list is {task attribute, task model, task sample}.
- the query command parsing module 3111 parses the query command into several specific query knowledge items, such as "samples" and "tasks", and outputs them to the knowledge feedback module 3112.
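- The parsing step described above can be sketched as follows. This is a minimal illustration, not the implementation of the query command parsing module 3111: the dictionary field names (`user`, `receiver`, `query_knowledge_type`, `runtime_data`), the `ParsedQuery` type, and the function name are all hypothetical, since Table 6 is not reproduced here. Only the item list for the task type is stated in the text; the intra-task and inter-task entries follow the examples of such knowledge given elsewhere in the description.

```python
from dataclasses import dataclass

# Mapping from query knowledge type to query knowledge items.
# The "task" entry is taken from the text; the other two entries are
# assumptions based on the examples of intra-task and inter-task knowledge.
KNOWLEDGE_ITEMS = {
    "intra-task": ["sample", "model"],
    "task": ["task attribute", "task model", "task sample"],
    "inter-task": ["task relationship", "task group"],
}


@dataclass
class ParsedQuery:
    items: list        # query knowledge item list
    receiver: str      # where the target knowledge is sent
    runtime_data: str  # identifier of the runtime data set being asked about


def parse_query_command(command: dict) -> ParsedQuery:
    """Turn a dictionary-like query command into query knowledge items."""
    knowledge_type = command["query_knowledge_type"]
    if knowledge_type not in KNOWLEDGE_ITEMS:
        raise ValueError(f"unknown query knowledge type: {knowledge_type}")
    return ParsedQuery(
        items=list(KNOWLEDGE_ITEMS[knowledge_type]),
        receiver=command["receiver"],
        runtime_data=command["runtime_data"],
    )


# Hypothetical command: user A asks for task knowledge about data set 03.
command = {
    "user": "A",
    "receiver": "192.162.10.12",
    "query_knowledge_type": "task",
    "runtime_data": "03",
}
parsed = parse_query_command(command)
```

- In this sketch the receiver travels with the item list, matching the statement that the output of the parsing step is the query knowledge item list and the receiver.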
- the knowledge feedback module 3112 determines which knowledge and index extraction modules need to be called based on the query knowledge item list, outputs the extracted knowledge as target knowledge to the user, and puts it into the candidate knowledge cache 330 as candidate knowledge to be filled.
- the input is the query knowledge item list and the receiver, and the output is the target knowledge and the candidate knowledge to be filled.
- the knowledge feedback module 3112 invokes the corresponding one or more knowledge and index extraction modules according to the query knowledge item list.
- the knowledge feedback module 3112 only needs to call the in-task knowledge and index extraction module and the task knowledge and index extraction module, and then return the search extraction result to the user as target knowledge.
- it is also put into the candidate knowledge cache 330 as a partially known candidate knowledge to be filled. If the relevant knowledge cannot be searched in the knowledge base, it will also be put into the candidate knowledge cache 330 as the unknown candidate knowledge to be filled.
- the multi-task knowledge and index extraction module 312 searches the knowledge base based on runtime data and query knowledge items, and uses knowledge extractors to extract intra-task, task-level, and inter-task knowledge and indexes. Specifically, its inputs are runtime data, knowledge extractors, and query knowledge items, and its outputs are intra-task, task-level, and inter-task knowledge and indexes.
- the multi-task knowledge and index extraction module 312 may include the following sub-modules: an intra-task knowledge and index extraction module 3121 , a task knowledge and index extraction module 3122 , and an inter-task knowledge and index extraction module 3123 .
- the multi-task knowledge and index extraction module 312 extracts different levels of knowledge according to the query knowledge items. For example, only intra-task knowledge may be queried, without querying task knowledge and inter-task knowledge, or all three kinds of knowledge may be queried together.
- the intra-task knowledge and index extraction module 3121 extracts intra-task level knowledge, such as samples, models and their indexes, based on runtime data and intra-task knowledge extractors. Specifically, its input is runtime data, in-task knowledge extractor, and its output is in-task knowledge and index.
- the intra-task knowledge and index extraction module 3121 extracts the intra-task level knowledge based on the runtime data and the intra-task knowledge extractor. If the user wants to know "what are the samples related to the runtime data", the intra-task knowledge and index extraction module 3121 first extracts the task index, and then calls the multi-task index query and sorting module 313 to search for related samples according to the task index.
- the task knowledge and index extraction module 3122 extracts task-level knowledge, such as task attributes and indexes, based on runtime data, task knowledge extractors and intra-task knowledge. Specifically, its input is runtime data, task knowledge extractor and intra-task knowledge, and its output is task-level knowledge and index.
- the task knowledge and index extraction module 3122 extracts the knowledge at the task level based on the runtime data and the task knowledge extractor. If the user wants to know "whether the runtime data belongs to a known task", the task knowledge and index extraction module 3122 first extracts the task index, and then calls the multi-task index query and sorting module 313 according to the task index to find out whether there are related tasks and their task attributes, task models, task samples, and the like.
- the inter-task knowledge and index extraction module 3123 extracts inter-task level knowledge, such as task relationships, task groups and their indexes, based on runtime data, inter-task knowledge extractors and task knowledge. Specifically, its input is runtime data, inter-task knowledge extractor and task knowledge, and the output is inter-task level knowledge.
- the inter-task knowledge and index extraction module 3123 extracts the inter-task level knowledge based on the runtime data and the inter-task knowledge extractor.
- the inter-task knowledge and index extraction module 3123 first extracts the task index, and then, according to the task knowledge and the task index, calls the multi-task index query and sorting module 313 to find out whether there are related tasks and their task relationships, task groups, and the like.
- the multi-task index query and sorting module 313, which can also be referred to as a related knowledge search module, searches for knowledge at all levels in the knowledge base based on the index extracted by the knowledge and index extraction module and the task index table in the knowledge base, and outputs the search results sorted by similarity. Specifically, the input is an index and a task index table, and the output is the search results sorted by similarity.
- the similarity of tasks can be judged from several perspectives: task samples, where the similarity of tasks is judged by the similarity of the distributions of two sample sets; task attributes, where it is judged by the similarity between the rules of two task attributes; task migration relationship, where it is judged by the degree of transferability between two tasks; and task group, where it is judged by whether the tasks are contained in the same task group.
- the multi-task index query and sorting module 313 may include the following sub-modules: a comparison module 3132 , a knowledge similarity measurement module 3133 , and a related knowledge screening module 3134 . The functions of each of the above sub-modules will be described in detail below.
- the comparison module 3132 performs comparison and adaptation according to the type of knowledge to be searched, and selects an appropriate knowledge similarity measurement method. Specifically, the input is the type of knowledge to be searched, and the output is the similarity measurement method. For example, there may be many types of knowledge to be searched, such as samples, models, tasks, and task relationships, and the method for measuring similarity differs for each kind of knowledge.
- the comparison module 3132 performs comparison and adaptation according to the type of knowledge to be searched, and selects an appropriate knowledge similarity measurement method. For example, if the current knowledge to be searched is "related tasks", the adapted similarity measurement method is "using a decision tree to compare task attributes".
- the knowledge similarity measurement module 3133 measures the knowledge similarity according to the knowledge similarity measurement method selected by the comparison module 3132. Specifically, the input is a knowledge similarity measurement method, and the output is a knowledge similarity list. For example, the knowledge similarity measurement module 3133 reads the existing knowledge in the knowledge base, traverses it, and measures the similarity between the existing knowledge and the new knowledge according to the selected knowledge similarity measurement method. It then sorts all the results in descending order of similarity and outputs the knowledge similarity list.
- the related knowledge screening module 3134 selects the one or more most relevant pieces of knowledge from the knowledge similarity list and outputs them. Specifically, the input is a knowledge similarity list, and the output is a related knowledge list.
- the related knowledge screening module 3134 selects and outputs one or more pieces of related knowledge according to the knowledge similarity list obtained by the knowledge similarity measurement module 3133, the knowledge similarity threshold, and the user query command.
- for example, if the knowledge similarity list is {task 1 distance: 0.3, task 2 distance: 0.35, task 3 distance: 0.6} and the distance similarity threshold is 0.4, the related knowledge screening module 3134 outputs the related knowledge list [task 1, task 2].
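- The screening example above can be reproduced with a few lines of Python. This is only a sketch: the function name and the dictionary representation of the knowledge similarity list are assumptions, and the comparison is taken to be inclusive (a distance equal to the threshold passes), which the text does not specify.

```python
def screen_related_knowledge(similarity_list, distance_threshold):
    """Keep entries whose distance does not exceed the threshold, nearest first."""
    ranked = sorted(similarity_list.items(), key=lambda entry: entry[1])
    return [name for name, distance in ranked if distance <= distance_threshold]


# Values taken from the example in the text.
similarity_list = {"task 1": 0.3, "task 2": 0.35, "task 3": 0.6}
related = screen_related_knowledge(similarity_list, distance_threshold=0.4)
print(related)
```

- Because the list holds distances rather than similarities, a smaller value means a closer match, so the result is ordered from the nearest task to the farthest.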
- the knowledge base incremental maintenance module 320 may include the following sub-modules: a knowledge asynchronous integration module 321 , a knowledge strategy discrimination module 322 , and a knowledge and index update module 323 .
- the knowledge asynchronous integration module 321 performs asynchronous integration based on the historical knowledge to be filled in the candidate knowledge cache 330 and the filled knowledge after annotation and model filling to obtain complete candidate update knowledge.
- specifically, the input is the historical knowledge to be filled and the knowledge that has been filled, and the output is the candidate update knowledge.
- the runtime data is sometimes incomplete; for example, some samples lack annotations. Such data therefore cannot be updated directly into the knowledge base as complete knowledge. Instead, it is put into the candidate knowledge cache 330 to wait for the arrival of the real labels, and is then integrated by the knowledge asynchronous integration module 321 into labeled samples that serve as candidate update knowledge.
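- A minimal sketch of this wait-then-integrate behaviour is given below. The class name, method names, sample identifiers, and feature names are hypothetical and chosen only for illustration; the source does not describe the data structure of the candidate knowledge cache 330.

```python
class CandidateKnowledgeCache:
    """Holds unlabeled samples until their annotations arrive."""

    def __init__(self):
        self._pending = {}  # sample_id -> features still waiting for a label

    def put_pending(self, sample_id, features):
        """Store a sample whose annotation has not arrived yet."""
        self._pending[sample_id] = features

    def integrate(self, sample_id, label):
        """Merge a late label with its cached sample.

        Returns the complete candidate update knowledge, or None when no
        sample with this identifier is waiting in the cache.
        """
        features = self._pending.pop(sample_id, None)
        if features is None:
            return None
        return {"sample_id": sample_id, "features": features, "label": label}

    def pending_count(self):
        return len(self._pending)


cache = CandidateKnowledgeCache()
cache.put_pending("s1", {"param_a": 0.2, "param_b": 0.5})  # label unknown so far
candidate = cache.integrate("s1", label=62.5)              # label arrives later
```

- Removing the sample from the cache at integration time keeps a single annotation from producing two candidate updates.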
- the knowledge strategy discriminating module 322 discriminates, based on the candidate update knowledge, strategies required to incrementally update the knowledge of related tasks in the knowledge base, so as to improve the coverage and accuracy of the knowledge base.
- relevant knowledge also includes task attributes, inter-task relationships and task groups.
- specifically, the input is the candidate update knowledge, and the output is the update strategy.
- the knowledge strategy discrimination module 322 needs to determine, according to the candidate update knowledge, which of the following information is to be updated in the knowledge base: task attributes, samples, models, inter-task relationships, and task groups.
- the knowledge strategy discrimination module 322 can discriminate different update strategies based on the candidate update knowledge: it determines the type of the candidate update knowledge according to task attributes, task models, and the like, and uses methods such as task division, task migration relationship discovery, and task group mining to determine how the samples, models, tasks, inter-task relationships, and task groups of the candidate update knowledge will be updated into the knowledge base.
- the known tasks are the tasks stored in the knowledge base.
- Candidate update knowledge is the knowledge to be updated into the knowledge base.
- the task division method refers to, given a sample, dividing different samples into different tasks, and outputting task attributes and inter-task affiliation.
- the implementation methods of task division may include but are not limited to the following: the user manually specifies task attribute items and inputs them into the system at runtime; experts manually specify task attribute items and solidify them into the system in advance; or experts manually specify the task attributes of some task samples as annotations, and a task classifier is trained that takes a sample as input and outputs the task attribute to which the sample belongs.
- the task migration relationship discovery method refers to measuring the degree of transferability between tasks according to the characteristics of task samples and other methods, and outputting the migration relationship between tasks.
- the task group mining method refers to dividing similar tasks into the same group through clustering and other methods based on the task migration relationship, and outputting task groups, in which the same task may be assigned to multiple groups.
- the above-mentioned candidate update knowledge can be judged from the following dimensions and combinations of these dimensions: 1. Distinguish the difference in task attributes (new or old), that is, judge whether the target task already exists in the knowledge base. The basic idea for measuring this degree of unknownness is to use a similarity measure to judge whether the target task attribute is similar to a known task attribute. 2. Distinguish the difference between task models (difficult or easy), that is, judge whether the target task can be accurately inferred by a model in the knowledge base. The basic idea for measuring this degree of unknownness is to use model confidence, model transferability, or other model quality measures to judge whether the target task model is similar to a known task model. For model confidence, the higher the model confidence, the smaller the error the model is expected to make when inferring on the test sample.
- for model transferability, the higher the model transferability, the higher the probability that the model can be transferred to the target task.
- One realization of the model transferability is the similarity of the task samples.
- other measures of model quality may include the sample size used for training the model (statistically speaking, the larger the sample size, the higher the confidence), the stability of the model when tested on diverse datasets (for example, the model has been tested on multiple different datasets and its performance is relatively stable), and so on.
- 1. Sample-level model: given a target task, the training set of the sample-level model comes from a subset of the task dataset; 2. Single-task-level model: given a target task, the training set of the single-task-level model directly adopts the complete set of the task dataset; 3. Multi-task-level model: given a target task, the training set of the multi-task-level model comes from multiple task datasets.
- the knowledge strategy discrimination module 322 may include the following sub-modules: an adaptation module 3221, a knowledge difference comparison module 3222, and an update decision module 3223. The functions of the above sub-modules are described in detail below.
- the adaptation module 3221 selects an adapted comparison method according to the type of the candidate update knowledge. Specifically, the input is the candidate update knowledge, and the output is the comparison method.
- candidate update knowledge may exist in various forms, such as samples, models, task attributes, inter-task relationships, etc., as well as numeric, categorical, tensor, and rule types in various formats.
- the adaptation module 3221 needs to select different comparison methods according to the different type combinations of the candidate update knowledge.
- the knowledge difference comparison module 3222 compares the candidate update knowledge according to the comparison method selected by the adaptation module 3221. Specifically, the input is the comparison method, and the output is the comparison result. As an example, the knowledge difference comparison module 3222 needs to determine, at runtime, the degree of similarity and difference between the newly acquired knowledge and the existing knowledge. The similarities and differences of knowledge can be compared in several aspects, such as sample distribution, model accuracy, and task attributes, and each aspect yields its own result. For example, the comparison result may be "Sample Distribution: Same, Model Accuracy: Similar, Task Attribute: Different".
- the update decision module 3223 outputs a corresponding update strategy according to the comparison result of the knowledge difference comparison module 3222. Specifically, the input is the comparison result, and the output is the update strategy.
- the update decision module 3223 outputs a corresponding update strategy according to the result of the comparison between the new knowledge and the existing knowledge made by the knowledge difference comparison module 3222. If the comparison result of the knowledge difference comparison module 3222 is "task application scope: same, model accuracy: different", the update decision module 3223 outputs a corresponding update strategy, such as "knowledge reshaping".
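- One way to picture this decision step is a rule lookup keyed by the comparison result. The sketch below is an assumption about form only: the rule table contains just the single rule quoted in the text, and any further rules would have to be supplied by the user, an expert, or a trained classifier. The function and variable names are hypothetical.

```python
# Rule table keyed by (task application scope, model accuracy).
# Only this one rule is quoted in the text; it is not a complete rule set.
UPDATE_RULES = {
    ("same", "different"): "knowledge reshaping",
}


def decide_update_strategy(comparison, rules=UPDATE_RULES):
    """Look up the update strategy for a comparison result.

    Returns None when no rule has been configured for the result.
    """
    key = (comparison["task application scope"], comparison["model accuracy"])
    return rules.get(key)


strategy = decide_update_strategy(
    {"task application scope": "same", "model accuracy": "different"}
)
```

- Returning None for an unconfigured result leaves room for the decision to be deferred rather than guessed.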
- incremental update methods include but are not limited to one or more of the following: 1. Knowledge inheritance: inheritance of task attributes, models and samples, inheritance of inter-task relationships, and task groups (optional). 2. Knowledge accumulation: update task attributes, models and samples, update inter-task relationships, and task groups (optional). 3. Knowledge merging: update task attributes; continue to use inter-task relationships, task groups (optional), task samples and models. 4. Knowledge Reshaping: Update the task model and samples, the relationship between tasks, and task groups (optional).
- the selection strategies of different knowledge updating methods can also integrate new task knowledge such as target task attributes, inter-task relationships, and task groups (optional).
- the selection strategy may be given manually by the user at runtime; given manually by an expert and then solidified into the system; or output by a classifier trained on the target task attributes (which determine the application scope of the model) and on the inter-task relationships and task groups (which determine the degree of model matching). Alternatively, the model application scope is determined according to the target task attributes, and the model matching degree is determined according to the inter-task relationships and task groups (as shown in Table 8). Other methods that combine the target task attributes, inter-task relationships, and task groups (optional) with newly added task knowledge are also possible.
- the knowledge and index update module 323 may include the following sub-modules: an intra-task knowledge and index update module 3231 , a task knowledge and index update module 3232 , and an inter-task knowledge and index update module 3233 . The functions of the above sub-modules are described in detail below.
- the in-task knowledge and index update module 3231 updates the in-task knowledge and index based on the update strategy obtained by the knowledge strategy discrimination module 322. Specifically, the input is the update strategy, and the output is the in-task knowledge. In one example, if some sample sets in the runtime data are determined to belong to existing tasks in the knowledge base, the in-task knowledge and index update module 3231 updates the sample sets into the knowledge base as in-task knowledge according to the update strategy.
- the task knowledge and index update module 3232 updates the task knowledge and index based on the update strategy obtained by the knowledge strategy discrimination module 322. Specifically, the input is the update strategy, and the output is the task knowledge. In one example, if some sample sets in the runtime data are determined to belong to existing tasks in the knowledge base, and the task attributes, task constraints, and the like of the existing tasks have changed after the new samples were added, the task knowledge and index update module 3232 updates the new task attributes, task constraints, and the like into the knowledge base as task knowledge according to the update strategy.
- the inter-task knowledge and index update module 3233 updates the inter-task knowledge and index based on the update strategy obtained by the knowledge strategy discrimination module 322. Specifically, its input is an update strategy, and its output is inter-task knowledge. In one example, if some sample sets in the runtime data are determined to belong to existing tasks in the knowledge base, and the task migration relationships of the existing tasks and the task groups they belong to have changed after the new samples were added, the inter-task knowledge and index update module 3233 updates the new task migration relationships, task groups, and the like into the knowledge base as inter-task knowledge according to the update strategy.
- the synchronization of the multi-task knowledge between the edge knowledge base 220 and the cloud knowledge base 230 can also be realized.
- the multi-task knowledge of the edge knowledge base 220 and the cloud knowledge base 230 can be bidirectionally transmitted through the edge-cloud knowledge synchronization module to ensure the synchronization of their knowledge.
- the coal blending quality prediction system is a complex system.
- the factory hopes to improve the quality of coal blending through machine learning. Therefore, a knowledge base is needed that includes different machine learning models for predicting the coal blending quality under different coal blending parameters.
- the inputs of the different machine learning models are different coal blending parameters, and the outputs are different predicted coal blending quality values.
- the in-task knowledge and its index table initialization module 2111 in the multi-task knowledge initialization module 211 receives the coal blending multi-task knowledge and its extractor, extracts the in-task knowledge, builds an index table, and obtains the results shown in Tables 9 to 12 below.
- after the in-task knowledge and its index table initialization module 2111 completes the extraction of the in-task knowledge, it uses pickle to serialize the knowledge as in-task knowledge, saves it in the knowledge base, and passes control to the task knowledge and its index table initialization module 2112.
- the task knowledge and its index table initialization module 2112 receives the multi-task knowledge and its extractor and the extracted intra-task knowledge, extracts the task knowledge and builds an index table, and obtains the results shown in Table 13 below.
- after the task knowledge and its index table initialization module 2112 completes the task knowledge extraction, it uses pickle to serialize the knowledge as task knowledge, saves it in the knowledge base, and passes control to the inter-task knowledge and its index table initialization module 2113.
- the inter-task knowledge and its index table initialization module 2113 receives the multi-task knowledge and its extractor and the extracted task knowledge, extracts the inter-task knowledge and builds an index table, and obtains the results shown in Tables 4 and 5 above. It should be understood that the relationship of multiple tasks in the same scenario is generally managed uniformly by a single task dependency table, and the task dependency index table is shown in Table 14.
- after the inter-task knowledge and its index table initialization module 2113 completes the extraction of the inter-task knowledge, it uses the pickle package to serialize the knowledge as inter-task knowledge and saves it in the knowledge base; the multi-task knowledge initialization module 211 then exits, completing the multi-task knowledge initialization.
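- The serialization step mentioned above can be illustrated with Python's standard `pickle` module. The file name, helper names, and the contents of the example knowledge object are hypothetical; the text only states that the knowledge is serialized with pickle and saved in the knowledge base.

```python
import pickle
import tempfile
from pathlib import Path


def save_knowledge(knowledge, path):
    """Serialize a knowledge object to a file with pickle."""
    with open(path, "wb") as handle:
        pickle.dump(knowledge, handle)


def load_knowledge(path):
    """Read a knowledge object back from a pickle file."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Hypothetical in-task knowledge: a few samples and a simple model description.
in_task_knowledge = {
    "task_id": 1,
    "samples": [[0.2, 0.5], [0.3, 0.4]],
    "model": {"type": "linear", "coef": [1.0, 2.0]},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "in_task_knowledge.pkl"
    save_knowledge(in_task_knowledge, path)
    restored = load_knowledge(path)
```

- Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code; this matters when knowledge is exchanged between edge and cloud knowledge bases.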
- the edge-cloud knowledge synchronization module 212 reads knowledge from the knowledge base (edge knowledge base 220 ) and synchronizes it to the cloud knowledge base 230 .
- after the query command parsing module 3111 receives a target knowledge type query command as shown in Table 6 above, it parses out that user A needs to find the task related to the data set No. 03 in the knowledge base, and that the receiving address is 192.162.10.12.
- the query command parsing module 3111 can pass the information to the knowledge feedback module 3112 as query knowledge items.
- the query knowledge items received by the knowledge feedback module 3112 are shown in the following table 15.
- the No. 03 data set can be read, and the intra-task knowledge and index extraction module 3121 can be entered to obtain the corresponding task knowledge and output it as the target knowledge.
- the comparison module 3132 receives the task T4 and its task attribute decision tree CLF4, outputs the knowledge similarity comparison method shown in Table 16 below according to the set adaptation rules, and enters the knowledge similarity measurement module 3133.
- the above task similarity comparison methods include several new methods that have not been proposed before, such as the training sample comparison method, the task model comparison method, and the application scope comparison method.
- the principle of the application scope comparison method is to judge that the task models are similar when the application scope of the task models is similar. It can be used on different basic models, and it is the method with the highest generality, accuracy and difficulty among the above three methods.
- one implementation of the application scope comparison method is the decision tree method.
- a decision tree can be constructed for each task model to determine whether the task model is applicable. For example, the entire dataset is divided into several tasks, and each task has its own linear regression model. For a given task, the entire dataset is predicted by the linear regression model corresponding to that task, and the predictions are compared with the real annotations. If the prediction for a sample is correct, it is considered that "the model is accepted by the sample" and the sample is assigned a value of 1; otherwise it is assigned 0. The entire training set and the 0/1 values are then concatenated as the input to the decision tree. The decision tree divides the samples according to the splitting rules of its nodes and some parameter settings, so that the purity of each node (measured, for example, by entropy or the Gini index) is as high as possible.
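- The labeling and splitting procedure described above can be sketched in plain Python. To stay short, the sketch fits only a depth-one decision tree (a single split chosen by Gini impurity) instead of a full tree, and it assumes the task's linear regression model is already fitted. Because the task is a regression, "the prediction is correct" is interpreted here as "the absolute error is within a tolerance"; that tolerance, the data, and all names are assumptions.

```python
def predict_linear(weights, bias, x):
    """Prediction of an already fitted linear regression model."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def acceptance_labels(weights, bias, samples, targets, tolerance):
    """1 if the model is accepted by the sample (error within tolerance), else 0."""
    return [
        1 if abs(predict_linear(weights, bias, x) - y) <= tolerance else 0
        for x, y in zip(samples, targets)
    ]


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def fit_stump(samples, labels):
    """Find the single feature/threshold split with the lowest weighted Gini."""
    best = None
    for feature in range(len(samples[0])):
        values = sorted({x[feature] for x in samples})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2.0
            left = [l for x, l in zip(samples, labels) if x[feature] <= threshold]
            right = [l for x, l in zip(samples, labels) if x[feature] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(samples)
            if best is None or impurity < best["impurity"]:
                best = {
                    "impurity": impurity,
                    "feature": feature,
                    "threshold": threshold,
                    # Majority label on each side: 1 means the model applies there.
                    "left_accepts": 1 if 2 * sum(left) > len(left) else 0,
                    "right_accepts": 1 if 2 * sum(right) > len(right) else 0,
                }
    return best


# Hypothetical data: the task model y = 2x fits small x but not large x.
samples = [[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]]
targets = [2.0, 4.0, 6.0, 20.0, 23.0, 26.0]
labels = acceptance_labels([2.0], 0.0, samples, targets, tolerance=0.5)
stump = fit_stump(samples, labels)
```

- The resulting split describes the application scope of the task model as a rule on the input features; two task models whose rules are similar would then be judged similar under the application scope comparison method.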
- given that the knowledge similarity comparison method is "use decision tree to compare task similarity", the existing knowledge is read from the knowledge base, the knowledge similarity list shown in Table 17 below is obtained through the similarity measurement algorithm, and the related knowledge screening module 3134 is entered.
- after receiving the knowledge similarity list shown in Table 17, the related knowledge screening module 3134 filters out the similar task list shown in Table 18 according to the set distance threshold of 0.35, and returns it as the target knowledge.
- the runtime data and its related knowledge are entered into the candidate knowledge cache 330 as the candidate knowledge to be filled and wait for the annotations that have not yet arrived.
- after the CSR annotation arrives, the historical knowledge to be filled and the externally filled knowledge are read from the candidate knowledge cache 330 and input into the knowledge asynchronous integration module 321, which integrates them into the complete candidate update knowledge shown in Table 19 below; the knowledge strategy discrimination module 322 is then entered.
- the adaptation module 3221 in the knowledge strategy discrimination module 322 receives the candidate update knowledge and its related knowledge, outputs the knowledge difference comparison method shown in Table 20 below according to the set adaptation rules, and passes control to the knowledge difference comparison module 3222.
- after the knowledge difference comparison module 3222 receives the knowledge difference comparison method shown in Table 20, it compares the candidate update knowledge with its related knowledge, outputs the knowledge difference comparison result shown in Table 21 below, and passes control to the update decision module 3223.
- after receiving the knowledge difference comparison result shown in Table 21, the update decision module 3223 outputs the update methods shown in Tables 22 and 23 below according to the set update rules.
- the task knowledge and index update module 3232 decides, according to the update strategy, to retrain on the new task together with the original No. 9 task: it retains the task attributes, merges the samples, and generates a new model, task index, and other knowledge. It then replaces the original No. 9 task in the knowledge base with the newly generated task to complete the update of task knowledge and index.
- the sequence numbers of the above-mentioned processes do not imply an order of execution; the execution order of each process should be determined by its functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
- FIG. 10 is a schematic block diagram of an apparatus 1000 for acquiring knowledge provided by an embodiment of the present application.
- the device for acquiring knowledge can be implemented as part or all of the device through software, hardware or a combination of the two.
- the apparatus provided by the embodiment of the present application can implement the method flow shown in FIG. 1 of the embodiment of the present application.
- the apparatus 1000 for acquiring knowledge includes an obtaining module 1010 and a display module 1020, wherein:
- the obtaining module 1010 is configured to obtain one or more pieces of first knowledge from the knowledge base according to parameters, where the parameters include any one or a combination of the following: knowledge in the machine learning task, attributes of the machine learning task, and knowledge between multiple machine learning tasks;
- the display module 1020 is configured to provide the one or more pieces of first knowledge to the user.
- the obtaining module 1010 is further configured to: obtain the parameters input by the user; or obtain the parameters from other systems.
- the knowledge in the machine learning task includes a sample set and a model of the machine learning task, and the model is obtained by training according to the sample set; or the attributes of the machine learning task include the constraints and the scope of application of the machine learning task; or the knowledge between the multiple machine learning tasks includes the association relationship between the multiple machine learning tasks.
- the obtaining module 1010 is further configured to obtain second knowledge related to the first knowledge from the knowledge base; the display module 1020 is further configured to provide the user with the second knowledge.
- the display module 1020 is further configured to: provide the user with configuration information of the first knowledge.
- the acquisition module 1010 is further configured to: acquire target knowledge selected by the user, where the target knowledge is the first knowledge and/or the second knowledge.
- the target knowledge is used in any of the following scenarios: object recognition in intelligent driving; person recognition in intelligent driving; a developer platform; an artificial intelligence marketplace platform; an Internet of Things marketplace platform; a solutions marketplace platform.
- the apparatus 1000 further includes: a synchronization module 1030, used by the edge device to synchronize the knowledge in the knowledge base to the cloud device, or by the cloud device to synchronize the knowledge in the knowledge base to the edge device.
- when the apparatus for acquiring knowledge provided by the above-mentioned embodiments performs image prediction, the division into the above-mentioned functional modules is used only as an example. In practical applications, the above functions may be allocated to different functional modules as required; that is, the internal structure of the apparatus is divided into different functional modules to complete all or part of the functions described above.
- the apparatus for acquiring knowledge and the method for acquiring knowledge provided by the above embodiments belong to the same concept, and the specific implementation process thereof can be found in the above method embodiments, which will not be repeated here.
- FIG. 11 is a schematic block diagram of a device 1100 for acquiring knowledge provided by an embodiment of the present application.
- the device 1100 for acquiring knowledge includes the apparatus 1000 for acquiring knowledge and can execute each step of the method shown in FIG. 1; to avoid repetition, details are not described here again.
- the device 1100 for acquiring knowledge includes: a memory 1110 , a processor 1120 and an input and output interface 1130 .
- the processor 1120 can be connected in communication with the input and output interface 1130 .
- the memory 1110 may be used to store program code and data of the device 1100 for acquiring knowledge. The memory 1110 may be an internal storage unit of the processor 1120, an external storage unit independent of the processor 1120, or a component that includes both a storage unit internal to the processor 1120 and an external storage unit independent of the processor 1120.
- the device 1100 for acquiring knowledge may further include a bus 1140 .
- the memory 1110 and the input/output interface 1130 may be connected to the processor 1120 through the bus 1140 .
- the bus 1140 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, or the like.
- the bus 1140 can be divided into an address bus, a data bus, a control bus, and the like. For ease of presentation, only one thick line is used in FIG. 11, but it does not mean that there is only one bus or one type of bus.
- the processor 1120 may be, for example, a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various exemplary logical blocks, modules, and circuits described in connection with this disclosure.
- the processor may also be a combination that implements computing functions, such as a combination comprising one or more microprocessors, a combination of a DSP and a microprocessor, and the like.
- the input and output interface 1130 may be a circuit including the above-mentioned antenna, transmitter chain, and receiver chain, and the two may be independent circuits or the same circuit.
- the processor 1120 is configured to perform the following operations:
- the processor 1120 is further configured to: obtain the parameter input by the user; or obtain the parameter from other systems.
- the knowledge within the machine learning task includes a sample set and a model of the machine learning task, the model being obtained by training on the sample set; or the attributes of the machine learning task include constraints and an application scope of the machine learning task; or the knowledge between the multiple machine learning tasks includes association relationships between the multiple machine learning tasks.
- the processor 1120 is further configured to: acquire second knowledge related to the first knowledge from the knowledge base; and provide the second knowledge to the user.
- the processor 1120 is further configured to: determine a corresponding knowledge similarity comparison method according to the first knowledge; obtain a similar knowledge list from the task knowledge base according to the knowledge similarity comparison method; and determine the second knowledge from the similar knowledge list according to a similarity threshold.
- the processor 1120 is further configured to: provide the user with configuration information of the first knowledge.
- the processor 1120 is further configured to: acquire target knowledge selected by the user, where the target knowledge is the first knowledge and/or the second knowledge.
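- the target knowledge is used in any of the following scenarios: object recognition in intelligent driving; person recognition in intelligent driving; a developer platform; an artificial intelligence marketplace platform; an Internet of Things marketplace platform; a solutions marketplace platform.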
- the processor 1120 is further configured to: update the task knowledge base according to the first knowledge and the second knowledge.
- the processor 1120 is specifically configured to: determine a knowledge difference comparison method according to the first knowledge and the second knowledge; obtain a knowledge difference comparison result according to the knowledge difference comparison method, where the result describes the similarities and differences between the first knowledge and the second knowledge; and, according to the comparison result and update rules, update any one or a combination of the following knowledge in the task knowledge base: knowledge within the machine learning task, attributes of the machine learning task, and knowledge between a plurality of the machine learning tasks.
- the processor 1120 is further configured to: cause the edge device to synchronize the knowledge in the knowledge base to the cloud device; or cause the cloud device to synchronize the knowledge in the knowledge base to the edge device.
- modules of the above-described examples can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this application.
- An embodiment of the present application further provides a chip, the chip obtains an instruction and executes the instruction to implement the above-mentioned method for acquiring knowledge, or the instruction is used to implement the above-mentioned device for acquiring knowledge.
- the chip includes a processor and a data interface, and the processor reads the instructions stored in the memory through the data interface, and executes the above method for acquiring knowledge.
- the chip may also include a memory in which instructions are stored; the processor is used to execute the instructions stored in the memory, and when the instructions are executed, the processor executes the above-mentioned method for acquiring knowledge.
- An embodiment of the present application further provides a computer-readable storage medium, where the computer-readable storage medium stores instructions, and the instructions are used to implement the method for acquiring knowledge in the foregoing method embodiments, or the instructions are used to implement the foregoing apparatus for acquiring knowledge.
- Embodiments of the present application further provide a computer program product including instructions, where the instructions are used to implement the method for acquiring knowledge in the foregoing method embodiments, or the instructions are used to implement the foregoing apparatus for acquiring knowledge.
- the processor may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
- a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
- the memory may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
- the non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
- Volatile memory may be random access memory (RAM), which acts as an external cache. Forms of RAM include static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct rambus random access memory (DR RAM).
- plural means two or more.
- "At least one of the following items" or similar expressions refer to any combination of these items, including any combination of a single item or plural items.
- at least one of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c may be single or multiple.
- the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
- the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product.
- the computer software product is stored in a storage medium and includes several instructions for causing a computing device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present application.
- the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disk.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
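A method and apparatus for acquiring knowledge. The method includes: acquiring one or more pieces of first knowledge from a knowledge base according to parameters, where the parameters include any one or a combination of the following: knowledge within a machine learning task, attributes of the machine learning task, and knowledge between multiple machine learning tasks; and providing the one or more pieces of first knowledge to a user. The above technical solution can automatically acquire corresponding knowledge from the knowledge base according to the parameters, enabling precise search of the knowledge in the knowledge base.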
Description
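The present application relates to the field of computers, and more particularly, to a method and apparatus for acquiring knowledge.
Among the many kinds of knowledge in machine learning, the model is the most important part. However, because training a model takes a great deal of time, and as application scenarios keep expanding, models can be stored in a model knowledge base. Machine learning developers can look up the pre-trained models they need there and use them directly after simple modifications, which greatly reduces the manpower and material resources consumed by repeated training.
In related technical solutions, existing model knowledge bases only support simple model search by name, and the application scope of a model is unclear, so it is not known which model should be used in which situation. As a result, the model knowledge base cannot be searched precisely.
Therefore, how to achieve precise search of a model knowledge base has become an urgent problem to be solved.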
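Summary of the Invention
The present application provides a method and apparatus for acquiring knowledge, which can automatically acquire corresponding knowledge from a knowledge base according to parameters, enabling precise search of the knowledge in the knowledge base.
In a first aspect, a method for acquiring knowledge is provided, including: acquiring one or more pieces of first knowledge from a knowledge base according to parameters, where the parameters include any one or a combination of the following: knowledge within a machine learning task, attributes of the machine learning task, and knowledge between multiple machine learning tasks; and providing the one or more pieces of first knowledge to a user.
In the above technical solution, corresponding knowledge can be acquired automatically from the knowledge base according to the parameters, enabling precise search of the knowledge in the knowledge base.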
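With reference to the first aspect, in some implementations of the first aspect, the method further includes: acquiring the parameters input by the user; or acquiring the parameters from another system.
With reference to the first aspect, in some implementations of the first aspect, the knowledge within the machine learning task includes a sample set and a model of the machine learning task, the model being obtained by training on the sample set; or the attributes of the machine learning task include constraints and an application scope of the machine learning task; or the knowledge between the multiple machine learning tasks includes association relationships between the multiple machine learning tasks.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: acquiring, from the knowledge base, second knowledge related to the first knowledge; and providing the second knowledge to the user.
With reference to the first aspect, in some implementations of the first aspect, a corresponding knowledge similarity comparison method is determined according to the first knowledge; a similar knowledge list is obtained from the task knowledge base according to the knowledge similarity comparison method; and the second knowledge is determined from the similar knowledge list according to a similarity threshold.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: providing the user with configuration information of the first knowledge.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: acquiring target knowledge selected by the user, where the target knowledge is the first knowledge and/or the second knowledge.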
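With reference to the first aspect, in some implementations of the first aspect, the target knowledge is used in any one of the following scenarios:
object recognition in intelligent driving;
person recognition in intelligent driving;
a developer platform;
an artificial intelligence marketplace platform;
an Internet of Things marketplace platform;
a solutions marketplace platform.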
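With reference to the first aspect, in some implementations of the first aspect, the method further includes: updating the task knowledge base according to the first knowledge and the second knowledge.
With reference to the first aspect, in some implementations of the first aspect, a knowledge difference comparison method is determined according to the first knowledge and the second knowledge; a knowledge difference comparison result is obtained according to the knowledge difference comparison method, where the result describes the similarities and differences between the first knowledge and the second knowledge; and, according to the comparison result and update rules, any one or a combination of the following knowledge in the task knowledge base is updated: knowledge within the machine learning task, attributes of the machine learning task, and knowledge between a plurality of the machine learning tasks.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: an edge device synchronizing the knowledge in the knowledge base to a cloud device; or the cloud device synchronizing the knowledge in the knowledge base to the edge device.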
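In a second aspect, an apparatus for acquiring knowledge is provided, including an acquisition module and a display module. The acquisition module is configured to acquire one or more pieces of first knowledge from a knowledge base according to parameters, where the parameters include any one or a combination of the following: knowledge within a machine learning task, attributes of the machine learning task, and knowledge between multiple machine learning tasks. The display module is configured to provide the one or more pieces of first knowledge to a user.
With reference to the second aspect, in some implementations of the second aspect, the acquisition module is further configured to: acquire the parameters input by the user; or acquire the parameters from another system.
With reference to the second aspect, in some implementations of the second aspect, the knowledge within the machine learning task includes a sample set and a model of the machine learning task, the model being obtained by training on the sample set; or the attributes of the machine learning task include constraints and an application scope of the machine learning task; or the knowledge between the multiple machine learning tasks includes association relationships between the multiple machine learning tasks.
With reference to the second aspect, in some implementations of the second aspect, the acquisition module is further configured to acquire, from the knowledge base, second knowledge related to the first knowledge; and the display module is further configured to provide the second knowledge to the user.
With reference to the second aspect, in some implementations of the second aspect, the display module is further configured to: provide the user with configuration information of the first knowledge.
With reference to the second aspect, in some implementations of the second aspect, the acquisition module is further configured to: acquire target knowledge selected by the user, where the target knowledge is the first knowledge and/or the second knowledge.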
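With reference to the second aspect, in some implementations of the second aspect, the target knowledge is used in any one of the following scenarios:
object recognition in intelligent driving;
person recognition in intelligent driving;
a developer platform;
an artificial intelligence marketplace platform;
an Internet of Things marketplace platform;
a solutions marketplace platform.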
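With reference to the second aspect, in some implementations of the second aspect, the apparatus further includes: a synchronization module, used by an edge device to synchronize the knowledge in the knowledge base to a cloud device, or by the cloud device to synchronize the knowledge in the knowledge base to the edge device.
In a third aspect, a device for acquiring knowledge is provided, including an input/output interface, a processor, and a memory, where the processor is configured to control the input/output interface to send and receive information, the memory is configured to store a computer program, and the processor is configured to call and run the computer program from the memory, so that the device performs the method described in the first aspect or any possible implementation of the first aspect.
Optionally, the processor may be a general-purpose processor and may be implemented in hardware or in software. When implemented in hardware, the processor may be a logic circuit, an integrated circuit, or the like; when implemented in software, the processor may be a general-purpose processor that is implemented by reading software code stored in a memory, and the memory may be integrated in the processor or may be located outside the processor and exist independently.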
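In a fourth aspect, a chip is provided. The chip obtains instructions and executes them to implement the method in the first aspect or any implementation of the first aspect.
Optionally, as an implementation, the chip includes a processor and a data interface, and the processor reads instructions stored in a memory through the data interface and executes the method in the first aspect or any implementation of the first aspect.
Optionally, as an implementation, the chip may further include a memory in which instructions are stored; the processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor executes the method in the first aspect or any implementation of the first aspect.
In a fifth aspect, a computer program product is provided. The computer program product includes computer program code which, when run on a computer, causes the computer to execute the method in the first aspect or any implementation of the first aspect.
In a sixth aspect, a computer-readable storage medium is provided, including instructions, where the instructions are used to implement the method in the first aspect or any implementation of the first aspect.
Optionally, as an implementation, the storage medium may specifically be a non-volatile storage medium.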
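FIG. 1 is a schematic flowchart of a method for acquiring knowledge provided by an embodiment of the present application.
FIG. 2 is a schematic block diagram of knowledge base initialization provided by an embodiment of the present application.
FIG. 3 is a schematic block diagram of another knowledge base initialization provided by an embodiment of the present application.
FIG. 4 is a schematic block diagram of a knowledge base operation phase provided by an embodiment of the present application.
FIG. 5 is a schematic block diagram of another knowledge base operation phase provided by an embodiment of the present application.
FIG. 6 is a schematic diagram of an interface for configuring knowledge base initialization parameters provided by an embodiment of the present application.
FIG. 7 is a schematic diagram of a parameter configuration interface for querying models in the knowledge base provided by an embodiment of the present application.
FIG. 8 is a schematic diagram of a parameter configuration interface for querying tasks in the knowledge base provided by an embodiment of the present application.
FIG. 9 is a schematic diagram of a parameter configuration interface for querying the application scope of models in the knowledge base provided by an embodiment of the present application.
FIG. 10 is a schematic block diagram of an apparatus 1000 for acquiring knowledge provided by an embodiment of the present application.
FIG. 11 is a schematic block diagram of a device 1100 for acquiring knowledge provided by an embodiment of the present application.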
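The technical solutions in the present application are described below with reference to the accompanying drawings.
The present application presents various aspects, embodiments, or features around a system that includes multiple devices, components, modules, and the like. It should be understood that each system may include additional devices, components, modules, and the like, and/or may not include all of the devices, components, modules, and the like discussed with reference to the drawings. Combinations of these solutions may also be used.
In addition, in the embodiments of the present application, words such as "exemplary" and "for example" are used to mean serving as an example, illustration, or explanation. Any embodiment or design described as an "example" in the present application should not be construed as preferred or advantageous over other embodiments or designs. Rather, the word "example" is intended to present a concept in a concrete manner.
In the embodiments of the present application, "相应的" (corresponding, relevant) and "对应的" (corresponding) may sometimes be used interchangeably. It should be noted that, when the distinction between them is not emphasized, their intended meanings are the same.
The network architectures and service scenarios described in the embodiments of the present application are intended to explain the technical solutions of the embodiments more clearly and do not limit the technical solutions provided by the embodiments. A person of ordinary skill in the art will appreciate that, as network architectures evolve and new service scenarios emerge, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
References in this specification to "one embodiment", "some embodiments", and the like mean that one or more embodiments of the present application include a particular feature, structure, or characteristic described in connection with that embodiment. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "include", "comprise", "have", and their variants all mean "including but not limited to", unless specifically emphasized otherwise.
In the present application, "at least one" means one or more, and "plural" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" can mean: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects before and after it. "At least one of the following items" or similar expressions refer to any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c may be single or multiple.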
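For ease of understanding, the terms and concepts that may be involved in the embodiments of the present application are introduced first.
Among the many kinds of knowledge in machine learning, the model is the most important part. Models represented by deep neural networks have achieved good results in many machine learning applications, such as image classification and speech recognition. However, because training a model takes a great deal of time, and as application scenarios keep expanding, open-source sharing of training data and models has gradually become an industry trend. One example is TensorFlow Hub from Google, a model knowledge base for storing reusable machine learning assets. Machine learning developers can look up the pre-trained models they need there and use them directly after simple modifications, which greatly reduces the manpower and material resources consumed by repeated training. However, existing model knowledge bases mainly have the following problems:
1. The model knowledge base cannot be searched precisely
Existing model knowledge bases (which may also be called knowledge search engines) usually only support simple model search by name, and the application scope of a model is unclear, so it is not known which model should be used in which situation. Because edge scenarios are highly diverse, customized AI services often need to be provided for different scenarios; in addition, edge data is highly non-identically distributed, so existing knowledge search engines cannot accurately find a suitable model for each scenario. Consequently, edge models currently rely heavily on manual customization, which consumes a lot of manpower and material resources and leaves the known models in the knowledge base underused. For example, when predicting and controlling coal blending performance at the edge: (1) different factories have different production targets and production conditions, so their coal blending sheet data differ greatly and different models are needed; (2) under different working conditions, such as operating at 1000°C versus 1200°C, the coke strength after reaction (CSR) differs, so the corresponding models differ as well; (3) different production lines, even within the same factory, have different conveyor belt and crusher wear, different coal blending operators, and so on, which makes it hard to run them with a single unified model.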
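2. Catastrophic forgetting of knowledge
Traditional machine learning methods, including transfer learning and incremental learning, keep only one model and modify that model, without preserving different models for different scenarios. As a result, they tend to forget models that appeared in the past when the environment changes and make serious errors in scenarios they once handled well, which leads to repeated rounds of transfer and incremental learning. On the one hand, the low accuracy caused by forgetting degrades the user experience; on the other hand, a great deal of time is needed to collect data and learn the model again, wasting manpower and material resources. For example, to predict CSR in the coal blending problem, a traditional incremental learning method is used to train a model for working condition A, and the model performs well under working condition A. When the working condition changes, the model's accuracy drops, so data under working condition B must be collected and the condition-A model transferred or incrementally changed into a condition-B model. When the working condition changes back to A, the parameters of the original model have largely been overwritten and the new model has almost completely forgotten the knowledge of condition A, so data must be collected again and the condition-B model transferred or incrementally changed back into the original condition-A model. Such repeated learning costs a great deal of time and computing power.
In view of this, the embodiments of the present application provide a method for acquiring knowledge, which can automatically acquire corresponding knowledge from a knowledge base according to parameters, enabling precise search of the knowledge in the knowledge base.
The method for acquiring knowledge provided by the embodiments of the present application is described in detail below with reference to FIG. 1. As shown in FIG. 1, the method includes steps 110 to 120, which are described in detail below.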
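Step 110: Acquire one or more pieces of first knowledge from the knowledge base according to parameters, where the parameters include any one or a combination of the following: knowledge within a machine learning task, attributes of the machine learning task, and knowledge between multiple machine learning tasks.
Optionally, before step 110, the parameters may also be acquired. Specifically, the parameters input by the user may be acquired, or the parameters may be acquired from another system (for example, another intelligent platform), which is not specifically limited in the present application.
As an example, the knowledge within the machine learning task includes a sample set and a model of the machine learning task, the model being obtained by training on the sample set. The attributes of the machine learning task include constraints and an application scope of the machine learning task. The knowledge between the multiple machine learning tasks includes association relationships between the multiple machine learning tasks. This knowledge is explained in detail below with specific examples and is not elaborated here.
Step 120: Provide the one or more pieces of first knowledge to the user.
Optionally, second knowledge related to the first knowledge may also be acquired from the knowledge base and provided to the user.
Optionally, configuration information related to the first knowledge may also be provided to the user, for example, a brief description of the first knowledge and its benefits.
Optionally, configuration information related to the second knowledge may also be provided to the user.
Optionally, target knowledge selected by the user may also be acquired, where the target knowledge is the first knowledge and/or the second knowledge.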
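As an example, the target knowledge is used in any one of the following scenarios:
object recognition in intelligent driving;
person recognition in intelligent driving;
a developer platform;
an artificial intelligence marketplace platform;
an Internet of Things marketplace platform;
a solutions marketplace platform.
Optionally, the knowledge base of the edge device and the knowledge base of the cloud device may also be synchronized with each other. For example, the edge device synchronizes the knowledge in the knowledge base to the cloud device; or the cloud device synchronizes the knowledge in the knowledge base to the edge device.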
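FIG. 2 is a schematic block diagram of knowledge base initialization provided by an embodiment of the present application. As shown in FIG. 2, the knowledge base initialization process may involve: a knowledge base initialization module 210, an edge knowledge base 220, and a cloud knowledge base 230. The functions of these modules are described in detail below.
The edge knowledge base 220 is a knowledge base located on an edge-side device and is used to store multi-task knowledge, a multi-task knowledge index table, and multi-task knowledge extractors. The edge-side device may be, for example, a server close to the user side.
The cloud knowledge base 230 is a knowledge base located on a cloud-side device and is used to store multi-task knowledge, a multi-task knowledge index table, and multi-task knowledge extractors. The cloud-side device may be, for example, a cloud-side server.
The knowledge base initialization module 210 is used to build the above knowledge bases, which can also be understood as initializing them. Optionally, module 210 is also used to store and synchronize multi-task knowledge between the edge knowledge base 220 and the cloud knowledge base 230. Specifically, the input of module 210 is multi-task knowledge and its extractors (which may be supplied as system input or come from the cloud), and the output is multi-task knowledge, a multi-task knowledge index table, and multi-task knowledge extractors. In other words, based on the multi-task knowledge and its extractors, module 210 can store the multi-task knowledge index table and the multi-task knowledge extractors in the edge knowledge base 220 or the cloud knowledge base 230, and complete the storage and synchronization of multi-task knowledge between the two. As an example, as shown in FIG. 3, module 210 may include two submodules: a multi-task knowledge initialization module 211 and an edge-cloud knowledge synchronization module 212. The functions of these two submodules are described in detail later.
It should be understood that a task here means a machine learning task, which can be understood as the process of learning a model for specific samples, where the samples may be labels or features. Multi-task knowledge may include three levels of knowledge: intra-task knowledge, task knowledge, and inter-task knowledge. Optionally, multi-task knowledge may also include a fourth level: task groups. Each level is described in detail below.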
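As an example, the first level of knowledge is intra-task knowledge, which is the kind of knowledge that traditional lifelong learning may store. It includes samples and models. A sample, also called a task sample, refers to the set of samples under a given task once tasks have been defined. Samples are data records used for model training or testing; each record contains different data items, and a sample may be labeled or unlabeled. For example, Table 1 shows a coal blending sample. A model, also called a task model, is the model trained separately for each task. Information about the model may include, but is not limited to, its training method, hyperparameters, and parameters. The training method may include, but is not limited to, a machine learning algorithm such as a neural network or a support vector machine. Model hyperparameters are configuration items of the machine learning algorithm used to train the model, such as the learning rate of a neural network. Model parameters are the configuration of the model's mapping function and are obtained by training with the machine learning algorithm, such as the weights of a neural network or the support vectors of a support vector machine. For example, Table 2 shows a coal blending prediction model trained from coal blending samples.
Table 1 A coal blending sample
| Sample name | ad | vdaf | std | G value | CSR |
| Coal blending sample 1 | 8.553833 | 27.38117 | 0.494833 | 68.4 | 65.0 |
Table 2 A coal blending prediction model
| Model name | Training method | Model hyperparameters | Model parameters |
| Coal blending model 1 | Linear regression | Learning rate = 0.05… | A1 = 10, A2 = 1.5… |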
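As another example, the second level of knowledge is task knowledge, which traditional lifelong learning does not store. It includes task constraints, task attributes, and the like. A task constraint is a configuration item of the task definition algorithm used to partition tasks; for example, the lower limit of the task sample size is the minimum number of samples a task must contain. Task attributes are the data items or feature columns used to define the application scope of a model. Table 3 shows a coal blending task.
Table 3 A coal blending task
As another example, the third level of knowledge is inter-task knowledge, which traditional lifelong learning does not store. It includes the association relationships between multiple tasks and can be used to identify and optimize unknown tasks. Specifically, inter-task knowledge may include a task list and the degree of association between different tasks. The task list is the list of tasks stored in the knowledge base and is the input to task relationship discovery. The degree of association between different tasks may take two forms: a semantic belonging relation (belong relation) and a transferability relation in transfer learning (transfer relation). The semantic belonging relation expresses the degree to which different tasks belong to one another and may be output by the task definition method; Table 4 shows a belonging-degree matrix for coal blending tasks. The transferability relation expresses the degree to which different tasks can be transferred to one another and may be output by the task transfer relationship discovery method; Table 5 shows a transferability matrix for coal blending tasks. As an example, the task transfer relationship discovery method measures the degree of transferability between tasks from features such as task samples, using methods such as similarity. For instance, metadata-based task transfer relationship discovery may be used to extract inter-task knowledge. Its basic principle is as follows: weight the metadata items and the data items separately in the distance function; build a prior clustering using the metadata as constraints; then, on the basis of the prior clustering, use the data to further build a posterior clustering.
Table 4 A belonging-degree matrix for coal blending tasks
Table 5 A transferability matrix for coal blending tasks
As another example, the fourth level of knowledge is the task group, which traditional lifelong learning does not store. It aggregates closely related tasks into groups and can be used to accelerate the unknown-task optimization module. A task group may include: task group constraints, task group attributes, a task list, task group samples, and a task group model. A task group constraint is a configuration item of the task group partitioning algorithm, such as the lower limit of the task group sample size, that is, the minimum number of samples a task group must contain. Task group attributes are the data items or feature columns used to define the application scope of the task group. The task list is the list of tasks stored in the knowledge base and is the input to the task group partitioning algorithm. Task group samples are all the samples in the task group. The task group model is a model trained on all the samples in the task group, or a model built from the knowledge of the multiple tasks in each task group.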
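The knowledge levels described above can be pictured as a small set of data structures. The following is only a minimal sketch: the class names, field names, and the `transferable_from` helper are assumptions introduced for illustration, not structures defined in the application. The sample and model values are taken from Tables 1 and 2, while the task attribute, transferability degree, and group are made up.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """One machine learning task with its intra-task and task-level knowledge."""
    name: str
    samples: list        # intra-task knowledge: sample records
    model: dict          # intra-task knowledge: method, hyperparameters, parameters
    attributes: dict     # task knowledge: defines where the model applies
    constraints: dict    # task knowledge: e.g. minimum sample count

@dataclass
class MultiTaskKnowledge:
    """Hypothetical container for the knowledge levels described above."""
    tasks: dict = field(default_factory=dict)     # task name -> Task
    transfer: dict = field(default_factory=dict)  # (source, target) -> degree
    groups: dict = field(default_factory=dict)    # group name -> set of task names

    def add_task(self, task):
        self.tasks[task.name] = task

    def transferable_from(self, source, min_degree):
        """Tasks that `source` can be transferred to with at least `min_degree`."""
        return sorted(t for (s, t), d in self.transfer.items()
                      if s == source and d >= min_degree)

kb = MultiTaskKnowledge()
kb.add_task(Task(
    name="T1",
    samples=[{"ad": 8.553833, "vdaf": 27.38117, "std": 0.494833,
              "G": 68.4, "CSR": 65.0}],                     # Table 1
    model={"method": "linear regression",
           "hyperparameters": {"learning_rate": 0.05},
           "parameters": {"A1": 10, "A2": 1.5}},            # Table 2
    attributes={"line": "hypothetical production line 1"},  # made-up attribute
    constraints={"min_samples": 5},
))
kb.transfer[("T1", "T2")] = 0.8   # made-up transferability degree
kb.groups["G1"] = {"T1", "T2"}    # made-up task group
```

Keeping task-level and inter-task knowledge next to the samples and model is what lets a search use application scope and task relationships rather than only a model name.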
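The functions of the two submodules, the multi-task knowledge initialization module 211 and the edge-cloud knowledge synchronization module 212, are described in detail below.
The multi-task knowledge initialization module 211 initializes the knowledge base from the input multi-task knowledge and its extractors, stores the multi-task knowledge index table and the multi-task knowledge extractors in the edge knowledge base 220, and after initialization feeds back the initialization status and synchronizes to the cloud knowledge base 230. Specifically, module 211 may include: an intra-task knowledge and index table initialization module 2111, a task knowledge and index table initialization module 2112, and an inter-task knowledge and index table initialization module 2113. Depending on the levels of the multi-task knowledge, module 211 may call one or more of these submodules. For example, if the input multi-task knowledge contains only samples and task attributes but no inter-task knowledge such as task transfer relationships, only the first two submodules (modules 2111 and 2112) need to be called to complete initialization.
The input of the intra-task knowledge and index table initialization module 2111 is multi-task knowledge and its extractors, and its output is intra-task knowledge, an intra-task knowledge index table, and an intra-task knowledge extractor. Specifically, based on this input, module 2111 generates the intra-task knowledge, the intra-task knowledge index table, and the intra-task knowledge extractor, and initializes them into the edge knowledge base 220. For example, if the multi-task knowledge is a sample set and its task attributes, module 2111 uses the task attributes as the index of the sample set and updates this knowledge into the intra-task knowledge index table.
The input of the task knowledge and index table initialization module 2112 is multi-task knowledge and its extractors, and its output is task knowledge, a task knowledge index table, and a task knowledge extractor. Specifically, based on this input, module 2112 generates the task knowledge, the task knowledge index table, and the task knowledge extractor, and initializes them into the edge knowledge base 220. For example, if the multi-task knowledge is a task, including task samples, a task model, and task attributes, module 2112 uses the task attributes as the index of the task and updates this knowledge into the task knowledge index table.
The input of the inter-task knowledge and index table initialization module 2113 is multi-task knowledge and its extractors, and its output is inter-task knowledge, an inter-task knowledge index table, and an inter-task knowledge extractor. Specifically, based on this input, module 2113 generates the inter-task knowledge, the inter-task knowledge index table, and the inter-task knowledge extractor, and initializes them into the edge knowledge base 220. For example, if the multi-task knowledge is the transferability relationship between a task and other tasks together with the task group it belongs to, module 2113 uses the task attributes as the index of the task and updates this knowledge into the inter-task knowledge index table.
The edge-cloud knowledge synchronization module 212 transmits multi-task knowledge bidirectionally between the edge knowledge base 220 and the cloud knowledge base 230 to keep their knowledge synchronized. This avoids the problem that the limited computing resources of the edge knowledge base 220 make it difficult to support training a large number of models. The input of module 212 is multi-task knowledge, and its output is multi-task knowledge and initialization status feedback. Specifically, module 212 initializes the edge knowledge base 220 from the multi-task knowledge in the cloud knowledge base 230 and feeds back the initialization status once initialization is complete. If the multi-task knowledge initialization module 211 was used during initialization, the knowledge that module 211 output to the edge knowledge base 220 also needs to be synchronized to the cloud knowledge base 230. Synchronization includes handling knowledge conflicts, for example according to timestamp order, according to the confidence of the knowledge or the amount of supporting evidence, or manually by the knowledge base administrator.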
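One way to picture the conflict handling mentioned above is the following minimal sketch. It assumes each knowledge entry is a dictionary with hypothetical `timestamp` and `confidence` fields, and it combines two of the listed options into one rule (newer timestamp wins, confidence breaks ties); the application does not prescribe this particular rule or these names.

```python
def pick_winner(edge_entry, cloud_entry):
    """Resolve a conflict: the newer timestamp wins; confidence breaks ties."""
    return max(edge_entry, cloud_entry,
               key=lambda e: (e["timestamp"], e["confidence"]))

def synchronize(edge_kb, cloud_kb):
    """Make both knowledge bases hold the same entry for every key."""
    for key in set(edge_kb) | set(cloud_kb):
        if key in edge_kb and key in cloud_kb:
            winner = pick_winner(edge_kb[key], cloud_kb[key])
        else:
            winner = edge_kb[key] if key in edge_kb else cloud_kb[key]
        edge_kb[key] = cloud_kb[key] = winner

# Made-up entries keyed by knowledge address.
edge = {"root/T1": {"timestamp": 2, "confidence": 0.90, "value": "edge model"}}
cloud = {"root/T1": {"timestamp": 1, "confidence": 0.95, "value": "cloud model"},
         "root/T2": {"timestamp": 1, "confidence": 0.80, "value": "cloud only"}}
synchronize(edge, cloud)
```

After the call, both dictionaries contain the same two entries, and the newer edge version of `root/T1` has replaced the cloud version. A manual resolution step by an administrator, also mentioned in the text, would replace `pick_winner`.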
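The operation phase of the edge knowledge base 220 is described in detail below with reference to FIG. 4. As shown in FIG. 4, the operation phase may involve: the edge knowledge base 220, the cloud knowledge base 230, a knowledge base search module 310, and a candidate knowledge cache 330. Optionally, the operation phase may also involve a knowledge base incremental maintenance module 320. The functions of these modules are described in detail below.
1. Knowledge base search module 310
This module searches the knowledge base (for example, the edge knowledge base 220) according to runtime data and a target knowledge type query command, uses the extractors for the different levels in the knowledge base to extract knowledge and indexes at those levels, and returns them. Specifically, in the knowledge base search module 310, the target knowledge type query command is parsed into a list of query knowledge items; according to this list, the task index table is searched and the results are sorted and returned, and they are output as target knowledge and candidate to-be-filled knowledge. As an example, as shown in FIG. 5, the knowledge base search module 310 contains the following submodules: a knowledge query module 311, a multi-task knowledge and index extraction module 312, and a multi-task index query and sorting module 313. The functions of these submodules are described in detail later with reference to FIG. 5.
2. Knowledge base incremental maintenance module 320
This module processes the knowledge of a task at each level according to whether the new task is known, and incrementally maintains the knowledge base. Specifically, its input is historical to-be-filled knowledge, filled knowledge, and multi-task knowledge, and its output is intra-task knowledge, task knowledge, and inter-task knowledge. In one example, module 320 first integrates the historical to-be-filled knowledge in the candidate knowledge cache 330 with the filled knowledge obtained after labeling and model filling into complete candidate update knowledge, then combines it with the existing multi-task knowledge to obtain an update strategy, and finally updates the knowledge into the knowledge base. For example, as shown in FIG. 5, module 320 may include the following submodules: a knowledge asynchronous integration module 321, a knowledge strategy determination module 322, and a knowledge and index update module 323.
The functions of the submodules contained in the knowledge base search module 310 and the knowledge base incremental maintenance module 320 are described in detail below with reference to the specific example in FIG. 5.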
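1. The knowledge base search module 310 contains the following submodules: a knowledge query module 311, a multi-task knowledge and index extraction module 312, and a multi-task index query and sorting module 313.
The knowledge query module 311 queries related knowledge based on the user's target knowledge type query command and returns the query result. Specifically, module 311 parses the user's target knowledge type query command into a list of query knowledge items used for the knowledge query, outputs the query result to the user as target knowledge, and also places it in the candidate knowledge cache 330 as candidate to-be-filled knowledge. As an example, as shown in FIG. 5, module 311 includes the following submodules: a query command parsing module 3111 and a knowledge feedback module 3112.
The query command parsing module 3111 parses the user's target knowledge type query command into query knowledge items used for the knowledge query. Specifically, its input is the target knowledge type query command, and its output is the list of query knowledge items and the receiver. In one example, the query command may be in a dictionary-like format, such as the query format in Table 6 below, where the query knowledge type may cover knowledge at different levels, such as intra-task, task, and inter-task knowledge. In another example, the user may raise a question about the runtime data, such as "User A wants to know the task knowledge related to runtime dataset 03". The user sends a target knowledge type query command to the system according to the question, and module 3111 parses it into a list of query knowledge items, which may be in list format; for example, when the query knowledge type in the command is task, the list of query knowledge items is {task attributes, task model, task samples}. Module 3111 parses the query command into the specific query knowledge items, such as "samples" and "task", and outputs them to the knowledge feedback module 3112.
Table 6 Query format
| Query command ID | Requester | Receiver address | Query knowledge type | Runtime data(set) ID |
| 01 | User A | 192.162.x.x | Task knowledge | 03 |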
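The parsing step can be sketched as a lookup from the query knowledge type to a list of query knowledge items. This is only an illustration: the dictionary keys and the `parse_query` function are hypothetical names, and only the item list for the "task" type is taken from the text; the entries for the other two types are assumptions.

```python
# Knowledge items per query type. Only the "task" entry comes from the text;
# the "intra-task" and "inter-task" entries are assumptions for illustration.
KNOWLEDGE_ITEMS = {
    "task": ["task attributes", "task model", "task samples"],
    "intra-task": ["samples", "model"],
    "inter-task": ["task relations", "task groups"],
}

def parse_query(command):
    """Turn a dictionary-style query command into query knowledge items."""
    knowledge_type = command["knowledge_type"]
    if knowledge_type not in KNOWLEDGE_ITEMS:
        raise ValueError(f"unknown knowledge type: {knowledge_type}")
    return {
        "receiver": command["receiver_address"],
        "dataset_id": command["runtime_dataset_id"],
        "items": list(KNOWLEDGE_ITEMS[knowledge_type]),
    }

# The query from Table 6, written with hypothetical field names.
command = {
    "command_id": "01",
    "requester": "User A",
    "receiver_address": "192.162.x.x",
    "knowledge_type": "task",
    "runtime_dataset_id": "03",
}
parsed = parse_query(command)
```

The parsed result carries the receiver and dataset ID forward, so the feedback step knows which extraction modules to call and where to send the result.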
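The knowledge feedback module 3112 decides, based on the list of query knowledge items, which knowledge and index extraction modules need to be called, outputs the extracted knowledge to the user as target knowledge, and also places it in the candidate knowledge cache 330 as candidate to-be-filled knowledge. Specifically, its input is the list of query knowledge items and the receiver, and its output is target knowledge and candidate to-be-filled knowledge. In one example, module 3112 calls the corresponding one or more knowledge and index extraction modules according to the list of query knowledge items. For instance, if the user only needs knowledge related to "samples" and "task" and is not concerned with inter-task knowledge, module 3112 only needs to call the intra-task knowledge and index extraction module and the task knowledge and index extraction module, and then returns the extracted search result to the user as target knowledge. The result is also placed in the candidate knowledge cache 330 as partially known candidate to-be-filled knowledge. If no related knowledge is found in the knowledge base, the query is likewise placed in the candidate knowledge cache 330 as unknown candidate to-be-filled knowledge.
The multi-task knowledge and index extraction module 312 searches the knowledge base based on the runtime data and the query knowledge items, and uses the knowledge extractors to extract intra-task, task, and inter-task knowledge and indexes. Specifically, its input is the runtime data, the knowledge extractors, and the query knowledge items, and its output is intra-task, task, and inter-task knowledge and indexes. In one example, as shown in FIG. 5, module 312 may include the following submodules: an intra-task knowledge and index extraction module 3121, a task knowledge and index extraction module 3122, and an inter-task knowledge and index extraction module 3123. Module 312 extracts knowledge at different levels according to the query knowledge items. For example, it may query only intra-task knowledge without querying task knowledge or inter-task knowledge, or it may query all three kinds of knowledge together.
The intra-task knowledge and index extraction module 3121 extracts intra-task knowledge, such as samples, models, and their indexes, based on the runtime data and the intra-task knowledge extractor. Specifically, its input is the runtime data and the intra-task knowledge extractor, and its output is intra-task knowledge and indexes. In one example, if the user wants to know "which samples are related to the runtime data", module 3121 first extracts the task index and then, according to the task index, calls the multi-task index query and sorting module 313 to look up the related samples.
The task knowledge and index extraction module 3122 extracts task-level knowledge, such as task attributes and their indexes, based on the runtime data, the task knowledge extractor, and the intra-task knowledge. Specifically, its input is the runtime data, the task knowledge extractor, and the intra-task knowledge, and its output is task-level knowledge and indexes. In one example, if the user wants to know "whether the runtime data belongs to a known task", module 3122 first extracts the task index and then, according to the task index, calls the multi-task index query and sorting module 313 to look up whether a related task exists, together with its task attributes, task model, task samples, and so on.
The inter-task knowledge and index extraction module 3123 extracts inter-task knowledge, such as task relationships, task groups, and their indexes, based on the runtime data, the inter-task knowledge extractor, and the task knowledge. Specifically, its input is the runtime data, the inter-task knowledge extractor, and the task knowledge, and its output is inter-task knowledge. In one example, if the user wants to know "which tasks are related to the task corresponding to the runtime data", module 3123 first extracts the task index and then, according to the task knowledge and the task index, calls the multi-task index query and sorting module 313 to look up whether related tasks exist, together with their task relationships, task groups, and so on.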
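The multi-task index query and sorting module 313, which may also be called the related knowledge lookup module, looks up knowledge at each level in the knowledge base using the indexes extracted by the knowledge and index extraction modules and the task index table in the knowledge base, and outputs the lookup results sorted by similarity. Specifically, its input is the indexes and the task index table, and its output is the lookup results sorted by similarity. For example, the similarity of tasks can be judged from several angles: task samples, by judging the similarity of the distributions of two sample sets; task attributes, by judging the similarity between two task attribute rules; task transfer relationships, by the degree of transferability between two tasks; and task groups, by the tasks that the same task group contains. In one example, module 313 may include the following submodules: a comparison module 3132, a knowledge similarity measurement module 3133, and a related knowledge filtering module 3134. The functions of these submodules are described in detail below.
The comparison module 3132 performs comparison and adaptation according to the type of knowledge to be looked up and selects an appropriate knowledge similarity measurement method. Specifically, its input is the type of knowledge to be searched, and its output is the similarity measurement method. In one example, there may be many types of knowledge to look up, such as samples, models, tasks, and task relationships, and the method for measuring similarity differs for each type. Module 3132 selects an appropriate method according to the type of knowledge to be looked up. For instance, if the knowledge currently being looked up is "related tasks", the adapted similarity measurement method is "compare task attributes using decision trees".
The knowledge similarity measurement module 3133 measures knowledge similarity using the method selected by the comparison module 3132. Specifically, its input is the knowledge similarity measurement method, and its output is a knowledge similarity list. In one example, module 3133 reads the existing knowledge in the knowledge base, traverses it, and measures the similarity between the existing knowledge and the new knowledge using the method selected by module 3132. It then sorts all results in descending order of similarity and outputs the knowledge similarity list.
The related knowledge filtering module 3134 filters out the one or more most relevant pieces of knowledge from the knowledge similarity list and outputs them. Specifically, its input is the knowledge similarity list, and its output is a related knowledge list. In one example, module 3134 filters out and outputs one or more pieces of related knowledge according to the knowledge similarity list obtained by module 3133, the knowledge similarity threshold, and the user's query command. For instance, if the knowledge similarity list is {task 1 distance: 0.3, task 2 distance: 0.35, task 3 distance: 0.6}, the distance threshold is 0.4, and the user wants to query all related tasks, module 3134 outputs the related knowledge list [task 1, task 2].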
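The filtering example above can be reproduced in a few lines. This is a minimal sketch: the function name `filter_related` is hypothetical, and treating a distance equal to the threshold as "related" is an assumption, since the text does not say whether the comparison is strict.

```python
def filter_related(distances, threshold):
    """Return task names whose distance is within the threshold, closest first."""
    ranked = sorted(distances.items(), key=lambda item: item[1])
    return [name for name, distance in ranked if distance <= threshold]

# The knowledge similarity list and threshold from the example in the text.
distances = {"task 1": 0.3, "task 2": 0.35, "task 3": 0.6}
related = filter_related(distances, threshold=0.4)
print(related)  # ['task 1', 'task 2']
```

Because the list holds distances, sorting in ascending order of distance corresponds to the descending order of similarity described for module 3133.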
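2. The knowledge base incremental maintenance module 320 may include the following submodules: a knowledge asynchronous integration module 321, a knowledge strategy determination module 322, and a knowledge and index update module 323.
The knowledge asynchronous integration module 321 asynchronously integrates the historical to-be-filled knowledge in the candidate knowledge cache 330 with the filled knowledge obtained after labeling and model filling, to obtain complete candidate update knowledge. Specifically, its input is historical to-be-filled knowledge and filled knowledge, and its output is candidate update knowledge. In one example, runtime data is sometimes incomplete, and some samples lack labels. Such data cannot be updated into the knowledge base directly as complete knowledge; it must first be placed in the candidate knowledge cache 330 to wait for the true labels to arrive, after which module 321 integrates it into labeled samples that serve as candidate update knowledge.
The knowledge strategy determination module 322 determines, based on the candidate update knowledge, the strategy needed to incrementally update the related task knowledge in the knowledge base, improving the coverage and precision of the knowledge base. Besides the traditional samples and models, the related knowledge also includes task attributes, inter-task relationships, and task groups. Specifically, its input is the candidate update knowledge, and its output is the update strategy. In one example, module 322 needs to decide, according to the candidate update knowledge, which of the following information is to be updated into the knowledge base: task attributes, samples, models, inter-task relationships, and task groups.
For example, module 322 may determine different update strategies based on the candidate update knowledge: it determines the type of the candidate update knowledge according to task attributes, task models, and the like; and it uses methods such as task partitioning, task transfer relationship discovery, and even task group mining to determine how the samples, models, tasks, inter-task relationships, and task groups of the candidate update knowledge will be updated into the knowledge base. Here, a known task is a task stored in the knowledge base, and candidate update knowledge is knowledge that is about to be updated into the knowledge base.
It should be understood that the task partitioning method assigns given samples to different tasks and outputs task attributes and belonging relationships between tasks. Task partitioning may be implemented in ways that include, but are not limited to, the following: the user manually specifies the task attribute items and inputs them into the system at runtime; an expert manually specifies the task attribute items and fixes them in the system in advance; or an expert manually specifies the task attributes of some task samples as labels and a task classifier is trained, which takes a sample as input and outputs the task attributes to which the sample belongs. The task transfer relationship discovery method measures the degree of transferability between tasks from features such as task samples, using methods such as similarity, and outputs the transfer relationships between tasks. The task group mining method, based on the task transfer relationships, uses methods such as clustering to place similar tasks in the same group and outputs the task groups, where the same task may be assigned to multiple groups.
The candidate update knowledge can be judged along the following dimensions and their combinations: 1. Distinguish differences in task attributes (new versus old), that is, determine whether the target task exists in the knowledge base. The basic idea for this degree of unknownness is to use a similarity measure to judge whether the target task (attributes) is similar to known task attributes. 2. Distinguish differences in task models (hard versus easy), that is, determine whether the target task can be accurately inferred by a model in the knowledge base. The basic idea for this degree of unknownness is to use model confidence, model transferability, or another model quality measure to judge whether the target task model is similar to a known task model. For example, the higher the model confidence, the smaller the expected inference error of the model on test samples. As another example, the higher the model transferability, the more likely the model can be transferred to the target task; one implementation of model transferability is the similarity of task samples. As another example, other model quality measures may include the number of samples used to train the model (statistically, a larger sample size means higher credibility) and the stability of the model when tested on diverse datasets (for instance, the model performs consistently when tested on multiple different datasets).
More specifically, for the three kinds of measures above, from the perspective of the hierarchy of training sets, each measure such as confidence can be further divided into three levels: sample level, single-task level, and multi-task level. For example, confidence can be subdivided into sample-level model confidence, single-task-level model confidence, and multi-task-level model confidence; other model quality measures can be subdivided similarly. 1. Sample-level model: given a target task, the training set of a sample-level model is a subset of that task's dataset. 2. Single-task-level model: given a target task, the training set of a single-task-level model is the full dataset of that task. 3. Multi-task-level model: given a target task, the training set of a multi-task-level model comes from the datasets of multiple tasks.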
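For example, the knowledge strategy determination module 322 may include the following submodules: an adaptation module 3221, a knowledge difference comparison module 3222, and an update decision module 3223. The functions of these submodules are described in detail below.
The adaptation module 3221 selects a suitable comparison method according to the type of the candidate update knowledge. Specifically, its input is the candidate update knowledge, and its output is the comparison method. In one example, candidate update knowledge may exist in several forms, such as samples, models, task attributes, and inter-task relationships, and in several formats, such as numeric, categorical, tensor, and rule formats. Module 3221 needs to select different comparison methods according to the combination of types of the candidate update knowledge.
The knowledge difference comparison module 3222 compares the candidate update knowledge using the comparison method selected by the adaptation module 3221. Specifically, its input is the comparison method, and its output is the comparison result. In one example, module 3222 needs to judge the degree of similarity and difference between knowledge newly obtained at runtime and existing knowledge. The comparison can be made in several respects, such as sample distribution, model accuracy, and task attributes, finally yielding results for the different respects. For instance, the comparison result might be "sample distribution: same; model accuracy: similar; task attributes: different".
The update decision module 3223 outputs the corresponding update strategy according to the comparison result of the knowledge difference comparison module 3222. Specifically, its input is the comparison result, and its output is the update strategy. In one example, module 3223 outputs the update strategy that corresponds to the result of module 3222's comparison of new knowledge with existing knowledge. For instance, if the comparison result of module 3222 is "task application scope: same; model accuracy: different", module 3223 outputs the corresponding update strategy, such as "knowledge remodeling".
It should be understood that, besides the traditional support for updating samples and models, the update methods for task attributes, task relationships, and task groups are themselves incremental. That is, when an unknown task arrives, the knowledge base does not need to relearn all known and unknown knowledge from scratch, but only updates the part of the knowledge affected by the unknown knowledge. As shown in Table 7, incremental update methods include but are not limited to one or more of the following: 1. Knowledge reuse: reuse the task attributes, model, and samples; reuse the inter-task relationships and task groups (optional). 2. Knowledge accumulation: update the task attributes, model, and samples; update the inter-task relationships and task groups (optional). 3. Knowledge merging: update the task attributes; reuse the inter-task relationships, task groups (optional), task samples, and model. 4. Knowledge remodeling: update the task model and samples, inter-task relationships, and task groups (optional).
Table 7 Incremental update methods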
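The decision step can be pictured as a lookup from comparison results to one of the four incremental update methods. The sketch below is an assumption-heavy illustration: only the pairing "task attributes similar, model different gives knowledge remodeling" is supported by the examples in the text, and the other three pairings, along with the names `STRATEGIES` and `choose_strategy`, are hypothetical placeholders, since the contents of Table 7 are not reproduced here.

```python
# (task attributes similar?, task model similar?) -> update method.
# Only the (True, False) entry follows the examples in the text;
# the remaining three entries are assumptions for illustration.
STRATEGIES = {
    (True, True): "knowledge reuse",
    (True, False): "knowledge remodeling",
    (False, True): "knowledge merging",
    (False, False): "knowledge accumulation",
}

def choose_strategy(attributes_similar, model_similar):
    """Pick an incremental update method from two comparison results."""
    return STRATEGIES[(attributes_similar, model_similar)]

# Same application scope but different model accuracy.
strategy = choose_strategy(attributes_similar=True, model_similar=False)
```

A fixed table like this corresponds to the "expert specifies the strategy in advance" option; a trained classifier or a user-supplied rule would replace the lookup.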
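It should also be understood that the strategy for choosing among the knowledge update methods can, in addition to the traditional samples and models, take into account newly added task knowledge such as target task attributes, inter-task relationships, and task groups (optional). For example: the user manually specifies the selection strategy at runtime; an expert manually specifies the selection strategy, which is then fixed in the system; a classifier is trained that outputs the selected strategy based on the target task attributes (which determine the model's application scope) and on the inter-task relationships and task groups (which determine the degree of model matching); the model's application scope is determined from the target task attributes, and the degree of model matching is determined from the inter-task relationships and task groups (as shown in Table 8); other methods that combine newly added task knowledge such as target task attributes, inter-task relationships, and task groups (optional) may also exist.
Table 8 Degree of model matching
The knowledge and index update module 323 updates the intra-task, task, and inter-task knowledge and indexes according to the update strategy obtained by the knowledge strategy determination module. Specifically, its input is the update strategy, and its output is intra-task knowledge, task knowledge, and inter-task knowledge. In one example, module 323 may include the following submodules: an intra-task knowledge and index update module 3231, a task knowledge and index update module 3232, and an inter-task knowledge and index update module 3233. The functions of these submodules are described in detail below.
The intra-task knowledge and index update module 3231 updates the intra-task knowledge and indexes according to the update strategy obtained by the knowledge strategy determination module. Specifically, its input is the update strategy, and its output is intra-task knowledge. In one example, if some sample sets in the runtime data are determined to belong to an existing task in the knowledge base, module 3231 updates the sample sets into the knowledge base as intra-task knowledge according to the update strategy.
The task knowledge and index update module 3232 updates the task knowledge and indexes according to the update strategy obtained by the knowledge strategy determination module. Specifically, its input is the update strategy, and its output is task knowledge. In one example, if some sample sets in the runtime data are determined to belong to an existing task in the knowledge base, and the task attributes, task constraints, and the like of the existing task change after the new samples are added, module 3232 updates the new task attributes, task constraints, and the like into the knowledge base as task knowledge according to the update strategy.
The inter-task knowledge and index update module 3233 updates the inter-task knowledge and indexes according to the update strategy obtained by the knowledge strategy determination module. Specifically, its input is the update strategy, and its output is inter-task knowledge. In one example, if some sample sets in the runtime data are determined to belong to an existing task in the knowledge base, and the task transfer relationships, task group membership, and the like of the existing task change after the new samples are added, module 3233 updates the new task transfer relationships, task group membership, and the like into the knowledge base as inter-task knowledge according to the update strategy.
It should be understood that, as shown in FIG. 5, after the multi-task knowledge in the edge knowledge base 220 is updated, the multi-task knowledge can also be synchronized between the edge knowledge base 220 and the cloud knowledge base 230. For example, the edge-cloud knowledge synchronization module can transmit the multi-task knowledge bidirectionally between the edge knowledge base 220 and the cloud knowledge base 230 to keep their knowledge synchronized.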
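The operation phase of the knowledge base is described in detail below with a specific example. It should be understood that the following example is only intended to help a person skilled in the art understand the embodiments of the present application, and is not intended to limit the embodiments to the specific values or scenarios illustrated. A person skilled in the art can obviously make various equivalent modifications or changes based on the example given below, and such modifications and changes also fall within the scope of the embodiments of the present application.
A coal blending quality prediction system is a complex system. A factory hopes to improve coal blending quality through machine learning and therefore needs a knowledge base that contains different machine learning models for predicting coal blending quality under different coal type ratio parameters. The input of these machine learning models is the different coal type ratio parameters, and the output is the different predicted coal blending quality values.
Before use, the user can perform custom configuration according to the given required and optional parameters, for example, the custom configuration shown in FIG. 6 to FIG. 9. Taking the coal blending quality prediction system as an example, the functions of the modules in the initialization phase and the operation phase of the knowledge base are described in detail below.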
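1. Initialization phase of the knowledge base
The intra-task knowledge and index table initialization module 2111 in the multi-task knowledge initialization module 211 receives the coal blending multi-task knowledge and its extractors, extracts the intra-task knowledge, and builds the index table, obtaining the results shown in Tables 9 to 12 below.
Table 9 Intra-task sample knowledge
| Intra-task sample address | Sample |
| root/T1/S1 | Sample S1-1 |
| root/T1/S1 | Sample S1-2 |
| root/T2/S2 | Sample S2-1 |
Table 10 Intra-task model knowledge
| Intra-task model address | Model |
| root/T1/M1 | Model T1-M1 |
| root/T1/M2 | Model T1-M2 |
| root/T2/M1 | Model T2-M1 |
Table 11 Other intra-task knowledge
| Other intra-task knowledge address | Other intra-task knowledge |
| root/T1/O1 | Model accuracy = 0.98... |
| root/T1/O2 | Model accuracy = 0.95... |
| root/T2/O1 | Model accuracy = 0.96... |
Table 12 Intra-task knowledge index table
After completing the extraction of intra-task knowledge, the intra-task knowledge and index table initialization module 2111 serializes the knowledge with the pickle package, saves it in the knowledge base as intra-task knowledge, and passes control to the task knowledge and index table initialization module 2112.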
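Serializing knowledge with Python's standard `pickle` module, as mentioned above, can look like the following minimal sketch. The function names and the in-memory dictionary layout are assumptions for illustration; the addresses and model names come from Table 10.

```python
import pickle

def save_knowledge(knowledge):
    """Serialize a knowledge object to bytes for storage in the knowledge base."""
    return pickle.dumps(knowledge, protocol=pickle.HIGHEST_PROTOCOL)

def load_knowledge(blob):
    """Restore a knowledge object from its serialized bytes."""
    return pickle.loads(blob)

# Intra-task model knowledge from Table 10, keyed by address.
intra_task_models = {
    "root/T1/M1": "model T1-M1",
    "root/T1/M2": "model T1-M2",
    "root/T2/M1": "model T2-M1",
}
blob = save_knowledge(intra_task_models)
restored = load_knowledge(blob)
```

Note that `pickle.loads` can execute arbitrary code while unpickling, so serialized knowledge exchanged between edge and cloud should only be loaded from trusted sources.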
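The task knowledge and index table initialization module 2112 receives the multi-task knowledge and its extractors together with the already extracted intra-task knowledge, extracts the task knowledge, and builds the index table, obtaining the results shown in Table 13 below.
Table 13 Task knowledge and its index table
| Task index | Task address | Task attributes | Other attributes |
| 01 | root/T1 | Decision tree CLF1 | Sample lower limit = 5... |
| 02 | root/T2 | Decision tree CLF2 | Sample lower limit = 5... |
| 03 | root/T3 | Decision tree CLF3 | Sample lower limit = 5... |
After completing the extraction of task knowledge, the task knowledge and index table initialization module 2112 serializes the knowledge with the pickle package, saves it in the knowledge base as task knowledge, and passes control to the inter-task knowledge and index table initialization module 2113.
The inter-task knowledge and index table initialization module 2113 receives the multi-task knowledge and its extractors together with the already extracted task knowledge, extracts the inter-task knowledge, and builds the index table, obtaining the results shown in Tables 4 and 5 above. It should be understood that the relationships among multiple tasks in the same scenario are usually managed together by a single task correlation table; the task correlation index table is shown in Table 14.
Table 14 Task correlation index table
| Task correlation table index | Task address |
| TR01 | root/TR1 |
| TR02 | root/TR2 |
| TR03 | root/TR3 |
After completing the extraction of inter-task knowledge, the inter-task knowledge and index table initialization module 2113 serializes the knowledge with the pickle package, saves it in the knowledge base as inter-task knowledge, and exits the multi-task knowledge initialization module 211, completing the initialization of the multi-task knowledge.
After the initialization of the multi-task knowledge is complete, the edge-cloud knowledge synchronization module 212 reads the knowledge from the knowledge base (the edge knowledge base 220) and synchronizes it to the cloud knowledge base 230.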
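2. Operation phase of the knowledge base
After receiving a target knowledge type query command such as the one shown in Table 6 above, the query command parsing module 3111 determines that User A needs to look up the tasks in the knowledge base related to dataset 03, and that the receiving address is 192.162.10.12. Module 3111 can pass this information to the knowledge feedback module 3112 as query knowledge items.
The query knowledge items received by the knowledge feedback module 3112 are shown in Table 15 below. Because the query knowledge type is task knowledge, module 3112 reads dataset 03, enters the intra-task knowledge and index extraction module 3121 to obtain the corresponding task knowledge, and outputs it as the target knowledge.
Table 15 Query knowledge items
| Receiver address | Query knowledge type | Runtime data(set) ID |
| 192.162.10.12 | Task knowledge | 03 |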
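The comparison module 3132 receives task T4 and its task attribute, decision tree CLF4, and, according to the preset adaptation rules, outputs the knowledge similarity comparison method shown in Table 16 below, then passes control to the knowledge similarity measurement module 3133.
Table 16 Knowledge similarity comparison method
| Search content | Task attributes | Knowledge similarity comparison method |
| Related tasks | Decision tree CLF4 | For example: compare task similarity using decision trees |
The above task similarity comparison method in fact covers several new, previously unproposed methods, such as the training sample comparison method, the task model comparison method, and the application scope comparison method. The principle of the application scope comparison method is that task models are judged to be similar when their application scopes are similar. It can be applied to different base models and, of the three methods above, has the highest generality, precision, and difficulty. One implementation of the application scope comparison method is the decision tree method.
It should be understood that a decision tree can be built for each task model to determine whether the task model is usable. For example, the whole dataset is divided into several tasks, and each task has its own linear regression model. For a given task, the task's linear regression model is used to predict on the whole dataset, and the predictions are compared with the true labels; if a prediction is correct, "the model is accepted by the sample" and the sample is assigned the value 1, otherwise 0. The whole training set is then concatenated with these 0/1 values and used as the input to the decision tree. The decision tree splits the samples according to its node splitting rules and parameter settings so that the purity of each node (measured, for example, by entropy or Gini) is as high as possible.
The principle of decision tree similarity comparison is that when the decision trees of different tasks are similar, the application scopes of the task models are similar, and the tasks are therefore considered similar. For example, each decision tree can be extracted into a set of rules, where each rule is a pair: rule = (condition, conclusion), such as (condition: x[0] <= 28.818 and x[2] <= 64.65, conclusion: gini = 0 ...). All the conditions of the two decision trees (the regions in the figure) are extracted and aligned so that the conclusions of the two trees can be compared under the same conditions; the comparisons under the different conditions are then combined to obtain the final similarity.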
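The align-then-compare idea can be sketched on a deliberately simplified case. The sketch assumes a single feature, so each tree reduces to a list of (upper bound, conclusion) rules; it aligns the two rule sets on the union of their thresholds and reports the fraction of aligned intervals where the conclusions agree. The unweighted fraction, the function names, and the example trees are all assumptions for illustration (the thresholds 28.818 and 64.65 are borrowed from the text purely as sample numbers); the application does not specify how conclusions are combined.

```python
import math

def conclusion_at(rules, x):
    """Conclusion of the first rule whose upper bound covers x.

    `rules` is a list of (upper_bound, conclusion) pairs sorted by bound,
    with math.inf as the last bound so every x is covered.
    """
    for upper, conclusion in rules:
        if x <= upper:
            return conclusion
    raise ValueError("rules do not cover x")

def tree_similarity(rules_a, rules_b):
    """Fraction of aligned intervals on which both trees agree."""
    # Aligning on the union of thresholds makes both trees constant
    # within every interval (previous bound, bound].
    bounds = sorted({u for u, _ in rules_a} | {u for u, _ in rules_b})
    agree = sum(conclusion_at(rules_a, b) == conclusion_at(rules_b, b)
                for b in bounds)
    return agree / len(bounds)

# Conclusion 1 means "the task model is accepted in this region", 0 otherwise.
tree_t4 = [(28.818, 1), (64.65, 0), (math.inf, 1)]   # hypothetical tree
tree_t9 = [(30.0, 1), (64.65, 0), (math.inf, 0)]     # hypothetical tree
similarity = tree_similarity(tree_t4, tree_t9)
```

For these two made-up trees the aligned intervals end at 28.818, 30.0, 64.65, and infinity; the trees agree on the first and third, so the similarity is 0.5. A fuller version would handle multiple features and might weight each region by the number of samples it contains.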
知识相似度衡量模块3133到知识相似度比较方法为“利用决策树比较任务相似性”,从知识库中读取现有知识,通过相似性衡量算法得到如下表17所示的知识相似度列表,进入相关知识筛选模块3134。
表17 知识相似度列表
| 任务索引 | 任务名 | 距离 |
| 05 | 任务T5 | 0.59 |
| 06 | 任务T6 | 0.42 |
| 07 | 任务T7 | 0.43 |
| 08 | 任务T8 | 0.51 |
| 09 | 任务T9 | 0.31 |
| 10 | 任务T10 | 0.36 |
相关知识筛选模块3134接收到表17所示的知识相似度列表后,根据设定好的距离阈值为0.35筛选出如表18所示的相似任务列表,将其作为目标知识返回。
表18 筛选出的相似任务列表
| 任务索引 | 任务名 | 距离 |
| 09 | 任务T9 | 0.31 |
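After the user command is completed, the runtime data and its related knowledge enter the candidate knowledge cache 330 as candidate knowledge to be filled, where they wait for labels that have not yet arrived. When the CSR labels arrive, the historical knowledge to be filled is read from the candidate knowledge cache 330 and, together with the external filled knowledge, enters the asynchronous knowledge integration module 321, which integrates them into the complete candidate update knowledge shown in Table 19 below. Control then passes to the knowledge policy determination module 322.
Table 19 Candidate update knowledge
| Data ID | Ash_x | Volatile matter_x | Sulfur_x | G value_x | CSR (filled) |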
| 01 | 9.77 | 27.77 | 0.98 | 69.75 | 64.2 |
| 02 | 10.04 | 29.15 | 1.04 | 71.75 | 65.1 |
| 03 | 10.17 | 28.92 | 1.09 | 71 | 64.8 |
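The asynchronous integration of cached samples with late-arriving labels can be sketched as follows. The feature values come from Table 19; the cache layout, key names, and the decision to ignore labels for uncached data are assumptions of this sketch.

```python
# Candidate knowledge cache: data ID -> features still waiting for a CSR label.
candidate_cache = {
    "01": {"ash": 9.77, "volatile": 27.77, "sulfur": 0.98, "g_value": 69.75},
    "02": {"ash": 10.04, "volatile": 29.15, "sulfur": 1.04, "g_value": 71.75},
    "03": {"ash": 10.17, "volatile": 28.92, "sulfur": 1.09, "g_value": 71.0},
}

def integrate(cache, arrived_labels):
    """Merge arrived CSR labels into cached samples and remove them from the cache."""
    completed = []
    for data_id, csr in arrived_labels.items():
        features = cache.pop(data_id, None)
        if features is None:
            continue  # label for data that was never cached; ignored in this sketch
        completed.append({"id": data_id, **features, "csr": csr})
    return completed

# Labels for samples 01 and 02 arrive first; sample 03 keeps waiting in the cache.
updates = integrate(candidate_cache, {"01": 64.2, "02": 65.1})
```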
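The adaptation module 3221 in the knowledge policy determination module 322 receives the candidate update knowledge and its related knowledge. According to preset adaptation rules, it outputs the knowledge similarity-and-difference comparison method shown in Table 20 below, and control passes to the knowledge similarity-and-difference comparison module 3222.
Table 20 Knowledge similarity-and-difference comparison method
| Candidate update knowledge type | Task attribute | Knowledge similarity-and-difference comparison method |
| Task | Decision tree CLF4 | Compare task attributes and models |
After receiving the knowledge similarity-and-difference comparison method shown in Table 20, the knowledge similarity-and-difference comparison module 3222 compares the candidate update knowledge with its related knowledge, outputs the comparison result shown in Table 21 below, and passes control to the update decision module 3223.
Table 21 Knowledge similarity-and-difference comparison result
| Task index | Task attribute | Task model |
| 09 | Similar | Different |
After receiving the comparison result shown in Table 21, the update decision module 3223 outputs, according to preset update rules, the update methods shown in Tables 22-23 below.
Table 22 Knowledge update method
Table 23 Knowledge update method
According to the update policy, the task knowledge and index update module 3232 decides to retrain on the new task together with the original task 09: it reuses the task attribute of task 09, merges the samples, and generates a new model, a new task index, and other knowledge. The newly generated task then replaces the original task 09 in the knowledge base, which completes the update of the task knowledge and index.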
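The retrain-and-replace update can be sketched as follows. This is a deliberately simplified illustration: the knowledge base layout, the existing sample of task 09, and the mean-value trainer are hypothetical stand-ins, and the entry keeps its key "09" instead of receiving a newly generated task index. Only the new samples are taken from Table 19.

```python
def mean_trainer(samples):
    """Placeholder trainer that predicts the mean label; a real system would fit a model."""
    mean = sum(label for _, label in samples) / len(samples)
    return lambda features: mean

def retrain_and_replace(knowledge_base, task_index, new_samples, train):
    """Merge new samples into an existing task, retrain, and replace the old entry."""
    old = knowledge_base[task_index]
    merged = old["samples"] + new_samples
    knowledge_base[task_index] = {
        "attribute": old["attribute"],  # the task attribute is reused unchanged
        "samples": merged,
        "model": train(merged),
    }
    return knowledge_base[task_index]

# Hypothetical existing entry for task 09: (features, CSR label) pairs.
kb = {"09": {"attribute": "decision tree", "samples": [([10.0, 28.0, 1.0, 70.0], 66.0)],
             "model": None}}
new_samples = [
    ([9.77, 27.77, 0.98, 69.75], 64.2),
    ([10.04, 29.15, 1.04, 71.75], 65.1),
    ([10.17, 28.92, 1.09, 71.0], 64.8),
]
updated = retrain_and_replace(kb, "09", new_samples, mean_trainer)
```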
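It should be understood that, in the various embodiments of this application, the sequence numbers of the foregoing processes do not imply an order of execution. The execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application.
The methods provided in the embodiments of this application are described in detail above with reference to FIG. 1 to FIG. 9. The apparatus embodiments of this application are described in detail below with reference to FIG. 10 and FIG. 11. It should be understood that the descriptions of the method embodiments correspond to the descriptions of the apparatus embodiments; for parts not described in detail, refer to the foregoing method embodiments.
FIG. 10 is a schematic block diagram of an apparatus 1000 for acquiring knowledge according to an embodiment of this application. The apparatus may be implemented as part or all of a device by software, hardware, or a combination of the two. The apparatus provided in this embodiment can implement the method procedure shown in FIG. 1 of the embodiments of this application. The apparatus 1000 for acquiring knowledge includes an acquiring module 1010 and a display module 1020, where:
the acquiring module 1010 is configured to acquire one or more pieces of first knowledge from a knowledge base according to a parameter, where the parameter includes any one or a combination of the following: knowledge within a machine learning task, an attribute of the machine learning task, and knowledge between multiple machine learning tasks; and
the display module 1020 is configured to provide the one or more pieces of first knowledge to a user.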
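Optionally, the acquiring module 1010 is further configured to acquire the parameter input by the user, or to acquire the parameter from another system.
Optionally, the knowledge within the machine learning task includes a sample set and a model of the machine learning task, where the model is obtained by training on the sample set; or the attribute of the machine learning task includes a constraint and an application scope of the machine learning task; or the knowledge between the multiple machine learning tasks includes an association relationship between the multiple machine learning tasks.
Optionally, the acquiring module 1010 is further configured to acquire, from the knowledge base, second knowledge related to the first knowledge; and the display module 1020 is further configured to provide the second knowledge to the user.
Optionally, the display module 1020 is further configured to provide configuration information of the first knowledge to the user.
Optionally, the acquiring module 1010 is further configured to acquire target knowledge selected by the user, where the target knowledge is the first knowledge and/or the second knowledge.
Optionally, the target knowledge is used in any one of the following scenarios:
object recognition in intelligent driving;
person recognition in intelligent driving;
a developer platform;
an artificial intelligence marketplace platform;
an Internet of Things marketplace platform;
a solution marketplace platform.
Optionally, the apparatus 1000 further includes a synchronization module 1030, configured so that an edge device synchronizes the knowledge in the knowledge base to a cloud device, or the cloud device synchronizes the knowledge in the knowledge base to the edge device.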
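It should be noted that, when the apparatus for acquiring knowledge provided in the foregoing embodiment acquires knowledge, the division into the foregoing functional modules is used only as an example. In practice, the functions may be assigned to different functional modules as required; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or some of the functions described above. In addition, the apparatus for acquiring knowledge provided in the foregoing embodiment and the method embodiments for acquiring knowledge belong to the same concept. For the specific implementation process, see the method embodiments above; details are not repeated here.
FIG. 11 is a schematic block diagram of a device 1100 for acquiring knowledge according to an embodiment of this application. The device 1100 includes the apparatus 1000 for acquiring knowledge and can perform the steps of the method shown in FIG. 1; to avoid repetition, details are not described here again. The device 1100 includes a memory 1110, a processor 1120, and an input/output interface 1130.
The processor 1120 may be communicatively connected to the input/output interface 1130. The memory 1110 may be used to store program code and data of the device 1100. Accordingly, the memory 1110 may be a storage unit inside the processor 1120, an external storage unit independent of the processor 1120, or a component that includes both a storage unit inside the processor 1120 and an external storage unit independent of the processor 1120.
Optionally, the device 1100 may further include a bus 1140. The memory 1110 and the input/output interface 1130 may be connected to the processor 1120 through the bus 1140. The bus 1140 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus 1140 may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 11, but this does not mean that there is only one bus or one type of bus.
For example, the processor 1120 may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various example logical blocks, modules, and circuits described with reference to the disclosure of this application. The processor may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The input/output interface 1130 may be a circuit that includes an antenna, a transmitter chain, and a receiver chain; the transmitter chain and the receiver chain may be separate circuits or the same circuit.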
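When the program code and data of the device 1100 stored in the memory 1110 are executed, in a possible implementation, the processor 1120 is configured to perform the following operations:
acquiring one or more pieces of first knowledge from a knowledge base according to a parameter, where the parameter includes any one or a combination of the following: knowledge within a machine learning task, an attribute of the machine learning task, and knowledge between multiple machine learning tasks; and providing the one or more pieces of first knowledge to a user.
Optionally, the processor 1120 is further configured to acquire the parameter input by the user, or to acquire the parameter from another system.
Optionally, the knowledge within the machine learning task includes a sample set and a model of the machine learning task, where the model is obtained by training on the sample set; or the attribute of the machine learning task includes a constraint and an application scope of the machine learning task; or the knowledge between the multiple machine learning tasks includes an association relationship between the multiple machine learning tasks.
Optionally, the processor 1120 is further configured to acquire, from the knowledge base, second knowledge related to the first knowledge, and to provide the second knowledge to the user.
Optionally, the processor 1120 is further configured to: determine a corresponding knowledge similarity comparison method according to the first knowledge; obtain a similar knowledge list from the task knowledge base according to the knowledge similarity comparison method; and determine the second knowledge from the similar knowledge list according to a similarity threshold.
Optionally, the processor 1120 is further configured to provide configuration information of the first knowledge to the user.
Optionally, the processor 1120 is further configured to acquire target knowledge selected by the user, where the target knowledge is the first knowledge and/or the second knowledge.
Optionally, the target knowledge is used in any one of the following scenarios:
object recognition in intelligent driving;
person recognition in intelligent driving;
a developer platform;
an artificial intelligence marketplace platform;
an Internet of Things marketplace platform;
a solution marketplace platform.
Optionally, the processor 1120 is further configured to update the task knowledge base according to the first knowledge and the second knowledge.
Optionally, the processor 1120 is specifically configured to: determine a knowledge similarity-and-difference comparison method according to the first knowledge and the second knowledge; obtain a knowledge similarity-and-difference comparison result according to that method, where the result is the comparison result between the first knowledge and the second knowledge; and, according to the comparison result and an update rule, update any one or a combination of the following kinds of knowledge in the task knowledge base: the knowledge within the machine learning task, the attribute of the machine learning task, and the knowledge between the multiple machine learning tasks.
Optionally, the processor 1120 is further configured so that an edge device synchronizes the knowledge in the knowledge base to a cloud device, or the cloud device synchronizes the knowledge in the knowledge base to the edge device.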
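The modules of the examples described above can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the particular application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such an implementation should not be considered to go beyond the scope of this application.
An embodiment of this application further provides a chip. The chip obtains instructions and executes them to implement the foregoing method for acquiring knowledge, or the instructions are used to implement the foregoing device for acquiring knowledge.
Optionally, in an implementation, the chip includes a processor and a data interface. The processor reads, through the data interface, instructions stored in a memory and performs the foregoing method for acquiring knowledge.
Optionally, in an implementation, the chip may further include a memory that stores instructions. The processor is configured to execute the instructions stored in the memory, and when the instructions are executed, the processor performs the foregoing method for acquiring knowledge.
An embodiment of this application further provides a computer-readable storage medium that stores instructions. The instructions are used to implement the method for acquiring knowledge in the foregoing method embodiments, or the instructions are used to implement the foregoing device for acquiring knowledge.
An embodiment of this application further provides a computer program product containing instructions. The instructions are used to implement the method for acquiring knowledge in the foregoing method embodiments, or the instructions are used to implement the foregoing device for acquiring knowledge.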
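In an implementation example, the processor may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
In an implementation example, the memory may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
The term "and/or" in this document merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate three cases: A exists alone, both A and B exist, or B exists alone, where A and B may each be singular or plural. In addition, the character "/" in this document generally indicates an "or" relationship between the associated objects, but may also indicate an "and/or" relationship; the specific meaning can be understood from the context.
In this application, "multiple" means two or more. "At least one of the following items" or a similar expression means any combination of these items, including a single item or any combination of several items. For example, at least one of a, b, or c may indicate: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c may be single or multiple.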
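The seven cases listed for "at least one of a, b, or c" are exactly the non-empty combinations of the three items, which a few lines of code can enumerate. The function name is illustrative.

```python
from itertools import combinations

def at_least_one(items):
    """List every non-empty combination of the items, smallest combinations first."""
    return ["-".join(combo)
            for size in range(1, len(items) + 1)
            for combo in combinations(items, size)]

options = at_least_one(["a", "b", "c"])
```

For n items this yields 2^n - 1 combinations, which gives 7 for the three items in the example.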
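A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments, and are not repeated here.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computing device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.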
Claims (18)
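- A method for acquiring knowledge, comprising: acquiring one or more pieces of first knowledge from a knowledge base according to a parameter, wherein the parameter comprises any one or a combination of the following: knowledge within a machine learning task, an attribute of the machine learning task, and knowledge between multiple machine learning tasks; and providing the one or more pieces of first knowledge to a user.
- The method according to claim 1, further comprising: acquiring the parameter input by the user; or acquiring the parameter from another system.
- The method according to claim 1 or 2, wherein the knowledge within the machine learning task comprises a sample set and a model of the machine learning task, the model being obtained by training on the sample set; or the attribute of the machine learning task comprises a constraint and an application scope of the machine learning task; or the knowledge between the multiple machine learning tasks comprises an association relationship between the multiple machine learning tasks.
- The method according to any one of claims 1 to 3, further comprising: acquiring, from the knowledge base, second knowledge related to the first knowledge; and providing the second knowledge to the user.
- The method according to claim 1, further comprising: providing configuration information of the first knowledge to the user.
- The method according to any one of claims 1 to 5, further comprising: acquiring target knowledge selected by the user, wherein the target knowledge is the first knowledge and/or the second knowledge.
- The method according to claim 6, wherein the target knowledge is used in any one of the following scenarios: object recognition in intelligent driving; person recognition in intelligent driving; a developer platform; an artificial intelligence marketplace platform; an Internet of Things marketplace platform; a solution marketplace platform.
- The method according to any one of claims 1 to 7, further comprising: synchronizing, by an edge device, the knowledge in the knowledge base to a cloud device; or synchronizing, by the cloud device, the knowledge in the knowledge base to the edge device.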
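- An apparatus for acquiring knowledge, comprising: an acquiring module, configured to acquire one or more pieces of first knowledge from a knowledge base according to a parameter, wherein the parameter comprises any one or a combination of the following: knowledge within a machine learning task, an attribute of the machine learning task, and knowledge between multiple machine learning tasks; and a display module, configured to provide the one or more pieces of first knowledge to a user.
- The apparatus according to claim 9, wherein the acquiring module is further configured to: acquire the parameter input by the user; or acquire the parameter from another system.
- The apparatus according to claim 9 or 10, wherein the knowledge within the machine learning task comprises a sample set and a model of the machine learning task, the model being obtained by training on the sample set; or the attribute of the machine learning task comprises a constraint and an application scope of the machine learning task; or the knowledge between the multiple machine learning tasks comprises an association relationship between the multiple machine learning tasks.
- The apparatus according to any one of claims 9 to 11, wherein the acquiring module is further configured to acquire, from the knowledge base, second knowledge related to the first knowledge; and the display module is further configured to provide the second knowledge to the user.
- The apparatus according to claim 9, wherein the display module is further configured to provide configuration information of the first knowledge to the user.
- The apparatus according to any one of claims 9 to 13, wherein the acquiring module is further configured to acquire target knowledge selected by the user, wherein the target knowledge is the first knowledge and/or the second knowledge.
- The apparatus according to claim 14, wherein the target knowledge is used in any one of the following scenarios: object recognition in intelligent driving; person recognition in intelligent driving; a developer platform; an artificial intelligence marketplace platform; an Internet of Things marketplace platform; a solution marketplace platform.
- The apparatus according to any one of claims 9 to 15, further comprising: a synchronization module, configured so that an edge device synchronizes the knowledge in the knowledge base to a cloud device, or the cloud device synchronizes the knowledge in the knowledge base to the edge device.
- A device for acquiring knowledge, comprising a processor and a memory, wherein the processor runs instructions in the memory so that the device for acquiring knowledge performs the method according to any one of claims 1 to 8.
- A computer-readable storage medium, comprising instructions, wherein the instructions are used to implement the method according to any one of claims 1 to 8.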
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP21938819.6A EP4307185A4 (en) | 2021-04-29 | 2021-08-30 | METHOD AND APPARATUS FOR ACQUIRING KNOWLEDGE |
| US18/492,754 US20240054364A1 (en) | 2021-04-29 | 2023-10-23 | Knowledge obtaining method and apparatus |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110473240 | 2021-04-29 | ||
| CN202110473240.3 | 2021-04-29 | ||
| CN202110720333.1A CN115271087A (zh) | 2021-04-29 | 2021-06-28 | Method and apparatus for acquiring knowledge |
| CN202110720333.1 | 2021-06-28 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US18/492,754 Continuation US20240054364A1 (en) | 2021-04-29 | 2023-10-23 | Knowledge obtaining method and apparatus |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022227355A1 (zh) | 2022-11-03 |
Family
ID=83745391
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2021/115192 Ceased WO2022227355A1 (zh) | Method and apparatus for acquiring knowledge | 2021-04-29 | 2021-08-30 |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20240054364A1 (zh) |
| EP (1) | EP4307185A4 (zh) |
| CN (1) | CN115271087A (zh) |
| WO (1) | WO2022227355A1 (zh) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
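| CN113533622A (zh) * | 2021-07-19 | 2021-10-22 | 华能国际电力股份有限公司上海石洞口第二电厂 | Neural-network-based method for predicting coal quality in a coal mill |
| CN119560066B (zh) * | 2024-10-31 | 2025-11-04 | 清华大学 | Method and apparatus for predicting material properties based on modular deep learning |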
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101782976A (zh) * | 2010-01-15 | 2010-07-21 | 南京邮电大学 | Method for automatically selecting machine learning in a cloud computing environment |
| US20190163667A1 (en) * | 2017-11-29 | 2019-05-30 | Google Llc | On-Device Machine Learning Platform to Enable Sharing of Machine-Learned Models Between Applications |
| CN110869949A (zh) * | 2017-08-11 | 2020-03-06 | 谷歌有限责任公司 | On-device machine learning platform |
| US20200097845A1 (en) * | 2018-09-21 | 2020-03-26 | International Business Machines Corporation | Recommending machine learning models and source codes for input datasets |
| CN111369011A (zh) * | 2020-04-16 | 2020-07-03 | 光际科技(上海)有限公司 | Method, apparatus, computer device, and storage medium for applying a machine learning model |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2019113308A1 (en) * | 2017-12-05 | 2019-06-13 | Franchitti Jean Claude | Active adaptation of networked compute devices using vetted reusable software components |
| US11475374B2 (en) * | 2019-09-14 | 2022-10-18 | Oracle International Corporation | Techniques for automated self-adjusting corporation-wide feature discovery and integration |
-
2021
- 2021-06-28 CN CN202110720333.1A patent/CN115271087A/zh active Pending
- 2021-08-30 WO PCT/CN2021/115192 patent/WO2022227355A1/zh not_active Ceased
- 2021-08-30 EP EP21938819.6A patent/EP4307185A4/en active Pending
-
2023
- 2023-10-23 US US18/492,754 patent/US20240054364A1/en active Pending
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4307185A4 * |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4307185A4 (en) | 2024-10-02 |
| EP4307185A1 (en) | 2024-01-17 |
| US20240054364A1 (en) | 2024-02-15 |
| CN115271087A (zh) | 2022-11-01 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21938819; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2021938819; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2021938819; Country of ref document: EP; Effective date: 20231009 |
| | NENP | Non-entry into the national phase | Ref country code: DE |








