CN117633155A - Enterprise matching method, device, equipment, medium and program product

Info

Publication number: CN117633155A
Application number: CN202311659790.XA
Authority: CN
Inventors: 罗奕康; 聂砂; 戴菀庭; 郑江; 丁苏苏
Original assignee: China Construction Bank Corp; CCB Finetech Co Ltd
Current assignee: China Construction Bank Corp; CCB Finetech Co Ltd
Priority date: 2023-12-05
Filing date: 2023-12-05
Publication date: 2024-03-01

Abstract

The application provides an enterprise matching method, apparatus, device, medium and program product, and relates to the technical field of natural language processing. The method comprises: inputting government information to be processed into a named entity recognition model, and acquiring a plurality of initial entities in the government information to be processed, together with the type of each initial entity, output by the named entity recognition model; then, for each initial entity, determining the entity alignment model corresponding to the initial entity according to its type, inputting the initial entity into that entity alignment model, and acquiring the target entity of the initial entity output by the entity alignment model; and finally, according to the enterprise information of each enterprise, determining the enterprises that satisfy the plurality of target entities at the same time as target enterprises. The entity alignment model is used to convert an initial entity into a target entity, and the target entity is structured data. The technical scheme improves the accuracy of the target enterprises obtained by matching.

Description

Enterprise matching method, device, equipment, medium and program product

Technical Field

The present disclosure relates to the field of natural language processing technologies, and in particular, to an enterprise matching method, apparatus, device, medium, and program product.

Background

As modern government affairs become increasingly complex, the level of intelligence of traditional government can no longer cope with the new situation, and smart government must be established. Compared with traditional e-government, smart government is characterized by thorough perception, quick response, proactive service, scientific decision-making and a people-oriented approach; it can effectively improve the efficiency of handling government affairs and shorten the procedures that residents go through.

At present, governments issue benefit-enterprise policies through smart government platforms. To ensure that these policies reach the enterprises eligible to claim rewards in time, staff usually analyze each benefit-enterprise policy manually to determine its conditions, match those conditions against enterprise portraits to determine the target enterprises that satisfy the policy, and send the policy to the target enterprises by email, short message, application program or similar means.

However, the prior art matches target enterprises with low accuracy.

Disclosure of Invention

The application provides an enterprise matching method, apparatus, device, medium and program product, to solve the problem that target enterprises are matched with low accuracy in the prior art.

In a first aspect, the present application provides an enterprise matching method, including:

Inputting government information to be processed into a named entity recognition model, and acquiring a plurality of initial entities and types of each initial entity in the government information to be processed, which are output by the named entity recognition model;

for each initial entity, determining an entity alignment model corresponding to the initial entity according to the type of the initial entity, wherein the entity alignment model is used for converting the initial entity into a target entity, and the target entity is structured data;

inputting the initial entity into a corresponding entity alignment model, and acquiring a target entity of the initial entity output by the entity alignment model;

determining enterprises meeting a plurality of target entities at the same time as target enterprises according to the enterprise information of each enterprise;

wherein the entity alignment model comprises at least one of: a registration place entity alignment model, an industry field entity alignment model, a registration time entity alignment model, an enterprise authentication entity alignment model, and an enterprise scale entity alignment model.

In one possible design, the type of the initial entity includes at least one of: registration place, industry field, registration time, enterprise authentication, and enterprise scale;

Correspondingly, when the type of any initial entity is a registration place, the entity alignment model corresponding to the initial entity is a registration place entity alignment model;

when the type of any initial entity is the industry field, the entity alignment model corresponding to the initial entity is the industry field entity alignment model;

when the type of any initial entity is registration time, the entity alignment model corresponding to the initial entity is a registration time entity alignment model;

when the type of any initial entity is enterprise authentication, the entity alignment model corresponding to the initial entity is an enterprise authentication entity alignment model;

when the type of any initial entity is enterprise scale, the entity alignment model corresponding to the initial entity is enterprise scale entity alignment model.

In one possible design, when the type of any initial entity is enterprise scale, inputting the initial entity into the enterprise scale entity alignment model and acquiring the target entity of the initial entity output by the enterprise scale entity alignment model includes:

if the initial entity contains a preset keyword, determining the target entity corresponding to the keyword as the target entity corresponding to the initial entity;

the preset keywords comprise at least one of the following: large, medium, small and micro;

the target entity corresponding to large is a large enterprise, the target entity corresponding to medium is a medium enterprise, the target entity corresponding to small is a small enterprise, and the target entity corresponding to micro is a micro enterprise.
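The keyword rule above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name, the English keywords, and the choice to return every matching target entity when several keywords appear in one initial entity are all assumptions.

```python
from typing import List

# Keyword -> target entity, following the mapping stated in the text.
# English keywords are used here for illustration only.
SCALE_KEYWORDS = {
    "large": "large enterprise",
    "medium": "medium enterprise",
    "small": "small enterprise",
    "micro": "micro enterprise",
}


def align_enterprise_scale(initial_entity: str) -> List[str]:
    """Return the target entity of every preset keyword found in the initial entity."""
    text = initial_entity.lower()
    # Assumption: an initial entity such as "small and micro enterprises"
    # yields one target entity per keyword it contains.
    return [target for keyword, target in SCALE_KEYWORDS.items() if keyword in text]
```

An initial entity that contains none of the preset keywords yields an empty list, which a caller could treat as "no enterprise scale condition".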

In one possible design, the inputting the initial entity into the corresponding entity alignment model, and obtaining the target entity of the initial entity output by the entity alignment model, includes:

if a plurality of first entities whose type is industry field exist among the plurality of initial entities, determining a union of the plurality of first entities as a second entity;

and inputting the second entity into the industry field entity alignment model, and acquiring the target entity corresponding to the second entity output by the industry field entity alignment model.

In one possible design, before inputting the government information to be processed into the named entity recognition model and acquiring the plurality of initial entities in the government information to be processed and the type of each initial entity output by the named entity recognition model, the method further includes:

acquiring a plurality of pieces of sample government information;

labeling each piece of sample government information to obtain labeling information of the sample government information, wherein the labeling information describes the position and type of each sample entity in the sample government information;

and performing model training according to the plurality of pieces of sample government information and the labeling information of each piece of sample government information to obtain the named entity recognition model.
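The text does not say how the labeling information is fed to training. One common option, shown here purely as an assumption, is to turn each (position, type) label into per-character BIO tags; the span layout `(start, end_exclusive, type)` and the function name are hypothetical.

```python
from typing import List, Tuple

# Assumed label layout: (start, end_exclusive, entity_type).
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[str]:
    """Convert span labels into one BIO tag per character of the sample text."""
    tags = ["O"] * len(text)
    for start, end, entity_type in spans:
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span out of range: {(start, end, entity_type)}")
        tags[start] = f"B-{entity_type}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{entity_type}"          # remaining characters
    return tags
```

The resulting tag sequence, paired with the sample text, is the kind of input a sequence-labeling trainer typically expects; any other encoding that preserves position and type would serve equally well.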

In one possible design, the method further comprises:

and sending the government affair information to be processed to the terminal equipment of the target enterprise.

In a second aspect, the present application provides an enterprise matching apparatus, comprising:

the input module is used for inputting the government information to be processed into a named entity recognition model and acquiring a plurality of initial entities and the type of each initial entity in the government information to be processed, which are output by the named entity recognition model;

the determining module is used for determining an entity alignment model corresponding to each initial entity according to the type of the initial entity, wherein the entity alignment model is used for converting the initial entity into a target entity, and the target entity is structured data;

the input module is further configured to input the initial entity into a corresponding entity alignment model, and obtain a target entity of the initial entity output by the entity alignment model;

The determining module is further configured to determine, according to the enterprise information of each enterprise, an enterprise that satisfies a plurality of target entities at the same time as a target enterprise;

In one possible design, when the type of any initial entity is a business scale, the business scale entity alignment model is used to:

In one possible design, the input module is specifically configured to:

In one possible design, the enterprise matching apparatus further includes a training module which, before the government information to be processed is input into the named entity recognition model and the plurality of initial entities in the government information to be processed and the type of each initial entity output by the named entity recognition model are acquired, is configured to:

Acquiring a plurality of sample government affair information;

In one possible design, the enterprise matching apparatus further includes a sending module configured to:

In a third aspect, embodiments of the present application provide an electronic device, comprising: a processor, and a memory communicatively coupled to the processor;

the memory stores computer-executable instructions;

the processor executes computer-executable instructions stored in the memory to implement the method as described above in the first aspect and the various possible designs of the first aspect.

In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having stored therein computer-executable instructions for implementing the method as described in the first aspect and the various possible designs of the first aspect, when executed by a processor.

In a fifth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements the method as described above for the first aspect and the various possible designs of the first aspect.

According to the enterprise matching method, apparatus, device, medium and program product provided by the application, the government information to be processed is input into a named entity recognition model, and a plurality of initial entities in the government information to be processed, together with the type of each initial entity, output by the named entity recognition model are acquired. Then, for each initial entity, the entity alignment model corresponding to the initial entity is determined according to its type, the initial entity is input into that entity alignment model, and the target entity of the initial entity output by the entity alignment model is acquired. Finally, according to the enterprise information of each enterprise, the enterprises that satisfy the plurality of target entities at the same time are determined as target enterprises. The entity alignment model is used to convert an initial entity into a target entity, and the target entity is structured data. In this technical scheme, after the initial entities of the government information to be processed are extracted, each initial entity is bidirectionally aligned with the enterprise information through the entity alignment model for its type, which improves the accuracy of matching against the enterprise information and therefore the accuracy of the target enterprises obtained by matching.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.

Fig. 1 is a schematic flow chart of a first embodiment of the enterprise matching method provided in an embodiment of the present application;

Fig. 2 is a schematic flow chart of a second embodiment of the enterprise matching method provided in an embodiment of the present application;

Fig. 3 is a schematic flow chart of a third embodiment of the enterprise matching method provided in an embodiment of the present application;

Fig. 4 is a schematic structural diagram of an enterprise matching apparatus according to an embodiment of the present application;

Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.

In the technical scheme of the application, the collection, storage, use, processing, transmission, provision, disclosure and other handling of related information such as financial data or user data comply with the relevant laws and regulations and do not violate public order and good customs. User information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to herein are information and data authorized by the user or fully authorized by all parties; the collection, use and processing of the relevant data must comply with the relevant laws, regulations and standards, and corresponding operation entries are provided for the user to grant or refuse authorization.

First, terms related to the present application will be explained.

1. Natural language processing (NLP): natural language processing is an important research direction at the intersection of computer science, artificial intelligence and linguistics, which mainly studies how to enable computers to understand, generate and process natural language.

2. Named entity recognition (NER): named entity recognition is a task in natural language processing whose main purpose is to identify entities with a specific meaning in text, such as person names, place names and organization names.

3. Text classification: text classification is a task in natural language processing whose main purpose is to assign text to predefined categories based on its content.

4. Big data: big data refers to data sets so large or complex that conventional data processing applications cannot handle them adequately. Big data is typically characterized by large volume, rapid generation and diversity.

5. Data matching: data matching refers to finding identical or similar data in a large amount of data.

6. Bidirectional alignment: bidirectional alignment refers to alignment between two different data sets such that data in one data set can find corresponding data in the other data set, and vice versa.

Next, an explanation is given of the application background of the present application.

Currently, governments issue many benefit-enterprise policies, such as rewards for "small and micro technology enterprises". These policies are typically published on a corresponding platform, such as a smart government platform. If an eligible enterprise does not log in to the platform, it cannot see the benefit-enterprise policy and therefore cannot apply for the benefit. Therefore, to ensure that enterprises can see in time the benefit-enterprise policies under which they can apply for rewards and subsidies, the policies need to be delivered promptly to the corresponding enterprises.

At present, to deliver a benefit-enterprise policy to the enterprises that can apply for rewards, staff usually analyze the policy manually to determine its conditions. Then enterprise portraits and portrait labels are obtained from enterprise information, the conditions of the policy are matched against enterprises according to those portraits and labels, the target enterprises satisfying the policy are determined, and the policy is sent to the target enterprises by email, short message, application program or similar means, or users are reminded to log in to the smart government platform to view it.

The prior art matches enterprises with benefit-enterprise policies mainly through natural semantic analysis, thereby achieving intelligent matching and recommendation between enterprise portraits and benefit-enterprise policies, providing enterprises with more opportunities and consultation, and promoting their development. However, this prior art requires the policy to be analyzed manually, and the rigor and efficiency of manual processing cannot be guaranteed, which lowers both the accuracy and the efficiency with which target enterprises are matched. Moreover, when a benefit-enterprise policy contains complex conditions, recommendation based on enterprise portraits is not accurate enough and is difficult to interpret.

In another prior art, a benefit-enterprise policy may be extracted and processed by a natural language processing method, and the extracted data may be disassembled, so as to obtain an application range and an evaluation standard of the policy.

In this prior art, because the benefit-enterprise policy is itself natural language rather than structured data, its specific conditions cannot be extracted accurately, and the data extracted from the policy cannot be put into accurate one-to-one correspondence with enterprise information. For example, a benefit-enterprise policy may refer to the "new energy automobile industry", but the enterprise information recorded at registration for an enterprise that satisfies the policy has no field whose value is "new energy automobile"; it has only a "business scope" field with a value such as automobile manufacturing, so the policy cannot be matched with the enterprise.

Based on the above example, it can be seen that the matching result of the prior art is not accurate, some enterprises meeting the policy conditions may be omitted, and some enterprises not meeting the policy conditions may be erroneously matched.

In summary, the prior art has the problem of low matching accuracy of the target enterprise.

Based on the above technical problem, the application provides an enterprise matching method. After the government information to be processed is acquired, it can be input into a named entity recognition model to acquire the initial entities in the government information to be processed and the type of each initial entity. Then, based on its type, each initial entity is input into the corresponding entity alignment model, which converts the initial entity from natural language into structured data, thereby achieving bidirectional alignment with enterprise information and improving the accuracy of subsequent matching with enterprises.

The following describes the technical scheme of the present application in detail through specific embodiments.

It should be noted that the following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.

Fig. 1 is a schematic flow chart of a first embodiment of the enterprise matching method provided in an embodiment of the present application. As shown in Fig. 1, the enterprise matching method may include the following steps:

S11, inputting the government information to be processed into a named entity recognition model, and acquiring a plurality of initial entities in the government information to be processed, and the type of each initial entity, output by the named entity recognition model.

The execution subject of this embodiment of the application is an electronic device, which may be a terminal device, such as a notebook computer, a tablet computer or a desktop computer, or a server, such as a backend server of a smart government platform. In practical application, whether the electronic device is a terminal device or a server may be determined according to the actual situation and is not specifically limited.

The government information to be processed is government information related to enterprises, such as a benefit-enterprise policy, a notice of procedures that enterprises are required to complete, or a notice of supporting materials that enterprises are required to provide. In practical application, the government information to be processed may be received from government staff through an application programming interface (API) or a graphical user interface (GUI), or may be obtained from a server or database in which it is stored; when it has been stored in advance in the local storage space of the electronic device, the electronic device may also obtain it directly from the local storage space.

The named entity recognition model is used for extracting a plurality of initial entities in the government information to be processed and the type of each initial entity. Furthermore, the named entity recognition model can also extract the position of each initial entity in the government information to be processed.

It should be understood that the embodiments of the present application do not limit the number of initial entities, and may be determined according to actual situations.

In one possible implementation, the policy title, issuing body and policy text of the government information to be processed may be spliced together in the form "#prompt: content#", and the spliced government information is then input into the named entity recognition model to acquire the plurality of initial entities in the government information to be processed, and the type of each initial entity, output by the named entity recognition model.

Illustratively, assume that the spliced government information is: "#Item: medical institutions occupying cultivated land are exempt from cultivated land occupation tax#\n#Agency: XX Bureau of XX Province#\n#Condition: the tax-exempt medical institutions are limited specifically to the places, and their supporting facilities, that are used exclusively for disease diagnosis and treatment activities within medical institutions approved by the health administrative departments of people's governments at or above the county level.#"
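The splicing step can be sketched as below. The function name and the field labels ("Item", "Agency", "Condition") are assumptions modelled on the example above; the text itself only fixes the general "#prompt: content#" form.

```python
def splice_policy(title: str, issuer: str, body: str) -> str:
    """Join title, issuing body and policy text as '#label: content#' segments."""
    # Labels are illustrative; a real system would use whatever prompts
    # the named entity recognition model was trained with.
    parts = [("Item", title), ("Agency", issuer), ("Condition", body)]
    return "\n".join(f"#{label}: {content}#" for label, content in parts)
```

Keeping each field in its own labelled segment lets the model see which part of the policy an entity came from, while still receiving a single input string.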

The spliced government information is input into the named entity recognition model, and the plurality of initial entities and the type of each initial entity output by the named entity recognition model can be expressed as follows: [[4, 8, "industry field", "medical institution"], [31, 34, "registration place", "XX province"], [46, 50, "industry field", "medical institution"], [78, 82, "industry field", "medical institution"], [85, 105, "industry field", "place for disease diagnosis, therapeutic activity and its supporting facilities"]].

In the above example, taking [4, 8, "industry field", "medical institution"] as an example, "medical institution" is the initial entity, "industry field" is the type of the initial entity, 4 is the starting position (i.e. the 4th character) of the initial entity in the government information to be processed, and 8 is the ending position (i.e. the 8th character) of the initial entity in the government information to be processed. The other initial entities in the above example can be read in the same way and are not described in detail here.
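For illustration, records of the form [start, end, type, entity] can be read into named fields as follows. The class and function names are hypothetical, and the two sample records are taken from the example above.

```python
from typing import List, NamedTuple, Sequence


class InitialEntity(NamedTuple):
    start: int        # starting position in the spliced text
    end: int          # ending position in the spliced text
    entity_type: str  # e.g. "industry field"
    text: str         # the entity as it appears in the text


def parse_ner_output(raw: Sequence[Sequence]) -> List[InitialEntity]:
    """Convert [start, end, type, text] records into named tuples."""
    entities = []
    for record in raw:
        if len(record) != 4:
            raise ValueError(f"expected 4 fields, got {record!r}")
        start, end, entity_type, text = record
        entities.append(InitialEntity(int(start), int(end), str(entity_type), str(text)))
    return entities


raw_output = [
    [4, 8, "industry field", "medical institution"],
    [31, 34, "registration place", "XX province"],
]
entities = parse_ner_output(raw_output)
```

Downstream steps can then refer to `entity_type` and `text` by name instead of by list index.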

Wherein the type of the initial entity comprises at least one of: registration location, industry domain, registration time, enterprise authentication, and enterprise scale.

S12, determining an entity alignment model corresponding to each initial entity according to the type of the initial entity.

The entity alignment model is used to convert the initial entity into a target entity, where the target entity is structured data, so that the government information to be processed and the enterprise information can be bidirectionally aligned.

When the type of any initial entity is a registration place, the entity alignment model corresponding to the initial entity is a registration place entity alignment model. When the type of any initial entity is the industry field, the entity alignment model corresponding to the initial entity is the industry field entity alignment model. When the type of any initial entity is the registration time, the entity alignment model corresponding to the initial entity is the registration time entity alignment model. When the type of any initial entity is enterprise authentication, the entity alignment model corresponding to the initial entity is enterprise authentication entity alignment model. When the type of any initial entity is enterprise scale, the entity alignment model corresponding to the initial entity is enterprise scale entity alignment model.
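The type-to-model correspondence above amounts to a lookup table. The following sketch assumes each entity alignment model can be wrapped as a callable that takes the initial entity text; `make_dispatcher` and the two placeholder aligners are hypothetical stand-ins, not the models themselves.

```python
from typing import Callable, Dict

# The five entity types named in the text.
ENTITY_TYPES = (
    "registration place",
    "industry field",
    "registration time",
    "enterprise authentication",
    "enterprise scale",
)

# Assumed signature: initial entity text -> target entity (structured data).
Aligner = Callable[[str], object]


def make_dispatcher(aligners: Dict[str, Aligner]) -> Callable[[str, str], object]:
    """Build a function that routes an initial entity to the aligner for its type."""
    unknown = set(aligners) - set(ENTITY_TYPES)
    if unknown:
        raise ValueError(f"unsupported entity types: {sorted(unknown)}")

    def dispatch(entity_type: str, text: str) -> object:
        if entity_type not in aligners:
            raise ValueError(f"no entity alignment model for type {entity_type!r}")
        return aligners[entity_type](text)

    return dispatch


# Placeholder aligners standing in for the real models.
dispatch = make_dispatcher({
    "registration place": lambda text: {"address": text},
    "enterprise scale": lambda text: text.lower(),
})
```

Registering only a subset of the five types is allowed here, which mirrors the "at least one of" wording used for the entity alignment models.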

S13, inputting the initial entity into a corresponding entity alignment model, and obtaining a target entity of the initial entity output by the entity alignment model.

It should be understood that the specific processing procedure of each entity alignment model will be described in the following embodiments, which will not be described herein.

It should be understood that all the initial entities may be input into the corresponding entity alignment model at the same time, the initial entities may be sequentially input into the corresponding entity alignment model in a certain order, or a preset number of initial entities may be input into the corresponding entity alignment model first, and then a preset number of initial entities may be input into the corresponding entity alignment model until all the initial entities are input into the corresponding entity alignment model. The sequence may be a preset sequence or a random sequence, the preset number may be 2, 3, 4, etc., and the sequence and the preset number may be determined according to actual situations, which is not particularly limited.

S14, determining enterprises meeting a plurality of target entities at the same time as target enterprises according to the enterprise information of each enterprise.

When each enterprise is registered, the related authorities collect related information of the enterprise and integrate the related information to generate enterprise information. After determining a plurality of target entities of the government affair information to be processed, the government affair information to be processed is matched with enterprises according to the enterprise information of the enterprises. If there is enterprise information of one enterprise that is matched with a plurality of target entities at the same time, the enterprise is determined as a target enterprise.

In one possible implementation, if any field corresponding to the target entity exists in the enterprise information, it is determined that the enterprise information matches the target entity successfully.

By way of example, other matching methods may be implemented, such as rule-based matching, machine learning-based matching, etc., and the present application is not limited to a particular matching method.
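A minimal sketch of the field-based matching in S14 follows. It assumes enterprise information and target entities are both reduced to "field name -> set of values"; that representation, the field names and the sample values are illustrative only.

```python
from typing import Dict, List, Set

EnterpriseInfo = Dict[str, Set[str]]  # field name -> values recorded for the enterprise
Conditions = Dict[str, Set[str]]      # field name -> values accepted by a target entity


def satisfies_all(info: EnterpriseInfo, conditions: Conditions) -> bool:
    """True only if every condition field shares at least one value with the enterprise."""
    return all(info.get(field, set()) & accepted for field, accepted in conditions.items())


def find_target_enterprises(
    enterprises: Dict[str, EnterpriseInfo], conditions: Conditions
) -> List[str]:
    """Return the names of enterprises that satisfy all target entities at the same time."""
    return [name for name, info in enterprises.items() if satisfies_all(info, conditions)]


# Hypothetical data: only enterprise "A" meets both conditions.
enterprises = {
    "A": {"registration_place": {"430000000000"}, "industry": {"industry-1"}},
    "B": {"registration_place": {"430000000000"}, "industry": {"industry-2"}},
    "C": {"registration_place": {"110000000000"}, "industry": {"industry-1"}},
}
conditions = {"registration_place": {"430000000000"}, "industry": {"industry-1"}}
targets = find_target_enterprises(enterprises, conditions)
```

Because both sides are already structured data, the check reduces to set intersection per field; an enterprise missing a required field simply fails that condition.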

For example, the enterprise information of each enterprise described above may be stored in a big data cluster.

Optionally, after the target enterprise is determined, the to-be-processed government information can be sent to the terminal device of the target enterprise, so that relevant staff of the target enterprise can check the to-be-processed government information through the terminal device, and follow-up processing is performed based on the to-be-processed government information.

Optionally, after the target enterprise is determined, a reminder message can be sent to the terminal device of the target enterprise, where the reminder message is used to remind relevant staff of the target enterprise to log in to the smart government platform to view the government information to be processed. Specifically, the reminder message may include a link to the website on which the government information to be processed is published, or instructions on how to view the government information to be processed.

The embodiment of the application provides an enterprise matching method in which the government information to be processed is input into a named entity recognition model, and a plurality of initial entities in the government information to be processed, together with the type of each initial entity, output by the named entity recognition model are acquired. Then, for each initial entity, the entity alignment model corresponding to the initial entity is determined according to its type, the initial entity is input into that entity alignment model, and the target entity of the initial entity output by the entity alignment model is acquired. Finally, according to the enterprise information of each enterprise, the enterprises that satisfy the plurality of target entities at the same time are determined as target enterprises. The entity alignment model is used to convert an initial entity into a target entity, and the target entity is structured data. In this technical scheme, after the initial entities of the government information to be processed are extracted, each initial entity is bidirectionally aligned with the enterprise information through the entity alignment model for its type, which improves the accuracy of matching against the enterprise information and therefore the accuracy of the target enterprises obtained by matching.

Based on the above embodiments, the processing procedures of the registration place entity alignment model, the industry field entity alignment model, the registration time entity alignment model, the enterprise authentication entity alignment model, and the enterprise scale entity alignment model are explained below, respectively.

Registration place entity alignment model

When the type of the initial entity is registration place, the initial entity is input into the registration place entity alignment model. After acquiring the initial entity, the registration place entity alignment model extracts the province, the city, and so on from it and queries the administrative division table, so as to output the corresponding target entity.

Based on the example shown in S11, the initial entity "XX province" is input into the registration place entity alignment model, and the target entity output by the registration place entity alignment model is: XX province -> registration place entity alignment model -> [{'address': 'XX province', 'full_path': ['430000000000']}]
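The lookup described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the table contents, the function name `align_registration_place`, and the use of plain substring matching in place of a trained extraction model are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical excerpt of an administrative division table (name -> code).
# The single entry mirrors the placeholder used in the example above.
DIVISION_TABLE: Dict[str, str] = {
    "XX province": "430000000000",
}


def align_registration_place(initial_entity: str) -> List[dict]:
    """Return one structured record per known division named in the text.

    Substring matching stands in for the extraction step; the description
    uses a trained model for that part.
    """
    results = []
    for name, code in DIVISION_TABLE.items():
        if name in initial_entity:
            results.append({"address": name, "full_path": [code]})
    return results


print(align_registration_place("XX province"))
```

An entity that names no known division yields an empty list, which a caller could treat as "no registration place condition".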

Industry domain entity alignment model

If the plurality of initial entities includes a plurality of first entities whose type is industry domain, the union of the first entities is determined as a second entity. The second entity is then input into the industry domain entity alignment model, and the target entity corresponding to the second entity output by the industry domain entity alignment model is obtained.

Based on the example shown in S11, the union of the first entities of the industry domain type, "medical institution", "place to engage in disease diagnosis", and "place to engage in therapeutic activity and its ancillary facilities", is determined as the second entity {"medical institution", "place to engage in disease diagnosis", "place to engage in therapeutic activity and its ancillary facilities"}. The second entity is input into the industry domain entity alignment model, which aligns each element of the second entity (this example assumes that the alignment target is the national first-level industry category; the alignment level is not limited in practical applications). The alignment results are then merged, and the merged result is output as the target entity, which can be represented as follows:

1. medical institution -> industry domain entity alignment model -> Q0000 health and social work

2. place to engage in disease diagnosis and therapeutic activity and its ancillary facilities -> industry domain entity alignment model -> Q0000 health and social work

3. Merged result: ['Q0000']
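The union, align, and merge sequence above can be illustrated with a short sketch. The lookup table `TOY_CLASSIFIER` is a hypothetical stand-in for the trained text classification model, and the function names are chosen for this example only.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the trained classifier: phrase -> category code.
TOY_CLASSIFIER: Dict[str, str] = {
    "medical institution": "Q0000",
    "place to engage in disease diagnosis": "Q0000",
    "place to engage in therapeutic activity and its ancillary facilities": "Q0000",
}


def classify_industry(phrase: str) -> Optional[str]:
    """Map one phrase to a category code, or None if it is unknown."""
    return TOY_CLASSIFIER.get(phrase)


def align_industry_domain(first_entities: List[str]) -> List[str]:
    """Take the union of the first entities, align each, and merge the codes."""
    second_entity = set(first_entities)                   # union of first entities
    codes = {classify_industry(p) for p in second_entity}
    codes.discard(None)                                   # drop phrases with no category
    return sorted(codes)                                  # merged, de-duplicated result


print(align_industry_domain([
    "medical institution",
    "place to engage in disease diagnosis",
    "place to engage in therapeutic activity and its ancillary facilities",
]))
```

Because the results are merged as a set, several phrases that fall under the same category contribute a single code to the target entity.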

Registration time entity alignment model

When the type of the initial entity is registration time, the initial entity is input into the registration time entity alignment model. After acquiring the initial entity, the registration time entity alignment model extracts the time through a regular expression and judges whether the required registration time is before or after the extracted time. After the judgment result is determined, the target entity is output based on the judgment result.

Illustratively, assuming that the initial entity is "before XXXX year XX month XX day", the target entity output by the registration time entity alignment model is: before XXXX year XX month XX day -> registration time entity alignment model -> [None, 'XXXX year XX month XX day']
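A minimal sketch of the regular-expression step follows. It assumes, for illustration only, that dates appear in ISO form (YYYY-MM-DD), that the English keywords "before" and "after" mark the direction, and that the output is read as [earliest, latest] with None for an open bound, which is consistent with the example above. The function name is hypothetical.

```python
import re
from typing import List, Optional

# Assumed date format for this sketch; real policy text would need a
# pattern suited to its own date notation.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def align_registration_time(initial_entity: str) -> List[Optional[str]]:
    """Return [earliest, latest]; None means the bound is open."""
    dates = DATE_RE.findall(initial_entity)
    if not dates:
        return [None, None]
    if len(dates) >= 2:
        # A range such as "A to B": registered after A and before B.
        return [dates[0], dates[1]]
    if "before" in initial_entity:
        return [None, dates[0]]
    if "after" in initial_entity:
        return [dates[0], None]
    # Assumption: a bare date with no keyword is treated as an exact day.
    return [dates[0], dates[0]]


print(align_registration_time("before 2023-02-05"))
print(align_registration_time("2023-02-01 to 2023-02-05"))
```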

Enterprise authentication entity alignment model

When the type of the initial entity is enterprise authentication, the initial entity is input into the enterprise authentication entity alignment model. After acquiring the initial entity, the enterprise authentication entity alignment model vectorizes the initial entity and searches the enterprise authentication qualification library with the vectorized initial entity, so as to obtain and output the best-matching enterprise authentication. The best-matching enterprise authentication is the target entity.

It should be appreciated that the library of enterprise authentication qualifications is pre-established, including a plurality of vectorized enterprise authentication qualifications.

The vectorized initial entity may be searched in the enterprise authentication qualification library by calculating the distance between the vectorized initial entity and each vectorized enterprise authentication qualification in the library, and determining the vectorized enterprise authentication qualification with the smallest distance (that is, the highest similarity) as the enterprise authentication that best matches the vectorized initial entity.

Illustratively, the target entity output by the enterprise authentication entity alignment model may be represented as follows: [Enterprise authentication: xxxx] [Enterprise authentication: xxxx].
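The retrieval step can be sketched as a nearest-neighbour search. The sketch below assumes that the best match is the library entry closest to the query, measured here by cosine similarity; the three-dimensional toy vectors, the entry names, and the function names are hypothetical, and a real system would obtain the vectors from an embedding model.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical pre-built library: qualification name -> toy embedding.
QUALIFICATION_LIBRARY: Dict[str, List[float]] = {
    "qualification A": [1.0, 0.0, 0.0],
    "qualification B": [0.0, 1.0, 0.0],
    "qualification C": [0.0, 0.0, 1.0],
}


def best_match(query_vector: List[float]) -> str:
    """Return the library entry most similar to the vectorized initial entity."""
    return max(
        QUALIFICATION_LIBRARY,
        key=lambda name: cosine_similarity(query_vector, QUALIFICATION_LIBRARY[name]),
    )


print(best_match([0.9, 0.1, 0.0]))
```

For large libraries an approximate nearest-neighbour index would normally replace the linear scan; the exhaustive loop is kept here only to make the comparison explicit.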

Enterprise scale entity alignment model

If an initial entity whose type is enterprise scale contains a preset keyword, the target entity corresponding to that keyword is determined as the target entity corresponding to the initial entity.

Wherein the preset keywords comprise at least one of the following: large, medium, small and micro. The target entity corresponding to the preset keyword 'large' is a large enterprise, the target entity corresponding to the preset keyword 'medium' is a medium-sized enterprise, the target entity corresponding to the preset keyword 'small' is a small enterprise, and the target entity corresponding to the preset keyword 'micro' is a micro enterprise.

For example, assuming that the initial entity is "small and medium-sized enterprise", the target entity output by the enterprise scale entity alignment model is: small and medium-sized enterprise -> enterprise scale entity alignment model -> ['medium-sized enterprise', 'small enterprise'].
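Because this alignment is a keyword lookup, it can be sketched in a few lines. The English keywords below are the translated forms used in this description, and the function name is chosen for illustration; plain substring matching is an assumption of the sketch.

```python
from typing import Dict, List

# Preset keywords and the target entity each one maps to.
SCALE_KEYWORDS: Dict[str, str] = {
    "large": "large enterprise",
    "medium": "medium-sized enterprise",
    "small": "small enterprise",
    "micro": "micro enterprise",
}


def align_enterprise_scale(initial_entity: str) -> List[str]:
    """Return the target entity for every preset keyword found in the text."""
    return [
        target
        for keyword, target in SCALE_KEYWORDS.items()
        if keyword in initial_entity
    ]


print(align_enterprise_scale("small and medium-sized enterprise"))
```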

At this point, the policy names, issuing authorities, and policy conditions in the government information to be processed have been converted from natural language into a set of conditions that can be matched, that is, into a structured form.

Before the government information to be processed is matched with enterprises using the models, each model needs to be trained. The model training process is described in detail below in connection with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.

In a specific implementation, the execution body of the model training process may be an electronic device with processing capability, such as a terminal or a server. It should be understood that the electronic device performing the enterprise matching method and the electronic device performing the model training process may be the same device or different devices.

Next, the training processes of the named entity recognition model, the registered location entity alignment model, the industry domain entity alignment model, the registration time entity alignment model, the enterprise authentication entity alignment model, and the enterprise scale entity alignment model are respectively explained.

Named entity recognition model

A plurality of sample government information is acquired, and each sample government information is labeled to obtain its labeling information. The labeling information describes the position and type of each sample entity in the sample government information. Model training is then performed according to the plurality of sample government information and the labeling information of each sample government information to obtain the named entity recognition model.

Illustratively, sample government information is obtained, and the policy title, the issuing unit, and the policy body text of the sample government information are spliced together in the form "#field: content#", for example: "#item: newly obtained national, provincial, or municipal 'Little Giant' recognition#\n#agency: Industrial Development Office of the XXX Development Zone Management Committee#\n#condition: local enterprises newly recognized in 2022 as national, provincial, or municipal 'Little Giant' enterprises#". The advantage of this format is that the information in the policy title and the issuing unit is absorbed as fully as possible. Further, the spliced sample government information is labeled using a sequence labeling scheme such as BIO (B: begin, I: inside, O: outside) or BIOS, and the resulting labeling information may be: [[11, 16, "enterprise authentication"], [24, 33, "registration place"], [61, 66, "enterprise authentication"]].
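The splicing and labeling steps can be sketched as follows. The field names passed to `splice`, the function names, and the choice of character-level tags with end-exclusive spans are assumptions made for this illustration; the description does not state whether the end offsets are inclusive.

```python
from typing import List, Tuple


def splice(title: str, agency: str, condition: str) -> str:
    """Join the three parts in the "#field: content#" form, one per line."""
    return f"#item: {title}#\n#agency: {agency}#\n#condition: {condition}#"


def spans_to_bio(text: str, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert [start, end, type] spans into one BIO tag per character.

    Assumption: `end` is exclusive, so a span covers text[start:end].
    """
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


sample = "in XX province"
tags = spans_to_bio(sample, [(3, 14, "registration place")])
for char, tag in zip(sample, tags):
    print(repr(char), tag)
```

Characters outside every span keep the tag "O", so the tag sequence always has the same length as the spliced text.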

Further, the initial named entity recognition model may be trained using the plurality of sample government information and the labeling information of each sample government information. Named entity recognition is a common natural language processing technique that can recognize various named entities, such as person names, place names, and organization names, from text. The named entity recognition model obtained by training the initial named entity recognition model is used to recognize the various conditions (namely, the initial entities) in the government information to be processed.

The initial named entity recognition model can be trained with various methods, such as a conditional random field (Conditional Random Field, CRF), a long short-term memory network (Long Short-Term Memory, LSTM), or a language representation model (Bidirectional Encoder Representations from Transformers, BERT). Which method to choose depends on the nature of the data and the requirements of the task. When training the model, the parameters of the model also need to be tuned to achieve the best recognition effect.

Registration place entity alignment model

The initial registration place entity alignment model is trained based on a plurality of first sample texts and the first labeling information of each first sample text to obtain the registration place entity alignment model, so that the registration place entity alignment model can extract entities such as provinces, cities, and counties from a place.

The first labeling information represents a place in the first sample text and the entities, such as province, city, and county, within that place.

Illustratively, the initial registration place entity alignment model may be an NER model.

Industry domain entity alignment model

The initial industry domain entity alignment model is trained based on a plurality of second sample texts to obtain the industry domain entity alignment model. The industry domain entity alignment model is used to classify an entity and align it to the national standard industry classification issued by the National Bureau of Statistics.

By way of example, the type of initial industry domain entity alignment model may be a text classification model.

Registration time entity alignment model

The initial registration time entity alignment model is trained based on a plurality of third sample texts and the third labeling information of each third sample text to obtain the registration time entity alignment model. The registration time entity alignment model is used to classify a time expression as "before" or "after".

For example, if the third sample text is "before February 5, 2023", the third labeling information labels "February 5, 2023" as "before". If the third sample text is "after February 5, 2023", the third labeling information labels "February 5, 2023" as "after". If the third sample text is "February 1, 2023 to February 5, 2023", the third labeling information labels "February 1, 2023" as "after" and "February 5, 2023" as "before".

For example, the type of initial registration time entity alignment model may be a text classification model.

Enterprise authentication entity alignment model

The initial enterprise authentication entity alignment model is trained based on a plurality of fourth sample texts to obtain the enterprise authentication entity alignment model. The enterprise authentication entity alignment model is used to retrieve, from the enterprise authentication qualification library, the enterprise authentication that best matches the vectorized natural-language initial entity.

Illustratively, the type of initial enterprise authentication entity alignment model may be a vector retrieval model.

Enterprise scale entity alignment model

For enterprise scale, no additional model needs to be trained: any model that can extract the keywords "large", "medium", "small", and "micro" can be used as the enterprise scale entity alignment model.

It should be understood that the present application is not limited to the type of any of the models described above, and any of the models may be replaced by other types of models having the same function.

It should be appreciated that the criteria for bi-directional alignment referred to in any of the above embodiments may be replaced by other alignment criteria, such as industry standard based alignment, geographic location based alignment, etc., which are not particularly limited by the present application.

Based on the enterprise matching method shown in any of the above embodiments, the enterprise matching method will be specifically explained by two specific examples.

Example one

Fig. 2 is a schematic flow chart of a second embodiment of an enterprise matching method provided in the embodiment of the present application. As shown in fig. 2, the enterprise matching method may include the steps of:

Step 1: data labeling.

A plurality of sample government information is acquired, and each sample government information is labeled to obtain its labeling information.

Step 2: training the named entity recognition model.

Model training is performed according to the plurality of sample government information and the labeling information of each sample government information to obtain the named entity recognition model.

Step 3: training the entity alignment models.

Model training is performed separately for different conditions, such as registration place, industry domain, registration time, enterprise authentication, and enterprise scale, to obtain the entity alignment model corresponding to each condition.

Steps 1 to 3 constitute the model training process.

Step 4: the government information to be processed is input into the named entity recognition model, and the plurality of initial entities in the government information to be processed and the type of each initial entity output by the named entity recognition model are obtained.

Step 5: each initial entity is input into the corresponding entity alignment model according to its type, and the target entity of the initial entity output by the entity alignment model is obtained.

Step 6: data matching.

The government information to be processed is matched with enterprises according to the enterprise information of each enterprise, and the enterprises that satisfy the plurality of target entities at the same time are determined as the target enterprises.

Steps 4 to 6 constitute the application process of the models.
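The data matching step can be sketched as a filter that keeps only the enterprises satisfying every condition at once. The record layout (`province_code`, `industry_code`, `registered_on`, `certifications`, `scale`), the condition keys, and the rule that holding any one of the listed certifications is enough are all assumptions introduced for this illustration.

```python
from typing import Dict, List


def satisfies(enterprise: Dict, conditions: Dict) -> bool:
    """True only if the enterprise meets every condition that is present."""
    places = conditions.get("registration_place")
    if places and enterprise["province_code"] not in places:
        return False
    industries = conditions.get("industry")
    if industries and enterprise["industry_code"] not in industries:
        return False
    window = conditions.get("registration_time")
    if window:
        earliest, latest = window
        day = enterprise["registered_on"]  # ISO dates compare correctly as strings
        if earliest and day < earliest:
            return False
        if latest and day > latest:
            return False
    certs = conditions.get("certification")
    if certs and not set(certs) & set(enterprise["certifications"]):
        return False
    scales = conditions.get("scale")
    if scales and enterprise["scale"] not in scales:
        return False
    return True


def match_enterprises(enterprises: List[Dict], conditions: Dict) -> List[str]:
    """Return the names of all enterprises that satisfy the conditions."""
    return [e["name"] for e in enterprises if satisfies(e, conditions)]


enterprises = [
    {"name": "A", "province_code": "430000000000", "industry_code": "Q0000",
     "registered_on": "2020-05-01", "certifications": [], "scale": "small enterprise"},
    {"name": "B", "province_code": "430000000000", "industry_code": "Q0000",
     "registered_on": "2023-06-01", "certifications": [], "scale": "small enterprise"},
]
conditions = {
    "registration_place": ["430000000000"],
    "industry": ["Q0000"],
    "registration_time": [None, "2022-12-31"],
    "scale": ["medium-sized enterprise", "small enterprise"],
}
print(match_enterprises(enterprises, conditions))
```

A condition type that was not extracted from the government information is simply absent from `conditions` and therefore does not restrict the result.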

Example two

Based on the above embodiments, the application process of each model will be further described. Fig. 3 is a schematic flow chart of a third embodiment of an enterprise matching method provided in the embodiment of the present application. As shown in fig. 3, the enterprise matching method may include the steps of:

Step a: the government information to be processed is input into the named entity recognition model, and the plurality of initial entities in the government information to be processed and the type of each initial entity output by the named entity recognition model are obtained.

Step b1: the initial entity whose type is registration place is input into the registration place entity alignment model, and the target entity output by the registration place entity alignment model is obtained.

Step b2: the initial entity whose type is industry domain is input into the industry domain entity alignment model, and the target entity output by the industry domain entity alignment model is obtained.

Step b3: the initial entity whose type is registration time is input into the registration time entity alignment model, and the target entity output by the registration time entity alignment model is obtained.

Step b4: the initial entity whose type is enterprise authentication is input into the enterprise authentication entity alignment model, and the target entity output by the enterprise authentication entity alignment model is obtained.

Step b5: the initial entity whose type is enterprise scale is input into the enterprise scale entity alignment model, and the target entity output by the enterprise scale entity alignment model is obtained.

It should be understood that the execution order of steps b1 to b5 is not limited; that is, steps b1 to b5 may be executed simultaneously or sequentially in any order.

Step c: data matching.
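The routing of initial entities to their alignment models in steps b1 to b5 can be sketched as a dispatch table keyed by entity type. The function `align_all`, the (text, type) tuple layout, and the stand-in aligners are hypothetical; each aligner is assumed to accept all entity texts of its type at once.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


def align_all(
    initial_entities: List[Tuple[str, str]],
    aligners: Dict[str, Callable[[List[str]], list]],
) -> Dict[str, list]:
    """Group (text, type) pairs by type and run the aligner for each type."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for text, entity_type in initial_entities:
        grouped[entity_type].append(text)
    # Types without a registered aligner are skipped in this sketch.
    return {
        entity_type: aligners[entity_type](texts)
        for entity_type, texts in grouped.items()
        if entity_type in aligners
    }


# Stand-in aligners for illustration only.
demo_aligners = {
    "industry domain": lambda texts: sorted(set(texts)),
    "enterprise scale": lambda texts: ["small enterprise"],
}
print(align_all(
    [("a", "industry domain"), ("b", "industry domain"), ("small", "enterprise scale")],
    demo_aligners,
))
```

Since each type is handled independently, the per-type calls could equally be submitted to a thread pool, which corresponds to executing steps b1 to b5 simultaneously.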

Based on the enterprise matching method shown in the above embodiment, the enterprise matching method has the following technical effects:

1. After the government issues the government information to be processed, the policy conditions of the government information are extracted automatically based on natural language processing, aligned to the enterprise fields, and finally matched against enterprise information to produce the list of enterprises to which the government information applies. In this way the government information to be processed can reach the relevant enterprises, which improves policy effectiveness and delivery efficiency.

2. The natural language processing technology is used for extracting the policy conditions of the government information to be processed, so that the accuracy and the efficiency of the extraction are greatly improved. The policy condition is the initial entity in the government affair information to be processed.

3. After the policy conditions are extracted, they are also aligned to the enterprise fields, so that the policy conditions can be accurately matched against the actual circumstances of enterprises.

4. After condition extraction and alignment processing are performed, matching is performed on the basis of enterprise information and the policy conditions after the alignment processing, so that a target enterprise meeting the policy conditions can be accurately determined.

The following are device embodiments of the present application, which may be used to perform method embodiments of the present application. For details not disclosed in the device embodiments of the present application, please refer to the method embodiments of the present application.

Fig. 4 is a schematic structural diagram of an enterprise matching apparatus according to an embodiment of the present application. As shown in fig. 4, the enterprise matching apparatus 40 includes:

the input module 41 is configured to input the government information to be processed into a named entity recognition model, and obtain a plurality of initial entities in the government information to be processed and the type of each initial entity output by the named entity recognition model;

The determining module 42 is configured to determine, for each initial entity, an entity alignment model corresponding to the initial entity according to a type of the initial entity, where the entity alignment model is used to convert the initial entity into a target entity, and the target entity is structured data;

the input module 41 is further configured to input the initial entity into a corresponding entity alignment model, and obtain a target entity of the initial entity output by the entity alignment model;

the determining module 42 is further configured to determine, according to the enterprise information of each enterprise, an enterprise that satisfies a plurality of target entities at the same time as a target enterprise.

In one possible design, the type of initial entity includes at least one of: registration location, industry domain, registration time, enterprise authentication, and enterprise scale.

Correspondingly, when the type of any initial entity is a registration place, the entity alignment model corresponding to the initial entity is the registration place entity alignment model.

When the type of any initial entity is the industry field, the entity alignment model corresponding to the initial entity is the industry field entity alignment model.

When the type of any initial entity is the registration time, the entity alignment model corresponding to the initial entity is the registration time entity alignment model.

When the type of any initial entity is enterprise authentication, the entity alignment model corresponding to the initial entity is enterprise authentication entity alignment model.

In one possible design, when the type of any initial entity is enterprise-scale, an enterprise-scale entity alignment model is used to:

if the initial entity contains the preset keyword, determining the target entity corresponding to the keyword as the target entity corresponding to the initial entity.

The preset keywords comprise at least one of the following: large, medium, small and micro.

The large corresponding target entity is a large enterprise, the medium corresponding target entity is a medium-sized enterprise, the small corresponding target entity is a small-sized enterprise, and the micro corresponding target entity is a micro-sized enterprise.

In one possible design, the input module 41 is specifically configured to:

if a plurality of first entities with types being industry fields exist in the plurality of initial entities, determining a union of the plurality of first entities as a second entity.

In one possible design, before the to-be-processed government information is input into the named entity recognition model, and the plurality of initial entities in the to-be-processed government information output by the named entity recognition model and the type of each initial entity are obtained, the enterprise matching apparatus 40 further includes a training module, where the training module is configured to:

and acquiring a plurality of sample government affair information.

Labeling each sample government information to obtain labeling information of the government information, wherein the labeling information is used for explaining the position and the type of each sample entity in the sample government information.

In one possible design, the enterprise matching apparatus 40 further includes a sending module configured to:

It should be understood that the division of the above apparatus into modules is merely a division of logical functions; in an actual implementation the modules may be fully or partially integrated into one physical entity or may be physically separate. The modules may all be implemented as software called by a processing element, or all be implemented in hardware; alternatively, some modules may be implemented as software called by a processing element and others in hardware. A module may be a separately arranged processing element, may be integrated in a chip of the above apparatus, or may be stored in a memory of the above apparatus in the form of program code, with a processing element of the apparatus calling and executing the function of the module. In addition, all or part of the modules may be integrated together or implemented independently. The processing element here may be an integrated circuit with signal processing capability. In implementation, each step of the above method or each of the above modules may be completed by an integrated logic circuit of hardware in a processor element or by instructions in the form of software.

Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 5, the electronic device 50 may include: a processor 51, and a memory 52.

The processor 51 executes computer-executable instructions stored in the memory, causing the processor 51 to execute the arrangements of the above-described embodiments. The processor 51 may be a general-purpose processor including a central processing unit CPU, a network processor (network processor, NP), etc.; but may also be a digital signal processor DSP, an application specific integrated circuit ASIC, a field programmable gate array FPGA or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component.

The memory 52 is connected to the processor 51 via a system bus, through which the two communicate with each other; the memory 52 is used to store computer program instructions.

Optionally, the electronic device 50 may further include a transceiver, where the transceiver is configured to obtain the pending government affair information and send the pending government affair information to a terminal device of the target enterprise.

The system bus may be a peripheral component interconnect (peripheral component interconnect, PCI) bus, an extended industry standard architecture (extended industry standard architecture, EISA) bus, or the like. The system bus may be classified into an address bus, a data bus, a control bus, and the like. For ease of illustration, the bus is represented in the figures by only one bold line, but this does not mean that there is only one bus or one type of bus. The transceiver is used to enable communication between the database access device and other computers (e.g., clients, read-write libraries, and read-only libraries). The memory may include random access memory (random access memory, RAM) and may also include non-volatile memory (non-volatile memory).

The electronic device provided by the embodiment of the application is used for executing the technical scheme of the enterprise matching method in the embodiment.

The embodiment of the application also provides a chip for running the instruction, which is used for executing the technical scheme of the enterprise matching method in the embodiment.

The embodiment of the application also provides a computer readable storage medium, wherein computer execution instructions are stored in the computer readable storage medium, and the computer execution instructions are used for realizing the technical scheme of the enterprise matching method in the embodiment when being executed by a processor.

The embodiment of the application also provides a computer program product, which comprises a computer program stored in a computer readable storage medium, wherein at least one processor can read the computer program from the computer readable storage medium, and the technical scheme of the enterprise matching method in the embodiment can be realized when the at least one processor executes the computer program.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, e.g., the division of modules is merely a logical function division, and there may be additional divisions of actual implementation, e.g., multiple modules may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.

The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to implement the solution of this embodiment.

In addition, each functional module in each embodiment of the present application may be integrated in one processing unit, or each module may exist alone physically, or two or more modules may be integrated in one unit. The units formed by the modules can be realized in a form of hardware or a form of hardware and software functional units.

The integrated modules, which are implemented in the form of software functional modules, may be stored in a computer readable storage medium. The software functional modules described above are stored in a storage medium and include instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or processor to perform some steps of the methods of the various embodiments of the present application.

It should be understood that the above processor may be a central processing unit (Central Processing Unit, abbreviated as CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, abbreviated as DSP), application specific integrated circuits (Application Specific Integrated Circuit, abbreviated as ASIC), etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution.

The memory may comprise a high-speed RAM memory, and may further comprise a non-volatile memory NVM, such as at least one magnetic disk memory, and may also be a U-disk, a removable hard disk, a read-only memory, a magnetic disk or optical disk, etc.

The bus may be an industry standard architecture (Industry Standard Architecture, ISA) bus, a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, the buses in the drawings of the present application are not limited to only one bus or one type of bus.

The storage medium may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer.

An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application specific integrated circuit (Application Specific Integrated Circuits, ASIC for short). Of course, the processor and the storage medium may reside as discrete components in an electronic control unit or master control device.

Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the method embodiments described above may be performed by hardware associated with program instructions. The foregoing program may be stored in a computer readable storage medium. The program, when executed, performs steps including the method embodiments described above; and the aforementioned storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, and some or all of their technical features may be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims

1. An enterprise matching method, comprising:

2. The method of claim 1, wherein the type of the initial entity comprises at least one of: registration location, industry domain, registration time, enterprise authentication, and enterprise scale;

3. The method of claim 2, wherein, when the type of any initial entity is enterprise scale, inputting the initial entity into the enterprise-scale entity alignment model and obtaining the target entity of the initial entity output by the enterprise-scale entity alignment model comprises:

4. The method of claim 2, wherein inputting the initial entity into a corresponding entity alignment model, obtaining a target entity of the initial entity output by the entity alignment model, comprises:

5. The method according to any one of claims 1 to 4, wherein, before inputting the to-be-processed government information into the named entity recognition model and obtaining the plurality of initial entities and the type of each initial entity in the to-be-processed government information output by the named entity recognition model, the method further comprises:

acquiring a plurality of pieces of sample government information;

6. The method according to any one of claims 1 to 4, further comprising:

7. An enterprise matching apparatus, comprising:

8. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;

the memory stores computer-executable instructions;

the processor executes computer-executable instructions stored in the memory to implement the method of any one of claims 1 to 6.

9. A computer readable storage medium having stored therein computer executable instructions which when executed by a processor are adapted to carry out the method of any one of claims 1 to 6.

10. A computer program product comprising a computer program which, when executed by a processor, implements the method of any one of claims 1 to 6.