CN114140077A - Government policy deconstruction method, apparatus, computer device and storage medium
- Publication number
- CN114140077A (application number CN202111441163.XA)
- Authority
- CN
- China
- Prior art keywords
- word
- words
- matching
- summarized
- key
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/103—Workflow collaboration or project management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Strategic Management (AREA)
- Human Resources & Organizations (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Marketing (AREA)
- Economics (AREA)
- Data Mining & Analysis (AREA)
- Primary Health Care (AREA)
- Probability & Statistics with Applications (AREA)
- Development Economics (AREA)
- Databases & Information Systems (AREA)
- Educational Administration (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a government policy deconstruction method comprising the following steps: 1. determining the average word count of the key sentence to be summarized; 2. acquiring a policy text issued on a government website; 3. performing word segmentation on the acquired policy text so that it is decomposed into a plurality of words; 4. finding, among the decomposed words, the first word of the key sentence to be summarized; 5. based on the currently locked word, finding the word with the highest matching weight among the decomposed words and locking it as the next word of the key sentence, whereupon that word becomes the current word; 6. repeating step 5 until the number of words in the key sentence reaches the average word count. An apparatus, a computer device and a storage medium for implementing the method are also disclosed. The invention summarizes with high accuracy and improves working efficiency.
Description
Technical Field
The invention relates to the technical field of electronic government affairs, in particular to a government policy deconstruction method, a government policy deconstruction device, computer equipment and a storage medium.
Background
A policy is a standardized, authoritative provision that sets out, for a certain historical period, the goals to be achieved, the principles of action to be followed, the tasks to be completed, the working methods to be used, and the general steps and specific measures to be taken. A policy has the following characteristics: 1. Timeliness. A policy is a practical measure carried out within a certain period under the prevailing historical and national conditions. 2. Expressiveness. A policy is not a material entity; it is externalized as concepts and information conveyed by symbols, and is expressed by an authority through language and written text.
The rapid development of network technology has given rise to electronic government (e-government). E-government uses modern information technologies such as computers, networks and communications to optimize and reorganize government organizational structures and workflows, to overcome the limits of time, space and departmental division, and to build a streamlined, efficient, clean and fair mode of government operation, so as to provide society with high-quality, standardized and transparent management and services that meet international standards.
As the administrative body of the state, the government develops e-government online, which helps modernize government management and makes government office work electronic, automated and networked. Through the fast and inexpensive communication channel of the Internet, the government can quickly inform the public about the composition, functions and working rules of government bodies and about policies and regulations, increasing the transparency of its work and opening itself to public supervision.
In e-government, the data, documents, archives and socio-economic data of government bodies are stored on network servers in digital form and can be quickly queried and retrieved by computer.
Although the spread of e-government lets people obtain government policies promptly, an enterprise or individual can only apply under a policy that it already knows about. Policies issued from the central to the local level are numerous in type and content, and no enterprise or individual can know them all. Moreover, most policies are time-limited; for some financial policies in particular, the funding or benefit can no longer be obtained once the application deadline has passed. Newly released policies should therefore be pushed promptly and efficiently to the enterprises or individuals that may be eligible. Policy documents are usually long, and pushing the full document makes it hard for the recipient to grasp the important information. It is therefore necessary to extract the key information of interest to the target enterprise or individual and push only that, which better motivates the recipient to apply and helps the policy take effect. At present this key information is mostly extracted manually; because policy documents are numerous and each is long, manual extraction is inefficient, time-consuming and costly in labor.
The applicant has therefore carried out research to solve the above problems, and the technical solutions described below result from that work.
Disclosure of Invention
The first technical problem to be solved by the present invention is to provide, in view of the shortcomings of the prior art, a government policy deconstruction method that improves efficiency, saves time and reduces labor cost.
The second technical problem to be solved by the present invention is to provide a government policy critical information constructing apparatus for implementing the government policy deconstruction method.
The third technical problem to be solved by the present invention is to provide a computer device for implementing the above government policy deconstruction method.
The fourth technical problem to be solved by the present invention is to provide a computer-readable storage medium for implementing the above government policy deconstruction method.
A government policy deconstruction method according to a first aspect of the present invention comprises the following steps:
step S10, determining the average word count of the key sentence to be summarized;
step S20, acquiring a policy text issued on a government website;
step S30, performing word segmentation on the acquired policy text so that the policy text is decomposed into a plurality of words;
step S40, finding, among the decomposed words, the first word of the key sentence to be summarized;
step S50, based on the currently locked word, finding the word with the highest matching weight among the decomposed words and locking it as the next word of the key sentence to be summarized, whereupon that word becomes the current word of the key sentence;
and step S60, repeating step S50 until the number of words in the key sentence to be summarized reaches the average word count.
In a preferred embodiment of the present invention, determining the average word count of the key sentence to be summarized in step S10 comprises the following steps:
step S11, acquiring all key sentences in the historical database;
step S12, performing word segmentation on each key sentence so that each key sentence is decomposed into a plurality of words;
and step S13, counting the number of words in each segmented key sentence and averaging the counts, the result being the average word count of the key sentence to be summarized.
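Steps S11 to S13 can be illustrated with a minimal Python sketch. The function names, the whitespace tokenizer and the rounding of the mean to a whole number are assumptions made for illustration only; the text does not specify how the average is rounded, and a real system would plug in a proper Chinese word segmenter.

```python
from statistics import mean
from typing import Callable, List


def whitespace_tokenize(sentence: str) -> List[str]:
    # Placeholder segmenter (assumption): split on whitespace.
    return sentence.split()


def average_word_count(
    history: List[str],
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
) -> int:
    """S11-S13: segment every stored key sentence and average the word counts."""
    if not history:
        raise ValueError("the historical database contains no key sentences")
    counts = [len(tokenize(sentence)) for sentence in history]  # S12 + counting
    # Rounding to the nearest integer is an assumption of this sketch.
    return round(mean(counts))  # S13


if __name__ == "__main__":
    stored = ["municipal finance grants reward", "enterprise quality award reward once more"]
    print(average_word_count(stored))
```

The returned value then serves as the target length of the key sentence built in steps S40 to S60.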
In a preferred embodiment of the present invention, performing word segmentation on the acquired policy text in step S30 comprises the following steps:
step S31, splitting the acquired policy text into paragraphs, so that the policy text is decomposed into a plurality of paragraphs;
step S32, splitting each paragraph into sentences, so that each paragraph is decomposed into a plurality of sentences;
and step S33, performing word segmentation on each sentence, so that each sentence is decomposed into a plurality of words.
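The three-level decomposition of steps S31 to S33 might look like the following sketch. The splitting rules (line breaks for paragraphs, sentence-ending punctuation for sentences) and the default whitespace tokenizer are simplifying assumptions; all function names are hypothetical.

```python
import re
from typing import Callable, List

# Split after Chinese/ASCII sentence-ending marks; an ASCII period only counts
# when followed by whitespace, so decimals such as "3.5" are left intact.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[。！？!?])\s*|(?<=\.)\s+")


def split_paragraphs(text: str) -> List[str]:
    # S31: one paragraph per non-empty line (assumption).
    return [line.strip() for line in text.splitlines() if line.strip()]


def split_sentences(paragraph: str) -> List[str]:
    # S32: cut at sentence-ending punctuation and drop empty pieces.
    return [s for s in _SENTENCE_BOUNDARY.split(paragraph) if s]


def segment_policy_text(
    text: str, tokenize: Callable[[str], List[str]] = str.split
) -> List[str]:
    # S33: tokenize every sentence; `tokenize` stands in for a real segmenter.
    words: List[str] = []
    for paragraph in split_paragraphs(text):
        for sentence in split_sentences(paragraph):
            words.extend(tokenize(sentence))
    return words
```

Passing the tokenizer as a parameter keeps the sketch independent of any particular segmentation tool.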
In a preferred embodiment of the present invention, finding the first word of the key sentence to be summarized among the decomposed words in step S40 comprises the following steps:
step S41, taking the first word from the word sequence library and matching it against the decomposed words one by one; if the match succeeds, proceeding to step S42; if the match fails, proceeding to step S43;
step S42, locking the matched word as the first word of the key sentence to be summarized;
step S43, finding a synonym of the current word from the word sequence library and matching the synonym against the decomposed words one by one; if the match succeeds, returning to step S42; if the match fails, proceeding to step S44;
and step S44, taking the next word from the word sequence library, which becomes the current word, and matching it against the decomposed words one by one; if the match succeeds, returning to step S42; if the match fails, returning to step S43.
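The control flow of steps S41 to S44 amounts to walking the word sequence library in priority order and trying each word, then its synonyms, against the decomposed words. The sketch below assumes the library is an ordered list and the synonyms are held in a dictionary; both data shapes, the function name, and the `None` result when the library is exhausted are assumptions, since the text does not describe that case.

```python
from typing import Dict, List, Optional


def find_first_word(
    words: List[str],
    word_sequence: List[str],
    synonyms: Dict[str, List[str]],
) -> Optional[str]:
    """Return the first word of the key sentence, or None if nothing matches."""
    available = set(words)
    for candidate in word_sequence:              # S41, then S44 on each retry
        if candidate in available:
            return candidate                     # S42: lock the matched word
        for synonym in synonyms.get(candidate, []):  # S43: try synonyms
            if synonym in available:
                return synonym                   # S42: lock the matched synonym
    return None  # library exhausted (behaviour assumed for this sketch)


if __name__ == "__main__":
    text_words = ["enterprise", "reward", "municipal"]
    library = ["subsidy", "company", "reward"]
    syn = {"subsidy": ["grant"], "company": ["enterprise", "firm"]}
    print(find_first_word(text_words, library, syn))
```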
In a preferred embodiment of the present invention, finding the word with the highest matching weight among the decomposed words based on the currently locked word, and locking it as the next word of the key sentence to be summarized, in step S50 comprises the following steps:
step S51, based on the currently locked word, finding the word with the highest matching weight in the neuron library and matching it against the decomposed words one by one; if the match succeeds, proceeding to step S52; if the match fails, proceeding to step S53;
step S52, locking the matched word as the next word of the key sentence to be summarized;
step S53, lowering the matching weight, finding in the neuron library the word that corresponds to the lowered matching weight for the currently locked word, and matching it against the decomposed words one by one; if the match succeeds, returning to step S52; if the match fails, proceeding to step S54;
step S54, taking the next word from the word sequence library, which becomes the current word, and matching it against the decomposed words one by one; if the match succeeds, returning to step S52; if the match fails, proceeding to step S55;
step S55, finding a synonym of the current word from the word sequence library and matching the synonym against the decomposed words one by one; if the match succeeds, returning to step S52; if the match fails, proceeding to step S56;
and step S56, taking the next word from the word sequence library and matching it against the decomposed words one by one; if the match succeeds, returning to step S52; if the match fails, returning to step S54.
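Steps S51 to S56 can be read as a two-stage search: first the neuron library candidates in descending weight, then a fallback through the word sequence library and synonyms. The sketch below follows that reading. The nested-dictionary layout of the neuron library, the repetition of S53 over every lower weight, and the `None` result when nothing matches are all assumptions; the names are hypothetical.

```python
from typing import Dict, List, Optional


def find_next_word(
    current: str,
    words: List[str],
    neuron_library: Dict[str, Dict[str, float]],
    word_sequence: List[str],
    synonyms: Dict[str, List[str]],
) -> Optional[str]:
    """Pick the word that follows `current` in the key sentence being built."""
    available = set(words)

    # S51 / S53: candidates for `current`, highest matching weight first.
    ranked = sorted(
        neuron_library.get(current, {}).items(),
        key=lambda item: item[1],
        reverse=True,
    )
    for candidate, _weight in ranked:
        if candidate in available:
            return candidate                     # S52

    # S54-S56: fall back to the word sequence library and synonyms.
    for candidate in word_sequence:
        if candidate in available:
            return candidate                     # S52
        for synonym in synonyms.get(candidate, []):
            if synonym in available:
                return synonym                   # S52
    return None  # nothing matched (behaviour assumed for this sketch)
```

This sketch does not exclude words that are already part of the key sentence; whether repeats should be filtered is not stated in the text.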
In a preferred embodiment of the present invention, the method further comprises step S70: performing a grammar check on the summarized key sentence; if the check passes, the summarized key sentence conforms to the grammar rules; if the check fails, the summarized key sentence does not conform to the grammar rules, and the method returns to step S10 to reconstruct the key sentence.
In a preferred embodiment of the present invention, the method further comprises step S80: manually verifying the key sentence that has passed the grammar check; if the manual verification succeeds, the summarized key sentence is pushed to the target; if the manual verification fails, the method returns to step S10 to reconstruct the key sentence.
In a preferred embodiment of the present invention, the method further comprises step S90: performing self-learning on the key sentences that pass manual verification.
In a preferred embodiment of the present invention, performing self-learning on the key sentences that pass manual verification in step S90 comprises the following steps:
step S91, performing word segmentation on each successfully verified key sentence so that it is decomposed into a plurality of words;
step S92, establishing a matching weight between every two adjacent words and storing the matching weights in the neuron library;
step S93, performing part-of-speech recognition on each decomposed word and storing the result in the part-of-speech library;
step S94, obtaining synonyms of each decomposed word from the Internet and storing them in the synonym library;
and step S95, ranking the decomposed words by priority and storing the ranking in the word sequence library.
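A rough sketch of steps S91, S92 and S95 follows. The text does not say how a matching weight or a priority is computed, so this example assumes the weight is the relative frequency with which one word follows another and the priority is the overall word frequency. Steps S93 and S94 are omitted because they depend on external resources. The class and method names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List


class LearnedLibraries:
    """Accumulates statistics from manually verified key sentences."""

    def __init__(self) -> None:
        self.pair_counts: Dict[str, Counter] = defaultdict(Counter)
        self.word_counts: Counter = Counter()

    def learn(self, sentence: str, tokenize: Callable[[str], List[str]] = str.split) -> None:
        tokens = tokenize(sentence)                      # S91
        self.word_counts.update(tokens)
        for left, right in zip(tokens, tokens[1:]):      # S92: adjacent pairs
            self.pair_counts[left][right] += 1

    def neuron_library(self) -> Dict[str, Dict[str, float]]:
        # Matching weight = share of times `right` followed `left` (assumption).
        return {
            left: {right: n / sum(followers.values()) for right, n in followers.items()}
            for left, followers in self.pair_counts.items()
        }

    def word_sequence(self) -> List[str]:
        # S95: priority = descending frequency (assumption).
        return [word for word, _count in self.word_counts.most_common()]
```

The two derived structures have the shapes a next-word search over a neuron library and a word sequence library would need: a nested weight table and an ordered word list.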
A government policy critical information constructing apparatus according to a second aspect of the present invention comprises:
an average word count calculation module, configured to determine the average word count of the key sentence to be summarized;
a policy text acquisition module, configured to acquire a policy text issued on a government website;
a word segmentation module, configured to perform word segmentation on the acquired policy text so that the policy text is decomposed into a plurality of words;
a first word matching module, configured to find, among the decomposed words, the first word of the key sentence to be summarized; and
a second word matching module, configured to find, based on the currently locked word, the word with the highest matching weight among the decomposed words, to lock it as the next word of the key sentence to be summarized so that it becomes the current word of the key sentence, and to repeat this operation until the number of words in the key sentence to be summarized reaches the average word count.
In a preferred embodiment of the present invention, the apparatus further comprises a grammar check module configured to perform a grammar check on the summarized key sentence; if the check passes, the summarized key sentence conforms to the grammar rules; if the check fails, the summarized key sentence does not conform to the grammar rules and the key sentence is reconstructed.
In a preferred embodiment of the present invention, the apparatus further comprises a self-learning module configured to perform self-learning on the summarized key sentences.
A computer device according to a third aspect of the present invention, for implementing the above government policy deconstruction method, comprises a memory storing a computer program and a processor that implements the following steps when executing the computer program:
step S10, determining the average word count of the key sentence to be summarized;
step S20, acquiring a policy text issued on a government website;
step S30, performing word segmentation on the acquired policy text so that the policy text is decomposed into a plurality of words;
step S40, finding, among the decomposed words, the first word of the key sentence to be summarized;
step S50, based on the currently locked word, finding the word with the highest matching weight among the decomposed words and locking it as the next word of the key sentence to be summarized, whereupon that word becomes the current word of the key sentence;
and step S60, repeating step S50 until the number of words in the key sentence to be summarized reaches the average word count.
A computer-readable storage medium according to a fourth aspect of the present invention, for implementing the above government policy deconstruction method, has stored thereon a computer program which, when executed by a processor, performs the following steps:
step S10, determining the average word count of the key sentence to be summarized;
step S20, acquiring a policy text issued on a government website;
step S30, performing word segmentation on the acquired policy text so that the policy text is decomposed into a plurality of words;
step S40, finding, among the decomposed words, the first word of the key sentence to be summarized;
step S50, based on the currently locked word, finding the word with the highest matching weight among the decomposed words and locking it as the next word of the key sentence to be summarized, whereupon that word becomes the current word of the key sentence;
and step S60, repeating step S50 until the number of words in the key sentence to be summarized reaches the average word count.
Owing to the above technical solution, the invention has the following beneficial effects: the method segments the policy text into words and then finds, one by one among the decomposed words, each word of the key sentence to be summarized, so that the words found form the summarized key sentence. Summarization accuracy is high, working efficiency is effectively improved, working time is saved and labor cost is reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a diagram of an application scenario of one embodiment of the government policy deconstruction method of the present invention.
Fig. 2 is a flow diagram of one embodiment of a government policy deconstruction method of the present invention.
Fig. 3 is a flow chart of determining the average word count in the present invention.
Fig. 4 is a flow chart of the word segmentation process of the present invention.
Fig. 5 is a flow chart of finding the first word of the key sentence to be summarized in the present invention.
Fig. 6 is a flow chart of finding and locking the next word of the key sentence to be summarized in the present invention.
Fig. 7 is a flow chart illustrating an embodiment of a specific application of the government policy deconstruction method of the present invention.
Fig. 8 is a flow chart of another embodiment of a government policy deconstruction method of the present invention.
Fig. 9 is a flow chart of yet another embodiment of a government policy deconstruction method of the present invention.
Fig. 10 is a flow chart of yet another embodiment of a government policy deconstruction method of the present invention.
Fig. 11 is a flow chart of the self-learning process in the government policy deconstruction method of the present invention.
Fig. 12 is a schematic structural diagram of an embodiment of a government policy critical information constructing apparatus of the present invention.
Fig. 13 is an internal structural view of the computer device of the present invention.
Detailed Description
To make the technical means, inventive features, objectives and effects of the invention easy to understand, the invention is further explained below with reference to the drawings.
The government policy deconstruction method provided by the invention can be applied to the application environment shown in fig. 1. Wherein a user terminal 101 communicates with a server 102 via a network. The user terminal 101 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, and the server 102 may be implemented by an independent server or a server cluster formed by a plurality of servers. The user sends an information acquisition request to the server 102 through the user terminal 101, the server 102 acquires the policy file after receiving the information acquisition request, performs word segmentation processing on the policy file, finds out each word in the key words and sentences to be summarized one by one from the decomposed words, constructs the found words into summarized key words and sentences, and finally sends the constructed key words and sentences to the user terminal 101. The invention has high summarizing accuracy, effectively improves the working efficiency, saves the working time and reduces the labor cost.
Referring to fig. 2, a government policy deconstruction method is shown, comprising the following steps:
Step S10, determining the average word count of the key sentence to be summarized.
Step S20, acquiring a policy text issued on a government website.
Step S30, performing word segmentation on the acquired policy text so that the policy text is decomposed into a plurality of words.
Step S40, finding, among the decomposed words, the first word of the key sentence to be summarized.
Step S50, based on the currently locked word, finding the word with the highest matching weight among the decomposed words and locking it as the next word of the key sentence to be summarized, whereupon that word becomes the current word of the key sentence.
Step S60, repeating step S50 until the number of words in the key sentence to be summarized reaches the average word count.
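The loop formed by steps S40 to S60 can be sketched as a small driver that receives the two selection strategies as callables. The function name, the callable signatures and the early stop when no further word can be found are assumptions of this sketch, not details taken from the text.

```python
from typing import Callable, List, Optional

PickFirst = Callable[[List[str]], Optional[str]]
PickNext = Callable[[str, List[str]], Optional[str]]


def build_key_sentence(
    words: List[str],
    target_length: int,
    pick_first: PickFirst,
    pick_next: PickNext,
) -> List[str]:
    """Assemble a key sentence of at most `target_length` words."""
    first = pick_first(words)                     # S40
    if first is None:
        return []
    sentence = [first]
    while len(sentence) < target_length:          # S60: repeat S50
        following = pick_next(sentence[-1], words)  # S50
        if following is None:
            break  # stopping early is an assumption of this sketch
        sentence.append(following)
    return sentence


if __name__ == "__main__":
    segmented = ["municipal", "finance", "grants", "reward"]
    follows = {"municipal": "finance", "finance": "grants", "grants": "reward"}
    result = build_key_sentence(
        segmented,
        target_length=3,
        pick_first=lambda ws: ws[0],
        pick_next=lambda current, ws: follows.get(current),
    )
    print(" ".join(result))
```

Here `target_length` plays the role of the average word count from step S10, and the two lambdas stand in for the library-based matching described in the text.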
For step S10, referring to fig. 3, determining the average word count of the key sentence to be summarized comprises the following steps:
Step S11, acquiring all key sentences in the historical database. The historical database stores key sentences of policy documents: manually summarized key sentences may be entered into the database by hand, and each key sentence constructed by the present method may also be stored in it.
Step S12, performing word segmentation on each key sentence so that each key sentence is decomposed into a plurality of words. In this embodiment, the Baidu LAC word segmentation tool may be used.
Step S13, counting the number of words in each segmented key sentence and averaging the counts, the result being the average word count of the key sentence to be summarized.
In step S20, the policy text issued on the government website may be obtained directly from the government website over the Internet, or may be entered into the system manually.
For step S30, referring to fig. 4, performing word segmentation on the acquired policy text comprises the following steps:
Step S31, splitting the acquired policy text into paragraphs, so that the policy text is decomposed into a plurality of paragraphs.
Step S32, splitting each paragraph into sentences, so that each paragraph is decomposed into a plurality of sentences.
Step S33, performing word segmentation on each sentence, so that each sentence is decomposed into a plurality of words. In this embodiment, the Baidu LAC word segmentation tool may be used.
For step S40, referring to fig. 5, finding the first word of the key sentence to be summarized among the decomposed words comprises the following steps:
Step S41, taking the first word from the word sequence library and matching it against the decomposed words one by one; if the match succeeds, proceeding to step S42; if the match fails, proceeding to step S43.
Step S42, locking the matched word as the first word of the key sentence to be summarized.
Step S43, finding a synonym of the current word from the word sequence library and matching the synonym against the decomposed words one by one; if the match succeeds, returning to step S42; if the match fails, proceeding to step S44.
Step S44, taking the next word from the word sequence library, which becomes the current word, and matching it against the decomposed words one by one; if the match succeeds, returning to step S42; if the match fails, returning to step S43.
For step S50, referring to fig. 6, finding the word with the highest matching weight among the decomposed words based on the currently locked word, and locking it as the next word of the key sentence to be summarized, comprises the following steps:
Step S51, based on the currently locked word, finding the word with the highest matching weight in the neuron library and matching it against the decomposed words one by one; if the match succeeds, proceeding to step S52; if the match fails, proceeding to step S53.
Step S52, locking the matched word as the next word of the key sentence to be summarized.
Step S53, lowering the matching weight, finding in the neuron library the word that corresponds to the lowered matching weight for the currently locked word, and matching it against the decomposed words one by one; if the match succeeds, returning to step S52; if the match fails, proceeding to step S54.
Step S54, taking the next word from the word sequence library, which becomes the current word, and matching it against the decomposed words one by one; if the match succeeds, returning to step S52; if the match fails, proceeding to step S55.
Step S55, finding a synonym of the current word from the word sequence library and matching the synonym against the decomposed words one by one; if the match succeeds, returning to step S52; if the match fails, proceeding to step S56.
Step S56, taking the next word from the word sequence library and matching it against the decomposed words one by one; if the match succeeds, returning to step S52; if the match fails, returning to step S54.
Referring to fig. 7, a specific application example of the government policy deconstruction method of the present invention comprises the following steps:
1. Determine the average word count of the key phrases to be summarized. In this embodiment the average word count is 8, so the key phrase to be summarized is set to 8 words.
2. Acquire a policy text issued on the government website. In this example, the policy text is "for breeding enterprises to obtain quality awards for national, provincial and municipal governments, the municipal financing awards the highest grading awards of 500, 200 and 100 ten thousand dollars at a time.".
3. Perform word segmentation on the acquired policy text so that it is decomposed into a number of words. In this embodiment, the policy text is decomposed into "(pairs) (get) (country level) (,) (province level) (and) (breed) (of) (city level) (government) (quality award) (breed) (enterprise) (,) (city level) (financial) (one-time) (give) (top) (500 ten thousand) (,) (200 ten thousand) (,) (100 ten thousand) (profiles) (reward) ()".
4. Using the word sequence library and the near-synonym library, find among the decomposed words the first-order word of the key phrase to be summarized.
5. According to the locked current-order word, use the word sequence library and the near-synonym library to find the word with the highest matching weight among the decomposed words, and lock it as the next-order word of the key phrase to be summarized. This word then becomes the current-order word of the key phrase.
6. Repeat step 5 until the number of words in the key phrase to be summarized reaches the average word count.
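The six steps of this example form a greedy loop that stops at a fixed length. The Python sketch below shows only that control flow; the two selection routines are passed in as callables, and the names `build_key_phrase`, `pick_first` and `pick_next` are hypothetical. Removing each locked word from the pool of remaining words is an assumption of the sketch, since the example does not say whether a word may be reused.

```python
from typing import Callable, List, Optional, Sequence


def build_key_phrase(
    tokens: Sequence[str],
    target_len: int,
    pick_first: Callable[[List[str]], Optional[str]],
    pick_next: Callable[[str, List[str]], Optional[str]],
) -> List[str]:
    """Greedily assemble a key phrase of at most `target_len` words."""
    remaining = list(tokens)
    first = pick_first(remaining)            # step 4: first-order word
    if first is None or first not in remaining:
        return []
    phrase = [first]
    remaining.remove(first)

    # Steps 5 and 6: extend until the average word count is reached.
    while len(phrase) < target_len and remaining:
        nxt = pick_next(phrase[-1], remaining)
        if nxt is None or nxt not in remaining:
            break                            # no further match is possible
        phrase.append(nxt)
        remaining.remove(nxt)
    return phrase


if __name__ == "__main__":
    words = ["city", "finance", "gives", "enterprise", "reward"]
    # Trivial stand-in selectors: always take the earliest remaining word.
    result = build_key_phrase(words, 3, lambda ts: ts[0], lambda cur, ts: ts[0])
    print(result)  # ['city', 'finance', 'gives']
```

With the average word count of 8 used in this embodiment, the loop would run until eight words are locked or no remaining word can be matched.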
Referring to fig. 8, the government policy deconstruction method of the present invention further includes step S70: performing grammar judgment on the summarized key phrase. If the grammar judgment passes, the summarized key phrase conforms to the grammar rules; if it fails, the summarized key phrase does not conform to the grammar rules, and the method returns to step S10 to reconstruct the key phrase.
Referring to fig. 9, the government policy deconstruction method of the present invention further includes step S80: manually verifying the key phrase that has passed grammar judgment. If the manual verification succeeds, the summarized key phrase is pushed to the target; if it fails, the method returns to step S10 to reconstruct the key phrase.
Referring to fig. 10, the government policy deconstruction method of the present invention further includes step S90: performing self-learning on the key phrases that pass manual verification, so as to improve the accuracy of summarization.
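The relationship between steps S70, S80 and S90 can be pictured as a retry loop around the phrase builder. The following Python sketch is illustrative only: every name in it (`review_loop`, `build`, `grammar_ok`, `human_ok`, `on_accept`) is hypothetical, and the bounded number of attempts is an assumption added so that the loop terminates, which the description does not specify.

```python
from typing import Callable, List, Optional


def review_loop(
    build: Callable[[int], List[str]],
    grammar_ok: Callable[[List[str]], bool],
    human_ok: Callable[[List[str]], bool],
    on_accept: Callable[[List[str]], None],
    max_attempts: int = 3,
) -> Optional[List[str]]:
    """Rebuild the key phrase until it passes both checks or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        phrase = build(attempt)              # back to S10: construct a phrase
        if not grammar_ok(phrase):           # S70: grammar judgment
            continue
        if not human_ok(phrase):             # S80: manual verification
            continue
        on_accept(phrase)                    # push to target, then S90 learning
        return phrase
    return None                              # assumption: give up after the cap


if __name__ == "__main__":
    accepted: List[List[str]] = []
    result = review_loop(
        build=lambda n: ["word"] * n,        # stand-in builder for the demo
        grammar_ok=lambda p: len(p) >= 2,
        human_ok=lambda p: len(p) >= 3,
        on_accept=accepted.append,
    )
    print(result)  # ['word', 'word', 'word']
```

Passing the attempt number to `build` is one way to let a rebuilt phrase differ from the rejected one; how the method varies its output after a rejection is left open by the description.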
In step S90, referring to fig. 11, performing self-learning on the key phrases that pass manual verification comprises the following steps:
Step S91: perform word segmentation on the successfully verified key phrase so that it is decomposed into a number of words. In this embodiment, the Baidu LAC word segmentation method may be used.
Step S92: establish a matching weight between every two adjacent words and store the established matching weights in the neuron library.
Step S93: perform part-of-speech recognition on each decomposed word and store the recognition result in the part-of-speech library.
Step S94: obtain near-synonyms of each decomposed word from the internet and store them in the near-synonym library.
Step S95: perform priority sorting on the decomposed words and store the sorting result in the word sequence library.
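A possible shape for the self-learning update of steps S92 to S95 is sketched below in Python. Several points are assumptions, since the description leaves them open: the matching weight is modelled as a count of how often one word directly follows another, the priority order is modelled as overall word frequency, and the part-of-speech tagger and near-synonym lookup are supplied as callables rather than tied to any real service. The class name `Libraries` and its attributes are hypothetical, and the input is assumed to be already segmented (step S91).

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable, List, Sequence, Set


class Libraries:
    """Hypothetical in-memory stand-ins for the four libraries."""

    def __init__(self) -> None:
        self.neuron: Dict[str, Counter] = defaultdict(Counter)  # word -> successors
        self.pos: Dict[str, str] = {}                           # part-of-speech library
        self.synonyms: Dict[str, Set[str]] = defaultdict(set)   # near-synonym library
        self.frequency: Counter = Counter()                     # basis for priority

    def learn(
        self,
        words: Sequence[str],
        tag: Callable[[str], str] = lambda w: "unknown",
        lookup: Callable[[str], Iterable[str]] = lambda w: (),
    ) -> None:
        """Update every library from one manually verified, segmented phrase."""
        for left, right in zip(words, words[1:]):   # S92: adjacent-pair weights
            self.neuron[left][right] += 1
        for word in words:
            self.pos[word] = tag(word)              # S93: part of speech
            self.synonyms[word].update(lookup(word))  # S94: near-synonyms
            self.frequency[word] += 1

    def word_sequence(self) -> List[str]:
        """S95: words in priority order (assumed: most frequent first)."""
        return [word for word, _ in self.frequency.most_common()]


if __name__ == "__main__":
    libs = Libraries()
    libs.learn(["city", "finance", "reward"])
    libs.learn(["city", "reward"])
    print(libs.neuron["city"]["finance"], libs.word_sequence()[0])  # 1 city
```

Counting adjacent pairs keeps the update incremental: each newly verified phrase strengthens the pairs it contains without any retraining.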
Referring to fig. 12, a government policy critical information constructing apparatus is shown, which includes an average word count calculation module 100, a policy text acquisition module 200, a word segmentation processing module 300, a first word and sentence matching module 400, a second word and sentence matching module 500, a grammar judgment processing module 600, and a self-learning module 700.
The average word count calculation module 100 is used to determine the average word count of the key phrases to be summarized.
The policy text acquisition module 200 is used to acquire the policy text issued on the government website.
The word segmentation processing module 300 is configured to perform word segmentation on the acquired policy text so that the policy text is decomposed into a number of words.
The first word and sentence matching module 400 is configured to find, among the decomposed words, the first-order word of the key phrase to be summarized.
The second word and sentence matching module 500 is configured to find, according to the locked current-order word, the word with the highest matching weight among the decomposed words and lock it as the next-order word of the key phrase to be summarized. That word then becomes the current-order word, and the process repeats until the number of words in the key phrase to be summarized reaches the average word count.
The grammar judgment processing module 600 is configured to perform grammar judgment on the summarized key phrase. If the grammar judgment passes, the summarized key phrase conforms to the grammar rules; if it fails, the key phrase does not conform to the grammar rules and is reconstructed.
The self-learning module 700 is used to perform self-learning on the summarized key phrases in order to improve the accuracy of summarization. Specifically, the self-learning module 700 first performs word segmentation on the manually verified key phrase so that it is decomposed into a number of words. It then establishes a matching weight between every two adjacent words and stores the established matching weights in the neuron library. Next, it performs part-of-speech recognition on each decomposed word and stores the recognition result in the part-of-speech library. It then obtains near-synonyms of each decomposed word from the internet and stores them in the near-synonym library. Finally, it performs priority sorting on the decomposed words and stores the sorting result in the word sequence library.
The various modules in the government policy critical information building apparatus of this invention may be implemented in whole or in part in software, hardware and combinations thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
The present invention also provides a computer device for implementing the above-mentioned government policy deconstruction method, which may be a server, and its internal structure diagram may be as shown in fig. 13. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing data such as user information, record information and files. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program when executed by a processor implements a method of government policy deconstruction as described above.
Those skilled in the art will appreciate that the architecture shown in fig. 13 is merely a block diagram of some of the structures associated with the present solution and does not limit the computer devices to which the present solution applies. A particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
Specifically, the computer device of the present invention includes a memory storing a computer program and a processor, and the processor implements the following steps when executing the computer program:
step S10, determining the average word count of the key phrases to be summarized;
step S20, acquiring a policy text issued on the government website;
step S30, performing word segmentation on the acquired policy text so that the policy text is decomposed into a number of words;
step S40, finding, among the decomposed words, the first-order word of the key phrase to be summarized;
step S50, according to the locked current-order word, finding the word with the highest matching weight among the decomposed words and locking it as the next-order word of the key phrase to be summarized, whereupon that word becomes the current-order word of the key phrase to be summarized;
and step S60, repeating step S50 until the number of words in the key phrase to be summarized reaches the average word count.
The present invention also provides a computer-readable storage medium for implementing the above-mentioned government policy deconstruction method, having stored thereon a computer program which, when executed by a processor, performs the following steps:
step S10, determining the average word count of the key phrases to be summarized;
step S20, acquiring a policy text issued on the government website;
step S30, performing word segmentation on the acquired policy text so that the policy text is decomposed into a number of words;
step S40, finding, among the decomposed words, the first-order word of the key phrase to be summarized;
step S50, according to the locked current-order word, finding the word with the highest matching weight among the decomposed words and locking it as the next-order word of the key phrase to be summarized, whereupon that word becomes the current-order word of the key phrase to be summarized;
and step S60, repeating step S50 until the number of words in the key phrase to be summarized reaches the average word count.
It will be understood by those skilled in the art that all or part of the processes of the method embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The foregoing shows and describes the general principles and broad features of the present invention and advantages thereof. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (14)
1. A government policy deconstruction method comprising the steps of:
step S10, determining the average word count of the key phrases to be summarized;
step S20, acquiring a policy text issued on the government website;
step S30, performing word segmentation on the acquired policy text so that the policy text is decomposed into a number of words;
step S40, finding, among the decomposed words, the first-order word of the key phrase to be summarized;
step S50, according to the locked current-order word, finding the word with the highest matching weight among the decomposed words and locking it as the next-order word of the key phrase to be summarized, whereupon that word becomes the current-order word of the key phrase to be summarized;
and step S60, repeating step S50 until the number of words in the key phrase to be summarized reaches the average word count.
2. The government policy deconstruction method of claim 1, wherein determining the average word count of the key phrases to be summarized in step S10 comprises the steps of:
step S11, acquiring all key phrases in the historical database;
step S12, performing word segmentation on each key phrase so that each key phrase is decomposed into a number of words;
and step S13, counting the number of words in each key phrase after word segmentation and averaging the counts, the calculated value being the average word count of the key phrases to be summarized.
3. The government policy deconstruction method of claim 1, wherein performing word segmentation on the acquired policy text in step S30 comprises the steps of:
step S31, dividing the acquired policy text into paragraphs so that the policy text is decomposed into a number of paragraphs;
step S32, dividing each decomposed paragraph into sentences so that each paragraph is decomposed into a number of sentences;
and step S33, performing word segmentation on each decomposed sentence so that each sentence is decomposed into a number of words.
4. The government policy deconstruction method of claim 1, wherein finding, among the decomposed words, the first-order word of the key phrase to be summarized in step S40 comprises the steps of:
step S41, finding the first-order word in the word sequence library and matching it against the decomposed words one by one; if a match succeeds, proceeding to step S42, and if it fails, proceeding to step S43;
step S42, locking the matched word as the first-order word of the key phrase to be summarized;
step S43, finding the near-synonym of the current word in the word sequence library and matching it against the decomposed words one by one; if a match succeeds, returning to step S42, and if it fails, proceeding to step S44;
and step S44, taking the next word from the word sequence library as the current-order word and matching it against the decomposed words one by one; if a match succeeds, returning to step S42, and if it fails, returning to step S43.
5. The government policy deconstruction method of claim 1, wherein finding the word with the highest matching weight among the decomposed words according to the locked current-order word, and locking it as the next-order word of the key phrase to be summarized in step S50, comprises the steps of:
step S51, according to the locked current-order word, finding the word with the highest matching weight in the neuron library and matching it against the decomposed words one by one; if a match succeeds, proceeding to step S52, and if it fails, proceeding to step S53;
step S52, locking the matched word as the next-order word of the key phrase to be summarized;
step S53, lowering the matching weight, finding in the neuron library the word corresponding to the lowered matching weight for the locked current-order word, and matching it against the decomposed words one by one; if a match succeeds, returning to step S52, and if it fails, proceeding to step S54;
step S54, taking the next word from the word sequence library as the current-order word and matching it against the decomposed words one by one; if a match succeeds, returning to step S52, and if it fails, proceeding to step S55;
step S55, finding the near-synonym of the current word in the word sequence library and matching it against the decomposed words one by one; if a match succeeds, returning to step S52, and if it fails, proceeding to step S56;
and step S56, taking the next word from the word sequence library and matching it against the decomposed words one by one; if a match succeeds, returning to step S52, and if it fails, returning to step S54.
6. The government policy deconstruction method of any one of claims 1-5, further comprising step S70 of performing grammar judgment on the summarized key phrase, wherein if the grammar judgment passes, the summarized key phrase conforms to the grammar rules, and if the grammar judgment fails, the summarized key phrase does not conform to the grammar rules and the method returns to step S10 to reconstruct the key phrase.
7. The government policy deconstruction method of claim 6, further comprising step S80 of manually verifying the key phrase that has passed grammar judgment, wherein if the manual verification succeeds, the summarized key phrase is pushed to the target, and if the manual verification fails, the method returns to step S10 to reconstruct the key phrase.
8. The government policy deconstruction method of claim 7, further comprising step S90 of performing self-learning on the key phrases that pass manual verification.
9. The government policy deconstruction method of claim 8, wherein performing self-learning on the key phrases that pass manual verification in step S90 comprises the steps of:
step S91, performing word segmentation on the successfully verified key phrase so that it is decomposed into a number of words;
step S92, establishing a matching weight between every two adjacent words and storing the established matching weights in the neuron library;
step S93, performing part-of-speech recognition on each decomposed word and storing the recognition result in the part-of-speech library;
step S94, obtaining near-synonyms of each decomposed word from the internet and storing them in the near-synonym library;
and step S95, performing priority sorting on the decomposed words and storing the sorting result in the word sequence library.
10. A government policy critical information constructing apparatus comprising:
an average word count calculation module for determining the average word count of the key phrases to be summarized;
a policy text acquisition module for acquiring a policy text issued on the government website;
a word segmentation processing module for performing word segmentation on the acquired policy text so that the policy text is decomposed into a number of words;
a first word and sentence matching module for finding, among the decomposed words, the first-order word of the key phrase to be summarized; and
a second word and sentence matching module for finding, according to the locked current-order word, the word with the highest matching weight among the decomposed words, locking it as the next-order word of the key phrase to be summarized so that it becomes the current-order word, and repeating this process until the number of words in the key phrase to be summarized reaches the average word count.
11. The government policy critical information constructing apparatus of claim 10, further comprising a grammar judgment processing module configured to perform grammar judgment on the summarized key phrase, wherein if the grammar judgment passes, the summarized key phrase conforms to the grammar rules, and if the grammar judgment fails, the summarized key phrase does not conform to the grammar rules and is reconstructed.
12. The government policy critical information constructing apparatus of claim 10, further comprising a self-learning module for performing self-learning on the summarized key phrases.
13. A computer device for implementing the above-mentioned government policy deconstruction method, comprising a memory storing a computer program and a processor, the processor implementing the steps of the government policy deconstruction method according to any one of claims 1-9 when executing the computer program.
14. A computer-readable storage medium for implementing the above-mentioned government policy deconstruction method, having stored thereon a computer program which, when executed by a processor, performs the steps of the government policy deconstruction method of any one of claims 1-9.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111441163.XA CN114140077A (en) | 2021-11-30 | 2021-11-30 | A kind of government policy deconstruction method, apparatus, computer equipment and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN114140077A (en) | 2022-03-04 |
Family
ID=80389693
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111441163.XA Pending CN114140077A (en) | 2021-11-30 | 2021-11-30 | A kind of government policy deconstruction method, apparatus, computer equipment and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN114140077A (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5077668A (en) * | 1988-09-30 | 1991-12-31 | Kabushiki Kaisha Toshiba | Method and apparatus for producing an abstract of a document |
| CN108519970A (en) * | 2018-02-06 | 2018-09-11 | 平安科技(深圳)有限公司 | The identification method of sensitive information, electronic device and readable storage medium storing program for executing in text |
| CN109977390A (en) * | 2017-12-27 | 2019-07-05 | 北京搜狗科技发展有限公司 | A kind of method and device generating text |
| CN111178065A (en) * | 2019-12-12 | 2020-05-19 | 中国建设银行股份有限公司 | Word segmentation recognition word stock construction method, Chinese word segmentation method and device |
| CN111930805A (en) * | 2020-08-10 | 2020-11-13 | 中国平安人寿保险股份有限公司 | Information mining method and computer equipment |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US11914968B2 (en) | Official document processing method, device, computer equipment and storage medium | |
| CN111061833B (en) | Data processing method and device, electronic equipment and computer readable storage medium | |
| CN111783471B (en) | Semantic recognition method, device, equipment and storage medium for natural language | |
| CN110457302B (en) | Intelligent structured data cleaning method | |
| WO2020232882A1 (en) | Named entity recognition method and apparatus, device, and computer readable storage medium | |
| CN109299235B (en) | Knowledge base searching method, device and computer readable storage medium | |
| CN110362798B (en) | Method, apparatus, computer device and storage medium for judging information retrieval analysis | |
| CN110851576A (en) | Question and answer processing method, device, equipment and readable medium | |
| CN111339166A (en) | Thesaurus-based matching recommendation method, electronic device and storage medium | |
| CN115470861A (en) | Data processing method, device and electronic device | |
| US11055200B2 (en) | Systems and methods for validating domain specific models | |
| CN120146050A (en) | Service push method, system, device and storage medium based on public opinion analysis | |
| CN120764501A (en) | Business reporting scenario data collection method, system, storage medium and electronic device | |
| CN116776900A (en) | Enhanced data screening method, device, equipment and medium based on multilingual model | |
| US11775757B2 (en) | Automated machine-learning dataset preparation | |
| CN120144719A (en) | Question and answer processing method, device, equipment and storage medium based on artificial intelligence | |
| CN114140077A (en) | A kind of government policy deconstruction method, apparatus, computer equipment and storage medium | |
| US20240354507A1 (en) | Keyword extraction method, device, computer equipment and storage medium | |
| CN118916453A (en) | Intelligent operation and maintenance method based on self-developed GPT model and related equipment thereof | |
| CN111401009B (en) | A digital emoticon recognition and conversion method, device, server and storage medium | |
| CN110705258A (en) | Text entity identification method and device | |
| US20200302914A1 (en) | Method, device, computer apparatus, and storage medium of processing claim data | |
| CN115345132A (en) | File processing method, device and equipment | |
| CN114490934A (en) | Element detection method and device of business link, computer equipment and storage medium | |
| CN115455187B (en) | Event extraction methods, apparatus, computer equipment and storage media |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination |