WO2017117781A1 - Method and system for classifying network information - Google Patents
Method and system for classifying network information
- Publication number
- WO2017117781A1 (application PCT/CN2016/070405)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- network information
- category
- word segmentation
- segmentation processing
- present
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Definitions
- The present invention relates to the field of the Internet and, in particular, to a method and system for classifying network information.
- A network consists of nodes and connections and represents a set of objects and their interconnections.
- Mathematically, a network is a kind of graph, generally taken to be a weighted graph.
- Beyond its mathematical definition, a network also has a specific physical meaning: it is a model abstracted from practical problems of the same type.
- In the field of computers, a network is a virtual platform for transmitting, receiving, and sharing information. It links information from individual points, surfaces, and bodies so that these resources can be shared.
- The network is among the most important inventions in the history of human development and has advanced science, technology, and human society.
- The volume of network information is massive, so classifying it has become a key technology in network search. Many methods for classifying network information exist at present, but most of them classify inaccurately.
- The present application provides a method for classifying network information that addresses the inaccurate classification of network information in prior-art technical solutions.
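- A method for classifying network information, comprising the following steps:
- obtaining the network information that needs to be classified;
- performing word segmentation on the network information and counting the occurrences of each identical word obtained by the word segmentation;
- taking the category of the most frequent word as the first category of the network information.
- The method further includes: inputting the network information into the learning vector machine corresponding to the first category to output a second category of the network information and, if the second category is the same as the first category, determining that the category of the network information is the first category.
- The method further includes: if the second category is different from the first category, re-classifying the network information.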
- A classification system for network information, comprising:
- an obtaining unit configured to obtain the network information that needs to be classified;
- a word segmentation unit configured to perform word segmentation on the network information and to count the occurrences of each identical word obtained by the word segmentation;
- a category unit configured to take the category of the most frequent word as the first category of the network information.
- The system further includes:
- a learning unit configured to input the network information into the learning vector machine corresponding to the first category to output a second category of the network information and, if the second category is the same as the first category, to determine that the category of the network information is the first category.
- The system further includes:
- a reprocessing unit configured to re-classify the network information if the second category is different from the first category.
- The technical solution provided by the present invention segments the network information into words, captures its keywords, and classifies the information according to how often each keyword occurs, which gives it the advantage of accurate classification.
- FIG. 1 is a flowchart of a method for classifying network information according to a first preferred embodiment of the present invention.
- FIG. 2 is a structural diagram of a network information classification system according to a second preferred embodiment of the present invention.
- FIG. 1 shows a network information classification method according to a first preferred embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
- Step S101: Obtain the network information that needs to be classified.
- Step S102: Perform word segmentation on the network information and count the occurrences of each identical word obtained by the word segmentation.
- The word segmentation may use an existing algorithm, such as the Baidu word segmentation algorithm.
- Step S103: Take the category of the most frequent word as the first category of the network information.
- Because the method segments the network information into words, captures its keywords, and classifies by keyword frequency, it has the advantage of accurate classification.
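Steps S101 to S103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the `WORD_CATEGORY` lookup table, the regex-based `segment` function (a stand-in for a real segmenter such as the Baidu algorithm mentioned above), and the choice to ignore words that have no known category are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical word-to-category table; the source does not say how a
# word is mapped to a category.
WORD_CATEGORY = {
    "football": "sports",
    "goal": "sports",
    "stock": "finance",
    "market": "finance",
}


def segment(text):
    """Stand-in for word segmentation (S102): lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def first_category(text, word_category=WORD_CATEGORY):
    """Count identical words (S102) and return the category of the most frequent one (S103).

    Assumption: words without a known category are skipped, so that
    common function words do not decide the result.
    """
    counts = Counter(w for w in segment(text) if w in word_category)
    if not counts:
        return None  # no word with a known category was found
    word, _ = counts.most_common(1)[0]
    return word_category[word]
```

When two words tie for the highest count, `Counter.most_common` returns the one seen first, so the result depends on word order; the source does not describe a tie-breaking rule.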
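- The foregoing method may further include: inputting the network information into the learning vector machine corresponding to the first category to output a second category of the network information and, if the second category is the same as the first category, determining that the category of the network information is the first category.
- The foregoing method may further include: if the second category is different from the first category, re-classifying the network information.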
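The optional check against a second category can be sketched as below. The source names a "learning vector machine" without describing it and does not say how re-classification is carried out, so both are modeled here as plain callables; `models`, `reclassify`, and `verify_category` are hypothetical names used only for illustration.

```python
def verify_category(text, first, models, reclassify):
    """Confirm the first category with a second classifier, or re-classify.

    models:     dict mapping a first category to a callable that returns
                a second category for the text (assumed stand-in for the
                learning vector machine of that category).
    reclassify: callable used when the two categories differ (assumed;
                the re-classification procedure is not specified).
    """
    second = models[first](text)
    if second == first:
        return first          # both categories agree: keep the first one
    return reclassify(text)   # categories differ: classify again
```

A caller would pass the result of the frequency-based step as `first`; keeping the classifiers as parameters makes it possible to swap in any trained model without changing the control flow.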
- FIG. 2 shows a network information classification system according to a second preferred embodiment of the present invention.
- As shown in FIG. 2, the system includes:
- the obtaining unit 201, configured to obtain the network information that needs to be classified;
- the word segmentation unit 202, configured to perform word segmentation on the network information and to count the occurrences of each identical word obtained by the word segmentation;
- The word segmentation may use an existing algorithm, such as the Baidu word segmentation algorithm.
- the category unit 203, configured to take the category of the most frequent word as the first category of the network information.
- Because the system segments the network information into words, captures its keywords, and classifies by keyword frequency, it has the advantage of accurate classification.
- The above system may further include:
- the learning unit 204, configured to input the network information into the learning vector machine corresponding to the first category to output a second category of the network information and, if the second category is the same as the first category, to determine that the category of the network information is the first category.
- The above system may further include:
- the reprocessing unit 205, configured to re-classify the network information if the second category is different from the first category.
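One way to picture how units 201 to 205 fit together is a small container that holds each unit as a callable. This is only a structural sketch under assumptions: the class name `ClassificationSystem`, the field names, and the `run` method are hypothetical, and the source does not prescribe any particular software structure.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ClassificationSystem:
    obtain: Callable[[], str]                          # obtaining unit 201
    segment_and_count: Callable[[str], Counter]        # word segmentation unit 202
    pick_category: Callable[[Counter], Optional[str]]  # category unit 203
    learn: Callable[[str], Optional[str]]              # learning unit 204 (returns the second category)
    reprocess: Callable[[str], Optional[str]]          # reprocessing unit 205

    def run(self):
        """Run the units in order and return the final category."""
        text = self.obtain()
        counts = self.segment_and_count(text)
        first = self.pick_category(counts)
        second = self.learn(text)
        # Keep the first category when both agree; otherwise re-classify.
        return first if second == first else self.reprocess(text)
```

Each unit can then be replaced independently, for example by swapping the segmenter, without touching the others.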
- The program may be stored in a computer-readable storage medium, which may include a flash drive, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
- ROM Read-Only Memory
- RAM Random Access Memory
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Data Mining & Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a method and system for classifying network information. The method comprises the following steps: acquiring network information that needs to be classified (101); performing word segmentation on the network information and counting the occurrences of identical words obtained by the word segmentation (102); and taking the category of the most frequent word as the first category of the network information (103). The classification method and system according to the invention have the advantage of accurate classification.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201680000012.1A CN105723367A (zh) | 2016-01-07 | 2016-01-07 | Method and system for classifying network information |
| PCT/CN2016/070405 WO2017117781A1 (fr) | 2016-01-07 | 2016-01-07 | Method and system for classifying network information |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2016/070405 WO2017117781A1 (fr) | 2016-01-07 | 2016-01-07 | Method and system for classifying network information |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2017117781A1 (fr) | 2017-07-13 |
Family
ID=56162469
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2016/070405 (Ceased) WO2017117781A1 (fr) | 2016-01-07 | 2016-01-07 | Method and system for classifying network information |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN105723367A (fr) |
| WO (1) | WO2017117781A1 (fr) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106777401A (zh) * | 2017-03-10 | 2017-05-31 | 北京搜狐新媒体信息技术有限公司 | Information classification method and device |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20050246333A1 (en) * | 2004-04-30 | 2005-11-03 | Jiang-Liang Hou | Method and apparatus for classifying documents |
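| CN101794311A (zh) * | 2010-03-05 | 2010-08-04 | 南京邮电大学 | Automatic Chinese web page classification method based on fuzzy data mining |
| CN102819595A (zh) * | 2012-08-10 | 2012-12-12 | 北京星网锐捷网络技术有限公司 | Web page classification method, device, and network equipment |
| CN103049568A (zh) * | 2012-12-31 | 2013-04-17 | 武汉传神信息技术有限公司 | Method for classifying documents in a massive document library |
| CN104750833A (zh) * | 2015-04-03 | 2015-07-01 | 浪潮集团有限公司 | Text classification method and device |
| CN104750754A (zh) * | 2013-12-31 | 2015-07-01 | 北龙中网(北京)科技有限责任公司 | Method and server for classifying the industry to which a website belongs |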
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102222072A (zh) * | 2010-04-19 | 2011-10-19 | 腾讯科技(深圳)有限公司 | Information classification method and device |
2016
- 2016-01-07 CN CN201680000012.1A, published as CN105723367A (zh), status: Pending
- 2016-01-07 WO PCT/CN2016/070405, published as WO2017117781A1 (fr), status: Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| CN105723367A (zh) | 2016-06-29 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112528672B (zh) | | Aspect-level sentiment analysis method and device based on graph convolutional neural network |
| US20230117973A1 (en) | | Data processing method and apparatus |
| WO2022057776A1 (fr) | | Model compression method and apparatus |
| US20160078344A1 (en) | | Extraction of inference rules from heterogeneous graphs |
| CN113505206B (zh) | | Information processing method and device based on natural language inference, and electronic equipment |
| WO2020192523A1 (fr) | | Translation quality detection method and apparatus, machine translation system, and storage medium |
| CN111339308A (zh) | | Training method and device for a basic classification model, and electronic equipment |
| WO2017117806A1 (fr) | | Term search method and system for web information |
| CN111930858A (zh) | | Representation learning method and device for heterogeneous information networks, and electronic equipment |
| CN111563172A (zh) | | Academic hotspot trend prediction method and device based on dynamic knowledge graph construction |
| Wang et al. | | Research on the Application of Improved BERT‐DPCNN Model in Chinese News Text Classification |
| WO2017117781A1 (fr) | | Method and system for classifying network information |
| WO2026056447A1 (fr) | | Model inference method and system |
| WO2017117783A1 (fr) | | System and method for searching network information |
| Wang et al. | | A software-hardware co-exploration framework for optimizing communication in neuromorphic processor |
| WO2017128357A1 (fr) | | Big-data-based method and system for web page analysis |
| CN115954058B (zh) | | Organic reaction classification method and device, electronic equipment, and storage medium |
| Chang et al. | | Applying code transform model to newly generated program for improving execution performance |
| WO2017117782A1 (fr) | | Method and system for word segmentation processing of network information |
| US11126912B2 (en) | | Realigning streams of neuron outputs in artificial neural network computations |
| An et al. | | Naturalness of ontology concepts for rating aspects of the semantic web |
| WO2026060937A1 (fr) | | Model inference method and model inference system |
| WO2026051472A1 (fr) | | Link establishment method for model execution, control node, cluster, and program product |
| EP4278304A1 (fr) | | Optimization of operations in an artificial neural network |
| WO2017117805A1 (fr) | | Method and system for capturing web information |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16882926; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 16882926; Country of ref document: EP; Kind code of ref document: A1 |