WO2001006407A1 - Method for converting text information (Procede de conversion d'informations texte) - Google Patents
- Publication number
- WO2001006407A1 (application PCT/RU2000/000276)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- word
- code
- information
- words
- proposal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/268—Morphological analysis
Definitions
- The data of the program is information received from the user or from the user's carrier (communication channel).
- The development is based on using the semantic bases of the lexicon of two (or more) languages.
- Databases of the word stock (the set of words of the language) are used, and the minimum unit of text information is taken to be the word.
- Morphological changes in words are analysed at different depths (in different situations), together with the correlations between them.
- A database of the structure (the set of rules) of the language(s) is used, i.e. the rules for words, morphology, syntax, history, punctuation and stylistics.
- The main basis for such hardware and software is linguistic processing and machine translation. Such processing is possible because the words of all natural languages carry meaning.
- The drawbacks of the prototype are uneconomical use of the communication channel, the presence of about 15% spaces in the transmitted information, an insufficiently high information transmission speed, and access to information in only a limited number of languages.
- The claimed invention eliminates the listed deficiencies.
- The objective of the invention is to improve the technology for converting text information obtained from a natural language.
- Technical result: rational use of the communication channel, a significant increase in the transmission speed of information signals, the interruption of communication
- Moreover, when a word is converted into code and back, the word-formation and morphological features of the word (declension, tense, gender, number, etc.) are used.
- The average word length in most languages of the Romance-Germanic group is in the range of 1-6 characters (7-8 characters for Slavic languages).
- 6-8 characters or 48-64 bits (6-8 bytes) are required.
- The language in which the text information is originally entered and the language in which it is finally received do not depend on each other or on the user interface. Any user will be able to work with any information in their own native language.
- Fig. 1 shows tables 1 and 2 of the words in accordance with the invention.
- Fig. 2 shows table 3 of the categories of the binary code.
- Figs. 3 and 4 show tables 4 and 5, which demonstrate the proposed method of processing text obtained from a natural language in communication with a foreign language.
- Fig. 5 shows a block diagram of the known conversion of text information into information-exchange signals over a communication channel; Fig. 6 shows the scheme for processing information and exchanging information signals over the communication channel that implements the claimed method.
- A third user works in another language, and the third computer contains a system for converting from the foreign language into that user's language.
- The proposed scheme for converting text information obtained from a natural language into information-exchange signals over a communication channel (Fig. 6) includes computers 1 and 2, linked by information exchange (the Internet, communication lines, diskettes, etc.) with Users 1 and 2.
- Each computer has a system for analysing and composing texts that is based on databases of the user's language.
- The scheme works as follows.
- The information-exchange system in computer 1 produces the codes for the meanings of the words, taking into account the features of each word, including its word-formation forms, morphological features and syntactic connections. These codes and, where needed, additional codes are appended to the main code. To the group of codes obtained from the words of one sentence, a code for the type of that sentence is added.
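The character counts quoted in this list (6-8 characters, i.e. 48-64 bits per word) can be set against a word-level code with a small calculation. The sketch below is illustrative only: 8 bits per character follows from the 48-64 bit figure, while the 24-bit code width, the function names and the sample sentence are assumptions that do not come from the patent.

```python
# Assumption: 8 bits per character, consistent with "6-8 characters or 48-64 bits".
BITS_PER_CHAR = 8
# Hypothetical fixed width of one word code; the patent text here gives no width.
WORD_CODE_BITS = 24


def char_bits(text: str) -> int:
    """Bits needed to send the text character by character, spaces included."""
    return len(text) * BITS_PER_CHAR


def word_code_bits(text: str, code_bits: int = WORD_CODE_BITS) -> int:
    """Bits needed when every whitespace-separated word is sent as one code."""
    return len(text.split()) * code_bits


sample = "the method converts words into binary codes"
print("character coding:", char_bits(sample), "bits")
print("word coding:     ", word_code_bits(sample), "bits")
```

With these assumed widths no separate code is spent on the spaces between words, since the word boundaries are implicit in the fixed-width codes.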
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Pharmaceuticals Containing Other Organic And Inorganic Compounds (AREA)
Abstract
This invention relates to the field of computer science and computing technology, and can be used in particular in systems that convert and encode text information drawn from natural languages into information-exchange signals over a communication channel. The invention makes it possible to use an information communication channel rationally, to increase the information transmission speed substantially, and to work with texts in any language. To this end, a single-valued binary code is assigned when the text information is converted, and the unit of text to be encoded is not a symbol (letter) but a word. The conversion into binary code is performed using a table of the semantic content of the word. When the word is converted into code and back, the word-formation and morphological characteristics of the word (declension, tense, gender, number, etc.) are used. In addition, the binary code of the word is supplemented with a code for the syntactic link between that word and the other words in a given sentence, or with a complementary code that increases the noise immunity of the main code. To the group of codes obtained from the words of a sentence, a code describing the type of the sentence (simple, complex, subordinate, etc.) is also added, or a code containing information on the interactions between the words of the sentence.
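The steps named in the abstract (a semantic code per word, plus morphological and syntactic-link codes, plus a sentence-type code for the group) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's actual tables or bit layout: the toy lexicons, the field widths (16, 4 and 4 bits) and every function name are hypothetical.

```python
# Hypothetical toy tables; the patent's own tables (Figs. 1-4) are not reproduced here.
SEMANTIC = {"cat": 1, "see": 2, "dog": 3}          # English lemma -> semantic code
SEMANTIC_FR = {"chat": 1, "voir": 2, "chien": 3}   # French lemma -> the same codes
MORPH = {"base": 0, "plural": 1, "past": 2}        # morphological feature -> code
SYNTAX = {"subject": 1, "predicate": 2, "object": 3}  # syntactic link -> code
SENTENCE_TYPE = {"simple": 1, "complex": 2, "subordinate": 3}

# Assumed field widths in bits (not specified in the abstract).
SEM_BITS, MORPH_BITS, SYN_BITS = 16, 4, 4


def invert(table):
    """Return the code -> key mapping of a key -> code table."""
    return {code: key for key, code in table.items()}


def encode_word(lemma, morph, role, lexicon=SEMANTIC):
    """Pack the semantic, morphological and syntactic codes into one integer."""
    sem = lexicon[lemma]
    return (sem << (MORPH_BITS + SYN_BITS)) | (MORPH[morph] << SYN_BITS) | SYNTAX[role]


def decode_word(code, lexicon=SEMANTIC):
    """Unpack one word code, looking the lemma up in the receiver's lexicon."""
    role = code & ((1 << SYN_BITS) - 1)
    morph = (code >> SYN_BITS) & ((1 << MORPH_BITS) - 1)
    sem = (code >> (MORPH_BITS + SYN_BITS)) & ((1 << SEM_BITS) - 1)
    return invert(lexicon)[sem], invert(MORPH)[morph], invert(SYNTAX)[role]


def encode_sentence(words, sentence_type, lexicon=SEMANTIC):
    """Prefix the word codes of one sentence with a sentence-type code."""
    return [SENTENCE_TYPE[sentence_type]] + [
        encode_word(*w, lexicon=lexicon) for w in words
    ]


def decode_sentence(codes, lexicon=SEMANTIC):
    """Recover the sentence type and the (lemma, morphology, role) triples."""
    sentence_type = invert(SENTENCE_TYPE)[codes[0]]
    return sentence_type, [decode_word(c, lexicon) for c in codes[1:]]


words = [("cat", "base", "subject"), ("see", "past", "predicate"), ("dog", "plural", "object")]
codes = encode_sentence(words, "simple")
print(codes)
print(decode_sentence(codes, SEMANTIC_FR))
```

Because the sender and receiver share only the numeric codes, decoding with a different lexicon (here the hypothetical `SEMANTIC_FR`) yields the lemmas in the receiver's language, which is the language-independence property the abstract describes. Generating the inflected surface forms from the decoded features is left out of this sketch.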
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| RU99115906/09A (RU99115906A, ru) | 1999-07-19 | | Method for converting text information (Способ преобразования текстовой информации) |
| RU99115906 | 1999-07-19 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2001006407A1 (fr) | 2001-01-25 |
Family
ID=20222955
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/RU2000/000276 (WO2001006407A1, fr; ceased) | Method for converting text information (Procede de conversion d'informations texte) | 1999-07-19 | 2000-07-04 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2001006407A1 (fr) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| DE4311211A1 (de) * | 1993-04-05 | 1994-10-06 | IBM | Computer system and method for the automated analysis of a text (Computersystem und Verfahren zur automatisierten Analyse eines Textes) |
| RU2107942C1 (ru) * | 1994-01-10 | 1998-03-27 | Aleksandr Andreevich Shpakov (Александр Андреевич Шпаков) | Method for establishing the location of an object in a repository by a thematic search attribute |
| US5761688A (en) * | 1994-12-26 | 1998-06-02 | Sharp Kabushiki Kaisha | Dictionary retrieval apparatus |
| RU2131620C1 (ru) * | 1997-03-25 | 1999-06-10 | Aleksandr Fedorovich Popov (Попов Александр Федорович) | Device for information communication |
- 2000-07-04: WO application PCT/RU2000/000276 filed (publication WO2001006407A1, fr); status: not active, ceased
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Koenig | | Compounding mixed-methods problems in frame analysis through comparative research |
| Satapathy et al. | | Phonetic-based microtext normalization for twitter sentiment analysis |
| US6405190B1 (en) | | Free format query processing in an information search and retrieval system |
| Oflazer | | Two-level description of Turkish morphology |
| Lucchesi et al. | | Applications of finite automata representing large vocabularies |
| Le et al. | | Sentiment analysis for low resource languages: A study on informal Indonesian tweets |
| CN101923858B (zh) | | Real-time synchronous mutual-translation voice terminal |
| CA2089177A1 (fr) | | Communication system with text message retrieval based on concepts entered via the keyboard |
| Becker et al. | | Text generation: A systematic literature review of tasks, evaluation, and challenges |
| EP3276507A1 (fr) | | Encoding device, encoding method and search method |
| Yang et al. | | Hierarchical summarization of large documents |
| CN114297353A (zh) | | Data processing method, apparatus, storage medium and device |
| CA3110046A1 (fr) | | Lexical discovery by machine learning |
| US5075851A (en) | | System for translating a source language word with a prefix into a target language word with multiple forms |
| CN101551798A (zh) | | Translation input method and character library |
| HaCohen‐Kerner et al. | | HAADS: A Hebrew Aramaic abbreviation disambiguation system |
| WO2001006407A1 (fr) | | Method for converting text information |
| CN1575467A (zh) | | Computerized encoder-decoder not restricted by language or method |
| Kwon et al. | | Making your tweets more fancy: Emoji insertion to texts |
| Andreyev | | Models as a Tool in the Development of Linguistic Theory |
| CN115905865A (zh) | | Training method for a text-merging judgment model and text-merging judgment method |
| Juma Zagood | | An analytical study of the strategies used in translating Trump’s tweets into Arabic |
| Skyttner | | Information theory‐a psychological study in old and new concepts |
| Walter et al. | | Beyond xier: An Exploration of German Nonbinary Pronoun Usage and Discourse on Twitter |
| Nghiem et al. | | A hybrid approach for semantic enrichment of MathML mathematical expressions |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1. Designated state(s): US |
| | AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
| | 122 | EP: PCT application non-entry in European phase | |