WO2001006407A1 - Method for converting text information - Google Patents

Method for converting text information

Info

Publication number
WO2001006407A1
Authority
WO
WIPO (PCT)
Prior art keywords
word
code
information
words
proposal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/RU2000/000276
Other languages
English (en)
Russian (ru)
Inventor
Iliya Alexandrovich Boldov
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1999-07-19
Filing date
2000-07-04
Publication date
2001-01-25
Priority claimed from RU99115906/09A
Application filed by Individual
Publication of WO2001006407A1
Anticipated expiration
Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/268 Morphological analysis

Definitions

  • The data of the program is information received from the user or from the user's carrier (communication channel).
  • The development is based on using the meanings held in lexicon databases of two (or more) languages.
  • Databases of the word stock (the set of words of the language) are used, and the minimal unit of textual information is taken to be the word.
  • Morphological changes in the word are analyzed at different depths (in different situations), together with the correlations between them.
  • A database of the structure (the set of rules) of the language(s) is used, i.e. word rules, morphology, syntax, history, punctuation and stylistics.
  • Such hardware and software rest mainly on linguistic processing and on machine translation. Such processing is possible chiefly because all natural languages have words that carry meaning.
  • The drawbacks include uneconomical use of the communication channel, the presence of about 15% spaces when information is transmitted, an insufficiently high information transmission speed, and access to information in only a limited number of languages.
  • The claimed invention eliminates the listed deficiencies.
  • The objective of the invention is to improve information technology for processing textual information obtained from natural language.
  • Technical result: rational use of the communication channel, a significant increase in the transmission speed of information signals, the interruption of communication
  • Moreover, when a word is converted into code and back, the word-formation and morphological features of the word (declension, tense, gender, number, etc.) are used.
  • The average word length in most languages of the Romano-Germanic group is in the range of 1-6 characters (for Slavic languages, 7-8 characters).
  • Encoding such a word character by character requires 6-8 characters, or 48-64 bits (6-8 bytes).
  • The language in which text information is originally entered and the language in which it is finally received do not depend on each other or on the user interface. Any user will be able to work with any information in their own native language.
  • Fig. 1 shows Tables 1 and 2 of the words in accordance with the invention.
  • Fig. 2 shows Table 3, the categories of the binary code.
  • Figs. 3 and 4 show Tables 4 and 5, which demonstrate the proposed method of processing text received from a natural language in the country of communication with the foreign language.
  • Fig. 5 shows a block diagram of the known conversion of textual information and information exchange over a communication channel; Fig. 6 shows a diagram of information processing and exchange of information signals over the communication channel that implements the claimed method.
  • The third user works in another language, and the third computer has a system for converting from the foreign language into the user's language.
  • The proposed scheme for converting textual information obtained from natural language into information-exchange signals over a communication channel (Fig. 6) includes computers 1 and 2, linked by information exchange (the Internet, communication lines, diskettes, etc.), with Users 1 and 2.
  • Each computer has a system for analyzing and composing texts that relies on databases of the user's language.
  • The scheme works as follows.
  • The information-exchange system of computer 1 generates codes for the meanings of the words, taking into account each word's word-formation form, morphological features and syntactic links. These codes, and additional codes where needed, are added to the main code. To the group of codes obtained from the words of one sentence, a code for the type of that sentence is added.
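
The size figures quoted in the list above (6-8 characters, or 48-64 bits, per word) can be checked with a short calculation and compared with a fixed-width code per word. This is a minimal sketch, not the patent's own scheme: the function names and the dictionary size of one million entries are assumptions made for illustration, and the source does not state a code width.

```python
BITS_PER_CHAR = 8  # one byte per character, as the 48-64 bit figure implies


def char_level_bits(word_length: int) -> int:
    """Bits needed to send a word one character at a time."""
    return word_length * BITS_PER_CHAR


def word_level_bits(vocabulary_size: int) -> int:
    """Width of a fixed-length code that gives every dictionary entry a unique value."""
    if vocabulary_size < 1:
        raise ValueError("vocabulary_size must be positive")
    # (n - 1).bit_length() is ceil(log2(n)) computed in exact integer arithmetic.
    return max(1, (vocabulary_size - 1).bit_length())


if __name__ == "__main__":
    vocabulary_size = 1_000_000  # hypothetical dictionary size, not from the source
    for length in (6, 8):
        print(length, "characters:", char_level_bits(length), "bits per word")
    print("word code:", word_level_bits(vocabulary_size), "bits per word")
```

Under these assumptions a word code is 20 bits wide, against 48-64 bits for the same word spelled out, and the separating space no longer has to be transmitted. Any morphological or syntactic codes added to the word code would increase that width.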

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • Health & Medical Sciences
  • Artificial Intelligence
  • Audiology, Speech & Language Pathology
  • Computational Linguistics
  • General Health & Medical Sciences
  • Physics & Mathematics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Pharmaceuticals Containing Other Organic And Inorganic Compounds

Abstract

This invention relates to the field of computer science and computing technology, and can in particular be used in systems that convert and encode text information derived from natural languages into information-exchange signals over a communication channel. The invention makes rational use of an information communication channel, substantially increases the information transmission speed, and makes it possible to work on texts in any language. To this end, a single-valued binary code is assigned when the text information is converted, and the unit of information text to be encoded is not a symbol (letter) but a word. Conversion into binary code is carried out using a table of the semantic content of that word. When the word is converted into code and back, the word-formation and morphological characteristics of the word (declension, tense, gender, number, etc.) are used. In addition, the binary code of the word is extended with a code for the syntactic link between that word and the other words in a given sentence, or with a supplementary code that increases the interference protection of the main code. Furthermore, the group of codes obtained from the words of a sentence is extended with a code describing the type of the sentence (simple, complex, subordinate, etc.) or with a code containing information on the interactions between the words of the sentence.
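
The steps named in the abstract (a semantic table that maps a word to a binary code, morphological and syntactic-link bits attached to that code, and a sentence-type code added to the group) can be sketched as follows. This is an illustrative toy only: the lexicon entries, the bit layout (3 low bits for number and syntactic role) and every function name are assumptions, and the input is taken to be already analyzed into lemma, number and role, since the record does not describe the analyzer.

```python
from typing import Dict, List, Tuple

# Hypothetical semantic tables: one language-independent code per meaning.
LEXICON: Dict[str, Dict[str, int]] = {
    "en": {"cat": 1, "dog": 2, "see": 3},
    "fr": {"chat": 1, "chien": 2, "voir": 3},
}
NUMBER = {"singular": 0, "plural": 1}                    # 1 bit of morphology
ROLE = {"subject": 0, "predicate": 1, "object": 2}       # 2 bits of syntactic link
SENTENCE_TYPE = {"simple": 0, "complex": 1, "subordinate": 2}

Token = Tuple[str, str, str]  # (lemma, number, role), assumed pre-analyzed


def pack(meaning: int, number: int, role: int) -> int:
    """Append the morphology bit and the syntactic-role bits to the meaning code."""
    return (meaning << 3) | (number << 2) | role


def unpack(code: int) -> Tuple[int, int, int]:
    """Split a word code back into meaning, number and role."""
    return code >> 3, (code >> 2) & 0b1, code & 0b11


def encode(tokens: List[Token], sentence_type: str, lang: str) -> List[int]:
    """Return the sentence-type code followed by one code per word."""
    table = LEXICON[lang]
    codes = [SENTENCE_TYPE[sentence_type]]
    for lemma, number, role in tokens:
        codes.append(pack(table[lemma], NUMBER[number], ROLE[role]))
    return codes


def decode(codes: List[int], lang: str) -> Tuple[str, List[Token]]:
    """Rebuild the sentence type and the analyzed words in the receiver's language."""
    lemma_of = {code: lemma for lemma, code in LEXICON[lang].items()}
    number_of = {code: name for name, code in NUMBER.items()}
    role_of = {code: name for name, code in ROLE.items()}
    type_of = {code: name for name, code in SENTENCE_TYPE.items()}
    tokens = []
    for code in codes[1:]:
        meaning, number, role = unpack(code)
        tokens.append((lemma_of[meaning], number_of[number], role_of[role]))
    return type_of[codes[0]], tokens


if __name__ == "__main__":
    sentence = [("dog", "plural", "subject"),
                ("see", "singular", "predicate"),
                ("cat", "singular", "object")]
    sent = encode(sentence, "simple", "en")   # sender works in English
    print(sent)
    print(decode(sent, "fr"))                 # receiver works in French
```

Because only meaning codes and feature bits cross the channel, the receiving side can render the same sentence from its own language table, which is the language independence the abstract claims. A real system would also need inflection generation on the receiving side, which this sketch leaves out.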
PCT/RU2000/000276 1999-07-19 2000-07-04 Method for converting text information Ceased WO2001006407A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
RU99115906/09A RU99115906A (ru) 1999-07-19 Method for converting text information
RU99115906 1999-07-19

Publications (1)

Publication Number Publication Date
WO2001006407A1 (fr) 2001-01-25

Family

ID=20222955

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/RU2000/000276 Ceased WO2001006407A1 (fr) Method for converting text information 1999-07-19 2000-07-04

Country Status (1)

Country Link
WO (1) WO2001006407A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE4311211A1 (de) * 1993-04-05 1994-10-06 Ibm Computer system and method for the automated analysis of a text
RU2107942C1 (ru) * 1994-01-10 1998-03-27 Александр Андреевич Шпаков Method for determining the location of an object in a repository from a thematic search attribute
US5761688A (en) * 1994-12-26 1998-06-02 Sharp Kabushiki Kaisha Dictionary retrieval apparatus
RU2131620C1 (ru) * 1997-03-25 1999-06-10 Попов Александр Федорович Device for information communication

Legal Events

Date Code Title Description
AK Designated states (Kind code of ref document: A1; Designated state(s): US)
AL Designated countries for regional patents (Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE)
121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
122 EP: PCT application non-entry in European phase