ATE425500T1 - Method for segmenting text - Google Patents

Method for segmenting text

Info

Publication number
ATE425500T1
Authority
AT
Austria
Prior art keywords
text elements
predetermined number
initial
consecutive
beginnings
Application number
AT01937085T
Other languages
English (en)
Inventor
Eva Ejerhed
Original Assignee
Hapax Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2000-05-31
Filing date
2001-05-30
Publication date
2009-03-15
Application filed by Hapax Ltd
Application granted

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99934 Query formulation, input preparation, or translation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99935 Query augmenting and refining, e.g. inexact access
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99936 Pattern matching access

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)
  • Character Input (AREA)
  • Crushing And Pulverization Processes (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Diaphragms For Electromechanical Transducers (AREA)
AT01937085T 2000-05-31 2001-05-30 Method for segmenting text ATE425500T1 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
SE0002034A SE517005C2 (sv) 2000-05-31 2000-05-31 Segmentation of text

Publications (1)

Publication Number Publication Date
ATE425500T1 (de) 2009-03-15

Family

ID=20279913

Family Applications (1)

Application Number Title Priority Date Filing Date
AT01937085T ATE425500T1 (de) 2000-05-31 2001-05-30 Method for segmenting text

Country Status (7)

Country Link
US (1) US6810375B1 (de)
EP (1) EP1305738B1 (de)
AT (1) ATE425500T1 (de)
AU (1) AU2001262852A1 (de)
DE (1) DE60137935D1 (de)
SE (1) SE517005C2 (de)
WO (1) WO2001093088A1 (de)

Families Citing this family (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8190436B2 (en) * 2001-12-07 2012-05-29 At&T Intellectual Property Ii, L.P. System and method of spoken language understanding in human computer dialogs
US7493253B1 (en) 2002-07-12 2009-02-17 Language And Computing, Inc. Conceptual world representation natural language understanding system and method
US8818793B1 (en) * 2002-12-24 2014-08-26 At&T Intellectual Property Ii, L.P. System and method of extracting clauses for spoken language understanding
US8849648B1 (en) * 2002-12-24 2014-09-30 At&T Intellectual Property Ii, L.P. System and method of extracting clauses for spoken language understanding
US7424467B2 (en) 2004-01-26 2008-09-09 International Business Machines Corporation Architecture for an indexer with fixed width sort and variable width sort
US8296304B2 (en) 2004-01-26 2012-10-23 International Business Machines Corporation Method, system, and program for handling redirects in a search engine
US7293005B2 (en) 2004-01-26 2007-11-06 International Business Machines Corporation Pipelined architecture for global analysis and index building
US7499913B2 (en) 2004-01-26 2009-03-03 International Business Machines Corporation Method for handling anchor text
US7461064B2 (en) * 2004-09-24 2008-12-02 International Business Machines Corporation Method for searching documents for ranges of numeric values
US8051096B1 (en) 2004-09-30 2011-11-01 Google Inc. Methods and systems for augmenting a token lexicon
US7996208B2 (en) 2004-09-30 2011-08-09 Google Inc. Methods and systems for selecting a language for text segmentation
US7680648B2 (en) * 2004-09-30 2010-03-16 Google Inc. Methods and systems for improving text segmentation
US8843536B1 (en) 2004-12-31 2014-09-23 Google Inc. Methods and systems for providing relevant advertisements or other content for inactive uniform resource locators using search queries
US8417693B2 (en) 2005-07-14 2013-04-09 International Business Machines Corporation Enforcing native access control to indexed documents
US8073681B2 (en) 2006-10-16 2011-12-06 Voicebox Technologies, Inc. System and method for a cooperative conversational voice user interface
US8631005B2 (en) 2006-12-28 2014-01-14 Ebay Inc. Header-token driven automatic text segmentation
US7818176B2 (en) 2007-02-06 2010-10-19 Voicebox Technologies, Inc. System and method for selecting and presenting advertisements based on natural language processing of voice-based input
JP5256654B2 (ja) * 2007-06-29 2013-08-07 Fujitsu Limited Sentence division program, sentence division device, and sentence division method
US8260619B1 (en) 2008-08-22 2012-09-04 Convergys Cmg Utah, Inc. Method and system for creating natural language understanding grammars
US20090150141A1 (en) * 2007-12-07 2009-06-11 David Scott Wible Method and system for learning second or foreign languages
US8140335B2 (en) 2007-12-11 2012-03-20 Voicebox Technologies, Inc. System and method for providing a natural language voice user interface in an integrated voice navigation services environment
US9305548B2 (en) * 2008-05-27 2016-04-05 Voicebox Technologies Corporation System and method for an integrated, multi-modal, multi-device natural language voice services environment
US8326637B2 (en) 2009-02-20 2012-12-04 Voicebox Technologies, Inc. System and method for processing multi-modal device interactions in a natural language voice services environment
EP2488963A1 (de) * 2009-10-15 2012-08-22 Rogers Communications Inc. System and method for phrase identification
KR101622111B1 (ko) * 2009-12-11 2016-05-18 Samsung Electronics Co., Ltd. Dialogue system and dialogue method thereof
US8788260B2 (en) * 2010-05-11 2014-07-22 Microsoft Corporation Generating snippets based on content features
US8977538B2 (en) * 2010-09-13 2015-03-10 Richard Salisbury Constructing and analyzing a word graph
US8892550B2 (en) * 2010-09-24 2014-11-18 International Business Machines Corporation Source expansion for information retrieval and information extraction
RU2474870C1 (ru) * 2011-11-18 2013-02-10 Limited Liability Company "Natalya Kaspersky Innovation Center" Method for automated analysis of text documents
US9934218B2 (en) * 2011-12-05 2018-04-03 Infosys Limited Systems and methods for extracting attributes from text content
US9280520B2 (en) * 2012-08-02 2016-03-08 American Express Travel Related Services Company, Inc. Systems and methods for semantic information retrieval
US9607613B2 (en) 2014-04-23 2017-03-28 Google Inc. Speech endpointing based on word comparisons
WO2016044321A1 (en) 2014-09-16 2016-03-24 Min Tang Integration of domain information into state transitions of a finite state transducer for natural language processing
EP3195145A4 (de) 2014-09-16 2018-01-24 VoiceBox Technologies Corporation Voice commerce
JP2016062264A (ja) * 2014-09-17 2016-04-25 Toshiba Corporation Dialogue support device, method, and program
WO2016061309A1 (en) 2014-10-15 2016-04-21 Voicebox Technologies Corporation System and method for providing follow-up responses to prior natural language inputs of a user
US10388270B2 (en) 2014-11-05 2019-08-20 At&T Intellectual Property I, L.P. System and method for text normalization using atomic tokens
US10614799B2 (en) 2014-11-26 2020-04-07 Voicebox Technologies Corporation System and method of providing intent predictions for an utterance prior to a system detection of an end of the utterance
US10431214B2 (en) 2014-11-26 2019-10-01 Voicebox Technologies Corporation System and method of determining a domain and/or an action related to a natural language input
US10324965B2 (en) * 2014-12-30 2019-06-18 International Business Machines Corporation Techniques for suggesting patterns in unstructured documents
US20170031893A1 (en) * 2015-07-30 2017-02-02 Pat Inc. Set-based Parsing for Computer-Implemented Linguistic Analysis
WO2017083504A1 (en) * 2015-11-12 2017-05-18 Semantic Machines, Inc. Interaction assistant
WO2018023106A1 (en) 2016-07-29 2018-02-01 Erik SWART System and method of disambiguating natural language processing requests
US10402473B2 (en) * 2016-10-16 2019-09-03 Richard Salisbury Comparing, and generating revision markings with respect to, an arbitrary number of text segments
US10929754B2 (en) 2017-06-06 2021-02-23 Google Llc Unified endpointer using multitask and multidomain learning
WO2018226779A1 (en) 2017-06-06 2018-12-13 Google Llc End of query detection
TWI709080B (zh) * 2017-06-14 2020-11-01 雲拓科技有限公司 Device for organizing the structure of patent claims
US11003854B2 (en) * 2018-10-30 2021-05-11 International Business Machines Corporation Adjusting an operation of a system based on a modified lexical analysis model for a document
US11568153B2 (en) 2020-03-05 2023-01-31 Bank Of America Corporation Narrative evaluator
CN119721017B (zh) * 2023-09-27 2025-12-16 Guangdong Xiaotiancai Technology Co., Ltd. Grammar analysis method, apparatus, device, and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4887212A (en) 1986-10-29 1989-12-12 International Business Machines Corporation Parser for natural language text
US4864502A (en) 1987-10-07 1989-09-05 Houghton Mifflin Company Sentence analyzer
US5146405A (en) 1988-02-05 1992-09-08 At&T Bell Laboratories Methods for part-of-speech determination and usage
JP2764343B2 (ja) 1990-09-07 1998-06-11 Fujitsu Limited Clause/phrase boundary extraction method
US6823301B1 (en) 1997-03-04 2004-11-23 Hiroshi Ishikura Language analysis using a reading point
JPH10254877A (ja) 1997-03-14 1998-09-25 Omron Corp Writing style conversion device, word processor, and writing style conversion method
EP1055182A2 (de) * 1998-02-13 2000-11-29 Microsoft Corporation Segmentation of Chinese text into words
EP0962873A1 (de) 1998-06-02 1999-12-08 International Business Machines Corporation Processing of textual information and automated information recognition

Also Published As

Publication number Publication date
SE0002034D0 (sv) 2000-05-31
SE517005C2 (sv) 2002-04-02
DE60137935D1 (de) 2009-04-23
EP1305738B1 (de) 2009-03-11
US6810375B1 (en) 2004-10-26
EP1305738A1 (de) 2003-05-02
SE0002034L (sv) 2001-12-01
AU2001262852A1 (en) 2001-12-11
WO2001093088A1 (en) 2001-12-06

Similar Documents

Publication Publication Date Title
ATE425500T1 (de) Method for segmenting text
CA2463230A1 (en) A method and apparatus for decoding handwritten characters
DE60208614D1 (de) Method and device for providing a list of public keys in a public-key system
EP0834826A3 (de) Locating templates in optical character recognition systems
DE69417105D1 (de) Device and method for recognizing handwritten symbols
EP0924639A3 (de) Apparatus for character string determination and apparatus for pattern determination
ATE315256T1 (de) Method for extracting a hash string
DE69930560D1 (de) Method and device for pattern recognition
DE60124495D1 (de) Method and device for generating a scout scan
DE69001539D1 (de) Method for producing isoflavanones.
ATE294977T1 (de) Method and device for recognizing a handwritten pattern
EP1218854A4 (de) Apparatus and method for generating three-dimensional patterns based on two-dimensional images
DE69230031D1 (de) Pattern recognition and authenticity verification, in particular for handwritten signatures
DE69736751D1 (de) Method and device for pattern recognition, and telephone system
DE60225329D1 (de) Device and method for recognizing code
DE60131893D1 (de) Method and device for generating unique audio signatures
IL161379A0 (en) Character string identification
DE69413362D1 (de) Method for generating an invisible marking
DE60136988D1 (de) Information processing device and method, and chip card
DE69025693D1 (de) Method and system for comparing dot patterns, and image recognition method and system using the same
DE60119403D1 (de) Device and method for changing Karteninformationen (card or map information)
IT1272259B (it) Method and apparatus for character recognition
DE69022013D1 (de) Method and device for blow molding.
DE60030677D1 (de) Method and device for pattern recognition
EP0665506A3 (de) Method and apparatus for recognizing handwritten characters.

Legal Events

Date Code Title Description
RER Ceased as to paragraph 5 lit. 3 law introducing patent treaties