EP1665561A1 - Method and apparatus for providing a text message - Google Patents

Method and apparatus for providing a text message

Info

Publication number
EP1665561A1
Authority
EP
European Patent Office
Prior art keywords
message
templates
utterance
text message
template
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP04784421A
Other languages
German (de)
English (en)
Other versions
EP1665561A4 (fr)
Inventor
Yaxin Zhang
Xin He
Xiao-Lin Ren
Fang Sun
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Motorola Mobility LLC
Original Assignee
Motorola Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Inc
Publication of EP1665561A1
Publication of EP1665561A4
Current legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/10 Speech classification or search using distance or distortion measures between unknown speech and reference templates
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72436 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for text messaging, e.g. short messaging services [SMS] or e-mails
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/26 Devices for calling a subscriber
    • H04M1/27 Devices whereby a plurality of signals may be stored simultaneously
    • H04M1/271 Devices whereby a plurality of signals may be stored simultaneously controlled by voice recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2250/00 Details of telephonic subscriber devices
    • H04M2250/74 Details of telephonic subscriber devices with voice recognition means

Definitions

  • the invention relates to a method and apparatus for providing a text message using voice.
  • the invention is particularly useful for, but not necessarily limited to, providing a text message using voice inputs processed on a portable electronic device having limited memory and computational capacity.
  • SMS Short Messaging Service
  • Short text messaging, often using the Short Messaging Service (SMS) format, is a very popular application in wireless communications. Billions of short text messages are sent each month, usually from one mobile phone to another. Such text messages are popular for a number of reasons. The messages are generally a fraction of the cost of a one-minute mobile telephone call and they do not require an engaged tone to send or to receive.
  • Text messages are generally created by typing characters into the keypad of a mobile telephone.
  • Using non-QWERTY keypads to compose a message can be awkward and generally requires more time than would be needed using a full-size QWERTY keyboard. But of course it is impractical to attach a full-size keyboard to a mobile phone. Thus there is a need for a more effective method of composing short text messages.
  • Although speech recognition systems are well known, most are not suitable for use in portable electronic devices such as mobile phones. That is because prior art speech recognition systems generally require more processing power and memory than are available in portable electronic devices.
  • Prior art closed vocabulary speech recognition systems and methods employ a pre-defined, fixed vocabulary list.
  • the fixed vocabulary list may be large but may not be exhaustive and therefore, for instance, a person's family name and the names of many locations would not be included.
  • open vocabulary speech recognition systems and methods have a variable vocabulary list to which new words and phrases may be added by a user or otherwise.
  • current open vocabulary speech recognition systems and methods require relatively high computational overheads that may not be acceptable for portable electronic devices such as Personal Digital Assistants, radio-telephones and other portable devices.
  • a method of providing a text message includes the steps of receiving an utterance at an input of an electronic device. Speech recognition is then performed on the utterance guided by user-defined message templates stored in a memory associated with the electronic device, wherein speech recognition is defined by matching the utterance with one of the templates to create a matching template. A text message is then provided from the matching template.
  • At least one of the message templates may include a fixed language component.
  • At least one of the message templates may include a variable language component.
  • At least one of the message templates may include both a fixed and a variable language component.
  • the text message may be an SMS message.
  • the above method may also include the step of editing the user-defined message template by receiving typed characters from a keypad of the electronic device.
  • a component of the text message may be a transcription of the utterance.
  • the entirety of the text message may be a transcription of the utterance.
  • an electronic device for providing a text message includes a microphone operative to receive an utterance; a non-volatile memory for storing message templates; and a processor operative to perform speech recognition of the utterance guided by the message templates, wherein the processor is operative to match the utterance with one of the templates to create a matching template, and to provide a text message from the matching template.
  • the message templates may also include fixed or variable language components or both fixed and variable language components.
  • the text message may be an SMS message.
  • the electronic device may include a keypad operative for editing the message template.
  • the electronic device may be operative to match the utterance with a plurality of the templates and to calculate a likelihood score for each of the templates.
  • Fig. 1 is a schematic block diagram of a radio telephone in accordance with the present invention
  • Fig. 2 is a flow diagram illustrating a method for providing, editing and transmitting a text message in accordance with the present invention
  • Fig. 3 is a flow diagram that illustrates a method for providing a list of candidate message templates to a user in accordance with the present invention
  • Fig. 4 is a flow diagram illustrating a method for enabling a user to edit existing message templates and save new templates in a static programmable memory in accordance with the present invention.
  • a radio telephone 100 comprising a radio frequency communications unit 105 coupled to be in communication with a processor 110.
  • I/O Input/Output
  • the processor 110 includes an encoder/decoder 125 with an associated
  • the processor 110 also includes a micro-processor 135 coupled, by a common data and address bus 140, to the encoder/decoder 125 and an associated character Read Only Memory (ROM) 145, a Random Access Memory (RAM)
  • the static programmable memory 155 and SIM module 160 each can store, amongst other things, selected incoming text messages, a telephone book database, and, as described in more detail below, templates of outgoing text messages.
  • the microprocessor 135 has ports for coupling to the keypad 120, the display 115 and an alert module 165 that typically contains a speaker, vibrator motor and associated drivers.
  • the character Read Only Memory 145 stores code for decoding or encoding text messages that may be received by the communication unit 105 or input at the keypad 120.
  • the radio frequency communications unit 105 is a combined receiver and transmitter having a common antenna 170.
  • the communications unit 105 has a transceiver 175 coupled to antenna 170 via a radio frequency amplifier 180.
  • the transceiver 175 is also coupled to a combined modulator/demodulator 185 that couples the communications unit 105 to the processor 110.
  • Referring to Fig. 2, there is a flow diagram illustrating one embodiment of the present invention including a method 200 for providing, editing and transmitting a text message using the radio telephone 100.
  • the method 200 is invoked at a start step 205.
  • an utterance is received at an input, such as the microphone 190, of the telephone 100.
  • the processor 110 then performs sampling and digitizing of the utterance waveform at step 215, then segmenting at a step 220 before processing to provide feature vectors representing the waveform at a step 225.
  • steps 215, 220, and 225 are well known in the art and therefore do not require a detailed explanation.
  • speech recognition is performed on the feature vectors resulting from step 225.
  • the speech recognition is guided by user-defined message templates stored in the static programmable memory 155 of the device 100.
  • the message templates are described in more detail later in this specification.
  • the method 200 then provides a text message to a user at step 235.
  • the message may be provided to the user using one of the I/O interfaces such as the display 115 or the speaker 195 of the device 100. After the message is provided to the user, the user is then able to decide whether to edit the message at step 240.
  • If the user decides not to edit the message, the message is transmitted at step 245 in a message format such as SMS. However, if the user decides at step 240 to edit the message, the message is edited at step 250 before being transmitted at step 245.
  • the user may edit the message in several different ways including speaking edits into the microphone 190 or typing edits into the keypad 120.
  • the method 200 then ends at step 255.
  • The step 235 of providing a text message may include providing a user of the telephone 100 with a list of candidate message templates from which the user may select the template that is most appropriate for the intended text message.
  • Fig. 3 is a flow diagram that illustrates a method 300 for providing such a list of candidate templates to a user. The method 300 is invoked at start step 305 when a user inputs a command into the keypad 120 or into the microphone
  • the method 300 first includes the processor 110 selecting at step 310 a message template from a list of available message templates. At step 315 the selected template is then compared with the feature vectors provided in step 225 of method 200. The processor 110 then calculates a likelihood score at step 320 that estimates the matching quality between aspects of the selected template and the feature vectors of the input utterance. The processor 110 then determines at step 325 whether the likelihood score is above a set threshold. The threshold may be automatically calculated by processor 110, or it may be pre-set by a user of the telephone 100. If the likelihood score of the selected template is below the set threshold, the template is rejected at step 330.
  • At step 340, the method 300 determines whether all available templates have been evaluated. If not all available templates have been evaluated, at step 345 the method 300 selects the next message template and returns to step 315 where the next template is compared with the feature vectors of the input utterance. If all templates have been evaluated at step 340, the method 300 continues to step 350 and provides a list of all of the candidate templates to the user.
  • the candidate templates may be provided to the user using one of the I/O interfaces such as the display 115 or the speaker 195 of the device 100.
  • the method 300 then ends at step 355.
  • users of the telephone 100 are not limited to the use of templates supplied by a manufacturer of the device 100. Rather, users of the device 100 are able to edit existing templates stored in the static programmable memory 155 to create their own personalized message templates.
  • Referring to Fig. 4, there is illustrated a method 400 for enabling a user to edit existing templates and save new templates in static programmable memory 155. The method 400 is invoked at start step 405 when a user inputs a command into the keypad 120 or into the microphone 190.
  • a list of existing templates is provided to the user of the device 100 through an I/O interface such as the display 115 or the speaker 195.
  • the user selects a desired message template at step 415 using an I/O interface such as the microphone 190 or the keypad 120.
  • the user edits the template at step 410
  • the method 400 then ends at step 430.
  • Other methods of editing the message templates are also within the scope of the present invention, including connecting the telephone 100 to a host computer using a communication channel such as a USB cable and then downloading or flashing edited templates to the static programmable memory 155.
  • the method of the present invention may further include message templates that comprise fixed and variable language components.
  • the fixed language components are not changed when a user selects a template and transmits a message.
  • the variable language components may be changed by the user from message to message.
  • the use of fixed and variable language components can greatly leverage the limited processing power and memory of the telephone 100.
  • a particular template of a short text message concerning a meeting request might include the following: "Meet me at $PLACE at $TIME".
  • the fixed language components are underlined and the variable language components are capitalized and begin with "$".
  • variable language component $FESTINAL may be edited by the user to include:
  • $FESTINAL sp
  • the phone 100 is able to recognize the edited variable language components entered by a user. Because the variable language components consist of discrete sets of variables, the speech recognition processing overhead and memory requirements are minimized. The above method is thus particularly suited for devices having limited processing and memory resources such as mobile phones.
  • the use of templates including fixed and variable language components increases the efficiency of a speech recognition system for several reasons. First, the fixed language components of a particular template may generally be recognized quickly and efficiently because there are only a modest number of templates saved in the static programmable memory 155 compared with the almost unlimited number of sentence permutations associated with natural language sentence structures.
  • variable language components may also be recognized efficiently because the intra-sentence location of a variable language component in a message template automatically identifies a discrete set of possible responses. For example, referring to the "Happy $FESTINAL" message template given above, the fixed language component "Happy" may act as a signal such that the processor 110 knows that the subsequent voice input received at the microphone 190 will be the variable language component "$FESTINAL."
  • PDAs Personal Digital Assistants
  • a text message may be provided through voice inputs rather than through typed characters entered into a small keypad.
  • the invention may include open vocabulary speech recognition to avoid the memory intensive requirements of prior art closed vocabulary speech recognition.
  • Open vocabulary speech recognition uses speaker-independent sub-word acoustic models designed to cover all of the acoustic occurrences, or phonemes, of a language.
  • a user is not limited to a predefined vocabulary but can edit the variable language components as described above to include words not found in a dictionary, such as names and locations.
  • the result is that the text messages provided by the present invention may be highly personalized.
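
The template-guided matching described above (fixed and variable language components, a likelihood score per template, and rejection of templates that score below a threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed in the application: it compares already-recognized words rather than acoustic feature vectors, it uses a word-level similarity ratio as a stand-in for the likelihood score, and the template list, the variable value sets, the function names and the 0.6 threshold are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import product

# Hypothetical user-defined templates; a word starting with "$" is a
# variable language component, every other word is a fixed component.
TEMPLATES = [
    "Meet me at $PLACE at $TIME",
    "Happy $FESTINAL",
    "Call me at $TIME",
]

# Hypothetical discrete value sets for each variable component.
VARIABLES = {
    "$PLACE": ["the office", "home", "the station"],
    "$TIME": ["noon", "six", "nine"],
    "$FESTINAL": ["birthday", "new year"],
}


def expansions(template):
    """Yield every concrete message that a template can produce."""
    words = template.split()
    # Fixed words have exactly one choice; variables have a discrete set.
    choices = [VARIABLES.get(word, [word]) for word in words]
    for combo in product(*choices):
        yield " ".join(combo)


def score_template(template, utterance):
    """Return (score, message) for the expansion closest to the utterance.

    The score is a word-level similarity ratio in [0, 1]. It only stands
    in for an acoustic likelihood score, which is not modelled here.
    """
    heard = utterance.lower().split()
    best = (0.0, "")
    for message in expansions(template):
        ratio = SequenceMatcher(None, heard, message.lower().split()).ratio()
        if ratio > best[0]:
            best = (ratio, message)
    return best


def candidate_messages(utterance, threshold=0.6):
    """Reject templates scoring below the threshold and rank the rest."""
    scored = [score_template(template, utterance) for template in TEMPLATES]
    kept = [(score, message) for score, message in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [message for _, message in kept]


if __name__ == "__main__":
    print(candidate_messages("meet me at the station at nine"))
```

Because each variable component draws from a small discrete set, the number of expansions per template stays small, which mirrors the efficiency argument made above for devices with limited processing power and memory.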

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)

Abstract

This invention relates to a method and apparatus for providing a text message. The method includes receiving an utterance (step 210) at an input of an electronic device (100). Speech recognition is then performed on the utterance (step 230), guided by user-defined message templates stored in a memory (155) associated with the electronic device (100). Speech recognition is defined by matching the utterance with one of the templates to create a matching template. A text message is then provided from the matching template (step 235).
EP04784421A 2003-09-23 2004-09-17 Method and apparatus for providing a text message Withdrawn EP1665561A4 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CNB031249639A CN100353417C (zh) 2003-09-23 2003-09-23 Method and apparatus for providing a text message
PCT/US2004/030553 WO2005031995A1 (fr) 2003-09-23 2004-09-17 Method and apparatus for providing a text message

Publications (2)

Publication Number Publication Date
EP1665561A1 EP1665561A1 (fr) 2006-06-07
EP1665561A4 EP1665561A4 (fr) 2011-03-23

Family

ID=34383973

Family Applications (1)

Application Number Title Priority Date Filing Date
EP04784421A Withdrawn EP1665561A4 (fr) 2003-09-23 2004-09-17 Method and apparatus for providing a text message

Country Status (5)

Country Link
EP (1) EP1665561A4 (fr)
KR (1) KR100759728B1 (fr)
CN (1) CN100353417C (fr)
RU (1) RU2320082C2 (fr)
WO (1) WO2005031995A1 (fr)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1295674C (zh) * 2002-03-27 2007-01-17 诺基亚有限公司 Pattern recognition
KR100805252B1 (ko) 2005-06-27 2008-02-21 서울통신기술 주식회사 Method and apparatus for call processing of an IP terminal
DE102007061156A1 (de) * 2007-12-17 2009-08-06 Vodafone Holding Gmbh Message transmission in telecommunication networks
KR101597286B1 (ko) 2009-05-07 2016-02-25 삼성전자주식회사 Apparatus and method for generating an avatar video message
CN102263851A (zh) * 2010-05-31 2011-11-30 北京迅捷英翔网络科技有限公司 Message conversion method
CN103366741B (zh) * 2012-03-31 2019-05-17 上海果壳电子有限公司 Voice input error correction method and system
RU2637874C2 (ru) 2013-06-27 2017-12-07 Гугл Инк. Generating dialog recommendations for chat information systems
US9473627B2 (en) 2013-11-08 2016-10-18 Sorenson Communications, Inc. Video endpoints and related methods for transmitting stored text to other video endpoints
US9185211B2 (en) 2013-11-08 2015-11-10 Sorenson Communications, Inc. Apparatuses and methods for operating a communication system in one of a tone mode and a text mode
KR101894928B1 (ko) 2017-02-14 2018-09-05 (주)스톤아이 Apparatus and method for calculating the bonus amount of a bonus settlement system using the number of visits
US11924149B2 (en) 2020-10-15 2024-03-05 Google Llc Composition of complex content via user interaction with an automated assistant

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4525793A (en) * 1982-01-07 1985-06-25 General Electric Company Voice-responsive mobile status unit
CA2372671C (fr) * 1994-10-25 2007-01-02 British Telecommunications Public Limited Company Services a commande vocale
JP3533051B2 (ja) * 1996-08-21 2004-05-31 パイオニア株式会社 自動音声応答機能付き電話機
US6173316B1 (en) * 1998-04-08 2001-01-09 Geoworks Corporation Wireless communication device with markup language based man-machine interface
US6526292B1 (en) * 1999-03-26 2003-02-25 Ericsson Inc. System and method for creating a digit string for use by a portable phone
RU13455U1 (ru) * 1999-09-30 2000-04-10 Бурин Андрей Михайлович Устройство для пересылки текста электронной почты из электронного почтового ящика на сотовый телефон абонента и для пересылки текстовых сообщений с сотового телефона абонента в электронный почтовый ящик
DE19959903A1 (de) * 1999-12-07 2001-06-13 Bruno Jentner Modul zur Unterstützung der Text-Mitteilungs-Kommunikation in Mobilfunknetzen
US6625474B1 (en) * 2000-04-11 2003-09-23 Motorola, Inc. Method and apparatus for audio signal based answer call message generation
KR20020028501A (ko) * 2000-10-10 2002-04-17 김철권 Method and apparatus for converting between voice data and text data in a communication network
US6795808B1 (en) * 2000-10-30 2004-09-21 Koninklijke Philips Electronics N.V. User interface/entertainment device that simulates personal interaction and charges external database with relevant data
WO2002077975A1 (fr) * 2001-03-27 2002-10-03 Koninklijke Philips Electronics N.V. Method for selecting and transmitting text messages via a mobile device
EP1324314B1 (fr) * 2001-12-12 2004-10-06 Siemens Aktiengesellschaft Speech recognition system and method for operating such a system
US6895257B2 (en) * 2002-02-18 2005-05-17 Matsushita Electric Industrial Co., Ltd. Personalized agent for portable devices and cellular phone
US7072684B2 (en) * 2002-09-27 2006-07-04 International Business Machines Corporation Method, apparatus and computer program product for transcribing a telephone communication
US20040176139A1 (en) * 2003-02-19 2004-09-09 Motorola, Inc. Method and wireless communication device using voice recognition for entering text characters

Also Published As

Publication number Publication date
RU2006113581A (ru) 2007-10-27
KR100759728B1 (ko) 2007-09-20
EP1665561A4 (fr) 2011-03-23
CN100353417C (zh) 2007-12-05
CN1601548A (zh) 2005-03-30
RU2320082C2 (ru) 2008-03-20
KR20060054469A (ko) 2006-05-22
WO2005031995A1 (fr) 2005-04-07

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20060316

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE FR GB IT

DAX Request for extension of the european patent (deleted)
RBV Designated contracting states (corrected)

Designated state(s): DE FR GB IT

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: MOTOROLA MOBILITY, INC.

A4 Supplementary search report drawn up and despatched

Effective date: 20110221

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: MOTOROLA MOBILITY LLC

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20150120

P01 Opt-out of the competence of the unified patent court (upc) registered

Effective date: 20230520