WO2014106986A1 - Electronic apparatus controlled by a user's voice and control method thereof - Google Patents

Electronic apparatus controlled by a user's voice and control method thereof

Info

Publication number
WO2014106986A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
user
electronic apparatus
identification tag
tag
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/KR2013/009606
Other languages
French (fr)
Inventor
Eun-Hee Park
So-yon You
Sang-Jin Han
Jae-Kwon Kim
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd
Priority to EP13869992.1A (EP2941896A4)
Publication of WO2014106986A1
Legal status: Ceased

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Definitions

  • Apparatuses and methods consistent with exemplary embodiments relate to an electronic apparatus and a control method thereof. More particularly, the exemplary embodiments relate to an electronic apparatus controlled by a user’s voice and a method of controlling the apparatus.
  • TVs televisions
  • users may watch a large number of digital broadcast channels through televisions.
  • a TV may recognize a user’s voice and may perform functions which correspond to the user’s voice, such as volume adjustment or channel change.
  • One or more exemplary embodiments may overcome the above disadvantages and other disadvantages not described above. However, it is understood that one or more exemplary embodiments are not required to overcome the disadvantages described above, and may not overcome any of the problems described above.
  • One or more exemplary embodiments are to provide an electronic apparatus capable of easily calling an object by using voice recognition technology and a method of controlling voice recognition.
  • an electronic apparatus controlled by a user’s voice may include: a voice input which is configured to receive a user’s voice; a display configured to provide a user interface (UI) screen which includes at least one object; and a controller configured to determine whether or not calling the object by the user’s voice is possible, assign to the object an identification tag which identifies the object based on a determination result, and display the identification tag-assigned object.
  • UI user interface
  • the controller may assign and display the identification tag to the object when a text item which enables voice calling to the object is not tagged to the object.
  • the object to which the text item is not tagged may be an object having a thumbnail attribute or may have a list attribute.
  • the controller may perform a task for the identification tag-assigned object when the identification tag is called by the user’s voice.
  • the controller may sequentially assign identification tags to objects based on a displayed order of the objects on the UI screen and may display the identification tag-assigned objects.
  • the identification tags may include at least one of a number tag having a preset order and an alphabet tag having a preset order.
  • the controller may assign and display a preset graphic user interface (GUI), which indicates that voice calling is possible, to at least one of the identification tag and a text item which is callable by the user’s voice.
  • GUI graphic user interface
  • a method of controlling an electronic apparatus with a user’s voice may include: receiving a user command for providing a user interface (UI) screen including at least one object; determining whether or not calling the object by a user’s voice is possible; and assigning to the object an identification tag which identifies the object based on a determination result and displaying the identification tag-assigned object.
  • the determining whether or not calling is possible may include determining that the calling is impossible when a text item which enables voice calling to the object is not tagged to the object and the displaying may include assigning and displaying the identification tag to the object upon determination that voice calling is impossible.
  • the object to which the text item is not tagged may be an object having a thumbnail attribute or may have a list attribute.
  • the method of controlling an electronic apparatus may further include performing a task for the identification tag-assigned object, when the identification tag is called by the user’s voice.
  • the displaying may include sequentially assigning and displaying identification tags to objects on the UI screen based on a displayed order of the objects.
  • the identification tags may be at least one of a number tag having a preset order and an alphabet tag having a preset order.
  • the displaying may include assigning and displaying a preset graphic user interface (GUI) which indicates that voice calling is possible to at least one of the identification tag and a text item which is callable by the user’s voice.
  • Another exemplary embodiment may provide an electronic apparatus controlled by a user’s voice, the electronic apparatus including: a controller configured to determine whether or not it is possible for a user’s voice to call an object, assign to the object an identification tag which identifies the object based on a result of the determination, and display the identification tag-assigned object.
  • the apparatus may further include a voice input configured to receive a user’s voice.
  • a display may be configured to provide a user interface (UI) screen including at least one object.
  • the controller may assign and display the identification tag to the object in response to a text item which enables voice calling to the object not being tagged to the object.
  • the object to which the text item is not tagged may be an object having a thumbnail attribute or a list attribute.
  • the controller may perform a task for the identification tag-assigned object when the identification tag is called by the user’s voice.
  • the electronic apparatus may be easily controlled by using voice recognition technology.
  • FIG. 1 is a block diagram which illustrates a configuration of an electronic apparatus, according to an exemplary embodiment
  • FIG. 2 is a block diagram which illustrates a configuration of an electronic apparatus, according to another exemplary embodiment
  • FIG. 3 is a view which illustrates a software configuration stored in a storage, according to an exemplary embodiment
  • FIG. 4 is a view which illustrates an interactive system, according to an exemplary embodiment
  • FIGS. 5 to 7 are views which illustrate UI providing methods, according to various exemplary embodiments.
  • FIG. 8 is a flow chart which illustrates a method of controlling an electronic apparatus, according to an exemplary embodiment.
  • FIG. 1 is a block diagram which illustrates a configuration of an electronic apparatus, according to an exemplary embodiment.
  • An electronic apparatus 100 as illustrated in FIG. 1 includes a voice input 110, a display 120 and a controller 130.
  • the electronic apparatus 100 may be a smart TV, but this is merely an exemplary embodiment and the electronic apparatus 100 may be implemented with various electronic apparatuses such as a smart phone, a tablet personal computer (PC) and a laptop PC.
  • PC personal computer
  • the electronic apparatus 100 may be implemented to perform a voice recognition function which recognizes a person’s natural utterance and receives it as an execution command.
  • the voice recognition automatically identifies linguistic meaning from the voice.
  • the voice recognition is a process of inputting a voice waveform, identifying a word or a word string, and extracting meaning, and may be performed through processes of voice analysis, phonemic recognition, word recognition, sentence interpretation, and meaning extraction. A detailed description thereof will be omitted.
  • the voice input 110 receives a user’s uttered voice.
  • the voice input 110 converts the input voice signal into an electrical signal and outputs the electrical signal to the controller 130.
  • the voice input 110 may be implemented as a microphone. Further, the voice input 110 may be implemented in all-in-one form with the electronic apparatus 100 or separate from the electronic apparatus 100. The separate voice input 110 may be connected to the electronic apparatus 100 in a wired or wireless manner.
  • the display 120 displays an image which corresponds to a broadcast signal received through a broadcast receiver.
  • the display 120 may display image data (for example, a moving image) input through an external terminal input.
  • the display 120 may display a UI screen including various objects.
  • the display 120 may display the UI screen including a plurality of thumbnails which correspond to movie contents in a UI screen configured to provide a plurality of movie contents.
  • the controller 130 controls the voice input 110 and the display 120.
  • the controller 130 may include a module implemented as a central processing unit (CPU), a read only memory (ROM), and a random access memory (RAM), which are configured to store data.
  • CPU central processing unit
  • ROM read only memory
  • RAM random access memory
  • the controller 130 may determine whether or not it is possible to use a voice to call an object displayed in the display 120 and assign and display an identification tag which identifies the object based on the result of the determination.
  • the controller 130 may determine that calling by the user’s voice is impossible when a text item which enables the voice calling to the object is not tagged to the object, and may then assign and display the identification tag to the object.
  • the controller may assign to the object the identification tag which allows the user to identify and call the object, and may thereby enable voice calling by the user.
  • the controller 130 may sequentially assign and display identification tags on a UI screen based on a displayed order of objects.
  • the identification tag may be one of a number tag having a preset order and an alphabet tag having a preset order.
  • the identification tag is not limited thereto.
  • the controller 130 may assign a first identification tag to a fifth identification tag according to the displayed order and may allow the user to call the desired movie thumbnail in a voice using the identification tags.
  • the controller 130 may assign and display, to a calling text which is callable by a voice, a preset GUI which indicates that calling is possible.
  • the above-described identification tag may be included in the text item which is callable by voice. However, in addition to the identification tag, other text items which are callable by voice may be included.
  • For example, when a specific menu item is configured with a text called “Sub category” which is callable by voice,
  • the controller may assign and display a preset GUI to the corresponding text.
  • the preset GUI indicating that calling is possible may be a GUI having the form of a quotation mark, but the GUI is not limited thereto.
  • any GUI configured to allow the user to recognize that calling is possible, such as a GUI having the form of a speech bubble, may be applied.
  • the controller 130 may control the identification tag-assigned object to be called when the identification tag is called by the user’s voice input through the voice input 110. For example, the controller 130 may determine that the object to which the identification tag called “1” is assigned is called when “1” is called by the user’s voice and may perform a task which corresponds to a corresponding object.
  • the task which corresponds to the corresponding object is a predefined job executable by an apparatus through the calling of the corresponding object.
  • the task may be a job which reproduces the corresponding content or which displays a detailed item of the corresponding content.
  • the controller 130 recognizes a voice using a voice recognition module and a voice database, when the voice is input through the voice input 110.
  • the voice recognition is divided into isolated word recognition, which recognizes an uttered voice through classification of words; continuous speech recognition, which recognizes continuous words, continuous sentences, and dialogic voices; and keyword spotting, which is an intermediate form between the isolated word recognition and the continuous speech recognition and detects and recognizes a predetermined keyword.
  • the controller 130 detects a start and an end of the user’s uttered voice in the input voice signal to determine a voice period.
  • the controller 130 may calculate the energy of the input voice signal, classify an energy level of the voice signal according to the calculated energy, and detect the voice period through dynamic programming.
  • the controller 130 detects a phoneme which is a minimum unit of the voice from the voice signal in the detected voice period based on an acoustic model and generates phonemic data.
  • the controller 130 generates text information by applying a hidden Markov model (HMM) which is a probabilistic model to the generated phonemic data.
  • HMM hidden Markov model
  • the above-described method of recognizing the user’s voice is merely an exemplary embodiment and the user’s voice may be recognized through other methods. In either case, the controller 130 may recognize the user’s voice included in the voice signal. As described above, the controller 130 performs a task of the electronic apparatus using the recognized voice.
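
The voice period detection described above (energy calculation followed by locating the start and end of the utterance) might be sketched as follows. This is a simplified illustration only: it uses a fixed energy threshold in place of the energy-level classification and dynamic programming named in the text, and the function names, frame length, and threshold are assumptions that do not come from the patent.

```python
from typing import List, Optional, Tuple


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def detect_voice_period(
    samples: List[float],
    frame_len: int = 4,       # hypothetical frame size, in samples
    threshold: float = 0.01,  # hypothetical energy threshold
) -> Optional[Tuple[int, int]]:
    """Return (start_sample, end_sample) spanning the first to the last
    frame whose energy exceeds the threshold, or None for silence."""
    energies = frame_energies(samples, frame_len)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


# Two silent frames, two loud frames, one silent frame.
signal = [0.0] * 8 + [0.5, -0.5, 0.5, -0.5] * 2 + [0.0] * 4
period = detect_voice_period(signal)
```

A fixed threshold is the simplest choice for a sketch; a real recognizer would typically adapt to background noise.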
  • FIG. 2 is a block diagram which illustrates a configuration of an electronic apparatus according to another exemplary embodiment.
  • an electronic apparatus 100’ includes a voice input 110, a display 120, a controller 130, a storage 140, a broadcast receiver 150, an external terminal input 160, a remote controller signal receiver 170, a communicator 180, a recognizer 190, and an audio output 195.
  • Detailed description of the components illustrated in FIG. 2 which are the same as the components illustrated in FIG. 1 will be omitted to avoid obscuring the invention.
  • the controller 130 includes a RAM 131, a ROM 132, a main CPU 133, a graphic processor 134, a first interface 135-1 to an n-th interface 135-n, and a bus 136.
  • the RAM 131, the ROM 132, the main CPU 133, the graphic processor 134, and the first to n-th interfaces 135-1 to 135-n may be connected to each other through the bus 136.
  • the first to n-th interfaces 135-1 to 135-n are connected to the above-described components.
  • One of the interfaces may be a network interface connected to an external apparatus through a network.
  • the main CPU 133 accesses the storage 140 to perform booting using an operating system (O/S) stored in the storage 140.
  • the main CPU 133 performs various operations using various programs, contents, data, and the like, stored in the storage 140.
  • a command set and the like for system booting is stored in the ROM 132.
  • the main CPU 133 copies the O/S stored in the storage 140 to the RAM 131 according to the command stored in the ROM 132 and executes the O/S to boot the system.
  • the main CPU 133 copies various application programs stored in the storage 140 to the RAM 131 and executes the application programs copied to the RAM 131 to perform various operations.
  • the graphic processor 134 generates a screen including various objects such as an icon, an image, and a text using a calculator (not shown) and a renderer (not shown).
  • the calculator (not shown) calculates attribute values, such as a coordinate value at which each of the objects is to be displayed, a shape, a size, and a color, according to a layout of the screen.
  • the renderer (not shown) generates screens having various layouts including the objects based on the attribute value calculated in the calculator (not shown).
  • the screen generated in the renderer (not shown) is displayed in a display region of the display 120.
  • the storage 140 stores various data and programs for driving and controlling the electronic apparatus 100’.
  • the storage 140 may include a voice recognition module configured to recognize the voice input through the voice input 110 and a voice database.
  • the voice database means a database in which a preset voice and a voice task matching the preset voice are stored.
  • the broadcast receiver 150 receives a broadcast signal from the outside in a wired or wireless manner.
  • the broadcast signal includes a video signal, an audio signal, and additional data (for example, electronic program guide (EPG)).
  • EPG electronic program guide
  • the broadcast receiver 150 may receive a broadcast signal from various sources such as terrestrial broadcasting, cable broadcasting, satellite broadcasting and Internet broadcasting.
  • the external terminal input 160 receives image data (for example, a moving image, a photo, and the like), audio data (for example, music and the like), and the like, from outside of the electronic apparatus 100’.
  • the external terminal input 160 may include at least one selected from the group consisting of a high-definition multimedia interface (HDMI) input terminal, a component input terminal, a PC input terminal, and a universal serial bus (USB) input terminal.
  • HDMI high-definition multimedia interface
  • the remote controller signal receiver 170 may receive a remote controller signal even when the electronic apparatus 100’ is in a voice task mode or in a motion task mode.
  • the communicator 180 may connect the electronic apparatus 100’ and an external apparatus (not shown) through control of the controller 130.
  • the communicator 180 may provide a communication method such as Ethernet 181, a wireless local area network (LAN) 182, and Bluetooth 183.
  • the communication method of the communicator 180 is not limited thereto.
  • the external apparatus may be implemented as an automatic speech recognition (ASR) server and an interactive server configured to provide interactive service.
  • ASR automatic speech recognition
  • the electronic apparatus 100’ may be implemented to provide various interactive services as well as to perform a task according to simple voice recognition; a description thereof will be made later with reference to FIG. 4.
  • the external apparatus may be implemented with a server configured to download an application or perform web browsing.
  • the recognizer 190 outputs, through the audio output 195, a voice which corresponds to a broadcast signal under control of the controller 130.
  • the audio output 195 may include at least one output selected from the group including a speaker 191, a headphone output terminal 192, and a Sony/Philips digital interface (S/PDIF) output terminal 193.
  • S/PDIF Sony/Philips digital interface
  • FIG. 3 is a view which illustrates a configuration of software stored in a storage according to an exemplary embodiment.
  • the storage 140 includes a power control module 140a, a channel control module 140b, a volume control module 140c, an external input control module 140d, a screen control module 140e, an audio control module 140f, an Internet control module 140g, an application control module 140h, a search control module 140i, a UI processing module 140j, a voice recognition module 140k, and a voice database 140l.
  • the modules 140a to 140l may be implemented with software to respectively perform a power control function, a channel control function, a volume control function, an external input control function, a screen control function, an audio control function, an Internet control function, an application execution function, a search control function and a UI processing function.
  • the controller 130 may execute the software stored in the storage 140 to perform a corresponding function. For example, the controller 130 may recognize the user’s voice using the voice recognition module 140k and the voice database 140l and perform a task which corresponds to the recognized voice.
  • FIG. 4 is a view illustrating an interactive system 1000 according to an exemplary embodiment.
  • the interactive system 1000 includes an electronic apparatus 100’, a first server 200, and a second server 300.
  • the electronic apparatus 100’ may be implemented with the electronic apparatus 100’ illustrated in FIG. 2 and may perform various tasks according to a user’s voice.
  • the electronic apparatus 100’ outputs a response message which corresponds to the user’s voice or performs a task which corresponds to the user’s voice.
  • the electronic apparatus 100’ transmits the collected user’s voice to the first server 200 (for example, an ASR server).
  • the first server 200 converts the received user’s voice into text information (or a text) and transmits the text information to the electronic apparatus 100’.
  • the electronic apparatus 100’ transmits to the second server 300 the text information received from the first server 200.
  • the second server 300 for example, an interactive server
  • the second server 300 generates response information which corresponds to the received text information and transmits the generated response information to the electronic apparatus 100’.
  • the electronic apparatus 100’ may perform various operations based on the response information received from the second server 300.
  • the electronic apparatus 100’ may output the response message which corresponds to the user’s voice.
  • the response message may be output as at least one of a voice and a text.
  • For example, when the user’s voice asks for the broadcast time of a specific program, the electronic apparatus 100’ may output the broadcast time of the specific program in a voice or text form, or in a combination thereof.
  • the electronic apparatus 100’ may perform a task which corresponds to the user’s voice. For example, when the user’s voice for channel change is received, the electronic apparatus 100’ may tune a corresponding channel and display the tuned channel.
  • the electronic apparatus 100’ may provide a response message which corresponds to the corresponding task. That is, the electronic apparatus 100’ may output information for the task performed according to the user’s voice in a voice or text form, or in a combination thereof. In the above-described example, the electronic apparatus 100’ may output information for the changed channel or a message which indicates that the channel change is completed through at least one of a voice or a text.
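
The message flow of the interactive system 1000 (voice to the first server, text back, text to the second server, response information back) might be sketched as below. The two servers are replaced by local stand-in functions, and every name, the response dictionary layout, and the "channel N" phrase format are hypothetical choices made for illustration, not details taken from the patent.

```python
from typing import Callable, Dict


def handle_utterance(
    audio: bytes,
    asr_server: Callable[[bytes], str],
    interactive_server: Callable[[str], Dict[str, str]],
) -> Dict[str, str]:
    """Apparatus-side flow: voice -> first server -> text ->
    second server -> response information."""
    text = asr_server(audio)             # first server 200 (ASR)
    return interactive_server(text)      # second server 300 (interactive)


def fake_asr_server(audio: bytes) -> str:
    # Stand-in only: pretends the audio bytes already contain the text.
    return audio.decode("utf-8")


def fake_interactive_server(text: str) -> Dict[str, str]:
    # Stand-in only: understands a single hypothetical phrase format.
    if text.startswith("channel "):
        channel = text.split(" ", 1)[1]
        return {
            "task": "tune",
            "channel": channel,
            "message": f"Channel changed to {channel}",
        }
    return {"task": "none", "message": "Sorry, I did not understand."}


response = handle_utterance(b"channel 7", fake_asr_server, fake_interactive_server)
```

Passing the servers in as callables keeps the apparatus-side logic independent of how the servers are actually reached.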
  • FIG. 5 is a view illustrating a UI providing method according to an exemplary embodiment.
  • a UI screen including a plurality of objects 511 to 518 may be displayed on a screen.
  • the objects 511 to 517 indicating movie thumbnails are objects to which a calling text which enables voice calling is not tagged.
  • the object 518 indicating a specific menu item is an object to which the calling text which enables the voice calling is tagged.
  • identification tags 511-1 to 517-1 which identify corresponding objects may be assigned to the objects 511 to 517 which indicate the movie thumbnails.
  • the identification tags may be tags which identify the corresponding objects by voice and may be assigned in a number form which is easily recognizable by the user.
  • the identification tags are not limited thereto.
  • a calling text called “sub category” is tagged to the object 518 which indicates the specific menu item, and a separate identification tag may not be assigned to the object 518.
  • FIG. 5(b) illustrates a state in which a next UI page is displayed by selection of an item 520 which displays a UI page next to the UI page illustrated in FIG. 5(a).
  • a plurality of movie thumbnails 511 to 517 displayed on the UI page illustrated in FIG. 5(a) and other movie thumbnail objects 521 to 527 may be displayed on the UI page illustrated in FIG. 5(b).
  • identification tags which identify corresponding objects may be assigned to the corresponding thumbnail objects 521 to 527.
  • the identification tags may be number tags of 1 to 7 like the identification tags as illustrated in FIG. 5(a). This is because the identification tags are to simply identify the objects displayed on the displayed screen to be easily called in a voice and thus may be reused regardless of whether or not the identification tags are identification tags used in the previously displayed screen.
  • Alternatively, number tags of 8 to 14, which continue the order of the identification tags illustrated in FIG. 5(a), may be assigned to the corresponding thumbnail objects 521 to 527.
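
The two numbering options described for FIG. 5(b), reusing tags 1 to 7 on each page or continuing with 8 to 14, can be illustrated with a small helper. The function name, its parameters, and the zero-based page index are assumptions introduced here for illustration.

```python
from typing import List


def number_tags_for_page(
    page_index: int,       # zero-based page number (assumption)
    objects_on_page: int,  # how many untagged objects the page shows
    page_size: int,        # objects per full page
    reuse: bool = True,    # True: restart at 1; False: continue numbering
) -> List[str]:
    """Return the number tags to display on one UI page."""
    first = 1 if reuse else page_index * page_size + 1
    return [str(n) for n in range(first, first + objects_on_page)]


reused = number_tags_for_page(1, 7, 7, reuse=True)
continued = number_tags_for_page(1, 7, 7, reuse=False)
```

Reusing tags keeps the spoken numbers short, while continuing the sequence gives each object a tag that is unique across pages.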
  • FIG. 6 is a view illustrating a UI providing method according to another exemplary embodiment.
  • a preset GUI which indicates that the corresponding text is callable may be assigned to a text item which is callable.
  • GUIs 611 to 617 having a double-quotation mark form may be assigned to identification tags assigned to the thumbnail objects 511 to 517. Therefore, the user may intuitively recognize that the corresponding identification tag is a text which is callable in a voice.
  • the “sub category” may be displayed to indicate that the “sub category” is a callable text.
  • FIG. 7 is a view which illustrates a UI providing method according to another exemplary embodiment.
  • alphabet tags may be assigned to thumbnail objects to which a calling text which enables voice calling is not tagged, in order to enable voice calling.
  • an identification tag assigned to an object is not limited to numbers and may be applied in any of various forms which are easily recognizable by the user.
  • the above description uses a thumbnail image as an example of an object which is not callable by the user’s voice, but this is merely an exemplary embodiment and the concept may be applied to other objects which are not callable by the user’s voice, such as a list.
  • FIG. 8 is a flow chart which illustrates a control method of an electronic apparatus according to an exemplary embodiment.
  • the electronic apparatus first receives a user command for providing a UI screen including at least one object (S810).
  • the electronic apparatus determines whether or not it is possible for the user's voice to call the object in the UI screen (S820).
  • the electronic apparatus assigns an identification tag which identifies the object to the object based on a determination result in step S820 and displays the identification tag-assigned object (S830).
  • the electronic apparatus may determine that calling is impossible when a text item which enables calling the object by the user's voice is not tagged to the object.
  • the object to which the text item is not tagged may be an object having a thumbnail attribute or a list attribute.
  • In step S830 of displaying the identification tag-assigned object, the electronic apparatus may assign the identification tag to an object which is determined not to be callable in step S820 and may display the identification tag-assigned object.
  • the electronic apparatus may sequentially assign and display identification tags based on a displayed order of objects on the UI screen.
  • the identification tags may be at least one of a number tag having a preset order and an alphabet tag having a preset order.
  • When the identification tag is called by the user’s voice, the electronic apparatus may determine that the identification tag-assigned object is called and may perform a task for the corresponding object.
  • the electronic apparatus may assign and display a preset GUI which indicates that voice calling is possible to at least one of the identification tag assigned to the object and the text item, which is callable by the user’s voice.
  • the text item which is callable by the user’s voice means a text item which may be voice-recognizable and for example, the text item may include a menu title item assigned to the preset menu item.
  • the desired object may be selected without an additional focus manipulation.
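
Steps S820 and S830, together with calling an object by its tag, might look like the sketch below. The `UiObject` class, the lookup map, and the lowercase matching of recognized text are illustrative assumptions; the patent does not specify any data structure. Quoting the spoken text in `display_label` stands in for the quotation-mark GUI.

```python
import string
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class UiObject:
    name: str
    calling_text: Optional[str] = None  # text item enabling voice calling
    tag: Optional[str] = None           # identification tag, if assigned


def is_voice_callable(obj: UiObject) -> bool:
    """S820: an object is callable only if a text item is tagged to it."""
    return obj.calling_text is not None


def assign_identification_tags(
    objects: List[UiObject], style: str = "number"
) -> Dict[str, UiObject]:
    """S830: tag non-callable objects in displayed order and return a
    map from spoken text (lowercase) to object."""
    spoken_map: Dict[str, UiObject] = {}
    next_index = 0
    for obj in objects:
        if is_voice_callable(obj):
            spoken_map[obj.calling_text.lower()] = obj
            continue
        if style == "number":
            obj.tag = str(next_index + 1)
        else:
            # Alphabet tags; this sketch assumes at most 26 objects.
            obj.tag = string.ascii_uppercase[next_index]
        next_index += 1
        spoken_map[obj.tag.lower()] = obj
    return spoken_map


def display_label(obj: UiObject) -> str:
    """Wrap the callable text in quotation marks to show it is callable."""
    spoken = obj.tag if obj.tag is not None else obj.calling_text
    return f'"{spoken}"'


def call_by_voice(recognized: str, spoken_map: Dict[str, UiObject]) -> Optional[UiObject]:
    """Return the object called by the recognized text, if any."""
    return spoken_map.get(recognized.strip().lower())


screen = [UiObject("movie A"), UiObject("movie B"),
          UiObject("menu", calling_text="sub category")]
spoken = assign_identification_tags(screen)
```

Because the map is keyed on what the user can say, the desired object is found directly from the recognized text, with no focus movement.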
  • the method of controlling an electronic apparatus may be implemented with a program and may be provided to electronic apparatuses.
  • a non-transitory computer-readable medium may be provided which stores a program that, when a user command for providing a UI screen including at least one object is received, determines whether or not it is possible for a user's voice to call the object, and assigns and displays an identification tag which identifies the object based on a determination result.
  • the non-transitory computer-readable medium is not a medium configured to temporarily store data, such as a register, a cache, a memory, and the like, but an apparatus-readable storage medium configured to semi-permanently store data.
  • the above-described applications or programs may be stored and provided in the non-transitory computer-readable medium such as a compact disc (CD), a digital versatile disc (DVD), a hard disc (HD), a Blu-ray disc®, a USB, a memory card, a ROM, and the like.
  • a bus is not illustrated, communication between the components of the electronic apparatus may be performed through the bus.
  • a processor configured to perform the above-described various steps such as a CPU and a microprocessor may be further included.


Abstract

An electronic apparatus is provided. The electronic apparatus includes a voice input which is configured to receive a user's voice, a display configured to provide a user interface (UI) screen including at least one object, and a controller which is configured to determine whether or not it is possible for the user's voice to call the object, assign to the object an identification tag which identifies the object based on a result of the determination, and display the identification tag-assigned object.

Description

ELECTRONIC APPARATUS CONTROLLED BY A USER’S VOICE AND CONTROL METHOD THEREOF
Apparatuses and methods consistent with exemplary embodiments relate to an electronic apparatus and a control method thereof. More particularly, the exemplary embodiments relate to an electronic apparatus controlled by a user’s voice and a method of controlling the apparatus.
With the development of electronic technology, various kinds of electronic apparatuses have been developed and have become widespread. These apparatuses have increasingly become equipped with various functions, depending on user needs. In particular, televisions (TVs) have recently provided access to the Internet in order to support Internet service. As a result, users may watch a large number of digital broadcast channels through televisions.
In recent years, voice recognition technology has been developed to control electronic apparatuses more conveniently and intuitively. In particular, a TV may recognize a user’s voice and may perform functions which correspond to the user’s voice, such as volume adjustment or channel change.
However, a known voice recognition control method searches the menus on a screen one by one by moving a focus with limited navigation commands, which degrades usability.
One or more exemplary embodiments may overcome the above disadvantages and other disadvantages not described above. However, it is understood that one or more exemplary embodiments are not required to overcome the disadvantages described above, and may not overcome any of the problems described above.
One or more exemplary embodiments provide an electronic apparatus capable of easily calling an object by using voice recognition technology, and a control method thereof.
According to an aspect of an exemplary embodiment, there is provided an electronic apparatus controlled by a user’s voice. The electronic apparatus may include: a voice input which is configured to receive a user’s voice; a display configured to provide a user interface (UI) screen which includes at least one object; and a controller configured to determine whether or not calling the object by the user’s voice is possible, assign to the object an identification tag which identifies the object based on a determination result, and display the identification tag-assigned object.
The controller may assign the identification tag to the object and display it when a text item which enables voice calling to the object is not tagged to the object.
The object to which the text item is not tagged may be an object having a thumbnail attribute or a list attribute.
The controller may perform a task for the identification tag-assigned object when the identification tag is called by the user’s voice.
The controller may sequentially assign identification tags to objects based on a displayed order of the objects on the UI screen and may display the identification tag-assigned objects.
The identification tags may include at least one of a number tag having a preset order and an alphabet tag having a preset order.
The controller may assign a preset graphic user interface (GUI), which indicates that voice calling is possible, to at least one of the identification tag and a text item which is callable by the user’s voice, and may display the preset GUI.
According to an aspect of an exemplary embodiment, there is provided a method of controlling an electronic apparatus with a user’s voice. The control method may include: receiving a user command which provides a user interface (UI) screen including at least one object; determining whether or not calling the object by a user’s voice is possible; and assigning to the object an identification tag which identifies the object based on a determination result and displaying the identification tag-assigned object.
The determining whether or not calling is possible may include determining that the calling is impossible when a text item which enables voice calling to the object is not tagged to the object, and the displaying may include assigning the identification tag to the object and displaying it upon determination that voice calling is impossible.
The object to which the text item is not tagged may be an object having a thumbnail attribute or a list attribute.
The method of controlling an electronic apparatus may further include performing a task for the identification tag-assigned object, when the identification tag is called by the user’s voice.
The displaying may include sequentially assigning and displaying identification tags to objects on the UI screen based on a displayed order of the objects.
The identification tags may be at least one of a number tag having a preset order and an alphabet tag having a preset order.
The displaying may include assigning a preset graphic user interface (GUI), which indicates that voice calling is possible, to at least one of the identification tag and a text item which is callable by the user’s voice, and displaying the preset GUI.
Another exemplary embodiment may provide an electronic apparatus controlled by a user’s voice, the electronic apparatus including: a controller configured to determine whether or not it is possible for a user’s voice to call an object, assign to the object an identification tag which identifies the object based on a result of the determination, and display the identification tag-assigned object. The apparatus may further include a voice input configured to receive a user’s voice. A display may be configured to provide a user interface (UI) screen including at least one object.
The controller may assign the identification tag to the object and display it in response to a text item which enables voice calling to the object not being tagged to the object. The object to which the text item is not tagged may be an object having a thumbnail attribute or a list attribute. In addition, the controller may perform a task for the identification tag-assigned object when the identification tag is called by the user’s voice.
According to the various exemplary embodiments, the electronic apparatus may be easily controlled by using voice recognition technology.
Additional aspects and advantages of the exemplary embodiments will be set forth in the detailed description, will be obvious from the detailed description, or may be learned by practicing the exemplary embodiments.
The above and/or other aspects will be more apparent by describing in detail exemplary embodiments, with reference to the accompanying drawings, in which:
FIG. 1 is a block diagram which illustrates a configuration of an electronic apparatus, according to an exemplary embodiment;
FIG. 2 is a block diagram which illustrates a configuration of an electronic apparatus, according to another exemplary embodiment;
FIG. 3 is a view which illustrates a software configuration stored in a storage, according to an exemplary embodiment;
FIG. 4 is a view which illustrates an interactive system, according to an exemplary embodiment;
FIGS. 5 to 7 are views which illustrate UI providing methods, according to various exemplary embodiments; and
FIG. 8 is a flow chart which illustrates a method of controlling an electronic apparatus, according to an exemplary embodiment.
Hereinafter, exemplary embodiments will be described in more detail with reference to the accompanying drawings.
In the following description, the same reference numerals are used for the same elements when they are depicted in different drawings. The matters defined in the description, such as detailed construction and elements, are provided to assist in a comprehensive understanding of the exemplary embodiments. Thus, it is apparent that the exemplary embodiments can be carried out without those specifically defined matters. Also, functions or elements known in the related art are not described in detail since they would obscure the exemplary embodiments with unnecessary detail.
FIG. 1 is a block diagram which illustrates a configuration of an electronic apparatus, according to an exemplary embodiment.
An electronic apparatus 100 as illustrated in FIG. 1 includes a voice input 110, a display 120 and a controller 130.
The electronic apparatus 100 may be a smart TV, but this is merely an exemplary embodiment and the electronic apparatus 100 may be implemented with various electronic apparatuses such as a smart phone, a tablet personal computer (PC) and a laptop PC.
The electronic apparatus 100 may be implemented to recognize a voice naturally uttered by a person and to perform a voice recognition function which receives the recognized voice as an execution command. Herein, voice recognition means automatically identifying linguistic meaning from a voice. Specifically, voice recognition is a process of receiving a voice waveform, identifying a word or a word string, and extracting meaning, and may be performed through processes of voice analysis, phonemic recognition, word recognition, sentence interpretation, and meaning extraction. A detailed description thereof will be omitted.
The voice input 110 receives a user’s uttered voice. The voice input 110 converts the input voice signal into an electrical signal and outputs the electrical signal to the controller 130. The voice input 110 may be implemented as a microphone. Further, the voice input 110 may be implemented in all-in-one form with the electronic apparatus 100 or separate from the electronic apparatus 100. The separate voice input 110 may be connected to the electronic apparatus 100 in a wired or wireless manner.
The display 120 displays an image which corresponds to a broadcast signal received through a broadcast receiver. The display 120 may display image data (for example, a moving image) input through an external terminal input.
In particular, the display 120 may display a UI screen including various objects. For example, the display 120 may display the UI screen including a plurality of thumbnails which correspond to movie contents in a UI screen configured to provide a plurality of movie contents.
The controller 130 controls the voice input 110 and the display 120. Herein, the controller 130 may include a central processing unit (CPU), and a read only memory (ROM) and a random access memory (RAM) which are configured to store data.
In particular, the controller 130 may determine whether or not it is possible to use a voice to call an object displayed in the display 120 and assign and display an identification tag which identifies the object based on the result of the determination.
Specifically, the controller 130 may determine that calling by the user’s voice is impossible when a text item which enables voice calling to the object is not tagged to the object, and may then assign the identification tag to the object and display it.
For example, in general, an object having a thumbnail attribute or a list attribute does not have a text item with which the user may call the object by voice. Therefore, the controller 130 may assign to the object the identification tag which allows the user to identify and call the object, and may thereby enable voice calling by the user.
At this time, the controller 130 may sequentially assign and display identification tags on a UI screen based on a displayed order of objects. Here, the identification tag may be one of a number tag having a preset order and an alphabet tag having a preset order. However, the identification tag is not limited thereto.
For example, when five movie thumbnails are displayed on a screen, the controller 130 may assign a first identification tag to a fifth identification tag according to the displayed order and may allow the user to call the desired movie thumbnail in a voice using the identification tags.
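The sequential assignment in the five-thumbnail example can be sketched as follows. This is a minimal illustration only: the function name `assign_number_tags` and the thumbnail names are hypothetical, and the list order is assumed to stand in for the displayed order on the UI screen.

```python
from typing import Dict, List


def assign_number_tags(displayed_objects: List[str]) -> Dict[str, str]:
    """Map number tags "1", "2", ... to objects in their displayed order.

    The list order is an assumed stand-in for the order on the UI screen.
    """
    return {
        str(position): name
        for position, name in enumerate(displayed_objects, start=1)
    }


# Five hypothetical movie thumbnails, listed in displayed order.
thumbnails = ["movie_a", "movie_b", "movie_c", "movie_d", "movie_e"]
tags = assign_number_tags(thumbnails)
print(tags["3"])  # prints movie_c
```

With such a mapping, uttering the tag "3" is enough to single out the third thumbnail without moving a focus.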
Further, the controller 130 may assign a preset GUI, which indicates that calling is possible, to a text item which is callable by voice, and may display the preset GUI. The above-described identification tag may be included in the text item which is callable by voice. However, in addition to the identification tag, other text items which are callable by voice may be included.
For example, when a specific menu item is configured with a text called “Sub category” which is callable by voice, the controller 130 may assign a preset GUI to the corresponding text and display it. Here, the preset GUI indicating that calling is possible may be a GUI having the form of a quotation mark, but the GUI is not limited thereto. For example, any GUI configured to allow the user to recognize that calling is possible, such as a GUI having the form of a speech bubble, may be applied.
Further, the controller 130 may control the identification tag-assigned object to be called when the identification tag is called by the user’s voice input through the voice input 110. For example, the controller 130 may determine that the object to which the identification tag called “1” is assigned is called when “1” is called by the user’s voice and may perform a task which corresponds to a corresponding object. Here, the task which corresponds to the corresponding object is a predefined job executable by an apparatus through the calling of the corresponding object. For example, when a content to which the identification tag called “1” is assigned is a moving image content, the task may be a job which reproduces the corresponding content or which displays a detailed item of the corresponding content.
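The calling of a tagged object can be sketched as a lookup from the recognized text to a predefined task. The names `handle_utterance` and `tasks` and the returned strings are hypothetical; the sketch assumes that voice recognition has already produced a text such as "1".

```python
from typing import Callable, Dict, Optional


def handle_utterance(
    recognized_text: str,
    tagged_tasks: Dict[str, Callable[[], str]],
) -> Optional[str]:
    """Run the task of the object whose identification tag was uttered.

    Returns None when the utterance matches no identification tag.
    """
    task = tagged_tasks.get(recognized_text.strip())
    if task is None:
        return None
    return task()


# Hypothetical tasks: reproduce a content or display its detailed item.
tasks = {
    "1": lambda: "reproduce movie_a",
    "2": lambda: "display details of movie_b",
}
print(handle_utterance("1", tasks))  # prints reproduce movie_a
```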
Hereinafter, a method of recognizing a user’s voice input through the voice input 110 by the controller 130 will be described in brief.
The controller 130 recognizes a voice using a voice recognition module and a voice database, when the voice is input through the voice input 110. The voice recognition is divided into isolated word recognition, which recognizes an uttered voice by distinguishing individual words; continuous speech recognition, which recognizes continuous words, continuous sentences, and dialogic voices; and keyword spotting, which is an intermediate form between the isolated word recognition and the continuous speech recognition and which detects and recognizes a predetermined keyword.
When the user’s voice is input, the controller 130 detects a start and an end of the user’s uttered voice in the input voice signal to determine a voice period. The controller 130 may calculate the energy of the input voice signal, classify an energy level of the voice signal according to the calculated energy, and detect the voice period through dynamic programming. The controller 130 detects a phoneme, which is a minimum unit of the voice, from the voice signal in the detected voice period based on an acoustic model and generates phonemic data. The controller 130 generates text information by applying a hidden Markov model (HMM), which is a probabilistic model, to the generated phonemic data. However, the above-described voice recognition method is merely an exemplary embodiment and the user’s voice may be recognized through other methods. In this manner, the controller 130 may recognize the user’s voice included in the voice signal. As described above, the controller 130 performs a task of the electronic apparatus using the recognized voice.
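The energy-based detection of the voice period can be illustrated with a simplified sketch. Note the assumptions: the description mentions classifying energy levels and using dynamic programming, whereas the sketch below substitutes a fixed energy threshold over non-overlapping frames. The function names, the frame length and the threshold are hypothetical.

```python
from typing import List, Optional, Tuple


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Sum of squared samples for each full, non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_voice_period(
    samples: List[float],
    frame_len: int = 4,
    threshold: float = 0.5,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of the voice period, or None.

    Simplification: a fixed threshold replaces the energy-level
    classification and dynamic programming named in the description.
    """
    energies = frame_energies(samples, frame_len)
    active = [i for i, energy in enumerate(energies) if energy >= threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


# Silence, then a burst of signal, then silence again.
signal = [0.0] * 8 + [1.0] * 8 + [0.0] * 8
print(detect_voice_period(signal))  # prints (8, 16)
```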
FIG. 2 is a block diagram which illustrates a configuration of an electronic apparatus according to another exemplary embodiment. Referring to FIG. 2, an electronic apparatus 100’ includes a voice input 110, a display 120, a controller 130, a storage 140, a broadcast receiver 150, an external terminal input 160, a remote controller signal receiver 170, a communicator 180, a recognizer 190, and an audio output 195.
Detailed description of the portions of the components illustrated in FIG. 2 which are the same as in the components illustrated in FIG. 1 will be omitted to avoid obscuring the invention.
The controller 130 includes a RAM 131, a ROM 132, a main CPU 133, a graphic processor 134, a first interface 135-1 to an n-th interface 135-n, and a bus 136.
The RAM 131, the ROM 132, the main CPU 133, the graphic processor 134, and the first to n-th interfaces 135-1 to 135-n may be connected to each other through the bus 136.
The first to n-th interfaces 135-1 to 135-n are connected to the above-described components. One of the interfaces may be a network interface connected to an external apparatus through a network.
The main CPU 133 accesses the storage 140 to perform booting using an operating system (O/S) stored in the storage 140. The main CPU 133 performs various operations using various programs, contents, data, and the like, stored in the storage 140.
A command set and the like for system booting is stored in the ROM 132. When a turn-on command is input and power is supplied, the main CPU 133 copies the O/S stored in the storage 140 to the RAM 131 according to the command stored in the ROM 132 and executes the O/S to boot the system. When the booting is completed, the main CPU 133 copies various application programs stored in the storage 140 to the RAM 131 and executes the application programs copied to the RAM 131 to perform various operations.
The graphic processor 134 generates a screen including various objects such as an icon, an image, and a text using a calculator (not shown) and a renderer (not shown). The calculator (not shown) calculates attribute values, such as a coordinate value at which each of the objects is to be displayed, a shape, a size, and a color, according to a layout of the screen. The renderer (not shown) generates screens having various layouts including the objects based on the attribute values calculated in the calculator (not shown). The screen generated in the renderer (not shown) is displayed in a display region of the display 120.
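The division of work between the calculator and the renderer can be sketched as two steps: one computes attribute values from a layout, and the other turns them into a screen description. The grid layout, the function names and the textual "draw commands" are assumptions made for illustration; the description does not specify the layout or the rendering format.

```python
from typing import Dict, List, Union

Attributes = Dict[str, Union[str, int]]


def calculate_attributes(
    names: List[str], columns: int, cell_w: int, cell_h: int
) -> List[Attributes]:
    """Calculator step: coordinates and size per object in a grid layout."""
    attributes = []
    for index, name in enumerate(names):
        row, col = divmod(index, columns)
        attributes.append({
            "name": name,
            "x": col * cell_w,
            "y": row * cell_h,
            "width": cell_w,
            "height": cell_h,
        })
    return attributes


def render(attributes: List[Attributes]) -> List[str]:
    """Renderer step: turn attribute values into textual draw commands."""
    return [
        f"{a['name']} at ({a['x']}, {a['y']}) size {a['width']}x{a['height']}"
        for a in attributes
    ]


for command in render(calculate_attributes(["a", "b", "c"], 2, 100, 50)):
    print(command)
```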
The storage 140 stores various data and programs for driving and controlling the electronic apparatus 100’. The storage 140 may include a voice recognition module configured to recognize the voice input through the voice input 110 and a voice database. The voice database means a database in which a preset voice and a voice task matching the preset voice are stored.
The broadcast receiver 150 receives a broadcast signal from the outside in a wired or wireless manner. The broadcast signal includes a video signal, an audio signal, and additional data (for example, electronic program guide (EPG)). The broadcast receiver 150 may receive a broadcast signal from various sources such as a terrestrial broadcasting, cable broadcasting, satellite broadcasting and Internet broadcasting.
The external terminal input 160 receives image data (for example, a moving image, a photo, and the like), audio data (for example, music and the like), and the like, from outside of the electronic apparatus 100’. The external terminal input 160 may include at least one selected from the group consisting of a high-definition multimedia interface (HDMI) input terminal, a component input terminal, a PC input terminal, and a universal serial bus (USB) input terminal.
The remote controller signal receiver 170 may receive a remote controller signal even when the electronic apparatus 100’ is in a voice task mode or in a motion task mode.
The communicator 180 may connect the electronic apparatus 100’ and an external apparatus (not shown) under control of the controller 130. Specifically, the communicator 180 may support communication methods such as Ethernet 181, a wireless local area network (LAN) 182, and Bluetooth 183. However, the communication method of the communicator 180 is not limited thereto.
Here, the external apparatus may be implemented as an automatic speech recognition (ASR) server and an interactive server configured to provide interactive service.
That is, the electronic apparatus 100’ may be implemented to provide various interactive services as well as to perform a task according to simple voice recognition. A description thereof will be made later with reference to FIG. 4.
Further, the external apparatus may be implemented with a server configured to download an application or perform web browsing.
The recognizer 190 outputs, through the audio output 195, a voice which corresponds to a broadcast signal under control of the controller 130. The audio output 195 may include at least one output selected from the group including a speaker 191, a headphone output terminal 192, and a Sony/Philips digital interface (S/PDIF) output terminal 193.
FIG. 3 is a view which illustrates a configuration of software stored in a storage according to an exemplary embodiment.
As illustrated in FIG. 3, the storage 140 includes a power control module 140a, a channel control module 140b, a volume control module 140c, an external input control module 140d, a screen control module 140e, an audio control module 140f, an Internet control module 140g, an application control module 140h, a search control module 140i, a UI processing module 140j, a voice recognition module 140k, and a voice database 140l. The modules 140a to 140l may be implemented with software to respectively perform a power control function, a channel control function, a volume control function, an external input control function, a screen control function, an audio control function, an Internet control function, an application execution function, a search control function and a UI processing function. The controller 130 may execute the software stored in the storage 140 to perform a corresponding function. For example, the controller 130 may recognize the user’s voice using the voice recognition module 140k and the voice database 140l and perform a task which corresponds to the recognized voice.
FIG. 4 is a view illustrating an interactive system 1000 according to an exemplary embodiment. As illustrated in FIG. 4, the interactive system 1000 includes an electronic apparatus 100’, a first server 200, and a second server 300.
The electronic apparatus 100’ may be implemented with the electronic apparatus 100’ illustrated in FIG. 2 and may perform various tasks according to a user’s voice.
Specifically, the electronic apparatus 100’ outputs a response message which corresponds to the user’s voice or performs a task which corresponds to the user’s voice.
To this end, the electronic apparatus 100’ transmits the collected user’s voice to the first server 200 (for example, an ASR server) if necessary. When the user’s voice is received from the electronic apparatus 100’, the first server 200 converts the received user’s voice into text information (or a text) and transmits the text information to the electronic apparatus 100’.
The electronic apparatus 100’ transmits to the second server 300 the text information received from the first server 200. When the text information is received from the electronic apparatus 100’, the second server 300 (for example, an interactive server) generates response information which corresponds to the received text information and transmits the generated response information to the electronic apparatus 100’.
The electronic apparatus 100’ may perform various operations based on the response information received from the second server 300.
Specifically, the electronic apparatus 100’ may output the response message which corresponds to the user’s voice. Here, the response message may be output as at least one of a voice and a text. For example, when the user’s voice inquires about the broadcast time of a specific program, the electronic apparatus 100’ may output the broadcast time of the specific program in a voice or text form, or in a combination thereof.
Further, the electronic apparatus 100’ may perform a task which corresponds to the user’s voice. For example, when the user’s voice for channel change is received, the electronic apparatus 100’ may tune a corresponding channel and display the tuned channel.
At this time, the electronic apparatus 100’ may provide a response message which corresponds to the corresponding task. That is, the electronic apparatus 100’ may output information on the task performed according to the user’s voice in a voice or text form, or in a combination thereof. In the above-described example, the electronic apparatus 100’ may output information on the changed channel or a message which indicates that the channel change is completed through at least one of a voice and a text.
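The exchange between the electronic apparatus 100’, the first server 200 and the second server 300 can be sketched as one interactive turn. No real server or network protocol is specified in this description, so the two servers are modeled below as hypothetical local stub functions, and the response format (a dictionary with a "type" field) is likewise an assumption.

```python
from typing import Callable, Dict


def run_interactive_turn(
    voice: bytes,
    first_server: Callable[[bytes], str],
    second_server: Callable[[str], Dict[str, str]],
) -> Dict[str, str]:
    """Voice -> text via the first server, text -> response via the second."""
    text = first_server(voice)      # ASR server returns text information
    return second_server(text)      # interactive server returns response info


def fake_asr_server(voice: bytes) -> str:
    """Stub: pretends the voice payload already carries its transcript."""
    return voice.decode("utf-8")


def fake_interactive_server(text: str) -> Dict[str, str]:
    """Stub: returns a task for channel changes, otherwise a message."""
    if "channel" in text:
        return {"type": "task", "action": "change_channel"}
    return {"type": "message", "text": f"No answer for: {text}"}


response = run_interactive_turn(
    b"change channel", fake_asr_server, fake_interactive_server
)
print(response["type"])  # prints task
```

Passing the servers in as callables keeps the apparatus-side flow independent of how the servers are actually reached.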
Hereinafter, UI providing methods according to various exemplary embodiments will be described with reference to FIGS. 5 to 7.
FIG. 5 is a view illustrating a UI providing method according to an exemplary embodiment.
As illustrated in FIG. 5(a), a UI screen including a plurality of objects 511 to 518 may be displayed on a screen.
In the UI screen illustrated in FIG. 5(a), it is assumed that the objects 511 to 517 indicating movie thumbnails are objects to which a calling text which enables voice calling is not tagged, and the object 518 indicating a specific menu item is an object to which the calling text which enables the voice calling is tagged.
At this time, identification tags 511-1 to 517-1 which identify corresponding objects may be assigned to the objects 511 to 517 which indicate the movie thumbnails. Here, the identification tags may be tags which identify the corresponding objects by voice and may be assigned in a number form which is easily recognizable by the user. However, the identification tags are not limited thereto.
On the other hand, a calling text called “sub category” is tagged to the object 518 which indicates the specific menu item, and a separate identification tag may not be assigned to the object 518.
FIG. 5(b) illustrates a state in which a next UI page is displayed by selection of an item 520 which displays a UI page next to the UI page illustrated in FIG. 5(a).
Movie thumbnail objects 521 to 527, which are different from the plurality of movie thumbnails 511 to 517 displayed on the UI page illustrated in FIG. 5(a), may be displayed on the UI page illustrated in FIG. 5(b).
At this time, identification tags which identify corresponding objects may be assigned to the corresponding thumbnail objects 521 to 527. As illustrated in FIG. 5(b), the identification tags may be number tags of 1 to 7 like the identification tags illustrated in FIG. 5(a). This is because the identification tags simply identify the objects displayed on the current screen so that they can be easily called by voice, and thus may be reused regardless of whether or not they were used on the previously displayed screen.
However, this is merely an exemplary embodiment and in some cases, number tags of 8 to 14, which follow in order the identification tags illustrated in FIG. 5(a), may be assigned to the corresponding thumbnail objects 521 to 527.
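The two numbering options for a next UI page, reusing tags 1 to 7 or continuing with 8 to 14, can be sketched with a single parameter. The function name `page_tags` and its parameters are hypothetical, and a zero-based page index is assumed.

```python
from typing import List


def page_tags(
    page_index: int,
    objects_on_page: int,
    page_size: int,
    reuse: bool = True,
) -> List[str]:
    """Number tags for one UI page (page_index is zero-based).

    reuse=True restarts at 1 on every page; reuse=False continues the
    numbering from the previous pages.
    """
    start = 1 if reuse else page_index * page_size + 1
    return [str(number) for number in range(start, start + objects_on_page)]


print(page_tags(1, 7, 7))               # tags 1 to 7 on the second page
print(page_tags(1, 7, 7, reuse=False))  # tags 8 to 14 on the second page
```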
FIG. 6 is a view illustrating a UI providing method according to another exemplary embodiment.
As illustrated in FIG. 6, a preset GUI which indicates that the corresponding text is callable may be assigned to a text item which is callable.
For example, as illustrated in FIG. 6, GUIs 611 to 617 having a double-quotation mark form may be assigned to identification tags assigned to the thumbnail objects 511 to 517. Therefore, the user may intuitively recognize that the corresponding identification tag is a text which is callable in a voice.
Further, for the object 518 indicating a specific menu item to which a calling text called “sub category” is tagged, the “sub category” text may likewise be displayed in a manner which indicates that it is a callable text.
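Marking callable texts with a double-quotation mark form can be sketched as a simple labeling step. Representing the GUI as quote characters around a string is an assumption made for illustration, as are the function name `callable_labels` and the (tag, text item) pairs.

```python
from typing import List, Optional, Tuple


def callable_labels(
    objects: List[Tuple[Optional[str], Optional[str]]]
) -> List[str]:
    """Wrap each callable text in double quotation marks.

    Each entry is (identification_tag, text_item). The tag is used when
    present, otherwise the text item; entries with neither are skipped.
    """
    labels = []
    for tag, text_item in objects:
        spoken = tag if tag is not None else text_item
        if spoken is not None:
            labels.append(f'"{spoken}"')
    return labels


# Two tagged thumbnails and one menu item with its own calling text.
for label in callable_labels([("1", None), ("2", None), (None, "sub category")]):
    print(label)
```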
FIG. 7 is a view which illustrates a UI providing method according to another exemplary embodiment.
As illustrated in FIG. 7, alphabet tags may be assigned to thumbnail objects, to which a calling text which enables voice calling is not tagged, in order to enable voice calling.
That is, the identification tag assigned to an object is not limited to numbers and may be applied in any of various forms which are easily recognizable by the user.
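An alphabet tag having a preset order can be sketched in the same way as a number tag. The choice of uppercase Latin letters and the limit of 26 objects per screen are assumptions of this sketch; the description does not state how more objects would be labeled.

```python
import string
from typing import List


def alphabet_tags(count: int) -> List[str]:
    """Return the first `count` uppercase letters as identification tags.

    Assumption: at most 26 objects are tagged on one screen.
    """
    if count < 0 or count > len(string.ascii_uppercase):
        raise ValueError("count must be between 0 and 26")
    return list(string.ascii_uppercase[:count])


print(alphabet_tags(7))  # tags A to G for seven thumbnail objects
```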
The above-described exemplary embodiments have described the thumbnail image as an example of an object which is not callable by the user’s voice, but it is merely an exemplary embodiment and the concept may be applied to other objects which are not callable by the user, such as a list.
FIG. 8 is a flow chart which illustrates a control method of an electronic apparatus according to an exemplary embodiment.
As illustrated in FIG. 8, in a control method of an electronic apparatus controlled by the user’s voice according to the exemplary embodiment, the electronic apparatus first receives a user command for providing a UI screen including at least one object (S810).
The electronic apparatus determines whether or not it is possible for the user's voice to call the object in the UI screen (S820).
Next, the electronic apparatus assigns an identification tag which identifies the object to the object based on a determination result in step S820 and displays the identification tag-assigned object (S830).
At this time, in step S820 of determining whether or not calling is possible, the electronic apparatus may determine that calling is impossible when a text item which enables the user's voice to call the object is not tagged to the object. Here, the object to which the text item is not tagged may be an object having a thumbnail attribute or a list attribute.
Further, in step S830 of displaying the identification tag-assigned object, the electronic apparatus may assign the identification tag to an object which is determined not to be callable in step S820 and may display the identification tag-assigned object.
Specifically, the electronic apparatus may sequentially assign and display identification tags based on a displayed order of objects on the UI screen. The identification tags may be at least one of a number tag having a preset order and an alphabet tag having a preset order.
When the identification tag assigned to the object is called by the user’s voice, the electronic apparatus may determine that the identification tag-assigned object is called and may perform a task for a corresponding object.
Further, in step S830 of displaying the identification tag-assigned object, the electronic apparatus may assign a preset GUI, which indicates that voice calling is possible, to at least one of the identification tag assigned to the object and the text item which is callable by the user’s voice, and may display the preset GUI. Here, the text item which is callable by the user’s voice means a text item which is voice-recognizable. For example, the text item may include a menu title item assigned to a preset menu item.
Therefore, when the object is selected through voice recognition, the desired object may be selected without an additional focus manipulation.
The method of controlling an electronic apparatus according to the above-described various exemplary embodiments may be implemented with a program and may be provided to electronic apparatuses.
For example, a non-transitory computer-readable medium may be provided which stores a program that, when a user command for providing a UI screen including at least one object is received, determines whether or not it is possible for the user's voice to call the object, and assigns and displays an identification tag which identifies the object based on a determination result.
The non-transitory computer-readable medium is not a medium configured to temporarily store data, such as a register, a cache, or a memory, but an apparatus-readable storage medium configured to semi-permanently store data. Specifically, the above-described applications or programs may be stored and provided in a non-transitory computer-readable medium such as a compact disc (CD), a digital versatile disc (DVD), a hard disc (HD), a Blu-ray disc®, a universal serial bus (USB) storage device, a memory card, a ROM, and the like.
Further, although a bus is not illustrated in the block diagram of the electronic apparatus, communication between the components of the electronic apparatus may be performed through a bus. Each device may further include a processor, such as a CPU or a microprocessor, configured to perform the above-described various steps.
The foregoing exemplary embodiments and advantages are merely exemplary and are not to be construed as limiting the present disclosure. The exemplary embodiments can be readily applied to other types of devices. Also, the description of the exemplary embodiments is intended to be illustrative, and not to limit the scope of the claims, as many alternatives, modifications and variations will be apparent to those skilled in the art.

Claims (14)

  1. An electronic apparatus controlled by a user’s voice, the electronic apparatus comprising:
    a voice input unit configured to receive a user’s voice;
    a display configured to provide a user interface (UI) screen including at least one object; and
    a controller configured to determine whether or not it is possible to call the object using the user’s voice, assign an identification tag for identifying the object to the object based on a determination result, and display the identification tag-assigned object.
  2. The electronic apparatus as claimed in claim 1, wherein the controller assigns and displays the identification tag to the object when a text item which enables voice calling to the object is not tagged to the object.
  3. The electronic apparatus as claimed in claim 2, wherein the object to which the text item is not tagged is an object having a thumbnail attribute or a list attribute.
  4. The electronic apparatus as claimed in claim 1, wherein the controller performs a task for the identification tag-assigned object when the identification tag is called by the user’s voice.
  5. The electronic apparatus as claimed in claim 1, wherein the controller sequentially assigns identification tags to objects based on a displayed order of the objects on the UI screen and displays the identification tag-assigned objects.
  6. The electronic apparatus as claimed in claim 5, wherein the identification tags include at least one of a number tag having a preset order and an alphabet tag having a preset order.
  7. The electronic apparatus as claimed in claim 1, wherein the controller assigns and displays a preset graphic user interface (GUI) indicating that the voice calling is possible to at least one of the identification tag and a text item which is callable by the user’s voice.
  8. A method of controlling an electronic apparatus by a user’s voice, the control method comprising:
    receiving a user command for providing a user interface (UI) screen including at least one object;
    determining whether or not it is possible to call the object by a user’s voice; and
    assigning an identification tag for identifying the object to the object based on a result of the determination and displaying the identification tag-assigned object.
  9. The method as claimed in claim 8, wherein the determining whether or not calling is possible includes determining that calling is impossible when a text item which enables voice calling to the object is not tagged to the object and the displaying includes assigning and displaying the identification tag to the object when it is determined that the voice calling is impossible.
  10. The control method as claimed in claim 9, wherein the object to which the text item is not tagged is an object having a thumbnail attribute or a list attribute.
  11. The method as claimed in claim 8, further comprising performing a task for the identification tag-assigned object when the identification tag is called by the user’s voice.
  12. The method as claimed in claim 8, wherein the displaying on the UI screen includes sequentially assigning and displaying identification tags to objects based on a displayed order of the objects.
  13. The method as claimed in claim 12, wherein the identification tag is at least one of a number tag having a preset order and an alphabet tag having a preset order.
  14. The method as claimed in claim 8, wherein the displaying includes assigning and displaying a preset graphic user interface (GUI) indicating that voice calling is possible to at least one of the identification tag and a text item which is callable by the user’s voice.

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP13869992.1A EP2941896A4 (en) 2013-01-07 2013-10-25 ELECTRONIC APPARATUS CONTROLLED BY THE VOICE OF A USER AND METHOD FOR CONTROLLING THE SAME

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020130001776A KR20140089847A (en) 2013-01-07 2013-01-07 electronic apparatus and control method thereof
KR10-2013-0001776 2013-01-07

Publications (1)

Publication Number Publication Date
WO2014106986A1 (en) 2014-07-10

Family

ID=51062063

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2013/009606 Ceased WO2014106986A1 (en) 2013-01-07 2013-10-25 Electronic apparatus controlled by a user's voice and control method thereof

Country Status (4)

Country Link
US (1) US10250935B2 (en)
EP (1) EP2941896A4 (en)
KR (1) KR20140089847A (en)
WO (1) WO2014106986A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6102588B2 (en) * 2013-07-10 2017-03-29 ソニー株式会社 Information processing apparatus, information processing method, and program
KR102298767B1 (en) * 2014-11-17 2021-09-06 삼성전자주식회사 Voice recognition system, server, display apparatus and control methods thereof
US11019884B2 (en) 2016-11-23 2021-06-01 Nike, Inc. Sole structure having a midsole component with movable traction members
EP3401797A1 (en) 2017-05-12 2018-11-14 Samsung Electronics Co., Ltd. Speech navigation for multilingual web pages
CN116072115B (en) * 2017-05-12 2026-02-10 三星电子株式会社 Display devices and their control methods
KR102519635B1 (en) 2018-01-05 2023-04-10 삼성전자주식회사 Method for displaying an electronic document for processing a voice command and electronic device thereof
KR102482589B1 (en) 2018-02-12 2022-12-30 삼성전자주식회사 Method for operating speech recognition service and electronic device supporting the same
CN108491179A (en) * 2018-03-13 2018-09-04 黄玉玲 A kind of method and system of word input
KR102563314B1 (en) 2018-08-30 2023-08-04 삼성전자주식회사 Electronic Device and the Method for Generating Short cut of Quick Command
WO2021142040A1 (en) * 2020-01-06 2021-07-15 Strengths, Inc. Precision recall in voice computing
US11922096B1 (en) 2022-08-30 2024-03-05 Snap Inc. Voice controlled UIs for AR wearable devices

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050210416A1 (en) * 2004-03-16 2005-09-22 Maclaurin Matthew B Interactive preview of group contents via axial controller
US20070118382A1 (en) * 2005-11-18 2007-05-24 Canon Kabushiki Kaisha Information processing apparatus and information processing method
KR20110035036A (en) * 2009-09-29 2011-04-06 엘지전자 주식회사 Mobile terminal and its control method
US20110159885A1 (en) * 2009-12-30 2011-06-30 Lg Electronics Inc. Mobile terminal and method of controlling the operation of the mobile terminal
US20120167153A1 (en) * 2010-12-23 2012-06-28 Electronics And Telecommunications Research Institute System for providing broadcast service and method for providing broadcast service

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IL119948A (en) * 1996-12-31 2004-09-27 News Datacom Ltd Voice activated communication system and program guide
US6615176B2 (en) * 1999-07-13 2003-09-02 International Business Machines Corporation Speech enabling labeless controls in an existing graphical user interface
KR20010015934A (en) 2000-03-11 2001-03-05 김하철 method for menu practice of application program using speech recognition
US20030005461A1 (en) * 2001-07-02 2003-01-02 Sony Corporation System and method for linking closed captioning to web site
US20050268214A1 (en) 2004-05-31 2005-12-01 De-Jen Lu Simple input method for a web browser
JP2009169883A (en) 2008-01-21 2009-07-30 Ict Solutions:Kk Simple operation method for web browser
GB0911353D0 (en) * 2009-06-30 2009-08-12 Haq Saad U Discrete voice command navigator
US8572177B2 (en) * 2010-03-10 2013-10-29 Xmobb, Inc. 3D social platform for sharing videos and webpages
US20120260284A1 (en) * 2011-04-07 2012-10-11 Sony Corporation User interface for audio video display device such as tv personalized for multiple viewers
KR101897492B1 (en) 2011-06-07 2018-09-13 삼성전자주식회사 Display apparatus and Method for executing hyperlink and Method for recogniting voice thereof
US9183832B2 (en) 2011-06-07 2015-11-10 Samsung Electronics Co., Ltd. Display apparatus and method for executing link and method for recognizing voice thereof


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2941896A4 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104599669A (en) * 2014-12-31 2015-05-06 乐视致新电子科技(天津)有限公司 Voice control method and device
CN110968375A (en) * 2018-09-29 2020-04-07 Tcl集团股份有限公司 Interface control method and device, intelligent terminal and computer readable storage medium
CN110968375B (en) * 2018-09-29 2023-01-31 Tcl科技集团股份有限公司 Interface control method and device, intelligent terminal and computer readable storage medium
CN116564280A (en) * 2023-07-05 2023-08-08 深圳市彤兴电子有限公司 Speech recognition-based display control method, device and computer equipment
CN116564280B (en) * 2023-07-05 2023-09-08 深圳市彤兴电子有限公司 Display control method and device based on voice recognition and computer equipment

Also Published As

Publication number Publication date
KR20140089847A (en) 2014-07-16
EP2941896A4 (en) 2016-07-06
EP2941896A1 (en) 2015-11-11
US20140196087A1 (en) 2014-07-10
US10250935B2 (en) 2019-04-02

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13869992; Country of ref document: EP; Kind code of ref document: A1)
REEP Request for entry into the European phase (Ref document number: 2013869992; Country of ref document: EP)
WWE WIPO information: entry into national phase (Ref document number: 2013869992; Country of ref document: EP)
NENP Non-entry into the national phase (Ref country code: DE)