WO2019142419A1 - Information processing device and information processing method - Google Patents


Publication number
WO2019142419A1
WO2019142419A1 PCT/JP2018/038725 JP2018038725W WO2019142419A1 WO 2019142419 A1 WO2019142419 A1 WO 2019142419A1 JP 2018038725 W JP2018038725 W JP 2018038725W WO 2019142419 A1 WO2019142419 A1 WO 2019142419A1
Authority
WO
WIPO (PCT)
Prior art keywords
input
information processing
control unit
user
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2018/038725
Other languages
English (en)
Japanese (ja)
Inventor
亜由美 中川
賢次 杉原
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Corp
Original Assignee
Sony Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corp

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/10 Speech classification or search using distance or distortion measures between unknown speech and reference templates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • the present disclosure relates to an information processing apparatus and an information processing method.
  • Patent Document 1 discloses a technique for correcting recognition errors associated with proper nouns.
  • the present disclosure proposes a new and improved information processing apparatus and information processing method capable of easily correcting a selection error of a form to be input.
  • An information processing apparatus is provided that includes a control unit which selects a first target form to be input from a plurality of forms based on a user's input operation and performs character input on the first target form, wherein the control unit selects a second target form different from the first target form based on the user's feedback on the input content entered in the first target form, and performs character input on the second target form.
  • the processor selects a first target form to be input from a plurality of forms based on the user's input operation, and performs character input on the first target form.
  • a second target form different from the first target form is selected based on the user's feedback on the input content input to the first target form, and the character input is performed on the second target form.
  • Input formats that contain a plurality of forms serving as targets of character input are in widespread use.
  • Such an input format is adopted, for example, in an input interface such as a to-do list or a scheduler, and has a plurality of forms corresponding to a title, a date, a time, and the like.
  • One feature of the information processing apparatus is that it includes a control unit which selects a first target form to be input from a plurality of forms based on the user's input operation and performs character input on the first target form, and that the control unit selects a second target form different from the first target form based on the user's feedback on the input content entered in the first target form and performs character input on the second target form.
  • FIG. 1 and FIG. 2 are diagrams for explaining the outline of the present embodiment.
  • In a speech input interface including a plurality of forms, one factor that causes a speech recognition result to be input to a form not intended by the user is, for example, the accuracy of speech recognition itself.
  • FIG. 1 is a diagram showing an example when a form not intended by the user is selected due to an error in speech recognition result.
  • FIG. 1 shows a situation in which the user U registers a schedule by speech using a voice input interface IF having a plurality of forms F1 to F4.
  • the forms F1 to F4 may be forms for inputting the title, date, time, and day of the week regarding the schedule, respectively.
  • The user U makes an utterance UO1b that specifies, by form name, the intended form, i.e., the form F4 corresponding to the day of the week, as the destination for the speech recognition result erroneously input to the form F1.
  • the information processing server 20 may move the speech recognition result from the form F1 to the form F4 based on the detected speech UO1b of the user U.
  • the information processing server 20 according to the present embodiment can correct the voice input result in accordance with the input format of the form F4 corresponding to the day of the week. Focusing on the lower right of FIG. 1, it can be understood that “Kayayou” input to the form F1 is corrected to “Tuesday” by the above-described processing, and is correctly input to the form F4 intended by the user U.
  • According to the information processing server 20 of the present embodiment, it is possible to easily correct a form selection error caused by a speech recognition error and, further, to correct the speech recognition error itself.
  • FIG. 2 is a diagram showing an example in which a form not intended by the user is selected due to an error in character conversion in the speech recognition process.
  • FIG. 2 shows a situation in which the user U registers a schedule by speech using a voice input interface IF having a plurality of forms F1 to F4 as in FIG.
  • the description of the configuration and the like of the common form is omitted.
  • The user U makes an utterance UO2a instructing registration of a schedule for "5 days". At this time, however, an error occurs in character conversion in the speech recognition process and "5 days" is recognized as "when". As a result, the speech recognition result is input not to the form F2 corresponding to the intended date but to the form F1, which corresponds to the title and accepts free input.
  • The user U therefore makes an utterance UO2b that specifies, by form number, the intended form, i.e., the form F2 corresponding to the date, as the destination for the speech recognition result erroneously input to the form F1.
  • the user U may designate a form number displayed on the voice input interface IF in addition to the form name to issue a correction instruction.
  • the information processing server 20 may move the speech recognition result from the form F1 to the form F2 based on the detected speech UO2b of the user U.
  • the information processing server 20 may correct the voice input result in accordance with the input format of the form F2 corresponding to the date. Focusing on the lower right of FIG. 2, it can be seen that “when” that is input to the form F1 is corrected to a date format representing 5 days, and is correctly input to the form F2 intended by the user U. Further, since the date is correctly input to the form F2, the information processing server 20 may automatically input the day of the week corresponding to the form F4 based on the date.
  • According to the information processing server 20 of the present embodiment, it is possible to easily correct a form selection error caused by a character conversion error in the speech recognition process and, further, to correct the character conversion error itself.
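  • As a rough illustration of the automatic day-of-week entry described for FIG. 2, the following Python sketch derives the weekday from a date and fills both forms. The function name `fill_date_and_weekday`, the dictionary keys, and the sample year and month are assumptions made for illustration only; the source specifies neither a data model nor a complete date.

```python
from datetime import date

# Fixed English names so the result does not depend on the system locale.
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")


def fill_date_and_weekday(forms: dict, year: int, month: int, day: int) -> dict:
    """Return a copy of `forms` with the date form and day-of-week form filled.

    Hypothetical helper: the keys "date" and "day_of_week" stand in for the
    forms F2 and F4 of the source.
    """
    d = date(year, month, day)
    updated = dict(forms)
    updated["date"] = d.isoformat()
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    updated["day_of_week"] = WEEKDAYS[d.weekday()]
    return updated


# Assumed example: the 5th of June 2018 (year and month are not in the source).
EXAMPLE = fill_date_and_weekday({"title": "", "date": "", "day_of_week": ""},
                                2018, 6, 5)
```

  • Returning a new dictionary rather than mutating the input keeps the earlier, uncorrected state available, which may be convenient if a later correction has to be undone.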
  • FIG. 3 is a block diagram showing an exemplary configuration of the information processing system according to the present embodiment.
  • the information processing system according to the present embodiment includes an information processing terminal 10 and an information processing server 20. Further, the information processing terminal 10 and the information processing server 20 are connected via the network 30 so as to be able to communicate with each other.
  • the information processing terminal 10 is an information processing apparatus that provides the user with a character input interface having a plurality of forms based on control by the information processing server 20.
  • the information processing terminal 10 according to the present embodiment is realized by, for example, a smartphone, a tablet, a head mounted display, a general-purpose computer, or a dedicated device of a stationary type or an autonomous moving type.
  • the information processing server 20 is an information processing apparatus that controls input / output related to a character input interface including a plurality of forms.
  • the information processing server 20 according to the present embodiment may control the display of the character input interface and the character input to the form.
  • the information processing server 20 is characterized in that it realizes a character input interface which allows the user to easily correct the error of the form as described with reference to FIGS. 1 and 2.
  • the network 30 has a function of connecting the information processing terminal 10 and the information processing server 20.
  • the network 30 may include the Internet, a public network such as a telephone network, a satellite communication network, various LANs (Local Area Networks) including Ethernet (registered trademark), a WAN (Wide Area Network), and the like.
  • the network 30 may include a leased line network such as an Internet Protocol-Virtual Private Network (IP-VPN).
  • the network 30 may also include a wireless communication network such as Wi-Fi (registered trademark) or Bluetooth (registered trademark).
  • the configuration example of the information processing system according to the present embodiment has been described above.
  • the configuration described above with reference to FIG. 3 is merely an example, and the configuration of the information processing system according to the present embodiment is not limited to such an example.
  • the functions of the information processing terminal 10 and the information processing server 20 according to the present embodiment may be realized by a single device.
  • the configuration of the information processing system according to the present embodiment can be flexibly modified according to specifications and operation.
  • FIG. 4 is a block diagram showing an example of a functional configuration of the information processing terminal 10 according to the present embodiment.
  • the information processing terminal 10 according to the present embodiment includes a display unit 110, an audio output unit 120, a voice input unit 130, an imaging unit 140, a sensor unit 150, a control unit 160, and a server communication unit 170.
  • the display unit 110 has a function of outputting visual information such as an image or text.
  • the display unit 110 according to the present embodiment displays a character input interface based on control by the information processing server 20, for example.
  • the display unit 110 includes a display device or the like that presents visual information.
  • Examples of the display device include a liquid crystal display (LCD) device, an organic light emitting diode (OLED) device, and a touch panel.
  • the display unit 110 according to the present embodiment may output visual information by a projection function.
  • the audio output unit 120 has a function of outputting various sounds including voice.
  • the audio output unit 120 according to the present embodiment includes an audio output device such as a speaker or an amplifier.
  • the voice input unit 130 has a function of collecting sound information such as an utterance of a user and an ambient sound generated around the information processing terminal 10.
  • the voice input unit 130 according to the present embodiment includes a plurality of microphones for collecting sound information.
  • the imaging unit 140 has a function of capturing an image of the user or the surrounding environment.
  • the image information captured by the imaging unit 140 may be used for detection of the line of sight of the user by the information processing server 20 or the like.
  • the imaging unit 140 according to the present embodiment includes an imaging device capable of capturing an image. Note that the above image includes moving images as well as still images.
  • the sensor unit 150 has a function of collecting various sensor information related to the surrounding environment and the user.
  • the sensor information collected by the sensor unit 150 may be used, for example, for gesture recognition by the information processing server 20.
  • the sensor unit 150 includes, for example, an infrared sensor, an acceleration sensor, a gyro sensor, and the like.
  • The control unit 160 according to the present embodiment has a function of controlling each component of the information processing terminal 10.
  • the control unit 160 controls, for example, start and stop of each component. Further, the control unit 160 inputs a control signal generated by the information processing server 20 to the display unit 110 or the audio output unit 120.
  • the control unit 160 according to the present embodiment may have the same function as the input / output control unit 220 of the information processing server 20 described later.
  • the server communication unit 170 has a function of performing information communication with the information processing server 20 via the network 30. Specifically, the server communication unit 170 transmits, to the information processing server 20, the sound information collected by the voice input unit 130, the image information captured by the imaging unit 140, and the sensor information collected by the sensor unit 150. The server communication unit 170 also receives, from the information processing server 20, a control signal and the like relating to the output of the character input interface.
  • the example of the functional configuration of the information processing terminal 10 according to the present embodiment has been described above.
  • the above configuration described using FIG. 4 is merely an example, and the functional configuration of the information processing terminal 10 according to the present embodiment is not limited to such an example.
  • the information processing terminal 10 according to the present embodiment may not necessarily include all of the configurations shown in FIG. 4.
  • the information processing terminal 10 can be configured not to include the imaging unit 140, the sensor unit 150, and the like.
  • the control unit 160 according to the present embodiment may have the same function as the input / output control unit 220 of the information processing server 20.
  • the functional configuration of the information processing terminal 10 according to the present embodiment can be flexibly modified according to specifications and operation.
  • FIG. 5 is a block diagram showing an example of a functional configuration of the information processing server 20 according to the present embodiment.
  • the information processing server 20 according to the present embodiment includes a recognition unit 210, an input / output control unit 220, and a terminal communication unit 230.
  • the recognition unit 210 executes voice recognition processing based on the user's uttered voice collected by the information processing terminal 10. Further, the recognition unit 210 may execute gaze detection based on an image captured by the information processing terminal 10, gesture recognition based on an image or sensor information, and the like.
  • the input / output control unit 220 totally controls input / output processing related to the character input interface.
  • For example, the input / output control unit 220 performs character input on a form of the character input interface based on the user's input operation.
  • The input / output control unit 220 may select a first target form to be an input target from a plurality of forms based on an input operation such as a user's utterance, and input characters to the first target form. That is, the input / output control unit 220 according to the present embodiment can automatically select a form for character input based on the result of speech recognition for the user's speech.
  • The input / output control unit 220 also has a function of selecting a second target form different from the first target form based on user feedback on the input content entered in the first target form, and inputting characters to the second target form. More specifically, the input / output control unit 220 may select the form specified by the feedback as the second target form and input, to the second target form, characters corresponding to at least a part of the input content entered in the first target form.
  • the above user's feedback may be an instruction to correct a form error. That is, when the automatically selected form is incorrect, the input / output control unit 220 according to the present embodiment can perform a correction process so that the voice recognition result is input to the form designated by the user. According to the above-described function of the input / output control unit 220 according to the present embodiment, it is possible to easily correct a form error due to various factors without requiring a complicated operation.
  • The terminal communication unit 230 performs information communication with the information processing terminal 10 via the network 30. Specifically, the terminal communication unit 230 receives sound information, image information, sensor information, and the like from the information processing terminal 10. The terminal communication unit 230 also transmits the control signal generated by the input / output control unit 220 to the information processing terminal 10.
  • the functional configuration example of the information processing server 20 according to an embodiment of the present disclosure has been described.
  • the above configuration described using FIG. 5 is merely an example, and the functional configuration of the information processing server 20 according to the present embodiment is not limited to such an example.
  • the configuration shown above may be realized in a distributed manner across a plurality of devices.
  • the functions of the information processing terminal 10 and the information processing server 20 may be realized by a single device.
  • the functional configuration of the information processing server 20 according to the present embodiment can be flexibly modified according to specifications and operation.
  • The input / output control unit 220 has a function of selecting the first target form to be input from the plurality of forms based on the input operation of the user, and performing character input on the first target form.
  • the input / output control unit 220 may select the first target form based on, for example, the speech recognition result for the input operation performed by speech, and may input the speech recognition result to the first target form.
  • the input / output control unit 220 can select the first target form based on the speech recognition result for the speech and the domain set in each form. Also, as described above, when moving the speech recognition result to the second target form designated by the user's feedback, the input / output control unit 220 may input, to the second target form, a speech recognition result corrected based on the domain set in the second target form.
  • FIG. 6 is an example of the Nbest result of the speech recognition process according to the present embodiment.
  • The recognition unit 210 according to the present embodiment may, for example, generate a plurality of character string candidates based on the user's utterance and output the character string candidate having the highest reliability among them as the final speech recognition result. Here, the Nbest result is the set of character string candidates ranked first to nth in reliability.
  • each character string candidate is associated with a domain indicating an attribute of the character string.
  • For example, the character string candidate "Tuesday" is associated with the domain "day of the week" because the character string is one of the days of the week, and the character string candidate "Kyanobi" is associated with the domain "title", for which free input is permitted.
  • Based on the domain "title" associated with "Kyanobi", which was output as the speech recognition result, the input / output control unit 220 selects the form in which the domain "title" is set as the first target form and inputs the speech recognition result to it.
  • The input / output control unit 220 can also control recalculation of the reliability of speech recognition based on the domain set in the form specified by the user, that is, the second target form, and input the corrected speech recognition result to the second target form.
  • Here, because the user has specified by feedback a form in which the domain "day of the week" is set, the input / output control unit 220 causes the recognition unit 210 to recalculate the reliability based on the domain "day of the week".
  • the right side of FIG. 6 shows the Nbest result re-obtained by the recalculation of the reliability.
  • It can be seen that the recognition unit 210 has ranked the character string candidates associated with the domain "day of the week" higher, so that the reliability of the character string candidate "Tuesday" is now the highest. At this time, the recognition unit 210 outputs "Tuesday", which has the highest reliability, as the speech recognition result.
  • In this way, the input / output control unit 220 can obtain a corrected speech recognition result by causing the recognition unit 210 to recalculate the reliability based on the domain set in the second target form specified by the user, and can thereby realize input in line with the user's intention. According to this function of the input / output control unit 220, form errors and speech recognition errors can be corrected easily without requiring complicated operations.
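  • The selection and correction flow described for FIG. 6 can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not say how the reliability is recalculated, so the additive `boost` for candidates in the requested domain is a stand-in, and the names `Candidate`, `select_first_target`, `recalculate_for_domain`, and `correct_to_second_target`, as well as all reliability values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Candidate:
    text: str          # recognized character string
    domain: str        # attribute of the string, e.g. "title"
    reliability: float


def select_first_target(nbest: List[Candidate],
                        forms: Dict[str, str]) -> Tuple[str, str]:
    """Choose the form whose domain matches the most reliable candidate."""
    best = max(nbest, key=lambda c: c.reliability)
    form_id = next(fid for fid, dom in forms.items() if dom == best.domain)
    return form_id, best.text


def recalculate_for_domain(nbest: List[Candidate], domain: str,
                           boost: float = 1.0) -> List[Candidate]:
    """Re-rank the N-best list, favoring candidates in `domain`.

    Assumption: an additive boost stands in for the unspecified recalculation.
    """
    rescored = [
        Candidate(c.text, c.domain,
                  c.reliability + (boost if c.domain == domain else 0.0))
        for c in nbest
    ]
    return sorted(rescored, key=lambda c: c.reliability, reverse=True)


def correct_to_second_target(nbest: List[Candidate], forms: Dict[str, str],
                             second_form: str) -> Tuple[str, str]:
    """Apply user feedback naming `second_form` and return the corrected input."""
    reranked = recalculate_for_domain(nbest, forms[second_form])
    return second_form, reranked[0].text


# Form layout follows FIG. 1; reliability values are made up for illustration.
FORMS = {"F1": "title", "F2": "date", "F3": "time", "F4": "day of the week"}
NBEST = [
    Candidate("Kyanobi", "title", 0.6),
    Candidate("Tuesday", "day of the week", 0.3),
]
```

  • With these sample values, the first pass places "Kyanobi" in F1, and feedback naming F4 yields "Tuesday" for F4. If no candidate belongs to the requested domain, this sketch simply returns the original top candidate; a real system would need a policy for that case.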
  • FIG. 7 is a diagram for describing a correction process in which a unit block is designated according to the present embodiment.
  • The user U makes an utterance UO7a instructing registration of a schedule for "English from 18 o'clock on Tuesday".
  • Here, "Tuesday" is misrecognized as "Kyanobi" and is input, together with the correctly recognized character string "English", to the form F1, which corresponds to the title and allows free input.
  • the speech recognition result according to the present embodiment may include a plurality of unit blocks.
  • The unit block refers to, for example, a character string segmented in units such as words, phrases, or clauses; in the above example, "Kyanobi" and "English" each correspond to a unit block.
  • the input / output control unit 220 may display information on unit blocks included in the input content together with the input content input to the form.
  • the input / output control unit 220 displays the unit block relating to "Kyanobi” as "A” and the unit block relating to "English” as "B".
  • To input the character string "Kyanobi", which corresponds to the unit block A and was incorrectly input to the form F1, into the form F4 corresponding to the day of the week, the user U makes an utterance UO7b designating the unit block A and the form number 2.
  • Based on the detected utterance UO7b of the user U, the input / output control unit 220 can delete the character string "Kyanobi" corresponding to the unit block A from the form F1 and input the character string "Tuesday", corrected by recalculation of the reliability, to the form F4.
  • the input / output control unit 220 can also correct the input content based on the connection probability between unit blocks.
  • FIG. 8 is a diagram for describing a correction process based on the connection probability between unit blocks according to the present embodiment.
  • To input the character string "when", which corresponds to the unit block A and was incorrectly input to the form F1, into the form F2 corresponding to the date, the user U makes an utterance UO8b specifying the form number 2.
  • At this time, the input / output control unit 220 recalculates not only the reliability of the unit block A but also the connection probability with the unit block B located before or after the unit block A.
  • Since the recognition unit 210 outputs results that include the probabilities of the connection relationships between unit blocks, when the character string corresponding to a certain unit block (first unit block) is corrected based on a domain, the connection relationship with the second unit blocks located before and after the first unit block may also be recalculated.
  • As a result of recalculating the connection probability, the unit block B is corrected to "3 pm", which has a high probability of connecting with "5 days".
  • The input / output control unit 220 then corrects "3 pm" to match the form format based on the domain associated with the corrected character string "3 pm", and inputs it to the form F3.
  • a more effective correction can be realized by considering the connection probability of unit blocks. Even if errors in the previous and subsequent unit blocks are not corrected by one process, it is possible to correct the errors in all the unit blocks by repeating the above process.
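  • One way to picture the connection-probability step of FIG. 8 is the Python sketch below. It is an assumption-laden illustration rather than the disclosed method: the scoring rule (reliability multiplied by connection probability), the function name `rescore_neighbor`, the placeholder string for the original content of unit block B, and every numeric value are hypothetical, since the source gives none of them.

```python
from typing import Dict, List, Tuple


def rescore_neighbor(corrected_a: str,
                     b_candidates: List[Tuple[str, float]],
                     connection_prob: Dict[Tuple[str, str], float]) -> str:
    """Pick the candidate for unit block B that best connects to corrected A.

    Each candidate is (text, reliability). The score is the product of the
    candidate's own reliability and its connection probability with the
    corrected string of unit block A (an assumed scoring rule).
    """
    def score(item: Tuple[str, float]) -> float:
        text, reliability = item
        # Pairs missing from the table are treated as never connecting.
        return reliability * connection_prob.get((corrected_a, text), 0.0)

    return max(b_candidates, key=score)[0]


# Unit block A has been corrected from "when" to "5 days" via the date domain.
# "<B as first recognized>" is a placeholder; the source does not state it.
B_CANDIDATES = [("<B as first recognized>", 0.6), ("3 pm", 0.4)]
CONNECTION_PROB = {
    ("5 days", "<B as first recognized>"): 0.1,
    ("5 days", "3 pm"): 0.8,
}

CORRECTED_B = rescore_neighbor("5 days", B_CANDIDATES, CONNECTION_PROB)
```

  • Repeating the same call for each neighboring block, using the newly corrected strings as the reference, corresponds to the repeated processing mentioned above for cases where one pass does not fix every unit block.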
  • FIG. 9 is a diagram for describing correction processing when a form not intended by the user is selected by setting of semantic analysis according to the present embodiment.
  • The user U makes an utterance UO9a instructing registration of a schedule for “pick up from 15 o'clock”.
  • Here, the user U wants to input the entire character string “pick up from 15 o'clock” in the form F1.
  • However, the input / output control unit 220 has input “15 o'clock” to the form F3 and only “pick up” to the form F1.
  • Thus, depending on the semantic analysis setting, a character string may be input to a form that does not conform to the user's intention.
  • In this case, the user U may make an utterance UO9b for moving “15 o'clock”, which was input to the unintended form F3, to the form F1.
  • The user U can designate an arbitrary form by the form number or the form name.
  • The input / output control unit 220 deletes “15 o'clock” from the form F3 based on the recognized utterance UO9b of the user U, and adds it to the form F1.
  • The input / output control unit 220 may also correct the character string in accordance with the input format of the second target form, which is the corrected input destination, before inputting it.
  • FIG. 10 is a diagram for describing a correction process when a form not intended by the user is selected due to the unset domain, according to the present embodiment.
  • The user U makes an utterance UO10a instructing registration of a schedule for "20th (Hatuka)".
  • Here, the user U wants to input "20th" in the form F2 corresponding to the date.
  • However, the input / output control unit 220 has input "20th" to the form F1, which allows free input. Thus, even when there is no error in the speech recognition result, a character string may be input to a form that does not conform to the user's intention if the domain intended by the user is not set for the recognized character string.
  • In this case, the user U may make an utterance UO10b for newly associating the domain set in the form F2 with "20th", which was input to the unintended form F1.
  • Based on the feedback of the user U given by the utterance UO10b, the input / output control unit 220 can newly associate the designated character string with the domain "date" set in the designated form F2. Further, the input / output control unit 220 may delete "20th" from the form F1 based on the utterance UO10b and input it in accordance with the format of the form F2.
  • According to this function of the input / output control unit 220, the domain intended by the user can be newly associated with a character string based on the user's instruction, so that subsequent input can reflect the user's intention.
  • FIGS. 11A and 11B are diagrams for explaining the addition of a domain to a specific expression according to the present embodiment.
  • The user U makes an utterance UO11a instructing registration of a schedule for “the day of π”.
  • Here, the user U expresses “March 14” as “the day of π”, following the convention associated with pi.
  • The input / output control unit 220 inputs “the day of π” to the form F1.
  • As in this case, a character string may be input to a form that does not conform to the user's intention.
  • The specific expression here broadly includes aliases, abbreviations, and the like.
  • The specific expression according to the present embodiment may be an expression used only within a specific group, for example, within a household, in addition to an expression used generally.
  • The user U may make an utterance UO11b for moving “the day of π”, which was input to the unintended form F1, to the form F2.
  • At this time, the input / output control unit 220 may convert “the day of π” to “March 14” and input it to the form F2. Further, the input / output control unit 220 may perform control to newly associate “the day of π” with “March 14” and the domain “date”.
  • The input / output control unit 220 may also, as shown in the upper part of the figure, cause the information processing terminal 10 to output a voice SO11 inquiring of the user U about the date expression corresponding to “the day of π”.
  • The input / output control unit 220 can then newly associate “the day of π” with “March 14” and the domain “date”.
  • The input / output control unit 220 may also display the character string in the form F2 while keeping the expression “the day of π”, in order to better reflect the intention of the user U. In this case, since “the day of π” and “March 14” are associated internally, the scheduler function and the like can be executed without any problem.
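  • The association between a specific expression, a domain, and a concrete value described for FIGS. 11A and 11B can be sketched as a small lookup table in Python. The class `ExpressionRegistry`, the function `input_expression`, and the dictionary layout are hypothetical names introduced for illustration; the source does not describe how the association is stored.

```python
from typing import Dict, Optional, Tuple


class ExpressionRegistry:
    """Maps a specific expression (alias, abbreviation) to (domain, value)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, str]] = {}

    def register(self, expression: str, domain: str, value: str) -> None:
        self._entries[expression] = (domain, value)

    def resolve(self, expression: str) -> Optional[Tuple[str, str]]:
        return self._entries.get(expression)


def input_expression(registry: ExpressionRegistry, forms: Dict[str, dict],
                     expression: str, target_domain: str,
                     keep_expression: bool = True) -> Optional[dict]:
    """Fill the form for `target_domain`, or return None if nothing is known.

    None signals that the system would have to ask the user, as with the
    inquiry voice SO11 in the source.
    """
    entry = registry.resolve(expression)
    if entry is None or entry[0] != target_domain:
        return None
    _, value = entry
    # The displayed text may keep the user's wording while the internal value
    # holds the concrete date used by functions such as a scheduler.
    forms[target_domain] = {
        "display": expression if keep_expression else value,
        "value": value,
    }
    return forms[target_domain]


REGISTRY = ExpressionRegistry()
FORMS_STATE: Dict[str, dict] = {}
BEFORE = input_expression(REGISTRY, FORMS_STATE, "the day of π", "date")
REGISTRY.register("the day of π", "date", "March 14")
AFTER = input_expression(REGISTRY, FORMS_STATE, "the day of π", "date")
```

  • Keeping the displayed text separate from the stored value mirrors the option of showing the user's own wording in the form while the concrete date remains available internally.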
  • the input / output control unit 220 can flexibly control the input / output related to the input interface IF based on the reliability related to speech recognition.
  • FIG. 12 is a diagram for describing input / output control in the case where the degree of reliability related to speech recognition is low.
  • Because the reliability of speech recognition for the utterance UO12a made by the user U falls below a threshold, the input/output control unit 220 does not input the speech recognition result to any form. Instead, it causes the information processing terminal 10 to output a voice SO12 asking the user U to specify the form to which the input should be made.
  • the input / output control unit 220 can request the user to explicitly designate a form for inputting the speech recognition result.
  • The input/output control unit 220 controls recalculation of the reliability based on the utterance UO12b, and the corrected speech recognition result can be input to the form F4.
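  • The low-reliability handling above might look like the following sketch. The function names, the dictionary layout, and the prompt wording are all assumptions made for illustration; the recalculation of reliability based on the designated form's domain is not modelled here.

```python
def decide_action(text, reliability, threshold):
    """Hold back a low-reliability result and ask the user for a form instead."""
    if reliability < threshold:
        return {
            "action": "ask_form",
            "prompt": "Which form should this be entered in?",  # hypothetical wording
            "pending": text,
        }
    return {"action": "input", "text": text}


def resolve_pending(decision, form_name):
    """Place the held-back text once the user has named a form explicitly."""
    if decision["action"] != "ask_form":
        raise ValueError("nothing is pending")
    return {form_name: decision["pending"]}


low = decide_action("Tuesday", reliability=0.35, threshold=0.5)
high = decide_action("Tuesday", reliability=0.80, threshold=0.5)
placed = resolve_pending(low, "F4")
```

  • The point of the design is that nothing is written to any form until the user has designated one, which avoids showing a doubtful result in a misleading place.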
  • FIGS. 13 and 14 are diagrams for describing input/output control in the case where the reliabilities of character string candidates compete with one another.
  • The recognition unit 210 according to the present embodiment generates a plurality of character string candidates based on the user's utterance and can output the character string candidate with the highest reliability as the final recognition result.
  • In some cases, the reliabilities of a plurality of character string candidates compete with one another.
  • The input/output control unit 220 may input each of the competing character strings to a form when, in the Nbest result, the difference in reliability between the first and n-th ranked candidates falls below a threshold Td.
  • the input / output control unit 220 may obtain the difference after normalizing the reliability.
  • The input/output control unit 220 selects a plurality of second target forms, namely the forms F1 and F4, based on the domains corresponding to the character string “Kayobi” and the character string “Tuesday”, whose reliabilities compete, and inputs the character string "Kyanobi" and the character string “Tuesday” to them, respectively.
  • The input/output control unit 220 causes the information processing terminal 10 to output a voice SO13 for confirming which form holds the correct input result and obtains feedback from the user U, thereby realizing the input intended by the user U.
  • The input/output control unit 220 may likewise select a plurality of second target forms, namely the forms F1 and F2, based on the domains corresponding to the character string “todaya” and the character string “today”, whose reliabilities compete, and may input the character string "Today's" and the character string “Today” to them, respectively.
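  • The competing-reliability check can be sketched as below. This is only an illustration under assumptions: normalization is done by dividing by the sum of the scores (the text says only that the reliability may be normalized), the scores are invented, and `competing_candidates` and `route_candidates` are hypothetical names.

```python
def competing_candidates(nbest, td, normalize=True):
    """Return the candidates whose reliability gap from the top one is below td.

    nbest is a list of (text, reliability) pairs in any order.
    """
    if not nbest:
        return []
    ranked = sorted(nbest, key=lambda item: item[1], reverse=True)
    scores = [score for _, score in ranked]
    if normalize:
        total = sum(scores)
        if total > 0:
            scores = [score / total for score in scores]
    top = scores[0]
    return [text for (text, _), score in zip(ranked, scores) if top - score < td]


def route_candidates(candidates, domain_of, forms):
    """Map each candidate to the form whose domain matches it.

    A candidate without a known domain goes to the free-input form (domain None).
    If two candidates share a domain, the later one replaces the earlier one.
    """
    placement = {}
    for text in candidates:
        domain = domain_of.get(text)
        form = next(name for name, dom in forms.items() if dom == domain)
        placement[form] = text
    return placement


nbest = [("Kayobi", 0.48), ("Tuesday", 0.46), ("other", 0.06)]  # invented scores
close = competing_candidates(nbest, td=0.05)
placement = route_candidates(
    close,
    domain_of={"Tuesday": "day of week"},
    forms={"F1": None, "F4": "day of week"},
)
```

  • With these invented scores, the first two candidates are treated as competing and are shown in the forms F1 and F4 at the same time, so that the user can confirm which one is correct.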
  • FIG. 15 is a diagram showing a correction example involving separation of unit blocks according to the present embodiment.
  • The user U makes an utterance UO15a instructing registration of a schedule for “5:00 pm on the 5th”.
  • the user U may make an utterance UO 15b specifying a plurality of forms for inputting the unit block A corresponding to "someday afternoon sharing".
  • the input / output control unit 220 can cause the recognition unit 210 to execute the calculation of Nbest relating to the unit block A again based on the utterance UO 15 b.
  • The input/output control unit 220 can separate the character strings “5 days” and “3 pm” included in the unit block A based on the recalculated Nbest result and input them to the forms F2 and F3, respectively.
  • The input/output control unit 220 may display, on the input interface IF, a statement T1 prompting correction of the error.
  • If the user U makes utterances divided per form, such as "A date someday A”, “A date someday 2", “someday date”, or "Shinjiha Time”, more efficient correction can be expected.
  • the user U can also specify a correction other than the specification of the form, such as "not being a shanghai, sanji”.
  • A statement T2 asking the user U to specify the form to be input may be output on the input interface IF, without inputting the speech recognition result to any form.
  • When the user U makes an utterance UO16b specifying the forms F1 to F3, the input/output control unit 220 first fits character strings to the forms F2 and F3, which are the forms other than the form F1 allowing free input, and corrects them. It then inputs the remaining character string to the form F1, thereby presenting the input intended by the user U.
  • The input/output control unit 220 may output, on the input interface IF, a statement T3 requesting the user U to input the title again, without inputting the speech recognition result to any form.
  • The input/output control unit 220 inputs the character string "pick up" to the form F1, so that the input intended by the user U can be presented.
  • The user U makes an utterance UO18a instructing registration of a hotel schedule.
  • the utterance UO 18 a includes two character strings “Tomorrow” and “10 days” associated with the domain “Date”.
  • It is difficult for the input/output control unit 220 to determine which of the character strings “Tomorrow” and “10 days” is to be input to the form F2.
  • The input/output control unit 220 may designate the form F1, which allows free input and for which no domain is set, and display on the input interface IF a statement T4 requesting the user to utter again the content to be input to the title.
  • By designating the form F1, which permits free input, and urging the user to speak again, the input/output control unit 220 can guide the user U toward an utterance with less variation.
  • Once the title has been uttered again, both domains can be corrected properly.
  • the lower part of FIG. 18 shows the result of the correction made by the input / output control unit 220 based on the utterance UO 18 b for designating the title, which the user U has made.
  • The input/output control unit 220 inputs "Reserve 10 days hotel", uttered in the utterance UO18b, to the form F1, inputs the date of the remaining "Tomorrow" to the form F2, and inputs “Friday”, the day of the week of “Tomorrow”, to the form F4.
  • The input/output control unit 220 may input the speech recognition result for the re-utterance to the form F1. In this case as well, a character string corresponding to the domain "date” that is not included in the re-utterance can be input to the form F2.
  • The user U makes an utterance UO19b that names the intended form, that is, the form F4 corresponding to the day of the week, in order to move the speech recognition result erroneously input to the form F1.
  • "Wednesday" has already been input to the form F4 designated by the user by the utterance UO 19a.
  • The input/output control unit 220 determines whether the character string already input and the character string newly instructed to be input are compatible, and performs control based on that determination. In the example shown in FIG. 19, the form F4 by its nature cannot hold both “Wednesday” and “Tuesday”, so the input/output control unit 220 may overwrite the existing “Wednesday” with the newly instructed “Tuesday”. In the case of a form that allows free input, such as the form F1, the input/output control unit 220 may instead append the newly instructed character string while keeping the character string already input. In this way, the input/output control unit 220 according to the present embodiment can realize appropriate correction control based on the nature of the form.
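  • A minimal sketch of this overwrite-or-append decision follows. The function name `merge_into_form` and the use of a single space as the separator are assumptions for illustration only.

```python
def merge_into_form(current, new, free_input):
    """Combine an existing form value with a newly instructed one.

    free_input=True models a form such as the title form F1, which can hold
    both strings; free_input=False models a form such as the day-of-week
    form F4, which can hold only one value.
    """
    if not current:
        return new
    if free_input:
        # Assumption: appended strings are joined with a single space.
        return f"{current} {new}"
    return new


day_of_week = merge_into_form("Wednesday", "Tuesday", free_input=False)
title = merge_into_form("Pick up", "laundry", free_input=True)
```

  • Here the day-of-week value is replaced, while the free-input title keeps its earlier content and gains the new string.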
  • the functions of the input / output control unit 220 according to the present embodiment have been described above in detail with specific examples. Although the case where Japanese is used as the type of the character string to be input has been described above, the function possessed by the input / output control unit 220 according to the present embodiment is applicable regardless of the type of language.
  • FIG. 20 is a diagram for describing a correction process when English is used as the type of character string.
  • The user U makes an utterance UO20a instructing registration of a schedule for “Tuesday”.
  • The input/output control unit 220 has input “Choose way” to the form F1, which corresponds to the title and allows free input.
  • The input/output control unit 220 displays the two unit blocks “Choose” and “way” included in “Choose way” as unit blocks A and B, respectively.
  • Based on the utterance UO20b, the input/output control unit 220 deletes the unit blocks A and B from the form F1 and can input “Tuesday”, corrected based on the domain of the designated form F2, to the form F2.
  • FIG. 21 is a flowchart showing the flow of the operation of the information processing server 20 according to the present embodiment.
  • the terminal communication unit 230 receives the speech information of the user collected by the information processing terminal 10 (S1101).
  • the recognition unit 210 executes speech recognition processing based on the speech information received in step S1101 (S1102). At this time, the recognition unit 210 may perform the calculation of the reliability, the acquisition of the Nbest result, the calculation of the connection probability between unit blocks, and the like.
  • the input / output control unit 220 determines whether or not the difference in reliability between the 1st and nth places in the Nbest result is smaller than a threshold (S1103).
  • When the difference is smaller than the threshold, the input/output control unit 220 determines whether to perform the input/output control for competing reliabilities (S1104).
  • When the difference is not smaller than the threshold, or when that control is not to be performed, the recognition unit 210 outputs the character string candidate with the highest reliability as the speech recognition result, and the input/output control unit 220 executes input/output control of the form (S1105).
  • When the input/output control unit 220 performs the input/output control for competing reliabilities (S1104: Yes), it executes the form input/output control for competing reliabilities as shown in FIG. 13 and FIG. 14 (S1106).
  • The input/output control unit 220 causes the recognition unit 210 to calculate the reliability again based on the above feedback (S1108), and performs form input/output control based on the recalculated reliability (S1109).
  • the above-mentioned feedback may be performed not only by voice but also by sight line, gesture, operation of an input device, or the like.
  • The description so far has covered the case where the input/output control unit 220 controls a voice input interface provided with a plurality of forms.
  • the application scope of the technical idea according to the present disclosure is not limited to the voice input interface. Therefore, in the second embodiment, a case will be described where the input / output control unit 220 controls character string input to a form placed on a Web page.
  • FIG. 22 is a diagram for describing an automatic input for a form placed on a web page.
  • FIG. 22 shows a web page WP having a plurality of forms corresponding to a name, a birthday, a telephone number, a zip code and the like.
  • The user can enter information in each of the placed forms using, for example, a keyboard, but as the number of forms increases, the load associated with the input operation increases, and input errors and the like are also expected to occur.
  • FIG. 23A and FIG. 23B are diagrams showing examples of input errors by the automatic input tool.
  • An input error occurs in which information that should be input separately to the two forms corresponding to the zip code is forced into one of the forms.
  • Such an input error may occur, for example, when the postal code is managed as information corresponding to one form in the automatic input tool.
  • FIG. 23B shows an example in which information is input in a language different from the language assumed by the form.
  • Japanese first name and last name are input in the form corresponding to First name and Last name, which should be originally input in English.
  • Such an input error may occur, for example, when only information written in Japanese is stored in the automatic input tool.
  • The technical idea according to the present embodiment was conceived with a focus on the above points, and makes it possible to perform easy correction without complicated operations even if information is input to an incorrect form by automatic input. Further, according to the information processing server 20 according to the present embodiment, it is possible to realize information input with fewer errors.
  • FIG. 24 is a diagram for describing automatic input control by the input / output control unit 220 according to the present embodiment.
  • FIG. 24 shows a Web page WP having a plurality of forms corresponding to full name (Kanji), full name (Kana), birthday, phone number, zip code and the like.
  • The input/output control unit 220 selects, from the plurality of forms, a plurality of first target forms for information input based on the user's input operation, and can automatically input set character strings to the selected first target forms.
  • the input / output control unit 220 may use, for example, an utterance of the user, an operation using an input device such as a mouse, a touch, or the like as a trigger of the automatic input. Further, the input / output control unit 220 may execute automatic input for a plurality of forms using information input by the user and information set in the form set FS designated in advance.
  • FIG. 24 shows an example of the form set FS according to the present embodiment.
  • the form set FS according to the present embodiment is an information set in which information to be automatically input to a plurality of forms is summarized for each user and application.
  • The form set FS defines, grouped by user, a last name (Kanji), first name (Kanji), last name (Kana), first name (Kana), date of birth, telephone number, and zip code.
  • the form set FS may be automatically generated by the input / output control unit 220 based on past input results, or may be generated and edited by the user.
  • the input / output control unit 220 may present the form set FS as visual information to the user.
  • the input / output control unit 220 may assign an ID to the name of the form set FS or each character string included in the form set FS.
  • By designating the name “Toshi” or the ID “1” corresponding to the name “Toshi”, the user can select the form set FS used for automatic input and have the plurality of forms filled in automatically using that form set.
  • the input / output control unit 220 executes automatic input for a plurality of forms using the form set FS corresponding to the form set name “Toshishi” designated by the user.
  • the input / output control unit 220 may obtain the form set FS set by default and perform automatic input.
  • One feature of the input/output control unit 220 is that, when the above-described automatic input is performed, it assigns an ID to each form arranged in the web page WP.
  • The input/output control unit 220 assigns IDs “1” to “12” to each form and displays the forms on the web page WP.
  • The IDs given to each form and to each piece of information included in the form set FS allow the user to correct an input mistake more easily when one occurs.
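  • The form set and the two kinds of IDs can be sketched as follows. The field names, the helper names `assign_ids` and `auto_fill`, and the use of a plain dictionary for the form set are assumptions for illustration; only the numeric form IDs and the letter identifiers for form-set strings come from the description.

```python
from string import ascii_uppercase


def assign_ids(page_fields, form_set):
    """Give numeric IDs to the forms on the page and letter IDs to form-set strings.

    This sketch supports at most 26 form-set entries (A to Z).
    """
    form_ids = {str(i): name for i, name in enumerate(page_fields, start=1)}
    value_ids = {ascii_uppercase[i]: value for i, value in enumerate(form_set.values())}
    return form_ids, value_ids


def auto_fill(form_ids, form_set):
    """Fill each form with the form-set string registered under the same field name."""
    return {form_id: form_set.get(name, "") for form_id, name in form_ids.items()}


page_fields = ["last name", "first name", "zip code"]
form_set = {"last name": "Ueda", "first name": "Toshi", "zip code": "111-2222"}
form_ids, value_ids = assign_ids(page_fields, form_set)
filled = auto_fill(form_ids, form_set)
```

  • Because both the forms and the stored strings carry short identifiers, a later correction can refer to them directly, for example by saying which form ID should receive which form-set string.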
  • FIGS. 25 to 27 are diagrams for explaining the correction of input information according to the present embodiment.
  • Each figure shows a situation after the input/output control unit 220 has performed automatic input to the forms placed on the web page WP.
  • FIG. 25 shows, as in the case shown in FIG. 23A, an example in which “last name” and “first name”, and likewise “sei” and “mei”, have been input in reverse.
  • the user U can issue an instruction to correct the automatic input result using the ID assigned to each form or the identifier assigned to each character string included in the form set FS.
  • The user U provides feedback relating to a correction instruction by making an utterance UO25a with the content “1 and 2 are reversed” and an utterance UO25b with the content “A to 1”.
  • Based on the utterance UO25a, the input/output control unit 220 can, for example, swap the information input to the form “last name” corresponding to the ID “1” and the form “first name” corresponding to the ID “2”. Based on the utterance UO25b, the input/output control unit 220 can also, for example, overwrite the form “surname” corresponding to the ID “1” with the character string “Ueda” corresponding to the ID “A” included in the form set FS, and move the character string "Koshishi" entered in the form "surname” to the form "first name".
  • When instructed to swap the character strings input in the form "last name” and the form "first name", the input/output control unit 220 can swap the input character strings automatically. Also, for example, when the user U utters "1 to 3" or the like, the input/output control unit 220 can convert the character string "Toshishi” entered in the form "surname” into the kana notation required by the form "Mei” and then input it to the form "Mei”.
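  • The ID-based corrections above can be sketched as a small command interpreter. This is an assumption-laden illustration: the regular expressions cover only the two English phrasings quoted in the text, the function name `apply_correction` is hypothetical, and neither the kana conversion nor the automatic relocation of a displaced string is modelled.

```python
import re


def apply_correction(filled, value_ids, utterance):
    """Return a corrected copy of the filled forms for one correction utterance.

    filled maps form IDs to their current strings; value_ids maps form-set
    identifiers such as "A" to the strings stored in the form set.
    """
    result = dict(filled)
    swap = re.fullmatch(r"(\w+) and (\w+) are reversed", utterance)
    if swap:
        first, second = swap.groups()
        result[first], result[second] = result[second], result[first]
        return result
    move = re.fullmatch(r"(\w+) to (\w+)", utterance)
    if move:
        source, target = move.groups()
        if source in value_ids:
            # Source is a form-set identifier: overwrite the target form.
            result[target] = value_ids[source]
        else:
            # Source is a form ID: move its string to the target form.
            result[target], result[source] = result[source], ""
        return result
    raise ValueError(f"unsupported correction: {utterance!r}")


filled = {"1": "Toshi", "2": "Ueda", "3": ""}
value_ids = {"A": "Ueda", "B": "Toshi"}
swapped = apply_correction(filled, value_ids, "1 and 2 are reversed")
overwritten = apply_correction(filled, value_ids, "A to 1")
moved = apply_correction(filled, value_ids, "1 to 3")
```

  • Returning a copy rather than editing in place keeps the original automatic-input result available, which is convenient if the user wants to undo a correction.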
  • the user U can give an instruction to correct the automatic input result using the ID given to each form or the identifier given to each character string included in the form set FS.
  • The user U provides feedback relating to a correction instruction by making an utterance UO26a with the content “11 to 11 and 12” and an utterance UO26b with the content “G to 11 and 12”.
  • Based on, for example, the utterance UO26a or the utterance UO26b, the input/output control unit 220 refers to the character string “111-2222” input to the form given the ID “11”, divides the character string based on a delimiter included in it, the attributes of the forms, general knowledge, and the like, and can input the resulting character strings to the forms given the ID "11” and the ID "12".
  • the input / output control unit 220 may cause the information processing terminal 10 to perform an output requesting the user to specify the break position.
  • The input/output control unit 220 acquires the break position based on, for example, the user speaking "3 digits and 4 digits", and can also correct the content of the character string held in the form set FS.
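  • The splitting step can be sketched as below, covering both a delimiter found in the string and a break position given by the user as digit counts. The function name `split_for_forms` and the choice of hyphen and whitespace as delimiters are assumptions for illustration.

```python
import re


def split_for_forms(text, widths=None):
    """Split one string into the pieces intended for several forms.

    Without widths, the string is split on hyphens or whitespace.
    With widths (for example [3, 4] for "3 digits and 4 digits"), the digits
    of the string are cut at the requested positions.
    """
    if widths is None:
        return [part for part in re.split(r"[-\s]+", text) if part]
    digits = re.sub(r"\D", "", text)
    if sum(widths) != len(digits):
        raise ValueError("widths do not match the number of digits")
    pieces, position = [], 0
    for width in widths:
        pieces.append(digits[position:position + width])
        position += width
    return pieces


by_delimiter = split_for_forms("111-2222")
by_widths = split_for_forms("1112222", widths=[3, 4])
```

  • The two pieces would then be input to the forms with the IDs “11” and “12”, in order.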
  • FIG. 27 shows an example of the case where a Japanese-written character string is input to a form that should normally be input in English.
  • the user U can issue an instruction to correct the automatic input result using the ID assigned to each form or the identifier assigned to each character string included in the form set FS.
  • the user U may also issue a correction instruction using the name or ID of the form set FS.
  • The user U provides feedback relating to a correction instruction by making an utterance UO27a with the content "form set in English” and an utterance UO27b with the content "A and B in English".
  • The input/output control unit 220 may execute automatic input again after switching the form set FS based on, for example, the utterance UO27a or the utterance UO27b. If a correction instruction relating to swapping between forms was given before the instruction to switch the form set FS, the input/output control unit 220 may keep the content of that correction instruction in effect even after the form set FS is switched.
  • a plurality of form sets FS according to the present embodiment can be set according to the user, the language, the location, the application, and the like, and can be switched according to the situation.
  • The functions of the input/output control unit 220 according to the present embodiment have been described above in detail. These functions make it possible to realize automatic input to forms with fewer input errors and to correct the input content easily even when an input error occurs.
  • Although the case where the input/output control unit 220 performs automatic input to forms arranged in a Web page has been described as an example, the input/output control unit 220 is not limited to this example and is broadly applicable to automatic input to forms.
  • FIG. 28 is a block diagram illustrating an exemplary hardware configuration of the information processing terminal 10 and the information processing server 20 according to an embodiment of the present disclosure.
  • The information processing terminal 10 and the information processing server 20 include, for example, a processor 871, a ROM 872, a RAM 873, a host bus 874, a bridge 875, an external bus 876, an interface 877, an input device 878, an output device 879, a storage 880, a drive 881, a connection port 882, and a communication device 883.
  • the hardware configuration shown here is an example, and some of the components may be omitted. In addition, components other than the components shown here may be further included.
  • The processor 871 functions as, for example, an arithmetic processing unit or a control unit, and controls all or part of the operation of each component based on various programs recorded in the ROM 872, the RAM 873, the storage 880, or the removable recording medium 901.
  • the ROM 872 is a means for storing a program read by the processor 871, data used for an operation, and the like.
  • the RAM 873 temporarily or permanently stores, for example, a program read by the processor 871 and various parameters and the like that appropriately change when the program is executed.
  • the processor 871, the ROM 872, and the RAM 873 are connected to one another via, for example, a host bus 874 capable of high-speed data transmission.
  • host bus 874 is connected to external bus 876, which has a relatively low data transmission speed, via bridge 875, for example.
  • the external bus 876 is also connected to various components via an interface 877.
  • For the input device 878, for example, a mouse, a keyboard, a touch panel, a button, a switch, a lever, and the like are used. Furthermore, as the input device 878, a remote controller (hereinafter, remote control) capable of transmitting a control signal using infrared rays or other radio waves may be used.
  • the input device 878 also includes a voice input device such as a microphone.
  • The output device 879 is a device capable of visual or audible notification, for example a display device such as a CRT (Cathode Ray Tube), an LCD, or an organic EL display, an audio output device such as a speaker or headphones, a printer, a mobile phone, or a facsimile. The output device 879 according to the present disclosure also includes various vibration devices capable of outputting haptic stimulation.
  • the storage 880 is a device for storing various data.
  • a magnetic storage device such as a hard disk drive (HDD), a semiconductor storage device, an optical storage device, a magneto-optical storage device, or the like is used.
  • the drive 881 is a device that reads information recorded on a removable recording medium 901 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, or writes information on the removable recording medium 901, for example.
  • the removable recording medium 901 is, for example, DVD media, Blu-ray (registered trademark) media, HD DVD media, various semiconductor storage media, and the like.
  • the removable recording medium 901 may be, for example, an IC card equipped with a non-contact IC chip, an electronic device, or the like.
  • The connection port 882 is a port for connecting an externally connected device 902, for example a USB (Universal Serial Bus) port, an IEEE 1394 port, a SCSI (Small Computer System Interface) port, an RS-232C port, or an optical audio terminal.
  • the external connection device 902 is, for example, a printer, a portable music player, a digital camera, a digital video camera, an IC recorder, or the like.
  • the communication device 883 is a communication device for connecting to a network.
  • It is, for example, a communication card for wired or wireless LAN, Bluetooth (registered trademark), or WUSB (Wireless USB), a router for optical communication, a router for ADSL (Asymmetric Digital Subscriber Line), or a modem for various kinds of communication.
  • The information processing server 20 has an input/output control unit 220 that selects a first target form to be input from a plurality of forms based on the input operation of the user and inputs characters to the first target form.
  • One feature of the input/output control unit 220 according to an embodiment of the present disclosure is that it selects a second target form different from the first target form based on user feedback on the input content input to the first target form and performs the character input on the second target form. With this configuration, a selection error of the form to be input can be corrected easily.
  • each step concerning processing of information processing server 20 of this specification does not necessarily need to be processed in chronological order according to the order described in the flowchart.
  • the steps related to the processing of the information processing server 20 may be processed in an order different from the order described in the flowchart or may be processed in parallel.
  • (1) An information processing device including a control unit that selects a first target form to be input from a plurality of forms based on a user's input operation and performs character input on the first target form, wherein the control unit selects a second target form different from the first target form based on the feedback of the user on the input content input to the first target form and performs the character input on the second target form.
  • (2) The information processing apparatus according to (1), wherein the control unit selects the form specified by the feedback as the second target form and inputs, to the second target form, characters corresponding to at least a part of the input content input to the first target form.
  • (3) The control unit causes unit blocks included in the input content input to the first target form to be displayed together with the input content, deletes from the first target form the characters corresponding to the unit block specified by the feedback, and inputs the characters corresponding to that unit block to the second target form.
  • (4) The control unit separates a character string included in the unit block based on the feedback and inputs the separated character strings to the second target form.
  • (5) The information processing apparatus according to any one of (1) to (4), wherein at least one of the input operation and the feedback is performed by speech.
  • (6) The information processing apparatus according to any one of (1) to (5), wherein the control unit selects the first target form based on the result of speech recognition for the input operation performed by speech and inputs the result of the speech recognition to the first target form.
  • (7) The control unit selects the first target form based on the speech recognition result and a domain set in the form.
  • (8) The information processing apparatus according to (6) or (7), wherein the control unit inputs, to the second target form, the speech recognition result corrected based on a domain set in the selected second target form.
  • (9) The control unit controls recalculation of the reliability related to the speech recognition result based on the domain set in the selected second target form and inputs the corrected speech recognition result to the second target form.
  • (10) The information processing apparatus according to any one of (6) to (9), wherein the control unit causes unit blocks included in the speech recognition result input to the first target form to be displayed together with the speech recognition result, and causes the connection probability between a first unit block designated by the feedback and a second unit block located before or after the first unit block to be recalculated based on the domain set in the form designated by the feedback.
  • (11) The information processing apparatus according to (10), wherein the control unit inputs a character string corresponding to the second unit block corrected by the recalculation of the connection probability into the form in which a domain associated with the character string is set.
  • (12) The control unit newly associates a domain with at least a part of the speech recognition result based on the feedback.
  • (13) The control unit newly associates a character string designated by the feedback with a domain set in the form designated by the feedback.
  • (14) The information processing apparatus according to any one of (6) to (13), wherein, when the reliability of the speech recognition result is lower than a threshold, the control unit does not select the first target form and requests the user to provide feedback specifying the form to which the speech recognition result is to be input.
  • (15) The information processing apparatus according to any one of (6) to (14), wherein, when the reliabilities of character string candidates related to the speech recognition result compete, the control unit selects a plurality of second target forms based on the domains corresponding to the character string candidates with competing reliabilities and inputs those character string candidates to the plurality of second target forms, respectively.
  • (16) The information processing apparatus according to any one of (6) to (15), wherein the control unit designates the form in which no domain is set and requests the user to utter the input content for the designated form.
  • (17) The information processing apparatus according to any one of (1) to (16), wherein the control unit selects a plurality of the first target forms based on the input operation and performs automatic input of set character strings.
  • (18) The control unit presents to the user a form set that defines the character strings to be automatically input to the plurality of first target forms and executes the automatic input based on the designated form set.
  • (19) The information processing apparatus according to (18), wherein the control unit adds an identifier to at least one of the character strings included in the form set and the forms, and corrects the result of the automatic input based on the identifier included in the feedback.
  • (20) An information processing method including: selecting, by a processor, a first target form to be input from a plurality of forms based on a user's input operation and performing character input on the first target form; and selecting a second target form different from the first target form based on the user's feedback on the input content input to the first target form and performing the character input on the second target form.
  • information processing terminal, 110 display unit, 120 voice output unit, 130 voice input unit, 140 imaging unit, 150 sensor unit, 160 control unit, 170 server communication unit, 20 information processing server, 210 recognition unit, 220 input/output control unit, 230 terminal communication unit
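
The reliability-based behavior recited in the claims above (asking the user when reliability is below a threshold, and filling several forms when candidates compete) can be sketched as follows. This is only an illustrative sketch, not the applicant's implementation: the names `Candidate` and `plan_input`, the `threshold` and `margin` values, and the rule that "competing" means "within `margin` of the best score" are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One character string candidate from speech recognition (hypothetical type)."""
    text: str
    domain: str
    reliability: float


def plan_input(candidates, forms_by_domain, threshold=0.5, margin=0.05):
    """Decide whether to fill forms or ask the user for feedback.

    Returns ("ask_user", []) or ("fill", [(form, text), ...]).
    threshold and margin are assumed values, not taken from the patent.
    """
    if not candidates:
        return ("ask_user", [])
    ranked = sorted(candidates, key=lambda c: c.reliability, reverse=True)
    best = ranked[0]
    # Low reliability: do not pick a target form, request feedback instead.
    if best.reliability < threshold:
        return ("ask_user", [])
    # Candidates scoring within `margin` of the best are treated as competing.
    competing = [c for c in ranked if best.reliability - c.reliability <= margin]
    # Each competing candidate goes to the form whose domain matches it.
    fills = [
        (forms_by_domain[c.domain], c.text)
        for c in competing
        if c.domain in forms_by_domain
    ]
    if not fills:
        return ("ask_user", [])
    return ("fill", fills)


forms = {"destination": "to_form", "departure": "from_form", "date": "date_form"}
result = plan_input(
    [
        Candidate("Tokyo", "destination", 0.62),
        Candidate("Kyoto", "departure", 0.60),
        Candidate("today", "date", 0.30),
    ],
    forms,
)
```

Here the two close candidates are both entered, each into the form matching its domain, while the low-scoring third candidate is ignored. A single candidate scoring below `threshold` yields `("ask_user", [])`.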

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The problem addressed by the present invention is to make it easy to correct an erroneous selection of the form targeted for input. The solution according to the invention provides an information processing device comprising a control unit that selects, from a plurality of forms, a first target form for input based on a user's input operation, and that inputs characters into the first target form. The control unit selects a second target form different from the first target form based on the user's feedback on the content input into the first target form, and inputs characters into the second target form.
PCT/JP2018/038725 2018-01-22 2018-10-17 Dispositif de traitement d'informations et procédé de traitement d'informations Ceased WO2019142419A1 (fr)
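
The correction flow summarized in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical `FormFiller` class that is not part of the publication. It also assumes that moving the text to the second target form clears the first one, which the abstract does not state.

```python
class FormFiller:
    """Hypothetical holder for a set of named input forms."""

    def __init__(self, form_names):
        self.values = {name: "" for name in form_names}
        self.last_target = None  # form most recently written to
        self.last_text = None    # text most recently written

    def input_text(self, target, text):
        """Input characters into the first target form."""
        self.values[target] = text
        self.last_target, self.last_text = target, text

    def apply_feedback(self, corrected_target):
        """Move the last input to a second target form named by the user.

        Returns True if the text was moved, False if nothing changed.
        """
        if self.last_target is None or corrected_target not in self.values:
            return False
        if corrected_target == self.last_target:
            return False
        # Assumption: the wrongly filled form is emptied when the text moves.
        self.values[self.last_target] = ""
        self.values[corrected_target] = self.last_text
        self.last_target = corrected_target
        return True


filler = FormFiller(["from", "to"])
filler.input_text("from", "Osaka")   # first target form chosen in error
moved = filler.apply_feedback("to")  # user feedback names the intended form
```

After the feedback, `filler.values` holds the text in the `to` form only. A real system would derive `corrected_target` from the user's spoken or gestural feedback rather than receive it directly.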

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2018-008156 2018-01-22
JP2018008156 2018-01-22

Publications (1)

Publication Number Publication Date
WO2019142419A1 (fr) 2019-07-25

Family

ID=67302083

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2018/038725 Ceased WO2019142419A1 (fr) 2018-01-22 2018-10-17 Dispositif de traitement d'informations et procédé de traitement d'informations

Country Status (1)

Country Link
WO (1) WO2019142419A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2023007960A (ja) * 2021-07-02 2023-01-19 株式会社アドバンスト・メディア 情報処理装置、情報処理システム、情報処理方法及びプログラム

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH02126300A (ja) * 1988-11-04 1990-05-15 Nippon Telegr & Teleph Corp <Ntt> 音声修正方式
JP2000207166A (ja) * 1999-01-19 2000-07-28 Nec Corp 音声入力装置及び音声入力方法
JP2001306293A (ja) * 2000-04-20 2001-11-02 Canon Inc 情報入力方法、情報入力装置及び記憶媒体
WO2002031643A1 (fr) * 2000-10-11 2002-04-18 Canon Kabushiki Kaisha Dispositif de traitement d'information, procede de traitement d'information et support de stockage
JP2004222169A (ja) * 2003-01-17 2004-08-05 Daikin Ind Ltd 情報処理装置および方法、並びにプログラム
JP2015516587A (ja) * 2012-03-08 2015-06-11 フェイスブック,インク. 対話から情報を抽出するデバイス

Similar Documents

Publication Publication Date Title
US11594211B2 (en) Methods and systems for correcting transcribed audio files
AU2016211903B2 (en) Updating language understanding classifier models for a digital personal assistant based on crowd-sourcing
AU2015375326B2 (en) Headless task completion within digital personal assistants
US11615788B2 (en) Method for executing function based on voice and electronic device supporting the same
JP2008203559A (ja) 対話装置及び方法
CN101978390A (zh) 服务启动技术
CN1763842B (zh) 用于语音识别中的动词错误恢复的方法和系统
JP2016061954A (ja) 対話装置、方法およびプログラム
KR102076793B1 (ko) 음성을 통한 전자문서 제공 방법, 음성을 통한 전자문서 작성 방법 및 장치
JP6596373B2 (ja) 表示処理装置及び表示処理プログラム
JP6828741B2 (ja) 情報処理装置
JP7575804B2 (ja) 音声認識プログラム、音声認識方法、音声認識装置および音声認識システム
WO2019142419A1 (fr) Dispositif de traitement d'informations et procédé de traitement d'informations
JP5892598B2 (ja) 音声文字変換作業支援装置、音声文字変換システム、音声文字変換作業支援方法及びプログラム
JP3878147B2 (ja) 端末装置
JP6756211B2 (ja) 通信端末、音声変換方法、及びプログラム
WO2019017027A1 (fr) Dispositif et procédé de traitement d'informations
JP2008145769A (ja) 対話シナリオ生成システム,その方法およびプログラム
JP4589843B2 (ja) 対話方法、対話装置、対話プログラムおよび記録媒体
JP5184071B2 (ja) 書き起こしテキスト作成支援装置、書き起こしテキスト作成支援プログラム、及び書き起こしテキスト作成支援方法
JP7489232B2 (ja) 情報処理システム、情報処理方法、及び情報処理プログラム
KR20220043753A (ko) 음성을 텍스트로 변환한 음성 기록에서 유사 발음의 단어를 포함하여 검색하는 방법, 시스템, 및 컴퓨터 판독가능한 기록 매체
WO2019142447A1 (fr) Dispositif de traitement d'informations et procédé de traitement d'informations
JP2008243048A (ja) 対話装置、対話方法及びプログラム
JP2009037433A (ja) ナンバーボイスブラウザ、およびナンバーボイスブラウザの制御方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18901170; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 18901170; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)