WO2019000871A1 - Method, apparatus, and server for providing a voice service - Google Patents
Method, apparatus, and server for providing a voice service
- Publication number
- WO2019000871A1 (application PCT/CN2017/118008)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- voice service
- device end
- target voice
- message
- request
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/51—Discovery or management thereof, e.g. service location protocol [SLP] or web services
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/53—Network services using third party service providers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/08—Network architectures or network communication protocols for network security for authentication of entities
- H04L63/0876—Network architectures or network communication protocols for network security for authentication of entities based on the identity of the terminal or configuration, e.g. MAC address, hardware or software configuration or device fingerprint
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/10—Network architectures or network communication protocols for network security for controlling access to devices or network resources
- H04L63/102—Entity profiles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/60—Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/08—Protocols for interworking; Protocol conversion
- H04L69/085—Protocols for interworking; Protocol conversion specially adapted for interworking of IP-based networks with other networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/30—Definitions, standards or architectural aspects of layered protocol stacks
- H04L69/32—Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
- H04L69/322—Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
- H04L69/326—Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the transport layer [OSI layer 4]
Definitions
- The present application relates to the field of computer technologies, in particular to the field of artificial intelligence, and more specifically to a method, an apparatus, and a server for providing a voice service.
- embodiments of the present application provide a method, apparatus, and server for providing a voice service.
- An embodiment of the present application provides a method for providing a voice service, including: receiving a request message for providing a target voice service to a device end that has accessed a third-party voice service, where the request message includes request content and status information of the device end; obtaining a response message generated by processing the request content based on the status information of the device end, where the response message includes an operation instruction; and sending the response message to the device end. The request message and the response message are generated according to the message format configured in the constructed data service framework model of the target voice service and are transmitted based on the transport protocol configured in that model. The configured message format is consistent with the message format of the third-party voice service, and the configured transport protocol is consistent with the transport protocol of the third-party voice service.
- The method further includes: in response to a request, sent by the user, to register the device end with the target voice service, providing the user with the configuration information of the target voice service to be replaced, so that the user replaces the corresponding configuration items in the configuration file of the device end. The configuration information to be replaced includes the user identifier, the user password, and the path address of the access token.
- The method further includes: receiving an access request for the device end to access the target voice service, where the access request includes a user identifier, a user password, and an identifier of the device end; and issuing, based on the access request, an access token of the target voice service to the device end, so that the device end obtains the issued access token through the path address of the access token.
- Issuing the access token of the target voice service to the device end based on the access request includes: looking up, according to the user identifier and the user password, the device identifiers that have obtained the user's authorization; determining whether the identifier of the device end in the access request matches a device identifier that has obtained the user's authorization; and if so, issuing the access token of the target voice service to the device end.
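The token-issuing check described above can be sketched in Python as follows. This is only an illustrative sketch: the in-memory dictionaries `USERS`, `AUTHORIZED_DEVICES`, and `ISSUED_TOKENS`, the function name `issue_access_token`, and the sample identifiers are assumptions made for the example. The application does not specify how credentials or authorizations are stored.

```python
import hmac
import secrets

# Hypothetical in-memory stores (assumptions for illustration only).
# A real service would keep hashed passwords in a database.
USERS = {"alice": "s3cret"}                      # user identifier -> user password
AUTHORIZED_DEVICES = {"alice": {"speaker-001"}}  # user identifier -> authorized device identifiers
ISSUED_TOKENS = {}                               # device identifier -> issued access token


def issue_access_token(user_id, password, device_id):
    """Return an access token if the device is authorized by the user, else None."""
    stored = USERS.get(user_id)
    # Verify the user identifier and password (constant-time comparison).
    if stored is None or not hmac.compare_digest(stored.encode(), password.encode()):
        return None
    # Look up the device identifiers that have obtained this user's authorization
    # and check whether the requesting device is among them.
    if device_id not in AUTHORIZED_DEVICES.get(user_id, set()):
        return None
    # Issue the token; the device would later fetch it via the token path address.
    token = secrets.token_urlsafe(32)
    ISSUED_TOKENS[device_id] = token
    return token
```

A request from an unknown user, with a wrong password, or from a device that the user has not authorized returns `None` instead of a token.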
- Receiving a request message for providing a target voice service to a device end that has accessed the third-party voice service includes: receiving the request message sent by a device end that has obtained the access token of the target voice service and has accessed the third-party voice service.
- The response message is generated as follows: parsing the request message to obtain the request content and the status information of the device end; generating a corresponding operation instruction based on the status information of the device end and the request content; and encapsulating the operation instruction according to the message format and transport protocol configured in the data service framework model of the target voice service to generate the response message.
- the status information of the device includes: a capability statement of the device end, context information of the device end, and event information of the device end.
- Generating the corresponding operation instruction based on the status information of the device end and the request content includes: determining the callable operation interfaces of the device end based on the capability declaration, the context information, and the event information of the device end; determining, among the callable operation interfaces, the target operation interface corresponding to the request content; and determining the voice service content according to the request content and generating an operation instruction for invoking the target operation interface to output the voice service content.
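One way to picture the three steps above (callable interfaces, target interface, operation instruction) is the Python sketch below. The interface names, the two blocking rules, the `INTENT_TO_INTERFACE` mapping, and the shape of the returned instruction are all hypothetical and chosen only for illustration. The application does not define them.

```python
# Hypothetical mapping from the intent of the request content to the
# operation interface that can serve it (an assumption for this sketch).
INTENT_TO_INTERFACE = {
    "play_music": "audio_player",
    "set_alarm": "alarm_clock",
    "chat": "voice_output",
}


def callable_interfaces(capabilities, context, events):
    """Step 1: derive the interfaces that can currently be called on the device end."""
    available = set(capabilities)          # start from the capability declaration
    # Assumed rule: while the alarm is ringing, the audio player is not callable.
    if "alarm_ringing" in events:
        available.discard("audio_player")
    # Assumed rule: while voice input is being received, voice output is not callable.
    if context.get("receiving_voice_input"):
        available.discard("voice_output")
    return available


def build_operation_instruction(intent, content, capabilities, context, events):
    """Steps 2 and 3: pick the target interface and wrap the voice service content."""
    available = callable_interfaces(capabilities, context, events)
    target = INTENT_TO_INTERFACE.get(intent)
    if target is None or target not in available:
        return None                        # no callable interface matches the request
    return {"interface": target, "action": "output", "content": content}
```

With a device that declares `audio_player`, a `play_music` request yields an instruction targeting that interface, unless an `alarm_ringing` event makes it unavailable under the assumed rule.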
- Obtaining the response message generated after processing the request content based on the status information of the device end includes: detecting whether the request content includes a voice interaction requirement; and, in response to detecting that it does, dividing the voice service data by a preset time length or a preset message length to generate a plurality of response message fragments. Sending the response message to the device end then includes: sending the response message fragments to the device end in order of their generation time.
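The fragmenting step can be illustrated with a short Python sketch that splits voice service data by a preset message length and sends the fragments in generation order. The `ResponseFragment` class, the function names, and the `send` callback are assumptions for the example. Splitting by a preset time length would work the same way once the duration is converted to a byte count for the audio encoding in use.

```python
import time
from dataclasses import dataclass


@dataclass
class ResponseFragment:
    sequence: int      # position of the fragment in the original data
    created_at: float  # generation time of the fragment
    payload: bytes     # slice of the voice service data


def split_voice_data(data: bytes, max_len: int):
    """Divide voice service data into fragments of at most max_len bytes."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    fragments = []
    for seq, start in enumerate(range(0, len(data), max_len)):
        fragments.append(
            ResponseFragment(seq, time.monotonic(), data[start:start + max_len])
        )
    return fragments


def send_in_generation_order(fragments, send):
    """Send fragments ordered by generation time; sequence breaks clock ties."""
    for fragment in sorted(fragments, key=lambda f: (f.created_at, f.sequence)):
        send(fragment.payload)
```

Sorting on the pair `(created_at, sequence)` keeps the order stable even when two fragments are generated within the same clock tick.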
- The method further includes: constructing a data service framework model of the target voice service, where the data service framework model includes a transport protocol layer, a message format layer, and a device-end capability layer. Constructing the data service framework model includes: constructing the transport protocol layer, including configuring the transport protocol used by the target voice service; constructing the message format layer, including configuring the message format of the request message and the response message of the target voice service; and constructing the device-end capability layer, including configuring the logic for parsing the capabilities of the device end from the request message and the response message.
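As a rough illustration of the three-layer model described above, the Python sketch below represents each layer as a small immutable configuration object. All class names, field names, and the example values (`"HTTP/2"`, `"audio_data"`, `"device_state"`, `"capabilities"`) are hypothetical. The application describes the layers only in general terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportProtocolLayer:
    protocol: str                 # transport protocol used by the target voice service


@dataclass(frozen=True)
class MessageFormatLayer:
    request_fields: tuple         # ordered field names of a request message
    response_fields: tuple        # ordered field names of a response message


@dataclass(frozen=True)
class DeviceCapabilityLayer:
    capability_field: str         # message key under which capabilities are declared

    def parse_capabilities(self, message: dict) -> list:
        """Parse the device-end capabilities out of a request or response message."""
        return list(message.get(self.capability_field, []))


@dataclass(frozen=True)
class DataServiceFrameworkModel:
    transport: TransportProtocolLayer
    message_format: MessageFormatLayer
    device_capability: DeviceCapabilityLayer


# Example configuration with assumed values.
EXAMPLE_MODEL = DataServiceFrameworkModel(
    transport=TransportProtocolLayer("HTTP/2"),
    message_format=MessageFormatLayer(
        request_fields=("audio_data", "device_state"),
        response_fields=("operation_instruction",),
    ),
    device_capability=DeviceCapabilityLayer("capabilities"),
)
```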
- An embodiment of the present application provides an apparatus for providing a voice service, including: a first receiving unit, configured to receive a request message for providing a target voice service to a device end that has accessed a third-party voice service, where the request message includes request content and status information of the device end; an obtaining unit, configured to obtain a response message generated by processing the request content based on the status information of the device end, where the response message includes an operation instruction; and a sending unit, configured to send the response message to the device end. The request message and the response message are generated according to the message format configured in the constructed data service framework model of the target voice service and are transmitted based on the transport protocol configured in that model. The configured message format is consistent with the message format of the third-party voice service, and the configured transport protocol is consistent with the transport protocol of the third-party voice service.
- The apparatus further includes: a providing unit, configured to provide the user with the configuration information of the target voice service to be replaced, in response to a request, sent by the user, to register the device end with the target voice service, so that the user replaces the corresponding configuration items in the configuration file of the device end.
- the configuration information to be replaced includes the user ID, the user password, and the path address of the access token.
- The apparatus further includes: a second receiving unit, configured to receive an access request for the device end to access the target voice service, where the access request includes the user identifier, the user password, and the identifier of the device end; and an authorization unit, configured to issue an access token of the target voice service to the device end according to the access request, so that the device end obtains the issued access token through the path address of the access token.
- The authorization unit is further configured to issue the access token of the target voice service to the device end as follows: looking up, according to the user identifier and the user password, the device identifiers that have obtained the user's authorization; determining whether the identifier of the device end in the access request matches a device identifier that has obtained the user's authorization; and if so, issuing the access token of the target voice service to the device end.
- the first receiving unit is further configured to: receive a request message issued by a device end that has obtained an access token of the target voice service and has accessed the third party voice service.
- The response message obtained by the obtaining unit is generated as follows: parsing the request message to obtain the request content and the status information of the device end; generating a corresponding operation instruction based on the status information of the device end and the request content; and encapsulating the operation instruction according to the message format and transport protocol configured in the data service framework model of the target voice service to generate the response message.
- the status information of the device includes: a capability statement of the device end, context information of the device end, and event information of the device end.
- The operation instruction in the response message obtained by the obtaining unit is generated as follows: determining the callable operation interfaces of the device end based on the capability declaration, the context information, and the event information of the device end; determining, among the callable operation interfaces, the target operation interface corresponding to the request content; and determining the voice service content according to the request content and generating an operation instruction for invoking the target operation interface to output the voice service content.
- The obtaining unit is further configured to: detect whether the request content includes a voice interaction requirement; and, in response to detecting that it does, divide the voice service data by a preset time length or a preset message length to generate a plurality of response message fragments. The sending unit is further configured to send the response message fragments to the device end in order of their generation time.
- The apparatus further includes: a building unit, configured to construct a data service framework model of the target voice service, where the data service framework model includes a transport protocol layer, a message format layer, and a device-end capability layer. The building unit is configured to: construct the transport protocol layer, including configuring the transport protocol used by the target voice service; construct the message format layer, including configuring the message format of the request message and the response message of the target voice service; and construct the device-end capability layer, including configuring the logic for parsing the capabilities of the device end from the request message and the response message.
- An embodiment of the present application provides a server, including: one or more processors; and a storage device configured to store one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for providing a voice service described above.
- The method, apparatus, and server provided by the present application first receive a request message that includes request content and device-end status information and that requests a target voice service for a device end that has accessed a third-party voice service; then obtain a response message generated by processing the request content based on the status information of the device end, the response message including an operation instruction; and finally send the response message to the device end. Because the request message and the response message are generated in a message format that is configured in the constructed data service framework model of the target voice service and is consistent with that of the third-party voice service, and are transmitted over a transport protocol that is configured in that model and is consistent with that of the third-party voice service, the device end can reuse the service interaction logic already developed for the third-party voice service to access the target voice service quickly, without developing separate service logic for the target voice service. This helps reduce the development and operation and maintenance costs of products that access different voice services.
- FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
- FIG. 2 is a flow diagram of one embodiment of a method for providing a voice service in accordance with the present application;
- FIG. 3A is a schematic diagram of an application scenario of a method for providing a voice service according to the present application;
- FIG. 3B is a schematic diagram of another application scenario of a method for providing a voice service according to the present application;
- FIG. 4 is a schematic diagram of an application scenario of a method for accessing a voice service in a method for providing a voice service according to the present application;
- FIG. 5 is a schematic diagram of a data service framework model of a target voice service
- FIG. 6 is a schematic structural diagram of an embodiment of an apparatus for providing a voice service according to the present application.
- FIG. 7 is a block diagram showing the structure of a computer system suitable for implementing the server of the embodiment of the present application.
- FIG. 1 illustrates an exemplary system architecture 100 of an embodiment of a method for providing a voice service or a device for providing a voice service to which the present application may be applied.
- system architecture 100 can include terminal 101, devices 102, 103, network 104, and server 105.
- The network 104 provides the medium for communication links between the terminal 101 and the server 105 and between the devices 102, 103 and the server 105.
- Network 104 may include various types of connections, such as wired, wireless communication links, fiber optic cables, and the like.
- User 110 can interact with server 105 over network 104 using terminal 101 to receive or send messages and the like.
- An application that interacts with the server 105 such as a web browser application, a voice service client application, etc., may be installed on the terminal 101.
- The terminal 101 may be any of various electronic devices having a display screen, including, but not limited to, a smartphone, a tablet, a desktop computer, and the like.
- the devices 102, 103 can also interact with the server 105 over the network 104 to receive or send messages and the like.
- the devices 102, 103 may be electronic devices having an audio input interface and an audio output interface, such as a speaker with a microphone.
- the server 105 may be a server that provides various services, such as a voice server that supports web page content displayed on the terminal 101 and controls audio output operations performed by the devices 102, 103.
- The voice server may process the request made by the user 110 through the terminal 101 to perform voice service operations on the devices 102, 103, and send the processing results (e.g., audio data and control commands for the audio output interface) to the devices 102, 103.
- the devices 102, 103 can receive the audio data and control commands sent by the server 105 through the network 104 and perform corresponding operations, thereby enabling the devices 102, 103 to access the voice services provided by the voice server 105.
- the method for providing a voice service provided by the embodiment of the present application is generally performed by the server 105. Accordingly, the device for providing the voice service is generally disposed in the server 105.
- The server may be a server cluster that includes multiple servers on which different processes are deployed.
- the method for providing a voice service includes the following steps:
- Step 201: Receive a request message for providing a target voice service to a device end that has accessed the third-party voice service.
- The electronic device on which the method for providing the voice service runs (i.e., the server of the target voice service, such as the server shown in FIG. 1) may receive the request message, through a wired or wireless connection, from the electronic device with which the user makes the voice service request (e.g., the terminal 101 shown in FIG. 1), or from a device end with which the user performs voice interaction (e.g., the devices 102, 103 shown in FIG. 1).
- the request message includes the requested content and status information of the device end that has accessed the third-party voice service.
- the requested content may include the content of the voice service requested by the user, for example, may include voice data input by the user through the audio input interface.
- the status information of the device may be information indicating the current running status of the device, and may include information about an operation currently being performed by the device, status information of a current interface of the device, and the like.
- the request message may be a request message for providing a target voice service to a device end that has accessed the third-party voice service.
- the target voice service and the third party voice service here can be voice services provided by different servers or server clusters, and both can provide voice services with different characteristics.
- third-party voice services and target voice services can be voice services that support different language types.
- After accessing the third-party voice service, the device end can use the voice service in the language type supported by the third-party voice service (for example, English). If the device end needs services in other language types (such as Chinese), it can access a target voice service that supports those language types.
- The request message is generated according to the message format configured in the constructed data service framework model of the target voice service and is transmitted based on the transport protocol configured in that model. The configured message format is consistent with the message format of the third-party voice service, and the configured transport protocol is consistent with the transport protocol of the third-party voice service. That is, the target voice service builds a data service framework model in advance, and other electronic devices can interact with the server of the target voice service according to that model.
- A message format and a transport protocol are configured in the data service framework model. An electronic device interacting with the server of the target voice service may generate or parse messages according to the configured message format and transmit them based on the configured transport protocol.
- A message generated and transmitted according to the configured message format and transport protocol can be received and successfully parsed by the server of the target voice service, which can then respond according to the parsed content.
- the above transmission protocol may define a connection manner between the target voice server and other electronic devices, and may be a universal transmission protocol.
- the above message format can define what is represented by multiple fields in the message.
- For example, if the third-party voice service uses an HTTP/2-based transport protocol and defines the first field of its message format as audio data and the second field as the device-end state, then the transport protocol in the data service framework model of the target voice service can be configured as HTTP/2, and its message format likewise uses a first field for audio data and a second field for the device-end state.
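The consistency requirement in the HTTP/2 example can be expressed as a simple check, sketched below in Python. The dictionary layout, the field names `audio_data` and `device_state`, and the function `is_consistent` are assumptions for illustration. The application only states that the first field carries audio data and the second carries the device-end state.

```python
# Assumed description of the third-party voice service's protocol and format.
THIRD_PARTY_SERVICE = {
    "transport": "HTTP/2",
    "fields": ("audio_data", "device_state"),   # first field, second field
}

# Target voice service configured to mirror the third-party service.
TARGET_SERVICE = {
    "transport": "HTTP/2",
    "fields": ("audio_data", "device_state"),
}


def is_consistent(target: dict, third_party: dict) -> bool:
    """True if transport protocol and ordered message fields both match."""
    return (
        target["transport"] == third_party["transport"]
        and tuple(target["fields"]) == tuple(third_party["fields"])
    )
```

Field order is compared as well as field names, because the example defines meaning by field position.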
- The request message may be sent by the user through an electronic device that has established a communication connection with the device end. The electronic device may request the status information from the device end in advance, and the device end may send its status information to the electronic device, so that when the target voice service is requested, the electronic device can send the request content together with the status information of the device end to the server of the target voice service.
- The user can enter the service URL of the target voice service in the browser application of the electronic device (for example, a mobile phone), log in to the user account, and then select a desired voice service, such as voice communication, setting an alarm clock, playing music, and the like.
- the user can select the requested voice service in a client application of the target voice service, such as a voice service client installed on the mobile phone.
- The request content may be generated according to the voice service selected by the user; the request message is then generated from the request content and the previously acquired status information of the device end, and is sent to the electronic device on which the method for providing the voice service runs.
- the request message may be sent by a device that has accessed the third-party voice service.
- The device end may generate the request message from its status information and the request content according to the message format configured in the constructed data service framework model of the target voice service, and transmit the request message based on the transport protocol configured in that model.
- Step 202: Acquire a response message generated after processing the requested content based on the state information of the device.
- the request content may be subjected to intent analysis, the intention of the voice service request is determined, and then the operation instruction for the device end is determined.
- the operational instructions herein may include voice service data that matches the intent of the voice service request described above, and commands for controlling the device to invoke the specified interface to output voice service data.
- a response message including the above operation instructions can then be generated.
- An electronic device on which the method for providing a voice service operates can acquire the generated response message.
- If the target voice service is deployed on a server cluster that includes multiple servers maintaining communication connections with each other, and the service that generates the response message and the service that sends the response message to the device end are deployed on different servers in the cluster, the server on which the sending service is deployed may receive the response message from the server that generated it. If the two services are deployed on the same server in the cluster, that server can capture the generated response message and cache it.
- The response message is likewise generated according to the message format configured in the constructed data service framework model of the target voice service and transmitted based on the transport protocol configured in that model. The configured message format is consistent with the message format of the third-party voice service, and the configured transport protocol is consistent with the transport protocol of the third-party voice service.
- A device end that has accessed the third-party voice service therefore does not need to redevelop its voice service interaction logic for the data service framework of the target voice service. It can interact with the server of the target voice service using the interaction logic already developed for the third-party voice service, which significantly reduces development costs.
- The response message may be generated as follows: parsing the request message to obtain the request content and the status information of the device end; generating the corresponding operation instruction based on the status information of the device end and the request content; and finally encapsulating the operation instruction according to the message format and the transport protocol configured in the data service framework model of the target voice service to generate the response message.
- the electronic device on which the method for providing a voice service runs (i.e., the server of the target voice service) may parse the request header and the body of the received request message according to the configured transmission protocol.
- the request header may contain the identifier of the device side provided by the user, and the body text may include the content of the request and other status information of the device side.
- the corresponding request content and the status information of the device end can be extracted according to the configured message format.
- the operation to be performed by the device may be determined according to the state information of the device, the voice service data corresponding to the request content is found, and an operation instruction that includes the operation to be performed and the voice service data is generated. Finally, the operation instruction may be encapsulated using the request header, the request method, and the uniform resource identifier of the configured transmission protocol to generate the response message.
- the status information of the device in the request message may include: a capability declaration of the device, context information of the device, and event information of the device.
- the capability statement of the device side may be a statement of the capability that the device side reports, that is, a statement of the interface that can be called by the device end, including voice input, voice output, speaker control, audio player, alarm clock, settings, and the like.
- the context information of the device may be the current state of the device end reported by the device, or information about the operation the device is currently executing: for example, whether the device is currently playing music, whether voice input is being received, and whether an alarm has been set on the device.
- the event information on the device side may be information about events occurring on the device side, such as whether the alarm clock on the device side is ringing, the device end starts playing music, the device end ends playing music, and the like. These status information on the device side can be reported by the device side and attached to the body of the request message.
- the step of generating the corresponding operation instruction based on the device-side status information and the request content may include: determining the callable operation interfaces of the device end according to the device-side capability declaration, the device-side context information, and the device-side event information; determining a target operation interface corresponding to the request content from the callable operation interfaces; and determining a voice service content according to the request content, and generating an operation instruction for invoking the target operation interface to output the voice service content.
- the callable operation interfaces usable for responding to the request message may be determined according to the capabilities declared by the device end (i.e., the operation interfaces of the device end), the operation currently performed by the device end, and the events that have occurred on the device end. The target operation interface is then determined according to the request content; for example, when the request content is a weather query, the target operation interface is determined to include the voice output interface. Then the voice service content can be determined according to the request content, that is, the voice service data is searched for and generated. For example, when the request content is a weather query, a text describing the current weather can be found through the network and converted into audio data that serves as the voice service data.
- an operation instruction can then be generated from the voice service content and the target operation interface, for example by combining the above audio data with the called voice output interface.
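The instruction-generation steps described above can be sketched as follows. This is only a minimal illustration: the field names, the intent-to-interface mapping, and the rule that a ringing alarm blocks the audio player are all assumptions made for the example and do not come from the application.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceState:
    # Hypothetical field names; the text only names the three categories.
    capabilities: set = field(default_factory=set)  # capability declaration
    context: dict = field(default_factory=dict)     # current device state
    events: list = field(default_factory=list)      # events reported by the device


# Hypothetical mapping from a request intent to the interface it needs.
INTENT_TO_INTERFACE = {
    "query_weather": "voice_output",
    "play_music": "audio_player",
    "set_alarm": "alarm_clock",
}


def callable_interfaces(state):
    """Return the declared interfaces that are usable right now."""
    usable = set(state.capabilities)
    # Assumed rule for illustration: a ringing alarm blocks the audio player.
    if "alarm_ringing" in state.events:
        usable.discard("audio_player")
    return usable


def build_instruction(intent, content, state):
    """Pick the target interface and pair it with the voice service content."""
    target = INTENT_TO_INTERFACE.get(intent)
    if target is None or target not in callable_interfaces(state):
        return None  # no callable interface can serve this request
    return {"interface": target, "payload": content}
```

A weather query on a device that declares a voice output interface would yield an instruction naming that interface, while a request whose interface is not declared or not currently callable yields no instruction.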
- Step 203 Send a response message to the device.
- the electronic device on which the method for providing the voice service is run may send the response message to the device through the network, where the response message includes an operation instruction.
- the device side can receive and parse the response message to obtain the above operation instruction. After that, the device end can call the corresponding interface to output the voice service data according to the operation instruction.
- the device side has developed logic for interacting with the third-party voice service when accessing the third-party voice service, and the device side can use the developed logic to receive the response message transmitted by the third-party voice service-based transmission protocol.
- the message format of the response message is consistent with the message format of the third-party voice service, and the device side can successfully parse the content represented by each field in the response message by using the developed logic, for example, extracting the invoked interface and passing the The data output by the interface, and then the corresponding operation.
- user D turns on the voice service by operating on device A that has accessed the third-party voice service.
- the device A may send a request message to the server B of the target voice service in step 1, requesting that the target voice service be provided for the device A; the server B processes the request in step 2 to obtain the voice service result, and in step 3 sends the voice service result to the device A in the form of a response message.
- the request message sent by the device A to the server B in step 1 and the response message sent by the server B to the device A in step 3 are both generated according to the message format, consistent with the third-party voice service, that is configured in the data service framework model of the constructed target voice service, and are transmitted based on the transport protocol, consistent with the third-party voice service, that is configured in that model.
- the user D can open the operation application (APP) of the device A on the electronic device C that has been connected to the device A, and can input a request in the operation application of the device A.
- the electronic device C may, in response to the request input by the user, generate a request message requesting that a voice service be provided for the device A and send the request message to the server B of the target voice service in step 1; the server B processes the request message in step 2 to obtain the voice service result, and in step 3 sends the voice service result to the device A in the form of a response message.
- the request message sent by the electronic device C to the server B in step 1 and the response message sent by the server B to the device A in step 3 are both generated according to the message format, consistent with the third-party voice service, that is configured in the data service framework model of the constructed target voice service, and are transmitted based on the transport protocol, consistent with the third-party voice service, that is configured in that model.
- the method for providing a voice service provided by the foregoing embodiment of the present application receives a request message, including request content and status information of the device, for providing a target voice service to a device end that has accessed the third-party voice service; then acquires a response message, including an operation instruction, generated by processing the request content based on the status information of the device end; and finally sends the response message to the device end. The request message and the response message are generated according to the message format, consistent with the third-party voice service, that is configured in the data service framework model of the constructed target voice service, and are transmitted based on the transmission protocol, consistent with the third-party voice service, that is configured in that model. The device end can therefore reuse the service interaction and other logic developed for the third-party voice service to quickly access the target voice service, without separately developing service logic for the target voice service, which helps reduce the development and operation and maintenance costs of products accessing different voice services.
- before the server of the target voice service provides the voice service, the user needs to be prompted to change the configuration information associated with the third-party voice service interaction process in order to establish a connection with the target voice service.
- the foregoing method for providing a voice service may further include: in response to obtaining a request by the user to register the device end with the target voice service, providing the user with the to-be-replaced configuration information of the target voice service, so that the user can replace the corresponding configuration items in the configuration file of the device side.
- the configuration information to be replaced includes a user identifier, a user password, and a path address for obtaining an access token.
- the user can log in to the service platform of the target voice service and register, and the server of the target voice service can provide the user identifier, the user password, and the path address of the access token after the user registers.
- the user can use the provided user name and password to replace the user name and password in the configuration file of the device (for example, the configuration file config.json of the javaClient toolkit), that is, in the configuration file on the device side that defines the login mode.
- the user can modify the path address of the device to obtain the access token in the configuration file for defining the access token acquisition mode.
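The configuration replacement described above might look like the following sketch. The key names `clientId`, `clientSecret`, and `tokenUrl`, and the example URLs, are hypothetical; the application does not list the actual keys of config.json.

```python
import json


def replace_config_items(config_text, replacements):
    """Replace existing configuration items and return the updated JSON text.

    Only keys already present in the configuration may be replaced, so that a
    misspelled key is reported instead of being silently added.
    """
    config = json.loads(config_text)
    unknown = set(replacements) - set(config)
    if unknown:
        raise KeyError(f"not in configuration: {sorted(unknown)}")
    config.update(replacements)
    return json.dumps(config, indent=2, sort_keys=True)


# Hypothetical device-side configuration written for the third-party service.
original = json.dumps({
    "clientId": "third-party-id",
    "clientSecret": "third-party-secret",
    "tokenUrl": "https://third-party.example/token",
    "locale": "en-US",
})

updated = replace_config_items(original, {
    "clientId": "target-id",
    "clientSecret": "target-secret",
    "tokenUrl": "https://target.example/token",
})
```

Items unrelated to login and token acquisition (here `locale`) stay as they were, which matches the idea that only a few configuration items change when switching services.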
- the device side needs to acquire an access token of the target voice service, and connects the interface of the target voice service through the access token.
- the foregoing method for providing a voice service may further include: receiving an access request for connecting the device end to the target voice service, and issuing an access token of the target voice service to the device end based on the access request, so that the device obtains the issued access token through the path address for obtaining the access token; the access request includes a user identifier, a user password, and an identifier of the device.
- the user can send a request for accessing the target voice service to the device end in the platform of the target voice service, where the request includes the user identifier and the user password provided by the server of the target voice service when the user registers.
- the user may also add an identifier of the device end in the access request to authorize the device end to request the target voice service on behalf of the user.
- the server of the target voice service can authenticate according to the user ID and the user password, and issue the token to the device.
- the step of issuing the access token of the target voice service to the device according to the access request may include: searching for the device identifier that has been authorized by the user according to the user identifier and the user password; and determining the target voice service access request. Whether the identifier of the device end is consistent with the device identifier that has obtained the authorization of the user; if yes, the access token of the target voice service is issued to the device end.
- the user may provide an authorized device identifier to the server of the target voice service.
- the server may compare the identifier of the device end in the access request with the authorized device identifier. If the two are the same, the access token can be issued to the device.
- the device can obtain the issued access token according to the path of obtaining the access token in the modified configuration file, and use the access token to connect to the interface of the target voice service.
- the step of receiving the request message for providing the target voice service to the device end that has accessed the third-party voice service may include: receiving the above request message sent by a device side that has obtained the access token of the target voice service and has accessed the third-party voice service. That is to say, the server of the target voice service only receives request messages for providing the voice service to device sides that have acquired the access token. This prevents the service from becoming unavailable because the request frequency at the server of the target voice service is too high, and improves the security of the target voice service.
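The token issuance and token-gated request handling described above can be sketched as follows. The in-memory user table, the identifiers, and the function names are hypothetical stand-ins for whatever storage and interfaces a real server would use.

```python
import hmac
import secrets

# Hypothetical in-memory records; a real server would use a user database.
USERS = {
    "client-123": {"password": "s3cret", "authorized_devices": {"device-A"}},
}
ISSUED_TOKENS = {}  # token -> device identifier


def issue_access_token(user_id, password, device_id):
    """Return a new token if the user authenticates and authorized the device."""
    user = USERS.get(user_id)
    if user is None:
        return None
    # Constant-time comparison of the supplied password.
    if not hmac.compare_digest(user["password"].encode(), password.encode()):
        return None
    # The device identifier in the request must match an authorized device.
    if device_id not in user["authorized_devices"]:
        return None
    token = secrets.token_urlsafe(32)
    ISSUED_TOKENS[token] = device_id
    return token


def accept_request(token):
    """Only requests that carry an issued token are processed."""
    return token in ISSUED_TOKENS
```

Rejecting requests without an issued token before any voice processing happens is what keeps unauthenticated traffic from loading the server.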
- FIG. 4 is a schematic diagram of an application scenario of a method for accessing a voice service in a method for providing a voice service according to the present application.
- step 1 the user D uses the terminal device E to enter the service page of the target voice service, which may be a web address of the target voice service platform.
- the terminal device E requests registration from the server B of the target voice service according to the registration operation of the user.
- the server B returns the user name (client_id), the password (client_secrect), and the token acquisition path to the terminal device E in step 3.
- the terminal device E displays the user name, password, and token acquisition path to the user D.
- the user needs to use the account name, password, and token acquisition path provided by the server B to change the corresponding configuration.
- the device A needs to access the target voice service provided by the server B.
- the user uses the terminal device E to perform the login operation in step 5, inputting the user identifier, the user password, and the identifier of the device A.
- in step 6, the terminal device E sends the login information (the user ID, the user password, and the identifier of the device A) to the server B.
- the server B verifies whether the identifier of the device A is consistent with the device ID authorized by the user 3 based on the user identifier and the user password. After the verification is passed, the token is issued to the device A in step 8.
- the device A accesses the voice service provided by the server B.
- the user is provided with the configuration information to be replaced at registration, which prompts the user to replace the corresponding configuration items on the device end; the user can then obtain the access token through a simple configuration item replacement and a login authorization operation. This provides a reliable voice service while lowering the technical threshold for accessing the target voice service, effectively reduces the development workload of accessing the target voice service, and facilitates the provision of diverse voice services with high efficiency and low cost.
- the request content may include audio stream data input by a user.
- the electronic device on which the method for providing a voice service runs may perform voice activity detection (VAD) to detect gaps in the audio stream data input by the user, that is, pauses in the user's speech, and divide the audio stream data input by the user into a plurality of segments according to the detected pauses.
- the user thus inputs multiple segments of audio stream data, and the server of the voice service can correspondingly return a response message in multiple segments, where each response message segment can correspond to one audio stream data segment.
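The pause-based segmentation can be illustrated with a deliberately simple amplitude threshold. This is not the VAD used by the application, which is not specified; the threshold and the minimum pause length are assumed parameters, and production VAD is considerably more involved.

```python
def split_on_pauses(samples, threshold=0.1, min_pause=3):
    """Split amplitudes into segments separated by quiet runs.

    A pause is a run of at least `min_pause` consecutive samples whose
    absolute value is below `threshold`. Quiet samples are dropped from
    the returned segments.
    """
    segments, current, quiet = [], [], 0
    for sample in samples:
        if abs(sample) < threshold:
            quiet += 1
            # Close the current segment once the quiet run is long enough.
            if quiet == min_pause and current:
                segments.append(current)
                current = []
        else:
            quiet = 0
            current.append(sample)
    if current:
        segments.append(current)
    return segments
```

With the default parameters, a single quiet sample between two loud ones does not split the stream, while three quiet samples in a row do.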
- the method may first detect whether the requested content includes a voice interaction requirement. For example, when the content of the request is an alarm clock, the content of the request does not include a voice interaction requirement, and when the content of the request is a question-based conversation, the content of the request includes a voice interaction requirement.
- the voice service data may be divided by a preset time length or a preset message length to generate a plurality of response message segments. The preset length of time and the preset message length can be pre-configured.
- the voice service server can segment the result of the voice service.
- the server of the voice service may send a response message to the device side in a data stream manner, that is, the response message segment may be sequentially sent to the device according to the generation time of the response message segment. In this way, the problem that the request message processing time is too long and the voice service has poor real-time performance can be avoided.
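Dividing the voice service data by a preset message length and sending the segments in generation order can be sketched as below. The segment fields `seq`, `final`, and `body` are hypothetical; the application does not define the segment layout.

```python
def split_by_length(data, max_len):
    """Yield response message segments of at most `max_len` bytes, in order."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    for seq, start in enumerate(range(0, len(data), max_len)):
        yield {
            "seq": seq,                            # position in generation order
            "final": start + max_len >= len(data), # True on the last segment
            "body": data[start:start + max_len],
        }


def stream_segments(data, max_len, send):
    """Send each segment as soon as it is produced instead of waiting for all."""
    count = 0
    for segment in split_by_length(data, max_len):
        send(segment)
        count += 1
    return count
```

Because the generator produces one segment at a time, the first segment can reach the device before the rest of the data has been split, which is the point of sending the response as a data stream.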
- the above method for providing a voice service may further comprise the step of constructing a data service framework model of the target voice service.
- FIG. 5 shows a schematic structure of a data service framework model of the target voice service of the present embodiment.
- the data service framework model of the target voice service includes a transport layer 501, a message format layer 502, and a device end capability layer 503.
- the transport layer 501 is located at the bottom layer and is used to define a transport protocol between the device and the server. It can be defined as a transport protocol consistent with the third-party voice service.
- the message format layer 502 can be used to define the format of the request message and the response message, such as defining the content represented by the various fields in the body of the request message.
- the device end capability layer 503 is located at the top level and defines an operational interface of the callable device side, that is, defines various capabilities of the device side, such as voice output capability, volume control capability, and the like.
- the step of constructing the data service framework model of the target voice service may include: constructing a transport protocol layer, including configuring a transport protocol used by the target voice service; constructing a message format layer, including configuring a request message of the target voice service, and The message format of the response message; and the construction of the device-side capability layer, including the logic to configure the ability to resolve the device side from the request message and the response message.
- the target voice service may perform message transmission according to the transport protocol and message format configured in the framework, and may parse the request message and the response message based on the configured logic for parsing the capability of the device from the request message and the response message.
- the server of the target voice service can obtain the request content of the device end and the callable interface information of the device end, and then can respond according to the request content and the callable interface information of the device end, and generate a response operation instruction.
- the data service framework model of the above-mentioned target voice service can be compatible with the third-party voice service, so that the device side does not need to perform a large amount of repetitive development work to access different voice services.
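The three-layer model can be pictured with the following sketch. The class names, the field lists, and the example protocol label are assumptions chosen for illustration; the application only states what each layer is responsible for.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportLayer:
    protocol: str  # configured to match the third-party voice service


@dataclass(frozen=True)
class MessageFormatLayer:
    request_fields: tuple   # fields expected in a request body
    response_fields: tuple  # fields expected in a response body


@dataclass(frozen=True)
class DeviceCapabilityLayer:
    capability_field: str  # body field holding the capability declaration

    def parse_capabilities(self, body):
        return set(body.get(self.capability_field, []))


@dataclass(frozen=True)
class DataServiceFrameworkModel:
    transport: TransportLayer
    message_format: MessageFormatLayer
    capability: DeviceCapabilityLayer

    def parse_request(self, raw):
        """Check the body against the configured format, then read capabilities."""
        body = json.loads(raw)
        missing = [f for f in self.message_format.request_fields if f not in body]
        if missing:
            raise ValueError(f"missing request fields: {missing}")
        return body, self.capability.parse_capabilities(body)


# Hypothetical configuration mirroring an already-integrated third-party service.
model = DataServiceFrameworkModel(
    transport=TransportLayer(protocol="HTTP/2"),
    message_format=MessageFormatLayer(
        request_fields=("content", "capabilities"),
        response_fields=("interface", "payload"),
    ),
    capability=DeviceCapabilityLayer(capability_field="capabilities"),
)
```

Keeping the three layers as separate configurable objects reflects the compatibility argument in the text: to match a different third-party service, only the configuration changes, not the device-side logic.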
- the present application provides an embodiment of an apparatus for providing a voice service
- the device can be specifically applied to a server of a target voice service.
- the apparatus 600 for providing a voice service in this embodiment includes: a first receiving unit 601, an obtaining unit 602, and a sending unit 603.
- the first receiving unit 601 is configured to receive a request message for providing a target voice service to the device end that has accessed the third-party voice service, where the request message includes the requested content and the state information of the device end, and the obtaining unit 602 is configured to obtain The response message generated after processing the requested content based on the status information of the device, wherein the response message includes an operation instruction, and the sending unit 603 is configured to send a response message to the device.
- the request message and the response message are generated according to the message format configured in the data service framework model of the constructed target voice service, and are transmitted based on the transport protocol configured in the data service framework model of the constructed target voice service.
- the message format configured in the data service framework model of the constructed target voice service is consistent with the message format of the third-party voice service, and the transport protocol configured in the data service framework model of the target voice service is consistent with the transmission protocol of the third-party voice service.
- the first receiving unit 601 may receive the above request message, through the network, from an electronic device with which a user makes a voice service request (for example, the terminal 101 shown in FIG. 1), or from a device end with which the user performs voice interaction (for example, the devices 102, 103 shown in FIG. 1).
- the request message is organized in accordance with the transport protocol of the third-party voice service and the message format defined by the third-party voice service. In this way, when the device side sends the request message, the target voice service can receive and parse the request message without the device side changing its logic for interacting with the third-party voice service.
- the obtaining unit 602 can acquire a response message generated in response to the request message.
- the response message includes an operation instruction on the device side, and the operation instruction may include the output voice service data and the invoked device-side operation interface.
- the response message here is also transmitted in accordance with the transport protocol of the third party voice service and in accordance with the message format defined by the third party voice service. In this way, when the response message is sent to the device side, and the logic for interacting with the third-party voice service is not changed, the device side can also receive and parse the response message.
- the sending unit 603 can send a response message to the device end, so that the device side performs a corresponding operation according to the response message.
- the apparatus 600 may further include: a providing unit configured to provide the user with the to-be-replaced configuration information of the target voice service, in response to obtaining a request by the user to register the device end with the target voice service, so that the user can replace the corresponding configuration items in the configuration file of the device.
- the configuration information to be replaced includes the user ID, the user password, and the path address of the access token.
- the foregoing apparatus may further include: a second receiving unit, configured to receive an access request for connecting the device end to the target voice service, where the access request includes the user identifier, the user password, and an identifier of the device; and an authorization unit, configured to issue an access token of the target voice service to the device according to the access request, so that the device obtains the issued access token through the path address for obtaining the access token.
- the authorization unit may be further configured to issue an access token of the target voice service to the device according to the following manner: searching for the device identifier of the acquired user authorization according to the user identifier and the user password; and determining the target voice service. Whether the identifier of the device in the access request is consistent with the device ID that has been authorized by the user; if yes, the access token of the target voice service is issued to the device.
- the first receiving unit may be further configured to: receive a request message sent by the device end that has obtained the access token of the target voice service and has accessed the third party voice service.
- the response message obtained by the obtaining unit may be generated by: parsing the request message to derive the request content and the state information of the device side; generating corresponding operation instructions based on the device-side state information and the request content; and encapsulating the operation instructions according to the message format and transport protocol configured in the data service framework model of the target voice service to generate a response message.
- the obtaining unit may include a generating module, and the generating module is configured to generate a response message in the above manner.
- the status information of the device may include: a capability declaration of the device, context information of the device, and event information of the device.
- the operation instruction in the response message acquired by the obtaining unit may be generated in the following manner, that is, the generating module may be configured to: determine the callable operation interfaces of the device end according to the device-side capability declaration, the device-side context information, and the device-side event information; determine a target operation interface corresponding to the requested content from the callable operation interfaces; and determine a voice service content according to the requested content, and generate an operation instruction for invoking the target operation interface to output the voice service content.
- the obtaining unit may be further configured to: detect whether the requested content includes a voice interaction requirement; and in response to detecting that the requested content includes a voice interaction requirement, divide the voice service data by a preset time length or a preset message length to generate a plurality of response message segments. The sending unit is further configured to: sequentially send the response message segments to the device end according to the generation time of the response message segments.
- the apparatus 600 may further include: a building unit configured to construct a data service framework model of the target voice service, where the data service framework model includes a transport protocol layer, a message format layer, and a device end capability layer. The building unit is specifically configured to: construct the transport protocol layer, including configuring a transport protocol used by the target voice service; construct the message format layer, including configuring a message format of a request message and a response message of the target voice service; and construct the device end capability layer, including configuring the logic for parsing the capability of the device side from the request message and the response message.
- in the apparatus 600 for providing a voice service, the first receiving unit receives a request message for providing a target voice service to a device end that has accessed the third-party voice service, the obtaining unit acquires the response message generated after the request content is processed based on the state information of the device end, and the sending unit sends the response message to the device end. The request message and the response message are generated according to the message format, consistent with the third-party voice service, that is configured in the data service framework model of the constructed target voice service, and are transmitted based on the transport protocol, consistent with the third-party voice service, that is configured in that model.
- the logic already developed on the device side for interacting with the third-party voice service can therefore be reused to quickly access the target voice service, which greatly reduces the development workload when the device accesses the target voice service and helps reduce the cost of accessing different voice services.
- referring to FIG. 7, a block diagram of a computer system 700 suitable for implementing a server of embodiments of the present application is shown.
- the server shown in FIG. 7 is merely an example, and should not impose any limitation on the function and scope of use of the embodiments of the present application.
- computer system 700 includes a central processing unit (CPU) 701 that can perform various appropriate actions and processes according to a program stored in read only memory (ROM) 702 or a program loaded from storage portion 708 into random access memory (RAM) 703.
- in the RAM 703, various programs and data required for the operation of the system 700 are also stored.
- the CPU 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704.
- An input/output (I/O) interface 705 is also coupled to bus 704.
- the following components are connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, etc.; an output portion 707 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, and a speaker; a storage portion 708 including a hard disk or the like; and a communication portion 709 including a network interface card such as a LAN card, a modem, or the like.
- the communication section 709 performs communication processing via a network such as the Internet.
- a drive 710 is also connected to the I/O interface 705 as needed.
- a removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like, is mounted on the drive 710 as needed so that a computer program read therefrom is installed into the storage portion 708 as needed.
- an embodiment of the present disclosure includes a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for executing the method illustrated in the flowchart.
- the computer program can be downloaded and installed from the network via communication portion 709, and/or installed from removable media 711.
- when the computer program is executed by the central processing unit (CPU) 701, the above-described functions defined in the method of the present application are performed.
- the computer readable medium described herein may be a computer readable signal medium or a computer readable storage medium or any combination of the two.
- the computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer readable storage media may include, but are not limited to, electrical connections having one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the foregoing.
- a computer readable storage medium may be any tangible medium that can contain or store a program, which can be used by or in connection with an instruction execution system, apparatus or device.
- a computer readable signal medium may include a data signal that is propagated in the baseband or as part of a carrier, carrying computer readable program code. Such propagated data signals can take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- the computer readable signal medium can also be any computer readable medium other than a computer readable storage medium, which can transmit, propagate, or transport a program for use by or in connection with the instruction execution system, apparatus, or device.
- Program code embodied on a computer readable medium can be transmitted by any suitable medium, including but not limited to wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
- Each block of the flowchart or block diagram can represent a module, a program segment, or a portion of code that includes one or more executable instructions for implementing the specified logic functions.
- The functions noted in the blocks can also occur in a different order than that illustrated in the drawings. For example, two successively represented blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending upon the functionality involved.
- Each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented in a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
- The units involved in the embodiments of the present application may be implemented by software or by hardware.
- The described units may also be provided in a processor; for example, a processor may be described as comprising a first receiving unit, an obtaining unit, and a transmitting unit.
- The names of these units do not, in some cases, constitute a limitation on the units themselves.
- For example, the first receiving unit may also be described as “a unit for receiving a request message for providing a target voice service to a device end that has accessed a third-party voice service”.
- The present application also provides a computer readable medium, which may be included in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus.
- The computer readable medium carries one or more programs which, when executed by the device, cause the device to: receive a request message for providing a target voice service to a device end that has accessed a third-party voice service, the request message including request content and status information of the device end; obtain a response message generated after the request content is processed based on the status information of the device end, the response message including an operation instruction; and send the response message to the device end; wherein the request message and the response message are generated according to a message format configured in a constructed data service framework model of the target voice service, and are transmitted based on a transmission protocol configured in the constructed data service framework model of the target voice service; the message format configured in the constructed data service framework model of the target voice service is consistent with the message format of the third-party voice service, and the target voice is
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Hardware Design (AREA)
- Human Computer Interaction (AREA)
- Computing Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Power Engineering (AREA)
- Telephonic Communication Services (AREA)
Claims (22)
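- A method for providing a voice service, characterized by comprising: receiving a request message for providing a target voice service to a device end that has accessed a third-party voice service, the request message including request content and status information of the device end; obtaining a response message generated after the request content is processed based on the status information of the device end, the response message including an operation instruction; and sending the response message to the device end; wherein the request message and the response message are generated according to a message format configured in a constructed data service framework model of the target voice service, and are transmitted based on a transmission protocol configured in the constructed data service framework model of the target voice service; the message format configured in the constructed data service framework model of the target voice service is consistent with a message format of the third-party voice service, and the transmission protocol configured in the constructed data service framework model of the target voice service is consistent with a transmission protocol of the third-party voice service.
- The method according to claim 1, characterized in that the method further comprises: in response to obtaining a request sent by a user for registering the device end with the target voice service, providing the user with to-be-replaced configuration information of the target voice service, so that the user replaces the corresponding configuration items in a configuration file of the device end; the to-be-replaced configuration information including a user identifier, a user password, and a path address for obtaining an access token.
- The method according to claim 2, characterized in that the method further comprises: receiving an access request for connecting the device end to the target voice service, the target voice service access request including the user identifier, the user password, and an identifier of the device end; and issuing, based on the access request, an access token of the target voice service to the device end, so that the device end obtains the issued access token through the path address for obtaining the access token.
- The method according to claim 3, characterized in that issuing, based on the access request, the access token of the target voice service to the device end comprises: finding, according to the user identifier and the user password, a device identifier for which user authorization has been obtained; determining whether the identifier of the device end in the target voice service access request is consistent with the device identifier for which user authorization has been obtained; and if so, issuing the access token of the target voice service to the device end.
- The method according to claim 4, characterized in that receiving the request message for providing the target voice service to the device end that has accessed the third-party voice service comprises: receiving the request message sent by a device end that has obtained the access token of the target voice service and has accessed the third-party voice service.
- The method according to claim 1, characterized in that the response message is generated as follows: parsing the request message to obtain the request content and the status information of the device end; generating a corresponding operation instruction based on the status information of the device end and the request content; and encapsulating the operation instruction according to the message format and the transmission protocol configured in the data service framework model of the target voice service, to generate the response message.
- The method according to claim 6, characterized in that the status information of the device end includes: a capability declaration of the device end, context information of the device end, and event information of the device end.
- The method according to claim 7, characterized in that generating the corresponding operation instruction based on the status information of the device end and the request content comprises: determining callable operation interfaces of the device end based on the capability declaration of the device end, the context information of the device end, and the event information of the device end; determining, from the callable operation interfaces, a target operation interface corresponding to the request content; and determining voice service content according to the request content, and generating an operation instruction for calling the target operation interface to output the voice service content.
- The method according to claim 1, characterized in that obtaining the response message generated after the request content is processed based on the status information of the device end comprises: detecting whether the request content contains a voice interaction requirement; and in response to detecting that the request content contains a voice interaction requirement, dividing voice service data by a preset time length or a preset message length to generate a plurality of response message fragments; and sending the response message to the device end comprises: sending the response message fragments to the device end sequentially in order of their generation time.
- The method according to claim 1, characterized in that the method further comprises: constructing the data service framework model of the target voice service, the data service framework model including a transmission protocol layer, a message format layer, and a device-end capability layer; wherein constructing the data service framework model of the target voice service comprises: constructing the transmission protocol layer, including configuring the transmission protocol adopted by the target voice service; constructing the message format layer, including configuring the message formats of the request message and the response message of the target voice service; and constructing the device-end capability layer, including configuring logic for parsing the capabilities of the device end from the request message and the response message.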
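- An apparatus for providing a voice service, characterized by comprising: a first receiving unit configured to receive a request message for providing a target voice service to a device end that has accessed a third-party voice service, the request message including request content and status information of the device end; an obtaining unit configured to obtain a response message generated after the request content is processed based on the status information of the device end, the response message including an operation instruction; and a transmitting unit configured to send the response message to the device end; wherein the request message and the response message are generated according to a message format configured in a constructed data service framework model of the target voice service, and are transmitted based on a transmission protocol configured in the constructed data service framework model of the target voice service; the message format configured in the constructed data service framework model of the target voice service is consistent with a message format of the third-party voice service, and the transmission protocol configured in the constructed data service framework model of the target voice service is consistent with a transmission protocol of the third-party voice service.
- The apparatus according to claim 11, characterized in that the apparatus further comprises: a providing unit configured to, in response to obtaining a request sent by a user for registering the device end with the target voice service, provide the user with to-be-replaced configuration information of the target voice service, so that the user replaces the corresponding configuration items in a configuration file of the device end; the to-be-replaced configuration information including a user identifier, a user password, and a path address for obtaining an access token.
- The apparatus according to claim 12, characterized in that the apparatus further comprises: a second receiving unit configured to receive an access request for connecting the device end to the target voice service, wherein the target voice service access request includes the user identifier, the user password, and an identifier of the device end; and an authorization unit configured to issue, based on the access request, an access token of the target voice service to the device end, so that the device end obtains the issued access token through the path address for obtaining the access token.
- The apparatus according to claim 13, characterized in that the authorization unit is further configured to issue the access token of the target voice service to the device end as follows: finding, according to the user identifier and the user password, a device identifier for which user authorization has been obtained; determining whether the identifier of the device end in the target voice service access request is consistent with the device identifier for which user authorization has been obtained; and if so, issuing the access token of the target voice service to the device end.
- The apparatus according to claim 14, characterized in that the first receiving unit is further configured to: receive the request message sent by a device end that has obtained the access token of the target voice service and has accessed the third-party voice service.
- The apparatus according to claim 11, characterized in that the response message obtained by the obtaining unit is generated as follows: parsing the request message to obtain the request content and the status information of the device end; generating a corresponding operation instruction based on the status information of the device end and the request content; and encapsulating the operation instruction according to the message format and the transmission protocol configured in the data service framework model of the target voice service, to generate the response message.
- The apparatus according to claim 16, characterized in that the status information of the device end includes: a capability declaration of the device end, context information of the device end, and event information of the device end.
- The apparatus according to claim 17, characterized in that the operation instruction in the response message obtained by the obtaining unit is generated as follows: determining callable operation interfaces of the device end based on the capability declaration of the device end, the context information of the device end, and the event information of the device end; determining, from the callable operation interfaces, a target operation interface corresponding to the request content; and determining voice service content according to the request content, and generating an operation instruction for calling the target operation interface to output the voice service content.
- The apparatus according to claim 11, characterized in that the obtaining unit is further configured to: detect whether the request content contains a voice interaction requirement; and in response to detecting that the request content contains a voice interaction requirement, divide voice service data by a preset time length or a preset message length to generate a plurality of response message fragments; and the transmitting unit is further configured to: send the response message fragments to the device end sequentially in order of their generation time.
- The apparatus according to claim 11, characterized in that the apparatus further comprises: a construction unit configured to construct the data service framework model of the target voice service, the data service framework model including a transmission protocol layer, a message format layer, and a device-end capability layer; the construction unit being specifically configured to: construct the transmission protocol layer, including configuring the transmission protocol adopted by the target voice service; construct the message format layer, including configuring the message formats of the request message and the response message of the target voice service; and construct the device-end capability layer, including configuring logic for parsing the capabilities of the device end from the request message and the response message.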
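- A server, characterized by comprising: one or more processors; and a storage device for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-10.
- A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-10.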
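The response-generation steps recited in claims 6, 8 and 9 (parse the request, determine the callable operation interfaces from the device-end status, pick a target interface, then split the voice service data into fragments sent in generation order) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the JSON field names (`content`, `status`, `capabilities`, `context`, `muted`, `voice`, `query`), the interface names `speak` and `display_text`, the `synthesize` stand-in, and the 4-byte fragment length are all hypothetical choices made only for this example.

```python
import json


def callable_interfaces(status: dict) -> set:
    """Derive the callable operation interfaces from the device-end status."""
    interfaces = set(status.get("capabilities", []))
    # Assumed rule (not from the patent): a muted device cannot play audio.
    if status.get("context", {}).get("muted"):
        interfaces.discard("speak")
    return interfaces


def synthesize(text: str) -> bytes:
    """Stand-in for speech synthesis: returns UTF-8 bytes, not real audio."""
    return text.encode("utf-8")


def split_by_length(data: bytes, max_len: int) -> list:
    """Split voice service data into fragments of at most max_len bytes."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [data[i:i + max_len] for i in range(0, len(data), max_len)]


def build_responses(request_json: str, max_len: int = 4) -> list:
    """Parse a request and return response messages in generation order."""
    request = json.loads(request_json)
    content, status = request["content"], request["status"]

    # Pick the target interface among those the device can currently accept.
    available = callable_interfaces(status)
    wants_voice = bool(content.get("voice"))
    if wants_voice and "speak" in available:
        target = "speak"
    elif "display_text" in available:
        target = "display_text"
    else:
        raise ValueError("no callable interface matches the request")

    # Hypothetical service content: echo the query back to the device.
    reply = "echo: " + content["query"]
    if target == "speak":
        # Voice interaction: fragment the data by a preset message length.
        payloads = [f.hex() for f in split_by_length(synthesize(reply), max_len)]
    else:
        payloads = [reply]

    # The sequence number records generation order, which is also send order.
    return [
        {"seq": seq, "instruction": {"interface": target, "payload": p}}
        for seq, p in enumerate(payloads)
    ]
```

Calling `build_responses` with a request whose status declares both `speak` and `display_text` yields one message per fragment, numbered from 0; a muted device in this sketch falls back to a single `display_text` message. Splitting by a preset time length, the other option named in claim 9, would need real audio framing and is not attempted here.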
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2019537348A JP6754011B2 (ja) | 2017-06-30 | 2017-12-22 | Method, apparatus and server for providing voice service |
| KR1020197020272A KR102144286B1 (ko) | 2017-06-30 | 2017-12-22 | Method, apparatus and server for providing voice service |
| EP17915576.7A EP3550801B1 (en) | 2017-06-30 | 2017-12-22 | Method and device for providing voice service, and server |
| US16/507,248 US10791200B2 (en) | 2017-06-30 | 2019-07-10 | Method, apparatus and server for providing voice service |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201710525724.1A CN107277153B (zh) | 2017-06-30 | 2017-06-30 | Method, apparatus and server for providing voice service |
| CN201710525724.1 | 2017-06-30 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US16/507,248 Continuation US10791200B2 (en) | 2017-06-30 | 2019-07-10 | Method, apparatus and server for providing voice service |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2019000871A1 (zh) | 2019-01-03 |
Family
ID=60070767
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2017/118008 Ceased WO2019000871A1 (zh) | Method, apparatus and server for providing voice service | 2017-06-30 | 2017-12-22 |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US10791200B2 (zh) |
| EP (1) | EP3550801B1 (zh) |
| JP (1) | JP6754011B2 (zh) |
| KR (1) | KR102144286B1 (zh) |
| CN (1) | CN107277153B (zh) |
| WO (1) | WO2019000871A1 (zh) |
Families Citing this family (23)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107277153B (zh) * | 2017-06-30 | 2020-05-26 | 百度在线网络技术(北京)有限公司 | Method, apparatus and server for providing voice service |
| CN107342083B (zh) * | 2017-07-05 | 2021-07-20 | 百度在线网络技术(北京)有限公司 | Method and apparatus for providing voice service |
| CN107733722B (zh) * | 2017-11-16 | 2021-07-20 | 百度在线网络技术(北京)有限公司 | Method and apparatus for configuring voice service |
| CN107911386B (zh) * | 2017-12-06 | 2020-12-04 | 北京小米移动软件有限公司 | Method and apparatus for acquiring service authorization information |
| WO2019236444A1 (en) * | 2018-06-05 | 2019-12-12 | Voicify, LLC | Voice application platform |
| CN109036427B (zh) * | 2018-09-25 | 2021-01-26 | 苏宁智能终端有限公司 | Method and system for dynamically configuring a speech recognition service |
| US11100926B2 (en) * | 2018-09-27 | 2021-08-24 | Coretronic Corporation | Intelligent voice system and method for controlling projector by using the intelligent voice system |
| US11087754B2 (en) | 2018-09-27 | 2021-08-10 | Coretronic Corporation | Intelligent voice system and method for controlling projector by using the intelligent voice system |
| CN112579749B (zh) * | 2018-11-14 | 2024-04-19 | 深圳市云歌人工智能技术有限公司 | Method, system, and storage medium for providing and acquiring services |
| CN111324468B (zh) * | 2018-12-13 | 2023-08-01 | 熙牛医疗科技(浙江)有限公司 | Message passing method, apparatus, system, and computing device |
| CN109815025B (zh) * | 2018-12-17 | 2024-03-15 | 顺丰科技有限公司 | Business model invocation method, apparatus, and storage medium |
| CN109918040B (zh) * | 2019-03-15 | 2022-08-16 | 阿波罗智联(北京)科技有限公司 | Voice instruction distribution method and apparatus, electronic device, and computer-readable medium |
| US11516221B2 (en) * | 2019-05-31 | 2022-11-29 | Apple Inc. | Multi-user devices in a connected home environment |
| CN111371792A (zh) * | 2020-03-06 | 2020-07-03 | 杭州涂鸦信息技术有限公司 | Method and system for reporting sound pickup data based on a smart audio device |
| US20210383811A1 (en) * | 2020-06-09 | 2021-12-09 | Native Voice, Inc. | Methods and systems for audio voice service in an embedded device |
| CN114726830B (zh) * | 2020-12-18 | 2024-10-29 | 阿里巴巴集团控股有限公司 | Voice service access method, system, and vehicle |
| CN115168064A (zh) * | 2021-04-07 | 2022-10-11 | 腾讯科技(深圳)有限公司 | Application service invocation method and apparatus, and application access method |
| KR20230023212A (ko) * | 2021-08-10 | 2023-02-17 | 삼성전자주식회사 | Electronic device for outputting a voice command processing result according to a state change, and operating method thereof |
| CN114244821B (zh) * | 2021-12-16 | 2023-03-14 | 北京百度网讯科技有限公司 | Data processing method, apparatus, device, electronic device, and storage medium |
| CN114048303B (zh) * | 2022-01-11 | 2022-05-17 | 北京安博通科技股份有限公司 | System and method for human-machine collaborative combat handling and response |
| CN114373449A (zh) * | 2022-01-18 | 2022-04-19 | 海信电子科技(武汉)有限公司 | Smart device, server, and voice interaction method |
| CN115033404A (zh) * | 2022-06-30 | 2022-09-09 | 京东方科技集团股份有限公司 | Smart interactive tablet integrating AI capabilities, and interaction method |
| CN118227447B (zh) * | 2024-05-22 | 2025-08-29 | 北京阿帕科蓝科技集团有限公司 | Metric monitoring method, apparatus, computer device, and storage medium |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20090187956A1 (en) * | 2008-01-22 | 2009-07-23 | Joseph Sommer | Method and apparatus for merging voice and data features with internet protocol television |
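| CN101567941A (zh) * | 2008-04-25 | 2009-10-28 | 佛山市顺德区顺达电脑厂有限公司 | Real-time voice reservation system and method |
| CN101699840A (zh) * | 2009-11-09 | 2010-04-28 | 南京希华通信技术有限公司 | Intelligent voice interaction system in converged communication and implementation method thereof |
| CN102638452A (zh) * | 2012-03-14 | 2012-08-15 | 杭州华三通信技术有限公司 | Call method and device based on a VoIP network |
| CN105679319A (zh) * | 2015-12-29 | 2016-06-15 | 百度在线网络技术(北京)有限公司 | Speech recognition processing method and apparatus |
| CN107277153A (zh) * | 2017-06-30 | 2017-10-20 | 百度在线网络技术(北京)有限公司 | Method, apparatus and server for providing voice service |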
Family Cites Families (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8448059B1 (en) * | 1999-09-03 | 2013-05-21 | Cisco Technology, Inc. | Apparatus and method for providing browser audio control for voice enabled web applications |
| US6934756B2 (en) * | 2000-11-01 | 2005-08-23 | International Business Machines Corporation | Conversational networking via transport, coding and control conversational protocols |
| US6801604B2 (en) * | 2001-06-25 | 2004-10-05 | International Business Machines Corporation | Universal IP-based and scalable architectures across conversational applications using web services for speech and audio processing resources |
| JP2003337866A (ja) * | 2002-05-20 | 2003-11-28 | Shimizu Corp | Integrated indoor environment and information management system |
| KR100477513B1 (ko) * | 2002-11-25 | 2005-03-17 | 전자부품연구원 | Common protocol layer architecture and method for mutual data transmission between heterogeneous protocols, and common protocol packet |
| US7180984B1 (en) * | 2002-11-26 | 2007-02-20 | At&T Corp. | Mixed protocol multi-media provider system incorporating a session initiation protocol (SIP) based media server adapted to operate using SIP messages which encapsulate GR-1129 advanced intelligence network based information |
| EA015549B1 (ru) * | 2003-06-05 | 2011-08-30 | Интертраст Текнолоджис Корпорейшн | Portable system and method for peer-to-peer service orchestration applications |
| US20070140255A1 (en) * | 2005-12-21 | 2007-06-21 | Motorola, Inc. | Method and system for communication across different wireless technologies using a multimode mobile device |
| US9288276B2 (en) * | 2006-11-03 | 2016-03-15 | At&T Intellectual Property I, L.P. | Application services infrastructure for next generation networks including a notification capability and related methods and computer program products |
| WO2008085206A2 (en) * | 2006-12-29 | 2008-07-17 | Prodea Systems, Inc. | Subscription management of applications and services provided through user premises gateway devices |
| US20160277261A9 (en) * | 2006-12-29 | 2016-09-22 | Prodea Systems, Inc. | Multi-services application gateway and system employing the same |
| JP2009110300A (ja) * | 2007-10-30 | 2009-05-21 | Nippon Telegr & Teleph Corp <Ntt> | Information appliance network control device, information appliance network control system, information appliance network control method, and program |
| US9159322B2 (en) * | 2011-10-18 | 2015-10-13 | GM Global Technology Operations LLC | Services identification and initiation for a speech-based interface to a mobile device |
| US9326088B2 (en) * | 2011-10-21 | 2016-04-26 | GM Global Technology Operations LLC | Mobile voice platform architecture with remote service interfaces |
| CN102571967B (zh) * | 2012-01-17 | 2015-04-01 | 深圳市乐唯科技开发有限公司 | System and method for implementing multi-object data interaction response and call functions |
| US9536527B1 (en) * | 2015-06-30 | 2017-01-03 | Amazon Technologies, Inc. | Reporting operational metrics in speech-based systems |
| CN105871972A (zh) * | 2015-11-13 | 2016-08-17 | 乐视云计算有限公司 | Distributed caching method, apparatus, and system for video resources |
2017
- 2017-06-30: CN application CN201710525724.1A, patent CN107277153B (zh), Active
- 2017-12-22: WO application PCT/CN2017/118008, publication WO2019000871A1 (zh), Ceased
- 2017-12-22: KR application KR1020197020272A, patent KR102144286B1 (ko), Active
- 2017-12-22: EP application EP17915576.7A, patent EP3550801B1 (en), Active
- 2017-12-22: JP application JP2019537348A, patent JP6754011B2 (ja), Active

2019
- 2019-07-10: US application US16/507,248, patent US10791200B2 (en), Active
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP3550801A4 * |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
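| CN111147586A (zh) * | 2019-12-27 | 2020-05-12 | 腾讯科技(深圳)有限公司 | Device-end control method, apparatus, and conference system |
| CN111147586B (zh) * | 2019-12-27 | 2022-03-04 | 腾讯科技(深圳)有限公司 | Device-end control method, apparatus, and conference system |
| CN119889316A (zh) * | 2025-01-13 | 2025-04-25 | 科大讯飞股份有限公司 | Streaming recognition method, apparatus, medium, and device for voice instructions |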
Also Published As
| Publication number | Publication date |
|---|---|
| US10791200B2 (en) | 2020-09-29 |
| CN107277153B (zh) | 2020-05-26 |
| KR102144286B1 (ko) | 2020-08-14 |
| EP3550801A4 (en) | 2019-11-20 |
| JP2020511804A (ja) | 2020-04-16 |
| CN107277153A (zh) | 2017-10-20 |
| US20190335020A1 (en) | 2019-10-31 |
| KR20190091545A (ko) | 2019-08-06 |
| JP6754011B2 (ja) | 2020-09-09 |
| EP3550801B1 (en) | 2020-08-12 |
| EP3550801A1 (en) | 2019-10-09 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
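| US10791200B2 (en) | | Method, apparatus and server for providing voice service |
| CN108306877B (zh) | | Method, apparatus, and storage medium for verifying user identity information based on node js |
| US10884808B2 (en) | | Edge computing platform |
| US10210030B2 (en) | | Securely operating remote cloud-based applications |
| US11360737B2 (en) | | Method and apparatus for providing speech service |
| CN110413418B (zh) | | Cache synchronization apparatus and method, cache synchronization system, and electronic device |
| JP2020003773A (ja) | | Method and apparatus for transmitting information |
| WO2019015272A1 (zh) | | Information processing method and apparatus |
| CN111028839B (zh) | | Smart home control method, apparatus, and electronic device |
| WO2022057677A1 (zh) | | Vibration control method, apparatus, electronic device, and computer-readable storage medium |
| US20250141941A1 (en) | | Real-time media streams |
| CN113691602A (zh) | | Cloud-phone-based service processing method, system, apparatus, device, and medium |
| CN108512889B (zh) | | HTTP-based application response push method and proxy server |
| WO2023246060A1 (zh) | | User authentication and authorization method, apparatus, medium, and device |
| CN118972431A (zh) | | Request processing method for AI model, computer device, medium, and product |
| CN110781014A (zh) | | Multi-process distribution method and system for recording data based on Android devices |
| CN120017720B (zh) | | Microservice instance routing method, API gateway, and computing service device |
| CN113946816B (zh) | | Cloud-service-based authentication method, apparatus, electronic device, and storage medium |
| CN112015383A (zh) | | Login method and apparatus |
| CN116561013B (zh) | | Testing method, apparatus, electronic device, and medium based on a target service framework |
| CN115865974A (zh) | | Edge device, cloud device, edge computing system and method, and storage medium |
| CN115720224A (zh) | | Desktop-cloud-based access method, apparatus, electronic device, and medium |
| WO2022100203A1 (zh) | | Data processing method, apparatus, medium, network access device, and electronic device |
| CN115129469B (zh) | | Cross-process communication method, apparatus, device, and storage medium |
| WO2020192245A1 (zh) | | Application launching method, apparatus, computer system, and medium |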
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17915576; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2019537348; Country of ref document: JP; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 20197020272; Country of ref document: KR; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2017915576; Country of ref document: EP; Effective date: 20190704 |
| | NENP | Non-entry into the national phase | Ref country code: DE |