WO2019056821A1 - Method and apparatus for information interaction - Google Patents
- Publication number
- WO2019056821A1 (application PCT/CN2018/092870)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- image
- semantic
- processed
- tag
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/53—Querying
- G06F16/535—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/26—Techniques for post-processing, e.g. correcting the recognition result
- G06V30/262—Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
- G06V30/274—Syntactic or semantic context, e.g. balancing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/38—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/383—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/53—Querying
Definitions
- The present application relates to the field of data processing technologies, in particular to the field of information interaction technologies, and more particularly to a method and apparatus for information interaction.
- Image recognition is a rapidly developing technology.
- An important direction of image recognition technology is to satisfy the user's cognitive needs by understanding and recognizing the content of an image.
- Existing image recognition usually imports the image to be identified into an image search engine in order to find the same or similar images, i.e., it searches for images by image.
- the purpose of the embodiments of the present application is to provide a method and apparatus for information interaction to solve the technical problems mentioned in the above background art.
- An embodiment of the present application provides a method for information interaction, where the method includes: receiving a webpage browsing request of a user, where the webpage browsing request includes a webpage; acquiring to-be-processed information, where the to-be-processed information includes text information and an image; extracting feature words of the text information of the to-be-processed information, and searching for description information of the image of the to-be-processed information based on the feature words, wherein the feature words are used to represent a search request for the image, and the description information is used to represent a text description of the image; and constructing response information for the to-be-processed information from the description information.
- Extracting the feature words of the text information of the to-be-processed information includes: performing semantic recognition on the text information to obtain semantic information corresponding to the text information, and extracting the feature words from the semantic information.
- Searching for the description information of the image of the to-be-processed information based on the feature words includes: importing the image into an image search model to obtain a to-be-matched image set corresponding to the image, where the to-be-matched image set includes at least one image to be matched, and the image search model is used to represent a first correspondence between the image and the image to be matched; importing the images to be matched into a semantic tag model to obtain a semantic tag set corresponding to the to-be-matched image set, where the semantic tag model is used to represent a second correspondence between the image to be matched and the semantic tag, and the semantic tag is a text description of the image to be matched; and selecting a semantic tag to be identified from the semantic tag set, and using the interpretation information of the noun corresponding to the image in the semantic tag to be identified as the description information.
- Selecting a semantic tag to be identified from the semantic tag set includes: counting the number of identical semantic tags in the semantic tag set, and using the semantic tag with the highest count as the semantic tag to be identified.
- The method further includes a step of correcting the description information, which includes: receiving feedback information corresponding to the response information, where the feedback information is used to evaluate the accuracy of the response information; performing semantic recognition on the feedback information to obtain accuracy rate information; when the accuracy rate information is lower than a set threshold, selecting a secondary to-be-identified tag from the semantic tags in the semantic tag set other than the semantic tag to be identified; using the interpretation information of the noun corresponding to the image in the secondary to-be-identified tag as secondary description information; and constructing the response information of the to-be-processed information from the secondary description information.
- An embodiment of the present application provides an apparatus for information interaction, where the apparatus includes: a to-be-processed information acquiring unit, configured to acquire to-be-processed information, where the to-be-processed information includes text information and an image; a description information acquiring unit, configured to extract feature words of the text information of the to-be-processed information and to search for description information of the image of the to-be-processed information based on the feature words, wherein the feature words are used to represent a search request for the image, and the description information is used to represent a text description of the image; and a response information construction unit, configured to construct response information for the to-be-processed information from the description information.
- The description information acquiring unit includes: a semantic recognition subunit, configured to perform semantic recognition on the text information to obtain semantic information corresponding to the text information; and a feature word extraction subunit, configured to extract the feature words from the semantic information.
- The description information acquiring unit includes: a to-be-matched image acquisition subunit, configured to import the image into an image search model to obtain a to-be-matched image set corresponding to the image, where the to-be-matched image set includes at least one image to be matched, and the image search model is used to represent a first correspondence between the image and the image to be matched; a semantic tag acquisition subunit, configured to import the images to be matched into a semantic tag model to obtain a semantic tag set corresponding to the to-be-matched image set, where the semantic tag model is used to represent a second correspondence between the image to be matched and the semantic tag, and the semantic tag is a text description of the image to be matched; and a description information obtaining subunit, configured to select a semantic tag to be identified from the semantic tag set and to use the interpretation information of the noun corresponding to the image in the semantic tag to be identified as the description information.
- The description information obtaining subunit is configured to count the number of identical semantic tags in the semantic tag set and to use the semantic tag with the highest count as the semantic tag to be identified.
- The apparatus further includes a correction unit, configured to correct the description information, where the correction unit includes: a feedback information receiving subunit, configured to receive feedback information corresponding to the response information, where the feedback information is used to evaluate the accuracy of the response information; an accuracy information acquisition subunit, configured to perform semantic recognition on the feedback information to obtain accuracy rate information; a secondary to-be-identified tag acquisition subunit, configured to select, when the accuracy rate information is lower than a set threshold, a secondary to-be-identified tag from the semantic tags in the semantic tag set other than the semantic tag to be identified; a secondary description information acquiring subunit, configured to use the interpretation information of the noun corresponding to the image in the secondary to-be-identified tag as secondary description information; and a secondary response information construction subunit, configured to construct the response information of the to-be-processed information from the secondary description information.
- An embodiment of the present application provides a server, including: one or more processors; and a memory, configured to store one or more programs, where, when the one or more programs are executed by the one or more processors, the one or more processors are caused to perform the method for information interaction of the first aspect described above.
- An embodiment of the present application provides a computer readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the method for information interaction of the first aspect.
- The method and apparatus for information interaction provided by the embodiments of the present application extract the feature words of the text information of the to-be-processed information and obtain the description information of the image of the to-be-processed information, thereby establishing a correspondence between the text information and the image in the to-be-processed information. The response information is then constructed from the description information, which realizes information interaction with the to-be-processed information and improves the efficiency of information interaction.
- FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
- FIG. 2 is a flow diagram of one embodiment of a method for information interaction in accordance with the present application.
- FIG. 3 is a schematic diagram of an application scenario of a method for information interaction according to the present application.
- FIG. 4 is a schematic structural diagram of an embodiment of an apparatus for information interaction according to the present application.
- FIG. 5 is a schematic structural diagram of a computer system suitable for implementing a server of an embodiment of the present application.
- FIG. 1 illustrates an exemplary system architecture 100 of an embodiment of a method for information interaction or apparatus for information interaction to which the present application may be applied.
- system architecture 100 can include terminal devices 101, 102, 103, network 104, and server 105.
- the network 104 is used to provide a medium for communication links between the terminal devices 101, 102, 103 and the server 105.
- Network 104 may include various types of connections, such as wired, wireless communication links, fiber optic cables, and the like.
- The user can use the terminal devices 101, 102, 103 to interact with the server 105 over the network 104, for example to send to-be-processed information or receive response information.
- Various communication client applications such as a web browser application, an instant communication tool, social platform software, and the like can be installed on the terminal devices 101, 102, and 103.
- The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support information editing, including but not limited to smartphones, tablets, laptop computers, desktop computers, and the like.
- the server 105 may be a server that provides various services, such as a server that performs information processing on the information to be processed on the terminal devices 101, 102, and 103.
- For example, the server may obtain the to-be-processed information from the terminal devices 101, 102, 103, extract the feature words from the text information of the to-be-processed information, and search for the description information of the image of the to-be-processed information based on the feature words. Finally, the server constructs the response information corresponding to the to-be-processed information from the description information, thereby realizing information interaction.
- the method for information interaction provided by the embodiment of the present application may be separately executed by the terminal device 101, 102, 103, or may be jointly performed by the terminal device 101, 102, 103 and the server 105.
- Accordingly, the apparatus for information interaction may be provided in the terminal devices 101, 102, 103 or in the server 105.
- The numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. Depending on implementation needs, there can be any number of terminal devices, networks, and servers.
- the method for information interaction includes the following steps:
- Step 201: Acquire information to be processed.
- The electronic device on which the method for information interaction runs (for example, the terminal devices 101, 102, 103 or the server 105 shown in FIG. 1) can obtain the information to be processed through a wired or wireless connection.
- the information to be processed includes text information and an image.
- The wireless connection may include, but is not limited to, a 3G/4G connection, a WiFi connection, a Bluetooth connection, a WiMAX connection, a Zigbee connection, a UWB (ultra wideband) connection, and other wireless connection methods now known or developed in the future.
- the user can perform operations related to text and pictures in an information processing application on the terminal device 101, 102, 103.
- For example, a user inputs text information and an image in an information processing application.
- The image may show various objects (for example, a certain plant or an animal), and the text information may be: "I just photographed this outside. I have never seen it before. Who knows?"
- the terminal device 101, 102, 103 or the server 105 can use the information including the text information and the image as the information to be processed.
- Step 202: Extract feature words of the text information of the to-be-processed information, and search for description information of the image of the to-be-processed information based on the feature words.
- The feature words are used to represent a search request for the image. For example, the feature words may be "who knows", "what is", and the like.
- When feature words are extracted, this indicates that the user intends to learn about the image.
- the description information of the image can be found in a plurality of ways, wherein the description information is used to represent the text description of the image.
- For example, the description information can be: "XXX (plant name), scientific name XX, Liliaceae lily, perennial herb, native to China."
- Extracting the feature words of the text information of the to-be-processed information may include the following steps:
- Semantic recognition is performed on the text information to obtain semantic information corresponding to the text information.
- Taking the above text information as an example, semantic recognition may yield the corresponding semantic information: "what is in the image".
- the text information of the information to be processed may also directly include the feature words.
- For example, the text information can be "Who knows what plants are in the picture", in which "picture", "what", and "plant" can be feature words.
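The feature word extraction described above can be sketched as a simple phrase match. This is only an illustration: the phrase list `FEATURE_WORDS`, the optional plural handling, and both function names are assumptions, and the source instead describes semantic recognition without naming a technique.

```python
import re

# Hypothetical list of query-style feature words. The source gives only
# "who knows", "what is", "what", "picture", and "plant" as examples.
FEATURE_WORDS = ("who knows", "what is", "what", "picture", "plant")


def extract_feature_words(text):
    """Return the feature words found in the text, in FEATURE_WORDS order."""
    lowered = text.lower()
    found = []
    for word in FEATURE_WORDS:
        # Whole-word match with an optional plural "s", so "plants"
        # counts as "plant" but "whatever" does not count as "what".
        if re.search(r"\b" + re.escape(word) + r"s?\b", lowered):
            found.append(word)
    return found


def has_search_intent(text):
    """Treat any feature word as a sign the user wants the image identified."""
    return bool(extract_feature_words(text))
```

With the example sentence from the text, `extract_feature_words("Who knows what plants are in the picture")` finds "who knows", "what", "picture", and "plant".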
- Searching for the description information of the image of the to-be-processed information based on the feature words may include the following steps:
- the image is imported into the image search model to obtain a set of images to be matched corresponding to the above image.
- The image search model can extract the image features of the input image, and then search a local image library or the network for images containing the same or similar image features to use as images to be matched.
- the image set to be matched may include at least one image to be matched, and the image search model is used to represent a first correspondence between the image and the image to be matched.
- The first correspondence may be a relationship of sameness or similarity between the imported image and the image to be matched.
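A minimal sketch of such an image search, assuming each image has already been reduced to a numeric feature vector. The source does not say how features are extracted or compared; cosine similarity, the `min_similarity` cut-off, and the function names are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_images_to_match(query_features, library, min_similarity=0.9):
    """Return ids of library images similar to the query, best match first.

    `library` maps an image id to its feature vector (hypothetical layout).
    """
    scored = [
        (cosine_similarity(query_features, features), image_id)
        for image_id, features in library.items()
    ]
    matches = [pair for pair in scored if pair[0] >= min_similarity]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in matches]
```

The returned ids play the role of the to-be-matched image set; an empty list means no sufficiently similar image was found.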
- the image to be matched is imported into the semantic tag model to obtain a semantic tag set corresponding to the image set to be matched.
- The images to be matched are found among existing local or network images and are the same as or similar to the imported image.
- the image to be matched may be imported into the semantic tag model to obtain a semantic tag corresponding to the image to be matched.
- the semantic tag model is used to represent a second correspondence between the image to be matched and the semantic tag, and the semantic tag is used for text description of the image to be matched.
- the semantic tag corresponding to an image to be matched may be: "This is a lily.”
- a semantic tag to be identified is filtered out from the set of semantic tags, and the interpretation information of the noun corresponding to the image in the semantic tag to be identified is used as the description information.
- Importing an image into the image search model yields at least one image to be matched, each of which has a semantic tag, whereas the imported image should have only one semantic tag. Therefore, the semantic tag that best matches the imported image can be selected from the semantic tag set and used as the semantic tag to be identified. Then, the interpretation information of the noun corresponding to the imported image in the semantic tag to be identified may be used as the description information.
- For example, if the semantic tag to be identified is "This is a lily.", then "lily" is the noun corresponding to the imported image.
- The interpretation information of "lily" can be obtained by a local or network query, and this interpretation information can be used as the description information of the imported image.
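As a rough sketch of this step, the noun can be pulled out of a tag of the form "This is a <noun>." and looked up in a local table. The tag pattern, the `INTERPRETATIONS` table with its single entry, and both function names are illustrative assumptions. The source does not specify how the noun is located or where the interpretation is queried.

```python
import re

# Hypothetical local knowledge base: noun -> interpretation text.
INTERPRETATIONS = {
    "lily": "Lily: a perennial herb of the family Liliaceae.",
}


def noun_from_tag(tag):
    """Extract the noun from a tag shaped like 'This is a lily.'."""
    match = re.fullmatch(r"This is an? (.+?)\.?", tag.strip())
    return match.group(1).lower() if match else None


def describe(tag, knowledge_base=INTERPRETATIONS):
    """Return the interpretation of the tag's noun, or None if unknown."""
    noun = noun_from_tag(tag)
    return knowledge_base.get(noun) if noun else None
```

A network query could replace the dictionary lookup without changing the callers.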
- Selecting the semantic tag to be identified from the semantic tag set may include: counting the number of identical semantic tags in the semantic tag set, and using the semantic tag with the highest count as the semantic tag to be identified.
- Importing an image into the image search model yields at least one image to be matched, and each image to be matched has a semantic tag.
- the images to be matched may be different from each other, but the resulting semantic tags may be the same.
- For example, the images to be matched may show the same plant photographed from different angles, in which case the corresponding semantic tags may all be the same.
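The counting rule above amounts to a majority vote over the semantic tag set. A minimal sketch follows; the function name is an assumption, and ties are resolved in favour of the tag seen first, which the source does not specify.

```python
from collections import Counter


def select_tag_to_identify(semantic_tags):
    """Return the most frequent tag in the set, or None if the set is empty."""
    if not semantic_tags:
        return None
    tag, _count = Counter(semantic_tags).most_common(1)[0]
    return tag
```

For example, given two images tagged "This is a lily." and one tagged "This is a tulip.", the lily tag is selected.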
- Step 203: Construct the response information of the to-be-processed information by using the foregoing description information.
- the response information can be constructed by describing the information.
- For example, the above description information may be: "XXX (plant name), scientific name XX, Liliaceae lily, perennial herb, native to China."
- the response information may be: "The figure is XXX (plant name)
- the response information can be sent to the terminal devices 101, 102, 103 to implement information interaction with the information to be processed on the terminal devices 101, 102, 103.
- The method may further include a step of correcting the foregoing description information, which may include:
- the feedback information corresponding to the response information is received, and the feedback information is used to evaluate the accuracy of the response information.
- For example, the evaluation can be: "Yes, it is XXX, our family also has one", "No, it should be XXX", or "may be YYY, not like XXX". These user evaluations can be regarded as feedback information on the response information.
- the above feedback information is semantically identified to obtain accuracy rate information.
- For example, the semantic recognition result of "Yes, it is XXX, our family also has one" can be "the response information is correct"; the semantic recognition result of "No, it should be XXX" can be "the response information is wrong"; and the semantic recognition result of "may be YYY, not like XXX" can be "the response information is indeterminate".
- Accordingly, the accuracy rate of "Yes, it is XXX, our family also has one" can be set to 100%, the accuracy rate of "No, it should be XXX" to 0%, and the accuracy rate of "may be YYY, not like XXX" to 50%.
- The accuracy rate information of the response information can then be obtained from these values, for example from 8 pieces of feedback with an accuracy rate of 100% together with feedback whose accuracy rate is 0% or 50%.
- When the accuracy rate information is lower than the set threshold, a secondary to-be-identified tag is selected from the semantic tags in the semantic tag set other than the semantic tag to be identified.
- the accuracy information can reflect the correctness of the response information.
- When the accuracy rate information is high enough, the response information can be considered correct. For example, if 8 out of 10 pieces of feedback consider the response information correct, the response information can be directly treated as 100% correct.
- When the accuracy rate information is below a set threshold (for example, 69%), the response information can be considered incorrect.
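One way to turn feedback verdicts into accuracy rate information is to average the per-feedback scores and compare the result with the threshold. The sketch below assumes the feedback has already been classified as correct, incorrect, or indeterminate. Averaging is an assumption, since the source does not state how individual rates are combined, and the names `SCORES`, `accuracy_rate`, and `needs_correction` are illustrative.

```python
# Per-verdict accuracy rates taken from the text: 100%, 0%, and 50%.
SCORES = {"correct": 1.0, "incorrect": 0.0, "indeterminate": 0.5}


def accuracy_rate(verdicts):
    """Average the accuracy rates of the classified feedback (assumed rule)."""
    if not verdicts:
        raise ValueError("no feedback received")
    return sum(SCORES[verdict] for verdict in verdicts) / len(verdicts)


def needs_correction(verdicts, threshold=0.69):
    """True when the accuracy rate is below the set threshold (69% here)."""
    return accuracy_rate(verdicts) < threshold
```

With a hypothetical batch of 8 correct, 1 incorrect, and 1 indeterminate verdict, the average is 0.85, which is above the 69% example threshold, so no correction is triggered.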
- other semantic tags may be selected from the semantic tag set other than the semantic tags to be identified corresponding to the response information as the secondary to-be-identified tags.
- the interpretation information of the noun in the second to-be-identified tag is used as the secondary description information.
- the interpretation information of the noun in the second to-be-identified tag can be used as the secondary description information.
- the response information of the to-be-processed information is constructed by the secondary description information.
- the response information can be reconstructed by the secondary description information, and then the response information is transmitted to the terminal devices 101, 102, 103.
- If the accuracy rate is still too low, other to-be-identified tags may continue to be selected to obtain new response information until the corresponding accuracy rate information is above the set threshold.
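The correction loop can be sketched as trying candidate tags one after another until the feedback is good enough. The callback `get_accuracy` is hypothetical and stands in for sending a response built from the tag and collecting feedback on it. Trying tags from most to least frequent is also an assumption, since the source only says that another tag is selected.

```python
from collections import Counter


def respond_with_correction(semantic_tags, get_accuracy, threshold=0.69):
    """Return the first tag whose feedback accuracy reaches the threshold.

    Tags are tried from most to least frequent; None means every
    candidate in the semantic tag set was rejected.
    """
    for tag, _count in Counter(semantic_tags).most_common():
        if get_accuracy(tag) >= threshold:
            return tag
    return None
```

The tag that is finally accepted can then be stored with the image, as the following paragraph of the text describes.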
- the image in the to-be-processed information and the response information may be associated with each other and saved in the database for subsequent query of the image and the response information.
- FIG. 3 is a schematic diagram of an application scenario of a method for information interaction according to the present embodiment.
- The user posts a message in a plant forum (the "plant bar") through the terminal device 102: "I just found it nearby, it is pretty, I don't know what it is, does anyone know?" and attaches the corresponding image.
- The server 105 acquires the content sent by the user and uses it as the information to be processed. Then, the server 105 extracts the feature word "what" from the text information; after that, the server 105 acquires the description information corresponding to the image, constructs the response information from the description information, and transmits it to the terminal device 102.
- The method provided in the foregoing embodiment of the present application extracts the feature words of the text information of the to-be-processed information and obtains the description information of the image of the to-be-processed information, thereby establishing a correspondence between the text information and the image in the to-be-processed information. It then constructs the response information from the description information, realizing information interaction with the to-be-processed information.
- the present application provides an embodiment of an apparatus for information interaction, the apparatus embodiment corresponding to the method embodiment shown in FIG.
- the device can be specifically applied to various electronic devices.
- the apparatus 400 for information interaction of this embodiment may include: a to-be-processed information acquisition unit 401, a description information acquisition unit 402, and a response information construction unit 403.
- The to-be-processed information acquiring unit 401 is configured to obtain the to-be-processed information, where the to-be-processed information includes text information and an image.
- The description information acquiring unit 402 is configured to extract the feature words of the text information of the to-be-processed information and to search for the description information of the image of the to-be-processed information based on the feature words, wherein the feature words are used to represent a search request for the image, and the description information is used to represent a text description of the image.
- The response information construction unit 403 is configured to construct the response information of the to-be-processed information from the description information.
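The division into units 401, 402, and 403 can be pictured as three pluggable callables wired together in sequence. This is only a structural sketch: the class names `PendingInfo` and `InteractionApparatus`, the field names, and the callable signatures are assumptions, not part of the source.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PendingInfo:
    """To-be-processed information: text plus an image (raw bytes here)."""
    text: str
    image: bytes


@dataclass
class InteractionApparatus:
    acquire: Callable[[], PendingInfo]       # role of acquiring unit 401
    describe: Callable[[PendingInfo], str]   # role of description unit 402
    build_response: Callable[[str], str]     # role of construction unit 403

    def run(self) -> str:
        """Acquire the information, describe its image, build the response."""
        info = self.acquire()
        description = self.describe(info)
        return self.build_response(description)
```

Each callable can be swapped independently, for example to replace a local image library with a network search inside the description step.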
- the description information acquiring unit 402 may include: a semantic recognition subunit (not shown in the figure) and a feature word extraction subunit (not shown in the figure).
- the semantic recognition sub-unit is configured to perform semantic recognition on the text information to obtain semantic information corresponding to the text information
- the feature word extraction sub-unit is configured to extract feature words from the semantic information.
- The foregoing description information acquiring unit 402 may include: a to-be-matched image acquisition subunit (not shown in the figure), a semantic tag acquisition subunit (not shown in the figure), and a description information obtaining subunit (not shown in the figure).
- The to-be-matched image acquisition subunit is configured to import the image into an image search model to obtain a to-be-matched image set corresponding to the image, where the to-be-matched image set includes at least one image to be matched, and the image search model is used to represent a first correspondence between the image and the image to be matched;
- the semantic tag acquisition subunit is configured to import the image to be matched into the semantic tag model, and obtain a semantic tag set corresponding to the image set to be matched, wherein the semantic tag model is used to represent the image to be matched and the semantics a second correspondence of the label, the semantic label is used for text description of the image to be matched;
- The description information obtaining subunit is configured to select a semantic tag to be identified from the semantic tag set and to use the interpretation information of the noun corresponding to the image in the semantic tag to be identified as the description information.
- The foregoing description information obtaining subunit may be configured to count the number of identical semantic tags in the semantic tag set and to use the semantic tag with the highest count as the semantic tag to be identified.
- The apparatus for information interaction may further include a correction unit (not shown) for correcting the description information.
- The correction unit may include: a feedback information receiving subunit, an accuracy information acquisition subunit, a secondary to-be-identified tag acquisition subunit, a secondary description information acquiring subunit, and a secondary response information construction subunit.
- The feedback information receiving subunit is configured to receive feedback information corresponding to the response information, where the feedback information is used to evaluate the accuracy of the response information;
- The accuracy information acquisition subunit is configured to perform semantic recognition on the feedback information to obtain accuracy rate information;
- The secondary to-be-identified tag acquisition subunit is configured to select, when the accuracy rate information is lower than the set threshold, a secondary to-be-identified tag from the semantic tags in the semantic tag set other than the semantic tag to be identified;
- The secondary description information acquiring subunit is configured to use the interpretation information of the noun corresponding to the image in the secondary to-be-identified tag as the secondary description information;
- The secondary response information construction subunit is configured to construct the response information of the to-be-processed information from the secondary description information.
- the embodiment further provides a server comprising: one or more processors; a memory for storing one or more programs, when the one or more programs are executed by the one or more processors, One or more processors perform the methods described above for information interaction.
- the embodiment further provides a computer readable storage medium having stored thereon a computer program that, when executed by the processor, implements the above-described method for information interaction.
- Referring now to FIG. 5, there is shown a block diagram of a computer system 500 suitable for implementing the server of the embodiments of the present application.
- the server shown in FIG. 5 is merely an example, and should not impose any limitation on the function and scope of use of the embodiments of the present application.
- The computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage portion 508 into a random access memory (RAM) 503.
- The RAM 503 also stores various programs and data required for the operation of the system 500.
- the CPU 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504.
- An input/output (I/O) interface 505 is also coupled to bus 504.
- The following components are connected to the I/O interface 505: an input portion 506 including a keyboard, a mouse, and the like; an output portion 507 including, for example, a cathode ray tube (CRT), a liquid crystal display (LCD), and a speaker; a storage portion 508 including a hard disk or the like; and a communication portion 509 including a network interface card such as a LAN card or a modem. The communication portion 509 performs communication processing via a network such as the Internet.
- A drive 510 is also connected to the I/O interface 505 as needed.
- a removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like is mounted on the drive 510 as needed so that a computer program read therefrom is installed into the storage portion 508 as needed.
- an embodiment of the present disclosure includes a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for executing the method illustrated in the flowchart.
- the computer program can be downloaded and installed from the network via the communication portion 509, and/or installed from the removable medium 511.
- the computer readable medium described above may be a computer readable signal medium or a computer readable storage medium or any combination of the two.
- The computer readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer readable storage media include, but are not limited to, an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
- a computer readable storage medium may be any tangible medium that can contain or store a program, which can be used by or in connection with an instruction execution system, apparatus or device.
- a computer readable signal medium may include a data signal that is propagated in the baseband or as part of a carrier, carrying computer readable program code. Such propagated data signals can take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- the computer readable signal medium can also be any computer readable medium other than a computer readable storage medium, which can transmit, propagate, or transport a program for use by or in connection with the instruction execution system, apparatus, or device.
- Program code embodied on a computer readable medium can be transmitted by any suitable medium, including but not limited to wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
- Each block of the flowchart or block diagram can represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logic function.
- In some alternative implementations, the functions noted in the blocks may also occur in an order different from that shown in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved.
- Each block of the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
- the units involved in the embodiments of the present application may be implemented by software or by hardware.
- The described units may also be provided in a processor; for example, this may be described as: a processor including a to-be-processed information acquisition unit, a description information acquisition unit, and a response information construction unit.
- The names of these units do not, in some cases, limit the units themselves; for example, the response information construction unit may also be described as "a unit for constructing response information."
- the present application also provides a computer readable medium, which may be included in the apparatus described in the above embodiments, or may be separately present and not incorporated into the apparatus.
- The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquire to-be-processed information, the to-be-processed information including text information and an image; extract a feature word from the text information of the to-be-processed information and search for description information of the image of the to-be-processed information based on the feature word, where the feature word represents a search request for the image and the description information is a text description of the image; and construct the response information of the to-be-processed information from the description information.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Library & Information Science (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Processing Or Creating Images (AREA)
- Information Transfer Between Computers (AREA)
Abstract
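Embodiments of the present application disclose a method and apparatus for information interaction. One specific implementation of the method includes: acquiring to-be-processed information, the to-be-processed information including text information and an image; extracting a feature word from the text information of the to-be-processed information, and searching for description information of the image of the to-be-processed information based on the feature word, where the feature word represents a search request for the image and the description information is a text description of the image; and constructing response information for the to-be-processed information from the description information. By constructing the response information from the description information, this implementation achieves information interaction with the to-be-processed information and improves the efficiency of information interaction.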
Description
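This patent application claims priority to Chinese Patent Application No. 201710847084.6, filed on September 19, 2017 by the applicant Baidu Online Network Technology (Beijing) Co., Ltd. and entitled "Method and Apparatus for Information Interaction" (用于信息交互的方法及装置), the entire content of which is incorporated herein by reference.
The present application relates to the field of data processing technology, specifically to the field of information interaction technology, and in particular to a method and apparatus for information interaction.
Image recognition is a rapidly developing technology. One important direction is to understand and recognize the content of an image in order to meet users' need to know what it shows. Existing image recognition usually imports the image to be recognized into an image search engine in order to find identical or similar images, that is, it searches for images by image.
In everyday work, not every image calls for a search-by-image operation; whether one is needed depends on the actual situation. In many cases, a user who conveys information through text and an image may not establish a direct correspondence between the two (for example, the message contains both text and an image, but the text does not explain the image), and users who see the text and image may be unable to give corresponding feedback (for example, because they do not know what the image shows). Information is then easily conveyed incorrectly or inaccurately, and information interaction is inefficient.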
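Summary
An objective of the embodiments of the present application is to provide a method and apparatus for information interaction that solve the technical problems mentioned in the Background section above.
In a first aspect, an embodiment of the present application provides a method for information interaction, the method including: receiving a web page browsing request from a user, where the web page browsing request includes a web address; acquiring to-be-processed information, the to-be-processed information including text information and an image; extracting a feature word from the text information of the to-be-processed information, and searching for description information of the image of the to-be-processed information based on the feature word, where the feature word represents a search request for the image and the description information is a text description of the image; and constructing response information for the to-be-processed information from the description information.
In some embodiments, extracting the feature word from the text information of the to-be-processed information includes: performing semantic recognition on the text information to obtain semantic information corresponding to the text information; and extracting the feature word from the semantic information.
In some embodiments, searching for the description information of the image of the to-be-processed information based on the feature word includes: importing the image into an image search model to obtain a to-be-matched image set corresponding to the image, where the to-be-matched image set includes at least one to-be-matched image and the image search model represents a first correspondence between the image and the to-be-matched images; importing the to-be-matched images into a semantic label model to obtain a semantic label set corresponding to the to-be-matched image set, where the semantic label model represents a second correspondence between to-be-matched images and semantic labels, and a semantic label is a text description of a to-be-matched image; and selecting one to-be-identified semantic label from the semantic label set, and using the explanatory information of the noun in the to-be-identified semantic label that corresponds to the image as the description information.
In some embodiments, selecting one to-be-identified semantic label from the semantic label set includes: counting the number of identical semantic labels in the semantic label set, and using the most frequent semantic label as the to-be-identified semantic label.
In some embodiments, the method further includes a step of correcting the description information, which includes: receiving feedback information corresponding to the response information, the feedback information being used to evaluate the accuracy of the response information; performing semantic recognition on the feedback information to obtain accuracy information; when the accuracy information is below a set threshold, selecting a secondary to-be-identified label from the semantic labels in the semantic label set other than the to-be-identified semantic label; using the explanatory information of the noun in the secondary to-be-identified label that corresponds to the image as secondary description information; and constructing the response information of the to-be-processed information from the secondary description information.
In a second aspect, an embodiment of the present application provides an apparatus for information interaction, the apparatus including: a to-be-processed information acquisition unit for acquiring to-be-processed information, the to-be-processed information including text information and an image; a description information acquisition unit for extracting a feature word from the text information of the to-be-processed information and searching for description information of the image of the to-be-processed information based on the feature word, where the feature word represents a search request for the image and the description information is a text description of the image; and a response information construction unit for constructing response information for the to-be-processed information from the description information.
In some embodiments, the description information acquisition unit includes: a semantic recognition subunit for performing semantic recognition on the text information to obtain semantic information corresponding to the text information; and a feature word extraction subunit for extracting the feature word from the semantic information.
In some embodiments, the description information acquisition unit includes: a to-be-matched image acquisition subunit for importing the image into an image search model to obtain a to-be-matched image set corresponding to the image, where the to-be-matched image set includes at least one to-be-matched image and the image search model represents a first correspondence between the image and the to-be-matched images; a semantic label acquisition subunit for importing the to-be-matched images into a semantic label model to obtain a semantic label set corresponding to the to-be-matched image set, where the semantic label model represents a second correspondence between to-be-matched images and semantic labels, and a semantic label is a text description of a to-be-matched image; and a description information acquisition subunit for selecting one to-be-identified semantic label from the semantic label set and using the explanatory information of the noun in the to-be-identified semantic label that corresponds to the image as the description information.
In some embodiments, the description information acquisition subunit is configured to: count the number of identical semantic labels in the semantic label set, and use the most frequent semantic label as the to-be-identified semantic label.
In some embodiments, the apparatus further includes a correction unit for correcting the description information, the correction unit including: a feedback information receiving subunit for receiving feedback information corresponding to the response information, the feedback information being used to evaluate the accuracy of the response information; an accuracy information acquisition subunit for performing semantic recognition on the feedback information to obtain accuracy information; a secondary to-be-identified label acquisition subunit for selecting, when the accuracy information is below a set threshold, a secondary to-be-identified label from the semantic labels in the semantic label set other than the to-be-identified semantic label; a secondary description information acquisition subunit for using the explanatory information of the noun in the secondary to-be-identified label that corresponds to the image as secondary description information; and a secondary response information construction subunit for constructing the response information of the to-be-processed information from the secondary description information.
In a third aspect, an embodiment of the present application provides a server, including: one or more processors; and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to perform the method for information interaction of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method for information interaction of the first aspect.
The method and apparatus for information interaction provided by the embodiments of the present application extract a feature word from the text information of the to-be-processed information and obtain description information of the image of the to-be-processed information, thereby establishing a correspondence between the text information and the image in the to-be-processed information; the response information is then constructed from the description information, which achieves information interaction with the to-be-processed information and improves the efficiency of information interaction.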
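Other features, objectives, and advantages of the present application will become more apparent upon reading the detailed description of non-limiting embodiments given with reference to the following drawings:
FIG. 1 is a diagram of an exemplary system architecture to which the present application may be applied;
FIG. 2 is a flowchart of one embodiment of the method for information interaction according to the present application;
FIG. 3 is a schematic diagram of an application scenario of the method for information interaction according to the present application;
FIG. 4 is a schematic structural diagram of one embodiment of the apparatus for information interaction according to the present application;
FIG. 5 is a schematic structural diagram of a computer system suitable for implementing the server of the embodiments of the present application.
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the relevant invention.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with each other. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.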
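FIG. 1 shows an exemplary system architecture 100 to which embodiments of the method or apparatus for information interaction of the present application may be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104, for example to send to-be-processed information or receive response information. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as web browser applications, instant messaging tools, and social platform software.
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support information editing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.
The server 105 may be a server that provides various services, for example a server that processes the to-be-processed information on the terminal devices 101, 102, 103. The server may acquire the to-be-processed information on the terminal devices 101, 102, 103, extract a feature word from the text information of the to-be-processed information, and search for description information of the image of the to-be-processed information based on the feature word; finally, it constructs response information corresponding to the to-be-processed information from the description information, thereby achieving information interaction.
It should be noted that the method for information interaction provided by the embodiments of the present application may be executed by the terminal devices 101, 102, 103 alone, or jointly by the terminal devices 101, 102, 103 and the server 105. Accordingly, the apparatus for information interaction may be provided in the terminal devices 101, 102, 103 or in the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires.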
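With continued reference to FIG. 2, a flow 200 of one embodiment of the method for information interaction according to the present application is shown. The method for information interaction includes the following steps:
Step 201: acquire the to-be-processed information.
In this embodiment, the electronic device on which the method for information interaction runs (for example, the terminal devices 101, 102, 103 or the server 105 shown in FIG. 1) may acquire the to-be-processed information through a wired or wireless connection, where the to-be-processed information includes text information and an image. It should be pointed out that the wireless connection may include, but is not limited to, 3G/4G, WiFi, Bluetooth, WiMAX, Zigbee, and UWB (ultra wideband) connections, as well as other wireless connection methods now known or developed in the future.
A user may perform operations involving text and pictures in an information processing application on the terminal devices 101, 102, 103. For example, the user enters text information and an image in the application. The image may contain various objects (for example, it may be an image of a plant or of an animal), and the text information may be: "Just took this outside, never seen it before, who knows" (刚才外面拍的,没见过,谁知道). The terminal devices 101, 102, 103 or the server 105 can then treat this information, which contains both text information and an image, as the to-be-processed information.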
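As a minimal sketch of step 201, the to-be-processed information can be modeled as a record that exists only when a message carries both text and an image. The names `PendingInfo` and `to_pending_info` are hypothetical and do not come from the application; representing the image as raw bytes is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PendingInfo:
    """To-be-processed information: text plus an attached image (hypothetical type)."""
    text: str
    image: bytes  # raw image bytes; the encoding is an assumption of this sketch


def to_pending_info(text: Optional[str], image: Optional[bytes]) -> Optional[PendingInfo]:
    """Return a PendingInfo only when the message carries both text and an image."""
    if not text or not image:
        return None
    return PendingInfo(text=text.strip(), image=image)
```

A message with only text, or only an image, yields `None` and would simply not enter the later steps.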
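Step 202: extract a feature word from the text information of the to-be-processed information, and search for description information of the image of the to-be-processed information based on the feature word.
Once the to-be-processed information has been acquired, feature words must first be extracted from the text information it contains. A feature word represents a search request for the image; for example, a feature word may be "who knows" (谁知道) or "what is" (是什么).
Finding a feature word indicates that the user wants to know about the image. The description information of the image can then be looked up in various ways, where the description information is a text description of the image. The description information might read: "XXX (plant name), scientific name XX, genus Lilium of the family Liliaceae, a perennial herb native to China...".
In some optional implementations of this embodiment, extracting the feature words contained in the text information of the to-be-processed information may include the following steps:
Step 1: perform semantic recognition on the text information to obtain semantic information corresponding to the text information.
Taking the text information above ("Just took this outside, never seen it before, who knows") as an example again, semantic recognition of this text may yield the semantic information: "What is in the image".
Step 2: extract the feature words from the semantic information.
For the semantic information "What is in the image", the corresponding feature words may be "image" and "what is".
It should be noted that the text information of the to-be-processed information may also contain feature words directly. For example, the text information may be "Who knows what plant is in the picture" (谁知道图里是什么植物), in which "picture" (图), "what is" (是什么), and "plant" (植物) may serve as feature words.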
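The direct case described above, where the text already contains the feature words, can be sketched as a vocabulary lookup. This is only an illustration: the vocabulary `FEATURE_WORDS` is a hypothetical list built from the two examples given in the text, and plain substring matching stands in for the semantic recognition step, which the application does not specify in detail.

```python
from typing import List, Sequence

# Hypothetical vocabulary, taken from the examples "谁知道" and "是什么" in the text.
FEATURE_WORDS = ("谁知道", "是什么")


def extract_feature_words(text: str, vocabulary: Sequence[str] = FEATURE_WORDS) -> List[str]:
    """Return the vocabulary entries that occur in the text, in vocabulary order."""
    return [word for word in vocabulary if word in text]


def has_search_request(text: str) -> bool:
    """A message is treated as asking about its image if any feature word occurs."""
    return bool(extract_feature_words(text))
```

For the example sentence "谁知道图里是什么植物", both vocabulary entries match; a message without any of them would not trigger an image lookup.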
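In some optional implementations of this embodiment, searching for the description information of the image of the to-be-processed information based on the feature word may include the following steps:
Step 1: import the image into the image search model to obtain a to-be-matched image set corresponding to the image.
The image search model can extract the image features of an input image and then, through a local image library or over a network, find images that contain identical or similar image features to serve as to-be-matched images. The to-be-matched image set may include at least one to-be-matched image, and the image search model represents a first correspondence between the image and the to-be-matched images. The first correspondence may refer to the relation of identity or similarity between the imported image and a to-be-matched image.
Step 2: import the to-be-matched images into the semantic label model to obtain a semantic label set corresponding to the to-be-matched image set.
The to-be-matched images are existing images, found locally or on a network, that are identical or similar to the imported image. Once the to-be-matched images have been obtained, they can be imported into the semantic label model to obtain the semantic labels corresponding to them. The semantic label model represents a second correspondence between to-be-matched images and semantic labels, and a semantic label is a text description of a to-be-matched image. For example, the semantic label corresponding to a given to-be-matched image may be: "This is a lily" (这是百合花).
Step 3: select one to-be-identified semantic label from the semantic label set, and use the explanatory information of the noun in the to-be-identified semantic label that corresponds to the image as the description information.
Importing one image into the image search model yields at least one to-be-matched image, and each to-be-matched image has a semantic label, whereas the imported image needs only one semantic label. Therefore, the semantic label that best fits the imported image can be selected from the semantic label set and used as the to-be-identified semantic label. The explanatory information of the noun in that label that corresponds to the imported image can then be used as the description information. For example, if the to-be-identified semantic label is "This is a lily", then "lily" is the noun corresponding to the imported image. The explanatory information for "lily" can be obtained through a local or network query and used as the description information of the imported image.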
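The three steps above can be sketched as a small pipeline. The application does not define concrete interfaces for the image search model, the semantic label model, or the explanation lookup, so every callable below (`search_similar`, `label_image`, `pick_label`, `extract_noun`, `explain`) is a hypothetical stand-in passed in by the caller.

```python
from typing import Callable, List, Optional


def find_description(
    image: bytes,
    search_similar: Callable[[bytes], List[bytes]],  # stand-in for the image search model
    label_image: Callable[[bytes], str],             # stand-in for the semantic label model
    pick_label: Callable[[List[str]], str],          # chooses the to-be-identified label
    extract_noun: Callable[[str], str],              # pulls the noun out of a label
    explain: Callable[[str], Optional[str]],         # local or network explanation lookup
) -> Optional[str]:
    """Return description information for the image, or None if nothing is found."""
    candidates = search_similar(image)                     # step 1: to-be-matched image set
    if not candidates:
        return None
    labels = [label_image(match) for match in candidates]  # step 2: semantic label set
    chosen = pick_label(labels)                            # step 3: to-be-identified label
    return explain(extract_noun(chosen))                   # explanation of the noun
```

Passing the models in as functions keeps the sketch independent of any particular search engine or labeling model, which matches the text's statement that the lookup can be local or networked.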
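In some optional implementations of this embodiment, selecting one to-be-identified semantic label from the semantic label set may include: counting the number of identical semantic labels in the semantic label set, and using the most frequent semantic label as the to-be-identified semantic label.
Importing one image into the image search model yields at least one to-be-matched image, and each to-be-matched image has a semantic label. The to-be-matched images may differ from one another while their semantic labels are the same. For example, the to-be-matched images may be photographs of the same plant taken from different angles, in which case the resulting semantic labels may be identical. The more identical semantic labels there are, the more shooting angles from which the imported image matches the same photographed object. Therefore, the number of identical semantic labels in the semantic label set can be counted, and the most frequent semantic label used as the to-be-identified semantic label.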
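The counting rule described above amounts to a majority vote over the semantic label set. A minimal sketch using the standard library follows; the function name `select_label` is hypothetical, and how ties should be broken is not specified in the text (here the label seen first wins, which is how `Counter.most_common` orders equal counts in Python 3.7 and later).

```python
from collections import Counter
from typing import List


def select_label(labels: List[str]) -> str:
    """Return the most frequent semantic label in the set (first seen wins ties)."""
    if not labels:
        raise ValueError("semantic label set is empty")
    return Counter(labels).most_common(1)[0][0]
```

For a label set such as `["这是百合花", "这是玫瑰", "这是百合花"]`, the label "这是百合花" occurs twice and is selected.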
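Step 203: construct the response information of the to-be-processed information from the description information.
Once the description information has been obtained, the response information can be constructed from it. For example, the description information may be: "XXX (plant name), scientific name XX, genus Lilium of the family Liliaceae, a perennial herb native to China...", and the response information may be: "The image shows XXX (plant name); for more detailed information, see: https://xxx.xxx.com/item/%E7%99%B%88/7886?fr=aladdin&fromid=7780&fromtitle=%E7%99%BE%E5%90%88%E8%8A%B1". The response information can then be sent to the terminal devices 101, 102, 103, achieving information interaction with the to-be-processed information on the terminal devices 101, 102, 103.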
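A minimal sketch of step 203, assuming the response is a short sentence naming the object, optionally followed by a reference link. The function `build_response`, its parameters, and the exact wording of the sentence are hypothetical; the application only gives one example of what a response may look like.

```python
def build_response(name: str, reference_url: str = "") -> str:
    """Build response text from the recognized name and an optional reference link."""
    response = f"The image shows {name}."
    if reference_url:
        response += f" For more detailed information, see: {reference_url}"
    return response
```

Keeping the link optional lets the same function serve cases where only the name was found and no explanation page is available.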
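In some optional implementations of this embodiment, the method further includes a step of correcting the description information, which may include:
Step 1: receive feedback information corresponding to the response information, where the feedback information is used to evaluate the accuracy of the response information.
After the response information has been returned to the terminal devices 101, 102, 103, other users can judge whether it is accurate or correct. For example, a judgment may be: "Yes, that's XXX, we have one at home too", "That's not right, it should be XXX", or "Maybe it's YYY, it doesn't look like XXX". These user judgments can be regarded as feedback information on the response information.
Step 2: perform semantic recognition on the feedback information to obtain accuracy information.
Different feedback information may indicate different accuracy. For example, the semantic recognition result of "Yes, that's XXX, we have one at home too" may be "response information correct"; that of "That's not right, it should be XXX" may be "response information incorrect"; and that of "Maybe it's YYY, it doesn't look like XXX" may be "response information uncertain". Accordingly, the accuracy of "Yes, that's XXX, we have one at home too" may be set to 100%, that of "That's not right, it should be XXX" to 0%, and that of "Maybe it's YYY, it doesn't look like XXX" to 50%. Aggregating all the accuracy values over a period of time gives the accuracy information of the response information. For example, if there are 8 values of 100%, 1 value of 0%, and 1 value of 50%, the accuracy information may be: (8*100%+1*0%+1*50%)/10=85%.
Step 3: when the accuracy information is below the set threshold, select a secondary to-be-identified label from the semantic labels in the semantic label set other than the to-be-identified semantic label.
The accuracy information reflects how correct the response information is. When the accuracy information is above the set threshold, the response information can be considered correct. For example, if 8 of 10 feedback messages judge the response information correct, the response information can be treated directly as 100% correct. When the accuracy information is below a set threshold (for example, 69%), the response information can be considered incorrect. In that case, another semantic label can be selected as the secondary to-be-identified label from the semantic labels in the semantic label set other than the to-be-identified semantic label that corresponds to the response information.
Step 4: use the explanatory information of the noun in the secondary to-be-identified label as the secondary description information.
As in the process above, the explanatory information of the noun in the secondary to-be-identified label can be used as the secondary description information.
Step 5: construct the response information of the to-be-processed information from the secondary description information.
The response information can be reconstructed from the secondary description information and then sent to the terminal devices 101, 102, 103 again.
It should be noted that if the accuracy information of the feedback on the response information obtained from the secondary description information is still below the set threshold, further to-be-identified labels can be selected to obtain response information, until the corresponding accuracy information is above the set threshold.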
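The correction loop above can be sketched in two small functions: one averages the per-feedback accuracy values, the other picks the next candidate label once the current one has been rejected. The names `accuracy_info` and `pick_secondary_label` are hypothetical. The text only says the secondary label is chosen from the remaining labels; choosing the most frequent remaining label is an assumption of this sketch.

```python
from collections import Counter
from typing import Iterable, List, Optional, Set


def accuracy_info(scores: Iterable[float]) -> float:
    """Average the per-feedback accuracy values (each between 0.0 and 1.0)."""
    values = list(scores)
    if not values:
        raise ValueError("no feedback received")
    return sum(values) / len(values)


def pick_secondary_label(labels: List[str], rejected: Set[str]) -> Optional[str]:
    """Most frequent label not yet rejected, or None when no candidates remain."""
    remaining = [label for label in labels if label not in rejected]
    if not remaining:
        return None
    return Counter(remaining).most_common(1)[0][0]


# Worked example from the text: eight values of 100%, one of 0%, one of 50%.
scores = [1.0] * 8 + [0.0] + [0.5]
THRESHOLD = 0.69  # the example threshold mentioned in the text
needs_correction = accuracy_info(scores) < THRESHOLD  # the average is 85%, so no correction
```

Tracking rejected labels in a set makes the repeat case straightforward: each time the accuracy stays below the threshold, the current label is added to `rejected` and the function is called again until it returns `None` or a response passes.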
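Once the response information has been confirmed correct, a correspondence between the image in the to-be-processed information and the response information can be established and saved to a database, to support later queries of the image and the response information.
With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the method for information interaction according to this embodiment. In the application scenario of FIG. 3, a user posts a message in the Plant Bar forum (植物吧) through terminal device 102: "Just photographed this nearby, quite pretty, don't know what it is, does anyone know?", and attaches the corresponding image. The server 105 acquires the content sent by the user and takes that content as the to-be-processed information. The server 105 then extracts the feature word "what is" (是什么) from the text information; after that, the server 105 obtains the description information corresponding to the image, constructs the response information from the description information, and sends it to terminal device 102.
The method provided by the above embodiment of the present application can extract a feature word from the text information of the to-be-processed information and obtain description information of the image of the to-be-processed information, thereby establishing a correspondence between the text information and the image in the to-be-processed information; the response information is then constructed from the description information, achieving information interaction with the to-be-processed information.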
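With further reference to FIG. 4, as an implementation of the methods shown in the figures above, the present application provides one embodiment of an apparatus for information interaction. This apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied in various electronic devices.
As shown in FIG. 4, the apparatus 400 for information interaction of this embodiment may include: a to-be-processed information acquisition unit 401, a description information acquisition unit 402, and a response information construction unit 403. The to-be-processed information acquisition unit 401 is used to acquire to-be-processed information, the to-be-processed information including text information and an image; the description information acquisition unit 402 is used to extract a feature word from the text information of the to-be-processed information and to search for description information of the image of the to-be-processed information based on the feature word, where the feature word represents a search request for the image and the description information is a text description of the image; and the response information construction unit 403 is used to construct the response information of the to-be-processed information from the description information.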
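The division of apparatus 400 into units 401, 402, and 403 can be illustrated as three pluggable callables wired together in order. This is a structural sketch only: the class `InteractionApparatus`, its field names, and the tuple-based representation of the to-be-processed information are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class InteractionApparatus:
    acquire: Callable[[], Optional[Tuple[str, bytes]]]  # role of unit 401: (text, image)
    describe: Callable[[str, bytes], Optional[str]]     # role of unit 402: description lookup
    build: Callable[[str], str]                         # role of unit 403: response text

    def run(self) -> Optional[str]:
        """Run the units in order; return None if any stage yields nothing."""
        pending = self.acquire()
        if pending is None:
            return None
        text, image = pending
        description = self.describe(text, image)
        if description is None:
            return None
        return self.build(description)
```

Because each unit is just a callable, the same structure works whether the units run on a terminal device or on the server, in line with the text's remark that the apparatus may be placed in either.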
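In some optional implementations of this embodiment, the description information acquisition unit 402 may include: a semantic recognition subunit (not shown) and a feature word extraction subunit (not shown). The semantic recognition subunit is used to perform semantic recognition on the text information to obtain semantic information corresponding to the text information; the feature word extraction subunit is used to extract the feature word from the semantic information.
In some optional implementations of this embodiment, the description information acquisition unit 402 may include: a to-be-matched image acquisition subunit (not shown), a semantic label acquisition subunit (not shown), and a description information acquisition subunit (not shown). The to-be-matched image acquisition subunit is used to import the image into the image search model to obtain a to-be-matched image set corresponding to the image, where the to-be-matched image set includes at least one to-be-matched image and the image search model represents a first correspondence between the image and the to-be-matched images; the semantic label acquisition subunit is used to import the to-be-matched images into the semantic label model to obtain a semantic label set corresponding to the to-be-matched image set, where the semantic label model represents a second correspondence between to-be-matched images and semantic labels, and a semantic label is a text description of a to-be-matched image; the description information acquisition subunit is used to select one to-be-identified semantic label from the semantic label set and to use the explanatory information of the noun in the to-be-identified semantic label that corresponds to the image as the description information.
In some optional implementations of this embodiment, the description information acquisition subunit may be configured to: count the number of identical semantic labels in the semantic label set, and use the most frequent semantic label as the to-be-identified semantic label.
In some optional implementations of this embodiment, the apparatus 400 for information interaction may further include a correction unit (not shown) for correcting the description information. The correction unit may include: a feedback information receiving subunit, an accuracy information acquisition subunit, a secondary to-be-identified label acquisition subunit, a secondary description information acquisition subunit, and a secondary response information construction subunit. The feedback information receiving subunit is used to receive feedback information corresponding to the response information, where the feedback information is used to evaluate the accuracy of the response information; the accuracy information acquisition subunit is used to perform semantic recognition on the feedback information to obtain accuracy information; the secondary to-be-identified label acquisition subunit is used to select, when the accuracy information is below the set threshold, a secondary to-be-identified label from the semantic labels in the semantic label set other than the to-be-identified semantic label; the secondary description information acquisition subunit is used to use the explanatory information of the noun in the secondary to-be-identified label that corresponds to the image as the secondary description information; and the secondary response information construction subunit is used to construct the response information of the to-be-processed information from the secondary description information.
This embodiment further provides a server, including: one or more processors; and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to perform the method for information interaction described above.
This embodiment further provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the method for information interaction described above.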
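Referring now to FIG. 5, a schematic structural diagram of a computer system 500 suitable for implementing the server of the embodiments of the present application is shown. The server shown in FIG. 5 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage portion 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the system 500. The CPU 501, the ROM 502, and the RAM 503 are connected to one another through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input portion 506 including a keyboard, a mouse, and the like; an output portion 507 including, for example, a cathode ray tube (CRT), a liquid crystal display (LCD), and a speaker; a storage portion 508 including a hard disk or the like; and a communication portion 509 including a network interface card such as a LAN card or a modem. The communication portion 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read from it can be installed into the storage portion 508 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 509, and/or installed from the removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the functions defined in the method of the present application are performed.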
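It should be noted that the computer readable medium described in the present application may be a computer readable signal medium, a computer readable storage medium, or any combination of the two. The computer readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer readable storage media include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device. In the present application, a computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer readable program code. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium other than a computer readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the foregoing.
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logic function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.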
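The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, this may be described as: a processor including a to-be-processed information acquisition unit, a description information acquisition unit, and a response information construction unit. The names of these units do not, in some cases, limit the units themselves; for example, the response information construction unit may also be described as "a unit for constructing response information".
As another aspect, the present application further provides a computer readable medium, which may be included in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquire to-be-processed information, the to-be-processed information including text information and an image; extract a feature word from the text information of the to-be-processed information and search for description information of the image of the to-be-processed information based on the feature word, where the feature word represents a search request for the image and the description information is a text description of the image; and construct the response information of the to-be-processed information from the description information.
The above description covers only the preferred embodiments of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the present application.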
Claims (12)
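- A method for information interaction, characterized in that the method comprises: acquiring to-be-processed information, the to-be-processed information comprising text information and an image; extracting a feature word from the text information of the to-be-processed information, and searching for description information of the image of the to-be-processed information based on the feature word, wherein the feature word represents a search request for the image and the description information is a text description of the image; and constructing response information for the to-be-processed information from the description information.
- The method according to claim 1, characterized in that extracting the feature word from the text information of the to-be-processed information comprises: performing semantic recognition on the text information to obtain semantic information corresponding to the text information; and extracting the feature word from the semantic information.
- The method according to claim 1, characterized in that searching for the description information of the image of the to-be-processed information based on the feature word comprises: importing the image into an image search model to obtain a to-be-matched image set corresponding to the image, the to-be-matched image set comprising at least one to-be-matched image, the image search model representing a first correspondence between the image and the to-be-matched images; importing the to-be-matched images into a semantic label model to obtain a semantic label set corresponding to the to-be-matched image set, the semantic label model representing a second correspondence between to-be-matched images and semantic labels, a semantic label being a text description of a to-be-matched image; and selecting one to-be-identified semantic label from the semantic label set, and using the explanatory information of the noun in the to-be-identified semantic label that corresponds to the image as the description information.
- The method according to claim 3, characterized in that selecting one to-be-identified semantic label from the semantic label set comprises: counting the number of identical semantic labels in the semantic label set, and using the most frequent semantic label as the to-be-identified semantic label.
- The method according to claim 4, characterized in that the method further comprises a step of correcting the description information, the step of correcting the description information comprising: receiving feedback information corresponding to the response information, the feedback information being used to evaluate the accuracy of the response information; performing semantic recognition on the feedback information to obtain accuracy information; when the accuracy information is below a set threshold, selecting a secondary to-be-identified label from the semantic labels in the semantic label set other than the to-be-identified semantic label; using the explanatory information of the noun in the secondary to-be-identified label that corresponds to the image as secondary description information; and constructing the response information of the to-be-processed information from the secondary description information.
- An apparatus for information interaction, characterized in that the apparatus comprises: a to-be-processed information acquisition unit for acquiring to-be-processed information, the to-be-processed information comprising text information and an image; a description information acquisition unit for extracting a feature word from the text information of the to-be-processed information and searching for description information of the image of the to-be-processed information based on the feature word, wherein the feature word represents a search request for the image and the description information is a text description of the image; and a response information construction unit for constructing response information for the to-be-processed information from the description information.
- The apparatus according to claim 6, characterized in that the description information acquisition unit comprises: a semantic recognition subunit for performing semantic recognition on the text information to obtain semantic information corresponding to the text information; and a feature word extraction subunit for extracting the feature word from the semantic information.
- The apparatus according to claim 6, characterized in that the description information acquisition unit comprises: a to-be-matched image acquisition subunit for importing the image into an image search model to obtain a to-be-matched image set corresponding to the image, the to-be-matched image set comprising at least one to-be-matched image, the image search model representing a first correspondence between the image and the to-be-matched images; a semantic label acquisition subunit for importing the to-be-matched images into a semantic label model to obtain a semantic label set corresponding to the to-be-matched image set, the semantic label model representing a second correspondence between to-be-matched images and semantic labels, a semantic label being a text description of a to-be-matched image; and a description information acquisition subunit for selecting one to-be-identified semantic label from the semantic label set and using the explanatory information of the noun in the to-be-identified semantic label that corresponds to the image as the description information.
- The apparatus according to claim 8, characterized in that the description information acquisition subunit is configured to: count the number of identical semantic labels in the semantic label set, and use the most frequent semantic label as the to-be-identified semantic label.
- The apparatus according to claim 9, characterized in that the apparatus further comprises a correction unit for correcting the description information, the correction unit comprising: a feedback information receiving subunit for receiving feedback information corresponding to the response information, the feedback information being used to evaluate the accuracy of the response information; an accuracy information acquisition subunit for performing semantic recognition on the feedback information to obtain accuracy information; a secondary to-be-identified label acquisition subunit for selecting, when the accuracy information is below a set threshold, a secondary to-be-identified label from the semantic labels in the semantic label set other than the to-be-identified semantic label; a secondary description information acquisition subunit for using the explanatory information of the noun in the secondary to-be-identified label that corresponds to the image as secondary description information; and a secondary response information construction subunit for constructing the response information of the to-be-processed information from the secondary description information.
- A server, comprising: one or more processors; and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to perform the method according to any one of claims 1 to 5.
- A computer readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 5.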
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP18836804.7A EP3480704A4 (en) | 2017-09-19 | 2018-06-26 | PROCESS AND DEVICE FOR INFORMATION INTERACTION |
| JP2019504024A JP6783375B2 (ja) | 2017-09-19 | 2018-06-26 | 情報インタラクションのための方法および装置 |
| US16/265,303 US20190163699A1 (en) | 2017-09-19 | 2019-02-01 | Method and apparatus for information interaction |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201710847084.6A CN107590252A (zh) | 2017-09-19 | 2017-09-19 | 用于信息交互的方法及装置 |
| CN201710847084.6 | 2017-09-19 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US16/265,303 Continuation US20190163699A1 (en) | 2017-09-19 | 2019-02-01 | Method and apparatus for information interaction |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2019056821A1 true WO2019056821A1 (zh) | 2019-03-28 |
Family
ID=61047238
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2018/092870 Ceased WO2019056821A1 (zh) | 2017-09-19 | 2018-06-26 | 用于信息交互的方法及装置 |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US20190163699A1 (zh) |
| EP (1) | EP3480704A4 (zh) |
| JP (1) | JP6783375B2 (zh) |
| CN (1) | CN107590252A (zh) |
| WO (1) | WO2019056821A1 (zh) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110427460A (zh) * | 2019-08-06 | 2019-11-08 | 北京百度网讯科技有限公司 | 用于交互信息的方法及装置 |
| CN112905825A (zh) * | 2019-12-04 | 2021-06-04 | 上海博泰悦臻电子设备制造有限公司 | 用于信息处理的方法、设备和计算机存储介质 |
Families Citing this family (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108334498A (zh) * | 2018-02-07 | 2018-07-27 | 百度在线网络技术(北京)有限公司 | 用于处理语音请求的方法和装置 |
| CN111931510B (zh) * | 2019-04-25 | 2024-08-20 | 广东小天才科技有限公司 | 一种基于神经网络的意图识别方法及装置、终端设备 |
| CN111538862B (zh) * | 2020-05-15 | 2023-06-20 | 北京百度网讯科技有限公司 | 用于解说视频的方法及装置 |
| CN113052561A (zh) * | 2021-04-01 | 2021-06-29 | 苏州惟信易量智能科技有限公司 | 一种基于可穿戴设备的流程控制系统及方法 |
| CN113034114A (zh) * | 2021-04-01 | 2021-06-25 | 苏州惟信易量智能科技有限公司 | 一种基于可穿戴设备的流程控制系统及方法 |
| CN113034113B (zh) * | 2021-04-01 | 2024-10-22 | 苏州惟信易量智能科技有限公司 | 一种基于可穿戴设备的流程控制系统及方法 |
| CN117216308B (zh) * | 2023-11-09 | 2024-04-26 | 天津华来科技股份有限公司 | 基于大模型的搜索方法、系统、设备及介质 |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1952935A (zh) * | 2006-09-22 | 2007-04-25 | 南京搜拍信息技术有限公司 | 综合利用图像及文字信息的搜索系统及搜索方法 |
| CN102402582A (zh) * | 2010-09-30 | 2012-04-04 | 微软公司 | 提供与相关媒体项相关联的对象和个体之间的关联 |
| US8452794B2 (en) * | 2009-02-11 | 2013-05-28 | Microsoft Corporation | Visual and textual query suggestion |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101986293B (zh) * | 2010-09-03 | 2016-08-24 | 百度在线网络技术(北京)有限公司 | 用于在搜索界面中呈现搜索答案信息的方法及设备 |
| US9098533B2 (en) * | 2011-10-03 | 2015-08-04 | Microsoft Technology Licensing, Llc | Voice directed context sensitive visual search |
| US10223454B2 (en) * | 2013-05-01 | 2019-03-05 | Cloudsight, Inc. | Image directed search |
| JP2017037557A (ja) * | 2015-08-12 | 2017-02-16 | 富士ゼロックス株式会社 | 情報処理装置及びプログラム |
| CN105512220B (zh) * | 2015-11-30 | 2018-12-11 | 小米科技有限责任公司 | 图像页面输出方法及装置 |
| CN106777177A (zh) * | 2016-12-22 | 2017-05-31 | 百度在线网络技术(北京)有限公司 | 检索方法和装置 |
| CN115357748B (zh) * | 2017-01-17 | 2025-10-21 | 腾讯科技(上海)有限公司 | 头戴式装置 |
2017
- 2017-09-19 CN CN201710847084.6A patent/CN107590252A/zh active Pending
2018
- 2018-06-26 WO PCT/CN2018/092870 patent/WO2019056821A1/zh not_active Ceased
- 2018-06-26 EP EP18836804.7A patent/EP3480704A4/en not_active Ceased
- 2018-06-26 JP JP2019504024A patent/JP6783375B2/ja active Active
2019
- 2019-02-01 US US16/265,303 patent/US20190163699A1/en not_active Abandoned
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1952935A (zh) * | 2006-09-22 | 2007-04-25 | 南京搜拍信息技术有限公司 | 综合利用图像及文字信息的搜索系统及搜索方法 |
| US8452794B2 (en) * | 2009-02-11 | 2013-05-28 | Microsoft Corporation | Visual and textual query suggestion |
| CN102402582A (zh) * | 2010-09-30 | 2012-04-04 | 微软公司 | 提供与相关媒体项相关联的对象和个体之间的关联 |
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP3480704A4 * |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110427460A (zh) * | 2019-08-06 | 2019-11-08 | 北京百度网讯科技有限公司 | 用于交互信息的方法及装置 |
| CN112905825A (zh) * | 2019-12-04 | 2021-06-04 | 上海博泰悦臻电子设备制造有限公司 | 用于信息处理的方法、设备和计算机存储介质 |
| CN112905825B (zh) * | 2019-12-04 | 2023-03-21 | 博泰车联网科技(上海)股份有限公司 | 用于信息处理的方法、设备和计算机存储介质 |
Also Published As
| Publication number | Publication date |
|---|---|
| EP3480704A4 (en) | 2019-09-18 |
| JP2019536122A (ja) | 2019-12-12 |
| US20190163699A1 (en) | 2019-05-30 |
| EP3480704A1 (en) | 2019-05-08 |
| JP6783375B2 (ja) | 2020-11-11 |
| CN107590252A (zh) | 2018-01-16 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2019056821A1 (zh) | 用于信息交互的方法及装置 | |
| US11899681B2 (en) | Knowledge graph building method, electronic apparatus and non-transitory computer readable storage medium | |
| US10803861B2 (en) | Method and apparatus for identifying information | |
| US20190005126A1 (en) | Artificial intelligence based method and apparatus for processing information | |
| CN109145280B (zh) | 信息推送的方法和装置 | |
| CN108229704B (zh) | 用于推送信息的方法和装置 | |
| CN109189938B (zh) | 用于更新知识图谱的方法和装置 | |
| US11011163B2 (en) | Method and apparatus for recognizing voice | |
| CN111522927B (zh) | 基于知识图谱的实体查询方法和装置 | |
| US11055373B2 (en) | Method and apparatus for generating information | |
| US11003731B2 (en) | Method and apparatus for generating information | |
| CN107919129A (zh) | 用于控制页面的方法和装置 | |
| CN108287927B (zh) | 用于获取信息的方法及装置 | |
| WO2020044099A1 (zh) | 一种基于对象识别的业务处理方法和装置 | |
| CN109934242A (zh) | 图片识别方法和装置 | |
| CN108268450B (zh) | 用于生成信息的方法和装置 | |
| CN112214695A (zh) | 信息处理方法、装置和电子设备 | |
| CN110019906B (zh) | 用于显示信息的方法和装置 | |
| CN108667915B (zh) | 信息推送系统、方法和装置 | |
| CN107622766B (zh) | 用于搜索信息的方法和装置 | |
| KR20210080561A (ko) | 컨설팅 정보 처리 방법 및 장치 | |
| CN108920707B (zh) | 用于标注信息的方法及装置 | |
| CN107885872B (zh) | 用于生成信息的方法和装置 | |
| WO2022012107A1 (zh) | 用于显示插件的方法及装置 | |
| CN110796137A (zh) | 一种识别图像的方法和装置 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| ENP | Entry into the national phase |
Ref document number: 2019504024 Country of ref document: JP Kind code of ref document: A |
|
| ENP | Entry into the national phase |
Ref document number: 2018836804 Country of ref document: EP Effective date: 20190201 |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |