WO2018023212A1 - Image recognition method and terminal - Google Patents

Image recognition method and terminal

Info

Publication number
WO2018023212A1
Authority
WO
WIPO (PCT)
Prior art keywords
terminal
information
target
image file
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2016/092464
Other languages
English (en)
French (fr)
Inventor
李昌竹
王细勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201680087732.6A (CN109478311A)
Priority to US16/321,960 (US11132545B2)
Priority to EP16910810.7A (EP3486863A4)
Priority to PCT/CN2016/092464 (WO2018023212A1)
Publication of WO2018023212A1
Anticipated expiration
Priority to US17/240,103 (US11804053B2)
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/40 Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/94 Hardware or software architectures specially adapted for image or video understanding
    • G06V10/95 Hardware or software architectures specially adapted for image or video understanding structured as a network, e.g. client-server architectures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842 Selection of displayed objects or displayed text elements

Definitions

  • the present invention relates to the field of image recognition technologies, and in particular, to an image recognition method and a terminal.
  • Label information can be attached to image files automatically.
  • The image needs to be uploaded to the server; the server identifies the image, labels it with label information, and feeds the label information back to the terminal.
  • Because the terminal needs to upload the image to the server for labeling, this approach does not protect the privacy of end users.
  • the embodiment of the invention discloses an image recognition method and a terminal, which can improve the efficiency of the image recognition by the terminal and effectively protect the privacy of the terminal user.
  • A first aspect of the embodiments of the present invention discloses an image recognition method, including: a terminal acquires an image file including a target object; the terminal identifies the target object, by using a neural network computing device in the terminal and according to an image recognition model in the terminal, to obtain object type information of the target object; and the terminal saves the object type information to the image file as first tag information of the target object.
  • The terminal can identify the target object in the image file according to the image recognition model in the terminal by using the neural network computing device in the terminal, obtain the item category information of the target object, and save the item category information to the image file as the first label information.
  • The terminal can identify the target object through the neural network computing device in the terminal, thereby improving the efficiency of image recognition by the terminal and effectively protecting the privacy of the end user.
  • The acquiring, by the terminal, of an image file that includes the target object includes: the terminal determines, according to a selection operation performed by a user on an image displayed on the terminal, the target object corresponding to the selection operation; and the terminal generates an image file containing the target object.
  • the terminal can interact with the user in the above manner.
  • If the selection operation corresponds to n target objects, where n is a positive integer, the generating, by the terminal, of an image file that includes the target object includes: the terminal generates one image file containing the n target objects; or the terminal generates m image files, each containing at least one of the n target objects, where m is an integer and m ≤ n.
  • Before the terminal identifies the target object according to the image recognition model in the terminal to obtain the object category information of the target object, the method further includes: the terminal receives and stores the image recognition model sent by a first server; or the terminal performs image recognition training on pictures in the terminal by using the neural network computing device to obtain the image recognition model.
  • The terminal can perform image recognition training in a variety of ways, which helps to obtain a more accurate image recognition model.
  • The method further includes: the terminal sends the first tag information of the target object to the first server, so that the first tag information is shared on the first server.
  • the terminal can share the workload of the server by sharing the label information, and can realize information sharing with other terminals.
  • Before the terminal sends the first label information of the target object to the first server, the method further includes: the terminal revises the first label information according to a revision operation performed by the user on the first label information; and the sending, by the terminal, of the first label information of the target object to the first server includes: the terminal sends the revised first label information to the first server.
  • The terminal thereby further improves the accuracy of the generated tags and, by sharing them through the server, takes over part of the revision workload that would otherwise fall on the server.
  • The method further includes: if the terminal receives first label revision information sent by the first server, the terminal updates the first label information to the first label revision information.
  • the terminal can also obtain more accurate tag information through the server.
  • The method further includes: the terminal searches target object detail information for second tag information that matches the first tag information; the terminal determines whether the second tag information is found; if yes, the terminal saves the second tag information to the image file and labels the target object in the image file according to the second tag information; if not, the terminal labels the target object in the image file according to the first tag information.
  • the terminal can obtain more detailed label information by the above manner.
  • a second aspect of an embodiment of the present invention discloses a terminal, the terminal comprising means for performing the method in the first aspect.
  • A third aspect of the embodiments of the present invention discloses a terminal, where the terminal includes a processor, a memory, a neural network computing device, a communication interface, and a communication bus; the processor, the memory, the neural network computing device, and the communication interface are connected through the communication bus and communicate with each other;
  • the memory stores executable program code for wireless communication
  • the processor is configured to support the terminal to perform a corresponding function of the method provided by the first aspect.
  • a fourth aspect of the embodiments of the present invention discloses a computer storage medium for storing computer software instructions for a terminal provided by the third aspect, which comprises a program for executing the method in the first aspect.
  • The terminal acquires the image file containing the target object and, through the neural network computing device in the terminal, identifies the target object in the image file according to the image recognition model in the terminal to obtain the item category information of the target object; the item category information can then be saved in the image file as the first tag information.
  • The terminal can identify the target object through the neural network computing device in the terminal, thereby improving the efficiency of image recognition by the terminal and effectively protecting the privacy of the end user.
  • FIG. 1 is a schematic structural diagram of an image recognition system according to an embodiment of the present invention.
  • FIG. 2 is a schematic flow chart of an image recognition method according to an embodiment of the present invention.
  • FIG. 3 is a schematic flow chart of another image recognition method according to an embodiment of the present invention.
  • FIG. 4 is a schematic flow chart of still another image recognition method according to an embodiment of the present invention.
  • FIG. 5 is a schematic flowchart diagram of still another image recognition method according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a terminal display interface according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of another terminal display interface according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of still another terminal display interface according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a unit of a terminal according to an embodiment of the present invention.
  • FIG. 10 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
  • References to "an embodiment" herein mean that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention.
  • The appearances of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art will understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
  • the embodiment of the invention discloses an image recognition method, a terminal and a system, which can improve the efficiency of the image recognition by the terminal and effectively protect the privacy of the terminal user.
  • the application scenarios of the embodiments of the present invention are described below.
  • the application scenarios in the embodiments of the present invention may also be understood as the communication systems to which the embodiments of the present invention are applied.
  • FIG. 1 is a schematic structural diagram of an image recognition system disclosed in the prior art.
  • the system includes an image recognition server 1 and n terminals 21 to 2n.
  • The image recognition server 1 can recognize and classify the images uploaded by a terminal through CV technology, NLP technology, and the like, label the uploaded images with label information, and feed the label information back to the uploading terminal.
  • The label information may also be revised in the background by manual calibration to increase the accuracy of image labeling. That is, the image recognition server 1 can support both automatic labeling of images and manual labeling of the label information.
  • However, the workload of the image recognition server 1 increases, and the processing efficiency of image labeling is low. If label information has to be calibrated manually in the background, the cost of maintaining the image recognition server 1 also increases. In addition, this approach cannot effectively protect the privacy of the end user, because the images in the terminal must be uploaded to obtain the corresponding tag information.
  • the terminal described in the embodiments of the present invention may include a handheld computer, a personal digital assistant, a cellular phone, a network application, a camera, a smart phone, an enhanced general packet radio service (EGPRS) mobile phone, a media player, a navigation device, and an electronic device.
  • FIG. 2 is a schematic flowchart diagram of an image recognition method according to an embodiment of the present invention.
  • the method can be implemented by the above terminal. As shown in FIG. 2, the method includes at least the following steps.
  • Step S201: the terminal acquires an image file containing the target object.
  • The terminal may obtain an image file that includes the target object from an image file or a video file; or the terminal may obtain an image file that includes the target object from another communication device that communicates with the terminal; or the terminal may extract the target object from an original image file or video file and generate a new image file containing the target object; or the terminal may use a stored image file as the image file containing the target object and determine the target object on the image file according to the user's selection operation.
  • The image file may include at least one target object.
  • The target object may be determined based on the user's operation on the image file, or based on the target features of the target object.
  • Step S202: the terminal identifies the target object according to an image recognition model in the terminal by using a neural network computing device in the terminal to obtain object type information of the target object.
  • The neural network computing device in the terminal may identify the target object according to the image recognition model in the terminal to obtain the object category information of the target object. Specifically, the image recognition model in the terminal may be established by the server: the server trains the image recognition model on multiple types of collected image files and pushes the trained image recognition model to the terminal.
  • Alternatively, an image recognition model can be established in the terminal by the neural network computing device.
  • After the terminal acquires the image file containing the target object, the terminal may be triggered, by a user instruction or in another manner, to call the image recognition model to identify the target object and obtain the object type information of the target object.
  • the object category information may include text information such as an object category or an object name of the target. The efficiency of image recognition by the terminal can be improved by the neural network computing device configured in the terminal.
  • Step S203 The terminal saves the object category information as the first tag information of the target object to the image file.
  • The terminal saves the object category information to the image file as the first tag information of the target object.
  • the object category information of the target object obtained by the image recognition model may be saved in the image file according to a preset label information format, so that the object category information is saved as the first label information to the image file.
  • the first tag information can be saved to an extended segment or a comment segment of the image file.
  • The terminal can label the target object according to the first tag information, or can determine second tag information according to the first tag information and label the target object according to the second tag information.
  • The terminal may further modify the saved first label information according to a user operation, or upload the first label information to the server for sharing.
  • the embodiments of the present invention are not limited herein.
  • The terminal can identify the target object in the image file according to the image recognition model in the terminal by using the neural network computing device in the terminal, obtain the item category information of the target object, and save the item category information into the image file as the first label information.
  • The terminal can identify the target object through the neural network computing device in the terminal, thereby improving the efficiency of image recognition by the terminal and effectively protecting the privacy of the end user.
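  • The flow of steps S201 to S203 can be sketched as follows. This is a minimal illustration only: `ImageFile`, `toy_model`, and `recognize_and_tag` are hypothetical names, and the stand-in model returns a fixed answer where a real terminal would run its image recognition model on the neural network computing device.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ImageFile:
    """Hypothetical in-memory image file: pixel data plus saved tags."""
    pixels: bytes
    tags: List[Dict[str, str]] = field(default_factory=list)


# Assumed interface of the on-device model: pixels in, category info out.
Model = Callable[[bytes], Dict[str, str]]


def toy_model(pixels: bytes) -> Dict[str, str]:
    # Placeholder only: always reports the same object category.
    return {"type": "animal", "name": "rabbit"}


def recognize_and_tag(image: ImageFile, model: Model) -> ImageFile:
    # S202: identify the target object locally (nothing is uploaded).
    category = model(image.pixels)
    # S203: save the category information as the first tag information.
    image.tags.append({"kind": "first", **category})
    return image


image = recognize_and_tag(ImageFile(pixels=b"\x00\x01"), toy_model)
print(image.tags)
```

  • Because the model is passed in as a plain callable, the same sketch covers a model pushed by a server and a model trained on the terminal.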
  • FIG. 3 is a schematic flowchart diagram of another image recognition method according to an embodiment of the present invention. As shown in FIG. 3, the method includes at least the following steps.
  • Step S301: server A trains the image recognition model.
  • server A can train the image recognition model through a mobile video file or graphic file.
  • the image recognition model may be a Convolutional Neural Network (CNN).
  • Server A can collect a variety of multimedia files (such as video files, image files, or other files that include displayed images) as training material, either offline (for example, by manual collection in the background) or online.
  • Step S302 the server A sends the trained image recognition model to the terminal.
  • server A can send the trained image recognition model to the terminal.
  • The server may send the image recognition model to the terminal as a file in a specific format known to both the server and the terminal, so that the terminal can recognize the file in the specific format after receiving it and store the image recognition model.
  • the server A may send an image recognition model to the requesting terminal after receiving the request of the terminal; or the server A may send the image recognition model to all or part of the terminals in the system after training the image recognition model.
  • the server A may broadcast a notification to the terminal in the system after training the image recognition model, and send an image recognition model to the response terminal in the system after receiving the response of the terminal in the system.
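  • The three delivery options described for step S302 (send on request, push to selected terminals, or notify and then send to the terminals that respond) can be sketched as below. The classes `Terminal` and `ServerA` and all method names are assumptions made for illustration; the source does not define a concrete protocol.

```python
from typing import List, Optional


class Terminal:
    def __init__(self, name: str, wants_model: bool = True) -> None:
        self.name = name
        self.wants_model = wants_model
        self.model: Optional[bytes] = None

    def receive_model(self, blob: bytes) -> None:
        # S303: store the received model in terminal memory.
        self.model = blob

    def on_notification(self) -> bool:
        # Whether this terminal responds to the server's broadcast.
        return self.wants_model


class ServerA:
    def __init__(self, model_blob: bytes, terminals: List[Terminal]) -> None:
        self.model_blob = model_blob
        self.terminals = terminals

    def send_on_request(self, terminal: Terminal) -> None:
        # Option 1: a terminal asked for the model.
        terminal.receive_model(self.model_blob)

    def push_to(self, terminals: List[Terminal]) -> None:
        # Option 2: push to all or some of the terminals in the system.
        for terminal in terminals:
            terminal.receive_model(self.model_blob)

    def notify_then_send(self) -> List[str]:
        # Option 3: broadcast a notification, send only to responders.
        responders = [t for t in self.terminals if t.on_notification()]
        for terminal in responders:
            terminal.receive_model(self.model_blob)
        return [t.name for t in responders]


a, b = Terminal("a"), Terminal("b", wants_model=False)
server = ServerA(b"model-v1", [a, b])
sent_to = server.notify_then_send()
```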
  • Step S303 the terminal receives and stores the image recognition model.
  • The terminal may receive the image recognition model sent by server A and store it in the terminal memory. After acquiring the image file containing the target object, the terminal may retrieve the image recognition model from the memory.
  • Step S304 the terminal determines, according to a selection operation of the display image on the terminal by the user, the target object corresponding to the selection operation.
  • The terminal may display an image file or a video file on the display interface and monitor the user's touch operations on the currently displayed image. When it detects that the user's operation on the displayed image is a selection operation, it can determine the target object corresponding to the selection operation.
  • The user may perform the selection operation on the displayed image one or more times.
  • the target corresponding to the selection operation may be one or more.
  • the one or more targets may be determined according to an image region selected by the selection operation.
  • FIG. 6 is a schematic diagram of a terminal display interface according to an embodiment of the present invention.
  • When it is detected that the selected area corresponding to the user's selection operation is C1, the target object may be determined to be M1; or, when it is detected that the selected area corresponding to the user's selection operation is C2, the target objects may be determined to be M2 and M3.
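  • One way to map a selected area to target objects is a bounding-box overlap test, sketched below. The overlap rule, the coordinates, and the function names are assumptions for illustration; the source only states that the targets are determined from the image region selected by the selection operation.

```python
from typing import Dict, List, Tuple

# (left, top, right, bottom) in display pixels; hypothetical layout.
Box = Tuple[int, int, int, int]


def overlaps(a: Box, b: Box) -> bool:
    # Two axis-aligned boxes overlap if they intersect on both axes.
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def targets_in_region(region: Box, objects: Dict[str, Box]) -> List[str]:
    # Every object whose box overlaps the selected area is a target.
    return [name for name, box in objects.items() if overlaps(region, box)]


objects = {
    "M1": (10, 10, 50, 50),
    "M2": (100, 20, 140, 60),
    "M3": (150, 20, 190, 60),
}
c1 = (0, 0, 60, 60)      # selected area C1
c2 = (90, 10, 200, 70)   # selected area C2

print(targets_in_region(c1, objects))  # ['M1']
print(targets_in_region(c2, objects))  # ['M2', 'M3']
```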
  • Step S305 the terminal generates an image file including the target.
  • the terminal may generate an image file including the target object.
  • Image files containing different numbers of target objects may be generated according to the number of target objects in the selected image. For example, one image file may include several target objects, or one image file may include only one target object.
  • a new image file containing the target object may be generated.
  • the image file includes both the target object M2 and the target object M3.
  • two image files are generated according to the number of objects, wherein the two image files in FIG. 6C respectively include the object M2 and the object M3.
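  • The choice between one image file for all n selected target objects and m image files (m ≤ n) can be sketched as a simple grouping step. The function `split_into_files` and its `per_file` parameter are hypothetical; each inner list stands for the contents of one generated image file.

```python
from typing import List


def split_into_files(targets: List[str], per_file: int) -> List[List[str]]:
    # Each group holds at least one target, so the number of groups m
    # never exceeds the number of targets n.
    if per_file < 1:
        raise ValueError("per_file must be at least 1")
    return [targets[i:i + per_file] for i in range(0, len(targets), per_file)]


selected = ["M2", "M3"]
one_file = split_into_files(selected, per_file=len(selected))  # both in one file
separate = split_into_files(selected, per_file=1)              # one file each
```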
  • Step S306 the terminal identifies the target object according to the stored image recognition model by using a neural network computing device in the terminal to obtain object type information of the target object.
  • The terminal may identify the target object according to the stored image recognition model by using the neural network computing device in the terminal to obtain the object category information of the target object.
  • The terminal may feed the image file as input data into the stored image recognition model by using the neural network computing device. Using the computation in the neural network computing device, the image recognition model extracts the target object from the image file, classifies and identifies it, and finally outputs the object category information of the target object. The terminal may output the object type information of the target object at the output end of the image recognition model.
  • the object type information may include an object type of the target or an object name of the target, and the like.
  • Step S307 the terminal saves the object category information as the first tag information of the target object to the image file.
  • After the terminal identifies the target object according to the image recognition model and obtains the object category information of the target object, the object category information may be saved into the image file as the first tag information of the target object.
  • The object category information obtained by the terminal may be stored according to a preset label format.
  • For example, when the object category information obtained by the terminal indicates that the object type is an animal and the object name is a rabbit, it can be converted into the first label information according to the preset label format.
  • The terminal may save the first tag information to an extension segment or a comment segment of the image file.
  • Taking the JPEG (Joint Photographic Experts Group) image file format as an example, markers 0xFFF0 to 0xFFFD in a JPEG image file are reserved extension segments, and marker 0xFFFE is the reserved comment (Comment) segment.
  • The terminal may save the first tag information to either of these two kinds of segments in the image file containing the target object.
  • If the image file is in another format, the terminal may save the first label information to the corresponding segment of that format, which is not limited in the embodiments of the present invention.
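  • Saving tag information in a JPEG comment segment can be sketched as below. The sketch assumes the tag is serialized as JSON, which is not the patent's preset label format (that format is not shown in the text), and it uses the standard JPEG comment marker, bytes 0xFF 0xFE. The helper names are hypothetical.

```python
import json
import struct
from typing import Optional

SOI = b"\xff\xd8"  # start-of-image marker
COM = b"\xff\xfe"  # comment-segment marker


def _insert_position(jpeg: bytes) -> int:
    # Skip any APPn segments (0xFFE0-0xFFEF) that directly follow SOI,
    # so application headers such as JFIF stay first.
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF and 0xE0 <= jpeg[pos + 1] <= 0xEF:
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        pos += 2 + length
    return pos


def add_tag_comment(jpeg: bytes, tag: dict) -> bytes:
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG stream")
    payload = json.dumps(tag, ensure_ascii=False).encode("utf-8")
    if len(payload) > 0xFFFF - 2:
        raise ValueError("tag too large for a single segment")
    # The 2-byte length field counts itself plus the payload.
    segment = COM + struct.pack(">H", len(payload) + 2) + payload
    pos = _insert_position(jpeg)
    return jpeg[:pos] + segment + jpeg[pos:]


def read_tag_comment(jpeg: bytes) -> Optional[dict]:
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker in (0xD9, 0xDA):  # end of image / start of scan
            break
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker == 0xFE:
            return json.loads(jpeg[pos + 4:pos + 2 + length].decode("utf-8"))
        pos += 2 + length
    return None


# Tiny synthetic stream: SOI, one APP0 segment, EOI (not a viewable image).
app0 = b"\xff\xe0" + struct.pack(">H", 7) + b"JFIF\x00"
fake_jpeg = SOI + app0 + b"\xff\xd9"
tagged = add_tag_comment(fake_jpeg, {"type": "animal", "name": "rabbit"})
```

  • The reader stops at the start-of-scan marker, so it only inspects header segments and never walks through compressed image data.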
  • the first label information may be displayed on the image file according to a preset label display format.
  • the first label information may also be displayed in a preset label display format on the original image or video file before the user performs the selection operation.
  • FIG. 7 is a schematic diagram of another terminal display interface according to an embodiment of the present invention.
  • The target object in the image file may be labeled, and the specific label information may be as shown in FIG. 7B or FIG. 7C; or the target object in the original image may be labeled, for which see the label information in FIG. 7A.
  • Step S308 the terminal sends the first label information of the target to the server A to share the first label information on the server A.
  • the terminal may also send the obtained first label information to the server A to share the first label information on the server A.
  • the terminal may send the object category information to the server A.
  • the terminal may also transmit the image file containing the target object in which the first tag information is stored to the server A.
  • the server A can store the recognition result of the image recognition by the terminal, that is, the object type information or the first tag information of the target object transmitted by the terminal.
  • Server A may feed the object category information or the first label information uploaded by the terminal back to other terminals to share the image recognition result. At the same time, server A avoids identifying the same target object multiple times with the image recognition model, which improves the efficiency of image recognition.
  • The terminal may revise or calibrate the first label information before sending the first label information of the target object to server A. When the first label information is displayed on the image file and the recognition result of the image recognition model is incorrect, that is, the first label information is not accurate, the terminal may receive the user's revision operation on the first label information and revise the first label information accordingly to make it more accurate. Sending the revised first label information to server A reduces the need for server A to revise label information in the background, further improving the efficiency of image recognition.
  • Alternatively, the terminal may send the first label information of the target object to server A and receive the revised label information that server A feeds back for the first label information. The terminal can thus use server A to revise the image recognition result and obtain a more accurate result.
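  • The two revision paths (the user revises the label before upload, or the server feeds back a revised label) can be sketched as follows. `SharingServer`, its `revisions` table, and `share_tag` are hypothetical and only illustrate the order of operations.

```python
from typing import Dict, Optional


class SharingServer:
    """Stand-in for server A: stores shared labels, may return a revision."""

    def __init__(self, revisions: Optional[Dict[str, str]] = None) -> None:
        self.shared: Dict[str, str] = {}
        # Hypothetical background corrections: uploaded label -> revised label.
        self.revisions: Dict[str, str] = revisions or {}

    def upload(self, image_id: str, label: str) -> Optional[str]:
        revised = self.revisions.get(label)
        self.shared[image_id] = revised if revised is not None else label
        return revised


def share_tag(server: SharingServer, image_id: str, first_label: str,
              user_revision: Optional[str] = None) -> str:
    # Path 1: apply the user's revision before sending.
    label = user_revision if user_revision is not None else first_label
    # Path 2: adopt the revised label if the server feeds one back.
    revised = server.upload(image_id, label)
    return revised if revised is not None else label


server_a = SharingServer(revisions={"hare": "rabbit"})
final_1 = share_tag(server_a, "img1", "hare")
final_2 = share_tag(server_a, "img2", "cat", user_revision="kitten")
```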
  • The terminal may receive and store the image recognition model trained by the server, identify an image containing the target object in the terminal according to that model to acquire the object category information of the target object, and label the target object with the object type information as the first tag information. The terminal may also send the first tag information to the server for sharing, so that the terminal takes over part of the server's image recognition workload and the efficiency of image recognition improves.
  • FIG. 4 is a schematic flowchart diagram of still another image recognition method according to an embodiment of the present invention. As shown in FIG. 4, the method includes the following steps.
  • Step S401: the terminal acquires an image file containing the target object.
  • Step S402 the terminal identifies the target object according to an image recognition model in the terminal by using a neural network computing device in the terminal to obtain object type information of the target object.
  • Step S403 the terminal saves the object category information as the first tag information of the target object to the image file.
  • For steps S401 to S403, refer to the specific implementation of the corresponding steps in the embodiment shown in FIG. 2 or FIG. 3; details are not described herein again.
  • Step S404: The terminal sends the first tag information to server B.
  • After the terminal determines the first tag information of the target object, it may send the first tag information to server B.
  • Optionally, the terminal sends a query request carrying the first tag information to server B.
  • By sending the query request, the terminal asks server B to search for second tag information that matches the first tag information, where the second tag information includes a more detailed description of the target object.
  • Step S405: Server B receives the first tag information and searches the stored target object detailed information library for second tag information that matches the first tag information.
  • After receiving the first tag information, server B may search the stored target object detailed information library for second tag information that matches it. Specifically, server B may establish an item detailed information library in advance. The library contains detailed information about items, and different item categories may correspond to different kinds of detailed information. For example, when the item is an animal or a plant, the detailed information may include, but is not limited to, the item name, scientific name, English name, taxonomic classification, habitat, and characteristics; when the item is a commodity, the detailed information may include, but is not limited to, the item name, category, place of production, and manufacturer.
  • Server B may search the item detailed information library, according to the item category or item name in the first tag information, for second tag information that matches that category or name. If several pieces of tag information match, server B may further use the specific display image of the target object to select, from among them, the second tag information that matches the display image of the target object corresponding to the first tag information.
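The lookup performed by server B in step S405 can be sketched as follows. This is a minimal illustration in Python, not the implementation described by the patent: the dictionary keys (`category`, `name`), the function `find_second_tag`, the sample library entries, and the optional `choose` callback that stands in for the image-based selection among several matches are all hypothetical.

```python
from typing import Callable, Optional


def find_second_tag(library: list, first_tag: dict,
                    choose: Optional[Callable[[list], dict]] = None) -> Optional[dict]:
    """Return the library entry that matches the first tag, or None.

    The name is tried first and the category second. That priority is an
    assumption; the source only says "item category or item name".
    """
    name = first_tag.get("name")
    category = first_tag.get("category")

    matches = [e for e in library if name is not None and e.get("name") == name]
    if not matches:
        matches = [e for e in library
                   if category is not None and e.get("category") == category]
    if not matches:
        return None  # corresponds to the "not found" search result
    if len(matches) > 1 and choose is not None:
        # Hypothetical hook: pick the entry that best fits the display image.
        return choose(matches)
    return matches[0]


# Made-up library content, for illustration only.
library = [
    {"category": "animal", "name": "rabbit", "habitat": "grassland"},
    {"category": "commodity", "name": "kettle", "manufacturer": "ExampleCo"},
]
print(find_second_tag(library, {"category": "animal", "name": "rabbit"}))
print(find_second_tag(library, {"category": "plant", "name": "rose"}))
```

Returning `None` for a miss lets the caller map the outcome directly onto the found / not found search result of step S406.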
  • Step S406: Server B sends a search result to the terminal.
  • Server B may determine the result of the search in the target object detailed information library. The search result may be either found or not found. When server B finds the second tag information, it may carry the second tag information in the search result sent to the terminal.
  • Step S407: The terminal determines, according to the search result, whether the second tag information was found.
  • The terminal may receive the search result sent by server B and determine from it whether second tag information matching the first tag information was found. Specifically, the terminal may parse the search result to check whether it carries the second tag information; if it does not, the terminal may determine that the second tag information was not found. Alternatively, the terminal and server B may agree that different values carried in the search result represent different outcomes: for example, a value of 0 indicates that the second tag information was not found, and a value of 1 indicates that it was found. In that case, when the received search result is 1, the terminal may further request the second tag information from the server.
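The two ways of reading the search result in step S407 can be sketched in Python as below. The field names `second_tag` and `status`, the callback `request_second_tag`, and both function names are assumptions made for illustration; the source only specifies that the result either carries the second tag information or carries an agreed value of 0 or 1.

```python
from typing import Callable, Optional

NOT_FOUND = 0  # agreed value: no matching second tag information
FOUND = 1      # agreed value: matching second tag information exists


def resolve_second_tag(result: dict,
                       request_second_tag: Optional[Callable[[], dict]] = None
                       ) -> Optional[dict]:
    """Return the second tag information, or None if it was not found."""
    # Variant 1: the search result carries the second tag information itself.
    if result.get("second_tag") is not None:
        return result["second_tag"]
    # Variant 2: the result carries only the agreed value; on FOUND the
    # terminal asks the server for the second tag information.
    if result.get("status") == FOUND and request_second_tag is not None:
        return request_second_tag()
    return None


def tag_to_display(first_tag: dict, result: dict,
                   request_second_tag: Optional[Callable[[], dict]] = None) -> dict:
    """Pick the second tag when it was found, otherwise fall back to the first tag."""
    second = resolve_second_tag(result, request_second_tag)
    return second if second is not None else first_tag
```

`tag_to_display` mirrors the branch taken after the determination: label with the second tag information when it is available, otherwise with the first.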
  • Step S408: If the terminal determines that the second tag information was found, the terminal saves the second tag information to the image file and labels the target object in the image file according to the second tag information.
  • When the terminal determines that the second tag information was found, it may further save the second tag information into the image file that contains the target object.
  • For how the second tag information is saved, refer to the implementation for saving the first tag information.
  • The first tag information and the second tag information may be saved in the same extension segment or comment segment of the image file, or they may be saved separately, one in the extension segment and the other in the comment segment.
  • Optionally, the saved first tag information or second tag information may be labeled on the target object in the image file.
  • In this embodiment of the present invention, when the terminal determines that the second tag information was found, the second tag information may be labeled on the target object in the image file according to a preset tag format.
  • The tag format of the second tag information may differ from that of the first tag information.
  • Optionally, the image file may be labeled with the first tag information and the second tag information at the same time.
  • For example, FIG. 8 is a schematic diagram of still another terminal display interface according to an embodiment of the present invention.
  • As shown in FIG. 8A, the target object in the image file is labeled with the second tag information; as shown in FIG. 8B, the target object may also be labeled with both the first tag information and the second tag information.
  • Step S409: If the terminal determines that the second tag information was not found, the terminal labels the target object in the image file according to the first tag information and sends a notification message to server B, to notify server B that second tag information matching the first tag information is missing from the target object detailed information library.
  • When the terminal determines that the second tag information was not found, it may label the target object in the image file with the first tag information according to a preset tag format.
  • The terminal may also notify server B that second tag information matching the first tag information is missing from the target object detailed information library stored by server B, that is, that the detailed information of the target object is missing.
  • Optionally, the terminal may find the corresponding second tag information by other means (for example, through a web page) and feed it back to server B for review; if the review passes, server B feeds back a message acknowledging the second tag information. Alternatively, server B may find the second tag information itself and add it to the target object detailed information library.
  • In this embodiment of the present invention, after acquiring the first tag information, the terminal may acquire more detailed second tag information. The target object can thus be labeled with several kinds of tag information, which gives the user more information and improves the user experience.
  • FIG. 5 is a schematic flowchart diagram of still another image recognition method according to an embodiment of the present invention. As shown in FIG. 5, the method includes the following steps.
  • Step S501: Server A trains the image recognition model.
  • Step S502: Server A sends the trained image recognition model to the terminal.
  • Step S503: The terminal receives and stores the image recognition model.
  • Step S504: The terminal determines, according to a selection operation performed by the user on an image displayed on the terminal, the target object corresponding to the selection operation.
  • Step S505: The terminal generates an image file containing the target object.
  • Step S506: The terminal identifies the target object according to the stored image recognition model by using a neural network computing device in the terminal, to obtain object category information of the target object.
  • Step S507: The terminal saves the object category information to the image file as the first tag information of the target object.
  • Step S508: The terminal sends the first tag information to server A, to share the first tag information on server A.
  • Step S509: The terminal sends the first tag information to server B.
  • Step S510: Server B receives the first tag information and searches the stored target object detailed information library for second tag information that matches the first tag information.
  • Step S511: Server B sends a search result to the terminal.
  • Step S512: If the terminal obtains the second tag information according to the search result, the terminal saves the second tag information to the image file and labels the target object in the image file according to the second tag information.
  • For specific implementations of steps S501 to S511, refer to the descriptions of the corresponding steps in the embodiments shown in FIG. 2 to FIG. 4; details are not described herein again. Note that the order in which steps S508 and S509 are executed is not limited in the embodiments of the present invention.
  • In this embodiment of the present invention, the terminal can use the image recognition model pushed by server A to obtain, on the terminal itself, the tag information of the target object in an image, and can also use server B to obtain more detailed tag information for the target object. This offloads image recognition work from the server, improves image recognition efficiency, and gives the user several kinds of tag information, which improves the user experience.
  • FIG. 9 is a schematic diagram of a unit composition of a terminal according to an embodiment of the present invention.
  • The terminal 900 may include:
  • an obtaining unit 901, configured to acquire an image file that contains a target object;
  • an identifying unit 902, configured to identify the target object according to an image recognition model in the terminal by using a neural network computing device in the terminal, to obtain object category information of the target object; and
  • a saving unit 903, configured to save the object category information to the image file as the first tag information of the target object.
  • Optionally, the obtaining unit 901 includes:
  • a determining unit, configured to determine, according to a selection operation performed by the user on an image displayed on the terminal, the target object corresponding to the selection operation; and
  • a generating unit, configured to generate an image file containing the target object.
  • Optionally, if the selection operation corresponds to n target objects, where n is a positive integer, the generating unit is further configured to generate an image file containing the n target objects, or to generate m image files each containing at least one of the n target objects, where m is an integer and m ≤ n.
  • Optionally, for use before the identifying unit 902 identifies the target object according to the image recognition model in the terminal to obtain the object category information of the target object, the terminal further includes:
  • a receiving unit 904, configured to receive and store the image recognition model sent by the first server; or
  • a training unit 905, configured to perform image recognition training on pictures in the terminal by using the neural network computing device, to obtain the image recognition model.
  • Optionally, the terminal further includes:
  • a sending unit 906, configured to send the first tag information of the target object to the first server, to share the first tag information on the first server.
  • Optionally, for use before the sending unit 906 sends the first tag information of the target object to the first server, the terminal further includes:
  • a revision unit 907, configured to revise the first tag information according to a revision operation performed by the user on the first tag information;
  • the sending unit 906 is further configured to send the revised first tag information to the first server.
  • Optionally, the terminal further includes:
  • an updating unit 908, configured to update the first tag information to first tag revision information if the terminal receives the first tag revision information sent by the first server.
  • Optionally, the terminal further includes:
  • a searching unit 909, configured to search the target object detailed information library for second tag information that matches the first tag information;
  • a determining unit 910, configured to determine whether the second tag information is found;
  • the saving unit 903 is further configured to: if the determination result of the determining unit is yes, save the second tag information to the image file and label the target object in the image file according to the second tag information; and
  • a labeling unit 911, configured to label the target object in the image file according to the first tag information if the determination result of the determining unit is no.
  • The obtaining unit 901 is configured to perform the method of step S201 in the embodiment shown in FIG. 2; the identifying unit 902 is configured to perform the method of step S202 in the embodiment shown in FIG. 2; the saving unit 903 is configured to perform the method of step S203 in the embodiment shown in FIG. 2.
  • The receiving unit 904 is configured to perform the method of step S303 in the embodiment shown in FIG. 3; the obtaining unit 901 is further configured to perform the methods of steps S304 to S305 in the embodiment shown in FIG. 3; the identifying unit 902 is further configured to perform the method of step S306 in the embodiment shown in FIG. 3; the saving unit 903 is further configured to perform the method of step S307 in the embodiment shown in FIG. 3; the sending unit 906 is configured to perform the method of step S308 in the embodiment shown in FIG. 3.
  • The obtaining unit 901 is further configured to perform the method of step S401 in the embodiment shown in FIG. 4; the identifying unit 902 is further configured to perform the method of step S402 in the embodiment shown in FIG. 4; the saving unit 903 is further configured to perform the method of step S403 in the embodiment shown in FIG. 4; the sending unit 906 is further configured to perform the method of step S404 in the embodiment shown in FIG. 4; the determining unit 910 is configured to perform the method of step S407 in the embodiment shown in FIG. 4; the saving unit 903 is further configured to perform the method of step S408 in the embodiment shown in FIG. 4; the labeling unit 911 is configured to perform the methods of steps S408 to S409 in the embodiment shown in FIG. 4.
  • The receiving unit 904 is further configured to perform the method of step S503 in the embodiment shown in FIG. 5; the obtaining unit 901 is further configured to perform the methods of steps S504 to S505 in the embodiment shown in FIG. 5; the identifying unit 902 is further configured to perform the method of step S506 in the embodiment shown in FIG. 5; the saving unit 903 is further configured to perform the method of step S507 in the embodiment shown in FIG. 5; the sending unit 906 is further configured to perform the methods of steps S508 to S509 in the embodiment shown in FIG. 5; the labeling unit 911 is further configured to perform the method of step S512 in the embodiment shown in FIG. 5.
  • The sending unit 906 can send messages, information, and the like to a server or another communication device through the communication interface configured in the terminal 900.
  • The receiving unit 904 can receive messages, information, and the like sent by a server or another communication device through the communication interface configured in the terminal 900.
  • The above communication interface is a wireless interface.
  • The terminal 900 in the embodiment shown in FIG. 9 is presented in the form of units.
  • A "unit" herein may refer to an application-specific integrated circuit (ASIC), a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or another device that provides the functionality described above.
  • FIG. 10 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
  • The terminal 1000 includes a storage unit 1010, a communication interface 1020, a neural network computing device 1030, and a processor 1040 coupled to the storage unit 1010 and the communication interface 1020.
  • The storage unit 1010 is configured to store instructions, the processor 1040 is configured to execute the instructions, and the communication interface 1020 is configured to communicate with other devices under the control of the processor 1040.
  • When the processor 1040 executes the instructions, any one of the image recognition methods in the above embodiments of the present application may be performed according to the instructions.
  • The neural network computing device 1030 may include one or more neural network computing chips, for example at least one of a DSP (Digital Signal Processing) chip, an NPU (Network Process Unit), a GPU (Graphics Processing Unit), and the like.
  • The processor 1040 may also be referred to as a central processing unit (CPU).
  • The storage unit 1010 may include a read-only memory and a random access memory, and provides instructions, data, and the like to the processor 1040.
  • A portion of the storage unit 1010 may also include a non-volatile random access memory.
  • In a specific application, the components of the terminal 1000 are coupled together, for example, by a bus system.
  • The bus system may also include a power bus, a control bus, and a status signal bus.
  • The various buses are labeled as bus system 1050 in the figure. The method disclosed in the above embodiments of the present invention may be applied to the processor 1040 or implemented by the processor 1040.
  • The processor 1040 may be an integrated circuit chip with signal processing capabilities. In an implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 1040 or by instructions in the form of software.
  • The processor 1040 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • The processor 1040 can implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • The steps of the method disclosed in the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor.
  • The software module may be located in a conventional storage medium such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • The storage medium is located in the storage unit 1010.
  • The processor 1040 reads the information in the storage unit 1010 and completes the steps of the above method in combination with its hardware.
  • An embodiment of the present invention further provides a computer storage medium for storing the computer software instructions used by the above terminal, including a computer program for performing the foregoing method embodiments.
  • Embodiments of the present invention can be provided as a method, an apparatus (device), or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
  • The computer program is stored or distributed in a suitable medium, provided together with other hardware or as part of the hardware, or distributed in other forms, such as over the Internet or other wired or wireless telecommunication systems.
  • The computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus. The instruction apparatus implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

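Disclosed are an image recognition method and a terminal. The method includes: a terminal acquires an image file containing a target object; the terminal identifies the target object according to an image recognition model in the terminal by using a neural network computing device in the terminal, to obtain object category information of the target object; and the terminal saves the object category information to the image file as first tag information of the target object. In this way, the efficiency of image recognition on the terminal can be improved and the privacy of the terminal user can be effectively protected.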

Description

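Image Recognition Method and Terminal
Technical Field
The present invention relates to the field of image recognition technologies, and in particular to an image recognition method and a terminal.
Background
Currently, because information on the Internet is vast and varied, at least one piece of tag information needs to be attached to text files or image files on the Internet so that they can be classified and managed by tag. With the development of technologies such as computer vision (CV), natural language processing (NLP), and deep neural networks, tag information can be attached to image files automatically. To attach tag information to an image on a terminal, the image has to be uploaded to a server; the server recognizes and classifies the image, attaches tag information to it, and feeds the tag information back to the terminal. As the number of terminals grows, the number of images the server must process grows, the server's workload increases, and the efficiency of tagging images drops. Moreover, because the terminal has to upload images to the server for tagging, the privacy of the terminal user is not well protected.
Summary
Embodiments of the present invention disclose an image recognition method and a terminal, which can improve the efficiency of image recognition on the terminal and effectively protect the privacy of the terminal user.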
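A first aspect of the embodiments of the present invention discloses an image recognition method, including: a terminal acquires an image file containing a target object; the terminal identifies the target object according to an image recognition model in the terminal by using a neural network computing device in the terminal, to obtain object category information of the target object; and the terminal saves the object category information to the image file as first tag information of the target object.
It can be seen that, by acquiring an image file containing a target object, the terminal can identify the target object in the image file according to the image recognition model in the terminal by using the neural network computing device in the terminal, obtain the item category information of the target object, and save the item category information to the image file as the first tag information. In this way, the terminal identifies the target object with its own neural network computing device, which improves the efficiency of image recognition on the terminal and effectively protects the privacy of the terminal user.
In some possible implementations of the first aspect, the terminal acquiring an image file containing a target object includes: the terminal determines, according to a selection operation performed by a user on an image displayed on the terminal, the target object corresponding to the selection operation; and the terminal generates an image file containing the target object.
It can be seen that the terminal can interact with the user in this way.
In some possible implementations of the first aspect, if the selection operation corresponds to n target objects, where n is a positive integer, the terminal generating an image file containing the target object includes: the terminal generates an image file containing the n target objects; or the terminal generates m image files each containing at least one of the n target objects, where m is an integer and m ≤ n.
In some possible implementations of the first aspect, before the terminal identifies the target object according to the image recognition model in the terminal to obtain the object category information of the target object, the method further includes: the terminal receives and stores the image recognition model sent by a first server; or the terminal performs image recognition training on pictures in the terminal by using the neural network computing device, to obtain the image recognition model.
It can be seen that the terminal can carry out image recognition training in several ways, which helps to obtain a more accurate image recognition model.
In some possible implementations of the first aspect, after the terminal saves the object category information to the image file as the first tag information of the target object, the method further includes: the terminal sends the first tag information of the target object to the first server, to share the first tag information on the first server.
It can be seen that, by sharing tag information, the terminal can take over part of the server's workload and share information with other terminals.
In some possible implementations of the first aspect, before the terminal sends the first tag information of the target object to the first server, the method further includes: the terminal revises the first tag information according to a revision operation performed by the user on the first tag information; and the terminal sending the first tag information of the target object to the first server includes: the terminal sends the revised first tag information to the first server.
It can be seen that the terminal further improves the accuracy of the generated tag and can share it through the server, which takes over revision work that would otherwise have to be done on the server.
In some possible implementations of the first aspect, after the terminal sends the first tag information of the target object to the first server, the method further includes: if the terminal receives first tag revision information sent by the first server, the terminal updates the first tag information to the first tag revision information.
It can be seen that the terminal can also obtain more accurate tag information through the server.
In some possible implementations of the first aspect, after the terminal saves the object category information to the image file as the first tag information of the target object, the method further includes: the terminal searches a target object detailed information library for second tag information that matches the first tag information; the terminal determines whether the second tag information is found; if so, the terminal saves the second tag information to the image file and labels the target object in the image file according to the second tag information; if not, the terminal labels the target object in the image file according to the first tag information.
It can be seen that the terminal may obtain more detailed tag information in this way.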
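A second aspect of the embodiments of the present invention discloses a terminal, where the terminal includes units for performing the method in the first aspect.
A third aspect of the embodiments of the present invention discloses a terminal, where the terminal includes a processor, a memory, a neural network computing device, a communication interface, and a communication bus; the processor, the memory, the neural network computing device, and the communication interface are connected through the communication bus and communicate with one another through it;
the memory stores executable program code, and the communication interface is used for wireless communication;
the processor is configured to support the terminal in performing the corresponding functions in the method provided in the first aspect.
A fourth aspect of the embodiments of the present invention discloses a computer storage medium for storing the computer software instructions used by the terminal provided in the third aspect, including a program designed to perform the method in the first aspect.
In the embodiments of the present invention, by acquiring an image file containing a target object, the terminal can identify the target object in the image file according to the image recognition model in the terminal by using the neural network computing device in the terminal, obtain the item category information of the target object, and save the item category information to the image file as the first tag information. In this way, the terminal identifies the target object with its own neural network computing device, which improves the efficiency of image recognition on the terminal and effectively protects the privacy of the terminal user.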
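Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings used in the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present invention, and persons of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of an image recognition system according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of an image recognition method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of another image recognition method according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of still another image recognition method according to an embodiment of the present invention;
FIG. 5 is a schematic flowchart of still another image recognition method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a terminal display interface according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of another terminal display interface according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of still another terminal display interface according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of the unit composition of a terminal according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a terminal according to an embodiment of the present invention.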
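Detailed Description
To help persons skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly below with reference to the accompanying drawings. The described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", and the like in the specification, the claims, and the accompanying drawings of the present invention are used to distinguish between different objects, not to describe a particular order. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally further includes steps or units that are not listed, or optionally further includes other steps or units inherent to the process, method, product, or device.
Reference to an "embodiment" herein means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present invention. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to a separate or alternative embodiment that is mutually exclusive with other embodiments. Persons skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
The technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings.
The embodiments of the present invention disclose an image recognition method, a terminal, and a system, which can improve the efficiency of image recognition on the terminal and effectively protect the privacy of the terminal user. For a better understanding of the embodiments, their application scenario is described first; the application scenario can also be understood as the communication system to which the method and apparatus embodiments of the present invention are applied.
Refer to FIG. 1. FIG. 1 is a schematic structural diagram of an image recognition system disclosed in the prior art. As shown in FIG. 1, the system includes an image recognition server 1 and n terminals 21 to 2n. In the conventional technical solution, the image recognition server 1 recognizes and classifies an image uploaded by a terminal by using CV, NLP, and similar technologies, attaches tag information to the uploaded image, and feeds the tag information back to the uploading terminal. After the image recognition server has tagged the image, the tag information may also be revised through manual calibration in the background to make the tagging more accurate. That is, the image recognition server 1 supports both automatic tagging and manual tagging of images. However, as the number of terminals in the system increases, the workload of the image recognition server 1 increases and the efficiency of tagging images falls. If the tag information also has to be calibrated manually in the background, the cost of maintaining the image recognition server 1 rises further. In addition, because images on the terminal must be uploaded before the corresponding tag information can be obtained, this approach cannot effectively protect the privacy of the terminal user.
Based on the system shown in FIG. 1, and to solve the technical problems of the conventional approach described above, the method embodiments of the present invention are described in detail below. The terminal described in the embodiments of the present invention may include at least one of a handheld computer, a personal digital assistant, a cellular phone, a network appliance, a camera, a smartphone, an enhanced general packet radio service (EGPRS) mobile phone, a media player, a navigation device, an e-mail device, a game console, or other data processing devices.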
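Refer to FIG. 2. FIG. 2 is a schematic flowchart of an image recognition method according to an embodiment of the present invention. The method may be implemented by the terminal described above. As shown in FIG. 2, the method includes at least the following steps.
Step S201: The terminal acquires an image file containing a target object.
In some feasible implementations, the terminal may acquire the image file containing the target object from an image file or a video file; or the terminal may acquire it from another communication device with which it communicates; or the terminal may extract the target object from an original image file or a video file and generate a new image file containing the target object; or the terminal may use a stored image file as the image file containing the target object and determine the target object in that file according to a selection operation by the user. The image file may contain at least one target object. Optionally, the target object may be determined based on the user's operation on the image file, or based on target features of the target object.
Step S202: The terminal identifies the target object according to an image recognition model in the terminal by using a neural network computing device in the terminal, to obtain object category information of the target object.
In some feasible implementations, after acquiring the image file containing the target object, the terminal may identify the target object according to the image recognition model in the terminal by using the neural network computing device in the terminal, to obtain the object category information of the target object. Specifically, the image recognition model in the terminal may be built by a server: the server collects image files of many types, trains the image recognition model on them, and pushes the trained model to the terminal. Alternatively, the terminal may build the image recognition model itself by using the neural network computing device. After the terminal acquires the image file containing the target object, a user instruction or another trigger may cause the terminal to invoke the image recognition model to identify the target object and obtain its object category information. The object category information may include text information such as the object category or the object name of the target object. The neural network computing device configured in the terminal improves the efficiency of image recognition on the terminal.
Step S203: The terminal saves the object category information to the image file as the first tag information of the target object.
In some feasible implementations, after the terminal identifies the target object according to the image recognition model in the terminal and obtains the object category information, the terminal saves the object category information to the image file as the first tag information of the target object. The object category information obtained through the image recognition model may be saved to the image file in a preset tag information format, so that it is stored in the image file as the first tag information. Optionally, the first tag information may be saved in an extension segment or a comment segment of the image file. Optionally, the terminal may label the target object according to the first tag information, or may determine more detailed second tag information according to the first tag information and label the target object according to the second tag information. Optionally, the terminal may also revise the saved first tag information in response to a user operation, or upload the first tag information to a server for sharing. This is not limited in the embodiments of the present invention.
It can be seen that, by acquiring an image file containing a target object, the terminal can identify the target object in the image file according to the image recognition model in the terminal by using the neural network computing device in the terminal, obtain the item category information of the target object, and save the item category information to the image file as the first tag information. In this way, the terminal identifies the target object with its own neural network computing device, which improves the efficiency of image recognition on the terminal and effectively protects the privacy of the terminal user.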
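Refer to FIG. 3. FIG. 3 is a schematic flowchart of another image recognition method according to an embodiment of the present invention. As shown in FIG. 3, the method includes at least the following steps.
Step S301: Server A trains the image recognition model.
In some feasible implementations, server A may train the image recognition model on collected video files or image files. Optionally, the image recognition model may be a convolutional neural network (CNN) model. Server A may collect various multimedia files (such as video files, image files, or other files that include display images) as training material, either offline (for example, through manual collection in the background) or in advance.
Step S302: Server A sends the trained image recognition model to the terminal.
In some feasible implementations, server A may send the trained image recognition model to the terminal. Specifically, the server may send the image recognition model to the terminal as a file in a specific format known to both the server and the terminal, so that the terminal can recognize the file when it receives it and store the image recognition model. Optionally, server A may send the image recognition model to a terminal after receiving a request from that terminal; or, after training the image recognition model, server A may send it to all or some of the terminals in the system; or, after training the image recognition model, server A may broadcast a notification to the terminals in the system and send the image recognition model to those terminals that respond.
Step S303: The terminal receives and stores the image recognition model.
In some feasible implementations, the terminal may receive the image recognition model sent by server A and store it in the terminal's memory; after acquiring an image file containing a target object, the terminal can retrieve the image recognition model from that memory.
Step S304: The terminal determines, according to a selection operation performed by the user on an image displayed on the terminal, the target object corresponding to the selection operation.
In some feasible implementations, the terminal may display an image file, a video file, or the like on its display interface and monitor the user's touch operations on the image currently displayed. When it detects that a touch operation on the displayed image is a selection operation, it may determine the target object corresponding to that selection operation. Optionally, one or more selection operations on the displayed image may be detected, and a detected selection operation may correspond to one or more target objects. The one or more target objects may be determined from the image region selected by the selection operation.
For example, refer to FIG. 6. FIG. 6 is a schematic diagram of a terminal display interface according to an embodiment of the present invention. As shown in FIG. 6A, when the selection region corresponding to the detected selection operation is C1, the target object may be determined to be M1; when the selection region is C2, the target objects may be determined to be M2 and M3.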
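One way to derive the target objects from a selection region, as in the C1 and C2 example, is a bounding-box containment test. The sketch below is a hypothetical illustration in Python: the `Box` type, the coordinates, and the rule that a target counts as selected only when its box lies entirely inside the selection region are assumptions, since the source does not say how the region is matched to targets.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def contains(outer: Box, inner: Box) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return (outer.left <= inner.left and outer.top <= inner.top
            and inner.right <= outer.right and inner.bottom <= outer.bottom)


def targets_in_selection(selection: Box, targets: dict) -> list:
    """Return the names of all targets whose box lies inside the selection."""
    return [name for name, box in targets.items() if contains(selection, box)]


# Made-up coordinates that reproduce the C1 / C2 situation described above.
targets = {
    "M1": Box(10, 10, 40, 40),
    "M2": Box(60, 10, 90, 40),
    "M3": Box(60, 50, 90, 80),
}
c1 = Box(0, 0, 50, 50)
c2 = Box(55, 0, 100, 100)
print(targets_in_selection(c1, targets))  # ['M1']
print(targets_in_selection(c2, targets))  # ['M2', 'M3']
```

A looser rule, such as partial overlap, would also be consistent with the text; containment is used here only because it is the simplest to state.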
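Step S305: The terminal generates an image file containing the target object.
In some feasible implementations, after determining the target object in the image that corresponds to the selection operation, the terminal may generate an image file containing the target object. Optionally, the number of image files generated may depend on the number of selected target objects. For example, one image file may contain several target objects, or one image file may contain only a single target object.
For example, when the terminal determines that the selection region corresponding to the user's selection operation on the display interface is C2, it may generate a new image file containing the target objects. As shown in FIG. 6B, a single image file contains both target object M2 and target object M3; or, as shown in FIG. 6C, two image files are generated according to the number of target objects, and the two image files in FIG. 6C contain target object M2 and target object M3 respectively.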
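The two outcomes shown in FIG. 6B and FIG. 6C amount to a choice of how the selected targets are grouped into files. A minimal Python sketch, in which the function name `plan_image_files` and the boolean switch are hypothetical and actual image cropping is left out:

```python
def plan_image_files(targets: list, one_file_per_target: bool) -> list:
    """Group selected targets into image files.

    Returns a list of groups, one group per image file to generate.
    With n targets this yields either 1 file or n files; both satisfy
    the constraint that the number of files m is at most n.
    """
    if not targets:
        return []
    if one_file_per_target:
        return [[t] for t in targets]   # FIG. 6C style: one target per file
    return [list(targets)]              # FIG. 6B style: all targets in one file


print(plan_image_files(["M2", "M3"], one_file_per_target=False))  # [['M2', 'M3']]
print(plan_image_files(["M2", "M3"], one_file_per_target=True))   # [['M2'], ['M3']]
```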
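Step S306: The terminal identifies the target object according to the stored image recognition model by using the neural network computing device in the terminal, to obtain object category information of the target object.
In some feasible implementations, after generating the image file containing the target object, the terminal may identify the target object according to the stored image recognition model by using the neural network computing device in the terminal, to obtain the object category information of the target object. Optionally, after generating the image file, the terminal may use the neural network computing device to feed the image file as input data into the image recognition model stored in the terminal. The image recognition model may use the computation provided by the neural network computing device to extract the target object from the image file, classify and recognize it, and finally output its object category information. The terminal may output the object category information of the target object at the output of the image recognition model. The object category information may include the object category or the object name of the target object.
Step S307: The terminal saves the object category information to the image file as the first tag information of the target object.
In some feasible implementations, after the terminal identifies the target object according to the image recognition model and obtains its object category information, the terminal may save the object category information to the image file as the first tag information of the target object. The object category information obtained by the terminal may be stored in a preset tag format.
For example, when the object category information obtained by the terminal indicates that the object category of the target object is animal (动物) and the object name is rabbit (兔子), it may be converted into the first tag information in the following preset tag format.
In the embodiments of the present invention, the tag format may be, but is not limited to, the following, where the elements 类别 and 名称 hold the category and the name: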
<xml>
<类别>动物</类别>
<名称>兔子</名称>
</xml>
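A tag in the format above can be produced with a standard XML library. The following Python sketch uses `xml.etree.ElementTree` and keeps the element names exactly as they appear in the example; the function name `build_first_tag` is hypothetical, and the source does not say how the terminal actually serializes the tag.

```python
import xml.etree.ElementTree as ET


def build_first_tag(category: str, name: str) -> str:
    """Serialize object category information into the example tag format."""
    root = ET.Element("xml")
    ET.SubElement(root, "类别").text = category  # object category
    ET.SubElement(root, "名称").text = name      # object name
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


tag = build_first_tag("动物", "兔子")
print(tag)
```

Building the tag through the library rather than by string concatenation means that characters such as `<` or `&` in a category or name are escaped automatically.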
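Optionally, after converting the object category information into the first tag information in the above tag format, the terminal may save it in an extension segment or a comment segment of the image file. Taking JPEG (Joint Photographic Experts Group) as an example image format, 0xFF0 to 0xFFD in a JPEG image file are reserved extension segments, and 0xFFE in a JPEG image file is the reserved comment segment. The terminal may save the first tag information in either of these two kinds of segment in the image file containing the target object. If the image file is in another format, such as PNG (Portable Network Graphic Format, an image file storage format), the terminal may save the first tag information in the corresponding segment of that format; this is not limited in the embodiments of the present invention.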
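Saving a tag in the comment segment can be sketched as below. The sketch assumes that the comment segment corresponds to the standard JPEG COM marker (bytes 0xFF 0xFE), which is followed by a two-byte big-endian length that counts the length field itself plus the payload. The function names are hypothetical, the byte string in the usage lines is a fabricated stub rather than a real image, and placing the comment after any leading APPn segments is a design choice made here so that an existing JFIF or Exif header stays first.

```python
import struct

SOI = b"\xff\xd8"         # start-of-image marker
COM_MARKER = b"\xff\xfe"  # comment segment marker


def add_jpeg_comment(data: bytes, text: str) -> bytes:
    """Return a copy of the JPEG stream with `text` stored in a COM segment."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG stream (missing SOI marker)")
    payload = text.encode("utf-8")
    if len(payload) > 0xFFFF - 2:
        raise ValueError("comment too long for a single COM segment")

    # Skip leading APPn segments (markers 0xFFE0 to 0xFFEF).
    pos = 2
    while (pos + 4 <= len(data) and data[pos] == 0xFF
           and 0xE0 <= data[pos + 1] <= 0xEF):
        (seg_len,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + seg_len

    segment = COM_MARKER + struct.pack(">H", len(payload) + 2) + payload
    return data[:pos] + segment + data[pos:]


def read_jpeg_comments(data: bytes) -> list:
    """Return the text of every COM segment found before the scan data."""
    comments = []
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan / end of image
            break
        (seg_len,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xFE:
            comments.append(data[pos + 4:pos + 2 + seg_len].decode("utf-8"))
        pos += 2 + seg_len
    return comments


# Fabricated stub: SOI, one tiny APP0 segment, EOI. Not a decodable image.
stub = SOI + b"\xff\xe0\x00\x04ab" + b"\xff\xd9"
tagged = add_jpeg_comment(stub, "<xml><名称>兔子</名称></xml>")
print(read_jpeg_comments(tagged))
```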
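Optionally, the first tag information may be displayed on the image file in a preset tag display format.
Optionally, the first tag information may also be displayed, in a preset tag display format, on the original image or video file as it was before the user performed the selection operation.
For example, refer to FIG. 7. FIG. 7 is a schematic diagram of another terminal display interface according to an embodiment of the present invention. As shown in FIG. 7B or FIG. 7C, when one or more image files containing target objects are generated, the target objects in those image files may be labeled; for the specific tag information, see FIG. 7B or FIG. 7C. Alternatively, the target objects in the original image may be labeled; for the specific tag information, see FIG. 7A.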
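Step S308: The terminal sends the first tag information of the target object to server A, to share the first tag information on server A.
In some feasible implementations, the terminal may also send the obtained first tag information to server A so that it can be shared on server A. Optionally, the terminal may instead send the object category information to server A. The terminal may also send server A the image file that contains the target object and in which the first tag information has been saved. Server A can store the result of the terminal's image recognition, that is, the object category information or the first tag information of the target object sent by the terminal. When another terminal asks the server to recognize the same target object, the server can return the object category information or first tag information uploaded by the terminal, so that the image recognition result is shared. At the same time, server A avoids running the image recognition model on the same target object repeatedly, which improves image recognition efficiency.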
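The sharing behavior on server A resembles a lookup table of recognition results. The Python sketch below is only an assumption-laden illustration: the class `SharedTagStore` is hypothetical, and treating two requests as "the same target object" when the SHA-256 digests of their image bytes are equal is a simplification introduced here, since the source does not define how identical targets are detected.

```python
import hashlib
from typing import Optional


class SharedTagStore:
    """In-memory store of first tag information shared by terminals."""

    def __init__(self) -> None:
        self._tags = {}

    @staticmethod
    def _key(image_bytes: bytes) -> str:
        # Assumed identity rule: same bytes means same target object.
        return hashlib.sha256(image_bytes).hexdigest()

    def share(self, image_bytes: bytes, first_tag: dict) -> None:
        """Record the tag uploaded by a terminal for this image."""
        self._tags[self._key(image_bytes)] = first_tag

    def lookup(self, image_bytes: bytes) -> Optional[dict]:
        """Return a previously shared tag, or None if recognition is still needed."""
        return self._tags.get(self._key(image_bytes))


store = SharedTagStore()
store.share(b"image-of-a-rabbit", {"category": "animal", "name": "rabbit"})
print(store.lookup(b"image-of-a-rabbit"))
print(store.lookup(b"some-other-image"))
```

A hit in `lookup` is the case in which server A can answer another terminal without running the image recognition model again.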
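Optionally, before sending the first tag information of the target object to server A, the terminal may revise or calibrate the first tag information. For example, after the first tag information is displayed on the image file, if the recognition result of the image recognition model is wrong, that is, the first tag information is not accurate enough, the terminal may receive a revision operation performed by the user on the first tag information and revise the first tag information accordingly to make it more accurate. Sending the revised first tag information to server A spares server A the background process of revising tag information, which further improves image recognition efficiency. Alternatively, after sending the first tag information of the target object to server A, the terminal may receive revised tag information that server A feeds back for the first tag information, so that the terminal can use server A to revise the image recognition result and obtain a more accurate result.
In this embodiment of the present invention, the terminal may receive and store the image recognition model trained by the server, and may identify an image containing the target object on the terminal according to the image recognition model sent by the server, to obtain the object category information of the target object. The terminal may label the target object with the object category information as the first tag information, and may also send the first tag information to the server for sharing, so that the terminal takes over part of the server's image recognition workload and image recognition efficiency improves.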
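Refer to FIG. 4. FIG. 4 is a schematic flowchart of still another image recognition method according to an embodiment of the present invention. As shown in FIG. 4, the method includes the following steps.
Step S401: The terminal acquires an image file containing a target object.
Step S402: The terminal identifies the target object according to an image recognition model in the terminal by using a neural network computing device in the terminal, to obtain object category information of the target object.
Step S403: The terminal saves the object category information to the image file as the first tag information of the target object.
In some feasible implementations, for specific implementations of steps S401 to S403, refer to the corresponding steps in the embodiment shown in FIG. 2 or FIG. 3; details are not described herein again.
Step S404: The terminal sends the first tag information to server B.
In some feasible implementations, after determining the first tag information of the target object, the terminal may send the first tag information to server B. Optionally, the terminal sends server B a query request that carries the first tag information. By sending the query request, the terminal asks server B to search for second tag information that matches the first tag information, where the second tag information includes a more detailed description of the target object.
Step S405: Server B receives the first tag information and searches the stored target object detailed information library for second tag information that matches the first tag information.
In some feasible implementations, after receiving the first tag information, server B may search the stored target object detailed information library for second tag information that matches it. Specifically, server B may establish an item detailed information library in advance. The library contains detailed information about items, and different item categories may correspond to different kinds of detailed information. For example, when the item is an animal or a plant, the detailed information may include, but is not limited to, the item name, scientific name, English name, taxonomic classification, habitat, and characteristics; when the item is a commodity, the detailed information may include, but is not limited to, the item name, category, place of production, and manufacturer. Server B may search the item detailed information library, according to the item category or item name in the first tag information, for second tag information that matches that category or name. If several pieces of tag information match the item category or item name in the first tag information, server B may further use the specific display image of the target object to select, from among them, the second tag information that matches the display image of the target object corresponding to the first tag information.
Step S406: Server B sends a search result to the terminal.
In some feasible embodiments, server B may determine the result of the search in the target object detailed information library. The search result may be either found or not found. When the server finds the second tag information, it may carry the second tag information in the search result sent to the terminal.
Step S407: The terminal determines, according to the search result, whether the second tag information was found.
In some feasible embodiments, the terminal may receive the search result sent by server B and determine from it whether second tag information matching the first tag information was found. Specifically, the terminal may parse the search result to check whether it carries the second tag information; if it does not, the terminal may determine that the second tag information was not found. Alternatively, the terminal and server B may agree that different values carried in the search result represent different outcomes: for example, a value of 0 indicates that the second tag information was not found, and a value of 1 indicates that it was found. In that case, when the received search result is 1, the terminal may further request the second tag information from the server.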
步骤S408,若所述终端判断结果为是,则所述终端将所述第二标签信息保存至所述图像文件,并根据所述第二标签信息对所述图像文件中的所述目标物进行标注。
在一些可行的实施方式中,当终端判断出查找到第二标签信息后,可进一步将第二标签信息保存至包含目标物体的图像文件中。其中,保存第二标签信息的具体实施方式可参见保存第一标签信息的实施方式。其中,可将第一标签信息及第二标签信息保存至图像文件的同一扩展段或注释段内,或者将第一标签信息或第二标签信息分别保存至图像文件的扩展段及注释段中。
可选的,可将保存的第一标签信息或第二标签信息标注至图像文件的目标物上。本发明实施例中,当终端判断出查找到第二标签信息后,可将第二标签信息按照预设的标签格式标注至图像文件的目标物上。其中,第二标签信息的标签格式可与第一标签信息的标签格式不同。
可选的,还可同时通过第一标签信息与第二标签信息对图像文件进行标注。
举例说明,请参阅图8,图8是本发明实施例公开的又一种终端显示界面的示意图。如图8A所示,通过第二标签信息对图像文件上的目标物进行标注;如图8B所示,还可同时通过第一标签信息与第二标签信息对图像文件上的目标物进行标注。
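Step S409: If the determination result is no, the terminal annotates the target object in the image file according to the first label information and sends a notification message to server B, to notify server B that the target object detail library lacks second label information matching the first label information.
In some feasible implementations, after determining that the second label information was not found, the terminal may annotate the target object in the image file with the first label information in a preset label format. The terminal may also notify server B that the target object detail library stored on server B lacks second label information matching the first label information, that is, lacks detailed information about the target object. Optionally, the terminal may look up the corresponding second label information in another way (for example, through a web page) and feed it back to server B for review; if the review passes, server B returns a message accepting the second label information. Alternatively, server B may look up the second label information itself and add it to the target object detail library.
In this embodiment of the present invention, after obtaining the first label information, the terminal may obtain the more detailed second label information. The target object can therefore be annotated with multiple kinds of label information, which gives the user more information and improves user experience.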
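The branch between steps S408 and S409 can be sketched as below. The image file is modelled as a plain dictionary and `notify_missing` stands in for the notification message to server B; both are hypothetical simplifications for illustration, not structures defined in the source.

```python
from typing import Callable, Optional


def apply_lookup_outcome(
    image_file: dict,
    second_label: Optional[dict],
    notify_missing: Callable[[str], None],
) -> dict:
    """Annotate the target object according to whether a second label was found."""
    first_label = image_file["labels"]["first"]
    if second_label is not None:
        # S408: save the second label and annotate with it.
        image_file["labels"]["second"] = second_label
        image_file["annotation"] = second_label
    else:
        # S409: fall back to the first label and report the gap to server B.
        image_file["annotation"] = first_label
        notify_missing(first_label)
    return image_file
```

Annotating with both labels at once, as in FIG. 8B, would only change what is stored under `annotation`.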
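Refer to FIG. 5, which is a schematic flowchart of yet another image recognition method disclosed in an embodiment of the present invention. As shown in FIG. 5, the method includes the following steps.
Step S501: Server A trains an image recognition model.
Step S502: Server A sends the trained image recognition model to the terminal.
Step S503: The terminal receives and stores the image recognition model.
Step S504: The terminal determines, according to a selection operation performed by a user on an image displayed on the terminal, the target object corresponding to the selection operation.
Step S505: The terminal generates an image file containing the target object.
Step S506: The terminal uses the neural network computing apparatus in the terminal to recognize the target object according to the stored image recognition model, to obtain object category information of the target object.
Step S507: The terminal saves the object category information to the image file as first label information of the target object.
Step S508: The terminal sends the first label information to server A, so that the first label information is shared on server A.
Step S509: The terminal sends the first label information to server B.
Step S510: Server B receives the first label information and searches the stored target object detail library for second label information that matches the first label information.
Step S511: Server B sends a search result to the terminal.
Step S512: If the terminal obtains the second label information according to the search result, the terminal saves the second label information to the image file and annotates the target object in the image file according to the second label information.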
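In some feasible implementations, for specific implementations of steps S501 to S511, refer to the descriptions of the corresponding steps in the embodiments shown in FIG. 2 to FIG. 4. Details are not repeated here. Note that this embodiment of the present invention does not limit the order in which steps S508 and S509 are performed.
In this embodiment of the present invention, the terminal may use the image recognition model pushed by server A to obtain, on the terminal itself, label information for the target object in an image, and may use server B to obtain more detailed label information for that target object. This offloads image recognition work from the server, improves image recognition efficiency, and provides the user with multiple kinds of label information, which improves user experience.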
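The terminal-side part of the FIG. 5 flow (steps S506 to S512) can be sketched end to end with the recognizer and both servers injected as callables. The function name, the dictionary layout of the image file, and the three callbacks are hypothetical; the fallback to the first label when no second label is returned is borrowed from step S409 of FIG. 4 and is not stated for FIG. 5 itself.

```python
from typing import Callable, Optional


def run_recognition_flow(
    image_file: dict,
    recognize: Callable[[bytes], str],
    share_with_server_a: Callable[[str], None],
    query_server_b: Callable[[str], Optional[dict]],
) -> dict:
    """Run on-device recognition, then enrich the label through server B."""
    # S506: recognize the target object with the locally stored model.
    first_label = recognize(image_file["pixels"])
    # S507: save the category as the first label in the image file.
    image_file["labels"] = {"first": first_label}
    # S508: share the first label on server A (order relative to S509 is free).
    share_with_server_a(first_label)
    # S509 to S511: ask server B for a matching second label.
    second_label = query_server_b(first_label)
    if second_label is not None:
        # S512: save the second label and annotate with it.
        image_file["labels"]["second"] = second_label
        image_file["annotation"] = second_label
    else:
        # Assumed fallback, following S409 of FIG. 4.
        image_file["annotation"] = first_label
    return image_file
```

Only the label strings cross the network in this sketch; the pixel data stays on the terminal, which is the workload split the paragraph above describes.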
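The following describes the apparatus embodiments of the present invention in detail with reference to the foregoing application scenarios and method embodiments.
Refer to FIG. 9, which is a schematic diagram of the unit composition of a terminal disclosed in an embodiment of the present invention. The terminal 900 may include:
an obtaining unit 901, configured to obtain an image file containing a target object;
a recognition unit 902, configured to recognize the target object by using the neural network computing apparatus in the terminal according to the image recognition model in the terminal, to obtain object category information of the target object; and
a saving unit 903, configured to save the object category information to the image file as first label information of the target object.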
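Optionally, in some feasible implementations, the obtaining unit 901 includes:
a determining unit, configured to determine, according to a selection operation performed by a user on an image displayed on the terminal, the target object corresponding to the selection operation; and
a generating unit, configured to generate an image file containing the target object.
Optionally, in some feasible implementations, if the selection operation corresponds to n target objects, where n is a positive integer, the generating unit is further configured to:
generate an image file containing the n target objects; or
generate m image files that contain at least one of the n target objects, where m is an integer and m ≤ n.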
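The second option, distributing n selected target objects over m image files, can be sketched as a simple grouping step. Two assumptions are made here that the text does not state: m is at least 1, and every file receives at least one target object. The round-robin assignment is only one of many valid groupings.

```python
def split_targets(targets: list[str], m: int) -> list[list[str]]:
    """Group n target objects into m non-empty groups, one group per image file."""
    n = len(targets)
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= n")
    groups: list[list[str]] = [[] for _ in range(m)]
    # Round-robin assignment guarantees every group gets at least one target
    # because m never exceeds n.
    for index, target in enumerate(targets):
        groups[index % m].append(target)
    return groups
```

With m = 1 the result is a single group holding all n target objects, which coincides with the first option; with m = n each target object gets its own image file.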
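Optionally, in some feasible implementations, before the recognition unit 902 recognizes the target object according to the image recognition model in the terminal to obtain the object category information of the target object, the terminal further includes:
a receiving unit 904, configured to receive and store the image recognition model sent by a first server; or
a training unit 905, configured to perform image recognition training by using the neural network computing apparatus according to pictures in the terminal, to obtain the image recognition model.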
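Optionally, in some feasible implementations, after the saving unit 903 saves the object category information to the image file as the first label information of the target object, the terminal further includes:
a sending unit 906, configured to send the first label information of the target object to the first server, so that the first label information is shared on the first server.
Optionally, in some feasible implementations, before the sending unit 906 sends the first label information of the target object to the first server, the terminal further includes:
a revision unit 907, configured to revise the first label information according to a revision operation performed by the user on the first label information;
the sending unit 906 is further configured to:
send the revised first label information to the first server.
Optionally, in some feasible implementations, after the sending unit 906 sends the first label information of the target object to the first server, the terminal further includes:
an updating unit 908, configured to update the first label information to first label revision information if the terminal receives the first label revision information sent by the first server.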
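Optionally, in some feasible implementations, after the saving unit 903 saves the object category information to the image file as the first label information of the target object, the terminal further includes:
a search unit 909, configured to search a target object detail library for second label information that matches the first label information;
a determining unit 910, configured to determine whether the second label information was found;
the saving unit 903 is further configured to, if the determination result of the determining unit is yes, save the second label information to the image file and annotate the target object in the image file according to the second label information; and
an annotation unit 911, configured to annotate the target object in the image file according to the first label information if the determination result of the determining unit is no.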
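In this embodiment of the present invention, the obtaining unit 901 is configured to perform the method of step S201 in the embodiment shown in FIG. 2; the recognition unit 902 is configured to perform the method of step S202 in the embodiment shown in FIG. 2; and the saving unit 903 is configured to perform the method of step S203 in the embodiment shown in FIG. 2.
In this embodiment of the present invention, the receiving unit 904 is configured to perform the method of step S303 in the embodiment shown in FIG. 3; the obtaining unit 901 is further configured to perform the methods of steps S304 and S305 in the embodiment shown in FIG. 3; the recognition unit 902 is further configured to perform the method of step S306 in the embodiment shown in FIG. 3; the saving unit 903 is further configured to perform the method of step S307 in the embodiment shown in FIG. 3; and the sending unit 906 is configured to perform the method of step S308 in the embodiment shown in FIG. 3.
In this embodiment of the present invention, the obtaining unit 901 is further configured to perform the method of step S401 in the embodiment shown in FIG. 4; the recognition unit 902 is further configured to perform the method of step S402 in the embodiment shown in FIG. 4; the saving unit 903 is further configured to perform the method of step S403 in the embodiment shown in FIG. 4; the sending unit 906 is further configured to perform the method of step S404 in the embodiment shown in FIG. 4; the determining unit 910 is configured to perform the method of step S407 in the embodiment shown in FIG. 4; the saving unit 903 is further configured to perform the method of step S408 in the embodiment shown in FIG. 4; and the annotation unit 911 is configured to perform the methods of steps S408 and S409 in the embodiment shown in FIG. 4.
In this embodiment of the present invention, the receiving unit 904 is further configured to perform the method of step S503 in the embodiment shown in FIG. 5; the obtaining unit 901 is further configured to perform the methods of steps S504 and S505 in the embodiment shown in FIG. 5; the recognition unit 902 is further configured to perform the method of step S506 in the embodiment shown in FIG. 5; the saving unit 903 is further configured to perform the method of step S507 in the embodiment shown in FIG. 5; the sending unit 906 is further configured to perform the methods of steps S508 and S509 in the embodiment shown in FIG. 5; and the annotation unit 911 is further configured to perform the method of step S512 in the embodiment shown in FIG. 5.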
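In this embodiment of the present invention, the sending unit 906 may send messages, information, and the like to a server or another communication device through a communication interface configured in the terminal 900; the receiving unit 904 may receive, through the communication interface configured in the terminal 900, messages, information, and the like sent by a terminal or another communication device. The communication interface is a wireless interface.
With reference to the foregoing embodiments, the terminal 900 in the embodiment shown in FIG. 9 is presented in the form of units. A "unit" here may refer to an application-specific integrated circuit (ASIC), a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or another device that can provide the foregoing functions.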
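Refer to FIG. 10, which is a schematic structural diagram of a terminal disclosed in an embodiment of the present invention. The terminal 1000 includes a storage unit 1010, a communication interface 1020, a neural network computing apparatus 1030, and a processor 1040 coupled to the storage unit 1010 and the communication interface 1020. The storage unit 1010 is configured to store instructions, the processor 1040 is configured to execute the instructions, and the communication interface 1020 is configured to communicate with other devices under the control of the processor 1040. When executing the instructions, the processor 1040 may perform, according to the instructions, any of the image recognition methods in the foregoing embodiments of this application.
The neural network computing apparatus 1030 may include one or more neural network computing chips, such as at least one of a DSP (Digital Signal Processor) chip, an NPU (Network Processing Unit), and a GPU (Graphics Processing Unit).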
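The processor 1040 may also be referred to as a central processing unit (CPU). The storage unit 1010 may include read-only memory and random access memory, and provides instructions, data, and the like to the processor 1040. A part of the storage unit 1010 may further include non-volatile random access memory. In a specific application, the components of the terminal 1000 are coupled together, for example, through a bus system. In addition to a data bus, the bus system may include a power bus, a control bus, a status signal bus, and the like. For clarity of description, however, the various buses are all marked as the bus system 1050 in the figure. The methods disclosed in the foregoing embodiments of the present invention may be applied to the processor 1040 or implemented by the processor 1040. The processor 1040 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing methods may be completed by an integrated logic circuit of hardware in the processor 1040 or by instructions in the form of software. The processor 1040 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 1040 may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the methods disclosed with reference to the embodiments of the present invention may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the storage unit 1010; for example, the processor 1040 may read information from the storage unit 1010 and complete the steps of the foregoing methods in combination with its hardware.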
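An embodiment of the present invention further provides a computer storage medium, configured to store computer software instructions used by the foregoing terminal, including a computer program for performing the foregoing method embodiments.
Although the present invention is described here with reference to the embodiments, in the course of implementing the claimed invention, a person skilled in the art can understand and implement other variations of the disclosed embodiments by studying the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfil several functions recited in the claims. Certain measures are recited in mutually different dependent claims, but this does not mean that these measures cannot be combined to good effect.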
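A person skilled in the art should understand that the embodiments of the present invention may be provided as a method, an apparatus (device), or a computer program product. Therefore, the present invention may take the form of a hardware-only embodiment, a software-only embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code. The computer program is stored or distributed in a suitable medium, provided together with other hardware or as part of the hardware, and may also be distributed in other forms, such as through the Internet or other wired or wireless telecommunication systems.
The present invention is described with reference to flowcharts and/or block diagrams of the method, apparatus (device), and computer program product according to the embodiments of the present invention. It should be understood that computer program instructions can implement each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.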
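Although the present invention is described with reference to specific features and embodiments thereof, it is evident that various modifications and combinations may be made without departing from the spirit and scope of the present invention. Accordingly, this specification and the drawings are merely exemplary descriptions of the present invention as defined by the appended claims, and are deemed to cover any and all modifications, variations, combinations, or equivalents within the scope of the present invention. A person skilled in the art can clearly make various changes and variations to the present invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include them.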

Claims (25)

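  1. A terminal, comprising a processor, a memory, a neural network computing apparatus, a communication interface, and a communication bus, wherein the processor, the memory, the neural network computing apparatus, and the communication interface are connected through the communication bus and communicate with one another;
    the memory stores executable program code, and the communication interface is used for wireless communication;
    the processor is configured to invoke the executable program code in the memory to perform the following operations:
    obtaining an image file containing a target object;
    controlling the neural network computing apparatus to recognize the target object according to an image recognition model in the terminal, to obtain object category information of the target object; and
    saving the object category information to the image file as first label information of the target object.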
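  2. The terminal according to claim 1, wherein the neural network computing apparatus comprises at least one neural network computing chip.
  3. The terminal according to claim 1 or 2, wherein the processor obtains the image file containing the target object specifically by:
    determining, according to a selection operation performed by a user on an image displayed on the terminal, the target object corresponding to the selection operation; and
    generating the image file containing the target object.
  4. The terminal according to claim 3, wherein if the selection operation corresponds to n target objects, n being a positive integer, the processor generates the image file containing the target object specifically by:
    generating an image file containing the n target objects; or
    generating m image files that contain at least one of the n target objects, wherein m is an integer and m ≤ n.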
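  5. The terminal according to any one of claims 1 to 4, wherein before the processor controls the neural network computing apparatus to recognize the target object according to the image recognition model in the terminal to obtain the object category information of the target object, the processor is further configured to invoke the executable program code in the memory to perform the following operations:
    controlling the communication interface to receive, and storing, the image recognition model sent by a first server; or
    controlling the neural network computing apparatus to perform image recognition training according to pictures in the terminal, to obtain the image recognition model.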
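  6. The terminal according to claim 5, wherein after the processor saves the object category information to the image file as the first label information of the target object, the processor is further configured to invoke the executable program code in the memory to perform the following operation:
    controlling the communication interface to send the first label information of the target object to the first server, so that the first label information is shared on the first server.
  7. The terminal according to claim 6, wherein before the processor sends the first label information of the target object to the first server, the processor is further configured to invoke the executable program code in the memory to perform the following operation:
    revising the first label information according to a revision operation performed by the user on the first label information;
    wherein the processor controls the communication interface to send the first label information of the target object to the first server specifically by:
    sending the revised first label information to the first server.
  8. The terminal according to claim 6, wherein after the processor controls the communication interface to send the first label information of the target object to the first server, the processor is further configured to invoke the executable program code in the memory to perform the following operation:
    if first label revision information sent by the first server is received through the communication interface, updating the first label information to the first label revision information.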
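  9. The terminal according to any one of claims 1 to 8, wherein after the processor saves the object category information to the image file as the first label information of the target object, the processor is further configured to invoke the executable program code in the memory to perform the following operations:
    searching a target object detail library for second label information that matches the first label information;
    determining whether the second label information was found;
    if yes, saving the second label information to the image file, and annotating the target object in the image file according to the second label information; and
    if no, annotating the target object in the image file according to the first label information.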
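  10. A terminal, comprising:
    an obtaining unit, configured to obtain an image file containing a target object;
    a recognition unit, configured to recognize the target object by using a neural network computing apparatus in the terminal according to an image recognition model in the terminal, to obtain object category information of the target object; and
    a saving unit, configured to save the object category information to the image file as first label information of the target object.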
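  11. The terminal according to claim 10, wherein the obtaining unit comprises:
    a determining unit, configured to determine, according to a selection operation performed by a user on an image displayed on the terminal, the target object corresponding to the selection operation; and
    a generating unit, configured to generate the image file containing the target object.
  12. The terminal according to claim 11, wherein if the selection operation corresponds to n target objects, n being a positive integer, the generating unit is further configured to:
    generate an image file containing the n target objects; or
    generate m image files that contain at least one of the n target objects, wherein m is an integer and m ≤ n.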
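  13. The terminal according to any one of claims 10 to 12, wherein before the recognition unit recognizes the target object according to the image recognition model in the terminal to obtain the object category information of the target object, the terminal further comprises:
    a receiving unit, configured to receive and store the image recognition model sent by a first server; or
    a training unit, configured to perform image recognition training by using the neural network computing apparatus according to pictures in the terminal, to obtain the image recognition model.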
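  14. The terminal according to claim 13, wherein after the saving unit saves the object category information to the image file as the first label information of the target object, the terminal further comprises:
    a sending unit, configured to send the first label information of the target object to the first server, so that the first label information is shared on the first server.
  15. The terminal according to claim 14, wherein before the sending unit sends the first label information of the target object to the first server, the terminal further comprises:
    a revision unit, configured to revise the first label information according to a revision operation performed by the user on the first label information;
    the sending unit is further configured to:
    send the revised first label information to the first server.
  16. The terminal according to claim 14, wherein after the sending unit sends the first label information of the target object to the first server, the terminal further comprises:
    an updating unit, configured to update the first label information to first label revision information if the terminal receives the first label revision information sent by the first server.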
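  17. The terminal according to any one of claims 10 to 16, wherein after the saving unit saves the object category information to the image file as the first label information of the target object, the terminal further comprises:
    a search unit, configured to search a target object detail library for second label information that matches the first label information;
    a determining unit, configured to determine whether the second label information was found;
    the saving unit is further configured to, if the determination result of the determining unit is yes, save the second label information to the image file and annotate the target object in the image file according to the second label information; and
    an annotation unit, configured to annotate the target object in the image file according to the first label information if the determination result of the determining unit is no.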
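  18. An image recognition method, comprising:
    obtaining, by a terminal, an image file containing a target object;
    recognizing, by the terminal, the target object by using a neural network computing apparatus in the terminal according to an image recognition model in the terminal, to obtain object category information of the target object; and
    saving, by the terminal, the object category information to the image file as first label information of the target object.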
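  19. The method according to claim 18, wherein the obtaining, by the terminal, of the image file containing the target object comprises:
    determining, by the terminal according to a selection operation performed by a user on an image displayed on the terminal, the target object corresponding to the selection operation; and
    generating, by the terminal, the image file containing the target object.
  20. The method according to claim 19, wherein if the selection operation corresponds to n target objects, n being a positive integer, the generating, by the terminal, of the image file containing the target object comprises:
    generating, by the terminal, an image file containing the n target objects; or
    generating, by the terminal, m image files that contain at least one of the n target objects, wherein m is an integer and m ≤ n.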
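  21. The method according to any one of claims 18 to 20, wherein before the terminal recognizes the target object according to the image recognition model in the terminal to obtain the object category information of the target object, the method further comprises:
    receiving and storing, by the terminal, the image recognition model sent by a first server; or
    performing, by the terminal, image recognition training by using the neural network computing apparatus according to pictures in the terminal, to obtain the image recognition model.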
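  22. The method according to claim 21, wherein after the terminal saves the object category information to the image file as the first label information of the target object, the method further comprises:
    sending, by the terminal, the first label information of the target object to the first server, so that the first label information is shared on the first server.
  23. The method according to claim 22, wherein before the terminal sends the first label information of the target object to the first server, the method further comprises:
    revising, by the terminal, the first label information according to a revision operation performed by the user on the first label information;
    and the sending, by the terminal, of the first label information of the target object to the first server comprises:
    sending, by the terminal, the revised first label information to the first server.
  24. The method according to claim 22, wherein after the terminal sends the first label information of the target object to the first server, the method further comprises:
    if the terminal receives first label revision information sent by the first server, updating, by the terminal, the first label information to the first label revision information.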
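  25. The method according to any one of claims 18 to 24, wherein after the terminal saves the object category information to the image file as the first label information of the target object, the method further comprises:
    searching, by the terminal, a target object detail library for second label information that matches the first label information;
    determining, by the terminal, whether the second label information was found;
    if yes, saving, by the terminal, the second label information to the image file, and annotating the target object in the image file according to the second label information; and
    if no, annotating, by the terminal, the target object in the image file according to the first label information.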
PCT/CN2016/092464 2016-07-30 2016-07-30 Image recognition method and terminal Ceased WO2018023212A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN201680087732.6A CN109478311A (zh) 2016-07-30 2016-07-30 Image recognition method and terminal
US16/321,960 US11132545B2 (en) 2016-07-30 2016-07-30 Image recognition method and terminal
EP16910810.7A EP3486863A4 (en) 2016-07-30 2016-07-30 PICTURE IDENTIFICATION METHOD AND SENDING DEVICE
PCT/CN2016/092464 WO2018023212A1 (zh) 2016-07-30 2016-07-30 Image recognition method and terminal
US17/240,103 US11804053B2 (en) 2016-07-30 2021-04-26 Image recognition method and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/092464 WO2018023212A1 (zh) 2016-07-30 2016-07-30 Image recognition method and terminal

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US16/321,960 A-371-Of-International US11132545B2 (en) 2016-07-30 2016-07-30 Image recognition method and terminal
US17/240,103 Continuation US11804053B2 (en) 2016-07-30 2021-04-26 Image recognition method and terminal

Publications (1)

Publication Number Publication Date
WO2018023212A1 true WO2018023212A1 (zh) 2018-02-08

Family

ID=61072174

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/092464 Ceased WO2018023212A1 (zh) 2016-07-30 2016-07-30 Image recognition method and terminal

Country Status (4)

Country Link
US (2) US11132545B2 (zh)
EP (1) EP3486863A4 (zh)
CN (1) CN109478311A (zh)
WO (1) WO2018023212A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111752450A (zh) * 2020-05-28 2020-10-09 维沃移动通信有限公司 显示方法、装置及电子设备
CN112070224A (zh) * 2020-08-26 2020-12-11 成都品果科技有限公司 一种神经网络训练用样本的修订系统及方法
CN112445927A (zh) * 2019-08-28 2021-03-05 阿里巴巴集团控股有限公司 目标对象的搜索方法、识别网络模型的训练方法及装置

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7171349B2 (ja) * 2018-09-28 2022-11-15 富士フイルム株式会社 画像処理装置、画像処理方法、プログラムおよび記録媒体
CN110111648A (zh) * 2019-04-17 2019-08-09 吉林大学珠海学院 一种编程训练系统及方法
CN110751663A (zh) * 2019-10-29 2020-02-04 北京云聚智慧科技有限公司 一种图像标注方法及装置
CN111581421B (zh) * 2020-04-30 2024-06-04 京东方科技集团股份有限公司 图像检索方法、图像检索装置及图像检索系统
CN111783584B (zh) * 2020-06-22 2023-08-08 杭州飞步科技有限公司 图像目标检测方法、装置、电子设备及可读存储介质
CN112418017B (zh) * 2020-11-09 2025-02-21 西安万像电子科技有限公司 图像处理的方法、装置及系统
US11868433B2 (en) * 2020-11-20 2024-01-09 Accenture Global Solutions Limited Target object identification for waste processing

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102591994A (zh) * 2012-03-07 2012-07-18 我查查信息技术(上海)有限公司 基于图像采集设备的信息获取方法、装置及移动通信设备
CN104103085A (zh) * 2013-04-11 2014-10-15 三星电子株式会社 屏幕图像中的对象
CN104615769A (zh) * 2015-02-15 2015-05-13 小米科技有限责任公司 图片分类方法及装置
US20150178596A1 (en) * 2013-12-20 2015-06-25 Google Inc. Label Consistency for Image Analysis
CN104766041A (zh) * 2014-01-07 2015-07-08 腾讯科技(深圳)有限公司 一种图像识别方法、装置及系统
CN105528611A (zh) * 2015-06-24 2016-04-27 广州三瑞医疗器械有限公司 用于疼痛识别的分类器训练方法及其装置

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7783135B2 (en) * 2005-05-09 2010-08-24 Like.Com System and method for providing objectified image renderings using recognition information from images
US20080159622A1 (en) * 2006-12-08 2008-07-03 The Nexus Holdings Group, Llc Target object recognition in images and video
CN106027910B (zh) * 2013-01-22 2019-08-16 华为终端有限公司 预览画面呈现方法、装置及终端
US9230194B2 (en) * 2013-09-16 2016-01-05 Google Inc. Training image sampling
US9619488B2 (en) 2014-01-24 2017-04-11 Microsoft Technology Licensing, Llc Adaptable image search with computer vision assistance
US20150324688A1 (en) 2014-05-12 2015-11-12 Qualcomm Incorporated Customized classifier over common features
CN104270552A (zh) * 2014-08-29 2015-01-07 华为技术有限公司 一种声像播放方法及装置
CN105488044A (zh) * 2014-09-16 2016-04-13 华为技术有限公司 数据处理的方法和设备
CN104573669B (zh) * 2015-01-27 2018-09-04 中国科学院自动化研究所 图像物体检测方法
CN104715262B (zh) * 2015-03-31 2019-10-08 努比亚技术有限公司 一种利用拍照实现智能标签功能的方法、装置及移动终端
WO2016207875A1 (en) * 2015-06-22 2016-12-29 Photomyne Ltd. System and method for detecting objects in an image
CN105117399B (zh) * 2015-07-03 2020-01-03 深圳码隆科技有限公司 一种图像搜索方法和装置

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112445927A (zh) * 2019-08-28 2021-03-05 阿里巴巴集团控股有限公司 目标对象的搜索方法、识别网络模型的训练方法及装置
CN111752450A (zh) * 2020-05-28 2020-10-09 维沃移动通信有限公司 显示方法、装置及电子设备
CN112070224A (zh) * 2020-08-26 2020-12-11 成都品果科技有限公司 一种神经网络训练用样本的修订系统及方法
CN112070224B (zh) * 2020-08-26 2024-02-23 成都品果科技有限公司 一种神经网络训练用样本的修订系统及方法

Also Published As

Publication number Publication date
US11804053B2 (en) 2023-10-31
CN109478311A (zh) 2019-03-15
US20190180101A1 (en) 2019-06-13
US20210240982A1 (en) 2021-08-05
EP3486863A4 (en) 2019-07-10
EP3486863A1 (en) 2019-05-22
US11132545B2 (en) 2021-09-28

Similar Documents

Publication Publication Date Title
WO2018023212A1 (zh) Image recognition method and terminal
US12001475B2 (en) Mobile image search system
US12197543B2 (en) Ephemeral content management
WO2020093289A1 (zh) 资源推荐方法、装置、电子设备及存储介质
US20190163767A1 (en) Image processing method, image processing device, computer device, and computer readable storage medium
WO2017088415A1 (zh) 检索视频内容的方法、装置和电子设备
CN109947989B (zh) 用于处理视频的方法和装置
US10754869B2 (en) Managing data format of data received from devices in an internet of things network
CN103370701A (zh) 用于提供自动和增量移动应用识别的方法、装置和计算机程序产品
US9355338B2 (en) Image recognition device, image recognition method, and recording medium
WO2016000507A1 (zh) 省流量模式搜索服务的方法、服务器、客户端和系统
CN110012049B (zh) 信息推送方法. 系统. 服务器及计算机可读存储介质
CN109767257B (zh) 基于大数据分析的广告投放方法、系统及电子设备
US12287823B2 (en) Image search method, terminal, and server
CN116071527A (zh) 一种对象处理方法、装置、存储介质及电子设备
CN107885827B (zh) 文件获取方法、装置、存储介质及电子设备
CN114297381A (zh) 文本处理方法、装置、设备及存储介质
CN110008930A (zh) 用于识别动物面部状态的方法和装置
CN115982444A (zh) 标签生成方法及装置、存储介质、计算设备
CN107507094A (zh) 一种信息处理方法、装置及存储介质
US20200012688A1 (en) Method and device for retrieving content
CN111784376B (zh) 用于处理信息的方法和装置
KR20150045560A (ko) 업 데이트 된 포스트 정보를 이용하여 컨텐츠를 분류하는 전자 장치 및 방법
CN111488928B (zh) 用于获取样本的方法及装置
US9830352B2 (en) Information processing device, information processing system, information processing method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16910810

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2016910810

Country of ref document: EP

Effective date: 20190214