WO2022007596A1 - 图像检索系统、方法和装置 - Google Patents
图像检索系统、方法和装置 Download PDFInfo
- Publication number
- WO2022007596A1 WO2022007596A1 PCT/CN2021/099890 CN2021099890W WO2022007596A1 WO 2022007596 A1 WO2022007596 A1 WO 2022007596A1 CN 2021099890 W CN2021099890 W CN 2021099890W WO 2022007596 A1 WO2022007596 A1 WO 2022007596A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- distance
- retrieved
- engine
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/56—Information retrieval; Database structures therefor; File system structures therefor of still image data having vectorial format
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/53—Querying
Definitions
- the present application relates to the field of computers, and in particular, to an image retrieval system, method and apparatus.
- the central processing unit (CPU) that provides the image retrieval function directly loads the images in the gallery into the main memory according to the user's retrieval request, and then compares the query image and The similarity of the images in the gallery, which in turn determines the same or similar images to the query image.
- the CPU needs to perform data processing and distance calculation in the process of image retrieval, which is complicated and time-consuming, and cannot meet the real-time requirements of image retrieval. Therefore, how to provide an efficient image retrieval method has become an urgent technical problem to be solved.
- the present application provides a method for image retrieval, in which image retrieval operations are performed by heterogeneous systems including multiple computing engines, and each operation process in image retrieval is performed with the idea of "specializing people for special tasks", so as to avoid using a single CPU to perform image retrieval process.
- the resulting processing is complex and time-consuming, so as to improve the efficiency of image retrieval.
- an image retrieval system in a first aspect, includes an interface for acquiring an image to be retrieved and a computing engine set including a first computing engine and a second computing engine, wherein the first computing engine is used for computing the image to be retrieved Similarity with multiple images included in the gallery; the second calculation engine is used to determine the retrieval result according to the similarity between the image to be retrieved and the images included in the gallery, and the retrieval result includes images similar to the image to be retrieved. target image.
- the set of computing engines can be used to perform the image retrieval operation, and then the retrieval result can be determined according to the similarity between the image to be retrieved and the image in the gallery, avoiding the time-consuming and low-efficiency caused by the image retrieval operation performed by a single CPU. problem.
- the similarity between the image to be retrieved and the multiple images included in the gallery is obtained according to the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the multiple images.
- the similarity between the image to be retrieved and the multiple images included in the gallery refers to the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the multiple images.
- the distance between the vectors is used to measure the similarity of the images, and then the image matching the image to be retrieved is determined.
- the first computing engine is specifically configured to convert a query vector into a query matrix, convert a center vector into a center matrix, and obtain a distance table according to the query matrix and the center matrix , wherein the distance table is used to determine the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to multiple images.
- the first calculation engine can be used to process matrix correlation calculation. In the process of calculating the distance between vectors, the distance calculation process can be mapped to a matrix processing process, and then the dedicated first calculation engine can complete the matrix correlation processing, improving the distance calculation process. speed, thereby improving the processing efficiency of the entire image retrieval process.
- the image retrieval system further includes a shared memory, and the shared memory is connected to the first computing engine through an internal bus; the first computing engine is further configured to store the distance table in the shared memory through the internal bus .
- the first computing engine and the second computing engine can access the data in the storage medium in a shared manner.
- the first computing engine can store the distance table in the shared memory
- the second computing engine can The engine can obtain the distance table from the shared memory, and complete other processing operations of image retrieval based on the distance table, avoiding the slow memory access speed and cache miss caused by the use of storage media other than the image retrieval system to store the distance table. problem, and further improve the speed of image retrieval.
- the shared memory and the second computing engine are connected through an internal bus; the second computing engine is specifically used to obtain a distance table from the shared memory; it is determined according to the distance table whether the gallery contains the A target image that is similar to the image to be retrieved. Since both the first computing engine and the second computing engine can access the shared memory, the second computing engine can obtain the distance table from the shared memory, and perform subsequent processing based on the distance table, avoiding the cache caused by obtaining data from the memory of the host system Cache misses lead to time-consuming problems and speed up data processing.
- the image retrieval system further includes a third computing engine, which is used to obtain an index list corresponding to the images in the gallery, where the third computing engine may also be called direct memory access
- the engine is used to obtain in-memory data connected to the host system through direct memory access;
- the second calculation engine is specifically used to obtain a distance matrix according to the distance table and the index list, and the distance matrix is used to indicate the index associated with each index in the index list.
- Distance value the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the multiple images is determined according to the distance matrix.
- the second calculation engine may calculate the similarity between the vectors according to the distance table and the index list, and determine the image matching the image to be retrieved.
- storing the distance table and the index list in the shared memory can speed up the processing speed of random reading in the process of generating the distance matrix, and improve the efficiency of data processing.
- the second calculation engine is specifically configured to determine the retrieval result according to the distance matrix to determine the sorting result of the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the multiple images.
- the distance matrix is used to identify the distance between each segment of the image to be retrieved and the corresponding segment of the center vector, and the distance between the query vector and the center vector of the image to be retrieved can be further determined according to the distance matrix, and then the distance is sorted, Finalize the search results.
- the second calculation engine is specifically configured to determine the retrieval result according to a preset threshold, and the distance between the vector associated with the image included in the retrieval result and the query vector satisfies a preset condition , the preset condition includes that the distance between the vector associated with the image and the query vector is greater than or equal to the preset threshold, or the distance between the vector associated with the image and the query vector is smaller than the preset threshold.
- the preset condition includes that the distance between the vector associated with the image and the query vector is greater than or equal to the preset threshold, or the distance between the vector associated with the image and the query vector is smaller than the preset threshold.
- the image retrieval system is a chip or a PCIe card, and the first computing engine and the second computing engine are connected through an internal bus.
- the connection mode of the internal bus the data transmission path between the first computing engine and the second computing engine is shorter, and the data transmission speed is faster.
- the first computing engine and the second computing engine in the computing engine set are respectively logic circuits.
- different hardware engines can be used for different types of computing operations to complete data processing faster.
- the system is a virtualization system
- the first computing engine and the second computing engine in the computing engine set are respectively virtual machines or containers running on devices in the virtualization system.
- the functions of each hardware engine can also be realized by using a virtual machine or container in the form of virtualization. Provides an efficient image retrieval method.
- the image to be retrieved comes from a retrieval request, and the retrieval request includes at least one image to be retrieved, or the retrieval request includes a video to be retrieved, and the video to be retrieved includes at least one frame of image to be retrieved.
- the present application can be applied to the video retrieval process in addition to the image retrieval process, wherein, each frame of image can be executed according to the above image retrieval process, or the retrieval result can be obtained quickly through the calculation engine set.
- the present application provides an image retrieval method, which is applicable to an image retrieval system.
- the image retrieval system includes an interface and a set of computing engines, and the set of computing engines includes a first computing engine and a second computing engine.
- the specific retrieval process includes: the first calculation engine calculates the similarity between the image to be retrieved and the multiple images included in the gallery; the second calculation engine determines the retrieval according to the similarity between the image to be retrieved and the multiple images included in the gallery As a result, the retrieval result includes whether the gallery contains a target image similar to the image to be retrieved.
- the similarity between the image to be retrieved and the multiple images included in the gallery is obtained according to the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the multiple images.
- the similarity between the image to be retrieved and the multiple images included in the gallery refers to the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the multiple images.
- the distance between the vectors is used to measure the similarity of the images, and then the image matching the image to be retrieved is determined.
- the first computing engine converts the query vector into a query matrix, converts the center vector into a center matrix, and obtains a distance table according to the query matrix and the center matrix, wherein, The distance table is used to determine the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the multiple images.
- the first calculation engine can be used to process the matrix correlation calculation. In the process of calculating the distance between vectors, the distance calculation process can be converted into a matrix processing process, and then the dedicated first calculation engine can complete the matrix correlation processing to improve the distance calculation process. speed, thereby improving the processing efficiency of the entire image retrieval process.
- the image retrieval system further includes a shared memory, and the shared memory is connected to the first computing engine through an internal bus; the first computing engine stores the distance table in the shared memory through the internal bus.
- the first computing engine and the second computing engine can access the data in the storage medium in a shared manner.
- the first computing engine can store the distance table in the shared memory
- the second computing engine can The engine can obtain the distance table from the shared memory, and complete other processing operations of image retrieval based on the distance table, avoiding the slow access speed and complex problems caused by using storage media other than the image retrieval system to store the distance table.
- the shared memory and the second computing engine are connected through an internal bus; the second computing engine obtains a distance table from the shared memory; according to the distance table, it is determined whether the gallery contains images related to the image to be retrieved Similar target images. Since both the first computing engine and the second computing engine can access the shared memory, the second computing engine can obtain the distance table from the shared memory, and perform subsequent processing based on the distance table, avoiding the cache caused by obtaining data from the memory of the host system Cache misses lead to time-consuming problems and speed up data processing.
- the image retrieval system further includes a third computing engine.
- the third computing engine can obtain the index list corresponding to the images in the gallery.
- the third computing engine can also be called a direct memory access engine. , used to obtain in-memory data connected to the host system through direct memory access; the second computing engine obtains a distance matrix according to the distance table and the index list, and the distance matrix is used to indicate the distance value associated with each index in the index list; The distance matrix determines the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the multiple images.
- the second calculation engine may calculate the similarity between the vectors according to the distance table and the index list, and determine the image matching the image to be retrieved.
- the second computing engine determines the retrieval result according to the distance matrix to determine the distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the multiple images.
- the distance matrix is used to identify the distance between each segment of the image to be retrieved and the corresponding segment of the center vector, and the distance between the query vector and the center vector of the image to be retrieved can be further determined according to the distance matrix, and then the distance is sorted, Finalize the search results.
- the second calculation engine is specifically configured to determine the retrieval result according to a preset threshold, and the distance between the vector associated with the image included in the retrieval result and the query vector satisfies a preset condition , the preset condition includes that the distance between the vector associated with the image and the query vector is greater than or equal to the preset threshold, or the distance between the vector associated with the image and the query vector is smaller than the preset threshold.
- the preset condition includes that the distance between the vector associated with the image and the query vector is greater than or equal to the preset threshold, or the distance between the vector associated with the image and the query vector is smaller than the preset threshold.
- the image retrieval system is a chip or a PCIe card, and the first computing engine and the second computing engine are connected through an internal bus.
- the first computing engine and the second computing engine in the computing engine set are respectively logic circuits.
- the image retrieval system is a virtualization system
- the first computing engine and the second computing engine in the computing engine set are respectively virtual machines or containers running on devices in the virtualization system.
- the image to be retrieved comes from a retrieval request, and the retrieval request includes at least one image to be retrieved, or the retrieval request includes a video to be retrieved, and the video to be retrieved includes at least one frame of image to be retrieved.
- the present application can also be applied to a video retrieval process, wherein each frame of image can be executed according to the above image retrieval process, or a retrieval result can be quickly obtained through a set of computing engines.
- the present application provides an apparatus for image retrieval, the apparatus comprising various modules for performing the method for image retrieval in the second aspect or any possible implementation manner of the second aspect.
- the present application provides a data processing system
- the data processing system includes a host system and an image retrieval system
- the host system and the image retrieval system communicate through a network
- the host system is used to obtain a retrieval request, and determine the retrieval according to the retrieval request.
- image and send the retrieved image to the image retrieval system
- the image retrieval system is used to execute the functions implemented by each computing engine in the image retrieval system in the first aspect or any possible implementation manner of the first aspect.
- the present application provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, which, when executed on a computer, cause the computer to execute the methods described in the above aspects.
- the present application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the methods described in the above aspects.
- the present application may further combine to provide more implementation manners.
- 1 is a schematic flowchart of a method for product quantization processing provided by the application
- FIG. 2 is a schematic structural diagram of a data processing system 100 provided by the present application.
- FIG. 3 is a schematic flowchart of an image retrieval method provided by the present application.
- FIG. 4 is a schematic flowchart of another image retrieval method provided by the present application.
- FIG. 5 is a schematic flowchart of a method for calculating a distance table provided by the application
- FIG. 6 is a schematic flowchart of another method for calculating a distance table provided by the application.
- FIG. 7 is a schematic flowchart of a method for calculating a distance matrix provided by the present application.
- FIG. 8 is a schematic flowchart of another method for calculating a distance matrix provided by the present application.
- FIG. 9 is a schematic flowchart of a method for calculating an accumulation vector provided by the present application.
- FIG. 10 is a schematic flowchart of another method for calculating a distance matrix provided by the application.
- FIG. 11 is a schematic diagram of a parallel processing flow provided by the application.
- FIG. 12 is a schematic structural diagram of an image retrieval apparatus 600 provided by the present application.
- Image search refers to the process of finding images that match the query image entered by the user in a gallery (also called a base library, a sample library, or a library to be searched).
- a gallery also called a base library, a sample library, or a library to be searched.
- image retrieval is based on deep learning and image recognition technology, combined with different application business and industry scenarios, using feature vectorization and search capabilities to search for the same or similar images from a specified gallery.
- each image and query image in the gallery can be represented by feature data
- the vector of the image can be obtained according to a feature extraction model (for example, a deep learning feature extraction model ResNet)
- the feature data of the same image includes at least one feature vector (also Can be called a vector)
- each vector can be represented by a multi-dimensional array, for example, vector 1 can be represented by (a, b, c), a, b, c are floating point numbers.
- the above-mentioned image retrieval process can also be understood as a retrieval processing process based on image feature data. It should be noted that, for the convenience of description, the following embodiments of the present application take the feature data of each image as a feature vector as an example for description.
- the long feature is the feature data of the image obtained according to the feature extraction model, also known as long feature data, or uncoded data, long encoded data, original feature data, and each image will obtain a long feature.
- Short features refer to the data obtained by decomposing long features by an artificial intelligence (artificial intelligence, AI) feature training platform after training according to the model, also known as short feature data, or encoded data, short encoded data.
- AI artificial intelligence
- product quantization (PQ) method can be used to convert long features into short features, which is a method to speed up retrieval.
- each segment has a segment sequence number, and the segment sequence number is unique in this segment.
- a long feature including 512 dimensions is divided into 64 segments, each segment includes 8 feature vectors F, and the segment numbers are 0 to 7 respectively; then, by calculating the distance from the K cluster center vectors obtained by pre-training, In each segment, the feature vector with the closest distance to the K cluster center vectors obtained by pre-training is determined, and the segment sequence number of the feature vector is used as the identifier of the short feature of the segment.
- K is the number of preset cluster centers, and K is a positive integer greater than or equal to 2.
- the K cluster center vectors obtained by pre-training may use long features of all or part of the images in the gallery as training data, and K clusters are obtained by using a clustering algorithm (for example, K-Means algorithm).
- a class includes a cluster center, and the vector used to identify the cluster center is called a cluster center vector (also called a center vector).
- F1 is the feature vector with the closest distance to the K cluster center vectors
- the segment number 1 of F1 can be used to identify segment 1, that is, the short feature can be marked with 1 Identifies segment 1.
- the segment After each segment in the long feature has undergone the above distance calculation and comparison, the segment can be identified by the sequence number of the feature vector in a segment.
- the above-mentioned product quantization process can also be understood as a way of simplifying the recording of long features of images in the gallery.
- the index of the original feature is constructed by calculating the distance between the vector and the center vector in each segment.
- the set of identifiers used to identify each segment of the long feature of all images in the gallery is called a short feature list (also called an index list), and the sequence of each short feature in the short feature list is the same as the segment order of the long feature. Subsequent image retrieval processing is performed based on the short feature.
- the processing system 100 includes a host system 10, a heterogeneous system 20 (also referred to as an image retrieval system), a first memory 30 and an interface bus 40 , wherein the host system 10 , the heterogeneous system 20 , and the first memory 30 are connected through an interface bus 40 .
- the interface bus 40 includes Peripheral Component Interconnect Express (PCIe), and the heterogeneous system 20 may be a chip or a PCIe card.
- PCIe Peripheral Component Interconnect Express
- the retrieval framework is used to receive the retrieval request from the user, the retrieval request includes the image to be retrieved; further, the retrieval request is sent to the heterogeneous system 20 through the interface bus 40, and the heterogeneous system 20 completes the processing of the retrieval request, for example, obtains the retrieval request
- the host system may also complete the processing of the retrieval request, and obtain the query vector associated with the image to be retrieved.
- the host system 10 is also used to run an operating system.
- the host system 10 includes a processor 101, which may be a central processing unit (central processing unit, CPU).
- the heterogeneous system 20 includes two or more computing engines.
- the heterogeneous system 20 includes a matrix computing engine 201 , a vector computing engine 202 , a scalar computing engine 203 , an internal bus 206 and an interface 207 .
- the heterogeneous system 20 further includes a direct memory access (direct memory access, DMA) engine 204 and a second memory 205.
- DMA direct memory access
- the matrix calculation engine 201, the vector calculation engine 202, the scalar calculation engine 203, the direct memory access (DMA) engine 204, the second memory 205 and the interface 207 communicate with each other through the internal bus 206, wherein the internal bus 206 can It is a parallel bus, which is used to realize point-to-point (P2P) communication; the interior can also include PCIe or front side bus (FSB) or other forms that can realize the connection of various components.
- P2P point-to-point
- FFB front side bus
- the matrix calculation engine 201 may also be referred to as the first calculation engine
- the vector calculation engine 202 and the scalar calculation engine 203 may be referred to as the second calculation engine
- the direct memory access engine 204 may be referred to as the third calculation engine
- the The second memory 205 is called shared memory.
- the first memory 30 and the second memory 205 are two different memories.
- the first memory 30 and the second memory 205 may also be different memories whose mapping relationship is established through memory mapped input/output (MMIO) technology; or, the first memory 30 and the second memory 205 It can also be the same memory.
- MMIO memory mapped input/output
- the matrix calculation engine 201 is used to implement matrix correlation operation processing, including matrix multiplication and other operations involving matrices. For example, S201 in FIG. 3 calculates the distance table.
- the vector calculation engine 202 is used to implement vector correlation calculation, including vector accumulation and other operations involving vectors. For example, S204 in FIG. 3 accumulates distance vectors to obtain a distance result between vectors.
- the scalar calculation engine 203 is configured to perform scalar correlation operations, including querying the short feature list and the distance table, thereby generating a distance matrix, and sorting the results of the matrix distance calculation. For example, in Fig.
- S202 traverses the short feature list, S203 generates a distance matrix, and S205 sorts; wherein, a scalar refers to a quantity that has only a size and no direction.
- the direct memory access engine 204 is used to use the direct memory access technology to load the short feature list from the first memory 20 to the second memory 205. Each computing engine in the heterogeneous system can quickly obtain the content of the short feature list, and then complete the corresponding processing operate.
- the second memory 205 is used to provide a shared storage area for the matrix calculation engine 201 , the vector calculation engine 202 , the scalar calculation engine 203 , and the direct memory access engine 204 .
- the above-mentioned matrix calculation engine 201 , vector calculation engine 202 , scalar calculation engine 203 , and direct memory access engine 204 may be respectively implemented by a hardware circuit with a specific function.
- each computing engine consists of a chip or a circuit structure built with electronic components, which is used to perform computing processing on data.
- This embodiment is described by taking the matrix calculation engine 201 as an example.
- the matrix calculation engine 201 has the generality and programmability of a CPU, but is more specialized, and can efficiently perform correlation operations, for example, matrix correlation operations.
- the matrix calculation engine 201 can be replaced by a processing chip such as a graphics processing unit (graphics processing unit, GPU), an embedded neural-network processing unit (neural-network processing unit, NPU).
- the matrix calculation engine 201 , the vector calculation engine 202 , the scalar calculation engine 203 , and the direct memory access engine 204 may also be implemented by software modules, respectively, or constituted by a combination of software and hardware.
- the above-mentioned computing engines can also be deployed in one or in part.
- the matrix computing engine 201, the vector computing engine 202, and the vector computing engine 203 can be deployed in one, for implementing matrix operations and vector operations. .
- the heterogeneous system shown in FIG. 2 can also be implemented in software.
- the heterogeneous system is a virtualization system, wherein the matrix calculation engine 201, the vector calculation engine 202, the scalar calculation engine 203, The direct memory access engine 204 can be implemented by a virtualized computing device such as a virtual machine or a container.
- the heterogeneous system is connected to the host system through a network, and the network can be an Ethernet, an optical network, a wireless network, etc. to realize the host system communication with heterogeneous systems.
- the virtualization system includes one or more devices, and each device includes logic circuits or devices used to implement corresponding computing functions.
- the device may include the above PCIe card or chip.
- each computing engine is an One or more virtual machine or container implementations obtained after the device with the above PCIe card or chip is virtualized.
- a virtualization system includes a device with a device (eg, GPU) that implements the function of a matrix computing engine, and multiple virtual machines with a virtual graphics processor (virtual GPU, vGPU) are obtained by virtualizing the resources of the device. or containers, each virtual machine or container is used to implement the functions of the above computing engines.
- the following embodiments take the data processing system 100 shown in FIG. 2 as an example to further explain the image retrieval method provided by the present application, wherein the heterogeneous system 20 can obtain the image to be retrieved from the host system 10 through the interface 207, and determine the image to be retrieved. The similarity between the retrieved image and the image in the gallery is retrieved, and the retrieval result of the image to be retrieved is determined according to the calculation result of the similarity.
- the heterogeneous system 20 can adopt the idea of "dedicated personnel to special work". Referring to FIG.
- FIG. 4 is a schematic flowchart of an image retrieval method provided by the present application. As shown in the figure, the method includes:
- the matrix calculation engine determines the distance table according to the query vector corresponding to the image to be retrieved in the retrieval request and the center vector in the gallery.
- each cluster center includes a center vector (also known as a product quantized cluster center vector), and each center vector includes a segment with a long feature. Count the same number of vectors, each vector can be represented by a multidimensional array.
- K center vectors 1 to K are obtained, and each center vector includes M segments, where M is the length of the long feature. The number of segments, for example, center vector 1 includes vector 11 to vector 1M. Then, referring to the process shown in FIG. 1, a product quantization process is performed on the long features of each image in the gallery to create a short feature list.
- the image to be retrieved in the retrieval request can be obtained first, and then the preprocessing operation is performed on the image to be retrieved, including: using the same feature extraction model as the image preprocessing process in the gallery to obtain long features of the image to be retrieved, and then , divide the long feature of the image to be retrieved into M segments, where M is the same number of segments for the long feature when the product quantization process is performed in the gallery. For example, in FIG.
- the above preprocessing process for the retrieved images can be performed by a heterogeneous system, or by a host system or other devices or systems, and the preprocessing results are sent to the heterogeneous system, and the heterogeneous system executes the image. further processing.
- the present application does not limit the specific processing procedures of the above-mentioned clustering processing and feature extraction model.
- an appropriate algorithm may be selected to perform clustering and image feature extraction processing according to business requirements.
- each segment in the query vector can be used as a retrieval object, and the distance between the vector of the segment corresponding to the retrieval object and each vector of the corresponding segment in the K center vectors can be calculated respectively, and then obtained according to the above distance calculation results.
- distance table. 5 is a schematic flowchart of a distance calculation method provided by an embodiment of the present application. As shown in the figure, the distance between the vector 11 of the segment 1 of the query vector 1 and the vector 11 of the segment 1 of the center vector 1 is calculated. D11 is obtained, then D11 is used to indicate the distance between the vector 11 indicated by segment 1 of the query vector and the vector 11 indicated by segment 1 of the center vector 1 .
- the distance between two vectors can be calculated in any of the following ways:
- x 1 is the query vector
- x 1k is the K-th segment vector of the query vector
- x 2 is the center vector to be compared in the gallery
- x 2k is the K-th segment in the center vector
- K ranges from 1 to An integer of n, where n is a positive integer greater than 1.
- Method 2 on the basis of formula 1, can also perform the square root operation on both sides of the equal sign of formula 1, that is, use the following formula 2 to calculate the distance between the two vectors:
- the square root operation can also be removed on the basis of formula 2, and the following formula 3 can be used to calculate the distance between the two vectors:
- the calculation process of the above formula can also be converted into a matrix calculation process, and then the matrix calculation engine is used to perform matrix calculation, so as to avoid obtaining data one by one and the delay caused by performing operation operations. and efficiency issues.
- the query vector is converted into a row matrix according to the segment sequence numbers.
- the vectors corresponding to each segment of the query vector are respectively used as row elements in the first matrix according to the segment sequence numbers in the query vector;
- the segment number is converted into a column matrix.
- the vector corresponding to each segment of each center vector is taken as an element of a column in the second matrix, and then the matrix calculation engine executes the matrix. Relevant computing operations to improve the efficiency of data processing.
- the above row matrix may also be referred to as the first matrix
- the column matrix may be referred to as the second matrix.
- the vector of each segment in query vector 1 is taken as a row in the first matrix.
- the first matrix is an M*1 matrix, and each segment is used as an element of a row of the first matrix;
- the vectors of each segment of the center vector 1 are taken as the elements of a column in the second matrix, ..., and so on, the second matrix is a 1*M column matrix, each The segments are each an element of a column of the second matrix.
- the multiplication operation of the first matrix and the second matrix can be performed, and then the result of the above matrix multiplication is multiplied by 2 to obtain the formula 4
- the first item in formula 4 is the accumulation operation of each segment in the center vector 1.
- the result of the first item can be pre-calculated, and the accumulated sum of the vectors of the same segment sequence number of each center vector can be obtained separately, and the Each accumulated sum is taken as a column element of the third matrix, respectively.
- the calculation process of the above formula 4 is converted into the multiplication of the first matrix and the second matrix as shown in FIG. 5 , and then the result obtained by the multiplication is accumulated with the third matrix, and the distance calculation result of formula 4 can be obtained.
- the third matrix in FIG. 6 can also be called an offset matrix, although FIG. 5 is the result of adding the result of multiplying the first matrix and the second matrix to the third matrix to obtain the result of the above distance formula,
- all elements in the matrix obtained by multiplying the matrices can also be multiplied by -1, and then the addition operation with the third matrix can be performed, and then the formula 4 can be obtained. process result.
- formulas 1 to 3 can also adopt a similar idea, and use each formula as a row element or column element of the matrix according to the segment sequence number, and then convert the distance calculation process into the calculation process of the matrix, And use the matrix calculation engine to perform the above-mentioned matrix multiplication and addition operations to improve the speed of matrix calculation processing.
- any two vectors can be obtained based on the distance calculation result in any way, and further, the distance can be calculated using the results from the table memory shown in Figure 5, wherein, D 11 for indicating a segment in the query vector a
- D 21 is used to indicate the vector 11 indicated by the segment 2 in the query vector 1 and the center vector 2 indicated by the segment 2
- the distance calculation result of the vector 21, . . . and so on, the distance table includes the distance calculation results of the vectors indicated by each segment in the query vector and the vector of each segment corresponding to the center vector.
- the distance table may be stored in the second memory 205 of the heterogeneous system 20, and a notification may be sent to the scalar computing engine, and the scalar computing engine will then perform subsequent operations.
- the scalar calculation engine 202 can directly read the distance table through the second memory 205, and then generate the distance matrix. For the specific process, refer to steps S402 and S403.
- the matrix calculation engine can directly process the distance calculation process in the form of matrix operation, and the above distance calculation process can be completed more quickly. Moreover, the matrix calculation engine supports the operation of floating-point numbers, and there is no need for the conversion of floating-point numbers and integer-point numbers like when the CPU performs distance calculation, which simplifies the efficiency of data processing. In addition, by optimizing the distance calculation method, the distance between any two vectors can be quickly confirmed, and then the similarity between the vectors can be judged, which further improves the efficiency of image retrieval.
- the scalar calculation engine generates a distance matrix according to the short feature list and the distance table.
- the short feature list is usually generated during the creation of the gallery and stored in the first memory 30 shown in FIG. 1 .
- the scalar computing engine can notify the direct memory engine 204 to load the short feature list of the gallery into the second memory 205 of the heterogeneous system 20, and then the scalar computing engine Traverse the short feature list one by one, obtain a distance value from the distance table query according to the identification of each segment in the short feature list, then write the distance value into the corresponding position of the distance matrix, and finally obtain the distance matrix, which can be stored in the second memory of the heterogeneous system.
- the distance matrix is a logical structure for storing data.
- the second memory 205 can be divided into a storage space of a specified size, and the storage space can be used to realize the storage of the distance matrix. It is used to store the data of D11, and the storage space 2 is used to store the data of D12.
- the short feature list includes the short features of all images in the gallery, and the distance value corresponding to the short feature serial number of each image can be used as a row element in the distance matrix, for example, image 1 includes 64 serial numbers.
- the first short feature identifier is 1, then the distance value D11 of the first row and first column in the distance table can be used as the element of the first row and first column of the distance matrix.
- the second short feature identifier of image 1 is 2
- the distance value D22 of the second row and second column in the distance table can be used as the element of the second row and second column of the distance matrix, ..., and so on, the elements of one row of the distance matrix represent the distance value of an image, and then Obtain an N*M matrix, where N is the number of images included in the short feature list, and M is the number of segments of the long feature.
- the direct memory engine may load all or part of the short feature list into the second storage 205 of the heterogeneous system 20 according to the size of the remaining space in the second storage 205 .
- the second memory 205 can be set using a storage medium with high-speed read and write delay, for example, static random access memory (SRAM), and the scalar computing engine can directly read the short feature list and the distance table through the second memory , to avoid the data transmission delay caused by reading from the first memory.
- SRAM static random access memory
- the CPU first loads the short feature list into the memory, and the data with high access frequency will be loaded into the CPU cache, but due to the limited cache space, it is impossible to load all the short feature lists into the cache, and the distance matrix generation process It is also necessary to frequently read the data of the short feature list and the distance table, which will cause the problem that there is no data to be read in the cache (also known as the cache miss problem), which affects the speed of reading data.
- the short feature list is loaded into the second memory 205, and the scalar computing engine and the second memory are connected through an internal bus, so that frequent relocation is not required during data processing.
- the distance table and the short feature list are respectively stored in the second memory 205 of the heterogeneous system.
- the process can be understood as the process of random reading of the distance table and sequential writing of the distance matrix. Storing the short feature list in the second memory can also improve the speed of reading data in the short feature list, and further improve the processing efficiency of the image retrieval process.
- the scalar computing engine may notify the vector computing engine to continue to perform the distance accumulation operation. For details, refer to step S403.
- the vector calculation engine determines the accumulated distance vector according to the distance matrix.
- each row has several distance values, and each distance value represents the distance between the vector corresponding to a segment of the query vector and the vector corresponding to the segment of the center vector in the gallery.
- the distance between the query vector and the same center vector in the gallery can be obtained by accumulating all the distances.
- the vector computing engine can use one accumulation instruction to realize the accumulation operation of multiple floating-point numbers.
- the above accumulation operation will involve a large number of floating-point accumulation operations. Including 100 million vectors, 100 million accumulation operations will be generated, and each accumulation operation contains M (the number of PQ segments) elements.
- the vector calculation engine is used to directly perform the vector accumulation operation to greatly speed up the speed of vector accumulation.
- the scalar calculation engine determines the retrieval result according to the accumulated distance results.
- the scalar calculation engine can sort the accumulated distance results first, and then, based on the sorting results, select at least one image that matches the query vector as a retrieval result.
- the sorting method can be sorted by Top X, that is, X distance results are screened in the distance result in descending order, X is a positive integer greater than or equal to 1, and the specific value of X can be specified manually.
- the sorting process of the accumulated distance results can use a binary heap to filter the distance results of Top X.
- the scalar computing engine can divide the data to be sorted into at least two groups, and then sort each group according to size, and finally sort the sorting results of different groups twice, and then sort the final sorting results. Select at least one accumulated distance result as the retrieval result.
- the heterogeneous system can also determine the images in the associated gallery according to the retrieval result of the query vector. Specifically, since the short feature list is generated according to the images in the gallery, as shown in FIG. 7 , the short feature list can record the short feature list. Identify the relationship with the image in the gallery, then, the distance matrix generated from the short feature list and the distance table can also add the identity of the image.
- a field for the identification of the image can be added to each vector, and correspondingly, the accumulated distance result can also have a field for identifying the image, when according to Top X
- the identifier of the matching image can be determined according to the field of the image, and then the matching image is obtained from the memory storing the images of the gallery, and then the matching image is presented to the user as the retrieval result of the image to be retrieved.
- the direct memory access engine can relocate the sorting result to the first memory, and then the host system searches for the image in the gallery associated with the sorting result according to the sorting result.
- the retrieval result is retrieved, and the retrieval result of the retrieval request is presented to the user through an interface or a network (web) interface.
- the images in the gallery can be stored in the image retrieval system shown in FIG. 1 , for example, using the first memory 30 of the host system 10 to store the images in the gallery, or using a device other than the system to store the images in the gallery.
- a matching image can be obtained from a device other than the system according to the identifier of the image indicated in the retrieval result.
- the image retrieval process may be performed in a multi-level parallel processing manner.
- the parallel processing operation may be performed in the following two manners:
- the query requests can be executed in parallel according to the above steps S301 to S304 respectively.
- each computing engine in the heterogeneous system can use multiple processing
- the processor core or process or thread executes the processing of multiple query requests at the same time, thereby realizing parallel processing of multiple query requests.
- FIG. 11 is a schematic diagram of a parallel processing flow.
- the scalar calculation engine can generate a partial distance matrix by accumulating the partial matrix to obtain The accumulated distance vector is then sorted by Top X on the accumulated distance vector by the scalar calculation engine, and then the sorting result of the partial distance matrix is obtained.
- the above process can be called an iterative processing process based on the partial distance matrix.
- the iterative processing process of the distance matrix can be divided into two or more sub-processes, and each process respectively executes the above-mentioned iterative processing process based on the partial distance matrix.
- the final result of the distance matrix can be determined only by updating the sorting results in the sorting stage.
- the sorting result can be stored in the form of a stack. Whenever an iterative process based on a partial distance matrix is completed, the sorting result in the stack is updated to obtain the latest sorting result until the processing of all distance matrices is completed.
- the distance calculation engine 201 can divide the gallery into several groups, and then successively calculate the distances between the respective segments of the query vector and the center vectors corresponding to the respective groups in the gallery, and then generate a distance table. distance subtable; Next, the distance matrix is generated by the scalar calculation engine 203 according to the distance subtable, wherein a part of the data of the distance table is called the distance subtable; Then, the distance matrix is generated by the scalar calculation engine 203 according to the distance subtable; The vector calculation engine 202 determines the accumulated distance results according to the distance matrix, and then the scalar calculation engine 203 determines the sorting order of the distance results according to the accumulated distance results.
- the scalar calculation engine 203 and the vector calculation engine 202 execute corresponding processing procedures in sequence according to the above process, until the processing of the entire distance table is completed.
- the sorting process can be based on a form similar to the first-in-first-out stack to determine X vectors. After each new X vectors are obtained, they are compared with the vectors in the stack. Among the newly obtained X vectors, there is a smaller distance. vector, replace the vector in the original stack.
- the data can be divided into fine-grained segments, so that distance matrix generation, vector accumulation and sorting can be performed at the same time, which further improves the efficiency of image retrieval.
- the image retrieval method provided by the present application adopts the idea of "special personnel and dedicated work", uses heterogeneous systems including multiple dedicated computing engines to perform image retrieval processing, and avoids the image retrieval process caused by the CPU performing image retrieval processing.
- the problem of slow retrieval speed is also provided.
- a variety of calculation methods are also provided to simplify the processing process of the distance calculation.
- the matrix calculation engine can be used to quickly complete the vector calculation process, which further improves the efficiency of image retrieval processing and reduces the delay of data processing.
- the above-mentioned method of the present application can also be applied to scenarios such as video retrieval, face recognition, and text retrieval.
- the method in FIG. 11 performs retrieval processing, obtains retrieval results, and further uses retrieval results including all frames as retrieval results, which can also quickly complete video retrieval processing.
- the above-mentioned image retrieval method can also be used, taking the image of the face to be retrieved as the to-be-retrieved image, and further performing the above-mentioned methods in FIGS.
- FIG. 12 is a schematic structural diagram of an image retrieval apparatus 600 provided by this application. As shown in the figure, the apparatus 600 includes a first computing unit 601 and a second computing unit 602, wherein,
- the first calculation unit 601 is used to calculate the similarity between the to-be-retrieved image and multiple images included in the gallery;
- the second computing unit 602 is configured to determine a retrieval result according to the similarity between the image to be retrieved and multiple images included in the gallery, where the retrieval result includes whether the gallery contains images similar to the image to be retrieved target image.
- first computing unit 601 in the apparatus 600 in the embodiment of the present application may correspond to the implementation of the matrix computing engine 201 in FIG. 2
- second computing unit 602 may correspond to the vector computing engine 202 and the scalar computing engine in FIG. 2 . 203 implementation.
- the first computing unit 601 and the second computing unit 602 can also implement the image retrieval methods shown in FIGS. 3 to 11 through software, and the apparatus 600 and its respective modules can also be software modules.
- the functions of the first computing unit 601 and the second computing unit 602 are respectively implemented by the GPU calling the instructions in the memory.
- the functions of the first computing unit and the second computing unit may also be implemented by a CPU or other type of processor by calling computer instructions in the memory.
- the processor includes a digital signal processor (digital signal processing, DSP), an application-specific integrated circuit (application-specific integrated circuit, ASIC), a field-programmable gate array (field-programmable gate array, FPGA) or other programmable logic devices , discrete gate or transistor logic devices, discrete hardware components, etc.
- DSP digital signal processing
- ASIC application-specific integrated circuit
- FPGA field-programmable gate array
- a general purpose processor may be a microprocessor or any conventional processor or the like.
- the similarity between the to-be-retrieved image and multiple images included in the gallery is based on the distance between the query vector corresponding to the to-be-retrieved image and the center vector corresponding to the multiple images.
- the obtained similarity or the similarity between the image to be retrieved and the multiple images included in the gallery refers to the distance between the query vector corresponding to the image to be retrieved and the center vector corresponding to the multiple images.
- the first computing unit 601 is further configured to convert the query vector into a query matrix, convert the center vector into a center matrix, and obtain a distance table according to the query matrix and the center matrix, and the distance The table is used to determine the distance between the query vector corresponding to the to-be-retrieved image and the center vectors corresponding to the multiple images.
- the apparatus 600 further includes a shared memory (not shown in FIG. 6 ), and the second computing unit 602 is further configured to obtain the distance table from the shared memory; and determine the distance table according to the distance table. Whether the image library contains a target image similar to the image to be retrieved.
- the apparatus 600 further includes a third computing unit 603, configured to obtain the index list corresponding to the images in the gallery, and the second computing engine is specifically configured to obtain the distance table from the shared memory; according to The distance table determines whether a target image similar to the to-be-retrieved image is contained in the gallery.
- the second calculation unit 602 is further configured to obtain a distance matrix according to the distance table and the index list, where the distance matrix is used to indicate a distance value associated with each index in the index list; determine according to the distance matrix The distance between the query vector corresponding to the image to be retrieved and the center vectors corresponding to the multiple images.
- the second calculation unit 602 is further configured to determine the retrieval result according to the distance matrix to determine the distance between the query vector corresponding to the image to be retrieved and the center vector corresponding to the multiple images.
- the second calculation unit 602 is further configured to determine the retrieval result according to a preset threshold, and the distance between the vector associated with the image included in the retrieval result and the query vector satisfies a preset condition, and the preset
- the set condition includes that the distance between the vector associated with the image and the query vector is greater than or equal to the preset threshold, or the distance between the vector associated with the image and the query vector is smaller than the preset threshold.
- the image to be retrieved comes from a retrieval request, and the retrieval request includes at least one image to be retrieved, or the retrieval request includes a video to be retrieved, and the video to be retrieved includes at least one frame of the to-be-retrieved image. image.
- the apparatus 600 may correspond to executing the methods described in the embodiments of the present application, and the above-mentioned and other operations and/or functions of the various units in the apparatus 600 are to implement the corresponding methods of the respective methods in FIG. 3 to FIG. 11 , respectively.
- the process, for the sake of brevity, will not be repeated here.
- the apparatus 600 provided by the present application can use multiple computing units to respectively execute different data processing processes, thereby improving the efficiency of data processing.
- a variety of calculation methods are used to simplify the process of distance calculation.
- the apparatus 600 can also map the distance calculation process into a matrix calculation process, and use the first calculation unit to quickly complete the distance calculation process, which further improves the efficiency of image retrieval processing and reduces the time delay of data processing.
- multiple query requests or different steps in the same query request can be executed in parallel based on the processing capabilities of each computing engine in the heterogeneous system, which further reduces the processing delay of image retrieval and meets the requirements of real-time image retrieval. Require.
- the present application also provides an image retrieval system.
- the image retrieval system is a heterogeneous system 20 as shown in FIG. 1 .
- the heterogeneous system 20 is used to implement the functions of the corresponding processes of the methods in FIG. 3 to FIG. 11 , in order to It is concise and will not be repeated here.
- the present application also provides a data processing system, the data processing system includes a host system 10 and a heterogeneous system 20 as shown in FIG. 1 , the host system is used to obtain a retrieval request, and send an image to be retrieved in the retrieval request to the heterogeneous system , and the functions of the corresponding processes of the respective methods in FIG. 3 to FIG. 11 are executed by the heterogeneous system, which is not repeated here for brevity.
- the above embodiments may be implemented in whole or in part by software, hardware, firmware or any other combination.
- the above-described embodiments may be implemented in whole or in part in the form of a computer program product.
- the computer program product includes one or more computer instructions. When the computer program instructions are loaded or executed on a computer, all or part of the processes or functions described in the embodiments of the present application are generated.
- the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
- the computer instructions may be stored in or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be downloaded from a website site, computer, server, or data center Transmission to another website site, computer, server, or data center is by wire (eg, coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (eg, infrared, wireless, microwave, etc.).
- the computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as a server, a data center, or the like containing one or more sets of available media.
- the usable media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVDs), or semiconductor media.
- the semiconductor medium may be a solid state drive (SSD).
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Processing Or Creating Images (AREA)
Abstract
一种包括接口和计算引擎集合的图像检索系统,其中,接口用于获取待检索图像;计算引擎集合包括第一计算引擎和第二计算引擎,第一计算引擎用于计算待检索图像与图库所包含的多个图像之间的相似度;第二计算引擎用于根据所检索图像与图库所包含的多个图像之间的相似度确定检索结果,检索结果包括与所述待检索图像相似的目标图像,由此提升图像检索处理的速度,降低数据处理的时延。
Description
本申请涉及计算机领域,尤其涉及一种图像检索系统、方法和装置。
随着计算机技术的发展,图像检索技术也逐渐成为研究的热点,应用场景也越来越广泛,例如,在商品图像搜索功能中,基于用户提供的图像在图库中搜索,找到同款或相似的商品,进而将找到的商品推荐给用户;或者,在版权判定的搜索服务中,以图像为检索对象,在海量的图库中查找相同或相似图像,进而判定检索对象是否存在侵权问题。
传统的图像检索过程,通常是由提供图像检索功能的中央处理器(central processing unit,CPU)直接根据用户检索请求,将图库中图像加载至主存(main memory),然后,比对查询图像和图库中的图像的相似度,进而确定与查询图像相同或相似的图像。但是,当图库中包括海量数据时,由于CPU执行图像检索的过程中需要执行数据处理和距离计算等操作,过程复杂,耗时长,无法满足图像检索的实时需求。因此,如何提供一种高效的图像检索方法成为亟待解决的技术问题。
发明内容
本申请提供了一种图像检索的方法,由包括多个计算引擎的异构系统执行图像检索操作,以“专人专事”的思路执行图像检索中各个操作过程,避免使用单一CPU执行图像检索过程所带来的处理复杂、耗时长的问题,以此提升图像检索的效率。
第一方面,提供一种图像检索系统,该系统包括用于获取待检索图像的接口和包括第一计算引擎和第二计算引擎的计算引擎集合,其中,第一计算引擎用于计算待检索图像与图库所包含的多个图像之间的相似度;第二计算引擎用于根据待检索图像与图库所包含的多个图像之间的相似度确定检索结果,检索结果包括与待检索图像相似的目标图像。通过上述图像检索系统,可以利用计算引擎集合执行图像检索操作,进而根据待检索图像和图库中图像的相似度确定检索结果,避免由单一CPU执行图像检索操作所带来的耗时长、效率低的问题。
在一种可能的实现方式中,待检索图像与图库所包含的多个图像之间的相似度是根据待检索图像所对应的查询向量与多个图像对应的中心向量之间的距离获得的。或者,待检索图像与图库所包含的多个图像之间的相似度是指待检索图像所对应的查询向量与多个图像对应的中心向量之间的距离。本申请所要保护的图像检索系统中,利用向量间的距离衡量图像的相似度,进而确定与待检索图像匹配的图像。
在另一种可能的实现方式中,在图像检索处理的过程中,第一计算引擎具体用于将查询向量转换为查询矩阵,将中心向量转换为中心矩阵,根据查询矩阵和中心矩阵获得距离表,其中,距离表用于确定待检索图像所对应的查询向量与多个图像对应的中心向量之间的距离。第一计算引擎可以用于处理矩阵相关计算,在计算向量间距离的过程中,可以将距离计算过程映射为矩阵处理过程,进而由专用的第一计算引擎完成矩阵的相关处理,提升距离计算过程的速度,进而提升整个图像检索过程的处理效率。
在另一种可能的实现方式中,图像检索系统还包括共享内存,该共享内存与第一计算引 擎之间通过内部总线连接;第一计算引擎还用于通过内部总线将距离表存储至共享内存。通过在图像检索系统中添加共享内存,第一计算引擎和第二计算引擎可以通过共享的方式访问该存储介质中数据,例如,第一计算引擎可以将距离表存储在共享内存中,第二计算引擎可以从共享内存中获取距离表,并基于该距离表完成图像检索的其他处理操作,避免使用图像检索系统以外的存储介质存储距离表所带来的访存速度慢以及缓存失效(cache miss)问题,进一步提升图像检索的速度。
在另一种可能的实现方式中,共享内存与第二计算引擎之间通过内部总线连接;第二计算引擎具体用于从共享内存中获取距离表;根据距离表确定图库中是否包含与所述待检索图像相似的目标图像。由于第一计算引擎和第二计算引擎均可以访问共享内存,第二计算引擎可以从共享内存获取距离表,并基于距离表执行后续处理,避免从主机系统的内存中获取数据所带来的缓存失效(Cache miss)导致耗时长的问题,加快数据处理的速度。
在另一种可能的实现方式中,图像检索系统还包括第三计算引擎,第三计算引擎,用于获取图库中图像所对应的索引清单,其中,第三计算引擎也可以称为直接内存访问引擎,用于通过直接内存访问的方式获取主机系统连接的内存中数据;第二计算引擎具体用于根据距离表和索引清单获得距离矩阵,距离矩阵用于指示索引清单中每个索引所关联的距离值;根据距离矩阵确定待检索图像所对应的查询向量与多个图像对应的中心向量之间的距离。第二计算引擎可以根据距离表和索引清单计算向量之间的相似度,确定与待检索图像匹配的图像。而且,将距离表和索引清单存储在共享内存中,可以加快生成距离矩阵过程中需要随机读的处理速度,提升数据处理的效率。
在另一种可能的实现方式中,第二计算引擎具体用于根据距离矩阵确定待检索图像所对应的查询向量与多个图像对应的中心向量之间的距离的排序结果确定检索结果。距离矩阵用于标识待检索图像的每个分段和中心向量的对应分段的距离,根据距离矩阵可以进一步确定待检索图像的查询向量和中心向量之间的距离,进而对该距离进行排序,最终确定检索结果。
在另一种可能的实现方式中,第二计算引擎具体用于根据预设阈值确定所述检索结果,所述检索结果所包括的图像所关联的向量与所述查询向量的距离满足预设条件,所述预设条件包括图像所关联的向量与所述查询向量的距离大于或等于所述预设阈值,或者,图像所关联的向量与所述查询向量的距离小于所述预设阈值。除了通过排序方式获得检索结果外,还可以通过预设阈值确定一个或多个满足条件的图像作为检索结果,也可以查找到与待检索图像匹配的检索结果。
在另一种可能的实现方式中,图像检索系统为芯片或PCIe卡,第一计算引擎和第二计算引擎通过内部总线相连。通过使用内部总线的连接方式,使得第一计算引擎和第二计算引擎之间的数据传输路径更短,数据传输速度更快。
在另一种可能的实现方式中,计算引擎集合中第一计算引擎和第二计算引擎分别为逻辑电路。通过使用专用逻辑电路构建的第一计算引擎和第二计算引擎,可以针对不同类型的运算操作使用不同的硬件引擎,更快完成数据处理。
在另一种可能的实现方式中,系统为虚拟化系统,计算引擎集合中第一计算引擎和第二计算引擎分别为运行在虚拟化系统中设备的虚拟机或容器。除了使用硬件电路构建的硬件引擎实现上述专用矩阵、向量和标量的运算外,还可以使用虚拟化形式的虚拟机或容器实现各个硬件引擎的功能,那么,在数据中心中也可以实现本申请所提供的高效的图像检索方法。
在另一种可能的实现方式中,待检索图像来自检索请求,检索请求包括至少一个待检索图像,或者,检索请求包括待检索视频,所述待检索视频包括至少一帧待检索图像。本申请 可以针对图像检索处理外,还是适用于视频检索过程,其中,每帧图像可以分别按照上述图像检索过程执行,也可以通过计算引擎集合快速获得检索结果。
第二方面,本申请提供一种图像检索的方法,该方法适用于图像检索系统,图像检索系统包括接口和计算引擎集合,计算引擎集合包括第一计算引擎和第二计算引擎。具体检索过程包括:第一计算引擎计算待检索图像与图库所包含的多个图像之间的相似度;第二计算引擎根据待检索图像与图库所包含的多个图像之间的相似度确定检索结果,检索结果包括图库中是否包含与待检索图像相似的目标图像。
在一种可能的实现方式中,待检索图像与图库所包含的多个图像之间的相似度是根据待检索图像所对应的查询向量与多个图像对应的中心向量之间的距离获得的。或者,待检索图像与图库所包含的多个图像之间的相似度是指待检索图像所对应的查询向量与多个图像对应的中心向量之间的距离。本申请所要保护的图像检索系统中,利用向量间的距离衡量图像的相似度,进而确定与待检索图像匹配的图像。
在另一种可能的实现方式中,在图像检索处理的过程中,第一计算引擎将查询向量转换为查询矩阵,将中心向量转换为中心矩阵,根据查询矩阵和中心矩阵获得距离表,其中,距离表用于确定待检索图像所对应的查询向量与多个图像对应的中心向量之间的距离。第一计算引擎可以用于处理矩阵相关计算,在计算向量间距离的过程中,可以将距离计算过程转换为矩阵处理过程,进而由专用的第一计算引擎完成矩阵的相关处理,提升距离计算过程的速度,进而提升整个图像检索过程的处理效率。
在另一种可能的实现方式中,图像检索系统还包括共享内存,该共享内存与第一计算引擎之间通过内部总线连接;第一计算引擎通过内部总线将距离表存储至共享内存。通过在图像检索系统中添加共享内存,第一计算引擎和第二计算引擎可以通过共享的方式访问该存储介质中数据,例如,第一计算引擎可以将距离表存储在共享内存中,第二计算引擎可以从共享内存中获取距离表,并基于该距离表完成图像检索的其他处理操作,避免使用图像检索系统以外的存储介质存储距离表所带来的访问速度慢,处理复杂的问题。
在另一种可能的实现方式中,共享内存与第二计算引擎之间通过内部总线连接;第二计算引擎从共享内存中获取距离表;根据距离表确定图库中是否包含与所述待检索图像相似的目标图像。由于第一计算引擎和第二计算引擎均可以访问共享内存,第二计算引擎可以从共享内存获取距离表,并基于距离表执行后续处理,避免从主机系统的内存中获取数据所带来的缓存失效(Cache miss)导致耗时长的问题,加快数据处理的速度。
在另一种可能的实现方式中,图像检索系统还包括第三计算引擎,第三计算引擎,可以获取图库中图像所对应的索引清单,其中,第三计算引擎也可以称为直接内存访问引擎,用于通过直接内存访问的方式获取主机系统连接的内存中数据;第二计算引擎根据距离表和索引清单获得距离矩阵,距离矩阵用于指示索引清单中每个索引所关联的距离值;根据距离矩阵确定待检索图像所对应的查询向量与多个图像对应的中心向量之间的距离。第二计算引擎可以根据距离表和索引清单计算向量之间的相似度,确定与待检索图像匹配的图像。
在另一种可能的实现方式中,第二计算引擎根据距离矩阵确定待检索图像所对应的查询向量与多个图像对应的中心向量之间的距离的排序结果确定检索结果。距离矩阵用于标识待检索图像的每个分段和中心向量的对应分段的距离,根据距离矩阵可以进一步确定待检索图像的查询向量和中心向量之间的距离,进而对该距离进行排序,最终确定检索结果。
在另一种可能的实现方式中,第二计算引擎具体用于根据预设阈值确定所述检索结果,所述检索结果所包括的图像所关联的向量与所述查询向量的距离满足预设条件,所述预设条 件包括图像所关联的向量与所述查询向量的距离大于或等于所述预设阈值,或者,图像所关联的向量与所述查询向量的距离小于所述预设阈值。除了通过排序方式获得检索结果外,还可以通过预设阈值确定一个或多个满足条件的图像作为检索结果,也可以查找到与待检索图像匹配的检索结果。
在另一种可能的实现方式中,图像检索系统为芯片或PCIe卡,第一计算引擎和第二计算引擎通过内部总线相连。
在另一种可能的实现方式中,计算引擎集合中第一计算引擎和第二计算引擎分别为逻辑电路。
在另一种可能的实现方式中,图像检索系统为虚拟化系统,计算引擎集合中第一计算引擎和第二计算引擎分别为运行在虚拟化系统中设备的虚拟机或容器。
在另一种可能的实现方式中,待检索图像来自检索请求,检索请求包括至少一个待检索图像,或者,检索请求包括待检索视频,所述待检索视频包括至少一帧待检索图像。本申请可以针对图像检索处理外,还是适用于视频检索过程,其中,每帧图像可以分别按照上述图像检索过程执行,也可以通过计算引擎集合快速获得检索结果。
第三方面,本申请提供一种图像检索的装置,所述装置包括用于执行第二方面或第二方面任一种可能实现方式中的图像检索的方法的各个模块。
第四方面,本申请提供一种数据处理系统,该数据处理系统包括主机系统和图像检索系统,主机系统和图像检索系统通过网络进行通信,主机系统用于获取检索请求,并根据检索请求确定检索图像,并将该检索图像发送至图像检索系统,图像检索系统用于执行第一方面或第一方面任意一种可能的实现方式中图像检索系统中各个计算引擎所实现的功能。
第四方面,本申请提供一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行上述各方面所述的方法。
第五方面,本申请提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述各方面所述的方法。
本申请在上述各方面提供的实现方式的基础上,还可以进行进一步组合以提供更多实现方式。
图1为本申请提供的一种乘积量化处理的方法的流程示意图;
图2为本申请提供的一种数据处理系统100的架构示意图;
图3为本申请提供的一种图像检索方法的流程示意图;
图4为本申请提供的另一种图像检索方法的流程示意图;
图5为本申请提供的一种计算距离表的方法的流程示意图;
图6为本申请提供的另一种计算距离表的方法的流程示意图;
图7为本申请提供的一种计算距离矩阵的方法的流程示意图;
图8为本申请提供的另一种计算距离矩阵的方法的流程示意图;
图9为本申请提供的一种计算累加向量的方法的流程示意图;
图10为本申请提供的另一种计算距离矩阵的方法的流程示意图;
图11为本申请提供的一种并行处理流程的示意图;
图12为本申请提供的一种图像检索的装置600的架构示意图。
为了便于理解,首先对本实施例中涉及的术语进行解释。
图像检索(image search),是指在图库(也可以称为底库、样本库或待检索库)中查找与用户输入的查询图像匹配的图像的过程。通常地,图像检索基于深度学习(deep learning)与图像识别技术,结合不同应用业务和行业场景,利用特征向量化与搜索能力,从指定图库中搜索相同或相似的图像。其中,图库中每个图像和查询图像均可以利用特征数据表示,图像的向量可以根据特征提取模型(例如,深度学习的特征提取模型ResNet)获得,同一图像的特征数据包括至少一个特征向量(也可以称为向量),每个向量可以用一个多维数组表示,例如,向量1可以利用(a,b,c)表示,a,b,c为浮点数。上述图像检索的过程也可以理解为是基于图像的特征数据的检索处理过程。值得说明的是,为便于描述,本申请的以下实施方式中以每个图像的特征数据为一个特征向量为例进行说明。
长特征,是根据特征提取模型获得的图像的特征数据,也称为长特征数据,或称为未编码数据、长编码数据、原始特征数据,每个图像会获得一个长特征。
短特征,是指由人工智能(artificial intelligent,AI)特征训练平台根据模型训练后,将长特征分解所获得的数据,也称为短特征数据,或称为编码数据、短编码数据。通常地,可以采用乘积量化(product quantization,PQ)方法将长特征转化为短特征,乘积量化方法是一种加快检索的方法。
示例地,图1为本申请实施例提供的一种乘积量化处理的方法的流程示意图,如图所示,以1个图像的长特征转化为短特征的过程为例,先将长特征分段,每个分段有一个分段序号,且该分段序号在该分段中唯一。例如,将包括512维的长特征分为64段,每段包括8个特征向量F,分段序号分别为0至7;然后,通过与预先训练获得的K个聚类中心向量的距离计算,在每个分段中确定与预先训练获得的K个聚类中心向量的距离最近的特征向量,并以该特征向量的分段序号作为该分段的短特征的标识。其中,K是预置的聚类中心的数量,K为大于或等于2的正整数。具体地,预先训练获得的K个聚类中心向量可以是以图库中全部或部分图像的长特征为训练数据,采用聚类算法(例如,K-Means算法)获得K个聚类,每个聚类包括一个聚类中心,用于标识该聚类中心的向量称为聚类中心向量(也可以称为中心向量)。例如,在包括F0至F7的分段1中,F1是与K个聚类中心向量的距离最近的特征向量,则可以利用F1的分段序号1标识分段1,即短特征中可以以1标识分段1。长特征中每个分段经过上述距离计算和比较后,均可以用一个分段中特征向量的序号标识该分段。上述乘积量化过程也可以理解为一种简化图库中图像的长特征的记录方式,通过每个分段中向量和中心向量的距离计算,构建原始特征的索引。用于标识图库中所有图像的长特征的各个分段的标识的集合称为短特征清单(也可以称为索引清单),短特征清单中各个短特征的顺序与长特征的分段顺序相同。后续图像检索处理过程为基于该短特征所执行检索处理。
下面结合图2进一步介绍本申请提供的数据处理系统100,如图所示,该处理系统100包括主机系统10、异构系统20(也可以称为图像检索系统)、第一存储器30和接口总线40,其中,主机系统10、异构系统20、第一存储器30通过接口总线40相连。其中,接口总线40包括快捷外围部件互连标准(Peripheral Component Interconnect Express,PCIe),异构系统20可以是一个芯片,也可以是一个PCIe卡。
主机系统10,用于向主机系统10中运行的应用程序或主机系统10外部的其他系统、设 备或应用程序提供应用编程接口(application programming interface,API)或图像检索框架,该应用程序接口或图像检索框架,用于接收用户的检索请求,该检索请求包括待检索图像;进而通过接口总线40将检索请求发送给异构系统20,由异构系统20完成检索请求的处理,例如,获取检索请求中待检索图像所关联的查询向量。可选地,也可以是主机系统完成检索请求的处理,获取待检索图像所关联的查询向量。此外,主机系统10还用于运行操作系统。主机系统10包括处理器101,处理器101可以是中央处理器(central processing unit,CPU)。
异构系统20,用于在图库中检索与查询图像相同或相似的图像。异构系统20包括两种或两种以上的计算引擎,例如,如图2所示,异构系统20包括矩阵计算引擎201、向量计算引擎202、标量计算引擎203、内部总线206和接口207。可选地,异构系统20还包括直接内存访问(direct memory access,DMA)引擎204、第二存储器205。其中,矩阵计算引擎201、向量计算引擎202、标量计算引擎203、直接内存访问(direct memory access,DMA)引擎204、第二存储器205和接口207通过内部总线206相通信,其中,内部总线206可以为并行总线,用于实现点对点(point-to-point,P2P)通信;内部也可以为包括PCIe或前置总线(front side bus,FSB)或其他形式可以实现各个部件连接的方式。
为了便于描述,也可以将矩阵计算引擎201称为第一计算引擎,将向量计算引擎202和标量计算引擎203称为第二计算引擎,将直接内存访问引擎204称为第三计算引擎,将第二存储器205称为共享内存。
作为一种可能的实施例,第一存储器30和第二存储器205为两个不同存储器。可选地,第一存储器30和第二存储器205还可以是通过内存映射输入/输出(memory mapped input/output,MMIO)技术建立映射关系的不同存储器;或者,第一存储器30和第二存储器205也可以是同一存储器。
矩阵计算引擎201,用于实现矩阵相关运算处理,包括矩阵乘和其他涉及矩阵的运算过程,例如,图3中S201计算距离表。向量计算引擎202,用于实现向量相关计算,包括向量累加等涉及向量的运算过程,例如,图3中S204累加距离向量获得向量间的距离结果。标量计算引擎203,用于执行标量相关运算,包括查询短特征清单和距离表,进而生成距离矩阵,以及对矩阵距离计算后的结果进行排序的操作。例如,图3中S202遍历短特征清单、S203生成距离矩阵和S205排序;其中,标量是指只有大小的,无方向的量。直接内存访问引擎204,用于采用直接内存访问技术,将短特征清单从第一存储器20加载至第二存储器205,异构系统中各个计算引擎可以快速获取短特征清单中内容,进而完成相应处理操作。第二存储器205,用于为矩阵计算引擎201、向量计算引擎202、标量计算引擎203、直接内存访问引擎204提供共享形式的存储区域。
上述矩阵计算引擎201、向量计算引擎202、标量计算引擎203、直接内存访问引擎204可以分别由一种具有特定功能的硬件电路实现。例如,每个计算引擎分别由芯片或由电子元器件构建的电路结构组成,用于对数据执行计算处理。本实施例以矩阵计算引擎201为例进行说明,矩阵计算引擎201具有CPU的通用性和可编程性,但更具有专用性,可以高效执行相关运算,例如,矩阵相关运算。矩阵计算引擎201可以被替换成图形处理单元(graphics processing unit,GPU)、嵌入式神经网络处理器(neural-network processing units,NPU)等处理芯片。
可选地,矩阵计算引擎201、向量计算引擎202、标量计算引擎203、直接内存访问引擎204也可以分别是由软件模块实现,或者,软件和硬件混合的方式构成。可选地,上述各个计算引擎也可以合一或部分合一部署,例如,将矩阵计算引擎201、向量计算引擎202和向 量计算引擎203合一部署,用于实现矩阵类运算和向量类运算处理。
作为一种可能的实施例,图2所示的异构系统还可以以软件方式实现,例如,异构系统是虚拟化系统,其中,矩阵计算引擎201、向量计算引擎202、标量计算引擎203、直接内存访问引擎204可以由虚拟机或容器等虚拟化形式的计算设备实现,此时,异构系统通过网络与主机系统相连,该网络可以是以太网、光纤网络、无线网络等形式实现主机系统和异构系统的通信。虚拟化系统中包括一个或多个设备,每个设备中包括分别用于实现相应计算功能的逻辑电路或器件,例如,该设备中可以包括上述PCIe卡或芯片,此时,各个计算引擎是对带有上述PCIe卡或芯片的设备虚拟化后获得的一个或多个虚拟机或容器实现。例如,虚拟化系统包括带有实现矩阵计算引擎功能的器件(例如,GPU)的设备,通过对该设备的资源虚拟化后获得带有虚拟图形处理器(virtual GPU,vGPU)的多个虚拟机或容器,每个虚拟机或容器分别用于实现上述各个计算引擎的功能。
为了便于描述,以下实施例以图2所示的数据处理系统100为例进一步解释本申请提供的图像检索方法,其中,异构系统20可以通过接口207从主机系统10获取待检索图像,确定待检索图像与图库中图像的相似度,再根据相似度的计算结果确定待检索图像的检索结果。在上述图像检索过程中,异构系统20可以采用“专人专事”的思路,参见图3,在异构系统20中使用不同类型的计算引擎分别执行S201距离表计算、S202遍历短特征清单、S203生成距离矩阵、S204累加距离向量和S205排序的操作,进而确定检索结果,检索结果包括与待检索图像相似的图像(也称为目标图像)。由此实现针对不同运算操作的类型分别采用不同计算引擎执行处理的过程,提升数据处理的效率。
接下来,结合附图进一步介绍本申请提供的图像检索方法,图4为本申请提供的一种图像检索的方法的流程示意图,如图所示,该方法包括:
S401、矩阵计算引擎根据检索请求中待检索图像所对应的查询向量和图库中的中心向量确定距离表。
为了加快海量数据的检索速度,在执行图像检索前,图2所示的主机系统10或系统100以外的其他设备或系统预先通过特征提取模型获得图库中图像的长特征;并以图库中全部或部分数据为训练样本,利用聚类算法获得K个聚类中心,每个聚类中心包括一个中心向量(也可以称为乘积量化聚类中心向量),每个中心向量包括与长特征中分段数相同个数的向量,每个向量可以用一个多维数组表示。示例地,如图4中,图库中图像的长特征经过乘积量化和聚类处理后获得K个中心向量1至中心向量K,每个中心向量包括M个分段,其中,M为长特征的分段数量,例如,中心向量1包括向量11至向量1M。然后,参考图1所示的过程,对图库中各个图像的长特征执行乘积量化处理,创建短特征清单。
在获取检索请求时,可以先获取检索请求中待检索图像,然后,对待检索图像执行预处理操作,包括:利用与图库中图像预处理过程相同的特征提取模型获取待检索图像的长特征,然后,将待检索图像的长特征分成M段,其中,M为图库中执行乘积量化处理时对长特征的分段数相同,例如,图4中将查询向量1分成向量1至向量M。
值得说明的是,上述对检索图像的预处理过程可以由异构系统执行,也可以由主机系统或其他设备或系统执行,并将预处理的结果发送给异构系统,由异构系统执行图像的进一步处理。此外,本申请并不限定上述聚类处理和特征提取模型的具体处理过程,具体实施中,可以根据业务需求选择合适的算法执行聚类和图像特征提取处理。
进一步地,为了在图库中确定与查询向量相似的向量,需要确定查询向量和图库的各个中心向量的相似度。具体地,可以以查询向量中每个分段分别作为检索对象,分别计算该检 索对象对应的分段的向量与K个中心向量中对应分段的各个向量的距离,进而根据上述距离计算结果获得距离表。示例地,图5为本申请实施例提供的一种距离计算方法的流程示意图,如图所示,计算查询向量1的分段1的向量11和中心向量1的分段1的向量11的距离获得D11,则D11用于指示查询向量的分段1所指示的向量11与中心向量1的分段1所指示的向量11的距离。
具体地,可以采用如下方式中任意一种方式计算两个向量间的距离:
方式1,可以采用如下公式1计算两个向量间的距离:
其中,x
1为查询向量,x
1k为查询向量的第K个分段向量;x
2为图库中待比较的中心向量,x
2k为中心向量中第K个分段,K取值从1至n的整数,n为大于1的正整数。
方式2,还可以在公式1基础上,对公式1的等号两边执行开根号运算,即采用如下公式2计算两个向量间的距离:
方式3,还可以在公式2基础上去除开根号运算,采用如下公式3计算两个向量间的距离:
方式4,由于公式2中第1项对于所有查询向量的分段所对应的向量的累加结果,对于查询向量中任意一个分段所指示的向量和中心向量中任意一个分段的向量计算距离的过程而言,该值均相同,在距离计算过程中也可以将该值理解为一个常量,也即在任意两个向量的距离计算过程中均包括该项的计算结果,因此,为了简化计算过程,可以在距离计算公式中去掉该项。具体可以采用如下公式4计算两个向量的距离:
进一步地,为了简化距离计算的处理过程,也可以将上述公式的计算过程转换为矩阵计算的过程,进而使用矩阵计算引擎执行矩阵计算,避免逐个获取数据,以及执行运算操作所带来的时延和效率问题。
具体地,将查询向量按照分段序号转化为行矩阵,例如,按照查询向量中分段序号分别将查询向量的各个分段所对应的向量分别作为第一矩阵中的行元素;将中心向量按照分段序号转换为列矩阵,例如,按照每个中心向量的分段序号,分别将每个中心向量的各个分段所对应的向量作为第二矩阵中一列的元素,进而由矩阵计算引擎执行矩阵相关运算操作,提升数据处理的效率。为了便于描述,也可以将上述行矩阵称为第一矩阵,列矩阵称为第二矩阵。
示例地,参考图6,以查询向量1和中心向量1矩阵映射过程为例,首先,按照查询向量1的分段序号,分别以查询向量1中每个分段的向量作为第一矩阵中一行的元素,如将查 询向量11作为第一矩阵中第一行的元素,……,依此类推,第一矩阵为M*1的矩阵,每个分段分别作为第一矩阵的一行的元素;按照中心向量1的分段序号,依次将中心向量1的各个分段的向量分别作为第二矩阵中一列的元素,……,依此类推,第二矩阵为1*M的列矩阵,每个分段分别作为第二矩阵的一列的元素。此时,在利用如上述公式4计算查询向量1和中心向量1的距离时,则可以执行第一矩阵和第二矩阵的乘操作,再将上述矩阵乘的结果乘以2即可获得公式4中第2项的结果。进一步地,公式4中第一项是对中心向量1中各个分段的累加操作,可以预先计算第1项的结果,分别获取每个中心向量的相同分段序号的向量的累加和,并将各个累加和分别作为第三矩阵的一个列元素。进而将上述公式4的计算过程转换为如图5所示的第一矩阵和第二矩阵相乘,再将相乘获得的结果与第三矩阵进行累加,即可获得公式4的距离计算结果。
值得说明的是,图6中第三矩阵也可以称为偏移矩阵,虽然图5中是将第一矩阵和第二矩阵相乘的结果与第三矩阵进行相加获得上述距离公式的结果,但是,具体实施中,为了简化运算过程,加快数据处理效率,也可以将矩阵相乘获得的矩阵中所有元素均乘以-1,再执行与第三矩阵的相加操作,进而获得公式4的处理结果。
可选地,除了上述公式4以外,公式1至公式3也可以采用类似的思路,将各个公式按照分段序号作为矩阵的行元素或列元素,进而将距离计算过程转换为矩阵的计算过程,并利用矩阵计算引擎执行上述矩阵乘和加操作,提升矩阵计算处理的速度。
基于上述任意一种方式可以获得任意两个向量的距离计算结果,进一步地,可以利用如图5所示的距离表存储距离计算结果,其中,D
11用于指示查询向量1中分段1所指示的向量11与中心向量1中分段1所指示的向量11的距离计算结果,D
21用于指示查询向量1中分段2所指示的向量11与中心向量2中分段2所指示的向量21的距离计算结果,……,依此类推,距离表中包括查询向量中各个分段所指示的向量分别与中心向量对应的各个分段的向量的距离计算结果。
矩阵计算引擎在生成距离表后,可以将该距离表存储在异构系统20的第二存储器205中,并向标量计算引擎发送通知,进而由标量计算引擎执行后续操作。当距离表存储在第二存储器205时,标量计算引擎202可以直接通过第二存储器205读取距离表,进而生成距离矩阵,具体过程参见步骤S402和S403。
由于上述距离表的计算过程主要是涉及矩阵的距离计算过程,利用矩阵计算引擎可以直接以矩阵运算的方式处理距离计算过程,能够更快的完成上述距离计算过程。而且,矩阵计算引擎支持浮点数运行,无需像CPU执行距离计算时进行浮点数和整点数的转换的操作,简化了数据处理的效率。此外,通过优化距离计算的方法,可以快速确认任意两个向量的距离,进而判断向量间的相似度,也进一步提升了图像检索的效率。
S402、标量计算引擎根据短特征清单和距离表生成距离矩阵。
短特征清单通常在创建图库的过程中生成,存储在图1所示的第一存储器30中。如图8所示,在图像检索过程中,标量计算引擎在生成距离矩阵之前,可以通知直接内存引擎204将图库的短特征清单加载至异构系统20的第二存储器205,进而由标量计算引擎逐个遍历短特征清单,根据短特征清单中的各个分段的标识从距离表查询得到一个距离值,然后,将该距离值写入距离矩阵的对应位置,最终,获得距离矩阵,该距离矩阵可以存储在异构系统的第二存储器中。
值得说明的是,距离矩阵是一种存储数据的逻辑结构,具体实施中,可以在第二存储器205划分指定大小的存储空间,利用该存储空间实现距离矩阵的存储,例如,指定存储空间1 用于存储D11的数据,存储空间2用于存储D12的数据。
示例地,如图7所示,短特征清单包括图库中全部图像的短特征,可以以每个图像的短特征序号对应的距离值分别作为距离矩阵中一行元素,例如,图像1包括64个序号,第一个短特征标识为1,则可以将距离表中第一行第一列的距离值D11作为距离矩阵第一行第一列的元素,类似的,图像1的第二个短特征标识为2,则可以将距离表中第二行第二列的距离值D22作为距离矩阵第二行第二列的元素,……,依此类推,距离矩阵一行元素表示一个图像的距离值,进而获得N*M的矩阵,其中,N为短特征清单中所包括的图像的个数,M为长特征的分段个数。
可选地,直接内存引擎可以根据第二存储器205中剩余空间的大小将全部或部分短特征清单加载至异构系统20的第二存储器205中。
第二存储器205可以采用具有高速读写时延的存储介质设置,例如,静态随机存取存储器(static random access memory,SRAM),标量计算引擎可以通过第二存储器直接读取短特征清单和距离表,避免从第一存储器读取时所带来的数据传输时延。此外,传统技术中CPU先将短特征清单加载至内存,访问频率高的数据会被加载至CPU的缓存,但是由于缓存空间有限,无法将全部短特征清单加载至缓存中,而距离矩阵生成过程又需要频繁读取短特征清单和距离表的数据,会导致缓存中未存在待读取数据的问题(也称为cache miss问题),影响读取数据的速度。而本申请通过在异构系统中添加第二存储器205,将短特征清单加载至第二存储器205中,标量计算引擎和第二存储器通过内部总线相连,使得数据处理过程中无需频繁搬迁。此外,由于短特征清单中的标识并非按照分段顺序依次标识,但按照矩阵逐行顺序依次写入各个元素,将距离表和短特征清单分别存储在异构系统的第二存储器205中,上述过程可以理解为对距离表的随机读以及对距离矩阵的顺序写的过程。将短特征清单存储在第二存储器中,还可以提升短特征清单中数据读取的速度,进一步提升图像检索过程的处理效率。
标量计算引擎在生成距离矩阵后可以通知向量计算引擎继续执行距离累加操作,具体参见步骤S403。
S403、向量计算引擎根据距离矩阵确定累加后的距离向量。
步骤S402所获得的距离矩阵中每一行有若干个距离值,每一个距离值代表查询向量的一个分段所对应的向量与图库中的中心向量对应分段的向量间的距离,需要将同一行的所有距离进行累加操作,才能得到查询向量与图库中同一个中心向量之间的距离。
如图9所示,向量计算引擎可以使用一条累加指令就能实现多个浮点数的累加操作,而在传统技术的处理过程中,上述累加操作会涉及大量的浮点累加运算处理,如果图库中包括1亿条的向量,将会产生1亿次的累加操作,每次累加操作包含M(PQ分段数)个元素,利用向量计算引擎直接执行向量的累加操作大大加快向量累加的速度。
S404、标量计算引擎根据累加后的距离结果确定检索结果。
标量计算引擎可以先对累加后的距离结果进行排序,然后,基于排序结果,选择至少一个与查询向量匹配的图像作为检索结果。
其中,排序的方法可以采用Top X排序,也即在距离结果中按照从大到小的顺序筛选出X个距离结果,X为大于或等于1的正整数,X的具体值可以人为指定。具体地,累加后的距离结果的排序过程可以采用二叉堆的方式筛选Top X的距离结果。例如,标量计算引擎可以将待排序的数据分成至少两个组,然后,在每个组中再按照大小进行排序,最后,再将不同组的排序结果进行二次排序,进而在最终的排序结果中选择至少一个累加后的距离结果作 为检索结果。
进一步地,异构系统还可以根据查询向量的检索结果确定关联的图库中图像,具体地,由于短特征清单是按照图库中图像生成的,如图7所示,短特征清单可以记录短特征的标识与图库中图像的关系,那么,根据短特征清单和距离表生成的距离矩阵中也可以添加图像的标识。示例地,在图9所示的距离矩阵中,可以在每个向量中添加图像的标识的字段,相应地,在累加后的距离结果中同样可以带有用于标识图像的字段,当根据Top X等方式确定累加后的距离结果时,可以根据图像的字段确定匹配的图像的标识,进而从存储图库的图像的存储器中获得匹配图像,进而将匹配图像作为待检索图像的检索结果呈现给用户。
作为一种可能的实施例,异构系统获得排序结果后,可以由直接内存访问引擎将排序结果搬迁至第一存储器中,进而由主机系统根据上述排序结果查找排序结果关联的图库中图像作为的检索结果,并通过接口或网络(web)界面向用户呈现检索请求的检索结果。
值得说明的是,图库中图像可以存储在图1所示的图像检索系统中,例如,利用主机系统10的第一存储器30存储,也可以利用该系统以外的设备存储图库中图像,当确定待检索图像的查询向量的检索结果时,可以根据检索结果中指示的图像的标识,从系统以外的设备获取匹配图像。
作为一种可能的实施例,在步骤S301生成距离矩阵之后,可以采用多级并行处理方式执行图像检索过程,具体地,可以采用如下两种方式执行并行处理操作:
方式1,多个查询请求的并行处理。
当异构系统获取多个查询请求时,可以分别按照上述步骤S301至步骤S304并行执行查询请求,此时,由于每个查询请求的过程相对独立,异构系统中各个计算引擎可以采用多个处理器核或进程或线程同时执行多个查询请求的处理过程,进而实现多个查询请求的并行处理。
方式2,同一查询请求的并行处理。
对于同一个查询请求,可以根据统计数据确定矩阵计算引擎、向量计算引擎、标量计算引擎的处理能力执行并行处理。具体地,参考图11,图11为一种并行处理流程的示意图,如图所示,在矩阵计算引擎生成距离表后,标量计算引擎可以在生成部分距离矩阵时,即将该部分矩阵进行累加获得累加后的距离向量,再由标量计算引擎对上述累加后的距离向量进行Top X排序,进而获得该部分距离矩阵的排序结果,上述过程可以称为基于部分距离矩阵的迭代处理过程。对于同一个距离矩阵的处理过程,可以将距离矩阵的迭代处理过程划分为两个或多个子过程,每个过程分别执行上述基于部分距离矩阵的迭代处理过程。而对于多个基于部分距离矩阵的迭代处理过程,仅需在排序阶段更新排序结果即可确定该距离矩阵的最终结果。例如,可以采用堆栈形式存储排序结果,每当完成一个基于部分距离矩阵的迭代处理过程,则更新堆栈中排序结果,获得最新的排序结果,直到完成全部距离矩阵的处理过程为止。
可选地,如图9所示,距离计算引擎201可以将图库分成若干组,然后,依次计算查询向量的各个分段和图库中各个分组所对应的中心向量的距离,进而生成距离表的一个距离子表;接下来,由标量计算引擎203根据距离子表生成距离矩阵,其中,距离表的一部分数据称为距离子表;然后,由标量计算引擎203根据距离子表生成距离矩阵;再由向量计算引擎202根据距离矩阵确定累加后的距离结果,进而由标量计算引擎203根据累加后的距离结果确定距离结果的排序顺序。距离计算引擎201每生成一个距离子表就按照上述过程依次由标量计算引擎203、向量计算引擎202执行相应的处理过程,直到完成整个距离表的处理。其 中,排序过程可以是基于类似先进先出堆栈的形式确定X个向量,每次获取新的X个向量后,均与堆栈中的向量进行比较,新获得的X个向量中存在距离更小的向量,则替换原堆栈中向量。通过上述实施方式,可以将数据进行细粒度的切分,使得距离矩阵生成、向量累加以及排序可以同时执行,进一步提升了图像检索的效率。
通过上述过程的描述,本申请提供的图像检索方法采用“专人专事”的思路,利用包括多个专用计算引擎的异构系统执行图像检索处理,避免由CPU执行图像检索处理所带来的图像检索速度慢的问题。进一步地,在向量的距离计算过程中,还提供多种计算方式简化距离计算的处理过程。而且,通过将距离计算过程映射为矩阵计算过程,可以利用矩阵计算引擎快速完成向量的计算处理过程,进一步提升了了图像检索处理的效率,降低了数据处理的时延。此外,通过在异构系统中添加共享内存,利用共享内存存储距离表和短特征清单,可以避免cache miss问题,提升数据处理的效率。另一方面,在图像检索过程中,还可以基于异构系统中各个计算引擎的处理能力,并行执行多个查询请求或同一查询请求中不同步骤,进一步降低图像检索的处理时延,满足实时图像检索的要求。
作为一种可能的实施例,本申请的上述方法除了应用于图像检索外,还可以针对视频检索、人脸识别、文字检索等场景,对于视频检索场景,每帧图像可以分别采用上述图3至图11的方法执行检索处理,获得检索结果,进而将包括所有帧的检索结果作为检索结果,也可以快速完成视频检索处理。此外,针对人脸识别场景,同样可以采用上述图像检索的方法,以待检索人脸的图像为待检索图像,进一步执行上述图3至图11的方法也可以实现快速获得匹配的图像的过程。
值得说明的是,对于上述方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本申请并不受所描述的动作顺序的限制,其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作并不一定是本申请所必须的。
上文中结合图1至图11,详细描述了根据本申请实施例所提供的图像检索的方法,下面将结合图12,描述根据本申请实施例所提供的图像检索的装置。
图12为本申请提供的一种图像检索的装置600的结构示意图,如图所示,装置600包括第一计算单元601和第二计算单元602,其中,
第一计算单元601,用于计算所述待检索图像与图库所包含的多个图像之间的相似度;
第二计算单元602,用于根据所述待检索图像与所述图库所包含的多个图像之间的相似度确定检索结果,所述检索结果包括图库中是否包含与所述待检索图像相似的目标图像。
应理解的是,本申请实施例的装置600中第一计算单元601可以对应图2中矩阵计算引擎201的实现方式,第二计算单元602可以对应图2中包括向量计算引擎202和标量计算引擎203的实现方式。第一计算单元601和第二计算单元602也可以通过软件实现图3至图11所示的图像检索的方法,装置600及其各个模块也可以为软件模块。例如,由GPU调用存储器中指令分别实现第一计算单元601和第二计算单元602的功能。可选地,当第一计算单元601和第二计算单元602由软件实现时,也可以由CPU或其他类型的处理器调用存储器中计算机指令分别实现第一计算单元和第二计算单元的功能。其中,处理器包括数字信号处理器(digital signal processing,DSP)、专用集成电路(application-specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者是任何常规的处理器等。
可选地,所述待检索图像与所述图库所包含的多个图像之间的相似度是根据所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离获得的,或者所述待检索图像与图库所包含的多个图像之间的相似度是指所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
可选地,第一计算单元601,还用于将所述查询向量转换为查询矩阵,将所述中心向量转换为中心矩阵,根据所述查询矩阵和所述中心矩阵获得距离表,所述距离表用于确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
可选地,装置600中还包括共享内存(图6中未示出),则第二计算单元602,还用于从所述共享内存中获取所述距离表;并根据所述距离表确定所述图库中是否包含与所述待检索图像相似的目标图像。
可选地,装置600还包括第三计算单元603,用于获取所述图库中图像所对应的索引清单,所述第二计算引擎具体用于从所述共享内存中获取所述距离表;根据所述距离表确定所述图库中是否包含与所述待检索图像相似的目标图像。
第二计算单元602,还用于根据所述距离表和所述索引清单获得距离矩阵,所述距离矩阵用于指示所述索引清单中每个索引所关联的距离值;根据所述距离矩阵确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
可选地,第二计算单元602,还用于根据所述距离矩阵确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离的排序结果确定检索结果。
可选地,第二计算单元602,还用于根据预设阈值确定所述检索结果,所述检索结果所包括的图像所关联的向量与所述查询向量的距离满足预设条件,所述预设条件包括图像所关联的向量与所述查询向量的距离大于或等于所述预设阈值,或者,图像所关联的向量与所述查询向量的距离小于所述预设阈值。
可选地,所述待检索图像来自检索请求,所述检索请求包括至少一个所述待检索图像,或者,所述检索请求包括待检索视频,所述待检索视频包括至少一帧所述待检索图像。
根据本申请实施例的装置600可对应于执行本申请实施例中描述的方法,并且装置600中的各个单元的上述和其它操作和/或功能分别为了实现图3至图11的各个方法的相应流程,为了简洁,在此不再赘述。
本申请提供的装置600可以利用多个计算单元分别执行不同的数据处理过程,提升数据处理的效率。在向量的距离计算过程中,采用多种计算方式简化距离计算的处理过程。而且,装置600还可以将距离计算过程映射为矩阵计算过程,利用第一计算单元快速完成距离计算处理过程,进一步提升了了图像检索处理的效率,降低了数据处理的时延。此外,在图像检索过程中,还可以基于异构系统中各个计算引擎的处理能力,并行执行多个查询请求或同一查询请求中不同步骤,进一步降低图像检索的处理时延,满足实时图像检索的要求。
本申请还提供的一种图像检索系统,该图像检索系统如图1所示的异构系统20,该异构系统20用于实现如图3至图11中各个方法的相应流程的功能,为了简洁,在此不再赘述。
本申请还提供一种数据处理系统,该数据处理系统包括如图1所示的主机系统10和异构系统20,主机系统用于获得检索请求,将检索请求中待检索图像发送给异构系统,进而由异构系统执行如图3至图11中各个方法的相应流程的功能,为了简洁,在此不再赘述。
上述实施例,可以全部或部分地通过软件、硬件、固件或其他任意组合来实现。当使用软件实现时,上述实施例可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载或执行所述计算机程序指令时,全部或部 分地产生按照本申请实施例所述的流程或功能。所述计算机可以为通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集合的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质。半导体介质可以是固态硬盘(solid state drive,SSD)。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到各种等效的修改或替换,这些修改或替换都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。
Claims (33)
- 一种图像检索系统,其特征在于,包括:接口和计算引擎集合;所述接口,用于获取待检索图像;所述计算引擎集合包括第一计算引擎和第二计算引擎,其中,所述第一计算引擎用于计算所述待检索图像与图库所包含的多个图像之间的相似度;所述第二计算引擎用于根据所述待检索图像与所述图库所包含的多个图像之间的相似度确定检索结果,所述检索结果包括与所述待检索图像相似的目标图像。
- 根据权利要求1所述的系统,其特征在于,所述待检索图像与所述图库所包含的多个图像之间的相似度是根据所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离获得的,或者所述待检索图像与图库所包含的多个图像之间的相似度是指所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
- 根据权利要求2所述的系统,其特征在于,所述第一计算引擎具体用于将所述查询向量转换为查询矩阵,将所述中心向量转换为中心矩阵,根据所述查询矩阵和所述中心矩阵获得距离表,所述距离表用于确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
- 根据权利要求3所述的系统,其特征在于,所述系统还包括共享内存,所述共享内存与所述第一计算引擎之间通过内部总线连接;所述第一计算引擎还用于通过所述内部总线将所述距离表存储至所述共享内存。
- 根据权利要求4所述的系统,其特征在于,所述共享内存与所述第二计算引擎之间通过内部总线连接;所述第二计算计算引擎具体用于从所述共享内存中获取所述距离表;根据所述距离表确定所述图库中是否包含与所述待检索图像相似的目标图像。
- 根据权利要求5所述的系统,其特征在于,所述系统还包括第三计算引擎,所述第三计算引擎,用于获取所述图库中图像所对应的索引清单;所述第二计算引擎具体用于根据所述距离表和所述索引清单获得距离矩阵,所述距离矩阵用于指示所述索引清单中每个索引所关联的距离值;根据所述距离矩阵确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
- 根据权利要求6所述的系统,其特征在于,所述第二计算引擎具体用于根据所述距离矩阵确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离的排序结果确定检索结果。
- 根据权利要求6所述的系统,其特征在于,所述第二计算引擎具体用于根据预设阈值确定所述检索结果,所述检索结果所包括的图像所关联的向量与所述查询向量的距离满足预设条件,所述预设条件包括图像所关联的向量 与所述查询向量的距离大于或等于所述预设阈值,或者,图像所关联的向量与所述查询向量的距离小于所述预设阈值。
- 根据权利要求1至8中任一所述的系统,其特征在于,所述系统为芯片或PCIe卡,所述第一计算引擎和所述第二计算引擎通过内部总线相连。
- 根据权利要求9所述的系统,其特征在于,所述计算引擎集合中所述第一计算引擎和所述第二计算引擎分别为逻辑电路。
- 根据权利要求1至8中任一所述的系统,其特征在于,所述系统为虚拟化系统,所述计算引擎集合中所述第一计算引擎和所述第二计算引擎分别为运行在所述虚拟化系统中设备的虚拟机或容器。
- 根据权利要求1至11中所述的系统,其特征在于,所述待检索图像来自检索请求,所述检索请求包括至少一个所述待检索图像,或者,所述检索请求包括待检索视频,所述待检索视频包括至少一帧所述待检索图像。
- 一种图像检索的方法,其特征在于,所述方法适用于图像检索系统,所述图像检索系统包括接口和计算引擎集合,所述计算引擎集合包括第一计算引擎和第二计算引擎,所述方法包括:所述第一计算引擎计算所述待检索图像与图库所包含的多个图像之间的相似度;所述第二计算引擎根据所述待检索图像与所述图库所包含的多个图像之间的相似度确定检索结果,所述检索结果包括图库中是否包含与所述待检索图像相似的目标图像。
- 根据权利要求13所述的方法,其特征在于,所述待检索图像与所述图库所包含的多个图像之间的相似度是根据所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离获得的,或者所述待检索图像与图库所包含的多个图像之间的相似度是指所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
- 根据权利要求14所述的方法,其特征在于,所述第一计算引擎计算所述待检索图像与图库所包含的多个图像之间的相似度,包括:所述第一计算引擎将所述查询向量转换为查询矩阵,将所述中心向量转换为中心矩阵,根据所述查询矩阵和所述中心矩阵获得距离表,所述距离表用于确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
- 根据权利要求15所述的方法,其特征在于,所述系统还包括共享内存,所述共享内存与所述第一计算引擎之间通过内部总线连接;所述第一计算引擎通过所述内部总线将所述距离表存储至所述共享内存。
- 根据权利要求16所述的方法,其特征在于,所述共享内存与所述第二引擎之间通过内部总线连接;所述第二计算引擎从所述共享内存中获取所述距离表;并根据所述距离表确定所述图库中是否包含与所述待检索图像相似的目标图像。
- 根据权利要求17所述的方法,其特征在于,所述系统还包括第三计算引擎,所述第三计算引擎,获取所述图库中图像所对应的索引清单,所述第二计算引擎具体用于从所述共享内存中获取所述距离表;根据所述距离表确定所述图库中是否包含与所述待检索图像相似的目标图像;所述第二计算引擎根据所述距离表和所述索引清单获得距离矩阵,所述距离矩阵用于指示所述索引清单中每个索引所关联的距离值;根据所述距离矩阵确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
- 根据权利要求18所述的方法,其特征在于,所述第二计算引擎根据所述距离表确定所述图库中是否包含与所述待检索图像相似的目标图像,包括:所述第二计算引擎根据所述距离矩阵确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离的排序结果确定检索结果。
- 根据权利要求18所述的方法,其特征在于,所述第二计算引擎根据所述距离表确定所述图库中是否包含与所述待检索图像相似的目标图像,包括:所述第二计算引擎根据预设阈值确定所述检索结果,所述检索结果所包括的图像所关联的向量与所述查询向量的距离满足预设条件,所述预设条件包括图像所关联的向量与所述查询向量的距离大于或等于所述预设阈值,或者,图像所关联的向量与所述查询向量的距离小于所述预设阈值。
- 根据权利要13至18中任一所述的方法,其特征在于,所述系统为芯片或PCIe卡,所述第一计算引擎和所述第二计算引擎通过内部总线相连。
- 根据权利要21中任一所述的方法,其特征在于,所述计算引擎集合中所述第一计算引擎和所述第二计算引擎分别为逻辑电路。
- 根据权利要13至18中任一所述的方法,其特征在于,所述系统为虚拟化系统,所述计算引擎集合中所述第一计算引擎和所述第二计算引擎分别为运行在所述虚拟化系统中设备的虚拟机或容器。
- 根据权利要13至23中任一所述的方法,其特征在于,所述待检索图像来自检索请求,所述检索请求包括至少一个所述待检索图像,或者,所述检索请求包括待检索视频,所述待检索视频包括至少一帧所述待检索图像。
- 一种图像检索的装置,其特征在于,所述装置包括第一计算单元和第二计算单元,所述第一计算单元,用于计算所述待检索图像与图库所包含的多个图像之间的相似度;所述第二计算单元,用于根据所述待检索图像与所述图库所包含的多个图像之间的相似度确定检索结果,所述检索结果包括图库中是否包含与所述待检索图像相似的目标图像。
- 根据权利要求25所述的装置,其特征在于,所述待检索图像与所述图库所包含的多个图像之间的相似度是根据所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离获得的,或者所述待检索图像与图库所包含的多个图像之间的相似度是指所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
- 根据权利要求26所述的装置,其特征在于,所述第一计算单元,还用于将所述查询向量转换为查询矩阵,将所述中心向量转换为中心矩阵,根据所述查询矩阵和所述中心矩阵获得距离表,所述距离表用于确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
- 根据权利要求27所述的装置,其特征在于,所述装置还包括共享内存;所述第二计算单元,还用于从所述共享内存中获取所述距离表;并根据所述距离表确定所述图库中是否包含与所述待检索图像相似的目标图像。
- 根据权利要求28所述的装置,其特征在于,所述装置还包括第三计算单元,所述第三计算单元,用于获取所述图库中图像所对应的索引清单,所述第二计算引擎具体用于从所述共享内存中获取所述距离表;根据所述距离表确定所述图库中是否包含与所述待检索图像相似的目标图像;所述第二计算单元,还用于根据所述距离表和所述索引清单获得距离矩阵,所述距离矩阵用于指示所述索引清单中每个索引所关联的距离值;根据所述距离矩阵确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离。
- 根据权利要求28所述的装置,其特征在于,所述第二计算单元,还用于根据所述距离矩阵确定所述待检索图像所对应的查询向量与所述多个图像对应的中心向量之间的距离的排序结果确定检索结果。
- 根据权利要求28所述的装置,其特征在于,所述第二计算单元,还用于根据预设阈值确定所述检索结果,所述检索结果所包括的图像所关联的向量与所述查询向量的距离满足预设条件,所述预设条件包括图像所关联的向量与所述查询向量的距离大于或等于所述预设阈值,或者,图像所关联的向量与所述查询向量的距离小于所述预设阈值。
- 根据权利要求25至31中任一所述的装置,其特征在于,所述待检索图像来自检索请求,所述检索请求包括至少一个所述待检索图像,或者,所述检索请求包括待检索视频,所述待检索视频包括至少一帧所述待检索图像。
- 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中包括指令,当其在计算机上运行时,使得计算机执行权利要求13至24中任一所述的方法的操作步骤。
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP21837926.1A EP4167107A4 (en) | 2020-07-07 | 2021-06-12 | IMAGE RECOVERY SYSTEM, METHOD AND APPARATUS |
| US18/151,433 US20230161811A1 (en) | 2020-07-07 | 2023-01-07 | Image search system, method, and apparatus |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010645032 | 2020-07-07 | ||
| CN202010645032.2 | 2020-07-07 | ||
| CN202011621979.6A CN113971225A (zh) | 2020-07-07 | 2020-12-30 | 图像检索系统、方法和装置 |
| CN202011621979.6 | 2020-12-30 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US18/151,433 Continuation US20230161811A1 (en) | 2020-07-07 | 2023-01-07 | Image search system, method, and apparatus |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022007596A1 true WO2022007596A1 (zh) | 2022-01-13 |
Family
ID=79552737
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2021/099890 Ceased WO2022007596A1 (zh) | 2020-07-07 | 2021-06-12 | 图像检索系统、方法和装置 |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20230161811A1 (zh) |
| EP (1) | EP4167107A4 (zh) |
| CN (1) | CN115878824B (zh) |
| WO (1) | WO2022007596A1 (zh) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114818873A (zh) * | 2022-03-30 | 2022-07-29 | 浙江大华技术股份有限公司 | 一种图像合档方法、装置、终端及计算机可读存储介质 |
| CN119293278A (zh) * | 2022-12-08 | 2025-01-10 | 华为技术有限公司 | 一种图像检索的方法、系统以及装置 |
Families Citing this family (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7962482B2 (en) * | 2001-05-16 | 2011-06-14 | Pandora Media, Inc. | Methods and systems for utilizing contextual feedback to generate and modify playlists |
| US12524567B2 (en) * | 2022-02-25 | 2026-01-13 | BeeKeeperAI, Inc. | Systems and methods for dataset selection optimization in a zero-trust computing environment |
| US20250054273A1 (en) * | 2023-08-11 | 2025-02-13 | Seminal One Pty Ltd | Methods and systems for determining similarities between media |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101236565A (zh) * | 2008-02-22 | 2008-08-06 | 南京大学 | 一种基于表示转换的多义数字图像检索方法 |
| CN103778240A (zh) * | 2014-02-10 | 2014-05-07 | 中国人民解放军信息工程大学 | 基于功能磁共振成像和图像字典稀疏分解的图像检索方法 |
| CN110543578A (zh) * | 2019-08-09 | 2019-12-06 | 华为技术有限公司 | 物体识别方法及装置 |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN105912611B (zh) * | 2016-04-05 | 2019-04-26 | 中国科学技术大学 | 一种基于cnn的快速图像检索方法 |
| EP4089531B1 (en) * | 2016-12-31 | 2024-06-26 | Intel Corporation | Systems, methods, and apparatuses for heterogeneous computing |
| CN109522435B (zh) * | 2018-11-15 | 2022-05-20 | 中国银联股份有限公司 | 一种图像检索方法及装置 |
| CN110609916A (zh) * | 2019-09-25 | 2019-12-24 | 四川东方网力科技有限公司 | 视频图像数据检索方法、装置、设备和存储介质 |
-
2020
- 2020-12-30 CN CN202211628292.4A patent/CN115878824B/zh active Active
-
2021
- 2021-06-12 EP EP21837926.1A patent/EP4167107A4/en active Pending
- 2021-06-12 WO PCT/CN2021/099890 patent/WO2022007596A1/zh not_active Ceased
-
2023
- 2023-01-07 US US18/151,433 patent/US20230161811A1/en not_active Abandoned
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101236565A (zh) * | 2008-02-22 | 2008-08-06 | 南京大学 | 一种基于表示转换的多义数字图像检索方法 |
| CN103778240A (zh) * | 2014-02-10 | 2014-05-07 | 中国人民解放军信息工程大学 | 基于功能磁共振成像和图像字典稀疏分解的图像检索方法 |
| CN110543578A (zh) * | 2019-08-09 | 2019-12-06 | 华为技术有限公司 | 物体识别方法及装置 |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114818873A (zh) * | 2022-03-30 | 2022-07-29 | 浙江大华技术股份有限公司 | 一种图像合档方法、装置、终端及计算机可读存储介质 |
| CN119293278A (zh) * | 2022-12-08 | 2025-01-10 | 华为技术有限公司 | 一种图像检索的方法、系统以及装置 |
Also Published As
| Publication number | Publication date |
|---|---|
| US20230161811A1 (en) | 2023-05-25 |
| CN115878824A (zh) | 2023-03-31 |
| CN115878824B (zh) | 2023-10-20 |
| EP4167107A1 (en) | 2023-04-19 |
| EP4167107A4 (en) | 2023-11-08 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN115878824B (zh) | 图像检索系统、方法和装置 | |
| US10452691B2 (en) | Method and apparatus for generating search results using inverted index | |
| JP6721681B2 (ja) | 並列検索動作を実行する方法及び装置 | |
| CN103797509B (zh) | 图像检索装置和图像检索方法 | |
| WO2021169173A1 (zh) | 数据聚类的存储方法、装置、计算机设备及存储介质 | |
| US10826980B2 (en) | Command process load balancing system | |
| US11893012B1 (en) | Content extraction using related entity group metadata from reference objects | |
| JP2018501579A (ja) | 画像の内容の意味表現 | |
| CN113971225A (zh) | 图像检索系统、方法和装置 | |
| US10579616B2 (en) | Data search system, data search method, and program product | |
| WO2023222091A1 (zh) | 一种向量检索方法及装置 | |
| CN117251641A (zh) | 向量数据库检索方法、系统、电子设备及存储介质 | |
| CN113190551A (zh) | 特征检索系统的构建方法、特征检索方法、装置及设备 | |
| CN111651625A (zh) | 图像检索方法、装置、电子设备及存储介质 | |
| CN116561319A (zh) | 文本的聚类方法、文本的聚类装置和文本聚类系统 | |
| CN110209895B (zh) | 向量检索方法、装置和设备 | |
| CN116304253B (zh) | 数据存储方法、数据检索方法和识别相似视频的方法 | |
| CN119226428A (zh) | 检索方法及计算机设备 | |
| CN116186154A (zh) | 数据同步方法及装置 | |
| Antaris et al. | Similarity search over the cloud based on image descriptors' dimensions value cardinalities | |
| CN116363416A (zh) | 一种图像去重方法、装置、电子设备和存储介质 | |
| US20260119501A1 (en) | Retrieval method and computer device | |
| WO2025002246A1 (zh) | 检索方法及计算机设备 | |
| US20230214394A1 (en) | Data search method and apparatus, electronic device and storage medium | |
| CN118535611A (zh) | 数据搜索方法、装置及相关设备 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21837926 Country of ref document: EP Kind code of ref document: A1 |
|
| ENP | Entry into the national phase |
Ref document number: 2021837926 Country of ref document: EP Effective date: 20230113 |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |



