WO2024078032A1 - Signal processing method, apparatus, device, storage medium and computer program - Google Patents
Signal processing method, apparatus, device, storage medium and computer program
- Publication number
- WO2024078032A1 (PCT/CN2023/103954)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- signal
- event signal
- event
- frame
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/803—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of input or preprocessed data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N25/00—Circuitry of solid-state image sensors [SSIS]; Control thereof
- H04N25/47—Image sensors with pixel address output; Event-driven image sensors; Selection of pixels to be read out based on image data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/44—Event detection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/45—Cameras or camera modules comprising electronic image sensors; Control thereof for generating image signals from two or more image sensors being of different type or operating in different modes, e.g. with a CMOS sensor for moving images in combination with a charge-coupled device [CCD] for still images
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/80—Camera processing pipelines; Components thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/95—Computational photography systems, e.g. light-field imaging systems
- H04N23/951—Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N25/00—Circuitry of solid-state image sensors [SSIS]; Control thereof
- H04N25/40—Extracting pixel data from image sensors by controlling scanning circuits, e.g. by modifying the number of pixels sampled or to be sampled
- H04N25/41—Extracting pixel data from a plurality of image sensors simultaneously picking up an image, e.g. for increasing the field of view by combining the outputs of a plurality of sensors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N25/00—Circuitry of solid-state image sensors [SSIS]; Control thereof
- H04N25/50—Control of the SSIS exposure
- H04N25/57—Control of the dynamic range
- H04N25/58—Control of the dynamic range involving two or more exposures
- H04N25/581—Control of the dynamic range involving two or more exposures acquired simultaneously
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/01—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
- H04N7/0117—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving conversion of the spatial resolution of the incoming video signal
Definitions
- the present application relates to the field of image processing technology, and in particular to a signal processing method, apparatus, device, storage medium and computer program.
- the image signals obtained by using traditional image sensors to collect signals from target scenes may be blurred, overexposed, or underexposed.
- the blurred image signals are caused by the displacement of objects in the target scene relative to the image sensor during the exposure time
- the overexposed image signals are caused by the whitening of objects due to the excessive brightness of the target scene or the long exposure time
- the underexposed image signals are caused by the lack of imaging details due to the low brightness of the target scene or the short exposure time.
- DVS: dynamic vision sensor
- the present application provides a signal processing method, device, equipment, storage medium and computer program, which can improve the quality of images.
- the technical solution is as follows:
- a signal processing method is provided.
- an image signal and a first event signal of a target scene are obtained, the image signal indicates the brightness information of multiple pixels corresponding to the target scene during the exposure time, the first event signal indicates the motion information of the multiple pixels during the exposure time, and the first event signal is an event signal in a frame format or an event signal in a stream format.
- the first event signal is format-converted in the time dimension and/or space dimension to obtain a second event signal, the second event signal is an event signal in a frame format, and the resolution of the second event signal is the same as the resolution of the image signal.
- the second event signal is fused with the image signal to obtain a fused signal.
- the second event signal is obtained by converting the format of the first event signal in the time dimension and/or the space dimension. Because the second event signal is an event signal in a frame format, its format is similar to that of the image signal; it therefore has a resolution, and that resolution is the same as the resolution of the image signal. In this way, the second event signal can be better fused with the image signal. Moreover, the image signal indicates the brightness information of multiple pixels during the exposure time, and the event signal indicates the motion information of the multiple pixels during the exposure time. Fusing the event signal with the image signal therefore yields a fused signal that includes both the brightness information and the motion information of the multiple pixels. In this way, the quality of the image can be improved by a dense fused signal that has both brightness information and motion information.
- the exposure time refers to the exposure time of the image sensor.
- the format of the second event signal is any one of the event frame format, the time plane format and the voxel grid format.
- the event signal in the event frame format is a frame of event signal consisting of the accumulated value of the event polarity corresponding to each pixel within a period of time or the total number of event polarities.
- the event signal in the time plane format is a frame of event signal consisting of the maximum timestamp corresponding to the polarity event of each pixel within a period of time.
- the event signal in the voxel grid format is a frame of event signal consisting of the product of the accumulated value of the event polarity corresponding to each pixel within a period of time and the weight of the event signal in the time dimension.
- the first event signal may be an event signal in a frame format or an event signal in a stream format.
- the first event signal is converted in time dimension and/or space dimension to obtain the second event signal in different ways, which will be described in the following two situations.
- the first event signal is an event signal in a frame format
- the first event signal includes M frame event signals
- the second event signal includes N frame event signals, where M and N are both integers greater than or equal to 1, and M is greater than or equal to N.
- the M frame event signals are divided into N groups of event signals according to the frame sequence number, and each of the N groups of event signals includes at least one frame of event signals with continuous frame sequence numbers.
- Each of the N groups of event signals is format-converted in terms of time dimension and/or space dimension to obtain the N frames of event signals.
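The grouping step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_into_groups` is hypothetical, and the choice of near-equal group sizes is an assumption, since the text only requires that each group hold at least one frame with continuous frame sequence numbers.

```python
def split_into_groups(frames, n):
    """Split M frames into N groups of consecutive frames.

    Assumption: group sizes differ by at most one; the source text does not
    prescribe how many frames each group receives.
    """
    m = len(frames)
    if not 1 <= n <= m:
        raise ValueError("need 1 <= N <= M")
    base, extra = divmod(m, n)
    groups, start = [], 0
    for g in range(n):
        size = base + (1 if g < extra else 0)  # first `extra` groups get one more
        groups.append(frames[start:start + size])
        start += size
    return groups


frame_numbers = [1, 2, 3, 4, 5, 6, 7]        # M = 7
print(split_into_groups(frame_numbers, 3))   # [[1, 2, 3], [4, 5], [6, 7]]
```

Each resulting group would then be converted into one frame of the second event signal.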
- a group of event signals is selected from the N groups of event signals as the target group event signals.
- Mode 1 Performing format conversion of the time dimension and the space dimension on at least one frame of event signal included in the target group of event signals to obtain a frame of event signal after format conversion.
- traversing the at least one frame event signal in sequence according to the above method to obtain the pixel value of each pixel in the one frame event signal after format conversion is only an example.
- the pixel value of each pixel in the one frame event signal after format conversion can also be determined in other ways.
- each frame event signal in the at least one frame event signal is interpolated to obtain at least one frame event signal after interpolation processing.
- the weight of each frame event signal after interpolation processing in the at least one frame event signal after interpolation processing in the time dimension is determined.
- the target pixel value of each pixel included in each frame event signal after interpolation processing is determined. Then, the target pixel values of the pixels at the same position in the at least one frame event signal after interpolation processing are accumulated to obtain the one frame event signal after format conversion.
- after interpolation processing, the at least one frame event signal includes more pixels. In this way, when determining the target pixel value of any pixel in each frame event signal, it is not necessary to consider the pixel values of other pixels adjacent to the pixel in the spatial dimension, thereby improving the efficiency of signal processing.
- the one-frame event signal obtained after format conversion by Mode 1 above is an event signal in voxel grid format. That is, when the first event signal is an event signal in frame format, Mode 1 uses the product of the accumulated value of the event polarity corresponding to each pixel and the weight of the first event signal in the time dimension as the converted pixel value of each pixel, to obtain an event signal in voxel grid format.
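A rough sketch of the interpolate, weight and accumulate variant of Mode 1 for frame-format input is given below. Several details are assumptions made only for illustration: nearest-neighbour interpolation with an integer `factor`, and a linear temporal weight `(idx + 1) / k` for the frame at position `idx` in a group of `k` frames. The text does not specify the interpolation method or the weight formula.

```python
def upsample_nearest(frame, factor):
    """Nearest-neighbour interpolation: repeat each pixel factor x factor times."""
    return [[v for v in row for _ in range(factor)]
            for row in frame for _ in range(factor)]


def frames_to_voxel_grid(group, factor):
    """Convert one group of event frames into a single voxel-grid style frame."""
    k = len(group)
    up = [upsample_nearest(f, factor) for f in group]
    h, w = len(up[0]), len(up[0][0])
    out = [[0.0] * w for _ in range(h)]
    for idx, frame in enumerate(up):
        weight = (idx + 1) / k          # assumed temporal weight, later frames count more
        for y in range(h):
            for x in range(w):
                out[y][x] += weight * frame[y][x]   # accumulate weighted pixel values
    return out


group = [[[1, 0]], [[-1, 1]]]            # two 1x2 event frames
print(frames_to_voxel_grid(group, 2))
# [[-0.5, -0.5, 1.0, 1.0], [-0.5, -0.5, 1.0, 1.0]]
```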
- Mode 2 Performing a format conversion of a spatial dimension on at least one frame of event signals included in the target group of event signals to obtain a frame of event signals after format conversion.
- each frame event signal in the at least one frame event signal is interpolated to obtain at least one frame event signal after interpolation processing.
- the pixel values of the pixels at the same position in the at least one frame event signal after interpolation processing are accumulated to obtain a frame event signal after format conversion.
- after interpolation processing, the at least one frame event signal includes more pixels. In this way, when determining the pixel value of any pixel in each frame event signal, there is no need to consider the pixel values of other pixels adjacent to the pixel in the spatial dimension, thereby improving the efficiency of signal processing.
- the event signal of one frame obtained after the format conversion by the above method 2 is an event signal in an event frame format. That is, in the case where the first event signal is an event signal in a frame format, according to the method provided in the above method 2, the accumulated value of the event polarity corresponding to each pixel is used as the pixel value after the conversion of each pixel to obtain an event signal of one frame in an event frame format.
- the total number of event polarities corresponding to each pixel can also be used as the pixel value after the conversion of each pixel to obtain an event signal of one frame in an event frame format, and the embodiment of the present application does not limit this.
- because the event frame format is simpler than the voxel grid format, converting the first event signal into a second event signal in the event frame format can improve the efficiency of signal processing.
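Mode 2 for frame-format input (interpolate, then accumulate pixels at the same position) might look like the sketch below. It assumes nearest-neighbour interpolation and pixel values in {-1, 0, +1}; under that assumption the absolute value of a pixel counts events, which gives the "total number of event polarities" variant. Function names and the `count_only` flag are hypothetical.

```python
def upsample_nearest(frame, factor):
    """Nearest-neighbour interpolation: repeat each pixel factor x factor times."""
    return [[v for v in row for _ in range(factor)]
            for row in frame for _ in range(factor)]


def frames_to_event_frame(group, factor, count_only=False):
    """Accumulate same-position pixels of a group into one event frame.

    count_only=False: signed accumulation of event polarity.
    count_only=True:  number of events, assuming pixel values in {-1, 0, +1}.
    """
    up = [upsample_nearest(f, factor) for f in group]
    h, w = len(up[0]), len(up[0][0])
    out = [[0] * w for _ in range(h)]
    for frame in up:
        for y in range(h):
            for x in range(w):
                v = frame[y][x]
                out[y][x] += abs(v) if count_only else v
    return out


group = [[[1, -1]], [[1, 1]]]
print(frames_to_event_frame(group, 2))                    # [[2, 2, 0, 0], [2, 2, 0, 0]]
print(frames_to_event_frame(group, 2, count_only=True))   # [[2, 2, 2, 2], [2, 2, 2, 2]]
```

No temporal weight is involved, which is why this representation is cheaper to compute than the voxel grid.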
- Mode 3 Performing a format conversion of the time dimension on at least one frame of event signals included in the target group of event signals to obtain a frame of event signal after format conversion.
- if each frame event signal in the at least one frame event signal records the brightness change of the pixel, the maximum frame number among the frame numbers of the at least one frame event signal is determined as the target pixel value of the pixel. If only some frame event signals in the at least one frame event signal record the brightness change of the pixel, the maximum frame number among the frame numbers of those frame event signals is determined as the target pixel value of the pixel. If none of the at least one frame event signal records the brightness change of the pixel, the target pixel value of the pixel is determined to be 0.
- the target pixel values of each pixel constitute a frame event signal after format conversion.
- directly selecting, from the at least one frame event signal, the event signals that record the brightness change of the pixel and determining the maximum frame number among their frame numbers as the target pixel value of the pixel, or determining the target pixel value of the pixel to be 0 when the at least one frame event signal does not record the brightness change of the pixel, is only an example.
- the pixel value of each pixel in the frame event signal after format conversion can also be determined in other ways. For example, the at least one frame event signal is sorted in order from small to large according to the frame number to obtain the sorting result of the at least one frame event signal. Based on the sorting result and the pixel value of each pixel included in each frame event signal of the at least one frame event signal, the target pixel value of each pixel is determined.
- the one-frame event signal obtained after format conversion by the above method 3 is an event signal in a time plane format. That is, when the first event signal is an event signal in a frame format, according to the method provided by the above method 3, the maximum frame number of the event signal in the at least one frame event signal recording the brightness change of each pixel is used as the target pixel value of each pixel to obtain a frame event signal in a time plane format.
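The three cases of Mode 3 for frame-format input reduce to "largest frame number whose frame shows a change at this pixel, else 0". A minimal sketch follows, assuming that a non-zero pixel value means the frame records a brightness change and that frame numbers are positive; the function name and the `(frame_number, frame)` pairing are illustrative choices.

```python
def frames_to_time_plane(group):
    """group: list of (frame_number, frame) pairs with frame_number > 0.

    Returns one frame whose pixel value is the largest frame number in which
    that pixel is non-zero, or 0 if no frame records a change there.
    """
    h, w = len(group[0][1]), len(group[0][1][0])
    out = [[0] * w for _ in range(h)]
    for number, frame in group:
        for y in range(h):
            for x in range(w):
                if frame[y][x] != 0:          # assumed: non-zero means a change was recorded
                    out[y][x] = max(out[y][x], number)
    return out


group = [
    (1, [[1, 0, 0]]),
    (2, [[0, -1, 0]]),
    (3, [[1, 0, 0]]),
]
print(frames_to_time_plane(group))   # [[3, 2, 0]]
```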
- Mode 4 Based on the image signal, format conversion of the time dimension and the space dimension is performed on at least one frame event signal included in the target group event signal to obtain a frame event signal after format conversion.
- Each frame event signal in the at least one frame event signal is split according to the polarity of the event to obtain at least one frame positive event signal and at least one frame negative event signal.
- the pixel values of each pixel in each frame positive event signal in the at least one frame positive event signal and the pixel values of each pixel in each frame negative event signal in the at least one frame negative event signal are determined.
- the target pixel value of each pixel in each frame positive event signal is determined.
- the target pixel value of each pixel in each frame negative event signal is determined.
- a frame event signal after format conversion is determined.
- the first event signal is format-converted in combination with the acquired image signal, so that the converted second event signal can more accurately indicate the brightness information of the pixel at different moments within the exposure time.
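Only the first step of Mode 4, splitting each event frame by polarity, is concrete enough in this excerpt to sketch. The later steps, which combine the split frames with the image signal, are not specified here and are not shown. The function name is hypothetical, and keeping the sign in the negative frame is an assumption.

```python
def split_by_polarity(frame):
    """Split one event frame into a positive-event frame and a negative-event frame.

    Assumption: positive pixel values are positive events, negative values are
    negative events, and the negative frame keeps the original sign.
    """
    positive = [[v if v > 0 else 0 for v in row] for row in frame]
    negative = [[v if v < 0 else 0 for v in row] for row in frame]
    return positive, negative


pos, neg = split_by_polarity([[1, -1, 0], [2, 0, -3]])
print(pos)   # [[1, 0, 0], [2, 0, 0]]
print(neg)   # [[0, -1, 0], [0, 0, -3]]
```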
- the first event signal is an event signal in a stream format
- the first event signal includes event signals at H moments
- the H moments are within the exposure time
- the second event signal includes N frame event signals
- H and N are both integers greater than or equal to 1.
- the exposure time is divided into N sub-time periods, and each of the N sub-time periods includes an event signal at at least one of the H moments.
- the event signal included in each of the N sub-time periods is format-converted in the time dimension and/or space dimension to obtain the N frame event signal.
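For stream-format input, dividing the exposure time into N sub-time periods amounts to bucketing events by timestamp. The sketch below assumes events are `(x, y, t, p)` tuples and that the sub-time periods have equal length; neither the tuple layout nor equal lengths are stated in the text. It also does not enforce the requirement that every sub-time period contain at least one event.

```python
def split_into_sub_periods(events, t_start, t_end, n):
    """Bucket stream events (x, y, t, p) into n equal-length sub-time periods."""
    length = (t_end - t_start) / n
    buckets = [[] for _ in range(n)]
    for ev in events:
        t = ev[2]
        if not t_start <= t <= t_end:
            continue                           # event lies outside the exposure time
        idx = min(int((t - t_start) / length), n - 1)   # t == t_end goes to the last bucket
        buckets[idx].append(ev)
    return buckets


events = [(0, 0, 0.0, 1), (1, 0, 4.9, -1), (0, 1, 5.0, 1), (1, 1, 10.0, 1)]
for bucket in split_into_sub_periods(events, 0.0, 10.0, 2):
    print(bucket)
# [(0, 0, 0.0, 1), (1, 0, 4.9, -1)]
# [(0, 1, 5.0, 1), (1, 1, 10.0, 1)]
```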
- a sub-time period is selected from the N sub-time periods as the target sub-time period.
- Mode 1 Perform format conversion of the time dimension and the space dimension on the event signal of at least one moment included in the target sub-time period to obtain a frame of event signal after format conversion.
- the one-frame event signal obtained after format conversion by Mode 1 above is an event signal in a voxel grid format. That is, when the first event signal is an event signal in a stream format, Mode 1 uses the product of the accumulated value of the event polarity corresponding to each pixel and the weight of the first event signal in the time dimension as the converted pixel value of each pixel, to obtain a frame of event signal in voxel grid format after format conversion.
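One possible reading of stream Mode 1 is sketched below: every event contributes its polarity multiplied by a temporal weight to the pixel it maps to. The linear weight `(t - t0) / (t1 - t0)` and the integer rescaling of coordinates from sensor size to image size are both assumptions introduced for illustration; the text only states that a weight in the time dimension is used.

```python
def stream_to_voxel_grid(events, t0, t1, in_size, out_size):
    """Convert stream events (x, y, t, p) of one sub-time period [t0, t1].

    in_size / out_size are (width, height) of the event sensor and the image.
    """
    in_w, in_h = in_size
    out_w, out_h = out_size
    grid = [[0.0] * out_w for _ in range(out_h)]
    for x, y, t, p in events:
        weight = (t - t0) / (t1 - t0)              # assumed linear temporal weight
        ny = y * out_h // in_h                     # assumed nearest coordinate mapping
        nx = x * out_w // in_w
        grid[ny][nx] += weight * p
    return grid


events = [(0, 0, 2.0, 1), (0, 0, 4.0, 1), (1, 0, 1.0, -1)]
print(stream_to_voxel_grid(events, 0.0, 4.0, (2, 1), (4, 2)))
# [[1.5, 0.0, -0.25, 0.0], [0.0, 0.0, 0.0, 0.0]]
```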
- Mode 2 performing a format conversion of a spatial dimension on an event signal of at least one moment included in a target sub-time period to obtain a frame of event signal after the format conversion.
- the event polarity of each pixel included in the event signal at each moment in the at least one moment is determined.
- the event polarity of the pixels at the same position in the event signal at the at least one moment is accumulated to obtain a frame of event signal after format conversion.
- the event signal of one frame obtained after the format conversion by the above-mentioned method 2 is an event signal in an event frame format. That is, in the case where the first event signal is an event signal in a stream format, according to the method provided in the above-mentioned method 2, the accumulated value of the event polarity corresponding to each pixel is used as the pixel value after the conversion of each pixel to obtain an event signal of one frame in an event frame format after the format conversion.
- the total number of event polarities corresponding to each pixel can also be used as the pixel value after the conversion of each pixel to obtain an event signal of one frame in an event frame format after the format conversion, and the embodiment of the present application does not limit this.
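Stream Mode 2 can be illustrated with a per-pixel accumulator. The sketch assumes events are `(x, y, t, p)` tuples with polarity `p` in {+1, -1} and coordinates already at the target resolution; the function name and the `count_only` flag are hypothetical.

```python
def stream_to_event_frame(events, width, height, count_only=False):
    """Accumulate event polarity (or event count) per pixel into one frame."""
    frame = [[0] * width for _ in range(height)]
    for x, y, _t, p in events:
        frame[y][x] += 1 if count_only else p
    return frame


events = [(0, 0, 1, 1), (0, 0, 2, -1), (1, 0, 3, 1), (1, 0, 4, 1)]
print(stream_to_event_frame(events, 2, 1))                    # [[0, 2]]
print(stream_to_event_frame(events, 2, 1, count_only=True))   # [[2, 2]]
```

The first pixel shows why the two variants differ: opposite polarities cancel in the accumulated value but still count as two events.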
- Mode 3 performing format conversion in the time dimension on the event signal of at least one moment included in the target sub-time period to obtain a frame of event signal after format conversion.
- based on the spatial position coordinates of the pixel in the event signal of each moment, the spatial position coordinates of the pixel after transformation are determined from the correspondence between the spatial position coordinates before transformation and the spatial position coordinates after transformation. If the event signal of each moment in the event signal of at least one moment records the brightness change of the pixel, the maximum timestamp among the timestamps of the event signal of at least one moment is determined as the pixel value at the spatial position coordinates after transformation of the pixel.
- if only the event signals of some moments in the event signal of at least one moment record the brightness change of the pixel, the maximum timestamp among the timestamps of the event signals of those moments is determined as the pixel value at the spatial position coordinates after transformation of the pixel.
- the pixel values at the spatial position coordinates after transformation of each pixel constitute a frame of event signal after format transformation.
- for any pixel in the event signal of the at least one moment, directly selecting the event signals that record the brightness change of the pixel from the event signal of the at least one moment according to the above method, and determining the maximum timestamp among the timestamps of the selected event signals as the pixel value at the spatial position coordinates of the pixel after conversion, is only an example.
- the pixel values of each pixel in the event signal of this frame after the format conversion can also be determined in other ways.
- the event signals of the at least one moment are sorted in the order of the timestamps from small to large to obtain the sorting result of the event signal of the at least one moment.
- the spatial position coordinates of each pixel in the event signal of the at least one moment are determined. Based on the sorting result and the timestamp of the event signal of each moment in the at least one moment, the pixel value of each pixel after the format conversion is determined.
- the one-frame event signal obtained after format conversion by the above method 3 is an event signal in a time plane format. That is, when the first event signal is an event signal in a stream format, according to the method provided by the above method 3, the timestamp corresponding to the last polarity event of each pixel is used as the target pixel value corresponding to each pixel to obtain a one-frame event signal in a time plane format after format conversion.
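Stream Mode 3 keeps, for each pixel, the timestamp of its latest event at the transformed coordinates. In the sketch below the correspondence between coordinates before and after transformation is modelled as a plain dictionary `coord_map`, which is an assumption; the text does not say how the correspondence is stored. Timestamps are assumed to be positive so that 0 can mean "no event".

```python
def stream_to_time_plane(events, coord_map, width, height):
    """Build a time-plane frame from stream events (x, y, t, p).

    coord_map: hypothetical lookup {(x, y) before: (x, y) after transformation}.
    """
    plane = [[0] * width for _ in range(height)]
    for x, y, t, _p in events:
        nx, ny = coord_map[(x, y)]
        plane[ny][nx] = max(plane[ny][nx], t)   # keep the latest timestamp
    return plane


coord_map = {(0, 0): (0, 0), (1, 0): (2, 0)}
events = [(0, 0, 5, 1), (0, 0, 9, -1), (1, 0, 7, 1)]
print(stream_to_time_plane(events, coord_map, 3, 1))   # [[9, 0, 7]]
```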
- Mode 4 Based on the image signal, format conversion of the event signal of at least one moment included in the target sub-time period is performed in terms of time dimension and space dimension to obtain a frame of event signal after format conversion.
- a frame of event signal after format conversion is determined based on the negative polarity value of each pixel included in the event signal at each moment in the at least one moment, and on the image signal.
- the first event signal is format-converted in combination with the acquired image signal, so that the converted second event signal can more accurately indicate the brightness information of the pixel at different moments within the exposure time.
- a mask area in a frame of event signals is determined, the mask area indicates an area where pixels with motion information are located in the corresponding frame of event signals, pixel values of each pixel within the mask area are fused with pixel values of corresponding pixels in the image signal, and pixel values of each pixel outside the mask area are set to pixel values of corresponding pixels in the image signal, so as to obtain a frame of fused signal.
- each pixel outside the mask area is shielded, and there is no need to fuse the event signal and the image signal for each pixel outside the mask area.
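The masked fusion described above could be sketched as follows. Two points are assumptions made for illustration only: the mask is taken to be the set of pixels with a non-zero event value, and the fusion inside the mask is a weighted addition `image + alpha * event`. The excerpt does not define the fusion operator or how the mask area is computed.

```python
def fuse_with_mask(image, event_frame, alpha=0.5):
    """Fuse one event frame with the image signal, only inside the mask area."""
    fused = []
    for img_row, ev_row in zip(image, event_frame):
        row = []
        for img_v, ev_v in zip(img_row, ev_row):
            if ev_v != 0:                       # assumed mask: pixel carries motion information
                row.append(img_v + alpha * ev_v)
            else:                               # outside the mask: copy the image pixel
                row.append(img_v)
        fused.append(row)
    return fused


image = [[10, 20], [30, 40]]
event_frame = [[0, 2], [-4, 0]]
print(fuse_with_mask(image, event_frame))   # [[10, 21.0], [28.0, 40]]
```

Pixels outside the mask skip the fusion arithmetic entirely, which is the saving the text refers to.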
- the target scene is an autonomous driving scene
- the fused signal is input into a neural network model to obtain scene perception information of the autonomous driving scene.
- a cloud server receives an image signal and a first event signal of a target scene sent by a signal processing device, wherein the image signal indicates the brightness information of multiple pixels corresponding to the target scene during the exposure time, and the first event signal indicates the motion information of the multiple pixels during the exposure time, and the first event signal is an event signal in a frame format or an event signal in a stream format.
- the cloud server performs format conversion of the first event signal in a time dimension and/or a space dimension to obtain a second event signal, wherein the second event signal is an event signal in a frame format, and the resolution of the second event signal is the same as the resolution of the image signal.
- the second event signal is fused with the image signal to obtain a fused signal.
- the cloud server sends the fused signal to the signal processing device.
- the first event signal may be an event signal in a frame format or an event signal in a stream format.
- the first event signal is converted in time dimension and/or space dimension to obtain the second event signal in different ways, which will be described in the following two situations.
- the first event signal is an event signal in a frame format
- the first event signal includes M frame event signals
- the second event signal includes N frame event signals, where M and N are both integers greater than or equal to 1, and M is greater than or equal to N.
- the M frame event signals are divided into N groups of event signals according to the frame sequence number, and each group of event signals in the N groups of event signals includes at least one frame event signal with a continuous frame sequence number.
- Each group of event signals in the N groups of event signals is format-converted in the time dimension and/or space dimension to obtain the N frame event signals.
- the first event signal is an event signal in a stream format
- the first event signal includes event signals at H moments
- the H moments are within the exposure time
- the second event signal includes N frame event signals
- H and N are both integers greater than or equal to 1.
- the exposure time is divided into N sub-time periods, and each of the N sub-time periods includes an event signal at at least one moment in the H moments.
- the event signal included in each of the N sub-time periods is format-converted in the time dimension and/or space dimension to obtain the N frame event signal.
- a mask area in a frame of event signals is determined, the mask area indicates an area where pixels with motion information are located in the corresponding frame of event signals, pixel values of each pixel within the mask area are fused with pixel values of corresponding pixels in the image signal, and pixel values of each pixel outside the mask area are set to pixel values of corresponding pixels in the image signal, so as to obtain a frame of fused signal.
- a signal processing device is provided, wherein the signal processing device has the function of implementing the signal processing method in the first aspect.
- the signal processing device includes at least one module, and the at least one module is used to implement the signal processing method provided in the first aspect.
- a cloud server comprising a communication interface and one or more processors;
- the communication interface is used to receive an image signal of a target scene and a first event signal sent by a signal processing device, wherein the image signal indicates brightness information of a plurality of pixels corresponding to the target scene within an exposure time, and the first event signal indicates motion information of the plurality of pixels within the exposure time, and the first event signal is an event signal in a frame format or an event signal in a stream format;
- the one or more processors are used to perform format conversion of the first event signal in a time dimension and/or a space dimension to obtain a second event signal, where the second event signal is an event signal in a frame format, and a resolution of the second event signal is the same as a resolution of the image signal;
- the one or more processors are used to fuse the second event signal with the image signal to obtain a fused signal
- the one or more processors are used to send the fused signal to the signal processing device through the communication interface.
- a signal processing system which includes a signal processing device and a cloud server, wherein the signal processing device is used to send an image signal and a first event signal of a target scene to the cloud server, and the cloud server is used to implement the signal processing method provided in the second aspect above.
- a signal processing device comprising a processor and a memory, the memory being used to store a computer program for executing the signal processing method provided in the first aspect.
- the processor is configured to execute the computer program stored in the memory to implement the signal processing method described in the first aspect.
- the signal processing device may further include a communication bus, and the communication bus is used to establish a connection between the processor and the memory.
- a computer-readable storage medium is provided, and the storage medium stores instructions.
- when the instructions are executed on a computer, the computer executes the steps of the signal processing method described in the first aspect or the second aspect.
- a computer program product comprising instructions is provided, and when the instructions are executed on a computer, the computer executes the steps of the signal processing method described in the first aspect or the second aspect.
- a computer program is provided, and when the computer program is executed on a computer, the computer executes the steps of the signal processing method described in the first aspect or the second aspect.
- FIG. 1 is a schematic diagram of a first event signal provided in an embodiment of the present application.
- FIG. 2 is a schematic diagram of an application scenario provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of another application scenario provided by an embodiment of the present application.
- FIG. 4 is a schematic diagram of another application scenario provided by an embodiment of the present application.
- FIG. 5 is a schematic diagram of the architecture of a signal processing system provided in an embodiment of the present application.
- FIG. 6 is a schematic diagram of a signal processing device provided in an embodiment of the present application.
- FIG. 7 is a flow chart of a signal processing method provided in an embodiment of the present application.
- FIG. 8 is a schematic diagram of an event signal splitting method provided in an embodiment of the present application.
- FIG. 9 is a flow chart of another signal processing method provided in an embodiment of the present application.
- FIG. 10 is a schematic diagram of the structure of a signal processing device provided in an embodiment of the present application.
- FIG. 11 is a schematic diagram of the structure of a computer device provided in an embodiment of the present application.
- FIG. 12 is a schematic diagram of the structure of a terminal device provided in an embodiment of the present application.
- FIG. 13 is a schematic diagram of the structure of another terminal device provided in an embodiment of the present application.
- Image sensor: a sensor that collects signals of the target scene to obtain the image signal of the target scene. That is, the image sensor converts the optical signal into an electrical signal, which can be a digital signal.
- CMOS image sensors include two shutter modes: rolling shutter and global shutter.
- DVS: a dynamic vision sensor (DVS) independently perceives the brightness change of each of the multiple pixels corresponding to the target scene and, for each pixel whose brightness change exceeds the change threshold, outputs the spatial position coordinates of the pixel, the current timestamp, and the brightness change information, thereby obtaining the event signal of the target scene.
- Event signal in stream format: a set of four-dimensional arrays of the form (x, y, t, p).
- x and y are positive integers representing the spatial position coordinates of the pixel;
- t is a positive real number representing the timestamp of the brightness change of the pixel;
- p represents the polarity of the brightness change.
- Event signal in frame format: a two-dimensional array obtained by projecting the positions of the pixels whose brightness changes within a period of time onto the same two-dimensional plane.
- the value at any pixel position in the two-dimensional array can be expressed as E(x, y, Mi).
- x and y are positive integers representing the spatial position coordinates of the pixel;
- Mi is a positive integer representing the frame number of the event signal.
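- E(x, y, Mi) = -1 indicates that a negative polarity event occurs at pixel (x, y) in the event signal with frame number Mi;
- E(x, y, Mi) = 1 indicates that a positive polarity event occurs at pixel (x, y) in the event signal with frame number Mi;
- E(x, y, Mi) = 0 indicates that no event occurs at pixel (x, y) in the event signal with frame number Mi.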
- Figure 1 is a schematic diagram of a first event signal provided by an embodiment of the present application.
- the left figure is an event signal in a stream format over a period of time
- the right figure is an event signal in a frame format.
- a black dot represents a negative polarity event occurring in a pixel
- a white dot represents a positive polarity event occurring in a pixel
- a white triangle represents no event occurring in a pixel.
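The relationship between the two formats can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of the application: the function name `stream_to_frame`, the use of 0-based pixel indices, the half-open time window, and the rule that a later event at the same pixel overwrites an earlier one are all hypothetical choices. The value 0 is used for pixels where no event occurs.

```python
def stream_to_frame(events, width, height, t_start, t_end):
    """Project stream-format events (x, y, t, p) onto one 2D frame.

    Assumptions (not from the source): x and y are 0-based indices, events
    are sorted by timestamp, only events with t_start <= t < t_end are kept,
    and a later event at a pixel overwrites an earlier one.
    """
    frame = [[0] * width for _ in range(height)]  # 0 means "no event"
    for x, y, t, p in events:
        if t_start <= t < t_end:
            frame[y][x] = p  # p is +1 (positive polarity) or -1 (negative)
    return frame


events = [(0, 0, 0.1, 1), (1, 0, 0.2, -1), (0, 0, 0.3, -1), (1, 1, 0.9, 1)]
print(stream_to_frame(events, 2, 2, 0.0, 0.5))  # [[-1, -1], [0, 0]]
```

Choosing a shorter or longer window changes how many events land in each frame, which is why one stream can yield a different number of frames.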
- Coupled sensor: a new type of sensor formed by coupling a DVS with an image sensor. Collecting signals from the target scene through the coupled sensor yields both the image signal and the event signal of the target scene.
- the signal processing method provided in the embodiments of the present application can be applied to a variety of scenarios, such as autonomous driving, terminal device imaging, and target object monitoring.
- Figure 2 is a schematic diagram of an application scenario provided by an embodiment of the present application.
- when the on-board perception device includes a DVS and an image sensor, the on-board perception device obtains an image signal of the target scene through the image sensor and a first event signal of the target scene through the DVS. The first event signal is then format-converted according to the method provided in the embodiment of the present application to obtain a second event signal, and the second event signal is fused with the image signal to obtain a fused signal. Finally, the fused signal is input into a neural network model to obtain scene perception information, thereby perceiving road conditions, vehicles, pedestrians, environmental changes, and other information in the autonomous driving scenario.
- FIG. 3 is a schematic diagram of another application scenario provided by an embodiment of the present application.
- the terminal device may be, for example, a personal computer (PC), a mobile phone, a smartphone, a personal digital assistant (PDA), a pocket PC (PPC), or a tablet computer.
- the terminal device simultaneously obtains the image signal and the first event signal of the target scene through the coupled sensor, performs format conversion on the first event signal according to the method provided in the embodiment of the present application to obtain the second event signal, and then fuses the second event signal with the image signal to obtain a fused signal.
- the fused signal is input into the image processor to obtain a real-time picture of the scene.
- Figure 4 is a schematic diagram of another application scenario provided by an embodiment of the present application.
- the image processing device obtains a fused signal according to the method provided by an embodiment of the present application; this fused signal corresponds to the current exposure time. Then, video interpolation processing is performed based on the fused signal corresponding to the current exposure time and the fused signal corresponding to the previous exposure time, to reduce image delay and improve image accuracy.
- FIG. 5 is a schematic diagram of the architecture of a signal processing system provided in an embodiment of the present application.
- the system includes an image sensor 501, a DVS 502 and a signal processing device 503.
- the image sensor 501 and the DVS 502 form a coupled sensor.
- the image sensor 501 and the DVS 502 can be devices independent of the signal processing device 503, that is, the image sensor 501, the DVS 502 and the signal processing device 503 are three separate devices.
- alternatively, the image sensor 501 and the DVS 502 are integrated into the signal processing device 503, that is, the image sensor 501, the DVS 502 and the signal processing device 503 form a single device. The embodiment of the present application does not limit this.
- the image sensor 501 is used to output the image signal of the target scene.
- the DVS 502 is used to output the first event signal of the target scene.
- FIG. 6 is a schematic diagram of a signal processing device provided in an embodiment of the present application.
- the signal processing device includes an input module, a conversion module, a fusion module and an output module.
- the input module is used to input the image signal and the first event signal of the target scene, and the first event signal is an event signal in a stream format or an event signal in a frame format.
- the conversion module is used to convert the format of the first event signal to obtain a second event signal.
- the fusion module is used to fuse the second event signal with the image signal to obtain a fused signal.
- the output module is used to output the fused signal.
- the above-mentioned input module, conversion module, fusion module and output module can all be deployed on the signal processing device.
- the input module can also be deployed on the signal processing device, and the conversion module, fusion module and output module can all be deployed on the cloud server.
- the input module, conversion module, fusion module and output module are preferentially deployed on the chip for algorithm hardening.
- related software can also be developed on the operating system.
- FIG. 7 is a flow chart of a signal processing method provided in an embodiment of the present application.
- the execution subject of the signal processing method provided in the embodiment of the present application is a signal processing device, which may be any one of the on-board perception device, the terminal device, and the image processing device mentioned above. Referring to FIG. 7, the method includes the following steps.
- Step 701: acquire an image signal and a first event signal of a target scene, wherein the image signal indicates brightness information of a plurality of pixels corresponding to the target scene within an exposure time, and the first event signal indicates motion information of the plurality of pixels within the exposure time.
- the first event signal is an event signal in a frame format or an event signal in a stream format.
- the image signal of the target scene is acquired through the image sensor, and the first event signal of the target scene is acquired through the DVS.
- alternatively, the image signal and the first event signal of the target scene are acquired through a coupled sensor.
- the image signal and the first event signal of the target scene can also be acquired by other means, which is not limited in the embodiments of the present application.
- the exposure time refers to the exposure time of the image sensor.
- the brightness information of each pixel within the exposure time is processed according to a relevant algorithm to obtain a frame of image signal of the target scene.
- the first event signal may take either of two forms: an event signal in a stream format or an event signal in a frame format.
- the event signal in a stream format includes event signals at H moments within the exposure time. For an event signal at any of the H moments, the pixel value of each pixel in the event signal at that moment is the event polarity corresponding to each pixel at that moment.
- the event signal in a frame format includes M frame event signals within the exposure time. For any frame event signal in the M frame event signals, the pixel value of each pixel in the frame event signal is the event polarity corresponding to each pixel within a period of time.
- Step 702: perform format conversion in the time dimension and/or the space dimension on the first event signal to obtain a second event signal, where the second event signal is an event signal in a frame format, and the resolution of the second event signal is the same as the resolution of the image signal.
- the format of the second event signal is any one of an event frame format, a time plane format and a voxel grid format.
- the event signal in the event frame format is a frame of event signal consisting of the accumulated value of the event polarity corresponding to each pixel within a period of time or the total number of event polarities.
- the event signal in the time plane format is a frame of event signal consisting of the maximum timestamp corresponding to the polarity event of each pixel within a period of time.
- the event signal in the voxel grid format is a frame of event signal consisting of the product of the accumulated value of the event polarity corresponding to each pixel within a period of time and the weight of the event signal in the time dimension.
- as described above, the first event signal may be an event signal in a frame format or an event signal in a stream format.
- the way in which format conversion in the time dimension and/or the space dimension is performed on the first event signal to obtain the second event signal differs between these two forms, so the two cases are described separately below.
- in the first case, the first event signal is an event signal in a frame format.
- in this case, the first event signal includes M frame event signals and the second event signal includes N frame event signals, where M and N are both integers greater than or equal to 1, and M is greater than or equal to N.
- the M frame event signals are divided into N groups of event signals according to the frame sequence number, and each group of event signals in the N groups of event signals includes at least one frame event signal with a continuous frame sequence number.
- Each group of event signals in the N groups of event signals is format-converted in the time dimension and/or space dimension to obtain the N frame event signals.
- the M frame event signals are divided into N groups of event signals according to the following formula (1).
- $$\Delta t = \mathrm{INT}(M/N) \tag{1}$$
- Δt represents the offset in the time dimension, that is, every Δt frame event signals with consecutive frame numbers among the M frame event signals are regarded as one group of event signals;
- N represents the total number of event signals after conversion, which is usually set in advance;
- INT(·) represents the floor function, that is, rounding down to the nearest integer.
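As a minimal sketch of grouping by formula (1), the following Python function splits M frame numbers into N groups of INT(M/N) consecutive frames. The function name `group_frames` is hypothetical, and the handling of leftover frames when M is not divisible by N (appending them to the last group) is an assumption, since the text does not specify it.

```python
def group_frames(frame_numbers, n_groups):
    """Split M consecutive frame numbers into N groups of INT(M/N) frames.

    Assumption (not from the source): when M is not divisible by N, the
    leftover frames are appended to the last group.
    """
    m = len(frame_numbers)
    if not 1 <= n_groups <= m:
        raise ValueError("N must satisfy 1 <= N <= M")
    dt = m // n_groups  # formula (1): offset in the time dimension
    groups = [frame_numbers[i * dt:(i + 1) * dt] for i in range(n_groups)]
    groups[-1].extend(frame_numbers[n_groups * dt:])  # leftover frames
    return groups


print(group_frames(list(range(1, 11)), 3))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]
```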
- the M frame event signals can also be divided into N groups of event signals in other ways. For example, the total number of converted event signals (i.e., N) is set in advance, so the frame numbers of the converted event signals are known, and the signal processing device stores a correspondence between the frame number of each converted event signal and the offset Δt in the time dimension.
- in this case, the offset Δt in the time dimension corresponding to each frame number is obtained from this correspondence, and then, according to the offset Δt corresponding to each frame number, Δt frame event signals with consecutive frame numbers among the M frame event signals are taken as one group of event signals.
- for example, among 20 acquired frame event signals, the 4 frame event signals with frame numbers 1-4 are divided into one group of event signals;
- the 8 frame event signals with frame numbers 5-12 are divided into one group of event signals;
- the 6 frame event signals with frame numbers 13-18 are divided into one group of event signals;
- the 2 frame event signals with frame numbers 19-20 are divided into one group of event signals, thereby obtaining 4 groups of event signals.
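The grouping in this example can be reproduced with a short Python sketch. The function name `group_by_offsets` and the representation of the stored correspondence as a plain list of per-group offsets are assumptions made for illustration.

```python
def group_by_offsets(frame_numbers, offsets):
    """Split consecutive frame numbers into groups of the given sizes.

    `offsets` plays the role of the stored correspondence: entry i is the
    time-dimension offset (number of frames) for converted frame i.
    """
    if sum(offsets) != len(frame_numbers):
        raise ValueError("offsets must cover every frame exactly once")
    groups, start = [], 0
    for dt in offsets:
        groups.append(frame_numbers[start:start + dt])
        start += dt
    return groups


groups = group_by_offsets(list(range(1, 21)), [4, 8, 6, 2])
print([(g[0], g[-1]) for g in groups])  # [(1, 4), (5, 12), (13, 18), (19, 20)]
```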
- a group of event signals is selected from the N groups of event signals as the target group event signals.
- Mode 1: perform format conversion in the time dimension and the space dimension on the at least one frame event signal included in the target group of event signals to obtain one frame event signal after format conversion.
- for any pixel in any frame event signal of the at least one frame event signal, the pixel value of the pixel in that frame event signal and the pixel values of its adjacent pixels in that frame event signal are accumulated to obtain the pixel value of the pixel in the spatial dimension of that frame event signal.
- the weight of that frame event signal in the time dimension is multiplied by the pixel value of the pixel in the spatial dimension of that frame event signal to obtain the target pixel value of the pixel in that frame event signal, and the at least one frame event signal is traversed in sequence to obtain the target pixel value of the pixel in each frame event signal.
- the target pixel values of the pixel in the at least one frame event signal are accumulated to obtain the pixel value of the pixel in the one frame event signal after format conversion.
- in this way, the pixel value of each pixel in the one frame event signal after format conversion can be determined according to the above steps, thereby obtaining the one frame event signal after format conversion.
- the pixel value of any pixel in a frame of event signal after format conversion is determined according to the following formula (2).
- K(u, v, Ni ) represents the pixel value of the pixel (u, v) in the frame event signal with the frame number Ni obtained after the format conversion
- E(x, y, Mi) represents the pixel value of the pixel (x, y) in the event signal with the frame number Mi in the at least one frame event signal;
- Δx represents the offset along the X-axis in the spatial dimension;
- Δy represents the offset along the Y-axis in the spatial dimension, which is usually set in advance.
- traversing the at least one frame event signal in sequence according to the above method to obtain the pixel value of each pixel in the one frame event signal after format conversion is only an example.
- the pixel value of each pixel in the one frame event signal after format conversion can also be determined in other ways.
- each frame event signal in the at least one frame event signal is interpolated to obtain at least one frame event signal after interpolation processing.
- the weight of each frame event signal after interpolation processing in the at least one frame event signal after interpolation processing in the time dimension is determined.
- the target pixel value of each pixel included in each frame event signal after interpolation processing is determined. Then, the target pixel values of the pixels at the same position in the at least one frame event signal after interpolation processing are accumulated to obtain a frame event signal after format conversion.
- specifically, each frame event signal is interpolated based on the pixel values of every two adjacent pixels in the spatial dimension of that frame event signal to obtain the interpolated event signal. For any pixel in any interpolated frame event signal, the pixel value of the pixel in that interpolated frame event signal is multiplied by the weight of that interpolated frame event signal in the time dimension to obtain the target pixel value of the pixel in that interpolated frame event signal, and the at least one interpolated frame event signal is traversed in sequence to obtain the target pixel value of the pixel in each interpolated frame event signal.
- the target pixel values of the pixel in the at least one interpolated frame event signal are accumulated to obtain the pixel value of the pixel in the one frame event signal after format conversion. That is, interpolation makes the at least one frame event signal include more pixels. In this way, when determining the target pixel value of any pixel in each frame event signal, there is no need to consider the pixel values of other pixels adjacent to the pixel in the spatial dimension, thereby improving the efficiency of signal processing.
- the event signal can be interpolated by a nearest neighbor interpolation method or a bilinear interpolation method, which is not limited in the embodiments of the present application.
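For illustration, nearest neighbor interpolation of one frame event signal to a larger resolution can be sketched as follows. The function name `upsample_nearest` and the index mapping (integer division of the output index scaled by the input size) are assumptions; this is one common form of nearest neighbor interpolation, not necessarily the one used in the application.

```python
def upsample_nearest(frame, out_h, out_w):
    """Resize a 2D frame to out_h x out_w by nearest neighbor interpolation.

    Each output pixel copies the input pixel whose index is the output index
    scaled by in_size / out_size and rounded down (an assumed convention).
    """
    in_h, in_w = len(frame), len(frame[0])
    return [
        [frame[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


for row in upsample_nearest([[1, 0], [0, -1]], 4, 4):
    print(row)
```

Because nearest neighbor interpolation only copies values, the interpolated frame still contains only the polarity values of the original frame; bilinear interpolation would instead produce fractional values between them.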
- the one frame event signal obtained after format conversion by the above Mode 1 is an event signal in a voxel grid format. That is, when the first event signal is an event signal in a frame format, according to the method provided by the above Mode 1, the product of the accumulated value of the event polarity corresponding to each pixel and the weight of the first event signal in the time dimension is used as the converted pixel value of each pixel to obtain one frame event signal in a voxel grid format.
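The Mode 1 steps for frame-format input (spatial accumulation, multiplication by a time-dimension weight, accumulation over the group) can be sketched in Python. Several points are assumptions rather than the application's definition: the names `block_sum` and `to_voxel_grid` are hypothetical, "adjacent pixels" is interpreted as a non-overlapping block of `dx` by `dy` pixels, and the per-frame weights are passed in by the caller because this passage does not state how they are computed.

```python
def block_sum(frame, dx, dy):
    """Accumulate each non-overlapping dx-by-dy block into one output pixel."""
    h, w = len(frame), len(frame[0])
    return [
        [
            sum(frame[y][x]
                for y in range(v * dy, (v + 1) * dy)
                for x in range(u * dx, (u + 1) * dx))
            for u in range(w // dx)
        ]
        for v in range(h // dy)
    ]


def to_voxel_grid(frames, weights, dx, dy):
    """Weighted accumulation of spatially accumulated frames (Mode 1 sketch)."""
    out = None
    for frame, weight in zip(frames, weights):
        spatial = block_sum(frame, dx, dy)          # space dimension
        if out is None:
            out = [[0.0] * len(spatial[0]) for _ in spatial]
        for v, row in enumerate(spatial):
            for u, value in enumerate(row):
                out[v][u] += weight * value         # time dimension
    return out


frames = [[[1, 0], [0, 1]], [[-1, -1], [0, 0]]]
print(to_voxel_grid(frames, weights=[0.5, 1.0], dx=2, dy=2))  # [[-1.0]]
```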
- Mode 2: perform format conversion in the space dimension on the at least one frame event signal included in the target group of event signals to obtain one frame event signal after format conversion.
- the pixel value of the pixel in the frame event signal is accumulated with the pixel value of the adjacent pixel in the frame event signal to obtain the pixel value of the pixel in the spatial dimension in the frame event signal, and the at least one frame event signal is traversed in sequence to obtain the pixel value of the pixel in the spatial dimension in each frame event signal. Then, the pixel value of the pixel in the spatial dimension in the at least one frame event signal is accumulated to obtain the pixel value of the pixel in the frame event signal after format conversion. In this way, for each pixel in the at least one frame event signal, the pixel value of each pixel in the frame event signal after format conversion can be determined according to the above steps to obtain the frame event signal after format conversion.
- the pixel value of any pixel in a frame of event signal after format conversion is determined according to the following formula (3).
- K(u, v, Ni ) represents the pixel value of the pixel (u, v) in the frame event signal with the frame number Ni obtained after the format conversion.
- each frame event signal in the at least one frame event signal is interpolated to obtain at least one frame event signal after interpolation processing.
- the pixel values of the pixels at the same position in the at least one frame event signal after interpolation processing are accumulated to obtain a frame event signal after format conversion.
- interpolation processing is performed on the frame event signal based on the pixel values of every two adjacent pixels in the frame event signal in the spatial dimension to obtain the interpolated event signal.
- the pixel values of the pixel in the at least one frame event signal after interpolation processing are accumulated to obtain the pixel value of the pixel in the frame event signal after format conversion. That is, by interpolating the at least one frame event signal, the at least one frame event signal includes more pixels. In this way, there is no need to determine the pixel value of any pixel in each frame event signal in the spatial dimension, that is, there is no need to consider the pixel values of other pixels adjacent to the pixel in the spatial dimension, thereby improving the efficiency of signal processing.
- the one frame event signal obtained after format conversion by the above Mode 2 is an event signal in an event frame format. That is, when the first event signal is an event signal in a frame format, according to the method provided by the above Mode 2, the accumulated value of the event polarity corresponding to each pixel is used as the converted pixel value of each pixel to obtain one frame event signal in an event frame format.
- the total number of event polarities corresponding to each pixel can also be used as the converted pixel value of each pixel to obtain one frame event signal in an event frame format, which is not limited in the embodiment of the present application.
- because the event frame format is simpler than the voxel grid format, converting the first event signal into a second event signal in the event frame format can improve the efficiency of signal processing.
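Both variants of the event frame format (accumulated polarity and total number of events) can be sketched in one Python function. The name `to_event_frame`, the `count_events` flag, the block interpretation of "adjacent pixels" (`dx` by `dy` non-overlapping blocks), and counting events as the absolute value of a pixel value in {-1, 0, 1} are assumptions made for illustration.

```python
def to_event_frame(frames, dx, dy, count_events=False):
    """Accumulate a group of frames into one event frame (Mode 2 sketch).

    With count_events=False each output pixel is the sum of polarities in its
    dx-by-dy block over all frames; with count_events=True it is the number
    of events, assuming pixel values are -1, 0 or 1.
    """
    h, w = len(frames[0]), len(frames[0][0])
    out = [[0] * (w // dx) for _ in range(h // dy)]
    for frame in frames:
        for y in range((h // dy) * dy):
            for x in range((w // dx) * dx):
                e = frame[y][x]
                out[y // dy][x // dx] += abs(e) if count_events else e
    return out


frames = [[[1, -1], [0, 1]], [[1, 0], [-1, 0]]]
print(to_event_frame(frames, dx=2, dy=1))                     # [[1], [0]]
print(to_event_frame(frames, dx=2, dy=1, count_events=True))  # [[3], [2]]
```

The two outputs differ because opposite polarities cancel in the accumulated value but not in the count.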
- Mode 3: perform format conversion in the time dimension on the at least one frame event signal included in the target group of event signals to obtain one frame event signal after format conversion.
- for any pixel, if each frame event signal in the at least one frame event signal records a brightness change of the pixel, the maximum frame number among the frame numbers of the at least one frame event signal is determined as the target pixel value of the pixel. If only some frame event signals in the at least one frame event signal record a brightness change of the pixel, the maximum frame number among the frame numbers of those frame event signals is determined as the target pixel value of the pixel. If none of the at least one frame event signal records a brightness change of the pixel, the target pixel value of the pixel is determined to be 0.
- the target pixel values of each pixel constitute a frame event signal after format conversion.
- the approach described above, in which the event signals recording a brightness change of the pixel are selected directly from the at least one frame event signal and the maximum frame number among them is determined as the target pixel value of the pixel (or the target pixel value is determined to be 0 when no frame event signal records a brightness change of the pixel), is only an example.
- the pixel value of each pixel in the frame event signal after format conversion can also be determined in other ways. For example, the at least one frame event signal is sorted in order from small to large according to the frame number to obtain the sorting result of the at least one frame event signal. Based on the sorting result and the pixel value of each pixel included in each frame event signal in the at least one frame event signal, the target pixel value of each pixel is determined.
- for any pixel in the first frame event signal in the sorting result, determine whether the pixel value of the pixel in the first frame event signal is 0. If the pixel value is not 0, the frame number of the first frame event signal is determined as the target pixel value of the pixel. If the pixel value is 0, the target pixel value of the pixel is also determined to be 0. In this way, the target pixel value of each pixel in the first frame event signal can be determined according to the above steps. For any pixel in the second frame event signal in the sorting result, determine whether the pixel value of the pixel in the second frame event signal is 0. If the pixel value is not 0, the target pixel value of the pixel is updated to the frame number of the second frame event signal. If the pixel value is 0, the target pixel value of the pixel remains unchanged.
- in this way, the target pixel value of each pixel in the second frame event signal can be determined according to the above steps. The at least one frame event signal is then traversed in sequence in the same way to obtain one frame event signal after format conversion.
- the target group event signal includes three frames of event signals with frame numbers 8, 9, and 10.
- after traversing the event signal with frame number 8, the target pixel value of pixel (1, 1) is 8; after traversing the event signal with frame number 9, the target pixel value of pixel (1, 1) remains unchanged at 8; after traversing the event signal with frame number 10, the target pixel value of pixel (1, 1) is 10.
- this frame event signal is obtained by performing format conversion in the time dimension on the three frame event signals with frame numbers 8, 9, and 10 in the target group event signal.
- the one frame event signal obtained after format conversion by the above Mode 3 is an event signal in a time plane format. That is, when the first event signal is an event signal in a frame format, according to the method provided by the above Mode 3, for each pixel, the maximum frame number among the frame event signals that record a brightness change of the pixel is used as the target pixel value of the pixel to obtain one frame event signal in a time plane format.
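The sorted traversal described for Mode 3 can be sketched as follows. The function name `to_time_plane` and the choice to pass the group as a dictionary from frame number to 2D frame are assumptions; a pixel value of 0 is taken to mean that no brightness change is recorded. The example reuses frame numbers 8, 9 and 10 with hypothetical pixel values chosen so that the first pixel has events in frames 8 and 10 only.

```python
def to_time_plane(frames_by_number):
    """Build a time plane: each pixel holds the largest frame number in which
    it recorded an event, or 0 if it recorded none (Mode 3 sketch)."""
    numbers = sorted(frames_by_number)          # traverse in ascending order
    first = frames_by_number[numbers[0]]
    out = [[0] * len(first[0]) for _ in first]
    for n in numbers:
        for y, row in enumerate(frames_by_number[n]):
            for x, e in enumerate(row):
                if e != 0:                      # brightness change recorded
                    out[y][x] = n               # later frames overwrite
    return out


group = {8: [[1, 0]], 9: [[0, 0]], 10: [[-1, 0]]}
print(to_time_plane(group))  # [[10, 0]]
```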
- Mode 4: based on the image signal, perform format conversion in the time dimension and the space dimension on the at least one frame event signal included in the target group of event signals to obtain one frame event signal after format conversion.
- Each frame event signal in the at least one frame event signal is split according to the polarity of the event to obtain at least one frame positive event signal and at least one frame negative event signal.
- the pixel values of each pixel in each frame positive event signal in the at least one frame positive event signal and the pixel values of each pixel in each frame negative event signal in the at least one frame negative event signal are determined.
- the target pixel value of each pixel in each frame positive event signal is determined.
- the target pixel value of each pixel in each frame negative event signal is determined.
- a frame event signal after format conversion is determined.
- the event signal of each frame is split according to the polarity of the event to obtain a frame of positive event signal and a frame of negative event signal.
- the process includes: for any pixel in the frame event signal, determining whether the pixel value of the pixel in the frame event signal is a positive value. When the pixel value of the pixel in the frame event signal is a positive value, the pixel value of the pixel is kept unchanged. When the pixel value of the pixel in the frame event signal is not a positive value, the pixel value of the pixel is set to 0, thereby obtaining a frame of positive event signal corresponding to the frame event signal. Similarly, for any pixel in the frame event signal, determining whether the pixel value of the pixel in the frame event signal is a negative value.
- when the pixel value of the pixel in the frame event signal is a negative value, the pixel value of the pixel is kept unchanged. When the pixel value of the pixel in the frame event signal is not a negative value, the pixel value of the pixel is set to 0, thereby obtaining a frame of negative event signal corresponding to the frame event signal.
- Figure 8 is a schematic diagram of an event signal splitting provided by an embodiment of the present application.
- the frame event signal includes four pixels: pixel (1, 1), pixel (1, 2), pixel (2, 1) and pixel (2, 2).
- a frame of positive event signal and a frame of negative event signal obtained by splitting the frame event signal are shown in Figure 8.
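The splitting rule can be sketched in Python. The function name `split_by_polarity` is hypothetical, and the 2 x 2 pixel values in the example are made up for illustration; they are not the values shown in Figure 8.

```python
def split_by_polarity(frame):
    """Split one frame event signal into a positive and a negative frame.

    Positive frame: positive values kept, everything else set to 0.
    Negative frame: negative values kept, everything else set to 0.
    """
    positive = [[e if e > 0 else 0 for e in row] for row in frame]
    negative = [[e if e < 0 else 0 for e in row] for row in frame]
    return positive, negative


pos, neg = split_by_polarity([[1, -1], [0, 1]])
print(pos)  # [[1, 0], [0, 1]]
print(neg)  # [[0, -1], [0, 0]]
```

Adding the two output frames pixel by pixel gives back the original frame, so no event information is lost by the split.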
- for any pixel in any frame of positive event signal of the at least one frame of positive event signals, the pixel value of the pixel in that frame of positive event signal is accumulated with the pixel values of its adjacent pixels in that frame of positive event signal to obtain the pixel value of the pixel in the spatial dimension of that frame of positive event signal.
- the pixel value of the pixel in the spatial dimension of the positive event signal of the frame is multiplied by the positive change threshold value to obtain the target pixel value of the pixel in the positive event signal of the frame, and the at least one frame of positive event signals is traversed in sequence to obtain the target pixel value of the pixel in each frame of the positive event signal.
- the target pixel value of the pixel in each frame of the negative event signal is determined in a similar manner.
- the target pixel value of the pixel in the at least one frame of the positive event signal and the target pixel value of the pixel in the at least one frame of the negative event signal are accumulated, and the accumulated calculation result is multiplied by the pixel value of the pixel in the image signal to obtain the pixel value of the pixel in the frame event signal after the format conversion.
- the pixel value of each pixel in the frame event signal after the format conversion can be determined according to the above steps to obtain a frame event signal after the format conversion.
- the pixel value of any pixel in the frame event signal after format conversion is determined according to the following formula (4).
- K(u, v, Ni ) represents the pixel value of the pixel (u, v) in the frame event signal with the frame number Ni obtained after the format conversion
- E+(x, y, Mi) represents the pixel value of the pixel (x, y) in the positive event signal with the frame number Mi in the at least one frame of positive event signals;
- C+ represents the positive change threshold, which is usually set in advance;
- E-(x, y, Mi) represents the pixel value of the pixel (x, y) in the negative event signal with the frame number Mi in the at least one frame of negative event signals; accumulating it with the pixel values of the adjacent pixels gives the pixel value of the pixel (x, y) in the spatial dimension in the negative event signal with the frame number Mi;
- C- represents the negative change threshold, which is usually set in advance; multiplying the pixel value in the spatial dimension by C- gives the target pixel value of the pixel (x, y) in the negative event signal with the frame number Mi;
- L(u, v) represents the pixel value of the pixel (u, v) in the image signal;
- exp(·) represents the exponential function, which is used to convert the accumulated calculation result from the logarithmic domain to the linear domain.
- the first event signal is formatted in combination with the acquired image signal, so that the converted second event signal can more accurately indicate the brightness information of the pixel at different moments within the exposure time.
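The image-guided conversion described above can be sketched as follows. This is a minimal illustration, not the formula of the application: the function name `image_guided_frame`, the nested-list frame layout, and the convention that the negative change threshold `c_neg` is passed as a signed (negative) number are all assumptions made for the example.

```python
import math


def image_guided_frame(pos_frames, neg_frames, image, c_pos, c_neg):
    """Combine one group of positive/negative event frames with an image signal.

    pos_frames, neg_frames: lists of 2-D lists (one per frame in the group).
    image: 2-D list holding the image-signal pixel values L(u, v).
    c_pos, c_neg: change thresholds; c_neg is assumed to be signed (negative).
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            acc = 0.0
            # Target pixel values: event-frame value times the change threshold.
            for frame in pos_frames:
                acc += c_pos * frame[v][u]
            for frame in neg_frames:
                acc += c_neg * frame[v][u]
            # exp() moves the accumulated log-domain change to the linear domain.
            out[v][u] = image[v][u] * math.exp(acc)
    return out
```

A pixel with no events keeps its image value, because the accumulated change is 0 and `exp(0)` is 1.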
- the first event signal is an event signal in a stream format
- the first event signal includes event signals at H moments
- the H moments are within the exposure time
- the second event signal includes N frame event signals
- H and N are both integers greater than or equal to 1.
- the exposure time is divided into N sub-time periods, and each of the N sub-time periods includes an event signal at at least one of the H moments.
- the event signal included in each of the N sub-time periods is format-converted in the time dimension and/or space dimension to obtain the N frame event signal.
- the process of dividing the exposure time into N sub-time periods is similar to the process of dividing M frame event signals into N groups of event signals according to the frame sequence numbers in the first case above. Therefore, reference may be made to the relevant content of the first case above, which is not repeated here.
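As an illustration of dividing the exposure time into N sub-time periods, the sketch below assigns stream-format events to equal-length sub-periods by timestamp. The `(t, x, y, p)` tuple layout, the function name, and the choice of equal-length sub-periods are assumptions for the example; the sketch also does not enforce that every sub-period ends up with at least one event.

```python
def split_into_subperiods(events, t_start, t_end, n):
    """Group events by sub-time period.

    events: list of (t, x, y, p) tuples with t_start <= t <= t_end.
    Returns a list of n lists, one per sub-time period.
    """
    duration = (t_end - t_start) / n  # assumed: equal-length sub-periods
    groups = [[] for _ in range(n)]
    for event in events:
        t = event[0]
        # An event exactly at t_end is placed in the last sub-period.
        index = min(int((t - t_start) / duration), n - 1)
        groups[index].append(event)
    return groups
```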
- a sub-time period is selected from the N sub-time periods as the target sub-time period.
- Mode 1: Perform format conversion of the time dimension and the space dimension on the event signal of at least one moment included in the target sub-time period to obtain a frame of event signal after format conversion.
- the weight of the event signal at the moment in the time dimension is determined based on the timestamp of the event signal at the moment, the start time of the target sub-time period, and the duration of the target sub-time period. For any pixel in the event signal at the moment, the event polarity of the pixel in the event signal at the moment is multiplied by the weight of the event signal at the moment in the time dimension to obtain the target pixel value of the pixel in the event signal at the moment, and the event signal of the at least one moment is traversed in sequence to obtain the target pixel value of the pixel in the event signal at each moment.
- the target pixel values corresponding to the pixel in the event signal of the at least one moment are accumulated to obtain the pixel value of the pixel in the event signal of the frame after the format conversion.
- the pixel value of each pixel in the event signal of the frame after the format conversion can be determined according to the above steps, thereby obtaining a frame of event signal after the format conversion.
- the pixel value of any pixel in the frame event signal after format conversion is determined according to the following formula (5).
- $K(u, v, N_i)$ represents the pixel value of the pixel $(u, v)$ in the frame event signal with the frame number $N_i$ obtained after the format conversion
- $t_{start}$ represents the start time of the target sub-time period
- $t_{end}$ represents the end time of the target sub-time period
- $t_j$ represents the timestamp of the event signal at time $t_j$ among the event signals of the at least one moment
- $p_j\,\delta(u - x_j)\,\delta(v - y_j)$ represents the event polarity corresponding to the pixel $(u, v)$ in the event signal at time $t_j$
- $\delta(u - x_j)$ and $\delta(v - y_j)$ represent the polarity function
- the one-frame event signal obtained after format conversion by Mode 1 above is an event signal in voxel grid format. That is, when the first event signal is an event signal in stream format, according to Mode 1, the products of the event polarity corresponding to each pixel and the weight of the first event signal in the time dimension are accumulated, and the accumulated value is used as the converted pixel value of each pixel to obtain a one-frame event signal in voxel grid format after format conversion.
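The Mode 1 accumulation can be sketched as below. The text only states that the time-dimension weight depends on the event timestamp, the start time of the target sub-time period, and its duration; the specific weight used here, the normalized elapsed time `(t - t_start) / (t_end - t_start)`, is an assumption, as are the function name and the `(t, x, y, p)` event layout.

```python
def voxel_grid_frame(events, t_start, t_end, width, height):
    """Accumulate polarity times a time-dimension weight for each pixel.

    events: list of (t, x, y, p) tuples inside the target sub-time period,
    with p equal to +1 or -1.
    """
    frame = [[0.0] * width for _ in range(height)]
    duration = t_end - t_start
    for t, x, y, p in events:
        weight = (t - t_start) / duration  # assumed form of the weight
        frame[y][x] += p * weight
    return frame
```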
- Mode 2: Perform format conversion of the space dimension on the event signal of at least one moment included in the target sub-time period to obtain a frame of event signal after format conversion.
- the event polarities corresponding to the pixel in the event signal at the at least one moment are accumulated to obtain the pixel value of the pixel in the event signal of the frame after the format conversion.
- the pixel value of each pixel in the event signal of the frame after the format conversion can be determined according to the above steps, thereby obtaining the event signal of the frame after the format conversion.
- the pixel value of any pixel in the frame event signal after format conversion is determined according to the following formula (6).
- $K(u, v, N_i)$ represents the pixel value of the pixel $(u, v)$ in the frame event signal with the frame number $N_i$ obtained after the format conversion.
- the one-frame event signal obtained after format conversion by Mode 2 above is an event signal in an event frame format. That is, when the first event signal is an event signal in a stream format, according to Mode 2, the accumulated value of the event polarity corresponding to each pixel is used as the converted pixel value of each pixel to obtain a one-frame event signal in an event frame format after format conversion.
- the total number of event polarities corresponding to each pixel can also be used as the pixel value after the conversion of each pixel to obtain an event signal of one frame in an event frame format after the format conversion, and the embodiment of the present application does not limit this.
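A minimal sketch of Mode 2, covering both variants mentioned above (accumulated polarity, or total number of events per pixel). The function name, the `count_only` flag, and the `(t, x, y, p)` event layout are assumptions for illustration.

```python
def event_frame(events, width, height, count_only=False):
    """Build an event-frame-format signal from stream-format events.

    events: list of (t, x, y, p) tuples with p equal to +1 or -1.
    count_only=False accumulates polarities; count_only=True counts events.
    """
    frame = [[0] * width for _ in range(height)]
    for _t, x, y, p in events:
        frame[y][x] += 1 if count_only else p
    return frame
```

With accumulated polarity, opposite events at the same pixel cancel out; counting events preserves activity but discards the sign.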
- Mode 3: Perform format conversion of the time dimension on the event signal of at least one moment included in the target sub-time period to obtain a frame of event signal after format conversion.
- the spatial position coordinates of the pixel after transformation are determined from the correspondence between the spatial position coordinates before transformation and the spatial position coordinates after transformation based on the spatial position coordinates of the pixel in the event signal of each moment. If the event signal of each moment in the event signal of at least one moment records the brightness change of the pixel, the maximum timestamp among the timestamps of the event signal of at least one moment is determined as the pixel value at the spatial position coordinates after transformation of the pixel.
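- if only the event signals of some moments among the event signals of the at least one moment record the brightness change of the pixel, the maximum timestamp among the timestamps of the event signals of those moments is determined as the pixel value at the spatial position coordinates after transformation of the pixel.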
- the pixel values at the spatial position coordinates after transformation of each pixel constitute a frame of event signal after format transformation.
- for any pixel in the event signal of the at least one moment, directly selecting the event signals that record the brightness change of the pixel from the event signal of the at least one moment according to the above method, and determining the maximum timestamp among the timestamps of the selected event signals as the pixel value at the spatial position coordinates of the pixel after transformation, is only an example.
- the pixel values of each pixel in the event signal of this frame after the format conversion can also be determined in other ways. For example, the event signals of the at least one moment are sorted in the order of the timestamps from small to large to obtain the sorting result of the event signal of the at least one moment.
- the spatial position coordinates of each pixel in the event signal of the at least one moment are determined. Based on the sorting result and the timestamp of the event signal of each moment in the at least one moment, the pixel value of each pixel after the format conversion is determined.
- the spatial position coordinates of the pixel after transformation are determined from the correspondence between the spatial position coordinates before transformation and the spatial position coordinates after transformation. Then, the timestamp of the event signal at the first moment is determined as the pixel value at the spatial position coordinates of the pixel after transformation. In this way, for each pixel in the event signal at the first moment, the pixel value at the spatial position coordinates of each pixel after transformation can be determined according to the above steps.
- the event signals at at least one moment are traversed in sequence according to the same method, so as to obtain the pixel value of the pixel in this frame event signal after format transformation.
- the pixel value of each pixel in this frame event signal after format transformation can be determined according to the above steps to obtain a frame event signal after format transformation.
- the one-frame event signal obtained after format conversion by Mode 3 above is an event signal in a time plane format. That is, when the first event signal is an event signal in a stream format, according to Mode 3, the timestamp corresponding to the last polarity event of each pixel is used as the target pixel value corresponding to each pixel to obtain a one-frame event signal in a time plane format after format conversion.
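The sorting-based variant of Mode 3 can be sketched as follows: events are visited in ascending timestamp order and each one overwrites the value at its pixel, so the latest timestamp survives. The sketch assumes an identity mapping between spatial coordinates before and after transformation and a value of 0.0 for pixels with no events; neither is specified in the text, and the function name and event layout are illustrative.

```python
def time_plane_frame(events, width, height):
    """Keep, for each pixel, the timestamp of its most recent event.

    events: list of (t, x, y, p) tuples inside the target sub-time period.
    """
    frame = [[0.0] * width for _ in range(height)]  # assumed default: 0.0
    # Ascending order means later events overwrite earlier ones.
    for t, x, y, _p in sorted(events, key=lambda e: e[0]):
        frame[y][x] = t
    return frame
```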
- Mode 4: Based on the image signal, perform format conversion of the time dimension and the space dimension on the event signal of at least one moment included in the target sub-time period to obtain a frame of event signal after format conversion.
- based on the event polarity of each pixel included in the event signal at each moment in the at least one moment, and the positive change threshold, determine the positive polarity value of each pixel included in the event signal at each moment. Based on the event polarity of each pixel included in the event signal at each moment in the at least one moment, and the negative change threshold, determine the negative polarity value of each pixel included in the event signal at each moment. Based on the positive polarity value of each pixel included in the event signal at each moment in the at least one moment, the negative polarity value of each pixel included in the event signal at each moment, and the image signal, determine a frame of event signal after format conversion.
- the absolute value of the event polarity of the pixel in the event signal at the moment is multiplied by the positive change threshold to obtain the positive polarity value of the pixel in the event signal at the moment.
- the absolute value of the event polarity of the pixel in the event signal at the moment is multiplied by the negative change threshold to obtain the negative polarity value of the pixel in the event signal at the moment.
- the positive polarity value corresponding to the pixel in the event signal at the at least one moment and the negative polarity value of the pixel in the event signal at the at least one moment are accumulated, and the accumulated calculation result is multiplied by the pixel value of the pixel in the image signal to obtain the pixel value of the pixel in the event signal of the frame after the format conversion.
- the pixel value of each pixel in the event signal of the frame after the format conversion can be determined according to the above steps to obtain a frame of event signal after the format conversion.
- the pixel value of any pixel in the frame event signal after format conversion is determined according to the following formula (7).
- $K(u, v, N_i)$ represents the pixel value of the pixel $(u, v)$ in the frame event signal with the frame number $N_i$ obtained after format conversion
- $C^{+}\,\delta(p_j - 1)\,\delta(u - x_j)\,\delta(v - y_j)$ represents the positive polarity value corresponding to the pixel $(u, v)$ in the event signal at time $t_j$
- $C^{-}\,\delta(p_j + 1)\,\delta(u - x_j)\,\delta(v - y_j)$ represents the negative polarity value corresponding to the pixel $(u, v)$ in the event signal at time $t_j$.
- the first event signal is formatted in combination with the acquired image signal, so that the converted second event signal can more accurately indicate the brightness information of the pixel at different moments within the exposure time.
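Mode 4 can be sketched for stream-format events as below. Positive events contribute the positive change threshold and negative events the negative change threshold to a per-pixel sum, which is then exponentiated and multiplied by the image pixel value. Treating `c_neg` as a signed (negative) number, the function name, and the `(t, x, y, p)` event layout are assumptions made for the example.

```python
import math


def image_guided_stream_frame(events, image, c_pos, c_neg):
    """Image-guided conversion of stream events into one frame.

    events: list of (t, x, y, p) tuples with p equal to +1 or -1.
    image: 2-D list holding the image-signal pixel values.
    c_pos, c_neg: change thresholds; c_neg is assumed to be signed (negative).
    """
    height, width = len(image), len(image[0])
    acc = [[0.0] * width for _ in range(height)]
    for _t, x, y, p in events:
        if p == 1:
            acc[y][x] += c_pos   # positive polarity value
        elif p == -1:
            acc[y][x] += c_neg   # negative polarity value
    return [[image[v][u] * math.exp(acc[v][u]) for u in range(width)]
            for v in range(height)]
```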
- Step 703: Fuse the second event signal with the image signal to obtain a fused signal.
- a mask area in a frame of event signals is determined, the mask area indicates an area where pixels with motion information are located in the corresponding frame of event signals, pixel values of each pixel within the mask area are fused with pixel values of corresponding pixels in the image signal, and pixel values of each pixel outside the mask area are set to pixel values of corresponding pixels in the image signal, so as to obtain a frame of fused signal.
- a frame event signal is selected from the N frame event signal, and taking the frame event signal as an example, the process of fusing the frame event signal with the image signal to obtain a frame fused signal is introduced.
- the pixel value of each pixel in the frame event signal may or may not be 0. If the pixel value of a pixel is 0, there is no motion information for that pixel at the time indicated by the frame event signal. If the pixel value of a pixel is not 0, there is motion information for that pixel at the time indicated by the frame event signal. In this way, the area where the pixels with motion information in the frame event signal are located is determined as the mask area corresponding to the frame event signal.
- a mask signal corresponding to the frame event signal is generated. That is, for any pixel in the frame event signal, when the pixel value of the pixel is 0, the value of the mask array corresponding to the pixel is set to the first value. When the pixel value of the pixel is not 0, the value of the mask array corresponding to the pixel is set to the second value.
- the values of the mask array corresponding to each pixel in the frame event signal constitute the mask signal corresponding to the frame event signal.
- the area surrounded by the mask array whose value in the mask signal is the second value is the mask area corresponding to the frame event signal.
- the first value and the second value are preset, for example, the first value is 0, and the second value is 1. Moreover, the first value and the second value can be adjusted according to different requirements.
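Generating the mask signal from a frame event signal can be sketched as follows, using the example values from the text (first value 0, second value 1) as defaults. The function name and the nested-list frame layout are assumptions.

```python
def mask_signal(event_frame, first_value=0, second_value=1):
    """Return the mask signal for one frame event signal.

    Pixels whose event value is 0 get first_value (no motion information);
    all other pixels get second_value (inside the mask area).
    """
    return [[first_value if pixel == 0 else second_value for pixel in row]
            for row in event_frame]
```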
- a pixel with the same spatial position coordinates as the pixel is selected from the image signal to obtain the pixel corresponding to the pixel in the image signal.
- the pixel value of the pixel in the frame event signal is fused with the pixel value of the pixel in the image signal to obtain the pixel value of the pixel in this frame fusion signal.
- the pixel value of the pixel in the image signal is determined as the pixel value of the pixel in this frame fusion signal. In this way, for each pixel in the frame event signal, the pixel value of each pixel in this frame fusion signal can be determined according to the above steps to obtain a frame fusion signal.
- each pixel outside the mask area is masked out, so the event signal and the image signal do not need to be fused for pixels outside the mask area.
- when the pixel value of the pixel in the frame event signal is fused with the pixel value of the pixel in the image signal, the two pixel values can be added directly, or the pixel value of the pixel in the frame event signal can replace the pixel value of the pixel in the image signal.
- the pixel value of the pixel in the frame event signal and the pixel value of the pixel in the image signal can also be fused in other ways, and the embodiments of the present application are not limited to this.
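The per-pixel fusion described above can be sketched as follows, with the two fusion options named in the text (direct addition and replacement). The function name, the `mode` parameter, and the use of a zero event value as the test for lying outside the mask area are illustrative assumptions.

```python
def fuse_frame(event_frame, image, mode="add"):
    """Fuse one frame event signal with the image signal.

    Inside the mask area (event value != 0) the values are fused;
    outside it the image pixel value is used unchanged.
    """
    if mode not in ("add", "replace"):
        raise ValueError("mode must be 'add' or 'replace'")
    fused = []
    for event_row, image_row in zip(event_frame, image):
        fused_row = []
        for event_value, image_value in zip(event_row, image_row):
            if event_value == 0:        # outside the mask area
                fused_row.append(image_value)
            elif mode == "add":         # direct addition
                fused_row.append(image_value + event_value)
            else:                       # event value replaces image value
                fused_row.append(event_value)
        fused.append(fused_row)
    return fused
```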
- the fused signal can also be used as input for downstream tasks, including video frame interpolation, image deblurring, image super-resolution, target object monitoring, depth estimation and other scenarios.
- the fused signal is input into the neural network model to obtain the scene perception information of the autonomous driving scene. That is, the fused signal is used as the input of the downstream task of the autonomous driving scene, so as to realize the perception of road conditions, vehicles, pedestrians and environmental changes in the autonomous driving scene.
- the second event signal and the image signal can be preprocessed respectively to improve the image quality of the second event signal and the image signal, thereby further improving the image quality of the fused signal.
- the second event signal is filtered to eliminate noise and bad pixels in the second event signal.
- the image signal is interpolated, denoised, de-mosaiced, and white-balanced.
- the filtering process for the second event signal includes median filtering, Gaussian filtering, etc.
- the second event signal and the image signal can also be preprocessed in other ways, which is not limited in the embodiment of the present application.
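As one example of the filtering mentioned above, a 3x3 median filter over a frame event signal can be sketched as below. The window size and the border handling (the window is simply truncated at the frame edges) are assumptions; the text does not specify either.

```python
from statistics import median


def median_filter_3x3(frame):
    """Apply a 3x3 median filter to a 2-D list, truncating windows at borders."""
    height, width = len(frame), len(frame[0])
    out = [[0] * width for _ in range(height)]
    for v in range(height):
        for u in range(width):
            window = [frame[j][i]
                      for j in range(max(0, v - 1), min(height, v + 2))
                      for i in range(max(0, u - 1), min(width, u + 2))]
            out[v][u] = median(window)
    return out
```

An isolated non-zero pixel surrounded by zeros, which is typical of sensor noise or a bad pixel, is removed by this filter.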
- a second event signal is obtained by converting the format of the first event signal in the time dimension and/or the space dimension. Because the second event signal is an event signal in a frame format, its format is similar to the format of the image signal. The second event signal therefore has a resolution, and the resolution of the second event signal is the same as the resolution of the image signal. In this way, the second event signal can be better fused with the image signal. Moreover, the image signal indicates the brightness information of multiple pixels during the exposure time, and the event signal indicates the motion information of the multiple pixels during the exposure time. Fusing the event signal with the image signal therefore yields a fused signal that includes both the brightness information and the motion information of the multiple pixels. In this way, the quality of the image can be improved by a dense fused signal that has both brightness information and motion information.
- FIG. 9 is a flow chart of another signal processing method provided in an embodiment of the present application, wherein the interactive execution subjects of the signal processing method provided in an embodiment of the present application are a signal processing device and a cloud server.
- the method includes the following steps.
- Step 901: The signal processing device obtains an image signal and a first event signal of a target scene, and sends the image signal and the first event signal of the target scene to a cloud server.
- the image signal indicates brightness information of a plurality of pixels corresponding to the target scene within an exposure time
- the first event signal indicates motion information of the plurality of pixels within the exposure time.
- the first event signal is an event signal in a frame format or an event signal in a stream format.
- Step 902: The cloud server receives the image signal and the first event signal of the target scene sent by the signal processing device, and performs format conversion of the first event signal in the time dimension and/or space dimension to obtain a second event signal, where the second event signal is an event signal in a frame format, and the resolution of the second event signal is the same as the resolution of the image signal.
- the first event signal may be an event signal in a frame format or an event signal in a stream format.
- the way in which the cloud server performs format conversion of the time dimension and/or space dimension on the first event signal to obtain the second event signal differs between these two cases, which are described separately below.
- the first event signal is an event signal in a frame format
- the first event signal includes M frame event signals
- the second event signal includes N frame event signals, where M and N are both integers greater than or equal to 1, and M is greater than or equal to N.
- the M frame event signals are divided into N groups of event signals according to the frame sequence number, and each group of event signals in the N groups of event signals includes at least one frame event signal with a continuous frame sequence number.
- Each group of event signals in the N groups of event signals is format-converted in the time dimension and/or space dimension to obtain the N frame event signals.
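Dividing M frame event signals into N groups with consecutive frame sequence numbers can be sketched as below. Splitting into groups of near-equal size is an assumption; the text only requires that each group contain at least one frame with consecutive frame sequence numbers. The function name is illustrative.

```python
def group_frames(frames, n):
    """Split a list of M frames into n groups of consecutive frames.

    Requires 1 <= n <= len(frames), matching the condition M >= N >= 1.
    """
    m = len(frames)
    if not 1 <= n <= m:
        raise ValueError("n must satisfy 1 <= n <= len(frames)")
    base, extra = divmod(m, n)
    groups, start = [], 0
    for i in range(n):
        # The first `extra` groups take one additional frame each.
        size = base + (1 if i < extra else 0)
        groups.append(frames[start:start + size])
        start += size
    return groups
```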
- the first event signal is an event signal in a stream format
- the first event signal includes event signals at H moments
- the H moments are within the exposure time
- the second event signal includes N frame event signals
- H and N are both integers greater than or equal to 1.
- the exposure time is divided into N sub-time periods, and each of the N sub-time periods includes an event signal at at least one moment in the H moments.
- the event signal included in each of the N sub-time periods is format-converted in the time dimension and/or space dimension to obtain the N frame event signal.
- the process by which the cloud server converts the format of the first event signal is similar to the process by which the signal processing device converts the format of the first event signal in step 702 above, so reference may be made to the relevant content of step 702, which is not repeated here.
- Step 903: The cloud server fuses the second event signal with the image signal to obtain a fused signal.
- the cloud server performs the following operations on each of the N frames of event signals: determining a mask area in a frame of event signal, the mask area indicating an area where pixels with motion information are located in the corresponding frame of event signal, fusing the pixel values of each pixel within the mask area with the pixel values of the corresponding pixels in the image signal, and setting the pixel values of each pixel outside the mask area to the pixel values of the corresponding pixels in the image signal to obtain a frame of fused signal.
- Step 904: The cloud server sends the fused signal to the signal processing device.
- after the cloud server obtains the fused signal according to the above steps, it sends the fused signal to the signal processing device. After the signal processing device receives the fused signal sent by the cloud server, it uses the fused signal as the input of the downstream task.
- the downstream task includes any one of video frame interpolation, image deblurring, image super-resolution, target object monitoring and depth estimation.
- the fused signal is input into the neural network model to obtain the scene perception information of the autonomous driving scene. That is, the fused signal is used as the input of the downstream task of the autonomous driving scene, so as to realize the perception of road conditions, vehicles, pedestrians and environmental changes in the autonomous driving scene.
- a second event signal is obtained by converting the format of the first event signal in the time dimension and/or the space dimension. Because the second event signal is an event signal in a frame format, its format is similar to the format of the image signal. The second event signal therefore has a resolution, and the resolution of the second event signal is the same as the resolution of the image signal. In this way, the second event signal can be better fused with the image signal. Moreover, the image signal indicates the brightness information of multiple pixels during the exposure time, and the event signal indicates the motion information of the multiple pixels during the exposure time. Fusing the event signal with the image signal therefore yields a fused signal that includes both the brightness information and the motion information of the multiple pixels. In this way, the quality of the image can be improved by a dense fused signal that has both brightness information and motion information.
- FIG. 10 is a schematic diagram of the structure of a signal processing device provided in an embodiment of the present application, and the signal processing device can be implemented by software, hardware or a combination of both to form part or all of the signal processing equipment.
- the device includes: an acquisition module 1001 , a conversion module 1002 and a fusion module 1003 .
- the acquisition module 1001 is used to acquire an image signal and a first event signal of a target scene, wherein the image signal indicates brightness information of a plurality of pixels corresponding to the target scene during an exposure time, and the first event signal indicates motion information of the plurality of pixels during an exposure time, and the first event signal is an event signal in a frame format or an event signal in a stream format.
- the detailed implementation process refers to the corresponding contents in the above-mentioned embodiments, which will not be repeated here.
- the conversion module 1002 is used to convert the format of the first event signal in the time dimension and/or the space dimension to obtain a second event signal, where the second event signal is an event signal in a frame format, and the resolution of the second event signal is the same as the resolution of the image signal.
- the detailed implementation process refers to the corresponding content in the above embodiments, which will not be repeated here.
- the fusion module 1003 is used to fuse the second event signal with the image signal to obtain a fusion signal.
- the detailed implementation process refers to the corresponding content in the above embodiments, which will not be repeated here.
- the first event signal is an event signal in a frame format, and the first event signal includes M frame event signals, and the second event signal includes N frame event signals, M and N are both integers greater than or equal to 1, and M is greater than or equal to N;
- the conversion module 1002 is specifically used to:
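- divide the M frames of event signals into N groups of event signals according to the frame sequence number, where each group of event signals in the N groups of event signals includes at least one frame of event signals with consecutive frame sequence numbers;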
- a format conversion of a time dimension and/or a space dimension is performed on each group of the N groups of event signals to obtain the N frames of event signals.
- the first event signal is an event signal in a stream format, and the first event signal includes event signals at H moments, the H moments are within the exposure time, the second event signal includes N frame event signals, and H and N are both integers greater than or equal to 1;
- the conversion module 1002 is specifically used to:
- divide the exposure time into N sub-time periods, where each of the N sub-time periods includes an event signal of at least one of the H moments;
- the event signal included in each of the N sub-time periods is format-converted in terms of time dimension and/or space dimension to obtain the N frames of event signals.
- the format of the second event signal is any one of an event frame format, a time plane format and a voxel grid format.
- the second event signal includes N frame event signals, where N is an integer greater than or equal to 1;
- the fusion module 1003 is specifically configured to:
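- perform the following operations on each of the N frames of event signals: determine a mask area in a frame of event signal, the mask area indicating an area where pixels with motion information are located in the corresponding frame of event signal; fuse the pixel values of each pixel within the mask area with the pixel values of the corresponding pixels in the image signal; and set the pixel values of each pixel outside the mask area to the pixel values of the corresponding pixels in the image signal to obtain a frame of fused signal.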
- the target scene is an autonomous driving scene; the device further includes:
- the input module is used to input the fusion signal into the neural network model to obtain scene perception information of the autonomous driving scene.
- a second event signal is obtained by converting the format of the first event signal in the time dimension and/or the space dimension. Because the second event signal is an event signal in a frame format, its format is similar to the format of the image signal. The second event signal therefore has a resolution, and the resolution of the second event signal is the same as the resolution of the image signal. In this way, the second event signal can be better fused with the image signal. Moreover, the image signal indicates the brightness information of multiple pixels during the exposure time, and the event signal indicates the motion information of the multiple pixels during the exposure time. Fusing the event signal with the image signal therefore yields a fused signal that includes both the brightness information and the motion information of the multiple pixels. In this way, the quality of the image can be improved by a dense fused signal that has both brightness information and motion information.
- when the signal processing device provided in the above embodiment processes a signal, the division into the above functional modules is used only as an example. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
- the signal processing device provided in the above embodiment and the signal processing method embodiment belong to the same concept, and the specific implementation process is detailed in the method embodiment, which will not be repeated here.
- FIG. 11 is a schematic diagram of the structure of a computer device according to an embodiment of the present application, wherein the computer device is the above-mentioned signal processing device or cloud server.
- the computer device includes at least one processor 1101, a communication bus 1102, a memory 1103 and at least one communication interface 1104.
- the processor 1101 may be a general-purpose central processing unit (CPU), a network processor (NP), a microprocessor, or may be one or more integrated circuits for implementing the solution of the present application, such as an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or a combination thereof.
- the above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL) or any combination thereof.
- the communication bus 1102 is used to transmit information between the above components.
- the communication bus 1102 can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.
- the memory 1103 may be a read-only memory (ROM), a random access memory (RAM), an electrically erasable programmable read-only memory (EEPROM), an optical disc (including a compact disc read-only memory (CD-ROM), a compressed optical disc, a laser disc, a digital versatile disc, a Blu-ray disc, etc.), a disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store the desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited to this.
- the memory 1103 may exist independently and be connected to the processor 1101 via the communication bus 1102.
- the memory 1103 may also be integrated with the processor 1101.
- the communication interface 1104 uses any transceiver-like device for communicating with other devices or communication networks.
- the communication interface 1104 includes a wired communication interface and may also include a wireless communication interface.
- the wired communication interface may be, for example, an Ethernet interface.
- the Ethernet interface may be an optical interface, an electrical interface, or a combination thereof.
- the wireless communication interface may be a wireless local area network (WLAN) interface, a cellular network communication interface, or a combination thereof, etc.
- the processor 1101 may include one or more CPUs, such as CPU0 and CPU1 shown in FIG. 11.
- a computer device may include multiple processors, such as processor 1101 and processor 1105 shown in FIG. 11. Each of these processors may be a single-core processor or a multi-core processor.
- the processor here may refer to one or more devices, circuits, and/or processing cores for processing data (such as computer program instructions).
- the computer device may further include an output device 1106 and an input device 1107.
- the output device 1106 communicates with the processor 1101 and may display information in a variety of ways.
- the output device 1106 may be a display device such as a liquid crystal display (LCD), a light-emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector.
- the input device 1107 communicates with the processor 1101 and may receive user input in a variety of ways.
- the input device 1107 may be a mouse, a keyboard, a touch screen device, or a sensor device.
- the memory 1103 is used to store the program code 1110 for executing the solution of the present application, and the processor 1101 can execute the program code 1110 stored in the memory 1103.
- the program code 1110 may include one or more software modules, and the computer device may implement the signal processing method provided in the embodiment of FIG. 7 or FIG. 9 above through the processor 1101 and the program code 1110 in the memory 1103.
- the cloud server includes a communication interface and one or more processors;
- the communication interface is used to receive an image signal of a target scene and a first event signal sent by a signal processing device, wherein the image signal indicates brightness information of a plurality of pixels corresponding to the target scene within an exposure time, and the first event signal indicates motion information of the plurality of pixels within the exposure time, and the first event signal is an event signal in a frame format or an event signal in a stream format.
- the detailed implementation process refers to the corresponding contents in the above-mentioned embodiments, which will not be repeated here.
- One or more processors are used to convert the format of the first event signal in the time dimension and/or the space dimension to obtain a second event signal, the second event signal is an event signal in a frame format, and the resolution of the second event signal is the same as the resolution of the image signal.
- the detailed implementation process refers to the corresponding content in the above embodiments, which will not be repeated here.
- One or more processors are used to fuse the second event signal with the image signal to obtain a fused signal.
- the detailed implementation process refers to the corresponding content in the above embodiments, which will not be repeated here.
- One or more processors are used to send the fused signal to the signal processing device through the communication interface.
- the detailed implementation process refers to the corresponding content in the above embodiments, which will not be repeated here.
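The conversion step performed by the cloud server can be illustrated with a minimal sketch for a stream-format first event signal. The sketch assumes each event is a hypothetical `(x, y, t, polarity)` tuple, that the exposure time is split into N equal sub-periods, that events are accumulated by summing polarities, and that the spatial conversion is a nearest-pixel rescaling to the image resolution. None of these choices is stated in this part of the text; the function name `events_to_frames` and its parameters are illustrative only.

```python
from typing import List, Tuple

# Hypothetical stream-format event: (x, y, t, polarity), polarity is +1 or -1.
Event = Tuple[int, int, float, int]


def events_to_frames(events: List[Event], t_start: float, t_end: float,
                     n_frames: int, ev_size: Tuple[int, int],
                     img_size: Tuple[int, int]) -> List[List[List[int]]]:
    """Accumulate stream events into n_frames event frames at image resolution."""
    if n_frames < 1 or t_end <= t_start:
        raise ValueError("need n_frames >= 1 and t_end > t_start")
    ev_w, ev_h = ev_size
    img_w, img_h = img_size
    frames = [[[0] * img_w for _ in range(img_h)] for _ in range(n_frames)]
    duration = t_end - t_start
    for x, y, t, p in events:
        if not (t_start <= t <= t_end):
            continue  # ignore events outside the exposure time
        # Time dimension: index of the equal-length sub-period containing t.
        k = min(int((t - t_start) / duration * n_frames), n_frames - 1)
        # Space dimension: nearest-pixel rescaling to the image grid (assumed).
        u = min(x * img_w // ev_w, img_w - 1)
        v = min(y * img_h // ev_h, img_h - 1)
        frames[k][v][u] += p
    return frames
```

With a 2x2 event sensor and a 4x4 image, `events_to_frames(events, 0.0, 1.0, 2, (2, 2), (4, 4))` returns two 4x4 frames. When the event resolution is lower than the image resolution, this nearest-pixel mapping leaves unfilled pixels, so a real implementation would likely interpolate instead.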
- FIG. 12 is a schematic diagram of the structure of a terminal device provided in an embodiment of the present application.
- the terminal device may be the above-mentioned signal processing device.
- the terminal device includes a sensor unit 1210 , a computing unit 1220 , a storage unit 1240 and an interaction unit 1230 .
- the sensor unit 1210 generally includes a visual sensor (such as a camera), a depth sensor, an IMU, a laser sensor, etc.;
- the computing unit 1220 usually includes a CPU, a GPU, a cache, a register, etc., and is mainly used to run an operating system;
- the storage unit 1240 mainly includes a memory and an external storage, and is mainly used for reading and writing local and temporary data;
- the interaction unit 1230 mainly includes a display screen, a touch panel, a speaker, a microphone, etc., and is mainly used to interact with the user, obtain input, and present the results of the algorithm, etc.
- FIG. 13 is a schematic diagram of the structure of a terminal device provided in an embodiment of the present application.
- the terminal device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, and a subscriber identification module (SIM) card interface 195, etc.
- the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, and the like.
- the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the terminal device 100.
- the terminal device 100 may include more or fewer components than shown in the figure, or combine some components, or split some components, or arrange the components differently.
- the components shown in the figure may be implemented in hardware, software, or a combination of software and hardware.
- the processor 110 may include one or more processing units, for example, the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processor (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), a controller, a memory, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural network processor (neural-network processing unit, NPU), etc.
- different processing units can be independent devices or integrated in one or more processors.
- the processor 110 can execute a computer program to implement any method in the embodiments of the present application.
- the controller may be the nerve center and command center of the terminal device 100.
- the controller may generate an operation control signal according to the instruction operation code and the timing signal to complete the control of fetching and executing instructions.
- the processor 110 may also be provided with a memory for storing instructions and data.
- the memory in the processor 110 is a cache memory.
- the memory may store instructions or data that the processor 110 has just used or is cyclically using. The instruction or data can be directly called from the memory when it is used for the second time, thus avoiding repeated access and reducing the waiting time of the processor 110, thereby improving the efficiency of the system.
- the processor 110 may include one or more interfaces.
- the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
- the interface connection relationship between the modules illustrated in the embodiment of the present application is only a schematic illustration and does not constitute a structural limitation on the terminal device 100.
- the terminal device 100 may also adopt different interface connection methods in the above embodiments, or a combination of multiple interface connection methods.
- the charging management module 140 is used to receive charging input from a charger.
- the charger may be a wireless charger or a wired charger.
- the charging management module 140 may receive charging input from the wired charger through the USB interface 130 .
- the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
- the power management module 141 receives input from the battery 142 and/or the charging management module 140 to power the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, and the wireless communication module 160.
- the wireless communication function of the terminal device 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
- the terminal device 100 can use a wireless communication function to communicate with other devices.
- the terminal device 100 can communicate with a second electronic device, the terminal device 100 establishes a screen projection connection with the second electronic device, and the terminal device 100 outputs the projection data to the second electronic device.
- the projection data output by the terminal device 100 may be audio and video data.
- Antenna 1 and antenna 2 are used to transmit and receive electromagnetic wave signals.
- Each antenna in terminal device 100 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve the utilization of antennas.
- antenna 1 can be reused as a diversity antenna for a wireless local area network.
- the antenna can be used in combination with a tuning switch.
- the mobile communication module 150 can provide solutions for wireless communications including 2G/3G/4G/5G applied to the terminal device 100.
- the mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), etc.
- the mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modem processor for demodulation.
- the mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves for radiation through the antenna 1.
- at least some of the functional modules of the mobile communication module 150 can be set in the processor 110.
- at least some of the functional modules of the mobile communication module 150 can be set in the same device as at least some of the modules of the processor 110.
- the modem processor may include a modulator and a demodulator.
- the modulator is used to modulate the low-frequency baseband signal to be sent into a medium-high frequency signal.
- the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal.
- the demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
- the application processor outputs a sound signal through an audio device (not limited to a speaker 170A, a receiver 170B, etc.), or displays an image or video through a display screen 194.
- the modem processor may be an independent device.
- the modem processor may be independent of the processor 110 and be set in the same device as the mobile communication module 150 or other functional modules.
- the wireless communication module 160 can provide wireless communication solutions applied to the terminal device 100, including a wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), a global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) and the like.
- the wireless communication module 160 can be one or more devices integrating at least one communication processing module.
- the wireless communication module 160 receives electromagnetic waves via antenna 2, performs frequency modulation and filtering on the electromagnetic wave signals, and sends the processed signals to the processor 110.
- the wireless communication module 160 can also receive the signal to be sent from the processor 110, modulate the frequency, amplify it, and convert it into electromagnetic waves for radiation through antenna 2.
- the antenna 1 of the terminal device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the terminal device 100 can communicate with the network and other devices through wireless communication technology.
- the wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
- the GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a Beidou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS) and/or a satellite based augmentation system (SBAS).
- the terminal device 100 implements the display function through a GPU, a display screen 194, and an application processor.
- the GPU is a microprocessor for image processing, which connects the display screen 194 and the application processor.
- the GPU is used to perform mathematical and geometric calculations for graphics rendering.
- the processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
- the display screen 194 is used to display images, videos, etc.
- the display screen 194 includes a display panel.
- the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), Mini-LED, Micro-LED, Micro-OLED, quantum dot light-emitting diodes (QLED), etc.
- the terminal device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
- the display screen 194 may be used to display various interfaces of the system output of the terminal device 100 .
- the terminal device 100 can realize the shooting function through ISP, camera 193, video codec, GPU, display screen 194 and application processor.
- the ISP is used to process the data fed back by the camera 193. For example, when taking a photo, the shutter is opened, and the light is transmitted to the camera photosensitive element through the lens. The light signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing and converts it into an image visible to the naked eye.
- the ISP can also perform algorithm optimization on the noise, brightness, and skin color of the image. The ISP can also optimize the exposure, color temperature and other parameters of the shooting scene. In some embodiments, the ISP can be set in the camera 193.
- the camera 193 is used to capture still images or videos.
- the object generates an optical image through the lens and projects it onto the photosensitive element.
- the photosensitive element can be a charge coupled device (CCD) or a complementary metal oxide semiconductor (CMOS) phototransistor.
- the photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to be converted into a digital image signal.
- the ISP outputs the digital image signal to the DSP for processing.
- the DSP converts the digital image signal into an image signal in a standard RGB, YUV or other format.
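As an illustration of the kind of format conversion mentioned above, the following sketch converts one RGB pixel to YUV. The text does not say which YUV variant the DSP produces; the BT.601 luma weights and the classic analog U/V scaling factors used here are an assumption, and `rgb_to_yuv` is a hypothetical helper name.

```python
def rgb_to_yuv(r: float, g: float, b: float):
    """Convert one RGB pixel to YUV using BT.601 luma weights (assumed variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma: weighted sum of R, G, B
    u = 0.492 * (b - y)                    # blue-difference chroma
    v = 0.877 * (r - y)                    # red-difference chroma
    return y, u, v
```

For a neutral gray pixel the chroma components are zero, so `rgb_to_yuv(128, 128, 128)` yields a luma of about 128 and U and V of about 0, up to floating-point rounding.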
- the terminal device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
- Digital signal processors are used to process digital signals. In addition to processing digital image signals, they can also process other digital signals.
- Video codecs are used to compress or decompress digital videos.
- the terminal device 100 may support one or more video codecs. In this way, the terminal device 100 can play or record videos in multiple coding formats, such as Moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, etc.
- NPU is a neural network (NN) computing processor.
- through the NPU, applications such as intelligent cognition of the terminal device 100 can be realized, such as image recognition, face recognition, voice recognition, text understanding, etc.
- the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the terminal device 100.
- the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music and videos can be stored in the external memory card.
- the internal memory 121 can be used to store computer executable program codes, which include instructions.
- the processor 110 executes various functional applications and data processing of the terminal device 100 by running the instructions stored in the internal memory 121.
- the internal memory 121 may include a program storage area and a data storage area.
- the program storage area can store an operating system and an application required for at least one function (such as the signal processing method in the embodiment of the present application, etc.).
- the data storage area can store data created during the use of the terminal device 100 (such as audio data, phone book, etc.).
- the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one disk storage device, a flash memory device, a universal flash storage (UFS), etc.
- the terminal device 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor, etc. For example, music playing, recording, etc.
- the audio module 170 can be used to play the sound corresponding to the video. For example, when the display screen 194 displays the video playing screen, the audio module 170 outputs the sound of the video playing.
- the audio module 170 is used to convert digital audio information into analog audio signals for output, and also to convert analog audio input into digital audio signals.
- the speaker 170A, also called a "horn", is used to convert audio electrical signals into sound signals.
- the receiver 170B, also called a "handset", is used to convert audio electrical signals into sound signals.
- the microphone 170C is used to convert sound signals into electrical signals.
- the earphone interface 170D is used to connect a wired earphone.
- the earphone interface 170D may be the USB interface 130, or may be a 3.5 mm open mobile terminal platform (OMTP) standard interface or a cellular telecommunications industry association of the USA (CTIA) standard interface.
- the pressure sensor 180A is used to sense pressure signals and can convert pressure signals into electrical signals.
- the pressure sensor 180A can be disposed on the display screen 194.
- the gyroscope sensor 180B can be used to determine the motion posture of the terminal device 100.
- the air pressure sensor 180C is used to measure air pressure.
- the acceleration sensor 180E can detect the magnitude of the acceleration of the terminal device 100 in all directions (including three axes or six axes). When the terminal device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of the terminal device and applied to applications such as horizontal and vertical screen switching and pedometers.
- the distance sensor 180F is used to measure the distance.
- the ambient light sensor 180L is used to sense the brightness of ambient light.
- the fingerprint sensor 180H is used to collect fingerprints.
- the temperature sensor 180J is used to detect the temperature.
- the touch sensor 180K is also called a "touch panel".
- the touch sensor 180K can be set on the display screen 194, and the touch sensor 180K and the display screen 194 together form a touch screen.
- the touch sensor 180K is used to detect touch operations acting on or near it.
- the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
- Visual output related to the touch operation can be provided through the display screen 194.
- the touch sensor 180K can also be set on the surface of the terminal device 100, which is different from the position of the display screen 194.
- the key 190 includes a power key, a volume key, etc.
- the key 190 may be a mechanical key or a touch key.
- the terminal device 100 may receive key input and generate key signal input related to user settings and function control of the terminal device 100.
- Motor 191 can generate vibration prompts.
- the indicator 192 may be an indicator light, which may be used to indicate the charging status, power changes, messages, missed calls, notifications, etc.
- the SIM card interface 195 is used to connect a SIM card.
- the computer program product includes one or more computer instructions.
- the computer can be a general-purpose computer, a special-purpose computer, a computer network or other programmable device.
- the computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
- the computer instructions can be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave).
- the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media.
- the available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital versatile disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)).
- the computer-readable storage medium mentioned in the embodiment of the present application may be a non-volatile storage medium, in other words, a non-transitory storage medium.
- an embodiment of the present application further provides a computer-readable storage medium, in which instructions are stored.
- when the instructions are executed on a computer, the computer executes the steps of the above-mentioned signal processing method.
- the embodiment of the present application also provides a computer program product including instructions, which, when executed on a computer, enables the computer to execute the steps of the above-mentioned signal processing method.
- a computer program is provided, which, when executed on a computer, enables the computer to execute the steps of the above-mentioned signal processing method.
- the information (including but not limited to user device information, user personal information, etc.), data (including but not limited to data for analysis, stored data, displayed data, etc.) and signals involved in the embodiments of the present application are all authorized by the user or fully authorized by all parties, and the collection, use and processing of relevant data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
- the image signal and the first event signal of the target scene involved in the embodiments of the present application are obtained with full authorization.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Signal Processing (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Computer Graphics (AREA)
- Human Computer Interaction (AREA)
- Image Processing (AREA)
- Television Systems (AREA)
Abstract
Description
Δt=INT(M/N) (1)
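Equation (1) appears here without its surrounding description. One plausible reading, consistent with the claim that divides M event frames into N groups of consecutive frames, is that Δt is the number of consecutive frames per group. The sketch below follows that reading. Letting the last group absorb the leftover M mod N frames and merging a group by pixel-wise summation are both assumptions, and `group_frames` and `merge_group` are hypothetical names.

```python
from typing import List


def group_frames(m: int, n: int) -> List[List[int]]:
    """Split frame indices 0..m-1 into n groups of consecutive indices."""
    if not (1 <= n <= m):
        raise ValueError("expected 1 <= N <= M")
    dt = m // n  # Delta t = INT(M / N), equation (1)
    groups = []
    for k in range(n):
        start = k * dt
        # Assumption: the last group absorbs the M mod N leftover frames.
        end = m if k == n - 1 else start + dt
        groups.append(list(range(start, end)))
    return groups


def merge_group(frames: List[List[List[int]]]) -> List[List[int]]:
    """Merge the frames of one group by pixel-wise summation (assumed rule)."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) for c in range(w)] for r in range(h)]
```

For example, `group_frames(7, 3)` gives `[[0, 1], [2, 3], [4, 5, 6]]`: Δt is 2, and the seventh frame joins the last group.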
Claims (20)
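- A signal processing method, wherein the method comprises: obtaining an image signal and a first event signal of a target scene, wherein the image signal indicates brightness information of a plurality of pixels corresponding to the target scene within an exposure time, the first event signal indicates motion information of the plurality of pixels within the exposure time, and the first event signal is an event signal in a frame format or an event signal in a stream format; performing format conversion on the first event signal in a time dimension and/or a space dimension to obtain a second event signal, wherein the second event signal is an event signal in a frame format, and a resolution of the second event signal is the same as a resolution of the image signal; and fusing the second event signal with the image signal to obtain a fused signal.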
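- The method according to claim 1, wherein the first event signal is an event signal in a frame format and comprises M frames of event signals, the second event signal comprises N frames of event signals, M and N are both integers greater than or equal to 1, and M is greater than or equal to N; and the performing format conversion on the first event signal in a time dimension and/or a space dimension to obtain a second event signal comprises: dividing the M frames of event signals into N groups of event signals according to frame sequence numbers, wherein each of the N groups of event signals comprises at least one frame of event signal with consecutive frame sequence numbers; and performing format conversion in the time dimension and/or the space dimension on each of the N groups of event signals to obtain the N frames of event signals.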
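- The method according to claim 1, wherein the first event signal is an event signal in a stream format and comprises event signals at H moments, the H moments are within the exposure time, the second event signal comprises N frames of event signals, and H and N are both integers greater than or equal to 1; and the performing format conversion on the first event signal in a time dimension and/or a space dimension to obtain a second event signal comprises: dividing the exposure time into N sub-periods, wherein each of the N sub-periods comprises an event signal at at least one of the H moments; and performing format conversion in the time dimension and/or the space dimension on the event signals comprised in each of the N sub-periods to obtain the N frames of event signals.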
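- The method according to any one of claims 1 to 3, wherein a format of the second event signal is any one of an event frame format, a time surface format, and a voxel grid format.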
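- The method according to any one of claims 1 to 4, wherein the second event signal comprises N frames of event signals, and N is an integer greater than or equal to 1; and the fusing the second event signal with the image signal to obtain a fused signal comprises: performing the following operations on each of the N frames of event signals: determining a mask region in one frame of event signal, wherein the mask region indicates a region in which pixels having motion information in the corresponding frame of event signal are located; and fusing pixel values of pixels located within the mask region with pixel values of corresponding pixels in the image signal, and setting pixel values of pixels located outside the mask region to the pixel values of the corresponding pixels in the image signal, to obtain one frame of fused signal.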
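- The method according to any one of claims 1 to 5, wherein the target scene is an autonomous driving scene; and the method further comprises: inputting the fused signal into a neural network model to obtain scene perception information of the autonomous driving scene.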
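- The method according to any one of claims 1 to 6, wherein the method is performed by a cloud server; the obtaining an image signal and a first event signal of a target scene comprises: receiving the image signal and the first event signal of the target scene sent by a signal processing device; and after the fusing the second event signal with the image signal to obtain a fused signal, the method further comprises: sending the fused signal to the signal processing device.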
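- A signal processing apparatus, wherein the apparatus comprises: an obtaining module, configured to obtain an image signal and a first event signal of a target scene, wherein the image signal indicates brightness information of a plurality of pixels corresponding to the target scene within an exposure time, the first event signal indicates motion information of the plurality of pixels within the exposure time, and the first event signal is an event signal in a frame format or an event signal in a stream format; a conversion module, configured to perform format conversion on the first event signal in a time dimension and/or a space dimension to obtain a second event signal, wherein the second event signal is an event signal in a frame format, and a resolution of the second event signal is the same as a resolution of the image signal; and a fusion module, configured to fuse the second event signal with the image signal to obtain a fused signal.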
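- The apparatus according to claim 8, wherein the first event signal is an event signal in a frame format and comprises M frames of event signals, the second event signal comprises N frames of event signals, M and N are both integers greater than or equal to 1, and M is greater than or equal to N; and the conversion module is specifically configured to: divide the M frames of event signals into N groups of event signals according to frame sequence numbers, wherein each of the N groups of event signals comprises at least one frame of event signal with consecutive frame sequence numbers; and perform format conversion in the time dimension and/or the space dimension on each of the N groups of event signals to obtain the N frames of event signals.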
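- The apparatus according to claim 8, wherein the first event signal is an event signal in a stream format and comprises event signals at H moments, the H moments are within the exposure time, the second event signal comprises N frames of event signals, and H and N are both integers greater than or equal to 1; and the conversion module is specifically configured to: divide the exposure time into N sub-periods, wherein each of the N sub-periods comprises an event signal at at least one of the H moments; and perform format conversion in the time dimension and/or the space dimension on the event signals comprised in each of the N sub-periods to obtain the N frames of event signals.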
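- The apparatus according to any one of claims 8 to 10, wherein a format of the second event signal is any one of an event frame format, a time surface format, and a voxel grid format.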
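- The apparatus according to any one of claims 8 to 11, wherein the second event signal comprises N frames of event signals, and N is an integer greater than or equal to 1; and the fusion module is specifically configured to: perform the following operations on each of the N frames of event signals: determining a mask region in one frame of event signal, wherein the mask region indicates a region in which pixels having motion information in the corresponding frame of event signal are located; and fusing pixel values of pixels located within the mask region with pixel values of corresponding pixels in the image signal, and setting pixel values of pixels located outside the mask region to the pixel values of the corresponding pixels in the image signal, to obtain one frame of fused signal.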
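- The apparatus according to any one of claims 8 to 12, wherein the target scene is an autonomous driving scene; and the apparatus further comprises: an input module, configured to input the fused signal into a neural network model to obtain scene perception information of the autonomous driving scene.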
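- A cloud server, wherein the cloud server comprises a communication interface and one or more processors; the communication interface is configured to receive an image signal and a first event signal of a target scene sent by a signal processing device, wherein the image signal indicates brightness information of a plurality of pixels corresponding to the target scene within an exposure time, the first event signal indicates motion information of the plurality of pixels within the exposure time, and the first event signal is an event signal in a frame format or an event signal in a stream format; the one or more processors are configured to perform format conversion on the first event signal in a time dimension and/or a space dimension to obtain a second event signal, wherein the second event signal is an event signal in a frame format, and a resolution of the second event signal is the same as a resolution of the image signal; the one or more processors are configured to fuse the second event signal with the image signal to obtain a fused signal; and the one or more processors are configured to send the fused signal to the signal processing device through the communication interface.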
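- The cloud server according to claim 14, wherein the first event signal is an event signal in a frame format and comprises M frames of event signals, the second event signal comprises N frames of event signals, M and N are both integers greater than or equal to 1, and M is greater than or equal to N; and the one or more processors are specifically configured to: divide the M frames of event signals into N groups of event signals according to frame sequence numbers, wherein each of the N groups of event signals comprises at least one frame of event signal with consecutive frame sequence numbers; and perform format conversion in the time dimension and/or the space dimension on each of the N groups of event signals to obtain the N frames of event signals.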
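- The cloud server according to claim 14, wherein the first event signal is an event signal in a stream format and comprises event signals at H moments, the H moments are within the exposure time, the second event signal comprises N frames of event signals, and H and N are both integers greater than or equal to 1; and the one or more processors are specifically configured to: divide the exposure time into N sub-periods, wherein each of the N sub-periods comprises an event signal at at least one of the H moments; and perform format conversion in the time dimension and/or the space dimension on the event signals comprised in each of the N sub-periods to obtain the N frames of event signals.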
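- The cloud server according to any one of claims 14 to 16, wherein the second event signal comprises N frames of event signals, and N is an integer greater than or equal to 1; and the one or more processors are specifically configured to: perform the following operations on each of the N frames of event signals: determining a mask region in one frame of event signal, wherein the mask region indicates a region in which pixels having motion information in the corresponding frame of event signal are located; and fusing pixel values of pixels located within the mask region with pixel values of corresponding pixels in the image signal, and setting pixel values of pixels located outside the mask region to the pixel values of the corresponding pixels in the image signal, to obtain one frame of fused signal.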
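- A signal processing device, wherein the signal processing device comprises a memory and a processor, the memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory to implement the method according to any one of claims 1 to 6.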
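- A computer-readable storage medium, wherein the storage medium stores instructions, and when the instructions are run on a computer or a processor, the computer or the processor is enabled to perform the steps of the method according to any one of claims 1 to 7.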
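- A computer program, wherein the computer program comprises instructions, and when the instructions are run on a computer, the computer is enabled to perform the method according to any one of claims 1 to 7.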
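The mask-based fusion recited in the claims can be sketched as follows. The claims do not specify how the mask region is determined or how pixel values inside it are fused; this sketch assumes the mask is the set of pixels with a non-zero event value and that fusion inside the mask is a weighted addition controlled by a hypothetical parameter `alpha`. The names `fuse_frame` and `fuse_all` are illustrative only.

```python
from typing import List

Frame = List[List[float]]


def fuse_frame(event_frame: Frame, image: Frame, alpha: float = 0.5) -> Frame:
    """Fuse one event frame with the image signal using a motion mask."""
    fused = []
    for ev_row, img_row in zip(event_frame, image):
        out_row = []
        for ev, px in zip(ev_row, img_row):
            if ev != 0:
                # Inside the mask region (assumed: non-zero event value):
                # blend the event value into the image value (assumed rule).
                out_row.append(px + alpha * ev)
            else:
                # Outside the mask region: keep the image pixel unchanged.
                out_row.append(px)
        fused.append(out_row)
    return fused


def fuse_all(event_frames: List[Frame], image: Frame, alpha: float = 0.5) -> List[Frame]:
    """Apply fuse_frame to each of the N event frames, giving N fused frames."""
    return [fuse_frame(f, image, alpha) for f in event_frames]
```

Because the event frames have the same resolution as the image signal, the two can be traversed pixel by pixel without any resampling at this stage.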
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23876241.3A EP4589549A4 (en) | 2022-10-14 | 2023-06-29 | SIGNAL PROCESSING METHOD AND APPARATUS, DEVICE, RECORDING MEDIUM AND COMPUTER PROGRAM |
| US19/177,148 US20250240538A1 (en) | 2022-10-14 | 2025-04-11 | Signal Processing Method, Apparatus, and Device, Storage Medium, and Computer Program |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211259723.4A CN117893856A (zh) | 2022-10-14 | 2022-10-14 | Signal processing method, apparatus, device, storage medium, and computer program |
| CN202211259723.4 | 2022-10-14 |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US19/177,148 Continuation US20250240538A1 (en) | 2022-10-14 | 2025-04-11 | Signal Processing Method, Apparatus, and Device, Storage Medium, and Computer Program |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024078032A1 (zh) | 2024-04-18 |
Family
ID=90638186
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/103954 Ceased WO2024078032A1 (zh) | 2022-10-14 | 2023-06-29 | Signal processing method, apparatus, device, storage medium, and computer program |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20250240538A1 (zh) |
| EP (1) | EP4589549A4 (zh) |
| CN (1) | CN117893856A (zh) |
| WO (1) | WO2024078032A1 (zh) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20250159371A1 (en) * | 2023-11-09 | 2025-05-15 | Omnivision Technologies, Inc. | Hybrid image sensors with on-chip image deblur |
| CN118494511B (zh) * | 2024-07-17 | 2024-10-11 | BYD Co., Ltd. | Vehicle-mounted data processing method and apparatus, computer device, and storage medium |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
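| CN113724142A (zh) * | 2020-05-26 | 2021-11-30 | Hangzhou Hikvision Digital Technology Co., Ltd. | Image restoration system and method |
| CN113727079A (zh) * | 2020-05-25 | 2021-11-30 | Huawei Technologies Co., Ltd. | Image signal processing method and apparatus, and electronic device |
| US20220222496A1 (en) * | 2021-01-13 | 2022-07-14 | Fotonation Limited | Image processing system |
| CN114972412A (zh) * | 2022-05-12 | 2022-08-30 | Shenzhen Ruishi Zhixin Technology Co., Ltd. | Pose estimation method, apparatus, and system, and readable storage medium |
| CN115564695A (zh) * | 2022-09-29 | 2023-01-03 | Lenovo (Beijing) Co., Ltd. | Processing method and electronic device |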
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
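| CN111247801B (zh) * | 2017-09-28 | 2022-06-14 | Apple Inc. | Systems and methods for event camera data processing |
| KR20230127287A (ko) * | 2020-12-31 | 2023-08-31 | Huawei Technologies Co., Ltd. | Pose estimation method and related apparatus |
| CN112801027B (zh) * | 2021-02-09 | 2024-07-12 | Beijing University of Technology | Vehicle target detection method based on event camera |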
2022
- 2022-10-14 CN: application CN202211259723.4A (CN117893856A), status: Pending
2023
- 2023-06-29 EP: application EP23876241.3A (EP4589549A4), status: Pending
- 2023-06-29 WO: application PCT/CN2023/103954 (WO2024078032A1), status: Ceased
2025
- 2025-04-11 US: application US19/177,148 (US20250240538A1), status: Pending
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
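| CN113727079A (zh) * | 2020-05-25 | 2021-11-30 | Huawei Technologies Co., Ltd. | Image signal processing method and apparatus, and electronic device |
| CN113724142A (zh) * | 2020-05-26 | 2021-11-30 | Hangzhou Hikvision Digital Technology Co., Ltd. | Image restoration system and method |
| US20220222496A1 (en) * | 2021-01-13 | 2022-07-14 | Fotonation Limited | Image processing system |
| CN114972412A (zh) * | 2022-05-12 | 2022-08-30 | Shenzhen Ruishi Zhixin Technology Co., Ltd. | Pose estimation method, apparatus, and system, and readable storage medium |
| CN115564695A (zh) * | 2022-09-29 | 2023-01-03 | Lenovo (Beijing) Co., Ltd. | Processing method and electronic device |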
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4589549A4 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN117893856A (zh) | 2024-04-16 |
| EP4589549A4 (en) | 2025-12-31 |
| EP4589549A1 (en) | 2025-07-23 |
| US20250240538A1 (en) | 2025-07-24 |
Similar Documents
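| Publication | Publication Date | Title |
|---|---|---|
| CN114946169B (zh) | | Image acquisition method and apparatus |
| EP3686845B1 (en) | | Image processing method and device and apparatus |
| EP4036854B1 (en) | | Image processing method and apparatus, and electronic device |
| WO2020192461A1 (zh) | | Time-lapse photography recording method and electronic device |
| US20240119566A1 (en) | | Image processing method and apparatus, and electronic device |
| CN116095476B (zh) | | Camera switching method and apparatus, electronic device, and storage medium |
| US20250240538A1 (en) | | Signal Processing Method, Apparatus, and Device, Storage Medium, and Computer Program |
| CN113052056B (zh) | | Video processing method and apparatus |
| CN113096022A (zh) | | Image blurring processing method and apparatus, storage medium, and electronic device |
| WO2022033344A1 (zh) | | Video stabilization method, terminal device, and computer-readable storage medium |
| CN113744139A (zh) | | Image processing method and apparatus, electronic device, and storage medium |
| CN113468929B (zh) | | Motion state recognition method and apparatus, electronic device, and storage medium |
| WO2024082976A1 (zh) | | OCR recognition method for text images, electronic device, and medium |
| CN117880645A (zh) | | Image processing method and apparatus, electronic device, and storage medium |
| CN112188094B (zh) | | Image processing method and apparatus, computer-readable medium, and terminal device |
| CN116095509B (zh) | | Method and apparatus for generating video frames, electronic device, and storage medium |
| CN108427938A (zh) | | Image processing method and apparatus, storage medium, and electronic device |
| CN117135455A (zh) | | Image processing method and electronic device |
| CN116723264A (zh) | | Method and device for determining target position information, and storage medium |
| CN116055894B (zh) | | Neural-network-based image de-flicker method and apparatus |
| CN116051450B (zh) | | Glare information acquisition method and apparatus, chip, electronic device, and medium |
| WO2024093854A1 (zh) | | Image processing method and electronic device |
| CN117593236A (zh) | | Image display method and apparatus, and terminal device |
| CN119277212B (zh) | | Image processing method and apparatus |
| CN119255090B (zh) | | Image processing method, electronic device, and computer-readable storage medium |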
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23876241; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | WIPO information: entry into national phase | Ref document number: 2023876241; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2023876241; Country of ref document: EP; Effective date: 20250416 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWP | WIPO information: published in national office | Ref document number: 2023876241; Country of ref document: EP |