WO2024055764A1 - Image processing method and device - Google Patents
Image processing method and device
- Publication number
- WO2024055764A1 (application PCT/CN2023/110156)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- isp
- image
- images
- parameters
- original
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/60—Image enhancement or restoration using machine learning, e.g. neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/80—Camera processing pipelines; Components thereof
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Definitions
- Embodiments of the present application relate to the field of image processing, and in particular, to an image processing method and device.
- the terminal device can perform image processing on the images collected by the camera, and apply the image processing results to scenarios such as autonomous driving and security monitoring.
- the firmware (FW) part of the image signal processor (ISP) performs image processing on the images collected by the camera based on fixed algorithm parameters to obtain the processed image.
- image processing methods based on fixed algorithm parameters are relatively simple and cannot adapt to all scenarios.
- Embodiments of the present application provide an image processing method and device.
- the device can generate ISP parameters that satisfy the corresponding scenarios based on the original images obtained in different scenarios to achieve dynamic adjustment of the ISP parameters, thereby performing adaptive image enhancement for different scenarios.
- embodiments of the present application provide an image processing method.
- the method includes: the device acquires a first original image collected by a camera in a first scene.
- the device obtains the first ISP parameter based on the first original image.
- the device performs image processing on the first original image collected by the camera based on the first ISP parameter to obtain the first image.
- the device outputs the first image to the target application, so that the target application performs image task processing on the first image.
- the device obtains the second original image collected by the camera in the second scene.
- the device obtains the second ISP parameter based on the second original image.
- the device performs image processing on the second original image collected by the camera based on the second ISP parameter to obtain the second image.
- the device outputs the second image to the target application, so that the target application performs image task processing on the second image.
- embodiments of the present application can obtain ISP parameters that meet the needs of different scenarios, thereby realizing dynamic adjustment of ISP parameters, and performing image processing on original images collected in different scenarios based on different ISP parameters, thereby improving the accuracy of image processing.
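The per-scene flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `predict_isp_params`, `isp_process` and `target_application` are hypothetical stand-ins, and a single gain value plays the role of the ISP parameter.

```python
import numpy as np

def predict_isp_params(raw):
    """Hypothetical stand-in for ISP parameter prediction:
    choose a gain that moves the mean brightness toward 0.5."""
    mean = float(raw.mean())
    return {"gain": 0.5 / max(mean, 1e-6)}

def isp_process(raw, params):
    """Hypothetical stand-in for the ISP: apply the gain and clip to [0, 1]."""
    return np.clip(raw * params["gain"], 0.0, 1.0)

def target_application(image):
    """Hypothetical target application: report the mean brightness it receives."""
    return float(image.mean())

# Two scenes with different image attributes (dark vs. bright).
first_raw = np.full((4, 4), 0.1)   # e.g. a night scene
second_raw = np.full((4, 4), 0.8)  # e.g. a daytime scene

results = []
for raw in (first_raw, second_raw):
    params = predict_isp_params(raw)   # obtain the ISP parameter from the original image
    image = isp_process(raw, params)   # process the original image with that parameter
    results.append((params["gain"], target_application(image)))
```

In this toy setup the dark scene receives a larger gain than the bright one, so both processed images reach the target application with a similar brightness.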
- the ISP parameters in the embodiments of the present application are automatically obtained, that is, the corresponding ISP parameters can be predicted based on the original images obtained in different scenarios.
- the device can periodically acquire ISP parameters to reduce the computing pressure of the device.
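One way to read the periodic acquisition is that parameters are re-predicted only every few frames and reused in between. The sketch below assumes this interpretation; the function `run_stream`, the `period` value and the scalar "frames" are hypothetical.

```python
def run_stream(frames, predict, period):
    """Re-predict ISP parameters only every `period` frames; reuse them otherwise."""
    params = None
    used = []
    calls = 0
    for index, frame in enumerate(frames):
        if index % period == 0:
            params = predict(frame)   # the prediction model runs only on these frames
            calls += 1
        used.append(params)
    return used, calls

# Each "frame" is reduced to its mean brightness for brevity.
frames = [0.1, 0.1, 0.1, 0.8, 0.8, 0.8]
used, calls = run_stream(frames, predict=lambda f: round(0.5 / f, 3), period=3)
```

With six frames and a period of three, the predictor is invoked twice instead of six times, which is the source of the reduced computing load.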
- the target application may be a sensing application, a display application, etc., which is not limited in this application.
- image processing tasks for computer vision (CV) perception applications may include but are not limited to 2D/3D target detection, lane line detection, scene semantic segmentation, etc., which are not limited in this application.
- the image content of the first original image and the second original image are different, and/or the image attributes of the first original image and the second original image are different.
- the contents and/or attributes of images taken in different scenes are different. This application can predict the corresponding ISP parameters based on different original images to obtain better image processing results, thereby adapting to the requirements of different scenes.
- the first ISP parameter and the second ISP parameter are used to adjust at least one of the following image attributes of the image: brightness, color, noise, sharpness, and contrast.
- different ISP parameters can be used to adjust different image attributes to obtain better image effects.
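As a rough illustration of how parameters map to image attributes, the following sketch applies brightness, contrast and color adjustments to a float image. The parameter names and formulas are illustrative assumptions, not taken from the application or from any real ISP.

```python
import numpy as np

def apply_isp_params(image, brightness=0.0, contrast=1.0, color_gains=(1.0, 1.0, 1.0)):
    """Toy ISP stage on an H x W x 3 float image in [0, 1].
    The parameter names are illustrative, not taken from any real ISP."""
    out = image * np.asarray(color_gains)   # color: per-channel gain
    out = (out - 0.5) * contrast + 0.5      # contrast: stretch around mid-gray
    out = out + brightness                  # brightness: constant offset
    return np.clip(out, 0.0, 1.0)

image = np.full((2, 2, 3), 0.25)
brighter = apply_isp_params(image, brightness=0.2)
punchier = apply_isp_params(image, contrast=2.0)
```

Noise and sharpness would need spatial filters and are left out to keep the sketch short.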
- the image attributes of the first image and the second image meet the image task requirements of the target application.
- the image processing in this application is all to meet the needs of the target application, so that the target application can better recognize the processed image, and improve the accuracy of the image recognition (or other image processing processes) of the target application.
- obtaining the first ISP parameter based on the first original image includes: inputting the first original image to the ISP parameter prediction model, and obtaining the first ISP parameter predicted by the ISP parameter prediction model based on the first original image; obtaining the second ISP parameter based on the second original image includes: inputting the second original image to the ISP parameter prediction model, and obtaining the second ISP parameter predicted by the ISP parameter prediction model based on the second original image.
- the ISP parameter prediction model runs on the neural network processor NPU.
- the prediction model in the embodiment of the present application runs on the general computing power of the NPU and requires only a small amount of calculation.
- this application can be deployed on edge devices without occupying high-priority tasks on the AI processor.
- before obtaining the first original image collected by the camera in the first scene, the method further includes: obtaining the ISP parameter prediction model from the cloud.
- the solution only requires that the ISP system of a chip already on sale supports dynamic rewriting of parameters.
- For products already in circulation on the market, support can be provided through over-the-air download and update (that is, obtaining the model from the cloud and installing it), which meets compatibility needs.
- performing image processing on the first original image based on the first ISP parameter to obtain the first image includes: replacing the ISP parameter currently saved in the ISP memory with the first ISP parameter, so that the ISP obtains the first ISP parameter from the ISP memory, performs image processing on the first original image based on the first ISP parameter, and outputs the first image; performing image processing on the second original image based on the second ISP parameter to obtain the second image includes: replacing the first ISP parameter currently saved in the ISP memory with the second ISP parameter, so that the ISP obtains the second ISP parameter from the ISP memory, performs image processing on the second original image based on the second ISP parameter, and outputs the second image.
- In this way, in the embodiment of the present application, the traditional ISP processing algorithm is still used on the image, which can ensure the controllability and interpretability of the entire processing process.
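The parameter replacement can be pictured as overwriting a small store that the ISP reads before processing each frame. In the sketch below, `IspMemory`, `Isp` and the single `gain` parameter are hypothetical; the point is only that the processing algorithm stays fixed while its parameters are swapped.

```python
class IspMemory:
    """Hypothetical parameter store that the ISP reads before each frame."""
    def __init__(self, params):
        self._params = dict(params)

    def replace(self, params):
        self._params = dict(params)   # overwrite the currently saved parameters

    def read(self):
        return dict(self._params)

class Isp:
    """Hypothetical ISP whose algorithm is fixed; only its parameters change."""
    def __init__(self, memory):
        self.memory = memory

    def process(self, raw_pixels):
        gain = self.memory.read()["gain"]
        return [min(1.0, p * gain) for p in raw_pixels]

memory = IspMemory({"gain": 1.0})
isp = Isp(memory)

memory.replace({"gain": 4.0})          # first ISP parameter, for the first scene
first_image = isp.process([0.1, 0.2])

memory.replace({"gain": 0.5})          # second ISP parameter replaces the first
second_image = isp.process([0.8, 1.0])
```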
- embodiments of the present application provide a model training method.
- the method includes: inputting N original images to an image signal processor (ISP) parameter prediction model, and obtaining N ISP parameters output by the ISP parameter prediction model, where the weight of the ISP parameter prediction model is the first weight, the N ISP parameters are obtained by the ISP parameter prediction model based on the first weight, and N is an integer greater than 1.
- Input N ISP parameters and N original images to the proxy model and obtain N images output by the proxy model.
- the proxy network can realize the function of ISP, and realize the end-to-end loop training process between the ISP parameter prediction model and the target application through joint training of the differentiable proxy network and the FW parameter prediction network.
- the model training method can be executed on a computing node.
- the preset condition includes: the difference between the preset ground truth labels of the N original images and the ground truth labels included in the image task processing result of the target application is less than a preset threshold.
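The training loop can be sketched with toy components, under loudly stated assumptions: the prediction model has one scalar weight, the proxy model simply multiplies each image by its gain, and the target application's "result" is the mean brightness of each image. None of these choices comes from the application; they only make the loop small enough to follow.

```python
import numpy as np

def predictor(raws, weight):
    """Toy ISP parameter prediction model: one gain per original image."""
    return weight / raws.mean(axis=(1, 2))

def proxy(raws, gains):
    """Toy differentiable proxy for the ISP: multiply each image by its gain."""
    return raws * gains[:, None, None]

def target_application(images):
    """Toy target application: its result is the mean brightness of each image."""
    return images.mean(axis=(1, 2))

def train(raws, labels, weight=0.1, lr=0.25, threshold=1e-3, max_steps=200):
    for step in range(max_steps):
        gains = predictor(raws, weight)            # N ISP parameters from the current weight
        images = proxy(raws, gains)                # N images output by the proxy model
        error = target_application(images) - labels
        if np.abs(error).max() < threshold:        # preset condition met: stop
            return weight, step
        # For this toy model d(result_i)/d(weight) = 1, so the gradient of
        # mean(error**2) with respect to the weight is 2 * mean(error).
        weight = weight - lr * 2.0 * error.mean()  # adjust to the next weight
    return weight, max_steps

raws = np.stack([np.full((4, 4), 0.1), np.full((4, 4), 0.8)])  # N = 2 original images
labels = np.array([0.5, 0.5])                                  # preset ground-truth labels
weight, steps = train(raws, labels)
```

A real system would replace the hand-derived gradient with automatic differentiation through the proxy network and the target application's loss.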
- before inputting N original images to the ISP parameter prediction model, the method also includes: inputting M original images and M ISP parameters to the ISP, and obtaining M images output by the ISP, where the M original images and the M ISP parameters correspond one to one, and M is an integer greater than 1; inputting the M original images and the M ISP parameters to the proxy model, and obtaining M images output by the proxy model, where the current weight of the proxy model is the third weight; if the similarity between the M images output by the proxy model and the M images output by the ISP is less than the threshold, adjusting the weight of the proxy model to the fourth weight, and starting again from the step of inputting M original images and M ISP parameters to the ISP.
- the processing results of the proxy network can be made close to or the same as the processing results of the ISP, so that the proxy network can realize the function of the ISP during the joint training process.
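Fitting the proxy to the ISP can be sketched as a loop that compares the two outputs and adjusts the proxy weight until a similarity score passes a threshold. The black-box `real_isp`, the one-weight `proxy` and the similarity measure below are all illustrative assumptions.

```python
import numpy as np

def real_isp(raws, gains):
    """Stand-in for the hardware ISP, treated as a black box."""
    return np.clip(raws * gains[:, None, None], 0.0, 1.0)

def proxy(raws, gains, weight):
    """Toy proxy model with a single trainable weight."""
    return weight * raws * gains[:, None, None]

def similarity(a, b):
    """Illustrative similarity score: 1 minus the mean absolute difference."""
    return 1.0 - float(np.abs(a - b).mean())

def fit_proxy(raws, gains, weight=0.2, lr=0.5, threshold=0.999, max_steps=500):
    reference = real_isp(raws, gains)             # M images output by the ISP
    x = raws * gains[:, None, None]
    for step in range(max_steps):
        output = proxy(raws, gains, weight)       # M images output by the proxy
        if similarity(output, reference) >= threshold:
            return weight, step
        grad = 2.0 * np.mean((output - reference) * x)  # d/d(weight) of the MSE
        weight = weight - lr * grad               # adjust to the next weight
    return weight, max_steps

rng = np.random.default_rng(0)
raws = rng.uniform(0.0, 0.5, size=(8, 4, 4))      # M = 8 original images
gains = rng.uniform(0.5, 2.0, size=8)             # M ISP parameters, one per image
weight, steps = fit_proxy(raws, gains)
```

The inputs are chosen so that the clip in `real_isp` never activates, which lets the one-weight proxy match it closely; a real proxy network would need far more capacity to imitate a full ISP.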
- embodiments of the present application provide an image processing device.
- the device includes: an ISP parameter prediction module and an ISP module.
- the ISP parameter prediction module is used to obtain the first original image collected by the camera in the first scene; the ISP parameter prediction module is also used to obtain the first ISP parameter based on the first original image; the ISP module is used to perform image processing on the first original image based on the first ISP parameter to obtain the first image; the ISP module is also used to output the first image to the target application, so that the target application performs image task processing on the first image;
- the ISP parameter prediction module is also used to obtain the second original image collected by the camera in the second scene; the ISP parameter prediction module is also used to obtain the second ISP parameter based on the second original image; the ISP module is also used to perform image processing on the second original image based on the second ISP parameter to obtain the second image; the ISP module is also used to output the second image to the target application, so that the target application performs image task processing on the second image.
- the image content of the first original image and the second original image are different, and/or the image attributes of the first original image and the second original image are different.
- the first ISP parameter and the second ISP parameter are used to adjust at least one of the following image attributes of the image: brightness, color, noise, sharpness, and contrast.
- the image attributes of the first image and the second image meet the image task requirements of the target application.
- the ISP parameter prediction module is used to replace the ISP parameter currently saved in the ISP memory with the first ISP parameter; the ISP module is used to obtain the first ISP parameter from the ISP memory, perform image processing on the first original image based on the first ISP parameter, and output the first image; the ISP parameter prediction module is used to replace the first ISP parameter currently saved in the ISP memory with the second ISP parameter; the ISP module is used to obtain the second ISP parameter from the ISP memory, perform image processing on the second original image based on the second ISP parameter, and output the second image.
- embodiments of the present application provide a model training system, including an image signal processor ISP parameter prediction model, a proxy model and a target application.
- the system inputs N original images to the ISP parameter prediction model and obtains N ISP parameters output by the ISP parameter prediction model.
- the weight of the ISP parameter prediction model is the first weight, the N ISP parameters are obtained by the ISP parameter prediction model based on the first weight, and N is an integer greater than 1.
- the system inputs N ISP parameters and N original images to the proxy model and obtains N images output by the proxy model.
- the system inputs N images to the target application.
- the system adjusts the weight of the ISP parameter prediction model to the second weight, and starts again from the step of inputting N original images to the ISP parameter prediction model, until the N images output by the proxy model meet the image task requirements of the target application.
- the system also includes an ISP; before inputting N original images to the ISP parameter prediction model, the system inputs M original images and M ISP parameters to the ISP to obtain M images output by the ISP, where the M original images and the M ISP parameters correspond one to one, and M is an integer greater than 1; inputs the M original images and the M ISP parameters to the proxy model, and obtains M images output by the proxy model, where the current weight of the proxy model is the third weight; if the similarity between the M images output by the proxy model and the M images output by the ISP is less than the threshold, adjusts the weight of the proxy model to the fourth weight, where the fourth weight is different from the third weight.
- the execution starts again from inputting M original images and M ISP parameters to the ISP and obtaining M images output by the ISP, until the similarity between the M images output by the proxy model and the M images output by the ISP is greater than the threshold.
- embodiments of the present application provide an electronic device, including: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored on the memory, and when the computer programs are executed by the one or more processors, the electronic device is caused to execute the method in the first aspect or any possible implementation of the first aspect.
- embodiments of the present application provide a camera module, including: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory, and when the computer programs are executed by the one or more processors, the camera module is caused to execute the method in the first aspect or any possible implementation of the first aspect.
- embodiments of the present application provide a computer-readable medium for storing a computer program, where the computer program includes instructions for executing the method in the first aspect or any possible implementation of the first aspect.
- embodiments of the present application provide a computer program, which includes instructions for executing the method in the first aspect or any possible implementation of the first aspect.
- embodiments of the present application provide a chip, which includes a processing circuit and transceiver pins.
- the transceiver pins and the processing circuit communicate with each other through an internal connection path, and the processing circuit executes the method in the first aspect or any possible implementation of the first aspect to control the receiving pin to receive signals and to control the sending pin to send signals.
- Figure 1 is a schematic diagram of the hardware structure of electronic equipment
- Figure 2 is a schematic diagram of the hardware structure of the electronic device
- Figure 3 is a schematic flowchart of an exemplary image processing method
- Figure 4 is a schematic flow chart of an exemplary image processing method
- Figure 5 is a schematic flowchart of an exemplary image processing method
- Figure 6 is a schematic structural diagram of an exemplary proxy network
- Figure 7 is a schematic diagram of an exemplary proxy network training process
- Figure 8 is a schematic diagram of the training process of the FW parameter prediction network
- Figure 9 is a schematic diagram of the system structure of FW parameter prediction network training exemplarily shown.
- Figure 10 is a schematic diagram of an exemplary image processing flow
- Figure 11 is a schematic structural diagram of an exemplary device.
- A and/or B can mean three situations: A exists alone, A and B exist simultaneously, or B exists alone.
- first and second in the description and claims of the embodiments of this application are used to distinguish different objects, rather than to describe a specific order of objects.
- first target object, the second target object, etc. are used to distinguish different target objects, rather than to describe a specific order of the target objects.
- multiple processing units refer to two or more processing units; multiple systems refer to two or more systems.
- the image processing method in the embodiment of this application is applied to electronic devices, where the electronic device can be a mobile phone, a tablet, a vehicle-mounted device, a security device, a computer, a wearable device, a smart home device, etc., which is not limited by this application.
- FIG. 1 shows a schematic structural diagram of an electronic device 100 .
- the electronic device 100 shown in FIG. 1 is only an example of an electronic device, and the electronic device 100 may have more or fewer components than shown in the figure, may combine two or more components, or may have different component configurations.
- the various components shown in Figure 1 may be implemented in hardware, software, or a combination of hardware and software including one or more signal processing and/or application specific integrated circuits.
- in the embodiment of this application, a mobile phone is taken as an example of the electronic device. In other embodiments, the electronic device may be any device described above, which is not limited in this application.
- the electronic device 100 may include: a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, and a subscriber identification module (SIM) card interface 195, etc.
- the sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
- the processor 110 may include one or more processing units.
- the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
- different processing units can be independent devices or integrated in one or more processors.
- the controller may be the nerve center and command center of the electronic device 100 .
- the controller can generate operation control signals based on the instruction operation code and timing signals to complete the control of fetching and executing instructions.
- the processor 110 may also be provided with a memory for storing instructions and data.
- the memory in the processor 110 is a cache memory. This memory may hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs to use the instructions or data again, it can fetch them directly from this memory. This avoids repeated access and reduces the waiting time of the processor 110, thus improving the efficiency of the system.
- processor 110 may include one or more interfaces.
- Interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
- the I2C interface is a bidirectional synchronous serial bus, including a serial data line (SDA) and a serial clock line (SCL).
- processor 110 may include multiple sets of I2C buses.
- the processor 110 can separately couple the touch sensor 180K, charger, flash, camera 193, etc. through different I2C bus interfaces.
- the processor 110 can be coupled to the touch sensor 180K through an I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface to implement the touch function of the electronic device 100 .
- the I2S interface can be used for audio communication.
- processor 110 may include multiple sets of I2S buses.
- the processor 110 can be coupled with the audio module 170 through the I2S bus to implement communication between the processor 110 and the audio module 170 .
- the audio module 170 can transmit audio signals to the wireless communication module 160 through the I2S interface to implement the function of answering calls through a Bluetooth headset.
- the PCM interface can also be used for audio communications to sample, quantize and encode analog signals.
- the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface. In some embodiments, the audio module 170 can also transmit audio signals to the wireless communication module 160 through the PCM interface to implement the function of answering calls through a Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
- the UART interface is a universal serial data bus used for asynchronous communication.
- the bus can be a bidirectional communication bus. It converts the data to be transmitted between serial communication and parallel communication.
- a UART interface is generally used to connect the processor 110 and the wireless communication module 160 .
- the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to implement the Bluetooth function.
- the audio module 170 can transmit audio signals to the wireless communication module 160 through the UART interface to implement the function of playing music through a Bluetooth headset.
- the MIPI interface can be used to connect the processor 110 with peripheral devices such as the display screen 194 and the camera 193 .
- MIPI interfaces include camera serial interface (CSI), display serial interface (DSI), etc.
- the processor 110 and the camera 193 communicate through the CSI interface to implement the shooting function of the electronic device 100 .
- the processor 110 and the display screen 194 communicate through the DSI interface to implement the display function of the electronic device 100 .
- the GPIO interface can be configured through software.
- the GPIO interface can be configured as a control signal or as a data signal.
- the GPIO interface can be used to connect the processor 110 with the camera 193, display screen 194, wireless communication module 160, audio module 170, sensor module 180, etc.
- the GPIO interface can also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, etc.
- the USB interface 130 is an interface that complies with the USB standard specification, and may be a Mini USB interface, a Micro USB interface, a USB Type C interface, etc.
- the USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transmit data between the electronic device 100 and peripheral devices. It can also be used to connect headphones to play audio through them. This interface can also be used to connect other electronic devices, such as AR devices, etc.
- the interface connection relationships between the modules illustrated in the embodiments of the present application are only schematic illustrations and do not constitute a structural limitation of the electronic device 100 .
- the electronic device 100 may also adopt different interface connection methods in the above embodiments, or a combination of multiple interface connection methods.
- the charging management module 140 is used to receive charging input from the charger.
- the charger can be a wireless charger or a wired charger.
- the charging management module 140 may receive charging input from the wired charger through the USB interface 130 .
- the charging management module 140 may receive wireless charging input through the wireless charging coil of the electronic device 100 . While the charging management module 140 charges the battery 142, it can also provide power to the electronic device through the power management module 141.
- the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
- the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, internal memory 121, external memory, display screen 194, camera 193, wireless communication module 160, etc.
- the power management module 141 can also be used to monitor battery capacity, battery cycle times, battery health status (leakage, impedance) and other parameters.
- the power management module 141 may also be provided in the processor 110 .
- the power management module 141 and the charging management module 140 may also be provided in the same device.
- the wireless communication function of the electronic device 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
- Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
- Each antenna in electronic device 100 may be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization. For example: Antenna 1 can be reused as a diversity antenna for a wireless LAN. In other embodiments, antennas may be used in conjunction with tuning switches.
- the mobile communication module 150 can provide solutions for wireless communication including 2G/3G/4G/5G applied on the electronic device 100 .
- the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc.
- the mobile communication module 150 can receive electromagnetic waves through the antenna 1, perform filtering, amplification and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation.
- the mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves through the antenna 1 for radiation.
- at least part of the functional modules of the mobile communication module 150 may be disposed in the processor 110 .
- at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
- a modem processor may include a modulator and a demodulator.
- the modulator is used to modulate the low-frequency baseband signal to be sent into a medium-high frequency signal.
- the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal.
- the demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
- the application processor outputs sound signals through audio devices (not limited to speaker 170A, receiver 170B, etc.), or displays images or videos through display screen 194.
- the modem processor may be a stand-alone device.
- the modem processor may be independent of the processor 110 and may be provided in the same device as the mobile communication module 150 or other functional modules.
- the wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) technology, etc.
- the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
- the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
- the wireless communication module 160 can also receive the signal to be sent from the processor 110, frequency modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
- the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
- the wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
- the GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the Beidou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS) and/or satellite based augmentation systems (SBAS).
- the electronic device 100 implements display functions through a GPU, a display screen 194, an application processor, and the like.
- the GPU is an image processing microprocessor and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
- Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
- the display screen 194 is used to display images, videos, etc.
- Display 194 includes a display panel.
- the display panel can use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), Mini LED, Micro LED, Micro OLED, quantum dot light-emitting diodes (QLED), etc.
- the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
- the electronic device 100 can implement the shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
- the ISP is used to process the data fed back by the camera 193. For example, when taking a photo, the shutter is opened, the light is transmitted to the camera sensor through the lens, the optical signal is converted into an electrical signal, and the electrical signal is processed by the ISP and converted into an image visible to the naked eye.
- ISP can also perform algorithm optimization on image noise, brightness, white balance, contrast and other attributes.
- the ISP may be provided in the camera 193.
- the optimized attributes of ISPs of different manufacturers and models can be set according to actual needs. For example, the ISP of electronic device A can optimize the noise and brightness of the image.
- the ISP of electronic device B can optimize the brightness and contrast of the image, which is not limited by this application.
- Camera 193 is used to capture still images or video.
- the object passes through the lens to produce an optical image that is projected onto the photosensitive element.
- the photosensitive element can be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
- the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
- the image captured by the camera 193 may be called an original image (or simply a Raw image), and may be understood as the original image file.
- ISP can perform image processing on Raw images and output RGB images or YUV images.
- RGB and YUV are both color spaces, used to represent colors.
- "Y" in YUV represents brightness, that is, the grayscale value, while "U" and "V" represent image color and saturation, which are used to specify the color of pixels.
- RGB color mode is a color standard in the industry. It obtains a variety of colors by varying the three color channels of red (R), green (G), and blue (B) and superimposing them on each other.
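To make the relationship between the two color spaces concrete, the following minimal sketch converts one normalized RGB pixel to YUV. The BT.601 weights and the function name `rgb_to_yuv` are assumptions for illustration; this application does not specify which conversion the ISP uses.

```python
def rgb_to_yuv(r, g, b):
    """Convert one normalized RGB pixel (each channel in [0, 1]) to YUV.

    Assumption: analog BT.601 weights; an actual ISP may use another matrix.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b  # brightness (grayscale value)
    u = 0.492 * (b - y)                    # blue-difference color component
    v = 0.877 * (r - y)                    # red-difference color component
    return y, u, v
```

With these weights, a gray pixel (equal R, G and B) has U and V close to zero, which matches the statement that Y alone carries the grayscale value.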
- Video codecs are used to compress or decompress digital video.
- Electronic device 100 may support one or more video codecs. In this way, the electronic device 100 can play or record videos in multiple encoding formats, such as moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, etc.
- NPU is a neural network (NN) computing processor.
- Intelligent cognitive applications of the electronic device 100 can be implemented through the NPU, such as image recognition, face recognition, speech recognition, text understanding, etc.
- the NPU may be used to support the FW parameter prediction network in the FW parameter setting module, that is, the NPU executes the steps performed by the FW parameter prediction network in the FW parameter setting module.
- the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
- the external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example, saving files such as music and videos in the external memory card.
- Internal memory 121 may be used to store computer executable program code, which includes instructions.
- the processor 110 executes instructions stored in the internal memory 121 to execute various functional applications and data processing of the electronic device 100 .
- the internal memory 121 may include a program storage area and a data storage area. The program storage area can store an operating system and at least one application program required for a function (such as a sound playback function, an image playback function, etc.).
- the data storage area may store data created during use of the electronic device 100 (such as audio data, phone book, etc.).
- the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, universal flash storage (UFS), etc.
- the electronic device 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor.
- the audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signals. Audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be provided in the processor 110 , or some functional modules of the audio module 170 may be provided in the processor 110 .
- Speaker 170A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals.
- the user can listen to music or to hands-free calls through the speaker 170A of the electronic device 100.
- Receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
- When the electronic device 100 answers a call or plays a voice message, the voice can be heard by bringing the receiver 170B close to the human ear.
- Microphone 170C is used to convert sound signals into electrical signals. When making a call or sending a voice message, the user can speak with the mouth close to the microphone 170C to input the sound signal into the microphone 170C.
- the electronic device 100 may be provided with at least one microphone 170C. In other embodiments, the electronic device 100 may be provided with two microphones 170C, which in addition to collecting sound signals, may also implement a noise reduction function. In other embodiments, the electronic device 100 can also be provided with three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and implement directional recording functions, etc.
- the headphone interface 170D is used to connect wired headphones.
- the headphone interface 170D may be a USB interface 130, or may be a 3.5mm open mobile terminal platform (OMTP) standard interface, or a Cellular Telecommunications Industry Association of the USA (CTIA) standard interface.
- the pressure sensor 180A is used to sense pressure signals and can convert the pressure signals into electrical signals.
- pressure sensor 180A may be disposed on display screen 194 .
- there are many types of pressure sensors 180A, such as resistive pressure sensors, inductive pressure sensors, capacitive pressure sensors, etc.
- a capacitive pressure sensor may include at least two parallel plates of conductive material.
- the electronic device 100 determines the intensity of the pressure based on the change in capacitance.
- the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
- the electronic device 100 may also calculate the touched position based on the detection signal of the pressure sensor 180A.
- touch operations with different touch operation intensities can correspond to different operation instructions. For example: when a touch operation with a touch operation intensity less than the first pressure threshold is applied to the short message application icon, an instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold is applied to the short message application icon, an instruction to create a new short message is executed.
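The threshold rule in the example above can be sketched as a small dispatch function. The function name, the instruction strings and the numeric scale of the intensity are hypothetical; only the comparison against the first pressure threshold comes from the description.

```python
def instruction_for_touch(intensity, first_pressure_threshold):
    """Pick the instruction for a touch on the short message application icon.

    Below the threshold the message is viewed; at or above it, a new
    message is created.
    """
    if intensity < first_pressure_threshold:
        return "view_short_message"
    return "create_short_message"
```

A touch whose intensity exactly equals the threshold falls into the "create" branch, as the description uses "greater than or equal to".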
- the gyro sensor 180B may be used to determine the motion posture of the electronic device 100 .
- the angular velocity of the electronic device 100 about three axes (i.e., the x, y, and z axes) may be determined through the gyro sensor 180B.
- the gyro sensor 180B can be used for image stabilization. For example, when the shutter is pressed, the gyro sensor 180B detects the angle at which the electronic device 100 shakes, calculates the distance that the lens module needs to compensate based on the angle, and allows the lens to offset the shake of the electronic device 100 through reverse movement to achieve anti-shake.
- the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
- Air pressure sensor 180C is used to measure air pressure. In some embodiments, the electronic device 100 calculates the altitude through the air pressure value measured by the air pressure sensor 180C to assist positioning and navigation.
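This application does not state how the altitude is derived from the air pressure value. As one possible illustration, the sketch below assumes the international barometric formula for the standard atmosphere; the function name and the default sea-level pressure are assumptions.

```python
def altitude_from_pressure(pressure_pa, sea_level_pa=101325.0):
    """Estimate altitude in meters from air pressure in pascals.

    Assumption: standard-atmosphere barometric formula; a real device may
    calibrate the sea-level reference or use a different model.
    """
    return 44330.0 * (1.0 - (pressure_pa / sea_level_pa) ** (1.0 / 5.255))
```

Lower measured pressure yields a higher estimated altitude, which is what allows the reading to assist positioning and navigation.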
- Magnetic sensor 180D includes a Hall sensor.
- the electronic device 100 may utilize the magnetic sensor 180D to detect opening and closing of the flip holster.
- the electronic device 100 may detect the opening and closing of the flip according to the magnetic sensor 180D. Then, based on the detected opening and closing status of the leather case or the opening and closing status of the flip cover, features such as automatic unlocking of the flip cover are set.
- the acceleration sensor 180E can detect the acceleration of the electronic device 100 in various directions (generally three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of electronic devices and be used in horizontal and vertical screen switching, pedometer and other applications.
- Distance sensor 180F is used to measure distance.
- Electronic device 100 can measure distance via infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 may utilize the distance sensor 180F to measure distance to achieve fast focusing.
- Proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode.
- the light emitting diode may be an infrared light emitting diode.
- the electronic device 100 emits infrared light outwardly through the light emitting diode.
- Electronic device 100 uses photodiodes to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 100 . When insufficient reflected light is detected, the electronic device 100 may determine that there is no object near the electronic device 100 .
- the electronic device 100 can use the proximity light sensor 180G to detect when the user holds the electronic device 100 close to the ear for talking, so as to automatically turn off the screen to save power.
- the proximity light sensor 180G can also be used in holster mode and pocket mode to automatically unlock and lock the screen.
- the ambient light sensor 180L is used to sense ambient light brightness.
- the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
- the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
- the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in the pocket to prevent accidental touching.
- Fingerprint sensor 180H is used to collect fingerprints.
- the electronic device 100 can use the collected fingerprint characteristics to achieve fingerprint unlocking, access to application locks, fingerprint photography, fingerprint answering of incoming calls, etc.
- Temperature sensor 180J is used to detect temperature.
- the electronic device 100 utilizes the temperature detected by the temperature sensor 180J to execute the temperature processing strategy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the electronic device 100 reduces the performance of a processor located near the temperature sensor 180J in order to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to prevent the low temperature from causing the electronic device 100 to shut down abnormally. In some other embodiments, when the temperature is lower than another threshold, the electronic device 100 performs boosting on the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
- Touch sensor 180K is also called a "touch panel".
- the touch sensor 180K can be disposed on the display screen 194.
- the touch sensor 180K and the display screen 194 together form a touch screen.
- the touch sensor 180K is used to detect a touch operation on or near the touch sensor 180K.
- the touch sensor can pass the detected touch operation to the application processor to determine the touch event type.
- Visual output related to the touch operation may be provided through display screen 194 .
- the touch sensor 180K may also be disposed on the surface of the electronic device 100 at a location different from that of the display screen 194 .
- Bone conduction sensor 180M can acquire vibration signals.
- the bone conduction sensor 180M can acquire the vibration signal of the bone that vibrates when the human body's vocal part produces sound.
- the bone conduction sensor 180M can also contact the human body's pulse and receive blood pressure beating signals.
- the bone conduction sensor 180M can also be provided in an earphone and combined into a bone conduction earphone.
- the audio module 170 can analyze the voice signal based on the vibration signal of the vocal vibrating bone obtained by the bone conduction sensor 180M to implement the voice function.
- the application processor can analyze the heart rate information based on the blood pressure beating signal acquired by the bone conduction sensor 180M to implement the heart rate detection function.
- the buttons 190 include a power button, a volume button, etc.
- the button 190 may be a mechanical button or a touch button.
- the electronic device 100 may receive key inputs and generate key signal inputs related to user settings and function control of the electronic device 100 .
- the motor 191 can generate vibration prompts.
- the motor 191 can be used for vibration prompts for incoming calls and can also be used for touch vibration feedback.
- touch operations acting on different applications can correspond to different vibration feedback effects.
- For touch operations in different areas of the display screen 194, the motor 191 can also produce different vibration feedback effects.
- Different application scenarios (such as time reminders, receiving information, alarm clocks, games, etc.) can also correspond to different vibration feedback effects.
- the touch vibration feedback effect can also be customized.
- the indicator 192 may be an indicator light, which may be used to indicate charging status, power changes, or may be used to indicate messages, missed calls, notifications, etc.
- the SIM card interface 195 is used to connect a SIM card.
- the SIM card can be connected to or separated from the electronic device 100 by inserting it into the SIM card interface 195 or pulling it out from the SIM card interface 195 .
- the electronic device 100 can support 1 or N SIM card interfaces, where N is a positive integer greater than 1.
- SIM card interface 195 can support Nano SIM card, Micro SIM card, SIM card, etc. Multiple cards can be inserted into the same SIM card interface 195 at the same time. The types of the plurality of cards may be the same or different.
- the SIM card interface 195 is also compatible with different types of SIM cards.
- the SIM card interface 195 is also compatible with external memory cards.
- the electronic device 100 interacts with the network through the SIM card to implement functions such as calls and data communications.
- the electronic device 100 uses an eSIM, that is, an embedded SIM card.
- the eSIM card can be embedded in the electronic device 100 and cannot be separated from the electronic device 100 .
- the software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
- the embodiment of this application takes the Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100 .
- the system may be a Linux system, which is not limited in this application.
- FIG. 2 is a software structure block diagram of the electronic device 100 according to the embodiment of the present application.
- the layered architecture of the electronic device 100 divides the software into several layers, and each layer has clear roles and division of labor.
- the layers communicate through software interfaces.
- the Android system is divided into four layers, from top to bottom: application layer, application framework layer, Android runtime and system libraries, and kernel layer.
- the application layer can include a series of application packages.
- the application package can include camera, gallery, calendar, calling, map, navigation, WLAN, Bluetooth, music, video, short message and other applications.
- the application framework layer provides an application programming interface (API) and programming framework for applications in the application layer.
- the application framework layer includes some predefined functions.
- the application framework layer can include window manager, content provider, view system, phone manager, resource manager, notification manager, FW parameter setting module, etc.
- a window manager is used to manage window programs.
- the window manager can obtain the display size, determine whether there is a status bar, lock the screen, capture the screen, etc.
- Content providers are used to store and retrieve data and make this data accessible to applications.
- Said data can include videos, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
- the view system includes visual controls, such as controls that display text, controls that display pictures, etc.
- a view system can be used to build applications.
- the display interface can be composed of one or more views.
- a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
- the phone manager is used to provide communication functions of the electronic device 100, for example, call status management (including connected, hung up, etc.).
- the resource manager provides various resources to applications, such as localized strings, icons, pictures, layout files, video files, etc.
- the notification manager allows applications to display notification information in the status bar, which can be used to convey notification-type messages and can automatically disappear after a short stay without user interaction.
- the notification manager is used to notify download completion, message reminders, etc.
- the notification manager can also present notifications that appear in the status bar at the top of the system in the form of charts or scroll bar text, such as notifications for applications running in the background, or notifications that appear on the screen in the form of dialog windows. For example, text information is prompted in the status bar, a beep sounds, the electronic device vibrates, or the indicator light flashes.
- the FW parameter setting module may include a FW parameter prediction network.
- the FW parameter prediction network is used to predict the corresponding FW parameters based on scene requirements.
- the FW parameter setting module can be located at the application layer.
- the FW parameters obtained by the FW parameter prediction network can also be called ISP parameters, FW hyperparameters, or ISP hyperparameters, which can be set according to actual needs and are not limited in this application. It can be understood that during subsequent use, ISP can obtain image processing parameters based on FW hyperparameters and preset image processing parameter functions. ISP can process Raw images based on image processing parameters.
- the FW parameter setting module can also run at the application layer, which can be understood as a piece of code running on the NPU, which is not limited in this application.
- the user can update the FW parameter setting module by updating the system of the electronic device.
- the user can operate the electronic device, and the electronic device can obtain an updated version from the cloud and update the system based on the updated version to install the FW parameter setting module.
- Android Runtime includes core libraries and virtual machines. Android runtime is responsible for the scheduling and management of the Android system.
- the core library contains two parts: one is the functional functions that need to be called by the Java language, and the other is the core library of Android.
- the application layer and the application framework layer run in a virtual machine.
- the virtual machine executes the Java files of the application layer and the application framework layer as binary files.
- the virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
- System libraries can include multiple functional modules. For example: surface manager (surface manager), media libraries (Media Libraries), 3D graphics processing libraries (for example: OpenGL ES), 2D graphics engines (for example: SGL), etc.
- the surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
- the media library supports playback and recording of a variety of commonly used audio and video formats, as well as static image files, etc.
- the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
- the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, composition, and layer processing.
- 2D Graphics Engine is a drawing engine for 2D drawing.
- the kernel layer is the layer between hardware and software.
- the kernel layer includes at least display driver, camera driver, audio driver, sensor driver, etc.
- the components included in the system framework layer, system library and runtime layer shown in Figure 2 do not constitute specific limitations on the electronic device 100.
- the electronic device 100 may include more or fewer components than shown in the figures, or some components may be combined, some components may be separated, or some components may be arranged differently.
- a camera in an electronic device collects an image, and the image is called a Raw image.
- the ISP can obtain the FW parameters from storage (which can also be called shared memory; for example, it can be a Double Data Rate (DDR) memory, which is not limited in this application).
- the FW parameters are set before the electronic device leaves the factory, and a set of FW parameters is stored in the DDR.
- the FW parameters can be understood as an array used to adjust at least one image attribute during ISP processing.
- the ISP can adjust the brightness attribute in the Raw image based on the first parameter in the FW parameter, so that the brightness of the generated RGB (or YUV, which will not be described in detail below) image is brighter or darker.
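The idea of a fixed FW parameter array driving one image attribute can be sketched as follows. The array layout (index 0 as a brightness gain), the values, and the function name `isp_process` are hypothetical; a real ISP applies many more stages than this single scaling step.

```python
import numpy as np

# Hypothetical layout: index 0 holds a brightness gain; the remaining slots
# stand for other image attributes that this sketch ignores.
FIXED_FW_PARAMS = np.array([1.5, 0.0, 0.0])

def isp_process(raw, fw_params):
    """Toy ISP stage: scale a normalized Raw image by the brightness gain."""
    gain = fw_params[0]
    return np.clip(raw * gain, 0.0, 1.0)

raw = np.array([[0.1, 0.2], [0.4, 0.8]])
out = isp_process(raw, FIXED_FW_PARAMS)
```

Because the array is fixed, every Raw image receives the same gain regardless of how dark or bright the scene is.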
- Applications in electronic devices can obtain RGB images and process the RGB images accordingly.
- the perception application can perform AI recognition on RGB images to identify people or other matters in RGB images, which is not limited by this application.
- each module may also include interactions with other modules.
- the ISP may obtain Raw images through a camera driver. This application does not limit this, and the description will not be repeated below.
- Because the FW parameters are set before the electronic device leaves the factory, they can be understood as fixed FW parameters. That is, the ISP performs the same processing on all acquired Raw images.
- For example, the first parameter in the FW parameters indicates that the brightness (i.e., the brightness attribute) of the Raw image is adjusted to a first threshold.
- the ISP will then adjust the brightness of the Raw image to the first threshold based on the FW parameters.
- this adjustment method is too simple and cannot be applied to all scenarios. For example, in a vehicle-mounted device, the image acquired by the camera in a tunnel is relatively dark.
- After the ISP processes such an image with the fixed FW parameters, the brightness of the generated RGB image may still be dark, so that downstream applications (such as perception applications) may not be able to accurately identify objects in the image.
- FIG. 4 is a schematic diagram illustrating the principle of an image processing method according to an embodiment of the present application.
- the FW parameter setting module (specifically, the FW parameter prediction network) running on the AI (Artificial Intelligence) processor (for example, an NPU or a TPU, which is not limited in this application) can predict the corresponding optimal FW parameters based on the Raw image (to distinguish them from the FW parameters in Figure 3, this application calls them dynamic FW parameters). Based on the dynamic FW parameters obtained by the AI processor, the ISP performs image processing on the Raw image and generates the corresponding RGB image.
- Perception applications can also be other applications
- the image processing method in the embodiment of the present application can dynamically adjust the FW parameters according to the scene requirements, so that the ISP can adaptively process the images obtained by the camera in different scenes, so as to obtain a processing effect that is more in line with the scene requirements and the image is clearer. As a result, the recognition accuracy of downstream applications is improved.
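The dynamic flow described above (predict FW parameters per Raw image, then let the ISP use them) can be sketched with a hand-written stand-in for the prediction network. Everything here is an assumption for illustration: `predict_fw_params` is a simple rule rather than a trained network, the single parameter is a brightness gain, and `target_brightness` is a hypothetical scene requirement.

```python
import numpy as np

def predict_fw_params(raw, target_brightness=0.5):
    """Stand-in for the FW parameter prediction network (not a trained model).

    Chooses a gain that moves the mean brightness toward the target.
    """
    mean = float(raw.mean())
    gain = target_brightness / max(mean, 1e-6)
    return np.array([gain])

def isp_process(raw, fw_params):
    """Toy ISP stage: apply the predicted brightness gain to the Raw image."""
    return np.clip(raw * fw_params[0], 0.0, 1.0)

tunnel_raw = np.full((4, 4), 0.1)    # dark scene, e.g. inside a tunnel
daylight_raw = np.full((4, 4), 0.5)  # normally lit scene

tunnel_rgb = isp_process(tunnel_raw, predict_fw_params(tunnel_raw))
daylight_rgb = isp_process(daylight_raw, predict_fw_params(daylight_raw))
```

In this toy setup the dark tunnel image receives a larger gain than the daylight image, so both outputs end up at a similar brightness for the downstream application.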
- vehicle-mounted cameras (i.e., cameras integrated in vehicle-mounted equipment)
- vehicle-mounted equipment (such as MCUs in vehicle-mounted equipment)
- real-time multi-type perception of the road environment, such as 2D/3D target detection, lane line detection, scene semantic segmentation, etc.
- relevant descriptions in prior art embodiments which are not limited in this application.
- computer vision perception tasks are usually carried out by deep models based on a CNN (Convolutional Neural Network) or transformer architecture.
- transformer architecture For the same original image (i.e.
- the FW parameter prediction network in the FW parameter setting module can predict, for each Raw image, a set of the most suitable FW parameters, so that the RGB or YUV image obtained after ISP processing of the Raw image achieves the best prediction effect on one or more perception tasks, thereby improving perception accuracy in assisted driving or autonomous driving scenarios.
- the ATR Adaptive Tone Reproduction
- the noise module predicts a set of parameters with weak noise reduction strength to reduce the loss of image details due to smearing.
- the DRC Dynamic Range Compression
- assisted driving or autonomous driving application scenarios are used as examples for explanation.
- the specific implementation methods for other application scenarios are the same. This application will not illustrate each example one by one, and the description will not be repeated below.
- FIG. 5 is a schematic flowchart of an exemplary image processing method.
- the image processing method in the embodiment of the present application can be divided into an offline training stage and an online deployment stage.
- the offline stage includes but is not limited to two stages:
- the proxy network can be understood as a network that stands in for the ISP, and its processing results are the same as or similar to those of the ISP. Proxy network training can be understood as a process in which the image processing results of the proxy network gradually approach the processing results of the ISP.
- the FW parameter prediction network training phase can be understood as training the FW parameter prediction network based on the trained proxy network to obtain a FW parameter prediction network with more accurate prediction results.
- the prediction result is the FW parameter.
- the online deployment phase optionally deploys the parameter prediction network obtained in the offline training phase to the target device.
- the target device can be terminal equipment, vehicle-mounted equipment, security monitoring equipment, etc., and is not limited in this application.
- the offline training phase can be executed by the computing node.
- Computing nodes can be any devices such as computers, servers, etc.
- the steps in the offline phase may be executed by the GPU processor in the computing node, which is not limited by this application.
- the offline training phase can be executed at any point in time to obtain a trained parameter prediction network.
- the electronic device can be installed with a FW parameter setting module that supports the parameter prediction network before leaving the factory.
- the offline training phase can also be executed periodically (for example, once a month) to update the parameter prediction network.
- the computing node can upload the updated parameter prediction network to the cloud (the cloud includes one or more servers), and the cloud can push a system update version, which includes the parameter prediction network, to the electronic device.
- the electronic device can update the FW parameter setting module based on the system update version to obtain the new parameter prediction network.
- the operator needs to prepare data and models before starting the offline training phase.
- the data prepared includes but is not limited to:
- N can be any value.
- the operator can control the image sensor (such as a vehicle-mounted camera) to collect images to obtain N Raw images, which are hereinafter referred to as training Raw images.
- the scene covered by the collected images optionally fits the working environment of the electronic equipment (such as vehicle-mounted equipment or terminal equipment) in the deployment stage as much as possible.
- the working environment may involve tunnels and highways in different weather environments.
- the operator can control the image sensor to collect images of multiple tunnels and images of different highways in different weather environments. This is not limited in this application.
- the difference between the images can optionally be the difference of each parameter between the images, such as the noise parameter, brightness parameter, etc. mentioned above.
- to the computer, the images involved in the embodiments of the present application are arrays.
- describing an image can therefore be understood as describing the elements of the corresponding array, or further as describing the parameters corresponding to the array that describes the image.
- any processing of the image, such as processing of the image by the ISP, can be understood as processing the pixel values corresponding to the image; this is not repeated below.
- the images collected by the cameras may be the same or different.
- if vehicle-mounted device A and vehicle-mounted device B are vehicle-mounted devices of the same manufacturer but of different models, and their camera parameters differ, the collected Raw images will differ.
- the different images may optionally be different in attributes (such as brightness, color temperature, etc.).
- different attributes may optionally mean that the type, quantity, and/or intensity of the attributes (which may also be high or low, light or dark, etc.) are different.
- the camera of vehicle-mounted device A and the camera of vehicle-mounted device B capture the same picture in the same environment.
- the attributes of the Raw image collected by the camera of vehicle-mounted device A include but are not limited to: attribute A (for example, color temperature), attribute B (for example, white balance), and attribute C (for example, brightness),
- while the attributes of the Raw image collected by the camera of vehicle-mounted device B include but are not limited to: attribute A, attribute B, attribute D (for example, noise), and attribute E (for example, Gamma).
- attribute A of the image captured by the camera of vehicle-mounted device A and attribute A of the image captured by the camera of vehicle-mounted device B may be the same or different in magnitude (for example, higher or lower), which is not limited in this application.
- operators can perform training operations on different types and/or different models of equipment to obtain parameter prediction networks corresponding to different types and/or different models of equipment.
- This application only takes the training process corresponding to a single device as an example.
- the computing node can repeatedly execute the training phase described in the embodiments of this application based on different devices to obtain the corresponding parameter prediction network. This application will not give examples one by one.
- b. N sets of FW parameters. For example, the operator can use a sampling algorithm such as Latin hypercube sampling (the corresponding formula can be set according to actual needs, which is not limited in this application) to sample N sets of FW parameters.
- the FW parameters are arrays.
- N groups of FW parameters can also be understood as N arrays.
- each set of training FW parameters is optionally a vector of length n.
- the value of n is related to the processing capability of the ISP.
- the attributes processed by different models of ISP may be the same or different.
- the ISP of model A is used to process q attributes such as brightness and color temperature
- n is related to q.
- one parameter in the FW parameters can be used to adjust an attribute, and multiple parameters in the FW parameters can also be used to adjust an attribute, which is not limited in this application.
- the FW parameters during the preparation process are called training FW parameters.
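The sampling of N sets of training FW parameters described above can be sketched as follows. This is a minimal illustration of Latin hypercube sampling under stated assumptions, not the sampling formula of the embodiments: the function name `latin_hypercube`, the parameter ranges in `bounds`, and the choice of n = 3 and N = 5 are all hypothetical.

```python
import random

def latin_hypercube(num_sets, bounds, seed=0):
    """Draw num_sets parameter vectors. Each dimension is split into
    num_sets equal strata and every stratum is used exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / num_sets
        # One uniformly drawn point inside each stratum of this dimension.
        points = [low + (k + rng.random()) * width for k in range(num_sets)]
        rng.shuffle(points)  # decouple the stratum order across dimensions
        columns.append(points)
    # Transpose: row i is the i-th set of training FW parameters.
    return [list(row) for row in zip(*columns)]

# Hypothetical value ranges for n = 3 FW parameters (not from the source).
bounds = [(0.0, 1.0), (0.5, 2.5), (2000.0, 8000.0)]
training_fw_params = latin_hypercube(num_sets=5, bounds=bounds)
for params in training_fw_params:
    print([round(v, 3) for v in params])
```

Each of the 5 rows is one set of training FW parameters of length 3; along every dimension the 5 samples fall into 5 different strata, which spreads the samples over the parameter space.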
- the case where the number of Raw images participating in training equals N (that is, the number of sets of FW parameters) is used as an example for explanation.
- the number of Raw images may also be greater than N or less than N.
- the corresponding FW parameters can be the same or different.
- the greater the number of FW parameters, that is, the more samples participating in training, the closer the network obtained by training (such as the proxy network) is to the user's needs.
- CV (Computer Vision)
- the CV perception network is used to perform prediction for specific computer vision tasks, such as 2D/3D box prediction for vehicles, pedestrians, traffic signs, and other objects in road scenes, semantic segmentation of instances in the scene, tracking of specific objects, and more.
- the embodiments of this application do not restrict the CV perception network to a specific perception task, as long as the task has a clear objective function and the perception network is end-to-end trainable. For most supervised CV tasks and most deep models built on CNNs, these two prerequisites are easily satisfied.
- by default, the CV perception network in the embodiments of this application receives RGB or YUV images as input, passes them through a series of computing modules or sub-networks (such as a backbone network, an RPN (Region Proposal Network), etc.), and then outputs the prediction results required for the specific task, such as 2D/3D detection boxes, object category labels, prediction confidence, segmentation masks, object motion trajectories, etc.
- d. Ground-truth labels. For example, the operator can set a ground-truth label for each Raw image used for training, which identifies the real result that should be obtained when the Raw image passes through the CV perception network.
- Ground-truth labels include but are not limited to: 2D/3D detection boxes, object category labels, prediction confidence, segmentation masks, object motion trajectories, and other labels. The specific labels are set according to the CV perception network. For example, if the CV perception network is used for object category recognition on RGB images, the operator can set the ground-truth label for the Raw image in advance, that is, set the object category labels included in the Raw image through manual identification. For example, if the Raw image includes kittens and puppies, the object category labels set by the operator for the Raw image include a kitten label and a puppy label.
- the purpose of the ground-truth label is to compare it with the actual output of the CV perception network during training, to determine the deviation between the result output by the CV perception network and the real result (i.e., the ground-truth label). Specific details are described in the following examples.
- the proxy network may be a convolutional neural network (CNN), a recurrent neural network (RNN), or some other deep model (such as a transformer, etc.).
- the present invention does not limit the specific model type used. It only needs to ensure that it receives a Raw image and a set of FW parameters as input and has differentiable properties.
- the proxy network is also set with an initial weight.
- the weight of the proxy network is called the proxy network weight
- the weight of the FW parameter prediction network is called the FW parameter prediction network weight.
- For example, the weights involved in the embodiments of this application may be a set of arrays.
- Figure 6 is a schematic structural diagram of an exemplary proxy network. Please refer to Figure 6. This structure is the structure of the Transformer model, which includes the Encoder (encoding) and Decoder (decoding) parts.
- Encoder includes but is not limited to: a multi-head attention layer, layer normalization (Norm in Figure 6), a feed-forward neural network (Feed Forward in Figure 6), and other modules.
- Decoder includes but is not limited to: a masked multi-head attention layer, a multi-head attention layer, layer normalization (Norm in Figure 6), a feed-forward neural network (Feed Forward in Figure 6), and other modules. Each module corresponds to a set of weight values, and the weight values of all modules together form the weight of the proxy network.
- the FW parameter prediction network is a deep network deployed on the AI processor. It reads, on demand, the Raw image data to be processed in the video stream collected by the current camera from storage (such as DDR) and predicts the optimal FW parameters (which can also be understood as network inference). The FW parameter prediction network then writes the predicted parameters (a vector of length N, where N is the number of dynamically configurable FW parameters) into the shared memory of the ISP Firmware, overwriting the original FW parameters.
- the FW parameter prediction network is similar to the proxy network, and can be a convolutional neural network (CNN), a recurrent neural network (RNN), or some other deep model (such as a transformer, etc.).
- the present invention does not limit the specific model type used, but only needs to ensure that it has differentiable properties.
- the FW parameter prediction network is also set with initial weights.
- the initial weight can be set arbitrarily by the operator, and is not limited in this application.
- the weight of the FW parameter prediction network can be called the FW parameter prediction network weight to distinguish it from other weights.
- for details, refer to the relevant description of the proxy network weight, which is not repeated here.
- the proxy network uses a differentiable model as a stand-in for the non-differentiable image processing algorithms in the ISP, so that first-order optimization algorithms such as gradient descent can be applied to the complete data path formed by the FW parameter prediction network, the proxy network, and the CV perception network, and this data path can be jointly optimized end-to-end.
- Figure 7 is a schematic diagram of an exemplary agent network training process. Please refer to Figure 7. Specific details include but are not limited to:
- S701: the ISP performs image processing on the training Raw image based on the training FW parameters to obtain the RGB_ISP image.
- the ISP is a SoC subsystem composed of two parts: Firmware and Hardware (some ISPs also include some software parts running on the AI processor).
- the Firmware part is responsible for some scheduling work required to execute the image processing algorithm, such as reading FW parameters from the designated memory area, controlling the switch status and execution sequence of each module, modifying the status bits of each register, etc., while executing some low-intensity logical operations.
- the Hardware part is responsible for performing high-intensity operations in image processing algorithms, and its calculation logic is controlled by the status bits in the register.
- the operator prepares N training FW parameters and N training Raw images.
- the operator can preset the correspondence between N training FW parameters and N training Raw images.
- Raw image 1 corresponds to FW parameter 1
- Raw image 2 corresponds to FW parameter 2
- Raw image N corresponds to FW parameter N.
- the corresponding relationship can be set arbitrarily, provided that each Raw image corresponds to one set of FW parameters; the FW parameters corresponding to different Raw images should be as different as possible to increase the difference between samples.
- the computing node may include N ISPs, each of which performs image processing on one Raw image based on FW parameters, to obtain N RGB images (which may also be YUV images; this is not limited in this application and is not repeated below).
- the ISP's processing of Raw images based on FW parameters may specifically include but is not limited to: the ISP itself carries an image processing parameter algorithm or an image processing parameter function.
- the ISP may obtain the corresponding image processing parameters based on the preset image processing parameter algorithm and the FW parameters (i.e., FW hyperparameters).
- Raw images are arrays, and the arrays include multiple grayscale values.
- the ISP can adjust at least one grayscale value in the array based on the image processing parameters to change the attributes of the Raw image. For example, based on the image processing parameters, the ISP adjusts some grayscale values in the Raw image that are below a threshold (which can be set according to actual needs) to a desired value, so that the brightness of the generated RGB image is higher than before the adjustment.
- the RGB image obtained by ISP is called the RGB_ISP image in the embodiment of the present application.
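The grayscale adjustment example above can be sketched as follows. This is only an illustrative toy: the function name `lift_shadows` and the values of `threshold` and `target` are hypothetical, and a real ISP would derive its image processing parameters from the FW parameters through its own algorithm rather than receive them directly.

```python
def lift_shadows(raw, threshold, target):
    """Return a copy of a 2-D grayscale array in which every value below
    `threshold` is raised to `target`; other values are left unchanged."""
    return [[target if v < threshold else v for v in row] for row in raw]

raw_image = [[10, 200], [35, 90]]  # toy 2x2 array standing in for a Raw image
adjusted = lift_shadows(raw_image, threshold=40, target=40)

# Mean grayscale value before and after, as a simple brightness measure.
mean_before = sum(map(sum, raw_image)) / 4
mean_after = sum(map(sum, adjusted)) / 4
print(adjusted, mean_before, mean_after)
```

Only the two values below the threshold change, and the mean grayscale value of the array increases, which corresponds to the higher brightness described in the text.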
- the image processing process of ISP can be understood as correcting (or processing) the attributes of the corresponding Raw image based on the FW parameters to obtain an RGB image.
- the attributes of the RGB image are those processed based on the FW parameters.
- Raw image 1 includes color temperature A
- the ISP performs image processing on Raw image 1 based on the FW parameters (the FW parameters include a color temperature threshold, which may also be called a color temperature correction parameter or a color temperature correction threshold, and is not limited in this application) to obtain RGB image 1.
- the color temperature B in the RGB image 1 is the same as or different from the color temperature A in the Raw image 1.
- the ISP can also perform image processing on each of the N training Raw images based on each of the N sets of training FW parameters to obtain N × N RGB_ISP images, so as to expand the size of the training set.
- Specific training methods can be set according to actual needs and are not limited in this application.
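The difference between the one-to-one correspondence (N samples) and the expanded N × N training set can be sketched as follows. The identifiers `raw_ids`, `fw_ids`, and `expand_pairs` are hypothetical placeholders; in practice each pair would be passed to the ISP to produce one RGB_ISP image.

```python
from itertools import product

def expand_pairs(raw_images, fw_params):
    """Pair every Raw image with every set of FW parameters."""
    return [(raw, p) for raw, p in product(raw_images, fw_params)]

raw_ids = ["raw_1", "raw_2", "raw_3"]
fw_ids = ["fw_1", "fw_2", "fw_3"]

one_to_one = list(zip(raw_ids, fw_ids))    # N (Raw image, FW parameter) pairs
all_pairs = expand_pairs(raw_ids, fw_ids)  # N x N pairs
print(len(one_to_one), len(all_pairs))
```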
- the ISP involved in the embodiments of this application can be any model of ISP in any type of equipment. Assuming that the processes involved in Figure 7 are all executed by the ISP of the electronic device of model A, the proxy network obtained by the process of Figure 7 and the FW parameter prediction network obtained in the following embodiments all correspond to the electronic device of model A. To obtain the FW parameter prediction network corresponding to the electronic device of model B, the training processes in Figure 7 and the following embodiments need to be re-executed based on the electronic device of model B. For example, the ISP involved in S701 is replaced with the ISP of the electronic device of model B. Examples are not given one by one in the embodiments of this application, and the description is not repeated below.
- all devices (such as the ISP) and modules involved in the training phase in the embodiments of the present application can be integrated into the computing node and run by the computing node.
- the ISP in the computing node is the ISP of the electronic device of model A.
- the computing node may include multiple independent devices or modules.
- the ISP and the proxy network are different devices, and the proxy network can be on the GPU of the computing node.
- the ISP processing can be executed by the ISP in the electronic device of model A, and the result output to the GPU of the computing node.
- the system structure of the computing node can be set according to actual needs, which is not limited in this application.
- S702: the proxy network outputs an RGB_DL image based on the training FW parameters and the training Raw image.
- the agent network may be composed of a depth perception model (it may also be other neural networks, which is not limited in this application).
- the computing node uses the N Raw images and the N sets of training FW parameters as the input of the proxy network.
- the proxy network performs inference (or operation) based on the training FW parameters, the N Raw images, and the initial proxy network weight, and outputs RGB_DL images.
- the operator has preset the corresponding relationship between N Raw images and N FW parameters.
- when the proxy network performs processing, it still follows the preset corresponding relationship. That is, the correspondence between Raw images and FW parameters used by the proxy network must be consistent with the correspondence used when the ISP performs image processing, so that the results are comparable.
- the computing node can determine the number of Raw images and FW parameters (ie, the number of groups) input into the proxy network each time based on the processing capability of the proxy network.
- the Raw image and FW parameters input each time maintain the corresponding relationship as described above.
- the number of Raw images (and FW parameters) input can be the same or different each time.
- the computing node can input 20 Raw images and 20 FW parameters to the proxy network each time.
- the numerical values are only illustrative examples and can be set according to actual needs, and are not limited in this application.
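The batched input described above, in which each batch keeps the preset correspondence between Raw images and FW parameters, can be sketched as follows. The helper `paired_batches`, the placeholder identifiers, and the batch size of 20 (taken from the illustrative figure in the text) are assumptions for demonstration only.

```python
def paired_batches(raw_images, fw_params, batch_size):
    """Yield (raw_batch, fw_batch) slices that preserve the one-to-one
    correspondence between Raw images and FW parameters."""
    if len(raw_images) != len(fw_params):
        raise ValueError("each Raw image needs exactly one set of FW parameters")
    for start in range(0, len(raw_images), batch_size):
        end = start + batch_size
        yield raw_images[start:end], fw_params[start:end]

raws = [f"raw_{i}" for i in range(1, 51)]   # N = 50 placeholder Raw images
params = [f"fw_{i}" for i in range(1, 51)]  # N = 50 placeholder FW parameter sets
batches = list(paired_batches(raws, params, batch_size=20))
print([len(raw_batch) for raw_batch, _ in batches])
```

With N = 50 and a batch size of 20, the last batch is smaller than the others, which matches the statement that the number of inputs can differ from batch to batch.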
- the proxy network can operate on the multiple Raw images and FW parameters input each time based on the current weight (which may be an initial weight or an updated weight). During the operation, similar to the ISP, it processes each Raw image based on the FW parameters corresponding to that Raw image to obtain the processing result.
- the proxy network processes N Raw images and N sets of FW parameters (which may be processed in batches) to obtain N RGB_DL images.
- the proxy network is set with initial weights, and the initial weights are arbitrarily set by the operator.
- the results (i.e., arrays) output by the agent network in the initial stage may be meaningless.
- the process shown in Figure 7 in the embodiments of the present application can be understood as continuously training the proxy network to update the proxy network weight, so that the result output by the proxy network gradually approaches the result processed by the ISP in S701 (i.e., the RGB_ISP image).
- S703: Determine whether the similarity between the RGB_ISP image and the RGB_DL image is greater than the threshold.
- the computing node (specifically, it may be the CPU of the computing node) obtains N RGB_ISP images output by the ISP and N RGB_DL images output by the proxy network.
- the computing node can obtain the similarity between the two corresponding images based on the correspondence between the N RGB_ISP images and the N RGB_DL images to obtain N similarities.
- the computer can obtain the overall similarity between the RGB_ISP image processed by the ISP and the RGB_DL image processed by the proxy network based on the N similarities.
- the correspondence between N RGB_ISP images and N RGB_DL images refers to the RGB images obtained after processing based on the same Raw image and the same FW parameter.
- the calculation node obtains the similarity between the two corresponding images based on RGB_ISP image 1 and RGB_DL image 1.
- RGB_ISP image 1 and RGB_DL image 1 are obtained by the ISP and proxy network based on Raw image 1 and FW parameter 1 respectively.
- each image is an array.
- RGB_ISP image 1 is recorded as $I_{out\_ISP}$
- RGB_DL image 1 is recorded as $I_{out\_DL}$
- Raw image 1 is recorded as $I_{Raw}$
- the FW parameter is recorded as $p$.
- the computing node calculates the difference between $I_{out\_ISP}$ and $I_{out\_DL}$ (which may also be called the error, similarity difference, etc., and is not limited in this application), which is recorded as $L_1$.
- the specific calculation method may refer to existing technical embodiments, and is not limited in this application.
- the computing node can obtain the similarity difference between N RGB_ISP images and N RGB_DL images, so as to obtain N similarity difference values, such as L_1...L_N.
- the calculation node calculates the average of N similarity difference values, which is the similarity value between the RGB_ISP image processed by the ISP and the RGB_DL image processed by the proxy network, recorded as L.
- calculation methods such as arithmetic mean can also be used to obtain the similarity value, which is not limited in this application.
- the computing node detects whether L is greater than (or equal to) a preset threshold.
- the preset threshold can be set according to actual needs and is not limited in this application.
- the preset threshold is used to indicate the required degree of similarity between images. The smaller the preset threshold, the greater the similarity required between the two images. Correspondingly, the smaller L is, the higher the similarity between the two images.
- if the computing node detects that L is less than the threshold (which can also be understood as the similarity being greater than the threshold), S704 is executed.
- if the computing node detects that L is greater than or equal to the threshold (which can also be understood as the similarity being less than the threshold, that is, the difference between the image output by the proxy network and the image output by the ISP is large), S705 is executed.
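The comparison in S703 can be sketched as follows. The source leaves the per-image difference metric open, so the mean absolute difference used here is an assumption, as are the function names `image_difference` and `next_step` and the toy pixel values.

```python
def image_difference(a, b):
    """Mean absolute difference between two equally sized flattened images
    (an assumed metric; the source does not fix one)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def next_step(isp_images, dl_images, threshold):
    """Average the per-pair differences L_1..L_N into L and pick the branch."""
    diffs = [image_difference(i, d) for i, d in zip(isp_images, dl_images)]
    overall = sum(diffs) / len(diffs)
    return overall, ("S704" if overall < threshold else "S705")

# Toy RGB_ISP and RGB_DL images, paired by the same Raw image and FW parameters.
rgb_isp = [[0.2, 0.4, 0.6], [0.9, 0.1, 0.5]]
rgb_dl = [[0.25, 0.35, 0.6], [0.8, 0.2, 0.5]]

overall_l, step = next_step(rgb_isp, rgb_dl, threshold=0.1)
print(overall_l, step)
```

A smaller L means the proxy network output is closer to the ISP output, so L below the threshold leads to S704 and otherwise to S705.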
- S704: the computing node can save the current proxy network weight and lock it to prevent the proxy network weight from changing with subsequent processing.
- S705: the weight of the proxy network needs to be readjusted, so that the proxy network re-executes S702 based on the new weight.
- the computing node uses an optimization algorithm such as gradient descent to calculate the corresponding gradient value based on the current proxy network weight (denoted as $w_t$). According to the gradient value, the current proxy network weight $w_t$ is updated by gradient descent to obtain the updated proxy network weight, denoted as $w_{t+1}$.
- the agent network can repeatedly execute S702.
- the proxy network processes the training FW parameters and the training Raw image based on the updated weight ($w_{t+1}$).
- the computing node can compare the output result of the proxy network with the output result of the ISP again (S701 only needs to be executed once), and based on the comparison result, execute S704 or S705.
- the number of loops depends on the size of the L value in S703, that is, the similarity between the output result of the proxy network and the output result of the ISP.
- if the computing node detects that the similarity is greater than the threshold, that is, the L value is less than the threshold, the proxy network weight is saved and training ends.
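The loop over S702, S703, and S705 until S704 is reached can be sketched with a deliberately tiny example. Everything below is an assumption for illustration: the stand-in `isp` function, the one-weight `proxy`, the mean squared error loss, the learning rate, and the threshold. A real proxy network would be a deep model whose gradients are computed by automatic differentiation.

```python
def isp(raw, p):
    """Stand-in for the real ISP; treated as a black box during training."""
    return [2.0 * p * x for x in raw]

def proxy(raw, p, w):
    """Toy differentiable proxy with a single weight w."""
    return [w * p * x for x in raw]

def loss_and_grad(samples, targets, w):
    """Mean squared error between proxy and ISP outputs, and its gradient in w."""
    total, grad, count = 0.0, 0.0, 0
    for (raw, p), target in zip(samples, targets):
        for x, out, t in zip(raw, proxy(raw, p, w), target):
            err = out - t
            total += err * err
            grad += 2.0 * err * p * x
            count += 1
    return total / count, grad / count

# Each sample is (training Raw image, training FW parameter).
samples = [([0.1, 0.5, 0.9], 0.8), ([0.3, 0.2, 0.7], 1.2)]
targets = [isp(raw, p) for raw, p in samples]  # S701, executed only once

w, lr, threshold = 0.0, 0.5, 1e-6
for step in range(1000):
    loss, grad = loss_and_grad(samples, targets, w)  # S702 and S703
    if loss < threshold:
        break                                        # S704: keep current weight
    w = w - lr * grad                                # S705: w_{t+1} = w_t - lr * grad
print(step, w, loss)
```

In this toy setting the weight converges toward 2.0, the value at which the proxy reproduces the stand-in ISP, and the loop stops once the loss falls below the threshold.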
- the FW parameter prediction network is a deep network deployed on the AI processor. It reads, on demand, the Raw image data to be processed in the video stream collected by the current camera from storage (such as DDR) and predicts the optimal FW parameters (which can also be understood as network inference).
- the FW parameter prediction network writes the predicted parameters (a vector of length N, where N is the number of dynamically configurable FW parameters) into the shared memory of the ISP Firmware, overwriting the original FW parameters.
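The overwrite of the FW parameters can be sketched as follows. The source does not describe the layout of the ISP Firmware shared memory, so the `bytearray` standing in for that memory, the little-endian float32 encoding, the offset 0, and the helper names are all hypothetical.

```python
import struct

def write_fw_params(shared, offset, params):
    """Pack the parameter vector as little-endian float32 values in place."""
    struct.pack_into(f"<{len(params)}f", shared, offset, *params)

def read_fw_params(shared, offset, count):
    """Read `count` float32 values back from the buffer."""
    return list(struct.unpack_from(f"<{count}f", shared, offset))

shared_memory = bytearray(64)                        # stand-in for the shared region
write_fw_params(shared_memory, 0, [1.0, 0.5, 0.25])  # original FW parameters
write_fw_params(shared_memory, 0, [0.75, 1.5, 2.0])  # predicted FW parameters overwrite them
current = read_fw_params(shared_memory, 0, 3)
print(current)
```

After the second write, a reader of the buffer (the Firmware in the described system) sees only the predicted parameters.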
- the training of the FW parameter prediction network in the embodiment of the present application can be understood as a process of making the output results of the FW parameter prediction network gradually more accurate.
- like the proxy network, the FW parameter prediction network is set with initial weights. Training it is similar to training the proxy network: the FW parameter prediction network weight is continuously updated so that the FW parameters output by the network are gradually optimized.
- Figure 8 is a schematic diagram of an exemplary FW parameter prediction network training process. Please refer to Figure 8. Specific details include but are not limited to:
- S801: the FW parameter prediction network obtains the predicted FW parameters based on the Raw image.
- the computing node can input N Raw images (that is, the N Raw images from the preparation phase) to the FW parameter prediction network, and the FW parameter prediction network can process (or operate on) the input N Raw images based on the current weight (which can be the initial weight or the updated weight) to obtain N predicted FW parameters.
- the case where the input samples in the training phase of the FW parameter prediction network (that is, the N Raw images) are the same as the input samples in the training phase of the proxy network is used as an example for explanation.
- the input samples in the FW parameter prediction network training phase may be M images among the N Raw images.
- they can also be M other Raw images that are different from those used in the proxy network training stage, which is not limited in this application.
- the computing node can input N Raw images to the FW parameter prediction network in batches based on the processing capability of the FW parameter prediction network.
- the number of Raw images input can be the same or different each time.
- the computing node can input 20 Raw images to the FW parameter prediction network at a time.
- the numerical values are only illustrative examples and can be set according to actual needs, and are not limited in this application.
- the N predicted FW parameters output by the FW parameter prediction network correspond one-to-one to the input Raw image.
- the FW parameter prediction network can output predicted FW parameter 1 based on Raw image 1, and output predicted FW parameter 2 based on Raw image 2.
- the FW parameter prediction network is set with initial weights, which are arbitrarily set by the operator.
- the results (i.e., arrays) output by the FW parameter prediction network in the initial stage may be meaningless, but in the embodiments of this application, the results output by the FW parameter prediction network are still called the predicted FW parameters.
- the process shown in Figure 8 in the embodiment of the present application can be understood as continuously training the FW parameter prediction network to update the weight of the FW parameter prediction network to continuously optimize the output results of the FW parameter prediction network.
- S802: the proxy network obtains the RGB image based on the predicted FW parameters and the Raw image.
- N Raw images and N predicted FW parameters are in one-to-one correspondence.
- the corresponding relationship is the corresponding relationship described in S801.
- the proxy network can process (or operate on) the N Raw images based on the current weight (that is, the weight obtained after the training in Figure 7) and the correspondence between the N Raw images and the N predicted FW parameters, to obtain N RGB images. For example, based on the current weight and the correspondence between Raw image 1 and predicted FW parameter 1 (that is, predicted FW parameter 1 is obtained by the FW parameter prediction network based on Raw image 1), the proxy network can take predicted FW parameter 1 and Raw image 1 as a pair of input parameters and process (or operate on) them to obtain RGB image 1.
- the computer can obtain the correspondence between the Raw image 1, the predicted FW parameter 1, and the RGB image 1.
- the proxy network has been trained in Figure 7.
- the result obtained after the proxy network processes the Raw image based on the current weight is the same as that processed by the specified model of ISP.
- the results are the same or close (i.e. similar). That is to say, in S802, the proxy network can be regarded as an ISP to perform image processing on the Raw image to obtain the corresponding RGB image.
- the proxy network is differentiable, which enables end-to-end joint training between the FW parameter prediction network and the CV perception network.
- the computing node can input the N Raw images and the N predicted FW parameters to the proxy network in batches.
- the specific implementation method can be referred to the above, and will not be described again here.
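The one-to-one pairing and batched input described for S802 can be sketched as follows. This is a minimal illustration, not the application's implementation: `toy_proxy_isp` is a hypothetical stand-in for the trained proxy network (a gain and an offset applied to a single-channel image), and `run_proxy` and `batch_size` are names assumed here for illustration only.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # toy single-channel image with values in [0, 1]


def toy_proxy_isp(raw: Image, fw_params: Sequence[float]) -> Image:
    """Hypothetical stand-in for the trained proxy network: gain, offset, clip."""
    gain, offset = fw_params
    return [[min(1.0, max(0.0, gain * px + offset)) for px in row] for row in raw]


def run_proxy(
    raw_images: Sequence[Image],
    predicted_params: Sequence[Sequence[float]],
    proxy: Callable[[Image, Sequence[float]], Image] = toy_proxy_isp,
    batch_size: int = 2,
) -> List[Image]:
    """Process Raw image i with predicted FW parameter i, batch by batch."""
    if len(raw_images) != len(predicted_params):
        raise ValueError("Raw images and predicted FW parameters must correspond one-to-one")
    outputs: List[Image] = []
    for start in range(0, len(raw_images), batch_size):
        stop = start + batch_size
        # Each pair (Raw image i, predicted FW parameter i) is one proxy input.
        for raw, params in zip(raw_images[start:stop], predicted_params[start:stop]):
            outputs.append(proxy(raw, params))
    return outputs
```

Because output i is produced from input pair i, the correspondence between Raw image i, predicted FW parameter i, and the resulting image i is preserved by list position.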
- the CV perception network obtains the perception task label based on the RGB image.
- the operator presets a CV perception network.
- the function of the CV perception network can be to output the corresponding perception task label based on the input RGB image.
- the number and type of perception task labels depend on the task type of the CV perception network. For example, if the task type of the CV perception network is to identify object types, the CV perception network can output the corresponding perception task label based on the input RGB image, and the perception task label includes an object type label.
- after the computing node obtains the N RGB images output by the proxy network, it can input the N RGB images into the CV perception network (possibly in batches; for the specific processing method, refer to the above, which is not repeated here).
- the CV perception network can process (or recognize) N RGB images and obtain N perception task labels.
- N RGB images are in one-to-one correspondence with N perception task labels.
- each perception task label may include multiple sub-task labels.
- the type and number of sub-task labels are associated with the task type of the CV perception network. For detailed description, please refer to the above and will not be repeated here.
- the computing node can obtain the correspondence between N RGB images and N perception task labels.
- the computing node also obtains the correspondence between N RGB images and N Raw images.
- the computing node can obtain the correspondence between N perception task labels and N Raw images. For example, perception task label 1 corresponds to Raw image 1, perception task label 2 corresponds to Raw image 2, and so on.
- the operator sets a true value label for each Raw image among the N Raw images.
- the computing node can obtain the correspondence between N true value labels and N Raw images in advance.
- the computing node can obtain the correspondence between the N perception task labels and the N true value labels (the correspondence is one-to-one). For example, perception task label 1 corresponds to true value label 1, and perception task label 2 corresponds to true value label 2.
- S804 Determine whether the difference between the true value label and the perception task label is less than a threshold.
- the computing node can obtain the correspondence between the N perception task labels and the N true value labels.
- based on this correspondence, the computing node can obtain the difference between each perception task label and its corresponding true value label (which can also be called a difference value, a loss function value, etc., and is not limited in this application), to obtain N difference values.
- the manner in which the computing node obtains the difference between the true value label and the perception task label may refer to existing technical embodiments, and is not limited in this application.
- the computing node can obtain the average of the N difference values (for example, the arithmetic mean, which is not limited in this application), which is recorded as Q.
- the computing node detects whether Q is greater than (or equal to) a preset threshold.
- the preset threshold can be set according to actual needs and is not limited in this application.
- the preset threshold indicates the degree of difference between two labels: the smaller the preset threshold, the smaller the permitted difference between the two labels. Correspondingly, the smaller the Q value, the smaller the difference between the two labels.
- if the computing node detects that the Q value is less than the threshold, which can also be understood as the difference being less than the threshold, it executes S805.
- if the computing node detects that the Q value is greater than or equal to the threshold, which can also be understood as the difference being greater than or equal to the threshold, then the difference between the true value label and the perception task label that the CV perception network outputs based on the RGB image output by the proxy network is relatively large. It can also be understood that the enhancement effect of the RGB image obtained by the proxy network based on the predicted FW parameters is poor, which makes the recognition result of the CV perception network inaccurate; in other words, the predicted FW parameters cannot yet give the Raw image an enhancement effect that the CV perception network can recognize.
- the computing node executes S806.
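The S804 decision can be sketched as below. The application leaves the difference measure open, so the per-sample difference used here (the fraction of sub-task labels that disagree) is purely an assumption for illustration, as are the names `label_difference` and `evaluate_batch` and the returned strings `"save"` and `"update"`.

```python
from typing import List, Sequence, Tuple


def label_difference(predicted: Sequence[str], truth: Sequence[str]) -> float:
    """Hypothetical difference: fraction of sub-task labels that do not match."""
    if len(predicted) != len(truth):
        raise ValueError("sub-task labels must correspond one-to-one")
    mismatches = sum(p != t for p, t in zip(predicted, truth))
    return mismatches / len(truth)


def evaluate_batch(
    perception_labels: Sequence[Sequence[str]],
    truth_labels: Sequence[Sequence[str]],
    threshold: float,
) -> Tuple[float, str]:
    """Return (Q, action): Q is the mean of the N differences."""
    if len(perception_labels) != len(truth_labels):
        raise ValueError("perception task labels and true value labels must correspond one-to-one")
    differences: List[float] = [
        label_difference(p, t) for p, t in zip(perception_labels, truth_labels)
    ]
    q = sum(differences) / len(differences)
    # Q below the threshold: keep the current network; otherwise update its weight.
    return q, ("save" if q < threshold else "update")
```

For two samples with two sub-task labels each, where exactly one sub-task label is wrong, the differences are 0.5 and 0.0, so Q is 0.25.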
- the computing node can save the FW parameter prediction network, that is, save its current structure (such as a neural network structure) and its current weight, recorded as w_f.
- the weight of the FW parameter prediction network needs to be readjusted so that the FW parameter prediction network re-executes S801 based on the updated weight.
- the computing node uses an optimization algorithm such as the gradient descent algorithm (which can be set according to actual needs and is not limited in this application) to calculate the corresponding gradient value based on the current weight (w_f) of the FW parameter prediction network. According to the gradient value, the current weight (w_f) is updated by gradient descent to obtain the updated weight of the FW parameter prediction network, which is recorded as w_{f+1}.
- the FW parameter prediction network can then re-execute S801.
- the FW parameter prediction network processes the Raw image based on the updated weight (w_{f+1}) to obtain the corresponding predicted FW parameters.
- the number of iterations of the process in Figure 8 depends on the Q value in S804, that is, on the degree of difference between the perception task label and the true value label. In any iteration, if the Q value is less than the threshold, the computing node saves the FW parameter prediction network and its weight, and the training ends.
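The loop of Figure 8 can be illustrated with a deliberately tiny, scalar example. Everything in it is an assumption made for illustration: each Raw image is reduced to one brightness value, the FW parameter prediction network is a two-weight linear model that predicts a brightness offset, the frozen proxy network simply adds that offset, the frozen perception network is the identity, and the gradient is written out by hand where a real system would rely on automatic differentiation through the differentiable proxy.

```python
from typing import Sequence, Tuple

Weights = Tuple[float, float]


def predict_fw_param(weights: Weights, raw: float) -> float:
    """S801 (toy): predict a brightness offset from the Raw brightness."""
    w0, w1 = weights
    return w0 + w1 * raw


def proxy_isp(raw: float, offset: float) -> float:
    """S802 (toy): frozen, differentiable proxy that adds the offset."""
    return raw + offset


def perception(rgb: float) -> float:
    """S803 (toy): frozen perception network, here the identity."""
    return rgb


def train_fw_predictor(
    raws: Sequence[float],
    truths: Sequence[float],
    threshold: float = 1e-4,
    learning_rate: float = 0.5,
    max_steps: int = 1000,
) -> Tuple[Weights, float, int]:
    """Repeat S801 to S806 until Q falls below the threshold."""
    weights: Weights = (0.0, 0.0)
    n = len(raws)
    for step in range(max_steps):
        errors = [
            perception(proxy_isp(x, predict_fw_param(weights, x))) - t
            for x, t in zip(raws, truths)
        ]
        q = sum(e * e for e in errors) / n  # S804: mean of the N differences
        if q < threshold:
            return weights, q, step  # S805: keep the current weight w_f
        # S806: gradient of Q with respect to (w0, w1), derived by hand.
        grad0 = sum(2 * e for e in errors) / n
        grad1 = sum(2 * e * x for e, x in zip(errors, raws)) / n
        weights = (weights[0] - learning_rate * grad0,
                   weights[1] - learning_rate * grad1)
    raise RuntimeError("Q did not fall below the threshold within max_steps")
```

Calling `train_fw_predictor([0.1, 0.3, 0.6, 0.9], [0.5] * 4)` trains the toy predictor so that every processed brightness lands near 0.5; only the predictor's weights change, while the proxy and perception functions stay fixed.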
- Fig. 9 is a schematic diagram of the system structure of FW parameter prediction network training.
- the computing node inputs the Raw image into the FW parameter prediction network.
- the FW parameter prediction network outputs predicted FW parameters based on the input Raw image.
- the predicted FW parameters and Raw images are used as inputs to the proxy network, and the proxy network outputs RGB images.
- the RGB image is used as the input of the CV perception network, and the computing node can obtain the Q value (i.e., the difference value or loss function value) based on the output of the CV perception network.
- if the Q value does not meet the condition, that is, it is not less than the threshold, the system returns to the starting position and repeats the process in Figure 9 until the Q value is less than the threshold.
- the image processing method also includes an online deployment stage.
- in the online deployment stage, the FW parameter prediction network and its weights saved in Figure 8 are deployed to the target device.
- the target device is the device corresponding to the ISP used during the training process.
- for example, if the ISP is that of an electronic device of model A, the operator can deploy the FW parameter prediction network and its weights saved by the computing node to the electronic device of model A.
- the electronic device of model A can also obtain and install the FW parameter prediction network and its weights from the cloud, which is not limited in this application.
- the target device loads the FW parameter prediction network and its weights into memory, where it runs in the operating system as a resident process on the AI processor (such as an NPU). This can also be understood as the FW parameter setting module in Figure 2 (which can also be called the FW parameter prediction module), or as the FW parameter setting (prediction) application running at the application layer.
- FIG. 10 is a schematic diagram of an exemplary image processing flow.
- a trigger signal can be sent to the AI processor to call the FW parameter setting module on the AI processor.
- the FW parameter setting module includes the FW parameter prediction network.
- the camera will be called at the same time; in response to the call from the camera application, the camera starts collecting images and places the collected images in shared memory.
- the FW parameter setting module reads Raw images from shared memory in response to calls from the camera application.
- shared memory is only used as an example for explanation. In other embodiments, it can also be other storage, which is not limited by this application.
- the FW parameter prediction network in the FW parameter setting module can process the input Raw image based on the weights of the FW parameter prediction network and predict the corresponding FW parameters.
- the FW parameter setting module writes the FW parameters into the ISP memory corresponding to the ISP FW (i.e., the ISP software part).
- the ISP memory may be part of the shared memory or may be located in other storage, which is not limited in this application.
- FW parameters may already be saved in the ISP memory; the saved FW parameters may be the initial FW parameters or the FW parameters last written by the FW parameter setting module, which is not limited in this application.
- the FW parameter setting module can delete the FW parameters currently saved in the ISP memory and write the newly predicted FW parameters.
- ISP FW can read the FW parameters (that is, the newly written FW parameters) from the ISP memory.
- ISP FW can indicate the FW parameters to ISP HW.
- ISP FW can modify the register state of ISP HW to write the FW parameters into the registers of ISP HW.
- ISP HW reads the Raw image from the shared memory and performs image processing on the Raw image based on the obtained FW parameters (for example, read from the register) to obtain an RGB image.
- ISP HW can output the acquired RGB image to the camera application.
- the camera application can display the acquired RGB image on the display of the electronic device.
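The data path of Figure 10 (shared memory, FW parameter setting module, ISP memory, ISP FW, ISP HW registers) can be modelled in a few lines. This is only a sketch under stated assumptions: `ToyDevice`, `toy_predictor`, and the single `gain` parameter are hypothetical, dictionaries stand in for the memories and registers, and a frame is a flat list of pixel values.

```python
from typing import Callable, Dict, List

Frame = List[float]           # toy Raw frame: pixel values in [0, 1]
FwParams = Dict[str, float]   # toy FW parameters


def toy_predictor(raw: Frame) -> FwParams:
    """Hypothetical stand-in for the FW parameter prediction network."""
    mean = sum(raw) / len(raw)
    gain = 0.5 / mean if mean > 0 else 1.0
    return {"gain": max(1.0, min(8.0, gain))}


class ToyDevice:
    def __init__(self, predictor: Callable[[Frame], FwParams], initial_params: FwParams):
        self.predictor = predictor
        self.shared_memory: Dict[str, Frame] = {}
        self.isp_memory: FwParams = dict(initial_params)  # initial FW parameters
        self.hw_registers: FwParams = {}

    def process_frame(self, raw: Frame) -> Frame:
        # The camera places the collected Raw image in shared memory.
        self.shared_memory["raw"] = list(raw)
        # The FW parameter setting module reads the Raw image and predicts.
        predicted = self.predictor(self.shared_memory["raw"])
        # The saved FW parameters are replaced by the newly predicted ones.
        self.isp_memory = dict(predicted)
        # ISP FW reads the ISP memory and writes the ISP HW registers.
        self.hw_registers = dict(self.isp_memory)
        # ISP HW reads the Raw image and processes it with the register values.
        gain = self.hw_registers["gain"]
        return [min(1.0, px * gain) for px in self.shared_memory["raw"]]
```

A dark frame therefore receives a larger gain than a well-exposed one, and each call overwrites whatever parameters the previous frame left in the ISP memory.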
- after the autonomous driving application obtains the RGB image output by the ISP HW, it can recognize the RGB image based on the application's CV perception network to obtain a recognition result, and perform an automatic driving operation based on the recognition result.
- the FW parameter prediction network in the FW parameter setting module can perform prediction on every frame of Raw image collected by the camera, that is, execute the process in Figure 10 for each Raw image.
- the FW parameter prediction network in the FW parameter setting module can perform prediction on the Raw images collected by the camera based on a preset period (which may, for example, be based on a time interval or on the frame rate of the images collected by the camera, can be set according to actual needs, and is not limited in this application).
- the FW parameter setting module can make a prediction once every 1 s to reduce the workload of the AI processor.
- the FW parameter setting module can work in sleep mode after being called by the application, and make a prediction only when the application layer sends it a wake-up signal.
- an application running on the CPU (such as an autonomous driving application) can send instruction information to the FW parameter setting module to trigger it to perform FW parameter prediction, and this prediction may be performed only once.
- the FW parameter setting module can also be triggered after the electronic device is started.
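The three triggering styles described above (every frame, a preset period, and wake-up on demand) can be sketched as a small scheduler. The class `PredictionScheduler`, its mode names, and the use of timestamps in seconds are assumptions for illustration; the application does not prescribe such an interface.

```python
from typing import Optional


class PredictionScheduler:
    """Decides, per frame, whether the FW parameter setting module should predict."""

    MODES = ("every_frame", "periodic", "on_demand")

    def __init__(self, mode: str, period_s: float = 1.0):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.period_s = period_s
        self._last_prediction_s: Optional[float] = None
        self._wake_requested = False

    def wake(self) -> None:
        """Wake-up signal from the application layer (on_demand mode)."""
        self._wake_requested = True

    def should_predict(self, timestamp_s: float) -> bool:
        if self.mode == "every_frame":
            return True
        if self.mode == "periodic":
            due = (self._last_prediction_s is None
                   or timestamp_s - self._last_prediction_s >= self.period_s)
            if due:
                self._last_prediction_s = timestamp_s
            return due
        # on_demand: predict once per wake-up signal, then sleep again.
        if self._wake_requested:
            self._wake_requested = False
            return True
        return False
```

With a period of 1 s and frames arriving every 0.5 s, only every second frame triggers a prediction, which is the workload reduction the text describes; in `on_demand` mode a single `wake()` call yields exactly one prediction.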
- the input of the FW parameter setting module may be a Raw image with a lower resolution after downsampling.
- the downsampling operation may be performed by the AI processor or by the ISP HW, which is not limited in this application.
- the ISP HW may obtain the original Raw image from the shared memory, and after downsampling the Raw image, output the Raw image with a lower resolution to the shared memory.
- the FW parameter setting module may read the Raw image with a lower resolution from the shared memory and perform subsequent processing.
- the Raw image read by the FW parameter setting module from the shared memory may also be obtained from the original Raw image by other encoding or compression, which is not limited in this application.
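As one possible reading of the downsampling step, the sketch below halves the resolution by averaging each 2x2 block. This is an assumption chosen for brevity: the application does not name a downsampling method, the function `downsample_2x2` is hypothetical, and the image is treated as a single-channel grid, whereas a real Raw (mosaic) image would need the color filter pattern to be respected.

```python
from typing import List

Image = List[List[float]]


def downsample_2x2(raw: Image) -> Image:
    """Average each non-overlapping 2x2 block of a single-channel image."""
    height, width = len(raw), len(raw[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must both be even")
    return [
        [
            (raw[r][c] + raw[r][c + 1] + raw[r + 1][c] + raw[r + 1][c + 1]) / 4
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]
```

The lower-resolution result is what the FW parameter setting module would read from shared memory in this variant, so the prediction network processes a quarter of the original pixels.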
- FIG. 11 shows a schematic block diagram of a device 1100 according to an embodiment of the present application.
- the device 1100 may include: a processor 1101 and a transceiver/transceiver pin 1102, and optionally, a memory 1103.
- in addition to a data bus, the bus 1104 includes a power bus, a control bus, and a status signal bus.
- the various buses are all referred to as bus 1104 in the figure.
- the memory 1103 may be used to store the instructions in the foregoing method embodiments.
- the processor 1101 can be used to execute instructions in the memory 1103, and control the receiving pin to receive signals, and control the transmitting pin to send signals.
- the device 1100 may be an electronic device, a camera module, or a chip of an electronic device in the above method embodiment.
- This embodiment also provides a computer storage medium that stores computer instructions.
- when the computer instructions are run on an electronic device, the electronic device is caused to execute the above related method steps to implement the method in the above embodiment.
- This embodiment also provides a computer program product.
- when the computer program product is run on a computer, it causes the computer to perform the above related steps to implement the method in the above embodiment.
- embodiments of the present application also provide a device.
- this device may be a chip or a camera assembly (or a camera module), and the device may include a connected processor and memory, where the memory is used to store computer-executable instructions. When the device is running, the processor can execute the computer-executable instructions stored in the memory, so that the chip executes the methods in each of the above method embodiments.
- the electronic equipment, computer storage media, computer program products or chips provided in this embodiment are all used to execute the corresponding methods provided above. Therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding methods provided above, which are not repeated here.
- the disclosed devices and methods can be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of modules or units is only a logical function division. In actual implementation, there may be other division methods.
- multiple units or components may be combined or integrated into another device, or some features may be ignored or not implemented.
- the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be in electrical, mechanical, or other forms.
- a unit described as a separate component may or may not be physically separate.
- a component shown as a unit may be one physical unit or multiple physical units, that is, it may be located in one place, or it may be distributed to multiple different places. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
- the above integrated units can be implemented in the form of hardware or software functional units.
- if an integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium.
- the technical solutions of the embodiments of the present application, in essence, or the part that contributes to the existing technology, or all or part of the technical solutions, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to cause a device (which can be a microcontroller, a chip, etc.) or a processor to execute all or part of the steps of the methods of the various embodiments of the present application.
- the aforementioned storage media include: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.
- the steps of the methods or algorithms described in connection with the disclosure of the embodiments of this application can be implemented in hardware or by a processor executing software instructions.
- Software instructions can be composed of corresponding software modules.
- Software modules can be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, a hard disk, a removable hard disk, a compact disc read-only memory (CD-ROM), or any other form of storage medium well known in the art.
- An exemplary storage medium is coupled to the processor such that the processor can read information from the storage medium and write information to the storage medium.
- the storage medium can also be an integral part of the processor.
- the processor and storage media may be located in an ASIC.
- Computer-readable media include computer storage media and communication media, where communication media include any medium that facilitates transfer of a computer program from one place to another.
- Storage media can be any available media that can be accessed by a general purpose or special purpose computer.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Signal Processing (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Neurology (AREA)
- Studio Devices (AREA)
- Image Processing (AREA)
Claims (23)
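- An image processing method, comprising: obtaining a first original image collected by a camera in a first scene; obtaining a first image signal processor (Image Signal Processing, ISP) parameter based on the first original image; performing image processing on the first original image based on the first ISP parameter to obtain a first image; outputting the first image to a target application, so that the target application performs image task processing on the first image; obtaining a second original image collected by the camera in a second scene; obtaining a second ISP parameter based on the second original image; performing image processing on the second original image based on the second ISP parameter to obtain a second image; and outputting the second image to the target application, so that the target application performs image task processing on the second image.
- The method according to claim 1, wherein the image content of the first original image differs from that of the second original image, and/or the image attributes of the first original image differ from those of the second original image.
- The method according to claim 2, wherein the first ISP parameter and the second ISP parameter are used to adjust at least one of the following image attributes of an image: brightness, color, noise, sharpness, and contrast.
- The method according to claim 2, wherein the image attributes of the first image and the second image meet the image task requirements of the target application.
- The method according to claim 1, wherein the obtaining a first ISP parameter based on the first original image comprises: inputting the first original image into an ISP parameter prediction model, and obtaining the first ISP parameter predicted by the ISP parameter prediction model based on the first original image; and the obtaining a second ISP parameter based on the second original image comprises: inputting the second original image into the ISP parameter prediction model, and obtaining the second ISP parameter predicted by the ISP parameter prediction model based on the second original image.
- The method according to claim 5, wherein the ISP parameter prediction model runs on a neural network processor (NPU).
- The method according to claim 5, wherein before the obtaining a first original image collected by a camera in a first scene, the method further comprises: obtaining the ISP parameter prediction model from the cloud.
- The method according to any one of claims 1 to 7, wherein the performing image processing on the first original image based on the first ISP parameter to obtain a first image comprises: replacing the ISP parameter currently saved in an ISP memory with the first ISP parameter, so that an ISP obtains the first ISP parameter from the ISP memory, performs image processing on the first original image based on the first ISP parameter, and outputs the first image; and the performing image processing on the second original image based on the second ISP parameter to obtain a second image comprises: replacing the first ISP parameter currently saved in the ISP memory with the second ISP parameter, so that the ISP obtains the second ISP parameter from the ISP memory, performs image processing on the second original image based on the second ISP parameter, and outputs the second image.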
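- A model training method, comprising: inputting N original images into an image signal processor (ISP) parameter prediction model, and obtaining N ISP parameters output by the ISP parameter prediction model, wherein the weight of the ISP parameter prediction model is a first weight, the N ISP parameters are obtained by the ISP parameter prediction model based on the first weight, and N is an integer greater than 1; inputting the N ISP parameters and the N original images into a proxy model, and obtaining N images output by the proxy model; inputting the N images into a target application, and obtaining an image task processing result output by the target application; and if the image task processing result does not meet a preset condition, adjusting the weight of the ISP parameter prediction model to a second weight, and executing again from the step of inputting N original images into the ISP parameter prediction model, until the image task processing result output by the target application meets the preset condition.
- The method according to claim 9, wherein the preset condition comprises: the difference between the preset true value labels of the N original images and the true value labels included in the image task processing result of the target application is less than a preset threshold.
- The method according to claim 9, wherein before the inputting N original images into the ISP parameter prediction model, the method further comprises: inputting M original images and M ISP parameters into an ISP, and obtaining M images output by the image processor, wherein the M original images are in one-to-one correspondence with the M ISP parameters, and M is an integer greater than 1; inputting the M original images and the M ISP parameters into the proxy model, and obtaining M images output by the proxy model, wherein the current weight of the proxy model is a third weight; and if the similarity between the M images output by the proxy model and the M images output by the image processor is less than a threshold, adjusting the weight of the proxy model to a fourth weight, and executing again from the step of inputting M original images and M ISP parameters into the ISP, until the similarity between the M images output by the proxy model and the M images output by the image processor is greater than the threshold.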
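- An image processing apparatus, comprising: an image signal processor (ISP) parameter prediction module, configured to obtain a first original image collected by a camera in a first scene; the ISP parameter prediction module being further configured to obtain a first image signal processor (ISP) parameter based on the first original image; an image processing ISP module, configured to perform image processing on the first original image based on the first ISP parameter to obtain a first image; the ISP module being further configured to output the first image to a target application, so that the target application performs image task processing on the first image; the ISP parameter prediction module being further configured to obtain a second original image collected by the camera in a second scene; the ISP parameter prediction module being further configured to obtain a second ISP parameter based on the second original image; the ISP module being further configured to perform image processing on the second original image based on the second ISP parameter to obtain a second image; and the ISP module being further configured to output the second image to the target application, so that the target application performs image task processing on the second image.
- The apparatus according to claim 12, wherein the image content of the first original image differs from that of the second original image, and/or the image attributes of the first original image differ from those of the second original image.
- The apparatus according to claim 13, wherein the first ISP parameter and the second ISP parameter are used to adjust at least one of the following image attributes of an image: brightness, color, noise, sharpness, and contrast.
- The apparatus according to claim 13, wherein the image attributes of the first image and the second image meet the image task requirements of the target application.
- The apparatus according to any one of claims 12 to 15, wherein the ISP parameter prediction module is configured to replace the ISP parameter currently saved in an ISP memory with the first ISP parameter; the ISP module is configured to obtain the first ISP parameter from the ISP memory, perform image processing on the first original image based on the first ISP parameter, and output the first image; the ISP parameter prediction module is configured to replace the first ISP parameter currently saved in the ISP memory with the second ISP parameter; and the ISP module is configured to obtain the second ISP parameter from the ISP memory, perform image processing on the second original image based on the second ISP parameter, and output the second image.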
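- A model training system, comprising an image signal processor (ISP) parameter prediction model, a proxy model, and a target application, wherein: N original images are input into the ISP parameter prediction model, and N ISP parameters output by the ISP parameter prediction model are obtained, the weight of the ISP parameter prediction model being a first weight, the N ISP parameters being obtained by the ISP parameter prediction model based on the first weight, and N being an integer greater than 1; the N ISP parameters and the N original images are input into the proxy model, and N images output by the proxy model are obtained; the N images are input into the target application; and if the N images do not meet the image task requirements of the target application, the weight of the ISP parameter prediction model is adjusted to a second weight, and execution starts again from the step of inputting N original images into the ISP parameter prediction model, until the N images output by the proxy model meet the image task requirements of the target application.
- The system according to claim 17, wherein the system further comprises an ISP; before the N original images are input into the ISP parameter prediction model, M original images and M ISP parameters are input into the ISP, and M images output by the image processor are obtained, wherein the M original images are in one-to-one correspondence with the M ISP parameters, and M is an integer greater than 1; the M original images and the M ISP parameters are input into the proxy model, and M images output by the proxy model are obtained, the current weight of the proxy model being a third weight; and if the similarity between the M images output by the proxy model and the M images output by the image processor is less than a threshold, the weight of the proxy model is adjusted to a fourth weight, the fourth weight being different from the third weight, and execution starts again from the inputting of M original images and M ISP parameters into the image processor and the obtaining of M images output by the image processor, until the similarity between the M images output by the proxy model and the M images output by the image processor is greater than the threshold.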
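- An electronic device, comprising: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory, and when the computer programs are executed by the one or more processors, the electronic device is caused to perform the method according to any one of claims 1 to 8.
- A camera module, comprising: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory, and when the computer programs are executed by the one or more processors, the camera module is caused to perform the method according to any one of claims 1 to 8.
- A chip, comprising one or more interface circuits and one or more processors, wherein the interface circuit is configured to receive a signal from a memory of an electronic device and send the signal to the processor, the signal comprising computer instructions stored in the memory; and when the processor executes the computer instructions, the electronic device is caused to perform the method according to any one of claims 1 to 8.
- A computer storage medium, comprising computer instructions, wherein when the computer instructions are run on an electronic device, the electronic device is caused to perform the method according to any one of claims 1 to 8.
- A computer program product, wherein when the computer program product is run on a computer, the computer is caused to perform the method according to any one of claims 1 to 8.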
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23864498.3A EP4576018A4 (en) | 2022-09-14 | 2023-07-31 | IMAGE PROCESSING METHOD AND APPARATUS |
| US19/078,875 US20250247629A1 (en) | 2022-09-14 | 2025-03-13 | Image Processing Method and Apparatus |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211115044.XA CN117726929A (zh) | 2022-09-14 | 2022-09-14 | Image processing method and apparatus |
| CN202211115044.X | 2022-09-14 |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US19/078,875 Continuation US20250247629A1 (en) | 2022-09-14 | 2025-03-13 | Image Processing Method and Apparatus |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024055764A1 (zh) | 2024-03-21 |
Family
ID=90203939
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/110156 Ceased WO2024055764A1 (zh) | 2022-09-14 | 2023-07-31 | Image processing method and apparatus |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20250247629A1 (zh) |
| EP (1) | EP4576018A4 (zh) |
| CN (1) | CN117726929A (zh) |
| WO (1) | WO2024055764A1 (zh) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119277026A (zh) * | 2024-10-21 | 2025-01-07 | 南京优宇旭科技有限公司 | Visual content management system for fixed traffic environments |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118175433A (zh) * | 2024-05-13 | 2024-06-11 | 成都云创天下科技有限公司 | Method for automatic ISP tuning based on different scenes within the same video frame |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
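| CN108632512A (zh) * | 2018-05-17 | 2018-10-09 | Oppo广东移动通信有限公司 | Image processing method and apparatus, electronic device, and computer-readable storage medium |
| WO2021204202A1 (zh) * | 2020-04-10 | 2021-10-14 | 华为技术有限公司 | Method and apparatus for automatic white balance of images |
| CN114615495A (zh) * | 2020-12-09 | 2022-06-10 | Oppo广东移动通信有限公司 | Model quantization method, apparatus, terminal, and storage medium |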
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111656781A (zh) * | 2018-01-30 | 2020-09-11 | 高通股份有限公司 | Systems and methods for image signal processor tuning using a reference image |
| US10796200B2 (en) * | 2018-04-27 | 2020-10-06 | Intel Corporation | Training image signal processors using intermediate loss functions |
| CN114677719A (zh) * | 2020-12-09 | 2022-06-28 | 华为技术有限公司 | Image signal processing method, apparatus, and computer-readable storage medium |
- 2022-09-14 CN CN202211115044.XA patent/CN117726929A/zh active Pending
- 2023-07-31 EP EP23864498.3A patent/EP4576018A4/en active Pending
- 2023-07-31 WO PCT/CN2023/110156 patent/WO2024055764A1/zh not_active Ceased
- 2025-03-13 US US19/078,875 patent/US20250247629A1/en active Pending
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4576018A4 |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4576018A1 (en) | 2025-06-25 |
| US20250247629A1 (en) | 2025-07-31 |
| EP4576018A4 (en) | 2025-11-26 |
| CN117726929A (zh) | 2024-03-19 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23864498; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023864498; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2023864498; Country of ref document: EP; Effective date: 20250319 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWP | Wipo information: published in national office | Ref document number: 2023864498; Country of ref document: EP |