WO2024252977A1 - Signal processing device and signal processing method
- Publication number
- WO2024252977A1 (PCT/JP2024/019482)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- demosaic
- scene
- unit
- artificial intelligence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/617—Upgrading or updating of programs or applications for camera control
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/80—Camera processing pipelines; Components thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/64—Circuits for processing colour signals
Definitions
- the present technology relates to a signal processing device that performs an image analysis process using an AI model and a method thereof.
- an image analysis process such as an object detection process or an object recognition process on a captured image using an artificial intelligence (AI) model including a neural network such as a convolutional neural network (CNN)
- the present technology has been made in view of the above circumstances, and it is desirable to improve processing accuracy of the image analysis process using an AI model.
- An image processing system includes circuitry configured to acquire image data captured by an image sensor, process the image data to generate a non-demosaic image, and selectively output the non-demosaic image to an artificial intelligence model trained to perform image analysis. Since the demosaic process involves a spatial interpolation process, deviation from the original pixel value tends to occur, and in a case where the demosaiced image is used as input data of the image analysis process by the AI model, the processing accuracy may decrease due to the deviation of the pixel value. By using the non-demosaic image as the input data of the image analysis process as described above, it is possible to prevent degradation in processing accuracy due to such a demosaic process.
- Fig. 1 is a block diagram illustrating a schematic configuration example of a camera device including a signal processing device as the first embodiment.
- Fig. 2 is a diagram for describing an internal configuration example of an image signal processing unit included in the signal processing device as the first embodiment.
- Fig. 3 is a diagram illustrating a result of an experiment on a change characteristic of the image analysis processing accuracy in a case where a type of input data of an AI processing unit is changed.
- Fig. 4 is an explanatory diagram of a color separation image according to the embodiment.
- Fig. 5 is a diagram for considering a factor of degradation in analysis processing accuracy that occurs in a case where a demosaiced image is used.
- FIG. 6 is a flowchart illustrating a specific processing procedure example for realizing an analysis processing method as the first embodiment.
- Fig. 7 is a block diagram for describing a configuration example of a camera device as the second embodiment.
- Fig. 8 is a flowchart illustrating a specific processing procedure example for realizing an analysis processing method as a modification.
- Fig. 9 is an explanatory diagram of an exemplary configuration in which an AI processing unit is provided outside a sensor device.
- <1. First embodiment> (1-1. Configuration example of camera device) (1-2. Analysis processing method as first embodiment) (1-3. Processing procedure)
- FIG. 1 is a block diagram illustrating a schematic configuration example of a camera device 10 including a signal processing device as the first embodiment according to the present technology. As illustrated in the drawing, the camera device 10 includes an optical system 11, a communication interface (I/F) 12, a camera control unit 13, an out-of-sensor memory unit 14, and a communication unit 15 together with a sensor unit 1.
- the sensor unit 1 corresponds to the signal processing device as the first embodiment.
- the sensor unit 1 is configured as, for example, an image sensor such as a charge coupled device (CCD) type image sensor or a complementary metal oxide semiconductor (CMOS) type image sensor.
- the sensor unit 1 is configured to be capable of performing not only an imaging function but also the image analysis process using an artificial intelligence (AI) model as image analysis process on a captured image.
- in the sensor unit 1, a plurality of pixels that receive light of different wavelength bands are formed as pixels each of which has a light receiving element, so that a captured image can be obtained as a color image.
- the optical system 11 includes lenses such as a cover lens and a focus lens, and a diaphragm (iris) mechanism.
- Light (incident light) from a subject is guided by the optical system 11 and condensed on the light receiving surface of the sensor unit 1.
- the communication interface (I/F) 12 is a communication interface for performing data communication between the sensor unit 1 and the camera control unit 13.
- the camera control unit 13 includes, for example, a microcomputer including a central processing unit (CPU), a read only memory (ROM), and a random access memory (RAM), and performs overall control of the camera device 10 by the CPU executing various types of processing in accordance with a program stored in the ROM or a program loaded in the RAM.
- the camera control unit 13 can receive various pieces of data from the sensor unit 1 and transmits various pieces of data to the sensor unit 1 via the communication interface 12.
- the out-of-sensor memory unit 14 is connected to the camera control unit 13.
- the out-of-sensor memory unit 14 includes, for example, a non-volatile storage device such as a solid state drive (SSD) or a flash memory device, and is used to store information used for various kinds of control by the camera control unit 13. Furthermore, the out-of-sensor memory unit 14 can also be used to store various pieces of data obtained in the sensor unit 1, such as captured image data by the sensor unit 1.
- first AI model setting data P1 and second AI model setting data P2 are stored in the out-of-sensor memory unit 14, which will be described later again.
- the communication unit 15 is connected to the camera control unit 13.
- the communication unit 15 is configured to be able to perform wired or wireless data communication with an external device.
- the communication unit 15 can also be configured to have a network communication function, and in this case, the camera control unit 13 can exchange data with a predetermined device (for example, the server device) on a predetermined network such as the Internet via the communication unit 15, for example.
- the sensor unit 1 includes a pixel array unit 2, an image signal processing unit 3, a preprocessing unit 4, an AI processing unit 5, an in-sensor control unit 6, an in-sensor memory unit 7, an output data generation unit 8, and a communication interface (I/F) 9.
- in the pixel array unit 2, for example, a plurality of pixels, each of which has a light receiving element (photoelectric conversion element) such as a photodiode, are two-dimensionally disposed in the horizontal direction and the vertical direction.
- the pixel array unit 2 includes a pixel unit Pu (see Fig. 4A to be described later) in which a plurality of pixels that receives light of different wavelength bands is two-dimensionally disposed in a predetermined pattern, and a plurality of the pixel units Pu is two-dimensionally disposed.
- the pixel unit Pu is formed by disposing three types of pixels of an R pixel that receives R (red) light, a G pixel that receives G (green) light, and a B pixel that receives B (blue) light in a predetermined array pattern.
- the pixel unit Pu in the present example is formed by disposing R pixels, G pixels, and B pixels in a Bayer array.
- the pixel array unit 2 also includes a configuration for obtaining image data as digital data, such as a readout circuit for reading out a value (received light value) of each pixel and an analog to digital converter (ADC) for digitally sampling a pixel value as an analog signal.
- An image signal processing unit (ISP: Imaging Signal Processor) 3 receives image data (captured image data) obtained by the pixel array unit 2, and performs various types of image signal processes. Note that the internal configuration of the image signal processing unit 3 will be described again later.
- the preprocessing unit 4 receives image data after the image signal process by the image signal processing unit 3, and performs an image signal process as preprocessing for the image analysis process by the AI processing unit 5.
- the preprocessing unit 4 in the present example is configured to be able to perform at least image resizing processing.
- the AI processing unit 5 performs the image analysis process using the AI model using the image data output from the preprocessing unit 4 as input data.
- the AI processing unit 5 includes, for example, a digital signal processor (DSP), and can switch an AI model used for the image analysis process by switching processing parameters.
- the AI processing unit 5 of the present example is configured to be able to perform the image analysis process using an AI model including a neural network such as a convolutional neural network (CNN), for example.
- the camera device 10 in the present example is disposed in a commercial facility such as a supermarket or a department store and performs the image analysis process with a person as a customer as a target subject.
- the object detection process with a person as a target subject is performed.
- as the AI model in the AI processing unit 5, an AI model machine-trained to perform an object detection process with a person as a target subject is used.
- the object detection process includes a process of identifying a region where the target subject is present as a so-called bounding box.
- the in-sensor control unit 6 includes a microcomputer including, for example, a CPU, a ROM, a RAM, and the like, and integrally controls the operation of the sensor unit 1.
- the in-sensor control unit 6 controls the operation of the pixel array unit 2. Specifically, control of start/stop of operation and the like are performed.
- the in-sensor control unit 6 also controls the operations of the image signal processing unit 3, the preprocessing unit 4, and the AI processing unit 5.
- the in-sensor control unit 6 can control process parameters of various types of processing.
- as the control of the AI processing unit 5, the in-sensor control unit 6 can perform switching control of the AI model.
- the in-sensor memory unit 7 is connected to the in-sensor control unit 6.
- the in-sensor memory unit 7 includes, for example, a non-volatile storage device such as a flash memory device, and is used to store information used for various kinds of control by the in-sensor control unit 6.
- the output data generation unit 8 receives the analysis process result by the AI processing unit 5 and the image data output from the image signal processing unit 3, and generates output data to be output to the outside of the sensor unit 1.
- the output data generation unit 8 generates output data on the basis of an instruction from the in-sensor control unit 6. For example, it is conceivable to switch whether or not to use both the analysis process result by the AI processing unit 5 and the image data as output data or use only the analysis process result by the AI processing unit 5 as output data on the basis of an instruction from the in-sensor control unit 6.
- the communication interface 9 is a communication interface for enabling data output from the inside of the sensor unit 1 to the outside of the sensor unit 1 and data input from the outside of the sensor unit 1 to the inside of the sensor unit 1, and performs data communication according to a predetermined communication data format with the communication interface 12 described above.
- the above-described output data can be output to the outside of the sensor unit 1 (the camera control unit 13 in this example) via the communication interface 9.
- the in-sensor control unit 6 can perform data communication with the camera control unit 13 via the communication interface 9.
- Fig. 2 is a diagram for describing an internal configuration example of the image signal processing unit 3. Note that Fig. 2 illustrates the preprocessing unit 4, the AI processing unit 5, and the in-sensor control unit 6 illustrated in Fig. 1 together with the internal configuration example of the image signal processing unit 3.
- Image data as RAW data is input to the image signal processing unit 3 from the pixel array unit 2 illustrated in Fig. 1.
- the RAW data referred to herein means image data obtained by reading the value of each pixel in raster order, that is, in the present example, image data in a state where the pixel array in the Bayer array is maintained.
- the image signal processing unit 3 includes a black level correction unit 31, a shading correction unit 32, a gain adjustment unit 33, a demosaic processing unit 34, a color correction unit 35, a gamma correction unit 36, and a dewarp processing unit 37, and can sequentially perform a black level correction process, a shading correction process, a gain adjustment process, a demosaic process, a color correction process, a gamma correction process, and a dewarping process on image data as RAW data input from the pixel array unit 2.
- the gain adjustment process by the gain adjustment unit 33 includes overall gain adjustment for adjusting the luminance distribution of the entire image regardless of the color of the pixel, and an auto white balance (AWB) process that is the gain adjustment process for each color.
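- The two kinds of gain described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the RGGB site layout, the function name `apply_gains`, and all gain values are assumptions chosen for the example.

```python
import numpy as np

def apply_gains(bayer, overall_gain, awb_gains):
    """Apply an overall gain and per-color AWB gains to a Bayer mosaic.

    Assumes an RGGB layout (R at even row/even column); the layout and
    the gain values used below are illustrative assumptions.
    """
    out = bayer.astype(np.float64) * overall_gain  # color-independent gain
    out[0::2, 0::2] *= awb_gains["R"]              # R sites
    out[0::2, 1::2] *= awb_gains["G"]              # G sites on R rows
    out[1::2, 0::2] *= awb_gains["G"]              # G sites on B rows
    out[1::2, 1::2] *= awb_gains["B"]              # B sites
    return out

mosaic = np.full((4, 4), 100.0)
balanced = apply_gains(mosaic, overall_gain=2.0,
                       awb_gains={"R": 1.5, "G": 1.0, "B": 2.0})
```

- Because the gains are applied per site, the output keeps the Bayer arrangement and remains a non-demosaic image.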
- the demosaic process by the demosaic processing unit 34 is a process of generating image data as an R image, a G image, and a B image with the same number of pixels as the input image data by performing a spatial interpolation process for each color of R, G, and B from the input image data in the Bayer array state.
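- To make the spatial interpolation concrete, the sketch below fills each missing color value with the mean of the same-color samples in its 3 x 3 neighborhood. This simple neighborhood averaging is a stand-in chosen for illustration; the source does not specify which interpolation the demosaic processing unit 34 uses, and the RGGB layout and the names `demosaic_simple` and `_box3` are assumptions.

```python
import numpy as np

def _box3(a):
    """Sum over each 3x3 neighborhood, using zero padding at the border."""
    p = np.pad(a, 1)
    h, w = a.shape
    return sum(p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3))

def demosaic_simple(bayer):
    """Interpolate an RGGB mosaic into full-size R, G, and B planes.

    Measured samples are kept as-is; every missing value is the mean of
    the same-color samples found in its 3x3 neighborhood.
    """
    bayer = bayer.astype(np.float64)
    masks = {c: np.zeros(bayer.shape) for c in "RGB"}
    masks["R"][0::2, 0::2] = 1
    masks["G"][0::2, 1::2] = 1
    masks["G"][1::2, 0::2] = 1
    masks["B"][1::2, 1::2] = 1
    planes = {}
    for color, mask in masks.items():
        interpolated = _box3(bayer * mask) / _box3(mask)
        planes[color] = np.where(mask == 1, bayer, interpolated)
    return planes

# A gray scene with a vertical step edge: every color has the same true value.
scene = np.zeros((6, 6))
scene[:, 3:] = 100.0
planes = demosaic_simple(scene)
deviation = abs(planes["R"][0, 3] - scene[0, 3])
```

- At the pixel next to the edge, the interpolated R value is the average of a dark and a bright neighbor, so it differs from the true value at that position. This is the kind of deviation from the original pixel value that interpolation can introduce.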
- the color correction unit 35 performs a color correction process by a linear matrix process on the image data subjected to the demosaic process. Furthermore, the dewarp processing unit 37 performs at least a lens distortion correction process as the dewarping process.
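- A linear matrix process of this kind amounts to multiplying every RGB pixel by a 3 x 3 matrix. The sketch below assumes an H x W x 3 array layout; the function name `color_correct` and the matrix coefficients are made-up values for illustration, not values from the source.

```python
import numpy as np

def color_correct(rgb, matrix):
    """Apply a 3x3 linear matrix to every pixel of an H x W x 3 image."""
    # For each pixel p (a length-3 vector) this computes matrix @ p.
    return rgb @ matrix.T

# Illustrative matrix; each row sums to 1 so that gray pixels stay gray.
ccm = np.array([[ 1.5, -0.3, -0.2],
                [-0.2,  1.4, -0.2],
                [-0.1, -0.4,  1.5]])

gray = np.full((2, 2, 3), 10.0)
corrected = color_correct(gray, ccm)
```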
- the image signal processing unit 3 in the present embodiment also includes an image organization unit 38 and a selector 39, which will be described later again.
- the present embodiment proposes a method of using a non-demosaic image that is a captured image in a state not subjected to a demosaic process as input data of the AI processing unit 5. Since the demosaic process involves a spatial interpolation process, deviation from the original pixel value tends to occur, and in a case where a demosaiced image is used as input data of the image analysis process by the AI model, the accuracy of the image analysis process may decrease due to the deviation of the pixel value.
- the input data of the AI processing unit 5 is switched between a demosaiced image that is an image after the demosaic process by the demosaic processing unit 34 and a non-demosaic image on the basis of the scene determination result for the scene to be imaged.
- the resources (memory capacity and computing capability) that can be allocated to the image analysis process tend to be insufficient.
- an AI model such as an object detection process
- a scene captured by a camera can change over time: for example, between a bright scene and a dark scene such as day and night, or between a scene in which a subject to be analyzed such as a person is present relatively far away and a scene in which the subject is present nearby. It is desirable that the analysis processing accuracy be maintained at a certain level or more even with respect to such a change in the scene.
- the method of absorbing the change in the scene in the AI model can be applied in a case where the execution entity of the image analysis process is a computer device having a relatively rich resource, specifically, a computer device outside the camera device 10, but is difficult to be applied in a case where the image analysis process is performed in the camera device 10 having a limited resource, particularly, in a case where the image analysis process is performed in the sensor device (sensor unit 1).
- a method of switching the input data of the AI processing unit 5 between the demosaiced image and the non-demosaic image on the basis of the scene determination result as described above is used.
- Fig. 3 illustrates a result of an experiment on a change characteristic of image analysis processing accuracy in a case where the type of input data of the AI processing unit 5 is changed.
- the experimental result of Fig. 3 illustrates a result in a case where the object detection process with the human body as the target subject is performed as the image analysis process using the AI model, and illustrates a measurement result of the evaluation value of the image analysis processing accuracy in a case where the image data of each extraction position indicated by <1> to <6> in Fig. 2 is used as the input data, for each of the dark scene and the bright scene.
- the evaluation value here is obtained as a ratio at which 90% or more of persons can be accurately detected in a case where the object detection process with the human body as the target subject is performed a plurality of times.
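- One possible reading of this evaluation value is the fraction of runs in which at least 90% of the persons present were detected. The sketch below follows that reading only; the exact definition is not given in the source, and the function name `evaluation_value` and the sample counts are hypothetical.

```python
def evaluation_value(trials, required_rate=0.9):
    """Fraction of runs in which the detection rate reached required_rate.

    trials is a list of (detected_persons, total_persons) pairs, one per
    run of the object detection process. Runs with no persons are counted
    as not passing, which is an arbitrary choice made for this sketch.
    """
    passed = sum(
        1 for detected, total in trials
        if total > 0 and detected / total >= required_rate
    )
    return passed / len(trials)

# Four hypothetical runs: detection rates 100%, 90%, 80%, and 100%.
value = evaluation_value([(10, 10), (9, 10), (8, 10), (5, 5)])
```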
- the dark scene means a scene in which the target subject appears dark, and can be defined as, for example, a scene in which the luminance value of the target subject is equal to or less than a certain value.
- the bright scene means a scene in which the target subject appears brightly, and can be defined as, for example, a scene in which the luminance value of the target subject exceeds the certain value.
- the extraction position of <1> is a position immediately before input to the image signal processing unit 3.
- the extraction position of <2> is a position between the shading correction unit 32 and the gain adjustment unit 33.
- the extraction position of <3> is a position between the gain adjustment unit 33 and the demosaic processing unit 34.
- the extraction positions of <1> to <3> are all positions before demosaicing.
- image data as a color separation image to be described below is used as the image data of the extraction positions of <1> to <3> in correspondence with the use of the AI model having the CNN as the AI model.
- Fig. 4 is an explanatory diagram of a color separation image according to the embodiment.
- Fig. 4A illustrates a pixel array (Bayer array in this example) in the pixel array unit 2.
- the pixel unit Pu is formed by disposing four pixels of R, G, G, and B in a predetermined array pattern according to the Bayer format.
- the color separation image means an image formed by collecting pixel values at the same pixel unit position from the respective pixel units Pu and disposing the pixel values in different regions on the same image plane.
- Fig. 4B illustrates a color separation image generated in the case of the Bayer array.
- pixel values of pixels having the same pixel unit position are collected and disposed in different regions on the same image plane for the R pixel, the G pixel, the G pixel, and the B pixel in respective pixel units Pu, thereby generating a color separation image including an image region in which pixel values of the R pixels in respective pixel units Pu are arrayed, an image region in which pixel values of one G pixel in respective pixel units Pu are arrayed, an image region in which pixel values of the other G pixel in respective pixel units Pu are arrayed, and an image region in which pixel values of the B pixel in respective pixel units Pu are arrayed.
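- The rearrangement described above can be sketched with strided slicing. The RGGB site order and the placement of the four regions as quadrants (R top-left, the two G regions top-right and bottom-left, B bottom-right) are assumptions, since Fig. 4B is not reproduced here; `color_separation_image` is a hypothetical name.

```python
import numpy as np

def color_separation_image(bayer):
    """Regroup an RGGB Bayer mosaic into four same-color regions.

    The output has the same size as the input; each quadrant collects the
    pixel values found at one position of every 2x2 pixel unit.
    """
    h, w = bayer.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    hh, hw = h // 2, w // 2
    out = np.empty_like(bayer)
    out[:hh, :hw] = bayer[0::2, 0::2]  # R pixels
    out[:hh, hw:] = bayer[0::2, 1::2]  # first G pixel of each unit
    out[hh:, :hw] = bayer[1::2, 0::2]  # second G pixel of each unit
    out[hh:, hw:] = bayer[1::2, 1::2]  # B pixels
    return out

mosaic = np.arange(16).reshape(4, 4)
separated = color_separation_image(mosaic)
```

- No value is interpolated or discarded: the output is a permutation of the input pixels, so the original pixel values reach the AI model unchanged.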
- the format of the input data to the AI processing unit is an input data format suitable for the configuration of the CNN, and the accuracy of the image analysis process can be improved.
- the extraction positions of <4> to <6> are extraction positions after demosaicing: <4> is a position between the demosaic processing unit 34 and the color correction unit 35, <5> is a position between the color correction unit 35 and the gamma correction unit 36, and <6> is a position between the gamma correction unit 36 and the dewarp processing unit 37.
- in the bright scene, the evaluation value is 90% or more regardless of the data type of the input data from <1> to <6>. From this, it can be seen that, in the case of the bright scene, the accuracy of the object detection process tends not to depend on the difference in data type.
- the input data to the AI processing unit 5 is resized (for example, thinned out) to a predetermined size in the preprocessing unit 4 so as to reduce the amount of input data to the AI processing unit 5. It is considered that degradation in the processing accuracy in the case of using the demosaiced image occurs due to this point.
- the amount of input data of the AI processing unit 5 is limited to the amount of data of 1280 × 960 pixels
- an image with a resolution (information density) of 1280 × 960 pixels can be used as the input data
- the resolution of each of the R image, the B image, and the G image to be the input data of the AI processing unit 5 is reduced to 739 × 554 pixels.
- the degradation in resolution in this manner is also considered to be a cause of degradation in processing accuracy in a case where a demosaiced image is used as input data.
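- The 739 × 554 figure is consistent with splitting the 1280 × 960 pixel budget equally among three color planes while keeping the aspect ratio, which scales each side by 1/√3. The check below assumes that reading; the source states the numbers but not the derivation, and `per_plane_resolution` is a hypothetical helper.

```python
import math

def per_plane_resolution(width, height, planes):
    """Resolution of each plane when a pixel budget is shared equally.

    Both sides are scaled by 1/sqrt(planes) so that the aspect ratio is
    kept and planes * (w * h) stays within the original width * height.
    """
    scale = math.sqrt(planes)
    return round(width / scale), round(height / scale)

demosaiced_plane = per_plane_resolution(1280, 960, 3)  # R, G, and B planes
single_plane = per_plane_resolution(1280, 960, 1)      # one non-demosaic plane
```

- Under the same budget, a single-plane non-demosaic image keeps the full 1280 × 960 sampling grid, whereas each demosaiced plane is limited to roughly one third of the pixel count.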
- a method is used in which it is determined whether or not it is a dark scene, and a non-demosaic image is input as input data of the AI processing unit 5 in a case where the scene is determined to be the dark scene.
- even in a case where an AI model capable of absorbing a difference between scenes cannot be used due to a resource problem, it is possible to suppress degradation in image analysis processing accuracy in a dark scene, and thus to suppress scene-dependent degradation in image analysis processing accuracy.
- the non-demosaic image after shading correction indicated as <2> is used as the input data in the dark scene.
- the color separation image of the non-demosaic image obtained between the shading correction unit 32 and the gain adjustment unit 33 is used as input data in the dark scene.
- the image signal processing unit 3 is provided with the image organization unit 38.
- the image organization unit 38 receives the non-demosaic image after the shading correction output from the shading correction unit 32, and rearranges the pixel values by the method described above with reference to Fig. 4, thereby generating a color separation image for the non-demosaic image after the shading correction.
- the demosaiced image is input as the input data of the AI processing unit 5.
- the demosaiced image output from the dewarp processing unit 37 is provided as input data of the AI processing unit 5.
- the image signal processing unit 3 is provided with a selector 39.
- the selector 39 receives inputs of the non-demosaic image that is the color separation image in the image organization unit 38 and the demosaiced image output from the dewarp processing unit 37 to output the image instructed by the in-sensor control unit 6 to the preprocessing unit 4.
- the in-sensor control unit 6 determines whether or not it is a dark scene on the basis of the demosaiced image. Specifically, the in-sensor control unit 6 in the present example determines whether or not it is a dark scene on the basis of the demosaiced image output by the dewarp processing unit 37. Then, as a result of determining whether or not it is a dark scene, the in-sensor control unit 6 causes the selector 39 to output (select) the demosaiced image in a case where it is determined that the scene is not a dark scene, and causes the selector 39 to output (select) the non-demosaic image in a case where it is determined that the scene is a dark scene.
- the in-sensor control unit 6 performs control so that an AI model trained with a demosaiced image as input data for training is used as the AI model in a case where the demosaiced image is used as input data of the AI processing unit 5, and an AI model trained with a non-demosaic image as input data for training is used as the AI model in a case where the non-demosaic image is used as input data of the AI processing unit 5.
- the setting data of the AI processing unit 5 for realizing the former AI model, that is, the AI model trained with the demosaiced image as the input data for training, is stored as the first AI model setting data P1 in the out-of-sensor memory unit 14 illustrated in Fig. 1.
- the setting data of the AI processing unit 5 for realizing the latter AI model, that is, the AI model trained with the non-demosaic image as the input data for training, is stored as the second AI model setting data P2 in the out-of-sensor memory unit 14.
- the first AI model setting data P1 and the second AI model setting data P2 are data including parameters as filter coefficients used in a filtering process such as a convolution process in the CNN and various parameters related to the structure of the neural network.
- to use the former AI model, the in-sensor control unit 6 instructs the camera control unit 13 via the communication interface 9 to read the first AI model setting data P1 from the out-of-sensor memory unit 14 and transfer the first AI model setting data P1 to the in-sensor control unit 6. Then, by performing parameter setting of the AI processing unit 5 according to the transferred first AI model setting data P1, the AI processing unit 5 can execute the image analysis process by the AI model trained with the demosaiced image as the input data for training.
- likewise, to use the latter AI model, the in-sensor control unit 6 instructs the camera control unit 13 via the communication interface 9 to read the second AI model setting data P2 from the out-of-sensor memory unit 14 and transfer the second AI model setting data P2 to the in-sensor control unit 6. Then, by performing parameter setting of the AI processing unit 5 according to the transferred second AI model setting data P2, the AI processing unit 5 can execute the image analysis process by the AI model trained with the non-demosaic image as the input data for training.
- A specific processing procedure example to be executed by the in-sensor control unit 6 to realize the analysis processing method as the first embodiment described above will be described with reference to the flowchart of Fig. 6.
- the processing illustrated in Fig. 6 is executed by the CPU in the in-sensor control unit 6 on the basis of a program stored in a predetermined storage device such as the ROM of the in-sensor control unit 6.
- the execution entity of the process is the in-sensor control unit 6 for the sake of explanation.
- the in-sensor control unit 6 starts the processing illustrated in Fig. 6 in response to activation.
- in step S101, the in-sensor control unit 6 determines whether or not the scene determination execution condition is satisfied. That is, it is determined whether or not a predetermined condition determined in advance as a condition that a scene determination process (a process of determining whether or not the scene is a dark scene in this example) in step S103 described later is to be executed is satisfied.
- the scene determination process in step S103 may be periodically performed at predetermined time intervals, for example. In this case, the scene determination execution condition is only required to be the passage of a certain period of time.
- alternatively, the scene determination execution condition may be that an instruction from the outside has been given, or that, as a result of simple brightness detection using an illuminance sensor or the like, there has been a change in brightness by a predetermined amount or more; the scene determination execution condition is not limited to a specific condition.
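- The example conditions listed above (elapsed time, an external instruction, a brightness change) could be combined as in the following sketch. The class `TriggerState`, the function name, and the default thresholds are all hypothetical; the source explicitly leaves the condition open.

```python
from dataclasses import dataclass

@dataclass
class TriggerState:
    """Hypothetical inputs to the step S101 check."""
    seconds_since_last: float   # time since the last scene determination
    external_request: bool      # an instruction from outside was received
    brightness_change: float    # change reported by an illuminance sensor

def should_run_scene_determination(state, interval_s=60.0, brightness_delta=50.0):
    """Return True if any of the example execution conditions holds."""
    return (
        state.seconds_since_last >= interval_s
        or state.external_request
        or abs(state.brightness_change) >= brightness_delta
    )

idle = should_run_scene_determination(TriggerState(5.0, False, 3.0))
timed = should_run_scene_determination(TriggerState(61.0, False, 0.0))
dimmed = should_run_scene_determination(TriggerState(5.0, False, -80.0))
```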
- in a case where it is determined in step S101 that the scene determination execution condition is not satisfied, the in-sensor control unit 6 proceeds to step S102 and determines whether or not the process is to be ended, that is, whether or not a predetermined condition determined in advance as a condition that the series of processes illustrated in Fig. 6 should be ended, such as power-off, for example, is satisfied.
- in a case where it is determined in step S101 that the scene determination execution condition is satisfied, the in-sensor control unit 6 proceeds to step S103 and executes the scene determination process. That is, in the present example, it is determined whether or not the scene is a dark scene on the basis of the demosaiced image output from the dewarp processing unit 37. Specifically, it is determined whether or not the average luminance value of the region of the target subject is equal to or less than a predetermined luminance value. At this time, it is conceivable to identify the region of the target subject on the basis of, for example, region information of the target subject obtained as a result of the object detection process by the AI processing unit 5.
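- The dark scene test in step S103 can be sketched as a threshold on the mean luminance inside the detected bounding box. The Rec. 601 luma weights, the bounding box format, the threshold value, and the function name `is_dark_scene` are assumptions; the source only specifies a comparison of the average luminance value of the target subject region with a predetermined value.

```python
import numpy as np

def is_dark_scene(rgb, bbox, threshold):
    """Return True if the mean luminance inside bbox is <= threshold.

    rgb is an H x W x 3 demosaiced image; bbox is (top, left, bottom,
    right) with bottom and right exclusive. Luminance is approximated
    with Rec. 601 weights, which is an assumption of this sketch.
    """
    top, left, bottom, right = bbox
    region = rgb[top:bottom, left:right].astype(np.float64)
    luma = region @ np.array([0.299, 0.587, 0.114])
    return bool(luma.mean() <= threshold)

frame = np.zeros((8, 8, 3))
frame[:4, :4] = 200.0  # a bright patch in the top-left corner
bright_subject = is_dark_scene(frame, (0, 0, 4, 4), threshold=64.0)
dark_subject = is_dark_scene(frame, (4, 4, 8, 8), threshold=64.0)
```

- Restricting the average to the subject region means that a dark subject in front of a bright background is still classified as a dark scene.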
- step S104 the in-sensor control unit 6 determines whether or not it is a dark scene. That is, it is determined whether or not a determination result indicating a dark scene has been obtained as a result of the determination process in step S103.
- step S104 the in-sensor control unit 6 proceeds to step S105 and gives an instruction to select a demosaiced image. That is, it instructs the selector 39 to selectively output the demosaiced image input from the dewarp processing unit 37.
- step S106 the in-sensor control unit 6 performs a setting process of the first AI model. That is, the control unit instructs the camera control unit 13 via the communication interface 9 to read the first AI model setting data P1 from the out-of-sensor memory unit 14 and transfer to the control unit, and performs parameter setting of the AI processing unit 5 according to the transferred first AI model setting data P1.
- the AI processing unit 5 performs the image analysis process using the AI model trained with the demosaiced image as input data and the demosaiced image as input data for training.
- the in-sensor control unit 6 returns to step S101 in response to the execution of the setting process in step S106.
- in a case where it is determined in step S104 that the scene is a dark scene, the in-sensor control unit 6 proceeds to step S107 and instructs the selector 39 to select the non-demosaic image, that is, to selectively output the non-demosaic image in the form of the color separation image input from the image organization unit 38.
- in step S108 following step S107, the in-sensor control unit 6 performs a setting process of the second AI model. That is, the in-sensor control unit 6 instructs the camera control unit 13 via the communication interface 9 to read the second AI model setting data P2 from the out-of-sensor memory unit 14 and transfer it to the in-sensor control unit 6, and sets the parameters of the AI processing unit 5 according to the transferred second AI model setting data P2.
- as a result, the AI processing unit 5 performs the image analysis process with the non-demosaic image in the form of the color separation image as input data, using the AI model trained with such non-demosaic images as input data for training.
- the in-sensor control unit 6 returns to step S101 in response to the execution of the setting process in step S108.
- in a case where it is determined in step S102 that the process is to be ended, the in-sensor control unit 6 ends the series of processes illustrated in Fig. 6.
- for the scene determination process, it is also conceivable to have the AI processing unit 5 determine whether or not the scene is a dark scene, by providing that determination function as part of the AI model in the AI processing unit 5.
- the in-sensor control unit 6 performs control of the selector 39 (and switching control of the AI model) on the basis of the result of the determination process as to whether or not it is a dark scene performed by the AI processing unit 5 as described above.
- Fig. 7 is a block diagram for describing a configuration example of a camera device 10A as the second embodiment having a non-demosaic image accumulation function. Note that, although illustration of the optical system 11 and illustration of the pixel array unit 2, the in-sensor memory unit 7, and the output data generation unit 8 are omitted in Fig. 7 for convenience of illustration, the camera device 10A of the second embodiment also includes the optical system 11, the pixel array unit 2, the in-sensor memory unit 7, and the output data generation unit 8 as in the case of the camera device 10.
- first AI model setting data P1 and the second AI model setting data P2 are also stored in the out-of-sensor memory unit 14 in the camera device 10A as in the case of the camera device 10.
- the camera device 10A is different from the camera device 10 illustrated in Fig. 1 in that a sensor unit 1A is provided instead of the sensor unit 1.
- the sensor unit 1A is different from the sensor unit 1 in that an in-sensor control unit 6A is provided instead of the in-sensor control unit 6.
- the in-sensor control unit 6A is different from the in-sensor control unit 6 in that the in-sensor control unit 6A performs a process for accumulating non-demosaic images in a memory in the camera device 10A, specifically, in the present example, in the out-of-sensor memory unit 14.
- the non-demosaic image by the color separation image generated by the image organization unit 38 can be input not only to the selector 39 but also to the communication interface 9.
- the in-sensor control unit 6A causes the camera control unit 13 to transmit the non-demosaic image by the color separation image via the communication interface 9, and instructs the camera control unit 13 to store the non-demosaic image in the out-of-sensor memory unit 14 via the communication interface 9.
- the captured image by the non-demosaic image that can be obtained under the actual use environment of the camera device 10A can be accumulated in the out-of-sensor memory unit 14.
- the in-sensor control unit 6A performs a process of transmitting the non-demosaic image accumulated in the out-of-sensor memory unit 14 to the outside of the camera device 10A. Specifically, the in-sensor control unit 6A instructs the camera control unit 13 to execute processing of transmitting the non-demosaic image accumulated in the out-of-sensor memory unit 14 to the external device on the basis of an instruction from the external device (for example, a server device or the like) of the camera device 10A. As a result, it is possible to transmit the non-demosaic image to be used for retraining to the external device corresponding to a case where the retraining of the AI model used by the AI processing unit 5 is performed by the external device.
- the external device for example, a server device or the like
- the in-sensor control unit 6A instructs the camera control unit 13 to execute the transmission process for the accumulated non-demosaic image
- the camera control unit 13 performs the transmission process for the accumulated non-demosaic image on the basis of an instruction from the outside.
- the accumulation unit that accumulates the non-demosaic image is the out-of-sensor memory unit 14, but the accumulation unit may be a memory in the sensor unit 1A such as the in-sensor memory unit 7.
- the sensor unit 1A includes a communication unit capable of directly communicating with an external device of the camera device 10A
- an execution entity of the process of transmitting the accumulated non-demosaic image may be the in-sensor control unit 6A.
- the embodiment is not limited to the specific example described above, and may be configured as various modifications.
- the parameter adjustment according to the dark scene/bright scene as described above is an effective method in a case where the object detection process is performed as the image analysis process using the AI model.
- a segmentation process such as semantic segmentation is performed as the image analysis process using an AI model
- class identification is performed for each block such as each pixel in the segmentation process. Therefore, when gamma correction or overall gain adjustment is performed, processing accuracy in a block of a bright portion or a dark portion may deteriorate. Therefore, in a case where the segmentation process is performed as the image analysis process using an AI model, it is conceivable to use a captured image in a state where gamma correction or overall gain adjustment is not performed as input data.
- Fig. 8 is a flowchart illustrating a specific processing procedure example for realizing an analysis processing method as a modification in which the signal processing parameter of the image signal processing unit 3 is changed on the basis of the scene determination result as described above.
- the processing illustrated in Fig. 8 may be performed by either the in-sensor control unit 6 or the in-sensor control unit 6A, but the processing will be described here as being executed by the in-sensor control unit 6.
- the processing illustrated in Fig. 8 is different from the processing illustrated in Fig. 6 in that the processing of steps S110 and S111 in the drawing is added.
- the in-sensor control unit 6 in this case performs the signal processing parameter adjustment process in step S110 in response to the execution of the setting process of the first AI model in step S106.
- in the signal processing parameter adjustment process in step S110, which corresponds to a case where it is determined that the scene is not a dark scene, control is performed so that the parameter for preventing overexposure is set in the gain adjustment unit 33 and the gamma correction unit 36.
- the in-sensor control unit 6 performs the signal processing parameter adjustment process in step S111 in response to the execution of the setting process of the second AI model in step S108.
- in the signal processing parameter adjustment process in step S111, which in the present example corresponds to a case where it is determined that the scene is a dark scene, control is performed so that the above-described parameter for preventing black crushing is set in the gain adjustment unit 33 and the gamma correction unit 36.
- the in-sensor control unit 6 returns the process to step S101 after performing either of the signal processing parameter adjustment processes of steps S110 and S111.
- the configuration in which the AI processing unit 5 is provided in the sensor device has been exemplified, but it is also conceivable to have a configuration in which the AI processing unit 5 is provided outside the sensor device as in the camera device 10B illustrated in Fig. 9.
- a sensor unit 1B including only the pixel array unit 2 and the communication interface 9 is provided instead of the sensor unit 1 (or the sensor unit 1A).
- the image signal processing unit 3, the preprocessing unit 4, and the AI processing unit 5 are provided outside the sensor unit 1B.
- a camera device 10B includes a camera control unit 13B instead of the camera control unit 13.
- a captured image as RAW data obtained by the pixel array unit 2 is input to the image signal processing unit 3 via the communication interface 9 and the communication interface 12. Then, the image data selected by the selector 39 of the image signal processing unit 3 is input to the AI processing unit 5 via the preprocessing unit 4.
- the camera control unit 13B controls the selector 39 based on the scene determination result.
- the camera control unit 13B performs the AI model setting process of the AI processing unit 5 on the basis of the first AI model setting data P1 and the second AI model setting data P2 stored in the out-of-sensor memory unit 14, as in the in-sensor control unit 6.
- the camera control unit 13B performs a process of storing the non-demosaic image obtained by the image signal processing unit 3 in a memory in the camera device 10B such as the out-of-sensor memory unit 14.
- the camera control unit 13B can also perform processing of transmitting the accumulated non-demosaic image to an external device.
- the demosaiced image/non-demosaic image is switched on the basis of the determination result as to whether or not it is a dark scene.
- for the scene determination, it is also conceivable to adopt a configuration in which three or more scenes are determined and the extraction position of the input data of the AI processing unit 5 is changed for each scene.
- the input data may be a non-demosaic image after shading correction in the first scene
- the input data may be a non-demosaic image after AWB processing in the second scene
- the input data may be a demosaiced image in the third scene.
- a case where the pixel array unit 2 is configured to separately receive light of only three wavelength bands of R, G, and B has been exemplified.
- the present technology can also be suitably applied to a case where a pixel array unit configured to separately receive light of four or more wavelength bands, such as a pixel array unit in a multi spectrum camera, is used.
- in a case where the AI processing unit 5 is assumed to switch among a plurality of AI models having different analysis tasks, it is also conceivable to determine whether or not to set the input data of the AI processing unit 5 to a non-demosaic image according to the type of AI model used by the AI processing unit 5 (that is, the type of analysis task).
- the signal processing device (sensor unit 1, 1A, camera device 10B) as the embodiment includes an AI processing unit (5) that performs an image analysis process using an AI model with a non-demosaic image as input data, the non-demosaic image being a captured image in a state not subjected to a demosaic process, the captured image being obtained by a pixel array unit (2) configured by a plurality of pixel units two-dimensionally disposed, each pixel unit including a plurality of pixels receiving light of different wavelength bands, the plurality of pixels being two-dimensionally disposed in a predetermined pattern.
- an AI processing unit (5) that performs an image analysis process using an AI model with a non-demosaic image as input data, the non-demosaic image being a captured image in a state not subjected to a demosaic process, the captured image being obtained by a pixel array unit (2) configured by a plurality of pixel units two-dimensionally disposed, each pixel unit including a plurality of pixels receiving light
- the demosaic process involves a spatial interpolation process, deviation from the original pixel value tends to occur, and in a case where the demosaiced image is used as input data of the image analysis process by the AI model, the processing accuracy may decrease due to the deviation of the pixel value.
- the non-demosaic image as the input data of the image analysis process as described above, it is possible to prevent degradation in processing accuracy due to such a demosaic process. Therefore, the processing accuracy of the image analysis process using the AI model can be improved.
- the signal processing device includes a demosaic processing unit (34) that performs a demosaic process on a captured image obtained by the pixel array unit; and a control unit (in-sensor control unit 6, 6A, camera control unit 13B) that performs control so that input data of the AI processing unit is switched between a demosaiced image and the non-demosaic image on a basis of a scene determination result for a scene to be imaged, the demosaiced image being an image after the demosaic process by the demosaic processing unit.
- a demosaic processing unit (34) that performs a demosaic process on a captured image obtained by the pixel array unit
- a control unit (in-sensor control unit 6, 6A, camera control unit 13B)
- the accuracy of the image analysis process may be improved by using the non-demosaic image instead of the demosaiced image depending on the scene. That is, even when a model that does not have a function of absorbing a difference between scenes is used as the AI model, the accuracy of the image analysis process may be improved by adopting a method of using a non-demosaic image instead of a demosaiced image in a certain scene.
- the non-demosaic image can be used as the input data in a specific scene in which the image analysis processing accuracy is improved in a case where the non-demosaic image is used as the input data. Therefore, even in a case where an AI model capable of absorbing a difference between scenes cannot be used due to a resource problem, it is possible to suppress degradation in image analysis processing accuracy depending on scenes.
- control unit performs control so that an AI model trained with a demosaiced image as input data for training is used as the AI model in a case where the demosaiced image is input data, and an AI model trained with a non-demosaic image as input data for training is used as the AI model in a case where the non-demosaic image is input data.
- an AI model trained with a demosaiced image as input data for training is used as the AI model in a case where the demosaiced image is input data
- an AI model trained with a non-demosaic image as input data for training is used as the AI model in a case where the non-demosaic image is input data.
- the determination of the scene to be imaged is determination of whether or not the scene is a dark scene
- the control unit sets a non-demosaic image as the input data in a case where it is determined that the scene is a dark scene.
- accuracy of the image analysis process may be improved by using a non-demosaic image instead of a demosaiced image. Therefore, according to the above configuration, even in a case where the AI model capable of absorbing the difference between the scenes cannot be used due to the resource problem, it is possible to suppress degradation in the image analysis processing accuracy depending on the scene.
- the signal processing device includes an image signal processing unit (3) that includes a demosaic processing unit and performs an image signal process on a captured image, and the control unit changes a signal processing parameter of the image signal processing unit on the basis of a scene determination result (see Fig. 8).
- the control unit changes a signal processing parameter of the image signal processing unit on the basis of a scene determination result (see Fig. 8).
- the AI model is an AI model having a CNN as its neural network
- the signal processing device includes an image organization unit (38) that collects pixel values for respective pixels having a same in-pixel-unit position from each of the pixel units and generates a color separation image that is an image formed by being disposed in different regions on a same image plane, and the AI processing unit performs the image analysis process using the color separation image as input data.
- the format of the input data to the AI processing unit is an input data format suitable for the configuration of the CNN. Therefore, the accuracy of the image analysis process can be improved.
- the AI processing unit performs the image analysis process using the non-demosaic image after shading correction as input data.
- the analysis processing accuracy can be improved corresponding to a specific scene such as a dark scene.
- the signal processing device includes an accumulation unit (in-sensor memory unit 7, out-of-sensor memory unit 14) that accumulates non-demosaic images.
- the non-demosaic images accumulated in the accumulation unit can be used for retraining of the AI model used in the AI processing unit. As the AI model can be retrained, the accuracy of the image analysis process by the AI processing unit can be improved.
- the signal processing device as an embodiment includes a transmission processing unit (camera control unit 13, in-sensor control unit 6A, camera control unit 13B) that performs a process of transmitting the non-demosaic images accumulated in the accumulation unit to the outside of the device.
- a transmission processing unit (camera control unit 13, in-sensor control unit 6A, camera control unit 13B)
- it is possible to transmit the non-demosaic images to be used for retraining to an external device, corresponding to a case where the retraining of the AI model is performed by a device external to the signal processing device. Therefore, retraining of the AI model can be appropriately performed.
- the signal processing device (sensor unit 1, 1A) as an embodiment is configured as a sensor device including a pixel array unit.
- the accuracy of the image analysis process can be improved for the sensor device configured to be capable of performing the image analysis process using the AI model.
- a signal processing method as an embodiment includes performing an image analysis process using an AI model with a non-demosaic image as input data, the non-demosaic image being a captured image in a state not subjected to a demosaic process, the captured image being obtained by a pixel array unit including a plurality of pixel units two-dimensionally disposed, each pixel unit including a plurality of pixels two-dimensionally disposed in a predetermined pattern, the plurality of pixels receiving light in different wavelength bands.
- An image processing system comprising: circuitry configured to acquire image data captured by an image sensor, process the image data to generate a non-demosaic image, and selectively output the non-demosaic image to an artificial intelligence model trained to perform image analysis.
- the image processing system of (1) wherein the non-demosaic image is a captured image in a state not subjected to a demosaic process.
- the image processing system of (2) wherein the non-demosaic image is the captured image in the state not subjected to a demosaic process and after shading correction.
- the image processing system of (1) further comprising: a memory configured to store one or more non-demosaic images.
- the artificial intelligence model is a convolutional neural network.
- the circuitry is further configured to determine whether a scene corresponding to the acquired image data is a dark scene.
- the image processing system of (7), wherein the scene is a dark scene in a case that a luminance value of a target subject is equal to or less than a predetermined luminance value.
- the non-demosaic image is a color separation image of the non-demosaic image.
- the circuitry is further configured to acquire artificial intelligence model setting data, perform parameter setting for the artificial intelligence model based on the acquired artificial intelligence model setting data, wherein, based on the parameter setting, the artificial intelligence model is an artificial intelligence model trained with non-demosaic images as input for training.
- the circuitry is further configured to in a case that it cannot be determined whether a scene corresponding to the acquired image data is the dark scene, select the non-demosaic image.
- the circuitry is further configured to in response to a determination that the scene is not the dark scene, selectively output a demosaic image to the artificial intelligence model trained to perform image analysis.
- An image processing system comprising: an image sensor configured to capture image data; a processor configured to acquire the image data captured by the image sensor, process the image data to generate a non-demosaic image, and selectively output the non-demosaic image to an artificial intelligence model trained to perform image analysis; and a communication interface configured to output an image analysis result of the trained artificial intelligence model to a camera processing circuitry.
- An image processing method comprising: acquiring image data captured by an image sensor; processing the image data to generate a non-demosaic image; and selectively outputting the non-demosaic image to an artificial intelligence model trained to perform image analysis.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- General Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Studio Devices (AREA)
- Image Analysis (AREA)
Abstract
Description
In such an image analysis process, it is common to use 3ch images of red (R), green (G), and blue (B) as input data of the AI model (see, for example,
Since the demosaic process involves a spatial interpolation process, deviation from the original pixel value tends to occur, and in a case where the demosaiced image is used as input data of the image analysis process by the AI model, the processing accuracy may decrease due to the deviation of the pixel value. By using the non-demosaic image as the input data of the image analysis process as described above, it is possible to prevent degradation in processing accuracy due to such a demosaic process.
<1. First embodiment>
(1-1. Configuration example of camera device)
(1-2. Analysis processing method as first embodiment)
(1-3. Processing procedure)
<2. Second embodiment>
<3. Modifications>
<4. Summary of Embodiment>
<5. Present Technology>
(1-1. Configuration example of camera device)
Fig. 1 is a block diagram illustrating a schematic configuration example of a
As illustrated in the drawing, the
The
The
Furthermore, in the
The
In the
The
Specifically, the
Note that the internal configuration of the image
The
The
In this case, as the AI model in the
For example, the in-
Furthermore, the in-
Furthermore, as the control of the
The above-described output data can be output to the outside of the sensor unit 1 (the
Furthermore, the in-
Note that Fig. 2 illustrates the
Furthermore, the
As described above, in the application in which a person appreciates an image, it is desirable to use images of 3 ch of R, G, and B because full color expression can be performed. However, in the image analysis process using an AI model, it is not necessarily the best in terms of analysis processing accuracy to use images of 3 ch of R, G, and B as input data.
Since the demosaic process involves a spatial interpolation process, deviation from the original pixel value tends to occur, and in a case where a demosaiced image is used as input data of the image analysis process by the AI model, the accuracy of the image analysis process may decrease due to the deviation of the pixel value.
On the other hand, by using the non-demosaic image as the input data of the
For example, in an image analysis process using an AI model, such as an object detection process, it is required in practice that accuracy be maintained even when the scene captured by the camera changes to some degree. The captured scene can change over time, for example between bright and dark scenes such as day and night, or between a scene in which a subject to be analyzed, such as a person, is relatively far away and a scene in which the subject is nearby. It is desirable that the analysis processing accuracy be maintained at a certain level or higher even with respect to such changes in the scene.
Therefore, the method of absorbing the change in the scene in the AI model can be applied in a case where the execution entity of the image analysis process is a computer device having a relatively rich resource, specifically, a computer device outside the
Specifically, the experimental result of Fig. 3 illustrates a result in a case where the object detection process with the human body as the target subject is performed as the image analysis process using the AI model, and illustrates a measurement result of the evaluation value of the image analysis processing accuracy in a case where the image data of each extraction position indicated by <1> to <6> in Fig. 2 is used as the input data for each scene of the dark scene and the bright scene.
The evaluation value here is obtained as a ratio at which 90% or more of persons can be accurately detected in a case where the object detection process with the human body as the target subject is performed a plurality of times.
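The evaluation value defined above can be illustrated with a short sketch. It assumes one reading of the definition, namely that the evaluation value is the fraction of detection runs in which at least 90% of the persons present were correctly detected. The function name, the per-run input format, and the `min_recall` parameter are hypothetical and do not appear in the source.

```python
from typing import Sequence, Tuple


def evaluation_value(trials: Sequence[Tuple[int, int]], min_recall: float = 0.9) -> float:
    """Return the fraction of runs in which at least `min_recall` of persons were detected.

    Each trial is a pair (correctly_detected_persons, persons_present).
    This input format is an assumption made for illustration.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    passed = 0
    for detected, present in trials:
        if present <= 0:
            raise ValueError("each trial must contain at least one person")
        # A run counts as successful when the detected share reaches the threshold.
        if detected / present >= min_recall:
            passed += 1
    return passed / len(trials)
```

For example, with four runs in which 9, 8, 10, and 5 out of 10 persons were detected, two runs reach the 90% threshold, so the sketch returns 0.5.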
The extraction positions of <1> to <3> are all positions before demosaicing. In the experiment, image data as a color separation image to be described below is used as the image data of the extraction positions of <1> to <3> in correspondence with the use of the AI model having the CNN as the AI model.
Fig. 4A illustrates a pixel array (Bayer array in this example) in the
Fig. 4B illustrates a color separation image generated in the case of the Bayer array. In the case of the Bayer array as illustrated, pixel values of pixels having the same pixel unit position are collected and disposed in different regions on the same image plane for the R pixel, the G pixel, the G pixel, and the B pixel in respective pixel units Pu, thereby generating a color separation image including an image region in which pixel values of the R pixels in respective pixel units Pu are arrayed, an image region in which pixel values of one G pixel in respective pixel units Pu are arrayed, an image region in which pixel values of the other G pixel in respective pixel units Pu are arrayed, and an image region in which pixel values of the B pixel in respective pixel units Pu are arrayed.
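The rearrangement described for Fig. 4B can be sketched as follows. This is a minimal illustration and not the image organization unit itself: it assumes a 2x2 pixel unit and places the four planes in the four quadrants of an output image of the same size. The quadrant layout and the function name `color_separation_image` are assumptions, since the actual arrangement of the regions in Fig. 4B is not reproduced here.

```python
from typing import List


def color_separation_image(raw: List[List[int]]) -> List[List[int]]:
    """Rearrange a mosaic frame with 2x2 pixel units into four quadrant planes.

    Pixels sharing the same position inside their pixel unit are gathered
    into the same quadrant of the output (assumed layout for illustration).
    """
    height, width = len(raw), len(raw[0])
    if height % 2 or width % 2:
        raise ValueError("frame dimensions must be multiples of the 2x2 pixel unit")
    half_h, half_w = height // 2, width // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # (y % 2, x % 2) is the position inside the pixel unit and selects the quadrant.
            # (y // 2, x // 2) is the pixel unit index and selects the position in that quadrant.
            out[(y % 2) * half_h + y // 2][(x % 2) * half_w + x // 2] = raw[y][x]
    return out
```

With a Bayer frame whose pixel unit is laid out as R, G on the first row and G, B on the second, the top-left quadrant of the result holds the R pixel values, the top-right and bottom-left quadrants hold the two G planes, and the bottom-right quadrant holds the B pixel values.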
The extraction positions of <4> to <6> are extraction positions after demosaicing, <4> is a position between the
According to the experiment, in the case of the data types <1> to <3> before demosaicing, the evaluation value is maintained at 60% or more, whereas in the case of the data types <4> and <5> after demosaicing, the evaluation value is significantly lower than 60%. In the case of the data type <6>, that is, the demosaiced image after gamma correction, a slight increase in the evaluation value can be confirmed, but it remains below 60%.
In the dark scene, the case of using the data type of <2>, that is, the image after shading correction and before gain adjustment has the highest evaluation value, and the evaluation value is about 75%.
The degradation in resolution in this manner is also considered to be a cause of degradation in processing accuracy in a case where a demosaiced image is used as input data.
As a result, in a case where an AI model capable of absorbing a difference between scenes cannot be used due to a resource problem, it is possible to suppress degradation in image analysis processing accuracy in a dark scene, and it is possible to suppress degradation in image analysis processing accuracy depending on a scene.
As illustrated in Fig. 2, the
Specifically, in the present example, in a case where it is determined that it is not a dark scene, the demosaiced image output from the
The
Then, as a result of determining whether or not it is a dark scene, the in-
Here, the first AI model setting data P1 and the second AI model setting data P2 are data including parameters as filter coefficients used in a filtering process such as a convolution process in the CNN and various parameters related to the structure of the neural network.
Furthermore, in a case where it is determined that it is a dark scene, the in-
A specific processing procedure example to be executed by the in-
Note that the processing illustrated in Fig. 6 is executed by the CPU in the in-
In the present example, the in-
The scene determination process in step S103 may be performed periodically at predetermined time intervals, for example. In this case, the scene determination execution condition is simply the passage of a certain period of time.
Note that it is also conceivable that the scene determination execution condition is that an instruction from the outside has been given, that as a result of simple brightness detection using an illuminance sensor or the like, there has been a change in brightness by a predetermined amount or more, or the like, and the scene determination execution condition is not limited to a specific condition.
In a case where it is determined in step S102 that the process is not ended, the in-
That is, a loop process of waiting for either the establishment of the scene determination execution condition or the establishment of the processing end condition is formed by the processes of steps S101 and S102.
As a result, in a case where it is determined that the scene is not a dark scene, the
As a result, in a case where it is determined that the scene is a dark scene, the
Next, the second embodiment will be described.
In the second embodiment, the non-demosaic image is accumulated on the assumption that the AI model used in the
Note that, although illustration of the
The
As illustrated, in the
The in-
As a result, the captured image by the non-demosaic image that can be obtained under the actual use environment of the
As a result, it is possible to transmit the non-demosaic image to be used for retraining to the external device corresponding to a case where the retraining of the AI model used by the
Here, in a case where the
Note that the embodiment is not limited to the specific example described above, and may be configured as various modifications.
For example, it is also conceivable to change the signal processing parameter of the image
In a case where a segmentation process such as semantic segmentation is performed as the image analysis process using an AI model, class identification is performed for each block such as each pixel in the segmentation process. Therefore, when gamma correction or overall gain adjustment is performed, processing accuracy in a block of a bright portion or a dark portion may deteriorate.
Therefore, in a case where the segmentation process is performed as the image analysis process using an AI model, it is conceivable to use a captured image in a state where gamma correction or overall gain adjustment is not performed as input data.
Here, the processing illustrated in Fig. 8 may be performed by either the in-
Specifically, the in-
In addition, the in-
Specifically, in the
Furthermore, in a case where the non-demosaic image accumulation described in the second embodiment is realized, the
For example, it is conceivable to switch the input data to the non-demosaic image in a scene where the target subject is imaged at a small size and high resolution is required, and to switch the input data to the demosaiced image in a scene where the target subject is imaged at a large size.
As described above, the signal processing device (
Since the demosaic process involves a spatial interpolation process, deviation from the original pixel value tends to occur, and in a case where the demosaiced image is used as input data of the image analysis process by the AI model, the processing accuracy may decrease due to the deviation of the pixel value. By using the non-demosaic image as the input data of the image analysis process as described above, it is possible to prevent degradation in processing accuracy due to such a demosaic process.
Therefore, the processing accuracy of the image analysis process using the AI model can be improved.
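The deviation introduced by spatial interpolation can be seen in a small numeric sketch. It assumes an RGGB Bayer pattern and plain bilinear interpolation of the green channel; the publication does not specify the demosaic algorithm, so both choices, and the helper names `bayer_channel` and `interpolate_green`, are assumptions for illustration only.

```python
def bayer_channel(y, x):
    """Color sampled at pixel (y, x) for an assumed RGGB Bayer pattern."""
    if y % 2 == 0:
        return "R" if x % 2 == 0 else "G"
    return "G" if x % 2 == 0 else "B"


def interpolate_green(mosaic, y, x):
    """Bilinear green estimate at a red or blue site.

    In an RGGB pattern the four direct neighbours of a red or blue site
    are all green sites, so their mean is used as the estimate.
    """
    h, w = len(mosaic), len(mosaic[0])
    neighbours = [
        mosaic[j][i]
        for j, i in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
        if 0 <= j < h and 0 <= i < w
    ]
    return sum(neighbours) / len(neighbours)


# True green plane with a vertical edge between columns 1 and 2.
true_green = [[10 if x < 2 else 200 for x in range(4)] for y in range(4)]

# The sensor records green only at green sites; 50 stands in for R/B samples.
mosaic = [
    [true_green[y][x] if bayer_channel(y, x) == "G" else 50 for x in range(4)]
    for y in range(4)
]

estimate = interpolate_green(mosaic, 2, 2)      # red site next to the edge
deviation = abs(estimate - true_green[2][2])    # 152.5 estimated vs 200 true
```

Next to the edge, the interpolated green value (152.5) differs from the true value (200) by 47.5, whereas the samples in the mosaic itself are unaltered sensor values.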
As described with reference to Fig. 3, it has been confirmed by an experiment that the accuracy of the image analysis process may be improved by using the non-demosaic image instead of the demosaiced image depending on the scene. That is, even when a model that does not have a function of absorbing a difference between scenes is used as the AI model, the accuracy of the image analysis process may be improved by adopting a method of using a non-demosaic image instead of a demosaiced image in a certain scene.
As described above, by switching the input data of the AI processing unit between the demosaiced image and the non-demosaic image on the basis of the scene determination result, the non-demosaic image can be used as the input data in a specific scene in which using the non-demosaic image as the input data improves the image analysis processing accuracy.
Therefore, even in a case where an AI model capable of absorbing a difference between scenes cannot be used due to a resource problem, it is possible to suppress degradation in image analysis processing accuracy depending on scenes.
As a result, an appropriate image analysis process according to the scene can be executed as the image analysis process using the AI model.
As described with reference to Fig. 3, it has been confirmed that, in a dark scene, accuracy of the image analysis process may be improved by using a non-demosaic image instead of a demosaiced image.
Therefore, according to the above configuration, even in a case where the AI model capable of absorbing the difference between the scenes cannot be used due to the resource problem, it is possible to suppress degradation in the image analysis processing accuracy depending on the scene.
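A minimal sketch of the dark scene determination and the resulting input selection follows. It reflects the configurations listed later in this document: a scene is dark when the luminance of the target subject is at or below a predetermined value, and the non-demosaic image is selected both for a dark scene and when the scene cannot be determined. Using the mean of the luminance samples, and the names `is_dark_scene` and `select_input`, are assumptions for illustration.

```python
from typing import Optional, Sequence


def is_dark_scene(subject_luminance: Sequence[float],
                  threshold: float) -> Optional[bool]:
    """Return True/False for dark/not dark, or None if undeterminable.

    The mean of the samples is an assumed summary statistic; the
    threshold stands for the predetermined luminance value.
    """
    if not subject_luminance:
        return None  # no target subject samples: cannot determine
    mean = sum(subject_luminance) / len(subject_luminance)
    return mean <= threshold


def select_input(is_dark: Optional[bool]) -> str:
    """Choose which image type is fed to the AI model."""
    if is_dark is None or is_dark:
        return "non_demosaic"
    return "demosaiced"


choice_dark = select_input(is_dark_scene([10, 20, 30], threshold=40))
choice_bright = select_input(is_dark_scene([100, 200], threshold=40))
choice_unknown = select_input(is_dark_scene([], threshold=40))
```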
As a result, it is possible to perform the image signal process suitable for a non-demosaic image in a case where it is determined that the scene is a specific scene and the image analysis process using the non-demosaic image as input data is performed, and perform the image signal process suitable for a demosaiced image in a case where it is determined that the scene is not a specific scene and the image analysis process using the demosaiced image as input data is performed.
Therefore, the accuracy of the image analysis process can be improved.
As a result, the format of the input data to the AI processing unit is an input data format suitable for the configuration of the CNN.
Therefore, the accuracy of the image analysis process can be improved.
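One plausible way to put a non-demosaic image into a CNN-friendly format is to separate the mosaic into one plane per color site, which may correspond to the "color separation image" named in the configurations below. The sketch assumes an RGGB pattern and even image dimensions; the function name `color_separate` and the plane names are hypothetical.

```python
def color_separate(mosaic):
    """Split an RGGB mosaic into four half-resolution planes.

    Each plane holds the samples of one site of the 2x2 Bayer cell,
    so the values themselves are left untouched (no interpolation).
    """
    h, w = len(mosaic), len(mosaic[0])
    if h % 2 or w % 2:
        raise ValueError("mosaic dimensions must be even")
    offsets = {"R": (0, 0), "G1": (0, 1), "G2": (1, 0), "B": (1, 1)}
    return {
        name: [[mosaic[y][x] for x in range(dx, w, 2)]
               for y in range(dy, h, 2)]
        for name, (dy, dx) in offsets.items()
    }


# 4x4 mosaic whose value encodes its position (row * 4 + column).
example_mosaic = [[y * 4 + x for x in range(4)] for y in range(4)]
planes = color_separate(example_mosaic)
```

The four planes can then be stacked as input channels, so that each convolution kernel only mixes samples of known color sites.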
As described with reference to Fig. 3, in a scene in which a non-demosaic image should be used as input data, use of a non-demosaic image after shading correction can improve analysis processing accuracy, compared with use of a non-demosaic image before shading correction.
Therefore, according to the above configuration, the analysis processing accuracy can be improved corresponding to a specific scene such as a dark scene.
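For readers unfamiliar with shading correction, the following sketch applies a position-dependent gain directly to the mosaic, brightening pixels toward the corners. The radial gain model, the `strength` parameter, and the function names are assumptions chosen for illustration; the publication does not describe how the shading correction unit computes its gains.

```python
def shading_gain(y, x, h, w, strength):
    """Gain that grows with squared distance from the image centre (assumed model)."""
    cy, cx = (h - 1) / 2, (w - 1) / 2
    r2 = (y - cy) ** 2 + (x - cx) ** 2
    r2_max = cy ** 2 + cx ** 2
    return 1.0 + strength * (r2 / r2_max if r2_max else 0.0)


def correct_shading(mosaic, strength=0.5):
    """Apply the gain to every sample of a non-demosaic image."""
    h, w = len(mosaic), len(mosaic[0])
    return [
        [mosaic[y][x] * shading_gain(y, x, h, w, strength) for x in range(w)]
        for y in range(h)
    ]


flat = [[100] * 3 for _ in range(3)]
corrected = correct_shading(flat, strength=0.5)
```

Because the gain is applied per sample, the correction needs no demosaic step and can run on the mosaic before it is passed to the AI model.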
The non-demosaic images accumulated in the accumulation unit can be used for retraining of the AI model used in the AI processing unit.
As the AI model can be retrained, the accuracy of the image analysis process by the AI processing unit can be improved.
As a result, in a case where the retraining of the AI model is performed by a device external to the signal processing device, the non-demosaic images to be used for the retraining can be transmitted to that external device.
Therefore, retraining of the AI model can be appropriately performed.
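The accumulation and hand-off of non-demosaic images can be sketched with a bounded buffer. The class name `NonDemosaicAccumulator`, the fixed capacity, and the policy of discarding the oldest image when the buffer is full are assumptions for illustration; the transmission itself is left to the caller.

```python
from collections import deque


class NonDemosaicAccumulator:
    """Hypothetical bounded store for non-demosaic images awaiting retraining."""

    def __init__(self, capacity):
        # deque with maxlen drops the oldest entry once capacity is reached.
        self._buffer = deque(maxlen=capacity)

    def accumulate(self, image):
        """Store one non-demosaic image."""
        self._buffer.append(image)

    def drain(self):
        """Return all stored images (oldest first) and empty the buffer."""
        images = list(self._buffer)
        self._buffer.clear()
        return images

    def __len__(self):
        return len(self._buffer)


accumulator = NonDemosaicAccumulator(capacity=2)
for frame_id in ("frame0", "frame1", "frame2"):
    accumulator.accumulate(frame_id)
batch_for_retraining = accumulator.drain()  # would be sent to the external device
```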
As a result, the accuracy of the image analysis process can be improved for the sensor device configured to be capable of performing the image analysis process using the AI model.
With such a signal processing method, it is possible to obtain functions and effects similar to those of the signal processing devices of the above-described embodiments.
The present technology can also adopt the following configurations.
(1)
An image processing system, comprising:
circuitry configured to
acquire image data captured by an image sensor,
process the image data to generate a non-demosaic image, and
selectively output the non-demosaic image to an artificial intelligence model trained to perform image analysis.
(2)
The image processing system of (1), wherein the non-demosaic image is a captured image in a state not subjected to a demosaic process.
(3)
The image processing system of (2), wherein the non-demosaic image is the captured image in the state not subjected to a demosaic process and after shading correction.
(4)
The image processing system of (1), further comprising:
a memory configured to store one or more non-demosaic images.
(5)
The image processing system of (4), wherein the circuitry is further configured to
retrain the artificial intelligence model using the one or more non-demosaic images stored in the memory.
(6)
The image processing system of (1), wherein the artificial intelligence model is a convolutional neural network.
(7)
The image processing system of (1), wherein the circuitry is further configured to
determine whether a scene corresponding to the acquired image data is a dark scene.
(8)
The image processing system of (7), wherein the scene is a dark scene in a case that a luminance value of a target subject is equal to or less than a predetermined luminance value.
(9)
The image processing system of (7), wherein the circuitry is further configured to
in response to a determination that the scene is the dark scene, select the non-demosaic image.
(10)
The image processing system of (9), wherein the non-demosaic image is a color separation image of the non-demosaic image.
(11)
The image processing system of (9), wherein the circuitry is further configured to
acquire artificial intelligence model setting data,
perform parameter setting for the artificial intelligence model based on the acquired artificial intelligence model setting data,
wherein, based on the parameter setting, the artificial intelligence model is an artificial intelligence model trained with non-demosaic images as input for training.
(12)
The image processing system of (10), wherein the artificial intelligence model setting data includes filter coefficients used in a convolution process in a convolutional neural network and a structure of the convolutional neural network.
(13)
The image processing system of (7), wherein the circuitry is further configured to
in a case that it cannot be determined whether a scene corresponding to the acquired image data is the dark scene, select the non-demosaic image.
(14)
The image processing system of (7), wherein the circuitry is further configured to
in response to a determination that the scene is not the dark scene, selectively output a demosaic image to the artificial intelligence model trained to perform image analysis.
(15)
The image processing system of (1), further comprising:
the image sensor configured to capture the image data.
(16)
An image processing system, comprising:
an image sensor configured to capture image data;
a processor configured to
acquire the image data captured by the image sensor,
process the image data to generate a non-demosaic image, and
selectively output the non-demosaic image to an artificial intelligence model trained to perform image analysis; and
a communication interface configured to output an image analysis result of the trained artificial intelligence model to a camera processing circuitry.
(17)
An image processing method, comprising:
acquiring image data captured by an image sensor;
processing the image data to generate a non-demosaic image; and
selectively outputting the non-demosaic image to an artificial intelligence model trained to perform image analysis.
(18)
The method of (17), further comprising:
determining whether a scene corresponding to the acquired image data is a dark scene.
(19)
The method of (17), further comprising:
in response to a determination that the scene is the dark scene, selecting the non-demosaic image, wherein the non-demosaic image is a color separation image of the non-demosaic image.
(20)
The method of (19), further comprising:
acquiring artificial intelligence model setting data,
performing parameter setting for the artificial intelligence model based on the acquired artificial intelligence model setting data,
wherein, based on the parameter setting, the artificial intelligence model is an artificial intelligence model trained with non-demosaic images as input for training.
1, 1A, 1B sensor unit
2 pixel array unit
3 image signal processing unit
4 preprocessing unit
5 AI processing unit
6, 6A in-sensor control unit
7 in-sensor memory unit
8 output data generation unit
9 communication interface (I/F)
11 optical system
12 communication interface (I/F)
13, 13B camera control unit
14 out-of-sensor memory unit
15 communication unit
P1 first AI model setting data
P2 second AI model setting data
31 black level correction unit
32 shading correction unit
33 gain adjustment unit
34 demosaic processing unit
35 color correction unit
36 gamma correction unit
37 dewarp processing unit
38 image organization unit
39 selector
Pu pixel unit
Claims (20)
- An image processing system, comprising:
circuitry configured to
acquire image data captured by an image sensor,
process the image data to generate a non-demosaic image, and
selectively output the non-demosaic image to an artificial intelligence model trained to perform image analysis.
- The image processing system of claim 1, wherein the non-demosaic image is a captured image in a state not subjected to a demosaic process.
- The image processing system of claim 2, wherein the non-demosaic image is the captured image in the state not subjected to a demosaic process and after shading correction.
- The image processing system of claim 1, further comprising:
a memory configured to store one or more non-demosaic images.
- The image processing system of claim 4, wherein the circuitry is further configured to
retrain the artificial intelligence model using the one or more non-demosaic images stored in the memory.
- The image processing system of claim 1, wherein the artificial intelligence model is a convolutional neural network.
- The image processing system of claim 1, wherein the circuitry is further configured to
determine whether a scene corresponding to the acquired image data is a dark scene.
- The image processing system of claim 7, wherein the scene is a dark scene in a case that a luminance value of a target subject is equal to or less than a predetermined luminance value.
- The image processing system of claim 7, wherein the circuitry is further configured to
in response to a determination that the scene is the dark scene, select the non-demosaic image.
- The image processing system of claim 9, wherein the non-demosaic image is a color separation image of the non-demosaic image.
- The image processing system of claim 9, wherein the circuitry is further configured to
acquire artificial intelligence model setting data,
perform parameter setting for the artificial intelligence model based on the acquired artificial intelligence model setting data,
wherein, based on the parameter setting, the artificial intelligence model is an artificial intelligence model trained with non-demosaic images as input for training.
- The image processing system of claim 10, wherein the artificial intelligence model setting data includes filter coefficients used in a convolution process in a convolutional neural network and a structure of the convolutional neural network.
- The image processing system of claim 7, wherein the circuitry is further configured to
in a case that it cannot be determined whether a scene corresponding to the acquired image data is the dark scene, select the non-demosaic image.
- The image processing system of claim 7, wherein the circuitry is further configured to
in response to a determination that the scene is not the dark scene, selectively output a demosaic image to the artificial intelligence model trained to perform image analysis.
- The image processing system of claim 1, further comprising:
the image sensor configured to capture the image data.
- An image processing system, comprising:
an image sensor configured to capture image data;
a processor configured to
acquire the image data captured by the image sensor,
process the image data to generate a non-demosaic image, and
selectively output the non-demosaic image to an artificial intelligence model trained to perform image analysis; and
a communication interface configured to output an image analysis result of the trained artificial intelligence model to a camera processing circuitry.
- An image processing method, comprising:
acquiring image data captured by an image sensor;
processing the image data to generate a non-demosaic image; and
selectively outputting the non-demosaic image to an artificial intelligence model trained to perform image analysis.
- The method of claim 17, further comprising:
determining whether a scene corresponding to the acquired image data is a dark scene.
- The method of claim 18, further comprising:
in response to a determination that the scene is the dark scene, selecting the non-demosaic image, wherein the non-demosaic image is a color separation image of the non-demosaic image.
- The method of claim 19, further comprising:
acquiring artificial intelligence model setting data,
performing parameter setting for the artificial intelligence model based on the acquired artificial intelligence model setting data,
wherein, based on the parameter setting, the artificial intelligence model is an artificial intelligence model trained with non-demosaic images as input for training.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202480036428.3A CN121241375A (en) | 2023-06-07 | 2024-05-28 | Signal processing device and signal processing method |
| EP24733341.2A EP4724999A1 (en) | 2023-06-07 | 2024-05-28 | Signal processing device and signal processing method |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023-094237 | 2023-06-07 | | |
| JP2023094237A JP2024176038A (en) | 2023-06-07 | 2023-06-07 | Signal processing device and signal processing method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024252977A1 (en) | 2024-12-12 |
Family
ID=91580712
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2024/019482 Ceased WO2024252977A1 (en) | 2023-06-07 | 2024-05-28 | Signal processing device and signal processing method |
Country Status (4)
| Country | Link |
|---|---|
| EP (1) | EP4724999A1 (en) |
| JP (1) | JP2024176038A (en) |
| CN (1) | CN121241375A (en) |
| WO (1) | WO2024252977A1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20210073957A1 (en) * | 2018-12-21 | 2021-03-11 | Huawei Technologies Co., Ltd. | Image processor and method |
| WO2022092742A1 (en) * | 2020-10-27 | 2022-05-05 | 삼성전자 주식회사 | Device and method for generating image in which subject has been captured |
| WO2022265903A1 (en) * | 2021-06-18 | 2022-12-22 | Qualcomm Incorporated | Composite image signal processor |
- 2023-06-07: JP, application JP2023094237A, publication JP2024176038A, status: Pending (active)
- 2024-05-28: CN, application CN202480036428.3A, publication CN121241375A, status: Pending (active)
- 2024-05-28: WO, application PCT/JP2024/019482, publication WO2024252977A1, status: Ceased (not active)
- 2024-05-28: EP, application EP24733341.2A, publication EP4724999A1, status: Pending (active)
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20210073957A1 (en) * | 2018-12-21 | 2021-03-11 | Huawei Technologies Co., Ltd. | Image processor and method |
| WO2022092742A1 (en) * | 2020-10-27 | 2022-05-05 | 삼성전자 주식회사 | Device and method for generating image in which subject has been captured |
| US20230245285A1 (en) * | 2020-10-27 | 2023-08-03 | Samsung Electronics Co., Ltd. | Device and method for generating image in which subject has been captured |
| WO2022265903A1 (en) * | 2021-06-18 | 2022-12-22 | Qualcomm Incorporated | Composite image signal processor |
Also Published As
| Publication number | Publication date |
|---|---|
| JP2024176038A (en) | 2024-12-19 |
| CN121241375A (en) | 2025-12-30 |
| EP4724999A1 (en) | 2026-04-15 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24733341; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2024733341; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2024733341; Country of ref document: EP; Effective date: 20260107 |
| | WWP | Wipo information: published in national office | Ref document number: 2024733341; Country of ref document: EP |