EP4670059A1 - Method and device for classifying materials using indirect time-of-flight data - Google Patents
- Publication number
- EP4670059A1, EP24706135.1A
- Authority
- EP
- European Patent Office
- Prior art keywords
- spot
- information processing
- flight
- machine learning
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S17/00—Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
- G01S17/88—Lidar systems specially adapted for specific applications
- G01S17/89—Lidar systems specially adapted for specific applications for mapping or imaging
- G01S17/894—Three-dimensional [3D] imaging with simultaneous measurement of time-of-flight at a two-dimensional [2D] array of receiver pixels, e.g. time-of-flight cameras or flash lidar
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/48—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
- G01S7/4802—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Definitions
- the present disclosure generally pertains to an information processing device and an information processing method for classifying materials.
- Time-of-flight (ToF) devices are known which determine a depth map of a scene based on a ToF of light that is emitted by an illuminator of the ToF device, thrown back by an object in the scene and detected by a ToF sensor of the ToF device, wherein the ToF of the light is determined based on the round-trip time (direct ToF device) or the phase of the detected light (indirect ToF device).
- spot ToF devices are known in which the illuminator emits spotted light to the scene, for example, a light pattern of separated high-intensity and low-intensity light areas such as a pattern of light dots.
- Some known methods for material sensing are based on multispectral sensors where the reflectance of the material is analyzed at different wavelengths.
- the disclosure provides an information processing device for classifying materials, comprising circuitry configured to: obtain spot indirect time-of-flight data acquired in a time-of-flight measurement of light thrown back by a material; compute, based on the spot indirect time-of-flight data, feature data according to a plurality of predefined features; and input the feature data into a machine learning algorithm, wherein the machine learning algorithm is trained to classify the material based on the feature data.
- the disclosure provides an information processing method for classifying materials, comprising: obtaining spot indirect time-of-flight data acquired in a time-of-flight measurement of light thrown back by a material; computing, based on the spot indirect time-of-flight data, feature data according to a plurality of predefined features; and inputting the feature data into a machine learning algorithm, wherein the machine learning algorithm is trained to classify the material based on the feature data.
- Fig. 1 schematically illustrates in a block diagram an embodiment of an information processing device
- Fig. 2 schematically illustrates an embodiment of spot and valley detection
- Fig. 3 schematically illustrates an embodiment of direct global separation
- Fig. 4 schematically illustrates an embodiment of an information processing device for facial recognition
- Fig. 5 schematically illustrates an embodiment of a training of a machine learning algorithm
- Fig. 6 schematically illustrates an embodiment of normalized mean feature values for different material classes
- Fig. 7 schematically illustrates in a flow diagram an embodiment of an information processing method
- Fig. 8 schematically illustrates a multi-purpose computer which can be used for implementing an information processing device.
- time-of-flight (ToF) devices are known which determine a depth map of a scene based on a ToF of light that is emitted by a light source of the ToF device, thrown back by an object in the scene and detected by a ToF sensor of the ToF device, wherein the ToF of the light is determined based on the round-trip time (direct ToF device (dToF)) or the phase of the detected light (indirect ToF device (iToF)).
- spot ToF devices are known in which the light source emits spotted light to the scene, for example, a light pattern of separated high-intensity and low-intensity light areas such as a pattern of light dots.
- facial recognition based on 2D images e.g., for authenticating a user for unlocking a mobile device.
- the image e.g., a 2D infrared (IR) or RGB (red-green-blue) image
- CNN: convolutional neural network
- facial recognition based on 2D images may be spoofed by presenting a printed image of the legitimate user to the camera or by wearing/presenting a mask with the characteristics of the legitimate user to the camera of the mobile device.
- sub-surface scattering: the light penetrates the material and, depending on the material's thickness and transparency, travels a greater or lesser distance inside the material.
- spot indirect ToF devices, which typically use actively modulated (temporally and spatially) IR light, may be used for skin identification for improving facial recognition.
- some embodiments pertain to an information processing device for classifying materials, wherein the information processing device includes circuitry configured to: obtain spot indirect time-of-flight data acquired in a time-of-flight measurement of light thrown back by a material; compute, based on the spot indirect time-of-flight data, feature data according to a plurality of predefined features; and input the feature data into a machine learning algorithm, wherein the machine learning algorithm is trained to classify the material based on the feature data.
- the information processing device may be a data processing module, a computer, a server, a mobile device (such as a smartphone, a tablet, a laptop), a virtual reality device or the like.
- the functionality may be implemented by software executed by a processor such as a microprocessor or the like.
- the circuitry may be based on or may include or may be implemented by typical electronic components configured to achieve the functionality as described herein.
- the circuitry may be based on or may include or may be implemented in parts by typical electronic components and integrated circuitry logic and in parts by software.
- the circuitry may include data storage capabilities to store data such as memory which may be based on semiconductor storage technology (e.g., RAM, EPROM, etc.) or magnetic storage technology (e.g., a hard disk drive) or the like.
- the circuitry may include a data bus for receiving and transmitting data over the data bus.
- the circuitry may implement communication protocols for receiving and transmitting the data over the data bus.
- the information processing device includes a spot indirect time-of-flight device to acquire the spot indirect time-of-flight data.
- the spot indirect time-of-flight device includes the information processing device.
- the information processing device and the spot indirect time-of-flight device are separate devices.
- the spot indirect time-of-flight device includes a spot illuminator configured to illuminate a scene with spotted light.
- the spot illuminator may include a light emitting diode (LED), a laser, a laser diode, an LED array, a laser diode array, a diffractive optical element, a lens system, etc.
- the spotted light has a spatial light pattern including high-intensity light areas and low-intensity light areas and, thus, a plurality of light spots corresponding to the high-intensity light areas is projected onto the scene.
- the light spots may be dots, stripes, a checker pattern or the like.
- the spotted light is further temporal intensity modulated with a configured modulation frequency in accordance with an applied modulation signal.
- the applied modulation signal is a periodic signal which may be a sinusoidal signal, a rectangular signal, or the like.
- the time-of-flight sensor includes a time-of-flight pixel array including a plurality of time-of-flight pixels arranged in rows and columns, wherein the time-of-flight pixel array may be a CAPD (current-assisted photonic demodulator) pixel array including a plurality of CAPD pixels (e.g., one-tapped, two-tapped, etc.) arranged in rows and columns.
- the time-of-flight sensor may be embedded in a time-of-flight camera which may further include optical elements such as an (adaptive) lens system, color filters, a diffractive optical element or the like.
- the spot indirect time-of-flight device includes a control configured to control the overall operation of the spot indirect time-of-flight device.
- the time-of-flight measurement may include four correlation measurements, wherein for each correlation measurement a different phase shift between a modulation signal applied to the spot illuminator and a demodulation signal applied to the time-of-flight sensor is utilized (e.g., 0°, 90°, 180° and 270°), wherein the modulation signal and the demodulation signal are synchronized.
- for each correlation measurement, the control outputs one frame, wherein the frame includes digital representations of the output voltages of each CAPD pixel.
- the control performs pre-processing by computing IQ values (I: in-phase component; Q: quadrature component) for each CAPD pixel, wherein the IQ values are based on the frames of the correlation measurements, thereby computing one frame with IQ values for each CAPD pixel, which is output.
- the spot indirect time-of-flight data may thus include one or more frames including digital representations of output voltages or IQ values.
- the circuitry computes, based on the spot indirect time-of-flight data, feature data according to a plurality of predefined features.
- the plurality of predefined features basically corresponds to a plurality of computation rules indicating how to compute the feature data from the spot indirect time-of-flight data to obtain properties of the material which allow the material to be classified, in particular properties which are unique for different materials. These material properties are represented in the spot indirect time-of-flight data.
- any computation rule may be valid, for example, computation rules with which time-of-flight properties are computed such as amplitude, confidence, intensity, phase or depth of each time-of-flight pixel.
- Each predefined feature of the plurality of predefined features thus corresponds to a computation rule for computing feature values representing a part of the feature data.
- predefined features may be correlated such that accuracy and robustness of the material classification may be reduced and processing time may be increased when too many correlated predefined features are used.
- predefined features should be engineered and selected according to the underlying physics, their importance for the classification and their correlation with other predefined features.
- the features should be selected to be unique in different materials because sub-surface scattering is different in those materials.
- the light signal received at each time-of-flight pixel has typically a direct light component and a global light component, wherein the direct light component corresponds to light that is emitted by the illuminator, thrown back by an object in a scene and imaged onto a certain time-of-flight pixel.
- the global light component is a sum of multiple reflections, scattering and stray light due to the time-of-flight camera itself (e.g., lenses, filters, etc.), geometric scene features (e.g., corners, concave regions, etc.) and material features (e.g., scattering, transparency, etc.).
- the global light component causes Multipath Interference (MPI), as it mixes with the direct light component.
- sub-surface scattering affects spots (e.g., dots) and valleys in a unique way in different materials.
- the circuitry is configured to detect spots and valleys in the spot indirect time-of-flight data, as will be discussed with reference to Fig. 2.
- the circuitry obtains the pixel positions of the spots and the corresponding valleys.
- the circuitry is configured to perform direct and global separation (DGS), as will be discussed with reference to Fig. 3.
- the feature data are computed based on properties of the detected spots and valleys.
- the plurality of predefined features may thus correspond to computation rules which require properties of the detected spots and valleys as input. For example, the computation rule may require computation of a sum, a difference, a product or a ratio of spot and corresponding valley properties, e.g., a difference or ratio between the Cartesian depth/phase of a spot (before or after DGS correction) and that of the corresponding valley, or a ratio between the confidence of a spot and that of the corresponding valley.
- the circuitry obtains an input feature for the trained machine learning algorithm for each predefined feature at the end of the computation according to the predefined feature, wherein each input feature includes one or more feature values (having a data format such as binary, integer, floating point, double precision or the like) and, thus, each input feature includes a single feature value or an array/vector of feature values.
- the feature data may thus include a plurality of input features, e.g., an array/vector of different input features, wherein each input feature includes one or more feature values.
- the predefined features are predefined features per detected spot requiring computation of: the reflectance of the spot; the reflectance of the corresponding valley; the spot size; the ratio between Cartesian depth or phase of the spot after direct global separation and Cartesian depth or phase of the corresponding valley; the ratio between amplitude or confidence of the spot after direct global separation and amplitude or confidence of the corresponding valley; and the ratio between variance of Cartesian depth or phase of the spot and variance of Cartesian depth or phase of the corresponding valley, as will be discussed in more detail with reference to Figs. 1 to 6.
- the circuitry inputs the feature data into a machine learning algorithm, wherein the machine learning algorithm is trained to classify the material based on the feature data.
- the classifier may be a binary classifier such that the machine learning algorithm outputs whether the material belongs to a certain material class or not.
- the material class may correspond to a larger material group such as metal, textile, paper, plastic, silicone, latex, rubber, wax, wood or skin or the like.
- the classifier may be a multiclass classifier such that the machine learning algorithm outputs, for example, for each material class, a probability that the material belongs to this material class.
- the classifier may be a multiclass classifier such that the machine learning algorithm outputs, for example, the material class to which the material belongs.
- the machine learning algorithm is one of a support vector classifier, a random forest, a decision tree, a k-nearest neighbor algorithm, a naive Bayes classifier and AdaBoost.
- Other machine learning algorithms may be appropriate as well.
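The classification step can be illustrated with one of the listed algorithm families, a k-nearest neighbor classifier. The following is a minimal sketch, not the applicant's implementation: the function name `knn_classify`, the two-dimensional feature vectors and all numeric values in `TRAIN` are hypothetical and purely synthetic, chosen only to show how labelled feature data could drive a skin / non-skin decision.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority label among the k training samples closest to `query`.

    `train` is a list of (feature_vector, label) pairs. Euclidean distance is
    an assumption of this sketch; the source does not specify a metric.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Synthetic, hypothetical feature vectors (not measured data).
TRAIN = [
    ((0.90, 1.10), "skin"),
    ((0.85, 1.15), "skin"),
    ((0.95, 1.05), "skin"),
    ((0.20, 1.90), "non-skin"),
    ((0.30, 2.10), "non-skin"),
    ((0.25, 2.00), "non-skin"),
]

if __name__ == "__main__":
    print(knn_classify(TRAIN, (0.88, 1.12)))
```

In practice the feature vectors would be the input features computed from the spot indirect time-of-flight data, and features with very different value ranges would likely need scaling before a distance-based classifier is applied.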
- the facial recognition may be improved by including skin identification to avoid spoofing.
- the machine learning algorithm is trained to classify the material into skin or non-skin.
- the circuitry is configured to perform facial recognition using the classification result.
- the circuitry determines that facial recognition has failed when the machine learning algorithm classifies the material as not being skin.
- when the machine learning algorithm classifies the material as being skin, the circuitry performs the facial recognition based on the spot indirect time-of-flight data and/or based on one or more 2D images such as a 2D infrared (IR) or RGB (red-green-blue) image.
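The gating of facial recognition by the skin classification can be sketched as follows. This is an illustrative assumption about control flow only: the function name `authenticate`, the label string `"skin"` and the callable `recognise_face` are hypothetical and do not appear in the source.

```python
from typing import Callable


def authenticate(material_label: str, recognise_face: Callable[[], bool]) -> bool:
    """Run facial recognition only if the material was classified as skin."""
    if material_label != "skin":
        # Non-skin material: facial recognition is treated as failed and the
        # (potentially expensive) recognition step is not executed at all.
        return False
    return recognise_face()
```

Passing the recognition step as a callable makes explicit that it is skipped entirely when the material is classified as not being skin.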
- the 2D images may be acquired with an image sensor (e.g., a CCD (“Charge-Coupled device”) sensor or an active pixel sensor).
- the image sensor may be part of the information processing device.
- the circuitry may compute an amplitude image, based on the spot indirect time-of-flight data, which is comparable to a 2D image, and may extract 2D facial features from the amplitude image, wherein 2D facial feature extraction from 2D images is generally known.
- the circuitry is configured to compute, based on the spot indirect time-of-flight data, a depth map and to perform the facial recognition further based on the depth map.
- the circuitry may extract structural 3D features from the depth map to perform the facial recognition and may correlate the structural 3D features with the 2D facial features to perform the facial recognition.
- the circuitry may store corresponding reference features of the legitimate user.
- the extracted 2D facial features - which are extracted based on the 2D images - and/or the structural 3D features, which are extracted based on the spot indirect time-of-flight data, may be compared with the stored reference features of the legitimate user for authentication.
- classifying the material as skin or not for improving facial recognition may benefit from using only input features from certain regions of the human face, since taking feature data of the whole face may blur the distinguishability of skin with respect to other materials.
- the classification accuracy may increase.
- the circuitry is configured to compute the feature data in accordance with a plurality of predefined regions of interest.
- the information processing method includes: obtaining spot indirect time-of-flight data acquired in a time-of-flight measurement of light thrown back by a material; computing, based on the spot indirect time-of-flight data, feature data according to a plurality of predefined features; and inputting the feature data into a machine learning algorithm, wherein the machine learning algorithm is trained to classify the material based on the feature data.
- the information processing method may be performed by the information processing device as described herein.
- the methods as described herein are also implemented in some embodiments as a computer program causing a computer and/or a processor to perform the method, when being carried out on the computer and/or processor.
- a non-transitory computer-readable recording medium is provided that stores therein a computer program product, which, when executed by a processor, such as the processor described above, causes the methods described herein to be performed.
- In Fig. 1 there is schematically illustrated in a block diagram an embodiment of an information processing device 1, which is discussed in the following with reference to Figs. 1 to 6.
- the information processing device 1 is a mobile device, here a smartphone, and includes a spot indirect time-of-flight device 100 (spot iToF device 100 in the following), a data bus 12 and circuitry 13.
- the spot iToF device 100 includes a spot illuminator 2, a ToF camera 3, a control 4 and a communication interface 5.
- the spot illuminator 2 includes, e.g., a diode laser array as a light source or a laser and a diffractive optical element or the like to illuminate a scene 6 with spotted light.
- the scene 6 includes an object 7 which at least partially throws back the spotted light.
- the object is made of or covered with a certain material which is to be classified.
- the spotted light has a spatial light pattern including high-intensity areas 8 and low-intensity areas 9 and, thus, a plurality of light spots corresponding to the high-intensity areas 8 is projected onto the scene 6.
- the light pattern or the light spots may be dots, stripes, or a checker pattern or the like.
- the spot illuminator 2 emits the temporal intensity modulated light with a configured modulation frequency, in accordance with an applied modulation signal, to the scene 6.
- the applied modulation signal is a periodic signal such as a sinusoidal signal, a rectangular signal or the like.
- the ToF camera 3 includes, e.g., a lens system, an aperture and a ToF sensor (not shown) to detect light thrown back by the object 7 in the scene 6.
- the ToF sensor includes a plurality of two-tapped current-assisted photonic demodulator (CAPD) pixels arranged in rows and columns. Each CAPD pixel generates, in accordance with an applied demodulation signal, an output voltage in accordance with the phase of the received light signal.
- the demodulation signal is further phase-shifted by 180° between a first tap and a second tap of the two-tapped CAPD pixel and the difference of the voltage of the two taps is output to decrease an ambient light contribution to the output voltage, pixel offset, etc.
- the applied demodulation signal is a periodic signal such as a sinusoidal signal, a rectangular signal or the like in accordance with the modulation signal.
- the integration time of the ToF sensor (of each two-tapped CAPD pixel) is controlled, for example, by controlling a number of modulation periods T over which the output voltage is generated.
- the control 4 executes software on a processor, the software including, for example, a control block 10 and, optionally, a pre-processing block 11.
- the control block 10 includes procedures having instructions to control the overall operation of the spot iToF device 100.
- the control block 10 includes instructions to perform a ToF measurement to acquire spot iToF data, for example, the ToF measurement includes four correlation measurements, wherein for each correlation measurement a different phase shift between the modulation signal applied to the spot illuminator 2 and the demodulation signal applied to the ToF sensor is utilized (e.g., 0°, 90°, 180° and 270°).
- the control block 10 obtains for each correlation measurement a frame including digital representations of the output voltages of the plurality of CAPD pixels from the ToF sensor.
- the pre-processing block 11 computes IQ values (I: in-phase component; Q: quadrature component) based on the digital representations of the output voltages generated in each correlation measurement.
- the control 4 outputs the four frames of the ToF measurement as spot iToF data via the communication interface 5 over the data bus 12 (e.g., a data bus in accordance with MIPI (Mobile Industry Processor Interface) specifications) to the circuitry 13.
- the circuitry 13 includes, e.g., a processor (e.g., an application processor) and data storage configured to perform the functions as described in the following.
- the circuitry 13 executes software including a data processing block 14 which includes procedures having instructions to cause the circuitry 13 to perform the following functions.
- the circuitry 13 computes, based on the spot iToF data, ToF properties for each pixel of the ToF sensor, at least:
- $V_{0^\circ}$ is the digital representation of the pixel output voltage for a phase shift of 0° between the modulation and demodulation signal
- $V_{90^\circ}$ is the digital representation of the pixel output voltage for a phase shift of 90° between the modulation and demodulation signal
- $V_{180^\circ}$ is the digital representation of the pixel output voltage for a phase shift of 180° between the modulation and demodulation signal
- $V_{270^\circ}$ is the digital representation of the pixel output voltage for a phase shift of 270° between the modulation and demodulation signal.
- the circuitry 13 may compute the intensity for each ToF pixel, which is the squared amplitude, for detecting spots and valleys, as will be discussed with reference to Fig. 2.
- the circuitry 13 may compute the confidence for each pixel, which is proportional to the amplitude, as generally known.
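The per-pixel formulas did not survive in this text, so the following is only a sketch using one widely used four-phase iToF convention. The sign convention for I and Q, the absence of any scale factor, and the function name `tof_properties` are assumptions, not taken from the source; only "intensity is the squared amplitude" is stated in the text.

```python
import math


def tof_properties(v0, v90, v180, v270):
    """Compute I, Q, amplitude, phase and intensity for one ToF pixel.

    Inputs are the digital output voltages for the 0°, 90°, 180° and 270°
    correlation measurements. The I/Q sign convention is an assumption.
    """
    i = v0 - v180            # in-phase component
    q = v90 - v270           # quadrature component
    amplitude = math.hypot(i, q)
    phase = math.atan2(q, i) % (2 * math.pi)   # wrapped to [0, 2*pi)
    intensity = amplitude ** 2                 # squared amplitude, as stated in the text
    return {"I": i, "Q": q, "amplitude": amplitude,
            "phase": phase, "intensity": intensity}
```

Since the confidence is described as proportional to the amplitude, it could be derived from `amplitude` with a device-specific factor that is not given here.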
- the data processing block 14 includes a trained machine learning algorithm 15, which is trained to classify the material of the object 7 based on feature data.
- Fig. 2 schematically illustrates an embodiment of spot and valley detection.
- a ToF sensor 20 (e.g., of the ToF camera 3 in Fig. 1) is schematically illustrated in Fig. 2, which has a plurality of two-tapped CAPD pixels arranged in rows (R-1, ..., R-m, ..., R-M; wherein M is an integer) and columns (C-1, ..., C-n, ..., C-N; wherein N is an integer).
- Each CAPD pixel has a pixel position identified by its row index (X) and its column index (Y).
- the circuitry 13, executing the data processing block 14, obtains spot iToF data acquired by the spot iToF device 100 in a ToF measurement from the control 4.
- the spot iToF data include a plurality of spots 21 and valleys 22 characterized by the amplitude/intensity/confidence values (Z) associated with each pixel position.
- a spot may spread over one or more pixels, e.g., centered at column C-n (the spot is illustrated by reference number 21-L1), up to the valley corresponding to the spot 21-L1 (a valley pixel of the valley is illustrated by reference number 22-L1).
- the circuitry 13 detects the spot 21-L1 (here a dot) based on a first predetermined intensity threshold $Z_{\mathrm{th1}}$. Furthermore, the circuitry 13 detects whether the spot 21-L1 is saturated based on a second predetermined intensity threshold $Z_{\mathrm{th2}}$.
- the circuitry 13 computes a spot size (here dot size) of the spot 21-L1 (for each spot 21, but illustrated only for spot 21-L1, as the skilled person will appreciate), which is the number of spot pixels included in the spot 21-L1 which have an intensity value above the first predetermined intensity threshold $Z_{\mathrm{th1}}$.
- the spot pixel window (here dot pixel window) extends the spot size (here dot size) by one additional pixel in each direction (X, Y).
- all pixels which are not within the spot size but within the spot pixel window are part of the valley corresponding to the spot 21-L1 and are referred to as valley pixels (for example, the valley pixel 22-L1).
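The spot and valley detection described above can be sketched as follows. Several details are assumptions of this sketch and not stated in the source: spot pixels are grouped by 4-connectivity, the spot pixel window is taken as the spot's bounding box padded by one pixel, a spot counts as saturated if any of its pixels reaches the second threshold, and the name `detect_spots` is hypothetical.

```python
def detect_spots(intensity, z_th1, z_th2):
    """Detect spots and their valleys in a 2D intensity grid (list of rows)."""
    rows, cols = len(intensity), len(intensity[0])
    visited = set()
    spots = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in visited or intensity[r][c] <= z_th1:
                continue
            # Flood fill (4-connectivity, an assumption) collects the spot pixels.
            visited.add((r, c))
            stack, pixels = [(r, c)], []
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and (ny, nx) not in visited and intensity[ny][nx] > z_th1:
                        visited.add((ny, nx))
                        stack.append((ny, nx))
            # Spot pixel window: bounding box of the spot padded by one pixel,
            # clipped to the sensor borders.
            y0 = max(min(p[0] for p in pixels) - 1, 0)
            y1 = min(max(p[0] for p in pixels) + 1, rows - 1)
            x0 = max(min(p[1] for p in pixels) - 1, 0)
            x1 = min(max(p[1] for p in pixels) + 1, cols - 1)
            spot_set = set(pixels)
            # Valley pixels: inside the window but not part of the spot.
            valley = [(y, x) for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)
                      if (y, x) not in spot_set]
            spots.append({
                "center": max(pixels, key=lambda p: intensity[p[0]][p[1]]),
                "pixels": sorted(pixels),
                "size": len(pixels),
                "valley": valley,
                "saturated": any(intensity[y][x] >= z_th2 for y, x in pixels),
            })
    return spots


GRID = [
    [1, 1, 1, 1, 1],
    [1, 2, 9, 2, 1],
    [1, 1, 1, 1, 1],
]
```

With `GRID`, a first threshold of 5 and a second threshold of 100, the sketch finds a single one-pixel spot at row 1, column 2, surrounded by eight valley pixels. When spots lie close together, the padded window may overlap a neighbouring spot, which a real implementation would have to handle.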
- the circuitry 13, executing the data processing block 14, further performs DGS for each spot 21, as will be discussed in the following.
- Fig. 3 schematically illustrates an embodiment of DGS.
- the spot 21-L1 is chosen for illustration, but DGS is performed for each detected spot 21.
- Fig. 3 depicts the IQ values of spot 21-L1 (including three spot pixels along L1) in the IQ-plane.
- the angle corresponds to the phase and the distance to the origin corresponds to the amplitude (which is proportional to the confidence).
- the IQ values of spot 21-L1 have a different phase due to the global light component present in addition to the direct light component.
- the IQ value of valley pixel 22-L1 includes only the global light component.
- the circuitry 13 computes for spot 21-L1 (and similarly for all other detected spots 21) the difference between the IQ values of spot 21-L1 (for each spot pixel) and the IQ values of valley pixel 22-L1 to remove the global light component from the IQ values of the spot 21-L1, thereby obtaining the DGS corrected IQ values 21-L1-DGS of the spot 21-L1 (for each spot pixel).
- the circuitry 13 computes for each detected spot 21 the amplitude/confidence/intensity based on the DGS corrected IQ values.
- phase in iToF is computed by:
- the circuitry 13 computes for each detected spot 21 the phase based on IQ values before and after the DGS correction.
- the circuitry 13 computes for each valley 22 (for each valley pixel in the valley 22) the amplitude or confidence and the phase.
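The DGS step can be sketched by treating each IQ value as a complex number I + jQ, so that subtracting the valley IQ value removes the global component. The function names `dgs_correct` and `amplitude_and_phase` are hypothetical, and the phase formula used here (the angle of the IQ vector, i.e. atan2(Q, I)) is the common convention, assumed because the formula itself is missing from this text.

```python
import cmath
import math


def dgs_correct(spot_iq, valley_iq):
    """Subtract the valley IQ value (global light only) from each spot pixel."""
    return [iq - valley_iq for iq in spot_iq]


def amplitude_and_phase(iq):
    """Amplitude is the distance to the origin, phase the angle in the IQ-plane."""
    return abs(iq), cmath.phase(iq) % (2 * math.pi)


if __name__ == "__main__":
    direct = cmath.rect(3.0, 0.5)      # synthetic direct component
    global_ = cmath.rect(1.0, 1.2)     # synthetic global component
    measured_spot = [direct + global_]
    corrected = dgs_correct(measured_spot, global_)
    print(amplitude_and_phase(measured_spot[0]))  # biased by the global component
    print(amplitude_and_phase(corrected[0]))      # close to (3.0, 0.5)
```

The synthetic example shows why the phase of a spot pixel differs before and after the correction: the measured value is the vector sum of the direct and global components.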
- the circuitry 13 stores - after spot and valley detection and DGS - for each detected spot 21 the following values: pixel position of the center spot pixel of the spot, pixel positions of each spot pixel (within the spot size) of the spot, amplitude or confidence of each spot pixel before and after DGS, phase of each spot pixel before and after DGS and the spot size. These are referred to as pre-feature spot data in the following.
- the circuitry 13 stores - after spot and valley detection and DGS - for each detected valley 22 the following values: pixel position of the corresponding center spot pixel, pixel positions of the valley pixels within the spot pixel window of the corresponding spot, the amplitude or confidence of each valley pixel and the phase of each valley pixel. These are referred to as pre-feature valley data in the following.
- the data processing block 14 includes a trained machine learning algorithm, which is trained to classify the material of the object 7 based on feature data.
- the feature data including a plurality of input features (IF1) to (IF6) are computed by the circuitry 13 based on the detected spots and valleys using the pre-feature spot data and the pre-feature valley data.
- the feature data include a plurality of input features (IF1) to (IF6) computed according to a plurality of predefined features, wherein each input feature (IF1) to (IF6) includes one or more feature values (depending on the number of detected spots).
- the input features (IF1) to (IF6) are computed after choosing the corresponding predefined features from a (possibly infinite) pool of predefined features after analyzing the accuracy of the classification results.
- other predefined features may be appropriate as well in other embodiments.
- This input feature may be understood as a distance-normalized confidence at the maximum spot pixel position (e.g., the center spot pixel position).
- This input feature may be understood as a distance-normalized confidence around the maximum spot pixel position (e.g., the center spot pixel position). The confidence of any valley pixel may be used for the computation.
- (IF3) Spot size: the number of spot pixels in the spot.
- the confidence of the center spot pixel (e.g., the spot pixel with the highest intensity value in the spot).
- the confidence of any valley pixel may be used for the computation.
- (IF6) Ratio between the variance of the cartesian depth (or phase) of the spot and the variance of the cartesian depth (or phase) of the corresponding valley: variance_RATIO_SPOT_VALLEY
- the variances are based on all spot pixels of the spot and all valley pixels corresponding to the spot, respectively.
- Z DOT is the cartesian depth measured at the center spot pixel (e.g., the spot pixel with the highest intensity value in the spot)
- Z DGS is the cartesian depth (Z DOT ) after DGS
- ZVALLEY is the cartesian depth measured around the spot (any valley pixel may be used).
- the circuitry 13 computes the cartesian depth as follows: from the IQ values the phase can be computed (see (4)), then a calibrated phase is computed therefrom, then a radial depth is computed therefrom, and the cartesian depth is computed therefrom.
- the cartesian depth is a lens-distortion-corrected version of the radial depth, which is obtained by projecting the radial depth values to the cartesian depth plane.
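The chain from phase to cartesian depth can be illustrated with a short Python sketch. It rests on assumptions that are not taken from the description: the calibration is reduced to a single phase offset, the radial depth uses the usual continuous-wave iToF relation d = c·φ / (4π·f_mod) with a hypothetical modulation frequency parameter `f_mod_hz`, and the projection uses an ideal pinhole model with normalized image coordinates `(x_n, y_n)` instead of a calibrated camera model with lens distortion.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def radial_depth(phase_rad, f_mod_hz, phase_offset_rad=0.0):
    """Radial depth in metres from a measured phase.

    Assumption: calibration is a plain phase offset; real calibration
    typically also corrects cyclic and per-pixel errors.
    """
    calibrated = (phase_rad - phase_offset_rad) % (2.0 * math.pi)
    return SPEED_OF_LIGHT * calibrated / (4.0 * math.pi * f_mod_hz)


def cartesian_depth(radial, x_n, y_n):
    """Project a radial depth onto the optical axis (z coordinate).

    Assumption: ideal pinhole camera, (x_n, y_n) are normalized image
    coordinates of the pixel, i.e. the viewing ray is (x_n, y_n, 1).
    """
    return radial / math.sqrt(x_n * x_n + y_n * y_n + 1.0)


# Hypothetical example: phase of pi at 100 MHz, pixel on the optical axis.
r = radial_depth(math.pi, 100e6)
print(r, cartesian_depth(r, 0.0, 0.0))
```

For a pixel on the optical axis the radial and cartesian depths coincide; towards the image border the cartesian depth becomes smaller than the radial depth.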
- a camera model obtained during camera calibration process is used for the projection, as discussed for example in:
- the circuitry 13 inputs the feature data including the input features (IF1) to (IF6) into the trained machine learning algorithm, which is trained to classify the material based on the feature data.
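As an illustration of how per-spot feature values could be derived from the pre-feature spot and valley data, the following Python sketch computes the spot size and the three ratio features. The function name, argument names and dictionary keys are hypothetical. The distance-normalized confidence features are left out because their formulas are not reproduced in this text. Since any valley pixel may be used, the sketch simply takes the first one.

```python
from statistics import pvariance


def spot_features(z_spot, z_dgs_center, conf_dgs_center, z_valley, conf_valley):
    """Compute four per-spot feature values.

    z_spot          -- cartesian depth (or phase) of every spot pixel
    z_dgs_center    -- depth (or phase) of the center spot pixel after DGS
    conf_dgs_center -- confidence of the center spot pixel after DGS
    z_valley        -- depth (or phase) of every corresponding valley pixel
    conf_valley     -- confidence of every corresponding valley pixel

    No guard against zero denominators is included; a real implementation
    would have to handle flat valleys and zero confidence.
    """
    return {
        "spot_size": len(z_spot),
        "depth_ratio": z_dgs_center / z_valley[0],
        "conf_ratio": conf_dgs_center / conf_valley[0],
        "variance_ratio": pvariance(z_spot) / pvariance(z_valley),
    }


# Hypothetical toy values for one spot and its valley.
features = spot_features(
    z_spot=[1.0, 3.0],
    z_dgs_center=1.0,
    conf_dgs_center=8.0,
    z_valley=[2.0, 6.0],
    conf_valley=[2.0, 2.5],
)
print(features)
```

One such dictionary per detected spot gives the "one or more feature values" per input feature, depending on the number of detected spots.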
- Fig. 4 schematically illustrates an embodiment of the information processing device 1 for facial recognition, which is discussed in the following.
- the information processing device 1 is the information processing device 1 of Fig. 1 - a smartphone -, wherein the machine learning algorithm 15 is trained to classify the material into skin or non-skin.
- the information processing device 1 includes the spot iToF device 100 (not shown) with the spot illuminator 2 and the ToF camera 3.
- the information processing device 1 includes the circuitry 13 (not shown).
- the information processing device 1 includes further a touch-display 30 and a RGB camera 31.
- the touch-display 30 may include an LCD (Liquid-Crystal Display), an IPS-LCD (In-Plane Switching-Liquid-Crystal Display), an OLED (Organic Light-Emitting Diode) display or an AMOLED (Active-Matrix Organic Light-Emitting Diode) display.
- the touch functionality of the touch-display 30 may be based on a capacitive or resistive touch screen.
- the RGB camera 31 may include optical parts (e.g., a lens) and an image sensor (e.g., a CCD (“Charge-Coupled Device”) sensor or an active pixel sensor).
- a user may wish to unlock the smartphone 1 which requires authentication of the user.
- two-factor authentication may be employed which requires the user, e.g., to enter a PIN and to perform facial recognition.
- the smartphone starts the facial recognition.
- the user may hold the smartphone 1 in front of their face and press a button 32 depicted on the touch-display, such that the smartphone 1, in response thereto, acquires an image 33 (as illustrated by the dashed box) of the user’s face with the RGB camera 31 and spot iToF data of the user’s face (the fields-of-view of the RGB camera 31 and of the spot iToF device 100 overlap).
- the user may press again the button 32 to confirm that the image 33 and the spot iToF data should be used for the facial recognition.
- classifying the material as skin or non-skin for improving facial recognition may benefit from using only input features from certain regions of the human face, since taking feature data of the whole face may blur the distinguishability of skin with respect to other materials.
- the feature data is only computed for pixels (of the ToF sensor 20) which correspond to the regions-of-interest ROI (as illustrated by the dotted boxes).
- the ROIs cover forehead, nose, left/right cheek and lips of the face.
- the smartphone 1 (the circuitry 13 thereof) computes the input features (IF1) to (IF6) only for the ROIs to obtain the feature data. Then, the circuitry 13 inputs the feature data into the trained machine learning algorithm 15 which outputs a classification result indicating whether the light in the ToF measurement has been thrown back by a material that is skin or non-skin.
- if the classification result indicates non-skin, the smartphone 1 determines that facial recognition has failed.
- if the classification result indicates skin, the smartphone 1 proceeds with the facial recognition based on the image 33 and the spot iToF data from which it computes a depth map, as discussed in the general explanations.
- the accuracy of facial recognition may be improved, and spoofing may be avoided.
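The region-of-interest restriction and the skin check that precedes facial recognition can be sketched as follows. This is a simplified illustration with hypothetical names: ROIs are axis-aligned pixel boxes, spots are identified by their center pixel, and the classifier and the face matcher are represented only by a boolean and a callable, because the description does not fix how they are invoked.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int]          # (x, y) of a center spot pixel
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def spots_in_rois(spot_centers: List[Pixel], rois: List[Box]) -> List[Pixel]:
    """Keep only the spots whose center pixel lies inside at least one ROI."""
    return [
        (x, y)
        for (x, y) in spot_centers
        if any(x0 <= x <= x1 and y0 <= y <= y1 for (x0, y0, x1, y1) in rois)
    ]


def facial_recognition_gate(is_skin: bool, recognise_face: Callable[[], bool]) -> bool:
    """Run the face matcher only if the material was classified as skin.

    A non-skin result makes the facial recognition fail immediately,
    which is what blocks a mask or a printed photo.
    """
    if not is_skin:
        return False
    return recognise_face()


# Hypothetical ROIs (e.g., forehead and nose) and spot centers.
rois = [(0, 0, 10, 10), (40, 40, 60, 60)]
print(spots_in_rois([(5, 5), (50, 50), (12, 8)], rois))
print(facial_recognition_gate(False, lambda: True))
```

Feature data would then be computed only for the spots returned by `spots_in_rois`, and the classification result would be passed as `is_skin`.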
- Fig. 5 schematically illustrates an embodiment of a training of the trained machine learning algorithm 15 of Fig. 1, which is discussed in the following.
- the machine learning algorithm 15-t is in the training stage.
- the machine learning algorithm 15-t is one of a support vector classifier, a random forest, a decision tree, a k-nearest neighbor algorithm, a naive Bayes classifier and AdaBoost.
- the training is based on a plurality of training datasets 50, each training dataset including spot iToF data 51a and a label 51b.
- the label 51b indicates whether the measured material is a certain material or belongs to a certain material class, in particular whether the measured material is skin or non-skin.
- the label 51b indicates the material that was measured or the material class the measured material belongs to.
- the plurality of training datasets 50 includes training datasets which have been acquired under various different conditions to increase robustness of the classification.
- the various different conditions include, e.g., scenes with a plurality of different objects made of or being covered with different materials; a plurality of different mask (face mask) materials (e.g., silicone, plastic, rubber, wax, wood, paper, latex); a plurality of different persons (e.g., different age, gender, ethnicity, wearing glasses or not, having beard or not); a plurality of different rotations of the objects, persons or masks; and a plurality of different distances between spot iToF device and object, person or mask.
- the various different conditions include further a plurality of different hardware, e.g., a plurality of different mobile devices (e.g., under-display camera phones, conventional smartphones), spot illuminators, ToF cameras or the like.
- Each training dataset is used for training the machine learning algorithm 15-t in the training stage.
- a feature data generator 52 obtains the spot iToF data 51a of the respective training dataset and computes the feature data 53 including the input features (IF1) to (IF6).
- the feature data 53 is input to the machine learning algorithm 15-t in the training stage.
- the machine learning algorithm 15-t in the training stage outputs a binary value 54 indicating whether the measured material belongs to a predetermined class (e.g., skin) or not.
- Binary classification may thus, in some embodiments, use the same number of skin and non-skin examples (50% each). This type of classifier may reduce the probability of false negatives but may increase the risk of false positives.
- the machine learning algorithm 15-t in the training stage outputs for each class a probability 54 that the measured material belongs to the respective class.
- the machine learning algorithm 15-t in the training stage outputs the material class 54 to which the material belongs.
- Multiclass classification may have a ratio of skin examples vs. the number of the other classes (e.g., 1/#classes*100%). This type of classifier may reduce the probability of false positives but may increase the risk of false negatives.
- a loss function 55 (e.g., cross entropy, as generally known) generates hyperparameter updates 56 based on a difference between the classification of the machine learning algorithm 15-t in the training stage and the label 51b.
- the trained machine learning algorithm 15 is obtained.
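To make the data flow from labelled feature data to a classification concrete, the following Python sketch uses a k-nearest neighbor classifier, one of the algorithm families named above. It is a toy illustration only: the feature vectors are synthetic two-dimensional values invented for the example, the function name is hypothetical, and a k-nearest neighbor classifier simply stores the labelled examples, so the loss-driven updates of Fig. 5 have no counterpart here.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training examples.

    Distances are Euclidean; features are assumed to be scaled comparably.
    """
    order = sorted(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Synthetic, balanced toy training set (three skin, three non-skin examples).
train_features = [
    [0.9, 1.0], [1.0, 1.1], [1.1, 0.9],   # labelled "skin"
    [3.0, 3.2], [3.1, 2.9], [2.9, 3.0],   # labelled "non-skin"
]
train_labels = ["skin", "skin", "skin", "non-skin", "non-skin", "non-skin"]

print(knn_predict(train_features, train_labels, [1.0, 1.0]))
print(knn_predict(train_features, train_labels, [3.0, 3.0]))
```

The balanced toy set mirrors the 50% skin vs. non-skin split mentioned for binary classification; with real data each row would hold the feature values computed from one spot iToF measurement.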
- Fig. 6 schematically illustrates an embodiment of normalized mean feature values for different material classes, which is discussed in the following.
- In Fig. 6, a diagram is shown in which the normalized mean feature values of the input features spot size, TDGS SPOT, TDGS VALLEY, ZDGS VALLEY and confDGS VAL are compared for the material class skin (dotted pattern) and plastic (striped pattern). While TDGS SPOT is basically the same for both materials, the other input features are different and, thus, both material classes can be distinguished based on these input features, which are computed according to predefined features.
- Fig. 7 schematically illustrates in a flow diagram an embodiment of an information processing method 300.
- the information processing method 300 may be performed by the information processing device as described herein, e.g., by the information processing device 1 of Figs. 1 and 4.
- spot indirect time-of-flight data is acquired, as discussed herein.
- spots and valleys are detected in the spot indirect time-of-flight data, as discussed herein.
- feature data are computed according to a plurality of predefined features, wherein the feature data are computed in accordance with a plurality of predefined regions-of-interest, as discussed herein.
- the feature data is input into a machine learning algorithm, wherein the machine learning algorithm is trained to classify the material based on the feature data, wherein the machine learning algorithm is trained to classify the material into skin or non-skin, as discussed herein.
- facial recognition using the classification result is performed, wherein, based on the spot indirect time-of-flight data, a depth map is computed and the facial recognition is performed further based on the depth map, as discussed herein.
- Fig. 8 schematically illustrates a multi-purpose computer 130 which can be used for implementing a circuitry.
- the computer 130 can be implemented such that it can basically function as any type of information processing device as described herein.
- the computer 130 has components 131 to 141, which can form a circuitry, such as any one of the circuitries of the information processing device, as described herein.
- Embodiments which use software, firmware, programs or the like for performing the methods as described herein can be installed on computer 130, which is then configured to be suitable for the concrete embodiment.
- the computer 130 has a CPU 131 (Central Processing Unit), which can execute various types of procedures and methods as described herein, for example, in accordance with programs stored in a read-only memory (ROM) 132, stored in a storage 137 and loaded into a random access memory (RAM) 133, stored on a medium 140 which can be inserted in a respective drive 139, etc.
- the CPU 131, the ROM 132 and the RAM 133 are connected with a bus 141, which in turn is connected to an input/output interface 134.
- the number of CPUs, memories and storages is only exemplary, and the skilled person will appreciate that the computer 130 can be adapted and configured accordingly for meeting specific requirements which arise, when it functions as an information processing device.
- the medium 140 can be a compact disc, digital video disc, compact flash memory, or the like.
- the input 135 can be a pointer device (mouse, graphic tablet, or the like), a keyboard, a microphone, a camera, a touchscreen, etc.
- the output 136 can have a display (liquid crystal display, cathode ray tube display, light-emitting diode display, etc.), loudspeakers, etc.
- the storage 137 can have a hard disk, a solid state drive and the like.
- the communication interface 138 can be adapted to communicate, for example, via a local area network (LAN), wireless local area network (WLAN), mobile telecommunications system (GSM, UMTS, LTE, NR etc.), Bluetooth, infrared, a data bus (e.g., according to MIPI specifications), etc.
- the description above only pertains to an example configuration of computer 130. Alternative configurations may be implemented with additional or other sensors, storage devices, interfaces or the like.
- the communication interface 138 may receive spot iToF data via a data bus.
- the computer 130 may include an RGB camera, a touch-display and a spot iToF device.
- the communication interface 138 can further have a respective air interface (providing e.g. E-UTRA protocols OFDMA (downlink) and SC-FDMA (uplink)) and network interfaces (implementing for example protocols such as S1-AP, GTP-U, S1-MME, X2-AP, or the like).
- the computer 130 may have one or more antennas and/or an antenna array. The present disclosure is not limited to any particularities of such protocols. It should be recognized that the embodiments describe methods with an exemplary ordering of method steps. The specific ordering of method steps is however given for illustrative purposes only and should not be construed as binding.
- An information processing device for classifying materials including circuitry configured to: obtain spot indirect time-of-flight data acquired in a time-of-flight measurement of light thrown back by a material; compute, based on the spot indirect time-of-flight data, feature data according to a plurality of predefined features; and input the feature data into a machine learning algorithm, wherein the machine learning algorithm is trained to classify the material based on the feature data.
- the predefined features are predefined features per detected spot requiring computation of reflectance of the spot, reflectance of the corresponding valley, the spot size, ratio between cartesian depth or phase of the spot after direct global separation and cartesian depth or phase of the corresponding valley, ratio between amplitude or confidence of the spot after direct global separation and amplitude or confidence of the corresponding valley, ratio between variance of cartesian depth or phase of the spot and variance of cartesian depth or phase of the corresponding valley.
- the machine learning algorithm is trained to classify the material into skin or non-skin.
- circuitry is configured to compute, based on the spot indirect time-of-flight data, a depth map and to perform the facial recognition further based on the depth map.
- circuitry is configured to compute the feature data in accordance with a plurality of predefined regions-of-interest.
- (9) The information processing device of any one of (1) to (8), wherein the machine learning algorithm is one of a support vector classifier, a random forest, a decision tree, a k-nearest neighbor algorithm, a naive Bayes classifier and AdaBoost.
- An information processing method for classifying materials including: obtaining spot indirect time-of-flight data acquired in a time-of-flight measurement of light thrown back by a material; computing, based on the spot indirect time-of-flight data, feature data according to a plurality of predefined features; and inputting the feature data into a machine learning algorithm, wherein the machine learning algorithm is trained to classify the material based on the feature data.
- the predefined features are predefined features per detected spot requiring computation of reflectance of the spot, reflectance of the corresponding valley, the spot size, ratio between cartesian depth or phase of the spot after direct global separation and cartesian depth or phase of the corresponding valley, ratio between amplitude or confidence of the spot after direct global separation and amplitude or confidence of the corresponding valley, ratio between variance of cartesian depth or phase of the spot and variance of cartesian depth or phase of the corresponding valley.
- (21) A computer program comprising program code causing a computer to perform the method according to any one of (11) to (20), when being carried out on a computer.
- (22) A non-transitory computer-readable recording medium that stores therein a computer program product, which, when executed by a processor, causes the method according to any one of (11) to (20) to be performed.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Electromagnetism (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to an information processing device for classifying materials, including circuitry configured to: obtain spot indirect time-of-flight data acquired in a time-of-flight measurement of light thrown back by a material; compute, based on the spot indirect time-of-flight data, feature data according to a plurality of predefined features; and input the feature data into a machine learning algorithm, wherein the machine learning algorithm is trained to classify the material based on the feature data.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23158593 | 2023-02-24 | ||
| PCT/EP2024/054508 WO2024175706A1 (fr) | 2023-02-24 | 2024-02-22 | Method and device for classifying materials using indirect time-of-flight data |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| EP4670059A1 (fr) | 2025-12-31 |
Family
ID=85383006
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP24706135.1A Pending EP4670059A1 (fr) | 2023-02-24 | 2024-02-22 | Method and device for classifying materials using indirect time-of-flight data |
Country Status (2)
| Country | Link |
|---|---|
| EP (1) | EP4670059A1 (fr) |
| WO (1) | WO2024175706A1 (fr) |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11988780B2 (en) * | 2018-05-09 | 2024-05-21 | Sony Semiconductor Solutions Corporation | Imaging device and method |
| US20240061123A1 (en) * | 2020-12-22 | 2024-02-22 | Sony Semiconductor Solutions Corporation | Electronic device and method |
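| CN116745646A (zh) * | 2021-01-27 | 2023-09-12 | Sony Semiconductor Solutions Corporation | Three-dimensional image capture based on time-of-flight measurement and spot pattern measurement |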
-
2024
- 2024-02-22 WO PCT/EP2024/054508 patent/WO2024175706A1/fr not_active Ceased
- 2024-02-22 EP EP24706135.1A patent/EP4670059A1/fr active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| WO2024175706A1 (fr) | 2024-08-29 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12288414B2 (en) | | Systems and methods for performing fingerprint based user authentication using imagery captured using mobile devices |
| US10339362B2 (en) | | Systems and methods for performing fingerprint based user authentication using imagery captured using mobile devices |
| US10521643B2 (en) | | Systems and methods for performing fingerprint based user authentication using imagery captured using mobile devices |
| EP3552150B1 (fr) | | Systems and methods for performing fingerprint based user authentication using imagery captured using mobile devices |
| US9361507B1 (en) | | Systems and methods for performing fingerprint based user authentication using imagery captured using mobile devices |
| US9082000B2 (en) | | Image processing device and image processing method |
| US20160026857A1 (en) | | Image processor comprising gesture recognition system with static hand pose recognition based on dynamic warping |
| WO2024175706A1 (fr) | | Method and device for classifying materials using indirect time-of-flight data |
| Saad et al. | | Robust and fast iris localization using contrast stretching and leading edge detection |
| HK40069201B (zh) | | Method and system for performing fingerprint recognition |
| HK40010111A (en) | | Systems and methods for performing fingerprint based user authentication using imagery captured using mobile devices |
| HK40010111B (en) | | Systems and methods for performing fingerprint based user authentication using imagery captured using mobile devices |
| HK1246928B (en) | | Systems and methods for performing fingerprint based user authentication using imagery captured using mobile devices |
| HK1246928A1 (en) | | Systems and methods for performing fingerprint based user authentication using imagery captured using mobile devices |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: UNKNOWN |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
| | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
| | 17P | Request for examination filed | Effective date: 20250922 |
| | AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR |