WO2021217575A1 - Method and device for identifying an object of interest to a user - Google Patents
Method and device for identifying an object of interest to a user
- Publication number
- WO2021217575A1 (application PCT/CN2020/088243)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- vehicle
- gaze area
- area
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W50/00—Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
- B60W50/08—Interaction between the driver and the control system
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional [3D] objects
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60K—ARRANGEMENT OR MOUNTING OF PROPULSION UNITS OR OF TRANSMISSIONS IN VEHICLES; ARRANGEMENT OR MOUNTING OF PLURAL DIVERSE PRIME-MOVERS IN VEHICLES; AUXILIARY DRIVES FOR VEHICLES; INSTRUMENTATION OR DASHBOARDS FOR VEHICLES; ARRANGEMENTS IN CONNECTION WITH COOLING, AIR INTAKE, GAS EXHAUST OR FUEL SUPPLY OF PROPULSION UNITS IN VEHICLES
- B60K35/00—Instruments specially adapted for vehicles; Arrangement of instruments in or on vehicles
- B60K35/10—Input arrangements, i.e. from user to vehicle, associated with vehicle functions or specially adapted therefor
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60K—ARRANGEMENT OR MOUNTING OF PROPULSION UNITS OR OF TRANSMISSIONS IN VEHICLES; ARRANGEMENT OR MOUNTING OF PLURAL DIVERSE PRIME-MOVERS IN VEHICLES; AUXILIARY DRIVES FOR VEHICLES; INSTRUMENTATION OR DASHBOARDS FOR VEHICLES; ARRANGEMENTS IN CONNECTION WITH COOLING, AIR INTAKE, GAS EXHAUST OR FUEL SUPPLY OF PROPULSION UNITS IN VEHICLES
- B60K35/00—Instruments specially adapted for vehicles; Arrangement of instruments in or on vehicles
- B60K35/20—Output arrangements, i.e. from vehicle to user, associated with vehicle functions or specially adapted therefor
- B60K35/21—Output arrangements, i.e. from vehicle to user, associated with vehicle functions or specially adapted therefor using visual output, e.g. blinking lights or matrix displays
- B60K35/22—Display screens
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60K—ARRANGEMENT OR MOUNTING OF PROPULSION UNITS OR OF TRANSMISSIONS IN VEHICLES; ARRANGEMENT OR MOUNTING OF PLURAL DIVERSE PRIME-MOVERS IN VEHICLES; AUXILIARY DRIVES FOR VEHICLES; INSTRUMENTATION OR DASHBOARDS FOR VEHICLES; ARRANGEMENTS IN CONNECTION WITH COOLING, AIR INTAKE, GAS EXHAUST OR FUEL SUPPLY OF PROPULSION UNITS IN VEHICLES
- B60K35/00—Instruments specially adapted for vehicles; Arrangement of instruments in or on vehicles
- B60K35/20—Output arrangements, i.e. from vehicle to user, associated with vehicle functions or specially adapted therefor
- B60K35/21—Output arrangements, i.e. from vehicle to user, associated with vehicle functions or specially adapted therefor using visual output, e.g. blinking lights or matrix displays
- B60K35/23—Head-up displays [HUD]
- B60K35/235—Head-up displays [HUD] with means for detecting the driver's gaze direction or eye points
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60K—ARRANGEMENT OR MOUNTING OF PROPULSION UNITS OR OF TRANSMISSIONS IN VEHICLES; ARRANGEMENT OR MOUNTING OF PLURAL DIVERSE PRIME-MOVERS IN VEHICLES; AUXILIARY DRIVES FOR VEHICLES; INSTRUMENTATION OR DASHBOARDS FOR VEHICLES; ARRANGEMENTS IN CONNECTION WITH COOLING, AIR INTAKE, GAS EXHAUST OR FUEL SUPPLY OF PROPULSION UNITS IN VEHICLES
- B60K35/00—Instruments specially adapted for vehicles; Arrangement of instruments in or on vehicles
- B60K35/20—Output arrangements, i.e. from vehicle to user, associated with vehicle functions or specially adapted therefor
- B60K35/28—Output arrangements, i.e. from vehicle to user, associated with vehicle functions or specially adapted therefor characterised by the type of the output information, e.g. video entertainment or vehicle dynamics information; characterised by the purpose of the output information, e.g. for attracting the attention of the driver
- B60K35/285—Output arrangements, i.e. from vehicle to user, associated with vehicle functions or specially adapted therefor characterised by the type of the output information, e.g. video entertainment or vehicle dynamics information; characterised by the purpose of the output information, e.g. for attracting the attention of the driver for improving awareness by directing driver's gaze direction or eye points
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60K—ARRANGEMENT OR MOUNTING OF PROPULSION UNITS OR OF TRANSMISSIONS IN VEHICLES; ARRANGEMENT OR MOUNTING OF PLURAL DIVERSE PRIME-MOVERS IN VEHICLES; AUXILIARY DRIVES FOR VEHICLES; INSTRUMENTATION OR DASHBOARDS FOR VEHICLES; ARRANGEMENTS IN CONNECTION WITH COOLING, AIR INTAKE, GAS EXHAUST OR FUEL SUPPLY OF PROPULSION UNITS IN VEHICLES
- B60K35/00—Instruments specially adapted for vehicles; Arrangement of instruments in or on vehicles
- B60K35/60—Instruments characterised by their location or relative disposition in or on vehicles
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60K—ARRANGEMENT OR MOUNTING OF PROPULSION UNITS OR OF TRANSMISSIONS IN VEHICLES; ARRANGEMENT OR MOUNTING OF PLURAL DIVERSE PRIME-MOVERS IN VEHICLES; AUXILIARY DRIVES FOR VEHICLES; INSTRUMENTATION OR DASHBOARDS FOR VEHICLES; ARRANGEMENTS IN CONNECTION WITH COOLING, AIR INTAKE, GAS EXHAUST OR FUEL SUPPLY OF PROPULSION UNITS IN VEHICLES
- B60K35/00—Instruments specially adapted for vehicles; Arrangement of instruments in or on vehicles
- B60K35/85—Arrangements for transferring vehicle- or driver-related data
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60R—VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
- B60R1/00—Optical viewing arrangements; Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles
- B60R1/20—Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles
- B60R1/22—Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles for viewing an area outside the vehicle, e.g. the exterior of the vehicle
- B60R1/28—Real-time viewing arrangements for drivers or passengers using optical image capturing systems, e.g. cameras or video systems specially adapted for use in or on vehicles for viewing an area outside the vehicle, e.g. the exterior of the vehicle with an adjustable field of view
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W50/00—Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
- B60W50/08—Interaction between the driver and the control system
- B60W50/14—Means for informing the driver, warning the driver or prompting a driver intervention
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/01—Head-up displays
- G02B27/0179—Display position adjusting means not related to the information to be displayed
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/013—Eye tracking input arrangements
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0251—Targeted advertisements
- G06Q30/0265—Vehicular advertisement
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/59—Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
- G06V20/597—Recognising the driver's state or behaviour, e.g. attention or drowsiness
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/18—Eye characteristics, e.g. of the iris
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60K—ARRANGEMENT OR MOUNTING OF PROPULSION UNITS OR OF TRANSMISSIONS IN VEHICLES; ARRANGEMENT OR MOUNTING OF PLURAL DIVERSE PRIME-MOVERS IN VEHICLES; AUXILIARY DRIVES FOR VEHICLES; INSTRUMENTATION OR DASHBOARDS FOR VEHICLES; ARRANGEMENTS IN CONNECTION WITH COOLING, AIR INTAKE, GAS EXHAUST OR FUEL SUPPLY OF PROPULSION UNITS IN VEHICLES
- B60K35/00—Instruments specially adapted for vehicles; Arrangement of instruments in or on vehicles
- B60K35/65—Instruments specially adapted for specific vehicle types or users, e.g. for left- or right-hand drive
- B60K35/654—Instruments specially adapted for specific vehicle types or users, e.g. for left- or right-hand drive the user being the driver
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60K—ARRANGEMENT OR MOUNTING OF PROPULSION UNITS OR OF TRANSMISSIONS IN VEHICLES; ARRANGEMENT OR MOUNTING OF PLURAL DIVERSE PRIME-MOVERS IN VEHICLES; AUXILIARY DRIVES FOR VEHICLES; INSTRUMENTATION OR DASHBOARDS FOR VEHICLES; ARRANGEMENTS IN CONNECTION WITH COOLING, AIR INTAKE, GAS EXHAUST OR FUEL SUPPLY OF PROPULSION UNITS IN VEHICLES
- B60K35/00—Instruments specially adapted for vehicles; Arrangement of instruments in or on vehicles
- B60K35/65—Instruments specially adapted for specific vehicle types or users, e.g. for left- or right-hand drive
- B60K35/656—Instruments specially adapted for specific vehicle types or users, e.g. for left- or right-hand drive the user being a passenger
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60R—VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
- B60R2300/00—Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle
- B60R2300/20—Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the type of display used
- B60R2300/205—Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the type of display used using a head-up display
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60R—VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
- B60R2300/00—Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle
- B60R2300/20—Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the type of display used
- B60R2300/207—Details of viewing arrangements using cameras and displays, specially adapted for use in a vehicle characterised by the type of display used using multi-purpose displays, e.g. camera image and navigation or video on same display
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W50/00—Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
- B60W50/08—Interaction between the driver and the control system
- B60W50/14—Means for informing the driver, warning the driver or prompting a driver intervention
- B60W2050/146—Display means
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W2540/00—Input parameters relating to occupants
- B60W2540/225—Direction of gaze
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/01—Head-up displays
- G02B27/0179—Display position adjusting means not related to the information to be displayed
- G02B2027/0187—Display position adjusting means not related to the information to be displayed slaved to motion of at least a part of the body of the user, e.g. head, eye
Definitions
- This application relates to the field of smart cars, and more specifically, to a method and device for identifying objects of interest to users.
- The driver may become interested in objects in the area outside the vehicle while driving.
- However, the driver cannot look at an object of interest outside the vehicle for long and may not be able to learn its details; as a result, the driver's objects of interest cannot be effectively recorded, which degrades the driving experience.
- This application provides a method and device for identifying an object of interest to a user, which can improve the accuracy of identifying the user's object of interest.
- A method for identifying an object of interest to a user is provided, including: obtaining information about the user's gaze area and an environment image corresponding to the user;
- obtaining, from the environment image, information about a first gaze area of the user in the environment image, where the first gaze area indicates a sensitive area determined by the physical characteristics of the human body; and
- obtaining the user's target gaze area according to the information about the gaze area and the information about the first gaze area, where the target gaze area indicates the area in the environment image in which the target object gazed at by the user is located.
- The environment image corresponding to the user may refer to an image of the environment in which the user is located.
- For example, in the field of smart cars, the user may be the driver of a vehicle or a passenger in the vehicle, and the environment image corresponding to the user may be an image of the environment in which the vehicle is located, or an image of the area outside the vehicle.
- In the field of smart homes, the user may be a smart-home user in a home, and the environment image corresponding to the user may be an image of the home in which that user is located.
- The information about the user's gaze area may include the position of the gaze area, the direction of the gaze area, and the size of the gaze area.
- The first gaze area in the environment image is the user's sensitive area as determined by the biological characteristics of the human body, but it is not necessarily the area the user is actually gazing at. The sensitive area is the area that readily attracts the user's attention, determined from the physical characteristics of the human body, for example the sensitivity of the human eye to changes in color and shape.
- For example, the optic nerve of the human eye has different sensitivity to light of different wavelengths. The human eye is most sensitive to electromagnetic waves with a wavelength of about 555 nm, which lies in the green region of the visible spectrum, so the human eye is more sensitive to green light. The user's sensitive area in an image may therefore be the green area in the image.
- The first gaze area in the environment image may be the user's sensitive area in the environment image, predicted in advance from the characteristics of the human body, whereas the target gaze area is the area of the environment image in which the target object the user is gazing at is located. The first gaze area in an environment image may be the same for different users, but different users may gaze at different areas of the environment image, where their respective objects of interest are located, according to their own interests.
- The first gaze area in the environment image can be represented by an interest value for each area of the image. The richness of shape variation can be obtained through a deep learning method or an edge detection method, and the richness of color variation through a gradient calculation method; the two values are weighted to compute the interest value at each position of the image. Based on the interest value at each position, the areas of the environment image that the user may be interested in are predicted in advance.
- In this application, a pre-judgment of the environment image is introduced when identifying the user's object of interest: the user's target gaze area is determined from both the user's possible sensitive area in the environment image and the user's gaze area, which improves the accuracy of identifying objects of interest to the user.
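The weighted interest value described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the forward-difference edge measure, the luminance weights, and the equal default weights `w_shape` and `w_color` are hypothetical and are not specified in the application, which leaves the choice of edge detector and weighting open.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # (r, g, b), each channel in 0..1
Image = List[List[Pixel]]           # rows of pixels


def luminance(p: Pixel) -> float:
    """Approximate brightness of a pixel (Rec. 601 luma weights)."""
    return 0.299 * p[0] + 0.587 * p[1] + 0.114 * p[2]


def interest_map(img: Image, w_shape: float = 0.5, w_color: float = 0.5) -> List[List[float]]:
    """Return one interest value per pixel.

    Assumption: shape richness is approximated by forward differences of
    luminance (a crude edge strength), and color richness by forward
    differences of each color channel. The two are combined by weighting.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            here = img[y][x]
            right = img[y][min(x + 1, w - 1)]  # clamp at the right border
            down = img[min(y + 1, h - 1)][x]   # clamp at the bottom border
            shape = (abs(luminance(right) - luminance(here))
                     + abs(luminance(down) - luminance(here)))
            color = sum(abs(right[c] - here[c]) + abs(down[c] - here[c])
                        for c in range(3))
            out[y][x] = w_shape * shape + w_color * color
    return out
```

A uniform image yields an interest value of zero everywhere, while positions next to a change in brightness or color receive a positive value; thresholding the map would give candidate sensitive areas.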
- Obtaining the user's target gaze area according to the information about the gaze area and the information about the first gaze area includes:
- determining the target gaze area according to the overlap area of the gaze area and the first gaze area.
- In this application, the user's target gaze area, that is, the area in which the target object of interest to the user is located, can be determined from the overlap between the user's sensitive area in the environment image and the user's gaze area, which improves the accuracy of identifying objects of interest to the user.
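As a minimal sketch of the overlap step, assume both the gaze area and the first gaze area are axis-aligned rectangles in the same image coordinates. The `Rect` type and the `overlap` function are hypothetical names introduced for illustration; the application does not prescribe a particular region representation.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned rectangle in image coordinates (x0 < x1, y0 < y1)."""
    x0: float
    y0: float
    x1: float
    y1: float


def overlap(gaze: Rect, first_gaze: Rect) -> Optional[Rect]:
    """Return the intersection of the two areas, or None if they are disjoint."""
    x0 = max(gaze.x0, first_gaze.x0)
    y0 = max(gaze.y0, first_gaze.y0)
    x1 = min(gaze.x1, first_gaze.x1)
    y1 = min(gaze.y1, first_gaze.y1)
    if x0 >= x1 or y0 >= y1:
        return None  # no common area, so no target gaze area from this pair
    return Rect(x0, y0, x1, y1)
```

For example, `overlap(Rect(0, 0, 10, 10), Rect(5, 5, 20, 20))` returns `Rect(x0=5, y0=5, x1=10, y1=10)`, which would serve as the candidate target gaze area.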
- The user is a user in a vehicle, and acquiring the information about the user's gaze area and the environment image corresponding to the user includes: acquiring information about the gaze area of the user in the vehicle and an image from the vehicle's dash cam, where the gaze area of the user in the vehicle indicates the area outside the vehicle that the user in the vehicle is gazing at.
- Obtaining the user's target gaze area according to the information about the gaze area and the information about the first gaze area includes: determining the target gaze area of the user in the vehicle according to the information about the gaze area of the user in the vehicle and the information about the first gaze area in the dash cam image.
- The user in the vehicle may be any user located inside the vehicle, for example the driver of the vehicle or a passenger in the vehicle.
- If the user in the vehicle is the driver, the driver's gaze area indicates the driver's gaze area in the direction of the front windshield of the vehicle.
- If the user in the vehicle is a passenger, for example a passenger in the front passenger seat, the user's gaze area indicates the passenger's gaze area in the direction of the front windshield of the vehicle, or in the direction of a window of the vehicle.
- If the user in the vehicle is a passenger in the back row, the user's gaze area indicates the passenger's gaze area in the direction of a window of the vehicle.
- The user in the vehicle may be the driver or a passenger. An image of the user can be obtained through a camera in the cab of the vehicle, and the gaze area of the user in the vehicle is then determined according to that image of the user.
- Acquiring the information about the gaze area of the user in the vehicle and the image from the vehicle's dash cam includes: acquiring information about the gaze area of the user in the vehicle in N frames of images, and M frames of dash cam images, where the N frames of images and the M frames of dash cam images are acquired between the same start time and end time.
- Determining the target gaze area of the user in the vehicle according to the information about the gaze area of the user in the vehicle and the information about the first gaze area in the dash cam image includes: determining that the difference in the gaze area of the user in the vehicle across the N frames of images is within a first preset range; determining that the difference in the first gaze area across the M frames of dash cam images is within a second preset range; and determining the overlap area according to the gaze area of the user in the vehicle in the N frames of images and the first gaze area in the dash cam images.
- The difference in the gaze area refers to the position difference of the gaze area; the difference in the first gaze area refers to the position difference of the first gaze area.
- The N frames of images of the user in the vehicle and the M frames of dash cam images may also be acquired within an allowable time difference; that is, the time at which the N frames of images of the user in the vehicle are acquired and the time at which the M frames of dash cam images are acquired may be close to each other.
- The gaze area of the user in the vehicle in the next few frames can be predicted from the acquired N frames of images of the user in the vehicle; alternatively, the first gaze area in the next few frames can be predicted from the M frames of dash cam images.
- The gaze area of the user in the vehicle in the N frames of images may be obtained from N frames of images of the user captured by a camera arranged in the cockpit of the vehicle; the user's gaze area can be determined from these N frames of images of the user.
- For example, the user's gaze area in the direction of the windshield can be determined from the user's head position in the N frames of images of the user in the vehicle.
- The number N of frames of images of the user in the vehicle and the number M of frames of dash cam images acquired between the same start time and end time may be the same or different.
- The driver's line of sight can be tracked first; that is, it is determined that the difference across the N frames of images of the driver is within the first preset range, meaning that across the N frames the driver's gaze area is not directly ahead and changes only slightly.
- In addition, it is determined that the dash cam has kept capturing the same object over multiple frames without losing it. The driver's target gaze area is then determined from the driver's gaze area in the N frames of images and the M frames of dash cam images. Requiring multiple frames to satisfy the first preset range and the second preset range helps ensure the robustness of the identification method provided in this application.
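The multi-frame check can be sketched as follows, assuming each gaze area and each first gaze area is summarized by its center point in a common coordinate system. Measuring the position difference as the largest drift from the first frame is an assumption made here for illustration; the application only requires that the position differences fall within the first and second preset ranges. The function names and parameters are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def max_drift(centers: List[Point]) -> float:
    """Largest distance of any frame's center from the first frame's center."""
    x0, y0 = centers[0]
    return max(math.hypot(x - x0, y - y0) for x, y in centers)


def frames_are_stable(gaze_centers: List[Point],
                      first_gaze_centers: List[Point],
                      first_range: float,
                      second_range: float) -> bool:
    """Check both conditions before computing the overlap.

    gaze_centers: centers of the user's gaze area in the N cabin frames.
    first_gaze_centers: centers of the first gaze area in the M dash cam frames.
    """
    return (max_drift(gaze_centers) <= first_range
            and max_drift(first_gaze_centers) <= second_range)
```

Only when `frames_are_stable(...)` returns `True` would the overlap of the two areas be computed; a glance that jumps between frames, or an object that leaves the dash cam view, is rejected.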
- The method further includes: mapping the gaze area of the user in the vehicle onto the imaging plane in which the dash cam image is located; or mapping the dash cam image onto the imaging plane in which the gaze area of the user in the vehicle is located.
- The images located in the two imaging planes can be projected onto the same imaging plane: either the gaze area of the user in the vehicle is mapped onto the imaging plane of the dash cam image, or the dash cam image is mapped onto the imaging plane in which the gaze area of the user in the vehicle is located.
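One common way to map points between two imaging planes is a 3x3 homography, used here purely as an assumption since the application does not name the mapping. In the sketch below, the matrix `H` is hypothetical and would in practice come from calibrating the cabin camera against the dash cam; the function names are illustrative only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix = Sequence[Sequence[float]]  # 3x3, row-major


def map_point(H: Matrix, pt: Point) -> Point:
    """Apply a 3x3 homography to a 2D point using homogeneous coordinates."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


def map_region(H: Matrix, corners: List[Point]) -> List[Point]:
    """Map the corner points of a gaze area onto the other imaging plane."""
    return [map_point(H, c) for c in corners]
```

With a matrix that scales by 2 and shifts by (10, 5), that is `[[2, 0, 10], [0, 2, 5], [0, 0, 1]]`, the point `(3, 4)` maps to `(16.0, 13.0)`. Once both areas are in the same plane, their overlap can be computed directly.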
- The method further includes: displaying information about the target gaze area on a display screen of the vehicle.
- The vehicle includes a plurality of display screens, and displaying the information about the target gaze area on a display screen of the vehicle includes: determining a target display screen among the plurality of display screens according to the position of the user within the vehicle; and displaying the information about the target gaze area on the target display screen.
- The user's identity is determined from the user's position in the vehicle, and the identity may be driver or passenger. The information about the target gaze area can then be pushed to the user according to that identity: for example, it can be shown on the display screen at the corresponding position, broadcast, or pushed to the user's mobile phone so that the user can look further into the target gaze area later.
- A head-up display (HUD) system can display the information about the target gaze area in the vehicle; for example, the information can be displayed on the front windshield.
- The information about the target gaze area may be displayed on the display screen corresponding to the user's position.
- The information about the target gaze area is displayed in the vehicle through a head-up display (HUD) system.
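Selecting the target display from the user's position can be sketched as a simple lookup. The seat names, display names, and the fallback display below are hypothetical; the application only states that the target display screen is chosen according to the user's position in the vehicle and that a HUD may be used.

```python
# Hypothetical mapping from seat position to the display nearest that seat.
SEAT_TO_DISPLAY = {
    "driver": "hud",
    "front_passenger": "front_passenger_screen",
    "rear_left": "rear_left_screen",
    "rear_right": "rear_right_screen",
}


def select_display(seat: str, default: str = "center_console") -> str:
    """Return the display on which to show the target gaze area information."""
    return SEAT_TO_DISPLAY.get(seat, default)
```

In this sketch the driver's information goes to the HUD so that it appears near the front windshield, and an unrecognized seat falls back to a shared display.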
- The above method for identifying objects of interest to users can be applied to smart terminal scenarios.
- The method provided in this application can be used to identify objects of interest to users, thereby providing users with smarter services and effectively improving the user experience.
- A device for identifying an object of interest to a user is provided, including: an acquisition module for acquiring information about the user's gaze area and an environment image corresponding to the user; and a processing module for obtaining, from the environment image, information about a first gaze area of the user in the environment image, where the first gaze area indicates a sensitive area determined by the physical characteristics of the human body, and for obtaining the user's target gaze area according to the information about the gaze area and the information about the first gaze area, where the target gaze area indicates the area in the environment image in which the target object gazed at by the user is located.
- The environment image corresponding to the user may refer to an image of the environment in which the user is located.
- For example, in the field of smart cars, the user may be the driver of a vehicle or a passenger in the vehicle, and the environment image corresponding to the user may be an image of the environment in which the vehicle is located, or an image of the area outside the vehicle.
- The information about the user's gaze area may include the position of the gaze area, the direction of the gaze area, and the size of the gaze area.
- In the field of smart homes, the user may be a smart-home user in a home, and the environment image corresponding to the user may be an image of the home in which that user is located.
- The first gaze area in the environment image is the user's sensitive area as determined by the biological characteristics of the human body, but it is not necessarily the area the user is actually gazing at. The sensitive area is the area that readily attracts the user's attention, determined from the physical characteristics of the human body, for example the sensitivity of the human eye to changes in color and shape.
- For example, the optic nerve of the human eye has different sensitivity to light of different wavelengths. The human eye is most sensitive to electromagnetic waves with a wavelength of about 555 nm, which lies in the green region of the visible spectrum, so the human eye is more sensitive to green light. The user's sensitive area in an image may therefore be the green area in the image.
- the first gaze area in the environmental image may refer to the user's pre-determined sensitive area in the environmental image based on the characteristics of the human object; and the target gaze area refers to the user's gaze area in the environmental image, that is, the user's gaze area.
- the area where the target object is located; the first gaze area in the environment image may be the same for different users, but for different users, different users can gaze at the object of interest in the environment image based on their own interests Area.
- the first gaze area in the environment image can be represented by an interest value for each area in the image. The richness of morphological changes can be obtained through a deep learning method or an edge detection method, the richness of color changes can be obtained through a gradient calculation method, and the two values are weighted to calculate the interest value at each position of the image. Based on the interest value at each position, the area of the environment image that the user may be interested in is determined in advance.
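- the weighting described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the shape richness is approximated by the gradient magnitude of the gray image (a very simple edge detector), the color richness by the summed gradient magnitude of the three color channels, and the function names and the equal default weights `w_shape` and `w_color` are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def _gradient_magnitude(channel: Sequence[Sequence[float]]) -> List[List[float]]:
    """Central-difference gradient magnitude (one-sided at the image borders)."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            gx = (channel[y][x1] - channel[y][x0]) / max(x1 - x0, 1)
            gy = (channel[y1][x] - channel[y0][x]) / max(y1 - y0, 1)
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def _normalise(values: List[List[float]]) -> List[List[float]]:
    """Scale values to [0, 1]; an all-zero map is returned unchanged."""
    peak = max(max(row) for row in values)
    if peak == 0:
        return values
    return [[v / peak for v in row] for row in values]


def interest_map(image: Sequence[Sequence[Pixel]],
                 w_shape: float = 0.5,
                 w_color: float = 0.5) -> List[List[float]]:
    """Weighted sum of shape richness and color richness at each pixel."""
    h, w = len(image), len(image[0])
    gray = [[sum(px) / 3.0 for px in row] for row in image]
    shape = _normalise(_gradient_magnitude(gray))
    color = [[0.0] * w for _ in range(h)]
    for c in range(3):
        grad = _gradient_magnitude([[px[c] for px in row] for row in image])
        for y in range(h):
            for x in range(w):
                color[y][x] += grad[y][x]
    color = _normalise(color)
    return [[w_shape * shape[y][x] + w_color * color[y][x] for x in range(w)]
            for y in range(h)]
```

- positions with a high interest value are candidates for the first gaze area; a uniform image yields an interest value of zero everywhere, while positions next to a sharp boundary score highest.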
- a pre-judgment of the environment image is introduced when identifying the object of interest to the user; that is, the user's target gaze area is determined according to both the user's sensitive area in the environment image and the user's gaze area, thereby improving the accuracy of identifying objects of interest to the user.
- the processing module is specifically configured to:
- determine the target gaze area according to the overlap area of the gaze area and the first gaze area.
- the user's target gaze area, that is, the area in which the target object of interest to the user is located, can be determined from the overlap between the user's sensitive area in the environment image and the user's gaze area, thereby improving the accuracy of identifying the object of interest to the user.
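- a minimal sketch of the overlap step, assuming both the gaze area and the first gaze area are already expressed as axis-aligned rectangles in the same image coordinates. The `Box` layout and the function name `overlap_area` are hypothetical.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def overlap_area(gaze: Box, sensitive: Box) -> Optional[Box]:
    """Return the rectangle shared by both areas, or None if they do not overlap."""
    x_min = max(gaze[0], sensitive[0])
    y_min = max(gaze[1], sensitive[1])
    x_max = min(gaze[2], sensitive[2])
    y_max = min(gaze[3], sensitive[3])
    if x_min >= x_max or y_min >= y_max:
        return None  # no shared area, so no target gaze area from this pair
    return (x_min, y_min, x_max, y_max)
```

- if the areas are represented as pixel masks instead of rectangles, the same idea becomes an element-wise logical AND of the two masks.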
- the user is a user in a vehicle
- the acquisition module is specifically configured to: acquire information about the gaze area of the user in the vehicle and an image from the driving recorder of the vehicle, wherein the gaze area of the user in the vehicle is used to indicate the area outside the vehicle at which the user in the vehicle is gazing;
- the processing module is specifically configured to determine the target gaze area of the driver according to the information of the gaze area of the driver and the information of the first gaze area in the image of the driving recorder.
- the user in the vehicle may refer to the user located inside the vehicle; for example, the user may refer to the driver of the vehicle, or the user may refer to the passengers of the vehicle.
- when the user in the vehicle is the driver of the vehicle, the gaze area of the driver is used to indicate the gaze area of the driver in the direction of the front windshield of the vehicle.
- when the user in the vehicle is a passenger in the vehicle, for example, a passenger in the front passenger position, the user's gaze area is used to indicate the gaze area of the passenger in the direction of the front windshield of the vehicle, or in the direction of a window of the vehicle.
- when the user in the vehicle is a passenger in the back row of the vehicle, the gaze area of the user is used to indicate the gaze area of the passenger in the direction of a window of the vehicle.
- the user in the vehicle can be the driver of the vehicle or a passenger of the vehicle. To obtain the gaze area of the user in the vehicle, an image of the user located inside the vehicle is captured by a camera in the vehicle cab, and the user's gaze area, that is, the user's gaze area on the windshield of the vehicle, is determined from the state of the user's face and eyes in the image.
- the camera in the cab can be the camera of a driver monitoring system or a cockpit monitoring system.
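- one common way to turn a face and eye state into a gaze point on the windshield is to intersect the gaze ray with a plane that approximates the windshield. The sketch below assumes the eye position and gaze direction have already been estimated in a vehicle coordinate frame and that the windshield is treated as a flat plane; both assumptions, and the function name `gaze_point_on_plane`, are illustrative and are not stated in the application.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def gaze_point_on_plane(eye: Vec3, direction: Vec3,
                        plane_point: Vec3, plane_normal: Vec3) -> Optional[Vec3]:
    """Intersect the gaze ray (eye + t * direction, t >= 0) with a plane."""
    denom = _dot(direction, plane_normal)
    if abs(denom) < 1e-9:
        return None  # gaze is parallel to the plane
    offset = tuple(p - e for p, e in zip(plane_point, eye))
    t = _dot(offset, plane_normal) / denom
    if t < 0:
        return None  # the plane lies behind the user
    return tuple(e + t * d for e, d in zip(eye, direction))
```

- the gaze area can then be taken as a region of the chosen size around the returned point.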
- the acquisition module is specifically configured to: acquire information about the gaze area of the user in the vehicle in N frames of images, and M frames of images from the driving recorder;
- the processing module is specifically used for:
- the difference in the gaze area refers to the position difference of the gaze area; the difference in the first gaze area refers to the position difference of the first gaze area.
- the acquired N frames of images are images of the user in the vehicle; the N frames of user images and the M frames of dash cam images are acquired within the same time period, that is, between the same start time and end time.
- alternatively, the N frames of user images and the M frames of dash cam images can be acquired within an allowable time difference; that is, the time at which the N frames of user images are acquired and the time at which the M frames of dash cam images are acquired are close to each other.
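- the "allowable time difference" can be illustrated by pairing each in-cabin frame with the nearest dash cam frame and keeping only pairs whose timestamps are close enough. Timestamps in seconds, the tolerance parameter `max_diff`, and the function name `pair_frames` are assumptions for this sketch.

```python
from typing import List, Sequence, Tuple


def pair_frames(user_ts: Sequence[float], dvr_ts: Sequence[float],
                max_diff: float) -> List[Tuple[int, int]]:
    """Pair each user-frame index with the nearest dash cam frame index
    whose timestamp differs by at most max_diff."""
    pairs: List[Tuple[int, int]] = []
    if not dvr_ts:
        return pairs
    for i, t in enumerate(user_ts):
        j = min(range(len(dvr_ts)), key=lambda k: abs(dvr_ts[k] - t))
        if abs(dvr_ts[j] - t) <= max_diff:
            pairs.append((i, j))
    return pairs
```

- because N and M need not be equal, some frames may stay unpaired; those frames are simply skipped in this sketch.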
- the gaze area of the user in the vehicle in the next few frames of images can be predicted based on the acquired N frames of user images; or, the first gaze area in the next few frames of images can be predicted based on the M frames of images from the driving recorder.
- the number of frames of user images (N) and the number of frames of dash cam images (M) acquired between the same start time and end time may be the same or different.
- the line of sight of the user (for example, the driver of the vehicle) may be tracked first; that is, it is determined that the difference between the gaze areas in the acquired N frames of driver images satisfies the first preset range. If the driver's gaze area in the N frames of driver images is not directly in front of the vehicle and the change across the N frames remains small, it can be determined that the driver is gazing at an object of interest in the scene outside the vehicle. It is also determined that the difference between the acquired M frames of driving recorder images satisfies the second preset range.
- the overlap area between the driver's gaze area in the N frames of images and the first gaze area in the M frames of dash cam images is then determined, so as to determine the driver's target gaze area. Using multiple frames of images that satisfy the first preset range and the second preset range helps ensure the robustness of the method for identifying objects of interest to users provided by this application.
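- the two preset-range checks can be sketched as follows, assuming each area is summarized by its center point and that the "difference" is measured as the largest displacement between consecutive frames. That choice of measure, the thresholds, and the function names are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def max_position_difference(centers: Sequence[Point]) -> float:
    """Largest Euclidean distance between the centers of consecutive frames."""
    return max(
        (((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
         for (x0, y0), (x1, y1) in zip(centers, centers[1:])),
        default=0.0,
    )


def is_stable_gaze(gaze_centers: Sequence[Point],
                   sensitive_centers: Sequence[Point],
                   first_range: float,
                   second_range: float) -> bool:
    """True when the gaze areas over N frames stay within first_range and the
    first gaze areas over M frames stay within second_range."""
    return (max_position_difference(gaze_centers) <= first_range
            and max_position_difference(sensitive_centers) <= second_range)
```

- only when both checks pass would the overlap of the two areas be computed, which filters out brief glances and rapidly changing scenes.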
- the processing module is further configured to:
- the images located on the two imaging planes can be projected onto the same imaging plane;
- the gaze area of the driver is mapped to the imaging plane where the image of the driving recorder is located; alternatively, the image of the dash cam can be mapped to the imaging plane where the gaze area of the driver is located.
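- mapping a point from one imaging plane to another is often modelled with a 3x3 planar homography. The application does not state which mapping it uses, so the homography below, its matrix values, and the function name `project_point` are assumptions; in practice the matrix would come from calibration between the in-cabin camera and the driving recorder.

```python
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def project_point(h: Matrix3, point: Tuple[float, float]) -> Tuple[float, float]:
    """Apply a 3x3 homography to a 2D point using homogeneous coordinates."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)
```

- projecting the corners of the gaze area with the same matrix gives the gaze area in dash cam image coordinates, where it can be compared with the first gaze area.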
- the processing module is further configured to:
- the information of the target gaze area is displayed on the display screen of the vehicle.
- the vehicle includes multiple display screens, and the processing module is specifically configured to:
- determine the target display screen among the plurality of display screens according to the position information of the user within the vehicle; and display the information of the target gaze area on the target display screen.
- the user's identity information is determined according to the user's position information in the vehicle, and the identity information may indicate the driver or a passenger; the information of the target gaze area can then be pushed to the user according to the user's identity information. For example, the information of the target gaze area can be displayed on the display screen at the corresponding position; or, the information of the target gaze area can be broadcast by voice; or, the information of the target gaze area can be pushed to the user's mobile phone so that the user can learn more about the target gaze area later.
- the head-up display HUD system can display the information of the target gaze area in the vehicle; for example, the information of the target gaze area can be displayed on the front windshield.
- the information of the target gaze area may be displayed on the display screen corresponding to the position.
- the information of the target gaze area is displayed in the vehicle through a head-up display HUD system.
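- choosing the target display from the user's position can be sketched as a simple lookup. The seat categories, display names, and fallback screen below are hypothetical and depend entirely on the vehicle; the text above only requires that the display correspond to the user's position.

```python
from enum import Enum
from typing import Set


class Seat(Enum):
    DRIVER = "driver"
    FRONT_PASSENGER = "front_passenger"
    REAR = "rear"


# Hypothetical seat-to-display table; a real vehicle defines its own.
DISPLAY_FOR_SEAT = {
    Seat.DRIVER: "hud",
    Seat.FRONT_PASSENGER: "front_passenger_screen",
    Seat.REAR: "rear_seat_screen",
}


def select_target_display(seat: Seat, available: Set[str],
                          fallback: str = "center_console") -> str:
    """Return the display matching the seat, or the fallback if it is absent."""
    preferred = DISPLAY_FOR_SEAT[seat]
    return preferred if preferred in available else fallback
```

- mapping the driver to the HUD reflects the option of displaying the information on the front windshield; voice broadcast or a push to the user's phone could be added as further delivery channels.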
- the identification device is an in-vehicle device in the vehicle.
- the foregoing device for identifying objects of interest to users can be applied in a smart terminal scenario, where the identification device provided by this application can be used to identify objects of interest to the user, thereby providing users with smarter services and effectively enhancing the user experience.
- a device for identifying an object of interest to a user is provided, including: a memory for storing a program; and a processor for executing the program stored in the memory. When the program stored in the memory is executed, the processor is configured to perform the following process: obtain information of the gaze area of the user and the environment image corresponding to the user; obtain information of the first gaze area of the user in the environment image according to the environment image, wherein the first gaze area is used to indicate a sensitive area determined by the physical characteristics of the human body; and obtain the target gaze area of the user according to the information of the gaze area and the information of the first gaze area, wherein the target gaze area is used to indicate the area in which the target object gazed at by the user in the environment image is located.
- the processor included in the foregoing identification device is further configured to execute the method for identifying an object of interest to the user in the first aspect and any one of the implementation manners of the first aspect.
- in a fourth aspect, an automobile is provided, which includes the identification device in the second aspect or any one of the implementation manners of the second aspect.
- a vehicle system is provided, which includes a camera disposed inside the vehicle, a driving recorder, and the identification device in the second aspect or any one of the implementation manners of the second aspect.
- the camera configured in the vehicle may refer to a driver monitoring system (DMS) camera, or a cockpit monitoring system (CMS) camera.
- the aforementioned camera can be placed near the A-pillar of the vehicle, at the steering wheel or the instrument panel, or near the rearview mirror.
- a computer-readable storage medium is provided. The computer-readable storage medium is used to store program code; when the program code is executed by a computer, the computer executes the method in the first aspect or any one of the implementation manners of the first aspect.
- in a seventh aspect, a chip is provided. The chip includes a processor, and the processor is configured to execute the method for identifying an object of interest to a user in the first aspect or any one of the implementation manners of the first aspect.
- the chip of the seventh aspect described above may be located in an in-vehicle terminal of a vehicle.
- a computer program product is provided, comprising computer program code; when the computer program code runs on a computer, the computer executes the method in the first aspect or any one of the implementation manners of the first aspect.
- the above-mentioned computer program code may be stored in whole or in part on a first storage medium, where the first storage medium may be packaged together with the processor or packaged separately from the processor; this is not specifically limited.
- FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the present application.
- FIG. 2 is a functional block diagram of a vehicle 100 provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of the hardware architecture of a vehicle provided by an embodiment of the present application.
- FIG. 4 is a schematic diagram of the software architecture of a vehicle provided by an embodiment of the present application.
- FIG. 5 is a schematic flowchart of a method for recognizing an object of interest for a user according to an embodiment of the present application.
- FIG. 6 is a schematic flowchart of a method for recognizing an object of interest for a user according to an embodiment of the present application.
- FIG. 7 is a schematic diagram of a driver's gaze area provided by an embodiment of the present application.
- FIG. 8 is a schematic flowchart of calculating the position of the driver's head provided by an embodiment of the present application.
- FIG. 9 is a schematic diagram of calculating the gaze area of the DVR line of sight according to an embodiment of the present application.
- FIG. 10 is a schematic diagram of scene attention prediction provided by an embodiment of the present application.
- FIG. 11 is a schematic flowchart of scene attention area judgment in the driving field provided by an embodiment of the present application.
- FIG. 12 is a schematic flowchart of scene attention area judgment in the field of smart terminals provided by an embodiment of the present application.
- FIG. 13 is a schematic diagram of a device for identifying objects of interest for a user provided by an embodiment of the present application.
- FIG. 14 is a schematic diagram of a device for identifying objects of interest for a user provided in an embodiment of the present application.
- FIG. 1 is a schematic diagram of an application scenario of a method for recognizing an object of interest for a user provided in an embodiment of the present application.
- the method for identifying objects of interest to users provided in the embodiments of the present application can be applied in the field of smart cars.
- users in the vehicle may be interested in roadside objects, such as roadside billboards. However, for driving safety, the driver cannot keep his or her sight on the object of interest for a long time and therefore cannot learn the detailed information of the object of interest in real time; or, because the vehicle is moving at a high speed, the passengers in the vehicle cannot learn about the target object of interest in time, so that the points of interest of the users in the vehicle cannot be effectively recorded.
- the method for identifying objects of interest provided in the embodiments of the present application uses the interaction between the line of sight of the user in the vehicle and the scene outside the vehicle to accurately identify the target object of interest to the user while the vehicle is driving, and then pushes the corresponding information to the user in the vehicle, which can effectively improve the user experience.
- Fig. 2 is a functional block diagram of a vehicle 100 provided by an embodiment of the present application.
- the vehicle 100 may be a manually driven vehicle, or the vehicle 100 may be configured in a fully or partially automatic driving mode.
- the vehicle 100 can control itself while in the automatic driving mode, and can determine the current state of the vehicle and its surrounding environment through human operations, determine the possible behavior of at least one other vehicle in the surrounding environment, determine the confidence level corresponding to the possibility that the other vehicle performs the possible behavior, and control the vehicle 100 based on the determined information.
- the vehicle 100 can be set to operate without human interaction.
- the vehicle 100 may include various subsystems, such as a traveling system 110, a sensing system 120, a control system 130, one or more peripheral devices 140, a computer system 150, a power supply 160, and a user interface 170.
- the vehicle 100 may include more or fewer subsystems, and each subsystem may include multiple elements.
- each of the subsystems and elements of the vehicle 100 may be wired or wirelessly interconnected.
- the traveling system 110 may include components that provide powered motion for the vehicle 100.
- the traveling system 110 may include an engine 111, a transmission device 112, an energy source 113, and wheels 114 (tires).
- the engine 111 may be an internal combustion engine, an electric motor, an air compression engine, or other types of engine combinations; for example, a hybrid engine composed of a gasoline engine and an electric motor, or a hybrid engine composed of an internal combustion engine and an air compression engine.
- the engine 111 can convert the energy source 113 into mechanical energy.
- the energy source 113 may include gasoline, diesel, other petroleum-based fuels, propane, other compressed gas-based fuels, ethanol, solar panels, batteries, and other power sources.
- the energy source 113 may also provide energy for other systems of the vehicle 100.
- the transmission device 112 may include a gearbox, a differential, and a drive shaft; wherein, the transmission device 112 may transmit mechanical power from the engine 111 to the wheels 114.
- the transmission device 112 may also include other devices, such as a clutch.
- the drive shaft may include one or more shafts that can be coupled to one or more wheels 114.
- the sensing system 120 may include several sensors that sense information about the environment around the vehicle 100.
- the sensing system 120 may include a positioning system 121 (for example, a GPS system, a Beidou system or other positioning systems), an inertial measurement unit 122 (IMU), a radar 123, a laser rangefinder 124, and a camera 125.
- the sensing system 120 may also include sensors that monitor the internal systems of the vehicle 100 (for example, an in-vehicle air quality monitor, a fuel gauge, an oil temperature gauge, etc.). Sensor data from one or more of these sensors can be used to detect objects and their corresponding characteristics (position, shape, direction, speed, etc.). Such detection and identification are key functions for the safe operation of the autonomous vehicle 100.
- the positioning system 121 can be used to estimate the geographic location of the vehicle 100.
- the IMU 122 may be used to sense changes in the position and orientation of the vehicle 100 based on inertial acceleration.
- the IMU 122 may be a combination of an accelerometer and a gyroscope.
- the radar 123 may use radio signals to sense objects in the surrounding environment of the vehicle 100. In some embodiments, in addition to sensing the object, the radar 123 may also be used to sense the speed and/or direction of the object.
- the laser rangefinder 124 may use laser light to sense objects in the environment where the vehicle 100 is located.
- the laser rangefinder 124 may include one or more laser sources, laser scanners, and one or more detectors, as well as other system components.
- the camera 125 may be used to capture multiple images of the surrounding environment of the vehicle 100.
- the camera 125 may be a still camera or a video camera.
- control system 130 controls the operation of the vehicle 100 and its components.
- the control system 130 may include various elements, such as a steering system 131, a throttle 132, a braking unit 133, a computer vision system 134, a route control system 135, and an obstacle avoidance system 136.
- the steering system 131 may be operated to adjust the heading of the vehicle 100; for example, in one embodiment it may be a steering wheel system.
- the throttle 132 may be used to control the operating speed of the engine 111 and thereby control the speed of the vehicle 100.
- the braking unit 133 may be used to control the deceleration of the vehicle 100; the braking unit 133 may use friction to slow down the wheels 114. In other embodiments, the braking unit 133 may convert the kinetic energy of the wheels 114 into electric current. The braking unit 133 may also take other forms to slow down the rotation speed of the wheels 114 to control the speed of the vehicle 100.
- the computer vision system 134 may be operable to process and analyze the images captured by the camera 125 in order to identify objects and/or features in the surrounding environment of the vehicle 100.
- the aforementioned objects and/or features may include traffic signals, road boundaries and obstacles.
- the computer vision system 134 may use object recognition algorithms, structure from motion (SFM) algorithms, video tracking, and other computer vision technologies.
- the computer vision system 134 may be used to map the environment, track objects, estimate the speed of objects, and so on.
- the route control system 135 may be used to determine the travel route of the vehicle 100.
- the route control system 135 may combine data from sensors, GPS, and one or more predetermined maps to determine a travel route for the vehicle 100.
- the obstacle avoidance system 136 may be used to identify, evaluate, and avoid or otherwise negotiate potential obstacles in the environment of the vehicle 100.
- the control system 130 may additionally or alternatively include components other than those shown and described; alternatively, some of the components shown above may be omitted.
- the vehicle 100 can interact with external sensors, other vehicles, other computer systems, or users through the peripheral device 140, wherein the peripheral device 140 can include a wireless communication system 141, an onboard computer 142, a microphone 143, and/or a speaker 144.
- the peripheral device 140 may provide a means for the vehicle 100 to interact with the user interface 170.
- the onboard computer 142 may provide information to the user of the vehicle 100.
- the user interface 170 can also operate the onboard computer 142 to receive user input; the onboard computer 142 can be operated through a touch screen.
- the peripheral device 140 may provide a means for the vehicle 100 to communicate with other devices located in the vehicle.
- the microphone 143 may receive audio (e.g., voice commands or other audio input) from the user of the vehicle 100.
- the speaker 144 may output audio to the user of the vehicle 100.
- the wireless communication system 141 may wirelessly communicate with one or more devices directly or via a communication network.
- the wireless communication system 141 can use 3G cellular communication, for example, code division multiple access (CDMA), EVDO, or global system for mobile communications (GSM)/general packet radio service (GPRS); 4G cellular communication, such as long term evolution (LTE); or 5G cellular communication.
- the wireless communication system 141 can communicate with a wireless local area network (WLAN) by using wireless Internet access (WiFi).
- the wireless communication system 141 may communicate directly with a device using an infrared link, Bluetooth, or ZigBee, or using other wireless protocols such as various vehicle communication systems. For example, the wireless communication system 141 may include one or more dedicated short range communications (DSRC) devices, which may carry public and/or private data communications between vehicles and/or roadside stations.
- the power supply 160 may provide power to various components of the vehicle 100.
- the power supply 160 may be a rechargeable lithium-ion or lead-acid battery.
- one or more battery packs of such batteries may be configured as a power supply to provide power to various components of the vehicle 100.
- the power supply 160 and the energy source 113 may be implemented together, as in some all-electric vehicles.
- part or all of the functions of the vehicle 100 may be controlled by the computer system 150, where the computer system 150 may include at least one processor 151, and the processor 151 executes instructions stored in a non-transitory computer-readable medium such as the memory 152.
- the computer system 150 may also be multiple computing devices that control individual components or subsystems of the vehicle 100 in a distributed manner.
- the processor 151 may be any conventional processor, such as a commercially available CPU.
- the processor may be a dedicated device such as an ASIC or other hardware-based processor.
- although FIG. 2 functionally illustrates the processor, the memory, and other elements of the computer in the same block, those of ordinary skill in the art should understand that the processor, computer, or memory may actually include multiple processors, computers, or memories that may or may not be housed in the same physical enclosure.
- for example, the memory may be a hard disk drive or other storage medium located in a housing different from that of the computer. Therefore, a reference to a processor or computer will be understood to include a reference to a collection of processors, computers, or memories that may or may not operate in parallel. Rather than using a single processor to perform the steps described here, some components, such as steering components and deceleration components, may each have their own processor that only performs calculations related to component-specific functions.
- the processor may be located away from the vehicle and wirelessly communicate with the vehicle.
- some of the processes described herein are executed on a processor disposed in the vehicle and others are executed by a remote processor, including taking the steps necessary to perform a single maneuver.
- the memory 152 may contain instructions 153 (e.g., program logic), which may be executed by the processor 151 to perform various functions of the vehicle 100, including those functions described above.
- the memory 152 may also contain additional instructions, for example, instructions for sending data to, receiving data from, interacting with, and/or controlling one or more of the traveling system 110, the sensing system 120, the control system 130, and the peripheral device 140.
- the memory 152 may also store data, such as road maps, route information, the position, direction, and speed of the vehicle, and other such vehicle data, as well as other information. Such information may be used by the vehicle 100 and the computer system 150 during the operation of the vehicle 100 in autonomous, semi-autonomous, and/or manual modes.
- the user interface 170 may be used to provide information to or receive information from a user of the vehicle 100.
- the user interface 170 may include one or more input/output devices in the set of peripheral devices 140, for example, the wireless communication system 141, the onboard computer 142, the microphone 143, and the speaker 144.
- the computer system 150 may control the functions of the vehicle 100 based on inputs received from various subsystems (for example, the traveling system 110, the sensing system 120, and the control system 130) and from the user interface 170.
- the computer system 150 may use input from the control system 130 in order to control the braking unit 133 to avoid obstacles detected by the sensing system 120 and the obstacle avoidance system 136.
- the computer system 150 is operable to provide control of many aspects of the vehicle 100 and its subsystems.
- one or more of the components described above may be installed separately from the vehicle 100 or merely associated with it.
- for example, the memory 152 may exist partially or completely separately from the vehicle 100.
- the above-mentioned components may be communicatively coupled together in a wired and/or wireless manner.
- FIG. 2 should not be construed as a limitation to the embodiment of the present application.
- the vehicle 100 may be an autonomous vehicle traveling on a road, and may recognize objects in its surrounding environment to determine the adjustment to the current speed.
- the object may be other vehicles, traffic control equipment, or other types of objects.
- each recognized object can be considered independently, and the respective characteristics of the object, such as its current speed, acceleration, and distance from the vehicle, can be used to determine the speed to which the self-driving car is to adjust.
- the vehicle 100 or a computing device associated with the vehicle 100 may predict the behavior of the identified object based on the characteristics of the identified object and the state of the surrounding environment (for example, traffic, rain, ice on the road, etc.).
- the recognized objects may depend on each other's behavior. Therefore, all recognized objects can also be considered together to predict the behavior of a single recognized object.
- the vehicle 100 can adjust its speed based on the predicted behavior of the identified object.
- the self-driving car can determine based on the predicted behavior of the object that the vehicle will need to be adjusted (e.g., accelerate, decelerate, or stop) to a stable state.
- other factors may also be considered to determine the speed of the vehicle 100, such as the lateral position of the vehicle 100 on the road on which it is traveling, the curvature of the road, the proximity of static and dynamic objects, and so on.
- the computing device can also provide instructions to modify the steering angle of the vehicle 100 so that the self-driving car follows a given trajectory and/or maintains safe lateral and longitudinal distances from objects near the self-driving car (for example, cars in adjacent lanes on the road).
- the above-mentioned vehicle 100 may be a car, truck, motorcycle, bus, boat, airplane, helicopter, lawn mower, recreational vehicle, playground vehicle, construction equipment, tram, golf cart, train, trolley, or the like; the embodiments of the present application are not particularly limited in this respect.
- the method for identifying objects of interest to users can also be applied to other fields, for example, the field of smart homes; the identification method of the present application can improve the accuracy of recognizing the user's target gaze area, thereby helping the smart home provide users with smarter services.
- methods for detecting objects of interest usually need to obtain information about the objects in the scene in advance; then, based on tracking of the user's gaze, it is determined which object in the scene the user is gazing at, so as to determine the object of interest to the user. Such methods depend on the information of the objects in the scene, but in many scenes (for example, driving scenes) it is impossible to obtain the information of the objects in the current scene in advance. In that case, the above method cannot be used to identify objects of interest, resulting in a poor user experience.
- the present application provides a method and device for identifying objects of interest to users.
- the user's target gaze area, that is, the area in which the target object of interest to the user is located, is determined from the sensitive area in the environment image and the user's gaze area, thereby improving the accuracy of identifying the user's object of interest.
- Fig. 3 is a schematic diagram of a hardware architecture provided by an embodiment of the present application.
- the vehicle 200 may include an in-vehicle camera 210, a driving recorder 220 and an image analysis system 230.
- the vehicle 200 may refer to a manually driven vehicle, or the vehicle 200 may be partially configured with an automatic driving mode.
- the in-vehicle camera 210 can be used to detect the status of a user in the vehicle (for example, the driver or a passenger); for example, driver fatigue monitoring, driver expression recognition, and driver eyeball positioning; or eyeball positioning and facial expression recognition of passengers in the vehicle, etc.
- a driver monitoring system (DMS) camera configured in the vehicle 200
- a cockpit monitoring system (CMS) camera configured in the vehicle 200
- the aforementioned camera may be located near the A-pillar of the vehicle 200, at the steering wheel or instrument panel, or near the rearview mirror.
- the driving recorder 220 may be used to record video images and sound information during the driving of the vehicle.
- a dash cam camera arranged at the front of the body of the vehicle 200.
- the image analysis system 230 may be used to process and analyze the images captured by the in-vehicle camera 210 or the driving recorder 220, in order to track the line of sight of the user in the vehicle 200 toward objects outside the vehicle.
- FIG. 3 is an example, and the vehicle 200 may also include other devices necessary for normal operation.
- Figure 4 is a schematic diagram of the software architecture provided by the present application.
- the vehicle 300 may include a user visual field detection module 310, an outside scene detection module 320, and a user gaze area detection module 330.
- the user visual field detection module 310 is used to detect the gaze area of the head line of sight of a user in the vehicle; for example, the driver's gaze area is the area where the driver's line of sight passes through the front windshield; or it may be the gaze area of the head line of sight of a passenger in the vehicle.
- the outside scene detection module 320 is used to detect the objects in the scene outside the vehicle and, based on the physiological characteristics of the human body, to predict which objects outside the vehicle, or which areas of those objects, the user is likely to be sensitive to.
- the user gaze area detection module 330 is configured to determine the user's target gaze area, that is, the area where the target object of interest to the user is located outside the vehicle, according to the results of the user visual field detection module 310 and the outside scene detection module 320.
- the software architecture shown in FIG. 4 is an example, and the vehicle 300 may also include software modules necessary for normal operation.
- identification method shown in FIG. 5 may be executed by the driving vehicle shown in FIG. 1 or the smart terminal shown in FIG. 2; wherein, the driving vehicle may be the vehicle shown in FIG. 3, or, as shown in FIG. Vehicles.
- the identification method 400 shown in FIG. 5 includes steps S410 to S430, and these steps will be described in detail below.
- the information of the user's gaze area in the vehicle can be obtained through the DMS camera or the CMS camera; for example, the driver's facial image can be collected through the DMS camera, and the gaze area of the driver's head line of sight can be determined based on the driver's facial image.
- the environment image can be obtained through the driving recorder of the vehicle; for example, the outside environment image during the driving of the vehicle can be obtained through the driving recorder; or the outside environment image during the driving of the vehicle can also be obtained from the cloud.
- after the DMS camera or CMS camera in the vehicle obtains the user's gaze area information and the dash cam obtains the environment image, this data can be sent to the vehicle's computer system (for example, on-board equipment) through the communication system.
- the communication system can use 3G cellular communication, for example CDMA, EVDO, or GSM/GPRS; 4G cellular communication, such as LTE; or 5G cellular communication; it can use WiFi to communicate with a wireless local area network (WLAN); it can use an infrared link, Bluetooth, or ZigBee to communicate directly with a device; it can also use other wireless protocols, such as various vehicle communication systems; for example, it may include one or more dedicated short-range communications (DSRC) devices, which may carry public and/or private data communication between vehicles and/or roadside stations.
- the environment image corresponding to the user may refer to the image of the environment where the user is located.
- the information of the user's gaze area may include position information of the user's gaze area, the direction of the user's gaze area, and the size of the user's gaze area.
- the user may refer to the driver of the vehicle; or, it may also refer to the passengers in the vehicle; the environment image corresponding to the user may refer to the image of the environment where the vehicle is located, or the environment image outside the vehicle.
- the vehicle may refer to a manually driven vehicle, or a vehicle that is fully or partially configured in an automatic driving mode.
- the user may refer to the user of the smart home in the home, and the environment image corresponding to the user may refer to the image of the home where the user of the smart home is located.
- S420 Obtain information of the first gaze area of the user in the environment image according to the environment image.
- the first gaze area is a sensitive area determined by the physical characteristics of the human body.
- the first gaze area in the environment image is the user's sensitive area determined by the biological characteristics of the human body, but it is not necessarily the user's actual gaze area; the sensitive area is the area likely to attract the user's attention, determined according to the physical characteristics of the human body, for example the human eye's sensitivity to changes in color and shape.
- the optic nerve of the human eye has different sensitivity to light of different wavelengths; the human eye is most sensitive to electromagnetic waves with a wavelength of about 555 nm, which lies in the green region of the visible spectrum, so the human eye is more sensitive to green light; therefore, the user's sensitive area for an image can refer to the green area in the image.
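As a toy illustration of the green-area example, the sketch below marks the pixels whose green channel clearly dominates. The function name `green_mask` and the `margin` threshold are assumptions made for illustration; the application does not specify how a green area would be extracted.

```python
def green_mask(image, margin=30):
    """Return a 2D list of booleans: True where G exceeds both R and B by `margin`.

    `image` is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    The margin value is an illustrative assumption, not taken from the source.
    """
    return [[(g - r >= margin) and (g - b >= margin) for (r, g, b) in row]
            for row in image]


image = [[(10, 200, 20), (200, 10, 10)],
         [(90, 100, 95), (0, 255, 0)]]
mask = green_mask(image)
```

A connected group of `True` cells in `mask` would then be a candidate sensitive area under this simplified reading.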
- the first gaze area in the environment image can be represented by the interest value of each area in the image; the interest value can be obtained through a deep learning method, or by using an edge detection method to obtain the richness of morphological changes and a gradient calculation method to obtain the richness of color changes, and then weighting these two values to calculate the interest value of each position in the image; based on the interest value of each position, it is pre-judged which areas of the image the driver is more sensitive to and which are therefore likely to attract the user's attention; see Figure 10 below.
- the Canny edge detection algorithm can be divided into the following 5 steps:
- Step 1 Use Gaussian filtering to smooth the environment image, the purpose is to remove the noise in the environment image;
- Step 2 Determine the intensity gradients in the environment image
- Step 3 Eliminate false edge detection by using non-maximum suppression technology (for example, it is not an edge but it is detected as an edge);
- Step 4 Determine the possible (potential) boundary in the environment image by using a double threshold method
- Step 5 Use hysteresis technology to track the boundaries in the environmental image.
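The weighted interest value described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: a plain central-difference gradient magnitude on the grayscale image stands in for the Canny edge detector, the color richness is taken as the mean gradient magnitude of the three color channels, and the names `interest_map`, `w_shape`, and `w_color` as well as the equal weights are hypothetical.

```python
def _grad_mag(channel, y, x):
    """Central-difference gradient magnitude at (y, x), clamping at the borders."""
    h, w = len(channel), len(channel[0])
    gx = channel[y][min(x + 1, w - 1)] - channel[y][max(x - 1, 0)]
    gy = channel[min(y + 1, h - 1)][x] - channel[max(y - 1, 0)][x]
    return (gx * gx + gy * gy) ** 0.5


def interest_map(image, w_shape=0.5, w_color=0.5):
    """Weighted per-pixel interest value from shape and color variation.

    `image` is a list of rows of (r, g, b) tuples. Shape richness is the
    gradient magnitude of the grayscale image (a simplified stand-in for an
    edge detector); color richness is the mean gradient magnitude of the
    three color channels. The weights are illustrative assumptions.
    """
    h, w = len(image), len(image[0])
    gray = [[sum(px) / 3.0 for px in row] for row in image]
    chans = [[[px[c] for px in row] for row in image] for c in range(3)]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            shape = _grad_mag(gray, y, x)
            color = sum(_grad_mag(ch, y, x) for ch in chans) / 3.0
            row.append(w_shape * shape + w_color * color)
        out.append(row)
    return out


flat = [[(50, 50, 50)] * 4 for _ in range(4)]
edge = [[(0, 0, 0), (0, 0, 0), (255, 255, 255), (255, 255, 255)] for _ in range(4)]
flat_scores = interest_map(flat)
edge_scores = interest_map(edge)
```

A uniform image scores zero everywhere, while the columns next to the black-to-white transition in `edge` receive the highest values; thresholding such a map is one possible way to obtain a candidate sensitive area.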
- S430 Obtain the target gaze area of the user according to the information of the gaze area of the user and the information of the first gaze area.
- the target gaze area is used to indicate the area where the target object that the user gazes at is located in the environment image.
- the target object that the user gazes at may refer to the target object that the user is interested in; the object that the user is interested in can be determined by collecting the user's historical behavior data to obtain the user's degree of interest in different objects, or by collecting the tags that users assign to different objects.
- the first gaze area may refer to the user's sensitive area in the environment image, pre-judged based on the physiological characteristics of the human body; the target gaze area refers to the area in the environment image where the target object of interest to the user is located; the first gaze area may be the same for different users, whereas the target gaze area is user-specific: different users gaze at different areas of the environment image according to their own interests.
- the pre-judgment of the environment image can be introduced, that is, the user's target gaze area can be determined according to the sensitive area in the environment image and the gaze area of the user.
- the method further includes: displaying the information of the target gaze area on the display screen of the vehicle.
- the user may refer to the driver of the vehicle; or, it may also refer to the passengers in the vehicle; the vehicle may detect the identity information of the user, for example, the user is determined as the driver or the passenger according to the location information of the user in the vehicle;
- the information of the target gaze area can be pushed to the user according to the user's identity information; for example, the information of the target gaze area can be displayed on the display screen at the corresponding position; or the information of the target gaze area can be broadcast; or the information of the target gaze area can be pushed to the user's mobile phone so that the user can continue to learn about the target gaze area later.
- the information of the target gaze area can be displayed in the vehicle through a head-up display (HUD); for example, the information of the target gaze area can be displayed on the front windshield.
- the information of the target gaze area may be displayed on the display screen corresponding to the position.
- obtaining the target gaze area of the user according to the information of the gaze area of the user and the information of the first gaze area may include: determining the user's target gaze area according to the overlap area between the user's gaze area and the first gaze area.
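The overlap step can be illustrated with axis-aligned rectangles. The sketch below assumes, purely for illustration, that both the user's gaze area and the first gaze area are represented as rectangles in the same image coordinates; the `Rect` type and the `overlap` function are hypothetical names.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def overlap(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the intersection rectangle of a and b, or None if they do not overlap."""
    left, top = max(a.x, b.x), max(a.y, b.y)
    right = min(a.x + a.w, b.x + b.w)
    bottom = min(a.y + a.h, b.y + b.h)
    if right <= left or bottom <= top:
        return None
    return Rect(left, top, right - left, bottom - top)


gaze = Rect(100, 50, 200, 100)        # user's gaze area
sensitive = Rect(250, 100, 200, 100)  # first gaze area (sensitive area)
target = overlap(gaze, sensitive)     # candidate target gaze area
```

Here `target` is the 50 by 50 rectangle with its top-left corner at (250, 100); a `None` result would mean that the gaze area and the sensitive area share no region.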
- the above-mentioned method for identifying objects of interest to users can be applied in a driving scene.
- the user can refer to a user located inside a vehicle; acquiring the information of the user's gaze area and the environment image corresponding to the user may include: acquiring the information of the gaze area of the user in the vehicle and the image of the vehicle's dash cam, where the gaze area of the user in the vehicle represents the area outside the vehicle at which the user in the vehicle is gazing; obtaining the user's target gaze area according to the information of the user's gaze area and the information of the first gaze area includes: determining the target gaze area of the user in the vehicle according to the information of the gaze area of the user in the vehicle and the information of the first gaze area in the dash cam image.
- the user in the vehicle is the driver of the vehicle
- the gaze area of the driver is used to indicate the gaze area of the driver in the direction of the front windshield of the vehicle.
- the user in the vehicle is a passenger in the vehicle.
- the user’s gaze area is used to indicate the gaze area of the passenger in the direction of the front windshield of the vehicle, or the vehicle is in the car.
- the user's target gaze area may refer to the area where the driver's object of interest is located in the scene outside the vehicle; for example, a billboard at the side of the road.
- obtaining the information of the user's gaze area in the vehicle and the image of the vehicle's dash cam may include: obtaining the information of the gaze area of the user in the vehicle in N frames of images, and M frames of dash cam images; determining the target gaze area of the user in the vehicle according to the information of the user's gaze area in the vehicle and the information of the first gaze area in the dash cam image may include: determining that the difference in the gaze area of the user in the vehicle across the N frames of images meets the first preset range; determining that the difference in the first gaze area across the M frames of dash cam images meets the second preset range; determining the overlap area between the gaze area of the user in the vehicle in the N frames of images and the first gaze area in the M frames of dash cam images; and determining the target gaze area of the user in the vehicle according to the overlap area.
- the difference in the gaze area of the driver in the above N frames of images may refer to the difference in the position of the driver's gaze area across the frames, or to the difference in its size; this difference can be used to determine whether the driver continues to look at the same target object within a preset time.
- the difference in the first gaze area in the M frames of dash cam images can refer to the difference in the position of the first gaze area across the dash cam frames, or to the difference in its size; this difference can be used to determine whether the user in the vehicle can keep the target object in the environment image in view within the preset time, which excludes the case where the driving direction of the vehicle changes suddenly, so that the user glimpses the target object outside the vehicle but cannot continue to look at it.
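The position and size checks against a preset range can be sketched as below. The representation of each region as a center plus width and height, the use of the first frame as the reference, and the names `is_stable`, `max_shift`, and `max_size_change` are assumptions for illustration; the application does not fix how the difference is measured.

```python
def is_stable(regions, max_shift, max_size_change):
    """Return True if every region stays close to the first one.

    `regions` is a list of (cx, cy, w, h) tuples, one per frame: center
    position and size. Position difference is the Euclidean distance between
    centers; size difference is the absolute change in area.
    """
    cx0, cy0, w0, h0 = regions[0]
    for cx, cy, w, h in regions[1:]:
        shift = ((cx - cx0) ** 2 + (cy - cy0) ** 2) ** 0.5
        if shift > max_shift or abs(w * h - w0 * h0) > max_size_change:
            return False
    return True


steady = [(100, 80, 40, 30), (102, 81, 40, 30), (101, 79, 42, 30)]
jumpy = [(100, 80, 40, 30), (180, 90, 40, 30), (101, 79, 40, 30)]
steady_ok = is_stable(steady, max_shift=5, max_size_change=100)
jumpy_ok = is_stable(jumpy, max_shift=5, max_size_change=100)
```

The same check could be run once on the driver's gaze areas (first preset range) and once on the first gaze areas in the dash cam frames (second preset range), with separate thresholds.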
- the acquired N frames of images may refer to the acquired N frames of driver images, where the N frames of driver images and the M frames of driving recorder images are acquired in the same time period, that is, between the same start time and end time.
- the processor in the vehicle can put a timestamp on each image, and according to the timestamps it can be determined that the N frames of driver images and the M frames of dash cam images were acquired in the same time period.
- the N frames of driver images and the M frames of dash cam images may also be images acquired within an allowable time difference; that is, the time at which the N frames of driver images are acquired and the time at which the M frames of dash cam images are acquired are close to each other.
- the next few frames of driver images can be predicted based on the acquired N frames of driver images; or the next few frames of dash cam images can be predicted based on the M frames of dash cam images.
- the driver's gaze area in the acquired N frames of images may be obtained from N frames of driver images collected by a camera disposed in the cockpit of the vehicle; the driver's gaze area in the N frames can be determined based on these driver images; for example, the driver's gaze area on the front windshield of the vehicle can be determined from the driver's head position in the N frames of driver images.
- when the difference in the gaze area of the driver in the N frames of images meets the first preset range and the difference in the first gaze area in the M frames of dash cam images meets the second preset range, the overlap area between the driver's gaze area in the N frames of images and the first gaze area in the M frames of dash cam images is further determined;
- for example, the size of the overlap area is determined from the driver's gaze area in the i-th frame of the N frames of images and the first gaze area in the i-th frame of the M frames of dash cam images; the driver's target gaze area is then determined by comparing the overlap areas across the N frames of images.
- the driver's line of sight can be tracked first, that is, it is determined that the difference among the acquired N frames of driver images meets the first preset range: in the N frames of driver images the driver's gaze area is not directly ahead and changes little from frame to frame, so it can be determined that the driver is gazing at an object of interest in the scene outside the vehicle; it is also determined that the difference among the acquired M frames of dash cam images meets the second preset range,
- that is, the driving recorder keeps capturing the same object over multiple frames without losing it; at this point, the overlap area between the driver's gaze area in the N frames of images and the first gaze area in the M frames of driving recorder images is further determined,
- and the overlap area is used to determine the driver's target gaze area; requiring multiple frames of images to satisfy the first preset range and the second preset range ensures the robustness of the method for identifying objects of interest to the user provided in this application.
- the number of N frames of images (for example, N frames of driver images) and the number of M frames of dash cam images acquired between the same start time and end time may be the same or different.
- the number of N frames of images (that is, driver images) and M frames of dash cam images acquired at the same start time and end time are equal.
- for example, the gaze area of the driver in the N frames of images may refer to the driver's gaze area in the 3 frames of images #1 to #3; the M frames of driving recorder images may refer to the 3 frames of dash cam images #4 to #6 corresponding to those gaze areas; determining the overlap area based on the driver's gaze area in the N frames of images and the first gaze area in the M frames of dash cam images can mean determining the overlap area between the driver's gaze area in image #1 and the first gaze area in dash cam image #4, which is recorded as overlap area 1.
- the number of N frames of images (that is, driver images) and the number of M frames of dash cam images acquired between the same start and end moments are not equal; for example, if the acquisition frequency of the camera that captures the driver's image differs from that of the driving recorder, the number of image frames acquired in the same time period can differ.
- for example, given N frames of driver images and M frames of driving recorder images, where N < M: for each frame of the N frames of driver images, find among the M frames of driving recorder images the one whose timestamp is closest, that is, select N frames of driving recorder images from the M frames based on the N frames of driver images; in other words, some redundant frames of the M driving recorder images may be discarded.
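The nearest-timestamp pairing can be sketched with a binary search over sorted timestamps. The function name `match_nearest`, the frame rates in the example, and the tie-breaking rule (prefer the earlier frame) are assumptions for illustration.

```python
from bisect import bisect_left


def match_nearest(driver_ts, dashcam_ts):
    """For each driver timestamp, return the index of the nearest dash cam timestamp.

    Both lists must be sorted in ascending order. Ties go to the earlier frame.
    """
    matches = []
    for t in driver_ts:
        i = bisect_left(dashcam_ts, t)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(dashcam_ts)]
        matches.append(min(candidates, key=lambda j: abs(dashcam_ts[j] - t)))
    return matches


driver_ts = [0.00, 0.10, 0.20]                                  # N = 3 driver frames
dashcam_ts = [0.00, 0.033, 0.067, 0.100, 0.133, 0.167, 0.200]   # M = 7 dash cam frames
pairs = match_nearest(driver_ts, dashcam_ts)
```

In this example `pairs` selects dash cam frames 0, 3, and 6, and the four remaining dash cam frames are the redundant ones that are discarded.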
- the images located in the two imaging planes can be projected onto the same imaging plane; that is, the information of the user's gaze area in the vehicle can be mapped to the imaging plane of the dash cam image, or the information of the dash cam image can be mapped to the imaging plane of the user's gaze area in the vehicle.
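One common way to map points from one imaging plane to another is a 3x3 planar homography; the application does not state which mapping it uses, so the sketch below is only an assumed illustration. The matrix `H` is a made-up calibration result, and `apply_homography` is a hypothetical helper.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H (row-major nested lists)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    s = H[2][0] * x + H[2][1] * y + H[2][2]
    if s == 0:
        raise ValueError("point maps to infinity")
    return (u / s, v / s)


# Hypothetical calibration result: scale by 2 and shift by (100, 50).
H = [[2.0, 0.0, 100.0],
     [0.0, 2.0, 50.0],
     [0.0, 0.0, 1.0]]
corners = [(0, 0), (10, 0), (10, 5), (0, 5)]   # gaze area corners in plane 1
mapped = [apply_homography(H, p) for p in corners]
```

Mapping the four corners of a gaze area this way yields its outline in the other imaging plane, where it can be intersected with the first gaze area.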
- the specific process can be seen in Figure 9 below.
- the above identification method further includes: displaying information of the target gaze area on a display screen of the vehicle.
- the vehicle may include multiple display screens, and displaying information of the target gaze area on the display screen of the vehicle includes: determining the target display screen among the multiple display screens according to the location information of the user in the vehicle; and displaying the information of the target gaze area on the target display screen.
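Selecting the target display screen from the user's position can be as simple as a lookup. The seat names, display names, and default screen below are all hypothetical; a real vehicle would define its own mapping.

```python
# Hypothetical seat-to-display mapping; not taken from the application.
SEAT_TO_DISPLAY = {
    "driver": "hud",
    "front_passenger": "center_console",
    "rear_left": "rear_left_screen",
    "rear_right": "rear_right_screen",
}


def identity_for_seat(seat):
    """Classify the user as driver or passenger from the seat position."""
    return "driver" if seat == "driver" else "passenger"


def target_display(seat, default="center_console"):
    """Pick the display screen associated with the user's seat."""
    return SEAT_TO_DISPLAY.get(seat, default)


screen = target_display("rear_left")
role = identity_for_seat("rear_left")
```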
- the user's identity information is determined according to the user's location information in the vehicle, and the identity information may include driver or passenger; the target gaze area information can then be pushed to the user according to the user's identity information; for example, the information of the target gaze area can be displayed on the display screen at the corresponding position; or the information of the target gaze area can be broadcast; or the information of the target gaze area can be pushed to the user's mobile phone so that the user can continue to learn about the target gaze area later.
- the head-up display HUD system can display the information of the target gaze area in the vehicle; for example, the information of the target gaze area can be displayed on the front windshield.
- the information of the target gaze area may be displayed on the display screen corresponding to the position.
- the method further includes: displaying the information of the target gaze area in the vehicle through a head-up display HUD system.
- the above method for identifying objects of interest to users can be applied to smart terminal scenarios.
- the method provided in this application can be used to identify objects of interest to users, thereby providing users with more intelligent services and effectively enhancing the user experience.
- for the specific process, refer to Figure 12; details are not repeated here.
- a pre-judgment of the environment image can be introduced, that is, the area the user is actually interested in can be determined from the area of interest in the environment image and the user's gaze area, thereby improving the accuracy of identifying objects of interest to the user.
- Fig. 6 is a schematic flowchart of a method for recognizing an object of interest for a user provided in an embodiment of the present application.
- the method shown in FIG. 6 includes steps S510 to S560, and these steps will be described in detail below.
- the method shown in FIG. 6 may be executed by the traveling vehicle shown in FIG. 1 or the smart terminal shown in FIG. 2, where the traveling vehicle may be the vehicle shown in FIG. 3 or the vehicle shown in FIG. 4.
- FIG. 6 takes the driver as the user for illustration; in the embodiments of the present application, the user may also be another user located in the vehicle, such as a passenger, or a user with another identity, and this application does not impose any restriction on this.
- calculating the driver's gaze area is to determine the location of the area where the driver's gaze will pass through the front windshield.
- the image collected by the DMS/CMS can be input to a computing system for calculation, and by using a deep learning algorithm or a support vector machine algorithm, the driver's gaze area on the front windshield can be obtained.
- a support vector machine (SVM) is a supervised machine learning algorithm that can be used for classification or regression tasks.
- the front windshield can be divided into multiple square areas; taking the lower left corner as the starting point, the coordinates of the starting point are (0, 0), with up and right as the positive directions; through the deep learning algorithm, the square area in which the driver's gaze falls can be identified.
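The grid layout of the windshield can be illustrated with a small helper that converts a point on the windshield into the index of its square area. The cell size, the grid dimensions, and the name `grid_cell` are assumptions for illustration; in the application the cell itself is predicted by the learning algorithm rather than computed from a known point.

```python
def grid_cell(x, y, cell_size, cols, rows):
    """Return the (column, row) of the square cell containing point (x, y).

    The origin (0, 0) is the lower-left corner of the windshield; x grows to
    the right and y grows upward. Returns None if the point is outside the grid.
    """
    if x < 0 or y < 0:
        return None
    col, row = int(x // cell_size), int(y // cell_size)
    if col >= cols or row >= rows:
        return None
    return (col, row)


# Example: a 140 cm x 80 cm windshield split into 10 cm squares.
cell = grid_cell(35, 12, cell_size=10, cols=14, rows=8)
outside = grid_cell(150, 12, cell_size=10, cols=14, rows=8)
```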
- the 3D position of the driver's head can be calculated by the binocular method.
- FIG. 8 is a schematic flowchart of calculating the 3D position of the driver's head. As shown in FIG. 8, the method includes steps S610 to S650, and these steps are respectively described in detail below.
- the internal and external parameters of the DMS/CMS can be obtained before the vehicle leaves the factory: a camera calibration method yields the internal and external parameter matrices of the two cameras; based on these parameters, the relative position and orientation in 3D space between the two cameras, and between each camera and other objects, can be established.
- the key points P1 and P2 may respectively refer to the center position of the driver's head in the DMS or CMS view.
- detecting the head positions P1 and P2 in the DMS/CMS views may mean detecting the 2D coordinate positions p1 and p2 of the head in the two views through deep learning or other algorithms.
- the 3D straight lines O1P1 and O2P2 on which the head lies can be calculated: based on the cameras' internal/external parameters, the head's 2D horizontal and vertical coordinate positions p1, p2 are converted into 3D space coordinates P1, P2; O1 and O2 respectively represent the optical origins of the DMS/CMS cameras in 3D space, recorded in the external parameter matrices, so that O1P1 and O2P2 can be calculated.
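Converting a 2D detection into a 3D line through the optical origin can be sketched with the standard pinhole camera model. The intrinsics `fx`, `fy`, `cx`, `cy`, the camera-to-world rotation `R`, and the function name `pixel_to_ray` are assumptions for illustration; lens distortion is ignored.

```python
def pixel_to_ray(pixel, fx, fy, cx, cy, R, origin):
    """Return (origin, direction) of the 3D ray through an image pixel.

    fx, fy, cx, cy are pinhole intrinsics; R is a 3x3 camera-to-world rotation
    (nested lists); origin is the camera's optical center in world coordinates.
    The returned direction is a unit vector in world coordinates.
    """
    u, v = pixel
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)  # direction in the camera frame
    d_world = tuple(sum(R[i][j] * d_cam[j] for j in range(3)) for i in range(3))
    norm = sum(c * c for c in d_world) ** 0.5
    return origin, tuple(c / norm for c in d_world)


I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
origin, direction = pixel_to_ray((320, 240), 800.0, 800.0, 320.0, 240.0,
                                 I3, (0.0, 0.0, 0.0))
```

A detection at the principal point yields a ray along the optical axis; running the same function for each camera gives the two lines O1P1 and O2P2.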
- the position P of the driver's head can be obtained by solving for the intersection of the two 3D straight lines O1P1 and O2P2; if the two lines do not intersect, the point closest to both lines can be selected as point P.
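Solving for the intersection, with a fallback when the lines are skew, can be sketched as follows. The sketch takes the midpoint of the shortest segment joining the two lines as the "closest point", which is one common choice and an assumption here; `triangulate` and `dot` are hypothetical names.

```python
def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def triangulate(o1, d1, o2, d2, eps=1e-12):
    """Return the midpoint of the shortest segment joining two 3D lines.

    Each line is given by a point o and a direction d. If the lines intersect,
    the result is the intersection point. Raises ValueError for parallel lines.
    """
    w0 = tuple(a - b for a, b in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("lines are parallel")
    s = (b * e - c * d) / denom   # parameter of the closest point on line 1
    t = (a * e - b * d) / denom   # parameter of the closest point on line 2
    p1 = tuple(o + s * k for o, k in zip(o1, d1))
    p2 = tuple(o + t * k for o, k in zip(o2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


# Two rays from cameras at x = -1 and x = +1 that meet at (0, 0, 1).
head = triangulate((-1.0, 0.0, 0.0), (1.0, 0.0, 1.0),
                   (1.0, 0.0, 0.0), (-1.0, 0.0, 1.0))
# Two skew lines that are 2 units apart along z.
skew = triangulate((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                   (0.0, 0.0, 2.0), (0.0, 1.0, 0.0))
```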
- the 3D position of the driver's head can be obtained according to the internal and external parameters of the camera.
- calculating the DVR gaze area may refer to establishing an association relationship from the driver's gaze area on the windshield to the gaze area in the DVR; for example, the association relationship may be a table lookup relationship, through which the area in the DVR corresponding to the driver's gaze area is calculated.
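A table lookup relationship of this kind can be sketched as a dictionary from windshield grid cells to regions in the DVR image. The table contents below are invented for illustration; in practice they would come from a calibration of the specific vehicle, and `dvr_region` is a hypothetical name.

```python
# Hypothetical calibration table: windshield grid cell -> (x, y, w, h) in DVR pixels.
WINDSHIELD_TO_DVR = {
    (0, 0): (0, 540, 320, 180),
    (1, 0): (320, 540, 320, 180),
    (0, 1): (0, 360, 320, 180),
    (1, 1): (320, 360, 320, 180),
}


def dvr_region(cell):
    """Look up the DVR image region associated with a windshield grid cell.

    Returns None when the cell has no entry in the calibration table.
    """
    return WINDSHIELD_TO_DVR.get(cell)


region = dvr_region((1, 1))
missing = dvr_region((9, 9))
```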
- the eye point is the position of the rearview camera of the vehicle body
- the X straight line is the extended line of the center point of the driver's gaze area in space
- the DVR is the position of the dash cam in the space
- the shaded area in O1 indicates the driver's gaze area on the windshield
- the oblique area in O2 represents the corresponding driver's gaze area in the DVR view.
- the driver's object of interest is located on the extension of the driver's line of sight through the front windshield; therefore, once the driver's gaze area is determined in the DVR view, the approximate area of the driver's object of interest in the DVR view can be determined.
- the scene attention prediction relies on the fact that the human eye is more sensitive to, and more interested in, areas with rich color and shape changes; therefore, based on the physiological characteristics of the human body, it can be predicted which areas outside the vehicle the driver may become interested in.
- the interest value can be obtained through a deep learning method; or a Canny edge detection method is used to obtain the richness of morphological changes, a gradient calculation method is used to obtain the richness of color changes, and the two values are weighted to calculate the interest value of each position in the image; based on the interest value of each position in the image, it is pre-judged which area in the image the driver may be interested in.
- the shaded area in Figure 10 represents the driver's sensitive area, that is, the area that may attract the driver's attention.
- the above-mentioned image may be an image obtained from a DVR.
- the driver's real interest area in the scene outside the vehicle can be obtained.
- the driver's area of interest can be obtained, so that information about the area can be obtained and provided to the driver later; for example, the information of the area of interest can be broadcast to the driver, or pushed to the driver's mobile phone so that the driver can continue to learn about the region of interest.
- the scene attention area judgment can include step S710 to step S750, and these steps will be described in detail below.
- judging whether the driver's gaze area lasts for N frames refers to judging whether the driver's gaze area on the front windshield remains stationary or changes only within a small range over N frames of images; that is, determining whether the driver continues to pay attention to a certain area for N consecutive frames.
- the driver's gaze area remaining stationary or changing within a small range over the N frames of images may mean that the driver's gaze area is outside the straight-ahead direction and that the gaze area of the driver in the N frames of images remains stationary or changes within a small range; the straight-ahead (front) direction may refer to the direction in which the vehicle is traveling.
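For a live video stream, the "lasts for N frames" judgment can be sketched as a counter that resets whenever the gaze area moves too far between consecutive frames. The class name `PersistenceCounter`, the use of the region center, and the `max_shift` threshold are assumptions for illustration; the check that the gaze is away from the straight-ahead direction is omitted.

```python
class PersistenceCounter:
    """Count consecutive frames in which a region stays near the previous one."""

    def __init__(self, required_frames, max_shift):
        self.required = required_frames
        self.max_shift = max_shift
        self.prev = None
        self.count = 0

    @staticmethod
    def _dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

    def update(self, center):
        """Feed one frame's region center; return True once persistence is reached."""
        if self.prev is not None and self._dist(center, self.prev) <= self.max_shift:
            self.count += 1
        else:
            self.count = 1  # first frame, or the region jumped: restart the run
        self.prev = center
        return self.count >= self.required


gaze = PersistenceCounter(required_frames=3, max_shift=5)
centers = [(10, 10), (11, 10), (12, 11), (50, 50), (51, 50)]
results = [gaze.update(c) for c in centers]
```

The third frame completes a run of three nearby gaze positions, and the jump to (50, 50) restarts the count.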
- S740 Judge whether the pre-judgment of attention outside the vehicle continues for M frames; if yes, execute S750; if no, return.
- the pre-judgment of attention outside the vehicle over M consecutive frames consists of pre-judging attention in the images obtained by the driving recorder and calculating the overlapping area of the regions of interest across the multiple frames.
- if the region of interest is not continuous across the M frames, the corresponding scene may be a relatively small object at the side of the road during driving; the driver's gaze time is short and it is basically impossible to arouse the driver's interest, so there is no need to perform S750; if the region of interest in the M frames of images is continuous and not lost, S750 can be executed to predict whether this area is a real area of interest of the driver.
- the number of N frames of images (for example, N frames of driver images) and the number of M frames of dash cam images acquired between the same start time and end time may be the same or different.
- S750: based on the driver's gaze tracking and the pre-judgment of attention outside the vehicle, the scene attention area judgment process is started; that is, when the driver's gaze area and the predicted attention area outside the vehicle both remain constant, or change within a small range, over the N frames of images, the attention area of the scene outside the vehicle needs to be judged.
- for example, N frames of images (for example, 5 frames of images) are acquired: the driver's gaze area on the front windshield is determined in each of the 5 frames of images collected in the cab by the camera, and it is then determined whether the driver's gaze area remains the same or changes within a small range across the 5 frames; likewise, the areas that the driver may be interested in are determined in each of the 5 frames of dash cam images of the scene outside the vehicle, and it is then judged whether those areas remain unchanged or keep a relatively high overlap across the 5 frames.
- the above example illustrates the case where the number of cab images collected in a given time period equals the number of dash cam images; the number of cab images and the number of driving recorder images collected in the same time period may also be different, which is not limited in this application.
- the DVR gaze area calculation can be performed through S530; that is, the driver's gaze area is mapped from the windshield onto the dash cam image, and the intersection of the driver's gaze area in the dash cam image and the pre-judged area of interest in the dash cam image is taken. When the intersection remains unchanged or moves only within a small range over N consecutive frames, the intersection area is considered to be the area that the driver is really looking at.
- the driver's gaze area is obtained from the driver images collected by the in-car camera and is combined with the scene area outside the car, thereby increasing the accuracy and robustness of the algorithm for recognizing objects of interest to the user and providing users with a better interactive experience.
- FIG. 11 is a schematic flowchart of scene attention area judgment in the driving field.
- Figure 12 is a schematic flow chart of the judgment of the scene attention area in the field of smart terminals.
- the method for identifying objects of interest provided in this application can be used to determine which objects in the home the user is interested in and to provide corresponding information interaction.
- the method for identifying objects of interest is similar to the process shown in Figures 6 to 10 above.
- the differences are that the driver's gaze area is obtained in the driving field whereas the user's gaze area is obtained in the field of smart terminals, and that the judgment process of the scene attention area differs.
- the process of judging the scene attention area in the field of smart terminals will be described in detail below with reference to FIG. 12.
- the scene attention area judgment shown in FIG. 12 may include steps S810 to S830, and these steps will be described in detail below.
- the user's image can be acquired through the camera of the smart terminal; for example, the user's image can be acquired through the camera in the smart screen.
- S820 Determine whether the gaze area of the user lasts for N frames; if yes, execute S830; if not, return.
- the object that the user really looks at can be determined.
- the specific process of scene attention prediction can be referred to the specific description of FIG. 10, which will not be repeated here.
- Fig. 13 is a schematic block diagram of an apparatus for recognizing an object of interest for a user according to an embodiment of the present application.
- identification device 900 shown in FIG. 13 is only an example, and the device in the embodiment of the present application may further include other modules or units. It should be understood that the identification device 900 can execute each step in the identification method of FIG. 5 to FIG.
- the identification device 900 may include an acquiring module 910 and a processing module 920.
- the acquiring module 910 is used to acquire the information of the gaze area of the user and the environment image corresponding to the user; the processing module 920 is used to obtain, according to the environment image, the information of the first gaze area of the user in the environment image, wherein the first gaze area is used to indicate a sensitive area determined by the physical characteristics of the human body;
- the processing module 920 is further used to obtain the target gaze area of the user according to the information of the gaze area and the information of the first gaze area, wherein the target gaze area is used to indicate the area in the environment image where the target object gazed at by the user is located.
- the processing module 920 is specifically configured to:
- the target gaze area is determined according to the overlap area of the gaze area and the first gaze area.
- the user is a user in a vehicle
- the acquiring module 910 is specifically configured to:
- the processing module 920 is specifically configured to:
- the target gaze area of the user in the vehicle is determined according to the information of the gaze area of the user in the vehicle and the information of the first gaze area in the image of the dash cam.
- the obtaining module 910 is specifically configured to:
- N and M are both positive integers
- the processing module 920 is specifically configured to:
- the target gaze area of the user in the vehicle is determined according to the overlap area.
- the difference in the gaze area refers to a difference in the position of the gaze area; the difference in the first gaze area refers to a difference in the position of the first gaze area .
- processing module 920 is further configured to:
- the image of the driving recorder is mapped to the imaging plane where the gaze area of the user in the vehicle is located.
- processing module 920 is further configured to:
- the information of the target gaze area is displayed on the display screen of the vehicle.
- the vehicle includes multiple display screens, and the processing module 920 is specifically configured to:
- the target display screen of the plurality of display screens is determined according to the position information of the user in the vehicle in the vehicle; and the information of the target gaze area is displayed on the target display screen.
- processing module 920 is further configured to:
- the head-up display HUD system displays the information of the target gaze area in the vehicle.
- the user in the vehicle is a driver of the vehicle or a passenger in the vehicle.
- the aforementioned identification device 900 refers to a car, where the acquisition module refers to an interface circuit in the car; and the processing module refers to a processor in the car.
- the interface circuit in the car can obtain the information of the user's gaze area collected by a camera (for example, a DMS camera or a CMS camera) disposed in the car through the communication network, and obtain the environment image of the car driving by the dash cam.
- the DMS or CMS can be integrated in the car;
- the identification method in the above embodiments of the application can be implemented by a software algorithm, and the processor can obtain the user's gaze area information and environmental images from the interface circuit, and use the integrated logic circuit or Instructions in the form of software implement the identification method in this application.
- the aforementioned identification device 900 refers to an on-board device in an automobile, where the acquisition module refers to an interface circuit in the on-vehicle device; and the processing module refers to a processor in the on-board device.
- the interface circuit in the in-vehicle device can obtain the user's gaze area information collected by a camera (for example, a DMS camera or a CMS camera) configured in the car through the communication network, and obtain the environment image of the car driving by the dash cam. ;
- the processor obtains the information of the user's gaze area and the environment image from the interface circuit, and executes the identification method in this application through an integrated logic circuit or instructions in the form of software.
- the above-mentioned communication network may refer to 3G cellular communication; for example, CDMA, EVDO, GSM/GPRS; or 4G cellular communication, such as LTE; or, 5G cellular communication;
- the communication network can use wireless Internet access (WiFi) to communicate with wireless local area networks
- the communication network can use an infrared link, Bluetooth, or ZigBee to communicate directly with the device
- other wireless protocols may also be used, such as various vehicle communication systems; for example, the communication network can include one or more dedicated short-range communication devices, which may support public and/or private data communication between vehicles and/or roadside stations.
- identification device 900 here is embodied in the form of a functional unit.
- module herein can be implemented in the form of software and/or hardware, which is not specifically limited.
- a “module” can be a software program, a hardware circuit, or a combination of the two that realizes the above-mentioned functions.
- the hardware circuit may include an application specific integrated circuit (ASIC), an electronic circuit, a processor for executing one or more software or firmware programs (such as a shared processor, a dedicated processor, or a group processor) and memory, a merged logic circuit, and/or other suitable components that support the described functions.
- the units of the examples described in the embodiments of the present application can be implemented by electronic hardware or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraint conditions of the technical solution. Professionals and technicians can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.
- Fig. 14 is a schematic block diagram of an apparatus for recognizing an object of interest for a user according to an embodiment of the present application.
- the identification device 1000 shown in FIG. 14 includes a memory 1001, a processor 1002, a communication interface 1003, and a bus 1004. Among them, the memory 1001, the processor 1002, and the communication interface 1003 implement communication connections between each other through the bus 1004.
- the memory 1001 may be a read only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM).
- the memory 1001 may store a program.
- the processor 1002 is configured to execute each step of the method for identifying an object of interest to a user in the embodiments of the present application, for example, the steps of the embodiments shown in FIG. 5 to FIG. 12.
- the processor 1002 may adopt a general-purpose central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), or one or more integrated circuits for executing related programs.
- the method for identifying objects of interest to users in the method embodiments of this application is implemented.
- the processor 1002 may also be an integrated circuit chip with signal processing capability.
- each step of the method for identifying an object of interest to a user in the embodiment of the present application can be completed by an integrated logic circuit of hardware in the processor 1002 or an instruction in the form of software.
- the aforementioned processor 1002 may also be a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
- the storage medium is located in the memory 1001; the processor 1002 reads the information in the memory 1001 and, in combination with its hardware, completes the functions required of the modules included in the identification device shown in FIG. 13, or performs the method for identifying an object of interest to a user in the method embodiments of the present application, for example, each step/function of the embodiments shown in FIG. 5 to FIG. 12.
- the communication interface 1003 may use a transceiver apparatus, such as but not limited to a transceiver, to implement communication between the identification device 1000 and other devices or communication networks.
- the bus 1004 may include a path for transferring information between various components of the identification device 1000 (for example, the memory 1001, the processor 1002, and the communication interface 1003).
- although the identification device 1000 described above only shows a memory, a processor, and a communication interface, those skilled in the art should understand that, in a specific implementation, the identification device 1000 may also include other devices necessary for normal operation. Depending on specific needs, the identification device 1000 may also include hardware devices that implement other additional functions. In addition, the identification device 1000 may include only the components necessary to implement the embodiments of the present application, and not necessarily all the components shown in FIG. 14.
- the identification device shown in the embodiment of the present application may be an in-vehicle device in a vehicle, or may also be a chip disposed in an in-vehicle device.
- An embodiment of the present application also provides a vehicle, which includes a device for identifying objects of interest to a user in the foregoing embodiments of the present application.
- An embodiment of the present application also provides a vehicle system, which includes a camera, a driving recorder, and an identification device disposed inside the vehicle.
- the embodiment of the present application also provides a chip, which includes a transceiver unit and a processing unit.
- the transceiver unit may be an input/output circuit and a communication interface;
- the processing unit may be a processor, a microprocessor, or an integrated circuit integrated on the chip;
- the chip may execute the method for identifying objects of interest to users in the foregoing method embodiments.
- An embodiment of the present application also provides a computer-readable storage medium on which an instruction is stored, and when the instruction is executed, the method for identifying an object of interest to a user in the foregoing method embodiment is executed.
- the embodiments of the present application also provide a computer program product containing instructions that, when executed, execute the method for identifying objects of interest to users in the foregoing method embodiments.
- the processor in the embodiments of the present application may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
- the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
- the memory in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
- the non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
- the volatile memory may be random access memory (RAM), which is used as an external cache. Many forms of RAM may be used, for example, static random access memory (SRAM), dynamic random access memory (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
- the above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware or any other combination.
- the above-mentioned embodiments may be implemented in the form of a computer program product in whole or in part.
- the computer program product includes one or more computer instructions or computer programs.
- the processes or functions described in the embodiments of the present application are generated in whole or in part.
- the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
- the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
- the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless (such as infrared, wireless, microwave, etc.) means.
- the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center that includes one or more sets of available media.
- the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium.
- the semiconductor medium may be a solid state drive.
- “at least one” refers to one or more, and “multiple” refers to two or more.
- “at least one of the following items” or similar expressions refers to any combination of these items, including a single item or any combination of multiple items.
- for example, at least one of a, b, or c can mean: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c can each be single or multiple.
- the size of the sequence numbers of the above-mentioned processes does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
- the disclosed system, device, and method can be implemented in other ways.
- the device embodiments described above are merely illustrative.
- the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- if the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of the present application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions used to make a computer device (which may be a personal computer, a server, a network device, or the like) execute all or part of the steps of the methods described in the various embodiments of the present application.
- the aforementioned storage media include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.
Landscapes
- Engineering & Computer Science (AREA)
- Mechanical Engineering (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Transportation (AREA)
- General Physics & Mathematics (AREA)
- Chemical & Material Sciences (AREA)
- Combustion & Propulsion (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- Automation & Control Theory (AREA)
- Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Finance (AREA)
- Optics & Photonics (AREA)
- Ophthalmology & Optometry (AREA)
- Game Theory and Decision Science (AREA)
- Economics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Entrepreneurship & Innovation (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Traffic Control Systems (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
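A method and an apparatus for identifying an object of interest to a user, relating to the field of smart vehicles. The identification method includes: acquiring information about the user's line-of-sight gaze area and an environment image corresponding to the user (S410); obtaining, according to the environment image, information about a first gaze area of the user in the environment image, where the first gaze area indicates a sensitive area determined by physical characteristics of the human body (S420); and obtaining a target gaze area of the user according to the information about the line-of-sight gaze area and the information about the first gaze area, where the target gaze area indicates the area in the environment image in which the target object gazed at by the user is located (S430). The identification method can improve the accuracy of identifying objects of interest to the user.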
Description
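This application relates to the field of smart vehicles, and more specifically, to a method and an apparatus for identifying an object of interest to a user.
With the continuous development of artificial intelligence technology, users have increasingly high expectations for their experience and hope to obtain more intelligent human-computer interaction.
While driving, a driver may become interested in an object in the area outside the vehicle. However, to ensure driving safety, the driver cannot gaze at the object of interest for long and may be unable to learn detailed information about it; as a result, the driver's objects of interest cannot be effectively recorded, which degrades the driving experience.
Therefore, how to accurately identify an object of interest to a user, and thereby improve the user experience, has become an urgent problem to be solved.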
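Summary
This application provides a method and an apparatus for identifying an object of interest to a user, which can improve the accuracy of identifying objects of interest to the user.
According to a first aspect, a method for identifying an object of interest to a user is provided, including: acquiring information about a line-of-sight gaze area of a user and an environment image corresponding to the user; obtaining, according to the environment image, information about a first gaze area of the user in the environment image, where the first gaze area indicates a sensitive gaze area determined by physical characteristics of the human body; and obtaining a target gaze area of the user according to the information about the line-of-sight gaze area and the information about the first gaze area, where the target gaze area indicates the area in the environment image in which the target object gazed at by the user is located.
The environment image corresponding to the user may be an image of the environment in which the user is located.
In a possible implementation, in the smart vehicle field, the user may be the driver of a vehicle or a passenger in the vehicle, and the environment image corresponding to the user may be an image of the environment in which the vehicle is located, or an image of the environment outside the vehicle.
In another possible implementation, in the smart terminal field, the user may be a user of smart home devices in a household, and the environment image corresponding to the user may be an image of the household in which that user is located.
It should be noted that the information about the user's line-of-sight gaze area may include the position of the gaze area, the direction of the gaze area, the size of the gaze area, and the like.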
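It should be understood that the first gaze area in the environment image is a sensitive area of the user determined by biological characteristics of the human body, but it is not necessarily the user's gaze area. The sensitive area may be an area that easily attracts the user's attention, determined according to physical characteristics of the human body, for example, the sensitivity of the human eye to different colors and shape changes.
For example, the optic nerve of the human eye has different sensitivity to light of different wavelengths. The human eye is most sensitive to electromagnetic waves with a wavelength of about 555 nm, which lie in the green region of the optical spectrum, so the human eye is relatively sensitive to green light. Therefore, for a given image, the user's sensitive area may be the green area in the image.
It should also be understood that the first gaze area in the environment image may be the area of the environment image to which the user is sensitive, pre-judged in advance based on physical characteristics of the human body, whereas the target gaze area is the area in the environment image in which the target object gazed at by the user is located. The first gaze area in the environment image may be the same for different users, but the target gaze area may differ: each user may, according to their own interests, gaze at the area in which the target object that interests them is located.
In a possible implementation, the first gaze area in the environment image may be represented by interest values of the regions in the image. The interest values may be obtained by a deep learning method; alternatively, an edge detection method may be used to obtain the richness of shape variation, a gradient calculation method may be used to obtain the richness of color variation, and the two values may be weighted to compute an interest value for each position in the image. Based on the interest values at the positions in the image, it is pre-judged which area of the environment image the user is likely to be interested in.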
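The weighted interest value described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not specified by the application: a central-difference gradient stands in for both the edge detector and the color-gradient step, the inputs are a luminance channel and a single color channel given as nested lists, and the names `gradient_magnitude` and `interest_map` and the default weights of 0.5 are hypothetical.

```python
from typing import List

Image = List[List[float]]  # one channel, row-major


def gradient_magnitude(channel: Image) -> Image:
    """Central-difference gradient magnitude; border pixels stay at 0."""
    height, width = len(channel), len(channel[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (channel[y][x + 1] - channel[y][x - 1]) / 2.0
            gy = (channel[y + 1][x] - channel[y - 1][x]) / 2.0
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def interest_map(luma: Image, chroma: Image,
                 w_shape: float = 0.5, w_color: float = 0.5) -> Image:
    """Weighted sum of shape richness and color richness per pixel."""
    # Assumption: gradient of luminance approximates "edge detection".
    shape_richness = gradient_magnitude(luma)
    # Assumption: gradient of one color channel approximates color change.
    color_richness = gradient_magnitude(chroma)
    return [
        [w_shape * s + w_color * c for s, c in zip(s_row, c_row)]
        for s_row, c_row in zip(shape_richness, color_richness)
    ]
```

Positions with the highest values in the returned map would be candidates for the first gaze area; how the map is thresholded or grouped into regions is left open here, as it is in the text.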
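In the embodiments of this application, a pre-judgment of the environment image is introduced when identifying the object of interest to the user; that is, the target gaze area of the user is determined according to the area of the environment image to which the user is likely to be sensitive and the user's line-of-sight gaze area, thereby improving the accuracy of identifying the object of interest to the user.
With reference to the first aspect, in some implementations of the first aspect, the obtaining a target gaze area of the user according to the information about the line-of-sight gaze area and the information about the first gaze area includes:
determining the target gaze area according to an overlapping area of the line-of-sight gaze area and the first gaze area.
In the embodiments of this application, a pre-judgment of the environment image is introduced when identifying the object of interest to the user; that is, the target gaze area of the user, namely the area in which the target object of interest to the user is located, can be determined according to the overlapping area of the area of the environment image to which the user is sensitive and the user's line-of-sight gaze area, thereby improving the accuracy of identifying the object of interest to the user.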
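As a minimal sketch of the overlap step, assume both areas are axis-aligned rectangles expressed in the same image coordinates. The application does not prescribe a region shape, so the `Rect` type and the `overlap` function below are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned rectangle in image coordinates (assumed region shape)."""
    left: float
    top: float
    right: float
    bottom: float


def overlap(gaze: Rect, first_gaze: Rect) -> Optional[Rect]:
    """Return the overlapping rectangle, or None if the areas do not overlap."""
    left = max(gaze.left, first_gaze.left)
    top = max(gaze.top, first_gaze.top)
    right = min(gaze.right, first_gaze.right)
    bottom = min(gaze.bottom, first_gaze.bottom)
    if left >= right or top >= bottom:
        return None  # empty intersection: no target gaze area from this pair
    return Rect(left, top, right, bottom)
```

A `None` result corresponds to the case where the sensitive area and the line-of-sight gaze area do not coincide, so no target gaze area is derived from that pair.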
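With reference to the first aspect, in some implementations of the first aspect, the user is a user in a vehicle, and the acquiring information about a line-of-sight gaze area of a user and an environment image corresponding to the user includes: acquiring information about the line-of-sight gaze area of the user in the vehicle and an image from the vehicle's driving recorder, where the line-of-sight gaze area of the user in the vehicle indicates the area outside the vehicle at which the user in the vehicle is gazing; and the obtaining a target gaze area of the user according to the information about the line-of-sight gaze area and the information about the first gaze area includes: determining the target gaze area of the user in the vehicle according to the information about the line-of-sight gaze area of the user in the vehicle and the information about the first gaze area in the image from the driving recorder.
A user in the vehicle is a user located inside the vehicle; for example, the user may be the driver of the vehicle or a passenger of the vehicle.
In a possible implementation, the user in the vehicle is the driver, and the driver's line-of-sight gaze area indicates the driver's gaze area in the direction of the front windshield of the vehicle.
In a possible implementation, the user in the vehicle is a passenger, for example, a passenger in the front passenger seat, and the user's line-of-sight gaze area indicates the passenger's gaze area in the direction of the front windshield, or the passenger's gaze area in the direction of a vehicle window.
In a possible implementation, the user in the vehicle is a rear-seat passenger, and the user's line-of-sight gaze area indicates the passenger's gaze area in the direction of a vehicle window.
In a possible implementation, the user in the vehicle may be the driver or a passenger, and the line-of-sight gaze area may be obtained by capturing an image of the user with a camera in the vehicle cabin and then determining the gaze area according to the state of the user's face and eyes in that image, that is, determining the user's gaze area in the direction of the windshield. The in-cabin camera may be a camera of a driver monitoring system or a cockpit monitoring system.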
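With reference to the first aspect, in some implementations of the first aspect, the acquiring information about the line-of-sight gaze area of the user in the vehicle and an image from the vehicle's driving recorder includes: acquiring information about the line-of-sight gaze area of the user in the vehicle in N frames of images, and M frames of images from the driving recorder, where the N frames of images and the M frames of driving recorder images are acquired between the same start time and end time; and the determining the target gaze area of the user in the vehicle includes: determining that the difference in the line-of-sight gaze area of the user in the vehicle across the N frames of images falls within a first preset range; determining that the difference in the first gaze area across the M frames of driving recorder images falls within a second preset range; determining an overlapping area according to the line-of-sight gaze area of the user in the vehicle in the N frames of images and the first gaze area in the M frames of driving recorder images; and determining the target gaze area of the user in the vehicle according to the overlapping area.
With reference to the first aspect, in some implementations of the first aspect, the difference in the line-of-sight gaze area is a difference in the position of the line-of-sight gaze area, and the difference in the first gaze area is a difference in the position of the first gaze area.
In a possible implementation, the N frames of images of the user in the vehicle and the M frames of driving recorder images may also be acquired within an allowed time difference; that is, the times at which the N frames of images of the user are acquired and the times at which the M frames of driving recorder images are acquired may be similar or close.
In a possible implementation, if there is a certain allowed time difference between acquiring the N frames of images of the user in the vehicle and the M frames of driving recorder images, the line-of-sight gaze area of the user in the vehicle in the next few frames may be predicted from the acquired N frames of images of the user; or the first gaze area in the next few frames may be predicted from the M frames of driving recorder images.
In a possible implementation, the line-of-sight gaze area of the user in the vehicle in the N frames of images may be obtained from N frames of images of the user captured by a camera disposed in the vehicle cockpit. The user's line-of-sight gaze area in the N frames can be determined from these images; for example, the user's gaze area in the direction of the windshield can be determined from the position of the user's head in the N frames of images.
It should also be understood that the number of N frames of images (for example, N frames of images of the user in the vehicle) and the number of M frames of driving recorder images acquired between the same start time and end time may be the same or different.
In a possible implementation, the driver's line of sight may first be tracked, that is, it is determined that the difference across the acquired N frames of images (N frames of images of the driver) falls within the first preset range. In other words, if the driver's line-of-sight gaze area in the N frames is not straight ahead and varies little across the N frames, it can be determined that the driver is gazing at an object of interest in the scene outside the vehicle. It is also determined that the difference across the acquired M frames of driving recorder images falls within the second preset range, in which case it can be determined that the driving recorder has captured the same object over multiple consecutive frames without losing it. The overlapping area of the driver's line-of-sight gaze area in the N frames of images and the M frames of driving recorder images is then determined, and from it the driver's target gaze area. Requiring multiple frames to satisfy the first preset range and the second preset range helps ensure the robustness of the identification method provided in this application.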
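The two preset-range checks can be sketched as follows. This is an illustration under stated assumptions, not the application's algorithm: each area is reduced to its center point, the "position difference" is taken as the Euclidean distance from the first frame's center, and the function names `is_stable` and `should_judge_attention` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # center of an area in pixel coordinates


def is_stable(centers: List[Point], preset_range: float) -> bool:
    """True if every center lies within preset_range of the first center."""
    if not centers:
        return False
    x0, y0 = centers[0]
    return all(
        ((x - x0) ** 2 + (y - y0) ** 2) ** 0.5 <= preset_range
        for x, y in centers
    )


def should_judge_attention(gaze_centers: List[Point],
                           first_gaze_centers: List[Point],
                           first_range: float,
                           second_range: float) -> bool:
    """Both the N gaze areas and the M first gaze areas must be stable."""
    return (is_stable(gaze_centers, first_range)
            and is_stable(first_gaze_centers, second_range))
```

The two lists may have different lengths, reflecting that N and M need not be equal. Only when both checks pass would the overlapping area be computed.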
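With reference to the first aspect, in some implementations of the first aspect, the method further includes: mapping the line-of-sight gaze area of the user in the vehicle to the imaging plane of the driving recorder image; or mapping the driving recorder image to the imaging plane of the line-of-sight gaze area of the user in the vehicle.
In the embodiments of this application, to make it easier to determine the overlapping area, that is, the overlapping part, of the line-of-sight gaze area of the user in the vehicle and the first gaze area in the environment image, images located on two imaging planes may be projected onto the same imaging plane; that is, the line-of-sight gaze area of the user in the vehicle may be mapped to the imaging plane of the driving recorder image, or the driving recorder image may be mapped to the imaging plane of the line-of-sight gaze area of the user in the vehicle.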
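The text does not say how the mapping between the two imaging planes is computed. One common way to relate points on two planes is a 3x3 planar homography, and the sketch below assumes such a matrix is already known from calibration. The matrix values, `apply_homography`, and `map_region` are hypothetical and serve only to show the projection arithmetic.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Matrix3 = List[List[float]]  # 3x3 homography, row-major (assumed to be known)


def apply_homography(h: Matrix3, point: Point) -> Point:
    """Project one point from the source plane onto the target plane."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


def map_region(h: Matrix3, corners: List[Point]) -> List[Point]:
    """Map the corner points of a gaze area onto the other imaging plane."""
    return [apply_homography(h, corner) for corner in corners]
```

Once the gaze area's corners are expressed in the driving recorder image's coordinates, the overlap with the first gaze area can be computed in a single coordinate system.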
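With reference to the first aspect, in some implementations of the first aspect, the method further includes: displaying information about the target gaze area on a display screen of the vehicle.
With reference to the first aspect, in some implementations of the first aspect, the vehicle includes a plurality of display screens, and the displaying information about the target gaze area on a display screen of the vehicle includes:
determining a target display screen among the plurality of display screens according to position information of the user within the vehicle; and displaying the information about the target gaze area on the target display screen.
In a possible implementation, identity information of the user is determined according to the position information of the user in the vehicle, where the identity information may include driver or passenger. Information about the target gaze area can then be pushed to the user according to the identity information; for example, it may be displayed on the display screen at the corresponding position, announced by voice, or pushed to the user's mobile phone so that the user can learn more about the target gaze area later.
For example, if the user is detected to be the driver, the information about the target gaze area may be displayed in the vehicle through a head-up display (HUD) system; for instance, it may be displayed on the front windshield.
For example, if the user is detected to be a passenger in the front passenger seat or in a rear seat, the information about the target gaze area may be displayed on the display screen corresponding to that position.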
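The display selection described in the examples above can be sketched as a simple lookup from seat position to display. The seat names and display identifiers below are hypothetical; the application only states that the driver may be served by the HUD and passengers by the screen at their position.

```python
from enum import Enum


class Seat(Enum):
    DRIVER = "driver"
    FRONT_PASSENGER = "front_passenger"
    REAR_LEFT = "rear_left"
    REAR_RIGHT = "rear_right"


# Hypothetical display identifiers, one per seat position.
DISPLAY_FOR_SEAT = {
    Seat.DRIVER: "hud",
    Seat.FRONT_PASSENGER: "front_passenger_screen",
    Seat.REAR_LEFT: "rear_left_screen",
    Seat.REAR_RIGHT: "rear_right_screen",
}


def select_display(seat: Seat) -> str:
    """Pick the target display for the user's position in the vehicle."""
    return DISPLAY_FOR_SEAT[seat]
```

A table-driven mapping keeps the seat-to-screen policy in one place, so adding a voice announcement or a phone push for a given seat would only require extending the table's values.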
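With reference to the first aspect, in some implementations of the first aspect, the information about the target gaze area is displayed in the vehicle through a head-up display (HUD) system.
In a possible implementation, the foregoing method for identifying an object of interest to a user may be applied in a smart terminal scenario, where the method provided in this application can be used to identify objects of interest to the user, thereby providing the user with more intelligent services and effectively improving the user experience.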
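According to a second aspect, an apparatus for identifying an object of interest to a user is provided, including: an acquiring module, configured to acquire information about a line-of-sight gaze area of a user and an environment image corresponding to the user; and a processing module, configured to obtain, according to the environment image, information about a first gaze area of the user in the environment image, where the first gaze area indicates a sensitive area determined by physical characteristics of the human body, and to obtain a target gaze area of the user according to the information about the line-of-sight gaze area and the information about the first gaze area, where the target gaze area indicates the area in the environment image in which the target object gazed at by the user is located.
The environment image corresponding to the user may be an image of the environment in which the user is located.
In a possible implementation, in the smart vehicle field, the user may be the driver of a vehicle or a passenger in the vehicle, and the environment image corresponding to the user may be an image of the environment in which the vehicle is located, or an image of the environment outside the vehicle.
It should be noted that the information about the user's line-of-sight gaze area may include the position of the gaze area, the direction of the gaze area, the size of the gaze area, and the like.
In another possible implementation, in the smart terminal field, the user may be a user of smart home devices in a household, and the environment image corresponding to the user may be an image of the household in which that user is located.
It should be understood that the first gaze area in the environment image is a sensitive area of the user determined by biological characteristics of the human body, but it is not necessarily the user's gaze area. The sensitive area may be an area that easily attracts the user's attention, determined according to physical characteristics of the human body, for example, the sensitivity of the human eye to different colors and shape changes.
For example, the optic nerve of the human eye has different sensitivity to light of different wavelengths. The human eye is most sensitive to electromagnetic waves with a wavelength of about 555 nm, which lie in the green region of the optical spectrum, so the human eye is relatively sensitive to green light. Therefore, for a given image, the user's sensitive area may be the green area in the image.
It should also be understood that the first gaze area in the environment image may be the area of the environment image to which the user is sensitive, pre-judged in advance based on physical characteristics of the human body, whereas the target gaze area is the user's gaze area in the environment image, that is, the area in which the target object gazed at by the user is located. The first gaze area in the environment image may be the same for different users, but the target gaze area may differ: each user may, based on their own interests, gaze at the area in which the object that interests them is located.
In a possible implementation, the first gaze area in the environment image may be represented by interest values of the regions in the image. The interest values may be obtained by a deep learning method; alternatively, an edge detection method may be used to obtain the richness of shape variation, a gradient calculation method may be used to obtain the richness of color variation, and the two values may be weighted to compute an interest value for each position in the image. Based on the interest values at the positions in the image, it is pre-judged which area of the environment image the user is likely to be interested in.
In the embodiments of this application, a pre-judgment of the environment image is introduced when identifying the object of interest to the user; that is, the target gaze area of the user is determined according to the area of the environment image to which the user is sensitive and the user's line-of-sight gaze area, thereby improving the accuracy of identifying the object of interest to the user.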
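With reference to the second aspect, in some implementations of the second aspect, the processing module is specifically configured to:
determine the target gaze area according to an overlapping area of the line-of-sight gaze area and the first gaze area.
In the embodiments of this application, a pre-judgment of the environment image is introduced when identifying the object of interest to the user; that is, the target gaze area of the user, namely the area in which the target object of interest to the user is located, can be determined according to the overlapping area of the area of the environment image to which the user is sensitive and the user's line-of-sight gaze area, thereby improving the accuracy of identifying the object of interest to the user.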
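With reference to the second aspect, in some implementations of the second aspect, the user is a user in a vehicle, and the acquiring module is specifically configured to: acquire information about the line-of-sight gaze area of the user in the vehicle and an image from the vehicle's driving recorder, where the line-of-sight gaze area of the user in the vehicle indicates the area outside the vehicle at which the user in the vehicle is gazing;
the processing module is specifically configured to: determine the driver's target gaze area according to the information about the driver's line-of-sight gaze area and the information about the first gaze area in the image from the driving recorder.
A user in the vehicle is a user located inside the vehicle; for example, the user may be the driver of the vehicle or a passenger of the vehicle.
In a possible implementation, the user in the vehicle is the driver, and the driver's line-of-sight gaze area indicates the driver's gaze area in the direction of the front windshield of the vehicle.
In a possible implementation, the user in the vehicle is a passenger, for example, a passenger in the front passenger seat, and the user's line-of-sight gaze area indicates the passenger's gaze area in the direction of the front windshield, or the passenger's gaze area in the direction of a vehicle window.
In a possible implementation, the user in the vehicle is a rear-seat passenger, and the user's line-of-sight gaze area indicates the passenger's gaze area in the direction of a vehicle window.
In a possible implementation, the user in the vehicle may be the driver or a passenger, and the line-of-sight gaze area may be obtained by capturing an image of the user inside the vehicle with a camera in the vehicle cabin and then determining the gaze area according to the state of the user's face and eyes in that image, that is, determining the user's gaze area on the windshield. The in-cabin camera may be a camera of a driver monitoring system or a cockpit monitoring system.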
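With reference to the second aspect, in some implementations of the second aspect, the acquiring module is specifically configured to: acquire information about the line-of-sight gaze area of the user in the vehicle in N frames of images, and M frames of images from the driving recorder;
the processing module is specifically configured to:
determine that the difference in the line-of-sight gaze area of the user in the vehicle across the N frames of images falls within a first preset range; determine that the difference in the first gaze area across the M frames of driving recorder images falls within a second preset range; determine an overlapping area according to the line-of-sight gaze area of the user in the vehicle in the N frames of images and the first gaze area in the M frames of driving recorder images; and determine the target gaze area of the user in the vehicle according to the overlapping area.
With reference to the second aspect, in some implementations of the second aspect, the difference in the line-of-sight gaze area is a difference in the position of the line-of-sight gaze area, and the difference in the first gaze area is a difference in the position of the first gaze area.
In a possible implementation, the acquired N frames of images may be N frames of images of the user in the vehicle, where the N frames of images of the user and the M frames of driving recorder images are acquired within the same time period, that is, between the same start time and end time.
In a possible implementation, the N frames of images of the user in the vehicle and the M frames of driving recorder images may also be acquired within an allowed time difference; that is, the times at which the N frames of images of the user are acquired and the times at which the M frames of driving recorder images are acquired may be similar or close.
In a possible implementation, if there is a certain allowed time difference between acquiring the N frames of images of the user in the vehicle and the M frames of driving recorder images, the line-of-sight gaze area of the user in the vehicle in the next few frames may be predicted from the acquired N frames of images of the user; or the first gaze area in the next few frames may be predicted from the M frames of driving recorder images.
It should also be understood that the number of N frames of images (for example, N frames of images of the user in the vehicle) and the number of M frames of driving recorder images acquired between the same start time and end time may be the same or different.
In the embodiments of this application, the line of sight of the user in the vehicle (for example, the driver) may first be tracked, that is, it is determined that the difference across the acquired N frames of images (N frames of images of the driver) falls within the first preset range. In other words, if the driver's line-of-sight gaze area in the N frames is not straight ahead and varies little across the N frames, it can be determined that the driver is gazing at an object of interest in the scene outside the vehicle. It is also determined that the difference across the acquired M frames of driving recorder images falls within the second preset range, in which case it can be determined that the driving recorder has captured the same object over multiple consecutive frames without losing it. The overlapping area of the driver's line-of-sight gaze area in the N frames of images and the M frames of driving recorder images is then determined, and from it the driver's target gaze area. Requiring multiple frames to satisfy the first preset range and the second preset range helps ensure the robustness of the identification method provided in this application.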
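With reference to the second aspect, in some implementations of the second aspect, the processing module is further configured to:
map the information about the line-of-sight gaze area of the user in the vehicle to the imaging plane of the driving recorder image; or map the information of the driving recorder image to the imaging plane of the line-of-sight gaze area of the user in the vehicle.
In the embodiments of this application, to make it easier to determine the overlapping area, that is, the overlapping part, of the driver's line-of-sight gaze area and the first gaze area in the environment image, images located on two imaging planes may be projected onto the same imaging plane; that is, the driver's line-of-sight gaze area may be mapped to the imaging plane of the driving recorder image, or the driving recorder image may be mapped to the imaging plane of the driver's line-of-sight gaze area.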
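With reference to the second aspect, in some implementations of the second aspect, the processing module is further configured to:
display information about the target gaze area on a display screen of the vehicle.
With reference to the second aspect, in some implementations of the second aspect, the vehicle includes a plurality of display screens, and the processing module is specifically configured to:
determine a target display screen among the plurality of display screens according to position information of the user within the vehicle; and display the information about the target gaze area on the target display screen.
In a possible implementation, identity information of the user is determined according to the position information of the user in the vehicle, where the identity information may include driver or passenger. Information about the target gaze area can then be pushed to the user according to the identity information; for example, it may be displayed on the display screen at the corresponding position, announced by voice, or pushed to the user's mobile phone so that the user can learn more about the target gaze area later.
For example, if the user is detected to be the driver, the information about the target gaze area may be displayed in the vehicle through a head-up display (HUD) system; for instance, it may be displayed on the front windshield.
For example, if the user is detected to be a passenger in the front passenger seat or in a rear seat, the information about the target gaze area may be displayed on the display screen corresponding to that position.
With reference to the second aspect, in some implementations of the second aspect, the information about the target gaze area is displayed in the vehicle through a head-up display (HUD) system.
With reference to the second aspect, in some implementations of the second aspect, the identification apparatus is an in-vehicle device in the vehicle.
In a possible implementation, the foregoing apparatus for identifying an object of interest to a user may be applied in a smart terminal scenario, where the identification apparatus provided in this application can be used to identify objects of interest to the user, thereby providing the user with more intelligent services and effectively improving the user experience.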
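According to a third aspect, an apparatus for identifying an object of interest to a user is provided, including: a memory, configured to store a program; and a processor, configured to execute the program stored in the memory, where when the program stored in the memory is executed, the processor is configured to perform the following process: acquiring information about a line-of-sight gaze area of a user and an environment image corresponding to the user; obtaining, according to the environment image, information about a first gaze area of the user in the environment image, where the first gaze area indicates a sensitive area determined by physical characteristics of the human body; and obtaining a target gaze area of the user according to the information about the line-of-sight gaze area and the information about the first gaze area, where the target gaze area indicates the area in the environment image in which the target object gazed at by the user is located.
In a possible implementation, the processor included in the foregoing identification apparatus is further configured to perform the method for identifying an object of interest to a user in the first aspect or any implementation of the first aspect.
It should be understood that the extensions, limitations, explanations, and descriptions of related content in the first aspect also apply to the same content in the third aspect.
According to a fourth aspect, an automobile is provided, where the vehicle includes the identification apparatus in the second aspect or any implementation of the second aspect.
According to a fifth aspect, a vehicle system is provided, including a camera disposed inside a vehicle, a driving recorder, and the identification apparatus in the second aspect or any implementation of the second aspect.
The camera disposed in the vehicle may be a driver monitoring system (DMS) camera or a cockpit monitoring system (CMS) camera. The camera may be located near the A-pillar of the vehicle, at the steering wheel or instrument panel, near the rear-view mirror, or at a similar position.
According to a sixth aspect, a computer-readable storage medium is provided, where the computer-readable storage medium is configured to store program code, and when the program code is executed by a computer, the computer is configured to perform the method for identifying an object of interest to a user in the first aspect or any implementation of the first aspect.
According to a seventh aspect, a chip is provided, where the chip includes a processor, and the processor is configured to perform the method for identifying an object of interest to a user in the first aspect or any implementation of the first aspect.
In a possible implementation, the chip of the seventh aspect may be located in an in-vehicle terminal of a vehicle.
According to an eighth aspect, a computer program product is provided, including computer program code, where when the computer program code runs on a computer, the computer is caused to perform the method for identifying an object of interest to a user in the first aspect or any implementation of the first aspect.
It should be noted that the computer program code may be stored in whole or in part on a first storage medium, where the first storage medium may be packaged together with the processor or packaged separately from the processor; this is not specifically limited in the embodiments of this application.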
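FIG. 1 is a schematic diagram of an application scenario according to an embodiment of this application;
FIG. 2 is a functional block diagram of a vehicle 100 according to an embodiment of this application;
FIG. 3 is a schematic diagram of the hardware architecture of a vehicle according to an embodiment of this application;
FIG. 4 is a schematic diagram of the software architecture of a vehicle according to an embodiment of this application;
FIG. 5 is a schematic flowchart of a method for identifying an object of interest to a user according to an embodiment of this application;
FIG. 6 is a schematic flowchart of a method for identifying an object of interest to a user according to an embodiment of this application;
FIG. 7 is a schematic diagram of a driver's line-of-sight gaze area according to an embodiment of this application;
FIG. 8 is a schematic flowchart of calculating the driver's head position according to an embodiment of this application;
FIG. 9 is a schematic diagram of calculating the DVR line-of-sight gaze area according to an embodiment of this application;
FIG. 10 is a schematic diagram of scene attention pre-judgment according to an embodiment of this application;
FIG. 11 is a schematic flowchart of scene attention area judgment in the driving field according to an embodiment of this application;
FIG. 12 is a schematic flowchart of scene attention area judgment in the smart terminal field according to an embodiment of this application;
FIG. 13 is a schematic diagram of an apparatus for identifying an object of interest to a user according to an embodiment of this application;
FIG. 14 is a schematic diagram of an apparatus for identifying an object of interest to a user according to an embodiment of this application.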
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
首先,对本申请实施例的应用场景进行举例说明。
示例性地,图1是本申请实施例提供的用户感兴趣对象的识别方法的应用场景的示意图。
如图1所示,本申请实施例提供的用户感兴趣对象的识别方法可以应用在智能车领域。
例如,在驾驶过程中,车辆中用户(例如,驾驶员或者车辆中的乘客)可能会对路边的物体产生兴趣,比如,路边的广告牌;但是,出于驾驶安全性的考虑,驾驶员无法长期将视线停留在该感兴趣的物体上,导致无法即时了解该感兴趣物品的详细信息;或者,由于驾驶中的车辆的行驶速度较快,车辆中的乘客无法及时了解车外感兴趣的目标对象,从而车辆中用户的兴趣点也无法得到有效记录。本申请实施例中提供的用于识别感兴趣对象的方法,可以通过使用车辆中用户的视线与车外交互准确地识别车辆中用户在驾驶时的感兴趣的目标对象,并据此对车辆中用户进行相应的信息推送,能够有效地提升用户的体验感。
图2是本申请实施例提供的车辆100的功能框图。
The vehicle 100 may be a manually driven vehicle, or the vehicle 100 may be configured in a fully or partially autonomous driving mode.
在一个示例中,车辆100可以在处于自动驾驶模式中的同时控制自车,并且可通过人为操作来确定车辆及其周边环境的当前状态,确定周边环境中的至少一个其他车辆的可能行为,并确定其他车辆执行可能行为的可能性相对应的置信水平,基于所确定的信息来控制车辆100。在车辆100处于自动驾驶模式中时,可以将车辆100置为在没有和人交互的情况下操作。
车辆100中可以包括各种子系统,例如,行进系统110、传感系统120、控制系统130、一个或多个外围设备140以及电源160、计算机系统150和用户接口170。
可选地,车辆100可以包括更多或更少的子系统,并且每个子系统可包括多个元件。另外,车辆100的每个子系统和元件可以通过有线或者无线互连。
示例性地,行进系统110可以包括用于向车辆100提供动力运动的组件。在一个实施例中,行进系统110可以包括引擎111、传动装置112、能量源113和车轮114/轮胎。其中,引擎111可以是内燃引擎、电动机、空气压缩引擎或其他类型的引擎组合;例如,汽油发动机和电动机组成的混动引擎,内燃引擎和空气压缩引擎组成的混动引擎。引擎111可以将能量源113转换成机械能量。
示例性地,能量源113可以包括汽油、柴油、其他基于石油的燃料、丙烷、其他基于压缩气体的燃料、乙醇、太阳能电池板、电池和其他电力来源。能量源113也可以为车辆100的其他系统提供能量。
示例性地,传动装置112可以包括变速箱、差速器和驱动轴;其中,传动装置112可以将来自引擎111的机械动力传送到车轮114。
在一个实施例中,传动装置112还可以包括其他器件,比如离合器。其中,驱动轴可以包括可耦合到一个或多个车轮114的一个或多个轴。
示例性地,传感系统120可以包括感测关于车辆100周边的环境的信息的若干个传感器。
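For example, the sensing system 120 may include a positioning system 121 (for example, a GPS system, a BeiDou system, or another positioning system), an inertial measurement unit (IMU) 122, a radar 123, a laser rangefinder 124, and a camera 125. The sensing system 120 may also include sensors that monitor the internal systems of the vehicle 100 (for example, an in-vehicle air quality monitor, a fuel gauge, or an oil temperature gauge). Sensor data from one or more of these sensors can be used to detect objects and their characteristics (position, shape, direction, speed, and so on). Such detection and identification is a key function for the safe operation of the autonomous vehicle 100.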
其中,定位系统121可以用于估计车辆100的地理位置。IMU122可以用于基于惯性加速度来感测车辆100的位置和朝向变化。在一个实施例中,IMU 122可以是加速度计和陀螺仪的组合。
示例性地,雷达123可以利用无线电信号来感测车辆100的周边环境内的物体。在一些实施例中,除了感测物体以外,雷达123还可用于感测物体的速度和/或前进方向。
示例性地,激光测距仪124可以利用激光来感测车辆100所位于的环境中的物体。在一些实施例中,激光测距仪124可以包括一个或多个激光源、激光扫描器以及一个或多个检测器,以及其他系统组件。
示例性地,相机125可以用于捕捉车辆100的周边环境的多个图像。例如,相机125可以是静态相机或视频相机。
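As shown in FIG. 2, the control system 130 controls the operation of the vehicle 100 and its components. The control system 130 may include various elements, such as a steering system 131, a throttle 132, a braking unit 133, a computer vision system 134, a route control system 135, and an obstacle avoidance system 136.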
示例性地,转向系统131可以操作来调整车辆100的前进方向。例如,在一个实施例中可以为方向盘系统。油门132可以用于控制引擎111的操作速度并进而控制车辆100的速度。
示例性地,制动单元133可以用于控制车辆100减速;制动单元133可以使用摩擦力来减慢车轮114。在其他实施例中,制动单元133可以将车轮114的动能转换为电流。制动单元133也可以采取其他形式来减慢车轮114转速从而控制车辆100的速度。
如图2所示,计算机视觉系统134可以操作来处理和分析由相机125捕捉的图像以便识别车辆100周边环境中的物体和/或特征。上述物体和/或特征可以包括交通信号、道路边界和障碍物。计算机视觉系统134可以使用物体识别算法、运动中恢复结构(structure from motion,SFM)算法、视频跟踪和其他计算机视觉技术。在一些实施例中,计算机视觉系统134可以用于为环境绘制地图、跟踪物体、估计物体的速度等等。
示例性地,路线控制系统135可以用于确定车辆100的行驶路线。在一些实施例中,路线控制系统135可结合来自传感器、GPS和一个或多个预定地图的数据以为车辆100确定行驶路线。
如图2所示,障碍规避系统136可以用于识别、评估和避免或者以其他方式越过车辆100的环境中的潜在障碍物。
在一个实例中,控制系统130可以增加或替换地包括除了所示出和描述的那些以外的组件。或者也可以减少一部分上述示出的组件。
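As shown in FIG. 2, the vehicle 100 may interact with external sensors, other vehicles, other computer systems, or users through the peripheral devices 140, which may include a wireless communication system 141, an on-board computer 142, a microphone 143, and/or a speaker 144.
In some embodiments, the peripheral devices 140 may provide a means for the vehicle 100 to interact with the user interface 170. For example, the on-board computer 142 may provide information to a user of the vehicle 100. The user interface 170 may also operate the on-board computer 142 to receive user input; the on-board computer 142 may be operated through a touchscreen. In other cases, the peripheral devices 140 may provide a means for the vehicle 100 to communicate with other devices located in the vehicle. For example, the microphone 143 may receive audio (such as voice commands or other audio input) from a user of the vehicle 100. Similarly, the speaker 144 may output audio to a user of the vehicle 100.
As shown in FIG. 2, the wireless communication system 141 may communicate wirelessly with one or more devices, either directly or via a communication network. For example, the wireless communication system 141 may use 3G cellular communication, such as code division multiple access (CDMA), EVDO, or global system for mobile communications (GSM)/general packet radio service (GPRS); 4G cellular communication, such as long term evolution (LTE); or 5G cellular communication. The wireless communication system 141 may communicate with a wireless local area network (WLAN) using WiFi.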
在一些实施例中,无线通信系统141可以利用红外链路、蓝牙或者紫蜂协议(ZigBee)与设备直接通信;其他无线协议,例如各种车辆通信系统,例如,无线通信系统141可以包括一个或多个专用短程通信(dedicated short range communications,DSRC)设备,这些设备可包括车辆和/或路边台站之间的公共和/或私有数据通信。
如图2所示,电源160可以向车辆100的各种组件提供电力。在一个实施例中,电源160可以为可再充电锂离子或铅酸电池。这种电池的一个或多个电池组可被配置为电源为车辆100的各种组件提供电力。在一些实施例中,电源160和能量源113可一起实现,例如一些全电动车中那样。
示例性地,车辆100的部分或所有功能可以受计算机系统150控制,其中,计算机系统150可以包括至少一个处理器151,处理器151执行存储在例如存储器152中的非暂态计算机可读介质中的指令153。计算机系统150还可以是采用分布式方式控制车辆100的个体组件或子系统的多个计算设备。
例如,处理器151可以是任何常规的处理器,诸如商业可获得的CPU。
可选地,该处理器可以是诸如ASIC或其它基于硬件的处理器的专用设备。尽管图2功能性地图示了处理器、存储器、和在相同块中的计算机的其它元件,但是本领域的普通技术人员应该理解该处理器、计算机、或存储器实际上可以包括可以或者可以不存储在相同的物理外壳内的多个处理器、计算机或存储器。例如,存储器可以是硬盘驱动器或位于不同于计算机的外壳内的其它存储介质。因此,对处理器或计算机的引用将被理解为包括对可以或者可以不并行操作的处理器或计算机或存储器的集合的引用。不同于使用单一的处理器来执行此处所描述的步骤,诸如转向组件和减速组件的一些组件每个都可以具有其自己的处理器,所述处理器只执行与特定于组件的功能相关的计算。
在此处所描述的各个方面中,处理器可以位于远离该车辆并且与该车辆进行无线通信。在其它方面中,此处所描述的过程中的一些在布置于车辆内的处理器上执行而其它则由远程处理器执行,包括采取执行单一操纵的必要步骤。
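In some embodiments, the memory 152 may contain instructions 153 (for example, program logic) that can be executed by the processor 151 to perform various functions of the vehicle 100, including those described above. The memory 152 may also contain additional instructions, including instructions to send data to, receive data from, interact with, and/or control one or more of the travel system 110, the sensing system 120, the control system 130, and the peripheral devices 140.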
示例性地,除了指令153以外,存储器152还可存储数据,例如,道路地图、路线信息,车辆的位置、方向、速度以及其它这样的车辆数据,以及其他信息。这种信息可在车辆100在自主、半自主和/或手动模式中操作期间被车辆100和计算机系统150使用。
如图2所示,用户接口170可以用于向车辆100的用户提供信息或从其接收信息。可选地,用户接口170可以包括在外围设备140的集合内的一个或多个输入/输出设备,例如,无线通信系统141、车载电脑142、麦克风143和扬声器144。
在本申请的实施例中,计算机系统150可以基于从各种子系统(例如,行进系统110、传感系统120和控制系统130)以及从用户接口170接收的输入来控制车辆100的功能。例如,计算机系统150可以利用来自控制系统130的输入以便控制制动单元133来避免由传感系统120和障碍规避系统136检测到的障碍物。在一些实施例中,计算机系统150可操作来对车辆100及其子系统的许多方面提供控制。
可选地,上述这些组件中的一个或多个可与车辆100分开安装或关联。例如,存储器152可以部分或完全地与车辆100分开存在。上述组件可以按有线和/或无线方式来通信地耦合在一起。
可选地,上述组件只是一个示例,实际应用中,上述各个模块中的组件有可能根据实际需要增添或者删除,图2不应理解为对本申请实施例的限制。
可选地,车辆100可以是在道路行进的自动驾驶汽车,可以识别其周围环境内的物体以确定对当前速度的调整。物体可以是其它车辆、交通控制设备、或者其它类型的物体。在一些示例中,可以独立地考虑每个识别的物体,并且基于物体的各自的特性,诸如它的当前速度、加速度、与车辆的间距等,可以用来确定自动驾驶汽车所要调整的速度。
可选地,车辆100或者与车辆100相关联的计算设备(如图2的计算机系统150、计算机视觉系统134、存储器152)可以基于所识别的物体的特性和周围环境的状态(例如,交通、雨、道路上的冰等等)来预测所述识别的物体的行为。
可选地,每一个所识别的物体都依赖于彼此的行为,因此,还可以将所识别的所有物体全部一起考虑来预测单个识别的物体的行为。车辆100能够基于预测的所述识别的物体的行为来调整它的速度。换句话说,自动驾驶汽车能够基于所预测的物体的行为来确定车辆将需要调整到(例如,加速、减速、或者停止)稳定状态。在这个过程中,也可以考虑其它因素来确定车辆100的速度,诸如,车辆100在行驶的道路中的横向位置、道路的曲率、静态和动态物体的接近度等等。
除了提供调整自动驾驶汽车的速度的指令之外,计算设备还可以提供修改车辆100的转向角的指令,以使得自动驾驶汽车遵循给定的轨迹和/或维持与自动驾驶汽车附近的物体(例如,道路上的相邻车道中的轿车)的安全横向和纵向距离。
上述车辆100可以为轿车、卡车、摩托车、公共汽车、船、飞机、直升飞机、割草机、娱乐车、游乐场车辆、施工设备、电车、高尔夫球车、火车、和手推车等,本申请实施例不做特别的限定。
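For example, the method for identifying an object of interest of a user provided in the embodiments of this application may also be applied in other fields, such as smart homes. The identification method of this application can improve the accuracy with which the user's target gaze area is identified, which helps the smart home provide more intelligent services to the user.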
应理解,上述为对应用场景的举例说明,并不对本申请的应用场景作任何限定。
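At present, methods for detecting an object of interest usually require information about the objects in the scene to be obtained in advance. The object in the scene that the user is gazing at is then determined by tracking the user's line of sight, and from that the object the user is interested in. Because such methods depend on information about the objects in the scene, they fail in the many scenarios (for example, driving) where that information cannot be obtained in advance: the user's object of interest cannot be identified, and the user experience suffers.
In view of this, this application provides a method and an apparatus for identifying an object of interest of a user. In the embodiments of this application, a pre-judgment based on the environment image is introduced when identifying the object of interest: the user's target gaze area, that is, the area where the target object of interest is located, is determined from the sensitive areas of the environment image together with the user's line-of-sight gaze area, which improves the accuracy of identifying the object of interest.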
下面结合图3至图12对本申请提供的用户感兴趣对象的识别方法进行详细说明。
图3是本申请实施例提供的硬件架构的示意图。
如图3所示,车辆200中可以包括车内摄像头210、行车记录仪220以及图像分析系统230。
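For example, the vehicle 200 may be a manually driven vehicle, or the vehicle 200 may be configured in a partially autonomous driving mode.
The in-vehicle camera 210 may be used to detect the state of a user in the vehicle (for example, the driver or a passenger), such as driver fatigue monitoring, driver expression recognition, and driver eye positioning, or eye positioning and expression recognition for a passenger.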
例如,配置于车辆200的驾驶员监控系统(driver monitoring system,DMS)摄像头,或者,座舱监控系统(cockpit monitoring system,CMS)摄像头。其中,上述摄像头的位置可以置于车辆200的A柱(A-pillar)附近,或方向盘、仪表盘位置,或后视镜附近等位置。
其中,行车记录仪220可以用于记录车辆行驶过程中的视频图像以及声音信息。
例如,配置于车辆200车体前部的行车记录仪摄像头。
其中,图像分析系统230可以用于处理和分析由车内摄像头210或者行车记录仪220捕捉的图像以便识别车辆200中的用户对车外物体的视线追踪。
应理解,图3所示的硬件架构为举例说明,车辆200中还可以包括实现正常运行所必须的其他器件。
图4是本申请提供的软件架构的示意图。
如图4所示,车辆300中可以包括用户视野检测模块310、车外场景检测模块320以及用户注视区域检测模块330。
其中,用户视野检测模块310用于检测车辆中用户的头部视线注视区域,比如,驾驶员的头部视线注视区域即驾驶员的视线经过前挡风玻璃的区域位置;或者,车辆中的乘客的头部视线注视区域。
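The exterior scene detection module 320 is used to detect objects in the scene outside the vehicle and, based on the physiological characteristics of the human body, to predict which objects outside the vehicle, or which areas of those objects, the user will be sensitive to.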
其中,用户注视区域检测模块330用于根据用户视野检测模块310以及车外场景检测模块320的结果确定用户的目标注视区域,即用户对车外感兴趣的目标对象所在的区域。
应理解,图4所示的软件架构为举例说明,车辆300中还可以包括实现正常运行所必须的软件模块。
下面结合图5对本申请提供的用户感兴趣对象的识别方法进行详细的说明。
应理解,图5所示的识别方法可以由图1所示的行驶车辆,或者图2所示的智能终端来执行;其中,行驶车辆可以是图3所示的车辆,或者,图4所示的车辆。
图5所示的识别方法400包括步骤S410至步骤S430,下面分别对这些步骤进行详细描述。
S410、获取用户的视线注视区域的信息以及用户对应的环境图像。
示例性地,对于智能车辆领域,可以通过DMS摄像头或者CMS摄像头获取车辆中用户的视线注视区域的信息;比如,通过DMS摄像头可以采集驾驶员的面部图像;根据驾驶员的面部图像可以确定驾驶员的头部视线注视区域。进一步,可以通过车辆的行车记录仪获取环境图像;比如,通过行车记录仪可以获取车辆驾驶过程中的车外的环境图像;或者,也可以从云端获取车辆驾驶过程中车外的环境图像。
例如,当车辆中的DMS摄像头或者CMS摄像头获取到用户的视线注视区域的信息,以及行车记录仪获取到环境图像后,可以通过通信系统发送至车辆的计算机系统(例如,车载设备)。
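The communication system may use 3G cellular communication, such as CDMA, EVDO, or GSM/GPRS; 4G cellular communication, such as LTE; or 5G cellular communication. The communication system may communicate with a wireless local area network using WiFi; or it may communicate directly with a device over an infrared link, Bluetooth, or ZigBee. The communication system may also use other wireless protocols, such as various vehicle communication systems; for example, it may include one or more dedicated short range communications devices, which may carry public and/or private data communication between vehicles and/or roadside stations.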
需要说明的是,用户对应的环境图像可以是指用户所处的环境的图像。用户的视线注视区域的信息可以包括用户视线注视区域的位置信息、用户视线注视区域的方向、以及用户视线注视区域的范围大小等。
例如,对于驾驶领域而言,用户可以是指车辆的驾驶员;或者,也可以是指车辆中的乘客;用户对应的环境图像可以是指车辆所在环境的图像,或者,车辆外的环境图像。其中,车辆可以是指人工驾驶车辆,或者,完全或部分地配置为自动驾驶模式的车辆。
例如,对于智能家居领域而言,用户可以是指家庭中的智能家居的使用者,则用户对应的环境图像可以是指智能家居的使用者所在的家庭中的图像。
应理解,上述为对用户对应的环境图像的举例说明,本申请对此并不作任何限定。
S420、根据环境图像得到用户在所述环境图像中的第一注视区域的信息。
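The first gaze area is a sensitive area determined from physical characteristics of the human body.
It should be understood that the first gaze area in the environment image is a sensitive area for the user determined from physical characteristics of the human body, but it is not necessarily where the user is actually gazing. A sensitive area is an area likely to attract the user's attention, determined from physical characteristics of the human body such as how sensitive the human eye is to different colors and to changes in shape.
For example, the optic nerve of the human eye responds with different sensitivity to light of different wavelengths. The human eye is most sensitive to electromagnetic waves with a wavelength of about 555 nm, which lie in the green region of the visible spectrum, so the eye is relatively sensitive to green light. For a given image, the user's sensitive area may therefore be the green area of the image.
For example, the first gaze area in the environment image may be represented by the interest values of the areas of the image. The interest values may be obtained by a deep learning method. Alternatively, an edge detection method may be used to obtain the richness of shape variation and a gradient calculation method to obtain the richness of color variation, and the two values are weighted to compute the interest value at each position in the image. Based on these interest values, it is judged in advance which area of the image the driver is more sensitive to and which is likely to draw the user's gaze; see FIG. 10 described later.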
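As an example, the Canny edge detection algorithm can be divided into the following five steps:
Step 1: Smooth the environment image with a Gaussian filter to remove noise from the image.
Step 2: Determine the intensity gradients in the environment image.
Step 3: Apply non-maximum suppression to eliminate false edge detections (for example, pixels that are not edges but were detected as edges).
Step 4: Apply a double threshold to determine the possible (potential) edges in the environment image.
Step 5: Apply hysteresis to track the edges in the environment image.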
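The five steps above can be sketched in plain Python as follows. This is a minimal illustration on a small grayscale image stored as a list of lists, not the implementation used in this application: the 3x3 Gaussian and Sobel kernels, the function names `convolve3` and `canny`, and the threshold values in the usage example are all assumptions chosen for readability.

```python
import math

GAUSS = [[1 / 16, 2 / 16, 1 / 16], [2 / 16, 4 / 16, 2 / 16], [1 / 16, 2 / 16, 1 / 16]]
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def convolve3(img, kernel):
    """Apply a 3x3 kernel; coordinates outside the image are clamped to the border."""
    h, w = len(img), len(img[0])

    def px(y, x):
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [[sum(kernel[j][i] * px(y + j - 1, x + i - 1)
                 for j in range(3) for i in range(3))
             for x in range(w)] for y in range(h)]


def canny(img, low, high):
    """Return the set of (row, col) edge pixels of a grayscale image (list of lists)."""
    h, w = len(img), len(img[0])
    smooth = convolve3(img, GAUSS)            # step 1: Gaussian smoothing
    gx = convolve3(smooth, SOBEL_X)           # step 2: intensity gradients
    gy = convolve3(smooth, SOBEL_Y)
    mag = [[math.hypot(gx[y][x], gy[y][x]) for x in range(w)] for y in range(h)]

    thin = [[0.0] * w for _ in range(h)]      # step 3: non-maximum suppression
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            angle = math.degrees(math.atan2(gy[y][x], gx[y][x])) % 180
            if angle < 22.5 or angle >= 157.5:
                dy, dx = 0, 1                 # gradient roughly horizontal
            elif angle < 67.5:
                dy, dx = 1, 1
            elif angle < 112.5:
                dy, dx = 1, 0                 # gradient roughly vertical
            else:
                dy, dx = 1, -1
            if mag[y][x] >= mag[y + dy][x + dx] and mag[y][x] >= mag[y - dy][x - dx]:
                thin[y][x] = mag[y][x]

    # step 4: double threshold; pixels >= high are strong, pixels >= low are candidates
    edges = {(y, x) for y in range(h) for x in range(w) if thin[y][x] >= high}
    stack = list(edges)
    while stack:                              # step 5: hysteresis tracking from strong pixels
        y, x = stack.pop()
        for ny in range(max(y - 1, 0), min(y + 2, h)):
            for nx in range(max(x - 1, 0), min(x + 2, w)):
                if (ny, nx) not in edges and thin[ny][nx] >= low:
                    edges.add((ny, nx))
                    stack.append((ny, nx))
    return edges


# Usage: a vertical step from dark (0) to bright (100) between columns 3 and 4.
step_image = [[0] * 4 + [100] * 4 for _ in range(8)]
edge_pixels = canny(step_image, low=100, high=200)
```

The density of the returned edge pixels in a region could then serve as the "richness of shape variation" input to the interest value; a production system would more likely call an optimized library routine than this loop-based sketch.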
S430、根据用户的视线注视区域的信息与第一注视区域的信息得到用户的目标注视区域。
其中,目标注视区域用于表示在环境图像中用户注视的目标对象所在的区域。
示例性地,用户注视的目标对象可以是指用户感兴趣的目标对象;其中,用户感兴趣的对象可以通过收集用户的历史行为数据,从而得到用户对不同对象的兴趣值确定的;或者,通过收集用户对不同对象的标签确定的。
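It should be understood that the first gaze area is a sensitive area of the environment image that is predicted in advance from physical characteristics of the human body, whereas the target gaze area is the area where the target object that the user is interested in is located. The first gaze area may be the same for different users, but the target gaze area may differ, because each user gazes at the area containing the object that interests him or her.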
在本申请的实施例中,在识别用户感兴趣的对象时可以通过引入环境图像的预判,即根据用户可能会对环境图像中的敏感区域与用户的视线注视区域确定用户的目标注视区域,从而提高了识别用户感兴趣对象的准确性。
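Further, when the identification method is applied in the smart vehicle scenario, it also includes: displaying the information about the target gaze area on a display of the vehicle. In one example, the user may be the driver of the vehicle or a passenger in the vehicle. The vehicle can detect the identity of the user, for example, by determining from the user's position in the vehicle whether the user is the driver or a passenger, and then push the information about the target gaze area to the user according to that identity. For example, the information may be shown on the display at the corresponding position, announced by voice, or pushed to the user's mobile phone so that the user can look into the target gaze area later.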
例如,若检测到用户为驾驶员,则可以通过抬头显示(head up display,HUD)在车辆中显示目标注视区域的信息;比如,可以在前挡风玻璃上显示目标注视区域的信息。
例如,若检测到用户为副驾驶位置或者车辆后排位置的乘客,则可以在该位置对应的显示屏中显示目标注视区域的信息。
可选地,在一种可能的实现方式中,根据用户的视线注视区域的信息与第一注视区域的信息得到用户的目标注视区域,可以包括:根据用户的视线注视区域与第一注视区域的重叠区域确定用户的目标注视区域。
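As a minimal sketch of the overlap computation, assume both the line-of-sight gaze area and the first gaze area have already been expressed as axis-aligned rectangles in the same image plane. The `Rect` type and the `overlap` function are hypothetical names used only for illustration; this application does not prescribe a particular region representation.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned rectangle in image coordinates (assumed representation)."""
    left: float
    top: float
    right: float
    bottom: float


def overlap(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlapping rectangle of a and b, or None if they do not overlap."""
    left, top = max(a.left, b.left), max(a.top, b.top)
    right, bottom = min(a.right, b.right), min(a.bottom, b.bottom)
    if left >= right or top >= bottom:
        return None  # disjoint, or touching only along an edge
    return Rect(left, top, right, bottom)


# Usage: the gaze area and a sensitive area share the square from (5, 5) to (10, 10).
gaze_area = Rect(0, 0, 10, 10)
first_gaze_area = Rect(5, 5, 20, 20)
target_gaze_area = overlap(gaze_area, first_gaze_area)
```

Returning `None` for disjoint inputs lets the caller distinguish "no target gaze area in this frame" from a very small overlap.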
在一个示例中,上述用户感兴趣对象的识别方法可以应用于驾驶场景中,此时用户可以是指位于汽车内部的用户(例如,车辆中用户);获取用户的视线注视区域的信息以及用户对应的环境图像,可以包括:获取车辆中用户的视线注视区域的信息以及车辆的行车记录仪的图像,其中,车辆中用户的视线注视区域用于表示车辆中用户在车辆外部的注视区域;根据用户的视线注视区域的信息与第一注视区域的信息得到用户的目标注视区域,包括:根据车辆中用户的视线注视区域的信息与行车记录仪的图像中第一注视区域的信息确定车辆中用户的目标注视区域。
在一个示例中,车辆中用户为车辆的驾驶员,则驾驶员的视线注视区域用于表示驾驶员在车辆的前挡风玻璃方向的注视区域。
在另一个示例中,车辆中用户为车辆中的乘客,比如,位于副驾驶位置的乘客,则用户的视线注视区域用于表示乘客在车辆的前挡风玻璃方向的注视区域,或者车辆在车窗方向的注视区域。
例如,在驾驶场景中,用户的目标注视区域可以是指车辆外的场景中的驾驶员感兴趣的对象所在的区域;比如,可以是道路边的广告牌上。
需要说明的是,在驾驶场景中的用于识别用户感兴趣对象的方法的具体流程可以参见后续图6至图11所示。
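Optionally, in a possible implementation, obtaining the information about the line-of-sight gaze area of the user in the vehicle and the image from the driving recorder of the vehicle may include: obtaining the information about the line-of-sight gaze area of the user in the vehicle in N frames of images, and M frames of driving recorder images. Determining the target gaze area of the user in the vehicle from the information about the line-of-sight gaze area and the information about the first gaze area in the driving recorder image may include: determining that the difference between the line-of-sight gaze areas in the N frames falls within a first preset range; determining that the difference between the first gaze areas in the M frames of driving recorder images falls within a second preset range; determining an overlapping area from the line-of-sight gaze areas in the N frames and the first gaze areas in the M frames of driving recorder images; and determining the target gaze area of the user in the vehicle from the overlapping area.
For example, the difference between the driver's line-of-sight gaze areas in the N frames may be a difference in the position of the gaze area across the frames, or a difference in its size. From this difference it can be determined whether the driver keeps gazing at the same target object for a preset time.
Similarly, the difference between the first gaze areas in the M frames of driving recorder images may be a difference in the position of the first gaze area across those frames, or a difference in its size. From this difference it can be determined whether the user in the vehicle is able to keep seeing the target object in the environment image for a preset time. This excludes the case where the vehicle suddenly changes direction, so that the user glimpses a target object outside the vehicle but cannot keep gazing at it.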
需要说明的是,获取的N帧图像可以是指获取的N帧驾驶员的图像,其中,N帧驾驶员的图像与M帧行车记录仪的图像是在相同的时间段内获取的图像,即在相同的起始时刻与终止时刻内获取的图像。
例如,在获取N帧驾驶员的图像或者M帧行车记录仪的图像时,车辆中的处理器可以在图像上打上时间戳,根据时间戳可以确定在相同的时间段内获取N帧驾驶员的图像与M帧行车记录仪的图像。
应理解,上述获取N帧驾驶员的图像与M帧行车记录仪的图像也可以是在时间差允许的范围内获取的图像;即获取N帧驾驶员的图像的时刻与获取M帧行车记录仪的图像的时刻相似或者接近。
示例性地,若获取N帧驾驶员的图像与M帧行车记录仪的图像存在一定的允许时间差,则可以根据获取的N帧驾驶员的图像预测后几帧驾驶员的图像;或者,可以根据M帧行车记录仪的图像预测后几帧行车记录仪的图像。
示例性地,获取的N帧图像中驾驶员的视线注视区域可以是指通过配置于车辆的驾驶舱内的摄像头采集的N帧驾驶员的图像;基于N帧驾驶员的图像可以确定N帧图像中驾驶员的视线注视区域,比如,可以通过N帧驾驶员的图像中驾驶员的头部位置确定驾驶员在车辆的前挡风玻璃上的注视区域。
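It should be understood that the overlapping area between the driver's line-of-sight gaze areas in the N frames and the first gaze areas in the M frames of driving recorder images is determined only when the difference between the gaze areas in the N frames falls within the first preset range and the difference between the first gaze areas in the M frames of driving recorder images falls within the second preset range. For example, when both ranges are satisfied, the size of the overlapping area may be determined from the driver's line-of-sight gaze area in the i-th of the N frames and the first gaze area in the i-th of the M frames of driving recorder images; the driver's target gaze area is then determined by comparing the overlapping areas across the frames.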
在本申请的实施例中,可以首先对驾驶员的视线进行追踪,即确定获取的N帧图像即N帧驾驶员的图像的差异满足第一预设范围;即可以确定获取N帧驾驶员的图像中驾驶员的视线注视区域不在正前方且保持N帧图像变化差异较小,则此时可以确定驾驶员在注视车辆外场景中的感兴趣的物体;并且确定获取的M帧行车记录仪的图像的差异满足第二预设范围,此时可以确定该行车记录仪持续多帧拍摄到相同的物体未丢失;此时进一步确定N帧图像中驾驶员的视线注视区域与M帧行车记录仪的图像中的重叠区域,从而确定驾驶员的目标注视区域;通过多帧图像满足第一预设范围以及第二预设范围,可以确保本申请提供的用户感兴趣对象的识别方法的鲁棒性。
还应理解,在相同的起始时刻与终止时刻内获取的N帧图像(例如,N帧驾驶员的图像)与M帧行车记录仪的图像的数量可以相同也可以不同。
在一种情况下,在相同的起始时刻与终止时刻内获取的N帧图像(即驾驶员图像)与M帧行车记录仪的图像的数量相等。
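For example, the driver's line-of-sight gaze areas in N frames may be the gaze areas in three images, #1 to #3, and the M frames of driving recorder images may be the three corresponding driving recorder images, #4 to #6. Determining the overlapping area then means determining the overlap between the driver's gaze area in image #1 and the first gaze area in driving recorder image #4, denoted overlapping area 1; likewise the overlap between image #2 and driving recorder image #5, denoted overlapping area 2; and the overlap between image #3 and driving recorder image #6, denoted overlapping area 3. The driver's target gaze area is finally obtained from the part common to overlapping areas 1, 2, and 3.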
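The three-frame example can be sketched as follows, assuming each area is represented as a set of grid cells in the driving recorder image plane. The cell-set representation and the function name `common_target_area` are assumptions made for illustration only.

```python
def common_target_area(gaze_frames, first_gaze_frames):
    """Intersect gaze and first gaze areas frame by frame, then keep the common part.

    Both arguments are equal-length lists of sets of (col, row) cells, one set per
    frame, already paired so that index i in both lists refers to the same instant.
    """
    if not gaze_frames or len(gaze_frames) != len(first_gaze_frames):
        raise ValueError("need the same non-zero number of frames on both sides")
    common = None
    for gaze, first_gaze in zip(gaze_frames, first_gaze_frames):
        frame_overlap = gaze & first_gaze          # overlapping area 1, 2, 3, ...
        common = frame_overlap if common is None else common & frame_overlap
    return common                                  # empty set: no stable target


# Usage: driver images #1-#3 paired with driving recorder images #4-#6.
gaze = [{(3, 2), (4, 2)}, {(4, 2), (5, 2)}, {(4, 2), (4, 3)}]
first_gaze = [{(4, 2), (9, 9)}, {(4, 2), (9, 9)}, {(4, 2), (9, 8)}]
target = common_target_area(gaze, first_gaze)
```

A strict set intersection is the simplest reading of "the part common to the overlapping areas"; a tolerance for small movement between frames could be added if the application requires it.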
在另一种情况下,在相同的起始时刻与终止时刻内获取的N帧图像(即驾驶员图像)与M帧行车记录仪的图像的数量不相等;比如,若获取驾驶员图像的摄像头的采集频率与行车记录仪的采集频率不相同时,则在相同的时间段内采集的图像帧的数量可以不同。
示例性地,以获取N帧驾驶员图像,M帧行车记录仪的图像进行举例说明,其中,N<M。
1、向下匹配,即通过对数量较小的图像进行匹配,即通过对N帧驾驶员图像进行匹配。
例如,对N帧驾驶员图像中的每帧驾驶员图像寻找M帧行车记录仪的图像中与其时间戳最接近的行车记录仪的图像,即基于N帧驾驶员图像找到M帧行车记录仪的图像中的N帧行车记录仪的图像;也就是说,对于M帧行车记录仪的图像允许丢弃掉部分多余的图像。
2、向上匹配,即通过对数量较多的图像进行匹配,即通过对M帧行车记录仪的图像进行匹配。
例如,对于M帧行车记录仪图像中的每帧行车记录仪的图像寻找N帧驾驶员图像中与其时间戳最接近的驾驶员图像,即基于M帧行车记录仪的图像找到M帧驾驶员图像;也就是说,允许复用部分驾驶员图像,即允许M帧行车记录仪的图像中的多帧图像匹配相同的驾驶员图像。
同理,当N>M时与上述过程类似,此处不再赘述。
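The two matching strategies can be sketched as nearest-timestamp lookups. The function names and the millisecond timestamps below are illustrative assumptions; the only property taken from the description is that each frame is paired with the frame of the other stream whose timestamp is closest.

```python
def nearest_index(timestamp, candidates):
    """Index of the candidate timestamp closest to the given timestamp."""
    return min(range(len(candidates)), key=lambda j: abs(candidates[j] - timestamp))


def match_down(driver_ts, dvr_ts):
    """Downward matching: one pair per driver frame; surplus recorder frames are dropped."""
    return [(i, nearest_index(t, dvr_ts)) for i, t in enumerate(driver_ts)]


def match_up(driver_ts, dvr_ts):
    """Upward matching: one pair per recorder frame; driver frames may be reused."""
    return [(nearest_index(t, driver_ts), j) for j, t in enumerate(dvr_ts)]


# Usage with N = 3 driver frames and M = 5 driving recorder frames (timestamps in ms).
driver_timestamps = [0, 100, 200]
dvr_timestamps = [0, 40, 100, 160, 200]
down_pairs = match_down(driver_timestamps, dvr_timestamps)  # 3 (driver, recorder) pairs
up_pairs = match_up(driver_timestamps, dvr_timestamps)      # 5 (driver, recorder) pairs
```

Each pair is (driver frame index, driving recorder frame index). When N > M the same two functions apply with the roles of the two streams exchanged.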
进一步地,在本申请的实施例中为了便于确定车辆中用户的视线注视区域与环境图像中第一注视区域的重叠区域,即重叠部分;可以将位于两个成像平面的图像投影至同一成像平面;即可以将车辆中用户的视线注视区域的信息映射至行车记录仪的图像所在成像平面;或者,可以将行车记录仪的图像的信息映射至车辆中用户的视线注视区域所在成像平面。具体过程可以参见后续图9所示。
可选地,在一种可能的实现方式中,上述识别方法还包括:在车辆的显示屏中显示所述目标注视区域的信息。
可选地,在一种可能的实现方式中,车辆中可以包括多个显示屏,在车辆的显示屏中显示目标注视区域的信息,包括:根据车辆中用户的在车辆中的位置信息确定多个显示屏中的目标显示屏;在目标显示屏中显示目标注视区域的信息。
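For example, identity information of the user is determined from the user's position in the vehicle; the identity may be driver or passenger. Information about the target gaze area can then be pushed to the user according to that identity. For example, the information may be shown on the display at the corresponding position, announced by voice, or pushed to the user's mobile phone so that the user can look into the target gaze area later.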
例如,若检测到用户为驾驶员,则可以通过抬头显示HUD系统在车辆中显示目标注视区域的信息;比如,可以在前挡风玻璃上显示目标注视区域的信息。
例如,若检测到用户为副驾驶位置或者车辆后排位置的乘客,则可以在该位置对应的显示屏中显示目标注视区域的信息。
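The choice of output device by seat position can be sketched as a small lookup. The `Seat` enumeration, the `"hud"` identifier, and the display names are hypothetical; they only illustrate that the driver is served through the head-up display and a passenger through the display at his or her seat.

```python
from enum import Enum


class Seat(Enum):
    """Seat positions assumed for illustration."""
    DRIVER = "driver"
    FRONT_PASSENGER = "front_passenger"
    REAR_LEFT = "rear_left"
    REAR_RIGHT = "rear_right"


def select_display(seat, seat_displays):
    """Pick the target display for a user seated at `seat`.

    `seat_displays` maps passenger seats to display identifiers. The driver is
    always routed to the HUD; None means no display is available at that seat,
    so the caller may fall back to a voice announcement or a phone push.
    """
    if seat is Seat.DRIVER:
        return "hud"
    return seat_displays.get(seat)


# Usage: a vehicle with a front passenger screen and one rear screen.
displays = {Seat.FRONT_PASSENGER: "front_passenger_screen", Seat.REAR_LEFT: "rear_left_screen"}
target_display = select_display(Seat.REAR_LEFT, displays)
```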
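Optionally, in a possible implementation, the method further includes: displaying the information about the target gaze area in the vehicle through a head-up display (HUD) system.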
在另一个示例中,上述用户感兴趣对象的识别方法可以应用于智能终端场景中,在智能终端的场景中通过本申请提供的方法可以用于识别用户感兴趣的物体,从而为用户提供更加智能的服务,有效提升用户体验。具体流程可以参见后续图12所示,此处不再赘述。
在本申请的实施例中,在识别用户感兴趣的对象时可以通过引入环境图像的预判,即根据用户可能会对环境图像中感兴趣的区域与用户的视线注视区域确定用户实际感兴趣的区域,从而提高了识别用户感兴趣对象的准确性。
图6是本申请实施例提供的用户感兴趣对象的识别方法示意性流程图。图6所示的方法包括步骤S510至步骤S560,下面分别对这些步骤进行详细描述。
应理解,图6所示的方法可以由图1所示的行驶车辆,或者图2所示的智能终端来执行;其中,行驶车辆可以是图3所示的车辆,或者,图4所示的车辆。
需要说明的是,图6中以用户为驾驶员为例进行举例说明;在本申请的实施例中,用户还可以为位于车辆中的用户,比如,车辆中的乘客,或者其它身份的用户,本申请对此不作任何限定。
S510、计算驾驶员的视线注视区域。
其中,计算驾驶员的视线注视区域即确定驾驶员的视线会经过前挡风玻璃的区域位置。
示例性地,可以将DMS/CMS采集到的图像输入至计算系统进行计算,通过采用深度学习算法,或者支持向量机算法,得到驾驶员在前挡风玻璃上注视的区域。
需要说明的是,支持向量机算法(Support vector machines,SVM)是一种有监督的机器学习算法,可用于分类任务或回归任务。
例如,如图7所示,可以将前挡风玻璃分成多个正方形区域;假设左下角为起点,该起点在区域中的坐标为(0,0);向上、向右为对应正方向;通过深度学习算法可以识别多个正方形区域中驾驶员视线注视的区域。
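The grid convention of FIG. 7 (origin at the lower-left corner, right and up positive) can be sketched as follows. In this application the gaze cell is produced by a learned model; the function below only illustrates how a point on the windshield would map to a cell index, and the cell size and grid dimensions are made-up values.

```python
def gaze_cell(x, y, cell_size, cols, rows):
    """Map a point on the windshield to its (col, row) grid cell.

    x and y are measured from the lower-left corner of the windshield, with
    right and up as the positive directions. Returns None outside the grid.
    """
    if x < 0 or y < 0:
        return None
    col, row = int(x // cell_size), int(y // cell_size)
    if col >= cols or row >= rows:
        return None
    return (col, row)


# Usage: 10 cm square cells on a 12 x 6 grid; a point 35 cm right, 25 cm up.
cell = gaze_cell(35, 25, cell_size=10, cols=12, rows=6)
```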
S520、计算驾驶员的头部3D位置。
例如,可以通过双目方法计算驾驶员的头部3D位置。
示例性地,图8是计算驾驶员的头部3D位置的示意性流程图。如图8所示,该方法包括步骤S610至步骤S650,下面分别对这些步骤进行详细描述。
S610、获取相机的内参以及外参标定。
例如,获取出厂前对DMS/CMS内参或者外参标定可以是通过相机标定方法,获取两个相机的内参与外参矩阵;从而能够基于内参与外参建立相机与相机、相机与3D空间中其他物体的位置、方向关系。
S620、检测关键点P1、P2。
The key points P1 and P2 are the center positions of the driver's head in the DMS view and the CMS view, respectively.
例如,在DMS/CMS视图中检测头部位置P1,P2可以是通过深度学习或其他算法检测出两个视图中头部2D横纵坐标位置p1,p2。
S630、计算关键点P1、P2所在直线。
S640、计算关键点P1、P2所在3D直线O1P1,O2P2。
例如,根据DMS/CMS内参/外参可以计算头部所在3D直线O1P1,O2P2可以是基于相机的内参/外参将头部2D横纵坐标位置p1,p2转换为3D空间坐标P1,P2;其中,O1,O2分别表示DMS/CMS在3D空间中的光学原点,记录在外参矩阵中,从而可计算出O1P1,O2P2。
S650、计算驾驶员头部的中心位置点P。
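For example, the center point P of the driver's head may be obtained by solving for the intersection of the two 3D lines O1P1 and O2P2. If the two lines do not intersect, the point at which the two lines are closest to each other may be taken as P.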
通过上述S610至S650可以根据相机的内参与外参从而得到驾驶员的头部的3D位置。
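Step S650 can be sketched as a closest-point computation between two 3D lines. The sketch assumes O1, P1, O2, and P2 are already expressed in one common 3D coordinate system, and it takes the midpoint of the shortest segment between the lines as P when they do not intersect; the midpoint rule and the function name are assumptions, since the description only says that the closest point is chosen.

```python
def closest_point_between_lines(o1, p1, o2, p2, eps=1e-12):
    """Return the point nearest to both lines O1P1 and O2P2.

    If the lines intersect, this is the intersection; otherwise it is the
    midpoint of the shortest segment joining them. Raises ValueError for
    parallel lines, where no unique closest point exists.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    d1 = [b - a for a, b in zip(o1, p1)]      # direction of line 1
    d2 = [b - a for a, b in zip(o2, p2)]      # direction of line 2
    w0 = [a - b for a, b in zip(o1, o2)]      # vector from O2 to O1
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if denom <= eps:
        raise ValueError("lines are parallel")
    s = (b * e - c * d) / denom               # parameter along line 1
    t = (a * e - b * d) / denom               # parameter along line 2
    q1 = [o + s * k for o, k in zip(o1, d1)]
    q2 = [o + t * k for o, k in zip(o2, d2)]
    return tuple((u + v) / 2 for u, v in zip(q1, q2))


# Usage: two camera rays that meet at the head center.
head_center = closest_point_between_lines((0, 0, 0), (1, 1, 1), (2, 0, 0), (1, 1, 1))
```

In practice the two rays rarely intersect exactly because of calibration and detection noise, which is why the fallback to the closest point matters.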
S530: Compute the line-of-sight gaze area in the driving recorder (driving video recorder, DVR) view.
需要说明的是,计算DVR视线注视区域可以是指建立一个驾驶员在挡风玻璃的视线注视区域至DVR的视线注视区域的关联关系;比如,该关联关系可以是查表关系。
例如,如图9所示,驾驶员像平面O1与DVR像平面O2的空间位置关系,通过装在车身前部的行车记录仪获取前方图像,然后根据驾驶员头部位置、驾驶员的视线注视区域计算出驾驶员注视区域在DVR中对应的区域。眼睛点为车身后视摄像头位置,X直线为驾驶员视线注视区域中心点在空间中的延长线,DVR为行车记录仪在空间中的位置,O1中阴影区域表示驾驶员在挡风玻璃上注视区域,O2中斜线区域表示DVR视图中对应的驾驶员注视区域。
应理解,驾驶员的感兴趣物体所在位置位于驾驶员透过前挡风玻璃的视线注视区域的延长线上,因此确定DVR视图中对应的驾驶员的注视区域,则能够确定驾驶员在DVR中感兴趣的物体的大致所在区域。
S540、场景注意力预判。
需要说明的是,场景注意力预判是指由于人眼会对颜色、形状变化比较丰富的区域比较敏感并产生兴趣,因此,可以基于人体此生理特征,预判驾驶员可能会对车外哪些区域产生兴趣。
示例性地,如图10所示,获取一张图像通过深度学习方法,或者使用Canny边缘检测方法获取形态变化丰富度,通过梯度计算方法获取颜色变化丰富度,并对这两个值进行加权计算出图像各位置的兴趣值;基于图像中各个位置的兴趣值预先判断驾驶员对图像中的哪个区域可能会产生兴趣。比如,在图10中的阴影区域表示驾驶员的敏感区域,即可能会引起驾驶员注意的区域。
例如,上述图像可以是从DVR中获取的图像。
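The weighting step can be sketched as follows, assuming the shape richness and color richness have already been computed per grid cell and normalized to the range 0 to 1. The equal weights, the threshold, and the function names are illustrative assumptions; the description does not give concrete values.

```python
def interest_map(shape_richness, color_richness, w_shape=0.5, w_color=0.5):
    """Weighted sum of two equally sized 2D maps, giving an interest value per cell."""
    return [[w_shape * s + w_color * c for s, c in zip(shape_row, color_row)]
            for shape_row, color_row in zip(shape_richness, color_richness)]


def sensitive_cells(interest, threshold):
    """Cells whose interest value reaches the threshold, as a set of (row, col)."""
    return {(r, c)
            for r, row in enumerate(interest)
            for c, value in enumerate(row)
            if value >= threshold}


# Usage on a 2 x 2 grid: only the top-right cell is rich in both shape and color.
shape = [[0.0, 1.0], [0.5, 0.0]]
color = [[0.0, 1.0], [0.5, 1.0]]
interest = interest_map(shape, color)
sensitive = sensitive_cells(interest, threshold=0.75)
```

The resulting set of cells plays the role of the shaded sensitive area in FIG. 10.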
S550、场景注意区域判断。
例如,可以通过计算DVR中对应的驾驶员的视线注视区域以及场景注意力预判可以得到驾驶员对车外场景中真正感兴趣的区域。
S560、意图判断。
例如,基于上述S550场景注意区域判断可以得到驾驶员真正感兴趣的区域,从而可以获取该区域的信息,后续可以将该感兴趣的信息提供给驾驶员;比如,可以将该感兴趣区域的信息对驾驶员进行播报;或者,将该信息推送至驾驶员的手机中以便驾驶员后续可以继续了解该感兴趣区域的信息。
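For example, for the specific procedure of the scene attention area determination, see the flow shown in FIG. 11. The scene attention area determination may include steps S710 to S750, which are described in detail below.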
S710、驾驶员视线追踪。
例如,具体地参见上述S510的计算驾驶员的视线注视区域。
S720、判断驾驶员的视线注视区域是否持续N帧;若是,则执行S750;若否,则返回。
需要说明的是,判断驾驶员的视线注视区域是否持续N帧是指判断驾驶员在前挡风玻璃上的视线注视区域是否在N帧图像中保持不动或在小范围内变化;即判断驾驶员是否连续N帧持续注意某个区域。
例如,若在N帧驾驶员的视线注视区域的变化超出范围,则可以认为驾驶员只是扫视经过,没有引起驾驶员的注意,在此状态下不需要计算驾驶员车外感兴趣区域;若驾驶员视线注视区域始终保持在正前方,则可以认为驾驶员在专心驾驶,没有注视车外物体等,在此状态下不需要计算驾驶员车外感兴趣区域。
应理解,在本申请的实施例中,N帧图像驾驶员的视线注视区域是否在N帧图像中保持不动或者在小范围内变化可以是指驾驶员的视线注视区域在除正前方向外且N帧图像中驾驶员的视线注视区域保持不动或在小范围内变化;其中,正前方向可以是指车辆行驶的方向。
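The check in S720 can be sketched as follows, assuming each frame's gaze area is summarized by its center point on the windshield. The parameters `max_shift`, `ahead_center`, and `ahead_radius` stand in for the preset range and for the straight-ahead region; their names and values are assumptions.

```python
import math


def is_sustained_gaze(centers, max_shift, ahead_center, ahead_radius):
    """Decide whether N consecutive gaze centers indicate sustained off-road attention.

    centers: list of (x, y) gaze area centers, one per frame.
    Returns False for a passing glance (centers spread too far apart) and for
    attentive driving (gaze stays in the straight-ahead region).
    """
    if not centers:
        return False
    xs = [x for x, _ in centers]
    ys = [y for _, y in centers]
    if max(max(xs) - min(xs), max(ys) - min(ys)) > max_shift:
        return False                                   # gaze only swept past
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    if math.hypot(mean_x - ahead_center[0], mean_y - ahead_center[1]) <= ahead_radius:
        return False                                   # looking straight ahead
    return True


# Usage: three frames clustered to the right of the straight-ahead region.
sustained = is_sustained_gaze([(80, 40), (81, 41), (80, 42)],
                              max_shift=5, ahead_center=(30, 30), ahead_radius=15)
```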
S730、车外注意力预判。
例如,具体地参见上述S540的场景注意力预判。
S740、判断车外注意力预判是否持续M帧;若是,则执行S750;若否,则返回。
应理解,连续M帧车外注意力预判是通过对行车记录仪获取的图像进行注意力预判,计算多帧图像的感兴趣区域的重叠区域。
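For example, if the area of interest is lost in the M frames, that is, the area of interest moves out of the image boundary, or the areas of interest in successive frames are not continuous, the scene is probably a small object at the roadside (such as a small billboard). The driver can gaze at it only briefly and it is very unlikely to interest the driver, so S750 need not be performed. If the area of interest is continuous and not lost over the M frames, S750 may be performed to judge whether this area is the driver's real area of interest.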
还应理解,在相同的起始时刻与终止时刻内获取的N帧图像(例如,N帧驾驶员的图像)与M帧行车记录仪的图像的数量可以相同也可以不同。
S750、场景注意区域判断。
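It should be noted that S750 builds on the driver line-of-sight tracking and the exterior attention pre-judgment. The scene attention area determination starts when the results of both remain unchanged or change only within a small range. That is, if the driver's line-of-sight gaze area and the exterior attention pre-judgment both remain unchanged, or change only within a small range, over N consecutive frames, the exterior scene attention area determination needs to be performed.
For example, suppose N frames (say, 5 frames) of the driver are captured by the camera in the vehicle cabin and N frames (say, 5 frames) of the exterior scene are captured by the driving recorder. The driver's line-of-sight gaze area on the front windshield is computed for each of the 5 cabin images, and it is judged whether this gaze area stays unchanged or changes only within a small range across the 5 frames. Likewise, the exterior attention pre-judgment determines the area the driver may be interested in for each of the 5 exterior images, and it is judged whether that area stays unchanged or changes only within a small range. If both conditions hold over the N frames, the overlapping area between each cabin frame and the corresponding driving recorder frame is determined; this overlapping area is the area the driver is really interested in.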
应理解,上述通过对相同时间段内采集的车辆驾驶室内的图像的数量与行车记录仪的图像的数量相等的情况进行了举例说明;在相同的时间段内,采集的车辆驾驶室内的图像的数量与行车记录仪的图像的数量也可能不同,本申请对此不作任何限定。
示例性地,通过S530可以进行DVR视线注视区域计算,即将驾驶员的视线注视区域从挡风玻璃移动到行车记录仪的图像内,将驾驶员在行车记录仪的图像内的视线注视区域与行车记录仪的图像内的感兴趣预判区域取交集,当连续N帧图像的交集保持不变或小范围内移动,则认为该交集区域是指驾驶员真正注视的区域。
在本申请的实施例中,通过结合车内摄像头采集的驾驶员的图像获取驾驶员的视线注视区域,并将驾驶员的视线注视区域与车外场景区域进行结合,从而提升了用户的感兴趣对象识别算法的准确度与鲁棒性,为用户带来了更好的交互体验。
应理解,上述图11是在驾驶领域中的场景注意区域判断的示意性流程图。图12是在智能终端领域的场景注意区域判断的示意性流程图,在智能终端领域可以通过本申请提供的用于识别感兴趣对象的方法判断用户对家居中的哪个物体感兴趣,并提供相应的交互。
在智能终端领域中,用于识别感兴趣对象的方法与上述图6至图10所示的流程类似,区别在于在驾驶领域中获取的是驾驶员的视线注视区域,在智能终端领域则获取的是用户的视线注视区域,以及场景注意区域判断的流程不同。
下面结合图12对智能终端领域中的场景注意区域判断的流程进行详细描述。图12所示的场景注意区域判断可以包括步骤S810至步骤S830,下面分别对这些步骤进行详细描述。
S810、用户视线追踪,即获取用户的视线注视区域。
例如,可以通过智能终端的摄像头获取用户的图像;比如,可以通过智慧屏中的摄像头采集用户的图像。
S820、判断用户的视线注视区域是否持续N帧;若是,则执行S830;若否,则返回。
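It should be understood that in the smart terminal field, for example in a smart home, the smart home devices are assumed by default not to move. It is therefore enough to judge whether the user's line-of-sight gaze area persists over N frames, without performing an N-frame scene attention pre-judgment. That is, assuming the objects in the scene do not move, the user's line-of-sight gaze area is obtained, and it is judged whether that area stays unchanged or within a small range of variation over N frames, in other words whether the user keeps gazing at some object for a period of time.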
S830、场景注意区域判断。
例如,可以基于持续N帧用户的视线注视区域以及场景注意力预判确定用户真正注视的物体。其中,场景注意力预判的具体流程可以参见图10的具体说明,此处不再赘述。
应理解,上述举例说明是为了帮助本领域技术人员理解本申请实施例,而非要将本申请实施例限于所例示的具体数值或具体场景。本领域技术人员根据所给出的上述举例说明,显然可以进行各种等价的修改或变化,这样的修改或变化也落入本申请实施例的范围内。
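The method for identifying an object of interest of a user in the embodiments of this application has been described in detail above with reference to FIG. 1 to FIG. 12. The apparatus embodiments of this application are described in detail below with reference to FIG. 13 and FIG. 14. It should be understood that the apparatus for identifying an object of interest of a user in the embodiments of this application can perform the methods of the foregoing embodiments; that is, for the specific working process of the following products, refer to the corresponding process in the foregoing method embodiments.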
图13是本申请一个实施例提供的用户感兴趣对象的识别装置的示意性框图。
应理解,图13示出的识别装置900仅是示例,本申请实施例的装置还可包括其他模块或单元。应理解,识别装置900能够执行图5至图12的识别方法中的各个步骤,为了避免重复,此处不再详述。
如图13所示,识别装置900可以包括获取模块910和处理模块920,其中,获取模块910用于获取用户的视线注视区域的信息以及所述用户对应的环境图像;处理模块920用于根据所述环境图像得到所述用户在所述环境图像中的第一注视区域的信息,其中,所述第一注视区域用于表示通过人体物理特征确定的敏感区域;根据所述视线注视区域的信息与所述第一注视区域的信息得到所述用户的目标注视区域,其中,所述目标注视区域用于表示在所述环境图像中所述用户注视的目标对象所在的区域。
可选地,在一种可能的实现方式中,所述处理模块920具体用于:
根据所述视线注视区域与所述第一注视区域的重叠区域确定所述目标注视区域。
可选地,在一种可能的实现方式中,所述用户为车辆中用户,所述获取模块910具体用于:
获取所述车辆中用户的视线注视区域的信息以及所述车辆的行车记录仪的图像,其中,所述车辆中用户的视线注视区域用于表示所述车辆中用户在所述车辆外部的注视区域;
所述处理模块920具体用于:
根据所述车辆中用户的视线注视区域的信息与所述行车记录仪的图像中所述第一注视区域的信息确定所述车辆中用户的目标注视区域。
可选地,在一种可能的实现方式中,所述获取模块910具体用于:
获取N帧图像中所述车辆中用户的视线注视区域的信息以及M帧所述行车记录仪的图像,其中,所述N帧图像与所述M帧所述行车记录仪的图像是在相同的起始时刻与终止时刻内获取的图像,N、M均为正整数;
所述处理模块920具体用于:
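determine that the difference between the line-of-sight gaze areas of the user in the vehicle in the N frames of images falls within a first preset range;
determine that the difference between the first gaze areas in the M frames of driving recorder images falls within a second preset range;
determine an overlapping area from the line-of-sight gaze areas of the user in the vehicle in the N frames of images and the first gaze areas in the M frames of driving recorder images; and
determine the target gaze area of the user in the vehicle from the overlapping area.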
可选地,在一种可能的实现方式中,所述视线注视区域的差异是指所述视线注视区域的位置差异;所述第一注视区域的差异是指所述第一注视区域的位置差异。
可选地,在一种可能的实现方式中,所述处理模块920还用于:
将所述车辆中用户的视线注视区域映射至所述行车记录仪的图像所在成像平面;或者,
将所述行车记录仪的图像映射至所述车辆中用户的视线注视区域所在成像平面。
可选地,在一种可能的实现方式中,所述处理模块920还用于:
在所述车辆的显示屏中显示所述目标注视区域的信息。
可选地,在一种可能的实现方式中,所述车辆中包括多个显示屏,所述处理模块920具体用于:
根据所述车辆中用户的在所述车辆中的位置信息确定所述多个显示屏中的目标显示屏;在所述目标显示屏中显示所述目标注视区域的信息。
可选地,在一种可能的实现方式中,所述处理模块920还用于:
通过抬头显示HUD系统在所述车辆中显示所述目标注视区域的信息。
可选地,在一种可能的实现方式中,所述车辆中用户为所述车辆的驾驶员,或者所述车辆中的乘客。
在一种可能的实现方式中,上述识别装置900是指汽车,其中,获取模块是指汽车中的接口电路;处理模块是指汽车中的处理器。
例如,汽车中的接口电路可以通过通信网络获取配置于汽车内部的摄像头(例如,DMS摄像头或者CMS摄像头)采集的用户的视线注视区域的信息,以及获取行车记录仪拍摄的汽车行驶的环境图像,其中,DMS或者CMS可以集成在汽车中;上述本申请实施例中的识别方法可以通过软件算法实现,处理器可以从接口电路获取用户的视线注视区域的信息以及环境图像,并通过集成逻辑电路或者软件形式的指令执行本申请中的识别方法。
在一种可能的实现方式中,上述识别装置900是指汽车中的车载设备,其中,获取模块是指车载设备中的接口电路;处理模块是指车载设备中的处理器。
例如,车载设备中的接口电路可以通过通信网络获取配置于汽车内部的摄像头(例如,DMS摄像头或者CMS摄像头)采集的用户的视线注视区域的信息,以及获取行车记录仪拍摄的汽车行驶的环境图像;处理器从接口电路获取用户的视线注视区域的信息以及环境图像,并通过集成逻辑电路或者软件形式的指令执行本申请中的识别方法。
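The communication network may be 3G cellular communication, such as CDMA, EVDO, or GSM/GPRS; 4G cellular communication, such as LTE; or 5G cellular communication. The communication network may communicate with a wireless local area network using WiFi; or it may communicate directly with a device over an infrared link, Bluetooth, or ZigBee. Other wireless protocols may also be used, such as various vehicle communication systems; for example, the communication network may include one or more dedicated short range communications devices, which may carry public and/or private data communication between vehicles and/or roadside stations.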
应理解,这里的识别装置900以功能单元的形式体现。这里的术语“模块”可以通过软件和/或硬件形式实现,对此不作具体限定。
例如,“模块”可以是实现上述功能的软件程序、硬件电路或二者结合。所述硬件电路可能包括应用特有集成电路(application specific integrated circuit,ASIC)、电子电路、用于执行一个或多个软件或固件程序的处理器(例如共享处理器、专有处理器或组处理器等)和存储器、合并逻辑电路和/或其它支持所描述的功能的合适组件。
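Therefore, the units of the examples described in the embodiments of this application can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of this application.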
图14是本申请一个实施例的用户感兴趣对象的识别装置的示意性框图。
图14所示的识别装置1000包括存储器1001、处理器1002、通信接口1003以及总线1004。其中,存储器1001、处理器1002、通信接口1003通过总线1004实现彼此之间的通信连接。
存储器1001可以是只读存储器(read only memory,ROM),静态存储设备,动态存储设备或者随机存取存储器(random access memory,RAM)。存储器1001可以存储程序,当存储器1001中存储的程序被处理器1002执行时,处理器1002用于执行本申请实施例的用户感兴趣对象的识别方法的各个步骤,例如,可以执行图5至图12所示实施例的各个步骤。
处理器1002可以采用通用的中央处理器(central processing unit,CPU),微处理器,应用专用集成电路(application specific integrated circuit,ASIC),或者一个或多个集成电路,用于执行相关程序,以实现本申请方法实施例的用户感兴趣对象的识别方法。
处理器1002还可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,本申请实施例的用户感兴趣对象的识别方法的各个步骤可以通过处理器1002中的硬件的集成逻辑电路或者软件形式的指令完成。
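The processor 1002 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor, or it may be any conventional processor.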
结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1001,处理器1002读取存储器1001中的信息,结合其硬件完成本申请实施例中图13所示的识别装置包括的模块所需执行的功能,或者,执行本申请方法实施例的用户感兴趣对象的识别方法,例如,可以执行图5至图12所示实施例的各个步骤/功能。
通信接口1003可以使用但不限于收发器一类的收发装置,来实现识别装置1000与其他设备或通信网络之间的通信。
总线1004可以包括在识别装置1000各个部件(例如,存储器1001、处理器1002、通信接口1003)之间传送信息的通路。
应注意,尽管上述识别装置1000仅仅示出了存储器、处理器、通信接口,但是在具体实现过程中,本领域的技术人员应当理解,识别装置1000还可以包括实现正常运行所必须的其他器件。同时,根据具体需要,本领域的技术人员应当理解,上述识别装置1000还可包括实现其他附加功能的硬件器件。此外,本领域的技术人员应当理解,上述识别装置1000也可仅仅包括实现本申请实施例所必须的器件,而不必包括图14中所示的全部器件。
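It should be understood that the identification apparatus shown in the embodiments of this application may be an in-vehicle device in a vehicle, or a chip configured in the in-vehicle device.
An embodiment of this application further provides a vehicle, which includes the apparatus for identifying an object of interest of a user in the foregoing embodiments of this application.
An embodiment of this application further provides a vehicle system, which includes a camera arranged inside the vehicle, a driving recorder, and the apparatus for identifying an object of interest of a user in the foregoing embodiments of this application.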
本申请实施例还提供一种芯片,该芯片包括收发单元和处理单元。其中,收发单元可以是输入输出电路、通信接口;处理单元为该芯片上集成的处理器或者微处理器或者集成电路;该芯片可以执行上述方法实施例中的用户感兴趣对象的识别方法。
本申请实施例还提供一种计算机可读存储介质,其上存储有指令,该指令被执行时执行上述方法实施例中的用户感兴趣对象的识别方法。
本申请实施例还提供一种包含指令的计算机程序产品,该指令被执行时执行上述方法实施例中的用户感兴趣对象的识别方法。
应理解,本申请实施例中的处理器可以为中央处理单元(central processing unit,CPU),该处理器还可以是其他通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
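It should also be understood that the memory in the embodiments of this application may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically EPROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination of these. When software is used, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the procedures or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another in a wired manner or in a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that contains one or more sets of usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium. The semiconductor medium may be a solid-state drive.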
应理解,本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况,其中A,B可以是单数或者复数。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系,但也可能表示的是一种“和/或”的关系,具体可参考前后文进行理解。
本申请中,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“以下至少一项(个)”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a,b,或c中的至少一项(个),可以表示:a,b,c,a-b,a-c,b-c,或a-b-c,其中a,b,c可以是单个,也可以是多个。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
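If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.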
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。
Claims (26)
- 一种用户感兴趣对象的识别方法,其特征在于,包括:获取用户的视线注视区域的信息以及所述用户对应的环境图像;根据所述环境图像得到所述用户在所述环境图像中的第一注视区域的信息,其中,所述第一注视区域用于表示通过人体物理特征确定的敏感区域;根据所述视线注视区域的信息与所述第一注视区域的信息得到所述用户的目标注视区域,其中,所述目标注视区域用于表示在所述环境图像中所述用户注视的目标对象所在的区域。
- 如权利要求1所述的识别方法,其特征在于,所述根据所述视线注视区域的信息与所述第一注视区域的信息得到所述用户的目标注视区域,包括:根据所述视线注视区域与所述第一注视区域的重叠区域确定所述目标注视区域。
- 如权利要求1或2所述的识别方法,其特征在于,所述用户为车辆中用户,所述获取用户的视线注视区域的信息以及所述用户对应的环境图像,包括:获取所述车辆中用户的视线注视区域的信息以及所述车辆的行车记录仪的图像,其中,所述车辆中用户的视线注视区域用于表示所述车辆中用户在所述车辆外部的注视区域的信息;所述根据所述视线注视区域的信息与所述第一注视区域的信息得到所述用户的目标注视区域,包括:根据所述车辆中用户的视线注视区域的信息与所述行车记录仪的图像中所述第一注视区域的信息确定所述驾驶员的目标注视区域。
- 如权利要求3所述的识别方法,其特征在于,所述获取所述车辆中用户的视线注视区域的信息以及所述车辆的行车记录仪的图像,包括:获取N帧图像中所述车辆中用户的视线注视区域的信息以及M帧所述行车记录仪的图像,其中,所述N帧图像与所述M帧所述行车记录仪的图像是在相同的起始时刻与终止时刻内获取的图像,N、M均为正整数;所述根据所述车辆中用户的视线注视区域的信息与所述行车记录仪的图像中所述第一注视区域的信息确定所述驾驶员的目标注视区域,包括:确定所述N帧图像中所述车辆中用户的视线注视区域的差异满足第一预设范围;确定所述M帧所述行车记录仪的图像中所述第一注视区域的差异满足第二预设范围内;根据所述N帧图像中所述车辆中用户的视线注视区域与所述M帧所述行车记录仪的图像中所述第一注视区域确定重叠区域;根据所述重叠区域确定所述车辆中用户的目标注视区域。
- 如权利要求4中所述的识别方法,其特征在于,所述视线注视区域的差异是指所述视线注视区域的位置差异;所述第一注视区域的差异是指所述第一注视区域的位置差异。
- 如权利要求3至5中任一项所述的识别方法,其特征在于,还包括:将所述车辆中用户的视线注视区域映射至所述行车记录仪的图像所在成像平面;或者,将所述行车记录仪的图像映射至所述车辆中用户的视线注视区域所在成像平面。
- 如权利要求3至6中任一项所述的识别方法,其特征在于,还包括:在所述车辆的显示屏中显示所述目标注视区域的信息。
- 如权利要求7所述的识别方法,其特征在于,所述车辆中包括多个显示屏,所述在所述车辆的显示屏中显示所述目标注视区域的信息,包括:根据所述车辆中用户的在所述车辆中的位置信息确定所述多个显示屏中的目标显示屏;在所述目标显示屏中显示所述目标注视区域的信息。
- 如权利要求3至6中任一项所述的识别方法,其特征在于,还包括:通过抬头显示HUD系统在所述车辆中显示所述目标注视区域的信息。
- 如权利要求3至9中任一项所述的识别方法,其特征在于,所述车辆中用户为所述车辆的驾驶员,或者所述车辆中的乘客。
- 一种用户感兴趣对象的识别装置,其特征在于,包括:获取模块,用于获取用户的视线注视区域的信息以及所述用户对应的环境图像;处理模块,用于根据所述环境图像得到所述用户在所述环境图像中的第一注视区域的信息,其中,所述第一注视区域用于表示通过人体物理特征确定的敏感区域;根据所述视线注视区域的信息与所述第一注视区域的信息得到所述用户的目标注视区域,其中,所述目标注视区域用于表示在所述环境图像中所述用户注视的目标对象所在的区域。
- 如权利要求11所述的识别装置,其特征在于,所述处理模块具体用于:根据所述视线注视区域与所述第一注视区域的重叠区域确定所述目标注视区域。
- 如权利要求11或12所述的识别装置,其特征在于,所述用户为车辆中用户,所述获取模块具体用于:获取所述车辆中用户的视线注视区域的信息以及所述车辆的行车记录仪的图像,其中,所述车辆中用户的视线注视区域用于表示所述车辆中用户在所述车辆外部的注视区域的信息;所述处理模块具体用于:根据所述车辆中用户的视线注视区域的信息与所述行车记录仪的图像中所述第一注视区域的信息确定所述车辆中用户的目标注视区域。
- 如权利要求13所述的识别装置,其特征在于,所述获取模块具体用于:获取N帧图像中所述车辆中用户的视线注视区域的信息以及M帧所述行车记录仪的图像,其中,所述N帧图像与所述M帧所述行车记录仪的图像是在相同的起始时刻与终止时刻内获取的图像,N、M为正整数;所述处理模块具体用于:确定所述N帧图像中所述车辆中用户的视线注视区域的差异满足第一预设范围;确定所述M帧所述行车记录仪的图像中所述第一注视区域的差异满足第二预设范围内;根据所述N帧图像中所述车辆中用户的视线注视区域与所述M帧所述行车记录仪的 图像中所述第一注视区域确定重叠区域;根据所述重叠区域确定所述车辆中用户的目标注视区域。
- 如权利要求14所述的识别装置,其特征在于,所述视线注视区域的差异是指所述视线注视区域的位置差异;所述第一注视区域的差异是指所述第一注视区域的位置差异。
- 如权利要求13至15中任一项所述的识别装置,其特征在于,所述处理模块还用于:将所述车辆中用户的视线注视区域映射至所述行车记录仪的图像所在成像平面;或者,将所述行车记录仪的图像映射至所述车辆中用户的视线注视区域所在成像平面。
- 如权利要求13至16中任一项所述的识别装置,其特征在于,所述处理模块还用于:在所述车辆的显示屏中显示所述目标注视区域的信息。
- 如权利要求17所述的识别装置,其特征在于,所述车辆中包括多个显示屏,所述处理模块具体用于:根据所述车辆中用户的在所述车辆中的位置信息确定所述多个显示屏中的目标显示屏;在所述目标显示屏中显示所述目标注视区域的信息。
- 如权利要求13至16中任一项所述的识别装置,其特征在于,所述处理模块还用于:通过抬头显示HUD系统在所述车辆中显示所述目标注视区域的信息。
- 如权利要求13至19中任一项所述的识别装置,其特征在于,所述车辆中用户为所述车辆的驾驶员,或者所述车辆中的乘客。
- 如权利要求13至20中任一项所述的识别装置,其特征在于,所述识别装置为所述车辆中的车载设备。
- 一种汽车,其特征在于,包括权利要求11至21任一项所述的识别装置。
- 一种车辆系统,其特征在于,包括配置于车辆内部的摄像头、行车记录仪以及权利要求11至21任一项所述的识别装置。
- 一种用户感兴趣对象的识别装置,其特征在于,包括处理器和存储器,所述存储器用于存储程序指令,所述处理器用于调用所述程序指令来执行权利要求1至10中任一项所述的识别方法。
- 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有程序指令,当所述程序指令由处理器运行时,实现权利要求1至10中任一项所述的识别方法。
- 一种芯片,其特征在于,所述芯片包括处理器与数据接口,所述处理器通过所述数据接口读取存储器上存储的指令,以执行如权利要求1至10中任一项所述的识别方法。
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202080004845.1A CN112654546B (zh) | 2020-04-30 | 2020-04-30 | 用户感兴趣对象的识别方法以及识别装置 |
| PCT/CN2020/088243 WO2021217575A1 (zh) | 2020-04-30 | 2020-04-30 | 用户感兴趣对象的识别方法以及识别装置 |
| EP20933677.5A EP4134867A4 (en) | 2020-04-30 | 2020-04-30 | Identification method and identification device for object of interest of user |
| US17/976,070 US20230046258A1 (en) | 2020-04-30 | 2022-10-28 | Method and apparatus for identifying object of interest of user |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2020/088243 WO2021217575A1 (zh) | 2020-04-30 | 2020-04-30 | 用户感兴趣对象的识别方法以及识别装置 |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US17/976,070 Continuation US20230046258A1 (en) | 2020-04-30 | 2022-10-28 | Method and apparatus for identifying object of interest of user |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021217575A1 true WO2021217575A1 (zh) | 2021-11-04 |
Family
ID=75368398
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2020/088243 Ceased WO2021217575A1 (zh) | 2020-04-30 | 2020-04-30 | 用户感兴趣对象的识别方法以及识别装置 |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20230046258A1 (zh) |
| EP (1) | EP4134867A4 (zh) |
| CN (1) | CN112654546B (zh) |
| WO (1) | WO2021217575A1 (zh) |
Families Citing this family (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10909866B2 (en) * | 2018-07-20 | 2021-02-02 | Cybernet Systems Corp. | Autonomous transportation system and methods |
| CN113221798A (zh) * | 2021-05-24 | 2021-08-06 | 南京伯索网络科技有限公司 | 一种基于网络课堂学员积极度评价系统 |
| CN113992885B (zh) * | 2021-09-22 | 2023-03-21 | 联想(北京)有限公司 | 一种数据同步方法及装置 |
| CN114715175B (zh) * | 2022-05-06 | 2025-04-25 | Oppo广东移动通信有限公司 | 目标对象的确定方法、装置、电子设备以及存储介质 |
| CN115631517A (zh) * | 2022-08-15 | 2023-01-20 | 浙江极氪智能科技有限公司 | 图像处理方法、装置、设备及存储介质 |
| CN116486386A (zh) * | 2023-04-24 | 2023-07-25 | 上海临港绝影智能科技有限公司 | 一种视线分心范围确定方法和装置 |
| CN119523480B (zh) * | 2023-08-30 | 2025-09-19 | 上海交通大学 | 多维度想象能力的客观评测方法 |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101651772A (zh) * | 2009-09-11 | 2010-02-17 | 宁波大学 | 一种基于视觉注意的视频感兴趣区域的提取方法 |
| CN103246350A (zh) * | 2013-05-14 | 2013-08-14 | 中国人民解放军海军航空工程学院 | 基于感兴趣区实现辅助信息提示的人机接口设备及方法 |
| CN105620364A (zh) * | 2014-11-21 | 2016-06-01 | 现代摩比斯株式会社 | 提供驾驶信息的方法和装置 |
| CN109551489A (zh) * | 2018-10-31 | 2019-04-02 | 杭州程天科技发展有限公司 | 一种人体辅助机器人的控制方法及装置 |
| US20190318181A1 (en) * | 2016-07-01 | 2019-10-17 | Eyesight Mobile Technologies Ltd. | System and method for driver monitoring |
| CN110850974A (zh) * | 2018-11-02 | 2020-02-28 | 英属开曼群岛商麦迪创科技股份有限公司 | 用于侦测意图兴趣点的方法及其系统 |
| CN110929703A (zh) * | 2020-02-04 | 2020-03-27 | 北京未动科技有限公司 | 信息确定方法、装置和电子设备 |
Family Cites Families (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP4420002B2 (ja) * | 2006-09-14 | 2010-02-24 | トヨタ自動車株式会社 | Line-of-sight end estimation device |
| CN102063623B (zh) * | 2010-12-28 | 2012-11-07 | 中南大学 | Method for extracting an image region of interest combining bottom-up and top-down approaches |
| US9323057B2 (en) * | 2012-12-07 | 2016-04-26 | Blackberry Limited | Mobile device, system and method for controlling a heads-up display |
| WO2015170142A1 (en) * | 2014-05-08 | 2015-11-12 | Sony Corporation | Portable electronic equipment and method of controlling a portable electronic equipment |
| CN106155288B (zh) * | 2015-04-10 | 2019-02-12 | 北京智谷睿拓技术服务有限公司 | Information acquisition method, information acquisition apparatus, and user equipment |
| GB2539009B (en) * | 2015-06-03 | 2021-07-21 | Tobii Ab | Gaze detection method and apparatus |
| JP6563798B2 (ja) * | 2015-12-17 | 2019-08-21 | 大学共同利用機関法人自然科学研究機構 | Visual recognition support system and system for detecting a visually recognized object |
| US10269134B2 (en) * | 2016-07-01 | 2019-04-23 | Hashplay Inc. | Method and system for determining a region of interest of a user in a virtual environment |
| DE102017116702A1 (de) * | 2017-07-24 | 2019-01-24 | SMR Patents S.à.r.l. | Method for providing a display in a motor vehicle, and motor vehicle |
| KR102446387B1 (ko) * | 2017-11-29 | 2022-09-22 | 삼성전자주식회사 | Electronic device and text providing method thereof |
| DE102018201768B4 (de) * | 2018-02-06 | 2020-02-06 | Volkswagen Aktiengesellschaft | Method for displaying information in a head-up display of a vehicle, display system for a vehicle, and vehicle having a display system |
| TWI642972B (zh) * | 2018-03-07 | 2018-12-01 | 和碩聯合科技股份有限公司 | Head-up display system and control method thereof |
| CN109917920B (zh) * | 2019-03-14 | 2023-02-24 | 阿波罗智联(北京)科技有限公司 | In-vehicle projection processing method and apparatus, in-vehicle device, and storage medium |
2020
- 2020-04-30 WO PCT/CN2020/088243, WO2021217575A1 (zh): not active, Ceased
- 2020-04-30 CN CN202080004845.1A, CN112654546B (zh): active, Active
- 2020-04-30 EP EP20933677.5A, EP4134867A4 (en): active, Pending

2022
- 2022-10-28 US US17/976,070, US20230046258A1 (en): active, Pending
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4134867A4 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112654546A (zh) | 2021-04-13 |
| US20230046258A1 (en) | 2023-02-16 |
| EP4134867A1 (en) | 2023-02-15 |
| EP4134867A4 (en) | 2023-05-31 |
| CN112654546B (zh) | 2022-08-02 |
Similar Documents
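| Publication | Publication Date | Title |
|---|---|---|
| US12049170B2 (en) | | Adaptive rearview mirror adjustment method and apparatus |
| CN112654546B (zh) | | Method and apparatus for identifying an object of interest to a user |
| KR102043060B1 (ko) | | Autonomous driving apparatus and vehicle including the same |
| US12192586B2 (en) | | Method for presenting face in video call, video call apparatus, and vehicle |
| EP4137914B1 (en) | | Air gesture-based control method and apparatus, and system |
| WO2020116195A1 (ja) | | Information processing device, information processing method, program, mobile body control device, and mobile body |
| US20230230368A1 (en) | | Information processing apparatus, information processing method, and program |
| WO2022061702A1 (zh) | | Driving reminder method, apparatus, and system |
| KR102077575B1 (ko) | | Vehicle driving assistance apparatus and vehicle |
| CN115042821B (zh) | | Vehicle control method and apparatus, vehicle, and storage medium |
| US12319275B2 (en) | | Mapping method and apparatus, vehicle, readable storage medium, and chip |
| CN114802258A (zh) | | Vehicle control method and apparatus, storage medium, and vehicle |
| CN114842454B (zh) | | Obstacle detection method, apparatus, device, storage medium, chip, and vehicle |
| CN115170630A (zh) | | Map generation method and apparatus, electronic device, vehicle, and storage medium |
| CN114842455A (zh) | | Obstacle detection method, apparatus, device, medium, chip, and vehicle |
| KR20170033612A (ko) | | Vehicle driving assistance apparatus and vehicle including the same |
| CN115675504A (zh) | | Vehicle warning method and related device |
| CN114964294A (zh) | | Navigation method and apparatus, storage medium, electronic device, chip, and vehicle |
| CN115223122A (zh) | | Method and apparatus for determining three-dimensional information of an object, vehicle, and storage medium |
| CN114771514B (zh) | | Vehicle driving control method, apparatus, device, medium, chip, and vehicle |
| EP4296132A1 (en) | | Vehicle control method and apparatus, vehicle, non-transitory storage medium and chip |
| CN115164910B (zh) | | Driving path generation method and apparatus, vehicle, storage medium, and chip |
| CN114954528A (zh) | | Vehicle control method and apparatus, vehicle, storage medium, and chip |
| CN114880408A (zh) | | Scene construction method and apparatus, medium, and chip |
| CN118314538A (zh) | | Image processing method, smart device, and vehicle |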
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20933677; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2020933677; Country of ref document: EP; Effective date: 20221111 |
| | NENP | Non-entry into the national phase | Ref country code: DE |