WO2020155615A1 - VSLAM method, controller and movable device - Google Patents

VSLAM method, controller and movable device

Publication number
WO2020155615A1
Authority
WO
WIPO (PCT)
Prior art keywords
pose
relative pose
related information
visual
key frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/102686
Other languages
English (en)
French (fr)
Inventor
李帅领
迟铭
张一茗
陈震
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qfeeltech Beijing Co Ltd
Original Assignee
Qfeeltech Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qfeeltech Beijing Co Ltd
Priority to JP2021543539A (patent JP2022523312A, ja)
Priority to EP19913780.3A (patent EP3919863A4, en)
Priority to US16/718,560 (patent US10782137B2, en)
Publication of WO2020155615A1 (WO2020155615A1, zh)
Priority to US16/994,579 (patent US11629965B2, en)

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74: Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/50: Depth or shape recovery
    • G06T7/55: Depth or shape recovery from multiple images
    • G06T7/579: Depth or shape recovery from multiple images from motion
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G01C21/1656: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments with passive imaging devices, e.g. cameras
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20: Instruments for performing navigational calculations
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30248: Vehicle exterior or interior
    • G06T2207/30252: Vehicle exterior; Vicinity of vehicle

Definitions

  • The present invention relates to the field of movable devices, and in particular to a VSLAM method, a controller and a movable device.
  • Movable devices are machines that perform work autonomously or semi-autonomously and can be applied in many scenarios.
  • A movable device obtains environmental information through a variety of sensors and responds to that information so that it can complete its assigned task safely, reliably, efficiently and intelligently.
  • Simultaneous Localization and Mapping (SLAM) means that a movable device starts moving from an unknown location in an unknown environment, localizes itself during the movement based on its own pose and the map, and at the same time builds an incremental map based on its own localization, thereby achieving autonomous positioning and navigation.
  • Visual SLAM (VSLAM) means that a movable device uses a visual system to realize autonomous positioning and map creation. It has the advantages of low cost and strong adaptability. In a VSLAM system, visual images and dead reckoning are combined to localize the movable device and build the map.
  • The feature points of the current image are generally first matched with the feature points of pre-created landmarks, and the three-dimensional coordinates of the feature points on the matched landmarks are then calculated.
  • The three-dimensional coordinates of a feature point generally refer to the three-dimensional coordinates, in the camera coordinate system, of the spatial point corresponding to the feature point; these coordinates can also be converted to the global coordinate system. If the three-dimensional coordinates of the feature point are in the global coordinate system, and the origin of the global coordinate system is chosen as the initial position of the movable device, the visual relative pose is the same as the visual absolute pose.
  • The three-dimensional coordinates of the spatial points corresponding to the feature points are generally calculated from two selected frames of images.
  • The two frames involved in the calculation must meet certain conditions before the three-dimensional coordinates of the spatial points can be calculated: for example, the two frames must share enough matching feature points, and the distance in space between the two frames must be within a set range.
  • The calculation of the three-dimensional coordinates of the feature points is prone to failure, which lowers the success rate of creating landmarks and reduces the number of landmarks available for matching in the database.
  • The VSLAM result is then not accurate enough, which affects the final positioning and mapping results.
  • The present invention provides a VSLAM method, a controller and a movable device.
  • A VSLAM method, including:
  • Visual relative pose related information is calculated from the successfully matched image and key frame, wherein the visual relative pose related information includes the visual relative pose, and the visual relative pose is calculated from the two-dimensional coordinates of the feature points that match each other between the successfully matched image and key frame;
  • The absolute pose and map of the movable device are updated based on the visual relative pose related information and the dead reckoning relative pose related information.
  • The key frame includes an absolute pose;
  • The absolute pose is the pose of the movable device in the global coordinate system when the image on which the key frame is based was taken;
  • The map includes the absolute pose of at least one node.
  • The method further includes:
  • The absolute pose in the key frame corresponding to the node is updated.
  • Calculating the visual relative pose related information based on the successfully matched image and key frame includes:
  • The visual relative pose is calculated;
  • Candidate frames are reselected and the subsequent calculations are repeated until the loop ends; the loop ends when the number of reasonable visual relative poses reaches a preset number threshold, or when all successfully matched key frames have been selected;
  • The related information includes the covariance matrix and the identifiers of the two associated nodes.
  • Optionally, the method further includes:
  • Feature points are extracted from the image to obtain the two-dimensional coordinates and descriptors of the feature points;
  • A new key frame is created and stored in the key frame database; the new key frame includes the two-dimensional coordinates and descriptors of the feature points.
  • The key frame further includes an absolute pose, where the absolute pose is the pose of the movable device in the global coordinate system when the image on which the key frame is based was taken, and the method further includes:
  • The absolute pose corresponding to the image is calculated from the absolute pose corresponding to the previous image or key frame and the dead reckoning relative pose for the corresponding time interval.
  • Optionally, the method further includes:
  • The dead reckoning relative pose related information is calculated from the original pose.
  • Updating the absolute pose and map of the movable device according to the visual relative pose related information and the dead reckoning relative pose related information includes:
  • Optionally, the method further includes:
  • The dead reckoning relative pose related information is used to update the absolute pose of the movable device at the latest moment.
  • The two-dimensional coordinates are the coordinates of the feature point in the pixel coordinate system.
  • A controller, including:
  • A processor, and a memory for storing instructions executable by the processor.
  • A movable device, including:
  • A dead reckoning sensor, used to provide the original pose directly, or to provide motion data from which the original pose is calculated;
  • A vision sensor, used to collect images;
  • A controller, connected to the dead reckoning sensor and the vision sensor and used to execute the method according to any item of the first aspect of the embodiments of the present invention.
  • A non-transitory computer-readable storage medium: when instructions in the storage medium are executed by a controller in a movable device, the method of any one of the aspects is executed.
  • A VSLAM device, including:
  • The first receiving module is used to receive the image sent by the vision sensor;
  • The matching module is used to read key frames from a pre-established key frame database and, after the key frames are read, to match the image with the read key frames;
  • The first calculation module is configured to calculate the visual relative pose related information from the successfully matched image and key frame, wherein the visual relative pose related information includes the visual relative pose, and the visual relative pose is calculated from the two-dimensional coordinates of the feature points that match each other between the successfully matched image and key frame;
  • The first update module is used, after the visual relative pose related information is obtained, to update the absolute pose and map of the movable device according to the visual relative pose related information and the dead reckoning relative pose related information.
  • The key frame includes an absolute pose;
  • The absolute pose is the pose of the movable device in the global coordinate system when the image on which the key frame is based was taken;
  • The map includes the absolute pose of at least one node.
  • The device further includes:
  • The second update module is used to update the absolute pose in the key frame corresponding to the node according to the absolute pose of the node in the updated map.
  • The first calculation module is specifically configured to:
  • Calculate the visual relative pose;
  • Reselect candidate frames and repeat the subsequent calculations until the loop ends; the loop ends when the number of reasonable visual relative poses reaches a preset number threshold, or when all successfully matched key frames have been selected;
  • The related information includes the covariance matrix and the identifiers of the two associated nodes.
  • Optionally, the device further includes:
  • The creation module is used to extract feature points from the image when the preset creation conditions are met, obtaining the two-dimensional coordinates and descriptors of the feature points; when the number of extracted feature points is greater than or equal to the preset extraction threshold, a new key frame is created and stored in the key frame database.
  • The new key frame includes the two-dimensional coordinates and descriptors of the feature points.
  • The key frame further includes an absolute pose, where the absolute pose is the pose of the movable device in the global coordinate system when the image on which the key frame is based was taken, and the device further includes:
  • The acquisition module is used to calculate the absolute pose corresponding to the image from the absolute pose corresponding to the previous image or key frame and the dead reckoning relative pose for the corresponding time interval.
  • Optionally, the device further includes:
  • The second receiving module is configured to receive the original pose sent by the dead reckoning sensor, or to receive the motion data sent by the dead reckoning sensor and calculate the original pose from the motion data;
  • The second calculation module is configured to calculate the dead reckoning relative pose related information from the original pose.
  • The first update module is specifically configured to:
  • Optionally, the device further includes:
  • The third update module is used to update the absolute pose of the movable device at the latest moment by using the dead reckoning relative pose related information when the visual relative pose related information cannot be obtained.
  • The two-dimensional coordinates are the coordinates of the feature point in the pixel coordinate system.
  • The various limitations of calculating the above-mentioned three-dimensional coordinates can be avoided and the success rate of calculating the visual relative pose can be improved, thereby improving the accuracy and computing speed of the final positioning and mapping results.
  • Fig. 1 is a schematic structural diagram of a VSLAM system according to an exemplary embodiment
  • Fig. 2 is a schematic structural diagram of a controller according to an exemplary embodiment
  • Fig. 3 is a schematic flowchart of a VSLAM method according to an exemplary embodiment
  • Fig. 4 is a schematic diagram showing a processing flow of data preprocessing according to an exemplary embodiment
  • Fig. 5 is a schematic diagram showing a processing flow of calculating visual relative pose according to an exemplary embodiment
  • Fig. 6 is a schematic diagram showing the absolute pose of a movable device according to an exemplary embodiment
  • Fig. 7 is a schematic diagram showing a process of creating a key frame according to an exemplary embodiment
  • Fig. 8 is a schematic diagram showing a processing flow of data fusion according to an exemplary embodiment
  • Fig. 9 is a schematic structural diagram of a controller according to an exemplary embodiment
  • Fig. 10 is a schematic structural diagram of a VSLAM device according to an exemplary embodiment.
  • The movable devices involved in at least one embodiment of the present invention may be, for example, cleaning robots, companion mobile robots, service mobile robots, industrial inspection intelligent equipment, security robots, unmanned vehicles, drones, and the like.
  • Cleaning robots include, for example, smart sweepers, smart floor cleaners and window cleaning robots; companion mobile robots include smart electronic pets and nanny robots; service mobile robots include reception robots in hotels and meeting places; industrial inspection smart devices include power inspection robots and smart forklifts; security robots include home or commercial smart guard robots.
  • Sensors for movable devices include:
  • Code disc: a digital encoder used to measure angular displacement. It offers strong resolving power, high measurement accuracy and reliable operation, and is one of the most commonly used displacement sensors for measuring shaft rotation angle.
  • Combined with the known size of the matching tire, it can be used for positioning and/or speed measurement of the movable device.
  • IMU (inertial measurement unit):
  • The gyroscope is a device used to detect the angular motion of the movable device.
  • It measures the angular velocity of the movable device; integrating the angular velocity gives the angle of the movable device.
  • A 3-axis gyroscope can be used to calculate the attitude of the movable device in three-dimensional space;
  • The accelerometer is a device used to detect the acceleration of the movable device; integrating the acceleration gives the velocity, and integrating the velocity gives the displacement.
  • Camera: a device that perceives the surrounding environment; it is inexpensive and provides rich information for positioning, mapping and target/obstacle identification.
  • Cameras include monocular, binocular and multi-lens cameras. A monocular camera cannot provide a reference scale, so in practice it often has to work together with other sensors, whereas binocular and multi-lens cameras can provide spatial scale.
  • GPS (Global Positioning System)
  • 2D/3D laser ranging sensors
  • Ultrasonic ranging sensors, etc.
  • Code discs and inertial measurement units are both dead reckoning sensors.
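  • As a minimal sketch of how such dead reckoning sensors can yield a pose, the following Python fragment turns code disc ticks into a travelled distance using the known wheel size and integrates a gyroscope yaw rate into a heading. The function names, the tick resolution and the wheel diameter are illustrative assumptions, not values from the present text.

```python
import math

def wheel_distance(ticks, ticks_per_rev, wheel_diameter):
    """Distance rolled by a wheel, from encoder ticks and the known wheel size."""
    return (ticks / ticks_per_rev) * math.pi * wheel_diameter

def dead_reckon(pose, ticks, yaw_rate, dt, ticks_per_rev=1024, wheel_diameter=0.07):
    """Advance a planar pose (x, y, theta) by one sensor sample.

    ticks_per_rev and wheel_diameter are hypothetical defaults.
    """
    x, y, theta = pose
    d = wheel_distance(ticks, ticks_per_rev, wheel_diameter)
    theta_new = theta + yaw_rate * dt        # integrate angular velocity into angle
    theta_mid = 0.5 * (theta + theta_new)    # heading at the middle of the step
    return (x + d * math.cos(theta_mid), y + d * math.sin(theta_mid), theta_new)

pose = (0.0, 0.0, 0.0)
pose = dead_reckon(pose, ticks=1024, yaw_rate=0.0, dt=0.1)  # one wheel turn, straight ahead
```

  • Because every step adds its own measurement error to the previous pose, a pose obtained this way drifts over time.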
  • VSLAM usually includes feature extraction, image matching, data fusion and so on.
  • Image features can be extracted during feature extraction.
  • Commonly used image features include point features and line features.
  • Point features: there are many ways to extract point features, including Harris, FAST, ORB, SIFT and SURF, as well as point feature extraction methods based on deep learning.
  • ORB uses the FAST algorithm to detect feature points. The method examines the gray values of the pixels on a circle around a candidate point: if enough pixels on the circle differ sufficiently in gray value from the candidate point, the candidate point is considered a feature point, as follows:
  • $$N = \sum_{x \in \mathrm{circle}(p)} \big[\, |I(x) - I(p)| > \varepsilon_d \,\big]$$
  • I(x) is the gray value of any pixel on the circle centered on the candidate point with a set radius
  • I(p) is the gray value of the center of the circle, that is, of the candidate point
  • $\varepsilon_d$ is the threshold of the gray-value difference
  • If N is greater than a given threshold, p is considered a feature point; the threshold is generally three-quarters of the total number of I(x).
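  • The counting test described above can be sketched as follows. This is a simplified illustration that counts differing pixels exactly as the text states; the 16-pixel circle of radius 3, the default threshold values and the function name are assumptions, and production FAST detectors additionally require the differing pixels to form a contiguous arc.

```python
import numpy as np

# Offsets (dx, dy) of the 16 pixels on a radius-3 circle (assumed circle size).
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_feature_point(img, px, py, eps_d=20, ratio=0.75):
    """Return True if enough circle pixels differ from the candidate by more than eps_d."""
    center = int(img[py, px])
    # N: number of circle pixels whose gray value differs from I(p) by more than eps_d
    n = sum(abs(int(img[py + dy, px + dx]) - center) > eps_d for dx, dy in CIRCLE)
    return n > ratio * len(CIRCLE)

# A single bright pixel on a dark background is accepted as a feature point.
img = np.zeros((7, 7), dtype=np.uint8)
img[3, 3] = 255
print(is_feature_point(img, 3, 3))
```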
  • $I_A$ represents the gray value of point A
  • $I_B$ represents the gray value of point B.
  • The final descriptor is: 1011.
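  • A binary descriptor of this kind can be sketched by comparing the gray values of point pairs (A, B) around the feature point and concatenating the resulting bits. The comparison convention (bit 1 when $I_A > I_B$), the sampling pairs and the sample image below are assumptions made for illustration.

```python
def brief_descriptor(img, keypoint, pairs):
    """Build a bit string from gray-value comparisons of point pairs around a keypoint.

    img is indexed as img[row][col]; pairs holds ((ax, ay), (bx, by)) offsets.
    """
    x0, y0 = keypoint
    bits = []
    for (ax, ay), (bx, by) in pairs:
        i_a = img[y0 + ay][x0 + ax]   # gray value of point A
        i_b = img[y0 + by][x0 + bx]   # gray value of point B
        bits.append('1' if i_a > i_b else '0')   # assumed convention
    return ''.join(bits)

img = [[10, 50, 20],
       [90, 40, 30],
       [5, 60, 70]]
pairs = [((-1, 0), (1, 0)), ((0, -1), (0, 1)), ((1, 1), (-1, -1)), ((0, 1), (1, -1))]
print(brief_descriptor(img, (1, 1), pairs))  # prints 1011
```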
  • $\theta$ is the orientation of the feature point
  • I(x,y) is the gray value of the pixel with coordinates (x0+x, y0+y)
  • (x0,y0) is the coordinate of the feature point
  • x and y are the coordinate offsets.
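  • One common way to compute such an orientation is the intensity centroid: the gray values I(x,y) around (x0,y0) are weighted by their offsets x and y, and the orientation is the angle of the resulting centroid. The sketch below assumes this definition and a circular patch; the text itself does not show the formula, so both are assumptions.

```python
import math

def orientation(img, x0, y0, radius):
    """Angle of the intensity centroid of a circular patch centered on (x0, y0)."""
    m10 = 0.0   # sum of x * I(x, y)
    m01 = 0.0   # sum of y * I(x, y)
    for y in range(-radius, radius + 1):
        for x in range(-radius, radius + 1):
            if x * x + y * y <= radius * radius:
                gray = float(img[y0 + y][x0 + x])
                m10 += x * gray
                m01 += y * gray
    return math.atan2(m01, m10)

# A patch whose only bright pixel lies on the +x side of the center.
patch = [[0] * 5 for _ in range(5)]
patch[2][4] = 100
theta = orientation(patch, 2, 2, 2)
```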
  • BoW (Bag of Words):
  • The i-th image is composed of n(i) feature points, that is, it can be expressed by n(i) feature vectors, giving a total of sum(n(i)) feature vectors (i.e. words).
  • The feature vector is represented by the descriptor of the feature point, or, after the descriptor is normalized by the orientation, by the normalized descriptor.
  • The feature vector can be designed according to the problem at hand; common features include color histograms, SIFT, LBP, etc.
  • The feature vectors obtained in the previous step are clustered (clustering methods such as K-means can be used) to obtain K cluster centers, and the cluster centers are used to construct a codebook.
  • Each "word" of an image is assigned by nearest-neighbor search to a word type in the codebook, which gives the BoW representation of the image with respect to the codebook.
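  • The nearest-neighbor assignment step can be sketched as below. The codebook is assumed to have been produced beforehand (for example by K-means); the function name, the real-valued descriptors and the normalization of the histogram are illustrative choices rather than details given in the text.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codebook word and return a normalized histogram."""
    # Squared Euclidean distance between every descriptor (row) and every cluster center.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                      # index of the nearest word
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

codebook = np.array([[0.0, 0.0], [10.0, 10.0]])    # K = 2 cluster centers (toy values)
descriptors = np.array([[1.0, 0.0], [0.0, 1.0], [9.0, 10.0], [1.0, 1.0]])
print(bow_histogram(descriptors, codebook))        # three descriptors fall on word 0, one on word 1
```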
  • The tf-idf model is widely used in practical applications such as search engines.
  • The main idea of the tf-idf model is: if the word w appears frequently in an image d and rarely appears in other images, the word w is considered to have good distinguishing ability and is suitable for distinguishing the image d from other images.
  • The model mainly includes two factors:
  • From tf and idf, the tf-idf model calculates, for each image d and a query string q composed of keywords w[1]...w[k], a weight that indicates the degree of matching between the query string q and the image d:
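  • A minimal sketch of such a weight is given below. It assumes the common definitions tf = (occurrences of w in d) / (number of words in d) and idf = log(number of images / number of images containing w), summed over the query words; the exact formula used in the present text is not shown, so these definitions are assumptions.

```python
import math
from collections import Counter

def tf_idf_score(query_words, image_words, all_images):
    """Matching weight between a query and one image under assumed tf and idf definitions."""
    counts = Counter(image_words)
    n_images = len(all_images)
    score = 0.0
    for w in query_words:
        tf = counts[w] / len(image_words)                  # term frequency in this image
        df = sum(1 for words in all_images if w in words)  # images containing w
        idf = math.log(n_images / df) if df else 0.0       # inverse document frequency
        score += tf * idf
    return score

# Three images described by visual-word indices; word 1 occurs only in the first image.
images = [[1, 1, 2], [2, 3], [3, 3, 4]]
scores = [tf_idf_score([1], words, images) for words in images]
```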
  • Kalman filtering: the process equation and observation equation used in Kalman filtering are:
  • $$X_k = A X_{k-1} + B u_k + w_{k-1}$$
  • $$Z_k = H X_k + v_k$$
  • $X_k$ is the state vector of the system
  • A and B are the parameters of the process equation
  • $u_k$ is the input of the system
  • $w_{k-1}$ is the process noise
  • $Z_k$ is the observation vector
  • H is the parameter of the observation equation
  • $v_k$ is the observation noise.
  • The first formula indicates that each $X_k$ can be represented by a linear stochastic equation: any state vector $X_k$ is a linear combination of the state vector of the previous state, the input $u_k$ and the process noise $w_{k-1}$.
  • The second formula indicates that any observation vector is a linear combination of the current state vector and the observation noise. The noise is generally assumed to obey a Gaussian distribution by default.
  • The current state vector can be predicted from the state vector of the system at the previous moment and the current input, as follows:
  • $$\hat{x}_{k|k-1} = A\,\hat{x}_{k-1|k-1} + B\,u_k, \qquad P_{k|k-1} = A\,P_{k-1}\,A^{T} + Q$$
  • $\hat{x}_{k-1|k-1}$ is the estimate of the state vector at time k-1
  • $u_k$ is the input at time k
  • $\hat{x}_{k|k-1}$ is the prediction of the state vector at time k, and $P_{k|k-1}$ is the prediction of the covariance matrix of the state vector at time k
  • $P_{k-1}$ is the covariance matrix of the state vector at time k-1
  • Q is the variance of the process noise
  • A and B are the parameters of the process equation
  • The estimate of the current state vector can be obtained from the current observation vector and the prediction of the current state vector, as follows:
  • $$K_k = P_{k|k-1} H^{T} \left(H P_{k|k-1} H^{T} + R\right)^{-1}, \qquad \hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \left(z_k - H \hat{x}_{k|k-1}\right), \qquad P_k = \left(I - K_k H\right) P_{k|k-1}$$
  • $K_k$ is the Kalman gain
  • R is the variance of the observation noise
  • $z_k$ is the observation vector at time k
  • H is the parameter of the observation equation.
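  • The prediction and update steps can be sketched with NumPy as follows. This is the standard linear Kalman filter written with the symbols defined above (A, B, H, Q, R, K); the function names and the one-dimensional example values are assumptions for illustration.

```python
import numpy as np

def kf_predict(x, P, A, B, u, Q):
    """Predict the state and its covariance from the previous estimate and the input."""
    x_pred = A @ x + B @ u
    P_pred = A @ P @ A.T + Q
    return x_pred, P_pred

def kf_update(x_pred, P_pred, z, H, R):
    """Correct the prediction with the observation z."""
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x_pred + K @ (z - H @ x_pred)
    P = (np.eye(len(x)) - K @ H) @ P_pred
    return x, P, K

# One-dimensional toy example.
A = B = H = np.array([[1.0]])
Q = np.array([[0.0]])
R = np.array([[1.0]])
x_pred, P_pred = kf_predict(np.array([0.0]), np.array([[1.0]]), A, B, np.array([1.0]), Q)
x_est, P_est, K = kf_update(x_pred, P_pred, np.array([3.0]), H, R)
```

  • With equal prediction and observation variances the gain is 0.5, so the estimate lands halfway between the prediction and the observation.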
  • In practice there are many improved methods based on the Kalman filter, such as the extended Kalman filter, the unscented Kalman filter, the iterated Kalman filter and the multi-state Kalman filter. There are also particle filters and so on.
  • The filtering method is based on recursion, while the nonlinear optimization method is based on iteration.
  • The nonlinear optimization method is introduced below.
  • H is the second derivative (the Hessian matrix). One can keep either the first-order or the second-order term of the Taylor expansion; the corresponding solution methods are the first-order and second-order methods. If only the first-order term is kept, the increment is $\Delta x = -J^{T}$, where J is the first derivative (Jacobian) of the objective function. If the second-order term is also kept, the increment is the solution of $H\,\Delta x = -J^{T}$.
  • This method is called Newton's method.
  • Other methods include the Gauss-Newton method, the Levenberg-Marquardt method and so on.
  • Sliding-window optimization or incremental optimization (iSAM) can also be used.
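  • As an illustration of the iterative approach, the sketch below applies the Gauss-Newton method, which approximates the Hessian by $J^{T}J$ of the residuals, to a small curve-fitting problem. The model y = exp(a x + b), the noise-free data and the iteration count are assumptions chosen only to keep the example self-contained; they are unrelated to the pose optimization of the present invention.

```python
import numpy as np

def gauss_newton(x, y, params, iterations=20):
    """Fit y = exp(a * x + b) by Gauss-Newton iterations on the residuals."""
    a, b = params
    for _ in range(iterations):
        pred = np.exp(a * x + b)
        r = y - pred                                  # residual vector
        J = np.column_stack((-x * pred, -pred))       # d r / d(a, b)
        delta = np.linalg.solve(J.T @ J, -J.T @ r)    # normal equations, H ~ J^T J
        a, b = a + delta[0], b + delta[1]
    return a, b

x = np.linspace(0.0, 1.0, 20)
y = np.exp(0.5 * x + 0.2)            # noise-free data generated with a = 0.5, b = 0.2
a_est, b_est = gauss_newton(x, y, (0.0, 0.0))
```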
  • Key frame: stored in the key frame database; each key frame is constructed from an image collected by the vision sensor. Unless otherwise stated, the vision sensor is mounted on the movable device.
  • The vision sensor may be, for example, a camera or a video camera.
  • Each key frame includes the following set of data: the absolute pose of the movable device in the global coordinate system when the image on which the key frame is based was taken, and the two-dimensional coordinates and descriptors of the feature points in that image.
  • The absolute pose represents position and attitude; the position is represented by coordinates.
  • In two-dimensional space, the absolute pose can be represented by three parameters (x, y, θ), where (x, y) represents the position of the movable device and θ represents its attitude. In three-dimensional space, the position of the movable device is represented by (x, y, z) in the Cartesian coordinate system or (α, β, r) in the spherical coordinate system; the attitude of the movable device is represented by the orientation of the movable device or of its camera, usually as angles, for example (φ, ψ, θ) in three-dimensional space. These three angles are usually called the pitch angle, roll angle and yaw angle.
  • The key frame does not include the three-dimensional coordinates of the spatial points corresponding to the feature points. Specifically, it includes neither the three-dimensional coordinates of the spatial points in the global coordinate system nor their three-dimensional coordinates in the camera coordinate system.
  • The above-mentioned two-dimensional coordinates refer to the two-dimensional coordinates of the feature point in the pixel coordinate system.
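  • A key frame with the contents listed above could be modelled as in the following sketch. The class and field names, the planar (x, y, θ) pose and the dictionary-backed database are hypothetical; the one property taken from the text is that a key frame stores an absolute pose, 2D pixel coordinates and descriptors, and no 3D point coordinates.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class KeyFrame:
    frame_id: int
    absolute_pose: Tuple[float, float, float]                 # (x, y, theta), global frame
    points_2d: List[Tuple[float, float]] = field(default_factory=list)  # pixel coordinates
    descriptors: List[str] = field(default_factory=list)      # one descriptor per point

    def __post_init__(self):
        if len(self.points_2d) != len(self.descriptors):
            raise ValueError("one descriptor per feature point is required")

class KeyFrameDatabase:
    """Holds zero, one or more key frames, indexed by identifier."""
    def __init__(self):
        self._frames = {}

    def add(self, key_frame):
        self._frames[key_frame.frame_id] = key_frame

    def __len__(self):
        return len(self._frames)

db = KeyFrameDatabase()
kf = KeyFrame(0, (0.0, 0.0, 0.0), [(10.0, 20.0)], ["1011"])
db.add(kf)
```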
  • Absolute pose of the movable device: the absolute pose of the movable device in the global coordinate system, that is, its position and attitude in the global coordinate system.
  • Absolute pose of a node: the data of the node stored in the controller; its value is consistent with the absolute pose of the movable device when the node was created.
  • Original pose: obtained from the data provided by the dead reckoning sensor.
  • The dead reckoning sensor can directly provide the original pose, or the dead reckoning sensor can provide motion data and the controller calculates the original pose from the motion data.
  • The original pose is also an absolute quantity, as distinct from a relative quantity, and can be understood as the absolute pose of the movable device before optimization. Unless otherwise specified, the present invention is described using the example in which the dead reckoning sensor directly provides the original pose.
  • Global coordinate system: a coordinate system fixed in the environment.
  • Dead reckoning relative pose: the relative quantity between the absolute pose of the movable device at a first moment and its absolute pose at a second moment, as provided by the dead reckoning sensor.
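  • For planar poses (x, y, θ), the relative quantity between two absolute poses, and the inverse operation of applying a relative pose to an absolute pose, can be sketched as below. Expressing the relative pose in the frame of the reference pose is an assumed convention, and the function names are hypothetical.

```python
import math

def wrap_angle(a):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

def relative_pose(pose_ref, pose_cur):
    """Pose of pose_cur expressed in the frame of pose_ref: (dx, dy, dtheta)."""
    xr, yr, tr = pose_ref
    xc, yc, tc = pose_cur
    dxw, dyw = xc - xr, yc - yr
    c, s = math.cos(tr), math.sin(tr)
    return (c * dxw + s * dyw, -s * dxw + c * dyw, wrap_angle(tc - tr))

def compose(pose, rel):
    """Apply a relative pose to an absolute pose, giving a new absolute pose."""
    x, y, t = pose
    dx, dy, dt = rel
    c, s = math.cos(t), math.sin(t)
    return (x + c * dx - s * dy, y + s * dx + c * dy, wrap_angle(t + dt))

ref = (1.0, 2.0, math.pi / 2)
cur = (1.0, 3.0, 2.0)
rel = relative_pose(ref, cur)     # one unit straight ahead of ref, plus a small turn
back = compose(ref, rel)          # recovers cur
```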
  • Visual relative pose: the relative quantity between the absolute pose of the movable device at the first moment and its absolute pose at the second moment, obtained from the current image taken by the vision sensor and the key frame that successfully matches the current image.
  • The visual relative pose is calculated from the two-dimensional coordinates of the feature points that match each other between the successfully matched image and key frame. It depends only on the vision sensor and not on the dead reckoning sensor.
  • The aforementioned absolute pose at the first moment refers to the absolute pose of the movable device when the vision sensor collects the current image
  • The absolute pose at the second moment refers to the absolute pose included in the key frame that matches the current image
  • The relative quantities in the dead reckoning relative pose and in the visual relative pose both include a relative quantity in position and a relative quantity in attitude.
  • The relative quantity in attitude of the visual relative pose has the same form as that of the dead reckoning relative pose
  • The relative quantity in position of the visual relative pose differs in form from the relative quantity in position of the dead reckoning relative pose.
  • The dimension of the relative quantity in position of the visual relative pose is one less than the dimension of the relative quantity in position of the dead reckoning relative pose.
  • For example, in two-dimensional space the dead reckoning relative pose is expressed as (Δx, Δy, Δθ), and in the present invention the visual relative pose is expressed as (Δα, Δθ).
  • In three-dimensional space, the visual relative pose of the present invention has 5 parameter values (Δα, Δβ, Δφ, Δψ, Δθ), while the dead reckoning relative pose has 6 parameter values (Δα, Δβ, Δr, Δφ, Δψ, Δθ). The comparison shows that the relative quantity of the visual relative pose has one dimension fewer than that of the dead reckoning relative pose.
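  • The reduction by one dimension can be illustrated in the planar case. The sketch below assumes that Δα is the direction angle of the translation (Δx, Δy), so that the length of the translation is dropped; the text above does not show the defining formula, so this reading and the function name are assumptions.

```python
import math

def to_visual_form(dx, dy, dtheta):
    """Reduce (dx, dy, dtheta) to (dalpha, dtheta): keep only the translation direction."""
    if dx == 0.0 and dy == 0.0:
        raise ValueError("translation direction is undefined for zero translation")
    return (math.atan2(dy, dx), dtheta)

# Two translations along the same direction but of different length map to the same result.
short = to_visual_form(1.0, 1.0, 0.1)
long_ = to_visual_form(2.0, 2.0, 0.1)
```

  • Under this reading, translations that differ only in length are indistinguishable, which is the scale information that the one-dimension-smaller form leaves out.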
  • Fig. 1 is a schematic structural diagram of a VSLAM system according to an exemplary embodiment.
  • The system includes a dead reckoning sensor 11, a vision sensor 12 and a controller 13.
  • The dead reckoning sensor 11 is used to generate the original pose, or the motion data used to calculate the original pose.
  • The vision sensor 12 includes a camera, an infrared imager, etc., for collecting images.
  • The controller 13 is used for positioning and/or mapping based on the original pose and the image from the vision sensor. It can be understood that the pose output by the controller 13 in the figure refers to the updated absolute pose of the movable device.
  • The VSLAM system can be applied to movable devices, for example in the field of mobile robots, so the system can be a component of a movable device.
  • The controller 13 may be hardware, software, firmware, or a combination thereof.
  • The motion data of the dead reckoning sensor includes the displacement data, speed data, acceleration data, angle data, angular velocity data, etc. of the movable device.
  • The controller can calculate the original pose from the motion data of the dead reckoning sensor, or the dead reckoning sensor can itself calculate the original pose from the motion data and provide it to the controller. Unless otherwise specified, the present invention takes as an example the case in which the dead reckoning sensor provides the original pose to the controller.
  • The motion data, or the original pose generated from it, contains accumulated error.
  • The original pose therefore needs to be corrected.
  • The correction is made based on the image collected by the vision sensor.
  • The vision sensor can photograph the surrounding environment to obtain images according to the collection period set by the controller.
  • The image collected by the vision sensor and the original pose collected by the dead reckoning sensor are transferred to the controller; the controller corrects the original pose according to the image, and then performs positioning and mapping.
  • The original pose is corrected based on the image collected by the vision sensor to realize positioning and mapping; when applied to a movable device, this realizes the VSLAM of the movable device.
  • Fig. 2 is a schematic diagram showing the structure of a controller according to an exemplary embodiment.
  • the controller can be divided into a data preprocessing module 21 and a data fusion module 22.
  • the data preprocessing module 21 receives the original pose from the dead reckoning sensor and the image from the visual sensor and, after processing them, obtains the dead reckoning relative pose, the visual relative pose, the new key frame identifier, the node identifier, and their related information, which together constitute a preprocessing result. The node identifier includes a new node identifier and/or an associated node identifier.
  • the data fusion module 22 performs positioning and mapping based on the preprocessing result. It can be understood that the dead reckoning relative pose and its related information can also be calculated by the data fusion module. In that case, the data preprocessing module provides the original pose to the data fusion module and does not calculate the dead reckoning relative pose and its related information; the data fusion module performs that calculation.
  • the data fusion module can record the absolute pose of each node. After the absolute poses of the nodes are optimized based on the visual relative pose, the optimized absolute pose of the current node is used as the current positioning result, which completes the positioning. The nodes include key frame nodes and pose nodes; the optimized absolute poses of the key frame nodes recorded in the back end can be understood as map information, which completes the mapping.
  • the key frame database is used to store key frames. Depending on the current situation, the number of key frames stored in the key frame database can be zero, one or more.
  • the key frame is created based on the image collected by the vision sensor. For the specific content of the key frame, please refer to the above term description.
  • Fig. 3 is a schematic flowchart of a VSLAM method according to an exemplary embodiment.
  • the positioning method can be executed by the controller.
  • the VSLAM method includes:
  • S301 Receive an image sent by the vision sensor.
  • S302 Read key frames from a pre-established key frame database, and, after the key frames are read, match the image with the read key frames.
  • S303 Calculate visual relative pose related information according to the successfully matched image and key frame, where the visual relative pose related information includes the visual relative pose, and the visual relative pose is calculated from the two-dimensional coordinates of the mutually matched feature points between the successfully matched image and key frame.
  • the dead-reckoning relative pose-related information is used to update the absolute pose of the movable device at the latest moment.
  • the method may also include:
  • the dead reckoning relative pose related information is calculated according to the original pose.
  • the matching of images and key frames and the calculation of the visual relative pose can be executed by the data preprocessing module of the controller, and the updating of the absolute pose and/or map of the movable device according to the preprocessing result can be executed by the data fusion module of the controller. The specific process of data preprocessing is shown in Figure 4, and the specific process of data fusion is shown in Figure 8.
  • the visual relative pose is calculated from the two-dimensional coordinates of the matching feature points rather than from the three-dimensional coordinates of the spatial points corresponding to the feature points. This avoids the various limitations of calculating those three-dimensional coordinates and improves the success rate of calculating the visual relative pose, thereby improving the accuracy and computing speed of the final positioning and mapping results.
  • the data preprocessing module mainly deals with visual information. The operations for the dead reckoning relative pose and its related information are relatively simple, so they are not repeated here; they can be realized by the methods adopted in related technologies.
  • the dead reckoning relative pose and related information can also be calculated by the data fusion module based on the original pose.
  • Fig. 4 is a schematic diagram showing a processing flow of data preprocessing according to an exemplary embodiment.
  • the processing flow of data preprocessing includes:
  • S401 Receive an image from the vision sensor.
  • the key frame database is used to store key frames. At the current moment, there may be key frames in the key frame database, or it may be empty.
  • the image data of the key frame includes the feature point information of the key frame.
  • the feature points of the image can be extracted to obtain the feature point information of the image, and then the feature point matching is performed according to the feature point information of the image and the feature point information of the key frame.
  • the matching threshold is a preset threshold.
  • the aforementioned feature point information includes: the two-dimensional coordinates of the feature point in the pixel coordinate system and the descriptor of the feature point.
  • feature point extraction can be achieved by various related technologies, for example the scale-invariant feature transform (SIFT) algorithm, the Speeded Up Robust Features (SURF) algorithm, or the Oriented FAST and Rotated BRIEF (ORB) algorithm, which combines fast feature point extraction with a descriptor.
  • after the feature point information of the image and of the key frame is acquired, the feature point matching operation is performed.
  • the feature point matching operation can also be implemented by various related technologies. For example, a search range is determined according to the original pose corresponding to the image and the two-dimensional coordinates of the feature points; within the search range, the vector distance is calculated from the descriptors, and the matching feature points are determined according to the vector distance.
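The matching step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `match_features`, the parameters `radius` and `max_dist`, and the use of Euclidean descriptor distance are assumptions. The search window here is simply centered on the feature's own pixel position, whereas the text derives the search range from the original pose as well.

```python
import numpy as np

def match_features(pts_img, desc_img, pts_kf, desc_kf, radius=40.0, max_dist=0.7):
    """Match image features to key-frame features.

    pts_*  : (n, 2) pixel coordinates
    desc_* : (n, d) descriptors
    Returns a list of (image_index, keyframe_index) pairs.
    """
    matches = []
    for i, (p, d) in enumerate(zip(pts_img, desc_img)):
        # Search range (assumed): key-frame points within `radius` pixels of p.
        near = np.flatnonzero(np.linalg.norm(pts_kf - p, axis=1) <= radius)
        if near.size == 0:
            continue
        # Vector distance between descriptors inside the search range.
        dists = np.linalg.norm(desc_kf[near] - d, axis=1)
        best = int(np.argmin(dists))
        if dists[best] <= max_dist:
            matches.append((i, int(near[best])))
    return matches
```

A key frame would then count as successfully matched when the number of returned pairs reaches the preset matching threshold.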
  • S405 Determine whether there is at least one successfully matched key frame, if yes, execute S406, otherwise, execute S407.
  • a different key frame can be read from the key frame database each time and matched with the image, until all the key frames in the key frame database have been matched with the image, so as to determine whether there is at least one successfully matched key frame.
  • the visual relative pose can be calculated according to the two-dimensional coordinates of the matched feature points of the successfully matched image and the key frame.
  • the visual relative pose can be reasonable or unreasonable. When the visual relative pose is reasonable, the covariance matrix of the visual relative pose and the two node identifiers associated with the visual relative pose are also obtained. Therefore, when the visual relative pose is reasonable, the visual relative pose and its covariance matrix, the current node identifier, and the associated node identifier are put into the preprocessing result as the visual relative pose related information; when the calculated visual relative pose is unreasonable, the preprocessing result does not include visual relative pose related information.
  • the above two node identifiers include the current node identifier and the associated node identifier, where the image identifier and the key frame identifier corresponding to the visual relative pose serve as the current node identifier and the associated node identifier, respectively.
  • each image can be pre-configured with an image identifier. Since a key frame is created based on an image, the identifier of the key frame can be the same as the image identifier of the image from which it was created. Therefore, the above-mentioned current node identifier can be the image identifier of the current image, and the above-mentioned associated node identifier can be the image identifier of a key frame that successfully matches the image and from which a reasonable visual relative pose can be calculated.
  • the current node identifier is the first identifier
  • the associated node identifier is the second identifier
  • the visual relative pose and dead reckoning relative pose are calculated in the data preprocessing module, which can be provided to the data fusion module for positioning and/or mapping.
  • the problems caused by the calculation of three-dimensional coordinates can be avoided, so that the constraint information provided by the vision can be obtained more simply, accurately and quickly.
  • Fig. 5 is a schematic diagram showing a processing flow for calculating a visual relative pose according to an exemplary embodiment.
  • the process of calculating the visual relative pose includes:
  • S501 Sort the key frames that are successfully matched.
  • the matched key frames can be sorted according to the similarity between the successfully matched key frames and the image.
  • the similarity can be calculated according to the bag of words method.
  • for example, a bag of words is first trained; then, according to the trained bag of words, the image feature vector of the image and the image feature vector of each key frame are generated, and the distance between the image feature vector of the image and that of the key frame is calculated.
  • the key frames can then be sorted by this distance in ascending order.
  • when the bag of words is trained, a large number of pre-collected feature point descriptors can be clustered into a fixed number of categories. Each category is called a word, and the inverse document frequency (IDF) of each word, calculated by statistical methods, is used as the weight of the word.
  • the bag of words is composed of the words and their weights.
  • when an image feature vector is generated according to the bag of words, the length of the vector is the number of words in the bag of words, and each element of the vector is the term frequency-inverse document frequency (TF-IDF) of the corresponding word in the current image.
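The bag-of-words ranking described above might look like the sketch below. It assumes that descriptors have already been quantized into integer word indices (the clustering step is omitted), and the function names, the L2 normalization, and the Euclidean distance are illustrative choices rather than details from the source.

```python
import numpy as np

def idf_weights(word_ids_per_image, n_words):
    """IDF per word: log(N / number of training images containing the word)."""
    n_images = len(word_ids_per_image)
    doc_freq = np.zeros(n_words)
    for ids in word_ids_per_image:
        doc_freq[np.unique(ids)] += 1
    return np.log(n_images / np.maximum(doc_freq, 1))

def tfidf_vector(word_ids, idf):
    """Image feature vector with one TF-IDF entry per word, L2-normalized."""
    counts = np.bincount(word_ids, minlength=len(idf)).astype(float)
    tf = counts / max(counts.sum(), 1.0)
    v = tf * idf
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def sort_keyframes(image_vec, keyframe_vecs):
    """Key-frame indices sorted by ascending distance to the image vector."""
    d = [float(np.linalg.norm(image_vec - kv)) for kv in keyframe_vecs]
    return sorted(range(len(d)), key=lambda i: d[i])
```

The sorted index list can be used directly to fill the queue of successfully matched key frames.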
  • S502 Determine whether the queue is empty, if it is empty, execute S507, otherwise execute S503.
  • the queue is used to store the sorted key frames that are successfully matched.
  • the successfully matched key frames are taken out of the queue in order of similarity, each serving as a candidate frame; once a key frame is taken out, it is removed from the queue.
  • S504 Calculate the visual relative pose from the two-dimensional coordinates of the feature points of the image and the two-dimensional coordinates of the feature points of the candidate frame, using the principle of epipolar geometry.
  • for example, the fundamental matrix is obtained according to the seven-point method or the eight-point method in epipolar geometry.
  • with the internal parameters of the camera known, the fundamental matrix can be decomposed by matrix decomposition to obtain the visual relative pose.
  • alternatively, if the essential matrix is obtained with the five-point method, the internal parameters of the camera are not needed for the decomposition, and the visual relative pose can likewise be obtained by performing matrix decomposition on the essential matrix.
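As an illustration of the epipolar-geometry route, the sketch below estimates the fundamental matrix with the normalized eight-point method and decomposes an essential matrix into its four candidate (rotation, translation direction) pairs. It is a textbook-style sketch under assumed names (`fundamental_eight_point`, `decompose_essential`), not the patent's own procedure. The translation is recovered only as a unit direction, and choosing among the four candidates is left out; one option (an assumption) is to pick the candidate closest to the dead reckoning relative pose.

```python
import numpy as np

def _normalize(pts):
    """Hartley normalization: zero mean, mean distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0, -scale * mean[0]],
                  [0, scale, -scale * mean[1]],
                  [0, 0, 1]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def fundamental_eight_point(p1, p2):
    """Estimate F such that x2^T F x1 = 0 from >= 8 pixel correspondences."""
    x1, T1 = _normalize(p1)
    x2, T2 = _normalize(p2)
    # Each row holds the coefficients of the 9 entries of F (row-major).
    A = np.column_stack([x2[:, 0:1] * x1, x2[:, 1:2] * x1, x1])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt      # enforce rank 2
    return T2.T @ F @ T1                         # undo the normalization

def decompose_essential(E):
    """Return the four (R, t) candidates; t is a unit direction (scale unknown)."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    t = U[:, 2]
    Ra, Rb = U @ W @ Vt, U @ W.T @ Vt
    return [(Ra, t), (Ra, -t), (Rb, t), (Rb, -t)]
```

With a known internal parameter matrix K, the essential matrix follows from the fundamental matrix as `E = K.T @ F @ K` before decomposition.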
  • the pose of the movable device can be expressed as (x, y, θ)
  • the visual relative pose to be calculated can be expressed as: (Δα, Δθ)
  • K is the internal parameter matrix of the camera.
  • the visual relative pose of the movable device can also be calculated when it moves in three-dimensional space. In that case the pose of the movable device can be expressed as (x, y, z, φ, ψ, θ), and the visual relative pose to be calculated can be expressed as: (Δα, Δβ, Δφ, Δψ, Δθ).
  • K is the internal parameter matrix of the camera.
  • equation (2) can be expressed as:
  • (K, Θ1, ..., Θi, ..., Θn), where n is the number of groups of mutually matched feature points.
  • the covariance matrix of the visual relative pose can be calculated by the following formula:
  • Σ(X) is the covariance matrix of the visual relative pose
  • Σ(Θ) is an empirical value, which is related to camera parameters and sensor noise.
  • the calculation method of the covariance matrix described above is only an example. The covariance matrix can also be approximated from experience.
  • for example, the covariance matrix is set according to the number of matching points: the more matching points, the smaller the values of its elements are set, and vice versa.
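The experience-based approximation above can be sketched very simply. The inverse-proportional scaling, the name `heuristic_covariance`, and the default `base_sigma` are assumptions chosen only to illustrate "more matching points, smaller covariance".

```python
import numpy as np

def heuristic_covariance(n_matches, dim=2, base_sigma=0.5):
    """Diagonal covariance whose variances shrink as 1 / n_matches (assumed rule)."""
    if n_matches <= 0:
        raise ValueError("need at least one matched feature point")
    return np.eye(dim) * (base_sigma ** 2 / n_matches)
```

Here `dim` would be 2 for the planar relative pose and 5 for the three-dimensional case described above.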
  • for example, the dead reckoning relative pose is calculated with time t1 as the starting point and time t2 as the end point.
  • the initial pose at the starting point is set to 0, and the data of the dead reckoning sensor is integrated to obtain the dead reckoning relative pose.
  • the covariance of the dead reckoning relative pose can be calculated according to the principle of error (variance) propagation, and the entire calculation can follow the process equation of the Kalman filter.
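A possible sketch of this integration for a planar device is shown below. It assumes that the dead reckoning sensor delivers body-frame increments (dx, dy, dθ) between t1 and t2 and that each increment carries independent noise with covariance `Q`; both the function name and the noise values are hypothetical.

```python
import numpy as np

def integrate_relative_pose(increments, q_diag=(1e-4, 1e-4, 1e-5)):
    """Integrate body-frame increments (dx, dy, dtheta) from a zero start pose.

    Returns the relative pose (x, y, theta) and its 3x3 covariance.
    """
    pose = np.zeros(3)
    P = np.zeros((3, 3))
    Q = np.diag(q_diag)                     # per-step sensor noise (assumed)
    for dx, dy, dth in increments:
        c, s = np.cos(pose[2]), np.sin(pose[2])
        # Jacobians of the motion model w.r.t. the pose and the increment.
        Fx = np.array([[1.0, 0.0, -s * dx - c * dy],
                       [0.0, 1.0,  c * dx - s * dy],
                       [0.0, 0.0, 1.0]])
        G = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
        # Covariance propagation, as in the Kalman filter process equation.
        P = Fx @ P @ Fx.T + G @ Q @ G.T
        pose = pose + np.array([c * dx - s * dy, s * dx + c * dy, dth])
    return pose, P
```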
  • S505 Determine whether the visual relative pose is reasonable, if yes, execute S506, otherwise, repeat S502 and subsequent steps.
  • reasonable conditions can be preset. When the visual relative pose meets the preset reasonable conditions, it indicates that the visual relative pose is reasonable, otherwise it is unreasonable.
  • the preset reasonable conditions include, for example, that the image reprojection error calculated according to the visual relative pose is less than a preset error value. Specifically, after the visual relative pose is calculated, it is substituted into the above-mentioned multiple sets of equation (1); if the difference between the overall value of these multiple sets of equations and 0 is less than a threshold, the calculated visual relative pose is reasonable.
  • the above-mentioned overall value is calculated from the multiple sets of equations, for example as their average value or by another calculation method.
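One way to picture this check is shown below. Equation (1) is not reproduced in this text, so the sketch assumes it is the standard epipolar constraint x2^T F x1 = 0 evaluated for each matched pair, with the mean absolute residual as the overall value; the function names and the default threshold are hypothetical.

```python
import numpy as np

def epipolar_residuals(F, p1, p2):
    """|x2^T F x1| for each matched pair; p1 and p2 are (n, 2) point arrays."""
    x1 = np.column_stack([p1, np.ones(len(p1))])
    x2 = np.column_stack([p2, np.ones(len(p2))])
    return np.abs(np.einsum("ni,ij,nj->n", x2, F, x1))

def is_reasonable(F, p1, p2, threshold=1e-3):
    """Accept the pose when the mean residual is close enough to 0."""
    return float(np.mean(epipolar_residuals(F, p1, p2))) < threshold
```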
  • the preset reasonable conditions can also be based on the current covariance matrix. For example, if the current covariance matrix indicates that the current visual relative pose is credible, the visual relative pose is reasonable.
  • the covariance matrix is determined based on the dead reckoning sensor noise and the abnormal behavior of the movable device, such as slipping.
  • S506 Determine whether the number of reasonable visual relative poses reaches the preset number threshold, if yes, execute S507, otherwise, repeat S502 and subsequent steps.
  • when the number is less than the preset number threshold, the threshold has not been reached.
  • when the number is equal to the preset number threshold, the threshold has been reached.
  • S507 For each reasonable visual relative pose, obtain its visual relative pose related information and add it to the preprocessing result.
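The loop formed by S502 to S507 can be summarized in a short sketch. The callables `compute_pose` and `is_reasonable` stand in for S504 and S505 and are placeholders supplied by the caller, not functions defined by the source.

```python
from collections import deque

def collect_reasonable_poses(sorted_keyframes, compute_pose, is_reasonable, max_count):
    """Walk the sorted key frames until enough reasonable poses are found."""
    queue = deque(sorted_keyframes)
    results = []
    # S502 / S506: stop when the queue is empty or the number threshold is reached.
    while queue and len(results) < max_count:
        candidate = queue.popleft()          # S503: take out the next candidate frame
        pose = compute_pose(candidate)       # S504: visual relative pose
        if is_reasonable(pose):              # S505: keep only reasonable poses
            results.append((candidate, pose))
    return results                           # input to S507
```

An empty return value corresponds to the case in which no visual relative pose related information enters the preprocessing result.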
  • the visual relative pose related information includes: the visual relative pose, the covariance matrix of the visual relative pose, and the two node identifiers associated with the visual relative pose.
  • for the related content, refer to the related content in S406, which is not detailed here.
  • there can be one or more reasonable visual relative poses
  • the preprocessing result can include one or more groups of visual relative pose related information, each group corresponding to one reasonable visual relative pose.
  • each group of information may specifically include the visual relative pose and its covariance matrix, the current node identifier, and the associated node identifier. Further, if there are multiple reasonable visual relative poses, the current node identifier corresponding to all of them is the same, so the current node identifier may either be included in each group of information, or be left out of each group and shared by the multiple groups. For example, if there are two reasonable visual relative poses, the preprocessing result can take either of the following two forms:
  • Form 1: the first visual relative pose, the covariance matrix of the first visual relative pose, the current node identifier, the first associated node identifier; the second visual relative pose, the covariance matrix of the second visual relative pose, the current node identifier, the second associated node identifier; or,
  • Form 2: the current node identifier; the first visual relative pose, the covariance matrix of the first visual relative pose, the first associated node identifier; the second visual relative pose, the covariance matrix of the second visual relative pose, the second associated node identifier.
  • if there is no reasonable visual relative pose, the preprocessing result does not include visual relative pose related information.
  • if the dead reckoning relative pose related information is calculated by the data preprocessing module, the preprocessing result includes the dead reckoning relative pose related information; otherwise it includes the original pose.
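For illustration, a preprocessing result in which the groups share one current node identifier could be modeled as below. All class and field names are hypothetical; the source only lists which items the result contains.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class VisualConstraint:
    """One group of visual relative pose related information."""
    relative_pose: Tuple[float, ...]
    covariance: List[List[float]]
    associated_node_id: int

@dataclass
class PreprocessingResult:
    current_node_id: Optional[int] = None          # shared by all groups
    visual: List[VisualConstraint] = field(default_factory=list)
    dr_relative_pose: Optional[Tuple[float, float, float]] = None
    dr_covariance: Optional[List[List[float]]] = None
    original_pose: Optional[Tuple[float, float, float]] = None  # used when fusion computes dead reckoning
    new_keyframe: bool = False                     # new key frame identifier

    def has_visual(self) -> bool:
        """True when at least one reasonable visual relative pose is present."""
        return bool(self.visual)
```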
  • Fig. 7 is a schematic diagram showing a process of creating a key frame according to an exemplary embodiment.
  • the process of creating a key frame includes:
  • the currently received image from the vision sensor is taken as the current image.
  • S702 Determine whether the preset creation condition is met, if it is met, execute S703, otherwise execute S705.
  • the creation conditions can be set according to requirements. For example, a time difference from the last key frame creation can be set, and the creation of a key frame is triggered when the time elapsed since the last creation reaches the set time difference; or the trigger can be based on the original pose obtained by the dead reckoning sensor, with the creation of a key frame triggered when the dead reckoning relative pose calculated from the original pose is greater than a certain threshold; or the creation of a key frame can be triggered by the current operating state, such as turning; or the overlapping area of two adjacent collected images can be evaluated, and the creation of a key frame is triggered when the proportion of the overlapping area is less than a predetermined value.
  • S703 Determine whether the number of feature points of the current image is greater than or equal to the extraction threshold, if yes, execute S704, otherwise, execute S705.
  • the extraction process of the feature points can be referred to the above related description, which will not be described in detail here.
  • the two-dimensional coordinates of the feature point in the pixel coordinate system and the descriptor of the feature point can be obtained.
  • the extraction threshold is a preset threshold.
  • S704 The creation is successful, and the acquired image is used as a newly created key frame and stored in the key frame database.
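The decision made in S702 and S703 can be sketched as a single predicate. Only the time-based and dead-reckoning-based triggers are shown, and every threshold value and parameter name here is an assumption for illustration.

```python
import math

def should_create_keyframe(now, last_kf_time, rel_pose, n_features,
                           min_interval=1.0, min_dist=0.3, min_angle=0.3,
                           extraction_threshold=50):
    """Return True when a new key frame should be created.

    rel_pose is the dead reckoning relative pose (dx, dy, dtheta) since the
    last key frame; n_features is the number of extracted feature points.
    """
    dx, dy, dth = rel_pose
    # S702: preset creation condition (elapsed time or amount of motion).
    triggered = (now - last_kf_time >= min_interval
                 or math.hypot(dx, dy) > min_dist
                 or abs(dth) > min_angle)
    # S703: enough feature points must have been extracted.
    return triggered and n_features >= extraction_threshold
```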
  • the current node identifier can also be output to the data fusion module, and the current node identifier can be selected as the image identifier of the newly created key frame. Further, a new key frame identifier may also be included, which indicates that a new key frame is created at this time.
  • the key frames stored in the key frame database also include the absolute pose of the movable device in the global coordinate system at the time the image was taken.
  • the absolute pose is calculated from the absolute pose corresponding to the previous image or key frame and the dead reckoning relative pose corresponding to the time interval between them.
  • the time correspondence between the image and the original pose can be aligned as follows: assuming that the image was taken at time t0, the original poses at times t1 and t2, the two time points closest to t0, are obtained.
  • the original pose at time t1 and the original pose at time t2 are interpolated to obtain the interpolated original pose.
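A minimal sketch of this time alignment for a planar pose is given below. Linear interpolation with shortest-arc handling of the heading is an assumed choice; the source does not specify the interpolation scheme, and the function name is hypothetical.

```python
import math

def interpolate_pose(t0, t1, pose1, t2, pose2):
    """Linearly interpolate (x, y, theta) at t0 between samples at t1 and t2."""
    if t2 == t1:
        return tuple(pose1)
    w = (t0 - t1) / (t2 - t1)
    x = pose1[0] + w * (pose2[0] - pose1[0])
    y = pose1[1] + w * (pose2[1] - pose1[1])
    # Interpolate the heading along the shortest arc to avoid wrap-around jumps.
    dth = math.atan2(math.sin(pose2[2] - pose1[2]), math.cos(pose2[2] - pose1[2]))
    return (x, y, pose1[2] + w * dth)
```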
  • the absolute pose corresponding to the previous image or key frame, plus the above-mentioned dead reckoning relative pose, gives the absolute pose of the image, which can be optimized and updated later.
  • increasing the number of key frames in the key frame database can further improve the matching success rate between images and key frames, and thus improve positioning and mapping.
  • FIG. 8 is a schematic diagram showing a processing flow of data fusion according to an exemplary embodiment.
  • the processing flow of the data fusion module includes:
  • the preprocessing result always includes the dead reckoning relative pose and the covariance matrix of the dead reckoning relative pose, where the covariance matrix of the dead reckoning relative pose can be determined based on the sensor noise, the abnormal behavior of the movable device, and so on.
  • when there is at least one reasonable visual relative pose, the preprocessing result includes: the dead reckoning relative pose, the visual relative pose, the covariance matrix of the dead reckoning relative pose, the covariance matrix of the visual relative pose, and the associated node identifier. Further, it may also include a new key frame identifier and a new node identifier.
  • the preprocessing result may include a new key frame identifier, and the new key frame identifier indicates that a new key frame is created.
  • a new node is created.
  • the new node identifier is produced by the data preprocessing module, which determines whether to create a pose node according to preset judgment conditions.
  • the preset judgment conditions include: whether the distance, the angle difference, and the time interval between the current original pose and the original pose of the previous node (that is, the last created node) are within or outside the corresponding threshold ranges.
  • if the preprocessing result includes a new key frame identifier, the current node created is a key frame node; otherwise it is a pose node.
  • S803 Determine the odometry edge according to the dead reckoning relative pose related information, and use the odometry edge to connect the current node to the last created existing node.
  • the data fusion module can record the absolute pose of each node.
  • the absolute pose is initially obtained from the absolute pose of the previous node plus the current dead-reckoning relative pose.
  • after optimization, each node holds the value optimized using the visual relative pose. That is, after optimization, the data fusion module records the optimized absolute pose of each node.
  • S804 Determine whether there is a key frame node associated with the current node in the existing nodes, if yes, execute S805, otherwise execute S807.
  • the associated node identifier can be obtained from the visual relative pose related information; when the existing nodes include the node indicated by the associated node identifier, it is determined that there is a key frame node associated with the current node, and the associated key frame node is the node indicated by the associated node identifier.
  • S805 Determine a visual edge according to the visual relative pose related information, and use the visual edge to connect the current node to the associated key frame node.
  • S806 Perform graph optimization according to the nodes and edges to obtain the updated absolute pose and map of the movable device.
  • the input of the graph optimization is the absolute poses of all nodes, the relative poses between nodes, and their covariance matrices, and the output is the optimized absolute pose of each node.
  • the relative poses between nodes include the visual relative pose and the dead reckoning relative pose.
  • the optimized absolute pose of the current node is used as the current positioning result, that is, the updated absolute pose of the movable device at the current position; the optimized absolute poses of the key frame nodes can be understood as the updated map result, or as the updated absolute poses of the movable device at different positions.
  • if the preprocessing result does not contain visual relative pose related information, graph optimization is not needed. Graph optimization is performed only when visual relative pose related information is included and a closed loop is formed through the visual edge.
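To make the role of the edges and covariance matrices concrete, the sketch below optimizes a deliberately simplified graph in which each node has only a 2-D position and each edge constrains a position difference. That makes the problem linear and solvable in one step. A real pose graph with headings is nonlinear and is typically solved iteratively, so this is an assumption-laden illustration rather than the method of the source; the function name is hypothetical.

```python
import numpy as np

def optimize_positions(n_nodes, edges, dim=2):
    """Weighted linear least squares over node positions.

    edges: list of (i, j, z, cov) meaning x_j - x_i ~ z with covariance cov.
    Node 0 is held fixed at the origin to remove the gauge freedom.
    """
    H = np.zeros((n_nodes * dim, n_nodes * dim))
    b = np.zeros(n_nodes * dim)
    for i, j, z, cov in edges:
        W = np.linalg.inv(np.asarray(cov, dtype=float))   # information matrix
        z = np.asarray(z, dtype=float)
        si = slice(i * dim, (i + 1) * dim)
        sj = slice(j * dim, (j + 1) * dim)
        H[si, si] += W
        H[sj, sj] += W
        H[si, sj] -= W
        H[sj, si] -= W
        b[si] -= W @ z
        b[sj] += W @ z
    x = np.zeros(n_nodes * dim)
    # Solve the normal equations for all nodes except the fixed node 0.
    x[dim:] = np.linalg.solve(H[dim:, dim:], b[dim:])
    return x.reshape(n_nodes, dim)
```

In this picture, odometry edges connect consecutive nodes, and a visual edge with a small covariance pulls the current node toward the position implied by the associated key frame node.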
  • if the preprocessing result includes only the dead reckoning relative pose related information, that is, only the dead reckoning relative pose and its covariance matrix, the absolute pose of the movable device at the latest moment is updated according to the dead reckoning relative pose related information. For example, the absolute pose of the movable device at the latest moment is determined as the absolute pose of the movable device at the previous moment plus the dead reckoning relative pose.
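For a planar pose, "previous absolute pose plus dead reckoning relative pose" can be sketched as a composition in which the relative pose is expressed in the frame of the previous pose. That frame convention and the name `compose` are assumptions.

```python
import math

def compose(abs_pose, rel_pose):
    """Apply a relative pose (dx, dy, dtheta), given in the frame of abs_pose."""
    x, y, th = abs_pose
    dx, dy, dth = rel_pose
    c, s = math.cos(th), math.sin(th)
    new_th = th + dth
    # Wrap the heading to (-pi, pi].
    new_th = math.atan2(math.sin(new_th), math.cos(new_th))
    return (x + c * dx - s * dy, y + s * dx + c * dy, new_th)
```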
  • the absolute pose and/or map can be updated quickly after the data fusion module receives the visual relative pose related information.
  • the preprocessing result is filtered to update the absolute pose and map of the mobile device.
  • when the visual relative pose related information is obtained, the preprocessing result includes the visual relative pose related information and the dead reckoning relative pose related information.
  • taking Kalman filtering as an example, it can be done in the following way:
  • the recurrence formula involved in Kalman filtering includes:
  • the absolute pose of the movable device required at the current moment is given by the above-mentioned state vector, which is usually a vector composed of N elements; the i-th element is the absolute pose of the movable device at the (k-N+i)-th moment, and N is a preset value.
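The recurrence formulas are not reproduced in this text, so the sketch below shows only the textbook linear Kalman filter predict and update steps, under assumed names. How the patent maps dead reckoning onto the prediction and the visual relative pose onto the measurement is an interpretation, not something stated in the surviving lines.

```python
import numpy as np

def kf_predict(x, P, F, Q):
    """Prediction step: propagate state and covariance through the motion model."""
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    """Update step: correct the state with measurement z = H x + noise."""
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new
```

With a state that stacks several past poses, a relative measurement between two of them can be expressed by a measurement matrix such as `H = [[-1, 1]]` in one dimension, which is the situation a visual relative pose between a key frame and the current image resembles.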
  • the absolute pose of the movable device can also be updated through the filtering method to complete positioning and/or mapping.
  • none of the steps of the present invention involves calculating the three-dimensional coordinates of the spatial points corresponding to the feature points, which avoids the various limitations of calculating three-dimensional coordinates in the prior art and improves the accuracy and computing speed of the final positioning and mapping results.
  • Fig. 9 is a schematic structural diagram of a controller according to an exemplary embodiment.
  • the controller includes: a memory 91 and a processor 92.
  • the memory 91 is used to store executable instructions, and when the instructions in the memory are executed by the processor, the above-mentioned VSLAM method is executed.
  • the embodiment of the present invention also provides a non-transitory computer-readable storage medium; when the instructions in the storage medium are executed by the controller in the movable device, the above-mentioned VSLAM method is executed.
  • the embodiment of the present invention also provides a VSLAM device, which can be applied to a movable device and, as shown in FIG. 10, includes: a first receiving module 101, a matching module 102, a first calculation module 103, and a first update module 104.
  • the first receiving module 101 is configured to receive an image sent by the vision sensor;
  • the matching module 102 is configured to read key frames from a pre-established key frame database, and, after the key frames are read, match the image with the read key frames;
  • the first calculation module 103 is configured to calculate the visual relative pose related information according to the successfully matched image and key frame, wherein the visual relative pose related information includes the visual relative pose, and the visual relative pose is calculated from the two-dimensional coordinates of the mutually matched feature points between the successfully matched image and key frame;
  • the first update module 104 is configured to update the absolute pose and map of the movable device according to the visual relative pose related information and dead reckoning relative pose related information when the visual relative pose related information is obtained.
  • the key frame includes: an absolute pose
  • the absolute pose is the pose of the movable device in the global coordinate system when the image on which the key frame is based is taken
  • the map includes at least one node
  • the device further includes:
  • the second update module is used to update the absolute pose in the key frame corresponding to the node according to the absolute pose of the node in the updated map.
  • the first calculation module is specifically configured to:
  • the visual relative pose is calculated
  • Reselect candidate frames and subsequent calculations until the end of the loop, the end of the loop includes: the number of reasonable visual relative poses reaches a preset number threshold, or all key frames that successfully match are selected;
  • the related information includes: the covariance matrix and the two associated node identifiers.
  • Optionally, the device further includes:
  • the creation module is used to extract feature points from the image when the preset creation conditions are reached to obtain the two-dimensional coordinates and descriptors of the feature points; when the number of extracted feature points is greater than or equal to the preset extraction threshold, Then, a new key frame is created, and the new key frame is stored in a key frame database.
  • the new key frame includes: the two-dimensional coordinates of the feature point and the descriptor.
  • the key frame further includes: an absolute pose, the absolute pose is the pose of the movable device in the global coordinate system when the image on which the key frame is based is taken, and the device further includes:
  • the acquisition module is used to calculate the absolute pose corresponding to the image from the absolute pose corresponding to the previous image or key frame and the dead reckoning relative pose corresponding to the time interval between them.
  • Optionally, the device further includes:
  • the second receiving module is configured to receive the original pose sent by the dead reckoning sensor; or, receive the motion data sent by the dead reckoning sensor, and calculate the original pose according to the motion data;
  • the second calculation module is configured to calculate the dead reckoning relative pose related information according to the original pose.
  • the first update module is specifically configured to:
  • Optionally, the device further includes:
  • the third update module is used to update the absolute pose of the movable device at the latest moment using the dead reckoning relative pose related information when the visual relative pose related information cannot be obtained.
  • the two-dimensional coordinates are the two-dimensional coordinates of the feature point on the pixel coordinate system.
  • the first update module is specifically configured to filter the visual relative pose related information and dead reckoning relative pose related information, and update the absolute pose and map of the movable device.
  • each part of the present invention can be implemented by hardware, software, firmware or a combination thereof.
  • multiple steps or methods can be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system.
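  • for example, if implemented in hardware, any one or a combination of the following technologies can be used: a discrete logic circuit with logic gates for implementing logic functions on data signals, a programmable gate array (PGA), a field programmable gate array (FPGA), and so on.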
  • a person of ordinary skill in the art can understand that all or part of the steps carried in the method of the foregoing embodiments can be implemented by a program instructing related hardware to complete.
  • the program can be stored in a computer-readable storage medium. When executed, it includes one of the steps of the method embodiment or a combination thereof.
  • each embodiment of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer readable storage medium.
  • the storage medium mentioned above may be a read-only memory, a magnetic disk or an optical disk, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)
  • Navigation (AREA)

Abstract

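A VSLAM method, a controller, and a movable device. The VSLAM method includes: receiving an image sent by a visual sensor (12) (S301); reading key frames from a pre-established key frame database and, after a key frame is read, matching the image against the read key frame (S302); calculating visual relative pose related information from the successfully matched image and key frame, where the visual relative pose related information includes a visual relative pose, and the visual relative pose is calculated from the two-dimensional coordinates of the mutually matched feature points of the successfully matched image and key frame (S303); and, if the visual relative pose related information is obtained, updating the absolute pose of the movable device and the map according to the visual relative pose related information and the dead-reckoning relative pose related information (S304). The method can raise the success rate of calculating the visual relative pose and thereby improve the accuracy and computation speed of localization and mapping.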

Description

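VSLAM Method, Controller, and Movable Device
Technical Field
The present invention relates to the field of movable devices, and in particular to a VSLAM method, a controller, and a movable device.
Background
A movable device is a machine that performs work autonomously or semi-autonomously and can be applied in many scenarios. It acquires environmental information through a variety of sensors and responds to that information accordingly, so that it can complete its assigned tasks safely, reliably, efficiently, and intelligently.
Simultaneous localization and mapping (SLAM) means that a movable device starts moving from an unknown position in an unknown environment, localizes itself during the movement according to its own pose and the map, and at the same time builds an incremental map on the basis of that localization, thereby achieving autonomous localization and navigation.
Visual SLAM (VSLAM) means that the movable device uses a vision system for autonomous localization and map creation; it has advantages such as low cost and strong adaptability. In a VSLAM system, visual images are combined with dead reckoning to localize the movable device and build the map.
Existing VSLAM techniques often involve calculating a visual relative pose from vision. To calculate the visual relative pose, the feature points of the current image are generally first matched against the feature points of previously created landmarks, and the pose is then calculated from the three-dimensional coordinates of the matched landmark feature points. The three-dimensional coordinates of a feature point generally mean the three-dimensional coordinates, in the camera coordinate system, of the spatial point corresponding to the feature point; they can also be converted from the camera coordinate system to the global coordinate system. If the three-dimensional coordinates of the feature points are expressed in the global coordinate system, and the origin of the global coordinate system is chosen as the initial position of the movable device in the global coordinate system, then the visual relative pose is identical to the visual absolute pose.
The three-dimensional coordinates of the spatial point corresponding to a feature point are generally calculated from two selected image frames, and those two frames must satisfy certain conditions before the coordinates can be calculated: for example, enough feature points must match between the two frames, and the distance in space between the two frames must lie within a set range. In some scenarios, for example when the image acquisition period is long or adjacent images differ considerably, calculating the three-dimensional coordinates of feature points easily fails. The success rate of landmark creation then drops, fewer landmarks are available for matching in the database, subsequent VSLAM calculations are not accurate enough, and the final localization and mapping results suffer.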
发明内容
为至少在一定程度上克服相关技术中存在的问题,本发明提供一种VSLAM方法、控制器和可移动设备。
根据本发明实施例的第一方面,提供一种VSLAM方法,包括:
接收视觉传感器发送的图像;
向预先建立的关键帧数据库内读取关键帧,以及,在读取到关键帧后,对所述图像与读取到的关键帧进行匹配;
根据匹配成功的图像和关键帧计算视觉相对位姿相关信息,其中,所述视觉相对位姿相关信息包括视觉相对位姿,所述视觉相对位姿根据匹配成功的图像和关键帧之间的相互匹配的特征点的二维坐标计算得到;
如果得到视觉相对位姿相关信息,则根据所述视觉相对位姿相关信息和航位推算相对位姿相关信息更新可移动设备的绝对位姿和地图。
可选的,所述关键帧包括:绝对位姿,所述绝对位姿为拍摄所述关键帧依据的图像时可移动设备在全局坐标系中的位姿,所述地图中包括至少一个节点的绝对位姿,所述方法还包括:
根据更新后的地图中的节点的绝对位姿,更新与节点对应的关键帧中的绝对位姿。
可选的,所述根据匹配成功的图像和关键帧计算视觉相对位姿相关信息,包括:
对匹配成功的关键帧进行排序;
依序选择一个匹配成功的关键帧作为候选帧;
根据图像的特征点的二维坐标和候选帧的特征点的二维坐标,采用对极几何原理,计算得到视觉相对位姿;
根据预设的合理条件,判断所述视觉相对位姿是否合理;
重新选择候选帧及后续计算,直至循环结束,所述循环结束包括:合理的视觉相对位姿的个数达到预设的个数阈值,或者,所有匹配成功的关键帧均被选择;
在循环结束后,如果存在合理的视觉相对位姿,则将合理的视觉相对位姿及其相关信息加入到预处理结果中,所述相关信息包括:协方差矩阵和所关联的两个节点标识。
可选的,还包括:
在达到预设的创建条件时,对所述图像提取特征点,得到特征点的二维坐标和描述子;
在提取的特征点的个数大于或等于预设提取阈值时,则创建新的关键帧,并将所述新的关键帧存储到关键帧数据库中,所述新的关键帧包括:所述特征点的二维坐标和描述子。
可选的,所述关键帧还包括:绝对位姿,所述绝对位姿为拍摄所述关键帧依据的图像时可移动设备在全局坐标系中的位姿,所述方法还包括:
根据前一幅图像或关键帧对应的绝对位姿以及相应时间间隔对应的航位推算相对位姿来计算所述图像对应的绝对位姿。
可选的,还包括:
接收航位推算传感器发送的原始位姿;或者,接收航位推算传感器发送的运动数据,并根据所述运动数据计算得到原始位姿;
根据所述原始位姿计算得到所述航位推算相对位姿相关信息。
可选的,所述根据所述视觉相对位姿相关信息和航位推算相对位姿相关信息更新可移动设备的绝对位姿和地图,包括:
在得到视觉相对位姿相关信息后,创建当前节点;
根据所述航位推算相对位姿相关信息确定里程边,并采用所述里程边将所述当前节点连接到已有的最后创建的节点上;
在已有节点中存在与所述当前节点关联的关键帧节点时,根据所述视觉相对位姿相关信息确定视觉边,并采用所述视觉边将所述当前节点连接到相关联的关键帧节点上;以及,对节点和边进行图优化,得到更新后的可移动设备的绝对位姿和地图。
可选的,还包括:
如果不能得到视觉相对位姿相关信息,则采用航位推算相对位姿相关信息更新最近时刻的可移动设备的绝对位姿。
可选的,所述二维坐标为特征点在像素坐标系上的二维坐标。
根据本发明实施例的第二方面,提供一种控制器,包括:
处理器;以及,用于存储处理器可执行指令的存储器;
其中,当所述存储器中的指令被所述处理器执行时,执行如本发明实施例的第一方面任一项所述的方法。
根据本发明实施例的第三方面,提供一种可移动设备,包括:
航位推算传感器,用于提供原始位姿或者运动数据,以直接获取原始位姿或根据运动数据计算得到原始位姿;
视觉传感器,用于采集图像;
控制器,与所述航位推算传感器和所述视觉传感器连接,用于执行如本发明实施例的第一方面任一项所述的方法。
根据本发明实施例的第四方面,提供一种非临时性计算机可读存储介质,当所述存储介质中的指令由可移动设备中的控制器执行时,执行如本发明实施例的第一方面任一项所述的方法。
根据本发明实施例的第五方面,提供一种VSLAM装置,包括:
第一接收模块,用于接收视觉传感器发送的图像;
匹配模块,用于向预先建立的关键帧数据库内读取关键帧,以及,在读取到关键帧后,对所述图像与读取到的关键帧进行匹配;
第一计算模块,用于根据匹配成功的图像和关键帧计算视觉相对位姿相关信息,其中,所述视觉相对位姿相关信息包括视觉相对位姿,所述视觉相对位姿根据匹配成功的图像和关键帧之间的相互匹配的特征点的二维坐标计算得到;
第一更新模块,用于在得到视觉相对位姿相关信息,根据所述视觉相对位姿相关信息结果和航位推算相对位姿相关信息更新可移动设备的绝对位姿和地图。
可选的,所述关键帧包括:绝对位姿,所述绝对位姿为拍摄所述关键帧依据的图像时可移动设备在全局坐标系中的位姿,所述地图中包括至少一个节点的绝对位姿,所述装置还包括:
第二更新模块,用于根据更新后的地图中的节点的绝对位姿,更新与节点对应的关键帧中的绝对位姿。
可选的,所述第一计算模块具体用于:
对匹配成功的关键帧进行排序;
依序选择一个匹配成功的关键帧作为候选帧;
根据图像的特征点的二维坐标和候选帧的特征点的二维坐标,采用对极几何原理,计算得到视觉相对位姿;
根据预设的合理条件,判断所述视觉相对位姿是否合理;
重新选择候选帧及后续计算,直至循环结束,所述循环结束包括:合理的视觉相对位姿的个数达到预设的个数阈值,或者,所有匹配成功的关键帧均被选择;
在循环结束后,如果存在合理的视觉相对位姿,则将合理的视觉相对位姿及其相关信息加入到预处理结果中,所述相关信息包括:协方差矩阵和所关联的两个节点标识。
可选的,还包括:
创建模块,用于在达到预设的创建条件时,对所述图像提取特征点,得到特征点的二维坐标和描述子;在提取的特征点的个数大于或等于预设提取阈值时,则创建新的关键帧,并将所述新的关键帧存储到关键帧数据库中,所述新的关键帧包括:所述特征点的二维坐标和描述子。
可选的,所述关键帧还包括:绝对位姿,所述绝对位姿为拍摄所述关键帧依据的图像时可移动设备在全局坐标系中的位姿,所述装置还包括:
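An acquisition module, configured to calculate the absolute pose corresponding to the image from the absolute pose corresponding to the previous image or key frame and the dead-reckoning relative pose over the corresponding time interval.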
可选的,还包括:
第二接收模块,用于接收航位推算传感器发送的原始位姿;或者,接收航位推算传感器发送的运动数据,并根据所述运动数据计算得到原始位姿;
第二计算模块,用于根据所述原始位姿计算得到所述航位推算相对位姿相关信息。
可选的,所述第一更新模块具体用于:
在得到视觉相对位姿相关信息后,创建当前节点;
根据所述航位推算相对位姿相关信息确定里程边,并采用所述里程边将所述当前节点连接到已有的最后创建的节点上;
在已有节点中存在与所述当前节点关联的关键帧节点时,根据所述视觉相对位姿相关信息确定视觉边,并采用所述视觉边将所述当前节点连接到相关联的关键帧节点上;以及,对节点和边进行图优化,得到更新后的可移动设备的绝对位姿和地图。
可选的,还包括:
第三更新模块,用于在不能得到视觉相对位姿相关信息,则采用航位推算相对位姿相关信息更新最近时刻的可移动设备的绝对位姿。
可选的,所述二维坐标为特征点在像素坐标系上的二维坐标。
本发明的实施例提供的技术方案可以包括以下有益效果:
通过在计算视觉相对位姿时,采用相互匹配的特征点的二维坐标,而不是特征点所对应的空间点的三维坐标,可以避免计算上述三维坐标的各种限制问题,提高计算视觉相对位姿的成功率,进而提高最终的定位和建图结果的准确度和运算速度。
应当理解的是,以上的一般描述和后文的细节描述仅是示例性和解释性的,并不能限制本发明。
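Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the specification, serve to explain the principles of the invention.
FIG. 1 is a schematic structural diagram of a VSLAM system according to an exemplary embodiment;
FIG. 2 is a schematic structural diagram of a controller according to an exemplary embodiment;
FIG. 3 is a schematic flowchart of a VSLAM method according to an exemplary embodiment;
FIG. 4 is a schematic flowchart of data preprocessing according to an exemplary embodiment;
FIG. 5 is a schematic flowchart of calculating the visual relative pose according to an exemplary embodiment;
FIG. 6 is a schematic diagram of the absolute pose of a movable device according to an exemplary embodiment;
FIG. 7 is a schematic flowchart of creating a key frame according to an exemplary embodiment;
FIG. 8 is a schematic flowchart of data fusion according to an exemplary embodiment;
FIG. 9 is a schematic structural diagram of a controller according to an exemplary embodiment;
FIG. 10 is a schematic structural diagram of a VSLAM apparatus according to an exemplary embodiment.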
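Detailed Description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the invention as detailed in the appended claims.
The movable device of at least one embodiment of the present invention may be, for example, a cleaning robot, a companion mobile robot, a service mobile robot, an intelligent industrial inspection device, a security robot, an unmanned vehicle, or an unmanned aerial vehicle. Cleaning robots include smart sweepers, smart floor-mopping machines, and window-cleaning robots; companion mobile robots include smart electronic pets and nanny robots; service mobile robots include reception robots in hotels, inns, and meeting venues; intelligent industrial inspection devices include power inspection robots and smart forklifts; security robots include household or commercial intelligent guard robots.
Sensors commonly used on movable devices include:
(1) Code disc (encoder): a digital encoder used to measure angular displacement. It offers high resolution, high measurement accuracy, and reliable operation, and is one of the most commonly used displacement sensors for measuring the angular position of a shaft. Combined with the known size of the wheels, it can be used for localization and/or speed measurement of the movable device.
(2) Inertial measurement unit (IMU): includes a gyroscope and/or an accelerometer. The gyroscope detects the angular motion of the movable device and measures its angular velocity; integrating the angular velocity gives the angle of the device, and a 3-axis gyroscope allows the attitude of the device in three-dimensional space to be calculated. The accelerometer detects the acceleration of the movable device; integrating the acceleration gives the velocity, and integrating the velocity gives the displacement.
(3) Camera: a device that perceives the surrounding environment. It is inexpensive and provides rich information for localization, mapping, and recognition of targets/obstacles. Cameras include monocular, binocular, and multi-view cameras. A monocular camera cannot provide a reference scale, so in practice it often has to work together with other sensors, whereas binocular and multi-view cameras can provide the scale of the space.
Other sensors include the Global Positioning System (GPS), 2D/3D laser ranging sensors, ultrasonic ranging sensors, and so on. The code disc and the inertial measurement unit described above are both dead-reckoning sensors.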
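VSLAM generally includes feature extraction, image matching, data fusion, and so on.
Feature extraction yields image features. Commonly used image features include point features and line features; point features are the focus here. There are many point-feature extraction methods, including Harris, FAST, ORB, SIFT, and SURF, as well as point-feature extraction methods based on deep learning. Taking ORB as an example, ORB uses the FAST algorithm to detect feature points. This method works on the image gray values around a candidate point: it examines the pixel values on a circle around the candidate feature point, and if enough pixels in the neighborhood of the candidate point differ sufficiently in gray value from the candidate point, the candidate point is considered a feature point, as follows: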
Figure PCTCN2019102686-appb-000001
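Here I(x) is the gray value of any pixel on the circle centered at the candidate point with a radius of a set value, I(p) is the gray value of the center, that is, of the candidate point, and ε_d is the threshold on the gray-value difference. If N is greater than a given threshold, p is considered a feature point; the threshold is generally three quarters of the total number of circle pixels I(x).
To obtain results faster, an additional speed-up is used. The 4 points at 90-degree intervals around the candidate point are tested first; at least 3 of them should differ sufficiently in gray value from the candidate point, otherwise the remaining points need not be computed and the candidate point is rejected as a feature point directly. The radius of the circle around the candidate point is an important parameter; for simplicity and efficiency, a radius of 3 is used here, which gives 16 surrounding pixels to compare. To make the comparison more efficient, usually only K surrounding pixels are used for comparison, which is the FAST-K method. Once feature points are obtained, their attributes need to be described in some way. The output of these attributes is called the descriptor of the feature point. ORB uses the BRIEF algorithm to compute the descriptor of a feature point.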
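The counting test and the four-point quick rejection described above can be sketched as follows. This is a minimal illustration, not the implementation of the patent or of any library: the image is assumed to be a list of rows of gray values, `eps` stands in for ε_d, and `min_count=12` (three quarters of 16) is an assumed threshold on N. Note that the original FAST detector additionally requires the differing pixels to be contiguous on the circle; the sketch follows the simpler counting criterion given in the text.

```python
# Bresenham circle of radius 3 around the candidate pixel: 16 (dx, dy) offsets.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_fast_corner(img, x, y, eps=20, min_count=12):
    """Return True if pixel (x, y) passes the counting test described above.

    img is a list of rows of gray values; eps plays the role of epsilon_d and
    min_count is an assumed threshold on N (hypothetical parameter names).
    """
    center = img[y][x]

    def differs(offset):
        dx, dy = offset
        return abs(img[y + dy][x + dx] - center) > eps

    # Quick rejection: of the 4 pixels at 90-degree intervals, at least 3 must differ.
    compass = [CIRCLE[0], CIRCLE[4], CIRCLE[8], CIRCLE[12]]
    if sum(differs(o) for o in compass) < 3:
        return False

    # Full test: N = number of circle pixels whose gray value differs by more than eps.
    n = sum(differs(o) for o in CIRCLE)
    return n >= min_count


flat = [[10] * 7 for _ in range(7)]   # uniform image: no corner expected
spot = [row[:] for row in flat]
spot[3][3] = 200                      # one bright pixel on a dark background
```

The caller is responsible for keeping (x, y) at least 3 pixels away from the image border.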
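The core idea of the BRIEF algorithm is to select N point pairs around the feature point P in a certain pattern and to combine the comparison results of these N point pairs into the descriptor. The steps are:
1). Draw a circle O with the key point P as the center and d as the radius.
2). Select N point pairs inside circle O according to a certain pattern. For ease of explanation N=4 here; in practical applications N may be 512. Suppose the 4 selected point pairs are labeled:
P_1(A,B), P_2(A,B), P_3(A,B), P_4(A,B)
3). Define the operation T: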
Figure PCTCN2019102686-appb-000002
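where I_A denotes the gray value of point A and I_B denotes the gray value of point B.
4). Apply the operation T to each selected point pair and combine the results to obtain the final descriptor. Suppose:
T(P_1(A,B))=1
T(P_2(A,B))=0
T(P_3(A,B))=1
T(P_4(A,B))=1
Then the final descriptor is: 1011.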
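The four steps above can be illustrated with a short sketch. The definition of T is given in a figure that is not reproduced in this text, so the comparison direction used here (1 when I_A > I_B, otherwise 0) is an assumption, as are the function names and the offset-based representation of the point pairs.

```python
def brief_descriptor(img, keypoint, pairs):
    """Build a binary descriptor string from point-pair intensity comparisons.

    img: list of rows of gray values; keypoint: (x, y);
    pairs: list of ((ax, ay), (bx, by)) offsets relative to the key point.
    """
    x0, y0 = keypoint
    bits = []
    for (ax, ay), (bx, by) in pairs:
        i_a = img[y0 + ay][x0 + ax]
        i_b = img[y0 + by][x0 + bx]
        # Assumed form of T: 1 when I_A > I_B, otherwise 0.
        bits.append("1" if i_a > i_b else "0")
    return "".join(bits)


def hamming(d1, d2):
    """Number of differing bits between two equal-length descriptor strings."""
    return sum(c1 != c2 for c1, c2 in zip(d1, d2))


# 5x5 test image whose gray value grows with row and column index.
img = [[r * 5 + c for c in range(5)] for r in range(5)]
pairs = [((-1, 0), (1, 0)), ((1, 0), (-1, 0)), ((0, -1), (0, 1)), ((0, 1), (0, -1))]
desc = brief_descriptor(img, (2, 2), pairs)
```

Binary descriptors of this kind are usually compared with the Hamming distance, which is why `hamming` is included.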
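To make feature points rotation invariant, an orientation is also added to each feature point. Taking the feature point P as the center and R as the radius, the gray-value centroid C of that region is computed, and the direction of PC is the orientation of the feature point, as shown below:
θ = atan2(M_01, M_10)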
Figure PCTCN2019102686-appb-000003
Figure PCTCN2019102686-appb-000004
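where θ is the orientation of the feature point, I(x,y) is the gray value of the pixel at coordinates (x0+x, y0+y), (x0,y0) are the coordinates of the feature point, and x and y are the coordinate offsets.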
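A minimal sketch of the orientation computation follows. The moment definitions appear only in figures that are not reproduced here, so the sketch assumes the usual intensity-centroid moments, M_10 = Σ x·I(x,y) and M_01 = Σ y·I(x,y), summed over the offsets inside a circle of the given radius; the function name is hypothetical.

```python
import math


def orientation(img, x0, y0, radius):
    """Angle of the vector from the feature point to the intensity centroid."""
    m10 = 0.0  # assumed moment: sum of x * I(x, y) over the patch
    m01 = 0.0  # assumed moment: sum of y * I(x, y) over the patch
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx * dx + dy * dy > radius * radius:
                continue  # stay inside the circular region
            gray = img[y0 + dy][x0 + dx]
            m10 += dx * gray
            m01 += dy * gray
    return math.atan2(m01, m10)


right = [[0] * 7 for _ in range(7)]
right[3][5] = 100   # bright pixel to the right of the center (3, 3)
below = [[0] * 7 for _ in range(7)]
below[5][3] = 100   # bright pixel below the center (positive y)
```

With a single bright pixel to the right of the feature point the centroid lies on the positive x axis, so the orientation is 0; with the bright pixel below, it is π/2 in image coordinates.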
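Given the current image, images similar to it are queried from an existing database. The most direct approach is to traverse the database and compare one by one, but this is inefficient; the most popular approach at present is the bag-of-words (BoW) method, which mainly consists of the following steps:
(1). Feature extraction
Suppose there are N images and the i-th image consists of n(i) feature points, that is, it can be expressed by n(i) feature vectors. A total of sum(n(i)) feature vectors (that is, words) is then obtained. In general, a feature vector is represented by the descriptor of a feature point, or by the descriptor after it has been normalized with the orientation. Feature vectors can be designed to suit the problem at hand; commonly used features include color histograms, SIFT, and LBP.
(2). Generating the dictionary/codebook
The feature vectors obtained in the previous step are clustered (a clustering method such as K-means can be used) to obtain K cluster centers, and the codebook is built from the cluster centers.
(3). Generating a histogram from the codebook
For each image, a nearest-neighbor search determines which word class in the codebook each "word" of the image belongs to, which yields the BoW representation of the image with respect to the codebook.
At present, the model that is really widely used in practical applications such as search engines is the tf-idf model. Its main idea is: if a word w occurs with high frequency in an image d and rarely occurs in other images, then w is considered to have good discriminative power and is suitable for distinguishing image d from other images. The model mainly involves two factors:
1) The term frequency tf of word w in image d, that is, the ratio of the number of occurrences count(w,d) of word w in image d to the total number of words size(d) in image d:
tf(w,d)=count(w,d)/size(d)
2) The inverse document frequency idf of word w over the whole image set, that is, the logarithm of the ratio of the total number of images n to the number of images docs(w,D) in which word w appears:
idf(w)=log(n/docs(w,D))
From tf and idf, the tf-idf model computes a weight for each image d and a query string q composed of the keywords w[1]…w[k]; the weight expresses how well the query string q matches image d: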
tf-idf(q,d)
=sum{i=1..k|tf-idf(w[i],d)}
=sum{i=1..k|tf(w[i],d)*idf(w[i])}.
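The tf, idf, and tf-idf formulas above translate directly into code. In this sketch each image is assumed to be already quantized into a list of visual-word ids, and the database is a plain dictionary; these data structures and the function names are illustrative assumptions, not part of the described method.

```python
import math
from collections import Counter


def tf(word, image_words):
    """Term frequency: occurrences of word divided by the number of words in the image."""
    return Counter(image_words)[word] / len(image_words)


def idf(word, database):
    """Inverse document frequency over a dict {image_id: list of word ids}."""
    docs = sum(1 for words in database.values() if word in words)
    # A word that appears in no image gets weight 0 (a choice made for this sketch).
    return math.log(len(database) / docs) if docs else 0.0


def tf_idf_score(query_words, image_words, database):
    """Matching score between a query (list of words) and one database image."""
    return sum(tf(w, image_words) * idf(w, database) for w in query_words)


database = {"a": [1, 1, 2], "b": [2, 3], "c": [3, 3, 3]}
query = [1, 2]
scores = {k: tf_idf_score(query, words, database) for k, words in database.items()}
best = max(scores, key=scores.get)
```

Image "a" scores highest because it contains word 1, which appears in no other image and therefore carries the largest idf weight.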
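After a pair of images has been selected, the feature points are matched one to one. The whole image can be traversed and compared point by point, or a k-d tree can be used for acceleration; if a potential pose relationship is available, the epipolar geometry principle can be used to speed up matching. In addition, deep learning can be used for image retrieval and matching.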
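Current data fusion approaches fall roughly into two classes: filter-based methods and methods based on nonlinear optimization. The most classic filtering method is the Kalman filter, whose process equation and observation equation are, respectively:
X_k = A X_{k-1} + B u_k + w_{k-1}
Z_k = H X_k + v_k
where X_k is the state vector of the system, A and B are the parameters of the process equation, u_k is the input of the system, w_{k-1} is the process noise, Z_k is the observation vector, H is the parameter of the observation equation, and v_k is the observation noise.
The first equation means that each X_k can be represented by a linear stochastic equation: any state vector X_k is a linear combination of the state vector of its previous state, the input u_k, and the process noise w_{k-1}.
The second equation means that any observation vector is a linear combination of the current state vector and the observation noise; the noise is generally assumed to follow a Gaussian distribution.
The process noise and the observation noise in these two equations are generally regarded as statistically independent.
With the Kalman filter, the current state vector can be predicted from the state vector of the system at the previous time and the current input, as follows:
Figure PCTCN2019102686-appb-000005
Figure PCTCN2019102686-appb-000006
where
Figure PCTCN2019102686-appb-000007
is the estimate of the state vector at time k-1, u_k is the input at time k,
Figure PCTCN2019102686-appb-000008
is the prediction of the state vector at time k,
Figure PCTCN2019102686-appb-000009
is the prediction of the covariance matrix of the state vector at time k, P_{k-1} is the covariance matrix of the state vector at time k-1, Q is the variance of the process noise, and A and B are the parameters of the process equation.
When an observation vector of the system is obtained, the estimate of the current state vector can be obtained from the current observation vector and the prediction of the current state vector, as follows:
Figure PCTCN2019102686-appb-000010
Figure PCTCN2019102686-appb-000011
Figure PCTCN2019102686-appb-000012
where K_k is the Kalman gain, R is the variance of the observation noise, z_k is the observation vector at time k, and H is the parameter of the observation equation.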
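The prediction and update equations themselves are given in figures that are not reproduced in this text. The sketch below therefore assumes the textbook form of the Kalman filter, which matches the symbols defined above (A, B, H, Q, R, K_k), and restricts it to scalar quantities so that it stays self-contained. The function names are hypothetical.

```python
def kalman_predict(x_est, p_est, u, a, b, q):
    """Prediction: propagate the previous estimate through the process equation.

    Assumed textbook form: x_pred = A x + B u, P_pred = A P A^T + Q.
    """
    x_pred = a * x_est + b * u
    p_pred = a * p_est * a + q
    return x_pred, p_pred


def kalman_update(x_pred, p_pred, z, h, r):
    """Update: correct the prediction with the observation z.

    Assumed textbook form: K = P_pred H^T (H P_pred H^T + R)^-1,
    x = x_pred + K (z - H x_pred), P = (I - K H) P_pred.
    """
    k = p_pred * h / (h * p_pred * h + r)  # Kalman gain
    x_est = x_pred + k * (z - h * x_pred)
    p_est = (1.0 - k * h) * p_pred
    return x_est, p_est, k


x_pred, p_pred = kalman_predict(x_est=0.0, p_est=1.0, u=1.0, a=1.0, b=1.0, q=0.0)
x_new, p_new, gain = kalman_update(x_pred, p_pred, z=2.0, h=1.0, r=1.0)
```

With equal prediction and observation variances the gain is 0.5, so the new estimate lies halfway between the prediction (1.0) and the observation (2.0).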
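In practice there are also many improved methods based on the Kalman filter, such as the extended Kalman filter, the unscented Kalman filter, the iterated Kalman filter, and the multi-state Kalman filter. There are also particle filters and others.
Filtering methods are recursive, whereas nonlinear optimization methods are iterative. Nonlinear optimization is introduced below.
Given an objective function
Figure PCTCN2019102686-appb-000013
whose minimum is sought:
Expand the objective function in a Taylor series around x:
Figure PCTCN2019102686-appb-000014
Here J(x) is the derivative of
Figure PCTCN2019102686-appb-000015
with respect to x (the Jacobian matrix), and H is the second derivative (the Hessian matrix). Either the first-order or the second-order terms of the Taylor expansion can be retained, and the corresponding solution methods are first-order gradient and second-order gradient methods. If only the first-order gradient is retained, the solution for the increment is:
Δx = -J^T(x).
The intuition is very simple: just move in the direction opposite to the gradient. Usually a step length along that direction is also computed to obtain the fastest descent; this is called the steepest descent method.
On the other hand, if the second-order gradient information is retained, the solution for the increment is:
H Δx = -J^T(x).
This is called Newton's method. There are also the Gauss-Newton method, the Levenberg-Marquardt method, and others. As for the concrete form of implementation, sliding-window optimization or incremental optimization (iSAM) can also be used.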
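The difference between the two increments can be seen on a one-dimensional example. The objective F(x) = (x - 3)^2 + 1 used below is an assumption chosen for illustration (the objective of the text is given in a figure that is not reproduced), as are the fixed step length and the function names.

```python
def gradient(x):
    """First derivative J(x) of the example objective F(x) = (x - 3)^2 + 1."""
    return 2.0 * (x - 3.0)


def hessian(x):
    """Second derivative H of the same objective (constant here)."""
    return 2.0


def steepest_descent(x, step=0.25, iterations=50):
    """Repeatedly apply the increment dx = -step * J(x)."""
    for _ in range(iterations):
        x += -step * gradient(x)
    return x


def newton_step(x):
    """Solve H * dx = -J(x) for dx and apply it once."""
    return x + (-gradient(x) / hessian(x))


x_sd = steepest_descent(0.0)
x_newton = newton_step(0.0)
```

Because the example objective is quadratic, a single Newton step lands on the minimum at x = 3, whereas steepest descent with a fixed step only approaches it geometrically.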
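For a better understanding of the present invention, some of the terms involved are explained below:
Key frame: stored in the key frame database. Each key frame is built from an image captured by the visual sensor; unless otherwise specified, the visual sensor is mounted on the movable device. The visual sensor may be, for example, a still camera, a video camera, or a camera module. Each key frame includes the following set of data: the absolute pose of the movable device in the global coordinate system when the image on which the key frame is based was captured, and the two-dimensional coordinates and descriptors of the feature points in that image. The absolute pose expresses position and attitude; the position is expressed in coordinates. For example, when the movable device moves in two-dimensional space, as shown in FIG. 6, the absolute pose can be expressed by three parameters (x,y,θ), where (x,y) is the position of the movable device and θ is its attitude. In three-dimensional space, the position of the movable device is expressed as (x,y,z) in a Cartesian coordinate system or (α,β,r) in a spherical coordinate system, and the attitude is expressed by the orientation of the movable device or its camera, usually as angles; in three-dimensional space, for example, it is expressed as
Figure PCTCN2019102686-appb-000016
and these three angles are usually called the pitch angle, the roll angle, and the yaw angle. It should be noted that a key frame does not include the three-dimensional coordinates of the spatial points corresponding to the feature points; specifically, it includes neither the three-dimensional coordinates of the spatial points in the global coordinate system nor their three-dimensional coordinates in the camera coordinate system.
The two-dimensional coordinates mentioned above are the two-dimensional coordinates of the feature points in the pixel coordinate system.
Absolute pose of the movable device: the absolute pose of the movable device in the global coordinate system mentioned above, that is, the position and attitude of the movable device in the global coordinate system.
Absolute pose of a node: node data stored in the controller; its value equals the absolute pose of the movable device when the node was created.
Original pose: obtained from the data provided by the dead-reckoning sensor. For example, the dead-reckoning sensor may provide the original pose directly, or it may provide motion data from which the controller calculates the original pose. The original pose is also an absolute quantity, as opposed to a relative quantity, and can be understood as the absolute pose of the movable device before optimization. Unless otherwise specified, the present invention is described with the dead-reckoning sensor providing the original pose directly.
Global coordinate system: a coordinate system fixed in the environment.
Dead-reckoning relative pose: the relative quantity, provided by the dead-reckoning sensor, between the absolute pose of the movable device at a first time and its absolute pose at a second time.
Visual relative pose: the relative quantity between the absolute pose of the movable device at a first time and its absolute pose at a second time, obtained from the current image captured by the visual sensor and the key frame in the key frame database that successfully matches the current image. The visual relative pose is calculated from the two-dimensional coordinates of the mutually matched feature points of the successfully matched image and key frame. It depends only on the visual sensor and not on the dead-reckoning sensor.
The absolute pose at the first time is the absolute pose of the movable device when the visual sensor captured the current image, and the absolute pose at the second time is the absolute pose included in the key frame that matches the current image.
The relative quantities in both the dead-reckoning relative pose and the visual relative pose include a relative quantity in position and a relative quantity in attitude.
It should be noted that, in the present invention, the attitude component of the visual relative pose has the same form as the attitude component of the dead-reckoning relative pose, whereas the position component of the visual relative pose has a different form from the position component of the dead-reckoning relative pose: the position component of the visual relative pose has one dimension fewer than that of the dead-reckoning relative pose.
For example, for the movable device moving in two-dimensional space shown in FIG. 6, the dead-reckoning relative pose is conventionally expressed as (Δx,Δy,Δθ), whereas in the present invention the visual relative pose is expressed as (Δα,Δθ), where
Figure PCTCN2019102686-appb-000017
It should be noted that the actual computation in the present invention does not need to solve for Δx and Δy; (Δα,Δθ) can be calculated directly, as described later. The dead-reckoning relative pose, by contrast, requires (Δx,Δy,Δθ). Similarly, when the movable device moves in three-dimensional space, the visual relative pose of the present invention consists of 5 parameter values (Δα,Δβ,Δφ,Δψ,Δθ), while the dead-reckoning relative pose consists of 6 parameter values (Δα,Δβ,Δr,Δφ,Δψ,Δθ). The comparison shows that the position component of the visual relative pose has one dimension fewer than that of the dead-reckoning relative pose.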
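FIG. 1 is a schematic structural diagram of a VSLAM system according to an exemplary embodiment.
As shown in FIG. 1, the system includes a dead-reckoning sensor 11, a visual sensor 12, and a controller 13. The dead-reckoning sensor 11 generates the original pose, or generates motion data from which the original pose is calculated. The visual sensor 12, which includes a video camera, an infrared imager, or the like, captures images. The controller 13 performs localization and/or mapping from the original pose and the images from the visual sensor. It can be understood that the pose output by the controller 13 in the figure is the updated absolute pose of the movable device.
The VSLAM system can be applied to a movable device, for example in the field of mobile robots, so the system can be a component of the movable device.
The controller 13 may be hardware, software, firmware, or a combination thereof. For details of the controller 13, see FIG. 2 and FIG. 3.
The motion data of the dead-reckoning sensor include displacement, velocity, acceleration, angle, and angular-velocity data of the movable device. The controller can calculate the original pose from the motion data of the dead-reckoning sensor, or the dead-reckoning sensor can itself calculate the original pose from the motion data and provide it to the controller. Unless otherwise specified, the present invention takes the case in which the dead-reckoning sensor provides the original pose to the controller as an example.
Because of the performance limitations of the dead-reckoning sensor itself, the motion data or original pose it produces contain accumulated error. To obtain more accurate localization and mapping results, the original pose has to be corrected. In the embodiments of the present invention, the correction is based on the images captured by the visual sensor.
The visual sensor can photograph the surroundings at the acquisition period set by the controller to obtain images. The images captured by the visual sensor and the original pose collected by the dead-reckoning sensor are passed to the controller, which corrects the original pose on the basis of the images and then performs localization and mapping.
In this embodiment, the original pose is corrected on the basis of the images captured by the visual sensor, which achieves localization and mapping and, when applied to a movable device, achieves VSLAM for the movable device.
FIG. 2 is a schematic structural diagram of a controller according to an exemplary embodiment.
As shown in FIG. 2, the controller can be divided into a data preprocessing module 21 and a data fusion module 22.
The data preprocessing module 21 receives the original pose from the dead-reckoning sensor and the images from the visual sensor. After processing them, it can obtain the dead-reckoning relative pose, the visual relative pose, a new key frame identifier, node identifiers, and related information, which together form the preprocessing result; the node identifiers include a new node identifier and/or an associated node identifier. The data fusion module 22 performs localization and mapping according to the preprocessing result. It can be understood that the dead-reckoning relative pose and its related information can also be calculated by the data fusion module; in that case the data preprocessing module passes the original pose to the data fusion module and does not calculate the dead-reckoning relative pose and its related information itself.
The data fusion module can record the absolute pose of each node. After the absolute poses of the nodes have been optimized on the basis of the visual relative pose, the optimized absolute pose of the current node serves as the current localization result, which completes localization. Nodes include key frame nodes and pose nodes; the optimized absolute poses of the key frame nodes recorded by the back end can be understood as map information, which completes mapping.
The key frame database stores key frames. Depending on the current situation, it may hold zero, one, or more key frames. Key frames are created from the images captured by the visual sensor. For the specific content of a key frame, see the explanation of terms above.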
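FIG. 3 is a schematic flowchart of a VSLAM method according to an exemplary embodiment. The method can be executed by the controller.
As shown in FIG. 3, the VSLAM method includes:
S301: Receive an image sent by the visual sensor.
S302: Read key frames from the pre-established key frame database and, after a key frame is read, match the image against the read key frame.
S303: Calculate visual relative pose related information from the successfully matched image and key frame, where the visual relative pose related information includes the visual relative pose, and the visual relative pose is calculated from the two-dimensional coordinates of the mutually matched feature points of the successfully matched image and key frame.
S304: If the visual relative pose related information is obtained, update the absolute pose of the movable device and the map according to the visual relative pose related information and the dead-reckoning relative pose related information.
On the other hand, if the visual relative pose related information cannot be obtained, the dead-reckoning relative pose related information is used to update the absolute pose of the movable device at the latest time.
In addition, it can be understood that the method may further include:
receiving the original pose sent by the dead-reckoning sensor; or receiving the motion data sent by the dead-reckoning sensor and calculating the original pose from the motion data; and
calculating the dead-reckoning relative pose related information from the original pose.
It can be understood that there is no required ordering among the above steps, as long as the parameters needed for the current calculation have been received and/or calculated beforehand.
In a specific implementation, matching the image against key frames and calculating the visual relative pose and its related information can be performed by the data preprocessing module of the controller, and updating the absolute pose of the movable device and/or the map according to the preprocessing result can be performed by the data fusion module of the controller. The specific flow of data preprocessing is shown in FIG. 4, and the specific flow of data fusion is shown in FIG. 8.
In this embodiment, the visual relative pose is calculated from the two-dimensional coordinates of mutually matched feature points rather than from the three-dimensional coordinates of the spatial points corresponding to the feature points. This avoids the various restrictions involved in calculating those three-dimensional coordinates, raises the success rate of calculating the visual relative pose, and in turn improves the accuracy and computation speed of the final localization and mapping results.
Since the data preprocessing module mainly handles visual information, and the operations on the dead-reckoning relative pose and its related information are relatively simple, they are not described further; methods from the related art can be used. In addition, the dead-reckoning relative pose and its related information can also be calculated by the data fusion module from the original pose.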
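FIG. 4 is a schematic flowchart of data preprocessing according to an exemplary embodiment.
As shown in FIG. 4, the data preprocessing flow includes:
S401: Receive an image from the visual sensor.
S402: After the image is received, read key frames from the key frame database.
The key frame database stores key frames; at the current time it may contain key frames or it may be empty.
S403: Determine whether a key frame can be read. If so, go to S404; otherwise go to S407.
S404: Match the image against the read key frame.
The image data of a key frame include the information of the key frame's feature points.
During matching, the feature points of the image can be extracted to obtain the feature-point information of the image, and feature-point matching is then performed using the feature-point information of the image and that of the key frame. When the number of mutually matched feature points is greater than or equal to a matching threshold, the image and the key frame are determined to match successfully. The matching threshold is a preset value.
The feature-point information mentioned above includes the two-dimensional coordinates of the feature points in the pixel coordinate system and the descriptors of the feature points.
Feature points can be extracted with various related techniques, for example the scale-invariant feature transform (SIFT) algorithm, the speeded-up robust features (SURF) algorithm, or the Oriented FAST and Rotated BRIEF (ORB) algorithm.
After the feature-point information of the image and of the key frame has been obtained, feature-point matching is carried out. This can also be implemented with various related techniques: for example, a search range is determined from the original pose corresponding to the image and the two-dimensional coordinates of the feature points, vector distances are calculated from the descriptors within the search range, and the mutually matched feature points are determined from the vector distances.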
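The matching step in S404 can be sketched as a nearest-neighbor search on descriptor distance followed by the match-count test. This is a simplified illustration under stated assumptions: descriptors are binary and stored as Python integers, the vector distance is the Hamming distance, and the pose-based search window mentioned above is omitted. The names `max_distance` and `match_threshold` are hypothetical.

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as integers."""
    return bin(a ^ b).count("1")


def match_features(desc_image, desc_keyframe, max_distance):
    """Pair each image descriptor with its nearest key frame descriptor.

    Returns a list of (image_index, keyframe_index) pairs whose distance
    does not exceed max_distance.
    """
    if not desc_keyframe:
        return []
    matches = []
    for i, d in enumerate(desc_image):
        j, dist = min(((j, hamming(d, e)) for j, e in enumerate(desc_keyframe)),
                      key=lambda pair: pair[1])
        if dist <= max_distance:
            matches.append((i, j))
    return matches


def frames_match(matches, match_threshold):
    """The image and key frame match when enough feature points are paired."""
    return len(matches) >= match_threshold


desc_image = [0b1011, 0b0000, 0b1111]
desc_keyframe = [0b1010, 0b0001, 0b0110]
matches = match_features(desc_image, desc_keyframe, max_distance=1)
```

A production matcher would typically also enforce one-to-one pairing and a ratio test; those refinements are left out to keep the sketch short.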
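S405: Determine whether there is at least one successfully matched key frame. If so, go to S406; otherwise go to S407.
A different key frame can be read from the key frame database each time and matched against the image, until all key frames in the database have been matched against the image, which determines whether at least one successfully matched key frame exists.
S406: Calculate the visual relative pose from the successfully matched image and key frame.
The visual relative pose can be calculated from the two-dimensional coordinates of the mutually matched feature points of the successfully matched image and key frame.
According to a preset reasonableness condition, a visual relative pose may be reasonable or unreasonable. Specifically, when the visual relative pose is reasonable, its covariance matrix and the two node identifiers it relates are also obtained. Thus, when the visual relative pose is reasonable, the visual relative pose together with its covariance matrix, the current node identifier, and the associated node identifier is placed in the preprocessing result as the visual relative pose related information; when none of the calculated visual relative poses is reasonable, the preprocessing result contains no visual relative pose related information.
The two node identifiers are the current node identifier and the associated node identifier: the identifier of the image and the identifier of the key frame to which the visual relative pose corresponds serve as the current node identifier and the associated node identifier, respectively. An image can be assigned an image identifier in advance, and since a key frame is created from an image, the key frame can share the image identifier of the image from which it was created. The current node identifier can therefore be chosen as the image identifier of the current image, and the associated node identifier can be the image identifier of the key frame that matches the image successfully and yields a reasonable visual relative pose. For example, if the image with a first identifier and the key frame with a second identifier match successfully and yield a reasonable visual relative pose, the current node identifier is the first identifier and the associated node identifier is the second identifier.
For details of calculating the visual relative pose from the image and the key frame, see FIG. 5.
S407: Create a new key frame.
For the key frame creation process, see FIG. 7.
In this embodiment, the visual relative pose and the dead-reckoning relative pose are calculated in the data preprocessing module and can be provided to the data fusion module for localization and/or mapping. Using two-dimensional coordinates to calculate the visual relative pose avoids the problems caused by calculating three-dimensional coordinates, so that the constraint information provided by vision is obtained more simply, accurately, and quickly.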
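FIG. 5 is a schematic flowchart of calculating the visual relative pose according to an exemplary embodiment.
As shown in FIG. 5, the flow for calculating the visual relative pose includes:
S501: Sort the successfully matched key frames.
The matched key frames can be sorted by their similarity to the image.
The similarity can be calculated with the bag-of-words method.
For example, a bag of words is trained first; from the trained bag of words, the image feature vector of the image and the image feature vectors of the key frames are generated; the distance between the image feature vector of the image and that of each key frame is calculated. The smaller the distance, the higher the similarity, so the key frames can be sorted in order of increasing distance.
When the bag of words is trained, a large number of pre-collected feature-point descriptors can be clustered into a fixed number of classes, each class being called a word; the inverse document frequency (IDF) of each word is then computed statistically and used as the weight of the word, and the bag of words consists of the words and their weights. When an image feature vector is generated from the bag of words, the length of the vector equals the number of words in the bag, and each element of the vector is the term frequency-inverse document frequency (TF-IDF) of the corresponding word in the current image.
S502: Determine whether the queue is empty. If it is empty, go to S507; otherwise go to S503.
The queue holds the sorted, successfully matched key frames. Key frames are taken from the queue in order of decreasing similarity; once a key frame has been taken out, it is no longer in the queue.
S503: Select, in order, one successfully matched key frame as the candidate frame.
S504: Calculate the visual relative pose from the two-dimensional coordinates of the feature points of the image and those of the candidate frame, using the epipolar geometry principle.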
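For example, from the two-dimensional coordinates of the mutually matched feature points, the fundamental matrix is found with the seven-point or eight-point method of epipolar geometry; when the camera intrinsic parameters are known, the fundamental matrix can be decomposed by matrix decomposition to obtain the visual relative pose. Alternatively, the essential matrix can be obtained directly with the five-point method, in which case the intrinsic parameters are not needed again for the decomposition (the five-point method itself works on normalized image coordinates), and decomposing the essential matrix likewise yields the visual relative pose.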
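For example, as shown in FIG. 6, when a movable device 61 carrying a camera 62 moves on a two-dimensional plane (XOY), for instance when the movable device is a sweeping robot, the pose of the movable device can be expressed as (x,y,θ), and the visual relative pose to be calculated can be expressed as (Δα,Δθ),
where
Figure PCTCN2019102686-appb-000018
Suppose the two-dimensional coordinates of one pair of mutually matched feature points are:
(u_i, v_i, 1)^T, (u′_i, v′_i, 1)^T
Then, in the ideal case, the following equation holds:
Figure PCTCN2019102686-appb-000019
where K is the intrinsic parameter matrix of the camera.
Given multiple pairs of matched feature points, there are multiple corresponding instances of equation (1); the visual relative pose (Δα,Δθ) is found by optimization that brings these equations as close to 0 as possible.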
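Equation (1) itself is given in a figure that is not reproduced here, so the following sketch uses the generic epipolar constraint for planar motion rather than the exact form of the patent. It assumes that the pixel coordinates have already been back-projected with the inverse of K and expressed as unit bearing vectors in the device body frame, that Δα is the direction angle of the translation, and that Δθ is a rotation about the vertical axis. All names are hypothetical. The residual is zero for the true (Δα, Δθ) regardless of the travelled distance, which is why the visual relative pose has one position dimension fewer.

```python
import math


def rot_z(theta, v):
    """Rotate vector v about the z axis by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def unit(v):
    """Scale v to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def mean_epipolar_residual(d_alpha, d_theta, rays_a, rays_b):
    """Mean |ray_a . (t x R ray_b)| over all matched bearing-vector pairs."""
    t = (math.cos(d_alpha), math.sin(d_alpha), 0.0)  # unit translation direction
    total = 0.0
    for ray_a, ray_b in zip(rays_a, rays_b):
        n = cross(t, rot_z(d_theta, ray_b))
        total += abs(sum(p * q for p, q in zip(ray_a, n)))
    return total / len(rays_a)


# Synthetic check: frame B lies 2 units away in direction 0.3 rad and is turned by 0.5 rad.
points = [(3.0, 1.0, 0.5), (4.0, -1.0, 1.0), (2.5, 0.3, -0.4), (5.0, 2.0, 0.2), (3.5, -1.5, 0.8)]
true_alpha, true_theta, dist = 0.3, 0.5, 2.0
origin_b = (dist * math.cos(true_alpha), dist * math.sin(true_alpha), 0.0)
rays_a = [unit(p) for p in points]
rays_b = [unit(rot_z(-true_theta, tuple(p[i] - origin_b[i] for i in range(3)))) for p in points]
good = mean_epipolar_residual(true_alpha, true_theta, rays_a, rays_b)
bad = mean_epipolar_residual(1.0, true_theta, rays_a, rays_b)
```

An optimizer would search over (Δα, Δθ) for the smallest mean residual; the same mean residual can then be compared with a threshold to judge whether the resulting pose is reasonable.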
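The above takes movement in two-dimensional space as an example. It can be understood that, by the same calculation principle, the visual relative pose of a movable device moving in three-dimensional space can also be calculated. The pose of the movable device can then be expressed as (x,y,z,φ,ψ,θ), and the visual relative pose to be calculated can be expressed as (Δα,Δβ,Δφ,Δψ,Δθ),
where
Figure PCTCN2019102686-appb-000020
Suppose the two-dimensional coordinates of one pair of mutually matched feature points are:
(u_i, v_i, 1)^T, (u′_i, v′_i, 1)^T
Then, in the ideal case, the following equation holds:
Figure PCTCN2019102686-appb-000021
where
Figure PCTCN2019102686-appb-000022
and K is the intrinsic parameter matrix of the camera.
Given multiple pairs of matched feature points, there are multiple corresponding instances of equation (2); the visual relative pose (Δα,Δβ,Δφ,Δψ,Δθ) is found by optimization that brings these equations as close to 0 as possible.
Define:
X=(Δα,Δβ,Δφ,Δψ,Δθ)
Θ_i=(u_i, v_i, u′_i, v′_i)
Then equation (2) can be expressed as:
Figure PCTCN2019102686-appb-000023
where Θ=(K,Θ_1,…,Θ_i,…,Θ_n) and n is the number of pairs of mutually matched feature points.
By the variance propagation rule, the covariance matrix of the visual relative pose can be calculated by:
Figure PCTCN2019102686-appb-000024
where Σ(X) is the covariance matrix of the visual relative pose, and Σ(Θ) is an empirical value related to the camera parameters and the sensor noise.
It can be understood that the above way of calculating the covariance matrix is only an example; an approximate calculation can also be made from experience, for example by setting the covariance matrix according to the number of matched points: the more matched points, the smaller the values of the elements of the covariance matrix, and vice versa.
There are also several ways to calculate the dead-reckoning relative pose; one is described here:
Suppose the dead-reckoning relative pose between time t1 and time t2 is to be calculated. Taking time t1 as the start and time t2 as the end, the initial pose at the start is set to 0, and the data of the dead-reckoning sensor are integrated to obtain the dead-reckoning relative pose.
The covariance of the dead-reckoning relative pose can be calculated by the variance propagation rule; the whole calculation can follow the process equation of the Kalman filter.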
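The integration described above can be sketched for planar motion. The sketch assumes that the dead-reckoning sensor delivers samples of linear velocity, angular velocity, and time step, and it uses simple Euler integration; both the sample format and the function name are assumptions for illustration.

```python
import math


def integrate_odometry(samples):
    """Integrate (v, omega, dt) samples starting from the zero pose.

    Returns the dead-reckoning relative pose (dx, dy, dtheta) expressed in the
    frame of the device at the start time t1.
    """
    x = y = theta = 0.0  # the initial pose at t1 is set to 0
    for v, omega, dt in samples:
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += omega * dt
    return x, y, theta


straight = integrate_odometry([(1.0, 0.0, 0.1)] * 10)       # 1 m/s for 1 s
turn = integrate_odometry([(0.0, math.pi / 2, 0.5)] * 2)    # turn in place for 1 s
```

Driving straight for one second at 1 m/s yields roughly (1, 0, 0), and turning in place at π/2 rad/s for one second yields a pure rotation of π/2.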
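S505: Determine whether the visual relative pose is reasonable. If so, go to S506; otherwise repeat S502 and the subsequent steps.
A reasonableness condition can be preset: when the visual relative pose satisfies the preset condition, it is reasonable; otherwise it is unreasonable.
The preset reasonableness condition includes, for example, that the image reprojection error calculated from the visual relative pose is smaller than a preset error value. Specifically, after the visual relative pose has been calculated, it is substituted into the multiple instances of equation (1) above; if the difference between the overall value of these equations and 0 is smaller than a threshold, the calculated visual relative pose is reasonable. The overall value is calculated from the multiple equations, for example as their mean or by some other operation. The preset reasonableness condition can also be based on the current covariance matrix: if the current covariance matrix indicates that the current visual relative pose is credible, the pose is reasonable. The covariance matrix is determined from information such as the noise of the dead-reckoning sensor and abnormal behavior of the movable device, such as slipping.
S506: Determine whether the number of reasonable visual relative poses has reached a preset count threshold. If so, go to S507; otherwise repeat S502 and the subsequent steps.
For example, the number of reasonable visual relative poses can initially be set to 0 and incremented by 1 each time a reasonable visual relative pose is calculated; while the number is below the preset count threshold, the threshold has not been reached, and when the number equals the preset count threshold, it has been reached.
S507: For each reasonable visual relative pose, obtain the visual relative pose related information and add it to the preprocessing result.
The visual relative pose related information includes the visual relative pose, its covariance matrix, and the two node identifiers it relates. For the details of the related information, see the corresponding content in S406; they are not repeated here.
It should be noted that there may be one or more reasonable visual relative poses, so the preprocessing result may include one or more groups of visual relative pose related information. Each group corresponds to one reasonable visual relative pose and may specifically include the visual relative pose and its covariance matrix, the current node identifier, and the associated node identifier. Furthermore, if there are several reasonable visual relative poses, they all share the same current node identifier, so the current node identifier can either be included in every group or be left out of the groups and shared by them. For example, with two reasonable visual relative poses the preprocessing result can take either of the following two forms:
Form 1: first visual relative pose, covariance matrix of the first visual relative pose, current node identifier, first associated node identifier; second visual relative pose, covariance matrix of the second visual relative pose, current node identifier, second associated node identifier; or
Form 2: current node identifier; first visual relative pose, covariance matrix of the first visual relative pose, first associated node identifier; second visual relative pose, covariance matrix of the second visual relative pose, second associated node identifier.
When there is no reasonable visual relative pose, the calculation of the visual relative pose ends and the preprocessing result contains no visual relative pose related information. If the data preprocessing module calculates the dead-reckoning relative pose related information, the preprocessing result includes that information; otherwise it includes the original pose.
In this embodiment, giving priority to key frames with high similarity when calculating the visual relative pose improves accuracy and efficiency.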
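The loop formed by S501 to S507 can be sketched as follows. The similarity measure, the pose estimator, and the reasonableness check are passed in as functions because the text allows several choices for each; the dictionary layout of the result (Form 1 above) and all names are assumptions made for illustration.

```python
def compute_visual_relative_poses(image, matched_keyframes, similarity,
                                  estimate_pose, is_reasonable, count_threshold):
    """Follow S501-S507 with caller-supplied similarity, estimation and check functions."""
    # S501: sort so that the most similar key frame is tried first.
    queue = sorted(matched_keyframes, key=lambda kf: similarity(image, kf), reverse=True)
    results = []
    while queue:                                             # S502: stop when the queue is empty
        candidate = queue.pop(0)                             # S503: next candidate frame
        pose, covariance = estimate_pose(image, candidate)   # S504
        if not is_reasonable(pose, covariance):              # S505
            continue
        results.append({                                     # one group of related information
            "pose": pose,
            "covariance": covariance,
            "current_node_id": image["id"],
            "associated_node_id": candidate["id"],
        })
        if len(results) >= count_threshold:                  # S506
            break
    return results                                           # S507: goes into the preprocessing result


# Stub inputs: the pose estimate is faked from the key frame id for demonstration only.
image = {"id": 10}
keyframes = [{"id": 1, "sim": 0.2}, {"id": 2, "sim": 0.9},
             {"id": 3, "sim": 0.5}, {"id": 4, "sim": 0.7}]
out = compute_visual_relative_poses(
    image, keyframes,
    similarity=lambda img, kf: kf["sim"],
    estimate_pose=lambda img, kf: ((kf["id"], 0.0), 1.0),
    is_reasonable=lambda pose, cov: pose[0] != 4,   # pretend key frame 4 gives an unreasonable pose
    count_threshold=2)
```

In the stub run, key frames are tried in the order 2, 4, 3; key frame 4 is rejected, and the loop stops after key frames 2 and 3 have produced two reasonable poses, so key frame 1 is never evaluated.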
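FIG. 7 is a schematic flowchart of creating a key frame according to an exemplary embodiment.
As shown in FIG. 7, the flow for creating a key frame includes:
S701: Obtain the current image.
For example, the image currently received from the visual sensor is taken as the current image.
S702: Determine whether a preset creation condition is satisfied. If so, go to S703; otherwise go to S705.
The creation condition can be set as required. For example, a time difference since the last key frame was created can be set, and key frame creation is triggered when that time difference is reached. Alternatively, the trigger can be based on the original pose obtained from the dead-reckoning sensor: creation is triggered when the dead-reckoning relative pose calculated from the original pose exceeds a certain threshold. Creation can also be triggered by the current operating state, such as turning, or by the overlap between two adjacent captured images, when the proportion of the overlapping region falls below a predetermined value.
S703: Determine whether the number of feature points of the current image is greater than or equal to an extraction threshold. If so, go to S704; otherwise go to S705.
For the feature-point extraction process, see the related description above; it is not repeated here. Feature-point extraction yields the two-dimensional coordinates of the feature points in the pixel coordinate system and their descriptors.
The extraction threshold is a preset value.
S704: Creation succeeds; the obtained image is stored in the key frame database as the newly created key frame.
In addition, when creation succeeds, the current node identifier can also be output to the data fusion module; the current node identifier can be chosen as the image identifier of the newly created key frame. A new key frame identifier can also be included, which indicates that a new key frame has been created.
It should be noted that feature-point extraction on the image yields the two-dimensional coordinates and descriptors of the feature points, and the key frame stored in the key frame database also includes the absolute pose of the movable device in the global coordinate system when the image was captured. That absolute pose is calculated from the absolute pose corresponding to the previous image or key frame and the dead-reckoning relative pose over the corresponding time interval. The time correspondence between an image and the original pose can be aligned as follows: suppose the image was captured at time t0; the original poses at the two time points t1 and t2 closest to t0 are obtained and interpolated to give the interpolated original pose. The dead-reckoning relative pose is calculated from the original pose corresponding to the previous image or key frame and the original pose corresponding to the image; adding this dead-reckoning relative pose to the absolute pose corresponding to the previous image or key frame gives the absolute pose of the image, which can be optimized and updated later.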
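The time alignment and pose composition described in the last paragraph can be sketched for a planar pose (x, y, θ). Linear interpolation, the SE(2) composition rule, and the function names are assumptions of this sketch; the text only states that the two nearest original poses are interpolated and that the relative pose is added to the previous absolute pose.

```python
import math


def wrap(angle):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def interpolate_pose(t0, t1, pose1, t2, pose2):
    """Linearly interpolate the original pose (x, y, theta) at image time t0."""
    w = (t0 - t1) / (t2 - t1)
    x = pose1[0] + w * (pose2[0] - pose1[0])
    y = pose1[1] + w * (pose2[1] - pose1[1])
    theta = wrap(pose1[2] + w * wrap(pose2[2] - pose1[2]))
    return x, y, theta


def relative_pose(pose_from, pose_to):
    """Dead-reckoning relative pose of pose_to expressed in the frame of pose_from."""
    dx, dy = pose_to[0] - pose_from[0], pose_to[1] - pose_from[1]
    c, s = math.cos(pose_from[2]), math.sin(pose_from[2])
    return c * dx + s * dy, -s * dx + c * dy, wrap(pose_to[2] - pose_from[2])


def compose(absolute, relative):
    """Add a relative pose to an absolute pose."""
    c, s = math.cos(absolute[2]), math.sin(absolute[2])
    return (absolute[0] + c * relative[0] - s * relative[1],
            absolute[1] + s * relative[0] + c * relative[1],
            wrap(absolute[2] + relative[2]))


# Original pose of the previous key frame and interpolated original pose of the new image.
raw_prev = (1.0, 1.0, math.pi / 2)
raw_now = interpolate_pose(0.5, 0.0, (1.0, 2.0, math.pi / 2), 1.0, (1.0, 4.0, math.pi / 2))
rel = relative_pose(raw_prev, raw_now)
abs_now = compose((0.0, 0.0, 0.0), rel)   # absolute pose of the previous key frame is the origin
```

In the example the device moves 2 units forward along its heading in the original-pose frame; applied to an absolute pose at the origin with heading 0, this places the new key frame at about (2, 0, 0).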
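S705: End.
In this embodiment, the two-dimensional coordinates of the feature points are extracted and stored when a key frame is created, and the three-dimensional coordinates of the spatial points corresponding to the feature points need not be calculated. This avoids the problems caused by calculating three-dimensional coordinates and increases the number of key frames in the key frame database, which in turn raises the success rate of matching images against key frames and improves localization and mapping.
Data fusion can use either a filtering method or a nonlinear optimization method; the nonlinear optimization method is taken as the example here. FIG. 8 is a schematic flowchart of data fusion according to an exemplary embodiment.
As shown in FIG. 8, the processing flow of the data fusion module includes:
S801: Receive the preprocessing result.
If the data preprocessing module calculates the dead-reckoning relative pose related information, the preprocessing result always includes the dead-reckoning relative pose and its covariance matrix; the covariance matrix of the dead-reckoning relative pose can be determined from the sensor noise, abnormal behavior of the movable device, and so on.
When there is at least one successfully matched key frame and a reasonable visual relative pose is calculated from it, the preprocessing result includes the dead-reckoning relative pose, the visual relative pose, the covariance matrix of the dead-reckoning relative pose, the covariance matrix of the visual relative pose, and the associated node identifier. It may further include a new key frame identifier and a new node identifier.
When the key frame creation condition is satisfied and a key frame is created successfully, the preprocessing result may include a new key frame identifier, which indicates that a new key frame has been created.
S802: Create the current node according to the preprocessing result.
When the preprocessing result includes a new key frame identifier and/or a new node identifier, a new node is created. The new node identifier results from the data preprocessing module deciding, according to a preset condition, whether to create a pose node; the preset condition concerns whether the distance, angle difference, and time interval between the current original pose and the original pose of the previous node (that is, the most recently created existing node) are inside or outside the corresponding threshold ranges.
In addition, when the new key frame identifier indicates that a key frame was created successfully, the current node created is a key frame node; otherwise it is a pose node.
S803: Determine an odometry edge from the dead-reckoning relative pose related information, and use the odometry edge to connect the current node to the most recently created existing node.
The data fusion module can record the absolute pose of each node. Initially this absolute pose is obtained by adding the current dead-reckoning relative pose to the absolute pose of the previous node; after optimization, each node holds the value optimized using the visual relative pose. In other words, after optimization the data fusion module records the optimized absolute pose of every node.
S804: Determine whether the existing nodes include a key frame node associated with the current node. If so, go to S805; otherwise go to S807.
When the preprocessing result contains visual relative pose related information, the associated node identifier can be obtained from it; if the existing nodes include the node indicated by that associated node identifier, a key frame node associated with the current node exists, and it is the node indicated by the associated node identifier.
S805: Determine a visual edge from the visual relative pose related information, and use the visual edge to connect the current node to the associated key frame node.
S806: Perform graph optimization on the nodes and edges to obtain the updated absolute pose of the movable device and the updated map.
The graph optimization can use, for example, g2o or Ceres. Taking g2o as an example, the input is the absolute poses of all nodes and the relative poses between nodes together with their covariance matrices, and the output is the optimized absolute pose of each node; the relative poses between nodes include visual relative poses and dead-reckoning relative poses. The optimized absolute pose of the current node serves as the current localization result, that is, the updated absolute pose of the movable device at its current position; the optimized absolute poses of the key frame nodes can be understood as the updated map, or as the updated absolute poses of the movable device at the various positions.
S807: End.
If the preprocessing result contains no visual relative pose related information, graph optimization is not needed. Graph optimization is performed only when visual relative pose related information is present and a loop has been closed through a visual edge.
If the preprocessing result includes only dead-reckoning relative pose related information, that is, only the dead-reckoning relative pose and its covariance matrix, the absolute pose of the movable device at the latest time is updated from that information; for example, the absolute pose at the latest time is taken as the absolute pose at the previous time plus the dead-reckoning relative pose.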
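The bookkeeping of S801 to S807 can be sketched with a small pose-graph container. The sketch only builds nodes and edges and reports whether a loop was closed; the optimization of S806 is left to an external solver such as g2o or Ceres and is not implemented here. The dictionary keys of the preprocessing result, the scalar stand-ins for covariance matrices, and the class name are all assumptions made for illustration.

```python
import math


def compose(pose, rel):
    """Add a planar relative pose (dx, dy, dtheta) to an absolute pose (x, y, theta)."""
    c, s = math.cos(pose[2]), math.sin(pose[2])
    return (pose[0] + c * rel[0] - s * rel[1], pose[1] + s * rel[0] + c * rel[1], pose[2] + rel[2])


class PoseGraph:
    def __init__(self):
        self.nodes = {}      # node id -> {"pose": (x, y, theta), "is_keyframe": bool}
        self.edges = []      # (kind, from_id, to_id, measurement, covariance)
        self.last_id = None  # most recently created node

    def add_result(self, result):
        """Process one preprocessing result; return True if graph optimization should run."""
        node_id = result["node_id"]
        # S802: create the current node, initialized from the previous node plus dead reckoning.
        if self.last_id is None:
            pose = (0.0, 0.0, 0.0)
        else:
            pose = compose(self.nodes[self.last_id]["pose"], result["dr_pose"])
            # S803: odometry edge to the most recently created node.
            self.edges.append(("odometry", self.last_id, node_id,
                               result["dr_pose"], result["dr_cov"]))
        self.nodes[node_id] = {"pose": pose, "is_keyframe": result.get("new_keyframe", False)}
        self.last_id = node_id
        # S804/S805: visual edges to associated key frame nodes that already exist.
        loop_closed = False
        for info in result.get("visual", []):
            assoc = info["associated_node_id"]
            if assoc != node_id and assoc in self.nodes and self.nodes[assoc]["is_keyframe"]:
                self.edges.append(("visual", assoc, node_id, info["pose"], info["cov"]))
                loop_closed = True
        # S806 would pass nodes and edges to a graph optimizer when loop_closed is True.
        return loop_closed


graph = PoseGraph()
r1 = graph.add_result({"node_id": 1, "new_keyframe": True,
                       "dr_pose": (0.0, 0.0, 0.0), "dr_cov": 0.01})
r2 = graph.add_result({"node_id": 2, "dr_pose": (1.0, 0.0, 0.0), "dr_cov": 0.01})
r3 = graph.add_result({"node_id": 3, "dr_pose": (1.0, 0.0, 0.0), "dr_cov": 0.01,
                       "visual": [{"associated_node_id": 1, "pose": (0.0, 0.0), "cov": 0.1}]})
```

Only the third result carries visual relative pose related information that refers to an existing key frame node, so only it closes a loop and would trigger graph optimization; the first two results update the pose by dead reckoning alone.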
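In this embodiment, by constructing nodes and edges and performing graph optimization, the absolute pose and/or the map can be updated quickly once the data fusion module receives visual relative pose related information.
In another embodiment, when the data fusion module uses a filtering method, the preprocessing result is filtered to update the absolute pose of the movable device and the map; if visual relative pose related information has been obtained, the preprocessing result includes both the visual relative pose related information and the dead-reckoning relative pose related information.
Taking the Kalman filter as an example, this can be done as follows:
As the related content of the embodiment above shows, the recursion formulas involved in Kalman filtering include:
Figure PCTCN2019102686-appb-000025
Figure PCTCN2019102686-appb-000026
Figure PCTCN2019102686-appb-000027
Figure PCTCN2019102686-appb-000028
Figure PCTCN2019102686-appb-000029
Specifically, for localization of the movable device, suppose the current time is time k; the absolute pose of the movable device to be found at the current time is the state vector
Figure PCTCN2019102686-appb-000030
above, which is usually a vector of N elements whose i-th element is the absolute pose of the movable device at time (k-N+i), N being a preset value. In the recursion,
Figure PCTCN2019102686-appb-000031
takes the absolute pose of the movable device before the update, z_k takes the visual relative pose, u_k takes the dead-reckoning relative pose, Q takes the covariance matrix of the dead-reckoning relative pose, and R takes the covariance matrix of the visual relative pose.
The absolute pose of the movable device can therefore also be updated with a filtering method, completing localization and/or mapping.
None of the steps of the present invention involves calculating the three-dimensional coordinates of the spatial points corresponding to the feature points. This avoids the various restrictions of calculating three-dimensional coordinates in the prior art and improves the accuracy and computation speed of the final localization and mapping results.
FIG. 9 is a schematic structural diagram of a controller according to an exemplary embodiment.
As shown in FIG. 9, the controller includes a memory 91 and a processor 92. The memory 91 stores executable instructions; when the instructions in the memory are executed by the processor, the VSLAM method described above is performed.
An embodiment of the present invention further provides a non-transitory computer-readable storage medium; when the instructions in the storage medium are executed by the controller of a movable device, the VSLAM method described above is performed.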
本发明实施例还提供一种VSLAM装置,该装置可以应用到可移动设备上,如图10所示,包括:第一接收模块101、匹配模块102、第一计算模块103和第一更新模块104。
第一接收模块101,用于接收视觉传感器发送的图像;
匹配模块102,用于向预先建立的关键帧数据库内读取关键帧,以及,在读取到关键帧后,对所述图像与读取到的关键帧进行匹配;
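The first calculation module 103, configured to calculate visual relative pose related information from the successfully matched image and key frame, where the visual relative pose related information includes the visual relative pose, and the visual relative pose is calculated from the two-dimensional coordinates of the mutually matched feature points of the successfully matched image and key frame;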
第一更新模块104,用于在得到视觉相对位姿相关信息时,根据所述视觉相对位姿相关信息和航位推算相对位姿相关信息更新可移动设备的绝对位姿和地图。
可选的,所述关键帧包括:绝对位姿,所述绝对位姿为拍摄所述关键帧依据的图像时可移动设备在全局坐标系中的位姿,所述地图中包括至少一个节点的绝对位姿,所述装置还包括:
第二更新模块,用于根据更新后的地图中的节点的绝对位姿,更新与节点对应的关键帧中的绝对位姿。
可选的,所述第一计算模块具体用于:
对匹配成功的关键帧进行排序;
依序选择一个匹配成功的关键帧作为候选帧;
根据图像的特征点的二维坐标和候选帧的特征点的二维坐标,采用对极几何原理,计算得到视觉相对位姿;
根据预设的合理条件,判断所述视觉相对位姿是否合理;
重新选择候选帧及后续计算,直至循环结束,所述循环结束包括:合理的视觉相对位姿的个数达到预设的个数阈值,或者,所有匹配成功的关键帧均被选择;
在循环结束后,如果存在合理的视觉相对位姿,则将合理的视觉相对位姿及其相关信息加入到预处理结果中,所述相关信息包括:协方差矩阵和所关联的两个节点标识。
可选的,还包括:
创建模块,用于在达到预设的创建条件时,对所述图像提取特征点,得到特征点的二维坐标和描述子;在提取的特征点的个数大于或等于预设提取阈值时,则创建新的关键帧,并将所述新的关键帧存储到关键帧数据库中,所述新的关键帧包括:所述特征点的二维坐标和描述子。
可选的,所述关键帧还包括:绝对位姿,所述绝对位姿为拍摄所述关键帧依据的图像时可移动设备在全局坐标系中的位姿,所述装置还包括:
获取模块,用于根据前一幅图像或关键帧对应的绝对位姿以及相应时间间隔对应的航位推算相对位姿来计算所述图像对应的绝对位姿。
可选的,还包括:
第二接收模块,用于接收航位推算传感器发送的原始位姿;或者,接收航位推算传感器发送的运动数据,并根据所述运动数据计算得到原始位姿;
第二计算模块,用于根据所述原始位姿计算得到所述航位推算相对位姿相关信息。
可选的,所述第一更新模块具体用于:
在得到视觉相对位姿相关信息后,创建当前节点;
根据所述航位推算相对位姿相关信息确定里程边,并采用所述里程边将所述当前节点连接到已有的最后创建的节点上;
在已有节点中存在与所述当前节点关联的关键帧节点时,根据所述视觉相对位姿相关信息确定视觉边,并采用所述视觉边将所述当前节点连接到相关联的关键帧节点上;以及,对节点和边进行图优化,得到更新后的可移动设备的绝对位姿和地图。
可选的,还包括:
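A third update module, configured to update the absolute pose of the movable device at the latest time using the dead-reckoning relative pose related information when the visual relative pose related information cannot be obtained.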
可选的,所述二维坐标为特征点在像素坐标系上的二维坐标。
可选的,第一更新模块具体用于:对所述视觉相对位姿相关信息和航位推算相对位姿相关信息进行滤波,更新可移动设备的绝对位姿和地图。
关于上述实施例中的装置,其中各个模块执行操作的具体方式已经在有关该方法的实施例中进行了详细描述,此处将不做详细阐述说明。
可以理解的是,上述各实施例中相同或相似部分可以相互参考,在一些实施例中未详细说明的内容可以参见其他实施例中相同或相似的内容。
需要说明的是,在本发明的描述中,术语“第一”、“第二”等仅用于描述目的,而不能理解为指示或暗示相对重要性。此外,在本发明的描述中,除非另有说明,“多个”的含义是指至少两个。
流程图中或在此以其他方式描述的任何过程或方法描述可以被理解为,表示包括一个或更多个用于实现特定逻辑功能或过程的步骤的可执行指令的代码的模块、片段或部分,并且本发明的优选实施方式的范围包括另外的实现,其中可以不按所示出或讨论的顺序,包括根据所涉及的功能按基本同时的方式或按相反的顺序,来执行功能,这应被本发明的实施例所属技术领域的技术人员所理解。
应当理解,本发明的各部分可以用硬件、软件、固件或它们的组合来实现。在上述实施方式中,多个步骤或方法可以用存储在存储器中且由合适的指令执行系统执行的软件或固件来实现。例如,如果用硬件来实现,和在另一实施方式中一样,可用本领域公知的下列技术中的任一项或他们的组合来实现:具有用于对数据信号实现逻辑功能的逻辑门电路的离散逻辑电路,具有合适的组合逻辑门电路的专用集成电路,可编程门阵列(PGA),现场可编程门阵列(FPGA)等。
本技术领域的普通技术人员可以理解实现上述实施例方法携带的全部或部分步骤是可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,该程序在执行时,包括方法实施例的步骤之一或其组合。
此外,在本发明各个实施例中的各功能单元可以集成在一个处理模块中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。所述集成的模块如果以软件功能模块的形式实现并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。
上述提到的存储介质可以是只读存储器,磁盘或光盘等。
在本说明书的描述中,参考术语“一个实施例”、“一些实施例”、“示例”、“具体示例”、或“一些示例”等的描述意指结合该实施例或示例描述的具体特征、结构、材料或者特点包含于本发明的至少一个实施例或示例中。在本说明书中,对上述术语的示意性表述不一定指的是相同的实施例或示例。而且,描述的具体特征、结构、材料或者特点可以在任何的一个或多个实施例或示例中以合适的方式结合。
尽管上面已经示出和描述了本发明的实施例,可以理解的是,上述实施例是示例性的,不能理解为对本发明的限制,本领域的普通技术人员在本发明的范围内可以对上述实施例进行变化、修改、替换和变型。

Claims (10)

  1. 一种VSLAM方法,其特征在于,包括:
    接收视觉传感器发送的图像;
    向预先建立的关键帧数据库内读取关键帧,以及,在读取到关键帧后,对所述图像与读取到的关键帧进行匹配;
    根据匹配成功的图像和关键帧计算视觉相对位姿相关信息,其中,所述视觉相对位姿相关信息包括视觉相对位姿,所述视觉相对位姿根据匹配成功的图像和关键帧之间的相互匹配的特征点的二维坐标计算得到;
    如果得到视觉相对位姿相关信息,则根据所述视觉相对位姿相关信息和航位推算相对位姿相关信息更新可移动设备的绝对位姿和地图。
  2. 根据权利要求1所述的方法,其特征在于,所述关键帧包括:绝对位姿,所述绝对位姿为拍摄所述关键帧依据的图像时可移动设备在全局坐标系中的位姿,所述地图中包括至少一个节点的绝对位姿,所述方法还包括:
    根据更新后的地图中的节点的绝对位姿,更新与节点对应的关键帧中的绝对位姿。
  3. 根据权利要求1所述的方法,其特征在于,所述根据匹配成功的图像和关键帧计算视觉相对位姿相关信息,包括:
    对匹配成功的关键帧进行排序;
    依序选择一个匹配成功的关键帧作为候选帧;
    根据图像的特征点的二维坐标和候选帧的特征点的二维坐标,采用对极几何原理,计算得到视觉相对位姿;
    根据预设的合理条件,判断所述视觉相对位姿是否合理;
    重新选择候选帧及后续计算,直至循环结束,所述循环结束包括:合理的视觉相对位姿的个数达到预设的个数阈值,或者,所有匹配成功的关键帧均被选择;
    在循环结束后,如果存在合理的视觉相对位姿,则将合理的视觉相对位姿及其相关信息组成视觉相对位姿相关信息,所述相关信息包括:协方差矩阵和所关联的两个节点标识。
  4. 根据权利要求1所述的方法,其特征在于,还包括:
    在达到预设的创建条件时,对所述图像提取特征点,得到特征点的二维坐标和描述子;
    在提取的特征点的个数大于或等于预设提取阈值时,则创建新的关键帧,并将所述新的关键帧存储到关键帧数据库中,所述新的关键帧包括:所述特征点的二维坐标和描述子。
  5. 根据权利要求4所述的方法,其特征在于,所述关键帧还包括:绝对位姿,所述绝对位姿为拍摄所述关键帧依据的图像时可移动设备在全局坐标系中的位姿,所述方法还包括:
    根据前一幅图像或关键帧对应的绝对位姿以及相应时间间隔对应的航位推算相对位姿来计算所述图像对应的绝对位姿。
  6. 根据权利要求1所述的方法,其特征在于,还包括:
    接收航位推算传感器发送的原始位姿;或者,接收航位推算传感器发送的运动数据,并根据所述运动数据计算得到原始位姿;
    根据所述原始位姿计算得到所述航位推算相对位姿相关信息。
  7. 根据权利要求1所述的方法,其特征在于,所述根据所述视觉相对位姿相关信息和航位推算相对位姿相关信息更新可移动设备的绝对位姿和地图,包括:
    在得到视觉相对位姿相关信息后,创建当前节点;
    根据所述航位推算相对位姿相关信息确定里程边,并采用所述里程边将所述当前节点连接到已有的最后创建的节点上;
    在已有节点中存在与所述当前节点关联的关键帧节点时,根据所述视觉相对位姿相关信息确定视觉边,并采用所述视觉边将所述当前节点连接到相关联的关键帧节点上;以及,对节点和边进行图优化,得到更新后的可移动设备的绝对位姿和地图。
  8. 根据权利要求1-7任一项所述的方法,其特征在于,还包括:
    如果不能得到视觉相对位姿相关信息,则采用航位推算相对位姿相关信息更新最近时刻的可移动设备的绝对位姿。
  9. 一种控制器,其特征在于,包括:
    处理器;以及,用于存储处理器可执行指令的存储器;
    其中,当所述存储器中的指令被所述处理器执行时,执行如权利要求1-8任一项所述的方法。
  10. 一种可移动设备,其特征在于,包括:
    航位推算传感器,用于计算原始位姿或者运动数据,以直接获取原始位姿或根据运动数据计算得到原始位姿;
    视觉传感器,用于采集图像;
    控制器,与所述航位推算传感器和所述视觉传感器连接,用于执行如权利要求1-8任一项所述的方法。
PCT/CN2019/102686 2019-01-28 2019-08-27 Vslam方法、控制器和可移动设备 Ceased WO2020155615A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2021543539A JP2022523312A (ja) 2019-01-28 2019-08-27 Vslam方法、コントローラ及び移動可能機器
EP19913780.3A EP3919863A4 (en) 2019-01-28 2019-08-27 VSLAM METHOD, CONTROL UNIT AND MOBILE DEVICE
US16/718,560 US10782137B2 (en) 2019-01-28 2019-12-18 Methods, apparatus, and systems for localization and mapping
US16/994,579 US11629965B2 (en) 2019-01-28 2020-08-15 Methods, apparatus, and systems for localization and mapping

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910109823.0 2019-01-28
CN201910109823.0A CN111489393B (zh) 2019-01-28 2019-01-28 Vslam方法、控制器和可移动设备

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/718,560 Continuation US10782137B2 (en) 2019-01-28 2019-12-18 Methods, apparatus, and systems for localization and mapping

Publications (1)

Publication Number Publication Date
WO2020155615A1 true WO2020155615A1 (zh) 2020-08-06

Family

ID=71796767

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/102686 Ceased WO2020155615A1 (zh) 2019-01-28 2019-08-27 Vslam方法、控制器和可移动设备

Country Status (3)

Country Link
EP (1) EP3919863A4 (zh)
CN (1) CN111489393B (zh)
WO (1) WO2020155615A1 (zh)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111968157A (zh) * 2020-08-13 2020-11-20 深圳国信泰富科技有限公司 一种应用于高智能机器人的视觉定位系统及方法
CN112114966A (zh) * 2020-09-15 2020-12-22 杭州未名信科科技有限公司 一种视觉slam的光束平差计算方法
CN112149692A (zh) * 2020-10-16 2020-12-29 腾讯科技(深圳)有限公司 基于人工智能的视觉关系识别方法、装置及电子设备
CN112665575A (zh) * 2020-11-27 2021-04-16 重庆大学 一种基于移动机器人的slam回环检测方法
CN112880675A (zh) * 2021-01-22 2021-06-01 京东数科海益信息科技有限公司 用于视觉定位的位姿平滑方法、装置、终端和移动机器人
CN113465617A (zh) * 2021-07-08 2021-10-01 上海汽车集团股份有限公司 一种地图构建方法、装置及电子设备
CN113607158A (zh) * 2021-08-05 2021-11-05 中铁工程装备集团有限公司 基于可见光通信的平板光源视觉识别匹配定位方法及系统
CN113963030A (zh) * 2021-11-09 2022-01-21 福州大学 一种提高单目视觉初始化稳定性的方法
CN114147707A (zh) * 2021-11-25 2022-03-08 上海思岚科技有限公司 一种基于视觉识别信息的机器人对接方法与设备
CN114184193A (zh) * 2020-09-14 2022-03-15 杭州海康威视数字技术股份有限公司 定位方法及系统
CN114494825A (zh) * 2021-12-31 2022-05-13 重庆特斯联智慧科技股份有限公司 一种机器人定位方法及装置
CN114593735A (zh) * 2022-01-26 2022-06-07 奥比中光科技集团股份有限公司 一种位姿预测方法及装置
CN114814872A (zh) * 2020-08-17 2022-07-29 浙江商汤科技开发有限公司 位姿确定方法及装置、电子设备和存储介质
CN115031735A (zh) * 2022-05-20 2022-09-09 北京理工大学 基于结构特征的单目视觉惯性里程计系统的位姿估计方法
WO2022217882A1 (zh) * 2021-04-15 2022-10-20 深圳市慧鲤科技有限公司 位姿数据的处理方法及接口、装置、系统、设备和介质
CN115439536A (zh) * 2022-08-18 2022-12-06 北京百度网讯科技有限公司 视觉地图更新方法、装置及电子设备
CN115540867A (zh) * 2021-12-31 2022-12-30 深圳市普渡科技有限公司 机器人、地图构建方法、装置和可读存储介质
CN116105720A (zh) * 2023-04-10 2023-05-12 中国人民解放军国防科技大学 低照度场景机器人主动视觉slam方法、装置和设备
CN116202551A (zh) * 2021-11-30 2023-06-02 珠海一微半导体股份有限公司 一种视觉机器人路标定位有效检测方法
CN116309829A (zh) * 2023-02-28 2023-06-23 无锡赛锐斯医疗器械有限公司 一种基于多目视觉的长方体扫描体群解码和位姿测量方法

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112307917A (zh) * 2020-10-21 2021-02-02 国网上海市电力公司 一种融合视觉里程计及imu的室内定位方法
CN112325770B (zh) * 2020-10-26 2022-08-02 武汉中海庭数据技术有限公司 一种车端单目视觉测量相对精度置信度评估方法及系统
CN112444246B (zh) * 2020-11-06 2024-01-26 北京易达恩能科技有限公司 高精度的数字孪生场景中的激光融合定位方法
CN112699266B (zh) * 2020-12-30 2024-09-06 视辰信息科技(上海)有限公司 一种基于关键帧相关性的视觉地图定位方法和系统
CN113034594A (zh) * 2021-03-16 2021-06-25 浙江商汤科技开发有限公司 位姿优化方法、装置、电子设备及存储介质
CN112734851B (zh) * 2021-03-29 2021-07-06 北京三快在线科技有限公司 一种位姿确定的方法以及装置
CN114216461A (zh) * 2021-09-29 2022-03-22 杭州图灵视频科技有限公司 一种基于全景相机的移动机器人室内定位方法及系统
CN114463429B (zh) * 2022-04-12 2022-08-16 深圳市普渡科技有限公司 机器人、地图创建方法、定位方法及介质
CN114969421B (zh) * 2022-06-08 2025-04-01 视辰信息科技(上海)有限公司 检索真值获取方法、图像检索方法及系统
CN114742884B (zh) * 2022-06-09 2022-11-22 杭州迦智科技有限公司 一种基于纹理的建图、里程计算、定位方法及系统
CN117948969A (zh) * 2022-10-31 2024-04-30 沃尔沃汽车公司 用于车辆定位的方法、设备、系统和计算机可读存储介质
CN115908366A (zh) * 2022-12-13 2023-04-04 北京柏惠维康科技股份有限公司 数据处理方法、装置、电子设备及存储介质

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106959691A (zh) * 2017-03-24 2017-07-18 联想(北京)有限公司 可移动电子设备和即时定位与地图构建方法
CN107193279A (zh) * 2017-05-09 2017-09-22 复旦大学 基于单目视觉和imu信息的机器人定位与地图构建系统
CN107677279A (zh) * 2017-09-26 2018-02-09 上海思岚科技有限公司 一种定位建图的方法及系统
CN108062537A (zh) * 2017-12-29 2018-05-22 幻视信息科技(深圳)有限公司 一种3d空间定位方法、装置及计算机可读存储介质
US20180211399A1 (en) * 2017-01-26 2018-07-26 Samsung Electronics Co., Ltd. Modeling method and apparatus using three-dimensional (3d) point cloud
CN108629829A (zh) * 2018-03-23 2018-10-09 中德(珠海)人工智能研究院有限公司 一种球幕相机与深度相机结合的三维建模方法和系统
CN108648274A (zh) * 2018-05-10 2018-10-12 华南理工大学 一种视觉slam的认知点云地图创建系统
CN108803591A (zh) * 2017-05-02 2018-11-13 北京米文动力科技有限公司 一种地图生成方法及机器人
CN108873908A (zh) * 2018-07-12 2018-11-23 重庆大学 基于视觉slam和网络地图结合的机器人城市导航系统

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2619742B1 (en) * 2010-09-24 2018-02-28 iRobot Corporation Systems and methods for vslam optimization
WO2017172778A1 (en) * 2016-03-28 2017-10-05 Sri International Collaborative navigation and mapping
US11263777B2 (en) * 2017-05-09 2022-03-01 Sony Corporation Information processing apparatus and information processing method
CN108682027A (zh) * 2018-05-11 2018-10-19 北京华捷艾米科技有限公司 vSLAM implementation method and system based on point and line feature fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3919863A4 *

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
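CN111968157B (zh) * 2020-08-13 2024-05-28 深圳国信泰富科技有限公司 Visual positioning system and method applied to a highly intelligent robot
CN111968157A (zh) * 2020-08-13 2020-11-20 深圳国信泰富科技有限公司 Visual positioning system and method applied to a highly intelligent robot
CN114814872A (zh) * 2020-08-17 2022-07-29 浙江商汤科技开发有限公司 Pose determination method and apparatus, electronic device, and storage medium
CN114184193A (zh) * 2020-09-14 2022-03-15 杭州海康威视数字技术股份有限公司 Positioning method and system
CN112114966A (zh) * 2020-09-15 2020-12-22 杭州未名信科科技有限公司 Bundle adjustment calculation method for visual SLAM
CN112149692A (zh) * 2020-10-16 2020-12-29 腾讯科技(深圳)有限公司 Artificial-intelligence-based visual relationship recognition method and apparatus, and electronic device
CN112149692B (zh) * 2020-10-16 2024-03-05 腾讯科技(深圳)有限公司 Artificial-intelligence-based visual relationship recognition method and apparatus, and electronic device
CN112665575A (zh) * 2020-11-27 2021-04-16 重庆大学 SLAM loop closure detection method based on a mobile robot
CN112665575B (zh) * 2020-11-27 2023-12-29 重庆大学 SLAM loop closure detection method based on a mobile robot
CN112880675A (zh) * 2021-01-22 2021-06-01 京东数科海益信息科技有限公司 Pose smoothing method and apparatus for visual positioning, terminal, and mobile robot
WO2022217882A1 (zh) * 2021-04-15 2022-10-20 深圳市慧鲤科技有限公司 Pose data processing method and interface, apparatus, system, device, and medium
CN113465617A (zh) * 2021-07-08 2021-10-01 上海汽车集团股份有限公司 Map construction method and apparatus, and electronic device
CN113465617B (zh) * 2021-07-08 2024-03-19 上海汽车集团股份有限公司 Map construction method and apparatus, and electronic device
CN113607158A (zh) * 2021-08-05 2021-11-05 中铁工程装备集团有限公司 Flat-panel light source visual recognition, matching, and positioning method and system based on visible light communication
CN113963030A (zh) * 2021-11-09 2022-01-21 福州大学 Method for improving the stability of monocular vision initialization
CN114147707B (zh) * 2021-11-25 2024-04-26 上海思岚科技有限公司 Robot docking method and device based on visual recognition information
CN114147707A (zh) * 2021-11-25 2022-03-08 上海思岚科技有限公司 Robot docking method and device based on visual recognition information
CN116202551A (zh) * 2021-11-30 2023-06-02 珠海一微半导体股份有限公司 Effective detection method for visual robot landmark positioning
CN115540867A (zh) * 2021-12-31 2022-12-30 深圳市普渡科技有限公司 Robot, map construction method and apparatus, and readable storage medium
CN114494825B (zh) * 2021-12-31 2024-04-19 重庆特斯联智慧科技股份有限公司 Robot positioning method and apparatus
CN114494825A (zh) * 2021-12-31 2022-05-13 重庆特斯联智慧科技股份有限公司 Robot positioning method and apparatus
CN114593735B (zh) * 2022-01-26 2024-05-31 奥比中光科技集团股份有限公司 Pose prediction method and apparatus
CN114593735A (zh) * 2022-01-26 2022-06-07 奥比中光科技集团股份有限公司 Pose prediction method and apparatus
CN115031735A (zh) * 2022-05-20 2022-09-09 北京理工大学 Pose estimation method for a monocular visual-inertial odometry system based on structural features
CN115439536A (zh) * 2022-08-18 2022-12-06 北京百度网讯科技有限公司 Visual map updating method and apparatus, and electronic device
CN115439536B (zh) * 2022-08-18 2023-09-26 北京百度网讯科技有限公司 Visual map updating method and apparatus, and electronic device
CN116309829B (zh) * 2023-02-28 2024-03-19 无锡赛锐斯医疗器械有限公司 Multi-camera-vision-based method for cuboid scan body group decoding and pose measurement
CN116309829A (zh) * 2023-02-28 2023-06-23 无锡赛锐斯医疗器械有限公司 Multi-camera-vision-based method for cuboid scan body group decoding and pose measurement
CN116105720A (zh) * 2023-04-10 2023-05-12 中国人民解放军国防科技大学 Active visual SLAM method, apparatus, and device for robots in low-illumination scenes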

Also Published As

Publication number Publication date
CN111489393A (zh) 2020-08-04
EP3919863A1 (en) 2021-12-08
EP3919863A4 (en) 2022-11-09
CN111489393B (zh) 2023-06-02

Similar Documents

Publication Publication Date Title
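WO2020155615A1 (zh) VSLAM method, controller, and movable device
US11629965B2 (en) Methods, apparatus, and systems for localization and mapping
JP2022523312A (ja) VSLAM method, controller, and movable device
US11830218B2 (en) Visual-inertial localisation in an existing map
JP6410530B2 (ja) Method for VSLAM optimization
CN112634451A (zh) Outdoor large-scene three-dimensional mapping method fusing multiple sensors
WO2022188094A1 (zh) Point cloud matching method and apparatus, navigation method and device, positioning method, and lidar
WO2025190241A1 (zh) BeiDou multi-source fusion positioning method in disaster environments
CN114924287B (zh) Map construction method, device, and medium
Cui et al. Efficient large-scale structure from motion by fusing auxiliary imaging information
CN112233177A (zh) UAV pose estimation method and system
GB2599947A (en) Visual-inertial localisation in an existing map
CN114325634A (zh) Highly robust lidar-based method for extracting traversable areas in field environments
Daoud et al. SLAMM: Visual monocular SLAM with continuous mapping using multiple maps
CN118293903A (zh) SLAM method based on the integration of IMU and point, line, and plane features
WO2024099593A1 (en) Localization based on neural networks
CN119863578A (zh) Semantic SLAM method for dynamic environments based on monocular vision and LiDAR fusion
WO2022062480A1 (zh) Positioning method and positioning apparatus for a mobile device
CN115861352A (zh) Data fusion and edge extraction method for monocular vision, IMU, and lidar
CN113570716A (zh) Cloud-based three-dimensional map construction method, system, and device
CN115619824A (zh) Visual-inertial dynamic target tracking SLAM apparatus, method, computer, and storage medium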
Lv et al. So-pfh: Semantic object-based point feature histogram for global localization in parking lot
Timotheatos et al. Visual horizon line detection for uav navigation
Le Barz et al. Absolute geo-localization thanks to Hidden Markov Model and exemplar-based metric learning
Bamann et al. Visual-inertial odometry with sparse map constraints for planetary swarm exploration

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19913780; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2021543539; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2019913780; Country of ref document: EP; Effective date: 20210830)
WWW WIPO information: withdrawn in national office (Ref document number: 2019913780; Country of ref document: EP)