WO2021007320A1 - Object detection in point clouds - Google Patents
Object detection in point clouds
- Publication number
- WO2021007320A1 (PCT/US2020/041200)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- points
- dimensional
- proposal
- location
- locations
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional [3D] objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/136—Segmentation; Edge detection involving thresholding
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional [3D] objects
- G06V20/647—Three-dimensional [3D] objects by matching two-dimensional images to three-dimensional objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30248—Vehicle exterior or interior
- G06T2207/30252—Vehicle exterior; Vicinity of vehicle
Definitions
- a system for detecting objects within point clouds obtains point cloud data representing a sensor measurement of a scene captured by one or more sensors and including three-dimensional points in the scene, and then determines multiple two-dimensional proposal locations based on the three-dimensional points in the scene.
- the system generates, for each two-dimensional proposal location, a feature representation from three-dimensional points in the point cloud data that are near the two-dimensional proposal location.
- the system then processes the feature representations of the two-dimensional proposal locations using an object detection neural network that is configured to generate an object detection output that identifies objects in the scene.
- the system described in this specification can process point cloud data representing a sensor measurement of a scene captured by one or more sensors to generate an object detection output that identifies locations of one or more objects in the scene.
- the one or more sensors can be sensors of an autonomous vehicle (e.g., LIDAR sensors), the scene can be a scene that is in the vicinity of the autonomous vehicle, and the object detection output can be used to make autonomous driving decisions for the vehicle, to display information to operators or passengers of the vehicle, or both.
- the system implements a non-convolutional point-based network designed specifically for point cloud data that can generate accurate object detection outputs with minimal latency and at a relatively low computational cost.
- the system is configured to leverage this capability to adapt the amount of computation that is dedicated to each spatial region in the scene to system priorities, resource availability, or both.
- the system may dynamically alter the computational demand by tuning the number of proposals that are determined without having to alter or retrain the system’s point-based network.
- This framework not only allows the system to be flexibly targeted across a range of computational priorities, but also enables the system to generate object proposals in a manner geared to maximize spatial coverage or match the density of point clouds. Given the need for accurate real-time information in autonomous vehicles and the nature of their surroundings, the system described in this specification may better fit the requirements of autonomous vehicle-based perception systems.
- FIG. 2 shows a block diagram of an example perception subsystem.
- This specification describes a system implemented as computer programs on one or more computers in one or more locations that processes point cloud data representing a sensor measurement of a scene captured by one or more sensors to generate an object detection output that identifies locations of one or more objects in the scene.
- the one or more sensors can be sensors of an autonomous vehicle, e.g., a land, air, or sea vehicle, and the scene can be a scene that is in the vicinity of the autonomous vehicle.
- the object detection output can then be used to make autonomous driving decisions for the vehicle, to display information to operators or passengers of the vehicle, or both.
- the system receives point cloud data representing a sensor measurement of a scene captured by one or more sensors.
- the point cloud data includes a set of three-dimensional points, i.e., a set of points corresponding to reflections identified by one or more scans of the scene by the one or more sensors, and optionally features generated by the one or more sensors for the three-dimensional points, e.g., LiDAR features.
- Each three-dimensional point generally has x, y, and z coordinates (or three different coordinates in a different coordinate system).
- the system determines, based on the three-dimensional points in the scene, a plurality of two-dimensional proposal locations.
- the system samples a fixed number of two-dimensional locations from the locations of the three-dimensional points.
- the system designates a pair of coordinates, e.g., (x,y), from the three coordinates representing the three-dimensional points and then samples a fixed number of two-dimensional proposal locations from among the designated coordinates, e.g., the (x,y) coordinates, of the three-dimensional points in the scene.
- the system can sample the fixed number of two-dimensional proposal locations in any of a variety of data dependent but computationally efficient ways.
- the system can sample the fixed number of two-dimensional proposal locations using farthest point sampling, in which individual points are selected sequentially such that the next point selected is maximally far away from all previous points selected.
- the system can sample the fixed number of two-dimensional proposal locations using random uniform sampling, in which each two-dimensional proposal location is randomly sampled from a uniform distribution over the three-dimensional points, i.e., the (x,y) coordinates of each three-dimensional point are equally likely to be sampled.
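The two sampling strategies described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the choice of the first point in farthest point sampling, and sampling without replacement in the uniform case are all assumptions made for the example. Points are plain `(x, y, z)` tuples and distances are measured in the (x, y) plane.

```python
import math
import random


def uniform_random_sampling(points, num_proposals, rng=None):
    """Pick proposal locations so every point's (x, y) is equally likely.

    Assumption: sampling is done without replacement.
    """
    rng = rng or random.Random()
    chosen = rng.sample(range(len(points)), num_proposals)
    return [(points[i][0], points[i][1]) for i in chosen]


def farthest_point_sampling(points, num_proposals, start_index=0):
    """Pick proposal locations one at a time, each maximally far from
    all previously selected locations.

    Assumption: the first location is the point at `start_index`.
    """
    if not 1 <= num_proposals <= len(points):
        raise ValueError("num_proposals must be between 1 and len(points)")
    xy = [(p[0], p[1]) for p in points]
    selected = [start_index]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(q, xy[start_index]) for q in xy]
    while len(selected) < num_proposals:
        next_index = max(range(len(xy)), key=min_dist.__getitem__)
        selected.append(next_index)
        for i, q in enumerate(xy):
            d = math.dist(q, xy[next_index])
            if d < min_dist[i]:
                min_dist[i] = d
    return [xy[i] for i in selected]
```

Farthest point sampling tends to spread proposals across the scene, while uniform sampling places more proposals where the point cloud is denser, which matches the coverage-versus-density trade-off mentioned earlier.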
- the system generates, for each two-dimensional proposal location, a feature representation from three-dimensional points in the point cloud data that are near the two-dimensional proposal location.
- the system can modify this phase of the object detection process based on the amount of computational resources available for the process or the latency requirements for the object detection process.
- the system can adjust how many points are used for each two-dimensional proposal location to satisfy the resource or latency requirements, i.e., the system can adapt the object detector to different computational settings without needing to re-train any of the neural network layers that are used by the object detector.
- the system can prioritize the points that have higher predictive priorities or that are in spatial regions that are likely to be relevant. For example, in the case of a self-driving vehicle, the system can prioritize points that are likely to be relevant to operation of the vehicle.
- the system then processes the feature representations of the two-dimensional proposal locations using an object detection neural network that is configured to generate an object detection output that identifies objects in the scene.
- FIG. 1 is a block diagram of an example on-board system 100.
- the on-board system 100 is physically located on-board a vehicle 102.
- the vehicle 102 in FIG. 1 is illustrated as an automobile, but the on-board system 100 can be located on-board any appropriate vehicle type.
- the vehicle 102 can be a fully autonomous vehicle that makes fully-autonomous driving decisions or a semi-autonomous vehicle that aids a human operator.
- the vehicle 102 can autonomously apply the brakes if a full-vehicle prediction indicates that a human driver is about to collide with a detected object, e.g., a pedestrian, a cyclist, or another vehicle.
- While the vehicle 102 is illustrated in FIG. 1 as an automobile, the vehicle 102 can be any appropriate vehicle that uses sensor data to make fully-autonomous or semi-autonomous operation decisions.
- the vehicle 102 can be a watercraft or an aircraft.
- the on-board system 100 can include components additional to those depicted in FIG. 1 (e.g., a control subsystem or a user interface subsystem).
- the on-board system 100 includes a sensor subsystem 120 which enables the on-board system 100 to "see" the environment in a vicinity of the vehicle 102.
- the sensor subsystem 120 includes one or more sensors, some of which are configured to receive reflections of electromagnetic radiation from the environment in the vicinity of the vehicle 102.
- the sensor subsystem 120 can include one or more laser sensors (e.g., LIDAR sensors) that are configured to detect reflections of laser light.
- the sensor subsystem 120 can include one or more radar sensors that are configured to detect reflections of radio waves.
- the sensor subsystem 120 can include one or more camera sensors that are configured to detect reflections of visible light.
- the sensor subsystem 120 repeatedly (i.e., at each of multiple time points) uses raw sensor measurements, data derived from raw sensor measurements, or both to generate sensor data 122.
- the raw sensor measurements indicate the directions, intensities, and distances travelled by reflected radiation.
- a sensor in the sensor subsystem 120 can transmit one or more pulses of electromagnetic radiation in a particular direction and can measure the intensity of any reflections as well as the time that the reflection was received.
- a distance can be computed by determining the time which elapses between transmitting a pulse and receiving its reflection.
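As a worked illustration of the time-of-flight relationship described above, the sketch below converts the elapsed time between pulse transmission and reflection into a range. The function name is hypothetical, and the sketch assumes the pulse travels at the vacuum speed of light and covers the sensor-to-object distance twice (out and back).

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_time_of_flight(transmit_time_s, receive_time_s):
    """Return the one-way distance in meters to the reflecting surface."""
    elapsed = receive_time_s - transmit_time_s
    if elapsed < 0:
        raise ValueError("reflection cannot arrive before transmission")
    # The pulse travels to the object and back, so halve the path length.
    return SPEED_OF_LIGHT_M_PER_S * elapsed / 2.0
```

Under these assumptions, a reflection received one microsecond after transmission corresponds to a range of roughly 150 meters.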
- Each sensor can continually sweep a particular space in angle, azimuth, or both. Sweeping in azimuth, for example, can allow a sensor to detect multiple objects along the same line of sight.
- the sensor data 122 includes point cloud data that characterizes the latest state of an environment (i.e., an environment at the current time point) in the vicinity of the vehicle 102.
- a point cloud is a collection of data points defined by a given coordinate system.
- a point cloud can define the shape of some real or synthetic physical system, where each point in the point cloud is defined by three values representing respective coordinates in the coordinate system, e.g., (x, y, z) coordinates.
- each point in the point cloud can be defined by more than three values, wherein three values represent coordinates in the coordinate system and the additional values each represent a property of the point of the point cloud, e.g., an intensity of the point in the point cloud.
- Point cloud data can be generated, for example, by using LIDAR sensors or depth camera sensors that are on-board the vehicle 102.
- each point in the point cloud can correspond to a reflection of laser light or other radiation transmitted in a particular direction by a sensor on-board the vehicle 102.
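One possible in-memory layout for such a point cloud is sketched below. The `CloudPoint` class and the `xy_coordinates` helper are hypothetical names introduced for illustration; the source only states that each point has three coordinate values and optionally additional property values such as intensity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudPoint:
    """A single point: three coordinates plus optional extra properties."""
    x: float
    y: float
    z: float
    properties: tuple = ()  # e.g. (intensity,); empty if the sensor gives none


def xy_coordinates(cloud):
    """Return the designated (x, y) pair of every point in the cloud."""
    return [(p.x, p.y) for p in cloud]
```

Keeping the extra values in a separate tuple lets the same code handle sensors that report only coordinates and sensors that also report per-point properties.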
- the on-board system 100 can provide the sensor data 122 generated by the sensor subsystem 120 to a perception subsystem 130 for use in generating perception outputs 132.
- the perception subsystem 130 implements components that identify objects within a vicinity of the vehicle.
- the components typically include one or more fully-learned machine learning models.
- a machine learning model is said to be "fully-learned" if the model has been trained to compute a desired prediction when performing a perception task.
- a fully-learned model generates a perception output based solely on being trained on training data rather than on human-programmed decisions.
- the perception output 132 may be a classification output that includes a respective object score corresponding to each of one or more object categories, each object score representing a likelihood that the input sensor data characterizes an object belonging to the corresponding object category.
- the on-board system 100 can provide the perception outputs 132 to a planning subsystem 140.
- the planning subsystem 140 can use the perception outputs 132 to generate planning decisions which plan the future trajectory of the vehicle 102.
- the planning decisions generated by the planning subsystem 140 can include, for example: yielding (e.g., to pedestrians identified in the perception outputs 132), stopping (e.g., at a "Stop" sign identified in the perception outputs 132), passing other vehicles identified in the perception outputs 132, adjusting vehicle lane position to accommodate a bicyclist identified in the perception outputs 132, slowing down in a school or construction zone, merging (e.g., onto a highway), and parking.
- the planning decisions generated by the planning subsystem 140 can be provided to a control system of the vehicle 102.
- the control system of the vehicle can control some or all of the operations of the vehicle by implementing the planning decisions generated by the planning subsystem 140.
- the control system of the vehicle 102 may transmit an electronic signal to a braking control unit of the vehicle.
- the braking control unit can mechanically apply the brakes of the vehicle.
- In order for the planning subsystem 140 to generate planning decisions which cause the vehicle 102 to travel along a safe and comfortable trajectory, the on-board system 100 must provide the planning subsystem 140 with high-quality perception outputs 132.
- Many approaches to classifying or detecting objects within point cloud data involve projecting point clouds into 2D planar images and processing such point clouds as if they are camera images, e.g., using image processing techniques such as those involving the use of convolutional neural network (CNN) architectures or convolutional operations, to detect objects in the resulting images.
- the perception subsystem 130 may implement a non-convolutional object detector designed specifically for point cloud data that may better fit the requirements of autonomous vehicles. The architecture and functionality of such an object detector are described in further detail below with reference to FIG. 2.
- the perception subsystem 230 determines or selects a subset of neighboring points in the point cloud, featurizes these points, and regresses these points to object class and bounding box parameters.
- the object location is predicted relative to the selected location and does not employ any global information, i.e., information for points that are outside the subset of neighboring points in the point cloud. This setup ensures that each spatial location may be processed by the perception subsystem 230 independently, which may enable computation of each location by the perception subsystem 230 to be parallelized to decrease inference latency.
- each of the three-dimensional points in the scene has respective (x, y) coordinates
- the two-dimensional proposal locations 252 that are determined by the proposal location determination engine 250 correspond to the (x, y) coordinates where individual points reside in the point cloud.
- the proposal location determination engine 250 may determine or sample a fixed number of two-dimensional proposal locations from among the (x, y) coordinates of the three-dimensional points in the scene.
- For each two-dimensional proposal location included in the proposal locations 252, the featurizer 260 generates a feature representation from three-dimensional points in the point cloud data that are near the two-dimensional proposal location. As such, in some examples, the featurizer 260 generates feature representations 262 based on proposal locations 252 and based further on at least a portion of sensor data 222 or an abstraction thereof.
- the featurizer 260 may further receive or otherwise access contextual data 242 and determine or select the fixed number of points for each two-dimensional proposal location included in the proposal locations 252 based on their distance from the proposal location and based further on the contextual data 242.
- the featurizer 260 may initially sample a larger number of points from the points that have (x, y) coordinates within the threshold radius, i.e., a larger number than the fixed number that will be used to generate the feature representation, and then rank these points by their relative importance to operation of the self-driving vehicle, as determined from the contextual data 242. The featurizer 260 may then select, as the determined fixed number of points, a subset of the points that have (x, y) coordinates within the threshold radius of the proposal location based at least in part on the ranking. For example, the featurizer 260 may rank the points based on distance from the vehicle or based on other information in the contextual data 242.
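The oversample, rank, and select steps described above can be sketched as follows. This is an illustrative sketch only: the function names, the `oversample_count` parameter, and the example importance score (closeness to a vehicle assumed to sit at the origin) are assumptions, and the sketch does not address what happens when fewer than `fixed_count` points fall within the radius.

```python
import math
import random


def select_points(points, proposal_xy, radius, fixed_count,
                  oversample_count, importance, rng=None):
    """Choose up to `fixed_count` points near one proposal location."""
    rng = rng or random.Random()
    # Keep only points whose (x, y) lies within the threshold radius.
    in_radius = [p for p in points
                 if math.dist((p[0], p[1]), proposal_xy) <= radius]
    # Initially sample a larger pool than will finally be used.
    if len(in_radius) > oversample_count:
        in_radius = rng.sample(in_radius, oversample_count)
    # Rank the pool from most to least important, then keep the top entries.
    ranked = sorted(in_radius, key=importance, reverse=True)
    return ranked[:fixed_count]


def closeness_to_vehicle(point, vehicle_xy=(0.0, 0.0)):
    """Hypothetical importance score: points nearer the vehicle rank higher."""
    return -math.dist((point[0], point[1]), vehicle_xy)
```

Because `importance` is passed in as a function, the ranking rule can be swapped (for example, to use information from previous frames) without changing the selection logic.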
- contextual data 242 may include data obtained or generated by the perception subsystem 230 for one or more previous frames, including sensor data 222 from one or more previous frames, proposal locations 252 from one or more previous frames, feature representations 262 from one or more previous frames, and/or perception output 232 from one or more previous frames.
- data from previous frames may serve to provide the perception subsystem 230 with a relatively reliable estimate of where objects may be expected to be located.
- the perception subsystem 230 may be able to allocate more computational resources to the regions in the scene in which objects are more likely to be located in the current frame and/or allocate fewer computational resources to the regions in the scene in which objects are less likely to be located in the current frame.
- the featurizer 260 includes a featurizer neural network that may be leveraged to generate feature representations 262. More specifically, for a given proposal location, the featurizer 260 may process a featurizer input for the given proposal location using the featurizer neural network to generate a feature representation for the given proposal location.
- the featurizer input that is applied to the featurizer neural network may include data indicating a fixed number of points that are determined or selected for the given proposal location.
- the featurizer input that is applied to the featurizer neural network may include data indicating the re-centered points.
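A minimal sketch of re-centering is shown below. It assumes that re-centering means expressing each selected point's (x, y) coordinates relative to the proposal location while leaving z and any additional property values unchanged; the excerpt above does not spell out the operation, and the function name is hypothetical.

```python
def recenter_points(points, proposal_xy):
    """Shift each point so the proposal location becomes the (x, y) origin."""
    px, py = proposal_xy
    return [(x - px, y - py, z) + tuple(rest)
            for (x, y, z, *rest) in points]
```

Under this assumption, the featurizer input for every proposal location is expressed in the same local frame, regardless of where in the scene the proposal lies.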
- FIG. 3 is a block diagram of an example featurizer neural network 360.
- the featurizer neural network 360 receives data 357 as input and generates, based at least in part on data 357, a set of feature representations 362.
- the featurizer neural network 360 may be implemented as part of the featurizer 260 of the perception subsystem 230 as described herein with reference to FIG. 2.
- data 357 and feature representations 362 may correspond to the featurizer input and feature representations 262 as described above with reference to FIG. 2, respectively.
- the featurizer neural network 360 includes multiple layers 361A-361E (e.g., 5 layers).
- each object detection output included in the perception output 232 corresponds to one of the proposal locations and one of the anchor offsets and identifies (i) a location of a possible object relative to a region of the scene that corresponds to the proposal location offset by the anchor offset and (ii) a likelihood that an object is located at the identified location.
- different anchor offsets are associated with different projection weights, and to generate a respective feature vector for each anchor offset, the object detection neural network 270 projects each feature representation included in the feature representations 262 in accordance with projection weights associated with the anchor offset.
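The per-anchor projection described above might look like the following sketch. It assumes the projection is a plain linear map (one weight matrix per anchor offset applied to the feature representation); the source does not state the exact form, and all names and weight values here are placeholders.

```python
def project(feature, weights):
    """Multiply a weight matrix (list of rows) by a feature vector."""
    return [sum(w * f for w, f in zip(row, feature)) for row in weights]


def per_anchor_feature_vectors(feature, projection_weights_by_anchor):
    """Produce one feature vector per anchor offset for a single proposal."""
    return {anchor: project(feature, weights)
            for anchor, weights in projection_weights_by_anchor.items()}
```

For example, with a two-element feature representation and two anchor offsets, the call returns a dictionary with two projected vectors keyed by anchor offset.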
- the featurizer neural network of the featurizer 260 and the object detection neural network 270 may be trained jointly on ground truth object detection outputs for point clouds in a set of training data.
- the featurizer neural network of the featurizer 260 may correspond to the featurizer neural network 360 as described with reference to FIG. 3.
- the loss function used for the training of these neural networks can be an object detection loss that measures the quality of object detection outputs generated by these neural networks relative to the ground truth object detection outputs, e.g., smoothed L1 losses for regressed values and cross entropy losses for classification outputs.
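The two loss terms named above can be sketched as follows. The smoothed L1 form with threshold `beta` and the unweighted sum of the regression and classification terms are assumptions chosen for illustration; the source does not specify how the terms are parameterized or combined.

```python
import math


def smooth_l1(predicted, target, beta=1.0):
    """Quadratic for small errors, linear for large ones."""
    diff = abs(predicted - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def cross_entropy(class_probabilities, true_class):
    """Negative log-probability assigned to the ground truth class."""
    return -math.log(class_probabilities[true_class])


def detection_loss(pred_box, true_box, class_probabilities, true_class):
    """Assumed combination: sum of per-value smooth L1 plus cross entropy."""
    regression = sum(smooth_l1(p, t) for p, t in zip(pred_box, true_box))
    return regression + cross_entropy(class_probabilities, true_class)
```

The smoothed L1 term limits the influence of large regression errors on bounding box parameters, while the cross entropy term penalizes low probability on the correct class.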
- the perception subsystem 230 is further configured to remove points that are likely associated with ground reflections from obtained point cloud data.
- FIG. 4 is a flow diagram of an example process 400 for detecting objects within point clouds.
- the process 400 will be described as being performed by a system of one or more computers located in one or more locations.
- an on-board system, e.g., the on-board system 100 of FIG. 1, or subsystems thereof, e.g., the perception subsystem 130 of FIG. 1 or the perception subsystem 230 of FIG. 2, appropriately programmed in accordance with this specification, can perform the process 400.
- process 400 may be performed by other systems or system configurations.
- a computer program, which may also be referred to or described as a program, software, a software application, an app, a module, a software module, a script, or code, can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages; and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- a program may, but need not, correspond to a file in a file system.
- a program can be stored in a portion of a file that holds other programs or data, e.g., one or more scripts stored in a markup language document, in a single file dedicated to the program in question, or in multiple coordinated files, e.g., files that store one or more modules, sub-programs, or portions of code.
- a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a data communication network.
- Computers suitable for the execution of a computer program can be based on general or special purpose microprocessors or both, or any other kind of central processing unit.
- a central processing unit will receive instructions and data from a read only memory or a random access memory or both.
- the essential elements of a computer are a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and data.
- the central processing unit and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
- a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. However, a computer need not have such devices.
- Machine learning models can be implemented and deployed using a machine learning framework, e.g., a TensorFlow framework, a Microsoft Cognitive Toolkit framework, an Apache Singa framework, or an Apache MXNet framework.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
- Traffic Control Systems (AREA)
Abstract
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202080050276.4A CN114080629A (zh) | 2019-07-08 | 2020-07-08 | Object detection in point clouds |
| JP2022500800A JP2022539843A (ja) | 2019-07-08 | 2020-07-08 | Object detection in point clouds |
| EP20750916.7A EP3980932B1 (fr) | 2019-07-08 | 2020-07-08 | Object detection in point clouds |
| KR1020227004263A KR20220031685A (ko) | 2019-07-08 | 2020-07-08 | Object detection in point clouds |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201962871669P | 2019-07-08 | 2019-07-08 | |
| US62/871,669 | 2019-07-08 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021007320A1 (fr) | 2021-01-14 |
Family
ID=71944315
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2020/041200 Ceased WO2021007320A1 (fr) | 2019-07-08 | 2020-07-08 | Object detection in point clouds |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US11450120B2 (fr) |
| EP (1) | EP3980932B1 (fr) |
| JP (1) | JP2022539843A (fr) |
| KR (1) | KR20220031685A (fr) |
| CN (1) | CN114080629A (fr) |
| WO (1) | WO2021007320A1 (fr) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20220319054A1 (en) * | 2021-03-01 | 2022-10-06 | Waymo Llc | Generating scene flow labels for point clouds using object labels |
Families Citing this family (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2021131652A (ja) * | 2020-02-19 | 2021-09-09 | 株式会社トプコン | Data structure, recording medium, program, and system |
| US11636592B2 (en) * | 2020-07-17 | 2023-04-25 | International Business Machines Corporation | Medical object detection and identification via machine learning |
| CN112801036A (zh) * | 2021-02-25 | 2021-05-14 | 同济大学 | Target recognition method, training method, medium, electronic device, and automobile |
| US20220292813A1 (en) * | 2021-03-10 | 2022-09-15 | Acronis International Gmbh | Systems and methods for detecting objects an image using a neural network trained by an imbalanced dataset |
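| CN113205116B (zh) * | 2021-04-15 | 2024-02-02 | 江苏方天电力技术有限公司 | Method for automatic extraction of photographing target points and flight path planning for UAV inspection of power transmission lines |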
| US12462572B2 (en) * | 2021-06-22 | 2025-11-04 | Grabtaxi Holdings Pte. Ltd. | Method and system for gathering image training data for a machine learning model |
| WO2023003354A1 (fr) * | 2021-07-20 | 2023-01-26 | 엘지전자 주식회사 | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
| CN115965925B (zh) * | 2023-03-03 | 2023-06-23 | 安徽蔚来智驾科技有限公司 | Point cloud object detection method, computer device, storage medium, and vehicle |
| US12579814B2 (en) | 2023-03-28 | 2026-03-17 | Dell Products L.P. | Computer vision-based energy usage management system |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20170220876A1 (en) * | 2017-04-20 | 2017-08-03 | GM Global Technology Operations LLC | Systems and methods for visual classification with region proposals |
| US20180144496A1 (en) * | 2015-04-24 | 2018-05-24 | Oxford University Innovation Limited | A method of detecting objects within a 3d environment |
| US20180188038A1 (en) * | 2016-12-30 | 2018-07-05 | DeepMap Inc. | Detection of vertical structures based on lidar scanner data for high-definition maps for autonomous vehicles |
Family Cites Families (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2007069260A1 (fr) * | 2005-12-16 | 2007-06-21 | Technion Research & Development Foundation Ltd. | Method and apparatus for determining similarity between surfaces |
| US20130202197A1 (en) * | 2010-06-11 | 2013-08-08 | Edmund Cochrane Reeler | System and Method for Manipulating Data Having Spatial Co-ordinates |
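| CN106407947B (zh) * | 2016-09-29 | 2019-10-22 | 百度在线网络技术(北京)有限公司 | Target object recognition method and apparatus for unmanned vehicles |
| CN110325818B (zh) * | 2017-03-17 | 2021-11-26 | 本田技研工业株式会社 | Joint 3D object detection and orientation estimation via multimodal fusion |
| CN107748871B (zh) * | 2017-10-27 | 2021-04-06 | 东南大学 | Three-dimensional face recognition method based on multi-scale covariance descriptors and locality-sensitive Riemannian kernel sparse classification |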
| US10970553B2 (en) * | 2017-11-15 | 2021-04-06 | Uatc, Llc | Semantic segmentation of three-dimensional data |
| US10671860B2 (en) * | 2018-02-20 | 2020-06-02 | GM Global Technology Operations LLC | Providing information-rich map semantics to navigation metric map |
| US20210232871A1 (en) * | 2018-07-05 | 2021-07-29 | Optimum Semiconductor Technologies Inc. | Object detection using multiple sensors and reduced complexity neural networks |
| US11676005B2 (en) * | 2018-11-14 | 2023-06-13 | Huawei Technologies Co., Ltd. | Method and system for deep neural networks using dynamically selected feature-relevant points from a point cloud |
| CN109543601A (zh) * | 2018-11-21 | 2019-03-29 | 电子科技大学 | Unmanned vehicle object detection method based on multimodal deep learning |
-
2020
- 2020-07-08 WO PCT/US2020/041200 patent/WO2021007320A1/fr (not active, ceased)
- 2020-07-08 JP JP2022500800A patent/JP2022539843A/ja (not active, ceased)
- 2020-07-08 CN CN202080050276.4A patent/CN114080629A/zh (active, pending)
- 2020-07-08 EP EP20750916.7A patent/EP3980932B1/fr (active)
- 2020-07-08 KR KR1020227004263A patent/KR20220031685A/ko (not active, ceased)
- 2020-07-08 US US16/923,823 patent/US11450120B2/en (active)
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20180144496A1 (en) * | 2015-04-24 | 2018-05-24 | Oxford University Innovation Limited | A method of detecting objects within a 3d environment |
| US20180188038A1 (en) * | 2016-12-30 | 2018-07-05 | DeepMap Inc. | Detection of vertical structures based on lidar scanner data for high-definition maps for autonomous vehicles |
| US20170220876A1 (en) * | 2017-04-20 | 2017-08-03 | GM Global Technology Operations LLC | Systems and methods for visual classification with region proposals |
Non-Patent Citations (3)
| Title |
|---|
| CHARLES R. QI ET AL.: "Frustum PointNets for 3D Object Detection from RGB-D Data", arXiv.org, Cornell University Library, 22 November 2017 (2017-11-22), XP080839554 * |
| TSUNG-YI LIN ET AL.: "Feature Pyramid Networks for Object Detection", arXiv.org, Cornell University Library, 9 December 2016 (2016-12-09), XP080738158 * |
| ZHOU YIN ET AL.: "VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 18 June 2018 (2018-06-18), pages 4490-4499, XP033473359, DOI: 10.1109/CVPR.2018.00472 * |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20220319054A1 (en) * | 2021-03-01 | 2022-10-06 | Waymo Llc | Generating scene flow labels for point clouds using object labels |
| US12106528B2 (en) * | 2021-03-01 | 2024-10-01 | Waymo Llc | Generating scene flow labels for point clouds using object labels |
Also Published As
| Publication number | Publication date |
|---|---|
| US11450120B2 (en) | 2022-09-20 |
| KR20220031685A (ko) | 2022-03-11 |
| JP2022539843A (ja) | 2022-09-13 |
| CN114080629A (zh) | 2022-02-22 |
| EP3980932A1 (fr) | 2022-04-13 |
| US20210012089A1 (en) | 2021-01-14 |
| EP3980932B1 (fr) | 2025-09-03 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11450120B2 (en) | | Object detection in point clouds |
| US11670038B2 (en) | | Processing point clouds using dynamic voxelization |
| JP7239703B2 (en) | | Object classification using out-of-region context |
| US12546882B2 (en) | | Camera-radar sensor fusion using local attention mechanism |
| KR102745062B1 (en) | | Agent trajectory prediction using anchor trajectories |
| US12373984B2 (en) | | Multi-modal 3-D pose estimation |
| RU2767955C1 (en) | | Methods and systems for computer-based determination of the presence of dynamic objects |
| EP4086817A1 (en) | | Training distilled machine learning models using a pre-trained feature extractor |
| US10963706B2 (en) | | Distributable representation learning for associating observations from multiple vehicles |
| CN114061581A (en) | | Ranking agents near autonomous vehicles by mutual importance |
| US11105924B2 (en) | | Object localization using machine learning |
| US11657268B1 (en) | | Training neural networks to assign scores |
| US20220355824A1 (en) | | Predicting near-curb driving behavior on autonomous vehicles |
| US11774596B2 (en) | | Streaming object detection within sensor data |
| US20240062386A1 (en) | | High throughput point cloud processing |
| US20240232647A9 (en) | | Efficient search for data augmentation policies |
| US12548248B2 (en) | | Late-to-early temporal fusion for point clouds |
| US12195013B2 (en) | | Evaluating multi-modal trajectory predictions for autonomous driving |
| US20250200751A1 (en) | | Training a point cloud processing model using a computer vision model |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20750916; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022500800; Country of ref document: JP; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2020750916; Country of ref document: EP; Effective date: 20220105 |
| | ENP | Entry into the national phase | Ref document number: 20227004263; Country of ref document: KR; Kind code of ref document: A |
| | WWR | WIPO information: refused in national office | Ref document number: 1020227004263; Country of ref document: KR |
| | WWG | WIPO information: grant in national office | Ref document number: 2020750916; Country of ref document: EP |