WO2022134996A1 - Lane line detection method based on deep learning, and apparatus

Info

Publication number
WO2022134996A1
WO2022134996A1 PCT/CN2021/132554 CN2021132554W WO2022134996A1 WO 2022134996 A1 WO2022134996 A1 WO 2022134996A1 CN 2021132554 W CN2021132554 W CN 2021132554W WO 2022134996 A1 WO2022134996 A1 WO 2022134996A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature map
picture
target
obtaining
lane line
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2021/132554
Other languages
French (fr)
Inventor
Xuefeng Yang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to EP21908995.0A priority Critical patent/EP4252148B1/en
Publication of WO2022134996A1 publication Critical patent/WO2022134996A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588 Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0495 Quantised networks; Sparse networks; Compressed networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Definitions

  • Autonomous driving and intelligent assisted driving systems can help a driver process most road information, provide precise guidance for the driver, and reduce the probability of traffic accidents.
  • An intelligent transportation system can identify the number of vehicles on a lane, determine whether the lane is congested, plan a more reasonable travel route for the driver, and alleviate traffic congestion.
  • Vision-based lane line detection is critical: it is the basis and core technology for realizing lane departure warning and lane congestion warning.
  • the method for lane line detection requires a large number of training samples and a complex neural network model, resulting in a technical problem that the efficiency of detecting lane lines is very low.
  • the present disclosure provides a lane line detection method based on deep learning, comprising: obtaining a first picture that is planned to be detected; obtaining a target feature map by inputting the first picture into a target neural network model; wherein the target neural network model comprises a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel; and obtaining a target detection result by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
  • the present disclosure provides a lane line detection apparatus based on deep learning, comprising: an obtaining module, configured to obtain a first picture that is planned to be detected; a first processing module, configured to obtain a target feature map by inputting the first picture into a target neural network model; wherein the target neural network model comprises a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel; and a second processing module, configured to obtain a target detection result by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
  • the present disclosure provides a computer-readable storage medium, storing a computer program; wherein the computer program is configured to perform the method as described above when executed.
  • the present disclosure provides an electronic device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor; wherein the processor is configured to perform the method as described above when executing the computer program.
  • a first picture that is planned to be detected is obtained; a target feature map is obtained by inputting the first picture into a target neural network model; wherein the target neural network model includes a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel; and a target detection result is obtained by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture. Therefore, the problem of low detection accuracy of lane lines in the related art may be solved, thereby increasing the detection efficiency of lane lines, increasing the detection accuracy of lane lines, and reducing the detection cost.
  • FIG. 1 is a block view of a hardware structure of a mobile terminal performing a lane line detection method based on deep learning according to an embodiment of the present disclosure.
  • FIG. 2 is a flowchart of a lane line detection method based on deep learning according to an embodiment of the present disclosure.
  • FIG. 3 is a schematic view of a lane line detection method based on deep learning according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic view of a lane line detection method based on deep learning according to another embodiment of the present disclosure.
  • FIG. 4a is a schematic view of a lane line detection method based on deep learning according to yet another embodiment of the present disclosure.
  • FIG. 4b is a schematic view of a lane line detection method based on deep learning according to yet another embodiment of the present disclosure.
  • FIG. 4c is a schematic view of a lane line detection method based on deep learning according to yet another embodiment of the present disclosure.
  • FIG. 5 is a schematic view of a lane line detection method based on deep learning according to yet another embodiment of the present disclosure.
  • FIG. 9 is a schematic view of a lane line detection method based on deep learning according to yet another embodiment of the present disclosure.
  • FIG. 10 is a structural block view of a lane line detection apparatus based on deep learning according to an embodiment of the present disclosure.
  • the mobile terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data.
  • the mobile terminal may also include a transmission device 106 and an input/output device 108 for communication functions.
  • the structure shown in FIG. 1 is only for illustration and does not limit the structure of the mobile terminal.
  • the mobile terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration from that shown in FIG. 1.
  • the transmission device 106 is configured to receive or send data via a network.
  • the network may include a wireless network provided by a communication provider of the mobile terminal.
  • the transmission device 106 includes a network interface adapter (NIC), which can be connected to other network devices through a base station to communicate with the Internet.
  • the transmission device 106 may be a radio frequency (RF) module configured to communicate with the Internet in a wireless manner.
  • FIG. 2 is a flowchart of a lane line detection method based on deep learning according to an embodiment of the present disclosure. The method may include the following operations.
  • a first picture that is planned to be detected is obtained.
  • a target feature map is obtained by inputting the first picture into a target neural network model; wherein the target neural network model includes a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel.
  • a target detection result is obtained by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
  • the first picture may include, but is not limited to, a picture collected by an image or video capture device in an automatic driving system, and/or an intelligent assisted driving system, and/or an intelligent transportation system.
  • the first picture may also include but is not limited to a picture pre-stored in a database, or collected through other methods.
  • the target neural network model includes, but is not limited to, a convolutional neural network model, a recurrent neural network model, and a combination of one or more neural network models.
  • the target neural network model includes, but is not limited to, the neural network model generated based on the multi-scale attention mechanism and the deep separable convolution model.
  • the target feature map includes, but is not limited to, a picture generated after feature information is extracted from the first picture.
  • the value of each pixel in the first picture may include, but is not limited to, a probability of the corresponding position of each pixel in the first picture being a target object.
  • the value of each pixel in the first picture may also include, but is not limited to, the probability of each pixel in the first picture being the lane line pixel or a probability of each pixel in the first picture being an image background pixel.
  • a marked picture may include, but is not limited to, a picture collected by an image or video capture device in an automatic driving system, and/or an intelligent assisted driving system, and/or an intelligent transportation system.
  • the marked picture may also include but is not limited to a picture pre-stored in a database, or collected through other methods.
  • the image post-processing may include, but is not limited to, normalization, image smoothing, image sharpening, image dilation, image erosion, or any other image processing method or combination of such methods.
  • a first picture that is planned to be detected is obtained, and a target feature map is obtained by inputting the first picture into a target neural network model.
  • the target neural network model is a model obtained by training a to-be-trained initial neural network model with a set of marked images. Each marked picture includes a marked image background pixel and a marked lane line pixel.
  • the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel, and the target neural network model is configured to determine a distribution of the probability in the first picture.
  • a target detection result is obtained by performing image post-processing on the target feature map. The target detection result is configured to indicate a detected lane line in the first picture.
  • the target detection result is configured to indicate the lane line detected in the first picture. Therefore, the technical problem of low detection efficiency of lane lines in related technologies may be solved, and the technical effects of improving the detection efficiency of lane lines, increasing the detection accuracy of lane lines, and reducing detection costs may be achieved.
  • the obtaining the target feature map by inputting the first picture into the target neural network model includes: obtaining a first feature map by inputting the first picture into a first convolutional layer; obtaining a second feature map by inputting the first feature map into a second convolutional layer, wherein the target neural network model includes the second convolutional layer, and the second convolutional layer is configured to increase a weight of a preset region in the first feature map based on the multi-scale attention mechanism; and obtaining the target feature map by performing a first preset processing on the second feature map, wherein the first preset processing includes an up-sampling operation.
  • the first preset resolution may be manually set by an operator, and may also be set flexibly based on a computational power of the target neural network model or an original resolution of an image or video stream to which the first picture corresponds.
  • the first sub-convolutional layer is configured to reduce the resolution of a feature map to improve the efficiency of feature extraction.
  • the second sub-convolutional layer is configured to perform the depth separable convolution operation to extract the feature information in the first sub-feature map.
  • the second preset resolution may be the same as or different from the first preset resolution.
  • the obtaining the second sub-feature map by inputting the first sub-feature map into the second sub-convolutional layer includes: obtaining the second sub-feature map by extracting the feature information of the first sub-feature map with a 1 × n convolutional kernel and by extracting the feature information of the first sub-feature map with an n × 1 convolutional kernel, wherein n is a positive odd number greater than 1.
  • the first convolutional layer may include, but is not limited to, a special combination of a traditional convolutional neural network and a lightweight convolutional neural network.
  • the computational cost of the neural network model is reduced while increasing the receptive field of the target neural network model.
  • the number of parameters of the convolution layer is n × n × c1 × c2.
  • the number of parameters of the convolutional layer is 2 × n × c1 × c2, and the number of parameters is reduced by (n × n - 2 × n) × c1 × c2. Therefore, the greater the value of n, the more obvious the effect of reducing the number of parameters, while the receptive field remains unchanged.
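The parameter counts quoted above can be checked with a few lines of arithmetic. This is a minimal sketch only: the function names are hypothetical, biases are ignored, and c1 and c2 are taken to be the input and output channel counts, which the excerpt does not spell out.

```python
def full_kernel_params(n: int, c1: int, c2: int) -> int:
    """Weight count stated in the text for one n x n convolution: n * n * c1 * c2."""
    return n * n * c1 * c2


def factorized_params(n: int, c1: int, c2: int) -> int:
    """Weight count stated in the text for the 1 x n plus n x 1 pair: 2 * n * c1 * c2."""
    return 2 * n * c1 * c2


def params_saved(n: int, c1: int, c2: int) -> int:
    """Reduction stated in the text: (n * n - 2 * n) * c1 * c2."""
    return full_kernel_params(n, c1, c2) - factorized_params(n, c1, c2)


if __name__ == "__main__":
    # Illustrative channel counts (assumed, not from the source).
    for n in (3, 5, 7):
        print(n, params_saved(n, c1=64, c2=64))
```

The saving grows with n, which matches the statement that larger kernels benefit more from the factorization.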
  • the obtaining the second feature map by inputting the first feature map into the second convolutional layer includes: obtaining a first type feature map and a plurality of second type feature maps by inputting the first feature map into the second convolutional layer and by performing a convolution operation on the first feature map with a plurality of convolutional kernels of different sizes included in the second convolution layer; determining a plurality of third type feature maps from the plurality of second type feature maps through a preset statistical method, wherein the plurality of third type feature maps are configured to be performed with a second preset processing such that sizes of the plurality of third type feature maps match each other, the plurality of third type feature maps are configured to be performed with a third preset processing to obtain an attention feature map, and a size of the attention feature map matches a size of the first type feature map; and obtaining the second feature map by performing the third preset processing on the attention feature map and the first type feature map.
  • the plurality of second type feature maps may include, but are not limited to, second type feature maps obtained by performing a convolution operation on the first feature map with a plurality of convolutional kernels of different sizes.
  • the preset statistical method may include, but is not limited to, norm formulas in statistics.
  • the preset statistical method may include, but is not limited to, vector norms, matrix norms, etc.
  • the second preset processing may include, but is not limited to, calling a function to adjust the number of channels of a feature map, for example, a reshape operation to adjust the number of rows or columns of a feature vector corresponding to the feature map.
  • the obtaining the second feature map by inputting the first feature map into the second convolutional layer may also include, but is not limited to, increasing a weight of an important region in the first feature map by the multi-scale attention mechanism in a region close to an output layer of the neural network model (for example, the weight of the important region in the first feature map includes but is not limited to a weight of a related region at which a lane line is prone to appear), thereby improving the detection accuracy of the neural network model.
  • convolution operations are performed on the first feature map with convolutional kernels including, but not limited to, three convolutional kernels of different scales (1 × 1, 3 × 3, and 5 × 5, corresponding to the aforementioned plurality of convolutional kernels of different sizes), respectively.
  • the different convolutional kernels are used to fuse information of lane line elements at different receptive field scales.
  • Three feature maps are output, specifically two second type feature maps and one first type feature map. The resolution remains unchanged, and the numbers of channels are 0.5 × c, 0.5 × c, and c, respectively.
  • the 3 × 3 and 5 × 5 second type feature maps are configured to calculate a correlation between elements to determine the importance of each pixel position in the feature map for global inference. The element values at each position are then aggregated along the channel dimension using a statistical method.
  • a calculation formula may be as follows.
  • operations may include, but are not limited to, processing the output feature map into (w × h) × 1 and 1 × (w × h) (corresponding to the aforementioned third type feature maps) through the reshape function (corresponding to the second preset processing), obtaining a matrix with a dimension of (w × h) × (w × h) through matrix multiplication (corresponding to the third preset processing) of the (w × h) × 1 and 1 × (w × h) feature maps, obtaining the attention feature map by performing a softmax operation on the matrix, processing a feature map generated by the 1 × 1 convolutional kernel (corresponding to the aforementioned first type feature map) into a dimension of c × (w × h) through the reshape function, obtaining another matrix with a dimension of c × (w × h) by performing
  • the dimension of the output feature is greatly reduced compared to the bottom of the target neural network.
  • Ordinary convolution is used for feature extraction, and a larger convolutional kernel is adopted to maintain the receptive field. 1 × n and n × 1 convolutions are adopted instead of an n × n convolution to reduce the network computational cost.
  • the target neural network generated based on the attention mechanism enables the output target feature map to reflect, more effectively and accurately, each pixel at its corresponding position in the input image, thereby improving the detection accuracy of lane lines.
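The reshape, matrix-multiplication and softmax steps of the attention mechanism described above can be sketched in NumPy. Several details are assumptions rather than statements from the source: the channel-wise statistic is taken to be the L2 norm, the 3 × 3 branch supplies the (w × h) × 1 vector and the 5 × 5 branch the 1 × (w × h) vector, the softmax runs over the last axis, and the final c × (w × h) matrix is obtained by multiplying the reshaped 1 × 1 branch with the attention map. The function names are hypothetical.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def multi_scale_attention(f1: np.ndarray, f3: np.ndarray, f5: np.ndarray) -> np.ndarray:
    """f1: (c, h, w) output of the 1x1 kernel (first type feature map).
    f3, f5: (c/2, h, w) outputs of the 3x3 and 5x5 kernels (second type feature maps)."""
    c, h, w = f1.shape
    # Assumed statistic: L2 norm over channels, giving one value per pixel position.
    s3 = np.linalg.norm(f3, axis=0).reshape(h * w, 1)   # (w*h) x 1
    s5 = np.linalg.norm(f5, axis=0).reshape(1, h * w)   # 1 x (w*h)
    attention = softmax(s3 @ s5, axis=-1)               # (w*h) x (w*h)
    values = f1.reshape(c, h * w)                        # c x (w*h)
    weighted = values @ attention                        # c x (w*h), assumed final product
    return weighted.reshape(c, h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    c, h, w = 4, 3, 5
    out = multi_scale_attention(
        rng.normal(size=(c, h, w)),
        rng.normal(size=(c // 2, h, w)),
        rng.normal(size=(c // 2, h, w)),
    )
    print(out.shape)
```

Note that the (w × h) × (w × h) attention matrix grows quadratically with resolution, which is consistent with applying this step where the feature map is already small.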
  • before the obtaining the target feature map by inputting the first picture into the target neural network model, the method further includes: obtaining a first sample picture and a label picture corresponding to the first sample picture; and obtaining the target neural network model by training a to-be-trained initial neural network model with the first sample picture and the label picture, wherein the target neural network model is configured to be trained through a loss function as follows.
  • the preset ⁇ is configured to alleviate the impact of category imbalance.
  • the ⁇ may be set to be 0.1.
  • operations may include, but are not limited to, configuring an initial learning rate to a preset value, for example, 0.01; configuring a learning rate attenuation strategy to a preset strategy, for example, every 10,000 iterations, a learning rate is multiplied by 0.1; and configuring a total number of iterations to another preset value, for example, 60,000 iterations.
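The example training configuration above amounts to a step decay schedule. A minimal sketch using the example values from the text (initial rate 0.01, multiplied by 0.1 every 10,000 iterations, 60,000 iterations in total) follows; the function name and parameter names are hypothetical.

```python
def learning_rate(iteration: int, base_lr: float = 0.01,
                  decay: float = 0.1, step: int = 10_000) -> float:
    """Step decay: multiply the rate by `decay` once per `step` completed iterations."""
    return base_lr * decay ** (iteration // step)


if __name__ == "__main__":
    total_iterations = 60_000
    # Print the rate at the start of each decay interval.
    for it in range(0, total_iterations, 10_000):
        print(it, learning_rate(it))
```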
  • the obtaining the target detection result by performing the image post-processing on the target feature map includes: obtaining a lane line result segmentation map by performing a binarization processing on the target feature map; obtaining a processed lane line result segmentation map containing a plurality of connected domains by preprocessing the lane line result segmentation map with an image erosion operation and an image dilation operation; obtaining a target detection result by deleting at least one of the plurality of connected domains that does not meet a preset condition and fitting the remaining of the plurality of connected domains that meets a preset condition to obtain a fitted connected domain, wherein the fitted connected domain included in the target detection result represents a detected lane line in the first picture.
  • the binarization processing may include, but is not limited to, obtaining the lane line result segmentation map by marking an original image based on the target feature map; wherein a part predicted to be the lane line pixel is marked as 1, and another part predicted not to be a lane line, that is, to be the background pixel, is marked as 0.
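The binarization step can be illustrated with a simple threshold on the probability map. The source does not state a threshold; the value 0.5 below is an assumption, as are the function name and the list-of-lists representation.

```python
def binarize(prob_map, threshold=0.5):
    """Mark pixels predicted as lane line (probability >= threshold) as 1, background as 0.

    The threshold of 0.5 is an assumed default, not a value given in the source.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


if __name__ == "__main__":
    probabilities = [[0.9, 0.2, 0.1],
                     [0.7, 0.6, 0.3]]
    print(binarize(probabilities))
```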
  • the lane line pixel may be filtered by performing operations including but not limited to the following operations.
  • Operation S1: A width and a height of a circumscribed rectangle of each connected domain 302 are calculated. When the width and the height are each less than a corresponding threshold, the connected domain is unlikely to be a lane line pixel position, and the connected domain is directly deleted.
  • the scenario before the lane line pixels are filtered may be as shown in FIG. 4a, and the result after the filtering may be as shown in FIG. 4b.
  • Operation S2: On the processed probability map, each connected domain may be marked with a different numeric ID through the Skimage library, representing different lane lines. The result after the processing may be as shown in FIG. 4c, in which lane line pixels shown with different brightness belong to different lane lines.
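Operations S1 and S2 can be sketched without external libraries. This is a stand-in for the Skimage labeling mentioned in the text, not a reproduction of it: the flood fill uses 4-connectivity (an assumption), the code labels domains first so that each bounding rectangle can be measured, and the function names and thresholds are hypothetical.

```python
from collections import deque


def label_domains(mask):
    """Give every 4-connected region of 1-pixels its own positive integer ID."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_id = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not labels[r][c]:
                next_id += 1
                labels[r][c] = next_id
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = next_id
                            queue.append((ny, nx))
    return labels


def filter_small_domains(mask, min_width, min_height):
    """Delete domains whose bounding-box width AND height are both below the thresholds."""
    labels = label_domains(mask)
    boxes = {}  # domain ID -> [min_row, max_row, min_col, max_col]
    for r, row in enumerate(labels):
        for c, domain in enumerate(row):
            if domain:
                box = boxes.setdefault(domain, [r, r, c, c])
                box[0], box[1] = min(box[0], r), max(box[1], r)
                box[2], box[3] = min(box[2], c), max(box[3], c)
    dropped = {d for d, (r0, r1, c0, c1) in boxes.items()
               if (c1 - c0 + 1) < min_width and (r1 - r0 + 1) < min_height}
    # Surviving pixels keep their domain ID; deleted domains become background (0).
    return [[0 if d in dropped else d for d in row] for row in labels]


if __name__ == "__main__":
    mask = [[1, 0, 0, 0],
            [1, 0, 0, 1],
            [1, 0, 0, 0],
            [1, 0, 0, 0]]
    for row in filter_small_domains(mask, min_width=2, min_height=2):
        print(row)
```

In the usage example the tall column survives because its height exceeds the threshold, while the isolated pixel is removed.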
  • FIG. 6 is a schematic view of a lane line detection method based on deep learning according to yet another embodiment of the present disclosure. As shown in FIG. 6, the angle range is determined according to the camera's installation position and focal length parameters. Then, according to the slope of each lane line, candidate lane lines whose angle falls outside the range are further filtered out.
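The angle-based filtering can be sketched as follows. The source derives the admissible range from the camera installation position and focal length but gives no numbers, so the range in the usage example is purely illustrative; representing each candidate as a (slope, intercept) pair and comparing the absolute inclination angle are also assumptions.

```python
import math


def filter_by_angle(candidates, min_deg, max_deg):
    """Keep candidate lines (slope k, intercept b) whose inclination lies within the range.

    The absolute angle is compared so that left- and right-leaning lines are
    treated symmetrically (an assumption, not stated in the source).
    """
    kept = []
    for k, b in candidates:
        angle = abs(math.degrees(math.atan(k)))  # inclination in [0, 90)
        if min_deg <= angle <= max_deg:
            kept.append((k, b))
    return kept


if __name__ == "__main__":
    lines = [(1.0, 0.0), (0.05, 3.0), (-2.0, 1.0)]
    # Illustrative range only; real values depend on the camera set-up.
    print(filter_by_angle(lines, min_deg=20.0, max_deg=80.0))
```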
  • the preprocessing of the image erosion operation and the image dilation operation may include but is not limited to one or more operations to obtain the target feature map containing the plurality of connected domains.
  • the preset condition may include but is not limited to a determination based on the size, length and other shape characteristics of the connected domains. For example, by setting a preset length threshold and a preset width threshold, a connected domain is deleted in response to the length of the connected domain being less than the preset length threshold and/or the width of the connected domain being less than the preset width threshold, and the target feature map containing the connected domains that meet the preset condition is retained to determine the final target detection result.
  • the lane line features recorded in the target feature map output by the aforementioned target neural network model may show some discontinuities, as shown in FIG. 4a.
  • scattered, small-area lane line prediction pixels may be eliminated using, for example but not limited to, the image erosion operation in OpenCV, because such pixels are likely to be false detections.
  • the prediction pixel area for remaining lane lines is then increased using the image dilation operation in OpenCV, and finally, the connected domains that do not belong to the lane lines are filtered out by configuring the size of the connected domains.
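The erosion-then-dilation clean-up can be illustrated with a dependency-free sketch. The text uses the OpenCV operations; the code below is a simplified stand-in with a fixed 3 × 3 square structuring element (an assumption), ignoring out-of-bounds neighbours at the image border, and the function names are hypothetical.

```python
def _apply_3x3(mask, rule):
    """Apply `rule` (all or any) to the in-bounds 3x3 neighbourhood of every pixel."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [mask[y][x]
                      for y in range(r - 1, r + 2)
                      for x in range(c - 1, c + 2)
                      if 0 <= y < h and 0 <= x < w]
            out[r][c] = 1 if rule(window) else 0
    return out


def erode(mask):
    """A pixel survives only if every in-bounds neighbour in its 3x3 window is set."""
    return _apply_3x3(mask, all)


def dilate(mask):
    """A pixel is set if any in-bounds neighbour in its 3x3 window is set."""
    return _apply_3x3(mask, any)


if __name__ == "__main__":
    mask = [[0, 0, 0, 0, 0, 1],
            [1, 1, 1, 0, 0, 0],
            [1, 1, 1, 0, 0, 0],
            [1, 1, 1, 0, 0, 0],
            [0, 0, 0, 0, 0, 0]]
    # Erosion removes the isolated pixel; dilation restores the surviving block.
    for row in dilate(erode(mask)):
        print(row)
```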
  • the method further includes:
  • operations may include, but are not limited to, leaving a plurality of lane lines after filtering the non-lane lines.
  • Some candidate lane lines belong to different parts of the same lane line, as shown in the left side of FIG. 5.
  • the two candidate lane lines on the left logically belong to the same lane line, and these candidate lane lines are required to be merged.
  • Clustering is performed using the slope of each candidate lane line.
  • the clustering algorithm used may include, but is not limited to, a mean shift clustering algorithm, which may be implemented by the sklearn.cluster.MeanShift class.
  • a clustering radius is required to be determined based on the camera installation location and focal length parameters.
  • the number of clustering centers obtained is the number of lane lines, and the equation of each lane line after merging is as follows.
  • n indicates the number of candidate lane lines being merged into the lane line
  • k_i and b_i indicate the slope and intercept of the i-th candidate lane line respectively.
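The merging equation itself is not reproduced in this excerpt. Given the symbols defined above (n, k_i, b_i), one plausible reading is that each merged lane line takes the mean slope and mean intercept of its n candidates; the sketch below implements that assumption and should not be taken as the patent's exact formula. Cluster labels are passed in directly (for example, as produced by a clustering step), and the function name is hypothetical.

```python
def merge_candidates(candidates, labels):
    """Merge candidate lines (k_i, b_i) that share a cluster label.

    Assumption: the merged line uses the arithmetic mean of the slopes and of the
    intercepts of its n members. Returns {label: (k, b)}.
    """
    groups = {}
    for (k, b), label in zip(candidates, labels):
        groups.setdefault(label, []).append((k, b))
    merged = {}
    for label, members in groups.items():
        n = len(members)
        merged[label] = (sum(k for k, _ in members) / n,
                         sum(b for _, b in members) / n)
    return merged


if __name__ == "__main__":
    candidates = [(1.0, 0.0), (1.2, 2.0), (-1.0, 5.0)]
    labels = [0, 0, 1]  # the first two candidates belong to the same lane line
    print(merge_candidates(candidates, labels))
```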
  • the target neural network model includes a neural network model including a lightweight convolutional neural network.
  • the target neural network model architecture may adopt a special combination of a traditional convolutional neural network model and a lightweight convolutional neural network model.
  • in a traditional convolutional neural network model, due to the relatively large resolution of the input image, usually only a smaller convolutional kernel can be used, resulting in a smaller receptive field of the bottom convolutional kernel, while the use of a larger convolutional kernel leads to excessive computational cost.
  • the depth separable convolution of a large convolutional kernel is used at the bottom of the target neural network model, such that the computational cost of the neural network model is reduced while increasing the receptive field of the target neural network model.
  • the number of parameters of the convolution layer is n × n × c1 × c2.
  • the number of parameters of the convolutional layer is 2 × n × c1 × c2
  • the number of parameters is reduced by (n × n - 2 × n) × c1 × c2. Therefore, the greater the value of n, the more obvious the effect of reducing the number of parameters, while the receptive field remains unchanged.
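The parameter counts above can be checked with a few lines of arithmetic. The following is a minimal sketch that simply evaluates the formulas as stated in the text (bias terms ignored); the function names and the channel counts in the example are illustrative assumptions, not values from the disclosure.

```python
def conv_params(n: int, c1: int, c2: int) -> int:
    """Weights of one n x n convolution mapping c1 channels to c2 (bias ignored)."""
    return n * n * c1 * c2


def factorized_params(n: int, c1: int, c2: int) -> int:
    """Weights of the 1 x n plus n x 1 pair, counted as in the text: 2 * n * c1 * c2."""
    return 2 * n * c1 * c2


def params_saved(n: int, c1: int, c2: int) -> int:
    """Reduction stated in the text: (n * n - 2 * n) * c1 * c2."""
    return conv_params(n, c1, c2) - factorized_params(n, c1, c2)


if __name__ == "__main__":
    # Channel counts 32 and 64 are arbitrary example values.
    for n in (3, 5, 7, 9):
        print(n, conv_params(n, 32, 64), factorized_params(n, 32, 64),
              params_saved(n, 32, 64))
```

With these example channel counts the saving grows from 6,144 weights at n = 3 to 71,680 at n = 7, which matches the statement that larger kernels benefit more.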
  • the dimensionality of the feature map is greatly reduced compared to the bottom of the target neural network model.
  • Ordinary convolution is used for feature extraction, and a larger convolutional kernel is adopted to continue to maintain the receptive field of the convolutional kernel.
  • 1 × n and n × 1 convolutions are adopted instead of an n × n convolution to reduce the computational cost of the target neural network model.
  • the target neural network model structure is shown in Table 1.
  • the above dewConvSP convolution operation may include but is not limited to the content shown in Table 2.
  • a multi-scale attention mechanism is configured to increase the weight of the important region on the feature map (corresponding to the attention convolutional layer in Table 1).
  • the calculation flowchart of the proposed attention mechanism is shown in FIG. 7.
  • S702 The feature map output by a previous convolutional layer (corresponding to the aforementioned first feature map) is input, assuming that the resolution of the first feature map is w, h, and the number of channels is c.
  • S704 A convolution operation is performed on the input image using three convolutional kernels of different scales, wherein the different convolutional kernels are used to fuse information of lane line elements at different receptive field scales. Three feature maps are output; the resolution remains unchanged, and the numbers of channels are 0.5 × c, 0.5 × c, and c (from top to bottom).
  • S706 The first two feature maps are used to calculate the correlation between elements, in order to determine the importance of each location to the global inference. The element value at each position is then computed across the channel dimension using a statistical method.
  • a calculation formula may be as follows.
  • a schematic diagram of the foregoing calculation process may be, but is not limited to, as shown in FIG. 8:
  • x is a counted element value
  • x_i denotes the element values on different channels at the same location.
  • the resolution of the output feature map remains unchanged, and the number of channels becomes 1.
  • the output feature maps are reshaped into (w × h) × 1 and 1 × (w × h) through the reshape function, a matrix with a dimension of (w × h) × (w × h) is obtained through matrix multiplication, and the attention feature map is obtained by performing a softmax operation on the matrix.
  • the feature map is zero-padded around its border before the convolution operation, ensuring that the resolution of the output feature map remains unchanged after the convolution operation.
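The reshape, matrix multiplication, and softmax steps of the attention calculation can be illustrated on small nested lists. This is only a sketch under stated assumptions: the statistical method used to collapse the channel dimension is not recoverable from the text, so a channel mean is assumed here; the softmax is assumed to be applied row by row; and the names `channel_mean` and `attention_map` are hypothetical. The three convolutions of S704 are not modelled.

```python
import math


def channel_mean(fmap):
    """Collapse a [c][h][w] feature map to [h][w] by averaging over channels.

    Assumption: the text only says 'a statistical method'; the mean is a stand-in.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[sum(fmap[k][i][j] for k in range(c)) / c for j in range(w)]
            for i in range(h)]


def flatten(matrix):
    """Reshape an [h][w] map into a flat list of length w*h."""
    return [value for row in matrix for value in row]


def softmax(values):
    """Numerically stable softmax over one list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map(fmap_a, fmap_b):
    """Return a (w*h) x (w*h) attention matrix from two same-resolution maps."""
    col = flatten(channel_mean(fmap_a))  # plays the role of the (w*h) x 1 vector
    row = flatten(channel_mean(fmap_b))  # plays the role of the 1 x (w*h) vector
    scores = [[a * b for b in row] for a in col]  # matrix product of the two
    return [softmax(r) for r in scores]  # row-wise softmax (axis is an assumption)


if __name__ == "__main__":
    a = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [1.0, 0.0]]]  # c=2, h=2, w=2
    attn = attention_map(a, a)
    print(len(attn), len(attn[0]))
```

Because the attention matrix has (w × h) × (w × h) entries, its size grows with the square of the number of positions, which is one reason the first feature map is reduced in resolution before this layer.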
  • the input of the network is an RGB three-channel color image.
  • the original image is resized to 320 × 184 × 3 using bilinear interpolation, and the label image is resized to 320 × 184 using a nearest neighbor interpolation algorithm.
  • the output of the network is the result of lane line segmentation (corresponding to a picture in which each pixel indicates the probability value of whether the location of that pixel is a lane line or not), with the same resolution as the input image.
  • the two channels represent the image background pixel location information and lane line pixel location information, respectively.
  • the loss function used for model training is cross entropy with weights, with the following equation.
  • is configured to alleviate the impact of category imbalance, and ⁇ may be set to 0.1 during training.
  • the initial learning rate used in training is 0.01.
  • the learning rate decay strategy is that every 10,000 iterations, a learning rate is multiplied by 0.1, and the total number of iterations is set to 60,000.
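The step decay schedule described above can be written as a one-line function. This is a minimal sketch; the function and constant names are illustrative assumptions, while the numeric values (initial rate 0.01, factor 0.1, step 10,000, total 60,000) are taken from the text.

```python
TOTAL_ITERATIONS = 60_000  # total number of training iterations from the text


def learning_rate(iteration: int, base_lr: float = 0.01,
                  decay: float = 0.1, step: int = 10_000) -> float:
    """Learning rate at a given iteration: multiplied by `decay` every `step` iterations."""
    return base_lr * decay ** (iteration // step)


if __name__ == "__main__":
    for it in range(0, TOTAL_ITERATIONS, 10_000):
        print(it, learning_rate(it))
```

Under this schedule the rate is reduced five times during training, so the final 10,000 iterations run at roughly 1e-7.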
  • the image post-processing and statistical methods are configured to filter the falsely detected lane lines and merge the lane lines.
  • the processing flow chart is shown in FIG. 9. The specific steps may be as follows.
  • the lane line features recorded in the target feature map output by the aforementioned target neural network model may show some discontinuities, as shown in FIG. 4a.
  • the scattered, small-area lane line prediction pixels may be eliminated using, for example but not limited to, the image erosion operation in OpenCV, because such scattered, small-area prediction pixels are likely to be false detections.
  • the prediction pixel area for remaining lane lines is then increased using the image dilation operation in OpenCV, and finally, the connected domains that do not belong to the lane lines are filtered out by configuring the size of the connected domains.
  • lane line pixels can be filtered by including but not limited to the following methods:
  • each different connected domain may be marked with a different digital id through a Skimage library, representing different lane lines.
  • a result after the processing may be shown in FIG. 4c.
  • the lane line pixels represented by different brightness belong to different lane lines.
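The erosion, dilation, and connected-domain filtering steps can be sketched without any image library. The code below is a plain-Python stand-in, not the OpenCV or Skimage implementation: it assumes a 3 × 3 structuring element, treats pixels outside the image as background, uses 8-connectivity, and all function names and the size threshold are hypothetical. In practice the corresponding library routines (for example `cv2.erode`, `cv2.dilate`, and `skimage.measure.label`) would normally be used instead.

```python
from collections import deque

OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _inside(mask, y, x):
    return 0 <= y < len(mask) and 0 <= x < len(mask[0])


def erode(mask):
    """3x3 binary erosion: a pixel survives only if its whole neighbourhood is set."""
    return [[int(all(_inside(mask, y + dy, x + dx) and mask[y + dy][x + dx]
                     for dy, dx in OFFSETS))
             for x in range(len(mask[0]))] for y in range(len(mask))]


def dilate(mask):
    """3x3 binary dilation: a pixel is set if any neighbour is set."""
    return [[int(any(_inside(mask, y + dy, x + dx) and mask[y + dy][x + dx]
                     for dy, dx in OFFSETS))
             for x in range(len(mask[0]))] for y in range(len(mask))]


def label_components(mask):
    """Give each 8-connected domain its own integer id; return (labels, sizes)."""
    labels = [[0] * len(mask[0]) for _ in mask]
    sizes = {}
    next_id = 0
    for y in range(len(mask)):
        for x in range(len(mask[0])):
            if not mask[y][x] or labels[y][x]:
                continue
            next_id += 1
            labels[y][x] = next_id
            queue, count = deque([(y, x)]), 0
            while queue:  # breadth-first flood fill of one connected domain
                cy, cx = queue.popleft()
                count += 1
                for dy, dx in OFFSETS:
                    ny, nx = cy + dy, cx + dx
                    if _inside(mask, ny, nx) and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
            sizes[next_id] = count
    return labels, sizes


def drop_small_components(mask, min_size):
    """Keep only connected domains with at least `min_size` pixels."""
    labels, sizes = label_components(mask)
    return [[int(lab != 0 and sizes[lab] >= min_size) for lab in row]
            for row in labels]
```

Erosion followed by dilation removes isolated pixels while restoring the extent of larger regions; the size filter then discards any remaining domains too small to be a lane line.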
  • a plurality of lane lines may be left after filtering the non-lane lines. Some candidate lane lines belong to different parts of the same lane line, as shown on the left side of FIG. 5. The two candidate lane lines on the left logically belong to the same lane line, so these candidate lane lines need to be merged.
  • Clustering is performed using the slope of each candidate lane line.
  • the clustering algorithm used may include, but is not limited to, a mean shift clustering algorithm, which may be implemented with the sklearn.cluster.MeanShift class.
  • a clustering radius is required to be determined based on the camera installation location and focal length parameters. The number of clustering centers obtained is the number of lane lines, and the equation of each lane line after merging is as follows.
  • n indicates the number of candidate lane lines being merged into the lane line
  • k_i and b_i indicate the slope and intercept of the i-th candidate lane line, respectively.
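The clustering and merging step can be sketched in plain Python. Two points here are assumptions rather than content of the disclosure: the merging equation itself is not reproduced in the text, so the sketch averages the slopes and intercepts of the n candidates in each cluster, which is one reading consistent with the definitions of n, k_i and b_i; and a small flat-kernel mean shift on scalar slopes is written out as a stand-in for `sklearn.cluster.MeanShift`. The function names and the radius value in the example are hypothetical.

```python
def mean_shift_1d(values, radius, max_iter=100, tol=1e-9):
    """Flat-kernel mean shift on scalars; returns one cluster index per value."""
    modes = []
    for v in values:
        m = v
        for _ in range(max_iter):
            window = [u for u in values if abs(u - m) <= radius]
            shifted = sum(window) / len(window)
            if abs(shifted - m) < tol:
                break
            m = shifted
        modes.append(m)
    centers, labels = [], []
    for m in modes:  # modes closer than `radius` share a cluster center
        for idx, center in enumerate(centers):
            if abs(m - center) <= radius:
                labels.append(idx)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels


def merge_lines(lines, radius):
    """lines: list of (k, b). Returns one averaged (k, b) per slope cluster.

    Assumption: merged slope = mean of k_i, merged intercept = mean of b_i.
    """
    labels = mean_shift_1d([k for k, _ in lines], radius)
    groups = {}
    for line, lab in zip(lines, labels):
        groups.setdefault(lab, []).append(line)
    merged = []
    for lab in sorted(groups):
        members = groups[lab]
        n = len(members)
        merged.append((sum(k for k, _ in members) / n,
                       sum(b for _, b in members) / n))
    return merged


if __name__ == "__main__":
    candidates = [(1.0, 0.0), (1.1, 2.0), (-0.5, 10.0)]
    print(merge_lines(candidates, radius=0.3))
```

The number of tuples returned equals the number of cluster centers, which corresponds to the number of lane lines; the radius plays the role of the clustering radius that the text ties to the camera installation location and focal length.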
  • a target neural network model based on a multi-scale attention mechanism is used for the lane line detection task
  • a lightweight lane line detection network is designed based on the characteristics of each layer of the deep neural network, so that a lightweight deep neural network can be used for the lane line detection task. This reduces the complexity and computational cost of the network, which is advantageous when the network is deployed on embedded devices with limited storage capacity and computational power. It also increases the receptive field of the target neural network model and alleviates the problem of discontinuous predicted lane line pixels in the lane line segmentation task.
  • a set of post-processing algorithms may be designed that can accurately detect lane lines based on the coarse extraction of lane line features by the deep neural network, reducing the network’s requirement for training datasets.
  • the training can be performed on some general datasets, and the requirement of roughly extracting lane line features can be achieved without specifically collecting the corresponding road scenes for labeling, which reduces the cost of practical application of the algorithm.
  • the method according to the above embodiments can be implemented with the aid of software plus the necessary general purpose hardware platform, or of course by means of hardware, but in many cases the former is the better way of implementation.
  • the technical solution of the present disclosure, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, disk, CD-ROM) and includes a number of instructions to enable a terminal device (which may be a cell phone, a computer, a server, a network device, etc.) to perform the method described in the various embodiments of the present disclosure.
  • the present disclosure also provides a lane line detection apparatus based on deep learning, which is configured to implement the aforementioned embodiments and preferred implementations, and what has been explained will not be repeated.
  • the term “module” may implement a combination of software and/or hardware with predetermined functions.
  • the apparatus described in the following embodiments is preferably implemented by software, although implementation by hardware or by a combination of software and hardware is also possible and contemplated.
  • FIG. 10 is a structural block view of a lane line detection apparatus based on deep learning according to an embodiment of the present disclosure. As shown in FIG. 10, the apparatus includes:
  • an obtaining module 1002 configured to obtain a first picture that is planned to be detected
  • a first processing module 1004 configured to obtain a target feature map by inputting the first picture into a target neural network model; wherein the target neural network model includes a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel;
  • a second processing module 1006 configured to obtain a target detection result by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
  • the first processing module 1004 includes:
  • a first calculation unit configured to obtain a first feature map by inputting the first picture into a first convolutional layer; wherein the target neural network model includes the first convolutional layer, and the first convolutional layer is configured to adjust a resolution of the first picture and to extract the lane line pixel of the first image after the resolution adjustment;
  • a second calculation unit configured to obtain a second feature map by inputting the first feature map into a second convolutional layer; wherein the target neural network model includes the second convolutional layer, and the second convolutional layer is configured to increase a weight of a preset region in the first feature map based on the multi-scale attention mechanism;
  • a first processing unit configured to obtain the target feature map by performing a first preset processing on the second feature map; wherein the first preset processing includes an up-sampling operation.
  • the aforementioned apparatus is configured to input the first picture into the first convolutional layer in the following manner to obtain the first feature map: obtaining the first feature map by inputting the first picture with a resolution less than or equal to a first preset resolution into the first convolutional layer.
  • the aforementioned apparatus is configured to input the first picture into the first convolutional layer in the following manner to obtain the first feature map: obtaining a first sub-feature map by inputting the first picture into a first sub-convolutional layer, wherein the first sub-convolutional layer is configured to perform a convolution operation, and the first convolutional layer includes the first sub-convolutional layer; obtaining a second sub-feature map by inputting the first sub-feature map into a second sub-convolutional layer, wherein the second sub-convolutional layer is configured to perform a depth separable convolution operation to extract feature information of the first sub-feature map, and the first convolutional layer includes the second sub-convolutional layer; in response to a resolution of the second sub-feature map being greater than a second preset resolution, reducing a resolution of the first sub-feature map by re-inputting the second sub-feature map into the first sub-convolutional layer; in response to the resolution of the second sub-feature map being less than or equal to
  • the first calculation unit is configured to input the first sub-feature map into the second sub-convolutional layer in the following manner to obtain the second sub-feature map: obtaining the second sub-feature map by extracting the feature information of the first sub-feature map with a 1 × n convolutional kernel and by extracting the feature information of the first sub-feature map with an n × 1 convolutional kernel, wherein n is a positive odd number greater than 1.
  • the second calculation unit is configured to input the first feature map into the second convolutional layer in the following manner to obtain the second feature map: obtaining a first type feature map and a plurality of second type feature maps by inputting the first feature map into the second convolutional layer and by performing a convolution operation on the first feature map with a plurality of convolutional kernels of different sizes included in the second convolution layer; determining a plurality of third type feature maps from the plurality of second type feature maps through a preset statistical method, wherein the plurality of third type feature maps are configured to be performed with a second preset processing such that sizes of the plurality of third type feature maps match each other, the plurality of third type feature maps are configured to be performed with a third preset processing to obtain an attention feature map, and a size of the attention feature map matches a size of the first type feature map; and obtaining the second feature map by performing the third preset processing on the attention feature map and the first type feature map.
  • the apparatus is further configured to obtain a first sample picture and a label picture corresponding to the first sample picture;
  • obtain the target neural network model by training a to-be-trained initial neural network model with the first sample picture and the label picture, wherein the target neural network model is configured to be trained through the following loss function.
  • C represents a cross entropy loss function
  • l_p represents the lane line pixel obtained by inputting the first sample picture into the to-be-trained initial neural network model
  • l_t represents the lane line pixel marked by the label picture
  • b_p represents the image background pixel obtained by inputting the first sample picture into the to-be-trained initial neural network model
  • b_t represents the image background pixel marked by the label picture
  • is a preset parameter greater than 0.
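A weighted cross entropy of this kind can be sketched as follows. The loss equation itself is not reproduced in the text, so the form used here is an assumption: the lane line term and the background term are each a cross entropy C, and the background term is scaled by the preset parameter. The names `cross_entropy`, `weighted_loss`, and `weight` are hypothetical; the default of 0.1 follows the training value mentioned in the description.

```python
import math


def cross_entropy(pred, target, eps=1e-12):
    """Mean of -t * log(p) over pixels; pred and target are equal-length flat lists."""
    return -sum(t * math.log(max(p, eps)) for p, t in zip(pred, target)) / len(pred)


def weighted_loss(lane_pred, lane_true, bg_pred, bg_true, weight=0.1):
    """Assumed form: C(lane_pred, lane_true) + weight * C(bg_pred, bg_true).

    A weight below 1 down-weights the far more numerous background pixels,
    which is one way to alleviate the category imbalance noted in the text.
    """
    return (cross_entropy(lane_pred, lane_true)
            + weight * cross_entropy(bg_pred, bg_true))


if __name__ == "__main__":
    lane_true, bg_true = [1, 0, 0, 0], [0, 1, 1, 1]
    uniform = [0.5, 0.5, 0.5, 0.5]
    print(weighted_loss(uniform, lane_true, uniform, bg_true))
```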
  • the second processing module 1006 includes:
  • a second processing unit configured to obtain a lane line result segmentation map by performing a binarization processing on the target feature map
  • a third processing unit configured to obtain a processed lane line result segmentation map containing a plurality of connected domains by preprocessing the lane line result segmentation map with an image erosion operation and an image dilation operation;
  • a fourth processing unit configured to obtain a target detection result by deleting at least one of the plurality of connected domains that does not meet a preset condition and fitting the remaining of the plurality of connected domains that meets a preset condition to obtain a fitted connected domain, wherein the fitted connected domain included in the target detection result represents a detected lane line in the first picture.
  • the apparatus is further configured to:
  • the target neural network model includes a neural network model including a lightweight convolutional neural network.
  • each of the above modules can be implemented by software or hardware. For the latter, it can be implemented in the following way, but not limited to: the above modules are all located in the same processor; or, each of the above modules is located in a different processor in any combination.
  • An embodiment of the present disclosure also provides a computer-readable storage medium in which a computer program is stored, wherein the computer program is configured to execute the steps in any one of the foregoing method embodiments when executed.
  • the computer-readable storage medium may be configured to store a computer program for executing the following steps:
  • a target feature map is obtained by inputting the first picture into a target neural network model; wherein the target neural network model includes a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel.
  • a target detection result is obtained by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
  • the foregoing computer-readable storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a mobile hard drive, a magnetic disk, an optical disk, and other media that can store computer programs.
  • An embodiment of the present disclosure also provides an electronic device, including a memory and a processor, the memory stores a computer program, and the processor is configured to run the computer program to execute the steps in any of the foregoing method embodiments.
  • the aforementioned electronic device may further include a transmission device and an input-output device, wherein the transmission device is connected to the processor, and the input-output device is connected to the processor.
  • the processor may be configured to execute the following steps through a computer program:
  • a target feature map is obtained by inputting the first picture into a target neural network model; wherein the target neural network model includes a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel.
  • a target detection result is obtained by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
  • the modules or steps of the present disclosure described above may be implemented with a general-purpose computing device. They may be centralized on a single computing device or distributed over a network of multiple computing devices, and they may be implemented with program code executable by the computing device, so that they may be stored in a storage device and executed by the computing device.
  • the steps shown or described may be executed in a different order than herein, or they may be implemented separately as individual integrated circuit modules, or multiple modules or steps thereof may be implemented as individual integrated circuit modules. In this way, the present disclosure is not limited to any particular combination of hardware and software.

Abstract

Disclosed are a lane line detection method based on deep learning, an apparatus, a storage medium, and an electronic device. The method includes: obtaining a first picture that is planned to be detected; obtaining a target feature map by inputting the first picture into a target neural network model; wherein the target neural network model includes a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel; and obtaining a target detection result by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.

Description

LANE LINE DETECTION METHOD BASED ON DEEP LEARNING, AND APPARATUS
CROSS REFERENCE
The present application claims foreign priority of China Patent Application No. 202011555482.9 filed on December 25, 2020, in the China National Intellectual Property Administration, the entire contents of which are hereby incorporated by reference.
TECHNICAL FIELD
The present disclosure relates to the field of communication technologies, and in particular to a lane line detection method based on deep learning, an apparatus, a storage medium, and an electronic device.
BACKGROUND
With the rapid development of technology and economy, there are more and more cars running on the road, which facilitates people’s travel while bringing more and more traffic accidents and traffic congestion problems. Autonomous driving and an intelligent assisted driving system can help a driver process most of road information, provide precise guidance for the driver, and reduce the probability of traffic accidents. An intelligent transportation system can identify the number of vehicles on a lane, determine whether the lane is congested, plan a more reasonable travel route for the driver, and alleviate traffic congestion. In the automatic driving, intelligent driving assistance system and intelligent transportation system, a vision-based lane line detection is very critical and is the basis and core technology for realizing lane departure warning and lane congestion warning.
In the current related technologies, the method for lane line detection requires a large number of training samples and a complex neural network model, resulting in a technical problem that the efficiency of detecting lane lines is very low.
Aiming at the technical problem of low efficiency of detecting lane lines in the related technologies, no effective solution has been proposed yet.
SUMMARY OF THE DISCLOSURE
The present disclosure provides a lane line detection method based on deep learning, an apparatus, a storage medium, and an electronic device, to solve the technical problem of low detection accuracy of lane lines in the related art.
In a first aspect, the present disclosure provides a lane line detection method based on deep learning, comprising: obtaining a first picture that is planned to be detected; obtaining a target feature map by inputting the first picture into a target neural network model; wherein the target neural network model comprises a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel; and obtaining a target detection result by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
In a second aspect, the present disclosure provides a lane line detection apparatus based on deep learning, comprising: an obtaining module, configured to obtain a first picture that is planned to be detected; a first processing module, configured to obtain a target feature map by inputting the first picture into a target neural network model; wherein the target neural network model comprises a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel; and a second processing module, configured to obtain a target detection result by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
In a third aspect, the present disclosure provides a computer-readable storage medium, storing a computer program; wherein the computer program is configured to perform the method as described above when executed.
In a fourth aspect, the present disclosure provides an electronic device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor; wherein the processor is configured to perform the method as described above when executing the computer program.
In the present disclosure, a first picture that is planned to be detected is obtained; a target feature map is obtained by inputting the first picture into a target neural network model; wherein the target neural network model includes a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel; and a target detection result is obtained by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture. Therefore, the problem of low detection accuracy of lane lines in the related art may be solved, thereby increasing the detection efficiency of lane lines, increasing the detection accuracy of lane lines, and reducing the detection cost.
BRIEF DESCRIPTION OF THE DRAWINGS
The drawings described here are to provide a further understanding of the present disclosure and constitute a part of the present disclosure. The exemplary embodiments and their descriptions of the present disclosure are to explain the present disclosure, and do not constitute an improper limitation of the present disclosure.
FIG. 1 is a block view of a hardware structure of a mobile terminal performing a lane line detection method based on deep learning according to an embodiment of the present disclosure.
FIG. 2 is a flowchart of a lane line detection method based on deep learning according to an embodiment of the present disclosure.
FIG. 3 is a schematic view of a lane line detection method based on deep learning according to an embodiment of the present disclosure.
FIG. 4 is a schematic view of a lane line detection method based on deep learning according to another embodiment of the present disclosure.
FIG. 4a is a schematic view of a lane line detection method based on deep learning according to further another embodiment of the present disclosure.
FIG. 4b is a schematic view of a lane line detection method based on deep learning according to further another embodiment of the present disclosure.
FIG. 4c is a schematic view of a lane line detection method based on deep learning according to further another embodiment of the present disclosure.
FIG. 5 is a schematic view of a lane line detection method based on deep learning according to further another embodiment of the present disclosure.
FIG. 6 is a schematic view of a lane line detection method based on deep learning according to further another embodiment of the present disclosure.
FIG. 7 is a schematic view of a lane line detection method based on deep learning according to further another embodiment of the present disclosure.
FIG. 8 is a schematic view of a lane line detection method based on deep learning according to further another embodiment of the present disclosure.
FIG. 9 is a schematic view of a lane line detection method based on deep learning according to further another embodiment of the present disclosure.
FIG. 10 is a structural block view of a lane line detection apparatus based on deep learning according to an embodiment of the present disclosure.
DETAILED DESCRIPTION
The embodiments of the present disclosure will be described in detail with reference to the drawings and in conjunction with the embodiments.
It should be noted that the terms “first” and “second” in the specification and claims of the present disclosure and the drawings are used to distinguish similar objects, and not necessarily to describe a specific sequence or order.
The method embodiments provided in the embodiments of the present application may be executed in a mobile terminal, a computer terminal or a similar computing device. Taking running on a mobile terminal as an example, FIG. 1 is a block view of a hardware structure of a mobile terminal performing a lane line detection method based on deep learning according to an embodiment of the present disclosure. As shown in FIG. 1, the mobile terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA) and a memory 104 for storing data. The mobile terminal may also include a transmission device 106 and an input/output device 108 for communication functions. Those skilled in the art can understand that the structure shown in FIG. 1 is only for illustration and does not limit the structure of the mobile terminal. For example, the mobile terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration from that shown in FIG. 1.
The memory 104 may be configured to store computer programs, for example, software programs and modules of application software. Specifically, the memory may store computer programs corresponding to the lane line detection method based on deep learning in the embodiments of the present disclosure. The processor 102 runs the computer programs stored in the memory 104 to perform various functional applications and data processing, that is, to achieve the above method. The memory 104 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include a memory remotely arranged relative to the processor 102, and the remote memory may be connected to the mobile terminal through a network. Examples of the network include, but are not limited to, the Internet, corporate intranet, local area network, mobile communication network, and any combination thereof.
The transmission device 106 is configured to receive or send data via a network. Specific examples of the network may include a wireless network provided by a communication provider of the mobile terminal. In some examples, the transmission device 106 includes a network interface adapter (NIC) , which can be connected to other network devices through a base station to communicate with the Internet. In some examples, the transmission device 106 may be a radio frequency (RF) module configured to communicate with the Internet in a wireless manner.
In the embodiments, a lane line detection method based on deep learning running on a mobile terminal, a computer terminal or a similar computing device is provided. FIG. 2 is a flowchart of a lane line detection method based on deep learning according to an embodiment of the present disclosure. The method may include operations at the following blocks.
At block S202: A first picture that is planned to be detected is obtained.
At block S204: A target feature map is obtained by inputting the first picture into a target neural network model; wherein the target neural network model includes a neural network model generated based on a multi-scale attention mechanism and a depth separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel.
At block S206: A target detection result is obtained by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
In some embodiments, the first picture may include, but is not limited to, a picture collected by an image or video capture device in an automatic driving system, and/or an intelligent assisted driving system, and/or an intelligent transportation system. The first picture may also include, but is not limited to, a picture pre-stored in a database, or collected through other methods.
In some embodiments, the target neural network model includes, but is not limited to, a convolutional neural network model, a recurrent neural network model, and a combination of one or more neural network models. Specifically, the target neural network model includes, but is not limited to, the neural network model generated based on the multi-scale attention mechanism and the depth separable convolution model.
In some embodiments, the target feature map includes, but is not limited to, a picture generated after feature information is extracted from the first picture. The value of each pixel in the target feature map may include, but is not limited to, a probability that the corresponding position in the first picture is a target object, for example, the probability of the corresponding pixel in the first picture being the lane line pixel or the probability of that pixel being an image background pixel.
It should be noted that the lane line pixel and the image background pixel may include, but are not limited to, two target objects that are mutually exclusive. In other words, when a pixel is not configured to represent a lane line, the pixel is configured to represent an image background.
In some embodiments, a marked picture may include, but is not limited to, a picture collected by an image or video capture device in an automatic driving system, and/or an intelligent assisted driving system, and/or an intelligent transportation system. The marked picture may also include but is not limited to a picture pre-stored in a database, or collected through other methods.
In some embodiments, the image post-processing may include, but is not limited to, normalization, image smoothing, image sharpening, image dilation, image erosion, or another one or a combination of image processing methods.
The foregoing is only an exemplary description, and the embodiments do not make any specific limitations.
Through the present disclosure, a first picture that is planned to be detected is obtained, and a target feature map is obtained by inputting the first picture into a target neural network model. The target neural network model is a model obtained by training a to-be-trained initial neural network model with a set of marked pictures. Each marked picture includes a marked image background pixel and a marked lane line pixel. The target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel, and the target neural network model is configured to determine a distribution of the probability in the first picture. A target detection result is obtained by performing image post-processing on the target feature map. The target detection result is configured to indicate a detected lane line in the first picture. Therefore, the technical problem of low detection efficiency of lane lines in related technologies may be solved, and the technical effects of improving the detection efficiency of lane lines, increasing the detection accuracy of lane lines, and reducing detection costs may be achieved.
In some embodiments, the obtaining the target feature map by inputting the first picture into the target neural network model includes: obtaining a first feature map by inputting the first picture into a first convolutional layer; obtaining a second feature map by inputting the first feature map into a second convolutional layer, wherein the target neural network model includes the second convolutional layer, and the second convolutional layer is configured to increase a weight of a preset region in the first feature map based on the multi-scale attention mechanism; and obtaining the target feature map by performing a first preset processing on the second feature map, wherein the first preset processing includes an up-sampling operation.
In some embodiments, the first preset processing may include, but is not limited to, the up-sampling operation, a feature extraction operation, and other processing methods for generating the target feature map from the second feature map.
In some embodiments, the target neural network model includes the first convolutional layer, and the first convolutional layer is configured to adjust the resolution of the first picture, and to extract the lane line pixel of the first picture after the resolution adjustment. The first convolutional layer may also include, but is not limited to, a convolutional neural network for feature extraction, for example, a lightweight convolutional neural network. The lightweight convolutional neural network may perform operations including, but not limited to, a convolution operation with a convolutional kernel of size 1×1, 3×3, 5×5, 7×7, or another size. The convolution operation may be performed with a preset sliding step size, for example, a step size of 1 or 2.
The foregoing is only an exemplary description, and the embodiments do not make any specific limitations.
In some embodiments, the obtaining the first feature map by inputting the first picture into the first convolutional layer includes: obtaining the first feature map by inputting the first picture with a resolution less than or equal to a first preset resolution into the first convolutional layer.
In some embodiments, the first preset resolution may be manually set by an operator, and may also be set flexibly based on a computational power of the target neural network model or an original resolution of an image or video stream to which the first picture corresponds.
In some embodiments, the obtaining the first feature map by inputting the first picture with a resolution less than or equal to the first preset resolution may include, but is not limited to, inputting a picture with a lower resolution into the first convolutional layer to achieve the technical effect of improving the efficiency of feature extraction.
The foregoing is only an exemplary description, and the embodiments do not make any specific limitations.
In some embodiments, the obtaining the first feature map by inputting the first picture into the first convolutional layer includes: obtaining a first sub-feature map by inputting the first picture into a first sub-convolutional layer, wherein the first sub-convolutional layer is configured to perform a convolution operation, and the first convolutional layer includes the first sub-convolutional layer; obtaining a second sub-feature map by inputting the first sub-feature map into a second sub-convolutional layer, wherein the second sub-convolutional layer is configured to perform a depth separable convolution operation to extract feature information of the first sub-feature map, and the first convolutional layer includes the second sub-convolutional layer; in response to a resolution of the second sub-feature map being greater than a second preset resolution, reducing a resolution of the first sub-feature map by re-inputting the second sub-feature map into the first sub-convolutional layer; and in response to the resolution of the second sub-feature map being less than or equal to the second preset resolution, determining the second sub-feature map to be the first feature map.
In some embodiments, the first sub-convolutional layer is configured to reduce the resolution of a feature map to improve the efficiency of feature extraction, and the second sub-convolutional layer is configured to perform the depth separable convolution operation to extract the feature information in the first sub-feature map. The second preset resolution may be the same as or different from the first preset resolution.
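As an illustration only, the repeated pass through the two sub-convolutional layers may be sketched as follows. The sketch tracks feature map shapes, not pixel values. The function name `downsample_until`, the stride of 2 for the first sub-convolutional layer, and the example threshold of 40×23 are assumptions made for the illustration and are not taken from the disclosure.

```python
def downsample_until(width, height, max_width, max_height, stride=2):
    """Return the final (width, height) and the number of passes performed.

    Only shapes are tracked; a real network would also transform the values.
    """
    passes = 0
    while True:
        # First sub-convolutional layer: a strided convolution reduces the
        # resolution (ceil division models zero padding, an assumption).
        width = -(-width // stride)
        height = -(-height // stride)
        # Second sub-convolutional layer: a depth separable convolution with
        # step size 1 and zero padding leaves the resolution unchanged.
        passes += 1
        # Stop once the second sub-feature map fits the second preset resolution.
        if width <= max_width and height <= max_height:
            return width, height, passes


if __name__ == "__main__":
    print(downsample_until(320, 184, 40, 23))
```

The loop always performs at least one pass, matching the order of operations described above: the resolution check is applied to the second sub-feature map, not to the input picture.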
The foregoing is only an exemplary description, and the embodiments do not make any specific limitations.
In some embodiments, the obtaining the second sub-feature map by inputting the first sub-feature map into the second sub-convolutional layer includes: obtaining the second sub-feature map by extracting the feature information of the first sub-feature map with a 1×n convolutional kernel and by extracting the feature information of the first sub-feature map with an n×1 convolutional kernel, wherein the n is a positive odd number greater than 1.
In some embodiments, the first convolutional layer may include, but is not limited to, a special combination of a traditional convolutional neural network and a lightweight convolutional neural network. At the bottom of the above target neural network, due to the relatively large resolution of the input image, usually only a smaller convolutional kernel can be used, resulting in a smaller receptive field of the bottom convolutional kernel, while the use of a larger convolutional kernel will lead to excessive computational cost. By adopting the depth separable convolution with the large convolutional kernel in the second sub-convolutional layer in the first convolutional layer mentioned, and by using 1×n and n×1 convolutions instead of n×n convolutions, the computational cost of the neural network model is reduced while increasing the receptive field of the target neural network model. For an n×n convolutional kernel, assuming that the number of input channels is c1 and the number of output channels is c2, the number of parameters of the convolution layer is n×n×c1×c2. When the two convolutions of 1×n and n×1 are used instead, the number of parameters of the convolutional layer is 2×n×c1×c2, and the number of parameters is reduced by (n×n-2×n) ×c1×c2. Therefore, the greater the value of n, the more obvious the effect of reducing the number of parameters, while the receptive field remains unchanged.
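The parameter counts stated above can be checked with a few lines of arithmetic. The following is a minimal sketch that evaluates exactly the two expressions given in the text, n×n×c1×c2 and 2×n×c1×c2; the function names and the channel counts of 64 are hypothetical values chosen for the illustration.

```python
def params_full(n, c1, c2):
    """Parameter count of one n x n convolution, as stated in the text."""
    return n * n * c1 * c2


def params_factorized(n, c1, c2):
    """Parameter count of the 1 x n plus n x 1 pair, as stated in the text."""
    return 2 * n * c1 * c2


if __name__ == "__main__":
    for n in (3, 5, 7):
        full = params_full(n, 64, 64)
        split = params_factorized(n, 64, 64)
        # The saving equals (n*n - 2*n) * c1 * c2 and grows with n.
        print(n, full, split, full - split)
```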
The foregoing is only an exemplary description, and the embodiments do not make any specific limitations.
In some embodiments, the obtaining the second feature map by inputting the first feature map into the second convolutional layer includes: obtaining a first type feature map and a plurality of second type feature maps by inputting the first feature map into the second convolutional layer and by performing a convolution operation on the first feature map with a plurality of convolutional kernels of different sizes included in the second convolutional layer; determining a plurality of third type feature maps from the plurality of second type feature maps through a preset statistical method, wherein a second preset processing is performed on the plurality of third type feature maps such that sizes of the plurality of third type feature maps match each other, a third preset processing is performed on the plurality of third type feature maps to obtain an attention feature map, and a size of the attention feature map matches a size of the first type feature map; and obtaining the second feature map by performing the third preset processing on the attention feature map and the first type feature map.
In some embodiments, the plurality of second type feature maps may include, but are not limited to, second type feature maps obtained by performing a convolution operation on the first feature map with a plurality of convolutional kernels of different sizes.
In some embodiments, the preset statistical method may include, but is not limited to, norm formulas in statistics. Specifically, the preset statistical method may include, but is not limited to, vector norms, matrix norms, etc.
In some embodiments, the second preset processing may include, but is not limited to, calling a function to adjust the number of channels of a feature map, for example, a reshape operation to adjust the number of rows or columns of a feature vector corresponding to the feature map.
In some embodiments, the obtaining the second feature map by inputting the first feature map into the second convolutional layer may also include, but is not limited to, increasing a weight of an important region in the first feature map by the multi-scale attention mechanism in a region close to an output layer of the neural network model (for example, the weight of the important region in the first feature map includes but is not limited to a weight of a related region at which a lane line is prone to appear) , thereby improving the detection accuracy of the neural network model.
Specifically, assume that the resolution of the first feature map is w×h and the number of channels is c. First, convolution operations are performed on the first feature map with convolutional kernels of different scales, including but not limited to three convolutional kernels (1×1, 3×3, and 5×5, corresponding to the aforementioned plurality of convolutional kernels of different sizes). The different convolutional kernels are used to fuse information of lane line elements at different receptive field scales. Three feature maps are output, specifically two second type feature maps and one first type feature map. The resolution remains unchanged, and the numbers of channels are 0.5×c, 0.5×c, and c, respectively. The second type feature maps generated by the 3×3 and 5×5 convolutional kernels are used to calculate a correlation between elements to determine the importance of each pixel position in the feature map for global inference. Then, a statistic of the element values at each position is computed along the channel dimension using a statistical method. A calculation formula may be as follows.
Figure PCTCN2021132554-appb-000001
where x is the computed statistic, and x_i is the element value on the i-th channel at the same location. After the calculation, the resolution of the output feature map remains unchanged, and the number of channels becomes 1. Then, operations may include, but are not limited to: processing the output feature maps into (w×h)×1 and 1×(w×h) (corresponding to the aforementioned third type feature maps) through the reshape function (corresponding to the second preset processing); obtaining a matrix with a dimension of (w×h)×(w×h) through matrix multiplication (corresponding to the third preset processing) of the (w×h)×1 and 1×(w×h) feature maps; obtaining the attention feature map by performing a softmax operation on the matrix; processing the feature map generated by the 1×1 convolutional kernel (corresponding to the aforementioned first type feature map) into a dimension of c×(w×h) through the reshape function; obtaining another matrix with a dimension of c×(w×h) by performing matrix multiplication of the processed feature map and the attention feature map; processing this matrix into a c×w×h feature map through the reshape function; and taking the resulting feature map as the second feature map.
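The reshape, matrix multiplication and softmax steps described above may be sketched in plain Python as follows. Several points are assumptions made only for this illustration: the three convolution outputs are taken as given and already flattened to [channels][w×h]; the per-position statistic is taken to be an L2 norm over channels, because the formula itself is not reproduced in this text; and the softmax is applied over each column so that every output position is a weighted average of input positions. All function names are hypothetical.

```python
import math


def channel_norm(fmap):
    """Collapse a [channels][positions] map to one value per position (L2 norm)."""
    positions = len(fmap[0])
    return [math.sqrt(sum(ch[p] ** 2 for ch in fmap)) for p in range(positions)]


def column_softmax(matrix):
    """Softmax over each column, so that every column sums to 1."""
    n = len(matrix)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        column = [matrix[i][j] for i in range(n)]
        top = max(column)  # subtract the maximum for numerical stability
        exps = [math.exp(v - top) for v in column]
        total = sum(exps)
        for i in range(n):
            out[i][j] = exps[i] / total
    return out


def multi_scale_attention(first_type, second_a, second_b):
    """Inputs are [channels][w*h]; the result has the shape of first_type."""
    col = channel_norm(second_a)  # plays the role of the (w*h) x 1 map
    row = channel_norm(second_b)  # plays the role of the 1 x (w*h) map
    scores = [[a * b for b in row] for a in col]  # (w*h) x (w*h)
    attention = column_softmax(scores)
    n = len(attention)
    # c x (w*h) times (w*h) x (w*h) gives c x (w*h).
    return [[sum(ch[i] * attention[i][j] for i in range(n)) for j in range(n)]
            for ch in first_type]
```

Reshaping the c×(w×h) result back to c×w×h is a pure re-indexing step and is omitted here.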
It should be noted that, in middle and top layers of the target neural network, the dimension of the output feature is greatly reduced compared to the bottom of the target neural network. Ordinary convolution is used for feature extraction, and the convolutional kernel is adopted with a larger kernel to continue to maintain the receptive field of the convolutional kernel. 1×n and n×1 convolutions are adopted instead of n×n convolution to reduce the network computational cost.
The foregoing is only an exemplary description, and the embodiments do not make any specific limitations.
Through the embodiments, the target neural network generated based on the attention mechanism enables the output target feature map to reflect more effectively and accurately the pixel at each corresponding position in the input image, thereby improving the detection accuracy of lane lines.
In some embodiments, before the obtaining the target feature map by inputting the first picture into the target neural network model, the method further includes: obtaining a first sample picture and a label picture corresponding to the first sample picture; and obtaining the target neural network model by training a to-be-trained initial neural network model with the first sample picture and the label picture, wherein the target neural network model is configured to be trained through a loss function as follows.
loss = C(l_p, l_t) + α·C(b_p, b_t)
where C represents a cross entropy loss function, l_p represents the lane line pixel obtained by inputting the first sample picture into the to-be-trained initial neural network model, l_t represents the lane line pixel marked by the label picture, b_p represents the image background pixel obtained by inputting the first sample picture into the to-be-trained initial neural network model, b_t represents the image background pixel marked by the label picture, and α is a preset parameter greater than 0.
In some embodiments, since the number of the image background pixels is usually far greater than the number of the lane line pixels, the preset α is configured to alleviate the impact of category imbalance. During the training process, the α may be set to be 0.1.
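One possible reading of the weighted loss above is sketched below for a flat list of pixels. The sketch assumes that each prediction is the probability of the pixel being a lane line pixel, that the lane line term sums -log(p) over pixels labelled as lane line, that the background term sums -log(1 - p) over pixels labelled as background, and that the total is averaged over all pixels. The function name, the averaging, and the clipping constant `eps` are assumptions of the illustration.

```python
import math


def weighted_cross_entropy(lane_prob, label, alpha=0.1, eps=1e-7):
    """Cross entropy with the background term scaled by alpha.

    lane_prob: predicted probability of "lane line" for each pixel.
    label:     1 for a lane line pixel, 0 for a background pixel.
    """
    lane_term = 0.0
    background_term = 0.0
    for p, t in zip(lane_prob, label):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        if t == 1:
            lane_term -= math.log(p)
        else:
            background_term -= math.log(1.0 - p)
    return (lane_term + alpha * background_term) / len(label)
```

With α below 1, a misclassified background pixel contributes less than a misclassified lane line pixel, which is how the weighting counteracts the far larger number of background pixels.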
In some embodiments, during the training process of the initial neural network model, operations may include, but are not limited to, configuring an initial learning rate to a preset value, for example, 0.01; configuring a learning rate attenuation strategy to a preset strategy, for example, every 10,000 iterations, a learning rate is multiplied by 0.1; and configuring a total number of iterations to another preset value, for example, 60,000 iterations.
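The example learning rate attenuation strategy above corresponds to a step decay, which may be sketched as follows. The default values are the example values given in the text; the function name is hypothetical.

```python
def learning_rate(iteration, base_lr=0.01, step=10_000, gamma=0.1):
    """Step decay: multiply the rate by `gamma` once every `step` iterations."""
    return base_lr * gamma ** (iteration // step)


if __name__ == "__main__":
    total_iterations = 60_000
    for it in range(0, total_iterations, 10_000):
        print(it, learning_rate(it))
```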
The foregoing is only an exemplary description, and the embodiments do not make any specific limitations.
In some embodiments, the obtaining the target detection result by performing the image post-processing on the target feature map includes: obtaining a lane line result segmentation map by performing a binarization processing on the target feature map; obtaining a processed lane line result segmentation map containing a plurality of connected domains by preprocessing the lane line result segmentation map with an image erosion operation and an image dilation operation; and obtaining the target detection result by deleting at least one of the plurality of connected domains that does not meet a preset condition and fitting the remaining connected domains that meet the preset condition to obtain a fitted connected domain, wherein the fitted connected domain included in the target detection result represents a detected lane line in the first picture.
In some embodiments, the binarization processing may include, but is not limited to, obtaining the lane line result segmentation map by marking an original image based on the target feature map; wherein a part predicted to be the lane line pixel is marked as 1, and another part predicted not to be a lane line, that is, to be the background pixel, is marked as 0.
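A minimal sketch of the binarization step is given below. The threshold of 0.5 is an assumption made for the illustration, since the text does not state a threshold value; the function name is hypothetical.

```python
def binarize(prob_map, threshold=0.5):
    """Mark pixels predicted as lane line with 1 and background pixels with 0."""
    return [[1 if p > threshold else 0 for p in row] for row in prob_map]


if __name__ == "__main__":
    print(binarize([[0.9, 0.2], [0.5, 0.7]]))
```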
It should be noted that FIG. 3 is a schematic view of a lane line detection method based on deep learning according to an embodiment of the present disclosure. As shown in FIG. 3, the lane line pixel may be filtered by performing operations including but not limited to the following operations.
Operation S1: A width and a height of a circumscribed rectangle of each connected domain 302 are calculated. When the width and height are each less than a corresponding threshold, the connected domain is likely not to be a lane line pixel position, and the connected domain is directly deleted. A scenario before the lane line pixels are filtered may be shown in FIG. 4a, and a result of filtering the lane line pixels may be shown in FIG. 4b.
Operation S2: On a processed probability map, each different connected domain may be marked with a different numeric id through the scikit-image (skimage) library, representing different lane lines. A result after the processing may be shown in FIG. 4c. The lane line pixels represented by different brightness belong to different lane lines.
Operation S3: FIG. 5 is a schematic view of a lane line detection method based on deep learning according to another embodiment of the present disclosure. As shown in FIG. 5, after the connected domains of different ids are obtained, a function y=kx+b is applied to fit each connected domain separately, where k is the slope and b is the intercept, thereby obtaining a lane line model corresponding to each connected domain as a candidate lane line 502.
For example, due to the perspective effect, when parallel lane lines in a three-dimensional space are projected into a two-dimensional space, an angle between each lane line and a horizontal direction will be within a certain range. FIG. 6 is a schematic view of a lane line detection method based on deep learning according to another embodiment of the present disclosure. As shown in FIG. 6, the range of the angle is required to be determined according to the camera's installation position and focal length parameters. Then, according to the slope of each lane line, the candidate lane lines of which the angle is not within the range are further filtered out.
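The fitting in Operation S3 and the subsequent angle filtering may be sketched as follows. The sketch uses an ordinary least-squares fit, which is one common way to fit y=kx+b and is an assumption here, as the text does not name a fitting method. The function names and the angle range used in the usage example are hypothetical; in practice the range depends on the camera installation position and focal length parameters.

```python
import math


def fit_line(points):
    """Least-squares fit of y = k*x + b to (x, y) pixel coordinates."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        # All pixels share one x coordinate: the slope is undefined.
        raise ValueError("vertical connected domain cannot be written as y = k*x + b")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    k = sxy / sxx
    return k, mean_y - k * mean_x


def filter_by_angle(lines, min_deg, max_deg):
    """Keep (k, b) lines whose angle to the horizontal lies in [min_deg, max_deg]."""
    kept = []
    for k, b in lines:
        angle = abs(math.degrees(math.atan(k)))
        if min_deg <= angle <= max_deg:
            kept.append((k, b))
    return kept


if __name__ == "__main__":
    candidates = [fit_line([(0, 1), (1, 3), (2, 5)]), (0.05, 10.0)]
    print(filter_by_angle(candidates, 20, 80))
```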
In some embodiments, the preprocessing of the image erosion operation and the image dilation operation may include but is not limited to one or more operations to obtain the target feature map containing the plurality of connected domains. The preset condition may include but is not limited to a determination based on the size, length and other shape characteristics of the connected domains. For example, by setting a preset length threshold and a preset width threshold, a connected domain is deleted in response to the length of the connected domain being less than the preset length threshold and/or the width of the connected domain being less than the preset width threshold, and the target feature map containing the connected domains that meet the preset condition is retained to determine the final target detection result.
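The marking of connected domains with ids and the deletion of domains that fail a size condition may be sketched in plain Python as follows. This is a stand-in for a library routine such as the one in scikit-image mentioned above, not the implementation of the disclosure. The use of 8-connectivity, the function names, and the rule of deleting a domain only when both the width and the height of its circumscribed rectangle are below their thresholds (as in Operation S1) are assumptions of the illustration.

```python
from collections import deque


def label_components(mask):
    """8-connected labeling; returns (label map with 0 as background, id count)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_id = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1 or labels[y][x] != 0:
                continue
            next_id += 1
            labels[y][x] = next_id
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill of one connected domain
                cy, cx = queue.popleft()
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        inside = 0 <= ny < h and 0 <= nx < w
                        if inside and mask[ny][nx] == 1 and labels[ny][nx] == 0:
                            labels[ny][nx] = next_id
                            queue.append((ny, nx))
    return labels, next_id


def filter_small(labels, min_width, min_height):
    """Delete domains whose bounding box is below both size thresholds."""
    boxes = {}
    for y, row in enumerate(labels):
        for x, v in enumerate(row):
            if v:
                x0, y0, x1, y1 = boxes.get(v, (x, y, x, y))
                boxes[v] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    drop = {v for v, (x0, y0, x1, y1) in boxes.items()
            if (x1 - x0 + 1) < min_width and (y1 - y0 + 1) < min_height}
    return [[0 if v in drop else v for v in row] for row in labels]
```

Replacing `and` with `or` in the `drop` condition gives the stricter variant in which a domain is deleted when either dimension is below its threshold.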
For example, in some complex road scenes, the lane line features recorded in the target feature map output by the aforementioned target neural network model may show some discontinuities, as shown in FIG. 4a. The scattered, small-area lane line prediction pixels may be eliminated using, for example but not limited to, the image erosion operation in OpenCV, because such scattered, small-area lane line prediction pixels are likely to be false detections. The prediction pixel area for the remaining lane lines is then increased using the image dilation operation in OpenCV, and finally, the connected domains that do not belong to the lane lines are filtered out according to the size of the connected domains.
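The effect of erosion followed by dilation on a binary segmentation map may be sketched in plain Python as follows. This is a simplified stand-in for the OpenCV operations mentioned above, not a reproduction of them. The 3×3 square neighbourhood and the choice of ignoring positions outside the picture are assumptions of the illustration, and the function names are hypothetical.

```python
def _morph(mask, rule):
    """Apply `rule` (all or any) to the 3x3 neighbourhood of every pixel."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[ny][nx]
                      for ny in range(y - 1, y + 2)
                      for nx in range(x - 1, x + 2)
                      if 0 <= ny < h and 0 <= nx < w]  # skip out-of-picture cells
            out[y][x] = 1 if rule(window) else 0
    return out


def erode(mask):
    """A pixel survives only if its whole neighbourhood is set."""
    return _morph(mask, all)


def dilate(mask):
    """A pixel is set if any pixel in its neighbourhood is set."""
    return _morph(mask, any)


if __name__ == "__main__":
    noisy = [[0, 0, 0, 0, 0, 0],
             [1, 1, 1, 0, 0, 0],
             [1, 1, 1, 0, 0, 0],
             [1, 1, 1, 0, 0, 0],
             [0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 1]]
    for row in dilate(erode(noisy)):
        print(row)
```

In the usage example, the isolated pixel in the lower right corner is removed by the erosion and is not restored by the dilation, while the larger block is shrunk and then grown back.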
The foregoing is only an exemplary description, and the embodiments do not make any specific limitations.
In some embodiments, the method further includes:
re-determining the detected lane line in the first picture by like-for-like merging the fitted connected domains contained in the target detection result by a clustering algorithm.
In some embodiments, a plurality of candidate lane lines may remain after the non-lane lines are filtered out. Some candidate lane lines belong to different parts of the same lane line, as shown in the left side of FIG. 5. The two candidate lane lines on the left logically belong to the same lane line, and such candidate lane lines are required to be merged. Clustering is performed using the slope of each candidate lane line. The clustering algorithm used may include, but is not limited to, a mean shift clustering algorithm, which may be implemented with sklearn.cluster.MeanShift. A clustering radius is required to be determined based on the camera installation location and focal length parameters. The number of clustering centers obtained is the number of lane lines, and the equation of each lane line after merging is as follows.
Figure PCTCN2021132554-appb-000002
where n indicates the number of candidate lane lines being merged into the lane line, and k_i and b_i indicate the slope and intercept of the i-th candidate lane line respectively.
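The clustering and merging of candidate lane lines may be sketched as follows with a simple one-dimensional mean shift written in plain Python, used here in place of a library implementation. Two points are assumptions made only for this illustration: a flat kernel with a fixed radius is used, and the merged lane line is taken to be the average of the slopes and the average of the intercepts of the candidates in a cluster, because the merging equation itself is not reproduced in this text. The function names and the radius in the usage example are hypothetical.

```python
def mean_shift_1d(values, radius, iterations=50):
    """Flat-kernel mean shift on scalars; returns one converged mode per value."""
    modes = list(values)
    for _ in range(iterations):
        shifted = []
        for m in modes:
            near = [v for v in values if abs(v - m) <= radius]
            # Guard against an empty window caused by rounding.
            shifted.append(sum(near) / len(near) if near else m)
        modes = shifted
    return modes


def merge_lines(lines, radius, tol=1e-6):
    """Group (k, b) candidates by slope mode and average each group."""
    modes = mean_shift_1d([k for k, _ in lines], radius)
    groups = []  # list of (mode, member lines)
    for line, mode in zip(lines, modes):
        for center, members in groups:
            if abs(center - mode) <= tol:
                members.append(line)
                break
        else:
            groups.append((mode, [line]))
    return [(sum(k for k, _ in g) / len(g), sum(b for _, b in g) / len(g))
            for _, g in groups]


if __name__ == "__main__":
    print(merge_lines([(1.0, 0.0), (1.1, 4.0), (-1.0, 300.0)], radius=0.3))
```

The number of groups returned corresponds to the number of clustering centers, that is, the number of lane lines after merging.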
The foregoing is only an exemplary description, and the embodiments do not make any specific limitations.
In some embodiments, the target neural network model includes a neural network model including a lightweight convolutional neural network.
The foregoing is only an exemplary description, and the embodiments do not make any specific limitations.
The following will further explain the embodiments in combination with specific examples.
The target neural network model architecture may be adopted with a special combination of a traditional convolutional neural network model and a lightweight convolutional neural network model. At the bottom of the above target neural network model, due to the relatively large resolution of the input image, usually only a smaller convolutional kernel can be used, resulting in a smaller receptive field of the bottom convolutional kernel, while the use of a larger convolutional kernel will lead to excessive computational cost. In the present disclosure, the depth separable convolution of a large convolutional kernel is used at the bottom of the target neural network model, such that the computational cost of the neural network model is reduced while increasing the receptive field of the target neural network model. For an n×n convolutional kernel, assuming that the number of input channels is c1 and the number of output channels is c2, the number of parameters of the convolution layer is n×n×c1×c2. When the two convolutions of 1×n and n×1 are used instead, the number of parameters of the convolutional layer is 2×n×c1×c2, and the number of parameters is reduced by (n×n-2×n) ×c1×c2. Therefore, the greater the value of n, the more obvious the effect of reducing the number of parameters, while the receptive field remains unchanged.
In the middle and top layers of the target neural network model, the dimensionality of the feature map is greatly reduced compared to the bottom of the target neural network model. Ordinary convolution is used for feature extraction, and the convolutional kernel is adopted with a larger kernel to continue to maintain the receptive field of the convolutional kernel. 1×n and n×1 convolutions are adopted instead of n×n convolution to reduce the computational cost of the target neural network model.
The target neural network model structure is shown in Table 1.
Table 1
Figure PCTCN2021132554-appb-000003
Figure PCTCN2021132554-appb-000004
Among them, the above dewConvSP convolution operation may include but is not limited to the content shown in Table 2.
Table 2
Figure PCTCN2021132554-appb-000005
In some embodiments, near the output layer of the target neural network model, a multi-scale attention mechanism is configured to increase the weight of the important region on the feature map (corresponding to the attention convolutional layer in Table 1), so as to improve the detection accuracy of the target neural network model. The calculation flowchart of the proposed attention mechanism is shown in FIG. 7.
S702: The feature map output by a previous convolutional layer (corresponding to the aforementioned first feature map) is input, assuming that the resolution of the first feature map is w×h, and the number of channels is c.
S704: A convolution operation is performed on the input feature map using three convolutional kernels of different scales, wherein the different convolutional kernels are used to fuse information of lane line elements at different receptive field scales. Three feature maps are output, the resolution remains unchanged, and the numbers of channels are 0.5×c, 0.5×c, and c (from top to bottom).
S706: The first two feature maps are used to calculate the correlation between elements, in order to determine the importance of each location to the global inference. Then, a statistic of the element values at each position is computed along the channel dimension using a statistical method. A calculation formula may be as follows.
Figure PCTCN2021132554-appb-000006
A schematic diagram of the foregoing calculation process may be, but is not limited to, as shown in FIG. 8,
where x is the computed statistic, and x_i is the element value on the i-th channel at the same location. Taking i equal to 4 as an example, after the calculation, the resolution of the output feature map remains unchanged, and the number of channels becomes 1. Then the output feature maps are processed into (w×h)×1 and 1×(w×h) through the reshape function, a matrix with a dimension of (w×h)×(w×h) is obtained through matrix multiplication, and the attention feature map is obtained by performing a softmax operation on the matrix.
S708: A feature map generated by H(x) is processed into a dimension of c×(w×h) through the reshape function, another matrix with a dimension of c×(w×h) is obtained by performing matrix multiplication of the processed feature map and the attention feature map, this matrix is processed into a c×w×h feature map through the reshape function, and the resulting feature map is taken as the second feature map.
In some embodiments, for all convolution operations in the target neural network model (corresponding to Table 1), as long as the sliding step size is 1, the feature map is padded with zeros around its border before the convolution operation, ensuring that the resolution of the output feature map remains unchanged after the convolution operation. The input of the network is an RGB three-channel color image. The original image is resized to 320×184×3 using bilinear interpolation, and the label image is resized to 320×184 using a nearest neighbor interpolation algorithm. After a series of convolution operations, the network outputs the lane line segmentation result (corresponding to a picture in which each pixel indicates the probability that the location of that pixel is a lane line) with the same resolution as the input image. The two channels of the result represent the image background pixel location information and the lane line pixel location information, respectively. The loss function used for model training is a weighted cross entropy, given by the following equation.
loss = C(l_p, l_t) + α·C(b_p, b_t)
where C represents a cross entropy loss function, l_p represents the lane line pixel obtained by inputting the first sample picture into the to-be-trained initial neural network model, l_t represents the lane line pixel marked by the label picture, b_p represents the image background pixel obtained by inputting the first sample picture into the to-be-trained initial neural network model, b_t represents the image background pixel marked by the label picture, and α is a preset parameter greater than 0.
It should be noted that since background pixels are usually far more than lane line pixels, α is configured to alleviate the impact of category imbalance, and α may be set to 0.1 during training. The initial learning rate used in training is 0.01. The learning rate decay strategy is that every 10,000 iterations, a learning rate is multiplied by 0.1, and the total number of iterations is set to 60,000.
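The zero padding rule described above for convolutions with a sliding step size of 1 may be checked with the standard output size formula for a convolution, as sketched below. The function names are hypothetical, and the restriction to odd kernel sizes is an assumption of the illustration that is consistent with the kernel sizes mentioned in this disclosure (1×1, 3×3, 5×5, 7×7).

```python
def same_padding(kernel_size):
    """Zeros to add on each side so that a stride-1 convolution keeps the size."""
    if kernel_size % 2 == 0:
        raise ValueError("an odd kernel size is expected")
    return (kernel_size - 1) // 2


def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Standard output length of a convolution along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1


if __name__ == "__main__":
    for k in (1, 3, 5, 7):
        p = same_padding(k)
        print(k, p, conv_output_size(320, k, 1, p), conv_output_size(184, k, 1, p))
```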
After the aforementioned target neural network model outputs the results of the probability map of the lane line segmentation results, the image post-processing and statistical methods are configured to filter the falsely detected lane lines and merge the lane lines. The processing flow chart is shown in FIG. 9. The specific steps may be as follows.
S902: the target neural network model outputs the target feature map;
S904: erosion and dilation processing are performed on the target feature map;
S906: the connected domains are filtered;
S908: the connected domains are each marked with an ID;
S910: the lane lines are fitted according to the connected domains;
S912: the lane lines are filtered according to the angle range;
S914: clustering is performed to obtain the final lane line;
S916: the detection result is output.
In some complex road scenes, the lane line features recorded in the target feature map output by the aforementioned target neural network model may show some discontinuities, as shown in FIG. 4a. Scattered, small-area lane line prediction pixels are likely to be false detections, and may be eliminated using methods including, but not limited to, the image erosion operation in OpenCV. The prediction pixel area of the remaining lane lines is then enlarged using the image dilation operation in OpenCV. Finally, the connected domains that do not belong to lane lines are filtered out according to the size of the connected domains.
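The effect of erosion followed by dilation on a binary lane line mask can be illustrated with a dependency-free sketch. It is a stand-in for the OpenCV operations named above, not a reproduction of them: the 3×3 square structuring element, the helper names `erode` and `dilate`, and the treatment of pixels outside the image as background are assumptions of this sketch, and a library may handle image borders differently.

```python
def _morph(mask, require_all):
    """Apply a 3x3 square structuring element; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                for yy in (y - 1, y, y + 1)
                for xx in (x - 1, x, x + 1)
            ]
            out[y][x] = int(all(window)) if require_all else int(any(window))
    return out


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return _morph(mask, True)


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return _morph(mask, False)


if __name__ == "__main__":
    mask = [
        [0, 0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0, 0],
        [0, 1, 1, 1, 0, 1, 0],  # the lone pixel at column 5 mimics a false detection
        [0, 1, 1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0],
    ]
    for row in dilate(erode(mask)):
        print(row)
```

In this toy mask the isolated pixel is removed by the erosion, while the 3×3 block is shrunk by the erosion and then restored by the dilation.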
It should be noted that the above lane line pixels can be filtered by including but not limited to the following methods:
S1: A width and a height of a circumscribed rectangle of each connected domain 302 as shown in FIG. 3 are calculated. When the width and height are each less than a corresponding threshold, the connected domain is likely not to be a lane line pixel position, and the connected domain is directly deleted. A result after the lane line pixel is processed may be shown in FIG. 4b. The connected domains shown in FIG. 4b do not include part of the connected domains included in FIG. 4a.
S2: On the processed probability map, each connected domain may be marked with a different numeric ID through the scikit-image (skimage) library, with different IDs representing different lane lines. A result after the processing may be shown in FIG. 4c. The lane line pixels represented by different brightness belong to different lane lines.
S3: After the connected domains with different IDs are obtained, a function y = kx + b is applied to fit each connected domain separately, where k is the slope and b is the intercept, thereby obtaining a lane line model corresponding to each connected domain as a candidate lane line 502 as shown in FIG. 5.
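Steps S1 to S3 above can be sketched without external libraries as follows. The breadth-first labeling is a stand-in for the library labeling mentioned in S2, and the use of 8-connectivity, the ordinary least-squares fit, and all function names are assumptions made for illustration.

```python
from collections import deque


def label_components(mask):
    """Label 8-connected foreground regions; returns {id: [(x, y), ...]}."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = {}
    next_id = 1
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            pixels = []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            components[next_id] = pixels
            next_id += 1
    return components


def bounding_box_size(pixels):
    """Width and height of the circumscribed rectangle of a component."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return max(xs) - min(xs) + 1, max(ys) - min(ys) + 1


def filter_small(components, min_width, min_height):
    """Delete a component only when both its width and height are below threshold."""
    kept = {}
    for cid, pixels in components.items():
        width, height = bounding_box_size(pixels)
        if width < min_width and height < min_height:
            continue
        kept[cid] = pixels
    return kept


def fit_line(pixels):
    """Least-squares fit of y = k*x + b to the pixels of one component; returns (k, b)."""
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pixels)
    if sxx == 0:
        raise ValueError("vertical component: y = k*x + b is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pixels)
    k = sxy / sxx
    return k, mean_y - k * mean_x
```

A perfectly vertical connected domain has no finite slope under y = kx + b, so the sketch raises an error in that case rather than guessing how such a domain should be handled.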
For example, due to the perspective effect, when parallel lane lines in a three-dimensional space are projected into a two-dimensional space, the angle between each lane line and the horizontal direction will be within a certain range. As shown in FIG. 6, the range of the angle needs to be determined according to the camera’s installation position and focal length parameters. Then, according to the slope of each candidate lane line, the candidate lane lines whose angle is not within the range are further filtered out.
A plurality of candidate lane lines may remain after the non-lane lines are filtered out. Some candidate lane lines belong to different parts of the same lane line. As shown on the left side of FIG. 5, the two candidate lane lines on the left logically belong to the same lane line, and such candidate lane lines need to be merged. Clustering is performed using the slope of each candidate lane line. The clustering algorithm used may include, but is not limited to, a mean shift clustering algorithm, which may be implemented with sklearn.cluster.MeanShift. The clustering radius needs to be determined based on the camera installation position and focal length parameters. The number of clustering centers obtained is the number of lane lines, and the equation of each lane line after merging is as follows.
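The angle-based filtering may be sketched as follows. The function names and the example range of 20 to 80 degrees are hypothetical; as stated above, the actual range depends on the camera installation position and focal length parameters.

```python
import math


def angle_to_horizontal(k):
    """Acute angle in degrees between the line y = k*x + b and the horizontal direction."""
    return abs(math.degrees(math.atan(k)))


def filter_by_angle(lines, min_deg, max_deg):
    """Keep only candidate lines (k, b) whose angle lies within [min_deg, max_deg]."""
    return [(k, b) for k, b in lines if min_deg <= angle_to_horizontal(k) <= max_deg]


if __name__ == "__main__":
    candidates = [(1.0, 0.0), (0.05, 10.0), (-2.0, 300.0)]
    # The nearly horizontal candidate with slope 0.05 falls outside the range.
    print(filter_by_angle(candidates, 20.0, 80.0))
```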
Figure PCTCN2021132554-appb-000007
where $n$ indicates the number of candidate lane lines being merged into the lane line, and $k_i$ and $b_i$ indicate the slope and intercept of the $i$-th candidate lane line, respectively.
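The merging step may be sketched as follows. Two assumptions are made and should be read as such: first, a flat-kernel one-dimensional mean shift written in plain Python stands in for the library implementation named above, with `bandwidth` playing the role of the clustering radius; second, the merged lane line is assumed to take the mean slope and mean intercept of its n member candidates, which is one reading of the definitions of n, k_i and b_i. All function names are hypothetical.

```python
def mean_shift_1d(values, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift on scalars; returns one cluster label per value."""
    modes = []
    for v in values:
        m = v
        for _ in range(max_iter):
            neighbours = [u for u in values if abs(u - m) <= bandwidth]
            new_m = sum(neighbours) / len(neighbours)
            if abs(new_m - m) < tol:
                break
            m = new_m
        modes.append(m)
    # Group converged modes that lie close together into the same cluster.
    centers, labels = [], []
    for m in modes:
        for idx, c in enumerate(centers):
            if abs(m - c) <= bandwidth / 2:
                labels.append(idx)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels


def merge_lines(lines, bandwidth):
    """Cluster candidate lines (k, b) by slope and average each cluster (assumed rule)."""
    labels = mean_shift_1d([k for k, _ in lines], bandwidth)
    groups = {}
    for label, line in zip(labels, lines):
        groups.setdefault(label, []).append(line)
    merged = []
    for members in groups.values():
        n = len(members)
        merged.append((sum(k for k, _ in members) / n, sum(b for _, b in members) / n))
    return merged
```

For example, `merge_lines([(1.0, 10.0), (1.1, 12.0), (-1.0, 200.0)], 0.3)` groups the first two candidates into one lane line and keeps the third as a separate lane line.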
With this embodiment, a target neural network model based on a multi-scale attention mechanism is used for the lane line detection task, and a lightweight lane line detection network is designed based on the characteristics of each layer of the deep neural network. Using a lightweight deep neural network in the lane line detection task reduces the complexity and computational cost of the network, which is advantageous when the network is deployed on embedded devices with limited storage capacity and computational power. The embodiment also increases the receptive field of the target neural network model and alleviates the problem of discontinuous predicted lane line pixels in the lane line segmentation task. Moreover, in combination with the output format of the target neural network model, a set of post-processing algorithms may be designed that can accurately detect lane lines based on the coarse extraction of lane line features by the deep neural network, reducing the network’s requirement for training datasets. For example, the training can be performed on some general datasets, and the requirement of roughly extracting lane line features can be met without specifically collecting and labeling the corresponding road scenes, which reduces the cost of practical application of the algorithm.
From the description of the above embodiments, it will be clear to those skilled in the art that the method according to the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present disclosure, in essence, or the part thereof that contributes over the prior art, may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, CD-ROM) and includes a number of instructions to enable a terminal device (which may be a cell phone, a computer, a server, a network device, etc.) to perform the method described in various embodiments of the present disclosure.
The present disclosure also provides a lane line detection apparatus based on deep learning, which is configured to implement the aforementioned embodiments and preferred implementations, and what has already been explained will not be repeated. As used below, the term “module” may refer to a combination of software and/or hardware that implements predetermined functions. Although the apparatus described in the following embodiments is preferably implemented by software, implementation by hardware or by a combination of software and hardware is also possible and conceived.
FIG. 10 is a structural block diagram of a lane line detection apparatus based on deep learning according to an embodiment of the present disclosure. As shown in FIG. 10, the apparatus includes:
an obtaining module 1002, configured to obtain a first picture that is planned to be detected;
a first processing module 1004, configured to obtain a target feature map by inputting the first picture into a target neural network model; wherein the target neural network model includes a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel; and
a second processing module 1006, configured to obtain a target detection result by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
In some embodiments, the first processing module 1004 includes:
a first calculation unit, configured to obtain a first feature map by inputting the first picture into a first convolutional layer; wherein the target neural network model includes the first convolutional layer, and the first convolutional layer is configured to adjust a resolution of the first picture and to extract the lane line pixel of the first picture after the resolution adjustment;
a second calculation unit, configured to obtain a second feature map by inputting the first feature map into a second convolutional layer; wherein the target neural network model includes the second convolutional layer, and the second convolutional layer is configured to increase a weight of a preset region in the first feature map based on the multi-scale attention mechanism; and
a first processing unit, configured to obtain the target feature map by performing a first preset processing on the second feature map; wherein the first preset processing includes an up-sampling operation.
In some embodiments, the aforementioned apparatus is configured to input the first picture into the first convolutional layer in the following manner to obtain the first feature map: obtaining the first feature map by inputting the first picture with a resolution less than or equal to a first preset resolution into the first convolutional layer.
In some embodiments, the aforementioned apparatus is configured to input the first picture into the first convolutional layer in the following manner to obtain the first feature map: obtaining a first sub-feature map by inputting the first picture into a first sub-convolutional layer, wherein the first sub-convolutional layer is configured to perform a convolution operation, and the first convolutional layer includes the first sub-convolutional layer; obtaining a second sub-feature map by inputting the first sub-feature map into a second sub-convolutional layer, wherein the second sub-convolutional layer is configured to perform a depth separable convolution operation to extract feature information of the first sub-feature map, and the first convolutional layer includes the second sub-convolutional layer; in response to a resolution of the second sub-feature map being greater than a second preset resolution, reducing a resolution of the first sub-feature map by re-inputting the second sub-feature map into the first sub-convolutional layer; in response to the resolution of the second sub-feature map being less than or equal to the second preset resolution, determining the second sub-feature map to be the first feature map.
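The control flow of this repeated down-sampling can be sketched by tracking only the feature map resolution. The sketch assumes that the first sub-convolutional layer halves the resolution (stride 2 with padding) and that the second sub-convolutional layer preserves it; neither the stride nor the preset resolution of 40×23 used in the example is stated in this passage, and the function name is hypothetical.

```python
def downsample_until(width, height, max_width, max_height, stride=2):
    """Resolutions after each (strided convolution -> separable convolution) stage."""
    if stride < 2 or max_width < 1 or max_height < 1:
        raise ValueError("stride must be >= 2 and the preset resolution must be positive")
    stages = []
    while True:
        # First sub-convolutional layer: assumed to reduce resolution by `stride`.
        width = -(-width // stride)  # ceiling division
        height = -(-height // stride)
        # Second sub-convolutional layer: assumed to preserve resolution.
        stages.append((width, height))
        # Stop once the resolution is at or below the second preset resolution.
        if width <= max_width and height <= max_height:
            return stages


if __name__ == "__main__":
    print(downsample_until(320, 184, 40, 23))
```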
In some embodiments, the first calculation unit is configured to input the first sub-feature map into the second sub-convolutional layer in the following manner to obtain the second sub-feature map: obtaining the second sub-feature map by extracting the feature information of the first sub-feature map with a 1×n convolutional kernel and by extracting the feature information of the first sub-feature map with an n×1 convolutional kernel, wherein the n is a positive odd number greater than 1.
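To illustrate why the 1×n and n×1 kernels suit a lightweight network, the following sketch compares parameter counts. It assumes that each of the two kernels is applied per channel (depthwise) and that a 1×1 pointwise convolution follows to mix channels, which is the usual companion of a depthwise convolution but is not spelled out in this passage; biases are ignored and the function names are hypothetical.

```python
def standard_conv_params(c_in, c_out, n):
    """Weights of a standard n x n convolution."""
    return n * n * c_in * c_out


def separable_asymmetric_params(c_in, c_out, n):
    """Depthwise 1 x n and depthwise n x 1 kernels, then a 1 x 1 pointwise convolution."""
    depthwise = 2 * n * c_in  # one 1 x n and one n x 1 kernel per input channel
    pointwise = c_in * c_out  # assumed channel-mixing step
    return depthwise + pointwise


if __name__ == "__main__":
    c_in, c_out, n = 64, 64, 3
    print(standard_conv_params(c_in, c_out, n), separable_asymmetric_params(c_in, c_out, n))
```

Under these assumptions, the saving relative to a standard convolution grows with the kernel size n, since the depthwise cost scales with n rather than with n squared.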
In some embodiments, the second calculation unit is configured to input the first feature map into the second convolutional layer in the following manner to obtain the second feature map: obtaining a first type feature map and a plurality of second type feature maps by inputting the first feature map into the second convolutional layer and by performing a convolution operation on the first feature map with a plurality of convolutional kernels of different sizes included in the second convolution layer; determining a plurality of third type feature maps from the plurality of second type feature maps through a preset statistical method, wherein the plurality of third type feature maps are configured to be performed with a second preset processing such that sizes of the plurality of third type feature maps match each other, the plurality of third type feature maps are configured to be performed with a third preset processing to obtain an attention feature map, and a size of the attention feature map matches a size of the first type feature map; and obtaining the second feature map by performing the third preset processing on the attention feature map and the first type feature map.
In some embodiments, the apparatus is further configured to obtain a first sample picture and a label picture corresponding to the first sample picture; and
obtain the target neural network model by training a to-be-trained initial neural network model with the first sample picture and the label picture, wherein the target neural network model is configured to be trained through a loss function as follows.
$$\text{loss} = C(l_p, l_t) + \alpha\, C(b_p, b_t)$$
where $C$ represents a cross entropy loss function, $l_p$ represents the lane line pixel obtained by inputting the first sample picture into the to-be-trained initial neural network model, $l_t$ represents the lane line pixel marked by the label picture, $b_p$ represents the image background pixel obtained by inputting the first sample picture into the to-be-trained initial neural network model, $b_t$ represents the image background pixel marked by the label picture, and $\alpha$ is a preset parameter greater than 0.
In some embodiments, the second processing module 1006 includes:
a second processing unit, configured to obtain a lane line result segmentation map by performing a binarization processing on the target feature map;
a third processing unit, configured to obtain a processed lane line result segmentation map containing a plurality of connected domains by preprocessing the lane line result segmentation map with an image erosion operation and an image dilation operation;
a fourth processing unit, configured to obtain a target detection result by deleting at least one of the plurality of connected domains that does not meet a preset condition and fitting the remaining ones of the plurality of connected domains that meet the preset condition to obtain a fitted connected domain, wherein the fitted connected domain included in the target detection result represents a detected lane line in the first picture.
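The binarization performed by the second processing unit can be sketched as a simple threshold on the probability map. The threshold of 0.5 and the function name `binarize` are assumptions made for illustration; the disclosure does not state a threshold in this passage.

```python
def binarize(prob_map, threshold=0.5):
    """Turn a per-pixel lane line probability map into a 0/1 segmentation map."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


if __name__ == "__main__":
    probabilities = [
        [0.9, 0.2],
        [0.5, 0.49],
    ]
    print(binarize(probabilities))
```

The resulting map is the form of input on which erosion, dilation, and connected-domain analysis are typically carried out.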
In some embodiments, the apparatus is further configured to:
re-determine the detected lane line in the first picture by like-for-like merging the fitted connected domains contained in the target detection result by a clustering algorithm.
In some embodiments, the target neural network model includes a neural network model including a lightweight convolutional neural network.
It should be noted that each of the above modules can be implemented by software or hardware. In the latter case, the modules can be implemented in, but not limited to, the following ways: the above modules are all located in the same processor; or the above modules are located in different processors in any combination.
An embodiment of the present disclosure also provides a computer-readable storage medium in which a computer program is stored, wherein the computer program is configured to execute the steps in any one of the foregoing method embodiments when executed.
In this embodiment, the computer-readable storage medium may be configured to store a computer program for executing the following steps:
S1: A first picture that is planned to be detected is obtained.
S2: A target feature map is obtained by inputting the first picture into a target neural network model; wherein the target neural network model includes a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel.
S3: A target detection result is obtained by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
In some embodiments, the foregoing computer-readable storage medium may include, but is not limited to, a USB flash drive, a read-only memory (ROM) , a random access memory (RAM) , a mobile hard drive, a magnetic disk or an optical disk and other media that can store computer programs.
An embodiment of the present disclosure also provides an electronic device, including a memory and a processor, the memory stores a computer program, and the processor is configured to run the computer program to execute the steps in any of the foregoing method embodiments.
In some embodiments, the aforementioned electronic device may further include a transmission device and an input-output device, wherein the transmission device is connected to the processor, and the input-output device is connected to the processor.
In some embodiments, the processor may be configured to execute the following steps through a computer program:
S1: A first picture that is planned to be detected is obtained.
S2: A target feature map is obtained by inputting the first picture into a target neural network model; wherein the target neural network model includes a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel.
S3: A target detection result is obtained by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
For specific examples in this embodiment, reference may be made to the examples described in the aforementioned embodiments and exemplary implementations, which will not be repeated here.
Clearly, it should be understood by those skilled in the art that the modules or steps of the present disclosure described above may be implemented with a general-purpose computing device. They may be centralized on a single computing device or distributed over a network of multiple computing devices, and they may be implemented with program code executable by the computing device, so that they may be stored in a storage device to be executed by the computing device. In some cases, the steps shown or described may be executed in a different order than herein, or they may be implemented separately as individual integrated circuit modules, or multiple modules or steps thereof may be implemented as a single integrated circuit module. In this way, the present disclosure is not limited to any particular combination of hardware and software.
The foregoing are only preferred embodiments of the present disclosure and are not intended to limit the present disclosure. To those skilled in the art, the present disclosure is subject to various modifications and variations. Any modification, equivalent replacement, improvement, etc. made within the principles of the present disclosure shall be included in the scope of the present disclosure.

Claims (13)

  1. A lane line detection method based on deep learning, comprising:
    obtaining a first picture that is planned to be detected;
    obtaining a target feature map by inputting the first picture into a target neural network model; wherein the target neural network model comprises a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel; and
    obtaining a target detection result by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
  2. The method according to claim 1, wherein the obtaining the target feature map by inputting the first picture into the target neural network model comprises:
    obtaining a first feature map by inputting the first picture into a first convolutional layer;
    obtaining a second feature map by inputting the first feature map into a second convolutional layer, wherein the target neural network model comprises the second convolutional layer, and the second convolutional layer is configured to increase a weight of a preset region in the first feature map based on the multi-scale attention mechanism; and
    obtaining the target feature map by performing a first preset processing on the second feature map, wherein the first preset processing comprises an up-sampling operation.
  3. The method according to claim 2, wherein the obtaining the first feature map by inputting the first picture into the first convolutional layer comprises:
    obtaining the first feature map by inputting the first picture with a resolution less than or equal to a first preset resolution into the first convolutional layer.
  4. The method according to claim 2, wherein the obtaining the first feature map by inputting the first picture into the first convolutional layer comprises:
    obtaining a first sub-feature map by inputting the first picture into a first sub-convolutional layer, wherein the first sub-convolutional layer is configured to perform a convolution operation, and the first convolutional layer comprises the first sub-convolutional layer;
    obtaining a second sub-feature map by inputting the first sub-feature map into a second sub-convolutional layer, wherein the second sub-convolutional layer is configured to perform a depth separable convolution operation to extract feature information of the first sub-feature map, and the first convolutional layer comprises the second sub-convolutional layer;
    in response to a resolution of the second sub-feature map being greater than a second preset resolution, reducing a resolution of the first sub-feature map by re-inputting the second sub-feature map into the first sub-convolutional layer; and
    in response to the resolution of the second sub-feature map being less than or equal to the second preset resolution, determining the second sub-feature map to be the first feature map.
  5. The method according to claim 4, wherein the obtaining the second sub-feature map by inputting the first sub-feature map into the second sub-convolutional layer comprises:
    obtaining the second sub-feature map by extracting the feature information of the first sub-feature map with a 1×n convolutional kernel and by extracting the feature information of the first sub-feature map with an n×1 convolutional kernel, wherein the n is a positive odd number greater than 1.
  6. The method according to claim 2, wherein the obtaining the second feature map by inputting the first feature map into the second convolutional layer comprises:
    obtaining a first type feature map and a plurality of second type feature maps by inputting the first feature map into the second convolutional layer and by performing a convolution operation on the first feature map with a plurality of convolutional kernels of different sizes included in the second convolution layer;
    determining a plurality of third type feature maps from the plurality of second type feature maps through a preset statistical method, wherein the plurality of third type feature maps are configured to be performed with a second preset processing such that sizes of the plurality of third type feature maps match each other, the plurality of third type feature maps are configured to be performed with a third preset processing to obtain an attention feature map, and a size of the attention feature map matches a size of the first type feature map; and
    obtaining the second feature map by performing the third preset processing on the attention feature map and the first type feature map.
  7. The method according to claim 1, before the obtaining the target feature map by inputting the first picture into the target neural network model, further comprising:
    obtaining a first sample picture and a label picture corresponding to the first sample picture; and
    obtaining the target neural network model by training a to-be-trained initial neural network model with the first sample picture and the label picture, wherein the target neural network model is configured to be trained through a loss function as follows:
    $$\text{loss} = C(l_p, l_t) + \alpha\, C(b_p, b_t)$$
    where $C$ represents a cross entropy loss function, $l_p$ represents a lane line pixel obtained by inputting the first sample picture into the to-be-trained initial neural network model, $l_t$ represents a lane line pixel marked by the label picture, $b_p$ represents an image background pixel obtained by inputting the first sample picture into the to-be-trained initial neural network model, $b_t$ represents an image background pixel marked by the label picture, and $\alpha$ is a preset parameter greater than 0.
  8. The method according to claim 1, wherein the obtaining the target detection result by performing the image post-processing on the target feature map comprises:
    obtaining a lane line result segmentation map by performing a binarization processing on the target feature map;
    obtaining a processed lane line result segmentation map containing a plurality of connected domains by preprocessing the lane line result segmentation map with an image erosion operation and an image dilation operation; and
    obtaining a target detection result by deleting at least one of the plurality of connected domains that does not meet a preset condition and fitting a remaining of the plurality of connected domains that meets a preset condition, wherein a fitted connected domain included in the target detection result represents the detected lane line in the first picture.
  9. The method according to claim 8, further comprising:
    re-determining the detected lane line in the first picture by like-for-like merging the fitted connected domain included in the target detection result by a clustering algorithm.
  10. The method according to any one of claims 1-9, wherein the target neural network model comprises a neural network model comprising a lightweight convolutional neural network.
  11. A lane line detection apparatus based on deep learning, comprising:
    an obtaining module, configured to obtain a first picture that is planned to be detected;
    a first processing module, configured to obtain a target feature map by inputting the first picture into a target neural network model; wherein the target neural network model comprises a neural network model generated based on a multi-scale attention mechanism and a deep separable convolution model, and the target feature map is configured to represent a probability of each pixel in the first picture being a lane line pixel; and
    a second processing module, configured to obtain a target detection result by performing an image post-processing on the target feature map; wherein the target detection result is configured to indicate a detected lane line in the first picture.
  12. A computer-readable storage medium, storing a computer program; wherein the computer program is configured to perform the method according to any one of claims 1-10 when executed.
  13. An electronic device, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor; wherein the processor is configured to perform the method according to any one of claims 1-10 when executing the computer program.
PCT/CN2021/132554 2020-12-25 2021-11-23 Lane line detection method based on deep learning, and apparatus Ceased WO2022134996A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP21908995.0A EP4252148B1 (en) 2020-12-25 2021-11-23 Lane line detection method based on deep learning, and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011555482.9A CN112287912B (en) 2020-12-25 2020-12-25 Lane line detection method and device based on deep learning
CN202011555482.9 2020-12-25

Publications (1)

Publication Number Publication Date
WO2022134996A1 true WO2022134996A1 (en) 2022-06-30

Family

ID=74426133

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/132554 Ceased WO2022134996A1 (en) 2020-12-25 2021-11-23 Lane line detection method based on deep learning, and apparatus

Country Status (3)

Country Link
EP (1) EP4252148B1 (en)
CN (1) CN112287912B (en)
WO (1) WO2022134996A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009524A (en) * 2017-12-25 2018-05-08 西北工业大学 A kind of method for detecting lane lines based on full convolutional network
CN110276267A (en) * 2019-05-28 2019-09-24 江苏金海星导航科技有限公司 Method for detecting lane lines based on Spatial-LargeFOV deep learning network
CN112036467A (en) * 2020-08-27 2020-12-04 循音智能科技(上海)有限公司 Abnormal heart sound identification method and device based on multi-scale attention neural network
CN112287912A (en) * 2020-12-25 2021-01-29 浙江大华技术股份有限公司 Lane line detection method and device based on deep learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109902600B (en) * 2019-02-01 2020-10-27 清华大学 Road area detection method
CN111914596B (en) * 2019-05-09 2024-04-09 北京四维图新科技股份有限公司 Lane detection method, device, system and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4252148A4 *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115511779A (en) * 2022-07-20 2022-12-23 北京百度网讯科技有限公司 Image detection method, device, electronic device and storage medium
CN115511779B (en) * 2022-07-20 2024-02-20 北京百度网讯科技有限公司 Image detection method, device, electronic equipment and storage medium
CN115240158A (en) * 2022-08-09 2022-10-25 上海励驰半导体有限公司 Lane line detection method based on deep learning
CN115564975A (en) * 2022-08-30 2023-01-03 海口乐帆技术开发有限公司 Image matching method, device, terminal equipment and storage medium
CN115526908A (en) * 2022-09-27 2022-12-27 电子科技大学 A high-speed moving filter rod explosion bead detection and positioning method
CN115311573A (en) * 2022-10-08 2022-11-08 浙江壹体科技有限公司 Site line detection and target positioning method, electronic equipment and storage medium
CN115512325A (en) * 2022-10-14 2022-12-23 东南大学 An End-to-End Instance Segmentation Based Lane Detection Method
CN115661188A (en) * 2022-11-08 2023-01-31 南京莱斯电子设备有限公司 Road panoramic target detection tracking method under edge computing platform
CN115861650A (en) * 2022-12-14 2023-03-28 安徽大学 Shadow detection method and device based on attention mechanism and federated learning
CN115861650B (en) * 2022-12-14 2026-02-03 安徽大学 Shadow detection method and device based on attention mechanism and federated learning
CN116416590A (en) * 2023-04-12 2023-07-11 西安电子科技大学 Unsupervised Domain Adaptation Lane Line Detection Method Based on Progressive Feature Alignment
CN116416590B (en) * 2023-04-12 2025-07-29 西安电子科技大学 Unsupervised domain adaptive lane line detection method based on progressive feature alignment
CN117036931A (en) * 2023-07-04 2023-11-10 中国铁建昆仑投资集团有限公司 A method for detecting small target pests in ecological landscape engineering based on convolutional neural network
CN117058636A (en) * 2023-07-21 2023-11-14 上海欧菲智能车联科技有限公司 Lane line detection method, device, electronic equipment and storage medium
CN116872961B (en) * 2023-09-07 2023-11-21 北京捷升通达信息技术有限公司 Control system for intelligent driving vehicle
CN116872961A (en) * 2023-09-07 2023-10-13 北京捷升通达信息技术有限公司 Control system for intelligent driving vehicles
CN116935349A (en) * 2023-09-15 2023-10-24 华中科技大学 Lane line detection method, system, equipment and medium based on Zigzag transformation
CN116935349B (en) * 2023-09-15 2023-11-28 华中科技大学 Lane line detection method, system, equipment and medium based on Zigzag transformation
CN117292348A (en) * 2023-10-27 2023-12-26 中汽创智科技有限公司 Road element detection method, device, computer equipment and storage medium
CN118097340A (en) * 2024-04-28 2024-05-28 合肥市正茂科技有限公司 A training method, system, device and medium for lane image segmentation model
CN119919771A (en) * 2025-01-02 2025-05-02 华南农业大学 A method for target detection in open road scenes based on thermal infrared recognition
CN120014275A (en) * 2025-01-22 2025-05-16 内蒙古农业大学 A method and system for constructing a cross-scale large kernel convolution corn leaf disease segmentation model based on coordinated attention mechanism

Also Published As

Publication number Publication date
CN112287912B (en) 2021-03-30
EP4252148B1 (en) 2026-04-01
CN112287912A (en) 2021-01-29
EP4252148A4 (en) 2024-03-27
EP4252148A1 (en) 2023-10-04

Similar Documents

Publication Publication Date Title
WO2022134996A1 (en) Lane line detection method based on deep learning, and apparatus
EP4152204B1 (en) Lane line detection method, and related apparatus
CN112528878A (en) Method and device for detecting lane line, terminal device and readable storage medium
EP3806064B1 (en) Method and apparatus for detecting parking space usage condition, electronic device, and storage medium
CN110084095B (en) Lane line detection method, lane line detection device and computer storage medium
WO2022126377A1 (en) Traffic lane line detection method and apparatus, and terminal device and readable storage medium
CN115631344B (en) Target detection method based on feature self-adaptive aggregation
CN113221750A (en) Vehicle tracking method, device, equipment and storage medium
CN111914596B (en) Lane detection method, device, system and storage medium
JP7119197B2 (en) Lane attribute detection
CN111461221A (en) A multi-source sensor fusion target detection method and system for autonomous driving
CN111382658B (en) Road traffic sign detection method in natural environment based on image gray gradient consistency
CN112699711B (en) Lane line detection method and device, storage medium and electronic equipment
CN114898306B (en) Method and device for detecting target orientation and electronic equipment
CN111027539A (en) License plate character segmentation method based on spatial position information
CN109635701B (en) Lane passing attribute acquisition method, lane passing attribute acquisition device and computer readable storage medium
CN114511832B (en) Lane line analysis method and device, electronic device and storage medium
CN107689157A (en) Traffic intersection based on deep learning can passing road planing method
CN113822149A (en) Emergency lane visual detection method and system based on view angle of unmanned aerial vehicle
CN115683142A (en) A method and device for determining a region of interest
CN110135382A (en) A human detection method and device
CN118722598A (en) Parking assistance method, device, equipment, and storage medium
CN112464938B (en) License plate detection and identification method, device, equipment and storage medium
CN115376093A (en) Object prediction method and device in intelligent driving and electronic equipment
CN121074896B (en) Method, device, equipment and storage medium for training model

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application Ref document number: 21908995; Country of ref document: EP; Kind code of ref document: A1
WWE WIPO information: entry into national phase Ref document number: 202317043692; Country of ref document: IN
ENP Entry into the national phase Ref document number: 2021908995; Country of ref document: EP; Effective date: 20230628
NENP Non-entry into the national phase Ref country code: DE
WWG WIPO information: grant in national office Ref document number: 2021908995; Country of ref document: EP