International Core Journal of Engineering 2020-26 | Page 89

in the stage of image processing and shallow neural networks, and the accuracy and performance are low. Anti-interference, real-time performance, universality, and robustness are the criteria for measuring detection algorithms. It is necessary to conduct in-depth research on road marking algorithms based on deep learning. At present, target detection algorithms based on deep learning are mainly divided into single-stage detection algorithms and two-stage detection algorithms. Section III gives a detailed introduction to implementing two-stage and one-stage object detection methods to detect road markings. Comparison and analysis of the experimental results are given in Section IV. The conclusion is drawn in Section V.

III. EXPERIMENTAL FRAMEWORK

A. Two-stage road marking detection

In 2014, Ross Girshick proposed the RCNN model [2], which applied the convolutional neural network to target detection for the first time and achieved great results. The structure of the algorithm also became the classic model of target detection. Later, Fast RCNN, Faster RCNN, and R-FCN appeared. These are all two-stage detection algorithms. Because R-FCN replaces the fully connected network after the ROI pooling layer with a position-sensitive convolutional network, feature sharing across the whole network is realized, which effectively resolves the contradiction between the translation invariance of object classification and the translation variance of object detection. The previous Fast RCNN and Faster RCNN took a lot of time. Therefore, R-FCN is used as one of the models for road marking detection. Because the performance of ResNet18 is similar to that of ResNet50 while its speed is higher, the 18-layer residual network is used for training.

In the model training, features are first extracted from the input image by the backbone network ResNet to obtain a feature map with a dimension of 208. A 1024-channel convolution layer is added to the feature map to reduce the feature map dimension to 1024. The feature map is sent to the RPN, where N candidate regions are generated. At the same time, the feature map after dimensionality reduction generates position-sensitive score maps with (C+1)k^2 channels through a special convolution, where C is the number of marking categories (the +1 accounts for the background) and k^2 is the size of the score map that is finally obtained for each category. The ROI pooling operation is then performed by combining the candidate regions generated by the RPN with the score maps to obtain a vector of size k^2 for each category. Here, we use average pooling to vote on the obtained vector to obtain the confidence of each category. Finally, the marking type corresponding to the highest confidence value is taken as the detection type.

During the training process, when the parameters are updated by backpropagation, the cross-entropy loss value needs to be calculated. As in Faster RCNN, general target detection is divided into two parts: positioning and classification. The R-FCN loss function is likewise divided into classification and positioning:

$$L(s, t_{x,y,w,h}) = L_{cls}(s_{c^*}) + \lambda [c^* > 0] L_{reg}(t, t^*) \qquad (1)$$

where $L_{cls}$ represents the cross-entropy loss of the classification and $L_{reg}$ represents the regression loss of the target location. We define the cross-entropy as:

In the R-FCN model training process, the RPN and R-FCN networks are trained alternately to share features. The two models are alternately trained twice to obtain the final model, giving a four-stage training method: RPN training, R-FCN training, RPN training, R-FCN training. The OHEM algorithm is also used during the forward pass of training. The obtained ROIs are sorted by their cross-entropy loss values in descending order, and only the B ROIs with the largest loss values are returned. The visualization results are shown in Fig. 1.

Fig. 1. Visual test result.

B. One-stage road marking detection

Two-stage detection networks such as RCNN have an RPN structure. The detection accuracy is improved, but the algorithm is slow and cannot meet the real-time requirements of some scenarios. In response to this problem, some scholars have proposed single-stage detection algorithms based on regression, such as YOLO and SSD, which can ensure both accuracy and speed. In 2015, Joseph Redmon proposed the end-to-end YOLO algorithm [7] with a speed of 45 frames per second, but it has shortcomings in recall rate and positioning accuracy. In 2016, Wei Liu et al. proposed the SSD algorithm [6], which addresses the positioning accuracy problem of YOLO by combining the regression idea of YOLO with the anchor mechanism of Faster RCNN to improve accuracy on the basis of YOLO. Therefore, this paper uses the SSD algorithm as one of the models for road marking detection.

The multi-level convolutional feature extraction in the SSD algorithm extracts features of small targets poorly, so the detection effect on them is not good. In actual road marking detection scenes, small targets appear in the distance. In 2018, Zuoxin Li et al. proposed FSSD [17], an improved model that addresses the insufficient feature extraction of SSD; it improves the accuracy of SSD while still maintaining a high speed, so this paper also uses FSSD as one of the models for road marking detection. Single-stage detection often suffers from unbalanced sample categories. In 2017, Tsung-Yi Lin et al. presented Focal Loss [16], which can improve the accuracy of the model. Therefore, this paper also uses an SSD modified with Focal Loss as one of the models.

1) SSD-based road marking detection

SSD uses VGG-16 as the main network architecture. In fact, based on VGG-16, the last two fully connected layers