
The location of the Darknet-53 network can be seen in the structure diagram, and the structure of the Darknet-53 network is shown in Figure 7. The characteristic of this network is that it uses successive 3*3 and 1*1 convolution layers instead of pooling layers and fully connected layers. During computation, the network performs the tensor size transformation by changing the stride of the convolution kernel. In Figure 7 there are five RESN layers, and each RESN layer contains a convolution layer with a stride of (2,2). Each time the data passes through such a convolution layer, the edge length of the feature map is halved, that is, the area is reduced to one quarter. After the five layers, the feature map is reduced to 1/32 of the original input size: an input of 416*416 yields an output of 13*13. Taking $y_1$ as an example: because the data flows through the whole Darknet-53 network, the output $y_1$ is 13*13*n; likewise, $y_2$ is 26*26*n and $y_3$ is 52*52*n. In fact, the number of Darknet layers could be reduced to further improve detection speed, but Darknet-53 already detects at 30 frames per second, so Darknet-53 is used as the feature extraction network.

The YOLO V3 model outputs three feature maps of different scales, namely $y_1$, $y_2$ and $y_3$ in Figure 6. In the original algorithm, each grid cell predicts three boxes, and each box has five basic parameters: the location $(x, y)$, the width and height of the selected box $(w, h)$, and the recognition confidence, followed by the class probabilities. Because this paper only deals with ship images, the number of classes $C$ of this algorithm is 0. The depth $N$ of the output can be obtained by the following formula, where $B = 3$ is the number of boxes per grid cell:

$$N = B \times (4 + 1 + C) = 3 \times (5 + 0) = 15$$

So the depth of $y_1$, $y_2$ and $y_3$ in Figure 6 is 15.

In the loss function of the YOLO V3 model, six pieces of information need attention, namely the $(x, y, w, h, C)$ mentioned above and the category of the object, $p$. In the ship super-resolution task only one kind of target is identified, so in order to simplify the algorithm and improve the running speed, this algorithm uses only the first five terms in the loss function, which gives the following formula:

$$
\begin{aligned}
Loss ={}& \lambda_{coord}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right] \\
&+ \lambda_{coord}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_i}-\sqrt{\hat{w}_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat{h}_i}\right)^2\right] \\
&+ \lambda_{obj}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left(C_i-\hat{C}_i\right)^2
 + \lambda_{noobj}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{noobj}\left(C_i-\hat{C}_i\right)^2
\end{aligned}
\tag{1}
$$

where $\lambda_{coord}$ is the weight of the position error, $\lambda_{noobj}$ is the confidence weight of a candidate box without an object, and $\lambda_{obj}$ is the confidence weight of a candidate box with an object; $\mathbb{1}_{ij}^{obj}$ indicates whether the $j$-th candidate box in the $i$-th grid cell contains an object, and $\mathbb{1}_{i}^{obj}$ indicates whether the center of an object falls in grid cell $i$; $x_i$, $y_i$ are the actual coordinate values and $\hat{x}_i$, $\hat{y}_i$ the predicted coordinate values; $w_i$, $h_i$ are the width and height of the actual candidate box and $\hat{w}_i$, $\hat{h}_i$ those of the predicted candidate box; $C_i$ and $\hat{C}_i$ are the actual and predicted confidence; $p$ is the actual category and $\hat{p}$ the predicted category.
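To make the five-term loss concrete, the following is a minimal NumPy sketch of formula (1). It is not the authors' implementation: the default λ weights shown are the ones from the original YOLO paper, not values stated here, and the predictions are assumed to be already decoded into (x, y, w, h, C) form with non-negative widths and heights.

```python
import numpy as np

def yolo_ship_loss(pred, truth, obj_mask,
                   lam_coord=5.0, lam_obj=1.0, lam_noobj=0.5):
    """Simplified five-term YOLO loss, with no classification term.

    pred, truth: arrays of shape (S, S, B, 5) holding (x, y, w, h, C),
                 assumed already decoded (w, h >= 0).
    obj_mask:    array of shape (S, S, B); 1 where the j-th box of grid
                 cell i is responsible for an object, else 0.
    The lambda defaults are the original YOLO values, an assumption.
    """
    noobj_mask = 1.0 - obj_mask

    # Position error, weighted by lambda_coord.
    xy_err = np.sum(obj_mask * ((pred[..., 0] - truth[..., 0]) ** 2 +
                                (pred[..., 1] - truth[..., 1]) ** 2))

    # Width/height error on square roots, as in the original YOLO loss.
    wh_err = np.sum(obj_mask * ((np.sqrt(pred[..., 2]) - np.sqrt(truth[..., 2])) ** 2 +
                                (np.sqrt(pred[..., 3]) - np.sqrt(truth[..., 3])) ** 2))

    # Confidence error, split between boxes with and without objects.
    c_err = (pred[..., 4] - truth[..., 4]) ** 2
    conf_obj = np.sum(obj_mask * c_err)
    conf_noobj = np.sum(noobj_mask * c_err)

    return (lam_coord * (xy_err + wh_err)
            + lam_obj * conf_obj
            + lam_noobj * conf_noobj)

# Example on an assumed 13*13 grid with 3 boxes per cell (the y1 scale):
S, B = 13, 3
pred = np.random.rand(S, S, B, 5)
truth = np.random.rand(S, S, B, 5)
obj_mask = np.zeros((S, S, B))
obj_mask[6, 6, 0] = 1.0  # one hypothetical ship in the center cell
print(yolo_ship_loss(pred, truth, obj_mask))
```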
III. EXPERIMENTAL RESULTS AND ANALYSIS

In this paper, bilinear interpolation and a Gaussian low-pass filter are used as the degradation functions, and images of different quality are obtained by applying different degrees of degradation. The training data resolution is 256*256 pixels, and the resolution is reduced to 181*181, 128*128 and 86*86 respectively, so that the algorithm can accurately detect ships in many situations. Transfer learning is used to train the weights of the original model. The CPU of the computer used in this paper is an Intel Core i7-7700 with 4 cores and 8 threads and a core clock above 4 GHz, with dual-channel 16 GB DDR4-2400 memory; the graphics card is an Nvidia GTX 1070 Ti. Because only the dimension of the last layer has changed, the model in this paper is trained by transfer learning, that is, all layers before the last one are frozen and only the weights of the last layer are updated. Because of memory limitations, the batch size is set to 8. Adam is used as the optimizer to train the network; the initial learning rate alpha is 1e-4, the exponential decay rate of the first-order moment estimate is 0.9, and the exponential decay rate of the second-order moment estimate is 0.999. The loss curve is as follows:

Figure 8. Loss Value Curve

In order to test the effectiveness of training, 200 ship images with 256*256 pixel resolution are used as test data to test whether the original YOLO and the retrained YOLO can detect the ships. Because the algorithm only marks the position of the ship, this paper uses the True Positive Rate (TPR) as the detection standard to test the ship images in different scenarios. The TPR formula is as follows:

$$\mathrm{TPR} = 1 - \frac{FN}{TP + FN} = \frac{TP}{TP + FN} \tag{2}$$

The results are as follows:

TABLE 1. TEST RESULTS OF YOLO ON DIFFERENTLY DEGRADED IMAGES

Algorithm        Resolution   Detection rate
Original YOLO    256*256      100%
Original YOLO    181*181      95.5%
Original YOLO    128*128      92.5%
Original YOLO    86*86        85.5%
Retrained YOLO   256*256      99%
Retrained YOLO   181*181      97%
Retrained YOLO   128*128      97.5%
Retrained YOLO   86*86        97%
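For reference, the degraded inputs behind the four resolutions in Table 1 can be produced along the following lines. This is a minimal Pillow sketch of one plausible pipeline, in which the image is low-pass filtered with a Gaussian kernel and then downscaled by bilinear interpolation; whether the paper composes the two degradation functions this way is not stated, and the blur radius and file names are illustrative assumptions.

```python
from PIL import Image, ImageFilter

def degrade(img, size, radius=1.0):
    """Degrade a 256*256 image to the given resolution.

    A Gaussian low-pass filter removes high-frequency detail, then
    bilinear interpolation performs the resolution reduction. The
    blur radius is an assumed value, not taken from the paper.
    """
    blurred = img.filter(ImageFilter.GaussianBlur(radius=radius))
    return blurred.resize(size, resample=Image.BILINEAR)

# Produce the three degraded versions used in the experiments.
original = Image.open("ship.png")  # hypothetical 256*256 input
for size in [(181, 181), (128, 128), (86, 86)]:
    degrade(original, size).save(f"ship_{size[0]}.png")
```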