International Core Journal of Engineering 2020-26 | Page 41

Fig. 1. Flowchart of the system model. The system model consists of four parts: data source, data preparation, CNN training, and web application.

Based on user needs and application scenarios, we design and implement the marking system as a web-based system. The development environment is LAMP (Apache + MySQL + PHP under Linux), and the server uses the MVC framework, which separates input, processing, and output [11]. In our system, all English handwriting images are pre-classified by our algorithm. Users can then revise the classification and mark the English handwriting papers on the platform instead of marking them manually, which improves efficiency and convenience.

Fig. 2. Preprocessed image at a resolution of 224 × 224.

III. CNNs FOR ENGLISH HANDWRITING EVALUATION

In this section, we propose a CNN for English handwriting quality evaluation. Given an English handwriting image, we first perform image preprocessing and then feed the preprocessed image to the CNN to predict the class label.

A. Preprocessing

The preprocessing of the provided English calligraphy images consists of three parts: English handwriting region extraction, rotation correction, and size normalization. Fig. 2 shows an image after preprocessing.

• English handwriting region extraction: Fig. 3 shows the flow of English handwriting region extraction. First, convert the original RGB image to a grayscale image and blur the grayscale image. Second, use the Canny operator [12] to detect edges and extract contours from the image edges. Then, find the minimum enclosing rectangle for each contour. Finally, find the largest enclosing rectangle and ensure that it is the correct English handwriting region.

• Rotation correction: The original English handwriting images may have undergone a two-dimensional rotation, so their angle needs to be corrected. Our solution is to obtain the two-dimensional rotation transformation matrix and then apply an affine transformation to correct the angle.

• Size normalization: The resolution of the English handwriting region is higher than 1500 × 1500. To reduce computational complexity and maintain consistency, we normalize the image size to 224 × 224.

Fig. 3. Flowchart of English handwriting region extraction. First, convert the original RGB image to a grayscale image and blur the grayscale image. Second, detect edges and extract contours. Then, find the minimum enclosing rectangle for each contour. Finally, the largest enclosing rectangle is taken as the English handwriting region.

B. Network Architecture

Fig. 4 shows the architecture of our network, which follows the classical ResNet-18 [13] design. The network can be summarized as 224 × 224 - 112 × 112 × 64 - 56 × 56 × 64 - 28 × 28 × 128 - 14 × 14 × 256 - 7 × 7 × 512 - 1 × 1 × 512 - 2.

C. Training

We first adopt softmax with loss as the loss function and train with Stochastic Gradient Descent (SGD). However, the risk of misclassification can differ between categories, and our positive and negative samples are unbalanced, so we also try weighted softmax with loss as the loss function and compare the two loss functions. The output of the softmax layer is defined by the following equation:

$$P(y = c_k) = \frac{e^{O_k}}{\sum_{i=1}^{K} e^{O_i}}$$

where $P(y = c_k)$ is the probability of the $k$-th class, $O_i$ is the output of the $i$-th neuron in the softmax layer, and $K$ is the number of classes. The class with the maximum probability is taken as the predicted class. The weighted softmax loss can be described by the following equation.