Fig. 1. Flowchart of the system model. The system model consists of four
parts: data source, data preparation, CNN training, and web application.
Based on the user needs and application scenarios, we
design and implement the marking system as a web-based
system. The development environment is based on LAMP
(Apache + MySQL + PHP under Linux), and the server uses
the MVC framework, which separates the input, the processing
procedure, and the output [11]. In our system, all the English
handwriting images are pre-classified by our algorithm; users
can then change the classification and mark the
English handwriting papers on the platform rather than by
traditional manual methods, which improves efficiency and
Fig. 2. Preprocessed image of resolution 224 × 224.
In this section, we propose a CNN for English
handwriting quality evaluation. Given an English
handwriting image, we first perform image preprocessing and
then feed the preprocessed image to the CNN to predict the
class label.
Fig. 3. Flowchart of English handwriting region extraction. First, the
original RGB image is converted to grayscale and blurred. Second, edges are
detected and contours are extracted. Then, the minimum enclosing rectangle
is found for each contour. Finally, the largest enclosing rectangle is taken
as the English handwriting region.
A. Preprocessing
The preprocessing of the provided English calligraphy
images consists of three parts: English handwriting region
extraction, rotation correction, and size normalization. Fig. 2
shows the image after the preprocessing.
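As a minimal sketch of the geometry behind region extraction and rotation correction, the following Python code selects the largest enclosing rectangle among a set of contours and builds a two-dimensional rotation matrix for an affine correction. It is an illustration under stated assumptions, not our implementation: the contours are assumed to be already available as lists of (x, y) points, the enclosing rectangles are assumed to be axis-aligned, and all function names are hypothetical.

```python
import math


def bounding_rect(contour):
    """Axis-aligned enclosing rectangle (x, y, w, h) of a list of (x, y) points."""
    xs = [p[0] for p in contour]
    ys = [p[1] for p in contour]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


def largest_rect(contours):
    """Enclosing rectangle with the largest area over all contours."""
    rects = [bounding_rect(c) for c in contours]
    return max(rects, key=lambda r: r[2] * r[3])


def rotation_matrix(center, angle_deg):
    """2x3 affine matrix rotating points by angle_deg about center.

    Derived from p' = R (p - c) + c, i.e. p' = R p + (c - R c).
    """
    cx, cy = center
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [
        [cos_a, -sin_a, cx - cos_a * cx + sin_a * cy],
        [sin_a, cos_a, cy - sin_a * cx - cos_a * cy],
    ]


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )
```

Correcting an estimated rotation of θ degrees then amounts to applying `rotation_matrix(center, -θ)` to the image coordinates. How θ is estimated is not covered by this sketch.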
B. Network Architecture
Fig. 4 shows the architecture of our network, which follows
the classical ResNet-18 architecture [13]. The sequence of
layer output sizes can be summarized as 224 × 224 -
112 × 112 × 64 - 56 × 56 × 64 - 28 × 28 × 128 - 14 × 14 ×
256 - 7 × 7 × 512 - 1 × 1 × 512 - 2.
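The size sequence above can be checked with the usual convolution output-size formula. The sketch below is a hedged illustration: the kernel sizes, strides, and paddings are those of the standard ResNet-18 and are assumptions here, since the text only gives the resulting sizes. The function names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet18_like_shapes(input_size=224, num_classes=2):
    """Trace layer output shapes for a ResNet-18-like network.

    Layer hyperparameters follow the standard ResNet-18 (assumed).
    """
    shapes = [(input_size, input_size)]
    s = conv_out(input_size, kernel=7, stride=2, padding=3)  # first convolution
    shapes.append((s, s, 64))
    s = conv_out(s, kernel=3, stride=2, padding=1)  # max pooling
    shapes.append((s, s, 64))
    for channels in (128, 256, 512):
        # Each later stage halves the resolution with a stride-2 3x3 convolution.
        s = conv_out(s, kernel=3, stride=2, padding=1)
        shapes.append((s, s, channels))
    shapes.append((1, 1, 512))  # global average pooling
    shapes.append((num_classes,))  # fully connected output
    return shapes
```

Calling `resnet18_like_shapes()` reproduces the spatial sizes 112, 56, 28, 14, and 7 listed above.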
• English handwriting region extraction: Fig. 3 shows
the flow of English handwriting region extraction.
First, we convert the original RGB image to grayscale
and blur the grayscale image. Second, we use the
Canny operator [12] to detect edges and extract
contours from the image edges. Then, we find the minimum
enclosing rectangle for each contour. Finally, we select the
largest enclosing rectangle and verify that it is the
correct English handwriting region.
C. Training
We first adopt the softmax loss as the loss function and
train with Stochastic Gradient Descent (SGD). However, the
risk of misclassification sometimes differs between
categories, and our positive and negative samples are
unbalanced, so we also try a weighted softmax loss
and compare the two loss functions. The output
of the softmax layer is defined by the following equation.
• Rotation correction: The original English handwriting
images may have undergone a two-dimensional
rotation, so we need to correct their angle.
Our solution is to obtain the two-dimensional
rotation transformation matrix and then apply an affine
transformation to correct the angle.
• Size normalization: The resolution of the English
handwriting region is higher than 1500 × 1500. To
reduce the computational complexity and
maintain consistency, we normalize the image
size to 224 × 224.
$$P(y = c_k) = \frac{e^{O_k}}{\sum_{i=1}^{K} e^{O_i}}$$
where $P(y = c_k)$ is the probability of the $k$-th class, $O_i$ is the
output of the $i$-th neuron in the softmax layer, and $K$ is the number
of classes. The class with the maximum probability is taken
as the predicted class.
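The softmax output and the prediction rule can be illustrated with a short Python sketch. The function names are hypothetical, and subtracting the maximum output before exponentiation is a common numerical-stability step that is not part of the definition above. It leaves the resulting probabilities unchanged.

```python
import math


def softmax(outputs):
    """Softmax probabilities for a list of neuron outputs O_1..O_K."""
    m = max(outputs)  # shift for numerical stability; the result is unchanged
    exps = [math.exp(o - m) for o in outputs]
    total = sum(exps)
    return [e / total for e in exps]


def predict(outputs):
    """Index of the class with the maximum softmax probability."""
    probs = softmax(outputs)
    return max(range(len(probs)), key=probs.__getitem__)
```

For a two-class network as described here, `predict` returns 0 or 1 depending on which output neuron has the larger value.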
The weighted softmax loss can be described in the
following equation.