International Core Journal of Engineering 2020-26 | Page 83
learning framework, this paper proposes a new algorithm for
synthesizing training data. By mining and exploiting the
context information of logo images, the algorithm
synthesizes images that match the real world as closely as
possible, and thus improves the performance of the logo
recognition algorithm without additional labeling cost.
Although this work is not the first attempt to synthesize
logo images [12, 16], it improves on the simple synthesis
schemes of earlier work by making full use of the interior of
the logo object, the neighborhood of the logo object, the
links between the logo object and other objects, and the
scene in which the logo object appears, so that training on
automatically synthesized logo images produces more
significant performance gains. Detailed experiments have
been conducted on the benchmark dataset FlickrLogos-32 [9],
and the best results among logo recognition methods assisted
by synthetic logo images have been obtained. The results
(mAP 58.9% for this paper vs. 54.8% for [12]) verify the
effectiveness of the proposed algorithm.
II. CONTEXT-BASED LOGO IMAGE SYNTHESIS ALGORITHM
Fig. 2 shows the overall framework for logo recognition
based on synthetic data. Its core, Generate Synthetic Image,
consists of four processes: Logo Exemplar Selection,
Background Image Selection, Logo Exemplar Transformation
and Logo Image Synthesis, which are elaborated separately
below. For model training, this paper basically follows the
sequential learning strategy of [12], which stems from the
easy-to-hard staged learning idea of Curriculum Learning
[17]: a deep model is first pre-trained on a large number of
synthetic images and then fine-tuned on the sparse real
images. In addition, this paper finds that training on a
mixture of synthetic and real images, and then fine-tuning
on real images again, achieves better training results.
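The two training schedules described above can be sketched as plain data, independent of any deep learning framework. This is a minimal illustration, not the authors' implementation: the names `Stage`, `sequential_schedule`, `mixed_schedule` and `run` are hypothetical, and the reading of the second strategy as "train on the mixed set, then fine-tune on real images" is an assumption based on the paragraph above.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Stage:
    """One training stage: a label and the samples it iterates over."""
    name: str
    samples: Tuple[str, ...]


def sequential_schedule(synthetic: Sequence[str], real: Sequence[str]) -> List[Stage]:
    # Easy-to-hard strategy of [12]: pre-train on synthetic, fine-tune on real.
    return [
        Stage("pretrain_synthetic", tuple(synthetic)),
        Stage("finetune_real", tuple(real)),
    ]


def mixed_schedule(synthetic: Sequence[str], real: Sequence[str]) -> List[Stage]:
    # Assumed reading of the variant explored in this paper:
    # train on synthetic + real together, then fine-tune on real only.
    return [
        Stage("train_mixed", tuple(synthetic) + tuple(real)),
        Stage("finetune_real", tuple(real)),
    ]


def run(schedule: List[Stage], train_step: Callable[[str, str], None]) -> None:
    # train_step is a placeholder for one optimizer update on one sample.
    for stage in schedule:
        for sample in stage.samples:
            train_step(stage.name, sample)
```

In practice `train_step` would wrap a detector update, and each stage would typically carry its own learning rate and number of iterations; those values are not given in this section, so they are left out of the sketch.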
Fig. 2. Logo recognition overall framework based on synthetic data.
A. Logo Exemplar Selection
To synthesize images for a given logo class, an exemplar
image of that class is needed. Unlike [12] and [16], this
paper selects both pixel-level logo masks and completely
transparent logo images as logo exemplars, in a 1:1 ratio.
Taking the 32 logo classes of FlickrLogos-32 as an example,
the corresponding logo exemplars are shown in Fig. 3. The
likely motivation for this choice is that combining
pixel-level logo masks with completely transparent logo
images enhances the robustness of the model to complex
backgrounds while letting it learn more about the real
context.
Fig. 3. Selected logo exemplars. Left: pixel-level. Right: transparent background.
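The 1:1 ratio between the two exemplar types can be illustrated with a short sketch. It assumes that each synthetic image uses exactly one exemplar type and that the split is enforced per batch of images to synthesize; the function name `assign_exemplar_kinds` and the labels `"pixel_mask"` and `"transparent"` are hypothetical and do not come from the paper.

```python
import random
from typing import List


def assign_exemplar_kinds(n_images: int, seed: int = 0) -> List[str]:
    """Assign an exemplar type to each of n_images synthetic images.

    The split is as close to 1:1 as possible; for odd n_images the extra
    image gets the "transparent" type (an arbitrary choice).
    """
    half = n_images // 2
    kinds = ["pixel_mask"] * half + ["transparent"] * (n_images - half)
    # Shuffle with a fixed seed so the assignment is reproducible.
    random.Random(seed).shuffle(kinds)
    return kinds
```

Drawing the type independently with probability 0.5 per image would give the same ratio only in expectation; the fixed split above keeps it exact for any even batch size.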