International Core Journal of Engineering 2020-26 | Page 59
conditions, and the system cost is relatively low, so it is used
by most document classification systems.
B. Emotional classification hierarchy
• Word-level sentiment analysis. The sentiment analysis
of words is an important basis for studying the
sentiment of texts. At present, most word-level
sentiment analysis research considers only one
dimension and judges the emotional polarity of a word.
To quantify that polarity, a real number in the
interval [-X, X] is usually assigned to the word. A
value less than 0 indicates a derogatory meaning, a
value greater than 0 indicates a commendatory
meaning, and the absolute value indicates the strength
of the polarity.
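The word-level scoring described above can be illustrated with a minimal sketch. The lexicon entries, the bound X = 1.0, and the function name `word_polarity` are hypothetical and chosen only for illustration; the source does not specify a lexicon or a value of X.

```python
# Hypothetical bound of the polarity interval [-X, X]; the value is an assumption.
X = 1.0

# Hypothetical toy lexicon mapping words to polarity scores in [-X, X].
LEXICON = {"good": 0.8, "excellent": 1.0, "weak": -0.4, "terrible": -0.9}


def word_polarity(word, lexicon=LEXICON):
    """Return (label, strength) for a word; words not in the lexicon are neutral."""
    score = lexicon.get(word.lower(), 0.0)
    # Clamp the score so it always stays inside [-X, X].
    score = max(-X, min(X, score))
    if score > 0:
        label = "positive"   # commendatory
    elif score < 0:
        label = "negative"   # derogatory
    else:
        label = "neutral"
    # The absolute value is read as the strength of the polarity.
    return label, abs(score)
```

For instance, `word_polarity("good")` yields a positive label with strength 0.8, while a word missing from the lexicon is treated as neutral with strength 0.0.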
C. Graph LSTM Short Text Sentiment Classification Model
Given the characteristics of short text, the process of text
feature extraction and model analysis can be described by the
following model. First, the irregular short texts are processed
by the model to obtain semi-structured knowledge; this
semi-structured knowledge is then processed by the model to
produce the desired answers.
• Sentence-level sentiment analysis. The processing
object of word-level sentiment analysis is a single word
or entity name, while the processing object of
sentence-level sentiment analysis is a sentence in a
specific context. The task is to determine the emotional
tendency of the sentence, or to identify emotional
elements such as the commenter, the object of the
comment, and the tendency and intensity of the
comment. For example, in the sentence "I think the
iPhone screen is good but the battery is not strong,"
the commenter is "I"; the evaluation objects are
"iPhone", "screen", and "battery", where "iPhone" is
an indirect comment object and "screen" and "battery"
are direct comment objects; the tendency words are
"good" and "not strong". The tendency toward the
screen is commendatory, and the tendency toward the
battery is derogatory.
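The emotional elements of the example sentence can be written down as a small data structure. This is only a sketch of one possible representation: the class names, field names, and the +1/-1 polarity encoding are assumptions, and the record is annotated by hand rather than produced by any model. The screen is described as "good" and the battery as "not strong", so they are recorded as positive and negative respectively.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Opinion:
    target: str         # direct comment object, e.g. "screen"
    tendency_word: str  # word or phrase that carries the tendency
    polarity: int       # +1 = positive, -1 = negative (encoding is an assumption)


@dataclass
class SentenceSentiment:
    commenter: str        # who expresses the opinion
    indirect_object: str  # indirect comment object
    opinions: List[Opinion] = field(default_factory=list)

    def targets_with(self, polarity: int) -> List[str]:
        """Return the direct comment objects that carry the given polarity."""
        return [o.target for o in self.opinions if o.polarity == polarity]


# Hand-annotated record for the example sentence (not model output).
example = SentenceSentiment(
    commenter="I",
    indirect_object="iPhone",
    opinions=[
        Opinion("screen", "good", +1),
        Opinion("battery", "not strong", -1),
    ],
)
```

Calling `example.targets_with(+1)` returns the positively described objects, and `example.targets_with(-1)` the negatively described ones.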
[Figure labels: Short text; Knowledge; understanding (internal representation); Answer]
Fig. 3. Knowledge extraction model
In text sentiment classification based on Graph LSTM, the
input text is first divided into eight categories: no emotion,
anger, disgust, fear, happiness, preference, sadness and
surprise. Each text is treated as an independent node that
stores key information. The nodes are related to each other,
and together they form a graph. The neighborhood of each node
contains information for inferring or predicting its sentiment
class, so learning the characteristics of adjacent nodes and
updating the model helps the classification of the next node.
Because the linear LSTM can only consider the relationship
between two adjacent nodes, it applies only to classification
problems that resemble a time series. Graph LSTM captures
complex relationships among nodes better than the general
linear LSTM. This provides a theoretical basis for Graph LSTM
to have better performance in the judicial field than linear
LSTM.
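The structural difference between a linear chain and a graph can be sketched as follows. This is not an implementation of Graph LSTM; it only contrasts the neighborhoods available in each structure and shows a plain averaging step as a stand-in for the neighborhood information a graph model could learn from. The function names, the edge list, and the feature vectors are hypothetical.

```python
def chain_neighbors(n):
    """Neighbors in a linear ordering: only the previous and the next node."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def graph_neighbors(n, edges):
    """Neighbors in an undirected graph given as a list of (u, v) pairs."""
    nbrs = {i: [] for i in range(n)}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    return nbrs


def aggregate(features, nbrs):
    """Average the feature vectors of each node's neighbors.

    A simple stand-in for a learned neighborhood summary; nodes without
    neighbors receive a zero vector.
    """
    out = {}
    for node, ns in nbrs.items():
        dim = len(features[node])
        if not ns:
            out[node] = [0.0] * dim
            continue
        out[node] = [sum(features[m][d] for m in ns) / len(ns) for d in range(dim)]
    return out


# Hypothetical example: four text nodes with 2-dimensional features.
features = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [2.0, 2.0], 3: [4.0, 2.0]}
chain = chain_neighbors(4)                             # node 0 sees only node 1
star = graph_neighbors(4, [(0, 1), (0, 2), (0, 3)])    # node 0 sees nodes 1, 2, 3
```

In the chain, node 0 can draw only on node 1, whereas in the graph it can draw on all three related nodes, which is the kind of non-sequential relationship a chain ordering cannot express.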
• Short text sentiment analysis. At present,
chapter-level sentiment analysis works relatively well
in application and is mainly focused on the sentiment
analysis of product reviews. Because a Weibo post
does not exceed 140 characters, microblogs, Chinese
ones in particular, generally consist of a few simple
sentences or phrases and a few emoticons. Sentiment
analysis techniques for microblog short texts therefore
mainly focus on the word level and the sentence level.
As an application of sentiment analysis, Pak et al.
implemented a sentiment classifier based on naive
Bayes, support vector machines, and conditional
random fields.
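To make the naive Bayes approach mentioned above concrete, the following is a minimal sketch of a multinomial naive Bayes text classifier with add-one smoothing. It is not the implementation of Pak et al.; the function names and the four training examples are hypothetical toy data.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Collect counts from docs, a list of (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the label with the highest log-posterior under naive Bayes."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data.
train = [
    (["good", "screen"], "positive"),
    (["great", "phone"], "positive"),
    (["bad", "battery"], "negative"),
    (["weak", "battery"], "negative"),
]
model = train_nb(train)
```

With this toy data, `predict_nb(model, ["good", "phone"])` selects the positive class and `predict_nb(model, ["weak", "battery"])` the negative class.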
[Figure labels: Word Embedding; Word vector; Sentence feature extraction; Sentence vector; Neural network classifier; Graph LSTM (Input node, Hidden node, Output node)]
The knowledge engineering method relies mainly on
linguistic knowledge. A large number of inference rules are
compiled manually as classification knowledge, so the
implementation is quite complicated. When this method alone
is used for classification in more complex systems, the number
of rules grows exponentially with the complexity of the
system, and a different classification system may require
modifying a large number of existing inference rules.
Therefore, such a classification system requires a lot of
manpower and material resources and is very difficult to
implement, although knowledge engineering handles logic and
knowledge better. In contrast, the implementation mechanism
of statistical methods is relatively simple, but when
classifying complex documents with strong logical
dependence, or when the classification categories are fuzzy,
the effect is not satisfactory. Comparing these two methods
comprehensively, the statistical method of document
classification is simple to implement, classifies most actual
documents faster, and achieves higher accuracy under certain
Fig. 4. Graph LSTM Short text classification model
In the training set and test set, the XML file is equivalent
to a structured vector and does not need to be vectorized by
word embedding before entering Graph LSTM learning. The
common models for short text sentiment analysis are
presented.
IV. EXPERIMENTAL RESULTS AND ANALYSIS
A. Experimental data set
In this experiment, 3000 microblogs were randomly
selected from the annotated corpus, and three machine
learning methods, SVM, LSTM, and Graph LSTM, were used
for comparison experiments. The data set processed by LSTM
is the vector of the original judicial document after word
embedding. The data set is divided into a training set and a
test set in the proportion of 70% and 30%. After training
the model with the training set, the test set is used to test the