number of iterations of an LBP update is several times that of backpropagation, so the LBP method is very time-consuming.
A standard LSTM unit consists of an input vector (word embedding), a memory cell, an output vector (context representation), and several gates: an input gate and an output gate that control the flow of information into and out of the cell, and a forget gate that selectively deletes information from the cell state. In a linear LSTM, each cell contains only one forget gate because it has only one direct precursor node. In a graph LSTM, however, a cell may have many precursor nodes, so a forget gate is introduced for each precursor node.
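To make the gate structure concrete, the following is a minimal NumPy sketch of a single chain LSTM step with its one forget gate; the function and variable names are illustrative assumptions, not code from the paper.

import numpy as np

# Minimal sketch of one chain LSTM step (one precursor, hence one forget gate).
# All names here are illustrative; this is not an implementation from the paper.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """W, U, b are dicts keyed by gate name: 'i', 'f', 'o', 'c'."""
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])        # input gate
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])        # single forget gate
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])        # output gate
    c_tilde = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])  # candidate cell
    c_t = f * c_prev + i * c_tilde    # forget gate selectively keeps old state
    h_t = o * np.tanh(c_t)            # output vector (context representation)
    return h_t, c_t

The graph LSTM generalizes this step by replacing the single (h_prev, c_prev) pair with one pair per precursor node, each contributing through its own forget gate, as formalized in Eqs. (1)-(5) below.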
[Fig. 2. Graph LSTM forward and backward propagation: an input layer, forward and backward hidden layers, and an output layer over word nodes w1–w6.]
III. SHORT TEXT SENTIMENT CLASSIFICATION BASED ON GRAPH LSTM
A. Short text features

Compared with traditional texts such as articles and news, short texts have several distinctive characteristics:

Data sparsity: Due to the limited number of words, a single short text usually consists of only one or a few short sentences, which largely causes the problem of sparse data. In sentiment analysis of short texts that are only a few tens of bytes in size and contain only a few or a dozen words, it is difficult to extract sentiment words effectively, and a sentiment space model is bound to suffer from serious data sparsity.

Irregularity: Short text expressions are concise, their terminology abbreviated and colloquial, and extremely irregular; abbreviations and spelling mistakes are common, and the text is often mixed with recent popular web language, emoji, and link addresses. Although data sparsity can be alleviated to some extent through word clustering, the informality of short text language makes word clustering itself a new bottleneck in the sentiment analysis process.

Real-time: Short texts can be published and received via the Internet and various intelligent terminals anytime and anywhere, without any approval process, which makes them convenient and fast and greatly enhances their timeliness. Any news or event may quickly form a topic in short texts, and comments on the topic can become overwhelming within a short period. By analyzing the massive data generated in real time, public opinion can be guided quickly, but the sheer volume of short texts leads to an uneven distribution of effective samples and more noise, so the performance requirements for a short text sentiment analysis system are higher.

Interactivity: Short texts arise on social platforms built on “weak relationships”, and a large amount of short text information is generated by replying and forwarding. Such “interactive” short texts have a rich context, but the many omissions and references, together with a large number of irrelevant characters, make it difficult to extract sentiment elements.

No domain: Most traditional comments target only products or news in a specific field. The content of short texts, however, is not tied to any domain, and the range of comments is wide: discussions of hot news events, product reviews, expressions of one's own feelings about life, and even direct conversations with friends, celebrities, and opinion leaders are all widespread in short texts. This requires a short text sentiment analysis system to be portable.
B. Graph LSTM model

The graph LSTM is used to predict a target node from the sequence of its adjacent nodes. The prediction of the target node is trained on the basis of a standard neural network: the features of the nodes within a distance radius D of the target are summarized, so that the connections between the nodes are captured:
i_t = \sigma\Big(W_i x_t + \sum_{j \in P(t)} U_i^{m(t,j)} h_j + b_i\Big)    (1)

f_{tj} = \sigma\Big(W_f x_t + U_f^{m(t,j)} h_j + b_f\Big), \quad j \in P(t)    (2)

o_t = \sigma\Big(W_o x_t + \sum_{j \in P(t)} U_o^{m(t,j)} h_j + b_o\Big)    (3)

\tilde{c}_t = \tanh\Big(W_c x_t + \sum_{j \in P(t)} U_c^{m(t,j)} h_j + b_c\Big)    (4)

c_t = i_t \odot \tilde{c}_t + \sum_{j \in P(t)} f_{tj} \odot c_j    (5)
In these equations, x_t is the input vector of node t and h_t is its hidden state vector; W is an input weight matrix and b is a bias vector. \sigma, \tanh, and \odot denote the sigmoid function, the hyperbolic tangent function, and the Hadamard product (element-wise multiplication). The main difference from the basic recurrent unit is that in the graph LSTM each node t may have multiple precursors P(t): for each precursor j there is a separate forget gate f_{tj} and a typed weight matrix U^{m(t,j)}, where m(t,j) denotes the connection type between nodes t and j. i_t and o_t are the input and output gates of the memory cell, c_t is the cell state, and the hidden state follows as in a standard LSTM, h_t = o_t \odot \tanh(c_t).
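As a concrete reading of Eqs. (1)-(5), here is a minimal NumPy sketch of one graph LSTM node update with typed weight matrices and one forget gate per precursor; the class and parameter names (GraphLSTMCell, num_edge_types, step) are illustrative assumptions, not an implementation from the paper.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class GraphLSTMCell:
    """Illustrative graph LSTM node update implementing Eqs. (1)-(5)."""

    def __init__(self, input_dim, hidden_dim, num_edge_types, seed=0):
        rng = np.random.default_rng(seed)
        d, h, m = input_dim, hidden_dim, num_edge_types
        # One input weight matrix W and bias b per gate (i, f, o, c).
        self.W = {g: rng.normal(0.0, 0.1, (h, d)) for g in "ifoc"}
        self.b = {g: np.zeros(h) for g in "ifoc"}
        # One recurrent matrix U per gate and per connection type m(t, j).
        self.U = {g: rng.normal(0.0, 0.1, (m, h, h)) for g in "ifoc"}

    def step(self, x_t, precursors):
        """precursors: list of (h_j, c_j, m_tj) tuples, one per j in P(t)."""
        def recur(g):
            # Typed recurrent contribution summed over all precursors.
            return sum(self.U[g][m_tj] @ h_j for h_j, _, m_tj in precursors)

        i = sigmoid(self.W["i"] @ x_t + recur("i") + self.b["i"])      # Eq. (1)
        o = sigmoid(self.W["o"] @ x_t + recur("o") + self.b["o"])      # Eq. (3)
        c_hat = np.tanh(self.W["c"] @ x_t + recur("c") + self.b["c"])  # Eq. (4)
        c_t = i * c_hat                                                # Eq. (5), first term
        for h_j, c_j, m_tj in precursors:
            # Eq. (2): a separate forget gate for each precursor j.
            f_tj = sigmoid(self.W["f"] @ x_t + self.U["f"][m_tj] @ h_j + self.b["f"])
            c_t += f_tj * c_j                                          # Eq. (5), sum term
        h_t = o * np.tanh(c_t)   # hidden state, as in a standard LSTM
        return h_t, c_t

With exactly one precursor and one connection type, step reduces to the ordinary chain LSTM update.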
C. Graph LSTM vs. Chain LSTM
The main advantages of the graph LSTM are versatility and flexibility. Although the chain-structured LSTM is the most widely used, many tasks have an inherent structure in their input, and in most application scenarios the input is not a simple sequence. For example, the cases mentioned above are not simple linear structures but complex many-to-many relationships. The traditional linear LSTM can only represent the relationship between two adjacent nodes; if it is applied as a model for other input structures, the predicted classification results will be inaccurate. The tree LSTM [21] is a special case of the graph LSTM, and the graph LSTM is more capable of representing many-to-many relationships between nodes than the tree LSTM. The memory cell in a tree-structured LSTM can reflect the historical information of multiple child nodes and subsequent nodes through a recursive process, and there is no limitation between the child nodes. In the graph LSTM, the encoding of linguistic knowledge is made more flexible through the backpropagation strategy.
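To make this comparison concrete, here is a hypothetical use of the GraphLSTMCell sketch above: the input structure is encoded entirely in the precursor set P(t), so the chain and tree LSTMs appear as restricted special cases.

import numpy as np

cell = GraphLSTMCell(input_dim=4, hidden_dim=8, num_edge_types=2)
x_t = np.zeros(4)
h_j, c_j = np.zeros(8), np.zeros(8)

# Chain LSTM as a special case: P(t) contains only the preceding node.
h, c = cell.step(x_t, [(h_j, c_j, 0)])

# Tree LSTM as a special case: P(t) contains the node's child nodes.
h, c = cell.step(x_t, [(h_j, c_j, 0), (h_j, c_j, 0)])

# Graph LSTM: arbitrary adjacent nodes with typed connections (for example,
# sequence adjacency versus a dependency arc), each with its own forget gate.
h, c = cell.step(x_t, [(h_j, c_j, 0), (h_j, c_j, 1)])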