FREQUENCY-DRIVEN LATE FUSION-BASED
WORD DECOMPOSITION APPROACH ON
THE PHRASE-BASED STATISTICAL
MACHINE TRANSLATION SYSTEMS
Text: Mehmet Tatlıcıoğlu
Topic: Natural Language Processing / Computer Science
Communication is arguably the oldest
problem humankind has faced. As the
world's population grew, different
civilizations developed their own
languages.
Today, more than 4,000 languages are
reported to be spoken by at least a
thousand people each.
The increasing use of textual materials on
computers has dramatically raised the
importance of automated natural language
translation, since human translation alone
cannot meet the demand. Moreover, the
growing number of people communicating
across languages over the Internet has
drawn attention to automated machine
translation systems.
With the recent techniques developed in the field of Artificial Intelligence (AI), computers have started to
handle tasks that can be rather
time-consuming for humans.
To solve the communication problem
between people speaking different
languages, AI researchers have proposed
various approaches, classified under
the label of Machine Translation (MT),
in which computers generate translation
hypotheses for texts automatically. The
main schema of MT systems is shown below:
The increasing popularity of MT systems
has motivated researchers to apply
them to ease daily life. Today, MT
systems are used widely, from
multilingual web pages to mobile phones.
However, the accuracy of contemporary
MT systems is not yet at the level
users expect, and computers are not
even close to human translators in
terms of translation accuracy.
Researchers have been working to
boost the translation performance of
MT systems, but contemporary approaches
still fall far short of producing an
accurate translation that needs no
human post-editing. For this reason,
MT systems today are heavily used as
supplementary translation memories, a
sort of extensive look-up dictionary
for professional translators.
Many different machine translation
systems have been introduced in the
literature. Some researchers have
applied example-based approaches, while
others have worked on rule-based
approaches. Statistical approaches have
also been widely used. In this study,
the phrase-based statistical machine
translation paradigm is used for all
experiments.
Within the phrase-based statistical machine translation paradigm, researchers have mostly focused on
improving translation accuracy
through various techniques. Most
of the successful methods show that,
for agglutinative languages, exploiting