EESTEC Magazine Vol 33 2013/2 | Page 51

FREQUENCY-DRIVEN LATE FUSION-BASED WORD DECOMPOSITION APPROACH ON THE PHRASE-BASED STATISTICAL MACHINE TRANSLATION SYSTEMS
Text: Mehmet Tatlıcıoğlu
Topic: Natural Language Processing / Computer Science
Job: SDE
Communication is truly the oldest problem that humankind has ever had. As the world's population has grown, various languages spoken by different civilizations have emerged. Today, it is reported that there are more than 4,000 languages spoken by at least a thousand people. The increasing use of textual materials on computers has dramatically raised the importance of automated natural language translation, since human-aided translation cannot meet the demand at the desired level. Moreover, the increasing number of people communicating in different languages over the Internet has drawn attention to automated machine translation systems. With the recent techniques developed in the scope of Artificial Intelligence (AI), computers have started to handle tasks that can be rather time-consuming for humans.
To solve the communication problem between people speaking different languages, AI has proposed various approaches, which are classified under the label of Machine Translation (MT), implying that translation hypotheses for texts can be generated automatically by computers. The main schema of MT systems is shown below:
[Figure: main schema of MT systems]
The increasing popularity of MT systems has motivated researchers to use them to ease daily life. Today, MT systems are widely used, from multilingual web pages to mobile phones. However, the accuracy of contemporary MT systems is not at the level humans expect, and computers are not even close to human translators in terms of translation accuracy. Researchers have been working in this area to boost the translation performance of MT systems.
Contemporary MT approaches are still far from producing accurate translations that need no human post-editing. Today, MT systems are heavily used as supplementary translation memories, a sort of extensive look-up dictionary for professional translators. Many different machine translation systems have been introduced in the literature. Some researchers have applied example-based approaches, while others have worked on rule-based approaches. Statistical approaches have also been widely used. In this study, the statistical phrase-based machine translation paradigm is used for all experiments. Within this paradigm, researchers have mostly focused on increasing translation accuracy by applying various approaches. Most of the successful methods show that for agglutinative languages, exploiting