Computer-based Translation into Lithuanian: Alternatives and Their Linguistic Evaluation

Authors

  • Inga Petkevičiūtė
  • Bronius Tamulynas

DOI:

https://doi.org/10.5755/j01.sal.0.18.407

Keywords:

Machine translation, linguistic errors, morphology, lexis, systemic errors, polysemy

Abstract

In machine translation (MT), building a flawlessly functioning system is extremely difficult. The main problem with such systems is the errors that occur during the translation process. This research is timely because two freely accessible MT systems supporting the Lithuanian language have been introduced in Lithuania in recent years. A comprehensive and well-grounded analysis of these systems would be useful to system developers and ordinary users alike. The object of this research is the typical linguistic and systemic problems that occur during translation, which serve as indicators of translation quality. The aim of this paper is to explore the main practical translation problems that ordinary MT users commonly face. The analysis showed that Google Translator made 1066 translation errors and the VDU system 565. Most errors are common to both systems: declension errors, polysemy errors, untranslated words, and unsuitable parts of speech together accounted for about 70% of all errors. Of the 15 texts tested, the VDU system translated 13 in good quality. Of the 23 error types, 19 were "produced" by the Google system. Both systems achieved their best results on administrative texts and their worst on fiction. Based on this analysis, the following recommendations can be made: create or supplement dictionaries of phraseological units, expressions, fixed word combinations, abbreviations, jargon, and spoken language; continually update dictionaries with new words and their forms; build larger parallel and comparable corpora; and resolve proper-noun and systemic problems, among others.

Published

2011-05-27

Section

COMPUTATIONAL LINGUISTICS