
Monday, 25 February 2019

Translation platform or Tools for Indian Languages

A translation platform, often confused with translation software, is a tool that gives a translator the computer-aided resources needed to translate source-language text into target-language text.

Translation software, by contrast, produces a machine translation of the source-language text with no human intervention, and the result depends on how the software was built. This machine-translated output is most likely to be of non-publishable quality, meaning a human translator is needed to verify the machine-generated translation and edit or review it so that the translation is accurate and publishable.



Today there are many translation platforms that help human translators work faster and deliver translations of publishable quality. European languages are very well supported by these kinds of translation platforms. An ideal translation platform should offer the following features, which enable a translator to deliver high-quality translations.


  • Integration with multiple machine translation engines, so that a translator can choose the closest machine-generated output and then edit or review it for further enhancement.
  • Availability of bilingual dictionaries, synonym dictionaries, terminologies, and glossaries.
  • Translation Memory, if available.
  • Transliteration tool.
  • Concordance search to look up text across a wide variety of available corpora.
  • Named entity identification and terminology identification.
  • Powerful target-language spell checker.
  • Ability to add a user's dictionary or an existing Translation Memory.
One such tool that can help translators deliver high-quality, publishable content is Transzaar.

Translation Memory eXchange (TMX)

Translation Memory eXchange, or TMX, is an XML file format for storing translation units. It is used to exchange translation memory data between computer-aided translation and localization tools with little or no loss of critical data.

<tmx version="1.4">
  <header
    creationtool="PyTool" creationtoolversion="1.01-023"
    datatype="PlainText" segtype="sentence"
    adminlang="en-us" srclang="en"
    o-tmf="ABCTransMem"/>
  <body>
    <tu>
      <tuv xml:lang="en">
        <seg>Hello world!</seg>
      </tuv>
      <tuv xml:lang="te">
        <seg>ప్రపంచానికి నమస్కారం</seg>
      </tuv>
    </tu>
  </body>
</tmx>

This is what a sample TMX file looks like. The example here is an English-to-Telugu Translation Memory containing a single translation unit.
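
To make the structure concrete, here is a minimal Python sketch that reads the sample TMX file and pulls out each translation unit as a language-to-text mapping. It uses only the standard library's xml.etree.ElementTree. The names SAMPLE_TMX, XML_LANG, and read_translation_units are illustrative choices made for this example, not part of TMX or of any particular tool, and a real file may carry inline markup and attributes that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The sample TMX document from this post, embedded so the sketch is self-contained.
SAMPLE_TMX = """<tmx version="1.4">
  <header
    creationtool="PyTool" creationtoolversion="1.01-023"
    datatype="PlainText" segtype="sentence"
    adminlang="en-us" srclang="en"
    o-tmf="ABCTransMem"/>
  <body>
    <tu>
      <tuv xml:lang="en">
        <seg>Hello world!</seg>
      </tuv>
      <tuv xml:lang="te">
        <seg>ప్రపంచానికి నమస్కారం</seg>
      </tuv>
    </tu>
  </body>
</tmx>"""


def read_translation_units(tmx_text):
    """Return one {language: segment text} dict per <tu> element."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                unit[tuv.get(XML_LANG)] = "".join(seg.itertext()).strip()
        units.append(unit)
    return units


units = read_translation_units(SAMPLE_TMX)
print(units[0]["en"], "->", units[0]["te"])
```

Each dictionary holds one translation unit, so the English source and its Telugu translation can be looked up by language code.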

A Translation Memory is useful in the following ways:

  • It recalls a past translation that has already been done and added to the Translation Memory database.
  • Fuzzy search in the Translation Memory finds similar past translations that can aid a translator.
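
Both uses can be sketched in a few lines of Python. The example below is only an illustration: it keeps a hypothetical in-memory Translation Memory as a dictionary and scores similarity with difflib.SequenceMatcher from the standard library. The name lookup and the 0.7 threshold are assumptions made for this sketch. Real translation tools typically use their own matching algorithms and indexes.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory Translation Memory: source segment -> target segment.
TRANSLATION_MEMORY = {
    "Hello world!": "ప్రపంచానికి నమస్కారం",
}


def lookup(source, memory, threshold=0.7):
    """Return (score, stored_source, stored_target) matches, best first.

    A score of 1.0 means the stored source matches exactly (ignoring case);
    lower scores are fuzzy matches. The threshold is an arbitrary cut-off.
    """
    matches = []
    for stored_source, stored_target in memory.items():
        score = SequenceMatcher(None, source.lower(), stored_source.lower()).ratio()
        if score >= threshold:
            matches.append((score, stored_source, stored_target))
    return sorted(matches, key=lambda match: match[0], reverse=True)


# Exact recall of a past translation.
exact = lookup("Hello world!", TRANSLATION_MEMORY)

# Fuzzy recall: the new segment differs slightly from the stored one.
fuzzy = lookup("Hello, world", TRANSLATION_MEMORY)

for score, stored_source, stored_target in exact + fuzzy:
    print(f"{score:.2f}  {stored_source}  ->  {stored_target}")
```

A translator would normally review a fuzzy match before reusing it, since the stored translation belongs to a similar segment, not an identical one.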