Linguistic software comprises computer programs and data that provide analysis, processing, storage and retrieval of audio data, pictures (via OCR) and texts in natural language.

Linguistic analysis software helps readers understand the information in a document or across a set of documents. Some tools take an automatic mathematical or statistical approach, analyzing and extracting the relations that occur in a set of documents and the words they contain. The underlying idea is that documents covering the same topic show a similar distribution and frequency of relevant terms, so the meaning of a word can be inferred from the contexts in which it occurs.
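The statistical idea can be illustrated with a minimal sketch. It assumes a toy four-sentence corpus and a simple co-occurrence count within a two-token window; the function names and the corpus are hypothetical, and real tools use far larger corpora and more refined weighting.

```python
import math
from collections import Counter, defaultdict

def cooccurrence_vectors(documents, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for doc in documents:
        tokens = doc.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy corpus: "cat" and "dog" appear in similar contexts.
docs = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases mice",
    "the dog chases cars",
]
vectors = cooccurrence_vectors(docs)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_milk = cosine(vectors["cat"], vectors["milk"])
print(sim_cat_dog, sim_cat_milk)
```

In this toy corpus, "cat" and "dog" share most of their neighbouring words, so they score as more similar to each other than "cat" does to "milk".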

CAT stands for computer-assisted translation. CAT tools have significantly changed how translators work and manage translation projects. They split large multilingual documents into segments (phrases or paragraphs) and store them in a database. This database is called a translation memory, which means that previously translated material can be reused at any time. Enterprises and translators now commonly use CAT tools to speed up their work and increase their productivity. A number of computer-assisted translation programs and websites exist for various platforms and access types.
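The segment-and-reuse idea can be sketched as follows. This is an illustration only, not how any particular CAT tool works: the TranslationMemory class, the sentence-splitting rule and the 0.75 similarity threshold are all assumptions, and the fuzzy matching relies on SequenceMatcher from Python's standard difflib module.

```python
import re
from difflib import SequenceMatcher

def split_segments(text):
    """Naive segmentation: split after sentence-final punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

class TranslationMemory:
    """Hypothetical in-memory store of source -> target segment pairs."""

    def __init__(self):
        self._entries = {}

    def add(self, source, target):
        self._entries[source] = target

    def lookup(self, segment, threshold=0.75):
        """Return (target, score) for the best match at or above threshold, else None."""
        best = None
        for source, target in self._entries.items():
            score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
            if score >= threshold and (best is None or score > best[1]):
                best = (target, score)
        return best

tm = TranslationMemory()
tm.add("Press the power button.", "Appuyez sur le bouton d'alimentation.")

segments = split_segments("Press the power button. Wait ten seconds.")
exact = tm.lookup("Press the power button.")   # identical segment: score 1.0
fuzzy = tm.lookup("Press the reset button.")   # similar segment: partial match
missing = tm.lookup("Save the file.")          # nothing similar stored
```

An exact match can be reused as is, whereas a fuzzy match is only a suggestion that the translator still has to adapt.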

According to a 2006 survey undertaken by Imperial College of 874 translation professionals from 54 countries, primary tool usage was reported as follows: Trados (35%), Wordfast (17%), Déjà Vu (16%), SDL Trados 2006 (15%), SDLX (4%), STAR Transit (3%), OmegaT (3%), others (7%).


FAQ about Linguistic Software

CAT Tool Functionality

  • Spell checkers and autocorrect: automatically highlight and fix spelling and grammar mistakes.
  • In-context review: display multimedia documents with images, text box sizes and layout in real time in both the source and target languages.
  • Integrated machine translation: suggest translations for segments from a connected MT engine.
  • Adaptive machine translation: offer translations for segments from a connected MT engine as autosuggest pop-ups and learn from user input.
  • Concordance: retrieve instances of a word or expression, with their context, from a text corpus such as a translation memory database in order to check usage.
  • Electronic dictionaries: allow term search inside the tool and track usage statistics.
  • Text search tools: find phrases or terms in the text for reference.
  • Alignment: build translation memories from a source text and its translation. When a translation company adopts a CAT tool, it often uses alignment to create its first TM databases.
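As an illustration of the concordance function in the list above, here is a minimal keyword-in-context sketch. It assumes the corpus is simply a list of plain-text segments; the concordance function and the sample segments are hypothetical.

```python
import re

def concordance(term, segments, width=30):
    """Return (left context, match, right context) for each occurrence of term."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = []
    for segment in segments:
        for m in pattern.finditer(segment):
            left = segment[max(0, m.start() - width):m.start()]
            right = segment[m.end():m.end() + width]
            hits.append((left, m.group(0), right))
    return hits

# Hypothetical source segments, e.g. taken from a translation memory.
segments = [
    "Open the valve slowly.",
    "The Valve must be closed before maintenance.",
    "Check the pressure gauge.",
]
hits = concordance("valve", segments)
for left, match, right in hits:
    print(f"{left:>30}[{match}]{right}")
```

Matching is case-insensitive and limited to whole words. A real tool would typically also show the stored translation next to each hit.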

There are online and offline CAT tools available for purchase. Web-based CAT tools work in the web browser; desktop CAT tools require installation and do not depend on internet connection stability.

Basic CAT tools support office document formats such as .docx, .odt, .csv and .xlsx, plus .html and .xml files. The primary translation industry formats are XLIFF and TMX. Advanced tools also support various software formats, such as .json, .properties and Visual Studio files, as well as files from layout software such as InDesign, Corel Draw and sometimes AutoCAD.
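To give a feel for TMX, here is a minimal sketch that reads translation units from a small TMX document using Python's standard xml.etree.ElementTree module. The sample file content and the read_tmx helper are hypothetical and simplified; real TMX files carry more metadata and may contain inline markup inside segments.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, simplified TMX content with a single translation unit.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="de"><seg>Speichern Sie die Datei.</seg></tuv>
    </tu>
  </body>
</tmx>
"""

def read_tmx(tmx_text, source_lang, target_lang):
    """Return (source, target) pairs for units that contain both languages."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline elements.
                segs[tuv.get(XML_LANG, "").lower()] = "".join(seg.itertext())
        if source_lang in segs and target_lang in segs:
            pairs.append((segs[source_lang], segs[target_lang]))
    return pairs

pairs = read_tmx(SAMPLE_TMX, "en", "de")
print(pairs)
```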

CAT tools are useful when translating multimedia formats. They extract text for editing and rebuild files in the target language after the translation is finished. This ability reduces the time needed to create multilingual artwork. Instead of fishing for each individual bit of text on the page, the translator simply goes segment by segment in plain text.
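The extract-and-rebuild cycle can be sketched with a deliberately simple stand-in format. The example below assumes a small JSON resource file rather than a real layout or multimedia format, and the helper names and sample strings are hypothetical. The principle is the same: pull out the translatable text, translate it segment by segment, then write it back into the original structure.

```python
import json

def extract_segments(node, path=()):
    """Yield (path, text) for every string value in nested dicts and lists."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from extract_segments(value, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from extract_segments(value, path + (index,))

def rebuild(node, translations, path=()):
    """Return a copy of node with strings replaced by their translations."""
    if isinstance(node, str):
        return translations.get(path, node)  # untranslated text is kept as is
    if isinstance(node, dict):
        return {k: rebuild(v, translations, path + (k,)) for k, v in node.items()}
    if isinstance(node, list):
        return [rebuild(v, translations, path + (i,)) for i, v in enumerate(node)]
    return node  # numbers, booleans and null are not translatable

source = json.loads('{"title": "Settings", "buttons": ["Save", "Cancel"], "max_items": 10}')
segments = list(extract_segments(source))

# Translations would normally come from the translator; these are hard-coded.
translations = {
    ("title",): "Einstellungen",
    ("buttons", 0): "Speichern",
    ("buttons", 1): "Abbrechen",
}
target = rebuild(source, translations)
print(json.dumps(target, ensure_ascii=False))
```

Keeping the path of each segment is what lets the rebuilt file preserve the original structure and its non-text values.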

Automated Quality Assurance (AutoQA)

AutoQA tools scan bilingual texts and detect translation errors such as wrong numbers and number formats, incorrect terminology, missing tags, missing segments and erroneous formatting. AutoQA complements spell checkers and helps with editing.
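Three of these checks (missing segments, numbers and tags) can be sketched as follows. This is a rough illustration under simplifying assumptions: the qa_check function and its regular expressions are hypothetical, numbers are compared literally, and tags are assumed to look like HTML or XML tags.

```python
import re
from collections import Counter

NUMBER = re.compile(r"\d+(?:[.,]\d+)*")   # e.g. 30, 1,000 or 2.5
TAG = re.compile(r"</?[A-Za-z][^>]*>")    # e.g. <b>, </b>, <br/>

def qa_check(source, target):
    """Return a list of issue labels for one source/target segment pair."""
    if not target.strip():
        return ["missing translation"]
    issues = []
    # Counter comparison ignores order but catches missing or extra items.
    if Counter(NUMBER.findall(source)) != Counter(NUMBER.findall(target)):
        issues.append("number mismatch")
    if Counter(TAG.findall(source)) != Counter(TAG.findall(target)):
        issues.append("tag mismatch")
    return issues

ok = qa_check("Press <b>Start</b> and wait 30 seconds.",
              "Appuyez sur <b>Start</b> et attendez 30 secondes.")
bad_number = qa_check("Wait 30 seconds.", "Attendez 20 secondes.")
bad_tag = qa_check("Press <b>Start</b>.", "Appuyez sur <b>Start.")
empty = qa_check("Hello.", "   ")
```

Because numbers are compared literally, a correctly localized number format (for example 1,000.5 rendered as 1.000,5) would be reported as a mismatch. Handling such cases without false positives requires locale-aware rules.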

While popular translation memory tools feature built-in QA components, standalone software offers extended functionality and support for morphology of selected languages. Advanced functionality leads to better error detection and fewer false positives. Standalone tools may require import/export of translation files from the TMS, unless there is an integration in place.

Terminology Management

Terminology management, or glossary management, refers to technologies that centrally maintain lists of subject-specific, company-specific or other technical terms to improve the consistency and speed of translation. Such systems usually include guidance on terms that should not be translated into target languages, as well as additional reference material and images that help translators understand the material being translated.
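A glossary check of this kind can be sketched as follows. The example assumes a plain dictionary of approved term translations and a list of do-not-translate terms; the check_terminology function, the sample terms and the product name AcmeBox are all hypothetical. Matching is by simple substring, so inflected forms would be missed.

```python
def check_terminology(source, target, glossary, do_not_translate=()):
    """Report glossary terms whose approved translation is absent from target."""
    issues = []
    src, tgt = source.lower(), target.lower()
    for term, approved in glossary.items():
        if term.lower() in src and approved.lower() not in tgt:
            issues.append(f"expected '{approved}' for '{term}'")
    for term in do_not_translate:
        # Do-not-translate terms must appear unchanged (case-sensitive).
        if term in source and term not in target:
            issues.append(f"'{term}' should stay untranslated")
    return issues

glossary = {"power button": "Netzschalter"}   # hypothetical approved term pair
keep = ["AcmeBox"]                            # hypothetical product name

source = "The power button is on the AcmeBox."
good = check_terminology(source, "Der Netzschalter befindet sich an der AcmeBox.",
                         glossary, keep)
bad = check_terminology(source, "Der Einschaltknopf befindet sich an der Acme-Kiste.",
                        glossary, keep)
print(good, bad)
```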

Advanced terminology systems include integrations with authoring tools, workflows for terminology creation and validation processes, and automated term mining functions.