Project Description

Automatic translation is an undeniable need in a globalized world where communication across several languages is increasingly relevant. Translation Memory (TM) and Machine Translation (MT) systems are the two most elaborate technologies for supporting human translation. Recent developments in Example-based and Statistical Machine Translation (EBMT and SMT), in particular, have shown the potential of data-driven approaches for producing fast, low-cost translations. A number of user studies have, however, established shortcomings in the current state of the art, including poor-quality translations for low-resource languages and interfaces that do not take user requirements and user feedback into account.

We create an Initial Training Network to train young researchers on ways to improve current data-driven translation technologies (TM, SMT and EBMT) by combining them to exploit their individual strengths and by addressing some of the main limitations of each technology.
 
Leading academic and industrial partners in all data-driven translation technologies, along with professional translators and end-users of translation technologies, will support the network's young researchers throughout the research and development cycle, providing guidance and training in core and complementary skills, and evaluating the resulting technologies.
 
A comprehensive set of training materials on core and complementary skills developed during this project will be made freely available to other researchers interested in the field. We expect that training researchers in the new skills required to develop and use technologies that can increase productivity and reduce costs in the translation sector, and that can facilitate reliable communication and content creation in multiple languages, will contribute to several aspects of Europe’s ICT development.
 
The EXPERT research projects are listed below:

| Fellow | Project Title | Host Institution |
|---|---|---|
| ESR 1 | Investigation of translators’ requirements from translation technologies | UMA |
| ESR 2 | Investigation of an ideal translation workflow for hybrid translation approaches | USAAR |
| ESR 3 | Collection and preparation of multilingual data for multiple corpus-based approaches to translation | UMA |
| ESR 4 | Use of language technology to improve matching & retrieval in translation memories | UoW |
| ESR 5 | Use of terminologies and ontologies to improve corpus-based approaches to translation | USAAR |
| ESR 6 | Learning from human feedback on the quality of the translations | USFD |
| ESR 7 | Estimating the confidence of corpus-based approaches to translation and the quality of the translated texts | USFD |
| ESR 8 | Investigation of how each individual corpus-based translation approach (TM, EBMT and SMT) can benefit from each other | DCU |
| ESR 9 | Investigation of the ideal infrastructure for computer-aided translation: pipeline with NLP tools for pre/post-processing, SMT, EBMT and TM techniques (a hybrid CAT tool) | DCU |
| ESR 10 | Exploiting hierarchical alignments for linguistically-informed SMT models to meet the hybrid approaches that aim at compositional translation | UvA |
| ESR 11 | Exploiting hierarchical alignments for a semantically-enriched SMT system that offers an extension to existing TMs to allow incremental, recursive partial match of the input using hierarchical constructions containing variables | UvA |
| ESR 12 | Investigation of methodologies to evaluate the improved SMT, EBMT and TM prototypes and new hybrid computer-aided translation technology proposed in EXPERT | UoW |
| ER 1 | Investigation of automatic methods for collection & preparation of multilingual data | Translated |
| ER 2 | Implementation and evaluation (including user aspects) of the improved SMT, EBMT and TM prototypes proposed in EXPERT | Hermes |
| ER 3 | Implementation and evaluation of the new hybrid computer-aided translation technology proposed in EXPERT | Pangeanic |