Carla Parra Escartín, ER2
Location: Hermes, Spain
Project Title: Implementation and evaluation (including user aspects) of the improved SMT, EBMT and TM prototypes proposed in EXPERT (WP6)
I joined Hermes and the EXPERT ITN in April 2015 as an Experienced Researcher (ER). Since then I have been engaged in several experiments and have focused my research mainly on MT Evaluation and improving TM fuzzy match retrieval.
As far as MT Evaluation is concerned, I have worked jointly with my colleague at Hermes, Manuel Arcedillo, on testing traditional MT Evaluation metrics, comparing them with the fuzzy-match score used in the industry to analyze translation memory leverage, and applying the benefits of a convergence of MT and TM evaluation systems in hybrid scenarios. We have also analyzed how evaluation metrics correlate with productivity gains in real translation settings. A manual evaluation of the data gathered in our experiments using both fuzzy matches and MT output is being carried out to further assess the reliability of automatic MT evaluation metrics.
I have explored TM fuzzy match retrieval in different ways. I have tested a shallow, language-independent method and I am currently refining it with the aim of providing a tool that, given a new text to be translated, automatically generates new fuzzy matches from the Translation Memory provided in the project.
ESR4, Rohit Gupta, did a one-month secondment at Hermes in June-July 2015. As a result of this secondment, Gupta’s paraphrasing approach for TM fuzzy match retrieval (Gupta and Orasan 2014) has been expanded to comply with real needs and requirements in the translation industry. The tool is being tested with real data to assess whether integrating it in real translation workflows improves not only TM fuzzy match retrieval but also the productivity of translators.
ESR8, Liangyou Li, is currently doing a 3-month secondment at Hermes. One of the aims of this secondment is to test his approach to integrating Translation Memory features into Statistical Machine Translation (Li et al. 2014).
Ongoing work on Machine Translation Quality Estimation and its integration in real production workflows is being done together with ESR12, Hanna Bechara, who did a 2-month secondment at Hermes in July-August 2015. The aim is to test whether QuEst (Specia et al. 2013) can be used in a real translation environment and to develop a user-friendly tool that enhances MT Post-Editing throughput and provides accurate cost estimations by adding Quality Estimation information at the sentence and document levels.
I am also working on experiments to test automatic post-editing of MT output and to improve current state-of-the-art systems for handling inline tags in the MT output.
To learn more about my background, previous experience and publications, see also my personal website.
Research Interests: translation, machine translation, evaluation, quality estimation, post-editing
Eduard Barbu, Carla Parra Escartín, Luisa Bentivogli, Matteo Negri, Marco Turchi, Constantin Orasan, Marcello Federico (2016). The First Automatic Translation Memory Cleaning Shared Task. Machine Translation.
Eduard Barbu, Carla Parra Escartín, Luisa Bentivogli, Matteo Negri, Marco Turchi, Marcello Federico, Luca Mastrostefano, Constantin Orasan. 1st Shared Task on Automatic Translation Memory Cleaning: Preparation and Lessons Learned. In Proceedings of the 2nd Workshop on Natural Language Processing for Translation Memories (NLP4TM 2016), pages 1-6, 28 May 2016, Portorož, Slovenia
Bechara, H., Parra Escartín, C., Orasan, C. and Specia, L. (2016) Semantic Textual Similarity for Quality Estimation. In Proceedings of the 19th Annual Conference of the European Association for Machine Translation, EAMT. Riga, Latvia. May 29-31.
Li, Liangyou; Parra Escartín, Carla; Way, Andy; and Liu, Qun (2016) "Combining Translation Memories and Statistical Machine Translation Using Sparse Features", to appear in Machine Translation Journal. Special Issue: NLP for Translation Memories.
Liangyou Li, Carla Parra Escartín and Qun Liu (2016) Combining Translation Memories and Syntax-Based SMT: Experiments with real industrial data, In Proceedings of EAMT 2016, 30 May – 1 June, 2016, Riga, Latvia
Losnegaard, Gyri; Sangati, Federico; Parra Escartín, Carla; Savary, Agata; Bargmann, Sascha and Monti, Johanna. (2016): "PARSEME Survey on MWE Resources", in the Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC'16), 23-28 May 2016, Portorož, Slovenia
Orăsan, Constantin; Parra Escartín, Carla; Barbu, Eduard; and Federico, Marcello (2016) “Proceedings of the 2nd Workshop on Natural Language Processing for Translation Memories (NLP4TM 2016)” at LREC 2016, 28 May 2016, Portorož, Slovenia
Parra Escartín, Carla; Martínez Alonso, Héctor (2015) Assessing WordNet for bilingual compound dictionary extraction. In Proceedings of the Workshop on Multi-word Units in Machine Translation and Translation Technology (MUMTTT2015). Málaga, 1-2 July
Parra Escartín, Carla; Arcedillo, Manuel (2015). A fuzzier approach to machine translation evaluation: A pilot study on post-editing productivity and automated metrics in commercial settings. In Proceedings of the ACL 2015 Fourth Workshop on Hybrid Approaches to Translation (HyTra), pp. 40–45, Beijing, China, July 31, 2015. Association for Computational Linguistics. https://aclweb.org/anthology/W/W15/W15-4107.pdf
Parra Escartín, Carla (2015). Creation of new TM segments: Fulfilling translators' wishes. In Proceedings of the RANLP 2015 Natural Language Processing for Translation Memories (NLP4TM) Workshop. Hissar, Bulgaria, 11 September. http://rgcl.wlv.ac.uk/events/NLP4TM/7_Paper.pdf
Parra Escartín, Carla; Arcedillo, Manuel. 2015. Machine translation evaluation made fuzzier: A study on post-editing productivity and evaluation metrics in commercial settings. In Proceedings of the MT Summit XV, Research Track, pp. 131-144. Miami (Florida), 30 October - 3 November. http://amtaweb.org/wp-content/uploads/2015/10/MTSummitXV_ResearchTrack.pdf#page=138
Parra Escartín, Carla; Arcedillo, Manuel. 2015. Living on the edge: productivity gain thresholds in machine translation evaluation metrics. Proceedings of the Fourth Workshop on Post-editing Technology and Practice, pp. 46-56. Miami (Florida), 4 November. http://amtaweb.org/wp-content/uploads/2015/10/MTSummitXV_WPTP4Proceedings.pdf#page=50
Parra Escartín, Carla; Nevado Llopis, Almudena and Sánchez Martínez, Eoghan (2016) "Spanish Multiword Expressions: looking for a taxonomy", to appear in Sailer, Manfred and Markantonatou, Stella (eds.): Multiword Expressions: Insights from a Multilingual Perspective. Language Science Press
Savary, Agata; Sailer, Manfred; Parmentier, Yannick; Rosner, Michael; Rosén, Victoria; Przepiórkowski, Adam; Krstev, Cvetana; Vincze, Veronika; Wójtowicz, Beata; Smørdal Losnegaard, Gyri; Parra Escartín, Carla; Waszczuk, Jakub; Constant, Matthieu; Osenova, Petya and Sangati, Federico. (2015) PARSEME – PARSing and Multiword Expressions within a European multilingual network. Proceedings of the 7th Language & Technology Conference (LTC 2015), 27-29 November 2015, Poznań, Poland.