Hernani Costa, ESR3

Location: University of Malaga, Spain

Project Title: Collection and preparation of multilingual data for multiple corpus-based approaches to translations

Project Description: 

Hernani Costa is a Marie Curie Early Stage Researcher in the Department of Translation and Interpreting at the Faculty of Philosophy and Humanities, University of Malaga, Spain. His main research interests lie in Computational Linguistics and Artificial Intelligence, especially their practical application in Translation Technologies, Natural Language Processing, Information Extraction and Information Retrieval. He is also interested in (or has worked on) a number of other topics, such as Recommender Systems, Multiagent Systems and Affective Computing.

Hernani completed his BSc and MSc in Informatics Engineering under the Bologna model at the Department of Informatics Engineering of the University of Coimbra (UC) in 2010. He is highly motivated to seek out new challenges that test his competences and skills in the field of Computer Science. That is why he enrolled in the doctoral program at the Department of Translation and Interpreting, at the Faculty of Philosophy and Humanities of the University of Malaga, Spain, in September 2013.

Alongside his PhD, he has been working on the EXPERT project (also since September 2013), where his main role is the collection and preparation of multilingual data for multiple corpus-based approaches to translations. As a result of his research, the candidate has already published several articles, which can be accessed from the publications section. These publications result from: a comprehensive literature review on methods for human and automatic compilation of multilingual data; investigation of existing techniques for the compilation of multilingual comparable and parallel corpora and of their utility in the translation workflow; investigation of existing corpus compilation software and terminology extraction and management tools; and the identification of the needs not only of translators and interpreters but also of other professionals and ordinary people.

The secondments assigned to him were an important help in accomplishing his research goals. Thanks to the EXPERT project, the candidate carried out research activities at two other institutions that are part of the EXPERT project consortium. The first secondment took place from September to December 2014 at the University of Wolverhampton, more precisely in the RIILP research group. There, the candidate had the opportunity to improve his communication skills and acquire complementary skills in core research areas such as computational linguistics. During his second secondment, at Translated (Rome, Italy, between October and December 2015), the candidate had the opportunity not only to receive local training in an industrial environment but also to work closely with Dr. Eduard Barbu (the ER1). As a leading language service provider and translation technologies developer, Translated provided an excellent environment in which to work on the infrastructure for data collection.

Research Interests: Machine Translation, NLP, Information Extraction, Information Retrieval

Home page: https://eden.dei.uc.pt/~hpcosta/

Publication list

  1. Hanna Bechara, Hernani Costa, Shiva Taslimipoor, Rohit Gupta, Constantin Orasan, Gloria Corpas Pastor and Ruslan Mitkov. (2015). MiniExperts: A SVM approach for Measuring Semantic Textual Similarity. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado.

    http://alt.qcri.org/semeval2015/cdrom/pdf/SemEval017.pdf
  2. Costa, H., Corpas Pastor, G., and Durán Muñoz, I. (2014). A Comparative User Evaluation of Terminology Management for Interpreters. In 25th Int. Conf. on Computational Linguistics (COLING'14), 4th Int. Workshop on Computational Terminology (CompuTerm'14), pages 68–76, Dublin, Ireland, August. Association for Computational Linguistics and Dublin University http://www.aclweb.org/anthology/W14-4809
  3. Costa, H., Corpas Pastor, G. and Seghiri, M. (2014). iCompileCorpora: A Web-based Application to Semi-automatically Compile Multilingual Comparable Corpora. In 36th Translating and the Computer Conference, London, UK, November. ASLING https://eden.dei.uc.pt/~hpcosta/docs/papers/201411-ASLING.pdf
  4. Costa, H., Corpas Pastor, G. and Durán Muñoz, I. (2014). Technology-assisted Interpreting. MultiLingual #143, 25(3):27–32, April/May.

    http://eden.dei.uc.pt/~hpcosta/docs/papers/201404-MultiLingual.pdf
  5. Costa, Hernani; Corpas Pastor, Gloria; Seghiri, Miriam and Mitkov, Ruslan. (2015). "iCorpora: Compiling, Managing and Exploring Multilingual Data". 7th Int. Conf. of the Iberian Association of Translation and Interpreting Studies (AIETI'15). Malaga, Spain. January, 2015. pp.74-76. ISBN:9782970073635. http://www.tradulex.com/varia/AIETI7.pdf#page=74
  6. Costa, Hernani. (2015). "Assessing Comparable Corpora through Distributional Similarity Measures". EXPERT Scientific Technological Workshop. Malaga, Spain. June, 2015. pp.23-32. ISBN:9782970073666. http://eden.dei.uc.pt/~hpcosta/docs/papers/201506-EXPERT_ESR03.pdf
  7. Costa, Hernani; Corpas Pastor, Gloria and Mitkov, Ruslan. (2015). "Measuring the Relatedness between Documents in Comparable Corpora". In Proceedings of the 11th Int. Conf. on Terminology and Artificial Intelligence (TIA'15). Granada, Spain. November, 2015.

    http://eden.dei.uc.pt/~hpcosta/docs/papers/201511-TIA.pdf
  8. Costa, Hernani and Corpas Pastor, Gloria and Durán Muñoz, Isabel. (2015). "An Interpreters' Guide to Selecting Terminology Management Tools". In Proceedings of NATO Conference on Terminology Management. Brussels, Belgium. November

    http://eden.dei.uc.pt/~hpcosta/docs/papers/201511-NATO.pdf
  9. Costa, Hernani; Corpas Pastor, Gloria; Mitkov, Ruslan and Seghiri, Miriam. (2015). "Towards a Web-based Tool to Semi-automatically Compile, Manage and Explore Comparable and Parallel Corpora". AIETI'15. Malaga, Spain. September

    http://eden.dei.uc.pt/~hpcosta/docs/papers/201601-AIETI-iCorpora.pdf
  10. Costa, Hernani. "EXPloiting Empirical appRoaches to Translation: D3.1 Framework for Data Collection". LEXYTRAD, University of Malaga. Technical Report. February, 2015. http://expert-itn.eu/sites/default/files/outputs/expert_d3.1_20150213.pdf
  11. Costa, H., Zaretskaya, A., Corpas Pastor, G., and Seghiri, M. (2016). Nine Terminology Extraction Tools: Are they useful for translators? MultiLingual #159, 27(3):14–20, April/May.

  12. Costa, H. and Corpas Pastor, G. and Durán Muñoz, I. (2016). (In Press) Assessing Terminology Management Systems for Interpreters. Trends in e-tools and resources for translators and interpreters. Brill.

  13. Marcos Zampieri, Binyam Gebrekidan Gebre, Hernani Costa and Josef Van Genabith. (2015). "Comparing Approaches to the Identification of Similar Languages". In Proceedings of the Joint Workshop on Language Technology for Closely Related Languages, Varieties and Dialects (LT4VarDial'15). 2nd Discriminating between Similar Languages Shared Task (DSL'15). Hissar, Bulgaria. September, 2015.

    http://eden.dei.uc.pt/~hpcosta/docs/papers/201509-DSL.pdf