Rohit Gupta, ESR4

Location: University of Wolverhampton, UK

Project Title: Use of language technology to improve matching & retrieval in translation memories

Project Description: 

I am working on improving matching and retrieval in Translation Memory (TM). The notion of similarity captured by current TMs differs from what humans would consider similar. As a result, different segments with the same fuzzy match score may require considerably different post-editing effort and time. To improve TM matching, I have proposed a novel and efficient approach that incorporates semantic information, in the form of paraphrasing, into edit distance. The approach is based on greedy approximation and dynamic programming. We have performed both automatic and human evaluations to test it. Our results show that paraphrasing improves TM matching and retrieval, and that translation performance increases when translators use paraphrase-enhanced TMs.

I have recently proposed a new Machine Translation (MT) evaluation metric based on dense vector spaces and recurrent neural networks (RNNs), in particular Long Short-Term Memory (LSTM) networks. On the WMT-14 dataset, our new metric scores best for two out of five language pairs and, across all language pairs, ranks best overall using Spearman correlation and second best using Pearson correlation.

I am currently working on using deep learning techniques to improve matching and retrieval in TM.

Research Interests: 


Publication list

  1. Hanna Bechara, Hernani Costa, Shiva Taslimipoor, Rohit Gupta, Constantin Orasan, Gloria Corpas Pastor and Ruslan Mitkov. (2015). MiniExperts: An SVM approach for Measuring Semantic Textual Similarity. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado.
  2. Rohit Gupta, Hanna Bechara and Constantin Orasan. (2014). Intelligent Translation Memory Matching and Retrieval Metric Exploiting Linguistic Technology. In Proceedings of the thirty-sixth Conference on Translating and the Computer, London, UK.
  3. Rohit Gupta and Constantin Orasan. (2014). Incorporating Paraphrasing in Translation Memory Matching and Retrieval. In Proceedings of the European Association for Machine Translation (EAMT-2014), Dubrovnik, Croatia.
  4. Rohit Gupta, Hanna Bechara, Ismaïl El Maarouf and Constantin Orasan. (2014). UoW: NLP techniques developed at the University of Wolverhampton for Semantic Similarity and Textual Entailment. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Dublin, Ireland. pp. 785-789
  5. Rohit Gupta, Constantin Orasan, Marcos Zampieri, Mihaela Vela and Josef van Genabith. (2015). Can Translation Memories afford not to use paraphrasing? In Proceedings of EAMT-2015, Antalya, Turkey.
  6. Rohit Gupta, Constantin Orasan and Josef van Genabith. (2015). ReVal: A Simple and Effective Machine Translation Evaluation Metric Based on Recurrent Neural Networks. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP-2015), Lisbon, Portugal.
  7. Rohit Gupta, Constantin Orasan and Josef van Genabith. (2015). Machine Translation Evaluation using Recurrent Neural Networks. In Proceedings of the Tenth Workshop on Statistical Machine Translation, Lisbon, Portugal. pp. 380-384.
  8. Rohit Gupta, Constantin Orasan, Qun Liu, Ruslan Mitkov (2016). A Dynamic Programming Approach to Improving Translation Memory Matching and Retrieval using Paraphrases. In Proceedings of the 19th International Conference on Text, Speech and Dialogue (TSD), Brno, Czech Republic.

  9. Constantin Orasan and Rohit Gupta (eds.). (2015). Proceedings of the First Workshop on Natural Language Processing for Translation Memories (NLP4TM 2015). Hissar, Bulgaria, 11 September.

  10. Liling Tan, Rohit Gupta, Josef van Genabith (2015). USAAR-WLV: Hypernym Generation with Deep Neural Nets. In Proceedings of the International Workshop on Semantic Evaluation (SemEval-2015), Denver, Colorado, USA.