Translation Quality Estimation for the IT Domain using Knowledge Distillation Approach
/Amal Abdelsalam Mahmoud Mohamed
- 2022
- p. ill. 21 cm.
Supervisor: Mohamed El-Helw
Thesis (M.A.)--Nile University, Egypt, 2022.
"Includes bibliographical references"
Abstract: Machine Translation (MT) plays a vital role in overcoming language barriers in today’s interconnected world, producing vast streams of translated text used in business and daily life. Traditional evaluation methods that rely on human-generated reference translations are no longer practical at this scale, creating an urgent need for automatic Quality Estimation (QE) systems to assess translation quality. While recent advances in Deep Learning (DL) have significantly improved MT systems, QE systems still face challenges in domain-specific and low-resource settings. Addressing these gaps is crucial to enable reliable and cost-effective QE systems for real-world use. Recognizing that building QE models requires a carefully curated model design, this thesis proposes a knowledge distillation approach to build bilingual distributed representations for training a lightweight neural QE model. The proposed design aims to generate bilingual representations that embed the deep semantic and linguistic features of the translation language pair in a single vector space. Additionally, because knowledge distillation also serves as a model compression technique, the design aims to make the QE model practical to adopt in real-world applications. The model is evaluated on sentence-level QE using the Information Technology (IT) domain datasets provided by the Conference on Machine Translation (WMT). It outperforms strong QE systems that are based on complex deep networks and ensemble models. It achieves the best performance on the 2016 and 2017 versions of the WMT IT-domain QE data and the third-best reported correlation on the 2018 version. Additionally, the proposed model reduces the QE model size to one-third of that of existing QE ensemble models. With these results, this research offers a scalable and efficient solution for real-world applications of low-resource QE.