000 02880nam a22002657a 4500
008 201210b2022 a|||f bm|| 00| 0 eng d
024 7 _ahttps://orcid.org/0000-0003-3874-805X
_2ORCID
040 _aEG-CaNU
_cEG-CaNU
041 0 _aeng
_beng
_bara
082 _a610
100 0 _aAmal Abdelsalam Mahmoud Mohamed
_93678
245 1 _aTranslation Quality Estimation for the IT Domain using Knowledge Distillation Approach
_c/Amal Abdelsalam Mahmoud Mohamed
260 _c2022
300 _a p.
_bill.
_c21 cm.
500 _3Supervisor: Mohamed El-Helw
502 _aThesis (M.A.)--Nile University, Egypt, 2022.
504 _aIncludes bibliographical references.
505 0 _aContents:
520 3 _aAbstract: Machine Translation (MT) plays a vital role in overcoming language barriers in today’s interconnected world, producing vast streams of translated text used in business and daily life. Traditional evaluation methods that rely on human-generated reference translations are no longer practical at this scale, creating an urgent need for automatic Quality Estimation (QE) systems to assess translation quality. While recent advances in Deep Learning (DL) have significantly improved MT systems, QE systems still face challenges in domain-specific and low-resource settings. Addressing these gaps is crucial to enable reliable and cost-effective QE systems for real-world use. Recognizing that building QE models requires a carefully curated model design, this thesis proposes a knowledge distillation approach to build bilingual distributed representations for training a lightweight neural QE model. The proposed design aims to generate bilingual representations that embed the deep semantic and linguistic features of the translation language pair in a single vector space. Additionally, because knowledge distillation also serves as a model compression technique, the proposed design aims to enable the adoption of the QE model in real-world applications. The model is evaluated on sentence-level QE using the Information Technology (IT) domain datasets provided by the Machine Translation Community (WMT). The model outperforms strong QE systems that are based on complex deep networks and ensemble models. It achieves the best performance on the 2016 and 2017 versions of the WMT IT-domain QE data, and the third-best reported correlation on the 2018 version. Additionally, the proposed model reduces the QE model size to one-third of that of existing QE ensemble models. With these results, this research offers a scalable and efficient solution for real-world applications in the field of low-resource QE.
546 _aText in English, abstracts in English and Arabic
650 4 _aInformaticsIFM
655 7 _2NULIB
_aDissertation, Academic
_9187
690 _aInformaticsIFM
942 _2ddc
_cTH
999 _c11021
_d11021