Semi formally verified enhanced code translators of mobile applications programming languages
Amira Tarek Mahmoud Ibrahim
Semi formally verified enhanced code translators of mobile applications programming languages /Amira Tarek Mahmoud Ibrahim - 2025 - 152 p. ill. 21 cm.
Supervisor:
Dr Walaa Medhat
Prof Ahmed Hasan, Prof Hala Zayed, Dr Nahla Elaraby
Thesis (PH.D.)—Nile University, Egypt, 2025 .
"Includes bibliographical references"
Contents:
Contents
Page
Copyright . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . III
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . IV
Acknowledgment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . V
Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . VI
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . VII
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . XII
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . XIV
List of Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . XVI
List of Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . XVII
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Dissertation Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Mobile Applications Development in Industry . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Literature Review: Mobile Applications Development Industry-Related studies . . . 6
2.3 Questionnaire Design and Methodology . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3.1 Questionnaire Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3.2 Technical Question Flow and Content . . . . . . . . . . . . . . . . . . . . . 9
2.3.3 Pilot Testing and Questionnaire Distribution Strategy . . . . . . . . . . . . 10
2.3.4 Analysis Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4 Questionnaire Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4.1 Demographic Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4.2 Technical Questions Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5 Key Findings and Implications of Mobile Applications Development in Industry . . 18
VIII
3. LLMs and Transcompilers for Mobile Application Code Translation . . . . . . . . . . . . 20
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2 Literature Review on Code Translation . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2.1 Translation of Mobile Application Code . . . . . . . . . . . . . . . . . . . . 20
3.2.2 LLMs for Code Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3 Case Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.3.1 Web Services Code Translation from Swift to Java . . . . . . . . . . . . . . 26
3.3.2 Translating from JavaScript to Swift and Java . . . . . . . . . . . . . . . . . 27
3.3.3 GPT4o Evaluation to Translate from Swift to Java . . . . . . . . . . . . . . 28
3.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.4.1 Results of Web Services Code Translation . . . . . . . . . . . . . . . . . . . 29
3.4.2 Results of Translation from JavaScript to Java and Swift . . . . . . . . . . . 29
3.4.3 Results of Evaluating GPT4o . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4. Formal Verification Approaches for Code Conversion . . . . . . . . . . . . . . . . . . . . 34
4.1 Background and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2 Literature of Previous Reviews of Formal Verification . . . . . . . . . . . . . . . . . 36
4.3 Research Corpus Collection Methodology . . . . . . . . . . . . . . . . . . . . . . . 37
4.3.1 Verification Approaches Categorization Methodology . . . . . . . . . . . . . 39
4.4 Verification Techniques for Code Optimization . . . . . . . . . . . . . . . . . . . . . 40
4.4.1 Symbolic Execution for Equivalence Checking . . . . . . . . . . . . . . . . . 40
4.4.2 Equivalence Checking Using State-Machine Representations . . . . . . . . . 44
4.4.3 Equivalence Checking via Assertions . . . . . . . . . . . . . . . . . . . . . . 49
4.4.4 Proof-Based Verification of Compiler Optimizations . . . . . . . . . . . . . . 50
4.5 Formal Verification of Code-to-Code Translation . . . . . . . . . . . . . . . . . . . 52
4.5.1 Symbolic Execution for Cross-Language Equivalence Checking . . . . . . . . 53
4.5.2 Assertions Based on K-Safety Properties for Translation Verification . . . . 53
4.5.3 Proof-Based Verification of Code Translators . . . . . . . . . . . . . . . . . 53
4.6 Formal Verification of Model-Based Code Generation . . . . . . . . . . . . . . . . . 55
4.6.1 Model Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.6.2 Assertions and Formal Proof Obligations for Equivalence Verification . . . . 56
4.7 Analysis of the Trends in Formal Verification of Code Conversion . . . . . . . . . . 57
4.7.1 State-Machine and Graph-Based Equivalence Checking . . . . . . . . . . . . 61
4.7.2 Proof-Based Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.7.3 Symbolic Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.7.4 Assertion-Based Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.7.5 Model Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.8 Analysis of Formal Verification Methods for Cross-Language Code Translation . . . 63
4.8.1 Proof-Based Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.8.2 Symbolic Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.8.3 Assertions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.8.4 State-Machines and Model-Checking . . . . . . . . . . . . . . . . . . . . . . 64
IX
5. A Model-Checking-Based Verification Approach for Transcompilers . . . . . . . . . . . . 66
5.1 Literature Review on Translators Verification . . . . . . . . . . . . . . . . . . . . . 67
5.2 The TC-Verifier Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.2.1 Transcompiler-Based Code Translators . . . . . . . . . . . . . . . . . . . . . 68
5.2.2 Tool Selection for Model Checking . . . . . . . . . . . . . . . . . . . . . . . 69
5.2.3 Key Components and Concepts in Uppaal . . . . . . . . . . . . . . . . . . . 70
5.2.4 The TC-Verifier Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2.5 Evaluation Metric: Verification Success Rate . . . . . . . . . . . . . . . . . . 73
5.2.6 Applying the Verifier Model to a Translator Case Study . . . . . . . . . . . 74
5.3 Experimental Setup: Statements Modeled . . . . . . . . . . . . . . . . . . . . . . . 77
5.3.1 Variable Declaration Statement . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.3.2 Function Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.3.3 Class Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3.4 While Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.4 Results of Applying TC-Verifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.4.1 Variable Declaration Verification . . . . . . . . . . . . . . . . . . . . . . . . 84
5.4.2 Function Declaration Verification . . . . . . . . . . . . . . . . . . . . . . . . 85
5.4.3 Class Declaration Verification . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.4.4 While Statement Verification . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.4.5 Comparison with BLEU Score . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6. Enhancing LLMs Using TransCompiler-Assisted Prompting . . . . . . . . . . . . . . . . 92
6.1 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.2 Methodology of TC-Assisted Prompting . . . . . . . . . . . . . . . . . . . . . . . . 94
6.3 Case Studies: Selected Models and Experimental Design . . . . . . . . . . . . . . . 95
6.3.1 Models Selection Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.3.2 Two-Step Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.3.3 Codestral Fine-tuning Experiment . . . . . . . . . . . . . . . . . . . . . . . 97
6.4 Datasets Used for Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.4.1 New dataset TGAlpaca . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.4.2 HumanEvalX Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.4.3 XLCoST Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.4.4 Cyclomatic Complexity of Code CC . . . . . . . . . . . . . . . . . . . . . . 100
6.5 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.5.1 Compilation Success Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.5.2 Assertion-Based Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.5.3 pass@1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.5.4 CodeBLEU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.6 Results of LLMs enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.6.1 Results on TGAlpaca dataset . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.6.2 Comparison Results with HumanEvalX Dataset . . . . . . . . . . . . . . . . 106
6.6.3 Comparison Results with XLCoST Dataset . . . . . . . . . . . . . . . . . . 108
6.7 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.7.1 TC-Assisted Prompting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.7.2 No More Fine-Tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.7.3 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.7.4 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
X
7. Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Appendices:
A. Questionnaire Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
B. Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Abstract:
Code translation is a critical task in software engineering, particularly for cross-platform development. Building mobile applications across platforms often requires translation between native
languages such as Java, Swift, and Kotlin. While modern frameworks like React Native and Flutter
have gained traction, translation between cross-platform languages (e.g., JavaScript and Dart) and
native languages remains a practical necessity for developers.
Previous research has proposed transcompiler-based translators for mobile application languages.
However, developers rarely adopt these tools due to their incompleteness and the unquantified nature of their coverage, often lacking rigorous validation or verification. At the same time, large
language models (LLMs) have transformed code generation tasks, including code completion, generation, summarization, and translation, by leveraging large-scale training datasets. Yet, datasets
specifically focused on mobile application languages remain scarce and underexplored.
In this dissertation, we propose transcompiler-assisted prompting of LLMs for code translation.
We begin by introducing a methodology to formally verify both the capabilities and limitations of
existing compiler-based translators. This verification identifies which translation rules are correctly
covered and which are not. The verification process revealed 50.7% coverage for the transcompiler
under study. The verified partial translations produced by the transcompiler, guaranteed to be
correct for the parts they cover, are then embedded into the prompts provided to LLMs. These
verified fragments guide the LLMs toward reduced hallucination and improved syntactic and semantic
correctness in their output.
We apply this prompt-aided methodology to three different LLMs and evaluate their performance
using our newly created dataset, TGAlpaca, including 902 different code snippets, and two benchmark
datasets, HumanEval-X and XlCost. Our results show improvements ranging from 1% to 40% across
all three models for TGAlpaca. Furthermore, benchmark comparisons with the HumanEvalX data
set achieved a pass@1 score comparable to the CodeGeex fine-tuned model and a 21-27 pass@1 score
improvement over other baseline models. This demonstrates that our approach achieves competitive
or superior translation quality while saving computational resources otherwise required for training
or fine-tuning.
Keywords
Code translation, Formal verification, Large language models, Model checking, Mobile Application development
Text in English, abstracts in English and Arabic
0000-0002-1702-6544 ORCID
Software Engineering
Dissertation, Academic
627
Semi formally verified enhanced code translators of mobile applications programming languages /Amira Tarek Mahmoud Ibrahim - 2025 - 152 p. ill. 21 cm.
Supervisor:
Dr Walaa Medhat
Prof Ahmed Hasan, Prof Hala Zayed, Dr Nahla Elaraby
Thesis (PH.D.)—Nile University, Egypt, 2025 .
"Includes bibliographical references"
Contents:
Contents
Page
Copyright . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . III
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . IV
Acknowledgment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . V
Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . VI
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . VII
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . XII
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . XIV
List of Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . XVI
List of Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . XVII
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Dissertation Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Mobile Applications Development in Industry . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Literature Review: Mobile Applications Development Industry-Related studies . . . 6
2.3 Questionnaire Design and Methodology . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3.1 Questionnaire Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3.2 Technical Question Flow and Content . . . . . . . . . . . . . . . . . . . . . 9
2.3.3 Pilot Testing and Questionnaire Distribution Strategy . . . . . . . . . . . . 10
2.3.4 Analysis Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4 Questionnaire Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4.1 Demographic Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4.2 Technical Questions Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5 Key Findings and Implications of Mobile Applications Development in Industry . . 18
VIII
3. LLMs and Transcompilers for Mobile Application Code Translation . . . . . . . . . . . . 20
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2 Literature Review on Code Translation . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2.1 Translation of Mobile Application Code . . . . . . . . . . . . . . . . . . . . 20
3.2.2 LLMs for Code Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3 Case Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.3.1 Web Services Code Translation from Swift to Java . . . . . . . . . . . . . . 26
3.3.2 Translating from JavaScript to Swift and Java . . . . . . . . . . . . . . . . . 27
3.3.3 GPT4o Evaluation to Translate from Swift to Java . . . . . . . . . . . . . . 28
3.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.4.1 Results of Web Services Code Translation . . . . . . . . . . . . . . . . . . . 29
3.4.2 Results of Translation from JavaScript to Java and Swift . . . . . . . . . . . 29
3.4.3 Results of Evaluating GPT4o . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4. Formal Verification Approaches for Code Conversion . . . . . . . . . . . . . . . . . . . . 34
4.1 Background and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2 Literature of Previous Reviews of Formal Verification . . . . . . . . . . . . . . . . . 36
4.3 Research Corpus Collection Methodology . . . . . . . . . . . . . . . . . . . . . . . 37
4.3.1 Verification Approaches Categorization Methodology . . . . . . . . . . . . . 39
4.4 Verification Techniques for Code Optimization . . . . . . . . . . . . . . . . . . . . . 40
4.4.1 Symbolic Execution for Equivalence Checking . . . . . . . . . . . . . . . . . 40
4.4.2 Equivalence Checking Using State-Machine Representations . . . . . . . . . 44
4.4.3 Equivalence Checking via Assertions . . . . . . . . . . . . . . . . . . . . . . 49
4.4.4 Proof-Based Verification of Compiler Optimizations . . . . . . . . . . . . . . 50
4.5 Formal Verification of Code-to-Code Translation . . . . . . . . . . . . . . . . . . . 52
4.5.1 Symbolic Execution for Cross-Language Equivalence Checking . . . . . . . . 53
4.5.2 Assertions Based on K-Safety Properties for Translation Verification . . . . 53
4.5.3 Proof-Based Verification of Code Translators . . . . . . . . . . . . . . . . . 53
4.6 Formal Verification of Model-Based Code Generation . . . . . . . . . . . . . . . . . 55
4.6.1 Model Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.6.2 Assertions and Formal Proof Obligations for Equivalence Verification . . . . 56
4.7 Analysis of the Trends in Formal Verification of Code Conversion . . . . . . . . . . 57
4.7.1 State-Machine and Graph-Based Equivalence Checking . . . . . . . . . . . . 61
4.7.2 Proof-Based Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.7.3 Symbolic Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.7.4 Assertion-Based Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.7.5 Model Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.8 Analysis of Formal Verification Methods for Cross-Language Code Translation . . . 63
4.8.1 Proof-Based Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.8.2 Symbolic Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.8.3 Assertions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.8.4 State-Machines and Model-Checking . . . . . . . . . . . . . . . . . . . . . . 64
IX
5. A Model-Checking-Based Verification Approach for Transcompilers . . . . . . . . . . . . 66
5.1 Literature Review on Translators Verification . . . . . . . . . . . . . . . . . . . . . 67
5.2 The TC-Verifier Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.2.1 Transcompiler-Based Code Translators . . . . . . . . . . . . . . . . . . . . . 68
5.2.2 Tool Selection for Model Checking . . . . . . . . . . . . . . . . . . . . . . . 69
5.2.3 Key Components and Concepts in Uppaal . . . . . . . . . . . . . . . . . . . 70
5.2.4 The TC-Verifier Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.2.5 Evaluation Metric: Verification Success Rate . . . . . . . . . . . . . . . . . . 73
5.2.6 Applying the Verifier Model to a Translator Case Study . . . . . . . . . . . 74
5.3 Experimental Setup: Statements Modeled . . . . . . . . . . . . . . . . . . . . . . . 77
5.3.1 Variable Declaration Statement . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.3.2 Function Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.3.3 Class Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.3.4 While Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.4 Results of Applying TC-Verifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.4.1 Variable Declaration Verification . . . . . . . . . . . . . . . . . . . . . . . . 84
5.4.2 Function Declaration Verification . . . . . . . . . . . . . . . . . . . . . . . . 85
5.4.3 Class Declaration Verification . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.4.4 While Statement Verification . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.4.5 Comparison with BLEU Score . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6. Enhancing LLMs Using TransCompiler-Assisted Prompting . . . . . . . . . . . . . . . . 92
6.1 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.2 Methodology of TC-Assisted Prompting . . . . . . . . . . . . . . . . . . . . . . . . 94
6.3 Case Studies: Selected Models and Experimental Design . . . . . . . . . . . . . . . 95
6.3.1 Models Selection Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.3.2 Two-Step Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.3.3 Codestral Fine-tuning Experiment . . . . . . . . . . . . . . . . . . . . . . . 97
6.4 Datasets Used for Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.4.1 New dataset TGAlpaca . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.4.2 HumanEvalX Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.4.3 XLCoST Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.4.4 Cyclomatic Complexity of Code CC . . . . . . . . . . . . . . . . . . . . . . 100
6.5 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.5.1 Compilation Success Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.5.2 Assertion-Based Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.5.3 pass@1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.5.4 CodeBLEU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.6 Results of LLMs enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.6.1 Results on TGAlpaca dataset . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.6.2 Comparison Results with HumanEvalX Dataset . . . . . . . . . . . . . . . . 106
6.6.3 Comparison Results with XLCoST Dataset . . . . . . . . . . . . . . . . . . 108
6.7 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.7.1 TC-Assisted Prompting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.7.2 No More Fine-Tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.7.3 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.7.4 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
X
7. Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
7.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Appendices:
A. Questionnaire Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
B. Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Abstract:
Code translation is a critical task in software engineering, particularly for cross-platform development. Building mobile applications across platforms often requires translation between native
languages such as Java, Swift, and Kotlin. While modern frameworks like React Native and Flutter
have gained traction, translation between cross-platform languages (e.g., JavaScript and Dart) and
native languages remains a practical necessity for developers.
Previous research has proposed transcompiler-based translators for mobile application languages.
However, developers rarely adopt these tools due to their incompleteness and the unquantified nature of their coverage, often lacking rigorous validation or verification. At the same time, large
language models (LLMs) have transformed code generation tasks, including code completion, generation, summarization, and translation, by leveraging large-scale training datasets. Yet, datasets
specifically focused on mobile application languages remain scarce and underexplored.
In this dissertation, we propose transcompiler-assisted prompting of LLMs for code translation.
We begin by introducing a methodology to formally verify both the capabilities and limitations of
existing compiler-based translators. This verification identifies which translation rules are correctly
covered and which are not. The verification process revealed 50.7% coverage for the transcompiler
under study. The verified partial translations produced by the transcompiler, guaranteed to be
correct for the parts they cover, are then embedded into the prompts provided to LLMs. These
verified fragments guide the LLMs toward reduced hallucination and improved syntactic and semantic
correctness in their output.
We apply this prompt-aided methodology to three different LLMs and evaluate their performance
using our newly created dataset, TGAlpaca, including 902 different code snippets, and two benchmark
datasets, HumanEval-X and XlCost. Our results show improvements ranging from 1% to 40% across
all three models for TGAlpaca. Furthermore, benchmark comparisons with the HumanEvalX data
set achieved a pass@1 score comparable to the CodeGeex fine-tuned model and a 21-27 pass@1 score
improvement over other baseline models. This demonstrates that our approach achieves competitive
or superior translation quality while saving computational resources otherwise required for training
or fine-tuning.
Keywords
Code translation, Formal verification, Large language models, Model checking, Mobile Application development
Text in English, abstracts in English and Arabic
0000-0002-1702-6544 ORCID
Software Engineering
Dissertation, Academic
627