MARC details
| 000 -LEADER |
| fixed length control field |
08793nam a22002657a 4500 |
| 008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION |
| fixed length control field |
201210b2025 a|||f bm|| 00| 0 eng d |
| 024 7# - Author Identifier |
| Standard number or code |
0009-0002-9308-0291 |
| Source of number or code |
ORCID |
| 040 ## - CATALOGING SOURCE |
| Original cataloging agency |
EG-CaNU |
| Transcribing agency |
EG-CaNU |
| 041 0# - Language Code |
| Language code of text |
eng |
| Language code of abstract |
eng |
| -- |
ara |
| 082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER |
| Classification number |
610 |
| 100 0# - MAIN ENTRY--PERSONAL NAME |
| Personal name |
Salma Khaled Ali Mohamed Ali |
| 245 1# - TITLE STATEMENT |
| Title |
UniTextFusion: |
| Remainder of title |
A Unified Early Fusion Framework for Arabic Multimodal Sentiment Analysis with LLMs |
| Statement of responsibility, etc. |
/ Salma Khaled Ali Mohamed Ali |
| 260 ## - PUBLICATION, DISTRIBUTION, ETC. |
| Date of publication, distribution, etc. |
2025 |
| 300 ## - PHYSICAL DESCRIPTION |
| Extent |
86 p. |
| Other physical details |
ill. |
| Dimensions |
21 cm. |
| 500 ## - GENERAL NOTE |
| Materials specified |
Supervisors: <br/>Dr. Walaa Medhat<br/>Dr. Ensaf Hussein Mohamed<br/> |
| 502 ## - Dissertation Note |
| Dissertation type |
Thesis (M.A.)--Nile University, Egypt, 2025. |
| 504 ## - Bibliography |
| Bibliography |
Includes bibliographical references |
| 505 0# - Contents |
| Formatted contents note |
Contents:<br/>Acknowledgments<br/>Abstract<br/>List of Figures<br/>List of Tables<br/>List of Equations<br/>Chapters:<br/>1. Introduction<br/>1.1 Problem Statement<br/>1.2 Challenges in Arabic Multimodal Sentiment Analysis<br/>1.3 Thesis Contributions<br/>1.4 Organization of the Thesis<br/>1.4.1 Background and Concepts<br/>1.4.2 Related Work<br/>1.4.3 Methodology<br/>1.4.4 Experimental Evaluation<br/>1.4.5 Conclusions and Future Work<br/>2. Background and Concepts<br/>2.1 Multi-modal Sentiment Analysis<br/>2.2 Multi-modal Data Fusion Techniques<br/>2.2.1 Feature Level (early-stage) Fusion Technique<br/>2.2.2 Model Level (mid-stage) Fusion Technique<br/>2.2.3 Decision Level (late-stage) Fusion Technique<br/>2.2.4 Comparison of Fusion Techniques<br/>2.3 MuSA Techniques and Approaches<br/>2.3.1 Traditional Machine Learning Approaches<br/>2.3.2 Deep Learning Approaches<br/>2.3.3 Large Language Model (LLM)-based Generative Approaches<br/>3. Related Work<br/>3.1 Existing Multimodal Datasets for Sentiment Analysis<br/>3.1.1 Comparison of English and Arabic Datasets<br/>3.1.2 Gaps in Existing Multimodal Datasets<br/>3.2 Approaches to Multimodal Sentiment Analysis<br/>3.3 Research Gaps<br/>4. Arabic Multimodal Sentiment Analysis Methodology<br/>4.1 Ar-MuSA: Arabic Multimodal Sentiment Analysis Dataset<br/>4.1.1 Data Acquisition<br/>4.1.2 Data Preparation<br/>4.1.3 Data Labeling Procedure<br/>4.1.4 Ethical Considerations<br/>4.2 Multimodal Sentiment Analysis Fusion and Models<br/>4.2.1 Pre-trained Models<br/>4.2.2 Generative LLM Models<br/>4.2.3 UniText Fusion<br/>5. Experimental Evaluation<br/>5.1 Performance Metrics<br/>5.1.1 Weighted Precision<br/>5.1.2 Weighted Recall<br/>5.1.3 Weighted F1-Score<br/>5.1.4 Accuracy<br/>5.2 Pre-trained Models: Setup and Results<br/>5.2.1 Text-Based Transformer: MarBERT Model<br/>5.2.2 Audio-Based Transformer: Egyptian HuBERT<br/>5.2.3 Image-Based Transformer: MobileNet V2<br/>5.2.4 Multi-Modal Fusion<br/>5.3 Generative LLM Models: Setup and Results<br/>5.3.1 Text-Based LLM: Qwen2<br/>5.3.2 Audio-Based LLM: Qwen2-Audio<br/>5.3.3 Image-Based LLM: Qwen2-VL<br/>5.3.4 Multi-modal Fusion<br/>5.4 UniText Fusion Approach: Setup and Results<br/>5.4.1 Experimental Setup<br/>5.4.2 Experimental Results<br/>5.5 Comparative Analysis of UniText Fusion and State-of-the-Art Techniques<br/>6. Conclusion and Future Work<br/>6.1 Conclusion<br/>6.2 Future Work<br/>Appendices:<br/>A. Publications<br/>Bibliography |
| 520 3# - Abstract |
| Abstract |
Abstract:<br/>Multimodal Sentiment Analysis (MuSA) combines text, audio, and visual inputs to detect and classify emotions. Despite its growing relevance, Arabic MuSA research is limited due to the lack of high-quality annotated datasets and the complexity of Arabic language processing. This work presents Ar-MuSA, an open-source Arabic MuSA dataset containing aligned text, audio, and visual data. Unlike existing unimodal resources, Ar-MuSA supports sentiment analysis across multiple modalities. The dataset is evaluated using MarBERT (text), HuBERT (audio), MobileNet (vision), Qwen2 (multimodal), and ensemble methods. Results indicate improved performance through modality fusion; MarBERT achieved a 71% F1-score for text-only classification, while audio and image modalities performed lower individually. Fusion with text improved performance from 39% to 67%, an absolute gain of 28 percentage points. To further improve results, the UniTextFusion framework is proposed. It performs early fusion by converting audio and visual signals into text descriptions, which are combined with transcripts and used as input to large language models (LLMs). Fine-tuning Arabic-compatible LLMs, LLaMA 3.1-8B Instruct and SILMA AI 9B, using LoRA (Low-Rank Adaptation) yielded F1-scores of 68% and 71%, surpassing unimodal baselines of 34% and 41% by 34 and 30 percentage points, respectively.<br/>Keywords:<br/>Arabic Multimodal Sentiment Analysis, LoRA, Fine-tuning, Arabic MuSA Dataset, Multimodal Generative LLMs, Fusion |
| 546 ## - Language Note |
| Language Note |
Text in English, abstracts in English and Arabic |
| 650 #4 - Subject |
| Subject |
Informatics |
| 655 #7 - Index Term-Genre/Form |
| Source of term |
NULIB |
| Focus term |
Dissertations, Academic |
| 690 ## - Subject |
| School |
Informatics (IFM) |
| 942 ## - ADDED ENTRY ELEMENTS (KOHA) |
| Source of classification or shelving scheme |
Dewey Decimal Classification |
| Koha item type |
Thesis |