Real-Time Stream Processing For Recommendation Engine(s) / Mohamed Reda Zohair Ali Bennawy

Material type: Text
Language: English
Summary language: English
Publication details: 2023
Description: 75 p. ill. 21 cm
DDC classification: 610
Supervisor:
Walid Al-Atabany

Thesis (M.A.), Nile University, Egypt, 2023.

"Includes bibliographical references"

Contents:
List of Abbreviations
List of Tables
List of Figures
Chapter 1: Introduction
1.1 Motivation
1.2 Problem Statement
1.3 User Stories
1.4 Challenges
1.5 Summary of Contributions
1.6 Thesis Outline
Chapter 2: Literature Survey
2.1 Introduction
2.2 Data Architecture
2.3 Use Cases
2.4 Frameworks
2.5 Comparison
2.6 Architecture
2.6.1 Benefits of modern stream processing architecture
2.7 Session-based Recommender Systems
2.8 Evaluation of any recommender system
2.8.1 A/B Testing
2.8.2 Rating Prediction Accuracy
2.8.3 Ranking Measures
2.8.4 Hit Rates Measures
2.8.5 Diversity Measures
2.9 Recommender Model Types
2.9.1 Content-Based Recommender System
2.9.2 Collaborative Filtering Recommender System
Chapter 3: Methodology
3.1 Introduction
3.2 E-Tourism Datasets
3.3 Trivago Dataset Description
3.3.1 Overview
3.3.2 Files Description
3.3.3 Descriptive Analysis
Chapter 4: Results and Discussions
4.1 Experiments
4.2 Evaluation Metrics
4.3 Experiments
4.4 Benchmarks
4.5 Proposed Solution
Chapter 5: Conclusions and Future Work
5.1 Conclusion
5.2 Future directions
References

Abstract:
Event stream processing (ESP) is a data processing methodology that handles the
online processing of a variety of events. Stream processing has recently attracted
great interest in both academic research and corporate use cases, as a consequence
of the very large data sources now being generated and the diversity of their uses.
Data sources range from website logs, social media feeds, and news articles to
internal business transactions and IoT device logs. Academically, many research
papers discuss how to deal with an enormous volume of events in real time across
different data types such as text, video, logs, and transactions, while also
highlighting the strengths and weaknesses of streaming platform technologies. From
a corporate point of view, decision makers ask how best to utilize those events
with minimal delay in order to 1) uncover insights in real time, 2) mine textual
events, and 3) recommend decisions. This requires a mix of machine learning, stream
processing technologies, and modern architecture to achieve the best utilization
with low latency. Unfortunately, each technology is typically optimized
independently; it is therefore a challenge to combine all of them into a scalable
real-world application.
In this thesis, we discuss state-of-the-art event stream processing technologies
by summarizing definitions, data flow architectures, use cases, frameworks, and
architecture best practices. We also propose a recommendation engine architecture
designed to cope with a real-life data stream in the e-tourism domain.
The Association for Computing Machinery (ACM) recommendation systems challenge
(ACM RecSys) [1] released an e-tourism dataset for the first time in 2019. The
challenge shared hotel booking sessions from the trivago website and asked
participants to rank the list of hotels for each user. A better ranking should
achieve a higher click-out rate. We introduce a state-of-the-art architecture and
a session-based recommender system on top of the portal's streaming data. The
proposed solution takes into consideration both recommendation engine accuracy and
a low-latency architecture. Compared to different benchmark publications on the
same dataset, the proposed solution is 10x faster while using only 2% of the
computational power they used. This work describes the architecture and
recommendation engine in detail, taking into consideration the ability to deploy
the model in real-life production environments.

Text in English, abstracts in English and Arabic
