EVOLUTION OF OBJECT DETECTION, TRACKING, AND MOTION ESTIMATION ALONG WITH DEEP NEURAL NETWORKS/
Khaled Adel Ezzat
- 2022
- 61 p. ill. 21 cm.
Supervisor: Khaled Foad Mustafa Elattar
Publication: 1-On the Application of Hierarchical Adaptive Structured Mesh "HASM®" Codec for Ultra Large Video Format https://dl.acm.org/doi/abs/10.1145/3436829.3436870
2-nnDPI: A Novel Deep Packet Inspection Technique Using Word Embedding, Convolutional and Recurrent Neural Networks https://ieeexplore.ieee.org/abstract/document/9257912
3-On Optimizing the Visual Quality of HASM-Based Streaming—The Study the Sensitivity of Motion Estimation Techniques for Mesh-Based Codecs in Ultra High Definition Large Format Real-Time Video Coding https://link.springer.com/chapter/10.1007/978-981-33-6129-4_15
4-MinkowRadon: Multi-Object Tracking Using Radon Transformation and Minkowski Distance https://ieeexplore.ieee.org/abstract/document/9581542
Abstract: Object detection, tracking, and motion estimation have been a major concern since the 1970s. Applications such as self-driving cars, surveillance cameras, industrial robotics, traffic monitoring, medical diagnosis systems, and activity recognition are expected to see a huge increase in demand for automated detection and tracking systems. Modern hardware and evolving deep learning applications, together with advances in computer vision and digital video processing, are driving massive progress towards fully automated systems. Yet even with advanced models and systems such as R-CNN, YOLO, SSD, and RetinaNet, there is always a trade-off between precision (mAP) and speed (FPS), which places new limits on the advancement of computer vision. Merging technologies has the potential to drive such advancements and overcome some of the existing limitations. This thesis introduces a combination of deep neural networks and digital signal processing to enable further progress in object detection, tracking, and motion estimation in real-time videos. Drawing on both fields, it proposes a complete detection/tracking framework that uses YOLO v4 as a state-of-the-art object detector to detect the objects in the video sequences, together with a novel MinkowRadon tracking algorithm that uses the Radon transformation and the Minkowski distance to translate the remaining frames of the video sequence into the signal domain. The aim is to tackle extreme object tracking problems found in video sequences, e.g. trembling moving cameras, deformation, motion blur, fast motion, and in-plane rotation. Tracking through signals achieved higher accuracy than state-of-the-art tracking techniques, showing that a combination of classical techniques and deep learning models is sufficient to solve most modern problems.
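The general idea of matching image patches through Radon projections and a Minkowski distance can be illustrated with a minimal sketch. This is not the thesis's MinkowRadon implementation, whose details the abstract does not give. The function names (`radon_signature`, `minkowski_distance`, `best_match`), the four projection angles, the nearest-bin projection, and the requirement that all patches share one size are assumptions made for illustration only.

```python
import math


def radon_signature(patch, angles_deg=(0, 45, 90, 135)):
    """Coarse Radon-style signature of a 2D patch (list of rows of numbers).

    For each angle, every pixel's intensity is added to the bin nearest to
    its signed distance from the patch centre along that direction.
    Hypothetical helper: a stand-in for a proper Radon transform.
    """
    h, w = len(patch), len(patch[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    half = int(math.ceil(math.hypot(h, w) / 2.0))  # bound on |offset|
    signature = []
    for angle in angles_deg:
        theta = math.radians(angle)
        c, s = math.cos(theta), math.sin(theta)
        bins = [0.0] * (2 * half + 1)
        for y in range(h):
            for x in range(w):
                offset = (x - cx) * c + (y - cy) * s
                bins[int(round(offset)) + half] += patch[y][x]
        signature.extend(bins)
    return signature


def minkowski_distance(a, b, p=2.0):
    """Minkowski distance of order p between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def best_match(template_sig, candidate_patches, p=2.0):
    """Return (index, distance) of the candidate closest to the template.

    Assumes every candidate has the same size as the template patch,
    so that all signatures have the same length.
    """
    distances = [
        minkowski_distance(template_sig, radon_signature(patch), p)
        for patch in candidate_patches
    ]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]
```

In a detect-then-track setting, the template signature would come from an object found by the detector in one frame, and the candidates from patches sampled around its previous location in the following frames. The order `p` selects the metric: `p=1` gives the Manhattan distance and `p=2` the Euclidean distance.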