Image from Google Jackets

Replication in NileStore: A Peer To Peer Global Storage System / Ahmed Ali ElDin Mohamed Hassan

By: Material type: TextTextLanguage: English Summary language: English Publication details: 2010Description: 116 p. ill. 21 cmSubject(s): Genre/Form: DDC classification:
  • 610
Contents:
Contents: Chapter 1 Thesis outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Thesis organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2 An overview of distributed storage systems . . . . . . . . . . . . . . . . . . . . . 4 2.1 Examples of distributed storage systems . . . . . . . . . . . . . . . . . . . 5 2.1.1 The Farsite project . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2.1.1.1 Feasability of a distributed storage system . . . . . . . . . 6 2.1.1.2 Network problem not storage problem . . . . . . . . . . . 6 2.1.2 The OceanStore project . . . . . . . . . . . . . . . . . . . . . . . . 8 2.1.2.1 Thermodynamical approach . . . . . . . . . . . . . . . . . 8 2.1.2.2 Using Tapestry DHT . . . . . . . . . . . . . . . . . . . . . 9 2.1.2.3 Tackling versioning of mutable files . . . . . . . . . . . . . 11 2.1.3 The Tahoe project . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.1.3.1 Tahoe security explained . . . . . . . . . . . . . . . . . . . 12 2.1.3.2 Technologies used and system architecture . . . . . . . . 14 2.1.4 The GEMS project . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 2.1.4.1 A Global Storage for Academic Simulations . . . . . . . . 16 2.1.4.2 Replica Management as a Linear Control Problem . . . . 17 2.2 The availability and durability problem . . . . . . . . . . . . . . . . . . . 18 2.2.1 Replica types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.2.2 Replica placement strategies and algorithms . . . . . . . . . . . . . 19 2.2.2.1 Use Of AI hill-climbing algorithms . . . . . . . . . . . . . 20 2.2.2.2 Lazy Repair and Reintegration . . . . . . . . . . . . . . . 20 2.2.2.3 Using Age Estimation For Placement . . . . . . . . . . . 22 2.2.2.4 Proactive replication . . . . . . . . . . . . . . . . . . . . . 25 2.2.2.5 Gossiping is Practical . . . . . . . . . . . . . . . . . . . . 25 2.3 Security in distributed storage systems . . . . . . . . . . . . . . . . . . . . 26 2.3.1 The Sybil attack . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 2.4 Incentives to use online storage systems . . . . . . . . . . . . . . . . . . . . 29 vii 2.4.1 Replica placement as a game . . . . . . . . . . . . . . . . . . . . . . 29 2.5 Measurements and models of distributed systems . . . . . . . . . . . . . . 30 2.5.1 Analysis of deployed P2P systems . . . . . . . . . . . . . . . . . . . 30 2.5.1.1 KAD network analysis . . . . . . . . . . . . . . . . . . . . 30 2.5.1.2 Availability analysis in Overnet . . . . . . . . . . . . . . . 31 2.5.1.3 Comparing different systems . . . . . . . . . . . . . . . . . 32 2.5.2 Modeling of distributed storage systems . . . . . . . . . . . . . . . 33 2.5.2.1 Failure Correlation Analysis . . . . . . . . . . . . . . . . . 33 2.5.2.2 Effect of environment on churn . . . . . . . . . . . . . . . 36 2.5.2.3 Redundancy Vs Scaling Vs Dynamical System . . . . . . . 38 2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 3 Validation of a Hybrid Replication Algorithm . . . . . . . . . . . . . . . . . . . 41 3.1 Why use an adaptive hybrid replication scheme? . . . . . . . . . . . . . . . 41 3.2 Modeling the distributed storage problem . . . . . . . . . . . . . . . . . . . 42 3.2.1 System assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3.2.2 Peers’ model description . . . . . . . . . . . . . . . . . . . . . . . . 43 3.2.3 The system model . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.3 Replication as an adaptive control problem . . . . . . . . . . . . . . . . . . 45 3.3.1 The estimator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 3.4 Building an adaptive controller . . . . . . . . . . . . . . . . . . . . . . . . 46 3.5 Choosing T: the time between two estimations of the churn . . . . . . . 46 3.6 Validating the algorithm using simulation . . . . . . . . . . . . . . . . . . . 47 3.7 Critique for this algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 4 NileStore . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 4.1 A framework for replica placement . . . . . . . . . . . . . . . . . . . . . . 51 4.2 Nomenclature and Notations . . . . . . . . . . . . . . . . . . . . . . . . . . 52 4.3 NileStore system: Introduction and Design goals . . . . . . . . . . . . . . 53 4.3.1 System Design Goals . . . . . . . . . . . . . . . . . . . . . . . . . . 53 4.4 NileStore: The protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 4.4.1 The Tracker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 4.4.2 The peers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 4.5 From Replica Placement to Task Assignment . . . . . . . . . . . . . . . . . 61 4.5.1 Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 4.5.2 Profit functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 4.5.2.1 Family 1: Profit function I . . . . . . . . . . . . . . . . . . 64 4.5.2.2 FAmily 1: Profit function II . . . . . . . . . . . . . . . . . 66 4.5.2.3 Family 1:Profit function III . . . . . . . . . . . . . . . . . 67 4.5.2.4 Family 1:Profit function IV . . . . . . . . . . . . . . . . . 68 4.5.2.5 Family 2:Profit function V . . . . . . . . . . . . . . . . . . 69 4.5.3 Solving the problem . . . . . . . . . . . . . . . . . . . . . . . . . . 70 5 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 5.1 Discrete Event Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 viii 5.1.1 Discrete Event Simulation Anatomy . . . . . . . . . . . . . . . . . . 72 5.1.1.1 Workload Modeling . . . . . . . . . . . . . . . . . . . . . . 72 5.1.1.2 Service Disciplines . . . . . . . . . . . . . . . . . . . . . . 72 5.1.1.3 Accuracy Vs Cost . . . . . . . . . . . . . . . . . . . . . . 73 5.1.2 Network Simulation: Practical design . . . . . . . . . . . . . . . . . 73 5.2 Simulating NileStore . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 5.2.1 NileStore’s Simulator Anatomy . . . . . . . . . . . . . . . . . . . . 74 5.2.2 NileStore’s Workload modeling . . . . . . . . . . . . . . . . . . . . 74 5.2.2.1 Peers’ Upload and Download Rates . . . . . . . . . . . . . 75 5.2.2.2 Storage Model . . . . . . . . . . . . . . . . . . . . . . . . 75 5.2.2.3 Leave/Join Rates . . . . . . . . . . . . . . . . . . . . . . . 76 5.2.3 NileStore’s Simulator Practical Design . . . . . . . . . . . . . . . . 77 5.2.3.1 Classes for Managing the Queue . . . . . . . . . . . . . . 77 5.2.3.2 Networking Classes . . . . . . . . . . . . . . . . . . . . . . 78 5.2.3.3 NileStore’s Classes . . . . . . . . . . . . . . . . . . . . . . 79 5.2.3.4 The Main Loop . . . . . . . . . . . . . . . . . . . . . . . . 80 6 Experiments and Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 6.1 NileStore’s Performance with time . . . . . . . . . . . . . . . . . . . . . . . 81 6.1.1 Experiment conditions . . . . . . . . . . . . . . . . . . . . . . . . . 81 6.1.2 Experiment results . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 6.1.2.1 Reason for the poor performance of profit function V . . . 83 6.1.2.2 Family 1 profit functions are similar . . . . . . . . . . . . 83 6.1.3 Changing the experiment conditions . . . . . . . . . . . . . . . . . 87 6.2 Effect of changing the churn rate . . . . . . . . . . . . . . . . . . . . . . . 89 6.3 Effect of changing τ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 6.4 Effect of changing r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 6.5 Time taken for replica placement . . . . . . . . . . . . . . . . . . . . . . . 95 7 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 7.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 References . . . . . . . .
Dissertation note: Thesis (M.A.)—Nile University, Egypt, 2010 . Abstract: Abstract: Replica placement algorithms in distributed storage and backup systems have a direct effect on the required time for replication, the availability and durability of files and the latency of file retrieval. In this work we introduce Nilestore, a P2P-assisted cloud storage protocol that transforms the replica placement problem into a linear task assignment optimization problem. The task assignment problem is solved using an optimization engine. The assignment takes into account each peer’s upload bandwidth, download bandwidth, contributed free storage and the size of data that it needs to replicate. The task assignment is solved using a suboptimal optimization algorithm for speed. This approach to solving the replica placement problem has not been taken before for online storage systems. We show our simulation results for the algorithms under realistic network conditions that were obtained from many studies conducted on P2P networks and distributed storage. Our results show that using NileStore can offload the storage servers of the cloud by more than 75% using the available replicas in the system.
Tags from this library: No tags from this library for this title. Log in to add tags.
Star ratings
    Average rating: 0.0 (0 votes)
Holdings
Item type Current library Call number Status Date due Barcode
Thesis Thesis Main library 610 / AH.R 2010 (Browse shelf(Opens below)) Not For Loan

Supervisor: Nael Osman

Thesis (M.A.)—Nile University, Egypt, 2010 .

"Includes bibliographical references"

Contents:
Chapter
1 Thesis outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Thesis organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 An overview of distributed storage systems . . . . . . . . . . . . . . . . . . . . . 4
2.1 Examples of distributed storage systems . . . . . . . . . . . . . . . . . . . 5
2.1.1 The Farsite project . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.1.1 Feasability of a distributed storage system . . . . . . . . . 6
2.1.1.2 Network problem not storage problem . . . . . . . . . . . 6
2.1.2 The OceanStore project . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1.2.1 Thermodynamical approach . . . . . . . . . . . . . . . . . 8
2.1.2.2 Using Tapestry DHT . . . . . . . . . . . . . . . . . . . . . 9
2.1.2.3 Tackling versioning of mutable files . . . . . . . . . . . . . 11
2.1.3 The Tahoe project . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.1.3.1 Tahoe security explained . . . . . . . . . . . . . . . . . . . 12
2.1.3.2 Technologies used and system architecture . . . . . . . . 14
2.1.4 The GEMS project . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1.4.1 A Global Storage for Academic Simulations . . . . . . . . 16
2.1.4.2 Replica Management as a Linear Control Problem . . . . 17
2.2 The availability and durability problem . . . . . . . . . . . . . . . . . . . 18
2.2.1 Replica types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.2 Replica placement strategies and algorithms . . . . . . . . . . . . . 19
2.2.2.1 Use Of AI hill-climbing algorithms . . . . . . . . . . . . . 20
2.2.2.2 Lazy Repair and Reintegration . . . . . . . . . . . . . . . 20
2.2.2.3 Using Age Estimation For Placement . . . . . . . . . . . 22
2.2.2.4 Proactive replication . . . . . . . . . . . . . . . . . . . . . 25
2.2.2.5 Gossiping is Practical . . . . . . . . . . . . . . . . . . . . 25
2.3 Security in distributed storage systems . . . . . . . . . . . . . . . . . . . . 26
2.3.1 The Sybil attack . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.4 Incentives to use online storage systems . . . . . . . . . . . . . . . . . . . . 29
vii
2.4.1 Replica placement as a game . . . . . . . . . . . . . . . . . . . . . . 29
2.5 Measurements and models of distributed systems . . . . . . . . . . . . . . 30
2.5.1 Analysis of deployed P2P systems . . . . . . . . . . . . . . . . . . . 30
2.5.1.1 KAD network analysis . . . . . . . . . . . . . . . . . . . . 30
2.5.1.2 Availability analysis in Overnet . . . . . . . . . . . . . . . 31
2.5.1.3 Comparing different systems . . . . . . . . . . . . . . . . . 32
2.5.2 Modeling of distributed storage systems . . . . . . . . . . . . . . . 33
2.5.2.1 Failure Correlation Analysis . . . . . . . . . . . . . . . . . 33
2.5.2.2 Effect of environment on churn . . . . . . . . . . . . . . . 36
2.5.2.3 Redundancy Vs Scaling Vs Dynamical System . . . . . . . 38
2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3 Validation of a Hybrid Replication Algorithm . . . . . . . . . . . . . . . . . . . 41
3.1 Why use an adaptive hybrid replication scheme? . . . . . . . . . . . . . . . 41
3.2 Modeling the distributed storage problem . . . . . . . . . . . . . . . . . . . 42
3.2.1 System assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.2.2 Peers’ model description . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2.3 The system model . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.3 Replication as an adaptive control problem . . . . . . . . . . . . . . . . . . 45
3.3.1 The estimator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.4 Building an adaptive controller . . . . . . . . . . . . . . . . . . . . . . . . 46
3.5 Choosing T: the time between two estimations of the churn . . . . . . . 46
3.6 Validating the algorithm using simulation . . . . . . . . . . . . . . . . . . . 47
3.7 Critique for this algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4 NileStore . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.1 A framework for replica placement . . . . . . . . . . . . . . . . . . . . . . 51
4.2 Nomenclature and Notations . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3 NileStore system: Introduction and Design goals . . . . . . . . . . . . . . 53
4.3.1 System Design Goals . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.4 NileStore: The protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.4.1 The Tracker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.4.2 The peers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.5 From Replica Placement to Task Assignment . . . . . . . . . . . . . . . . . 61
4.5.1 Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.5.2 Profit functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.5.2.1 Family 1: Profit function I . . . . . . . . . . . . . . . . . . 64
4.5.2.2 FAmily 1: Profit function II . . . . . . . . . . . . . . . . . 66
4.5.2.3 Family 1:Profit function III . . . . . . . . . . . . . . . . . 67
4.5.2.4 Family 1:Profit function IV . . . . . . . . . . . . . . . . . 68
4.5.2.5 Family 2:Profit function V . . . . . . . . . . . . . . . . . . 69
4.5.3 Solving the problem . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.1 Discrete Event Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
viii
5.1.1 Discrete Event Simulation Anatomy . . . . . . . . . . . . . . . . . . 72
5.1.1.1 Workload Modeling . . . . . . . . . . . . . . . . . . . . . . 72
5.1.1.2 Service Disciplines . . . . . . . . . . . . . . . . . . . . . . 72
5.1.1.3 Accuracy Vs Cost . . . . . . . . . . . . . . . . . . . . . . 73
5.1.2 Network Simulation: Practical design . . . . . . . . . . . . . . . . . 73
5.2 Simulating NileStore . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.2.1 NileStore’s Simulator Anatomy . . . . . . . . . . . . . . . . . . . . 74
5.2.2 NileStore’s Workload modeling . . . . . . . . . . . . . . . . . . . . 74
5.2.2.1 Peers’ Upload and Download Rates . . . . . . . . . . . . . 75
5.2.2.2 Storage Model . . . . . . . . . . . . . . . . . . . . . . . . 75
5.2.2.3 Leave/Join Rates . . . . . . . . . . . . . . . . . . . . . . . 76
5.2.3 NileStore’s Simulator Practical Design . . . . . . . . . . . . . . . . 77
5.2.3.1 Classes for Managing the Queue . . . . . . . . . . . . . . 77
5.2.3.2 Networking Classes . . . . . . . . . . . . . . . . . . . . . . 78
5.2.3.3 NileStore’s Classes . . . . . . . . . . . . . . . . . . . . . . 79
5.2.3.4 The Main Loop . . . . . . . . . . . . . . . . . . . . . . . . 80
6 Experiments and Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.1 NileStore’s Performance with time . . . . . . . . . . . . . . . . . . . . . . . 81
6.1.1 Experiment conditions . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.1.2 Experiment results . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.1.2.1 Reason for the poor performance of profit function V . . . 83
6.1.2.2 Family 1 profit functions are similar . . . . . . . . . . . . 83
6.1.3 Changing the experiment conditions . . . . . . . . . . . . . . . . . 87
6.2 Effect of changing the churn rate . . . . . . . . . . . . . . . . . . . . . . . 89
6.3 Effect of changing τ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.4 Effect of changing r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.5 Time taken for replica placement . . . . . . . . . . . . . . . . . . . . . . . 95
7 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
7.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
References . . . . . . . .

Abstract:
Replica placement algorithms in distributed storage and backup systems have a direct effect
on the required time for replication, the availability and durability of files and the latency
of file retrieval. In this work we introduce Nilestore, a P2P-assisted cloud storage protocol
that transforms the replica placement problem into a linear task assignment optimization
problem. The task assignment problem is solved using an optimization engine. The assignment
takes into account each peer’s upload bandwidth, download bandwidth, contributed
free storage and the size of data that it needs to replicate. The task assignment is solved
using a suboptimal optimization algorithm for speed. This approach to solving the replica
placement problem has not been taken before for online storage systems. We show our
simulation results for the algorithms under realistic network conditions that were obtained
from many studies conducted on P2P networks and distributed storage. Our results show
that using NileStore can offload the storage servers of the cloud by more than 75% using
the available replicas in the system.

Text in English, abstracts in English.

There are no comments on this title.

to post a comment.