The 5th International Conference on Internet Computing (IC 2004)

Towards a Peer-to-Peer Simulator
Dwight Deugo, Jon Harris
School of Computer Science, Carleton University, Ottawa, Canada

Contents
• Introduction & Motivation
• Peer-to-Peer Protocols
• Network Topology
• Communication
• Event Processing
• Modeling Content
• Conclusion

Introduction & Motivation
• What is peer-to-peer?
• Why simulate peer-to-peer networks?
• Limitations of existing peer-to-peer simulators
• Our primary goals
  – Scalability (simulate a large number of peers)
  – Support & compare existing & future protocols

Peer-to-Peer Protocols
• Centralized vs. Decentralized
  – How to coordinate communication between peers?
• Centralized protocols use a central server(s)
• Decentralized protocols rely only on peers.

[Figure: a centralized network (server S with peers P) and a decentralized network (peers P only)]

  – Centralized protocols suffer from a central point of failure (e.g., the legal plight of Napster)
  – Most new protocols of interest are decentralized.

Peer-to-Peer Protocols (cont’d)
• Unstructured vs. Structured
  – Unstructured protocols (BF)
    • Need general notion of what you’re looking for.
    • Peers ‘guess’ who to route the search to.
    • No guarantee that a correct match will be found.
  – Structured protocols (DHT)
    • Need to know exactly what you’re looking for (key).
    • Peers know exactly who to route the search to.
    • Match is guaranteed if the searched data exists.
  – Needle vs. the haystack

[Figure: peer (P) network diagrams for unstructured and structured search]
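The difference between 'guessing' a route and knowing it can be sketched in Python. This is an illustration only: the TTL-limited flood, the hash-to-peer mapping, and the names `flood_search` and `dht_owner` are assumptions made for this sketch, not the behaviour of any particular protocol.

```python
import hashlib
from collections import deque

def flood_search(neighbours, content, start, wanted, ttl):
    """Unstructured search: breadth-first flood limited to `ttl` hops.

    Returns a peer holding `wanted`, or None if the flood dies out first.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        peer, hops = queue.popleft()
        if wanted in content.get(peer, ()):
            return peer
        if hops == ttl:
            continue  # TTL exhausted on this branch
        for nxt in neighbours[peer]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None

def dht_owner(key, peers):
    """Structured lookup: hash the exact key to one responsible peer."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return peers[int.from_bytes(digest, "big") % len(peers)]

# A line of four peers, 0 - 1 - 2 - 3, with the item stored on peer 3.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
content = {3: {"song.mp3"}}
near = flood_search(neighbours, content, 0, "song.mp3", ttl=2)  # flood stops at peer 2
far = flood_search(neighbours, content, 0, "song.mp3", ttl=3)   # flood reaches peer 3
```

The flood can miss data that exists (no guarantee), while the hashed lookup always names the same peer for the same key but needs the exact key.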

Network Topologies
• Two different levels
  – Protocol level & transport level
• Representation
  – Use graph theory: Peers = nodes, connections = edges.
• Adjacency Matrix
  – Store an N x N matrix
  – If matrix[i, j] != NIL then node i is connected to node j
• Edge List
  – Store a list of N nodes.
  – Each node stores a list of edges.
• Comparison
  – Edge List is most prevalent
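A minimal Python sketch of the two representations, assuming undirected connections and integer node ids (both are assumptions of this example). The cell counts at the end hint at one plausible reason the edge list is preferred for sparse overlays.

```python
def to_adjacency_matrix(n, edges):
    """N x N matrix; matrix[i][j] is None (NIL) when i and j are not connected."""
    matrix = [[None] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = matrix[j][i] = 1  # undirected connection
    return matrix

def to_edge_lists(n, edges):
    """One entry per node, each holding only that node's own edges."""
    lists = [[] for _ in range(n)]
    for i, j in edges:
        lists[i].append(j)
        lists[j].append(i)
    return lists

edges = [(0, 1), (1, 2)]
matrix = to_adjacency_matrix(4, edges)
lists = to_edge_lists(4, edges)

matrix_cells = sum(len(row) for row in matrix)  # always N * N
list_cells = sum(len(adj) for adj in lists)     # 2 * number of edges
```

For N = 4 and two edges the matrix holds 16 cells and the edge lists hold 4 entries; the gap grows quadratically with N when each peer has only a few neighbours.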

Network Topologies (cont’d)
• Structural Models
  – Useful for ‘bootstrapping’ the simulation
  – Power Law Graphs
    • Few highly connected nodes, many low connected nodes
    • Which protocols evolve power law topologies?
    • Generation
      – Incremental Growth
      – Preferential Attachment
      – Topology Generator Tools
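Incremental growth with preferential attachment can be sketched as follows. The function name, the fully connected starting core, and the parameters `n` (final node count) and `m` (links per new node) are assumptions of this sketch, not the generator used by the simulator.

```python
import random

def preferential_attachment(n, m, seed=None):
    """Grow a graph to n nodes; each new node links to m distinct existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Start from a small fully connected core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `targets` once per incident edge, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    targets = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))
        for old in chosen:
            edges.append((new, old))
            targets.extend((new, old))
    return edges

edges = preferential_attachment(n=200, m=2, seed=1)
```

Nodes that arrive early tend to accumulate links, which is the mechanism behind the 'few highly connected, many low connected' shape.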

Network Topologies (cont’d)
• Structural Models
  – Small World Graphs
    • Clustering Coefficient & Characteristic Path Length
    • 6 Degrees of Separation – It’s a small world
    • Generate via rewiring of regular graph

[Figure: graphs shown in order of increasing randomness]

  – Structured protocols usually create a fixed structural model.
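Rewiring a regular graph can be sketched in Python in the style of the Watts and Strogatz construction. The ring lattice as the starting graph and the names `ring_lattice`, `rewire`, and `edge_count` are assumptions of this sketch.

```python
import random

def ring_lattice(n, k):
    """Regular graph: each node is linked to its k nearest ring neighbours (k even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def rewire(adj, p, seed=None):
    """Move one endpoint of each original edge with probability p.

    p = 0 keeps the regular graph; larger p gives increasing randomness.
    The number of edges never changes.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    for v in nodes:
        for w in sorted(adj[v]):  # snapshot, so new edges are not revisited
            if w < v or rng.random() >= p:
                continue  # each edge is considered once, from its lower end
            candidates = [u for u in nodes if u != v and u not in adj[v]]
            if candidates:
                new = rng.choice(candidates)
                adj[v].discard(w)
                adj[w].discard(v)
                adj[v].add(new)
                adj[new].add(v)
    return adj

def edge_count(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2

regular = ring_lattice(20, 4)
small_world = rewire(ring_lattice(20, 4), p=0.1, seed=7)
```

A small `p` leaves most local clusters intact while the few rewired edges act as shortcuts.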

Network Topologies (cont’d)
• Evolution
  – Peer arrivals and departures
    • Requires efficient operations to add/remove nodes to/from graphs
  – Changes to a peer’s neighbourhood
    • Requires efficient operations to add/remove edges to/from nodes
    • Occurs more frequently than arrival/departure
  – Communication between peers.
    • Requires efficient operations to add/remove edges to/from nodes.
    • Occurs extremely frequently.
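One way to keep these operations cheap is to hold each peer's neighbours in a hash set. The `Overlay` class below is a hypothetical sketch, not the simulator's data structure.

```python
class Overlay:
    """Neighbour sets keyed by peer id.

    Adding or removing an edge is O(1) on average; a departure costs time
    proportional to the departing peer's degree.
    """

    def __init__(self):
        self.neighbours = {}

    def join(self, peer):
        self.neighbours.setdefault(peer, set())

    def depart(self, peer):
        # Remove the peer and every edge that pointed at it.
        for other in self.neighbours.pop(peer, set()):
            self.neighbours[other].discard(peer)

    def connect(self, a, b):
        self.neighbours[a].add(b)
        self.neighbours[b].add(a)

    def disconnect(self, a, b):
        self.neighbours[a].discard(b)
        self.neighbours[b].discard(a)

overlay = Overlay()
for peer in ("A", "B", "C"):
    overlay.join(peer)
overlay.connect("A", "B")
overlay.connect("B", "C")
overlay.depart("B")  # A and C lose their only neighbour
```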

Communication
• Overlay Networks
  – Peer-to-peer networks are overlay networks superimposed onto the internet (or other communication mechanism)
  – Storing the overlay instead of the underlying network topology requires significantly less memory.
  – If we only store the overlay network topology, can we accurately model the bandwidth usage and latency?

Communication (cont’d)
• Bandwidth Models
  – Define how long a message of size X takes to travel between peer A and peer B (in simulation time).
  – Traditionally modeled using packet-level simulators such as ns2
  – Adding or removing a message affects the bandwidth of other messages at the sender and receiver
    • Propagation can affect bandwidth of every other message!

Communication (cont’d)
• Bottleneck Bandwidth
  – The link with the lowest bandwidth along a message’s route
  – Usually the ‘last-hop’ link in peer-to-peer networks
• Latency
  – Delay of message transmission
  – Caused by propagation, physical network, data processing
  – Interacts with Bandwidth to determine message transmission time
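A small Python sketch of how the two quantities might combine. The additive form (latency plus size divided by bottleneck bandwidth) is an assumption of this example; the slides say only that latency and bandwidth interact.

```python
def bottleneck_bandwidth(link_bandwidths):
    """The slowest link on the route limits the whole transfer."""
    return min(link_bandwidths)

def transmission_time(size, link_bandwidths, latency):
    """Assumed model: fixed delay plus time to push `size` through the bottleneck."""
    return latency + size / bottleneck_bandwidth(link_bandwidths)

# Hypothetical route: fast core links, slow last hop (units are arbitrary).
route = [100.0, 10.0, 2.0]
elapsed = transmission_time(size=8.0, link_bandwidths=route, latency=0.5)
```

Here the last hop (2.0) dominates: 8.0 / 2.0 = 4.0 time units of transfer plus 0.5 of latency.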

Communication (cont’d)
• Min-Equal-Share Bandwidth Model
  – Each peer has a maximum bandwidth
  – When peer A sends a message m to peer B
    • Determine bandwidth of m as minimum of:
      – (peer A’s maximum bandwidth) / (number of messages at peer A)
      – (peer B’s maximum bandwidth) / (number of messages at peer B)

[Figure: example network annotated with peer and message bandwidth values]

  – Can be extended for asymmetric upload/download bandwidth
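The rule above can be sketched in Python. Counting both incoming and outgoing messages toward a peer's load is an assumption of this sketch (the symmetric case), and the peers and bandwidth values are made up for illustration.

```python
from collections import Counter

def min_equal_share(max_bw, messages):
    """Return the bandwidth of each (sender, receiver) message.

    max_bw: peer -> maximum bandwidth.
    Each peer splits its maximum equally among the messages it takes part in;
    a message gets the smaller of its sender's and receiver's share.
    """
    load = Counter()
    for sender, receiver in messages:
        load[sender] += 1
        load[receiver] += 1
    return [
        min(max_bw[s] / load[s], max_bw[r] / load[r])
        for s, r in messages
    ]

max_bw = {"A": 12, "B": 8, "C": 4}
messages = [("A", "B"), ("A", "C"), ("B", "C")]
rates = min_equal_share(max_bw, messages)
# A -> B: min(12/2, 8/2) = 4;  A -> C: min(12/2, 4/2) = 2;  B -> C: min(8/2, 4/2) = 2
```

Recomputing after every added or removed message is what makes a single change ripple to other messages at the same sender and receiver.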

Communication (cont’d)
• Transporting Messages
  – Method calls instead of sockets
    • RMI can be used to simulate in parallel/distributed environments
  – Messages are represented by communication events.
    • Communication events store fields such as:
      – Start & Expected time
      – Source & Destination peer
      – Bandwidth and Latency.
        » May be derived from source/destination peers
      – Payload (protocol specific)
  – Large scale simulation imposes constraints on the size of communication events
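A compact event record might look like the Python sketch below. The class and field names are hypothetical; bandwidth and latency are deliberately not stored here, on the assumption that they are derived from the source and destination peers.

```python
class CommunicationEvent:
    """One in-flight message.

    __slots__ removes the per-instance attribute dictionary, which keeps
    each event small when very many are alive at once.
    """

    __slots__ = ("start_time", "expected_time", "source", "destination", "payload")

    def __init__(self, start_time, expected_time, source, destination, payload=None):
        self.start_time = start_time        # simulation time the message was sent
        self.expected_time = expected_time  # simulation time it is expected to arrive
        self.source = source                # sending peer (id or reference)
        self.destination = destination      # receiving peer (id or reference)
        self.payload = payload              # protocol-specific content

event = CommunicationEvent(0.0, 4.5, source=1, destination=7, payload="QUERY song.mp3")
```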

Event Processing
• Generating User Events
  – Examples: Join, Depart, Query, Transfer Content
  – In real life, these events occur sporadically.
  – Poisson Process can randomly determine the number of events to generate at each time step.
    • Average rate is a parameter

[Figure: Poisson distribution, P(N = X) plotted for X = 0 to 9]

• Creating Communication Events
  – Left up to the protocol
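For a Poisson distribution with average rate λ, P(N = X) = e^(-λ) λ^X / X!. The Python sketch below draws the number of events per time step using Knuth's multiplication method; the rate of 3.0 and the function names are assumptions chosen for illustration.

```python
import math
import random

def poisson_pmf(x, rate):
    """P(N = x) for a Poisson distribution with the given average rate."""
    return math.exp(-rate) * rate ** x / math.factorial(x)

def poisson_sample(rate, rng):
    """Knuth's method: count uniform draws until their product falls to e^-rate or below."""
    limit = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

rng = random.Random(42)
events_per_step = [poisson_sample(3.0, rng) for _ in range(10_000)]
mean = sum(events_per_step) / len(events_per_step)  # close to the rate, 3.0
```

Each entry of `events_per_step` is the number of user events (joins, queries, and so on) to generate in that step.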

Event Processing (cont’d)
• World Views
  – Event Scheduling
    • Maintain a sorted list of all future events
    • Advance simulation time to first future event
    • Process and remove first future event
  – Activity Scanning
    • Scan every entity (peer) in the simulation, and test for any pending events
    • Simulation time is incremented by a fixed time step
  – Comparison
    • Event scheduling more appropriate for networks
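The event scheduling loop can be sketched in Python. A binary heap (`heapq`) stands in for the sorted list here; it yields events in the same time order. The class name and the example events are hypothetical.

```python
import heapq
import itertools

class EventScheduler:
    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._ids = itertools.count()  # tie-breaker for events at equal times

    def schedule(self, time, action):
        heapq.heappush(self._queue, (time, next(self._ids), action))

    def run(self):
        while self._queue:
            # Remove the first future event and jump straight to its time.
            time, _, action = heapq.heappop(self._queue)
            self.now = time
            action(self)

log = []

def reply(sim):
    log.append(("reply", sim.now))

def join(sim):
    log.append(("join", sim.now))
    sim.schedule(sim.now + 1.0, reply)  # processing an event may create new ones

def query(sim):
    log.append(("query", sim.now))

sim = EventScheduler()
sim.schedule(5.0, query)
sim.schedule(1.5, join)
sim.run()
```

Simulation time jumps from 1.5 to 2.5 to 5.0 with no work done in between, whereas activity scanning would visit every peer at every fixed step.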

Modeling Content
• Content Repository
  – Stores content objects; peers only store references.
• User Preferences and Behaviour
  – Peers (users) are typically only interested in a subset of available content.
  – Group the content in the repository into categories.
    • Categories, and content within categories, are ranked by popularity
  – Model each peer’s interest level in each category.
    • Affects what content they possess, and what they search for.
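One possible way to turn interest levels and popularity ranks into concrete choices is sketched below. The Zipf-style weighting (weight 1 / rank) is an assumption of this example; the slides say only that content is ranked by popularity. All names are hypothetical.

```python
import random

def zipf_weights(count, exponent=1.0):
    """Assumed popularity curve: rank r has weight 1 / r**exponent (rank 1 is most popular)."""
    return [1.0 / rank ** exponent for rank in range(1, count + 1)]

def pick_content(interest, items_per_category, rng):
    """Return a (category, rank) reference into the shared content repository.

    interest: category -> this peer's interest level (non-negative weight).
    """
    categories = list(interest)
    category = rng.choices(categories, weights=[interest[c] for c in categories])[0]
    ranks = list(range(1, items_per_category + 1))
    rank = rng.choices(ranks, weights=zipf_weights(items_per_category))[0]
    return category, rank

rng = random.Random(3)
peer_interest = {"rock": 1.0, "jazz": 0.0}  # this peer never wants jazz
picks = [pick_content(peer_interest, 10, rng) for _ in range(5000)]
```

The same function could drive both what a peer initially possesses and what it later searches for; the peer keeps only the `(category, rank)` reference, not the content object.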

Conclusion
• Have discussed:
  – Peer-to-Peer Protocols – Classifications
  – Network Topology – Representation & Generation
  – Communication – Bandwidth Models
  – Event Processing – World Views & Generation
  – Modeling Content – Categorization & User Interests
• Our simulator is being developed as a plug-in for the Eclipse platform (www.eclipse.org)
  – The source code will be available soon (Eclipse is open source)

