The 5th International Conference on Internet Computing (IC 2004)
Towards a Peer-to-Peer Simulator
Dwight Deugo, Jon Harris
School of Computer Science, Carleton University, Ottawa, Canada
Contents
• Introduction & Motivation
• Peer-to-Peer Protocols
• Network Topology
• Communication
• Event Processing
• Modeling Content
• Conclusion
Introduction & Motivation
• What is peer-to-peer?
• Why simulate peer-to-peer networks?
• Limitations of existing peer-to-peer simulators
• Our primary goals
  – Scalability (simulate a large number of peers)
  – Support and compare existing and future protocols
Peer-to-Peer Protocols
• Centralized vs. Decentralized
  – How to coordinate communication between peers?
    • Centralized protocols use one or more central servers.
    • Decentralized protocols rely only on peers.
  [Figure: example networks of servers (S) and peers (P)]
  – Centralized protocols suffer from a central point of failure (e.g., the legal plight of Napster).
  – Most new protocols of interest are decentralized.
Peer-to-Peer Protocols (cont’d)
• Unstructured vs. Structured
  – Unstructured protocols (BF)
    • Need a general notion of what you’re looking for.
    • Peers ‘guess’ who to route the search to.
    • No guarantee that a correct match will be found.
  – Structured protocols (DHT)
    • Need to know exactly what you’re looking for (key).
    • Peers know exactly who to route the search to.
    • A match is guaranteed if the searched data exists.
  – Needle vs. the haystack
  [Figure: example networks of peers (P)]
Network Topologies
• Two different levels
  – Protocol level & transport level
• Representation
  – Use graph theory: peers = nodes, connections = edges.
• Adjacency Matrix
  – Store an N x N matrix.
  – If matrix[i, j] != NIL, then node i is connected to node j.
• Edge List
  – Store a list of N nodes.
  – Each node stores a list of edges.
• Comparison
  – The edge list is the most prevalent representation.
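The two representations above can be sketched side by side. This is a minimal illustration, not the simulator's code: the helper names are hypothetical, NIL is modeled as `None`, and the graph is assumed to be undirected. One plausible reason the edge list is preferred is memory: the matrix always holds N x N cells, while the edge list grows only with the number of connections.

```python
# Hypothetical helper names; NIL is modeled as None.

def make_adjacency_matrix(n):
    """N x N matrix; matrix[i][j] is None (NIL) when i and j are not connected."""
    return [[None] * n for _ in range(n)]


def make_edge_list(n):
    """List of N nodes; each node stores its own list of edges (neighbour ids)."""
    return [[] for _ in range(n)]


def connect_matrix(matrix, i, j, weight=1):
    # Undirected connection: mark both directions.
    matrix[i][j] = weight
    matrix[j][i] = weight


def connect_edge_list(nodes, i, j):
    # Undirected connection: each endpoint records the other.
    nodes[i].append(j)
    nodes[j].append(i)


matrix = make_adjacency_matrix(4)
nodes = make_edge_list(4)
for a, b in [(0, 1), (1, 2), (2, 3)]:
    connect_matrix(matrix, a, b)
    connect_edge_list(nodes, a, b)

matrix_cells = sum(len(row) for row in matrix)      # always N * N
edge_entries = sum(len(edges) for edges in nodes)   # 2 * number of edges
print(matrix_cells, edge_entries)
```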
Network Topologies (cont’d)
• Structural Models
  – Useful for ‘bootstrapping’ the simulation
  – Power Law Graphs
    • Few highly connected nodes, many low connected nodes
    • Which protocols evolve power law topologies?
    • Generation
      – Incremental Growth
      – Preferential Attachment
      – Topology Generator Tools
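Incremental growth with preferential attachment can be sketched as follows. This is an illustrative generator under stated assumptions, not the authors' implementation: the function name, the fully connected starting core, and the parameters `n` (final node count) and `m` (edges added per new node) are all choices made for this example.

```python
import random


def preferential_attachment(n, m, seed=None):
    """Grow a graph one node at a time (incremental growth). Each new node
    attaches to m distinct existing nodes, chosen with probability
    proportional to their current degree (preferential attachment)."""
    if m < 1 or n <= m:
        raise ValueError("need n > m >= 1")
    rng = random.Random(seed)
    edges = {i: set() for i in range(n)}
    endpoints = []  # every node appears once per incident edge

    # Assumption: start from a fully connected core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            edges[i].add(j)
            edges[j].add(i)
            endpoints += [i, j]

    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            # Picking a random edge endpoint favours high-degree nodes.
            targets.add(rng.choice(endpoints))
        for target in targets:
            edges[new].add(target)
            edges[target].add(new)
            endpoints += [new, target]
    return edges


graph = preferential_attachment(200, 2, seed=1)
degrees = sorted((len(neighbours) for neighbours in graph.values()), reverse=True)
print(degrees[:5], degrees[-5:])
```

Sampling from the list of edge endpoints is a common shortcut: a node with degree d appears d times in the list, so a uniform pick is degree-proportional.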
Network Topologies (cont’d)
• Structural Models
  – Small World Graphs
    • Clustering Coefficient & Characteristic Path Length
    • 6 Degrees of Separation: it’s a small world
    • Generate via rewiring of a regular graph
    [Figure: rewiring of a regular graph; axis label: Increasing randomness]
  – Structured protocols usually create a fixed structural model.
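Generating a small world graph by rewiring a regular graph can be sketched as below, in the spirit of the Watts and Strogatz procedure. The function names, the ring lattice starting point, and the rewiring probability `p` are assumptions of this example; `p = 0` leaves the regular graph untouched and larger values increase randomness.

```python
import random


def ring_lattice(n, k):
    """Regular graph: each node is linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k // 2 + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, p, seed=None):
    """With probability p, replace each original edge (i, j) by an edge from i
    to a randomly chosen node that i is not already connected to."""
    rng = random.Random(seed)
    nodes = list(adj)
    original = [(i, j) for i in adj for j in adj[i] if i < j]
    for i, j in original:
        if rng.random() >= p:
            continue
        candidates = [c for c in nodes if c != i and c not in adj[i]]
        if not candidates:
            continue  # node i is already connected to everyone
        new = rng.choice(candidates)
        adj[i].discard(j)
        adj[j].discard(i)
        adj[i].add(new)
        adj[new].add(i)
    return adj


small_world = rewire(ring_lattice(20, 4), p=0.1, seed=7)
print(sum(len(v) for v in small_world.values()) // 2)  # edge count is preserved
```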
Network Topologies (cont’d)
• Evolution
  – Peer arrivals and departures
    • Requires efficient operations to add/remove nodes to/from graphs
  – Changes to a peer’s neighbourhood
    • Requires efficient operations to add/remove edges to/from nodes
    • Occurs more frequently than arrival/departure
  – Communication between peers
    • Requires efficient operations to add/remove edges to/from nodes
    • Occurs extremely frequently
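One way to meet these requirements is to keep the edge list as a dictionary of neighbour sets, so that adding or removing an edge is a constant-time operation on average. The class below is a hypothetical sketch (the name `OverlayGraph` and its methods are not taken from the simulator) and assumes undirected connections.

```python
class OverlayGraph:
    """Hypothetical dynamic graph: peers are nodes, connections are edges."""

    def __init__(self):
        self._neighbours = {}  # peer id -> set of neighbouring peer ids

    def add_peer(self, peer):
        # Peer arrival: O(1) on average.
        self._neighbours.setdefault(peer, set())

    def remove_peer(self, peer):
        # Peer departure: cost proportional to the peer's degree.
        for other in self._neighbours.pop(peer, set()):
            self._neighbours[other].discard(peer)

    def connect(self, a, b):
        # Neighbourhood change: O(1) on average.
        if a == b:
            raise ValueError("self-connections are not modeled")
        self.add_peer(a)
        self.add_peer(b)
        self._neighbours[a].add(b)
        self._neighbours[b].add(a)

    def disconnect(self, a, b):
        if a in self._neighbours:
            self._neighbours[a].discard(b)
        if b in self._neighbours:
            self._neighbours[b].discard(a)

    def neighbours(self, peer):
        return frozenset(self._neighbours.get(peer, ()))

    def __len__(self):
        return len(self._neighbours)


overlay = OverlayGraph()
overlay.connect("p1", "p2")
overlay.connect("p2", "p3")
overlay.remove_peer("p2")  # p1 and p3 stay in the graph, now isolated
print(len(overlay), overlay.neighbours("p1"))
```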
Communication
• Overlay Networks
  – Peer-to-peer networks are overlay networks superimposed onto the Internet (or another communication mechanism).
  – Storing the overlay instead of the underlying network topology requires significantly less memory.
  – If we only store the overlay network topology, can we accurately model the bandwidth usage and latency?
Communication (cont’d)
• Bandwidth Models
  – Define how long a message of size X takes to travel between peer A and peer B (in simulation time).
  – Traditionally modeled using packet-level simulators such as ns2.
  – Adding or removing a message affects the bandwidth of other messages at the sender and receiver.
    • Propagation can affect the bandwidth of every other message!
Communication (cont’d)
• Bottleneck Bandwidth
  – The link with the lowest bandwidth along a message’s route
  – Usually the ‘last-hop’ link in peer-to-peer networks
• Latency
  – Delay of message transmission
  – Caused by propagation, physical network, data processing
  – Interacts with bandwidth to determine message transmission time
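As a rough illustration of how latency and bottleneck bandwidth might combine, the sketch below assumes a simple additive model: transmission time = latency + size / bottleneck bandwidth. The slides do not state this formula; the function names and units (bytes, bytes per second, seconds) are assumptions of the example.

```python
def bottleneck_bandwidth(link_bandwidths):
    """The lowest bandwidth among the links on a message's route."""
    return min(link_bandwidths)


def transmission_time(size, link_bandwidths, latency):
    """Assumed additive model: fixed delay plus time to push the message
    through the slowest link. Units: bytes, bytes/second, seconds."""
    return latency + size / bottleneck_bandwidth(link_bandwidths)


# A 1000-byte message over three links; the 10 bytes/s link dominates.
print(transmission_time(1000, [100, 10, 50], latency=0.5))
```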
Communication (cont’d)
• Min-Equal-Share Bandwidth Model
  – Each peer has a maximum bandwidth.
  – When peer A sends a message m to peer B:
    • Determine the bandwidth of m as the minimum of:
      – (peer A’s maximum bandwidth) / (number of messages at peer A)
      – (peer B’s maximum bandwidth) / (number of messages at peer B)
  [Figure: numeric example of the min-equal-share bandwidth model]
  – Can be extended for asymmetric upload/download bandwidth
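The min-equal-share rule can be written down directly. In the sketch below, the function names and the example figures are made up for illustration, and the message counts are assumed to include every message a peer is currently sending or receiving, the new one included.

```python
from collections import Counter


def min_equal_share(max_bw_a, messages_at_a, max_bw_b, messages_at_b):
    """Bandwidth of one message between peers A and B: the smaller of the two
    peers' equal shares."""
    return min(max_bw_a / messages_at_a, max_bw_b / messages_at_b)


def allocate(max_bandwidth, messages):
    """Recompute the bandwidth of every in-flight message.

    max_bandwidth: peer id -> maximum bandwidth
    messages: list of (source, destination) pairs
    """
    load = Counter()
    for source, destination in messages:
        # Assumption: a message counts at both its sender and its receiver.
        load[source] += 1
        load[destination] += 1
    return [
        min_equal_share(max_bandwidth[s], load[s], max_bandwidth[d], load[d])
        for s, d in messages
    ]


peers = {"A": 12, "B": 8, "C": 4}
rates = allocate(peers, [("A", "B"), ("A", "C"), ("B", "C")])
print(rates)  # [4.0, 2.0, 2.0]
```

Because every share depends on the message counts at both endpoints, adding or removing a single message changes the result for other messages, which is why the whole allocation is recomputed here.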
Communication (cont’d)
• Transporting Messages
  – Method calls instead of sockets
    • RMI can be used to simulate in parallel/distributed environments
  – Messages are represented by communication events.
    • Communication events store fields such as:
      – Start & expected time
      – Source & destination peer
      – Bandwidth and latency
        » May be derived from source/destination peers
      – Payload (protocol specific)
  – Large-scale simulation imposes constraints on the size of communication events.
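A compact communication event and delivery by method call might look like the sketch below. The class and field names are hypothetical. To keep each event small, bandwidth and latency are not stored here (the slides note they may be derived from the peers), and `__slots__` is used to avoid a per-object attribute dictionary.

```python
class Peer:
    """Hypothetical peer that receives messages through a plain method call."""

    def __init__(self, peer_id):
        self.peer_id = peer_id
        self.inbox = []

    def receive(self, event):
        self.inbox.append(event)


class CommunicationEvent:
    """One message in flight. __slots__ keeps the per-event footprint small."""

    __slots__ = ("start_time", "expected_time", "source", "destination", "payload")

    def __init__(self, start_time, expected_time, source, destination, payload=None):
        self.start_time = start_time        # when sending began
        self.expected_time = expected_time  # when delivery is expected
        self.source = source
        self.destination = destination
        self.payload = payload              # protocol specific

    def deliver(self):
        # A method call stands in for a socket write.
        self.destination.receive(self)


alice, bob = Peer("alice"), Peer("bob")
event = CommunicationEvent(0.0, 1.5, alice, bob, payload={"query": "song"})
event.deliver()
print(len(bob.inbox), bob.inbox[0].payload)
```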
Event Processing
• Generating User Events
  – Examples: Join, Depart, Query, Transfer Content
  – In real life, these events occur sporadically.
  – A Poisson process can randomly determine the number of events to generate at each time step.
    • Average rate is a parameter
  [Figure: Poisson distribution, P(N = X) plotted against X for X = 0 to 9]
• Creating Communication Events
  – Left up to the protocol
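Drawing the number of user events for one time step from a Poisson distribution can be sketched with Knuth's multiplication method, which needs only the standard library. The function name and the example rate are assumptions; this method suits small average rates, since `math.exp(-rate)` underflows for very large ones.

```python
import math
import random


def poisson(rate, rng=random):
    """Sample the number of events in one time step (Knuth's method).

    rate: average number of events per time step (the model parameter).
    rng:  any object with a random() method returning floats in [0, 1).
    """
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(42)
joins_per_step = [poisson(3.0, rng) for _ in range(10)]
print(joins_per_step)  # e.g. how many Join events to generate in each step
```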
Event Processing (cont’d)
• World Views
  – Event Scheduling
    • Maintain a sorted list of all future events
    • Advance simulation time to the first future event
    • Process and remove the first future event
  – Activity Scanning
    • Scan every entity (peer) in the simulation, and test for any pending events
    • Simulation time is incremented by a fixed time step
  – Comparison
    • Event scheduling is more appropriate for networks
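The event scheduling world view can be sketched with a priority queue. The scheduler below is a hypothetical minimal version: it uses a binary heap rather than a fully sorted list (an implementation choice of this example), and a counter breaks ties so that events with equal times run in insertion order.

```python
import heapq
import itertools


class EventScheduler:
    """Hypothetical event-scheduling loop over a heap of future events."""

    def __init__(self):
        self.now = 0.0
        self._queue = []                  # heap of (time, sequence, action)
        self._sequence = itertools.count()

    def schedule(self, time, action):
        if time < self.now:
            raise ValueError("cannot schedule an event in the past")
        heapq.heappush(self._queue, (time, next(self._sequence), action))

    def run(self, until=float("inf")):
        while self._queue and self._queue[0][0] <= until:
            # Advance simulation time to the first future event,
            # then process and remove it.
            time, _, action = heapq.heappop(self._queue)
            self.now = time
            action(self)


log = []
scheduler = EventScheduler()
scheduler.schedule(5.0, lambda s: log.append(("depart", s.now)))
scheduler.schedule(1.0, lambda s: log.append(("join", s.now)))
# An event may schedule follow-up events, e.g. a query two time units later.
scheduler.schedule(2.0, lambda s: s.schedule(s.now + 2.0, lambda t: log.append(("query", t.now))))
scheduler.run()
print(log)  # [('join', 1.0), ('query', 4.0), ('depart', 5.0)]
```

Simulation time jumps straight from one event to the next, so idle periods cost nothing; activity scanning would instead step through every fixed time increment and test every peer.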
Modeling Content
• Content Repository
  – Stores content objects; peers only store references.
• User Preferences and Behaviour
  – Peers (users) are typically only interested in a subset of available content.
  – Group the content in the repository into categories.
    • Categories, and content within categories, are ranked by popularity.
  – Model each peer’s interest level in each category.
    • Affects what content they possess, and what they search for.
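A content model along these lines could be sketched as follows. Everything named here is hypothetical, and the Zipf-style weighting (popularity proportional to 1 / rank) is an assumption of this example; the slides say only that content is ranked by popularity. A peer first picks a category according to its interest levels, then picks content within that category according to rank.

```python
import random


def zipf_weights(n, exponent=1.0):
    """Assumed popularity curve: weight of the item at rank r is 1 / r**exponent."""
    return [1.0 / rank ** exponent for rank in range(1, n + 1)]


class ContentRepository:
    """Holds the content objects; peers keep only references (ids here)."""

    def __init__(self, categories):
        # category name -> list of content ids, most popular first
        self.categories = categories

    def pick(self, interests, rng):
        """Choose (category, content id) for a peer.

        interests: category name -> non-negative interest level; at least one
                   level must be positive.
        """
        names = list(self.categories)
        levels = [interests.get(name, 0.0) for name in names]
        category = rng.choices(names, weights=levels)[0]
        items = self.categories[category]
        content = rng.choices(items, weights=zipf_weights(len(items)))[0]
        return category, content


repository = ContentRepository({
    "music": ["m1", "m2", "m3", "m4", "m5"],
    "video": ["v1", "v2", "v3"],
})
rng = random.Random(0)
peer_interests = {"music": 0.8, "video": 0.2}
library = {repository.pick(peer_interests, rng) for _ in range(10)}
print(sorted(library))  # references to the content this peer possesses
```

The same `pick` call could drive both what a peer initially possesses and what it later searches for.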
Conclusion
• Have discussed:
  – Peer-to-Peer Protocols: Classifications
  – Network Topology: Representation & Generation
  – Communication: Bandwidth Models
  – Event Processing: World Views & Generation
  – Modeling Content: Categorization & User Interests
• Our simulator is being developed as a plug-in for the Eclipse platform (www.eclipse.org)
  – The source code will be available soon (Eclipse is open source)