Gryphon: A Dynamically Tailorable Mechanism for Customizing Location and Caching Policies in Distributed Object Subsystems

by

ADAM JONATHAN GRIFF

B.A., Lehigh University, 1991
M.S., Lehigh University, 1992

A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Department of Computer Science

2000

This thesis entitled: Gryphon: A Dynamically Tailorable Mechanism for Customizing Location and Caching Policies in Distributed Object Subsystems written by Adam Jonathan Griff has been approved for the Department of Computer Science

Chair of Committee

Committee Member

Date

The final copy of this thesis has been examined by the signatories, and we find that both the content and the form meet acceptable presentation standards of scholarly work in the above-mentioned discipline.

Griff, Adam Jonathan (Ph.D., Computer Science)

Gryphon: A Dynamically Tailorable Mechanism for Customizing Location and Caching Policies in Distributed Object Subsystems

Dissertation directed by Professor Gary J. Nutt

Many of today’s software applications are of a distributed nature and are being designed and implemented by general programmers. Distributed object models are well suited to these applications and provide a layer of abstraction to the developer. By creating an object interface layer to the application’s distribution layer, developers can use a framework for creating distributed systems without needing detailed knowledge of the distribution layer mechanisms. Many distributed object systems are designed with location and update transparency as a means of sheltering the developer from implementation details.

In this dissertation, mechanisms have been identified, designed, implemented, and analyzed that enable transparent distributed object subsystems to utilize application-level hints in order to dynamically customize object location and caching policies. A distributed object management system has been built as a proof-of-concept system to study the performance of explicit location and caching policies. The methodology for tailoring object location and caching policies in the subsystem operates at an abstract level using Gryphon agents, which provide an auxiliary interface to existing distributed object subsystems, thereby eliminating the developer’s need to tune performance at the low-level distribution layer. Run-time policy decisions are performed at each client utilizing the local Gryphon agents, resulting in bandwidth and latency efficiency.
In the best case, applications using a Gryphon-enhanced subsystem reduce message traffic to a fraction of a percent of that generated by unenhanced subsystems and eliminate access latencies to objects that exist or are cached locally; in the worst case, performance is no worse than with the unenhanced subsystem.

Dedication

This thesis is dedicated to anyone with a learning disability. Though you often must struggle through stormy seas where others find smooth sailing, perseverance generates the skills necessary to surmount obstacles encountered in subsequent journeys. May your every endeavor give you the fortitude to confront life’s challenges, and the maturity to recognize which mountains are yours to climb.

Acknowledgments

This dissertation was not accomplished alone but with the help and support of friends and family. Mommieeee! - provider of emotional support and encouragement. You taught me how to keep a good balance of activities in order to prevent burnout. You worked with me in the early years when my attention span sent me under the table and over the chair...literally. And, thank you for all of your help with the final editing of this dissertation. Dad - host into a world of fine dining, wine tasting, nautical expeditions, worldly vacations, and literary enrichment. You were always there to remind me that the “real world” was anxiously waiting to greet me. Grandma Mickey (Mrs. J) - my inspiration. I have learned from your love of life. You were a few generations ahead of your time. Around you, no one would ever say, “I'm bored,” because that would earn them your standard response of, “Nobody's bored unless they are a bore themselves.” You live on in all the lives you have touched. Scott and Stefanie - my connections to the business world when I was lost in the academic world. Little Brother, you gave me all the abuse (and then some!) that I needed early in life to prepare me for the abuse that was to come. Stefanie, thanks for taking Scott off my hands and helping him grow up into a proper adult. Aunt Maxine, Jeff, Uncle B., and Linda - my wonderful support team. Let's not forget the great fun of all of those Thanksgivings and family reunions. You all listened

to me think out loud and provided me with suggestions without being judgmental. BJ, your advice and guidance keep coming and coming. Dr. Jeff, you made me more of a man, since the major difference between men and boys is the price of their toys...you also did your best to bring out the GQ in AJ. Linda, you guided the beginning of my musical aspirations. And, Aunt Maxine, you always have innovative ideas for curing all that ails me. Aunt Barbara - my favorite other point of view. You showed me that it is never too late to start over, pick a new path, follow your new dreams, and have the tenacity to face all adversities. Bob, thank you for being there for me and Aunt B. The Cohen Gang - the best extended family a guy could wish for...you are all a bit overeducated, but I guess I am, too (hee, hee, hee). So many doctors in one family: Dr. Dave, Dr. Rick, Dr. Julius, Dr. Brently, and Rose - the Master Supreme and Dr.-to-be. I always have a great time at the summer reunions. Rose and Rick, you provided great fun during all of our special visits. And, Julius, thanks for making sure we can always indulge in the full spectrum of cultural cuisine: from authentically ordered Chinese food to late night tuna sandwiches. Dr. Zulah and Dr. Carlos - creators of great times throughout all the years of graduate school (except, of course, for my final year of struggle, during which you abandoned me!). Zulah, you supported me and helped me to learn CU’s system. You also reminded me not to take anything too seriously and to shut up (i.e., stop my incessant whining). You two knew that beer, wine, and food could solve or temporarily vanquish most dragons. Dr. Reini, Dr. Heinzi, Milen, Noah, Matt, Doppke, and Vollmar - bearers of German beer and German-style fun. Give a beer to the obstacle that it might go away or tip over. That would be “some kine'ta guut,” right Heinz? What number quote is that, Reini? Thanks for all those drunken orgies and almost-sober evenings at the symphony.
From hot tubs to the theater to the Oasis and hitting “Rock Bottom” (the

brew pub that is). Let’s not forget the sumptuous dinner parties featuring the combined talents of CS chefs. Kudos to Doppke and Vollmar for having the courage to leave for new challenges and say, “Enough with graduate school!” Dr. Brandt, Mr. Bob, Andre, Antonio, and Laura - my comrades in the crazy world of CU CS. You were my sounding boards, ready to listen to my raw thoughts and help me transform them into coherent ideas. Anshu and Suvas - Dr. Jekyll and Dr. Hyde (you two can fight over who is who). Thanks to you two characters for entertaining me with comedies and tragedies worthy of the Bard. I look forward to your future musings on the role of the extended family in facilitating the love lives of computer scientists. Adina - my good friend, even though you can be a pain in the tuchis, you are always there for me through the good times and bad times. Thank you for sharing Sasha and letting me be the “joint custody dad.” And, thanks for introducing me to the crazy world of deviant sociologists. What would parties be without naked Twister, whips and chains, and, of course, exotic and erotic Halloweens?!? Myron, Susan, and Jose - my extended, non-genetic family. Myron and Susan, you always expressed interest in my progress and my passions. Thanks for forgiving my inability to keep your daughter from the clutches of evil. Jose, thank you for being there for Adina after she escaped it and for being open to a friendly “Ex.” J-man and Karen - providers of sound advice and delicious dinners. You two helped me to feel like Boulder was my home. Thanks for always looking out for me. Your home was a place for me to be with non-university “real people” and grow in non-academic ways. Sasha - my furry daughter. You give me unconditional affection, what a doggie! Brian, Tim, and the rest of the UGGS gang - my multi-disciplinary friends. It was always a treat to get away from the world of computers and enjoy the diversity of

CU. Anne - my sexiest dance instructor. Your great love and support carried me through many a stressful day. Thanks for teaching me all of those “special” dances. You also introduced me to the rest of Colorado outside of the “People's Republic of Boulder.” Chris, Betsy, David Capps, and especially David Dorfmann - you all gave me the opportunity to dance and explore myself when I needed it most. Monika - my trusty editor. Thanks for your time and help, oh Goddess of the Pink Palace. Linda, Vicki, and the CSops Gang - thanks for making the whole process as enjoyable as it could be. Finally, thanks to my committee: Gary Nutt, Skip Ellis, Dirk Grunwald, Dennis Heimbigner, and Jintae Lee. Gary, as the chair of my committee, I appreciated your leadership. Dennis, I greatly appreciate the energy you put into giving me feedback along the way. To all of you, your critiques, comments, and signatures were invaluable to me ;) And, thanks to the FrameMaker program, which enabled painless editing and formatting. Let's not forget the great color-coded diff feature, which helped me through the final mile of this dissertation.

Contents

1 Introduction .......... 1
    Objects and Distributed Systems .......... 1
    Example Application Domains .......... 2
    Network Transparency and Performance Bottlenecks .......... 3
    Using Application Knowledge to Influence Distributed Object Policy .......... 4
        Object Location .......... 5
        Object Caching .......... 7
        Object Consistency Policies .......... 7
    The Gryphon Enhanced Subsystem using GAAOPs .......... 9
    Integration with Existing Subsystems .......... 11
    Contributions .......... 11
2 Related Work .......... 13
    Explicit Data Distribution .......... 13
    Distributed Shared Memory .......... 14
        Munin .......... 14
        Midway .......... 15
    Distributed Objects .......... 15
    Standards for Distributed Objects .......... 16
        CORBA .......... 17
        ANSA .......... 20
        ActiveX .......... 21
    Specialized Languages and Operating Systems for Distributed Objects .......... 21
    Virtual Reality Systems' use of Distributed Objects .......... 23
        RhoVeR .......... 25
        DIVE .......... 25
        RING .......... 26
        AVIARY .......... 26
        Black Sun Community Server .......... 27
        MPEG-4 .......... 28
    General Systems Addressing Performance in Distributed Objects .......... 28
        COBS Project .......... 29
        Globe Project .......... 30
        MinORB Object Caching System .......... 32
        Coign .......... 34
3 System Design .......... 36
    Architectural Overview of Gryphon .......... 37
    Distributed Object Model based on Implementation Transparency .......... 37
        Caching .......... 38
        Location .......... 39
    Spectrum of Systems Identified .......... 39
    Gryphon Design for Enhancing Subsystems .......... 42
        Minimum ORB Features Required to Support Gryphon .......... 43
        Designing the Agent .......... 44
    Creating Objects for the System .......... 46
    Application Layer .......... 48
    Interaction between Gryphon and Agent .......... 49
    Update Techniques Available to the Gryphon .......... 50
4 Details Specific to the System Prototype Implementation .......... 53
    The Gryphon-GAAOP Interface .......... 53
    Implementing the GAAOP for DOM .......... 54
    Implementing the Gryphon on top of DOM .......... 55
        Push Changes .......... 56
        Pull Changes .......... 57
        Server Processing Details .......... 58
    Application Layer .......... 60
        Object Utilization .......... 60
        Utilizing the Gryphon .......... 61
    Gryphon Layer .......... 62
        Data propagation in a single threaded environment .......... 65
        Modifying an object .......... 66
        Details of the Gryphon Implementation .......... 67
            Class VprConnection methods used at the Gryphon level .......... 68
            Class DistObjMan methods used at the Gryphon level .......... 69
            Class GAAOP methods .......... 70
            The base hintObject class variables .......... 71
            The base stateObject variables .......... 72
5 Analysis of the System .......... 73
    System Load Scenarios .......... 74
    Analytic Model .......... 76
    Measuring System Performance .......... 78
        Trace Analysis Tools and Techniques .......... 78
        Virtual Clocks .......... 79
        Using Virtual Clocks .......... 80
6 Results .......... 82
    Gryphon Subsystem Overhead .......... 82
        Space Overhead .......... 82
        Time Overhead .......... 84
    Analyzing the Architectures .......... 85
    Analyzing the Application Scenarios .......... 91
    Validation using Gryphon Enabled DOM .......... 93
    Stochastic Load Representation .......... 96
    Metrics for Comparing Gryphon to Other Subsystems .......... 102
    Summary of Results .......... 102
7 Conclusion and Future Work .......... 104
    Infrastructure .......... 104
    Applications using Gryphon .......... 105
    Conclusion .......... 106
    Future Work .......... 107
        Throttle Clocks .......... 108
        Using Domain Knowledge .......... 109
Appendix A DOPICL - Trace Generation .......... 110
    Trace Format Needed .......... 110
    Network Issues .......... 112
    PICL for Distributed Clocks .......... 113
        A Total Ordering of Events is not Possible using Distributed System Clocks .......... 114
        A Total Ordering using Logical Clocks .......... 115
        Utilizing Modified System Time .......... 117
    PICL for Objects .......... 119
    DOPICL Format .......... 120
        Processing DOPICL Traces .......... 123
    Using DOPICL Traces on Gryphon .......... 125
        Analyze and verify the applications and the analytical model .......... 125
        Visualization of the traces .......... 126
Appendix B DOM - Subsystem Design used by the Gryphon .......... 129
    Communication Layer .......... 129
        Steps involved in using a Session .......... 129
        Interactions Management of Distributed Objects .......... 131
        Detailed Implementation Issues of the GOM/LOM .......... 134
        The World_Objects Class .......... 137
    Socket Layer Issues .......... 140
        Matching Sends and Receives .......... 141
    Distributed Object Layer .......... 141
        Object Instantiation .......... 141
        Object Class Creation .......... 142
        Marshalling Requests and Data .......... 143
        Application Termination .......... 143
Appendix C Details of Applications Built on DOM .......... 144
    Session Manager and Object Manager .......... 144
    Scenario Application .......... 145
    VPR .......... 148
    Floaters .......... 149
    3D Tic-Tac-Toe .......... 150
    MPEG Client-Server .......... 150
Appendix D Abstract Auxiliary Interface Hints .......... 151
    High-Level Hints .......... 151
    System-Level Hints .......... 155
    Example of Auxiliary Interface Implementation and Utilization .......... 156
Appendix E Example Distributed Application .......... 159
    Header Information .......... 159
    A Create Callback .......... 159
    A Modify Callback .......... 160
    A Delete Callback .......... 160
    Creation .......... 161
    Modification .......... 161
    Termination .......... 162
    Initialization and Processing .......... 162
References .......... 164

Tables

1. System 1 .......... 40
2. System 2 .......... 40
3. System 3 .......... 41
4. System 4 .......... 41
5. System 5 .......... 42
6. Scenario Performance .......... 92
7. DOPICL Events Body .......... 121
8. PICL Events Body .......... 122

Figures

1. Object References .......... 6
2. Gryphon subsystem Layers .......... 10
3. CORBA architecture .......... 17
4. IDL processing, implementation installation, and the resulting generation .......... 19
5. Method Call Dynamics .......... 19
6. ORB with the Gryphon .......... 44
7. Distributed Object/Attribute Client Utilization .......... 64
8. DOM subsystem hierarchy of Classes .......... 67
9. Systems in Section 3.3 varying number of processes (L) .......... 88
10. Systems in Section 3.3 varying number of modified objects at each process (M) .......... 89
11. Systems in Section 3.3 varying number of updates (U) .......... 90
12. Theoretic vs. Actual Mean Message Traffic using System 5 .......... 95
13. Scenario 1 with Poisson Distribution with mean of 5 objects and ... .......... 99
14. Scenario 2 with Poisson Distribution with mean of 100 objects and ... .......... 100
15. Scenario 3 with Poisson Distribution with mean of 20 objects and ... .......... 101
16. Automated Decision Architecture using DQM .......... 109
17. DOPICL Architecture .......... 116
18. Example snapshot view of LC in generated events .......... 117
19. DOM Architecture for Distributed Objects .......... 130
20. GOM-LOM Architecture .......... 133
21. Distributed Object Manager .......... 134
22. The VPR Organization .......... 148
23. Simple use of the Users hint .......... 157
24. More Complex use of the Users hint .......... 158

Chapter 1

Introduction

1.1 Objects and Distributed Systems

Groupware applications are utilizing the distributed-object model for its level of abstraction, as described in the report from the first international workshop on Object Oriented Groupware Platforms [VVT98]. By using the distributed-object model, designers intend to generate reusable code and minimize the specialized knowledge required of the application programmer [CE95][GHJ+95][Krue92]. The distributed-object model builds on the object-oriented (OO) model, whose programming languages take the abstract data types and functions of procedural languages and wrap them in a class comprised of data and method calls [Kay93][Tama97]. The tight coupling between state (data) and actions (methods) allows implementation details to be hidden from the user of an object instance, which is instantiated from a class. When using a distributed object, the programmer needs to be aware of the interface of the object, but not of the way the object maintains its internal state, performs its functions, or how the logical object physically exists. By issuing method calls on the object, the client can use the specified interface to access the object’s functionality. As long as the abstract interface remains unchanged, the object can be treated as a black box whose underlying implementation can be radically modified without affecting the user of the object [Kicz96].

System developers using the OO model have improved software maintainability and longevity over the procedural model, since the performance of an object can be enhanced without additional changes to the system, provided the object interface remains backward compatible [SS96]. Changes to objects can be made in the form of implementation changes and the infusion of new technology to improve an object’s performance. For these reasons, the current trend is for systems, like groupware applications, to be written in OO programming languages.

The rationale for distributed systems spans the spectrum from distributing data for multi-user interactive applications [EGR91] to distributing tasks among heterogeneous workstations to improve resource utilization by making use of underutilized resources [JLH+88]. It was a natural progression to bring distributed technologies into the OO framework, because the developer gained the knowledge and tools to generate distributed applications without needing to be an expert in sockets, concurrency control, and other low-level protocols such as distributed shared memory and TCP/IP. Rather, the developer needs to learn only the distributed object packages’ high-level interfaces, which provide the distribution services and enable the developer to solve immediate problems without first having to solve low-level networking details for a heterogeneous set of systems [HV99]. For OO languages such as Smalltalk, Ada, Java, and C++, a diverse set of distributed object interface standards (ActiveX [Chap97], CORBA [Li95], and ANSA [Scot97]) reduces the complexity the programmer normally encounters by hiding the implementation of object distribution. These distributed object interfaces focus on bridging the communication gap between machines and the heterogeneity issues between architectures, operating systems, and programming languages.
1.2 Example Application Domains

Object technology has caused an unmistakable evolution in the way programmers produce software. Current applications executing on a workstation may

be constructed using hundreds of objects, and each object can demand significant support from the hardware. For example, a virtual environment (VE) is typically defined as a collection of different objects, each having a graphic representation (e.g., in VRML) and an arbitrary behavior [BC97]. For workstations connected via contemporary networks with bandwidths of 1 Gbps (and more in the near future), it is natural for object-oriented applications to attempt to capitalize on distributed technology. For example, the VE can become a distributed virtual environment (DVE) whereby the objects that define the DVE are referenced from many different workstations. Other domains using distributed objects include object-oriented databases [Nørv99], finite element analysis [SPH+96], Boeing F-15 flight computers [Lach99], the Netscape web browser [Carr97], and a multitude of other areas [OMG99].

1.3 Network Transparency and Performance Bottlenecks

At the core of OO languages and distributed object interface standards is the transparency design model, which simplifies programming for application developers by abstracting away the low-level implementation details. The distributed object model does allow system developers to ignore the details of distribution, but at a performance cost. In these distributed object systems, the developer has no control over object location, distribution, caching, persistence, and consistency policies. Each application has different requirements for performance, and even a well-tuned subsystem, like an object request broker (ORB) that meets the Common Object Request Broker Architecture (CORBA), as specified by the Object Management Group (OMG) [OMG95], cannot satisfy all types of applications. For example, Electra, a CORBA implementation built on top of the communication subsystem ISIS [MS97], has been designed to provide error recovery, security, and reliability.
The result is a fault-tolerant, reliable, and secure tool for developing a distributed object banking system, though it has not been tuned for high-performance applications. Many applications have broad requirements and place constraints on different parts of the system, sometimes on a per-object basis. For example, Electra and my own Distributed Object Manager (DOM), before the Gryphon enhancements, were incapable of supporting highly interactive distributed systems with performance acceptable to the end user. The object-transparency model contributed to this performance inadequacy; the application developer needs a way to dynamically affect object-distribution policy on a per-object basis in order to produce more efficient programs.

1.4 Using Application Knowledge to Influence Distributed Object Policy

Current distributed object models such as CORBA, ANSA, and ActiveX with location transparency are unsatisfactory with respect to performance. A facility that subverts location transparency, Gryphon, is proposed as a means of improving performance. A per-process active agent [Morr98] is created that uses application-specified hints to influence object policies. These hints are mechanisms, available at the application interface level, for tuning and customizing distributed object policy. Proxies of the agent representing a process are propagated to remote processes, where these agent proxies act on behalf of the represented process. This type of interface enables application developers to make high-level suggestions regarding the location and caching policies of objects. The level of abstraction provided by hints and agents allows application developers to express their intended use and requirements of individual objects. The hints are processed by the Gryphon subsystem to decide where objects should reside, how they should be cached, the granularity (in time and space) of consistency updates, and object lifetime. The result is a new programming model for distributed objects that allows customization of each distributed object to suit application-specific utilization, thereby enhancing performance.
Application developers can initially ignore the enhancements, reducing the load on the developer and flattening the learning curve for the new distributed object programming model. The application developer can modify and add hints later, once the system is functioning and domain knowledge has been obtained. If no hints are provided, default settings are used, eliminating the burden on the programmer.

1.4.1 Object Location

Assume a system provides a CORBA interface to manage distributed objects within an application, i.e., remote objects are referenced using the interface specified by an interface definition language (IDL). CORBA explicitly addresses the possibility that the underlying object manager may be distributed, meaning that in a network environment an object may be physically located anywhere within the network. Since CORBA supports location transparency, a client needs only the interface to reference the object. The client is not permitted to know the location of an object; it can only reference the object through the interface, at which point the underlying ORB will locate and reference the object. The system's location policy defines the ORB's strategy for placing objects at various locations in the network. The ORB is free to choose any location for an object, assuming it can still provide an object reference. Ultimately, ORBs are expected to become sufficiently sophisticated (and networks sufficiently fast) that any inefficiencies in remote references are inconsequential. Today, however, latencies resulting from remote object references dominate the performance bottleneck of a distributed object application. Section 6.2 illustrates how the increased message traffic produced by augmenting the quantity of distributed objects in an application may render some applications infeasible. Since CORBA provides and enforces location transparency, its location policy is determined by the ORB implementation or the system administrator. Neither the ORB designer nor the system administrator is likely to know, a priori, the reference pattern for any object, since that pattern is entirely determined by the way the application uses the object. Applications use individual objects in diverse ways, and therefore per-object policies may need to be specified.
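To make this concrete, the sketch below shows one way such per-object hints might be expressed. All names here (Agent, set_hint, the hint keys and default values) are hypothetical illustrations for exposition, not the actual Gryphon interface:

```python
# Hypothetical sketch of a per-object hint interface; the class, method,
# and hint names are illustrative, not the actual Gryphon API.

class Agent:
    """A per-process agent that records application-supplied hints."""
    DEFAULTS = {"location": "server", "caching": "none",
                "consistency": "sequential"}

    def __init__(self):
        self._hints = {}  # object id -> hint dictionary

    def set_hint(self, obj_id, **hints):
        # Hints can be added or changed at any point in the object's lifetime.
        self._hints.setdefault(obj_id, dict(self.DEFAULTS)).update(hints)

    def policy_for(self, obj_id):
        # Unhinted objects fall back to safe defaults, so an application
        # can ignore the hint interface entirely.
        return self._hints.get(obj_id, dict(self.DEFAULTS))

agent = Agent()
# Walls and floors in a DVE never change state: replicate and cache read-only.
agent.set_hint("wall-17", location="replicated", caching="read_only")
# A strictly shared object: one copy on a server, sequential consistency.
agent.set_hint("account-3", location="server", consistency="sequential")
print(agent.policy_for("wall-17")["caching"])    # read_only
print(agent.policy_for("unhinted-object"))       # the safe defaults
```

The essential property is the last line: an object that was never hinted still receives a complete, conventional policy.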

The application software and developer are good sources of information regarding the location and caching policies for objects in a distributed object management system. Therefore, I advocate an approach in which the application can influence the location and caching policies by suggesting object locations and caching policies through hints supplied to the agents.

[Figure 1: Object References (diagram omitted). The figure shows two VPR processes, each with a graphics engine and a Local Object Manager comprising an object interaction manager, a surrogate object store manager, and an object store, connected to Global Object Managers with their own object stores. Objects u, v, r, s, t, t', x, and y appear as local objects, remote objects, references to remote objects, and surrogate (cached) copies.]

Figure 1 represents various location policies available to an application to reduce performance bottlenecks and eliminate latencies. Some objects (like objects u and v) are private to an application but managed by a Local Object Manager (LOM). Other objects are accessed by many processes and require that strict consistency be maintained; these objects (like r and s in Figure 1) should be kept on one server process (a Global Object Manager, or GOM), with all references to the object being remote references. Other objects in the figure, such as x and y, are placed at a process

that dominates their utilization, yet they can still be referenced from other processes. The implementation of the subsystem determines whether access to x and y is direct or through an intermediate process server.

1.4.2 Object Caching

In a DVE-type application, a large number of objects might never change state throughout their lifetime, e.g., walls and floors. If such objects are repeatedly referenced from each of the client processes (e.g., to determine their VRML representation and room positioning), there will be considerable wasted network traffic and service requests on the corresponding object storage process. In distributed systems, this type of problem is typically handled by caching the read-only objects and distributing copies to each client process that wishes to read them, i.e., such objects are cached at the client processes. Cached objects (such as t and the cached copy t' in Figure 1) have the original object stored on the process serving the object and maintain copies on the client processes. Caching an object can result in a large performance improvement, since remote calls (~1,000 per second on 10 Mb Ethernet) are several orders of magnitude slower than local function calls (~1,000,000 per second) [HV99]. Caching is not as straightforward when method calls are not read-only and result in modifications.

1.4.3 Object Consistency Policies

In a strong consistency memory model, caching becomes difficult on object updates, because cached copies must be invalidated until they are made consistent. For some objects, sequential consistency is clearly necessary, while for others relaxed forms of consistency are appropriate. For a description of relaxed consistency techniques used in distributed shared memory (DSM), see Adve and Gharachorloo's tutorial [AG95]. Briot et al. [BGL98] show that uniformly applying sequential consistency leads to a large performance cost for a small effective gain

while uniformly applying relaxed forms of consistency may cause costly race conditions on objects that need fine-grained sharing support. Therefore, consistency policy should be selected on a per-object basis. The analysis of system 3 in Section 6.2 demonstrates that, even with the desired object placement, many applications are unacceptably slow. To improve performance, caching must be an option. This option can be implemented without input from the developer, but the consistency strategies must then be kept to standard accepted strategies. The ability to specify non-traditional consistency algorithms tailored to specialized application domains gives the developer a new level of flexibility not available in the transparent distributed object model. Using hints, the developer can create objects whose data is not only replicated and cached but also governed by an unorthodox, application-specific consistency policy. Copies of the data do not necessarily need to be synchronized on every write. For example, an object may be modified frequently while the state of some cached copies is updated only every nth write. In a DVE, the visual representation of a user in a VE (an avatar) can move smoothly around the local view of the room by performing fine-grained movements while sending only a few updates per second to the rest of the network. These strategies are application-domain specific and cannot be implemented without developer input, since the unorthodox behavior generates unexpected and even incorrect results if the developer is unaware of these update policies. The performance analysis in Section 6.2 shows that a DVE with location transparency and without caching is not feasible due to the quantity of message traffic and the response latency. All objects must be read for each screen refresh of a DVE (creating enormous amounts of message traffic in addition to the latency), making it impossible to maintain an acceptable frame rate.
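The every-nth-write policy described above can be captured in a few lines. The following is an illustrative sketch, not Gryphon's implementation; the class name, the propagate callback, and the choice n = 10 are all assumptions made for the example:

```python
# Sketch of an "update remote copies every nth write" consistency policy.
# Hypothetical names; the propagation callback stands in for the network.

class NthUpdateObject:
    """Applies every write locally, propagates only every nth write."""

    def __init__(self, n, propagate):
        self.n = n
        self.propagate = propagate   # callback that updates remote caches
        self.state = None
        self.writes = 0

    def write(self, new_state):
        self.state = new_state       # the local view is always current
        self.writes += 1
        if self.writes % self.n == 0:
            self.propagate(new_state)  # remote caches see one in n updates

sent = []
avatar = NthUpdateObject(n=10, propagate=sent.append)
for step in range(30):               # 30 fine-grained avatar movements
    avatar.write(("x", step))
print(len(sent))                     # 3: only every 10th position was sent
```

Remote viewers see a slightly stale avatar position, which the application has declared acceptable; the network carries a tenth of the traffic.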
Significant performance improvements can be achieved when the developer specifies additional information to aid the subsystem in determining object placement and caching policy selection,

including non-traditional update policies.

1.5 The Gryphon Enhanced Subsystem Using GAAOPs

The focus of this research is to enable the development of efficient distributed applications using distributed object location and update policies specified by the developer. I designed and implemented the DOM and used it to support remote object references in a prototypical DVE. The DOM design is consistent with Figure 1: it has a LOM component and a GOM component. To experiment with Gryphon, the DOM has been enhanced to demonstrate the feasibility of the approach introduced in Section 1.4. The resulting system provides a test bed for observing system performance under different workloads. Processes participating in a distributed application pass hints to the Gryphon subsystem. These hints enable the Gryphon Active Agent Object Process (GAAOP) within the LOM to tailor distributed object policies to an application's requirements. An application can dynamically modify hints, and these hints are evaluated at run time, enabling objects to meet new requirements during their lifetime. Each process that constitutes the application acts as both a client and a server, and each process has its own GAAOP. The GAAOP makes copies of itself (called proxies) and propagates these proxies to all the other processes involved in the distributed application. These GAAOP proxies make decisions on behalf of their representative process while located at remote processes. Figure 2 illustrates the Gryphon enhanced ORB and the role of the subsystem enhancements. For example, Client Process 2 contacts its own GAAOP to set hints regarding objects in the ORB. Client Process 2 sends requests to object stubs, which are processed by the Gryphon enhanced layer. Get requests require the help of the local GAAOP to determine the status of the local copy and to decide if the request needs to be propagated remotely.
If the local copy is not valid, a remote request is sent. This request is processed by the Gryphon layer on a remote process, utilizing the GAAOP

on that remote process. If the client portion of the process performs set operations, the local GAAOP is queried to determine whether the object exists locally for modification; otherwise, the request is sent remotely to the holder of the object. In the case where the object exists locally, the Gryphon layer communicates with all the GAAOPs to determine which processes require updates of their cached copies of the object. When data is propagated, the GAAOP representing a remote process returns the object with the data to be propagated to that process. This technique enables the GAAOP to customize full or partial updates. The GAAOP design enables object technology to switch from a black box to an open implementation with auxiliary interfaces [OIG97], thereby enabling developers to parametrically influence object optimization using placement and update strategy information.
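The get/set decision flow just described can be sketched as follows. This is a simplified illustration: GryphonLayer, GAAOPProxy, is_valid, and wants_update are hypothetical names standing in for the real interfaces, and the base ORB is modeled as a recording callback:

```python
# Illustrative sketch of the Gryphon layer's get/set decisions; all class
# and method names are assumptions, not the actual Gryphon interfaces.

class GAAOPProxy:
    """Minimal stand-in for a GAAOP proxy: validity and update interest."""
    def __init__(self, valid=True, subscribed=()):
        self.valid = valid
        self.subscribed = set(subscribed)
    def is_valid(self, obj_id):
        return self.valid
    def wants_update(self, obj_id):
        return obj_id in self.subscribed

class GryphonLayer:
    def __init__(self, gaaops, remote):
        self.gaaops = gaaops   # "local" plus one proxy per remote process
        self.remote = remote   # stand-in for a base-ORB remote invocation
        self.local = {}        # locally held or cached objects

    def get(self, obj_id):
        # A get uses the local copy only if the local GAAOP says it is valid.
        if obj_id in self.local and self.gaaops["local"].is_valid(obj_id):
            return self.local[obj_id]
        return self.remote(("get", obj_id))

    def set(self, obj_id, value):
        # A set on a remotely held object is forwarded to its holder.
        if obj_id not in self.local:
            return self.remote(("set", obj_id, value))
        self.local[obj_id] = value
        # Otherwise, each remote GAAOP proxy decides whether its process
        # needs an update of its cached copy.
        for name, proxy in self.gaaops.items():
            if name != "local" and proxy.wants_update(obj_id):
                self.remote(("update", name, obj_id, value))

traffic = []
layer = GryphonLayer(
    gaaops={"local": GAAOPProxy(),
            "p2": GAAOPProxy(subscribed={"door"}),
            "p3": GAAOPProxy()},
    remote=lambda msg: traffic.append(msg))
layer.local["door"] = "closed"
layer.set("door", "open")   # only p2 caches "door", so one update message
print(traffic)              # [('update', 'p2', 'door', 'open')]
```

The point of the sketch is the final line: because the proxies answer locally, only the single process that actually caches the object receives an update message.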

[Figure 2: Gryphon subsystem layers (diagram omitted). The figure shows an application layer in which Client Process 2 issues Get and Set calls through object stubs (Obj Stub 1 through Obj Stub m); a Gryphon enhanced layer containing the GAAOPs (GAAOP1 through GAAOPn), which receive hints and are consulted on Get and Set operations; and the base ORB, which carries the remote method invocations on objects and surrogate (cached) objects.]

1.6 Integration with Existing Subsystems

As previously stated, distributed middleware such as CORBA, DCOM, and other distributed object interfaces suffers in performance due to the abstract level of the interface. The Gryphon system uses distributed middleware while enabling developers to specify additional information. With a Gryphon enhanced subsystem, the GAAOPs act as an auxiliary interface, allowing application-driven performance tuning. Using the Gryphon enhancements, developers can reduce the access-time overhead for their objects and reduce unnecessary updates in their applications. The GAAOPs are used by a Gryphon enhanced subsystem to optimize the object access implementation; distributed middleware implementations that do not use Gryphon simply ignore the GAAOPs. The Gryphon subsystem has default configurations enabling objects and applications written for general distributed object interface specifications to run on an enhanced system without modification.

1.7 Contributions

My research has contributed to the current technology in distributed objects by identifying techniques for reducing latency and message traffic. These techniques include utilizing active agents that receive application input affecting location and update policies. The prototype Gryphon enhanced subsystem is a proof of this concept. It has been designed and implemented to utilize a set of agents between the application and a conventional distributed object manager. The GAAOP uses application-specified hints in configuring a particular object's location and update policy. The Gryphon enhancements enable hints to make the DOM subsystem functionally equivalent to several different approaches for supporting distributed objects. Various distributed subsystem approaches can thus be run with the same applications to compare the performance of different distributed object policies.
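The functional equivalence claimed above can be illustrated with a toy classifier that maps a few hint combinations onto the classical strategies they reduce to. The hint names and category strings are hypothetical, chosen for exposition only:

```python
# Illustrative only: which well-known distribution approach a given hint
# combination reduces to. Hint names and labels are assumptions.

def equivalent_approach(hints):
    """Name the classical strategy that a hint combination reduces to."""
    loc = hints.get("location", "server")
    caching = hints.get("caching", "none")
    if loc == "server" and caching == "none":
        # The no-hint default: a fully transparent, server-held object.
        return "central server, all references remote"
    if caching == "read_only":
        return "replicated read-only caching"
    if loc == "client":
        return "object migrated to the dominating client"
    return "custom per-object policy"

print(equivalent_approach({}))                      # the transparent default
print(equivalent_approach({"caching": "read_only"}))
```

Because the empty hint set maps to the fully transparent configuration, the same application can be benchmarked unmodified under each policy regime.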
Mechanisms have been identified, designed, implemented, and analyzed to enable transparent distributed object subsystems to utilize application level hints in order to

dynamically customize object location and caching policies. The remainder of this document is organized as follows. Chapter 2, "Related Work," discusses related work ranging from low-level systems issues to high-level abstractions and application domains. Chapter 3, "System Design," discusses the general design and architecture of Gryphon enhanced subsystems for reducing latency and bandwidth. Chapter 4, "Details Specific to the System Prototype Implementation," presents implementation details of the enhancement of the DOM prototype with the Gryphon architecture described in the previous chapter. Chapter 5, "Analysis of the System," presents the technique used to evaluate the system. Chapter 6, "Results," presents the empirical benefits of using the Gryphon enhanced system. Chapter 7, "Conclusion and Future Work," summarizes the completed work and discusses some future work. Appendix A, "DOPICL - Trace Generation," describes the distributed object trace format and its use in debugging, design, and performance tuning. Appendix B, "DOM - Subsystem Design used by the Gryphon," details the distributed object subsystem. Appendix C, "Details of Applications Built on DOM," presents applications built using the Gryphon enhanced DOM and shows how such programs are executed. Appendix D, "Abstract Auxiliary Interface Hints," gives some possible abstract interfaces that can be built using Gryphon to provide another layer of abstraction. Appendix E, "Example Distributed Application," shows the code needed to generate an application for the DOM prototype.

Chapter 2

Related Work

This chapter presents research that forms a foundation for, or is related to, this dissertation. Work from the distributed shared memory, distributed object, and standards communities is described, followed by a discussion of current products and research in distributed, run-time tailorable objects for specialized application domains. Low-level systems are important because the targeted abstract distributed object systems can be built with a network sub-layer that utilizes these systems. It should be noted that the communication techniques discussed are limited to distributed heterogeneous systems and do not address data distribution techniques designed solely for multiprocessor machines (e.g., see [Rama96]). The domain work is relevant since this thesis addresses general techniques for improving performance.

2.1 Explicit Data Distribution

Systems have been developed with explicit data distribution; they provide a foundation for this research at a higher level of abstraction. These systems range from application-specific implementations used for basic communications to well-defined standards for message-passing programming environments. The Message Passing Interface (MPI) [DW95], Parallel Virtual Machine (PVM) [GBD+94], and Remote Procedure Calls (RPC) [SUN88] are examples of standards in use today. Another is the Horus subsystem [RBM96], comprised of protocol layers which can be combined into a large

number of different protocol stacks. Trade-offs among efficiency of communication, fault tolerance, load balancing, and security are all considerations when selecting protocol layers. The Horus communication system and the other protocols mentioned here can all be utilized as components to create a distributed application, to implement a distributed shared memory protocol, and to build other abstract programming interfaces. The survey by Skillicorn and Talia discusses classifications of these and other models for parallel and distributed computing [ST98].

2.2 Distributed Shared Memory

The following distributed shared memory (DSM) systems address issues relating to the development of efficient distributed systems for high-performance computing. DSMs attempt to hide the data distribution from the user by implementing a shared conventional memory that is not hindered in performance by the extra layers of abstraction associated with its OO counterparts. However, DSMs lack the OO user interface that makes this technology accessible to the general computing community and that supports multiple languages and hardware systems. The Munin and Midway systems described in the following subsections utilize application input to improve performance in ways similar to what Gryphon does for distributed objects.

2.2.1 Munin

Munin is designed to overcome shared memory limitations while maintaining ease of programming [BCZ90] through a distributed shared memory that accepts hints about anticipated access patterns, which affect the coherence mechanism at an object level. Munin is one of the first projects to address variable-sized caches in its update policy, which can be mapped directly to object-level caching. Delayed multiple object updates can be merged and sent as one network packet when required (i.e., at a synchronization point), thereby reducing the communication overhead related to message

headers. The programmer enables tuning by providing semantic hints about access patterns, which result in an object being categorized into one of the following types: Write-once, Private, Write-many, Result, Synchronization, Migratory, Producer-consumer, Read-mostly, and General read-write. An object can be re-categorized on the fly.

2.2.2 Midway

The goal of Midway is to enable the creation of efficient DSM programs with a reduced load on the programmer. The Midway system [BZS93] is designed for DSM multiprocessors and uses an entry consistency model, guaranteeing consistency at a processor when the synchronization object known to guard the data is acquired. The system is written in C and requires the programmer to explicitly associate the synchronization object with the data it guards. Like Munin, Midway supports multiple memory consistency models.

2.3 Distributed Objects

The area of distributed objects is an expansive field with many diverse research projects and production systems. For example, some systems use OO languages with objects for the application interface and then use DSM, PVM, or some other distribution mechanism in the subsystem to distribute the data contained in the objects. Some distributed object research involves standards for data distribution (examples are given later in this chapter), while other work is concerned with implementation details for efficient data distribution. See Briot et al. [BGL98] and Chin et al. [CC91] for surveys on distributed object issues and Lewandowski [Lewa98] for a survey on client-server frameworks, which include distributed object standards. Ahamad and Smith (1994) [AS94] have identified mutual-consistency requirements for shared objects. By determining the location of an object and the shared repositories where it is stored, the object can be cached where it is frequently

accessed, reducing data access latency and the associated communication overhead. Ahamad et al.'s research uses causal consistency (a form of weak consistency) to regulate consistency maintenance of the object copies and is crucial in developing techniques and strategies for more efficient objects. This knowledge can be incorporated into Gryphon enhanced distribution subsystems, allowing the application programmer and the application to supply information that is then mapped onto the available technology. Barth et al. [BFF+99] utilize their WINNER resource manager with a CORBA name server to provide load distribution for distributed scientific computing applications. Load balancing is achieved without modifying the CORBA interface and while preserving location transparency. The name server is a central server that attempts to balance load using workstations' base speeds and current loads. Objects are replicated at multiple sites, with a unique reference assigned to each copy. Applications request the reference to an object via the name server, which responds with the reference to the object copy identified by WINNER. Unlike Gryphon, this system focuses on reducing CPU utilization, not message traffic. Rabinovich et al. [RRR99] have created an automated mirroring system which, unlike WINNER, addresses both load and bandwidth utilization. This system creates mirrored copies of objects, and the application sends requests to a redirector node. The redirector node forwards each request to the mirror copy of the object that will maximize performance. The goal in this system is to reduce the response message distance, since the request message is assumed to be relatively small. The algorithms in this system and in WINNER could be used to automatically set the environment variables that Gryphon could then use to aid in location and caching policy decisions.
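The redirection decision described above can be sketched as a simple cost minimization. The cost model below (response-path distance plus a load term) is an illustrative assumption for exposition, not the published algorithm of either system:

```python
# Toy mirror-selection sketch: pick the replica minimizing an estimated
# cost. The cost model and the 0.5 load weight are assumptions.

def choose_mirror(mirrors, client):
    """mirrors: dicts with 'name', 'load', and 'distance' (hops per client)."""
    def cost(m):
        # Responses are large relative to requests, so the response-path
        # distance dominates; current load breaks near-ties.
        return m["distance"][client] + 0.5 * m["load"]
    return min(mirrors, key=cost)["name"]

mirrors = [
    {"name": "east", "load": 8, "distance": {"nyc": 2, "sfo": 9}},
    {"name": "west", "load": 2, "distance": {"nyc": 9, "sfo": 2}},
]
print(choose_mirror(mirrors, "nyc"))   # east
print(choose_mirror(mirrors, "sfo"))   # west
```

A decision of exactly this shape, computed automatically, is what could feed Gryphon's location and caching hints in place of manual tuning.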
2.4 Standards for Distributed Objects

Many standards, including the three previously mentioned, have been developed to accommodate distributed computing at the object level. These standards strive to

create a transparent interface that makes local and remote objects appear the same to the application programmer, thereby reducing the dependency on the programmer. Performance can be a drawback of the black box transparency interface compared with an open implementation [OIG97][FBC+98]. This thesis shows the performance gains of an open implementation with auxiliary interfaces for location and update policy. CORBA is presented in detail, as it is the standard with the most documentation and the interface originally intended for use in this thesis, while ANSA and ActiveX are mentioned only briefly as the competing alternatives. The article by Tanenbaum, Chodhry, and Hughes [TCH97] gives examples of how the CORBA standard is being used in the internet domain.

2.4.1 CORBA

CORBA 2.0 is designed for heterogeneous hardware architectures, OSs, and programming languages and provides an infrastructure for creating heterogeneous distributed applications. The OMG Interface Definition Language (IDL) C++ mapping is used to generate C++ interfaces between object stubs and the actual object implementation.

[Figure 3: CORBA architecture (diagram omitted). The figure shows a client invoking through the Dynamic Invocation Interface or static IDL stubs, the ORB interface (identical for all ORB implementations), and the IDL skeleton and Dynamic Skeleton Interface leading to the object implementation via an object adapter; there may be multiple object adapters, and there are stubs and a skeleton for each IDL object type. Two ORB cores communicate over IIOP, and the ORB-dependent interface is marked.]

The CORBA Internet Inter-ORB Protocol (IIOP) allows one more

type of heterogeneity: objects in an application running on one vendor's ORB can be accessed at run time by another application running on a different vendor's ORB. An object method call falls into one of four categories of ORB implementation:

1. The object implementation resides in the same process as the caller. The resulting call is mapped into a local library call.

2. The object implementation is on the same processor, but in a different process. In this case an OS call occurs to access the object.

3. Similar to the previous type, except the object is on a different processor. The result is a network call to the remote ORB.

4. The call is between different ORB implementations that cannot communicate using an ORB-specific implementation protocol and instead use IIOP as a common communication protocol.

Figure 3 shows the CORBA architecture and the use of IIOP to communicate with another ORB. The ORB Core is the communication subsystem and has its own interface to remote sites. Other interfaces are then exported to the application, resulting in a uniform CORBA interface, irrespective of the specific implementation. Ensemble is an example of an ORB Core, with Electra exporting its CORBA compliant application programming interface (API) to the application [MS97]. In order for an application to access a CORBA object, a call is made through either the Dynamic Invocation Interface or the IDL stubs interface. If the object code in the application is generated at compile time from an IDL definition, as seen in the IDL compile process in Figure 4, then the IDL stubs interface is used. The call is then processed by the subsystem in one of the four manners previously described, and emerges at the other end of the system via the static IDL skeleton or the Dynamic Skeleton Interface, based on the same selection process as the original call. The call reaches the object implementation, and the resulting effects of the method are passed

back through the subsystem to the caller.

[Figure 4: IDL processing, implementation installation, and the resulting generation (diagram omitted). The figure shows IDL definitions passing through the IDL C++ compiler to produce client stub C++ classes and server skeleton C++ classes; the interface is placed in the Interface Repository, the implementation is registered in the Implementation Repository, and the client and object implementation are built from the generated classes.]

Compiling IDL using the C++ mapping, as seen in Figure 4, results in the generation of C++ files containing client stubs and server skeletons, and also places them in their respective repositories for dynamic invocation. The generated stub and implementation interfaces contain the variables and method calls to be used in the object class. This code also contains everything necessary to marshal and unmarshal calls between the stub and the actual object implementation, as seen in Figure 5.

[Figure 5: Method Call Dynamics (diagram omitted). A client method call enters a stub method, is marshaled by the ORB run time at the surrogate object, travels to the server, is unmarshaled by the ORB run time at the server object, and reaches the method body through the server skeleton.]

The IDL-generated code is the interface that shelters the application programmer from the ORB-dependent interface. Therefore, the creator of the class

must write the actual object implementation code in the same manner in which standard object code is implemented. It should be noted that the hierarchical structure of the CORBA model lends itself to optimizations and enhancements at various levels. For example, Reverbel and Maccabe [RM97] added persistent objects to CORBA implementations where the code was unavailable. By adding new layers on top of the Object Adapter, the new features can be made available cleanly and transparently. Narasimhan, Moser, and Melliar-Smith [NMM97] added fault tolerance to an existing ORB by exploiting the IIOP interface to intercept messages, adding features that were not part of the product's implementation or its design. By using open systems like Electra and Ensemble, access to every level of the system is achieved, in addition to the access techniques mentioned above. Abdul-Fatah et al. [AM98] have conducted performance analyses of ORBs using three different architectures, called the Handle-Driven ORB (H-ORB), the Forwarding ORB (F-ORB), and the Process Planner ORB (P-ORB), based on VisiBroker, a product by Inprise/Corel (formerly ORBeline by PostModern Technologies) [Visi99]. Their work shows the impact of inter-node delays, message size, and request service times on the latency and scalability of these architectures. Trade-offs were found between latency and scalability of the different ORB implementations based on workload and application-specific utilization. These issues are addressed in this dissertation in the form of caching and location optimizations.

2.4.2 ANSA

It is difficult to discuss the Advanced Network Systems Architecture (ANSA) separately from CORBA because they are extremely similar, differing mostly in implementation [Scot97]. The CORBA standard was created after the ANSA standard and incorporated knowledge from the prior design. The ANSA standard is based on the C language and does not have many of the advantages of the OO design of

CORBA with its C++-like interface. ANSA has a transparent mechanism to support object migration and object replication, a nucleus to hide the underlying heterogeneity of the hardware architectures, and many of the other features found in CORBA [Li95]. ANSA had a strong following in Europe while CORBA was being developed in the USA. ANSA and CORBA are both open designs (which is preferred in the research community), while ActiveX is a closed system with commercial support, yielding an advantage in industry.

2.4.3 ActiveX

The Distributed Component Object Model (DCOM) is technology marketed by Microsoft under the name ActiveX [Chap97]. ActiveX technologies enable programs to share libraries and objects distributed across a network. A somewhat biased comparison was conducted by the OMG between ActiveX and CORBA [OMG96], but legitimate points were made regarding the proprietary nature of ActiveX, the specific OS required, and the lack of a documented standard. It is therefore difficult to implement an ActiveX subsystem without access to proprietary Microsoft software and interfaces. An ActiveX system would be a closed system that only works on Microsoft Windows platforms and would be of little use to the open research community. It should be noted that CORBA applications can be used with ActiveX, since the underlying DCOM technology is interoperable via a DCOM/CORBA interface.

2.5 Specialized Languages and Operating Systems for Distributed Objects

Many languages and operating systems (OSs) have been designed with distributed technology in mind, and the most relevant are discussed in this section. Project SIRAC [BAB+96] addresses the creation of distributed applications for real-time interaction using multiple workstations, utilizing Olan, a language designed for run-time support of distributed applications. Brown and Najork (1996) [BN96] created
distributed active objects (known as Oblets) written in Obliq, a language that facilitates the distribution of objects over the World Wide Web (WWW) by providing distribution primitives. Emerald is another example of an object-based language and system designed for distributed programming [JLH+88]. Emerald provides a global name space for locating objects and invoking methods on them, and also manages each object's internal data requirements. These features are common to many systems, but Emerald deliberately violates the location transparency model to improve performance: language primitives are provided to enable programmers to specify object location. The primitives include fixing an object at a location, querying an object's current location, toggling object migration on and off, and moving an object to be co-located with another object. The related PRESTO library does not create a new language but instead provides an environment for writing object-oriented parallel programs for shared-memory multiprocessors using C++ [BLV88]. This library provides primitives including threads for concurrency and locking mechanisms for synchronization. The Aster system creates a configuration-based development environment using a declarative language and a set of tools [IBS98]. Aster provides automated configuration of middleware that can be built using existing CORBA ORBs to provide non-functional properties (e.g., fault tolerance, security). Hassen, Athanasiu, and Bal (1996) [HAB96] developed a programming model on top of the Amoeba distributed OS. Using features of Amoeba together with the Hawk and Orca run-time system (RTS), objects can be distributed. This RTS supports three different representation schemes: single-copy, replication, and partition. By using the specialized features of the Amoeba OS, the RTS can create efficient objects. The general UNIX-style OS has been implemented as a timesharing system with fixed strategies for process scheduling and blocking.
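The Emerald-style location primitives described above can be illustrated with a small sketch. The class and method names below are invented for illustration (Emerald is its own language, not Python); the sketch only shows the shape of the four primitives: pinning an object at a location, querying its location, toggling migration, and co-locating it with another object.

```python
# Hypothetical sketch of Emerald-style location primitives.
# Names (Node, DistObject, fix, unfix, attach) are illustrative only.

class Node:
    """A host in the distributed system."""
    def __init__(self, name):
        self.name = name

class DistObject:
    def __init__(self, node):
        self._node = node        # current host node
        self._fixed = False      # whether migration is disabled

    def locate(self):
        """Query the object's current location."""
        return self._node

    def fix(self, node):
        """Pin the object at a node, disabling further migration."""
        self._node = node
        self._fixed = True

    def unfix(self):
        """Re-enable migration."""
        self._fixed = False

    def move(self, node):
        """Move the object to a node, if migration is enabled."""
        if not self._fixed:
            self._node = node

    def attach(self, other):
        """Co-locate this object with another object."""
        self.move(other.locate())

alpha, beta = Node("alpha"), Node("beta")
obj = DistObject(alpha)
obj.move(beta)               # migrates to beta
obj.fix(alpha)               # pinned at alpha
obj.move(beta)               # ignored: the object is fixed
assert obj.locate() is alpha

helper = DistObject(beta)
helper.attach(obj)           # co-locate helper with obj
assert helper.locate() is alpha
```

The key design point is that location control is exposed through a handful of explicit primitives rather than through the transparent model, which is exactly the trade-off Emerald accepts for performance.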
In general, their data throughput cannot efficiently or reasonably support some of today's applications, and researchers are working on
developing OSs that support the new tasks being presented. Another solution is to utilize a micro-kernel (e.g., Mach [ABB+86], Spring [BSP+95], Chorus [BGG+91], Exokernel [EKO95]) and other specially designed servers tuned for VEs that allow a higher degree of communication between the OS and the application, and on-the-fly tuning of the OS. These types of OSs allow applications to request resources and specify the quality of service (QoS) they require. The technology from the Chorus project, which includes its COOL language, has been used to generate a CORBA ORB that addresses performance issues on embedded systems and PC platforms [CHOR97]. The Gryphon system gives the CORBA user a way to communicate with the subsystem to specify object location and can make use of the specialized features that arise in new or enhanced OSs. Using programming languages specially designed for data distribution has its pros and cons. The result can be a more efficient system and better tools for developing specialized applications than general-purpose programming languages provide. The drawback is that these languages often require the programmer to learn a new syntax and semantics. Also, these specialized languages often do not support all the features necessary to create large, diverse applications, so programmers end up learning a different language for each type of specialized task. See Thorn for a discussion of additional languages designed to accommodate distributed objects [Thor97].

2.6 Virtual Reality Systems' Use of Distributed Objects

This section presents some DVE projects and describes how they address the efficiency problems associated with creating a usable system. Social computing has existed for almost 20 years in the form of text-based virtual shared environments called multi-user dungeons (MUDs) and, later, MUD object-oriented environments (MOOs).
In 1978, Roy Trubshaw and Richard Bartle at Essex University created the first MUD, written in MACRO-10 assembly language on the PDP-10. In 1990, the first
real MOO, LambdaMOO, was created by Pavel Curtis at Xerox PARC [Burk95]. Virtual Reality (VR) in its early forms was used in military simulations, but in an effort to be more pragmatic the field has shifted to an improved representation dubbed the Virtual Environment (VE) [BC97]. VEs then merged with MOOs to form distributed VEs (DVEs), which are currently used in applications ranging from games to groupware. A question often raised is whether this technology can actually be useful. Kouzes, Myers, and Wulf (1996) discuss collaboratories and the fundamental psychosocial questions that must be answered for this technology to be accepted [KMW96]. They argue that in order to gain acceptance, the technology needs to support current modes of communication in addition to the new modes being devised. Thomas et al. address issues of multi-user interaction and visualization of conflict [TSK98]. Their DVE prototype system produces visual cues to notify the user of the application that conflicts have arisen over an object: when a user attempts to move an object that cannot be moved, or that is being moved by another user, a stretching visual alteration is applied to the object. Distributed objects research is well suited to DVEs, and the DOM system used in this thesis has been applied to the VPR system. [NAB+95] gives a detailed description of the types of applications the VPR runs and the requirements these applications place on the system. These applications place requirements on the system at all levels, for example disk access, network bandwidth, application computation, and graphic rendering. The VPR project addresses many of these problems, but this research focuses on the object distribution aspect. Roehle (1997) [Roeh97] summarizes many communication issues revolving around DVEs and discusses some solutions currently in practice.
One solution, dead reckoning, raises the data being transmitted to a higher level of abstraction, thereby reducing the message traffic. For example, instead of sending a large number of updates
for an object in motion, the initial position and velocity vector can be transmitted, allowing the client to calculate the current position. Another problem is synchronizing updates to an object from multiple sources. Solutions to this problem include requesting permission from the object for each modification, locking the object, and attaching the object to the object requesting the modification (i.e., the user's avatar). The attachment concept allows the updates from one object to also cover the attached object. Filters can reduce unnecessary information by culling updates from objects which are too distant to be visible or are occluded from view. Groups can be formed to allow multi-casts of object updates to all interested parties, a central concept in the E/E subsystem that was originally considered as a base platform for this thesis. None of the projects mentioned in this section suffer from location transparency issues, because these systems have interfaces designed specifically to run virtual reality applications and have been optimized for these specialized applications.

2.6.1 RhoVeR

The RhoVeR [Rhod97] system is a VE system built on a low-level communication system. Data distribution occurs using a message-passing and shared-memory model; RhoVeR is an example of a system that uses a standard OS but whose software components are entirely designed from the ground up for a specialized purpose. The system should work efficiently since it is designed for that purpose. The goal of this thesis is to generate distributed objects that can be used to create systems with efficiency on par with these specialized systems but with the advantages of being abstract and able to utilize the latest OS and communication technologies as they arise.

2.6.2 DIVE

The Distributed Interactive Virtual Environment (DIVE) is another multi-user
system [BF93] dealing with the issue of interaction in virtual space, where objects are responsible for their own interactions within the environment. An object has a focus (how aware it is of other objects) and a nimbus (how aware other objects are of it). These criteria help throttle the quantity of information propagated. This system does not attempt to create distributed objects but instead passes events around the system. By understanding application semantics, the information can be used to restrict state update propagation to the subset of participants which have expressed interest; interest is expressed by having participants register for messages by event type. The events give the semblance of distributed objects. Benford et al. also identify that strict consistency between replicas is difficult in a real-time system and that inconsistencies can and must be tolerated to maintain rapid response time. This is an older system that addresses some issues and proposes solutions for reducing data update traffic.

2.6.3 RING

The RING system is a DVE built on top of a multi-casting subsystem [Funk95] with a centralized server that provides communication between clients. This system is designed to reduce the amount of message traffic, using many of the visibility techniques summarized at the start of Section 2.6, "Virtual Reality Systems' Use of Distributed Objects" on page 23. The RING system shows that in a densely populated environment with much visual occlusion, only a very small fraction of the messages need to be propagated to each user. To reduce the problems associated with a central server, distributed servers were implemented and the data partitioned to generate better locality of reference.

2.6.4 AVIARY

The AVIARY project [WHH+92] at the University of Manchester, UK, created a virtual world with properties and laws acting on objects in the world. Objects have
interactions within the environment based on their properties (including mass and energy) and laws that constrain how these objects can act and interact with each other in the world (for example, gravitation and conservation of energy). This work attempts to model the real world and allows interaction within the space. Processes register for events, which results in event messages from AVIARY. The applications are responsible for processing each message and interpreting the events based on the current state of the world. This system is designed around the ParSiFal T-Rack, made up of 64 T800 Transputer processors, and is interesting in its application of real-world properties to the virtual environment. The current project at Manchester is called DEVA [PW97] and involves creating an operating environment with performance efficiencies geared towards VEs.

2.6.5 Black Sun Community Server

The Black Sun Community Server [Blac97] is a product developed to host a distributed community. Unlike systems with one server, clients in Black Sun need to connect to multiple servers to acquire all the services necessary to be a member of a DVE session. Currently there are four services: the ID Service (logs users in), the Motion Service (keeps track of users' positions and also calculates and updates nearest neighbors), the Text Service (maintains chat channels), and the Shared State Service (manages and distributes the state of shared objects in the environment). It should be noted that the current implementation of this system requires too much bandwidth to operate in 3D mode over a modem line [Blac97]. Black Sun has found that 85% of the traffic being generated results from updates the user receives regarding the positions of other avatars. In order to prevent the bandwidth requirement from growing and making the system unusable (even on a high-bandwidth line), an algorithm is used to keep the data traffic to individual users constant.
As the density of users increases, the visible range used for culling is reduced. This aids in scalability and results in a solution that may be acceptable in this specialized application domain.
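One plausible form of such a density-adaptive culling rule is sketched below. The formula is an assumption for illustration (the Black Sun documentation does not publish the algorithm): treating avatars as uniformly distributed, the radius is chosen so that the expected number of visible neighbors, and hence the per-user update traffic, stays roughly constant.

```python
import math

def visible_range(base_range, user_density, target_neighbors):
    """Shrink the culling radius as local avatar density rises so that the
    expected number of visible avatars stays roughly constant:
    density * pi * r^2 ~= target_neighbors. The closed form is an assumed
    illustration, not the actual Black Sun algorithm."""
    if user_density <= 0:
        return base_range                       # empty room: full range
    r = math.sqrt(target_neighbors / (math.pi * user_density))
    return min(base_range, r)                   # never exceed the base range

# Sparse room: culling radius stays at the base range.
# Crowded room: the radius shrinks, capping update traffic per user.
sparse = visible_range(100.0, 0.0001, 10)
dense = visible_range(100.0, 0.01, 10)
assert dense < sparse
```

The inverse-square-root dependence on density is what keeps the expected neighbor count (and thus the avatar-position update traffic) constant as a room fills up.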
It should also be noted that while this solution aids the client, the centralized servers can still become overloaded by the network traffic. In the case of Black Sun, a hard-coded update rate of three times per second is used to decrease the frequency of avatar position updates. Another solution, described by Holbrook (1995) [Holb95] in reference to the Distributed Interactive Simulation (DIS) project, is to vary the update rate based on distance from the object. This concept of a varying update rate is based on the level-of-detail model used in the graphics community. This system addresses many of the important issues necessary to generate a usable application in this specialized domain. All optimizations used in this system are hard coded, so users have no way to modify these rules on a per-object or even a global level. Bandwidth reduction techniques are limiting, but without them the Black Sun Community Server would not have taken its first step forward. The VPR system utilizes these same reduction techniques to improve performance.

2.6.6 MPEG-4

The MPEG-4 standard has many features in common with virtual environments [Koen99]. The standard includes audio and video components which have been broken down into individual objects. The Binary Format for Scenes (BIFS) language is used to add and delete objects and perform more complex functions, including changing visual and acoustic properties of objects and their interactions without affecting the actual objects. MPEG-4 has similarities to VRML but adds functionality for 2-D lines and rectangles and real-time streaming. MPEG-4 is another example of a current technology that utilizes distributed objects and can benefit from advances in object technology.

2.7 General Systems Addressing Performance in Distributed Objects

The previous section described many domain-specific techniques being used by the DVE community to improve the performance of distributed objects. This section
describes some other systems which address performance issues for use with general applications. Many of these systems, including Gryphon, perform caching; Liskov, in her work with Thor [Lisk99], agrees that data items are often small and that data shipping is a necessity for scalability.

2.7.1 COBS Project

The Configurable OBjectS (COBS) project at Georgia Institute of Technology is building a CORBA-compliant system to run on high-performance architectures [SA97]. The COBS research is directed towards high-performance machines (IBM SP2, CM-5, etc.) interconnected via high-speed networks (Myrinet, ATM, etc.). The project tailors the performance levels of the objects at run time based on the requirements of the application. Tailoring is achieved using attributes implemented to directly control system characteristics; the attributes include selecting object implementations; making objects passive, single-, or multi-threaded; fragmenting and replicating object state; using reliable, unreliable, or multi-cast protocols; and selecting compression and secure transmission protocols. To provide for efficient heterogeneous operation, a Portable Binary I/O package (PBIO) is used to create an abstract network layer. The goal of this project is to supply the application programmer with access to low-level, high-performance subsystem features. Unlike the general hints used in Gryphon, this project gives access to low-level specifics of the COBS CORBA implementation platform and therefore requires specialized knowledge of this platform by the developer. Attributes are associated with every object operation; each system component within the COBS architecture layers processes the attributes it has been designed to filter, and the remaining attributes are passed to the next layer.
This architecture does not limit or predefine the range of attributes; however, application programs using the COBS system must have a well-defined understanding of how the subsystem works in order to utilize the provided features. The attributes are assigned on a per-object basis at run time, but not on a per-client basis as in Gryphon.
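The attribute-flow scheme described above can be sketched as a chain of layers, each of which consumes the attributes it was designed to filter and passes the remainder down. This is a minimal illustration of the pattern, not the actual COBS code; the layer names and attribute keys are invented.

```python
# Sketch of COBS-style per-operation attribute filtering through layered
# components (layer names and attribute keys are illustrative only).

class Layer:
    def __init__(self, name, handled, next_layer=None):
        self.name = name
        self.handled = set(handled)   # attribute keys this layer filters
        self.next_layer = next_layer
        self.seen = {}                # attributes this layer has applied

    def process(self, attributes):
        mine = {k: v for k, v in attributes.items() if k in self.handled}
        rest = {k: v for k, v in attributes.items() if k not in self.handled}
        self.seen.update(mine)        # apply this layer's attributes
        if self.next_layer and rest:
            self.next_layer.process(rest)  # pass the remainder down

# Build a three-layer stack: invocation -> threading -> transport.
transport = Layer("transport", {"protocol", "compression"})
threading = Layer("threading", {"threads"}, transport)
invocation = Layer("invocation", {"replicas"}, threading)

# One operation carries attributes destined for different layers.
invocation.process({"replicas": 3, "threads": "multi",
                    "protocol": "multicast"})
assert invocation.seen == {"replicas": 3}
assert threading.seen == {"threads": "multi"}
assert transport.seen == {"protocol": "multicast"}
```

The open-ended attribute dictionary mirrors the point made above: the architecture does not predefine the attribute range, but the programmer must know which layer filters which key.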
Unlike Gryphon, the COBS project focuses on the transport layer and on hardware for efficient data propagation. Gryphon, with its abstract level of dealing with object location and update optimization, would utilize COBS in the same way as DSM. The COBS system provides low-level mechanisms which could be placed in a Gryphon-enhanced ORB and used by Gryphon to implement its abstract interface features. The COBS project terminated without releasing performance results, and no further information is available.

2.7.2 Globe Project

Globe is another system which hides distributed object location details from the developer and intends to provide a uniform interface complete with scalability, fault tolerance, and security [SHT97]. The goal of Steen et al.'s Globe research is to create distributed objects that scale to the World Wide Web [SHT99]. The Globe project is attempting to create a naming service for locating objects and a caching system which can scale to a billion hosts and an even greater number of distributed objects. Steen et al. identified that many applications utilize transmission control protocol (TCP) connections or develop specialized protocols for each application type on top of TCP (hypertext transfer protocol (HTTP) for web access and uniform resource locators (URLs) for the naming service, SSL for security, and mail and news protocols). The Globe project is attempting to develop a scalable middleware which hides the underlying transport protocols, and Globe creates yet another interface by enhancing CORBA IDL with additional features. The goal is to have a middleware which can be used for all applications; Globe claims that CORBA, DCE, the Andrew File System (AFS), and other middleware cannot be used for sophisticated applications due to their wired-in policies. Gryphon, COBS, MinORB, and others have shown that existing middleware interfaces can be enhanced to handle sophisticated proxies, refuting Globe's claim.
The Globe system attempts to create an interface for accessing objects which
can contain all types of data. One example is an object which represents a web page and all its associated components, including graphics, applets, subframes, etc. The web client contacts a Globe-enhanced web proxy which locates the nearest object and forwards the web page and associated components within the object to the web client. With a Globe-enhanced web client, the entire object would be sent to the client and separated into the necessary components. The result of this technique is application-specified modules for data co-location. The Globe objects can also be used to update news objects (within an online news application) using a publish/subscribe style of data replication. The main contribution of Globe is to identify novel uses of distributed objects and data composition. Similar to Gryphon and MinORB, Globe's premise is that objects need varying consistency policies specific to the application, as discussed in DSM work such as Munin. This work agrees with the finding that standards such as CORBA have not addressed replication and update strategies. Globe also finds that location transparency must be subverted at times for performance reasons. The Globe technique supplies a Java-style interface and is able to modify the underlying implementation to suit the application demands. A Globe object consists of subobjects which handle security, communication, replication, control, persistence, and other services. With Globe, an application program should be able to focus on the application and utilize various subobjects to tailor the implementation. To reduce the complexity of creating objects from subobjects, the Globe project is attempting to define an object definition language (ODL) from which the components can be specified for object construction. Globe suggests how an object can be constructed with special subobjects to suit the application, while Gryphon describes specifics of location and update policies.
In addition, Gryphon illustrates how run-time tuning and distributed decision making can be performed. Globe creates a new protocol and distributed object system allowing each
object to set its distribution policy. The Globe policy addresses lower-level issues and allows objects to be partitioned and distributed as subobjects. Unlike Globe, Gryphon can be applied to existing protocols such as CORBA without modifying the IDL. The result of the Globe design is an object composed of many subobjects with varying policies. This implementation produces large objects, while Gryphon uses agents external to the object, resulting in lighter-weight objects. In addition, the Gryphon system has the advantage that it supports policy specification by the application at run time using hints to the subsystem. Gryphon's GAAOP technique eliminates the overhead and special interface required when putting the decision components into each object. The Gryphon-enhanced DOM supports sub-updates of objects through manual generation of the objects, since existing IDLs do not support this feature.

2.7.3 MinORB Object Caching System

MinORB is a CORBA-like middleware simulator, developed by Martin et al., that addresses the issue of object replication using consistency specified by the application [MCC99]. Martin et al. identified the need for scalable, high-performance, constrained-latency middleware to support Intelligent Networks (IN), Call Control, World Wide Web, Web-based electronic commerce, dynamic documents, and Collaborative Virtual Reality (CVR) applications. Their research illustrates that architectural middleware standards, such as CORBA, automate repeated network programming tasks (object location, binding, and invocation), but conventional 'NoReplication' implementations of subsystems suffer from high network latency and do not scale to large numbers of objects. 'Full-Replication' is a solution but requires substantial resources, does not scale, and has difficulty supporting these new applications which, unlike their predecessors, are not read-dominated and therefore have cache consistency issues. Martin et al.
explore ‘Partial-Replication’ as an approach for addressing latency and scalability issues.
Necessary features were identified for future applications, including high invocation throughput for IN applications, which have 1,000 hosts performing 1,000 call attempts per second on each host; since each call could potentially result in 10 nested method invocations, the aggregate required throughput is 10,000,000 invocations per second. Other requirements include low invocation latency, reliability, inter-operability, and support for a large number of objects (upper bounds in CVR applications were estimated to be as high as 1,000 hosts with up to 1 billion objects per host). The MinORB system uses a distributed application simulator for pre-implementation design and pre-deployment capacity planning. The simulator takes application characteristics and load as input, including object size, object count, caching proxy counts, and the ratio of read to write operations. The MinORB simulator is based on SES/Workbench, using a state machine emulating a Solaris multi-threaded executable. The simulator returns the average invocation latency in microseconds, the invocation throughput in invocations per second, and the server-to-server bandwidth demand. MinORB utilizes existing technologies to enhance the ORB implementation. Zero-message copy, which passes the memory when possible, is used for passing messages through the subsystem to eliminate copy and allocation overhead. Co-location is used to directly access objects or proxies in the same address space, thereby avoiding unnecessary marshalling, unmarshalling, and message passing. Hashing is used for rapid demultiplexing of messages to objects. MinORB's small footprint (~65 Kbytes) is maintained to fit into current L2 CPU caches. The caching policy is determined by the lag variable, which specifies the time-based inconsistency the application will tolerate between a write operation and the cache update. This mechanism, dubbed the 'Smart Proxy', caches object state and propagates updates to cached copies in a manner that violates strict serialization of concurrent invocations.
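The lag-driven behavior described above can be sketched as a small caching proxy: a cached read is served as long as it is no staler than the application-specified lag; once the lag is exceeded, the proxy refetches from the server. The class and method names are assumed for illustration and do not reproduce the actual MinORB code.

```python
# Illustrative sketch in the spirit of MinORB's 'Smart Proxy' (names and
# structure assumed): the lag variable bounds the time-based inconsistency
# the application tolerates between a write and the cache update.

class SmartProxy:
    def __init__(self, fetch, lag):
        self.fetch = fetch        # callable that reads the server's state
        self.lag = lag            # tolerated staleness in seconds
        self.value = None
        self.fetched_at = None

    def read(self, now):
        stale = (self.fetched_at is None
                 or now - self.fetched_at > self.lag)
        if stale:
            self.value = self.fetch()   # refetch authoritative state
            self.fetched_at = now
        return self.value               # otherwise serve the cached copy

state = {"v": 1}
proxy = SmartProxy(lambda: state["v"], lag=5.0)
assert proxy.read(now=0.0) == 1
state["v"] = 2                          # server state changes
assert proxy.read(now=3.0) == 1         # within lag: stale read tolerated
assert proxy.read(now=6.0) == 2         # lag exceeded: cache refreshed
```

Setting the lag to zero degenerates to fetching on every read (no caching), while a large lag approaches full replication with weak consistency, which is exactly the trade-off the lag variable exposes to the application.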
Future work of this project will analyze the trade-offs between enforcing strict serialization and throughput. Similar to Gryphon and Electra, object
data is propagated via set and get value methods. MinORB's Partial-Replication, like Gryphon's caching, illustrates that caching is sometimes required while at other times it is disadvantageous. MinORB's use of a lag variable to decide policy is similar to Gryphon's time-based update caching policy. In addition, the Gryphon research identifies a modification-count update caching policy and provides for various cache consistency policies. While MinORB uses the lag variable to address the implementation issues, the Gryphon subsystem uses an interface (via agents) for specifying object caching and placement on a per-object, per-client basis. The Gryphon research also addresses run-time distributed decision making for caching policies and location specification.

2.7.4 Coign

Coign is an Automatic Distributed Partitioning System (ADPS) which can be run by the end user, since Coign takes an application binary compiled for Microsoft COM and converts it into a distributed application using DCOM [HS99]. The Coign system creates a profiled version of the application binary to intercept all the calls to components (also called objects). The profiled application is then run using scenario-based profiling, which attempts to represent how the application is actually utilized. The result of profiling is a graph model of inter-component communication, which is processed using a left-to-front minimum-cut graph-cutting algorithm to decide on the distribution of components. The current version only supports distribution of the application across two machines, since the graph-cutting problem is NP-hard for more than two machines. The resulting distributed application is tuned to the scenario usage and attempts to minimize network communication delays. On three applications with 295, 458, and 786 components, Coign was able to distribute 8, 2, and 281 components, respectively, to a remote server/host, off-loading some of the processing to a remote host.
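The two-machine partitioning step can be illustrated on a toy communication graph. Coign itself uses a minimum-cut graph-cutting algorithm over the profiled graph; the brute-force search below is used only to keep the sketch short, and the component names and edge weights are invented.

```python
from itertools import combinations

# Toy two-machine partitioner: pick the client/server split of components
# that minimizes cross-machine communication, with one component pinned to
# each machine (e.g. the UI on the client, the database on the server).
# Brute force stands in for Coign's minimum-cut algorithm; it is only
# feasible because the example graph is tiny.

def best_partition(components, edges, pin_client, pin_server):
    free = [c for c in components if c not in (pin_client, pin_server)]
    best, best_cost = None, float("inf")
    for r in range(len(free) + 1):
        for subset in combinations(free, r):
            client = {pin_client, *subset}
            # cost = total weight of edges crossing the machine boundary
            cost = sum(w for (a, b), w in edges.items()
                       if (a in client) != (b in client))
            if cost < best_cost:
                best, best_cost = client, cost
    return best, best_cost

comps = ["ui", "logic", "cache", "db"]
edges = {("ui", "logic"): 10, ("logic", "cache"): 2,
         ("cache", "db"): 1, ("logic", "db"): 3}
client_side, traffic = best_partition(comps, edges, "ui", "db")
assert client_side == {"ui", "logic", "cache"}   # chatty pairs co-located
assert traffic == 4
```

The example shows why profiling matters: the heavy ui-logic edge (weight 10) forces those components onto the same machine, and the cut falls across the lightest edges instead.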
Another experiment that showed positive results involved having application developers attempt to
manually distribute components to a server and then seeing whether Coign could generate a version that, when run on the scenario, would produce less message traffic. Coign is an example of an automated tool which could utilize the Gryphon subsystem to achieve distribution. The current implementation of Coign only runs on machines that support COM and DCOM, but with a Gryphon-enhanced subsystem the Coign analysis could generate hints for Gryphon. In future work, the authors hope to perform run-time repartitioning, which could then be used to supply run-time hints to a Gryphon-enhanced subsystem.

Chapter 3

System Design

Chapter 1, "Introduction," presents the use of location and caching to improve distributed object performance. This chapter describes the Gryphon design with its location and caching policies and its application to actual subsystems supporting these features. These enhancements enable entities external to the subsystem or ORB to tune performance within the ORB. The DVE virtual art museum application described in this chapter is an aid to understanding the structure and use of the Gryphon-enhanced subsystem. The virtual museum contains a number of works of art available for viewing and discussion. An individual enters the virtual museum, browses the artwork, and communicates with other visitors. An individual is represented in the museum through the presence of an avatar in the environment. When an individual sees a number of avatars near an interesting piece of work, the individual may opt to join the group, view the work, and begin discussing it. The virtual art museum has a number of static objects with complex VRML specifications (corresponding to the works of art) whose locations within the museum are fixed. The avatars move around as the visitors wander between works of art. Chapter 4, "Details Specific to the System Prototype Implementation," shows the feasibility of the Gryphon approach and its implementation within the DOM
subsystem. Chapter 5, "Analysis of the System," demonstrates the effectiveness of this approach. The Gryphon design can be applied to any subsystem while allowing legacy applications to continue to function on the enhanced subsystem.

3.1 Architectural Overview of Gryphon

Distributed applications allow multiple processes on various hosts to merge into a cohesive application. Each process within the distributed application is composed of an application layer and a communication layer, or subsystem. The application layer contains the instructions and code specific to the task addressed by the application. The communication layer transparently manages the transfer of data between the application layers of multiple processes. In this thesis, the communication layers of interest are those of an OO nature. The communication layer or subsystem with an OO interface is used by the application layer to integrate multiple distributed processes. The application utilizes objects as if they were local, and the subsystem transparently manages method distribution, synchronization, and any other communication issues that arise. The Gryphon architecture supplies an auxiliary interface enabling the application to influence the subsystem implementation.

3.2 Distributed Object Model based on Implementation Transparency

The goal in many OO models and distributed object models is to shelter the application developer from implementation details. These models have the burden of presenting an interface which is both general and efficient. General interfaces are easy to utilize but tend to be inefficient. Generating more specialized interfaces can help address performance issues, but at the cost of a more complex interface. The opaque interface is a double-edged sword, enabling ease of use for the beginner but limiting the functionality available to the more advanced developer.
The Gryphon system addresses the problem of location transparency in distributed object systems by optionally allowing the developer to impart application-specific knowledge to the subsystem, aiding in the location distribution decision process. The rationale is that it is not possible to select policies that suit all applications; therefore, policies should be specified on a per-application basis as needed. One solution that gives the developer control over object placement is to soften the location transparency interface model by adding an auxiliary interface to supply location information. It should be noted that the location-transparent interface model can be implemented using the location-non-transparent interfaces simply by supplying no location information, which is analogous to using the location-transparent model that makes no provision for additional information. Appendix D, "Abstract Auxiliary Interface Hints," illustrates hints that can be specified at a higher abstraction level than the Gryphon auxiliary interface described in this chapter. These hints influence cache update and location policies using application information and system resource information.

3.2.1 Caching

Calls made to an object that is not cached or cannot be cached (because no information is available about the effect of each method call) are passed through the subsystem to the object implementation on a server. When the object interface information described in Section 3.5 has been specified by the application, it is feasible to cache an object and use the cached copy when get method calls are requested. The put methods are more complicated and require mechanisms and algorithms to determine the updating requirements of cached copies. In the virtual museum example, the wall and artwork objects are assumed to rarely change; therefore, the most frequent requests are get method calls to retrieve their visual representations and placements. Visitors' presences in the museum are represented by avatars.
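The get/put split described above can be sketched in a few lines: get calls are served from a locally cached copy with no network round trip, while put calls go through the server, which then propagates the update to all cached copies (here, with strict consistency). The class names and the dictionary-based state are assumed for illustration; this is not the actual DOM code.

```python
# Minimal sketch (assumed interface) of caching get calls locally while
# routing put calls through the server, which updates all cached copies.

class Server:
    def __init__(self, state):
        self.state = dict(state)
        self.caches = []

    def put(self, key, value):
        self.state[key] = value
        for cache in self.caches:        # strict consistency: push the
            cache.data[key] = value      # update to every cached copy

class ClientCache:
    def __init__(self, server):
        self.server = server
        self.data = dict(server.state)   # local cached copy of the object
        server.caches.append(self)

    def get(self, key):
        return self.data[key]            # served locally, no round trip

    def put(self, key, value):
        self.server.put(key, value)      # writes always go to the server

server = Server({"placement": (0, 0)})
c1, c2 = ClientCache(server), ClientCache(server)
c1.put("placement", (5, 2))
assert c2.get("placement") == (5, 2)     # other cache sees the update
```

Relaxed consistency variants (time-based or modification-count policies) would replace the immediate push in `Server.put` with deferred or batched propagation, which is where the update-policy mechanisms come in.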
When one visitor is near another, the interaction may be fine grained and require strict cache consistency or no caching at all, while updates to cached copies can be less frequent if the visitors are far apart or in separate rooms. Later sections describe the way some objects are cached using relaxed forms of consistency.

3.2.2 Location

In a transparent object system, clients in an application are unaware of the location of an object within the subsystem. The object may exist in the local process or in a remote process on a different computer. This research does not violate the location transparent interface provided to the application, but instead adds an auxiliary interface that enables the application to influence the location of an object if it desires. The object instance can be moved to any server that supports the executable version of the object. In the virtual art museum, the visitor's personal avatar object is colocated with the client process. This reduces the latency associated with the visitor moving through the museum and prevents the delays that would occur if each step required confirmation with a remote location before the move could take effect.

3.3 Spectrum of Systems Identified

Many APIs have been specified for distributed objects. These APIs describe the object interface to be used by the applications in a distributed system. The APIs address the issue of generality, while system design focuses on performance. The five system designs that follow represent a broad spectrum of distributed object system implementations, all of which are supported by the Gryphon enhanced DOM. These system descriptions focus on the underlying mechanisms that provide object distribution (rather than their interfaces).

System 1 (Centralized object manager): In this system, a centralized object manager provides DVE style applications with cached distributed objects. A single server allows objects to be cached at each client location using a strict consistency policy. Updates are requested from the central location, which returns a confirmation and propagates the updates to all the clients. Table 1 shows all objects located on the central server, with each client holding a cached copy of the objects.

Table 1: System 1

  Server        Client X        Client Y
  X1,X2...Xn    X'1,X'2...X'n   X'1,X'2...X'n
  Y1,Y2...Ym    Y'1,Y'2...Y'm   Y'1,Y'2...Y'm
  S1,S2...Sp    S'1,S'2...S'p   S'1,S'2...S'p

Key:
  X, Y - Objects used mostly by Client X or Client Y respectively
  S    - Objects that are static and don't have a best Client for colocation
  '    - cached copy of Object
  ''   - cached copy of Object with update-policy caching
  *    - a reference to Object, i.e., not a cached object
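Before contrasting the remaining systems, the get/put caching behavior of Section 3.2.1 as used by System 1 can be sketched as a thin interception layer. This is an illustrative sketch only: the class and method names (CachedObject, get, put) and the message counter are invented here and are not the prototype's interface; strict consistency is assumed, so the confirmed update is also applied to the cache.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch (names invented): a proxy that answers get calls from
// a local cached copy and forwards put calls to the server holding the object.
class CachedObject {
public:
    // get method: served from the local cached state, no message traffic
    std::string get(const std::string& attr) const {
        return cache_.at(attr);
    }
    // put method: must travel to the server; the round trip is modeled here
    // by a message counter, after which the cache reflects the confirmed state
    void put(const std::string& attr, const std::string& value) {
        ++remoteMessages_;       // request sent to the server holding the object
        cache_[attr] = value;    // confirmed update applied to the cached copy
    }
    int remoteMessages() const { return remoteMessages_; }
private:
    std::map<std::string, std::string> cache_{{"color", "red"}};
    int remoteMessages_ = 0;
};
```

The point of the sketch is the asymmetry: reads cost nothing once the object is cached, while every write still generates remote traffic.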

System 2 (ORB Central CORBA): This style of object manager is a centralized ORB: a single server stores all objects, and any access to an object requires a remote reference. The objects are not cached; each request requires a send and a receive message to determine the state of the object. The server becomes a bottleneck due to the large quantity of traffic through the centralized ORB. Table 2 shows all the objects located on the central server, with each client holding references to the objects.

Table 2: System 2

  Server        Client X         Client Y
  X1,X2...Xn    *X1,*X2...*Xn    *X1,*X2...*Xn
  Y1,Y2...Ym    *Y1,*Y2...*Ym    *Y1,*Y2...*Ym
  S1,S2...Sp    *S1,*S2...*Sp    *S1,*S2...*Sp

System 3 (ORB with Opaque Location): The object manager is a distributed set of ORBs. Static objects, S, are distributed across processes, while objects being modified by a client are located on that client. The ORB is not centralized, and local object accesses do not generate message traffic. Table 3 shows the objects being modified located at the appropriate process: Process X holds the X objects while Process Y holds the Y objects. For analysis, the static objects are equally distributed across processes, and each process in this system has references to the objects located on remote processes.

Table 3: System 3

  Process X                       Process Y
  X1,X2...Xn                      *X1,*X2...*Xn
  *Y1,*Y2...*Ym                   Y1,Y2...Ym
  S1...S(p/2), *S(p/2)+1...*Sp    *S1...*S(p/2), S(p/2)+1...Sp

System 4 (ORB with Opaque Location and Caching): The object manager includes a Gryphon capable of acting on location and caching hints. Objects are distributed across processes, but in this system the objects are located on the client making the modifications and are cached at the other clients. Table 4 is similar to System 3 except that references to objects have become strictly consistent cached copies of the objects.

Table 4: System 4

  Client X                        Client Y
  X1,X2...Xn                      X'1,X'2...X'n
  Y'1,Y'2...Y'm                   Y1,Y2...Ym
  S1...S(p/2), S'(p/2)+1...S'p    S'1...S'(p/2), S(p/2)+1...Sp

System 5 (ORB with Opaque Location, Caching, and Update Policies): The object manager includes a Gryphon capable of acting on all hints, including application specified consistency policies. Like System 4, objects are distributed across processes and caching is used. In this type of system, caches are allowed to drift out of synchronization for varying lengths of time (as specified by the application). Table 5 is similar to System 4 except that strict consistency has been replaced with application specified consistency policies, as indicated by the double prime marks.

Table 5: System 5

  Client X                          Client Y
  X1,X2...Xn                        X''1,X''2...X''n
  Y''1,Y''2...Y''m                  Y1,Y2...Ym
  S1...S(p/2), S''(p/2)+1...S''p    S''1...S''(p/2), S(p/2)+1...Sp
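The qualitative differences among the five systems can be made concrete with a back-of-the-envelope message-count model. This sketch is not the analysis of Chapter 6; the costs below follow from simplifying assumptions stated in the comments (a remote reference costs a request/response pair, strict caching propagates each write to all n caching clients, and System 5's relaxed policy is charged a single deferred update message as a placeholder).

```cpp
#include <cassert>

// Illustrative model: messages for a client to read an object it frequently
// uses, and for the modifying client to write it, with n other clients
// caching the object where the system caches. Systems 3-5 place the object
// at the writer; Systems 1 and 2 place it on the central server.
struct Cost { int read; int write; };

Cost cost(int system, int n) {
    switch (system) {
        case 1: return {0, 2 + n}; // cached reads; write = request + confirm, then push to n caches
        case 2: return {2, 2};     // no caching: every access is a round trip to the central ORB
        case 3: return {2, 0};     // writer holds the object; other processes read via remote references
        case 4: return {0, n};     // writer updates locally, then pushes to n strict caches
        case 5: return {0, 1};     // writer updates locally; caches drift and are updated lazily
        default: return {0, 0};
    }
}
```

Under these assumptions the model shows why Systems 4 and 5 suit DVE workloads: reads dominate and are free once cached, and relaxing consistency (System 5) decouples write cost from the number of clients.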

3.4 Gryphon Design for Enhancing Subsystems

Section 2.4 describes various distributed object interfaces that follow the transparent interface model, which simplifies programming for application developers by abstracting away the low-level implementation details. These interfaces fall short in performance precisely because of that transparency. For example, DVEs frequently perform changes on a small subset of objects, but they may query all the objects in the environment at regular intervals (e.g., to refresh the screen). DVE domain specific knowledge can be utilized to develop distributed objects tuned for the particular DVE environment. However, distributed objects using location and update transparency can be inefficient when used in some domains. As demonstrated in Section 6.2, applications that perform many read operations on an object require object caching to reduce remote access latencies, and objects being modified frequently at one client might be better placed locally on the client performing the modifications. The research goal is to improve performance by providing an open implementation with an auxiliary interface for a conventional ORB, agents in this case. The auxiliary interface specifies location and update policies. Using the auxiliary interface, the location and update policies can be explicit rather than implicit, as is the case with the normal transparent policy. The Gryphon system demonstrates the feasibility and performance advantages of allowing the developer to supply location and update information to the subsystem. With a Gryphon enhanced ORB, the developer can achieve high performance without writing a domain-specific distributed object subsystem. Qualitative data gathered from developers suggests that the addition of the location and update interface does not significantly reduce usability. The interface utilized within application clients using a Gryphon enhanced ORB does not require modification provided they do not wish to influence policies. Distributed objects are accessed using the same interface for the enhanced ORB and the ORB prior to its enhancement. The application has the additional agent auxiliary interface available to specify location and update information.

3.4.1 Minimum ORB Features Required to Support Gryphon

In order to enhance an ORB with the Gryphon, the ORB source code needs to be available. The ORB must support the following two features that enable the implementation of location and update policy:

• A mechanism for specifying and modifying object location at run time.
• The ability to cache objects and influence the techniques used for data consistency in server push and client pull models, which are described in Section 4.3.

The CORBA specification does not mandate the presence of these two features within the ORB; therefore, not all ORBs can be enhanced with Gryphon. The Electra/Ensemble (E/E) CORBA ORB [Birm97] [MS97] is an example of a subsystem with the necessary features to support location and update policy enhancement using Gryphon. Figure 6 depicts the Gryphon integration into an ORB whose implementation has the features necessary to support Gryphon (while Figure 2 in Chapter 1, “Introduction,” depicts the agent and Gryphon design and interface).
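The two required features can be pictured as an abstract interface that a base ORB must expose before Gryphon can be layered on top of it. The interface and stub below are a hypothetical sketch with invented names; real ORBs such as Electra/Ensemble express these capabilities through their own APIs.

```cpp
#include <cassert>
#include <map>
#include <string>

// Consistency techniques of Section 4.3 (server push vs. client pull).
enum class Consistency { ServerPush, ClientPull };

// Hypothetical sketch of the minimum ORB surface Gryphon needs:
// (1) specify/modify object location at run time, and
// (2) cache objects with influence over the consistency technique.
class TailorableOrb {
public:
    virtual void moveObject(const std::string& objectId,
                            const std::string& hostId) = 0;           // feature 1
    virtual std::string locationOf(const std::string& objectId) const = 0;
    virtual void cacheObject(const std::string& objectId,
                             const std::string& hostId,
                             Consistency technique) = 0;              // feature 2
    virtual ~TailorableOrb() = default;
};

// A trivial in-memory stub, only to make the sketch checkable.
class StubOrb : public TailorableOrb {
public:
    void moveObject(const std::string& id, const std::string& host) override {
        loc_[id] = host;
    }
    std::string locationOf(const std::string& id) const override {
        auto it = loc_.find(id);
        return it == loc_.end() ? std::string() : it->second;
    }
    void cacheObject(const std::string&, const std::string&, Consistency) override {}
private:
    std::map<std::string, std::string> loc_;
};
```

An ORB lacking either virtual operation (run-time placement or controllable caching) cannot host the Gryphon layer, which is why not every CORBA-compliant ORB qualifies.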

Figure 6: ORB with the Gryphon. (The application sits above the Gryphon, which makes calls to a base ORB that supplies the required location and update features.)

3.4.2 Designing the Agent

The agent is added to the subsystem to allow the application to communicate indirectly with the subsystem. A local client passes hint objects to its agent in reference to distributed objects. Hints can be passed to the agents at run time using domain knowledge, user input, and automated performance tuning tools. The tuning tools perform run time analysis of the application to generate input to the agents. Proxies of the agents, carrying these hints, are distributed to remote processes. The Gryphon communicates with the agents and proxies to effect the location and caching policies within the subsystem. The information passed to the agent includes location, remote update support (which provides locking facilities), and cache consistency policies. See Section 4.5 for details of the hints implemented in the enhanced DOM subsystem prototype. In the virtual art museum application, the objects representing the art work and building structure are distributed to various server processes but cached at the client processes. These objects use a strict consistency policy; since the objects rarely change, this results in little message traffic. The client process caches the visitors in the museum, enabling each visitor to see what the others are doing. The visitor objects are cached using an application specified consistency policy. As a visitor gets farther away from, or even obscured from, another visitor, the frequency of updates is reduced. Using virtual position information about the visitor in the museum, it is possible to cache only the objects of current relevance to the client process. A visitor in one room of the museum cannot see the other rooms and the objects within those rooms; therefore, only the objects in the current room need caching. Since not all objects are cached, a server must be contacted with location information regarding the visitor, thereby enabling the identification of the objects requiring local caching.

The purpose of the Gryphon(1) is to provide a prototype interface that enables applications to influence object access performance. The Gryphon enhancement technique leaves the existing distributed object application level interfaces unaltered by passing hint objects to agents. Each agent uses this information to influence per object update and location policies. To an ORB that has not been enhanced with Gryphon, the agent is just another distributed object, while an enhanced ORB recognizes the agent as a special active object that merges the notion of process and object [BGL98]. A Gryphon enhanced ORB utilizes the agent to process updates and validation of object states. An advantage of the agent design is that applications with special needs can enhance the agent and hint classes without requiring any changes to the subsystem. One such example is an MPEG application, which needs additional information to distinguish between levels of updates (see Appendix C, “Details of Applications Built on DOM,” for more detail).

1. Gryphon - A fabled animal with the body of a lion and the head and wings of an eagle. It is known for its domination of both the earth and the sky, and its combination of intelligence and strength. The Gryphon was alleged to watch over gold mines and hidden treasures. The Gryphon of this body of research is symbolic as a guardian over distributed objects. Gryphon image from http://enteract.com/~tirya/gallery.html
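The hint objects and per-process agents described above can be sketched as follows. The class names (Hint, Agent) and their fields are illustrative stand-ins, not the prototype's actual interface, which is detailed in Section 4.5.

```cpp
#include <cassert>
#include <map>
#include <string>

// Illustrative sketch (names invented): a hint carries per-object or default
// policy information; each process's agent stores the hints that the Gryphon
// later consults when making location and caching decisions.
struct Hint {
    std::string location;   // preferred host for the object
    bool cache = false;     // should remote processes cache the object?
    int maxStaleness = 0;   // relaxed consistency: tolerated staleness (ms)
};

class Agent {
public:
    void setDefaultHint(const Hint& h) { defaultHint_ = h; }
    void setHint(const std::string& objectId, const Hint& h) { hints_[objectId] = h; }
    // The Gryphon queries the agent; objects without a specific hint
    // fall back to the agent's default setting.
    const Hint& hintFor(const std::string& objectId) const {
        auto it = hints_.find(objectId);
        return it == hints_.end() ? defaultHint_ : it->second;
    }
private:
    Hint defaultHint_;
    std::map<std::string, Hint> hints_;
};
```

In the museum example, the default hint might enable cached reads for static scenery, while an avatar's specific hint pins it to its owner's client and tolerates bounded staleness at distant viewers.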

Appendix D, “Abstract Auxiliary Interface Hints,” describes high-level hints that can be created utilizing the agent interface. The Gryphon architecture is sufficiently abstract to be utilized both with existing subsystem technology and with new technologies and strategies for future systems. One method of designing the Gryphon layer would be to embed hint and state objects within every distributed object. However, this does not scale, since each distributed object would need information pertaining to the requirements of each process. Since the system is dynamic, the creation of a new object would cause every process to simultaneously attempt updates of the associated hint and state object within the distributed object. This would result in synchronization issues and delays in gaining object write access. Designing the system with one agent per process, instead of a shared hint object per distributed object, reduces complexity, network overhead, and the resources required, and also eliminates an update synchronization problem that occurs when multiple writers share a hint object. Each process sends a copy (or proxy) of its agent to each of the other processes. The remote proxy agent is the active agent representing its process and decides when object updates on the remote process need to be propagated back to the represented process. The agent design has a default setting for objects that do not have specific per-object hints. An agent's hints that pertain to a specific object only need to be propagated to the process holding the actual object. No special interface is required to handle the case where the distributed object is implemented on the local machine, as the local process' agent handles this case.

3.5 Creating Objects for the System

When developing a distributed object oriented application, detailed design is required. Once the application is well defined, the distributed objects can then be created.
These designs usually specify components and objects and indicate how to fit them together to form the application. Sadr uses the Unified Modeling Language (UML) to describe some of the design and development stages used in the creation of distributed object oriented applications [Sadr98]. CORBA and ActiveX/DCOM initiate the object creation process using an Interface Definition Language (IDL), described in Section 2.4.1. In the Java programming language, the object interface is separate from the implementation. To enable cache optimizations it is important to distinguish the object interface methods that are read methods, also referred to as get methods, from the write or set methods. One way to specify this information is to declare methods as constant (which follows the C++ use of const with methods). Using this information it is possible to know which interface methods will not cause modification. Unfortunately, some languages do not enforce the constant interface that prevents method implementations from making modifications. When enforcement is not provided by the language, the developer is required to maintain the specifications of the interface. A more difficult way to distinguish read and write methods is compile time analysis of the implementation. Unfortunately, this solution is not always feasible, since the application may only have access to the interface while the implementation may be in an external library, unavailable at compile time.

The design of the Gryphon system assumes that a majority of distributed objects contain small amounts of data, which can easily be cached, and that methods tend to be small. These methods are mostly in the form of get and set calls on the objects, as described by Liskov [Lisk99]. This assumption implies that it is feasible to cache entire objects and to use the interface information to identify when modifying methods occur, requiring object state propagation. Section 7.2 describes enhancements that allow for more complex partial object state updates.
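The const-labeling convention can be shown in C++, where the compiler enforces that a method declared const cannot modify the object's state; a caching layer may therefore treat const methods as safe to serve from a cached copy. The class below is an illustrative fragment, not the prototype's actual object interface.

```cpp
#include <cassert>

// Illustrative fragment: const marks the read (get) methods, which a
// subsystem can serve from a cache; non-const methods are writes that must
// be propagated. The compiler rejects state changes inside a const method.
class Position {
public:
    // get methods: declared const, safe to call on a cached copy
    double x() const { return x_; }
    double y() const { return y_; }
    // set method: non-const, requires update propagation
    void moveTo(double x, double y) { x_ = x; y_ = y; }
private:
    double x_ = 0.0, y_ = 0.0;
};
```

Attempting to call moveTo() through a const reference fails to compile, which is exactly the enforcement that some other languages lack, leaving the developer to maintain the read/write labeling by convention.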
In the DVE virtual museum example, each distributed object must have a visual representation with a location and rotation specifying its position and orientation in 3D space. The object created for this task is the VPR object. In addition to the location and rotation information, the class has a unique ID field and a name field, which allow the application to locate an object at run time using the ID or name. A text field in the VPR object specifies the location of a virtual reality modeling language (VRML) file that dictates the visual representation of the object. Get and set methods are used by the application, respectively, to access and change the object's rotation, location, and visual representation.

3.6 Application Layer

Distributed object applications written using CORBA, ActiveX, and Java RMI do not require the application developer to understand the communication mechanisms of the underlying subsystem. Similarly, applications utilizing a Gryphon enhanced subsystem do not need special knowledge of the subsystem implementation. The description of the system in this chapter focuses on the general architecture of the system and its application. In describing the Gryphon architecture, the client (or application layer) and the server (or subsystem layer) are assumed to be one process. Chapter 4, “Details Specific to the System Prototype Implementation,” describes a particular implementation of the architecture, including the relationship between client and server. The client server process can be multi-threaded, with the client and server each having their own threads, or single threaded, with control relinquished by the client to the server. Another option is the client and server as separate processes on the same host machine. Separating the server into its own process allows for language independence between distributed objects and the client application. Architectural heterogeneity is also feasible, since objects and applications can exist on different machines. For the remainder of this chapter, every process is assumed to be both a client and a server. The application layer uses an auxiliary interface to influence policy within the Gryphon enhanced subsystem.
Calls are made by a client process within an application to the local agent. These method calls set and get hints that specify how objects should be cached and where they should be placed. The method calls take a hint object as the parameter, and this object specifies default hints or specific object hints. The hints set in the agent are subsequently utilized by the agent to service requests made by the Gryphon.

3.7 Interaction between Gryphon and Agent

Gryphon's general architectural organization is shown in Figure 6. Each process within the application uses the distributed object interface to reference objects, i.e., it is assumed that the system uses an ORB to reference shared objects. The base ORB is extended to include a Gryphon that utilizes agent objects carrying the policy hints specified by the application. The Gryphon and agent extensions to the ORB are located within each client process address space, so no remote calls are needed to utilize these mechanisms. The Gryphon interacts with the agent when updates are performed and when reads are requested. Read requests to a distributed object are intercepted by the Gryphon, which communicates with the local agent to determine the validity of the current distributed object's values. If the object is cached locally, the values of the distributed object will be valid and returned; otherwise a remote request will be made to obtain the current values. When write requests are made to a distributed object, the Gryphon intercepts these requests and communicates with the local agent to determine the form of update acceptable for this object. If the original object exists locally, the change is made and the agent proxies are queried to determine whether updates need to be propagated. If the original object does not exist locally, an asynchronous or synchronous update is performed when found acceptable. If the local agent responds that these update techniques are not acceptable, a request to move the original object locally is made to the holder of the object so a local update may be performed. If the move fails or the move is not allowed, the update is denied.
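The write path just described reduces to a small decision function. The code below is a paraphrase of the prose with invented names (decideWrite, the enums), not the prototype's implementation.

```cpp
#include <cassert>

// Paraphrase of the Section 3.7 write path (names invented). The Gryphon
// asks the local agent what form of update the object accepts, then acts.
enum class UpdateForm { LocalOnly, RemoteAsync, RemoteSync };
enum class Action { UpdateLocally, RemoteAsyncUpdate, RemoteSyncUpdate,
                    MoveThenUpdate, Denied };

Action decideWrite(bool originalIsLocal, UpdateForm allowed, bool moveAllowed) {
    if (originalIsLocal)
        return Action::UpdateLocally;      // apply change, then query agent proxies
    if (allowed == UpdateForm::RemoteAsync)
        return Action::RemoteAsyncUpdate;  // fire-and-forget update to the holder
    if (allowed == UpdateForm::RemoteSync)
        return Action::RemoteSyncUpdate;   // round-trip update to the holder
    // Remote updates unacceptable: ask the holder to move the original here.
    if (moveAllowed)
        return Action::MoveThenUpdate;
    return Action::Denied;                 // move failed or not allowed
}
```

The ordering matters: a locally held original short-circuits everything else, and moving the object is the fallback of last resort before denial.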
With a modified ORB, the application uses the distributed objects while the subsystem handles the communication policies and returns the results from method calls. Information is passed to agents to tune the subsystem. In the virtual art museum, a domain knowledge engine is used to pass information to the agent, affecting the subsystem. The user of the application does not need to be concerned with performance issues. The domain knowledge portion of the application changes hints to the subsystem based on analysis of the application utilization. The Gryphon layer within the ORB then utilizes the agents to change the caching and location strategies. Implementing the Gryphon architecture requires a base ORB with its own internal mechanism for supporting object location and caching. For example, E/E incorporates these features as part of its internal object mobility mechanism. The Gryphon invokes internal location and enhanced update features using method calls provided by the base ORB. The Gryphon architecture, like the ORB, is distributed. Each process has a Gryphon that utilizes the agents to make choices on behalf of the local and remote processes. When a hint is modified on the agent, the method is processed by the local Gryphon and then propagated, when appropriate, to the affected hosts. If the default hint is modified, all remote copies of the agent will be notified; but if the hint is specific to one distributed object, then the agent modification is sent only to the remote agent copy at the process holding the distributed object in question.

3.8 Update Techniques Available to the Gryphon

Some ORBs (like Electra) implement get_state() and set_state() methods for each distributed object, enabling the transfer of object state. The ability to save and restore state data has the additional advantage of enabling object persistence, since the data can be written to a file and the object can later be reassembled from the stored state. It should be noted that distributing the entire object state can be costly and can include superfluous data in the update. One possible method for reducing superfluous data propagation is to support propagation of partial object data, or deltas, as is used in MPEG [Koen99]. In a DVE system, this problem can be extensive, since the large VRML description of an object does not change. (The implementation of the VPR does not have this problem, since the object only contains a reference to the VRML file.) Cache update policies resulting from hints are implemented by the distributed Gryphon using three update techniques:

the_object push to cached_copy(s) - This technique is used when the Gryphon on the process where the_object is located decides it is necessary to update all the cached_copy(s). The get_state() method is called on the_object and then set_state() is called on all elements of the set of cached_copy(s), synchronizing the state of all the cached_copy(s) with that of the_object. If additional information is supplied, the Gryphon updates a subset of the cached_copy(s).

cached_copy push to the_object - Starting with a cached_copy, the Gryphon updates the_object. The get_state() method is called on the cached_copy and then set_state() is called on the_object. Once the_object is modified, the other cached_copy(s) may need to be updated, which is performed by the Gryphon holding the_object using the push to cached_copy(s) technique.

cached_copy pull from the_object - The Gryphon synchronizes a cached_copy with the_object. The Gryphon calls the get_state() method on the_object and then set_state() on its cached_copy.

The fourth technique, the_object pull from cached_copy(s), is not used, since there is a one-to-many relationship between the original object and the cached copies. This one-to-many relationship causes a synchronization problem: multiple cached copies may differ, and there is no reasonable way to select the cached copy that should update the original. Without additional information it must be assumed that all method calls modify the object, limiting the available implementation

options for the system. The developer can label methods as read-only, increasing the implementation options available to the Gryphon.
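The three techniques can be sketched against a minimal get_state()/set_state() interface. The object layout and helper names below are illustrative, modeled loosely on the Electra-style methods named above; a real ORB marshals state rather than copying strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal Electra-style state interface (sketch only).
struct StatefulObject {
    std::string state;
    std::string get_state() const { return state; }
    void set_state(const std::string& s) { state = s; }
};

// the_object push to cached_copy(s): the holder synchronizes every cache.
void pushToCaches(const StatefulObject& theObject,
                  std::vector<StatefulObject>& cachedCopies) {
    for (auto& copy : cachedCopies)
        copy.set_state(theObject.get_state());
}

// cached_copy push to the_object: a cache's modification updates the
// original (whose holder would then push to the remaining caches, as above).
void pushToOriginal(const StatefulObject& cachedCopy, StatefulObject& theObject) {
    theObject.set_state(cachedCopy.get_state());
}

// cached_copy pull from the_object: a cache refreshes itself on demand.
void pullFromOriginal(StatefulObject& cachedCopy, const StatefulObject& theObject) {
    cachedCopy.set_state(theObject.get_state());
}
```

Note that no function lets the original pull from a cache: with many caches there is no principled way to pick which copy is authoritative, which is exactly why the fourth technique is excluded.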

Chapter 4

Details Specific to the System Prototype Implementation

Chapter 3, “System Design,” addresses the general architecture and design issues necessary to enhance an ORB with Gryphon. This chapter describes a prototype system for a Gryphon enhanced DOM implementation. The issues addressed are not the location and caching policies that comprise this thesis, but instead are tied to the interface, implementation, and enhancement of the DOM. Section 4.5.3.4 and Section 4.5.3.5 describe the hints and state information utilized by the prototype Gryphon enhanced DOM subsystem.

4.1 The Gryphon-GAAOP Interface

This section describes the hints used to communicate with the Gryphon. The Gryphon prototype is implemented using DOM (see Section 1.5). All the local object managers (LOMs) and the global object manager (GOM) embody the DOM. See Section 1.5 for the general view of the GAAOP within the ORB and Appendix B, “DOM - Subsystem Design used by the Gryphon,” for a description of how the DOM ORB's LOMs and GOM communicate to transfer object state updates between processes. GAAOPs and GAAOP proxies are located within the LOM part of the architecture. Application hints are used by the GAAOP to answer the Gryphon's queries regarding update propagation and update type. Update types include local original update, remote update using asynchronous or synchronous communication, and update by moving the remote object locally before performing the update when remote updates are not supported. Update propagation is used to modify remotely cached copies of objects when the original objects are modified. The GAAOP-Gryphon interaction results in an object placement and caching policy for each object. If hint information about an object is not provided to a GAAOP, then the GAAOP uses default policies.

4.2 Implementing the GAAOP for DOM

In an earlier design, the subsystem sent a subset of a GAAOP's hint and state objects to remote copies of the GAAOP. This feature required changing the DOM subsystem to generate partial updates specific to the GAAOP distributed object. This proved to be infeasible, since the communication layer of the subsystem was required to know the details of the GAAOP object and create a partial update propagation. The system was redesigned to handle remote partial updates at the Gryphon level by creating a temporary GAAOP with a subset of the information and sending that GAAOP to update the remote object proxy. A flag is set within the GAAOP that indicates whether the GAAOP being shipped to cause the remote update is a full update that should replace the old GAAOP or a partial update that performs a delta on the GAAOP proxy. Partial updates reduce message traffic by enabling modifications of one hint to be propagated without having to send all the hints. No change needs to be made to the subsystem to handle this feature; instead, overloading of the assignment operator was required within the GAAOP class. This same technique could be used to perform partial updates of other distributed objects as required in various applications. Besides propagating GAAOP updates when hints change within the GAAOP, the remote process may need to make requests. If a client changes an object hint setting from non-caching to caching, the client will block, since the subsystem must update the object cache before continuing.
If the update is not performed as a blocking operation, the next read of the object could return inappropriate data, since the cached object has not yet arrived.

The DOM subsystem needed some additional modifications to handle the GAAOPs. Each client needs a GAAOP, and from the DOM perspective the GAAOP is just another object. When a client requests the creation of a GAAOP, it must wait for acknowledgment from the DOM. In the meantime, new object creations will arrive, but hints cannot be set for them since a valid GAAOP is not yet present. One solution is to have the DOM assign the GAAOP a unique ID on connection, which allows the GAAOP to send modifications of itself without waiting for an acknowledgment of creation. The addition of a new process to the distributed application is signaled by the arrival of a new GAAOP proxy. In response, the existing processes send the new process their GAAOP proxies with the default hints. These defaults help the new process prioritize how other processes want objects handled on their behalf until specialized per-object hints are specified. Additional hints need not be sent, since other objects will not exist on that client at creation. As the new client creates objects, existing processes can send specialized hints as necessary. If a client is the holder of an object and the client dies, then the object might die as well. One solution to this fault tolerance problem is to have the client with the most recently cached copy of the object become the new owner of that object. If fault tolerance is important to the application, then remote cached copies should follow a strict cache consistency model. This feature could be implemented using DOM; the only missing element is the algorithm in the GOM to select the new holder process for orphaned objects.

4.3 Implementing the Gryphon on top of DOM

This section describes the specific implementation issues that were considered when enhancing the DOM subsystem with Gryphon. The major topics include data

56 push and pull and the effect of object mobility on implementation issues. Section 3.8 describes the abstract design while this section describes the implementation details of the push and pull of object updates to processes. 4.3.1 Push Changes There are two cases that cause cashed objects to be updated. The first case occurs when sufficient updates have been made to the original object, as specified by the GAAOP and the associated hint object. The second occurs when the time since the last state propagation has exceeded the GAAOP’s setting for maximum time update delay. The time based update technique is important for objects that may have bursty update patterns. Sending every nth update may be a good solution but if a long lull exists after the last modification it may be desirable to propagate the last update after a specified amount of time. The hint object contains hints used by the GAOOP and the state objects contains the update state history information. When a distributed object changes locations the update state object must also be transferred. Otherwise, a client process requesting every nth update will never see updates if the object always moves before the nth update occurs. Each distributed object has a holder_pid field which specifies the current holder of the object. The subsystem uses this field to identify the owner of a distributed object and to access the object. This field is a cached value of the result returned from the name server which is in the GOM for the DOM specific ORB implementation. When the cached location information is stale the LOM receives an error on attempts to access the distributed object. At this point the GOM is queried to retrieve the new location value. A GAAOP is created for each process as it initiates a VprConnection and its proxy is distributed to all the other processes involved in the session. One of the update techniques available to the application is time based update propagation. The

method VprConnection::timeBasedPropagateUpdates(), within the Gryphon layer, checks with each of the GAAOPs to determine if time based updates are required for any of the objects held on the local process. The timeBasedPropagateUpdates method is called when the application or the Gryphon updates the current time within the Gryphon layer using the VprConnection::SetCurrentTime() method. The DOM's ORB connection protocol uses the VprConnection object, which establishes the LOM to GOM connection. The application specifies whether the subsystem clock will be controlled by the Gryphon layer or by the application. When the application requests that time updates be automated, the Gryphon layer updates time when the Pull method is invoked. The Pull method is used since the process switches from client mode to server mode at this point and processes remote updates. The internal clock is maintained to allow trace files to be generated and for use in time based updates. The cost of time based updates is high when the internal clock is frequently updated. Therefore, time based updates should be used sparingly and the time steps should be relatively large. Incrementing the subsystem clock on process #L, either by the application or automatically based on the mode selected by the application, results in the GAAOP proxies on process #L (representing all the other processes) being queried for updates. Each GAAOP is asked about each object residing on the current process. If an update is required, the update is sent by the Gryphon and the internal stateObject in the GAAOP for the remote process is updated to reflect the update propagation. Modifying a distributed object on process #L has the same effect, but the GAAOPs are only queried in regard to the object in question.

4.3.2 Pull Changes

Pulling changes is complicated because it requires matching requests and responses in an RPC style. If process #L wants the current state of an object, it sends a

request for the state to process #M. Process #M then writes the object back to process #L. A unique ID is placed in each message and returned with the response to indicate that it is an update generated by a request. Since the prototype DOM subsystem is not multi-threaded, reading and processing messages must continue until the requested response arrives. The process waits for the reply to the request. To preserve sequential consistency, all updates received prior to the awaited reply must be processed in the order received. Update requests for objects that have been deleted present unique timing issues that need to be addressed. If a delete is received for an object for which a request response is anticipated, control needs to be returned to the caller of the DOM method with an error. Any object updates that arrive regarding an object that is awaiting a matching pull must not generate a callback, since the change was requested and does not require an alert. If a callback were allowed to occur, then a GetAttr might occur within the callback, resulting in another pull and hence infinite recursion. For this reason, any call to GetAttr within a callback returns the current state and will not result in another callback. These issues demonstrate that although the pull of data can be very complex, it can be simplified by hints. Pulls typically occur when an object is not being cached, in which case data should not be pushed to the process regarding the object.

4.3.3 Server Processing Details

When a process first connects to the DOM via the LOM to GOM connection, a session is specified to enable an application-level shared distributed object name space. These sessions are also used to preserve persistent objects. When a process leaves a session, the process deletes all transient and locally owned objects before closing the LOM to GOM connection. The absence of a GAAOP commonly occurs

when the process is leaving and terminating the current session, since the objects are not deleted in any specified order. If the Gryphon cannot find the local GAAOP on a GetAttr method call, the Gryphon assumes the current object is valid. Since the DOM is not multi-threaded, the call to the Pull method causes the client portion of the process to explicitly relinquish control to the server. There are other methods which cause control to be implicitly relinquished to the server. For this reason an application writer using DOM views the system as multi-threaded. Upon gaining control, the server uses this opportunity to process object updates that have been pushed and object requests sent from remote LOMs. Callbacks are generated to notify the client portion of the process that a specified object has been created, modified, or deleted. Inside callback methods it is not acceptable to modify distributed objects or to call any methods which may result in callbacks and infinite recursion. To prevent infinite recursion, callbacks are not allowed to occur within callbacks, and a lock is used in the Pull method. If the application violates this lock by calling any method that results in a call to Pull, the application will terminate and an error message will be generated using the assert method. This occurs, for example, when a developer attempts to call a PutAttr method from within a callback, which results in a call to Pull. GetAttr calls occurring while the lock is set cause the current state of the object to be used. This prevents the problem of a GetAttr being called within a callback resulting in another callback. Creates, modifies, and deletes all result in callbacks upon processing, while pull modifications do not because they have been requested via a GetAttr. A queue of the distributed objects available to a process can be accessed to search for and locate objects in the DOM system.
The distributed object interface objects (IfObjects) are used by the application to access distributed objects. The IfObjects can never be removed, since it is not possible to know whether pointers referencing the objects still exist. Nor can these objects be removed from the queue of distributed objects and moved to an invalid queue, due to timing problems. For example, a loop may be executing, and if the IfObject were moved to a deleted queue, the loop would be left in an unknown state with its pointer referencing an invalid object. The solution is to have the methods First and Next skip all objects marked as invalid, preventing any process from getting access to an invalid pointer. An application client process causes data propagation within the ORB by creating, deleting, reading (GetAttr), or writing (PutAttr) an object. The call to create an object results in a request for a unique object id. Other client processes are notified of the object creation through callbacks. Delete removes an object and notifies the other client processes. The next attempt to use the deleted object results in notifying the client process that the object is invalid.

4.4 Application Layer

Sample proof-of-concept applications have been constructed to explore the approach; some simple distributed objects and applications exist as templates and additional documentation for application developers. The DOM system has been used by other developers to create a 3D tic-tac-toe game and an efficient MPEG client-server application, which are described in Appendix C, "Details of Applications Built on DOM." Construction of new object classes as illustrated in Figure 7 was found to be feasible, though more time consuming than if an IDL compiler were provided. Once the class is hand generated, the remainder of the system is equivalent to a CORBA system with respect to usability. See Appendix E, "Example Distributed Application," for example code of a Gryphon enhanced DOM application.

4.4.1 Object Utilization

All attributes within a distributed object can be accessed using the associated attribute class derived from base_attributes. The attribute class is convenient

since many applications involve objects with small amounts of data (a few hundred bytes), where the latency of a remote call dominates the invocation. The attribute class addresses this problem using the fat operation technique, which bundles many attributes into one structure or object instead of fine-grained method calls [HV99]. Without this technique, reading values x, y, and z from an object is not atomic: it requires three method calls on the object, and the resulting values from the three calls are not necessarily related. Attribute objects are used to observe and change the state of an object. The holder_pid is a piece of state that cannot be modified via attributes; only the Gryphon can make that change, although the user can send hints to request the move. Synchronization and strong locking are possible with the current system by configuring the hints to allow only one writer (the one which currently has the real object, i.e., holder_pid set to that process). Before the change can be applied and synchronization achieved, a request must be made to the current owner requesting possession. Each distributed object in the DOM subsystem, the_object, resides in one physical location, while any cached copies, cached_copy, are distributed throughout the system. Each distributed object has a field that specifies the location of the_object. In order to change the location of the_object, the Gryphon modifies the field to specify the new holder of that object. This results in the DOM subsystem moving the_object by notifying clients via the embedded name server that the object has changed locations.

4.4.2 Utilizing the Gryphon

The application client utilizes its GAAOP to specify hints when new objects are first created and any time update strategies need to change. The hint objects are propagated to the process's remote GAAOP proxies, running on the processes where the objects in question reside. If no hint object is created, the default will be used. The

state object is created by the remote GAAOP at the location holding the object. The state object and hint object move to a new location when the referenced object moves. If the object is deleted, the state and hint objects are also removed. The application client sets the hints by passing the object id (or 0 for the default hint object) and asking the VprConnection to locate the hint object. The locate method returns the hint object, or 0 if the object is not found; alternatively, the client can ask the VprConnection to create a new hint object. Once the hint object is tailored for the desired object, SetHints is called on the VprConnection, passing the hint object and the object id, resulting in modification of the local GAAOP and propagation of the hint object to the remote agent location.

4.5 Gryphon Layer

Distributed objects in the Gryphon system prototype have a high-level design comprised of two components: the Gryphon-level distributed object and an associated application-specific attribute object. The distributed object is used by the subsystem to distribute information, while the attribute object is the interface used by the client portion of the application. The attribute object acquires a replicated copy of the internal state of the distributed object when the client process calls the GetAttr method on the distributed object with the attribute object as the output parameter. The result of this fat operation design technique is that database-transaction-style updates can be committed, when the changes are acceptable, using the PutAttr method call on the distributed object with the attribute object as the input parameter. The changes can be discarded, when the modifications are not desirable, by not issuing the call to the PutAttr method. Figure 7 is an example of how the distributed object component and the attribute object are created and used within an application.
The first few lines in Figure 7 illustrate how a new distributed object class and the associated attribute class are created. The states of the distributed object and the attribute object are displayed between the lines of code. Sample application-level code follows that

demonstrates the process of instantiating a new distributed object, using the attribute instance to initialize the values. The distributed object is then seen to be modified over time due to remote updates or updates in another part of the application on the same process. In Figure 7 the attribute object is then updated with the new state and the Z value is modified. Finally the distributed object is modified using the attribute object to assign the new values.


// Structure of a new distributed object component and its associated
// attribute class.
class xyz_distributed_object : public base_distributed_object {
public:
    class xyz_attributes : public base_distributed_object::base_attributes {
    public:
        long x, y, z;
    };
};

// The application utilizes the attribute object and accesses the distributed
// object using the Interface Object.
xyz_distributed_object::xyz_attributes xyz_attr;
xyz_attr.x = 7;
xyz_attr.y = 11;
xyz_attr.z = 13;
IfObject *ifobj_xyz_do = Vcon->CreateObj(&xyz_attr);
// xyz_distributed_object xyz_do is accessed via ifobj_xyz_do
//   xyz_attr: X=7 Y=11 Z=13      xyz_do: X=7 Y=11 Z=13

// Time passes and remote updates have occurred.
//   xyz_attr: X=7 Y=11 Z=13      xyz_do: X=2 Y=5 Z=3

ifobj_xyz_do->GetAttr(&xyz_attr);
//   xyz_attr: X=2 Y=5 Z=3        xyz_do: X=2 Y=5 Z=3

xyz_attr.z = xyz_attr.x + xyz_attr.y;
//   xyz_attr: X=2 Y=5 Z=7        xyz_do: X=2 Y=5 Z=3

ifobj_xyz_do->PutAttr(&xyz_attr);
//   xyz_attr: X=2 Y=5 Z=7        xyz_do: X=2 Y=5 Z=7

Figure 7: Distributed Object/Attribute Client Utilization

The application interface that is manually created in the DOM implementation (but which would be automatically generated if an IDL compiler were available) is the interface provided in the attribute object. The distributed object component is only

utilized to propagate state to cached copies, or partial state in the case of an enhanced distributed object component. Enhancing a distributed object component for partial updates or additional distribution features also requires enhancements to the GAAOP, but to no other portion of the subsystem. These enhancements are specified at object design time and require no additional knowledge by the DOM subsystem. Only the GAAOP object utilizes the information; therefore the application developer does not need to alter the DOM ORB. The only exception to enhancements being hidden from the DOM subsystem is the GAAOP distributed object component itself, which contains partial update features that the Gryphon layer (within the DOM) utilizes directly.

4.5.1 Data propagation in a single threaded environment

Enhancing the DOM with Gryphon required creating a Gryphon layer that utilized the specific interface provided by the DOM. One of the issues specific to the DOM was the fact that each process is both a client and a server, but the process is not multi-threaded. The server part of the DOM (which responds to remote requests) gains control of the processing thread from the client when the client explicitly relinquishes control via a method call, or implicitly when the client calls distributed object methods within the subsystem. When caching is not used for a particular object, reads of that object result in synchronous blocking operations. Since the DOM is not multi-threaded, while the local client is waiting for the response to its read request from the object owner's server, the DOM allows the local server to respond to read requests from other processes. Control is relinquished by the client portion of the process to the server portion explicitly when the client calls the Pull method, or implicitly when the client performs a GetAttr or PutAttr method call on a distributed object.
The application, on creating the connection to the DOM, can elect to reduce the implicit relinquishment of control caused by the Gryphon calling the Pull method. This affords the application increased control over when the process switches from client to

server mode. The methods CreateObj and Delete do not cause implicit calls to Pull, in an effort to improve application performance by allowing rapid creation and deletion of objects without interruption from callbacks.

4.5.2 Modifying an object

Applications cause changes to distributed objects using the PutAttr method call described earlier. Objects may be modified locally or remotely using synchronous or asynchronous methods. If remote updates are not supported for the object, then the remote object must be moved locally before modification can occur.

• Modifying a local object - The object exists locally and modifications take effect instantly since the distributed object is held locally. The local Gryphon propagates the updates to processes whose GAAOP responded affirmatively to requiring the update.

• Asynchronously modifying a remote object - The object is not local and the modification is sent to the process holding the actual object. No acknowledgment is required, and the local system assumes the change has taken effect. When the remote location receives the update, it performs the local update task. The process that requested the modification is not sent the change or an acknowledgment.

• Synchronously modifying a remote object - The same as asynchronous, except the process making the request blocks waiting for an acknowledgment that the modification was accepted before continuing execution.

• Moving a remote object locally - In this situation the object needs to be moved locally before a change can be made to it. The location holding the object is sent a request and, if the request is accepted, will send the current state of the object with the holder_pid flag set to the requesting process id, at which point the local update task is performed.

4.5.3 Details of the Gryphon Implementation

This section gives detailed descriptions of the classes and methods available within the Gryphon layer of the subsystem. The hierarchy of the classes described in this section is represented by Figure 8.

Application
----------------
IfObject - distributed object interface with GetAttr/PutAttr
----------------
VprConnection - main layer containing the Gryphon and GAAOPs
----------------
DistrObjMan - manages messages
----------------
socket layer

Figure 8: DOM subsystem hierarchy of Classes

The DOM system is composed of five layers.

• The socket layer is used by the DistrObjMan object to distribute updates across the network.

• The distributed object layer, which exists as a DistrObjMan object, handles updates and modifications at the object level of abstraction. These first two layers are not part of the research contribution of this thesis; they are therefore described in Appendix B, "DOM - Subsystem Design used by the Gryphon."

• The VprConnection object is central to the Gryphon layer and handles the consistency and validity of the distributed objects. This task is performed by making sure distributed objects meet the application's requirements, using the GAAOPs which contain the application hints.

• The IfObject object is the distributed object and supports puts and gets by the application layer.

• The application layer utilizes the IfObject(s) and GAAOPs to create distributed applications.

4.5.3.1 Class VprConnection methods used at the Gryphon level

The VprConnection class provides functionality which is an essential part of the Gryphon layer within the DOM. These methods are called by the Gryphon as a result of changes in time or object modifications and make use of the GAAOPs.

void timeBasedPropagateUpdates() - This method checks if any distributed object modifications need to be propagated as a result of sufficient time passing since previous update propagations. The VprTime object in the DistrObjMan class, named TraceTime, is used to keep trace time information. The SetCurrentTime method is called to update time and initiates a call to this method. The checks are performed with the aid of the GAAOPs for each process. If time has expired but no unpropagated changes remain, then no propagation is required.

void objectPropagateUpdate(const IfObject *o, const long changer_pid) - This method queries each GAAOP to see if the update to the distributed object needs to be propagated to the remote process. The variable changer_pid prevents this method from propagating changes back to the process that caused the change; otherwise an echo to that process would result.

void hintChangesRequireUpdates(IfObject *g) - The GAAOP, g, has just had modifications applied which may result in new caching and update policies. The GAAOP is queried to check whether the state of any distributed objects needs to be propagated to the process represented by the GAAOP.

bool objectValid(const IfObject *o) - An object is valid if the holder_pid is the local process, which signifies that the distributed object, o, exists locally. If the object is not local, then the local GAAOP needs to be checked to see if the object is considered valid. It will be valid if the object is cached; otherwise a request is made to obtain a valid version of the object from the holder of the object.

bool objectUpdated(IfObject *o, const base_distributed_object::base_attributes *attr) - This method performs modifications using local update, asynchronous update, synchronous update, or move to local followed by local update. The local GAAOP is queried to find out what form of update is supported.

4.5.3.2 Class DistObjMan methods used at the Gryphon level

This class and the associated methods are used by the Gryphon to transmit and request distributed object updates, send create requests, and send delete requests. The main function of this class is to create messages for transmission through the socket layer. As described in detail in Appendix B, "DOM - Subsystem Design used by the Gryphon," the GOM is both the naming service and a proxy through which all data is propagated between processes in the DOM subsystem.

void PushCreate(const queue_object *obj) - Method used by the Gryphon to request creation of a new object from the GOM. The GOM will then notify all the processes that a new object has been created.

void PushDelete(queue_object *&obj) - Method used by the Gryphon to propagate deletion of an object to all processes.

void PushModify(const queue_object *obj, const long to_pid) - The Gryphon sends an object modification to the to_pid process.

long PullModify(const long obj_id, const long to_pid) - The Gryphon requests an object update from the process holding the object and waits for the response within the objectValid method.

long PushRequestModify(queue_object *&obj) - A remote request is sent for a synchronous object update.

long PushRequestMove(const long obj_id, const long to_pid) - A remote request is sent to move a distributed object from its current location to a new location.

void PullRequestResponce(queue_object *&obj, const long to_pid) - Used by the Gryphon to respond to requests from other processes regarding reading an object's state, causing object modification, or changing the object's location.

4.5.3.3 Class GAAOP methods

This class is used by the Gryphon layer and the VprConnection class. This class is the key to the Gryphon layer and handles distributed decision making on the part of the remote processes. The GAAOP class is derived from the base_distributed_object class like all the other distributed objects. The GAAOP contains a collection of hintObject(s) and stateObject(s) which it manages and utilizes to aid the Gryphon in providing location and update policies.

const om_object * timeBasedPropagateUpdates(const TimeVal &CurrentTime, const om_object &obj) - The VprTime object within the DistObjMan calls this method when its time is modified. This method calls the method checkPropagateUpdate with the changesmade variable set to 0.

const om_object * checkPropagateUpdate(const om_object &obj, bool changesmade=1) - Used to propagate updates when the time changes or when the object is modified. The caller of this method is assumed to perform the necessary update upon receiving a return value of true.

const om_object * updateRequested(const om_object &obj) - This method allows for security features to protect against unwanted updates. Returns the object to be propagated, or 0 when the update is denied.

bool setObjectState(stateObject &state) - Used by the process' GAAOP to modify the state status.

stateObject * LocateS(long id) - Returns the state status object for the requested object, or returns 0 if the id is not present.

bool setObjectHints(hintObject &hint) - Used to set hints.

hintObject * getObjectHints(long obj_id) - Returns the hint settings for the requested object, or returns 0 if the obj_id is not present. When no hints are specified for an object, the default settings are used.

The following two methods are standard parts of the base_distributed_object. The GAAOP and the entire collection of hintObject(s) are serialized, used to regenerate the class, and used by the receiving process to install the new states.

int print(string &gaaop_buffer) const - Generates a serialized string of the GAAOP class including the embedded state and hint objects.

int parse(const char token[], const string &data_string) - Parses the string generated by the print method, resulting in the instantiation of a new object containing all data from the original object.

4.5.3.4 The base hintObject class has the following variables:

obj_id - used to identify the hintObject for a given object.

maxCount - specifies how many updates can occur before the cached copy needs to be updated; 0 is the default, which turns off caching, while 1 causes strict caching.

maxDeltaTime - specifies how much time can elapse before updates require propagation; 0 is the default, which turns off time based updates.

supportAsync - a boolean value that specifies if asynchronous updates are supported.

supportSync - a boolean value that specifies if synchronous updates are supported.

supportMove - a boolean value that specifies if the object is allowed to be relocated. If remote updates are not supported and relocation is not supported, then the object is essentially locked and can only be modified by the current holder of the object.

4.5.3.5 The base stateObject has the following variables:

obj_id - used to identify the stateObject for a given object.

lastTime - specifies the time of the last update.

count - current number of modifications not propagated.

Chapter 5

Analysis of the System

This thesis defines a new technique for supporting efficient access to distributed objects using hints. The previous chapter described a proof-of-concept system that demonstrates the feasibility of the research. Before building the Gryphon System, analysis was performed to explore the theoretical gains of the system. After the prototype system was built, its behavior was measured to validate the model, and to study the behavior in more detail. Two of the major performance issues of distributed object systems are latency and message traffic. The latency of message traffic on communication subsystems is already well understood [HV99] and is therefore not addressed in the Gryphon System analysis. Instead the focus of this analysis is on message traffic reduction which results in reduced bandwidth requirements and correlates directly to a reduction in latency. Before analysis could be performed, it was crucial to identify the characteristics of ORBs. This resulted in the identification of the five types of ORBs introduced in Section 3.3: Centralized ORB, centralized cached ORB, distributed ORB with location policy, distributed ORB with location policy and strongly consistent caching, and a distributed ORB with location and caching policies. These five systems range from existing CORBA systems to an ORB utilizing the full extent of the research presented in this thesis.

Having identified the five classes of systems, analytic models were created to represent the performance of these systems in terms of message traffic. Utilizing the analytic models required specifying the expected loads on the system as input. Five scenarios representing a variety of applications were defined and used as input loads to the models. This chapter describes the analytic models for the five types of ORBs and describes five application scenarios that represent a spectrum of uses of these ORBs. The positive results from the initial analysis provided the impetus to implement and measure the Gryphon System. After implementing the Gryphon System, the next step was to characterize the load from the scenarios, then to run the scenarios on the actual system. Experimentation with the running system accentuated the need for fast computers and networks. When attempts were made to test configurations with many objects, the currently available hardware did not provide sufficient CPU cycles or bandwidth to support the tests. Therefore, a test environment using Virtual Clocks (Section 5.3.2) was created to examine performance in a simulated time environment.

5.1 System Load Scenarios

The following scenarios represent the load on object management facilities, illustrating how distributed objects are used in applications. These scenarios were created to model diverse styles of applications and describe workloads which demonstrate plausible utilizations of distributed object systems. They are the basis of the analytic model load patterns, and of the synthetic workloads used to drive the actual systems. In the case of scenarios representing DVE applications, synthetic load is essential due to the infeasibility of gathering a large number of users to test the scenarios in real time.

Scenario A: Virtual Art Museum. A DVE is a virtual museum, containing a number

of works of art available for viewing and discussion. A person enters the virtual museum, browsing various works without communication; however, it is expected that the person will also wish to find other people with whom to discuss the work. The DVE represents a person's presence through an avatar in the environment; when one person sees a number of avatars near an interesting piece of work, the person can join the group, view the work, and begin discussing it. The virtual art museum has a number of static objects with complex VRML specifications (corresponding to the art works). Avatars move infrequently, but most other objects do not move at all.

Scenario B: An Art Gallery. An art gallery has the same essential properties as an art museum, the difference being one of scale. The art gallery is a more intimate environment and tends to be encompassed in one room. The FLOATERS blimp application, described in Appendix C, "Details of Applications Built on DOM," provides load characteristics similar to those of an art gallery. The blimp and users are in the virtual space together; they can see each other as well as other objects in the room, collaborating to fly the blimp.

Scenario C: Collaborative Office Work. In a small department, people randomly visit different work areas in the office. As they perform their day-to-day work, they visit the copier, the file room, the printer, etc. Each worker has a set of supplementary tools that can be invoked on demand, e.g., word processors or database query interfaces. Each worker generally does not need to be intimately aware of the location of other workers in the office, except when there is collaborative work to be accomplished. The worker avatar objects change their state frequently, though other office objects do not tend to change at all.

Scenario D: Model-Based Virtual Environment.
This scenario describes model-based virtual environments as collaborative environments containing a model that provides context for the collaboration [NAB+95] [Nutt97]. In these systems, multiple

workers interact with one another and with isolated parts of a larger artifact (a shared model of work, a software methodology, a CAD design, etc.). In this scenario, users may interact with many different components at a relatively high rate, but these interactions do not need to be propagated to other processes at a high frequency.

Scenario E: A Weather Modeling Application. Weather modeling is a highly computational and data intensive process. Each object contains data for a region within the larger data grid. In weather modeling, data is broken into small regional subsets, then processed intensely. After a large amount of processing, data at the fringes of the subsets is distributed to subsets on other processes, and the computation continues.

5.2 Analytic Model

The analytic model is based on processes within a distributed application performing read and write method calls at various rates (according to a chosen scenario). State changes occur via write method calls (messages). All messages are considered to be the same size since they are small and fit into one network data packet. In Systems 1 and 2 there is a central process acting as the server and the rest of the processes are clients, while in Systems 3, 4, and 5 all processes act as both client and server. Analysis is based on the amount of message traffic that can occur with different scenarios using the different object managers described in Section 3.3. The following parameters are used to model message traffic:

M = Number of objects being modified at each process
U = Update rate for the objects being modified
L = Number of processes using the object
V = Number of processes that are reading objects
S = Number of static objects
F = Rate of object reads per second
R = Ratio of updates that get propagated

The models determine the following metrics, based on the parameters:

Tcen = Amount of network traffic, in messages per second, to the central object server process
Trw = Amount of network traffic, in messages per second, to each process reading and writing some objects
Two = Amount of network traffic, in messages per second, to each process reading only the objects it is writing
Ttotal = Total traffic in the network, in messages per second

System 1 (Centralized object manager):

  Type     Send           Receive
  Tcen     UML^2          UML
  Trw      UM             UML
  Two      same as Trw    same as Trw
  Ttotal   UML + UML^2    UML^2 + UML

System 2 (ORB Central):

  Type     Send                Receive
  Tcen     2UML + VF(ML+S)     same as send
  Trw      2UM + F(ML+S)       same as send
  Two      2UM                 same as send
  Ttotal   4UML + 2VF(ML+S)    same as send

System 3 (ORB with Opaque Location): Measurement comparisons with the central ORB's data become an issue because accesses that would have gone to the central ORB now go to the process where the object is located. The static objects are assumed to be equally distributed across the processes. In Trw, the first part of the expression represents read operations by the local client process, and the second part represents reads by external client processes of the data stored on the local server process.

  Type     Send (Requests + Responses)                           Receive
  Trw      F(ML+S)((L-1)/L) + F(M+S/L)(V-1)                      same as send
  Two      F(M+S/L)V                                             same as send
  Ttotal   VF(ML+S)((L-1)/L) + VF(M+S/L)(V-1) + (L-V)F(M+S/L)V   same as send

System 4 (ORB with Opaque Location and Caching): The model reflects the fact that data in this system is pushed to the processes instead of pulled via request messages (i.e., the 2X multiplier is removed to reflect the absence of a separate request message). The value of R is equal to 1 for this system.

  Type     Send          Receive
  Trw      UM(L-1)R      same as send
  Two      same as Trw   same as Trw
  Ttotal   UML(L-1)R     same as send
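To make the models easy to evaluate at specific settings, they can be transcribed directly into code. The sketch below is ours, not part of DOM or Gryphon: it assumes each metric counts both the send and the receive side of every message (which is how Table 6 in Section 6.3 tallies traffic), and it treats System 5 as the System 4 equation with R allowed to vary, as described next.

```python
# Sketch of the Section 5.2 traffic models. Parameters: M objects modified
# per process, U updates/second per object, L processes, V reading processes,
# S static objects, F reads/second, R fraction of updates propagated.
# Each function returns (Trw, Two, Ttotal) in messages per second.

def system1(M, U, L, V, S, F, R):
    """Centralized object manager: all traffic flows through one server."""
    trw = U*M + U*M*L                        # send own updates, receive all updates
    return trw, trw, 2*(U*M*L + U*M*L**2)    # clients send UML; server fans out UML^2

def system2(M, U, L, V, S, F, R):
    """Central ORB: reads and writes are request/response pairs to the ORB."""
    trw = 2*(2*U*M + F*(M*L + S))
    two = 2*(2*U*M)
    return trw, two, 2*(4*U*M*L + 2*V*F*(M*L + S))

def system3(M, U, L, V, S, F, R):
    """ORB with opaque location: objects spread evenly; reads pulled remotely."""
    trw = 2*(F*(M*L + S)*(L - 1)/L + F*(M + S/L)*(V - 1))
    two = 2*(F*(M + S/L)*V)
    ttotal = 2*(V*F*(M*L + S)*(L - 1)/L + V*F*(M + S/L)*(V - 1)
                + (L - V)*F*(M + S/L)*V)
    return trw, two, ttotal

def system5(M, U, L, V, S, F, R):
    """Caching with update policies: only the fraction R of updates is pushed."""
    trw = 2*U*M*(L - 1)*R
    return trw, trw, 2*U*M*L*(L - 1)*R

def system4(M, U, L, V, S, F, R):
    """Caching with strict consistency: the System 5 equation with R = 1."""
    return system5(M, U, L, V, S, F, 1.0)
```

Evaluating the functions at the Scenario B settings of Section 6.3 (M=1, U=2, L=20, V=20, S=100, F=24, R=0.5) reproduces the Scenario B column of Table 6.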

System 5 (ORB with Opaque Location, Caching, and Update Policies): In System 4, R was set to 1. This model uses the same equation, but R varies on a per-object and per-host basis. For the purposes of analysis with the equations, one value of R is selected for all objects and users.

These models were used with each scenario type and various parameter settings; the results are described in Chapter 6.

5.3 Measuring System Performance

With the Gryphon-enhanced DOM subsystem implemented, the performance of the actual system could be evaluated. A scenario program was created to run synthetic loads corresponding to each of the previously described scenarios. The scenario program uses virtual time, since the scenarios described cannot run in real time on the available hardware. Virtual Clocks are used to synchronize the processes in the application and produce deterministic, reproducible executions. The run time of the application yields information regarding performance, but the other techniques described in this section are required to measure message traffic.

5.3.1 Trace Analysis Tools and Techniques

Message traffic data is required to determine the improvement in performance related to bandwidth utilization. DOM is instrumented to produce one trace file per process; these individual trace files are merged into one global trace file, which is processed by various Perl scripts. Unfortunately, the trace files generated by larger scenarios require implausible amounts of storage (multiple gigabytes), which was not available. As a consequence, run-time trace analysis was built into DOM to produce summary message traffic data. For details on the methodology used for trace

generation and analysis, see Appendix A, "DOPICL - Trace Generation."

5.3.2 Virtual Clocks

Common problems when creating distributed applications include synchronization and starvation among processes. When left untethered, a process can monopolize the system or be dominated by other processes. For example, when all the processes exist on a uniprocessor system, the OS will tend to give each process what it considers to be a fair share of the processing time. Unfortunately, a process holding a disproportionately large number of the distributed objects can spend all of its cycles servicing the other processes. Conversely, a process with few distributed objects can spend all of its time performing its own processing and using other processes' resources, resulting in starvation of the less "aggressive" processes. Synchronization mechanisms can eliminate starvation between processes.

Abdul-Fatah et al. encountered starvation problems when performing their timing scenarios on a collection of processors running their derivatives of VisiBroker: faster processors were starving the slower ones. Their solution was to slow down the processors with delays to create homogeneous machines [AM98], allowing for testing, but not solving the starvation issue in real applications. We have used a similar approach for situations where process A needs to perform certain tasks before process B.

The solution I created, Virtual Clocks, works at the application level using a derivative of Lamport's Logical Clocks [Lamp78]. A distributed clock is created to synchronize all processes. When a process in the distributed application is ready to continue, it submits a request to move the clock to its next desired time. The process in charge of incrementing the global clock waits until all processes have finished their work for the current time; the global clock is then incremented to the lowest requested time. Processes sleep and do not awake until the global clock reaches the time requested via the local clock. The Virtual Clock mechanism is implemented using distributed objects at the

application level, leaving enforcement to the application.1 Implementation involves creating one global clock object representing the current time, while each process creates its own local clock representing its requested next time step. A single process updates the global clock, but unlike a central clock, the clock objects themselves are distributed. When the process modifying the global clock finds the minimum time over all the processes' local clocks to be greater than the global clock, the global clock is updated to that next clock step. The processes that awaken do the required work for the time period and then call a sleep function. The sleep function sets the local clock to the next locally requested time and waits until the global clock has reached the requested time; it returns when the sleep time has expired and the global clock is equal to the requested local clock time.

The Virtual Clock mechanism is distributed, since the global and local clocks are implemented using distributed objects. The sleep function on the client process compares the global clock object (which is cached in my implementation) with the local clock object (which exists locally in my implementation). With this implementation, the local clock object is cached at the process controlling the global clock updates, resulting in message traffic only on modifications. The same is true of the global clock, which propagates updates to the cached copies on the client processes using the object.

5.3.3 Using Virtual Clocks

The DOPICL trace files described in Appendix A, "DOPICL - Trace Generation," have been very helpful in data analysis and visualization. Trace files and the ParaGraph visualization tool [HE91] have also been invaluable in identifying and debugging design problems within the DOM subsystem and within applications using DOM.
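The clock-advance rule of Section 5.3.2 can be sketched as a toy, single-address-space model. The class and method names below are ours, not DOM's, and the real mechanism implements the clocks as distributed objects rather than a shared table.

```python
class VirtualClock:
    """Toy model of the Virtual Clock advance rule: the global clock jumps to
    the lowest requested time, and only once every process has posted a
    request beyond the current global time."""

    def __init__(self, nprocs, start=-1):
        self.global_time = start
        self.local = {p: start for p in range(nprocs)}   # requested next times

    def request(self, proc, next_time):
        """A process finishes its work and asks to move on to next_time."""
        self.local[proc] = next_time
        # advance only when every local clock is ahead of the global clock
        if min(self.local.values()) > self.global_time:
            self.global_time = min(self.local.values())

    def runnable(self, proc):
        """A sleeping process wakes when the global clock reaches its request."""
        return self.global_time >= self.local[proc]
```

With two processes requesting times 3 and 5, the global clock advances to 3 and wakes only the first process; once that process requests a time beyond 3, the clock jumps to 5.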
1. If applications do not comply voluntarily, Brandt suggests moving the compliance mechanism into the OS [Bran99].

When using the scenario load generator to analyze the systems described in Section 3.3, all the processes started executing immediately. This

is problematic, since some processes start before others are even created, making it difficult to generate a deterministic run of the scenario application. The number of processes involved in the application changed as processes caught up and joined the session, resulting in a varying number of broadcast object-update messages as the population changed. In order to have the scenario program validate the theoretical performance specified in Section 5.2, measurements must be taken once the state specified in the scenarios exists. To achieve this, the scenario program needs to run for a fixed amount of time with a fixed number of objects and processes executing at the same rate under the various levels of Gryphon. Virtual Clocks are one solution to this synchronization problem.

The TimeVal class, which is used to implement Virtual Clocks, was enhanced to allow negative time. The global clock is initialized with a time of -1 microseconds. All processes sleep until the global clock time reaches 0, and global time is only advanced once all the processes have created their local clocks with a requested time of 0. To ensure all processes are in a ready state before starting, the processes must first create all the objects for the scenario, and then create the local clock and wait on time 0. The processes then perform their specified work loads in the appropriate time frames, until the termination time is reached within the application.

Using Virtual Clocks allows for the production of repeatable, deterministic application executions. Another advantage is the ability to run applications at a slower rate than real time, which enables limited computing resources to emulate faster systems. For example, the Pentium Pro 200MHz hardware can only render 2.4 frames per second in the VPR application (with many graphic objects); by slowing down time so that one virtual second took ten real seconds, it was possible to achieve the desired 24 frames per second in virtual time.
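The slowdown arithmetic in this example is worth making explicit. The sketch below is ours, not DOM code: if the hardware renders only 2.4 frames per real second and the application wants 24 frames per virtual second, each virtual second must be stretched over 24/2.4 = 10 real seconds.

```python
def slowdown_factor(real_fps, target_fps):
    """Real seconds needed per virtual second so that the frames actually
    rendered during one virtual second match the target virtual frame rate."""
    return target_fps / real_fps

# The VPR example: 2.4 real frames/second, 24 virtual frames/second desired.
factor = slowdown_factor(2.4, 24)   # one virtual second takes 10 real seconds
virtual_fps = 2.4 * factor          # frames rendered per virtual second
```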

Chapter 6

Results

Important performance issues in utilizing distributed objects have been identified in previous chapters. System analysis techniques were specified in Section 5.3, and the Gryphon was implemented to enable the measurement of performance relative to these issues. Section 6.1 describes the space, time, and message traffic overhead incurred by enhancing an ORB with Gryphon. The remainder of the chapter demonstrates that the Gryphon-enabled DOM is feasible to implement, and then illustrates the resulting performance improvement. The performance improvement is measured as message traffic reduction. Data is also provided exemplifying the relative time reduction that results from reduced message traffic and the associated message latency.

6.1 Gryphon Subsystem Overhead

Utilizing a Gryphon-enabled ORB introduces both space and time overhead on each process participating in the application. The overhead is directly related to the hint settings and the utilization of per-distributed-object hint objects.

6.1.1 Space Overhead

The space overhead grows based on the need for hint and state objects (i.e., for GAAOPs). The hint objects hold the hint information regarding caching and update

propagation strategies, which currently take a few bytes. The hint object size is based on the quantity and type of information the application can pass to the subsystem; the base hint object is defined in Section 4.5.3.4. The state objects hold information pertaining to the last update time and the number of unpropagated updates, and they also take only a few bytes. The state object may also hold other information required for specialized partial-update objects.

The hint space overhead ranges from a minimum, when all the GAAOPs set only the default hint object:

  amount of hint memory = (default hint object) * (number of processes) * (size of hint object)

to a maximum, when each distributed object has its own specialized hint object:

  amount of hint memory = (((default hint object + 1 hint object for every distributed object located locally) * (number of processes)) + (1 hint object for every distributed object located remotely)) * (size of hint object)

Every process has a default hint object set in its GAAOP to specify the default object caching and update behavior. Hint objects are set by the application, located in the local process's GAAOP, and distributed to the GAAOP proxy at the location where the object actually resides. The hint object referring to a distributed object only needs to exist in the GAAOP proxy on the process holding and acting as server for the distributed object and on the process being represented by the GAAOP.

The amount of state memory can be as low as zero when objects do not need state, which occurs when objects are not cached or use strict caching. Distributed objects that are not cached do not need state objects, since they are always invalid, and strictly cached distributed objects do not need state information, since they are always valid. A distributed object only requires a state object at the location holding and acting as server for the distributed object, so if no distributed objects exist locally then no state

objects need to exist locally either. The highest number of state objects is required on a process when all distributed objects use application-specific consistency caching and exist on the process in question:

  amount of state memory = (number of processes - 1) * (number of objects located locally and using non-strict caching policies) * (size of state object)

For example, System 5 running Scenario E has 200 objects on each process, which is a relatively high number of objects per process, and results in 200 state objects for each of the 4 other processes.

6.1.2 Time Overhead

The method call that checks whether remote propagation of distributed objects is required is a few lines of C++ code and is executed on the local machine, so no remote traffic is generated. The processing required to perform time-based updates ranges from zero method calls, when no time-based updates are required, to an upper bound when all objects are located locally and require time updates:

  number of method calls = (number of processes - 1) * (number of objects located locally requiring time-based updates)

The message traffic generated is dependent on the hints being used. The space cost of adding time-based updates to one object is (number of processes - 1) hint objects and (number of processes - 1) state objects, and the processing cost is (number of processes - 1) method calls to see if an update needs to be propagated.

The time-based checks occur at the granularity of time updates selected by the application. The application can update the Gryphon clock, or the application can let the subsystem perform the clock updates. Every time the clock changes, the above tests need to be performed, so choosing a small time granularity causes time updates to be relatively costly in CPU utilization. If the finest granularity available on the hardware were used for

time, the entire CPU would be consumed by time-update checks.

The processing cost of modifying a distributed object that is located locally is a method call to each GAAOP (number of processes - 1) to check whether the modification needs to be propagated. Message traffic occurs when distributed object modifications require propagation and when a process changes its GAAOP hints. Changing the location of a distributed object causes the associated state object, in the GAAOPs, to be transferred to the new object location.

6.2 Analyzing the Architectures

The patterns of message traffic for the various systems are represented by the models described in Section 5.2. Before comparing subsystem performance under different scenarios, consider the behavior of the different systems under fixed loads representing Scenario B, while varying some of the parameters. This section gives an understanding of the parameters that affect the performance of an application and yields insight into subsystem selection. The following parameters are utilized in Figure 9, with the number of processes varying from 2 to 40:

M = Number of objects being modified at each process = 20/L
U = Update rate for objects being modified = 2 updates/second
V = Number of processes that are reading objects = L/2
S = Number of static objects = 100
F = Rate of object reads = 24 reads/second
R = Ratio of updates that get propagated = 10%

Figure 9 represents message traffic (in log10 messages per second) for the

number of client processes, L, from 2 to 40. For Trw and Two in Systems 1 and 2, all data passes through a central server process, creating the bottleneck described in Section 3.3. This bottleneck is equal to the total number of messages that flow through the system, Ttotal, since all communication must go through the central server. The Ttotal and Trw graphs of Figure 9 show that Systems 2 and 3 have a much higher message traffic rate than Systems 1, 4, and 5, caused by the read requests that result in remote accesses. Two in Figure 9 represents processes that do not perform reads and reflects the same

results as Trw and Ttotal, except in System 2, which has less traffic since write-only processes do not access external objects and no objects exist locally. System 3 has high traffic since objects are evenly distributed across the available processes, and a process not performing reads must still respond to external reads. Figure 9 shows that programs with these characteristics are scalable on Systems 1, 4, and 5, and it demonstrates the advantage of specifying location in conjunction with caching policies. It is not surprising that the fixed policy of System 1 performs well, since it has been specifically designed for this style of application.

Figure 10 shows message traffic (in log10 messages per second) where L is held constant at 20 and the number of objects being modified at each process, M, ranges from 0 to 10. The figure shows poor performance for Systems 2 and 3, except that System 2 performs well in Two for the same reason previously described in reference to Figure 9. Figure 10 shows that location and caching policies are crucial in enabling an object system to scale to an increasing number of objects.

In Figure 11, message traffic is in messages per second, M is fixed at 1, L at 20, and U (the update rate) ranges from 0 to 100. Figure 11 demonstrates the importance of specifying update policy: the frequency of updates to each object increases while reads remain constant, resulting in excessive update distribution. This is the case in Systems 1 and 4 (which provide a strongly consistent cache); even though the reads are only performed at a fixed rate of F, the updates are propagated at an increasing rate represented by U. This part of the analysis illustrates the need for selecting the appropriate system based on the needs of the application. For applications requiring low latency, the systems with caching perform much better. When the application performs many writes but very few reads, the cached systems can reduce performance.
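As a spot check on these trends, the Ttotal expressions of Section 5.2 can be evaluated at the right edge of Figure 9, L = 40 (so M = 20/L = 0.5 and V = L/2 = 20, with U = 2, S = 100, F = 24, R = 0.1). The arithmetic below is ours; as elsewhere, each metric counts send plus receive.

```python
# Figure 9 settings evaluated at L = 40
M, U, L, V, S, F, R = 0.5, 2, 40, 20, 100, 24, 0.1

ttotal_1 = 2 * (U*M*L + U*M*L**2)            # System 1: centralized object manager
ttotal_2 = 2 * (4*U*M*L + 2*V*F*(M*L + S))   # System 2: central ORB
ttotal_5 = 2 * U*M*L * (L - 1) * R           # System 5: caching + update policies
```

System 2 generates roughly 230,000 messages per second at this point, against about 3,280 for System 1 and 312 for System 5: the gap of several orders of magnitude visible in the Ttotal graph.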
System 5 performs best for all of these applications, and the system can be tailored for each object: System 5 can use sequential consistency caching, infrequent caching, disabled caching, and specified location. For applications where System 5 performs best, the system demonstrates several orders of magnitude improvement over the other systems.

[Figure 9 contains three log-scale graphs: Trw, Two, and Ttotal, in log10 messages per second, plotted against L for Systems 1 through 5.]
Figure 9: Systems in Section 3.3, varying the number of processes (L). Settings: M=20/L, U=2, V=L/2, S=100, F=24, R=0.1, with L from 2 to 40.

[Figure 10 contains three log-scale graphs: Trw, Two, and Ttotal, in log10 messages per second, plotted against M for Systems 1 through 5.]
Figure 10: Systems in Section 3.3, varying the number of modified objects at each process (M). Settings: U=2, L=20, V=10, S=100, F=24, R=0.1, with M from 0 to 10.

[Figure 11 contains three graphs: Trw, Two, and Ttotal, in messages per second, plotted against U for Systems 1 through 5.]
Figure 11: Systems in Section 3.3, varying the update rate (U). Settings: M=1, L=20, V=10, S=100, F=24, R=0.1, with U from 0 to 100.

6.3 Analyzing the Application Scenarios

This section compares the performance of the various scenarios across the various systems, i.e., a set of values is selected for the parameters to represent each scenario. The results demonstrate the advantages of using a Gryphon-enabled subsystem to meet the needs of specific applications.

Scenario A - A Virtual Art Museum: M = 1 (only avatars move), U = 2 (averaged moves), L = 1,000, V = 1,000, S = 10,000 (includes art and building structure), F = 24, R = 0.01 (only propagate 1/2 the updates for 20/1,000 of the patrons, since an avatar can only see a small subset).

Scenario B - An Art Gallery: M = 1, U = 2, L = 20, V = 20, S = 100, F = 24, R = 0.5 (only propagate 1/2 the updates).

Scenario C - Collaborative Office Work in the VPR: M = 20, U = 5, L = 5, V = 5, S = 1,000, F = 24, R = 0.017 (only propagate 1/3 of updates and only see 5/100 of the objects).

Scenario D - A Model-Based Virtual Environment: Assume 5 participants each modify 3 objects at a time. Also assume each object in the environment controls itself, so no application is needed. M = 4, U = 5, L = 5, V = 5, S = 10,000, F = 24, R = 0.1 (the client process only sees some modifications and reduces the propagation frequency).

Scenario E - A Weather Modeling Application: M = 1,000, U = 1,000, L = 5, V = 0, S = 0, F = 0, R = 0.0001 (the data only needs to be redistributed once every 10 seconds).
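The settings above can be collected into a table and fed to the System 5 equation from Section 5.2 (Trw = 2UM(L-1)R, counting send plus receive); doing so reproduces the System 5 row of Trw in Table 6. The code is ours and purely illustrative.

```python
# Scenario parameters (M, U, L, V, S, F, R) from Section 6.3.
scenarios = {
    "A": dict(M=1,    U=2,    L=1000, V=1000, S=10000, F=24, R=0.01),
    "B": dict(M=1,    U=2,    L=20,   V=20,   S=100,   F=24, R=0.5),
    "C": dict(M=20,   U=5,    L=5,    V=5,    S=1000,  F=24, R=0.017),
    "D": dict(M=4,    U=5,    L=5,    V=5,    S=10000, F=24, R=0.1),
    "E": dict(M=1000, U=1000, L=5,    V=0,    S=0,     F=0,  R=0.0001),
}

def system5_trw(M, U, L, R, **ignored):
    # messages/second at a read/write process: pushed updates, send + receive
    return 2 * U * M * (L - 1) * R

trw = {name: round(system5_trw(**p)) for name, p in scenarios.items()}
```

This yields 40, 38, 14, 16, and 800 messages per second for Scenarios A through E, matching the System 5 entries of Table 6.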

Table 6: Scenario Performance Models

             Scenario A     Scenario B   Scenario C   Scenario D    Scenario E
Trw
  1                 2,002          42          600          120     6,000,000
  2               528,008       5,768       53,200      481,040     4,000,000
  3             1,054,940      10,944       84,480      769,536             0
  4                 3,996          76          800          160     8,000,000
  5                    40          38           14           16           800
Two
  1                 2,002          42          600          120     6,000,000
  2                     8           8          400           80     4,000,000
  3               528,000       5,760       52,800      480,960             0
  4                 3,996          76          800          160     8,000,000
  5                    40          38           14           16           800
Ttotal
  1             4,004,000       1,680        6,000        1,200    60,000,000
  2         1,056,020,000     230,720      532,000    4,810,400    40,000,000
  3         1,054,940,000     218,880      422,400    3,847,680             0
  4             3,996,000       1,520        4,000          800    40,000,000
  5                39,960         760           68           80         4,000

Key: Units in table are messages per second.
System 1 - DOM (centralized)       System 2 - Central ORB
System 3 - Location information    System 4 - System 3 + Caching
System 5 - System 4 + Update Policies
Trw - Msgs per process performing reads/writes
Two - Msgs per process performing writes
Ttotal - Total Msgs in the system
Table 6 shows Trw (the amount of network traffic to each read and write processes in messages per second), Two (the amount of network traffic to each write only processes in messages per second), and Ttotal (the total traffic in the network in messages per second) for each system under each scenario. Observe that System 2 (centralized ORB) does not perform well for many scenarios, this is the result of location transparency and lack of caching. The table illustrates the performance

advantage of object placement in System 3 running Scenario E, where objects are placed in the application-favored location, resulting in a large performance gain. Scenario E is deceiving, however, since only System 5 shows the extra messages that result from infrequent reads of small portions of data. For many applications, caching (Systems 1 and 4) results in significant performance gains, while in some applications, such as Scenarios A and E, caching results in unnecessary cache consistency updates. System 5, with update policies, demonstrates a larger reduction in message traffic except in Scenario B, where message traffic is "only" reduced by half of the number of messages generated by System 4 without the update policies. This example highlights why update policies are important, since Scenario B can be interpreted as Scenario A with update strategies already applied to the scenario. These scenarios illustrate that System 5, with location, caching, and update policies, generates the best performance; with developer input, a usable system tailored to a specialized application domain can be created. System 5's Ttotal message traffic is only a fraction of a percent of that of the centralized ORB, System 2, in all 5 scenarios.

6.4 Validation using the Gryphon Enabled DOM

The Gryphon-enabled DOM was implemented after analysis suggested that a System 5 type ORB would yield orders of magnitude reduction in message traffic. The data gathered from running the scenario application on the enhanced DOM has confirmed the theoretical performance. The Gryphon design, with the use of GAAOPs, resulted in minimal message traffic overhead; the overhead is only incurred when hint settings are modified. The scenario program running on System 5 validated these results.

Figure 12 depicts 3 graphs from running the scenario application using System 5 with varying numbers of processes, modifiable objects, and update modification rates.
Each client process uses a Poisson distribution [Jain91], based on the specified mean, to decide the client's update rate and the number of objects that the client will be

modifying each second. The number of objects, the update rates, and the number of processes were selected to illustrate a few ranges of message traffic. The theoretical mean is computed using the System 5 equation with the mean values utilized by the processes as inputs. The graphs in Figure 12 illustrate the close agreement between the theoretical mean and the actual mean that results from running the application. The variance in load results in significant variation in the message traffic, though it is correlated with the mean value analysis.
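The per-second Poisson draws can be sketched with a small sampler. Python's standard library has no Poisson generator, so Knuth's classic algorithm is used below; all names, and the use of the System 5 expression from Section 5.2 for one second of traffic, are our illustrative choices.

```python
import math
import random

def poisson(mean, rng):
    """Knuth's Poisson sampler: multiply uniform draws until the product
    falls below e^-mean; the number of draws minus one is the sample."""
    limit = math.exp(-mean)
    k, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return k
        k += 1

def second_of_traffic(mean_objects, mean_rate, nprocs, R, rng):
    """Messages generated in one virtual second: each client redraws its
    object count and update rate, then pushes the fraction R of its updates
    to the other nprocs - 1 processes (send + receive counted)."""
    total = 0.0
    for _ in range(nprocs):
        m = poisson(mean_objects, rng)   # objects this client modifies
        u = poisson(mean_rate, rng)      # updates per object this second
        total += 2 * u * m * (nprocs - 1) * R
    return total
```

Averaged over many seconds, these totals track the theoretical mean obtained by plugging the means themselves into the System 5 equation, which is the comparison drawn in Figure 12.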


Figure 12: Theoretic vs. Actual Mean Message Traffic using System 5. The 3 graphs in this figure depict 60-second runs, with each client process changing its behavior by recomputing the Poisson distribution each second. Graph 1 - 5 processes with a mean of 10 objects and 10 updates per second. Graph 2 - 10 processes with a mean of 10 objects and 10 updates per second. Graph 3 - 5 processes with a mean of 100 objects and 5 updates per second.

6.5 Stochastic Load Representation

Section 6.3 provides results from executing 5 scenarios with uniform distributions. In an effort to create more realistic application client processes, Poisson distributions are utilized by the clients to select the update rate and the quantity of objects being modified. This section uses three new scenarios to illustrate the performance of the 5 systems. The scenario program, described in detail in Appendix C, "Scenario Application," was run using three sets of parameters (1, 2, and 3). These executions of the scenario program were performed on a Pentium Pro 200MHz running Linux kernel 2.0.x.

The graphs in Figure 13, Figure 14, and Figure 15 show the message traffic generated, in log10 messages per second, and the application run time (on the non-dedicated Linux machine), in log10 seconds. The scenario data was collected by running all processes of the scenario application on one computer, so network delays and variances did not affect the timing results, since all the message traffic was confined within the single machine. The run time information is supplied to contrast relative execution times, but the focus remains on the message traffic.

The message traffic graphs show the minimum, maximum, mean, and theoretical mean number of messages that occurred in a given process. The theoretical mean is computed using the mathematical model with input fitting a uniform distribution. The mean is computed by taking the total message traffic and dividing it by the number of processes. Showing that the theoretical mean and actual mean values are close to one another supports the validity of the scenario application. Providing the minimum and maximum message traffic gives an idea of the range of traffic and identifies the bottleneck: in Systems 1 and 2 the maximum message traffic occurs in the central process, which is the bottleneck and has much higher traffic than the other processes, as evident from the location of the mean in the following logarithmic graphs.
The processes within the scenario application utilize the parametric input of the mean and a Poisson distribution function to calculate the number of objects to create and the update modification rate of the objects. A random number generator with a fixed seed value is chosen, enabling repeatability of the distributions and relative comparisons between the systems. The scenario application measurements begin once the processes have synchronized, resulting in a stable, predictable state; all updates are based on the number of updates per second. The scenario application was executed for one virtual second, using Virtual Clocks, to compare message traffic across systems. It is sufficient to run the application for one virtual second except with System 5, which only distributes the fraction R of updates. For System 5 the application needs to execute for 1/R virtual seconds before a full cycle is executed; the time and message counts for System 5 are then divided by 1/R to yield one-virtual-second data.

Figure 13 shows results for Scenario 1, illustrating the behavior when 20 processes are busy reading many objects and modifying a few objects. The number of objects being modified at each client process and the modification rate are decided at each client by using a Poisson distribution with the mean prescribed by the scenario. The data shows that message traffic is 1 1/2 orders of magnitude greater in Systems 2 and 3 than in Systems 1 and 4, a difference resulting from caching. Another order of magnitude improvement occurs in System 5, which reduces unnecessary update propagation. The run time reflects the reduction in message traffic between systems.

Figure 14 is a graph of Scenario 2, which shows 10 processes performing many updates, with 5 of the processes performing reads. This graph is an example of unnecessary update propagation, since Systems 1 and 4 do not do as well as Systems 2 and 3, which do not use caching. System 3 is slightly lower in message traffic than System 2, since the central bottleneck is removed and the objects being modified are in the proper location. System 5 reduces the unnecessary update propagation and runs faster than the other systems.
The run times for the first four systems do not reflect the message traffic; they result from the implementation and the cost of the synchronization mechanism used to collect the data. Figure 13 has only 45 synchronization points and runs in 28 seconds with caching enabled and update propagation disabled, while Figure 14 has 914 synchronization points and takes 3,244 seconds to run.

Figure 15 graphs Scenario 3, with 5 processes performing reads of a high number of objects. The results are similar to Scenario 1, except that Systems 2 and 3 perform relatively worse than Systems 1, 4, and 5 due to the high number of reads. System 5 sees only a small improvement over System 4, since update propagation is not a major issue. Similar to Scenario 1, Scenario 3 has 41 synchronization points and runs in 103 seconds when caching is turned on and update propagation is turned off.

[Figure 13 graphs: log-scale message traffic (min/max, mean, and theoretical mean, in messages per second) and log-scale run time (seconds) for Systems 1 through 5.]
Figure 13: Scenario 1 with a Poisson distribution with a mean of 5 objects and a mean of 4 updates per second. The 20 processes perform 24 reads per second and 100 static objects are present in the system. System 5 distributes 1/10 of the messages. The top graph is the minimum, maximum, mean, and theoretical mean of messages per second and the bottom graph is the execution time on a Pentium Pro 200MHz running a 2.0.x Linux kernel.

[Figure 14 graphs: log-scale message traffic (min/max, mean, and theoretical mean, in messages per second) and log-scale run time (seconds) for Systems 1 through 5.]
Figure 14: Scenario 2 with a Poisson distribution with a mean of 100 objects and a mean of 100 updates per second. Of the 10 processes, 5 perform 24 reads per second, and 0 static objects are present in the system. System 5 distributes 1/30 of the messages. The top graph is the minimum, maximum, mean, and theoretical mean of messages per second and the bottom graph is the execution time on a Pentium Pro 200MHz running a 2.0.x Linux kernel.

[Figure 15 graphs: log-scale message traffic (min/max, mean, and theoretical mean, in messages per second) and log-scale run time (seconds) for Systems 1 through 5.]
Figure 15: Scenario 3 with a Poisson distribution with a mean of 20 objects and a mean of 5 updates per second. The 5 processes perform 24 reads per second and 1,000 static objects are present in the system. System 5 distributes 1/2 of the messages. The top graph is the minimum, maximum, mean, and theoretical mean of messages per second and the bottom graph is the execution time on a Pentium Pro 200MHz running a 2.0.x Linux kernel.

6.6 Metrics for Comparing Gryphon to Other Subsystems The Gryphon system was implemented in the prototype DOM subsystem, and the scenario application was created to test the performance. The data in Section 6.5 shows the performance in terms of message traffic and wall clock run time. When comparing Gryphon with other subsystems, only the message traffic metrics should be used. Comparing run time is not a proper metric, since that only tests the implementation efficiency of the DOM prototype and not the Gryphon architecture. Using hints, it is possible to cause the DOM prototype to behave like five distinct subsystems. This feature made it possible to compare the Gryphon architecture on the system with application-specified caching and location information (System 5) against the other four systems. The five systems are compared using the loads for the different scenarios described in Section 6.3. Other systems can compare their performance with Gryphon by using the message traffic metrics and running an application generating the load specified in the scenarios. When comparing the Gryphon architecture with other subsystems, it is important to remember the goal of Gryphon. This architecture was designed to enable developers, applications, and automated tools to affect the subsystem using an auxiliary interface. In this manner the subsystem is general and is then tuned by external forces. 6.7 Summary of Results The results in this chapter, from executing the Gryphon enhanced DOM prototype, demonstrate a significant reduction in message traffic and measurable reductions in run time from using location and caching policies. Scenario 1 in Figure 13 illustrates a three orders of magnitude reduction in the message traffic per process (100,000 messages in System 2 and only 100 messages in System 5) on the most overloaded process when location and caching policies are utilized over the centralized ORB approach. In this same scenario the execution time for the application

was reduced by more than two orders of magnitude, from over 10,000 seconds in System 2 to 30 seconds in System 5. Since the message traffic reduction in Scenario 1 is per process, the overall reduction in message traffic over the network can be significantly larger. Table 6 illustrates such a message traffic reduction (over four orders of magnitude in total traffic through the system) for running a virtual art museum DVE as described in Scenario A. The scenarios in this chapter show the results of using the same settings for all objects in the system. For many applications, the objects need to be tuned on a per object basis, which is supported by the Gryphon system design. Per object tuning was used in the scenario program to improve the performance of the synchronization mechanism. The "Visualization of the Traces" section of Appendix A describes how hint settings were used with the global clock and local clocks to reduce message traffic and latencies, resulting in improvements in performance of the scenario application. This chapter illustrated that tailoring the subsystem to each application's unique characteristics results in efficient message traffic utilization and appropriate access latencies. The choice of traffic and latency is made by the application based on the desired characteristics of the application. These improvements in distributed system utilization come at the cost of the processing and space requirements described in Section 6.1. The data in the other sections demonstrates that caching, location, and update strategies aid in message traffic and execution time reduction. The techniques identified in this thesis should be considered as enhancement options for existing distributed systems.

Chapter 7

Conclusion and Future Work

This dissertation has addressed distributed object technology and its associated drawbacks. Results have shown the advantages of enhancing an ORB with the Gryphon architecture to reduce bandwidth utilization and latency. This chapter gives a summary of the completed research and suggests areas for future research. 7.1 Infrastructure In order to perform the research in this thesis, it was first necessary to identify and study the cutting-edge research in distributed objects and other areas which impact the development or utilization of distributed object technology. Chapter 2, "Related Work," references these works and identifies the areas needing further research that are addressed in this thesis. The research represented in this thesis has shown impressive results using the enhanced Gryphon system design. It would have been desirable to utilize an existing distributed object subsystem instead of creating one from scratch, thereby reducing the work and implementation time; unfortunately, finding a suitable preexisting subsystem was infeasible. The distributed object subsystem needs to have object level communication that can be modified programmatically to make changes in distribution techniques and add the Gryphon to utilize the GAAOPs; subsystems available at the time of this research did not have the

required features to support location specification and object caching. Therefore the locally developed DOM was used as the subsystem and enhanced with a Gryphon. This method was satisfactory, as it was possible to make modifications to the ORB and to test the research. Unfortunately, the DOM subsystem had its drawbacks: such a system requires applications to be written specifically to the DOM subsystem interface. The resulting applications are not portable to other subsystem platforms, and existing applications from other subsystems need to be modified to work with the DOM subsystem. The result was a specialized subsystem that was flexible and allowed for many optimizations, but all applications needed to be written from the bottom up to suit the subsystem. For this reason, the decision was made to change the interface to fit the CORBA standard. Because the CORBA standard suffers from location transparency, it would have been necessary to enhance the CORBA ORB to provide location information. After reading the specifications and seeing what was required to become CORBA compliant, it became clear that too much time would be spent on the communication substructure instead of the actual research. A subsequent search for an open CORBA implementation turned up Electra, a CORBA-compliant interface that runs on ISIS, Horus, and Ensemble. However, as Electra is no longer supported, it was not possible to utilize this system. No other open systems with the necessary features were available at the time. The ultimate solution included using the DOM subsystem, making the interface similar to CORBA, but without actually implementing full CORBA compliance. The result was a system which yields results that can be applied to CORBA, DCOM, EJB (Enterprise JavaBeans), and other distributed object technologies.
7.2 Applications using Gryphon Part of this thesis research involved applying the hints to objects within applications and validating that the improvements in performance did occur. Objects

were given hint settings based on the application demands. These settings significantly improved the performance of applications; so in this regard, the Gryphon enhanced subsystem has demonstrated the value of the approach. A minor goal was to see if developers would be satisfied with the type of interaction provided by the system. A multiplayer 3D Tic-tac-toe application was built by Bob Cooksey and is described in Section C.5. With a dozen hours of work he was able to build the initial version of the application and later use the hints to generate an application meeting his requirements. Section C.6 describes another example application utilizing the Gryphon, a distributed MPEG client-server application designed by Michael Neufeld. His system has one process serving the data and multiple processes utilizing the data with varying degrees of QoS. As the network becomes congested or a client process cannot keep up with graphic frames, the less significant frames are no longer propagated to the client process in question. These two applications made use of Gryphon to reduce message traffic and provided a small usability study. Another distributed object application is a stock market ticker application with a server containing objects, each object representing a stock, and clients with cached copies of the stock objects of interest to the client. The stock server then propagates all updates to the stock objects of interest to the clients with sufficient resources, while clients with reduced resources receive less frequent updates and any major updates (large volume trades, new highs and lows, etc.). It is clear that bandwidth can be reduced using the Gryphon and that developers with domain knowledge can use hints to generate efficient programs, which validates giving applications control over location and update policies rather than keeping them opaque. 7.3 Conclusion The qualitative performance of the Gryphon was evaluated using the DOM subsystem and distributed applications (i.e.,
the effect on the feel of the VPR). DVE applications that were infeasible are now feasible, indicating an improvement in

qualitative performance. Quantitative tests and measurements of distributed object applications using the Gryphon enhanced DOM validated the theoretical results, which showed a reduction of multiple orders of magnitude in message traffic. This dissertation has demonstrated the feasibility of implementing applications that were previously infeasible due to subsystems that did not support specification of object location, object caching, and object update policies. 7.4 Future Work This thesis has shown that application developers, with knowledge regarding application characteristics, can significantly improve application performance. Future research needs to be performed to study how special distributed data (i.e., environment variables) can be used to enable the subsystem to make automated decisions when no substantial knowledge is available. Network bandwidths and latencies, I/O, CPU power, and resource sharing might have an effect on object placement and update policy, and an automated system could use this additional information in making decisions. For example, a user interacting with groupware on a low bandwidth, high latency connection will need different strategies than another user employing a high bandwidth, low latency connection. Future work will entail taking the Gryphon architecture and enhancing existing systems with this new technology. The TAO system from Washington University is an open CORBA implementation [SLM98] that could be used in future work to create a CORBA compliant system enhanced with Gryphon. Another area of study is to increase the knowledge within the GAAOP. The GAAOP, along with smarter objects, would then be able to generate partial updates and more complex update strategies. This technology would enable applications like the MPEG player and weather modeling applications to efficiently utilize available bandwidth.

7.4.1 Throttle Clocks Another area of research involves applying Virtual Clocks to systems to achieve loose synchronization across machines. This application of Virtual Clocks, which I have named Throttle Clocks, may produce scheduling that is not achievable with schedulers that do not involve the application in the decision process. Throttle Clocks may enable applications collaborating on an individual user's desktop computer to improve user satisfaction with reduced resources. Throttle Clocks are implemented with a combination of Virtual Clocks and real-time clocks based on system time. Throttle Clocks can be used when running processes that need to be kept at a relative rate of execution but do not need tight time coupling. For example, suppose a DVE process is rendering 24 frames a second but another process performing object modifications is falling behind on reading and processing the changes to the object states within the environment. Rather than blocking and waiting for the update process to catch up, the rendering process should throttle back until the modifying process has caught up. Once the update process is caught up and proceeding at a desirable speed, the rendering process can slowly increase its rendering rate until a stable and desirable state is achieved. In another example, if a rendering process can acquire enough computing power to generate 50 frames per second, it might choose to stop at 24 anyway if no additional benefit is obtained from the additional computations. Throttling back the rendering process may in turn cause other processes to slow down, since their additional work yields no benefit to the application(s). Throttle Clocks will allow applications to throttle themselves so they each get the relative processing they request, resulting in loosely synchronized applications within and across machines.
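One adjustment step of the throttling behavior described above might look like the following sketch. The function name, the step sizes, and the lag measure are all illustrative assumptions, not part of the Throttle Clock design as specified in the text:

```python
def throttle_rate(current_rate, consumer_lag, target_rate=24, step=2):
    """One adjustment step for a Throttle Clock-style producer.

    While the consuming process is behind (positive lag, e.g. in
    virtual-clock ticks), the producer backs off; once the consumer
    has caught up, the rate creeps back toward the target rather than
    jumping, so the system settles into a stable, desirable state.
    """
    if consumer_lag > 0:
        return max(1, current_rate - step)      # throttle back while behind
    return min(target_rate, current_rate + 1)   # slowly recover when caught up

rate = 24                                    # rendering at 24 frames/sec
rate = throttle_rate(rate, consumer_lag=5)   # update process behind: back off
rate = throttle_rate(rate, consumer_lag=0)   # caught up: creep back up
```

The asymmetry (back off quickly, recover slowly) is one plausible way to realize the "slowly increase its rendering rate until a stable and desirable state is achieved" behavior described above.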

7.4.2 Using Domain Knowledge

[Figure 16 (diagram): the Application Layer and DQM sit above the GAAOP (domain knowledge and resource information), which sits above the Gryphon and the Distributed System.]

Figure 16: Automated Decision Architecture using DQM

Another area of future work involves using domain knowledge to specify utilization via the GAAOP and then using Scott Brandt's Dynamic QoS Resource Manager (DQM) [Bran99] to automate the decision process, as seen in the architecture diagram in Figure 16. The DQM is a central resource management mechanism that dynamically determines a discrete level at which an application or object should run based on the availability of resources. The DQM manages bandwidth and other resources by communicating with the GAAOP containing the domain knowledge. The domain-knowledge portion of the application is mapped into discrete levels of feasible message traffic and other resource utilization. The application changes its resource utilization to remain within the feasibility boundaries imposed by the limited resources.
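The mapping from resource availability to a discrete operating level might be sketched as follows. The level table, the message-traffic budgets, and the function name are invented for illustration; they are not taken from the DQM or the GAAOP.

```python
# Discrete operating levels for one application, highest fidelity first.
# Each level carries the message-traffic budget it requires (invented numbers).
LEVELS = [
    (3, 1000),  # full update propagation
    (2, 100),   # reduced update propagation
    (1, 10),    # major updates only
]

def choose_level(available_msgs_per_sec):
    """Pick the highest level whose demand fits the available bandwidth,
    as a DQM-style manager might when consulting the GAAOP's domain
    knowledge about feasible message traffic."""
    for level, demand in LEVELS:
        if demand <= available_msgs_per_sec:
            return level
    return 0  # below every feasible level: suspend routine updates
```

As resources fluctuate, repeatedly re-evaluating `choose_level` keeps the application's utilization within the feasibility boundaries described above.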

Appendix A

DOPICL - Trace Generation

A.1 Trace Format Needed Performance analysis and debugging for parallel machines have proven to be a crucial aspect of concurrent software development. Because of the potential for complex software architectures with multiple communicating processes, these systems are difficult to design, debug, and tune. Production quality parallel software often requires the programmer to expend considerable effort studying and analyzing the software, even after the first working version has been produced. In addition to the inherent complexity of concurrent software, the physical parallelism significantly increases the difficulty of tuning and debugging, because the global state of the computation is hard to observe. Traces of the computation's execution -- whether for debugging an earlier version of the software, tuning more mature versions, or extrapolating behavior on simulators -- have been shown to be an extremely useful tool for studying concurrent software [HSR91]. By the early 1990s, multiprocessor traces were widely used to capture the behavior of parallel computations so that the behavior could be studied following completion of the computation. Robust trace technology stimulated the development of several important analysis tools, including Paragraph [HE91], the Pablo sonification tools [MR95], and others. Researchers quickly realized the importance of

standardizing traces [PGU+90], resulting in Reed's proposal for self-defining traces [MR95] and the Portable Instrumented Communication Language (PICL) trace format [GHP+90]. PICL has become a widely used trace format for capturing the message-level behavior of parallel computations. MPL, MPI [DW95], PVM [GBD+94], NX/2, and VERTEX communication protocols all have libraries instrumented to produce PICL traces. PICL was previously used for studying extendable performance visualization environments in parallel computations, particularly scientific computations [NGM+95]. The decision was made to use trace files to enable analysis of the Gryphon distributed object manager [NBG+99] to support these applications, as it became apparent that tools were needed similar to the ones developed and used for studying parallel scientific computations. This resulted in instrumentation of the DOM subsystem to produce PICL traces, with which existing tools could be reused to facilitate debugging, explore software structure, and otherwise optimize the DOM design through visualization and quantitative analysis. Two significant problems occurred when an attempt was made to use PICL in its existing form in the DOM environment:
• PICL timestamps were insufficient to differentiate among events occurring on distributed computers not sharing a global clock.
• Analysis of object-oriented systems requires knowledge of the object and methods involved in an event; PICL events do not provide this information.
As Worley extended PICL to work with MPI [Worl99], the PICL format is extended to address these two problems, deriving a new version of PICL called Distributed Object PICL (DOPICL). The DOM was instrumented to use a form of Lamport's logical clock timestamps [Lamp78] to identify events. A Logical Clock (LC), along with object and method identification, was included in the trace files to address the issues previously stated. A tool was built to merge the

collection of DOPICL traces according to Lamport's logical clocks, creating a feasible trace of the activity in the distributed system (cf. Lamport and Chandy's distributed snapshots [LC85]). A second tool performs object and method specific analysis on the events in the globally ordered DOPICL trace, producing a conventional PICL trace for use by various existing visualization tools. Section A.3, "PICL for Distributed Clocks," describes DOPICL, including how DOPICL traces are gathered from DOM systems without globally synchronized clocks and are merged into a single trace file for debugging, bottleneck identification, and verification of programs. This section also addresses how the DOM run time architecture allows for on-the-fly trace reduction and consumption. Section A.4, "PICL for Objects," discusses enhancements DOPICL has to improve object-oriented analysis. Section A.6, "Using DOPICL Traces on Gryphon," demonstrates the utility of the DOPICL traces and tools and explains how they are used to study the Gryphon DOM [NBG+99]. The system's message traffic was first represented by mathematical formulas. The Gryphon prototype was then built and instrumented to produce DOPICL traces. Using the DOPICL traces in conjunction with Paragraph and ParaVision it was possible to identify errors in the quantitative models discussed in Section 5.2 and in the scenario application used in Section 6.5. A.2 Network Issues It should be noted that network communication latency (response time) and bandwidth (throughput) vary based on network type. There are differences between 28.8Kb modems, 10Mb Ethernet, 100Mb Ethernet, the local machine, DSL, ISDN, ATM, cable modems, and other communication mediums, each having unique characteristics based on the hardware and protocols being utilized. Latency is the amount of time it takes for a message to travel the network.
Hennessy and Patterson measured latencies of 0.1ms for 128 bytes and 1.5ms for 1530 bytes on a 10Mb Ethernet [HP90]. Modems have a static latency since they are

point-to-point connections, whereas Ethernet latency is dynamic, since the number of gateways and the physical distance may vary and collisions can occur. Also, messages from one host to multiple hosts have varying latencies and bandwidths on Ethernet. Over faster communication lines, the modem is generally the bottleneck, not the final destination. Bandwidth refers to the amount of data that can be sent through the network. A modem's bandwidth is relatively constant, whereas an Ethernet's can vary greatly due to traffic anywhere along the network. Bandwidth over modems can also vary based on hardware compressibility of data. Latency and bandwidth will vary for applications based on work load caused by outside influences. The DOPICL trace files can be used to identify and analyze these features since the ordering of events is not dependent on the system clocks. This thesis does not specifically address the communication medium but instead provides mechanisms which allow the application developer to make trade-offs based on bandwidth utilization. A.3 PICL for Distributed Clocks PICL traces are widely used to capture the behavior of parallel computations where the individual elements of the computation are processes being executed on a collection of relatively well-synchronized computers. In general these applications are executing on distributed memory machines, so PICL and the analysis tools implicitly focus on various aspects of message passing systems. Contemporary distributed computations are often designed and implemented in an object-oriented environment. These systems tend to make heavier use of messages than conventional distributed systems, making the performance of the overall computation depend on the behavior of the IPC mechanism. In a distributed object system (such as DCOM [Chap97] or CORBA [OMG95]), each inter-object reference generates message traffic. In a DVE, each entity that appears in the environment is likely to be implemented as an object.
Therefore, each time an object is

repainted on the screen, it potentially has to be referenced. In creating an efficient distributed object manager, the analysis of system behavior depends on having an accurate representation of the distributed behaviors. Given the success of PICL traces with earlier work, PICL was used to collect trace data on the DOM prototype and then analyze the data with Paragraph and ParaVision. A.3.1 A Total Ordering of Events is not Possible using Distributed System Clocks In the same fashion as the conventional PICL instrumented libraries, each process in the target distributed object system generates its own trace file. An alternative would be to create a single server on the network to collect events as they occur and generate one totally ordered trace file using timestamps from the server's clock. PICL is designed with the idea that individual traces will be created by different parts of the parallel computation and that each trace event will be timestamped with a floating point number generated by a central communication mechanism. In a conventional PICL environment the traces are merged using the PICL timestamp. Distributed object systems do not have access to a central location and need to collect events that occur between processes existing on the same machine and on remote machines. For example, the Gryphon system works across UNIX and MS Windows environments. In these environments timestamps were obtained using the gettimeofday system call (which returns an integer representing the number of seconds since the epoch, and a second integer field with the additional microseconds). Most kernels are configured to minimize the frequency of clock interrupts, causing the system time to be coarse grained; though gettimeofday returns a microsecond count, its accuracy is coarser grained (and depends on the interrupt rate and clock cycle time of the CPU).
For example, an Alpha 292 MHz EV5 (21164) processor running OSF1 generates only millisecond-level granularity on the gettimeofday call. The clock granularity causes an ordering problem even when the processes are on the same physical machine.
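A simple probe makes this granularity visible: read the wall clock repeatedly and report the first nonzero step between successive readings. This is a sketch; Python's `time.time()` stands in for the `gettimeofday` call discussed above.

```python
import time

def clock_granularity(max_samples=1_000_000):
    """Return the first observed nonzero step between successive wall
    clock readings, an estimate of the clock's effective resolution.

    On a kernel with coarse clock interrupts, many successive readings
    return the same value, and the first nonzero step is large.
    """
    prev = time.time()
    for _ in range(max_samples):
        now = time.time()
        if now > prev:
            return now - prev
        prev = now
    return float("inf")  # clock never advanced within the sample budget
```

On hardware like the Alpha/OSF1 example above, such a probe would report roughly millisecond steps even though the returned structure carries a microseconds field.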

Networks of UNIX machines utilizing the Network Time Protocol (NTP) [Mill92] to synchronize the local system time, and MS Windows machines, whose clocks are usually set by the users, need to work together. The NTP synchronized system clocks are only loosely synchronized, to within 100 milliseconds, as Liskov points out [Lisk99]. As for the user-set machines, the times are often minutes apart. Liskov agrees that system time, even in the NTP case, cannot be used to ensure correctness in ordering events occurring on remote machines. A.3.2 A Total Ordering using Logical Clocks This time problem manifested itself by producing incorrect traces (e.g., see [EN96] for elaboration on multiprocessor trace inconsistencies) when ordering on time, as done in PICL. In this type of environment it is not unusual for one machine sending a message to another machine to use a local timestamp that is later than the local time for the receiving event. When the two machines' traces are merged, the receive event appears with an earlier global timestamp than the send event. Lamport's logical clocks (LCs) [Lamp78] are a set of monotonically increasing integer values, one LC per process. Briefly, an LC value is incremented each time a significant event (e.g., a trace event) occurs within a process, or whenever a process receives a message with a higher LC value. In DOPICL, the LC value is used in the timestamp field instead of the real-time clock value used by PICL. This approach is analogous to the vector clocks used by Fidge [Fidg88] and Mattern [Matt89], the technique for creating a snapshot of a distributed system by Lamport and Chandy [LC85], and the conservative technique used to establish a total order on event execution in distributed simulation. The RoltMP system also uses this approach in its message passing system [RK98]. The LC is incremented whenever an event is recorded, and is transmitted as a logical timestamp within the header of each message.
If a message is received with a higher LC than the local LC, the local LC value is set to a value one greater than the received LC. Since a local value of an LC

represents that machine's view of the global time, and since LC values are synchronized on message receipt, the collective LC values in the trace establish a total order, preserving causal integrity among the events that transpire on all machines. Although the resulting total order in the DOPICL trace does not necessarily represent the actual sequence in which the events occurred in the distributed system (the sequence a central clock would have recorded), the total order does accurately represent all causal orderings that result from message-passing.
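The LC update rules described above can be sketched as follows, following the text's rule that the local clock jumps only when the incoming timestamp is larger (the class and method names are illustrative, not from the DOM implementation):

```python
class LogicalClock:
    """Lamport-style logical clock as used for DOPICL timestamps."""

    def __init__(self):
        self.value = 0

    def event(self):
        """Record a significant (trace) event: tick and return the stamp."""
        self.value += 1
        return self.value

    def send(self):
        """Stamp an outgoing message header with the current LC."""
        return self.event()

    def receive(self, msg_lc):
        """On receipt, jump one past the sender's LC if it is larger."""
        if msg_lc > self.value:
            self.value = msg_lc + 1
        return self.value

a, b = LogicalClock(), LogicalClock()
stamp = a.send()     # a's clock ticks to 1; the message header carries LC = 1
b.receive(stamp)     # b jumps to 2, one past the sender's clock
b.receive(1)         # stale stamp: 1 is not greater than 2, so no jump
```

Because a receiver always ends up strictly ahead of the stamp it received, every receive event sorts after its matching send in the merged trace, which is exactly the causal guarantee the text relies on.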

[Figure 17 (diagram): Processes 1 through n each produce a trace (DOPICL 1 through DOPICL n); a Trace Merge stage combines them into a single DOPICL trace, which a Filter and Convert stage reduces to a conventional PICL trace.]

Figure 17: DOPICL Architecture

[Figure 18 (diagram): a snapshot of LC values in events generated by Processes 1 through n.]

Figure 18: Example snapshot view of LC values in generated events

Generation of a globally ordered trace file from the partially ordered trace files can be seen in the architecture diagram in Figure 17. Using the LC, causal ordering is preserved, but the actual ordering (as a central clock would have recorded it) may be quite different. As described previously, the LC value on a receiving process is incremented only when the LC value from the sender is greater. Process 1 in Figure 18 shows a situation where a process receives many messages but does not generate many responses. This process may have an LC value significantly greater than other processes. Using only the LC value for ordering will generate a valid but highly unlikely ordering of events. A.3.3 Utilizing Modified System Time The simple and common solution for determining the one-way propagation time between two Internet hosts is to approximate it as half of the round-trip time (RTT) between the hosts [CPB93]. Postmortem traces can be used to measure the one-way propagation time of messages; both directions are measured since message paths between machines may not be symmetric. One simple technique is to average the differences of a sampling of messages, yielding a statistical sampling between hosts in both directions. The result yields an approximate delta in time between two clocks,

and eventually all the clocks, after this technique is applied to each process. This technique works as long as an associative relationship exists between all the processes, implying a connected graph. According to Claffy [CPB93], this technique can be used to synchronize trace files generated by MS Windows machines with times set minutes apart to within a few hundred milliseconds. If the goal is to have highly synchronized clocks, see Garnett [Garn99] and Paxson [Paxs97], who each describe techniques to improve the synchronization between distributed clocks. As mentioned previously, even the best technique is not enough to ensure ordering correctness, as clock resolutions may be too coarse grained to yield accurate times. Another problem is caused by the OS scheduler; a process can be swapped out right before or after generating a trace event. This results in time skew -- the appearance that a message took several seconds or more to send or receive. This clock adjustment technique is sufficient for this research, improving the information gleaned during the analysis phase in regard to timing issues, both on machines using NTP and on personal computers whose time is set by a user.

New DOPICL events also include two fields corresponding to the seconds and microseconds values returned by gettimeofday. These wall clock values are used to measure the elapsed time between two events. Since wall clock times can be unsynchronized (in the case of user-set clocks on Windows machines), trace files need to be analyzed in order to loosely determine the time. Time fixes discussed in the merger algorithm are applied when clock values are misordered. A solution to this problem is to utilize the system clock times that have been adjusted using Paxson's techniques for ordering. System time is used in conjunction with each event's Globally Unique ID (GUID), which is constructed from the process's LC and unique process id. Time is not used for ordering in the case when an event with a dependency on another event's GUID has not yet been satisfied. The total ordering algorithm using time and LC dependency checking is as follows:

1. Open all the trace files to be merged and read the first event from each file. When a new event is read, it has a valid bit set to true when no dependency exists. If a dependency is present, the GUID of the restraining event is stored into the dependency variable. The valid bit is set to false when no more entries exist in the input file.
2. Select the valid event with the lowest adjusted system time. Note: if clock skew results in an event occurring with a clock time smaller than the previously used event, then the skewed clock is adjusted at this point. The selected event is then written to the globally ordered file.
3. Read the next event from the file just written in step 2, and set its dependency variables as described in step 1.
4. Check and mark any dependency changes that have been satisfied by step 2.
5. Go to step 2 while events with the valid bit set exist.
6. At this point, all files have been consumed or an error in the generation of the trace files exists.

A.4 PICL for Objects New DOPICL events include fields for unique object and method identifiers for each event. The method field contains the name of the method as a string, and the actual data passed in can feasibly be inserted into the string if needed for debugging. A label event also exists, enabling an object name to be assigned to the object id. These additional events and fields provide enough information to allow analytic tools to accommodate the object paradigm, enabling system analysis to focus on the objects and methods of interest to the developer. Having object and method names enables many run time and postmortem features to be exploited. The DOM run time system keeps an object id to name mapping by utilizing the labeling events. The name map can then be used at run time with environment variables to focus the trace generation and analysis. Specifying the list of object names to trace causes only those events to be generated. It would be difficult or impossible to know an object id before execution (as many objects may be generated on the fly), but using the names allows for powerful trace focusing. The method names are also a powerful tool for analysis: by using the combination of object id, method name, message sender, and message receiver in postmortem analysis, one can discover the location and caching behavior of the system. For example, a local read event means the object is either cached or the original exists locally.
The location of the original object can be determined on a remote update event, since the method name will fill in the missing information. Method names are only useful if they are descriptive enough and specialized enough to yield information. This is the case with the Gryphon system and its low-level tracing. The method name information will also be useful at the high level in CORBA, as long as remote updates of a cached copy have a different method name than a remote update of the original. Since the high-level CORBA interface does not support caching, none of these events are generated. With some additional processing, described later in detail, DOPICL can be used with the widely-available PICL visualization tools. Using these new events has focused the analytic and visualization tools on the relevant parts of applications.

A.5 DOPICL Format

A PICL file is a collection of records, each consisting of a header followed by a body. As previously described, a DOPICL header is the same as the PICL header except that the Lamport Clock number is used in place of real time for the TimeStamp field, resulting in a PICL-compliant trace file that can also produce a total ordering. The remaining DOPICL enhancements exist in the form of new events with additional information. Table 7 lists the new events that have been added for DOPICL. Table 8 shows the PICL events generated when parsing and analyzing DOPICL traces in order to use the visualization tools. All the events used in the DOM system come in pairs: a method call has an Entry into the call and an Exit when the method is completed. The information in DOPICL allows Entry and Exit methods to be matched, and causality is identified by matching Send and Receive messages. The ProcessID and ProcessLC fields provide the necessary information and are set to zero in the case of the TraceProcess event, which has no dependencies. The Exit/Send event has no ProcessID dependencies, so ProcessID is set to zero while the ProcessLC is set to that of the Entry/Send event for matching. Entry/Send events do not have inter-process dependencies, resulting in the ProcessID being set to the negative value of the destination process ID. Recv events do have inter-process dependencies, resulting in the ProcessID being set to the positive value of the sender's process ID. The ProcessLC field is used both to identify the matching event in the pair and also for merging the trace files into one trace file.
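As a rough in-memory model of these records, the fields from Table 7 and the ProcessID sign convention just described might be sketched as follows (an illustrative sketch only; the struct and helper function are not part of DOPICL, whose on-disk records are textual):

```cpp
#include <cassert>
#include <string>

// Hypothetical in-memory model of a DOPICL event; field names follow
// Table 7, but the actual trace records are textual PICL records.
struct DopiclEvent {
    int  record;        // Entry(-3), Exit(-4), or Label(-5)
    int  event;         // e.g. AsyncSend(-2027) or AsyncRecv(-2051)
    int  processId;     // dependency field; sign convention below
    long processLc;     // Lamport Clock, used as the PICL TimeStamp
    long seconds;       // wall-clock seconds from gettimeofday
    int  microseconds;  // wall-clock microseconds
    int  objectId;      // unique object identifier
    std::string method; // method name, when the record carries one
};

// ProcessID sign convention from the text: Entry/Send records carry
// the negated destination process ID, Recv records carry the positive
// sender process ID, and other records carry zero.
int dependencyField(bool isEntrySend, bool isRecv, int peerProcessId) {
    if (isEntrySend) return -peerProcessId;
    if (isRecv)      return  peerProcessId;
    return 0;
}
```

The sign of the dependency field is thus enough to distinguish an outgoing send dependency from an incoming receive dependency during merging.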

Table 7: DOPICL Events Body
(columns: DOPICL Record | DOPICL Event | # of Data Fields | Data Descriptor | Data Fields)

Label(-5) | ProcessName/ObjectName (-1) | 1 | "%d%s" | ObjectID, Name

Entry(-3) or Exit(-4) | TraceProcess (-2901) | 1 | "%d%ld%ld%d" | ProcessID (zero), ProcessLC (zero), Seconds, MicroSeconds

Entry(-3) | AsyncSend (-2027), SyncRequestSend (-2127), SyncResponceSend (-2227), LocalSend (-2327) | 1 | "%d%ld%ld%d%d%d%s" | ProcessID, ProcessLC, Seconds, MicroSeconds, ObjectID, length in bytes, Method Call

Exit(-4) | AsyncSend (-2027), SyncRequestSend (-2127), SyncResponceSend (-2227), LocalSend (-2327) | 1 | "%d%ld%ld%d%d" | ProcessID, ProcessLC, Seconds, MicroSeconds, ObjectID

Entry(-3) | AsyncRecv (-2051), SyncRequestRecv (-2151), SyncResponceRecv (-2251), LocalRecv (-2351) | 1 | "%d%ld%ld%d%d" | ProcessID, ProcessLC, Seconds, MicroSeconds, ObjectID

Exit(-4) | AsyncRecv (-2051), SyncRequestRecv (-2151), SyncResponceRecv (-2251), LocalRecv (-2351) | 1 | "%d%ld%ld%d%d%d%s" | ProcessID, ProcessLC, Seconds, MicroSeconds, ObjectID, length in bytes, Method Call

Table 8: PICL Events Body
(columns: PICL Record | PICL Event | # of Data Fields | Data Descriptor | Data Fields)

Entry(-3) or Exit(-4) | Tracenode/Tracehost (-901) | 0 | (none) | (none)

Entry(-3) | Sendbegin0 (-27) | 3 | 2 | length in bytes, destination processor id, destination process id

Exit(-4) | Sendbegin0 (-27) | 1 | 2 | send request id

Entry(-3) | Recv0 (-51) | 1 | 2 | requested type

Exit(-4) | Recv0 (-51) | 3 | 2 | length in bytes, source processor id, source process id
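The Data Descriptor column above uses C print/scan-style strings. As an illustrative sketch only (not the actual trace writer), the body of an Entry/Send record could be assembled from its Table 7 fields like this (the space separator is an assumption):

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Formats the body fields of an Entry/Send record (descriptor
// "%d%ld%ld%d%d%d%s" in Table 7) as a space-separated string.
// Field order follows the table; the separator choice is assumed.
std::string formatSendEntryBody(int processId, long processLc,
                                long seconds, int microseconds,
                                int objectId, int lengthBytes,
                                const std::string& methodCall) {
    std::ostringstream out;
    out << processId << ' ' << processLc << ' ' << seconds << ' '
        << microseconds << ' ' << objectId << ' ' << lengthBytes
        << ' ' << methodCall;
    return out.str();
}
```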

The body format in DOPICL is the same self-defining format used in PICL. The first field of the body specifies the number of fields to be processed and the second field specifies the type of the event. This type can be specified using print/scan syntax from C or using a more compact format (1 for string, 2 for integer, etc.).

A.5.1 Processing DOPICL Traces

As previously noted, PICL has been integrated into the libraries of several message-passing packages. When these libraries are used with PICL, one trace file is generated for each process, and records in each trace file are time-stamped with the value of the coarse-grained globally shared clock. A single global trace can then be created by merging the individual traces by their timestamps. The global trace can then be used to drive postmortem analysis tools. DOPICL events are created by instrumentation embedded in each target process. In this study I focused on the message traffic generated from using distributed objects and instrumented all method calls that could possibly generate message traffic. The Async and Sync events represent message traffic, while the Local events represent traffic that would have occurred had the object not been located or cached locally. PICL trace files are produced by modifying the libraries used for the various message passing protocols discussed earlier. In DOPICL, a trace object instance is created and methods are called on this object from within the distributed object system, resulting in trace file generation. Environment variables are used at run time to focus the trace generation: a summary mode in which no trace events are generated, switches to turn local access tracing on or off, and filters to generate only the desired object/method events. TraceOpen is the first method called, at which point the environment variables are processed, an appropriate output file is opened, and an Entry event is generated when not in summary mode. When tracing is complete, the TraceClose method is called, resulting in a trace Exit event or, when in summary mode, generation of the summary data. TracePrint, the third method, is called with the DOPICL trace data to be printed. The TracePrint method takes a variable number of arguments and contains a parser for the DOPICL data. The method uses this data to compute a summary output, or outputs the information to the trace file. Filters driven by environment variables are checked to see whether the information is part of the trace focus prior to summary or result output. The DOPICL generation methods are inserted into the subsystem for trace generation. The placement of these calls depends on the specific subsystem. In the prototype target system, traces are generated at a low level within the architecture, resulting in ORB-specific information that provides details about object cache updates and other synchronization methods. Placing trace methods at a higher level within the CORBA ORB results in information regarding the application-level method calls on objects.
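The TraceOpen/TracePrint/TraceClose protocol described above can be sketched with a greatly simplified trace object (an assumed simplification: the real TracePrint is variadic and parses DOPICL descriptors, whereas here a preformatted record string stands in, and summary mode simply counts events):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified sketch of the trace object; the method names follow the
// text, but the internals here are assumed for illustration.
class Trace {
public:
    void TraceOpen(bool summaryMode) { summary_ = summaryMode; }
    void TracePrint(const std::string& record) {
        if (summary_) ++count_;            // summary mode: count only
        else records_.push_back(record);   // otherwise keep (or write) it
    }
    // Returns the summary line, or reports how many records were kept.
    std::string TraceClose() const {
        if (summary_) return "summary: " + std::to_string(count_) + " events";
        return std::to_string(records_.size()) + " records written";
    }
private:
    bool summary_ = false;
    long count_ = 0;
    std::vector<std::string> records_;
};
```

In summary mode no per-event records accumulate, which mirrors the space savings the text attributes to run time trace reduction.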
The run time mechanism that generates trace information can perform analysis and trace reduction at run time, eliminating the large space and time requirements of using a large application trace. Additional information in the DOPICL trace file format enables filtering of events both at run time when generating the trace files and also during postmortem analysis. It was useful to get a "big picture" view of the application's communication patterns when analyzing large programs and focusing on specific objects and/or methods within the application. Environment variables affect the trace generation at run time and are used to turn off tracing or to toggle between generating a trace file and generating an analysis of the trace file (i.e., the trace file is consumed at run time). These variables can also be used to focus the trace and observe only the requested objects and/or methods. Care must be taken to focus the trace using the same run time variables across all processes; otherwise the trace may be incorrect, with receive events missing their matching send messages. The run time analysis feature generates high-level information per process, which is then combined to give a high-level analysis of the entire distributed application.

A.6 Using DOPICL Traces on Gryphon

While designing and building the Gryphon enhanced DOM system and the applications using DOM, it was difficult to tune applications without additional information and tools. Run time debuggers are not appropriate due to their perturbation effects on the application. DOPICL has proven an invaluable tool in this endeavor to improve performance and verify correctness of the DOM system and applications. The remainder of this section describes ways in which DOPICL has been used to fine-tune the Gryphon system and the applications using this system.

A.6.1 Analyze and verify the applications and the analytical model

The Gryphon system was designed with existing object request broker (ORB) features and additional required features. Five different types of ORBs were identified from currently existing ORBs and from new ORBs that can be created using the Gryphon enhanced DOM. Analytic models were created to represent these systems and also to represent the expected network traffic generated from running simple scenarios.
Models played a key role in both assessing the feasibility of the system and validating the system once it was built. Using a DOPICL trace and some simple quantitative analysis tools, a comparison between the performance predicted by the analytic model and the actual system can be made. The first attempt at validating the implementation showed discrepancies. Analytic analysis identified errors in two of the models and implementation errors in the base features of Gryphon. While running one scenario using a centralized CORBA subsystem, greater message traffic than predicted by the analytic model was observed. Filters were used on the DOPICL trace in order to narrow the focus to a subset of objects. These objects were being modified by a client process, and the model assumed that a write modification of a remote object not cached locally requires two messages: one message to send the change and a second to receive an acknowledgment of the change. The switch was then made to the visualization tools to focus the analysis further.

A.6.2 Visualization of the traces

When visualizing the DOPICL trace file from the previous example, an echo effect for every write appeared. The actual trace events were scrutinized and the echo was found to be caused by an unanticipated object state read before writing of changes. Since the objects were not cached, additional message traffic for the read updates occurred over the network. The models and subsystem were fixed, and the analytic and visualization tools were then used to debug and locate unexpected message patterns within applications. One such pattern was caused by an application utilizing throttle clocks to adjust the speed of the application components. These clocks consist of one global clock located on one process, while each process also has its own local clock. Each process performs tasks and updates its local clock appropriately. The global clock is checked to determine the rate the entire system is achieving.
If the process's local clock is behind the global clock, the process can speed up and increase its processing rate, while the reverse causes the application to slow down. Throttle clock techniques are used to achieve equitable resource utilization and synchronization between applications running on distributed systems. In essence, the work normally performed by the OS scheduler on a single machine is distributed. Throttle clocks do serve a purpose on a single machine as well. Using the analysis tools, it was found that processes with many objects were not getting their own work done because they were using their OS-allotted time to answer requests by other processes. When visualizing the trace file, the latencies and quantities of message traffic being generated by the process with the global clock were identified. The global clock was not being cached, resulting in remote reads of the global clock by all the other processes. A large number of requests were caused by the process with the global clock requesting the local clock times. The chosen solution was to cache the objects, which subsequently eliminated the read latency. The number of messages resulting from the local clocks was surprisingly large, even after the previous modifications were implemented. It was then observed that the local clocks were generating message traffic to keep all cached copies strictly consistent. In this application, only the process with the global clock and the process with the local clock need know the state of the local clock. The current implementation of this application only keeps a cached copy of the local clocks at the global clock process. Using the visualization and analytic tools together, it was possible to reduce the message traffic generated by the clocks and eliminate the associated latencies. The traffic for updating cached local clocks in a 20-process scenario, where each process updates its local clock 24 times a second, results in 8,664 (19 processes x 19 clocks x 24 updates) messages per second. This figure was reduced to 456 (19 clocks x 24 updates) messages per second to the process with the global clock. Caching the local clocks at the process with the global clock and caching the global clock at all processes eliminated latencies and a variable amount of traffic based on the read access rate of the aforementioned objects. The resulting performance tuning and bottleneck reduction was simplified by the availability of DOPICL and the associated tools.
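The clock-traffic arithmetic above is easy to check mechanically (a trivial sketch; the function names are invented for illustration):

```cpp
#include <cassert>

// Messages per second when each of n local-clock processes propagates
// r updates/sec to cached copies at all n peer processes (strict
// consistency, as in the original implementation).
int strictConsistencyTraffic(int n, int r) { return n * n * r; }

// Messages per second when updates go only to the global-clock process.
int tunedTraffic(int n, int r) { return n * r; }
```

With n = 19 and r = 24 these reproduce the 8,664 and 456 messages per second quoted in the text.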

Appendix B

DOM - Subsystem Design used by the Gryphon

B.1 Communication Layer

The DOM is a proxy that manages all message distribution. The design and implementation of the DOM is not part of the research presented in this thesis but instead was created as a practical implementation choice required to produce a working ORB. The chosen implementation results in a reduction in the number of sockets from N x (N-1), where N is the number of processes, to N connections. Instead of a point-to-point network, every message goes through the proxy. The proxy becomes a bottleneck when N grows large, but this can be remedied by creating layers of proxies. This multi-layer architecture is similar to the Globe system and the KSR architecture. Details of the implementation and the methods called are described below; some of these method names appear to be specific to the VPR application but are actually residue of the first application built using the DOM layer.

B.1.1 Steps involved in using a Session

This section describes the details of the DOM architecture and the steps involved in running any distributed application that utilizes the DOM subsystem. The steps described relate to Figure 19.
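The connection-count claim above (N x (N-1) sockets for a point-to-point mesh versus N through the proxy) can be checked with a trivial sketch (function names invented for illustration):

```cpp
#include <cassert>

// Socket counts for N communicating processes, as described above.
int meshSockets(int n)  { return n * (n - 1); }  // full point-to-point mesh
int proxySockets(int n) { return n; }            // one connection per process to the DOM proxy
```

For 20 processes this is 380 mesh sockets versus 20 proxy connections.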

[Figure 19 shows the SM with its session storage, a GOM, and several LOMs, connected by the numbered steps described below; arrow thickness indicates the bandwidth utilization of each connection.]
Key: SM - Session Manager (session_manager program); GOM - Global Object Manager (object_manager program); LOM - Local Object Manager (any application process, i.e. scenario, vpr, tictactoe, etc.)
Figure 19: DOM Architecture for Distributed Objects

Step 1 (SM): Start up the SM at a known host:port. If the host:port is not occupied, the SM's job is to wait for connections from LOMs and GOMs. A LOM connects to the SM in order to find an active session. An active session is one that is associated with a GOM.

Step 2 (GOM->SM): A GOM connects to the SM and requests to activate a session. If the session is not currently active, the GOM is registered with the SM as the GOM for the session. The GOM then requests and parses the saved state of the world.

The GOM then waits idle for connection requests from LOMs.

Step 3 (LOM->SM): The LOM connects to the SM and requests a session. The SM returns the GOM host:port if the session is active, with the port set to zero if the session is not active. If the session is not active, the LOM can make a system call to start up a GOM locally. The GOM will follow Step 2 and the LOM will again request the session. When the LOM gets the GOM host:port information, the connection between the SM and the LOM is terminated.

Step 4 (LOM->GOM): The LOM connects to the GOM and requests to join the session. The GOM adds the LOM to the list of LOMs associated with the session. The LOM then requests and parses the current state of the world it has just joined.

Step 5 (LOM<->GOM): Interaction management. Requests for creation, modification, and deletion are managed and propagated. See the following section for more information.

Step 6 (LOM->GOM): The LOM sends notification to the GOM that it is leaving the session. The connection between the LOM and the GOM is then closed, and the GOM removes the LOM from the list of LOMs associated with the session.

Step 7 (GOM->SM): When the last LOM leaves a session, the session becomes inactive. The GOM generates serialized data representing the current state of the world and sends it to the SM for storage. The GOM then notifies the SM that it is terminating, and the SM renders the session inactive.

B.1.2 Interactions Management of Distributed Objects

Once the world is active, the ability to create, modify, and delete objects needs to be facilitated. The system's interface, designed to simplify object interaction, is described in detail here. This interface is at the DOM level and is not seen by the application programmer. In the DOM architecture the GOM accepts changes and propagates them to the LOMs. The following is an overview of the functions used by the LOM and details on the effects of these functions.

When starting up a new application, first instantiate an object from the local_world_interface class. This object is used to interact with the LOM, which in turn interacts with the GOM and with other LOMs via the GOM. For this example the instance of local_world_interface is called lom. After the lom is instantiated, the init method is called; it takes four parameters: the hostname of the SM, the port of the SM, the name of the session, and the session number. Once the lom has been initialized, it contains all the objects in the private variable vpr_world of type world_objects. The world_objects class contains the functionality needed to manage a linked list of om_objects and is described further below. All objects in these methods are of type queue_object, which contains an action to be taken and the om_object. The lom has one method for creation and one for deletion. Calling the method create_object with a queue_object as the parameter places the new object on the queue for creation. To delete an object, the method remove_object is called with the object id as the parameter. To modify an object, the queue_object is modified and then the method send_object_modification is called with the queue_object as the parameter. Changes are not processed or propagated to the local LOM, the GOM, or the other LOMs until a commit takes place. To process the change request queue and commit the requested changes, the method swap_object_queue is called on lom. This method causes all the creation, deletion, and modification requests to be processed and propagated to the GOM. This method call empties the send queue and results in remote changes received from the GOM being placed in the receive queue. If no connection was created to a GOM, the LOM can act independently using loop back: all requests are moved from the send queue to the receive queue with no further processing. The loop back has the effect of making the LOM also act as the GOM and results in a centralized system without

external interaction. If the GOM is active, the new messages generated by other LOMs are received by the local LOM from the GOM and placed in the local receive queue. To receive the requests from the receive queue, the method receive_object_modification is called on the lom. This method takes no parameters and returns a queue_object with the action tag set to cObjectModification, cCreateObject, or cRemoveObject for modification, creation, and deletion, respectively. The om_object contains all the information that is necessary for the interface used by the application. Figure 20 and Figure 21 show a detailed view of the design of the DOM architecture that has just been described.
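The queue behavior just described, in the loop-back case where no GOM connection exists, can be illustrated with a small self-contained mock (an assumed simplification: these are not the actual DOM classes, the action tag value is invented, and only the create path is modeled):

```cpp
#include <cassert>
#include <deque>
#include <string>

const int cCreateObject = 1;  // action tag value is assumed for illustration

struct queue_object {
    int action;
    std::string data;
};

// Minimal mock of the LOM queue pair: swap_object_queue in loop-back
// mode moves every pending request from the send queue to the
// receive queue with no further processing.
class local_world_interface {
public:
    void create_object(const queue_object& q) { send_.push_back(q); }
    void swap_object_queue() {
        while (!send_.empty()) {
            recv_.push_back(send_.front());
            send_.pop_front();
        }
    }
    bool has_pending() const { return !recv_.empty(); }
    queue_object receive_object_modification() {
        queue_object q = recv_.front();
        recv_.pop_front();
        return q;
    }
private:
    std::deque<queue_object> send_, recv_;
};
```

After a commit, each queued request reappears on the receive side in FIFO order, which is the "LOM also acts as the GOM" effect described above.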

[Figure 20 shows two kinds of processes: the proxy part of the application, containing the GOM, which processes and generates messages; and the distributed parts of the application, each containing a LOM. IN/OUT message flows between them carry modify, add, and delete requests, and requests for and notifications of changes.]
Figure 20: GOM-LOM Architecture. This is the architecture for a distributed application where the proxy has the GOM and the distributed processes contain the LOMs.

[Figure 21 shows the LOM and GOM, each holding an Object DB. Create/Remove/Modify object requests enter the LOM's send queue; Swap Object Queues (Commit Requests) exchanges the queues, serializing requests to the GOM; remote changes arrive in the receive queue and are read via Receive Object Modifications.]
Figure 21: Distributed Object Manager

B.1.3 Detailed Implementation Issues of the GOM/LOM

This section uses a scenario to provide a detailed description of each method call and its effects on the system. The first step is to instantiate the lom variable of type local_world_interface.

init
lom.init(,,,)
lom.init("vpr.cs.colorado.edu", 3456, "sessions/ajg_world1.vpr", 0)

The init method will contact the SM on host 'vpr.cs.colorado.edu' using port 3456 and request to join the session named 'sessions/ajg_world1.vpr', instance number 0. If the session is not active, then a system call is made to start up the GOM locally. There is then a short sleep to wait for the GOM to start up. Another request is made to

the SM for the active session. If it is still not active, a longer sleep occurs, and if the final check for the active session fails, then the init is finished and lom is set up in local loop back mode. When failure results, the world will be started with no objects. At this point the connection with the SM is closed, irrespective of success or failure to get a GOM. In the case of success, the GOM is contacted, the current state of the session is requested, and the session objects are created locally with the necessary information to map back to the GOM objects. Create messages are placed in the receive queue for each object generated by the init method.

swap_object_queue
lom.swap_object_queue()

This is the method that causes all the requests involving the send queue to be removed and processed. For this reason it makes more sense to discuss the swap_object_queue method in terms of how it handles the information in the queue generated by the other methods. This is also the method that generates new queue_objects and places them on the receive queue.

create_object
queue_object *q_obj
q_obj = new queue_object
// initialize the q_obj in some way
lom.create_object(q_obj)

This method places the q_obj pointer into the send queue and sets the action to cCreateObject. When swap_object_queue is called, the queue_object is removed from the send queue and processed. The processing is to forward the message to the GOM. When the reply from the GOM is received, the object is created and the event is placed in the receive queue. The object is created locally and the request is sent on to the GOM. Upon receiving the request for creation, the GOM will create the object and then broadcast to the LOMs that a new object has been added. The request is placed in the receive queue of the LOMs and can then be processed to

create the new object. The MsgSubId is the local value assigned on creation; if the MsgSubId matches an object that already exists locally, then this is the LOM that requested the creation. The object is then updated with the global id and no message is placed in the receive queue.

remove_object
lom.remove_object()
lom.remove_object(2)

A request is placed in the send queue to remove the object.

When swap_object_queue is called, the request is removed from the queue and sent to the GOM for processing with the action cRemoveObject. There is no initial effect on the local LOM caused by the request. The GOM processes the request by removing the object and broadcasting the remove object request to the LOMs. Upon receipt by the LOM, the object is removed and the event is placed in the receive queue.

send_object_modification
queue_object *q_obj
q_obj = new queue_object
// change the q_obj.data to contain changes made to the object
lom.send_object_modification(q_obj)

A request for action cObjectModification is placed in the send queue. Upon calling swap_object_queue, the request is removed from the send queue and sent to the GOM. The GOM updates the object and sends the modified object to the specified LOM. Upon receipt by the LOM, the object is modified and the event is placed in the receive queue.

receive_object_modification
queue_object *q_obj
q_obj = lom.receive_object_modification()

An object of type queue_object is removed from the receive queue and returned.

The objects are placed on the receive queue when the

swap_object_queue method is called.

Close
lom.close()

The LOM notifies the GOM that it is leaving the session. The GOM removes the LOM from the list of active LOMs in the session. The LOM awaits an acknowledgment, and then the connection between the GOM and the LOM is terminated. When a LOM terminates without notifying the GOM, or crashes, the LOM will be removed from the list of active LOMs when the GOM becomes aware of the broken socket connection.

B.1.4 The World_Objects Class

All the objects in the GOM and LOM are stored locally in an instance of the class world_objects; the class is in 'om_world.h'. The methods described in this section are used by the GOM and LOM and do not need to be used by the application writer. This object manages creation, deletion, modification, serialization, unique id generation, and much more. The various methods and their effects are discussed in depth in this section. The data store object, vpr_world of type world_objects, is instantiated with no objects in storage. All the objects in the data storage are of type om_object.

count_objects
int count
count = vpr_world.count_objects()

Returns the number of objects stored in the vpr_world object.

unique_id
int changed
om_object *obj
// data should be placed in the object
changed = vpr_world.unique_id(obj)

This method checks if the object id is unique. If the id is found to be unique, the

method returns 1; otherwise the object id of obj is replaced with a valid id and the method returns 0. It should be noted that this function is private and should only need to be used by add_object. The unique id is simply an id not being used by any object in the vpr_world and will not be reserved until the object is actually added to the world.

locate
int id=23
om_object *obj
obj = vpr_world.locate(id)

This method locates an object with the specified id and returns a pointer to the om_object; otherwise a null is returned.

print_object #1
int string_length, maxsize=1024, id=23
om_object *obj
char object_buffer[maxsize]
string_length = vpr_world.print_object(obj, id, object_buffer, maxsize)

This method generates a string that contains all the information about obj and stores it in object_buffer. The object id is set to the value in id, allowing for renumbering on serialization. All of the print methods return -1 if there is insufficient space in the string for storing the output.

print_vpr_world_store
int string_length, maxsize
char *world_buffer
maxsize = vpr_world.count() * 1024
world_buffer = new char[maxsize]
string_length = vpr_world.print_vpr_world_store(world_buffer, maxsize)

All the objects stored in the vpr_world are serialized using the method print_object and the result is stored in world_buffer. World_buffer is

then printed, passed to another process, or written to disk. This method renumbers the unique object ids; it is now obsolete and is described only for completeness.

print_vpr_world_active
Exactly the same as the print_vpr_world_store method except that the object ids are not renumbered.

parse_object
om_object *new_obj
int size_parsed
char object_string[1024]
// this assumes the object_string contains a serialized object
new_obj = new om_object
size_parsed = vpr_world.parse_object(new_obj, object_string)

This method parses the object_string and stores the data in new_obj. The method returns the number of characters parsed in the object_string.

add_object #1
om_object *new_obj
int changed
changed = vpr_world.add_object(new_obj)

This method adds a new object to the vpr_world. The method returns the same value as the method unique_id and similarly will change the id if it is not unique.

add_object #2
om_object *new_obj
int size_parsed
char object_string[1024]
size_parsed = vpr_world.add_object(new_obj, object_string)

This method is a combination of a call to parse_object followed by add_object #1. The method returns the number of characters parsed in

object_string.

add_vpr_world
char world_buffer [some size]
int size_parsed
size_parsed = vpr_world.add_vpr_world(world_buffer)

This method parses a buffer of multiple objects and adds all the objects to the vpr_world. It uses add_object #2, which results in object ids being renumbered if they are not unique. So, if you load a world twice, the objects in the second loading will have their ids changed.

remove_object
int id, requester
int removed
removed = vpr_world.remove_object(id, requester)

This method removes the requested object and returns 1 if it was located and removed successfully.

B.2 Socket Layer Issues

Buffers filling up on the socket will cause writes to block. The result can be deadlock and starvation. Deadlock occurs when all the processes are trying to perform writes of data and no processes are reading data. This deadlock also occurs in the GOM when it attempts to forward a message to the requested destination and blocks on a write. In order to preserve sequential consistency, the GOM cannot be modified and must wait for the write socket of the destination to be available. The solution to deadlock lies in the processes, which act as clients for some objects and servers for others. The processes need to loop on a non-blocking write until it succeeds. While in this loop, the processes read messages from the read socket and place them in an internal queue of objects to be processed. These messages are not actually processed until later, but they have been removed from the socket, eliminating the possibility of GOM

141 deadlock on write. In the process of eliminating deadlock on GOM write, deadlock on process write is eliminated since the GOM is always available to read from the processes. Starvation can occur when the GOM reads sockets in some fixed or unfavorable order. The solutions to this problems is to create a torus read order where the least recently read socket is read first. If only one process is writing then the process will be serviced only after the GOM makes sure no other processes have writes pending. B.2.1 Matching Sends and Receives The socket level message passing system supports both data push and data pull. Send and Receives need to be matched up in order to support an RPC (Remote Procedure Call) pull style message system. All messages have a sequence number (Logical Clock LC) discussed further in Section 5.3.2. Messages have an additional field (reqfixnum) which can be either zero when no message matching is required or the LC that has resulting in this message response. B.3 Distributed Object Layer This section describes the level of the architecture just below the IfObject layer utilized by the application. The distributed object layer is where the Gryphon was injected into the DOM architecture resulting in an enhanced subsystem. B.3.1 Object Instantiation When an application wishes to create an object, a create request is sent to the DOM. The DOM then responds by creating the requested object and assigning it a unique globalid. To reduce the object creation latency the method call is non blocking asynchronous. This type of create requires that the unique object id be matched with the requesting create, but at the time of the request the unique id is not yet known. The solution is to create a pseudo-unique id called the Instance Variable.
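The Instance Variable scheme, described in detail below, can be sketched as follows. This is an illustrative model only; the class and member names are hypothetical stand-ins, not the DOM implementation:

```cpp
#include <cassert>

// Hypothetical sketch of the Instance Variable scheme: each process owns
// a window of 1,000 pseudo-unique ids, derived from its client_pid and a
// per-process counter that wraps modulo 1,000.
class InstanceIdGenerator {
    int client_pid;
    int max_instance;   // incremented per object, kept in [0, 1000)
public:
    explicit InstanceIdGenerator(int pid) : client_pid(pid), max_instance(0) {}

    // client_pid * 1,000 + MaxInstance, as described in the text
    int next() {
        int id = client_pid * 1000 + max_instance;
        max_instance = (max_instance + 1) % 1000;  // wraps after 1,000 creates
        return id;
    }
};
```

With more than 1,000 creates outstanding before acknowledgments arrive, the ids repeat, which is exactly the limitation the text notes.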

The Instance Variable is used only while the local copy has a globalid of 0; the globalid is set once the acknowledgment of creation is returned to the requesting process. Each process has a unique process id, client_pid. The Instance Variable is generated by multiplying client_pid by 1,000 and then adding the value MaxInstance. The MaxInstance value is incremented for each new object, modulo 1,000. This solution allows one thousand objects per process to be awaiting acknowledgment of creation. The solution is not perfect and can cause problems if more than 1,000 objects are created in rapid succession without receiving acknowledgments.

B.3.2 Object Class Creation

New object classes used by an application must be created manually, since no IDL compiler is available to perform this task. A new header file must be created for each new object class; the process requires deriving a new class from the base_distributed_object base class and adding new methods and variables. The serialization and de-serialization methods must be extended to handle the new variables. This header file is then included in 'distributed_objects.h', a header file that contains all the object classes known to the system. The only code that cannot work in this fashion is the constructor, so there is a file called 'distributed_objects_constructor_includes.cc' that is included into one of the methods of the container class. The application developer adds their C++ file to the list of C++ files in 'distributed_objects_constructor_includes.cc'. The constructor file checks whether the requested constructor string is equivalent to its class string name. If the object to be constructed is of this class, then the constructor is called and the class name flag is set to this class. The result of this design is a clean system that the developer can easily extend with new classes. The existing code does not need to be modified, since C++ RTTI (Run Time Type Information) handles these issues. The assignment operator needs to be generic and accept a base_distributed_object in order for the code to work seamlessly. Using const_cast and dynamic_cast (the latter relying on RTTI), the object on the right-hand side of the assignment operator can be tested and converted to the class on the left-hand side.

B.3.3 Marshalling Requests and Data

The requests and the results of method calls need to be marshalled and unmarshalled. The result of the marshalling process is a string object that contains raw character data and a length specifier. Since the string is not null terminated, it can carry arbitrary data, which is transmitted for unmarshalling on the other end. This design allows MPEG streams to be transmitted through the network and the layers of the system.

B.3.4 Application Termination

When an application tells the GOM it is leaving a session, the application waits for an acknowledgment. The result is a robust exit that prevents messages sent right before exit from being lost if the socket connection were terminated prematurely. When the VprConnection terminates, the callbacks for modify and create are first turned off, preventing unnecessary changes from being processed. The delete callback is left active so that local cleanup can be performed on objects being deleted. Since external updates keep arriving, there is no stable state at which to stop processing external modifications. Processes leaving the system can cause synchronization problems; one solution is for object requests to wait some length of time for a response and then time out. If no response is received, the request is sent again. A timing problem occurs when a process receives a request and then leaves the session without answering it. Once the process is gone, a retried request will reveal that the process has left. The delay period can be adjusted on the fly to match the network characteristics.
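The RTTI-based generic assignment described in Section B.3.2 can be sketched as below. The class names mirror those in the text, but the member layout is a hypothetical simplification, not the actual DOM code:

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the DOM base class (polymorphic, so dynamic_cast works).
struct base_distributed_object {
    std::string name;
    virtual ~base_distributed_object() {}
};

// A hypothetical derived class, as an application developer would add.
struct vpr_data : base_distributed_object {
    std::string vrml;
    // Generic assignment: accepts the base class and uses dynamic_cast
    // to verify the right-hand side really is a vpr_data.
    vpr_data &operator=(const base_distributed_object &rhs) {
        const vpr_data *p = dynamic_cast<const vpr_data *>(&rhs);
        if (p && p != this) {   // wrong dynamic type: leave *this unchanged
            name = p->name;
            vrml = p->vrml;
        }
        return *this;
    }
};
```

If the right-hand side is not actually a vpr_data, the cast yields a null pointer and the assignment is skipped, so a mismatched assignment fails safely rather than corrupting the object.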

Appendix C

Details of Applications Built on DOM

C.1 Session Manager and Object Manager

All of the applications built using DOM make use of the object manager and session manager programs. The session manager has two roles: providing persistence for session objects and locating sessions. Sessions enable the processes in an application to use distributed objects in a name space that does not interfere with other applications. The session manager runs on a known host at a known port, where processes contact it to locate the global object manager for a given session. The object manager contacts the session manager to register as the global object manager for the specified session. When an object manager activates a session, it retrieves the persistent objects from the session manager. Upon deactivation of the session, the object manager sends the persistent objects to the session manager for storage and unregisters as the global object manager for the session. The session manager program is run in the following fashion:

    SessionManager/session_manager -S `pwd`/sessions -P 3456 &

    -S session directory path
    -P port number

`pwd` needs to be the current directory containing the sessions and SessionManager subdirectories. SessionManager is the directory containing the session_manager program, and sessions is the directory that stores inactive sessions, enabling persistence. The object manager is started automatically by the application if one is not already present for the current session and the object manager is located in the directory ../ObjectManager relative to the application location. The object manager can also be started from the command line, in the same location from which the session manager was started, by entering:

    ObjectManager/object_manager foobar -P 3456 scenario.vpr 0 &

Upon termination of the last client process in the session, the object manager automatically writes out any persistent objects in the session and then terminates.

C.2 Scenario Application

The scenario application was written to facilitate the creation of distributed applications that can take on the characteristics of different application scenarios. A scenario client process requests creations, deletions, reads, and writes of objects in the scenario application. Using the application parameters, the number of objects created and the update rate of those objects follow a uniform, normal, or Poisson distribution, creating realistic workloads for the scenario. No message traffic is required when a client process performs a read of a distributed object located or cached locally. When performing a write to modify an object, the scenario process must first get the current state. If the object does not exist locally and is not cached, then a message pair occurs: one message to request the current state from the object owner and another to return the state to the requester. Once the state has been retrieved, it is then possible to make the modification and write the new state. The write takes 1 message if using asynchronous communication, 2 messages if using synchronous communication, and 2 messages if a move request is required before the object can be modified. Modifications to an object that exists locally result in additional asynchronous communication to propagate the changes to cached copies of the object.

The scenario program is run by passing various parameters on the command line and by setting various environment variables. An example of the command line parameters might be:

    scenario -l 20 -v 20 -f 24 -m 1 -M 0 -s 100 -C 1 -r 2 -u 2 -U 0 -L 0

    -l number of processes
    -v number of type vpr, doing frame reads at rate -f
    -f frame update rate
    -m number of modifiable objects per process
    -M distribution type for modifiable object number: 0 (uniform), 1 (normal), 2 (poisson)
    -s total number of static objects
    -C number of seconds to run
    -c number of micro-seconds to run
    -r caching info: 0 (no-cache), 1 (cache), 2 or more (every nth update)
    -u update rate per second
    -U distribution type for update rate
    -L 0 for central and 1 for distributed objects

Additional command line options are available and can be found by using the -h flag. The environment variables can be set as follows:

    setenv VPRHOST `hostname`:3456
    setenv VPRSESSION scenario.vpr
    setenv VPRTRACEDIR `pwd`/tracedata
    setenv VPRTRACEONLY scenario
    setenv VPRSUMMARYMODE 1
    setenv VPRLOCALMSGS 1

    VPRHOST - the host on which the session manager is running
    VPRSESSION - the session name the application is using
    VPRTRACEDIR - the directory for storing trace files; when this variable is not set or is set to /dev/null, no trace is generated
    VPRTRACEONLY - filters the trace so that only objects containing the specified string are traced
    VPRSUMMARYMODE - 0 gives a full trace, 1 creates a summary
    VPRLOCALMSGS - 0 ignores local access, 1 produces events for local access

All of the tracing features can be used by any application using DOM. The command line parameters -L and -r cause the scenario application to perform the given scenario in a mode equivalent to one of the 5 systems described in Section 3.3:

    System 1 - central, cache                           -L 1 -r 1
    System 2 - central, no-cache                        -L 1 -r 0
    System 3 - distributed, no-cache                    -L 0 -r 0
    System 4 - distributed, cache                       -L 0 -r 1
    System 5 - distributed, cache with 1/R update rate  -L 0 -r 2 or more

To gather data for analysis, Systems 1-4 can be run for 1 virtual second and will be in a stable state, since the updates divide evenly into 1 second. System 5 needs to run longer, since its propagation rate does not divide evenly into 1 second; such runs need to last a number of seconds equal to one over the update ratio.
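The per-write message costs described for the scenario above can be collected into a small cost model. This is an illustrative sketch based only on the counts stated in the text (1 message for an asynchronous write, 2 for a synchronous write or a write requiring a move, plus one propagation message per cached copy); the enum and function names are hypothetical, not part of the scenario program:

```cpp
#include <cassert>

// Hypothetical classification of a write, following the text.
enum class WriteKind { Async, Sync, MoveThenWrite };

// Messages needed to fetch the current state before a write:
// zero if the object is local or cached, otherwise a request/reply pair.
int fetch_messages(bool local_or_cached) {
    return local_or_cached ? 0 : 2;
}

int write_messages(WriteKind kind) {
    switch (kind) {
    case WriteKind::Async:         return 1;
    case WriteKind::Sync:          return 2;
    case WriteKind::MoveThenWrite: return 2;
    }
    return 0; // unreachable
}

// Total messages for one modify operation, plus one asynchronous
// propagation message per cached remote copy.
int modify_cost(bool local_or_cached, WriteKind kind, int cached_copies) {
    return fetch_messages(local_or_cached) + write_messages(kind)
         + cached_copies;
}
```

For example, a local asynchronous write with no cached copies costs a single message, while a synchronous write to a remote, uncached object costs four.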

C.3 VPR

[Figure 22: The VPR Organization. VPR clients and other clients, each with domain extensions, connect through an Object Manager Interface to the Object Manager.]

One of the applications is a meta-application, an extensible desktop DVE, called the Virtual Planning Room (VPR) [NBH+97]. The basic VPR configuration is a set of processes acting as both client and server, where each process implements an interface into the DVE (the VPR application) and an interface to the object system (see Figure 22). Domain-specific extensions can be added to the VPR by adding (other processes with) objects that implement the extension; domain extensions also use the object system. Thus the object management service is a global service in which shared objects reside; they can be manipulated using remote method calls in an instance of the VPR or in a domain extension.

The desktop DVE must represent visible/perceivable objects to its user. This is done by first defining a world as a collection of shared objects representing the entities that are logically significant in that world. For example, many of the VPR instances have a floor, walls, and miscellaneous other objects in the room. In particular, whenever a person enters the VPR (logs into a VPR session), that person's avatar is created as a collection of objects, and the corresponding objects are placed in the world. Each VPR client process implements a user interface; it determines which objects are visible to a person (based on the corresponding avatar's location and orientation), then invokes methods on the relevant objects to obtain their VRML descriptions so they can be rendered at the client's user interface. If the VPR's user or a domain extension manipulates an object, then the appropriate method in the (shared remote) object must be called to update the object's state. An additional environment variable, not used in the scenario program, is:

    setenv VPRAVATAR `pwd`/worlds/avatars/gary.wrl

This environment variable selects the VRML file to be used as the avatar. Many options exist and can be found by running the vpr with the -x flag. To run a VPR client process using all the environment variables, enter the following in the Vpr directory:

    vpr

C.4 Floaters

Domain extensions define VRML appearances, whenever relevant, and behaviors that represent the semantics of the extension. An unoccupied air vehicle (the FLOATERS "blimp"[1]) navigation and control extension was built using the VPR. The VPR domain extension is an interface to the blimp that allows it to be controlled by VPR occupants and allows the blimp to appear in the VPR. The occupants of the VPR can send requests to FLOATERS at various levels of abstraction, ranging from engine speed and pitch commands to leaving cookies for the blimp to follow and eye phones to direct its gaze. FLOATERS then uses its on-board algorithms to decide how to follow the cookies and when to take video of the desired location specified by the eye phones. The floaters program is executed by starting the floaters_sim program, or the real blimp when it is available. The controller program xfloaters is then started and used to control the blimp. These programs are located in the Apps directory.

[1] FLOATERS was designed and built by Sam Siewert as a testbed facility for his thesis work on distributed real-time control [Siew97] [SNH97].

C.5 3D Tic-Tac-Toe

The 3D tic-tac-toe game is designed to allow zero, one, or two players. The game is played using keyboard input and output to a text terminal. It is also possible to watch the game using the VPR and to use the graphic view to aid the decision making process. The VPR needs to be started with the session name tictactoe.vpr. The application is run from the Apps directory by typing:

    tictactoe -n

    -n 0 (human player), 1 (computer player), 2 (human vs. computer)

When using option 0 or 1, the tictactoe program must be run a second time with 0 or 1 to activate the second player. With option 2, the second computer process is activated in the background.

C.6 MPEG Client-Server

The MPEG client-server program involves starting a server process that outputs MPEG streams. The server creates a channel in the form of an object and streams MPEG video in the form of object updates. Client processes join the session and select the update policy for a given channel. Using a remote-control, channel-changer style paradigm, clients select which channels they wish to observe. The object(s) representing the selected channel(s) set the hints to specify that caching is required and what update policy is requested. When a client's process observes a network or processing problem, the hints are changed to increase or decrease the bandwidth utilization. The server then adjusts, using an enhanced GAAOP to discern the various frame levels found in MPEG and distributing only what is appropriate for the client. Additional details about MPEG can be found in the MPEG literature and the associated references [Koen99]; details about this application can be found in the associated documentation.
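The hint adjustment the MPEG client performs can be sketched as below. The hintObject fields (maxCount, maxDeltaTime) mirror those used in the example application of Appendix E, but the structs here are simplified local stand-ins and the congestion policy itself is hypothetical, not taken from the MPEG application:

```cpp
#include <cassert>

// Simplified stand-ins for the subsystem types used in Appendix E.
struct TimeVal { long sec, usec; };
struct hintObject {
    int objectId;
    int maxCount;         // propagate every maxCount-th update
    TimeVal maxDeltaTime; // maximum delay before an update must go out
};

// Hypothetical client-side policy: when congestion is observed, ask the
// server to propagate fewer frames; when the network recovers, ask for more.
hintObject adjust_channel_hint(hintObject hint, bool congested) {
    if (congested) {
        if (hint.maxCount < 30) hint.maxCount *= 2;  // thin the updates
    } else if (hint.maxCount > 1) {
        hint.maxCount /= 2;                          // restore quality
    }
    return hint;
}
```

In the real application the adjusted hint would be handed back to the subsystem (as with SetHintObject in Appendix E), letting the server's GAAOP decide which MPEG frame levels to forward.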

Appendix D

Abstract Auxiliary Interface Hints

This appendix describes ways in which the Gryphon enhancements can be used by a higher-level, domain-specific interface to facilitate location, update, and security selections. Auxiliary interfaces, which are part of open implementation [OIG97] and allow developers to parametrically influence objects at various levels of abstraction, are described. Examples of high-level hints that can be provided to an application are given, along with the methods used to implement the auxiliary interface and specify the hints. The methods that implement this high-level interface ultimately use the Gryphon subsystem's GAAOP hints to achieve the desired result. The auxiliary interface uses the combination of high-level hints to select location and update policies. Some hints are overly restrictive and must be left unspecified if other hints are to have any effect on the decision process. Highly specific hints essentially dictate how the object is to be distributed, while general hints allow for optimization along different dimensions, as this appendix describes. In the descriptions that follow, the_object is the name of a distributed object in the system.

D.1 High-Level Hints

Location - Location is more of a directive than a hint: it dictates where the object should reside and causes the other location hints to be ignored. The method call move(the_object, host_name) is used within the auxiliary interface implementation to move the_object to the specified location and keep it there until another move call changes the location. Host_name is a string representing the host on which the object should reside. The method locate(the_object) returns the current location of the object. Calling the fix(the_object) method causes the object to remain at its current location; it fixes the_object at the location the auxiliary interface has specified and has the same effect as calling move(the_object, locate(the_object)). Calling the method unfix(the_object) causes the object to use the other, more flexible hints described in this section.

Users - The Users hint specifies which client processes are using the distributed object, along with a numeric value representing usage quantity. The numeric usage value is not absolute; rather, the values are selected by the application and indicate relative usage among clients. The auxiliary interface implementation takes this information and considers placing the object at the location with the highest utilization, while also taking the other hints into account. Over time, as the users of the object change, the object may migrate. Calling the method usage(the_object) returns an associative array of clients and their usage values. The method usage_clear(the_object) removes all the usage information regarding the object. Calling the usage method with a client name and an object returns the current usage value for that client, while a call with client name, object, and value parameters sets the usage value. A client caching an object sets its usage value to 0, since the client is not directly accessing the object.

Relations - The Relations hint indicates which objects should and should not be placed in proximity to one another. CPU-intensive objects may need to be kept apart on separate hosts; conversely, related objects with high interaction rates should be kept together on the same host. The method relations(the_object) returns an associative array of object names and their relation values, with the sign and magnitude of each value representing the attraction or repulsion factor. Passing two objects as parameters returns the association value, while passing two objects and a value sets the association. The method relations_clear(the_object) clears the current values.

Cached - Specifies whether or not the object should be cached. The query method cached(the_object) returns true if the object is cached and false if it is not. Calling the modify method, which has the same name but an additional boolean parameter, turns caching on and off.

Consistency - If the object is being cached, the Consistency hint specifies what type of consistency is to be applied. Not all objects have the same consistency requirements. Some objects require strict update policies, while others allow more flexible strategies. In certain situations, update losses and inconsistencies are not only acceptable but necessary for an efficient implementation. The method call consistency(the_object) returns the current policy, while calling this method with an additional policy parameter sets the policy. The policies include strong consistency, weak consistency, and domain acceptable consistency. Domain acceptable consistency must be selected for the Update Rates hint to have any effect.

Update Rates - This hint is effective only when domain acceptable consistency is selected. The method update_rates(the_object) is used to query the update rate, which is specified on a per-user basis. Calling the update_rates method with additional user and update rate parameters specifies the desired ratio of updates that must be propagated. To specify the maximum time that may pass before modifications to an object are propagated, the update_rates method is called with user and time parameters. The hint has the same syntax as the Users hint, with update_rates_clear(the_object) used to clear the values. If no value is set for a client, then the other hints are used to decide on an update rate.

Access - The Access hint controls access privileges by specifying a list of the groups that can or cannot utilize an object, listing groups of people who have access and those who do not. This hint is similar to Unix file system permissions, but with greater complexity than read, write, and execute permissions. The hint has a rich syntax similar to web server page security [Apac99], including reserved words such as grant, deny, all, none, read, and write. The list is parsed with left-to-right precedence, terms on the right overriding those on the left. Names that appear in the Group Members system variable may be used when granting permissions, specifying which users have access to each method and what privileges they are granted.

Owners - Specifies ownership of the object. This hint can be used to grant access privileges to additional methods and maintenance functions, and it follows the same syntax and parsing rules as the Access hint.

Usage type - Based on the usage of the object, the Usage type hint aids in the selection and distribution of update techniques. The hint carries information about how the object is being used and for what purpose. This hint is very general in its definition and is therefore subject to numerous interpretations and implementations. The syntax for this hint is a string naming one of the types defined in Munin: Write-once, Private, Write-many, Result, Synchronization, Migratory, Producer-consumer, Read-mostly, and General read-write.

Persistent - Specifies whether the object is persistent or transient; i.e., the lifetime of the object can be specified to include a time and/or cause of death. The syntax is either the word persistent or the word transient, followed by a time or cause of death.

D.2 System-Level Hints

Hints can also be specified that apply to all objects in the application. An object called the system_environment object is created at each client process and contains all the hints relating to the system. The hints are modified using method calls and are then used in the object placement and update policy decision process.

Group Members - An associative array of group names and their members, including users, groups, and the reserved words all, none, and the not prefix. Note: an individual user is an instance of a group with only one member.

User Resources - A per-user environment variable that can be modified on the fly and specifies information about an individual end user, including the bandwidth the user wishes to allocate to the system. Someone running a DVE over a low-bandwidth connection might need to allocate the entire bandwidth, while someone multi-tasking on a high-bandwidth connection can reduce the allocation to a fraction.

Hardware Resources - A per-machine environment variable that specifies the resources provided by each machine. This variable has properties similar to User Resources, except that it applies to machines, which in some cases are just servers without associated users. The resources include maximum CPU power, current utilization, disk space, and other hardware details.

Environment Variables - User resources and hardware resources can be specified for each process using system_environment.resource(). By passing in bandwidth and a value, the auxiliary interface implementation is made aware of the quantity of data the process can handle. Other specifiable resources include computing power, available disk space, etc. The system_environment.resource() method is also used to query and be notified of the remaining resources. The developer can use this feature to throttle the application's demands in order to remain within acceptable resource limits.

D.3 Example of Auxiliary Interface Implementation and Utilization

The rest of this appendix gives examples of ways the abstract hints, which are implemented using the Gryphon auxiliary interfaces to an ORB, can be used to select a location at which to place an object. The following is an example of using the auxiliary interface to specify an object location. Assume that the developer knows, a priori, that the desired placement for the_object is location Host_X and needs a way to impart this information to the object distribution subsystem. The developer can achieve the desired placement by assigning the Location hint, which results in a move(the_object, Host_X). The Location hint directly indicates the desired location to the auxiliary interface and demonstrates the ability to specify object placement.

The following examples use a hint to achieve the desired placement when all other factors are inconsequential: all hosts are equidistant from one another, have equal bandwidth over the paths, and have equivalent available resources. When these factors hold, the Users hint is sufficient to achieve the desired object placement. To clarify the auxiliary interface, the internal functions that implement it and that analyze the information specified by the Users hint are described. The Users hint specifies a set of user-utilization pairs giving the relative amount each user is utilizing the distributed object in question. The locate_user() function maps a user name to the associated host name, locate_user(user_name) → host_name, and the place_object() function, place_object(weighted_host_list) → host, takes the weighted host information and produces the proper location placement for the object. These two functions are used within the auxiliary interface to return the host name where the object shall reside. The usage number paired with each user specifies the frequency with which that user is referencing the object. The developer supplies the utilization number that represents the user's reference frequency to non-cached copies of an object. Each user is then mapped to a physical host location (in what may be a many-to-one relationship) in order to determine the utilization number at each host. Host names are not specified directly, since a user may not know their host name, or the host may change if the user changes hosts. The object is placed at the location with the highest utilization.

Figure 23 is a simple example of using the Users hint to obtain the desired object placement. In this example User_X1 has a relative utilization number of 1. User_X1 is the only user located on machine Host_X, so the utilization number for Host_X is 1. Since this is the only host using the object, it must have the maximum utilization; therefore the object is placed on Host_X.

    /* Users is a list of users and their relative utilization of the object */
    Users = {(User_X1, 1)}

    /* The locate_user() function maps the users to host names and is called for each user */
    locate_user(User_X1) → Host_X

    /* The result of the mapping */
    weighted_host_list = {(Host_X, 1)}

    /* The next function finds the host utilizing the object the most */
    Location = place_object(weighted_host_list) → Host_X

Figure 23: Simple use of the Users hint.

Figure 24 is a more complex example with four users of the object. The auxiliary interface maps the users to hosts to generate the weighted host list. The weighted host list is used by the auxiliary interface to identify the object's highest-usage host, and the object is placed on that host, which in this example is Host_X. When the network bandwidth and machine loads are not homogeneous, system data must also be considered in the object placement process.


    /* Users is a list of users and their relative utilization of the object */
    Users = {(User_X1, 10), (User_X2, 3), (User_Y, 11), (User_Z, 1)}

    /* The locate_user() function maps the users to host names and is called for each user */
    locate_user(User_X1) → Host_X
    locate_user(User_X2) → Host_X
    locate_user(User_Y) → Host_Y
    locate_user(User_Z) → Host_Z

    /* The result of the mapping */
    weighted_host_list = {(Host_X, 13), (Host_Y, 11), (Host_Z, 1)}

    /* The next function finds the host utilizing the object the most */
    Location = place_object(weighted_host_list) → Host_X

Figure 24: More complex use of the Users hint.

User Resources and Hardware Resources are system variables that yield information about available bandwidth, CPU, memory, and disk, along with network topology information that is used by the auxiliary interface to decide object placement. The proper locations were achieved in the previous examples since sufficient information was provided to the auxiliary interface; however, even partial information from the developer improves the placement process over no information at all.
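The locate_user()/place_object() pipeline of Figures 23 and 24 can be sketched as below. The data mirrors Figure 24, but the container choices and function signatures are hypothetical simplifications of the auxiliary interface internals:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical user-to-host mapping, as locate_user() would provide.
std::string locate_user(const std::string &user,
                        const std::map<std::string, std::string> &directory) {
    return directory.at(user);
}

// Aggregate per-user utilization into per-host weights, then pick the
// host with the highest total weight, as place_object() does.
std::string place_object(const std::map<std::string, int> &users,
                         const std::map<std::string, std::string> &directory) {
    std::map<std::string, int> weighted_host_list;
    for (const auto &u : users)
        weighted_host_list[locate_user(u.first, directory)] += u.second;

    std::string best;
    int best_weight = -1;
    for (const auto &h : weighted_host_list)
        if (h.second > best_weight) { best = h.first; best_weight = h.second; }
    return best;
}
```

With the Figure 24 data, User_X1 and User_X2 both map to Host_X, giving it a combined weight of 13, so Host_X is selected over Host_Y (11) and Host_Z (1).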

Appendix E

Example Distributed Application

The following is a simple code example of an application built using the Gryphon-enhanced DOM. The code is divided by section headings for readability purposes only and should be considered one continuous listing.

E.1 Header Information

    /* The system header names were lost from the original listing; these
       are the standard headers the code below requires. */
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <assert.h>
    #include <signal.h>
    #include <errno.h>
    #include <sys/time.h>

    #include "simple_string.h"
    #include "IfObject.h"
    #include "distributed_objects.h"
    #include "local.h"

    #define SCENARIOOBJNAME "scenario_obj"
    VprConnection *Vcon = 0;
    #define MOVINGWRLOBJECT "soccer.wrl"

E.2 A Create Callback

    static void VconCreateObj(IfObject *o, void *arg)
    {
        assert(o != 0);
        if (!(strncmp(o->class_name().c_str(), VPR_DATA_STRING,
                      strlen(VPR_DATA_STRING)))) {
            GAAOP::hintObject hint_obj(o->ObjectId());
            hint_obj.maxCount = 3;
            hint_obj.maxDeltaTime = TimeVal(1, 48000); // 1 sec and 48 msec
            Vcon->SetHintObject(hint_obj);
        } else if (!(strncmp(o->class_name().c_str(), GAAOP_STRING,
                             strlen(GAAOP_STRING)))) {
            GAAOP::hintObject hint_obj(o->ObjectId());
            hint_obj.maxCount = 1;
            Vcon->SetHintObject(hint_obj);
        // special setting to optimize the virtual clock mechanism
        } else if ((ProcessNum != 1) &&
                   !(strncmp(o->obj_name().c_str(), GLOBAL_CLOCK_NAME,
                             strlen(GLOBAL_CLOCK_NAME)))) {
            GAAOP::hintObject hint_obj(o->ObjectId());
            hint_obj.maxCount = 1;
            Vcon->SetHintObject(hint_obj);
        } else if (!(strncmp(o->obj_name().c_str(), LOCAL_CLOCK_PREFIX,
                             strlen(LOCAL_CLOCK_PREFIX)))) {
            GAAOP::hintObject hint_obj(o->ObjectId());
            if (ProcessNum == 1)
                hint_obj.maxCount = 1;
            else
                hint_obj.maxCount = 0;
            Vcon->SetHintObject(hint_obj);
        }
    }

E.3 A Modify Callback

    static void VconModifyObj(IfObject *o, void *arg)
    {
        // update the time if this is the clock object
        if (o == VO_clock) {
            time_object::time_attributes time_attr;
            if (o->GetAttr(&time_attr)) {
                TimeVal current_time(time_attr.seconds, time_attr.useconds);
                Vcon->SetCurrentTime(current_time);
            }
        }
    }

E.4 A Delete Callback

    static void VconDeleteObj(IfObject *o, void *arg)
    {
        base_distributed_object::base_attributes b_attr;
        o->GetAttr(&b_attr);
        fprintf(stderr, "%d:%s: IfObject Delete %d:%s\n",
                ProcessNum, Progname, o->ObjectId(), b_attr.name.c_str());
    }

E.5 Creation

    void create_moving_objects(int ProcNum)
    {
        vpr_data::vpr_attributes vpr_attr;
        char *moving_wrl_object = MOVINGWRLOBJECT;
        char buf[64];
        vpr_attr.vrml = moving_wrl_object;
        vpr_attr.rot.init();
        int Modifiable_Objects = 5;
        // create new objects if they don't already exist
        int i = 1;
        for (i = 1; i <= Modifiable_Objects; i++) {
            /* The object-name construction was lost from the original
               listing; this sprintf is reconstructed to match the pattern
               parsed in Modify(). */
            sprintf(buf, "%s-%d-%d", SCENARIOOBJNAME, ProcNum, i);
            if (Vcon->Locate(buf)) {
                fprintf(stderr, "%d: Creating Failed Object Exists: %s\n",
                        ProcNum, buf);
            } else {
                vpr_attr.name = buf;
                fprintf(stderr, "%d: Creating Object : %s\n",
                        ProcNum, vpr_attr.name.c_str());
                vpr_attr.type = IfTypeTRANSIENT;
                IfObject *Newobj = Vcon->CreateObj(&vpr_attr);
            }
        }
    }

E.6 Modification

    static void Modify(IfObject *o, double radial, vpr_point3 axis)
    {
        vpr_data::vpr_attributes vpr_attr;
        char buf[64];
        int process_num;
        int sub_id;
        int parse_value;

        parse_value = sscanf(o->obj_name().c_str(), "%[^0-9-]-%d-%d",
                             buf, &process_num, &sub_id);
        if (parse_value != 3 ||
            strncmp(buf, SCENARIOOBJNAME, strlen(SCENARIOOBJNAME)) != 0 ||
            process_num != ProcessNum)
            return;
        if (o->GetAttr(&vpr_attr)) {
            double ang = clock_to_radian(radial);
            vpr_attr.rot.set(ang, axis);
            if (!o->PutAttr(&vpr_attr))
                fprintf(stderr, "%d - Failed Update to Object %s\n",
                        ProcessNum, vpr_attr.name.c_str());
        }
    }

E.7 Termination

    void cleanup()
    {
        delete Vcon;
    }

    void onintr(int num)
    {
        fprintf(stderr, "error: %d - %d Intr : Going Down\n", ProcessNum, num);
        cleanup();
        exit(-2);
    }

E.8 Initialization and Processing

    int main(int argc, char **argv)
    {
        Progname = argv[0];
        GetOpts(argc, argv);
        /* The format string is repaired here; the original listing was
           missing the conversions for the host and session arguments. */
        fprintf(stderr, "%s: Connecting to host %s port %d session %s\n",
                Progname, HostnameOpt, PortOpt, SessionOpt);
        Vcon = new VprConnection(HostnameOpt, PortOpt, SessionOpt, 0,
                                 APP_CONTROLLED_TIME, APP_NOT_DOING_PULLS_OFTEN);
        if (!Vcon->IsConnected()) {
            fprintf(stderr, "%s: Unable to Connect\n", Progname);
            exit(-1);
        }
        Vcon->SetCallBack(VprConnection::CallBack::CB_CREATE, VconCreateObj, 0);
        Vcon->SetCallBack(VprConnection::CallBack::CB_MODIFY, VconModifyObj, 0);
        Vcon->SetCallBack(VprConnection::CallBack::CB_DELETE, VconDeleteObj, 0);
        if (signal(SIGINT, onintr) == SIG_ERR) {
            fprintf(stderr, "%s: signal: %s\n", Progname, strerror(errno));
            exit(-1);
        }
        TimeVal now;
        struct timeval t_now;
        if (gettimeofday(&t_now, 0))
            printf("Time call error\n");

163 now = t_now; if (!Vcon->Run(TheTime,now)) { fprintf(stderr,"Could not run the Vcon\n"); exit(-1); } // set default hint const GAAOP::hintObject *constHint; GAAOP::hintObject modHint; constHint = Vcon->LocateHintObject(GAAOP_DefaultObjectId); if (!constHint) { fprintf(stderr,"Could not get default hint in the Vcon\n"); exit(-1); } modHint = *constHint; modHint.maxCount = Param_R; modHint.supportAsync = 0; modHint.supportSync = 1; modHint.supportMove = 0; Vcon->SetHintObject(modHint); create_moving_objects(ProcessNum); while (Vcon->GetTraceTime() < DoneTime) { // get all the waiting modifications. Vcon->Pull(); for(IfObject *v = Vcon->First(); v; v = Vcon->Next(v)) { Modify(v,radial,axis); } clock_sleep(sleep_time); cleanup(); }
