Elastic Infrastructure for Interactive Data Farming ...

Available online at www.sciencedirect.com

Procedia Computer Science 9 (2012) 206 – 215

International Conference on Computational Science, ICCS 2012

Elastic Infrastructure for Interactive Data Farming Experiments Dariusz Krola , Bartosz Kryzaa , Michal Wrzeszcza , Lukasz Dutkaa , Jacek Kitowskia,b b AGH

a AGH University of Science and Technology, ACC Cyfronet AGH, Krakow, Poland University of Science and Technology, Faculty of Electrical Engineering, Automatics, Computer Science and Electronics, Department of Computer Science, Krakow, Poland

Abstract With the increasing availability of high performance computing power, new possibilities with respect to simulation and analysis become available and feasible. One of such methodologies is data farming, where large amounts of data are generated through simulation of several configurations from large parameter space and then analyzed for patterns. Unfortunately, the availability of versatile data farming systems is very limited and none of existing solutions allows integration with novel Cloud solutions. In this paper, we present our system, which is a flexible solution for running very large Data Farming experiments on both intranet clusters as well as on remote computational resources, including public Clouds. Another important aspect of the presented system is the support of interactive Data Farming experiments with online analysis of partial experiment results and experiment extending capability. Sample application of our system is present on military mission planning support scenario. Keywords: data farming, cloud computing, agent-based simulation, military mission planning

1. Introduction According to several trends in modern scientific research, it is becoming clear that scientific research as well as technological breakthroughs will be dependent on the possibilities available for large scale data processing [1]. This notion is often called the 4th paradigm in scientific research [2]. One of the novel techniques possible with the availability of substantial computing power is the data farming, i.e. generation of large amounts of data from computer based simulations and not collected from physical sensors or experiments. This is a very important technique in areas and problems where real data is not readily available. This includes all kinds of problems where the results of experiments are highly nonlinear and where the parameter space is very complex, which results in impossibility of predicting the outcome of an experiment for a given parameter configuration and thus only a large number of simulations for multiple parameter combinations can give an insight into the problem. In particular the problem of simulating and predicting outcomes of agent based simulations representing people or entire crowds often gives qualitatively different results even with minor parameter shifts, which requires a significant coverage of the parameter space in order to get meaningful data for further analysis. In this paper we discuss a flexible system for running such large scale data farming experiments and its application to the problem of military mission Email addresses: [email protected] (Dariusz Krol), [email protected] (Bartosz Kryza)

1877-0509 © 2012 Published by Elsevier Ltd. doi:10.1016/j.procs.2012.04.022

Dariusz Krol et al. / Procedia Computer Science 9 (2012) 206 – 215

207

planning support for asymmetric threat scenarios, i.e. operations involving relatively small number of soldiers or policemen which have to deal with a larger crowd of hostile or neutral civilians. In this particular use case, the simulations are implemented using the agent based modeling paradigm and each experiment instance includes evaluation of the base scenario (e.g. protection of entrance to a military base) with some modifications of the input parameters (such as number of civilians or their hostility). Unfortunately, even in simple scenario having 10 parameters with 10 possible values gives 1010 possibilities which combined with several minutes necessary for each simulation to evaluate, makes it impossible to perform full coverage of the parameter space. For this purpose most data farming experiments require reduction of the parameter space using one of the design of experiment methods (e.g. Nearly Orthogonal Latin Hypercube) which allows to select only a subset of the parameter space which should bring the most relevant results. Our system includes a web-based Graphical User Interface for easy experiment setup, i.e. definition of parameter ranges, selection of design of experiment methods, monitoring and interactive management of the experiment, allowing for instance to fine tune the simulated parameter space based on the partial results of the experiment. A description of the system in context of military mission planning support is presented in [3]. The presented system is developed within the European Urban Simulation for Asymmetric Scenarios (EUSAS) project, which is financed by 20 nations under the Joint Investment Program Force Protection of the European Defence Agency (EDA). The main purpose of the project is to support a training process of security forces, such as military, policemen, etc. in asymmetric scenarios, i.e. the number of participants on each side is different. The project aims at improving tactics, techniques and procedures of security forces by running and analyzing multi-agent simulations, which utilize advanced psychological models of human behaviour. The rest of the paper is organized as follows. In Section 2, we present existing solutions for performing data farming. Section 3 contains basic information about Data Farming process. Next, in Section 4, we describe our system, its main design principles and goals we wanted to accomplish. Then, in Section 5, the process of conducting a Data Farming experiment is depicted with support of our solution. Finally, in Section 6, we discuss the implementation of the presented system along with technologies utilized during the development process. We conclude this paper in Section 7 and describe future work directions of this research. 2. State of the art Flexible and universal data farming systems are not yet available. A good overview of current status of various approaches to data farming is presented in [4]. Several systems have been developed for the military applications in order to support the mission planning and decision making process in the higher echelons [5]. In [6] authors present one of the first attempts to run large scale data farming experiments based on the agent modeling library called NetLogo [7]. Authors of [8] present more elaborate example of running data farming experiments using agent based modeling techniques with complex human psychological models requiring large parameter space. The problem of too large parameter space is addressed by application of proper methods of design of experiment. In [9] authors discuss novel methods for limiting the parameter space in data farming experiments and in [10] authors discuss in particular the Nearly Orthogonal Latin Hypercube (NOLH) method. A more up to date discussion of problems related to data farming in military applications is presented in [11]. The problems directly related with asymmetric threat scenarios and their agent based modeling and simulation are discussed in [12, 13]. One of the few tools allowing actual data farming is the OldMcData which provides means for generating combinations of parameters using design of experiment methods, in particular NOLH, and integrates with Condor [14] scheduler for running the jobs and collecting results. With the increasing popularity of Cloud computing, several research projects investigate its efficiency and usefulness in High Performance Computing (HPC). In [15], authors tested the Amazon Elastic Cloud Compute system with several scientific application as benchmarks, especially atmosphere-ocean climate modelling. Although, the performance is lower comparing to dedicated data centers, it can be compared to low-cost clusters. Also in [16], authors noticed the lower performance of cloud computing systems comparing to dedicated clusters and Grid environments, though pointing a few appealing features of clouds from scientists point of view, e.g. scalability and no infrastructure maintenance costs. Benchmarking of scientific workflows in Clouds is described in [17]. Authors presented performance comparison of workflow runtime both in clouds and in a dedicated cluster. In addition cost comparison is

208


presented. While the measured performance is lower in clouds but reasonable given the used resources, the overall cost is relatively small. Since data farming is still a relatively uncharted territory none of existing tools or libraries provides functionality required for flexible running of various data farming experiments with different types of parameters and even simulation implementation technologies. 3. Basics of Data Farming process Before describing our system for Data Farming, we should explain vocabulary required to understand a Data Farming process. Activity in Data Farming is divided into processes called Data Farming experiments (DF experiments). Each DF experiment aims at investigating a single scenario, which describes a concrete situation in a concrete physical environment. A definition of a scenario contains information about physical environment in which the scenario is taking place, description of actors, i.e. civilians and security forces, who are participating in the situation. Each actor is represented in the scenario by a programming agent, who can have multiple input parameters, e.g. prestige or a level of agression. In addition, an integral part of an agent is his/her behaviour, which is an algorithm, which decides how a particular agent acts in different situations. A DF experiment is configured by its input parameter set, which includes input parameters of all involved agent and other input parameters, which can describe various aspects of the scenario, e.g. number of agents or wheather. Each such input parameter can have multiple values to check. After collecting values of every input parameter, a set of vectors of input parameter values is generated. Performing a DF experiment means running a scenario specified in DF experiment definition multiple times but with different input values. Each such a run is called experiment instance. The output of an experiment instance is twofold. The first element is a set of values, called Measures of Effectiveness (MoE), which describe properties of the running scenario important from the end user point of view. Depending on the executed scenario, MoE can be any measurable property, e.g. the number of injured agents or the state of concrete agent mind. The second element of results is a log file from the experiment instance. It contains detailed information about the progress of an experiment instance, i.e. a list of performed actions performed by agents during the runtime. The output of a DF experiment is constituted of aggregated results from each experiment instance. 4. Elastic Environment for Data Farming As existing systems for performing Data Farming have many constraints regarding flexibility and accessibility, we developed our own solution using state of the art technologies to accomplish the following goals: • Infrastructure flexibility is probably the most desirable feature, when considering running complex simulation at a large scale in today’s data centers. Such computations can be done by using different types of computational resources: dedicated worker nodes if there is a number of machines to spare, virtual machines when we want to share the same hardware by multiple users simultaneously with minimal administration effort. Recently, the cloud computing approach gained popularity, especially in cases where a single organization does not have sufficient computational power. Public clouds can be utilized to perform important computations in tight schedule and require payment only for actually used CPU-time. • Experiment interactivity is a feature of long-lasting experiments, which can evolve during runtime based on partial results and analysis. In existing systems, which do not support interactive DF experiments, the user has to run a desired DF experiment multiple times with different set of input parameter values just to explore what happens in different cases. This is a time-consuming and uncomfortable process for users, thus in our system, we provide a few easy to use tools for analysing partial results of an experiment and extending the experiment with additional input parameter values. With this approach, the user can start a small DF experiment with a dozen of experiment instances, which will grow during the runtime, based on partial results. • Accessibility from various devices is an important feature in our mobile world, where users work on multiple devices, such as workstations, laptops or tablets. Many existing systems are based on the thick client approach, which are not intended to run on mobile devices, e.g. tablets or smartphones, being limited only to powerful


209

workstations and laptops. In contrast, our system is based on web-technologies and thin clients, which require only a web browser on a device, while the bussiness logic is stored and performed only on the server side.

Figure 1: Overview of the Elastic Data Farming environment.

The structure of the presented solution is depicted in Figure 1. The central point of the environment, called Simulation manager, is responsible for running DF experiments on a managed infrastructure, which can consist of heterogenous computational resources such as physical worker nodes or virtual machines. Moreover, we intend to support existing public clouds, e.g. Amazon Elastic Compute Cloud (Amazon EC2) [18] at first, which can be treated as sources of computing power, which are easy to access on demand and exploit and which go beyond a single organization infrastructure. The second responsibility of Simulation manager is serving dynamic content to users with the HTML language, which can be interpreted with a standard web browser, even in a mobile version. Internally, Simulation manager is horizontally divided into components, which are responsible for managing two separate parts of the environment: the experiment-related part and the infrastructure-related part. In addition, components of each part are divided vertically according to the Model-View-Controller (MVC) design pattern. All information from the environment is stored in a dedicated database: definition of DF experiments (current and historical), configuration of generated experiment instances and their results, and information about available infrastructure. Besides the Simulation manager application, each computational resource includes a dedicated piece of software, which is responsible for running experiment instances on the resource using input values retrieved from Simulation manager. After finishing an experiment instance, results from the instance are stored on a shared storage, and information about obtained MoE values is sent to Simulation manager. The most crucial functionality of the Simulation manager is dispatching experiment instances to available computational resources. Moreover, DF infrastructure management is supported, hence the user can add computational power to boost a running experiment. Current version of the system allow to start new virtual machines from a special template, which is configured to work with our system. A sequence diagram of such a boost along with activity performed by software located on the virtual machne is depicted in Figure 2. At first, each started virtual machine checks whether or not any DF experiment is running. If so, the repository with all neccessary files for running the experiment is downloaded and unpacked. Then, availability of experiment instance to perform is checked. Simulation manager decides, which experiment instance should be sent to the computational resource, based on the scheduling policy assigned to the experiment. For each available experiment instance, a config

210


Figure 2: A diagram of communication between Simulation manager and computational resource.

file with concrete set of input parameter values is returned and a seperate folder hierarchy is prepared on the shared storage. Based on the config file, a multi-agent simulation is run on the computational resource, and logs from the simulation are stored on the shared storage. After finishing the simulation, MoE values are sent from the computational resource to the Simulation manager, which persists them in a database, and marks the experiment instance as done. 5. Conducting Interactive Data Farming Experiments The environment described in the previous Section intends to perform complex Data Farming experiments with heterogenous computing resources. Currently, we are testing the infrastructure by running multi-agent simulations in urban locations, which aim at supporting the training process of security forces. A sample scenario involves controlling the access of civilians to a military base camp during elections in a mission abroad. In this scenario, civilians dividied into two groups, each of which has its leader, are waiting in front of a camp entrance to an operation base and intend to start a skirmish. From the security forces point of view, the goal of this scenario is to prevent escalation of agression by effectivie negotiations. However, civilians may act differently, depending on input parameter values, hence actions performed by security forces should be adjusted to a concrete behaviour. Each DF experiment consists of a few phases, each of which is facilitated by our system. The first phase is called parametrization and its goal is to define a set of input values. Each element of the set is a vector of input parameter values. The user goes to this phase, after selecting one of available scenarios to run. A definition of the selected scenario is stored in an XML file and among other things, it contains information about every input parameter of the scenario, e.g., parameter name and its constraints. The parametrization phase involves three steps: • Selecting type of parametrization for each parameter. Currently, we are supporting: a range of values, a random value with Uniform or Gauss distribution and a concrete value. Each parametrization type involves specifying different parameters. The random parametrization type requires parameters for selected distribution configuration. The range parameterization type requires minimum, maximum and step parameters. The concrete value parameterization type requires only the value.


211

Figure 3: Data Farming experiment monitoring view.

• Providing values for input value generator for each parameter, based on selected type of parameterization. While providing all the required values, the user can check how many vectors of input parameter values for the DF experiments will be generated. Additionally, a life validation is performed to exclude scenarios when the user provide invalid values, e.g. characters of string instead of numbers. • Applying Design of Experiment (DoE) methods [19]. The main goal of designing an experiment is reduction of the number of elements in the set of input parameter values while preserving these vectors of input parameter values, which can lead to meaningful results. The current version of the environment supports the following DoE methods: 2 to k method, full factorial, fractional factorial with the Federov algorithm [20] and our own custom method, which is based on Near Orthogonal Latin Hypercubes [21]. While, the basic NOLH method requires equal number of values for each input parameter, we modified the NOLH method to accept parameters with different number of possible values, by aligning the number of values to the highest value counter of the given parameter set. The added values are duplicates of existing values. During this phase, input parameters can be arbitrarily grouped with different DoE method applied to each group. Depending on the goal of performing a DF experiment and applied DoE methods, one can achieve a reduction of several order of magnitudes in the set of input parameter values without loosing interesting results. After specifying the set of input parameter values, all possible value combinations are generated and the DF experiment is started, i.e. experiment instances can be performed on a Data Farming infrastructure. Each running DF experiment can be monitored with a dedicated view, depicted in Figure 3. This view provides information about the progress of an experiment in form of a torrent-like progress bar along with a few statistics experiment instances, e.g. a number of all, done and currently running experiment instances and an average execution time of a single experiment instance. The progress bar is divided into a number of vertical pieces, each of which represents one or more experiment instances. The color of a vertical piece depicts the progress of experiment instances execution, which

212


are embraced by the piece. Using such a graphical representation of the progress allows users to identify at which state the experiment is. On the other hand, the monitoring view provides detailed information about the number of experiment instances in various states, i.e. idle, currently running and completed. In addition, a predicted time of DF experiment completion is estimated and updated continuously. The second type of actions, which can be performed with the monitoring view is analysis of partial results of a DF experiment. Moreover, a DF experiment can be extended, i.e. one can create new vectors of input parameter values, by using the knowledge from such an analysis. An analysis of partial result can be done with two types of charts: histograms and regression trees. Each chart accepts a name of a Measure of Effectiveness as the input. A histogram chart presents how many experiment instances ended with different MoE values. The user can specify into how many regions the MoE value should be divided, thus allowing to control the resolution of the chart. Besides the chart, this type of analysis presents some basic statistical information about the selected MoE, i.e. minimum, mean, maximum and standard deviation values. While the histogram chart provides some basic yet valuable information about the destribution of MoE values, the regression tree chart [22] provides information about the influence of input parameters on the selected MoE. For selected MoE, a regression tree chart draws a binary tree, whose each node (except leaves) contains the following information: • an inequality where on the left side is the name of an input parameter, and on the right side is a value, • a mean value of selected MoE, calculated based on completed experiment instances, which are embraced by the node, • the number of experiment instances embraced by the node. By embraced experiment instances, we mean these instances, which satisfy inequalities occuring on a path from the root node to the tree node. In the root node, the number of embraced experiment instances equals the number of experiment instances finished till the start of performing analysis. Besides performing an analysis of partial experiment results, one can extend the set of input parameter values of DF experiment with regression tree charts. In each node of a regression tree, which is not a leaf, an icon of a plus sign is presented. After clicking the icon, a dialog is opened with a list of already generated values of the input parameter from the tree node. Such a dialog with a sample regression tree is depiced in Figure 4. Besides the parameter values, the dialog contains a form for defining a range of additional values, which should be generated with a given step. In addition, a priority of these new instances can be defined. Experiment instances with higher priority will be performed first. After submitting such a form, addidional experiment instances are created with specified values for the given input parameter. Afterwards, the progress bar of the experiment is updated along with information about the number of all experiment instances from the experiment. In our case, we intended to analyze, which parameters of civilians influence the ”Global Anger” MoE the most, as the anger level should be as low as possible to prevent agression escalation. By creating this regression tree, we can deduct that the prestige parameter of two concrete civilans is the most important factor. These two civilians are leaders of two groups, thus to lower the anger level of civilans, it seems that soldiers should try to prevail these two agents. However, as the agent prestige parameters has only two different values in our experiment, we can extend the experiment to investigate more values of this parameter. Here, the support of experiment interactivity is especially important. Without this feature, the user would have prepare a seperate new experiment from scratch, and try to combine results from two experiments afterwards. As the size of a DF experiment can be increased during runtime, also the amount of computational power, which is dedicatad for performing the experiment, can be adjusted. To facilitate this action, we provide a dedicated dialog, which is opened after clicking the ”Booster” button. Using this dialog, one can specify how many computational power should be added to the experiment in form of virtual machines with a specified number of CPU cores and the mount of RAM memory. After completing the dialog, the required number of virtual machines will be started on the DF infrastructure. Naturally, the number of started virtual machines will be constrained by the amount of available resources. Besides scaling DF infrastructure dedicated to a single experiment, the user can specify scheduling policy, which controles the order in which experiment instances are performed. Currently three policies are supported: Monte Carlo, Sequential forward and Sequential backward. Depending on users requirements different policies can be used.


213

Figure 4: A regression tree with a dialog for extending an experiment.

After a DF experiment is completed, the user can download results of the experiment, which consist of log files from every experiment instance and a Comma-Seperated Values (CSV) file with input parameter values and obtained MoE values from every experiment instance. Such results can be used for further analysis with third-party tools. The EUSAS project utilizes a dedicated algorithm for behavioural pattern discovery, named MASDA [23], to analyze log files. The CSV file can be processed by any tool oriented on statistical analysis, e.g. JMP [24], since the presented environment does not address the issue of detailed statistical analysis. 6. Implementation To speed up the development process, we decided to use the Ruby on Rails framework [25]. It is a modern programming framework oriented on building web applications based on the Model-View-Controller architectural pattern, which supports the seperation of concerns principle. To perform previously described analysis tasks, e.g. creating regression trees, we use an open-source statistical language called R [26]. Integration between Ruby and R is handled by the RinRuby project [27], which allows to run and exchange data between R and Ruby programs. Thus, to create regression tree we only have to put the name of MoE, which should be analyzed, then perfomed a dedicated R function for creating regression trees based on data from a file, and retrieve results afterwards. To enhance the user experience within the view layer, we applied several javascript libraries such as jQuery, Infovis, Highcharts to produce appealing, interactive charts, dialog windows, and to implement the life validation behaviour. Also, to increase interactivity of the view layer, we use the AJAX-based, asynchronous method calls. Moreover, the presented system utilizes the fifth version of the HTML language, often referred to as HTML5. Compared to the previous version, HTML5 provides several useful mechanisms, e.g. WebSockets, which provide a bidirectional communication channel between the server and the client. Thus, sending information to clients, e.g. push notifications, can be implemented easily. Concerning infrastructure management, we utilize the libvirt project [28], which allows to manage various hypervisors remotely with a common API. On the server side, libvirt requires a special service, which accepts remote connections, thus each host which run virtual machines has to be appropriately configured. Besides libvirt, we use

214


the standard Ruby library, which provides means for establishing SSH connections to remote locations. Using this module, we can perform any terminal command on each server to which we have access rights. On each computational resource, a small, Java-based applications is deployed. It works in the pull mode to retrieve configurations of experiment instances to run, similarly to approaches presented in [29, 30, 31]. The number of instances that are running simultaneously at each virtual machine equals to the number of cores available for it. Multi-agent simulations are implemented using the MASON framework [32], which is written in Java, and they use log4j to log all important information to files. Each simulation in MASON uses one thread for its main loop, though it can utilize multi-core (pure CPU) machines easily. To communicate between Simulation manager and applications on computational resources, we use the pure HTTP protocol and the REST model. In our test installation, we exploit the Network File System protocol [35] as a foundation of the shared storage component. Though, NFS is often regarded as not scalable, its latest versions, called parallel NFS, look promising when concerning scalability. It intends to mitigate the single server bottleneck by enabling direct and parallel storage access [36]. As computational resources, we have tested virtual machines running on a cluster, physical machines such as laptops and cluster nodes, and virtual machines running on the Amazon EC2 public cloud. 7. Conclusions and Future Work In this paper, we have presented a versatile system for running large scale data farming experiments involving processing of very complex parameter space where custom design of experiment methods and interactive fine tuning of processed parameter space are required. The presented system is implemented using open source technologies and can be used on most of existing cluster systems as well as integrated with selected commercial Cloud providers. The system is currently being evaluated for military mission planning support in order to generate large amounts of data for further analysis in order to improve behavior models for agent based simulation component and to allow drawing conclusions regarding selected measured of effectiveness for higher echelons. Regarding future work, we plan to integrate with several, popular cloud providers such as RackSpace Cloud Servers [33]. Besides public clouds, we will provide means for integration with private clouds, which use open-source solutions, e.g. Eucalyptus [34]. As cloud computing becomes more and more popular, integration with existing clouds is especially important. In addition, we intend to enhance the shared storege component of the environment. Currently, it is based on the Network File System protocol, which is insufficient when concerning hundreds or thousands of virtual machines running in public clouds. Thus, we will deploy a dedicated storage cloud, based on Eucalyptus, with our own extension for supporting distributed storage [37]. Additionally, we will incorparate a novel approach for intelligent data distribution based on data usage profiles [38]. Acknowledgments. This work is supported by the EDA project A-0938-RT-GC EUSAS. References 1. A. Szalay and J. Gray, 2020 Computing: Science in an exponential world, Nature 440, pp. 413-414, 2006, doi:10.1038/440413a. 2. Hey, T., Tansley, S., and Tolle, K. M. (ed.) The Fourth Paradigm: Data-Intensive Scientific Discovery, Microsoft Research, 2009 3. Kryza, B., Krol, D., Wrzeszcz, M., Dutka, L., Kitowski, J., Interactive Cloud Data Farming Environment For Military Mission Planning Support, in: Computer Science Annual of AGH-UST, AGH Press, Krakow, in press. 4. Horne, G. E., Schwierz, K.-P. Mason, S. J., Hill, R. R., Mnch, L., Rose, O., Jefferson, T. and Fowler, J. W., Data Farming around the world overview, WSC, 2008, pp. 1442-1447 5. Horne, Gary E., Beyond Point Estimates: Operational Synthesis and Data Farming, in: Maneuver Warfare Science 2001, ed. Gary Horne and Mary Leonardi, Marine Corps Combat 6. Koehler, M. T., Upton, S. and Tivnan, B. F., Clustered Computing With NetLogo And RepastJ: Beyond Chewing Gum and Duct Tape, In Proceedings of the Agent 2005 Development Command Publication, 2001, pp. 1-7. 7. Tisue, S. and Wilensky, U., NetLogo: A simple environment for modeling complexity, in Ali Minai and Yaneer Bar-Yam, ed., ’Proceedings of the Fifth International Conference on Complex Systems ICCS 2004’ , 2004, pp. 16–21 . 8. Schwarz, G., Kunde, D. Urban, C. Agent based Modeling in Military Operations Research, A Case Study. In: 3rd Workshop on Agent based Simulation / C. Urban (ed.), SCS Society for Computer Simulation Int. ; Delft ; Erlangen ; Ghent ; San Diego, 2002, p. 55-60 9. Kleijnen, J.P.C., Sanchez, S.M., Lucas, T.W., and Cioppa, T.M., A user’s guide to the brave new world of designing simulation experiments, INFORMS Journal on Computing, 2005, 17(3), pp. 263-289. 10. Cioppa, T.M., Lucas, T.W., Efficient nearly orthogonal and space-filling Latin hypercubes, Technometrics, 2007, 49(1): 45-55. 11. Horne, G. E. and T. Meyer, 2010, Data farming and defense applications, MODSIM World Conference and Expo, ”21st Century DecisionMaking, The Art of Modeling and Simulation”, Hampton Roads Convention Center, Hampton, VA, USA 13-15 October 2010


215

12. Hluchy, L. Kvassay, M. Dlugolinsky, S. Schneider, B. Bracker, H. Kryza, B. Kitowski, J., Handling internal complexity in highly realistic agent-based models of human behaviour, proc. of 6th IEEE International Symposium on Applied Computational Intelligence and Informatics (SACI 2011), 2011, pp. 11-16 13. Kvassay, M., Hluchy, L., Kryza, B., Kitowski, J., Seleng, M., Dlugolinsky, S., Laclavik, M., Combining object-oriented and ontology-based approaches in human behaviour modelling, proc. of IEEE 9th International Symposium on Applied Machine Intelligence and Informatics, 2011, Smolenice, Slovakia, pp. 177 182 14. D. Thain, T. Tannenbaum, and M. Livny, ”Distributed Computing in Practice: The Condor Experience” Concurrency and Computation: Practice and Experience, 17(2-4), pp. 323–356, February-April, 2005. 15. Evangelinos, C., Hill, C. N., Cloud Computing for parallel Scientific HPC Applications: Feasibility of Running Coupled Atmosphere-Ocean Climate Models on Amazon’s EC2, Cloud Computing and Its Applications 2008, [online: www.cca08.org/papers/Paper34-Chris-Hill.pdf , as of December 27, 2011] 16. Vecchiola, C., Pandey, S., Buyya, R., High-Performance Cloud Computing: A View of Scientific Applications, Proceedings of the 10th International Symposium on Pervasive Systems, Algorithms and Networks (I-SPAN 2009, IEEE CS Press, USA), Kaohsiung, Taiwan, December 14-16, 2009. 17. Juve, G., Deelman, E., Vahi, K., Mehta, G., Berriman, B., Berman, B. P., Maechling, P., Scientific Workflow Applications on Amazon EC2, Workshop on Cloud-based Services and Applications in conjunction with 5th IEEE International Conference on e-Science, Oxford UK, December 9-11, 2009. 18. Amazon Elastic Compute Cloud website [on-line: http://aws.amazon.com/ec2, as of December 27, 2011]. 19. Atkinson, A. C., and Donev, A. N., Optimum Experimental Designs, Clarendon Press, Oxford. 1992. 20. Nguyen, N., Miller, A. J., A review of some exchange algorithms for constructing discrete D-optimal designs, Computational Statistics & Data Analysis, 1992, 14(4), pp. 489-498. 21. Ye, K. Q., Orthogonal column Latin hypercubes and their application in computer experiments, Journal of the American Statistical Association, 1998, 93 (444), pp. 14301439, doi:10.2307/2670057. 22. Breiman, L., Friedman, J., Olshen, R., Stone, C., Classification and Regression Trees, Belmont, California: Wadsworth, 1984, ISBN: 0412048418. 23. Bezek, A., Gams M., Bratko, I., Multi-agent strategic modeling in a robotic soccer domain, Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems, May 08-12, 2006, Hakodate, Japan. 24. Sall, J., Creighton, L., Lehman, A., JMP Start Statistics: A Guide to Statistics and Data Analysis Using Jmp, Fourth Edition, SAS Press, 2007, ISBN-10: 159994572X. 25. Ruby on Rails framework websize [on-line: http://http://rubyonrails.org/, as of December 27, 2011]. 26. Curran, J. M., Introduction to Data Analysis with R for Forensic Scientists, CRC Press, Boca Raton, FL, 2011, ISBN 9781420088267. 27. Dahl, D. B., Scott Crawford, S., RinRuby: Accessing the R Interpreter from Pure Ruby, Journal for Statistical Software, 29(4), Jan 2009. 28. Bolte, M., Sievers, M., Birkenheuer, G., Niehorster, O., Brinkmann, A., Non-intrusive Virtualization Management using libvirt, Design, Automation & Test in Europe Conference & Exhibition (DATE), 8-12 March 2010, pp. 574 - 579, ISSN: 1530-159. 29. Funika, W., Korcyl, K., Pieczykolan, J., Skital, L., Balos K., Slota R., Guzy K., Dutka L., Kitowski J., Zielinski K., Adapting a HEP Application for Running on the Grid, in: Computing and Informatics, 28(3)(2009)353-367. 30. J. Marco, I. Campos, I. Coterillo, I. Diaz, A. Lopez, R. Marco, C. Martinez-Rivero, P. Orvis, D. Rodriguez, J. Gomes, G. Borges, M. Montecelo, M. David, B. Silva, N. Dias, J. P. Martins, C. Fernandez, L. Garcia-Tarres, C. Veiga, D. Cordero, J. Lopez Cacheiro, I. Lopez, J. Garcia-Tobio, N. Costas, J. C. Mourino, A. Gomez, W. Bogacki, N. Meyer, M. Owsiak, M. Plociennik, M. Pospieszny, M. Zawadzki, A. Hammad, M. Hardt, E. Fernandez, E. Heymann, M. A. Senar, A. Padee, K. Nawrocki, W. Wislicki, P. Heinzreiter, M. Baumgartner, H. Rosmanith, D. Kranzmuller, J. Volkert, S. Kenny, B. Coghlan, R. Pajak, Z. Mosurska, T. Szymocha, P. Lason, L. Skital, W. Funika, K. Korcyl, J. Pieczykolan, K. Balos, R. Slota, K. Guzy, L. Dutka, J. Kitowski, K. Zielinski, L. Hluchy, M. Dobrucky, B. Simo, O. Habala, J. Astalos, M. Ciglan, V. Sipkova, M. Babik, E. Gatial, R. Valles, J. M. Reynolds, F. Serrano, A. Tarancon, J. L. Velasco, F. Cstejon, K. Dichev, R. Keller, and S. Stork, The Interactive European GRID: Project Objecives and Achievements, Computing and Informatics, 27(2)(2008)161-171. 31. Korcyl K., Szymocha T., Funika W., Kitowski J., Slota R., Balos K., Dutka L., Guzy K., Kryza T., Pieczykolan J., The ATLAS experiment on-line monitoring and filtering as an example of real-time application, Computer Science, 9(2008)77-86, AGH University Press. 32. Luke, S., Cioffi-Revilla, C., Panait, L., Sullivan, K., Balan, G., MASON: A Multi-Agent Simulation Environment, In Simulation: Transactions of the society for Modeling and Simulation International, 2005, 82(7), pp. 517-527. 33. Rackspace Cloud Servers, [on-line: http://www.rackspace.com/cloud/cloud hosting products/servers/, as of December 12, 2011] 34. Eucalyptus Systems Inc. [on-line: http://open.eucalyptus.com/, as of December 27, 2011] 35. Network File System version 4 protocol specification [on-line: http://tools.ietf.org/html/rfc3530, as of December 27, 2011]. 36. D. Hildebrand, P. Honeyman, Direct-pNFS: scalable, transparent, and versatile access to parallel file systems, Proceedings of the 16th international symposium on High performance distributed computing, ACM New York, NY, USA, 2007. 37. Krol, D., Kitowski, J., Distributed Storage Support in Private Clouds Based on Static Scheduling Algorithms, in: Proc. of CLOUD COMPUTING 2011 The Second International Conference on Cloud Computing, GRIDs, and Virtualization, November 25-30, 2011 - Rome, Italy, IARIA, 2011, pp. 141-146. 38. Krol, D., Slota, R., Funika. W., Behaviour-inspired Data Management in the Cloud, in: Proc. of CLOUD COMPUTING 2010 The First International Conference on Cloud Computing, GRIDs, and Virtualization, November 21-26, 2010 - Lisbon, Portugal, IARIA, 2010, pp. 98-103

Elastic Infrastructure for Interactive Data Farming ...

Elastic Infrastructure for Interactive Data Farming ...

Suggest Documents

Scheduling Schemes for Data Farming

AiiDA: Automated Interactive Infrastructure and Database for ...

AiiDA: Automated Interactive Infrastructure and Database for ...

PatOlympics: an infrastructure for interactive ... - PROMISE NoE

Tuplespaces as Coordination Infrastructure for Interactive Workspaces

NextGen Infrastructure for Big Data

Data Mining Workbench for Interactive Data Exploration

AGILE: elastic distributed resource scaling for Infrastructure-as-a ...

Summary of Data Farming - MDPI

old salem infrastructure improvements - North Carolina Interactive ...

GRID INFRASTRUCTURE FOR SATELLITE DATA ... - Journals

Strategy for a European Data Infrastructure - CSC

Architecting Climate Change Data Infrastructure for Nevada

Developing a data infrastructure for a learning

Data Stream Processing Infrastructure for ... - Semantic Scholar

DEVELOPING SPATIAL DATA INFRASTRUCTURE FOR ... - BVSDE

Telecommunications Infrastructure Standard for Data Centers ANSI ...

Comprehensive Data Infrastructure for Plant Bioinformatics

Strategy for a European Data Infrastructure - CSC.fi

Strategy for a European Data Infrastructure - CSC

Assessing Spatial Data Infrastructure Architecture for ... - CiteSeerX

Researching Frameworks for evolving Spatial Data Infrastructure

An Infrastructure for Manipulating Multidimensional Semistructured Data

Changing demands for Spatial Data Infrastructure ... - CiteSeerX