A Proactive Database System and its Query Language for Social Network Simulation

Sadegh Aliakbary, Jafar Habibi
Sharif University of Technology, Tehran, Iran


Abstract. Social networks appear in different forms in various domains. Online social networks, mobile communication networks, co-authoring networks and human interactions in real life are some examples of social networks. Today, the analysis of the dynamics and evolution of social networks is an important field of research. Computer simulation is a powerful method of study in this field, and in some cases the only one. Agent-based computer simulation has found many applications in the simulation of social processes, and Agent Based Social Simulation (ABSS) has become the dominant method in this area. Although some well-known and mature ABSS models and frameworks exist for social simulation, most of them have limitations for social network simulation. In this paper we propose a distributed and scalable simulation model for social networks which exploits a central proactive database system. In this model, it is possible to distribute millions of agents across different processing units and to propagate the simulation data between the agents efficiently using the central proactive database. We also propose a query language for this database system, to be used by agents for capturing network properties and listening to network changes. The database is considered proactive because it not only accepts select and update queries from the agents, but also actively sends changes of the network to the appropriate agents without being queried. We include some implementation notes about the proposed framework and describe the applications in which it is applicable and useful. Researchers and analysts of social networks may use the proposed model and framework to develop their desired agents and to run the simulation in a scalable and simple framework.

Keywords: Social network, computer simulation, multi agent systems, database, query language.

Table of Contents
1. Introduction
2. Motivations and Related Works
3. Proactive DBMS
3.1. Query Language
3.2. Proactive Database System Components
4. Implementation Issues
4.1. Coordinator
4.2. Scheduler
4.3. Access Levels
5. Discussion and Conclusion
6. Future Works
References
Appendix. Grammar of Social Network Simulation Query Language

1. Introduction

Nowadays, social networks are important and ever-growing structures. A social network is a graph of social entities along with their relationships. The nodes of a social network are usually human beings, but other entities such as animals, robots, web sites and computers, together with their relationships, are sometimes considered social networks as well. Online social networks (e.g. Facebook, Twitter and Google+), communication networks (e.g. mobile or email communications), citation networks and collaboration networks (e.g. in co-authoring a paper) are some examples of social networks. The study of the characteristics and evolution of social networks is a noteworthy field in network analysis, with applicable methods such as mathematical modeling, computer simulation and field research.

Computer simulation, as a third way of doing science (after induction and deduction) [1], is a powerful and flexible approach in network analysis. When the problem is too complex to be modeled mathematically, computer simulation is usually the best method of analysis. With social network simulation, an analyst represents a social network as a computer program, runs social activities and processes in a virtual simulated environment and then studies the properties, dynamics and evolution of the network. Diffusion, synchronization, search, advertisement and bargaining are just some examples of social network interactions and processes that are candidates for simulation.

As a sample application of social network simulation, consider this example: the managers of an online social network web site want to enhance some features of their service, but they are uncertain about the effects of these changes on the behavior of their clients and on key performance indicators. This social network is a complex system with many interdependent processes (e.g. diffusion of news, dating, effect of advertisements) and parameters (e.g. properties of members and relationships).
In this situation, the best approach would be to run a simulation of the social network with the desired parameters and processes. Social network simulation also has many applications in sociology, physics, biology, economics and management.

To simulate a social network, we need an appropriate computer program, and there are several ways to develop such a program. Since different social network simulations share many similarities and common functionalities, we can implement the common features as a core infrastructure and extend it for different simulation instances. Perhaps the worst way, therefore, is to program from scratch for the requirements of the target social network in each simulation problem. It seems that we need a platform that provides an infrastructure for the simulation and lets us develop and configure the desired behaviors of the target social entities. In fact, social network simulation is a special kind of social simulation, and there exist some mature platforms for social simulation such as Swarm [2], MASON [3] and Repast [4]. Among different methods of implementation, agent-based simulation is perhaps the most popular one in social simulation [5]. In this approach, the social system is modeled as a multi-agent system with autonomous and intelligent agents, representing and simulating the behavior of real social entities. In many situations, however, the existing agent-based frameworks are not suitable for the simulation of social networks: they are not designed for social networks, they do not support the special properties and functionalities of social networks, and they do not scale to very large networks with complicated agents.

In this paper we propose a new method for social network simulation. The main elements of this method are agents and a proactive database. The agents simulate the behavior of network nodes, and the database manages the state of the network structure and the properties of nodes and edges. While usual database systems (e.g. relational or graph databases) are passive, since they just answer the queries of clients, the database in our proposed method is proactive: it not only handles agent queries, but also informs agents about changes of the network. With this approach, agents do not need to query the database repeatedly to stay informed about the network properties: the database informs an agent about a network change if this change is considered important and is accessible by the agent. We designed a query language for this database and access levels for node/edge properties. The proposed method allows the simulation of very large networks with complicated agent behaviors by distributing the execution of agents and performing an efficient information flow process.

The rest of this paper is structured as follows. We review the literature and explain the motivations of our model in section 2. In section 3, the proactive database and its query language are described. Section 4 covers some technical and implementation notes. Section 5 discusses the properties of the proposed method and presents a conclusion. Finally, future works are presented in the last section. An appendix at the end of this paper gives a short specification of the grammar of the proposed query language.

2. Motivations and Related Works

In this section we give an overview of the main research related to social network simulation and to our proposed method. We discuss the features, pros and cons of existing work, and then propose a new approach for social network simulation and discuss the need for this new model.

When the outcome of a system is hard to model mathematically, computer simulation would be the best choice for the study and analysis of the system. In many social network analysis problems, we face a system so complex, nonlinear and dynamic that simulation is the only applicable approach of study. Social simulation is the area of applying computational methods to model and simulate social systems and to study their mechanisms. Agent-based social simulation (ABSS) is one of the most important methods and perhaps the dominant approach in this area [5], [6]. The popularity of ABSS models and frameworks is growing rapidly [7]. ABSS has found many applications in physics, biology, sociology and management [6]. As Fig 1 shows, ABSS is the intersection of three scientific fields: social science, agent-based computing and computer simulation [5], [8].

Fig 1. ABSS as a cross disciplinary research field [5]

Cognitive scientists have proposed several cognitive architectures for modeling the structure and behavior of agents, among which Soar [9], ACT-R [10] and Clarion [11] are among the more popular [12]. Although cognitive architectures can be used to develop agents in our proposed simulation approach, our method is a high-level framework for the simulation, so cognitive architectures are regarded as out of scope in this discussion.

Social simulation frameworks like MASON [3], Repast [4], Swarm [2] and NetLogo [13] offer the common functionalities of a simulation. The user of these frameworks implements the behavior of agents and simulates these behaviors using the tools provided by the framework. These frameworks have made possible a growing number of applications in a variety of fields and domains. Perhaps the first candidate approach for social network simulation today is to use one of these agent-based social simulation frameworks as the starting point and infrastructure of the simulation.

Despite the existence of ABSS frameworks, we believe that a new simulation model is required in many instances of social network simulation problems. ABSS frameworks are not designed with social networks in mind. They usually support simple reactive agents, without strong support for scalable world-state management. For example, in MASON there is a world-state object named SimState (the simulation state: agent properties, environment, etc.) that is shared among all the agents. This way of accessing the world-state is inefficient, insecure and unscalable. This model, and the similar approaches of current ABSS frameworks, do not support at least some of the following requirements:

 

Limited information access. An agent should see only the parts of the network that it is allowed to see, not the whole network.
Large networks. For very large networks with millions of nodes and edges, it is not possible to run the simulation on a single machine because of CPU and memory limits.
Complex agents. Centralized simulation frameworks do not support the distribution of agents among multiple processing units.
Interrupt instead of polling. In an ordinary approach, an agent requests the required information from the world-state object in each simulation cycle, but in many circumstances an agent is waiting for something to happen. In this case it is more efficient to inform the agent about a change whenever necessary, instead of forcing the agent to ask for the state in each cycle.
Hidden agent implementation. In some applications, the internal architecture and implementation of agents should be hidden, and only their interaction with the network should be public.
Flexible time period for simulation cycles. The time length of a cycle is fixed in most simulation engines, but some cycles may need more or less time to finish.

Some attempts to address some of these challenges are reported in the literature. For example, the authors of [14] proposed using Terracotta [15] to provide a global heap memory for Repast agents. It seems that grid, cluster and cloud computing techniques can also handle some (but not all) of the scalability issues. However, we think that no thorough research is available that handles all of these challenges as a social network simulation platform or model. It seems that we need a more powerful and efficient model of information management and propagation (in handling the properties of nodes and edges) for social network simulation, with better mechanisms for accessing and changing the state of the network.

In this paper we propose a proactive database system and a new simulation paradigm to overcome the limitations mentioned above. In our model, the world-state is maintained and managed in a centralized social network database and the agents are executed on different machines. The centralized database is proactive, meaning that the database may call agents and inform them about a change. Unlike usual database systems (e.g. relational DBMSs), which are passive and handle only the requests of clients, our database not only responds to requests but also actively sends necessary information about network changes to the appropriate agents. This approach eliminates some of the limitations mentioned above by enabling efficient distribution of agents and by managing the access of agents to node/edge properties.

Along with the new simulation model and the proactive database system, we present a query language for the social network database. This query language is used by agents for communication with the centralized database system. Other SQL-like query languages have been proposed for use with social network databases. For example, SoQL [16] supports queries for finding groups and paths in social networks.
BiQL [17] supports data manipulation operations on social networks, and FQL [18], developed at Facebook for third-party developers to access Facebook data, supports the operations that a user can perform on Facebook's website. The goal of our proposed query language is different. Agents in a social network simulation may ask for information about the network (especially about the nodes and edges in their neighborhood), and they may change the properties of some nodes or edges (almost always they only update themselves and their own relationships). We therefore propose a query language that supports these requirements and is also designed with efficiency in mind for a proactive database system.

As a technical note, it seems that NoSQL databases and graph databases are useful for implementing the proposed simulation model. We have implemented the framework with a relational database, but we think that NoSQL databases like Neo4j [19] are better alternatives for this project, and we hope to use them in subsequent versions.

3. Proactive DBMS

In this section we explain the proposed simulation scheme and the proactive database system as the heart of this scheme. In our model, there is a central database system responsible for maintaining and managing simulation data and the properties of nodes and edges. We propose a query language to be used by agents for communication with this database system.

3.1. Query Language

Before we describe the architecture and processes of the proactive database, let us start with some sample queries that agents may use in social network simulation. We categorize queries into three major classes: Select, Update and Add-Listener. A select query asks the database about an allowed piece of information. For example, an agent is usually allowed to ask about the properties of its friends and edges (and sometimes about the friends of friends), so these are some valid select queries:

Select name from friends n where link(me(),n).weight>10
Select * from friendsoffriends f where f.education='PhD'
Select weight, trust-level from links l where l.to.age>18
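To make the scope of such select queries concrete, the following Python sketch evaluates rough analogues of the first two queries over a small in-memory graph. It is only an illustration under stated assumptions: the `Network` class, the helper function names and the demo data are hypothetical and are not part of the proposed framework, and treating "friends of friends" as two-hop nodes that are not direct friends is our assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """Hypothetical in-memory stand-in for the central network database."""
    nodes: dict = field(default_factory=dict)  # node id -> {property: value}
    edges: dict = field(default_factory=dict)  # (from_id, to_id) -> {property: value}

    def friends(self, me):
        """Nodes reachable from `me` over one outgoing edge."""
        return sorted(to for (frm, to) in self.edges if frm == me)

    def friends_of_friends(self, me):
        """Nodes two hops away, excluding `me` and direct friends (assumption)."""
        direct = set(self.friends(me))
        two_hops = {ff for f in direct for ff in self.friends(f)}
        return sorted(two_hops - direct - {me})


def select_names_by_link_weight(net, me, min_weight):
    """Analogue of: Select name from friends n where link(me(),n).weight>10"""
    return [net.nodes[n]["name"] for n in net.friends(me)
            if net.edges[(me, n)]["weight"] > min_weight]


def select_fof_by_education(net, me, education):
    """Analogue of: Select * from friendsoffriends f where f.education='PhD'"""
    return [dict(net.nodes[f]) for f in net.friends_of_friends(me)
            if net.nodes[f].get("education") == education]


def build_demo_network():
    """Tiny made-up network used only for illustration."""
    net = Network()
    net.nodes = {
        1: {"name": "Ann", "education": "MSc"},
        2: {"name": "Bob", "education": "BSc"},
        3: {"name": "Cy", "education": "BSc"},
        4: {"name": "Dee", "education": "PhD"},
    }
    net.edges = {
        (1, 2): {"weight": 12},
        (1, 3): {"weight": 5},
        (2, 4): {"weight": 7},
    }
    return net
```

In this sketch the agent's view is limited by construction: both helpers only ever traverse the requesting node's own neighborhood, which mirrors the restriction that an agent may ask only about its friends, its links and (sometimes) friends of friends.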

Sophisticated select queries, like those of SoQL [16] that discover special paths and groups among arbitrary nodes, are not supported in our database, because the nodes of a real social network almost never have access to such information.

In many cases, an agent may need a particular type of information in each simulation cycle. Such an agent would request the information by repeating a Select query in each cycle, and the database system would be forced to answer it even if nothing has changed. In this situation, it is more efficient to inform the agent when a change happens. To support this mechanism, we propose Add-Listener queries, by which the agent asks the database system to inform it about any changes in specified nodes and edges. Consider these examples:

Add node-listener on name,education from friends
Add edge-listener on weight from links
Add node-listener on * from friends
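As a rough illustration of the Add-Listener idea, the following Python sketch shows a store that pushes property changes to registered callbacks instead of waiting to be polled. All names here (`ProactiveStore`, `add_node_listener`, `update_node`) are hypothetical, and the sketch assumes that a listener is attached to the agent's friends as they exist at registration time; how the actual system tracks later changes in the friend set is not covered here.

```python
from collections import defaultdict


class ProactiveStore:
    """Hypothetical single-process stand-in for the proactive database."""

    def __init__(self):
        self.nodes = {}                     # node id -> {property: value}
        self.friends = defaultdict(set)     # agent id -> ids of its friends
        self.listeners = defaultdict(list)  # (node id, property) -> callbacks

    def add_node(self, node_id, **props):
        self.nodes[node_id] = dict(props)

    def add_friend(self, agent_id, friend_id):
        self.friends[agent_id].add(friend_id)

    def add_node_listener(self, agent_id, props, callback):
        """Analogue of: Add node-listener on <props> from friends"""
        for friend_id in self.friends[agent_id]:
            for prop in props:
                self.listeners[(friend_id, prop)].append(callback)

    def update_node(self, node_id, prop, value):
        """Apply an update and notify listeners only if the value changed."""
        old = self.nodes[node_id].get(prop)
        if old == value:
            return
        self.nodes[node_id][prop] = value
        for callback in self.listeners.get((node_id, prop), []):
            callback(node_id, prop, old, value)
```

An agent would register once and then simply react to callbacks; updates to properties or nodes that it did not subscribe to never reach it, so no per-cycle Select query is needed for this information.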

When an agent adds a listener, triggers are installed on some nodes or edges in the database. If the specified properties change on the target nodes/edges, the database proactively informs the agents that have added a corresponding listener. For example, with the first of the three example queries above, the agent asks the database to inform it whenever the name or education of one of its friends changes.

Finally, an agent may change its properties. Update queries are used to reflect a change in node/edge properties. Here are some examples:

Update node m set health-status='sick' where m=me()
Update edge e set weight=10 where e.id=23 (Permission is denied if the link with id=23 is not one of the requester's links)
Update edge e set weight=10 where e.from=me() and e.to.id=43
Add Link l where l.from=me() and l.to.id=45
Remove Link l where l.from=me() and l.to.age
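The permission rule attached to the second update query can be sketched as follows. This is a minimal Python illustration, not the framework's implementation: the function names and the `PermissionDenied` exception are hypothetical, and interpreting "one of the requester's links" as "an edge whose source is the requester" is our assumption.

```python
class PermissionDenied(Exception):
    """Raised when an agent tries to update something it does not own."""


def update_own_node(nodes, requester, prop, value):
    """Analogue of: Update node m set health-status='sick' where m=me()"""
    nodes[requester][prop] = value


def update_edge_weight(edges, requester, edge_id, weight):
    """Analogue of: Update edge e set weight=10 where e.id=23

    Assumption: an edge belongs to the requester if it starts at the requester.
    """
    edge = edges[edge_id]
    if edge["from"] != requester:
        raise PermissionDenied(f"edge {edge_id} is not a link of node {requester}")
    edge["weight"] = weight
```

With edge 23 going from node 7 to node 9, a call such as `update_edge_weight(edges, requester=7, edge_id=23, weight=10)` succeeds, while the same call with `requester=8` raises `PermissionDenied` and leaves the edge unchanged.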
