Agent-based approach to events monitoring in Complex Information Systems
Marek Woda, Tomasz Walkowiak
Institute of Computer Engineering, Control and Robotics
Wroclaw University of Technology
ul. Janiszewskiego 11/17, 50-372 Wrocław, Poland
{marek.woda, tomasz.walkowiak}@pwr.wroc.pl
Abstract
This work is devoted to theoretical aspects of event monitoring in Complex Information Systems. The authors point out the aspects of present-day monitoring systems that make them ineffective and outline the related research areas. This is followed by a proposal of a new monitoring technique, currently being implemented, based on agents and the idea of the molecule. The test system architecture is then discussed. Lastly, the advantages of the agent-based approach are presented.
1. Introduction
In computer science, event monitoring is the process of collecting, analyzing, and signaling event occurrences to subscribers such as operating system processes, active database rules, and human operators. These event occurrences may stem from arbitrary sources in both software and hardware, such as operating systems, database management systems, application software, and processors. Event monitoring is increasingly a key part of systems defense. It is so closely related to intrusion detection [1] that in many cases it is almost impossible to separate the two aspects when considering system security. With the improvement of administration in mind, many software companies offer commercial tools for network monitoring. One of the most advanced monitoring tools is IBM Tivoli Monitoring. This complex platform provides professional tools for gathering data and presenting it to the administrator in convenient graphical diagrams, which gives a quick view of what is happening on the network. Other solutions are not as advanced as the IBM Tivoli software; however, there is some interesting software, such as OpManager, that provides an interface for
monitoring network traffic, CPU usage, and memory and disk usage. It can also generate daily or weekly statistics. AdRem NetCrunch is another monitoring platform; it is dedicated to network data analysis based on the SNMP protocol. It makes it possible to gather system statistics characteristic of a monitored workstation (e.g. memory usage) and offers many interesting features, such as physical and logical topology recognition. Among free software, there are mostly local applications such as top, a command-line tool that presents system statistics of the local workstation on Linux/Unix, or Task Manager on Windows. The main player on the open-source market of network management software (with network monitoring capability) is OpenNMS. It is a truly distributed, scalable platform covering all aspects of the FCAPS network management model (FCAPS is a network management functional model defined by ITU-T and ISO in recommendation M.3400 - http://www.itu.int). Many other event monitoring tools are currently in use, but they are relatively ineffective [2], mainly because of their sparse intelligence and often massive resource consumption. To improve this, Artificial Intelligence (AI) is introduced [3,4] and plays a driving role in evolving event-monitoring services [5,6].
2. Event monitoring system requirements
The major requirements for a robust monitoring system are flexibility, modularity, and the capability of processing all kinds of data from the network in all kinds of ways to produce meaningful information. The implementation of an effective monitoring system requires the synthesis of several technologies: one must bring together knowledge in the fields of artificial intelligence, data processing, distributed systems, and networks. While extensive research has been conducted in all of these areas [9,11,12], an agent-based
system imposes some new design parameters that must be met [10]. Existing event monitoring systems, especially commercial ones, are based on the misuse detection approach, which means that they are only able to detect known event types; in most cases they tend to be ineffective for various reasons, such as unavailability of patterns, the time needed to develop new patterns, insufficient data, etc. Applying artificial intelligence methods to the development of such systems yields advantages compared to the classical approach [2]. AI techniques have been employed in the computer security field since the early nineties. AI provides new flexibility for uncertain problems, also in intrusion detection systems, and allows much greater complexity of event monitoring systems. However, most AI-based systems require human experts to refine their responses, which is unavoidable. Unfortunately, these tasks are time-consuming and human-dependent. Nonetheless, if the reaction rules were generated automatically, less time would be needed to build a good event classifier and the development time of building or updating an event classifier would be shortened. A hybrid (AI-based) system is therefore proposed to aid the network administrator in the task of event monitoring, with pre-filtering methods, and finally to support computer intrusion detection. The authors combined, in a not-yet-production system, AI techniques (fuzzy logic, data mining, neural networks and/or clustering techniques) to provide an efficient method for anomaly-based detection of unknown events and to utilize this approach as an aid for host-based intrusion detection. Our long-term goal is to make this system work in a real-world environment. The AI-based event monitoring system is proposed as a counter-measure for computer systems in a network environment, avoiding the commonly known shortcomings of present-day detection systems. The system is built using intelligent agents and applies data mining techniques to support intrusion detection.
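As a hedged illustration of the distinction between misuse detection and anomaly-based detection discussed above, the following minimal Java sketch contrasts a signature lookup with a simple statistical anomaly score. All class, method, and variable names are hypothetical and are not part of the described system.

// Hypothetical illustration: misuse detection matches known signatures,
// while anomaly detection flags deviations from a learned baseline.
import java.util.Set;

public class EventClassifierSketch {

    /** Misuse detection: only events whose signature is already known are reported. */
    static boolean misuseDetect(String eventSignature, Set<String> knownBadSignatures) {
        return knownBadSignatures.contains(eventSignature);
    }

    /** Anomaly detection: an event metric is suspicious if it deviates strongly
     *  from the mean observed during normal operation (z-score threshold). */
    static boolean anomalyDetect(double observedValue, double baselineMean,
                                 double baselineStdDev, double threshold) {
        if (baselineStdDev == 0.0) {
            return observedValue != baselineMean;
        }
        double zScore = Math.abs(observedValue - baselineMean) / baselineStdDev;
        return zScore > threshold;
    }

    public static void main(String[] args) {
        Set<String> signatures = Set.of("PORT_SCAN", "BRUTE_FORCE_SSH");
        System.out.println(misuseDetect("PORT_SCAN", signatures));   // true: known pattern
        System.out.println(misuseDetect("NEW_EXPLOIT", signatures)); // false: unknown, missed
        // Anomaly detection can still flag the unknown event if its metric is unusual,
        // e.g. 900 failed logins per minute against a baseline of 3 +/- 2.
        System.out.println(anomalyDetect(900.0, 3.0, 2.0, 3.0));     // true
    }
}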
3. Agent-based approach to monitoring system – early prototype
The entire system being monitored ought to be considered as a set of molecules. This approach allows the system to be service oriented and, at the same time, to be divided according to the business services being served. It allows monitoring not solely of the system as a whole (which is difficult for resource and synchronization reasons), but of small, easily controllable regions called molecules. A molecule is a collection of the low-level
services that make it possible to render the business (high-level) services. In each molecule there should be at least one duplicated low-level service that is of vital importance to the entire system. Each molecule should be able to interact with other molecules, and inner-molecule services can be used to support similar ones elsewhere when one of the low-level services fails and there is no spare one as backup (a minimal data-structure sketch of a molecule is given below). The system environment is divided into three tiers (Fig. 1):
- High level (business services);
- Middle level (component services);
- Low level:
  o operating system (local),
  o smart agents (remote).
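The following minimal Java sketch illustrates the molecule idea: a molecule groups the low-level services that support one business service, and a vital service is expected to have at least one spare instance inside the molecule. The class and field names are hypothetical illustrations, not the paper's actual implementation.

// Hypothetical sketch of the molecule concept: a named group of low-level
// services supporting one business service; vital services should be duplicated.
import java.util.ArrayList;
import java.util.List;

class LowLevelService {
    final String name;     // e.g. "DNS", "DHCP", "network link"
    final boolean vital;   // vital for the business service rendered by the molecule
    boolean available = true;

    LowLevelService(String name, boolean vital) {
        this.name = name;
        this.vital = vital;
    }
}

class Molecule {
    final String businessService;   // high-level service, e.g. "video on demand"
    final List<LowLevelService> services = new ArrayList<>();

    Molecule(String businessService) { this.businessService = businessService; }

    void add(LowLevelService s) { services.add(s); }

    /** True if every vital service has at least one other available instance
     *  with the same name inside this molecule (i.e. a spare to back it up). */
    boolean vitalServicesDuplicated() {
        return services.stream()
                .filter(s -> s.vital)
                .allMatch(s -> services.stream()
                        .anyMatch(o -> o != s && o.name.equals(s.name) && o.available));
    }

    public static void main(String[] args) {
        Molecule m = new Molecule("video on demand");
        m.add(new LowLevelService("DNS", true));
        m.add(new LowLevelService("DNS", true));   // duplicated vital service (spare instance)
        m.add(new LowLevelService("DHCP", false));
        System.out.println(m.vitalServicesDuplicated());   // true
    }
}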
(Figure 1 elements: Smart-Sensors; Virtual User Representatives (pre-filtering); Advanced filtering; Repository; Impact assessment; Human decision maker; Reconfiguration; GP&R DBs Server; Simulator; 3rd-party monitoring tools; levels: Application - Level 3, Services - Level 2, Protocol - Level 1, Hardware - Level 0 (bottom level))
Figure 1. High-level architecture
High level - this is the end-user (business) services level. Each main service in any system (test bed), e.g. delivering video on demand, can be perceived as a complex service. All distinguished, recognized, critical services are monitored for security reasons, with attention paid to their accessibility. None of these tasks can be entrusted to a human operator: an individual is not reliable enough in such a case, due to the elaborateness and monotony of the tasks. Therefore, at this level User Virtual Representative agents are employed. Their main role is administering lower-tier agents in order to detect and recognize major service unavailability.
Middle tier - this is the component ("lower" level - functional, logical, and physical) services level. Each of the available main services (e.g. a "ticket purchase system") comprises several components, usually more specific ("lower" level) services such as DNS or DHCP, or physical ones such as a physical network connection. Molecule and mobile agents play an important role here. We have defined two types of agents: molecule agents and
mobile agents (communication agents). Molecule agents are assigned to one molecule or to a single component service within a molecule and cannot be recalled or relocated from their initial position. Whether a molecule agent is assigned to an entire molecule or to one low-level service depends strictly on the complexity of the molecule (or rather on the business services it supports). For molecules that are not business-critical, molecule agents are not advisable (for the sake of resource consumption); in that case it is strongly recommended to use mobile agents. Mobile agents work as a communication line, meaning that they perform or facilitate most of the information exchange. The mobility of a communication agent comes from its ability to transport its state from one environment to another, with its data intact, and still perform appropriately in the new environment [1]. It sometimes happens that the environment in which an agent operates becomes hostile or no longer suitable to work in; the agent can then move to another molecule and resume its activity. Mobile agents decide when and where to move next. An agent moves the way an ordinary user does: just as a user does not really visit a website but only obtains a copy of it, a mobile agent accomplishes its 'move' through data duplication. When a mobile agent decides to move, it saves its own state, transports the saved state to the next host, and resumes execution from the saved state (a minimal sketch of this save-and-resume cycle is given at the end of this section). Mobile agents perform the following roles:
- molecule administrators: they are able to manage entire (rather small and simple) molecules and act in such cases like molecule agents, but more than one low-level service can be entrusted to them; this applies only to molecules without crucial meaning for the entire system,
- carriers of critical data: they move from one molecule to another with information about lower-level services, their availability, malfunctions, and unexpected events, and pass data about false alarms in order to avoid alarming the entire system when a similar event occurs,
- receivers of reconfiguration actions from the Policy & Reaction Servers for low-level services.
Low level (OS level) - this is the lowest level, mainly oriented on interactions at the kernel API (local approach). All processes, their operations, and the interactions between them are monitored at the root. The monitoring action is directly pinned to the operating system. This approach is somewhat restricted because of the direct connection to the operating system core, and is therefore limited to an implementation for a specific OS (UNIX, Windows). Nonetheless, if properly implemented, this might be the best way to raise alarms about real malicious events
due to the fact that at this level the "data noise" is almost nonexistent and agents are not deafened by other (higher-level) applications. The remote approach of smart agents consists in sending a preconfigured smart agent to the root source of events, usually located far from the system core (it could be a network node, a remote server, etc.). All events are monitored by the smart agents, which are data-harvest oriented and mainly responsible for threat detection and filtering. Smart agents are based on AI techniques, but only in a vestigial form; they should stay as small as possible so as not to place an additional, unnecessary burden on the operating system. High-level agents called User Virtual Representatives function as substitutes for users: they act like human users and perform regular human actions in order to detect service unavailability. One UVR agent is assigned to exactly one business service. When the service to which the agent is assigned is no longer responding and the UVR agent cannot recognize the culprit, it gives commands to lower-tier agents in order to identify the situation. Middle-tier agents gather data and render complex event logs and activity data into common formats (normalized data), while low-level agents, called smart agents, classify recent activities and provide data and current classification states to each other and to higher-level agents that use data mining techniques.
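As a hedged sketch of the save-and-resume cycle described above, the agent's state can be serialized, shipped to the next host, and resumed there. The class names and fields are hypothetical; the paper does not show the actual agent framework, and the transport layer is omitted.

// Hypothetical sketch: a mobile agent "moves" by serializing its state,
// sending the bytes to the next host, and resuming execution from that state.
import java.io.*;

class AgentState implements Serializable {
    String currentMolecule;      // molecule the agent is observing
    long lastProcessedEventId;   // progress marker, restored after the move
    // ... collected availability / malfunction data would go here
}

class MobileAgentSketch {

    /** Save the agent's state before the move. */
    static byte[] saveState(AgentState state) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        }
        return bytes.toByteArray();   // transported to the next host (transport layer omitted)
    }

    /** Resume on the target host from the saved state. */
    static AgentState resumeFrom(byte[] saved) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(saved))) {
            return (AgentState) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        AgentState state = new AgentState();
        state.currentMolecule = "ticket-purchase";
        state.lastProcessedEventId = 42;
        byte[] onTheWire = saveState(state);   // the 'move' is effectively a data duplication
        AgentState resumed = resumeFrom(onTheWire);
        System.out.println(resumed.currentMolecule + " @ " + resumed.lastProcessedEventId);
    }
}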
4. Smart agents
Our test system takes advantage of smart agents, i.e. agents that conform to the mobile agent definition: a process that can transport its state from one environment to another, with its data intact, and still perform appropriately in the new environment. A smart agent, unlike a mobile one, does not decide for itself where to move, due to its limited AI. Higher-tier agents or a human operator decide on its behalf; the smart agent then saves its own state, transports the saved state to the next host, and resumes execution from the saved state. Knowing whether the network state is intact, along with an immediate situation assessment, is vital; we need to know precisely what the spotted issues are, so up-to-the-minute harvesting of traffic data must go hand in hand with a preliminary assessment of events. Whenever a quick and accurate network state assessment is required, the smart agent is the right solution. Due to its lightweight, cost-efficient architecture, it can be attached to any vital point of the network to perform its tasks. A smart agent should be able to perform:
• initial processing of the traffic data,
• preliminary filtration of the events,
• web server analysis,
• mail server analysis,
• security monitoring and management of the network devices.
(Figure 2 elements: Global Reaction Agents, Virtual User Representatives, IDSes, Mobile agents (floating & local), Smart Agents, Traffic analyzers)
Figure 2. Agents and tools hierarchy in the multi-agent event monitoring system
The smart agents' functionality aims at gathering and processing traffic data (from specific nodes, e.g. a user's computer). This functionality can be perceived in two extremely opposite ways:
• as entities that spy (especially) on users (the end-user standpoint),
• as entities that extend the functionality of the system and thereby enforce the desired security level by preventing malicious events from spreading.
We concur with the second view: the system operators / administrators deploy these smart agents, so their structure is known inside out; moreover, they are totally transparent to the host system, and their main goal is to provide and render business services at the highest possible level with the utmost security. Finally, users who refuse to have a smart agent deployed on their host machine will not be given the full functionality of the available services.
The reason for having exchangeable IN/OUT modules is to make the smart agent more versatile and at the same time as compact as possible. An appropriate IN/OUT module can be applied when needed and for the environment in which the smart agent will be placed. To meet the versatility requirement, the agent is supposed to be able to utilize the SNMP protocol and the WMI mechanism. WMI is an effective management technique for PC and server systems (Windows based) in an enterprise network; it benefits from well-instrumented computer software and hardware, which allow system components to be monitored and controlled, both locally and remotely. WMI is the Microsoft implementation of the Common Information Model (CIM) initiative developed by the Distributed Management Task Force (DMTF). Access to the manageable entities is made via a software component called a "provider", which is nothing else than a DLL implementing a COM object written in C/C++. Because a provider is designed to access some specific management information, the CIM repository is also logically divided into several areas called namespaces. Each namespace contains a set of providers with their related classes specific to a management area (e.g. Root\Directory\LDAP for Active Directory, Root\SNMP for SNMP information, or Root\MicrosoftIISv2 for Internet Information Server information). Agents can share their data in two ways: by exposing a specific set of interfaces handling SOAP requests/responses (also in SYSLOG-NG format via the UDP protocol through SSL/TLS), or by allowing other authorized agents insight into their internal data. Since the more lightweight the smart agent is, the more mobile it can be and the less memory it consumes, these IN/OUT modules can be tailored to the specific tasks or requirements for data output, so that only the part needed for the place of deployment and/or the monitored environment is used (a minimal sketch of such exchangeable modules is given below).
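The following minimal Java sketch illustrates the exchangeable IN/OUT module idea. The interfaces and classes below are hypothetical illustrations, not the actual system code; the WMI/SNMP calls are represented by stubs so the sketch stays self-contained.

// Hypothetical sketch: the smart agent core stays fixed, while input and
// output modules are swapped depending on the deployment environment.
import java.util.List;

interface InputModule {      // e.g. a WMI-based or SNMP-based data source
    List<String> collectEvents();
}

interface OutputModule {     // e.g. a SOAP responder or a syslog-ng/UDP sender
    void publish(String normalizedEvent);
}

class SmartAgentCore {
    private final InputModule in;
    private final OutputModule out;

    SmartAgentCore(InputModule in, OutputModule out) {
        this.in = in;
        this.out = out;
    }

    /** One monitoring cycle: harvest, pre-filter, publish. */
    void cycle() {
        for (String event : in.collectEvents()) {
            if (event.contains("ERROR")) {   // trivial stand-in for the pre-filtering module
                out.publish(event);
            }
        }
    }

    public static void main(String[] args) {
        // A Windows deployment might pair a WMI input with a SOAP output;
        // a network-device deployment might pair an SNMP input with a syslog output.
        InputModule wmiStub = () -> List.of("INFO boot ok", "ERROR disk failure");
        OutputModule consoleStub = e -> System.out.println("forwarding: " + e);
        new SmartAgentCore(wmiStub, consoleStub).cycle();
    }
}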
5. Smart agent – architecture overview
The agent (Fig. 3) comprises the CORE, which includes a compact internal DB (threat & traffic signatures, a small amount of processed data), a rapid reporting engine, a watchdog & alerting module, and a pre-filtering module. The other parts of the smart agent are the IN/OUT data modules, which can be replaced at any time.
Figure 3. Smart agent architecture
Native multi-type data input module:
• SNMP and WMI mechanisms incorporated into the module, depending on the system / device on which the agent is deployed.
Pre-filtering module:
• screens the traffic flow,
• separates useless data and prevents it from being passed to the internal DB (e.g. only errors from the Windows event log are taken into consideration).
Watchdog & alerting module:
• screens for really hazardous events & abnormal behaviours,
• can process data from the internal DB against security flaws and compare it with known patterns,
• detected threats & recognized patterns can be propagated automatically to a higher level / the system administrator by e-mail,
• suspicious / abnormal unrecognized traffic patterns are passed from the internal DB to other agents for complex processing.
Rapid reporting module:
• generates simple (e.g. system / service operational or not) or compound (e.g. service availability) reports about the monitored system / device status.
A minimal sketch of such a service availability report is given after this list.
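As a hedged illustration of the "simple" and "compound" reports mentioned above, the following minimal Java sketch (class and method names are hypothetical) derives an operational check and an availability percentage from a list of probe results.

// Hypothetical sketch of reports produced by the rapid reporting module.
import java.util.List;

public class AvailabilityReportSketch {

    /** Simple report: is the service operational right now (last probe succeeded)? */
    static boolean operational(List<Boolean> statusSamples) {
        return !statusSamples.isEmpty() && statusSamples.get(statusSamples.size() - 1);
    }

    /** Compound report: availability over the observation window, in percent. */
    static double availabilityPercent(List<Boolean> statusSamples) {
        if (statusSamples.isEmpty()) {
            return 0.0;
        }
        long up = statusSamples.stream().filter(s -> s).count();
        return 100.0 * up / statusSamples.size();
    }

    public static void main(String[] args) {
        // true = the service responded to the probe, false = the probe failed
        List<Boolean> samples = List.of(true, true, false, true, true);
        System.out.println("operational now: " + operational(samples));             // true
        System.out.println("availability:    " + availabilityPercent(samples) + "%"); // 80.0%
    }
}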
Figure 4. Administrator console
6. Implementation status
The presented monitoring system is currently partly implemented; unfortunately it does not yet work in a production environment and is still under development. Different types of smart agents have been developed; they allow monitoring of almost all known events (SYSLOG or file based). They are also capable of monitoring performance in Linux (via libstatgrab) and Windows (via WMI) environments. In the case of SNMP, only network performance services have been developed and tested. The designed and partly developed system is still a proof of concept (prototype) rather than a ready-to-use system; however, major parts of its source code have been implemented, so they can be reused during development of the monitoring system product. The monitoring system is developed mainly in the Java language (with JAXB used for XML processing - configuration and sending events), while WMI-based agents are coded in C#. Fig. 4 depicts the architecture of a model test-bed. Events that come from a particular host in the monitored environment (Fig. 4) can be displayed on an agent's console (Fig. 5). A minimal sketch of a JAXB-bound event message is shown below.
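Since the paper states that JAXB is used for XML processing of configuration and event messages, the following minimal sketch shows how such an event could be bound to XML. The MonitoredEvent class and its fields are hypothetical, chosen only for illustration; javax.xml.bind is available up to Java 8/10 and requires an extra dependency on later versions.

// Hypothetical JAXB-annotated event message; field names are illustrative only.
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import java.io.StringWriter;

public class EventXmlSketch {

    @XmlRootElement(name = "event")
    public static class MonitoredEvent {
        @XmlElement public String host;        // host the event originated from
        @XmlElement public String severity;    // e.g. ERROR, WARNING
        @XmlElement public String description;
    }

    public static void main(String[] args) throws Exception {
        MonitoredEvent event = new MonitoredEvent();
        event.host = "node-01";
        event.severity = "ERROR";
        event.description = "disk failure detected";

        // Marshal the event to XML before sending it to a higher-level agent.
        Marshaller marshaller = JAXBContext.newInstance(MonitoredEvent.class).createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
        StringWriter xml = new StringWriter();
        marshaller.marshal(event, xml);
        System.out.println(xml);
    }
}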
Figure 5. Agent’s console
7. Summary
The paper presented an agent-based approach to event monitoring in service-oriented information systems that has the following advantages over current monitoring solutions:
- It combines WMI & SNMP mechanisms with pre-filtering mechanisms, which allows collecting only really hazardous events.
- System members can be easily notified about threats (communication between similar agents) thanks to their independent communication layer.
- Security policies / authentication methods incorporated in the agent-based system prevent misuse or distortion of the information held by agents.
- AI techniques incorporated in the monitoring system provide accurate recognition of events.
- Event-based pre-filtering mechanisms can prevent false alarms.
- Due to its distributed nature, the system can easily analyse high traffic volumes.
- Agents' mobility makes it possible to pass event information to generally inaccessible (sub)nets.
- Smart agents located on users' host machines can diagnose conflicting services and hardware malfunctions, and prevent users' misuse actions.
Acknowledgement
The work reported in this paper was sponsored by the EU grant DESEREC IST-2004-026600 (years 2006-2008), "Dependability and Security by Enhanced Reconfigurability", within the frame of which the reported research was developed. The DESEREC project proposes a three-tiered approach to the dependability improvement of CIS. DESEREC devises and develops innovative approaches and tools to design, model, simulate, and plan critical infrastructures in order to improve their resilience. It integrates various detection mechanisms to ensure fast detection of severe incidents, but also to detect complex ones based on a combination of seemingly unrelated events or on an abnormal behaviour. It also provides a framework for computer-aided counter-measure initiatives to respond quickly and appropriately to a large range of incidents, to mitigate the threats to dependability, and to rapidly thwart the problem. Reconfiguration is the ultimate mechanism for system survivability.
References
[1] R.G. Bace, "Intrusion Detection", Macmillan Technical Publishing (ISBN 1-57870-185-6).
[2] N.B. Idris, B. Shanmugam, "Artificial Intelligence Techniques Applied to Intrusion Detection", IEEE Indicon 2005 Conference, Chennai, India, 2005.
[3] R.C. Garcia, J.A. Copeland, "Soft Computing Tools to Detect and Characterize Anomalous Network Behaviour", IEEE World Congress, 2000, pp. 475-478.
[4] J.E. Dickerson, J. Juslin, J.A. Dickerson, O. Koukousoula, "Fuzzy Intrusion Detection", North American Fuzzy Information Processing Society (NAFIPS 2001), Vancouver, Canada, July 25, 2001.
[5] S. Staniford-Chen, B. Tung, D. Schnackenberg, "The Common Intrusion Detection Framework (CIDF)", Information Survivability Workshop, Orlando, FL, October 1998.
[6] W. Chenxi, J.C. Knight, "Towards survivable intrusion detection", Third Information Survivability Workshop (ISW-2000), October 24-26, 2000 (www.cert.org/research/isw/isw2000/papers/38.pdf).
[7] S. Naqvi, M. Riguidel, "Security and trust assurances for smart environments", IEEE International Conference on Mobile Adhoc and Sensor Systems, 7-10 Nov. 2005.
[8] C.J. Petrie, "Agent-Based Engineering, the Web, and Intelligence", Stanford Center for Design Research, IEEE Expert, 1996.
[9] M. Wooldridge, N. Jennings, "Software Engineering with Agents: Pitfalls and Pratfalls", IEEE Internet Computing, 1999.
[10] S. Russell, P. Norvig, "Artificial Intelligence: A Modern Approach", Prentice Hall, New York, 2004.
[11] K. Cetnarowicz, "Problems of Multi-agent Systems Development", monograph, Akademia Górniczo-Hutnicza, Kraków, Poland, 2000.
[12] M. Wooldridge, P.E. Dunne, "The Complexity of Agent Design Problems: Determinism and History Dependence", Annals of Mathematics and Artificial Intelligence, December 2005.