MAYPOLE: VISUALISING CONTINGENCY TABLES
S. McBRIDE, R. STERRITT¹, E.P. CURRAN, K. ADAMSON, C.M. SHAPCOTT
School of Information and Software Engineering, Faculty of Informatics, University of Ulster, Shore Road, Newtownabbey, Co. Antrim, BT37 OQB, Northern Ireland
Phone: [44] (0)1232 368198
Fax: [44] (0)1232 366068
E-mail: [email protected]
Keywords: AI, Expert Systems, Data Preprocessing, Data Mining, Data Visualisation
Symposium: AI and Expert Systems
EXTENDED ABSTRACT
Data Visualisation is a powerful tool in modern computing. It eases navigation of large information spaces and is invaluable in communicating complex ideas. As the volume of data stored by organisations continues to increase, Data Visualisation can reduce mountains of data to visually insightful representations, which can aid decision making, increase productivity and, in some cases, reduce physical risk. As a result, it is envisaged that the next generation of databases will be more graphical and hence easier to navigate and accessible to a wider range of users. Such development is possible because of increases in computer processing power and screen resolution. Developers are starting to exploit the remarkable perceptual abilities that humans possess, such as the capacity to recognise images quickly and to detect the subtlest changes in size, colour, shape, movement or texture [1].
Business development is moving towards fields such as data warehousing, involving vast libraries of documents and multimedia files. Industry is also investing large sums in data-mining tools in an attempt to make more effective use of vast information repositories. Given its success in industry and its growing popularity, there is no doubt that data mining is a powerful technology, but as yet data-mining tools and the data-mining process (often referred to as the knowledge discovery process [2]) are not user-centred. It would be helpful to visualise the data at every stage of the process, so that the user can gain trust in it and hence have more confidence in the final solution. To this end it will be necessary to establish some goals and approaches to data visualisation [3][4]. Shneiderman’s Data Type by Task Taxonomy [5] offers a solution: it allows the data to be categorised neatly and highlights properties of the data structure that might have been missed using a non-standard approach.
In an effort to make a knowledge discovery architecture [6] more user-centric, a contingency table visualisation tool has been developed. A contingency table, in simple terms, lists the combinations of variables and their states that have occurred in the observed history, together with the frequency of each combination. Several designs were considered for visualising the contingency table: a variable-pair histogram, parallel co-ordinate plotting and a web model. With the implemented design, "the Maypole model", all relationships can be visualised by assigning a different colour to each row in the contingency table and showing the links passing from all alarms to a central point, like ribbons hanging from a maypole. In this way the user can see all triple and quadruple links as well as the binary links. No method for visualising multi-dimensional data found in the literature was sufficient for contingency tables, and so the Maypole model was developed. It has been tested with TMN (Telecommunications Management Network) data; however, as long as the data takes the ‘shopping basket’ form seen in contingency tables, the Maypole visualisation model can be applied to it.
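The two ideas in the paragraph above, counting combinations into contingency-table rows and giving each row its own colour of ribbon, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the alarm names, the sample observations and the colour palette are all hypothetical, and each observation is assumed to be simply the set of alarms seen together.

```python
from collections import Counter


def contingency_rows(observations):
    """Return (combination, frequency) rows, most frequent first."""
    # Assumption: each observation is the set of alarms seen together.
    counts = Counter(frozenset(obs) for obs in observations)
    rows = [(tuple(sorted(combo)), freq) for combo, freq in counts.items()]
    # Order by descending frequency, then by alarm names, for a stable result.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


def maypole_ribbons(rows, palette):
    """Give each row one colour and one ribbon per alarm, all meeting at the pole."""
    ribbons = []
    for index, (combo, freq) in enumerate(rows):
        colour = palette[index % len(palette)]  # colours repeat if rows outnumber them
        for alarm in combo:
            ribbons.append(
                {"alarm": alarm, "row": index, "colour": colour, "frequency": freq}
            )
    return ribbons


# Hypothetical alarm names and observations, purely for illustration.
observations = [
    {"LinkDown", "PowerFail"},
    {"LinkDown", "PowerFail"},
    {"LinkDown", "PowerFail", "FanFault"},
    {"LinkDown", "PowerFail", "FanFault", "ClockLoss"},
]
rows = contingency_rows(observations)
ribbons = maypole_ribbons(rows, palette=["red", "green", "blue"])
```

In this sketch a row containing three alarms produces three ribbons of the same colour, which is how a triple link would remain visible as a single relationship rather than as three separate binary links.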
¹ Author for contact
This paper will present:
- A literature review of existing data visualisation approaches for data mining.
- The newly developed "Maypole Graph" multidimensional contingency table visualisation tool.
- A discussion of the potential of other approaches for information visualisation.
REFERENCES
[1] http://www.avs.com/solution/, Advanced Visual Systems Incorporated.
[2] U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, "From Data Mining to Knowledge Discovery: An Overview", Eds. U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy, Advances in Knowledge Discovery and Data Mining, AAAI Press, 1996, pp. 1-34.
[3] http://www.cs.uml.edu/~grinstei/kddvis-workshop.html, Workshop on the Issues in the Integration of Data Mining and Data Visualisation.
[4] http://www.satafe.edu/~kurt/text/dmviz/modelviz.shtml, Approaches to Data Visualisation.
[5] B. Shneiderman, Designing the User Interface – Strategies for Effective Human Computer Interaction (Third Edition), pp. 510-540.
[6] R. Sterritt, K. Adamson, C.M. Shapcott, D.A. Bell, "An Architecture for Knowledge Discovery in Complex Telecommunication Systems", Eds. R.A. Adey, G. Rzevski, P. Nolan, Applications of Artificial Intelligence in Engineering XIII, Computational Mechanics Publications, Southampton, 1998, pp. 141-143 (CD-ROM pp. 627-640).
OTHER REFERENCES APPLICABLE TO FULL PAPER
[] http://www.trajecta.com/telecom.htm, Data Mining in the Telecommunications Industry.
[] J. Koenemann & N. Belkin, "A Case for Interactive Information Retrieval Behaviour and Effectiveness", Proc. CHI ’96 (Human Factors In Computing Systems, ACM Press, New York, 1996).
[] http://lunar.arc.nasa.gov/dataviz/datamaps/index.html, Lunar Prospector Data Maps.
[] http://www.cs.umd.edu/projects/hcil/Research/1995/vhe.html, Visible Human Explorer, Human Computer Interaction Laboratory, University of Maryland.
[] http://www.parc.xerox.com/istl/projects/uir/projects/InformationVisualisation.html, Xerox PARC User Information Research Group.
[] http://www.cse.ucsc.edu/research/slvg/unvis.html, Santa Cruz Laboratory for Visualisation and Graphics – Uncertainty Visualisation.
[] http://www.icase.edu/docs/hilites/viz.html, ICASE Visualisation and Graphics Research.
[] http://www.inxight.com/Content/7.html, Inxight Web-site Map as Hyperbolic Tree (Java Applet Demo).
[] Jenny Preece, Human Computer Interaction, Addison-Wesley, 1996, p. 49.