PANEL: DATA QUALITY IN INTERNET TIME, SPACE, AND COMMUNITIES

Chair:

Yang W. Lee, Northeastern University, U.S.A.

Panelists:

Paul L. Bowen, University of Queensland, Australia
James D. Funk, S. C. Johnson and University of Wisconsin, Parkside, U.S.A.
Matthias Jarke, GMD-FIT and RWTH Aachen, Germany
Stuart E. Madnick, Massachusetts Institute of Technology, U.S.A.
Yair Wand, University of British Columbia, Canada

MOTIVATION AND QUESTION

Quality data is a key resource for planning, producing, and communicating in the new millennium. Use of data transcends time, space, and communities. With use of the Internet increasing dramatically, poor-quality data can be processed and distributed faster and more widely than ever. Reported impacts of poor data quality range from customer dissatisfaction, stoppage of business operations, and reduced revenue to loss of human life (Huang et al. 1999; Redman 1996). Equally critical, but under-reported, ramifications of poor-quality data include jeopardizing the capacity to understand new dynamics and the context of global business, to understand customers' changing views, and to understand how to respond to new opportunities.

In 1999, NASA lost its Mars Climate Orbiter. The spacecraft flew too close to the planet and burned up in the Martian atmosphere. The Orbiter was lost and the project failed, in part, because data were not converted between metric and non-metric units. This failure to convert data properly for the appropriate context is only one example among other, more complicated data quality problems that researchers and practitioners strive to solve.

In this panel, we will take stock of the status of research on data quality from diverse perspectives and across national boundaries. We will then discuss key emerging data quality issues and suggest directions for understanding and solving these problems. The panel will address a set of critical questions: How do different perspectives on data quality define and solve data quality problems? Do different approaches shape different solutions to the same problems? Are we facing different problems in the Internet era, and do we therefore need a fresh look at how we frame and solve data quality problems?

PERSPECTIVES ON DATA QUALITY

The panel discussion on data quality will be based on the following complementary, but diverse, perspectives.

Ontological Perspective

Ontologically, a data deficiency is defined as an inconformity between the view of the real-world system that can be inferred from a representing information system and the view that can be obtained by directly observing the real-world system. The ontological approach concentrates on the internal view and is oriented toward system design and data production. This approach supports a set of information quality dimensions that are comparable across applications. These dimensions can be viewed as being intrinsic to the data. This view can be used to guide the design of an information system with certain information quality objectives (Wand and Wang 1996; Wand and Weber 1990). We will briefly illustrate the principles of data quality from an ontological perspective and discuss how this perspective can define data quality problems and suggest solutions.

Architectural Perspective

Data warehousing is used to address some aspects of data quality problems. Data warehousing, data marts, and ERP systems focus on delivering systems infrastructure that dictates how data are stored and shared for cross-functional operation. The panel will use real project examples in Europe to illustrate how these collaborative projects identified and solved data quality problems. The Foundations of Data Warehouse Quality (DWQ) is a collaborative research project funded by the European Communities under the Reactive Long Term Research Branch of ESPRIT IV Research and Development Programs. Participating institutions include universities and research institutions in Greece, Germany, France, and Italy (Jarke et al. 1999; Jarke and Vassiliou 1997). The project aims to establish foundations of data warehouse quality by linking semantic models of data warehouse architecture to explicit models of data quality. We will offer lessons learned from the research project and discuss future directions for architectural solutions to emerging data quality problems.

Context-Mediation Perspective

The context-mediation perspective can be illustrated by the story of the "Tower of Babel," where people with different languages are required to communicate to achieve a common goal. Disparate heterogeneous databases need to be aggregated across space, time, and communities. Specifically, the common integration goal in large-scale databases demands a perspective that can frame and answer challenges ranging from differing units of currency to different local meanings of the same terminology (Madnick 1995a, 1995b). The panel will illustrate how geographic, functional, and organizational contexts shape the specific meaning of data, which in turn generates challenges for the global and future design and use of database systems. Some example solutions will be briefly described, and future directions for data quality research and practice will be discussed.
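As a rough illustration of the idea, and not a description of any system the panelists built, the sketch below attaches a unit context to each value and converts it into the receiver's context before aggregation. It uses physical units of the kind involved in a metric versus non-metric mismatch, because their conversion factor is fixed; currency conversion would additionally need a dated exchange rate. The names `Measurement`, `mediate`, `total`, and the unit labels are hypothetical.

```python
from dataclasses import dataclass

# Factors that convert each supported unit into newton-seconds.
# One pound-force is defined as 4.4482216152605 newtons.
TO_NEWTON_SECONDS = {
    "N*s": 1.0,
    "lbf*s": 4.4482216152605,
}


@dataclass(frozen=True)
class Measurement:
    """A value that carries its source context (here, only the unit)."""
    value: float
    unit: str


def mediate(m: Measurement, target_unit: str) -> Measurement:
    """Convert a measurement into the receiver's unit context."""
    if m.unit not in TO_NEWTON_SECONDS or target_unit not in TO_NEWTON_SECONDS:
        # Refuse to guess: an unknown context is reported, not silently passed on.
        raise ValueError(f"no conversion known from {m.unit!r} to {target_unit!r}")
    in_base = m.value * TO_NEWTON_SECONDS[m.unit]
    return Measurement(in_base / TO_NEWTON_SECONDS[target_unit], target_unit)


def total(measurements, target_unit: str) -> float:
    """Aggregate values from mixed contexts after mediating each one."""
    return sum(mediate(m, target_unit).value for m in measurements)
```

Summing the raw numbers from two sources that report in different units would silently mix contexts; calling `total(readings, "N*s")` converts each reading first and raises an error for any unit it does not recognize.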

Time-Based E-Commerce Perspective

Time-based threats to data quality occur when real-world conditions or events are not reflected by the information systems. E-commerce systems are especially vulnerable to time-based threats because of the speed with which business is transacted and with which various aspects of the environment change. The problems and challenges include achieving and sustaining data quality dimensions such as accuracy, timeliness, and comparability (Bowen et al. 1995, 1998). Threats to data quality include failures to identify and record real-world events that affect values stored in the information system. These events can be discrete or continuous. A more complex situation exists in the case of forecast data for value chain members. For example, suppliers further up in the value chain need up-to-date forecasts of their downstream trading partners' anticipated sales. The panel will discuss how time-based threats to data used in e-commerce systems can be identified, and further research directions will be discussed.
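One simple way to make a time-based threat visible is to record when each stored value last reflected real-world conditions and to flag values older than a tolerated lag. The sketch below applies this to the forecast example; it is an assumption-laden illustration, and the `Forecast` record, its fields, and `stale_forecasts` are hypothetical names rather than anything taken from the cited work.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Forecast:
    """A downstream partner's anticipated sales, stamped with its age."""
    partner: str
    units: int
    as_of: datetime  # when the forecast last reflected real-world conditions


def stale_forecasts(forecasts, now: datetime, max_age: timedelta):
    """Return the forecasts whose age exceeds the tolerated lag."""
    return [f for f in forecasts if now - f.as_of > max_age]
```

A supplier could run such a check before planning production and request refreshed forecasts from the partners it returns. Choosing `max_age` is a business decision that depends on how quickly conditions change.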

Information Product Perspective

The information product perspective advocates managing information as a product, not as a by-product. If the manufacture of a physical product, such as a car, can be managed with a well-designed production process, materials, and services for the customers, information ought to be managed in the same fashion (Wang et al. 1998). Data collectors, data custodians (IS professionals), and data consumers in the data production process all have a critical role to play in improving data quality (Lee 1996). Unless data is fit for use by information consumers, the data lacks quality (Strong et al. 1997; Wang and Strong 1996). The panel will present recent theories and report how industry is managing information as a product.

PANEL FORMAT

The panel will adhere to a question and answer format with brief presentations. It aims to achieve lively discussion between the audience and the panelists, and engaging discussion among the panelists. To launch the interactive discussion, panelists will answer the questions posed by the moderator. We will have brief presentations to describe the core ideas for each perspective. This will be followed by the practitioner’s comments and open discussion among the panelists. We will then open the discussion to include the audience.


PANEL PARTICIPANTS

Yang W. Lee is Assistant Professor of Information Technology at Northeastern University. Her research centers on data quality, IT-mediated institutional learning, and systems integration strategy. She received her Ph.D. from Massachusetts Institute of Technology. Her publications have appeared in Communications of the ACM, Sloan Management Review, Journal of Information Technology Management, and IEEE Computer. Professor Lee is a co-author of Quality Information and Knowledge (Prentice-Hall, 1999), Data Quality (Kluwer Academic Publishers, 2000), and Journey to Data Quality: A Road to Higher Productivity (MIT Press, forthcoming). She will be a visiting professor at Massachusetts Institute of Technology in 2001.

Paul Bowen is a Senior Lecturer of Information Systems in the Department of Commerce at the University of Queensland. He received his B.S. in Industrial Management from Georgia Tech. He has an MBA, a Master’s of Accountancy, an M.S. in Computer Science, and a Ph.D. in Accounting Information Systems from the University of Tennessee. His research interests include data quality, end-user computing, database design, internal controls, and software reliability. Paul’s interest in data quality originated from his work as a systems analyst and project manager at the Oak Ridge National Laboratory from 1980 to 1988. His research has appeared in the Australian Computer Journal, Australian Journal of Information Systems, Data Quality, IS Audit & Control Journal, Journal of Cost Management, and Managerial Auditing Journal. He is on the editorial board of the Journal of Information Systems and the International Journal of Accounting Information Systems. He served on the program committee for WITS’97, WITS’98, and WITS’99, and as co-chair of WITS 2000.

James Funk is manager of Global Data Architecture at S. C. Johnson, a consumer goods manufacturer. He is also an instructor at the University of Wisconsin, Parkside, in the school of business. He has spent the last 15 years focusing on issues related to data administration, conceptual and logical data modeling, and data quality at his current employer and at a large utility. He has given presentations on data quality at MIT's summer session on Total Data Quality Management and at the 1999 TDQM conference held at MIT. He has over 30 years of experience in information systems and is currently responsible for developing and maintaining a global data architecture as well as instituting a data quality practice within his current organization. He is a co-author of Journey to Data Quality: A Road to Higher Productivity (MIT Press, forthcoming). Mr. Funk will serve as the Conference Co-Chairman for the 2001 International Conference on Information Quality to be held at MIT.

Matthias Jarke is Professor of Information Systems at Aachen University of Technology (RWTH Aachen) and Executive Director of the FIT Institute for Applied Information Technology at the GMD National German IT Research Labs. His research focuses on information systems support for cooperative work in engineering, business, and culture. He has been coordinator of three European ESPRIT projects in information systems engineering, and is currently a member of two National Collaborative Centers of Excellence addressing Computer-Aided Chemical Engineering and Media and Cultural Communication. He has published over 150 refereed papers and several books, most recently Fundamentals of Data Warehouses (Springer-Verlag). In 1999, he was elected Vice President of the German Informatics Society. Jarke holds diploma degrees in Computer Science and Business Administration and a Doctorate from Hamburg University, Germany, and served on the faculties of New York University and Passau University prior to joining RWTH Aachen in 1991.

Stuart Madnick is the John Norris Maguire Professor of Information Technology and Leaders for Manufacturing Professor of Management Science at the MIT Sloan School of Management. His research interests include information technology strategy, connectivity among disparate distributed information systems, database technology, and software project management. He is the author or co-author of over 250 books, articles, or reports on these subjects, including the classic textbook, Operating Systems (McGraw-Hill), and the book, The Dynamics of Software Development (Prentice-Hall). He co-heads (with Professor Rich Wang) the Total Data Quality Management (TDQM) program at MIT. He is also co-heading a project to develop new technologies for gathering, aggregating, and analyzing information from many different sources, including traditional databases and the World Wide Web. He is testing these new technologies in industries such as financial services, manufacturing, logistics, and transportation. He has been active in industry, making significant contributions as one of the key designers and developers of projects such as IBM’s VM/370 operating system and Lockheed’s DIALOG information retrieval system. Dr. Madnick has degrees in Electrical Engineering (B.S. and M.S.), Management (M.S.), and Computer Science (Ph.D.) from MIT. He has been a Visiting Professor at Harvard University, Nanyang Technological University (Singapore), University of Newcastle (England), and Technion (Israel).

Yair Wand is a Professor of Information Technology at the Faculty of Commerce and Business Administration, The University of British Columbia. Yair’s research focuses on theoretical foundations for information systems analysis and design, and on modeling methods used in systems analysis. Yair’s work has appeared in The Accounting Review, ACM Transactions on Database Systems, ACM Transactions on Software Engineering, Communications of the ACM, Decision Sciences, Information and Management, Information Systems, INFOR, Journal of Information Systems, and Management Science. Yair has consulted with various organizations on information systems and software product development.


References

Bowen, P. L., Fuhrer, D. A., and Guess, F. M. “Continuously Improving Data Quality in Persistent Databases,” Data Quality (4:1), September 1998 (available electronically at http://www.dataquality.com/998bowen.htm).
Bowen, P. L., Schneider, G. P., and Fields, K. T. “Managing Data Quality in Client/Server Environments,” IS Audit & Control Journal (IV), 1995, pp. 28-35.
Huang, K., Lee, Y., and Wang, R. Quality Information and Knowledge, Upper Saddle River, NJ: Prentice Hall, 1999.
Jarke, M., Jeusfeld, M. A., Quix, C., and Vassiliadis, P. “Architecture and Quality in Data Warehouses: An Extended Repository Approach,” Information Systems (24:3), 1999, pp. 229-253.
Jarke, M., and Vassiliou, Y. “Foundations of Data Warehouse Quality: An Overview of the DWQ Project,” in Proceedings of the International Conference on Information Quality, Cambridge, MA, 1997, pp. 299-313.
Lee, Y. W. “Why ‘Know Why’ Knowledge is Useful for Solving Information Quality Problems,” in Proceedings of the Americas Conference on Information Systems, J. M. Carey (ed.), Phoenix, AZ, August 1996, pp. 200-202.
Madnick, S. E. “Are We Moving Toward an Information Super Highway or a Tower of Babel?” in Proceedings of the Twenty-First International Conference on Very Large Data Bases (VLDB), 1995a, pp. 11-17.
Madnick, S. E. “Integrating Information from Global Systems: Dealing With the ‘On- and Off-Ramps’ of the Information Superhighway,” Journal of Organizational Computing (5:2), 1995b, pp. 69-82.
Redman, T. Data Quality for the Information Age, Boston: Artech House, 1996.
Strong, D. M., Lee, Y. W., and Wang, R. Y. “Data Quality in Context,” Communications of the ACM (40:5), 1997, pp. 103-110.
Wand, Y., and Wang, R. Y. “Anchoring Data Quality Dimensions in Ontological Foundations,” Communications of the ACM (39:11), 1996, pp. 86-95.
Wand, Y., and Weber, R. “An Ontological Model of an Information System,” IEEE Transactions on Software Engineering (16:11), 1990, pp. 1282-1292.
Wang, R. Y., Lee, Y. W., Pipino, L. L., and Strong, D. M. “Manage Your Information as a Product,” Sloan Management Review (39:4), 1998, pp. 95-105.
Wang, R. Y., and Strong, D. M. “Beyond Accuracy: What Data Quality Means to Data Consumers,” Journal of Management Information Systems (12:4), 1996, pp. 5-34.
