
Proceedings of the 38th Hawaii International Conference on System Sciences - 2005

A Dynamic Structure for Experiential Data in a Collaboration Marketplace to Manage Tacit and Contextual Knowledge for Reuse

Michael B. Spring and Charles E. Grindle
Department of Information Science and Telecommunications
University of Pittsburgh

Abstract

This paper addresses issues in the development of a system for knowledge management in the context of extensible XML-structured documents, with RDF-based ontologies used to aid in linkage and classification. The potential of a collaboration marketplace modeled on web services as a delivery mechanism is elaborated. The discussion is exploratory, set in the context of developing a design for experimentation, but it is based on extensive work on existing collaboration environments developed at the University of Pittsburgh over the last decade. The proposed system uses the military After Action Review (AAR) as its major input and explores how such an input might be optimally structured to make use of an array of processing tools.

Introduction

Knowledge management is a broad umbrella that encompasses formal and informal activities. All of the various efforts look to leverage data, information, and knowledge in organizations to enable the organizations to engage in better processes. Indeed, as Davenport and Prusak (2000) point out, knowledge management is directed to helping organizations use what they know. Some efforts are theoretically grounded and others are atheoretical. Some are strongly tied to particular technologies. What is clear is that advances in information theory and technology now make it possible for organizations to try to manage tacit, contextual, semi-structured and highly structured knowledge. What is not yet clear is how to systematically approach various opportunities for knowledge management.

Kankanhalli et al. (2003) suggest that one way to categorize knowledge management initiatives is to classify them as product or service based and as occurring in high or low volatility environments (Figure 1). We see this kind of taxonomic classification as an important first step in organizing knowledge management initiatives. Further, we believe that it should be possible to predict the issues and obstacles to be faced in a given activity once sufficient experience is gained in classifying and laying out specific programs of knowledge management. The Kankanhalli et al. classification would be enhanced by identification of the primary source of the knowledge being transferred. While knowledge taxonomies proliferate, we would suggest that there is some agreement at this point in time that knowledge exists in at least four forms: internal informal, internal formal, external informal and external formal. The frame of reference for internal and external is the individual. The often used term "tacit knowledge" refers to internal and informal knowledge. What an individual knows in a more formal but still unshared form would be classed internal formal. The vast store of knowledge that exists in databases and document stores might be classed as external informal. It is our belief that what Swanson (1986) referred to as "undiscovered public knowledge" falls into this category as well.

0-7695-2268-8/05/$20.00 (C) 2005 IEEE



Finally, we have explicit knowledge: formally defined knowledge that is externally stored. Kankanhalli et al. (2003, p. 73) found that organizations focused on service in low volatility contexts tended to focus on knowledge codification, whereas organizations focused on service in high volatility environments tended to focus on personalization. It seems reasonable to speculate that situations in which knowledge is tacit will focus more on acquisition tasks, and situations in which knowledge is already explicit, or explicit by nature, will focus on knowledge sharing and knowledge use.

Figure 1. [Figure not reproduced. Recoverable labels: Context (High Volatility, Low Volatility); Type of Knowledge (Internal Informal, Internal Formal, External Informal, External Formal); Organizational Function (Service Related, Product Related).]

Background

As organizations networked PCs, email became an important means for sharing information. It may not be too far-fetched to suggest that in some ways, knowledge management is an outgrowth of email and file sharing. As managers began to realize that important information was being shared in this way, they began to seek tools for organizing, sharing, and using this and other sources of corporate knowledge. With time, more automated tools, generally classed as groupware, were developed both inside and outside the knowledge management efforts to facilitate workflow management, versioning, and other processes associated with less structured data. While knowledge management uses groupware technologies, the focus of knowledge management is on identifying knowledge sources, analyzing knowledge, and managing the flow of knowledge within an organization.

With the growth of the popularity of the World Wide Web, it became clear that there was both a vast source of additional information and a ready-to-use tool for sharing not only files, but an array of information artifacts. The question again became how these technologies could be used to contribute to knowledge management. Databases were restructured to provide up-to-date and widely shared information. With time, these transactional databases were supplemented by data warehouses and specialized tools for data mining and other operations. With time, these data intensive repositories were coupled with knowledge structuring tools to bring order.

Most recently, efforts at knowledge management have explored a variety of tools for structuring knowledge, acquiring new knowledge, manipulating knowledge bases and making use of knowledge. Abdullah, Benest, Evans and Kimble (2002) review a variety of approaches to structuring knowledge for reuse. These range from tools for modeling highly structured knowledge such as UML to tools for incorporating expert knowledge within a conceptual ontology such as Protégé 2000.

The Augmentation Laboratory at the University of Pittsburgh developed a tool called CASCADE to support collaboration among groups working on documents. CASCADE stands for Computer Augmented Support for Collaborative Authoring and Document Editing. CASCADE was developed with support from the National Institute of Standards and Technology to facilitate the development of standards (Spring and Vathanophas, 2003; Sapsomboon and Spring, 1996; Spring, Andriati and Vathanophas, 1999; Spring et al., 1997). The prototype document selected for CASCADE development was a standard. Standards are documents that involve teams of dozens to hundreds of people in a variety of different roles. They are worked on for periods of years, involve multiple supporting documents (minutes, test plans, formal definitions, etc.), and benefit from augmented processes for commenting, balloting, etc. CASCADE is a three-tier client server system that monitors and tracks activity on document artifacts related to a task. The system has been used on an experimental basis by a number of national standards developing organizations including the IEEE and ASME. This paper looks at extensions to CASCADE informed by knowledge management goals in the context of activity reviews regularly conducted by the Army.

Volatile Informal Internal Knowledge Reuse related to Performance

This paper looks to understand the optimal tools, both structural and functional, for working with knowledge that is primarily internal and informal in an environment of medium to high volatility for the purpose of performing a service. As a prototypical example, we have chosen the After Action Review (AAR) used by the Army to process information before, during and after operations. Many organizations conduct reviews of major events: sales encounters, customer service incidents, course evaluations, tenure reviews, etc. AARs are conducted to understand transpiring events at multiple levels, from the squad level to the highest level, the Army. This information is gathered in many forms, and is often compiled in a document that is reviewed at higher levels. It may be used to inform immediate action. It may be filed and not used again. A reviewed operation may be similar in scope to a follow-on event or may have other events that build on it. One unit may replace another and need background information to conduct future operations but not have communication with the replaced unit.

This paper suggests a dynamic structure for the Army's AAR process (or any other experiential data environment) managed within a collaborative tool marketplace. The approach to the management of this information and the knowledge developed from it involves three primary components. The first is a system for structuring the information. The second is a system for classifying the information. The third is a system for accessing and managing the information. We believe such a design provides an optimum format and environment to allow the Army to capture and reuse contextual knowledge.

Figure 2. [Figure not reproduced. Recoverable labels: AAR Submission, AAR Processing Tools, Validated AAR, AAR Review Tools, AAR Repository, AAR Mining Tools.]

Figure 2 shows these three processes functionally. The structured document process is used to validate submissions. Review tools aid in classification and processing, and mining tools serve management and use functions. Each of these three subsystems is described briefly below.

Structuring Information

With the emergence of the XML family of standards, it is possible to address the issue of document modeling at a new level of detail and with increased flexibility. CASCADE provided support for XML documents and used XML documents for data interchange and storage. Ballots and comments are classified and managed as XML documents and as XML data records, making it possible for the system to process the information for additional action. For example, a ballot in CASCADE serves as a ballot, a record of votes, and a data source for various processes such as ballot reminders and ballot analysis. However, these models and the support were based on a very SGML-like XML standard in its early stages of development. Since then, the World Wide Web Consortium has overseen the development of the complete XML standard with provision for data modeling and control (W3C, 1998; W3C, 2004). In addition, the use of namespaces (W3C, 1999a) as a part of XML Schema is now better understood, and the ability to develop modular, TEI-like (TEI, 1998) document sets has been made significantly easier.

The model for primary and secondary document artifacts will be based on XML. Early work by Maler and Andaloussi (1996) on SGML document analysis and design and more recent work by Glushko and McGrath (2002) on document engineering provide approaches that can be used for design. While the Maler and Andaloussi work is focused on static documents, and the Glushko and McGrath work is focused primarily on transactional documents, the approaches they have pioneered, along with the work of the humanities community on TEI, can be used to inform a process that will provide intuitive and extensible semantics for reports. The reports will be constructed from templates derived from a variety of interrelated namespaces. All review and comment documents will also use XML, allowing them to be classified and linked using extensible semantics in a user-friendly form. Most importantly in this process, the use of namespaces will provide a seamless capability to incorporate new events and new technologies in the reviews that are conducted. The management of these new components for the purpose of knowledge management will fall to the second component of the system, information classification.
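To make the role of namespaces concrete, the following is a minimal sketch in Python of a report assembled from two vocabularies. The namespace URIs, element names, and the `validate` helper are hypothetical illustrations; the paper does not define a schema, and this is not the CASCADE implementation. The sketch shows how a report can mix elements from independently maintained namespaces and how a system might notice a namespace it has not seen before.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; the paper does not define any.
AAR_NS = "http://example.org/aar/core"
LOG_NS = "http://example.org/aar/logistics"

ET.register_namespace("aar", AAR_NS)
ET.register_namespace("log", LOG_NS)


def build_report(unit: str, event: str, fuel_note: str) -> ET.Element:
    """Build a small AAR report whose elements come from two namespaces."""
    report = ET.Element(f"{{{AAR_NS}}}report", {"unit": unit})
    ET.SubElement(report, f"{{{AAR_NS}}}event").text = event
    # An element drawn from a second, independently maintained vocabulary.
    ET.SubElement(report, f"{{{LOG_NS}}}fuelObservation").text = fuel_note
    return report


def namespaces_used(root: ET.Element) -> set[str]:
    """Return the namespace URI of every namespaced element in the tree."""
    found = set()
    for element in root.iter():
        if element.tag.startswith("{"):
            found.add(element.tag[1:].split("}", 1)[0])
    return found


def validate(root: ET.Element, known: set[str]) -> list[str]:
    """List namespaces in the report that the system does not yet know."""
    return sorted(namespaces_used(root) - known)


report = build_report("1st Platoon", "convoy escort", "resupply arrived late")
xml_text = ET.tostring(report, encoding="unicode")
print(xml_text)
print(validate(report, {AAR_NS}))
```

A new kind of event or technology would arrive as a new namespace. Under this assumption, the unknown namespaces reported by `validate` are the items that would be handed to the classification component.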

Information Classification

While classification helps to organize information, it has one major limitation. For a resource to be discovered, the class or category must be known. Useful information resources may not be discovered if the finder looks in the wrong classes. This becomes even more complicated when different people classify things differently, or when the classification is extensible, as is the case for the AARs. It was once imagined that information could instead be organized by its associations to other resources. Bush (1945) introduced the notion of organization by association, where information resources are discovered and retrieved by their relatedness. As has been made clear with the explosion of the World Wide Web (WWW), this kind of organization does not work very well. Even when information is well classified and related, discovery may be frustrated. Swanson (1986) calls this situation undiscovered public knowledge. One of Swanson's examples supposes that two reports are published separately and independently: one reports that process A causes result B and the other reports that B causes result C. Although it could be inferred that A could cause C, the fact may never be discovered unless the two reports are known to the same person. Swanson referred to the problem as "the missing link in the logic of discovery" (p. 110).

It has been suggested that deduction over classified information related by ontologies could solve part of this problem. For example, when classes are organized hierarchically, it should be inferred that the members of a category consist of the direct category members as well as all the members of its subcategories. In addition, to determine the classes to which a resource belongs, it should be inferred that all the classes into which the resource is directly classified, as well as their parent categories, are included. Another form of deduction would be based on an ontology that defines classifications as similar. Suppose painting and sculpture are each considered a kind of artwork. Given a resource described as a review article about a painting and another resource described as a review article about a sculpture, the deduction system should classify them under the same category of art reviews. In similar fashion, the development of ontologies related to namespaces would allow other logical operations such as transitivity, cardinality, and disjunction to be used. The mechanism for incorporating this higher level is provided by XML and RDF (W3C, 1999b; W3C, 1999c) and has been extended using a variety of ever more powerful languages for describing the relationships between RDF schema and RDF descriptions. The theoretical foundations for such relationships in document spaces have been provided by the DARPA work on agents and by the work of Phelps and Wilensky (1996) and Rothenberg (1996).
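The deductions described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an RDF implementation: the category names, resource identifiers, and function names are hypothetical, and plain dictionaries stand in for an ontology. The sketch covers the two hierarchical inferences (parent categories of a resource, members of a category including its subcategories) and the transitive inference behind Swanson's A, B, C example.

```python
# Hypothetical subclass links (child -> parents); not taken from any real ontology.
SUBCLASS_OF = {
    "painting review": ["art review"],
    "sculpture review": ["art review"],
    "art review": ["review"],
}

# Hypothetical direct classifications of two resources.
DIRECT = {
    "report-17": ["painting review"],
    "report-42": ["sculpture review"],
}


def ancestors(category, subclass_of):
    """All categories reachable by following subclass links upward."""
    seen = set()
    stack = list(subclass_of.get(category, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(subclass_of.get(parent, []))
    return seen


def classes_of(resource, direct, subclass_of):
    """Direct classes of a resource plus all of their parent categories."""
    result = set()
    for category in direct.get(resource, []):
        result.add(category)
        result |= ancestors(category, subclass_of)
    return result


def members_of(category, direct, subclass_of):
    """Resources classified under a category or any of its subcategories."""
    return {r for r in direct if category in classes_of(r, direct, subclass_of)}


def transitive_links(edges):
    """Derive (A, C) from (A, B) and (B, C) until nothing new appears."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


print(members_of("art review", DIRECT, SUBCLASS_OF))
print(transitive_links({("A", "B"), ("B", "C")}))
```

Both resources are found under "art review" even though neither was classified there directly, which is the behavior the painting and sculpture example calls for.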

Access and Management

The broad outline of the system shown in Figure 2 is simplistic. The system envisioned is much more complex. It consists of extensible and dynamically configurable services for collaboration based on an instantiation of a collaboration marketplace (Figure 3) consistent with the broad vision proposed by Dertouzos (1998, 2001). A marketplace allows services to be configured to support a variety of tailored collaboration tasks. This functionality is achieved through a structural architecture that includes market, infrastructure, basic, and broker services (Spring and Gennari, 2004). Users attach to a coordinator or broker service, which composes the needed basic services that in turn rely on infrastructure and market services. These services can be local or remote. In implementation, there will be dozens of services to assist, broker, support and transform basic services. The services provided by a collaboration marketplace include management, navigation, communication, and artifact services (Heo, Morse, Willms, and Spring, 1996; Morse and Spring, 1996).

Figure 3. [Figure not reproduced.]

From a user perspective, the services provided by a collaboration system include:

• Management
  o Creating users and groups and assigning access rights
  o Developing spaces and artifacts within those spaces
• Work management
  o Navigation
  o Activity review
• Artifact Services
  o Creating specific pieces of work
  o Writing, editing, adding
• Communication Services
  o Directly through tools
  o Indirectly through votes, comments, etc.

The work builds on web services models such as JXTA, e-speak and .NET. The resulting system enables numerous collaboration tools that share a common artifact base that is highly structured, but extensible. It will allow multiple tailored collaboration environments to be constructed at a low cost and adapted easily and quickly to meet changing needs without a loss of quality.

From a system perspective, these services are provided in layers, with lower layer services providing basic functionality used by higher layer services. Services are classed as market, infrastructure, basic, and broker. Market services are those required for the operation of the marketplace and are generally singular in nature. Examples would include certificate services, authentication services, banking and escrow services, etc. Infrastructure services are those that support the basic services in the marketplace and may be competitive in nature. Examples would be delivery services, logging services, translation services, etc. Basic services are those that are used directly by consumers and supported by market and infrastructure services. Examples include communication interfaces and document construction interfaces. Broker services serve two primary roles: aggregation of sets of basic services and selection of services from competitive sets. This marketplace arrangement provides an environment within which the various information sources can be easily accessed and in which a variety of lightweight and heavyweight clients can be quickly constructed.
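The layering and the two broker roles can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the class names, the `requires` field, and the rule that the cheapest offer wins are hypothetical and are not drawn from the paper or from any web services framework. The sketch shows a broker aggregating a basic service together with the lower layer services it relies on, and selecting one offer from a competitive set.

```python
from dataclasses import dataclass, field

# Lower layers first; the broker sits above these three.
LAYERS = ("market", "infrastructure", "basic")


@dataclass
class Service:
    name: str
    layer: str                      # one of LAYERS
    kind: str                       # what it does, e.g. "logging"
    cost: float = 0.0               # hypothetical basis for selection
    requires: tuple[str, ...] = ()  # kinds of services it relies on


@dataclass
class Marketplace:
    services: list[Service] = field(default_factory=list)

    def register(self, service: Service) -> None:
        self.services.append(service)

    def offers(self, kind: str) -> list[Service]:
        return [s for s in self.services if s.kind == kind]


class Broker:
    """Aggregates services and selects among competing offers."""

    def __init__(self, market: Marketplace):
        self.market = market

    def select(self, kind: str) -> Service:
        offers = self.market.offers(kind)
        if not offers:
            raise LookupError(f"no service offers {kind!r}")
        # Assumed selection rule: the cheapest competing offer wins.
        return min(offers, key=lambda s: s.cost)

    def compose(self, kinds: list[str]) -> list[Service]:
        chosen: dict[str, Service] = {}
        pending = list(kinds)
        while pending:
            kind = pending.pop()
            if kind in chosen:
                continue
            service = self.select(kind)
            chosen[kind] = service
            pending.extend(service.requires)  # pull in supporting services
        return sorted(chosen.values(),
                      key=lambda s: (LAYERS.index(s.layer), s.name))


market = Marketplace()
market.register(Service("auth", "market", "authentication"))
market.register(Service("log-a", "infrastructure", "logging", cost=2.0))
market.register(Service("log-b", "infrastructure", "logging", cost=1.0))
market.register(Service("editor", "basic", "document construction",
                        requires=("authentication", "logging")))

bundle = Broker(market).compose(["document construction"])
names = [s.name for s in bundle]
print(names)
```

Authentication has a single provider, matching the description of market services as singular, while logging has two competing providers, so the broker must choose between them.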

References

Bush, V. As We May Think. The Atlantic Monthly, 176(1), 1945, pp. 101-108.

Davenport, T.H. and Prusak, L. Working Knowledge: How Organizations Manage What They Know. Harvard Business School Press, Boston, 2000.

Dertouzos, M. What Will Be. Harper, 1997.

Dertouzos, M. The Unfinished Revolution. HarperCollins, 2002.

Gennari, J., Harrington, M., Hughes, S., Manojlovich, M., and Spring, M.B. Preparatory Observations Ubiquitous Knowledge Environments: The Cyberinfrastructure Information Ether. NSF Post Digital Library Futures Workshop, Chatham, MA, June 15-17, 2003.

Glushko, R.J., Tenenbaum, J., and Meltzer, B. An XML Framework for Agent-Based E-commerce. Communications of the ACM, 42(3), March 1999, pp. 106-114.

Glushko, R.J. and McGrath, T. Document Engineering for e-Business. ACM Symposium on Document Engineering, November 2002, pp. 42-48.

Goldfarb, C.F. The SGML Handbook. Edited and with a foreword by Yuri Rubinsky. Oxford University Press, Oxford, 1990. 688 pages. ISBN 0-19-853737-1.

Heo, M., Morse, E.L., Willms, S., and Spring, M.B. Multi-level Navigation of a Document Space. WebNet96, AACE, San Francisco, CA, October 16-19, 1996.

Kankanhalli, A., et al. The Role of IT in Successful Knowledge Management Initiatives. Communications of the ACM, 46(9), September 2003, pp. 69-73.

Maler, E. and Andaloussi, J. Developing SGML DTDs: From Text to Model to Markup. Prentice Hall, 1996.

Morse, E.L. and Spring, M.B. Visualizations that Support Group Work. 1996 IEEE Symposium on Visual Languages, Boulder, CO, September 3-6, 1996.

Phelps, T. and Wilensky, R. Toward Active, Extensible, Networked Documents: Multivalent Architecture and Applications. In Proceedings of DL'96, ACM Press, New York, 1996, pp. 100-108.

Rothenberg, J. Metadata to Support Data Quality and Longevity. Proceedings of the 1st IEEE Metadata Conference, NOAA Complex, Silver Spring, Md., April 16-18, 1996. www.computer.org/conferences/meta96/rothenberg_paper/ieee.dataquality.html (acc. Oct 2000).

Sapsomboon, B. and Spring, M.B. Computer Based Collaborative Authoring for Standards Development. Open Systems Standards Tracking Report, 5(8), November 1996, pp. 4-6.

Spring, M.B. and Gennari, J. Standards for the Next-Generation Web: Architectural Considerations from a Standardization Perspective. International Journal of IT Standards and Standardization Research (forthcoming).

Spring, M.B. and Vathanophas, V. Peripheral Social Awareness Information in Collaborative Work. Journal of the American Society of Information Science and Technology, 54(11), September 2003, pp. 1006-1013.

Spring, M.B., Andriati, R., and Vathanophas, V. Usability of a Collaborative Authoring System for Standards Development: Preferences, Problems, and Prognosis. Proceedings of the First IEEE Conference on Standardization and Innovation in Information Technology, Aachen, Germany, September 15-17, 1999.

Spring, M.B., Fritsch, R., Lautenbacher, G., Lenox, T., Morse, E., Sapsomboon, B., Stewart, D., and Vathanophas, V. Embodying Social Capital Facilitators in a Collaborative Authoring System. Proceedings Association for Information Systems 1997, Americas Conference, Indianapolis, Indiana, August 15-17, 1997 (Full proceedings: http://hsb.baylor.edu/ramsower/ais.ac.97/program.html).

Swanson, D.R. Undiscovered Public Knowledge. Library Quarterly, 56(2), 1986, pp. 103-118. [See also: Swanson, D.R. Two Medical Literatures that are Logically but not Bibliographically Connected. Journal of the American Society for Information Science, 38(4), 1987, pp. 228-233.]

TEI. TEI. http://www.tei-uic.edu/orgs/tei/, June 1999. Web site.

W3C. Extensible Markup Language (XML) 1.0. http://www.w3.org/TR/1998/REC-xml-19980210, February 1998. W3C Recommendation.

W3C. Extensible Markup Language (XML) 1.0 (Third Edition). W3C Recommendation, 4 February 2004. François Yergeau, Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler.

W3C. Namespaces in XML. W3C Recommendation, 14 January 1999a. Tim Bray, Dave Hollander, Andrew Layman.

W3C. XML Path Language. http://www.w3.org/TR/1999/REC-xpath-19991116, 4 November 1999f. James Clark, Steve DeRose.

W3C. Extensible Style Language (XSL). http://www.w3.org/TR/WD-xsl, June 1999e. W3C Recommendation.

W3C. Document Object Model (DOM). http://www.w3.org/TR/PR-DOM-Level-1, June 1999d. W3C Recommendation.

W3C. Resource Description Framework (RDF): Model and Syntax Specification. http://www.w3.org/TR/1999/REC-rdf-syntax-19990222, February 1999b. W3C Recommendation.

W3C. Resource Description Framework (RDF): Schema Specification. http://www.w3.org/TR/1999/REC-rdf-syntax-19990222, March 1999c. Proposed Recommendation.
