Toward a Taxonomy of the Data Resource in the Networked Industry
Boris Otto, Fraunhofer Institute for Material Flow and Logistics, Dortmund
Rene Abraham, Institute of Information Management, University of St. Gallen
Simon Schlosser, Institute of Information Management, University of St. Gallen
1. Abstract The Internet of Things, the continuing globalization of logistics networks, decreasing product life cycles and increasing numbers of product variants are examples of current developments that pose, in general, new requirements both on networked industries and on data in logistical systems. Volume, heterogeneity and importance of data for businesses are growing. In order to be able to manage data under these complexity constraints, networked industries need a current, comprehensive and consistent understanding of the data architecture, i.e. of the key data entities, their relationships, their sources of origin, trustworthiness, frequency of occurrence, quality and ownership. As current approaches for data architecture management in particular and data resource management in general fall short in providing support for this endeavor, the paper at hand proposes a morphology of the data resource in networked industries. The morphology is the result of a taxonomic analysis aiming at providing structure to complex data environments. The paper uses four case studies to identify and describe the dimensions of the morphology and its characteristics. Furthermore, the paper develops the baseline of a method for guiding the application of the morphology.
2. Introduction 2.1. The Role of Data in the Networked Industry The adoption of the Internet of Things (IoT) (Bullinger & ten Hompel, 2007; Fleisch & Mattern, 2005), the proliferation of cyber-physical systems and an increasing tendency towards consumer-centricity (Ross, 2009) in many industries are developments that foster networked forms of organization. Radio Frequency Identification (RFID) technologies increase the visibility of goods within supply networks,
D1
Big Data and Information Management
consumers are demanding more and more real-time information (such as allergen implications of grocery products) from retailers and suppliers, and modern logistics systems are based on autonomous devices that organize themselves in a decentralized, network-like fashion. The exchange of data and information has always been a prerequisite for networked forms of organization. In the 1980s, standards for Electronic Data Interchange (EDI) such as EDIFACT and ANSI X12 emerged before electronic business and electronic market scenarios became popular in the late 1990s (Alt & Klein, 2011). The latter required electronic information about products, suppliers and customers. Today, data has evolved from a pure enabler of networked organizations to a key resource in networked forms of business. The German “Smart Service Welt” initiative, for example, considers data the key for future business models in the internet-based service economy (Smart Service Welt Working Group, 2014). Consequently, a market has emerged for companies whose product actually is data (Otto & Aier, 2013). Table 1 shows a selection of businesses dubbed data markets.

Data market | Factual | InfoChimps | Windows Azure Data Market | Data.com
Year of foundation | 2007 | 2009 | 2010 | 2010 (formerly Jigsaw, 2004)
Owner | Venture capital firms | CSC | Microsoft | Salesforce.com
Offering | Open data platform, API use for free or at a charge. | 15,000 data sets, open data platform, four different pricing models, web service. | Wide range of data, including open data platform. Buying and selling data via Azure marketplace. | Data sets for increasing master data quality, maintained by community of 2,000,000 users.
Services | Data mining, data retrieval, data acquisition from external parties. | Data collection, infrastructure development, hosting, and distribution. | Software as a Service (SaaS) applications and data sets, partially real-time access. | Different service and pricing models. Access to contact information, real-time updated data sets.
Details | http://www.factual.com/ | http://www.infochimps.com/ | http://datamarket.azure.com/ | http://www.data.com/

Table 1: Selected Data Markets
Furthermore, the data resource within organizations is changing. Whilst in the past it was dominated by internally created and managed data stored in relational databases, today a variety of different types of data in growing volumes is observable. The scientific community has long been studying the data resource in firms. Two decades ago, first contributions dealt with the strategic nature of data as an asset (Horne, 1995) and approaches for strategic data planning were developed (Goodhue, Kirsch, Quillard & Wybo, 1992; Shanks, 1997). Data resource management became established (King & Nehyba, 1982; Nolan, 1982) with data architecture management as a key function. A data architecture identifies the key data objects in an enterprise, unambiguously defines them for unique identification and describes their relationships with each other (Brackett, 1994; Inmon, 1992). A data architecture is a prerequisite for authoritative data sources (Ponzio, 2004) – which then function as “single sources of the truth” – and for data sharing within organizations and across the boundaries of businesses. Practitioners of data resource management have issued the DAMA Data Management Body of Knowledge (DAMA, 2009a), and software vendors and consultancies offer a wide array of data architecture management solutions. However, the current discourse falls short in analyzing, explaining or even prescribing data architecture design and management in the networked industry. There have been some contributions in the field of RFID use (Chalasani & Boppana, 2007; Derakshan, Orlowska & Li, 2007), but they have not investigated the topic in a comprehensive fashion. This situation leaves data managers, data architects and line-of-business managers with only limited support when it comes to a vendor-neutral and technology-independent understanding of modern data architecture design and management. This gap in research motivates the paper at hand.
2.2. Research Question and Approach The paper pursues the goal of increasing the understanding of the data resource in the networked industry. It wants to add to the scientific knowledge base by analyzing the data resource, mainly its dimensions and characteristics. In addition to that, it aims at helping companies in the networked industry to better understand and design their data architecture. A taxonomic analysis of the object of study is performed following the principles of morphological analyses (Zwicky, 1969). Thus, the paper aims at responding to the following research questions:
– RQ1: What does a morphology of the data resource in the networked industry look like?
– RQ2: How should a methodology be designed that helps companies in the networked industry to apply the morphology for data architecture design?
The paper uses empirical data from case studies. The results are reflected against the scientific knowledge base in order to ground the morphology in literature. The development of the method for morphology application follows the guidelines of method engineering as a scientific design approach (Brinkkemper, 1996). The remainder of the paper is structured as follows. Section 3 summarizes the related work in the field of data architecture management before Section 4 provides a brief overview of the research process. Section 5 describes and analyzes mini case studies from which the morphology is designed in Section 6. Section 7 shows how enterprises can use the morphology. The paper closes with a conclusion in Section 8.
3. Related Work
3.1. Data Resource Management According to information processing theory, data is the “raw material” of information, i.e. pieces of data are the building blocks of information (Oppenheim, Stenson & Wilson, 2003). There are different logical views of data (Chen, 1976; Levitin & Redman, 1998; Yoon, Aiken & Guimaraes, 2000). On the lowest level of logical aggregation, there are data items. Data items are instantiations of attributes of data objects (the surname of a customer, for example). A set of data items constitutes a data record. A data record is the instantiation of a data object (a customer master data record including all its attributes, for example). Data records can be further aggregated to form database tables (or files). A table comprises, for example, data of all the customers of a company. Databases are aggregations of tables or files. A customer management database, for instance, might contain customer master data and sales order data. Finally, the total number of databases in a company constitutes the company’s data resource. With the proliferation of computer systems, companies started to refer to data as a resource, i.e. as a means for production (Levitan, 1982; Levitin & Redman, 1998). A number of scientific studies aimed at the identification of characteristics and features of the data resource, such as intangibility, divisibility, transportability etc.
(Levitin & Redman, 1998). Managing the data resource is considered an organizational task comprising, for example, data architecture management, data access management, database management and the assignment of responsibilities for data (Goodhue, Quillard & Rockart, 1988; Van den Hoven, 1999). Consequently, DAMA (2008) defines data management as the “business function that develops and executes plans, policies, practices and projects that acquire, control, protect, deliver and enhance the value of data and information”.
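The logical levels of aggregation described in Section 3.1 (data item, data record, table, database, data resource) can be illustrated with a minimal Python sketch. All class names, field names and sample values below are assumptions chosen for illustration; they are not taken from the cited literature.

```python
from dataclasses import dataclass, field


@dataclass
class DataRecord:
    """Instantiation of a data object: maps attribute names to data items."""
    items: dict


@dataclass
class Table:
    """Aggregation of data records of one data object type."""
    name: str
    records: list = field(default_factory=list)


@dataclass
class Database:
    """Aggregation of tables (or files)."""
    name: str
    tables: list = field(default_factory=list)


@dataclass
class DataResource:
    """All databases of a company (hypothetical container class)."""
    databases: list = field(default_factory=list)

    def count_data_items(self) -> int:
        # Walk down the hierarchy: databases -> tables -> records -> items.
        return sum(
            len(record.items)
            for database in self.databases
            for table in database.tables
            for record in table.records
        )


# Hypothetical customer management database with two tables.
customers = Table("customer_master_data", [
    DataRecord({"surname": "Muster", "city": "Dortmund"}),
    DataRecord({"surname": "Example", "city": "St. Gallen"}),
])
orders = Table("sales_orders", [
    DataRecord({"order_id": 1, "customer": "Muster", "amount": 99.0}),
])
resource = DataResource([Database("customer_management", [customers, orders])])

print(resource.count_data_items())
```

The sketch only mirrors the aggregation levels; it makes no statement about how such data is stored physically.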
3.2. Data Architecture Management An architecture describes a system by its components and their relations. It can further specify concepts to design and evolve that system (Bernard, 2005; ISO, 2011; Schekkerman, 2006). Enterprise architecture defines an enterprise system. It may comprise numerous types of sub-architectures, such as process architecture, data architecture, integration architecture, application architecture, system architecture or software architecture (Zachman, 1987). The data architecture provides the data view on the enterprise architecture. It integrates the sub-architectures of an enterprise architecture by providing a common view of data (Brancheau, Schuster & March, 1989). As the data architecture is located between the process and application architecture, it facilitates data integration between business – i.e. organizational tasks, activities or roles – and information technology – i.e. data application functionalities and databases (Brancheau et al., 1989; Goodhue et al., 1992). A data architecture consists of two parts (Periasamy & Feeny, 1997, p. 198; Pienimäki, 2005, p. 39). The first part is a conceptual data model describing a company’s key business objects as well as the relationships between them on a conceptual level (e. g. in a data object model) (DAMA, 2009b, p. 63). The second part is an application architecture comprising the entirety of a company’s applications that create, store and update instances of the entity types defined in the conceptual data model as well as the data flows between these applications. Data architecture management is the business function that analyzes, defines, documents, maintains and transforms the data architecture in an organization.
3.3. Data in the Networked Economy No commonly accepted definition of the term “networked industry” exists. In the 1990s, the term was used for large corporations and conglomerates which organized as networks (Levin, 1998). Later on, companies continued to focus on their core competencies and the term networked industry mainly described a network of specialized companies that collaborated to provide a joint value proposition (Österle, Fleisch & Alt, 1999). Legner (2009) found that there are three perspectives on the concept of the networked industry, namely an enterprise perspective, a network perspective and a technology perspective. From an enterprise perspective, she identifies a set of collaborative processes such as product life cycle management, supply chain management and service management. The network organization – including the exchange and sharing of data – is addressed in the network perspective, while aspects of the technical infrastructure are part of the technology perspective. Table 2 uses Legner’s framework to analyze existing contributions in the field of data management in the networked industry.

Networked Industry Perspective | Selected Contributions with Data Focus | Summary of Knowledge Base
Enterprise | (Addo-Tenkorang, Helo, Shamsuzzoha, Ehrs & Phuong, 2012), (Bettoni, Alge, Rovere, Pedrazzoli & Canetta, 2012), (Legner & Schemm, 2008) | Data modeling in supply chains; Supply chain data management
Network | (Howard, Vidgen, Powell & Graves, 2001), (Lampathaki, Mouzakitis, Gionis, Charalabidis & Askounis, 2009), (Legner & Schemm, 2008), (Nelson, Shaw & Qualls, 2005) | Data and information sharing; Data standards; Interoperability
Technology | (Chalasani & Boppana, 2007), (D'Amours, Lefrançois & Montreuil, 1996), (Derakshan et al., 2007), (Dreibelbis et al., 2008), (Parlanti, Paganelli & Giuli, 2011), (Wang & Jin, 2008) | Data as a service (SOA); Information systems design; RFID data architecture design

Table 2: Related Work in the Field of Data Management in the Networked Industry

This related work addresses many important aspects of data management in the networked industry. A detailed analysis of the data architecture and the nature of the data used in networked industries, however, does not exist so far.
4. Research Design
The research design follows a two-staged research process as illustrated in Figure 1. Phase I aims at responding to research question RQ 1. This results in the analysis of the object of study and the respective data morphology design. The results of the literature review on data resource management and data architecture management as well as the findings from the case studies will be used as input to the morphology design process. In general, a case study is a qualitative research method that aims at gathering empirical data. The use of case studies is appropriate when the knowledge base about the object of study is limited and/or the phenomenon cannot be separated from its environment (Yin, 2002). The case study in this paper is of explorative nature. The data resource in the four cases is analyzed by using tables and described using entity relationship (ER) diagrams (cf. Chen, 1976). Morphological analysis as proposed by Zwicky (1969) is used to analyze and structure the results from the cases and the literature review and to develop a data morphology.
Figure 1: Research Process
The case study takes four perspectives on data in the networked industry (see Table 3):

Case | A | B | C | D
Perspective | Consumer-Centricity | Supply Chain Excellence, IoT | Purchasing | Electronic commerce
Industry | Consumer goods and retail | Consumer goods and retail | Pharmaceutical, chemical, food | Online retailing
Data objects in focus | Suppliers, retailers, products, consumers | Suppliers, retailers, load carrier | Suppliers | Customers, products
Case study partners | Beiersdorf, Migros | Mars, Rewe, Chep | Bayer, Nestlé, Novartis, Syngenta | Amazon
Data collection and analysis | Interviews; participatory case study | Expert interviews; case study | Interviews, focus groups, data overlap analysis; participatory case study | Archival records, public documentation; case study
Project context | Competence Center Corporate Data Quality (see http://cdq.iwi.unisg.ch/en/ for details) | SmaRTI (see http://www.smart-rti.de/ for details) | Corporate Data League (see http://corporate-data-league.ch/wiki/Main_Page for details) | -

Table 3: Case Study Overview

Data was collected mainly through interviews (except in Case D where public documentation was analyzed), both with subject matter experts on a bilateral basis and in the form of focus groups. Expert interviews allow for accelerated accumulation of knowledge compared to other empirical methods such as surveys (Meuser & Nagel, 2009). Focus groups try to use the dynamics of group discussions, in which participants are stimulated to think in directions that emerge from the discussion itself (Morgan & Krueger, 1993; Stewart, Shamdasani & Rook, 2007). Two cases (A and C) were participative (Baskerville, 1997), i.e. one or more of the authors of the paper were engaged with the case study partners in the projects mentioned. Participative case studies lack the distance of regular case studies, but offer richer insight into the object of study in return.
Phase II aims at responding to research question RQ 2 and results in a methodological approach for applying the data morphology. This phase uses the knowledge base about data resource management as an organizational function and method engineering as input. Data resource management specifies not only activities (such as data architecture management and database management) but also roles that perform these activities and resulting artifacts (e.g. data architecture). Method engineering (Brinkkemper, 1996; Olle, 1991) is an approach to design methods consisting of five method elements, namely meta-model, roles, activities, results and a procedure model (Gutzwiller, 1994).
5. Case Studies in the Networked Industry
5.1. Case A: Consumer-Centricity 5.1.1. Context This case includes two companies from the retail and consumer goods industry, respectively. Migros is the largest retailer in Switzerland with almost 90,000 staff and revenues of more than 20 billion Euros in 2011. Beiersdorf is a globally operating producer of skincare products – mainly known for its NIVEA brand – headquartered in Hamburg, Germany, with more than 16,000 staff and revenues of more than 6 billion Euros in 2012. Both companies were concerned with how to use the current development of consumer-centricity (Ross, 2009) in their industry in order to turn it into business value. In 2008, Migros decided to establish a number of consumer interaction channels. Examples are LeShop.ch8, Migros’ online shop, and Migipedia9, an online platform where consumers can discuss products, comment on them or get in touch with Migros. Similarly, Beiersdorf launched a project in 2011 to analyze the implications of consumer-centricity on information management (Schierning, 2012).
8 See http://www.leshop.ch/leshop/Main.do for details.
9 See http://www.migipedia.ch/de/ for details.
5.1.2. Business Perspective Migros identified multi-channel integration as a prerequisite to transform consumer-centricity into business value. The company also analyzed the impact of “channel cannibalism” and found that losses in stationary retail revenue were overcompensated by online revenue growth. In fact, an analysis of 6,000 loyalty program customers revealed an overall revenue increase of 30 percent (Schemm, 2012). Figure 2 shows the nine different channels – from print media advertisement to text messages – through which Migros interacts with its consumers along the consumer process and what a potential “path” of interaction looks like.
Figure 2: Multi-channel integration at Migros
Beiersdorf analyzed consumer-centricity and compared its product information exchange network in 2007 and 2012. Figure 3 shows the result of this network analysis. The different actors in the network as well as the betweenness centrality are depicted. Betweenness centrality is a measure of a node’s centrality in a network and equals the number of shortest paths that pass through that node. The edges depict the flow of product information. The analysis results were twofold. Firstly, the network has grown from five to ten actors. The Global Data Synchronization Network (GDSN), retailer web shops, pure online retailers and social networks have joined the network. Secondly, the consumer has moved from the outer rings of the network towards the center, i.e. the consumer’s power to control product information has increased significantly.
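To make the betweenness centrality measure more concrete, the following Python sketch computes it for a small undirected, unweighted network. It uses Brandes' algorithm, which is a common way to compute the measure and not necessarily the procedure used in the Beiersdorf analysis. Where several shortest paths connect the same pair of nodes, each path contributes a fraction, as in the usual formal definition. The actors and edges in `network` are purely hypothetical and do not reproduce Figure 3.

```python
from collections import deque


def betweenness_centrality(graph):
    """Betweenness centrality for an undirected, unweighted graph.

    graph: dict mapping each node to the set of its neighbours.
    """
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search counting shortest paths (sigma) from source.
        order = []
        predecessors = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        distance = {v: -1 for v in graph}
        sigma[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Accumulate path dependencies from the farthest nodes inwards.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair of nodes was counted twice (once per direction).
    return {v: c / 2.0 for v, c in centrality.items()}


# Hypothetical product information network (not the network of Figure 3).
network = {
    "brand_owner": {"retailer", "data_pool"},
    "retailer": {"brand_owner", "consumer", "data_pool"},
    "data_pool": {"brand_owner", "retailer"},
    "consumer": {"retailer", "social_network"},
    "social_network": {"consumer"},
}

for actor, score in sorted(betweenness_centrality(network).items()):
    print(actor, score)
```

An actor with a high score sits on many shortest information paths between other actors, which is one way to read the statement that the consumer has gained control over product information.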
Figure 3: Networked Industry at Beiersdorf (Schierning, 2012)
5.1.3. Data Resource Perspective Table 4 shows an overview of the data in Case A along the four different data domains “consumer”, “product”, “retailer” and “brand owner”. The table represents a combined list of findings from both Migros and Beiersdorf. Consumer-related data items mainly relate to the individual’s identity, his/her personal network and the relationship to the products he/she consumes. The latter includes purchasing histories and posts about products on websites such as Migipedia. The majority of data items is related to the product itself and comprises “traditional” pieces of information that can be found on the label of the product today. Examples are the Global Trade Item Number (GTIN), content, product name etc. Furthermore, product-related data also comprises allergen information, explanations of ingredients and product ratings. Retailer-related data is about the availability of the product both online and in the store, where-to-buy information etc. Brand-owner-related data describes the brand itself (in Case A NIVEA, for example), the country of origin of the brand owner as well as URLs of websites.
Data domain | Data items
Consumer | Blog entries and posts; Connections (e.g. Facebook friends); Polls on products; Purchasing history; Real name; Web identity (e-mail, Facebook account etc.)
Products | Allergen information; Brand; Claim; Content; Country of origin; Dimensions; GTIN; Ingredients (chemical substances); Ingredient explanations; Long text; Links to social network sites; Manufacturer; “Pack shot”; Prices; Product name; Product description; Product ratings; Recommendations; Sustainability information; Value proposition; Weight; “Where to buy” information
Retailer | Product availability; Promotion data (e.g. coupons); URLs of web shops etc.; “Where to buy” information
Brand owner | Brand; Country of origin; Promotion data (e.g. coupons); URLs of brand websites etc.

Table 4: Data Items in Case A

Figure 4 shows the data resource in Case A as an ER diagram.
Figure 4: Important Data Objects and Their Relations in Case A
When analyzing the data resource, the companies in this case made the following findings:
– Data about products is driven by what is printed on the product label. However, this was deemed insufficient with regard to increasing consumer information demands such as carbon footprint information, allergen implications, and explanations of ingredients.
– The quality of the product data is high in areas where a lot of data management expertise exists (supply chain data such as GTIN, for example). Maturity in data management, and consequently data quality, is more problematic when it comes to the quality of product images, the consistency of long texts across channels, the up-to-dateness of images etc.
– Data sources are not transparent in cases where data flow control is with the consumer (ratings, blogs, posts about products etc.). One manager articulated the concern that, as QR codes are literally free and can easily be created and linked to a product of the brand owner, it is getting more and more likely that incorrect product information is propagated through various online media and communication channels.
– Data structures are changing. Until recently, data was mainly organized in records and tables (according to the relational database paradigm). New forms of data, though, come as streams, videos etc.
5.2. Case B: Supply Chain Excellence and Internet of Things 5.2.1. Context The consortium research project SmaRTI – short for “Smart Reusable Transport Items” – aimed at improving the fast moving consumer goods supply through RFID technology on load carriers (pallets)10. The project consortium consisted of six industrial partners (Chep, Deutsche Post, Infineon, Mars, Lufthansa Cargo, and Rewe) and two research partners (the Fraunhofer Institute for Material Flow and Logistics and TU Dortmund University). The project was motivated by the need in the retail industry to avoid “out of stock” situations in stores as far as possible and by the increasing complexity of flows of goods and data within the supply chain. With an overall budget of more than eight million Euro, the project ran over 3.5 years ending in December 2013.
5.2.2. Business Perspective The consumer goods supply chain leaves room for improvement. Between five and ten percent of all products run into out-of-stock situations, one third of which are caused by errors in the order process. As a result, retailers lose between two and three percent of their revenue. Transparency in the supply chain could reduce this value to 0.5 percent and, in addition, could help to reduce the number of pallets by ten percent and the number of losses of pallets by 50 percent. Thus, the project aimed at the development of solutions for a more efficient and effective flow of goods within the consumer goods supply chain. To achieve this, the project addressed four areas of work. Firstly, in the area of material flow technology, pallets were equipped with “hybrid” AutoID technology – i.e. a combination of one- and two-dimensional barcode systems and RFID chips. For example, within the cycle of euro pallets between Mars, Chep and Rewe, 18 read points were installed to read more than 2,600 SmaRTI pallets creating more than 90,000 EPCIS11 events. Secondly, business processes were redesigned in order to allow for more decentralized process flow control. In other words, the process no longer determines the exact flow of a pallet through the supply chain but the pallet finds its way on its own.
10 See http://www.smart-rti.de/ for more information about the project.
11 Abbreviation for Electronic Product Code Information Services. The standard can be retrieved under http://www.gs1.org/gsmp/kc/epcglobal/epcis/epcis_1_0_1-standard-20070921.pdf.
Thirdly, the flow of information increased in volume and quality. Through the close coupling of information and material flow, a data service was required to manage real-time data of load carriers. A cloud-based solution provides access to the rich source of data to all involved partners in the supply chain. Fourthly, a business intelligence platform was established to leverage the data, turn it into information and improve decision-making.
5.2.3. Data Resource Perspective The new role of pallets in the supply chain characterizes the data resource perspective in Case B. In the past, pallets were not tracked at all and did not provide data about their location, status etc. From an information management perspective, pallets were mainly managed as stock items. Today, the richness of available data on the load carrier and the products carried allows for more advanced business process and business intelligence scenarios. Table 5 shows the relevant data items in Case B.

Data domain | Data items
Retailer | Business location; Business step; Business transaction; Customer identification; Customer name; Purchase order; Read point
Supplier | Business location; Business step; Business transaction; Read point; Sales order; Transport order
Pallet Provider | Business location; Read point
Pallet | Business step; Business transaction; Disposition ID; EPCIS events (Aggregation event, Object event, Quantity event, Transaction event); Location; Time
Product | GTIN; Product name
Reader | Business location; Logical reader ID; Physical reader ID

Table 5: Data Items in Case B

Figure 5 shows the ER diagram for Case B.
Figure 5: Important Data Objects and Their Relations in Case B
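The pallet-related data items of Case B can be illustrated with a short Python sketch that records read events and reconstructs the path of a single pallet. The class `PalletEvent`, its field names and all sample values are assumptions made for illustration; the sketch is a simplification and does not implement the EPCIS schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PalletEvent:
    """Simplified read event, loosely inspired by the data items in Case B."""
    pallet_id: str           # hypothetical identifier, not a real EPC
    event_type: str          # e.g. "ObjectEvent" or "AggregationEvent"
    business_location: str
    business_step: str
    read_point: str
    time: datetime


def trace_pallet(events, pallet_id):
    """Return the (location, step) sequence of one pallet in time order."""
    own_events = sorted(
        (e for e in events if e.pallet_id == pallet_id),
        key=lambda e: e.time,
    )
    return [(e.business_location, e.business_step) for e in own_events]


# Hypothetical events; deliberately not listed in chronological order.
events = [
    PalletEvent("P1", "ObjectEvent", "retailer_dc", "receiving",
                "gate_7", datetime(2013, 5, 2, 14, 0)),
    PalletEvent("P1", "AggregationEvent", "supplier_plant", "shipping",
                "gate_1", datetime(2013, 5, 2, 8, 0)),
    PalletEvent("P2", "ObjectEvent", "supplier_plant", "shipping",
                "gate_1", datetime(2013, 5, 2, 9, 0)),
    PalletEvent("P1", "ObjectEvent", "pallet_provider", "inspecting",
                "gate_3", datetime(2013, 5, 3, 10, 0)),
]

print(trace_pallet(events, "P1"))
```

Sorting by event time is what turns isolated reads at different read points into a traceable flow of the load carrier across supplier, retailer and pallet provider.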
5.3. Case C: Purchasing and Business Partner Data 5.3.1. Context As part of a CTI12 funded research project, the Institute of Information Management at the University of St. Gallen and the Business Engineering Institute develop a collaborative approach for business partner data management – referred to as the Corporate Data League (CDL). The project comprises three industry partners, namely Nestlé, Syngenta and Bayer. The CDL embraces the idea of a growing community of organizations that share their business partner data. The approach is based on the assumption that the quality of business partner data has a “natural” limit as long as data is maintained by one company only. Address changes or mergers of business partners, for example, lead to outdated data. The maintaining company will not necessarily notice these changes. Only the data owner (Otto, 2011) – in this case the business partner itself – has the knowledge about the correct pieces of data (e.g. addresses).
12 Commission for Technology and Innovation (CTI) of the Swiss Federal Department of Economic Affairs, Education and Research. Details under http://www.kti.admin.ch/?lang=en.
Consequently, companies are continuously facing the question whether they can trust their data or not. Trust in business partner data mainly refers to the question whether a data item is accurate and reliable. Especially in the case of business partner addresses, it may be possible to find out if an address really exists – an existing address, however, does not necessarily mean that a certain business partner is really located there. In order to increase the reliability of their data, companies typically verify a particular address against multiple sources such as external data providers or governmental registers. Still, companies are required to trust third parties that usually do not maintain a business relationship with the business partner they provide data about. Furthermore, many companies maintain data of the same business partners. Thus, from a networked industry view, data maintenance is carried out redundantly. The CDL proposes that much of the data maintenance effort could be saved – while increasing data quality at the same time – if different companies shared and maintained their business partner data collaboratively. In other words, a group of companies trusting each other would lead to a “golden record” of business partner data. The quality of this golden record and the trust in it increase with the extent to which business partners overlap between companies. The bigger the overlap, the higher the likelihood that erroneous data can be detected and corrected through the CDL. The technological foundation of the CDL is a cloud-based platform. CDL companies can connect to the platform through web services.
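The idea that overlapping business partner data helps to detect erroneous entries can be sketched in Python as follows. The source does not describe how the CDL consolidates records, so the majority vote used here is an assumption, as are the function name `golden_record`, the attribute names and the sample data.

```python
from collections import Counter


def golden_record(records):
    """Consolidate several companies' records of the same business partner.

    records: list of dicts, one per contributing company.
    Returns, per attribute, the most frequent value and the share of
    contributing companies that agree with it (the "support").
    """
    attributes = sorted({name for record in records for name in record})
    golden = {}
    for name in attributes:
        values = [record[name] for record in records if record.get(name)]
        if not values:
            continue  # no company maintains this attribute
        # On a tie, most_common keeps the value that was seen first.
        value, votes = Counter(values).most_common(1)[0]
        golden[name] = {"value": value, "support": votes / len(values)}
    return golden


# Hypothetical records of one supplier as maintained by three companies.
records = [
    {"name": "Example Supplies AG", "street": "Hauptstrasse 1", "city": "Basel"},
    {"name": "Example Supplies AG", "street": "Hauptstrasse 1", "city": "Basel"},
    {"name": "Example Supplies AG", "street": "Bahnhofstrasse 5", "city": "Basel"},
]

consolidated = golden_record(records)
for attribute, entry in consolidated.items():
    print(attribute, entry["value"], round(entry["support"], 2))
```

In this toy example the deviating street entry of the third company becomes visible only because three companies maintain the same business partner, which mirrors the argument that a bigger overlap raises the chance of detecting outdated data.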
5.3.2. Business Perspective The CDL is a community that establishes a network between its member organizations through the inter-linkage of business partner data. However, membership in the CDL community is linked to some prerequisites CDL candidates have to fulfill such as, for example, a certain maturity in managing the data resource and the commitment to take over responsibility for certain data objects. Furthermore, there is trust that needs to exist between the members in order to make the CDL work. Therefore, the CDL can be viewed as an exclusive “club” of members that collaboratively manage business partner data as a “digital club good”. The quality of business partner data is crucial for various reporting purposes and many business processes. Examples are global spend analyses in purchasing and risk assessments in order to comply with legal export and import restrictions. Three major challenges exist when it comes to managing business partner data (Madnick, Wang & Zhang, 2002):
– Unambiguous identification of business partners
– Reliability of business partner hierarchies
– Correct and transparent relationships between business partners
The ambition to overcome these challenges motivates companies to become members of the CDL. The business rationale behind this motivation is manifold. Firstly, there is the improvement of purchasing conditions, as high-quality data about legal hierarchies of suppliers is a prerequisite for reliable analyses. Secondly, there is risk management, as companies need, for instance, to analyze their risk exposure through dependencies on a particular supplier. Thirdly, in the case of consolidating revenues along customer hierarchies, companies want to create transparency on potential non-payment risks. All the business partner data needed to achieve these goals would be shared in the CDL in a high-quality, trustworthy manner.
5.3.3. Data Resource Perspective Business partner data typically includes organizational data (e.g. addresses, industry classifications), contact information (e.g. phone numbers, e-mail addresses) and bank data (Madnick, Wang, Dravis & Chen, 2001). While some business partner data attributes are unique to a company’s business operations (pricing conditions, for example), many others are common to different companies (address data, for example). The CDL focuses on global, public data items, i.e. the ones that are not unique to a company’s business operations. Moreover, only a few global and generally public data items cause most of the maintenance effort. These items are ideal candidates for data collaboratively managed in the CDL and they constitute the entities of the first version of the CDL data architecture. Address data, for instance, is based on the extensible Address Language (xAL), which is part of the OASIS Customer Information Quality standard (see footnote 13).
13 See https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=ciq for details.
When designing the CDL data architecture, the following points became evident.
– The business object “business partner” is not unambiguously understood across companies and, thus, data objects do not consistently represent it in information systems. This situation impedes the design of a shared data model.
– The quality of address data is country-specific. In some countries, local postal services provide address reference data that allows for higher data quality. Moreover, different address formats in different countries lead to inappropriate use of data items. In addition, the use of different languages and character sets in addresses aggravates data quality problems, as it prevents the same address from being linked across systems with different language bases. Standardized general address data items help to overcome these challenges, e.g. by the use of the extensible Address Language contained in the OASIS CIQ standard.
– Public availability of data accelerates data sharing, whereas company-specific (in other words, private) data such as pricing conditions is a barrier.
– The unique identification of a certain business partner is a challenge in all organizations under study. External data vendors often provide unique identifiers, but linking them to the correct business partner remains a difficult task.
– Organizations are improving the reliability of their business partner data by consulting external data services. However, the reliability of the data provided is limited. Sharing business partner data with other organizations in a network that do business with the same business partner can significantly improve the reliability of data.
Table 6 shows the relevant data items in Case C.
Data domain | Data items
Business Partner | Unique CDL Id; Individual Id; (Legal) name; Legal form; Tax number; Business partner hierarchy; Business partner type; Certificates (not yet); Banking data (not yet)
Location | Country; Administrative area; Subadministrative area; Postal code; Locality; Sublocality; Thoroughfare; Subthoroughfare; Premise; Subpremise; Geo-coordinates; Post office box; Address type
Table 6: Data Items in Case C
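As an illustration of how the data items of Table 6 might be represented, the following sketch models a subset of them as Python data classes. The attribute names, the one-to-many relation between business partner and location, and the representation of the business partner hierarchy as a parent reference are assumptions made for this example; they are not taken from the CDL data architecture.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Location:
    # Subset of the "Location" items in Table 6 (names follow xAL-style terms).
    country: str
    locality: str
    postal_code: Optional[str] = None
    thoroughfare: Optional[str] = None
    premise: Optional[str] = None
    address_type: Optional[str] = None


@dataclass
class BusinessPartner:
    # Subset of the "Business Partner" items in Table 6.
    cdl_id: str
    legal_name: str
    legal_form: Optional[str] = None
    tax_number: Optional[str] = None
    parent_cdl_id: Optional[str] = None  # assumed encoding of the hierarchy
    locations: List[Location] = field(default_factory=list)


def ultimate_parent(partner_id, partners):
    """Walk up the hierarchy until a partner without a parent is reached."""
    seen = set()
    current = partners[partner_id]
    # The 'seen' set guards against accidental cycles in the hierarchy data.
    while current.parent_cdl_id is not None and current.cdl_id not in seen:
        seen.add(current.cdl_id)
        current = partners[current.parent_cdl_id]
    return current


# Hypothetical example records.
holding = BusinessPartner(cdl_id="CDL-1", legal_name="Example Holding", legal_form="AG")
subsidiary = BusinessPartner(
    cdl_id="CDL-2",
    legal_name="Example Logistics",
    legal_form="GmbH",
    parent_cdl_id="CDL-1",
    locations=[Location(country="DE", locality="Dortmund", postal_code="44227")],
)
partners = {p.cdl_id: p for p in (holding, subsidiary)}
top = ultimate_parent("CDL-2", partners)
```

Resolving the ultimate parent in this way corresponds to the kind of hierarchy analysis that the business perspective names as a motivation, for example spend analyses along supplier hierarchies.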
Analogously to the two previous cases, Figure 6 shows the ER diagram for Case C.
Figure 6: Important Data Objects and Their Relations in Case C
5.4. Electronic Commerce and Anticipatory Shipping 5.4.1. Context In December 2013, Amazon patented a method and system for anticipatory package shipping in the US (Spiegel, McKenna, Lakshman & Nordstrom, 2013). The company aims at shipping goods before the customer places an order, reducing delivery time by predicting what buyers are going to buy before they actually do so.
5.4.2. Business Perspective The patent supports multiple use cases. In the first use case, one or more items are selected as a package for speculative shipment to a destination geographical area. At the time of shipment, the package’s delivery address is not completely specified; it is completed while the package is in transit.
Figure 7: Information Systems Architecture in Case D (Spiegel et al., 2013)
In a second use case, Amazon tries to match some of the goods in transit with current customer orders. This method aims at determining the package closest to the potential buyer. The common carriers’ transportation networks (such as those of UPS, DHL or FedEx) act as a dynamic, position-changing warehouse. One aspect of this use case is to turn the disadvantage of lower-cost transportation, i.e. non-expedited delivery, into a buffer for the speculatively selected items. If one of the items in transit matches a customer’s demand, the delivery time will be similar to express delivery in the majority of cases. Thus, Amazon, or the customer, saves transportation costs. In detail, items can be tracked as they arrive at a fulfillment center from suppliers; they are then taken into inventory, where they are later picked for packaging and shipment. Items are identified using bar codes, magnetically or optically readable characters, or other types of scanning technologies. As a package is speculatively shipped to a destination area, data regarding this package is stored within the data warehouse. If a customer places
a similar order for one or more items in the package, the fulfillment information system queries the data warehouse to determine whether a speculatively shipped package in transit partially or completely satisfies the demand. If the item is already in a delivery truck, the complete delivery address can be transmitted to the vehicle via radio, satellite, messaging etc. to instruct the driver to deliver the item to the specific address. Additional address labeling would not be necessary in this case. If two or more packages fit the order’s requirements, Amazon would use historical shipping data from the common carriers to predict which of the potential packages is closest to the customer’s delivery address. Once the closest package is determined, the particular delivery address is transmitted to the location of the package. In a third use case, the information about package availability may be determined before a customer places an order and is then shown to the customer during the shopping process, with the aim of increasing the likelihood that the customer actually makes the purchase. This offer is made while the customer interacts with the e-commerce portal to view various items and browses a particular item. The e-commerce portal interacts with the fulfillment information system to determine which speculatively shipped packages that include the particular item are in transit to the customer’s location. The shopping process itself generates much of the required forecast data, e.g. the specific web pages viewed and the duration of views, the overall length of a customer’s visit to the e-commerce portal, links hovered over with the mouse arrow and the duration of hovering, shopping cart, wish lists etc. This data can be combined with specific customer data received from other channels, such as telephone inquiries, responses to marketing materials or salesperson contact. Another important point is the relatedness with previous purchases, e.g. in the case of new product releases, where no historical buying patterns from similar customers exist. Figure 7 shows the Information Systems Architecture in Case D.
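The matching step of the second use case can be sketched as follows. This is a simplified illustration, not the patented method: it only considers packages that completely satisfy an order, and it assumes that an estimated distance to the customer's delivery address is already available for every package, whereas the text above describes predicting closeness from historical carrier data. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass
class InTransitPackage:
    package_id: str
    items: FrozenSet[str]
    distance_km: float  # assumed: estimated distance to the customer's address


def match_order(
    ordered_items: Iterable[str], packages: List[InTransitPackage]
) -> Optional[InTransitPackage]:
    """Return the closest in-transit package that contains all ordered items."""
    ordered = frozenset(ordered_items)
    # Simplification: only complete matches are considered, no partial ones.
    candidates = [p for p in packages if ordered <= p.items]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.distance_km)


# Hypothetical speculatively shipped packages.
packages = [
    InTransitPackage("P1", frozenset({"book", "lamp"}), 120.0),
    InTransitPackage("P2", frozenset({"book"}), 35.0),
    InTransitPackage("P3", frozenset({"lamp"}), 5.0),
]
best = match_order(["book"], packages)
no_match = match_order(["phone"], packages)
```

When no package qualifies, the function returns None, which would correspond to fulfilling the order from regular inventory.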
5.4.3. Data Resource Perspective Table 7 shows the data items required to implement the patented use cases.
Data domain | Data items
Customer | Customer ID; browsing habits (e.g. web pages viewed, duration of views, links or items hovered over with the mouse arrow); shopping cart habits (historical patterns, e.g. items placed in the cart but finally removed without being purchased); wish lists (historical patterns, e.g. items placed on the list but finally removed without being purchased)
Products | Forecasted shipping data (risk of redirecting to another location, costs of redirecting an item, successful speculative shipments in the past); real-time product status (detailed location of current in-transit items, number of items per package)
Online Retailer | Prices; web shop URL
Table 7: Data Items in Case D
Figure 8 shows the ER diagram of Case D.
Figure 8: Important Data Objects and Their Relations in Case D
6. A Morphology of Data in a Networked Industry
6.1. Morphology Overview The case analysis, combined with the input from literature, leads to a morphology of the data resource in the networked industry (see Table 8).
Dimension | Characteristics
Business criticality | Competitive advantage; Compliance relevant; Operations relevant
Data classification | Private; Public; Purpose-related
Data domain type | Account; Party; Thing; Other
Data format | ASCII; Audio; JPEG; Video; Numeric; XML
Data management level | Class; Instantiation
Data occurrence | Batch; Stream
Data ownership | Owned by one legal entity; “Club” good; Public good
Data quality | Authoritative; Within tolerance, fuzzy; Below thresholds
Data source | Internal; External
Data standardization | Semantics; Syntax; Values
Data trustworthiness | Not trusted; Trusted
Data sharing | Open; Free; Proprietary
Data maintenance costs | Low; Medium; High
Table 8: Morphology of the Data Resource in the Networked Industry
The morphology comprises eleven dimensions with two to four potential values each.
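To make the structure of the morphological box concrete, the following sketch encodes a subset of the dimensions from Table 8 as a plain mapping and checks a characterization of a data item against it. It assumes that a data item takes exactly one characteristic per dimension, which the morphology itself does not prescribe; the names `MORPHOLOGY` and `validate` are illustrative only.

```python
# Subset of the dimensions and characteristics from Table 8.
MORPHOLOGY = {
    "Business criticality": {
        "Competitive advantage",
        "Compliance relevant",
        "Operations relevant",
    },
    "Data classification": {"Private", "Public", "Purpose-related"},
    "Data ownership": {"Owned by one legal entity", "Club good", "Public good"},
    "Data sharing": {"Open", "Free", "Proprietary"},
    "Data maintenance costs": {"Low", "Medium", "High"},
}


def validate(characterization):
    """Return a list of problems; an empty list means the characterization is valid.

    Assumption: one characteristic per dimension; dimensions may be left out.
    """
    problems = []
    for dimension, value in characterization.items():
        if dimension not in MORPHOLOGY:
            problems.append(f"unknown dimension: {dimension}")
        elif value not in MORPHOLOGY[dimension]:
            problems.append(f"invalid characteristic for {dimension}: {value}")
    return problems


# Hypothetical characterization of an address data item.
address_item = {
    "Data classification": "Public",
    "Data ownership": "Public good",
    "Data sharing": "Open",
}
address_problems = validate(address_item)
typo_problems = validate({"Data sharing": "Shared"})
```

Keeping the morphology as data rather than code means that further dimensions can be added without changing the validation logic.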
6.2. Morphology Details Business criticality is a measure of the impact of data on business success. An example of data with impact on competitive advantage is data about a new recipe in the food or healthcare industry. Other data elements are even more critical, as they are relevant for a company to report compliance with legal and governmental regulations (such as ingredients of food or healthcare products). Data relevant for operations, such as the RFID data about load carriers in Case B, facilitates more efficient supply chains.
Data classification refers to the confidentiality levels that apply to data elements. In general, data can be private, public or disclosed for use in a certain context. Many of the “opt in” regulations for personal information in the consumer-centric scenario fall into the third category.
Data format refers to what in practice is often discussed as “structured” vs. “unstructured” data. The morphology uses the term to denote the encoding format in which the data is stored.
Two levels can be distinguished regarding data management. Managing data on class level conforms to traditional relational database approaches in which entities are described (for example, a material), master data represents the object, and transactional and inventory data refer to that master data. In the consumer goods industry, the GTIN identifies a certain class of products. The networked industry, on the other hand, requires an increasing portion of data to be managed on instantiation level. Case C, for example, does not manage a class of pallets, but every single instance.
Data occurrence refers to the way data is gathered. Batches represent the incremental way of data gathering, in which sets of records are processed. In contrast, much data in the networked industry has to be processed in streams (e.g. social media streams, shop-floor data streams).
Data ownership follows the understanding of data being treated as physical goods (e.g. resources, assets) (Fisher, 2009). If a legal entity exercises control over data items, it takes over ownership of them similarly to private economic goods. Alternative dimension values are “club goods” and public goods. Addresses, for example, are public data goods.
Data quality forms another dimension of the morphology. One can distinguish between accurate data that is stored in an authoritative data source, data of a quality within a tolerance level and data below a certain quality threshold. Another dimension refers to the question of whether data is sourced internally or externally from an enterprise perspective. Many product data items are created, and thus sourced, internally, whereas addresses, for example, are often purchased from data service providers.
Data standardization refers to the extent to which data complies with standards. Data can be standardized on three levels: on a syntactic level, on a semantic level and regarding the values that can be assigned to the data item. Data trustworthiness describes the extent to which the data can be trusted. Data sharing is concerned with the legal shareability of data. One can distinguish between data that is not allowed to be shared, e.g. proprietary solvency information that is purchased from a data vendor, and data that is completely open to being shared. Moreover, data may be free to be shared on condition that the original data source is always disclosed. This distinction follows the known types of software licenses, where open source, free and proprietary software are usually differentiated. Data maintenance costs describe the effort needed for maintaining a certain data object or item within an organization or across organizational boundaries, e.g. in collaboration with business partners.
7. Taxonomy Application
7.1. Method design The data morphology presented in Section 6 is not an end in itself but serves a business purpose: it is designed to support data architecture management in a networked industry. General advantages of methods are the guidance and acceleration of development processes as well as the incorporation of good practices and the transferability of method fragments (Brinkkemper, 1996; Gutzwiller, 1994; Olle, 1991). The method aims to provide a procedure model for data architecture design, to create transparency about the characteristics of the data resource and to help analyze and assess risks and opportunities arising from the specifics of the data resource in the networked industry.
7.2. Method overview The method is described along its fragments and is structured according to the activities in a procedure model. For each method fragment the design result, activities producing the result, the roles involved and the techniques used are described (Bucher, 2009).
The method follows a three-phase approach: Phase I identifies the domain and the scope of the analysis and architecture design. In particular, the data resource under study is elaborated in terms of the data objects and data items to be considered. Phase II analyzes the data resource using the data morphology. Firstly, the morphology is applied to the identified data items; in a second step, this allows the data resource to be analyzed regarding risks and opportunities based on the identified scope of analysis. Finally, phase III is concerned with the elicitation of design requirements for the data architecture. In a final step, the data architecture for the data resource is designed according to the design requirements. Figure 9 shows an overview of the method.
Figure 9: Method for Morphology Application
7.3. Method fragments 7.3.1. Phase I: Identify domain and scope Activity I.1 (see Table 9) is concerned with the definition of the scope of analysis. The results of this activity are two-fold. On the one hand, the data domain is clearly identified in order to narrow down the scope of analysis to a manageable complexity. On the other hand, analysis objectives need to be recognized. For example, all data items of a distinct data object that are relevant for compliance should be identified with the purpose of eliciting requirements for the data architecture. For instance, the data architecture must distinguish compliance-relevant data items from other data items that are irrelevant from a compliance perspective.
I.1 Define the scope
Result | Data domain and analysis objectives
Roles | Data steward
Techniques | Process analysis; Expert interviews; Workshops/Focus groups
Input | Process documentations, business objectives, data model
Table 9: Activity I.1
The second activity within phase I comprises the identification of data objects and their corresponding data items (see Table 10). This activity is of particular importance as the data items jointly constituting a data object influence the instantiation of the morphology. The result of this activity is a list of data objects and their items that should be analyzed. It serves as an input for the second phase of the method.
I.2 Identify data objects and items
Result | List of data objects and items to be analyzed
Roles | Data steward, data architect, data owner
Techniques | Process analysis; Expert interviews; Workshops/Focus groups
Input | Process documentations, descriptions of applications, data model
Table 10: Activity I.2
7.3.2. Phase II: Analyze During the analysis phase, the morphology is instantiated. First, in activity II.1, each dimension of the morphology is analyzed with regard to the matching characteristics for every data object identified in the previous phase (see Table 11). The results are visualized in the form of a “data heat map”, which can serve two purposes using different representations:
– The selected characteristics of the data object are visualized as a whole. In other words, an aggregated view of the characteristics of all data items that constitute the data object is provided. Characteristics that match more frequently than others are marked more intensively than their counterparts.
– A certain dimension is selected and each characteristic is assigned a shade of color. The data heat map then consists of several selected data items of a certain data object. The matching characteristics for each data item determine the marking color.
While for the first purpose the map provides the user of the morphology with an aggregated view in order to analyze the data object as a whole, the second purpose allows for distinguishing data items with respect to certain selected characteristics. For instance, data items that are characterized as public can be distinguished more easily from those that are private. Knowing the individual characterization of each data item can, for example, help to understand the instantiation of the morphology for the whole data object. So, depending on the identified analysis objectives, the data heat map supports a thorough analysis of each data item.
II.1 Create transparency
Result | Data heat map
Roles | Data steward, data owner, data scientist
Techniques | Characterize data objects for each morphology dimension
Input | Morphology of data in a networked industry, list of data objects and items
Table 11: Activity II.1 In a second step of phase II, the results of the morphology application are analyzed with respect to the analysis objectives (see Table 12). The data heat map, resulting from the previous activity, provides the input for the analysis. Potential risks and opportunities for the selected data objects are examined with reference
to the design options for the data architecture. For example, risks may arise for data objects that are compliance-relevant (e.g. ingredients information) if they are redundantly stored and, thus, maintained in several databases of different network actors. On the other hand, opportunities may arise, e.g. in the case of address data. According to the characteristics shown in the data heat map, one can identify that maintaining address data in collaboration with different network actors could significantly improve data quality and reduce the maintenance effort. These findings need to be considered when designing the data architecture in a further step of the method.
II.2 Analyze and assess
Result | Documentation of risks and opportunities related to each data object that is characterized using the morphology
Roles | Data scientist, data architect
Techniques | Analysis of data heat maps
Input | Data heat map
Table 12: Activity II.2
7.3.3. Phase III: Design The third phase of the method deals with deriving design requirements for the data architecture of the analyzed data objects and, finally, with the actual design of the architecture (see Table 13). Design requirements are elicited based on the analysis results. Generally, certain patterns of data characteristics might be identified that lead to similar requirements on the data architecture. Based on various applications of the morphology in different contexts, a knowledge base of morphology patterns and corresponding data architecture design requirements can be established. Enterprises in the networked industry coping with the challenge of designing data architectures could then use these patterns to save design effort and possibly improve the interoperability of their architectures.
III.1 Derive design requirements
Result | Requirements list, design patterns
Roles | Data architect, data steward
Techniques | Requirements elicitation techniques
Input | Data heat map and documentation of risks and opportunities
Table 13: Activity III.1
Finally, as part of the last activity, the data architecture is designed based on the results of the previous activities (see Table 14). Design patterns and identified requirements are the input to design a conceptual model which integrates the subarchitectures of related network actors and considers the data flows. Moreover, as the second part of the data architecture, the application architecture is modeled using the conceptual model and the analysis results. This approach allows for the inclusion of applications across organizational boundaries. For instance, cloud applications providing services for creating, storing and updating certain data objects in collaboration with other network actors can be explicitly considered. Thereby, the whole data architecture as a result of applying this method has the objective to facilitate data integration between enterprises in the network. Reusable design patterns for data objects with similar characteristics simplify and speed up this design process.
III.2 Design data architecture
Result | Data architecture for the networked industry
Roles | Data architect
Techniques | Conceptual data modeling
Input | Design patterns, requirements list
Table 14: Activity III.2
7.4. Method demonstration 7.4.1. Demonstration approach The online encyclopedia Wikipedia shows that collaborative data management by communities of web users is an approach that can result in high-quality content (Arazy, Nov, Patterson & Yeo, 2011). However, such collaboration mechanisms cannot automatically be obtained in business environments (Majchrzak, Wagner & Yates, 2006; Wagner & Majchrzak, 2007). They can only be achieved for limited applications, as the selection of a data resource for collaborative data management is highly dependent on the data object’s characteristics. Data records providing a competitive advantage, for instance, will not be shared by a company. Thus, the method for morphology application is applied to identify the influence of the CDL (see Case C) on the data architecture from the perspective of a member organization.
7.4.2. Purchasing and business partner data The analysis scope is defined as assessing the suitability of business partner data, in particular supplier data, for collaborative maintenance. All data items representing a supplier in the CDL, including address data, were selected. Moreover, further supplier data items that are not within the scope of the CDL were included to cover most of the data an enterprise needs to maintain. These are, for example, purchasing conditions and solvency information. Figure 10 shows the resulting consolidated data heat map for business partner data.
Figure 10: Heat Map for Business Partner Data
One can recognize that some data items provide a competitive advantage. Examples are pricing conditions or records of strategic suppliers, which provide an advantage over competitors. Moreover, some data items are characterized as private. Again, this can be pricing conditions but also enterprise-internal IDs for each business partner. Syntax and values are highly standardized, but regarding semantics, a common understanding of certain data items is lacking. Most of the data items are public and open, which makes them suitable for sharing. However, some data is also proprietary, e.g. solvency data from certain business information providers.
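The aggregated representation of the data heat map described in Section 7.3.2 can be sketched as a simple frequency count: for every dimension, the share of data items matching each characteristic determines how intensively that characteristic would be marked. The item characterizations below are illustrative values, not the data behind Figure 10, and the function name `heat_map` is hypothetical.

```python
from collections import Counter, defaultdict


def heat_map(item_characterizations):
    """Aggregate per-item characterizations into shares per characteristic.

    Returns {dimension: {characteristic: share of data items}}; a higher share
    corresponds to a more intensive marking in the aggregated heat map.
    """
    counts = defaultdict(Counter)
    for characterization in item_characterizations.values():
        for dimension, characteristic in characterization.items():
            counts[dimension][characteristic] += 1
    total = len(item_characterizations)
    return {
        dimension: {c: n / total for c, n in counter.items()}
        for dimension, counter in counts.items()
    }


# Illustrative characterizations of four supplier data items (assumed values).
items = {
    "Legal name": {"Data classification": "Public", "Data sharing": "Open"},
    "Postal code": {"Data classification": "Public", "Data sharing": "Open"},
    "Locality": {"Data classification": "Public", "Data sharing": "Open"},
    "Pricing conditions": {
        "Data classification": "Private",
        "Data sharing": "Proprietary",
    },
}
heat = heat_map(items)
```

The second representation, one dimension across individual data items, needs no aggregation: it is simply the column `items[name][dimension]` for each selected data item.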
A more thorough analysis of the data heat map identifies risks for the collaborative maintenance of business partner data that need to be considered in the data architecture. Two examples illustrate this. First, poor semantics could make it impossible to establish a common understanding with other organizations. Second, wrong compliance-relevant data is critical, and such data would be in danger of being inaccurately maintained by other members. On the other hand, several opportunities were uncovered. For example, most of the data is characterized as public and open while the maintenance costs are high. Collaborative maintenance may provide a big opportunity for these specific data items. Based on these findings, the requirements for the data architecture of the CDL are derived. One important requirement is that no third organization should be able to relate a data object to the organizations using it. Secondly, in order to address the risk of having wrong compliance-relevant data, the data architecture needs to provide versioning and additional local checks of this data. Additionally, a design pattern can be recognized which shows that public and open data of the business partner domain is in general suitable for sharing with other organizations.
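The findings above can be read as simple rules over the characterization of a data item. The following sketch encodes them as such; the rule set, the wording of the findings and the function name `assess` are assumptions made for illustration and do not represent a complete or validated pattern knowledge base.

```python
def assess(item):
    """Derive risks and opportunities from one data item's characterization."""
    findings = []
    # Pattern from the demonstration: public, open and costly to maintain.
    if (
        item.get("Data classification") == "Public"
        and item.get("Data sharing") == "Open"
        and item.get("Data maintenance costs") == "High"
    ):
        findings.append("opportunity: candidate for collaborative maintenance")
    # Risk from the demonstration: wrong compliance-relevant data is critical.
    if item.get("Business criticality") == "Compliance relevant":
        findings.append("risk: require versioning and additional local checks")
    # Private or proprietary data is treated here as a barrier to sharing.
    if (
        item.get("Data classification") == "Private"
        or item.get("Data sharing") == "Proprietary"
    ):
        findings.append("risk: not suitable for sharing")
    return findings


# Hypothetical characterizations.
address = {
    "Data classification": "Public",
    "Data sharing": "Open",
    "Data maintenance costs": "High",
}
solvency = {"Data sharing": "Proprietary"}
address_findings = assess(address)
solvency_findings = assess(solvency)
```

Collecting such rules across several applications of the morphology would be one way to build up the knowledge base of patterns mentioned in phase III.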
8. Conclusion 8.1. Summary The paper addresses the challenge of managing the complexity of the data resource in the networked industry by introducing a data morphology. Like taxonomic analyses in general, the data morphology aims at providing structure to a complex phenomenon. Four cases illustrate the variety and heterogeneity of data and form a foundation for the development of a morphological box. The morphological box represents the different dimensions that can be applied to data items as well as their potential instantiations. Besides the data morphology itself, the paper proposes a method for applying the morphology in the process of designing or updating a data architecture.
8.2. Contribution The paper makes a two-fold contribution as both the practitioners’ and the scientific community benefit from its results. From a scientific perspective, the
morphology is among the first artifacts that help structure the data perspective in the networked industry. Following Gregor’s (2006) understanding of theory in Information Systems research, the morphology meets the criteria of an “analytical theory”. This group of theories mainly aims at analyzing and describing “what is”. Analytical theories form the foundation for theories of “higher order” which identify causalities between variables and explain or even predict their behavior. A method, such as the one presented in this paper for applying the morphology, can in general be seen as a design theory, i.e. one that prescribes how artifacts are supposed to be designed. However, the method in this paper does not claim to be a “fully-fledged” design theory but rather is a first step in this direction. Practitioners, in particular data managers, data architects and data scientists working in the networked industry, benefit from the morphology and the method alike, as both add structure to their fields of work. The morphology can, for instance, function as a cornerstone of data mapping activities. The ultimate goal of data architecture management is to facilitate the strategic use of data, and the morphology represents a building block in companies’ endeavors to exploit the potential of the various incarnations of the networked industry.
8.3. Outlook The morphology and the method supporting its use are only first steps in a general revision of how data resource management and data architecture management will be performed in the future. With the proliferation of “data-driven businesses” and the continuing transformation from model-driven to data-driven management approaches, more research is needed to understand the data resource in the networked industry in greater detail. Trajectories for future research are more studies in the field (e.g. through case studies and action research projects) and design science approaches for new data management frameworks.
References
Addo-Tenkorang, R., Helo, P. T., Shamsuzzoha, A., Ehrs, M. & Phuong, D. (2012). Logistics & Supply Chains Management Tracking Networks: Data Management System Integration / Interfacing Issues. Paper presented at the Technology Management for Emerging Technologies (PICMET 12), Vancouver, Canada.
Alt, R. & Klein, S. (2011). Twenty years of electronic markets research – looking backwards towards the future. Electronic Markets, 21 (1), 41-51.
Arazy, O., Nov, O., Patterson, R. & Yeo, L. (2011). Information Quality in Wikipedia: The Effects of Group Composition and Task Conflict. Journal of Management Information Systems, 27 (4), 71-98.
Baskerville, R. (1997). Distinguishing Action Research from Participative Case Studies. Journal of Systems and Information Technology, 1 (1), 25-45.
Bernard, S. A. (2005). An Introduction to Enterprise Architecture (2nd ed.). Bloomington, IN: Authorhouse.
Bettoni, A., Alge, M., Rovere, D., Pedrazzoli, P. & Canetta, L. (2012). Towards Sustainability Assessment: Reference Data Model for Integrated Product, Process, Supply Chain Design. Paper presented at the 18th International Conference on Engineering, Technology and Innovation, Munich.
Brackett, M. H. (1994). Data sharing using a common data architecture. New York, NY: John Wiley.
Brancheau, J. C., Schuster, L. & March, S. T. (1989). Building and Implementing an Information Architecture. Data Base, 20 (2), 9-17.
Brinkkemper, S. (1996). Method Engineering: Engineering of Information Systems Development Methods and Tools. Information and Software Technology, 38 (4), 275-280.
Bucher, T. (2009). Ausrichtung der Informationslogistik auf operative Prozesse – Entwicklung und Evaluation einer situativen Methode. (Dissertation), Institut für Wirtschaftsinformatik, Universität St. Gallen, St. Gallen.
Bullinger, H.-J. & ten Hompel, M. (Eds.). (2007). Internet der Dinge. Berlin: Springer.
Chalasani, S. & Boppana, R. V. (2007). Data Architectures for RFID Transactions. IEEE Transactions on Industrial Informatics, 3 (3), 246-257. doi: 10.1109/TII.2007.904147.
Chen, P. P.-S. (1976). The Entity-Relationship Model: Toward a Unified View of Data. ACM Transactions on Database Systems, 1 (1), 9-36.
D’Amours, S., Lefrançois, P. & Montreuil, B. (1996). Design of an information system for a networked industry. Paper presented at the International Conference on Engineering and Technology Management (IEMC 96), Vancouver, Canada.
DAMA. (2008). The DAMA Dictionary of Data Management. Bradley Beach, NJ: Technics Publications.
DAMA. (2009a). DAMA Data Management Body of Knowledge (DMBOK): Functional Framework. Lutz, FL: DAMA International.
DAMA. (2009b). The DAMA Guide to the Data Management Body of Knowledge. Bradley Beach, NJ: Technics Publications.
Derakshan, R., Orlowska, M. E. & Li, X. (2007). RFID Data Management: Challenges and Opportunities. Paper presented at the 2007 IEEE International Conference on RFID, Grapevine, TX.
Dreibelbis, A., Hechler, E., Milman, I., Oberhofer, M., van Run, P. & Wolfson, D. (2008). Enterprise Master Data Management: An SOA Approach to Managing Core Information. Upper Saddle River, NJ: IBM Press.
Fisher, T. (2009). The Data Asset: How Smart Companies Govern Their Data For Business Success. Hoboken, NJ: Wiley.
Fleisch, E. & Mattern, F. (Eds.). (2005). Das Internet der Dinge: Ubiquitous Computing und RFID in der Praxis: Visionen, Technologien, Anwendungen, Handlungsanleitungen. Berlin: Springer.
Goodhue, D. L., Kirsch, L. J., Quillard, J. A. & Wybo, M. D. (1992). Strategic data planning: lessons from the field. MIS Quarterly, 16 (1), 267-274.
Goodhue, D. L., Quillard, J. A. & Rockart, J. F. (1988). Managing The Data Resource: A Contingency Perspective. MIS Quarterly, 12 (3), 373-392.
Gregor, S. (2006). The Nature of Theory in Information Systems. MIS Quarterly, 30 (3), 611-642.
Gutzwiller, T. (1994). Das CC RIM-Referenzmodell für den Entwurf von betrieblichen, transaktionsorientierten Informationssystemen. Heidelberg: Physica.
Horne, N. W. (1995). Information as an Asset: The Board Agenda. Computer Audit Update, (9), 5-11.
Howard, M., Vidgen, R., Powell, P. & Graves, A. (2001). Planning for IS Related Industry Transformation: The Case of the 3DayCar. Paper presented at the 9th European Conference on Information Systems (ECIS 2001), Bled, Slovenia.
Inmon, W. H. (1992). Data architecture: the information paradigm (2nd ed.). Boston, MA: QED Technical Publishing Group.
ISO. (2011). Systems and software engineering – Architecture description (ISO/IEC FDIS 42010).
King, S. J. & Nehyba, S. J. (1982). Data Resource Management. In R. L. Nolan (Ed.), Managing the Data Resource Function (2nd ed., pp. 185-209). St. Paul, MN: West Publishing Company.
Lampathaki, F., Mouzakitis, S., Gionis, G., Charalabidis, Y. & Askounis, D. (2009). Business to business interoperability: A current review of XML data integration standards. Computer Standards and Interfaces, 31 (6), 1045-1055.
Legner, C. (2009). Towards Service-Oriented Business Networking. (Habilitation), University of St. Gallen, St. Gallen.
Legner, C. & Schemm, J. (2008). Toward the Inter-organizational Product Information Supply Chain – Evidence from the Retail and Consumer Goods Industries. Journal of the AIS, 9 (3/4), 120-152.
Levin, B. M. (1998). Strategic networks: The emerging business organization and its impact on production costs. International Journal of Production Economics, 56-57, 397-405. doi: 10.1016/S0925-5273(96)00110-7.
Levitan, K. B. (1982). Information Resources as “Goods” in the Life Cycle of Information Production. Journal of the American Society for Information Science, 33 (1), 44-54.
Levitin, A. V. & Redman, T. C. (1998). Data as a Resource: Properties, Implications, and Prescriptions. Sloan Management Review, 40 (1), 89-101.
Madnick, S., Wang, R. Y., Dravis, F. & Chen, X. (2001). Improving the Quality of Corporate Household Data: Current Practices and Research Directions. Paper presented at the 6th International Conference on Information Quality, Cambridge, MA.
Madnick, S., Wang, R. Y. & Zhang, W. (2002). A Framework for Corporate Householding. Paper presented at the 7th International Conference on Information Quality, Cambridge, MA.
Majchrzak, A., Wagner, C. & Yates, D. (2006). Corporate Wiki Users: Results of a Survey. Paper presented at the 2006 International Symposium on Wikis, Odense, Denmark.
Meuser, M. & Nagel, U. (2009). The Expert Interview and Changes in Knowledge Production. In A. Bogner, B. Littig & W. Menz (Eds.), Interviewing Experts (pp. 17-42). Hampshire, UK: Palgrave Macmillan.
Morgan, D. L. & Krueger, R. A. (1993). When to use Focus Groups and why? In D. L. Morgan (Ed.), Successful Focus Groups (pp. 3-19). Newbury Park, CA: Sage.
Nelson, M. L., Shaw, M. J. & Qualls, W. (2005). Interorganizational System Standards Development in Vertical Industries. Electronic Markets, 15 (4), 378-392.
Nolan, R. L. (1982). Managing the Data Resource Function (2nd ed.). St. Paul, MN: West Publishing Company.
Olle, W. T. (1991). Information Systems Methodologies. Kent: Addison-Wesley.
Oppenheim, C., Stenson, J. & Wilson, R. M. S. (2003). Studies on Information as an Asset I: Definitions. Journal of Information Science, 29 (3), 159-166.
Österle, H., Fleisch, E. & Alt, R. (1999). Business Networking. Shaping Enterprise Relationships on the Internet. Berlin: Springer.
Otto, B. (2011). Data Governance. Business & Information Systems Engineering, 3 (4), 241-244. doi: 10.1007/s12599-011-0162-8.
Otto, B. & Aier, S. (2013). Business Models in the Data Economy: A Case Study from the Business Partner Data Domain. Paper presented at the 11th International Conference on Wirtschaftsinformatik (WI2013), Leipzig.
Parlanti, D., Paganelli, F. & Giuli, D. (2011). A Service-Oriented Approach for Network-Centric Data Integration and Its Application to Maritime Surveillance. IEEE Systems Journal, 5 (2), 164-175. doi: 10.1109/JSYST.2010.2090610.
Periasamy, K. P. & Feeny, D. F. (1997). Information architecture practice: research-based recommendations for the practitioner. Journal of Information Technology, 12 (3), 197-205.
Pienimäki, T. (2005). A Business Application Architecture Framework in Manufacturing Industry. Tampere: Tampere University of Technology.
Ponzio, F. J. (2004). Authoritative Data Source (ADS) Framework and ADS Maturity Model. Paper presented at the Ninth International Conference on Information Quality (ICIQ 2004), Cambridge, MA.
Ross, B. (2009). Ten tips to winning at consumer centricity: for retailers and manufacturers. Journal of Consumer Marketing, 26 (6), 450-454. doi: 10.1108/07363760910988265.
Schekkerman, J. (2006). How to Survive in the Jungle of Enterprise Architecture Frameworks. Trafford Publishing.
Schemm, J. W. (2012). Consumer-Centricity at Migros. Paper presented at the Focus Group Workshop “Consumer-Centric Information Management (CCIM)”, St. Gallen.
Schierning, A. (2012). Consumer-Centric Information Management: Exploring Product Information. Paper presented at the Business Engineering Forum 2012, Bregenz, Austria.
Shanks, G. (1997). The challenges of strategic data planning in practice: an interpretive case study. Journal of Strategic Information Systems, 6 (1), 69-90. doi: 10.1016/S0963-8687(96)01053-0.
Smart Service Welt Working Group. (2014). Smart Service Welt: Recommendations for the Strategic Initiative Web-based Services for Businesses. Berlin.
Spiegel, J. R., McKenna, M. T., Lakshman, G. S. & Nordstrom, P. G. (2013). United States Patent No. US 8,615,473 B2. A. Technologies.
Stewart, D. W., Shamdasani, P. N. & Rook, D. W. (2007). Focus Groups: Theory and Practice (2nd ed.). Thousand Oaks, CA: Sage.
Van den Hoven, J. (1999). Information resource management: Stewards of data. Information Systems Management, 16 (1), 88-91.
Wagner, C. & Majchrzak, A. (2007). Enabling Customer-Centricity Using Wikis and the Wiki Way. Journal of Management Information Systems, 23 (3), 17-43.
Wang, G. & Jin, G. (2008). Research and Design of RFID Data Processing Model Based on Complex Event Processing. Paper presented at the International Conference on Computer Science and Software Engineering, Wuhan, P.R. China.
Yin, R. K. (2002). Case study research: design and methods (3rd ed.). Thousand Oaks, CA: Sage Publications.
Yoon, V. Y., Aiken, P. G. & Guimaraes, T. (2000). Managing Organizational Data Resources: Quality Dimensions. Information Resources Management Journal, 13 (3), 5-13.
Zachman, J. A. (1987). A Framework for Information Systems Architecture. IBM Systems Journal, 26 (3), 276-292. doi: 10.1147/sj.263.0276.
Zwicky, F. (1969). Discovery, Invention, Research through the morphological approach. New York, NY: Macmillan.