SQuaRE-Aligned Data Quality Model for Web Portals
Carmen Moraga1, Mª Ángeles Moraga1, Coral Calero1, Angélica Caro2
1 Alarcos Research Group – Institute of Information Technologies & Systems, Paseo de la Universidad 4, 13071 Ciudad Real, Spain
2 Department of Computer Science and Information Technologies, University of Bio Bio, Chillán, Chile
Abstract
The Internet is a perfect environment for the daily exchange and publication of data, and Web portals serve as an important means to access information worldwide. Those who use these applications may find a considerable amount of data, which may be either correct or incorrect. However, it is important to obtain useful data and suitable information, since consumers use them in their everyday tasks. We therefore identify the need to evaluate data quality in Web portals and define a data quality model, named SPDQM (SQuaRE-Aligned Portal Data Quality Model), which is based on a previous model, PDQM (Portal Data Quality Model), and on ISO/IEC 25012, the data quality standard that is part of the SQuaRE (Software product Quality Requirements and Evaluation) family.
1. Introduction
A Web portal is a site that aggregates information from multiple sources on the Web and organizes this material in an easy, user-friendly manner [23]. Moreover, Web portals offer a broad array of resources and services for customers and business partners [22]. People who use the data from these applications in their work need to be sure that the data are of a sufficiently high quality. Data Quality (DQ) is often defined as the ability of a collection of data to meet user requirements [1, 20]. It is therefore important to provide high DQ in Web portals in such a manner that these portals can attract new visitors and potential customers. A model with which to evaluate the data quality in the environment of Web portals is therefore necessary.
With regard to data quality, the ISO/IEC 25012 [12], which is part of a series of International Standards under the general title of Software product Quality Requirements and Evaluation (SQuaRE), has recently been approved. The ISO/IEC 25012 standard determines a set of relevant characteristics for data quality. However, these characteristics are generic, and it is necessary to adapt them to each concrete environment, which has led us to carry out this adaptation for Web portal Data. This was done by using a data quality model for Web portals, named PDQM (Portal Data Quality Model), as a starting point. The remainder of this paper is organized as follows: Section 2 summarizes the PDQM model. Section 3 describes and analyzes the characteristics of the ISO/IEC 25012 standard. The method used to obtain a final model, named SPDQM (SQuaRE-Aligned Portal Data Quality Model) is defined in Section 4, and finally, Section 5 presents our conclusions and future work.
2. PDQM (Portal Data Quality Model)
PDQM [3] is a data quality model for Web portals which focuses on the perspective of the data consumer. This essentially means two things. Firstly, the model only evaluates the portal data which are accessible to the data consumer. Secondly, the model evaluates data in much the same way as a data consumer would. The development of PDQM was divided into two stages: the theoretical definition and the operational definition of the model.
The goal of the theoretical definition was to determine a set of DQ characteristics that are relevant to data consumers when evaluating the DQ of any Web portal. To do this, a set of DQ characteristics proposed in the literature for evaluating DQ in a Web context was collected, and the most relevant characteristics for a Web portal were selected (based on the functionality of a Web portal [7] and Internet users' DQ expectations [18]). This set was empirically validated, resulting in the final set of DQ characteristics for the model.
In order to obtain the operational version of PDQM, the characteristics were first organized into four DQ categories:
- Intrinsic, which denotes that data have quality in their own right.
- Operational, which emphasizes the importance of the role of systems; that is, the system must be accessible but secure.
- Contextual, which highlights the requirement stating that data quality must be considered within the context of the task in hand.
- Representational, which denotes that the system must present data in such a way that they are interpretable, easy to understand, and concisely and consistently represented.
Within each category, relationships of influence were then established between the characteristics to determine which characteristics depended on others. As a result, a BN (Bayesian Network) was obtained which organizes the 33 DQ characteristics into four network fragments (one for each DQ category). The BN graph is shown in Figure 1.
[Figure 1 node labels: PDQ; DQ_Intrinsic, DQ_Operational, DQ_Contextual, DQ_Representation; Accessibility, Accuracy, Amount of Data, Applicability, Attractiveness, Availability, Believability, Completeness, Concise Representation, Consistent Representation, Currency, Customer Support, Documentation, Duplicates, Ease of operation, Expiration, Flexibility, Interactivity, Interpretability, Novelty, Objectivity, Organization, Relevance, Reliability, Reputation, Response Time, Security, Specialization, Timeliness, Traceability, Understandability, Validity, ValueAdded]
Figure 1: Bayesian network which organizes the PDQM DQ characteristics
3. ISO/IEC 25012
SQuaRE is a set of International Standards which consists of different divisions. One of these is the 2501n family, which is focused on software product quality models and, as part of this family, the 25012 standard is focused on data. This model defines fifteen characteristics considered from two different points of view: inherent and system dependent [12].
3.1. Inherent data quality
Inherent data quality refers to the degree to which quality characteristics of data have the intrinsic potential to satisfy stated and implied needs when data is used under specified conditions [12]. From the inherent point of view, data quality refers to data itself [12], in particular to:
- Data domain values and possible restrictions (e.g. business rules governing the quality required for the characteristic in a given application)
- Relationships of data values (e.g. consistency)
- Metadata
3.2. System dependent data quality
System dependent data quality refers to the degree to which data quality is attained and preserved within a computer system when data is used under specified conditions [12]. From this point of view data quality depends on the technological domain in which data are used; this is achieved by the capabilities of computer system components such as: hardware devices (e.g. to make data available or to obtain the required precision), computer system software (e.g. backup software to achieve recoverability), and other software (e.g. migration tools to achieve portability) [12].
3.3. Data quality model
The data quality model defined in ISO/IEC 25012 classifies the fifteen DQ characteristics according to the inherent and system dependent points of view [12] (see Table 1).

Table 1: Data quality model characteristics
Characteristic | Inherent | System dependent
Accuracy | X | -
Completeness | X | -
Consistency | X | -
Credibility | X | -
Currentness | X | -
Accessibility | X | X
Compliance | X | X
Confidentiality | X | X
Efficiency | X | X
Precision | X | X
Traceability | X | X
Understandability | X | X
Availability | - | X
Portability | - | X
Recoverability | - | X

4. Method used to obtain SPDQM
SPDQM (SQuaRE-Aligned Portal Data Quality Model) was obtained by following a process consisting of five steps. The sequence of steps in this process is shown in Figure 2.

Figure 2: Working method

4.1. Step 1: Initial set of DQ characteristics
The goal of the first step was to obtain a set of DQ characteristics that would be applicable to the context of Web portals. To do this, we based our work on the PDQM model, the ISO/IEC 25012 standard and a literature review carried out previously [15]. This survey was the continuation of another reported in [3], which covered the period up to 2005 (inclusive). Our survey therefore covered the period between 01/01/2006 and 31/12/2008. The survey was developed following the guidelines proposed by Kitchenham [13]. The objective was to obtain a set of characteristics for data quality in the Web context, according to the existing proposals in the literature. This was done by defining one or more strings containing the terms used in the search (for example, "data quality" AND web AND "information quality"). These strings were used to carry out several searches in different digital libraries (SCOPUS, ScienceDirect, Wiley, IEEE Digital Library and ACM Digital Library in our case). As a result, we identified 39 characteristics. The set of characteristics shown in Table 2 was obtained from the aforementioned three sources. The initial set of DQ characteristics for Web portals was therefore composed of 33 characteristics from PDQM, 15 characteristics from ISO/IEC 25012, and 39 characteristics from our survey (see Table 2).
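The construction of the search strings used in this step can be illustrated with a minimal sketch. It is not the tooling used in the survey: the helper names `build_query` and `in_survey_period` are hypothetical, and the query syntax of each digital library is not modelled. Only the example terms and the survey period are taken from the text.

```python
from datetime import date

# Survey period stated in the text: 01/01/2006 to 31/12/2008 (inclusive).
SURVEY_START = date(2006, 1, 1)
SURVEY_END = date(2008, 12, 31)


def build_query(terms):
    """Join search terms with AND, quoting multi-word terms as phrases.

    Hypothetical helper: real libraries may need a different syntax.
    """
    parts = []
    for term in terms:
        term = term.strip()
        parts.append(f'"{term}"' if " " in term else term)
    return " AND ".join(parts)


def in_survey_period(published):
    """Return True if a publication date falls inside the survey period."""
    return SURVEY_START <= published <= SURVEY_END


query = build_query(["data quality", "web", "information quality"])
print(query)  # "data quality" AND web AND "information quality"
```

Keeping the period check separate from the string builder reflects the fact that the date range is a property of the survey, not of any particular search string.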
4.2. Step 2: Refinement of the set of DQ characteristics
This step consisted of refining the set of characteristics obtained in the previous step. This was principally achieved by:
- Ensuring that all the characteristics were applicable to the context of Web portals.
- Resolving any possible conflict between the characteristics obtained from the various sources by checking for the existence of either characteristics with the same name and a different meaning or characteristics with a different name but the same meaning.
Table 2: Classification of the DQ characteristics
Characteristic | PDQM [3] | ISO/IEC 25012 [12] | Our survey [15]
Accessibility | X | X | X
Accuracy | X | X | X
Availability | X | X | X
Completeness | X | X | X
Consistency | Consistent Representation | X | Consistent Representation
Credibility | X | X | X
Currentness | Currency | X | Currency
Traceability | X | X | X
Understandability | X | X | X
Compliance | - | X | -
Confidentiality | - | X | -
Portability | - | X | -
Precision | - | X | -
Recoverability | - | X | -
Efficiency | - | X | X
Amount of data | X | - | X
Applicability | X | - | X
Attractiveness | X | - | X
Concise Representation | X | - | X
Customer Support | X | - | X
Documentation | X | - | X
Duplicates | X | - | X
Ease of operation | X | - | X
Expiration | X | - | X
Flexibility | X | - | X
Interactive | X | - | X
Interpretability | X | - | X
Novelty | X | - | X
Objectivity | X | - | X
Organization | X | - | X
Relevancy | X | - | X
Reliability | X | - | X
Reputation | X | - | X
Response Time | X | - | X
Security | X | - | X
Specialization | X | - | X
Timeliness | X | - | X
Validity | X | - | X
Value-added | X | - | X
Effectiveness | - | - | X
Readability | - | - | X
Usability | - | - | X
Usefulness | - | - | X
Verifiability | - | - | X
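The composition of the initial set can be checked with a few set operations. The sketch below is illustrative: the row-by-row membership is a reading of Table 2 chosen so that it agrees with the totals stated in the text (33, 15 and 39), and the group names (`IN_ALL_THREE`, `PDQM_AND_SURVEY`, and so on) are introduced here only for convenience. As in the table, the PDQM and survey names "Consistent Representation" and "Currency" are aligned with the ISO/IEC 25012 rows Consistency and Currentness.

```python
# Groups of rows, as read from Table 2 (an assumption consistent with the
# totals 33 / 15 / 39 given in the text).
IN_ALL_THREE = [
    "Accessibility", "Accuracy", "Availability", "Completeness",
    "Consistency", "Credibility", "Currentness", "Traceability",
    "Understandability",
]
ISO_ONLY = [
    "Compliance", "Confidentiality", "Portability", "Precision",
    "Recoverability",
]
ISO_AND_SURVEY = ["Efficiency"]
PDQM_AND_SURVEY = [
    "Amount of data", "Applicability", "Attractiveness",
    "Concise Representation", "Customer Support", "Documentation",
    "Duplicates", "Ease of operation", "Expiration", "Flexibility",
    "Interactive", "Interpretability", "Novelty", "Objectivity",
    "Organization", "Relevancy", "Reliability", "Reputation",
    "Response Time", "Security", "Specialization", "Timeliness",
    "Validity", "Value-added",
]
SURVEY_ONLY = [
    "Effectiveness", "Readability", "Usability", "Usefulness",
    "Verifiability",
]

pdqm = set(IN_ALL_THREE) | set(PDQM_AND_SURVEY)
iso = set(IN_ALL_THREE) | set(ISO_ONLY) | set(ISO_AND_SURVEY)
survey = (set(IN_ALL_THREE) | set(ISO_AND_SURVEY)
          | set(PDQM_AND_SURVEY) | set(SURVEY_ONLY))

initial_set = pdqm | iso | survey      # distinct rows of Table 2
iso_only = iso - pdqm - survey         # candidates for the applicability check

print(len(pdqm), len(iso), len(survey))  # 33 15 39
print(len(initial_set))                  # 44
print(sorted(iso_only))
```

The last set contains exactly the five ISO/IEC 25012 characteristics whose applicability to Web portals has to be examined separately.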
4.2.1. Evaluation of the characteristics. In Table 2, the characteristics which needed to be reviewed in accordance with the previously mentioned criteria are shown in grey. As the characteristics in PDQM and those recovered from the survey were specifically defined and obtained for the context of Web portals, all of them are directly incorporated into our set of characteristics for SPDQM. However, it is necessary to study the applicability of those from ISO/IEC 25012 [12] which do not coincide with either of the other two sources.
4.2.1.1. Applicability of the characteristics to the context of DQ in Web portals. There are five characteristics from ISO/IEC 25012 that are not part of PDQM and which did not appear in our survey:
- Compliance: this is defined in [12] as “the degree to which data has attributes that adhere to standards, conventions or regulations in force and similar rules relating to data quality in a specific context of use”. It therefore concerns standards and regulations at the level of data (not at the level of standard characteristics) and is thus useful, for example, with regard to the Organic Law on Data Protection (LOPD) and to data conventions (such as the use of “,” or “.” to separate units of a thousand). This is relevant to Web portals.
- Confidentiality: according to [12], this is identified as “the degree to which data has attributes that ensure that it is only accessible and interpretable by authorized users in a specific context of use”. It is therefore useful because data must only be shown to and used by authorized personnel in the environment of Web portals.
- Portability: this is related to “the degree to which data has attributes that enable it to be installed, replaced or moved from one system to another preserving the existing quality in a specific context of use” [12]. This characteristic is important for Web portals because consumers may use different browsers and operating systems.
- Precision: this is defined in [12] as “the degree to which data has attributes that are exact or that provide discrimination in a specific context of use”. The scale used for numerical data thus makes the Web portal more elegant and increases customer loyalty.
- Recoverability: this is related to “the degree to which data has attributes that enable it to maintain and preserve a specified level of operations and quality, even in the event of failure, in a specific context of use” [12]. The ability to recover data in a Web portal therefore increases user confidence.
All of the above are therefore included in our set of characteristics.
4.2.2. Conflict resolution. The next objective was to check that characteristics sharing a name also referred to the same concept. Upon carrying out this evaluation, we detected the following:
- Completeness: in [3], this is “the extent to which the data provided by a Web portal are of sufficient breadth, depth, and scope for the task at hand” and in [12], it is “the degree to which subject data associated with an entity has values for all expected attributes and related entity instances in a specific context of use”. The term did not, therefore, have the same meaning in both contexts, and it was necessary to change the name of one of them. In this case, the “Completeness” characteristic of PDQM was renamed as “Scope”.
- Consistent Representation and Consistency were initially classified as a single characteristic. However, upon analysing the definition of “Consistent Representation”, which according to [3] is “the extent to which data are always presented in the same format, are compatible with previous data and consistent with other sources”, and the definition of “Consistency”, which is defined in [12] as “the degree to which data has attributes that are free from contradiction and are coherent with other data in a specific context of use”, we realized that although the names appeared to refer to the same concept, this was not the case. They have therefore been considered to be two different characteristics in SPDQM.
- Currentness: according to [12], this is “the degree to which data has attributes that are of the right age in a specific context of use” and “Currency” is “the extent to which the Web portal provides non-obsolete data” [3]. We therefore consider that they have a similar meaning, and the name given in ISO/IEC 25012 is the one used in SPDQM.
The remaining characteristics were coherent as regards name and meaning.
4.2.3. Selection of characteristics. Finally, the characteristics of our survey [15] were reviewed in order to detect whether any of them could be removed. Thus:
- Security was a more generic characteristic than Availability and Confidentiality, and was at a higher level. We therefore preferred to include Availability and Confidentiality and to eliminate Security (as in ISO/IEC 25012).
- Usability included Efficiency and Effectiveness. We therefore selected Efficiency and Effectiveness and eliminated Usability, because no characteristic named Usability is defined in ISO/IEC 25012.
4.3. Step 3: Final set of DQ characteristics
Once the characteristics had been refined, we obtained the final set of characteristics shown in Table 3.
Table 3: Data quality characteristics of SPDQM
Characteristics: Accessibility, Accuracy, Availability, Credibility, Currentness, Traceability, Understandability, Completeness, Compliance, Confidentiality, Consistency, Portability, Precision, Recoverability, Efficiency, Amount of data, Applicability, Attractiveness, Concise Representation, Consistent Representation, Customer Support, Documentation, Duplicates, Ease of operation, Expiration, Flexibility, Interactive, Interpretability, Novelty, Objectivity, Organization, Relevancy, Reliability, Reputation, Response Time, Scope, Specialization, Timeliness, Validity, Value-added, Effectiveness, Readability, Usefulness, Verifiability
PDQM [3]
ISO/IEC 25012 [12]
X X X X Currency X X
X X X X X X X X X X X X X X X
Our survey [15] X X X X Currency X X
X X X X
X X X X X
X
X
X X X X X X X X X X X X X X X X X X X X
X X X X X X X X X X X X X X X X X X X X X X X X
4.4. Step 4: Organization of the characteristics
Having obtained the final set of characteristics for SPDQM, we decided to classify them into the categories in PDQM and from the two points of view stated in ISO/IEC 25012, “Inherent” and “System Dependent”, considering that:
- The “Intrinsic” category in PDQM coincided with the “Inherent” data quality in SQuaRE.
- The “Operational”, “Contextual” and “Representational” categories in PDQM coincided with the “System Dependent” data quality in SQuaRE.
Although PDQM only permitted a characteristic or sub-characteristic to appear in one category, in our model, if a characteristic or sub-characteristic is both “Inherent” and “System Dependent” in ISO/IEC 25012, it can also be included in more than one category in SPDQM.
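The correspondence between PDQM categories and ISO/IEC 25012 points of view can be written down as a small lookup. The sketch below is illustrative only: the flags are those of Table 1, the function name `eligible_categories` is hypothetical, and "eligible" means compatible with the point of view, not the assignment actually made in Table 5.

```python
# (inherent, system_dependent) flags for each ISO/IEC 25012
# characteristic, as given in Table 1.
ISO_25012 = {
    "Accuracy": (True, False),
    "Completeness": (True, False),
    "Consistency": (True, False),
    "Credibility": (True, False),
    "Currentness": (True, False),
    "Accessibility": (True, True),
    "Compliance": (True, True),
    "Confidentiality": (True, True),
    "Efficiency": (True, True),
    "Precision": (True, True),
    "Traceability": (True, True),
    "Understandability": (True, True),
    "Availability": (False, True),
    "Portability": (False, True),
    "Recoverability": (False, True),
}

# Correspondence stated in Step 4.
INHERENT_CATEGORIES = {"Intrinsic"}
SYSTEM_DEPENDENT_CATEGORIES = {"Operational", "Contextual", "Representational"}


def eligible_categories(name):
    """Return the PDQM categories compatible with a characteristic's flags."""
    inherent, system_dependent = ISO_25012[name]
    categories = set()
    if inherent:
        categories |= INHERENT_CATEGORIES
    if system_dependent:
        categories |= SYSTEM_DEPENDENT_CATEGORIES
    return categories


print(sorted(eligible_categories("Accuracy")))       # ['Intrinsic']
print(sorted(eligible_categories("Accessibility")))  # all four categories
```

A characteristic flagged under both points of view, such as Accessibility, is therefore the only kind that may appear under more than one point of view in SPDQM.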
4.5. Step 5: SPDQM (SQuaRE-Aligned Portal Data Quality Model)
SPDQM is consequently made up of 44 DQ characteristics (32 PDQM characteristics, 5 characteristics from our survey and 7 characteristics from ISO/IEC 25012). Table 4 shows the final definition of each characteristic, adapted to the environment of Web portals, and considers the definitions found in the three sources used. Finally, Table 5 shows the complete structure of our model, which has four levels:
- The first level corresponds to the two points of view adopted from the ISO/IEC 25012 standard.
- The second level corresponds to the DQ categories adopted from the PDQM model.
- The third level corresponds to the set of characteristics in each category.
- Finally, the fourth level contains those sub-characteristics which are associated with some of the characteristics in the previous level.
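The four levels listed above suggest a simple nested structure. The sketch below is one possible representation, not the authors' implementation: the class and function names are hypothetical, only the first two levels are filled in from the text, and the single characteristic added at the end is an illustrative entry (the actual assignments are those of Table 5).

```python
from dataclasses import dataclass, field


@dataclass
class Characteristic:            # level 3 (with level 4 nested inside)
    name: str
    subcharacteristics: list = field(default_factory=list)


@dataclass
class Category:                  # level 2, adopted from PDQM
    name: str
    characteristics: list = field(default_factory=list)


@dataclass
class PointOfView:               # level 1, adopted from ISO/IEC 25012
    name: str
    categories: list = field(default_factory=list)


def empty_model():
    """Build levels 1 and 2 as described in Steps 4 and 5."""
    return [
        PointOfView("Inherent", [Category("Intrinsic")]),
        PointOfView("System Dependent", [
            Category("Operational"),
            Category("Contextual"),
            Category("Representational"),
        ]),
    ]


def paths(model):
    """Yield (point of view, category, characteristic, subcharacteristic)."""
    for view in model:
        for category in view.categories:
            for item in category.characteristics:
                if not item.subcharacteristics:
                    yield (view.name, category.name, item.name, None)
                for sub in item.subcharacteristics:
                    yield (view.name, category.name, item.name, sub)


model = empty_model()
# Illustrative entry only; see Table 5 for the actual structure.
model[0].categories[0].characteristics.append(Characteristic("Accuracy"))
print(list(paths(model)))
```

Because a characteristic object can be appended to several categories, this layout also accommodates characteristics that belong to more than one category.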
5. Conclusions and future work
In this article, we have presented the development of SPDQM (SQuaRE-Aligned Portal Data Quality Model), a quality model for Web portal data. The model is based on PDQM (Portal Data Quality Model) and on ISO/IEC 25012. SPDQM has 44 DQ characteristics and sub-characteristics, which were obtained from the aforementioned models and from a set of characteristics gathered in a literature study. These characteristics and their sub-characteristics were analyzed according to the two points of view defined in ISO/IEC 25012 (“Inherent” and “System Dependent”) and assigned to the four PDQM categories (“Intrinsic”, “Operational”, “Contextual” and “Representational”), thus obtaining a four-level structure. As future work, we shall tackle the application of SPDQM in data quality evaluation. To do this we intend to define measures for each characteristic, and to create an automatic tool with which to evaluate the data quality in Web portals based on our model.
Table 4: Description of data quality characteristics
Characteristic | Description | Ref.
Accessibility | The degree to which a Web portal provides navigation mechanisms to attain the desired data faster and more easily, particularly by people who need supporting technology or a special configuration because of a disability. | [2, 3, 5, 12]
Accuracy | The degree to which a Web portal’s data are free from errors, can be verified offline, and have attributes that correctly represent the true value of the intended attribute of a concept or event. | [2, 3, 10, 12, 14]
Amount of data | The extent to which the quantity or volume of data delivered by the Web portal is appropriate for the task at hand. | [2, 3, 5]
Applicability | The extent to which data are specific, useful and easily applicable for the target community in Web portals. | [2, 3]
Attractiveness | The extent to which the Web portal is attractive to its visitors. | [2, 3]
Availability | The extent to which data are available through the Web portal and have attributes that enable them to be retrieved by authorized users and/or applications. | [2, 3, 12]
Completeness | The degree to which a Web portal’s data are able to serve a user’s information needs, implicitly capturing other criteria such as ease of understanding, and serving as an indicator for relevancy. | [12, 17]
Compliance | The degree to which a Web portal’s data have attributes that adhere to standards, conventions or regulations in force and similar rules relating to data quality in a specific context of use. | [12]
Concise Representation | The extent to which a Web portal’s data are compactly represented without superfluous or non-related elements and enable the detection of incorrect descriptions. | [2, 3, 5, 9]
Confidentiality | The degree to which a Web portal’s data have attributes that ensure that they are only accessible and interpretable by authorized users in a specific context of use. | [12]
Consistency | The degree to which a Web portal’s data have attributes that are free from contradiction, coherent with other data and presented in the same format in a specific context of use. | [5, 12]
Consistent Representation | The extent to which a Web portal’s data are always presented in the same format, are compatible with previous data and consistent with other sources. | [2, 3]
Credibility | The degree to which a Web portal’s data have attributes that are regarded as true, correct and believable by users. | [2, 3, 8, 12, 19]
Currentness | The degree to which a Web portal’s data have attributes that are of the right age, are non-obsolete and are up-to-date in a specific context of use. | [2, 3, 8, 12, 14]
Customer Support | The extent to which the Web portal provides on-line support by means of text, e-mail, telephone, and so on. | [2, 3]
Documentation | Amount and usefulness of documents with meta information. | [2, 3]
Duplicates | The extent to which data delivered by the portal contain duplicates. | [2, 3]
Ease of operation | The extent to which a Web portal’s data are easily managed, can be applied to different tasks and are handled (i.e. updated, moved, aggregated, and so on). | [2, 3, 5]
Effectiveness | The extent to which design uses adequate analytical techniques in the Web portal. | [24]
Efficiency | The degree to which a Web portal’s data have attributes that can be processed and provide the expected levels of performance by using the appropriate amounts and types of resources. | [11, 12]
Expiration | The extent to which the date until which data remain current is known. | [2, 3]
Flexibility | The extent to which a Web portal’s data are expandable, adaptable, and easily applied to other needs. | [2, 3]
Interactive | The extent to which the way in which a Web portal’s data are accessed or retrieved can be adapted to one's personal preferences through interactive elements. | [2, 3, 8]
Interpretability | The extent to which a Web portal’s data are in a language and units which are appropriate for consumer capability. | [2, 3]
Novelty | The extent to which data obtained from a Web portal influence knowledge and new decisions. | [2, 3]
Objectivity | The extent to which a Web portal’s data are unbiased, unprejudiced and impartial. | [2, 3, 5, 14]
Organization | The organization, visual settings or typographical features (colour, text, font, images, etc.) and the consistent combinations of these various components. | [2, 3]
Portability | The degree to which a Web portal’s data have attributes that enable them to be installed, replaced or moved from one system to another, preserving their existing quality. | [12]
Precision | The degree to which a Web portal’s data have attributes that are exact or that provide discrimination in the Web portal and how they assist users to find relevant results and avoid irrelevant results. | [6, 12]
Readability | The extent to which text is legible, the text in the Web portal is presented in an easy-to-read manner and the placement of the text in the Web portal ensures good readability. | [8]
Recoverability | The degree to which a Web portal’s data have attributes that enable them to maintain and preserve a specified level of operations and quality, even in the event of failure. | [12]
Relevancy | The extent to which a Web portal’s data are applicable, useful and helpful for users' needs in the task at hand. | [2, 3, 5, 8, 9]
Reliability | The extent to which users can trust the data and their sources and the end-user’s perception of the adequate technical functioning of the Web portal. | [2, 3, 8]
Reputation | The extent to which a Web portal’s data are trusted or highly regarded in terms of their source or content, and information is highly regarded in terms of its source or content. | [2, 3, 5]
Response Time | Amount of time until complete response reaches the user. | [2, 3]
Scope | The extent to which the data provided by a Web portal are of sufficient breadth, depth, and scope for the task at hand. | [2, 3]
Specialization | Degree of specificity of data/information contained in and delivered by the Web application, i.e. it should incorporate all details which might be seen by its visitors. | [2, 3]
Timeliness | The degree to which information has changed during time and the information is transitory or stable. | [2, 3, 5, 10]
Traceability | The degree to which a Web portal’s data are well-documented, verifiable, easily attributed to a source and provide an audit trail of access to the data and of any changes made to the data. | [2, 3, 12]
Understandability | The degree to which a Web portal’s data have attributes that enable them to be read, which are clear, non-ambiguous, easy, comprehensible and well-interpreted by users, and are expressed in appropriate languages, symbols and units in a specific context of use. | [2, 3, 5, 12]
Usefulness | The extent of user assessment both of the likelihood that the information will enhance their decisions, and their satisfaction with the usefulness of content: this aspect concerns the focus of the content, the use of appropriate language, and the utility of information according to the needs of the audience to whom it is directed. | [4, 16, 21]
Validity | The extent to which users can judge and comprehend data delivered by the Web portal. | [2, 3]
Value-added | The extent to which a Web portal’s data are beneficial and provide advantages from their use. | [2, 3, 5]
Verifiability | The extent of references to original sources. | [21]
Table 5: SPDQM Point of view
Category
Characteristic Accuracy: Credibility:
Inherent
Intrinsic: this denotes that data have quality in their own right
Operational: this emphasizes the importance of the role of systems; that is, the system must be accessible but secure
Currentness Expiration Completeness Consistency Accessibility Compliance Confidentiality Efficiency Precision Understandability Availability: Accessibility:
System Dependent
Representational: this denotes that the system must present data in such a way that they are interpretable, easy to understand, and concisely and consistently represented
Response Time Interactive Ease of operation Customer Support
Verifiability Confidentiality Portability Recoverability Validity:
Contextual: this highlights the requirement which states that data quality must be considered within the context of the task in hand
Subcharacteristic Duplicates Objectivity Reputation Traceability
Value-added: Relevancy:
Reliability Scope Applicability Flexibility Novelty Novelty Timeliness
Specialization Usefulness Traceability Compliance Precision Concise Representation Consistent Representation Understandability: Attractiveness: Readability Efficiency Effectiveness
Interpretability Amount of data Documentation Organization Organization
9. Acknowledgment
This work is part of the INCOME project (PET2006-0682-01) supported by the Spanish Ministerio de Educación y Ciencia, by the IVISCUS project (PAC08-0024-5991) supported by Consejería de Educación y Ciencia (JCCM) and by VIMECUS (TC20080556) supported by the University of Castilla-La Mancha.
10. References
[1] Cappiello, C., C. Francalanci and B. Pernici, "Data quality assessment from the user's perspective". In Proceedings of the International Workshop on Information Quality in Information Systems (IQIS2004). 2004. Paris, France. ACM. pp. 68-73.
[2] Caro, A., C. Calero, "Modelo de Calidad de Datos para Portales Web", in Departamento de Tecnologías y Sistemas de Información. 2007, Universidad de Castilla-La Mancha: Spain.
[3] Caro, A., C. Calero, I. Caballero and M. Piattini, "A proposal for a set of attributes relevant for Web portal data quality". Software Quality Journal, 2008. 16(4): pp. 513-542.
[4] Cheung, C.M.K., M.K.O. Lee, "The structure of web-based information systems satisfaction: Testing of competing models". Journal of the American Society for Information Science and Technology, 2008. 59(10): pp. 1617-1630.
[5] Chung, W., "Studying information seeking on the non-English Web: An experiment on a Spanish business Web portal". International Journal of Human-Computer Studies, 2006. 64(9): pp. 811-829.
[6] Chung, W., A. Bonillas, G. Lai, W. Xi and H. Chen, "Supporting non-English Web searching: An experiment on the Spanish business and the Arabic medical intelligence portals". Decision Support Systems, 2006. 42(3): pp. 1697-1714.
[7] Collins, H., "Corporate Portal Definition and Features". AMACOM, 2001.
[8] De Wulf, K., N. Schillewaert, S. Muylle and D. Rangarajan, "The role of pleasure in web site success". Information & Management, 2006. 43(4): pp. 434-446.
[9] Domingues, M.A., C. Soares and A.M. Jorge, "A Web-Based System to Monitor the Quality of Meta-Data in Web Portals". In IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IATW'06). 2006. pp. 188-191.
[10] Dondio, P., S. Barrett, "Computational trust in web content quality: A comparative evalutation on the Wikipedia project". Informatica, 2007. 31: pp. 151-160.
[11] Grigoroudis, E., C. Litos, V.A. Moustakis, Y. Politis and L. Tsironis, "The assessment of user-perceived web quality: Application of a satisfaction benchmarking approach". European Journal of Operational Research, 2008. 187(3): pp. 1346-1357.
[12] ISO/IEC FDIS 25012, "Software engineering - Software product Quality Requirements and Evaluation (SQuaRE) - Data quality model". 2008.
[13] Kitchenham, B., S. Charters, "Guidelines for performing systematic literature reviews in software engineering". Technical Report EBSE-2007-01, School of Computer Science and Mathematics, Keele University, 2007.
[14] Metzger, M.J., "Making sense of credibility on the web: Models for evaluating online information and recommendations for future research". Journal of the American Society for Information Science and Technology, 2007. 58(13): pp. 2078-2091.
[15] Moraga, C., M. Moraga and C. Calero, "Towards the Discovery of Data Quality Attributes for Web Portals". In 9th International Conference on Web Engineering. Pending acceptance. 2009.
[16] Moraga, M., C. Calero and M. Piattini, "Comparing different quality models for portals". Online Information Review, 2006. 30(5): pp. 555-568.
[17] Prestipino, M., F.-R. Aschoff and G. Schwabe, "How up-to-date are Online Tourism Communities? An Empirical Evaluation of Commercial and Non-commercial Information Quality". In 40th Annual Hawaii International Conference on System Sciences (HICSS'07). 2007.
[18] Redman, T., "Data Quality: The field guide". Digital Press, Boston, 2000.
[19] Robins, D., J. Holmes, "Aesthetics and credibility in web site design". Information Processing & Management, 2008. 44(1): pp. 386-399.
[20] Strong, D., Y. Lee and R. Wang, "Data Quality in Context". Communications of the ACM, 1997. 40(5): pp. 103-110.
[21] Stvilia, B., M.B. Twidale, L.C. Smith and L. Gasser, "Information quality work organization in Wikipedia". Journal of the American Society for Information Science and Technology, 2008. 59(6): pp. 983-1001.
[22] Wynn, M., S. Zhang, "Web Portals in SMEs - Two Case Studies". In Proceedings of the 2008 Third International Conference on Internet and Web Applications and Services (ICIW). 2008. IEEE Computer Society, Washington, DC. pp. 303-308.
[23] Xiao, L., S. Dasgupta, "User Satisfaction with Web Portals: An Empirical Study", in Web Systems Design and Online Consumer Behavior, chapter 11, Y. Gao (Ed.). 2005, Idea Group Publishing: Hershey, PA. pp. 193-205.
[24] Yen, B., P.J.-H. Hu and M. Wang, "Toward an analytical approach for effective Web site design: A framework for modeling, evaluation and enhancement". Electronic Commerce Research and Applications, 2007. 6(2): pp. 159-170.