Creation of High Quality Web Contents Available to Everyone, Through the use of Semantic Editors
Héctor Diez-Machío Lorena Rivas-Morán Alejandro Prieto-Castro
_____________________________________________________________________________________________
Creation of High Quality Web Contents Available to Everyone
Héctor Diez-Machío1, Lorena Rivas-Morán2 and Alejandro Prieto-Castro3
1 INTECO, Centre of Reference for Accessibility and Web Standards. Spain. [email protected]
2 INTECO, Centre of Reference for Accessibility and Web Standards. Spain. [email protected]
3 INTECO, Centre of Reference for Accessibility and Web Standards. Spain. [email protected]
Abstract
Current systems for editing Web pages have several drawbacks that make them unsuitable for users without technical knowledge who want to create high quality Web pages. This may pose a new technological barrier, because Web users create a large and growing share of Web content. In this paper, we propose a new editing system based on semantic editors. In this editing model, the editing process is guided by the meaning of what is being written and not by its final appearance. The system uses a language (called WebCS) to describe the contents semantically and an export system that produces the final Web pages. The advantages of this system apply to any organization or service that requires Web contents to be generated by users without technical knowledge, especially Public Administrations and companies.
Key words: Web Editing, Semantic Editors, e-participation
____________________________________________________________________________________________ 1 Collecter Iberoamérica 2008
1 Introduction
In recent years, the Web 2.0 concept has gained importance in most Web services and portals and in Web-related business and technology. According to the creator of the concept [1], one of the best-known characteristics of Web 2.0 is that users go from being mere spectators to playing a more active role. One of the functions these users take on is adding value to the Web through small decentralized contributions that together add up to a whole comparable in size to what large companies offer. This value may take the form of votes, inventories or resource sharing, but the most usual form is the creation of contents. These contents may be, for example, opinions expressed on a political forum, analyses and reviews of a company’s products, health advice, travel photos with their corresponding explanations, or comments on contents added by others.
Moreover, content creation is not limited to external users. Inside an organization, adding content to the Web is a task carried out by many people, not only by the technical administrators.
Any Web user should be able to add these contents, and indeed they already can, but one aspect has received little attention: they should be able to add contents on equal terms, i.e., users should have the same ability to add contents regardless of their technological knowledge or their command of the tools they need to use. If we do not pay attention to this matter, we could create a new technological barrier between people who can only access the Web to receive contents and those who can interact in both directions, receiving and sending contents. More precisely, the barrier lies between people who can only add contents in a rudimentary way and those who can add contents using the full power of the available technology.
The contents created by people in the first group could have problems such as non-compliance with accessibility standards and rules, design problems and an unattractive appearance. This can lead to situations in which, for example, an interesting political opinion is seen less because it is not rendered well in some Web browsers, or an e-commerce company has an outdated Web catalog because the stock manager cannot add the new products and has to ask the Web developers to do it.
We begin by describing the systems currently used for introducing contents on the Web and analyzing the problems of each one. We then present the system we propose to solve these problems. Next, we show an example of its use with an editor programmed to demonstrate how it works. Lastly, we explain the advantages of the new system.
2 Current systems for introducing contents
The most common way for users to insert contents is through an editor. For surveys of text editors, see [2], [3]. This editor is usually part of the CMS (Content Management System) used to create the Web site, and it allows contents to be inserted using one or several of the following systems:
1. WYSIWYG editor (What You See Is What You Get). Users write the contents and see, in real time, the final appearance they will obtain. It is the most widely used system, as it allows a great number of functions to be carried out (it is very powerful) in a simple way. Its power is also one of its defects, as it can produce low quality pages in the hands of non-technical users. In addition, these editors are not appropriate for mobile devices, and most of them only work with the most widespread Web browsers (they depend heavily on the platform). Some criticism of this paradigm can be found in [4].
2. HTML editor. It allows users to write directly in the language in which Web pages are written. It offers maximum power and total control over the resulting pages. Its main problem is that it requires technical knowledge, as it is highly complex.
3. Editors based on their own languages. They aim to simplify the former by reducing the language to its most important elements and simplifying the syntax. An example is the Wikitext language [5], used, for example, on Wikipedia. Despite its greater simplicity, users still need to learn it beforehand, and it is not a standardized solution.
4. Plain text editors. They only allow plain character strings to be written. They do not require any specific knowledge, but their power is minimal and they are only appropriate for simple forms.
Therefore, none of these systems offers all the desirable characteristics mentioned: power, ease of use, ease of learning, platform independence and a way to keep control over the quality of the result.
3 System based on semantic editors
In view of the problems the previous systems have when it comes to introducing contents, we propose a system based on semantic editors. This editing model focuses on the meaning of the contents instead of their final appearance; it is called WYSIWYM (What You See Is What You Mean). This approach has been used successfully for writing scientific documents in LyX [6]. In order to use WYSIWYM editors, it is necessary to know the meaning of the parts that make up the contents, i.e., whether there is a title, a date, an author, etc. Because the contents are described in advance, there is greater control over the final code that is generated. To adapt this editing model to Web content, a WYSIWYM editor must work inside a system like the one presented in Figure 1.
[Figure 1 diagram labels: Content Schema, semantic description of the allowed contents; WYSIWYM Editor; area where the user edits contents; XML document where the contents are tagged following the pre-defined semantic description; rules that guide the transformation of the contents to their final format; Export; Web page fragment]
Figure 1: Diagram of the system based on semantic editors.
The pages of a Web portal contain areas in which users can insert contents. Each area is linked to a file that semantically describes the contents allowed for that area. The semantic Web editor reads this schema and shows the user the contents s/he has to fill in. Once the user has inserted the contents, they are stored in XML files, tagged according to their semantic description. In this way, the contents are completely separated from the appearance. To give the contents their final appearance (normally as HTML Web pages), the content descriptions have rules associated with them. Through these rules, the contents are automatically transformed into their final format, which is the one included in the corresponding editing area.
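The separation between tagged contents and appearance rules can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the element names, the shape of the stored XML and the template-based rules are hypothetical, and the rules stand in for the transformation rules of the system rather than reproducing them.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical stored contents: the element names are illustrative only
# and do not reproduce the actual intermediate format of the system.
CONTENT_XML = """
<Opinion>
  <Title>Public transport &amp; the city</Title>
  <Body>Buses should run all night.</Body>
</Opinion>
"""

# Hypothetical appearance rules: one HTML template per semantic element.
# "{children}" is replaced by the rendered children or by the escaped text.
RULES = {
    "Opinion": '<div class="opinion">{children}</div>',
    "Title": "<h2>{children}</h2>",
    "Body": "<p>{children}</p>",
}


def export(element: ET.Element, rules: dict[str, str]) -> str:
    """Render one semantic element by applying the rule for its tag."""
    if len(element):  # the element contains other elements
        inner = "".join(export(child, rules) for child in element)
    else:  # leaf element: plain text written by the user
        inner = escape((element.text or "").strip())
    return rules[element.tag].format(children=inner)


def render(xml_text: str, rules: dict[str, str]) -> str:
    """Parse the stored contents and produce their final format."""
    return export(ET.fromstring(xml_text.strip()), rules)
```

Calling `render(CONTENT_XML, RULES)` produces an HTML fragment; passing a different rule dictionary changes the appearance without touching the stored contents, which is the property the paragraph above describes.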
3.1 WebCS Language
The WebCS (Web Content Schemas) language is used to describe the meaning of the different parts of the contents (that is, to write the content schemas). A content schema is made up of a series of content elements and the generating operations that link them. The content elements, which we will simply call elements, are the structural units from which the content schemas are built. The content schemas themselves are elements that have been constructed from other, simpler elements by means of a series of operations.
The elements are grouped in vocabularies. There is an initial vocabulary with common elements such as title, body, date, e-mail, etc. The elements of this initial vocabulary can be used to create new, more complex elements. To create a new element, we combine existing elements using the appropriate generating operation. The allowed generating operations are the following:
• Union of. It generates a new element by combining other ones. Example: Date = Union of (Day, Month, Year).
• Choice between. It generates a new element that can be one of a set of options. Example: Author = Choice between (Person, Institution).
• List of. It generates a new element as the repetition, one or more times, of a specific element. Example: Glossary = List of Glossary_Term.
• Precision of. It generates a new element by specifying the meaning of another one. Example: Responsible_Name = Precision of Person.
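The four generating operations can be modelled as a small data structure. The following Python sketch is only one possible in-memory model, not part of WebCS itself: the class names (`Basic`, `UnionOf`, `ChoiceBetween`, `ListOf`, `PrecisionOf`) and the `describe` helper are hypothetical, while the example elements are the ones given in the list above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Basic:
    """An element of the initial vocabulary (e.g. Day, Person)."""
    name: str


@dataclass(frozen=True)
class UnionOf:
    """New element formed by combining other elements."""
    name: str
    parts: tuple["Element", ...]


@dataclass(frozen=True)
class ChoiceBetween:
    """New element that is exactly one of several options."""
    name: str
    options: tuple["Element", ...]


@dataclass(frozen=True)
class ListOf:
    """New element that repeats one element one or more times."""
    name: str
    item: "Element"


@dataclass(frozen=True)
class PrecisionOf:
    """New element that narrows the meaning of another element."""
    name: str
    base: "Element"


Element = Union[Basic, UnionOf, ChoiceBetween, ListOf, PrecisionOf]


def describe(e: Element) -> str:
    """Write an element definition in the textual notation used above."""
    if isinstance(e, UnionOf):
        return f"{e.name} = Union of ({', '.join(p.name for p in e.parts)})"
    if isinstance(e, ChoiceBetween):
        return f"{e.name} = Choice between ({', '.join(o.name for o in e.options)})"
    if isinstance(e, ListOf):
        return f"{e.name} = List of {e.item.name}"
    if isinstance(e, PrecisionOf):
        return f"{e.name} = Precision of {e.base.name}"
    return e.name


# The four examples from the list of generating operations.
person = Basic("Person")
date = UnionOf("Date", (Basic("Day"), Basic("Month"), Basic("Year")))
author = ChoiceBetween("Author", (person, Basic("Institution")))
glossary = ListOf("Glossary", Basic("Glossary_Term"))
responsible = PrecisionOf("Responsible_Name", person)
```

Because every constructed element is itself an `Element`, schemas can be nested to any depth, which mirrors the statement that content schemas are elements built from simpler elements.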
3.2 Formats of representation
We can use different formats to represent an instance of the WebCS language.
3.2.1 Graphical representation
The objective of the graphical representation is to let people understand the schemas quickly. The content elements are represented by their names. The elements linked through the operation “Union of” are written as a list preceded by the symbol “–”. Those linked through the operation “Choice between” branch out through lines from the parent element. The operations “Precision of” appear with the symbol “”. The operations “List of” are indicated by “*” after the generating element. Finally, the generating elements of an added element appear in a list with the symbol “*”.
The following example represents a possible structure for political opinions written on a Web page.
[Figure 2 diagram; element names shown: Politics_opinion, Title, Author (Person, Institution), Opinion_data, Type, Political_party, Contact, Telephone, E-mail, Text, Body]
Figure 2: Graphical representation of a possible Politics_opinion structure
3.2.2 XML
This format represents the data of the WebCS instance in such a way that it can be processed by a computer. For readers who want to become familiar with the XML format used, equivalent schemas are available in the XML Schema and RELAX NG languages. The previous example looks like this in this representation format:
… Figure 3: XML representation of a possible Politics_opinion structure
3.3 Operation of the editor
This section describes how an editor that follows this editing model works. The WYSIWYM Web editor is in charge of interpreting the WebCS language and showing the user the structure s/he has to fill in.
[Figure 4 diagram labels: WebCS Instance (content schema); Vocabulary; Read; Edit; Save; Intermediate XML; Export; XSLT; XHTML; PDF; ...]
Figure 4: General flow of the editor
Once the editor has read the chosen content schema, the editing process starts, during which we can carry out the following actions:
• Save the structure we are editing in an intermediate XML language.
• Export the structure to another language, such as XHTML. In this case, XSLT transformations specifically programmed to produce the final language are needed.
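The save action can be sketched in Python as a round trip between an in-memory structure and XML. This is a minimal sketch under stated assumptions: the nested `(name, value)` representation, the function names and the resulting XML layout are hypothetical and do not reproduce the actual intermediate XML language; the element names are borrowed from Figure 2. The XSLT export step is not shown, since the Python standard library does not include an XSLT processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory form of the structure being edited:
# a (name, value) pair, where value is either text or a list of pairs.
content = (
    "Politics_opinion",
    [
        ("Title", "Night buses"),
        ("Author", [("Person", "A. Writer")]),
        ("Body", "Buses should run all night."),
    ],
)


def to_xml(name, value) -> ET.Element:
    """Build an XML element tagged with the semantic name of the content."""
    node = ET.Element(name)
    if isinstance(value, str):
        node.text = value
    else:
        for child_name, child_value in value:
            node.append(to_xml(child_name, child_value))
    return node


def from_xml(node: ET.Element):
    """Rebuild the (name, value) structure from a saved XML element."""
    if len(node) == 0:
        return (node.tag, node.text or "")
    return (node.tag, [from_xml(child) for child in node])


def save(structure) -> str:
    """Serialise the edited structure to an XML string."""
    name, value = structure
    return ET.tostring(to_xml(name, value), encoding="unicode")


def load(xml_text: str):
    """Reopen previously saved contents."""
    return from_xml(ET.fromstring(xml_text))
```

In this sketch, `load(save(content))` returns a structure equal to `content`, so editing can resume from a saved file; an export step would read the same XML and apply the transformation chosen for the target language.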
3.3.1 Editing depending on each generating operation
When editing an element (either a complete content schema or one of its elements), the editor acts in one way or another depending on the generating operation of the element. The editor creates the interface dynamically, depending on the data of the WebCS instance and the actions of the user. In this dynamic interface, the elements are organised in boxes in such a way that the box of an element contains the boxes of its generating elements. This behaviour is repeated at every level of nesting. Next, the behaviour of the interface for each operation is described. Note that the images are illustrative: they show how the interface could appear in one possible editor, but they are not the only possible form of representation.
Operation: a = Union of (b, c)
[Diagram labels: a; Editing area; b; c]
Figure 5: Operation “Union of”
If we are editing an element (a) that is a Union of other elements (b, c), the editor should allow the user to add the elements “b” and “c”. The box of each added element (“b” or “c”) appears in the editing area of “a”. The user should also be allowed to remove elements and reorder the elements already inserted.
Operation: a = Choice between (b, c)
[Diagram labels: a; b; c]
Table 4: Operation “Choice between”
The editor offers the user the possibility of inserting either box “b” or box “c” within box “a”, but not both.
Operation: a = Precision of (b)
[Diagram labels: a (b)]
Figure 5: Operation “Precision of”
With this operation, the editing area for element “b” simply opens, indicating that it is related to element “a”.
Operation: a = List of (b)
[Diagram labels: List of b; b; +]
Figure 6: Operation “List of”
Initially, a box of element “b” and a button to add more “b” elements appear within the box of element “a”. Every time this button is pushed, a new box “b” appears.
Text writing
When an element cannot be divided into simpler elements, the only action available to the user is writing text. In this case, the editor shows a text area or a pop-up window in which to enter the text.
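The per-operation behaviour described in this section can be summarised as a check of what the editor lets the user build. The Python sketch below is an interpretation, not the prototype's code: the `SCHEMA` dictionary, the operation keywords and the `is_valid` function are hypothetical, and the rule that a Union accepts any of its generating elements in any order is an assumption derived from the statement that the user can add, remove and reorder them.

```python
# Hypothetical schema table: element name -> (operation, generating elements).
# Names that do not appear in the table are treated as indivisible elements,
# for which the only available action is writing text.
SCHEMA = {
    "Politics_opinion": ("union", ["Title", "Author", "Body"]),
    "Author": ("choice", ["Person", "Institution"]),
    "Glossary": ("list", ["Glossary_Term"]),
    "Responsible_Name": ("precision", ["Person"]),
}


def is_valid(content, schema) -> bool:
    """Check a (name, value) structure against the editing rules.

    value is a string for text elements, or a list of (name, value) pairs.
    """
    name, value = content
    if name not in schema:
        # Indivisible element: text writing only.
        return isinstance(value, str)
    op, parts = schema[name]
    if op == "precision":
        # Edited exactly like the element whose meaning it specifies.
        return is_valid((parts[0], value), schema)
    if not isinstance(value, list):
        return False
    child_names = [child[0] for child in value]
    if op == "union":
        # Assumption: any of the generating elements, in any order.
        ok = all(n in parts for n in child_names)
    elif op == "choice":
        # One option or the other, but not both.
        ok = len(child_names) <= 1 and all(n in parts for n in child_names)
    elif op == "list":
        # One or more repetitions of the same element.
        ok = len(child_names) >= 1 and all(n == parts[0] for n in child_names)
    else:
        return False
    # The same rules apply inside every nested box.
    return ok and all(is_valid(child, schema) for child in value)
```

For instance, `is_valid(("Author", [("Person", "a"), ("Institution", "b")]), SCHEMA)` is `False` because a Choice admits only one option, while a `Politics_opinion` whose `Body` comes before its `Title` is accepted because reordering is allowed.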
4 Editing example
In order to check the robustness of the system and show how it works, a demonstration prototype has been programmed following the architecture described above. This editor has been programmed as a desktop application that allows us to select the WebCS instance we wish to use, edit contents following this instance, save and open contents, and export them to HTML by selecting the XSLT transformation we wish to use.
A real editing environment would not be a desktop application; instead, the editor would be included within a CMS in charge of loading the WebCS instances, opening and saving the intermediate XML files and exporting them using the suitable transformations. Apart from the options of the editing process itself, the only options for the user would be switching to editing mode (the editor would open) and accepting or cancelling the changes made.
Next, we show a screenshot of the editor in a possible state while editing the structure “Politics_opinion” defined in Figure 2. In this figure we can see the nested boxes of some elements and the text added by the user.
Figure 7: A possible state in the editing process.
From these contents, after executing the XSLT transformation, we obtain the contents in HTML format, for example in the form of the following Web page:
Figure 8: A possible Web page with the edited contents
5 Advantages and conclusions
Based on the above, we can list the following advantages of the system:
1. It democratizes the creation of Web contents, since it allows anyone to create Web pages without specific expertise (other than the simple use of the editor). Furthermore, the system ensures that the quality of the generated pages does not depend on the type of users that create them.
2. It allows the created contents to be semantically enriched without additional effort. Because contents are added within a semantic framework, the user catalogues the contents without being aware of it. These semantic tags can be incorporated into the final Web pages, producing semantically annotated pages. A more detailed study of this feature can be found in [7].
3. It makes it possible to guarantee that the generated contents comply with accessibility standards, regardless of the work of the users that generate them. These users do not need to know accessibility concepts or to have specific training. The accessibility of the final code depends on the transformations programmed by the Web designer, who should know accessibility and ensure that the transformations generate accessible code. This is a very important issue for Public Administrations because in some countries, such as Spain, they are obliged by law to comply with accessibility standards. Even where such a legal framework does not exist, public and private organisations and companies should take this factor into account because of its important social impact.
4. This system entails a better distribution of the roles related to the creation of Web portals within a company or organisation, so that each person is only in charge of the tasks related to his/her post. The three roles required by the system are:
a. Person in charge of defining the content schemas. This role will be filled by marketing, external relations or communications staff, who decide which contents must appear on each Web page. Those who fill in these contents will do so following these structures.
b. Person in charge of inserting contents. Through this system, s/he will be limited to writing contents, and his/her task will have no influence on the appearance or quality of the final pages. S/he will also edit following the allowed structure.
c. Person in charge of the final appearance and code of the Web pages. S/he will be the Web designer and will have total control over the final code without having to review or correct the contents inserted by users.
5. The editing system is characterised by its simple interface and architecture; therefore, it is very easy to implement on multiple platforms.
Despite these advantages, there are situations in which this system is not the most appropriate one, for example, when the contents to be edited are not known in advance or when the users who insert contents have technical expertise and prefer greater flexibility over greater simplicity. Therefore, this system is not intended to replace the current editing systems but to coexist with them, so that a CMS can offer its users the system that best fits their needs in each situation.
References
[1] O’Reilly (2005, September). What Is Web 2.0. [Online]. Available: http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
[2] N. Meyrowitz and A. Van Dam, "Interactive Editing Systems: Part 1 and 2." ACM Computing Surveys, 1982. 14(3): pp. 321-352.
[3] A. Van Dam and D.E. Rice, "On-line Text Editing: A Survey." Computing Surveys, 1971. 3(3): pp. 93-114.
[4] C. Sauer, WYSIWIKI – Questioning WYSIWYG in the Internet Age, in Proceedings of Wikimania, 2006.
[5] C. Sauer, C. Smith and T. Benz, WikiCreole: a Common Wiki Markup, in Proceedings of the International Symposium on Wikis, New York, USA, 2007, pp. 131-142.
[6] LyX. A WYSIWYM document processor. Available: http://www.lyx.org.
[7] H. Diez-Machio, A. Prieto-Castro and L. Rivas-Moran, Enriquecimiento semántico del contenido Web mediante el lenguaje WebCS y editores WYSIWYM, in Proceedings of the II Jornadas sobre Ontologías y Web Semántica, Zaragoza, Spain, 2007, pp. 47-55.