Int. J. Internet Technology and Secured Transactions, Vol. 5, No. 2, 2014
Automation of the semantic annotation of web resources
Sahar Maâlej Dammak*
MIRACL Laboratory, Sfax University, FSEGS, Airport Road, BP 1088, 3018 Sfax, Tunisia
*Corresponding author
Anis Jedidi
MIRACL Laboratory, Sfax University, ISIMS, Tunis Road, BP 242, 3021 Sfax, Tunisia
Rafik Bouaziz
MIRACL Laboratory, Sfax University, FSEGS, Airport Road, BP 1088, 3018 Sfax, Tunisia
Abstract: Annotating a web page associates semantics with the content of this page. However, given the great mass of pages managed throughout the world, especially since the advent of the web, their manual annotation is not feasible. In this paper, we focus on the semi-automatic annotation of web pages. We propose an approach and a component (entitled ‘Querying Web’) for the semantic annotation of web pages. This component aims at improving the interrogation process in the Semantic Web environment. Our solution enhances the first annotation result produced by the ‘Semantic Radar’ plug-in on web resources with new annotations that use an enriched domain ontology. Finally, we present an evaluation of the automation achieved by our component.
Keywords: semantic web; semantic annotation; web resources; semantic radar; domain ontologies; querying web.
Reference to this paper should be made as follows: Maâlej Dammak, S., Jedidi, A. and Bouaziz, R. (2014) ‘Automation of the semantic annotation of web resources’, Int. J. Internet Technology and Secured Transactions, Vol. 5, No. 2, pp.133–148.
Biographical notes: Sahar Maâlej Dammak is a PhD student in Computer Science at the Faculty of Economic Sciences and Management, Sfax University, Tunisia. She works in the MIRACL Laboratory (Multimedia Information Systems and Advanced Computing Laboratory). Her research focuses on the construction of a system to achieve semantic and fuzzy annotation and interrogation for web resources.
Copyright © 2014 Inderscience Enterprises Ltd.
Anis Jedidi is an Assistant Professor of Computer Science and is currently a member of the MIRACL Laboratory at Sfax University, Tunisia. His research interests include semi-structured document modelling, multimedia document design, the Semantic Web, annotation and querying. He is also interested in the extension of XML query languages for multimedia documents and presentations, and in pervasive information systems. He works with PhD students on the annotation of semantic features in multimedia and web resources.
Rafik Bouaziz is a Full Professor of Computer Science. He is currently the Director of the Economy, Management and Computer Science Doctoral School at Sfax University, Tunisia. He was a Consulting Engineer in Organisation and Computer Science and the Head of the Department of Computer Science at CEGOS-TUNISIA between 1979 and 1986. His PhD dealt with temporal data management and the historical record of data in information systems. The subject of his accreditation to supervise research was ‘A contribution for the control of versioning of data and schema in advanced information systems’. Currently, his main research topics are temporal databases, real-time databases, information systems engineering, ontologies, data warehousing and workflows.
This paper is a revised and expanded version of a paper entitled ‘Automation and evaluation of the semantic annotation of web resources’ presented at The 8th International Conference for Internet Technology and Secured Transactions (ICITST-2013), London, UK, 9–12 December 2013.
1 Introduction
One of the objectives of the Semantic Web is to build and use metadata that semantically annotate resources in order to improve the interrogation process. Achieving this goal depends on the use of appropriate ontologies and on the opportunities to automate the annotation process. This is an area of research in different communities. Nevertheless, the existing annotations of web resources are not sufficient to satisfy users when they query the web. Our objective in this paper is therefore to examine how web resources can be semantically annotated. Web resources are very heterogeneous, both in their structure and in the languages they use, which makes the semantic annotation of web pages a difficult task. Automating the annotation process is therefore required.
Our principal contribution here is a semantic annotation approach for web resources. Building contextual and semantic relationships between resources can improve the interrogation process, and we intend to assist users during web search. We are also interested in the operationalisation of this approach in order to automate the extraction of RDF annotation metadata from web resources. To this end, we propose a component, developed under ‘Eclipse’, for semantic annotation. It uses the plug-in ‘Semantic Radar’ (Bojars et al., 2007, 2008; SemanticR, 2009) to extract semantic descriptors such as friend-of-a-friend (FOAF) (Brickley and Miller, 2014), semantically-interlinked online communities (SIOC) (Bojars and Breslin, 2010), etc., for each web page. In addition, this component, entitled ‘Querying Web’, extracts Resource Description Framework (RDF, http://www.w3.org/RDF/) annotation metadata by employing the semantic annotation model that we defined in Maâlej et al. (2013). We apply this model to search for equivalences between the concepts of the enriched domain ontology corresponding to the search area of the user and the concepts generated by the Semantic Radar plug-in.
The paper is structured as follows. The second section presents our motivations and related work. The third section describes the semantic annotation process of web resources. The fourth section gives the implementation details of the component ‘Querying Web’. The fifth section demonstrates the contribution of our component through the evaluation of its results. Finally, the sixth section presents our conclusion and some further work.
2 Motivation and related work
Our goal is to improve the performance of the annotation process in a Semantic Web environment. Creating contextual and semantic relationships between resources and annotating their semantic content can facilitate the interrogation process. In this context, our work is intended to improve web search without using any document corpus defined in advance. To this end, we have studied several tools that extract metadata, such as RDFa distiller (RDFa D, 2010) and Semantic Radar (SemanticR, 2009). As these tools only provide limited annotations, they do not really indicate the semantics of a web page or its relations with other pages. We aim at automating the semantic annotation of web resources in order to help domain experts overcome the complexity of this annotation. For this purpose, we have studied different approaches and systems for semantic annotation and for the automatic generation of semantic descriptors.
The approach proposed by Benyahia et al. (2009) extracts candidate words from the page, which are then connected to an ontology through user intervention. It would have been a good example of semantic annotation for us if it had not been limited to annotating a page from its text. Likewise, the semantic annotation approach for semi-structured documents proposed by Thiam (2010) treats only the text components of documents, whereas our goal is to semantically annotate the entire content of web pages. ALLRIGHT (Shchekotykhin et al., 2007) is a system for the automatic generation of ontology instances from tabular web documents. As this system relies only on tabular web data to describe a resource, it does not focus on the semantics of a web resource. As for WebCat (Martins and Silva, 2005), it is a framework for generating metadata for resources. It would have been a source of inspiration for us if it had included semantics and taken into account the links between web pages in this generation.
Based on these studies, we observe that these works are limited to only a part of a web page (text, images, etc.) when extracting RDF metadata. In addition, they do not address the semantic relationships between web resources during the extraction. We also conclude that using ontologies is a very encouraging solution for the annotation of web resources. We can then define a new approach to the semantic annotation of web resources that takes into account the existing links between these resources. This annotation becomes more relevant if it uses ontologies. It can help to solve the problems of web resource annotation reported in the literature and to improve the interrogation process in a Semantic Web environment (Maâlej, 2012).
3 An approach to semantic annotation
We have proposed a new approach to the semantic annotation of web resources (Maâlej, 2012; Maâlej et al., 2013). This approach is based on domain ontologies that have been extended with FOAF and SIOC concepts in the instantiation process. These concepts represent the data of the Semantic Web. Although we do not find in the literature an ontology that includes the FOAF, SIOC, description of a project (DOAP) and/or RDFa (RDF embedded in XHTML) standards, we note that the extensions made to ontologies usually use the FOAF and SIOC standards. Our approach uses the semantic structures instantiated by the Semantic Radar tool in a set of RDF files. The proposed annotation of the resources is thus an enhancement of the first annotation result produced by the Semantic Radar. The additional annotation concerns the concepts, using an enriched domain ontology and the FOAF and SIOC ontologies. The general steps of our approach are as follows (cf. Figure 1):
S1 After the specification of the study field, the user (the domain expert) interrogates the web through the graphical interface of our annotation component. This interrogation is based on the concepts of the domain ontology used.
S2 Then, we propose to transfer the returned web pages to the Semantic Radar for an automatic and semantic analysis. This analysis returns, for each page, an RDF file containing the FOAF and SIOC descriptors. The set of RDF files extracted by the Semantic Radar is the input of the annotation process.
S3 Subsequently, the analysed pages are automatically annotated by the method that we propose, in order to produce RDF metadata for each web resource. This method uses equivalence rules and a semantic annotation model that we have defined in Maâlej et al. (2013). The new RDF resource is linked to the original resource on the web and saved in the annotation base.
The semantic annotation method: The purpose of our semantic annotation method is to seek equivalences between the concepts in the result of the analysis made by the Semantic Radar on the web resources and the concepts of the selected enriched domain ontology. The user (the domain expert) starts her/his search with the concepts of the research field, which are structured in the ontology used. To ensure these equivalences, we have defined equivalence rules and a semantic annotation model, with a set of new descriptors created for the semantic annotation process. With these rules and this model, we produce, for each resource, an annotation result as an RDF file that enhances the RDF result generated by the Semantic Radar. The dotted part in Figure 1 shows the main parts of our semantic annotation method for web resources.
S4 Finally, we propose to use an appropriate method, which we intend to define, for filtering web pages. The web pages returned after a user interrogation (expert or not) pass through the filtering system. After querying the annotation base, this system returns the most relevant pages (indexed and non-annotated and/or indexed and annotated) to the user. The filtering method will help to improve the results of web queries.
Figure 1 Synoptic schema of our approach (see online version for colours)
4 The annotation component
The prototype of the component ‘Querying Web’ uses the plug-in ‘Semantic Radar’ to extract semantic descriptors (step 2 of our approach), and ‘Eclipse’ as the development tool to automate the extraction of RDF annotation metadata of web resources (step 3 of the approach). We limit our presentation here to the principal implementation parts of our semantic annotation component, organised in the following five steps. Figure 2 shows the interface of our annotation component.
Figure 2 The graphical interface of our component (see online version for colours)
4.1 Writing a semantic query
As shown above, web querying is done through the interface of our component. The domain expert must first specify the study domain (the right part of the graphical interface, ‘Domain Ontologies’, in Figure 2). Once this domain is specified, we show the user the hierarchy of the ontology to assist him/her in writing the semantic query. In addition, we ask him/her to follow a well-defined syntax when writing this query in order to obtain a correct advanced search on the web (cf. Figure 3).
Figure 3 General syntax of search query
We have chosen to study the field ‘Network of Scientists’ that corresponds to the enriched domain ontology ‘Network of Scientists’ (vivo.owl). All examples apply to this area. VIVO is an open source Semantic Web application that, when populated with researcher interests, activities and accomplishments, enables the discovery of research and scholarship across disciplines (Mitchell et al., 2011).
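As an illustration of the query syntax, the following Python sketch builds a query string of the same shape as the example used in Section 5 (‘Network+of+Scientists’+ConferencePaper). The function name and the exact quoting and joining rules are assumptions inferred from that single example; the authoritative syntax is the one given in Figure 3.

```python
def build_semantic_query(ontology_name: str, concept_label: str) -> str:
    """Join an ontology name and a concept label into one search string.

    Assumed convention (inferred from one example in the paper): the
    ontology name is quoted with its words joined by '+', and the concept
    label is appended after a '+' with its inner spaces removed.
    """
    quoted_name = "'" + "+".join(ontology_name.split()) + "'"
    concept = "".join(concept_label.split())
    return quoted_name + "+" + concept


if __name__ == "__main__":
    print(build_semantic_query("Network of Scientists", "Conference Paper"))
```

A plain ASCII quote is used here in place of the typographic quotes of the printed example.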
4.2 Web querying
After studying the possibilities for interrogating the internet through our annotation component, we noted the need to use a web service to communicate with Google. Google Custom Search API (https://developers.google.com/custom-search/) is an API to retrieve and display results from a Custom Search Engine. With this API, we can use Representational State Transfer (REST) requests (https://developers.google.com/custom-search/jsonapi/v1/using_rest) to obtain the results of web searches, although the number of requests is limited to 100 per day. We can therefore use this API to query the web with Google from our Java application in Eclipse. This API replaces an older API called ‘Google Web Search API’ (https://developers.google.com/web-search/), which has been officially deprecated since November 1, 2010. Figure 4 shows a general interface of the Custom Search API Project, while Figure 5 shows the Custom Search Engine that we use, called ‘Search Google’. This project and this engine are executed in the background for each web search, transparently to the user.
Figure 4 General interface of the custom search API project (see online version for colours)
Figure 5 Custom search engine ‘Search Google’ (see online version for colours)
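To make the REST interrogation concrete, here is a minimal Python sketch of a request to the Custom Search JSON API; it is not the Java code of the component. The endpoint and the parameters `key`, `cx`, `q` and `start` are those of the public API, while the function names, the placeholder credentials and the paging helper are assumptions made for the example.

```python
import json
from typing import Any, Dict, List
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"
LINKS_PER_LIST = 10  # one search list holds ten links


def build_search_url(query: str, api_key: str, engine_id: str,
                     list_index: int = 0) -> str:
    """Build the REST URL for the search list number `list_index` (0-based)."""
    params = {
        "key": api_key,      # API key of the Custom Search API project
        "cx": engine_id,     # identifier of the Custom Search Engine
        "q": query,          # the semantic query written by the user
        "start": 1 + LINKS_PER_LIST * list_index,  # rank of the first result
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_links(response: Dict[str, Any]) -> List[str]:
    """Return the URLs found in a decoded JSON response."""
    return [item["link"] for item in response.get("items", [])]


def fetch_links(query: str, api_key: str, engine_id: str,
                list_index: int = 0) -> List[str]:
    """Send one request (counts against the daily quota) and return its links."""
    with urlopen(build_search_url(query, api_key, engine_id, list_index)) as reply:
        return extract_links(json.load(reply))
```

Calling `fetch_links` requires a valid key and engine identifier; `build_search_url` and `extract_links` can be exercised offline.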
4.3 Definition of a search method
After the query is written, the search is carried out through a connection between the Google Custom Search API and the Custom Search Engine. As a search method, we propose, at each web interrogation, to display the URLs returned by Google (ten links per search list) in the table ‘Table of URL’, together with the number of search lists. Part A of Figure 6 shows the URLs displayed in the Table of URL after interrogation, and part B shows the number of search lists.
In addition, we import with each search the RDF descriptions of the web pages returned in the lists. An internal file, called ‘RDF.txt’, is created automatically to serve as the first source of analysis. This file contains the RDF descriptors extracted by the Semantic Radar only for the pages that have a first level of semantic annotation. Figure 6 shows that our annotation system has imported the RDF description for the resource http://journal.webscience.org/532/; this description represents a first level of annotation by the Semantic Radar. The content of this description is stored in the ‘RDF.txt’ internal file.
Figure 6 Example of a first level of annotation on a web page (see online version for colours)
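The detection of a first level of annotation can be approximated as follows. Semantic Radar is a browser plug-in, so this Python sketch does not reproduce it; it only assumes that a page advertises its FOAF or SIOC description through an RDF autodiscovery `<link>` element of type `application/rdf+xml`. The class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class RdfLinkFinder(HTMLParser):
    """Collect the href of every <link> element that points to RDF/XML data."""

    def __init__(self) -> None:
        super().__init__()
        self.rdf_links: List[str] = []

    def handle_starttag(self, tag: str,
                        attrs: List[Tuple[str, Optional[str]]]) -> None:
        if tag != "link":
            return
        attributes = dict(attrs)
        # Assumption: RDF descriptions are advertised with this MIME type.
        if attributes.get("type") == "application/rdf+xml" and attributes.get("href"):
            self.rdf_links.append(attributes["href"])


def find_rdf_links(html: str) -> List[str]:
    """Return the RDF autodiscovery links of a page (empty if none)."""
    finder = RdfLinkFinder()
    finder.feed(html)
    return finder.rdf_links
```

Under this assumption, a page for which `find_rdf_links` returns an empty list has no first level of annotation and would contribute nothing to ‘RDF.txt’.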
4.4 The generation of annotation metadata
In this step, we produce the RDF annotation metadata for each web resource. The required inputs are ready: the enriched domain ontology (.OWL), the FOAF and SIOC ontologies (.RDF) and the file ‘RDF.txt’. The remaining task is to apply the equivalence rules and the semantic annotation model proposed in Maâlej et al. (2013) to generate an RDF document related to the original resource on the web (cf. Figure 1). We have defined four equivalence rules to apply for the annotation of resources. The equivalences are identified in the relationships between the FOAF concepts of the FOAF ontology and the SIOC concepts of the SIOC ontology (Bojars and Breslin, 2010). As for the defined semantic annotation model, it is based on the annotation of the FOAF (or SIOC) concepts (stored in the internal file ‘RDF.txt’) by the FOAF (or SIOC) concepts of the enriched domain ontology, the domain concepts and/or the FOAF concepts of the FOAF ontology (or the SIOC concepts of the SIOC ontology) (cf. Figure 7). In this model we have defined different annotation descriptors such as ‘Has Child’, ‘IS A’, etc. (cf. Figure 8) (Maâlej et al., 2013).
Figure 7 Application of the defined semantic annotation model (see online version for colours)
Figure 8 General extract of an annotation model based on FOAF and SIOC
Our annotation system annotates the web pages automatically and semantically after interrogation. An RDF annotation file is automatically generated for each web resource. This file enhances the first RDF result produced for the resource by the Semantic Radar (the first level of annotation). Figure 6 shows a first level of annotation for the web resource http://journal.webscience.org/532/. Figure 9 displays the proposed semantic annotation for the same resource, which is automatically generated by modifying the result of the Semantic Radar. In this example, the concept ‘FOAF: Person’ in the result of the Semantic Radar is annotated, using the proposed semantic annotation model, by the following concepts of the enriched domain ontology ‘Network of scientists’: Librarian, EmeritusFaculty, FacultyMember, NonFacultyAcademic, NonAcademic and EmeritusProfessor. In addition, our system enriches this annotation by annotating the same concept with the concepts of the FOAF ontology (Agent).
Figure 9 Example of a generated file of a semantic annotation for a web resource (see online version for colours)
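The annotation of ‘FOAF: Person’ described above can be sketched in Python as a lookup in two small concept tables. The tables only contain the concepts named in the text, and pairing the descriptor ‘Has Child’ with the domain concepts and ‘IS A’ with the FOAF concept Agent is an assumption made for the example; the actual equivalence rules and annotation model are those of Maâlej et al. (2013).

```python
from typing import Dict, List, Tuple

# (annotated concept, annotation descriptor, annotating concept)
Annotation = Tuple[str, str, str]

# Hypothetical extract of the enriched domain ontology 'Network of Scientists'.
DOMAIN_CONCEPTS: Dict[str, List[str]] = {
    "foaf:Person": [
        "Librarian", "EmeritusFaculty", "FacultyMember",
        "NonFacultyAcademic", "NonAcademic", "EmeritusProfessor",
    ],
}

# Hypothetical extract of the FOAF ontology (more general concepts).
FOAF_CONCEPTS: Dict[str, List[str]] = {
    "foaf:Person": ["foaf:Agent"],
}


def annotate_concept(concept: str) -> List[Annotation]:
    """Annotate one concept extracted by the Semantic Radar.

    Assumption: domain concepts are attached with 'Has Child' and FOAF
    concepts with 'IS A'. Unknown concepts yield no annotation.
    """
    annotations = [(concept, "Has Child", child)
                   for child in DOMAIN_CONCEPTS.get(concept, [])]
    annotations += [(concept, "IS A", parent)
                    for parent in FOAF_CONCEPTS.get(concept, [])]
    return annotations


if __name__ == "__main__":
    for line in annotate_concept("foaf:Person"):
        print(line)
```

Each returned triple corresponds to one line of annotation that would be serialised in the generated RDF file.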
We also suggest enriching the annotation of the RDF metadata with degrees of possibility. We have observed that the concepts in the result of the Semantic Radar may be connected to several terms of the ontology, but that these connections may be uncertain. We then obtain a fuzzy semantic annotation of web resources; this work was published in Maâlej et al. (2014). We have therefore suggested assigning weights to the annotations in the metadata, so that each line of annotation is associated with a weight indicating the possibility degree of the annotation concept. These weights are framed by the descriptors and .
4.5 Attachment of the annotation metadata to their original resources
In our approach, the new RDF annotation resource has to be linked to the original resource and released on the web. To achieve this goal, we propose, on the one hand, to create a new annotation database that connects each original resource to its annotation RDF. On the other hand, the user interrogation has to be indexed on both the Google Web server and the new annotation base in order to return the non-annotated web pages as well as the annotated ones (registered in the annotation database). To create the annotation database, we have used the web development platform ‘EasyPHP 5.3.9’ (http://www.easyphp.org/). After the automatic semantic annotation of web resources, each resource must be recorded with its RDF in the annotation database. In our annotation component, there are three cases:
• If the resource is not annotated, the automatic annotation is generated (cf. Figure 9) and the resource is recorded with its annotation automatically.
• If the resource is already annotated by the enriched domain ontology, the message ‘Resource already annotated’ appears (the resource is already stored in the database).
• If the resource is already annotated with one ontology and the current query uses another ontology, an automatic annotation is generated with the new ontology for the same resource and the resource is recorded automatically.
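The three cases above can be summarised by the following Python sketch. It replaces the annotation database with an in-memory dictionary, and the function name, the returned status strings (other than the message quoted in the text) and the `annotate` callback are assumptions made for the example.

```python
from typing import Callable, Dict

# Annotation base: resource URL -> {ontology name -> RDF annotation}
AnnotationBase = Dict[str, Dict[str, str]]


def record_annotation(base: AnnotationBase, url: str, ontology: str,
                      annotate: Callable[[str, str], str]) -> str:
    """Annotate and record a resource according to the three cases."""
    stored = base.setdefault(url, {})
    if ontology in stored:
        # Case 2: already annotated by this enriched domain ontology.
        return "Resource already annotated"
    # Case 1 (no annotation yet) or case 3 (annotated by another ontology):
    # generate the annotation with the current ontology and record it.
    first_annotation = not stored
    stored[ontology] = annotate(url, ontology)
    return "annotated" if first_annotation else "annotated with a new ontology"


if __name__ == "__main__":
    base: AnnotationBase = {}
    fake_annotate = lambda url, onto: f"<rdf for {url} using {onto}>"
    page = "http://journal.webscience.org/532/"
    print(record_annotation(base, page, "vivo.owl", fake_annotate))
    print(record_annotation(base, page, "vivo.owl", fake_annotate))
    print(record_annotation(base, page, "other.owl", fake_annotate))
```

Keying the store by resource and then by ontology keeps one annotation per (resource, ontology) pair, which is what the third case requires.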
5 Experimental study
For this study, we again use the field ‘Network science’, which corresponds to the enriched domain ontology ‘Network of Scientists’ (vivo.owl). Automating the generation of RDF annotations for web resources is beneficial for this area, since it improves the performance of the annotation process. After the proposed annotation of web resources, a new search returns a new sorting of the result: annotated resources appear at the top of the list. We then evaluate the automation achieved by our component ‘Querying Web’. To do so, we present hereafter the percentages of annotation automation for this case study. We take the result of the query formed by the name of the ontology ‘Network of Scientists’ and the keyword ‘Conference Paper’ (‘Network+of+Scientists’+ConferencePaper), according to the general syntax of a query (cf. Figure 3). This query returns ten URLs (ten non-annotated pages) in the first search list. During the annotation process, the system annotates three of these ten pages. We present below the statistical reports for the RDF annotation automation step, using the following four percentages:
• Percentage of the pages correctly annotated: the percentage of the pages annotated automatically and correctly, relative to the web pages returned after interrogation.
$$\text{Percentage of the pages correctly annotated} = \frac{\text{Number of pages correctly annotated}}{\text{Number of web pages}}$$
• Percentage of the pages non-annotated: the percentage of the non-annotated pages, relative to the web pages returned after interrogation.
$$\text{Percentage of the pages non-annotated} = \frac{\text{Number of pages non-annotated}}{\text{Number of web pages}}$$
• Percentage of the pages that should be annotated (annotation missing): the percentage of the pages that must be annotated but have not been annotated, relative to the web pages returned after interrogation.
$$\text{Percentage of the pages that should be annotated} = \frac{\text{Number of the pages that should be annotated}}{\text{Number of web pages}}$$
• Percentage of the unnecessarily annotated pages: the percentage of the pages that have been annotated although they shall not be annotated, relative to the web pages returned after interrogation.
$$\text{Percentage of the unnecessarily annotated pages} = \frac{\text{Number of the pages that shall not be annotated}}{\text{Number of web pages}}$$
We present these percentages for our case in Table 1 shown below:
Table 1 Result of the automation of the semantic annotation
| Percentage of the pages correctly annotated | Percentage of the pages non-annotated | Percentage of the pages that should be annotated | Percentage of the unnecessarily annotated pages |
| 30% | 70% | 0% | 0% |
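The four percentages can be computed from three sets of pages, as in the Python sketch below. The function and argument names are assumptions; in particular, `expected` stands for the pages that a manual inspection says should be annotated, which the paper does not list explicitly. The example data reproduce the case study (ten returned pages, three annotated, all three correctly).

```python
from typing import Dict, Set


def annotation_percentages(returned: Set[str], annotated: Set[str],
                           expected: Set[str]) -> Dict[str, float]:
    """Compute the four percentages relative to the returned pages.

    returned:  pages returned after interrogation
    annotated: pages annotated by the component
    expected:  pages that should be annotated (manual judgement, assumed)
    """
    total = len(returned)
    if total == 0:
        raise ValueError("no web page was returned")
    counts = {
        "correctly annotated": len(annotated & expected),
        "non-annotated": len(returned - annotated),
        "should be annotated": len(expected - annotated),
        "unnecessarily annotated": len(annotated - expected),
    }
    return {name: 100 * count / total for name, count in counts.items()}


if __name__ == "__main__":
    pages = {f"page{i}" for i in range(1, 11)}
    done = {"page1", "page2", "page3"}
    print(annotation_percentages(pages, done, expected=done))
```

With these inputs the function returns 30.0, 70.0, 0.0 and 0.0, matching Table 1.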
To better evaluate this automation for this case study, we use the following standard metrics: recall, precision and F-measure (Van Rijsbergen, 1979).
• The precision (P): the number of correct data found by the programme (i.e., the correctly annotated pages) divided by the total number of data found by the programme (i.e., the annotated pages and the non-annotated pages).
• The recall (R): the number of correct data found by the programme divided by the total number of the real data identified manually.
• The F-measure (F): a metric that combines the two measures precision and recall into a single value:
$$F = \frac{2 \cdot P \cdot R}{P + R}$$
Table 2 shows the automation evaluation of the semantic annotation for the web resources by the component ‘Querying Web’.
Table 2 Result of the automation of the semantic annotation by the standard metrics for the field ‘Network science’
The annotated pages
| Precision: P | Recall: R | F-measure: F |
| 30% | 100% | 46% |
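The values of Table 2 can be checked with a few lines of Python. The sketch follows the definitions given above, where the denominator of the precision is the total number of returned pages (annotated and non-annotated), which differs from the denominator commonly used in information retrieval; the function names are assumptions.

```python
def precision(correct: int, returned: int) -> float:
    """Correctly annotated pages over all returned pages (paper's definition)."""
    return correct / returned


def recall(correct: int, expected: int) -> float:
    """Correctly annotated pages over the pages identified manually."""
    return correct / expected


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall: F = 2PR / (P + R)."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


if __name__ == "__main__":
    p = precision(correct=3, returned=10)   # 0.3
    r = recall(correct=3, expected=3)       # 1.0
    print(f"P={p:.2%} R={r:.2%} F={f_measure(p, r):.2%}")
```

For the case study this gives F = 0.6 / 1.3 ≈ 0.4615, reported as 46%; with five annotated pages out of ten, F = 1.0 / 1.5 ≈ 0.6667, reported as 66%.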
We conclude that the percentage of annotated pages obtained by our component, as measured by the F-measure, is about 46% for this case study (percentage of satisfaction with our component). Throughout the various tests that we have carried out with our component, we found encouraging results. Over one hundred (100) different queries, about 50% of the queries yield between 30% (three annotated pages out of ten) and 50% (five annotated pages out of ten) of annotated pages. To better evaluate this automation, we have also used the standard metrics recall, precision and F-measure (cf. Table 3).
Table 3 Result of the automation of the semantic annotation by the standard metrics
The annotated pages
| Precision: P | Recall: R | F-measure: F |
| 30% | 100% | 46% |
| 50% | 100% | 66% |
| | | 56% |
Thus, we conclude that the percentage of the annotated pages obtained by our component is about 56% (percentage of satisfaction). In the literature, we do not find studies that evaluate the number of web pages annotated by a proposed system compared to the number of non-annotated web pages. However, we find studies of information extraction for the annotation of texts (not web pages) that reach an F-measure of 67.23%, as in Ben Abacha and Zweigenbaum (2010). In addition, the authors of Ben Abacha et al. (2012) evaluated the annotation of an English corpus and obtained a result approximately equal to 50%. Weiser (2010) addresses the automatic semantic annotation of temporal expressions in web pages for an e-tourism application; in this case, the F-measure obtained is 58.9%. In our work, we obtained a percentage of satisfaction with our component (for the annotation of the entire contents of web pages) of about 56%. We believe that this percentage is satisfactory for improving the querying process, especially since it measures the automation of semantic annotation, a long and tedious task, for web resources that are very heterogeneous.
6 Conclusions and further work
In this paper, we have addressed the semantic annotation of web resources in order to improve the interrogation process in a Semantic Web environment. The creation of annotations for web resources is a delicate and difficult task, given the complexity of these resources in terms of their structure and of the languages used. Automating the annotation process is therefore required, and it is our main contribution in this paper. We have proposed an approach to semantic annotation for web resources. This approach helps to improve the interrogation process in a Semantic Web environment. We have then implemented a component to automate this approach, using the plug-in ‘Semantic Radar’ to extract semantic descriptors and ‘Eclipse’ as the development tool to automate the extraction of RDF annotation metadata. The results obtained with our component, which automates the semantic annotation, have shown the feasibility and the benefits of our proposals, despite the difficulty of this automation. This component, entitled ‘Querying Web’, assists domain experts in the annotation of web resources.
The field of study ‘Network of Scientists’ has allowed us to show the importance of the semantic annotation. The evaluation of this annotation by the component ‘Querying Web’ gives an automation percentage of about 46% for this case study. In general, our component gave a percentage of automation between 46% and 66%. This result is satisfactory compared to that achieved in the literature. However, improvements are needed.
In future work, we will propose an extension of the query language ‘SPARQL’ (http://www.w3.org/standards/semanticweb/) in order to query the proposed annotation metadata on web resources. In addition, the use of this language will accelerate the querying of these resources. We will also propose a solution to filter the web pages after interrogation. This task is in progress and will be published later. The filtering method that we are currently developing is based on two activities: an allocation of scores to the web pages (indexed and annotated and/or indexed and non-annotated) and a classification of these pages. Finally, we will seek to improve the fuzzy semantic annotation of web resources using fuzzy ontologies (Maâlej et al., 2010; Ghorbel et al., 2013), which help to remove the imprecision that may exist in the universe of discourse.
References
Ben Abacha, A. and Zweigenbaum, P. (2010) ‘Annotation et interrogation sémantiques de textes médicaux’, Proceedings of Atelier Web Sémantique Médical 2010 à IC 2010, Nîmes, pp.61–70.
Ben Abacha, A., Zweigenbaum, P. and Max, A. (2012) ‘Extraction d’information automatique en domaine médical par projection inter-langue: vers un passage à l’échelle’, Proceedings of TALN, Traitement automatique des langues naturelles, Grenoble, pp.15–28.
Benyahia, K., Lehireche, A. and Latreche, A. (2009) ‘Annotation Sémantique De Pages Web’, Proceedings of the 2nd Conférence Internationale sur l’Informatique et ses Applications-CEUR Workshop Proceedings, Saida, Algeria, 3–4 May, Vol. 547.
Bojars, U. and Breslin, J. (2010) ‘Sioc core ontology specification’, 25 March [online] http://rdfs.org/sioc/spec/ (accessed 28 February 2014).
Bojars, U., Breslin, J., Peristeras, V., Tummarello, G. and Decker, S. (2008) ‘Interlinking the social web with semantics’, Journal of IEEE Intelligent Systems, Vol. 23, No. 3, pp.29–40.
Bojars, U., Passant, A., Giasson, F. and Breslin, J. (2007) ‘An architecture to discover and query decentralized RDF data’, SFSW 2007: Proceedings of the ESWC’07 Workshop on Scripting for the Semantic Web, CEUR Workshop Proceedings, Innsbruck, Austria.
Brickley, D. and Miller, L. (2014) ‘FOAF vocabulary specification’, 14 January [online] http://xmlns.com/foaf/spec/ (accessed 28 February 2014).
Custom Search [online] https://developers.google.com/custom-search/ (accessed 12 March 2014).
EasyPHP [online] http://www.easyphp.org/ (accessed 12 March 2014).
Ghorbel, H., Maâlej, S., Bahri, A. and Bouaziz, R. (2013) ‘A framework for the semi-automatic generation of fuzzy ontologies: Text2FuzzyOnto’, Journal of Technical and Computer Science, Vol. 32, No. 6, pp.671–698.
Google Web Search API [online] https://developers.google.com/web-search/ (accessed 12 March 2014).
Maâlej, S. (2012) ‘Semantic annotation of web resources: state of the art and research perspectives’, Proceedings of INFORSID’12, Montpellier, France, 29–31 May, pp.591–598.
Maâlej, S., Ghorbel, H., Bahri, A. and Bouaziz, R. (2010) ‘Construction of the fuzzy ontological components from the fuzzy semantic data corpus’, Proceedings of INFORSID’10, Marseille, France, 25–28 May, pp.361–376.
Maâlej, S., Jedidi, A. and Bouaziz, R. (2013) ‘Semantic annotation framework for web resources’, Proceedings of ITA, Fifth International Conference on Internet Technologies & Applications, Wrexham, North Wales, UK, 10–13 September, pp.106–113.
Maâlej, S., Jedidi, A. and Bouaziz, R. (2014) ‘Fuzzy semantic annotation of web resources’, Paper presented at the World Symposium on Computer Applications & Research: International Conference on Artificial Intelligence, Sousse, Tunisie, 18–20 January.
Martins, B. and Silva, MJ. (2005) ‘The WebCAT framework-automatic generation of meta-data for web resources’, Proceedings of WI-the 2005 IEEE/WIC/ACM International Conference on Web Intelligence, France, pp.236–242.
Mitchell, S., Chen, S., Ahmed, M., Lowe, B., Markes, P., Rejack, N., Corson-Rikert, J., He, B., Ding, Y. and VIVO collaboration (2011) ‘The VIVO ontology: enabling networking of scientists’, Proceedings of the ACM WebSci’11, Koblenz, Germany, 14–17 June, pp.1–2.
RDFa D. (2010) RDFa Distiller and Parser [online] http://www.w3.org/2007/08/pyRdfa/ (accessed 28 February 2014).
Resource Description Framework (RDF) [online] http://www.w3.org/RDF/ (accessed 12 March 2014).
Semantic Web [online] http://www.w3.org/standards/semanticweb/ (accessed 12 March 2014).
SemanticR (2009) Semantic Radar [online] https://addons.mozilla.org/en-US/firefox/addon/semantic-radar/ (accessed 28 February 2014).
Shchekotykhin, K.M., Jannach, D., Friedrich, G. and Kozeruk, O. (2007) ‘AllRight: automatic ontology instantiation from tabular web documents’, Proceedings of The 6th International Semantic Web Conference and 2nd Asian Semantic Web Conference ISWC/ASWC2007, Busan, South Korea, Vol. 4825 of LNCS, pp.463–476.
Thiam, M. (2010) Annotation Sémantique de Documents Semi-structurés pour la Recherche d’Information, Unpublished PhD thesis, University South Paris, France.
Using REST to Invoke the API [online] https://developers.google.com/custom-search/jsonapi/v1/using_rest (accessed 12 March 2014).
Van Rijsbergen, K. (1979) Information Retrieval, 2nd ed., ISBN 0-408-70929-4, Butterworth-Heinemann, USA.
Weiser, S. (2010) Repérage et typage d’expressions temporelles pour l’annotation sémantique automatique de pages Web-Application au e-tourisme, Unpublished PhD thesis, University Paris Ouest Nanterre, La Défense.