A proposed architecture for a semantic search engine

PSSE: An Architecture For A Personalized Semantic Search Engine

A. M. Riad, Hamdy K. Elminir, Mohamed Abu ElSoud, Sahar F. Sabbeh

A. M. Riad, Head of Information Systems Dept., Faculty of Computers and Information Sciences, Mansoura University, Egypt [email protected]

Hamdy K. Elminir Head of communication dep Misr Academy for Engineering & technology. [email protected]

Mohamed Abu ElSoud Computer Science Dept. Faculty of computers and information sciences. Mansoura university, Egypt [email protected]

Sahar. F. Sabbeh Assistant Lecturer Alzarka Higher institute for administration & computer sciences [email protected]

doi: 10.4156/aiss.vol2.issue1.9

Abstract

Semantic technologies promise a next generation of semantic search engines. General-purpose search engines do not take into consideration the semantic relationships between query terms and other concepts that might be significant to the user. The semantic web vision and its core ontologies are therefore used to overcome this shortcoming. The order in which results are ranked is also important. Moreover, user preferences and interests must be taken into consideration in order to provide the user with a set of personalized results. In this paper we propose an architecture for a Personalized Semantic Search Engine (PSSE). PSSE is a crawler-based search engine that uses multiple crawlers to collect resources from both the semantic web and the traditional web. To reduce processing time, the graph of web pages is clustered, and the clusters are then annotated by document annotation agents that work in parallel. Annotation agents use ontology matching methods to find semantic web resources, as well as information extraction techniques to provide a good description of HTML documents. The system ranks resources by a final score calculated from traditional link analysis, content analysis and a weighted user profile for more personalized results. We believe that merging these techniques enhances search results.

International Journal on Advances in Information Sciences and Service Sciences, Volume 2, Number 1, March 2010

Keywords: Semantic web, Search Engine, Personalization

1. Introduction

The web is composed of an enormous number of documents and the links between them. Current web documents, however, present human-readable content targeted at humans. Yet the web is not used only by humans, as software agents are becoming users of the web too. This has led to the development of the semantic web [1,2]. Information retrieval technology can draw massive benefits from the semantic web vision. Standard retrieval systems usually consider only the query terms, which must match the keywords used in the metadata. These systems do not take into consideration the semantic relationships between query terms and other concepts that might be significant to the user. This necessitated integrating the semantic web vision into traditional retrieval systems, resulting in the notion of semantic search. Search is seen as a key application that can benefit from the semantic web vision to improve recall and precision over traditional information retrieval (IR) techniques. Traditional keyword-based search offers high recall but low precision, so the user faces too many irrelevant results. This is because documents are treated as sets of words and no semantic analysis is carried out. Semantic search enhances traditional search as it allows for retrieval that incorporates the semantics of the underlying terms [3,4].

Users, moreover, usually do not articulate well the terms they want to search for, providing only one or two terms per search. In this context, using ontologies to represent relationships between concepts can improve search results [5-7]. That is why finding and ranking ontologies on the semantic web has been put forward as one of the motivations of the semantic web vision and has been the subject of much research [8-10]. The ontologies found can be used to enhance the search process either statically or dynamically [11]. Researchers have consequently sought to develop a full-featured semantic search engine. Swoogle was the first semantic search engine to be developed. Swoogle employs crawlers to discover RDF documents and HTML documents with embedded RDF content, and exploits these RDF triples to record meaningful metadata about them in its database [12,13]. However, the emergence and wide use of other markup languages such as RDFS and OWL [14] made it necessary to develop search engines that can deal with these languages.

Our analysis of current semantic web search and ranking techniques found that each of them has some limitations and missing components. In addition, an information retrieval system can be personalized using profiles. We therefore considered an architecture that uses multiple crawlers to provide crawling services for both the semantic and the traditional web. The system uses multiple agents to annotate web documents in parallel, making use of information extraction techniques and an ontology. The ontology is also used to expand the user query at search time to provide enhanced search results. Finally, a ranking score is assigned to documents. This score is calculated from a combination of link-based analysis, content-based analysis and a personalization factor (hereafter referred to as PF) calculated for each user for more personalized results.

We begin this paper with a description of the main components of the proposed architecture in section 2. A detailed description of these components and their functionality is presented in section 3. Our conclusion and future work are presented in section 4.

2. PSSE Architecture

As Fig.1 depicts, the processes of PSSE are separated into an offline part and an online part. The offline part includes the crawling and preprocessing processes. The online part includes query processing and result ranking. The main components of each part are shown in Fig.1.

2.1. Offline Phase

In this phase, crawling the World Wide Web and preprocessing of the crawled pages take place.

2.1.1. Crawler

PSSE uses multiple crawlers (web spiders) that traverse the World Wide Web, collect web resources and store them in a database. Crawlers work with the aid of information extraction techniques to find link information in the retrieved pages. More details about crawlers and the crawling process can be found in section 3.1.1.

2.1.2. Preprocessor

The preprocessor maintains the resources that are downloaded from web sites. The main task of the indexer and link analyzer is to cluster the crawled web documents to enable parallel processing. This is done in three steps: first, the indexer and link analyzer builds a graph of the crawled pages; link analysis is then performed to calculate the authoritativeness of web pages; and finally the graph is clustered by identifying its connected components. These clusters are then annotated by annotation agents that work in parallel to reduce processing time. Afterwards, the term relevancy evaluator weights the annotations to determine their relevancy to the web resource.

[Figure 1 components. Offline Phase: Crawler, Crawled documents, Preprocessor, Indexer & link analyzer, Clustered documents, Annotation, Term Importance Evaluator, Annotated Documents, Ontology. Online Phase: Ontology, Searcher, Query Analyzer, Search Agent, User Log, Ranking Module.]

Figure 1. PSSE Architecture

2.2. Online Phase

2.2.1. User interface

The system comes with an easy-to-use, Google-like search interface. After the user submits a query, the results are displayed.

2.2.2. Searcher

This component is responsible for searching and retrieving relevant results. First, the query analyzer performs mapping of query terms as well as query expansion using an ontology. This component is also responsible for maintaining the user log and keeping track of the user's search history. Afterwards, the search agent retrieves relevant results from the resources database. Retrieved results are then passed to the ranking module to be ranked.

2.2.3. Ranking module

This module is responsible for ranking the retrieved results. Three factors contribute to the score. The first is page authoritativeness, which is calculated during the preprocessing phase using link analysis techniques. The second is the relevancy of the resource content to the query terms, which depends on content analysis. The third is the personalization factor (PF), which supports tailoring results to the user's interests and preferences. The PF is calculated by analyzing the user's log file: analyzing the user's search history yields a value that represents the user's interest in a particular query term. The final ranking score is the combination of these three factors.

3. Personalized Semantic Search Engine (PSSE)

In this section, we provide a detailed description of the search process performed by PSSE. The process of semantically searching and ranking user results can be logically divided into two phases, which we refer to as the offline and online phases.

3.1. Offline Phase

This phase includes crawling the web and preprocessing the crawled pages. These processes can be outlined as follows.

3.1.1. Crawling Stage

Crawling is the very first step in developing a search engine, as the content of search engines is fed by crawlers. Owing to its importance, the crawling process has been studied extensively, and many crawling strategies have been proposed and investigated [15,16]. Moreover, the endeavors to build semantic search engines triggered a particular interest in crawling, as an ontology can be used to semantically enhance the crawling process so as to provide focused crawling [17,18]. During crawling, crawlers (robots) traverse the web to collect web resources. There is, however, a far-reaching difference between crawling traditional HTML web documents and crawling semantic web resources. Crawlers go through a stage of URL extraction: HTML crawlers extract links from HTML pages in order to find additional sources to crawl. This mechanism usually does not work for semantic resources, as they have no direct concept of a hyperlink. That is why our system uses multiple crawlers that can traverse both the traditional and the semantic web.

[Figure 2 components: Gatherer, URL Extractor, URL Filter, URL Queue, Database manager, Crawled documents]

Figure 2. Crawling process
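The crawling loop of Fig.2 can be illustrated with a minimal breadth-first sketch. It is not the authors' implementation: the in-memory `TOY_WEB` dictionary, the function names and the single-threaded loop are assumptions made for illustration, and a real crawler would fetch pages over HTTP and run several such loops in parallel.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> outgoing links found on that page.
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}

def extract_urls(url, web):
    """URL Extractor: return the links found on the page at `url`."""
    return web.get(url, [])

def crawl(seed, web):
    """Breadth-first crawl from `seed`; returns URLs in visit order."""
    visited = {seed}       # URL Filter state: URLs already seen
    queue = deque([seed])  # URL Queue
    order = []             # stands in for the crawled-documents database
    while queue:
        url = queue.popleft()            # Gatherer takes the next URL
        order.append(url)
        for link in extract_urls(url, web):
            if link not in visited:      # filter out already-visited URLs
                visited.add(link)
                queue.append(link)
    return order

print(crawl("a", TOY_WEB))  # ['a', 'b', 'c', 'd']
```

A first-in, first-out queue is what makes the traversal breadth-first: pages one link away from the seed are all visited before pages two links away.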

As can be seen in Fig.2, crawls are initiated within the Gatherer. The Gatherer receives a URL to be crawled and passes it to the URL Extractor, which in turn extracts the URLs from the page. The URL Filter then checks whether the URLs have already been visited, stores them in the database, associates related URLs, and queues them for further examination. Our crawling algorithm is the standard breadth-first search algorithm, as it is known for its high-quality results [19]. Moreover, in order to scale, crawling is performed in parallel. The crawled pages form the input of the preprocessing stage.

3.1.2. Preprocessing Stage

This stage covers the preprocessing of the crawled documents. To make preprocessing take less time, the indexer first builds a graph of all crawled pages. The graph is then clustered using the conventional connected components algorithm [20]. Clustering resources makes it easy for the annotation agents to work in parallel afterwards on the clustered sets, and also makes it easy to calculate the authoritativeness of pages. The clustering algorithm used is shown in Fig.3.
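The clustering step can be sketched as follows, assuming the page graph is given as a list of page identifiers and a list of undirected links. The function name and the toy data are hypothetical, and the sketch uses an explicit stack rather than recursion so that large graphs do not hit Python's recursion limit; the clusters it produces are the same.

```python
def connected_components(vertices, edges):
    """Assign each vertex of an undirected graph to a cluster id."""
    adjacency = {v: set() for v in vertices}
    for v, w in edges:
        adjacency[v].add(w)
        adjacency[w].add(v)

    cluster_of = {}
    k = 0
    for start in vertices:
        if start in cluster_of:
            continue  # already reached from an earlier start vertex
        # Depth-first search from `start`, labelling every reachable vertex k
        stack = [start]
        cluster_of[start] = k
        while stack:
            v = stack.pop()
            for w in adjacency[v]:
                if w not in cluster_of:
                    cluster_of[w] = k
                    stack.append(w)
        k += 1
    return cluster_of

pages = ["p1", "p2", "p3", "p4", "p5"]
links = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]
clusters = connected_components(pages, links)
print(clusters)  # {'p1': 0, 'p2': 0, 'p3': 0, 'p4': 1, 'p5': 1}
```

Each resulting cluster can then be handed to a separate annotation agent, since no link crosses cluster boundaries.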

Figure 3. Connected Components algorithm

As can be seen, the connected components algorithm takes the pages' graph G(V,E) as input. The pages' graph is an undirected graph composed of a set of nodes/vertices (V) and a set of edges (E). The algorithm uses a depth-first search that recursively visits the graph vertices v and, if v has not been visited before, assigns it to a cluster k. Afterwards, starting from v, it visits all connected nodes w ∈ V for which there exists an edge (v, w) ∈ E. Finally, the output is the cluster structure L, with each node v assigned to a cluster L(k). The resulting clusters are then processed using link analysis techniques to calculate the authoritativeness of each web document. We follow the typical PageRank algorithm [21-23]; PR(A) is defined as in equation 1:

$$PR(A) = (1 - d) + d \sum_{i=1}^{n} \frac{PR(T_i)}{C(T_i)} \qquad (1)$$

Where:
• PR(A) is the PageRank of page A,
• PR(T_i) is the PageRank of the pages T_i which link to page A,
• d is a damping factor which is usually set between 0 and 1,
• C(T_i) is the number of outbound links on page T_i.

As can be inferred from equation 1, PageRank does not rank web sites as a whole but is determined for each page independently. Further, the PageRank of page A is recursively defined by the PageRanks of all pages that point to page A. The PageRank of a page T_i is always weighted by the number of outbound links C(T_i) on page T_i. The weighted PageRanks of pages T_1 to T_n are then summed up. Finally, the sum of the weighted PageRanks of all pages T_i is multiplied by the damping factor d.

After link analysis is performed, explicit, non-embedded annotations are assigned to documents by annotation agents. Using multiple agents that work in parallel helps to speed up the entire process even more. Since our system aims at a general querying infrastructure, we need to extract information from the files and transform it into a structured representation. In this context, some of the available semantic search engines settled for annotating resources, while others built an RDF graph to which resources are submitted [24,25]. Ideally, using an ontology and ontology matching techniques in this phase will provide better annotation of resources. Traditional HTML documents are annotated using information extraction techniques with the aid of an ontology, whereas semantic resources are annotated using an ontology as well as ontology matching techniques. Ontology matching techniques are used to overcome the differences that may exist when mapping the formerly annotated resources into our structured representation [26].
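Because equation 1 defines PR(A) in terms of the PageRanks of other pages, it is usually evaluated by repeated substitution until the values settle. The sketch below illustrates this on a toy link graph; the damping factor `d = 0.85`, the fixed 50 iterations, the initial value of 1.0 and the function name are assumptions for illustration and are not taken from the paper.

```python
def pagerank(out_links, d=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over pages T linking to A.

    out_links maps each page to the list of pages it links to.
    """
    pages = list(out_links)
    pr = {p: 1.0 for p in pages}  # assumed starting value for every page
    for _ in range(iterations):
        new_pr = {}
        for a in pages:
            # Sum PR(T) / C(T) over every page T with a link to `a`
            incoming = sum(
                pr[t] / len(out_links[t])
                for t in pages
                if a in out_links[t]
            )
            new_pr[a] = (1 - d) + d * incoming
        pr = new_pr
    return pr

toy = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy)
for page, value in sorted(ranks.items()):
    print(page, round(value, 3))
```

In this toy graph page C receives links from both A and B, so it ends up with a higher score than B, which is linked only from A.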

[Figure 4 components: Annotation agents, Ontology, Clustered documents, Annotated resources. Term relevancy scoring: for each d ∈ R, for each t ∈ T(d), calculate TF-IDF, update database.]

Figure 4. TF-IDF algorithm

Subsequently, annotations are assigned weights calculated from their relevancy to the document. Our system uses the vector space model [27-30], namely TF-IDF [31], to represent documents as weighted terms. Each term weight combines two factors. The first is the term frequency tf_ij, the number of times term T_i appears in document A_j. The second is the document frequency n, the number of documents in which term T_i occurs. The inverse document frequency of T_i in the collection is log_2(N/n), where N is the number of documents in the collection. The term weight w_ij is the product of the two, as in equation 2. Each user query is also represented by a weighted term vector, so the similarity between a query and a document can be calculated from the two vectors using the cosine function.

$$w_{ij} = tf_{ij} \cdot \log_2 \frac{N}{n} \qquad (2)$$

Where:
• w_ij = weight of term T_i in document A_j
• tf_ij = frequency of term T_i in document A_j
• N = number of documents in the collection
• n = number of documents in which term T_i occurs at least once

As shown in Fig.4, annotation agents take clustered documents as input. Agents work with the aid of an ontology, making use of information extraction and ontology matching techniques to generate annotated documents. Annotations are then assigned weights that indicate their relevancy to the document's content. The vector space model takes a set of resources (R) as input. For each document (d ∈ R), it gets the associated annotation terms T(d). For each term (t ∈ T(d)), it calculates the TF-IDF weight and stores this weight in the database.
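The weighting loop of Fig.4 and equation 2 can be sketched as below. The annotation lists, the function names and the dictionary that stands in for the database are hypothetical; the cosine helper shows one common way to compare two weighted term vectors and is not claimed to be the authors' exact procedure.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """docs: {doc_id: [term, ...]} -> {doc_id: {term: w}} with w = tf * log2(N / n)."""
    n_docs = len(docs)                  # N in equation 2
    doc_freq = Counter()                # n per term
    for terms in docs.values():
        doc_freq.update(set(terms))     # count each term once per document
    weights = {}
    for doc_id, terms in docs.items():
        tf = Counter(terms)
        weights[doc_id] = {
            t: tf[t] * math.log2(n_docs / doc_freq[t]) for t in tf
        }
    return weights

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical annotation terms T(d) for four documents
annotations = {
    "d1": ["search", "engine", "search"],
    "d2": ["semantic", "web"],
    "d3": ["search", "semantic"],
    "d4": ["ontology"],
}
w = tfidf_weights(annotations)
print(w["d1"])  # {'search': 2.0, 'engine': 2.0}
```

Here "search" occurs twice in d1 and in 2 of the 4 documents, giving 2 · log_2(4/2) = 2.0, while "engine" occurs once and in 1 document only, giving 1 · log_2(4/1) = 2.0.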

3.2. Online Phase

During this phase, the actual searching process takes place: the system receives the user query, processes it, retrieves results, ranks them and displays them to the user. The whole process involves two stages, as follows.

3.2.1. Searching stage

In this stage, as shown in Fig.5, the user query is processed. When the user first enters a query, the query analyzer performs term mapping using traditional text processing and natural language processing techniques. Additionally, the system updates and maintains a user log so that its information can be used during the ranking stage to provide personalized search results.

[Figure 5 components: Search Agent, Term mapping & NLP, Maintain user log, Query Expansion, Ontology, Annotated Resources, User Log, Unranked relevant results]

Figure 5. Searching Phase

An ontology makes it possible to contextualize the user's search terms by associating concepts and properties around a specific domain. An ontology can be used manually [32] or in the form of ontological analysis that results in suggestions the user can use to refine the query [33,34]. A domain ontology can also help in filtering results that match user queries [35,36]. Automating the whole process without any user intervention and making it transparent is a goal of much research [37-39]. Taxonomies can also be used together with an ontology to provide better document annotation [40].
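As an illustration of automatic, ontology-based query expansion, the sketch below adds the synonyms, sub-concepts and super-concepts of each query term, which are the three kinds of related terms PSSE uses. The toy ontology, its dictionary layout and the function name are assumptions; a real system would query an RDFS or OWL ontology instead.

```python
# Hypothetical toy ontology: concept -> related terms by relation type.
TOY_ONTOLOGY = {
    "car": {
        "synonyms": ["automobile"],
        "sub": ["sedan", "hatchback"],
        "super": ["vehicle"],
    },
    "engine": {
        "synonyms": ["motor"],
        "sub": [],
        "super": ["component"],
    },
}

def expand_query(terms, ontology):
    """Return the original terms followed by related ontology terms, no duplicates."""
    expanded = list(dict.fromkeys(terms))  # keep order, drop duplicates
    seen = set(expanded)
    for term in terms:
        entry = ontology.get(term)
        if entry is None:
            continue  # term not in the ontology: leave it unexpanded
        for relation in ("synonyms", "sub", "super"):
            for related in entry[relation]:
                if related not in seen:
                    seen.add(related)
                    expanded.append(related)
    return expanded

print(expand_query(["car", "price"], TOY_ONTOLOGY))
# ['car', 'price', 'automobile', 'sedan', 'hatchback', 'vehicle']
```

Keeping the original terms first makes it easy to weight them more heavily than the added terms, should the ranking need to distinguish the two.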


In PSSE, the query is expanded, using the ontology, with terms that might be relevant to the user query. In this step, term synonyms, sub-concepts and super-concepts are added to improve retrieval performance. Afterwards, the search agent retrieves unranked relevant resources and filters them in order to be ranked.

3.2.2. Ranking stage

Ranking is considered a key function of any search engine. Current search engines usually use link analysis techniques, such as PageRank [21-23] and its extensions in [41,42], to classify result relevance. Another point of view, however, criticizes link analysis techniques for not paying attention to the document's content, which is why many researchers have used the vector space model, regarding content-based ranking as a better ranking methodology [27-30]. At the level of semantic ranking, XSEarch [43] and XRANK [44] were attempts to rank XML elements. In PSSE, results are ranked according to a final score that combines three factors. The first factor is page authoritativeness, which is calculated using link analysis techniques, namely the PageRank algorithm. The authoritativeness value is calculated during the preprocessing phase. The second factor is content relevancy: query terms, together with the weighted annotations, are used to calculate the relevancy of the query to each document individually. The third factor uses the user's interests and search history to provide more personalized search results. To provide personalized results, a user profile must be created and maintained using information that can be collected implicitly by monitoring user behavior, or through explicit user input or feedback [45-47]. PSSE maintains a user log that contains the user's usage data and search history. The data in the user's log are then used during ranking to calculate the personalization factor, which together with the previous factors forms the final score according to which results are ranked.

During the ranking stage, weights are assigned to terms by analyzing the user log and usage data against the query terms. The frequencies assigned to profile keywords are significant since they express the degree of the user's interest. The weighting step starts from these frequencies to calculate profile query term weights. The weights of the initial query terms are calculated by finding the highest frequency and dividing each frequency by it. The personalization factor (PF) determines the degree of the user's interest in a certain query term and is calculated according to equation 3.

$$PF(j,u) = \frac{\sum s_j(u)}{\sum s_k(u)} \qquad (3)$$

Where:
• s_j(u) is the frequency of term j in the user's search history.
• s_k(u) is the total number of terms that appear in the user's search history.

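Finally, the ranking module calculates the final score using the weights calculated from link analysis, the weighted annotations and PF, as in equations 4 and 5:

$$sem(i,j,u) = W_{i,j} + PF(j,u) \qquad (4)$$

$$Score(i,q,u) = \sum_{j \in q} sem(i,j,u) + PR(A_i) \qquad (5)$$

Where:
• sem(i,j,u) is the similarity between document i and query term j for user u.
• Score(i,q,u) is the final weight assigned to document i against query q for user u.
• W_{i,j} is the TF-IDF weight (equation 2) of term j in document i, PF(j,u) is the personalization factor (equation 3) and PR(A_i) is the PageRank (equation 1) of document i.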
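A short worked sketch shows how equations 3 to 5 combine. It reads equation 3 as the frequency of a term in the user's search history divided by the total number of terms in that history, which is one interpretation of the definitions above; the function names, the example history and the weights are hypothetical.

```python
from collections import Counter

def personalization_factor(term, history):
    """Equation 3 (as interpreted here): term frequency in the user's
    search history divided by the total number of terms in that history."""
    counts = Counter(history)
    total = sum(counts.values())
    return counts[term] / total if total else 0.0

def score(doc_weights, page_rank, query, history):
    """Equations 4 and 5: sum over query terms of (W + PF), plus PageRank."""
    sem_total = sum(
        doc_weights.get(term, 0.0) + personalization_factor(term, history)
        for term in query
    )
    return sem_total + page_rank

history = ["python", "search", "python", "ontology"]  # hypothetical user log terms
doc_weights = {"python": 2.0, "search": 1.0}          # hypothetical TF-IDF weights
print(personalization_factor("python", history))      # 0.5
print(score(doc_weights, 1.5, ["python", "search"], history))  # 5.25
```

For this user, "python" makes up 2 of 4 history terms (PF = 0.5) and "search" 1 of 4 (PF = 0.25), so the score is (2.0 + 0.5) + (1.0 + 0.25) + 1.5 = 5.25. Because the three factors are simply added, their relative scales determine how much each one influences the final order.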


The ranking module then passes the results back to the search agent, which in turn passes them to the user interface.

4. Conclusion

In this paper, we have presented a general framework for a Personalized Semantic Search Engine (PSSE). PSSE is a crawler-based search engine in which multiple crawlers work in parallel to traverse both the traditional and the semantic web. Additionally, user interests and preferences are automatically learned from web usage data and integrated with page authoritativeness and content relevancy to rank the final results. We think that parallel processing during data preprocessing reduces the required time. Furthermore, taking resource authoritativeness and content into account, together with user preferences, enhances the final results and increases user satisfaction.

5. References

[1] N. R. Shadbolt, W. Hall, and T. Berners-Lee, "The semantic web: Revisited," IEEE Intelligent Systems, vol. 21, issue 3, pp. 96-101, May 2006.
[2] T. Berners-Lee, J. Hendler, and O. Lassila, "The semantic web," Scientific American, vol. 284(5), May 2001.
[3] R. Guha, R. McCool, and E. Miller, "Semantic Search," in Proceedings of the 12th International Conference on World Wide Web, pp. 700-709, 2003.
[4] T. Finin, J. Mayfield, C. Fink, A. Joshi, and R. S. Cost, "Information retrieval and the semantic web," in Proceedings of the 38th International Conference on System Sciences, Hawaii, USA, 2005.
[5] E. Mäkelä, "Survey of semantic search research," in Proceedings of the Seminar on Knowledge Management on the Semantic Web, Department of Computer Science, University of Helsinki, 2005.
[6] Ramprakash, S. K. Malik, N. Prakash, S. Rizvi, "A Comparative Study of Different Types of Search Engines in Context of Semantic Web," National Conference on Advancements in Information & Communication Technology (NCAICT), March 15-16, 2008.
[7] W. A. Pinheiro, A. Maria, C. Moura, "Semantic Search in Portals using Ontologies," I Workshop de Web Semântica (WWS2004), Brasília, October 22, 2004.
[8] C. Patel, K. Supekar, Y. Lee, and E. Park, "Ontokhoj: A semantic web portal for ontology searching, ranking and classification," in Proc. of ACM 5th International Workshop on Web Information and Data Management (WIDM), New Orleans, pp. 58-61, 2003.
[9] E. Thomas, J. Z. Pan, D. H. Sleeman, "ONTOSEARCH2: Searching Ontologies Semantically," in Proc. of OWL Experiences and Directions Workshop, 2007.
[10] H. Alani, N. Noy, N. Shah, N. Shadbolt, M. Musen, "Searching Ontologies Based on Content: Experiments in the Biomedical Domain," in 4th International Conference on Knowledge Capture, ACM Press, pp. 55-62, 2007.
[11] D. Taibi, M. Gentile, L. Seta, "A semantic search engine for learning resources," Third International Conference on Multimedia and Information & Communication Technologies in Education, 2005.
[12] L. Ding, T. Finin, A. Joshi, R. Pan, R. S. Cost, Y. Peng, P. Reddivari, V. C. Doshi, and J. Sachs, "Swoogle: A semantic web search and metadata engine for the semantic web," in Proc. 13th ACM Conf. on Information and Knowledge Management, Nov. 2004.
[13] T. Finin et al., "Swoogle: Searching for knowledge on the Semantic Web," in Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI 05), 2005.
[14] S. Decker, S. Melnik, F. van Harmelen, D. Fensel, M. Klein, J. Broekstra, M. Erdmann, I. Horrocks, "The Semantic Web: the roles of XML and RDF," IEEE Internet Computing, vol. 4, issue 5, pp. 63-73, 1089-7801, 2002.
[15] J. L. Wolf, M. S. Squillante, P. S. Yu, J. Sethuraman, L. Ozsen, "Optimal crawling strategies for web search engines," in Proceedings of the 11th International Conference on World Wide Web, May 07-11, Honolulu, Hawaii, USA, 2002.


[16] C. Castillo, "Effective web crawling," SIGIR Forum, ACM Press, vol. 39, no. 1, New York, NY, USA, pp. 55-56, 2005.
[17] W. Buntine, K. Valtonen, M. Taylor, "The ALVIS document model for a semantic search engine," in Association for Computational Linguistics, editor, 2nd Annual European Semantic Web Conference, 2005.
[18] A. Ardo, "Focused crawling in the ALVIS semantic search engine," in Proceedings of European Semantic Web Conference (ESWC), 2005.
[19] M. Najork, J. L. Wiener, "Breadth-First Crawling Yields High-Quality Pages," in Proceedings of the Tenth International World Wide Web Conference, pp. 114-118, May 2001.
[20] T. H. Cormen, C. E. Leiserson, R. L. Rivest, C. Stein, "Introduction to Algorithms, Second Edition," MIT Press, 2001.
[21] C. Ridings, M. Shishigin, "PageRank Uncovered," http://www.voelspriet2.nl/PageRank.pdf, (retrieved May 2008), 2002.
[22] A. N. Langville, C. D. Meyer, "Deeper Inside PageRank," Internet Math., vol. 1, no. 3, pp. 335-380, 2003.
[23] C. Benincasa, A. Calden, E. Hanlon, M. Kindzerske, K. Law, E. Lam, J. Rhoades, I. Roy, M. Satz, E. Valentine and N. Whitaker, "Page Rank Algorithm," http://www.math.umass.edu/~law/Research/PageRank/Google.pdf, (retrieved May 2008), 2006.
[24] B. Popov, A. Kiryakov, I. Kitchukov, K. Angelov, D. Kozhuharov, "Co-occurrence and ranking of entities based on semantic annotation," IJMSO 3(1), pp. 21-36, 2008.
[25] A. Harth, A. Hogan, J. Umbrich, S. Decker, "Building a Semantic Web Search Engine: Challenges and Solutions," in Proceedings of 3rd XTech Conference, Dublin, Ireland, 2008.
[26] J. Euzenat, P. Shvaiko, "Ontology matching," Springer, Berlin Heidelberg, DE, 2007.
[27] M. Jones and H. Alani, "Content-based Ontology Ranking," http://protege.stanford.edu/conference/2006/submissions/abstracts/11.1_Alani_Harith_protege06final.pdf, (retrieved May 2008), 2006.
[28] V. H. Tuulos, "Design and Implementation of a Content-Based Search Engine," http://www.cs.helsinki.fi/u/tuulos/tuulos-thesis.pdf, (retrieved May 2008), 2007.
[29] J. Pokorny and J. Smizansky, "Page Content Rank: An Approach To The Web Content Mining," http://www.ksi.mff.cuni.cz/~pokorny/papers/IADIS-AP05.pdf, (retrieved May 2008), 2005.
[30] V. Chellappa, "Content-Based Searching with Relevance Ranking for Learning Objects," PhD dissertation, Univ. of Kansas, 2004.
[31] J. Ramos, "Using TF-IDF to Determine Word Relevance in Document Queries," First International Conference on Machine Learning, 2003.
[32] W. A. Pinheiro, A. M. C. Moura, "Semantic Search in Portals using Ontologies," I Workshop de Web Semântica (WWS2004), 2004.
[33] I. Celino, E. D. Valle, D. Cerizza, A. Turati, "Squiggle: a Semantic Search Engine for Indexing and Retrieval of Multimedia Content," SEMPS 2006.
[34] A. Duke, T. Glover, J. Davies, "Squirrel: An Advanced Semantic Search and Browse Facility," ESWC 2007, pp. 341-355.
[35] R. Ramachandran, S. Movva, S. Graves, "Ontology-based Semantic Search Tool for Atmospheric Science," 22nd International Conference on Interactive Information Processing Systems (IIPS), 86th American Meteorological Society Annual Meeting, Atlanta, Georgia, 2006.
[36] S. Movva, R. Ramachandran, X. Li, P. Cherukuri, S. Graves, "Noesis: A Semantic Search Engine and Resource Aggregator for Atmospheric Science," NSTC2007.
[37] Y. Lei, V. S. Uren, E. Motta, "SemSearch: a search engine for the semantic web," in Proceedings EKAW 2006, Managing Knowledge in a World of Networks, pp. 238-245, 2006.
[38] M. Buranarach, M. B. Spring, "Metadata and Semantics: a Case Study on Semantic Searching in Web System," Proceedings of the IFIP International Conference on Research and Practical Issues of Enterprise Information Systems (CONFENIS 2006), Vienna, Austria, 2006.
[39] Q. Zhou, C. Wang, M. Xiong, H. Wang, Y. Yu, "SPARK: Adapting keyword query to semantic search," in Proceedings of the Sixth International Semantic Web Conference (ISWC), pp. 694-707, 2007.
[40] I. Hochstatter, M. Duergner, M. Krause, "A Context Middleware Using an Ontology-Based Information Model," EUNICE 2007, pp. 17-24, 2007.


[41] M. Holi, E. Hyvönen, P. Lindgren, "Integrating tf-idf Weighting with Fuzzy View-Based Search," Proceedings of the ECAI Workshop on Text-Based Information Retrieval (TIR-06), Riva del Garda, Italy, Aug. 2006.
[42] M. Holi and E. Hyvönen, "Fuzzy View-Based Semantic Search," Proceedings of the 1st Asian Semantic Web Conference (ASWC2006), Beijing, China, Springer-Verlag, September 3-7, 2006.
[43] S. Cohen, J. Mamou, Y. Kanza, Y. Sagiv, "XSEarch: A Semantic Search Engine for XML," VLDB, 2003.
[44] L. Guo, F. Shao, C. Botev, J. Shanmugasundaram, "XRANK: Ranked keyword search over XML documents," in Proceedings of the ACM SIGMOD International Conference on Management of Data, 2003.
[45] A. Sieg, B. Mobasher, R. Burke, "Learning Ontology-Based User Profiles: A Semantic Approach to Personalized Web Search," IEEE Intelligent Informatics Bulletin, vol. 8, no. 1, 2007.
[46] A. Sieg, B. Mobasher, R. Burke, "Web Search Personalization with Ontological User Profiles," Proceedings of the 16th ACM Conference on Information and Knowledge Management (CIKM 2007), 2007.
[47] M. Dudev, S. Elbassuoni, J. Luxenburger, M. Ramanath, G. Weikum, "Personalizing the Search for Knowledge," 2nd International Workshop on Personalized Access, Profile Management, and Context Awareness: Databases (PersDB'08), 2008.

A. M. Riad - Head of the Information Systems department, Faculty of Computers and Information Systems, Mansoura University. He graduated from the electrical engineering department of Mansoura University in 1982, and obtained his master's degree in 1988 and his doctoral degree in 1992. His main research interests currently are intelligent information systems and e-Learning.

Hamdy K. Elminir was born in El-Mahala, Egypt in 1968. He received the B.Sc. in Engineering from Monofia University in 1991 and completed his master's degree in automatic control systems in 1996. He obtained his PhD degree from the Czech Technical University in Prague in 2001. Currently he is an associate professor and the head of the communication department, Misr Academy for Engineering, Mansoura, Egypt.

Sahar F. Sabbeh was born in Damietta, Egypt in 1982. She received the B.Sc. in Information Systems from Mansoura University, Egypt in 2003 and completed her master's degree in Information Systems in 2008. Currently she is an assistant lecturer at Alzarka Higher Institute for Computer and Administration Sciences, Damietta, Egypt.
