Converting Web Applications into Standard XML Web Services: Two Case Studies
Natheer Khasawneh
Mohammed A. Shatnawi, Mohammad Fraiwan
Department of Software Engineering Jordan University of Science and Technology Irbid, JORDAN [email protected]
Department of Computer Engineering Jordan University of Science and Technology Irbid, JORDAN [email protected], [email protected]
Abstract: The Internet contains a tremendous number of valuable web applications that can be used in many systems. To use these applications from other systems, the interaction needs to be in a standard structured format such as an XML web service. In this paper, we present a method to convert existing web applications into standard XML web services. The system design and implementation are presented. We applied the proposed system to two test cases: the Jordan University of Science and Technology (JUST) course online schedule and the Wiley product search engine.
Keywords: web services, web mining, contents extraction, information integration, automated submission.
978-1-4244-8136-1/10/$26.00 © 2010 IEEE
I. INTRODUCTION
A huge number of applications are available online these days. Typically, the only way to use these applications is through a web browser and human interaction. Using web applications with limited human intervention is the aim of the next generation of web standards, known as the semantic web (Web 3.0). For example, it is easy to manually retrieve a list of courses from the JUST course online schedule, but it is difficult to build a system that automatically extracts course information by executing the online form. To do this, an end-user would have to manually extract the desired information or implement special-purpose software to handle the same job. Most existing technologies rely on human intervention, which is often a completely manual process. Moreover, manual or static technology that depends on human interaction is difficult, time consuming, and requires extra effort. In this paper, we present a method to convert existing web applications into a standard XML web service, which makes these applications easily accessible from other systems. To achieve this purpose, we propose a flexible and generic web service that can easily access web contents and return the result in a structured Simple Object Access Protocol (SOAP) format. Using web services would simplify and generalize the extraction process and standardize the communication message format for end-users through the use of Web Service Description Language (WSDL) and SOAP messages. Hence, the web application functionality can be distributed efficiently over the World Wide Web.
The rest of this paper is organized as follows. In Section 2 we give an overview of the related work. In Section 3 we explain our proposed system design in detail. The application of the system to the two case studies is presented in Section 4. Finally, we conclude and outline future work in Section 5.
II. RELATED WORK
Since few websites currently provide remote service functionalities [4], many implementations and studies have been developed to integrate web applications by accessing web resources regardless of whether the websites provide a remote service. Scrapbook [6] is one example: it only extracts text and clips the web page into a limited page; however, it does not deal with dynamic pages. Pollock [8] is another system that can create a virtual web service from a query interface, but users still need to parse the returned HTML document. The authors in [5] proposed a hierarchical model that integrates all features extracted from web pages and learns their importance. The proposed system is a template-independent model that extracts all fields from a web page and searches for the best field that may contain the desired data specified by end-users. In [4], the authors developed an end-user programming tool called Marmite that combines access to web page contents and services. Marmite is currently implemented as a plug-in using JavaScript and XML User Interface Language (XUL) in the Firefox web browser and cannot be called remotely by end-users. In [2], the authors presented a method to integrate different web pages for personal use. They implemented a system that can integrate ordinary static HTML pages and dynamic pages whose contents are generated by client-side scripts. The same methodology as in [3] is adopted for the proposed system. However, the response in [2] is reformatted through a user-defined language called WACDL (Web Application Contents Description Language), which is an XML-based language that describes web content, scopes of target contents and the desired information to be fetched.
III. PROPOSED SYSTEM
In this paper, we propose a system to convert existing web applications into a standard XML web service. The method was applied to two test cases: the JUST courses online schedule and the Wiley search engine. Starting from known test cases simplifies the process of finding a generic framework that is capable of handling dynamic webpages. We chose an XML web service to distribute our system functionalities over the Internet: end-users only need to check the web service description language reference and can then start using the service. Once the extraction process is finished, a SOAP object that contains the response is returned to end-users for further processing. As shown in Fig. 1, the system consists of two phases: the setup phase and the execute phase. These phases are divided into several layers, where the output of each layer is consumed by the next one.
In the setup phase, each new webpage URL is processed in cooperation with end-users to define three main features. The first is a description of the desired output. The second is the full details of the input required from users; the different input fields are extracted to obtain this feature. The third is a definition of how the obtained result is converted to a structured XML format. Based on the selected features, the WSDL is constructed and saved in the services database for the execute phase.
The execute phase runs each time an end-user wants to extract web data. End-users can only run this phase for web pages that have already been processed in the setup phase. Users pass the intended webpage URL, and multiple actions are taken until the final result is returned as a structured XML SOAP response.
I. SETUP PHASE: Webpage URL -> Select Inputs -> Execute form/View result -> Select Outputs -> Generate Conversion Descriptions -> Build Service (WSDL) -> Save I/O and conversion descriptions to Services Database
II. EXECUTE PHASE: Webpage URL -> Look-up Services Database -> Generate SOAP Request -> Get Structured XML SOAP Response
Figure 1. Overall System Architecture
IV. SYSTEM IMPLEMENTATION
In this section we present the application of the proposed system to two test cases: the JUST courses online schedule and the Wiley publisher products search engine.
A. JUST Courses Online Schedule
The JUST courses online schedule is an online application that enables students to browse available courses according to semester, faculty, department and section status (opened, closed). The form uses the POST method to exchange data with the server. One main problem that arises in this form is the dependency between HTML controls (i.e., when a student selects a faculty, the page automatically displays all the departments related to that faculty). We addressed this problem by posting the data as many times as there are controls, until the final HTML result is obtained. We implemented a web service method that executes the form automatically, without human intervention, according to the end-user's arguments. The final result is then formatted as a SOAP object for further processing on the client side. The following steps clarify the whole process:
Step 1: Retrieve the posted data from end-users.
Step 2: Post the first argument (Semester) and wait for the other HTML fields to be filled.
Step 3: Automatically select the desired faculty.
Step 4: Wait for the department field to be filled and select the posted department.
Step 5: Choose whether to show only the opened sections, only the closed sections, or both.
Step 6: Create an HTML DOM tree.
Step 7: Pick only the needed nodes in the tree and build the structured data that contains information about each course.
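Steps 1 to 5 amount to re-posting the form once per dependent control while carrying earlier selections forward. The sketch below shows that loop under stated assumptions: `post` is a caller-supplied function standing in for the HTTP POST, `fake_post` is a stub server used only for illustration, and the field names are hypothetical.

```python
def submit_dependent_form(post, field_order, values):
    """Post the form once per dependent control, carrying earlier selections forward.

    post        -- callable taking a dict of form data and returning the page (HTML text)
    field_order -- control names in dependency order, e.g. semester before faculty
    values      -- the end-user's chosen value for each control
    """
    form_data = {}
    page = None
    for name in field_order:
        form_data[name] = values[name]
        page = post(dict(form_data))  # each post lets the server fill the next control
    return page


# Stub server used only for illustration: it echoes which controls were posted.
def fake_post(data):
    return "<div id='result'>" + ",".join(sorted(data)) + "</div>"


html = submit_dependent_form(
    fake_post,
    ["semester", "faculty", "department", "status"],
    {"semester": "1", "faculty": "5", "department": "12", "status": "open"},
)
```

Only the page returned by the last post is kept, since earlier responses serve only to populate the dependent controls.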
As shown in Fig. 1, all HTML form inputs are selected and extracted as the first step in the setup phase. The number of times that the form must be executed should be specified to get the correct result. After the desired HTML output is obtained, our proposed system extracts course information and generates structured data as shown in Fig. 6, and the data conversion process is stored as an XML schema description. The WSDL file shown in Fig. 2 is built at this phase and stored with the inputs, outputs and the conversion schema in a database for future use.
2010 10th International Conference on Intelligent Systems Design and Applications
Each time the JUST course schedule URL is passed by end-users to retrieve course information, the URL enters the execute phase and the WSDL web service is fetched from the database. At this phase, end-users generate a SOAP request and include their intended semester, faculty, department and section status as parameters to call the web service method and get the final SOAP response. The formats of both SOAP messages are shown in Fig. 4 and Fig. 5.
For example, if an end-user is interested in retrieving all courses belonging to the medicine department, the JUST web server executes the end-user's form request and returns the courses from the server database in HTML format. A sample of the returned HTML text is shown in Fig. 3. The bold text represents the intended data to be extracted for each course. The returned HTML text contains a division (div tag) with a defined ID. Inside the div, the server creates N tables, where N is the number of courses filtered by the end-user. Each table contains the details of one course, which are extracted, structured and finally sent as the SOAP response. Our proposed wrapper, implemented as a web service method, gets the inner HTML code of the div, builds a DOM tree and iterates over all tables, which represent the actual course information.
B. Wiley Products Search Engine
Wiley is a company that publishes books, journals, and encyclopedias, in print and electronically, and also sells online products and services. We implemented a web service method that searches all products published by Wiley and returns the result in a structured format to be consumed by end-users. Besides retrieving a list of matching products, our proposed method also sorts the results and narrows the user's search to certain products.
The web service method takes three arguments: search query, sort method and product type. End-users may leave the last two parameters blank to search for all unsorted products. The result is returned from the Wiley search engine server as plain HTML text, which is then interpreted and processed by our web service method. In the setup phase, the three HTML inputs (search query, sort method and product type) are selected. The desired output locations and each product detail are specified, and the structured conversion process is generated according to the output specified by end-users. Finally, the WSDL schema that represents the actual search method is built and stored in the database with the conversion description for future use in the execute phase. Fig. 7 shows a portion of the WSDL file that presents the generated web service method that uses the Wiley search engine. On the end-user side, end-users should generate a SOAP request as described in Fig. 8, specify the three main arguments representing the selected inputs, and then make a web service call. Our proposed system immediately responds to that call and sends the result back in the SOAP format shown in Fig. 9.
Figure 2. Part of JUST WSDL web service
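The three-argument Wiley search method, with its two optional arguments, might map onto a form submission roughly as follows. This is only a sketch: the parameter names `query`, `sort` and `type` are hypothetical and do not come from the paper or from the Wiley site.

```python
from urllib.parse import urlencode


def build_search_query(query, sort_method="", product_type=""):
    """Encode the search form data; blank optional arguments are simply omitted."""
    params = {"query": query}
    if sort_method:
        params["sort"] = sort_method
    if product_type:
        params["type"] = product_type
    return urlencode(params)


unsorted_all = build_search_query("web services")
sorted_books = build_search_query("xml", sort_method="date", product_type="book")
```

Leaving the last two arguments blank yields a query with only the search term, which corresponds to searching all unsorted products.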
[HTML markup lost in extraction. Surviving text: field labels Line Number, Course Symbol, Section, Days, Time, Hall, Capacity, Reg Students; sample values 662110, VM211, 12345, 10:30**11:30, …]
Figure 3. Returned HTML code for JUST courses
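The div-and-tables extraction described for the returned HTML can be sketched with Python's standard library. Note the assumptions: this uses the event-based `html.parser` rather than building a full DOM tree as the paper does, and the sample markup and the div ID `courses` are invented for illustration (only the values 662110 and VM211 come from Fig. 3).

```python
from html.parser import HTMLParser


class CourseTableParser(HTMLParser):
    """Collect the cell texts of every table found inside one div."""

    def __init__(self, div_id):
        super().__init__()
        self.div_id = div_id
        self.div_depth = 0      # > 0 while inside the target div
        self.in_cell = False
        self.tables = []        # one list of cell strings per table

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            # Enter the target div, or track divs nested inside it.
            if self.div_depth or dict(attrs).get("id") == self.div_id:
                self.div_depth += 1
        elif self.div_depth and tag == "table":
            self.tables.append([])
        elif self.div_depth and tag in ("td", "th") and self.tables:
            self.in_cell = True
            self.tables[-1].append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth:
            self.div_depth -= 1
        elif tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.tables[-1][-1] += data.strip()


# Hypothetical markup shaped like the description: one table per course inside a div.
sample = (
    "<div id='courses'>"
    "<table><tr><td>Line Number:</td><td>662110</td></tr></table>"
    "<table><tr><td>Course Symbol:</td><td>VM211</td></tr></table>"
    "</div><table><tr><td>ignored</td></tr></table>"
)
parser = CourseTableParser("courses")
parser.feed(sample)
courses = parser.tables
```

Tables outside the target div are skipped, so only the N course tables contribute to the structured result.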
[SOAP envelope markup lost in extraction; surviving placeholder types: string, int, int, int, int]
Figure 4. Courses SOAP 1.1 request message format
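Since the original request markup did not survive, the following sketch shows what building a SOAP 1.1 request for the courses method could look like. The envelope namespace is the standard SOAP 1.1 one; the service namespace, the method name `GetCourses` and the parameter names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/courses"              # hypothetical service namespace


def build_request(method, params):
    """Return a SOAP 1.1 envelope calling `method` with the given parameters."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request(
    "GetCourses",  # hypothetical method name
    {"semester": 1, "faculty": 5, "department": 12, "status": 0},
)
```

The resulting string would be sent as the body of an HTTP POST to the service endpoint, which is not shown here.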
[SOAP envelope markup lost in extraction; surviving placeholder types: int, int, string, string]
Figure 5. Courses SOAP 1.1 response message format
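On the client side, a SOAP response of this kind can be turned into plain records for further processing. The sketch below is an assumption-laden illustration: the element names (`GetCoursesResponse`, `Course`, `LineNumber`, `Symbol`) are hypothetical, and only the sample values are taken from the paper's figures.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

# Hypothetical response; element names are invented for illustration.
response_xml = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetCoursesResponse>
      <Course><LineNumber>662110</LineNumber><Symbol>VM211</Symbol></Course>
    </GetCoursesResponse>
  </soap:Body>
</soap:Envelope>"""


def parse_courses(xml_text):
    """Return one dict per Course element found in the SOAP body."""
    body = ET.fromstring(xml_text).find(f"{{{SOAP_NS}}}Body")
    return [{child.tag: child.text for child in course} for course in body.iter("Course")]


courses_from_soap = parse_courses(response_xml)
```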
[XML markup lost in extraction; surviving values: 662110, 3, VM211, ANIMAL-HEALTH, 48, 48, 1, G2121, 12345]
Figure 6.