FUTURE UNIVERSITY FACULTY OF POSTGRADUATE STUDIES MASTER OF SCIENCES IN GEOINFORMATICS
CREATING A PYTHON TOOLBOX TO IMPLEMENT THE ANALYTICAL HIERARCHY PROCESS METHOD IN ARCGIS
Prepared by: Khalid Galal Eldin Mohamed Khair
Supervisor: Dr. Samir Mahmoud Adam
Thesis submitted in partial fulfillment of the requirements for the Degree of Master of Science
May 2018
ACKNOWLEDGEMENTS
First and foremost, praise to Allah. I would like to express my sincere gratitude to my supervisor, Dr. Samir Mahmoud Adam, for his continuous support of my M.Sc. study and research, and for his patience and enthusiasm. His guidance helped me throughout my research and the writing of this thesis. One simply could not wish for a better or friendlier supervisor. In my daily work I have been blessed with a helpful, friendly and cheerful group of friends in the Geoinformatics Master Program whom I want to thank by name: Mohammed, Gosai, Taha, Yassin and Yasser, as well as Safa, Malaz and Eman. I would like to extend my thanks and sincere gratitude to my friend Mohammed Hamadouba and the rest of my extended family inside and outside Sudan for their immense support, prayers and encouragement, as well as their understanding during the writing of this thesis. Above all, my most important thanks and appreciation go to my family: my dear father Galal ElDin, my late mother Tamadur Magzoub (may Allah be merciful to her), and my siblings Ahmed, Abdullah, Anas and Jumana, for their prayers, support and understanding during my study.
CONTENTS
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
RESEARCH SUMMARY
ABSTRACT
CHAPTER 1 INTRODUCTION
1.1: Overview
1.2: Problem Statement
1.3: Research Objectives
1.4: Scope of the Study
1.5: Thesis Layout
CHAPTER 2 LITERATURE REVIEW
2.1: Multi-criteria decision making
2.2: Analytical Hierarchy Process
2.3: TerrSet (formerly IDRISI)
2.4: AHP in GIS
2.5: Python
CHAPTER 3 THE AHP TOOLBOX DEVELOPMENT
3.1: Overview
3.2: The Analysis Phase
3.3: The Implementation Phase
3.3.1: Python Program
3.3.2: Implementation in ArcGIS
3.4: Bugs and Obstacles
3.4.1: Table format
3.4.2: The User Interface
CHAPTER 4 RESULTS AND DISCUSSION
4.1: Overview
4.2: Mathematical results
4.3: Visual Comparison
CHAPTER 5 CONCLUSION AND RECOMMENDATIONS
5.1: Conclusion
5.2: Recommendations
REFERENCES
APPENDIX SOURCE CODE
LIST OF TABLES
Table 2.1: Judgment Matrix based on user input
Table 2.2: Normalized value table with weights or priority vector
Table 4.1: An AHP table with priority values and their reciprocals inserted.
Table 4.2: A comparison of the weights produced by both tools as well as a summary of the difference between them.
LIST OF FIGURES
Figure 2.1: A scale that represents the user’s favorability of a factor compared to another factor (Estoque, 2011)
Figure 3.1: Example of a skewed table in the preliminary Python program.
Figure 3.2: First window in the tool, which receives the number of factors.
Figure 3.3: Second window, which has the table prefilled and leaves only the necessary fields empty for the user to fill in. (In this example three factors were used.)
Figure 3.4: Judgment Value Matrix sheet in the output Excel file.
Figure 3.5: Normalized Value sheet in the output Excel file.
Figure 3.6: Priority Vector or Weights sheet in the output Excel file.
Figure 3.7: The final sheet in the output Excel file, which contains the verification process for the values previously inserted by the user.
Figure 4.1: A visual comparison between (A) the suitability map produced from the AHP Tool and (B) the suitability map produced from the TerrSet WEIGHT module.
LIST OF ABBREVIATIONS
AHP: Analytical Hierarchy Process
GIS: Geographic Information Systems
SDSS: Spatial Decision Support System
ESRI: Environmental Systems Research Institute
MCDM: Multi-Criteria Decision Making
MODM: Multi-Objective Decision Making
MADM: Multi-Attribute Decision Making
MCE: Multi-Criteria Evaluation
CR: Consistency Ratio
CI: Consistency Index
RI: Random Index
VBA: Visual Basic for Applications
DLL: Dynamic-Link Library
FOSS: Free and Open-Source Software
ENVI: Environment for Visualizing Images
OS: Operating System
UI: User Interface
PANDAS: Python Data Analysis Library
PCF: Pairwise Comparison File
DSF: Decision Support File
HTML GUI: Hypertext Markup Language Graphical User Interface
RESEARCH SUMMARY
The Analytical Hierarchy Process (AHP) is used to assign weights to the supplied factors based on their priorities in the pairwise comparison table. ArcGIS, the leading GIS application, does not contain a standalone tool that performs AHP calculations. This study therefore investigates the feasibility of creating and implementing such a tool in ArcGIS and its role in the overall workflow of the program. The research also discusses the preference for Python as the programming language of the tool, covers the stages of designing and implementing the tool and verifying its results, and discusses some important or critical bugs that were addressed during the research. A brief comparison is made between the established tools in the industry and this tool with regard to mathematical and visual results. The results were almost identical, which confirmed the success and quality of the tool. In conclusion, some proposals are presented to enhance and improve the performance of the tool in the future.
ABSTRACT
The Analytical Hierarchy Process (AHP) is used to assign weights to factors based on their priorities in a pairwise comparison table. ArcGIS, the leading GIS application, does not contain a standalone tool that performs the AHP calculations. This research therefore investigates the feasibility of creating and implementing such a tool in ArcGIS, and its role in the overall workflow of the program. It also discusses the preference for Python as the programming language of the tool, as well as the design, implementation and verification of its results, along with a few critical bugs. A brief comparison between the established tools in the industry and this tool with regard to mathematical and visual results is also presented. The results were almost identical, which confirmed the tool’s functionality. In conclusion, a few proposals are presented to enhance and improve the functionality of the tool in the future.
CHAPTER 1 INTRODUCTION
1.1: Overview:
The Analytical Hierarchy Process (AHP) method has been an integral part of decision making around the world for the last 30-40 years (T. L. Saaty, 1977). As Geographic Information Systems (GIS) developed, they began to incorporate decision support processes into their workflow in the form of Spatial Decision Support Systems (SDSS), which employ the AHP method as well. This led to deeper integration of SDSS in planning and executing spatial projects, and GIS and Remote Sensing-based applications implemented elements of SDSS to help users or decision makers take quicker and more precise actions based on the inputs provided. It also gave rise to applications specialized in spatial analysis and decision support, such as TerrSet (formerly IDRISI), a computer application that specializes in analyzing earth systems for decision making in environmental management, among other major functionalities in its portfolio (Eastman, 2016).
However, ESRI’s ArcGIS, one of the leading applications in the GIS industry, did not integrate many significant tools for SDSS until relatively recently, leaving much to be desired. One of its latest releases (10.4.1) includes plenty of tools to help with the decision making process; however, it offers no independent AHP tool to calculate the weight of each factor based on the user’s priorities. This led to the creation of this tool.
1.2: Problem Statement:
ESRI’s effort in the SDSS domain included, among others, two sophisticated tools, appropriately named Weighted Sum and Weighted Overlay. The Weighted Sum tool multiplies the pixel values of each factor/layer by its weight and then adds the results into a single map that gives the overall suitability of the study area. The Weighted Overlay tool overlays several factors/layers using a common measurement scale and weights each according to its importance. The problem with the Weighted Overlay tool is that it asks the user to insert a percentage, termed “Influence Percentage”, based on the user’s own estimate of each factor’s effect on the overall suitability map. This method removes the need for the AHP’s pairwise comparison table. Discussing the advantage of either method over the other falls outside the scope of this study; ESRI made both available to meet market demand. Each of these tools has issues that will be discussed in more detail in the following chapters. However, the reason for creating the AHP tool is that the Weighted Sum tool does not calculate the weight of each factor/layer but depends on the user to define the weights based on calculations made outside the tool, which can be considered a major obstacle to using it.
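As a minimal sketch of the arithmetic behind a weighted sum, the snippet below multiplies each layer's cell values by that layer's weight and adds the results cell by cell. It uses plain nested lists rather than real rasters; the function name `weighted_sum`, the two tiny layers and the weights are hypothetical values chosen only for illustration, and the code is not ESRI's implementation.

```python
def weighted_sum(layers, weights):
    """Return the cell-by-cell weighted sum of equally sized 2-D grids."""
    if len(layers) != len(weights):
        raise ValueError("one weight is required per layer")
    rows, cols = len(layers[0]), len(layers[0][0])
    result = [[0.0] * cols for _ in range(rows)]
    for layer, weight in zip(layers, weights):
        for r in range(rows):
            for c in range(cols):
                result[r][c] += weight * layer[r][c]
    return result


# Hypothetical 2 x 2 factor layers (for example reclassified slope and road distance).
slope = [[1, 3], [5, 2]]
roads = [[4, 4], [1, 3]]

# Hypothetical weights; in practice these would have to come from the user.
suitability = weighted_sum([slope, roads], [0.75, 0.25])
for row in suitability:
    print(row)
```

The weights passed in here are exactly the values that the Weighted Sum tool expects the user to supply, which is the gap an AHP weight calculation is meant to fill.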
1.3: Research Objectives:
The main objective of this research is to create a tool, written entirely in Python, that works seamlessly with ArcGIS. The tool presents the AHP pairwise comparison table to the user, takes the user’s input regarding the priority of the factors when compared to each other, and then calculates the weight of each factor. Finally, it verifies the user’s input through Thomas L. Saaty’s Consistency Ratio. The verified weights are then used to create a suitability map that helps decide the most suitable location for the user’s needs (the best land to build a school, the best house to buy, etc.). The specific objective is to make this tool portable and easy to use, avoiding the need for a complicated installation process or complex instructions.
1.4: Scope of the Study:
This research is mainly concerned with the technical aspects of creating an AHP tool written in the Python programming language that calculates the weight of each inserted factor based on the user input and saves the results in a file format that can be used in ArcGIS and possibly other GIS applications as well. It then covers the process of integrating the tool with ArcGIS’ Python Toolbox and modifying the Python code as needed to fulfill the program’s intended objectives. Any related terms, methods, libraries and concepts are discussed as well throughout the research chapters.
1.5: Thesis Layout:
The contents of this thesis are divided into five chapters. Chapter one gives general information about the tool and the current status of the GIS industry regarding the implementation of SDSS, as well as the scope of the study and the objectives of this thesis. Chapter two is dedicated to the literature review of the related tools, methods, terms and concepts used in this program, including their basic theories, development, limitations, general applications and the previous studies concerning them. Chapter three discusses the actual process of developing the tool and the obstacles faced along the way. Chapter four is devoted to the discussion of the results from the tool and compares its output with that of other tools developed independently by major GIS developers. Chapter five concludes the research and includes a few recommendations for future research and for development work that can improve the functionality and features of the AHP Tool.
CHAPTER 2 LITERATURE REVIEW
2.1: Multi-criteria decision making:
Multi-Criteria Decision Making (MCDM) is a broad science and one of the most well-known branches of decision making. Due to the time limitation of this research, MCDM is explained only briefly in general terms, and the discussion is then narrowed down until the AHP method is adequately clarified. MCDM is divided into multi-objective decision making (MODM) and multi-attribute decision making (MADM); however, the terms MADM and MCDM are often used interchangeably. Both branches have plenty of aspects in common, but the main difference between MCDM/MADM and MODM concerns the decision space. MCDM/MADM deals with problems with discrete decision spaces, where the problem has a predetermined set of decision alternatives, whereas in MODM the decision space is continuous, as in mathematical programming problems with multiple objective functions (Triantaphyllou, 2000). MCDM methods are abundant and can be classified in several ways, one of which is based on the type of data they use. Under this classification, the most famous methods are the Stochastic, Fuzzy and Deterministic methods. The following section focuses on the Analytical Hierarchy Process, which belongs to the Deterministic methods (Triantaphyllou, 2000).
2.2: Analytical Hierarchy Process:
According to R. W. Saaty (1987) “The Analytic Hierarchy Process (AHP) is a general theory of measurement. It is used to derive ratio scales from both discrete and continuous paired comparisons. These comparisons may be taken from actual measurements or from a fundamental scale which reflects the relative strength of preferences and feelings. The AHP has a special concern with departure from consistency, its measurement and on dependence within and between the groups of elements of its structure. It has found its widest applications in multi-criteria decision making, planning and resource allocation and in conflict resolution. In its general form the AHP is a nonlinear framework for carrying out both deductive and inductive thinking without use of the syllogism by taking several factors into consideration simultaneously and allowing for dependence and for feedback, and making numerical tradeoffs to arrive at a synthesis or conclusion”. Dr. Ronald C. Estoque of the University of Tsukuba, Japan, has also explained it in a simplified and illustrated form in his lectures for the Natural Resource Management course (Estoque, 2011), on which the design of this tool is based. Figure 2.1 displays an example of the scale of priority (read: favorability) for each factor compared to the others.
Figure 2.1: A scale that represents the user’s favorability of a factor compared to another factor (Estoque, 2011)
Table 2.1 shows a simplified form of the table after the user inserts the values representing his judgment of the priority of each factor compared to the others; the cells opposite them hold the reciprocals of the values the user entered.
Table 2.1: Judgment Matrix based on user input

|       | F1       | F2     | F3    |
|-------|----------|--------|-------|
| F1    | 1        | 9      | 3     |
| F2    | 1/9      | 1      | 1/5   |
| F3    | 1/3      | 5      | 1     |
| Total | 1.444444 | 15.000 | 4.200 |
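The way Table 2.1 is filled can be sketched as follows: only the cells above the diagonal are supplied, the mirrored cells receive the reciprocals, and the diagonal is 1. This is an illustrative sketch rather than the thesis code; the function name `build_judgment_matrix` and the dictionary layout are assumptions, and `fractions.Fraction` from the standard library is used here only so that values such as 1/9 are stored exactly.

```python
from fractions import Fraction


def build_judgment_matrix(names, upper_values):
    """Build a full pairwise matrix from the values above the diagonal.

    upper_values maps (row_name, column_name) to the user's judgment;
    the mirrored cell receives the reciprocal and the diagonal is 1.
    """
    n = len(names)
    matrix = [[Fraction(1)] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            value = Fraction(upper_values[(names[i], names[j])])
            matrix[i][j] = value
            matrix[j][i] = 1 / value
    return matrix


names = ["F1", "F2", "F3"]
# Judgments from Table 2.1 (only the cells above the diagonal are entered).
judgments = {("F1", "F2"): 9, ("F1", "F3"): 3, ("F2", "F3"): Fraction(1, 5)}

matrix = build_judgment_matrix(names, judgments)
column_totals = [sum(row[j] for row in matrix) for j in range(len(names))]

for name, row in zip(names, matrix):
    print(name, [str(cell) for cell in row])
print("Total", [round(float(total), 6) for total in column_totals])
```

The column totals computed this way correspond to the Total row of Table 2.1.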
Table 2.2 shows the normalized values for each cell as well as the weights, or priority vector, calculated from the matrix displayed in Table 2.1.
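Table 2.2: Normalized value table with weights or priority vector

|    | F1     | F2     | F3     | Priority Vector or Weight |
|----|--------|--------|--------|---------------------------|
| F1 | 0.6923 | 0.6000 | 0.7143 | 0.6689                    |
| F2 | 0.0769 | 0.0667 | 0.0476 | 0.0637                    |
| F3 | 0.2308 | 0.3333 | 0.2381 | 0.2674                    |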
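The step from Table 2.1 to Table 2.2 can be sketched in a few lines: each cell is divided by its column total, and the mean of each normalized row gives the weight of that factor. The function name `priority_vector` is an assumption introduced for illustration; only the arithmetic is taken from the two tables.

```python
def priority_vector(matrix):
    """Normalize each column by its total, then average each row."""
    n = len(matrix)
    totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    normalized = [[matrix[i][j] / totals[j] for j in range(n)] for i in range(n)]
    weights = [sum(row) / n for row in normalized]
    return normalized, weights


# Judgment matrix from Table 2.1.
table_2_1 = [
    [1.0, 9.0, 3.0],
    [1.0 / 9.0, 1.0, 1.0 / 5.0],
    [1.0 / 3.0, 5.0, 1.0],
]

normalized, weights = priority_vector(table_2_1)
for row, weight in zip(normalized, weights):
    print(["%.4f" % cell for cell in row], "%.4f" % weight)
```

Rounded to four decimals, the weights match the last column of Table 2.2, and they sum to 1 because every normalized column sums to 1.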
As a final step after the calculations, the AHP method has a verification process based on the Consistency Ratio (CR). The CR is derived by dividing the Consistency Index (CI) by the Random Index (RI), and the resulting value must be below the threshold (0.10). The process of calculating the CR, CI and RI is explained in comprehensive detail in R. W. Saaty (1987).
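The consistency check can be illustrated with the matrix of Table 2.1. The sketch below rests on several assumptions that are not stated in the text above: the principal eigenvalue is approximated as the sum of each column total multiplied by the corresponding weight (one common shortcut, not necessarily the one used by the tool), CI is taken as (lambda_max - n) / (n - 1), and the `RANDOM_INDEX` table holds the values commonly attributed to Saaty. The function name `consistency_ratio` is hypothetical.

```python
# Random Index values commonly attributed to Saaty, keyed by matrix size n
# (an assumption; the text above does not list them).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def consistency_ratio(matrix):
    """Return (lambda_max, CI, CR) for a square pairwise comparison matrix."""
    n = len(matrix)
    if n < 3:
        # 1 x 1 and 2 x 2 reciprocal matrices are always consistent.
        return float(n), 0.0, 0.0
    totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    weights = [sum(matrix[i][j] / totals[j] for j in range(n)) / n
               for i in range(n)]
    # Approximate principal eigenvalue: column totals weighted by the priorities.
    lambda_max = sum(total * weight for total, weight in zip(totals, weights))
    ci = (lambda_max - n) / (n - 1)
    cr = ci / RANDOM_INDEX[n]
    return lambda_max, ci, cr


# Judgment matrix from Table 2.1.
table_2_1 = [
    [1.0, 9.0, 3.0],
    [1.0 / 9.0, 1.0, 1.0 / 5.0],
    [1.0 / 3.0, 5.0, 1.0],
]

lambda_max, ci, cr = consistency_ratio(table_2_1)
print("lambda_max = %.4f, CI = %.4f, CR = %.4f" % (lambda_max, ci, cr))
print("consistent" if cr < 0.10 else "inconsistent, revise the judgments")
```

Under these assumptions the example judgments give a CR of about 0.04, which is below the 0.10 threshold.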
2.3: TerrSet (formerly IDRISI):
TerrSet, formerly known as IDRISI, is an integrated geographic information system (GIS) and remote sensing software developed by Clark Labs at Clark University for the analysis and display of digital geospatial information. TerrSet is a PC grid-based system that allows users to analyze earth system dynamics for effective and responsible decision making for environmental management, sustainable resource development and equitable resource allocation (Eastman, 2016). TerrSet includes a fundamental and specialized collection of tools bundled together as a toolset named IDRISI GIS Analysis. It consists of a wide range of fundamental analytical tools for GIS analysis, primarily oriented to raster data. Special features of the IDRISI toolset include a suite of multi-criteria and multi-objective decision procedures and a broad range of tools for statistical, change and surface analysis. Special graphical modeling environments are also provided for dynamic modeling and decision support (Eastman, 2016). The IDRISI toolset includes several modules; the one of interest here is the Decision Support module. It was introduced in the early 1990s and has been constantly developed and updated since then. Within the Decision Support module, the tool of interest is the WEIGHT module, which Clark Labs developed using Saaty’s AHP procedure. The tool written and designed in this research is inspired by the AHP-based WEIGHT module implemented in the Decision Support module.
2.4: AHP in GIS:
The Analytical Hierarchy Process has been developed by Thomas L. Saaty since 1977. However, its implementation in GIS software can be considered relatively new, as the first documented case was by Rao et al. (1991). That implementation is considered in this research only from a practical standpoint, as the authors’ actual implementation of the AHP method was developed outside the GIS software using a variety of analytical resources (Marinoni, 2004). Since then, many implementations of the AHP method have been made across many GIS applications, owing to the popularity of the AHP and its importance in weight distribution. One notable implementation that needs to be considered in detail in this research is the effort of Dr. Oswald Marinoni in implementing the AHP method in ESRI’s ArcGIS in 2002 (Marinoni, 2004). Dr. Marinoni implemented the AHP method in ArcGIS using Visual Basic for Applications (VBA) macros. However, his implementation malfunctioned upon the release of ArcGIS 10.0, mainly because ESRI overhauled several fundamental architectures inside ArcGIS, including, but not limited to, renaming their Python ArcGIS module from “arcgisscripting” to “arcpy”. They also deprecated the VBA platform and made Python the preferred scripting tool in ArcGIS (van Rees, 2014). One more probable reason for the tool’s incompatibility with newer versions of ArcGIS was its design. The tool would take the user’s preferences and do the calculations in the designed VBA macro (EigenUtl.dll) outside the ArcGIS framework, then return the results to the program to be displayed. Marinoni (2004) chose VBA as the programming language for the following reasons:
VBA macros in ArcGIS can use the ArcGIS functionality to its full extent.
VBA macros can take advantage of global ArcGIS Variables.
The coding of the management of the data layers is much easier in VBA.
The possibilities to create, test and debug macros in the ArcGIS Visual Basic editor are the same as in the VB development environment.
In 2014 Dr. Marinoni overhauled the tool to be compatible with ArcGIS 10.1 and 10.2. Unfortunately, the change log of this upgrade is not documented, and attempts to contact the author were unsuccessful. One notable issue is that the updated AHP tool comes packaged as a Windows installation file, and inspection of the installation folder reveals a DLL file named extAhp.dll, which could indicate that the tool still uses VBA as a platform, although this is not enough proof to be certain.
2.5: Python:
Python is considered the leading programming/scripting language for scientific purposes, including Earth Science and GIS analysis. As Lin (2012) puts it in his article, Python is the next wave in Earth Sciences computing for one simple reason: Python enables users to do more and better science. Lin (2012) argues that “Python is now a robust integration platform for all kinds of atmospheric sciences work, from data analysis to distributed computing, and graphical user interfaces to geographical information systems. Among its salient features, Python has a concise but natural syntax for both arrays and non-arrays, making programs exceedingly clear and easy to read. The modern data structures and object-oriented nature of the language makes Python code more robust and less brittle. Finally, Python’s open-source pedigree, aided by a large user and developer base in industry as well as the sciences, means that your programs can take advantage of the tens of thousands of Python packages that exist”. It should be noted that this article was published in 2012, and the language has only gained further ground in the scientific fields since then. In many workflows the tools are usually isolated from each other, and communication between them occurs through the files they take or produce. One distinct feature of Python is that all the tools are used in the same interpreted environment, which greatly expands access to the tools and their properties and adds the advantage of a robust and flexible workflow (Lin, 2012).
The future of using Python specifically in GIS is well documented by van Rees (2014) in his article Python Scripting and GIS Increasing Efficiency: “The use of free and open source software (FOSS) at Esri took a huge leap when Esri officially embraced Python as the preferred scripting tool and integrated it further into their ArcGIS platform after the release of ArcGIS 10. This decision led to Python scripting and GIS becoming inextricably linked within just a couple of years.” One good example of how Python can increase the efficiency of GIS is the case of a user who wants to check the coordinate systems of thousands of datasets and then re-project any that lack one. It would take an intern an entire summer internship to accomplish this task; with Python, a few lines of code would accomplish it within minutes or hours (van Rees, 2014). In computer science a scripting language is not the same as a programming language: a programming language involves the development of more sophisticated multifunctional applications, while a scripting language automates certain functionality within another program. In a sense, scripting allows you to collect various existing elements together, while programming allows you to build components from scratch. Python is both a scripting and a programming language, so it can be used for both (van Rees, 2014). This is one of its main advantages as a language and the main reason, among others, that it was used to write the AHP tool discussed throughout this research.
CHAPTER 3 THE AHP TOOLBOX DEVELOPMENT
3.1: Overview:
This chapter discusses in detail the three main phases that the tool passed through before it was completed: the analysis, the implementation, and the bugs and obstacles faced during development. The analysis specifies the requirements that the tool needs to address, while the implementation section discusses the actual writing of the tool. The final section discusses the obstacles faced during the implementation phase and the workarounds used to solve them.
3.2: The Analysis Phase:
The AHP Tool workflow is divided into three main steps. The first step takes the number of factors and their names from the user. The next step requires the program to create a table whose number of columns and rows equals the number of factors entered, with the factor names as the column and row headings. Once created, the table must be presented to the user with half of its fields greyed out, as they hold the reciprocals of the values that the user enters in the other, empty half. The final step takes the filled table and calculates the weights as well as the Consistency Ratio to check the validity of the inputs, then saves all the results to a file and presents the findings to the user. A standalone Python-based tool would be created to test the feasibility of applying this workflow as code in the Python programming language. The tool must accomplish the three main tasks of the workflow and provide a foundation for
integrating the code with ArcGIS. The suggested way to accomplish the intended workflow is to divide these tasks into smaller, independent yet interconnected segments within the program. The extensive collection of libraries in the standard Python installation allowed the creation of this tool without the need to install any external modules. This was essential because portability is a main requirement for the tool, and asking the user to install extra files in order to use it would be detrimental. After the completion of the tool, the code would be modified and adapted to integrate fully with ArcGIS through the Python Toolbox or the Scripting Toolbox. A final concern with the requirements is the utilization of the tool’s output. As mentioned in chapter one, ESRI’s implementation of SDSS includes, among others, two tools of concern to this study: the Weighted Sum and the Weighted Overlay. As both tools have been introduced previously, this section focuses on which of them the AHP Tool output should work with. The difference between the two tools comes down to two main points: the first is the type of input that each tool receives, and the second is the way each tool calculates the suitability map from the weights inserted by the user. The Weighted Overlay requires that the input file be created through a classification process, so that it has several distinct classes and the user can insert priority values for the layer/factor as well as for its classes. This tool, however, was incompatible with the classified raster files of TerrSet, Erdas Imagine and ENVI. Erdas Imagine and ENVI files worked when exported specifically in ESRI’s ArcGIS format. However, there was no possible way to import TerrSet’s raster files into the tool.
On the other hand, the Weighted Sum tool accepted all of the aforementioned raster files without issues or any need for conversion, proving to be more compatible and easier to use. The weight distribution in the Weighted Overlay tool depended on the classification data from the other applications and used a proprietary file format, which proved difficult to emulate in this tool’s output. The Weighted Sum tool, by contrast, did not accept any input files for its weight distribution, forcing the user to insert the values manually. Upon comparing both tools on the points mentioned above, the
Weighted Sum tool was the more suitable option. As for the output file format, a normal text file or a Microsoft Excel file that lists the factors and their weights would accomplish the task.
3.3: The Implementation Phase
The process of writing the Python program, as well as adapting and implementing the code to work with ArcGIS, is discussed in detail here.
3.3.1: Python Program:
The Python program was created with simplicity and portability in mind, so all the modules used in the program come with the default Python installation. Those modules were the os module and the print function imported from the __future__ module. The os module was required to parse the directory paths supplied by the user, and the Python 3.x print function was used because a more advanced printing function was required. Seven segments were created to implement the aforementioned design and are discussed below in detail.
First Segment:
This segment asks the user to enter the number of factors that need calculation. After the user enters the number, the segment checks whether it is below or above the limits of the tool. If it is, the tool exits; otherwise it uses the number to create two tables (table 1 and table 2), each with rows and columns equal to the entered number of factors. Next it asks the user to enter the factor names, fills the table headings with the entered names, and enters the value 1 wherever a factor in a row crosses itself in a column.
Second Segment:
The next segment has a very simple yet important job. It defines a Python function that takes a two-dimensional list and prints it as a visual table for the user whenever it is called.

Third Segment:
This segment asks the user to insert a preference for each pair of factors and stores it in table 1, which was created in the first segment.

Fourth Segment:
This segment creates a list of values, each of which holds the total of a single column in table 1. Next it creates a normalized table by filling table 2 with each cell of table 1 divided by its column total, and finally prints the normalized table for the user.

Fifth Segment:
The fifth segment calculates the Normalized Principal Eigenvector, also called the Priority Vector or simply the Weights. The calculation takes the mean value of each row of the normalized table and assigns it to the corresponding factor.

Sixth Segment:
This segment calculates the Principal Eigenvalue and the Consistency Index, then determines the Random Index. Finally it calculates the Consistency Ratio from the Consistency Index and the Random Index, and checks whether it is below the threshold (0.10). It then prints a message reporting the findings to the user.

Seventh Segment:
The final segment saves the results into a file format that is compatible with TerrSet’s AHP module.

3.3.2: Implementation in ArcGIS:
Upon the completion of the program and the successful testing of its segments (Khair, 2018), the next stage required the tool to work within ArcGIS through either a Scripting Toolbox or a Python Toolbox. Both are scripting mechanisms that allow user-created tools to run in ArcGIS and give those tools access to ArcGIS’s rich Python site package “arcpy”. A notable feature common to both is the ability to easily create a User Interface (UI) through which developers can take user input for their tools. The Python Toolbox was chosen as the preferred way to create the AHP Tool due to several factors. Chief among them was the ability of Python Toolboxes to create a parameter with a special data type termed GPValueTable, which can create tables with a custom number of columns chosen by the developer and was therefore expected to fulfil the requirements. There are other important factors as well (ESRI, 2016), such as:
The Python Toolbox keeps the parameter definitions, validation code, and source code in the same place, making it easier to create, maintain and transport Python tools.
It gives the developer more freedom to use Python skills, whereas in the Scripting Toolbox the developer is guided by a wizard that limits access to customization.
The Python toolbox is created as two or more classes (depending on how many tools the toolbox contains). The first class represents the toolbox, while the second and any subsequent classes represent the tools inside it. The tool class contains six methods. The first is the initialization method, and the second is getParameterInfo(), which creates the input parameters in the tool interface and thus allows for user input. Three parameters were created inside this method; they are discussed further in the obstacles section. The third method, isLicensed(), checks whether the tool is licensed to run. The fourth method, updateParameters(), updates the parameters when certain conditions occur; it starts running the moment the tool is opened and keeps working until the OK button is pressed. The fifth method, updateMessages(), updates the messages on the tool parameters depending on the user input. A check on the number of factors is made in this method: if the number is below 3 or above 15, the tool will not run and an error message asks the user to correct the issue. The final and most important method is execute(), which starts after the user presses the OK button. It contains the actual code of the program and all of its calculations.

3.4: Bugs and Obstacles:
During the development of this tool numerous bugs and obstacles appeared and were addressed in the most suitable way. A few significant ones are discussed in this section.

3.4.1: Table format:
One of the early obstacles was the table format, as Python by default does not have a data type that represents tables. This was overcome through the use of nested (two-dimensional) lists, in which each inner list stores one row and the position of a value inside it identifies its column.
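The nested-list representation can be illustrated with a minimal sketch. It is not the thesis code: the function name make_comparison_table and the three factor names are hypothetical, and the layout (a heading row plus a heading entry at the start of every row) is only one plausible reading of the first segment described in section 3.3.1.

```python
def make_comparison_table(factor_names):
    """Build a pairwise-comparison table as a nested list.

    Row 0 and the first entry of every later row hold the factor names.
    Data cells start as None, except the diagonal, which is set to 1.
    """
    n = len(factor_names)
    table = [[""] + list(factor_names)]  # heading row
    for i, name in enumerate(factor_names):
        row = [name] + [1 if i == j else None for j in range(n)]
        table.append(row)
    return table


# Hypothetical factor names, used only for illustration
table = make_comparison_table(["Slope", "Roads", "Water"])

# Cells are addressed as table[row][column]; this is the Slope-versus-Roads cell
table[1][2] = 3

for row in table:
    print(row)
```

Each inner list is one row of the table, so a single cell is reached with two indices, the first selecting the row and the second the column.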
However, using this method created the issue of presenting the values to the user, which was solved by importing the print function from the __future__ module. This produced a visually acceptable table in which the columns and rows aligned nicely; however, once decimals were introduced into the table, its organized form was skewed (Figure 3.1). No practical solution was found for this issue at that stage, and since the Python program was just a precursor for the actual development, the issue was left as it was, to be tackled again during the integration with the ArcGIS platform.
Figure 3.1: Example of a skewed table in the preliminary Python program.
The aforementioned issue had to be dealt with at the beginning of the modification of the Python program to work in ArcGIS. The practical solution was to use the Python Data Analysis Library (pandas). The primary pandas data structure is the DataFrame, a two-dimensional data structure with labeled axes in which arithmetic operations align on both row and column labels (Wes McKinney & PyData Development Team, 2017). The pandas library has an extensive list of methods for its DataFrame structure, which gives a great degree of customization and accessibility. However, using pandas had one significant drawback: the library was introduced into the Python installation bundled with ArcGIS only recently (Artz, 2014). As a result, users with ArcGIS versions lower than 10.4 will not be able to use the library and, in consequence, the tool. This limits the tool’s availability to a smaller group of users and prevents many others from accessing it. However, after considering other solutions, this appeared to be the only sensible approach, and it was implemented.
3.4.2: The User Interface:
Another issue that occurred during the tool’s integration into ArcGIS was how to accomplish the first step in the workflow, in which the tool asks the user to insert the number of factors (Figure 3.2). Based on that number, it would then present the user with another window containing a table with the factor names as headings for its rows and columns, and with as many rows and columns as there are factors (Figure 3.3). The final step would take the table filled with user input, do the necessary calculations based on it, and output the results to the screen.
Figure 3.2: First window in the tool to receive the number of the factors.
However, the above hypothetical approach never materialized despite the best efforts spent in trying to apply it. This was mainly because ESRI never intended custom tools (Custom Scripting Toolbox and Python Toolbox) to produce extra dialog windows that can take user input; the toolboxes were also never designed to be interactive. Since the intended workflow had to follow the aforementioned approach, a workaround was devised to implement it. However, the workaround relied on Microsoft Excel. This introduced another shortcoming with respect to the intended requirements, as it further decreased the portability of the tool by requiring users to have Microsoft Excel installed on their devices for the tool to work. The workaround makes use of the multivalue property of Python toolbox parameters. This property allows users to choose the factors they need to insert, giving the tool the number of factors as well as their names in one step.
Figure 3.3: Second window, which has the table prefilled and leaves only the necessary fields empty for the user to fill (three factors were used in this example).
After getting the names and number of factors from the first parameter with its multivalue property enabled, the tool passes those values to the actual program code. The program then creates a DataFrame, sets the number of its columns and rows to the number of factors, and fills their headings with the factor names. The tool then fills the table with the value (1) wherever a factor in the rows meets itself in the columns, and fills half of the table with the value (X) in the cells that mirror the user-editable cells, signaling to the user not to modify them, as they will be filled automatically later with the reciprocal of the value the user enters. After completing these tasks, the tool creates an Excel file in the location that the user specified in the second parameter and exports the DataFrame to it.
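The prefilled table and the later reciprocal step can be sketched as follows. This is a minimal illustration rather than the tool's implementation: plain nested lists are used instead of a pandas DataFrame to keep the sketch free of dependencies, the helper names make_template and fill_reciprocals are hypothetical, the choice of the lower half of the table for the (X) cells is an assumption (the text does not say which half is used), and the three judgments are invented for the example.

```python
PLACEHOLDER = "X"  # marks cells the user is asked not to modify


def make_template(n):
    """Return an n x n matrix with 1 on the diagonal, PLACEHOLDER below it
    (assumed half) and None above it for the cells the user fills in."""
    return [
        [1 if r == c else (PLACEHOLDER if r > c else None) for c in range(n)]
        for r in range(n)
    ]


def fill_reciprocals(matrix):
    """Replace each cell [r][c] below the diagonal with 1 / matrix[c][r]."""
    completed = [list(row) for row in matrix]  # leave the input unchanged
    for r in range(len(completed)):
        for c in range(r):
            completed[r][c] = 1.0 / completed[c][r]
    return completed


template = make_template(3)

# Hypothetical user judgments for the three cells above the diagonal
template[0][1], template[0][2], template[1][2] = 3, 5, 2

matrix = fill_reciprocals(template)
for row in matrix:
    print(row)
```

Asking the user for only one half of the table and deriving the other half keeps the matrix reciprocal by construction, which is what the (X) cells are meant to guarantee.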
The tool then opens the newly created file in Microsoft Excel, displaying the previously created table with its prefilled values, and waits for the user to complete the table. During this time the tool checks every 15 seconds whether the file is still open, and it posts a message in the geoprocessing window reminding the user to fill the table, save the file and close Excel. Once it detects that the table has been completed and the file closed, it retrieves the data from the file and stores it again in a DataFrame for further processing. After all the calculations are done, the results are saved as an Excel file in the location specified in the second parameter. Depending on the user’s preference in the third parameter, a PCF file and a DSF file can also be created, both of which are used to export the results to TerrSet’s AHP tool.
An illustrative example was produced with three factors in comparison. Their weight preference assignment was relatively random. The resulting Excel file contains four sheets. The first sheet, termed “Judgment Value Matrix”, contains a table completely filled with the user’s preferences between factors and their reciprocal values (Figure 3.4). The second, named “Normalized Values”, contains the normalized value of each cell (Figure 3.5), while the third sheet, called “Priority Vector or Weights”, contains the calculated weight or priority vector for each factor (Figure 3.6). The final sheet, “Final Results”, contains the verification of the weight assignment, reported as the Consistency Index (Figure 3.7). From the “Priority Vector or Weights” sheet the user can copy and paste the values into the weight column of the Weighted Sum tool.
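The 15-second waiting loop described above can be sketched in a generic form. The sketch makes several assumptions: the function name wait_until_closed is hypothetical, the way the tool actually detects that Excel still holds the file is not shown in this section and is therefore passed in as a caller-supplied is_open function, and the reminder is sent through print here, where a geoprocessing tool would more likely pass arcpy.AddMessage.

```python
import time


def wait_until_closed(is_open, interval=15, notify=print, sleep=time.sleep):
    """Block until is_open() returns False, reminding the user on each cycle.

    is_open  -- callable returning True while the workbook is still in use
                (hypothetical; the real detection method is not shown here)
    interval -- seconds between checks (15 in the workflow described above)
    notify   -- callable that displays the reminder
    sleep    -- injectable so the loop can be exercised without waiting
    Returns the number of reminders that were issued.
    """
    reminders = 0
    while is_open():
        notify("Please fill the table, save the file and close Excel.")
        reminders += 1
        sleep(interval)
    return reminders


# Demonstration with a stand-in check that reports "open" twice, then "closed"
states = iter([True, True, False])
count = wait_until_closed(lambda: next(states), sleep=lambda seconds: None)
```

Keeping the check and the pause as parameters lets the loop be tried without Excel or ArcGIS installed; in the real tool both would be replaced by the actual file check and a genuine 15-second pause.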
Figure 3.4: Judgment Value Matrix sheet in the output Excel file.
Figure 3.5: Normalized Values sheet in the output Excel file.
Figure 3.6: Priority Vector or Weights sheet in the output Excel file.
Figure 3.7: The final sheet in the output Excel file, which contains the verification of the values previously inserted by the user.
CHAPTER 4 RESULTS AND DISCUSSION
4.1: Overview:
A dataset was used to test and verify the results of the AHP Tool as well as to compare it with other existing tools, namely TerrSet’s Idrisi WEIGHT module. The testing data was obtained from the Tutorial Data included with TerrSet’s MCE section. Each factor layer was standardized using Fuzzy Set Membership (Eastman, 2016), and the resulting layers were used as the input factors for TerrSet’s WEIGHT module as well as for the AHP Tool developed in this research. The comparison layers consist of Landfuzz, Townfuzz, Waterfuzz, Roadfuzz, Slopefuzz and Devfuzz, each of which represents a factor in the AHP table. According to Eastman (2016), the algorithm TerrSet uses to calculate the AHP weights is based on Thomas L. Saaty’s paper published in 1977, while the AHP Tool developed in this research is based on R. W. Saaty’s paper published in 1987 (R. W. Saaty, 1987). The next section discusses the mathematical results and any disparity that appears in them, while the last section compares the results visually to give the reader a sense of the similarity or disparity between the tools’ results.

4.2: Mathematical results:
TerrSet’s WEIGHT module produces two files as output (.pcf and .dsf) as well as displayed text that can be saved as an HTML file for reference. The pairwise comparison values for the aforementioned example dataset from the MCE Tutorial are given in Table 4.1:
Table 4.1: An AHP table with priority values and their reciprocals inserted.

| | Landfuzz | Townfuzz | Waterfuzz | Roadfuzz | Slopefuzz | Devfuzz |
|---|---|---|---|---|---|---|
| Landfuzz | 1 | 0.333333 | 0.333333 | 0.333333 | 0.2 | 1 |
| Townfuzz | 3 | 1 | 1 | 0.142857 | 0.333333 | 0.333333 |
| Waterfuzz | 3 | 1 | 1 | 0.333333 | 0.333333 | 1 |
| Roadfuzz | 3 | 7 | 3 | 1 | 1 | 3 |
| Slopefuzz | 5 | 3 | 3 | 1 | 1 | 5 |
| Devfuzz | 1 | 3 | 1 | 0.333333 | 0.2 | 1 |
The weights calculated from the above table by the AHP Tool and by TerrSet’s WEIGHT module are listed in Table 4.2:

Table 4.2: A comparison of the weights produced by both tools as well as a summary of the difference between them.

| Factor | Weight: AHP Tool | Weight: TerrSet WEIGHT | Weights Difference | Difference Percentage |
|---|---|---|---|---|
| Landfuzz | 0.063244 | 0.0620 | 0.001244 | 1.97% |
| Townfuzz | 0.09057 | 0.0869 | 0.00367 | 4.05% |
| Waterfuzz | 0.110475 | 0.1073 | 0.003175 | 2.87% |
| Roadfuzz | 0.312404 | 0.3182 | -0.005796 | 1.82% |
| Slopefuzz | 0.319171 | 0.3171 | 0.002071 | 0.65% |
| Devfuzz | 0.104135 | 0.1085 | -0.004365 | 0.04% |
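The AHP Tool column of Table 4.2 can be reproduced from the judgment matrix of Table 4.1 with the column-normalization and row-mean procedure described in section 3.3.1. The following is a minimal sketch, not the tool's source code. Two points are assumptions: the principal eigenvalue is approximated as the sum of each column total multiplied by the corresponding weight, and the Random Index values are the ones commonly cited for Saaty's method, which may differ from those stored in the tool. The function names are hypothetical.

```python
FACTORS = ["Landfuzz", "Townfuzz", "Waterfuzz", "Roadfuzz", "Slopefuzz", "Devfuzz"]

# Judgment matrix of Table 4.1 (rows and columns follow the order of FACTORS)
MATRIX = [
    [1, 1 / 3.0, 1 / 3.0, 1 / 3.0, 1 / 5.0, 1],
    [3, 1, 1, 1 / 7.0, 1 / 3.0, 1 / 3.0],
    [3, 1, 1, 1 / 3.0, 1 / 3.0, 1],
    [3, 7, 3, 1, 1, 3],
    [5, 3, 3, 1, 1, 5],
    [1, 3, 1, 1 / 3.0, 1 / 5.0, 1],
]

# Commonly cited Random Index values by matrix size (assumed, see lead-in)
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45,
                10: 1.49, 11: 1.51, 12: 1.48, 13: 1.56, 14: 1.57, 15: 1.59}


def ahp_weights(matrix):
    """Return (weights, column totals) using column normalization and row means."""
    n = len(matrix)
    col_totals = [sum(row[c] for row in matrix) for c in range(n)]
    normalized = [[row[c] / col_totals[c] for c in range(n)] for row in matrix]
    weights = [sum(row) / n for row in normalized]
    return weights, col_totals


def consistency_ratio(weights, col_totals):
    """Approximate lambda_max, then CI = (lambda_max - n) / (n - 1) and CR = CI / RI."""
    n = len(weights)
    lambda_max = sum(t * w for t, w in zip(col_totals, weights))
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


weights, col_totals = ahp_weights(MATRIX)
cr = consistency_ratio(weights, col_totals)

for name, weight in zip(FACTORS, weights):
    print("{:<10} {:.6f}".format(name, weight))
print("Consistency Ratio: {:.4f}".format(cr))
```

Under these assumptions the printed weights agree with the AHP Tool column of Table 4.2 to the precision shown there, and the Consistency Ratio falls below the 0.10 threshold.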
The calculated weights of the two tools had an average difference of 1.9%, which can be considered negligible when drawing a suitability map, as the next section will demonstrate. However, even after a strict review of the AHP Tool source code, the source of the difference was not found. Unfortunately, TerrSet’s WEIGHT module is a proprietary tool; as such, it is not possible to review the underlying source code to determine the cause of the difference between the two results. The AHP Tool was therefore checked against numerical examples from other sources and papers, such as Estoque (2011), and it produced identical results. Consequently, it is reasonable to assume that TerrSet has optimized its module’s formula beyond its original form, resulting in slightly different results from the output of the original formula.

4.3: Visual Comparison:
In the interest of consistency, all the drawing processes during the comparison between the different results were conducted with ArcGIS’s Weighted Sum tool. The resulting suitability maps from both tools were practically identical (Figure 4.1).
Figure 4.1: A visual comparison between (A) the suitability map produced from the AHP Tool and (B) the suitability map produced from the TerrSet WEIGHT module.
As a result of the previous comparisons, it can be concluded with fair confidence that the tool works as intended (Khair, 2018).
CHAPTER 5 CONCLUSION AND RECOMMENDATIONS
5.1: Conclusion:
The AHP Tool developed through this research integrates fully with ArcGIS and has been shown to work as intended, despite the limitations imposed by ArcGIS’s Python Toolbox on the design of any custom tool’s GUI. The accessibility and adaptability of Python allowed Microsoft Excel to be integrated into the workflow, which allowed the development to overcome a critical obstacle that could have stopped the research in its infancy. Tools of this kind have already been developed by previous programmers; however, none of them met the goals intended for this tool. As such, this tool should be a significant addition to the GIS community in general and the SDSS community in particular. The choice of Python as the development language should help future-proof the tool and reduce or even eliminate the need to maintain it through minor, and perhaps even major, revisions of ArcGIS. One of the main advantages of the Python Toolbox is that it removes the need to install files or to go through a complicated sequence of steps to start the tool, a major issue that plagues the previous tools. The tool is ready to work with a simple drag and drop. The user interface was simplified to invite newcomers as well as to allow professionals to make quick calculations without too many issues or parameters to deal with.
5.2: Recommendations:
The AHP Tool has a few limitations that stem from the way it was designed. Among them is its reliance on the aging AHP formulation of R. W. Saaty (1987). A newer and better approach based on Saaty’s formula was published by Alonso & Lamata (2006). Unfortunately, there were insufficient applications of that approach to warrant the risk of rewriting the tool’s source code around it. However, it seems promising, and because it dispenses with the Random Index it removes the tool’s upper limit of 15 factors, making the number of comparisons theoretically unlimited. No major or minor GIS application seems to have implemented this approach yet. Adapting the current AHP Tool to the new approach while providing a comparison of the results would therefore provide fertile ground for new research, and possibly for investment and sponsorship from the major GIS software publishers.
The inclusion of Microsoft Excel was a major limitation in the intended workflow of this tool. However, it was the most practical way to overcome the non-interactive interface of custom script tools in ArcGIS. Due to time limitations, it was not possible to try alternative workarounds after the Excel-based one. For future development, however, there might be ways to work around that limitation without the need for external software.
The AHP Tool falls in the middle of the MCE workflow. Its input is the result of the standardization process (fuzzy membership standardization was used in the example of this study), and its output is used by a weight aggregation tool (the Weighted Linear Combination method was used in this study), which outputs the final suitability map to the user.
The main issue with ArcGIS’s Weighted Sum tool is that it does not accept preprocessed weight distribution files, which is a significant disruption, or at least an inconvenience, in cases where results are needed quickly or the number of factors is large. An interesting future endeavor in this area would be to create a weight aggregation tool in the same toolbox as the AHP Tool, which would automatically take the output of the AHP Tool and use it as input to produce the final suitability map, eliminating the need to enter the weights manually and creating a seamless workflow in the process. Another potential development is decoding the Weighted Overlay tool’s input format, which should allow the AHP Tool to create an output that is compatible with the Weighted Overlay tool’s weight distribution file format, thus creating the intended seamless workflow.
The AHP Tool code is open source and has been published on GitHub (Khair, 2018). This will allow developers interested in the tool to easily access and improve it in the future.
REFERENCES
Alonso, J. A., & Lamata, M. T. (2006). Consistency in the analytic hierarchy process: a new approach. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 14(04), 445-459.
Artz, M. (2014). Strengthening the Link between GIS and Science. Retrieved from https://blogs.esri.com/esri/esri-insider/2014/11/05/strengthening-the-link-between-gis-and-science/
Eastman, J. (2016). TerrSet Manual. Clark Labs-Clark University: Worcester, MA, USA.
ESRI. (2016). Comparing custom and Python toolboxes. Retrieved from http://desktop.arcgis.com/en/arcmap/10.4/analyze/creating-tools/comparing-custom-and-python-toolboxes.htm
Estoque, C. (2011). GIS-based multi-criteria decision analysis (in natural resource management). Ronald, D1–Division of Spatial Information Science, University of Tsukuba.
Khair, K. G. (2018). Analytical Hierarchy Process (AHP) Python Toolbox for ArcGIS. Retrieved from https://doi.org/10.5281/zenodo.1234060
Lin, J. W.-B. (2012). Why Python is the next wave in earth sciences computing. Bulletin of the American Meteorological Society, 93(12), 1823-1824.
Marinoni, O. (2004). Implementation of the analytical hierarchy process with VBA in ArcGIS. Computers & Geosciences, 30(6), 637-646.
Rao, M., Sastry, S., Yadar, P., Kharod, K., Pathan, S., Dhinwa, P., . . . Phatak, V. (1991). A weighted index model for urban suitability assessment: a GIS approach. Bombay Metropolitan Regional Development Authority, Bombay.
Saaty, R. W. (1987). The analytic hierarchy process: what it is and how it is used. Mathematical Modelling, 9(3-5), 161-176.
Saaty, T. L. (1977). A scaling method for priorities in hierarchical structures. Journal of Mathematical Psychology, 15(3), 234-281.
Triantaphyllou, E. (2000). Multi-criteria decision making methods. In Multi-criteria decision making methods: A comparative study (pp. 5-21): Springer.
van Rees, E. (2014). Python Scripting and GIS Increasing Efficiency. GeoInformatics, 17(7), 46.
Wes McKinney, & PyData Development Team. (2017). pandas: powerful Python data analysis toolkit. Retrieved from http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html?highlight=dataframe
APPENDIX SOURCE CODE
from __future__ import print_function
import os

import arcpy
import numpy
import pandas as pd

arcpy.env.overwriteOutput = True


class Toolbox(object):
    def __init__(self):
        self.label = "AHPython Toolbox"
        self.alias = "ahp"
        # List of tool classes associated with this toolbox
        self.tools = [AHP]


class AHP(object):
    def __init__(self):
        self.label = "Analytical Hierarchy Process Calculation Tool"
        self.description = "This tool was developed to use the Analytical Hierarchy Process (ahp) method to calculate the weights of each factor compared to the other factors."
        self.canRunInBackground = False

    def getParameterInfo(self):
        ## The First Parameter takes the factors
        param0 = arcpy.Parameter()
        param0.name = "parameter0"
        param0.displayName = "Factors Names"
        param0.direction = "Input"
        param0.datatype = "DEFile"
        param0.parameterType = "Required"
        param0.multiValue = True
        ## The Second Parameter takes the user desired output location and output name
        param1 = arcpy.Parameter()
        param1.name = "parameter1"
        param1.displayName = "Output File"
        param1.direction = "Output"
        param1.datatype = "DEFile"
        param1.parameterType = "Required"
        ## The Third Parameter is a Boolean that takes the user preference regarding
        ## exporting the result to TerrSet compatible files that accompany the output file
        param2 = arcpy.Parameter()
        param2.name = "parameter2"
        param2.displayName = "Create a (.pcf) and (.dsf) files"
        param2.direction = "Input"
        param2.datatype = "GPBoolean"
        param2.parameterType = "Optional"
        return [param0, param1, param2]

    def isLicensed(self):
        return True

    def updateParameters(self, parameters):
        return

    def updateMessages(self, parameters):
        ## Check if the number of inserted factors is below or above the limits of the Tool
        if parameters[0].altered and parameters[1].hasBeenValidated:
            factorsinput = parameters[0].valueAsText
            factors = factorsinput.split(";")
            if len(factors) >= 15:
                parameters[0].setErrorMessage("The chosen factors are above the limits of this tool, please reduce their number.")
            elif len(factors) < 3:
                parameters[0].setErrorMessage("The chosen factors are below the limits of this tool, please increase their number.")
        return
    def execute(self, parameters, messages):
        ## Reading Factors from File
        factorsinput = parameters[0].valueAsText
        factors = factorsinput.split(";")
        opath = parameters[1].valueAsText
        if opath[-4:] != ".xls":
            opath = opath + ".xls"
        No_Factors = len(factors)
        x = 0
        while x