
SELECTED TOPICS in SYSTEM SCIENCE and SIMULATION in ENGINEERING

Study And Development Of Ad Hoc Algorithms For Designing Waste Collection Routes: Test Of Capabilities

CLAUDIA CABALLINI, PIETRO GIRIBONE, ROBERTO REVETRIA, ALESSANDRO TESTA

DIPTEM – Department of Industrial Production, Technology, Engineering and Modelling
Via Opera Pia 15, Genoa, ITALY

Abstract – This paper presents the analysis, development and testing of a Decision Support System (DSS) for planning, managing and optimizing waste collection operations in an urban context. The Authors implemented a proprietary simulator in Java™ composed of four functional modules (Global Positioning System, Data Mining system, waste collection point placement optimizer, and planner for routing and resource exploitation). The simulator was then validated on a specific case study, in cooperation with a large town (about 50,000 inhabitants) in Central Italy that made itself available for testing the simulation model.

Key Words – Decision Support System (DSS), Simulation, Global Positioning System (GPS), Data Mining system, optimizer, planner.

ISSN: 1792-507X
ISBN: 978-960-474-230-1

1. Introduction

Waste collection is an increasingly critical problem in modern society, where people are used to having all the comforts they can afford, and where packaging and even "durable" goods are often discarded rather than recycled or repaired. Governments and many multinational companies have tried to make people more sensitive to the waste problem, but the overall situation is contradictory: everything we buy seems designed to be discarded very soon, with no particular attention.

In some large towns, especially in central and southern Italy, the capacity for disposing of and recycling waste has long been a critical issue. The need for separate waste collection has increased the collection resources required, with trucks and other specialist vehicles to be bought, and has made a larger number of collection points mandatory. These changes must be reconciled with traffic conditions, with physical constraints on routing and accessibility, and with the need for a high service level in any case, since waste cannot be left unattended for days in front of houses or public places. Hence the need for a simulation tool able to support decision making on this topic. This is not the first time that DIPTEM – University of Genoa has tackled the problem with a simulator, but this new-generation system adds many more functionalities, more closely linked to modern needs, with respect to the first models developed in the mid-1990s.

2. Structure of the Model

The simulator consists of four macro features:

1. Geo-referencing roads and attributes through the SOAP protocol. Integration with geo-referenced systems through SOAP (Simple Object Access Protocol) makes it possible to build GIS (Geographic Information System) support while skipping the data entry stage (or at least drastically reducing it), thus reducing implementation costs. Significant points can still be added (geo-referenced) to the digital map with the aid of GPS, ensuring the completeness of the information.

2. Data Mining System on waste production. Use of data mining based on Artificial Neural Networks and second-order polynomial models. From reporting data and census data (population, non-domestic users, etc.), the module extracts the map of the relationship between the socio-economic area and the production of MSW (Municipal Solid Waste). The results are subjected to a thorough statistical analysis (F-test, R² test, Lack of Fit test) and are fed to artificial neural networks for determining the response surface (RSM). The forecasting system is thus designed to be more reliable than those currently found in the literature and in specialized software.

3. Optimizing placement of collection points. Development of a coherent "placement" optimization model for collection points, based on integer mathematical programming models developed ad hoc and tested by the Authors in several case studies in Northern and Central Italy. The placement applies both to roadside collection bins and to separate collection bins or door-to-door collection.

4. Planning routes and fleet for collection. Development of integrated fleet planning for collecting MSW, based on algorithms and heuristics for the CVRP (Capacitated Vehicle Routing Problem) derived from MIT (Massachusetts Institute of Technology).

The development language is Java (J2SE) 1.4; the interface with the SOAP Web Service is provided by the Apache Axis framework. The interfaces enabling the system integration capabilities described above were developed using JBuilder.

3. The Validation Phase

1. Validation of Georeferencing Module


The application geo-references roads: by connecting to the GIS Web server, it obtains the longitude and latitude of each address entered. The georeferencing method has two components. The first, given an address, returns the longitude and latitude of the point, taking into account ambiguous situations (e.g. several addresses matching the selection). In that case, the user must choose the correct one from a list of addresses matching the selection criteria. If nothing is found for the selection criteria, the user can enter the parameters manually. If no street number is given, the system calculates the center of the address entered. The second component, given a properly formatted file of addresses, returns an array of addresses; each address object in the array follows the logic described above.

To validate the geo-referencing module it was necessary to test the correctness of the assignment of latitude and longitude to addresses: a set of data was taken randomly, together with some logical points, and compared with the results given by a popular worldwide website providing maps and place positioning. The results were very satisfactory (see Figure 1, where the points calculated by the simulator and by the most widely used web application are both reported as markers on the maps).

[Figure 1 map labels: Via Maestri del Lavoro 3 (12.91401-42.42176); Via Antonazzi E 10 (12.89279-42.40738); Via Terminillo 3 (12.87915-42.40648); Via Ternana 78 (12.85596-42.42923)]

Figure 1: Results of testing simulator against world wide web GPS tools.

2. Validation of Data Mining System

The second application tool uses a data mining system based on artificial neural networks and second-order polynomial models. From reporting data and census data (population, non-domestic users, etc.), the module extracts the map of the relationship between the socio-economic area and the production of MSW. The results are subjected to a thorough statistical analysis (F-test, R² test, Lack of Fit test). If the regression analysis cannot find a suitable relationship, neural networks can be used instead; this involves an approximation error typically greater than that of the Analyzer method. The method has two components: the first uses regression analysis, the second a neural network. Before running a regression, the analyzer must be configured for the chosen method.

There are many methodologies for formulating laws that interpret the mathematical relationships, more or less complex, between the variables (factors, responses) that determine the behavior of the system. In other words, it is possible to construct a model connecting the response (waste) to the factors (domestic and non-domestic users) involved in the system. In general there is a dependent variable Y (response: MSW produced daily per census box) which depends on k independent variables X1, X2, ..., Xk (factors: hotels, shopping centers, etc.). The relationship between these variables is characterized by a mathematical model called the "regression equation", which can be expressed symbolically as:

$$Y = \varphi(X_1, X_2, \ldots, X_k)$$

The function φ is unknown, and an appropriate function (here, polynomial models) must be chosen to approximate it. Operationally, it is necessary to identify the type of relationship (linear, nonlinear) that best approximates the system, by testing significance and goodness of fit. Since the model is a polynomial expression of a given order, the task is to verify the correctness of the hypothesis made (the order of the polynomial adopted). The tests are performed according to the ANOVA methodology (Analysis of Variance). Without going into the formalism, ANOVA analyzes the variance of the samples and the sample average values in order to obtain a single test of significance, which allows decisions to be taken with a desired degree of risk.

It remains to observe that the model is built on the available data: the selection of the variables influencing the system and the characterization of the study areas (census boxes) were imposed by data availability, and were not determined by an a priori design. An iterative approach over regression test scenarios was therefore necessary to find the correct mix of factors able to represent the situation (in such cases the risk of detecting relationships between variables that are not significant in reality is very high).

The regression model is represented by the following equation:

$$Y = b_1 \cdot \mathrm{Inhabitants} + b_2 \cdot \mathrm{Hotels} + b_3 \cdot \mathrm{Industries} + b_4 \cdot \mathrm{Other} + b_5 \cdot \mathrm{Hotels} \cdot \mathrm{Other} + b_6 \cdot \mathrm{Industries} \cdot \mathrm{Other}$$

The coefficients b1, b2, ..., b6 are the regression coefficients of the model (referred to here as correlation coefficients), whose values are presented in Table 1. To estimate these parameters it was assumed that the errors are uncorrelated random variables with normal distribution. For each coefficient it is thus possible to identify a confidence interval, which is a function of the mean and the standard deviation.

Coefficient   AVG        Std Error   -95%       95%       t Stat   P value       VIF
b1            0,889      0,07910     0,728      1,051     11,24    2,78372E-12   1,420
b2            272,99     92,39       84,30      461,68    2,955    0,00604       9,120
b3            770,30     273,91      210,89     1329,7    2,812    0,00859       930,42
b4            24,84      10,02       4,377      45,30     2,479    0,01901       3,030
b5            -8,169     3,239       -14,78     -1,554    -2,522   0,01721       11,19
b6            -134,33    54,86       -246,36    -22,29    -2,449   0,02040       934,07

Table 1: Correlation Coefficients
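As a rough illustration, the regression equation can be evaluated with the point estimates from the AVG column of Table 1. The sketch below is not the Authors' code: the class and method names are hypothetical, the decimal commas of the table are converted to points, it is written in present-day Java rather than J2SE 1.4, and the census box used in `main` is invented purely for the example. The units of Y are those of the source data, which are not stated here.

```java
public class MswRegression {
    // Point estimates from the AVG column of Table 1 (decimal commas converted to points).
    static final double B1 = 0.889;    // inhabitants
    static final double B2 = 272.99;   // hotels
    static final double B3 = 770.30;   // industries
    static final double B4 = 24.84;    // other non-domestic users
    static final double B5 = -8.169;   // hotels x other interaction
    static final double B6 = -134.33;  // industries x other interaction

    /** Predicted daily MSW for one census box, following the regression equation term by term. */
    static double predict(double inhabitants, double hotels, double industries, double other) {
        return B1 * inhabitants
             + B2 * hotels
             + B3 * industries
             + B4 * other
             + B5 * hotels * other
             + B6 * industries * other;
    }

    /** t statistic of a coefficient: the estimate divided by its standard error. */
    static double tStat(double estimate, double stdError) {
        return estimate / stdError;
    }

    public static void main(String[] args) {
        // Hypothetical census box: 1000 inhabitants, 1 hotel, no industries, 10 other users.
        System.out.println("Predicted daily MSW: " + predict(1000, 1, 0, 10));
        // Cross-check of the first row of Table 1 (estimate 0,889, standard error 0,07910).
        System.out.println("t(b1) = " + tStat(B1, 0.07910));
    }
}
```

For the invented census box the prediction is about 1328.7, and the recomputed t statistic for b1 agrees with the 11,24 reported in Table 1.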


As shown in Table 2, the regression passes the significance test, supporting the regression approach.

Source       df   SS            SS%   MS            F       F Signif
Regression   6    15405729,71   82    2567621,618   27,81   8,87987E-11
Residual          3323476,85    18    110783
Total             18729206,56   100

Table 2: Significance of regression

The correctness of the model translates into its ability to represent the system itself, as shown in Figure 2. Regarding the quantities of waste produced daily, the difference between simulated and real data is about 7% (considering the average correlation coefficients). This percentage can be considered, as a first approximation, the error of the regression model for this test.

Figure 1: real data vs. simulation

The construction of the regression model gives a preliminary identification of the specific production of MSW for domestic and for non-domestic use. In the equation found above these two components are clearly separated, so the total waste produced daily can be split by type of user: households account for about 67% of the daily production of waste, and non-domestic users for 33%.

Figure 2: production of domestic and non domestic waste

The regression model thus estimates the share of production from households at 67% of total waste generated. To verify the reliability of the model identified, some considerations are necessary. First, the quantity and quality of the waste produced vary with geographic location (north and south of Italy). In addition, the socio-economic characteristics of the territory influence the type of production. With these considerations in mind, the case study should be compared with data from similar contexts; however, while there are abundant studies on the product breakdown of waste, research on distribution channels is very scarce. Only some regions in Italy have published studies in this regard, and they are the most advanced from the socio-economic point of view.

The model produced by the neural network is to be used especially when the analytical regression model is unable to solve the problem. It is not explained in human symbolic language: the results must be accepted "as is", hence the definition of neural networks as "black boxes". In other words, unlike an algorithmic system, where the path from input to output can be examined step by step, a neural network is able to generate a valid result, or a result with a high probability of being acceptable, but it is not possible to explain how and why such a result is generated. The graph in Figure 3 shows the actual daily production of MSW vs. that simulated with the neural network. The error of this model is higher than that of the regression model, by about 12%.

Figure 3: Production of real and simulated daily MSW (Neural Network)

A comparison of the MSW production obtained by the regression method with that obtained by the neural network shows qualitatively the same trend (Figure 4), apart from small offsets in quantity.

Figure 4: daily MSW production (regression and neural network)

3. Validation of Optimizer for placement of collection points.

The optimal location of bins, in both number and type, was determined using the Branch & Bound methodology. This methodology, developed by Land and Doig, starts from the set of all feasible solutions, which is divided into two disjoint subsets whose union is the initial set (branching). For each subset, a bound is then calculated that is no higher than the lowest "cost" (the objective function to be maximized or minimized) of any of its elements (bounding). Proceeding in this way, progressively branching the subsets that contain the best solution, it is possible to reach a set with a single element, which is the optimum. This methodology appears to be one of the most appropriate in the literature for solving resource allocation problems.

The boundary conditions for the definition of the optimum point are summarized as follows:

• Type of box. The volume of the containers to be placed on the ground. Standard sizes exist for the different types of waste collection.
• Maximum filling %. The load factor of the box.
• Frequency of collection. The number of collections made weekly. The frequency gives the number of days for which the bins have to buffer the production of MSW. For example, a frequency of 3 out of 7 (three times a week) means that the containers need a buffer capacity of 3 days.
• Maximum allowable distance user / bin. In general, municipal regulations set a maximum allowable distance between a house and the nearest box.

The output of the method, in addition to the collection points (characterized by longitude and latitude, number and type of containers present, and utilities served), must also provide a list of any non-compliant users (typically with respect to distance) to be managed manually downstream of the optimization.

The placement tool was validated on a sample area comprising two roads with 284 residents. For the test area, proximity collection of the wet fraction was assumed. In particular, the variables considered were:

1. Volume of containers. Volumes ranging from 80 liters to 120 liters.
2. Frequency of collection. Two possible frequencies: six times a week or three times a week.
3. Maximum allowable distance from the box to houses. Varying between 50 meters and 100 meters.

Tables 3 and 4 show the results.
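To make the interplay of the first three boundary conditions concrete, the following sketch sizes the number of bins for an area. It is only an illustration under stated assumptions, not the Branch & Bound optimizer itself: the class and method names are hypothetical, the buffer rule `ceil(7 / collections per week)` is an assumption chosen because it reproduces the "three times a week means 3 days" example, and the daily production figure (0.5 liters per resident) is invented. Only the 284 residents, the 120 liter bins and the 80% maximum filling come from the text.

```java
public class BinSizing {

    /**
     * Days of production the bins must buffer.
     * ASSUMPTION: longest gap between collections in a 7-day week, i.e. ceil(7 / frequency).
     */
    static int bufferDays(int collectionsPerWeek) {
        return (int) Math.ceil(7.0 / collectionsPerWeek);
    }

    /** Minimum number of bins so that buffered waste fits within the usable volume. */
    static int binsNeeded(double dailyLitres, int collectionsPerWeek,
                          double binLitres, double maxFill) {
        double usableLitres = binLitres * maxFill;                    // volume usable per bin
        double bufferedLitres = dailyLitres * bufferDays(collectionsPerWeek);
        return (int) Math.ceil(bufferedLitres / usableLitres);
    }

    public static void main(String[] args) {
        // 284 residents (from the test area); 0.5 L per resident per day is a made-up value.
        double dailyLitres = 284 * 0.5;
        System.out.println("3 collections/week: " + binsNeeded(dailyLitres, 3, 120, 0.8) + " bins");
        System.out.println("6 collections/week: " + binsNeeded(dailyLitres, 6, 120, 0.8) + " bins");
    }
}
```

Under these assumptions, halving the collection frequency raises the requirement from 3 to 5 bins of 120 liters. A real placement model would additionally enforce the maximum user / bin distance, which this sketch ignores.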

Table 3: results for zone 1

Table 4: results for zone 2

These results show some interesting information:

a. The days of waste accumulation depend only on the collection frequency (in the same area), and not on the type and number of bins installed in the territory.
b. Varying the collection frequency and the type of bin installed changes the volume of waste managed per year.
c. The filling rate of the bins, which the literature takes as an index of the efficiency of the system, is below the 80% defined as input. This parameter is influenced not only by the frequency of service but also by the maximum allowable distance between house and bin.

4. Validation of route and fleet planner

In optimizing and establishing delivery sequences, the cost of a complete tour of deliveries (in this case, collections) is usually proportional to the distance traveled, so the aim is to find the path of minimum total distance that visits each location once. The savings matrix method is quite simple to implement and can be used to assign deliveries / collections to vehicles when there is a time constraint. The method comprises the following steps:

• Creation of the distance matrix
• Creation of the savings matrix
• Allocation of deliveries / collections to vehicles

Once these steps have been performed, each mission can be optimized individually. Since the cost of each solution is typically proportional to the distance traveled, the distance matrix is calculated geometrically by means of the formula:

$$\mathrm{Dist}(A, B) = \sqrt{(x_A - x_B)^2 + (y_A - y_B)^2}$$

where (x_A, y_A) and (x_B, y_B) are the coordinates of points A and B. The savings matrix expresses the convenience, in terms of savings, of grouping two deliveries on the same path; a higher value corresponds to greater convenience. The saving is calculated using the formula:

$$S(A, B) = \mathrm{Dist}(DC, A) + \mathrm{Dist}(DC, B) - \mathrm{Dist}(A, B)$$

where DC is the base. A path consists of at least one delivery and covers the journey from the base and back (e.g. DC -> Delivery -> DC); the same holds when multiple deliveries are grouped together (e.g. DC -> Delivery 1 -> Delivery 2 -> DC). The procedure groups on the same route (vehicle) the deliveries / collections that have the highest savings, while respecting the maximum capacity of the vehicle (in this case, the maximum daily load and time). The analysis examines all pairs of deliveries one by one, starting from the highest saving and then moving on to the next. Deliveries are thus gradually aggregated onto the available vehicles, whose routes, once filled, are optimized to minimize travel time and thus guarantee a tangible resource saving for the company.

To test this last module, the path was computed according to the Travelling Salesman Problem, simply to empty the bins placed with the 3rd module (Table 8). For this purpose it is necessary to fix the place of departure and arrival of the collection vehicles (in this case, collection of the organic fraction using a Porter-type tank vehicle) and the type of vehicle used (Table 7).
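The savings procedure described in this section can be sketched as follows. This is a minimal illustration, not the simulator's planner: the procedure is of the kind commonly known as the Clarke and Wright savings heuristic, the class and method names are hypothetical, straight-line distance between planar coordinates is assumed, only a load capacity is enforced (the time constraint is left out), and the code uses present-day Java (8 or later) rather than J2SE 1.4. The four collection points in `main` are invented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SavingsSketch {

    /** Straight-line distance between two points given as {x, y}. */
    static double dist(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** S(A, B) = Dist(DC, A) + Dist(DC, B) - Dist(A, B). */
    static double saving(double[] dc, double[] a, double[] b) {
        return dist(dc, a) + dist(dc, b) - dist(a, b);
    }

    /** True if the stop is the first or last stop of the route. */
    static boolean isEnd(List<Integer> route, int stop) {
        return route.get(0) == stop || route.get(route.size() - 1) == stop;
    }

    static double totalLoad(List<Integer> route, double[] load) {
        double sum = 0;
        for (int s : route) sum += load[s];
        return sum;
    }

    /** Returns the routes as ordered lists of stop indices; the base DC is implicit at both ends. */
    static List<List<Integer>> plan(double[] dc, double[][] pts, double[] load, double capacity) {
        int n = pts.length;
        // routeOf.get(i) is the route currently containing stop i; start with DC -> i -> DC.
        List<List<Integer>> routeOf = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<Integer> single = new ArrayList<>();
            single.add(i);
            routeOf.add(single);
        }
        // All pairs of stops, ranked by decreasing saving.
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                pairs.add(new int[] {i, j});
        pairs.sort((p, q) -> Double.compare(
                saving(dc, pts[q[0]], pts[q[1]]),
                saving(dc, pts[p[0]], pts[p[1]])));
        for (int[] p : pairs) {
            List<Integer> ra = routeOf.get(p[0]);
            List<Integer> rb = routeOf.get(p[1]);
            if (ra == rb) continue;                                    // already on the same route
            if (!isEnd(ra, p[0]) || !isEnd(rb, p[1])) continue;        // only route ends can be joined
            if (totalLoad(ra, load) + totalLoad(rb, load) > capacity) continue;
            if (ra.get(ra.size() - 1) != p[0]) Collections.reverse(ra); // p[0] becomes last stop of ra
            if (rb.get(0) != p[1]) Collections.reverse(rb);             // p[1] becomes first stop of rb
            ra.addAll(rb);
            for (int s : rb) routeOf.set(s, ra);
        }
        // Each surviving route is reported once, when its first stop is reached.
        List<List<Integer>> routes = new ArrayList<>();
        for (int i = 0; i < n; i++)
            if (routeOf.get(i).get(0) == i) routes.add(routeOf.get(i));
        return routes;
    }

    public static void main(String[] args) {
        double[] dc = {0, 0};
        double[][] pts = {{0, 10}, {1, 10}, {10, 0}, {10, 1}};  // invented collection points
        double[] load = {1, 1, 1, 1};
        System.out.println(plan(dc, pts, load, 2));
        System.out.println(plan(dc, pts, load, 4));
    }
}
```

With a capacity of 2 the two nearby pairs end up on separate vehicles; with a capacity of 4 all four points are served by a single route. Each resulting route could then be re-sequenced individually, as the text describes for the Travelling Salesman check.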

Table 7: Points collection object routing

Table 8: Points collection object routing

The results of the module were compared with those calculated mathematically according to the Traveling Salesman Problem, and the two overlapped to a satisfactory approximation.

4. Conclusion

The simulation model developed by the Authors gave good results in the validation phase based on a real case study. The four modules were tested one by one; since they have independent functionalities, an integration test is implicitly passed. In any case, before applying the model to the town that served as pilot case study, which has shown interest in using the simulator permanently as a DSS, the Authors will perform a further integration test to measure the adaptability of the simulation model to different boundary conditions. In the coming months the simulation model presented here will in fact be operating as a DSS in at least 4 towns in Italy.

References

[1] Applegate, D. L.; Bixby, R. M.; Chvátal, V.; Cook, W. J. (2006), "The Traveling Salesman Problem", ISBN 0691129932.
[2] Briano E., Caballini C., Revetria R., Schenone M., Testa A. (2010), "Use of System Dynamics for modelling customers flows from residential areas to selling centers", Proceedings of WSEAS ACMOS'10, Catania, Italy, May 29-31.
[3] Cantarella, G. E., and A. Sforza (1991), "Traffic assignment", in Concise Encyclopedia of Traffic and Transportation Systems (ed. Papageorgiou, M.), Pergamon Press, Oxford, pp. 513-520.
[4] Christofides, N. (1976), "Worst-case analysis of a new heuristic for the travelling salesman problem", Technical Report 388, Graduate School of Industrial Administration, Carnegie-Mellon University, Pittsburgh.
[5] Cormen, T. H.; Leiserson, C. E.; Rivest, R. L.; Stein, C. (2001), "The traveling-salesman problem", Introduction to Algorithms (2nd ed.), MIT Press and McGraw-Hill, pp. 1027-1033, ISBN 0-262-03293-7.
[6] Land A.H., Doig A.G. (2010), "An Automatic Method for Solving Discrete Programming Problems", in "50 Years of Integer Programming 1958-2008", Springer Berlin Heidelberg, pp. 105-132, ISBN 978-3-540-68274-5.
[7] Potts, R.B., and R.M. Oliver (1972), Flows in Transportation Networks, Academic Press, New York.
[8] Sheffi, Y. (1985), Urban Transportation Networks, Prentice Hall, Englewood Cliffs, New Jersey.
[9] Zhang G., Patuwo B.E., Hu M.Y. (1998), "Forecasting with artificial neural networks: The state of the art", International Journal of Forecasting, Volume 14, Issue 1, 1 March 1998, pp. 35-62.