
Assessing the Costs (and Benefits) of Dependable Public IP Service Networks

Ed Stoker and Joanne Bechta Dugan
University of Virginia
[email protected] and [email protected]

Abstract— We consider the problem of estimating the financial impact of dependability improvements for commercial IP networks. In particular, greater corporate and individual reliance on commercial Internet services, along with the general economic slowdown, has put phenomenal economic pressure on the industry. To address this problem, we have developed an Economic Reliability Analysis framework (called ERA) at the University of Virginia [21] and applied it to a modern, commercial IP network. The ERA framework extends traditional reliability engineering techniques to include financial elements such as revenues, expenses, and investment periods. We have already extended full-path enumeration techniques [21], Markov reward models (MRM) [20], software reliability growth models (SRGM) [22] and stochastic reliability models [19] using our ERA framework. This paper combines the ERA framework with reliability bounding techniques and uses it to forecast the expected reliability and expected economic efficiency of a commercial IP data network. The purpose of this effort is to provide engineers with the means to estimate the impact of network reliability in financial terms. The commercial network being analyzed links 21 distinct locations, each with its own collection of nodes, edges, customers and local topologies, along with 2 trans-oceanic edges to another IP network. Changes are proposed to this network to improve overall customer dependability, and the impacts of these changes (in terms of network dependability and economic return) are analyzed.

Keywords: Dependable networks, economic efficiency, reliability estimation, economic models

I. CURRENT SITUATION

Consider the following scenario. A telecommunications company has a large, distributed network that provides the basis for its revenue. This network is, by and large, very reliable and contains points of presence (POP) connected by leased communication links of various capacity, reliability and cost. The company must decide how to change its network to meet its new and present customers' current and future network usage requirements with improved reliability, better performance, and lower operational and support costs, without endangering its current revenues. What is needed is a means to accurately predict the number and economic impact of network failures on network systems as these systems are being designed. This capability would give network engineers a means to accurately estimate the costs and benefits of competing network designs and technologies before actual implementation, deployment and support of these systems. In short, network engineers need a methodology that provides a way to compare economic efficiency with network reliability across multiple network design options. This would facilitate better network project planning and provide greater insight into the organization's financial and customer requirements before incurring implementation costs.

II. RELATED WORK

Reliability engineering focuses on the design and implementation of systems and networks that rarely fail. It has developed (and continues to investigate) a large and varied set of strategies to model and measure network and system failures and the impact of component failures on networks and systems ([2], [17]). Assigning economic value to failure-oriented and flow-oriented network reliability models has taken many directions in the current literature. Research at British Telecom [24], [3], [25] focused on modeling repair costs. Markov and semi-Markov models have been used to evaluate network and system reliability for a wide variety of digital systems and environments [6], [5], [14], [10], [18]. Markov models have been applied to system reliability problems when combinatorial models (such as full pathset enumeration) are inapplicable [1]. Other reliability research has looked into the problem of fault modeling and detection in modern networks [29], [7], [23], including QoS fault detection and prediction [15], [12], [11], [8], [9], [4]. Research strategies have approached reliability as an engineering design problem and have developed tools and techniques to address it [16], [13], and in related work, economic models have been proposed dealing with electrical power distribution [26], [27], [28]. All of this research to date overlooks two important economic concepts: the monetary benefits (i.e., revenues) associated with a system and the time-value of money.
Inclusion of these concepts can produce a better understanding of the economic worth of a reliable system. This article aims to extend reliability bounding techniques with an Economic Reliability Analysis (ERA) framework developed at the University of Virginia [21] to include economic elements such as revenues, expenses, and investment periods. We have already extended full-path enumeration techniques [21], Markov reward models (MRM) [20], software reliability growth models (SRGM) [22] and stochastic reliability models [19] using this framework.


III. THE PROBLEM

The general problem, simply stated, is: "How do you profitably design and operate a dependable, commercial IP network?" This general problem can be addressed by a set of smaller, more directed questions:

1) What is the dependability and profitability of the current network?
2) How will proposed network modifications alter the dependability and profitability of the network?
3) When will the improvement in profits from the proposed modifications compensate for the investment needed to make them?

We will extend our Economic Reliability Analysis framework to answer these questions for an actual commercial IP network. The network topology is shown in Figure 1. We will apply reliability bounding techniques to estimate network failures based on distinct node and edge failures. The results from this analysis will be used as input for ERA, along with economic data, to determine the economic efficiency of the current network and three proposed network changes. Specific revenue, cost and repair data have been altered for this paper. The reliability bounding techniques presented in this paper are fairly simple and are meant to illustrate how even straightforward reliability models can provide important insights into the economic efficiency of network designs. In practice, more sophisticated techniques, such as Markov reward models [20] or stochastic network simulations [19], can be used by reliability engineers with ERA.

A. EdNet Description

We call our commercial IP network EdNet. A map of the network used in this study is shown in Figure 1. This network consists of 21 sites, each with its own set of nodes, 'backbone' edges and access circuits. Sites are geographically distributed across many cities. Each of these sites was designed to have a dependable local network, with different numbers of redundant nodes and edges, a 'clean' environment with redundant backup power, and 7 x 24 monitoring and local hardware support. Access circuits (not shown on the EdNet map) represent traffic sources/sinks and provide both traffic demand and revenue estimates. Understanding the impact on economics (profitability) and dependability is critical in determining the placement of redundant edges.

Fig. 1. EdNet Network. [Network map not reproduced. The legend distinguishes five edge classes by MTBF / MTTR: 8760 hrs. / 2 hrs.; 39420 hrs. / 3 hrs.; 52560 hrs. / 120 hrs.; 52560 hrs. / 2 hrs.; 52560 hrs. / 240 hrs.]
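To give a feel for what the MTBF and MTTR values in the Fig. 1 legend imply, the following minimal sketch converts each pair into a steady-state availability using the standard ratio MTBF / (MTBF + MTTR). The paper does not state which availability formula was used, so this choice is an assumption; the function name is ours.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Long-run fraction of time a repairable component is up.

    Assumption: the textbook ratio MTBF / (MTBF + MTTR); the paper does not
    say which formula its reliability engineers applied.
    """
    return mtbf_hours / (mtbf_hours + mttr_hours)


# (MTBF, MTTR) pairs in hours, copied from the Fig. 1 legend.
EDGE_CLASSES = [(8760, 2), (39420, 3), (52560, 120), (52560, 2), (52560, 240)]

for mtbf, mttr in EDGE_CLASSES:
    a = steady_state_availability(mtbf, mttr)
    print(f"MTBF={mtbf:6d} h  MTTR={mttr:3d} h  "
          f"availability={a:.6f}  unavailability={1 - a:.3e}")
```

For the 39420 hrs. / 3 hrs. class, the unavailability comes out at about 7.61E-05, the same value that Table V lists as the disconnect probability of several sites; whether those sites are each attached by a single edge of that class is our reading, not something the text states.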

B. ERA Description

In [21], Stoker and Dugan define an Economic Reliability Analysis methodology to evaluate the economic worth of a reliable network. The general strategy behind the ERA framework is to collect and use availability and financial data about a system from within an organization rather than build "yet another reliability/financial model". The ERA framework provides a means to determine how changes in component reliability, service pricing, or component/task dependencies influence system availability, return on investment and design. Table I provides a summary of the operational set of processes performed by the ERA framework. Figure 2 shows that these processes can contain additional processes. These processes determine the nature and characteristics of the economic reliability model being built.

Fig. 2. ERA Methodology Flowchart. [Flowchart not reproduced. For each candidate system design j, the flow builds a reliability model from reliability and system design data, calculates revenues, lost revenues, repair costs, other costs and capital costs (comparing lease vs. purchase cost of new system components), builds and evaluates the ERV model for that design, and, once all ERV models have been evaluated, selects the best ERV system model.]

1) Evaluate ERV Model Results: The evaluation process consists of answering a set of economic, engineering and management questions using the Economic Reliability Value (ERV) and economic vectors. Examples of such questions are:

1) How does network availability affect the economic worth of a network?
2) How does one evaluate network component changes in terms of reliability and return on investment?
3) How long will it take for new network components and topology to return a profit on the initial investment?
4) What is the appropriate level of component and network reliability?

These questions and others can be answered once network ERVs are calculated and compared.

Step  Function
1.    Determine the duration, size and scope of the analysis.
2.    Build software reliability models for all design choices.
3.    Map software reliability information into component and task failure data.
4.    Calculate revenue vectors for all design choices.
5.    Calculate lost revenue vectors for all design choices.
6.    Calculate recurring cost vectors for all design choices.
7.    Calculate other cost vectors for all design choices.
8.    Calculate capital cost vectors for all design choices.
9.    Calculate [ERV] for all design choices.
10.   Analyze and interpret results of evaluation.

TABLE I
ERA FRAMEWORK ALGORITHM

C. ERA Framework Assertions

The ERA framework relies on the following set of assertions.

1) Network revenues can be estimated.
2) Failure costs can be estimated.
3) Failures are independent and identically distributed.
4) The network structure is coherent.

These assertions can be checked when the ERA framework is verified. Verification occurs when actual results (failures, revenues, costs) are analyzed and compared with the modeled predictions. New information can be fed back into the model and the forecasts updated.

1) Network revenues can be estimated: There are two simple ways to estimate revenue forecasts. The first approach involves computing an average revenue per customer circuit and estimating the growth in customer circuits based on past growth. This approach can be refined by including customer churn and price erosion factors, which can also be estimated from past performance. A second approach uses Sales Forecasts and involves the same set of variables (including churn and price erosion) along with a better understanding of the placement of sales resources and service/location growth. This information typically comes from Marketing and Sales and can be a more accurate predictor than previous experience. Other methods of revenue estimation are possible. The revenue estimates used in EdNet have been created using the first approach, from a 'sanitized' set of average revenues by access type. These averages do not reflect actual values (but could). These averages are multiplied by the number (and type) of access circuits found in each Site and summed to determine the Site Revenue.

2) Failure costs can be estimated: There are two economic components that can be associated with failures. The first is a cost component that is associated with component failures; the second is a refundable revenue component associated with network failures. The important feature is that average component costs and network refundable revenues can be estimated.

Component failures. Component failure costs can be determined fairly easily. Component failure records are kept that typically include labor and other cost metrics. These records can be used to derive an average component failure cost. These average component failure costs can be multiplied by the expected number of component failures to get expected component repair costs. Component repair cost estimates used in EdNet employ this method. A set of 'sanitized' component capital, edge leasing and node repair costs is provided that does not (but could) reflect actual values. These costs, along with Site revenues and refundable revenues, are used to construct a set of annuities in the ERA framework.

Network failures. Estimating the financial impact of network failure is more difficult than estimating component costs because network failures 1) occur infrequently, 2) involve refundable revenues rather than actual costs, and 3) are complicated by contractual terms and customer perceptions of network failure. The current ERA framework addresses these concerns by defining an Upper Bound on refundable revenues incurred from network failures. This bound is a function of network availability. The Upper Bound is produced by calculating the "all-sites connected" availability of the network and multiplying its complement by the sum of the Site refundable revenues (see Equation 1).

\[
\text{Refund Revenue}_{UB} = \Bigl(1 - \prod_{j=1}^{M} P_j\Bigr) \sum_{j=1}^{M} S_j \qquad (1)
\]

P_j → Site j connection probability
S_j → Site j refundable revenues
M → Total number of Sites
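As a rough numerical illustration of Equation 1, the sketch below evaluates the upper bound for a small, made-up network. The function name and the three-site figures are hypothetical and are not taken from EdNet; only the formula itself comes from the text.

```python
from math import prod


def refund_revenue_upper_bound(connect_probs, refundable_revenues):
    """Equation 1: (1 - product of P_j) * (sum of S_j).

    connect_probs       -- P_j, connection probability of each site
    refundable_revenues -- S_j, refundable revenue of each site
    """
    if len(connect_probs) != len(refundable_revenues):
        raise ValueError("need one probability and one revenue per site")
    all_sites_connected = prod(connect_probs)  # "all-sites connected" availability
    return (1.0 - all_sites_connected) * sum(refundable_revenues)


# Hypothetical three-site network (illustrative numbers only).
p = [0.999, 0.9995, 0.9999]
s = [1000.0, 250.0, 0.0]
bound = refund_revenue_upper_bound(p, s)
print(f"Upper bound on refundable revenue: ${bound:.2f}")
```

Because the product shrinks as sites are added, the bound grows with network size, which is the behaviour the text goes on to discuss.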

Equation 1 Upper Bound for Refundable Revenue from Network Failures

This simplification of availability is used only for the upper bound on revenue loss. At first blush, this formula may seem an oversimplification. However, it provides a 'worst-case' estimate for lost revenues, which is reasonable for financial budgets. More detailed models are used to predict component failure counts and durations. Equation 1 links refundable revenues to all-site network availability. It assumes that the Sites are perfectly reliable and that only edges contribute to network availability. Unfortunately, as the number of sites grows (M → ∞), Refundable revenues → Total revenues. Typically, however, service contracts limit the financial liability to a fraction of revenues for select classes of customers. Customers who do not subscribe to (and pay for) this guarantee get "best effort" service and are not entitled to a partial refund of their monthly bill. The ERA framework accounts for this by means of an Effective Refundable Revenue Factor (ERRF) and by counting only those customer revenues refunded when network failures occur.

3) Network and component failures are independent and identically distributed: The current ERA framework assumes that component failures occur randomly (i.e., independent of other failure conditions) and that the failure characteristics are the same over an entire class of network components. In other words, all network edges of the same class behave in the same manner. This assertion allows us to develop an initial framework and simplifies the network reliability models. It permits us to define component failures in terms of averages. In addition to permitting average component failures, this assertion allows us to combine average failures (from a stochastic reliability process) with average cost per failure to support a stochastic financial process. Independence of network component failures is not an onerous assertion and is supported by the Site topology design, which includes node redundancy, local edge redundancy and even node power supply redundancy. Actual site failure data supports the contention of failure independence. There is insufficient problem data to confirm (or deny) identical failure distributions.
The use of highly reliable common hardware, structured wiring, backup power and common software supports the notion of identical distribution of failures.

4) Network structure is coherent: Network reliability can be viewed as a Stochastic Binary System (SBS) that randomly fails as a function of random component failures. Each component in the component set T either operates or fails. The structure of the network can be viewed as a function ψ(S) defined for each set of operating components S ⊆ T. An SBS is coherent if ψ(T) = 1, ψ(∅) = 0 and ψ(S′) ≥ ψ(S) for any S′ ⊃ S. This property implies that the failure of any component will not improve the operation of the network. For any Stochastic Coherent Binary System, a pathset is a set of operating components that implies system operation; a cutset is a set of failed components that implies system failure. The set of minimum pathsets and the set of minimum cutsets are important and can provide bounds on overall network reliability.
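The coherence conditions above can be checked by brute force on a small example. The sketch below uses a hypothetical three-site triangle network (not part of EdNet), defines ψ as "all sites connected", and enumerates subsets to test coherence and list the minimal pathsets. All names are illustrative, and exhaustive enumeration is only practical for very small component sets.

```python
from itertools import combinations

# Hypothetical triangle network: three sites, three edges (not from EdNet).
EDGES = {"ab": ("a", "b"), "bc": ("b", "c"), "ac": ("a", "c")}
SITES = {"a", "b", "c"}


def psi(operating):
    """Structure function: 1 if the operating edges connect all sites, else 0."""
    reached, frontier = {"a"}, ["a"]
    while frontier:
        node = frontier.pop()
        for name in operating:
            u, v = EDGES[name]
            if node in (u, v):
                other = v if node == u else u
                if other not in reached:
                    reached.add(other)
                    frontier.append(other)
    return int(reached == SITES)


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_coherent(components):
    """Check psi(T) = 1, psi(empty) = 0, and monotonicity over all subset pairs."""
    if psi(frozenset(components)) != 1 or psi(frozenset()) != 0:
        return False
    all_sets = list(subsets(components))
    return all(psi(big) >= psi(small)
               for small in all_sets for big in all_sets
               if small != big and small.issubset(big))


def minimal_pathsets(components):
    """Pathsets that contain no smaller pathset."""
    paths = [s for s in subsets(components) if psi(s) == 1]
    return [p for p in paths
            if not any(q != p and q.issubset(p) for q in paths)]


print(is_coherent(EDGES))
print([sorted(p) for p in minimal_pathsets(EDGES)])
```

In this toy network every pair of edges forms a minimal pathset, so losing any single edge leaves the network connected.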

Coherence is a critical assertion, as it allows the use of network reliability bounding techniques in the ERA framework. Coherence is essential to produce bounds for refundable revenue from network failures.

IV. EDNET ANALYSIS

A. EdNet Framework Model Assumptions

There are some critical assumptions that are part of the ERA framework model. These assumptions provide the basis for the analysis of EdNet.

Number  Assumption
1       The model time period consists of 12 months.
2       Component reliability metrics do not change over time period.
3       Revenues and access circuits do not change over time period.
4       All revenues, refundable revenues, and costs are computed on a monthly basis.
5       All months contain 730 hours.
6       Access2 customers are entitled to a 25% monthly refund when network fails.

TABLE II
EDNET FRAMEWORK ASSUMPTIONS

In addition to the framework assumptions, there are financial data assumptions (Table III) that must be included to estimate revenues, refundable revenues and costs. The Effective Refundable Revenue Factor is provided as part of these assumptions.

Amount      Description
$500.00     Monthly revenue from Access1 circuit
$175.00     Monthly revenue from Access2 circuit
$500.00     Monthly revenue from Access3 circuit
$400.00     Monthly revenue from Access4 circuit
$1,000.00   Average repair cost per node failure
$1,000.00   Average repair cost per site failure
0.2500      Effective Refundable Revenue Factor
0.0100      Monthly discount rate

TABLE III
EDNET FINANCIAL DATA ASSUMPTIONS

Both sets of EdNet assumptions can be changed in the framework, but are used to illustrate the calculations needed to produce the Economic Reliability Value. The first three framework assumptions determine the scope of the analysis. The duration of the analysis is critical in interpreting any results. The Financial assumptions help assess the cost of network failures in EdNet.


B. Get current and proposed network component and topology data.

1) Current Revenue and Cost Data Formulae: Equation 2 describes how the individual site revenues are calculated at period i. Equation 3 describes how total revenue is calculated at period i. Equation 4 describes how the individual site repair costs are calculated at period i. Equation 5 describes how the network edge repair costs are calculated at period i.

Symbol                      Definition
L_i                         Max. number of Access edges at site j of type ℓ at period i
M_i                         Total number of sites at period i
N_i                         Total number of edges at period i
# Access(j)(i)(ℓ)           Number of Access edges in site j of type ℓ at period i
Avg. Rev/Access(j)(i)(ℓ)    Avg. revenue for Access type ℓ at site j for period i
Exp # of node rep.(j)(i)    Expected number of node repairs for site j at period i
Avg. node rep. cost(j)(i)   Avg. node failure cost at site j for period i
Exp # of site rep.(j)(i)    Expected number of repairs at site j for period i
Avg. site rep. cost(j)(i)   Avg. site failure cost at site j for period i
Exp # of edge rep.(k)(i)    Expected number of repairs at edge k for period i
Avg. edge rep. cost(k)(i)   Avg. repair cost at edge k for period i

TABLE IV
FORMULAE DEFINITIONS

\[
\text{Site Rev}(j)(i) = \sum_{\ell=1}^{L_i} \#\,\text{Access}(j)(i)(\ell) \cdot \text{Avg. Rev/Access}(j)(i)(\ell)
\]

Equation 2 Site Revenue Calculation.

\[
\text{Total Rev}(i) = \sum_{j=1}^{M_i} \text{Site Rev}(j)(i)
\]

Equation 3 Total Revenue Calculation.

\[
\text{Site Rep}(j)(i) = \text{Exp \# of node rep.}(j)(i) \cdot \text{Avg. node rep. cost}(j)(i) + \text{Exp \# of site rep.}(j)(i) \cdot \text{Avg. site rep. cost}(j)(i)
\]

Equation 4 Site Repair Cost Calculation.

\[
\text{Total Edge Rep Cost}(i) = \sum_{k=1}^{N_i} \text{Exp \# of edge rep.}(k)(i) \cdot \text{Avg. edge rep. cost}(k)(i)
\]

Equation 5 Total Repair Cost Calculation.

2) Current and proposed network component and topology data: Two types of topology data are needed for the ERA framework. The first is site connection data. For EdNet, this data is shown in Figure 1. Site Inventory Data (shown in Table V) is also required. This table provides the Name, Availability, Network Disconnect Probability, Revenue, Refundable Revenue and Repair Cost values for each Site. Site Availability and Disconnect Probability values are determined by Reliability Engineering. Table V contains both site node information (used to estimate Site Availability) as well as the number of customers by access class. The latter data is used to determine revenues and refundable revenues. This information is needed for all network design alternatives being considered. Table VI provides the connection, availability, and expected cost data on all of the network edges.

a) Proposed Network Changes:

1) Link C1 to A3. This seems a likely candidate for redundancy, as site C1 is the only site that requires two serial edges to be connected. This topology would provide both sites C1 and ST1 with alternate routes.

Site Name  Site Avail.  Disc. Prob.  Site Rev.    Ref. Rev.   Rep. Costs
A2         0.9982       5.510E-14    $4,525.00    $1,006.25   $2.74
A3         0.9991       5.510E-14    $2,000.00    $0.00       $1.40
A4         0.9991       1.448E-09    $1,300.00    $0.00       $1.37
B1         0.9982       1.448E-09    $2,875.00    $218.75     $2.74
C1         0.9991       1.268E-04    $975.00      $43.75      $1.37
D1         0.9991       7.610E-05    $1,025.00    $131.25     $1.37
DB2        0.9986       7.610E-05    $2,225.00    $306.25     $2.05
F1         0.9982       2.383E-26    $6,475.00    $743.75     $2.74
F2         0.9991       7.610E-05    $1,075.00    $43.75      $1.37
L2         0.9982       1.102E-13    $14,700.00   $2,275.00   $2.74
L3         0.9982       7.346E-14    $13,625.00   $2,931.25   $2.74
MD1        0.9991       7.610E-05    $1,025.00    $131.25     $1.37
MI1        0.9982       3.734E-05    $4,400.00    $175.00     $2.74
MN1        0.9991       3.861E-09    $3,225.00    $306.25     $1.37
P3         0.9998       7.610E-05    $500.00      $0.00       $0.46
P4         0.9998       5.213E-08    $0.00        $0.00       $0.46
P5         0.9991       5.213E-08    $800.00      $0.00       $1.37
P6         0.9982       5.744E-21    $3,525.00    $656.25     $2.74
R1         0.9991       5.792E-09    $2,100.00    $0.00       $1.37
SF1        0.9998       7.610E-05    $800.00      $0.00       $0.46
ST1        0.9991       7.610E-05    $1,675.00    $218.75     $1.37
Total                                $68,850.00   $9,187.50   $36.30

TABLE V
SITE INVENTORY DATA
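Equations 2 and 3 can be sketched in a few lines using the average revenues of Table III. The per-site circuit counts behind Table V are not listed in the paper, so the counts below are invented for illustration; treating 25% of Access2 revenue as the refundable amount follows assumption 6 of Table II and is our reading of how the Ref. Rev. column is built (every entry in that column is a multiple of $43.75).

```python
# Average monthly revenue per circuit, from Table III.
AVG_MONTHLY_REVENUE = {
    "Access1": 500.00,
    "Access2": 175.00,
    "Access3": 500.00,
    "Access4": 400.00,
}
REFUND_FRACTION = 0.25                 # Table II, assumption 6
REFUNDABLE_ACCESS_TYPES = {"Access2"}  # assumption: only Access2 is refundable


def site_revenue(circuit_counts):
    """Equation 2: sum over access types of (circuit count * average revenue)."""
    return sum(n * AVG_MONTHLY_REVENUE[kind] for kind, n in circuit_counts.items())


def site_refundable_revenue(circuit_counts):
    """Refundable share of a site's revenue under the stated assumption."""
    return sum(REFUND_FRACTION * n * AVG_MONTHLY_REVENUE[kind]
               for kind, n in circuit_counts.items()
               if kind in REFUNDABLE_ACCESS_TYPES)


def total_revenue(sites):
    """Equation 3: sum of site revenues over all sites."""
    return sum(site_revenue(counts) for counts in sites.values())


# Hypothetical circuit counts; the real per-site counts are not published.
hypothetical_sites = {
    "X1": {"Access2": 1, "Access4": 2},
    "X2": {"Access1": 1},
}

for name, counts in hypothetical_sites.items():
    print(name, site_revenue(counts), site_refundable_revenue(counts))
print("Total", total_revenue(hypothetical_sites))
```

With these invented counts, site X1 yields $975.00 of revenue and $43.75 of refundable revenue, which happens to coincide with the C1 row of Table V; that match is suggestive only and does not establish C1's actual circuit mix.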

Edge Name  MTBF in hrs  MTTR in hrs  Lease Costs  Exp Avg. Rep. Cost per edge  Exp Month Rep. Cost
A2-A3      52560        2            $350         $1,000                       $0.04
A2-F1      52560        2            $350         $1,000                       $0.04
A2-A4      52560        2            $350         $1,000                       $0.04
A3-A4      52560        2            $350         $1,000                       $0.04
A3-B1      52560        2            $350         $1,000                       $0.04
B1-L3      52560        2            $350         $1,000                       $0.04
C1-ST1     39420        2            $500         $1,800                       $0.09
DB2-L3     39420        3            $425         $1,500                       $0.11
D1-F1      39420        3            $350         $500                         $0.04
F1-AM2     52560        120          $1,000       $21,000                      $47.95
F1-P6      52560        2            $350         $1,000                       $0.04
F1-ST1     39420        3            $800         $1,500                       $0.11
F1-F2      39420        3            $225         $350                         $0.03
F1-MI1     53560        2            $350         $2,000                       $0.07
L3-L2      52560        2            $350         $300                         $0.01
L3-MN1     39420        2            $200         $750                         $0.04
L2-AM1     53560        240          $15,000      $82,000                      $367.44
L2-R1      39420        3            $275         $750                         $0.06
L2-P6      52560        2            $350         $1,000                       $0.04
MD1-P6     39420        3            $750         $2,000                       $0.15
MN1-R1     39420        3            $200         $500                         $0.04
P3-P6      39420        3            $200         $400                         $0.03
P4-P5      8760         2            $150         $400                         $0.09
P4-P6      8760         2            $150         $400                         $0.09
P5-P6      8760         2            $150         $400                         $0.09
P6-SF1     39420        3            $200         $500                         $0.04
Total                                $24,075      $125,050                     $416.79

TABLE VI
EDGE INVENTORY DATA
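The text does not spell out how the "Exp Month Rep. Cost" column of Table VI is obtained. The rows we checked are reproduced by multiplying the average repair cost by MTTR / MTBF, so the sketch below uses that relation. It is an inference from the tabulated numbers, not a formula given by the authors, and the function name is ours.

```python
def expected_monthly_repair_cost(avg_repair_cost, mtbf_hours, mttr_hours):
    """Inferred relation: average repair cost scaled by MTTR / MTBF.

    Assumption: this matches the tabulated values to the cent for the rows
    below; the paper itself does not state the formula.
    """
    return avg_repair_cost * mttr_hours / mtbf_hours


# (edge, MTBF hrs, MTTR hrs, avg repair cost, tabulated monthly cost) from Table VI.
ROWS = [
    ("F1-AM2", 52560, 120, 21000, 47.95),
    ("L2-AM1", 53560, 240, 82000, 367.44),
    ("DB2-L3", 39420, 3, 1500, 0.11),
    ("P4-P5", 8760, 2, 400, 0.09),
]

for edge, mtbf, mttr, cost, tabulated in ROWS:
    computed = expected_monthly_repair_cost(cost, mtbf, mttr)
    print(f"{edge:8s} computed=${computed:8.2f}  tabulated=${tabulated:8.2f}")
```

The two trans-oceanic edges (F1-AM2 and L2-AM1) account for nearly all of the $416.79 total, because of their long repair times and high repair costs.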

2) Link MI1 to MD1. This is a likely candidate for redundancy, as it would provide sites MI1 and MD1 with alternate routes.
3) Link D1 to F2. This is a likely candidate for redundancy, as it would provide sites D1 and F2 with alternate routes.

Reliability and cost estimates for these proposed edges are provided in Table VII.

Edge Name   MTBF (hrs)  MTTR (hrs)  Capital Cost  Lease Cost  Repair Cost
C1 - A3     39420       3           $15,000       $850        $2,500
MI1 - MD1   39420       3           $35,000       $2,000      $2,500
D1 - F2     39420       3           $2,000        $800        $2,500

TABLE VII
PROPOSED NEW EDGE DATA

C. Derive Network Availability estimates for the current network and each network proposal.

We derive estimates for the current network and three proposed network changes. We divide the Network Availability estimate into a Site Availability portion, which the proposed network changes do not affect, and a Site Disconnect Probability portion, which changes with the inclusion of redundant links. Table VIII shows the product of the Site Availability and Site Disconnect Probability estimates for all four scenarios. We call this value the Economic Network Availability. NOTE: The Site Availability values do not change because all proposed changes only affect edges between sites.

Design    Site Avail.  Site Disc. Prob.  Economic Net. Avail.
Current   0.99522      0.999302978       0.994522723
Prop. 1   0.99522      0.999302990       0.994522734
Prop. 2   0.99522      0.999492521       0.994711358
Prop. 3   0.99522      0.999302984       0.994522728

TABLE VIII
NETWORK AVAILABILITY ESTIMATES

D. Convert network failure data into network failure expenses.

Table IX presents network failure cost estimates.

E. Get revenue and traffic growth forecasts.

There are no revenue or traffic growth forecasts (see framework assumption 3 in Table II). This means that our proposed edges will function as redundant rather than load-leveling edges. It also simplifies the annuity calculations given in the next section.

F. Build a set of availability-driven annuities for all networks.

We now build a set of annuities for each network topology. Each annuity consists of revenues (from Table V) minus refundable revenues [calculated, following Equation 1, from the network unavailability (one minus the Economic Network Availability in Table VIII) and the Refundable Revenues (also from Table V)] minus Capital Costs (from Table VII) minus Repair and Lease costs (from Tables V, VI and VII). Table IX shows the relative contributions of all these factors to each network annuity. NOTE: Network revenue does not change for any of the proposals.

G. Evaluate annuities for all networks.

Table X shows how the Economic Reliability Value increases on a monthly basis for the duration of the analysis.
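The annuity and ERV steps can be sketched as follows. The component values are copied from Table IX. The discounting convention (month 1 undiscounted, each later month divided by a further factor of 1.01, using the 1% monthly rate of Table III) is our inference from the cumulative figures in Table X rather than a formula stated in the text, and the function and variable names are ours.

```python
MONTHLY_DISCOUNT_RATE = 0.01  # Table III
MONTHS = 12                   # Table II, assumption 1

# Monthly components per design, copied from Table IX (capital is a one-time cost).
DESIGNS = {
    "Current":    dict(revenue=68850.00, lost_revenue=12.58, repair=36.30,
                       lease=24075.00, capital=0.00),
    "Proposal 1": dict(revenue=68850.00, lost_revenue=12.58, repair=36.49,
                       lease=24925.00, capital=15000.00),
    "Proposal 2": dict(revenue=68850.00, lost_revenue=12.15, repair=36.49,
                       lease=26075.00, capital=35000.00),
    "Proposal 3": dict(revenue=68850.00, lost_revenue=12.58, repair=36.49,
                       lease=24875.00, capital=2000.00),
}


def monthly_annuities(d):
    """Annuity per month: revenue minus lost revenue, repair and lease costs;
    the capital cost is charged once, in month 1."""
    recurring = d["revenue"] - d["lost_revenue"] - d["repair"] - d["lease"]
    return [recurring - d["capital"]] + [recurring] * (MONTHS - 1)


def erv(annuities, rate=MONTHLY_DISCOUNT_RATE):
    """Discounted sum of the annuities (assumed convention: month 1 undiscounted)."""
    return sum(a / (1 + rate) ** t for t, a in enumerate(annuities))


results = {name: erv(monthly_annuities(d)) for name, d in DESIGNS.items()}
for name, value in results.items():
    print(f"{name:11s} ERV after {MONTHS} months: ${value:,.2f}")
print("Highest ERV:", max(results, key=results.get))
```

Under these assumptions the 12-month totals agree with the final row of Table X to the nearest dollar. Since every proposal has both a lower recurring annuity and a nonzero capital cost, the current design keeps the highest ERV however many months are added.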

Design    Rev.     Lost Rev.  Capital Cost  Repair Cost  Lease Cost  Annuity Month 1  Annuity Month (2-12)
Current   $68,850  $12.58     $0.00         $36.30       $24,075     $44,726          $44,726
Prop. 1   $68,850  $12.58     $15,000       $36.49       $24,925     $28,876          $43,876
Prop. 2   $68,850  $12.15     $35,000       $36.49       $26,075     $7,726           $42,726
Prop. 3   $68,850  $12.58     $2,000        $36.49       $24,875     $41,926          $43,926

TABLE IX
NETWORK ANNUITIES ESTIMATES

Time Period  Current ERV  Prop. 1 ERV  Prop. 2 ERV  Prop. 3 ERV
1            $44,726      $28,876      $7,726       $41,926
2            $89,009      $72,317      $50,030      $85,417
3            $132,854     $115,329     $91,914      $128,477
4            $176,265     $157,914     $133,384     $171,111
5            $219,245     $200,078     $174,443     $213,323
6            $261,801     $241,825     $215,096     $255,117
7            $303,935     $283,158     $255,346     $296,498
8            $345,652     $324,082     $295,198     $337,468
9            $386,956     $364,600     $334,655     $378,033
10           $427,851     $404,718     $373,721     $418,196
11           $468,341     $444,438     $412,401     $457,962
12           $508,430     $483,765     $450,697     $497,334

TABLE X
NETWORK ERV ESTIMATES

V. RESULTS

Table XI presents the results of our analysis. These results show that there is a tradeoff between return on investment and overall network availability, but it is not direct. Proposals 1 and 3 have similar network availability estimates, but Proposal 3 has a noticeably higher ERV. This difference occurs because of the higher capital and edge lease costs of Proposal 1 (Table IX).

Design Name  Design Avail.  Design ERV
Current      0.99452272     $508,429.95
Proposal 1   0.99452273     $483,765.30
Proposal 2   0.99471136     $450,697.46
Proposal 3   0.99452273     $497,333.68

TABLE XI
SUMMARY

A. What is the dependability and profitability of the current network?

The current network availability is estimated at 0.99452. The ERV after one year is $508,429.95.

B. How will proposed network modifications alter the dependability and profitability of the network?

Table XI shows the results of the proposed network changes.

C. When will the improvement in profits from the proposed modifications compensate for the investment needed to make them?

Never. None of the proposals will ever have a higher ERV than the current network. This can be verified in the Annuities table (Table IX). The monthly annuity of the current network is always greater than that of any of the proposals; hence, the ERV of the current network will always be greater than the ERV of the proposals.

VI. CONCLUSIONS

We have shown that return on investment can be noticeably different for two networks with the same estimated dependability. The ERA framework provides a structured means to compare network design choices in terms of dependability and value concurrently. This framework can lead to the design and implementation of more dependable networks with better economic value. It can also lead to better mechanisms for adapting networks to changing technological and economic conditions such as those currently facing the industry.

REFERENCES

[1] M. Balakrishnan and A. Reibman, "Characterizing a lumping heuristic for a markov network reliability model," in The Twenty-Third International Symposium on Fault-Tolerant Computing (FTCS-23). Digest of Papers. 22-24 Jun 1993, 1993, pp. 56 – 65.
[2] M. O. Ball, C. J. Colbourn, and J. S. Provan, "Network reliability," in Handbook of Operations Research and Management Science, M. Ball, T. Magnanti, C. Monma, and G. Nemhauser, Eds. Amsterdam, Netherlands: Elsevier, 1995, pp. 673 – 762.
[3] P. Bell, R. Walling, and J. Peacock, "Costing the access network an overview," British Telecom Technology Journal, vol. 14, no. 2, pp. 128 – 132, April 1996.
[4] E. Brinksma and H. Hermanns, "Process algebra and markov chains," in Lectures on Formal Methods and Performance Analysis, E. Brinksma, H. Hermanns, and J. Katoen, Eds. Berlin, Germany: Springer-Verlag, 2001, pp. 183 – 231.
[5] J. Carrasco, "Computationally efficient and numerically stable reliability bounds for repairable fault-tolerant systems," IEEE Transactions on Computers, vol. 51, no. 3, pp. 254 – 268, March 2002.
[6] J. B. Dugan, S. A. Doyle, and F. A. Patterson-Hine, "Simple models of hardware and software fault tolerance," in Proceedings of the 1994 Reliability and Maintainability Symposium. Los Alamitos, CA, United States: IEEE Computer Society Press, 1994, pp. 124 – 129.
[7] B. Floering, B. Brothers, Z. Kalbarczyk, and R. Iyer, "An adaptive architecture for monitoring and failure analysis of high-speed networks," in Proceedings of the 2002 International Conference on Dependable Systems and Networks. Los Alamitos, CA, United States: IEEE Computer Society Press, 2002, pp. 69 – 78.
[8] R. German, "Non-markovian analysis," in Lectures on Formal Methods and Performance Analysis, E. Brinksma, H. Hermanns, and J. Katoen, Eds. Berlin, Germany: Springer-Verlag, 2001, pp. 156 – 182.
[9] U. Herzog, "Formal methods for performance evaluation," in Lectures on Formal Methods and Performance Analysis, E. Brinksma, H. Hermanns, and J. Katoen, Eds. Berlin, Germany: Springer-Verlag, 2001, pp. 1 – 37.
[10] S. Juneja and P. Shahabuddin, "Splitting-based importance-sampling algorithm for fast simulation of markov reliability models with general repair-policies," IEEE Transactions on Reliability, vol. 50, no. 3, pp. 235 – 245, September 2001.
[11] A. Kesselman and Y. Mansour, "Loss-bounded analysis for differentiated services," in Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms. New York, N.Y., United States: ACM Press, 2001, pp. 591 – 600. [Online]. Available: http://doi.acm.org/10.1145/365411.365544
[12] W. Lee and J. Srivastava, "A market-based resource management and qos support framework for distributed multimedia systems," in Proceedings of the ninth international conference on Information and knowledge management. New York, N.Y., United States: ACM Press, Nov. 2000, pp. 472 – 479.
[13] S. H. Low and D. E. Lapsley, "Optimization flow control i: Basic algorithm and convergence," IEEE/ACM Transactions on Networking (TON), vol. 7, no. 6, pp. 861 – 874, 1999. [Online]. Available: http://doi.acm.org/10.1145/323983.323990
[14] R. Mallubhatla and K. P. Pattipati, "Discrete-time markov reward models of automated manufacturing systems with multiple part types and random rewards," IEEE Transactions on Robotics and Automation, vol. 16, no. 5, pp. 553 – 566, October 2000.
[15] R. Neugebauer and D. McAuley, "Congestion prices as feedback signals: an approach to qos management," in Proceedings of the 9th workshop on ACM SIGOPS European workshop. New York, N.Y., United States: ACM Press, 2000. [Online]. Available: http://doi.acm.org/10.1145/566726.566748
[16] A. V. Ramesh, D. W. Twigg, U. R. Sandadi, T. C. Sharma, K. S. Trivedi, and A. K. Somani, "An integrated reliability modeling environment," Reliability Engineering and System Safety, vol. 65, pp. 65 – 75, 1999.
[17] M. L. Shooman, Reliability of Computer Systems and Networks Fault Tolerance, Analysis, and Design. New York, N.Y., United States: John Wiley & Sons, 2002.
[18] J. A. Stanshine, "Modeling silent failures in telecommunications systems," in Annual Reliability and Maintainability Symposium Proceedings. January 16-19 1995, 1995, pp. 261 – 264.
[19] E. J. Stoker and J. B. Dugan, "Economic reliability forecasting in an uncertain world," in 17th European Simulation MultiConference [ESM] 9 - 11 June 2003, no. 17. Nottingham, United Kingdom: SCS European Publishing House, June 2003, pp. 217 – 221.
[20] ——, "Economic reliability forecasting using markov reward models," University of Virginia, Charlottesville, VA, United States, Technical Report TR-ES2003a, February 2003.
[21] ——, "A framework for economic reliability analysis," University of Virginia, Charlottesville, VA, United States, Technical Report TRES2003, February 2003.
[22] ——, "When does it pay to make more reliable software systems?" University of Virginia, Charlottesville, VA, United States, Technical Report TR-ES2003a, March 2003.
[23] A. Striegel and G. Manimaran, "Edge-based fault detection in a diffserv network," in Proceedings of the 2002 International Conference on Dependable Systems and Networks. Los Alamitos, CA, United States: IEEE Computer Society Press, 2002, pp. 79 – 88.
[24] M. R. Thomas, P. Bell, C. A. Gould, and J. Mellis, "Fault rate analysis, modelling and estimation," British Telecom Technology Journal, vol. 14, no. 2, pp. 133 – 139, April 1996.

[25] J. Tindle, S. J. Brewis, and H. M. Ryan, “Advanced simulation and optimisation of the telecommunications network,” British Telecom Technology Journal, vol. 14, no. 2, pp. 140 – 146, April 1996. [26] Y. T. Yoon and M. D. Ili`c, “Independent transmission company (itc) and markets for transmission,” in Power Engineering Society Summer Meeting, vol. 1. Los Alamitos, CA, United States: IEEE Press, 2001, pp. 229 – 234. [27] ——, “Price-cap regulation for transmission: objectives and tariffs,” in Power Engineering Society Summer Meeting, vol. 2. Los Alamitos, CA, United States: IEEE Press, 2001, pp. 1052 –1057. [28] ——, “A possible notion of short-term value-based reliability,” in Power Engineering Society Winter Meeting, vol. 2. Los Alamitos, CA, United States: IEEE Press, 2002, pp. 772 – 778. [29] X. Zhao, D. Pei, L. Wang, D. Massey, A. Mankin, S. Wu, and L. Zhang, “Detection of invalid routing announcement in the internet,” in Proceedings of the 2002 International Conference on Dependable Systems and Networks. Los Alamitos CA. United States: IEEE Computer Society Press, 2002, pp. 59 – 68.

Ed Stoker (born 1951) received his B.A., M.A., and M.B.A. degrees in 1975, 1980, and 1981, respectively, from the University of Pittsburgh, Pittsburgh, PA, and has worked in computer network engineering since that time. He is currently a Ph.D. candidate in Electrical Engineering at the University of Virginia.

Joanne Bechta Dugan (F’00) received the B.A. degree (1980) in mathematics and computer science from La Salle University, Philadelphia, PA, and the M.S. and Ph.D. degrees in 1982 and 1984, respectively, in electrical engineering from Duke University, Durham, NC. She is Professor of Electrical and Computer Engineering with the University of Virginia. She has performed and directed research on the development and application of techniques for the analysis of computer systems that are designed to tolerate hardware and software faults. Her research interests include hardware and software reliability engineering, fault-tolerant computing, and mathematical modeling using dynamic fault trees, Markov models, Petri nets, and simulation. Dr. Dugan was an Associate Editor of the IEEE TRANSACTIONS ON RELIABILITY for 10 years, and is Associate Editor of the IEEE TRANSACTIONS ON SOFTWARE ENGINEERING. She served on the USA National Research Council Committee on Application of Digital Instrumentation and Control Systems to Nuclear Power Plant Operations and Safety.