
Fachbericht Nr.1999/2 Working Paper No. 1999/2

Prof. Dr. Paul Alpar Dr. Marcus Porembski Dipl.Vw. Sebastian Pickerodt

Measurement of Productivity of Web Sites

Philipps-University of Marburg School of Business Administration and Economics Universitätsstr. 24 D-35032 Marburg Germany E-mail:[email protected]


Editors: Prof. Dr. Paul Alpar, Prof. Dr. Ulrich Hasenkamp, Institut für Wirtschaftsinformatik, Philipps-Universität Marburg, D-35032 Marburg, Germany. Phone (06421) 28-23894. E-mail: {alpar | hasenkamp}@wiwi.uni-marburg.de

All rights reserved © by Philipps-Universität Marburg, Institut für Wirtschaftsinformatik, 1999


Abstract: The rise in Internet usage and in the number of commercial web sites is inevitable. However, the commercial success of these web sites is often difficult to assess and control. We develop and apply a model to measure the “productivity” of web sites as one factor of their commercial success. Inputs and outputs of web site activity are defined, and a nonparametric production function is constructed on the basis of data for a number of German web sites. In this way, the best-practice web sites among the observed ones are determined.

Introduction

The measurement of success of a web site is not easy for a number of reasons. First, the purposes for which web sites are created vary considerably. There are a number of proposals to classify these purposes, usually referred to as “business models” (see Hoffman et al. 1995 for an early classification). In some cases the site only offers information about the company and its products or services in order to raise awareness of the company or to enhance its image. In other cases, the site is used for selling products, generating direct revenues from the visitors of the site. Another group of sites offers navigational services (e.g., search engines and catalogues) or editorial content to visitors while generating revenues from advertising space or sponsored areas sold to advertisers and sponsors. Some sites serve several purposes at once. Obviously, in each case the measurement of success must be adjusted to the purpose(s) of the site. There may be only one intermediate goal that most web sites share: to attract a lot of traffic. While this is not the ultimate goal, in many cases it is a necessary condition for achieving the higher-level goals. For example, “lookers” in an electronic shop do not generate sales, but without visitors there are no sales at all. In this work, we concentrate on this intermediate goal, the generation of web traffic, and on how to achieve it in the most efficient way.

Another problem in measuring the success of a web site relates to the difficulties of measuring web site activities. The quest for high traffic is especially strong with sites that create revenues from selling advertising space. The reason is that many of the pricing models directly relate to the number of page views (or impressions) by the visitors of a site. Cartellieri et al. (1997) summarize advertising price models on the web as follows: 1. pricing per exposure (impression or unit of time spent), 2. pricing per response (click-through), or 3. pricing per action (download, information exchange, or transactions). Pricing per exposure is still the prevalent pricing model. As in traditional media, prices are expressed in CPM, the cost of exposing one thousand visitors to a message. In pricing per response, only those visitors count who click on the advertisement (usually a banner) to get to the designated area (usually at the web site of the advertiser). An example of the third model is when the advertiser pays a commission on the sales generated by the visitors who came to the advertiser through the ad. This model incorporates the idea of the value of the attracted visitors to the advertiser. While in the first two pricing models only the quantity of attracted visitors counts, the third model adds the requirement of “quality.”

It is not yet exactly known how to achieve the intermediate goal of “a lot of traffic” on a web site. Global market studies (e.g., European Commission 1998) and more specific empirical research (Alpar 1998) suggest that “good” information content is what web surfers are mostly looking for. However, for web sites that cater to a wide audience it is not easy to determine what the best content to offer is. Therefore, there is a tendency to offer as much information as possible under the given restrictions of time and budget.
If we consider the traffic generated by a web site as the output it produces, then a meaningful economic goal is to achieve the given levels of traffic with minimal input levels. The dual economic goal, the achievement of maximal traffic with given input levels, is meaningful as well. Besides the question of the operational determination of traffic and input levels, which will be addressed below, these goals can only be achieved if the relationship between web site traffic and inputs is known. Such relationships are usually represented by a production function. In the next section, we define a production model of a web site by determining the inputs and outputs and a way to estimate their relationship.

The web site productivity model

The traffic of a web site is usually measured by page hits, page views (or impressions), or visits. Page hits are rarely used for further evaluation since each separate file that makes up part of a web page is counted when that page is called up by a visitor. The view of one page can, therefore, generate several hits. Page impressions count the view of one page only once. It must only be ensured that each page view is really logged. In order to prevent uncounted page views served from proxy servers or user caches, developers of web pages can enforce that at least parts of a page have to be loaded from the original server. This is often done with the help of a small file that the user does not even see.

A surfer may visit many web sites during a session and call up many pages from the same web site. In the general case, the logging performed by the web server does not provide enough information to determine exactly which of the page requests constitute a “visit” by a visitor to this site. Almost every program that evaluates server logs has its own definition for deriving a visit from the clickstream logged for a user. Therefore, the measurement of visits and comparisons of visits to different sites are not reliable. Mandatory user registration and the use of certain technologies (e.g., cookies) can improve the measurement of visits, but these approaches are not always usable, both for tactical reasons (e.g., many users do not want to register with a site) and for technical reasons (e.g., some users disable cookies).
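The difference between page hits and page impressions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the request paths are invented, and the rule that only `.html`/`.htm` files and directory URLs count as pages is an assumption of this sketch, not a definition taken from the study.

```python
from urllib.parse import urlparse

# Hypothetical request paths from a server log (illustrative data only).
requests = [
    "/index.html", "/logo.gif", "/style.css",
    "/news/today.html", "/news/photo.jpg",
    "/index.html",
]

# Assumption of this sketch: only these extensions (and directory URLs) are pages.
PAGE_EXTENSIONS = (".html", ".htm")


def is_page(path: str) -> bool:
    """Return True if the requested path looks like a page rather than an embedded file."""
    clean = urlparse(path).path.lower()
    return clean.endswith(PAGE_EXTENSIONS) or clean.endswith("/")


hits = len(requests)                                      # every requested file is a hit
page_impressions = sum(1 for p in requests if is_page(p)) # pages only
print(hits, page_impressions)
```

With these six requests, every file counts as a hit while only the three page requests count as page impressions, which is why hits overstate traffic for pages with many embedded files.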


We choose page impressions to represent the traffic of a web site for the reasons mentioned above. It could be assumed that visits could be used as an additional output measure if all the observed sites measure them in the same way. This is not true for two reasons. First, there is a high correlation between visits and page impressions, i.e., both figures express more or less the same activity. Second, whichever definition is taken to measure visits there is a certain arbitrariness in it. Therefore, we think that in the general case, visits are neither a good complement nor surrogate to measure the traffic of a web site. Other constructs for traffic measurement are also used (e.g., “ad impressions” are used to measure views of an ad on pages where ads are dynamically rotated). For analyses of productivity and efficiency page impressions are a better output indicator than advertisement revenues as well since the latter include price or bundling effects (e.g., when advertisement space on the web is bundled with ad space in a magazine). As indicated in the introduction, it is mostly information content that drives the traffic. However, there is no way to measure the information content found on static web pages or on dynamic web pages generated from database contents and calculations. As a surrogate for information content, we consider the number of HTML pages, web forms, Perl and other scripts, Java applets, and other files and programs offered by a web site. These inputs could be characterized as the quantity of information offered. A comparison of two sites with equal input levels but different output levels could be an indication of different information quality between the offerings. 
Similar to the discussion of the output side, it should be noted that for analyses of productivity these variables are preferable to cost figures for the generation of web pages, scripts, and other files and programs because the cost figures may be based on different prices for these inputs. Thus, they could bias the actual input levels. Once input and output variables are determined, the question arises as to how to relate them to each other. If there were only one output and one input variable, a simple ratio could be calculated to determine differences in productivity of different sites. Since this is not the case, a more complex model is necessary. One way is to assume a specific functional relationship between inputs and outputs and to estimate the parameters of this (production or cost) function based on observed figures. This is the traditional econometric approach. It requires some additional information, e.g., input prices if a cost function is to be estimated. Another approach, which requires neither the assumption of a specific functional form nor knowledge of input prices, is the non-parametric approach, in which a production frontier is constructed from observed figures. We choose this approach because it needs less data and fewer assumptions. More specifically, we apply Data Envelopment Analysis (DEA).
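To see why a single ratio is not enough once there are two inputs, the following Python sketch computes two simple productivity ratios for three sites, using the figures reported for them in Table 1 (HTML pages, SF, and page impressions in August 1998). The function and variable names are illustrative choices, not part of the study.

```python
# (HTML pages, SF, page impressions), taken from Table 1 of this paper.
sites = {
    "Focus": (7190, 149, 9551149),
    "Bch": (500, 22, 8188509),
    "Heise": (7846, 4, 6765054),
}


def simple_ratios(html_pages, sf, impressions):
    """Page impressions per HTML page and per script/form function (SF)."""
    return impressions / html_pages, impressions / sf


per_page = {}
per_sf = {}
for name, (html_pages, sf, impressions) in sites.items():
    per_page[name], per_sf[name] = simple_ratios(html_pages, sf, impressions)

best_per_page = max(per_page, key=per_page.get)
best_per_sf = max(per_sf, key=per_sf.get)
print(best_per_page, best_per_sf)
```

The two ratios single out different sites (Bch per HTML page, Heise per SF), so neither ratio alone yields a consistent productivity ranking. This is the situation a multi-input method such as DEA is meant to handle.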

Data Envelopment Analysis

The technique of data envelopment analysis (DEA), introduced by Charnes, Cooper and Rhodes (1978) and extended by Banker, Charnes and Cooper (1984), is now widely employed for the estimation of multiple input, multiple output production correspondences and the evaluation of the relative production efficiency of decision making units (DMUs). Consider N DMUs, where we assume that all DMUs have used the same m inputs to produce the same n outputs, although, in general, in different amounts. For DMU$_j$, $j = 1, 2, \dots, N$, let $(x_j, y_j)$ be the observed input-output configuration, where $x_j > 0$ is a vector of m observed input quantities and $y_j > 0$ is a vector of n observed output quantities. Under the DEA approach we consider a piecewise linear, concave and non-decreasing production frontier which connects the most efficient DMUs. Every DMU which lies on the production frontier is efficient, and every DMU which lies below the production frontier is inefficient. The efficiency of a DMU can be measured by the “distance” of the DMU to the production frontier, where we distinguish between the input oriented and the output oriented perspective. In the input oriented perspective the question is whether the given output levels can be achieved with less input. In the output oriented perspective the question is whether higher output levels can be achieved with the given inputs.

In the input oriented perspective we determine the maximal proportionate reduction of DMU$_j$’s inputs such that the resulting input-output configuration lies on the production frontier. For this we define the (m,N)-matrix X of observed inputs and the (n,N)-matrix Y of observed outputs and solve the linear optimization problem

$$
\begin{aligned}
\text{minimize} \quad & p \\
\text{s.t.} \quad & Y l \ge y_j, \\
& X l \le p\, x_j, \\
& l \ge 0, \; p \ge 0.
\end{aligned}
\tag{1}
$$

Let the N-vector $l_j = (l_{j,1}, \dots, l_{j,N})^T$ and the scalar $p_j$ be an optimal solution of (1). Then $(X l_j, Y l_j)$ is the input-output configuration of a nonnegative linear combination of DMUs in the sample which lies on the production frontier and produces at least the output quantities $y_j$ of DMU$_j$ without using more than a share $p_j \in (0,1]$ of its inputs $x_j$. In case of $p_j = 1$ we call DMU$_j$ efficient, and inefficient otherwise. In case of inefficiency, the DMUs $k$ whose weight $l_{j,k}$ is strictly positive are called reference DMUs of DMU$_j$. Reference DMUs are always efficient and can prove useful when trying to improve a DMU’s performance. Even merely contrasting the input/output levels of an inefficient DMU with those of its reference DMUs often helps to identify inadequacies.

In the output oriented perspective we determine the maximal proportionate increase of DMU$_j$’s outputs such that the resulting input-output configuration lies on the production frontier by solving the linear optimization problem

$$
\begin{aligned}
\text{maximize} \quad & q \\
\text{s.t.} \quad & Y k \ge q\, y_j, \\
& X k \le x_j, \\
& k \ge 0, \; q \ge 0.
\end{aligned}
\tag{2}
$$


As in the input oriented perspective, for an optimal solution $k_j = (k_{j,1}, \dots, k_{j,N})^T$ and $q_j$ of (2), $(X k_j, Y k_j)$ describes the input-output configuration of a nonnegative linear combination of DMUs in the sample that lies on the production frontier. However, this combination produces at least $q_j$ times the output quantities $y_j$ of DMU$_j$ without using more than its inputs $x_j$. We have $q_j \ge 1$, and we call DMU$_j$ efficient if and only if $q_j = 1$. As in the input oriented perspective, in case of inefficiency of DMU$_j$ the DMUs with non-zero weights $k_{j,i}$ are called reference DMUs for DMU$_j$.

The reference technology of the minimization problem (1) and the maximization problem (2) exhibits constant returns to scale (CRS). To get a tighter fit of the production frontier and to be able to examine scale efficiency, we add to the linear programs (1) and (2)

$$
e^T l = 1 \quad \text{or, respectively,} \quad e^T k = 1
\tag{3}
$$

as an additional constraint, where $e = (1, \dots, 1)^T$. This additional constraint implies that the reference technology uses variable returns to scale (VRS). By solving the four optimization problems (1), (2), (1) + (3), and (2) + (3) we get for each DMU the efficiency scores $p_j^{CRS}$, $q_j^{CRS}$, $p_j^{VRS}$ and $q_j^{VRS}$, respectively. We have $p_j^{CRS} \le p_j^{VRS} \le 1$ and $q_j^{CRS} \ge q_j^{VRS} \ge 1$. In case of $p_j^{VRS} = 1$ or $q_j^{VRS} = 1$ we call DMU$_j$ technically efficient. By defining $r_j := p_j^{CRS} / p_j^{VRS}$ and $s_j := q_j^{CRS} / q_j^{VRS}$ we have $p_j^{CRS} = r_j\, p_j^{VRS}$ and $q_j^{CRS} = s_j\, q_j^{VRS}$, and it holds that $r_j \le 1$ and $s_j \ge 1$. $r_j$ and $s_j$ are measures of the scale efficiency of DMU$_j$. For a technically efficient DMU$_j$, $r_j$ is, in the input oriented perspective, the proportionate reduction of DMU$_j$’s inputs which can be achieved by adopting the CRS technology. Accordingly, in the output oriented perspective $s_j$ is the corresponding proportionate increase of DMU$_j$’s outputs. $r_j = 1$ and $s_j = 1$ are equivalent to scale efficiency, whereas $r_j < 1$ and $s_j > 1$ indicate scale inefficiency. To see whether DMU$_j$, in case of $r_j < 1$ or $s_j > 1$, produces under increasing or decreasing returns to scale, we have to examine the optimal solution of (1) or (2), respectively. In case of $e^T l_j < 1$ or $e^T k_j < 1$, DMU$_j$ produces under increasing returns to scale, and in case of $e^T l_j > 1$ or $e^T k_j > 1$ under decreasing returns to scale.

To illustrate the above relationships let us consider the one input, one output case of Figure 1. Every DMU which lies on the CRS production frontier is efficient, i.e., technically and scale efficient. Every DMU which lies on the VRS production frontier but not on the CRS production frontier is technically efficient but not scale efficient. DMU$_0$ and DMU$_1$ are inefficient. For DMU$_0$ we have in the input oriented perspective $p_0^{CRS} = c/a$, $p_0^{VRS} = c/b$ and $r_0 = b/a$. It holds that $p_0^{CRS} < 1$, $p_0^{VRS} < 1$ and $r_0 < 1$. Under variable returns to scale, DMU$_3$ and DMU$_2$ are reference DMUs for DMU$_0$. Furthermore, DMU$_0$ produces under increasing returns to scale. Let us now consider DMU$_1$, where we take the output oriented perspective. Then it holds that $q_1^{CRS} = a'/c'$, $q_1^{VRS} = b'/c'$ and $s_1 = a'/b'$, and we have $q_1^{CRS} > 1$, $q_1^{VRS} > 1$ and $s_1 > 1$. The reference DMUs of DMU$_1$ are DMU$_4$ and DMU$_5$, and DMU$_1$ produces under decreasing returns to scale.

Figure 1: Hypothetical DMUs and production frontiers
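For the one input, one output case illustrated in Figure 1, the input oriented scores can be computed without a linear programming solver. The following Python sketch is a minimal illustration under stated assumptions: the four DMUs are hypothetical and are not the DMUs of Figure 1, and the function names are invented for this example. With several inputs or outputs the linear programs (1) and (1) + (3) would have to be solved instead.

```python
# Hypothetical DMUs as (input x, output y); not data from the study.
DMUS = {"A": (2.0, 1.0), "B": (4.0, 4.0), "C": (8.0, 6.0), "D": (6.0, 3.0)}


def crs_input_score(name):
    """Input oriented CRS score: own output/input ratio relative to the best ratio."""
    x_j, y_j = DMUS[name]
    best_ratio = max(y / x for x, y in DMUS.values())
    return (y_j / x_j) / best_ratio


def vrs_input_score(name):
    """Input oriented VRS score: least input of any convex combination of
    observed DMUs that still yields at least y_j, divided by x_j."""
    x_j, y_j = DMUS[name]
    points = list(DMUS.values())
    # Candidate 1: a single DMU that already produces enough output.
    candidates = [x for x, y in points if y >= y_j]
    # Candidate 2: a mix of two DMUs chosen so that the mix produces exactly y_j.
    for xa, ya in points:
        for xb, yb in points:
            if ya < y_j < yb:
                t = (y_j - ya) / (yb - ya)
                candidates.append((1 - t) * xa + t * xb)
    return min(candidates) / x_j


for name in DMUS:
    p_crs = crs_input_score(name)
    p_vrs = vrs_input_score(name)
    r = p_crs / p_vrs  # scale efficiency r_j = p_CRS / p_VRS
    print(f"{name}: p_CRS={p_crs:.3f}  p_VRS={p_vrs:.3f}  r={r:.3f}")
```

In this hypothetical sample, B is efficient under both technologies, A and C are technically efficient but not scale efficient, and D is inefficient under both, with a scale efficiency of 0.9.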


DEA assigns a score of one to efficient DMUs. To allow a ranking of efficient DMUs, Andersen and Petersen (1993) introduced the concept of superefficiency. The basic idea is to compare the DMU$_j$ under evaluation with a nonnegative linear combination of all other DMUs in the sample, i.e., DMU$_j$ itself is excluded. This is done by adding

$$
e_j^T l = 0 \quad \text{or, respectively,} \quad e_j^T k = 0
\tag{4}
$$

as an additional constraint to the respective linear programs, where $e_j$ is the jth unit vector, so that the weight of DMU$_j$ itself is forced to zero. In general we determine superefficiency scores only for variable returns to scale technologies, i.e., we solve in the input oriented perspective the linear program (1) + (3) + (4) and in the output oriented perspective the linear program (2) + (3) + (4). The additional constraint (4) does not change the efficiency score of an inefficient DMU, since an inefficient DMU cannot be a reference DMU of itself. But for an efficient DMU we may get in the input oriented perspective an efficiency score larger than one and in the output oriented perspective an efficiency score of less than one, where a high (or, respectively, a low) superefficiency score is associated with a high efficiency rank. However, a very high score in the input oriented perspective (or a very low score in the output oriented perspective) may indicate that a DMU is highly specialized and therefore not comparable to other DMUs. Hence the concept of superefficiency also helps to identify such DMUs.

A geometric interpretation of technical superefficiency is given in Figure 2. As we can see in Figure 1, DMU$_4$ is technically efficient, i.e., $p_4 = q_4 = 1$. To determine the superefficiency scores of DMU$_4$ we consider the production frontier which is determined by the technically efficient DMUs different from DMU$_4$. Let $p_4^{SUP}$ and $q_4^{SUP}$ denote the superefficiency scores of DMU$_4$ in the input oriented and in the output oriented perspective, respectively. Then we have $p_4^{SUP} = g/f$ and $q_4^{SUP} = f'/g'$ with $p_4^{SUP} > 1$ and $q_4^{SUP} < 1$.
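The effect of excluding the evaluated DMU from its own reference set can be sketched for the one input, one output case under variable returns to scale. The Python code below is an illustration under stated assumptions: the DMUs are hypothetical, the function name and the `exclude_self` flag are invented for this example, and returning `None` when no combination of the remaining DMUs reaches the required output is a choice of this sketch (the text above does not discuss that case).

```python
# Hypothetical DMUs as (input x, output y); not data from the study.
DMUS = {"A": (2.0, 1.0), "B": (4.0, 4.0), "C": (8.0, 6.0), "D": (6.0, 3.0)}


def vrs_input_score(name, exclude_self=False):
    """Input oriented VRS score; with exclude_self=True the DMU under
    evaluation may not appear in its own reference combination."""
    x_j, y_j = DMUS[name]
    points = [p for n, p in DMUS.items() if not (exclude_self and n == name)]
    # A single remaining DMU that produces at least y_j.
    candidates = [x for x, y in points if y >= y_j]
    # A mix of two remaining DMUs that produces exactly y_j.
    for xa, ya in points:
        for xb, yb in points:
            if ya < y_j < yb:
                t = (y_j - ya) / (yb - ya)
                candidates.append((1 - t) * xa + t * xb)
    if not candidates:
        return None  # the other DMUs cannot reach y_j under VRS
    return min(candidates) / x_j


for name in DMUS:
    print(name, vrs_input_score(name), vrs_input_score(name, exclude_self=True))
```

In this sample the score of the inefficient DMU D is unchanged by the exclusion, while the technically efficient DMUs A and B obtain scores above one. For C, which has the largest output, no comparison with the remaining DMUs is possible in this setting.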


Figure 2: Technical superefficiency

In the multiple input, multiple output framework, DEA provides with its different efficiency scores aggregated indicators of outputs relative to inputs. This allows us to identify best practice, poor practice, and specialized practice. To summarize, the main strengths of DEA are that we do not have to make any assumptions about the functional relationship between inputs and outputs as in the econometric approach, that multiple inputs and multiple outputs can be handled without any knowledge of input and output prices, and that, in addition to efficiency scores, reference DMUs are provided for every inefficient DMU.

Data gathering

Since advertisers often pay on the basis of page impressions, standardized and verifiable measurement is necessary. In our case the standardized counting ensures that the figures are comparable. In Germany, an association called IVW is the neutral agency that monitors the circulation of media. It is an agency similar to the Audit Bureau of Circulations in the USA. In 1997 it established a procedure to measure and publish web site traffic data (www.ivw.de/data/). A number of web sites that sell advertising space have adopted this procedure. These are, for example, web sites containing editorial content offered by newspapers, TV networks, news magazines, and professional journals, or sites containing search catalogues and engines, classified ads, or weather information.

The IVW procedure defines a page impression as an exposure of a certain HTML page to any user. Only exposures of pages that deliver some kind of content and potentially an advertisement are counted. The counting is done using a small image file that is dynamically generated by a CGI program every time the page is displayed in the browser window. This is necessary because the actual HTML file and any other images, sound files, etc. contained in the page can be stored in the local cache of the user’s web browser or on a proxy server and therefore would not generate a new hit in the web server’s log file. On sites using a frameset, only exposures of documents in one single frame are regarded as page impressions.

Visits are defined as continuous usage events of a web site. Such an event starts when a page inside the web site is requested by a user from outside of the web site. Any moves of the user inside the web site generate new page impressions but no new visits. The inside and outside of a web site are distinguished by the HTTP "Referer" field, which delivers the URL of the page from which a user linked to the current page. Distinct web browsers and users are identified by a combination of HTTP header fields that are usually not logged by web servers, e.g., the "Via" field containing the chain of proxies by which the original request was forwarded and the "User-Agent" field containing the browser’s type and the operating system it is running on.
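The rule that a visit starts whenever a page is requested from outside the site, as indicated by the "Referer" field, can be sketched as follows. This Python example is only an illustration: the host name, the request list, and the treatment of a missing Referer as an arrival from outside are assumptions of the sketch and are not taken from the IVW specification.

```python
from urllib.parse import urlparse

SITE_HOST = "www.example.de"  # hypothetical site under measurement

# Page requests of one browser in time order: (requested page, Referer header).
requests = [
    ("/index.html", "http://search.example.com/?q=news"),  # arrives from outside
    ("/news.html", "http://www.example.de/index.html"),    # internal move
    ("/sport.html", "http://www.example.de/news.html"),    # internal move
    ("/index.html", ""),                                   # no Referer sent
]


def comes_from_outside(referer: str) -> bool:
    """True if the referring page is not part of the measured site.

    Assumption: a missing Referer is treated as an arrival from outside.
    """
    return urlparse(referer).hostname != SITE_HOST


page_impressions = len(requests)
visits = sum(1 for _page, referer in requests if comes_from_outside(referer))
print(page_impressions, visits)
```

Here the four requests yield four page impressions but only two visits, since the two internal moves continue the visit that started with the first request.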
For this research we have chosen web sites from the groups “editorial content of general interest” (GI) and “editorial content of special interest” (SI) because the other groups are too small for meaningful best practice comparisons at this point in time. In addition, within the group GI a special subgroup of web sites owned by daily newspapers can be identified (DN).


For the measurement of the defined input variables there are unfortunately no standards and no published figures. We had to determine them ourselves with the help of programs that visited the sites, downloaded their content, and performed the necessary counting. HTTP, the web’s underlying protocol, was mainly designed to support browsing through information content rather than examining the whole set of documents a site offers. This means that it does not provide a method to get all of a server’s content with a single command. Some servers deliver listings of some of their subdirectories when they receive URLs that contain a path without a specific document name, but this function is disabled at most commercial web sites. Even if available, this function would not help much, because it only reports the names of documents without the content. Therefore, the only way to examine a whole web site is to search it recursively, following all its internal hyperlinks, starting with the homepage and ending with documents that contain no links at all or only internal links already visited and external links, i.e., links pointing to other web sites. There are tools that automatically conduct such a search, which are often referred to as robots. For our purpose, a simple (and free) UNIX/Linux-based robot called getwww (ftp://sunsite.unc.edu/pub/linux/system/Network/info-systems/www) was most suitable, because it mirrors the document structure of the searched site exactly to a directory of the system on which it is running. Other tools often put all the documents found on the remote web site into a single file in a special format, which would be more difficult to analyze.

The actual data gathering for each site comprised two steps. First, getwww was used to retrieve all the documents of a site, except for multimedia items like pictures and sound files. In the second step, the resulting set of documents was examined using a script written in Perl. This script counted the following values: the total number of files with a .html or .htm extension, the number of directly linked files with a .pl extension, the number of hyperlinks using parameters, the total number of APPLET tags contained in the HTML files, the number of different Java applets those tags referred to, the number of FORM tags contained in the HTML files, and the number of different actions which were triggered by the forms (e.g., reading from a database).

The collected figures had to be corrected in the case of HTML documents because on some servers the “mini pictures” used in the IVW method to count page impressions have exactly the same name as the document they refer to. This means that for every actual HTML document there exists a second file whose extension is also .html (or .htm) but which does not represent any useful content. To avoid double counting, a distinction had to be made between 'real' documents and the dummies used for logging purposes. This was possible because in all examined cases the string "ivw" appeared in the path in which the dummies were stored.

Other problems were encountered with certain servers. There are sites that do not offer any traditional HTML documents at all but generate all their content on the fly from some kind of database. Those sites use systems like Lotus Domino or even proprietary systems customized to a certain site. Generally, those sites use only one traditional URL with several combinations of parameters. To distinguish different documents on this kind of site, exact knowledge of the underlying database architecture would be needed, and considerable effort would be necessary to adapt the counting software to every single site that was to be analyzed. For these reasons, the present study concentrates on sites that mainly use traditional HTML documents. This also makes sense from the point of view of production functions, as it is assumed that the observed units share the same production technology.

Even some of the traditional sites presented problems to our method of counting. In some cases several aliases for the hostname were used inside the same site. Since getwww, like other robots, distinguishes between internal and external links according to the hostname, it is not able to download such a site as a whole.
Therefore, in cases where it was not possible to access the whole site through one single hostname, the site was not included at all. The collected data are given in Table 1. The site type is determined by IVW. The column “SF” contains the sum of the number of scripts and the number of different functions triggered by forms. Java applets were disregarded since only a few of the observed sites made use of this technology at that point in time. The variables visits and page impressions refer to a time period, in this case August 1998. Therefore, the variables HTML pages and SF should actually also be calculated for the whole period, as they may vary over that period. However, due to the considerable effort needed to determine the values of these variables, only one measurement was taken in the middle of the month.
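The counting step described above can be sketched as follows. The study used a Perl script that is not reproduced in the paper, so this Python version is only an illustration under stated assumptions: the mirrored site is represented by a small invented dictionary instead of a directory on disk, and the regular expressions are simplified patterns chosen for this example.

```python
import re

# Hypothetical mirrored site: relative path -> file content (illustrative only).
mirror = {
    "index.html": '<a href="cgi/search.pl">Search</a>'
                  '<form action="/cgi/login"></form>',
    "news/today.html": '<FORM ACTION="/cgi/login"></FORM>'
                       '<form action="/cgi/vote"></form>',
    "ivw/index.html": "",               # IVW counting dummy, not real content
    "cgi/search.pl": "#!/usr/bin/perl",
}

# Simplified patterns (assumptions of this sketch).
FORM_ACTION = re.compile(r'<form\b[^>]*\baction\s*=\s*["\']?([^"\'\s>]+)', re.I)
PL_LINK = re.compile(r'href\s*=\s*["\']?([^"\'\s>?]+\.pl)\b', re.I)


def is_real_html(path: str) -> bool:
    """HTML document that is not one of the logging dummies stored under 'ivw'."""
    lower = path.lower()
    return lower.endswith((".html", ".htm")) and "ivw" not in lower


html_pages = [p for p in mirror if is_real_html(p)]
scripts, actions = set(), set()
for path in html_pages:
    text = mirror[path]
    scripts.update(PL_LINK.findall(text))      # directly linked .pl files
    actions.update(FORM_ACTION.findall(text))  # different actions triggered by forms

sf = len(scripts) + len(actions)  # the "SF" column: scripts plus form functions
print(len(html_pages), sf)
```

For this invented site the sketch counts two real HTML pages, one linked script, and two different form actions, giving an SF value of three. Using sets ensures that a form action appearing on several pages is counted only once.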

URL of DMU                        Site description      Site type  HTML pages   SF     Visits  Page impressions
http://www.focus.de               weekly magazine       GI              7,190  149  2,102,986         9,551,149
http://www.bch.de                 business information  SI                500   22  1,520,918         8,188,509
http://www.heise.de               computer magazines    SI              7,846    4  1,447,983         6,765,054
http://www.tvspielfilm.de         TV movies guide       GI              1,841   24  2,281,284         6,112,067
http://www.stern.de               weekly magazine       GI             14,750   16  1,366,543         6,096,683
http://www.praline.de             adults magazine       GI              2,408   12    853,855         5,734,584
http://www.tvmovie.de             TV movies guide       GI                748   19    954,398         4,171,964
http://www.prosieben.de           TV station            GI              9,073   33    966,286         4,051,647
http://www.zdnet.de               computer magazines    SI             25,640   25    910,231         3,898,400
http://www.dm-online.de           money magazine        SI                682   16  1,209,370         3,514,025
http://www.chip.de                computer magazine     SI             13,017   20    663,796         3,222,304
http://www.bz-berlin.de           daily newspaper       DN              9,099    2    579,788         2,766,472
http://www.sueddeutsche.de        daily newspaper       DN                534    2    673,712         2,483,836
www.city-guide.de                 daily newspapers      DN             47,078   54    670,524         2,273,772
http://www.welt.de                daily newspaper       DN             19,410   25    658,462         2,270,891
http://www.berlinermorgenpost.de  daily newspaper       DN              1,236   10    424,408         1,262,545
http://www.wirtschaftswoche.de    business magazine     SI                345   14    265,635         1,147,300
http://www.geo.de                 adventure magazine    GI              4,611    8    287,360           927,909
http://www.mopo.de                daily newspaper       DN              1,279   21    264,195           907,315
http://www.fr-aktuell.de          daily newspaper       DN              2,346   10    117,204           789,305
http://www.brigitte.de            women magazine        GI              2,099   25    221,775           764,568
http://www.faz.de                 daily newspaper       DN                707    2    127,881           269,025

Table 1: Data of the observed DMUs in August 1998

Results

Three of the sites were overall efficient, i.e., both technically and scale efficient, in both the input oriented and the output oriented perspective: Bch (business-channel), Heise, and Sueddeutsche. As can be seen in Table 1, the first two are special interest sites while the third is maintained by one of the main German daily newspapers. Bch is an offering from various companies: several finance magazines, a book publisher, and the news agency Reuters. It is therefore very rich in content such as financial information including stock quotes, job offerings and career recommendations, real estate information, and news. This content appeals to important segments of Internet users such as professionals and college students. Seven sites were technically efficient but only three sites were scale efficient in the input oriented model. In the output oriented model these figures were six and three, respectively. To be able to rank the sites better, superefficiency scores were also calculated. Table 2 presents the ranking of sites according to their technical (super)efficiency scores. The rank according to page impressions has been added in the last column for comparison purposes.

DMU                  Rank in the output   Rank in the input   Rank according to
                     oriented model       oriented model      page impressions
Sueddeutsche         1                    4                   13
Bch                  2                    1                   2
Focus                3                    6                   1
Bz-berlin            4                    3                   12
Heise                5                    2                   3
Wirtschaftswoche     6                    4                   17
Praline              7                    8                   6
Stern                8                    13                  5
Tvspielfilm          9                    11                  4
Tvmovie              10                   10                  7
Dm-online            11                   9                   10
Pro-sieben           12                   18                  8
Zdnet                13                   20                  9
Chip                 14                   19                  11
Welt                 15                   21                  15
City-guide           16                   22                  14
Berliner-morgenpost  17                   12                  16
Geo                  18                   15                  18
Fr-aktuell           19                   16                  20
Mopo                 20                   14                  19
Faz                  21                   6                   22
Brigitte             22                   16                  21

Table 2: The ranking of sites by efficiency scores


Table 2 shows that extreme differences between the two perspectives occur only for “Faz”, where poor performance in the output oriented model seems to correspond with good performance in the input oriented model. However, an analysis of residuals (Lovell 1993) reveals that a large residual output inefficiency exists in this case, so that the performance of this site is not good in the input oriented model either. In order to compare sites by site type, group average efficiency scores for the output oriented model are shown in Table 3. Superefficiency scores were not used in this calculation.

Site type group   Average technical efficiency   St. dev. of technical efficiency
SI                1.5783                         0.664
GI                3.2575                         3.574
DN                4.8538                         3.266

Table 3: Group efficiency comparisons

The average technical efficiency of special interest sites is significantly better than that of both general interest groups (GI and DN). To allow further comparisons between similar sites we also conducted calculations by site type. We calculated models with the input and the output perspective, but the following results all stem from output oriented models, under the assumption that the observed sites are more interested in increasing the output (page impressions) with the given inputs (HTML pages and information provided through scripts and forms) than in saving inputs to achieve the given output levels. The reason for this assumption is that the site owners already created the contents offered for their primary business, and this content is often in a form that requires little effort to make it available on the Web as well.

Table 4 presents results for general interest sites that are not owned by publishers of daily newspapers. The efficiency score of 8.03 achieved by Brigitte means that this site would need to generate 8.03 times as many page impressions as it currently does in order to become technically efficient. Scale efficiency scores show the necessary output increases to produce under constant returns to scale, i.e., to achieve an optimal ratio between inputs and outputs. Total efficiency is the product of technical and scale efficiency as shown in section 3. The signs -, 0, and + indicate whether a site currently produces under decreasing, constant, or increasing returns to scale. The last column shows the relevant best practice reference set for the inefficient units. The weights of these reference units always add up to 1.
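As a worked example of how an output oriented score can be read, the following Python sketch uses the figures reported for Brigitte in Table 1 (page impressions) and Table 4 (efficiency scores). The variable names are illustrative. The small gap between the product of the two rounded factors and the reported total efficiency is presumably a rounding effect, which is an assumption of this sketch rather than something stated in the text.

```python
# Figures reported for Brigitte in Table 1 and Table 4 of this paper.
page_impressions = 764_568   # Table 1, August 1998
technical = 8.03             # technical efficiency (output oriented)
scale = 1.18                 # scale efficiency
total_reported = 9.51        # total efficiency as printed in Table 4

# Page impressions needed to reach the VRS frontier with unchanged inputs.
target_impressions = technical * page_impressions

# Total efficiency as the product of the two (rounded) factors.
total_from_product = technical * scale

print(round(target_impressions))
print(round(total_from_product, 2), total_reported)
```

Under this reading, Brigitte would need roughly 6.1 million page impressions instead of about 0.76 million to be technically efficient in the GI group.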

Site         Technical   Scale       Total       Returns   Reference units (weights)
             efficiency  efficiency  efficiency  to scale
Brigitte     8.03        1.18        9.51                  Tvspielfilm (.97), Stern (.02), Focus (.01)
Focus        1.00        3.65        3.65
Geo          1.00        4.12        4.12        +
Praline      1.00        1.00        1.00        0
Pro-sieben   1.60        2.44        3.89                  Stern (.51), Tvspielfilm (.38), Focus (.11)
Stern        1.00        1.25        1.25
Tvmovie      1.00        1.00        1.00        0
Tvspielfilm  1.00        1.10        1.10        -

Table 4: Efficiency scores and other results in the output oriented case for sites of type GI

Table 5 shows the results for general interest sites owned by daily newspapers. The interpretation of the efficiency scores and other entries is the same as in table 4.

Site          Technical   Scale       Total       Returns   Reference units (weights)
              efficiency  efficiency  efficiency  to scale
Berliner-mo.  1.99        2.29        4.55                  Sueddeutsche (.92), Bz-berlin (.08)
Bz-berlin     1.00        1.00        1.00        0
Faz           9.25        1.00        9.25        0         Sueddeutsche (.98), Bz-berlin (.02)
Fr-aktuell    3.22        4.29        13.82                 Sueddeutsche (.79), Bz-berlin (.21)
Mopo          2.76        2.37        6.56                  Sueddeutsche (.91), Bz-berlin (.09)
Sueddeutsche  1.00        1.00        1.00        0
Sum of Köln   1.22        24.63       29.97                 Bz-berlin (1)
Welt          1.22        11.37       13.86                 Bz-berlin (1)

Table 5: Efficiency scores and other results in the output oriented case for sites of type DN

Table 6 shows the results for special interest sites that are mostly owned by companies publishing financial or computer magazines. The interpretation of the efficiency scores and other entries is the same as in tables 4 and 5.

Site           Technical   Scale       Total       Returns   Reference units (weights)
               efficiency  efficiency  efficiency  to scale
Bch            1.00        1.00        1.00        0
Chip           2.49        1.99        4.96                  Bch (.89), Heise (.11)
Dm-online      1.00        1.76        1.76        +
Heise          1.00        1.00        1.00        0
Wirtschaftsw.  1.00        4.56        4.56        +
Zdnet          2.10        3.22        6.76                  Bch (1)

Table 6: Efficiency scores and other results in the output oriented case for sites of type SI
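The stated relation, total efficiency as the product of technical and scale efficiency, can be checked against the figures reported in tables 4 through 6. The following sketch does this with the published two-decimal values, so small gaps from rounding are expected. The helper names and the figure of 100,000 current page impressions are assumptions made only for illustration.

```python
# (site, technical, scale, total) as reported in tables 4 to 6.
ROWS = [
    ("Brigitte", 8.03, 1.18, 9.51), ("Focus", 1.00, 3.65, 3.65),
    ("Geo", 1.00, 4.12, 4.12), ("Praline", 1.00, 1.00, 1.00),
    ("Pro-sieben", 1.60, 2.44, 3.89), ("Stern", 1.00, 1.25, 1.25),
    ("Tvmovie", 1.00, 1.00, 1.00), ("Tvspielfilm", 1.00, 1.10, 1.10),
    ("Berliner-mo.", 1.99, 2.29, 4.55), ("Bz-berlin", 1.00, 1.00, 1.00),
    ("Faz", 9.25, 1.00, 9.25), ("Fr-aktuell", 3.22, 4.29, 13.82),
    ("Mopo", 2.76, 2.37, 6.56), ("Sueddeutsche", 1.00, 1.00, 1.00),
    ("Sum of Köln", 1.22, 24.63, 29.97), ("Welt", 1.22, 11.37, 13.86),
    ("Bch", 1.00, 1.00, 1.00), ("Chip", 2.49, 1.99, 4.96),
    ("Dm-online", 1.00, 1.76, 1.76), ("Heise", 1.00, 1.00, 1.00),
    ("Wirtschaftsw.", 1.00, 4.56, 4.56), ("Zdnet", 2.10, 3.22, 6.76),
]


def max_relative_gap(rows):
    """Largest relative difference between technical * scale and reported total."""
    return max(abs(tech * scale - total) / total for _, tech, scale, total in rows)


def target_output(current_output, technical_efficiency):
    """Output level a site would need in order to be technically efficient."""
    return current_output * technical_efficiency


# Every reported total agrees with the product to within rounding (under 1 %).
print(f"largest relative gap: {max_relative_gap(ROWS):.4f}")

# Hypothetical example: a site with 100,000 page impressions and a score of 8.03.
print(target_output(100_000, 8.03))
```

Read this way, a score of 8.03 for a site with a hypothetical 100,000 page impressions corresponds to a target of about 803,000 page impressions with unchanged inputs.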

Results interpretation

Our model shows that high traffic on a web site does not by itself indicate good use of web resources. Putting more data on a web site will often lead to a less than proportionate increase in traffic. This can also be seen in tables 4 through 6, which show that most web sites already produce with decreasing returns to scale. Depending on other activities related to the web site (e.g., pricing for ad space), revenues may still increase in such cases, but this becomes more difficult when producing inefficiently. Table 3 indicates that sites with special interest content perform better than sites with general interest content. Some general interest sites actually also carry a lot of special interest content; they evidently need to make this more transparent. Comparisons within groups show that brand strength by itself does not seem to explain performance differences. On the one hand, Sueddeutsche Zeitung and Frankfurter Allgemeine Zeitung (Faz) have similar brand equities, but the performances of their web sites are worlds apart. On the other hand, Focus and Stern, the two competing weekly magazines, are not that far apart in their popularity on the web.

Our observations are based on a small sample of web sites. Therefore, our future research efforts will be aimed at increasing the number of sites observed and at improving the measurement of inputs. However, the number of sites that evaluate their traffic data in a standardized way and whose data are audited is still not high in Germany (169 in May of 1999), and the number of sites for which comparable input data can be collected is even smaller.

References

Alpar, P.: Satisfaction with a Web Site: Its Measurement, Factors, and Correlates. In: Scheer, A.-W. and Nüttgens, M. (Eds.): Electronic Business Engineering, Physica-Verlag, Heidelberg 1999, 271-287.

Andersen, P., Petersen, N.C.: A Procedure for Ranking Efficient Units in Data Envelopment Analysis. Management Science, Vol. 39, No. 10, 1993, 1261-1264.

Banker, R.D., Charnes, A., Cooper, W.W.: Some models for estimating technical and scale inefficiencies in data envelopment analysis. Management Science, Vol. 30, No. 9, 1984, 1078-1092.

Cartellieri, C., Parsons, A.J., Rao, V., Zeisser, M.P.: The real impact of Internet advertising. The McKinsey Quarterly, No. 3, 1997, 44-62.

Charnes, A., Cooper, W.W., Rhodes, E.: Measuring the efficiency of decision making units. European Journal of Operational Research, Vol. 2, No. 6, 1978, 429-444.

European Commission: Content and Commerce-driven Strategies in Global Networks - Building of the Network Economy in Europe - Summary, Luxembourg, 1998.

Hoffman, D.L., Novak, Th.P., Chatterjee, P.: Commercial Scenarios for the Web: Opportunities and Challenges. J. of Computer-Mediated Communication, Vol. 1, No. 3, December 1995. (http://www.ascusc.org/jcmc/vol1/issue3/vol1no3.html).

Lovell, C.A.K.: Production Frontiers and Productive Efficiency. In: Fried, H.O., Lovell, C.A.K., and Schmidt, S.S. (Eds.): The Measurement of Productive Efficiency, Oxford University Press, New York, Oxford, 1993, 3-67.
