Bayesian Estimation of the Multiproduct Cost Function and Economies of Scale and Scope for an Industry with Multiple Technologies of Production Using Markov Chain Monte Carlo Methods

Maria Ana E. Odejar
Kansas State University, Department of Economics, Manhattan, Kansas, 66502, USA
Abstract
Unsupervised clustering of firms according to their latent production technology is incorporated in modeling the multiproduct cost function of an industry characterized by the simultaneous existence of multiple technologies of production. A Bayesian method using Markov chain Monte Carlo methodology is developed to estimate the multiproduct cost function as a latent class model which is a finite mixture of flexible fixed cost quadratic (FMFFCQ) functions. Firms are stochastically clustered according to the mixing parameter, which has a Dirichlet distribution. Hyperparameters of the prior distributions are expressed in terms of the mean and variance of the prior densities obtained from information years before and decades after regulations or innovations are introduced. The estimated FMFFCQ multiproduct cost function is then utilized to evaluate the degree and direction of overall and product specific economies of scale and scope of banks in Kansas offering agricultural loans. Parameter estimates for the finite mixture model are compared with those of the standard regression model representing one common production technology.
KEY WORDS: Data augmentation; Dirichlet mixing parameter; Finite mixture of flexible fixed cost quadratic functions; Markov chain Monte Carlo methods.
1. INTRODUCTION
Multiproduct cost functions (Baumol, Panzar and Willig 1982) play an important role in evaluating the degree and direction of economies of scale and scope of a particular industry. The usual underlying assumption is that all firms in the industry have identical technology and consequently a common long-run cost function. This assumption is not realistic, since firms in an industry are characterized by simultaneous latent multiple production technologies, which reflect various managerial production techniques and/or an industry undergoing a transition period after innovations and/or regulatory changes have been introduced (Beard, Caudill and Gropper 1991). Adoption of technology is determined by various factors such as the size of firms, geographical location, expected net benefit from innovations and management structure (Scherer 1984, Hannan and McDowell 1984, Akhavein et al. 2001, Rose and Joskow 1990, Saloner and Shepard 1995, and Johnes and Johnes 2007). Thus, firms are unable to respond simultaneously to new innovations and policy regulations. In fact, the adoption and diffusion process of a new technology takes several years or even decades (Mansfield 1968, Hannan and McDowell 1984, 1987, Akhavein et al. 2001). For these reasons, it is essential to incorporate into the model the possibility of simultaneous multiple technologies, as reflected in alternative specifications of the total cost function coefficients. Depending on the regime corresponding to the technology and/or policy regulation in effect, cost would respond differently to changes in output or to input price shocks. Since there is no information as to which particular latent production technology a firm is using, estimation of multiproduct cost functions that account for the simultaneous existence of multiple technologies requires more appropriate methods of estimation.
In the past, several methods have been introduced to estimate finite mixture models. One method of estimating the FMFFCQ multiproduct cost function is the method of moments (Cohen 1967, Day 1969). The sample moments are equated to the corresponding theoretical central moments about the mean. However, this method does not provide standard errors of the estimates.
Another method, introduced by Quandt and Ramsey (1978), is the moment generating function estimation method. It minimizes the sum of squared deviations of the sample from the theoretical moment generating function to derive parameter estimates. The moment generating function method produces estimates closer to the true values than estimates from the method of moments. However, there is a problem of choosing a satisfactory value of the argument $\theta$ in the moment generating function, $E[e^{\theta C}]$, which affects the asymptotic covariance matrix and the quality of the numerical approximation, including the ease with which convergence is achieved.
Another method is to implement maximum likelihood estimation using iterative algorithms. According to Hartley (1978), the Expectation Maximization (EM) algorithm of Dempster, Laird and Rubin (1977) can provide maximum likelihood estimates of the finite mixture model parameters. However, the EM algorithm has some known drawbacks. When the components are not well separated or when initial values are far from the true values, the EM algorithm converges intolerably slowly (Celeux and Diebolt 1985). To remedy this problem, Celeux and Diebolt modified the EM algorithm by adding a stochastic step to prevent the iteration from staying at an unstable stationary point of the likelihood function. Thus the modified algorithm, Stochastic Expectation Maximization (SEM), avoids the slow convergence observed in the EM algorithm. Another well-known criticism pertains to the maximum likelihood of the finite mixture model itself. According to Maddala and Nelson (1975), Kiefer (1978), Swamy and Mehta (1975) and Quandt and Ramsey (1978), the likelihood function for this model is unbounded at the edges of the parameter space. Phillips (1991) showed that mild constraints on the regime variances can be imposed to force the likelihood to be bounded. The global maximizer of the likelihood function on the constrained parameter space is consistent, efficient and asymptotically normally distributed. However, when it comes to testing parameter stability, there is still a problem regardless of how the null hypothesis is stated. The reason is that under the null hypothesis that the mixing parameter is zero, or that the component coefficients are identical across the regimes, normality cannot be assured because the information matrix is singular (Phillips 1991). Another criticism of the maximum likelihood method is that it does not provide parameter estimates accurate enough to be useful for small samples and even for moderately large samples. Hosmer (1973) conducted a Monte Carlo experiment which showed this disadvantage of the maximum likelihood method.
Since economic theory and the literature provide some prior information about the cost function, at least its direction and relative magnitude, Bayesian analysis is applied as the estimation methodology. This procedure is implemented using the Markov chain Monte Carlo method of data augmentation.
Section 2 discusses the finite mixture of flexible fixed cost quadratic functions. In section 3, the Bayesian methodology via Markov chain Monte Carlo methods and the algorithm for estimating the finite mixture of flexible fixed cost quadratic functions are presented. The sample data used for estimating the multiproduct cost function and the procedure for evaluating measures of scale and scope are described in section 4. In section 5, multiproduct cost function estimation results for the finite mixture of flexible fixed cost quadratic functions are presented and compared to the standard regression multiproduct cost coefficient estimates. Section 6 summarizes and concludes.
2. FINITE MIXTURE OF FLEXIBLE FIXED COST QUADRATIC FUNCTIONS
The multiproduct cost function $C_j$ for the $j$th firm can be expressed as

$$C_j = C_j^{*}(Y_j, P_j, \beta) + \varepsilon_j, \qquad j = 1, 2, \ldots, n \tag{1}$$

where:
$Y_j$ is the vector of outputs of the $j$th firm,
$P_j$ is the vector of input prices of the $j$th firm,
$\beta$ is the vector of coefficients of the cost function of the $j$th firm.

Assuming the existence of simultaneous multiple production technologies, the multiproduct cost function for the $j$th firm can be expressed as a finite mixture of cost functions

$$C_j = \sum_{i=1}^{s} \lambda_i\, C_{ij}^{*}(Y_j, P_j, \beta_i) + \varepsilon_j \tag{2}$$

with probability $\lambda_i$ of utilizing the $i$th production technology, $0 < \lambda_i < 1$, $\sum_{i=1}^{s} \lambda_i = 1$, $i = 1, 2, \ldots, s$, $j = 1, 2, \ldots, n$,
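and with the number of components $s$ equal to the number of distinct technologies. In this study, each component of the cost function of the $j$th firm is of the flexible fixed cost quadratic form, which accommodates zero output, but it can be some other flexible functional form:

$$C_{ij}^{*}(Y_j, P_j, \beta_i) = \beta_{i0} + \sum_{r=1}^{m} \alpha_{ir} \ln P_{jr} + \sum_{k=1}^{v} \beta_{ik} Y_{jk} + \frac{1}{2} \sum_{r=1}^{m} \sum_{r'=1}^{m} \alpha_{irr'} \ln P_{jr} \ln P_{jr'} + \frac{1}{2} \sum_{k=1}^{v} \sum_{k'=1}^{v} \beta_{ikk'} Y_{jk} Y_{jk'} + \sum_{k=1}^{v} \sum_{r=1}^{m} \gamma_{ikr} Y_{jk} \ln P_{jr}, \qquad i = 1, \ldots, s \tag{3}$$

where $m$ is the number of input prices, $v$ is the number of outputs, and all coefficients of component $i$ are collected in the vector $\beta_i$.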
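To make the structure of equation (3) concrete, the following is a minimal Python sketch of evaluating a single component cost from its coefficient blocks. It is an illustration only, not the code used in this study: the function name `ffcq_cost`, the dictionary keys and the numerical coefficients are hypothetical and chosen for easy checking.

```python
import math

def ffcq_cost(y, p, coef):
    """Evaluate one flexible fixed cost quadratic component (equation 3).

    y    : list of output levels Y_jk
    p    : list of input prices P_jr (must be positive)
    coef : dict with hypothetical keys
           'const'  intercept
           'price'  first-order log-price coefficients, one per price
           'output' first-order output coefficients, one per output
           'pp'     log-price by log-price coefficient matrix
           'yy'     output by output coefficient matrix
           'yp'     output by log-price coefficient matrix (one row per output)
    """
    ln_p = [math.log(price) for price in p]
    n_p, n_y = len(p), len(y)
    cost = coef["const"]
    cost += sum(coef["price"][r] * ln_p[r] for r in range(n_p))
    cost += sum(coef["output"][k] * y[k] for k in range(n_y))
    cost += 0.5 * sum(coef["pp"][r][q] * ln_p[r] * ln_p[q]
                      for r in range(n_p) for q in range(n_p))
    cost += 0.5 * sum(coef["yy"][k][l] * y[k] * y[l]
                      for k in range(n_y) for l in range(n_y))
    cost += sum(coef["yp"][k][r] * y[k] * ln_p[r]
                for k in range(n_y) for r in range(n_p))
    return cost

# Arbitrary one-output, one-price coefficients, for illustration only.
example_coef = {"const": 1.0, "price": [3.0], "output": [0.5],
                "pp": [[2.0]], "yy": [[4.0]], "yp": [[1.0]]}
cost_at_zero_output = ffcq_cost([0.0], [1.0], example_coef)
cost_at_positive_output = ffcq_cost([2.0], [math.e], example_coef)
```

Because outputs enter in levels rather than in logarithms, the function remains defined at zero output, where only the intercept and the price terms contribute.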
The random errors $\varepsilon_{ij}$ are assumed to be normally and independently distributed with mean 0 and variance $\sigma_i^2$, $\varepsilon_{ij} \sim \text{iid } N(0, \sigma_i^2)$. The vector of parameters is $\theta_i = \{\lambda_i, \beta_i, \sigma_i^2\}$. The mean total cost for the $j$th firm is

$$E(C_j) = \sum_{i=1}^{s} \lambda_i\, C_{ij}^{*}(Y_j, P_j, \beta_i). \tag{4}$$
Symmetry has been imposed on the flexible fixed cost quadratic function. Because of the form of the components of the cost function, linear homogeneity in input prices cannot be imposed (Friedlaender, Winston and Wang 1983, and Cohn, Rhine and Santos 1989). It is feasible to impose curvature restrictions on the cost functions (Gallant and Golub 1984, Hazilla and Koop 1985 and 1986, Terrell 1996, Wolff, Heckelei and Mittelhammer 2010, and Featherstone and Moss 1994). However, no such restrictions are imposed in this study, to allow the data to speak for themselves and depict the actual behavior of firms, especially since the theoretical assumption of rational cost-minimizing behavior of producers may not hold for as many as 50% of the firms in the sample of an industry (Ogawa 2011). Non-compliance of empirical results with economic theory is usually observed during a transition or adjustment stage following shocks such as adopting technological innovations and complying with new policy regulation or deregulation.
Firms reacting to these innovative and regulatory shocks may not necessarily be minimizing costs. They would probably opt to purchase the more expensive, more durable or more reliable equipment or brand, especially when improving technology. Patterson and Scott (1979) observed a similar violation of cost function concavity for their sample period, due to changes in fuel inputs from coal towards oil and electricity. Considine (1989) concluded that part of the reason for the positive own-price elasticity can be explained by two major regulatory policies: first, the natural gas curtailments, which encouraged fuel switching, and second, environmental policies, which sought to reduce air pollution from fuel combustion. Ogawa (2008) demonstrated that concavity violations during the “bubble period” in the 1980s, when land and stock prices soared, are due to weak bank-firm relationships and borrowing constraints, and to massive transactions of land resulting in large changes in quasi-fixed inputs that required firms to reorganize and relocate employees and/or machinery. Consequently, firms incurred additional necessary expenses and allocations, which inevitably prevented them from minimizing production costs. Furthermore, the quadratic terms in the multiproduct cost function should be able to capture the curvature.
3. MARKOV CHAIN MONTE CARLO METHODS
In the absence of information as to which observations follow which component cost model, estimation of the regression parameters becomes very complicated and cumbersome. Implementing the Bayesian approach for model (2) becomes intractable even for a moderately large sample size such as n = 50, since the posterior distribution takes into account all possible partitions of the sample $\{C_1, \ldots, C_n\}$ into at most $s$ groups. A solution to this computational problem is to use Markov chain Monte Carlo (MCMC) methods, which summarize the features of a distribution, such as the posterior mean, standard deviation, quantiles and credible interval, by sampling indirectly from the distribution. These methods are most valuable for complicated distributions, such as high dimensional joint distributions, from which it is infeasible to sample directly. MCMC generates an ergodic Markov chain $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(m)}, \ldots$ which is straightforward to sample and whose equilibrium distribution is the desired joint posterior distribution, $\theta^{(m)} \xrightarrow{d} \theta \sim g(\theta)$. After an initial warm-up period of $a$ iterations, the resulting sample can provide a consistent estimator of a posterior quantity. MCMC is essentially Monte Carlo integration using Markov chains. The Ergodic Theorem assures that the sample average converges to the population mean,

$$E[h(\theta_k)] \approx \frac{\sum_{m=a+1}^{t} h\big(\theta_k^{(m)}\big)}{t - a},$$

as the number of iterations goes to infinity, as long as this expectation is finite. The Ergodic Theorem also indicates that there is no need for several starting values. Based on the Rao-Blackwell Theorem, Gelfand and Smith (1990) improved on this approximation by using instead the average of conditional densities. The MCMC Bayesian methods, in contrast to the maximum likelihood method, the EM and SEM algorithms, and the bootstrap resampling method, are useful even with finite samples since the convergence results depend only on the number of iterations.
One type of MCMC method is Gibbs sampling. A Markov chain $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(m)}, \ldots$ is constructed by sampling from the full conditional distributions, which are feasible to sample from. In data augmentation, the natural hierarchical structure of the model is incorporated in the sampling sequence from the conditional distributions. Parameters are partitioned into two groups, the missing or latent variables and the rest of the parameters.
The equilibrium distribution of the Markov chain is the target joint posterior distribution, which is

$$g(\theta \mid C) \propto \prod_{j=1}^{n} \left[ \sum_{i=1}^{s} \lambda_i\, \phi_i(C_j \mid \beta_i, \sigma_i^2) \right] g(\lambda_1, \ldots, \lambda_s) \prod_{i=1}^{s} g(\beta_i \mid \sigma_i^2)\, g(\sigma_i^2) \tag{5}$$

for the finite mixture of flexible fixed cost quadratic function model. For identifiability, proper prior densities are used:

1. $g(\lambda_1, \ldots, \lambda_s)$ is the Dirichlet prior distribution for the mixing parameters with hyperparameters $(\alpha_1, \ldots, \alpha_s)$. (6)
2. $g(\beta_i \mid \sigma_i^2)$ is the multivariate normal prior density for the regression coefficients with mean $A_i$ and covariance $\sigma_i^2 Q_i$. (7)
3. $g(\sigma_i^2)$ is the inverse-gamma prior density for the variance with parameters $\nu_i / 2$ and $\delta_i / 2$. (8)
In specifying the hyperparameters, it is assumed that there is little prior information about the mixing parameters $\lambda$. For a two-component cost function corresponding to new and old technologies respectively, $\alpha_1$ and $\alpha_2$ are set so that the mean of $\lambda$ is 0.5 and its standard deviation is 0.15. These values are

$$\alpha_1 = \frac{1 - 4\,\mathrm{Var}(\lambda)}{8\,\mathrm{Var}(\lambda)} \quad \text{and} \quad \alpha_2 = \alpha_1. \tag{9}$$
The parameters $\nu_i$ and $\delta_i$ ($i = 1, 2$) are set so that the mean of $\sigma_i^2$ is

$$\sum_{l=1}^{p} \hat{\beta}_{il}^{2}.$$

These means give a coefficient of determination $R^2$ of 0.50, typical of economic data. It is assumed that there is weak prior information about the error variance; thus, $\nu_i$ and $\delta_i$ are also chosen so that the standard deviation of $\sigma_i^2$ is $\sum_{l=1}^{p} \hat{\beta}_{il}^{2} / 3$ ($i = 1, 2$), thus setting the lower bound for $\sigma_i^2$ to 0, which is three standard deviations from the mean.
For simplicity it is assumed that $Q_i$ ($i = 1, 2$) is diagonal. Each diagonal element of $Q_i$ is set to a coefficient-specific prior scale divided by the expected value of the error variance $\sigma_i^2$ corresponding to the $i$th technology component cost. For this study, information from years before and decades after regulation or innovations provided a cogent informative prior density.
The key to estimating the parameters of the finite mixture of cost functions model is to express the model in terms of the latent cluster or component indicator variable $Z_j = \{z_{ij}\}$, $i = 1, 2, \ldots, s$, $j = 1, 2, \ldots, n$. The latent cluster indicator variable $z_{ij}$ assumes the value 1 if the cost $C_{ij}$ of the $j$th firm is generated by the $i$th cost component model and 0 otherwise. If the latent cluster indicator variable data $Z_j = \{z_{ij}\}$ were observed, estimation would be straightforward. However, since $Z_j$ is not observed, its elements can be considered as unknown parameters to be estimated along with the other model parameters. For identifiability, no technology cluster can be empty; each is required to contain at least 50 banks or observations.
The MCMC sampling algorithm generates parameter values from the conditional distributions with starting values $\lambda_i^{(0)}$, $\beta_i^{(0)}$, $\sigma_i^{2(0)}$:
1. Imputation Step: In the imputation step of data augmentation, $r$ independent values of the component indicator variables $Z_j^{(m+1)}$ are drawn from their conditional distribution to reduce variability. Although $r = 1$ is enough for point estimation, a larger $r$ is required for small sample sizes (Diebolt and Robert, 1994). The conditional distribution of the latent cost component indicator variable is the multinomial distribution with one trial and probability $w_{ij}$ for the $i$th cost function component

$$Z_j^{(m+1)} \mid C_j, Y_j, P_j, \theta_i^{(m)} \sim \text{Multinomial}\big(1, w_{ij}^{(m+1)}\big) \tag{10}$$

where:

$$w_{ij}^{(m+1)} = \frac{\lambda_i^{(m)}\, \phi_i(C_j, Y_j, P_j, \theta_i^{(m)})}{\sum_{i'=1}^{s} \lambda_{i'}^{(m)}\, \phi_{i'}(C_j, Y_j, P_j, \theta_{i'}^{(m)})}$$

$$\phi_i(C_j, Y_j, P_j, \theta_i^{(m)}) = \frac{1}{\sqrt{2\pi \sigma_i^{2(m)}}} \exp\left\{ -\frac{\big[C_j - C_{ij}^{*}(Y_j, P_j, \beta_i^{(m)})\big]^2}{2 \sigma_i^{2(m)}} \right\}$$
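The imputation step can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation used in the study: the function names are hypothetical, the component means $C_{ij}^{*}$ are passed in as precomputed numbers, and a single draw ($r = 1$) is made per firm.

```python
import math
import random

def normal_density(c, mean, var):
    """Normal density phi_i evaluated at the observed cost c."""
    return math.exp(-((c - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def membership_probs(c_j, component_means, lam, var):
    """Posterior membership probabilities w_ij for one firm, as in equation (10)."""
    weighted = [lam[i] * normal_density(c_j, component_means[i], var[i])
                for i in range(len(lam))]
    total = sum(weighted)
    return [w / total for w in weighted]

def draw_indicator(probs, rng):
    """One multinomial trial: returns the 0/1 indicator vector Z_j."""
    chosen = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return [1 if i == chosen else 0 for i in range(len(probs))]

rng = random.Random(0)
# Observed cost close to the first component mean: firm is almost surely in cluster 1.
probs_clear = membership_probs(10.0, [10.0, 20.0], [0.5, 0.5], [1.0, 1.0])
# Observed cost halfway between two identical-variance components: a coin flip.
probs_tied = membership_probs(15.0, [10.0, 20.0], [0.5, 0.5], [1.0, 1.0])
z_j = draw_indicator(probs_tied, rng)
```

In practice the densities can underflow when a cost lies far from every component mean, so a production version would typically work with log densities; that refinement is omitted here for brevity.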
2. Posterior Step: In the posterior step the rest of the parameters are generated from their conditional distributions.

2a. The mixing parameter is generated from the Dirichlet distribution

$$\lambda^{(m+1)} \mid C_j, Y_j, P_j, Z^{(m+1)} \sim \text{Dirichlet}\big(\alpha_1 + n_1^{(m+1)}, \ldots, \alpha_s + n_s^{(m+1)}\big) \tag{11}$$

where $n_i^{(m+1)} = \sum_{j=1}^{n} z_{ij}^{(m+1)}$ is the number of firms which utilize the $i$th technology with the corresponding flexible fixed cost quadratic function component. Conditional upon the latent data $Z$, which classifies each firm into a technology cost regime, the Bayesian posterior distributions for the standard regression model apply.

2b. Thus, the error variance is drawn from the conditional distribution, which is the inverse-gamma
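$$\sigma_i^{2(m+1)} \mid C_j, Y_j, P_j, Z^{(m+1)}, \lambda^{(m+1)} \sim IG\Big[\big(\nu_i + n_i^{(m+1)}\big)/2,\ \big(\delta_i + S_i^{(m+1)}\big)/2\Big] \tag{12}$$

where:

$$S_i^{(m+1)} = C_i^{(m+1)\prime} C_i^{(m+1)} + A_i' Q_i^{-1} A_i - H_i^{(m+1)\prime} \big(X_i^{(m+1)\prime} X_i^{(m+1)} + Q_i^{-1}\big) H_i^{(m+1)}$$

$$H_i^{(m+1)} = \big(X_i^{(m+1)\prime} X_i^{(m+1)} + Q_i^{-1}\big)^{-1} \big(X_i^{(m+1)\prime} C_i^{(m+1)} + Q_i^{-1} A_i\big)$$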
2c. The regression coefficients of the $i$th technology flexible fixed cost quadratic function are generated from the multivariate normal

$$\beta_i^{(m+1)} \mid C_j, Y_j, P_j, Z^{(m+1)}, \lambda^{(m+1)}, \sigma_i^{2(m+1)} \sim N\Big[H_i^{(m+1)},\ \sigma_i^{2(m+1)} \big(X_i^{(m+1)\prime} X_i^{(m+1)} + Q_i^{-1}\big)^{-1}\Big] \tag{13}$$
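Steps 2a and 2b of the posterior step can be sketched with the Python standard library alone. The sketch below is illustrative and rests on assumptions that are not stated in the text: the inverse-gamma distribution is taken to be parameterized by shape and scale, the quantities $n_i$ and $S_i$ are supplied as precomputed inputs, and all function names and numerical values are hypothetical. The multivariate normal draw of step 2c requires matrix algebra and is omitted here.

```python
import random

def draw_dirichlet(params, rng):
    """Draw from Dirichlet(params) by normalizing independent Gamma(a, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(gammas)
    return [g / total for g in gammas]

def draw_inverse_gamma(shape, scale, rng):
    """Draw from IG(shape, scale) as the reciprocal of a Gamma draw with rate `scale`.

    random.gammavariate expects a scale argument, so the rate is passed as 1/scale.
    """
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)

def posterior_step(alpha, nu, delta, counts, ssq, rng):
    """Draw the mixing parameters (2a) and the error variances (2b)."""
    lam = draw_dirichlet([a + n for a, n in zip(alpha, counts)], rng)
    sigma2 = [draw_inverse_gamma((nu_i + n_i) / 2.0, (delta_i + s_i) / 2.0, rng)
              for nu_i, delta_i, n_i, s_i in zip(nu, delta, counts, ssq)]
    return lam, sigma2

rng = random.Random(0)
# Hypothetical hyperparameters, cluster counts and residual quadratic forms S_i.
lam, sigma2 = posterior_step(alpha=[5.0, 5.0], nu=[22.0, 22.0], delta=[20.0, 20.0],
                             counts=[230, 232], ssq=[250.0, 240.0], rng=rng)
```

Each call performs one sweep of 2a and 2b; inside a full sampler it would be preceded by the imputation step and followed by the coefficient draw of step 2c.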
Ergodic averaging of the Markov chain $\theta^{(m)}$, or of its function $h(\theta^{(m)})$, provides a consistent estimator of the parameters or of a function $h(\theta)$ of $\theta$:

$$E[h(\theta_k)] \approx \frac{\sum_{m=a+1}^{t} h\big(\theta_k^{(m)}\big)}{t - a}. \tag{14}$$
The algorithm is allowed to iterate until stationarity is achieved at the $a$th burn-in iteration, before sampling results are included in the estimation. The burn-in samples, which are the first $a$ values of the parameters, are discarded to correct for the dependence between successive iterations and to make the estimates invariant to the initial values. The last $t - a$ values of the parameters are averaged to obtain Bayesian estimates of the parameters and their corresponding standard deviations. For this study the total number of iterations is $t = 6000$ and the burn-in period is $a = 2000$. Coefficients are considered significant if the credible interval between the 2.5% and 97.5% percentiles of the $t - a$ samples from the conditional posterior distribution of $\beta_i$ excludes zero.
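The summary of a single coefficient chain described above can be sketched as follows. The function name `summarize_chain` and the toy chain are hypothetical; the percentile interpolation is that of Python's `statistics.quantiles` and may differ slightly from the rule used in the study.

```python
import statistics

def summarize_chain(draws, burn_in):
    """Discard the first `burn_in` draws, then report mean, sd and a 95% credible interval."""
    kept = draws[burn_in:]
    # n=40 yields 39 cut points; the first is the 2.5% and the last the 97.5% percentile.
    cuts = statistics.quantiles(kept, n=40)
    lower, upper = cuts[0], cuts[-1]
    return {
        "mean": statistics.fmean(kept),
        "sd": statistics.stdev(kept),
        "lower": lower,
        "upper": upper,
        # Significant when the interval excludes zero.
        "significant": lower > 0.0 or upper < 0.0,
    }

# Toy chain of t = 6000 draws: 2000 burn-in values followed by 4000 retained values.
toy_chain = [100.0] * 2000 + [1.0, 3.0] * 2000
summary = summarize_chain(toy_chain, burn_in=2000)
```

The burn-in values of 100 in the toy chain have no effect on the summary, which mirrors the purpose of discarding the first $a$ iterations.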
4. DATA AND METHODOLOGY
The data for this study were obtained from the Federal Reserve Call Report. The sample consists of 462 Kansas banks with agricultural loans in 1989. The choice of the sample period is dictated by technological and regulatory changes that occurred in the 1980s. Technology breakthroughs in computer, communication and electronic transfer systems have a critical impact on costs incurred by banks. Moreover, the Depository Institutions Deregulation and Monetary Control Act of 1980 and the Depository Institutions Act of 1982 have substantially changed banks’ processes.
The federal legislation enacted in 1980 comprised (1) the Monetary Control Act, which extended reserve requirements to all US banking institutions and also deals with the banking services furnished by the Federal Reserve System; and (2) the Depository Institutions Deregulation Act of 1980, which phased out the Federal Reserve Regulation Q deposit interest rate ceilings by March 31, 1986. In 1982, Congress enacted the Garn-St Germain Depository Institutions Act, authorizing banks and savings institutions to offer the money market deposit account, a transaction account with no interest rate ceiling, to compete more effectively with money market mutual funds. Other highlights of this act are the authorization of savings and loan associations to make commercial, corporate, business and agricultural loans up to 10% of assets after January 1, 1984; an increase in consumer lending from 20% to 30% of assets; and a raise in the ceiling on investments by savings institutions in nonresidential real estate from 20% to 40%.
The flexible fixed cost quadratic function outputs are agricultural loans, non-agricultural real estate loans, non-agricultural loans, deposits on total transaction accounts and deposits on total non-transaction accounts. The value-added approach was used to define inputs and outputs. The input prices are the interest rate, the wage rate and the occupancy rate. The interest rate is the ratio of interest expense to total loans. The wage rate is the ratio of employee expense to the number of employees. The occupancy rate is occupancy expense divided by fixed asset value. It is assumed that there are two technology clusters, old and new. Economies derived from the size or scale of a firm’s operations, and from the simultaneous production of several different outputs by a single enterprise, are evaluated.
Overall economies of scale is

$$S_n(Y) = \frac{C(Y)}{\sum_{k=1}^{n} Y_k\, C_k(Y)} \tag{15}$$

where:

$$C_k(Y) = \frac{\partial C(Y)}{\partial Y_k}. \tag{16}$$
Returns to scale are increasing, constant or decreasing if $S_n$ is greater than, equal to or less than one, respectively. $S_n$ measures how the cost of the current bundle changes if the bundle remains in the same proportion as size changes. Product specific economies of scale is evaluated as the ratio of the average incremental cost of a product to its marginal cost

$$S_k(Y) = \frac{AIC_k}{C_k(Y)} = \frac{IC_k(Y)}{Y_k\, C_k(Y)} \tag{17}$$

where $IC_k(Y) = C(Y) - C(Y_{n-k})$ and $Y_{n-k} = (Y_1, \ldots, Y_{k-1}, 0, Y_{k+1}, \ldots, Y_n)$.
Economies of scope relative to a set of outputs $t$ is defined as

$$SC_t(Y) = \frac{C(Y_t) + C(Y_{n-t}) - C(Y)}{C(Y)}. \tag{18}$$

This measures the relative change in cost that would occur from splintering production of $Y$ into separate groups. Economies of scope imply relative cost savings from diversification or multi-output production or, equivalently, relatively greater cost from splintering production into separate groups. If $SC_t(Y)$ is greater than, less than or equal to zero, then economies, diseconomies or no economies of scope exist, respectively. Overall and product specific economies of scale and scope are evaluated at the mean bank size.
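Equations (15), (17) and (18) reduce to simple arithmetic once total, incremental and marginal costs are available. The sketch below is illustrative, with hypothetical function names; the inputs are the finite mixture model figures reported in Tables 1, 3 and 4 of this paper (mean outputs, total costs and marginal costs), and the results can be compared with the finite mixture columns of Tables 5 and 6.

```python
def overall_scale(total_cost, outputs, marginal_costs):
    """Equation (15): C(Y) divided by the sum of Y_k * C_k(Y)."""
    return total_cost / sum(y * mc for y, mc in zip(outputs, marginal_costs))

def product_scale(total_cost, cost_without_k, y_k, mc_k):
    """Equation (17): incremental cost of product k over Y_k times its marginal cost."""
    return (total_cost - cost_without_k) / (y_k * mc_k)

def scope(cost_only_t, cost_without_t, total_cost):
    """Equation (18): relative cost change from splitting output set t from the rest."""
    return (cost_only_t + cost_without_t - total_cost) / total_cost

# Finite mixture model values from Tables 1, 3 and 4 (mean bank size).
mean_outputs = [493.3186, 815.8054, 1015.6000, 1122.0500, 2959.2700]
marginal_costs = [-0.1431, 0.1620, 0.3817, 0.4002, 1.2173]
total_cost = 3475.476          # C(Y)
cost_agloans_only = 43.212     # C(Y1)
cost_without_agloans = 3529.087  # C(YN-1)

s_overall = overall_scale(total_cost, mean_outputs, marginal_costs)
s_agloans = product_scale(total_cost, cost_without_agloans,
                          mean_outputs[0], marginal_costs[0])
sc_agloans = scope(cost_agloans_only, cost_without_agloans, total_cost)
```

Up to rounding of the tabulated inputs, these evaluate to about 0.772, 0.759 and 0.028, in line with the overall scale, AGLOANS scale and AGLOANS scope entries for the finite mixture model.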
5. RESULTS AND DISCUSSION
Table 1. Summary Statistics for the Outputs and Prices

Variable Name    Label                                      Mean         Standard Deviation
AGLOANS (Y1)     Agricultural Production Loans              493.3186     583.1380
RELOANS (Y2)     Agricultural Real Estate Loans             815.8054     2274.0600
OTLOANS (Y3)     Non-Agricultural Real Estate Loans         1015.6000    3273.1100
TRANDEP (Y4)     Deposits - Total Transaction Accounts      1122.0500    2289.0200
NTRNDEP (Y5)     Deposits - Total Non-Transaction Accounts  2959.2700    5741.4400
INTRATE (P1)     Interest Rate                              0.0530       0.0078
WAGETR (P2)      Wage Rate                                  28.2204      6.3674
OCCURTE (P3)     Occupancy Rate                             0.3175       0.2749
Summary statistics for the variables used in estimating the multiproduct cost function of Kansas banks with agricultural loans are shown in Table 1. All the outputs have large standard deviations, indicating that the banks are heterogeneous in size. Table 2 shows that estimates of the coefficients for the standard regression model and the finite mixture model differ not only in magnitude but
Table 2. Multiproduct Cost Function Estimates
Entries are coefficient estimates with significance marker, followed by the standard error in parentheses.

VARIABLE                    STANDARD REGRESSION               FINITE MIXTURE MODEL              FINITE MIXTURE MODEL
                                                              COMPONENT 1                       COMPONENT 2
INTERCEPT                   -8709.878277 * (4379.594273)      28141.950000 ** (926.112780)      1.583388 ns (1364.8954)
AGLOANS                     0.266861 ns (0.809981)            -1.015875 ** (0.275626)           -2.158000 ** (0.291936)
RELOANS                     0.468745 ns (0.681584)            -0.305199 ns (0.362888)           0.402309 ns (0.300963)
OTLOANS                     1.702714 * (0.818664)             -1.419685 ** (0.377745)           -0.368041 ns (0.332597)
TRANDEP                     -1.386187 * (0.648787)            6.041401 ** (0.437314)            3.524290 ** (0.303747)
NTRNDEP                     2.981970 ** (0.523506)            0.698526 ** (0.153564)            1.996372 ** (0.158940)
AGLOANS^2                   -0.000052 ns (0.000032)           -0.000068 ns (0.000053)           0.000139 ns (0.000083)
AGLOANS*RELOANS             0.000019 ns (0.000058)            -0.000092 ns (0.000158)           0.000244 ns (0.000206)
AGLOANS*OTLOANS             -0.000416 ** (0.000050)           -0.000360 * (0.000145)            -0.000292 ns (0.000208)
AGLOANS*TRANDEP             0.000015 ns (0.000053)            -0.000020 ns (0.000096)           -0.000382 * (0.000149)
AGLOANS*NTRNDEP             0.000156 ** (0.000037)            0.000169 * (0.000077)             0.000130 ns (0.000082)
RELOANS^2                   0.000038 * (0.000018)             -0.000047 ns (0.000076)           0.000105 ns (0.000103)
RELOANS*OTLOANS             -0.000236 ** (0.000027)           -0.000172 * (0.000086)            -0.000277 * (0.000110)
RELOANS*TRANDEP             0.000178 ** (0.000024)            0.000060 ns (0.000106)            0.000227 * (0.000104)
RELOANS*NTRNDEP             0.000006 ns (0.000028)            0.000082 ns (0.000072)            -0.000086 ns (0.000119)
OTLOANS^2                   -0.000137 ** (0.000020)           -0.000117 ns (0.000084)           -0.000204 * (0.000091)
OTLOANS*TRANDEP             0.000033 ns (0.000033)            0.000021 ns (0.000074)            -0.000122 ns (0.000086)
OTLOANS*NTRNDEP             0.000192 ** (0.000030)            0.000163 ns (0.000089)            0.000337 ** (0.000109)
TRANDEP^2                   0.000057 * (0.000027)             0.000110 ns (0.000088)            0.000274 ** (0.000092)
TRANDEP*NTRNDEP             -0.000083 ** (0.000024)           -0.000121 ns (0.000074)           -0.000215 * (0.000090)
NTRNDEP^2                   -0.000033 * (0.000013)            -0.000024 ns (0.000022)           -0.000017 ns (0.000033)
ln INTRATE                  -3633.254536 ns (1913.491380)     12493.717000 ** (441.980240)      2109.714500 ** (554.536000)
ln WAGETR                   1711.969216 ns (1270.744463)      -4932.983000 ** (293.008400)      2136.203000 ** (530.319800)
ln OCCURTE                  -803.730854 * (392.646153)        392.160040 ** (96.962159)         81.092331 ns (148.593400)
(ln INTRATE)^2              -397.104990 ns (253.052716)       1364.807800 ** (67.673919)        665.229180 ** (74.747168)
ln INTRATE*ln WAGETR        306.970747 ns (293.909316)        -1144.361000 ** (82.284666)       700.423370 ** (116.302654)
ln INTRATE*ln OCCURTE       -135.922928 ns (106.800339)       97.384053 ** (28.940038)          141.646740 ** (39.009368)
(ln WAGETR)^2               -100.537669 ns (121.678191)       191.734170 ** (40.983441)         31.542682 ns (72.333408)
ln WAGETR*ln OCCURTE        124.652781 * (59.428739)          -25.696420 ns (21.551094)         130.519790 ** (32.25075)
(ln OCCURTE)^2              -5.894861 ns (11.833675)          -6.987082 ns (4.274385)           17.058000 * (7.872254)
AGLOANS*ln INTRATE          -0.079192 ns (0.245396)           -0.353596 ** (0.088947)           -0.874497 ** (0.086722)
AGLOANS*ln WAGETR           -0.172134 ns (0.132274)           0.046096 ns (0.058766)            -0.243859 ** (0.055475)
AGLOANS*ln OCCURTE          0.032333 ns (0.040280)            0.152808 ** (0.035801)            -0.221220 ** (0.026029)
RELOANS*ln INTRATE          0.440900 * (0.174544)             0.080366 ns (0.084504)            0.529220 ** (0.070463)
RELOANS*ln WAGETR           0.247939 ns (0.134236)            0.364204 ** (0.117952)            0.278998 ** (0.072850)
RELOANS*ln OCCURTE          0.009785 ns (0.037286)            0.337980 ** (0.021762)            -0.264152 ** (0.079556)
OTLOANS*ln INTRATE          0.084092 ns (0.196094)            -0.482899 ** (0.124964)           -0.670476 ** (0.084148)
OTLOANS*ln WAGETR           -0.334813 * (0.167772)            0.038944 ns (0.086557)            -0.492666 ** (0.032828)
OTLOANS*ln OCCURTE          0.114085 ** (0.034773)            0.026596 ns (0.024174)            -0.004039 ns (0.074675)
TRANDEP*ln INTRATE          -0.200267 ns (0.183774)           1.914006 ** (0.110045)            1.364913 ** (0.067243)
TRANDEP*ln WAGETR           0.454533 ** (0.125753)            0.171156 ns (0.100573)            0.541129 ** (0.045510)
TRANDEP*ln OCCURTE          -0.083714 ns (0.048723)           -0.016247 ns (0.029883)           0.184468 ** (0.036449)
NTRNDEP*ln INTRATE          0.824956 ** (0.139109)            0.106319 * (0.046976)             0.397671 ** (0.031344)
NTRNDEP*ln WAGETR           0.019467 ns (0.087387)            0.016037 ns (0.035667)            -0.061768 ** (0.013143)
NTRNDEP*ln OCCURTE          -0.013013 ns (0.022379)           -0.097698 ** (0.010722)           0.008604 ** (0.000000)
MIXING PARAMETER (λ1, λ2)                                     0.501091 ** (0.052334)            0.498909 ** (0.052334)

Note: Standard errors are enclosed in parentheses. *, ** and ns mean significant at the 5% level, significant at the 1% level and not statistically significant, respectively.
in their signs as well. The finite mixture of flexible fixed cost quadratic functions model has a better goodness of fit, as is evident from its more precise estimates (smaller standard errors) and its larger number of statistically significant coefficient estimates. For the standard regression model, the coefficients of (ln INTRATE)^2, (ln WAGETR)^2 and (ln OCCURTE)^2 are all negative, implying that the input demands for total deposits, labor and fixed assets all have the expected theoretical negative slope, conforming with concavity of the cost function in input prices. However, these coefficients are not significant, and they are not really expected to conform with the theoretical curvature because of the simultaneous existence of multiple technologies: banks adopt the necessary novel technologies and high-tech staff even though these entail higher input prices. The other reasons are the Depository Institutions Deregulation and Monetary Control Act of 1980 and the Depository Institutions Act of 1982, to which the banks need to adjust and adhere. For the mixture model, the (ln INTRATE)^2 coefficients are both significantly positive for the old and new technology cost function mixture components. These coefficients reflect a positive own-price elasticity of total deposits, depicting the effects of the Federal Reserve Regulation Q deposit interest rate ceilings on the old technology cost function component and the effects of the Depository Institutions Deregulation Act of 1980, which phased out Regulation Q, on the new technology cost function component. The (ln WAGETR)^2 coefficients are both positive for the old and new technology cost function mixture components, though significant only for the former. These coefficients reflect a positive own-price elasticity of labor. Both before and after the new technology is adopted, banks have to hire more educated staff who are knowledgeable about the new technology, even at a higher salary rate, to prepare the other staff for adopting the new technologies. The coefficient of (ln OCCURTE)^2 is negative and not significant for the old technology mixture cost function component and positive and significant for the new technology cost function component, revealing a positive own-price elasticity of fixed assets. To improve fixed assets with novel technologies, banks have to spend more, even at the higher price of innovative technologies. Both mixing parameters are highly significantly different from zero, which further supports the view that the bank industry is still in a transitory stage with simultaneous multiple technologies. The difference in modeling the multiproduct cost function by the standard regression and the finite mixture model is reflected in the posterior mean estimates of total costs in Table 3, marginal costs in Table 4, and economies of scale and scope measures
Table 3. Total Costs

COSTS        STANDARD REGRESSION    FINITE MIXTURE MODEL
C(Y)         3062.090               3475.476
C(Y1)        31.286                 43.212
C(Y2)        117.017                241.497
C(Y3)        167.895                64.205
C(Y4)        1087.147               1501.943
C(Y5)        1700.386               1613.550
C(YN-1)      3095.000               3529.087
C(YN-2)      3055.000               3333.808
C(YN-3)      2782.964               3266.702
C(YN-4)      2141.1569              2639.495
C(YN-5)      917.496                1525.869

* YN-i = (Y1, ..., Yi-1, 0, Yi+1, ..., YN)
Table 4. Marginal Costs

MARGINAL COSTS    STANDARD REGRESSION    FINITE MIXTURE MODEL
∂C/∂Y1            -0.0211                -0.1431
∂C/∂Y2            0.0269                 0.1620
∂C/∂Y3            0.3433                 0.3817
∂C/∂Y4            0.8253                 0.4002
∂C/∂Y5            0.8108                 1.2173
in tables 5 and 6. With the assumption of simultaneous existence of multiple technologies of production incorporated through the finite mixture model, the data apparently manifests different degrees and or direction of scale and scope evaluated at the mean bank size than those of the standard regression model. Most product specific economies of scale measures for the standard regression model reflect nearly constant returns to scale. Product specific economies of scale measure for the finite mixture model imply the expected heterogeous economies of scale which is more typical of firms adjusting to shocks in bank regulations and technological innovations. economies
As a consequence, the overall
of scale for the finite mixture model is lower implying more
diseconomies than that provided by the standard regression model. The finite mixture model is able to capture economies of scale in transaction account deposits. Economies of scope measures for the two alternative cost regression models also differ but the difference is less conspicuous than the scale
26
measures. Table 5 Scale Economies SCALE ECONOMIES VARIABLE STANDARD REGRESSION
FINITE MIXTURE MODEL
OVERALL
0.8308
0.7722
AGLOANS
1.3417
0.7594
RELOANS
-0.0454
1.0719
0.8755
0.5386
TRANDEP
0.9972
1.8617
NTRNDEP
0.9406
0.5412
OTLOANS
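
The scale measures reported in Table 5 are of the kind introduced by Baumol, Panzar and Willig (1982). The following is a minimal sketch, assuming the textbook definitions: overall scale economies C(Y) / sum_i(Yi * MCi), and product-specific scale economies ICi / (Yi * MCi), where ICi = C(Y) - C(YN-i) is the incremental cost of product i. The function names are illustrative. The incremental costs use the standard regression column of Table 3; because the mean output levels are not reported in this section, the scale formulas are only exercised on a toy constant-returns case rather than on the bank data.

```python
# Standard regression column of Table 3 (posterior mean total costs).
C_ALL = 3062.090                                                  # C(Y)
C_WITHOUT = [3095.000, 3055.000, 2782.964, 2141.1569, 917.496]    # C(YN-i), i = 1..5


def incremental_cost(c_all, c_without_i):
    """Incremental cost of product i: IC_i = C(Y) - C(YN-i)."""
    return c_all - c_without_i


def product_scale(ic_i, y_i, mc_i):
    """Product-specific scale economies: IC_i / (Y_i * MC_i)."""
    return ic_i / (y_i * mc_i)


def overall_scale(c_all, outputs, marginal_costs):
    """Overall (ray) scale economies: C(Y) / sum_i(Y_i * MC_i)."""
    return c_all / sum(y * mc for y, mc in zip(outputs, marginal_costs))


# Incremental costs implied by Table 3 (standard regression column).
ic = [incremental_cost(C_ALL, c) for c in C_WITHOUT]

# Toy check (hypothetical numbers, not bank data): a single product with
# cost C(y) = 2y has marginal cost 2, so both measures equal one.
toy_overall = overall_scale(2.0 * 10.0, [10.0], [2.0])
toy_product = product_scale(2.0 * 10.0, 10.0, 2.0)
```

Under these definitions, values above one indicate economies of scale and values below one indicate diseconomies, which is how the text reads Table 5.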
Table 6 Scope Economies

VARIABLE     STANDARD REGRESSION    FINITE MIXTURE MODEL
OVERALL      0.0325                 0.0205
AGLOANS      0.0210                 0.0279
RELOANS      0.0359                 0.0287
OTLOANS      -0.0363                -0.0785
TRANDEP      0.0543                 0.1916
NTRNDEP      -0.1451                -0.0967
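
The product-specific scope measures can be checked against the total costs in Table 3. The sketch below assumes the usual definition SCi = [C(Yi) + C(YN-i) - C(Y)] / C(Y), and assumes that Y1, ..., Y5 follow the variable order of Table 6 (AGLOANS, RELOANS, OTLOANS, TRANDEP, NTRNDEP); neither assumption is stated in this section. Under them, the standard regression column of Table 3 reproduces the product-specific standard regression entries of Table 6 to four decimals.

```python
# Standard regression column of Table 3 (posterior mean total costs).
C_ALL = 3062.090  # C(Y)

# C(Yi): cost of producing product i alone (variable order is an assumption).
C_OWN = {
    "AGLOANS": 31.286,
    "RELOANS": 117.017,
    "OTLOANS": 167.895,
    "TRANDEP": 1087.147,
    "NTRNDEP": 1700.386,
}

# C(YN-i): cost of producing everything except product i.
C_REST = {
    "AGLOANS": 3095.000,
    "RELOANS": 3055.000,
    "OTLOANS": 2782.964,
    "TRANDEP": 2141.1569,
    "NTRNDEP": 917.496,
}


def product_scope(c_own, c_rest, c_all):
    """Product-specific scope economies: (C(Yi) + C(YN-i) - C(Y)) / C(Y)."""
    return (c_own + c_rest - c_all) / c_all


scope = {
    name: product_scope(C_OWN[name], C_REST[name], C_ALL) for name in C_OWN
}

for name, value in scope.items():
    print(f"{name:8s} {value: .4f}")
```

A positive value means that producing product i jointly with the other outputs is cheaper than producing it separately; a negative value indicates diseconomies of scope.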
6. CONCLUSION
In this paper unsupervised clustering of firms according to their latent production technology is incorporated in modeling the multiproduct cost function of an industry characterized by simultaneous existence of multiple technologies of production. A Bayesian method using Markov chain Monte Carlo methodology is developed to estimate the multiproduct cost function as a latent class model which is a finite mixture of flexible fixed cost quadratic (FMFFCQ) functions.
Firms are stochastically clustered according to the mixing parameter, which has a Dirichlet distribution. Hyperparameters of the prior distributions are expressed in terms of the mean and variance of the prior densities obtained from information years before and decades after regulations or innovations are introduced. The estimated FMFFCQ multiproduct cost function is then utilized to evaluate the degree and direction of overall and product-specific economies of scale and scope of banks in Kansas offering agricultural loans. The standard regression model and the finite mixture of flexible fixed cost quadratic functions model yield very different estimates of the multiproduct cost function and consequently different magnitudes and/or directions of overall and product-specific economies of scale and scope. These results imply that the traditional regression method may produce misleading estimates of the multiproduct cost function coefficients under multiple technologies of production. The finite mixture model provides more statistically significant and more precise coefficient estimates than the standard regression model.
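
Expressing hyperparameters through a prior mean and variance amounts to moment matching. As an illustration only, the sketch below does this for the two-technology case, where the Dirichlet prior on the mixing parameter reduces to a Beta(a, b) distribution. The function name and the example values are hypothetical; the priors actually used are those specified in the earlier sections of the paper.

```python
def beta_from_moments(mean, var):
    """Return (a, b) of the Beta distribution with the given mean and variance.

    Uses mean = a / (a + b) and var = mean * (1 - mean) / (a + b + 1),
    so a + b = mean * (1 - mean) / var - 1.
    """
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    max_var = mean * (1.0 - mean)
    if not 0.0 < var < max_var:
        raise ValueError("var must lie strictly between 0 and mean * (1 - mean)")
    total = max_var / var - 1.0  # a + b
    return mean * total, (1.0 - mean) * total


# Hypothetical prior belief: half of the banks use the new technology,
# with prior variance 0.05. This gives approximately Beta(2, 2).
a, b = beta_from_moments(0.5, 0.05)
```

A larger prior variance yields a smaller a + b, that is, a less informative prior on the share of firms using each technology.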
The method presented here can also be used to estimate the mixing parameter for several years in order to monitor how the rate of technology adoption changes over time.
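
To make the type of sampler summarized in this section concrete, the following is a minimal sketch of a data-augmentation Gibbs sampler for a mixture of linear regressions with a Dirichlet prior on the mixing parameter. It is not the author's implementation: the normal prior on the coefficients, the inverse-gamma prior on the variances, the hyperparameter values (tau2, a0, b0) and the function name are all assumptions, and neither curvature restrictions nor label switching are handled.

```python
import numpy as np


def gibbs_mixture_regression(y, X, n_iter=2000, burn=500, alpha=(1.0, 1.0),
                             tau2=100.0, a0=2.0, b0=1.0, seed=0):
    """Gibbs sampler for a K-component mixture of linear regressions.

    Assumed priors: pi ~ Dirichlet(alpha), beta_k ~ N(0, tau2 * I),
    sigma2_k ~ Inverse-Gamma(a0, b0). Returns draws of pi and beta.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    alpha = np.asarray(alpha, dtype=float)
    K = alpha.size
    beta = rng.normal(size=(K, p))
    sigma2 = np.ones(K)
    pi = np.full(K, 1.0 / K)
    draws_pi, draws_beta = [], []

    for it in range(n_iter):
        # Step 1 (data augmentation): draw each firm's latent technology label.
        resid = y[:, None] - X @ beta.T                      # shape (n, K)
        logp = np.log(pi) - 0.5 * np.log(sigma2) - 0.5 * resid ** 2 / sigma2
        logp -= logp.max(axis=1, keepdims=True)              # numerical stability
        prob = np.exp(logp)
        prob /= prob.sum(axis=1, keepdims=True)
        u = rng.random(n)
        z = (u[:, None] > np.cumsum(prob, axis=1)).sum(axis=1)
        z = np.minimum(z, K - 1)                             # guard against rounding

        # Step 2: draw the mixing parameter from its Dirichlet full conditional.
        counts = np.bincount(z, minlength=K)
        pi = rng.dirichlet(alpha + counts)

        # Steps 3 and 4: draw coefficients and variance for each component.
        for k in range(K):
            Xk, yk = X[z == k], y[z == k]
            prec = Xk.T @ Xk / sigma2[k] + np.eye(p) / tau2
            cov = np.linalg.inv(prec)
            cov = 0.5 * (cov + cov.T)                        # enforce symmetry
            mean = cov @ (Xk.T @ yk) / sigma2[k]
            beta[k] = rng.multivariate_normal(mean, cov)
            ssr = np.sum((yk - Xk @ beta[k]) ** 2)
            # If G ~ Gamma(shape, scale = 1 / rate), then 1 / G ~ Inverse-Gamma(shape, rate).
            sigma2[k] = 1.0 / rng.gamma(a0 + 0.5 * len(yk), 1.0 / (b0 + 0.5 * ssr))

        if it >= burn:
            draws_pi.append(pi.copy())
            draws_beta.append(beta.copy())

    return np.array(draws_pi), np.array(draws_beta)
```

Running such a sampler on each year's cross-section and tracking the posterior mean of the mixing parameter would give the kind of year-by-year monitoring described above.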
ACKNOWLEDGMENTS
I am thankful to Prof. Featherstone for access to the bank data. Comments and suggestions of participants at the European Econometrics and Economics Association meetings are gratefully acknowledged. My sincere gratitude also goes to Prof. Ragan for financial support during my time at Kansas State University.
REFERENCES
Akhavein, J., Frame, S. and White, L. (2001), “The Diffusion of Financial Innovations: An Examination of the Adoption of Small Business Credit Scoring by Large Banking Organizations,” Working Paper, Federal Reserve Bank of Atlanta.
Beard, R., Caudill, S., and Gropper, D. (1991), “Finite Mixture Estimation of Multiproduct Cost Functions,” Review of Economics and Statistics, 73, 654-664.
Celeux, G., and Diebolt, J. (1985), “The SEM Algorithm: A Probabilistic Teacher Algorithm Derived from the EM Algorithm for the Mixture Problem,” Computational Statistics Quarterly, 73-82.
Cohen, A. Clifford (1967), “Estimation in Mixtures of Two Normal Distributions,” Technometrics, 9, 15-28.
Cohn, E., Rhine, S., and Santos, M. (1989), “Institutions of Higher Education as Multiproduct Firms: Economies of Scale and Scope,” Review of Economics and Statistics, 71, 284-290.
Considine, T. (1989), “Separability, Functional Form and Regulatory Policy in Models of Interfuel Substitution,” Energy Economics, 11, 82-94.
Day, N. E. (1969), “Estimating the Components of a Mixture of Normal Distributions,” Biometrika, 56, 463-474.
Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977), “Maximum Likelihood from Incomplete Data Via the EM Algorithm,” Journal of the Royal Statistical Society (B), 39, 1-38.
Diebolt, J., and Robert, C. (1994), “Estimation of Finite Mixture Distributions through Bayesian Sampling,” Journal of the Royal Statistical Society (B), 56, 363-375.
Fair, R., and Jaffee, D. (1972), “Methods of Estimation for Markets in Disequilibrium,” Econometrica, 40, 497-514.
Featherstone, A. and Moss, C. (1994), “Measuring Economies of Scale and Scope in Agricultural Banking,” American Journal of Agricultural Economics, 76, 655-661.
Friedlaender, A., Winston, C., and Wang, K. (1983), “Costs, Technology and Productivity in the U. S. Automobile Industry,” Bell Journal of Economics, 14, 1-20.
Gallant, R. and Golub, G. (1984), “Imposing Curvature Restrictions on Flexible Functional Forms,” Journal of Econometrics, 26, 295-332.
Gelfand, A. and Smith, A. (1990), “Sampling Based Approaches to Calculating Marginal Densities,” Journal of the American Statistical Association, 85, 398-409.
Geman, S. and Geman, D. J. (1984), “Stochastic Relaxation, Gibbs Distribution and the Bayesian Restoration of Images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721-741.
Hannan, T. and McDowell, J. (1987), “The Determinants of Technology Adoption: The Case of the Banking Firm,” RAND Journal of Economics, 15, 328-335.
Hartley, M. (1978), “Comment” (on “Estimating Mixtures of Normal Distributions and Switching Regressions,” by Quandt and Ramsey), Journal of the American Statistical Association, 73, 738-741.
Hazilla, M. and Koop, R. (1985), “Imposing Curvature Restrictions on Flexible Functional Forms,” Resources for the Future Discussion Paper, Washington.
Hazilla, M. and Koop, R. (1986), “Systematic Effects of Capital Service Price Definition on Perception of Input Substitution,” Journal of Business and Economic Statistics, 4, 209-224.
Hosmer, D. (1973), “A Comparison of Iterative Maximum Likelihood Estimates of the Parameters of a Mixture of Two Normal Distributions Under Three Different Types of Samples,” Biometrics, 29, 761-770.
Johnes, G. and Johnes, J. (2007), International Handbook on the Economics of Education, Cheltenham: Edward Elgar Publishing Limited.
Kiefer, N. M. (1978), “Discrete Parameter Variation: Efficient Estimation of a Switching Regression Model,” Econometrica, 46, 427-434.
Maddala, G. S., and Nelson, F. D. (1975), “Switching Regression Models with Exogenous and Endogenous Switching,” American Statistical Association Proceedings of Business and Economics Statistics Section, 423-426.
Mansfield, E. (1968), Industrial Research and Technological Innovation, New York: W. W. Norton & Co.
Ogawa, K. (2011), “Why are Concavity Conditions Not Satisfied in the Cost Function? The Case of Japanese Manufacturing Firms during the Bubble Period,” Oxford Bulletin of Economics and Statistics, 73, 556-580.
Patterson, K. and Schott, K. (1979), The Measurement of Capital: Theory and Practice, London: Palgrave Macmillan.
Phillips, P. (1991), “A Constrained Maximum-Likelihood Approach to Estimating Switching Regressions,” Journal of Econometrics, 48, 240-261.
Phillips, P. (1991), “A Note on Testing for Switching Regressions,” Economics Letters, 35, 31-33.
Quandt, R. and Ramsey, J. (1978), “Estimating Mixtures of Normal Distributions and Switching Regressions,” Journal of the American Statistical Association, 73, 730-738.
Robert, C. (1994), The Bayesian Choice, New York: Springer-Verlag New York Inc.
Rose, N. and Joskow, P. (1990), “The Diffusion of New Technologies: Evidence from the Electric Utility Industry,” RAND Journal of Economics, 21, 354-374.
Saloner, G. and Shepard, A. (1995), “Adoption of Technologies with Network Effects: An Empirical Examination of the Adoption of Automated Teller Machines,” RAND Journal of Economics, 26, 479-501.
Scherer, F. (1984), Innovation and Growth, Massachusetts: MIT Press.
Smith, A. F. M., and Roberts, G. O. (1993), “Bayesian Computation via the Gibbs Sampler and Related Markov Chain Monte Carlo Methods,” Journal of the Royal Statistical Society (B), 55, 3-23.
Swamy, P. and Mehta, J. (1975), “Bayesian and Non-Bayesian Analysis of Switching Regressions and of Random Coefficient Regression Models,” Journal of the American Statistical Association, 70, 593-602.
Terrell, D. (1996), “Incorporating Monotonicity and Concavity Conditions in Flexible Functional Forms,” Journal of Applied Econometrics, 11, 179-194.
Wolff, H., Heckelei, T., and Mittelhammer, R. (2010), “Imposing Curvature and Monotonicity on Flexible Functional Forms: An Efficient Regional Approach,” Journal of Econometrics, 36, 309-339.