Segmented Polynomial Regression Applied to Taper Equations
TIMOTHY A. MAX
HAROLD E. BURKHART
Abstract. The shapes of various segments of tree boles approximate different geometric solids. The lower bole portion is generally assumed to be a neiloid frustum, the middle a paraboloid frustum, and the upper portion a cone. To describe stem taper throughout the bole, a set of submodels can be used. When these submodels are taken to be polynomial models, the techniques of segmented polynomial regression can be applied. Data from plantation and natural stands of loblolly pine were used as examples to compare three segmented polynomial regression models and a previously proposed single quadratic model for ability to describe tree taper. Results of hypothesis tests and analyses with an independent data set showed the segmented polynomial regression models with three submodels to be the best models tested for predicting taper in loblolly pines. Forest Sci. 22:283-289.
Additional key words. Forest measurement, nonlinear least squares, Pinus taeda.
RELATIVELY SIMPLE REGRESSION MODELS are sufficient to solve many forestry problems. More complex models, however, are often needed to adequately describe a response surface over the entire range of the independent variables. For example, complex models may be necessary in tree form description. Although a tree bole cannot be completely described in mathematical terms, it is common and convenient to assume that segments of a tree bole approximate various geometric solids. The lower bole portion is generally assumed to be a neiloid frustum, the middle portion a paraboloid frustum, and the upper portion a cone (Husch and others 1972). This suggests that three models are needed to describe tree taper, one model each for the lower, middle, and upper segments of the bole. These three models can be joined to form a single model which can be analyzed by regression techniques. Techniques for handling regression models of this kind, called segmented polynomial regression models, have recently received considerable attention in the statistical literature (e.g., Fuller 1969, Gallant and Fuller 1973, Gallant 1974a). The objective of this paper is to present the methods and results of applying segmented polynomial regression models to describe tree taper.

Past Work on Taper Equations
Foresters have used many different models of varying complexity in attempts to describe tree taper. For example, Bruce and others (1968) used polynomial models with powers such as 3/2, 2, and 40, Fries and Matérn (1965) proposed a multivariate approach involving a linear combination of several high-degree polynomials, and Bennett and Swindel (1972) developed taper curves consisting of third-degree polynomials. Kozak and Smith (1966) and Kozak and others (1969), on the other hand, have advocated a simple quadratic model for describing tree taper of many species occurring in British Columbia. In these and all other previous attempts to describe tree taper, a single model was used to represent the entire bole length of interest.
The authors are Assistant Professor and Associate Professor, respectively, Department of Forestry and Forest Products, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061. Manuscript received November 21, 1975.
volume 22, number 3, 1976 / 283
Segmented Polynomial Regression Models

Segmented polynomial models described by Fuller (1969), Gallant and Fuller (1973), Gallant (1974a), and others consist of a sequence of grafted submodels. In the case of one independent variable the domain is partitioned and a different polynomial submodel is defined on each section of the partition. These submodels are then grafted together to form the segmented polynomial model. In general such a segmented polynomial model can be written (Gallant and Fuller 1973):

$$y_i = f(x_i) + e_i$$

where

$$f(x) = \begin{cases} f_1(x, \beta_1), & a \le x \le \alpha_1 \\ f_2(x, \beta_2), & \alpha_1 < x \le \alpha_2 \\ \quad \vdots \\ f_r(x, \beta_r), & \alpha_{r-1} < x \le b. \end{cases}$$

These submodels are then grafted together at the join points, $\alpha_1, \alpha_2, \ldots, \alpha_{r-1}$, by imposing restrictions on the model. The restrictions may be imposed in such a manner that $f$ is continuous and has continuous first or higher order derivatives. If the join points are known fixed constants, then $f$ is linear in the unknown parameters $\beta$, and $\beta$ can be estimated using multiple linear regression with some modified independent variables (Fuller 1969). If the join points must be estimated and $f_1, \ldots, f_r$ are polynomials then $f$ can be rewritten by a method presented by Gallant and Fuller (1973). The reparameterization proposed by Gallant and Fuller imposes the restrictions that $f$ be continuous and have a continuous first partial derivative at each join point. The reparameterized model is linear in $\beta$ but nonlinear in $\alpha$. However $\beta$ and $\alpha$ can be estimated by using the modified Gauss-Newton nonlinear least squares procedure (Hartley 1961). This paper is concerned with models in which the join points must be estimated.

One of the simplest of the many models proposed to describe tree taper is the quadratic model presented by Kozak and others (1969). Their model, referred to as Model 1, can be written as

$$y_i = \beta_{11}(x_i - 1) + \beta_{12}(x_i^2 - 1) + e_i \tag{1}$$

where

$y = d^2/D^2$,
$d$ = diameter inside bark at any given height $h$,
$h$ = height above the ground,
$D$ = diameter at breast height outside bark,
$H$ = total tree height from ground to tip, and
$x = h/H$.
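As a minimal sketch of how Model 1 turns a relative height into a predicted diameter, the Python below evaluates the quadratic and converts $y$ back to $d$ through $y = d^2/D^2$. The function names and the coefficient values are illustrative assumptions; they are not estimates reported in this paper.

```python
import math


def model1_y(x, b11, b12):
    """Model 1: y = b11*(x - 1) + b12*(x**2 - 1), with x = h/H."""
    return b11 * (x - 1.0) + b12 * (x * x - 1.0)


def diameter_inside_bark(h, H, D, b11, b12):
    """Predicted d from y = d**2 / D**2; a negative y is clipped to zero."""
    y = model1_y(h / H, b11, b12)
    return D * math.sqrt(max(y, 0.0))


# Hypothetical coefficients, chosen only to give a plausible stem profile.
B11, B12 = -2.0, 0.9

# Diameter a quarter of the way up a 20-unit tree with dbh 25.
print(diameter_inside_bark(h=5.0, H=20.0, D=25.0, b11=B11, b12=B12))
# At the tip (x = 1) both terms vanish, so the predicted diameter is zero.
print(diameter_inside_bark(h=20.0, H=20.0, D=25.0, b11=B11, b12=B12))
```

Because both terms carry a factor that vanishes at $x = 1$, the zero diameter at the tip holds for any coefficient values, which is the conditioning the model is built around.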
These definitions of $y$ and $x$ will be retained throughout the remainder of the paper. Clearly Model 1 is conditioned such that $\hat{y} = 0$ when $x = 1$, that is, diameter is zero when measured at the tip of the tree.

Several grafted polynomial models were formulated and compared. The first consisted of two quadratic functions grafted at one join point. Such a quadratic-quadratic model can be written as

$$y_i = f(x_i) + e_i$$

where

$$f(x) = \begin{cases} A_1 + A_2 x + A_3 x^2, & 0 \le x \le \alpha_1 \\ B_1 + B_2 x + B_3 x^2, & \alpha_1 < x \le 1. \end{cases}$$

When the above model is reparameterized by the Gallant and Fuller (1973) method, a model in which the parameters can be estimated by nonlinear least squares results. The restriction that $f$ be continuous at $\alpha_1$ requires that

$$A_1 + A_2\alpha_1 + A_3\alpha_1^2 = B_1 + B_2\alpha_1 + B_3\alpha_1^2.$$

The restriction that $f$ have a continuous first partial derivative with respect to $x$ at $\alpha_1$ requires that

$$A_2 + 2A_3\alpha_1 = B_2 + 2B_3\alpha_1.$$

These two restrictions are used to eliminate $B_1$ and $B_2$ from the quadratic-quadratic model. Next the restriction that $\hat{y} = 0$ when $x = 1$ is imposed and the parameters are renamed. The resultant quadratic-quadratic model, referred to as Model 2, is given by

$$y_i = \beta_{21}(x_i - 1) + \beta_{22}(x_i^2 - 1) + \beta_{23}(\alpha_{21} - x_i)^2 I_+(\alpha_{21} - x_i) + e_i \tag{2}$$

where
$$I_+(\alpha_{21} - x_i) = \begin{cases} 1, & \alpha_{21} - x_i \ge 0 \\ 0, & \alpha_{21} - x_i < 0 \end{cases}$$

and $\alpha_{21}$ is the join point of the two quadratic submodels. Gallant and Fuller (1973) developed a method by which simple polynomial models can be written directly in reparameterized form. However some complex models may have to be derived by the method outlined above to ensure that the correct reparameterized model is obtained.

The third model consisted of three submodels. One submodel, a quadratic, described the taper of the lower section of the tree and accounts for butt swell. The taper of the middle section of the tree is described by a first degree polynomial. The profile of the tree top section is assumed to be quadratic and is conditioned such that $\hat{y} = 0$ when $x = 1$. Model 3, written in reparameterized form, is

$$y_i = \beta_{31}(x_i - 1) + \beta_{32}(x_i^2 - 1) - \beta_{32}(\alpha_{31} - x_i)^2 I_+(\alpha_{31} - x_i) + \beta_{33}(\alpha_{32} - x_i)^2 I_+(\alpha_{32} - x_i) + e_i \tag{3}$$

where

$$I_+(\alpha_{3j} - x_i) = \begin{cases} 1, & \alpha_{3j} - x_i \ge 0 \\ 0, & \alpha_{3j} - x_i < 0 \end{cases} \qquad j = 1, 2$$

and $\alpha_{31}$ and $\alpha_{32}$ are the join points of the submodels. Model 3 is referred to as a quadratic-linear-quadratic model.

The fourth model is similar to Model 3 except that the taper of the middle section of the tree is assumed to be quadratic rather than linear. Model 4, a quadratic-quadratic-quadratic model, is

$$y_i = \beta_{41}(x_i - 1) + \beta_{42}(x_i^2 - 1) + \beta_{43}(\alpha_{41} - x_i)^2 I_+(\alpha_{41} - x_i) + \beta_{44}(\alpha_{42} - x_i)^2 I_+(\alpha_{42} - x_i) + e_i \tag{4}$$

where

$$I_+(\alpha_{41} - x_i) = \begin{cases} 1, & \alpha_{41} - x_i \ge 0 \\ 0, & \alpha_{41} - x_i < 0 \end{cases} \qquad I_+(\alpha_{42} - x_i) = \begin{cases} 1, & \alpha_{42} - x_i \ge 0 \\ 0, & \alpha_{42} - x_i < 0. \end{cases}$$

Data

Data available for this study were from trees felled on temporary yield plots in plantations and natural stands of loblolly pine. Plantation-grown sample trees were from the Piedmont and Coastal Plain regions of Virginia, and from the Coastal Plain region of Delaware, Maryland, and North Carolina. Sample tree data from natural stands were obtained in the Piedmont and Coastal Plain regions of Virginia and in the Coastal Plain of North Carolina.

On each plot two trees (the 10th and 20th trees measured for dbh) were felled and cut into 4-foot (1.2 m) sections. Although stump heights were not measured, all were approximately 0.5 foot (0.15 m) and a constant height of 0.5 foot (0.15 m) was assumed in the analysis. In addition to measuring the tree dbh and total height, diameters inside and outside bark were measured at the stump and at 4-foot (1.2 m) intervals up the stem to an approximate 2-inch (5.1 cm) top diameter, outside bark.

There was a total of 422 sample trees from plantations and 230 from natural stands. The data sets were first stratified into 10-foot (3.0 m) height classes and then 25 percent of the trees were selected at random from each height class. The trees in these random subsets were considered representative of the original data and were withheld for testing purposes. The remaining 318 plantation-grown and 172 natural-stand-grown trees were used for initial fitting of the taper models.

Analysis and Results

Models 1, 2, 3, and 4 were fitted using the plantation and the natural-stand data. Tests were performed using the Likelihood Ratio procedure described by Gallant (1974b, 1975a, 1975b) to determine which model was best supported by the data.
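The segmented models being fitted here can be sketched as plain functions of relative height. The Python below is an illustrative sketch, not the authors' fitting code: the function names are invented, and the example parameter tuple uses the Model 4 plantation estimates from Table 3 under the assumption that they are listed in the order $\beta_{41}, \beta_{42}, \beta_{43}, \beta_{44}, \alpha_{41}, \alpha_{42}$.

```python
def plus_sq(alpha, x):
    """Return (alpha - x)**2 * I+(alpha - x): zero above the join point alpha."""
    diff = alpha - x
    return diff * diff if diff >= 0.0 else 0.0


def model2(x, b1, b2, b3, a1):
    """Quadratic-quadratic model, equation (2)."""
    return b1 * (x - 1.0) + b2 * (x * x - 1.0) + b3 * plus_sq(a1, x)


def model3(x, b1, b2, b3, a1, a2):
    """Quadratic-linear-quadratic model, equation (3).

    The -b2 term cancels the x**2 term below a1, leaving a linear middle section.
    """
    return (b1 * (x - 1.0) + b2 * (x * x - 1.0)
            - b2 * plus_sq(a1, x) + b3 * plus_sq(a2, x))


def model4(x, b1, b2, b3, b4, a1, a2):
    """Quadratic-quadratic-quadratic model, equation (4)."""
    return (b1 * (x - 1.0) + b2 * (x * x - 1.0)
            + b3 * plus_sq(a1, x) + b4 * plus_sq(a2, x))


# Table 3 plantation estimates for Model 4; the ordering is an assumption.
PLANTATION_M4 = (-3.0257, 1.4586, -1.4464, 39.1081, 0.7431, 0.1125)

# Evaluate y = d**2 / D**2 at the ground, the two join points, mid-stem, and the tip.
for x in (0.0, 0.1125, 0.5, 0.7431, 1.0):
    print(f"x = {x:.4f}  y = {model4(x, *PLANTATION_M4):.4f}")
```

Writing each grafted term as $(\alpha - x)^2 I_+(\alpha - x)$ keeps both the function and its first derivative continuous at the join point, since the squared term and its slope are zero there.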
The first hypothesis tested was whether Model 2 was superior to the simpler Model 1. Therefore we tested the hypothesis

$$H_0\colon\ \beta_{23} = 0,\ \alpha_{21} = 0$$

against

$$H_a\colon\ \beta_{23} \ne 0,\ \alpha_{21} \ne 0.$$

The test statistic used was

$$T = \frac{RSS_1/n}{RSS_2/n}$$

where $RSS_i$ is the residual sum of squares for model $i$ and $n$ is the number of observations. In this example Model 1 was restricted to agree with $H_0$ whereas Model 2 was the unrestricted model. The value of $T$ was compared to the critical point $c^*$ defined by the equation

$$c^* = 1 + qF_\alpha/(n - p)$$

where

$q$ is the number of parameters tested in $H_0$,
$p$ is the number of parameters in the unrestricted model,
$n$ is the number of observations, and
$F_\alpha$ is the upper $100\alpha$ percentage point of a central F-distribution with $q$ numerator degrees of freedom and $n - p$ denominator degrees of freedom.

For the data from planted stands

$$T = \frac{33.9356/2843}{17.6829/2843} = 1.9191$$

and at the 1 percent level of significance

$$c^* = 1 + 2(4.60/2839) = 1.0032.$$

Hence, $H_0$ was rejected and we concluded that Model 2 was better than Model 1. The same conclusion was reached for the data from natural stands.

Next, Model 3 was checked against Model 2. The same hypothesis testing procedure was used, with Model 2 as the restricted model. For both the planted- and natural-stand data, the null hypothesis was rejected resulting in the conclusion that
TABLE 1. Comparison of the predictive ability of Models 1, 2, 3, and 4 on an independent data set of plantation-grown trees and on an independent data set of natural-stand-grown trees. Deviation is defined to be the actual measured diameter minus the predicted diameter.

                                      Mean absolute deviation (cm)
Section of relative height     N     Model 1   Model 2   Model 3   Model 4

Plantations:
0.0 ≤ x < 0.1                 147     1.50      0.87      0.86      0.86
0.1 ≤ x < 0.2                 107     1.20      0.55      0.53      0.53
0.2 ≤ x < 0.3                 108     0.85      0.55      0.55      0.55
0.3 ≤ x < 0.4                 114     0.58      0.54      0.55      0.55
0.4 ≤ x < 0.5                 109     0.62      0.58      0.58      0.58
0.5 ≤ x < 0.6                 110     0.78      0.64      0.64      0.64
0.6 ≤ x < 0.7                 105     0.91      0.69      0.67      0.67
0.7 ≤ x ≤ 1.0                 133     0.76      0.93      0.68      0.68

Natural stands:
0.0 ≤ x < 0.1                 110     2.24      1.23      1.23      1.24
0.1 ≤ x < 0.2                  76     1.81      0.64      0.64      0.62
0.2 ≤ x < 0.3                  90     0.98      0.65      0.64      0.64
0.3 ≤ x < 0.4                  79     0.69      0.72      0.71      0.69
0.4 ≤ x < 0.5                  83     0.90      0.78      0.77      0.73
0.5 ≤ x < 0.6                  86     1.15      0.73      0.74      0.70
0.6 ≤ x < 0.7                  84     1.13      0.83      0.80      0.81
0.7 ≤ x ≤ 1.0                 147     0.95      1.34      0.93      0.83
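The section-by-section summaries reported in the tables (mean absolute deviation, and bias as the mean signed deviation) can be sketched as below. This is an illustrative Python sketch: the function name, the record layout, and the sample measurements are made-up assumptions, and only the section boundaries and the definition of deviation come from the table captions.

```python
from bisect import bisect_right

# Lower boundaries of sections 2 through 8; the last section spans 0.7 <= x <= 1.0.
EDGES = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]


def summarize(records):
    """Summarize deviations by relative-height section.

    records: iterable of (x, observed_d, predicted_d).
    Returns {section_index: (n, mean_absolute_deviation, bias)}.
    """
    sums = {}
    for x, observed, predicted in records:
        k = bisect_right(EDGES, x)      # 0 for x < 0.1, ..., 7 for x >= 0.7
        dev = observed - predicted      # actual minus predicted diameter
        n, abs_sum, dev_sum = sums.get(k, (0, 0.0, 0.0))
        sums[k] = (n + 1, abs_sum + abs(dev), dev_sum + dev)
    return {k: (n, a / n, s / n) for k, (n, a, s) in sorted(sums.items())}


# Made-up measurements (relative height, observed cm, predicted cm).
records = [(0.05, 30.0, 29.0), (0.05, 28.0, 29.0), (0.30, 20.0, 19.5), (0.95, 3.0, 3.5)]
result = summarize(records)
for section, (n, mad, bias) in result.items():
    print(section, n, mad, bias)
```

The first section in this toy data shows why both summaries are reported: deviations of +1 and -1 give a bias of zero but a mean absolute deviation of 1.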
TABLE 2. Comparison of the bias in the predictions from Models 1, 2, 3, and 4. Predictions were made on independent data sets for plantation-grown and natural-stand-grown trees. Bias is defined as the mean deviation of actual measured diameter minus the predicted diameter.

                                        Bias (mean deviation, cm)
Section of relative height     N     Model 1   Model 2   Model 3   Model 4

Plantations:
0.0 ≤ x < 0.1                 147     0.41     -0.04     -0.02     -0.02
0.1 ≤ x < 0.2                 107    -1.18     -0.19     -0.09     -0.09
0.2 ≤ x < 0.3                 108    -0.70     -0.11     -0.15     -0.15
0.3 ≤ x < 0.4                 114    -0.13     -0.07     -0.06     -0.06
0.4 ≤ x < 0.5                 109     0.27      0.08     -0.07     -0.07
0.5 ≤ x < 0.6                 110     0.56      0.04     -0.05     -0.04
0.6 ≤ x < 0.7                 105     0.74     -0.07      0.05      0.04
0.7 ≤ x ≤ 1.0                 133     0.39     -0.70      0.03      0.02

Natural stands:
0.0 ≤ x < 0.1                 110     0.29     -0.28     -0.28     -0.22
0.1 ≤ x < 0.2                  76    -1.79     -0.20     -0.23     -0.08
0.2 ≤ x < 0.3                  90    -0.89     -0.03     -0.03     -0.04
0.3 ≤ x < 0.4                  79    -0.02      0.26      0.20      0.07
0.4 ≤ x < 0.5                  83     0.64      0.39      0.35      0.16
0.5 ≤ x < 0.6                  86     1.06      0.33      0.35      0.17
0.6 ≤ x < 0.7                  84     1.01     -0.12      0.02     -0.05
0.7 ≤ x ≤ 1.0                 147     0.31     -1.12     -0.51     -0.16

Model 3 was to be preferred over Model 2. The tests were performed at the 0.01 level of significance.

The last hypothesis tested was whether the middle section of the segmented polynomial model should be quadratic or linear. That is, did the data justify the use of Model 4 over Model 3. In the case of the natural-stand data, the null hypothesis was rejected at the 0.01 level of significance, and the use of Model 4 was considered justifiable. In the case of the plantation-stand data, however, the null hypothesis could not be rejected; Model 3 was judged sufficient to describe the data.

We concluded, then, that Model 3 was sufficient to describe the data from plantation-grown trees while Model 4 was necessary to describe the natural-stand data. This difference was probably due to the greater average height of the natural-stand sample trees than the plantation sample trees. Hence, in terms of relative height, the upper join point $\alpha_{31}$ or $\alpha_{41}$ has a larger value for natural-stand trees than the corresponding join point for plantation-grown trees. Similarly, the lower join point has a smaller value for natural-stand trees than the corresponding join point for plantation-grown trees. This implies that for trees grown in natural stands, the middle segment represents a greater proportion of the tree bole than does the corresponding segment for plantation-grown trees. This proportionately larger midsection of the tree bole apparently cannot be described by a simple linear model.

Gallant and Fuller (1973) presented an approximate hypothesis testing procedure based on direct analogy to the theory of linear models. Using their approximate technique, all hypotheses in this section were retested and the conclusions were the same as those reached when the Likelihood Ratio test was used.
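The likelihood ratio comparison used in this section reduces to a few lines of arithmetic. The Python below is a minimal sketch that reproduces the Model 1 versus Model 2 comparison for planted stands from the residual sums of squares quoted in the text; the function names are illustrative, the F percentage point (4.60) is taken from the text rather than computed, and $p = 4$ is inferred from the denominator 2839 = 2843 - 4.

```python
def likelihood_ratio_statistic(rss_restricted, rss_unrestricted, n):
    """T = (RSS_restricted / n) / (RSS_unrestricted / n)."""
    return (rss_restricted / n) / (rss_unrestricted / n)


def critical_point(q, p, n, f_upper):
    """c* = 1 + q * F_alpha / (n - p), with F_alpha supplied by the caller."""
    return 1.0 + q * f_upper / (n - p)


# Planted-stand values quoted in the text: Model 1 restricted, Model 2 unrestricted.
T = likelihood_ratio_statistic(rss_restricted=33.9356, rss_unrestricted=17.6829, n=2843)
c_star = critical_point(q=2, p=4, n=2843, f_upper=4.60)

# H0 is rejected when T exceeds the critical point.
reject_h0 = T > c_star
print(round(T, 4), round(c_star, 4), reject_h0)
```

The same two functions cover the later comparisons by changing which model supplies the restricted residual sum of squares and by setting $q$ and $p$ to match the hypothesis being tested.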
Models 1, 2, 3, and 4 were further tested by using the data withheld to be used as an independent data set. To analyze the
predictive ability of the models over the entire length of the stem, the independent variable, relative height, was divided into eight sections. Within each section, the diameters available from the withheld data set were compared to the diameters predicted from the models. Tables 1 and 2 summarize these results for both planted and natural stands.

The results in Tables 1 and 2 demonstrate the superior predictive ability of the segmented-polynomial models, in particular Models 3 and 4. All models were conditioned such that the predicted diameter was zero when relative height equaled one. Hence, there is little difference among the models when testing for predicted diameters at relative heights near one. Models 3 and 4 were, however, slightly superior in this region, particularly in terms of bias (Table 2). Differences among the models were greater when predicted diameters at lower relative heights were compared. In particular, for relative height less than 0.2 the segmented polynomial models were greatly superior since these models can account for butt swell whereas a simple quadratic model does not have this flexibility.

After all testing was completed, the data reserved for testing purposes were combined with the data used for fitting the models and the parameters of Model 4 were reestimated. These parameter estimates are shown in Table 3. For uniformity Model 4 was used for both plantation and natural stands. We feel that Model 4 will be justified in most cases and was rejected for the planted-stand data because the trees in our samples were relatively short.

Using the Model 4 parameter estimates shown in Table 3, diameters were predicted for representative dbh and total height combinations within the range of sample data and the resultant diameters were used to plot stem profiles. The stem profile plots were consistent with expected results; no abnormalities or illogical trends were detected.

TABLE 3. Estimates of the parameters in Model 4. The model was fitted to data from plantations and to data from natural stands for estimating diameter inside bark.

                      Parameter estimates
Parameter          Plantations   Natural stands
$\beta_{41}$         -3.0257        -4.2844
$\beta_{42}$          1.4586         2.0870
$\beta_{43}$         -1.4464        -2.3287
$\beta_{44}$         39.1081        86.7575
$\alpha_{41}$         0.7431         0.7965
$\alpha_{42}$         0.1125         0.0884

Conclusions

The segmented polynomial models with estimated join points provided an improved description of tree taper when compared to a single quadratic taper model used throughout the stem length. Although this improvement has been shown only for the two loblolly pine data sets available for analysis, we feel that the models should prove useful for other species due to the increased flexibility of segmented models to describe stem taper. We also feel that there are many other potential applications in forestry for segmented polynomial regression techniques.

Literature Cited

BENNETT, F. A., and B. F. SWINDEL. 1972. Taper curves for planted slash pine. USDA Forest Serv Res Note SE-179, 4 p.

BRUCE, D., R. O. CURTIS, and C. VANCOEVERING. 1968. Development of a system of taper and volume tables for red alder. Forest Sci 14:339-350.

FRIES, J., and B. MATÉRN. 1965. On the use of multivariate methods for the construction of tree taper curves. Advisory Group of Forest Statisticians of the I.U.F.R.O. Section 25, Stockholm Conference, October, 1965. Paper No. 9, 32 p.

FULLER, W. A. 1969. Grafted polynomials as approximating functions. Aust J Agric Econ 13:35-46.

GALLANT, A. R. 1974a. The theory of nonlinear regression as it relates to segmented polynomial regressions with estimated join points. Institute of Statistics Mimeograph Series No. 925, 25 p. Raleigh, N.C.

GALLANT, A. R. 1974b. Testing a subset of the parameters of a nonlinear regression model. Institute of Statistics Mimeograph Series No. 943, 24 p. Raleigh, N.C.

GALLANT, A. R. 1975a. Nonlinear regression. Am Stat 29:73-81.

GALLANT, A. R. 1975b. Testing a subset of the parameters of a nonlinear regression model. J Am Stat Assoc 70:927-932.

GALLANT, A. R., and W. A. FULLER. 1973. Fitting segmented polynomial regression models whose join points have to be estimated. J Am Stat Assoc 68:144-147.

HARTLEY, H. O. 1961. The modified Gauss-Newton method for the fitting of non-linear regression functions by least squares. Technometrics 3:269-280.

HUSCH, B., C. I. MILLER, and T. W. BEERS. 1972. Forest mensuration. Ronald Press Company, New York, N.Y. 410 p.

KOZAK, A., and J. H. G. SMITH. 1966. Critical analysis of multivariate techniques for estimating tree taper suggests that simpler methods are best. For Chron 42:458-463.

KOZAK, A., D. D. MUNRO, and J. H. G. SMITH. 1969. Taper functions and their application in forest inventory. For Chron 45:278-283.