Segmented Polynomial Regression Applied to Taper Equations

TIMOTHY A. MAX and HAROLD E. BURKHART

Abstract. The shape of various segments of tree boles approximate different geometric solids. The lower bole portion is generally assumed to be a neiloid frustum, the middle a paraboloid frustum, and the upper portion a cone. To describe stem taper throughout the bole, a set of submodels can be used. When these submodels are taken to be polynomial models, the techniques of segmented polynomial regression can be applied. Data from plantation and natural stands of loblolly pine were used as examples to compare three segmented polynomial regression models and a previously proposed single quadratic model for ability to describe tree taper. Results of hypothesis tests and analyses with an independent data set showed the segmented polynomial regression models with three submodels to be the best model tested for predicting taper in loblolly pines. Forest Sci. 22:283-289.

Additional key words. Forest measurement, nonlinear least squares, Pinus taeda.

RELATIVELY SIMPLE REGRESSION MODELS are sufficient to solve many forestry problems. More complex models, however, are often needed to adequately describe a response surface over the entire range of the independent variables. For example, complex models may be necessary in tree form description. Although a tree bole cannot be completely described in mathematical terms, it is common and convenient to assume that segments of a tree bole approximate various geometric solids. The lower bole portion is generally assumed to be a neiloid frustum, the middle portion a paraboloid frustum, and the upper portion a cone (Husch and others 1972). This suggests that three models are needed to describe tree taper, one model each for the lower, middle, and upper segments of the bole. These three models can be joined to form a single model which can be analyzed by regression techniques. Techniques for handling regression models of this kind, called segmented polynomial regression models, have recently received considerable attention in the statistical literature (e.g., Fuller 1969, Gallant and Fuller 1973, Gallant 1974a). The objective of this paper is to present the methods and results of applying segmented polynomial regression models to describe tree taper.

Past Work on Taper Equations

Foresters have used many different models of varying complexity in attempts to describe tree taper. For example, Bruce and others (1968) used polynomial models with powers such as 3/2, 2, and 40, Fries and Matérn (1965) proposed a multivariate approach involving a linear combination of several high-degree polynomials, and Bennett and Swindel (1972) developed taper curves consisting of third-degree polynomials. Kozak and Smith (1966) and Kozak and others (1969), on the other hand, have advocated a simple quadratic model for describing tree taper of many species occurring in British Columbia. In these and all other previous attempts to describe tree taper, a single model was used to represent the entire bole length of interest.

The authors are Assistant Professor and Associate Professor, respectively, Department of Forestry and Forest Products, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061. Manuscript received November 21, 1975.

volume 22, number 3, 1976 / 283

Segmented Polynomial Regression Models

Segmented polynomial models described by Fuller (1969), Gallant and Fuller (1973), Gallant (1974a), and others consist of a sequence of grafted submodels. In the case of one independent variable the domain is partitioned and a different polynomial submodel is defined on each section of the partition. These submodels are then grafted together to form the segmented polynomial model. In general such a segmented polynomial model can be written (Gallant and Fuller 1973) as

$$y_i = f(x_i) + e_i$$

where

$$
f(x) =
\begin{cases}
f_1(x, \beta_1), & a \le x \le \alpha_1 \\
f_2(x, \beta_2), & \alpha_1 < x \le \alpha_2 \\
\quad \vdots \\
f_r(x, \beta_r), & \alpha_{r-1} < x \le b.
\end{cases}
$$

These submodels are then grafted together at the join points, $\alpha_1, \alpha_2, \ldots, \alpha_{r-1}$, by imposing restrictions on the model. The restrictions may be imposed in such a manner that $f$ is continuous and has continuous first or higher order derivatives. If the join points are known fixed constants, then $f$ is linear in the unknown parameters $\beta$, and $\beta$ can be estimated using multiple linear regression with some modified independent variables (Fuller 1969). If the join points must be estimated and $f_1, \ldots, f_r$ are polynomials then $f$ can be rewritten by a method presented by Gallant and Fuller (1973). The reparameterization proposed by Gallant and Fuller imposes the restrictions that $f$ be continuous and have a continuous first partial derivative at each join point. The reparameterized model is linear in $\beta$ but nonlinear in $\alpha$. However $\beta$ and $\alpha$ can be estimated by using the modified Gauss-Newton nonlinear least squares procedure (Hartley 1961). This paper is concerned with models in which the join points must be estimated.

One of the simplest of the many models proposed to describe tree taper is the quadratic model presented by Kozak and others (1969). Their model, referred to as Model 1, can be written as

$$y_i = \beta_{11}(x_i - 1) + \beta_{12}(x_i^2 - 1) + e_i \tag{1}$$

where

$y = d^2/D^2$,
$d$ = diameter inside bark at any given height $h$,
$h$ = height above the ground,
$D$ = diameter at breast height outside bark,
$H$ = total tree height from ground to tip,
$x = h/H$.

These definitions of $y$ and $x$ will be retained throughout the remainder of the paper. Clearly Model 1 is conditioned such that $\hat{y} = 0$ when $x = 1$, that is, diameter is zero when measured at the tip of the tree.

Several grafted polynomial models were formulated and compared. The first consisted of two quadratic functions grafted at one join point. Such a quadratic-quadratic model can be written as

$$y_i = f(x_i) + e_i$$

where

$$
f(x) =
\begin{cases}
A_1 + A_2 x + A_3 x^2, & x \le \alpha_1 \\
B_1 + B_2 x + B_3 x^2, & \alpha_1 < x \le 1.
\end{cases}
$$

When the above model is reparameterized by the Gallant and Fuller (1973) method, a model in which the parameters can be estimated by nonlinear least squares results. The restriction that $f$ be continuous at $\alpha_1$ requires that

$$A_1 + A_2\alpha_1 + A_3\alpha_1^2 = B_1 + B_2\alpha_1 + B_3\alpha_1^2.$$

The restriction that $f$ have a continuous first partial derivative with respect to $x$ at $\alpha_1$ requires that

$$A_2 + 2A_3\alpha_1 = B_2 + 2B_3\alpha_1.$$

These two restrictions are used to eliminate $B_1$ and $B_2$ from the quadratic-quadratic model. Next the restriction that $\hat{y} = 0$ when $x = 1$ is imposed and the parameters are renamed. The resultant quadratic-quadratic model, referred to as Model 2, is given by

$$y_i = \beta_{21}(x_i - 1) + \beta_{22}(x_i^2 - 1) + \beta_{23}(\alpha_{21} - x_i)^2 I_+(\alpha_{21} - x_i) + e_i \tag{2}$$

where

$$
I_+(\alpha_{21} - x_i) =
\begin{cases}
1, & \alpha_{21} - x_i \ge 0 \\
0, & \alpha_{21} - x_i < 0
\end{cases}
$$

and $\alpha_{21}$ is the join point of the two quadratic submodels. Gallant and Fuller (1973) developed a method by which simple polynomial models can be written directly in reparameterized form. However some complex models may have to be derived by the method outlined above to ensure that the correct reparameterized model is obtained.

The third model consisted of three submodels. One submodel, a quadratic, described the taper of the lower section of the tree and accounts for butt swell. The taper of the middle section of the tree is described by a first degree polynomial. The profile of the tree top section is assumed to be quadratic and is conditioned such that $\hat{y} = 0$ when $x = 1$. Model 3, written in reparameterized form, is

$$y_i = \beta_{31}(x_i - 1) + \beta_{32}(x_i^2 - 1) - \beta_{32}(\alpha_{31} - x_i)^2 I_+(\alpha_{31} - x_i) + \beta_{33}(\alpha_{32} - x_i)^2 I_+(\alpha_{32} - x_i) + e_i \tag{3}$$

where

$$
I_+(\alpha_{3j} - x_i) =
\begin{cases}
1, & \alpha_{3j} - x_i \ge 0 \\
0, & \alpha_{3j} - x_i < 0
\end{cases}
\qquad j = 1, 2
$$

and $\alpha_{31}$ and $\alpha_{32}$ are the join points of the submodels. Model 3 is referred to as a quadratic-linear-quadratic model.

The fourth model is similar to Model 3 except that the taper of the middle section of the tree is assumed to be quadratic rather than linear. Model 4, a quadratic-quadratic-quadratic model, is

$$y_i = \beta_{41}(x_i - 1) + \beta_{42}(x_i^2 - 1) + \beta_{43}(\alpha_{41} - x_i)^2 I_+(\alpha_{41} - x_i) + \beta_{44}(\alpha_{42} - x_i)^2 I_+(\alpha_{42} - x_i) + e_i \tag{4}$$

where

$$
I_+(\alpha_{41} - x_i) =
\begin{cases}
1, & \alpha_{41} - x_i \ge 0 \\
0, & \alpha_{41} - x_i < 0
\end{cases}
\qquad
I_+(\alpha_{42} - x_i) =
\begin{cases}
1, & \alpha_{42} - x_i \ge 0 \\
0, & \alpha_{42} - x_i < 0.
\end{cases}
$$

Data

Data available for this study were from trees felled on temporary yield plots in plantations and natural stands of loblolly pine. Plantation-grown sample trees were from the Piedmont and Coastal Plain regions of Virginia, and from the Coastal Plain region of Delaware, Maryland, and North Carolina. Sample tree data from natural stands were obtained in the Piedmont and Coastal Plain regions of Virginia and in the Coastal Plain of North Carolina.

On each plot two trees (the 10th and 20th trees measured for dbh) were felled and cut into 4-foot (1.2 m) sections. Although stump heights were not measured, all were approximately 0.5 foot (0.15 m) and a constant height of 0.5 foot (0.15 m) was assumed in the analysis. In addition to measuring the tree dbh and total height, diameters inside and outside bark were measured at the stump and at 4-foot (1.2 m) intervals up the stem to an approximate 2-inch (5.1 cm) top diameter, outside bark.

There was a total of 422 sample trees from plantations and 230 from natural stands. The data sets were first stratified into 10-foot (3.0 m) height classes and then 25 percent of the trees were selected at random from each height class. The trees in these random subsets were considered representative of the original data and were withheld for testing purposes. The remaining 318 plantation-grown and 172 natural-stand-grown trees were used for initial fitting of the taper models.

Analysis and Results

Models 1, 2, 3, and 4 were fitted using the plantation and the natural-stand data. Tests were performed using the Likelihood Ratio procedure described by Gallant (1974b, 1975a, 1975b) to determine which model was best supported by the data.
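As an illustration of the fitting step, the sketch below fits Model 2 to synthetic data. It is not the modified Gauss-Newton procedure used in the paper: because the model is linear in the β parameters once the join point is fixed, the sketch scans a grid of candidate join points, solves an ordinary least squares problem at each one, and keeps the candidate with the smallest residual sum of squares. The function names, the grid, the parameter values, and the synthetic data are all assumptions made for the example.

```python
def model2_terms(x, alpha):
    """Regressors of Model 2 at relative height x for a fixed join point alpha."""
    step = 1.0 if alpha - x >= 0 else 0.0            # indicator I+(alpha - x)
    return [x - 1.0, x * x - 1.0, (alpha - x) ** 2 * step]


def solve_linear(a, b):
    """Solve a small square system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def fit_fixed_join(xs, ys, alpha):
    """Least squares betas for a fixed join point; returns (betas, rss)."""
    rows = [model2_terms(x, alpha) for x in xs]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    betas = solve_linear(xtx, xty)                   # normal equations
    rss = sum((y - sum(b * t for b, t in zip(betas, r))) ** 2
              for r, y in zip(rows, ys))
    return betas, rss


def fit_model2(xs, ys, candidates):
    """Scan candidate join points and keep the fit with the smallest RSS."""
    best = None
    for alpha in candidates:
        try:
            betas, rss = fit_fixed_join(xs, ys, alpha)
        except ValueError:
            continue                                 # skip degenerate join points
        if best is None or rss < best[2]:
            best = (alpha, betas, rss)
    return best


# Synthetic, noise-free profile generated from assumed parameter values.
true_alpha, true_betas = 0.2, [-1.5, 0.6, 8.0]
xs = [i / 50 for i in range(51)]
ys = [sum(b * t for b, t in zip(true_betas, model2_terms(x, true_alpha)))
      for x in xs]

alpha_hat, betas_hat, rss = fit_model2(xs, ys, [i / 100 for i in range(5, 96)])
print(alpha_hat, [round(b, 4) for b in betas_hat], rss)
```

A grid scan is only as fine as its candidate list, so in practice it would more likely serve to supply starting values for an iterative nonlinear least squares routine than to replace one.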


The first hypothesis tested was whether Model 2 was superior to the simpler Model 1. Therefore we tested the hypothesis

$$H_0\colon \beta_{23} = 0,\ \alpha_{21} = 0$$

against

$$H_a\colon \beta_{23} \ne 0,\ \alpha_{21} \ne 0.$$

The test statistic used was

$$T = \frac{RSS_1/n}{RSS_2/n}$$

where $RSS_i$ is the residual sum of squares for model $i$ and $n$ is the number of observations. In this example Model 1 was restricted to agree with $H_0$ whereas Model 2 was the unrestricted model. The value of $T$ was compared to the critical point $c^*$ defined by the equation

$$c^* = 1 + qF_\alpha/(n - p)$$

where

$q$ is the number of parameters tested in $H_0$,
$p$ is the number of parameters in the unrestricted model,
$n$ is the number of observations, and
$F_\alpha$ is the upper $100\alpha$ percentage point of a central F-distribution with $q$ numerator degrees of freedom and $n - p$ denominator degrees of freedom.

For the data from planted stands

$$T = \frac{33.9356/2843}{17.6829/2843} = 1.9191$$

and at the 1 percent level of significance

$$c^* = 1 + 2(4.60/2839) = 1.0032.$$

Hence, $H_0$ was rejected and we concluded that Model 2 was better than Model 1. The same conclusion was reached for the data from natural stands.

Next, Model 3 was checked against Model 2. The same hypothesis testing procedure was used, with Model 2 as the restricted model. For both the planted- and natural-stand data, the null hypothesis was rejected.

TABLE 1. Comparison of the predictive ability of Models 1, 2, 3, and 4 on an independent data set of plantation-grown trees and on an independent data set of natural-stand-grown trees. Deviation is defined to be the actual measured diameter minus the predicted diameter.

                                     Mean absolute deviation (cm)
Section of relative height      N    Model 1   Model 2   Model 3   Model 4

Plantations:
  0.0 ≤ x < 0.1               147     1.50      0.87      0.86      0.86
  0.1 ≤ x < 0.2               107     1.20      0.55      0.53      0.53
  0.2 ≤ x < 0.3               108     0.85      0.55      0.55      0.55
  0.3 ≤ x < 0.4               114     0.58      0.54      0.55      0.55
  0.4 ≤ x < 0.5               109     0.62      0.58      0.58      0.58
  0.5 ≤ x < 0.6               110     0.78      0.64      0.64      0.64
  0.6 ≤ x < 0.7               105     0.91      0.69      0.67      0.67
  0.7 ≤ x ≤ 1.0               133     0.76      0.93      0.68      0.68

Natural stands:
  0.0 ≤ x < 0.1               110     2.24      1.23      1.23      1.24
  0.1 ≤ x < 0.2                76     1.81      0.64      0.64      0.62
  0.2 ≤ x < 0.3                90     0.98      0.65      0.64      0.64
  0.3 ≤ x < 0.4                79     0.69      0.72      0.71      0.69
  0.4 ≤ x < 0.5                83     0.90      0.78      0.77      0.73
  0.5 ≤ x < 0.6                86     1.15      0.73      0.74      0.70
  0.6 ≤ x < 0.7                84     1.13      0.83      0.80      0.81
  0.7 ≤ x ≤ 1.0               147     0.95      1.34      0.93      0.83
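The per-section summaries of the kind reported in Table 1 (mean absolute deviation) and Table 2 (bias) can be computed along the following lines. This is a minimal sketch: it predicts diameter inside bark from Model 4 with the plantation estimates listed in Table 3, then groups deviations (measured minus predicted) by relative-height section. The function names are hypothetical, and the five measurements are invented solely to exercise the arithmetic; they are not data from the study.

```python
import math

# Plantation estimates of Model 4 as listed in Table 3.
BETAS = (-3.0257, 1.4586, -1.4464, 39.1081)
JOINS = (0.7431, 0.1125)             # upper and lower join points


def model4_y(x, betas=BETAS, joins=JOINS):
    """Predicted y = d^2 / D^2 at relative height x = h / H."""
    b1, b2, b3, b4 = betas
    a1, a2 = joins
    y = b1 * (x - 1.0) + b2 * (x * x - 1.0)
    if a1 - x >= 0:                  # indicator I+(a1 - x)
        y += b3 * (a1 - x) ** 2
    if a2 - x >= 0:                  # indicator I+(a2 - x)
        y += b4 * (a2 - x) ** 2
    return y


def predict_diameter(h, dbh, total_height):
    """Diameter inside bark at height h, in the units of dbh."""
    y = model4_y(h / total_height)
    return dbh * math.sqrt(max(y, 0.0))


# Upper bounds of the eight sections; the last section is closed at 1.0.
UPPER = (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 1.0)


def section_index(x):
    """Index (0 to 7) of the relative-height section containing x."""
    for i, upper in enumerate(UPPER[:-1]):
        if x < upper:
            return i
    return len(UPPER) - 1


def summarize(records):
    """records: iterable of (h, dbh, total_height, measured_diameter)."""
    groups = {}
    for h, dbh, total_height, measured in records:
        deviation = measured - predict_diameter(h, dbh, total_height)
        groups.setdefault(section_index(h / total_height), []).append(deviation)
    summary = {}
    for i, devs in sorted(groups.items()):
        n = len(devs)
        summary[i] = {
            "n": n,
            "mad": sum(abs(d) for d in devs) / n,    # mean absolute deviation
            "bias": sum(devs) / n,                   # mean deviation
        }
    return summary


# Hypothetical measurements: (height m, dbh cm, total height m, diameter cm).
records = [
    (0.15, 20.0, 15.0, 22.0),
    (1.35, 20.0, 15.0, 17.5),
    (2.55, 20.0, 15.0, 16.0),
    (6.15, 20.0, 15.0, 13.5),
    (11.0, 20.0, 15.0, 7.0),
]
summary = summarize(records)
for i, stats in summary.items():
    print(i, stats["n"], round(stats["mad"], 2), round(stats["bias"], 2))
```

Because bias averages signed deviations while the mean absolute deviation averages their magnitudes, the absolute bias in a section can never exceed that section's mean absolute deviation, which gives a quick consistency check when the two tables are read side by side.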


TABLE 2. Comparison of the bias in the predictions from Models 1, 2, 3, and 4. Predictions were made on independent data sets for plantation-grown and natural-stand-grown trees. Bias is defined as the mean deviation of actual measured diameter minus the predicted diameter.

                                        Bias (mean deviation, cm)
Section of relative height      N    Model 1   Model 2   Model 3   Model 4

Plantations:
  0.0 ≤ x < 0.1               147     0.41     -0.04     -0.02     -0.02
  0.1 ≤ x < 0.2               107    -1.18     -0.19     -0.09     -0.09
  0.2 ≤ x < 0.3               108    -0.70     -0.11     -0.15     -0.15
  0.3 ≤ x < 0.4               114    -0.13     -0.07     -0.06     -0.06
  0.4 ≤ x < 0.5               109     0.27      0.08     -0.07     -0.07
  0.5 ≤ x < 0.6               110     0.56      0.04     -0.05     -0.04
  0.6 ≤ x < 0.7               105     0.74     -0.07      0.05      0.04
  0.7 ≤ x ≤ 1.0               133     0.39     -0.70      0.03      0.02

Natural stands:
  0.0 ≤ x < 0.1               110     0.29     -0.28     -0.28     -0.22
  0.1 ≤ x < 0.2                76    -1.79     -0.20     -0.23     -0.08
  0.2 ≤ x < 0.3                90    -0.89     -0.03     -0.03     -0.04
  0.3 ≤ x < 0.4                79    -0.02      0.26      0.20      0.07
  0.4 ≤ x < 0.5                83     0.64      0.39      0.35      0.16
  0.5 ≤ x < 0.6                86     1.06      0.33      0.35      0.17
  0.6 ≤ x < 0.7                84     1.01     -0.12      0.02     -0.05
  0.7 ≤ x ≤ 1.0               147     0.31     -1.12     -0.51     -0.16

Model 3 was to be preferred over Model 2. The tests were performed at the 0.01 level of significance.

The last hypothesis tested was whether the middle section of the segmented polynomial model should be quadratic or linear. That is, did the data justify the use of Model 4 over Model 3? In the case of the natural-stand data, the null hypothesis was rejected at the 0.01 level of significance, and the use of Model 4 was considered justifiable. In the case of the plantation-stand data, however, the null hypothesis could not be rejected; Model 3 was judged sufficient to describe the data.

We concluded, then, that Model 3 was sufficient to describe the data from plantation-grown trees while Model 4 was necessary to describe the natural-stand data. This difference was probably due to the greater average height of the natural-stand sample trees than the plantation sample trees. Hence, in terms of relative height, the upper join point $\alpha_{31}$ or $\alpha_{41}$ has a larger value for natural-stand trees than the corresponding join point for plantation-grown trees. Similarly, the lower join point has a smaller value for natural-stand trees than the corresponding join point for plantation-grown trees. This implies that for trees grown in natural stands, the middle segment represents a greater proportion of the tree bole than does the corresponding segment for plantation-grown trees. This proportionately larger midsection of the tree bole apparently cannot be described by a simple linear model.

Gallant and Fuller (1973) presented an approximate hypothesis testing procedure based on direct analogy to the theory of linear models. Using their approximate technique, all hypotheses in this section were retested and the conclusions were the same as those reached when the Likelihood Ratio test was used.
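The comparison of Model 1 with Model 2 for the planted-stand data can be checked numerically from the residual sums of squares quoted earlier in this section. The sketch below recomputes the test statistic and the critical point from those figures. The function names are illustrative, the value p = 4 is inferred from the quoted denominators (2843 observations, 2839 degrees of freedom), and the F percentage point of 4.60 is taken from the text rather than computed.

```python
def likelihood_ratio_statistic(rss_restricted, rss_unrestricted, n):
    """T = (RSS of restricted model / n) / (RSS of unrestricted model / n)."""
    return (rss_restricted / n) / (rss_unrestricted / n)


def critical_point(q, f_alpha, n, p):
    """c* = 1 + q * F_alpha / (n - p)."""
    return 1.0 + q * f_alpha / (n - p)


n = 2843          # observations in the planted-stand fitting data
p = 4             # parameters in Model 2 (inferred: 2843 - 2839)
q = 2             # parameters tested in H0
f_alpha = 4.60    # upper 1 percent point quoted in the text

t = likelihood_ratio_statistic(33.9356, 17.6829, n)
c_star = critical_point(q, f_alpha, n, p)
reject_h0 = t > c_star

print(round(t, 4), round(c_star, 4), reject_h0)   # 1.9191 1.0032 True
```

Since n appears in both the numerator and the denominator of T, the statistic reduces to the ratio of the two residual sums of squares; the sample size matters only through the critical point.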

Models 1, 2, 3, and 4 were further tested by using the data withheld to be used as an independent data set. To analyze the predictive ability of the models over the entire length of the stem, the independent variable, relative height, was divided into eight sections. Within each section, the diameters available from the withheld data set were compared to the diameters predicted from the models. Tables 1 and 2 summarize these results for both planted and natural stands.

The results in Tables 1 and 2 demonstrate the superior predictive ability of the segmented-polynomial models, in particular Models 3 and 4. All models were conditioned such that the predicted diameter was zero when relative height equaled one. Hence, there is little difference among the models when testing for predicted diameters at relative heights near one. Models 3 and 4 were, however, slightly superior in this region, particularly in terms of bias (Table 2). Differences among the models were greater when predicted diameters at lower relative heights were compared. In particular, for relative height less than 0.2 the segmented polynomial models were greatly superior since these models can account for butt swell whereas a simple quadratic model does not have this flexibility.

After all testing was completed, the data reserved for testing purposes were combined with the data used for fitting the models and the parameters of Model 4 were reestimated. These parameter estimates are shown in Table 3. For uniformity Model 4 was used for both plantation and natural stands. We feel that Model 4 will be justified in most cases and was rejected for the planted-stand data because the trees in our samples were relatively short.

TABLE 3. Estimates of the parameters in Model 4. The model was fitted to data from plantations and to data from natural stands for estimating diameter inside bark.

                   Parameter estimates
Parameter       Plantations    Natural stands
$\beta_{41}$      -3.0257         -4.2844
$\beta_{42}$       1.4586          2.0870
$\beta_{43}$      -1.4464         -2.3287
$\beta_{44}$      39.1081         86.7575
$\alpha_{41}$      0.7431          0.7965
$\alpha_{42}$      0.1125          0.0884

Using the Model 4 parameter estimates shown in Table 3, diameters were predicted for representative dbh and total height combinations within the range of sample data and the resultant diameters were used to plot stem profiles. The stem profile plots were consistent with expected results; no abnormalities or illogical trends were detected.

Conclusions

The segmented polynomial models with estimated join points provided an improved description of tree taper when compared to a single quadratic taper model used throughout the stem length. Although this improvement has been shown only for the two loblolly pine data sets available for analysis, we feel that the models should prove useful for other species due to the increased flexibility of segmented models to describe stem taper. We also feel that there are many other potential applications in forestry for segmented polynomial regression techniques.

Literature Cited

BENNETT, F. A., and B. F. SWINDEL. 1972. Taper curves for planted slash pine. USDA Forest Serv Res Note SE-179, 4 p.

BRUCE, D., R. O. CURTIS, and C. VANCOEVERING. 1968. Development of a system of taper and volume tables for red alder. Forest Sci 14:339-350.

FRIES, J., and B. MATÉRN. 1965. On the use of multivariate methods for the construction of tree taper curves. Advisory Group of Forest Statisticians of the I.U.F.R.O. Section 25, Stockholm Conference, October, 1965. Paper No. 9, 32 p.

FULLER, W. A. 1969. Grafted polynomials as approximating functions. Aust J Agric Econ 13:35-46.

GALLANT, A. R. 1974a. The theory of nonlinear regression as it relates to segmented polynomial regressions with estimated join points. Institute of Statistics Mimeograph Series No. 925, 25 p. Raleigh, N.C.

GALLANT, A. R. 1974b. Testing a subset of the parameters of a nonlinear regression model. Institute of Statistics Mimeograph Series No. 943, 24 p. Raleigh, N.C.

GALLANT, A. R. 1975a. Nonlinear regression. Am Stat 29:73-81.

GALLANT, A. R. 1975b. Testing a subset of the parameters of a nonlinear regression model. J Am Stat Assoc 70:927-932.

GALLANT, A. R., and W. A. FULLER. 1973. Fitting segmented polynomial regression models whose join points have to be estimated. J Am Stat Assoc 68:144-147.

HARTLEY, H. O. 1961. The modified Gauss-Newton method for the fitting of non-linear regression functions by least squares. Technometrics 3:269-280.

HUSCH, B., C. I. MILLER, and T. W. BEERS. 1972. Forest mensuration. Ronald Press Company, New York, N.Y. 410 p.

KOZAK, A., and J. H. G. SMITH. 1966. Critical analysis of multivariate techniques for estimating tree taper suggests that simpler methods are best. For Chron 42:458-463.

KOZAK, A., D. D. MUNRO, and J. H. G. SMITH. 1969. Taper functions and their application in forest inventory. For Chron 45:278-283.
