
Learning to Solve Least Squares Curve Method by Using Spreadsheets

by Sami M. Khayat, PhD and Craig N. Refugio, PhD
Negros Oriental State University, Dumaguete City, Philippines

Abstract

Least squares curve (or line) fitting is a mathematical procedure for finding the best-fitting curve to a given set of points by minimizing the sum of the squares of the offsets (the residuals) of the points from the curve or line. The sum of the squares of the offsets is used instead of the offsets' absolute values because this allows the residuals to be treated as a continuous differentiable quantity. However, because squares of the offsets are used, outlying points can have a disproportionate effect on the fit, a property that may or may not be desirable depending on the problem at hand. In this paper, all of the aforementioned quantities are calculated using spreadsheets, and the procedures are presented step by step.

INTRODUCTION

This study was conducted with a group of 25 Bachelor of Secondary Education students majoring in Mathematics during school year 2011-2012 at the College of Education, Main Campus 1, Negros Oriental State University, Philippines, using a one-group pretest-posttest design. The study aimed to use spreadsheets to solve least squares problems and to determine whether students would gain significant knowledge of least squares by using spreadsheets. We conceptualized and developed the teaching of least squares through spreadsheets in five parts: linear, exponential, power law, logarithmic, and applications. In discussing each part, the manual computations using the different rules were emphasized first, before spreadsheets were used, so that students would appreciate the software in doing the laborious and often repetitive tasks.

LEAST SQUARES FITTING

Least squares fitting is a mathematical procedure for finding the best-fitting curve to a given set of points by minimizing the sum of the squares of the offsets (the residuals) of the points from the curve. The sum of the squares of the offsets is used instead of the offsets' absolute values because this allows the residuals to be treated as a continuous differentiable quantity. However, because squares of the offsets are used, outlying points can have a disproportionate effect on the fit, a property that may or may not be desirable depending on the problem at hand. The following are the different types of least squares fitting:

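I. Least Squares Fitting - Linear

Figure 1 (axes: x-axis, y-axis)

$$y = a + bx$$

$$\begin{bmatrix} N & \sum x \\ \sum x & \sum x^2 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \sum y \\ \sum xy \end{bmatrix} \tag{1}$$

or

$$a N + b \sum x = \sum y \tag{2}$$

$$a \sum x + b \sum x^2 = \sum xy \tag{3}$$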

Here, N is the total number of points. Solve equations (2) and (3) to obtain the values of the constants a and b. Note: the values of x and y are given.

Example 1. Given the following points on the plane, find the best line using least squares fitting.

x = 10, 20, 30, 40, 50, 60
y = 5, 10, 15, 20, 25, 30

By using MS-Excel, we obtain an intercept (a) equal to zero and a slope (b) equal to 0.5. The tables and figures below show the functions.

Figure 2
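As a cross-check on the spreadsheet result for Example 1, the normal equations (2) and (3) can also be solved directly in a few lines of code. The sketch below is in Python, which the paper itself does not use, and the function name `fit_line` is a hypothetical helper introduced only for illustration.

```python
def fit_line(xs, ys):
    """Solve normal equations (2) and (3) for the intercept a and slope b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Determinant of the 2x2 coefficient matrix in equation (1)
    det = n * sum_xx - sum_x * sum_x
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b = (n * sum_xy - sum_x * sum_y) / det
    a = (sum_y - b * sum_x) / n
    return a, b


# Data from Example 1
x = [10, 20, 30, 40, 50, 60]
y = [5, 10, 15, 20, 25, 30]

a, b = fit_line(x, y)
print(f"a = {a:.4f}, b = {b:.4f}")
```

For this data the sums are N = 6, Σx = 210, Σy = 105, Σx² = 9100, and Σxy = 4550, which give b = 0.5 and a = 0, in agreement with the values obtained in MS-Excel.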

II. Least Squares Fitting - Exponential

Figure 3

To fit a functional form

$$y = a e^{bx} \tag{4}$$

take the logarithm of both sides:

$$\ln y = \ln a + b x \tag{5}$$

The best-fit values are then

$$\begin{bmatrix} N & \sum x \\ \sum x & \sum x^2 \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} \sum \ln y \\ \sum x \ln y \end{bmatrix}$$

Solving for a and b,

$$A N + B \sum x = \sum \ln y \tag{6}$$

$$A \sum x + B \sum x^2 = \sum x \ln y \tag{7}$$

where $b \equiv B$ and $a \equiv \exp(A)$.

Example 2. Least Squares Fitting - Exponential

Given the following data, fit $y = a e^{bx}$.

x = 10, 20, 30, 40, 50, 60
y = 5.437, 14.778, 40.171, 109.196, 296.826, 806.858

| n | x | y | ln(y) | x^2 | x.ln(y) | ycal = a.e^(b.x) |
|---|---|---|---|---|---|---|
| 1 | 10 | 5.437 | 1.6932 | 100 | 16.93 | 5.43656366 |
| 2 | 20 | 14.778 | 2.6931 | 400 | 53.86 | 14.7781122 |
| 3 | 30 | 40.171 | 3.6931 | 900 | 110.79 | 40.1710738 |
| 4 | 40 | 109.196 | 4.6931 | 1600 | 187.73 | 109.1963 |
| 5 | 50 | 296.826 | 5.6931 | 2500 | 284.66 | 296.826318 |
| 6 | 60 | 806.858 | 6.6931 | 3600 | 401.59 | 806.857587 |
| Σ | 210 | 1273.266 | 25.1589 | 9100 | 1055.56 | |

Note: the values of x and y are given.

Answer: ln(a) = 0.693147, a = 2, b = 0.1
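The exponential fit of Example 2 can be reproduced outside a spreadsheet by taking ln(y) and solving equations (6) and (7). This is a minimal Python sketch rather than the authors' MS-Excel procedure; `fit_exponential` is a hypothetical name, and the code assumes every y value is positive so that the logarithm is defined.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by solving equations (6) and (7).

    A = ln(a) and B = b are the unknowns of the linearized problem.
    """
    n = len(xs)
    ln_ys = [math.log(y) for y in ys]  # assumes every y > 0

    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    sum_lny = sum(ln_ys)
    sum_x_lny = sum(x * ly for x, ly in zip(xs, ln_ys))

    det = n * sum_xx - sum_x ** 2
    B = (n * sum_x_lny - sum_x * sum_lny) / det
    A = (sum_lny - B * sum_x) / n
    return math.exp(A), B  # a = exp(A), b = B


# Data from Example 2
x = [10, 20, 30, 40, 50, 60]
y = [5.437, 14.778, 40.171, 109.196, 296.826, 806.858]

a, b = fit_exponential(x, y)
print(f"a = {a:.3f}, b = {b:.4f}")
```

Because the y values are rounded to three decimals, the fitted constants come out very close to, but not exactly, a = 2 and b = 0.1. Note also that fitting ln(y) minimizes the squared residuals of the logarithms, not of y itself.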

III. Least Squares Fitting - Power Law

Given a function of the form

$$y = a x^{b} \tag{8}$$

$$\ln y = \ln a + b \ln x \tag{9}$$

least squares fitting gives the coefficients as

$$\begin{bmatrix} N & \sum \ln x \\ \sum \ln x & \sum (\ln x)^2 \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} \sum \ln y \\ \sum \ln x \ln y \end{bmatrix}$$

Solving for a and b,

$$A N + B \sum \ln x = \sum \ln y \tag{10}$$

$$A \sum \ln x + B \sum (\ln x)^2 = \sum \ln x \ln y \tag{11}$$

where $b \equiv B$ and $a \equiv \exp(A)$.

Example 3. Least Squares Fitting - Power Law

x = 1, 2, 3, 4, 5, 6
y = 3, 12, 27, 48, 75, 108

Note: the values of x and y are given. Solving using MS Excel:

| n | x | y | ln(y) | ln(x) | ln(x).ln(y) | ln(x).ln(x) | ycal = a.x^b |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 3.00 | 1.0986 | 0.0000 | 0.0000 | 0.0000 | 3.00 |
| 2 | 2 | 12.00 | 2.4849 | 0.6931 | 1.7224 | 0.4805 | 12.00 |
| 3 | 3 | 27.00 | 3.2958 | 1.0986 | 3.6208 | 1.2069 | 27.00 |
| 4 | 4 | 48.00 | 3.8712 | 1.3863 | 5.3666 | 1.9218 | 48.00 |
| 5 | 5 | 75.00 | 4.3175 | 1.6094 | 6.9487 | 2.5903 | 75.00 |
| 6 | 6 | 108.00 | 4.6821 | 1.7918 | 8.3893 | 3.2104 | 108.00 |
| Σ | 21 | 273.00 | 19.7502 | 6.5793 | 26.0479 | 9.4099 | |

Answer: ln(a) = 1.098612, a = 3, b = 2
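The power-law case differs from the exponential one only in that both x and y are log-transformed before equations (10) and (11) are solved. The following Python sketch illustrates this for the data of Example 3; `fit_power_law` is a hypothetical name, and the code assumes all x and y values are positive.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by solving equations (10) and (11).

    The unknowns of the linearized problem are A = ln(a) and B = b.
    """
    ln_xs = [math.log(x) for x in xs]  # assumes every x > 0
    ln_ys = [math.log(y) for y in ys]  # assumes every y > 0
    n = len(xs)

    s_lx = sum(ln_xs)
    s_ly = sum(ln_ys)
    s_lxlx = sum(lx * lx for lx in ln_xs)
    s_lxly = sum(lx * ly for lx, ly in zip(ln_xs, ln_ys))

    det = n * s_lxlx - s_lx ** 2
    B = (n * s_lxly - s_lx * s_ly) / det
    A = (s_ly - B * s_lx) / n
    return math.exp(A), B  # a = exp(A), b = B


# Data from Example 3 (these points lie exactly on y = 3 * x**2)
x = [1, 2, 3, 4, 5, 6]
y = [3, 12, 27, 48, 75, 108]

a, b = fit_power_law(x, y)
print(f"a = {a:.4f}, b = {b:.4f}")
```

Since the data lie exactly on the curve, the fit recovers a = 3 and b = 2 up to floating-point rounding, which matches the ycal column of the table.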

IV. Least Squares Fitting - Logarithmic

Given a function of the form

$$y = a + b \ln x \tag{12}$$

the coefficients can be found from least squares fitting as

$$\begin{bmatrix} N & \sum \ln x \\ \sum \ln x & \sum (\ln x)^2 \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} \sum y \\ \sum y \ln x \end{bmatrix}$$

Solving for a and b,

$$A N + B \sum \ln x = \sum y \tag{13}$$

$$A \sum \ln x + B \sum (\ln x)^2 = \sum y \ln x \tag{14}$$

where $b \equiv B$ and $a \equiv A$.

Example 4. Least Squares Fitting - Logarithmic, $y = a + b \ln x$

Note: the values of x and y are given. Solving using MS Excel:

| n | x | y | ln(x) | y.ln(x) | ln(x).ln(x) | ycal = a + b.ln(x) |
|---|---|---|---|---|---|---|
| 1 | 10 | 9.6052 | 2.3026 | 22.1168 | 5.3019 | 9.6052 |
| 2 | 20 | 10.9915 | 2.9957 | 32.9276 | 8.9744 | 10.9915 |
| 3 | 30 | 11.8024 | 3.4012 | 40.1423 | 11.5681 | 11.8024 |
| 4 | 40 | 12.3778 | 3.6889 | 45.6602 | 13.6078 | 12.3778 |
| 5 | 50 | 12.8240 | 3.9120 | 50.1678 | 15.3039 | 12.8240 |
| 6 | 60 | 13.1887 | 4.0943 | 53.9991 | 16.7637 | 13.1887 |
| Σ | 210 | 70.7896 | 20.3948 | 245.0138 | 71.5199 | |

Answer: a = 5 and b = 2
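In the logarithmic model only x is transformed, so a and b are obtained directly from equations (13) and (14) without exponentiating the intercept. The Python sketch below applies this to the Example 4 data; `fit_logarithmic` is a hypothetical name, and the code assumes every x value is positive.

```python
import math


def fit_logarithmic(xs, ys):
    """Fit y = a + b * ln(x) by solving equations (13) and (14)."""
    ln_xs = [math.log(x) for x in xs]  # assumes every x > 0
    n = len(xs)

    s_lx = sum(ln_xs)
    s_y = sum(ys)
    s_lxlx = sum(lx * lx for lx in ln_xs)
    s_ylx = sum(y * lx for y, lx in zip(ys, ln_xs))

    det = n * s_lxlx - s_lx ** 2
    b = (n * s_ylx - s_lx * s_y) / det
    a = (s_y - b * s_lx) / n  # here a = A, with no back-transformation
    return a, b


# Data from Example 4
x = [10, 20, 30, 40, 50, 60]
y = [9.6052, 10.9915, 11.8024, 12.3778, 12.8240, 13.1887]

a, b = fit_logarithmic(x, y)
print(f"a = {a:.3f}, b = {b:.3f}")
```

The y values are rounded to four decimals, so the result is close to, rather than exactly, a = 5 and b = 2.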

V. Application

Example from Chemistry: First-Order Reaction

The following data were collected for a first-order chemical reaction at constant temperature.

| n | Time (min) | [A]t |
|---|---|---|
| 1 | 0 | 2.719 |
| 2 | 1 | 2.612 |
| 3 | 2 | 2.586 |
| 4 | 3 | 2.509 |
| 5 | 4 | 2.459 |
| 6 | 10 | 2.138 |
| 7 | 15 | 1.855 |
| 8 | 20 | 1.664 |
| 9 | 25 | 1.448 |
| 10 | 30 | 1.276 |

For a first-order reaction,

$$[A]_t = [A]_0 \, e^{-kt} \tag{15}$$

This shows that the amount at any time t follows a negative exponential function of time. The initial amount, [A]0, is constant for a given experiment. A negative exponential function has its maximum value at t = 0 and declines monotonically and asymptotically toward zero. The figure below shows the general trend for [A]t over time. The speed with which the amount [A]t approaches zero is dictated by the rate constant k. Taking the natural logarithm of both sides, equation (15) can be rewritten as:

$$\ln [A]_t = \ln [A]_0 - kt \tag{16}$$

[Figure: plot of ln[A]t versus Time (Min.), with a linear trendline labeled Linear (ln[A]t)]

The plot above shows ln[A]t against time for the first-order reaction; the slope of the fitted line is -k. Using MS-Excel, we obtain a slope of -0.02496, so k = 0.02496 per minute.
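The slope reported for the first-order reaction can be checked by regressing ln[A]t on time, as in equation (16). The sketch below does this in Python instead of MS-Excel; `fit_first_order` is a hypothetical name, and returning the estimate of [A]0 is an extra step that the source does not carry out.

```python
import math


def fit_first_order(times, conc):
    """Regress ln[A]t on t (equation 16) and return (k, A0).

    The slope of the fitted line is -k and the intercept is ln[A]0.
    """
    n = len(times)
    ln_c = [math.log(c) for c in conc]  # assumes every concentration > 0

    s_t = sum(times)
    s_tt = sum(t * t for t in times)
    s_l = sum(ln_c)
    s_tl = sum(t * l for t, l in zip(times, ln_c))

    slope = (n * s_tl - s_t * s_l) / (n * s_tt - s_t ** 2)
    intercept = (s_l - slope * s_t) / n
    return -slope, math.exp(intercept)


# Data from the first-order reaction table
time_min = [0, 1, 2, 3, 4, 10, 15, 20, 25, 30]
conc_a = [2.719, 2.612, 2.586, 2.509, 2.459, 2.138, 1.855, 1.664, 1.448, 1.276]

k, a0 = fit_first_order(time_min, conc_a)
print(f"slope = {-k:.5f}, k = {k:.5f} per minute, [A]0 = {a0:.3f}")
```

The computed slope agrees with the spreadsheet value of about -0.02496 to the precision quoted in the text.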

RESULTS

Before starting the formal instruction of the course, a 10-item Likert-type pretest (0-Has No Knowledge, 1-Has Basic Knowledge, and 2-Has Advanced Knowledge) was conducted to find out whether the students had prior knowledge of the five aforementioned parts of the designed course. At the end of the course, a posttest (the same items as the pretest, randomly reordered) was administered. The pretest and posttest were designed so that students answered them in written form, and their answers were cross-checked during the hands-on activities for the pretest and posttest. We matched their written ratings against the ratings that we gave during the pretest and posttest hands-on activities. Results indicated perfect matching between the two types of ratings. Table 1.0 shows the knowledge levels of the 25 subjects of this study.

Table 1.0 Pretest and Posttest Knowledge Levels in Least Squares Curve through Spreadsheets (n=25)

| Type of Test | Mean | Standard Deviation | Description |
|---|---|---|---|
| Pretest | 0.72 | 0.46 | Has Basic Knowledge |
| Posttest | 1.92 | 0.28 | Has Advanced Knowledge |

Legend:
0.00-0.66 Has No Knowledge
0.67-1.33 Has Basic Knowledge
1.34-2.00 Has Advanced Knowledge

As reflected in Table 1.0, the pretest mean score (0.72) indicated that the 25 subjects of the study had basic knowledge of the five parts of the course, from linear fitting to applications. This was substantiated when we let the students browse and operate least squares using spreadsheets in accordance with the five parts of the course: 17 of the 25 demonstrated basic knowledge and the remaining 8 had no knowledge at all. However, the posttest mean score (1.92) revealed that, by the end of the course, the students had gained advanced knowledge. This was further substantiated when we let the students perform tasks according to the five parts of the course: 23 of the 25 demonstrated advanced knowledge, while the remaining two showed basic knowledge. The standard deviations showed that the pretest scores were more variable than the posttest scores. To determine the significance of the difference between the pretest and posttest mean scores, a dependent t-test was performed; the results are shown in Table 2.0.

Table 2.0 Test of Difference Between the Pretest and Posttest Knowledge Level in Least Squares Curve through Spreadsheets (n=25)

| Type of Test | n | Mean | Standard Deviation | Mean Difference | Computed t | Degrees of Freedom | p-value at α=0.05 | Interpretation |
|---|---|---|---|---|---|---|---|---|
| Pretest | 25 | 0.72 | 0.46 | -1.20 | -14.70 | 24 | 0.000 | Significant |
| Posttest | 25 | 1.92 | 0.28 | | | | | |

A dependent t-test comparing the pretest and posttest mean scores found a significant difference between the two means (t(24) = -14.70, p < 0.05). The pretest mean (M = 0.72, SD = 0.46) was significantly lower than the posttest mean (M = 1.92, SD = 0.28). This means that the 25 subjects gained significant knowledge of least squares curves through spreadsheets at the 5% level of significance, moving from the basic to the advanced knowledge level.
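For reference, the dependent (paired) t statistic is the mean of the paired differences divided by its standard error, with n - 1 = 24 degrees of freedom. The symbols $\bar{d}$ (mean difference) and $s_d$ (standard deviation of the differences) are introduced here for illustration. The source does not report $s_d$, so the value below is only back-calculated from the reported mean difference, t, and n, and should be read as an implied figure, not a reported one.

```latex
t = \frac{\bar{d}}{s_d / \sqrt{n}},
\qquad
s_d = \frac{\bar{d}\,\sqrt{n}}{t}
    = \frac{(-1.20)\,\sqrt{25}}{-14.70}
    \approx 0.41
```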

