Selecting the right number of knots for B-spline parameterization of the dielectric functions in spectroscopic ellipsometry data analysis

Submitted to Thin Solid Films

D.V. Likhachev*
GLOBALFOUNDRIES Dresden Module One LLC & Co. KG, Wilschdorfer Landstr. 101, D-01109 Dresden, Germany

Keywords: Dielectric function; Parameterization; B-splines; Information criteria; Data analysis; Spectroscopic ellipsometry

B-spline representation of the dielectric functions provides many theoretical and practical benefits for material modeling in spectroscopic ellipsometry. However, the number of knots (and, in general, their locations) determines the actual performance of B-splines in ellipsometric data analysis. On the one hand, too large a number of knots can result in serious overfitting of the experimental data. On the other hand, this number should be sufficient to fit all essential spectral features. Selection of the right number of knots is, in practice, a very subjective and empirically driven task. In this paper, we discuss the choice of the number of knots using three well-established statistical information criteria: the Akaike, corrected Akaike and Bayesian Information Criteria (AIC, AICc and BIC, respectively). The criteria establish a compromise between over- and underfitting of experimental data and allow formalized selection of the right number of knots. The effectiveness of the proposed methodology is illustrated using a few real-data examples.

* Electronic mail: [email protected].

1. Introduction

B-spline (or basis-spline) parameterization is a popular and relatively new, purely mathematical way to express the dielectric function of materials; it was introduced by Johs and Hale [1]. Since then, it has proven very effective for multiple applications in spectroscopic ellipsometry (SE) [2–11]. B-splines are constructed from piecewise polynomial functions connected smoothly at a set of points on the x-axis called “knots”. B-splines of any order can be evaluated using a simple recurrence relation (for more details, see Ref. [12]). One obvious advantage of B-splines is that no physical oscillator parameterization of the optical constants is needed. Therefore, this approach is quite useful in cases where typical physics-based oscillator models cannot be easily applied. Moreover, B-splines allow a Kramers–Kronig consistent formulation and, therefore, guarantee physical validity of the line shape of the dielectric function in the considered spectral range. The authors of Ref. [1] also provided a quite detailed discussion of the practical use of B-splines to parameterize dielectric functions including, in particular, knot termination and possible issues with absorption features outside of the spectral range.

However, together with all the advantages mentioned in Ref. [1], B-spline parameterization poses a non-trivial practical problem: choosing the number (and location) of knots. Usually, the number of knots used in ellipsometric data analysis is determined and tuned empirically, often on the basis of ambiguous decisions, and selecting the optimal number of knots is an extremely complex undertaking. If, for simplicity, we assume equally spaced (equidistant) knots in the B-spline model of the dielectric function, then increasing the number of knots above the optimal value will “overfit” the data: the model starts to fit the noise in the data, and the resulting dielectric function contains artificial features rather than the actual behavior. Since "one picture is worth ten thousand words" [13], we illustrate the problem with two examples of B-spline parameterization for a ~200 Å-thick TiN film deposited on a ~4000 Å-thick SiO2 layer on c-Si (Fig. 1). The parameterization with 53 knots (Fig. 1a) fits, figuratively speaking, “von Neumann’s elephant” [14,15]. In other words, such a parameterization is too flexible, and the few fine structures observed in the real (ε1) and imaginary (ε2) parts of the TiN dielectric function at 2.75, 3.95, 5.0, and 6.0 eV are nothing more than modeling artifacts due to the small knot spacing. Reducing the number of knots clearly eliminates these unrealistic features although, as expected, it degrades the quality of fit, i.e., increases the mean squared error (MSE) value (Fig. 1b). Thus, a feasible compromise needs to be found between a number of knots large enough to fit the data satisfactorily, i.e., to obtain a low MSE value, and the obvious necessity to avoid overfitting the data.

There are different approaches to avoid overfitting in spline parameterization. Recently, Gilliot et al. [16] used a special form of splines, so-called “constrained cubic splines”, with particular constraints on the first-order derivatives to model the dielectric function of zinc oxide (ZnO). With regard to B-splines, it is advantageous to introduce a penalty that restricts the flexibility of the spline parameterization of the dielectric function and prevents overfitting. This is achieved by putting some constraints on the B-spline coefficients. In this kind of penalized splines, known as P-splines [17–21], the degree of the penalty is easily controlled by a smoothing parameter. Owing to the penalty, P-splines possess a very interesting “power of the penalty” property: “The number of B-splines can be (much) larger than the number of observations. The penalty makes the fitting procedure well-conditioned. This should be taken literally: even a thousand splines will fit ten observations without problems.” [20]. In effect, the use of P-splines replaces the problem of knot selection with the task of choosing the smoothing parameter, for which various approaches exist [22–25]. In spite of all these advantages, the lack of available spectroscopic ellipsometry modeling software with implemented P-spline parameterization of the dielectric function currently makes this approach less practical than simply using existing software with built-in B-splines, for instance, CompleteEASE® software from J.A. Woollam Co., Inc. (Lincoln, NE, U.S.A.), and applying some systematic method(s) to select the optimal knot distribution.

Fig. 1. Real (ε1) and imaginary (ε2) parts of the complex dielectric function of ~200 Å-thick TiN film parameterized using B-spline with 53 knots (a) and 11 knots (b) equally spaced in the spectral range 1.25–6.45 eV (or 192–992 nm).



In general, splines can be considered as mathematical models and, therefore, a natural idea is to apply well-known and well-tested statistical model-selection techniques to optimize the placement and number of knots in the B-spline parameterization of the dielectric function. Our objective in this paper is to discuss the application of classical MSE-based approaches, namely, the Akaike information criterion (AIC) [26,27] and the Bayesian information criterion (BIC), also known as the Schwarz information criterion (SIC) [28]; a comprehensive overview of the information criteria (IC) methodology can be found, for instance, in Refs. [29–31]. These methods are extensively used in various disciplines, and recently their effectiveness has also been demonstrated in ellipsometric applications [32,33]. Both criteria were developed as an objective way to compare the performance of different models (often called “candidate models”) in describing given measurement data and to select a model of optimal complexity. For instance, if we want to know which model better fits our ellipsometric data, we can form a set of physics-based (multi-)oscillator models with different numbers and combinations of independent parameters and use the models from this set to fit the data. Typically, the quality of fit improves as the number of fitting parameters in a model increases. The natural question, of course, is when to stop introducing additional model parameters in order to avoid overfitting the data and losing the model’s generality. This is where the AIC and BIC come into play. The information criteria include a penalty term which offsets the gain in fit quality achieved by adding extra parameters (i.e., by increasing the model’s complexity). Thus, the criteria select as the best model the one with the lowest AIC or BIC score; in other words, the model achieving the best fit with a minimal number of parameters.

With regard to model selection for the task of knot choice, we consider, for simplicity's sake, equally spaced knots in the B-spline model of the dielectric function, thus drastically reducing the large number of candidate models (B-splines with different numbers and locations of knots) to choose from. In that case, the complexity of each spline model is characterized by the number of knots only. Such an equidistant knot arrangement can also be justified because in our analysis the sample size (number of data points) is not small and the spectroscopic data are distributed uniformly over the considered wavelength range. The optimal number of knots corresponds to the lowest AIC (or BIC) value. Thus, the IC technique serves as practical and objective guidance, as opposed to intuition and various “rules of thumb”, for selecting the number of equidistant knots for B-spline parameterization. A similar approach, using only the Akaike information criterion to select the number of knots for curve-fitting smoothing, has been applied previously by Atilgan and Bozdogan [34] and Yanagihara and Ohtaki [35]. However, using the AIC alone might not always be a good idea because of the difference between the AIC and BIC penalty terms. In fact, the BIC applies a heavier penalty and, therefore, under some circumstances will select a simpler spline model, i.e., one with fewer knots, than the AIC. Because both criteria were introduced to pursue the same aim of selecting the best model, it has been recommended in the statistical literature to use the AIC and BIC together (see, for instance, Ref. [36]).
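As a minimal sketch of how such a set of candidate models can be enumerated, the snippet below builds equidistant knot positions over the spectral range quoted for Fig. 1 (1.25–6.45 eV). The helper name `equidistant_knots` and the convention that the first and last knots sit exactly on the range limits are assumptions made for illustration; commercial software may terminate the knot sequence differently.

```python
def equidistant_knots(e_min, e_max, n_knots):
    """Return n_knots equally spaced knot positions from e_min to e_max (inclusive).

    Hypothetical helper: placing the end knots exactly on the range limits
    is an illustrative assumption, not a documented software convention.
    """
    if n_knots < 2:
        raise ValueError("need at least two knots")
    step = (e_max - e_min) / (n_knots - 1)
    return [e_min + i * step for i in range(n_knots)]


# With equidistant knots, candidate models differ only in the number of knots.
candidates = {n: equidistant_knots(1.25, 6.45, n) for n in (11, 15, 53)}
for n, knots in candidates.items():
    print(n, "knots, spacing", round(knots[1] - knots[0], 3), "eV")
```

Under this assumption, 53 knots correspond to a spacing of about 0.1 eV and 11 knots to about 0.52 eV, which gives a feeling for how much finer the 53-knot grid of Fig. 1a is.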

2. Theory

We briefly review here the main facts from the theory of spline functions and the model-selection approach based on information criteria.

2.1. Ellipsometric data analysis and B-splines

Proper data analysis in spectroscopic ellipsometry requires modeling of the dielectric functions of materials as functions of photon energy (dispersion models). Conventional analytical physics-based dispersion models are introduced for ε2, the imaginary part of the dielectric function, and the real part is then found by Kramers–Kronig integration. Here we parameterize the ε2 spectrum using B-splines. A spline curve $Y(x)$, which represents ε2, is constructed from weights (spline coefficients) $b_i$ and basis functions $B_i^p(x)$ of degree p:

$$Y(x) = \sum_i b_i B_i^p(x). \tag{1}$$

The functions of the B-spline basis are piecewise polynomials with local support, i.e., they are equal to zero outside a defined interval, and are smoothly connected at the knots. By summing more or fewer $B_i^p(x)$ with different weights we can create curves of various shapes and complexity. As pointed out by John Rice, “One should view B-splines as new elementary or special functions, analogous to sines or Bessel functions.” [37]. The B-spline basis functions of any order can be calculated recursively from the lower-order functions using the Cox–de Boor recursion formula [12]:

$$B_i^0(x) = \begin{cases} 1, & \text{if } t_i \le x < t_{i+1}, \\ 0, & \text{otherwise}, \end{cases} \tag{2a}$$

$$B_i^p(x) = \omega_i^p(x)\, B_i^{p-1}(x) + \left(1 - \omega_{i+1}^p(x)\right) B_{i+1}^{p-1}(x), \quad \text{where } \omega_i^p(x) = \begin{cases} \dfrac{x - t_i}{t_{i+p} - t_i}, & \text{if } t_{i+p} \ne t_i, \\ 0, & \text{otherwise}. \end{cases} \tag{2b}$$

Here $t = (t_i)$ ($i = 0, \ldots, l$) is the knot vector, i.e., a non-decreasing sequence of real numbers of length at least p + 2 ($t_0 \le t_1 \le \ldots \le t_i \le t_{i+1} \le \ldots \le t_l$).
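The recursion of Eqs. (2a) and (2b) translates almost literally into code. The following is a minimal, unoptimized sketch in plain Python; the function name `bspline_basis` and the uniform integer knot vector used in the example are illustrative choices, not part of the original work.

```python
def bspline_basis(i, p, t, x):
    """Value at x of the i-th B-spline basis function of degree p on knots t.

    Direct transcription of the Cox-de Boor recursion, Eqs. (2a) and (2b).
    """
    if p == 0:
        # Eq. (2a): indicator of the half-open interval [t_i, t_{i+1})
        return 1.0 if t[i] <= x < t[i + 1] else 0.0

    def omega(j):
        # omega_j^p(x) = (x - t_j) / (t_{j+p} - t_j), or 0 for a zero-length span
        span = t[j + p] - t[j]
        return (x - t[j]) / span if span != 0 else 0.0

    # Eq. (2b): blend two neighbouring basis functions of degree p - 1
    return (omega(i) * bspline_basis(i, p - 1, t, x)
            + (1.0 - omega(i + 1)) * bspline_basis(i + 1, p - 1, t, x))


# Example: cubic basis functions on equidistant knots 0, 1, ..., 7
knots = [0, 1, 2, 3, 4, 5, 6, 7]
values = [bspline_basis(i, 3, knots, 3.5) for i in range(4)]
print(values, sum(values))  # the four values add up to approximately 1
```

In the interval [3, 4) exactly p + 1 = 4 cubic basis functions are non-zero and they sum to one, which is the property behind the “convex hull” argument used below. The plain recursion recomputes lower-degree values many times; it is meant to mirror the formula rather than to be efficient.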

Fig. 2. B-spline basis functions of different degrees.

The spline coefficients $b_i$ can be readily obtained from a least-squares fit to the SE experimental data using a suitable multi-parameter non-linear optimization algorithm that minimizes a merit function, i.e., a function which quantifies the quality of fit.



Fig. 2 shows some examples of the B-spline basis functions of different degrees. It demonstrates that the basis functions of degree p have support only over p+1 intervals and they are exactly zero outside the finite range [ti, ti+p+1].

Fig. 3. Some examples of arbitrary weighted B-spline basis functions of degree 3 (black dashed lines) and their linear combinations (red solid lines) illustrating various resulting spline curves for equidistant knots.


Fig. 3 gives a few examples of arbitrarily weighted B-spline basis functions of degree 3 and their linear combinations. In these examples all spline coefficients were deliberately chosen to be positive; this guarantees that all spline curves are positive, in compliance with the “convex hull” property [38]. Therefore, B-spline parameterization provides a simple and elegant way to ensure that the imaginary part of the dielectric function satisfies ε2 ≥ 0 [1].

2.2. Commonly used information criteria

Given a set of experimental data $\{y_j\}_{j=1}^{n}$, we fit the measured data points by functional relations $F_{\Psi,\Delta}$ for the ellipsometric angles Ψ and Δ, derived from an appropriate optical model that uses the B-spline parameterization of the complex dielectric function of the particular layer under study. The least-squares fit estimates the unknown spline coefficients $b_i$, as well as other model parameters, by minimizing a (weighted) sum of squared residuals (residual sum of squares, RSS) between the measured $y_{j,M}$ and modeled $F_M(Y_i \mid x_j)$ data:

$$\min \sum_{j=1}^{n} \sum_{M} \left[ y_{j,M} - F_M(Y_i \mid x_j) \right]^2, \tag{3}$$

where $Y_i(b, B^p)$ is the spline function defined by Eq. (1), the index i denotes the interior knots within the analyzed spectral range, and the summation over the subscript M runs over the ellipsometric angles (Ψ and Δ). A set of models with different numbers of knots forms a series of candidate models which can be fitted to the measured data one after another to evaluate the quality of fit. The information criteria are then applied to calculate the respective IC scores and to select the “best” spline model from the set.
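The idea of fitting a series of candidate models one after another can be sketched on a toy problem. The example below is only a stand-in for the procedure described above: it fits a synthetic curve directly by linear least squares with NumPy, whereas the paper fits Ψ and Δ through a non-linear optical model. The function names, the synthetic two-peak spectrum, and the choice of extending the knot sequence by `degree` extra knots on each side are all assumptions made for illustration.

```python
import numpy as np


def basis_matrix(x, knots, degree=3):
    """Matrix whose column i holds the basis function B_i^p sampled at x."""
    x = np.asarray(x, dtype=float)[:, None]
    t = np.asarray(knots, dtype=float)
    # Degree 0: indicator functions of the half-open intervals [t_i, t_{i+1})
    B = ((t[:-1] <= x) & (x < t[1:])).astype(float)
    # Raise the degree step by step (distinct knots assumed: no zero-length spans)
    for d in range(1, degree + 1):
        n_new = len(t) - d - 1
        left = (x - t[:n_new]) / (t[d:d + n_new] - t[:n_new])
        right = (t[d + 1:d + 1 + n_new] - x) / (t[d + 1:d + 1 + n_new] - t[1:n_new + 1])
        B = left * B[:, :n_new] + right * B[:, 1:n_new + 1]
    return B


def fit_rss(x, y, n_knots, degree=3):
    """Least-squares spline fit with n_knots equidistant knots.

    Returns (RSS, number of spline coefficients).
    """
    step = (x[-1] - x[0]) / (n_knots - 1)
    # Assumption: `degree` extra knots on each side so that every data point is covered
    knots = x[0] + step * np.arange(-degree, n_knots + degree)
    B = basis_matrix(x, knots, degree)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    residuals = y - B @ coef
    return float(residuals @ residuals), B.shape[1]


# Synthetic stand-in for a measured spectrum (not ellipsometric data)
rng = np.random.default_rng(0)
x = np.linspace(1.25, 6.45, 105)
y = 4.0 * np.exp(-((x - 3.0) / 0.3) ** 2) + 2.0 * np.exp(-((x - 5.0) / 0.5) ** 2)
y = y + rng.normal(0.0, 0.05, x.size)

# One candidate model per knot count; each yields an RSS to be scored by the IC
rss_by_knots = {n: fit_rss(x, y, n)[0] for n in (5, 10, 20, 40)}
for n, rss in rss_by_knots.items():
    print(n, "knots: RSS =", round(rss, 4))
```

In this toy setting the number of fitted parameters is the number of spline coefficients returned by `fit_rss`; in a real ellipsometric model the layer thicknesses and any other fitted variables would be added to that count.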

Many MSE-based model selection techniques have been proposed and are widely used in various fields. We will use the three most common and well-founded criterion functions, namely, AIC [26,27], BIC [28] and corrected AIC (AICc) [39,40]. In terms of the residual sum of squares, the IC are given by a common functional form

$$\mathrm{IC}(m) = n \ln\!\left(\frac{\mathrm{RSS}(m)}{n}\right) + \varphi\, m, \tag{4}$$

where m is the number of model parameters (the number of interior knots plus other model variables, such as film thicknesses) and φ is the penalty term, which differs between the information criteria:

Akaike information criterion (AIC): φ = 2
Corrected AIC (AICc): φ = 2n / (n − m − 1)
Bayesian information criterion (BIC): φ = ln n

The corrected AIC was introduced for model selection in cases where the number of model parameters m is not small compared to the number of data points n and the ordinary AIC sometimes performs poorly, selecting models with an excessive number of parameters (a simple “rule of thumb”: the sample size is considered small if n/m is less than 40 [29]). If n is large relative to m (n >> m), the correction to the original AIC penalty term becomes negligible and the AICc asymptotically tends to the AIC. Therefore, it is highly recommended in practice to use AICc rather than AIC [29,30]. Because of the differences in their penalty terms, the IC may occasionally disagree on the ranking of candidate models. In such instances, the IC appear to indicate at least upper and lower bounds for the range of suitable models [36].
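The three criteria differ only in the penalty multiplying m, so they can be computed together from a single RSS value. The sketch below implements Eq. (4) in plain Python; the function name `information_criteria` and the RSS value in the usage lines are illustrative assumptions, while n = 105 and m = 18 are the values reported for Example 1 in subsection 3.1.

```python
import math


def information_criteria(rss, n, m):
    """AIC, AICc and BIC from the common form IC = n*ln(RSS/n) + phi*m, Eq. (4)."""
    if n - m - 1 <= 0:
        raise ValueError("AICc requires n > m + 1")
    fit_term = n * math.log(rss / n)
    penalties = {
        "AIC": 2.0,                       # phi = 2
        "AICc": 2.0 * n / (n - m - 1),    # phi = 2n / (n - m - 1)
        "BIC": math.log(n),               # phi = ln n
    }
    return {name: fit_term + phi * m for name, phi in penalties.items()}


# Illustrative call: n and m as in Example 1, RSS chosen arbitrarily
scores = information_criteria(rss=400.0, n=105, m=18)
for name, value in scores.items():
    print(name, round(value, 1))
```

For n = 105 and m = 18 the penalty per parameter is 2 for AIC, 2·105/86 ≈ 2.44 for AICc and ln 105 ≈ 4.65 for BIC, so the three scores of one and the same fit always satisfy AIC < AICc < BIC here.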

Instead of using raw AIC, AICc or BIC values for model selection, other associated measures are usually used [29–31]:

IC differences:

$$\Delta\mathrm{IC}_k = \mathrm{IC}_k - \mathrm{IC}_{\min}, \quad k = 1, \ldots, q, \tag{5}$$

where k indexes the q candidate models under test (for instance, models with different knot numbers) and $\mathrm{IC}_{\min}$ is the score of the “best” candidate model, i.e., the minimum IC score among all candidate models;

IC weights:

$$w_r(\mathrm{IC}) = \frac{\exp(-\Delta\mathrm{IC}_r / 2)}{\sum_{k=1}^{q} \exp(-\Delta\mathrm{IC}_k / 2)}, \qquad \sum_{k=1}^{q} w_k(\mathrm{IC}) = 1. \tag{6}$$

The IC differences can be interpreted as the evidence in support of the kth model in comparison with the “best” model from the set of candidate models. By convention, models, if any, with ΔIC < 2 should be considered almost as good as the “best” model. Models with ΔIC > 10 are very unlikely and thus should be ruled out from consideration. Models with intermediate ΔIC values are significantly less plausible and fall into a grey area. The IC weight, $w_r(\mathrm{IC})$, can be read as the probability that the rth model is the “best” among the q candidate models and, therefore, its numerical value always lies between 0 and 1. The ratio of the IC weights of two different models, $\mathrm{ER} = w_A / w_B$ ($w_B \ne 0$), often called “the evidence ratio”, indicates how much more likely model A is than model B. Hence, the evidence ratio allows a more accurate interpretation of the IC results than the rather rough “cut-off rules” for the IC differences mentioned above (see Example 1 in subsection 3.1).
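Eqs. (5) and (6) are straightforward to evaluate once the IC scores of all candidate models are available. The sketch below, with hypothetical helper names `ic_differences` and `ic_weights`, applies them to the AIC values listed for the TiN example in Table 1.

```python
import math


def ic_differences(scores):
    """Delta IC_k = IC_k - IC_min for every candidate model, Eq. (5)."""
    best = min(scores)
    return [s - best for s in scores]


def ic_weights(scores):
    """IC weights of Eq. (6); they are non-negative and sum to one."""
    rel = [math.exp(-0.5 * d) for d in ic_differences(scores)]
    total = sum(rel)
    return [r / total for r in rel]


# AIC column of Table 1 (TiN example), in the order of the table rows
knots = [35, 27, 21, 18, 15, 14, 12, 11, 10, 9]
aic = [188.2, 182.3, 182.6, 178.1, 176.6, 194.0, 194.8, 201.7, 211.1, 217.1]

weights = ic_weights(aic)
best = knots[weights.index(max(weights))]
for k, d, w in zip(knots, ic_differences(aic), weights):
    print(f"{k:2d} knots  dAIC = {d:5.1f}  w = {w:.4f}")
print("selected number of knots:", best)
```

Because the tabulated AIC values are rounded to one decimal, the weights recomputed this way agree with the tabulated ones only to about two decimal places.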

3. Application examples

In this section we provide a few real-data examples to demonstrate the efficiency and versatility of our approach to selecting the number of equidistant knots for B-spline parameterization of the dielectric functions.

3.1. Example 1: 200 Å TiN / 4000 Å SiO2 / Si

The first example is a set of ellipsometric data taken from an unpatterned test wafer with a TiN/thick SiO2/c-Si film structure. The ellipsometric measurements were performed using a VUV-VASE® GEN-II rotating-analyzer spectroscopic ellipsometer from J.A. Woollam Co., Inc. at angles of incidence (AOIs) of 65°, 70° and 75°. The measured data in the spectral range of 1.25–6.45 eV (992–192 nm; 105 data points) were analyzed with CompleteEASE® software (version 4.86). The optical properties of the TiN layer were described using the “B-spline Layer” dispersion model (with enforced Kramers–Kronig consistency) available in the software. Good starting optical properties for setting up the B-spline model were obtained using various oscillator models in WVASE32® software, which provides a global fit option to avoid false solutions. The optical constants of the silicon substrate and silicon dioxide were taken from the CompleteEASE® database of materials. The optical models with different numbers of equidistant knots in the B-spline layer form the set of candidate models which were used to fit the experimental data. The fitting parameters include the thicknesses of the roughness, TiN and SiO2 layers as well as the B-spline coefficients. The optical properties of the roughness layer were obtained by mixing those of the underlying TiN film with 50% “void” via the Bruggeman effective medium approximation.


Table 1
The AIC, AICc and BIC results for B-spline parameterization of the TiN film.

No. of knots          AIC_k       ΔAIC_k   w_r(AIC_k)       AICc_k      ΔAICc_k  w_r(AICc_k)        BIC_k       ΔBIC_k   w_r(BIC_k)
35                    188.2         11.5       0.0020        233.1         48.5       0.0000        289.0         64.6       0.0000
27                    182.3          5.6       0.0380        207.4         22.8       0.0000        261.9         37.5       0.0000
21                    182.6          5.9       0.0328        197.6         13.0       0.0014        246.3         21.8       0.0000
18                    178.1          1.5       0.2971        189.3          4.7       0.0876        233.9          9.5       0.0087
15                    176.6          0.0       0.6301        184.6          0.0       0.9104        224.4          0.0       0.9839
14                    194.0         17.3       0.0001        201.0         16.4       0.0003        239.1         14.7       0.0006
12                    194.8         18.1       0.0001        200.2         15.6       0.0004        234.6         10.2       0.0060
11                    201.7         25.1       0.0000        206.4         21.8       0.0000        238.9         14.5       0.0007
10                    211.1         34.4       0.0000        215.1         30.5       0.0000        245.6         21.1       0.0000
9                     217.1         40.4       0.0000        220.4         35.9       0.0000        248.9         24.5       0.0000

Note: No. of knots = number of equally spaced knots used for B-spline model k; ΔAIC_k = [AIC_k − min(AIC)], ΔAICc_k = [AICc_k − min(AICc)], and ΔBIC_k = [BIC_k − min(BIC)]; w_r(AIC_k), w_r(AICc_k), and w_r(BIC_k) are the corresponding IC weights (see text for details).

Fig. 4. Determination of the right number of knots: the AIC, AICc and BIC values plotted against the number of fitting parameters in B-spline parameterization of the TiN dielectric function. The thin vertical line indicates minima of the AIC, AICc and BIC functions.



The IC results for the B-spline parameterization of the TiN film, along with the associated measures ΔIC and w(IC), are shown in Table 1 and plotted in Fig. 4 as a function of the number of fitting parameters. Table 1 and Fig. 4 indicate that the minima of all IC measures (AIC, AICc and BIC) occur at m = 18 fitting parameters, which yields i = 15 as the optimal number of knots. The B-spline model with 15 equidistant knots received 63% (AIC), 91% (AICc) and 98% (BIC) of the total weight for the set of candidate models. The fact that the winning model has noticeably lower support in terms of the AIC weight (63% vs. > 90% for AICc and BIC) is a clear indicator that the corrected AIC should be used, since the number of fitting parameters (m ∈ [12..38]) is not small compared to the sample size (n = 105). It is important to note that, according to the cut-off rule stated above, the 18-knot B-spline model is almost as suitable as the best one with 15 knots (since its ΔAIC < 2). However, it is less than half as likely as the 15-knot model, as demonstrated by the evidence ratio: w15-knot(AIC)/w18-knot(AIC) = 0.63/0.30 = 2.1. Thus, this example clearly illustrates that AIC has a tendency to overparameterize a model while AICc (and BIC) tend to select a simpler parameterization. The corresponding real and imaginary parts of the complex dielectric function are shown in Fig. 5 (solid lines). Comparison of the dielectric functions obtained by the B-spline fit and by the point-by-point data inversion procedure [41] shows very good agreement (Fig. 5), which supports our confidence in the obtained results.
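The quoted evidence ratio also follows directly from Eq. (6), because the common normalizing sum cancels when two weights are divided. The short derivation below is added as an illustration; it uses only the ΔAIC values of the 15-knot and 18-knot models listed in Table 1 (0.0 and 1.5).

```latex
\begin{gather*}
\mathrm{ER} = \frac{w_A}{w_B}
  = \frac{\exp(-\Delta\mathrm{IC}_A/2)}{\exp(-\Delta\mathrm{IC}_B/2)}
  = \exp\!\left(\frac{\Delta\mathrm{IC}_B - \Delta\mathrm{IC}_A}{2}\right), \\
\frac{w_{15\text{-knot}}(\mathrm{AIC})}{w_{18\text{-knot}}(\mathrm{AIC})}
  = \exp\!\left(\frac{1.5 - 0.0}{2}\right) = e^{0.75} \approx 2.1 .
\end{gather*}
```

The evidence ratio therefore depends only on the difference between the two IC scores, not on how many other candidate models are in the set.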

Fig. 5. Real (ε1) and imaginary (ε2) parts of the complex dielectric function of ~200 Å-thick TiN film. Solid lines show results based on B-spline parameterization, reported as the best by the IC approach, with 15 equally spaced knots, while circles represent the results obtained via a point-by-point fit. Note that the B-spline parameterized dielectric function does not contain modeling artifacts (compare to Fig. 1).

3.2. Example 2: 250 Å Ta / 3300 Å SiO2 / Si

As a second example, the measured data from a test wafer with a Ta/thick SiO2/c-Si film structure were examined. The ellipsometric spectra were acquired using a rotating-compensator variable-angle spectroscopic ellipsometer M-2000® from J.A. Woollam Co., Inc. at AOIs of 50°–75° in steps of 5°. All fits were performed using CompleteEASE® software (version 4.86) in the spectral range of 1.24–6.21 eV (1000–200 nm; 503 data points). As in the first example, the coefficients of the B-spline model of Ta and the thicknesses of the roughness, Ta and SiO2 layers were fitted to the measured data.

Table 2
The AIC, AICc and BIC results for B-spline parameterization of the Ta film.

No. of knots          AIC_k       ΔAIC_k   w_r(AIC_k)       AICc_k      ΔAICc_k  w_r(AICc_k)        BIC_k       ΔBIC_k   w_r(BIC_k)
39                    437.4         31.6       0.0000        445.3         37.6       0.0000        614.7        120.3       0.0000
33                    426.9         21.1       0.0000        432.6         24.9       0.0000        578.8         84.4       0.0000
28                    419.1         13.4       0.0011        423.3         15.7       0.0004        550.0         55.6       0.0000
25                    414.4          8.6       0.0116        417.8         10.1       0.0056        532.5         38.1       0.0000
22                    411.8          6.0       0.0418        414.5          6.8       0.0288        517.3         22.9       0.0000
20                    411.2          5.5       0.0560        413.5          5.8       0.0475        508.3         13.9       0.0006
18                    405.8          0.0       0.8590        407.7          0.0       0.8834        494.4          0.0       0.6172
17                    412.5          6.7       0.0300        414.2          6.5       0.0337        496.9          2.5       0.1778
15                    421.0         15.3       0.0004        422.4         14.8       0.0006        497.0          2.6       0.1679
14                    437.6         31.9       0.0000        438.9         31.2       0.0000        509.4         15.0       0.0003
13                    433.1         27.4       0.0000        434.3         26.6       0.0000        500.7          6.3       0.0267
12                    439.5         33.7       0.0000        440.4         32.8       0.0000        502.8          8.4       0.0094
11                    473.6         67.8       0.0000        474.4         66.7       0.0000        532.7         38.3       0.0000
10                    497.4         91.7       0.0000        498.2         90.5       0.0000        552.3         57.9       0.0000
9                     539.0        133.3       0.0000        539.7        132.0       0.0000        589.7         95.3       0.0000
8                     579.3        173.5       0.0000        579.8        172.1       0.0000        625.7        131.3       0.0000


Table 2 presents the results for the B-spline parameterization of the Ta film. For better visualization, Fig. 6 depicts the results as a function of the number of fitting parameters. It is evident from Table 2 and Fig. 6 that all criteria select the B-spline model with m = 21 fitting parameters and, therefore, with the optimal number of knots i = 18. It is interesting to note that in this example the AIC and AICc give closely comparable weights of 86% and 88%, respectively. This is due to the substantially larger sample size (n = 503), which makes the correction to the AIC penalty term much less important. Fig. 7 compares the complex dielectric function of the Ta film obtained from the B-spline model (with 18 equidistant knots) and from the point-by-point data inversion procedure.
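The shrinking influence of the correction can be checked directly from the penalty terms of Eq. (4). The worked calculation below is an illustrative check rather than part of the original analysis; it uses only the values of n and m quoted for Examples 1 and 2.

```latex
\begin{gather*}
\mathrm{AICc} - \mathrm{AIC}
  = \left(\frac{2n}{n-m-1} - 2\right) m
  = \frac{2m(m+1)}{n-m-1}, \\
n = 105,\ m = 18:\quad \frac{2\cdot 18\cdot 19}{86} \approx 7.95, \\
n = 503,\ m = 21:\quad \frac{2\cdot 21\cdot 22}{481} \approx 1.92 .
\end{gather*}
```

Both values agree, to within rounding, with the AICc − AIC gaps of the selected models in Table 1 (184.6 − 176.6 = 8.0) and Table 2 (407.7 − 405.8 = 1.9).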

Fig. 6. Determination of the right number of knots: the AIC, AICc and BIC values plotted against the number of fitting parameters in B-spline parameterization of the Ta dielectric function. The thin vertical line indicates minima of the AIC, AICc and BIC functions.

Fig. 8 shows three resulting curves for the real part of the dielectric function, obtained with different numbers of knots (8, 18 and 49). It can be clearly seen that the curve with 49 equidistant knots, which yields the lowest MSE value of 2.010, contains noticeable wiggly artifacts between 3 and 4 eV; the measurement noise is therefore not filtered out efficiently. At the same time, the curve with only 8 knots produces a larger misfit (MSE = 3.028) and, therefore, misses some essential spectral features. The curve generated with the optimal number of 18 knots selected by the information criteria shows no apparent artifacts and yields an acceptable MSE value of 2.061.


Fig. 7. Comparison of the real (ε1) and imaginary (ε2) parts of the complex dielectric function of ~250 Å-thick Ta film obtained from the B-spline model (with 18 equidistant knots) and point-by-point data inversion procedure.

Fig. 8. Comparison of the real parts of the dielectric function for ~250 Å-thick Ta film obtained from the B-spline model with different number of basis knots: 8, 18 (optimal) and 49 knots. For better visibility, the curves for 18 and 49 knots are shifted up vertically by one and two units, respectively.

3.3. Example 3: 250 Å TaN / 3000 Å SiO2 / Si

Finally, the last example is just like the second one, except that the Ta top layer is replaced by TaN and the SiO2 layer is slightly thinner. Here we used the same measurement and analysis setup as in the previous example. The results for the B-spline model of the TaN film are summarized in Table 3 and shown in Fig. 9. Examining the results, one can clearly see that all three criteria choose the B-spline representation with 14 equidistant knots (and a total number of fitting parameters m = 17). Looking at Table 3, we note that there are 72%, 79% and 96% chances, according to AIC, AICc and BIC, respectively, that this number of knots is indeed optimal for our TaN B-spline model. Fig. 10 compares the complex dielectric function of the TaN film obtained from the B-spline model (with 14 equidistant knots) and from the point-by-point data inversion procedure.

Table 3
The AIC, AICc and BIC results for B-spline parameterization of the TaN film.

No. of knots          AIC_k       ΔAIC_k   w_r(AIC_k)       AICc_k      ΔAICc_k  w_r(AICc_k)        BIC_k       ΔBIC_k   w_r(BIC_k)
39                    803.0         37.2       0.0000        810.8         43.7       0.0000        980.2        142.7       0.0000
33                    791.4         25.6       0.0000        797.2         30.1       0.0000        943.4        105.8       0.0000
28                    783.5         17.7       0.0001        787.7         20.6       0.0000        914.3         76.8       0.0000
25                    778.9         13.1       0.0010        782.4         15.3       0.0004        897.1         59.5       0.0000
22                    774.8          9.0       0.0079        777.6         10.5       0.0042        880.3         42.8       0.0000
20                    768.7          2.9       0.1708        771.0          3.9       0.1113        865.8         28.2       0.0000
18                    772.5          6.7       0.0252        774.4          7.4       0.0199        861.1         23.6       0.0000
17                    772.9          7.1       0.0211        774.6          7.5       0.0183        857.3         19.7       0.0001
15                    771.1          5.3       0.0514        772.5          5.4       0.0524        847.1          9.5       0.0083
14                    765.8          0.0       0.7198        767.1          0.0       0.7904        837.6          0.0       0.9623
13                    777.0         11.2       0.0026        778.1         11.1       0.0031        844.5          7.0       0.0292
12                    796.7         30.9       0.0000        797.7         30.6       0.0000        860.0         22.5       0.0000
11                    797.0         31.2       0.0000        797.9         30.8       0.0000        856.1         18.6       0.0001
10                    827.6         61.8       0.0000        828.4         61.3       0.0000        882.5         44.9       0.0000
9                     844.3         78.5       0.0000        844.9         77.9       0.0000        894.9         57.4       0.0000
8                     863.8         98.0       0.0000        864.3         97.3       0.0000        910.2         72.7       0.0000


Fig. 9. Determination of the right number of knots: the AIC, AICc and BIC values plotted against the number of fitting parameters in B-spline parameterization of the TaN dielectric function. The thin vertical line indicates minima of the AIC, AICc and BIC functions.

Fig. 10. Comparison of the real (ε1) and imaginary (ε2) parts of the complex dielectric function of ~250 Å-thick TaN film obtained from the B-spline model (with 14 equidistant knots) and point-by-point data inversion procedure.

4. Conclusions

In this paper, we have applied three widespread and well-established information criteria (AIC, AICc and BIC) to the non-trivial problem of selecting the number of equidistant knots in B-spline parameterization of the dielectric functions. This approach demonstrates great potential as an efficient tool to avoid overfitting of experimental data and to achieve higher model accuracy and predictability. A few application examples illustrated the use of the IC techniques and the interpretation of the obtained results. An important advantage of the IC approach is that it provides objective and unambiguous guidance for choosing the number of knots in B-spline models and, thus, removes possible ad hoc decisions by the ellipsometry user. A possible further enhancement of our approach should also include a procedure for selecting optimal knot locations, especially when the number of knots is small (for instance, employing various “knot deletion and adjustment” techniques [42–44]). A simplified approach may start with manual placement of knots, with a denser distribution near particular spectral features such as critical points, followed by minimization of the information criteria, as suggested here, to select the appropriate number of knots. However, this requires some prior knowledge of the optical properties of the material under test. Otherwise, automatic knot adjustment requires a global search over the entire set of possible knot locations and counts, which rapidly becomes computationally impracticable because of slow convergence to a global minimum and high computational cost.

Acknowledgements

The author wishes to acknowledge Dr. Martin Weisheit (GLOBALFOUNDRIES, Dresden, Germany) for providing the original VUV-VASE ellipsometric data for the TiN film used in the present work.

References

[1] B. Johs, J.S. Hale, Dielectric function representation by B-splines, Phys. Status Solidi A 205 (2008) 715–719.
[2] J.W. Weber, T.A.R. Hansen, M.C.M. van de Sanden, R. Engeln, B-spline parametrization of the dielectric function applied to spectroscopic ellipsometry on amorphous carbon, J. Appl. Phys. 106 (2009) 123503.
[3] J.W. Weber, V.E. Calado, M.C.M. van de Sanden, Optical constants of graphene measured by spectroscopic ellipsometry, Appl. Phys. Lett. 97 (2010) 091904.
[4] S.G. Choi, J. Zúñiga-Pérez, V. Muñoz-Sanjosé, A.G. Norman, C.L. Perkins, D.H. Levi, Complex dielectric function and refractive index spectra of epitaxial CdO thin film grown on r-plane sapphire from 0.74 to 6.45 eV, J. Vac. Sci. Technol. B 28 (2010) 1120–1124.


[5] H.T. Beyene, J.W. Weber, M.A. Verheijen, M.C.M. van de Sanden, M. Creatore, Real time in situ spectroscopic ellipsometry of the growth and plasmonic properties of Au nanoparticles on SiO2, Nano Res. 5 (2012) 513–520.
[6] S.G. Choi, H.Y. Zhao, C. Persson, C.L. Perkins, A.L. Donohue, B. To, A.G. Norman, J. Li, I.L. Repins, Dielectric function spectra and critical-point energies of Cu2ZnSnSe4 from 0.5 to 9.0 eV, J. Appl. Phys. 111 (2012) 033506.
[7] A. Jimenez, D. Lepage, J. Beauvais, J.J. Dubowski, Study of surface morphology and refractive index of dielectric and metallic films used for the fabrication of monolithically integrated surface plasmon resonance biosensing devices, Microelectron. Eng. 93 (2012) 91–94.
[8] E. Agocs, B. Fodor, B. Pollakowski, B. Beckhoff, A. Nutsch, M. Jank, P. Petrik, Approaches to calculate the dielectric function of ZnO around the band gap, Thin Solid Films 571 (2014) 684–688.
[9] L. Kőrösi, A. Scarpellini, P. Petrik, S. Papp, I. Dékány, Sol–gel synthesis of nanostructured indium tin oxide with controlled morphology and porosity, Appl. Surf. Sci. 320 (2014) 725–731.
[10] L.S. Abdallah, S. Zollner, C. Lavoie, A. Ozcan, M. Raymond, Compositional dependence of the optical conductivity of Ni1−xPtx alloys (0 < x < 0.25) determined by spectroscopic ellipsometry, Thin Solid Films 571 (2014) 484–489.
[11] H.G. Tompkins, J.N. Hilfiker, Spectroscopic Ellipsometry: Practical Application to Thin Film Characterization, Momentum Press, New York NY, 2016.
[12] C. de Boor, A Practical Guide to Splines, Revised Edition, Springer-Verlag, New York NY, 2001.
[13] R. Allen, Allen's Dictionary of English Phrases, Penguin Books, London, England, 2008, p. 554.
[14] “…with four parameters I can fit an elephant, and with five I can make him wiggle his trunk.” (attributed to John von Neumann by Enrico Fermi, as quoted by Freeman Dyson in Ref. [15]).
[15] F. Dyson, A meeting with Enrico Fermi, Nature (London) 427 (2004) 297.
[16] M. Gilliot, A. Hadjadj, M. Stchakovsky, Spectroscopic ellipsometry data inversion using constrained splines and application to characterization of ZnO with various morphologies, Appl. Surf. Sci. (2016), http://dx.doi.org/10.1016/j.apsusc.2016.09.106.
[17] P.H.C. Eilers, B.D. Marx, Flexible smoothing with B-splines and penalties, Statist. Sci. 11 (1996) 89–102.


[18] B.D. Marx, P.H.C. Eilers, Generalized linear regression on sampled signals and curves: a P-spline approach, Technometrics 41 (1999) 1–13. [19] P.H.C. Eilers, B.D. Marx, Splines, knots, and penalties, WIREs Comp. Stat. 2 (2010) 637–653. [20] P.H.C. Eilers, B.D. Marx, M. Durbán, Twenty years of P-splines, SORT-Stat. Oper. Res. Trans. 39 (2015) 149–186. [21] D. Ruppert, M.P. Wand, R.J. Carroll, Semiparametric Regression, Cambridge University Press, Cambridge, U.K., 2003. [22] S. Imoto, S. Konishi, Selection of smoothing parameters in B-spline nonparametric regression models using information criteria, Ann. Inst. Statist. Math. 55 (2003) 671–687. [23] G. Kauermann, A note on smoothing parameter selection for penalized spline smoothing, J. Stat. Plan. Inference 127 (2005) 53–69. [24] T. Krivobokova, G. Kauermann, A note on penalized spline smoothing with correlated errors, J. Am. Stat. Assoc. 102 (2007) 1328–1337. [25] C. Wager, F. Vaida, G. Kauermann, Model selection for penalized spline smoothing using Akaike information criteria, Aust. N. Z. J. Stat. 49 (2007) 173–190. [26] H. Akaike, Information theory as an extension of the maximum likelihood principle, in: B.N. Petrov, F. Csaki (Eds.), Proceeding of the Second International Symposium on Information Theory, Akademiai Kiado, Budapest (1973), pp. 267–281; Reprinted in: S. Kotz, N.L. Johnson (Eds.), Breakthroughs in Statistics, Vol. I, Foundations and Basic Theory, Springer, New York NY, 1992, pp. 610–624 and E. Parzen, K. Tanabe, G. Kitagawa (Eds.), Selected Papers of Hirotugu Akaike, Springer, New York NY, 1998, pp. 199–213. [27] H. Akaike, A new look at the statistical model identification, IEEE Trans. Automat. Contr. 19 (1974) 716–723. [28] G. Schwarz, Estimating the dimension of a model, Ann. Statist. 6 (1978) 461–464. [29] K.P. Burnham, D. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd Ed., Springer-Verlag, New York NY, 2002. [30] D.R. Anderson, Model Based Inference in the Life Sciences: A Primer on Evidence, Springer-Verlag, New York NY, 2008. [31] H.T. Banks, S. Hu, W.C. Thompson, Modeling and Inverse Problems in the Presence of Uncertainty, CRC Press, Boca Raton FL, 2014.

[32] D.V. Likhachev, A practitioner’s approach to evaluation strategy for ellipsometric measurements of multilayered and multiparametric thin-film structures, Thin Solid Films 595 (2015) 113–117. [33] D.V. Likhachev, Model selection in spectroscopic ellipsometry data analysis: Combining an information criteria approach with screening sensitivity analysis, Appl. Surf. Sci. (2016), http://dx.doi.org/10.1016/j.apsusc.2016.09.139. [34] T. Atilgan, H. Bozdogan, Selecting the number of knots in fitting cardinal B-splines for density estimation using AIC, J. Japan Statist. Soc. 20 (1990) 179–190. [35] H. Yanagihara, M. Ohtaki, Knot-placement to avoid over fitting in B-spline scedastic smoothing, Commun. Stat. Simulat. 32 (2003) 771–785. [36] J. Kuha, AIC and BIC: comparisons of assumptions and performance, Socio. Meth. Res. 33 (2004) 188–229. [37] J.R. Rice, Numerical Methods in Software and Analysis, 2nd Ed., Academic Press, San Diego CA, 1993, p. 80. [38] P. Dierckx, Curve and Surface Fitting with Splines, Oxford University Press, Inc., New York NY, 1993. [39] N. Sugiura, Further analysis of the data by Akaike’s information criterion and the finite corrections, Commun. Stat. Theor. Meth. 7 (1978) 13–26. [40] C.M. Hurvich, C.-L. Tsai, Regression and time series model selection in small samples, Biometrika 76 (1989) 297–307. [41] R.W. Collins, A.S. Ferlauto, Optical physics of materials, in: H.G. Tompkins, E.A. Irene (Eds.), Handbook of Ellipsometry, William Andrew Publishing/Noyes, Norwich NY, 2005, p. 95. [42] W. Van Loock, G. Pipeleers, J. De Schutter, J. Swevers, A convex optimization approach to curve fitting with B-splines, in: Proceedings of the 18th IFAC World Congress, The International Federation of Automatic Control, Milano, Italy 44 (2011) 2290–2295. [43] Y. Yuan, N. Chen, S. Zhou, Adaptive B-spline knot selection using multi-resolution basis set, IIE Transactions 45 (2013) 1263–1277. [44] H. Kang, F. Chen, Y. Li, J. Deng, Z. Yang, Knot calculation for spline fitting via sparse optimization, Comput. Aided Des. 58 (2015) 179–188.
