B-spline parameterization of the dielectric function and information criteria: the craft of non-overfitting
Dmitriy V. Likhachev
GLOBALFOUNDRIES, Dresden
Agenda
• Spectroscopic Ellipsometry as an Indirect Optical Technique: A Need for Optical Models
• Splines and All That Jazz: When Splines Were Physical Objects
• Splines and All That Jazz: Splines as Mathematical Objects
• B-spline Dispersion Model
• “Beware of von Neumann's Elephants and Spherical Cows”: Data Overfitting and Underfitting
• How Can We Choose the Optimal Number of Knots?
• Illustrative Examples: Application to 250 Å Ta / 3300 Å SiO2 / Si and 200 Å TiN / 4000 Å SiO2 / Si Film Stacks
• Conclusions
Spectroscopic Ellipsometry as an Indirect Optical Technique: A Need for Optical Models
Spectroscopic ellipsometry (SE) is indirect by nature and requires appropriate modeling analysis to interpret optical measurements and extract useful information. In particular, this analysis demands a suitable optical model consisting of the optical properties of the materials (the complex refractive index N = n + ik or the dielectric function ε = ε1 + iε2) and the thickness (or other topographic parameters) of each layer of the structure under investigation.
Ways to represent the dielectric function ε:
• Tabulated data for the n and k of each layer;
• Effective medium approximation;
• Physics-based analytical models;
• Polynomial-based models.
One of the polynomial-based methods is the representation of the dielectric function by B-splines (basis splines), a relatively new approach to modeling optical properties.
Splines and All That Jazz: When Splines Were Physical Objects
Before the widespread use of computer-aided design (CAD) tools, drawing smooth and precise curved objects (for shipbuilding, airplane manufacturing and the like) required a set of templates, called “French curves”, and/or a thin flexible strip of wood or steel, called a “spline”. Hooked weights, called “ducks” or “whales”, hold a spline securely in place for tracing the hull of a ship.
Splines and All That Jazz: Splines as Mathematical Objects
Splines are piecewise polynomial functions of degree p that are joined smoothly and continuously at points whose abscissas are called “knots” or “breakpoints”. Spline interpolation as a mathematical subject was first introduced in the pioneering work of I.J. Schoenberg in 1946 but became a popular tool in various branches of applied mathematics only in the early to mid 1960s. Prof. Isaac Jacob Schoenberg (1903–1990) is generally regarded as the father of splines, particularly on account of his pioneering two-part paper¹,². If there is a father of splines, there must also be a grandfather or great-grandfather. Indeed, Schoenberg himself states that “B-splines were probably known to Hermite and certainly to Peano” and, further, “B-splines were already known to Laplace” (1820). Prof. Arnold Sommerfeld was the first³ to draw the first four spline curves, in his 1904 paper concerned with an especially descriptive derivation of the Gauss-Laplace law of probability.
1 I.J. Schoenberg, Contributions to the problem of approximation of equidistant data by analytic functions. Part A. On the problem of smoothing or graduation. A first class of analytic approximation formulae, Quart. Appl. Math. 4 (1946) 45–99.
2 I.J. Schoenberg, Contributions to the problem of approximation of equidistant data by analytic functions. Part B. On the problem of osculatory interpolation. A second class of analytic approximation formulae, Quart. Appl. Math. 4 (1946) 112–141.
3 P.L. Butzer, M. Schmidt, E.L. Stark, Observations on the history of central B-splines, Arch. Hist. Exact Sci. 39 (1988) 137–156.
Splines and All That Jazz: Splines as Mathematical Objects
Polynomial interpolation
Runge's phenomenon: a problem of oscillation at the edges of an interval when using polynomial interpolation with polynomials of high degree over a set of equidistant interpolation points.
Spline curves are piecewise polynomials
B-spline basis and splines
The functions from the B-spline basis are piecewise polynomial functions of order k (or degree p = k - 1) that are connected at the knots and have only small support.
[Figure] In the left panels the solid line indicates the spline function of a particular order that fits the sine function shown as a dashed line. In the right panels the corresponding fits to its derivative, a cosine function, are shown. The vertical dotted lines are the interior breakpoints or knots defining the spline fits. Source: J.O. Ramsay, B.W. Silverman, Functional Data Analysis, 2nd Ed. (Springer-Verlag, New York, 2005), p. 47.
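Runge's phenomenon can be reproduced numerically in a few lines. The sketch below is only an illustration: the test function 1/(1 + 25x²) on [-1, 1] is the classic textbook choice and is not taken from these slides, and the helper names are hypothetical.

```python
def runge(x):
    """Classic test function for Runge's phenomenon (an assumed example)."""
    return 1.0 / (1.0 + 25.0 * x * x)


def lagrange_interpolate(nodes, values, x):
    """Evaluate the interpolating polynomial through (nodes, values) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(nodes, values)):
        term = yj
        for k, xk in enumerate(nodes):
            if k != j:
                term *= (x - xk) / (xj - xk)
        total += term
    return total


def max_interpolation_error(num_nodes, num_test=1001):
    """Largest |polynomial - function| on [-1, 1] for equidistant nodes."""
    nodes = [-1.0 + 2.0 * j / (num_nodes - 1) for j in range(num_nodes)]
    values = [runge(x) for x in nodes]
    test_points = [-1.0 + 2.0 * j / (num_test - 1) for j in range(num_test)]
    return max(
        abs(lagrange_interpolate(nodes, values, x) - runge(x))
        for x in test_points
    )


# More equidistant nodes means a higher polynomial degree and larger edge oscillations.
errors = {n: max_interpolation_error(n) for n in (5, 11, 21)}
for n, err in errors.items():
    print(f"{n:2d} nodes: max error = {err:.3g}")
```

The maximum error grows with the number of equidistant nodes, which is the behavior that piecewise (spline) representations avoid by keeping the polynomial degree low on each interval.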
Splines and All That Jazz: Splines as Mathematical Objects
Any spline curve S can be uniquely constructed as a linear combination (weighted sum) of localized B-spline basis functions of some degree p:

$$S(x) = \sum_i \omega_i B_i^p(x),$$

where ω_i denotes the weight (spline coefficient) of the ith basis function B_i^p of degree p. The basis functions can be calculated recursively from the lower-degree functions over the knots t_i using the Cox-de Boor recursion formula:

$$B_i^0(x) = \begin{cases} 1, & \text{if } t_i \le x < t_{i+1}, \\ 0, & \text{otherwise,} \end{cases}$$

$$B_i^p(x) = \frac{x - t_i}{t_{i+p} - t_i}\, B_i^{p-1}(x) + \frac{t_{i+p+1} - x}{t_{i+p+1} - t_{i+1}}\, B_{i+1}^{p-1}(x), \quad p \ge 1.$$
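The Cox-de Boor recursion translates almost directly into code. The following is a minimal, unoptimized sketch in pure Python; the function names and the uniform knot vector are hypothetical, and terms with a zero denominator (repeated knots) are treated as zero, which is the usual convention.

```python
def bspline_basis(i, p, t, x):
    """Value at x of the i-th B-spline basis function of degree p on knots t."""
    if p == 0:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    left_den = t[i + p] - t[i]
    right_den = t[i + p + 1] - t[i + 1]
    left = 0.0
    right = 0.0
    if left_den > 0.0:  # skip 0/0 terms that arise from repeated knots
        left = (x - t[i]) / left_den * bspline_basis(i, p - 1, t, x)
    if right_den > 0.0:
        right = (t[i + p + 1] - x) / right_den * bspline_basis(i + 1, p - 1, t, x)
    return left + right


def spline_value(weights, p, t, x):
    """S(x) = sum_i w_i * B_i^p(x); needs len(weights) == len(t) - p - 1."""
    return sum(w * bspline_basis(i, p, t, x) for i, w in enumerate(weights))


# Hypothetical example: cubic (p = 3) basis on eight uniform knots.
knots = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
degree = 3
num_basis = len(knots) - degree - 1  # four basis functions
values = [bspline_basis(i, degree, knots, 3.5) for i in range(num_basis)]
print(values, sum(values))
```

Inside the interval where all p + 1 overlapping basis functions are defined (here between knots 3 and 4), the basis functions sum to one, so a spline with equal weights reproduces a constant.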
B-spline Dispersion Model
B-spline (basis-spline) parameterization, a purely mathematical way to express the dielectric function of materials, was introduced by Johs and Hale in 2008. Since then, it has proven very effective in many spectroscopic ellipsometry applications.
• B-splines allow a Kramers–Kronig consistent formulation and, therefore, guarantee the physical validity of the resulting dielectric function in the considered spectral range.
• B-splines have compact support, which allows the function to be modified in one part of the spectrum without affecting other parts of the spectrum.
• B-splines exhibit the “convex hull” property: to ensure that the imaginary part of the dielectric function satisfies ε2 ≥ 0, we simply need to constrain the weights (spline coefficients) to be non-negative.
• B-splines can serve as a stepping stone to a physics-based parameterization, such as the well-established Tauc-Lorentz oscillator model.
However, alongside these advantages, B-spline parameterization poses a non-trivial practical problem: choosing the number (and location) of knots, on which its actual performance in ellipsometric data analysis strongly depends. Usually, the number of knots is determined and tuned empirically, often through ambiguous decisions, and selecting the optimal number of knots is an extremely complex undertaking.
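Two of the properties listed above, compact support and positivity under non-negative weights, can be checked numerically. The sketch below is illustrative only: the knot vector, weights and grid are made-up values in arbitrary units, not a real ε2 spectrum, and the Kramers–Kronig step is not shown.

```python
def basis(i, p, t, x):
    """Cox-de Boor recursion for the i-th degree-p B-spline on knots t."""
    if p == 0:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    value = 0.0
    if t[i + p] > t[i]:
        value += (x - t[i]) / (t[i + p] - t[i]) * basis(i, p - 1, t, x)
    if t[i + p + 1] > t[i + 1]:
        value += (t[i + p + 1] - x) / (t[i + p + 1] - t[i + 1]) * basis(i + 1, p - 1, t, x)
    return value


def eps2(weights, p, t, x):
    """Hypothetical eps2(x) modeled as a weighted sum of B-spline basis functions."""
    return sum(w * basis(i, p, t, x) for i, w in enumerate(weights))


p = 3
knots = [float(k) for k in range(12)]                 # uniform knots, arbitrary units
weights = [0.0, 0.5, 2.0, 1.0, 0.3, 0.0, 0.7, 0.1]    # all >= 0 (made-up values)
grid = [j / 10 for j in range(110)]

# Positivity: non-negative weights can never produce a negative eps2.
lowest = min(eps2(weights, p, knots, x) for x in grid)

# Compact support: changing weight 2 only affects x between knots[2] and knots[2 + p + 1].
bumped = list(weights)
bumped[2] += 1.0
changed = [x for x in grid
           if eps2(bumped, p, knots, x) != eps2(weights, p, knots, x)]
print(lowest, min(changed), max(changed))
```

Every point where the curve changes lies between the second and sixth knots of this example, which is the locality that lets one spectral feature be adjusted without disturbing the rest of the spectrum.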
“Beware of von Neumann's Elephants and Spherical Cows”: Data Overfitting and Underfitting
“One picture is worth ten thousand words”
In the simplest case of uniform B-splines, i.e., splines with equally spaced knots, it is intuitively clear that too few knots in the considered wavelength range may not be enough to fit all essential spectral features (underfitting). Conversely, increasing the number of knots beyond the optimal value can result in significant overfitting of the experimental data: the B-spline model fits the experimental data very well, but the spline interpolation itself is poor and more prone to artificial bending. As one can see, the B-spline model with 35 knots is too flexible and over-parameterizes the experimental data, resulting in a few clear modeling artifacts at ~200, 245, 320, 440, and 730 nm due to small knot spacing. Reducing the number of knots suppresses this “wiggly” overfitting behavior, although not without some deterioration in the quality of fit. Thus, we have to come up with a “good enough solution” to optimize the fit or, in other words, establish a rule that balances under- and overfitting, i.e., fits the measured ellipsometric data appropriately while using the fewest knots.
[Figure: TiN, B-spline fits with 35 knots and 14 knots]
“…with four parameters I can fit an elephant, and with five I can make him wiggle his trunk.” (attributed to John von Neumann by Enrico Fermi, as quoted by Freeman Dyson in F. Dyson, A meeting with Enrico Fermi, Nature (London) 427 (2004) 297).
Spherical Cow: Too simple model
How Can We Choose the Optimal Number of Knots?
In general, splines can be considered as mathematical models and, therefore, a natural idea is to apply well-known and well-tested statistical model-selection techniques, namely information criteria (IC), to optimize the placement and number of knots in the B-spline parameterization of the dielectric function. Our objective is to use the classical MSE-based approaches, namely, the Akaike information criterion (AIC) [26,27] and the Bayesian information criterion (BIC). These methods are used extensively in various disciplines, and their effectiveness has recently been demonstrated in ellipsometric applications as well⁴,⁵. In terms of the residual sum of squares (RSS), the IC share a common functional form:

$$\mathrm{IC}(m) = n \ln\frac{\mathrm{RSS}(m)}{n} + \mu m,$$

where m is the number of model parameters (the number of interior knots plus other model variables, such as film thicknesses), n is the number of data points, and μ is a penalty coefficient that differs between the information criteria:

Information criterion                     Penalty coefficient μ
Akaike information criterion (AIC)        2
Corrected AIC (AICc)                      2n / (n – m – 1)
Bayesian information criterion (BIC)      ln n

4 D.V. Likhachev, A practitioner’s approach to evaluation strategy for ellipsometric measurements of multilayered and multiparametric thin-film structures, Thin Solid Films 595 (2015) 113–117.
5 D.V. Likhachev, Model selection in spectroscopic ellipsometry data analysis: Combining an information criteria approach with screening sensitivity analysis, Appl. Surf. Sci. (2016), http://dx.doi.org/10.1016/j.apsusc.2016.09.139.
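The common functional form of the information criteria can be written as a short helper. This is a sketch under stated assumptions rather than the author's implementation: the function names are hypothetical, n is taken to be the number of data points, and the RSS of each candidate fit is assumed to come from a separate fitting step that is not shown.

```python
import math


def information_criteria(rss, n, m):
    """Return AIC, AICc and BIC as n * ln(RSS / n) + mu * m.

    rss: residual sum of squares of the fitted model
    n:   number of data points (assumed meaning of n)
    m:   number of model parameters (interior knots plus other variables)
    """
    if n - m - 1 <= 0:
        raise ValueError("AICc requires n > m + 1")
    goodness = n * math.log(rss / n)
    return {
        "AIC": goodness + 2.0 * m,
        "AICc": goodness + (2.0 * n / (n - m - 1)) * m,
        "BIC": goodness + math.log(n) * m,
    }


def select_model(rss_by_m, n, criterion="AIC"):
    """Pick the parameter count m whose fit minimizes the chosen criterion.

    rss_by_m maps each candidate m to the RSS of its fit (hypothetical input).
    """
    return min(
        rss_by_m,
        key=lambda m: information_criteria(rss_by_m[m], n, m)[criterion],
    )
```

Because ln n exceeds 2 once n is larger than about 7, BIC penalizes additional knots more heavily than AIC for realistic spectra, while AICc approaches AIC as n grows relative to m.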
Illustrative Examples: Application to 250 Å Ta / 3300 Å SiO2 / Si and 200 Å TiN / 4000 Å SiO2 / Si Film Stacks
[Figure panels: Ta, TiN, Ta]
The curve with 49 equidistant knots, which yields the lowest MSE value of 2.010, contains noticeable wiggly artifacts between 3 and 4 eV; measurement noise is therefore not filtered out efficiently. The curve with only 8 knots produces a higher misfit (MSE = 3.028) and misses some essential spectral features. The curve generated with 18 knots, the optimal number selected by AIC and BIC, shows no apparent artifacts and yields an acceptable MSE value of 2.061.
See more details in: D.V. Likhachev, Selecting the right number of knots for B-spline parameterization of the dielectric functions in spectroscopic ellipsometry data analysis, submitted to Thin Solid Films (2017).
Conclusions
We applied three widespread and well-established information criteria (AIC, AICc and BIC) to the non-trivial problem of selecting the number of equidistant knots in the B-spline parameterization of dielectric functions. An important advantage of the IC approach is that it provides objective and unambiguous guidance for choosing the number of knots in B-spline models and, thus, removes possible ad hoc decisions by an ellipsometry user.
A possible enhancement of this approach would also include a procedure for selecting optimal knot locations, especially when the number of knots is small, and we intend to explore various “knot deletion and adjustment” techniques in future work. A simplified approach may start with manual placement of knots, distributed more densely near particular spectral features such as critical points, followed by minimization of the information criteria, as suggested here, to select an appropriate number of knots.
Thank you very much for your attention!
© 2017 GLOBALFOUNDRIES Inc. All rights reserved.