modeling and evaluating the distribution of the output

September 5, 2017

14:40

WSPC Proceedings - 9in x 6in

Witkovsky-etal˙AMCTM2017-FullPaper


MODELING AND EVALUATING THE DISTRIBUTION OF THE OUTPUT QUANTITY IN MEASUREMENT MODELS WITH COPULA DEPENDENT INPUT QUANTITIES

VIKTOR WITKOVSKÝ
Institute of Measurement Science, Slovak Academy of Sciences, Bratislava, Slovakia
E-mail: [email protected]

GEJZA WIMMER
Mathematical Institute, Slovak Academy of Sciences, Bratislava, Slovakia
Faculty of Natural Sciences, Matej Bel University, Banská Bystrica, Slovakia
E-mail: [email protected]

ZUZANA ĎURIŠOVÁ
Slovak Institute of Metrology, Bratislava, Slovakia
E-mail: [email protected]

STANISLAV ĎURIŠ, RUDOLF PALENČÁR AND JAKUB PALENČÁR
Faculty of Mechanical Engineering, Slovak University of Technology, Bratislava, Slovakia
E-mail: [email protected], [email protected], and [email protected]

Proper uncertainty analysis in metrology frequently requires a model with a nonlinear measurement equation $Y = f(X)$, where $X = (X_1, \dots, X_N)$, and a non-trivial joint multivariate distribution of the input quantities $X$. This is exactly the situation suited to the approach suggested in the GUM Supplements 1 and 2. However, the only joint multivariate distributions explicitly considered in the guides for such situations are the multivariate Gaussian distribution and the multivariate Student's t distribution, followed by just a brief comment on the possibility of using copula-type multivariate distributions to assign the joint PDF to $X$. In this paper we emphasize the advantage of using the Characteristic Function Approach for sampling from the distribution of the output quantity $Y$, based on the joint copula-type distribution of the inputs $X$, with given convolution-type marginal distributions of the input quantities $X_i$ specified by using and combining the Type A and/or Type B evaluation methods. The suggested modeling and evaluation approach is illustrated by a simple example.

Keywords: Copula; Monte Carlo Method; Characteristic Function Approach.


1. Introduction

Based on the GUM [1] and its Supplements [2,3], evaluation of the measurement uncertainty should be based on a correct specification of the measurement model and on the state-of-knowledge distribution of the input quantities. Proper evaluation of the measurement uncertainty and/or the coverage intervals typically requires evaluation of the probability density function (PDF), the cumulative distribution function (CDF) and/or the quantile function (QF) of a random variable $Y$ associated with the measurand (here, the random variable $Y$ represents the distribution of possible values attributed to the measurand, based on current knowledge).

A mathematical model of measurement of a single (scalar) quantity can be expressed as a functional relationship,

$$ Y = f(X), \tag{1} $$

where $Y$ is the scalar output quantity and $X$ represents the vector of $N$ input quantities $(X_1, \dots, X_N)$. Here, each $X_i$ is regarded as a random variable with possible values $\xi_i$, reasonably attributed to the $i$-th input quantity based on current knowledge, and $Y$ is a random variable with possible values $\eta$, consequently attributed to the measurand. The joint PDF for $X$ is denoted by $g_X(\xi)$ and the joint CDF by $G_X(\xi)$, where $\xi = (\xi_1, \dots, \xi_N)$ is a vector variable describing the possible values of the vector quantity $X$. The PDF for $Y$ is denoted by $g_Y(\eta)$ and the CDF by $G_Y(\eta)$. The marginal PDFs and CDFs of $X_i$ are denoted by $g_{X_i}(\xi_i)$ and $G_{X_i}(\xi_i)$, respectively.

Frequently, it is adequate to assume that the functional relationship $Y = f(X)$ is linear in $X_i$ and that the input random variables $X_i$ are mutually independent, which leads to a convolution-type distribution of $Y$. However, this is a strong assumption which can be inadequate in many important situations, where a nonlinear functional relation $f$ and/or a non-trivial joint multivariate distribution of $X$ becomes indispensable. This is exactly the situation suited to the approach suggested in the GUM Supplements. However, the only multivariate distribution explicitly considered in the GUM Supplement 1 is the multivariate Gaussian distribution. The GUM Supplement 2 additionally considers the multivariate Student's t distribution and only briefly mentions the possibility of using copula-type multivariate distributions as a mathematical device for assigning the joint PDF of $X$.

Here we focus on the problem of sampling from a joint multivariate copula-type distribution that is fully specified by expert knowledge, with the dependence structure given by a specific copula and with known continuous marginal distributions. In particular, we emphasize the advantage of using the characteristic function approach (CFA) for sampling from the distribution of the output quantity $Y$, based on the joint copula-type distribution of the inputs $X$, with convolution-type marginal distributions assigned to the input quantities $X_i$ specified by using and combining the Type A and/or Type B evaluation methods [4,5].

The paper is organized as follows. In Section 2 we briefly recall the definition and basic properties of the copula distributions and present algorithms for sampling from specific copulas. Section 3 presents the concept for sampling from the multivariate copula-type distributions useful for metrology applications. In particular, we present a sampling method based on computing the marginal distribution functions and the quantiles by numerical inversion of their characteristic functions, as well as an alternative (computationally simpler) approximate method based on the empirical quantiles of the marginal distributions. In Section 4 the suggested modeling and evaluation approach is illustrated by a simple example.

2. The copula distributions

Copulas are increasingly used to model multivariate distributions with specified continuous margins in fields such as hydrology, actuarial sciences and finance [6,7]. Their modeling flexibility and conceptual simplicity also make them a natural choice for modeling multivariate distributions in metrology and measurement science [8]. In particular, for uncertainty evaluation based on the Monte Carlo method (MCM), as suggested in the GUM Supplements, it is important to sample many realizations from the joint multivariate distribution specified by the given copula and the marginal distributions.

The copula approach to dependence modeling is rooted in a representation theorem due to Sklar [9,10], who showed that an $N$-dimensional joint distribution can be decomposed into its $N$ univariate marginal distributions and an $N$-dimensional copula. This provides a tool for constructing large and flexible classes of multivariate distributions and, on the other hand, it suggests a straightforward method for sampling from such multivariate distributions via copulas.

Let $X = (X_1, \dots, X_N)$ be a random vector with joint distribution function $G_X$ and with marginal cumulative distribution functions $G_{X_i}$, for $i = 1, \dots, N$. If the marginal distributions are continuous, then the CDF $G_X$ of $X$ can be uniquely represented as

$$ G_X(\xi) = C\big(G_{X_1}(\xi_1), \dots, G_{X_N}(\xi_N)\big), \tag{2} $$

where $C(\omega) = C(\omega_1, \dots, \omega_N)$ is the (uniquely given) $N$-dimensional joint distribution function with uniform marginals on the interval $(0, 1)$, called a copula. Alternatively, given the known joint and marginal distributions, the unique copula distribution $C(\omega)$ is specified by

$$ C(\omega_1, \dots, \omega_N) = G_X\big(G_{X_1}^{-1}(\omega_1), \dots, G_{X_N}^{-1}(\omega_N)\big), \tag{3} $$

where $G_{X_i}^{-1}$ denotes the inverse distribution function (also known as the quantile function) of the marginal distribution $G_{X_i}$, and $\omega = (\omega_1, \dots, \omega_N)$ is a specified vector value of the copula distribution with $\omega_i \in (0, 1)$.

Equation (3) can be used to specify particular copula distributions, e.g. the Gaussian copula and/or the Student's t copula. In particular, the Gaussian copula specified by the correlation matrix $R$ is given by

$$ C^{G}_{R}(\omega_1, \dots, \omega_N) = \Phi_R\big(\Phi^{-1}(\omega_1), \dots, \Phi^{-1}(\omega_N)\big), \tag{4} $$

where $\Phi^{-1}$ is the inverse cumulative distribution function of the standard normal distribution and $\Phi_R$ is the joint CDF of a multivariate normal distribution with mean vector zero and covariance matrix equal to the correlation matrix $R$. Similarly, the Student's t copula specified by the correlation matrix $R$ is given by

$$ C^{t}_{\nu,R}(\omega_1, \dots, \omega_N) = t_{\nu,R}\big(t_{\nu}^{-1}(\omega_1), \dots, t_{\nu}^{-1}(\omega_N)\big), \tag{5} $$

where $t_\nu$ is the CDF of the univariate t distribution with $\nu$ degrees of freedom (with inverse $t_\nu^{-1}$) and $t_{\nu,R}$ is the joint CDF of the multivariate Student's t distribution with correlation matrix $R$ and $\nu$ degrees of freedom.

Fig. 1. Random sample of size n = 10000 from the bivariate Gaussian copula with correlation parameter ρ = 0.8 (left panel), and the bivariate Student's t copula with ν = 3 degrees of freedom and the correlation ρ = 0.8 (right panel). [Scatter plots of u2 against u1 on the unit square; panel titles "Gaussian copula with ρ = 0.8" and "t copula with ν = 3 and ρ = 0.8"; image not reproduced.]

Besides the elliptical copulas, i.e. the copulas arising from multivariate elliptical distributions such as the Gaussian and/or the Student's t distribution, there are many other possibilities for choosing the dependence structure by copula distributions. In particular, Archimedean copulas are a prominent class of copulas, able to capture complicated tail dependencies (useful for modeling extreme events in hydrology, actuarial applications and finance), with a common method of construction involving a one-dimensional generator [11], say $\psi$, which is a continuous, monotone (decreasing) function on the interval $[0, \infty)$ with $\psi(0) = 1$ and $\psi(\infty) = 0$. The Archimedean copula specified by its generator $\psi$ is given by

$$ C_{\psi}(\omega_1, \dots, \omega_N) = \psi\big(\psi^{-1}(\omega_1) + \dots + \psi^{-1}(\omega_N)\big). \tag{6} $$

Note that the simplest dependence model, namely independence, is provided by the generator function $\psi(t) = \exp(-t)$ with $\psi^{-1}(t) = -\log(t)$, and hence the corresponding independence copula is

$$ C^{iid}(\omega_1, \dots, \omega_N) = \prod_{i=1}^{N} \omega_i. \tag{7} $$

Common families of Archimedean copulas (useful also for metrology applications), defined by their specific generator functions, are listed in Table 1.

Fig. 2. Random sample of size n = 10000 from the bivariate Clayton copula with parameter θ = 2 (left panel), the bivariate Gumbel copula with θ = 2 (middle panel), and the Frank copula with θ = 2 (right panel). [Scatter plots of u2 against u1 on the unit square; panel titles "Clayton copula with θ = 2", "Gumbel copula with θ = 2" and "Frank copula with θ = 2"; image not reproduced.]

Table 1. Specific families of Archimedean copulas.

Archimedean copula    Parameter space                                                          Generator function $\psi(t)$
Clayton               $\theta > 0$                                                             $\psi(t) = (1 + t)^{-1/\theta}$
Gumbel                $\theta > 1$                                                             $\psi(t) = \exp\big(-t^{1/\theta}\big)$
Frank                 $\theta \in (-\infty, \infty) \setminus \{0\}$ for $N = 2$ and           $\psi(t) = -\frac{1}{\theta} \log\big(1 - \exp(-t)[1 - \exp(-\theta)]\big)$
                      $\theta > 0$ for $N \ge 3$
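To make equation (6) and Table 1 concrete, the following minimal MATLAB sketch evaluates bivariate Archimedean copula CDFs directly from their generators. The inverse generators are not given in the text; they are obtained here by solving $\psi(t) = u$ for $t$, and all variable names (`archCDF`, `psiClayton`, and so on) are illustrative assumptions rather than part of any toolbox.

```matlab
% Generators psi(t) from Table 1 and their inverses (the inverses are
% derived here by solving psi(t) = u for t; they are not stated in the text)
psiClayton = @(t,th) (1 + t).^(-1./th);
invClayton = @(u,th) u.^(-th) - 1;
psiGumbel  = @(t,th) exp(-t.^(1./th));
invGumbel  = @(u,th) (-log(u)).^th;
psiFrank   = @(t,th) -log(1 - exp(-t).*(1 - exp(-th)))./th;
invFrank   = @(u,th) -log((1 - exp(-th.*u))./(1 - exp(-th)));
psiIndep   = @(t,th) exp(-t);      % independence generator (theta unused)
invIndep   = @(u,th) -log(u);

% Archimedean copula CDF of equation (6) for a vector of arguments w
archCDF = @(psi,psiInv,w,th) psi(sum(psiInv(w,th)),th);

theta = 2;
w = [0.3 0.7];                     % point (omega_1, omega_2) in the unit square
cClayton = archCDF(psiClayton,invClayton,w,theta);
cGumbel  = archCDF(psiGumbel, invGumbel, w,theta);
cFrank   = archCDF(psiFrank,  invFrank,  w,theta);
cIndep   = archCDF(psiIndep,  invIndep,  w,theta);   % reduces to prod(w), equation (7)
```

With θ = 2 all three families model positive dependence, so each value should lie between the independence value `prod(w)` and the upper bound `min(w)`.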

2.1. Sampling from the specific copulas

2.1.1. Sampling from the Gaussian copula

For the Gaussian copula the input parameter is the correlation matrix $R$. Let $U = (U_1, \dots, U_N)$ denote one random draw from the copula:

(1) Generate a multivariate normal vector $Z \sim N(0, R)$, where $R$ is an $N$-dimensional correlation matrix. This can be achieved by the Cholesky decomposition of the correlation matrix, $R = LL^T$, where $L$ is a lower triangular matrix with positive elements on the diagonal. If $\tilde{Z} \sim N(0, I)$, then $L\tilde{Z} \sim N(0, R)$.
(2) Transform the vector $Z$ into $U = (\Phi(Z_1), \dots, \Phi(Z_N))$, where $\Phi$ is the CDF of the univariate standard normal distribution.

2.1.2. Sampling from the Student's t copula

For the Student's t copula the input parameters are the degrees of freedom $\nu$ and the correlation matrix $R$. To generate one random draw from the copula:

(1) Generate a multivariate vector $T \sim t_{\nu,R}$ following the centered t distribution with $\nu$ degrees of freedom and correlation matrix $R$. This can be achieved by using $T = \sqrt{\nu/Q}\, Z$, where $Z \sim N(0, R)$ is a multivariate normal vector and $Q \sim \chi^2_\nu$ is an independent univariate chi-square distributed random variable with $\nu$ degrees of freedom.
(2) Transform the vector $T$ into $U = (t_\nu(T_1), \dots, t_\nu(T_N))^T$, where $t_\nu$ is the CDF of the univariate t distribution with $\nu$ degrees of freedom.

Figure 3 shows MATLAB code for sampling from the Gaussian copula and the Student's t copula specified by their correlation matrix $R$.


% Sample from the bivariate Gaussian copula with rho=0.8
n = 10000;
rho = 0.8;
R = [1 rho ; rho 1];
L = chol(R,'lower');
Z = L * randn(2,n);
U = normcdf(Z');
% Alternative one-line MATLAB code for given parameters mu, R, n:
mu = [0 0];
U = normcdf(mvnrnd(mu,R,n));

% Sample from the bivariate Student's t copula with nu=3, rho=0.8
n = 10000;
nu = 3;
rho = 0.8;
R = [1 rho ; rho 1];
L = chol(R,'lower');
Z = L * randn(2,n);
Q = chi2rnd(nu,1,n);
T = sqrt(nu./Q) .* Z;
U = tcdf(T',nu);
% Alternative one-line MATLAB code for given parameters R, nu, n:
U = tcdf(mvtrnd(R,nu,n),nu);

Fig. 3. MATLAB code for generating n = 10000 draws from the Gaussian copula specified by its correlation matrix R (upper panel), and the Student's t copula with ν = 3 degrees of freedom and given correlation matrix R (lower panel).

2.1.3. Sampling from the Archimedean copulas

Archimedean copulas are usually sampled with the algorithm of Marshall and Olkin [12]. To generate one random draw from the copula:

(1) Generate a random variable $V$ with the specific distribution function $F$, defined by the inverse Laplace transform of the copula generator function $\psi(t)$:
    • For the Clayton copula generate $V \sim \mathrm{Gamma}(1/\theta, 1)$.
    • For the Gumbel copula generate $V \sim \mathrm{Stable}(1/\theta, 1, \gamma, 0)$ with $\gamma = \cos^{\theta}\big(\pi/(2\theta)\big)$, where by $\mathrm{Stable}(\alpha, \beta, \gamma, \mu)$ we denote the $\alpha$-stable distribution function with the parameters $\alpha$ (stability), $\beta$ (skewness), $\gamma$ (scale), and $\mu$ (location).
    • For the Frank copula generate $V \sim \mathrm{Logarithmic}(1 - \exp(-\theta))$.
(2) Generate independent uniform random variables $(R_1, \dots, R_N)$, with $R_i \sim R(0, 1)$.
(3) Return $U = \big(\psi(-\log(R_1)/V), \dots, \psi(-\log(R_N)/V)\big)^T$.

In general, based on the known relationship between the Laplace transform and the Fourier transform, for a given generator $\psi(t)$ the characteristic function (CF) of the distribution $F$ is $\varphi_F(t) = \psi(-it)$, with $i^2 = -1$. This makes it possible to apply the methods for numerical inversion of CFs to get the CDF and QF of the distribution $F$ and, consequently, to use them directly to generate $V \sim F$.

% Sample from the bivariate Clayton copula with theta=2
n = 10000;
theta = 2;
U1 = rand(n,1);
r = rand(n,1);
U2 = U1.*(r.^(-theta./(1+theta))-1+U1.^theta).^(-1./theta);
U = [U1,U2];

% Sample from the bivariate Gumbel copula with theta=2
% This uses the Marshall-Olkin method
n = 10000;
theta = 2;
r = rand(n,4);
v1 = (r(:,1)-.5) .* pi;
v2 = v1 + pi/2;
v3 = -cos(v1-v2./theta)./log(r(:,2));
v4 = (sin(v2./theta)./v3).^(1./theta) .* v3./cos(v1);
U1 = exp(-(-log(r(:,3))).^(1./theta) ./ v4);
U2 = exp(-(-log(r(:,4))).^(1./theta) ./ v4);
U = [U1 U2];

% Sample from the bivariate Frank copula with theta=2
n = 10000;
theta = 2;
r = rand(n,1);
U1 = rand(n,1);
U2 = -log(1 + (1-exp(theta))./(exp(theta) ...
    + exp(theta.*(1-U1)).*(1-r)./r))./theta;
U = [U1 U2];

Fig. 4. MATLAB codes for generating n = 10000 draws of U = (U1, U2) from the bivariate Clayton copula (upper panel), the bivariate Gumbel copula (middle panel), and the bivariate Frank copula (lower panel) with the specific value of the parameter θ = 2.
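As an illustration of steps (1) to (3) of the Marshall-Olkin algorithm, here is a minimal sketch for the bivariate Clayton copula with θ = 2, written with base MATLAB functions only. It relies on the fact that Gamma(1/2, 1) is the distribution of $Z^2/2$ for standard normal $Z$, so it does not generalize to other θ without a gamma generator; the variable names and the final empirical check are assumptions added for illustration. The code uses implicit expansion (MATLAB R2016b or later).

```matlab
% Marshall-Olkin sampling from the bivariate Clayton copula, theta = 2
n = 20000;
theta = 2;
N = 2;
psi = @(t) (1 + t).^(-1/theta);   % Clayton generator from Table 1

% Step (1): V ~ Gamma(1/theta, 1). For theta = 2 this is Gamma(1/2, 1),
% i.e. the distribution of Z^2/2 with Z standard normal. For general
% theta a gamma random number generator would be needed instead.
V = randn(n,1).^2 / 2;

% Step (2): independent uniform random variables R_i ~ R(0,1)
Runif = rand(n,N);

% Step (3): U_i = psi(-log(R_i)/V), with V expanded across the columns
U = psi(-log(Runif) ./ V);

% Empirical copula value at (0.5, 0.5) versus equation (6) for Clayton
Cemp = mean(U(:,1) <= 0.5 & U(:,2) <= 0.5);
Cthe = (0.5^(-theta) + 0.5^(-theta) - 1)^(-1/theta);
```

The two values `Cemp` and `Cthe` should agree up to Monte Carlo error, which gives a quick consistency check against the conditional-sampling code for the Clayton copula in Figure 4.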


Frequently, draws from the bivariate Archimedean copulas (with $N = 2$) can be generated more easily by sampling from the conditional distribution $C(\omega_2 \mid \omega_1)$; for more details see the MATLAB code in Figure 4.

3. Sampling from the multivariate copula-type distributions

For metrological applications it is important to sample realizations of $X = (X_1, \dots, X_N)$ from the joint multivariate distribution $G_X$ specified by the given copula $C(\omega)$ and the marginal distributions $G_{X_1}(\xi_1), \dots, G_{X_N}(\xi_N)$. Based on that, we can further evaluate realizations of the output quantity defined by the measurement equation $Y = f(X)$ and use the methods of the GUM Supplements for proper determination of the measurement uncertainty.

If $U = (U_1, \dots, U_N)$ is a random vector with given copula distribution and uniformly distributed marginals $U_i \in (0, 1)$, i.e.

$$ (U_1, \dots, U_N) \sim C, \tag{8} $$

and $X = (X_1, \dots, X_N)$ is the transformed random vector, such that

$$ (X_1, \dots, X_N) = \big(G_{X_1}^{-1}(U_1), \dots, G_{X_N}^{-1}(U_N)\big), \tag{9} $$

then the random vector $X$ has the required joint multivariate distribution, $X \sim G_X$, with dependence structure defined by the copula $C$ and the given marginal distributions $G_{X_1}, \dots, G_{X_N}$, as defined in (2).

Hence, sampling many realizations of $X = (X_1, \dots, X_N)$ requires many evaluations of the quantiles $G_{X_i}^{-1}(U_i)$ of the marginal distributions $G_{X_i}$, $i = 1, \dots, N$, for specific probabilities $(U_1, \dots, U_N)$ generated from the specified copula distribution. If the given marginal distributions $G_{X_i}$ are standard (e.g. the Gaussian distribution, the Student's t distribution or the rectangular distribution), with easily computable quantile functions (or with available algorithms), the task is straightforward and the computation is simple and efficient.

Here we shall consider more complicated situations, motivated by the hierarchical modeling approach. In particular, we shall assume that each input variable $X_i$ is modeled as a linear combination of simple and independent inputs $X_{ij}$, i.e. $X_i = \sum_{j=1}^{n_i} c_{ij} X_{ij}$, and thus the specified state-of-knowledge marginal distribution of $X_i$ is given as a convolution-type distribution. Such distribution functions and their quantile functions can be evaluated efficiently by using the characteristic function approach [4,5,13].
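For the simple case of standard margins with closed-form quantile functions, equations (8) and (9) and the subsequent propagation through the measurement equation can be sketched as follows. This is only an illustration under stated assumptions: the Gaussian copula with ρ = 0.8, the choice of a standard normal and a rectangular margin, and in particular the measurement function `f` are hypothetical and are not taken from the paper. Only base MATLAB functions are used (`erfc` and `erfcinv` in place of `normcdf` and `norminv`).

```matlab
% Gaussian copula sample, equation (8), with rho = 0.8
n = 100000;
rho = 0.8;
R = [1 rho; rho 1];
L = chol(R,'lower');
Z = (L * randn(2,n))';                 % n draws of Z ~ N(0,R), one per row
U = 0.5 * erfc(-Z / sqrt(2));          % U = Phi(Z), uniform margins

% Equation (9) with standard margins and closed-form quantile functions
X1 = -sqrt(2) * erfcinv(2 * U(:,1));   % standard normal quantile function
X2 = 2 * U(:,2) - 1;                   % rectangular R(-1,1) quantile function

% Hypothetical nonlinear measurement function Y = f(X1,X2)
f = @(x1,x2) x1 + x2.^2;
Y = f(X1,X2);

% Monte Carlo summaries in the spirit of GUM Supplement 1
y  = mean(Y);                          % estimate of the output quantity
uy = std(Y);                           % associated standard uncertainty
Ys = sort(Y);
ci = [Ys(round(0.025*n)), Ys(round(0.975*n))];   % 95 % coverage interval
```

The same structure carries over to convolution-type margins; only the two quantile-function lines change, which is where the characteristic function approach of Section 3.1 or the empirical quantiles of Section 3.2 come in.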

Table 2. Characteristic functions of selected distributions used in metrological applications. Here, $\Gamma(a)$ denotes the gamma function, $J_\nu(z)$ is the Bessel function of the first kind and $K_\nu(z)$ is the modified Bessel function of the second kind. $M(a, b, z)$ and $U(a, b, z)$ are the confluent hypergeometric functions of the first and second kind, respectively [14,15].

Probability distribution              Characteristic function $\varphi(t)$
Gaussian $N(0, 1)$                    $\varphi(t) = \exp\big(-\frac{t^2}{2}\big)$
Student's $t_\nu$                     $\varphi(t) = \frac{1}{2^{\nu/2 - 1}\,\Gamma(\nu/2)} \big(\nu^{1/2}|t|\big)^{\nu/2} K_{\nu/2}\big(\nu^{1/2}|t|\big)$, $\nu > 0$
Rectangular $R(-1, 1)$                $\varphi(t) = \frac{\sin(t)}{t}$
Triangular $T(-1, 1)$                 $\varphi(t) = \frac{2 - 2\cos(t)}{t^2}$
Arcsine $U(-1, 1)$                    $\varphi(t) = J_0(t)$
Symmetric Beta $B(-1, 1)$             $\varphi(t) = \Gamma\big(\frac{1}{2} + \theta\big) \big(\frac{t}{2}\big)^{-(\theta - \frac{1}{2})} J_{\theta - \frac{1}{2}}(t)$, $\theta > 0$ shape
Exponential $\mathrm{Exp}(\lambda)$   $\varphi(t) = \frac{\lambda}{\lambda - it}$, $\lambda > 0$ rate
Gamma $\Gamma(\alpha, \beta)$         $\varphi(t) = \big(1 - \frac{it}{\beta}\big)^{-\alpha}$, $\alpha > 0$ shape, $\beta > 0$ rate
Chi-squared $\chi^2_\nu$              $\varphi(t) = (1 - 2it)^{-\nu/2}$, $\nu > 0$
Fisher-Snedecor's $F_{\nu_1,\nu_2}$   $\varphi(t) = \frac{\Gamma(\frac{\nu_1}{2} + \frac{\nu_2}{2})}{\Gamma(\frac{\nu_2}{2})}\, U\big(\frac{\nu_1}{2},\, 1 - \frac{\nu_2}{2},\, -\frac{\nu_2}{\nu_1} it\big)$, $\nu_1 > 0$, $\nu_2 > 0$
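Several of the characteristic functions in Table 2 can be written directly as MATLAB function handles. The sketch below does so for the four symmetric distributions used later in the example and checks each against a known property: for a zero-mean distribution the variance equals $-\varphi''(0)$. The finite-difference check and the handle names are assumptions added for illustration and are not part of CharFunTool.

```matlab
% Characteristic functions from Table 2 as function handles
% (the rectangular and triangular CFs have a removable singularity at
% t = 0, so they are evaluated only at t ~= 0 here)
cfGaussian    = @(t) exp(-t.^2/2);
cfRectangular = @(t) sin(t)./t;
cfTriangular  = @(t) (2 - 2*cos(t))./t.^2;
cfArcsine     = @(t) besselj(0,t);

% For a zero-mean distribution Var = -phi''(0). These CFs are real and
% even with phi(0) = 1, so the central difference
% (phi(h) - 2*phi(0) + phi(-h))/h^2 reduces to 2*(phi(h) - 1)/h^2.
h = 0.01;
varNum = @(cf) 2*(1 - real(cf(h)))/h^2;

v = [varNum(cfGaussian), varNum(cfRectangular), ...
     varNum(cfTriangular), varNum(cfArcsine)];
vExact = [1, 1/3, 1/6, 1/2];   % known variances on (-1,1) and of N(0,1)
```

The numerical values in `v` should match `vExact` to within the truncation error of the finite difference, which is of order $h^2$.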

3.1. Sampling by using the characteristic function approach

The CFA was suggested as a way to form the state-of-knowledge probability distribution of the output quantity in a linear measurement model, based on numerical inversion of its CF, which is defined as the Fourier transform of its PDF. Table 2 presents the CFs of selected univariate distributions frequently used in metrological applications for modeling marginal distributions of the input quantities. Notice that the CFs of symmetric zero-mean distributions are real functions of the argument $t$, while in general CFs are complex-valued functions.

The CF of a weighted sum of independent random variables is simple to derive. In particular, let $X_i = \sum_{j=1}^{n_i} c_{ij} X_{ij}$ with coefficients $c_{ij}$ and independent $X_{ij}$. In this situation, the CF of the input quantity $X_i$ is given by

$$ \varphi_{X_i}(t) = \varphi_{X_{i1}}(c_{i1} t) \times \dots \times \varphi_{X_{in_i}}(c_{in_i} t), \tag{10} $$

where by $\varphi_{X_{ij}}(t)$ we denote the (known) CFs of the input quantities $X_{ij}$.

Based on the Gil-Pelaez inversion theorem [16], the distribution function $G_{X_i}$ of the quantity $X_i$ can be evaluated at specific value(s) $\xi_i$ by numerical inversion of its CF $\varphi_{X_i}(t)$, and in most metrology applications it can be efficiently calculated by a simple trapezoidal quadrature:

$$ G_{X_i}(\xi_i) \approx \frac{1}{2} - \frac{\delta}{\pi} \sum_{k=0}^{K} w_k\, \Im\!\left( \frac{e^{-i t_k \xi_i}\, \varphi_{X_i}(t_k)}{t_k} \right), \tag{11} $$

where $K$ is a sufficiently large integer, $w_k$ are the trapezoidal quadrature weights ($w_0 = w_K = \frac{1}{2}$, and $w_k = 1$ for $k = 1, \dots, K-1$), $t_k$ are equidistant nodes from the interval $(0, T)$, for sufficiently large $T$, and $\delta = T/K$. Here, by $\Im(z)$ we denote the imaginary part of the complex value $z$.

For a given value of $U_i \in (0, 1)$, the required realization of the input quantity is given by $X_i = G_{X_i}^{-1}(U_i)$, where the quantile function $G_{X_i}^{-1}$ is evaluated at $U_i$ by using barycentric interpolation from the distribution function. In fact, the quantile function $G_{X_i}^{-1}$ is specified only once, by a small number of values $G_{X_i}(\xi_i)$ evaluated at specified Chebyshev points [17] from the support of the distribution, $\xi_i^{Cheb} \in \mathrm{Supp}(X_i)$. This is a highly efficient method which can be used for generating multiple realizations of $X_i$.

The MATLAB Characteristic functions toolbox CharFunTool [13] is a set of algorithms for computing and combining characteristic functions, and further for computing the CDF, PDF and QF of the output quantity by numerical inversion of the associated CF.

Figure 6 presents the MATLAB code, based on the characteristic function approach, used for generating the random sample (plotted in Figure 5) from the joint bivariate distribution specified by the Student's t copula and convolution-type marginal distributions of the input quantities $X_1$ and $X_2$, where $X_1 \sim X_N + 4X_R$ and $X_2 \sim 2X_T + 3X_U$, with independent input random variables: $X_N$ standard normal, $X_R$ rectangular on $(-1, 1)$, $X_T$ triangular on $(-1, 1)$, and $X_U$ arcsine on $(-1, 1)$.

Fig. 5. Random sample of size n = 10000 from the joint bivariate copula-type distribution, specified by the Student's t copula with ν = 3 degrees of freedom and given correlation matrix R (with ρ = 0.8), and the convolution-type marginal distributions of the input quantities X1 and X2, where X1 ∼ XN + 4XR and X2 ∼ 2XT + 3XU, with independent input random variables: XN standard normal, XR rectangular on (−1, 1), XT triangular on (−1, 1), and XU arcsine on (−1, 1). The sample was generated by the MATLAB code presented in Figure 6. [Scatter plot of X2 against X1, titled "Bivariate Students t copula with given marginals"; image not reproduced.]

% Bivariate Student's t copula with given marginals
% X1 = XN + 4 * XR, and X2 = 2 * XT + 3 * XU
% This is using the characteristic function approach
% Requires CharFunTool: https://github.com/witkovsky/CharFunTool
n = 10000;
nu = 3;
R = [1 0.8;0.8 1];
U = tcdf(mvtrnd(R,nu,n),nu);
cfN = @(t) cfS_Gaussian(t);
cfR = @(t) cfS_Rectangular(t);
cfT = @(t) cfS_Triangular(t);
cfU = @(t) cfS_Arcsine(t);
cfX1 = @(t) cfN(t).*cfR(4*t);
cfX2 = @(t) cfT(2*t).*cfU(3*t);
options.SixSigmaRule = 8;
options.isInterp = true;
X1DF = cf2DistGP(cfX1,[],[],options);
X2DF = cf2DistGP(cfX2,[],[],options);
X1 = X1DF.QF(U(:,1));
X2 = X2DF.QF(U(:,2));
X = [X1 X2];

Fig. 6. MATLAB code, based on using the characteristic function approach, for generating n = 10000 realizations from the joint bivariate copula-type distribution specified by Student's t copula and convolution-type marginal distributions of the input quantities X1 and X2, as specified in Figure 5.
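The following base-MATLAB sketch illustrates equations (10) and (11) for the marginal $X_1 = X_N + 4X_R$, without CharFunTool. It is a simplified stand-in, not the toolbox algorithm: the values of `T` and `K`, the evaluation grid, and the use of linear inverse interpolation (instead of barycentric interpolation at Chebyshev points) are assumptions made here. The node $t_0 = 0$ is handled by the limit of the integrand, which for a zero-mean distribution equals $-\xi$. The code uses implicit expansion (MATLAB R2016b or later).

```matlab
% CF of X1 = XN + 4*XR by equation (10), with the CFs from Table 2
cfX1 = @(t) exp(-t.^2/2) .* sin(4*t)./(4*t);

% Trapezoidal rule (11) with nodes t_k = k*delta, k = 0,...,K
T = 10;
K = 1000;
delta = T/K;
tk = (1:K) * delta;                  % nodes k = 1,...,K (t_0 = 0 treated below)
wk = [ones(1,K-1), 1/2];             % weights for k = 1,...,K

% Grid of xi values covering the bulk of the distribution of X1
xi = linspace(-9,9,361)';

% Integrand Im(exp(-i*t*xi)*cf(t))/t; rows index xi, columns index t_k
G = imag(exp(-1i * (xi * tk)) .* cfX1(tk)) ./ tk;

% Node t_0 = 0 has weight 1/2; there the integrand tends to E(X1) - xi = -xi
cdfX1 = 1/2 - (delta/pi) * (G * wk' + 0.5 * (-xi));

% Quantile function by linear inverse interpolation of the tabulated CDF
[c,iu] = unique(cdfX1);
qfX1 = @(u) interp1(c, xi(iu), u, 'linear');

% Draw realizations of X1 from uniform probabilities (clipped to the table)
u = min(max(rand(10000,1), c(1)), c(end));
X1 = qfX1(u);
```

In a copula setting, the uniform probabilities `u` would be replaced by the corresponding column of the copula sample `U`, exactly as `X1DF.QF(U(:,1))` does in Figure 6.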

3.2. Sampling by using the empirical quantile approach

The empirical quantiles are natural and consistent estimators of the true quantiles of the distribution. Hence, for large $n$, the empirical quantiles of the marginal distributions $G_{X_i}$ can be used as a reasonable approximate tool for sampling from the copula-type distribution with given marginal distributions, as given in (9). The sampling algorithm is rather simple:

(1) Generate $n$ samples of $X_1, \dots, X_N$ from the marginal distributions $G_{X_1}, \dots, G_{X_N}$, as well as $n$ samples of $U_1, \dots, U_N$ from the given copula $C$.
(2) Sort the samples $U_1, \dots, U_N$ and save the vectors of their ranks (the original orders) $(r_1, \dots, r_N)$.
(3) Sort the marginal samples $X_1, \dots, X_N$ and reorder them with respect to the ranks $(r_1, \dots, r_N)$.
(4) Return the reordered samples $X^* = (X_1^*, \dots, X_N^*)$ as the sample of size $n$ from the joint multivariate distribution $G_X$ defined by (2).

Figure 7 presents the MATLAB code, based on the empirical quantile approach, for generating a sample of size n = 10000 from the joint bivariate copula-type distribution specified by the Student's t copula and convolution-type marginal distributions of the input quantities $X_1$ and $X_2$, as specified in Figure 5.

% Bivariate Student's t copula with given marginals
% X1 = XN + 4 * XR, and X2 = 2 * XT + 3 * XU
% This is using the empirical quantile approach
n = 10000;
nu = 3;
R = [1 0.8;0.8 1];
U = tcdf(mvtrnd(R,nu,n),nu);
[~,id1] = sort(U(:,1));
[~,id2] = sort(U(:,2));
rank = (1:n);
r1(id1) = rank;
r2(id2) = rank;
XN = randn(n,1);
XR = 2*rand(n,1)-1;
XT = rand(n,1)-rand(n,1);
XU = sin(2*pi*rand(n,1));
X1 = sort(XN + 4*XR);
X2 = sort(2*XT + 3*XU);
X1 = X1(r1);
X2 = X2(r2);
X = [X1 X2];

Fig. 7. MATLAB code, based on using the empirical quantile approach, for generating n = 10000 realizations from the joint bivariate copula-type distribution specified by Student's t copula and convolution-type marginal distributions of the input quantities X1 and X2, as specified in Figure 5.

4. Example

The need for sampling from joint multivariate distributions arises in many real metrological applications. A simple motivating example is the type approval of breath analyzers, an analysis routinely performed at the Slovak Institute of Metrology (SMU). One of the tests used in the process of type approval is the test of cross sensitivity, i.e. the test of the effect of interfering substances. In accordance with the international recommendation OIML R 126 Evidential breath analyzers, these measuring devices should fulfill the requirements on maximum permissible errors and on the maximum influence of the following substances: acetone, methanol, isopropanol, toluene, ethyl acetate and diethyl ether. For this type of test, aqueous solutions of ethanol and the other interfering substances are prepared (possibly in different mixture concentrations) and measured by the breath analyzer. In this case, the interest is in the best estimate of the concentration of ethanol (the measurand) and its uncertainty. Here, the input quantities of the measurement equation include the concentrations of ethanol and of the interfering substances, which can be characterized by a joint multivariate state-of-knowledge copula-type distribution with given marginal distributions.

Another motivation for using joint copula-type multivariate distributions with known correlation structure and given marginal distributions comes from the (uncertainty) analysis based on the EIV (errors-in-variables) calibration models [18,19]. For illustration, here we consider linear calibration of ethanol concentrations in nitrogen (measured in mole fractions), say $\mu$ and $\nu$, determined by two independent measurement techniques, namely the gravimetric method and nondispersive infrared spectrometry (NDIR), by using the EIV calibration model [18]. The parameters $a$ and $b$ of the linear calibration curve $\nu = a + b\mu$ were estimated by using the best linear unbiased estimator (BLUE), together with their covariance matrix and their marginal distributions (evaluated by using the CFA), based on the state-of-knowledge distributions about the measurands, say $\mu_i$ and $\nu_i$, measured at the specified calibration points by the two measurement techniques.

Table 3. Ethanol concentrations µ1, µ2, µ3, µ4, µ5, µ6 in six reference gas mixtures of ethanol in nitrogen used for calibration, determined by the gravimetric method [18].

Quantity   Estimate      Standard uncertainty   Type of evaluation   Probability distribution
µ1         0.00008127    0.00000026             Type A               normal
µ2         0.0001508     0.0000072              Type A               normal
µ3         0.0002594     0.0000024              Type A               normal
µ4         0.0004887     0.0000020              Type A               normal
µ5         0.0006486     0.0000056              Type A               normal
µ6         0.0007976     0.0000042              Type A               normal

Table 4. Ethanol concentrations ν1, ν2, ν3, ν4, ν5, ν6 in six reference gas mixtures of ethanol in nitrogen used for calibration, determined by using the nondispersive infrared spectrometer [18].

Quantity   Estimate      Standard uncertainty   Type of evaluation   Probability distribution
ν1         0.000081027   0.000000013            Type A               normal
                         0.00000037             Type B               rectangular
ν2         0.000150957   0.000000019            Type A               normal
                         0.00000036             Type B               rectangular
ν3         0.00026165    0.00000012             Type A               normal
                         0.00000050             Type B               rectangular
ν4         0.000488587   0.000000016            Type A               normal
                         0.00000076             Type B               rectangular
ν5         0.00064851    0.00000014             Type A               normal
                         0.0000010              Type B               rectangular
ν6         0.00079307    0.00000016             Type A               normal
                         0.0000013              Type B               rectangular

By using the data presented in Table 3 and Table 4, the best estimates of the linear calibration curve parameters are $\hat{a} = -0.00000008016$ with standard uncertainty $u_{\hat{a}} = 0.0000006$ and $\hat{b} = 0.9989$ with standard uncertainty $u_{\hat{b}} = 0.0037$. The covariance matrix is

$$ \mathrm{Cov}\begin{pmatrix} \hat{a} \\ \hat{b} \end{pmatrix} = \begin{pmatrix} 3.6462 \times 10^{-13} & -1.5693 \times 10^{-9} \\ -1.5693 \times 10^{-9} & 1.3788 \times 10^{-5} \end{pmatrix}, \tag{12} $$

with the correlation coefficient $\rho = \mathrm{Corr}(\hat{a}, \hat{b}) = -0.7$. The marginal state-of-knowledge distributions of the parameters $a$ and $b$ are computed by numerical inversion of the associated characteristic functions. By using the Gaussian copula with $\rho = -0.7$, we can further generate a random sample from the joint state-of-knowledge distribution of $(a, b)$, see Figure 8.
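The quoted standard uncertainties and correlation coefficient follow directly from the covariance matrix (12), and the Gaussian copula step can be sketched in a few lines of base MATLAB. One point is an explicit assumption: the marginal distributions of $a$ and $b$ in the paper are obtained by numerical inversion of characteristic functions that are not reproduced here, so the sketch substitutes Gaussian margins purely for illustration. The code uses implicit expansion (MATLAB R2016b or later).

```matlab
% Covariance matrix (12) of the estimators (a_hat, b_hat)
C = [ 3.6462e-13  -1.5693e-9 ;
     -1.5693e-9    1.3788e-5 ];
u = sqrt(diag(C));                   % standard uncertainties u_a and u_b
rho = C(1,2) / (u(1)*u(2));          % correlation coefficient Corr(a_hat, b_hat)

% Gaussian copula sample with correlation rho (base MATLAB only)
n = 10000;
L = chol([1 rho; rho 1],'lower');
Z = (L * randn(2,n))';
U = 0.5 * erfc(-Z / sqrt(2));

% Illustrative assumption: Gaussian margins centred at the estimates.
% The paper instead uses margins computed by inverting the associated CFs.
est = [-0.00000008016, 0.9989];
ab = est + (-sqrt(2) * erfcinv(2*U)) .* u';
```

Replacing the last line with quantile functions obtained by the characteristic function approach would give a sample of the kind described for Figure 8.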

Fig. 8. Random sample of size n = 10000 from the joint bivariate copula-type distribution, specified by the Gaussian copula with given correlation matrix R (with ρ = −0.7), and the convolution-type marginal state-of-knowledge distributions of the input quantities a and b. [Scatter plot of b against a (horizontal axis scaled by 10^-6), titled "Joint distribution of the calibration line parameters (a,b)"; image not reproduced.]

5. Conclusions

As was illustrated, sampling from the multivariate copula-type distributions is useful for many applications in measurement science and metrology. Here we have presented some basic properties of the copula distributions together with the algorithms for sampling from specific copulas. We have emphasized the advantage of using the characteristic function approach, i.e. the method for computing the required marginal distribution functions and/or the quantile functions by numerical inversion of the associated characteristic functions. As an alternative, we have also presented an approximate method for sampling from the copula-type distributions, based on using the empirical quantiles of the marginal distributions. The suggested approach was illustrated by modeling the joint state-of-knowledge distribution of the calibration line parameters, based on using the EIV calibration model.


Acknowledgements

The work was supported by the Slovak Research and Development Agency, project APVV-15-0295, and by the Scientific Grant Agency of the Ministry of Education of the Slovak Republic and the Slovak Academy of Sciences, projects VEGA 2/0047/15, VEGA 1/0604/15, VEGA 1/0748/15, and VEGA 2/0011/16.

References

1. JCGM100:2008, Evaluation of measurement data – Guide to the expression of uncertainty in measurement (GUM 1995 with minor corrections), in JCGM - Joint Committee for Guides in Metrology, (ISO (the International Organization for Standardization), BIPM, IEC, IFCC, ILAC, IUPAC, IUPAP and OIML, 2008).
2. JCGM101:2008, Evaluation of measurement data – Supplement 1 to the Guide to the expression of uncertainty in measurement – Propagation of distributions using a Monte Carlo method, in JCGM - Joint Committee for Guides in Metrology, (ISO (the International Organization for Standardization), BIPM, IEC, IFCC, ILAC, IUPAC, IUPAP and OIML, 2008).
3. JCGM102:2011, Evaluation of measurement data – Supplement 2 to the Guide to the expression of uncertainty in measurement – Extension to any number of output quantities, in JCGM - Joint Committee for Guides in Metrology, (ISO (the International Organization for Standardization), BIPM, IEC, IFCC, ILAC, IUPAC, IUPAP and OIML, 2011).
4. V. Witkovský, Numerical inversion of a characteristic function: An alternative tool to form the probability distribution of output quantity in linear measurement models, Acta IMEKO 5, 32 (2016).
5. V. Witkovský, G. Wimmer, Z. Ďurišová, S. Ďuriš and R. Palenčár, Brief overview of methods for measurement uncertainty analysis: GUM uncertainty framework, Monte Carlo method, characteristic function approach, in MEASUREMENT 2017, Proceedings of the 11th International Conference on Measurement, (Smolenice, Slovakia, May 29-31, 2017).
6. C. Genest and A.-C. Favre, Everything you always wanted to know about copula modeling but were afraid to ask, Journal of Hydrologic Engineering 12, 347 (2007).
7. I. Kojadinovic and J. Yan, Modeling multivariate distributions with continuous margins using the copula R package, Journal of Statistical Software 34, 1 (2010).
8. A. Possolo, Copulas for uncertainty analysis, Metrologia 47, 262 (2010).
9. A. Sklar, Fonctions de répartition à n dimensions et leurs marges, Publications de l’Institut de Statistique de l’Université de Paris 8, 229 (1959).
10. A. Sklar, Random variables, joint distribution functions, and copulas, Kybernetika 9, 449 (1973).
11. A. J. McNeil, Sampling nested Archimedean copulas, Journal of Statistical Computation and Simulation 78, 567 (2008).
12. A. W. Marshall and I. Olkin, Families of multivariate distributions, Journal of the American Statistical Association 83, 834 (1988).
13. V. Witkovský, CharFunTool: The characteristic functions toolbox for MATLAB (2017), https://github.com/witkovsky/CharFunTool.
14. M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions: With Formulas, Graphs, and Mathematical Tables (Courier Corporation, 1964).
15. F. W. Olver, D. W. Lozier, R. F. Boisvert and C. W. Clark, NIST handbook of mathematical functions (2010), US Department of Commerce, National Institute of Standards and Technology, Washington, DC, http://dlmf.nist.gov.
16. J. Gil-Pelaez, Note on the inversion theorem, Biometrika 38, 481 (1951).
17. N. Hale and L. N. Trefethen, Chebfun and numerical quadrature, Science China Mathematics 55, 1749 (2012).
18. S. Ďuriš, Z. Ďurišová, M. Dovica and G. Wimmer, EIV calibration of gas mixture of ethanol in nitrogen, in International Conference on Advanced Mathematical and Computational Tools in Metrology and Testing XI, (University of Strathclyde, Glasgow, Scotland, 2017).
19. G. Wimmer, S. Ďuriš, R. Palenčár and V. Witkovský, EIV calibration model of thermocouples, in International Conference on Advanced Mathematical and Computational Tools in Metrology and Testing XI, (University of Strathclyde, Glasgow, Scotland, 2017).
