Statistics I Final Exam, 18th May 2012. ADE, DER-ADE, ADE-INF, FICO, ECO, ECO-DER.

EXAM RULES:
1) Use separate booklets for each problem.
2) Perform the calculations with at least two significant decimal places.
3) You may not leave the exam during the first 30 minutes.
4) You are not allowed to leave the classroom without handing in the exam.

1. A survey based on 1000 Spanish households led to a bivariate data set from (X, Y), where X = ‘number of cars owned in 2011’, with possible values 0, 1, 2, and Y = ‘household income in 2011 (in thousands of euros)’. The following table shows the survey results:

                   X = 0   X = 1   X = 2
Y in [0, 50)         324     112       1
Y in [50, 100)       105     234       4
Y in [100, ∞)         37       6     177

(a) (0.25 points) What kind of variables are X and Y?
(b) (0.5 points) Find the marginal absolute frequency distribution of X. Calculate the mean and the quasi-standard deviation of X.
(c) (0.5 points) For those households with incomes below 50 thousand euros, what were the mean and the most frequent number of cars owned?
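The arithmetic behind parts (b) and (c) can be sketched in a few lines of Python. This is only an illustrative check, not an official solution. It assumes the cell counts are arranged as in the dictionary below (rows are income classes, columns are X = 0, 1, 2), an arrangement chosen so that the column totals match the group counts 466, 352 and 182 reported in the summary statistics for Y. All variable names are made up for the example.

```python
import math

# Joint absolute frequencies (assumed arrangement of the survey table).
# Keys: income class of Y; list positions: X = 0, 1, 2 cars.
counts = {
    "[0, 50)": [324, 112, 1],
    "[50, 100)": [105, 234, 4],
    "[100, inf)": [37, 6, 177],
}
x_values = [0, 1, 2]

# (b) Marginal absolute frequencies of X are the column sums.
marginal_x = [sum(row[j] for row in counts.values()) for j in range(len(x_values))]
n = sum(marginal_x)

# Mean of X and quasi-standard deviation (divisor n - 1 instead of n).
mean_x = sum(x * f for x, f in zip(x_values, marginal_x)) / n
quasi_var_x = sum(f * (x - mean_x) ** 2 for x, f in zip(x_values, marginal_x)) / (n - 1)
quasi_sd_x = math.sqrt(quasi_var_x)

# (c) Distribution of X restricted to households with Y in [0, 50).
low_income = counts["[0, 50)"]
cond_mean = sum(x * f for x, f in zip(x_values, low_income)) / sum(low_income)
cond_mode = x_values[low_income.index(max(low_income))]

print("marginal of X:", dict(zip(x_values, marginal_x)))
print("mean:", round(mean_x, 3), "quasi-sd:", round(quasi_sd_x, 3))
print("low-income mean:", round(cond_mean, 3), "mode:", cond_mode)
```

The quasi-standard deviation differs from the ordinary standard deviation only in the divisor, so with n = 1000 the two values are very close.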
Here is some additional information on Y, grouped by the different values of X:

Summary Statistics for Y
                        X=0        X=1        X=2
Count                   466        352        182
Average                 39.70      58.93      248.12
Median                  28.35      59.52      265.24
Variance                1376.33    372.85     2672.49
Standard dev.           37.10      19.31      51.70
Coeff. of variation     93.44%     32.77%     20.84%
Minimum                 0.03       0.94       32.56
Maximum                 288.44     126.09     299.81
Range                   288.41     125.15     267.26
Lower quartile          12.39      46.16      235.33
Upper quartile          57.07      71.96      285.62
Interquartile range     44.67      25.80      50.29
(d) (0.5 points) Identify each of the boxplots with one of the three groups: X = 0, X = 1, X = 2. Justify your choice.
(e) (0.25 points) Identify each of the histograms I), II), III) with one of the boxplots a), b), c) from the previous part. Justify your choice.
(f) (0.5 points) We pick one household from each of the three groups and report their incomes (in thousands of euros). The values are: 51 for X = 0, 62 for X = 1 and 75 for X = 2. Compared to the rest of the incomes in their corresponding groups, which of these three households is the poorest? (Hint: standardize.)
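The hint in part (f) amounts to computing z = (income - group mean) / group standard deviation for each household and comparing the results. A minimal sketch follows, using the averages and standard deviations from the summary statistics for Y. The dictionary layout and names are illustrative assumptions.

```python
# Group averages and standard deviations of Y (thousands of euros),
# copied from the summary statistics, plus the three reported incomes.
groups = {
    0: {"mean": 39.70, "sd": 37.10, "income": 51},
    1: {"mean": 58.93, "sd": 19.31, "income": 62},
    2: {"mean": 248.12, "sd": 51.70, "income": 75},
}

# Standardized income: distance from the group mean in group standard deviations.
z_scores = {x: (g["income"] - g["mean"]) / g["sd"] for x, g in groups.items()}

# The relatively poorest household has the smallest standardized income.
poorest_group = min(z_scores, key=z_scores.get)

for x, z in z_scores.items():
    print(f"X = {x}: z = {z:.2f}")
print("relatively poorest household belongs to group X =", poorest_group)
```

Although 75 is the largest of the three raw incomes, it lies more than three standard deviations below the average of its own group, which is why standardizing changes the ranking.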
2. In a factory, there are two assembly lines, L1 and L2. Of all items produced by L1, 5% are nonconforming, and of all items produced by L2, 10% are nonconforming. A quality control inspector selects items for inspection in the following manner: first, he selects one assembly line at random (L1 with probability 0.4 and L2 with probability 0.6) and then he chooses an item from that line.
(a) (0.5 points) What is the probability that the inspector finds a nonconforming item?
A routine inspection involves selecting 3 items; hence the above selection scheme is repeated independently 3 times.
(b) (0.5 points) Consider the random variable X = “the number of nonconforming items in a routine inspection”. Describe the support of X (the set of all possible values that X can take) and decide if X is a discrete or a continuous r.v.
(c) (0.5 points) Find the probability distribution of X.
(d) (0.5 points) Calculate the expectation of X.
(e) (0.5 points) The factory is penalized 30 euros for every nonconforming item found by the inspector. Consider the r.v. Y = “the total amount of fine paid by the factory after a routine inspection”. Describe the support of Y and calculate the factory’s mean amount of fine.
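One way to check the numbers in Problem 2 is sketched below. It applies the law of total probability for part (a). Because the three selections are independent and each has the same probability of yielding a nonconforming item, it then treats X as a binomial count with n = 3. The variable names are illustrative, not part of the exam.

```python
from math import comb

# Probability of choosing each line, and of a nonconforming item given the line.
p_line = {"L1": 0.4, "L2": 0.6}
p_nonconforming_given_line = {"L1": 0.05, "L2": 0.10}

# (a) Law of total probability.
p = sum(p_line[line] * p_nonconforming_given_line[line] for line in p_line)

# (b), (c) Three independent selections: X takes values 0..3 with binomial probabilities.
n = 3
pmf_x = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

# (d) Expectation of X computed directly from its distribution.
expected_x = sum(k * prob for k, prob in pmf_x.items())

# (e) Y = 30 * X, so its support and mean follow from those of X.
fine_per_item = 30
support_y = [fine_per_item * k for k in pmf_x]
expected_y = fine_per_item * expected_x

print("P(nonconforming) =", round(p, 4))
print("distribution of X:", {k: round(v, 6) for k, v in pmf_x.items()})
print("E[X] =", round(expected_x, 4), " E[Y] =", round(expected_y, 2))
```

The expectation computed from the distribution agrees with the binomial shortcut n * p, and the mean fine follows from linearity of expectation.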
3. A company manufacturing electronic components receives, on average, 3 orders per day. Assume that the number of orders per day follows a Poisson distribution.
(a) (0.75 points) What is the probability that there will be more than 5 orders in a given day?
(b) (0.75 points) What is the probability that there will be two orders in a given hour? Assume that a working day consists of 8 working hours.
(c) (1 point) The company’s policy is to attend to the orders the same day that they are received, even if it involves overtime. This situation typically occurs when there are more than five orders in a day. What is the probability that the employees will have to work overtime at least one day in a given week? Assume that a working week consists of 5 working days.
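The three probabilities in Problem 3 can be sketched numerically as follows. The sketch makes two assumptions beyond the statement: orders arrive at a constant rate over the 8 working hours, so the hourly count is Poisson with mean 3/8, and the daily counts on the 5 working days are independent. The helper `poisson_pmf` is written for this example only.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(N = k) for a Poisson random variable N with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

lam_day = 3.0

# (a) P(more than 5 orders in a day) = 1 - P(at most 5 orders).
p_more_than_5 = 1 - sum(poisson_pmf(k, lam_day) for k in range(6))

# (b) Assumed constant rate over 8 working hours: hourly mean is 3/8.
lam_hour = lam_day / 8
p_two_in_hour = poisson_pmf(2, lam_hour)

# (c) Assumed independent days: overtime on at least one of 5 days
# is the complement of no overtime on all 5 days.
working_days = 5
p_overtime_week = 1 - (1 - p_more_than_5) ** working_days

print("P(more than 5 orders in a day) =", round(p_more_than_5, 4))
print("P(two orders in an hour)       =", round(p_two_in_hour, 4))
print("P(overtime at least one day)   =", round(p_overtime_week, 4))
```

Part (c) could equivalently be phrased as a binomial count of overtime days with n = 5, where only the probability of zero overtime days is needed.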
4. A real estate agency is interested in the house rental prices of a particular city. A simple random sample of 80 families from this city is selected and their monthly rents recorded. Based on this information, the real estate agency produces the following two graphs:
[Figures: "Histogram" of monthly rent (horizontal axis: monthly rent, 200 to 800) and "Normal Q-Q Plot" (horizontal axis: Theoretical Quantiles; vertical axis: Sample Quantiles).]
(a) (0.75 points) Can the monthly rents be described by a Normal probability distribution? Justify your answer.
(b) (1 point) The agency assumes that the monthly rents in the city have a mean of 500 euros and a standard deviation of 100 euros. What is the (approximate) probability that the sum of 80 sampled rents will be between 40000 and 42000 euros?
(c) (0.75 points) Suppose that instead of selecting a simple random sample of 80 families, the survey was conducted with only 25 families. In that case, would it be possible to obtain the (approximate) probability of the mean rent of those 25 families being between 500 and 600 euros? Justify your answer.
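For part (b), the approximation most likely intended is the central limit theorem. With n = 80 independent rents of mean 500 and standard deviation 100, the sum is approximately Normal with mean 80 * 500 and standard deviation 100 * sqrt(80). The sketch below evaluates that approximation using the standard Normal CDF written in terms of `math.erf`. The function and variable names are illustrative.

```python
from math import erf, sqrt

def normal_cdf(z: float) -> float:
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma, n = 500.0, 100.0, 80

# Central limit theorem approximation for the sum of n i.i.d. rents.
mean_sum = n * mu
sd_sum = sigma * sqrt(n)

lower, upper = 40000.0, 42000.0
z_lower = (lower - mean_sum) / sd_sum
z_upper = (upper - mean_sum) / sd_sum
prob_sum_between = normal_cdf(z_upper) - normal_cdf(z_lower)

print("sd of the sum:", round(sd_sum, 2))
print("z-values:", round(z_lower, 3), round(z_upper, 3))
print("approximate probability:", round(prob_sum_between, 4))
```

The lower limit coincides with the mean of the sum, so the answer is the Normal probability between 0 and the upper z-value.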