Rehabilitating Microdata for Trends Analysis

Michael J. Levin
Harvard Center for Population and Development Studies
February 23, 2008

Rehabilitating Census Data Sets - February 23, 2008

Types of Activities
1. Developing the dictionary
2. Adding attribute lists
3. Changing geography
4. Adding recodes to assist in trends
5. Reweighting

1. Developing Dictionaries
• This presentation assumes use of CSPro for trends analysis
• Dictionaries in IMPS or CSPro can be updated directly
• Dictionaries from other packages can sometimes be converted
• Dictionaries can be built from metadata (code lists)
• Dictionary items can come from other sources, such as international occupation and industry code lists

2. Adding Attribute Lists
• Various levels of geography: Major and Minor Civil Division, locality, sub-locality
• Groupings: 5, 10 and 15 year age groups
• Ethnic groups, religions, languages, educational levels
• Major and Minor occupation and industry categories
• Housing variables: types of walls and roof, water supply, toilets

3. Changing Geography
• General: trends analysis needs the relationship between current and previous geography, which means recodes
• General: intermediate geography may also be needed if there has been more than one change
• Uganda 2002: added 1991 geography for trends analysis, then post-2002 geography
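
A geography recode of this kind can be sketched as a crosswalk lookup. The example is in Python rather than CSPro, and the district codes, the field names (district, prev_district) and the crosswalk itself are hypothetical, chosen only to illustrate the idea:

```python
# Hypothetical crosswalk: current district code -> code used in the previous census.
# Two current districts (101, 102) were split out of one earlier district (11).
CURRENT_TO_PREVIOUS = {101: 11, 102: 11, 103: 12}


def add_previous_geography(records, crosswalk):
    """Return copies of the records with a 'prev_district' recode added.

    Records whose current code is missing from the crosswalk get None,
    so they can be listed and resolved by hand.
    """
    recoded = []
    for rec in records:
        new_rec = dict(rec)  # leave the input record untouched
        new_rec["prev_district"] = crosswalk.get(rec["district"])
        recoded.append(new_rec)
    return recoded


households = [
    {"hh_id": 1, "district": 101},
    {"hh_id": 2, "district": 103},
    {"hh_id": 3, "district": 999},  # code not in the crosswalk
]
recoded = add_previous_geography(households, CURRENT_TO_PREVIOUS)
unmapped = [r["hh_id"] for r in recoded if r["prev_district"] is None]
```

Where geography changed more than once, the same lookup could be chained through an intermediate crosswalk.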

4. Adding Recodes
• Recodes assist in general, but particularly for simple crosstab systems like CSPro
• Total children ever born (CEB) and children surviving (CS), where male and female CEB and CS are recorded separately
• Employment status recodes: UN style and “Western” style
• Family and household types for HIV/AIDS analysis
• Wealth indices and quintiles from household assets
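
Two of the simpler recodes can be sketched as follows. This is Python rather than CSPro logic, and the function names and the treatment of missing values are assumptions made for illustration:

```python
def age_group_5(age):
    """Recode single years of age into a five-year group label, open-ended at 90+."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age >= 90:
        return "90+"
    low = (age // 5) * 5
    return f"{low:02d}_{low + 4:02d}"


def total_children(male, female):
    """Total children (ever born, or surviving) from separate male and female counts.

    Assumption: if either component is missing (None), the total is left missing
    rather than guessed.
    """
    if male is None or female is None:
        return None
    return male + female


woman = {"age": 17, "ceb_male": 1, "ceb_female": 2}
woman["age5"] = age_group_5(woman["age"])
woman["ceb_total"] = total_children(woman["ceb_male"], woman["ceb_female"])
```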

5. Reweighting
• Needed when census data do not correspond to printed and other reports
• Reasons: data are incomplete because of loss in the field, during processing (duplicates, misplaced geography), or post-processing (storage)
• Method: add appropriate weights
• Examples: Sudan 1983, Ghana 1984, Sierra Leone 2004

Census Editing
• Structure edits
• Duplicate records
• Incomplete records
• Content edits
• Population characteristics
• Housing characteristics
• Death characteristics

Duplicate Records
• About 11,000 households exactly duplicated
• Many others partially duplicated
• Problems with duplicates appearing because of lack of housing records
• CSPro refuses to make tables with duplicates
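
One way to find exactly duplicated households is to group them by their full content. The sketch below is Python, not the procedure used on the census file; the data layout (a household ID mapped to a list of person records) and the field names are hypothetical:

```python
from collections import defaultdict


def find_exact_duplicates(households):
    """Group household IDs whose person records are identical in content and order.

    `households` maps a household ID to a list of person dicts. The ID itself is
    not part of the comparison, so two IDs with the same content count as duplicates.
    """
    groups = defaultdict(list)
    for hh_id, persons in households.items():
        # Build a hashable fingerprint of the whole household.
        key = tuple(tuple(sorted(p.items())) for p in persons)
        groups[key].append(hh_id)
    return [ids for ids in groups.values() if len(ids) > 1]


households = {
    1: [{"age": 40, "sex": 1}, {"age": 38, "sex": 2}],
    2: [{"age": 25, "sex": 2}],
    3: [{"age": 40, "sex": 1}, {"age": 38, "sex": 2}],  # same content as household 1
}
duplicate_groups = find_exact_duplicates(households)
```

Partial duplicates need a looser comparison (for example, matching on a subset of fields) and are not covered by this sketch.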

Content edits – Sierra Leone 2004
• Started with edited file – problems with IMPS keying (ASCII) to ACCESS and then out again
• 5 record types as individual records
• Return to unedited data
• How to resolve unknowns?
• How to resolve inconsistencies?
• UN Editing Handbook … PLUS
• Most editing issues resolved

Reweighting – Sierra Leone 2004
• Obtained edited age and sex by District
• Used CSPro to obtain the same for the data after removing duplicates and incompletes
• Divided one by the other, cell by cell, for weights to add to the data set
• Added weights
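
The cell-by-cell division can be sketched in a few lines of Python. The figures are the all-ages totals for Kailahun, Kenema and Kono quoted in this presentation; the function and variable names are illustrative, and the real weights were computed for every district and age cell:

```python
# Edited (published) totals and counts remaining after removing duplicates
# and incompletes, all ages, for three Eastern Province districts.
published = {"Kailahun": 357175, "Kenema": 490429, "Kono": 334266}
cleaned = {"Kailahun": 315991, "Kenema": 435911, "Kono": 301586}


def cell_weights(published, cleaned):
    """Weight per cell = published count / count remaining in the microdata."""
    weights = {}
    for cell, target in published.items():
        count = cleaned.get(cell, 0)
        if count == 0:
            # No records left to carry the weight; this cell needs manual attention.
            raise ValueError(f"no microdata records for cell {cell!r}")
        weights[cell] = target / count
    return weights


weights = cell_weights(published, cleaned)

# Applying the weight to every remaining record restores the published total.
restored = {cell: cleaned[cell] * w for cell, w in weights.items()}
```

Each record then carries the weight of its own cell, so weighted tabulations reproduce the published marginals for the variables used to build the cells.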

Presented for Illustration only: Edited data by Expert
Table 01. Sierra Leone: population by age group, Eastern and Northern Province districts (Total)

Age     Sierra Leone  Eastern Prov.  Kailahun   Kenema     Kono  Northern Prov.  Bombali   Kambia  Koinadugu  Port Loko  Tonkolili
TOTAL      4,930,532      1,181,870   357,175  490,429  334,266       1,741,926  406,392  270,376    265,683    453,019    346,456
00-04        752,807        184,524    56,779   76,256   51,489         280,483   61,129   46,871     39,427     76,152     56,904
05-09        738,076        170,735    57,574   66,576   46,585         291,243   65,887   48,860     46,917     73,204     56,375
10-14        566,163        123,940    37,638   50,969   35,333         205,235   49,615   29,674     32,471     52,280     41,195
15-19        536,507        127,797    41,911   50,999   34,887         181,856   44,771   25,743     28,196     44,503     38,643
20-24        414,117        100,074    27,856   43,411   28,807         123,977   30,111   18,581     18,458     33,052     23,775
25-29        404,754        106,836    28,680   45,711   32,445         125,559   29,008   19,041     19,263     33,216     25,031
30-34        312,031         78,901    22,173   33,642   23,086         100,764   23,030   15,714     15,913     26,894     19,213
35-39        299,509         77,079    21,582   32,920   22,577         101,146   23,212   15,140     15,066     26,396     21,332
40-44        213,169         49,986    13,969   21,664   14,353          74,867   17,725   10,983     12,411     19,451     14,297
45-49        176,903         43,843    12,477   18,122   13,244          63,538   15,888    8,949      9,745     16,063     12,893
50-54        128,387         29,210     8,567   12,441    8,202          46,830   11,071    6,898      7,576     12,129      9,156
55-59         84,815         18,632     5,149    7,988    5,495          31,039    7,719    4,885      4,204      8,265      5,966
60-64         87,675         19,619     6,175    8,372    5,072          34,195    7,967    5,233      5,362      9,030      6,603
65-69         61,214         13,909     4,192    5,693    4,024          22,961    5,538    3,545      2,942      6,283      4,653
70-74         54,421         12,552     4,170    5,284    3,098          20,867    5,095    3,477      3,000      5,619      3,676
75-79         36,705          9,022     2,955    3,528    2,539          13,288    3,062    2,257      1,617      3,786      2,566
80-84         27,098          6,266     2,252    2,678    1,336          10,751    2,448    1,931      1,520      2,933      1,919
85-89         15,400          3,740     1,343    1,571      826           5,822    1,428    1,124        698      1,575        997
90+           20,781          5,205     1,733    2,604      868           7,505    1,688    1,470        897      2,188      1,262

Data after removal of duplicates and housing omissions
Table 1. P6_SEX and P5_AGE by PROVDIST2 (P6_SEX: Total)

P5_AGE  Sierra Leone  Eastern Prov.  Kailahun   Kenema     Kono  Northern Prov.  Bombali   Kambia  Koinadugu  Port Loko  Tonkolili
Total        4342914        1053488    315991   435911   301586         1542508   358101   238509     229526     405960     310412
00_04         663935         164335     50079    67829    46427          248316    53826    41224      34129      68309      50828
05_09         650256         152005     51000    59062    41943          257482    57928    43095      40454      65487      50518
10_14         498252         110532     33375    45214    31943          182130    43696    26209      28217      46903      37105
15_19         473289         114201     37134    45490    31577          161418    39403    22754      24423      40047      34791
20_24         366571          90058     24797    39073    26188          110579    26593    16601      16050      29812      21523
25_29         363497          97323     25791    41730    29802          113683    25990    17311      16950      30284      23148
30_34         272765          69772     19277    29857    20638           88496    20269    13654      13474      23966      17133
35_39         258809          67379     18775    28435    20169           87700    20180    13046      12831      23298      18345
40_44         186340          44250     12365    19108    12777           65712    15519     9575      10626      17334      12658
45_49         154486          38816     11006    15919    11891           55710    13930     7838       8327      14297      11318
50_54         112871          25876      7559    10946     7371           41493     9882     6070       6494      10807       8240
55_59          74610          16604      4536     7097     4971           27555     6746     4350       3691       7452       5316
60_64          77426          17452      5501     7412     4539           30347     7042     4614       4686       8079       5926
65_69          53940          12376      3716     5054     3606           20407     4906     3110       2544       5666       4181
70_74          48025          11144      3714     4652     2778           18543     4523     3081       2567       5035       3337
75_79          32482           8017      2648     3104     2265           11826     2741     2000       1412       3384       2289
80_84          23855           5534      2002     2360     1172            9485     2156     1694       1314       2591       1730
85_89          13551           3286      1176     1366      744            5158     1284      983        590       1393        908
90+            17954           4528      1540     2203      785            6468     1487     1300        747       1816       1118

Weights based on District, Age and Sex
(P6_SEX: Total)

P5_AGE  Sierra Leone  Eastern Prov.  Kailahun   Kenema     Kono  Northern Prov.  Bombali   Kambia  Koinadugu  Port Loko  Tonkolili
Total       1.135305       1.121864  1.130333 1.125067 1.108360        1.129282 1.134853 1.133609   1.157529   1.115920   1.116117
00_04       1.133856       1.122853  1.133789 1.124239 1.109031        1.129541 1.135678 1.136983   1.155235   1.114816   1.119540
05_09       1.135055       1.123220  1.128902 1.127222 1.110674        1.131120 1.137395 1.133774   1.159762   1.117840   1.115939
10_14       1.136298       1.121304  1.127730 1.127284 1.106127        1.126860 1.135459 1.132206   1.150760   1.114641   1.110228
15_19       1.133572       1.119053  1.128642 1.121104 1.104823        1.126615 1.136233 1.131362   1.154486   1.111269   1.110718
20_24       1.129705       1.111217  1.123362 1.111023 1.100008        1.121162 1.132290 1.119270   1.150031   1.108681   1.104632
25_29       1.113500       1.097747  1.112016 1.095399 1.088685        1.104466 1.116122 1.099936   1.136460   1.096817   1.081346
30_34       1.143955       1.130840  1.150231 1.126771 1.118616        1.138628 1.136218 1.150872   1.181015   1.122173   1.121403
35_39       1.157259       1.143962  1.149507 1.157728 1.119391        1.153318 1.150248 1.160509   1.174188   1.132973   1.162824
40_44       1.143979       1.129627  1.129721 1.133766 1.123347        1.139320 1.142148 1.147050   1.167984   1.122130   1.129483
45_49       1.145107       1.129508  1.133654 1.138388 1.113784        1.140513 1.140560 1.141745   1.170289   1.123522   1.139159
50_54       1.137467       1.128845  1.133351 1.136580 1.112739        1.128624 1.120320 1.136409   1.166615   1.122328   1.111165
55_59       1.136778       1.122139  1.135141 1.125546 1.105411        1.126438 1.144234 1.122989   1.138987   1.109098   1.122272
60_64       1.132372       1.124169  1.122523 1.129520 1.117427        1.126800 1.131355 1.134157   1.144259   1.117713   1.114242
65_69       1.134854       1.123869  1.128095 1.126435 1.115918        1.125153 1.128822 1.139871   1.156447   1.108895   1.112892
70_74       1.133181       1.126346  1.122779 1.135856 1.115191        1.125330 1.126465 1.128530   1.168679   1.115988   1.101588
75_79       1.130010       1.125359  1.115937 1.136598 1.120971        1.123626 1.117111 1.128500   1.145184   1.118794   1.121014
80_84       1.135946       1.132273  1.124875 1.134746 1.139932        1.133474 1.135436 1.139906   1.156773   1.131995   1.109249
85_89       1.136447       1.138162  1.142007 1.150073 1.110215        1.128732 1.112150 1.143438   1.183051   1.130653   1.098018
90+         1.157458       1.149514  1.125325 1.182025 1.105732        1.160328 1.135171 1.130769   1.200803   1.204846   1.128801

Differences in Characteristics Due to Weighting – Demographic
Table. Comparison of Demographic and Social Characteristics before and after adjustment: 2004

Characteristic              New        Old  Difference   Percent
Total                 4,930,530  4,930,205         325    0.0066
  Native              4,839,373  4,840,660      -1,287   -0.0266
  Foreign born           91,157     89,545       1,612    1.8002
Males                 2,391,997  2,391,836         161    0.0067
  Native              2,342,309  2,342,973        -664   -0.0283
  Foreign born           49,688     48,863         825    1.6884
Females               2,538,533  2,538,369         164    0.0065
  Native              2,497,064  2,497,687        -623   -0.0249
  Foreign born           41,469     40,682         787    1.9345
Heads                   821,624    819,827       1,797    0.2192
Never married         1,372,456  1,330,242      42,214    3.1734
NM 10-14                543,149    515,145      28,004    5.4361
Widowed 10-14             6,839      5,990         849   14.1736
Lang: Krio              477,985    466,057      11,928    2.5593
Lang: Mende           1,576,639  1,558,591      18,048    1.1580
Lang: Temne           1,496,594  1,460,500      36,094    2.4713
Ethn: Krio               70,905     70,501         404    0.5730
Ethn: Mende           1,584,076  1,587,230      -3,154   -0.1987
Ethn: Temne           1,570,182  1,568,977       1,205    0.0768
Relig: RC               355,186    354,059       1,127    0.3183
Relig: Sunni Moslem   2,603,567  2,605,692      -2,125   -0.0816

Source: Unpublished data, 2004 Sierra Leone Census
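
The Percent column appears to be the difference expressed as a percentage of the Old value; that reading is inferred from the figures, not stated in the source. A short Python check on two rows of the comparison table (the function name is illustrative):

```python
def percent_difference(new, old):
    """Difference between new and old, as a percentage of the old value."""
    return (new - old) / old * 100.0


# Foreign born (total) and Widowed 10-14, taken from the comparison table.
foreign_born = percent_difference(91157, 89545)
widowed_10_14 = percent_difference(6839, 5990)
```

Rounded to four decimals, these agree with the 1.8002 and 14.1736 shown for those rows.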

Conclusions
• Eventually similar work for inter-censal surveys
• Concatenation of census data
• Concatenation of census and survey data, including DHS, prevalence and other surveys

THANK YOU
