Compression is a method of reducing the storage space that data sets require. .... multi-table PROC SQL query, they may be unnecessar
SUGI 31
Posters
Paper 131-31
Using
) as
  select *
  from Scores(drop=A1 A3);
quit;
Partial display of the SCORES
drop=subject_no center) as
  select *
  from Tx T1, Scores T2
  where T1.subject_no=input(substr(T2.subject_id,5),8.)
    and T1.center=input(substr(T2.subject_id,1,3),8.);
quit;
These fields are not needed in the output
; run;

NOTE: The data set WORK.TX has 3000 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds
data Scores(label='Recorded Scores at Visits' drop=j c);
  /* B fields made intentionally longer than they needed to be */
  length Subject_ID $8 Visit 8 A1-A10 8 B1-B10 $20;
  array A[10];
  array B[10];
  do c=1 to 3000;
    if c le 1500 then subject_id=compress('100-'||put(c,z4.));
    else subject_id=compress('200-'||put(c,z4.));
    do Visit=1 to 20;
      do j=1 to 10;
        a[j]=j*ranuni(j);
        b[j]=left(put(.5+j**2*ranuni(j),7.3));
      end;
      output;
    end;
  end;
  format a: 5.2;
run;

NOTE: The data set WORK.SCORES has 60000 observations and 22 variables.
NOTE: DATA statement used (Total process time):
      real time           1.11 seconds
      cpu time            1.10 seconds
data Surgery;
  length SID $8 Visit 8 case 8;
  do k=5 to 3000 by 197;
    if k lt 1500 then SID=compress('100-'||put(k,z4.));
    else SID=compress('200-'||put(k,z4.));
    Visit=10+int(10*ranuni(971156)); /* only one visit per subject */
    case+1;
    drop k;
    output;
  end;
run;

NOTE: The data set WORK.SURGERY has 16 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds