PROC TABULATE using SAS 9.3 and SAS Enterprise Guide

Lewis Purnell, Dootsonic, Manchester, England

Jul 11, 2013

ABSTRACT

This paper examines the TABULATE procedure in SAS and the statements that allow the user to get the most out of it. The three main statements in PROC TABULATE are CLASS, VAR and TABLE. These statements, and the options that can be attached to them, are covered in turn. The paper shows how to produce different outputs with PROC TABULATE, how to improve their appearance and how to understand the different dimensions of table the procedure can produce. After reading it, the reader should be able to use the procedure effectively and understand how each statement behaves. The paper then compares producing PROC TABULATE output in SAS 9.3 and in SAS Enterprise Guide, and discusses the effectiveness of both programs.

Introduction

This paper describes the TABULATE procedure, in both SAS 9.3 and SAS Enterprise Guide, and the options associated with it. The main purpose of PROC TABULATE is to present descriptive statistics in tabular form, based on the information in a data set. The main statements in PROC TABULATE are CLASS, VAR and TABLE. This paper covers what these statements do and how to use them effectively. The syntax for PROC TABULATE is as follows (items in square brackets are optional):

    PROC TABULATE [option(s)];
        CLASS variable(s) [/ option(s)];
        VAR analysis-variable(s) [/ option(s)];
        TABLE [[page-expression,] row-expression,] column-expression [/ table-option(s)];

CLASS

The CLASS statement identifies the classification variables for the table. Class variables determine the categories for which PROC TABULATE calculates statistics: each distinct value of a class variable produces one row or column. Either a character or a numeric variable can be placed in the CLASS statement, because class variables on their own only produce counts and percentages. The example below shows a CLASS statement with an option:

    PROC TABULATE DATA=sashelp.class;
        CLASS weight / MISSING;
        TABLE weight;
    RUN;

Options on the CLASS statement are separated from the variable list by a '/'. This prevents SAS from reading the option as another variable name. The example uses the MISSING option, which tells SAS to treat missing values (if any) as a valid class level instead of skipping those observations, so that counts and percentages reflect every observation. All the options for CLASS are listed below.

CLASS variable / OPTION

OPTION            Description
ASCENDING         Specifies to sort the class variable values in ascending order.
DESCENDING        Specifies to sort the class variable values in descending order.
EXCLUSIVE         Excludes from tables and output data sets all combinations of class variables that are not found in the preloaded range of user-defined formats.
GROUPINTERNAL     Specifies not to apply formats to the class variables when PROC TABULATE groups the values to create combinations of class variables.
MISSING           Considers missing values as valid class variable levels.
MLF               Enables PROC TABULATE to use the format label or labels for a given range or overlapping ranges to create subgroup combinations when a multi-label format is assigned to a class variable.
ORDER=            Specifies the order to group the levels of the class variables in the output.
    DATA          Orders values according to their order in the input data set.
    FORMATTED     Orders values by their ascending formatted values.
    UNFORMATTED   Orders values by their unformatted values (same as PROC SORT).
    FREQ          Orders values by descending frequency count.
PRELOADFMT        Specifies that all formats are preloaded for the class variables.
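As an illustration of how two of these options might be combined, the sketch below orders one class variable by frequency and preloads a user-defined format for another. The format name $sexfmt and its 'U' level are hypothetical and are not part of sashelp.class; PRINTMISS is a TABLE statement option used here on the assumption that levels with no observations should still be displayed.

```sas
/* Hypothetical format: the 'U' level does not occur in sashelp.class */
PROC FORMAT;
    VALUE $sexfmt 'F' = 'Female'
                  'M' = 'Male'
                  'U' = 'Unknown';
RUN;

PROC TABULATE DATA=sashelp.class;
    CLASS age / ORDER=FREQ;      /* most frequent ages listed first */
    CLASS sex / PRELOADFMT;      /* load every level defined by the format */
    FORMAT sex $sexfmt.;
    TABLE age, sex / PRINTMISS;  /* PRINTMISS keeps levels that have no observations */
RUN;
```

PRELOADFMT has no effect on its own; it needs to be paired with PRINTMISS, ORDER=DATA or EXCLUSIVE, which is why PRINTMISS appears on the TABLE statement.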

If there is more than one class variable and each needs different options, the CLASS statement can be used several times. See below for an example.

    PROC TABULATE DATA=sashelp.class;
        CLASS sex / MISSING;
        CLASS height / DESCENDING;
        TABLE sex, height;
    RUN;

It is important to understand that PROC TABULATE supplies a default statistic, N, for class variables. N is the number of observations that fall into each category of the class variable. It appears in the output automatically unless another statistic is requested.
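A minimal sketch of requesting the count explicitly alongside a percentage, assuming the same sashelp.class data set used elsewhere in this paper:

```sas
PROC TABULATE DATA=sashelp.class;
    CLASS sex;
    /* N is the count per category; PCTN is that count as a percentage of all observations */
    TABLE sex * (N PCTN);
RUN;
```

Because only a class variable is involved, no VAR statement is needed for either statistic.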

VAR

The VAR statement identifies the numeric variables to use as analysis variables, that is, the variables for which summary statistics are calculated. It only works with numeric variables. The VAR statement is only needed when statistics other than counts and percentages are required. Using it changes the output of PROC TABULATE. The default statistic for an analysis variable is SUM, which SAS produces automatically unless another statistic is specified. The two examples below show how the output differs when a numeric variable is moved from the CLASS statement to the VAR statement:

    PROC TABULATE DATA=sashelp.class;
        CLASS sex height;
        TABLE sex, height;
    RUN;

    PROC TABULATE DATA=sashelp.class;
        CLASS sex;
        VAR height;
        TABLE sex, height;
    RUN;

The two outputs are distinctly different. In the first example 'height' is in the CLASS statement, so the default statistic N is produced and the output shows how many observations of each sex have each specific height. In the second example 'height' is in the VAR statement, so the SUM statistic is generated automatically and the output shows the combined total of the heights for each value of 'sex'. The VAR statement is therefore useful for obtaining totals of numeric variables.
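The same behaviour can be sketched with more than one analysis variable. This is an illustrative variation, not one of the paper's original examples; it assumes the numeric variables height and weight from sashelp.class.

```sas
PROC TABULATE DATA=sashelp.class;
    CLASS sex;
    VAR height weight;          /* both analysis variables must be numeric */
    TABLE sex, height weight;   /* no statistic is named, so each column shows SUM */
RUN;
```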

TABLE

The TABLE statement specifies the dimensions of the table to be created. A table can have 1, 2 or 3 dimensions, depending on how many comma-separated expressions appear in the TABLE statement. The dimensions must be given in a specific order; referring back to the syntax (square brackets mark the optional dimensions):

    TABLE [[page-expression,] row-expression,] column-expression;

    Page -> Row -> Column

This is the order in which the table is built, and the position of each variable in the statement changes the appearance of the table. It is very important to note that every variable listed in TABLE must also appear in either a CLASS or a VAR statement; otherwise SAS reports an error and does not produce the table. The syntax above shows the form of a 3 dimensional table, but a TABLE statement does not always need all three dimensions.

1 Dimensional Table

To create a 1 dimensional table, only 1 variable needs to be listed in the TABLE statement. A single variable defines the column dimension only, e.g.

    PROC TABULATE DATA=sashelp.class;
        CLASS weight / MISSING;
        TABLE weight;
    RUN;

As only 1 variable is listed in the TABLE statement, the table is one dimensional, with columns for the values of the weight variable only. As TABLE can take up to 3 variables (separated by commas to create the different dimensions), the meaning of each position depends on how many are entered:

3 Variables (separated by commas): Page, Row, Column
2 Variables (separated by commas): Row, Column
1 Variable: Column

2 Dimensional Tables

To create a 2 dimensional table, 2 variables need to be listed in the TABLE statement, separated by a comma. The first defines the rows and the second the columns, as described previously. E.g.

    PROC TABULATE DATA=sashelp.class;
        CLASS weight sex;
        TABLE sex, weight;
    RUN;

In this example the variables sex and weight are separated by a comma, which tells SAS to place them in the row and column dimensions respectively. As only 2 variables are listed, the result is a 2 dimensional table. To see the effect of the order, the two variables can be switched as follows:

    PROC TABULATE DATA=sashelp.class;
        CLASS weight sex;
        TABLE weight, sex;
    RUN;

With a 2 dimensional table SAS always reads the first variable in the TABLE statement as the row and the variable after the comma as the column. In the new output the variables have therefore switched places: weight now defines the rows and sex the columns.

3 Dimensional Table

To create a 3 dimensional table within PROC TABULATE, 3 variables (which must be listed in either the VAR or CLASS statements) must be present in the TABLE statement, separated by commas. The commas are worth noticing, because they let the reader see how many dimensions the table will have before the code is run. The following code is an example of a 3 dimensional table:

    PROC TABULATE DATA=sashelp.class;
        CLASS weight sex height;
        TABLE sex, weight, height;
    RUN;

This code tells SAS to create a page for each unique value of sex, a row for each value of weight and a column for each value of height. As the output image (left) shows, an individual table is created for each unique value of the variable sex, 'F' and 'M' (Female and Male). Each cell displays the number of observations that fall into that combination of categories. For example, the observation with a height of 51.3 has a weight of 50.5, which is indicated by a 1 in the corresponding cell. Because 'sex' is listed first in the TABLE statement, one table is produced for 'F' and one for 'M', as these are the only 2 values of that variable in the data set. If 'sex' and 'weight' were swapped in the TABLE statement, an individual table would be created for each value of weight. This would take longer to process and the results would be much harder to read, so it is important to check the ordering of the TABLE statement before submitting.

    Page -> Row -> Column
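The page, row and column rules can be compared side by side in a single step, since PROC TABULATE accepts more than one TABLE statement. The sketch below is an illustrative variation that uses age as the page variable, on the assumption that a variable with few distinct values keeps the output readable.

```sas
PROC TABULATE DATA=sashelp.class;
    CLASS sex age;
    VAR height;
    TABLE height;            /* 1 expression:  column only       */
    TABLE sex, height;       /* 2 expressions: row, column       */
    TABLE age, sex, height;  /* 3 expressions: page, row, column */
RUN;
```

Each TABLE statement produces its own table, so the three layouts appear one after another in the output.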

Statistics

Statistics can be specified for the variables in the PROC TABULATE procedure. The statistics to choose from include the following:

COLPCTN       PCTSUM    COLPCTSUM
MAX           MIN       ROWPCTN
MEAN          STDDEV    N
STDERR        NMISS     SUM
PAGEPCTSUM    PCTN

Any of these statistics can be used within the procedure to improve the analysis of the variables and create a more descriptive output. They must be requested explicitly in the TABLE statement, and statistics other than counts and percentages require an analysis variable declared in a VAR statement. The example below shows where statistics are requested in the code:

    PROC TABULATE DATA=sashelp.class;
        CLASS sex;
        VAR height;
        TABLE sex, height * (N MEAN MAX);
    RUN;

This code shows how to request more statistics for a particular variable. The requested statistics are placed within parentheses '()' and are linked to the preceding variable by the '*' operator. In this example N, MEAN and MAX are requested for the variable 'height', which creates 3 statistic columns under 'height' in the output.

As the output image (left) shows, the new statistics appear under the 'Height' variable and are calculated for each value of the variable 'sex'. N is the number of observations, MEAN is the average height and MAX is the largest height within each group.
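The same grouping with parentheses can be applied to the variables as well as to the statistics. The following is a sketch of that idea, assuming both height and weight are wanted as analysis variables:

```sas
PROC TABULATE DATA=sashelp.class;
    CLASS sex;
    VAR height weight;
    /* parentheses group the variables so that every statistic is applied to each of them */
    TABLE sex, (height weight) * (MEAN MAX);
RUN;
```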

The previous example shows the statistics in the column dimension only. This is not a requirement: the statistics can be displayed in the rows by changing the code as follows:

    PROC TABULATE DATA=sashelp.class;
        CLASS sex;
        VAR height;
        TABLE sex * (N MEAN MAX), height;
    RUN;

As the output image (right) shows, the statistics have moved to the row dimension, nested under the variable 'sex'. Any of the statistics listed above can be placed within the parentheses following a variable to include them in the output.
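The percentage statistics follow the same pattern. As a hedged illustration using class variables only (so no VAR statement is needed), the sketch below pairs a count with a column percentage:

```sas
PROC TABULATE DATA=sashelp.class;
    CLASS sex age;
    /* N counts each age/sex cell; COLPCTN expresses it as a percentage of the column total */
    TABLE age, sex * (N COLPCTN);
RUN;
```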

Programming in SAS 9.3

Programming in SAS 9.3 is very effective. The software gives the programmer full control, and is aimed at the more experienced programmer as it does not offer short-cuts or interface features to help complete simple tasks. The user interface is plain and minimalistic but effective, as it provides what is needed to program in SAS. The basic interface displays the libraries, file shortcuts etc. for navigation. When a program produces output, a new window opens within the software to display it, and any errors are written to the log. These windows are available as tabs across the bottom of the software (see image above).

Programming in SAS Enterprise Guide

The alternative to using SAS 9.3 is SAS Enterprise Guide. This software allows the user to program in the same language but in a completely different style. The interface is more modern than that of SAS 9.3 and includes many more features, most of which concern how code is produced. SAS Enterprise Guide allows the user to build a task in a 'drag-and-drop' style to obtain the same output as in SAS 9.3 with little to no coding involved. To make the differences clear, the same PROC TABULATE example is created below in both SAS 9.3 and SAS Enterprise Guide, to show how the two programs differ and which is most suitable. SAS Enterprise Guide also has a symbol system that identifies variable types:

[Image: symbol legend. Character variable; Numeric variable; Date variable; "Numeric variable can be placed here"; "Numeric/Character can be placed here"]

SAS 9.3 PROC TABULATE

For the examples within this section the data set 'sashelp.class' provides the data for the output. The data set is as follows:

[Image: the sashelp.class data set]

The aim of this section is to create a more descriptive output for this data set using PROC TABULATE within SAS 9.3. All of the code has to be entered manually, so the syntax of PROC TABULATE has to be looked up before the procedure can be written. The code entered into SAS 9.3 is as follows:

    PROC TABULATE DATA=sashelp.class;
        CLASS sex weight;
        VAR height;
        TABLE sex, weight, height;
    RUN;

As you can see, the code is simple, but it contains no comments and had to be written from scratch.
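For comparison, a commented version of the same step might look like the sketch below. The comment wording is illustrative only; the statements are unchanged from the code above.

```sas
/* Sum of height for every weight, with one page per sex */
PROC TABULATE DATA=sashelp.class;
    CLASS sex weight;            /* classification variables */
    VAR height;                  /* analysis variable, default statistic SUM */
    TABLE sex,                   /* page dimension   */
          weight,                /* row dimension    */
          height;                /* column dimension */
RUN;
```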

SAS Enterprise Guide PROC TABULATE

SAS Enterprise Guide takes a completely different approach to input. For this example the drag-and-drop method is used with the same data set (sashelp.class) to generate the same output as in SAS 9.3. The data set is selected from the Server List (see image right), which is open by default when SAS Enterprise Guide starts. The data set is dragged onto the body of the main page, where it opens so that its contents can be viewed. Once this has completed, 'Process Flow' is selected (top left of the program) and the data set becomes visible there (see below).

Once the data set is visible, the PROC TABULATE output can be created (as in SAS 9.3). In SAS Enterprise Guide this is done with the Summary Tables Wizard, which provides a step by step guide to creating a table that displays the desired information. The wizard is applied using the drag-and-drop method. As the image (left) shows, the Task List needs to be selected. The available tasks are listed by category; this needs to be changed to 'By Name'. To apply the Summary Tables Wizard to the data set sashelp.class, the wizard is dragged and dropped into the body of the program. As there is only 1 data set available, the program automatically links the task to that data set and opens the wizard. If there is more than one data set in the process flow, drag the task onto the data set that is to be processed.

Once the Summary Tables Wizard is open (image below), the user is prompted for how they wish to create their table. The first screen verifies which data set the Summary Tables task (PROC TABULATE) will be linked to. As there is only 1 data set currently in the Process Flow, SAS Enterprise Guide automatically chooses sashelp.class. Once the correct data set is confirmed, the 'Next' button is pressed.

The next screen prompts for the analysis variables to be selected for the table. In SAS 9.3 terms, this is choosing the variables for the VAR statement, which have to be numeric. Variables are selected with the 'Add' button, and a dropdown list sets where each variable should appear in the output (page, row, column or hidden).

As the aim is to create the same output as the SAS 9.3 program, recall that in its TABLE statement the variable 'Sex' was used for the page, 'Weight' for the rows and 'Height' for the columns. The way to do this in SAS Enterprise Guide is as follows: the variable 'height' is selected by using the 'Add' button, which (by default) adds it to the columns, and a preview of the variable (including automatically generated statistics) displays in the Preview section of this window. As this is correct (variable 'height' in columns, as in SAS 9.3), the 'Next' button can be pressed to continue.

The Summary Tables Wizard then asks for any further variables to add to the output. These are entered in a simple format, stating whether each variable is added to the columns, rows or pages. The variable 'weight' is added to the Rows section and the variable 'sex' to the Page section to produce output identical to that in SAS 9.3.

Once the variables have been selected, the next step within the wizard is to specify the totals. As this procedure uses an analysis variable, the SUM statistic is generated automatically, and a grand total (of the observations) can also be generated. This window confirms whether these totals should be shown; they can be hidden (to match the output of this procedure in SAS 9.3) by using the drop down lists. For this example the grand totals are not required, so they are set to 'None'.
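In hand-written PROC TABULATE code, totals of this kind are usually requested with the ALL keyword. The sketch below shows one way a total row could be added; whether the wizard generates exactly this form when totals are switched on is an assumption, not something shown in this paper.

```sas
PROC TABULATE DATA=sashelp.class;
    CLASS sex weight;
    VAR height;
    /* ALL adds a summary row across every weight on each page */
    TABLE sex, weight ALL, height;
RUN;
```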

The next step within the wizard asks whether the user would like to save the created table as an additional output. The equivalent in SAS 9.3 is the OUT= option on the PROC TABULATE statement. This is used when an output data set is needed for further processing.
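In hand-written code, PROC TABULATE writes an output data set through the OUT= option on the PROC statement. A minimal sketch follows; the data set name work.height_by_sex is hypothetical.

```sas
PROC TABULATE DATA=sashelp.class OUT=work.height_by_sex;  /* hypothetical data set name */
    CLASS sex;
    VAR height;
    TABLE sex, height * (N MEAN);
RUN;

/* Inspect the saved results */
PROC PRINT DATA=work.height_by_sex;
RUN;
```

The saved data set can then be passed to any other procedure or DATA step for further processing.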

The final step within the Summary Tables Wizard is naming the output. Within SAS 9.3 this corresponds to writing TITLE and FOOTNOTE statements. These can be left as default. Once the Finish button is pressed on the final screen of the Summary Tables Wizard, SAS Enterprise Guide automatically creates the results and displays them, as shown below.
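A sketch of how a title and footnote might be written by hand around the procedure. The title and footnote text are hypothetical.

```sas
TITLE1 "Height by sex";              /* hypothetical title text */
FOOTNOTE1 "Source: sashelp.class";   /* hypothetical footnote text */

PROC TABULATE DATA=sashelp.class;
    CLASS sex;
    VAR height;
    TABLE sex, height;
RUN;

TITLE;      /* clear the title so it does not carry over to later steps */
FOOTNOTE;   /* clear the footnote */
```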

As you can see, the output produced is identical to the output from SAS 9.3, and no coding was performed in order to achieve this.

[Images: SAS 9.3 Output; Enterprise Guide Output]

Although no code was entered into SAS Enterprise Guide, the program does generate code. The Summary Tables Wizard translated the user's selections into the corresponding code and output. To view the code within SAS Enterprise Guide, the 'Code' tab must be selected at the top of the screen. The following code was automatically created using the Summary Tables Wizard.

SAS Enterprise Guide generated this code automatically, and also generated comments to help the user understand the procedure. These would have had to be entered manually if using SAS 9.3.

A comparison of code for both programs:

SAS 9.3

    PROC TABULATE DATA=sashelp.class;
        CLASS sex weight;
        VAR height;
        TABLE sex, weight, height;
    RUN;

SAS Enterprise Guide

    TITLE;
    TITLE1 "Summary Tables";
    FOOTNOTE;
    FOOTNOTE1 "Generated by the SAS System (&_SASSERVERNAME, &SYSSCPL) on %TRIM(%QSYSFUNC(DATE(), NLDATE20.)) at %TRIM(%SYSFUNC(TIME(), TIMEAMPM12.))";
    /* -------------------------------------------------------------------
       Code generated by SAS Task
       Generated on: 11 July 2013 at 09:19:24
       By task: Summary Tables
       Input Data: SASHELP.CLASS
       Server: Local
       ------------------------------------------------------------------- */
    /* -------------------------------------------------------------------
       Run the tabulate procedure
       ------------------------------------------------------------------- */
    PROC TABULATE DATA=SASHELP.CLASS;
        VAR Height;
        CLASS Weight / ORDER=UNFORMATTED MISSING;
        CLASS Sex / ORDER=UNFORMATTED MISSING;
        TABLE
            /* PAGE Statement */
            Sex,
            /* ROW Statement */
            Weight,
            /* COLUMN Statement */
            (Height * Sum={LABEL="Sum"});
        ;
    RUN;
    /* -------------------------------------------------------------------
       End of task code.
       ------------------------------------------------------------------- */
    RUN; QUIT;
    TITLE; FOOTNOTE;

Although the SAS 9.3 code looks a lot simpler than the SAS Enterprise Guide code, it was all entered manually, whereas in Enterprise Guide the code was generated automatically, with no manual coding required for the same output.
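To make the differences easier to see, the generated step can be written by hand without its titles and comment blocks. This is a sketch of an equivalent step rather than code produced by either program; the statistic label is written in the shorter SUM='Sum' form.

```sas
PROC TABULATE DATA=sashelp.class;
    VAR height;
    CLASS weight / ORDER=UNFORMATTED MISSING;
    CLASS sex    / ORDER=UNFORMATTED MISSING;
    TABLE sex, weight, height * SUM='Sum';   /* SUM is named and labelled explicitly */
RUN;
```

Compared with the manual SAS 9.3 code, the generated version spells out the ordering, the handling of missing values and the statistic, all of which the manual version leaves to the defaults.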

SAS 9.3 vs. SAS Enterprise Guide

If the two programs are compared, SAS Enterprise Guide is the more effective for a novice programmer. Its interface and navigation are simple and effective, and it allows advanced programming without the effort required in SAS 9.3. Since SAS 9.3 can produce the same output as SAS Enterprise Guide, the two programs are equally capable in terms of results; the difference is that the features of SAS Enterprise Guide make for a smoother programming process. As the PROC TABULATE example shows, instead of having to research the syntax for the procedure and type in all the code manually, the drag-and-drop feature offers an easier way to carry out simple steps, such as selecting a data set.

Another very effective feature of SAS Enterprise Guide is the symbol system for variables, which shows the user what type each variable is (character/numeric/date). This helps to prevent errors while programming. SAS 9.3 does not have a feature like this, so the programmer has to review each data set and view the variable attributes to find their types. In addition, SAS Enterprise Guide (as shown in the PROC TABULATE example) previews the output before the task is submitted. This is very informative and saves time that would otherwise be spent submitting incorrect code and repeating the process.

SAS 9.3 is less effective than SAS Enterprise Guide for this kind of task because it does not include the 'simple' features that make the coding easier to implement. SAS 9.3 would be recommended for an advanced programmer, who would not need the features built into Enterprise Guide. Enterprise Guide would be very effective for a novice programmer, as little to no coding is required for the same output.
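In SAS 9.3, one way to review variable attributes in code is PROC CONTENTS. The sketch below is offered as an illustration of that manual check, not as part of the paper's original comparison.

```sas
/* Lists every variable in the data set with its type (Num or Char) and length */
PROC CONTENTS DATA=sashelp.class;
RUN;
```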

REFERENCES

SAS Institute Inc. Base SAS® 9.2 Procedures Guide, Second Edition: TABULATE Procedure. Available: http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000146761.htm

SAS Institute Inc. (2013) Base SAS® 9.3 Procedures Guide, Second Edition: TABULATE Procedure. Available: http://support.sas.com/documentation/cdl/en/proc/65145/HTML/default/viewer.htm#n00yutbvvckjwrn1ldg5xkvjy1pu.htm

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Lewis Purnell
Dootsonic
Manchester M2 3HZ
Phone: 0161 236 0961
Email: [email protected]
Web: http://www.dootsonic.com/

Brand and product names are trademarks of their respective companies.

www.dootsonic.com

Copyright Dootsonic 2013