Tag Archives: data dictionary

Create an Excel Data Dictionary with Proc Contents and Proc Export

Sometimes it may be useful to generate a table listing of all the SAS datasets within a folder (the code below uses the previously assigned libref HRA to refer to this folder) — a sort of table of contents — along with some more detail about the fields within those tables. While this information can be output within SAS using a Proc Contents, you may wish to organize this information differently, keep and drop certain variables, and export it in the form of a multi-sheet Excel workbook. The code below shows how a simple data dictionary can be created using Proc Contents and exported to Excel with Proc Export (or alternatively, the Libname statement). Both of these options for exporting data to Excel were discussed in a previous post. Keep in mind that although the code below only selects a very few of the variables available in SAS, you can modify it to select other variables as well. Try it without the keep= option to see the full range of available variables.

Data Dictionary with Proc Export:

PROC CONTENTS DATA=HRA._all_ memtype=data Out=table_listing (keep=memname Nobs name length format crdate);
RUN;

PROC SQL;
Create table directory as
SELECT distinct memname, nobs, crdate
from table_listing;
QUIT;

PROC EXPORT DATA=directory
OUTFILE=”\\path\Data_Dictionary.xls”
DBMS=EXCEL REPLACE;
SHEET=”directory”;
RUN;

PROC EXPORT DATA=table_listing
OUTFILE=”\\path\Data_Dictionary.xls”
DBMS=EXCEL REPLACE;
SHEET=”table_listing”;
RUN;

Alternate Syntax with Libname Statement:

libname myxls “\\path\Data_Dictionary.xls”; /*does not support replace option*/

PROC CONTENTS DATA=HRA._all_ memtype=data Out=myxls.table_listing (keep=memname Nobs name length format crdate);
RUN;

PROC SQL;
Create table myxls.directory as
SELECT distinct memname, nobs, crdate
from myxls.table_listing;
QUIT;

HASUG Meeting Notes: February 2011

The first quarter HASUG Meeting on February 24, 2011, took place at Case Memorial Library in Orange, CT from 10 am -1 pm.

Santosh Bari, a SAS-certified professional currently with eClinical Solutions (a division of Eliassen Group in New London, CT), opened the meeting with his presentation on Proc Report: A Step-by-Step Introduction to Proc Report and Advanced Techniques. Proc Report is a powerful report-generating procedure which combines many of the features of Proc Print, Proc Sort, Proc Means, Proc Freq, and Proc Tabulate. Mr. Bari’s presentation was a very in-depth discussion of proc report options and attributes which included code samples alongside corresponding sample output. He did a very thorough job of presenting the wide array of functionality included in proc report, including more advanced, lesser-known topics such as BREAK BEFORE/AFTER statements, COMPUTE blocks, and PANELS and FLOW options.

Following Mr. Bari, Charles Patridge of ISO Innovative Analytics presented Best Practices: Using SAS Effectively/Efficiently. His presentation, a compilation of a number of popular past topics, included the introduction of an effective naming convention for programs and files along with compelling reasons for creating such a naming system, creation of data dictionaries in Excel with the use of a proc contents-based macro, and central macro autocall libraries. Mr. Patridge used his many years of past consulting experience to argue for spending a little time up-front to organize and name one’s programs and data sets in such a way that makes transparent the order of execution of the programs and the origin of the datasets. When dataset names correspond to the names of the programs which created them, this makes a project self-documenting and easier to hand off to others.