Tag Archives: macro

Using an IF Block to Write Dynamic SQL

Sometimes I want a code block to differ slightly depending on the value of a parameter input to a macro. This is easily done using a %if statement with the %str function to add a block of code if a certain condition is met. Some examples are shown below: the first one shows adding a join in the FROM statement if the MARKET parameter = ‘MF’, while the second example adds more fields to the SELECT statement depending on the value of the POST_ELIG_MONTHS parameter. Don’t forget to add a semi-colon after the %if statement. You can use this in many different ways to add flexibility to your code.

Example 1:
%macro cohort_rx(cohort=,market=,index_yr=,post_elig_months=);
proc sql;
create table drv_com_&cohort as
select distinct
a.nhi_individual_id,
a.index_dt
from proj.index_com_&cohort a
join proj.elig_&cohort._com b on a.nhi_individual_id=b.nhi_individual_id
%if %unquote(%str(%’&market%’))=’MF’ %then %str(
join proj.dx_com_&cohort._&market. c on a.nhi_individual_id=c.nhi_individual_id
)
;

where b.months_elig_pre>=3
and b.months_elig_post>=&post_elig_months
and a.index_yr=&index_yr
and a.nhi_individual_id not in(select nhi_individual_id from proj.prior_rx_com_&cohort)
;
quit;
%mend;

Example 2:
proc sql;
create table proj.persistency_&cohort._sum as
select
%unquote(%str(%’&cohort%’)) as cohort
,’Persistent’ as metric
,sum(case when p1=1 then 1 else 0 end) as p1 label=’Month 1′
,sum(case when p2=1 then 1 else 0 end) as p2 label=’Month 2′
,sum(case when p3=1 then 1 else 0 end) as p3 label=’Month 3′
,sum(case when p4=1 then 1 else 0 end) as p4 label=’Month 4′
,sum(case when p5=1 then 1 else 0 end) as p5 label=’Month 5′
,sum(case when p6=1 then 1 else 0 end) as p6 label=’Month 6′
%if &post_elig_months=12 or &post_elig_months=24 %then %str(
,sum(case when p7=1 then 1 else 0 end) as p7 label=’Month 7′
,sum(case when p8=1 then 1 else 0 end) as p8 label=’Month 8′
,sum(case when p9=1 then 1 else 0 end) as p9 label=’Month 9′
,sum(case when p10=1 then 1 else 0 end) as p10 label=’Month 10′
,sum(case when p11=1 then 1 else 0 end) as p11 label=’Month 11′
,sum(case when p12=1 then 1 else 0 end) as p12 label=’Month 12′
)
;
%if &post_elig_months=24 %then %str(
,sum(case when p13=1 then 1 else 0 end) as p13 label=’Month 13′
,sum(case when p14=1 then 1 else 0 end) as p14 label=’Month 14′
,sum(case when p15=1 then 1 else 0 end) as p15 label=’Month 15′
,sum(case when p16=1 then 1 else 0 end) as p16 label=’Month 16′
,sum(case when p17=1 then 1 else 0 end) as p17 label=’Month 17′
,sum(case when p18=1 then 1 else 0 end) as p18 label=’Month 18′
,sum(case when p19=1 then 1 else 0 end) as p19 label=’Month 19′
,sum(case when p20=1 then 1 else 0 end) as p20 label=’Month 20′
,sum(case when p21=1 then 1 else 0 end) as p21 label=’Month 21′
,sum(case when p22=1 then 1 else 0 end) as p22 label=’Month 22′
,sum(case when p23=1 then 1 else 0 end) as p23 label=’Month 23′
,sum(case when p24=1 then 1 else 0 end) as p24 label=’Month 24′
)
;

from proj.persistency_&cohort
group by
1,2
;
quit;

Execute Excel Commands from SAS

There are a few ways to execute Excel commands from SAS, but this method uses DDE (“Dynamic Data Exchange”) to facilitate communication between the two applications. For this example, I want to open 25 multisheet XML workbooks in Excel, call an Excel macro to run for each tab, and then re-save the XML files as XLSX files before closing them. In order to cycle through each of these XML output files, I also use a macro in SAS, which I run using call execute in a data step. Continue reading

Reflexive Joins in SAS with Proc SQL

Sometimes, you may want to use a table twice in the same query. When you join a table to itself*, it is called a reflexive join. In the example below, a common task when calculating admissions-related healthcare quality measures is to exclude transfers from admissions rates. When a patient is transferred from one facility to another within 24 hours of being discharged from the first facility, we do not want to count this as a separate admission. If we did, this would inflate our rates.

The parameterized macro below uses a reflexive join to identify transfers and delete them from the original file (designated by the parameter filename). The table is referenced twice in the FROM clause, first with alias A and then with alias B. We use member identifying information (fields rhmo and bkey) to join the table to itself. We also use the intnx function to calculate when an admission for a single member occurs within 24 hours of the discharge date. Finally, in the WHERE clause we specify that the readmission must have occurred at a different facility (src_sys_prov_id). The unique identifier for the admission (conf_id) is used in the DELETE operation to exclude these transfers from the original file.

%macro transfer(filename=)/STORE SOURCE;
CREATE TABLE &filename._TRANS AS
  SELECT B.conf_id, B.admit_dt, A.disch_dt
     FROM PQI.&filename A
       JOIN PQI.&filename B ON A.rhmo=B.rhmo AND A.bkey=B.bkey
           AND A.disch_dt BETWEEN B.admit_dt AND intnx(‘DAY’,B.admit_dt,-1)
               WHERE A.conf_id <> B.conf_id
AND A.src_sys_prov_id <> B.src_sys_prov_id
;
DELETE
  FROM PQI.&filename
    WHERE conf_ID IN
      (SELECT conf_id FROM &filename._TRANS)
;
%mend transfer;

*Fun grammar fact! “Itself” is a reflexive pronoun, which we use in English when the subject of a verb is also the object. Such a verb is a reflexive verb. In Spanish, these verbs take on an entirely different form than other verbs, while in English, a reflexive verb can usually also be used in a non-reflexive way (example: “I dressed myself” vs. “I dressed my cat”).

Two Interview Questions for SAS Programmers

I have a colleague who asks the same two questions every time he interviews a SAS programming consultant for an opening on his team. His first question has to do with the program data vector (pdv). His logic is that understanding the pdv is central to understanding how SAS processes data. Since he is a data step programmer, I also suspect that he uses this question to weed out programmers who rely extensively on PROC SQL, but lack a basic understanding of how the DATA STEP works. I am certainly guilty of preferring PROC SQL over DATA STEP programming, but I do know about the pdv and agree that it is important to understand how SAS processes data in order to make sure your programs run efficiently. Unlike querying with SQL in a relational database, where you want to use a set-based, non-sequential approach to maximize efficiency, SAS processes data sequentially. Understanding this procedural/iterative processing vs. the familiar set-based approach is important for someone coming from SQL to SAS.

When SAS compiles a data step, it creates the program data vector, which contains the automatic variables _N_ and _ERROR_. These variables are not written to the output data set, but they can be accessed and output to another variable or to the log; _N_ stores the number of times the data step has iterated during execution, while _ERROR_ is a binary variable set to 0 unless an error occurs during execution, in which case it is set to 1. As a data step executes and reads the input file, the current observation is read to the pdv, _N_ increments by 1, and the observation gets written to the output file. Then the program loops back to the top of the data step, and the process iterates until it reaches the end of the file, or until an error occurs. You can use this knowledge of the data step and the automatic variables to debug your programs — for example, you can write an observation containing an error to the log using the following statement:

if _error_=1 then put _infile_;

You can also use _N_ as an observation counter, if you are only reading in one observation for each iteration of the data step (it is important to remember that _N_ actually represents the number of iterations of the data step, which will often equal the number of observations read, but not always).

The second question for the prospective SAS programmer is to see if they can name three system options that are useful for debugging code containing macros. These are MPRINT, MLOGIC, and SYMBOLGEN. SYMBOLGEN automatically resolves macro variables used in submitted statements and prints them to the log; this helps you see at a glance if your macro variables are resolving to the values you expect. MPRINT outputs the full macro statements executed when you call a macro and prints them to the log. MLOGIC helps trace the execution of a macro by writing messages to the log at various stages of execution, so you can see more easily where a macro fails.