Simple Random Sampling with Proc SurveySelect

A common way to create a random sample of n=1000 in SAS is to generate a random number field for each observation using RANUNI or a similar function. The data set is then sorted on that field and the top 1000 selected as the sample.
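For comparison, the manual approach described above might look roughly like the sketch below. It assumes an input data set named Customers (the same one used in the next example); the seed value and the names work.with_rand, work.sample_manual, and rand_key are made up for illustration.

```sas
/* Step 1: attach a uniform random number to every observation. */
/* The seed 12345 is arbitrary; any positive integer works.     */
data work.with_rand;
    set Customers;
    rand_key = ranuni(12345);
run;

/* Step 2: sort on the random key so the row order is random. */
proc sort data=work.with_rand;
    by rand_key;
run;

/* Step 3: keep the first 1000 rows as the sample and drop the helper key. */
data work.sample_manual;
    set work.with_rand(obs=1000);
    drop rand_key;
run;
```

That is three steps and an extra intermediate data set for what is conceptually a single operation.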

PROC SURVEYSELECT offers a simple alternative with just a few lines of code:

proc surveyselect data=Customers
method=srs n=1000 out=SampleSRS;
run;

Setting the METHOD= option to “srs” requests simple random sampling. DATA= specifies the input data set, while OUT= specifies the output data set. N= sets the sample size, and the optional SEED= option supplies a particular seed for the random number generator; otherwise the seed defaults to the time of day from the computer clock. The default printed output reports the seed, the selection probability, and the sampling weight. For simple random sampling these values are the same for every selected observation, so they are not added to the OUT= data set unless the STATS option is specified.
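If the sample needs to be reproducible, a fixed seed can be supplied. The sketch below also adds the STATS option, which, to my understanding, writes the selection probability and sampling weight onto each row of the OUT= data set for equal-probability methods such as SRS. The seed value is arbitrary and chosen only for illustration.

```sas
/* Same sample as before, but repeatable and with per-row weights. */
proc surveyselect data=Customers
    method=srs n=1000
    seed=20240101      /* arbitrary fixed seed: rerunning gives the same sample */
    stats              /* add selection probability and sampling weight to OUT= */
    out=SampleSRS;
run;
```

Rerunning this step against the same input with the same seed should return the same 1000 observations.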

To get a little fancier and try different sample sizes for different markets, I like to set some of the parameters with macro variables:

proc surveyselect data=work.total_elig_pop_&mkt
method=srs n=&size out=ci_share.sample_&mkt._new;
run;
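One way to drive that step for several markets is to wrap it in a small macro, as sketched here. The macro name draw_sample, the market codes east and west, and the sample sizes are all hypothetical. The sketch also assumes the ci_share libref is already assigned and that a work.total_elig_pop_ data set exists for each market.

```sas
/* Hypothetical wrapper: one call per market, each with its own sample size. */
%macro draw_sample(mkt=, size=);
    proc surveyselect data=work.total_elig_pop_&mkt
        method=srs n=&size out=ci_share.sample_&mkt._new;
    run;
%mend draw_sample;

/* Example calls; market codes and sizes are placeholders. */
%draw_sample(mkt=east, size=500)
%draw_sample(mkt=west, size=1000)
```

Note the period in sample_&mkt._new: it marks the end of the macro variable name, so SAS resolves &mkt rather than looking for a variable called mkt_new.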

There are many other options available with PROC SURVEYSELECT which you can use for more complex sampling. Additional SAS survey procedures for analyzing data created using complex sampling methods are discussed in one of my conference papers. For more about macro variables and how they make your code easier to maintain, see 10 Steps to Easier SAS Code Maintenance.