Contents
Simple
frequency histogram, to check if need log transformation
Use scan
to extract bits of text, especially numbers and times in a text string
Use
TRANSTRN to replace or remove all occurrences of a substring in a character
string
Get age
from date and date of birth
Convert
text date or time to sas date or time.
Input a
time in text format and convert to SAS time
Extract
info from a variable name
Assign a
value to a macro variable from another macro variable
Check for
misspelt names in a huge dataset with repeated measurements
Get the
values of some variable(s) corresponding to the min or max of another variable
Transpose
the values of the variable Estimate giving the new variables the values in
Effect
How to
output results for log vs raw analyses.
Set up
macro variables to put 2SD or whatever into an estimate statement
Suppress
output in proc mixed in SAS Studio but keep the graph of the residuals
Generate
confidence limits from conf level and SE
Relax
convergence in proc mixed and proc glimmix
Parametric
bootstrap confidence limits for the ICC
If proc
univariate plot does not produce a plot
Import an
excel file in the full SAS package
Import an
excel file in SAS Studio
Print to
an excel file in full SAS program but not Studio
Clear the
listing and log windows in full SAS package
Get
old-fashioned listing-type output in Studio
Make
sounds, useful at the end of long-running program
Send an
email alert to yourself when a long-running program has finished
Check for
misspelt names in a huge dataset with repeated measurements
Simple frequency histogram, to check if need log transformation
ods graphics / reset width=12cm height=8cm;
proc sgplot data=dat0;
histogram PossessionLength;
by PossessionType;
run;
Use scan to extract bits of text, especially numbers and times in a text string
data temp;
RunTime="1:23";
d1=scan(RunTime,1,":");
d2=scan(RunTime,2,":");
TimeInMin=d1*60+d2;
data temp;
RunTime="1:23:45.6";
d1=scan(RunTime,1,":");
d2=scan(RunTime,2,":");
d3=scan(RunTime,3,":");
TimeInSec=d1*60*60+d2*60+d3;
Use TRANSTRN to replace or remove all occurrences of a substring in a character string
if index(text,"Athlete=") then
Name=transtrn(text,"Athlete=",trimn(''));
*otherwise puts a space in front;
NameA=byte(64+NameX); *A,B,… for NameX=1,2,…;
NameA=byte(96+NameX); *a,b,… for NameX=1,2,…;
Get age from date and date of birth
Age=yrdif(DOB,CompetitionDate);
AgeRounded=round(Age,1);
Convert text date or time to sas date or time
options datestyle=dmy; *specifies sequence of
day month year in inputted date data;
data temp;
TodaysDate='14OCT2010'd; *single or double
quotes are OK;
BirthDate=input('11/9/01',anydtdte24.); *this
is 11SEP2001 for datestyle=dmy, but 11NOV2001 for mdy;
CheckThis=input('14Feb2015',anydtdte24.);
Age=yrdif(BirthDate,TodaysDate,'ACT/ACT');
*ACT/ACT is the default and can be left blank;
format TodaysDate BirthDate CheckThis date9.;
proc print;
run;
Input a time in text format and convert to SAS time
data temp;
x="1:23";
TimeInSec1=input(x,ANYDTTME24.); *1 h, 23
min;
TimeInSec2=input("1:23:45.6",ANYDTTME24.);
*1 h, 23 min, 45.6 s;
TimeInSec3="1:23:45.6"t; *single or
double quotes are OK;
proc print;run;
Extract info from a variable name
The example shown was for data in a
spreadsheet, in which there were column headings
of 1/1, 1/2, 1/3, 2/1, 2/2, etc. through
9/3,representing weeks (1 to 9) and sessions (1 to 3 each week).
The code in the data step extracts the number
of the week and the number of the session.
array a _1_1--_9_3;
do over a;
temp=vname(a);
Week=substr(temp,2,1);
Session=substr(temp,4,1);
&dep=a;
output;
end;
drop _1_1--_9_3 temp;
Assign a value to a macro variable from another macro variable
%let predmhalf=%SYSEVALF(0.5*&predm);
Check for misspelt names in a huge dataset with repeated measurements
Go to the last item in this doc.
Get the values of some variable(s) corresponding to the min or max of another variable
proc means noprint data=peak1;
var Perform;
output out=peak2 min=PeakPerform minid(Perform(Age MidPeak))=PeakAge MidPeak;
*output out=peak2 min=PeakPerform minid(Perform(var1 var2…))=;
by Event Sex Competitor;
Transpose the values of the variable Estimate giving the new variables the values in Effect
Proc transpose is always difficult! You have to play with it to get it right.
proc transpose data=solf out=solftr; *prefix=Whatever;
var Estimate;
id Effect;
by PosGrp;
run;
How to output results for log vs raw analyses
%let logflag=1; *or 0, if not using logs;
Then make separate proc prints (or whatever)
with: where &logflag; or with: where &logflag=0;
I also sometimes use this to create a macro
variable &unitsrawlog,
for use in results tables and heading:
data _null_;
if &logflag then call
symput('unitsrawlog',"percent");
else call
symput('unitsrawlog',"raw");
Set up macro variables to put 2SD or whatever into an estimate statement
data _null_;
set meansd;
if Sex="Female" and TeacherRole="Assistant"
then
call
symput('FA2sd',left(trim(round(2*SD, .01))));
Suppress output in proc mixed in SAS Studio but keep the graph of the residuals
*I
stopped using this, preferring to use proc sgplot for residuals vs predicteds
and residuals vs predictors;
N.B. must be preceded by a run;
run;
ods
select none;
ods
select StudentPanel;
proc
mixed data=games plots(only)=StudentPanel(conditional)
alpha=0.1;
class
AthleteID;
model
LnDepVar=.../residual;
random
Intercept/subject=AthleteID;
run;
ods
select all;
Generate confidence limits from conf level and SE
*normal
dist;
lower=estimate+probit(alpha/2)*StdErr;
upper=estimate-probit(alpha/2)*StdErr;
*or;
Lower=estimate-quantile('NORMAL',1-alpha/2)*Stderr;
Upper=estimate+quantile('NORMAL',1-alpha/2)*Stderr;
*chi-squared
dist for residuals;
Lower=DegFree*estimate/CINV(1-alpha/2,DegFree);
Upper=DegFree*estimate/CINV(alpha/2,DegFree);
*tdist;
tValue=tinv(1-&alpha/2,DF);
Lower=estimate-tValue*StdErr;
Upper=estimate+tValue*StdErr;
*same again in one hit:;
Lower=estimate-tinv(1-&alpha/2,DF)*Stderr;
Upper=estimate+tinv(1-&alpha/2,DF)*Stderr;
Relax convergence in proc mixed and proc glimmix
Use when you get
"did not converge" or "too many likelihood evaluations".
When you get it to
work with, say 1E-6, try 1E-5 to check that the answer is practically the same.
If it's not, then you
have to live with the fact that the results may be only approximate.
proc mixed
data=alldat3 covtest cl alpha=&alpha &nob CONVH=1E-6 convf=1E-6;
for proc glimmix it's pconv=1E-6;
Or use even smaller
values than -6. (The default is -8.) I
have sometimes had to resort to -2 with glimmix.
Parametric bootstrap confidence limits for the ICC
data covicc;
merge covbet2(rename=(estimate=AthVar
stderr=AthVarSE) keep=Condition estimate stderr alpha)
covwthn2;
by Condition alpha;
ICC=AthVar/(Athvar+estimate);
*if Group ne "Condition Ruck";
do i=1 to 5000;
AthVarX=AthVar+rannor(0)*AthVarSE;
*if
AthVarX<0 then AthVarX=0;
ICCboot=AthVarX/(AthVarX+estimate*sqrt(cinv(ranuni(0),DF)/DF)); *checked
by simulation;
*ICCboot=AthVarX/(AthVarX+estimate+rannor(0)*StdErr);
output;
end;
keep ICC ICCboot Condition alpha;
proc univariate noprint data=covicc;
var ICCboot ICC;
output out=boot1(drop=d) mean=d ICC
median=ICCboot ICC pctlpts=0.5, 5,95,99.5 pctlpre=CL pctlname=L99 L90 U90 U99;
by Condition;
data boot2;
set boot1;
CLpm90=(CLU90-CLL90)/2;
CLpm99=(CLU99-CLL99)/2;
option ls=90 ps=90;
title4 "Confidence limits via semi-bootstrapping";
title5 "CLpm90 = 90%CL in approx. ±
form, etc. CLL90 = lower 90% conf.
limit, etc. ";
title6 "Unrealistic 99%CL reflects the
inadequate sample size";
proc print noobs data=boot2;
var Condition icc clpm90 clpm99 cll90 clu90
cll99 clu99;
format _numeric_ 5.2;
run;
options ps=52;
If proc univariate plot does not produce a plot
This is relevant to the main SAS package only, I think, not SAS Studio.
ods _ALL_ close;
ods listing;
ods graphics off;
run;
*to turn on html graphics again;
ods html;
ods graphics;
Import an excel file in the full SAS package
*this version allows the xlsx to be open but
not with a cell active;
*Note the extra period before the xls, if you have a macro variable at the
end. You may need extra periods, if you have several macro variables;
*the variable type is determined by the type
in Row 2;
*this does not work with a filename statement
replacing the address info;
*you can mix \ and / (but not in SAS Studio?
all / needed;
*NB with several sheets, if one is Sheet1,
SAS will read that by default if you do not specify the sheet;
PROC IMPORT
DATAFILE="D:\Will's
Documents\Projects\etc\data&year..xlsx"
OUT=teams
DBMS=EXCELCS replace;
*sheet=&sheet;
GUESSINGROWS=N OR MAX; *this works only with
XLSX;
attrib _all_ label="" format=;
*this deletes all annoying labels and misleading nominal formats;
run;
*DBMS=XLSX requires the file to be closed, and if there is a single
non-numeric value (blanks excepted), the variable becomes nominal;
*DBMS=EXCELCS does not work in the latest 9.4, incredibly;
Import an excel file in SAS Studio
*SAS Studio produces most of this code for
you. It does not work in the main SAS package (9.40;
PROC IMPORT DATAFILE=REFFILE
DBMS=XLSX
OUT=whatever
GETNAMES=YES;
*the following are optional statements. See
the documentation for other options;
SHEET="whatever"; *the name of the
spreadsheet on the tab;
GUESSINGROWS=N OR MAX; *for proc import to
guess nominal or numeric and the format;
attrib _all_ label="" format=;
*this deletes all annoying labels and misleading nominal formats;
RUN;
MIXED=NO; *YES makes variable with blanks
into nominal, NO looks for majority of type, maybe!;
*the above seems to have disappeared from the
documentation, and it doesn't work;
You can proc print the file, then copy from the
html file into a spreadsheet.
You can control the output by KEEPing
variables and setting the FORMATs thereof before you proc print.
Here's another solution for Studio (it does
not work in SAS 9.4).
This solution does not allow you to control
the format.
You have to download the resulting Excel file
to your computer.
libname out xlsx "/home/.../TheNameOfTheExcelFile.xlsx";
data out.x;
set whatever; *the dataset you want to save
to an Excel file;
run;
libname out clear;
Print to an excel file in full SAS program but not Studio
/*
This is an older klunky solution. There are
probably better ways now.
Open the file when prompted. The file appears
to be in html format. You have to
View/Gridlines, fill all cells with No fill, and Save as… xlsx.
Note the extra period before the xls, if you
have a macro variable at the end.
A better version is available in the latest
version 9.4, using ods excel.
*/
ods listing close;
ods html
file="D:\Will's Documents\Projects\Patria Hume\Steve
Hollings\athletics2009\Men Track\Curves&gender.&event..xls";
proc print data=curves;
run;
ods html close;
ods listing;
run;
Clear the listing and log windows in full SAS package
dm "out;clear;log;clear;";
Get old-fashioned listing-type output in Studio
*I don't use this;
ods listing file='/folders/myshortcuts/ExternalFiles/Projects/_VU Melbourne/Alireza Esmaeili/Ali Study3/whatever.txt';
*etc as above;
ods listing close;
*You have to open the file in Word and
process it quite a lot.
Make sounds, useful at the end of long-running program
*Alas, does not work in Studio, and no-one in
the SAS communities could help;
*but you can get Studio to send you an email!
See below;
data _null_;
call sound(440,200);
call sound(622,200);
call sound(880,200);
run;
Send an email alert to yourself when a long-running program has finished
*code provided by Cynthia Zender of SAS
Global Academic <academic@sas.com>;
*put this block of code at the end of the
program;
filename doemail email
to='willthekiwi@gmail.com'
subject='Program
is finished';
* cc='first3.last3@email3.com'; *if these two
lines needed, remove semicolons and insert into the above;
* from='willthekiwi@gmail.com' ;
data _null_;
file
doemail;
* put 'Message Line 1.';
* put 'Message Line 2.';
run;
Check for misspelt names in a huge dataset with repeated measurements
*ex NIH/Paul Solberg/Weights study;
*which was modified from more complex read
rowing comps SAS.doc, in Brett Smith rowers/Rowing Competitions;
data
check;
set
dat1;
keep
Athlete Born;
*if
Sex="W";
if
Sex="M"; *ran separately because >2000 athletes if both together;
proc
sort;
by Sex
Athlete;
data
check1;
set
check;
if
lag(Athlete) ne Athlete;
data
check2;
set
check1;
rename
Athlete=AthleteX Born=BornX;
*make
Cartesian product;
proc
sql;
create
table check3 as
select
*
from
check1 as l, check2 as r;
quit;
run;
*where
l.gender=r.gender; *put before semicolon before the quit;
data
check4;
set
check3;
if
0<compged(Athlete,AthleteX)<250;
if
lag(Athlete)=Athlete and lag(AthleteX)=AthleteX then delete;
run;
*proc
print;
run;
data
check5;
set
check4;
length
ret1name1-ret1name2000 ret2name1-ret2name2000 $ 33;
array
a $ ret1name1-ret1name2000;
array
b $ ret2name1-ret2name2000;
retain
ret1name1--ret2name2000;
a(_n_)=Athlete;
b(_n_)=AthleteX;
if
_n_>1 then do;
do i=1 to _n_-1;
if Athlete=b(i) and AthleteX=a(i) then
delete;
end;
end;
drop i
ret1name1-ret1name2000 ret2name1-ret2name2000;
if
lag(Athlete)=Athlete and lag(Athletex)=Athletex then delete;
if
lag2(Athlete)=Athlete and lag2(Athletex)=Athletex then delete;
if
lag3(Athlete)=Athlete and lag3(Athletex)=Athletex then delete;
if
lag4(Athlete)=Athlete and lag4(Athletex)=Athletex then delete;
if lag5(Athlete)=Athlete
and lag5(Athletex)=Athletex then delete;
run;
*the
above might give index out of range error if too many athletes;
options
ls=120;
proc
print data=check5 noobs;
var
Athlete Athletex Born BornX;
format
Athlete Athletex $25.;
where Born=BornX;
run;