February 2005 Technical Tip SAS: Multiple Records from a Single Observation

My task of writing these monthly articles was made easier this month as I received a request from a former student. Kevin emailed the following question to me:

Hello Bill,

Hope this message finds you well. I have a question for you.
I received a text file which is at customer level: each
customer with multiple accounts is a line as the following:

   ID  Acct.  Bal.   Acct.  Bal.  Acct.  Bal
   01   11    $20    12    $30    13    $35
   02   21    $15    22    $25   

My question is how I can turn the data into an account level
layout so each account is a line like the following:

   ID  Acct.  Bal.
   01  11    $20
   01  12    $30
   01  13    $35
   02  21    $15
   02  22    $25

My goal is to eventually use SAS code to calculate average
and total balances, etc. on this data set.  Please help.  

Thanks very much,
Kevin

It's not often I receive a request complete with test data and expected output! Gotta love it! I put Kevin's test data into file C:\TEMP\ACCOUNTS.TXT as follows. I added a couple of records because I wanted to make sure the logic worked for id's with only one account:

01   11    $20    12    $30    13    $35
02   21    $15    22    $25 
03   31    $30
04   41    $10    42    $20    43    $30    44    $40    45    $50
05   51    $15

Here is the SAS source code:

data accounts;
   infile "c:\temp\accounts.txt" missover;
   input id $ acct @;

   do while (acct ne .);
      input bal : dollar. @;
      output;
      input acct @;
   end;

   format bal dollar.;

proc print data=accounts;

run;

Here is the output from PROC PRINT just as Kevin asked:

     The SAS System

Obs    id    acct    bal

 1     01     11     $20
 2     01     12     $30
 3     01     13     $35
 4     02     21     $15
 5     02     22     $25

Space is limited but a few quick comments are appropriate:

  • The MISSOVER option on the INFILE statement tells SAS that any attempt to read beyond the end of the record should result in missing values rather than going to the next record.
  • The single trailing at sign (@) on the INPUT statements tells SAS to allow multiple INPUT statements for the same record within a single iteration through the DATA step. I like to think of this as "Wait! I'm not done with this record!" (The single trailing at sign (@) releases a record when control returns to the top of the DATA step.)
  • (acct ne .) is read as "acct is not a missing value"
  • The dollar sign following id on the first INPUT statement tells SAS to treat id as a character field: I am assuming from Kevin's letter that leading zeroes on the id are significant.
  • The colon following bal on the second INPUT tells SAS that this is formatted; the dollar. informat tells SAS to treat it as a number with a leading dollar sign.
  • The OUTPUT statement writes a record. Normally SAS will automatically output a record for each iteration through the DATA step. But if an OUTPUT statement appears anywhere in the DATA step, a record is written only when an OUTPUT statement occurs.

What about Kevin's stated goal "to eventually use SAS code to calculate average and total balances, etc. on this data set"? Well, you'll have to tune in next month for that! Want to know more about SAS? Give us a call! Mainframe or PC, you can count on Caliber Data Training for top quality education.


Go to the articles index. Written by Bill Qualls. Copyright 2005 by Caliber Data Training 800.938.1222