November 2005 Technical Tip - Easytrieve Plus: Identifying duplicate records

Easytrieve Plus provides a simple technique for identifying duplicate records in a file. Sort the file by the key field, then use the Easytrieve keywords DUPLICATE, FIRST-DUP and LAST-DUP to determine not only whether a record is a duplicate, but also whether it is the first or last duplicate in a set of like-keyed records.

Our annotated test data is as follows:

KEY FIELD   DUPLICATE   FIRST-DUP   LAST-DUP
AAA
BBB         X           X
BBB         X                       X
CCC         X           X
CCC         X
CCC         X                       X
DDD         X           X
DDD         X                       X
EEE
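
For readers who want to check the annotations above away from the mainframe, the following is a minimal Python sketch that models the three flags. It is not Easytrieve code, and it assumes (this is not stated in the Easytrieve documentation quoted here) that each flag can be derived by comparing a record's key with the keys of the records immediately before and after it. The function name dup_flags is hypothetical.

```python
def dup_flags(keys):
    """Model DUPLICATE / FIRST-DUP / LAST-DUP for a list of keys.

    Assumption: the flags depend only on the neighbouring records,
    so the input is expected to be sorted by key.
    """
    rows = []
    for i, key in enumerate(keys):
        same_as_prev = i > 0 and keys[i - 1] == key
        same_as_next = i + 1 < len(keys) and keys[i + 1] == key
        duplicate = same_as_prev or same_as_next      # part of a like-keyed set
        first_dup = same_as_next and not same_as_prev  # opens the set
        last_dup = same_as_prev and not same_as_next   # closes the set
        rows.append((key, duplicate, first_dup, last_dup))
    return rows


# Keys taken from the test data in this article.
keys = ["AAA", "BBB", "BBB", "CCC", "CCC", "CCC", "DDD", "DDD", "EEE"]
flags = dup_flags(keys)

for key, dup, first, last in flags:
    marks = ["X" if flag else " " for flag in (dup, first, last)]
    print(key, *marks)
```

Under that assumption the sketch marks the same records as the annotated table: the middle CCC record is a DUPLICATE but neither FIRST-DUP nor LAST-DUP, and AAA and EEE carry no flags at all.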

The complete JCL with test data and Easytrieve source code is as follows:

//* JOBCARD HERE
//********************************************                       
//*  DEMO USE OF "DUPLICATE", "FIRST-DUP"    *                       
//*  AND "LAST-DUP" KEYWORDS IN EASYTRIEVE   *                       
//********************************************                       
//STEP010 EXEC PGM=EZTPA00                                                      
//MYDATA   DD *                                                                 
AAA 1 OF 1                                                                      
BBB 1 OF 2                                                                      
BBB 2 OF 2                                                                      
CCC 1 OF 3                                                                      
CCC 2 OF 3                                                                      
CCC 3 OF 3                                                                      
DDD 1 OF 2                                                                      
DDD 2 OF 2                                                                      
EEE 1 OF 1                                                                      
//SYSPRINT DD SYSOUT=*                                                          
//SYSIN    DD *                                                                 
FILE MYDATA                                                                     
  A-KEY       1   3   A                                                         
  A-MSG       5  10   A                                                         
                                                                                
* WORKING-STORAGE                                                               
  RECORDTYPE  W  15   A                                                         
                                                                                
JOB INPUT (MYDATA KEY(A-KEY))                                                   
                                                                                
  IF DUPLICATE MYDATA                                                           
     RECORDTYPE = 'DUPLICATE'                                                   
     PRINT MYREPORT                                                             
  END-IF                                                                        
                                                                                
* MUST USE "NOT DUPLICATE" INSTEAD OF "UNIQUE"                                  
  IF NOT DUPLICATE MYDATA                                                       
     RECORDTYPE = 'NOT DUPLICATE'                                               
     PRINT MYREPORT                                                             
  END-IF                                                                        
                                                                                
  IF FIRST-DUP MYDATA                                                           
     RECORDTYPE = 'FIRST-DUP'                                                   
     PRINT MYREPORT                                                             
  END-IF                                                                        
                                                                                
  IF LAST-DUP MYDATA                                                            
     RECORDTYPE = 'LAST-DUP'                                                    
     PRINT MYREPORT                                                             
  END-IF                                                                        
                                                                                
REPORT MYREPORT LINESIZE 80                                                     
  LINE A-KEY A-MSG RECORDTYPE                                                   

Comments:

  • The KEY clause is required.
  • The file should be sorted by the key field. If the file is not sorted, the program will still run to completion, but a key that reappears later in the file will be treated as the first occurrence of that key.
  • There is no UNIQUE keyword: use NOT DUPLICATE instead.
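
The effect of unsorted input described in the comments above can be illustrated with a short Python sketch. It is a model only, not Easytrieve code: it assumes that records are grouped into sets by runs of adjacent equal keys, and the function name classify_runs and the sample keys are hypothetical. Each record gets its most specific label, so a record labelled FIRST-DUP or LAST-DUP would also satisfy DUPLICATE in Easytrieve.

```python
from itertools import groupby


def classify_runs(keys):
    """Label each record by its position within a run of adjacent equal keys.

    Assumption: only adjacent records are compared, so a key that
    reappears after a different key starts a new set.
    """
    labels = []
    for key, run in groupby(keys):
        size = len(list(run))
        for pos in range(size):
            if size == 1:
                labels.append((key, "NOT DUPLICATE"))
            elif pos == 0:
                labels.append((key, "FIRST-DUP"))
            elif pos == size - 1:
                labels.append((key, "LAST-DUP"))
            else:
                labels.append((key, "DUPLICATE"))
    return labels


# Hypothetical keys: the first BBB is separated from the other two by AAA.
unsorted_keys = ["BBB", "AAA", "BBB", "BBB"]

unsorted_labels = classify_runs(unsorted_keys)
sorted_labels = classify_runs(sorted(unsorted_keys))

for key, label in unsorted_labels:
    print("unsorted:", key, label)
for key, label in sorted_labels:
    print("sorted:  ", key, label)
```

In this model the unsorted input reports the leading BBB as NOT DUPLICATE and treats the later pair as a fresh set of two, while the sorted input reports a single set of three BBB records. That is why the sort step matters even though the job completes either way.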

The results of our Easytrieve program are as follows:

  A-KEY     A-MSG        RECORDTYPE
   AAA    1 OF 1       NOT DUPLICATE
   BBB    1 OF 2       DUPLICATE
   BBB    1 OF 2       FIRST-DUP
   BBB    2 OF 2       DUPLICATE
   BBB    2 OF 2       LAST-DUP
   CCC    1 OF 3       DUPLICATE
   CCC    1 OF 3       FIRST-DUP
   CCC    2 OF 3       DUPLICATE
   CCC    3 OF 3       DUPLICATE
   CCC    3 OF 3       LAST-DUP
   DDD    1 OF 2       DUPLICATE
   DDD    1 OF 2       FIRST-DUP
   DDD    2 OF 2       DUPLICATE
   DDD    2 OF 2       LAST-DUP
   EEE    1 OF 1       NOT DUPLICATE

Want to know more about Easytrieve Plus? Give us a call! You can always count on Caliber Data Training for top quality IT training.


Written by Bill Qualls. Copyright © 2005 by Caliber Data Training 800.938.1222