Split dataset using Mainframe SORT utility

OUTFIL can create multiple output data sets from a single input data set. Here are five ways you can use OUTFIL to split a data set into multiple output data sets. All five can be used with SORT, MERGE, or COPY. For illustration, the examples shown here split the input data set into three output data sets, but you can split it into any number of output data sets.

The five methods are SPLIT1R, SPLIT, SPLITBY, STARTREC/ENDREC, and INCLUDE/OMIT with SAVE.

Split Dataset using SPLIT1R

SPLIT1R=n can be used to split the data set into multiple output data sets, each of which receives a contiguous block of records. SPLIT1R=n writes n records to each output data set and writes any extra records to the last output data set.

Here’s an example of SPLIT1R=4 for an input data set with 14 records (record 1 through record 14):

//SPLIT1R EXEC PGM=ICEMAN
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=INPUT1,DISP=SHR
//OUT1 DD DSN=SPLITR1,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA
//OUT2 DD DSN=SPLITR2,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA
//OUT3 DD DSN=SPLITR3,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA
//SYSIN DD *
  SORT FIELDS=(21,5,FS,A)
  OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLIT1R=4
/*

The first four sorted records are written to the OUT1 data set, the second four sorted records are written to the OUT2 data set, the third four sorted records are written to the OUT3 data set, and the remaining two records are also written to the OUT3 data set.

The resulting output data sets would contain the following records:

SPLITR1 (OUT1 DD)
record 1
record 2
record 3
record 4
SPLITR2 (OUT2 DD)
record 5
record 6
record 7
record 8
SPLITR3 (OUT3 DD)
record 9
record 10
record 11
record 12
record 13
record 14

Please note that the records in each output file are contiguous.
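
If you want the contiguous split to come out as evenly as possible, a rough rule of thumb is to set n to the number of input records divided by the number of output data sets, rounded up. For the same 14-record input and three output data sets, that gives SPLIT1R=5. This is only a sketch; adjust the value to your own record count:

  SORT FIELDS=(21,5,FS,A)
  OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLIT1R=5

OUT1 would get the first five sorted records, OUT2 the next five, and OUT3 the remaining four.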

Split Dataset using SPLIT

SPLIT is the easiest way to split datasets into multiple output data sets if you don’t need the records in each output data set to be contiguous. SPLIT can be used to split the records as evenly as possible among the output data sets. SPLIT writes one record to each output data set in rotation.

Here’s an example of SPLIT for an input data set with 14 records:

//SPLIT EXEC PGM=ICEMAN
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=INPUT1,DISP=OLD
//OUT1 DD DSN=SPLIT1,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA
//OUT2 DD DSN=SPLIT2,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA
//OUT3 DD DSN=SPLIT3,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA
//SYSIN DD *
  SORT FIELDS=(21,5,FS,A)
  OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLIT
/*

The first sorted record is written to the OUT1 data set, the second sorted record is written to the OUT2 data set, the third sorted record is written to the OUT3 data set, the fourth sorted record is written to the OUT1 data set, and so on in rotation.

The resulting output data sets would contain the following records:

SPLIT1 (OUT1 DD)
record 1
record 4
record 7
record 10
record 13
SPLIT2 (OUT2 DD)
record 2
record 5
record 8
record 11
record 14
SPLIT3 (OUT3 DD)
record 3
record 6
record 9
record 12

Notice that the records in each output file are not contiguous.
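
As noted at the start, all of these OUTFIL methods also work with MERGE and COPY. If you just want to distribute the records in their original input order, without sorting, a minimal sketch keeps the same DD statements as the SPLIT example above and only changes the control statements:

  OPTION COPY
  OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLIT

With COPY, input record 1 goes to OUT1, input record 2 to OUT2, input record 3 to OUT3, input record 4 to OUT1, and so on in rotation, in the original input order.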

Split Dataset using SPLITBY

SPLITBY=n is another way to split the records evenly among the output data sets if you don’t need the records in each output data set to be contiguous. It is similar to SPLIT, except that it writes n records to each output data set in rotation.

Here’s an example of SPLITBY=10 for an input data set with 53 records:

//SPLITBY EXEC PGM=ICEMAN
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=IN1,DISP=OLD
//OUT1 DD DSN=SPLITBY1,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA
//OUT2 DD DSN=SPLITBY2,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA
//OUT3 DD DSN=SPLITBY3,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA
//SYSIN DD * 
  SORT FIELDS=(21,5,FS,A)
  OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLITBY=10
/*

The first ten sorted records are written to the OUT1 data set, the second ten sorted records are written to the OUT2 data set, the third ten sorted records are written to the OUT3 data set, the fourth ten sorted records are written to the OUT1 data set, and so on in rotation.

The resulting output data sets would contain the following records:

SPLITBY1 (OUT1 DD)
records 1-10
records 31-40
SPLITBY2 (OUT2 DD)
records 11-20
records 41-50
SPLITBY3 (OUT3 DD)
records 21-30
records 51-53

Notice that the records in each output file are not contiguous.
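
Because SPLITBY=n writes n records to each output data set in rotation, SPLITBY=1 should behave the same way as SPLIT, which writes one record at a time in rotation. A quick sketch with the same DD names:

  SORT FIELDS=(21,5,FS,A)
  OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLITBY=1

This would produce the same rotation as the SPLIT example: records 1, 4, 7, ... in OUT1, records 2, 5, 8, ... in OUT2, and records 3, 6, 9, ... in OUT3.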

Split Dataset using STARTREC and ENDREC

STARTREC=n and ENDREC=m can be used to select a sequential range of records to be included in each output data set. STARTREC=n starts processing at the nth record while ENDREC=m ends processing at the mth record.

Here’s an example of STARTREC=n and ENDREC=m:

//RANGE EXEC PGM=ICEMAN
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=INPUT2,DISP=OLD
//FRONT DD DSN=RANGE1,DISP=(NEW,CATLG),
//         SPACE=(CYL,(5,5)),UNIT=SYSDA
//MIDDLE DD DSN=RANGE2,DISP=(NEW,CATLG),
//         SPACE=(CYL,(5,5)),UNIT=SYSDA
//BACK DD  DSN=RANGE3,DISP=(NEW,CATLG),
//         SPACE=(CYL,(5,5)),UNIT=SYSDA
//SYSIN DD *
  OPTION COPY
  OUTFIL FNAMES=FRONT,ENDREC=500
  OUTFIL FNAMES=MIDDLE,STARTREC=501,ENDREC=2205
  OUTFIL FNAMES=BACK,STARTREC=2206
/*

Input record 1 through input record 500 are written to the FRONT data set. Input record 501 through input record 2205 are written to the MIDDLE data set. Input record 2206 through the last input record are written to the BACK data set.

The resulting output data sets would contain the following records:

RANGE1 (FRONT DD)
record 1 
record 2 
...
record 500 
RANGE2 (MIDDLE DD)
record 501 
record 502
...
record 2205
RANGE3 (BACK DD)
record 2206
record 2207
...
last record
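
STARTREC and ENDREC can also be combined with SAVE (covered in the next section) when you only care about one specific range and want everything else collected in a single catch-all data set. A minimal sketch of the control statements, assuming a FRONT DD as above and a hypothetical REST DD for the catch-all data set:

  OPTION COPY
  OUTFIL FNAMES=FRONT,ENDREC=500
  OUTFIL FNAMES=REST,SAVE

Input records 1 through 500 would go to the FRONT data set, and every record not written to FRONT, that is, record 501 through the last record, would go to the REST data set.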

Split Dataset using INCLUDE, OMIT and SAVE

INCLUDE/OMIT and SAVE can be used to select specific records to be included in each output data set. The INCLUDE and OMIT operands provide all of the capabilities of the INCLUDE and OMIT statements, including substring search and bit logic. SAVE can be used to select the records that are not selected for any other subset, eliminating the need to specify complex conditions.

Here’s an example of INCLUDE and SAVE:

//SUBSET EXEC PGM=ICEMAN
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=INPUT3,DISP=OLD
//OUT1 DD DSN=SUBSET1,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA
//OUT2 DD DSN=SUBSET2,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA
//OUT3 DD DSN=SUBSET3,DISP=(NEW,CATLG),
//        SPACE=(CYL,(5,5)),UNIT=SYSDA 
//SYSIN DD *
  OPTION COPY
  OUTFIL INCLUDE=(8,6,CH,EQ,C'ACCTNG'),FNAMES=OUT1
  OUTFIL INCLUDE=(8,6,CH,EQ,C'DVPMNT'),FNAMES=OUT2 
  OUTFIL SAVE,FNAMES=OUT3 
/*

Records with ACCTNG in positions 8-13 are included in the OUT1 data set. Records with DVPMNT in positions 8-13 are included in the OUT2 data set. Records without ACCTNG or DVPMNT in positions 8-13 are written to the OUT3 data set.

So the resulting output data sets might contain the following records:

SUBSET1 (OUT1 DD)
J20 ACCTNG 
X52 ACCTNG
...
SUBSET2 (OUT2 DD)
P16 DVPMNT
A51 DVPMNT
...
SUBSET3 (OUT3 DD)
R27 RESRCH
Q51 ADMIN
...
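
OMIT works the same way as INCLUDE, except that it excludes the records matching the condition. A minimal sketch, assuming the same input and DD statements as the INCLUDE example, that separates the ACCTNG records from everything else:

  OPTION COPY
  OUTFIL OMIT=(8,6,CH,EQ,C'ACCTNG'),FNAMES=OUT1
  OUTFIL SAVE,FNAMES=OUT2

OUT1 would receive every record that does not have ACCTNG in positions 8-13, and OUT2 (through SAVE) would receive the ACCTNG records, since they are not written to any other output data set.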
