Sorting is a common requirement in COBOL applications, because sequential processing usually needs its files in ascending or descending key order. The COBOL SORT statement creates a sort file by executing an input procedure or by transferring records from another file, sorts the records in the sort file on a set of specified keys, and in the final phase of the sort operation makes each record available, in sorted order, to an output procedure or an output file.
An internal sort sorts a file within a COBOL program by using the SORT verb. An external sort sorts a file outside the program by using the SORT utility in JCL.
Three files are used in the sorting process in COBOL −
- Input file is the file which we have to sort either in ascending or descending order.
- Work file is used to hold records while the sort process is in progress. Input file records are transferred to the work file for the sorting process. This file must be defined in the FILE SECTION with an SD entry.
- Output file is the file which we get after the sorting process. It is the final output of the Sort verb.
SORT file-name-1 ON ASCENDING KEY rec-key1 [ON DESCENDING KEY rec-key2] USING input-file GIVING output-file.
Note: For file-name-1, SD entry should be defined in FILE SECTION as given below.
SD file-name-1 [RECORD CONTAINS integer CHARACTERS] [DATA RECORD IS file-rec-1].
file-name-1/input-file/output-file – file-name-1 is the sort work file described in the SD entry; input-file and output-file are ordinary files described in FD entries. All three files are opened by the SORT statement itself and closed when the sort completes, so none of them may be open when SORT begins. Input and output files must be PS (physical sequential) files.
- If file-name-1 has more than one record description, the KEY data items need be specified in only one of the record descriptions.
- If file-name-1 contains variable-length records, all of the KEY data items must be contained within the minimum record size specified for file-name-1.
ASCENDING KEY and DESCENDING KEY – Specify whether the records are sorted in ascending or descending sequence. When ASCENDING is specified, the sequence is from the lowest key value to the highest key value. When DESCENDING is specified, the sequence is from the highest key value to the lowest.
rec-key1/rec-key2 – Specifies a KEY data item used for sorting. The data-names following the word KEY are listed from left to right in order of decreasing significance. The leftmost data-name is the major key, the next data-name is the next most significant key, and so forth.
- KEY data items must not contain an OCCURS clause or be subordinate to an item that contains an OCCURS clause.
- If the KEY data item is alphabetic, alphanumeric, alphanumeric-edited or numeric-edited, the sequence of key values depends on the collating sequence used.
- If the KEY is a display floating-point item, the compiler treats the data item as character data of the same size as the key. The sequence in which the records are sorted depends on the collating sequence used.
- If the KEY data item is internal floating point, the sequence of key values will be in numeric order.
- When the COLLATING SEQUENCE phrase is not specified, the key comparisons are performed according to the rules for comparison of operands in a relation condition.
- When the COLLATING SEQUENCE phrase is specified, the indicated collating sequence is used for key data items of alphabetic, alphanumeric, alphanumeric-edited, external floating-point and numeric-edited categories.
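The major/minor key ordering described above can be sketched with a small program. This is only an illustration: the file names (SALES-IN, SALES-OUT, SALES-WORK), the ddnames, and the 20-byte record layout are assumptions made up for the example, not part of any real application.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTKEYS.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SALES-IN   ASSIGN TO SALESIN.
           SELECT SALES-OUT  ASSIGN TO SALESOUT.
           SELECT SALES-WORK ASSIGN TO SORTWK1.
       DATA DIVISION.
       FILE SECTION.
       FD  SALES-IN.
       01  SALES-IN-REC        PIC X(20).
       FD  SALES-OUT.
       01  SALES-OUT-REC       PIC X(20).
      *    Hypothetical layout: the KEY items live in the SD record.
       SD  SALES-WORK.
       01  SALES-WORK-REC.
           05  REGION-WK       PIC X(04).
           05  AMOUNT-WK       PIC 9(07)V99.
           05  FILLER          PIC X(07).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    REGION-WK is the major key (ascending); within one region,
      *    records are ordered by AMOUNT-WK from highest to lowest.
           SORT SALES-WORK
               ON ASCENDING KEY  REGION-WK
               ON DESCENDING KEY AMOUNT-WK
               USING  SALES-IN
               GIVING SALES-OUT.
           STOP RUN.
```

Because REGION-WK is listed first, AMOUNT-WK only decides the order among records that share the same region.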
USING –
Specifies the input files. During the SORT operation, all the records from the input files are transferred to file-name-1. These files must not be open when the SORT statement is executed; SORT opens, reads, and closes them automatically.
All input files must specify sequential or dynamic access mode and be defined in FD entries in the DATA DIVISION. If file-name-1 contains variable-length records, the size of the records contained in the input files must be neither less than the smallest record nor greater than the largest record of file-name-1. If file-name-1 contains fixed-length records, the size of the records contained in the input files must not be greater than the largest record of file-name-1.
Optional Parameters
DUPLICATES phrase – Two or more records are duplicates when all of their keys are equal. The DUPLICATES phrase specifies the order in which such records are returned.
If the DUPLICATES phrase is specified and duplicates exist, the records are returned in the order of the associated input files as specified in the SORT statement and, within one file, in the order in which they were read (or, with an input procedure, in the order in which they were released). If the DUPLICATES phrase is not specified, the order of these records is undefined.
COLLATING SEQUENCE phrase – Specifies the collating sequence to be used in alphanumeric comparisons for the KEY data items in this sort operation. When the COLLATING SEQUENCE phrase is specified, the indicated collating sequence is used to compare key data items of the alphabetic, alphanumeric, alphanumeric-edited, external floating-point, and numeric-edited categories.
When the COLLATING SEQUENCE phrase is not specified, the key comparisons are performed according to the rules for the comparison of operands. The COLLATING SEQUENCE phrase has no effect for keys that are not alphabetic or alphanumeric.
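As a rough sketch of how the COLLATING SEQUENCE phrase is wired up, the program below names an alphabet in SPECIAL-NAMES and refers to it in the SORT statement. The alphabet-name ASCII-ORDER, the PART-* files, and the 8-byte record are assumptions chosen for illustration. On an EBCDIC system, where digits natively collate after letters, sorting with STANDARD-1 (ASCII) places digits before letters instead.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTCOLL.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
      *    STANDARD-1 selects the ASCII collating order.
      *    ASCII-ORDER is a hypothetical alphabet-name.
           ALPHABET ASCII-ORDER IS STANDARD-1.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT PART-IN   ASSIGN TO PARTIN.
           SELECT PART-OUT  ASSIGN TO PARTOUT.
           SELECT PART-WORK ASSIGN TO SORTWK1.
       DATA DIVISION.
       FILE SECTION.
       FD  PART-IN.
       01  PART-IN-REC         PIC X(08).
       FD  PART-OUT.
       01  PART-OUT-REC        PIC X(08).
       SD  PART-WORK.
       01  PART-WORK-REC.
           05  PART-CODE-WK    PIC X(08).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    PART-CODE-WK is alphanumeric, so the named collating
      *    sequence decides how its values compare.
           SORT PART-WORK
               ON ASCENDING KEY PART-CODE-WK
               COLLATING SEQUENCE IS ASCII-ORDER
               USING  PART-IN
               GIVING PART-OUT.
           STOP RUN.
```

Had the key been a numeric item such as PIC 9(05) COMP-3, the phrase would not change the result, since numeric keys are compared by value.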
INPUT PROCEDURE phrase – Specifies the name of a procedure used to select or modify input records before the sorting operation begins.
OUTPUT PROCEDURE phrase – Specifies the name of a procedure used to select or modify output records from the sort operation. The OUTPUT PROCEDURE can consist of any statements needed to select, modify or copy the records that are made available by the RETURN statement in sorted order from the file referenced by file-name-1.
COBOL SORT performs the following operations
- Opens work-file in I-O mode, input-file in the INPUT mode and output-file in the OUTPUT mode.
- Transfers the records present in the input-file to the work-file.
- Sorts the work-file in ascending/descending sequence by rec-key.
- Transfers the sorted records from the work-file to the output-file.
- Closes the input-file and the output-file and deletes the work-file.
COBOL SORT RULES
- Never open or close the sort work-file.
- Define the sort work-file with an SD entry (not an FD).
- Do not open the source file or destination file prior to the SORT (when using SORT with USING/GIVING option). They must be closed when the SORT statement executes.
- The sort work-file must be assigned to disk, a direct access storage device.
- Use RETURN and RELEASE when referencing the sort work-file (when using the INPUT/OUTPUT procedures method).
- Neither an INPUT nor an OUTPUT procedure may contain a SORT statement.
- Neither an INPUT procedure nor an OUTPUT procedure may reference a paragraph or section outside the procedure.
- To exit a section, branch to its last paragraph using a GO TO statement. That last paragraph may only contain an EXIT statement.
- Apart from this exit technique in INPUT/OUTPUT procedures, avoid GO TO. The language permits it anywhere in the PROCEDURE DIVISION, but structured PERFORM logic is easier to follow.
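The section-exit rule in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the LOAD-RECORDS section, the EMP-* files, the ddnames, and the "skip blank names" filter are all invented for the example. The input procedure is coded as a section whose last paragraph contains only EXIT, and end-of-file branches to that paragraph with GO TO.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTSECT.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT EMP-IN   ASSIGN TO EMPIN.
           SELECT EMP-OUT  ASSIGN TO EMPOUT.
           SELECT EMP-WORK ASSIGN TO SORTWK1.
       DATA DIVISION.
       FILE SECTION.
       FD  EMP-IN.
       01  EMP-IN-REC.
           05  EMP-ID-IN       PIC 9(05).
           05  EMP-NAME-IN     PIC X(10).
       FD  EMP-OUT.
       01  EMP-OUT-REC         PIC X(15).
       SD  EMP-WORK.
       01  EMP-WORK-REC.
           05  EMP-ID-WK       PIC 9(05).
           05  EMP-NAME-WK     PIC X(10).
       PROCEDURE DIVISION.
       MAIN-LOGIC SECTION.
       MAIN-PARA.
      *    EMP-OUT is opened and closed by SORT (GIVING);
      *    EMP-IN is handled by the input procedure itself.
           SORT EMP-WORK
               ON ASCENDING KEY EMP-ID-WK
               INPUT PROCEDURE IS LOAD-RECORDS
               GIVING EMP-OUT.
           STOP RUN.

       LOAD-RECORDS SECTION.
       LOAD-OPEN.
           OPEN INPUT EMP-IN.
       LOAD-READ.
           READ EMP-IN
               AT END
                   CLOSE EMP-IN
                   GO TO LOAD-EXIT
           END-READ.
      *    Skip blank-name records; release the rest to the sort.
           IF EMP-NAME-IN NOT = SPACES
               RELEASE EMP-WORK-REC FROM EMP-IN-REC
           END-IF.
           GO TO LOAD-READ.
      *    Last paragraph of the section: nothing but EXIT.
       LOAD-EXIT.
           EXIT.
```

Control returns to the SORT statement when LOAD-EXIT is reached, and only then does the actual sorting begin.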
Example of simple SORT
           SORT EMPLOYEE-WORK
               ON ASCENDING KEY EMPLOYEE-ID-WK
               USING  EMPLOYEE-INP
               GIVING EMPLOYEE-OUT.
The FILE-CONTROL paragraph should have a definition of all three files
       FILE-CONTROL.
           SELECT EMPLOYEE-INP  ASSIGN TO EMPINP.
           SELECT EMPLOYEE-OUT  ASSIGN TO EMPOUT.
           SELECT EMPLOYEE-WORK ASSIGN TO WORKFL.
The file section should have details of all three files
       FILE SECTION.
       FD  EMPLOYEE-INP.
       01  EMPLOYEE-INPUT.
           05  EMPLOYEE-ID             PIC 9(05).
           05  EMPLOYEE-NAME           PIC X(10).
           05  EMPLOYEE-ADDRESS        PIC X(100).
           05  EMPLOYEE-SALARY         PIC 9(9)V99.
      *    FD name must match the SELECT name in FILE-CONTROL.
       FD  EMPLOYEE-OUT.
       01  EMP-OUTPUT.
           05  EMPLOYEE-ID-OUT         PIC 9(05).
           05  EMPLOYEE-NAME-OUT       PIC X(10).
           05  EMPLOYEE-ADDRESS-OUT    PIC X(100).
           05  EMPLOYEE-SALARY-OUT     PIC 9(9)V99.
      *    Sort work file: SD entry; the record name differs from
      *    the file name so that SORT EMPLOYEE-WORK is unambiguous.
       SD  EMPLOYEE-WORK.
       01  EMPLOYEE-WORK-REC.
           05  EMPLOYEE-ID-WK          PIC 9(05).
           05  EMPLOYEE-NAME-WK        PIC X(10).
           05  EMPLOYEE-ADDRESS-WK     PIC X(100).
           05  EMPLOYEE-SALARY-WK      PIC 9(9)V99.
Example with Duplicates and Procedures
           SORT SORT-FILE
               ON ASCENDING KEY  SORT-KEY-1
               ON DESCENDING KEY SORT-KEY-2
               WITH DUPLICATES IN ORDER
               INPUT PROCEDURE  IS GET-RECORDS
               OUTPUT PROCEDURE IS PUT-RECORDS.
GET-RECORDS and PUT-RECORDS are two separate procedures. GET-RECORDS, the input procedure, reads the input records and passes each one to the sort with RELEASE; PUT-RECORDS, the output procedure, retrieves the sorted records with RETURN and writes them out.
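One possible shape for these two procedures is sketched below. The source does not show their bodies, so everything beyond the SORT statement itself is an assumption: the IN-FILE and OUT-FILE names, the ddnames, the 20-byte layout, and the end-of-file switches are hypothetical. The sketch uses structured PERFORM loops, with RELEASE in the input procedure and RETURN in the output procedure.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTPROC.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-FILE   ASSIGN TO INFILE.
           SELECT OUT-FILE  ASSIGN TO OUTFILE.
           SELECT SORT-FILE ASSIGN TO SORTWK1.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC              PIC X(20).
       FD  OUT-FILE.
       01  OUT-REC             PIC X(20).
       SD  SORT-FILE.
       01  SORT-REC.
           05  SORT-KEY-1      PIC X(05).
           05  SORT-KEY-2      PIC 9(05).
           05  FILLER          PIC X(10).
       WORKING-STORAGE SECTION.
       01  WS-IN-EOF           PIC X VALUE 'N'.
       01  WS-SORT-EOF         PIC X VALUE 'N'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           SORT SORT-FILE
               ON ASCENDING KEY  SORT-KEY-1
               ON DESCENDING KEY SORT-KEY-2
               WITH DUPLICATES IN ORDER
               INPUT PROCEDURE  IS GET-RECORDS
               OUTPUT PROCEDURE IS PUT-RECORDS.
           STOP RUN.

      *    Input procedure: read each record, RELEASE it to the sort.
       GET-RECORDS.
           OPEN INPUT IN-FILE.
           PERFORM UNTIL WS-IN-EOF = 'Y'
               READ IN-FILE
                   AT END
                       MOVE 'Y' TO WS-IN-EOF
                   NOT AT END
                       RELEASE SORT-REC FROM IN-REC
               END-READ
           END-PERFORM.
           CLOSE IN-FILE.

      *    Output procedure: RETURN sorted records and write them out.
       PUT-RECORDS.
           OPEN OUTPUT OUT-FILE.
           PERFORM UNTIL WS-SORT-EOF = 'Y'
               RETURN SORT-FILE
                   AT END
                       MOVE 'Y' TO WS-SORT-EOF
                   NOT AT END
                       WRITE OUT-REC FROM SORT-REC
               END-RETURN
           END-PERFORM.
           CLOSE OUT-FILE.
```

Note that the program opens and closes IN-FILE and OUT-FILE itself, because no USING or GIVING phrase is present, while SORT-FILE is never opened or closed explicitly.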