FILE ALLOCATION is a key concept in COBOL programming. A file is a collection of data related to a set of entities and typically resides on magnetic tape or disk. In the mainframe environment, a sequential file is referred to as a PS (Physical Sequential) data set. Data in a file is organized as records, and each record is divided into fields that hold individual items of information.
File Allocation Syntax
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT [OPTIONAL] file-name ASSIGN TO DDNAME
        [ORGANIZATION IS {SEQUENTIAL/INDEXED/RELATIVE}]
        [ACCESS IS {SEQUENTIAL/RANDOM/DYNAMIC}]
        [RECORD KEY IS key-name]
        [RELATIVE KEY IS rel-key]
        [ALTERNATE RECORD KEY IS alt-key [WITH DUPLICATES]]
        [FILE STATUS IS file-status]
        [RESERVE number AREA].
DATA DIVISION.
FILE SECTION.
FD  file-name.
OPTIONAL clause
- When coded, it indicates that the file need not be present when the program runs.
- This clause can be coded only for files opened in INPUT, I-O or EXTEND mode.
- If OPTIONAL is not coded, the input file is expected to be present in the JCL; otherwise there will be an execution error.
- If OPTIONAL is coded and the file is not mapped in the JCL, the file is considered empty and the first read results in end-of-file.
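The behaviour described above can be sketched as a small program. This is a minimal illustration, not a definitive implementation: the names TRAN-FILE, TRANFILE, WS-TRAN-STATUS and WS-REC-COUNT are hypothetical, and the sketch assumes the compiler reports file status '05' when an optional file is missing at OPEN time.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. OPTDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    TRANFILE is a hypothetical DD name that may be absent.
           SELECT OPTIONAL TRAN-FILE ASSIGN TO TRANFILE
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-TRAN-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  TRAN-FILE.
       01  TRAN-REC                PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-TRAN-STATUS          PIC X(02).
       01  WS-REC-COUNT            PIC 9(05) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT TRAN-FILE
      *    '00' = opened, '05' = optional file not present (assumed).
           IF WS-TRAN-STATUS NOT = '00'
              AND WS-TRAN-STATUS NOT = '05'
               DISPLAY 'OPEN FAILED, STATUS: ' WS-TRAN-STATUS
               STOP RUN
           END-IF
      *    With a missing optional file the first READ returns
      *    end-of-file, so the loop body never runs.
           READ TRAN-FILE
           PERFORM UNTIL WS-TRAN-STATUS NOT = '00'
               ADD 1 TO WS-REC-COUNT
               READ TRAN-FILE
           END-PERFORM
           CLOSE TRAN-FILE
           DISPLAY 'RECORDS READ: ' WS-REC-COUNT
           STOP RUN.
```

Driving the loop from the file status rather than from an AT END phrase means that any unexpected read error also ends the loop instead of repeating forever.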
ASSIGN TO clause
- The ASSIGN TO clause associates the name of the file in a program with the actual external name of the data file.
- file-name in the syntax above is the logical name used in the COBOL program, and DDNAME is the logical name in the JCL that is mapped to the mainframe dataset.
- The assignment name coded after ASSIGN TO can be formed by prefixing DDNAME with ‘UT-S-’ to indicate a PS (QSAM) file or ‘UT-AS-’ to indicate an ESDS file; no prefix indicates a KSDS/RRDS file.
JCL:
//INPUTFL DD DSN=DEPT1.EMP1.DATA1,DISP=SHR
COBOL:
SELECT INFILE ASSIGN TO UT-S-INPUTFL.
ORGANIZATION clause
- The ORGANIZATION clause specifies the logical structure of the file. This is established when the file is first created and cannot be changed afterwards.
- It can be coded as SEQUENTIAL (for a PS or ESDS file), INDEXED (for a KSDS file) or RELATIVE (for an RRDS file).
- If ORGANIZATION is not coded, SEQUENTIAL organization is implied.
ACCESS clause
- The ACCESS clause is used to specify the order in which records are read or written.
- It can be coded as SEQUENTIAL, RANDOM or DYNAMIC. If ACCESS is not coded, SEQUENTIAL is implied.
ACCESS IS SEQUENTIAL
- The records are accessed sequentially according to the organization of the file.
- With ORGANIZATION IS SEQUENTIAL, records are accessed in the sequence established when the file was created or extended.
- With ORGANIZATION IS INDEXED, records are accessed in ascending sequence of record key values, according to the collating sequence of the file.
- With ORGANIZATION IS RELATIVE, records are accessed in ascending sequence of the relative record numbers of the records that exist in the file.
- In sequential access, to read the 50th record, the first 49 records must be read and skipped.
ACCESS IS RANDOM
- The records can be accessed randomly using the primary/alternate key of an INDEXED file or the relative record number of a RELATIVE file.
- It can be used ONLY with INDEXED and RELATIVE file organization.
- Example: In an INDEXED file, the 50th record can be read after its address is obtained from the index. In a RELATIVE file, the 50th record can be read directly using its relative record number.
ACCESS IS DYNAMIC
- This is a mixed mode of access. The records can be accessed in random as well as sequential order.
- It can be used ONLY with INDEXED and RELATIVE file organization.
- Example: To read records 100 to 150, first position on the 100th record randomly and then read sequentially up to the 150th record. The START and READ NEXT statements can be used to achieve this.
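The START and READ NEXT approach mentioned above can be sketched for an indexed file as follows. This is an illustrative sketch under stated assumptions: EMP-FILE, EMPFILE, EMP-ID, EMP-NAME and WS-EMP-STATUS are hypothetical names, and the range is expressed as key values '00100' to '00150' rather than record positions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DYNDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    EMPFILE and the record layout are hypothetical.
           SELECT EMP-FILE ASSIGN TO EMPFILE
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS EMP-ID
               FILE STATUS IS WS-EMP-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  EMP-FILE.
       01  EMP-REC.
           05  EMP-ID              PIC X(05).
           05  EMP-NAME            PIC X(20).
       WORKING-STORAGE SECTION.
       01  WS-EMP-STATUS           PIC X(02).
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT EMP-FILE
           IF WS-EMP-STATUS NOT = '00'
               DISPLAY 'OPEN FAILED, STATUS: ' WS-EMP-STATUS
               STOP RUN
           END-IF
      *    Random step: position at the first key >= '00100'.
           MOVE '00100' TO EMP-ID
           START EMP-FILE KEY IS NOT LESS THAN EMP-ID
           IF WS-EMP-STATUS = '00'
      *        Sequential step: read forward until key > '00150'
      *        or until the file status is no longer '00'.
               READ EMP-FILE NEXT RECORD
               PERFORM UNTIL WS-EMP-STATUS NOT = '00'
                          OR EMP-ID > '00150'
                   DISPLAY EMP-ID ' ' EMP-NAME
                   READ EMP-FILE NEXT RECORD
               END-PERFORM
           END-IF
           CLOSE EMP-FILE
           STOP RUN.
```

START only positions the file; it does not return a record, which is why a READ NEXT follows it before the loop begins.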
RECORD KEY clause
- The RECORD KEY clause specifies the primary key of an INDEXED file (i.e. a KSDS file).
- The value in the primary key data item must be unique among the records, and the key item must be part of the record structure of the indexed file.
- It can be used ONLY with INDEXED file organization.
RELATIVE KEY clause
- The RELATIVE KEY is used to specify the data name that specifies the relative record number for a specific logical record within a RELATIVE file organization (i.e. RRDS file)
- It can be used ONLY with RELATIVE file organization
ALTERNATE RECORD KEY clause
- The ALTERNATE RECORD KEY clause specifies a data item within the record that provides an alternative path to the data in an INDEXED file (i.e. a KSDS file).
- It can be used only with INDEXED files (KSDS) that have an alternate index (AIX) defined. The WITH DUPLICATES phrase can be specified if the alternate index is defined with duplicates.
FILE STATUS clause
- The FILE STATUS clause is used to monitor the execution of each I/O operation on the file.
- When the FILE STATUS clause is specified, the system moves a value into ‘file-status’ after each I/O operation. The value in ‘file-status’ can be used to determine the next action in the program.
- The ‘file-status’ variable can be defined as PIC X(02) in WORKING-STORAGE.
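As a rough illustration of testing the status value after an I/O operation, the sketch below attaches condition names to a PIC X(02) status field. The names IN-FILE, WS-IN-STATUS and the 88-level names are hypothetical; the status values used are the commonly documented ones ('00' successful, '10' end of file, '35' file not found on OPEN).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FSDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-FILE ASSIGN TO INPUTFL
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-IN-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC                  PIC X(80).
       WORKING-STORAGE SECTION.
      *    Condition names make the status checks easier to read.
       01  WS-IN-STATUS            PIC X(02).
           88  IN-OK                         VALUE '00'.
           88  IN-EOF                        VALUE '10'.
           88  IN-NOT-FOUND                  VALUE '35'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT IN-FILE
      *    Decide the next action from the status set by OPEN.
           EVALUATE TRUE
               WHEN IN-OK
                   READ IN-FILE
                   IF IN-EOF
                       DISPLAY 'FILE IS EMPTY'
                   END-IF
                   CLOSE IN-FILE
               WHEN IN-NOT-FOUND
                   DISPLAY 'INPUT FILE NOT FOUND'
               WHEN OTHER
                   DISPLAY 'OPEN FAILED, STATUS: ' WS-IN-STATUS
           END-EVALUATE
           STOP RUN.
```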
RESERVE clause
- The RESERVE clause specifies the number of input/output buffers to be allocated at run time for the file.
- If the RESERVE clause is not coded, the number of buffers is taken from the DD statement. If it is not coded in the DD statement either, the system default (usually 2) is taken.
Types of COBOL File Allocation
COBOL supports 3 types of files:
- Sequential File
- Indexed File
- Relative File
SEQUENTIAL FILE
Let’s see some important characteristics of sequential files:
- Sequential files are also called flat files.
- The records are stored in the file in the same order in which they are entered.
- To access the Nth record, we have to read the first (N-1) records.
- Records cannot be inserted or deleted. A file opened in EXTEND mode appends newly written records at the end of the file.
- Records can be of fixed or variable length.
- A sequential file is recommended when only simple reads and writes are required and random or dynamic access to individual records is rarely needed.
- In the mainframe environment there are 2 types of sequential files:
  - Flat file (non-VSAM sequential file)
  - Entry sequenced data set (VSAM ESDS)
Syntax:
SELECT LOGICAL-FL ASSIGN TO PHYSICAL-FL
ORGANIZATION IS SEQUENTIAL.
Define this in FILE-CONTROL under the INPUT-OUTPUT SECTION of the ENVIRONMENT DIVISION. The layout and details of the file must be declared in the FILE SECTION under an FD entry.
FILE SECTION.
FD  FILE-NAME
    [RECORD CONTAINS N CHARACTERS]
    [BLOCK CONTAINS I RECORDS]
    [DATA RECORD IS RECORD-DET]
    [RECORDING MODE IS {F/V/U}].
01  LOGICAL-INPUT-REC.
    05  INPT-VAR-1  PIC 9(9).
    05  INPT-VAR-2  PIC X(9).
    ………… …………
RECORD CONTAINS N CHARACTERS
It describes the size of the data record.
Format-1:
RECORD CONTAINS 80 CHARACTERS
Format 1 is used for fixed-length records.
Format-2:
RECORD CONTAINS N-1 TO N-2 CHARACTERS
Format 2 is used for variable-length records.
Format-3:
RECORD IS VARYING IN SIZE N-1 TO N-2 [DEPENDING ON DATA-ITEM-NAME]
Format 3 is used for variable-length records whose actual length is held in DATA-ITEM-NAME.
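A declaration fragment for Format 3 might look like the sketch below. It is not a complete program, and VAR-FILE, VAR-REC and WS-REC-LEN are hypothetical names; the 10 to 80 character range is chosen only for illustration.

```cobol
       FILE SECTION.
      *    Records may be 10 to 80 characters long; the length of
      *    each record is exchanged through WS-REC-LEN.
       FD  VAR-FILE
           RECORD IS VARYING IN SIZE FROM 10 TO 80 CHARACTERS
               DEPENDING ON WS-REC-LEN.
       01  VAR-REC                 PIC X(80).
       WORKING-STORAGE SECTION.
      *    Unsigned integer item named in the DEPENDING ON phrase.
       01  WS-REC-LEN              PIC 9(04) COMP.
```

In this arrangement the program moves the desired length to WS-REC-LEN before each WRITE, and after a successful READ the same item holds the length of the record just read.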
DATA RECORD IS RECORD-DET
It names the record description associated with the file. Here RECORD-DET contains the layout of the file.
BLOCK CONTAINS I RECORDS
It defines the block size of the file, that is, the size of a physical record.
Format-1:
BLOCK CONTAINS 80 CHARACTERS
Format 1 gives a fixed block size in characters.
Format-2:
BLOCK CONTAINS N-1 TO N-2 CHARACTERS
Format 2 gives a minimum and maximum block size and is used for files with variable-length records.
Format-3:
BLOCK CONTAINS I RECORDS
Format 3 gives the block size as a number of logical records instead of characters.
There is also a special form of the clause:
BLOCK CONTAINS 0 CHARACTERS
In some organizations, programmers code this for sequential (QSAM) files. When it is coded, the block size is determined at run time.
RECORDING MODE IS {F/V/U}
It describes the format of the logical records in the file.
Format-1:
RECORDING MODE IS F
The logical records are of fixed length, so every record in the file occupies the same length, which is given in the COBOL program or in the JCL.
Format-2:
RECORDING MODE IS V
The logical records are of variable length, so each record occupies only the length it needs, within the limits given in the record definition.
Format-3:
RECORDING MODE IS U
This is used when the records are neither fixed nor variable format and have an undefined record length.
There is also a special kind of sequential file, the line sequential file, which is described next.
LINE SEQUENTIAL FILE
- Line sequential is a special type of sequential file in which each record ends with a carriage return (X"0D") or line feed (X"0A") placed after the last non-space character.
- Line sequential files always contain variable-length records.
- They are also called text files. Report files are usually defined as line sequential.
Syntax of Line Sequential file
SELECT LOGICAL-FL ASSIGN TO PHYSICAL-FL
ORGANIZATION IS LINE SEQUENTIAL.
This must be coded in FILE-CONTROL under the INPUT-OUTPUT SECTION of the ENVIRONMENT DIVISION.
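A small report-writing sketch for a line sequential file is shown below. It assumes a compiler that supports ORGANIZATION IS LINE SEQUENTIAL and a literal file name in ASSIGN TO (GnuCOBOL is one such compiler); RPT-FILE, RPT-LINE and "report.txt" are hypothetical names.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LSDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    "report.txt" is a hypothetical output file name.
           SELECT RPT-FILE ASSIGN TO "report.txt"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  RPT-FILE.
       01  RPT-LINE                PIC X(40).
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN OUTPUT RPT-FILE
      *    Each WRITE produces one text line in the file.
           MOVE 'EMPLOYEE REPORT' TO RPT-LINE
           WRITE RPT-LINE
           MOVE 'END OF REPORT' TO RPT-LINE
           WRITE RPT-LINE
           CLOSE RPT-FILE
           STOP RUN.
```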
INDEXED FILE
Let’s see some important characteristics of INDEXED files:
- Indexed files can be accessed faster than sequential files when a specific record is needed.
- Records in an INDEXED file are accessed using key values.
- The key of an indexed file is an alphanumeric field within the record.
- We can access any record in any order using the KEY in an INDEXED file.
- The ACCESS MODE for an INDEXED file can be SEQUENTIAL, RANDOM or DYNAMIC.
- The primary key in an INDEXED file must be unique.
- ORGANIZATION IS INDEXED is coded for indexed files.
- We cannot update the primary key in an INDEXED file.
Note: When records are written to an INDEXED file in sequential access mode, they must be written in increasing order of keys.
Example: Before writing the record with key A123, we must write all the records with keys less than A123. Keys are case sensitive, so A123 is not the same as a123.
We cannot update the key field in an INDEXED file, but we can delete a particular record. The record is not physically removed; only its location is made inaccessible.
Format-1:
SELECT logical-fl ASSIGN TO physical-fl
ORGANIZATION IS INDEXED
ACCESS MODE IS SEQUENTIAL
RECORD KEY IS I-KEY
ALTERNATE RECORD KEY IS IA-KEY.
Here, I-KEY and IA-KEY must be defined in the record description under the FD entry in the FILE SECTION.
Format-2:
SELECT logical-fl ASSIGN TO physical-fl
ORGANIZATION IS INDEXED
ACCESS MODE IS DYNAMIC
RECORD KEY IS I-KEY
ALTERNATE RECORD KEY IS IA-KEY.
Here, I-KEY and IA-KEY must be defined in the record description under the FD entry in the FILE SECTION.
Format-3:
SELECT logical-fl ASSIGN TO physical-fl
ORGANIZATION IS INDEXED
ACCESS MODE IS RANDOM
RECORD KEY IS R-KEY.
Here, R-KEY must be defined in the record description under the FD entry in the FILE SECTION.
RELATIVE FILE
Let’s see some important characteristics of RELATIVE files:
- A relative file is also called a Relative Record Data Set (RRDS).
- The records in a relative file are accessed using the RRN (Relative Record Number).
- It can have SEQUENTIAL, RANDOM or DYNAMIC access.
- We can access the records in any order by declaring a ‘RELATIVE KEY’.
- The key must be an unsigned numeric (integer) item.
- If we need to read a record whose relative record number is already known, it is better to define the file as RRDS.
Format-1:
SELECT logical-fl ASSIGN TO physical-fl
ORGANIZATION IS RELATIVE
ACCESS MODE IS SEQUENTIAL
RELATIVE KEY IS R-KEY.
Here, R-KEY must be defined as an unsigned integer item outside the record description of the file, typically in WORKING-STORAGE.
Format-2:
SELECT logical-fl ASSIGN TO physical-fl
ORGANIZATION IS RELATIVE
ACCESS MODE IS RANDOM
RELATIVE KEY IS R-KEY.
Here, R-KEY must be defined as an unsigned integer item outside the record description of the file, typically in WORKING-STORAGE.
Suppose we have 9000 records in a file:
SELECT logical-fl ASSIGN TO physical-fl
ORGANIZATION IS RELATIVE
ACCESS MODE IS RANDOM
RELATIVE KEY IS R-KEY.
FD  LOGICAL-FL.
01  LOGICAL-FL-REC.
    05  R-NAME  PIC X(20).
WORKING-STORAGE SECTION.
01  R-KEY  PIC 9(4).
To address 9000 records, we must define the key as PIC 9(4), which can hold a value from 0001 to 9999.
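A direct read by relative record number can be sketched as below. This is a minimal illustration assuming the relative key is held in a WORKING-STORAGE item; REL-FILE, RELFILE, REL-NAME, WS-RRN and WS-REL-STATUS are hypothetical names, and slot 50 is chosen arbitrarily.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RELDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    RELFILE is a hypothetical DD name.
           SELECT REL-FILE ASSIGN TO RELFILE
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS WS-RRN
               FILE STATUS IS WS-REL-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  REL-FILE.
       01  REL-REC.
           05  REL-NAME            PIC X(20).
       WORKING-STORAGE SECTION.
      *    The relative key lives outside the file's record.
       01  WS-RRN                  PIC 9(04).
       01  WS-REL-STATUS           PIC X(02).
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT REL-FILE
           IF WS-REL-STATUS NOT = '00'
               DISPLAY 'OPEN FAILED, STATUS: ' WS-REL-STATUS
               STOP RUN
           END-IF
      *    Ask for the record stored in relative slot 50.
           MOVE 50 TO WS-RRN
           READ REL-FILE
               INVALID KEY
                   DISPLAY 'NO RECORD IN SLOT ' WS-RRN
               NOT INVALID KEY
                   DISPLAY 'RECORD ' WS-RRN ': ' REL-NAME
           END-READ
           CLOSE REL-FILE
           STOP RUN.
```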
Sequential vs Indexed vs Relative files
| Sequential files | Indexed files | Relative files |
| --- | --- | --- |
| These files can be accessed only sequentially. | These files can be accessed sequentially as well as randomly with the help of the record key. | These files can be accessed sequentially as well as randomly with the help of their relative record number. |
| The records are stored sequentially. | The records are stored based on the value of the RECORD KEY, which is part of the data. | The records are stored by their relative record number. |
| Records cannot be deleted, and new records can only be added at the end of the file. | Records can be inserted in the middle of the file. | Records can be inserted at any given position. |
| It occupies less space, as the records are stored contiguously. | It occupies more space. | It occupies more space. |
| It provides slow access, because all the previous records must be read before a given record can be accessed. | Access is faster than sequential access but still takes some time, because the index must be searched. | It provides the fastest access of the three, because the relative record number locates the record directly. |
| In sequential file organization, the records are read and written in sequential order. | In indexed file organization, the records can be written and read in sequential as well as random order. | In relative file organization, the records can be written and read in sequential as well as random order. |
| There is no need to declare any KEY for storing and accessing the records. | One or more KEYs (a primary key and optional alternate keys) can be declared for storing and accessing the records. | Only one KEY, the relative key, is declared for storing and accessing the records. |