Additional background information on VSAM can be found in the
Using Data Sets publication for your system's level of DFP or
DFSMS. Using Data Sets contains descriptions of VSAM data sets
and VSAM utilities, and VSAM programming guidance.
VSAM Basics
VSAM, which stands for Virtual Storage Access Method, provides
facilities for defining, managing, and accessing 5 different
types of data sets. These are:
Key-sequenced data sets (KSDSs) are made up of two components. The first is the data component; it contains the actual data records of the data set. The second component is the index. The index contains compressed key values and pointers into the data component, and permits high-speed random access of records. There may be only one index entry per data record (i.e., duplicate keys are not allowed). In addition to the primary index, alternate indexes may be defined for and associated with the data component. These alternate indexes permit random access to the data component using fields other than the primary key field. Alternate indexes, by default, permit duplicate keys.
KSDS records may be accessed both sequentially and directly (randomly). Records may be inserted, both in between existing records and onto the end. Records may also be deleted (erased) from the data set.
An entry-sequenced data set (ESDS) may be accessed both sequentially and randomly. For random access, a number called the Relative Byte Address (RBA) is used to specify the record to retrieve. The RBA of a record can be obtained at the time it is inserted, or can be determined by sequentially reading the data set.
The relative record data set (RRDS) permits both sequential and random access. Random access is by Relative Record Number (RRN), which is the sequence number of a record.
The variable-length relative record data set (VRDS) permits both sequential and random access. Random access is by Relative Record Number (RRN).
VSAM data sets are organized first by records, then by control
intervals (CIs), which are collections of records, and then by
control areas (CAs), which are collections of control intervals.
Using the processing options described in the sections that
follow, you can directly read from and write to VSAM data sets at
the record and control interval levels. You cannot read or write
control areas.
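As a rough illustration of control interval access, the sketch below reads the first control interval of a data set instead of a single record. It assumes a ddname `indd` that has already been allocated, and it assumes that a GET issued with the CNV option returns the whole control interval as its value. Both points are assumptions made for the example.

```rexx
/* Sketch only: read one control interval rather than one record.  */
/* Assumes ddname indd is already allocated to a VSAM data set.    */
call open 'vsam', 'indd', '(cnv,dir,in)'
if rc = 0 then do
  /* CNV access uses RBAs; RBA 0 addresses the first control       */
  /* interval of the data component.                               */
  ci = get('indd', 0, '(cnv,dir)')
  if rc = 0 then
    say 'Control interval length:' length(ci)
  else
    say 'GET failed: RC='rc 'REASON='reason
  call close 'vsam', 'indd'
end
```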
Programming For VSAM
The basic flow of a program that processes VSAM files is
essentially the same as the flow of a program that processes
sequential files. In general, such a program will contain the
following steps:
If you are writing records, the access method takes the record from an argument to PUT.
Shadow REXXTOOLS maintains information about open VSAM files in a data structure associated with the task under which the OPEN function was executed. The information is maintained by ddname. Files remain registered with REXXTOOLS and open until they are explicitly closed, or until the task that opened them terminates.
All REXX programs under an MVS task share the same REXXTOOLS VSAM data structures. Thus, if program A calls program B (REXX CALL or function call), any ddname that is opened by either program A or program B is known to the other. File sharing also extends to directly subtasked REXX programs. As a consequence, if program A attaches program B, the files opened by program A are known by program B. However, files opened by program B will not be known to A when control is returned. This is because task termination will close all files opened by program B.
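The sharing rule for called programs can be sketched as two small execs. The exec name PROGB and the ddname `indd` are hypothetical, and the ddname is assumed to be allocated before program A runs.

```rexx
/* ---- Program A: opens the file, then calls program B ---------- */
call open 'vsam', 'indd', '(key,seq,in)'
if rc = 0 then do
  call progb                   /* hypothetical external REXX exec  */
  call close 'vsam', 'indd'    /* A closes the file it opened      */
end
exit

/* ---- Program B (separate exec, here named PROGB) -------------- */
/* Runs under the same task as program A, so the ddname opened by  */
/* A is already registered and B can read from it without an OPEN. */
recin = get('indd',,'(key,seq)')
if rc = 0 then say 'First record:' recin
return
```

Because both execs run under one task, the file stays open until program A closes it or the task ends.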
Notes:
The OPEN options are organized into groups. Some of the groups are additive. That is, you may select one, some, or all of the options in that group. Other groups are alternative. From these groups you may select just one option. All of the groups contain one option that is the default value. If you do not select an option from a group, VSAM will use the default value (the default values are underscored).
Option | Description |
---|---|
ADR | Specifies that you want to use Relative Byte Addresses (RBAs) to access a data set. Note that addressed access is not permitted for RRDSs. |
CNV | Specifies that you want to access the data set by control interval rather than by records. Note that CNV access also implies addressed (ADR) access (i.e., RBAs are used). |
KEY | Specifies that you want to use keys (for KSDSs) or Relative Record Numbers (RRNs - for RRDSs) to access a data set. Keyed access is not permitted for ESDSs. |
Option | Description |
---|---|
DIR | Specifies that direct access will be used in processing this data set. That is, individual records will be randomly requested. |
SEQ | Specifies that sequential access will be used in processing this data set. That is, contiguous records will be requested in either ascending or descending sequence. |
SKP | Specifies that skip sequential access will be used in processing this data set. Records will be processed in sequence but some records may be skipped. |
Option | Description |
---|---|
IN | Specifies that the data set is being opened for reading (GETting) records. |
OUT | Specifies that the data set is being opened for writing (PUTting) records. |
Option | Description |
---|---|
DFR | Specifies that buffers are not to be immediately written following direct PUT requests. |
NDF | Specifies that buffers are immediately to be written following direct PUT requests. |
Option | Description |
---|---|
NIS | Specifies that the normal insertion strategy is to be used. Control intervals are split at the midpoint. |
SIS | Specifies that the sequential insertion strategy is to be used. Control intervals are split at the insertion point. This method is faster if several contiguously located records are being inserted. |
Option | Description |
---|---|
NRM | Specifies that the ddname gives the object to be processed. |
AIX | Specifies that the alternate index of the object specified by the ddname is to be processed. You do not use this option to process the base cluster with an alternate index. This option lets you read the records of the alternate index itself (something you will not usually want to do). |
Option | Description |
---|---|
NRS | Specifies that the data set is not reusable. That is, you cannot open it for reuse and overwrite the existing data with load mode processing. |
RST | Specifies that the data set is reusable. Upon opening a data set with this option, it is as if all the existing data in the file is erased. |
Option | Description |
---|---|
DDN | Specifies that VSAM is to share control blocks and buffers by ddname. This option only applies to sibling tasks in the same address space, and is only available when the subtasks share subpool zero. |
DSN | Specifies that VSAM is to share control blocks and buffers by data set name. This option only applies to sibling tasks in the same address space, and is only available when the subtasks share subpool zero. |
call open 'vsam', 'outdd', '(key,seq,out)'
call open 'vsam', 'iodd', '(key,dir,in,out)'
call close 'vsam', 'mydd'
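A minimal sketch of how the OPEN and CLOSE calls above might be wrapped with return code checks. It assumes the ddname `iodd` is already allocated, and it relies on OPEN returning the return code as its value and setting the RC and REASON variables.

```rexx
/* Open for keyed, direct reading and writing; OPEN returns the RC */
if open('vsam', 'iodd', '(key,dir,in,out)') <> 0 then do
  say 'OPEN failed for iodd: RC='rc 'REASON='reason
  exit 8
end

/* ... GET, PUT, and ERASE requests against iodd go here ...       */

call close 'vsam', 'iodd'
if rc <> 0 then
  say 'CLOSE failed for iodd: RC='rc 'REASON='reason
```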
The groups of RPL options are given in the tables below. All of the RPL options are alternative options. From alternative option tables you may select only one option. If you do not select an option from a table, VSAM will use a default value (the default values are underscored).
Notes:
Option | Description |
---|---|
ADR | Specifies that you want to use Relative Byte Addresses (RBAs) to access a record. Note that addressed access is not permitted for RRDSs. |
CNV | Specifies that you want to process a control interval rather than a record. Note that CNV access also implies addressed (ADR) access (i.e., RBAs are used). |
KEY | Specifies that you want to use keys (for KSDSs) or Relative Record Numbers (RRNs - for RRDSs) to process a record. Keyed access is not permitted for ESDSs. |
Option | Description |
---|---|
DIR | Specifies that you want to process a record directly (i.e., you want to specify a key, RBA, or RRN to process a record). |
SEQ | Specifies that you want to process the next record in the data set (either forward or backward). |
SKP | Specifies that you want to process a record by key but that the key of the record will be higher than that of the previous record processed. |
Option | Description |
---|---|
ARD | Specifies that the keyin argument is to be used (if a form of direct access processing has been requested) to find the record to be located, retrieved, or stored. |
LRD | Specifies that the last record in the data set is to be processed. The BWD (backward processing option) must also be specified. |
Option | Description |
---|---|
FWD | Specifies that if the file is being processed sequentially, processing will proceed from the first record to the last (i.e., forward). |
BWD | Specifies that if the file is being processed sequentially, processing will proceed from the last record to the first (i.e., backward). |
Option | Description |
---|---|
NSP | Specifies that for direct processing requests VSAM is to remember the position of the record being processed. The position will not be "forgotten" by VSAM until a sequential request or an ENDREQ function call is performed. |
NUP | Specifies that the record being processed will not be updated or deleted, and that positioning is not to be remembered. |
UPD | Specifies that the record being processed is to be updated or deleted (ERASEd). For a GET, VSAM will remember its position and (for certain share options) obtain exclusive control of the control interval containing the record. When a subsequent ERASE, PUT, or ENDREQ function is executed, positioning and control are relinquished. |
Option | Description |
---|---|
KEQ | Specifies for keyed, direct searches that the key you provide in the keyin argument must match, exactly, the key of a record in the data set or else the search fails. |
KGE | Specifies for keyed, direct searches that if no record matches the key you provide in the keyin argument, the search is satisfied by the record with the next higher key. |
Note: The Keyin Test Options apply to both full and generic keys.
Option | Description |
---|---|
FKS | Specifies that the key supplied in the keyin argument is to be treated as a full length key. |
GEN | Specifies that the key supplied in the keyin argument is to be treated as a generic key (i.e., only the leading characters of a key are specified). Generic keys potentially will match more than one record. |
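The keyin test and key length options can be combined on a direct GET. The sketch below assumes a KSDS already opened for keyed, direct input under the hypothetical ddname `indd`, with keys such as 'ABC001'. The key values are illustrative only.

```rexx
/* Exact match on a full-length key: fails if ABC001 is absent     */
rec = get('indd', 'ABC001', '(key,dir,keq,fks)')
if rc <> 0 then
  say 'No record with key ABC001: RC='rc 'REASON='reason

/* Generic key with KGE: only the leading characters are given,    */
/* and the first record whose key is equal to or higher than the   */
/* generic key satisfies the request                               */
rec = get('indd', 'ABC', '(key,dir,kge,gen)')
if rc = 0 then
  say 'First record at or after ABC:' rec
```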
Option | Description |
---|---|
VAR | Specifies that $RXT variables are to be created. |
NOV | Specifies that $RXT variables are not to be created. This option can improve performance. |
/* Allocate and open the outdd file */
do i = 1 to 20
  call put 'outdd', rec.i,,'(key,seq)'
end
/* Close and free the outdd file */
/* Allocate and open the indd file */
recin = get( 'indd',,'(key,seq)')
do while rc = 0
  parse var recin 'Address:' aline 'Date:' date,
                  145 custinfo
  recin = get( 'indd',,'(key,seq)')
end
/* Close and free the indd file */
/* Allocate and open the indd file */
GetCustRec = "parse value get( 'indd',,'(adr,seq)')",
             "with fname +10 lname +15 ssn +9 ."
interpret GetCustRec
do while rc = 0
  say 'Name:' lname','fname 'SSN:' ssn
  interpret GetCustRec
end
/* Close and free the indd file */
KSDSs offer the most flexibility with respect to direct access. A KSDS may be accessed using either keyed (KEY) or addressed (ADR) access. If keyed access is used, the keyin argument for the function must contain a partial or complete key. If addressed access is used, the argument must be a relative byte address (RBA).
ESDSs may be accessed directly using the ADR (addressed) option. For a direct request against an ESDS you must supply an RBA for the keyin argument.
A direct access request against an RRDS must use a relative record number (RRN) for the keyin argument. You may not use addressed access for a relative record data set even though it is possible to obtain the RBAs for an RRDS's records.
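A direct read against an RRDS might look like the sketch below. The ddname `rrdd` and the record number are hypothetical, and the RRN is assumed to be passed as a plain number in the keyin argument. The data set is assumed to be allocated and opened for keyed, direct input.

```rexx
/* Retrieve the record in relative record slot 5 of an RRDS        */
rec = get('rrdd', 5, '(key,dir)')
if rc = 0 then
  say 'Record 5:' rec
else
  say 'GET failed: RC='rc 'REASON='reason
```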
Note: The Batch Local Shared Resources (BLSR) subsystem can be used to improve the performance of long-running, direct processing jobs. The Shadow REXXTOOLS VSAM interface supports the use of BLSR. For more information on this facility, refer to Application Development Guide: Batch Local Shared Resource Subsystem, GC28-1672. Do not use BLSR with programs that process data sequentially, since this will lead to increased run times.
/* Allocate and open the outdd file */
call put 'outdd', 'ABC001 new data',,'(key,dir)'
/* Close and free the outdd file */
/* Allocate and open the indd file */
call get 'indd', 0, '(adr,dir)'
/* Close and free the indd file */
/* Allocate and open the iodd file */
call get 'iodd', 'ABC001', '(key,dir)'
if rc = 0 then
  call erase 'iodd'
/* Close and free the iodd file */
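Updating a record in place follows the same pattern as the erase example: read the record with the UPD option, then write it back. The sketch below assumes `iodd` is allocated and opened with '(key,dir,in,out)', that the key occupies the first six bytes of the record, and that a PUT issued with UPD rewrites the record obtained by the preceding GET. These are assumptions for illustration.

```rexx
/* Read the record for update                                      */
rec = get('iodd', 'ABC001', '(key,dir,upd)')
if rc = 0 then do
  /* Change the data portion only; the key in columns 1-6 must     */
  /* stay the same                                                 */
  newrec = overlay('updated data', rec, 8)
  call put 'iodd', newrec,, '(key,dir,upd)'
  if rc <> 0 then
    say 'PUT failed: RC='rc 'REASON='reason
end
else
  say 'GET for update failed: RC='rc 'REASON='reason
```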
Depending on the file sharing method (which is discussed below), other users in other address spaces may also read and write the same records. The other users will have their own private buffers. Thus, if there are two users reading the first control interval of a KSDS, there are really 3 copies of the data in the system: the DASD copy, the first user's copy, and the second user's copy.
Because of VSAM buffering, two exposures to data corruption must be considered:
At first glance, this might not appear to be a problem. Even the most scrupulous of protocols must permit serial access to a record. However, when you consider the case where one user updates the first record of a control interval while another user updates the fourth, you begin to see where the difficulty lies. The last user to write to the file will completely nullify the first user's changes.
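The lost update described above can be modeled in a few lines of plain REXX. This is a toy model, not VSAM code: the stem `dasd.` stands in for a control interval on disk, and `userA.` and `userB.` stand in for the two users' private buffers. All names are invented for the illustration.

```rexx
/* Toy model: a control interval of four records on "DASD"         */
dasd.1 = 'rec1'; dasd.2 = 'rec2'; dasd.3 = 'rec3'; dasd.4 = 'rec4'

/* Both users read the whole control interval into private buffers */
do i = 1 to 4
  userA.i = dasd.i
  userB.i = dasd.i
end

/* Each user changes a different record in its own buffer          */
userA.1 = 'rec1-changed-by-A'
userB.4 = 'rec4-changed-by-B'

/* User A writes its buffer back, then user B writes its buffer    */
do i = 1 to 4; dasd.i = userA.i; end
do i = 1 to 4; dasd.i = userB.i; end

/* Record 1 is back to its old value: A's update has been lost     */
do i = 1 to 4
  say i':' dasd.i
end
```

Because each write replaces the entire control interval, user B's stale copy of record 1 overwrites user A's change even though the two users never touched the same record.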
The scope of a lock is the amount of data that is reserved. The status portion of the disposition parameter in JCL has a data set-wide scope, for example. Using DISP=OLD, an accessor has exclusive control of the whole data set. The duration of a lock is the amount of time a lock is held. Using the disposition parameter example again, DISP=OLD locks a data set for the amount of time it takes the job step to run. Because of these factors, the perceived concurrency for DISP=OLD serialization is very low. Other serialization methods permit greater perceived concurrency.
Note: Often the duration of a lock is the most important factor in determining perceived concurrency. For example, if a data set-wide lock is held only briefly, perceived concurrency does not suffer.
In general, the number of users that can concurrently access a VSAM data set is a function of one or more of the following serialization mechanisms.
There are 4 levels of cross-region file sharing:
When buffers are shared in this manner, VSAM maintains complete read and write integrity. However, only subtasks that share subpool zero storage can use intra-region file sharing. This is because VSAM allocates data buffers in subpool zero and needs these buffers to remain allocated even though the first task to open the data set may have terminated (when subpool zero is not shared, MVS automatically frees subpool zero storage when a task terminates).
Sophisticated applications requiring a high degree of concurrent use almost always require the use of SHAREOPTIONS 3 or 4 and ENQ/DEQ serialization. Even so, you can always use disposition serialization whenever you want to ensure that just one user has the file (third shift batch update and reporting applications often fit this profile).
Another complication that must be considered arises from insertions into KSDSs. Whenever a control interval is full but a new record needs to be inserted into its middle, VSAM "splits" the control interval into two parts, each containing some free space. The new record can then be inserted. A negative consequence of this splitting action is that it makes impossible the use of any protocol based on control interval locks -- you can't lock on a moving target.
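A midpoint split can be pictured with a toy model in plain REXX. This is not VSAM code: a control interval is represented as a blank-delimited list of numeric keys, and the routine name InsertSorted is invented for the example.

```rexx
/* Toy model of a midpoint control interval split                  */
ci     = '10 20 30 40'          /* keys in a full control interval */
newkey = 25

/* Split at the midpoint: the upper half moves to a new CI         */
half   = words(ci) % 2
lowci  = subword(ci, 1, half)
highci = subword(ci, half + 1)

/* Insert the new key into whichever half it belongs to            */
if newkey < word(highci, 1) then lowci  = InsertSorted(lowci, newkey)
else                             highci = InsertSorted(highci, newkey)

say 'Low CI :' lowci
say 'High CI:' highci
exit

/* Return list with key inserted in ascending numeric order        */
InsertSorted: procedure
  parse arg list, key
  out = ''
  placed = 0
  do i = 1 to words(list)
    w = word(list, i)
    if \placed & key < w then do
      out = out key
      placed = 1
    end
    out = out w
  end
  if \placed then out = out key
  return strip(out)
```

After the split, keys 30 and 40 live in a different control interval than before, which is why a lock taken on the original control interval no longer protects them.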
Finally, the problem of splitting can cascade to higher levels:
if a control area is full, it too must be split before a new
control interval can be inserted.
The "Dirty Read" Technique
The "dirty read" technique can be used to provide a high degree
of concurrent access to VSAM files while avoiding the
complications associated with CI and CA splits. The dirty read
protocol can be summarized as follows:
In addition, except for the GET function which returns a record, all VSAM functions return the return code as their value. For example, the following code will display the return code two times, once as OPEN's returned value, and once as the value of the RC variable:
say "RC="open('vsam','indd') "RC="rc "REASON="reason
Unless otherwise noted, the values for RC and REASON are taken directly from the underlying VSAM macro return and reason codes. In all cases, a return code of zero indicates success.
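One use of RC and REASON is telling a normal end of file apart from a real error in a sequential read loop. The sketch below assumes `indd` is allocated and opened for keyed, sequential input, and that end of data surfaces as the VSAM logical error return code 8 with reason code 4, passed through unchanged. Treat those values as an assumption to verify against the VSAM return code tables for your system.

```rexx
/* Read sequentially until GET reports a nonzero return code       */
recin = get('indd',,'(key,seq)')
do while rc = 0
  /* ... process recin here ...                                    */
  recin = get('indd',,'(key,seq)')
end

/* Assumed pass-through of VSAM codes: RC 8, REASON 4 = end of data */
if rc = 8 & reason = 4 then
  say 'End of data reached'
else
  say 'GET failed: RC='rc 'REASON='reason
```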
Notes: