Working with Record Fields

The access method interfaces supplied by REXX and REXXTOOLS do not support direct access to specific fields within a record. Yet most applications require field-level access. In many high-level languages, such as PL/I and COBOL, declared record definitions perform this function. While REXX does not support the notion of a record structure, the language does contain several constructs that work just as well. Moreover, REXX gives you the ability to dynamically change your program's view of the data, a capability that is poorly implemented in many other high-level languages, when it is supported at all.

Parsing Records With REXX

Building Records With REXX

A Simple Methodology for Sharing Record Definitions

Parsing Records With REXX

Most programmers who are new to REXX parse records using the SUBSTR (substring) function. For example, to parse a record with 3 fields you could code:

name    = substr(record,1,10)
ssn     = substr(record,11,9)
balance = substr(record,20,5)

While this method translates easily from other languages (most of which support some form of substring), it suffers from two defects:

It tends to perform poorly as the number of fields increases.
It is difficult to program, since the absolute offsets of fields must be calculated by hand.

Experienced REXX programmers use other techniques to parse records. The most flexible and by far the best performing method for breaking apart records is the REXX PARSE instruction.

The REXX PARSE instruction has several distinct advantages over other methods:

PARSE is efficient. PARSE permits you to separate all of the fields in a record with a single REXX instruction. And because PARSE is a REXX instruction -- as opposed to a function or host command -- it receives preferential treatment by the interpreter (this statement is doubly true for programs compiled with the REXX/370 compiler). PARSE also allows your programs to avoid unnecessary work. If you want to access only one or two fields in a record, PARSE lets you extract just the relevant data.
PARSE is flexible. PARSE permits record splitting that is sensitive to the data within the record. Unlike traditional record structures, PARSE allows (indeed, encourages) the use of pattern matching to break apart highly variable data.
PARSE is portable. The PARSE instruction is part of the REXX language proper. Because of this, PARSE can be used with any access method and on any platform. In contrast, proprietary parsing facilities generally are not available in all environments.
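As a minimal sketch of the "avoid unnecessary work" point, the following extracts a single field from a fixed-format record without naming the fields around it. The record layout (two 10-byte names followed by a 2-byte employee number) is assumed here purely for illustration.

```rexx
/* Assumed layout: firstname 1-10, lastname 11-20, empno 21-22 */
record = left('Richard',10) || left('Travis',10) || '01'

/* Jump straight to column 21 and take two bytes; no other     */
/* field is assigned.                                          */
parse var record 21 empno +2

/* Take only the second field.                                 */
parse var record 11 lastname +10

say 'empno    = "'empno'"'
say 'lastname = "'lastname'"'
```

Each PARSE here touches only the bytes it needs, so the cost does not grow with the number of fields in the record.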

The next two sections provide a brief review of PARSE VAR and PARSE VALUE, the two forms of PARSE that are particularly useful for handling record fields.

Note: The TSO/E Version 2 MVS/REXX Reference, SC28-1883 contains an excellent chapter on REXX parsing (in the TSO/E 2.4 level of this manual, see "Chapter 5. Parsing"). This chapter describes all of the uses of PARSE, and contains a flow chart which documents, precisely, the behavior of the instruction. This chapter should be required reading for anyone interested in using the PARSE instruction.

PARSE VAR

PARSE VAR is used to split a record contained within a REXX variable. The general format for the PARSE VAR instruction is:

PARSE VAR varname template

where varname is the name of the REXX variable that contains the data to parse, and template is the set of instructions for executing the parse (templates are discussed below). Varname can specify either a simple symbol or a REXX stem variable.
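The following sketch shows both kinds of source variable. The record contents and the variable names (record, line., name, id) are invented for the example.

```rexx
/* Simple symbol as the source */
record = left('Smith',10) || '0042'
parse var record name +10 id +4

/* Compound (stem) variable as the source; i selects the element */
line.1 = left('Jones',10) || '0007'
i = 1
parse var line.i name2 +10 id2 +4

/* PARSE VAR leaves the source variable unchanged */
say length(record)     /* 14 */
say strip(name) id     /* Smith 0042 */
say strip(name2) id2   /* Jones 0007 */
```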

PARSE VALUE

PARSE VALUE is used to parse a record that is the result of a REXX expression. The general format for the PARSE VALUE instruction is only slightly different from that of PARSE VAR. It is:

PARSE VALUE expression WITH template

where expression is any arbitrarily complex REXX expression, and template is, again, the instructions for conducting the parse.

In the context of file access, PARSE VALUE is of special importance because it can be combined with the REXXTOOLS GET function, which returns a retrieved record. The following REXX instruction demonstrates how this works:

parse value get('indd',keyvar,'(key,dir)') with,
      name +10 ssn +9 address +30 salary +5

As you can see, with one REXX instruction you can both read a record and separate its fields.

Basic Parsing

All forms of PARSE use templates to specify the parsing action of the instruction. There are two formats for templates:

String patterns, which specify literal strings to search for in the data, and
Positional patterns, which use absolute or relative offsets to extract data.

A string pattern specifies data which the PARSE instruction is to search for in order to split out fields. For example, in the following PARSE VAR instruction, the patterns "SSN=", "NAME=" and "ZIP=" are used to find where the record is to be divided:

record = get('indd')
/* Assume record contains:
  'SSN=123456789 NAME=FRED ZIP=12345' */
 
parse var record 'SSN=' ssn 'NAME=' name 'ZIP=' zip
 
/* This yields (a variable standing alone between two patterns
   keeps the blank that precedes the next pattern):
   ssn  = '123456789 '
   name = 'FRED '
   zip  = '12345'
*/
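When a variable stands alone between two string patterns, it receives everything between the matches, including any blank in front of the next keyword. The sketch below shows two common ways to obtain trimmed values. The variable fullname is introduced only for this example.

```rexx
record = 'SSN=123456789 NAME=FRED ZIP=12345'

/* Option 1: a period placeholder after each variable turns the  */
/* section into a word parse. The variable gets the first        */
/* blank-delimited word and the period absorbs the rest.         */
parse var record 'SSN=' ssn . 'NAME=' name . 'ZIP=' zip .

/* Option 2: keep the whole value and strip it afterwards.       */
parse var record 'NAME=' fullname 'ZIP='
fullname = strip(fullname)

say '"'ssn'"' '"'name'"' '"'zip'"' '"'fullname'"'
```

Option 1 keeps only the first word of each value, so Option 2 is the safer choice when a value may contain embedded blanks.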

Positional patterns are used to indicate the absolute or relative positions at which the record is to be split. Parsing by absolute position is equivalent to SUBSTR parsing (except that you can substring many fields in one instruction). In the following example, absolute positions are used to extract three fields:

/* Positions:----+----1----+----2----+ */
parse value 'Richard   Travis    01' with,
      firstname 11 lastname 21 empno
 
/* This yields:
   firstname = 'Richard   '
   lastname  = 'Travis    '
   empno     = '01'
*/

Relative positions are indicated by coding a plus sign (+) or a minus sign (-) in front of the number. Parsing by relative position is frequently much easier than parsing by absolute position, because you don't need to manually calculate field offsets. You only need to know how long a field is. For example, the PARSE in the previous example re-coded for relative positions would look like this:

parse value 'Richard   Travis    01' with,
      firstname +10 lastname +10 empno
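Positional parsing returns each field at its full width, padding included. A short sketch of the usual follow-up step, using the standard STRIP function (the variable fullname is an assumption of this example):

```rexx
record = 'Richard   Travis    01'
parse var record firstname +10 lastname +10 empno

/* Fields keep their padding: firstname is 'Richard   '.   */
/* STRIP removes it when the value is used for display or  */
/* comparison.                                             */
fullname = strip(firstname) strip(lastname)
say fullname 'is employee' empno
```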

Advanced Parsing Problems

This section describes solutions to commonly encountered parsing problems: handling conversions, parsing records with multiple formats, and using in-line parse routines.

Handling Conversions

If you are designing a new application, you may want to consider storing all fields of a record -- including numeric fields -- in the REXX printable format. This strategy will allow you to avoid all conversions. If you are processing "legacy" files containing binary and packed numeric data, you probably will need to convert some field values to the REXX decimal format in order to use them in computations.

The following table describes the conversion functions you will need:

Data Type	Conversion After GET and PARSE	Conversion Before Build and PUT
2 byte binary integer (halfword)	C2D(fieldvalue)	D2C(fieldvalue,2)
4 byte binary integer (fullword)	C2D(fieldvalue)	D2C(fieldvalue,4)
packed decimal	P2D(fieldvalue,s) where s is the number of fractional digits.	D2P(fieldvalue,p,s) where p is the total number of digits and s is the number of fractional digits.
short floating point	F2D(fieldvalue)	D2F(fieldvalue,'SHORT')
long floating point	F2D(fieldvalue)	D2F(fieldvalue,'LONG')
printable numeric edited	PIC2D(fieldvalue)	D2PIC(fieldvalue,picstr) where picstr is the picture specification.
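The binary integer rows can be tried with nothing but the standard C2D and D2C built-in functions. The sketch below builds a small test record and converts its fields back. The layout and the field name delta are assumptions made for the example, and the REXXTOOLS conversions (P2D, F2D, PIC2D) are deliberately left out.

```rexx
/* Build a test record: 10-byte name, signed halfword, fullword */
record = left('ADAMS',10) || d2c(-5,2) || d2c(1234,4)

parse var record name +10 delta +2 empno +4

/* With a length argument, C2D treats the field as a signed     */
/* number. Without it, the field is treated as unsigned.        */
delta = c2d(delta,2)
empno = c2d(empno,4)

say strip(name) delta empno
```

Passing the length to C2D matters only for fields that can hold negative values, but coding it consistently costs nothing.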

Parsing Records with Multiple Formats

PARSE permits a record to be broken up in several ways, all in one PARSE instruction. To parse a record more than once, you separate the various formats with an absolute position. In the following example, we parse the entire contents of a record into a variable called record, then we back up and break the record into three fields:

parse value 'Abraham   Lincoln   President ' with,
    record 1 firstname +10 lastname +10 job
 
/* Results in the following:
 record    = 'Abraham   Lincoln   President '
 firstname = 'Abraham   '
 lastname  = 'Lincoln   '
 job       = 'President '
*/

You can also use relative positioning to "back up" to an earlier position, and parse forward from there. This technique is demonstrated in the following example:

parse value get('indd') with rectype +2,
      fname +10 lname +10 -20,  /* rectype=1 */
      Address +30 -30,          /* rectype=2 */
      salary +5 bonus +5        /* rectype=3 */
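Once the overlaid layouts have been parsed, the program picks the fields that apply to the record type. The sketch below assumes that rectype is stored as printable digits ('01', '02', '03') and that every record is padded to its full 32 bytes. Positions are clipped at the end of the string, so a short record would make the backward steps land in the wrong place.

```rexx
/* Assumed: fixed-length 32-byte records (2-byte type plus a   */
/* 30-byte body), so "+30" never runs off the end.             */
record = '01' || left('Abraham',10) || left('Lincoln',10) || copies(' ',10)

parse var record rectype +2,
      fname +10 lname +10 -20,
      address +30 -30,
      salary +5 bonus +5

/* Use only the fields that belong to this record type. */
select
  when rectype = '01' then say 'Name:' strip(fname) strip(lname)
  when rectype = '02' then say 'Address:' strip(address)
  when rectype = '03' then say 'Pay:' salary bonus
  otherwise say 'Unknown record type:' rectype
end
```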

Using In-line Parse Routines

If your templates are long, or if you simply want to keep all of the parsing information for a record in one place, you may want to use the in-line routine technique demonstrated by the following code segment:

GetEmpRec = "parse value get('indd') with",
            "firstname +10 lastname +10 empno +4",
            "salary +5; empno=c2d(empno);",
            "salary=p2d(salary,2)"
 
interpret GetEmpRec
do while rc = 0
  salary = salary * 1.10
  say firstname"'s new salary:"d2pic(salary,'$$$,$$9.99')
  interpret GetEmpRec
end

Note that GetEmpRec contains all of the instructions for reading, parsing and converting fields in a record. Each REXX instruction is separated by a semicolon (;). The REXX INTERPRET instruction is used to run this in-line parse routine (which can be as complex as you need) whenever another record is required.
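The same technique can be tried without a file. In this sketch the records are held in a stem (rec.) rather than read with GET, and only standard functions are used. The names ParseEmpRec and rec. and the test data are assumptions made for the example.

```rexx
/* Test records held in a stem instead of a file (assumption) */
rec.1 = left('Ann',10)  || left('Lee',10)    || d2c(7,4)
rec.2 = left('Omar',10) || left('Haddad',10) || d2c(12,4)
rec.0 = 2

/* The in-line routine: two clauses separated by a semicolon */
ParseEmpRec = "parse var rec.i firstname +10 lastname +10 empno +4;",
              "empno = c2d(empno)"

do i = 1 to rec.0
  interpret ParseEmpRec
  say strip(firstname) strip(lastname) empno
end
```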

Another way to handle this problem is to place all of the statements for reading (or writing) a record into a conventional internal subroutine. Be aware, though, that internal subroutines cannot be shared among separate REXX source files, whereas strings such as GetEmpRec can be shared (see "A Simple Methodology for Sharing Record Definitions"). An external routine could also be used, but then you would need to use global variables to pass field values between the called and calling routines, which is not an elegant solution.

Building Records With REXX

New records or records whose fields have been changed must be reassembled before they are written to a file. Using the REXX concatenation operator (||), an entire record can be built with a single REXX instruction. For example:

EmpRec = left(firstname,10)||,
         left(lastname,10)||,
         d2c(empno,4)||,
         d2p(salary,9,2)

In this example the REXX LEFT function is used to left-justify and truncate or pad the character fields (FIRSTNAME and LASTNAME) to their desired lengths (in this case, 10 bytes). The D2C function is used to convert the EMPNO field from a REXX decimal number to a fullword binary integer. Finally, the REXXTOOLS D2P function is used to convert SALARY from a REXX decimal number to a packed decimal number.
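The padding and truncation behavior is easy to check with standard functions alone. The sketch below omits the packed salary field, because D2P is a REXXTOOLS function, and adds a RIGHT call to show one way a printable numeric field could be zero-filled. The sample values and the variable printable are assumptions.

```rexx
firstname = 'Bartholomew-James'   /* longer than 10: truncated */
lastname  = 'Ng'                  /* shorter than 10: padded   */
empno     = 42

EmpRec = left(firstname,10)||,
         left(lastname,10)||,
         d2c(empno,4)

say length(EmpRec)                /* 24 */

/* Printable alternative for a numeric field: right-justify   */
/* with leading zeros.                                        */
printable = right(empno,4,'0')    /* '0042' */
say printable
```

Because every field is forced to a fixed width, the record length stays constant regardless of the input values.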

Advanced Record Building

As we saw with record parsing, it is often desirable to keep all of the information for manipulating a record in one place. The in-line routine technique (see "Advanced Parsing Problems" above) can also be applied to record building, as the following code segment demonstrates:

PutEmpRec = "EmpRec=left(firstname,10)||",
            "left(lastname,10)||",
            "d2c(empno,4)||",
            "d2p(salary,9,2);",
            "call put('outdd',EmpRec)"
 
parse pull firstname lastname empno salary
do while firstname <> ''
  interpret PutEmpRec
  parse pull firstname lastname empno salary
end

In-line routines like PutEmpRec can be as simple or as complex as you need them to be. They can contain entire programs, including IF, SELECT, and iterative DO instructions.
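As a small illustration of control logic inside an in-line routine, the following sketch adds a truncation warning to a build routine. The name BuildChecked and the sample data are hypothetical. Any DO or SELECT construct placed in such a string must be complete within the string.

```rexx
/* An in-line routine with an IF clause followed by the build */
BuildChecked = "if length(lastname) > 10 then",
               "say 'Warning: truncating' lastname;",
               "EmpRec = left(firstname,10)||left(lastname,10)"

firstname = 'Ada'
lastname  = 'Featherstonehaugh'
interpret BuildChecked

say '"'EmpRec'"'
```

The semicolon ends the SAY clause, so the assignment to EmpRec runs whether or not the warning is issued.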

A Simple Methodology for Sharing Record Definitions

In the previous examples we assumed that just one REXX source program would be accessing a file. However, in many applications, several components (separate source programs) may need to access the same file. To handle this problem, you could replicate record-handling code in all of the programs that need to process the file. Unfortunately, the replicated code would present a severe maintenance problem: any time you needed to change a record definition, you would have to modify all of the copies of the code.

REXX provides an elegant solution to this problem that is based on the "in-line" routine method, presented earlier. This solution allows all of the logic for handling a record to be placed in one external source program, called a record definition function. The record definition function does not perform any operations on a record directly. Instead, it returns the logic for performing an operation, encapsulated in an in-line procedure. The following example shows how this works.

First, we construct a record definition function for parsing and building the records in the employee file:

/* REXX  EmpRDef */
function = translate(arg(1))
recvname = arg(2)
select
  when function = 'PARSE' then
    return "parse var" recvname,
              "firstname +10 ",
              "lastname  +10 ",
              "empno     +04 ",
              "salary    +05;",
              "empno  =  c2d(empno);",
              "salary =  p2d(salary,2)"
  when function = 'BUILD' then
    return recvname "=",
              "left(firstname,10)||",
              "left(lastname,10)||",
              "d2c(empno,4)||",
              "d2p(salary,9,2)"
  otherwise
    return "*ERROR*"
end

The EmpRDef function takes two arguments. The first argument, function, indicates what type of in-line routine EmpRDef is to generate. The second argument gives the name of the variable that contains or will contain the employee record. When a value of "PARSE" is specified for the function argument, EmpRDef returns an in-line routine to parse and convert the fields of the record. If "BUILD" is specified, the code for converting the fields and constructing a record is returned.

Now let's look at how a program uses EmpRDef. The following program sequentially reads the employee file:

ParseEmpRec = EmpRDef('PARSE','EmpRec')
 
EmpRec = get('indd')
do while rc = 0
  interpret ParseEmpRec
  say firstname lastname empno d2pic(salary,'$$$,$$9.99')
  EmpRec = get('indd')
end

Notice that all of the information related to field positions and data types is hidden from EmpRDef's caller. The calling program needs to know only the names of the fields.
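The BUILD side works the same way. The sketch below is a self-contained stand-in, not the EmpRDef function itself: DemoRDef is a hypothetical internal function that uses printable fields and standard functions only, so that the build and parse round trip can be run without REXXTOOLS or an external source file.

```rexx
/* DemoRDef is a hypothetical stand-in for EmpRDef, written as  */
/* an internal function with printable fields.                  */
firstname = 'Grace'; lastname = 'Hopper'; empno = 17

interpret DemoRDef('BUILD','OutRec')    /* sets OutRec          */
firstname = ''; lastname = ''; empno = ''
interpret DemoRDef('parse','OutRec')    /* resets the fields    */
say strip(firstname) strip(lastname) empno
exit 0

DemoRDef: procedure
  function = translate(arg(1))          /* accept any case      */
  recvname = arg(2)
  select
    when function = 'PARSE' then
      return "parse var" recvname "firstname +10 lastname +10 empno +4;",
             "empno = empno + 0"
    when function = 'BUILD' then
      return recvname "= left(firstname,10)||left(lastname,10)||",
             "right(empno,4,'0')"
    otherwise
      return "say 'DemoRDef: unknown function'"
  end
```

The caller never sees an offset or a length. Changing the layout means changing DemoRDef only.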

This method is very flexible, and can be expanded to encompass much more of the process. A logical extension of this technique is the file definition function, a function that encapsulates I/O-related tasks as well as record definitions. A file definition function might, for example, accept a "GETALL" function code that would return all of the logic needed to:

Open the file
Read and parse all of the records into a family of stem arrays, and
Close the file.

Depending on how much information you want to hide in the file definition function, the calling program's file logic could be as simple as:

interpret EmpFDef('getall', 'indd')
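The following sketch suggests what the parsing portion of such a "getall" routine might look like. It is an assumption-laden illustration: the open, read, and close steps are omitted because they depend on the access method, the records are presumed to be in the stem rec. already, and the name GetAll is hypothetical.

```rexx
/* Records assumed already read into rec.1 ... rec.n, with the  */
/* count in rec.0 (stand-in for the open/read/close steps).     */
rec.1 = left('Ann',10)  || left('Lee',10)    || '0007'
rec.2 = left('Omar',10) || left('Haddad',10) || '0012'
rec.0 = 2

/* In-line routine that fills a family of stem arrays. The DO  */
/* loop is complete within the string.                         */
GetAll = "do i = 1 to rec.0;",
         "parse var rec.i firstname.i +10 lastname.i +10 empno.i +4;",
         "end;",
         "firstname.0 = rec.0"

interpret GetAll

do i = 1 to firstname.0
  say strip(firstname.i) strip(lastname.i) empno.i
end
```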

Other processes that you may want to include in this type of function are:

Automatic record locking using the REXXTOOLS ENQ and DEQ functions.
A function code that returns the names and data types of the fields in the record. In other words, a record descriptor.

Self-defining VSAM Files

If you are defining new VSAM files, you may want to consider making them self-defining files. A self-defining file is one that contains both data and the instructions for accessing the data. One way to make a KSDS self-defining is to store in-line subroutines for parsing and building records in the "high-values" record (many KSDSes are initialized with a record containing X'FFFFF...' in the key field). Any program that wants to process the self-defining file opens the data set and reads the high-values record. The program then has all of the logic needed to parse and build the other records in the file.
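The idea can be simulated in memory. In the sketch below a stem (file.) stands in for the KSDS, the key is assumed to be 4 bytes, and the high-values record carries a parse routine as its data. All names and the layout are invented for the example. A real program would read the high-values record directly by key rather than scanning for it.

```rexx
/* Simulated KSDS: 4-byte key followed by data (assumed layout) */
hv = copies('FF'x,4)                      /* high-values key    */
file.1 = '0001' || left('Widget',10) || '012'
file.2 = '0002' || left('Gadget',10) || '300'
file.3 = hv || "parse var rec key +4 item +10 qty +3"
file.0 = 3

/* Step 1: find the high-values record and extract the routine */
do i = 1 to file.0
  if left(file.i,4) == hv then parse var file.i . +4 ParseRec
end

/* Step 2: use the stored routine on the data records          */
do i = 1 to file.0
  rec = file.i
  if left(rec,4) == hv then iterate
  interpret ParseRec
  say key strip(item) qty
end
```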