Parsing Records With REXX
Most programmers who are new to REXX parse records using the
SUBSTR (substring) function. For example, to parse a record with
3 fields you could code:
    name    = substr(record,1,10)
    ssn     = substr(record,11,9)
    balance = substr(record,20,5)
While this method translates easily from other languages (most of which support some form of substring), it suffers from several defects:
Experienced REXX programmers use other techniques to parse records. The most flexible and by far the best performing method for breaking apart records is the REXX PARSE instruction.
The REXX PARSE instruction has several distinct advantages over other methods:
The next two sections provide a brief review of PARSE VAR and PARSE VALUE, the two forms of PARSE that are particularly useful for handling record fields.
Note: The TSO/E Version 2 MVS/REXX Reference, SC28-1883 contains an excellent chapter on REXX parsing (in the TSO/E 2.4 level of this manual, see "Chapter 5. Parsing"). This chapter describes all of the uses of PARSE, and contains a flow chart which documents, precisely, the behavior of the instruction. This chapter should be required reading for anyone interested in using the PARSE instruction.
PARSE VAR varname template
where varname is the name of the REXX variable that contains the data to parse, and template is the set of instructions for executing the parse (templates are discussed below). Varname can specify either a simple symbol or a REXX stem variable.
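As a minimal sketch of both forms of varname, the following fragment parses the same data first from a simple symbol and then from an element of a stem. The record layout (a 10-byte name, a 9-byte SSN, and a 5-byte balance) and the sample values are assumptions made for illustration only:

```rexx
/* REXX - PARSE VAR with a simple symbol and with a stem element */
record = left('FRED',10)||'123456789'||'00125'  /* 24-byte sample record */

/* varname is a simple symbol */
parse var record name +10 ssn +9 balance +5
say '['name']' '['ssn']' '['balance']'

/* varname is an element of a stem (the tail is resolved first) */
line.1 = record
i = 1
parse var line.i name +10 ssn +9 balance +5
say strip(name) ssn balance+0
```

The brackets in the first SAY make the padding visible: NAME keeps its trailing blanks because positional parsing never strips them.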
PARSE VALUE expression WITH template
where expression is any arbitrarily complex REXX expression, and template is, again, the instructions for conducting the parse.
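The sketch below shows PARSE VALUE applied to an expression that is assembled on the spot from built-in function calls. The field names, lengths, and values are hypothetical; only standard REXX built-in functions are used:

```rexx
/* REXX - PARSE VALUE evaluates an expression, then applies the template */
first = 'Ada'
last  = 'Lovelace'

/* The expression builds a 22-byte string: two 10-byte character */
/* fields followed by a halfword binary integer.                 */
parse value left(first,10)||left(last,10)||d2c(42,2) with,
            fname +10 lname +10 empno +2

say strip(fname) strip(lname) c2d(empno)
```

No intermediate variable is needed: the keyword WITH marks where the expression ends and the template begins.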
In the context of file access, PARSE VALUE is of special importance because it can be combined with the REXXTOOLS GET function, which returns a retrieved record. The following REXX instruction demonstrates how this works:
    parse value get('indd',keyvar,'(key,dir)') with,
                name +10 ssn +9 address +30 salary +5
As you can see, with one REXX instruction you can both read a record and separate its fields.
A string pattern specifies data which the PARSE instruction is to search for in order to split out fields. For example, in the following PARSE VAR instruction, the patterns "SSN=", "NAME=" and "ZIP=" are used to find where the record is to be divided:
    record = get('indd')
    /* Assume record contains:
       'SSN=123456789 NAME=FRED ZIP=12345' */
    parse var record 'SSN=' ssn 'NAME=' name 'ZIP=' zip
    /* This yields (the blank before each following
       pattern stays with the preceding field):
       ssn  = '123456789 '
       name = 'FRED '
       zip  = '12345'
    */
Positional patterns are used to indicate the absolute or relative positions at which the record is to be split. Parsing by absolute position is equivalent to SUBSTR parsing (except that you can substring many fields in one instruction). In the following example, absolute positions are used to extract three fields:
    /* Positions:----+----1----+----2----+ */
    parse value 'Richard   Travis    01' with,
                firstname 11 lastname 21 empno
    /* This yields:
       firstname = 'Richard   '
       lastname  = 'Travis    '
       empno     = '01'
    */
Relative positions are indicated by coding a plus sign (+) or a minus sign (-) in front of the number. Parsing by relative position is frequently much easier than parsing by absolute position, because you don't need to manually calculate field offsets. You only need to know how long a field is. For example, the PARSE in the previous example re-coded for relative positions would look like this:
    parse value 'Richard   Travis    01' with,
                firstname +10 lastname +10 empno
The following table describes the conversion functions you will need:
Data Type | Conversion After GET and PARSE | Conversion Before Build and PUT |
---|---|---|
2 byte binary integer (halfword) | C2D(fieldvalue) | D2C(fieldvalue,2) |
4 byte binary integer (fullword) | C2D(fieldvalue) | D2C(fieldvalue,4) |
packed decimal | P2D(fieldvalue,s) where s is the number of fractional digits. | D2P(fieldvalue,p,s) where p is the total number of digits and s is the number of fractional digits. |
short floating point | F2D(fieldvalue) | D2F(fieldvalue,'SHORT') |
long floating point | F2D(fieldvalue) | D2F(fieldvalue,'LONG') |
printable numeric edited | PIC2D(fieldvalue) | D2PIC(fieldvalue,picstr) where picstr is the picture specification. |
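The binary integer rows of the table can be tried with standard REXX alone. The following is a sketch under assumed field names and values; it builds a 6-byte record from a halfword and a fullword, then converts the fields back. It also illustrates a point the table does not cover: C2D with one argument treats the field as unsigned, so a field that may hold negative values needs the length argument.

```rexx
/* REXX - round trip for binary integer fields (standard built-ins only) */
deptno = 17
empno  = 1234

/* Conversion before build: halfword + fullword = 6 bytes */
record = d2c(deptno,2)||d2c(empno,4)
say length(record)                      /* 6 */

/* Conversion after parse */
parse var record deptfld +2 empfld +4
say c2d(deptfld) c2d(empfld)            /* 17 1234 */

/* Signed fields: supply the length so C2D honors the sign bit */
neg = d2c(-5,2)                         /* 'FFFB'x */
say c2d(neg) c2d(neg,2)                 /* 65531 -5 */
```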
    parse value 'Abraham   Lincoln   President ' with,
                record 1 firstname +10 lastname +10 job
    /* Results in the following:
       record    = 'Abraham   Lincoln   President '
       firstname = 'Abraham   '
       lastname  = 'Lincoln   '
       job       = 'President '
    */
You can also use relative positioning to "back up" to an earlier position, and parse forward from there. This technique is demonstrated in the following example:
    parse value get('indd') with rectype +2,
                fname +10 lname +10 -20,     /* rectype=1 */
                Address +30 -30,             /* rectype=2 */
                salary +5 bonus +5           /* rectype=3 */
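Because every layout is parsed on each pass, the program still has to decide which set of variables is meaningful. One way to do that, sketched here with a hypothetical literal record standing in for the value returned by GET, is to test RECTYPE in a SELECT after the parse:

```rexx
/* REXX - one template, three overlaid record layouts */
record = '02'||left('12 Elm Street',30)   /* hypothetical type-2 record */

parse var record rectype +2,
          fname +10 lname +10 -20,
          address +30 -30,
          salary +5 bonus +5

/* Only the variables that belong to this record type are meaningful */
select
  when rectype = 1 then say 'Name:' strip(fname) strip(lname)
  when rectype = 2 then say 'Address:' strip(address)
  when rectype = 3 then say 'Pay:' salary bonus
  otherwise say 'Unknown record type' rectype
end
```

The variables for the other layouts are still assigned (FNAME here holds the first 10 bytes of the address), so they should simply be ignored for this record type.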
    GetEmpRec = "parse value get('indd') with",
                "firstname +10 lastname +10 empno +4",
                "salary +5; empno=c2d(empno);",
                "salary=p2d(salary,2)"
    interpret GetEmpRec
    do while rc = 0
      salary = salary * 1.10
      say firstname"'s new salary:"d2pic(salary,'$$$,$$9.99')
      interpret GetEmpRec
    end
Note that GetEmpRec contains all of the instructions for reading, parsing and converting fields in a record. Each REXX instruction is separated by a semicolon (;). The REXX INTERPRET instruction is used to run this in-line parse routine (which can be as complex as you need) whenever another record is required.
Another way to handle this problem is to place all of the
statements for reading (or writing) a record into a conventional
internal subroutine. Be aware, though, that internal subroutines
cannot be shared among separate REXX source files, whereas
strings such as GetEmpRec can be shared (see "A Simple Methodology
for Sharing Record Definitions").
An external routine could also be used, but then you would need
to use global variables to pass field values between the called
and calling routines (not an elegant solution).
Building Records With REXX
New records or records whose fields have been changed must be
reassembled before they are written to a file. Using the REXX
concatenation operator (||), an entire record can be built with a
single REXX instruction. For example:
    EmpRec = left(firstname,10)||,
             left(lastname,10)||,
             d2c(empno,4)||,
             d2p(salary,9,2)
In this example, the REXX LEFT function left-justifies the character fields (FIRSTNAME and LASTNAME) and truncates or pads them to their desired lengths (in this case, 10 bytes). The D2C function converts the EMPNO field from a REXX decimal number to a fullword binary integer, and the REXXTOOLS D2P function converts SALARY from a REXX decimal number to a packed decimal number.
    PutEmpRec = "EmpRec=left(firstname,10)||",
                "left(lastname,10)||",
                "d2c(empno,4)||",
                "d2p(salary,9,2);",
                "call put('outdd',EmpRec)"
    parse pull firstname lastname empno salary
    do while firstname <> ''
      interpret PutEmpRec
      parse pull firstname lastname empno salary
    end
In-line routines like PutEmpRec can be as simple or as complex as
you need them to be. They can contain entire programs, including
IF, SELECT, and iterative DO instructions.
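As a small illustration of control logic inside an in-line routine, the following sketch stores an IF/ELSE and an iterative DO in a string and interprets it twice. The routine name Classify and its variables are hypothetical. Note that any DO or SELECT group must be complete within the interpreted string, and that labels cannot be used there.

```rexx
/* REXX - an in-line routine holding IF and iterative DO logic */
Classify = "if amount < 0 then kind = 'debit';",
           "else kind = 'credit';",
           "do i = 1 to 2; say kind i; end"

amount = -25
interpret Classify      /* says: debit 1, then debit 2   */

amount = 40
interpret Classify      /* says: credit 1, then credit 2 */
```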
A Simple Methodology for Sharing Record Definitions
In the previous examples we assumed that just one REXX source
program would be accessing a file. However, in many
applications, several components (separate source programs) may
need to access the same file. To handle this problem, you could
replicate record-handling code in all of the programs that need
to process the file. Unfortunately, the replicated code would
present a severe maintenance problem: any time you needed to
change a record definition, you would have to modify all of the
copies of the code.
REXX provides an elegant solution to this problem that is based on the "in-line" routine method, presented earlier. This solution allows all of the logic for handling a record to be placed in one external source program, called a record definition function. The record definition function does not perform any operations on a record directly. Instead, it returns the logic for performing an operation, encapsulated in an in-line procedure. The following example shows how this works.
First, we construct a record definition function for parsing and building the records in the employee file:
    /* REXX EmpRDef */
    function = translate(arg(1))
    recvname = arg(2)
    select
      when function = 'PARSE' then
        return "parse var" recvname,
               "firstname +10 ",
               "lastname +10 ",
               "empno +04 ",
               "salary +05;",
               "empno = c2d(empno);",
               "salary = p2d(salary,2)"
      when function = 'BUILD' then
        return recvname "=",
               "left(firstname,10)||",
               "left(lastname,10)||",
               "d2c(empno,4)||",
               "d2p(salary,9,2)"
      otherwise
        return "*ERROR*"
    end
The EmpRDef function takes two arguments. The first argument, function, indicates what type of in-line routine EmpRDef is to generate. The second argument gives the name of the variable that contains or will contain the employee record. When a value of "PARSE" is specified for the function argument, EmpRDef returns an in-line routine to parse and convert the fields of the record. If "BUILD" is specified, the code for converting the fields and constructing a record is returned.
Now let's look at how a program uses EmpRDef. The following program sequentially reads the employee file:
    ParseEmpRec = EmpRDef('PARSE','EmpRec')
    EmpRec = get('indd')
    do while rc = 0
      interpret ParseEmpRec
      say firstname lastname empno d2pic(salary,'$$$,$$9.99')
      EmpRec = get('indd')
    end
Notice that all of the information related to field positions and data types is hidden from EmpRDef's caller. The calling program needs to know only the names of the fields.
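The BUILD side can be exercised the same way. The sketch below is a simplified, self-contained variation and not the EmpRDef function itself: DemoRDef is a hypothetical name, it is coded as an internal function only so that the example fits in one source file, and the packed decimal SALARY field is omitted so that only standard REXX built-in functions are needed. It builds a record from field variables, clears them, and parses the record back.

```rexx
/* REXX - build a record, then parse it back, via one definition function */
firstname = 'Grace'
lastname  = 'Hopper'
empno     = 7

interpret DemoRDef('BUILD','EmpRec')
say length(EmpRec)                        /* 24 */

firstname = ''; lastname = ''; empno = ''
interpret DemoRDef('PARSE','EmpRec')
say strip(firstname) strip(lastname) empno
exit

/* DemoRDef is hypothetical. It is internal here only to keep the  */
/* sketch in one file; a shared definition would be external.      */
DemoRDef: procedure
  function = translate(arg(1))
  recvname = arg(2)
  select
    when function = 'PARSE' then
      return "parse var" recvname,
             "firstname +10 lastname +10 empno +4;",
             "empno = c2d(empno)"
    when function = 'BUILD' then
      return recvname "=",
             "left(firstname,10)||left(lastname,10)||",
             "d2c(empno,4)"
    otherwise
      return "*ERROR*"
  end
```

Because INTERPRET runs the returned string in the caller's variable scope, the field variables set before the BUILD call and read after the PARSE call belong to the calling program, even though DemoRDef itself is a PROCEDURE.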
This method is very flexible, and can be expanded to encompass much more of the process. A logical extension of this technique is the file definition function, a function that encapsulates I/O-related tasks as well as record definitions. A file definition function might, for example, accept a "GETALL" function code that would return all of the logic needed to:
Depending on how much information you want to hide in the file definition function, the calling program's file logic could be as simple as:
interpret EmpFDef('getall', 'indd')
Other processes that you may want to include in this type of function are: