Join

PURPOSE:  The equated command JOIN extracts information from up to ten Data Sets (DS) with Data Indexes (DI) and produces a new DS object with the combined information.  A report of the combined information may be generated.

There are four types of JOIN operations:

  1. Key Field JOIN joins records in different DS’s by matching key values for one or more key fields.  Only records of matched key values are combined.  When one or more key fields are specified, the records are combined based on the matched key values.  If matching key values are specified for a JOIN operation of three DS’s, matching values must be found in all three or no record will be created.

  2. Non-Key Field JOIN combines all records of each DS with all records of all other DS’s.

  3. Conditional JOIN finds the first record that meets the first condition in the first DS.  Next, it finds the first record that meets the first condition in the second DS.  This continues for each additional DS.  If fields are not specified in the JOINed DS, ACCENT R combines all fields from all extracted records together into a new DS.

  4. JOIN within the Same Data Set combines records of the same DS.  In this type of JOIN operation, it is possible to combine fields and create a new DS.  Records can also be combined on different conditional operations.  Each conditional operation is specified as another DS and is given a different designator.  The records collected under each designator are then combined into a new DS.

This utility is started by a command that generates a Process Module (PM).  This PM may be SAVEd by adding the DEFINE PM option as part of the JOIN command.  It can be executed later as part of an application.  This saves the time it takes to generate the PM each time the JOIN command is executed.

Syntax

JOIN ds_name [USE DI di_name [DOMAIN domain_name]] [AS designator]

[IF; UNLESS clause] [\\WITH ds_name [USE DI di_name [DOMAIN domain_name]

[AS designator] [SINGLE] [{IF; UNLESS} clause]\\] [KEYED] ON field_list]

[TO;CREATE ds_name APPEND] FIELD/S/ ALL; UNIQUE; field_name_list

[SET \\exp TO output_field_name\\] [SHOW clause] [DEFINE PM pm_name]

USE DI di_name

specifies a Data Index (DI) to be used in JOINing.  If a DI is specified with the first DS, the records are retrieved sequentially through the index.  ACCENT R does a keyed search in other DSs which have a DI specified for it (using the key fields of the main DS for the matching values).

DOMAIN domain_name

specifies a domain for the specified index to be used in JOINing.

AS designator

specifies an identifier for a DS.  It can be used to identify fields in both the SHOWing clause and the SET clause.  The default designator names are DS1 through DS10.  DS1 identifies the main DS and DS2 through DS10 identify those DSs specified in the WITH clause.

IF; UNLESS clause

specifies a condition used to qualify records of the specified DS.

WITH ds_name

can be used up to nine times to specify a total of ten DSs in the JOIN operation.

SINGLE

specifies that only the first record of that key or group of qualified records for the JOIN is selected.  When two equal keys are in a DS, ACCENT R only creates one record for that combination.  This option is used when only a single match is needed to obtain specific information.

[KEYED] ON

specifies the key fields to be used in the matching process.  It should be defined in all DS's specified in the command.  When KEYED is used, a DI must be specified.  If no DI is specified, the DS is assumed to be in the sorted order of the key fields.  The ON clause does not check the sequence, therefore the results are unpredictable if the records are not in the correct sorted order.  If the ON clause is not specified, a cartesian JOIN is performed where all records of the DS are selected.

TO ds_name APPEND

specifies the name of the output DS.  Data is moved from each DS’s fields in the order that the DS's were specified in the command.  If a field of the output DS exists in more than one of the input DS's, the data will be taken from the field of the first specified DS.  ACCENT R does not check that any fields of the input DS's exist in the output DS therefore, the field exist as an empty record and will be created..

CREATE ds_name APPEND

checks if the specified output DS exists.  If it does not, this option will define a Schema Definition (SD) which contains all fields of all input DSs (unless a FIELDS clause is used to specify the desired fields).  If a field exists in more than one input DS, ACCENT R provides only the first occurrence (unless the ALL option of the FIELD clause is specified).  Both the SD and the DS are created in the current Data Base Library (DBL).

 

If the DS already exists in the current DBL, ACCENT R uses the DS.  ACCENT R does not check if all fields of all the JOINed DS's are defined in the output DS.  ACCENT R does check the first line of the SD to be sure it was created by the JOIN command before using it.  If it was not, the command is aborted.

FIELD/S/ ALL

specifies which fields will appear in the output DS.  ALL generates a schema with all fields and for those which have the same name in several DS's, it appends the designator preceded by a period to the name of the field’s second and subsequent occurrences.

UNIQUE

is the default option.  When generating the SD, only the first occurrence of the field is generated in the SD when more than one of the JOINed DS's have the same field name.

field_name_list

specifies which fields will be contained in the SD.  ACCENT R then checks each DS for the specified fields.  The input field names are checked both with and without their appended designator.  If a field list is specified and the CREATE option is not, ACCENT R checks the output DS’s SD for the fields in the list and only fills in values for those fields.

SET\\ exp TO output_field_name\\

is executed after all the data is moved from the input records to the output records.  Values may then be changed by specifying which input fields should be this clause.  SET also allows values to be moved from the input to output fields of a different name.  The input field names should be qualified with the input fields designator if not unique.

SHOW clause

displays information from the input DS's for each matching situation where a JOINed record would normally be created.  Fields should be qualified if they are not unique.

DEFINE PM  pm_name

generates a PM that may be SAVEd and run as part of an application.  This saves the time it takes to generate it each time the JOIN command is executed.  The PM is created in the current DBL.

EXAMPLE

In this example, the DS BOOKS will be read until a match is found for TITLE_CODE.  Without the SINGLE option, the DS BOOKS is processed in its entirety for each record of DS AUTHOR.

JOIN AUTHOR WITH BOOKS IF TITLE_CODE:DS1 = TITLE_CODE:DS2 &
SINGLE SHOW AUTHOR_NAME, 1S, TITLE_CODE, 1S, ROYALTY

These examples show four different types of JOIN operations.  The first three examples combine records from two or more DS’s.  The last example shows a conditional JOIN of a single DS.  Before executing JOIN, the DS’s must have a DI or be sorted on the key field that will be JOINed on.

For the first three examples, three DS’s are used.  DS WRITER contains the name of a writer and location information (city-state).  DS MAG contains the name of the writer and of the magazine for which they work.  DS SALARY contains the name of the writer and the salary amount they make.  Each DS can be joined by using the writer name as the common link.

*USE DS WRITER
*STRUCTURE
NAME    CHAR   10
CITY    CHAR   12
STATE   CHAR   2
*EXTRACT
CHICK    ATLANTA    GA
SAM      SAN JOSE   CA
SCOTT    BOISE      ID
SHARON   CHICAGO    IL
*USE DS MAG
*STRUCTURE
NAME     CHAR   10
MAG      CHAR   15
*EXTRACT
CHICK    HOCKEY REVIEW
SAM      HITECH
SCOTT    HIKING DIGEST
SHARON   FASHION WORLD
*USE DS SALARY
*STRUCTURE
NAME     CHAR    10
SALARY   FLOAT   6
*EXTRACT
CHICK    32000
SAM      24000
SCOTT    30000
SHARON   45000
*JOIN WRITER WITH SALARY ON NAME SHOW NAME,1S,SALARY,1S,CITY,&
1S ,STATE
CHICK    32000   ATLANTA    GA
SAM      24000   SAN JOSE   CA
SCOTT    30000   BOISE      ID
SHARON   45000   CHICAGO    IL
*JOIN WRITER IF NAME HAS 'CHIC' WITH SALARY IF &
NAME HAS 'CHIC' WITH MAG IF NAME HAS 'CHIC' &
ON NAME CREATE CHIC
*USE DS CHIC
*EXTRACT
CHICK   ATLANTA   GA   32000   HOCKEY REVIEW
*JOIN WRITER AS X WITH SALARY AS Y WITH MAG ON NAME DEFINE &
PM STAFF CREATE STAFF SHOW NAME, 1S, SALARY, 1S, MAG, &
1S, CITY, 1S, STATE
CHICK    32000   HOCKEY DIGEST   ATLANTA    GA
SAM      24000   HITECH          SAN JOSE   CA
SCOTT    30000   HIKING DIGEST   BOISE      ID
SHARON   45000   FASHION WORLD   CHICAGO    IL

JOIN creates a record in STAFF with the combined fields from the sets WRITER, SALARY and MAG, SAVEs the generated PM in the current DBL as STAFF, and also creates SD STAFF and the DS STAFF.  In the last example, a different DS is used.

*USE DS EMPLOYEES<CR>
*STRUCTURE<CR>
EMP_ID   INT     5
F_NAME   CHAR    8
L_NAME   CHAR   10
SALARY   FLOAT   7   2
STATUS   INT     2
MGR_ID   INT     5
JOB      CHAR   20
*EXTRACT<CR>
50001   DALE    BROOKS     3700.00   15   99999   PUB REL
39921   TOM     BRECORDN   2978.12   12   43231   MAINTENANCE
53999   CHICK   DODSON     5000.00   15   50001   MKT MGR
25431   MARY    GORGEN     2700.00   12   43231   FOREMAN
12345   BOB     HAMILTON   2645.67   12   43231   PRINTER
43231   BILL    WOODSON    3333.33   14   50001   SUPERVISOR

This example shows a conditional JOIN of a single DS.  The JOIN finds all employees who make more than Bob Hamilton and the name of their supervisors, displays Bob’s salary, the name of the persons who make more than Bob in his status code, their salary, and the name of their supervisors.

*JOIN EMPLOYEES AS X IF F_NAME='BOB' AND L_NAME='HAMILTON' &
WITH EMPLOYEES AS Y IF STATUS:X=STATUS AND SALARY:X < SALARY &
WITH EMPLOYEES AS Z IF MGR_ID:Y=EMP_ID SHOW F_NAME:X,1S,&
L_NAME:X,1S,SALARY:X, 3S, F_NAME:Y,1S,L_NAME:Y,1S,SALARY:Y,3S, &
F_NAME:Z,1S,L_NAME:Z
BOB HAMILTON 2645.67 TOM BRECORDN 2978.12 BILL WOODSON
BOB HAMILTON 2645.67 MARY GORGEN  2700.00 BILL WOODSON

NOTES:  The generated PM may also be modified to execute operations which cannot be done with the SET clause such as generating a report from the PM.  The generated PM is commented to make it easy to understand.

The optional clauses may be specified in any order except that the DOMAIN clause must follow the USE DI clause.