IBM 1130 User Manual page 587

Computing system

not be the case, however. A sort can be in a des-
cending sequence, with the control key of each suc-
cessive record collating equal to or lower than that
of the preceding record.
Frequently, two or more sorted files have to be
merged into a single file of sequenced records.
In general, "merging" is a technique that collates
several sequences of data records to form a single
sequence. The number of files to be combined
during a merging operation is known as the order
of merge, or "merge order". Thus, a merge of
order m is called an "m-way merge". The proc-
essing of all the records once through the merge
is termed a "merge pass", or simply, a "pass".
The object of a pass is to reduce the number of
sequences (strings) by increasing the number of
records contained in each sequence. During a
single pass, the number of sequences is usually
reduced by a factor equal to the order of merge
(m). Several intermediate passes may be required
to reduce the file to a single sequence. A multi-
pass sort is a sort program designed to sort more
data than can be contained within the internal stor-
age of the central processing unit.
In this case, intermediate storage (disk) is required.
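
The effect of the merge order on the number of passes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the 1130 sort program itself: the names `merge_pass` and `merge_until_single` are hypothetical, the strings are held in memory rather than on disk, and the sample keys are chosen for the example.

```python
import heapq


def merge_pass(strings, m):
    """One merge pass: combine each group of m consecutive sorted strings."""
    return [list(heapq.merge(*strings[i:i + m]))
            for i in range(0, len(strings), m)]


def merge_until_single(strings, m):
    """Repeat m-way merge passes until one sequence remains.

    Returns the merged sequence and the number of passes taken.
    """
    if m < 2:
        raise ValueError("merge order must be at least 2")
    passes = 0
    while len(strings) > 1:
        strings = merge_pass(strings, m)
        passes += 1
    return (strings[0] if strings else []), passes


# Six sorted strings of two keys each (illustrative data).
strings = [[85, 603], [143, 801], [13, 35], [109, 706], [307, 431], [10, 444]]
merged, passes = merge_until_single(strings, 2)
print(passes, merged)
```

With these six strings a 2-way merge needs three passes (6, then 3, then 2, then 1 sequence), while calling `merge_until_single(strings, 3)` needs only two, which is the reduction by a factor of m described above.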
It is customary to segment a sort program into
a number of phases, each of which is executed as
one core storage load. For example, a typical
sort may be divided into four phases: an initialization
phase, an internal (presort or premerge)
phase within core storage, a merge phase (for
combining the sequences), and a final output phase.
The sequencing of a group of data records con-
tained at one time in core storage is known as an
"internal sort". The size of the internal sort
is the number of data records (abbreviated G) that
can be sequenced at one time in core storage.
Note, however, that since the number of data
records to be sorted usually exceeds G (the num-
ber contained at one time in core storage), the
internal sort process must generally be repeated
until all the records in the file have been sequenced
into strings that may later be combined, or
merged.
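
As a rough sketch of the repeated internal sort, the following Python fragment cuts a file into groups of at most G records and sequences each group separately. The function name `internal_sort` and the choice of G = 4 are assumptions made for illustration, and a Python list stands in for the file.

```python
def internal_sort(records, g):
    """Sort the file g records at a time, yielding one string per group."""
    if g < 1:
        raise ValueError("g must be at least 1")
    return [sorted(records[i:i + g]) for i in range(0, len(records), g)]


# Twelve control keys, with room for G = 4 records in "core storage".
keys = [85, 603, 143, 801, 13, 35, 109, 706, 431, 307, 10, 444]
strings = internal_sort(keys, 4)
print(strings)
```

Each resulting string is in sequence on its own, but the file as a whole is not, so the strings still have to be merged.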
It has been implied that sorting consists of mov-
ing data records around until their respective con-
trol keys are in the proper collating sequence. This
is not always the case. In some sorting methods,
the control keys upon which sequencing is based
are read from the record and combined with the
record number (called tag) to form a key-tag pair.
Then the keys are sorted, rather than the original
records. After sorting, the tags serve as an index
for later retrieval of the data records in the desired
sequence (see Figure 75.3).

In core storage before sorting:

    Key   NREC
    085      1
    603      2
    143      3
    801      4
    013      5
    035      6
    109      7
    706      8
    431      9
    307     10
    010     11
    444     12

Disk data records: record numbers (NREC) 1 through 12, holding keys
085, 603, 143, 801, 013, 035, 109, 706, 431, 307, 010, 444 in that order.

In core storage after sorting:

    Key   NREC
    010     11
    013      5
    035      6
    085      1
    109      7
    143      3
    307     10
    431      9
    444     12
    603      2
    706      8
    801      4

Keys are physically sorted (moved around), with each corresponding
record number (NREC) moved with it.

Now, either physically move the disk data records, or process
(e.g., print report) by obtaining disk records in the order found in
the NREC table.

Figure 75.3. Tag sort

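
The tag sort of Figure 75.3 can be reproduced with a short Python sketch. It uses the twelve keys from the figure; the function name `tag_sort` and the dictionary standing in for the disk file are assumptions made for illustration only.

```python
def tag_sort(keys):
    """Build key-tag pairs (tag = record number, from 1) and sort by key."""
    pairs = [(key, nrec) for nrec, key in enumerate(keys, start=1)]
    pairs.sort()  # only the small key-tag pairs move, not the records
    return pairs


# Keys in record-number order; fixed-width strings keep the leading zeros
# and still collate correctly because every key has three digits.
keys = ["085", "603", "143", "801", "013", "035",
        "109", "706", "431", "307", "010", "444"]

# Hypothetical stand-in for the disk file: record number -> data record.
disk = {nrec: f"record {nrec}" for nrec in range(1, len(keys) + 1)}

sorted_pairs = tag_sort(keys)
nrec_table = [nrec for _, nrec in sorted_pairs]

# Process the data records in key sequence by following the NREC table.
for nrec in nrec_table:
    print(disk[nrec])
```

The resulting `nrec_table` is 11, 5, 6, 1, 7, 3, 10, 9, 12, 2, 8, 4, which matches the NREC column of the figure after sorting.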
The effectiveness of a sort program is measured
by the time it takes to sort a file of data records.
If the sorting method alone determined the overall
performance and speed, the choice of the best
method would be relatively simple. In actuality,
though, sort performance is the result of a complex
interaction between the characteristics of the data
file, the data processing system, the sorting
method used, the objectives desired, and a number
