IBM N series Hardware Manual page 193

System storage
Answers to these questions are determined in large part by the following types of
data and by how the data got into memory in the first place:
Write data
Written data tends not to be read back soon after writing, and it is often already cached
locally on the system that performed the write. Therefore, writes are not good candidates
for caching. In addition, recently written data is normally not a high priority for retention in
the system buffer cache: the overall write workload can be high enough that writes
overflow the cache and cause other, more valuable data to be ejected. However, some
read-modify-write workloads do benefit from caching recent writes. Examples include
stock market simulations and some engineering applications.
Sequential reads
Sequential reads can often be satisfied by reading a large amount of contiguous data from
disk at one time. In addition, as with writes, caching large sequential reads can cause
more valuable data to be ejected from system cache. Therefore, it is preferable to read
such data from disk and preserve available read cache for data that is more likely to be
read again. The N series provides algorithms to recognize sequential read activity and
read data ahead, which makes it unnecessary to retain this type of data in cache with a
high priority.
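As a rough illustration of how sequential activity can be recognized, the following sketch (hypothetical code, not Data ONTAP's actual algorithm; the class name and threshold are invented for this example) flags a read stream as sequential once several consecutive requests arrive at contiguous offsets, at which point data could be read ahead instead of retained with high priority:

```python
# Illustrative sketch only, not Data ONTAP code: flag a read stream as
# sequential after a run of contiguous requests, so it can be read ahead
# from disk rather than kept in cache with high priority.
class SequentialDetector:
    def __init__(self, threshold=3):
        self.next_expected = None   # offset where a contiguous read would start
        self.run_length = 0         # consecutive contiguous reads seen so far
        self.threshold = threshold  # hypothetical cutoff for "sequential"

    def observe(self, offset, length):
        """Record a read; return True once the stream looks sequential."""
        if offset == self.next_expected:
            self.run_length += 1
        else:
            self.run_length = 0
        self.next_expected = offset + length
        return self.run_length >= self.threshold
```

A detector like this would report False for scattered offsets and True only after the configured number of back-to-back contiguous reads.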
Metadata
Metadata describes where and how data is stored on disk (name, size, block locations,
and so on). Because metadata is needed to access user data, it is normally cached with
high priority to avoid the need to read metadata from disk before every read and write.
Small, random reads
Small, random reads are the most expensive disk operation because they require more
head seeks per kilobyte than sequential reads, and head seeks are a major source of
disk read latency. Therefore, data that is randomly read is a high priority for caching in
system memory.
The default caching behavior for the Data ONTAP buffer cache is to prioritize small, random
reads and metadata over writes and sequential reads.
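The priority ordering above can be sketched as a tiny eviction policy. This is illustrative code only, not the actual Data ONTAP buffer cache; the priority values and class are invented for the example. When the cache is full, the lowest-priority (and, on ties, oldest) buffer is ejected first, so metadata and small random reads outlive writes and sequential reads:

```python
# Illustrative sketch only, not the real Data ONTAP buffer cache:
# evict the lowest-priority buffer first, mirroring the default policy
# of favoring metadata and small random reads over writes and
# sequential reads. Priority values are hypothetical.
PRIORITY = {"metadata": 3, "random_read": 2, "sequential_read": 1, "write": 0}

class PriorityCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}   # key -> (priority, insertion order, data)
        self.counter = 0    # tie-breaker: on equal priority, oldest goes first

    def insert(self, key, data, kind):
        if len(self.entries) >= self.capacity:
            # Evict the entry with the lowest (priority, age) ranking.
            victim = min(self.entries, key=lambda k: self.entries[k][:2])
            del self.entries[victim]
        self.entries[key] = (PRIORITY[kind], self.counter, data)
        self.counter += 1
```

With a two-entry cache holding one metadata buffer and one write buffer, inserting a random read evicts the write buffer, not the metadata.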
Deciding which data to prefetch into system memory
The N series read ahead algorithms are designed to anticipate what data will be requested
and read it into memory before the read request arrives. Because of the importance of
effective read ahead algorithms, IBM performed a significant amount of research in this area.
Data ONTAP uses an adaptive read history logging system that is based on read sets, which
provides much better performance than traditional, fixed read-ahead schemes.
In fact, multiple read sets can support caching for individual files or LUNs, which means that
multiple read streams can be prefetched simultaneously. The number of read sets per file or
LUN object is related to the frequency of access and the size of the object.
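A minimal sketch of the idea of multiple read sets per object follows. This is hypothetical code, not Data ONTAP internals; the class names and the limit of four read sets are invented for illustration. Each incoming read either extends an existing stream or starts a new one, so several interleaved readers of the same file can each be tracked and prefetched:

```python
# Illustrative sketch only, not Data ONTAP code: track several
# concurrent read streams ("read sets") per file or LUN object, so
# interleaved readers can each be recognized and prefetched.
class ReadSet:
    def __init__(self, offset, length):
        self.next_expected = offset + length  # where this stream continues
        self.requests = 1                     # reads attributed to this stream

class FileReadHistory:
    def __init__(self, max_read_sets=4):      # hypothetical per-object limit
        self.read_sets = []
        self.max_read_sets = max_read_sets

    def observe(self, offset, length):
        """Attach the read to an existing stream, or start a new one."""
        for rs in self.read_sets:
            if offset == rs.next_expected:
                rs.next_expected = offset + length
                rs.requests += 1
                return rs
        rs = ReadSet(offset, length)
        if len(self.read_sets) >= self.max_read_sets:
            self.read_sets.pop(0)             # recycle the oldest stream
        self.read_sets.append(rs)
        return rs
```

Two readers alternating between offsets 0... and 1000... would each accumulate history in their own read set rather than defeating the detector.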
The system adaptively selects an optimized read-ahead size for each read stream based
on the following historical factors:
- The number of read requests processed in the read stream
- The amount of host-requested data in the read stream
- The read access style associated with the read stream
- Forward and backward reading
- Coalesced and fuzzy sequences of arbitrary read access patterns
Cache management is improved by these algorithms, which determine when to run
read-ahead operations and how long each read stream's data is retained in cache.
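To make the adaptive-sizing idea concrete, here is a simplified, hypothetical stand-in (the real Data ONTAP heuristics are not public; the growth schedule and caps below are invented): the prefetch window grows as a stream accumulates requests, and is bounded both by a maximum and by a multiple of what the host has actually asked for:

```python
# Illustrative sketch only, not Data ONTAP's algorithm: grow the
# read-ahead window as a stream proves itself, bounded by a cap and by
# how much data the host has actually requested so far.
def readahead_size(requests_seen, bytes_requested, max_ahead=1 << 20):
    """Scale prefetch from one 4 KB block up toward max_ahead."""
    base = 4096
    # Hypothetical schedule: double the window every four requests,
    # up to a factor of 2**8.
    window = base * (2 ** min(requests_seen // 4, 8))
    # Never prefetch more than twice the host-requested total,
    # and never more than the hard cap.
    return min(window, 2 * bytes_requested, max_ahead)
```

A brand-new stream gets a single block of read-ahead, while a long-running sequential stream earns a much larger window.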
Chapter 11. Core technologies
173
