IBM N Series Hardware Manual page 175

System storage
writes overflow the cache and cause other, more valuable data to be ejected. However,
some read-modify-write type workloads benefit from caching recent writes. Examples
include stock market simulations and some engineering applications.
Sequential reads: Sequential reads can often be satisfied by reading a large amount of
contiguous data from disk at one time. In addition, as with writes, caching large sequential
reads can cause more valuable data to be ejected from system cache. Therefore, it is
preferable to read such data from disk and preserve available read cache for data that is
more likely to be read again. The N series provides algorithms to recognize sequential
read activity and read data ahead, making it unnecessary to retain this type of data in
cache with a high priority.
Metadata: Metadata describes where and how data is stored on disk (name, size, block
locations, and so on). Because metadata is needed to access user data, it is normally
cached with high priority to avoid the need to read metadata from disk before every read
and write.
Small, random reads: Small, random reads are the most expensive disk operation
because they require a higher number of head seeks per kilobyte than sequential reads.
Head seeks are a major source of the read latency associated with reading from disk.
Therefore, data that is randomly read is a high priority for caching in system memory.
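A rough illustration of why random reads are prioritized: the sketch below compares the per-kilobyte cost of small random reads against large sequential reads. The seek time and transfer rate are assumed, typical hard-disk figures, not N series measurements.

```python
# Back-of-the-envelope illustration (assumed typical 7.2K RPM disk
# figures, not N series measurements): why small random reads cost far
# more per kilobyte than sequential reads.
SEEK_MS = 8.0          # average seek + rotational latency (assumed)
TRANSFER_MB_S = 100.0  # sustained sequential transfer rate (assumed)

def ms_per_kb(read_kb, random):
    # Transfer time scales with size; the seek penalty is paid once
    # per random read regardless of how little data it fetches.
    transfer_ms = read_kb / 1024 / TRANSFER_MB_S * 1000
    return (SEEK_MS if random else 0.0) + transfer_ms

random_4k = ms_per_kb(4, random=True) / 4            # ms per KB, 4 KB random reads
sequential_1m = ms_per_kb(1024, random=False) / 1024  # ms per KB, 1 MB sequential read
print(f"random 4 KB reads: {random_4k:.3f} ms/KB")
print(f"sequential 1 MB:   {sequential_1m:.4f} ms/KB")
```

With these assumptions the per-kilobyte cost of small random reads is more than two orders of magnitude higher, which is why caching them saves far more disk time than caching sequential data.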
The default caching behavior for the Data ONTAP buffer cache is to prioritize small, random
reads and metadata over writes and sequential reads.
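The priority scheme above can be sketched as a two-level cache that evicts low-priority data (writes, sequential reads) before high-priority data (metadata, small random reads). This is a hypothetical illustration of the policy, not Data ONTAP code.

```python
from collections import OrderedDict

# Hypothetical sketch (not Data ONTAP internals): a buffer cache that
# ejects low-priority blocks before high-priority blocks, approximating
# the default policy described above.
LOW, HIGH = 0, 1

class PriorityBufferCache:
    def __init__(self, capacity):
        self.capacity = capacity
        # One LRU list per priority level; evict from LOW first.
        self.lru = {LOW: OrderedDict(), HIGH: OrderedDict()}

    def insert(self, block_id, data, priority):
        self.lru[priority][block_id] = data
        self.lru[priority].move_to_end(block_id)
        self._evict_if_needed()

    def lookup(self, block_id):
        for level in (HIGH, LOW):
            if block_id in self.lru[level]:
                self.lru[level].move_to_end(block_id)  # refresh recency
                return self.lru[level][block_id]
        return None  # cache miss: caller reads from disk

    def _evict_if_needed(self):
        while sum(len(l) for l in self.lru.values()) > self.capacity:
            # Eject least-recently-used LOW-priority data first, so
            # metadata and random reads survive bursts of writes.
            victim_level = LOW if self.lru[LOW] else HIGH
            self.lru[victim_level].popitem(last=False)

cache = PriorityBufferCache(capacity=3)
cache.insert("seq-1", b"...", LOW)       # sequential read: low priority
cache.insert("meta-root", b"...", HIGH)  # metadata: high priority
cache.insert("rand-7", b"...", HIGH)     # small random read: high priority
cache.insert("seq-2", b"...", LOW)       # overflow evicts "seq-1", not metadata
```

Note that the sequential block is ejected even though the metadata block is older; recency only breaks ties within a priority level.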
Deciding which data to prefetch into system memory
The N series read-ahead algorithms are designed to anticipate which data will be requested
and read it into memory before the read request arrives. Because effective read-ahead
algorithms are so important, IBM has done a significant amount of research in this area.
Data ONTAP uses an adaptive read history logging system based on "read sets", which
provides much better performance than traditional, fixed read-ahead schemes.
Multiple read sets can be maintained for each file or LUN, which means that multiple read
streams can be prefetched simultaneously. The number of read sets for a file or LUN object
depends on how frequently the object is accessed and on its size.
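A minimal sketch of per-object read sets follows. The fuzzy-match window and the cap on the number of sets are assumptions for illustration; the manual does not publish ONTAP's actual rules.

```python
# Hypothetical sketch (not ONTAP internals): tracking multiple "read sets"
# per file so that several concurrent read streams can each be recognized
# and prefetched independently.
class ReadSet:
    def __init__(self, offset, length):
        self.next_offset = offset + length  # where this stream should continue
        self.reads = 1                      # history used to size read-ahead

    def matches(self, offset, fuzz=64 * 1024):
        # "Fuzzy" matching (assumed rule): a read belongs to this stream
        # if it lands close to the expected continuation point.
        return abs(offset - self.next_offset) <= fuzz

class FileReadHistory:
    def __init__(self, max_sets=4):
        # In the real system the number of read sets scales with access
        # frequency and object size; a fixed cap is used here.
        self.sets = []
        self.max_sets = max_sets

    def record(self, offset, length):
        for rs in self.sets:
            if rs.matches(offset):
                rs.next_offset = offset + length
                rs.reads += 1
                return rs  # continuing stream: candidate for read-ahead
        rs = ReadSet(offset, length)
        self.sets.append(rs)
        if len(self.sets) > self.max_sets:
            self.sets.pop(0)  # forget the oldest stream
        return rs

hist = FileReadHistory()
s1 = hist.record(0, 4096)           # first stream starts at offset 0
s2 = hist.record(1_000_000, 4096)   # a second, interleaved stream
s1b = hist.record(4096, 4096)       # continues the first stream
```

Because each stream keeps its own history, interleaved sequential readers of the same file do not confuse one another, which is the point of maintaining multiple read sets per object.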
The system adaptively selects an optimized read-ahead size for each read stream based on
these historical factors:
- The number of read requests processed in the read stream
- The amount of host-requested data in the read stream
- A read access style associated with the read stream
- Forward and backward reading
- Identifying coalesced and fuzzy sequences of arbitrary read access patterns
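One way the listed factors could feed a read-ahead size decision is sketched below. The constants and the formula are illustrative assumptions, not ONTAP's actual tuning.

```python
# Hypothetical sketch: choosing a read-ahead size for a stream from its
# history (request count, total bytes requested, access direction). All
# constants are illustrative, not ONTAP values.
def read_ahead_size(requests, bytes_requested, direction):
    if requests < 2:
        return 0  # no history yet: do not speculate
    avg = bytes_requested // requests  # typical request size in this stream
    # Longer, well-established streams earn more aggressive prefetch,
    # capped so that one stream cannot flood the cache.
    factor = min(requests, 8)
    size = min(avg * factor, 1024 * 1024)
    # Backward readers would be prefetched before the current offset;
    # the sign signals direction in this sketch.
    return size if direction == "forward" else -size
```

For example, a forward stream that has issued four 4 KB requests would get a 16 KB read-ahead under these assumptions, while a brand-new stream gets none.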
Cache management is significantly improved by these algorithms, which determine when to
run read-ahead operations and how long each read stream's data is retained in cache.
Chapter 11. Core technologies