Dell DR4000 Administrator's Manual page 13

Administrator guide

Block-level deduplication works efficiently where there are multiple duplicate
versions of the same file. This is because it examines the actual sequence of
0s and 1s that makes up the data.
Whenever a document is backed up repeatedly, the 0s and 1s stay the same
because the file is simply being duplicated. Block deduplication can easily
identify the similarities between two such files because their sequences of
0s and 1s remain exactly the same. Online data is different.
Online data has few exact duplicates. Instead, online data consists of files
that may share many similarities with one another. In addition, a majority of
the files that contribute to increased data storage requirements come
pre-compressed by their native applications, such as:
• Images and video (such as the JPEG, MPEG, TIFF, GIF, and PNG formats)
• Compound documents (such as .zip files, e-mail, HTML, web pages, and PDFs)
• Microsoft Office application documents (including PowerPoint, Word, Excel,
  and SharePoint)
NOTE:
The DR4000 system achieves a lower savings rate when the data it ingests has
already been compressed by the native data source. It is highly recommended
that you disable compression at the data source, especially for first-time
backups. For optimal savings, the native data sources need to send data to the
DR4000 system in a raw state for ingestion.
Block deduplication is less effective on files that are already compressed
because compression changes their 0s and 1s from the original format.
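The effect of pre-compression can be illustrated with a small experiment. The sketch below is not the DR4000 implementation: the fixed 4096-byte block size, the SHA-256 fingerprints, the use of zlib as a stand-in for an application's native compression, and all function names are assumptions chosen for illustration. It builds two versions of a file that differ only in their first 64 bytes, then measures how many blocks the versions share in raw form and in compressed form.

```python
import hashlib
import random
import zlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, chosen for illustration


def block_hashes(data: bytes) -> set:
    """Return the set of SHA-256 digests of the fixed-size blocks in data."""
    return {
        hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(data), BLOCK_SIZE)
    }


def shared_fraction(a: bytes, b: bytes) -> float:
    """Fraction of b's distinct blocks that also occur in a."""
    hashes_a, hashes_b = block_hashes(a), block_hashes(b)
    return len(hashes_a & hashes_b) / len(hashes_b)


# Build a compressible, text-like "file" from a seeded generator so the
# experiment is repeatable.
rng = random.Random(0)
words = [b"backup", b"block", b"data", b"dedupe", b"file", b"store", b"system"]
original = b" ".join(rng.choice(words) for _ in range(50_000))

# Second version: same length, only the first 64 bytes are overwritten.
noise = bytes(rng.randrange(256) for _ in range(64))
modified = noise + original[64:]

# Raw data: only the first block differs, so nearly all blocks match.
raw_shared = shared_fraction(original, modified)

# Compressed data: the small edit changes the compressed byte stream from
# that point on, so aligned blocks no longer line up.
compressed_shared = shared_fraction(zlib.compress(original),
                                    zlib.compress(modified))

print(f"shared blocks, raw:        {raw_shared:.2%}")
print(f"shared blocks, compressed: {compressed_shared:.2%}")
```

With raw input, the shared fraction should come out close to 100 percent, while the compressed versions are expected to share far fewer blocks. This is the behavior that motivates sending data to the system in a raw state.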
Data deduplication is a specialized form of data compression that eliminates
redundant data. The technique improves storage utilization, and it can also be
used in network data transfers to reduce the number of bytes that must be sent
across a link. During analysis, deduplication identifies and stores unique
chunks of data, or byte patterns.
As the analysis continues, other chunks are compared to the stored copies.
When a match occurs, the redundant chunk is replaced with a small reference
that points to the stored chunk. This reduces the amount of data that must be
stored or transferred.
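The store-and-reference process described above can be sketched in a few lines. This is a minimal illustration and not the DR4000 algorithm: the fixed 8-byte chunk size (kept tiny so the data is easy to read), the SHA-256 digests used as references, the in-memory dictionary used as the chunk store, and the names `deduplicate` and `restore` are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 8  # hypothetical and deliberately tiny, for readability


def deduplicate(data: bytes, store: dict) -> list:
    """Split data into fixed-size chunks and return a list of references.

    Each reference is the SHA-256 digest of a chunk. A chunk is written to
    `store` only the first time it is seen; a later match adds a reference
    without storing the chunk again.
    """
    refs = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()
        if digest not in store:
            store[digest] = chunk  # unique chunk: keep one stored copy
        refs.append(digest)        # always record a reference
    return refs


def restore(refs: list, store: dict) -> bytes:
    """Rebuild the original data by following each reference."""
    return b"".join(store[digest] for digest in refs)


store = {}
first_backup = b"A" * 8 + b"B" * 8 + b"A" * 8 + b"C" * 8
second_backup = b"A" * 8 + b"B" * 8 + b"A" * 8 + b"D" * 8

refs1 = deduplicate(first_backup, store)
refs2 = deduplicate(second_backup, store)

print("chunks ingested:", len(refs1) + len(refs2))
print("unique chunks stored:", len(store))
```

Both backups can be rebuilt with `restore(refs1, store)` and `restore(refs2, store)`, even though repeated chunks are held only once in the store.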
DELL CONFIDENTIAL – PRELIMINARY 1/10/12 - FOR PROOF ONLY
Understanding the DR4000 System