Thursday, May 17, 2018

DataDomain Deduplication


Data deduplication looks for redundant sequences of bytes across very large
comparison windows. Sequences of data (over 8 KB long) are compared to the
history of other such sequences. When a sequence has been stored before, the DD
references the first stored copy rather than storing it again. This process is
completely hidden from users and applications, so the whole file is readable
after it's written.
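
The idea can be illustrated with a small Python sketch. This is only a toy
model, not how the DD is actually implemented: the fixed 8 KB segment size, the
use of SHA-256 as a fingerprint, and the DedupStore class are all assumptions
made for the example.

```python
import hashlib

SEGMENT_SIZE = 8 * 1024  # assumed fixed segment size, for illustration only


class DedupStore:
    """Toy segment store (hypothetical): each unique segment is kept once."""

    def __init__(self):
        self.segments = {}  # fingerprint -> segment bytes
        self.files = {}     # file name -> ordered list of fingerprints

    def write(self, name, data):
        refs = []
        for start in range(0, len(data), SEGMENT_SIZE):
            segment = data[start:start + SEGMENT_SIZE]
            fingerprint = hashlib.sha256(segment).hexdigest()
            # Store the segment only the first time it is seen;
            # later occurrences just reference the stored copy.
            self.segments.setdefault(fingerprint, segment)
            refs.append(fingerprint)
        self.files[name] = refs

    def read(self, name):
        # Rebuild the whole file from its segment references.
        return b"".join(self.segments[fp] for fp in self.files[name])

    def stored_bytes(self):
        return sum(len(s) for s in self.segments.values())


store = DedupStore()
block_a = b"A" * SEGMENT_SIZE
block_b = b"B" * SEGMENT_SIZE
monday = block_a + block_b
tuesday = block_a + block_b + b"C" * SEGMENT_SIZE
store.write("monday.bak", monday)
store.write("tuesday.bak", tuesday)

print(len(monday) + len(tuesday))  # bytes sent: 40960
print(store.stored_bytes())        # bytes actually kept: 24576
```

Both files read back in full even though the two segments they share are stored
only once.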

/data -->   this is where your files (files from CIFS shares)
            and backups (from backup software like Veritas, SQL, etc.)
            reside; storage file system
/ddvar -->  resembles Unix file systems (log directories, etc.); administrative
            file system
/backup --> common directory for CIFS shares (used in older DD OS versions)

Data Domain is a deduplicating device, meaning all data you send to it will be
deduplicated and compressed. For example, if you send 100 TB of data, expect it
to take up much less space inside the DD. It may be 10 TB, 15 TB, or 5 TB
depending on the compression ratio.

As an example:
sysadmin@dd01# filesys show space

Active Tier:
Resource           Size GiB   Used GiB   Avail GiB   Use%   Cleanable GiB*
----------------   --------   --------   ---------   ----   --------------
/data: pre-comp           -   418498.9           -      -                -
/data: post-comp    64766.3    59730.8      5035.5    92%            244.0
/ddvar                 29.5       13.3        14.7    48%                -
----------------   --------   --------   ---------   ----   --------------
 * Estimated based on last cleaning of 2015/02/17 11:55:17.
sysadmin@dd01#

The "/data: pre-comp" line tells you the actual size of the data sent to the DD,
before deduplication and compression (418498.9 GiB here). The "/data: post-comp"
line tells you how much physical space that data occupies inside the DD after
deduplication and compression; look at the "Used GiB" column (59730.8 GiB here).
The "Size GiB" column of the post-comp line also tells you the capacity of your
DD: 64766.3 GiB, or roughly 63 TiB.
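
As a rough sketch, the numbers in the output above can be turned into a
compression ratio and a capacity figure with a few lines of Python. The
parse_space helper below is hypothetical and written only for this example; it
assumes the column layout shown above and that the sizes are in GiB
(1 TiB = 1024 GiB).

```python
SAMPLE = """\
/data: pre-comp           -   418498.9           -      -                -
/data: post-comp    64766.3    59730.8      5035.5    92%            244.0
/ddvar                 29.5       13.3        14.7    48%                -
"""


def parse_space(text):
    """Return {resource: (size_gib, used_gib)}; a '-' cell becomes None."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("/"):
            continue  # skip headers, separators, and footnotes
        # The last five fields are Size, Used, Avail, Use%, Cleanable;
        # everything before them is the resource name.
        name = " ".join(fields[:-5])
        size, used = fields[-5], fields[-4]
        rows[name] = tuple(None if v == "-" else float(v) for v in (size, used))
    return rows


rows = parse_space(SAMPLE)
pre_used = rows["/data: pre-comp"][1]
post_size, post_used = rows["/data: post-comp"]

ratio = pre_used / post_used    # data sent / space actually used
capacity_tib = post_size / 1024  # GiB -> TiB

print(f"compression ratio: {ratio:.1f}x")    # compression ratio: 7.0x
print(f"capacity: {capacity_tib:.1f} TiB")   # capacity: 63.2 TiB
```

So on this system roughly 7 GiB of backup data fits into every 1 GiB of disk
actually used.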
