My Lazy Admin: NBU Duplication and SLP

Things to know about NetBackup duplication
------------------------------------------

- you can duplicate a backup image from the command line or the GUI

- by default, restores are done from the primary copy

- a duplication job doesn't show "KB per second" in the Java console

- from experience, a 35 GB backup took 2 hours and a 32 KB backup took 16
  minutes to duplicate to a DR facility (destination system is a Data Domain
  w/ un-aggregated links)

- Duplicating data generally takes longer than backing it up

- Duplication also consumes twice the storage-device bandwidth that backups
  consume, because a duplication job must read from one storage device and
  write to another storage device

- Duplication taxes the NetBackup resource broker (nbrb) twice as much as
  backups

- If nbrb is overtaxed, it can slow the rate at which all types of new jobs are
  able to acquire resources and begin to move data
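Since the Java console does not show a transfer rate for duplication jobs, the rate can be worked out by hand from the image size and the elapsed time. The sketch below does that for the two jobs quoted above; the function name is only illustrative, and the result is an average that ignores queueing and mount time.

```python
KB_PER_GB = 1024 * 1024


def throughput_kb_per_sec(size_kb: float, elapsed_seconds: float) -> float:
    """Average transfer rate in KB/s for a finished job."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return size_kb / elapsed_seconds


# 35 GB duplicated in 2 hours
rate_large = throughput_kb_per_sec(35 * KB_PER_GB, 2 * 60 * 60)
# 32 KB duplicated in 16 minutes
rate_small = throughput_kb_per_sec(32, 16 * 60)

print(f"35 GB in 2 h   : {rate_large:,.0f} KB/s")
print(f"32 KB in 16 min: {rate_small:.3f} KB/s")
```

The tiny image shows a near-zero rate because most of its 16 minutes was presumably spent waiting rather than moving data.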

How are duplication jobs triggered?
-----------------------------------

NetBackup starts a duplication session every five minutes to copy data from a
backup destination to a duplication destination. If a duplication job fails, the
next three duplication sessions retry the job if necessary. If the job fails all
three times, the job is retried every 24 hours until it succeeds.

Duplication occurs as soon as possible after the backup completes.
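The retry pattern described above (three quick retries on the five-minute session cadence, then one retry every 24 hours) can be laid out as a timeline. This is only a sketch: it assumes retries fall exactly on the session interval counted from the first failure, which is a simplification, and the names are made up for illustration.

```python
from datetime import datetime, timedelta

SESSION_INTERVAL = timedelta(minutes=5)     # duplication session cadence
QUICK_RETRIES = 3                           # retries in the next three sessions
SLOW_RETRY_INTERVAL = timedelta(hours=24)   # cadence once the quick retries fail


def retry_times(first_failure: datetime, count: int) -> list[datetime]:
    """Return the first `count` retry times after a failed duplication job."""
    times = []
    current = first_failure
    for attempt in range(count):
        step = SESSION_INTERVAL if attempt < QUICK_RETRIES else SLOW_RETRY_INTERVAL
        current = current + step
        times.append(current)
    return times


failed_at = datetime(2018, 5, 23, 1, 0)
for t in retry_times(failed_at, 5):
    print(t.isoformat(sep=" "))
```

For a job that first fails at 01:00, the three quick retries land at 01:05, 01:10 and 01:15, and the following retries land at 01:15 on each subsequent day.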

Concepts about backup service levels
------------------------------------

- service level is based on recovery capability

- Recovery point objective (RPO) is the most recent backup

- Recovery time objective (RTO) is the time required to recover the backup

- RTO of a given backup becomes less critical as the backup ages

- Backup data is at its most valuable immediately after the backup has been made

- Platinum service level = RPO and RTO of 1 or 2 hours --> mission-critical
  applications such as order processing systems and transaction processing
  systems

- Gold service level = RPO and RTO of 12 hours or less --> non-critical
  applications such as e-mail, CRM, and HR systems

- Silver service level = RPO and RTO of 1 or 2 days --> non-critical
  applications such as user file and print data, relatively static data

- high-cost storage devices are disks, SSDs, etc.

- low-cost storage devices are tapes, virtual tape libraries, etc.
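The three tiers above can be read as a lookup from the tightest recovery requirement to a service level. The sketch below takes the upper bound of each tier (2, 12 and 48 hours) as the cut-off. Both the cut-offs chosen this way and the `pick_level` helper are assumptions for illustration, not anything NetBackup itself provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceLevel:
    name: str
    max_hours: float  # upper bound for both RPO and RTO, in hours


# Ordered from strictest to most relaxed.
LEVELS = [
    ServiceLevel("Platinum", 2),
    ServiceLevel("Gold", 12),
    ServiceLevel("Silver", 48),
]


def pick_level(rpo_hours: float, rto_hours: float) -> Optional[str]:
    """Return the first tier whose bound covers the tighter of the two objectives."""
    tightest = min(rpo_hours, rto_hours)
    for level in LEVELS:
        if tightest <= level.max_hours:
            return level.name
    return None  # outside the three tiers listed


print(pick_level(1, 2))
print(pick_level(12, 8))
print(pick_level(48, 24))
```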

Things to know about NetBackup Storage Lifecycle Policy (SLP)
-------------------------------------------------------------

- it was introduced in NBU 6.5

- a Storage Lifecycle Policy is a plan or map of where backup data will be
  stored and for how long

- it automates the duplication process and determines how long the backup data
  will reside in each location that it is duplicated to

- when a storage plan changes (e.g., if a new regulation is imposed on your
  business requiring changes to retention periods or the number of copies
  created), you simply need to change a small number of Storage Lifecycle
  Policies, and all associated backups will take the changes into account
  automatically

- after the original backup completes, the Storage Lifecycle Policy process
  creates copies of the image, retrying as necessary until all required copies
  are successfully created

- a backup policy may use one or more SLPs; in practice it is likely to have
  two or three, covering different types of backup (e.g., one for the Daily
  Incr schedule, another for Weekly Full, and another for Monthly Full)

- SLP scheduling is built in as of NBU 7.6

SLP Operations
--------------

- duplication jobs will start as soon as the backup completes (backup then
  duplication)

- by default, SLP checks every 5 minutes for backup images that have recently
  completed and require duplication jobs

- SLP groups batches of similar images together for each duplication job, to
  optimize the performance of duplication (when there is enough data, 8 GB by
  default, to warrant a duplication job, duplication is started)
  -> as an example, see "first_duplication_batch_job.jpg"

- the default settings of 5 minutes and 8 GB can be changed by setting values
  in the /usr/openv/netbackup/db/config/LIFECYCLE_PARAMETERS file

- if a duplication job fails to make a copy of an image, that image will be
  added to a subsequent batch of images to be duplicated with the next
  five-minute sweep of images that need to be copied (this is done 3 times for
  a single image)

- after three failures, the SLP will wait two hours (by default) before trying
  to create that copy of that image again (this retry will continue once every
  two hours (by default) until either the user intervenes or the time of the
  longest retention specified for the image comes to pass)

- existing copies will not be deleted as long as at least one copy has failed
  to duplicate

- in practice, I notice that SLP duplication starts 30 minutes after a daily
  incremental finishes (for both triggered and scheduled backups)
  -> the reason is that we don't have a
     /usr/openv/netbackup/db/config/LIFECYCLE_PARAMETERS file on our master
     server
  -> so SLP uses the default value for
     MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB, which is 30 minutes
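The 30-minute delay on small incrementals follows from two triggers working together: a batch is duplicated once it reaches the minimum size, or once its oldest image has waited long enough to force a small job. Below is a rough model of that decision at each five-minute sweep. It is my own sketch of the behaviour described in these notes, not NetBackup code, and `PendingImage` and `should_start_duplication` are hypothetical names.

```python
from dataclasses import dataclass

MIN_BATCH_KB = 8 * 1024 * 1024   # 8 GB minimum batch, expressed in KB
FORCE_SMALL_JOB_MINUTES = 30     # default for MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB


@dataclass
class PendingImage:
    backup_id: str        # hypothetical identifier, for illustration only
    size_kb: int
    minutes_waiting: int  # time since the backup completed


def should_start_duplication(batch: list[PendingImage]) -> bool:
    """Decide at a sweep whether the pending batch warrants a duplication job."""
    if not batch:
        return False
    total_kb = sum(img.size_kb for img in batch)
    oldest = max(img.minutes_waiting for img in batch)
    return total_kb >= MIN_BATCH_KB or oldest >= FORCE_SMALL_JOB_MINUTES


small = [PendingImage("client1_0001", 500 * 1024, 10)]
print(should_start_duplication(small))   # False: under 8 GB and under 30 minutes

small[0].minutes_waiting = 30
print(should_start_duplication(small))   # True: forced as a small job

big = [PendingImage("client2_0001", 9 * 1024 * 1024, 5)]
print(should_start_duplication(big))     # True: batch already exceeds 8 GB
```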

Considerations in setting up Storage Lifecycle Policy (SLP)
-----------------------------------------------------------

1.) It is important to remember that this is not a hierarchical model; the data
    is duplicated at the first possible opportunity and occupies all the
    storage locations simultaneously.

2.) In most cases the primary (first) Backup Storage Destination will be a
    high-speed storage device that allows fast restores.

3.) It is not possible to specify the use of the Media Server Encryption Option
    on specific Storage Destinations within a Storage Lifecycle Policy.

4.) A storage destination within a Storage Lifecycle Policy may use either a
    specific Storage Unit or a Storage Unit Group.

5.) Keep the previous point in mind when defining Duplication Storage
    Destinations, as poor design may lead to excessive network traffic and
    other resource contention.

6.) The “Alternate Read Server” setting for a storage destination applies on the
    source destination, not the target destination. This means that the only
    Storage Destination on which the “Alternate Read Server” setting has any
    effect is the first Backup Destination (as this is the source used for all
    duplication).

Setup/Configuration
-------------------

The LIFECYCLE_PARAMETERS file:

/usr/openv/netbackup/db/config/LIFECYCLE_PARAMETERS

MIN_KB_SIZE_PER_DUPLICATION_JOB
This is the size of the minimum duplication batch (default 8 GB).

MAX_KB_SIZE_PER_DUPLICATION_JOB
This is the size of the maximum duplication batch (default 25 GB).

MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB
This represents the time interval between forcing duplication sessions for
small batches (default 30 minutes).

IMAGE_EXTENDED_RETRY_PERIOD_IN_HOURS
After duplication of an image fails three times, this is the time interval
between subsequent retries (default 2 hours).

DUPLICATION_SESSION_INTERVAL_MINUTES
This is how often the Storage Lifecycle Policy service (nbstserv) checks
whether it is time to start new duplication jobs (default 5 minutes).

- if this file does not exist, the default values will be used

- not all parameters are required in the file, and there is no order dependency
  in the file

- any parameters omitted from the file will use default values
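The rules above (a missing file means defaults, parameters may appear in any order, omitted parameters fall back to defaults) amount to overlaying the file on a table of defaults. The sketch below shows that overlay for a file with one "NAME VALUE" pair per line. The parser is my own illustration rather than how nbstserv reads the file, and the size defaults are simply the numbers listed in this post.

```python
DEFAULTS = {
    # Size values copied as listed in this post; verify them for your release.
    "MIN_KB_SIZE_PER_DUPLICATION_JOB": 8192,
    "MAX_KB_SIZE_PER_DUPLICATION_JOB": 25600,
    "MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB": 30,
    "IMAGE_EXTENDED_RETRY_PERIOD_IN_HOURS": 2,
    "DUPLICATION_SESSION_INTERVAL_MINUTES": 5,
}


def parse_lifecycle_parameters(text: str) -> dict[str, int]:
    """Overlay 'NAME VALUE' lines on the defaults; order does not matter."""
    params = dict(DEFAULTS)
    for raw in text.splitlines():
        parts = raw.split()
        if len(parts) != 2 or parts[0] not in DEFAULTS:
            continue  # skip blank or unrecognized lines
        params[parts[0]] = int(parts[1])  # raises ValueError on a non-numeric value
    return params


# Only two parameters are set; the other three keep their defaults.
sample = """
DUPLICATION_SESSION_INTERVAL_MINUTES 10
MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB 15
"""
for name, value in parse_lifecycle_parameters(sample).items():
    print(name, value)
```

Passing an empty string models the "file does not exist" case and returns the defaults unchanged.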

The syntax of the LIFECYCLE_PARAMETERS file, using default values, is as
follows:

MIN_KB_SIZE_PER_DUPLICATION_JOB 8192
MAX_KB_SIZE_PER_DUPLICATION_JOB 25600
MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB 30
IMAGE_EXTENDED_RETRY_PERIOD_IN_HOURS 2
DUPLICATION_SESSION_INTERVAL_MINUTES 5
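One thing worth double-checking before editing the size parameters: their names say KB, while the defaults are described as 8 GB and 25 GB. The arithmetic below is just unit conversion, not a statement about how NetBackup interprets the numbers. Read strictly as KB, 8192 and 25600 would be 8 MB and 25 MB, so confirm the expected unit in the documentation for your release.

```python
KB_PER_MB = 1024
KB_PER_GB = 1024 * 1024


def gb_to_kb(gigabytes: int) -> int:
    """Convert gigabytes to kilobytes (binary units)."""
    return gigabytes * KB_PER_GB


def kb_to_mb(kilobytes: int) -> float:
    """Convert kilobytes to megabytes (binary units)."""
    return kilobytes / KB_PER_MB


print(gb_to_kb(8))      # 8388608
print(gb_to_kb(25))     # 26214400
print(kb_to_mb(8192))   # 8.0
print(kb_to_mb(25600))  # 25.0
```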


Wednesday, May 23, 2018
