Things to know about NetBackup duplication
------------------------------------------
- you can duplicate a backup image from the command line or the GUI (see the
command-line sketch after this list)
- by default, restores are done from the primary copy
- duplication jobs don't show "KB per second" in the Java console
- from experience, a 35 GB backup took 2 hours and a 32 KB backup took 16
minutes to duplicate to a DR facility (destination system is a Data Domain w/
un-aggregated links)
- duplicating data generally takes longer than backing it up
- duplication also consumes twice the storage-device bandwidth that backups
consume, because a duplication job must read from one storage device and write
to another
- duplication taxes the NetBackup resource broker (nbrb) twice as much as
backups do
- if nbrb is overtaxed, it can slow the rate at which all types of new jobs
acquire resources and begin to move data
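As a rough sketch of duplicating from the command line: the client name, backup
ID, and storage unit names below are made up, and the exact options should be
checked against the NetBackup Commands Reference for your version.

# find the backup ID of a recent image for a client
/usr/openv/netbackup/bin/admincmd/bpimagelist -client client01 -hoursago 24

# duplicate that image to another storage unit
/usr/openv/netbackup/bin/admincmd/bpduplicate -backupid client01_1234567890 \
  -dstunit dr_dd_stu

# optionally promote the new copy (copy 2 here) to primary so restores use it
/usr/openv/netbackup/bin/admincmd/bpduplicate -npc 2 -backupid client01_1234567890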
How are duplication jobs triggered?
-----------------------------------
NetBackup starts a duplication session every five minutes to copy data from a
backup destination to a duplication destination. If a duplication job fails, the
next three duplication sessions retry the job if necessary. If the job fails all
three times, the job is retried every 24 hours until it succeeds.
Duplication occurs as soon as possible after the backup completes.
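To see which images are still waiting on duplication (and therefore subject to
these retry sessions), one option is nbstlutil; a minimal sketch, assuming it
is run on the master server (option names may differ slightly between
versions):

# list SLP-managed images whose required copies are not yet complete
/usr/openv/netbackup/bin/admincmd/nbstlutil stlilist -image_incomplete -U

# summary of the SLP duplication backlog (available in later 7.x versions)
/usr/openv/netbackup/bin/admincmd/nbstlutil report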
Concepts about backup service levels
------------------------------------
- service level is based on recovery capability
- Recovery point objective (RPO) is determined by the most recent backup (i.e.,
how much data you can afford to lose)
- Recovery time objective (RTO) is the time required to recover the backup
- RTO of a given backup becomes less critical as the backup ages
- Backup data is at its most valuable immediately after the backup has been made
- Platinum service level = RPO and RTO of 1 or 2 hours --> mission critical
applications such as order processing systems and transaction processing
systems
- Gold service level = RPO and RTO of 12 hours or less --> less critical
applications such as e-mail, CRM, and HR systems
- Silver service level = RPO and RTO of 1 or 2 days --> non-critical
applications such as user file and print data, relatively static data
- high-cost storage devices include disk arrays, SSDs, etc.
- low-cost storage devices include tape, virtual tape libraries, etc.
Things to know about NetBackup Storage Lifecycle Policy (SLP)
-------------------------------------------------------------
- SLPs were introduced in NBU 6.5
- a Storage Lifecycle Policy is a plan or map of where backup data will be
stored and for how long (a command-line sketch for creating one follows this
list)
- it automates the duplication process and determines how long the backup data
will reside in each location it is duplicated to
- when a storage plan changes (e.g., if a new regulation is imposed on your
business requiring changes to retention periods or the number of copies
created), you simply need to change a small number of Storage Lifecycle
Policies, and all associated backups will take the changes into account
automatically
- after the original backup completes, the Storage Lifecycle Policy process
creates copies of the image, retrying as necessary until all required copies
are successfully created
- in practice, a Backup Policy will likely have two or three Storage Lifecycle
Policies covering its different types of backup (e.g., one for the daily
incremental schedule, one for the weekly full, and one for the monthly full)
- SLP scheduling is built in as of NBU 7.6
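A hedged sketch of working with SLPs from the command line; the policy and
storage unit names are made up, and the nbstl options and retention-level
numbers shown should be verified against the Commands Reference for your
version.

# list existing Storage Lifecycle Policies in long format
/usr/openv/netbackup/bin/admincmd/nbstl -L

# create an SLP with a backup destination (disk_stu) and a duplication
# destination (dd_dr_stu); -uf 0,1 marks them as backup and duplication,
# and -rl sets the retention level index used for each destination
/usr/openv/netbackup/bin/admincmd/nbstl daily_incr_slp -add \
  -residence disk_stu,dd_dr_stu \
  -uf 0,1 \
  -rl 3,6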
SLP Operations
--------------
- duplication jobs will start as soon as the backup completes (backup then
duplication)
- by default, SLP checks every 5 minutes for backup images that have recently
completed and require duplication jobs
- SLP groups batches of similar images together for each duplication job, to
optimize the performance of duplication (when there is enough data, 8 GB by
default, to warrant a duplication job, duplication is started)
-> as an example, see "first_duplication_batch_job.jpg"
- the default settings of 5 minutes and 8 GB can be changed by setting values
in the /usr/openv/netbackup/db/config/LIFECYCLE_PARAMETERS file
- if a duplication job fails to make a copy of an image, that image will be
added to a subsequent batch of images to be duplicated with the next
five-minute sweep of images that need to be copied (this is done 3 times for
a single image)
- after three failures, the SLP will wait two hours (by default) before trying
to create that copy of that image again (this retry will continue once every
two hours (by default) until either the user intervenes or the time of the
longest retention specified for the image comes to pass)
- copies of an image are not expired (deleted) while at least one required copy
has failed to duplicate; all copies must be created successfully first
- in practice, I've noticed that SLP duplication starts about 30 minutes after
a daily incremental finishes (for both manually triggered and scheduled
backups)
-> the reason is that we don't have a
/usr/openv/netbackup/db/config/LIFECYCLE_PARAMETERS file on our master
server
-> so SLP uses the default value of
MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB, which is 30 minutes (see the
sketch after this list for overriding it)
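A minimal sketch of overriding that behavior by creating the
LIFECYCLE_PARAMETERS file on the master server; the values are examples only,
not recommendations:

# any parameter not listed in the file keeps its default value
cat > /usr/openv/netbackup/db/config/LIFECYCLE_PARAMETERS <<'EOF'
MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB 10
DUPLICATION_SESSION_INTERVAL_MINUTES 5
EOF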
Considerations in setting up Storage Lifecycle Policy (SLP)
-----------------------------------------------------------
1.) It is important to remember that this is not a hierarchical model; the
backup image is duplicated to each destination at the first possible
opportunity and occupies all the storage locations simultaneously.
2.) In most cases the primary (first) Backup Storage Destination will be a
high-speed storage device that allows fast restores.
3.) It is not possible to specify the use of the Media Server Encryption Option
on specific Storage Destinations within a Storage Lifecycle Policy.
4.) A storage destination within a Storage Lifecycle Policy may use either a
specific Storage Unit or a Storage Unit Group.
5.) Keep the previous point in mind when defining Duplication Storage
Destinations, as poor design may lead to excessive network traffic and other
resource contention (see the sketch after this list).
6.) The “Alternate Read Server” setting for a storage destination applies to
the source of a duplication, not the target. This means that the only Storage
Destination on which the “Alternate Read Server” setting has any effect is the
first Backup Destination (as this is the source used for all duplications).
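To check which media server owns each storage unit referenced by a lifecycle
(relevant to points 4, 5, and 6), one starting point is bpstulist; output
format varies by version:

# list all configured storage units in long format, including the media
# server that hosts each one
/usr/openv/netbackup/bin/admincmd/bpstulist -L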
Setup/Configuration
-------------------
The LIFECYCLE_PARAMETERS file:
/usr/openv/netbackup/db/config/LIFECYCLE_PARAMETERS
MIN_KB_SIZE_PER_DUPLICATION_JOB
This is the size of the minimum duplication batch (default 8 GB).
MAX_KB_SIZE_PER_DUPLICATION_JOB
This is the size of the maximum duplication batch (default 25 GB).
MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB
This represents the time interval between forcing duplication sessions for
small batches (default 30 minutes).
IMAGE_EXTENDED_RETRY_PERIOD_IN_HOURS
After duplication of an image fails three times, this is the time interval
between subsequent retries (default 2 hours).
DUPLICATION_SESSION_INTERVAL_MINUTES
This is how often the Storage Lifecycle Policy service (nbstserv) checks
whether it is time to start new duplication jobs (default 5 minutes).
- if this file does not exist, the default values will be used
- not all parameters are required in the file, and there is no order dependency
in the file
- any parameters omitted from the file will use default values
The syntax of the LIFECYCLE_PARAMETERS file, using default values, is as
follows:
MIN_KB_SIZE_PER_DUPLICATION_JOB 8192
MAX_KB_SIZE_PER_DUPLICATION_JOB 25600
MAX_MINUTES_TIL_FORCE_SMALL_DUPLICATION_JOB 30
IMAGE_EXTENDED_RETRY_PERIOD_IN_HOURS 2
DUPLICATION_SESSION_INTERVAL_MINUTES 5