Backup is not Archive

by Joseph Ortiz, Senior Analyst

In order to protect their data while dealing with explosive data growth, many organizations
have started backing up their data to the cloud in an effort to reduce their storage and data
center costs and to obtain data redundancy without the need to maintain a separate
physical DR site. Many also mistakenly believe that these additional backup copies qualify
as archive copies. Unfortunately, they do not.

Backup vs. Archive

As we discussed in our article, What Is Archive Anyway?, there is a big difference between a
backup copy and an archive copy.

A backup is the recurring, systematic copying of active data, which is being frequently
accessed and modified, in order to preserve its current content. These backup copies are
made at regularly scheduled intervals so that this active data can be restored in the event of
a system failure, file overwrite or file deletion, whether deliberate or accidental. This active
data is usually stored on tier one storage, which consists of flash and/or high-speed disk.
The backups are also usually kept on tier two storage, which is cost-effective but performs
reasonably well, to ensure rapid response to restore requests or to provide acceptable
performance if the backup file or snapshot is mounted in a VM (virtual machine).

An archive, however, is a static copy of groups of older, inactive data that is not
needed for daily operations. This inactive or cold data is not modified and is accessed only
occasionally for historical reference, or not at all. Typically, data is considered cold if
it has not been accessed or modified in over 90 days. The challenge with the archive data
set is that no one knows which component of it will be accessed or when that access will
occur. While the response to a request from an archive set does not need to be
instantaneous, it does need to be reasonably prompt.
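
The 90-day rule of thumb above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular archive product works: the function names `is_cold` and `find_cold_files` are hypothetical, the threshold is a parameter rather than a fixed policy, and the file system's access times are assumed to be trustworthy (volumes mounted with options such as `noatime` do not update them).

```python
import os
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86_400


def is_cold(last_access: float, last_modified: float, now: float,
            threshold_days: int = 90) -> bool:
    """Return True if the data was neither accessed nor modified
    for more than `threshold_days` (hypothetical helper)."""
    cutoff = now - threshold_days * SECONDS_PER_DAY
    # The most recent of the two timestamps decides whether the data is active.
    return max(last_access, last_modified) < cutoff


def find_cold_files(root: str, now: Optional[float] = None,
                    threshold_days: int = 90) -> List[Path]:
    """Walk `root` and list the files that qualify as cold data."""
    now = time.time() if now is None else now
    cold = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            info = path.stat()  # reading metadata does not change the access time
            if is_cold(info.st_atime, info.st_mtime, now, threshold_days):
                cold.append(path)
    return sorted(cold)
```

Calling `find_cold_files("/data/projects")` (a placeholder path) would return the candidates for archiving. A real policy would usually be set per business unit, as the article goes on to describe.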

The archive copy is normally created only when the data has not been modified or
accessed for a specific period of time that is defined by the administrator or business unit
manager. These archive copies are stored indefinitely on less expensive media that is not in
the backup path. There are two components to the archive process. The first is typically
one or more applications that identify and move data from the active tier to the archive tier.
The second is the set of storage hardware targets, typically some combination of disk, tape
and cloud.
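
The "move" half of that process can be pictured with a small Python sketch. It assumes, purely for illustration, that the archive tier is reachable as an ordinary directory; real targets such as tape or cloud storage sit behind their own software. The function name `archive_files` and its parameters are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def archive_files(paths: Iterable[Path], active_root: Path,
                  archive_root: Path) -> List[Path]:
    """Move already-identified cold files from the active tier to the
    archive tier, preserving their layout relative to `active_root`."""
    active_root = Path(active_root).resolve()
    archive_root = Path(archive_root)
    moved = []
    for src in paths:
        # Keep the same relative location so the file can be found later.
        relative = Path(src).resolve().relative_to(active_root)
        dest = archive_root / relative
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

The sketch moves files outright, so the space on the active tier is released immediately. Whether a pointer or stub is left behind for users is a product decision that this example does not model.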

The combination of multiple archive sources and multiple archive targets has historically
made the process of tiering data to less expensive storage both expensive and extremely
complex. In an effort to avoid this complexity, data centers have instead continued to
expand production storage and to count on backup to provide an archive-like function. The
result has been an even more expensive alternative.

Appliances are now appearing on the market to consolidate the archive mess by integrating
the archive software and abstracting the management of multiple archive components. The
result is less expensive primary storage and a simpler data protection process.

How Archive Complements Backup

If you closely examine the differences between backup and archive functions, it becomes
readily apparent that archive is actually an integral part of the backup process with each
function protecting different types of data.

Backup protects active data while archive protects cold data. However, archive goes
beyond just protecting cold data. It also enhances and simplifies the backup process and
frees up expensive primary storage.

Consider some basic demands of the backup process itself:

• Time to back up target data, both active and inactive
• Amount of primary storage required to store the backup copies and snapshots
• Amount of primary storage required for the backup application database that tracks all
files it backs up and media it manages

A good archive solution will automatically identify and migrate cold data from expensive
primary storage to less expensive secondary or tertiary storage tiers, like tape. The archive
process yields the following benefits:

• Reduces the amount of data that needs to be backed up
• Reduces the time needed to perform backup operations
• Reduces the size of the backup application database since it has to track fewer files
• Reduces the size of the backup storage repository
• Frees up expensive primary storage, which avoids the necessity for purchasing
additional storage or the need for IT personnel to manage additional disk or appliances
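
To make the first two benefits concrete, the following Python sketch does the back-of-the-envelope arithmetic. Every figure in it is an assumption chosen for illustration and none comes from the article: the function name `estimate_backup_savings`, the steady backup throughput, and the simplification that a full backup takes time proportional to the data volume.

```python
from typing import Dict


def estimate_backup_savings(total_tb: float, cold_fraction: float,
                            throughput_tb_per_hour: float) -> Dict[str, float]:
    """Estimate how archiving cold data shrinks a full backup.

    Assumes backup time scales linearly with data volume at a fixed
    throughput, which is a simplification.
    """
    if not 0.0 <= cold_fraction <= 1.0:
        raise ValueError("cold_fraction must be between 0 and 1")
    if throughput_tb_per_hour <= 0:
        raise ValueError("throughput_tb_per_hour must be positive")

    cold_tb = total_tb * cold_fraction   # data moved to the archive tier
    active_tb = total_tb - cold_tb       # data that still needs backing up
    return {
        "active_tb": active_tb,
        "primary_storage_freed_tb": cold_tb,
        "backup_hours_before": total_tb / throughput_tb_per_hour,
        "backup_hours_after": active_tb / throughput_tb_per_hour,
    }


# Hypothetical example: 100 TB on primary storage, 60% of it cold,
# and a backup system that sustains 5 TB per hour.
estimate = estimate_backup_savings(100.0, 0.6, 5.0)
```

Under these made-up numbers the full backup shrinks in direct proportion to the share of data that is archived. Actual savings depend on incremental backup schemes, deduplication and how often cold data is recalled.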

As storage capacity demands continue to grow uncontrollably, the current strategy of
scaling primary storage and leveraging backup as an archive becomes untenable. The cost
benefits of a good archive solution should not be underestimated. All these facts make it
clear that archive should be an integral part of a comprehensive storage strategy that
includes a small but high-performing active storage tier, a slightly larger data protection
tier with modest performance, and a high-capacity, cost-effective archive tier.

Sponsored by Fujifilm Dternity, Powered by StrongBox