HP VLS Solutions Guide Design Guidelines for Virtual Library Systems with Deduplication and Replication Abstract This document describes the HP Virtual Library System and its concepts including automigration, deduplication, and replication, to help you define and implement your virtual tape library system. It includes best practices for working with specific backup applications.
© Copyright 2005, 2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
Contents 1 Introduction....................8 2 Concepts....................9 Disk-based Backup and Virtual Tape Libraries................9 Problems Addressed by Virtual Tape Libraries................9 Integration of Disk in Data Protection Processes................9 Where Virtual Tape Fits in the Big Picture................9 HP VLS and D2D Portfolio....................10 Typical VLS Environments....................11 Typical D2D Environments....................11 What are the Alternatives?....................12 Physical Tape........................12...
Copy to Tape using VLS Automigration.................33 Benefits of Echo Copy....................34 Considerations of Echo Copy..................34 Copy to Remote Disk Backup Device using Replication............35 Benefits of Replication....................35 Considerations of Replication..................35 Creating Archive Tapes from the Replication Target..............35 Considerations for Restores......................36 Restoring from Disk Backup Device..................36 Restoring from Backup Application-created Tape Copy............36 Restoring from the Replication Target..................36 Performance Bottleneck Identification..................37...
Combining D2D and VLS in an End-to-end Solution for ROBO/Regional DCs and Main Data Center using HP Data Protector....................77 5 Automigration..................79 Echo Copy Concepts.......................79 Implementation........................79 Echo Copy Pools........................79 Automigration Policy......................80 Automigration Setup......................81 Design Considerations......................83 Automigration Use Models....................83 Backup to Non-shared Virtual Libraries................83 Backup to a Shared Virtual Library..................84 Sizing the Tape Library.......................85 Restoring from Automigration Media..................85...
Device Sizing........................120 VLS Replication Data Recovery Options...................120 Restore Directly from the VLS Target Device................120 Restore the VLS over the LAN/WAN..................122 Rebuilding Source Device and Re-establishing Replication.............123 Rebuilding Target Device and Re-establishing Replication............123 Creating Archive Tapes from the Target................124 ISV Import Email Format....................125 Email Processing Example Script...................125 8 Non-deduplicated Replication..............127 Replicating a Subset of Virtual Cartridges - a Use Case.............128...
Document Conventions and Symbols..................155 Glossary....................157 Index.......................158
1 Introduction Welcome to virtual tape libraries. This guide describes the HP Virtual Library System and its concepts including automigration, deduplication, and replication, to help you define and implement your virtual tape library system. It includes best practices for working with specific backup applications. Although every user environment and every user’s goals are different, there are basic considerations that can help you use the VLS effectively in your environment.
2 Concepts Disk-based Backup and Virtual Tape Libraries Problems Addressed by Virtual Tape Libraries You can optimize your backup environment with VLS and D2D if you are: Not meeting backup windows due to slow servers. Not consistently streaming your tape drives. Dealing with restore problems caused by interleaving.
Figure 1 Common Backup Technologies See “What are the Alternatives?” for more discussion of the other potential players in your backup environment. HP VLS and D2D Portfolio HP offers a wide range of disk-based backup products to help organizations meet their data protection challenges.
Figure 2 HP Virtual Tape Library Product Range Typical VLS Environments In a typical enterprise backup environment, there are multiple application servers backing up data to a shared tape library on the SAN. Each application server contains a remote backup agent that sends the data from the application server over the SAN fabric to a tape drive in the tape library.
for a longer retention time on disk without needing significantly higher disk capacities, and the deduplication-enabled replication allows cost-effective off-site copying of the backups for disaster protection. What are the Alternatives? Alternatives to virtual tape solutions include: Physical Tape (network attached storage) Application-based Disk Backup (disk to disk, backup to disk, disk to disk to tape) Business Copy...
Figure 3 Basic Write-to-disk Setup
Table 1 VLS Compared to Application-based Write-to-disk
Setup and management complexity. Virtual tape devices: Sets up just like a physical tape library. Write-to-disk: Requires configuration of RAID groups, LUNs, volumes, and file systems.
Data compression. Virtual tape devices: Software or hardware enabled (software ... Write-to-disk: No device-side data compression available.
Deduplication Introduction In recent years, the amount of data that companies produce has been steadily increasing. To comply with government regulations, or simply for disaster recovery and archival purposes, companies must retain more and more data. Consequently, the costs associated with data storage –...
HP Accelerated deduplication and HP Dynamic deduplication are designed to meet different needs, as shown in Table 2 (page 15). Table 2 HP Deduplication Solutions HP Accelerated deduplication HP Dynamic deduplication Intended for enterprise users. Intended for mid-sized enterprise and remote office users.
Table 3 1 TB File Server Backup
1st daily full backup: 500 GB stored normally, 500 GB stored with deduplication
1st daily incremental backup: 50 GB stored normally, 5 GB stored with deduplication
2nd daily incremental backup: 50 GB stored normally, 5 GB stored with deduplication
3rd daily incremental backup: 50 GB stored normally, 5 GB stored with deduplication
4th daily incremental backup...
compression, you must create 100 TB of virtual tape capacity to hold the four weeks of backup data. Given deduplication across the four weeks of backup versions, the amount of physical disk required for this 100 TB of virtual tape would be significantly less. NOTE: Do not create too much virtual tape capacity or your backup application may be set to prefer to use blank tapes instead of recycling older tapes.
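The effect of deduplication on physical disk usage can be illustrated with a minimal sketch that uses only the rows of Table 3 shown in this guide (one full backup and three daily incrementals). The function name and data layout are hypothetical and exist only for illustration; real ratios depend on the data and on how many backup versions are retained.

```python
# Illustrative sketch only. The figures come from Table 3; the function
# name and the (logical, stored) tuple layout are assumptions.

def dedup_summary(backups):
    """backups: list of (logical_gb, stored_gb) pairs, one pair per backup."""
    logical = sum(logical_gb for logical_gb, _ in backups)
    stored = sum(stored_gb for _, stored_gb in backups)
    return logical, stored, logical / stored

# One full backup followed by three daily incrementals, as listed in Table 3.
table3 = [(500, 500), (50, 5), (50, 5), (50, 5)]

logical_gb, stored_gb, ratio = dedup_summary(table3)
print(f"logical {logical_gb} GB, stored {stored_gb} GB, ratio {ratio:.2f}:1")
```

Extending the list with further backup versions shows how the gap between virtual tape capacity and physical disk capacity changes over the retention period.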
physical tapes has its downsides: a high level of manual intervention, tape tracking requirements, etc. The physical transfer of tapes off-site is not very automated. In addition, one of the pain points for many companies large and small is protecting data in remote offices.
Figure 7 Remote Site Data Protection Using Replication Deduplication is the key technology enabler for replication on HP VLS and D2D systems. (VLS systems use HP Accelerated deduplication, and D2D systems use Dynamic deduplication.) The same technology that allows duplicate data to be detected and stored only once on the HP VLS or D2D system also allows only the unique data to replicate between sites.
NOTE: T1/T3 and OC12 are old terms with respect to WAN link terminology. Many link providers use their own names (for example, IP Clear and Etherflow). This document distinguishes them by their speed using 2 Mbits/sec, 50 Mbits/sec, etc. One consideration with replication is that you must “initialize” the Virtual Tape Libraries with data prior to starting the replication.
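To see why initialization matters, it can help to estimate how long a first full copy would take over the link speeds named above. The following is a rough sketch, assuming decimal units and ignoring protocol overhead, compression, and deduplication; the 500 GB data size is a hypothetical value chosen for illustration, and the function name is not part of any HP tool.

```python
# Rough estimate only: assumes the full link rate is available and
# ignores protocol overhead, compression, and deduplication.

def transfer_time_hours(data_gb, link_mbit_per_s):
    """Hours needed to move data_gb gigabytes over a link of link_mbit_per_s."""
    bits = data_gb * 1e9 * 8               # gigabytes -> bits (decimal units)
    seconds = bits / (link_mbit_per_s * 1e6)
    return seconds / 3600

# Hypothetical 500 GB initial data set over the two example link speeds.
for speed in (2, 50):
    print(f"{speed} Mbits/sec: {transfer_time_hours(500, speed):.1f} hours")
```

Under these assumptions the slower link needs weeks rather than hours, which is consistent with treating initialization as a separate planning step.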
Figure 8 Replication Configuration Options Active-Passive: The best deployment for a single site disaster recovery protection. The active source device receives the local backup data and then replicates it to a passive target device at the disaster recovery site dedicated to receiving replication data. Many-to-one: The best deployment for several smaller sites consolidating their replication to a central disaster recovery site.
In this example, it costs less to use active-active because it adds two replication LTUs but saves the hardware/power/footprint cost of a second rack and the cost of a second VLS connectivity kit. However, if your backups required more than half of the maximum device performance (for example, more than two nodes out of a maximum configuration of four nodes), you may have to deploy two devices per site.
NOTE: With HP Data Protector, if you have a cell server in each site that can share library devices across sites through a MoM/CMMDB, you still need to ensure that each cell server only sees its local virtual library (the source cell server must not be configured to see the target virtual library and vice-versa).
3 Backup Solution Design Considerations This section uses use models to explore many of the concepts you must consider when designing your system. Use models are organizational schemes that give you a basic framework for delineating your environment and visualizing how to implement the VLS for best results.
Figure 10 Backup to the VLS in a LAN/SAN Hybrid Environment This includes servers that contain lots of small files that are likely to slow down backup performance on the host (such as those found on Windows file servers and web servers), blade servers, etc.
List your backup applications. In a heterogeneous environment, you can have multiple applications writing to different virtual libraries on the same VLS (which is highly flexible with regard to the libraries that can be configured). This is similar to library partitioning but more flexible and easier to set up and maintain.
Figure 11 Backup to VLS in a Simple Deployment (VLS9000–series with One Shared Library Shown) Benefits of Single Library Systems This use model is easy to manage. It is easy to copy through the backup application because you already have shared devices and they all see and are seen by a copy engine or server.
Multiple Library In this model, multiple hosts map to discrete libraries in a one-to-one or some-to-one configuration. Figure 12 Backup to VLS in a Simple Deployment (VLS9000–series with Four Dedicated Libraries Shown) Benefits of Multiple Library Systems The assignment of drives and media in this configuration is more individualized. This model is helpful when you need to control the organization and grouping of media.
Multiplexing, Multistreaming, and Multipathing This section explains the concepts of multiplexing, multistreaming, and multipathing, and notes which of these technologies are useful for disk-based backup devices and which are not recommended. Multiplexing This technology (sometimes called interleaving) is where multiple backup streams (from multiple servers) are concurrently mixed together into a single tape drive (and thus a single tape cartridge).
but the remaining half of the drives in the library can still be used and can still access every virtual cartridge in the library. This also implies that a SAN fabric failure halves the available Fibre Channel performance, so if full backup performance is required even after a SAN fabric failure, the VLS must be configured with double the number of Fibre Channel ports.
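The port-doubling rule can be expressed as simple arithmetic. This is a minimal sketch; the per-port throughput figure is a hypothetical input, not a VLS specification, and the function name is illustrative.

```python
import math

# Sketch of the dual-fabric sizing rule. per_port_mb_s is an assumed
# planning figure supplied by the reader, not a value from this guide.

def fc_ports_required(required_mb_s, per_port_mb_s, survive_fabric_failure=True):
    """Ports needed to sustain required_mb_s, doubled if one of two fabrics may fail."""
    ports = math.ceil(required_mb_s / per_port_mb_s)
    return ports * 2 if survive_fabric_failure else ports

# Example: 1200 MB/s required, assuming 400 MB/s per port.
print(fc_ports_required(1200, 400))         # sized for full speed after a fabric failure
print(fc_ports_required(1200, 400, False))  # sized for normal operation only
```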
Future Data Growth How fast is your data growing? This is critical in deciding which disk backup technology best fits; choose a technology that will continue to scale in performance and capacity so that it still can provide a single backup target even after several years of data growth. If you choose a disk backup technology that starts close to its maximum performance/capacity, then as data grows you must add another device (and thus another backup target) and so on, which means significant increases in backup administration over time due to having to manually...
After installing the VLS and configuring/redirecting the backups to the device, you will create tape copy/clone jobs in the backup applications to perform the migration to physical tape. Most backup applications include several copy/clone options, ranging from creating the tape copy at the same time as the backup (mirroring), to scheduling a copy of all backup media to run when all the backups are expected to finish (scheduled copy), to creating the tape copy immediately after each backup completes (triggered copy).
after the backup window is complete, all of the media servers that were performing LAN backups are now idle and can be used for copies. Do not use media servers that are performing LAN-free backup because these will be active application servers and could affect application performance if used for copy operations.
Figure 14 Echo Copy is Managed through Automigration Benefits of Echo Copy The destination library is not visible to the backup application, so it does not need licensing. There is no need to license/configure copy jobs in the backup application. Copy traffic is no longer routed through the SAN.
NOTE: Automigration echo copy is not suitable for use with deduplication because: You cannot use echo copy to create archive tapes from the replication target device because these must have a different barcode, retention time, cartridge size, and contents. They must be created by another instance of the backup application.
have a different barcode, retention time, contents, and potentially a different size from the replication target media. To create archive tapes from the replication target media, you must use a second instance of the backup application to perform this. (See Creating Archive Tapes from the Target.) The backup application must have the ability to “import”...
Performance Bottleneck Identification In many cases, backup and restore performance using the VLS is limited by external factors. For example, performance is affected by the speed at which data can be transferred to and from the source disk system (the system being backed up), or by the performance of the Ethernet or Fibre Channel SAN link from the source to the VLS.
Reducing the time it takes to debug and resolve anomalies in the backup/restore environment. Reducing the potential for conflict with untested third-party products. Zoning may not always be required for configurations that are already small or simple. Typically the bigger the SAN, the more zoning is needed. HP recommends the following for determining how and when to use zoning.
system). Some operating systems limit the number of tape LUNs that can be discovered to eight LUNs on each device port (remember there are multiple Fibre Channel ports on the backup devices), but there are workarounds in some cases: ◦ Windows Large LUN support.
◦ Windows: Disable MS SCCM (System Center Config Mgr). ◦ HPUX: Ensure the EMS is not polling tape drives by either disabling it or by copying the archived dm_stape.cfg file to the /var/stm/config/tools/monitor folder, and the polling interval can be set to 0 to disable polling. ◦ HPUX: prevent utilities like mt from rewinding a tape that is in use by the backup...
Figure 15 Virtual Tape Environment In the VLS firmware version 1.x/2.x, the LUN mapping mode is specific to each host. It lets you manually assign new LUN numbers to the virtual devices visible to the host, and hosts that are not LUN masked continue to use the default (shared device) LUN numbering.
4 VLS Devices VLS Defined The HP VLS12000 Gateway, the VLS6000–series, the VLS9000–series, and the new VLS12200 and VLS9200 are RAID disk-based SAN backup devices. All three platforms emulate physical tape libraries, allowing you to perform disk-to-virtual tape (disk-to-disk) backups using your existing backup applications.
The virtual tape solution uses disk storage to simulate a tape library. That is, a virtual library simulates libraries with tape drives and slots as if they were physical devices. Backup servers see the virtual libraries as physical libraries on the SAN. Because you can create many more virtual libraries and drives than you have physical tape drives, many more SAN-based backups can run concurrently from the application servers, reducing the aggregate backup window.
Less expensive than buying more Fibre Channel arrays. Can require fewer tape drives than other solutions and get better utilization of the tape drives/libraries you have. The benefits of specific VLS models: Because the VLS12X00 is a Gateway to an EVA, it can take advantage of Instant Support Enterprise Edition –...
Figure 17 Full VLS9000 System VLS Technical Specifications Table 6 VLS9200 Multi-node Maximum Capacity by Configuration with 40–port Connectivity Kit and 1 TB Drives (2:1 Compression, Non-deduplication) 1 node 2 nodes 3 nodes 4 nodes 1 array 80 TB 2 arrays 160 TB 160 TB 3 arrays...
Table 7 VLS9200 Multi-node Maximum Capacity by Configuration with 40–port Connectivity Kit and 2 TB Drives (2:1 Compression, Non-deduplication) (continued) 1 node 2 nodes 3 nodes 4 nodes 4 arrays 640 TB 640 TB 640 TB 5 arrays 800 TB8 400 TB 6 arrays 960TB...
Table 10 VLS9000–series Multi-node Maximum Capacity by Configuration with Two 32–port Connectivity Kits and 40 TB Arrays (2:1 Compression, Non-deduplication) 1 node 2 nodes 3 nodes 4 nodes 5 nodes 6 nodes 7 nodes 8 nodes 1 array 80 TB 2 arrays 160 TB 160 TB...
Table 12 VLS6000–series Technical Specifications (continued)
Model: VLS6218, VLS6227, VLS6636, VLS6653
Maximum usable capacity (standard configuration, non-compressed): 28.4 TB, 30.6 TB, 56.8 TB, 61.2 TB
Maximum usable capacity (2:1 compression): 56.8 TB, 61.2 TB, 113.6 TB, 122.4 TB
Table 13 VLS6000–series Performance by Configuration, 4 GB Models VLS6200 VLS6600 with compression disabled...
Table 14 VLS12200 EVA Gateway Backup Throughput using Deduplication (continued) Number of VLS12200 EVA Gateway Nodes with uncompressible data or deduplication enabled 3.9 TB/hr 5.8 TB/hr 7.2 TB/hr 3.9 TB/hr 5.8 TB/hr 7.8 TB/hr Table 15 VLS12200 EVA Gateway Backup Throughput without Deduplication Number of VLS12200 EVA Gateway Nodes with 2:1 compressible data or deduplication disabled Number of EVA4000 or EVA4100...
Table 16 VLS12000 EVA Gateway Technical Specifications VLS12000 EVA Gateway Usable capacity for base (2–node) configuration 50 TB Maximum usable capacity 512 TB and 1 PB (with hardware compression) Number of virtual library nodes 2 to 8 Number of virtual tape libraries per node 1 to 16 Number of virtual tape drives per node 1 to 128...
Table 17 VLS12000 EVA Gateway Backup Throughput using Deduplication (continued) Number of VLS12000 EVA Gateway Nodes with uncompressible data or deduplication enabled 1080 MB/s 1600 MB/s 1600 MB/s 1600 MB/s 1600 MB/s 1600 MB/s 1600 MB/s 1080 1620 2000 2000 2000 2000 2000...
Table 18 VLS12000 EVA Gateway Backup Throughput without Deduplication (continued) Number of VLS12000 EVA Gateway Nodes with 2:1 compressible data or deduplication disabled 1200 MB/s 1800 MB/s 2400 MB/s 3000 MB/s 3600 MB/s 4200 MB/s 4800 MB/s Number of 800 MB/s 800 MB/s 800 MB/s 800 MB/s...
Figure 18 Internal Architecture of the VLS (VLS9X00 Shown) This allows any cartridge to be loaded into any virtual tape drive on any node, so the virtual drives in a library can be on different nodes from the virtual library robot. Therefore, the virtual drives for a library can be configured across multiple nodes as shown in Figure 19 (page 53), and as...
VLS Automatic Performance Load Balancing The VLS always “thin provisions” its virtual cartridges and dynamically load balances all the incoming new backup data across the available back-end arrays to ensure that there are never any performance hotspots. When any virtual cartridge is written (in any virtual tape drive), it dynamically assigns which array LUNs will store the cartridge data based on which array LUNs are least active.
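The idea of placing new cartridge data on the least active array LUNs can be sketched as a greedy selection. This is not the VLS firmware algorithm, which this guide does not describe in detail; the activity metric, the per-extent cost, and all names below are hypothetical.

```python
# Conceptual sketch of least-active placement. The activity numbers and
# the notion of an "extent" are assumptions made for illustration only.

def assign_extents(lun_activity, extent_count, cost_per_extent=1.0):
    """Place each new extent on whichever LUN currently has the lowest activity."""
    activity = dict(lun_activity)   # copy so the caller's data is untouched
    placement = []
    for _ in range(extent_count):
        lun = min(activity, key=activity.get)   # least active LUN right now
        placement.append(lun)
        activity[lun] += cost_per_extent        # writing makes that LUN busier
    return placement

print(assign_extents({"lun0": 5, "lun1": 1, "lun2": 3}, 4))
```

Because each placement raises the chosen LUN's activity, successive writes drift toward whichever LUNs are idle, which is the behaviour the text describes as avoiding hotspots.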
VLS Warm Failover The warm failover feature (in firmware version 3.2 or higher) speeds recovery from node0 failures by automatically restoring the previous virtual library configuration and licenses after a complete node0 replacement. It works as follows: The VLS will automatically save, within 1 hour, the current configuration and licenses to a hidden virtual cartridge stored on the back-end disk arrays.
Table 19 Base Capacity License by VLS Model (continued) VLS model Base capacity license VLS6653, 750 GB Covers the two MSA20 disk array enclosures that come bundled. drives VLS9000–series Each node adds a base license for one full VLS9000 array (48 drives, 30 TB or 40 TB). The entry-level VLS9000 7.5 and 10 TB systems do not need additional licenses to add capacity kits up to the full array configuration.
Load balance your virtual devices across the ports. For example, if you have a four-port configuration and you configure a 24-drive library, you should spread the drives evenly across the ports. Ensure that your backup application also balances the drive distribution. With VLS9X00 and VLS12X00, you can upgrade capacity as needed after the initial setup.
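The port-balancing guidance above (a 24-drive library on a four-port configuration) can be sketched as a round-robin assignment. The function is illustrative only; actual drive-to-port mapping is done through the VLS management interface.

```python
# Round-robin sketch of spreading virtual drives evenly across ports.
# Drive and port numbers are simple zero-based indexes for illustration.

def spread_drives(drive_count, port_count):
    """Return one list of drive indexes per port, assigned round-robin."""
    ports = [[] for _ in range(port_count)]
    for drive in range(drive_count):
        ports[drive % port_count].append(drive)
    return ports

layout = spread_drives(24, 4)
for port, drives in enumerate(layout):
    print(f"port {port}: {len(drives)} drives {drives}")
```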
Figure 21 VLS12000 EVA Gateway Connected to an EVA Ensure that the zoning configuration is complete and that storage ports 2 and 3 on each VLS node connect to different switches/fabrics or zones. The EVA controllers must also be connected to both switches/fabrics or zones.
Ensure that all VRaid LUNs required for the VLS have been created on the EVA such that the LUNs are RAID 6 (preferred) or RAID 5, are not configured as read-only, and are roughly the same size. The maximum supported EVA LUN size is 2047 GB. Do not create LUNs for the VLS larger than this even if the EVA model supports it.
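The two LUN rules above (no LUN larger than 2047 GB, and LUNs of roughly the same size) can be combined in a small planning sketch. The total capacity used in the example is a hypothetical input, and the function name is not part of any EVA tool.

```python
import math

MAX_EVA_LUN_GB = 2047  # maximum supported EVA LUN size stated in this guide

def plan_luns(total_gb):
    """Split total_gb into the fewest equal-sized LUNs that respect the limit."""
    count = math.ceil(total_gb / MAX_EVA_LUN_GB)
    size_gb = total_gb / count
    return count, size_gb

# Hypothetical example: 10,000 GB of EVA capacity set aside for the VLS.
count, size_gb = plan_luns(10000)
print(f"{count} LUNs of {size_gb:.0f} GB each")
```

Dividing evenly rather than filling LUNs to the limit keeps the sizes uniform, which matches the "roughly the same size" requirement.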
Figure 22 Single vs. Multiple Storage Pool Considerations VLS9000 Storage Pooling With VLS9X00, the user establishes a storage pool policy which defines how the arrays are pooled. VLS9X00 then automatically creates storage pools based on the policy. Storage pools are defined in terms of whole arrays, which consist of one base disk array and three expansion disk arrays.
Figure 23 VLS9X00 Partially Populated Storage Pools The flexibility of the VLS9X00 storage pooling effectively allows the user to partition the disk arrays to fit the user's environment. For example, if a company has four SANs, each with its own backup application, the user can configure the storage pools so that there are two arrays per SAN.
Compare the capacity required to hold the data that is most often restored to the VLS capacity. If you use ‘x’% of the VLS to speed up your restores, is that the best use of the VLS? How much time in restores will this save? What does the ROI look like for this? Compare the capacity required to hold data that is backed up but has the shortest retention periods.
Sizing/implementation Examples There are no standard, generic rules for sizing your environment but there are general parameters that you can use to help you understand your capacity needs. Example 1 VLS Sizing for Performance and Capacity Assume that you have 12 servers being backed up across a SAN. Each is pushing data at 15 MB/sec.
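The starting point of Example 1 is the aggregate data rate of the servers. The sketch below computes that rate and the volume moved in a given window; the 8-hour window is a hypothetical value, and the function names are illustrative.

```python
# Aggregate throughput for Example 1: 12 servers at 15 MB/sec each.
# The backup window length used below is an assumption for illustration.

def aggregate_throughput_mb_s(server_count, per_server_mb_s):
    """Combined data rate when all servers back up concurrently."""
    return server_count * per_server_mb_s

def data_moved_tb(throughput_mb_s, hours):
    """Terabytes moved at throughput_mb_s over the given hours (decimal units)."""
    return throughput_mb_s * 3600 * hours / 1e6

rate = aggregate_throughput_mb_s(12, 15)
print(f"aggregate rate: {rate} MB/sec")
print(f"data moved in a hypothetical 8-hour window: {data_moved_tb(rate, 8):.2f} TB")
```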
Figure 24 Testing Backups to Tape vs. Backups to a VLS If you run two tests, in both cases backing up approximately 13 GB of data from the five servers, you get the following results: Backing up to physical tape takes 6 minutes 41 seconds. Backing up the same data to the VLS takes 3 minutes 34 seconds. Example 3: VLS12000 EVA Sizing Without Deduplication When sizing the EVA array requirements for a VLS12000 without deduplication enabled, the key...
Reduce Backup Window (No Deduplication) Company requirements: Company has Fibre Channel SAN, needs to reduce backup window, already using tape-based backup application, wants low risk easy integration of disk-based backup. Solution: Entry Level VLS9000 with ISV-based copy to tape. No deduplication. Caveats: None.
Large Scale Backup Consolidation (No Deduplication) Company requirements: Up to 500 TB backup storage required in a single unit, high throughput required to meet backup windows. Solution: Multi-node VLS9000 using Object copy to physical tape. Caveats: Very large systems (greater than 300 TB) might be better split into two smaller systems for increased availability.
Other information: HP strongly recommends using the backup application to perform the copy from VLS to physical tape. Single management console for all activities and better utilization of physical media through object copy technology. Managing Data Retention and Reducing Backup Storage Costs Company requirements: To improve RPO objectives, reduce expenditure on ever increasing backup storage, better manage data growth for backup, provide more data online, reduced dependency on tape.
Other information: HP strongly recommends using the backup application to perform the copy from VLS to physical tape. Single management console for all activities and better utilization of physical media through object copy technology. Full Enterprise Backup and Disaster Recovery Capability Company requirements: Full disaster recovery capability for an Enterprise Data center to replicate data automatically to another site in a cost effective manner with reduced fixed costs (intersite link costs).
recommended for high volumes of data (TBs) because of the time to replicate data wholesale over a relatively small link speed. Data can be copied to physical tape at the disaster recovery site then transported to the primary site and imported to new servers and arrays using physical tape at the primary site. Tape could also be copied to the VLS and then recovered from there.
Figure 29 Using VLS with Multi-data Center Replication to a Common Disaster Recovery Site Recovery options: Primary site data can be recovered directly from VLS at the primary site. If there is a total disaster at one of the primary sites and all servers and VLS are lost, data can be recovered from the disaster recovery site in several ways: At the disaster recovery site using a separate backup application master server, data can be recovered to new servers.
Other information: HP strongly recommends using the backup application to perform the copy from VLS to physical tape. Single management console for all activities and better utilization of physical media through object copy technology. For many-to-one implementations especially, sizing the links, capacity required, and replication windows required is quite complex.
Online backup of File System and Application Data inside the VMs. Enable single object and application recovery (including point in time) Single appliance solution for VMware and non-VMware hosts Solution trade-offs: Network backups can only reach around 80 MB/sec in the best case. For high volumes of data this may not be fast enough to meet a fixed backup window, but VMware backups tend to involve many small machines without high volumes of application data.
Figure 31 Using VLS with VMware using VCB Recovery options: Recovery is a two-stage process: The image is restored to the VCB proxy or the ESX server. The image is restored as a VM using VMware Converter if restored to the VCB proxy, or vcbRestore if restored to the ESX server.
Solution trade-offs: Full snapshot-based recovery – single file recovery only on Windows file system backups. Distinctly separate proxy server required at additional cost – requires additional storage for copy of snapshots. Backup application integrations such as Oracle, SQL, Exchange, etc., cannot be used with VCB.
Figure 32 Using VLS in Larger VMware Environments with HP Data Protector ZDB IR Backup Recovery options: Two step restore process that provides full recovery of VM and data Recover the VM image from VCB backup ◦ Required only in case the whole VM is lost or restore to a new ESX server Recover Application Data ◦ For replication-based disk backups (IR), perform Instant Recovery ◦...
VM backup Data Protector Integration with VMware VCB provides the first layer of protection Provides snapshots that can be used for restoring VM Needed only when VM configuration changes Application backup Provides the second layer that enables protection of application data Can be performed using ZDB/IR feature of Data Protector by using HP Business Copy for EVA, XP (LHN to follow) Further enhancements possible by providing multi-site disaster recovery by using low bandwidth...
Solution advantages: Data off-sited as part of the initial backup, good resilience solution, no delays Part of an Active/Active Data Center disaster recovery solution Backup load can be split across sites 50:50 – best use of all available resources No data offsiting costs for physical tape – it is already offsite with this method Solution trade-offs: Company must be willing to invest in multiprotocol routers and DWDM intersite link technology at some cost.
at that site. In the event of a major disaster at a site then the replicated copy of data can be accessed in several ways: At disaster recovery site using a separate backup application master server data can be recovered to new servers. A replacement VLS and servers can be installed at the primary site and data can be reverse replicated from the disaster recovery site over the low bandwidth link, although this is not recommended for high volumes of data (TBs) because of the time to replicate data wholesale...
5 Automigration In addition to copying virtual media to physical media via the backup application, another option is to use automigration to perform transparent tape migration. The automigration feature allows the VLS to act as a tape copy engine that transfers data from virtual cartridges on disk to a physical tape library connected to the VLS device.
creates matching virtual cartridges in the virtual library specified for the echo copy pool the physical tape is in. See Figure 35 (page 80). Figure 35 Linked Media When virtual tapes are ejected by a backup application, or when tapes are ejected in the destination library, the matching virtual tapes are moved to the device's firesafe —...
firesafe, the user must move the virtual cartridges out of the firesafe first. See “Restoring from Automigration Media” (page 85). Start time – the time at which the echo copy job begins. HP recommends that copies are scheduled within a different time window from other backup activities. Window size –...
Figure 37 Automigration Media Life Cycle This section presents an example, in which a user who works for Company X did everything described in the previous section to set up automigration. See “Automigration Setup” (page 81). The numbered list that follows coincides with Figure 37 (page 82), which illustrates the stages in the life cycle of automigration virtual and physical media.
retention period passes, the virtual cartridges are deleted, at which point a physical tape is needed for restore operations. See “Restoring from Automigration Media” (page 85). Thirty days have passed. The truck returns with the physical tapes previously transported off-site that are now to be recycled.
to their own dedicated virtual libraries. During the echo copy window, all the virtual cartridges from the virtual libraries are automatically and transparently copied to the single physical tape library. This scheme provides daily off-site protection and archiving of the physical tapes. The user can configure additional virtual slots in the virtual libraries.
The ability to map multiple physical tape libraries to a single virtual library means that customers can start with smaller, single physical libraries. As they upgrade the capacity of their virtual libraries, customers can add additional physical libraries without needing to modify their backup jobs. The user can configure additional virtual slots in the virtual libraries.
NOTE: Customers may choose to configure their devices such that there are additional, unmapped slots in the virtual library. These additional virtual slots can be used for backups so that even if the physical slots are all full, there are still spare slots available to hold the cartridges needed for restore.
6 Accelerated Deduplication HP Accelerated deduplication technology is designed for optimal performance and scalability to meet the needs of enterprise data centers. It offers the following features and benefits: Leverages object-level differencing code, which targets matching backup “objects.” Rather than comparing the current backup with every byte of data stored on the device, Accelerated deduplication can compare backup objects from the current backup to matching objects from the previous backup, where there is likely to be duplicate data.
Figure 40 Steps of Accelerated deduplication On–the–fly backup analysis Figure 41 (page 89) illustrates the steps in the first phase of Accelerated deduplication. Although Accelerated deduplication uses post-processing technology, backups are analyzed initially while the backup is running. This process has minimal performance impact. The deduplication software uses the metadata attached by the backup application to decode the format of the tape.
Figure 41 Backup Analysis Item Description The VLS analyzes the backup as it goes through memory. The backup application metadata is stripped and used to understand the format of the tape. The metadata database creates a map of the locations of the logical backup data. In Figure 42 (page 89), two instances of File A are shown, one from Session 1, and one from Session 2.
the type of backup – full or incremental the type of data in the backup – files, database, etc. The deduplication software then queries the metadata database to find an equivalent older version of the same backup job to compare it against the new backup. If the current backup is full, it will be compared against the last equivalent full backup version.
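The lookup of the comparison backup can be sketched as follows. This is an illustrative model only, not the VLS implementation: the `BackupRecord` fields and the `last_equivalent_full` helper are hypothetical names, and the sketch covers only the rule stated above for full backups.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BackupRecord:
    """Hypothetical metadata database entry for one backup session."""
    job_name: str     # backup job name reported by the backup application
    backup_type: str  # "full" or "incremental"
    data_type: str    # for example "files" or "database"
    session: int      # increasing session number; higher means newer


def last_equivalent_full(history: Sequence[BackupRecord],
                         current: BackupRecord) -> Optional[BackupRecord]:
    """Return the newest older full backup of the same job and data type."""
    candidates = [
        r for r in history
        if r.job_name == current.job_name
        and r.data_type == current.data_type
        and r.backup_type == "full"
        and r.session < current.session
    ]
    # None means there is no earlier equivalent backup to compare against.
    return max(candidates, key=lambda r: r.session, default=None)
```

If no earlier equivalent backup exists (for example, after a job is renamed), the helper returns `None`, which mirrors why consistent job naming matters for deduplication.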
Figure 43 Backup Sessions with Duplicate and Unique Data Item Description Backup session 1 consists of files A and B. Duplicate data from A is replaced with a pointer to A'. The unique data from file A, as well as file B in session 1, is retained.
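The pointer replacement can be illustrated with a small sketch. It is a simplification under stated assumptions: it compares fixed-size blocks at aligned offsets, whereas the guide describes object-level differencing, and the function names and the 4-byte block size are hypothetical choices made only for readability.

```python
from typing import List, Tuple, Union

Entry = Tuple[str, Union[int, bytes]]


def dedupe_older_version(old: bytes, new: bytes, block: int = 4) -> List[Entry]:
    """Describe the older file A in terms of the newer file A'.

    Blocks of A that also occur in A' become ("ptr", offset_in_new);
    blocks unique to A are kept as ("data", raw_bytes).
    """
    new_index = {}
    for off in range(0, len(new), block):
        new_index.setdefault(new[off:off + block], off)

    entries: List[Entry] = []
    for off in range(0, len(old), block):
        chunk = old[off:off + block]
        if chunk in new_index:
            entries.append(("ptr", new_index[chunk]))
        else:
            entries.append(("data", chunk))
    return entries


def rebuild(entries: List[Entry], new: bytes, block: int = 4) -> bytes:
    """Reassemble the older file from its pointers and retained unique data."""
    parts = []
    for kind, value in entries:
        parts.append(new[value:value + block] if kind == "ptr" else value)
    return b"".join(parts)
```

Only the `("data", ...)` entries consume space for the older version; the rest is recovered from the newer backup on restore.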
Supported Backup Applications and Data Types Accelerated deduplication can only be used with supported backup applications and operating systems. (In firmware version 3.3.0 and higher, all data types within the supported backup application and operating system matrix will deduplicate.) See http://www.hp.com/go/ebs for more information on backup applications and operating system types supported with HP Accelerated deduplication.
Configuration and Reporting Accelerated deduplication is aware of the backup contents of tapes. Therefore, you have reporting and policy control options available over the deduplication process. You can: Enable and disable deduplication by backup job type or individual backup job. Switch from backup to File-level differencing for file server backup jobs.
Use the VLS GUI to generate a Cartridge Utilization report (via Notifications Report Setup). This report lists the barcode of each cartridge, disk space usage in bytes, deduplication status, and dependent cartridges (if any) which can be used to identify the cartridges that you can reformat/erase to create free storage.
NOTE: If the number of nodes required for backing up data differs from the number of nodes required to perform deduplication, select the larger number of nodes. NOTE: On the VLS6200, do not enable software compression and deduplication together. Enabling software compression will drastically reduce the performance if deduplication is enabled. Capacity Sizing Several factors affect how much disk space is required for the desired retention scheme: Data compression rate.
Use the following guidelines: Appending media pool HP recommends using standard, appending shared media pools across multiple tapes. By doing so, tapes fill with backup jobs from different hosts, which allows deduplication to occur and storage to be reclaimed. You should not use non-appending media pools. Virtual cartridge sizing Generally, it is always better to create smaller cartridges than larger ones because it frees up the cartridge quicker for use by other processes such as restore, tape copy or deduplication...
Disabled in the VLS GUI. Otherwise, the last backup of that client will remain in the "Waiting for next backup" state. If you have a clustered system that is being backed up, in some cases this may mean the client name changes when a node fails because the cluster fails over to another node and the backup application uses that node name (rather than the virtual cluster name).
7 Replication The VLS deduplication-enabled replication technology (which leverages Accelerated deduplication) is designed to maintain performance and scalability to meet the needs of enterprise data centers: On a multi-node VLS device the replication is also multi-node capable and thus can scale as the device scales.
backup tape on the source device (not the tape currently replicating) runs in parallel with the replication. Once the backup has successfully replicated, the backup is reassembled using the transferred deltas and the older existing duplicate components on the target device. Space reclamation of the previous backups (not the tape just replicated) on the target device begins.
itself. (In firmware version 3.3.0 or higher, differential backups will deduplicate against each other so the amount of data replicated is reduced to the size of incremental backups.) Database block-level incrementals will also always be replicated as the entire size of the incremental backup because the data is always unique.
you would install one license per node in both the source and target devices (for all nodes in those devices). Replication Setup After preparing your source and destination VLS devices (see “Replication Preparation” (page 133)), use the following steps to configure replication. See your VLS user guide for details. Figure 47 (page 101) shows an overview of the steps outlined below.
Create a replication target on the target device. On the target device, set up “Global LAN/WAN Target Settings” and then create a “LAN/WAN Target” with: name; start and end slots; maximum simultaneous transfers; password; target replication start window; target replication duration window (typically 24 hours). Set up the global LAN/WAN settings by going to the Automigration/Replication tab and setting global settings for compression over the replication link and the TCP/IP port number used for replication data transfers.
Configure the replication target’s window limits — start time and window duration. Add “LAN/WAN Destination” to the source device (to link the source device to target device). On the source device, use “Manage LAN/WAN Library” and the relevant password to link the source and target devices together in a replication pair.
Create an echo copy pool (including selecting whether to use tape initialization). On the source device, select the destination library and “Create echo copy pools.” In this context, the echo copy pool acts as an asynchronous mirror between the selected source virtual library and the target library.
Run the first full backup (to replicating media in the source echo copy pool). Only backups performed after the echo copy pool is created will be replicated. Any previous existing backups will not be replicated until their cartridges expire in the backup application and are overwritten by new backups.
You must decide whether to use the tape initialization process when initially configuring the VLS replication (when creating the echo copy pool). If tape initialization is not selected, WAN or co-location initialization will be performed over LAN/WAN. Physical Tape Initialization Because in enterprise backup environments the initial backups can be quite large (many TB), in most cases it will be impractical to perform tape transfers using the low bandwidth link (because it would take weeks to replicate the first backup).
below. This option prompts you to select from the available “SAN Destination Library” devices to use for the tape export: The tape export will stack the virtual cartridges to replicate (the cartridges that contain the full backup) onto the available physical tapes in the selected SAN destination library. The tape export GUI will show the export status to the tape handler to determine which physical tapes are ready for removal, whether new tapes need to be loaded, etc.
Figure 49 VLS Initialization using the WAN Co-location Initialization In circumstances where data centers are within a small distance of each other, it may be practical to co-locate the two VLS units at the same location and directly connect the GbE links from the VLS nodes by plugging both source and target device external LAN ports into the same external LAN switch.
Figure 50 VLS Initialization by Co-location Co-location is practical if replication is being installed along with VLS technology from day 1. If replication is being added to VLS already installed at different sites, WAN transfer is best for small quantities of data and tape transfer for larger quantities of data. Again, for many-to-one or active-active implementations, co-location is impractical and WAN transfer or tape transfer is best.
Status of LAN/WAN Replication Connection You can check that the replication LAN/WAN connection between the source and target devices is online by viewing “LAN/WAN Replication Libraries” in the source device GUI. If the replication connection is offline this can be because: The target device is “Unreachable”...
take 40 minutes to create the matching 1000 cartridges on the source device and complete the “Adding Cartridge” operations for all 1000 cartridges. Up to Date. The source virtual cartridge is fully synchronized with the matching target cartridge (the mirror is in sync). Waiting for Backup Data.
NOTE: If an active replication job was interrupted (for example, the cartridge was loaded into a virtual tape drive or the link failed), this will cancel/reschedule the replication job and return it to “Mirror Scheduled” status (if within a replication window) or “Out of Sync” (if outside of a replication window) so that the replication job is run again once the interruption is resolved.
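The status rule in this note can be expressed as a short sketch. It assumes a replication window defined by a start time and a duration, both in minutes, and the function names are hypothetical; only the two status strings come from the text.

```python
MINUTES_PER_DAY = 24 * 60


def in_replication_window(now_min: int, start_min: int, duration_min: int) -> bool:
    """True if minute-of-day now_min falls inside the window.

    The modulo handles windows that run past midnight.
    """
    if duration_min >= MINUTES_PER_DAY:
        return True  # a 24-hour window is always open
    return (now_min - start_min) % MINUTES_PER_DAY < duration_min


def status_after_interruption(now_min: int, start_min: int, duration_min: int) -> str:
    """Status an interrupted replication job returns to, per the note above."""
    if in_replication_window(now_min, start_min, duration_min):
        return "Mirror Scheduled"
    return "Out of Sync"
```

For a window that starts at 22:00 and lasts 8 hours, an interruption at 01:00 leaves the job in “Mirror Scheduled”, while one at 12:00 leaves it “Out of Sync” until the next window opens.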
The following screen shows an example of the job history on a target device but you can also view the same information on the source device, and in the case of an active-active deployment you can view both the outgoing and incoming jobs on a device that is both a source and a target: On both the source and target devices there is also an option to view the “Job Summary”...
Figure 51 Dividing the Backup Jobs by Priority Level On the target device you also have the ability to limit replication traffic beyond limiting the replication time windows. This can be done by controlling how many concurrent replication jobs are allowed per replication target, and optionally controlling the maximum Mbytes/sec throughput of a single replication job.
Table 22 Network Latency Job Concurrency Required to Network Latency Aggregate Mbytes/sec Saturate Link 80 MB/sec 50ms 80 MB/sec 100ms 80 MB/sec 200ms 80 MB/sec 500ms 52 MB/sec A new option in firmware versions 3.4.1 and higher is the ability to throttle the number of replication jobs that can run on the master node0 of a multi-node configuration.
The most likely scenario for enterprise data centers using HP VLS replication is that it will be enabled to run 24 hours a day on dedicated links used only for replication. HP VLS has the ability to enable windows for replication to further customize the implementation to best suit your needs.
Figure 52 VLS Sizing Example 1 Database Full Backup size 15TB – 1% change rate File System Full backup Size 5TB – Typical incremental size of 10% Replication Window 24 Hours Required Link speed 40% of existing 200Mbit/sec link This example shows a 15 TB full database backup and 10% incremental backup of file system data to replicate daily between sites.
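The underlying arithmetic of a link-sizing estimate can be sketched as below. This is a rough model under stated assumptions: it treats the daily replicated volume as the full backup size multiplied by the change or incremental rate, uses decimal units, and ignores deduplication ratios, compression, and protocol overhead, so it should not be expected to reproduce the figures in Figure 52. The function name is hypothetical.

```python
def required_link_mbit_per_sec(replicated_gb: float, window_hours: float) -> float:
    """Average link speed (Mbit/sec) needed to move replicated_gb within the window."""
    bits = replicated_gb * 1e9 * 8          # decimal GB to bits
    seconds = window_hours * 3600
    return bits / seconds / 1e6


# Assumption: daily replicated volume = full size x change (or incremental) rate.
database_gb = 15_000 * 0.01      # 15 TB full backup, 1% change rate
file_system_gb = 5_000 * 0.10    # 5 TB full backup, 10% incremental
daily_gb = database_gb + file_system_gb

print(round(required_link_mbit_per_sec(daily_gb, 24), 1), "Mbit/sec average over 24 hours")
```

Shortening the window raises the required speed proportionally, which is why the replication window and the link speed have to be sized together.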
Required Link - 400Mbit/sec link made available Typical replication time 12.8 hours Now consider a larger backup requirement with a full database backup of 50 TB and an incremental of 10% of 16.6 TB of file system data. Notice how, as the amount of data backed up scales, the number of nodes scales, and as the number of nodes scales, more processing power is made available to backup, deduplication, and replication.
significantly higher cost. While some may say 1 GB/sec is not low bandwidth, everything is relative to the task at hand. Table 23 Summary of the Calculations Involved in the Sizing Examples Data types (3 Full size in GB Replicated data Time available Link speed Combined link...
Here are four remote sites replicating 2 TB and 10 TB backups daily into a central VLS9000. The best approach in a many-to-one scenario is: Size each of the individual links separately. Ensure the links at the target at least match the total remote size replication capacity (in this example 300 Mbits/sec).
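The second guideline amounts to a simple capacity check, sketched here. The per-site link speeds are hypothetical values chosen only so that they total the 300 Mbits/sec quoted in the example; the function name is also an assumption.

```python
from typing import Sequence


def target_link_shortfall(site_links_mbit: Sequence[float],
                          target_link_mbit: float) -> float:
    """Return how many Mbit/sec the target link falls short of the combined
    remote-site replication capacity (0.0 if it is sufficient)."""
    return max(0.0, sum(site_links_mbit) - target_link_mbit)


# Hypothetical links for the four remote sites, totalling 300 Mbit/sec.
remote_sites = [50.0, 50.0, 100.0, 100.0]

print(target_link_shortfall(remote_sites, 300.0))  # sufficient target link
print(target_link_shortfall(remote_sites, 200.0))  # undersized target link
```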
the front-end Fibre Channel ports and thus can be presented as a library to a backup application in the disaster recovery site (note that the backup application can only be presented to the target device; it cannot also be presented to the source device). Figure 56 VLS Recovery Options for a Data Center Rebuild This target virtual library can be used to restore directly onto new servers that can then be shipped to the replacement site, or to copy the data to physical tape which can be shipped to the...
most recent backups in the media database may not have been fully replicated and thus may not be usable for restore.) Figure 57 Data Recovery from the Target VLS The source VLS becomes inaccessible to the remote VLS and to the backup host. The source tape library and drive are deleted from the backup host (cartridge media pools remain).
Rebuilding Source Device and Re-establishing Replication If the entire source VLS device was destroyed (for example, you had a site disaster) and was replaced with a brand new blank device, then you must recreate the original virtual library configuration and replication configuration. If you have a previously saved configuration file (created using Save Configuration in the VLS GUI) you can perform a “Restore Configuration”...
On the target VLS: Create the cartridges to be used by the replication target. Using the same barcodes as exist on the source will cause the existing source cartridges to be reused. On the source VLS: Manage the replication library. On the source VLS: Create the new echo copy pool.
VLS that sends an hourly email report listing which virtual cartridges have been successfully replicated in the last hour, and this cartridge list can then be fed into a script that automatically triggers tape import jobs in the backup application (which read the new cartridge data and import this content into the media database).
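The report-driven import flow can be sketched in outline. Everything specific here is an assumption: the guide does not show the email layout in this section, so the sketch supposes one barcode per line and a hypothetical barcode shape of six to eight uppercase letters or digits. The actual import command is left to a caller-supplied function because it depends on the backup application.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical barcode shape; adjust to the barcodes used in your libraries.
BARCODE_RE = re.compile(r"^[A-Z0-9]{6,8}$")


def barcodes_from_report(body: str) -> List[str]:
    """Extract unique barcodes, in order, from the text of a report email."""
    found: List[str] = []
    for line in body.splitlines():
        token = line.strip()
        if BARCODE_RE.match(token) and token not in found:
            found.append(token)
    return found


def import_all(barcodes: Iterable[str], import_one: Callable[[str], None]) -> int:
    """Call the supplied import function once per barcode; return the count."""
    count = 0
    for barcode in barcodes:
        import_one(barcode)
        count += 1
    return count
```

In practice `import_one` would wrap the backup application's own tape import command, as the sample scripts referenced in this guide do.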
“Data Protector Import Example Script” (page 138) and “NetBackup Import Example Script” (page 141) for sample import scripts. On the target VLS device GUI, set up the SMTP Gateway address and then configure the “ISV Import” email report using the selected email account.
8 Non-deduplicated Replication If you have not enabled Accelerated deduplication on your VLS device, non-deduplicated replication can still provide a more reliable method of sending mission-critical data off-site than shipping physical tape (because there is no need to load/unload physical media and ship it to the off-site location).
Replicating a Subset of Virtual Cartridges - a Use Case The following is just one possible use case for non-deduplicated replication. In a typical backup environment, companies rarely have network bandwidth surpassing what meets their needs. For this reason, the ability to replicate just a subset of business critical data, thus conserving bandwidth, is advantageous.
9 VLS Configuration and Backup Application Guidelines To achieve full performance benefits from the VLS, you may need to modify your enterprise backup application configuration (for example, to take advantage of the fact that you can create many more virtual tape drives than a physical tape library holds). If you have enabled deduplication, you may need additional changes in your enterprise backup application to take full advantage of this capacity optimization technology.
If deduplication and replication will be configured on the VLS, review “Deduplication Preparation” (page 132) and “Replication Preparation” (page 133) for more information. Basic VLS Device Configuration The following steps highlight the considerations for designing the required virtual libraries/drives/cartridges and standard device configuration. Identify the number of required virtual libraries.
as the current total amount of physical tape (already compressed) divided by the virtual cartridge sizes. ◦ If this virtual library will be a replication target, it must contain at least as many slots as the total number of source slots that are replicating to this target. See “Replication Preparation”...
Determine if there are performance bottlenecks in the current backup (for example, LAN or SAN ISL bottlenecks) that you must resolve to allow full performance to the VLS. Based on your offsite copy requirements, assess the configuration changes needed to implement offsite copy.
Install deduplication licenses and wait for the “Completed deduplication installation” and “Initialization for node0 complete” notifications. If deduplication failed to initialize due to lack of free space for the deduplication metadata database (1 TB for VLS6000–series, 2 TB for VLS9000–series and VLS12000), you must free up that disk space by deleting cartridges.
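The free-space requirement can be checked before installing the licenses with a sketch like the following. The capacities come from the text above; the dictionary keys and the function name are hypothetical, and how free space is actually measured on the device is outside this sketch.

```python
# Space needed for the deduplication metadata database, in TB (from the text above).
REQUIRED_METADATA_TB = {
    "VLS6000": 1.0,   # VLS6000-series
    "VLS9000": 2.0,   # VLS9000-series
    "VLS12000": 2.0,
}


def space_to_free_tb(model: str, free_tb: float) -> float:
    """Return how many TB must still be freed (0.0 if there is already enough)."""
    required = REQUIRED_METADATA_TB[model]
    return max(0.0, required - free_tb)
```

A result greater than zero indicates roughly how much cartridge data would need to be deleted before deduplication can initialize.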
source backup application (for example, assign it to dedicated front-end Fibre Channel ports that you can zone away from the source backup application agents). Best practices include creating more library slots than initially needed in the source and target virtual libraries to provide room for future growth.
Figure 60 Preparing the Network Connection for Replication in a VLS You must maintain time synchronization between the source and target VLS devices; if they become more than one minute out of sync, the replication will go offline with a warning message. The recommended method of achieving this is to configure both source and target VLS devices to use a single NTP server.
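The one-minute tolerance can be expressed as a simple check, sketched here. The function name is hypothetical, and obtaining the two device clock readings is assumed to happen elsewhere (for example, from monitoring data).

```python
from datetime import datetime, timedelta

MAX_SKEW = timedelta(minutes=1)  # tolerance stated in the text above


def clocks_within_tolerance(source_time: datetime, target_time: datetime) -> bool:
    """True if the two device clocks differ by no more than one minute."""
    return abs(source_time - target_time) <= MAX_SKEW
```

A monitoring job could raise an alert when this returns `False`, before replication goes offline.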
necessary to administer the backup and restore activities directly from the Cell Manager itself, because any client within the cell (as supported) can connect to the Cell Manager over the network and be used to administer the activities of the cell. Disk Agent: install the Disk Agent on client systems you want to back up.
Data Protector Deduplication Guidelines In addition to the Data Protector general guidelines, the following guidelines apply to a VLS with deduplication enabled: Patch all Windows clients: HP Data Protector on Windows clients requires a software fix to eliminate 56 KB file fragments that diminish deduplication performance on VLS and D2D products.
called BACKUP2 (MONTHLY), you can enter (WEEKLY) and (MONTHLY) in the configuration. The suffix removal applies to all backup job names. DB2 Databases: for deduplication to work you must set parallelism to 1 for DB2 backups. Data Protector Import Example Script The following section details an example script to perform the tape import commands for HP Data Protector.
wait DRVBSY=( 0 0 0 0 ) done wait The above basic example script template can be extended to add error handling or support for multiple libraries, etc. Symantec NetBackup This section includes both general and deduplication guidelines as well as other useful information for Symantec NetBackup.
NetBackup General Guidelines The following NetBackup guidelines apply to a VLS regardless of whether deduplication is enabled or disabled: HP VLS Library Emulation: When configuring the VLS for use with NetBackup, select “HP VLS” as the library emulation. NetBackup requires the use of this emulation in all HP VLS products for device support purposes.
Multistreaming SQL Server: For large SQL Server databases with more than one file group and more than one data file per file group, configure the backup to create one backup set per database data file (for example, single stripe per data file) for optimal deduplication. Different backup job names for the same data: You should not have different group names for backing up the same data or they will not deduplicate against each other.
Figure 61 TSM LAN-based backups Data on application servers (A-G) is backed up by TSM server via the SAN to the TSM disk pool. A typical TSM primary disk pool is sized for a minimum of one nightly backup. As the primary disk pool fills, TSM moves data from it via the SAN to the primary tape pool. Unscheduled Migrations Undersized disk storage pools;...
Figure 62 TSM LAN-free backups Data on large application servers (D-G) is backed up by TSM server via the SAN directly to the tape storage pool. Data is copied from the primary tape storage pool to a copy storage pool, making a set of tapes to send off-site for disaster protection.
Table 24 Recommended Settings for each TSM Server's dsmserv.opt File or Storage Agent’s dsmsta.opt Command Value Notes TCPNODELAY Default TCPWINDOWSIZE Default BUFPOOLSIZE 32768 Default TXNGROUPMAX Default EXPINTERVAL Default = 24. When setting this to 0, use an administrative schedule to execute expiration at an appropriate time each day.
Keep mount point: The Keep Mount Point setting on each node must be set to Yes using the command keepmp=yes. This command only works if there are no client sessions running when enabled. Update node * keepmp=yes Replace the asterisk (*) with a node name to update an individual node. Resource utilization: Use the “resourceutilization”...
The more data you collocate, the fewer tape mounts are required during restores. The downside is that you will significantly increase the number of tapes you use. Deduplication does not support co-location because this feature is unnecessary when using a disk-based backup solution.
TSM image files: TSM image file backups will only deduplicate if the differencing algorithm for the corresponding policy is set to Backup-level. If the policy for these files is set to File-level, these files will not be deduplicated. To simplify configuration, define a specific management class for TSM image backups and configure that class in the VLS GUI for a backup-level deduplication policy.
Display backup totals in TB for a 7 day period:
select cast(float(sum(bytes))/1024/1024/1024/1024 as dec(8,4)) as "Backed up data in TB" from summary where activity='BACKUP' and start_time>timestamp(current_date)-(7)days
Display backup volume in Bytes by node for the past 24 hours:
select start_time, end_time, entity, bytes from summary where activity='BACKUP' and start_time>timestamp(current_date)-(1)days order by end_time
Display value of mount retention for the device class:...
NetWorker General Guidelines The following NetWorker guidelines apply to a VLS regardless of whether deduplication is enabled or disabled: Tape block size: Set the tape block size to 256 KB (the maximum supported size; 512 KB is not supported). Otherwise, NetWorker uses a block size based on the device type used. For example, LTO-2 is set by EMC to 64 KB, assuming that the operating system is configured for variable block size mode.
configuration details and examples. HP recommends setting stinit to load via an init file (for example, /etc/rc.local on Red Hat) to ensure it is loaded properly. ◦ NetWorker supports Linux udev device naming. See the EMC whitepaper Persistent Binding and udev Changes for EMC NetWorker for details (document h5795, available at https://powerlink.emc.com).
backup, edit the NetWorker client object for the SQL backup and remove the -Sn option from the backup command (n is the number of parallel streams per data file; removing this argument from the backup commands sets it to the default of 1 stream per data file). The default setting for this is disabled, and most sites do not use Striped Backups with SQL Server.
MAXOPENFILES default Number of channels <= 30 CONFIGURE CONTROLFILE AUTOBACKUP ON; CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE 'SBT_TAPE' TO '%F'; Compression off Encryption off NOTE: Consistent Backup Streams: When backing up multiple data files into each backup set, deduplication depends on consistent data file-to-backup set assignments. If the database is being reconfigured (for example, data files are added or removed), the backup set matching will return to normal when the data file grouping becomes consistent between backups again.
The deduplication-optimized RMAN settings: FORMAT 'df.%d.%f.<other_identifiers>' for each data file FILESPERSET 1 MAXOPENFILES default Number of channels (does not impact deduplication performance, but does impact backup performance, so consider increasing the number of channels while maintaining concurrency at 1) CONFIGURE CONTROLFILE AUTOBACKUP ON; CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE 'SBT_TAPE'...
Documents HP 9000–series Virtual Library System User Guide HP 9200 Virtual Library System User Guide HP 12000 Gateway Virtual Library System User Guide HP 12200 Gateway Virtual Library System User Guide HP StorageWorks 6000–series Virtual Library System User Guide HP Enterprise Backup Solution Design Guide...
HP contact information For the name of the nearest HP authorized reseller: See the Contact HP worldwide (in English) webpage (http://welcome.hp.com/country/us/en/wwcontact_us.html). In the United States, call 1-800-345-1518. In Canada, call 1-800-263-5868. For HP technical support: In the United States, for contact options see the Contact HP United States webpage (http://welcome.hp.com/country/us/en/contact_us.html).
WARNING! Indicates that failure to follow directions could result in bodily harm or death. CAUTION: Indicates that failure to follow directions could result in damage to equipment or data. NOTE: Provides additional information. The following equipment symbols may be found on hardware to which this guide pertains. They have the following meanings: WARNING! These symbols, which mark an enclosed surface or area of the equipment,...
Glossary
automigration The feature in which the virtual tape library acts as a tape copy engine that transfers data from virtual cartridges on disk to a physical tape library connected to the virtual tape device.
D2D The HP D2D Backup Systems product line.
deduplication The feature in which only a single copy of a data block is stored on a device.