Download doc - Common Solutions Group

Scalable Enterprise Storage – NAS Gateway Appliances
Karl Lewis, Storage Administrator, College of Engineering, University of Michigan
([email protected])
What is enterprise storage?
Enterprise storage is a crucial component of modern businesses, relied upon to fulfill the
needs and goals of the business. Enterprise storage provides the critical infrastructure used
by the various desktop and server platforms in the environment.
Wikipedia defines enterprise storage as:
Enterprise storage is the field of information technology focused on the
storage, protection, and retrieval of data in large-scale environments. It
is differentiated from consumer storage in many practical ways, ranging
from the size of the environment to the technologies used.
It goes on to state:
Enterprise Storage has four focus areas:
Storage: (http://en.wikipedia.org/wiki/Data_storage) Online Random access
(http://en.wikipedia.org/wiki/Random_access) storage and protection of data
Backup: (http://en.wikipedia.org/wiki/Backup) Offline Sequential access
(http://en.wikipedia.org/wiki/Sequential_access) storage for data
protection
Archiving: (http://en.wikipedia.org/wiki/Archive) Offline storage of
content, as opposed to data
Disaster Recovery: (http://en.wikipedia.org/wiki/Disaster_recovery)
Protection of data from localized disasters, focused on
business continuity planning
In most enterprises, these four focus areas can be handled within a single hardware/
software solution or within a combination of products from a variety of vendors. The size,
shape and scale of what constitutes “enterprise” changes almost daily, yet three elements
are given the highest precedence in any enterprise storage solution:
Reliability – the capacity to continue to function through a large variety of hardware and
software failures (disk, processor and network failure)
Availability – the capacity to continue to function through hardware/software maintenance
and upgrades (capacity expansion and performance/feature enhancement)
Scalability – the capacity to expand usable resources (number of host connections, CPU,
disk, memory and network) beyond the initial acquisition
The various solutions on the market offer differing levels of these elements, yet every
enterprise solution offers all three of them. Most “enterprise” solutions differentiate
themselves from “SOHO” or “consumer” solutions by offering the highest levels of these
elements.
The Cost of Enterprise Storage
The Computer-Aided Engineering Network (CAEN) of the College of Engineering (CoE)
provides compute laboratory and network access to approximately 8,000 students, faculty
and staff within the many engineering disciplines at the University of Michigan. Before the
year 2000, the Andrew File System (AFS) provided CAEN with the most cost-effective
solution to deliver storage resources to its UNIX-based users, in the form of user home
directories, class directories, collaborative group directories and research space. With the
Michigan – Bobcat Case
page 1
more recent growth of the Windows platform in the laboratory and server space, many
problems with the interaction between the AFS filesystem and the Windows operating
system surfaced. Lacking the filesystem semantics and ACLs of traditional Windows
servers, it was clear to the CAEN Windows administrators that AFS was not a viable storage
solution to grow on or provide customer-facing services from. In 2002, CAEN staff embarked
on a project to augment its AFS environment with a Windows-native solution that could
provide the robust features needed to provision user home directories and other Windows
services.
As the design of our Windows-based solution was taking shape, the question was posed – “if
we were to provide 1GB home directories to the students, faculty and staff of the CoE, how
would that solution appear?” A FibreChannel-based (FC) SAN with traditional servers was
first proposed. Early estimates showed a traditional SAN to be difficult to scale and costly to
manage. A dedicated NAS filer was considered. That solution appeared to be much easier
to manage, but similarly costly and difficult to scale.
A hybrid design, a Dell|EMC NAS/SAN gateway solution, was investigated and chosen for
the final design. This approach uses a SAN to provide the back-end storage, with a NAS
gateway placed in front of it. The back-end could grow to accommodate more storage
non-disruptively. Similarly, additional NAS gateways could be added in front of the storage to
provide additional performance or availability. The NAS gateway could be used both to provide
a home directory service for users and to consolidate other Windows
fileservers – departmental Windows servers, application servers, OS load image servers and
so on. Each consolidated server reduced the number of Windows servers that had to be
patched, secured, managed, maintained and licensed in the environment, freeing up staff
resources for other tasks.
As an additional benefit, other storage services could leverage the SAN back-end created for
the NAS. I/O intensive storage resources such as Exchange and SQL could be moved from
local server-attached disks to a faster Dell storage array that comprised the SAN back-end,
increasing availability and leveraging storage features such as snapshots and replication.
In addition, many integrated features greatly reduced the labor necessary to manage the
solution. Automated, user-accessible point-in-time snapshots allow users to perform
self-service restores from their home directories or project space, with no intervention from
administrative staff. Weekly tape backups of the storage solution to an adjacent building
offer an additional level of protection for user data. Thanks to the snapshots, only two or
three restores per semester have required going to tape.
CAEN looked to use the additional capacity in the Dell|EMC solution to host NAS space for
the various departments. Several departments and groups purchased space
from CAEN for their internal use, receiving the same benefits in performance, availability and
management. Some groups, however, expressed concern over the cost/GB resale rate of
the NAS. Others expressed a need for several TB of storage for various projects,
considerably more space than was available in the solution. These challenges proved difficult to
meet with the existing solution.
An analysis of the NAS recharge rate showed that almost one-third of the cost came from the
hardware, software and labor related to providing tape backup. Even after unbundling the
cost of tape backup, the remaining rate was still considered a high price to pay for data sets
that were too large to keep on local, unprotected workstations.
To better serve the College community, was it possible to provide a highly-available NAS
solution at a much lower cost point?
Low-cost NAS solutions
With that thought in mind, CAEN staff began to investigate many solutions from
industry-leading vendors. Several software-based products were examined, such as Polyserve’s
NAS cluster software and the open-source OpenFiler project. At the time, Polyserve had a
high licensing cost per server (as much as 80% of the cost of a traditional NAS filer) and
OpenFiler had many software limitations in provisioning and disk management. Both
products are installed on traditional Windows or Linux servers, which require regular patching
and maintenance. This added to the total labor cost of the solution – a dedicated staff
member would need to constantly monitor, patch and maintain the servers in the system, a
task that grows with the number of servers. In contrast, the existing NAS gateway needs
relatively little attention and doesn’t require an FTE to constantly monitor it, because of its
self-monitoring and alert generation capabilities.
Using the experiences with the Dell|EMC NAS solution, CAEN also began to investigate
other NAS gateway products, looking for a lower-cost gateway that utilizes a back-end
storage network that is independent of the user-facing, file sharing front-end. This NAS
gateway must provide similar high-availability in the front-end (the ability to failover the
network gateway from one physical server to another), provide NFS and CIFS access for
users, and have good network performance. To control costs, the solution must also
leverage a low-cost disk back-end.
Although no tape backup option was to be offered for this solution, disk-based snapshots
were desired to provide data recovery for recently deleted files. Though snapshots are not a complete
replacement for traditional tape backups, they afford users some protection against
accidentally deleted files. As only a few restores from the Dell|EMC NAS solution have
required recovery from tape, this trade-off seems appropriate. Users who need high availability,
high performance and regular tape backups can find those features on the Dell|EMC
NAS. It is possible, however, to offer one-shot, on-demand backups on this solution for
users who wish to purchase them.
CAEN chose to deploy a pair of ONStor Bobcat 2240 NAS gateway appliances. Each
ONStor 2240 NAS gateway has four GigE network interfaces and two 2Gbps FC interfaces.
The NAS gateway is attached to a pair of low-cost FC switches. The gateways are
configured in an active-active fashion; one gateway can take over for the other in the event
the other gateway suffers a hardware failure. To maximize storage capacity, CAEN chose a
pair of 12TB SATA-to-FC disk arrays to connect to the FC switches for the back-end storage.
These high-density, low-cost disk arrays provide 10TB usable space at RAID6, providing
protection from two simultaneous disk failures in each RAID group. Logical volumes can be
provisioned from the arrays, which the NAS gateway can use to create NFS or CIFS shares.
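The capacity figures above can be checked with a short calculation. This is a minimal sketch: the text gives only the raw (12TB) and usable (10TB) capacity per array, so the disk size and RAID group layout below are assumptions chosen to match those totals, not the actual configuration.

```python
def raid6_usable_tb(disks_per_group: int, disk_tb: float, groups: int = 1) -> float:
    """Usable capacity of RAID6 groups: each group gives up two disks to parity."""
    if disks_per_group < 4:
        raise ValueError("RAID6 needs at least 4 disks per group")
    return groups * (disks_per_group - 2) * disk_tb


# Hypothetical layout (not stated in the text): each 12TB array holds
# 24 disks of 0.5TB, split into two 12-disk RAID6 groups.
per_array_tb = raid6_usable_tb(disks_per_group=12, disk_tb=0.5, groups=2)
total_tb = 2 * per_array_tb  # the deployment uses a pair of arrays

print(f"usable per array: {per_array_tb} TB, total: {total_tb} TB")
```

Under that assumed layout, each group survives two simultaneous disk failures, and the pair of arrays yields the 20TB of usable space quoted for the complete solution.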
The hardware for the complete solution costs less than $100,000 and offers 20TB of
usable space – a hardware cost of less than $5/GB. The labor costs of this solution are very
similar to the cost of managing the Dell|EMC NAS. As more disk storage is added to the
back-end, the resale cost per GB decreases. As more departments or research groups buy
in, or as the amount of storage increases, the resale cost/GB continues to decrease. It
becomes possible to support research projects with massive, short-term data needs –
unused space can be quickly and easily reclaimed for reuse.
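The cost arithmetic above can be sketched as follows. Only the $100,000 ceiling and the 20TB of usable space come from the text; the split between fixed cost (gateways and switches) and per-array cost is a hypothetical illustration of why the rate falls as arrays are added, not actual pricing.

```python
def cost_per_gb(hardware_usd: float, usable_tb: float, gb_per_tb: int = 1000) -> float:
    """Hardware cost per usable GB (decimal units: 1TB = 1000GB)."""
    return hardware_usd / (usable_tb * gb_per_tb)


# Figures from the text: under $100,000 for 20TB usable, so this is an upper bound.
baseline = cost_per_gb(100_000, 20)

# Hypothetical split, for illustration only: $60,000 of fixed cost for the
# gateways and FC switches, plus $20,000 for each 10TB-usable array.
FIXED_USD = 60_000
ARRAY_USD = 20_000
ARRAY_USABLE_TB = 10


def cost_per_gb_with_arrays(n_arrays: int) -> float:
    """Spread the fixed front-end cost over a growing back-end."""
    total_usd = FIXED_USD + n_arrays * ARRAY_USD
    return cost_per_gb(total_usd, n_arrays * ARRAY_USABLE_TB)


for n in (2, 4, 8):
    print(f"{n} arrays: ${cost_per_gb_with_arrays(n):.2f}/GB")
```

Because the gateway and switch cost is paid once, each additional array lowers the average cost per GB toward the cost of the disk alone.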
The approach of deploying an inexpensive, scalable back-end was also chosen to meet the
needs of faculty and research groups that wanted to “purchase disk” and not “lease disk
space”. In one scenario, a faculty member wanted to purchase disk that could be attached to
the Dell|EMC NAS on his grant; the space was needed to facilitate the research. It appeared
to be very difficult to write a grant to “rent disk space” for the research period, instead of
“buying disk”, which could be resold or reused at the end of the period. This solution
facilitates that scenario as well – something that was not possible with other NAS gateways.
Focus Areas of Enterprise Storage
Returning to the four focus areas of Enterprise Storage, we can illustrate how the Bobcat
responds in each of these focus areas.
Storage
The Bobcat offers several compelling advantages over similar products. First, the Bobcat
supports a wide array of vendor storage arrays. High-end storage arrays from HP, IBM and
EMC are supported simultaneously with low-end commodity disk arrays. Customers can mix
and match any combination of vendor products to easily create volumes and migrate storage
from legacy solutions. The NAS gateway is designed to only use storage volumes that are
explicitly placed in its pool, allowing it to coexist with other servers on a SAN, without
disrupting those servers’ access to disk. This allows a unit to extend the usefulness of a
legacy storage product and migrate onto faster, more cost-effective storage hardware.
Moreover, shops with significant storage expertise in a particular product can retain that
experience. The Bobcat provisions storage from a pre-defined pool. A storage
administrator continues to use familiar tools to create volumes on storage arrays, then
presents the volumes to the Bobcat to create CIFS and NFS shares. If the enterprise needs
more storage, administrators can continue to buy the same disk arrays, eliminating the need
for additional training or can choose to buy newer, easier-to-use arrays.
As mentioned before, the ability to move data from array to array simplifies the task of retiring
or repurposing disk arrays attached to the SAN: the administrator simply moves a volume
from one array to another and removes the array when it has been completely emptied and
is no longer needed.
Backup
While CAEN staff opted not to provide tape backups of the NAS gateway, users are still able
to use the automated snapshot features to provide their own self-service restores. At any
level in the directory tree, a user can issue a command to view the available snapshots of the
filesystem and retrieve deleted files instantly. Since these snapshots consume little space,
multiple snapshots can be taken of a filesystem, allowing users to see their files from many
weeks back. Making these snapshots available to users improves their
experience and greatly reduces the need for a dedicated administrator to manage and
maintain such a service.
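A self-service restore of this kind can be sketched as a search across snapshot copies of a share. The text does not give the Bobcat's actual command or snapshot naming, so the `.snapshot` directory name and the one-subdirectory-per-snapshot layout below are assumptions borrowed from common NAS conventions.

```python
from pathlib import Path


def find_snapshot_copies(
    share_root: Path, relative_path: str, snapshot_dir: str = ".snapshot"
) -> list[Path]:
    """Return every snapshot copy of relative_path, ordered by snapshot name.

    Assumes (hypothetically) that snapshots appear as read-only subdirectories
    of share_root/snapshot_dir, each mirroring the share's directory tree.
    """
    snap_root = share_root / snapshot_dir
    if not snap_root.is_dir():
        return []
    copies = []
    for snap in sorted(p for p in snap_root.iterdir() if p.is_dir()):
        candidate = snap / relative_path
        if candidate.is_file():
            copies.append(candidate)
    return copies
```

A user who deleted `thesis/notes.txt` could call `find_snapshot_copies(Path("/home/user"), "thesis/notes.txt")` (a hypothetical path) and copy the most suitable version back into place, with no administrator involved.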
Archiving
Because the Bobcat supports the NDMP data backup protocol, users can request an
archival tape backup on a time-and-materials basis. Such a backup uses the tape
hardware and software that normally back up the Dell|EMC storage. CAEN staff felt this solution would be
satisfactory for faculty with project or grant requirements to write archival backups to tape,
without subjecting all users to the large capital cost required to build a tape solution.
Disaster Recovery
While not an initial design constraint, the Bobcat offers a few solutions for disaster recovery.
First, the Bobcat’s support for the NDMP protocol allows CAEN staff to back up the solution to
another NDMP-enabled NAS filer, over an IP network. This NDMP NAS filer could be
located on campus, within the Midwest region or anywhere on the globe. NDMP, however,
only provides a static backup of existing filesystems. To provide real-time data availability,
synchronous data replication is required.
For an additional license fee, the Bobcat can mirror its data volumes from one NAS gateway
to another Bobcat in another location. The speed of the replication process is limited only by
the speed of the network between the two locations. For institutions on high-speed networks
or Internet2, it is possible to keep large Bobcat volumes in close synchronization over long
physical distance.
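Since replication speed is bounded by the network, a rough estimate of the time to copy a volume follows directly from the link rate. This is a back-of-the-envelope sketch using decimal units; the link speeds and the efficiency factor are illustrative assumptions, not measurements from the CAEN deployment.

```python
def transfer_hours(data_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move data_tb terabytes over a link of link_gbps gigabits/s.

    efficiency is the fraction of the link rate actually achieved (1.0 = ideal).
    Decimal units are used: 1TB = 1e12 bytes, 1Gbps = 1e9 bits/s.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency in (0, 1]")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


# Initial copy of one 10TB array under ideal conditions.
for gbps in (1, 10):
    print(f"{gbps} Gbps: {transfer_hours(10, gbps):.1f} hours")
```

The initial copy of a full array is the expensive step. Once the volumes are in step, only changed data has to cross the link, which is what makes close synchronization over distance practical on a fast network.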
Tradeoffs
Labor
The management and labor savings of IT appliances are well documented in the CAEN
environment. When the adoption of the Dell|EMC NAS solution in 2004 demonstrated that a
small number of individuals could easily manage many TB of storage, it seemed only logical
to repeat this approach with the Bobcat solution. Thanks to the assistance of automated
provisioning and monitoring tools, no additional staff is required; more storage could be
purchased and deployed with no increase in labor. Since the Bobcat supported a type of
disk array already in use at CAEN, no new array administration skills were needed to quickly
bring the solution online.
However, this meant purchasing more of a particular vendor’s disk arrays. While the arrays could
be easily replaced over time, relying on many separate arrays reduces the total efficiency and
potential reliability of the solution. Each array must be managed and maintained separately. Even though each new
array adds to the total capacity of the solution, CAEN staff must go through a lengthy
process to install and configure each array before it can be put into service. Similarly, each
array must be monitored and supported individually – as the number of arrays grows, the
added complexity makes troubleshooting more difficult. When the number of arrays grows
beyond a few, the number of FC connections makes diagnosing SAN connectivity issues
extremely tedious and time-consuming; the odds of a lengthy service outage grow quickly.
A monolithic RAID array, a solution with one RAID controller and many shelves of disks, is a
good alternative. However, the high capital costs for these types of arrays will almost always
be greater than the cost to purchase many smaller arrays.
Platform Support
The decision was made very early to support Windows via the CIFS protocol and UNIX via
NFS version 3. Pundits argued that NFS version 4 was an essential security requirement,
since its use of Kerberos authentication prevented unauthorized users from accessing files.
While NFSv4 is clearly more secure than NFSv3, fewer NFS clients support NFSv4 than
NFSv3. The Bobcat does, however, limit access to NFS shares to a specified list of IP
addresses, offering a small improvement in security. ONStor has not committed to support
NFSv4 in the product, waiting to see if customers demand it.
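The IP-based restriction described above amounts to an allow-list check on the client address. The sketch below illustrates the idea with Python's standard `ipaddress` module; it is not the Bobcat's implementation, and the addresses are reserved documentation ranges rather than real CAEN hosts.

```python
import ipaddress


def is_client_allowed(client_ip: str, allowed: list[str]) -> bool:
    """Return True if client_ip matches any entry in the allow list.

    Entries may be single addresses ("192.0.2.10") or CIDR ranges
    ("198.51.100.0/24").
    """
    addr = ipaddress.ip_address(client_ip)
    return any(
        addr in ipaddress.ip_network(entry, strict=False) for entry in allowed
    )


# Hypothetical export list for one NFS share.
export_list = ["192.0.2.10", "198.51.100.0/24"]

print(is_client_allowed("198.51.100.42", export_list))
print(is_client_allowed("203.0.113.5", export_list))
```

A check like this identifies hosts rather than users, so anyone with access to an allowed machine (or able to spoof its address) passes it. That is why the improvement over open NFSv3 access is small compared with Kerberos authentication.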
Similarly, with the growth of the Macintosh platform in the CoE, some Macintosh users
requested support for the Apple-native filesharing protocol, AFP. As with NFSv4, AFP is not
on ONStor’s product support roadmap. CAEN staff recommended Mac OS X users connect
to the Bobcat using the CIFS protocol, which is supported natively on Mac OS X. While
CIFS won’t provide the same user experience as AFP, Mac OS X users can still browse files
and recover files from snapshots.