Overview
Advances in
Data Storage Technologies
?
?
Opening thoughts
Main-stream technologies
?
?
Thomas M. Ruwart
University of Minnesota
Digital Technology Center
Intelligent Storage Consortium
April 23, 2004
Opening Thoughts
?
?
?
Technologies – The underlying components
used to build products using a given
architecture
Architectures – The way technologies are put
together to solve a problem
Applications – define the scope of
requirements and associated problems to be
addressed
Storage
?
Disk Drives
?
?
?
?
?
?
ATA Disk Technology
SCSI Disk Technology
Tape Technology
DVD (optical) Technology
MEMS
Solid State
?
?
?
Storage – where data resides when it is not being manipulated or
moved around
Transports – how data is moved between other components and
how components are physically connected together
Protocols – the “language” that components use to talk to each
other
Software – the control of what happens throughout the system
Closing thoughts
An orthogonal thought
?
?
?
There are many interesting “technologies”
that can be incorporated into “products”
Products are what sells
This presentation describes
?
?
Past and current “products” and associated
technologies
The evolution of various technologies and
architectures that may or may not become
products
Disk Drives in General
?
?
Definition
?
“Winchester” disks have the rotating rigid
media and read/write heads plus
actuator enclosed in an air-tight case
Current Status
?
Disk drives have been around since
1957 (IBM RAMAC 305)
?
Areal density is about 60 Gbit/in²
?
Current capacities are in the 300GB
range for 3.5-inch ATA-class drives
?
Rotation speeds are at 7200 RPM for
ATA, 15000 RPM for SCSI
?
Form factors are 3.5-inch for
desktop/enterprise, 2.5-inch for mobile,
and some 1-inch
?
Interfaces include ATA, SATA, Parallel
SCSI, and FC
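As a rough illustration of why rotation speed matters, the average rotational latency is the time for half a revolution. The sketch below applies that to the RPM figures above. The function name is illustrative, and seek and transfer times are deliberately ignored.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: the time for half a revolution, in ms."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


if __name__ == "__main__":
    # RPM values taken from the slide: 7200 for ATA, 15000 for SCSI
    for label, rpm in (("ATA", 7200), ("SCSI", 15000)):
        print(f"{label} at {rpm} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

Under these assumptions the 15000 RPM drive waits roughly half as long, on average, for the target sector to arrive under the head.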
Disk Drives continued…
?
HDD Technologies for the future
Technology Evolution
?
Perpendicular Recording
Will operate in the 100-200 Gbit/in² areal density range
2 to 4 years out
?
?
[Figure: Patterned media, 33 nm bits. Bruce Terris, HGST]
New media types
?
?
Patterned Media
Tilted Perpendicular media
Self-organized media
?
?
Smaller Form Factors – move from 3.5-inch to 2.5-inch form factors
?
?
[Figure: Heat-assisted magnetic recording uses both laser and field to record. T. McDaniel, Seagate]
Lower power requirements
Higher manufacturing yields
Higher areal densities
Higher RPM drives == lower access latency
?
?
?
Serial interfaces
?
Serial ATA (SATA)
Serial Attached SCSI (SAS)
Fibre Channel (FC) – 4 and 10Gbit/sec
?
?
?
Self-organized magnetic arrays
Protocols
?
Object-based Storage Device
?
D. Weller, Seagate
Courtesy of INSIC and Tarnotek, Inc.
Capacity Growth: Sustainable?
[Figure: HDD Capacity (GB) vs. Time, 95 mm desktop, 7,200 rpm, Jan-98 to Jan-09]
Precipitous decline in $/GB
[Figure: Cost per Gigabyte ($/GB) vs. Time, 95 mm desktop, 7,200 rpm, Jan-98 to Jan-09]
[Chart annotations: exponential fit y = 6E-28 e^(0.0018x), R² = 0.9626; -44%/year; series for HGST, Maxtor, Seagate, WDC, Seagate’s plans, All HDD, and Early HDD; INSIC 1 Tb/inch² areal density demo goal]
(C) 2003 TarnoTek. Courtesy of INSIC and Tarnotek, Inc.
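To make the "-44%/year" label concrete, here is a minimal sketch of what a constant annual decline implies. It assumes the label refers to the $/GB trend, which the chart residue suggests but does not state outright. The function names and the $1.00/GB starting point are hypothetical.

```python
import math

ANNUAL_CHANGE = -0.44  # the "-44%/year" label; assumed to describe $/GB


def years_to_halve(annual_change: float) -> float:
    """Years for a quantity to halve at a constant fractional annual change."""
    return math.log(0.5) / math.log(1.0 + annual_change)


def projected_cost(cost_now: float, years: float,
                   annual_change: float = ANNUAL_CHANGE) -> float:
    """Cost after `years` of compounding at `annual_change` per year."""
    return cost_now * (1.0 + annual_change) ** years


if __name__ == "__main__":
    print(f"Halving time: {years_to_halve(ANNUAL_CHANGE):.2f} years")
    # Hypothetical starting cost of $1.00/GB, for illustration only
    for year in range(4):
        print(f"year {year}: ${projected_cost(1.00, year):.3f}/GB")
```

At that rate the cost per gigabyte would halve in a little over a year, provided the trend holds.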
The Future of Hard Disk Drive Technology
Disk Arrays in General
Lab Demos: Possible HDD Areal Density Progression
[Figure: Laboratory demonstrations of areal density (Gb/in²), log scale from 1.0 to 100000.0, vs. date from Jan-90 onward. Annotations: CAGR trend lines (labels include 30% and 50%), highest density in products, "now", the 1 Terabit per inch² goal, and candidate technologies: perpendicular recording, heat-assisted mag recording, patterned media (?), self-organized arrays (?)]
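The areal density chart raises the question of how long it would take to reach the 1 Terabit per inch² goal. The sketch below solves the compound-growth equation for the number of years. It takes the "about 60 Gbit/in²" figure from the Disk Drives slide as the starting point. The two growth rates are assumptions chosen for illustration, not values read reliably off the chart.

```python
import math


def years_to_reach(current: float, target: float, cagr: float) -> float:
    """Years to grow from `current` to `target` at a compound annual rate."""
    return math.log(target / current) / math.log(1.0 + cagr)


if __name__ == "__main__":
    current_gbit_in2 = 60.0    # "about 60 Gbit/in²" (Disk Drives in General)
    goal_gbit_in2 = 1000.0     # the 1 Terabit per inch² goal
    for cagr in (0.30, 0.50):  # assumed growth rates, for illustration only
        years = years_to_reach(current_gbit_in2, goal_gbit_in2, cagr)
        print(f"{cagr:.0%} CAGR: about {years:.1f} years")
```

The result is sensitive to the assumed rate, which is one reason the slide poses the progression as "possible".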
Definition
?
RAID – Redundant Array of Independent (Inexpensive)
Disks
?
Aggregation of disk drives to operate as a large single
storage device
?
Used to improve reliability, availability, serviceability,
and performance through
Current Status
?
Disk arrays have been around since 1988
?
Interface is primarily 2Gbit Fibre Channel
Technology Evolution
?
Recent developments in MAID – Massive Arrays of
Independent (Inexpensive) Disks
?
MAID takes advantage of smaller form factor disk
drives – lots of them
?
Intended to address problems associated with multiple
drive failures
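The slides do not say which mechanism provides the reliability. One common approach, used by parity-based RAID levels such as RAID 5, is to store the XOR of the data blocks so that any single lost block can be rebuilt. The following is a minimal sketch under that assumption. The function names are illustrative and not taken from any product.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the single missing data block from the survivors and parity."""
    return xor_blocks(surviving + [parity])


if __name__ == "__main__":
    data = [b"disk", b"arra", b"ys!!"]   # one block per data drive
    parity = xor_blocks(data)            # stored on a separate drive
    # Simulate losing the second drive and rebuilding its block
    recovered = rebuild_missing([data[0], data[2]], parity)
    print(recovered == data[1])
```

A single parity block protects against one failure only. Surviving multiple simultaneous drive failures, the problem the MAID bullet mentions, needs additional redundancy.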
Courtesy of INSIC and Tarnotek, Inc.
ATA/IDE Disk Technology
?
?
?
ATA/IDE Disk Technology (cont.)
Definition
?
Inexpensive interface used for consumer-grade disk drives
?
ATA stands for AT Attachment, IDE stands for Integrated Drive
Electronics. The two terms are used interchangeably
?
Primarily 3.5-inch and 2.5-inch form factors (5.25-inch for optical devices)
Current Status
?
ATA disk drives are one half to one third the cost of equivalent capacity
SCSI disks and can result in lower overall equipment costs for large
deployments
?
ATA cost levels make it an attractive alternative or augmentation to tape
Technology Evolution
?
Parallel ATA interface technology has been around for many years and
has matured to the point where it is very commonplace.
?
ATA disk technology is as mature as SCSI drive technology but is a
very different drive technology.
?
Serial ATA is the next evolutionary step in ATA technology and is now
available in production quantities.
?
Serial ATA improves upon Parallel ATA performance, addressing, and
other limitations while still maintaining the cost effectiveness of Parallel
ATA.
SCSI Disk Technology
Comments
?
ATA disk drives are NOT simply a SCSI disk drive with a different
interface. ATA (consumer or personal storage) and SCSI (enterprise-class
storage) are designed with very different goals in mind.
?
?
?
?
SCSI interface technology has been around for almost 20 years and has matured to
the point where it is the disk drive interface of choice for enterprise-class storage.
Serial Attached SCSI (SAS) is the next evolutionary step in SCSI technology and
will be available in production quantities in late 2004 or early 2005.
SAS provides some electrical compatibility with SATA.
?
Technology Availability
SAS has been announced by Seagate on the new 2.5-inch enterprise-class disk
drive.
Reference Web Sites
?
?
?
?
?
SCSI – www.t10.org
Fibre Channel – www.t11.org
iSCSI – www.ietf.org
?
Tape Technology (cont.)
?
                   Generation 1   Generation 2   Generation 3   Generation 4
Native GB          100            200            400            800
Native MB/S        10-20          20-40          40-80          80-160
Recording Method   RLL 1,7        PRML           PRML           PRML
Media              MP             MP             MP             Thin Film
Available          2000           2002           2004           2006
The Winchester disk industry is moving toward a 2.5-inch form factor very
similar to the disk drives in laptop computers. One significant implication is
that there will be a much larger number of disk units (roughly four times as
many) to manage in an overall installation. This move toward smaller form
factors is motivated by higher areal densities and media yield.
Reference Web Sites
?
Serial ATA – www.serialata.org
Definition
?
Magnetic recording
?
Linear and helical scan
Current Status
?
Used for backup and long-term data archival
?
High density, low cost, very durable
?
High latency is a potential issue
Technology Evolution
?
Magnetic tape well understood – it has been around since 1953
?
Magnetic tape follows disk density and performance curves
?
LTO (Linear Tape Open) getting a lot of market share
? Drive consortium of IBM, Seagate and HP
? Tape manufacturers include Maxell, Fujifilm, TDK, Imation, Sony,
Emtec (BASF), Verbatim, …etc.
? www.lto.org
?
IBM 3490 tape technology is the gold standard in enterprise-class tape
Comments
?
Technology Refresh is a significant issue with large data archives
DVD
Technology availability
? LTO Roadmap
1
ATA disks are designed to minimize cost, maximize volume, and are intended to
operate as single units.
SCSI disks are designed to maximize performance and reliability as well as
being able to operate in arrays.
?
SCSI disk drives are principally used in disk arrays and in applications that require
consistent high bandwidth, lower latency, and higher reliability than can be provided
by ATA/IDE disk drives.
Technology Evolution
?
?
?
High-performance interface used for enterprise-class disk drives
SCSI stands for Small Computer Systems Interface
Primarily 3.5-inch and 2.5-inch form factors (5.25-inch for optical devices)
Current Status
?
Technology Availability
?
ATA disk interfaces are available on both standard Winchester Hard Disks
and on CD and DVD devices.
?
ATA disk drives of various capacities and form factors are available from
multiple manufacturers: Seagate, Maxtor, Western Digital, Hitachi Global
Storage (formerly IBM Storage), Fujitsu, NEC, and Samsung
Tape Technology
Definition
?
?
?
?
18 - 24 Months between generations
?
Definition
?
Digital Versatile Disc
Current Status
?
Long term data archival
?
High latency data
Technology Evolution
?
Capacity
? Now - 4.7GB per side, 9.4GB per disk double sided
? Soon - 27GB per side (blue-laser)
? Ultimately - 50GB per side
?
Transfer rates
? Current write = 3.3MB/sec, read = 10.8MB/sec
? Assume technology will advance (e.g., like CD-RW)
Technology Availability
?
Drives and robotics are common
Comments
?
Long media life, reasonably durable
?
Exploits the consumer market for media
MEMS-based Storage Technology
?
Definition
Not available for another few years
?
Reference website
?
?
Fibre Channel
Gigabit Ethernet
TCP Offload Engines (TOEs)
System Area Networks
?
?
?
Gigabit Ethernet
?
?
?
?
Consumer products
Fibre Channel
?
?
Technology Availability
?
?
?
Read/write/re-write
Slow write speeds, moderate read speeds
Driven by the consumer electronics market
Continue to follow the density curves of integrated circuits
Factor of 10-100 times the price/MB of magnetic storage
www.pdl.cmu.edu
?
?
CompactFlash (CF)
Smart Media (SM)
Secure Digital Memory Card (SD)
Memory Stick (Sony)
USB Memory devices
Technology Evolution
?
Research at CMU
Transports
?
Most popular types
?
?
?
Non-volatile for permanent data storage
Current Status
?
Technology Availability
?
?
Definition
?
?
Still in the technology demonstration mode
Technology Evolution
?
?
?
Current Status
?
?
MEMS - Micro Electro Mechanical Systems
Solid State Storage Technology
Definition
?
Ethernet at speeds of 1,000 (1GigE) to 10,000 (10GigE) megabits per
second
Current Status
?
Ethernet is the network transport of choice
?
1GigE is the overwhelming favorite for high-speed LANs
?
10GigE close behind
Technology Evolution
?
IEEE 802.3 with 802.3ae defining the 10,000 Mbit/s modifications (6/17/02)
Technology Availability
?
Cisco Catalyst 6500 Serial 1550nm 10 Gigabit Ethernet Module
?
Intel® PRO/10GbE LR Server Adapter
?
Intel® TXN17401 Optical Transceiver
Reference Web Sites
?
www.10gea.org/index.htm
Comments
?
10GigE Products are currently shipping
Definition
?
High speed data transport (physical, encoding, framing protocol)
?
Extensible to miles (100km) with special GBICs and special 9-micron single-mode fiber
Current Status
?
Currently the Storage Area Network Interconnect of choice
Technology Evolution
?
1 and 2Gbit shipping on disk drives and arrays
?
4Gbit may be shipping on disk drives in near future
?
10Gbit/sec standard is mature and is intended for disk arrays
Vendors/Products
?
Multiple
General Reference Web Sites
?
Fibre Channel Industry Association (www.fibrechannel.org , www.t11.org)
Comments
?
Synchronization of HBAs, switches and storage
?
Technical Committee T11 (www.t11.org)
TOE
?
?
?
Definition
?
A TOE (TCP Offload Engine) is a chip or a board that handles the TCP protocol
stack without utilizing host CPU resources
Current Status
?
Reduces processing load on nodes that are connected via GigE and handle
heavy traffic through this interface
?
Some TOEs have iSCSI protocol engines that accelerate the SCSI command
protocol processing for iSCSI storage devices
Technology Availability
?
Adaptec AIC-7211 1Gb ASIC with full TCP/IP offload
?
Lucent TA1000 manufactured by Intel
System Area Networks
?
?
Definition
?
A network used to connect nodes together
within a single system (computer room
environment) that has the following
operational characteristics
? Very Low Latency (~1 µsec )
? High Bandwidth (>1Gigabit/sec)
? Support for Atomic Operations
? Remote DMA capability (RDMA)
? Low Overhead
? Allow the construction of NUMA-like
systems with *standard* hardware
Current Status
?
System Area Networks are being employed
as the interconnect for compute and
storage clusters
System Area Networks (cont.)
?
SAN Switch
System Area Networks (cont.)
?
Technology Availability
? InfiniBand is available from a variety of companies. It is best to
go to the InfiniBand Trade Association website to see who
these companies are and what categories their products fall
under (e.g., switches, NICs, software, etc.)
? HyperTransport is slightly ahead of InfiniBand on the maturity
curve but also has slightly different applicability.
? Rapid I/O is on the same track as HyperTransport and is
available from several vendors.
? VIA hardware adapters are available in different forms from
Qlogic, Emulex, Intel, and Troika Networks
? Quadrics is a proprietary system area network and all the
adapters and relevant software are available from Quadrics.
? Myrinet hardware (NICs and Switches) is available from
Myricom; however, *all* the associated software, interface
protocols, and APIs are publicly available.
Protocols
System Area Networks (cont.)
?
Comments
?
Evolving storage system architectures will incorporate System Area
Networks
?
Storage “devices” will become peers on the System Area Network
?
Reference Web Sites
?
?
?
?
?
?
?
iSCSI
OSD
NFS/CIFS
InfiniBand – www.infinibandta.org
Hypertransport – www.hypertransport.org
Virtual Interface Architecture (VIA) –
www.intel.com/design/servers/vi
Myrinet – www.myri.com
Quadrics – www.quadrics.com
iSCSI
?
?
Technology Evolution
? Two companies, Myricom and Quadrics, have been producing
system area networks for a number of years with some success.
? Several other companies have more recently come up with
products based on the VIA (Virtual Interface Architecture), but these
devices support only RDMA and not atomic operations.
? The most widely accepted system area network technology is
InfiniBand with more than 200 companies building various pieces
of that technology.
? Other companies such as AMD and Motorola have developed
competing technologies, HyperTransport and Rapid I/O
respectively, originally intended to be restricted to within the
confines of a single “box” but have since defined connectors and
cables to allow them to interconnect boxes as well.
?
?
Definition
? Internet Small Computer Systems Interface (iSCSI) protocol
? Encapsulated SCSI over IP
Current Status
? Important Protocol for block storage applications
? Can be thought of as an inexpensive alternative to Fibre Channel
Technology Evolution
? Number of early release products
? Some driver based
? Some with specialized hardware, (TCP offload)
? Limited commercial success
? Security and discovery remain a problem
? Draft IETF iSCSI
? www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-14.pdf
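To illustrate the idea of "encapsulated SCSI over IP", the sketch below builds a standard 10-byte SCSI READ(10) command descriptor block and wraps it in a toy frame that could be sent over a TCP byte stream. The 4-byte length-prefix framing is purely hypothetical and is not the iSCSI PDU format. The real header is defined by the IETF specification and carries considerably more state. The function names are illustrative.

```python
import struct

READ_10_OPCODE = 0x28  # SCSI READ(10) operation code


def build_read10_cdb(lba: int, num_blocks: int) -> bytes:
    """Build a 10-byte SCSI READ(10) command descriptor block (CDB)."""
    # Fields: opcode, flags, 32-bit LBA, group number,
    # 16-bit transfer length, control byte (all big-endian)
    return struct.pack(">BBIBHB", READ_10_OPCODE, 0, lba, 0, num_blocks, 0)


def encapsulate(cdb: bytes) -> bytes:
    """Wrap a CDB in a toy frame: 4-byte big-endian length, then the CDB.

    Hypothetical framing for illustration; NOT the real iSCSI PDU layout.
    """
    return struct.pack(">I", len(cdb)) + cdb


def decapsulate(frame: bytes) -> bytes:
    """Extract the CDB from a toy frame produced by `encapsulate`."""
    (length,) = struct.unpack(">I", frame[:4])
    return frame[4:4 + length]


if __name__ == "__main__":
    cdb = build_read10_cdb(lba=2048, num_blocks=8)
    frame = encapsulate(cdb)
    print(cdb.hex(), len(frame), decapsulate(frame) == cdb)
```

The point of the sketch is only that the SCSI command travels unchanged as payload. The transport underneath can then be ordinary TCP/IP over Ethernet rather than Fibre Channel.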
iSCSI (cont.)
?
?
?
OSD – Object-based Storage Devices
Technology Availability
? Adaptec ASA-7211 iSCSI Adapter (Mid 2002)
? Intel Pro 1000 T IP
? www.intel.com/network/connectivity/products/iscsi/index.htm?iid=ipp_home+netcon_iscsi&
Reference Web Sites
? iSCSI_network_storage.pdf
? www.ietf.org/html.charters/ips-charter.html
Comments
? Replaces need for Fibre Channel, if accepted by customers
and industry
? Uphill battle - many camps for and against.
What is OSD?
What OSD is NOT
Object-based Storage Devices – An Enabling Technology
Grew out of the Network Attached Secure Disks (NASD) project at CMU
?
A flexible and powerful protocol used to communicate with storage
devices
Proposed as a protocol extension to the SCSI command set
?
?
Actively being pursued by the OSD Technical Working Group in the
Storage Networking Industry Association (SNIA)
?
It is a natural step in the evolution of storage interface protocols
?
For some, however, it is very new and very different
[Figure: timeline of storage interface protocols: ST506, SMD, SCSI, FC SCSI, SCSI OSD; date labels as extracted: 1902, 1985, 1990, 1998, 2002?, 200X]
Definition
Object-based Storage Devices – A protocol for accessing data on storage
devices
Current Status
OSD can have a significant impact on helping to solve many of the issues
that arise in building scalable, high performance storage systems
Technology Evolution
First release of the T10 (SCSI) OSD standard specification has been
submitted to the T10 committee
Technology Availability
Nothing commercially available yet – some products from some companies
are available
Many large storage companies looking into it
An OSD reference code is available at
http://www.sourceforge.net/projects/intel-iscsi
It is not intended or expected that the object
abstraction be a complete file system
There is NO notion of
Naming
Hierarchical relationships
Streams
File system style ownership access control
The omitted features are assumed still to be the
responsibility of the OS file system
OSD System Architecture
The General Application:
Storage Architectures Today
DAS Architecture
OSD Architecture
I/O Application
File System
I/O Application
Storage Device
Interconnect
Direct Attached Storage
(blocks)
I/O Application
I/O Application
File System
User Component
File System
User Component
I/O Application
File System
File System
Storage Device
Network Attached Storage
(files)
Interconnect
Storage Device
Storage Area Network
(blocks)
Architecture defined by location of
storage system & devices
File System
Storage Component
SCSI
Block Storage
Device
SCSI
File System
Storage Component
Block Storage
Device
File System Components
?
User File System Component
?
?
?
?
I/O Application
I/O Application
File System
User Component
File System
User Component
Data Transfer
SCSI
Security
File System Storage Component
?
?
?
Free space management
Storage allocation for data entities
Attribute Interpretation
File System
Storage Component
I/O Application
Block Storage
Device
What problems are being solved?
File System
User Component
Depends on the APPLICATION
Different people are trying to solve different problems for different reasons
Storage Device Utilization
Data Management
Cost
Reliability
Device Management
Performance
Security
Availability
Maintainability
Extensibility
Restate the question: What problems CAN be solved with OSD?
Intelligent Storage
?
?
Definition
? Assume an Object-based Storage Device
? Storage Device is “aware” of the data objects it stores
? An Intelligent storage device can manipulate its data objects and
potentially the “contents” of the data object
Current Status
? Pre-Competitive research
? Several organizations involved
?
?
?
?
?
University of Minnesota DTC Intelligent Storage Consortium
CMU Parallel Data Lab
UC Santa Cruz Storage Research Center
UCSD Center for Magnetic Recording Research
Technology Evolution
? Intelligent Storage is a natural evolution of OSD.
File System
Storage Component
Block Storage
Device
What CAN be addressed by OSD
?
?
OSD Manager
Object Location
Security
?
Hierarchy Management
Naming
User Access Control
Data Properties (Attributes)
How OSD works
?
?
?
?
?
Improved storage management
? Self-managed, policy-driven storage (e.g., backup, recovery)
Improved device and data sharing
? Shared devices and data across OS platforms
Improved storage performance
? Hints, QoS, Differentiated Services
Improved scalability (and not just capacity)
? Of performance and metadata (e.g., free block allocation)
Current block-based access protocols and associated file systems
are 30 years old (that’s 210 in dog-years).
OSD has the potential to make a significant impact on the
Extensibility of a Storage System Architecture
Why Intelligent Storage?
?
From the storage device manufacturer and
storage vendor’s perspective
?
?
?
More room to innovate and differentiate storage
devices like disk drives
Increase margins – price storage devices based
on capability not simply capacity
From the User’s perspective
?
?
Increase in capability more specific to the User’s
application space
Easier to manage
Things that need to happen
NFS/CIFS – Network File System / Common
Internet File System
?
?
?
?
Standards – OSD Interface to storage
devices
Standards – Runtime / execution
environment
Standards – API from Application to the
Intelligent Storage System
Software
?
?
?
?
?
?
?
DAS/SAN/NAS
File systems
Heterogeneous Shared File Systems
Hierarchical Storage Management (HSM)
Storage Resource Management
Storage Management
Virtualization
File Systems
?
?
?
DAS/NAS/SAN
Local file systems that use Direct Attached Storage or dedicated storage on a SAN
Network File Systems that subscribe to the NFS or CIFS protocol
Shared File Systems that operate on a SAN and allow for concurrent read/write access
to files with all other host computer systems on the SAN
All these file systems unless otherwise noted are “block-based” file systems – a
30-year-old technology
Block-based file systems require the host computer to manage the free space
on the storage as well as the allocation of storage blocks to files
Block-based file systems have difficulty scaling particularly in a truly shared
environment
Object-based File Systems assume that free-space management and space
allocation functions are delegated to the Object-based Storage Devices
Object-based File Systems scale far better than block-based file systems
Object-based File Systems enable the exploitation of Intelligent Storage Devices
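The difference in responsibility between the two models can be sketched as follows. In the block-based case the host file system keeps the free list and the file-to-block map. In the object-based case the device decides placement and the host only holds object IDs. All class and method names here are hypothetical. They are not the T10 OSD command set, and the in-memory dictionaries merely stand in for real media.

```python
class BlockDevice:
    """Block storage: the device only reads and writes numbered blocks."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self.blocks: dict[int, bytes] = {}

    def write_block(self, lba: int, data: bytes) -> None:
        self.blocks[lba] = data

    def read_block(self, lba: int) -> bytes:
        return self.blocks[lba]


class HostFileSystem:
    """Block-based file system: the HOST tracks free space and allocation."""

    def __init__(self, device: BlockDevice) -> None:
        self.device = device
        self.free = list(range(device.num_blocks))  # host-managed free list
        self.files: dict[str, list[int]] = {}       # host-managed block map

    def write_file(self, name: str, chunks: list[bytes]) -> None:
        lbas = [self.free.pop(0) for _ in chunks]   # host allocates blocks
        for lba, chunk in zip(lbas, chunks):
            self.device.write_block(lba, chunk)
        self.files[name] = lbas

    def read_file(self, name: str) -> bytes:
        return b"".join(self.device.read_block(lba) for lba in self.files[name])


class ObjectStorageDevice:
    """Object storage: the DEVICE allocates space; the host sees object IDs."""

    def __init__(self) -> None:
        self._objects: dict[int, bytes] = {}
        self._next_id = 1

    def create(self, data: bytes) -> int:
        object_id = self._next_id
        self._next_id += 1
        self._objects[object_id] = data  # placement is the device's concern
        return object_id

    def read(self, object_id: int) -> bytes:
        return self._objects[object_id]


if __name__ == "__main__":
    fs = HostFileSystem(BlockDevice(num_blocks=8))
    fs.write_file("a.txt", [b"he", b"llo"])
    print(fs.read_file("a.txt"), fs.files["a.txt"])

    osd = ObjectStorageDevice()
    oid = osd.create(b"hello")
    print(osd.read(oid), oid)
```

In a shared environment every host using the block device would have to agree on the free list and block map, which is one way to see the scaling difficulty noted above. With the object interface that bookkeeping stays inside each device.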
These are more architectural terms than technologies
DAS – Direct Attached Storage
NAS – Network Attached Storage
SAN – Storage Area Networks
Current Status
?
DAS is the oldest and most common storage interconnect
NAS is a generic term for NFS or CIFS
SAN is the architectural interconnect used to physically share storage devices among
several host computer systems
DAS and SAN imply block-based storage access protocols such as SCSI or ATA
NAS implies a file-based access protocol
Technology Evolution
?
?
?
DAS has been around since the beginning of time
NAS became widely used with NFS in the mid 1980’s
SAN has only been around since about 1997 and is still somewhat quirky
Heterogeneous Shared File Systems
There are several types of file systems
?
Definition
?
?
?
Definition
?
NFS and CIFS are file sharing protocols
Current Status
?
NFS and CIFS are the current standard protocols used for file sharing
Technology Evolution
?
NFS version 3 is the current generally available release.
?
NFS version 4 is under development at the University of Michigan and
contains many enhancements that will allow it to exploit OSDs.
?
CIFS is the Microsoft answer to NFS – it essentially does the same thing
as NFS
?
CIFS evolution is proprietary to Microsoft
Technology Availability
?
NFS v3 is available for all OS and platforms
?
CIFS comes with all MS operating systems
?
?
Definition
?
Permits simultaneous file sharing among different (1) Operating Systems
and (2) multiple computers at mount point level
Current Status
?
Key foundation technologies that will ultimately support seamless,
geographical data sharing
?
Allows a site to phase in (and out) different client and/or processing
systems without affecting the data storage
?
Allows for the growth of data storage subsystems without affecting the
client and/or processing systems
Technology Evolution
?
Products on the market for several years
?
Still acclimating to 24x7 operational, ‘user’ heavy environments
?
None of the file systems fully supports the full range of Operating Systems
?
Debate over centralized versus distributed metadata
?
Scalability remains a question
?
Failover sometimes difficult
Heterogeneous Shared File Systems (cont.)
?
Technology Availability
? Several proprietary, somewhat heterogeneous file systems
?
?
?
?
?
?
Funded by the DoE (ASCI Path Forward)
GPL’d solution in development and available from Cluster File
Systems, Inc (Lustre)
?
?
?
Definition
? Storage administration, capacity planning, monitoring, etc.
Current Status
? Movement towards centralized control of storage
Technology Evolution
? Evolving but disjointed
Technology Availability
? Multiple products by multiple vendors
?
?
?
So, What is SMIS?!
?
?
?
?
?
?
?
Based on CIM/WBEM and SNIA Shared
Storage Model
Provides common interface to SAN resources
Services include discovery, monitoring,
configuration, security, capacity planning, …
Bluefin says how we manage CIM via WBEM
Solves the problem of multi-vendor SAN
interoperability
Traditional HSMs mature
SAN HSMs emerging
Technology Availability
?
?
?
Multiple storage tiers likely given mix of short latency/long archive requirements
Technology Evolution
?
?
Policy-based data migration between storage elements
Current Status
?
Numerous HSM vendors/products
SAN HSMs
?
ADIC’s StorNext SAN HSM now shipping
?
Tivoli’s SANergy™/Sun QFS/SAM-FS
Comments
?
Most interesting and most relevant are emerging Disk-to-Disk systems and storage
over IP oriented companies
?
Nexsan Technologies – www.nexsan.com
?
FalconStor Software – www.falconstor.com
?
HSM has been around for many years but has always had trouble getting
traction
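The "policy-based data migration between storage elements" in the HSM definition can be pictured with a very small sketch. It assumes a two-tier system (disk and tape) and a single age-based rule. The record fields, tier names, and 90-day threshold are all hypothetical, and real HSM products apply much richer policies.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    name: str
    days_since_access: int
    tier: str = "disk"


def apply_migration_policy(files: list[FileRecord],
                           max_idle_days: int = 90) -> list[str]:
    """Move files idle for longer than `max_idle_days` from disk to tape.

    Returns the names of the files that were migrated.
    """
    migrated = []
    for record in files:
        if record.tier == "disk" and record.days_since_access > max_idle_days:
            record.tier = "tape"
            migrated.append(record.name)
    return migrated


if __name__ == "__main__":
    catalog = [FileRecord("results.dat", 10), FileRecord("old_run.dat", 200)]
    print(apply_migration_policy(catalog))
    print([(r.name, r.tier) for r in catalog])
```

The same shape extends to more tiers, which is where the "multiple storage tiers" remark on the slide would come in.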
What is SMIS?
?
The Problem
? Too many management infrastructures!
Simple Network Management Protocol for networks
Desktop Management Interface for desktops
Common Management Information Protocol for telco
System Management BIOS for motherboard/BIOS vendors
Alert Standard Format for system alarms, …
Non-interoperable models, frameworks and policies
?
McData’s SANavigator
AppIQ
Comments
? Getting the products to gel to provide a unified, cohesive view
of storage
? SMIS (a.k.a. Bluefin) is a new standard that is getting
significant attention
Definition
?
?
Reference Web Sites
? Lustre – www.lustre.org
? SNFS – www.adic.com
? CxFS – www.sgi.com
Storage Resource Management (SRM)
?
?
GFS by Red Hat (formerly Sistina), originally GPL’d but has
since taken on a proprietary course
Lustre, currently under development as GPL
?
?
Tivoli’s SANergy™
ADIC’s CVFS now called SNFS
SGI’s CxFS
Hierarchical Storage Management (HSM)
The model describes what you are managing.
The framework allows you to manage the model.
And policies say how you can manage the model.
We need a unified management infrastructure for the enterprise
SMIS – Storage Management
?
?
?
Definition
? SMIS is an object-oriented messaging interface that links
distributed management applications (clients) with device
management support (agents) to discover, manage, and
control devices of any kind
? A CIM/WBEM-based SAN management framework
? A SNIA-based standard
Current Status
? Will significantly enhance the ability to manage the entire
heterogeneous storage environment independent of hardware
or software vendor or manufacturer
Technology Evolution
? Started 5 years ago in SNIA
? Taken out of SNIA by the Partnership Development Process –
a consortium of 17 companies
? Rev 1 of the SMIS spec was brought back into SNIA June 2002
for review and approval by all the Technical Working Groups
SMIS – Storage Management (cont.)
Storage Virtualization
?
?
?
Technology Availability
? Spec released to the public September 2002
? Products that are SMIS-compliant are available from a limited
number of companies
Reference Web Sites
? SNIA – www.snia.org
Definition
Future of Data Storage Systems Workshop –
April 27-29, UCSD, San Diego CA
Intelligent Storage Workshop, May 19-20,
UMN/DTC, Minneapolis, MN
SNIA OSD TWG, meets monthly
StoreCloud, Supercomputing 2004,
November 6-12, 2004, Pittsburgh, PA
Other References
?
www.dtc.umn.edu
www.insic.org
?
www.datarecoverygroup.com/articles/article3.htm
?
www.actionfront.com/ts_articles.asp
History of disk drives
?
www.research.ibm.com/about/past_history.shtml
?
www.research.ibm.com/journal/rd/443/thompson.html
?
www.i-t-s.com/corporate/disk_drive_history.html
?
www.startribune.com/stories/484/4734780.html
Future of Data Storage Technologies (NSF/NIST/DARPA project)
?
www.wtec.org/loyola/hdmem/toc.htm
?
www.eetimes.com/sys/news/OEG20030718S0038
?
?
?
Range from complete software to a mix of hardware and software products
Products from several vendors are currently available
?
StoreAge Networking Technologies
?
DataCore™ Software
Comments
?
Cool Stuff Happening in the Storage Industry
Some of the virtualization products are still only a few years old and have not had
time to prove themselves as a success or failure
Virtualization was hyped during 2001 as a way to decrease the TCO of a storage system,
but it is becoming commonly believed that this is not the case
Technology Availability
?
?
Sites are made up of many different vendors’ storage devices and some of the
virtualization products allow the pooling of storage devices into a single, larger
space for more efficient use of that space
Technology Evolution
?
?
Unfortunately, there is no single definition – depends on the vendor
Generically, virtualization is an abstraction of physical data storage space
Current Status
Debate over ‘in-band’ versus ‘out-of-band’ virtualization
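Since the slides note that virtualization has no single definition, the sketch below illustrates only the generic idea: several physical devices pooled into one larger virtual block space, with a mapping from virtual block number to a device and an offset on it. Simple concatenation is an assumption made for brevity (real products may stripe or remap dynamically). The class, method, and device names are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


class StoragePool:
    """Concatenate several devices into one virtual block address space."""

    def __init__(self, device_sizes: dict[str, int]) -> None:
        self.names = list(device_sizes)
        # Cumulative end offsets, e.g. sizes 100 and 50 give [100, 150]
        self.ends = list(accumulate(device_sizes.values()))

    @property
    def capacity(self) -> int:
        return self.ends[-1] if self.ends else 0

    def locate(self, virtual_block: int) -> tuple[str, int]:
        """Map a virtual block number to (device name, block on that device)."""
        if not 0 <= virtual_block < self.capacity:
            raise IndexError("virtual block out of range")
        index = bisect_right(self.ends, virtual_block)
        start = self.ends[index - 1] if index else 0
        return self.names[index], virtual_block - start


if __name__ == "__main__":
    pool = StoragePool({"array_a": 100, "array_b": 50})
    print(pool.capacity, pool.locate(99), pool.locate(100))
```

Whether this mapping sits in the data path or beside it is the essence of the in-band versus out-of-band debate.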
Closing Thoughts
?
?
Hardware technologies are evolving
? Areal Density increasing
? Form factors shrinking
? Serial interfaces/transports are replacing parallel
interfaces/transports
? Newer, cooler storage technologies like MEMS are in process
Protocol and software technologies
? Lagging the hardware evolution
? Block-based access moving toward Object-based protocol in
devices and protocols
? Object-based File Systems are being developed
? Traditional POSIX file system API is being challenged, reformed
Thank you!
University of Minnesota
Digital Technology Center
Intelligent Storage Consortium
www.dtc.umn.edu
Software Issues
?
Operating Systems – Homogeneity and Heterogeneity
? Between OS Types
?
?
?
?
Software Issues (cont.)
?
?
Windows
Unix in all flavors
Mac
?
?
Within OS types
?
?
?
?
Linux releases from a single vendor (e.g., RedHat)
Linux releases from different vendors (e.g., RedHat vs. Suse)
Patches from many different Linux Value-Add providers
Windows 95/98/NT/2000/XP …etc.
?
?
?
?
Concurrent support for multiple OS types (Heterogeneous)
Striping efficiency of the Virtualization Engine(s)
Striping efficiency of the file system/Volume Manager
?
Software Issues (cont.)
?
?
Hard product functionality/operational limits
? 1 TB LUN/file system limit for Solaris
? Note this has secondary impacts on products such as CVFS
(shared volume labeling)
? Veritas file system
? 1 TB file systems on Solaris
? 2 TB file systems on HP-UX
? QFS supports up to 252 LUNs therefore 252 TB file systems
? CVFS supports up to 1.84E19 files
? Number of files in an HSM, etc.
Driver (NICs, HBAs, etc.) availability
? No iSCSI driver for SGI IRIX™
? SNIA approved drivers that support LUN discovery
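Two of the limits listed above can be sanity-checked with simple arithmetic. The sketch assumes that the 252 TB QFS figure comes from 252 LUNs at the 1 TB per-LUN limit, and that the 1.84E19 file count corresponds to a 64-bit counter. Both readings are consistent with the numbers but are not stated in the slides.

```python
def qfs_max_filesystem_tb(max_luns: int = 252, lun_limit_tb: int = 1) -> int:
    """File system size limit if each LUN is capped at `lun_limit_tb` TB."""
    return max_luns * lun_limit_tb


def max_count_for_bits(bits: int) -> int:
    """Number of distinct values an unsigned counter of `bits` bits can hold."""
    return 2 ** bits


if __name__ == "__main__":
    print(f"QFS limit: {qfs_max_filesystem_tb()} TB")
    print(f"64-bit count: {max_count_for_bits(64):.2E}")
```

If the per-LUN limit were raised, the file system limit would scale with it, which is why such limits have the secondary impacts the slide mentions.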
?
?
?
?
?
?
?
Mixing protocols with interfaces – not all “endpoints” support all the
possible combinations
? FC over everything
? TCP/IP over everything
? SCSI over everything
Go with what works
? Ethernet for networking
? Fibre Channel for storage
Experiment with what will be the most likely winner
? SCSI over Ethernet one way or another
Plan for “phasing out” old technologies
Plan on “phasing in” new technologies
Linux
? Open source is powerful but not without its problems
? Most everything is kernel and/or distribution (e.g., RedHat)
specific
?
?
?
?
SANergy™
CVFS
GFS
Product incompatibility – Lots of examples
?
?
?
?
From one OS to another OS (i.e. Unix to Windows)
Software rot
Losing source code
Losing algorithms
Losing compilers
Software Issues (cont.)
?
Protocol Issues
Name spaces
Security mechanisms
Meta data: Proprietary versus standard
Disk storage layout: Proprietary versus standard
Application porting issues
?
?
?
File Systems Incompatibilities
CVFS won’t run on a GFS patched kernel
Security/firewall
? CVFS uses dynamically assigned ports for communication with
the FSS (metadata)
? SANergy uses NFS
Firmware and software upgrades
? Impact on operations – not just computers
Tuning
? Variables/parameters at all levels
Management Issues
?
?
?
?
?
There are many pieces in a system to manage
There is no single unified management tool; be
wary of anyone who tries to sell you one
Even in the SAN space, no single management
tool can manage all the SAN devices
Bluefin will help with this but it is still a ways out
Real-time monitoring and management is still a
problem
Management Issues (cont.)
?
?
Failure management
? Run under the assumption that there is ALWAYS something
broken somewhere in the system
? Complete architectural redundancy or allowance for degraded
operation
? Host, HBA, Switch(s), RAID controller, LUN
? Interaction of all the components to effect a proper
switchover
? Disconnecting and shutting off failed components
? SANergy
? CVFS
? GFS
? Failback (restoration)
Performance management
? Treat bandwidth, latency, and transaction rates as a resource
that needs to be monitored and managed