iSCSI guides and suggestions.
For most implementations
iSCSI: What it is
Internet Small Computer System Interface (iSCSI) is an IP-based storage networking standard for linking
data storage facilities. It carries SCSI commands over IP networks, typically using TCP ports 860
and 3260.
• Requires an initiator and a LUN.
• An initiator can be hardware that implements iSCSI (not used or covered in this document).
• An initiator can be software, which is what is needed to implement iSCSI with our devices.
• This software contains a kernel-resident device driver that uses the existing NIC and
  network stack to emulate SCSI.
• The remote end must also “speak” the iSCSI protocol.
• You can think of this software as a method to bus data from a remote target (LUN).
The ReadyDATA is the remote end…
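Because the initiator reaches the target over ordinary TCP, a first troubleshooting step can be to confirm that the target's port accepts connections. The following is a minimal sketch, not a ReadyDATA tool: the function name `portal_reachable` is made up for illustration, and it only tests TCP reachability on the default port 3260. It does not perform an iSCSI login.

```python
import socket

ISCSI_DEFAULT_PORT = 3260  # default iSCSI target port mentioned above


def portal_reachable(host: str, port: int = ISCSI_DEFAULT_PORT,
                     timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This checks network reachability only; it says nothing about
    whether an iSCSI session could actually be established.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

Calling `portal_reachable("readydata.example.local")` (a hypothetical host name) would return True only if something is listening on port 3260 at that address.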
SAN
The concept is that a computer
running an initiator will “initiate” a
connection to a host which has
been set up with a LUN as its target.
These connections can be
insecure, or secured using CHAP.
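To make the CHAP option more concrete, here is a minimal sketch of the challenge-response calculation that CHAP defines in RFC 1994 when MD5 is used: the response is the MD5 hash of the identifier byte, the shared secret, and the challenge. The function names and sample values are illustrative assumptions; a real initiator and target handle this exchange for you during login.

```python
import hashlib
import hmac


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """CHAP with MD5: hash of (identifier byte + secret + challenge)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def chap_verify(identifier: int, secret: bytes, challenge: bytes,
                response: bytes) -> bool:
    """Recompute the expected response and compare in constant time."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)
```

The point of the design is that the secret itself never crosses the network; only the challenge and the hashed response do.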
iSCSI
Huge amounts of RAM are used when dedupe is enabled.
Deduplication on iSCSI is strongly discouraged, as it
will adversely affect your system’s performance.
Dedup
• Deduplication requires creating a deduplication
  store, which uses a lot of RAM.
• The reason for this degradation is that
  iSCSI uses fixed 8k blocks instead of the
  variable block size ZFS uses (up to 128k by default).
• Keep in mind that your memory overhead goes
  up in proportion to the decrease in block size.
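The proportionality can be checked with simple arithmetic: every block needs its own entry in the deduplication store, so smaller blocks mean more entries for the same amount of data. The sketch below counts blocks for 1 TiB of data at the two block sizes named above. It assumes every block is full-sized, which is a simplification.

```python
KIB = 1024
TIB = 1024 ** 4


def blocks_needed(data_bytes: int, block_size: int) -> int:
    """Number of blocks required to hold data_bytes (ceiling division)."""
    return -(-data_bytes // block_size)


blocks_8k = blocks_needed(TIB, 8 * KIB)      # iSCSI LUN block size
blocks_128k = blocks_needed(TIB, 128 * KIB)  # large ZFS file blocks

# Ratio of table entries needed at 8k versus 128k.
ratio = blocks_8k // blocks_128k
print(blocks_8k, blocks_128k, ratio)
```

With these inputs the 8k case needs 16 times as many entries as the 128k case, which is why the memory overhead grows as the block size shrinks.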
iSCSI
Some of the attributes that can be assigned to LUNs, such as compression and deduplication, can
work for or against you,
particularly in a SAN implementation.
Dedup
The numbers are not exact and will vary with
the data being handled in the LUN.
A good rule of thumb is to double the normal
dedup RAM requirement of 5GB for every 1TB of data to
“more” than 10GB of RAM for every 1TB of data.
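The rule of thumb is easy to turn into a quick estimate. The sketch below only encodes the two figures given in this deck (5GB per 1TB normally, doubled for iSCSI); the function name is made up, and because the guidance for iSCSI is "more than" 10GB per 1TB, the iSCSI result should be read as a lower bound rather than an exact requirement.

```python
GB_RAM_PER_TB = 5.0   # normal dedup rule of thumb from this deck
ISCSI_FACTOR = 2.0    # doubled for iSCSI LUNs, per this deck


def dedup_ram_gb(data_tb: float, iscsi: bool = False) -> float:
    """Estimate dedup RAM in GB; for iSCSI treat the result as a minimum."""
    if data_tb < 0:
        raise ValueError("data_tb must be non-negative")
    per_tb = GB_RAM_PER_TB * (ISCSI_FACTOR if iscsi else 1.0)
    return data_tb * per_tb
```

For example, `dedup_ram_gb(4, iscsi=True)` gives 40.0, meaning 4TB of deduplicated iSCSI data would call for more than 40GB of RAM under this rule.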
iSCSI uses a small block size (8k) so that it can be
managed by many different file systems.
Because it is more granular, overlaying file
systems will align well. This comes at the cost of increased
CPU load, but it is the deduplication
store tables, which consume a great amount of
memory, that make dedupe something to avoid with iSCSI.
If the LUN is created for use with ESX or other VMs,
DO NOT USE DEDUPLICATION ON THE READYDATA. Instead, use the VM’s software dedupe!
Visualizing
DEDUPLICATION
Deduplication
Deduplication (dedup for short) is a method of removing redundant data blocks from a
data set.
Only unique blocks are written to disk; any block that is identical to an existing
block is only referenced, as a component of multiple data sets.
This means that the capacity of the disk is dedicated to unique and distinct
blocks of data, while the inodes keep files structured correctly by using components
shared across many inodes. The result is a dramatic increase in efficiency.
In the next slides we will look at how the data is managed by the inodes
and how it is written to the disks, both with dedupe ON and OFF.
Inodes will be represented by
Blocks will be represented by
Deduplication
Let's look at two formatted drives...
These are our inodes
These are our sectors
These are our inodes
These are our sectors
Deduplication
MEMORY
Dedupe OFF...
As data flows in, it is simply written as it arrives.
Deduplication
MEMORY
Dedupe OFF...
All data is written to the array in its original form, and in this example our drive
has filled up.
We need to expand our volume to store more data.
Note the low use of memory.
Deduplication
MEMORY
Dedupe ON...
Only unique blocks are written to the array, leaving much more
capacity available for more unique blocks.
Because a reference to every unique block is kept in memory in order to
identify repeating blocks, dedupe uses a lot of memory.
A single inode will be referenced for all files requiring that particular block of
data.
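The behaviour on this slide can be imitated with a toy in-memory model. This is only a sketch of block-level deduplication in general, not how ZFS or the ReadyDATA implements it: the class name, the tiny block size, and the choice of SHA-256 as the block fingerprint are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy block store: each unique block is kept once, files hold references."""

    def __init__(self, block_size: int = 8):
        self.block_size = block_size
        self.blocks = {}  # fingerprint -> block bytes (stands in for the disk)
        self.files = {}   # file name -> ordered list of fingerprints

    def write(self, name: str, data: bytes) -> None:
        refs = []
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            fingerprint = hashlib.sha256(block).digest()
            # Store the block only if this fingerprint has not been seen.
            self.blocks.setdefault(fingerprint, block)
            refs.append(fingerprint)
        self.files[name] = refs

    def read(self, name: str) -> bytes:
        return b"".join(self.blocks[fp] for fp in self.files[name])
```

Note that the `blocks` lookup table has one entry per unique block and must be consulted on every write. That table is the part that grows with the amount of unique data and consumes memory.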
Deduplication
MEMORY: Dedupe OFF
MEMORY: Dedupe ON
Deduplication
Deduplication does use more compute cycles and a lot more RAM…
A LOT of RAM!!!
Particularly when there is a lot of unique data.
• The amount of memory needed is around 5GB for every 1TB of data.
• The ReadyDATA 5200 has 16GB of ECC (error-correcting) memory.
• ReadyDATA uses ZFS, which has a maximum file size of 16 exabytes.
• The maximum number of files is 2.8147498e+14, or 281,474,976,710,656
  files…
...So you will not be running out of inodes any time soon, but memory is limited.
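Putting the figures on this slide side by side shows why memory, not the file count, is the practical limit. The sketch below assumes, unrealistically, that all installed RAM is available for deduplication, so the result is an optimistic ceiling. The function name is made up for illustration.

```python
def max_dedup_data_tb(ram_gb: float, gb_per_tb: float = 5.0) -> float:
    """Data (in TB) that the rule of thumb allows for a given amount of RAM."""
    return ram_gb / gb_per_tb


# 16GB of RAM at roughly 5GB per 1TB of deduplicated data.
ceiling_tb = max_dedup_data_tb(16)

# The file-count figure quoted above is 2 to the power of 48.
max_files = 2 ** 48

print(ceiling_tb, max_files)
```

Under this rule, 16GB of RAM covers only a little over 3TB of deduplicated data, while the file limit is far beyond anything a single unit would reach.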
Deduplication
USE DEDUPLICATION WISELY
It is UNWISE to dedup an iSCSI target
Thick or Thin LUN
The choice between a thin and a thick LUN has a definite impact on how memory is used
if one decides to use dedupe on that LUN.
Thick?...
Thick provisioning reserves the entered “Size”
from the volume immediately.
Thin?...
Thin provisioning does not reserve the entered “Size”
from the volume at all, but once the target is
mounted it reports the entered size to the host FS.
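The difference between the two provisioning modes can be sketched as a small accounting model. This is an illustration of the idea only: the `Volume` and `Lun` classes are hypothetical and do not correspond to any ReadyDATA interface.

```python
class Volume:
    """Tracks how much of a volume's capacity has been reserved."""

    def __init__(self, capacity_gb: float):
        self.capacity_gb = capacity_gb
        self.reserved_gb = 0.0

    @property
    def free_gb(self) -> float:
        return self.capacity_gb - self.reserved_gb

    def reserve(self, gb: float) -> None:
        if gb > self.free_gb:
            raise ValueError("not enough free space in the volume")
        self.reserved_gb += gb


class Lun:
    """A LUN that is either thick (reserve up front) or thin (reserve on write)."""

    def __init__(self, volume: Volume, size_gb: float, thick: bool):
        self.volume = volume
        self.size_gb = size_gb
        self.thick = thick
        self.written_gb = 0.0
        if thick:
            volume.reserve(size_gb)  # whole size is taken immediately

    def reported_size_gb(self) -> float:
        # Both kinds report the entered size to the host file system.
        return self.size_gb

    def write(self, gb: float) -> None:
        if self.written_gb + gb > self.size_gb:
            raise ValueError("LUN is full")
        if not self.thick:
            self.volume.reserve(gb)  # thin: space is taken only as data lands
        self.written_gb += gb
```

Creating a 400GB thick LUN on a 1000GB volume immediately drops the free space to 600GB, while a 400GB thin LUN leaves it unchanged until data is written, even though both report 400GB to the host.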
Thick or Thin LUN
Thick provisioning, in terms of iSCSI, would create
an enormous dedupe table in memory.
If dedupe is absolutely desired on the LUN, create it
as thin provisioned.
Also, if you decide to use dedupe after all, we
strongly advise you to use a pair of read-cache
SSD drives.
ReadyDATA Disk Packs
To meet the differing needs of specific applications, ReadyDATA 5200 users can mix and
match SATA, near-line SAS, SAS, and SSD drives within volumes to boost
performance and provide flexible capacity.
NOTE:
• ReadyDATA supports SATA, NL-SAS, SAS, and SSD drives.
• Optimize the ratio of performance to capacity by mixing drive types within a volume.
• Only ReadyDATA disk packs from NETGEAR are recognized by the ReadyDATA 5200.
ReadyDATA storage is available as a diskless chassis (RD5200) or in a pre-populated 12TB
SATA configuration (RD521210).
• Only pre-certified disk packs from NETGEAR are compatible with ReadyDATA storage
  devices.
• For your convenience, NETGEAR offers a wide variety of disk types, capacities, and speeds.
SATA
Nearline SAS
SAS (LFF)
SSD (SFF)
End