File Systems in Real-Time
Embedded Applications
Introduction to File Systems
March 4th
Eric Julien
Week Agenda
• Day 1: Introduction to File Systems
• Day 2: Understanding how the File Allocation
Table (FAT) Operates
• Day 3: Balancing performance, safety and
resource usage in an embedded file
system
• Day 4: Choosing the right storage media
• Day 5: The challenges of using NAND flash
memory in embedded systems.
Definition of a file system
From the user’s perspective, the file
system provides a means of organizing,
storing and retrieving data to a
permanent storage device.
Definition of a file system
From the designer’s perspective, the file
system refers to all the internal data
structures and algorithms that support
these services.
Historical overview
• 1973: CP/M operating system was first
introduced. Its FS was very simple and
had no directory hierarchy.
• 1980: QDOS, an operating system modeled
on CP/M, was released. The QDOS FS was
based on a data structure called the File
Allocation Table.
• 1981: Microsoft bought QDOS and its FS
and marketed them as MS-DOS and FAT.
FS in embedded systems
Embedded systems, as opposed to full-fledged computers, have strict limitations
both in terms of processor speed and
memory.
File systems designed for huge data centers
(e.g. ZFS) are therefore not well-suited for
small, less capable embedded systems.
Files
The file abstraction provides the user with a
convenient way to retrieve previously stored
pieces of data using their name.
A file can be seen as a labeled data
container.
File metadata
The file metadata refers to pieces of
information stored on disk that describe a
file. The metadata is not part of the file
content. Examples of file metadata are:
• File name
• File creation date
• File size
• Security attributes
Directories
The directory abstraction provides the user
with a convenient way to group related files.
Internally, the directory stores information
that allows file names to be associated with
corresponding data block locations.
Some old FS (such as early versions of DOS)
had a single directory containing all files.
Such FS are called flat file systems.
Device, partition and volume
The device refers to the physical storage
media (e.g. hard disk, SD card, flash
memory).
The partition is a logical unit obtained by
the division of the underlying device
physical space (not FS specific).
The volume is a formatted partition or
device where the FS resides (FS specific).
Common internal structures
Although internal architectures vary widely
from one FS to another, the base ingredients
remain the same:
• Arrays
• Bitmaps
• Linked lists
• Unbalanced trees
• Balanced trees
Bitmaps
Often used to keep track of resource
allocation.
0000000000000110
0000000000001000
Resources 1, 2 and 19 are allocated (bits are
numbered from the right, starting at 0, with
16 resources per row).
Used by ext2/3/4, NTFS, HFS, ReiserFS
among others.
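The bitmap above can be reproduced with a few lines of code. This is a minimal sketch, not how any particular file system implements it: the `Bitmap` class, its method names and the 16-bit word size are assumptions chosen to match the rows shown on the slide.

```python
BITS_PER_WORD = 16  # assumption: matches the 16-bit rows shown above


class Bitmap:
    """Tracks allocation of numbered resources, one bit per resource."""

    def __init__(self, n_resources):
        n_words = (n_resources + BITS_PER_WORD - 1) // BITS_PER_WORD
        self.words = [0] * n_words
        self.n_resources = n_resources

    def allocate(self, index):
        # Set the bit that represents this resource.
        self.words[index // BITS_PER_WORD] |= 1 << (index % BITS_PER_WORD)

    def free(self, index):
        # Clear the bit that represents this resource.
        self.words[index // BITS_PER_WORD] &= ~(1 << (index % BITS_PER_WORD))

    def is_allocated(self, index):
        word = self.words[index // BITS_PER_WORD]
        return bool((word >> (index % BITS_PER_WORD)) & 1)

    def find_free(self):
        # Linear scan for the first clear bit; None if everything is in use.
        for i in range(self.n_resources):
            if not self.is_allocated(i):
                return i
        return None

    def rows(self):
        # One string per word, most significant bit on the left.
        return [format(w, "016b") for w in self.words]


bm = Bitmap(32)
for resource in (1, 2, 19):
    bm.allocate(resource)
print(bm.rows())  # ['0000000000000110', '0000000000001000']
```

Finding a free resource only requires scanning bits, which is why a bitmap is a compact way to answer "what is still available?".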
Linked lists
Used to store and manage directory content
(a) and file content (b).
[Figure: (a) directory Dir X as a linked list of entries File A, File B, Dir Y and File C; (b) file File X as a linked list of blocks Block A to Block D.]
Used by ext2/3 (a) and FAT12/16/32 (b).
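Case (b) can be sketched as a table in which each entry holds the number of the next block of the file. The code below is a simplified illustration, not the real FAT on-disk format: the `FREE` and `END_OF_CHAIN` markers, the function names and the table contents are all hypothetical.

```python
FREE = 0            # hypothetical marker for an unused table entry
END_OF_CHAIN = -1   # hypothetical marker; real FAT variants reserve specific values


def cluster_chain(fat, first_cluster):
    """Return every cluster of a file by following the table from its first cluster."""
    chain = []
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


def nth_cluster(fat, first_cluster, n):
    """Return the n-th cluster (0-based) of a file; this costs n table lookups."""
    cluster = first_cluster
    for _ in range(n):
        cluster = fat[cluster]
        if cluster == END_OF_CHAIN:
            raise IndexError("offset is past the end of the file")
    return cluster


# index: 0     1     2  3             4     5  6  7
fat = [FREE, FREE, 5, END_OF_CHAIN, FREE, 6, 3, FREE]
print(cluster_chain(fat, 2))  # [2, 5, 6, 3]
```

Reaching the n-th block requires walking n links, so random access gets slower as the file grows.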
Unbalanced trees
Heavily used by ext2/3 to organize data
blocks. More levels of indirection are added
as the file grows.
[Figure: File X metadata pointing to a tree of blocks A to N.]
Balanced trees (B-trees)
Figure (a) shows what a
balanced tree looks like, as
opposed to an unbalanced
tree (b).
The B-tree is a self-balancing
tree that provides logarithmic-time
search at the expense of
more complex node
insertion/deletion.
[Figure: (a) balanced tree, (b) unbalanced tree.]
B+-tree vs. linked list
B+-tree (a variant of B-tree) provides fast
random access.
[Figure: looking up block I of File X. In the B+-tree: I >= H so branch right, then I >= I so branch right; data found in 3 hops. In the linked list of blocks A to I, blocks are visited one after another; data found in 8 hops.]
In a B+-tree, the search time is
logarithmic and deterministic.
In a linked list, the search time is
linear and non-deterministic.
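The hop counts can be checked with a small experiment. This sketch uses a hand-built, hypothetical three-level tree rather than the exact one from the slide, and a `Node` class invented for the illustration. Hops are counted as nodes visited in the tree and as links followed from the first block in the list, a convention assumed here to match the slide's numbers.

```python
from bisect import bisect_right


class Node:
    """B+-tree node: internal nodes have children, leaves have children=None."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children


def tree_hops(root, target):
    """Count the nodes visited to reach the leaf that holds target."""
    node, hops = root, 1
    while node.children is not None:
        # Keys >= a separator live in the subtree to its right.
        node = node.children[bisect_right(node.keys, target)]
        hops += 1
    if target not in node.keys:
        raise KeyError(target)
    return hops


def list_hops(blocks, target):
    """Count the links followed from the first block of a list to target."""
    for hops, name in enumerate(blocks):
        if name == target:
            return hops
    raise KeyError(target)


# Hypothetical tree over blocks A to I.
left = Node(["C"], [Node(["A", "B"]), Node(["C", "D"])])
right = Node(["H"], [Node(["E", "F", "G"]), Node(["H", "I"])])
root = Node(["E"], [left, right])
blocks = list("ABCDEFGHI")

print(tree_hops(root, "I"), list_hops(blocks, "I"))  # 3 8
```

Every key in this tree is reached in the same number of hops, while the cost in the list depends on where the block sits.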
File systems
• FAT
• exFAT
• Ext2/3/4
• NTFS
• HFS/HFS Plus
• Btrfs
• ZFS
• Log-structured file systems (YAFFS, JFFS)
FAT
- 3 flavors: FAT12, FAT16 and FAT32.
- DOS and Windows 9x file system.
- Simple architecture based on linked lists.
- Well-suited for embedded because of its
low footprint (both on-disk and RAM).
- Poor performance on big volumes
(remember linked list vs. B-trees?).
- More on FAT later…
exFAT
- Smaller footprint than NTFS (more on
NTFS later) but better performance than
FAT32.
- A bitmap is used to track cluster allocation
(much faster than browsing the FAT).
- Huge file size limit (16 exabytes).
Ext2/3/4
• Default file system for many Linux
distributions.
• Internal structure based on unbalanced
trees with up to 3 levels of indirection.
• Journaling (in ext3) as a means of
providing metadata reliability.
• Extents (variable-sized runs of contiguous
blocks) in ext4 allow better large-file
performance.
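The benefit of extents can be sketched with a simple lookup. This is an illustration only: the `(logical_start, length, physical_start)` tuple layout and the `map_block` function are assumptions, not the ext4 on-disk format.

```python
from bisect import bisect_right


def map_block(extents, logical_block):
    """Translate a logical block of a file to a physical block.

    extents is a list of (logical_start, length, physical_start) tuples
    sorted by logical_start. Returns None if the block is not mapped.
    """
    starts = [extent[0] for extent in extents]
    i = bisect_right(starts, logical_block) - 1
    if i >= 0:
        logical_start, length, physical_start = extents[i]
        if logical_block < logical_start + length:
            return physical_start + (logical_block - logical_start)
    return None


# Two extents describe a 150-block file; block-by-block mapping would need 150 pointers.
extents = [(0, 100, 5000), (100, 50, 9000)]
print(map_block(extents, 120))  # 9020
```

A large, mostly contiguous file is described by a handful of extents, so the mapping metadata stays small and lookups stay short.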
NTFS
• Default Windows file system since XP.
• Based on extents.
• Directory entries stored in a B-tree,
providing much better performance than
FAT for huge directories.
• Clever handling of small files: data is
stored with the metadata for fast access
and low internal fragmentation.
HFS/HFS Plus
• Default file system for Mac OS.
• All file and directory metadata is stored
in a single giant B-tree.
• HFS Plus basically provides additional
support for bigger files and longer file
names.
• Journaling is possible with HFS Plus.
Btrfs (B-tree file system)
• Almost everything (files, directories,
resource allocation management) is stored
in B-trees.
• Copy-on-write is used as a means of
improving reliability. Data or metadata is
never overwritten. Instead, a modified block
is written out-of-place and pointers to it are
then adjusted to reflect the new block
location.
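The copy-on-write update described above can be sketched with a toy store. This is a simplified illustration and not Btrfs code: the `CowStore` class, its single pointer block and the in-memory list standing in for the disk are all assumptions.

```python
class CowStore:
    """Toy copy-on-write store: blocks are only ever appended, never overwritten."""

    def __init__(self):
        self.blocks = []   # simulated storage media
        self.root = None   # location of the current pointer block

    def _write(self, content):
        # Every write lands in a fresh location.
        self.blocks.append(content)
        return len(self.blocks) - 1

    def create(self, data_blocks):
        pointers = tuple(self._write(data) for data in data_blocks)
        self.root = self._write(pointers)

    def update(self, i, new_data):
        old_pointers = self.blocks[self.root]
        new_location = self._write(new_data)  # out-of-place write
        new_pointers = old_pointers[:i] + (new_location,) + old_pointers[i + 1:]
        # The root switches only once the new blocks are fully written.
        self.root = self._write(new_pointers)

    def read(self, root=None):
        root = self.root if root is None else root
        return [self.blocks[p] for p in self.blocks[root]]


store = CowStore()
store.create([b"a", b"b"])
old_root = store.root
store.update(1, b"B")
print(store.read(), store.read(old_root))
```

Because the old blocks are left untouched, an interrupted update leaves the previous version intact, and keeping the old root around gives a snapshot almost for free.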
ZFS
• More than a regular file system: also a
logical volume manager.
• Transactional model based on copy-on-write.
• Provides metadata AND data integrity by
checksumming almost everything.
• Many advanced features such as data
deduplication, snapshots and clones.
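Integrity checking by checksum can be sketched as follows. This is a generic illustration rather than ZFS's actual design: SHA-256 from Python's standard library stands in for whatever checksum is configured, and the dictionary used as a block pointer and both function names are hypothetical.

```python
import hashlib


def write_block(storage, data):
    """Store data and return a pointer that carries the block's checksum."""
    storage.append(bytearray(data))
    return {"address": len(storage) - 1, "checksum": hashlib.sha256(data).digest()}


def read_block(storage, pointer):
    """Read a block and verify it against the checksum kept in its pointer."""
    data = bytes(storage[pointer["address"]])
    if hashlib.sha256(data).digest() != pointer["checksum"]:
        raise OSError("checksum mismatch: block is corrupted")
    return data


storage = []
pointer = write_block(storage, b"hello")
print(read_block(storage, pointer))

storage[0][0] ^= 0x01  # simulate a single flipped bit on the media
try:
    read_block(storage, pointer)
except OSError as error:
    print(error)
```

Keeping the checksum apart from the data it protects means that a block that is corrupted, or written to the wrong place, cannot vouch for itself.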
Log-structured file systems
• Storage media is treated as a log.
• Good reliability: logging implies copy-on-write.
• High write throughput: logging allows long
sequential write operations.
• Well-suited for flash media as it inherently
provides wear leveling.
• Used by YAFFS and JFFS (both flash FS).
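The log idea can be sketched with an append-only record list plus an in-memory index. This is a toy model, not how YAFFS or JFFS are implemented: the `LogStore` class, its methods and the `(key, value)` record layout are assumptions.

```python
class LogStore:
    """Toy log-structured store: every update is appended to the end of the log."""

    def __init__(self):
        self.log = []     # append-only sequence of (key, value) records
        self.index = {}   # key -> position of the latest record for that key

    def put(self, key, value):
        # Updates never modify an old record; they append a new one.
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1

    def get(self, key):
        return self.log[self.index[key]][1]

    def stale_records(self):
        # Records superseded by a later write; a cleaner would reclaim them.
        return len(self.log) - len(self.index)


store_log = LogStore()
store_log.put("a", 1)
store_log.put("b", 2)
store_log.put("a", 3)
print(store_log.get("a"), store_log.stale_records())  # 3 1
```

Writes always go to the end of the log, so they are sequential and spread over the media, at the cost of a cleaning step that reclaims superseded records.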