The HDF Group
Introduction to HDF5
Session Two: Data Model Comparison, HDF5 File Format

Copyright © 2010 The HDF Group. All Rights Reserved. www.hdfgroup.org

Our Purpose Today
1) Familiarize you with HDF5 and its capabilities.
2) Help you understand how HDF5 might be applied to your data management challenges.

HDF5 Data Model
HDF5 Objects: File, Link, Dataset, Group, Datatype, Dataspace, Attribute

Developing a Project Data Model
Project Domain Concepts
Logical Data Model: Relational; HDF5 Data Model
Physical Instantiation: A Relational Database; HDF5 File

Logical Data Models

HDF5 / Directories and Files
HDF5         Directories (Folders) and Files
file         filesystem
dataset      file
datatype     ~ file type or extension
dataspace    ~ file size
attribute    ~ properties (Windows)
group        directory (Unix) or folder (Windows)
link         hard links & symbolic links (Unix); ~ shortcuts (Windows)
• Both support hierarchies for organizing information (and to some degree, directed graphs)

HDF5 / XML
HDF5         XML
file         document
dataset      element
datatype     simple or complex type definitions in XML Schema
dataspace    ~ minOccurs, maxOccurs in XML Schema
attribute    attribute
group        ~ element with sub-elements
link         ~ IDREF
• Both support rich metadata and allow new types to be defined
• HDF5 objects designed for numeric data; XML objects designed for text

HDF5 / Relational Databases
HDF5         Relational Database
file         database
dataset      data table
datatype     char, varchar, number, blob, raw, date, …
dataspace    ~ records
attribute    ?
group        ?
link         ?
• HDF5 supports multi-dimensional arrays with a common datatype in the cells; elements are located by offset
• RDBs support rows with different data types in the fields; rows are located by primary key

HDF5 Technology Platform
• HDF5 data model
    • The “building blocks” for data organization and specification
• HDF5 software
    • Library, language interfaces, tools
• HDF5 file format
    • Bit-level organization of an HDF5 file

HDF5 File Format
• Defined by the HDF5 File Format Specification
    • Specifies the bit-level organization of an HDF5 file on storage media
    • Maps the data model objects to a linear address space
    • Other representations of the data model objects are also possible, but those are not the HDF5 format
• Self-describing
    • All the information necessary to read and reconstruct the data model objects is specified by the format
• Designed to work well with other technologies
• Designed for speed and storage efficiency
• Binary format

HDF5 File Format Specification Introduction
You can have the power of the format without worrying about the details of the specification.

Developing a Project Data Model
Project Domain Concepts
Logical Data Model: Relational; HDF5 Data Model
Physical Instantiation: A Relational Database; HDF5 File

Physical Instantiations

HDF5 / Filesystem
• Both allow traversal of objects in the hierarchy
• Both include internal metadata for fast access to subsets of the data
• Both can handle a variety of data
• An HDF5 file can be easily migrated or shared
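To make "self-describing" and "maps the data model objects to a linear address space" more concrete, here is a toy sketch in Python. It is not the HDF5 file format and does not use the HDF5 library. The header layout (a 4-byte length followed by JSON), the function names write_described and read_element, and the sample values are all assumptions made for illustration. The point is only that a reader can locate one array element by offset using nothing but what the file says about itself.

```python
import json
import struct
import tempfile
from pathlib import Path


def write_described(path, shape, values):
    """Write a toy self-describing file: length-prefixed JSON header, then raw doubles."""
    header = json.dumps({"dtype": "<f8", "shape": list(shape)}).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(header)))            # 4-byte little-endian header length
        f.write(header)                                    # description of the data that follows
        f.write(struct.pack(f"<{len(values)}d", *values))  # row-major little-endian float64


def read_element(path, index):
    """Read one element by multi-dimensional index, guided only by the file's own header."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<I", f.read(4))
        header = json.loads(f.read(header_len))
        if header["dtype"] != "<f8":
            raise ValueError("this toy reader only understands little-endian float64")
        # Map the multi-dimensional index to a position in the linear byte stream.
        flat = 0
        for dim, i in zip(header["shape"], index):
            flat = flat * dim + i
        f.seek(4 + header_len + flat * 8)
        (value,) = struct.unpack("<d", f.read(8))
    return header, value


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "toy.bin"
    write_described(path, (2, 3), [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    header, value = read_element(path, (1, 2))  # last element of the 2 x 3 array

print(header["shape"], value)
```

The real format carries far more (groups, links, attributes, richer datatypes, internal indexing structures), which is why applications normally go through the HDF5 library rather than reading the bytes directly.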
HDF5 / “Binary Flat File”
• “Binary Flat File” = A sequence of bytes representing (primarily) numeric data. Often written by scientific and engineering applications to save results from simulations or experiments.
• A binary flat file usually represents the fastest way to write numeric data. Read performance varies depending on access patterns.
• Unlike HDF5, binary flat files are not self-describing or portable across architectures.

HDF5 / XML
• Both HDF5 and XML are self-describing and portable
• XML is text-based and requires contents to be accessed sequentially
• HDF5 is binary and supports random access and subsetting

HDF5 / PDF
• Both HDF5 and PDF formats are published and open
• Both can include heterogeneous types of information
• PDF is focused on documents
• HDF5 is focused on collections of different types, with strong support for multi-dimensional arrays of numeric data
• Both are portable across architectures

HDF5 / Relational Databases
• RDB provides access control features; HDF5 does not
• RDB is transaction-based; HDF5 is not
• Transactions / logging introduce overhead that may not be needed
• HDF5 is not designed for many writers to ‘random’ locations
• RDB provides built-in indices to values
• HDF5 provides navigation to datasets / subsets within datasets
• HDF5 files are portable across platforms

Discussion
• How could daily temperature measurements made at various locations throughout a building be modeled in different formats?
  Filesystem, Binary Flat File, XML, PDF, Relational Database
• What are some pros/cons of each?
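As one possible starting point for the discussion, the sketch below models the same made-up temperature readings two ways using only the Python standard library: as a binary flat file, where a value is located by computing a byte offset, and as a relational table, where a row is located by primary key. The grid size, the readings, the table and column names, and the helper functions flat_lookup and sql_lookup are all hypothetical. It is a minimal illustration of the "locate by offset" versus "locate by primary key" contrast, not a recommended design.

```python
import sqlite3
import struct
import tempfile
from pathlib import Path

DAYS, LOCATIONS = 3, 4  # hypothetical: 3 days, 4 sensor locations

# Made-up readings in degrees Celsius, indexed as readings[day][location].
readings = [[20.0 + day + 0.1 * loc for loc in range(LOCATIONS)] for day in range(DAYS)]


def flat_lookup(path, day, loc):
    """Locate by offset: the reader must already know layout, type, and byte order."""
    offset = (day * LOCATIONS + loc) * 8  # row-major, 8 bytes per float64
    with open(path, "rb") as f:
        f.seek(offset)
        (value,) = struct.unpack("<d", f.read(8))
    return value


def sql_lookup(conn, day, loc):
    """Locate by primary key: the database finds the row through its index."""
    row = conn.execute(
        "SELECT celsius FROM temperature WHERE day = ? AND location = ?",
        (day, loc),
    ).fetchone()
    return row[0]


# Binary flat file: nothing in the file records its shape or datatype.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "temps.bin"
    with open(path, "wb") as f:
        for row in readings:
            f.write(struct.pack(f"<{LOCATIONS}d", *row))
    from_flat = flat_lookup(path, 2, 1)

# Relational database: one row per (day, location) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temperature ("
    "day INTEGER, location INTEGER, celsius REAL, "
    "PRIMARY KEY (day, location))"
)
conn.executemany(
    "INSERT INTO temperature VALUES (?, ?, ?)",
    [(d, l, readings[d][l]) for d in range(DAYS) for l in range(LOCATIONS)],
)
from_sql = sql_lookup(conn, 2, 1)
conn.close()

print(from_flat == from_sql)
```

In the flat-file version the constants DAYS and LOCATIONS live in the program rather than in the file, which is the "not self-describing" drawback noted earlier. The relational version carries its schema with it but stores each reading as a separate row.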
Review
• HDF5 consists of
    • file format
        • self-describing
        • many internal structures to support high performance
    • software
    • data model
        • file, dataset, datatype, dataspace, attribute, group, link
• HDF5 designed to support
    • management of high-volume, complex data
    • data sharing and preservation

The HDF Group
HDF5 Data Model Example
ENSIGHT Automotive Crash Simulation

Automotive Crash Simulation (3 slides, images only)

Solid Modeling (2 slides, images only)

Modeled in HDF5

Mesh Example in HDFView

Stretch Break