HDF5 (.h5, .hdf5)

Background & Context

    • MIME type: application/x-hdf5
    • HDF data format Version 5.
    • General-purpose format for representing multidimensional datasets and images.
    • Datasets with compound data structures are supported.
    • Used for storage, management, and exchange of scientific data.
    • HDF is an acronym for Hierarchical Data Format.
    • Originally developed by the US National Center for Supercomputing Applications (NCSA).
    • Currently maintained by The HDF Group.
    • Binary file format.
    • Incompatible with HDF Version 4 and earlier.

Import & Export

  • Import["file.h5"] imports an HDF5 file, returning the names of the datasets stored in the file.
  • Import["file.h5",elem] imports the specified element from an HDF5 file.
  • The import format can be specified with Import["file","HDF5"] or Import["file",{"HDF5",elem,…}].
  • Export["file.h5",expr] exports a numeric array to HDF5.
  • Export["file.h5",{expr1,…},{"Datasets", {"dataset1",…}}] creates an HDF5 file, storing the data arrays {expr1,…} as separate datasets.
  • Export["file.h5",expr,elem] exports the specified element to an HDF5 file.
  • Export["file.h5",{elem1->expr1,…},"Rules"] uses rules to specify the elements to be exported.
  • See the reference pages for full general information on Import and Export.
  • ImportString and ExportString support the HDF5 format.
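
A minimal sketch of the call forms listed above. The scratch file name and the dataset names "vector" and "matrix" are hypothetical, chosen only for illustration:

```wolfram
(* Hypothetical scratch file; any writable path with an .h5 extension would do *)
file = FileNameJoin[{$TemporaryDirectory, "hdf5-notes-sketch.h5"}];

(* Store two numeric arrays as separately named datasets *)
Export[file, {Range[5], {{1., 2.}, {3., 4.}}}, {"Datasets", {"vector", "matrix"}}];

(* With no element given, Import returns the names of the datasets *)
Import[file]

(* Import one specific element, here the file summary *)
Import[file, "Summary"]

(* Give the format explicitly instead of relying on the file extension *)
Import[file, {"HDF5", "Datasets"}]
```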

Import Elements

  • General Import elements:
      "Elements"    list of elements and options available in this file
      "Rules"       full list of rules for each element and option
      "Options"     list of rules for options, properties, and settings
  • Structure elements:
      "Datasets"                      names of all datasets
      "Groups"                        names of all groups
      "StructureGraph"                a directed graph showing the structure of the datasets
      {"StructureGraph",groupname}    a graph showing the structure under groupname
      "StructureGraphLegend"          legend for the structure graph
      "Summary"                       summary of properties
  • Names of groups and datasets are given as the absolute paths starting with the root group name "/".
  • Import by default uses the "Datasets" element for the HDF5 format.
  • Data representation elements:
      "Data"                         all datasets imported as an association
      {"Data",n} or n                n-th dataset
      {"Data",dataset} or dataset    named dataset
      {"Data",groupname}             an association of all datasets under groupname
      {"Data",groupname,lev}         an association of datasets under groupname up to level lev
  • The following basic data types are supported:
      "Integer8"             8-bit integers
      "Integer16"            16-bit integers
      "Integer32"            32-bit integers
      "Integer64"            64-bit integers
      "UnsignedInteger8"     8-bit unsigned integers
      "UnsignedInteger16"    16-bit unsigned integers
      "UnsignedInteger32"    32-bit unsigned integers
      "UnsignedInteger64"    64-bit unsigned integers
      "Real32"               IEEE single-precision numbers
      "Real64"               IEEE double-precision numbers
      "String"               string of ASCII characters
  • The following structured data types are supported:
      "ByteArray"    a ByteArray of an arbitrary length
      "Array"        an array of any supported data format
      "Enum"         an enumeration
      "Compound"     a compound dataset consisting of any other data format and other compound datasets
  • Complex numbers are typically stored and imported as compound types.
  • Metadata elements:
      "Attributes"            attributes of all groups and datasets
      "DataEncoding"          specifies how each dataset is compressed
      "DataFormat"            type used to represent each dataset
      "Dimensions"            data dimensions of each dataset
      {"metadata",n}          metadata of the n-th dataset
      {"metadata",dataset}    metadata of the named dataset
  • The following data encodings are supported:
      None             no data compression is used
      "Fletcher32"     adds the Fletcher checksum
      "GZIP"           GZIP compression
      "ScaleOffset"    performs a scale and/or offset operation
      "Shuffle"        reorders so that consistent byte positions are placed together
      "SZIP"           SZIP compression (Import only)
  • A single dataset can have multiple encodings, which will be specified as a list {enc1,enc2,…}.
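
The metadata elements above can be queried for the whole file or for one dataset. The following is a sketch only; the file name and the dataset names "counts" and "samples" are hypothetical:

```wolfram
(* Hypothetical scratch file with two datasets *)
file = FileNameJoin[{$TemporaryDirectory, "hdf5-metadata-sketch.h5"}];
Export[file, {Range[10], RandomReal[1, {3, 4}]}, {"Datasets", {"counts", "samples"}}];

(* Metadata for every dataset in the file *)
Import[file, "Dimensions"]
Import[file, "DataFormat"]
Import[file, "DataEncoding"]

(* Metadata for a single dataset, selected by its absolute path *)
Import[file, {"Dimensions", "/samples"}]

(* The same kind of query, selecting the dataset by its position in the "Datasets" list *)
Import[file, {"DataFormat", 1}]
```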

Export Elements

  • The following export elements can be given:
      "Attributes"        attributes associated to any object
      "Datasets"          datasets and their associated elements
      "Groups"            group names and their associated elements
      "NamedDataTypes"    named data types and their associated elements
  • With the "Attributes" element, the following expressions can be given:
      attr                attributes associated to the root group "/"
      {path1->attr1,…}    attributes attri associated to the specified pathi
  • Attributes attri should be given in the form "attname"->attval.
  • With the "Groups" element, the following expressions can be given:
      {"gr1","gr2",…}       a list of group paths
      {"gr1"->grdata1,…}    list of groups and their associated data
  • Group data grdatai can take the following keys:
      "Attributes"    group attributes
      "HardLinks"     hard links to other objects
      "SoftLinks"     soft links to other objects
  • Soft and hard links can be specified using "linkname"->path.
  • With the "Datasets" element, the following expressions can be given:
      data                  stores data under "Dataset1"
      {"name1"->data1,…}    a list of dataset names and their associated data
      {"name1"->ds1,…}      specifies each dataset dsi using a list of rules
  • Datasets dsi can take the following keys:
      "Attributes"    dataset attributes
      "Data"          array of data
      "DataFormat"    data type
  • With the "NamedDataTypes" element, the following expressions can be given:
      {"name1"->type1,…}                                   a data type
      {"name1"-><|"Type"->type1,"Attributes"->att1|>,…}    an association specifying a type and its attributes
  • The type specification typei can take the following forms:
      "simpletype"                  a simple data type such as "Integer64"
      <|"Class"->"ByteArray",…|>    takes "Length" and "Tag" keys
      <|"Class"->"Array",…|>        takes "DataFormat" and "Dimensions" keys
      <|"Class"->"Compound",…|>     takes a "Structure" key
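
A sketch of combining export elements through "Rules", written directly from the forms listed above and not checked against a particular version. The file name, the dataset path "/measurements/temperature", and the group path "/notes" are hypothetical:

```wolfram
(* Hypothetical scratch file *)
file = FileNameJoin[{$TemporaryDirectory, "hdf5-elements-sketch.h5"}];

Export[file,
  {
    (* one dataset specified by a list of rules with the "Data" and "DataFormat" keys *)
    "Datasets" -> {
      "/measurements/temperature" -> {
        "Data" -> {20, 21, 19},
        "DataFormat" -> "Integer16"
      }
    },
    (* an additional group, given by its path *)
    "Groups" -> {"/notes"}
  },
  "Rules"
];

(* Check which groups were created and how the dataset is typed *)
Import[file, "Groups"]
Import[file, "DataFormat"]
```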

Options

  • Import and Export options:
      "ComplexKeys"    Automatic    keys for complex interpretation and export
  • By default, {"Re","Im"} are used as complex keys. Other settings include:
      None           no complex interpretation (import only)
      Automatic      use automatic keys
      {key1,key2}    use explicit keys
  • Import option:
      "TakeElements"    All    subset of elements to import
  • "TakeElements" can take the following values:
      {elem1,elem2,…}    list of elements elemi
      {m;;n;;s,…}        elements m through n in steps of s
      {opt1->val1,…}     list of suboptions
  • The following suboptions opti are available for taking elements:
      "Offset"    {0,0,…}    the offset along the dimensions of the dataset
      "Count"     All        the number of blocks to be imported along each dimension
      "Stride"    {1,1,…}    the step between beginnings of the blocks
      "Block"     {1,1,…}    number of elements in each block
  • Export options:
      ByteOrdering       $ByteOrdering    what byte ordering to use
      OverwriteTarget    True             whether to overwrite an existing file
      "AppendMode"       "Extend"         how to append to existing objects
  • Using OverwriteTarget->"Append", new objects may be added to an existing file.
  • Possible settings for "AppendMode" include:
      "Extend"       extends existing objects, if possible (default)
      "Overwrite"    overwrites existing objects
      "Preserve"     preserves existing objects

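A sketch of the export options described above, showing how OverwriteTarget->"Append" and "AppendMode" might be combined. The file name and the dataset names "first" and "second" are hypothetical:

```wolfram
(* Hypothetical scratch file *)
file = FileNameJoin[{$TemporaryDirectory, "hdf5-append-sketch.h5"}];

(* Create the file with one dataset; the default OverwriteTarget -> True replaces any existing file *)
Export[file, {Range[3]}, {"Datasets", {"first"}}];

(* Add a second dataset to the same file instead of replacing the file *)
Export[file, {Range[4, 6]}, {"Datasets", {"second"}}, OverwriteTarget -> "Append"];

(* Append again under an existing name, asking for existing objects to be preserved *)
Export[file, {Range[7, 9]}, {"Datasets", {"first"}},
  OverwriteTarget -> "Append", "AppendMode" -> "Preserve"];

(* List the datasets now present in the file *)
Import[file, "Datasets"]
```
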
Examples

Basic Examples  (1)

Show the datasets stored in a sample file:

Get the file summary:

Show the structure of the file:

Scope  (12)

Import  (7)

Show all elements available in the file:

Import, by default, returns the list of datasets in the file:

Import the file structure, specifying the format explicitly:

Import contents of a dataset by specifying its name:

Import dimensions and data format for all datasets in the file:

Import dimensions and data format of a single dataset:

Import 8-bit RGB raster data and render it as an Image object:

Export  (5)

Export a matrix to HDF5:

Show the datasets contained in this file:

Export a named dataset:

Export two matrices to HDF5:

Export a named dataset with given data type:

Import the data format:

Export a named data type and a group that links to it:

Import the file structure:

Import Elements  (22)

Attributes  (4)

Import attributes of all objects in the file:

Import attributes of a specific dataset:

Import attributes of the second dataset in the file:

The order of datasets in the file can be checked by calling:

Import attributes of multiple objects:

Data  (4)

Get data from all datasets in the file:

Import data from a dataset "Complex64" inside group "Complex":

Import data from the third dataset in the file:

Get data from all datasets in a given group:
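
The sample file used on this page is not reproduced here, so the following sketch builds its own file to illustrate this and the preceding "Data" forms. The file name and the dataset paths are hypothetical:

```wolfram
(* Hypothetical scratch file with datasets in two groups *)
file = FileNameJoin[{$TemporaryDirectory, "hdf5-data-sketch.h5"}];
Export[file, {Range[5], {{1., 2.}, {3., 4.}}, {10, 20}},
  {"Datasets", {"/signals/a", "/signals/b", "/other/c"}}];

(* All datasets, as an association *)
Import[file, "Data"]

(* One dataset by name; the path alone can also be used as the element *)
Import[file, {"Data", "/signals/a"}]
Import[file, "/signals/a"]

(* One dataset by its position in the list given by Import[file, "Datasets"] *)
Import[file, {"Data", 2}]

(* All datasets under a given group *)
Import[file, {"Data", "/signals"}]
```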

Import data from every dataset or group in a list:

DataEncoding  (1)

Check what filters were applied to each dataset in the file:

DataFormat  (2)

Get data type description for every dataset in the file:

Simple numeric and string types have one-word descriptions:

"DataFormat" for a compound type shows class and structure:

"DataFormat" for enumerated types includes class, base data format and a list of values and names:

"DataFormat" for array types includes class, base data format and dimensions:

"DataFormat" for byte arrays includes class and length:

Datasets  (2)

Import names of all datasets in a file:

"Datasets" is the default HDF5 element:

Dimensions  (4)

Get dimensions of all datasets in a file:

Import dimensions of all datasets under a given group:

Dimensions of data with only a single element are indicated by an empty list:

Get dimensions of a specific dataset:

Specify the dataset using its index:

Groups  (1)

Import names of all groups in the file, with each group listed only once:

StructureGraph  (3)

The structure of a structured HDF5 file:

Structure of a flat HDF5 file:

Get a legend:

Summary  (1)

Get the file summary:

Export Elements  (21)

Datasets  (11)

"Datasets" is the default export element. Data format and dimensions are automatically inferred from the expression:

Export expressions into different datasets, each of which can have a full path:

Inspect the structure of the HDF5 file:

Export a dataset with a custom data format:

Export a dataset with an attribute:

Create a scalar dataset with a single integer:

Create an array of integer numbers:

Export an array of real numbers:

Export a numeric array with complex numbers:

Note that complex numbers are exported to a compound dataset:

Use the "ComplexKeys" option to get complex numbers back:
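
A sketch of the complex-number round trip described in the last three steps, assuming that a plain list of machine complex numbers is accepted on export. The file name is hypothetical, and the keys are passed explicitly rather than relying on the Automatic setting:

```wolfram
(* Hypothetical scratch file *)
file = FileNameJoin[{$TemporaryDirectory, "hdf5-complex-sketch.h5"}];

(* Export complex values; HDF5 has no complex type, so a compound type is used *)
Export[file, {1. + 2. I, 3. - 4. I}, "ComplexKeys" -> {"Re", "Im"}];

(* Inspect how the dataset is typed in the file *)
Import[file, "DataFormat"]

(* Import the data, naming the compound fields that hold real and imaginary parts *)
Import[file, {"Data", 1}, "ComplexKeys" -> {"Re", "Im"}]
```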

Create a dataset with strings:

A ByteArray is stored as a "ByteArray" type (also called the opaque type in HDF5):

Associations with string keys are exported as elements of a compound type:

Groups  (7)

HDF5 files always have a root group (with path "/"):

Export multiple groups by giving a list of paths:

Groups can have links to other groups or datasets:

Soft link names cannot contain the special character "/":

Soft link target must be a valid path, without "." or "..":

If a soft link target is missing, the link becomes a dangling link:

Create a file with two groups, a dataset and a hard link from one group to the dataset:

NamedDataTypes  (2)

Export a simple type to the file:

Append a compound named data type to the same file:

Use the previously exported type when exporting a dataset:

Import data format and dimensions of the dataset:

Inspect exported data:

Named datatypes can carry attributes:

Attributes  (1)

Attach attributes to the root group:

Import Options  (3)

"TakeElements"  (3)

Import three points from a dataset:

This is equivalent to, but more efficient than, extracting parts after importing the whole dataset:

Import a range of data from a dataset using spans:

Use suboptions for specifying offset, stride, and block:
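
A sketch of the span and suboption forms of "TakeElements", using a self-made 10×10 dataset in place of the sample file. The file name, the dataset name "grid", and the particular offsets and strides are hypothetical:

```wolfram
(* Hypothetical scratch file holding a 10x10 integer matrix *)
file = FileNameJoin[{$TemporaryDirectory, "hdf5-take-sketch.h5"}];
Export[file, {Partition[Range[100], 10]}, {"Datasets", {"grid"}}];

(* One span per dimension: 1 through 3 in the first, 2 through 10 in steps of 2 in the second *)
Import[file, {"Data", "/grid"}, "TakeElements" -> {1 ;; 3, 2 ;; 10 ;; 2}]

(* Suboption form: 2x2 blocks, two blocks along each dimension, block starts 4 apart *)
Import[file, {"Data", "/grid"},
  "TakeElements" -> {"Offset" -> {1, 1}, "Count" -> {2, 2}, "Stride" -> {4, 4}, "Block" -> {2, 2}}]
```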

Applications  (1)

Easily recreate a chess game from a dataset of 8x8 grids of string characters:

Properties & Relations  (1)

An association is, by default, exported as a compound dataset:

A list of rules is exported as multiple datasets:
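
The two behaviors above can be compared side by side. This is a sketch with hypothetical file names and keys, not the page's original example:

```wolfram
(* Hypothetical scratch files *)
fileA = FileNameJoin[{$TemporaryDirectory, "hdf5-assoc-sketch.h5"}];
fileB = FileNameJoin[{$TemporaryDirectory, "hdf5-rules-sketch.h5"}];

(* An association with string keys: expected to become one compound dataset *)
Export[fileA, <|"id" -> 1, "value" -> 2.5|>];
Import[fileA, "DataFormat"]

(* A list of rules: expected to become one dataset per rule *)
Export[fileB, {"id" -> {1, 2, 3}, "value" -> {2.5, 3.5}}];
Import[fileB, "Datasets"]
```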

Possible Issues  (2)

"Groups" lists each group in the file only once:

Notice that "/group2" is an unlisted valid path in the file:

Use an unlisted group path:

HDF5 has no built-in type for complex numbers. Usually they are stored in a compound type:

Introduced in 2004 (5.1) | Updated in 2019 (12.0)