ArrowIPC (.arrow, .arrows, .feather, .ftr)

Import supports ArrowIPC file and stream formats as well as Feather Version 1 and Version 2.
Export supports ArrowIPC file and stream formats.

Background & Context

- Registered MIME types: application/vnd.apache.arrow.file, application/vnd.apache.arrow.stream
- Arrow IPC columnar data format.
- Used for efficient serialization of large columnar datasets.
- The primitive unit of serialized data in the columnar format is called a record batch.
- The Arrow IPC file format serializes a fixed number of record batches and supports random access.
- The Arrow IPC streaming format sends an arbitrary-length sequence of record batches.
- Feather version 2 is a file format whose on-disk representation is the Arrow IPC file format.
- Feather version 1 is a legacy file format distinct from Arrow IPC files.
- Developed by the Apache Software Foundation.
- Binary file format.
- Supports multiple compression methods.

Import & Export

Import["file.arrow"] imports an ArrowIPC file as a Tabular object.
Import["file.arrow",elem] imports the specified elements.
Import["file.arrow",{elem,subelem₁,…}] imports subelements subelemᵢ, useful for partial data import.
The import format can be specified with Import["file","ArrowIPC"] or Import["file",{"ArrowIPC",elem,…}].
Export["file.arrow",expr] exports a Tabular object to ArrowIPC file format.
Supported expressions expr include:

	{v₁,v₂,…}	a single column of data
	{{v₁₁,v₁₂,…},{v₂₁,v₂₂,…},…}	lists of rows of data
	array	an array such as SparseArray, QuantityArray, etc.
	dataset	a Dataset or a Tabular object
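
A minimal round-trip sketch of the forms above, assuming a Wolfram Language version with Tabular support. The file name and the sample values are made up for illustration.

```wolfram
(* hypothetical sample data: a list of rows, one of the supported forms of expr *)
data = {{1, 0.5}, {2, 0.75}, {3, 1.25}};

(* hypothetical output location in the temporary directory *)
file = FileNameJoin[{$TemporaryDirectory, "roundtrip-example.arrow"}];

(* write an Arrow IPC file; Export returns the name of the file written *)
Export[file, data];

(* read the file back with the default import element *)
result = Import[file];

(* inspect the head of the imported expression *)
Head[result]
```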

See the following reference pages for full general information:

	Import, Export	import from or export to a file
	CloudImport, CloudExport	import from or export to a cloud object
	ImportString, ExportString	import from or export to a string
	ImportByteArray, ExportByteArray	import from or export to a byte array

Import Elements

General Import elements:

	"Elements"	list of elements and options available in this file
	"Summary"	summary of the file
	"Rules"	list of rules for all available elements

Data representation elements:

	"Data"	two-dimensional array
	"Dataset"	table data as a Dataset
	"Tabular"	a Tabular object

Import by default uses the "Tabular" element.
Subelements for partial data import for the "Tabular" element can take row and column specifications in the form {"Tabular",rows,cols}, where rows and cols can be any of the following:

	n	nth row or column
	-n	counts from the end
	n;;m	from n through m
	n;;m;;s	from n through m with steps of s
	{n₁,n₂,…}	specific rows or columns nᵢ
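
The row and column specifications above can be sketched as follows. This is an illustration under assumptions: the file name and the 10×3 sample table are hypothetical, and every call supplies both a row and a column specification.

```wolfram
(* hypothetical sample table with 10 rows and 3 columns *)
data = Table[{i, i^2, i^3}, {i, 10}];
file = FileNameJoin[{$TemporaryDirectory, "partial-example.arrow"}];
Export[file, data];

(* rows 2 through 4, columns 1 and 3 *)
part = Import[file, {"Tabular", 2 ;; 4, {1, 3}}]

(* every second row from 1 through 9, columns 2 through 3 *)
Import[file, {"Tabular", 1 ;; 9 ;; 2, 2 ;; 3}]

(* last row, columns 1 and 2 *)
Import[file, {"Tabular", -1, {1, 2}}]
```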

Data descriptor elements:

	"ColumnLabels"	names of columns
	"ColumnTypes"	association with data type for each column
	"Schema"	TabularSchema object

Metadata elements:

	"ColumnCount"	number of columns stored in file
	"Dimensions"	data dimensions
	"RowCount"	number of rows stored in file
	"MetaInformation"	metadata

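Each descriptor or metadata element is requested by name. The following is a minimal sketch; the file name and the sample table are hypothetical.

```wolfram
(* hypothetical sample table with 5 rows and 2 columns *)
data = Table[{i, i^2}, {i, 5}];
file = FileNameJoin[{$TemporaryDirectory, "meta-example.arrow"}];
Export[file, data];

(* metadata elements *)
rowCount = Import[file, "RowCount"]
columnCount = Import[file, "ColumnCount"]
Import[file, "Dimensions"]

(* data descriptor elements *)
Import[file, "ColumnLabels"]
Import[file, "ColumnTypes"]
```
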
Options

General Import options:

	IncludeMetaInformation	All	metadata types to import
	"UseMemoryMappedFile"	True	whether to use memory-mapped reader

General Export options:

"Compression"	None	compression method
	CompressionLevel	Automatic	compression level
	"Schema"	Automatic	schema used to construct Tabular object
	"Streamable"	False	whether to use the Arrow IPC streaming format

The following settings for "Compression" are supported:

	None	no compression
	"LZ4Frame"	LZ4 Frame compression
	"ZSTD"	ZSTD compression

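One way to compare the "Compression" settings is to export the same data once per setting and look at the resulting file sizes. This is a sketch under assumptions: the sample data and file names are invented, and the sizes depend on the data, so none are shown.

```wolfram
(* hypothetical sample data: 10000 rows, 2 columns *)
data = Table[{i, Mod[i, 7]}, {i, 10000}];
methods = {None, "LZ4Frame", "ZSTD"};

(* export once per compression setting and record the file size in bytes *)
sizes = Table[
   FileByteCount[
    Export[
     FileNameJoin[{$TemporaryDirectory,
       "compress-" <> ToString[m] <> ".arrow"}],
     data, "Compression" -> m]],
   {m, methods}];

(* pair each setting with its file size *)
AssociationThread[methods -> sizes]
```
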
Examples

Basic Examples (3)

Import a Tabular object from an Arrow IPC file:

Import the file summary:

Export a Tabular object to an Arrow IPC file:

Scope (3)

Import (3)

Show all elements available in the file:

By default, a Tabular object is returned:

Import column types:

Import Elements (14)

"ColumnCount" (1)

Get the number of columns:

"ColumnLabels" (1)

Read column names:

"ColumnTypes" (1)

Import column types:

"Data" (2)

Get the data from a file:

Import only selected rows:

Import only selected columns:

"Dataset" (2)

Get the data as a Dataset:

Import only selected rows:

Import only selected columns:

"Dimensions" (1)

Import data dimensions:

"MetaInformation" (1)

Import metadata:

"RowCount" (1)

Get the number of rows:

"Schema" (1)

Get the TabularSchema object:

"Summary" (1)

Get the file summary:

"Tabular" (2)

Get the data from a file as a Tabular object:

Import only selected rows:

Import only selected columns:

Import Options (3)

IncludeMetaInformation (1)

By default, all metadata stored in a file is imported and embedded in the Tabular object:

Do not import metadata:

"Schema" (1)

By default, column labels and their types stored in a file are used when Tabular or Dataset objects are imported:

Use the "Schema" option to specify column labels and types:

"UseMemoryMappedFile" (1)

By default, memory mapping is disabled. Use "UseMemoryMappedFile"->True to enable memory mapping:

Export Options (6)

"Compression" (2)

Compression is disabled by default:

Compare supported compression methods:

CompressionLevel (2)

By default, CompressionLevel is set to Automatic, which corresponds to a different default level for each compression method:

Use maximal compression for each method:

"Streamable" (2)

By default, Export uses the Arrow IPC file format:

Use the "Streamable" option to generate the Arrow IPC streaming format:
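
A minimal sketch of both export modes, assuming hypothetical file names and sample data. The .arrows extension is used for the streaming output only as an illustration.

```wolfram
(* hypothetical sample table *)
data = Table[{i, i^2}, {i, 5}];

(* default: Arrow IPC file format *)
fileFormat = Export[
   FileNameJoin[{$TemporaryDirectory, "table.arrow"}], data];

(* "Streamable" -> True: Arrow IPC streaming format *)
streamFormat = Export[
   FileNameJoin[{$TemporaryDirectory, "table.arrows"}], data,
   "Streamable" -> True];

(* both outputs can be imported again *)
Import[fileFormat]
Import[streamFormat]
```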
