Computation with Structured Datasets
The symbolic character of the Wolfram Language lets it support a flexible and general approach to structured datasets. Unifying relational (SQL-like) and hierarchical (NoSQL) approaches, the Wolfram Language incorporates a data query language that scales from direct in-memory computation to computations backed by external files or databases.
Dataset — a general hierarchical dataset containing nested lists and associations
Query — represent a hierarchical query on a dataset or other expression
{e1,e2,…} — a list of values (List)
<|key1->e1,key2->e2,…|> — an association of keys and values (Association)
dataset[…] — apply transformations to a dataset
dataset[[…]] — extract parts from a dataset
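As a minimal sketch of how these pieces fit together, the following builds a small Dataset from a list of associations and then queries it. The variable name people and the sample rows are hypothetical, chosen only for illustration.

```wolfram
(* Hypothetical sample data: a list of associations, one per row *)
people = Dataset[{
    <|"name" -> "Ada", "age" -> 36, "city" -> "London"|>,
    <|"name" -> "Blaise", "age" -> 39, "city" -> "Paris"|>,
    <|"name" -> "Carl", "age" -> 77, "city" -> "Berlin"|>
  }];

(* dataset[...] applies operators level by level:
   All keeps every row, "age" extracts that field from each row *)
people[All, "age"]

(* dataset[[...]] extracts parts: the "name" field of the second row *)
people[[2, "name"]]

(* Query represents the same operation as a standalone object
   that can be applied to ordinary lists and associations *)
Query[All, "age"][Normal[people]]
(* {36, 39, 77} *)
```

Normal converts a Dataset back to ordinary lists and associations, which is convenient when the result feeds into other functions.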
Explicit Parts of Datasets
dataset[[…,part,…]] — a numbered or named part at any level
All — all parts at a given level
Span (;;) — a span of parts at a given level
Keys, Values — keys, values in associations
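The part specifications above can be sketched on a small nested association. The dataset ds and its keys are hypothetical sample data.

```wolfram
(* Hypothetical sample data: an association of associations *)
ds = Dataset[<|
    "a" -> <|"x" -> 1, "y" -> 2|>,
    "b" -> <|"x" -> 3, "y" -> 4|>,
    "c" -> <|"x" -> 5, "y" -> 6|>
  |>];

ds[["b", "y"]]    (* named parts at two successive levels *)
ds[[All, "x"]]    (* the "x" entry from every row *)
ds[[1 ;; 2]]      (* the first two rows, selected by position *)
ds[Keys]          (* the row keys "a", "b", "c" *)
ds[Values]        (* the rows with their keys dropped *)
```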
Selections and Transformations
Select — parts selected to satisfy a criterion
SelectFirst ▪ Count ▪ Counts ▪ CountsBy ▪ GroupBy
Sort ▪ SortBy ▪ Union ▪ DeleteDuplicates
TakeLargest ▪ TakeSmallest ▪ TakeLargestBy ▪ TakeSmallestBy
Lookup — look up sets of values from associations
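A short sketch of several of these functions, applied to a list of associations. The data in rows is hypothetical; the same operators can also be given inside dataset[…], as the last line shows.

```wolfram
(* Hypothetical sample data *)
rows = {
   <|"name" -> "Ada",    "dept" -> "R&D",   "salary" -> 120|>,
   <|"name" -> "Blaise", "dept" -> "Sales", "salary" -> 90|>,
   <|"name" -> "Carl",   "dept" -> "R&D",   "salary" -> 150|>,
   <|"name" -> "Dana",   "dept" -> "Sales", "salary" -> 110|>
  };

(* Keep rows satisfying a criterion; #salary reads the "salary" key *)
Select[rows, #salary > 100 &]

(* Look up one field across all rows *)
Lookup[rows, "name"]
(* {"Ada", "Blaise", "Carl", "Dana"} *)

(* Count occurrences of each department *)
Counts[Lookup[rows, "dept"]]
(* <|"R&D" -> 2, "Sales" -> 2|> *)

(* Group by department, keep only salaries, and total each group *)
GroupBy[rows, Key["dept"] -> Key["salary"], Total]
(* <|"R&D" -> 270, "Sales" -> 200|> *)

(* Sort rows by salary, then take the two highest-paid rows *)
SortBy[rows, #salary &]
TakeLargestBy[rows, #salary &, 2]

(* The same selection in operator form inside a Dataset query *)
Dataset[rows][Select[#salary > 100 &], "name"]
```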
Missing Values
Missing — represent a missing value
DeleteMissing — delete missing values
MissingBehavior — control computations with missing values
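The following sketch shows how missing values typically arise and how they can be removed before a computation. The list readings is hypothetical sample data.

```wolfram
(* Hypothetical sample data with two missing entries *)
readings = {3.2, Missing["NotAvailable"], 4.1, Missing["NotAvailable"], 5.0};

(* Remove the missing entries *)
DeleteMissing[readings]
(* {3.2, 4.1, 5.} *)

(* Compute a statistic over the values that are present *)
Mean[DeleteMissing[readings]]

(* Looking up an absent key yields a Missing object ... *)
Lookup[<|"a" -> 1|>, "b"]
(* Missing["KeyAbsent", "b"] *)

(* ... unless a default value is supplied *)
Lookup[<|"a" -> 1|>, "b", 0]
(* 0 *)
```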
Relational Data
JoinAcross — combine tables by column
Merge — combine associations by key
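As an illustration, JoinAcross behaves like a relational join on a shared key, while Merge combines the values found under each key. The tables employees and offices are hypothetical.

```wolfram
(* Hypothetical sample tables sharing the key "id" *)
employees = {
   <|"id" -> 1, "name" -> "Ada"|>,
   <|"id" -> 2, "name" -> "Blaise"|>
  };
offices = {
   <|"id" -> 1, "city" -> "London"|>,
   <|"id" -> 2, "city" -> "Paris"|>
  };

(* Combine rows whose "id" values match (an inner join by default) *)
JoinAcross[employees, offices, "id"]

(* Merge associations, combining values under the same key with Total *)
Merge[{<|"a" -> 1, "b" -> 2|>, <|"a" -> 3|>}, Total]
(* <|"a" -> 4, "b" -> 2|> *)
```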
Custom Query Construction
Data Computations
Total ▪ Mean ▪ Median ▪ Min ▪ Max ▪ ...
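A minimal sketch of aggregating a column, assuming a hypothetical orders table. In a query, an aggregating operator such as Total is applied after the operators that follow it, so the column is extracted from each row first and then reduced.

```wolfram
(* Hypothetical sample data *)
orders = {
   <|"item" -> "pen",  "qty" -> 3|>,
   <|"item" -> "ink",  "qty" -> 1|>,
   <|"item" -> "pad",  "qty" -> 5|>
  };

(* Extract "qty" from every row, then total the results *)
Query[Total, "qty"][orders]
(* 9 *)

(* The same aggregation in Dataset form, here with Max *)
Dataset[orders][Max, "qty"]

(* Equivalent computation with ordinary functions *)
Total[Lookup[orders, "qty"]]
(* 9 *)
```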
Basic Structural Operations
Insert ▪ Delete ▪ Append ▪ Take ▪ Drop ▪ ...
Dataset Presentation
Grid ▪ Column ▪ Multicolumn ▪ TabView ▪ MenuView
Creating Associations
Association — turn a list of rules into an association
AssociationMap — create an association by applying a function to a list of keys
AssociationThread — create an association from a list of keys and a list of values
Counts, CountsBy — associate values with the number of times they occur
GroupBy — group values by collecting those sharing a criterion
PositionIndex — build an index of positions at which values occur
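Each of the constructors above can be tried on small inputs. The following examples use made-up values purely for illustration.

```wolfram
(* From a list of rules *)
Association[{"a" -> 1, "b" -> 2}]
(* <|"a" -> 1, "b" -> 2|> *)

(* Keys are the list elements; values come from applying the function *)
AssociationMap[StringLength, {"one", "three"}]
(* <|"one" -> 3, "three" -> 5|> *)

(* Pair a list of keys with a list of values *)
AssociationThread[{"x", "y"}, {10, 20}]
(* <|"x" -> 10, "y" -> 20|> *)

(* Count occurrences, directly or after applying a function *)
Counts[{"a", "b", "a"}]
(* <|"a" -> 2, "b" -> 1|> *)
CountsBy[{1, 2, 3, 4, 5}, EvenQ]
(* <|False -> 3, True -> 2|> *)

(* Collect elements that share the same value of a function *)
GroupBy[{1, 2, 3, 4, 5}, EvenQ]
(* <|False -> {1, 3, 5}, True -> {2, 4}|> *)

(* Index the positions at which each value occurs *)
PositionIndex[{"a", "b", "a"}]
(* <|"a" -> {1, 3}, "b" -> {2}|> *)
```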
Creating Datasets
SemanticImport — import a file as a Dataset
EntityValue — obtain data from the Wolfram Knowledgebase in the form of a Dataset
ResourceData — get data from the Wolfram Data Repository, often as a Dataset