Query

Query[operator1,operator2,]

represents a query that can be applied to a Dataset object, in which the successive operatori are applied at successively deeper levels.

Details and Options

  • Query can perform complex filtering, reorganization, and aggregation of arbitrary rectangular and hierarchical data, typically returning further rectangular or hierarchical data.
  • Query can operate on a Dataset object or an arbitrary nested expression consisting of lists and associations.
  • Query can be seen as a generalization of Part that allows computations to be performed in the process of traversing an expression or dataset.
  • The operatori can be any of the following forms:
  • All,i,i;;j,"key",Key[]part operators
    Select[],MaximalBy[],filtering operators
    Counts,Total,Mean,Max,aggregation operators
    Query[],{op1,op2,},subquery operators
    Function[],farbitrary functions
  • In Query[operator1,][expr], the operatori are applied at successively deeper levels in expr, but any given one may be applied either while "descending" into expr or while "ascending" out of it. In general, part specifications and filtering operators are "descending" operators. Aggregation operators, subquery operators, and arbitrary functions are "ascending" operators. Query[][expr] returns expr.
  • A "descending" operator is applied to corresponding parts of the original dataset, before subsequent operators are applied at deeper levels. Descending operators have the feature that they do not change the structure of deeper levels of the data when applied at a certain level. This ensures that subsequent operators will encounter subexpressions whose structure is identical to the corresponding levels of the original dataset. The simplest descending operator is All, which selects all parts at a given level and therefore leaves the structure of the data at that level unchanged.
  • An "ascending" operator is applied after all subsequent operators have been applied to deeper levels. Whereas descending operators correspond to the levels of the original data, ascending operators correspond to the levels of the result. Unlike descending operators, ascending operators do not necessarily preserve the structure of the data on which they operate. Unless an operator is specifically recognized to be descending, it is assumed to be ascending.
  • The "descending" part operators specify which elements to take at a level before applying any subsequent operators to deeper levels:
  • Allapply subsequent operators to each part of a list or association
    i;;jtake parts i through j and apply subsequent operators to each part
    itake only part i and apply subsequent operators to it
    "key",Key[key]take value of key in an association and apply subsequent operators to it
    Keystake keys of an association and apply subsequent operators to each key
    Valuestake values of an association and apply subsequent operators to each value
    {part1,part2,}take given parts and apply subsequent operators to each part
  • The "descending" filtering operators specify how to rearrange or filter elements at a level before applying subsequent operators to deeper levels:
  • Select[test]take only those parts of a list or association that satisfy test
    SelectFirst[test]take the first part that satisfies test
    KeySelect[test]take those parts of an association whose keys satisfy test
    MaximalBy[crit],MinimalBy[crit]take the parts for which criteria crit is minimal or maximal
    SortBy[crit]sort parts in order of crit
    KeySortBy[crit]sort parts of an association based on their keys, in order of crit
    DeleteDuplicatesBy[crit]take parts that are unique according to crit
    DeleteMissingdrop elements with head Missing
  • The "ascending" aggregation operators combine or summarize the results of applying subsequent operators to deeper levels:
  • Totaltotal all quantities in the result
    Min,Maxgive minimum, maximum quantity in the result
    Mean,Median,Quantile,give statistical summary of the result
    Histogram,ListPlot,calculate a visualization on the result
    Merge[f]merge common keys of associations in the result using function f
    Catenatecatenate the elements of lists or associations together
    Countsgive association that counts occurrences of values in the result
    CountsBy[crit]give association that counts occurrences of values according to crit
    CountDistinctgive number of distinct values in the result
    CountDistinctBy[crit]give number of distinct values in the result according to crit
  • The "ascending" subquery operators perform a subquery after applying subsequent operators to deeper levels:
  • Query[]perform a subquery on the result
    {op1,op2,}apply multiple operators at once to the result, yielding a list
    op1/* op2/* apply op1, then apply op2 at the same level, etc.
    <|key1op1,key2op2,|>apply multiple operators at once to the result, yielding an association with the given keys
    {key1op1,key2op2,}apply different operators to specific parts in the result
  • When one or more descending operators are composed with one or more ascending operators (e.g. desc/*asc), the descending part will be applied, then subsequent operators will be applied to deeper levels, and lastly the ascending part will be applied to the result.
  • The special descending operator GroupBy[spec] will introduce a new association at the level at which it appears and can be inserted or removed from an existing query without affecting the behavior of other operators.
  • The syntax GroupBy["string"] can be used as a synonym for GroupBy[Key["string"]]. The same syntax is also available for SortBy, CountsBy, MaximalBy, MinimalBy, and DeleteDuplicatesBy.
  • The following options can be given:
  • FailureAction "Abort"how to handle operators that fail
    MissingBehavior Automatichow to treat operations involving Missing
    PartBehavior Automatichow to resolve missing parts
  • Possible values for FailureAction include:
  • Noneignore all messages and failures
    "Abort"abort the entire query when a message is encountered (default)
    "Drop"drop the results of operations that issue messages
    "Encapsulate"wrap operations that issue messages in a Failure object
    "Replace"replace the results of operations that issue messages with Missing["Failed"]
  • The option MissingBehavior describes how numeric and other functions should treat expressions with head Missing. Possible values include:
  • Noneuse ordinary behavior of Missing
    Automaticinvoke special rules for Mean, Total, etc.
  • The option PartBehavior describes how operators that refer to nonexistent parts are evaluated. Possible values include:
  • Noneuse ordinary behavior of Part
    Automaticinvoke special rules for invalid i;;j, missing i, etc.

Examples

open allclose all

Basic Examples  (1)

Construct tabular data on which to perform queries:

Take a set of rows:

Take a specific row:

Take a specific element from a specific row:

Take the contents of a specific column:

Take a specific part within a column:

Take a subset of the rows and columns:

Apply a function to a specific column:

Partition the data based on a column, applying the rest of the query to each group:

Apply a function to each row:

Apply a function both to each row and to the entire result:

Apply a function f to every element in every row:

Apply specific operators to each column independently:

Construct a new table by specifying operators that will compute each column:

Use the same technique to rename columns:

Select specific rows based on a criterion:

Take the contents of a column after selecting the rows:

Take a subset of the available columns after selecting the rows:

Take a value from the first row satisfying a criterion:

Sort the rows by a criterion:

Take the rows that give the maximal value of a scoring function:

Delete rows that duplicate a criterion:

Compose an ascending and a descending operator to aggregate values of a column after filtering the rows:

Do the same thing by composing Total with the Query:

Options  (4)

FailureAction  (2)

Create sample data:

The setting FailureActionNone takes no special action on the result:

The setting FailureAction->"Encapsulate" encapsulates failed results in a Failure object:

The setting FailureAction->"Replace" replaces failed results with a placeholder. For the function Query, this placeholder is a Missing object:

The setting FailureAction->"Drop" causes failed results to be dropped from the final expression:

The setting FailureAction->"Abort" aborts the computation, returning a Failure object instead:

The setting FailureAction->{"Drop",f} can be used to perform an action before dropping a failed result:

The setting FailureAction->{"Replace",f} can be used to specify a replacement that is a function of the failure:

MissingBehavior  (1)

The default option value MissingBehavior->Automatic applies special rules to operators that encounter Missing:

Use MissingBehavior->None to specify the ordinary behavior of Missing for all operators:

PartBehavior  (1)

The default option value PartBehavior->Automatic invokes special behavior for operators that would otherwise fail:

Using PartBehavior->None specifies that the ordinary behavior of Part should be used:

Properties & Relations  (1)

Query is the operator form of the query language supported by Dataset:

Before being applied, Query expressions are "compiled" into ordinary compositions of ordinary Wolfram Language functions and their operator forms. To see the compiled form of a Query, use Normal:

Wolfram Research (2014), Query, Wolfram Language function, https://reference.wolfram.com/language/ref/Query.html.

Text

Wolfram Research (2014), Query, Wolfram Language function, https://reference.wolfram.com/language/ref/Query.html.

CMS

Wolfram Language. 2014. "Query." Wolfram Language & System Documentation Center. Wolfram Research. https://reference.wolfram.com/language/ref/Query.html.

APA

Wolfram Language. (2014). Query. Wolfram Language & System Documentation Center. Retrieved from https://reference.wolfram.com/language/ref/Query.html

BibTeX

@misc{reference.wolfram_2023_query, author="Wolfram Research", title="{Query}", year="2014", howpublished="\url{https://reference.wolfram.com/language/ref/Query.html}", note=[Accessed: 16-April-2024 ]}

BibLaTeX

@online{reference.wolfram_2023_query, organization={Wolfram Research}, title={Query}, year={2014}, url={https://reference.wolfram.com/language/ref/Query.html}, note=[Accessed: 16-April-2024 ]}