Dataset
Dataset[data]
represents a structured dataset based on a hierarchy of lists and associations.
Details and Options
- Dataset can represent not only full rectangular multidimensional arrays of data, but also arbitrary tree structures, corresponding to data with arbitrary hierarchical structure.
- Depending on the data it contains, a Dataset object typically displays as a table or grid of elements.
- Functions like Map, Select, etc. can be applied directly to a Dataset by writing Map[f,dataset], Select[dataset,crit], etc.
- Subsets of the data in a Dataset object can be obtained by writing dataset[[parts]].
- Dataset objects can also be queried using a specialized query syntax by writing dataset[query].
- While arbitrary nesting of lists and associations is possible, two-dimensional (tabular) forms are most commonly used.
- The following table shows the correspondence between the common display forms of a Dataset, the form of Wolfram Language expression it contains, and the logical interpretation of its structure as a table:
-
  {{◻,◻,◻}, {◻,◻,◻}, {◻,◻,◻}, {◻,◻,◻}}
      list of lists    a table without named rows and columns
  {<|"x"->◻,"y"->◻,…|>, <|"x"->◻,"y"->◻,…|>, <|"x"->◻,"y"->◻,…|>}
      list of associations    a table with named columns
  <|"a"->{◻,◻}, "b"->{◻,◻}, "c"->{◻,◻}, "d"->{◻,◻}|>
      association of lists    a table with named rows
  <|"a"-><|"x"->◻,"y"->◻|>, "b"-><|"x"->◻,"y"->◻|>, "c"-><|"x"->◻,"y"->◻|>|>
      association of associations    a table with named columns and named rows
- Dataset interprets nested lists and associations in a row-wise fashion, so that level 1 (the outermost level) of the data is interpreted as the rows of a table, and level 2 is interpreted as the columns.
- Named rows and columns correspond to associations at level 1 and 2, respectively, whose keys are strings that contain the names. Unnamed rows and columns correspond to lists at those levels.
- Rows and columns of a dataset can be exchanged by writing Transpose[dataset].
- The following options can be given:
-
  Alignment                {Left,Baseline}    horizontal and vertical alignments of items
  Background               None               background colors to use for items
  DatasetTheme             Automatic          overall theme for the dataset
  HeaderAlignment          {Left,Baseline}    horizontal and vertical alignments of headers
  HeaderBackground         Automatic          background colors to use for headers
  HeaderDisplayFunction    Automatic          function to use to format headers
  HeaderSize               Automatic          widths and heights of headers
  HeaderStyle              None               styles to use for headers
  HiddenItems              None               items to hide
  ItemDisplayFunction      Automatic          function to use to format items
  ItemSize                 Automatic          widths and heights of items
  ItemStyle                None               styles for columns and rows
  MaxItems                 Automatic          maximum number of items to display
- Settings for options except HiddenItems, MaxItems and DatasetTheme can be given as follows to apply separately to different items:
-
  spec                              apply spec to all items
  {spec_k}                          apply spec_k at successive levels
  {spec_1,spec_2,…,spec_n,rules}    also allow explicit rules for specific parts
- The spec_k can have the following forms:
-
  {s1,s2,…,sn}                    use s1 through sn, then use defaults
  {{c}}                           use c in all cases
  {{c1,c2}}                       alternate between c1 and c2
  {{c1,c2,…}}                     cycle through all ci
  {s,{c}}                         use s, then repeatedly use c
  {s1,{c},sn}                     use s1, then repeatedly use c, but use sn at the end
  {s1,s2,…,{c1,c2,…},sm,…,sn}     use the first sequence of si at the beginning, then cyclically use the ci, then use the last sequence of si at the end
  {s1,s2,…,{},sm,…,sn}            use the first sequence of si at the beginning and the last sequence at the end
- With settings of the form {s1,s2,…,{…},sm,…,sn}, if there are more si specified than items in the dataset, si from the beginning are used for the first items, and ones from the end are used for the last items.
- Rules have the form i->spec, where i specifies a position in the Dataset. Positions can be patterns.
- The position of an element can be read at the bottom of a dataset when you hover over the element.
- Settings for MaxItems can be given as follows:
-
  m                  display m rows
  {m1,m2,…,mn}       display mi items at dataset level i
- In MaxItems, Automatic indicates that the default number of items should be displayed.
- Settings for HiddenItems can be given as follows:
-
  i                  hide the item at position i
  {i1,i2,…,in}       hide the items at positions ik
  {…,i->False,…}     show the item at position i
  {…,i->True,…}      hide the item at position i
- Within a HiddenItems list, later settings override earlier ones.
- Individual settings for ItemDisplayFunction and HeaderDisplayFunction are pure functions that return the item to be displayed. The functions take three arguments: the item value, the item position and the dataset that contains the item.
- In some positions in ItemStyle and HeaderStyle settings, an explicit rule may be interpreted as i->spec. If a style option is intended instead, wrap the rule with Directive.
- In options such as Alignment, HeaderAlignment, ItemSize and HeaderSize, which may be list-valued, a top-level list is interpreted as a single option value if possible, otherwise as a list of values for successive Dataset levels.
- If the left-hand side of a rule is not a list, the setting is applied to any position that contains the left-hand side as a key or index.
- A pure function f that returns a setting can be used in place of any setting. The setting is given by f[item,position,dataset].
- Normal can be used to convert any Dataset object to its underlying data, which is typically a combination of lists and associations.
- The syntax dataset[[parts]] or Part[dataset,parts] can be used to extract parts of a Dataset.
- The parts that can be extracted from a Dataset include all ordinary specifications for Part.
- Unlike the ordinary behavior of Part, if a specified subpart of a Dataset is not present, Missing["PartAbsent",…] will be produced in that place in the result.
- The following part operations are commonly used to extract rows from tabular datasets:
-
  dataset[["name"]]            extract a named row (if applicable)
  dataset[[{"name1",…}]]       extract a set of named rows
  dataset[[1]]                 extract the first row
  dataset[[n]]                 extract the nth row
  dataset[[-1]]                extract the last row
  dataset[[m;;n]]              extract rows m through n
  dataset[[{n1,n2,…}]]         extract a set of numbered rows
- The following part operations are commonly used to extract columns from tabular datasets:
-
  dataset[[All,"name"]]            extract a named column (if applicable)
  dataset[[All,{"name1",…}]]       extract a set of named columns
  dataset[[All,1]]                 extract the first column
  dataset[[All,n]]                 extract the nth column
  dataset[[All,-1]]                extract the last column
  dataset[[All,m;;n]]              extract columns m through n
  dataset[[All,{n1,n2,…}]]         extract a subset of the columns
- Like Part, row and column operations can be combined. Some examples include:
-
  dataset[[n,m]]                     take the cell at the nth row and mth column
  dataset[[n,"colname"]]             extract the value of the named column in the nth row
  dataset[["rowname","colname"]]     take the cell at the named row and column
- The following operations can be used to remove the labels from rows and columns, effectively turning associations into lists:
-
  dataset[[Values]]            remove labels from rows
  dataset[[All,Values]]        remove labels from columns
  dataset[[Values,Values]]     remove labels from rows and columns
- The query syntax dataset[op1,op2,…] can be thought of as an extension of Part syntax to allow aggregations and transformations to be applied, as well as taking subsets of data.
- Some common forms of query include:
-
  dataset[f]               apply f to the entire table
  dataset[All,f]           apply f to every row in the table
  dataset[All,All,f]       apply f to every cell in the table
  dataset[f,n]             extract the nth column, then apply f to it
  dataset[f,"name"]        extract the named column, then apply f to it
  dataset[n,f]             extract the nth row, then apply f to it
  dataset["name",f]        extract the named row, then apply f to it
  dataset[{n->f}]          selectively map f onto the nth row
  dataset[All,{n->f}]      selectively map f onto the nth column
- Some more specialized forms of query include:
-
  dataset[Counts,"name"]               give counts of different values in the named column
  dataset[Count[value],"name"]         give number of occurrences of value in the named column
  dataset[CountDistinct,"name"]        count the number of distinct values in the named column
  dataset[MinMax,"name"]               give minimum and maximum values in the named column
  dataset[Mean,"name"]                 give the mean value of the named column
  dataset[Total,"name"]                give the total value of the named column
  dataset[Select[h]]                   extract those rows that satisfy condition h
  dataset[Select[h]/*Length]           count the number of rows that satisfy condition h
  dataset[Select[h],"name"]            select rows, then extract the named column from the result
  dataset[Select[h]/*f,"name"]         select rows, extract the named column, then apply f to it
  dataset[TakeLargestBy["name",n]]     give the n rows for which the named column is largest
  dataset[TakeLargest[n],"name"]       give the n largest values in the named column
- In dataset[op1,op2,…], the query operators opi are effectively applied at successively deeper levels of the data, but any given one may be applied either while "descending" into the data or while "ascending" out of it.
- The operators that make up a Dataset query fall into one of the following broad categories with distinct ascending and descending behavior:
-
  All,i,i;;j,"key",…          descending    part operators
  Select[f],SortBy[f],…       descending    filtering operators
  Counts,Total,Mean,…         ascending     aggregation operators
  Query[…],…                  ascending     subquery operators
  Function[…],f               ascending     arbitrary functions
- A descending operator is applied to corresponding parts of the original dataset, before subsequent operators are applied at deeper levels.
- Descending operators have the feature that they do not change the structure of deeper levels of the data when applied at a certain level. This ensures that subsequent operators will encounter subexpressions whose structure is identical to the corresponding levels of the original dataset.
- The simplest descending operator is All, which selects all parts at a given level and therefore leaves the structure of the data at that level unchanged. All can safely be replaced with any other descending operator to yield another valid query.
- An ascending operator is applied after all subsequent ascending and descending operators have been applied to deeper levels. Whereas descending operators correspond to the levels of the original data, ascending operators correspond to the levels of the result.
- Unlike descending operators, ascending operators do not necessarily preserve the structure of the data they operate on. Unless an operator is specifically recognized to be descending, it is assumed to be ascending.
- The descending part operators specify which elements to take at a level before applying any subsequent operators to deeper levels:
-
  All                  apply subsequent operators to each part of a list or association
  i;;j                 take parts i through j and apply subsequent operators to each part
  i                    take only part i and apply subsequent operators to it
  "key",Key[key]       take value of key in an association and apply subsequent operators to it
  Values               take values of an association and apply subsequent operators to each value
  {part1,part2,…}      take given parts and apply subsequent operators to each part
- The descending filtering operators specify how to rearrange or filter elements at a level before applying subsequent operators to deeper levels:
-
  Select[test]                               take only those parts of a list or association that satisfy test
  SelectFirst[test]                          take the first part that satisfies test
  KeySelect[test]                            take those parts of an association whose keys satisfy test
  TakeLargestBy[f,n],TakeSmallestBy[f,n]     take the n elements for which f[elem] is largest or smallest, in sorted order
  MaximalBy[crit],MinimalBy[crit]            take the parts for which the criterion crit is greater or less than for all other elements
  SortBy[crit]                               sort parts in order of crit
  KeySortBy[crit]                            sort parts of an association based on their keys, in order of crit
  DeleteDuplicatesBy[crit]                   take parts that are unique according to crit
  DeleteMissing                              drop elements with head Missing
- The syntax op1/*op2 can be used to combine two or more filtering operators into one operator that still operates at a single level.
- The ascending aggregation operators combine or summarize the results of applying subsequent operators to deeper levels:
-
  Total                              total all quantities in the result
  Min,Max                            give minimum, maximum quantity in the result
  Mean,Median,Quantile,…             give statistical summary of the result
  Histogram,ListPlot,…               calculate a visualization on the result
  Merge[f]                           merge common keys of associations in the result using function f
  Catenate                           catenate the elements of lists or associations together
  Counts                             give association that counts occurrences of values in the result
  CountsBy[crit]                     give association that counts occurrences of values according to crit
  CountDistinct                      give number of distinct values in the result
  CountDistinctBy[crit]              give number of distinct values in the result according to crit
  TakeLargest[n],TakeSmallest[n]     take the largest or smallest n elements
- The syntax op1/*op2 can be used to combine two or more aggregation operators into one operator that still operates at a single level.
- The ascending subquery operators perform a subquery after applying subsequent operators to deeper levels:
-
  Query[…]                       perform a subquery on the result
  {op1,op2,…}                    apply multiple operators at once to the result, yielding a list
  <|key1->op1,key2->op2,…|>      apply multiple operators at once to the result, yielding an association with the given keys
  {key1->op1,key2->op2,…}        apply different operators to specific parts in the result
- When one or more descending operators are composed with one or more ascending operators (e.g. desc/*asc), the descending part will be applied, then subsequent operators will be applied to deeper levels, and lastly, the ascending part will be applied to the result at that level.
- The special descending operator GroupBy[spec] will introduce a new association at the level at which it appears and can be inserted or removed from an existing query without affecting subsequent operators.
- Functions such as CountsBy, GroupBy, and TakeLargestBy normally take another function as one of their arguments. When working with associations in a Dataset, it is common to use this "by" function to look up the value of a column in a table.
- To facilitate this, Dataset queries allow the syntax "string" to mean Key["string"] in such contexts. For example, the query operator GroupBy["string"] is automatically rewritten to GroupBy[Key["string"]] before being executed.
- Similarly, the expression GroupBy[dataset,"string"] is rewritten as GroupBy[dataset,Key["string"]].
- Where possible, type inference is used to determine whether a query will succeed. Operations that are inferred to fail will result in a Failure object being returned without the query being performed.
- By default, if any messages are generated during a query, the query will be aborted and a Failure object containing the message will be returned.
- When a query returns structured data (e.g. a list or association, or nested combinations of these), the result will be given in the form of another Dataset object. Otherwise, the result will be given as an ordinary Wolfram Language expression.
- For more information about special behavior of Dataset queries, see the function page for Query.
- Import and SemanticImport can be used to import files as Dataset objects from formats such as "CSV" and "XLSX".
- Dataset objects can be exported with Export to formats such as "CSV", "XLSX" and "JSON".
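The four tabular forms and the conversion functions described in the notes above can be sketched as follows. This is a minimal illustration only: the variable names and sample values are made up for the example and do not come from the original page.

```wolfram
(* Hypothetical sample data; names and values are illustrative only *)
listOfLists   = Dataset[{{1, 2}, {3, 4}}];                          (* no named rows or columns *)
listOfAssocs  = Dataset[{<|"x" -> 1, "y" -> 2|>, <|"x" -> 3, "y" -> 4|>}];  (* named columns *)
assocOfLists  = Dataset[<|"a" -> {1, 2}, "b" -> {3, 4}|>];          (* named rows *)
assocOfAssocs = Dataset[<|"a" -> <|"x" -> 1, "y" -> 2|>,
                          "b" -> <|"x" -> 3, "y" -> 4|>|>];         (* named rows and columns *)

(* Normal recovers the underlying lists and associations *)
underlying = Normal[assocOfAssocs];

(* Look up a cell by row name and column name *)
cell = assocOfAssocs["b", "y"];    (* 4 *)

(* Values drops the row labels, leaving a list of associations *)
unlabeled = Normal[assocOfAssocs[Values]];

(* Transpose exchanges rows and columns *)
transposed = Transpose[listOfAssocs];
```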
Examples
Basic Examples (1)
Summary of the most common use cases
Create a simple Dataset with a list of associations:
https://wolfram.com/xid/0rs3seh0m-navlb2
https://wolfram.com/xid/0rs3seh0m-rsym41
https://wolfram.com/xid/0rs3seh0m-2ohky3
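As a rough sketch of this step, a small table can be built from a list of associations. The keys "a", "b" and "c" and all values here are hypothetical and are not taken from the original example cells.

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "b" -> "x", "c" -> 10|>,
    <|"a" -> 2, "b" -> "y", "c" -> 20|>,
    <|"a" -> 3, "b" -> "x", "c" -> 30|>,
    <|"a" -> 4, "b" -> "y", "c" -> 40|>
  }];

(* Count the rows with a query *)
rowCount = ds[Length];    (* 4 *)

(* Recover the underlying list of associations *)
rows = Normal[ds];
```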
Compute the Total of each column:
https://wolfram.com/xid/0rs3seh0m-rz266t
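A minimal sketch of totaling every column, assuming a hypothetical table whose columns are all numeric (Total is not meaningful for string columns):

```wolfram
(* Hypothetical numeric table; keys and values are illustrative only *)
nums = Dataset[{
    <|"a" -> 1, "b" -> 10|>,
    <|"a" -> 2, "b" -> 20|>,
    <|"a" -> 3, "b" -> 30|>
  }];

(* Total applied to the whole table adds the rows key by key *)
totals = Normal[nums[Total]];    (* <|"a" -> 6, "b" -> 60|> *)
```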
Scope (1)
Survey of the scope of standard use cases
Create a Dataset object from tabular data:
https://wolfram.com/xid/0rs3seh0m-5dy4of
https://wolfram.com/xid/0rs3seh0m-kak8wh
https://wolfram.com/xid/0rs3seh0m-hc6zps
A row is merely an association:
https://wolfram.com/xid/0rs3seh0m-h2wor0
Take a specific element from a specific row:
https://wolfram.com/xid/0rs3seh0m-poe6an
https://wolfram.com/xid/0rs3seh0m-0gwxk9
https://wolfram.com/xid/0rs3seh0m-dcqmtc
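Row and element extraction might look like the following sketch. The dataset is hypothetical; Normal is used only to show the underlying expression.

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "b" -> "x", "c" -> 10|>,
    <|"a" -> 2, "b" -> "y", "c" -> 20|>,
    <|"a" -> 3, "b" -> "x", "c" -> 30|>
  }];

(* The second row is an association of column names to values *)
row = Normal[ds[[2]]];           (* <|"a" -> 2, "b" -> "y", "c" -> 20|> *)

(* One cell, addressed by row number and column name *)
cell = Normal[ds[[2, "b"]]];     (* "y" *)

(* The same cell using query syntax *)
cellByQuery = Normal[ds[2, "b"]];
```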
Take the contents of a specific column:
https://wolfram.com/xid/0rs3seh0m-q0ufxb
https://wolfram.com/xid/0rs3seh0m-wxxunr
https://wolfram.com/xid/0rs3seh0m-l8tzhp
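A short sketch of column extraction on a hypothetical table, using both Part syntax and query syntax:

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "b" -> "x"|>,
    <|"a" -> 2, "b" -> "y"|>,
    <|"a" -> 3, "b" -> "x"|>
  }];

(* All rows, then the named column *)
colByPart  = Normal[ds[[All, "a"]]];    (* {1, 2, 3} *)
colByQuery = Normal[ds[All, "a"]];      (* {1, 2, 3} *)
```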
Take a specific part within a column:
https://wolfram.com/xid/0rs3seh0m-teycvn
Take a subset of the rows and columns:
https://wolfram.com/xid/0rs3seh0m-omnmwp
Apply a function to the contents of a specific column:
https://wolfram.com/xid/0rs3seh0m-9b1sxu
https://wolfram.com/xid/0rs3seh0m-oyvft2
https://wolfram.com/xid/0rs3seh0m-4mbqv5
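Applying an aggregation to one column can be sketched as below. The table is hypothetical; the operator comes first because it is applied while ascending, after the column has been extracted.

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "c" -> 10|>,
    <|"a" -> 2, "c" -> 20|>,
    <|"a" -> 3, "c" -> 30|>,
    <|"a" -> 4, "c" -> 40|>
  }];

total  = ds[Total, "a"];             (* 10 *)
mean   = ds[Mean, "c"];              (* 25 *)
minMax = Normal[ds[MinMax, "c"]];    (* {10, 40} *)
```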
Partition the dataset based on a column, applying further operators to each group:
https://wolfram.com/xid/0rs3seh0m-i9ynzo
https://wolfram.com/xid/0rs3seh0m-zstjms
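One way to sketch grouping is shown below, on a hypothetical table. GroupBy["b"] relies on the string shortcut for Key["b"] described in the notes, and the remaining operators act on each group.

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "b" -> "x"|>,
    <|"a" -> 2, "b" -> "y"|>,
    <|"a" -> 3, "b" -> "x"|>,
    <|"a" -> 4, "b" -> "y"|>
  }];

(* Group rows by column "b", then total column "a" within each group *)
sums = Normal[ds[GroupBy["b"], Total, "a"]];    (* <|"x" -> 4, "y" -> 6|> *)

(* Group rows by column "b", then count the rows in each group *)
sizes = Normal[ds[GroupBy["b"], Length]];       (* <|"x" -> 2, "y" -> 2|> *)
```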
Apply a function both to each row and to the entire result:
https://wolfram.com/xid/0rs3seh0m-5nu0nz
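A sketch of the two-operator form dataset[f,g], where g is applied to each row and f to the list of results. The table and the row function are hypothetical.

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "c" -> 10|>,
    <|"a" -> 2, "c" -> 20|>,
    <|"a" -> 3, "c" -> 30|>,
    <|"a" -> 4, "c" -> 40|>
  }];

(* Multiply the two columns within each row, then total the products *)
weighted = ds[Total, #a * #c &];    (* 1*10 + 2*20 + 3*30 + 4*40 = 300 *)
```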
Apply a function f to every element in every row:
https://wolfram.com/xid/0rs3seh0m-zr19lg
Apply functions to each column independently:
https://wolfram.com/xid/0rs3seh0m-xs1z34
Construct a new table by specifying operators that will compute each column:
https://wolfram.com/xid/0rs3seh0m-qpoklh
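A possible sketch of building new columns with an association of operators applied to every row. The column names "id" and "double" are invented for the example; a string on the right-hand side acts as a part operator, and a pure function receives the whole row.

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "c" -> 10|>,
    <|"a" -> 2, "c" -> 20|>
  }];

(* "id" copies column "a"; "double" is computed from column "c" *)
newTable = Normal[ds[All, <|"id" -> "a", "double" -> (2 #c &)|>]];
```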
Use the same technique to rename columns:
https://wolfram.com/xid/0rs3seh0m-06zz5g
Select specific rows based on a criterion:
https://wolfram.com/xid/0rs3seh0m-yyy007
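Row selection can be sketched as follows, using a hypothetical table and a made-up criterion on column "a":

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "c" -> 10|>,
    <|"a" -> 2, "c" -> 20|>,
    <|"a" -> 3, "c" -> 30|>,
    <|"a" -> 4, "c" -> 40|>
  }];

(* Keep only the rows whose "a" value exceeds 2 *)
selected = Normal[ds[Select[#a > 2 &]]];
```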
Take the contents of a column after selecting the rows:
https://wolfram.com/xid/0rs3seh0m-c1rddh
Take a subset of the available columns after selecting the rows:
https://wolfram.com/xid/0rs3seh0m-h1k0r3
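Because Select is a descending operator, the column specification that follows it applies to each surviving row. A sketch on a hypothetical table:

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "b" -> "x", "c" -> 10|>,
    <|"a" -> 2, "b" -> "y", "c" -> 20|>,
    <|"a" -> 3, "b" -> "x", "c" -> 30|>,
    <|"a" -> 4, "b" -> "y", "c" -> 40|>
  }];

(* Select rows, then take a single column *)
oneColumn = Normal[ds[Select[#a > 2 &], "c"]];           (* {30, 40} *)

(* Select rows, then keep a subset of the columns *)
someColumns = Normal[ds[Select[#a > 2 &], {"a", "c"}]];
```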
Take the first row satisfying a criterion:
https://wolfram.com/xid/0rs3seh0m-cyzpyo
https://wolfram.com/xid/0rs3seh0m-rb9ahp
https://wolfram.com/xid/0rs3seh0m-jlpchr
Take the rows that give the maximal value of a scoring function:
https://wolfram.com/xid/0rs3seh0m-5g5f4t
Give the top 3 rows according to a scoring function:
https://wolfram.com/xid/0rs3seh0m-dr8pxa
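Ranking rows by a score might be sketched like this. The table is hypothetical and column "c" stands in for the scoring function.

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "c" -> 10|>,
    <|"a" -> 2, "c" -> 20|>,
    <|"a" -> 3, "c" -> 30|>,
    <|"a" -> 4, "c" -> 40|>
  }];

(* Rows for which the score is maximal *)
best = Normal[ds[MaximalBy[#c &]]];

(* Top 3 rows by the "c" column, largest first *)
top3 = Normal[ds[TakeLargestBy["c", 3], "a"]];    (* {4, 3, 2} *)
```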
Delete rows that duplicate a criterion:
https://wolfram.com/xid/0rs3seh0m-bsnnjd
https://wolfram.com/xid/0rs3seh0m-4wm766
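A sketch of dropping rows that repeat a criterion, assuming a hypothetical table in which column "b" has repeated values. The first row for each distinct value is kept.

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "b" -> "x"|>,
    <|"a" -> 2, "b" -> "y"|>,
    <|"a" -> 3, "b" -> "x"|>,
    <|"a" -> 4, "b" -> "y"|>
  }];

(* Keep one row per distinct value of "b" *)
unique = Normal[ds[DeleteDuplicatesBy[#b &]]];
```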
Compose an ascending and a descending operator to aggregate values of a column after filtering the rows:
https://wolfram.com/xid/0rs3seh0m-e6f5ih
Do the same thing by applying Total after the query:
https://wolfram.com/xid/0rs3seh0m-myis0p
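The two approaches can be compared in a small sketch on a hypothetical table. In the composed form, Select is applied while descending and Total while ascending, after column "c" has been extracted.

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
ds = Dataset[{
    <|"a" -> 1, "c" -> 10|>,
    <|"a" -> 2, "c" -> 20|>,
    <|"a" -> 3, "c" -> 30|>,
    <|"a" -> 4, "c" -> 40|>
  }];

(* Filter rows, extract "c", then total, all in one query *)
composed = ds[Select[#a > 2 &] /* Total, "c"];    (* 70 *)

(* Same result by running a second query on the filtered column *)
twoStep = ds[Select[#a > 2 &], "c"][Total];       (* 70 *)
```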
Options (42)
Common values & functionality for each option
Alignment (2)
Background (10)
Give all dataset items a pink background:
https://wolfram.com/xid/0rs3seh0m-l5rbwg
https://wolfram.com/xid/0rs3seh0m-65def1
https://wolfram.com/xid/0rs3seh0m-yuuoro
https://wolfram.com/xid/0rs3seh0m-9locll
Pink and gray backgrounds for the first two rows:
https://wolfram.com/xid/0rs3seh0m-znmxi5
https://wolfram.com/xid/0rs3seh0m-xd5b3r
Alternating pink and gray rows:
https://wolfram.com/xid/0rs3seh0m-6vbua3
Alternating pink and gray columns with the first and last columns yellow:
https://wolfram.com/xid/0rs3seh0m-kurstx
Alternating pink and gray columns with the third column yellow:
https://wolfram.com/xid/0rs3seh0m-oilbca
https://wolfram.com/xid/0rs3seh0m-j1j5bu
Set background color by value:
https://wolfram.com/xid/0rs3seh0m-rl1fx7
Set background color by position:
https://wolfram.com/xid/0rs3seh0m-wjk7ld
https://wolfram.com/xid/0rs3seh0m-1novbe
DatasetTheme (4)
Use a theme with alternating row backgrounds:
https://wolfram.com/xid/0rs3seh0m-3j8653
Combine themes for a customized presentation:
https://wolfram.com/xid/0rs3seh0m-jbjcf1
Use a theme to emphasize low-level groupings:
https://wolfram.com/xid/0rs3seh0m-n5ibix
Use a theme to stripe long rows and columns to make them easier to follow:
https://wolfram.com/xid/0rs3seh0m-kzjbax
HeaderAlignment (2)
HeaderBackground (2)
HeaderDisplayFunction (1)
HeaderSize (1)
Make headers a fixed number of characters tall:
https://wolfram.com/xid/0rs3seh0m-i5i3sl
Make row headers a fixed number of characters wide:
https://wolfram.com/xid/0rs3seh0m-fqx54z
HeaderStyle (4)
Set one overall style for headers:
https://wolfram.com/xid/0rs3seh0m-ky6nr6
Use Directive to wrap multiple style directives:
https://wolfram.com/xid/0rs3seh0m-mim00t
Use a style from the current stylesheet:
https://wolfram.com/xid/0rs3seh0m-4h5gm1
https://wolfram.com/xid/0rs3seh0m-6j524z
HiddenItems (4)
https://wolfram.com/xid/0rs3seh0m-219yjm
https://wolfram.com/xid/0rs3seh0m-bnjbfn
Hide a row except for one item:
https://wolfram.com/xid/0rs3seh0m-cydrs6
Hide items with a given value:
https://wolfram.com/xid/0rs3seh0m-i4nnxa
ItemDisplayFunction (1)
ItemSize (3)
Make each item a fixed number of character widths wide:
https://wolfram.com/xid/0rs3seh0m-0pnf1v
https://wolfram.com/xid/0rs3seh0m-eo4h5s
https://wolfram.com/xid/0rs3seh0m-1e8txt
ItemStyle (6)
Set one overall style for dataset items:
https://wolfram.com/xid/0rs3seh0m-mua7m7
Use Directive to wrap multiple style directives:
https://wolfram.com/xid/0rs3seh0m-0k830b
Use a style from the current stylesheet:
https://wolfram.com/xid/0rs3seh0m-ii1tld
https://wolfram.com/xid/0rs3seh0m-la6zbv
https://wolfram.com/xid/0rs3seh0m-5hajre
https://wolfram.com/xid/0rs3seh0m-gzdymn
MaxItems (2)
Applications (3)
Sample problems that can be solved with this function
Tables (Lists of Associations) (1)
Load a dataset of passengers of the Titanic:
https://wolfram.com/xid/0rs3seh0m-cvay6w
https://wolfram.com/xid/0rs3seh0m-62oa7o
Get a random sample of passengers:
https://wolfram.com/xid/0rs3seh0m-de4dn
The underlying data is a list of associations:
https://wolfram.com/xid/0rs3seh0m-xwxunx
Count the number of passengers with a missing age:
https://wolfram.com/xid/0rs3seh0m-bp7e1l
Count the number of passengers in 1st, 2nd and 3rd class:
https://wolfram.com/xid/0rs3seh0m-2dpxmc
Get a histogram of passenger ages:
https://wolfram.com/xid/0rs3seh0m-w1osrd
Get a histogram of passenger ages, grouped by passenger class:
https://wolfram.com/xid/0rs3seh0m-naa2op
Find the age of the oldest passenger:
https://wolfram.com/xid/0rs3seh0m-mje4k6
Calculate the overall survival ratio:
https://wolfram.com/xid/0rs3seh0m-favsta
https://wolfram.com/xid/0rs3seh0m-ksj7p9
Show the survival ratio against sex and passenger class:
https://wolfram.com/xid/0rs3seh0m-inyxvn
Show the survival ratio as a function of age:
https://wolfram.com/xid/0rs3seh0m-x1iqwh
Indexed Tables (Associations of Associations) (1)
Load a dataset of planets and their properties:
https://wolfram.com/xid/0rs3seh0m-ctz8rn
Look up the mass of the Earth:
https://wolfram.com/xid/0rs3seh0m-hlzgzj
Get the subtable corresponding to moons of a specific planet:
https://wolfram.com/xid/0rs3seh0m-s0h4uf
Produce a dataset of the radii of the planets:
https://wolfram.com/xid/0rs3seh0m-dxfh3l
Visualize the radii of the planets:
https://wolfram.com/xid/0rs3seh0m-yipa3q
Produce a dataset of the number of moons of each planet:
https://wolfram.com/xid/0rs3seh0m-7i5ivm
https://wolfram.com/xid/0rs3seh0m-nfndg1
Obtain a list of the planets and their masses, sorted by increasing mass:
https://wolfram.com/xid/0rs3seh0m-0y2klt
Find the total mass of each planet's moons:
https://wolfram.com/xid/0rs3seh0m-4ueigh
Obtain a list of only those moons that have a mass larger than half that of Earth's moon:
https://wolfram.com/xid/0rs3seh0m-hrugpa
Find the heaviest moon of each planet:
https://wolfram.com/xid/0rs3seh0m-qr5p2x
Obtain a list of all planetary moons:
https://wolfram.com/xid/0rs3seh0m-0er8pl
Make a scatter plot of the mass against radius:
https://wolfram.com/xid/0rs3seh0m-gnfmxj
Calculate and make a histogram of the densities:
https://wolfram.com/xid/0rs3seh0m-7v1ezr
Compute the mean density for the moons of each planet:
https://wolfram.com/xid/0rs3seh0m-kp9mbm
Create a table comparing the density of each planet with the mean density of its moons:
https://wolfram.com/xid/0rs3seh0m-u0caes
https://wolfram.com/xid/0rs3seh0m-vftvc7
Hierarchical Data (Associations of Associations of Other Data) (1)
Load a dataset associating countries to their administrative divisions to their populations:
https://wolfram.com/xid/0rs3seh0m-gcdl5a
The underlying data is an association whose keys are countries and whose values are further associations between administrative divisions and their populations:
https://wolfram.com/xid/0rs3seh0m-y3nbgl
Look up the populations for a specific country:
https://wolfram.com/xid/0rs3seh0m-yrtyot
Give the total population (not all countries in the world are included in this dataset):
https://wolfram.com/xid/0rs3seh0m-h3eqie
Count the number of divisions within each country:
https://wolfram.com/xid/0rs3seh0m-4whhny
Total the number of divisions:
https://wolfram.com/xid/0rs3seh0m-dh9h4w
Build a histogram of the number of divisions:
https://wolfram.com/xid/0rs3seh0m-shk381
Calculate the total population of each country by adding the populations of each division:
https://wolfram.com/xid/0rs3seh0m-xjsaqz
Obtain the five most populous countries:
https://wolfram.com/xid/0rs3seh0m-7hop0s
Obtain the most populous divisions for each country:
https://wolfram.com/xid/0rs3seh0m-1tw9y2
Correlate the number of divisions a country has with its population:
https://wolfram.com/xid/0rs3seh0m-7rnalw
The underlying data being passed to ListLogPlot is an association of lists, each of length 2:
https://wolfram.com/xid/0rs3seh0m-k6b8p4
https://wolfram.com/xid/0rs3seh0m-gm8b10
Properties & Relations (4)
Properties of the function, and connections to other functions
Query is the operator form of the query language supported by Dataset:
https://wolfram.com/xid/0rs3seh0m-z6kolj
https://wolfram.com/xid/0rs3seh0m-ffeox5
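The relationship can be sketched by running the same operators three ways. The data is hypothetical; Query also works on ordinary lists and associations, not only on Dataset objects.

```wolfram
(* Hypothetical sample rows; keys and values are illustrative only *)
rows = {<|"a" -> 1|>, <|"a" -> 2|>, <|"a" -> 3|>, <|"a" -> 4|>};
ds = Dataset[rows];

viaDataset = ds[Total, "a"];            (* 10 *)
viaQuery   = Query[Total, "a"][ds];     (* 10 *)
onPlain    = Query[Total, "a"][rows];   (* 10 *)
```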
Use EntityValue to obtain a Dataset of the properties of Entity objects from the Wolfram Knowledgebase:
https://wolfram.com/xid/0rs3seh0m-46b22g
Plot the boiling point versus density on a log plot:
https://wolfram.com/xid/0rs3seh0m-rhdjih
Use SemanticImport to import a file as a Dataset:
https://wolfram.com/xid/0rs3seh0m-1f6bkb
Calculate the total quantity of sales:
https://wolfram.com/xid/0rs3seh0m-xlgoxc
Obtain a small sample of the Titanic passenger dataset:
https://wolfram.com/xid/0rs3seh0m-5kkgjn
Export the sample as "JSON" format:
https://wolfram.com/xid/0rs3seh0m-sqfkp5
Data with named columns can be more compactly represented if it is first transposed:
https://wolfram.com/xid/0rs3seh0m-763959
Export the sample as "CSV":
https://wolfram.com/xid/0rs3seh0m-uzjv80
Possible Issues (3)
Common pitfalls and unexpected behavior
Data without a consistent structure will not usually format in the same way as structured data:
https://wolfram.com/xid/0rs3seh0m-mp1u21
https://wolfram.com/xid/0rs3seh0m-ygkn7e
If a sub-operation of a query is inferred to fail, the entire query will not be performed, and a Failure object will be returned:
https://wolfram.com/xid/0rs3seh0m-hwtoeu
By default, if any messages are generated during an operation on a Dataset, a Failure object will be returned:
https://wolfram.com/xid/0rs3seh0m-thz3qy
To specify different behavior, use an explicit Query expression in conjunction with the option FailureAction:
https://wolfram.com/xid/0rs3seh0m-gex67f
Neat Examples (2)
Surprising or curious use cases
Calculate the survival likelihood of the characters Jack Dawson and Rose DeWitt Bukater from the movie Titanic by matching them with real passengers:
https://wolfram.com/xid/0rs3seh0m-6hlnep
https://wolfram.com/xid/0rs3seh0m-to7z5e
https://wolfram.com/xid/0rs3seh0m-cisbx7
https://wolfram.com/xid/0rs3seh0m-z48fc1
https://wolfram.com/xid/0rs3seh0m-r6rsyf
Make an interactive control that plots a heat map of a country's divisions when a country is selected:
https://wolfram.com/xid/0rs3seh0m-v7wm7f
https://wolfram.com/xid/0rs3seh0m-8cf3t0
https://wolfram.com/xid/0rs3seh0m-43i58u
Wolfram Research (2014), Dataset, Wolfram Language function, https://reference.wolfram.com/language/ref/Dataset.html (updated 2021).