SemanticImport

SemanticImport[file]
attempts to import a file semantically to give a Dataset object.

SemanticImport[file,type]
attempts to interpret all elements in the file as being of the specified type.

SemanticImport[file,{type1,type2,…}]
attempts to interpret elements in successive columns as being of the specified types.


SemanticImport[file,<|key1->type1,key2->type2,…|>]
keeps only the columns specified by their positions or names.

SemanticImport[file,typespec,form]
puts the result in the specified form.

Details and Options

  • SemanticImport is primarily intended for one- and two-dimensional arrays of elements.
  • SemanticImport can use free-form linguistics to interpret elements in the structure it is given.
  • Types of objects returned include numbers, Quantity objects, Entity objects, DateObject, GeoPosition, etc.
  • SemanticImport makes detailed assumptions, for example about date formats, by looking at all elements in particular rows or columns of the input.
  • Possible values for type include:
    Automatic                 choose type automatically
    "String"                  Unicode string
    "Number"                  number in any standard format
    "Integer"                 integer in decimal notation
    "Real"                    real in decimal notation
    "Quantity"                quantity with units
    "Currency"                currency amount
    "Date"                    date in any standard format
    "DateTime"                date and time
    "Time"                    time of day
    "GeoCoordinates"          geo position specified as latitude, longitude
    "URL"                     correctly formatted URL
    "EmailAddress"            correctly formatted email address
    "Country"                 country given in natural language
    "AdministrativeDivision"  administrative division (US states only in this version) given in natural language
    "City"                    city given in natural language
    "Person"                  person given in natural language
    None                      skip a column
    ispec                     any basic form used by Interpreter
  • The following options can be given to indicate features of the input:
    CharacterEncoding   Automatic   assumed encoding of input file
    Delimiters          Automatic   delimiters between elements
    HeaderLines         Automatic   line numbers to treat as headers
    ExcludedLines       {}          lines to exclude from result
    MissingDataRules    {}          rules for replacing data to be considered "missing"
  • Possible values for form include:
    "Dataset"        a row-oriented dataset
    "List"           a single column as a list
    "Columns"        a list of columns, each given as a list
    "NamedColumns"   an association associating column name with list of contents
    "Rows"           a list of rows, each given as a list
    "NamedRows"      a list of rows, each given as an association from column name to content
  • When elements cannot be interpreted, forms returned in their place include:
    Missing["Empty"]                   an empty or whitespace element
    Missing["Invalid","string"]        data with invalid or meaningless fields
    Missing["Unrecognized","string"]   element that could not be parsed
    Missing["ByDesignation",value]     an element matching MissingDataRules
    Missing[custom]                    a Missing[] provided through MissingDataRules
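
The input options and the Missing forms above can be exercised together. The following is a minimal sketch, not taken from the original page: the file name and contents are hypothetical, and the rule form given to MissingDataRules (string -> Missing[...]) is an assumption inferred from the Missing[custom] entry.

```wolfram
(* Hypothetical semicolon-delimited file; name and contents are assumptions *)
file = FileNameJoin[{$TemporaryDirectory, "semantic-import-options.txt"}];
Export[file, "city;amount\nBoston;12\nChicago;N/A\nSeattle;\ntotal;12", "Text"];

rows = SemanticImport[file, {"City", "Number"}, "Rows",
   Delimiters -> ";",       (* fields are separated by semicolons *)
   HeaderLines -> 1,        (* treat line 1 as the header *)
   ExcludedLines -> {5},    (* drop the trailing "total" line *)
   MissingDataRules -> {"N/A" -> Missing["NotAvailable"]}];  (* assumed rule form *)

(* Count the cells that came back as Missing[...] *)
Count[rows, _Missing, {2}]

(* Keep only the rows in which every element was interpreted *)
Select[rows, FreeQ[#, _Missing] &]
```

Counting and filtering on the pattern _Missing avoids depending on which particular Missing form a given cell produced.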

Examples

Basic Examples (8)

Import a file, automatically detecting and interpreting dates and cities:
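
A minimal sketch of such an import, assuming a hypothetical CSV file written to the temporary directory (the file name and contents are illustrative, not the file used on the original page):

```wolfram
(* Hypothetical sample file; name and contents are assumptions for illustration *)
file = FileNameJoin[{$TemporaryDirectory, "semantic-import-demo.csv"}];
Export[file, "city,date,amount\nBoston,2014-07-09,12.5\nChicago,2014-08-01,7\nSeattle,2014-09-15,3", "Text"];

(* With no type specification, column types are chosen automatically *)
ds = SemanticImport[file]
```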


Columns shown in bold correspond to semantic objects in the Wolfram Language:
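
One way to check which elements became semantic objects is to look at their heads. This sketch assumes the same hypothetical sample file as an illustration; the column names are not from the original page.

```wolfram
(* Hypothetical sample file; name and contents are assumptions for illustration *)
file = FileNameJoin[{$TemporaryDirectory, "semantic-import-demo.csv"}];
Export[file, "city,date,amount\nBoston,2014-07-09,12.5\nChicago,2014-08-01,7\nSeattle,2014-09-15,3", "Text"];
ds = SemanticImport[file];

(* Take the first row as an association and map Head over its values *)
row = Normal[ds[1]];
Head /@ row
```

Where interpretation succeeds, heads such as Entity or DateObject would be expected in place of String.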


Import a file as a single column of administrative divisions:
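
A sketch under the assumption of a hypothetical one-column text file of state names (file name and contents are illustrative):

```wolfram
(* Hypothetical one-column file; name and contents are assumptions *)
file = FileNameJoin[{$TemporaryDirectory, "semantic-import-states.txt"}];
Export[file, "Illinois\nCalifornia\nTexas", "Text"];

(* Interpret every element as an administrative division and return a flat list *)
SemanticImport[file, "AdministrativeDivision", "List"]
```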


Import a file with the specified column types:
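
A sketch with one type per column, assuming a hypothetical three-column CSV file (file name and contents are illustrative):

```wolfram
(* Hypothetical sample file; name and contents are assumptions for illustration *)
file = FileNameJoin[{$TemporaryDirectory, "semantic-import-demo.csv"}];
Export[file, "city,date,amount\nBoston,2014-07-09,12.5\nChicago,2014-08-01,7\nSeattle,2014-09-15,3", "Text"];

(* Types are applied to successive columns, left to right *)
SemanticImport[file, {"City", "Date", "Number"}]
```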


Import only some columns of a file, in the specified format, using column numbers:
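
A sketch that selects columns by position through an association, assuming a hypothetical three-column CSV file (file name and contents are illustrative):

```wolfram
(* Hypothetical sample file; name and contents are assumptions for illustration *)
file = FileNameJoin[{$TemporaryDirectory, "semantic-import-demo.csv"}];
Export[file, "city,date,amount\nBoston,2014-07-09,12.5\nChicago,2014-08-01,7\nSeattle,2014-09-15,3", "Text"];

(* Keep columns 1 and 3 only, and return the result as a list of rows *)
SemanticImport[file, <|1 -> "City", 3 -> "Number"|>, "Rows"]
```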


Import only some columns of a file, in the specified format, using column names:
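
A sketch that selects columns by header name, assuming a hypothetical CSV file whose first line holds the names "city", "date" and "amount" (all illustrative):

```wolfram
(* Hypothetical sample file; name and contents are assumptions for illustration *)
file = FileNameJoin[{$TemporaryDirectory, "semantic-import-demo.csv"}];
Export[file, "city,date,amount\nBoston,2014-07-09,12.5\nChicago,2014-08-01,7\nSeattle,2014-09-15,3", "Text"];

(* Keys are column names taken from the header line *)
SemanticImport[file, <|"city" -> "City", "amount" -> "Number"|>, "NamedRows"]
```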


Import only some columns, specifying None for columns that should be dropped:
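
A sketch that drops the middle column by giving None in its position, assuming a hypothetical three-column CSV file (file name and contents are illustrative):

```wolfram
(* Hypothetical sample file; name and contents are assumptions for illustration *)
file = FileNameJoin[{$TemporaryDirectory, "semantic-import-demo.csv"}];
Export[file, "city,date,amount\nBoston,2014-07-09,12.5\nChicago,2014-08-01,7\nSeattle,2014-09-15,3", "Text"];

(* None skips the second column *)
SemanticImport[file, {"City", None, "Number"}]
```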


Import a file as a list of rows:
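
A sketch using automatic type selection with the "Rows" form, assuming a hypothetical CSV file (file name and contents are illustrative):

```wolfram
(* Hypothetical sample file; name and contents are assumptions for illustration *)
file = FileNameJoin[{$TemporaryDirectory, "semantic-import-demo.csv"}];
Export[file, "city,date,amount\nBoston,2014-07-09,12.5\nChicago,2014-08-01,7\nSeattle,2014-09-15,3", "Text"];

(* Automatic leaves type detection on; "Rows" returns a list of lists *)
SemanticImport[file, Automatic, "Rows"]
```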


Import a file as a list of columns:
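
A sketch that imports the columns and assigns each one to its own symbol, assuming a hypothetical three-column CSV file (the file name, contents and symbol names are illustrative):

```wolfram
(* Hypothetical sample file; name and contents are assumptions for illustration *)
file = FileNameJoin[{$TemporaryDirectory, "semantic-import-demo.csv"}];
Export[file, "city,date,amount\nBoston,2014-07-09,12.5\nChicago,2014-08-01,7\nSeattle,2014-09-15,3", "Text"];

(* "Columns" returns one list per column, so the result can be destructured *)
{cities, dates, amounts} = SemanticImport[file, Automatic, "Columns"];

(* Inspect each column separately *)
cities
dates
amounts
```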

Introduced in 2014 (10.0)