MongoLink Introduction

MongoLink is a set of tools for working with MongoDB. This tutorial shows how to perform the most common MongoDB operations using MongoLink.

This tutorial assumes that a MongoDB server is running on your local machine at the default host and port. For platform-dependent instructions on running a MongoDB server locally, see the MongoDB installation documentation.

Making a Local Connection

Load MongoLink:

Create a client connection using the default host "localhost" and port 27017, which a MongoDB server running on your local machine uses unless configured otherwise:

The port and host can also be explicitly specified:

Or use the MongoDB Connection String URI format:
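
A minimal sketch of the connection forms described in this section. The function name MongoConnect and the association keys "Host" and "Port" do not appear in the text above; they are given here as recalled from the MongoLink function reference and should be checked against your installed version:

```wolfram
(* Load the package; the context name MongoLink` follows this tutorial's title *)
Needs["MongoLink`"]

(* Default connection: host "localhost", port 27017 *)
client = MongoConnect[]

(* Explicit host and port; the key names "Host" and "Port" are an assumption *)
client = MongoConnect[<|"Host" -> "localhost", "Port" -> 27017|>]

(* MongoDB Connection String URI format *)
client = MongoConnect["mongodb://localhost:27017"]
```

All three calls are intended to reach the same local server, so only one of them is needed in practice.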

Connecting to a Database

A MongoDB server can host multiple independent databases. List the available databases on the server:

Connect to a database:

This is equivalent to the function:
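
A minimal sketch of the database steps in this section. The names MongoConnect, MongoGetDatabaseNames and MongoGetDatabase are assumptions recalled from the MongoLink function reference, and the database name "WolframTestDB" is hypothetical:

```wolfram
Needs["MongoLink`"]
client = MongoConnect[];

(* List the names of the databases available on the server *)
MongoGetDatabaseNames[client]

(* Get a database by applying the client to a database name... *)
db = client["WolframTestDB"]

(* ...or with the equivalent function form *)
db = MongoGetDatabase[client, "WolframTestDB"]
```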

Connecting to a Collection

A collection is a group of documents stored in a database. Getting a collection works the same way as getting a database:

The above syntax is equivalent to:
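
A minimal sketch of both ways of getting a collection. The collection name "WolframTest" is the one used later in this tutorial; the database name "WolframTestDB" is hypothetical, and MongoConnect, MongoGetDatabase and MongoGetCollection are assumptions recalled from the MongoLink function reference:

```wolfram
Needs["MongoLink`"]
db = MongoGetDatabase[MongoConnect[], "WolframTestDB"];

(* Get a collection by applying the database to a collection name... *)
coll = db["WolframTest"]

(* ...or with the equivalent function form *)
coll = MongoGetCollection[db, "WolframTest"]
```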

Note: databases and collections are created lazily, so getting a collection or database does not perform any operations on the MongoDB server. They are only created once the first document is inserted into them.

Documents

A document in MongoDB can be viewed as a possibly nested Association whose keys must be strings and whose values are limited to a small set of types (for example, strings, lists, integers and dates). One simple example of a document:
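
A minimal sketch of such a document, using only built-in Wolfram Language constructs. The field names "age", "sex" and "date" are the ones referred to later in this tutorial; the remaining keys and all values are hypothetical:

```wolfram
(* A nested Association with string keys and simple value types *)
doc = <|
  "name" -> "Tom",                   (* string *)
  "age" -> 3,                        (* integer *)
  "sex" -> "male",                   (* string *)
  "date" -> Now,                     (* date *)
  "tags" -> {"indoor", "tabby"},     (* list *)
  "owner" -> <|"name" -> "Alice"|>   (* nested document *)
|>

(* Check that every top-level key is a string; returns True *)
AllTrue[Keys[doc], StringQ]
```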

For a list of the available types that can be used as values, see the documentation for the "BSON" format.

Inserting a Single Document into a Collection

Insert the previous document into the "WolframTest" collection:
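
A minimal sketch of the insertion, assuming the function is named MongoCollectionInsert (recalled from the MongoLink function reference, not stated in the text above). The database name "WolframTestDB" and the document values are hypothetical:

```wolfram
Needs["MongoLink`"]
coll = MongoGetCollection[
   MongoGetDatabase[MongoConnect[], "WolframTestDB"], "WolframTest"];

(* The document to insert; values are illustrative only *)
doc = <|"name" -> "Tom", "age" -> 3, "sex" -> "male", "date" -> Now|>;

(* Insert the document; the returned object describes the result of the write *)
result = MongoCollectionInsert[coll, doc]
```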

Get a list of the inserted document IDs:

Note: Every MongoDB document must have an "_id" key. If this key is missing from the document being inserted, it is automatically added to the document with a value of type BSONObjectID.

A BSONObjectID object contains various metadata related to its creation:

Inserting Multiple Documents into a Collection

Inserting many documents in a single call is more efficient than inserting them one at a time. Create a list of documents:

Insert these two documents into the collection:
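
A minimal sketch of a bulk insert, assuming that MongoCollectionInsert (a name recalled from the MongoLink function reference) accepts a list of documents as well as a single document. The database name "WolframTestDB" and the two documents are hypothetical:

```wolfram
Needs["MongoLink`"]
coll = MongoGetCollection[
   MongoGetDatabase[MongoConnect[], "WolframTestDB"], "WolframTest"];

(* Two illustrative documents *)
docs = {
   <|"name" -> "Felix", "age" -> 5, "sex" -> "male", "date" -> Now|>,
   <|"name" -> "Luna", "age" -> 7, "sex" -> "female", "date" -> Now|>
   };

(* Insert both documents in one call; list input is an assumption *)
MongoCollectionInsert[coll, docs]
```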

Getting a Single Document

The simplest query is performed with MongoCollectionFindOne, which returns a single document from the collection:

We can restrict the query to a document describing a three-year-old cat:
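
A minimal sketch of both calls. It assumes that the query is passed as a second argument in the form of an Association, and that the stored documents describe cats with an integer "age" field; the database name "WolframTestDB" is hypothetical:

```wolfram
Needs["MongoLink`"]
coll = MongoGetCollection[
   MongoGetDatabase[MongoConnect[], "WolframTestDB"], "WolframTest"];

(* No query: return a single document from the collection *)
MongoCollectionFindOne[coll]

(* Query document: return one document whose "age" field equals 3 *)
MongoCollectionFindOne[coll, <|"age" -> 3|>]
```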

Unwanted keys can be excluded from the result using a projection. Exclude the "date" and "sex" fields:
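
A minimal sketch of a projection, assuming that MongoCollectionFindOne takes the projection document as a third argument and that mapping a field name to False excludes it (MongoDB itself treats false or 0 as exclusion). The database name "WolframTestDB" is hypothetical:

```wolfram
Needs["MongoLink`"]
coll = MongoGetCollection[
   MongoGetDatabase[MongoConnect[], "WolframTestDB"], "WolframTest"];

(* Same query as before, but drop "date" and "sex" from the returned document *)
MongoCollectionFindOne[coll,
  <|"age" -> 3|>,
  <|"date" -> False, "sex" -> False|>
 ]
```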

This can be used to speed up the transfer of documents from the server, as unwanted parts of a document need not be transferred.

Note: for more information on building queries, the tutorial "Query Documents" from the MongoDB documentation might be useful.

Getting Multiple Documents

MongoCollectionFind has the same syntax as MongoCollectionFindOne, but instead of returning a document, it returns a MongoCursor, appropriate for handling large datasets:

Calling Read (or equivalently MongoCursorNext) on a MongoCursor gets the next document:

Calling ReadList (or equivalently MongoCursorToArray) gets a list of all remaining documents:
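
A minimal sketch of creating a cursor and reading from it, using the functions named above. It assumes that MongoCollectionFind, like MongoCollectionFindOne, can be called without a query to match every document; the database name "WolframTestDB" is hypothetical:

```wolfram
Needs["MongoLink`"]
coll = MongoGetCollection[
   MongoGetDatabase[MongoConnect[], "WolframTestDB"], "WolframTest"];

(* Create a cursor over all documents in the collection *)
cursor = MongoCollectionFind[coll]

(* Get the next document; MongoCursorNext[cursor] is the equivalent form *)
Read[cursor]

(* Get all remaining documents; MongoCursorToArray[cursor] is the equivalent form *)
ReadList[cursor]
```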

Once all documents have been read from the cursor, Null is returned if Read is called again:

To read the documents again, a new cursor needs to be created:

All documents in a cursor can also be read into a Dataset:
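
A minimal sketch of reading a fresh cursor into a Dataset, assuming that Dataset can be applied directly to a MongoCursor. If it cannot in your version, Dataset[ReadList[cursor]] is a fallback that uses only functions named in this tutorial. The database name "WolframTestDB" is hypothetical:

```wolfram
Needs["MongoLink`"]
coll = MongoGetCollection[
   MongoGetDatabase[MongoConnect[], "WolframTestDB"], "WolframTest"];

(* A new cursor is needed because an exhausted cursor cannot be rewound *)
cursor = MongoCollectionFind[coll];

(* Read every remaining document into a Dataset; direct application is an assumption *)
Dataset[cursor]
```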

The real power of cursors is that every document in a collection can be processed without having to load all documents into memory:
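
A minimal sketch of such a streaming loop. It relies on the behavior described above, that Read returns Null once the cursor is exhausted, and assumes that every document has an integer "age" field; the database name "WolframTestDB" is hypothetical:

```wolfram
Needs["MongoLink`"]
coll = MongoGetCollection[
   MongoGetDatabase[MongoConnect[], "WolframTestDB"], "WolframTest"];

cursor = MongoCollectionFind[coll];

(* Accumulate a running total while holding only one document at a time *)
totalAge = 0;
While[(doc = Read[cursor]) =!= Null,
  totalAge += doc["age"]
 ];

totalAge
```

Only the current document and the running total are kept in the kernel, so memory use does not grow with the size of the collection.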

This allows for the handling of massive datasets.

Sampling From a Collection

Draw a random sample of documents from a collection:

Read the cursor into a Dataset:
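
One way to sketch the sampling step is through MongoDB's "$sample" aggregation stage. This is an assumption: the original input may have used a different function, and MongoCollectionAggregate is a name recalled from the MongoLink function reference. The sample size of 2 and the database name "WolframTestDB" are hypothetical:

```wolfram
Needs["MongoLink`"]
coll = MongoGetCollection[
   MongoGetDatabase[MongoConnect[], "WolframTestDB"], "WolframTest"];

(* Aggregation pipeline with a single $sample stage that picks 2 random documents *)
cursor = MongoCollectionAggregate[coll, {<|"$sample" -> <|"size" -> 2|>|>}];

(* Read the sampled documents into a Dataset *)
Dataset[ReadList[cursor]]
```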

Modifying Documents

Modify all documents with "age" greater than 4 so that "age" becomes 10:
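
A minimal sketch of this update using the standard MongoDB operators "$gt" and "$set". The function name MongoCollectionUpdateMany and its argument order (collection, filter, update) are assumptions recalled from the MongoLink function reference; the database name "WolframTestDB" is hypothetical:

```wolfram
Needs["MongoLink`"]
coll = MongoGetCollection[
   MongoGetDatabase[MongoConnect[], "WolframTestDB"], "WolframTest"];

MongoCollectionUpdateMany[coll,
  <|"age" -> <|"$gt" -> 4|>|>,    (* filter: documents with "age" greater than 4 *)
  <|"$set" -> <|"age" -> 10|>|>   (* update: set "age" to 10 *)
 ]
```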

Deleting Documents

Warning: deleting documents is dangerous and cannot easily be undone. Delete all documents in a collection with "age" of 3:

Delete all documents in a collection:
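
A minimal sketch of both deletions described in this section, assuming the function is named MongoCollectionDeleteMany (recalled from the MongoLink function reference) and takes a filter document as its second argument. The database name "WolframTestDB" is hypothetical:

```wolfram
Needs["MongoLink`"]
coll = MongoGetCollection[
   MongoGetDatabase[MongoConnect[], "WolframTestDB"], "WolframTest"];

(* Delete every document whose "age" field equals 3 *)
MongoCollectionDeleteMany[coll, <|"age" -> 3|>]

(* An empty filter matches every document, so this empties the collection *)
MongoCollectionDeleteMany[coll, <||>]
```

Running the same filter through MongoCollectionFind first is a cheap way to check which documents a deletion would remove.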
