"ByteTrie" (Data Structure)

"ByteTrie"

represents a trie where the members are sequences of bytes.

Details

  • A byte trie is useful for efficient insertion as well as membership testing:
  • CreateDataStructure["ByteTrie", start, end]create a new empty "ByteTrie" for sequences of bytes that range from start to end
    Typed[x,"ByteTrie"]give x the type "ByteTrie"
  • For a data structure of type "ByteTrie", the following operations can be used:
  • ds["ByteLists"]return the byte sequences stored in ds represented as a list of lists of bytestime: O(n)
    ds["ByteLists",s]return the byte sequences stored in ds that start with the sequence stime: O(n)
    ds["ByteRange"]the range of bytes that ds can holdtime: O(1)
    ds["Copy"]return a copy of dstime: O(n)
    ds["EmptyQ"]True, if ds has no elementstime: O(1)
    ds["FreeQ",s]True , if the byte sequence s is not stored in dstime: O(log n)
    ds["Insert",s]insert the byte sequence s in dstime: O(log n)
    ds["Length"]the number of byte sequences stored in dstime: O(1)
    ds["MemberQ",s]True, if the byte sequence s is stored in dstime: O(log n)
    ds["NumericArrays"]return the byte sequences stored in ds represented as a list of numeric arraystime: O(n)
    ds["NumericArrays",s]return the byte sequences stored in ds that start with the sequence stime: O(n)
    ds["Strings"]return the byte sequences stored in ds represented as a list of stringstime: O(n)
    ds["Strings",s]return the byte sequences stored in ds that start with the sequence stime: O(n)
  • The following functions are also supported:
  • dsi===dsjTrue, if dsi equals dsj
    FullForm[ds]full form of ds
    Information[ds]information about ds
    InputForm[ds]input form of ds
    Normal[ds]convert ds to a normal expression
  • The byte sequences stored in a "ByteTrie" data structure can be represented as lists of bytes, strings or numeric arrays.
  • A trie uses a prefix to store its elements so it never suffers from the collisions that are a problem for other data structures, such as those that use hash computations.

Examples

open allclose all

Basic Examples  (6)

A new "ByteTrie" can be created with CreateDataStructure:

Insert into the trie:

Test if a byte sequence is stored:

If a sequence is not stored, False is returned:

A list of the bytes represented as strings can be returned:

Numerical values for the range of bytes can be given:

A list of bytes can be inserted:

Test if a byte sequence is stored:

If a sequence is not stored, False is returned:

A list of the bytes represented as strings can be returned:

A byte trie with several entries:

Return all the entries:

Return all the entries with a specific prefix:

The prefix to select entries can contain more than one byte:

A byte trie with several entries:

Return all the entries:

Return all the entries with a specific prefix:

The prefix to select entries can contain more than one byte:

A byte trie with some insertions:

A visualization of the data structure can be generated:

The elements colored in red show the end of actual entries.

It is an error to insert bytes that are outside of the range:

Scope  (1)

Information  (1)

A new "ByteTrie" can be created with CreateDataStructure:

Information about the data structure ds:

Applications  (2)

Command Completion  (1)

One use of a byte trie is for command completion. This works well with the RandomWord function.

Create a list of 1000 words, removing hyphens and making everything lowercase:

Create a byte trie and insert the words:

Now efficiently find all the words that start with a specific prefix:

These could all be used for a command completion system.

Storing Integers  (1)

Integers can be stored in a byte trie by breaking them down into sequences of bytes.

One way to do this is with bit operations, but IntegerDigits is also effective:

A utility function that inserts a number into a byte trie:

A function that tests for membership:

Create a byte trie and make an insertion:

Test for membership:

These could all be used for a command completion system.

Neat Examples  (1)

Visualization  (1)

A collection of random byte sequences of random lengths:

Create a byte trie and insert the byte sequences:

The visualization shows how the data is laid out: