GGUF (.gguf)

Background & Context

    • Open format designed for the fast loading and saving of large language models.
    • Stores models and various metadata.
    • GGUF is an acronym for GPT-Generated Unified Format.
    • Binary file format; successor to GGML.
    • Released in 2023 by Georgi Gerganov.

Import

  • Import["file.gguf"] imports a GGUF file, returning a NetExternalObject.
  • Import["file.gguf",elem] imports the specified elements.
  • The import format can be specified with Import["file","GGUF"] or Import["file",{"GGUF",elem,…}].
  • See the following reference pages for full general information:
  • Import             import from a file
    CloudImport        import from a cloud object
    ImportString       import from a string
    ImportByteArray    import from a byte array
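  • A minimal sketch of the explicit-format forms above, assuming a hypothetical local file named "model" with no .gguf extension, so that the format has to be named:

```wolfram
(* "model" is a placeholder path, not a file shipped with the system *)
net = Import["model", "GGUF"]

(* same file, requesting one element by name *)
Import["model", {"GGUF", "Elements"}]
```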

Import Elements

  • General Import elements:
  • "Elements"    list of elements and options available in this file
    "Summary"     summary of the file
    "Rules"       list of rules for all available elements
  • Import elements include:
  • "NetExternalObject"    NetExternalObject representation of the net

Options

  • Import options:
  • "ContextWindowSize"    Automatic    size of the model's context window
    "Output"               "Text"       output type of the model
  • Option "Output" can be set to the following:
  • "Embeddings"    import the file as a text-embedding model
    "Text"          import the file as a text-generation model
  • Not all models support both output types.
  • Option "ContextWindowSize" can be set to a positive integer or to Automatic, which selects the model's default.

Examples


Basic Examples  (2)

Import a net in GGUF format (model credit: Olusegun Odewole, https://huggingface.co/segestic/Tinystories-gpt-0.1-3m):
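
A minimal sketch, assuming the credited model has been downloaded to the working directory under the placeholder name model.gguf:

```wolfram
(* "model.gguf" is a hypothetical local path; substitute your own file *)
net = Import["model.gguf"]
```

With no element specified, the result is a NetExternalObject.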

Show the Import elements available in this file:
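
A minimal sketch using the general "Elements" element, again assuming a placeholder file named model.gguf:

```wolfram
(* "model.gguf" is a hypothetical local path *)
Import["model.gguf", "Elements"]
```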

Run the model until a termination token is reached:

Import a "GGUF" file as an embedding model:
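
A minimal sketch using the "Output" option, assuming a placeholder file named model.gguf that supports the embedding output type (not all models do):

```wolfram
(* "model.gguf" is a hypothetical local path; the variable name is illustrative *)
embedder = Import["model.gguf", "Output" -> "Embeddings"]
```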

Generate an embedding vector for each token in a string:

Import Options  (2)

"ContextWindowSize"  (1)

Import a "GGUF" file:

Check the default context window size:

Specify a different maximum value:
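
A minimal sketch, assuming a placeholder file named model.gguf and an arbitrary illustrative window size of 256:

```wolfram
(* "model.gguf" is a hypothetical path; 256 is an arbitrary positive integer *)
net = Import["model.gguf", "ContextWindowSize" -> 256]
```

Setting the option to Automatic instead selects the model's default.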

Inspect the new value in the model information:

Retrieve the original training-time context window size:

"Output"  (1)

By default, models are imported as text generators:
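
A minimal sketch, assuming a placeholder file named model.gguf. Because "Text" is the documented default for "Output", the two calls below should give the same kind of result:

```wolfram
(* "model.gguf" is a hypothetical local path *)
Import["model.gguf"]

(* the same import with the default output type written out *)
Import["model.gguf", "Output" -> "Text"]
```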

Import the model as a text embedder instead: