NetUnfold
NetUnfold[fnet]
produces the elementary net of the folded net fnet, exposing the recurrent states.
Details and Options
- A folded net is a net that iterates over a sequence unidirectionally by repeating the same operation. Examples include recurrent nets and unidirectional transformers.
- NetUnfold is typically used to extract the repeating operation so that sequences can be generated efficiently from a trained decoder, in applications such as text and audio generation and text translation.
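- For a recurrent network with state equations $s_t = f(s_{t-1}, x_t; \theta)$ and output equation $y_t = g(s_t; \theta)$ for $t = 1, \ldots, T$ and training parameters $\theta$, the unfolded net corresponds to just a single step of this recurrence: $s_t = f(s_{t-1}, x_t; \theta)$ and $y_t = g(s_t; \theta)$.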
- In particular, NetUnfold exposes the recurrent states of the following folded layers:
  - BasicRecurrentLayer[…]: one state vector
  - GatedRecurrentLayer[…]: one state vector
  - LongShortTermMemoryLayer[…]: two state vectors, one of which is the internal cell state
  - NetFoldOperator[net,{"out1"->"in1",…,"outn"->"inn"},…]: n state vectors
  - AttentionLayer[…,"Mask"->"Causal"]: two state sequences, which are the previous keys and values
- Exposed states of recurrent layers are vectors that are typically initialized with zeros. Exposed states of transformers are variable-length sequences of vectors, which are typically initialized with empty sequences.
- NetUnfold can also be applied to a folded net that is followed by an operation on the last element of its output sequence. In such cases, the corresponding SequenceLastLayer is dropped.
- NetUnfold can be seen as the inverse operation of NetFoldOperator.
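The relation between a folded net and its single-step core can be sketched outside the Wolfram Language. The following is a minimal Python illustration, not NetUnfold itself: the scalar recurrence, the weights and the names `step` and `folded` are all hypothetical and chosen only to show the idea of an exposed state.

```python
import math

def step(state, x, w_state=0.5, w_input=1.0):
    """One step of a toy scalar recurrence: returns (output, new_state).

    Hypothetical stand-in for the single-step net; the weights are arbitrary.
    """
    new_state = math.tanh(w_state * state + w_input * x)
    output = new_state  # the output equation is the identity in this toy model
    return output, new_state

def folded(xs, initial_state=0.0):
    """Apply `step` over the whole sequence while hiding the state (folded view)."""
    state = initial_state
    outputs = []
    for x in xs:
        y, state = step(state, x)
        outputs.append(y)
    return outputs

xs = [0.1, -0.3, 0.7]

# Drive the single step by hand, with the state exposed and initialized to zero.
state = 0.0
manual = []
for x in xs:
    y, state = step(state, x)
    manual.append(y)

# The hand-driven steps reproduce the folded outputs element by element.
print(manual == folded(xs))
```

In this sketch, keeping only the last element of the folded output corresponds to reading the output of the final step, which is why a trailing last-element operation has nothing left to do once the net is driven one step at a time.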
Examples
Basic Examples (1)
Get the core operation folded in a GatedRecurrentLayer:
Scope (5)
Applications (1)
Implementing efficient text generation. First, get a trained language model:
The most straightforward function to stochastically generate text is the following:
The problem with this function is that it has quadratic time complexity, because the model is repeatedly fed the same inputs:
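The quadratic cost can be made concrete by counting recurrence steps. This is a rough Python sketch under the assumption that each element fed to the model costs one step; the function names and the `prompt_len` parameter are hypothetical and do not come from the Wolfram Language.

```python
def count_naive_steps(n_tokens, prompt_len=1):
    """Steps needed when the whole sequence is re-fed for every new token."""
    total = 0
    length = prompt_len
    for _ in range(n_tokens):
        total += length  # the folded net walks over all `length` elements again
        length += 1      # the sampled token is appended to the input
    return total

def count_stateful_steps(n_tokens, prompt_len=1):
    """Steps needed when the state is carried over between calls."""
    if n_tokens == 0:
        return 0
    # The prompt is consumed once; each further token costs a single step.
    return prompt_len + (n_tokens - 1)

print(count_naive_steps(100))     # 5050
print(count_stateful_steps(100))  # 100
```

Under this counting, generating n tokens from a one-element prompt costs n(n+1)/2 steps with the naive approach and n steps when the state is reused.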
NetUnfold lets you avoid recomputing the same activations by exposing the states:
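For a causal attention layer, the exposed states are the previous keys and values. The Python sketch below illustrates that idea under strong simplifications: a single head with identity projections, so that each input vector serves as its own query, key and value. The function names are hypothetical and this is not the AttentionLayer implementation.

```python
import math

def attend(query, keys, values):
    """Softmax attention of one query vector over lists of key and value vectors."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]

def causal_attention(xs):
    """Folded view: position t attends to positions 0..t, recomputed from scratch."""
    return [attend(xs[t], xs[: t + 1], xs[: t + 1]) for t in range(len(xs))]

def attention_step(x, keys, values):
    """Unfolded view: one position, with previous keys and values as explicit state."""
    keys = keys + [x]
    values = values + [x]
    return attend(x, keys, values), keys, values

xs = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]

# Start from empty state sequences and feed one element at a time.
keys, values, outs = [], [], []
for x in xs:
    y, keys, values = attention_step(x, keys, values)
    outs.append(y)

# The stepwise outputs agree with the from-scratch causal attention.
reference = causal_attention(xs)
print(all(math.isclose(a, b) for o, r in zip(outs, reference) for a, b in zip(o, r)))
```

The state sequences grow by one element per step, which matches the description of transformer states as variable-length sequences that start out empty.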
Write an efficient stochastic text generation function based on this unfolded net:
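The structure of such a generator can be sketched in Python with a toy model. Everything here is an assumption made for illustration: the three-character vocabulary, the scalar state and the `step` function stand in for a trained language model and its unfolded net, and the prompt is assumed to be non-empty.

```python
import math
import random

VOCAB = ["a", "b", "c"]  # hypothetical toy vocabulary

def step(state, token):
    """Hypothetical single-step model: returns (next-token probabilities, new state)."""
    new_state = math.tanh(0.5 * state + 0.3 * (VOCAB.index(token) + 1))
    logits = [new_state * (i + 1) for i in range(len(VOCAB))]
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps], new_state

def generate_naive(prompt, n, rng):
    """Re-feeds the whole text for every new token (quadratic in the length)."""
    text = list(prompt)
    for _ in range(n):
        state = 0.0
        for token in text:
            probs, state = step(state, token)
        text.append(rng.choices(VOCAB, weights=probs)[0])
    return "".join(text)

def generate_stateful(prompt, n, rng):
    """Consumes the prompt once, then feeds only the last sampled token."""
    text = list(prompt)
    state = 0.0
    for token in text:
        probs, state = step(state, token)
    for _ in range(n):
        token = rng.choices(VOCAB, weights=probs)[0]
        text.append(token)
        # Only the new token is fed; the carried state summarizes the past.
        probs, state = step(state, token)
    return "".join(text)

# With the same seed, both generators sample the same text.
print(generate_naive("ab", 5, random.Random(0)) == generate_stateful("ab", 5, random.Random(0)))
```

The stateful version performs one model step per generated token after the prompt, whereas the naive version repeats all earlier steps each time. Both draw from identical distributions, so seeding them identically yields the same text.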
Properties & Relations (2)
NetUnfold is the inverse operation of NetFoldOperator:
Any SequenceLastLayer after a recursion is automatically removed:
Text
Wolfram Research (2021), NetUnfold, Wolfram Language function, https://reference.wolfram.com/language/ref/NetUnfold.html.