Data Structure Registers (DSRs) are physical registers that store DSD values.
Each DSR belongs to one of three DSR files: the dest, src0 and src1 DSR files.
All DSD operations actually operate on DSRs behind the scenes. Therefore, all
DSD operands must be loaded to DSRs before the respective DSD operation
executes.
Extended DSRs and Stride Registers
Certain kinds of DSD values require additional registers called Extended DSRs (XDSRs) to be loaded as well. Specifically, FIFOs, circular buffers, and multi-dimensional vectors all require an XDSR. In addition, multi-dimensional vectors may also require an additional set of registers called Stride Registers (SRs) that are used to store strides of the underlying multi-dimensional access.
DSR, XDSR and SR Allocation
The allocation of DSRs, XDSRs and SRs, and the loading of DSDs to them, is typically done automatically by the compiler. However, it is possible to create and use DSRs, XDSRs and SRs directly. This chapter describes how users can allocate DSRs, XDSRs and SRs and then load DSDs to them without the compiler’s assistance.
DSR Types
There are 5 types of DSRs supported in CSL, each corresponding to one of the three DSR files. These are the following:
- dsr_dest represents a DSR value that can only be used to store a destination operand to a DSD operation.
- dsr_src0 represents a DSR value that can be used to store a source as well as a destination operand to a DSD operation.
- dsr_src1 represents a DSR value that can only be used to store a source operand to a DSD operation.
- dsr_fifo_dest represents a dsr_dest DSR that is expected to store a FIFO (see FIFO DSR types).
- dsr_fifo_src1 represents a dsr_src1 DSR that is expected to store a FIFO.
XDSR values are represented by the xdsr type while SR values are
represented by the sr type.
FIFO DSR types
The dsr_fifo_dest and dsr_fifo_src1 types can be used instead of
dsr_dest and dsr_src1, respectively, to represent DSRs that are
known to store a FIFO if one does not have access to a FIFO object. Like
FIFO objects, non-asynchronous DSD operations on FIFO DSRs will terminate
and return false when reading from an empty FIFO or writing to a full
FIFO. Otherwise, FIFO DSR-typed values have the same semantics as the
corresponding non-FIFO DSR types.
Behavior is undefined if a FIFO DSR-typed value is not initialized as part
of a FIFO when it is used in a DSD operation.
If a non-asynchronous DSD operation has a DSR operand that does not
have FIFO DSR type, but that DSR holds a FIFO, behavior is undefined if
that FIFO experiences a FIFO full or FIFO empty event. It is the
programmer’s responsibility to avoid such FIFO full or FIFO empty events.
DSR Builtins
@get_dsr
Create a DSR identifier value. This value will identify a physical DSR along with the corresponding DSR file.
Syntax
- dsr_type is an expression of type type whose value must be one of the DSR types.
- dsr_id is a comptime-known expression of integer type.
- fifo_dsr_type is an expression of type type whose value is one of the FIFO DSR types (dsr_fifo_dest or dsr_fifo_src1).
- non_fifo_dsr is a comptime-known expression of a non-FIFO DSR type.
- Returns a value of dsr_type.
Example
Semantics
Creates a DSR identifier value of dsr_type type using the specified
integer identifier. This builtin must be evaluated at comptime.
The provided integer identifier must be non-negative and smaller than the
number of available DSRs for the given DSR file. Otherwise, an error will
be emitted.
The type of non_fifo_dsr must correspond to fifo_dsr_type. If
fifo_dsr_type is dsr_fifo_dest, then non_fifo_dsr must have
type dsr_dest, and if fifo_dsr_type is dsr_fifo_src1, then
non_fifo_dsr must have type dsr_src1.
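The checks described above can be pictured with a small Python model. This is only a sketch of the stated rules, not the CSL builtin. The `Dsr` class, the `get_dsr` function and the `num_dsrs` parameter are hypothetical names; the real number of DSRs per file depends on the target and is not given here. The sketch also assumes that a FIFO DSR built from a non-FIFO DSR keeps the same integer identifier, and that the integer form accepts any of the five DSR types.

```python
from dataclasses import dataclass

# Type names come from the text. The register-file size does not, so it is
# passed in by the caller as a hypothetical parameter.
FIFO_BASE = {"dsr_fifo_dest": "dsr_dest", "dsr_fifo_src1": "dsr_src1"}
DSR_TYPES = {"dsr_dest", "dsr_src0", "dsr_src1"} | set(FIFO_BASE)


@dataclass(frozen=True)
class Dsr:
    dsr_type: str
    dsr_id: int


def get_dsr(dsr_type, arg, num_dsrs):
    """Model of the two documented forms of @get_dsr (illustrative only)."""
    if dsr_type not in DSR_TYPES:
        raise TypeError(f"unknown DSR type: {dsr_type}")
    if isinstance(arg, Dsr):
        # FIFO form: the second argument is an existing non-FIFO DSR whose
        # type must correspond to the requested FIFO DSR type.
        if FIFO_BASE.get(dsr_type) != arg.dsr_type:
            raise TypeError(f"{dsr_type} cannot be built from {arg.dsr_type}")
        return Dsr(dsr_type, arg.dsr_id)  # assumption: same physical register
    # Integer form: the identifier must lie in [0, num_dsrs).
    if not isinstance(arg, int):
        raise TypeError("dsr_id must be an integer")
    if not 0 <= arg < num_dsrs:
        raise ValueError(f"DSR id {arg} out of range [0, {num_dsrs})")
    return Dsr(dsr_type, arg)
```

For instance, `get_dsr("dsr_fifo_dest", get_dsr("dsr_dest", 1, 4), 4)` succeeds in this model, while pairing `dsr_fifo_src1` with a `dsr_dest` value raises an error, mirroring the correspondence rule.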
@get_xdsr
Create an XDSR identifier value. This value will identify a physical XDSR.
Syntax
- xdsr_id is a comptime-known expression of integer type.
- Returns a value of type xdsr.
Example
Semantics
Creates an XDSR identifier value using the specified integer identifier. This builtin must be evaluated at comptime. The provided integer identifier must be non-negative and smaller than the number of available XDSRs. Otherwise, an error will be emitted.
@get_sr
Create a Stride Register (SR) identifier value. This value will identify a physical SR.
Syntax
- sr_id is a comptime-known expression of integer type.
- Returns a value of type sr.
Example
Semantics
Creates an SR identifier value using the specified integer identifier. This builtin must be evaluated at comptime. The provided integer identifier must be non-negative and smaller than the number of available SRs. Otherwise, an error will be emitted.
@load_to_dsr
Load a DSD value into a DSR.
Syntax
- dsr_value is a comptime-known expression of a DSR type.
- dsd_value is an expression of DSD type.
- config_struct is an optional anonymous struct consisting of either of the following:
  - Asynchronous configuration setting fields as explained in Asynchronous DSD Operations. These are allowed only for fabric DSDs. The supported settings are:
    - async
    - activate
    - unblock
    - on_control
  - The save_address setting field. This is allowed only for mem1d and mem4d DSDs. See save_address for more details.
  - The single_step setting field.
Example
Semantics
The @load_to_dsr builtin can be called at comptime or runtime.
If it is called at runtime, it loads the input DSD to the specified
DSR at runtime.
If it is called at comptime, the specified DSD is loaded to the DSR
before the program begins executing.
A DSD of type fabin_dsd cannot be loaded to a dsr_dest DSR.
A DSD of type fabout_dsd cannot be loaded to a dsr_src0 or
dsr_src1 DSR.
A DSD of type mem4d_dsd cannot be loaded using @load_to_dsr. It can only
be loaded using @load_to_dsr_xdsr_sr.
FIFO DSRs are not permitted in @load_to_dsr.
When using a fabin_dsd loaded to a DSR, the input queue used by the
fabin_dsd must be explicitly initialized with the associated color via
@initialize_queue.
On WSE-3, when using a fabout_dsd loaded to a DSR, the output queue
used by the fabout_dsd must be explicitly initialized with the associated
color via @initialize_queue.
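The type restrictions listed above amount to a small lookup. The following Python function is a sketch that encodes only the rules stated in this section. `check_load_to_dsr` is a hypothetical helper name, and the compiler may reject further combinations that this sketch does not know about.

```python
def check_load_to_dsr(dsd_type, dsr_type):
    """Return None if the stated rules allow the pairing, else a reason string.

    Type names are passed as plain strings; this only mirrors the rules
    listed in the text and is not the compiler's actual check.
    """
    if dsr_type in ("dsr_fifo_dest", "dsr_fifo_src1"):
        return "FIFO DSRs are not permitted in @load_to_dsr"
    if dsd_type == "mem4d_dsd":
        return "mem4d_dsd must be loaded with @load_to_dsr_xdsr_sr"
    if dsd_type == "fabin_dsd" and dsr_type == "dsr_dest":
        return "fabin_dsd cannot be loaded to a dsr_dest DSR"
    if dsd_type == "fabout_dsd" and dsr_type in ("dsr_src0", "dsr_src1"):
        return "fabout_dsd cannot be loaded to a dsr_src0 or dsr_src1 DSR"
    return None
```

In this model an input fabric DSD can go to either source file and an output fabric DSD only to the dest file, which matches the direction of data flow for each.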
@load_to_dsr_xdsr
Load a circular buffer DSD value into a DSR and XDSR.
Syntax
- dsr_value is a comptime-known expression of a DSR type.
- xdsr_value is a comptime-known expression of XDSR type.
- circbuf_dsd is an expression of circbuf_dsd type.
- config_struct is an optional anonymous struct consisting of either of the following:
  - The save_address setting field. See save_address for more details.
  - The single_step setting field.
Example
Semantics
The @load_to_dsr_xdsr builtin can be called at runtime or during the
evaluation of a top-level comptime block.
The input DSD must be of type circbuf_dsd and it will be loaded to a pair of
DSR and XDSR values.
@load_to_dsr_xdsr_sr
Load a 4D memory DSD value into a DSR, an XDSR and zero or more stride registers (SRs).
Syntax
- dsr_value is a comptime-known expression of a DSR type.
- xdsr_value is a comptime-known expression of XDSR type.
- sr_tuple is a comptime-known tuple expression with elements of SR type.
- config_struct is an optional anonymous struct consisting of either of the following:
  - The save_address setting field. See save_address for more details.
  - The single_step setting field. See single_step for more details.
Example
Semantics
The @load_to_dsr_xdsr_sr builtin can be called at runtime or during the
evaluation of a top-level comptime block.
The input DSD value must be of type mem4d_dsd and it will be loaded to a pair
of physical DSR and XDSR registers. In addition, some of the DSD’s strides, if
any, will also be loaded to the provided SRs.
When the input DSD value is comptime-known then the number of SRs needed is
determined by the access pattern. Specifically, a multi-dimensional vector can
have up to four dimensions and therefore four strides, i.e., one for each
dimension. However, the maximum number of SRs per multi-dimensional vector on
all target architectures is currently three. This means that the first
dimension (the fastest moving dimension) will never need an SR, only the other
three, if they exist. In addition, if the stride of the first dimension
(fastest moving dimension) is one, then the second dimension, if present, will
also not need an SR.
As a result, when the input DSD value is comptime-known, the user must provide
the exact number of SRs needed or otherwise an error will be emitted. The error
message will indicate the number of SRs that are needed.
When the input DSD value is not comptime-known, @load_to_dsr_xdsr_sr
always needs three SRs, which is the maximum number.
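The counting rule above can be written out as a short Python function. This is a literal reading of the paragraphs in this section, offered as a sketch; `srs_needed` is a hypothetical name, and the count reported in the compiler's error message remains the authoritative answer.

```python
def srs_needed(strides, comptime_known=True):
    """Number of SRs to pass to @load_to_dsr_xdsr_sr, per the rule above.

    strides[0] is the stride of the first (fastest moving) dimension.
    A multi-dimensional vector has between one and four dimensions.
    """
    if not 1 <= len(strides) <= 4:
        raise ValueError("expected between 1 and 4 dimensions")
    if not comptime_known:
        return 3  # a runtime DSD always needs the maximum
    needed = len(strides) - 1  # the first dimension never needs an SR
    if strides[0] == 1 and len(strides) >= 2:
        needed -= 1  # with unit first stride, the second dimension is exempt too
    return needed
```

Under this reading, a four-dimensional access whose fastest dimension has unit stride needs two SRs, and the same shape with a non-unit first stride needs three.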
save_address
The save_address option may be supplied to @load_to_dsr if the DSD is
of the type mem1d_dsd or mem4d_dsd. This causes subsequent DSD
operations on the DSR to update the DSR’s base address for the outermost
(slowest-varying) dimension after termination to point one position past the
end of the range covered by the DSD operation. The next operation on the DSR
will effectively pick up where the previous one ended.
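To make the "pick up where the previous one ended" behavior concrete, here is a toy Python model of a DSR holding a one-dimensional memory DSD. It is a sketch under several assumptions: `Mem1dDsr` is a hypothetical class, positions are element indices rather than byte addresses, and "one position past the end" is taken to mean the base advances by `length * stride`.

```python
class Mem1dDsr:
    """Toy stand-in for a DSR loaded with a 1D memory DSD (illustrative only)."""

    def __init__(self, base, length, stride=1, save_address=False):
        self.base = base
        self.length = length
        self.stride = stride
        self.save_address = save_address

    def run_operation(self):
        """Return the element indices touched by one DSD operation.

        With save_address set, the base is advanced afterwards so the next
        operation starts one position past the range just covered.
        """
        touched = [self.base + i * self.stride for i in range(self.length)]
        if self.save_address:
            self.base += self.length * self.stride
        return touched
```

In this model, two consecutive operations on `Mem1dDsr(0, 4, save_address=True)` touch indices 0 to 3 and then 4 to 7, whereas without `save_address` both operations touch indices 0 to 3.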
Example
single_step
The single_step option may be supplied to @load_to_dsr to support use
with the @map builtin. When a DSR is used as an argument to @map, it
should be loaded with a DSD value where .single_step = true, otherwise the
behavior is unspecified. If a DSR loaded with a DSD value where
.single_step = true is used as an argument to DSD builtins other than
@map, the behavior is undefined.
Example
@allocate_fifo with DSRs
By default, the DSRs and XDSR used by @allocate_fifo (see
FIFOs) are allocated by the compiler. However, @allocate_fifo
also supports user-specified DSRs and XDSR, using the
following syntax:
config_struct must contain the fields:
- dest: a comptime-known expression of dsr_dest type.
- src: a comptime-known expression of dsr_src1 type.
- xdsr: a comptime-known expression of xdsr type.
dest, src, and xdsr must either all be specified or all be absent;
otherwise an error will be emitted. The integer identifiers
of dest and src must match. If the provided DSR and XDSR
identifiers have already been used for their respective types or exceed
the valid range of values for the given target architecture, then an error
will be emitted.
Other fields of config_struct described in
Task Activation on Pop and Push retain the same semantics
when the DSRs and XDSR are specified.
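The constraints on user-specified FIFO registers can be summarized as a validation routine. The Python below is a sketch of those stated rules only. `check_fifo_registers` is a hypothetical name, register identifiers are modeled as plain integers, and the per-type register counts in `limits` are placeholders because the real limits depend on the target architecture.

```python
def check_fifo_registers(config, used, limits):
    """Validate the dest/src/xdsr fields of an @allocate_fifo config (model).

    config: dict that has all of 'dest', 'src', 'xdsr' (integer ids) or none.
    used:   ids already taken per register kind, e.g. {'dest': {0}}.
    limits: hypothetical register counts per kind for the target.
    Returns True if user registers are supplied, False if the compiler
    should allocate them. Raises ValueError on any rule violation.
    """
    keys = ("dest", "src", "xdsr")
    present = [k for k in keys if k in config]
    if not present:
        return False  # nothing specified: compiler allocates
    if len(present) != len(keys):
        raise ValueError("dest, src and xdsr must be specified together")
    if config["dest"] != config["src"]:
        raise ValueError("dest and src identifiers must match")
    for k in keys:
        if not 0 <= config[k] < limits[k]:
            raise ValueError(f"{k} id {config[k]} is out of range")
        if config[k] in used.get(k, set()):
            raise ValueError(f"{k} id {config[k]} is already in use")
    return True
```

An empty config falls back to compiler allocation in this model, while a config that names only some of the three registers is rejected.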

