Version: 1.0.0b2

SIMD

struct SIMD[dtype: DType, size: Int]

Represents a vector type that leverages hardware acceleration to process multiple data elements with a single operation.

SIMD (Single Instruction, Multiple Data) is a fundamental parallel computing paradigm where a single CPU instruction operates on multiple data elements at once. Modern CPUs can perform 4, 8, 16, or even 32 operations in parallel using SIMD, delivering substantial performance improvements over scalar operations. Instead of processing one value at a time, SIMD processes entire vectors of values with each instruction.

For example, when adding two vectors of four values, a scalar operation adds each value in the vector one by one, while a SIMD operation adds all four values at once using vector registers:

Scalar operation:                SIMD operation:
┌─────────────────────────┐      ┌───────────────────────────┐
│ 4 instructions          │      │ 1 instruction             │
│ 4 clock cycles          │      │ 1 clock cycle             │
│                         │      │                           │
│ ADD  a[0], b[0] → c[0]  │      │ Vector register A         │
│ ADD  a[1], b[1] → c[1]  │      │ ┌─────┬─────┬─────┬─────┐ │
│ ADD  a[2], b[2] → c[2]  │      │ │a[0] │a[1] │a[2] │a[3] │ │
│ ADD  a[3], b[3] → c[3]  │      │ └─────┴─────┴─────┴─────┘ │
└─────────────────────────┘      │           +               │
                                 │ Vector register B         │
                                 │ ┌─────┬─────┬─────┬─────┐ │
                                 │ │b[0] │b[1] │b[2] │b[3] │ │
                                 │ └─────┴─────┴─────┴─────┘ │
                                 │           ↓               │
                                 │        SIMD_ADD           │
                                 │           ↓               │
                                 │ Vector register C         │
                                 │ ┌─────┬─────┬─────┬─────┐ │
                                 │ │c[0] │c[1] │c[2] │c[3] │ │
                                 │ └─────┴─────┴─────┴─────┘ │
                                 └───────────────────────────┘

The SIMD type maps directly to hardware vector registers and instructions. Mojo automatically generates optimal SIMD code that leverages CPU-specific instruction sets (such as AVX and NEON) without requiring manual intrinsics or assembly programming.

This type is the foundation of high-performance CPU computing in Mojo, enabling you to write code that automatically leverages modern CPU vector capabilities while maintaining code clarity and portability.

Caution: If you declare a SIMD vector size larger than the vector registers of the target hardware, the compiler will break up the SIMD into multiple vector registers for compatibility. However, you should avoid using a vector that's more than 2x the hardware's vector register size because the resulting code will perform poorly.

Key properties:

Hardware-mapped: Directly maps to CPU vector registers
Type-safe: Data types and vector sizes are checked at compile time
Zero-cost: No runtime overhead compared to hand-optimized intrinsics
Portable: Same code works across different CPU architectures (x86, ARM, etc.)
Composable: Seamlessly integrates with Mojo's parallelization features

Key APIs:

Construction:
- Broadcast single value to all elements: SIMD[dtype, size](value)
- Initialize with specific values: SIMD[dtype, size](v1, v2, ...)
- Zero-initialized vector: SIMD[dtype, size]()
Element operations:
- Arithmetic: +, -, *, /, %, //
- Comparison: ==, !=, <, <=, >, >=
- Math functions: sqrt(), sin(), cos(), fma(), etc.
- Bit operations: &, |, ^, ~, <<, >>
Vector operations:
- Horizontal reductions: reduce_add(), reduce_mul(), reduce_min(), reduce_max()
- Element-wise conditional selection: condition.select(true_case, false_case)
- Vector manipulation: shuffle(), slice(), join(), split()
- Type conversion: cast[target_dtype]()

Examples:

Vectorized math operations:

# Process 8 floating-point numbers simultaneously
var a = SIMD[DType.float32, 8](1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0)
var b = SIMD[DType.float32, 8](2.0)  # Broadcast 2.0 to all elements
var result = a * b + 1.0
print(result)  # => [3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0]

Conditional operations with masking:

# Double the positive values and negate the negative values
var values = SIMD[DType.int32, 4](1, -2, 3, -4)
var is_positive = values.gt(0)  # greater-than: gets SIMD of booleans
var result = is_positive.select(values * 2, values * -1)
print(result)  # => [2, 2, 6, 4]

Horizontal reductions:

# Sum all elements in a vector
var data = SIMD[DType.float64, 4](10.5, 20.3, 30.1, 40.7)
var total = data.reduce_add()
var maximum = data.reduce_max()
print(total, maximum)  # => 101.6 40.7

Constraints:

The size of the SIMD vector must be positive and a power of 2.

Parameters

dtype (DType): The data type of SIMD vector elements.
size (Int): The size of the SIMD vector (number of elements).

Implemented traits

Absable, AnyType, Boolable, CeilDivable, Ceilable, Comparable, CoordLike, Copyable, Defaultable, DevicePassable, DivModable, Equatable, Floorable, Hashable, ImplicitlyCopyable, ImplicitlyDeletable, Indexer, Intable, Movable, Powable, RegisterPassable, Roundable, Sized, TrivialRegisterPassable, Truncable, Writable, _FromInt

`comptime` members

`device_type`

comptime device_type = SIMD[dtype, size]

SIMD types are remapped to the same type when passed to accelerator devices.

`DTYPE`

comptime DTYPE = dtype

The data type for the runtime integer value.

`is_static_value`

comptime is_static_value = False

True if the value is known at compile time.

`is_tuple`

comptime is_tuple = False

True if this is a tuple type (Coord), False for scalar values.

`is_value`

comptime is_value = True

True if this is a scalar value, False for tuple types.

`MAX`

comptime MAX = SIMD(max_or_inf[dtype]())

Gets the maximum value for the SIMD value, potentially +inf.

`MAX_FINITE`

comptime MAX_FINITE = SIMD(max_finite[dtype]())

Returns the maximum finite value of SIMD value.

`MIN`

comptime MIN = SIMD(min_or_neg_inf[dtype]())

Gets the minimum value for the SIMD value, potentially -inf.

`MIN_FINITE`

comptime MIN_FINITE = SIMD(min_finite[dtype]())

Returns the minimum (lowest) finite value of SIMD value.
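
The difference between the plain and the `_FINITE` limits only shows up for floating-point element types. A minimal sketch, assuming the scalar aliases `Int8` and `Float32` used elsewhere on this page:

```mojo
def main():
    # Integer types have no infinity, so MAX and MIN are the finite limits.
    print(Int8.MAX, Int8.MIN)  # => 127 -128

    # For floating-point types MAX is +inf, so it compares greater than
    # the largest finite value.
    print(Float32.MAX > Float32.MAX_FINITE)  # => True
```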

`ParamListType`

comptime ParamListType = *?

The element types (Self for scalar types).

`static_value`

comptime static_value = -1

Always -1 for runtime values (not statically known).

Methods

`__init__`

def __init__() -> Self

Default initializer of the SIMD vector.

By default the SIMD vectors are initialized to all zeros.

def __init__[other_dtype: DType, //](value: SIMD[other_dtype, size], /) -> Self

Initialize from another SIMD of the same size. If the value passed is a scalar, you can initialize a SIMD vector with more elements.

Example:

print(UInt64(UInt8(42))) # 42
print(SIMD[DType.uint64, 4](UInt8(42))) # [42, 42, 42, 42]

Casting behavior:

# Basic casting preserves value within range
Int8(UInt8(127)) == Int8(127)

# Numbers above signed max wrap to negative using two's complement
Int8(UInt8(128)) == Int8(-128)
Int8(UInt8(129)) == Int8(-127)
Int8(UInt8(256)) == Int8(0)

# Negative signed cast to unsigned using two's complement
UInt8(Int8(-128)) == UInt8(128)
UInt8(Int8(-127)) == UInt8(129)
UInt8(Int8(-1)) == UInt8(255)

# Truncate precision after downcast and upcast
Float64(Float32(Float64(123456789.123456789))) == Float64(123456792.0)

# Rightmost bits of significand become 0's on upcast
Float64(Float32(0.3)) == Float64(0.30000001192092896)

# Numbers equal after truncation of float literal and cast truncation
Float32(Float64(123456789.123456789)) == Float32(123456789.123456789)

# Float to int/uint discards the fractional part
Int64(Float64(42.2)) == Int64(42)

Parameters:

other_dtype (DType): The type of the value that is being cast from.

Args:

value (SIMD[other_dtype, size]): The value to cast from.
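
The same constructor also converts whole vectors element by element. A minimal sketch with illustrative variable names, assuming a 4-wide vector:

```mojo
def main():
    # Element-wise conversion between vectors of the same size.
    var ints = SIMD[DType.int32, 4](1, 2, 3, 4)
    var floats = SIMD[DType.float32, 4](ints)
    print(floats)  # => [1.0, 2.0, 3.0, 4.0]

    # Unsigned-to-signed conversion wraps, as in the casting rules above.
    print(Int8(UInt8(129)))  # => -127
```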

@implicit def __init__(value: Int, /) -> Self

Initializes the SIMD vector with a signed integer.

The signed integer value is splatted across all the elements of the SIMD vector.

Args:

value (Int): The input value.

def __init__[T: Floatable, //](value: T, /) -> Float64

Initialize a Float64 from a type conforming to Floatable.

Parameters:

T (Floatable): The Floatable type.

Args:

value (T): The object to get the floating-point representation of.

Returns:

Float64

def __init__[T: FloatableRaising, //](out self: Float64, value: T, /)

Initialize a Float64 from a type conforming to FloatableRaising.

Parameters:

T (FloatableRaising): The FloatableRaising type.

Args:

value (T): The object to get the floating-point representation of.

Returns:

Float64

Raises:

If the type does not have a floating-point representation.

@implicit def __init__(value: IntLiteral, /) -> Self

Initializes the SIMD vector with an integer.

The integer value is splatted across all the elements of the SIMD vector.

Args:

value (IntLiteral): The input value.

@implicit def __init__(value: Bool, /) -> SIMD[DType.bool, size]

Initializes a Scalar with a bool value.

Since this constructor does not splat, it can be implicit.

Args:

value (Bool): The bool value to initialize the Scalar with.

Returns:

SIMD[DType.bool, size]

def __init__(*, fill: Bool) -> SIMD[DType.bool, size]

Initializes the SIMD vector with a bool value.

The bool value is splatted across all elements of the SIMD vector.

Args:

fill (Bool): The bool value to fill each element of the SIMD vector with.

Returns:

SIMD[DType.bool, size]
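
A short sketch of the keyword-only `fill` argument, useful for building an all-true or all-false mask (the variable name is illustrative):

```mojo
def main():
    # Splat a single Bool across every element of the vector.
    var all_true = SIMD[DType.bool, 4](fill=True)
    print(all_true)   # => [True, True, True, True]
    print(~all_true)  # => [False, False, False, False]
```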

@implicit def __init__(value: Scalar[dtype], /) -> Self

Constructs a SIMD vector by splatting a scalar value.

The input value is splatted across all elements of the SIMD vector.

Args:

value (Scalar[dtype]): The value to splat to the elements of the vector.

def __init__(*elems: Scalar[dtype], *, __list_literal__: NoneType = None) -> Self

Constructs a SIMD vector via a variadic list of elements.

The input values are assigned to the corresponding elements of the SIMD vector.

Constraints:

The number of input values must equal the size of the SIMD vector.

Args:

*elems (Scalar[dtype]): The variadic list of elements from which the SIMD vector is constructed.
__list_literal__ (NoneType): Tell Mojo to use this method for list literals.

@implicit def __init__(value: FloatLiteral, /) -> Self

Initializes the SIMD vector with a float.

The value is splatted across all the elements of the SIMD vector.

Args:

value (FloatLiteral): The input value.

def __init__[int_dtype: DType, //](*, from_bits: SIMD[int_dtype, size]) -> Self

Initializes the SIMD vector from the bits of an integral SIMD vector.

Parameters:

int_dtype (DType): The integral type of the input SIMD vector.

Args:

from_bits (SIMD[int_dtype, size]): The SIMD vector to copy the bits from.
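
As an illustration, the IEEE 754 single-precision bit pattern `0x3F800000` encodes 1.0, so reinterpreting those bits should yield that value. A minimal sketch for the scalar case:

```mojo
def main():
    # Reinterpret the raw 32-bit pattern as a float32. This is a bitcast,
    # not a numeric conversion.
    var one = Float32(from_bits=UInt32(0x3F800000))
    print(one)  # => 1.0
```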

def __init__(out self: Scalar[dtype], *, py: PythonObject)

Initialize a SIMD value from a PythonObject.

Args:

py (PythonObject): The PythonObject to convert.

Returns:

Scalar[dtype]

Raises:

If the conversion to double fails.

`__bool__`

def __bool__(self) -> Bool

Converts the SIMD scalar into a boolean value.

Returns:

Bool: True if the SIMD scalar is non-zero and False otherwise.

`__getitem__`

def __getitem__(self, idx: Int) -> Scalar[dtype]

Gets an element from the vector.

Args:

idx (Int): The element index.

Returns:

Scalar[dtype]: The value at position idx.

`__setitem__`

def __setitem__(mut self, idx: Int, val: Scalar[dtype])

Sets an element in the vector.

Args:

idx (Int): The index to set.
val (Scalar[dtype]): The value to set.
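
Element access uses ordinary subscript syntax. A minimal sketch (the vector must be declared with `var` so that it can be mutated):

```mojo
def main():
    var v = SIMD[DType.int32, 4](10, 20, 30, 40)
    print(v[2])  # => 30

    # Overwrite a single element in place.
    v[2] = 99
    print(v)  # => [10, 20, 99, 40]
```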

`__neg__`

def __neg__(self) -> Self

Defines the unary - operation.

Returns:

Self: The negation of this SIMD vector.

`__pos__`

def __pos__(self) -> Self

Defines the unary + operation.

Returns:

Self: This SIMD vector.

`__invert__`

def __invert__(self) -> Self

Returns ~self.

Constraints:

The element type of the SIMD vector must be boolean or integral.

Returns:

Self: The ~self value.

`__lt__`

def __lt__(self, rhs: Self) -> Bool

Compares two Scalars using less-than comparison.

Args:

rhs (Self): The Scalar to compare with.

Returns:

Bool: True if self is less than rhs, False otherwise.

`__le__`

def __le__(self, rhs: Self) -> Bool

Compares two Scalars using less-than-or-equal comparison.

Args:

rhs (Self): The Scalar to compare with.

Returns:

Bool: True if self is less than or equal to rhs, False otherwise.

`__eq__`

def __eq__(self, rhs: Self) -> Bool

Compares two SIMD vectors for equality.

Args:

rhs (Self): The SIMD vector to compare with.

Returns:

Bool: True if all elements of the SIMD vectors are equal, False otherwise.

`__ne__`

def __ne__(self, rhs: Self) -> Bool

Compares two SIMD vectors for inequality.

Args:

rhs (Self): The SIMD vector to compare with.

Returns:

Bool: True if any elements of the SIMD vectors are not equal, False otherwise.

`__gt__`

def __gt__(self, rhs: Self) -> Bool

Compares two Scalars using greater-than comparison.

Args:

rhs (Self): The Scalar to compare with.

Returns:

Bool: True if self is greater than rhs, False otherwise.

`__ge__`

def __ge__(self, rhs: Self) -> Bool

Compares two Scalars using greater-than-or-equal comparison.

Args:

rhs (Self): The Scalar to compare with.

Returns:

Bool: True if self is greater than or equal to rhs, False otherwise.

`__contains__`

def __contains__(self, value: Scalar[dtype]) -> Bool

Whether the vector contains the value.

Args:

value (Scalar[dtype]): The value.

Returns:

Bool: Whether the vector contains the value.

`__add__`

def __add__(self, rhs: Self) -> Self

Computes self + rhs.

Args:

rhs (Self): The rhs value.

Returns:

Self: A new vector whose element at position i is computed as self[i] + rhs[i].

`__sub__`

def __sub__(self, rhs: Self) -> Self

Computes self - rhs.

Args:

rhs (Self): The rhs value.

Returns:

Self: A new vector whose element at position i is computed as self[i] - rhs[i].

`__mul__`

def __mul__(self, rhs: Self) -> Self

Computes self * rhs.

Args:

rhs (Self): The rhs value.

Returns:

Self: A new vector whose element at position i is computed as self[i] * rhs[i].

`__truediv__`

def __truediv__(self, rhs: Self) -> Self

Computes self / rhs.

Args:

rhs (Self): The rhs value.

Returns:

Self: A new vector whose element at position i is computed as self[i] / rhs[i].

`__floordiv__`

def __floordiv__(self, rhs: Self) -> Self

Returns the division of self and rhs rounded down to the nearest integer.

Constraints:

The element type of the SIMD vector must be numeric.

Args:

rhs (Self): The value to divide with.

Returns:

Self: floor(self / rhs) value.

`__mod__`

def __mod__(self, rhs: Self) -> Self

Returns the remainder of self divided by rhs.

Args:

rhs (Self): The value to divide with.

Returns:

Self: The remainder of dividing self by rhs.

`__pow__`

def __pow__(self, exp: Int) -> Self

Computes the vector raised to the power of the input integer value.

Args:

exp (Int): The exponent value.

Returns:

Self: A SIMD vector where each element is raised to the power of the specified exponent value.

def __pow__(self, exp: Self) -> Self

Computes the vector raised elementwise to the right hand side power.

Args:

exp (Self): The exponent value.

Returns:

Self: A SIMD vector where each element is raised to the power of the specified exponent value.

`__lshift__`

def __lshift__(self, rhs: Self) -> Self

Returns self << rhs.

Constraints:

The element type of the SIMD vector must be integral.

Args:

rhs (Self): The RHS value.

Returns:

Self: self << rhs.

`__rshift__`

def __rshift__(self, rhs: Self) -> Self

Returns self >> rhs.

Constraints:

The element type of the SIMD vector must be integral.

Args:

rhs (Self): The RHS value.

Returns:

Self: self >> rhs.

`__and__`

def __and__(self, rhs: Self) -> Self

Returns self & rhs.

Constraints:

The element type of the SIMD vector must be bool or integral.

Args:

rhs (Self): The RHS value.

Returns:

Self: self & rhs.

`__or__`

def __or__(self, rhs: Self) -> Self

Returns self | rhs.

Constraints:

The element type of the SIMD vector must be bool or integral.

Args:

rhs (Self): The RHS value.

Returns:

Self: self | rhs.

`__xor__`

def __xor__(self, rhs: Self) -> Self

Returns self ^ rhs.

Constraints:

The element type of the SIMD vector must be bool or integral.

Args:

rhs (Self): The RHS value.

Returns:

Self: self ^ rhs.

`__radd__`

def __radd__(self, value: Self) -> Self

Returns value + self.

Args:

value (Self): The other value.

Returns:

Self: value + self.

`__rsub__`

def __rsub__(self, value: Self) -> Self

Returns value - self.

Args:

value (Self): The other value.

Returns:

Self: value - self.

`__rmul__`

def __rmul__(self, value: Self) -> Self

Returns value * self.

Args:

value (Self): The other value.

Returns:

Self: value * self.

`__rtruediv__`

def __rtruediv__(self, value: Self) -> Self

Returns value / self.

Args:

value (Self): The other value.

Returns:

Self: value / self.

`__rfloordiv__`

def __rfloordiv__(self, rhs: Self) -> Self

Returns the division of rhs and self rounded down to the nearest integer.

Constraints:

The element type of the SIMD vector must be numeric.

Args:

rhs (Self): The value to divide by self.

Returns:

Self: floor(rhs / self) value.

`__rmod__`

def __rmod__(self, value: Self) -> Self

Returns value mod self.

Args:

value (Self): The other value.

Returns:

Self: value mod self.

`__rpow__`

def __rpow__(self, base: Self) -> Self

Returns base ** self.

Args:

base (Self): The base value.

Returns:

Self: base ** self.

`__rlshift__`

def __rlshift__(self, value: Self) -> Self

Returns value << self.

Constraints:

The element type of the SIMD vector must be integral.

Args:

value (Self): The other value.

Returns:

Self: value << self.

`__rrshift__`

def __rrshift__(self, value: Self) -> Self

Returns value >> self.

Constraints:

The element type of the SIMD vector must be integral.

Args:

value (Self): The other value.

Returns:

Self: value >> self.

`__rand__`

def __rand__(self, value: Self) -> Self

Returns value & self.

Constraints:

The element type of the SIMD vector must be bool or integral.

Args:

value (Self): The other value.

Returns:

Self: value & self.

`__ror__`

def __ror__(self, value: Self) -> Self

Returns value | self.

Constraints:

The element type of the SIMD vector must be bool or integral.

Args:

value (Self): The other value.

Returns:

Self: value | self.

`__rxor__`

def __rxor__(self, value: Self) -> Self

Returns value ^ self.

Constraints:

The element type of the SIMD vector must be bool or integral.

Args:

value (Self): The other value.

Returns:

Self: value ^ self.

`__iadd__`

def __iadd__(mut self, rhs: Self)

Performs in-place addition.

The vector is mutated where each element at position i is computed as self[i] + rhs[i].

Args:

rhs (Self): The rhs of the addition operation.

`__isub__`

def __isub__(mut self, rhs: Self)

Performs in-place subtraction.

The vector is mutated where each element at position i is computed as self[i] - rhs[i].

Args:

rhs (Self): The rhs of the operation.

`__imul__`

def __imul__(mut self, rhs: Self)

Performs in-place multiplication.

The vector is mutated where each element at position i is computed as self[i] * rhs[i].

Args:

rhs (Self): The rhs of the operation.

`__itruediv__`

def __itruediv__(mut self, rhs: Self)

In-place true divide operator.

The vector is mutated where each element at position i is computed as self[i] / rhs[i].

Args:

rhs (Self): The rhs of the operation.

`__ifloordiv__`

def __ifloordiv__(mut self, rhs: Self)

In-place floor div operator.

The vector is mutated where each element at position i is computed as self[i] // rhs[i].

Args:

rhs (Self): The rhs of the operation.

`__imod__`

def __imod__(mut self, rhs: Self)

In-place mod operator.

The vector is mutated where each element at position i is computed as self[i] % rhs[i].

Args:

rhs (Self): The rhs of the operation.

`__ipow__`

def __ipow__(mut self, rhs: Int)

In-place pow operator.

The vector is mutated where each element at position i is computed as pow(self[i], rhs).

Args:

rhs (Int): The rhs of the operation.

`__ilshift__`

def __ilshift__(mut self, rhs: Self)

Computes self << rhs and saves the result in self.

Constraints:

The element type of the SIMD vector must be integral.

Args:

rhs (Self): The RHS value.

`__irshift__`

def __irshift__(mut self, rhs: Self)

Computes self >> rhs and saves the result in self.

Constraints:

The element type of the SIMD vector must be integral.

Args:

rhs (Self): The RHS value.

`__iand__`

def __iand__(mut self, rhs: Self)

Computes self & rhs and saves the result in self.

Constraints:

The element type of the SIMD vector must be bool or integral.

Args:

rhs (Self): The RHS value.

`__ixor__`

def __ixor__(mut self, rhs: Self)

Computes self ^ rhs and saves the result in self.

Constraints:

The element type of the SIMD vector must be bool or integral.

Args:

rhs (Self): The RHS value.

`__ior__`

def __ior__(mut self, rhs: Self)

Computes self | rhs and saves the result in self.

Constraints:

The element type of the SIMD vector must be bool or integral.

Args:

rhs (Self): The RHS value.

`get_type_name`

static def get_type_name() -> String

Gets this type's name, for use in error messages when handing arguments to kernels. TODO: This will go away soon, when we get better error messages for kernel calls.

Returns:

String: This type's name.

`__divmod__`

def __divmod__(self, denominator: Self) -> Tuple[Self, Self]

Computes both the quotient and remainder using floor division.

Args:

denominator (Self): The value to divide by.

Returns:

Tuple[Self, Self]: The quotient and remainder as a Tuple(self // denominator, self % denominator).
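
Because the quotient uses floor division, negative dividends round toward negative infinity rather than toward zero. A minimal sketch that calls the method directly (variable names are illustrative):

```mojo
def main():
    var a = SIMD[DType.int32, 4](7, -7, 9, -9)
    var b = SIMD[DType.int32, 4](2)  # Broadcast the divisor.

    var qr = a.__divmod__(b)
    # Quotient: floor(a / b) per element.
    print(qr[0])  # => [3, -4, 4, -5]
    # Remainder: a % b per element.
    print(qr[1])
```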

`eq`

def eq(self, rhs: Self) -> SIMD[DType.bool, size]

Compares two SIMD vectors using elementwise equality.

Args:

rhs (Self): The SIMD vector to compare with.

Returns:

SIMD[DType.bool, size]: A new bool SIMD vector of the same size whose element at position i is the value of self[i] == rhs[i].
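
The named comparison methods (`eq`, `ne`, `gt`, and so on) return one boolean per element, whereas the `==` and `!=` operators collapse the comparison to a single `Bool`. A minimal sketch contrasting the two:

```mojo
def main():
    var a = SIMD[DType.int32, 4](1, 2, 3, 4)
    var b = SIMD[DType.int32, 4](1, 0, 3, 0)

    # Elementwise: one result per position.
    print(a.eq(b))  # => [True, False, True, False]

    # Operators: a single Bool for the whole vector.
    print(a == b)  # => False (not all elements are equal)
    print(a != b)  # => True (at least one element differs)
```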

`ne`

def ne(self, rhs: Self) -> SIMD[DType.bool, size]

Compares two SIMD vectors using elementwise inequality.

Args:

rhs (Self): The SIMD vector to compare with.

Returns:

SIMD[DType.bool, size]: A new bool SIMD vector of the same size whose element at position i is the value of self[i] != rhs[i].

`gt`

def gt(self, rhs: Self) -> SIMD[DType.bool, size]

Compares two SIMD vectors using elementwise greater-than comparison.

Args:

rhs (Self): The SIMD vector to compare with.

Returns:

SIMD[DType.bool, size]: A new bool SIMD vector of the same size whose element at position i is the value of self[i] > rhs[i].

`ge`

def ge(self, rhs: Self) -> SIMD[DType.bool, size]

Compares two SIMD vectors using elementwise greater-than-or-equal comparison.

Args:

rhs (Self): The SIMD vector to compare with.

Returns:

SIMD[DType.bool, size]: A new bool SIMD vector of the same size whose element at position i is the value of self[i] >= rhs[i].

`lt`

def lt(self, rhs: Self) -> SIMD[DType.bool, size]

Compares two SIMD vectors using elementwise less-than comparison.

Args:

rhs (Self): The SIMD vector to compare with.

Returns:

SIMD[DType.bool, size]: A new bool SIMD vector of the same size whose element at position i is the value of self[i] < rhs[i].

`le`

def le(self, rhs: Self) -> SIMD[DType.bool, size]

Compares two SIMD vectors using elementwise less-than-or-equal comparison.

Args:

rhs (Self): The SIMD vector to compare with.

Returns:

SIMD[DType.bool, size]: A new bool SIMD vector of the same size whose element at position i is the value of self[i] <= rhs[i].

`__len__`

def __len__(self) -> Int

Gets the length of the SIMD vector.

Returns:

Int: The length of the SIMD vector.

static def __len__() -> Int

Get the length (always 1 for scalar types).

Returns:

Int: Always returns 1.

`__int__`

def __int__(self) -> Int

Casts the value to an Int. If there is a fractional component, then the fractional part is truncated.

Constraints:

The size of the SIMD vector must be 1.

Returns:

Int: The value as an integer.

`__float__`

def __float__(self) -> Float64

Casts the value to a float.

Constraints:

The size of the SIMD vector must be 1.

Returns:

Float64: The value as a float.

`__floor__`

def __floor__(self) -> Self

Performs elementwise floor on the elements of a SIMD vector.

Returns:

Self: The elementwise floor of this SIMD vector.

`__ceil__`

def __ceil__(self) -> Self

Performs elementwise ceiling on the elements of a SIMD vector.

Returns:

Self: The elementwise ceiling of this SIMD vector.

`__trunc__`

def __trunc__(self) -> Self

Performs elementwise truncation on the elements of a SIMD vector.

Returns:

Self: The elementwise truncated values of this SIMD vector.

`__abs__`

def __abs__(self) -> Self

Defines the absolute value operation.

For signed integral element types, the absolute value of the minimum representable value is the minimum value itself.

Returns:

Self: The absolute value of this SIMD vector.
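
The minimum-value edge case is easy to overlook. A minimal sketch with `int8` elements, where `-128` has no positive counterpart and therefore comes back unchanged:

```mojo
def main():
    var v = SIMD[DType.int8, 4](-3, 5, -128, 127)
    # The built-in abs() dispatches to __abs__.
    print(abs(v))  # => [3, 5, -128, 127]
```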

`__round__`

def __round__(self) -> Self

Performs elementwise rounding on the elements of a SIMD vector.

This rounding goes to the nearest integer with ties towards the nearest even value ("banker's rounding"). This is the default rounding mode for binary floating point in the IEEE 754 Standard for Floating Point Arithmetic.

Returns:

Self: The elementwise rounded value of this SIMD vector.
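
Ties-to-even means that halfway values do not always round up. A minimal sketch using exactly representable halves:

```mojo
def main():
    var v = SIMD[DType.float32, 4](0.5, 1.5, 2.5, 3.5)
    # Each tie goes to the nearest even integer.
    print(round(v))  # => [0.0, 2.0, 2.0, 4.0]
```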

def __round__(self, ndigits: Int) -> Self

Performs elementwise rounding on the elements of a SIMD vector.

Args:

ndigits (Int): The number of digits to round to.

Returns:

Self: The elementwise rounded value of this SIMD vector.

`__hash__`

def __hash__[H: Hasher](self, mut hasher: H)

Updates hasher with this SIMD value.

Parameters:

H (Hasher): The hasher type.

Args:

hasher (H): The hasher instance.

`__ceildiv__`

def __ceildiv__(self, denominator: Self) -> Self

Return the rounded-up result of dividing self by denominator.

Args:

denominator (Self): The denominator.

Returns:

Self: The ceiling of dividing self by denominator.

`cast`

def cast[target: DType](self) -> SIMD[target, size]

Casts the elements of the SIMD vector to the target element type.

Casting behavior:

# Basic casting preserves value within range
Int8(UInt8(127)) == Int8(127)

# Numbers above signed max wrap to negative using two's complement
Int8(UInt8(128)) == Int8(-128)
Int8(UInt8(129)) == Int8(-127)
Int8(UInt8(256)) == Int8(0)

# Negative signed cast to unsigned using two's complement
UInt8(Int8(-128)) == UInt8(128)
UInt8(Int8(-127)) == UInt8(129)
UInt8(Int8(-1)) == UInt8(255)

# Truncate precision after downcast and upcast
Float64(Float32(Float64(123456789.123456789))) == Float64(123456792.0)

# Rightmost bits of significand become 0's on upcast
Float64(Float32(0.3)) == Float64(0.30000001192092896)

# Numbers equal after truncation of float literal and cast truncation
Float32(Float64(123456789.123456789)) == Float32(123456789.123456789)

# Float to int/uint discards the fractional part
Int64(Float64(42.2)) == Int64(42)

Parameters:

target (DType): The target DType.

Returns:

SIMD[target, size]: A new SIMD vector whose elements have been cast to the target element type.
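
The scalar rules above apply per element when casting a whole vector. A minimal sketch with illustrative values:

```mojo
def main():
    # Float to int: the fractional part is dropped.
    var f = SIMD[DType.float32, 4](1.5, 2.5, 42.2, 100.0)
    print(f.cast[DType.int32]())  # => [1, 2, 42, 100]

    # Unsigned to signed of the same width: values above 127 wrap.
    var u = SIMD[DType.uint8, 4](127, 128, 129, 255)
    print(u.cast[DType.int8]())  # => [127, -128, -127, -1]
```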

`is_power_of_two`

def is_power_of_two(self) -> SIMD[DType.bool, size]

Checks if the input value is a power of 2 for each element of a SIMD vector.

Constraints:

The element type of the input vector must be integral.

Returns:

SIMD[DType.bool, size]: A SIMD value where the element at position i is True if the integer at position i of the input value is a power of 2, False otherwise.
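
A minimal sketch with positive inputs (the values are illustrative):

```mojo
def main():
    var v = SIMD[DType.int32, 4](1, 6, 8, 12)
    # 1 = 2^0 and 8 = 2^3 are powers of two; 6 and 12 are not.
    print(v.is_power_of_two())  # => [True, False, True, False]
```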

`write_to`

def write_to(self, mut writer: T)

Formats this SIMD value to the provided Writer.

Args:

writer (T): The object to write to.

`write_repr_to`

def write_repr_to(self, mut writer: T)

Write the string representation of the SIMD value.

Args:

writer (T): The value to write to.

`write_padded`

def write_padded[W: Writer](self, mut writer: W, width: Int) where dtype.is_integral()

Write the integral SIMD with each element right-aligned to a set padding. No additional space between elements is inserted.

Parameters:

W (Writer): A type conforming to the Writer trait.

Args:

writer (W): The object to write to.
width (Int): The amount to pad to the left.

`to_bits`

def to_bits[_dtype: DType = _uint_type_of_width[bit_width_of[dtype]()]()](self) -> SIMD[_dtype, size]

Bitcasts the SIMD vector to an integer SIMD vector.

Parameters:

_dtype (DType): The integer type to cast to.

Returns:

SIMD[_dtype, size]: An integer representation of the floating-point value.
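
A minimal round-trip sketch, relying on the default `_dtype` (an unsigned integer of the same bit width) and on the `from_bits` constructor documented above:

```mojo
def main():
    var x = Float32(1.0)

    # 1.0 in IEEE 754 single precision is 0x3F800000.
    var bits = x.to_bits()
    print(bits)  # => 1065353216

    # Reinterpreting the bits recovers the original value.
    print(Float32(from_bits=bits))  # => 1.0
```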

`from_bytes`

static def from_bytes[*, big_endian: Bool = is_big_endian()](bytes: InlineArray[UInt8, size_of[Self]()]) -> Self

Converts a byte array to a vector.

Parameters:

big_endian (Bool): Whether the byte array is big-endian.

Args:

bytes (InlineArray[UInt8, size_of[Self]()]): The byte array to convert.

Returns:

Self: The integer value.

`as_bytes`

def as_bytes[*, big_endian: Bool = is_big_endian()](self) -> InlineArray[UInt8, size_of[Self]()]

Convert the vector to a byte array.

Parameters:

big_endian (Bool): Whether the byte array should be big-endian.

Returns:

InlineArray[UInt8, size_of[Self]()]: The byte array.
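
Passing `big_endian` explicitly makes the byte order independent of the host. A minimal round-trip sketch with `from_bytes`, using a scalar `Int32` (four bytes):

```mojo
def main():
    var x = Int32(1)

    # Big-endian: most significant byte first, so the 1 lands in the last byte.
    var be = x.as_bytes[big_endian=True]()
    print(be[0], be[3])  # => 0 1

    # Decoding with the same byte order recovers the value.
    print(Int32.from_bytes[big_endian=True](be))  # => 1
```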

`clamp`

def clamp(self, lower_bound: Self, upper_bound: Self) -> Self

Clamps the values in a SIMD vector to be in a certain range.

Clamp cuts values in the input SIMD vector off at the upper bound and lower bound values. For example, SIMD vector [0, 1, 2, 3] clamped to a lower bound of 1 and an upper bound of 2 would return [1, 1, 2, 2].

Args:

lower_bound (Self): Minimum of the range to clamp to.
upper_bound (Self): Maximum of the range to clamp to.

Returns:

Self: A new SIMD vector containing the elements of self clamped to be within lower_bound and upper_bound.
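
The example from the description, written out as a minimal sketch (the integer bounds are broadcast to every element):

```mojo
def main():
    var v = SIMD[DType.int32, 4](0, 1, 2, 3)
    print(v.clamp(1, 2))  # => [1, 1, 2, 2]
```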

`fma`

def fma[flag: FastMathFlag = FastMathFlag.CONTRACT](self, multiplier: Self, accumulator: Self) -> Self

Performs a fused multiply-add operation, i.e. self*multiplier + accumulator.

Parameters:

flag (FastMathFlag): Fast-math optimization flags to apply (default: CONTRACT).

Args:

multiplier (Self): The value to multiply.
accumulator (Self): The value to accumulate.

Returns:

Self: A new vector whose element at position i is computed as self[i]*multiplier[i] + accumulator[i].
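
A minimal sketch using the default flag, with values chosen so the result is exact in float32:

```mojo
def main():
    var x = SIMD[DType.float32, 4](1.0, 2.0, 3.0, 4.0)
    # Computes x * 2.0 + 1.0 per element; the literals are broadcast.
    print(x.fma(2.0, 1.0))  # => [3.0, 5.0, 7.0, 9.0]
```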

`shuffle`

def shuffle[*mask: Int](self) -> Self

Shuffles (also called blend) the values of the current vector using the specified mask (permutation). The mask values must be within 2 * len(self).

Parameters:

*mask (Int): The permutation to use in the shuffle.

Returns:

Self: A new vector with the same length as the mask where the value at position i is (self)[permutation[i]].

def shuffle[*mask: Int](self, other: Self) -> Self

Shuffles (also called blend) the values of the current vector with the other value using the specified mask (permutation). The mask values must be within 2 * len(self).

Parameters:

*mask (Int): The permutation to use in the shuffle.

Args:

other (Self): The other vector to shuffle with.

Returns:

Self: A new vector with the same length as the mask where the value at position i is (self + other)[permutation[i]], with self + other denoting the concatenation of the two vectors.

def shuffle[mask: IndexList[size, element_type=mask.element_type]](self) -> Self

Shuffles (also called blend) the values of the current vector using the specified mask (permutation). The mask values must be within 2 * len(self).

Parameters:

mask (IndexList[size, element_type=mask.element_type]): The permutation to use in the shuffle.

Returns:

Self: A new vector with the same length as the mask where the value at position i is (self)[permutation[i]].

def shuffle[mask: IndexList[size, element_type=mask.element_type]](self, other: Self) -> Self

Shuffles (also called blend) the values of the current vector with the other value using the specified mask (permutation). The mask values must be within 2 * len(self).

Parameters:

mask (IndexList[size, element_type=mask.element_type]): The permutation to use in the shuffle.

Args:

other (Self): The other vector to shuffle with.

Returns:

Self: A new vector with the same length as the mask where the value at position i is (self.join(other))[mask[i]]. Mask values below len(self) select from self; the remaining values select from other.
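
The variadic-mask overloads can be sketched as follows. The vectors and masks are illustrative. In the two-vector form, indices 0 to 3 refer to lanes of `a` and indices 4 to 7 refer to lanes of `b`.

```mojo
def main():
    var a = SIMD[DType.int32, 4](1, 2, 3, 4)
    var b = SIMD[DType.int32, 4](5, 6, 7, 8)

    # One-vector form: the mask permutes the lanes of `a`.
    print(a.shuffle[3, 2, 1, 0]())  # [4, 3, 2, 1]

    # Two-vector form: the mask indexes into the lanes of `a` followed by `b`.
    print(a.shuffle[0, 4, 1, 5](b))  # [1, 5, 2, 6]
```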

`slice`

def slice[output_width: Int, /, *, offset: Int = 0](self) -> SIMD[dtype, output_width]

Returns a slice of the vector of the specified width with the given offset.

Constraints:

output_width + offset must not exceed the size of this SIMD vector.

Parameters:

output_width (Int): The output SIMD vector size.
offset (Int): The given offset for the slice.

Returns:

SIMD[dtype, output_width]: A new vector whose elements map to self[offset:offset+output_width].
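
As a short sketch with illustrative values, `slice` extracts a narrower vector starting at the given offset:

```mojo
def main():
    var v = SIMD[DType.int32, 4](1, 2, 3, 4)

    # The first two lanes (offset defaults to 0).
    print(v.slice[2]())  # [1, 2]

    # Two lanes starting at lane 2; output_width + offset equals the vector size.
    print(v.slice[2, offset=2]())  # [3, 4]
```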

`insert`

def insert[*, offset: Int = 0](self, value: SIMD[dtype]) -> Self

Returns a new vector where the elements between offset and offset + input_width have been replaced with the elements in value, where input_width is the width of value.

Parameters:

offset (Int): The offset to insert at. This must be a multiple of value's size.

Args:

value (SIMD[dtype]): The value to be inserted.

Returns:

Self: A new vector whose elements at self[offset:offset+input_width] contain the elements of value, where input_width is the width of value.
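
A brief sketch of `insert`, using illustrative values. The offset of 2 is a multiple of the inserted vector's size, as the parameter description requires.

```mojo
def main():
    var v = SIMD[DType.int32, 4](1, 2, 3, 4)
    var patch = SIMD[DType.int32, 2](9, 9)

    # Replace lanes 2 and 3 with the lanes of `patch`; `v` itself is unchanged.
    print(v.insert[offset=2](patch))  # [1, 2, 9, 9]
```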

`join`

def join(self, other: Self) -> SIMD[dtype, (2 * size)]

Concatenates the two vectors together.

Args:

other (Self): The other SIMD vector.

Returns:

SIMD[dtype, (2 * size)]: A new vector self_0, self_1, ..., self_{n-1}, other_0, ..., other_{n-1}, where n is the size of each input vector.

`interleave`

def interleave(self, other: Self) -> SIMD[dtype, (2 * size)]

Constructs a vector by interleaving two input vectors.

Args:

other (Self): The other SIMD vector.

Returns:

SIMD[dtype, (2 * size)]: A new vector self_0, other_0, ..., self_{n-1}, other_{n-1}, where n is the size of each input vector.
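
The difference between `join` and `interleave` can be seen side by side in this sketch, which uses illustrative values:

```mojo
def main():
    var a = SIMD[DType.int32, 2](1, 2)
    var b = SIMD[DType.int32, 2](3, 4)

    # join: all lanes of `a`, then all lanes of `b`.
    print(a.join(b))  # [1, 2, 3, 4]

    # interleave: lanes of `a` and `b` alternate.
    print(a.interleave(b))  # [1, 3, 2, 4]
```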

`split`

def split(self) -> Tuple[SIMD[dtype, (size // 2)], SIMD[dtype, (size // 2)]]

Splits the SIMD vector into 2 subvectors.

Returns:

Tuple[SIMD[dtype, (size // 2)], SIMD[dtype, (size // 2)]]: A tuple of two new vectors, self[0:N/2] and self[N/2:N], where N is the size of the vector.

`deinterleave`

def deinterleave(self) -> Tuple[SIMD[dtype, (size / 2)], SIMD[dtype, (size / 2)]]

Constructs two vectors by deinterleaving the even and odd lanes of the vector.

Constraints:

The vector size must be greater than 1.

Returns:

Tuple[SIMD[dtype, (size / 2)], SIMD[dtype, (size / 2)]]: Two vectors, the first of the form self_0, self_2, ..., self_{n-2} and the second of the form self_1, self_3, ..., self_{n-1}.
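
`split` and `deinterleave` both return a pair of half-width vectors but partition the lanes differently. A sketch with illustrative values, assuming the returned tuple is indexed with literal positions:

```mojo
def main():
    var v = SIMD[DType.int32, 4](1, 2, 3, 4)

    # split: lower half and upper half.
    var halves = v.split()
    print(halves[0])  # [1, 2]
    print(halves[1])  # [3, 4]

    # deinterleave: even-indexed lanes and odd-indexed lanes.
    var lanes = v.deinterleave()
    print(lanes[0])  # [1, 3]
    print(lanes[1])  # [2, 4]
```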

`reduce`

def reduce[func: def[width: Int](SIMD[dtype, width], SIMD[dtype, width]) -> SIMD[dtype, width], size_out: Int = 1](self) -> SIMD[dtype, size_out]

Reduces the vector using a provided reduce operator.

Constraints:

size_out must not exceed the width of the vector.

Parameters:

func (def[width: Int](SIMD[dtype, width], SIMD[dtype, width]) -> SIMD[dtype, width]): The reduce function to apply to elements in this SIMD.
size_out (Int): The width of the reduction.

Returns:

SIMD[dtype, size_out]: A new vector of width size_out holding the reduction of the vector elements (a scalar when size_out is 1).

def reduce[func: def[width: Int](SIMD[dtype, width], SIMD[dtype, width]) capturing -> SIMD[dtype, width], size_out: Int = 1](self) -> SIMD[dtype, size_out]

Reduces the vector using a provided reduce operator.

Constraints:

size_out must not exceed the width of the vector.

Parameters:

func (def[width: Int](SIMD[dtype, width], SIMD[dtype, width]) capturing -> SIMD[dtype, width]): The reduce function to apply to elements in this SIMD.
size_out (Int): The width of the reduction.

Returns:

SIMD[dtype, size_out]: A new vector of width size_out holding the reduction of the vector elements (a scalar when size_out is 1).

`reduce_max`

def reduce_max[size_out: Int = 1](self) -> SIMD[dtype, size_out]

Reduces the vector using the max operator.

Constraints:

size_out must not exceed the width of the vector. The element type of the vector must be integer or floating-point.

Parameters:

size_out (Int): The width of the reduction.

Returns:

SIMD[dtype, size_out]: The maximum element of the vector.

`reduce_min`

def reduce_min[size_out: Int = 1](self) -> SIMD[dtype, size_out]

Reduces the vector using the min operator.

Constraints:

size_out must not exceed the width of the vector. The element type of the vector must be integer or floating-point.

Parameters:

size_out (Int): The width of the reduction.

Returns:

SIMD[dtype, size_out]: The minimum element of the vector.

`reduce_add`

def reduce_add[size_out: Int = 1](self) -> SIMD[dtype, size_out]

Reduces the vector using the add operator.

Constraints:

size_out must not exceed the width of the vector.

Parameters:

size_out (Int): The width of the reduction.

Returns:

SIMD[dtype, size_out]: The sum of all vector elements.

`reduce_mul`

def reduce_mul[size_out: Int = 1](self) -> SIMD[dtype, size_out]

Reduces the vector using the mul operator.

Constraints:

size_out must not exceed the width of the vector. The element type of the vector must be integer or floating-point.

Parameters:

size_out (Int): The width of the reduction.

Returns:

SIMD[dtype, size_out]: The product of all vector elements.
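
The four arithmetic reductions above (`reduce_max`, `reduce_min`, `reduce_add`, `reduce_mul`) can be sketched together. The input is illustrative and each call uses the default size_out of 1, so the result is a scalar.

```mojo
def main():
    var v = SIMD[DType.int32, 4](1, 2, 3, 4)

    print(v.reduce_max())  # 4
    print(v.reduce_min())  # 1
    print(v.reduce_add())  # 10  (1 + 2 + 3 + 4)
    print(v.reduce_mul())  # 24  (1 * 2 * 3 * 4)
```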

`reduce_and`

def reduce_and[size_out: Int = 1](self) -> SIMD[dtype, size_out]

Reduces the vector using the bitwise & operator.

Constraints:

size_out must not exceed the width of the vector. The element type of the vector must be integer or boolean.

Parameters:

size_out (Int): The width of the reduction.

Returns:

SIMD[dtype, size_out]: The reduced vector.

`reduce_or`

def reduce_or[size_out: Int = 1](self) -> SIMD[dtype, size_out]

Reduces the vector using the bitwise | operator.

Constraints:

size_out must not exceed the width of the vector. The element type of the vector must be integer or boolean.

Parameters:

size_out (Int): The width of the reduction.

Returns:

SIMD[dtype, size_out]: The reduced vector.

`reduce_bit_count`

def reduce_bit_count(self) -> Int

Returns the total number of bits set in the SIMD vector.

Constraints:

Must be either an integral or a boolean type.

Returns:

Int: Count of set bits across all elements of the vector.
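
The bitwise reductions (`reduce_and`, `reduce_or`, `reduce_bit_count`) are sketched below on an integer vector. The values are illustrative, with their binary forms noted in the comments.

```mojo
def main():
    # Lanes in binary: 1100, 1010, 1110, 1000.
    var v = SIMD[DType.uint8, 4](12, 10, 14, 8)

    # Only bit 3 is set in every lane.
    print(v.reduce_and())  # 8

    # Bits 1, 2, and 3 are each set in at least one lane.
    print(v.reduce_or())  # 14

    # Set bits per lane: 2 + 2 + 3 + 1.
    print(v.reduce_bit_count())  # 8
```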

`select`

def select[_dtype: DType](self, true_case: SIMD[_dtype, size], false_case: SIMD[_dtype, size]) -> SIMD[_dtype, size]

Selects the values of the true_case or the false_case based on the current boolean values of the SIMD vector.

Constraints:

The element type of the vector must be boolean.

Parameters:

_dtype (DType): The element type of the input and output SIMD vectors.

Args:

true_case (SIMD[_dtype, size]): The values selected if the positional value is True.
false_case (SIMD[_dtype, size]): The values selected if the positional value is False.

Returns:

SIMD[_dtype, size]: A new vector of the form [true_case[i] if elem else false_case[i] for i, elem in enumerate(self)].
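
A small sketch of `select`, assuming a boolean vector built directly from `Bool` literals. The mask and case vectors are illustrative.

```mojo
def main():
    var mask = SIMD[DType.bool, 4](True, False, True, False)
    var a = SIMD[DType.int32, 4](1, 2, 3, 4)
    var b = SIMD[DType.int32, 4](10, 20, 30, 40)

    # Lanes where the mask is True come from `a`; the others come from `b`.
    print(mask.select(a, b))  # [1, 20, 3, 40]
```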

`rotate_left`

def rotate_left[shift: Int](self) -> Self

Shifts the elements of a SIMD vector to the left by shift elements (with wrap-around).

Constraints:

-size <= shift < size

Parameters:

shift (Int): The number of positions by which to rotate the elements of SIMD vector to the left (with wrap-around).

Returns:

Self: The SIMD vector rotated to the left by shift elements (with wrap-around).

`rotate_right`

def rotate_right[shift: Int](self) -> Self

Shifts the elements of a SIMD vector to the right by shift elements (with wrap-around).

Constraints:

-size < shift <= size

Parameters:

shift (Int): The number of positions by which to rotate the elements of SIMD vector to the right (with wrap-around).

Returns:

Self: The SIMD vector rotated to the right by shift elements (with wrap-around).
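
The two rotations can be compared in a short sketch with illustrative values. Lanes that move past one end of the vector reappear at the other end.

```mojo
def main():
    var v = SIMD[DType.int32, 4](1, 2, 3, 4)

    # Each lane moves one position toward index 0; lane 0 wraps to the end.
    print(v.rotate_left[1]())  # [2, 3, 4, 1]

    # Each lane moves one position toward the last index; the last lane wraps to index 0.
    print(v.rotate_right[1]())  # [4, 1, 2, 3]
```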

`shift_left`

def shift_left[shift: Int](self) -> Self

Shifts the elements of a SIMD vector to the left by shift elements (no wrap-around, fill with zero).

Constraints:

0 <= shift <= size

Parameters:

shift (Int): The number of positions by which to shift the elements of SIMD vector to the left (no wrap-around, fill with zero).

Returns:

Self: The SIMD vector shifted to the left by shift elements (no wrap-around, fill with zero).

`shift_right`

def shift_right[shift: Int](self) -> Self

Shifts the elements of a SIMD vector to the right by shift elements (no wrap-around, fill with zero).

Constraints:

0 <= shift <= size

Parameters:

shift (Int): The number of positions by which to shift the elements of SIMD vector to the right (no wrap-around, fill with zero).

Returns:

Self: The SIMD vector shifted to the right by shift elements (no wrap-around, fill with zero).
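
Unlike the rotations, the element shifts discard lanes that move past the end and fill the vacated lanes with zero. A sketch with illustrative values:

```mojo
def main():
    var v = SIMD[DType.int32, 4](1, 2, 3, 4)

    # Lane 0 is dropped and a zero fills the last lane.
    print(v.shift_left[1]())  # [2, 3, 4, 0]

    # The last lane is dropped and a zero fills lane 0.
    print(v.shift_right[1]())  # [0, 1, 2, 3]
```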

`reversed`

def reversed(self) -> Self

Returns a copy of the SIMD vector with its elements in reverse index order.

Examples:

print(SIMD[DType.uint8, 4](1, 2, 3, 4).reversed()) # [4, 3, 2, 1]

Returns:

Self: The vector with its elements in reverse index order.

`product`

def product(self) -> Scalar[dtype]

Calculate the product (returns the value for scalar types).

Returns:

Scalar[dtype]: The integer value.

`sum`

def sum(self) -> Scalar[dtype]

Calculate the sum (returns the value for scalar types).

Returns:

Scalar[dtype]: The integer value.

`value`

def value(self) -> Scalar[dtype]

Get the scalar value.

Returns:

Scalar[dtype]: The runtime integer value.

`tuple`

def tuple(var self) -> Coord[Self]

Get as a tuple (not valid for Scalar CoordLike).

Returns:

Coord[Self]: Never returns; aborts at compile time.
