Version: Nightly


reduce

def reduce[val_type: DType, simd_width: SIMDSize, //, shuffle: def[dtype: DType, simd_width: SIMDSize](val: SIMD[dtype, simd_width], offset: UInt32) thin -> SIMD[dtype, simd_width], func: def[dtype: DType, width: SIMDSize](SIMD[dtype, width], SIMD[dtype, width]) capturing thin -> SIMD[dtype, width]](val: SIMD[val_type, simd_width]) -> SIMD[val_type, simd_width]

Performs a generic warp-wide reduction operation using shuffle operations.

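This is a convenience wrapper around lane_group_reduce that operates on the entire warp. Both the shuffle operation (shuffle) and the reduction function (func) are supplied as compile-time parameters, so the same routine can express sum, max, min, and similar reductions.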

Example:

    from std.gpu.primitives.warp import reduce, shuffle_down

    # Compute warp-wide sum using shuffle down
    @parameter
    def add[dtype: DType, width: SIMDSize](x: SIMD[dtype, width], y: SIMD[dtype, width]) capturing -> SIMD[dtype, width]:
        return x + y

    # Each lane of the warp supplies its own `val`; the values are
    # combined across lanes with `add`.
    val = SIMD[DType.float32, 4](2.0, 4.0, 6.0, 8.0)
    result = reduce[shuffle_down, add](val)
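
To make the roles of the shuffle and func parameters concrete, the following is a plain Python model of a shuffle-based warp reduction. It is only a host-side sketch, not Mojo GPU code and not the library's implementation. It assumes a 32-lane warp, one scalar per lane instead of a SIMD vector, and an offset that halves on every step. The names warp_reduce_model, shuffle_down_model, and shuffle_xor_model are hypothetical and exist only for this illustration.

```python
from typing import Callable, List

WARP_SIZE = 32  # assumption: 32 lanes per warp

Lanes = List[float]  # the model keeps one scalar value per lane


def shuffle_down_model(lanes: Lanes, offset: int) -> Lanes:
    """Lane i reads lane i + offset; lanes with no source keep their own value."""
    n = len(lanes)
    return [lanes[i + offset] if i + offset < n else lanes[i] for i in range(n)]


def shuffle_xor_model(lanes: Lanes, offset: int) -> Lanes:
    """Lane i reads lane i ^ offset (a butterfly exchange)."""
    return [lanes[i ^ offset] for i in range(len(lanes))]


def warp_reduce_model(
    lanes: Lanes,
    shuffle: Callable[[Lanes, int], Lanes],
    func: Callable[[float, float], float],
) -> Lanes:
    """Combine every lane's value with a shuffled neighbour, halving the offset."""
    result = list(lanes)
    offset = len(result) // 2
    while offset > 0:
        shuffled = shuffle(result, offset)
        result = [func(a, b) for a, b in zip(result, shuffled)]
        offset //= 2
    return result


# Lane i starts with the value i.
values = [float(i) for i in range(WARP_SIZE)]

# Sum via a down-shuffle, mirroring the reduce[shuffle_down, add] example above.
down_sum = warp_reduce_model(values, shuffle_down_model, lambda a, b: a + b)

# Max via an XOR shuffle, showing that the combining function is swappable.
xor_max = warp_reduce_model(values, shuffle_xor_model, max)
```

In this model the reduction finishes in log2(32) = 5 steps. The down-shuffle variant leaves the complete sum in lane 0, while the XOR variant leaves the result in every lane. The model makes no claim about which lanes the real reduce populates beyond what the Returns section states.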

Parameters:

val_type (DType): The data type of the SIMD elements (e.g. float32, int32).
simd_width (SIMDSize): The number of elements in the SIMD vector.
shuffle (def[dtype: DType, simd_width: SIMDSize](val: SIMD[dtype, simd_width], offset: UInt32) thin -> SIMD[dtype, simd_width]): A function that performs the warp shuffle operation. Takes a SIMD value and offset and returns the shuffled result.
func (def[dtype: DType, width: SIMDSize](SIMD[dtype, width], SIMD[dtype, width]) capturing thin -> SIMD[dtype, width]): A binary function that combines two SIMD values during reduction. This defines the reduction operation (e.g. add, max, min).

Args:

val (SIMD[val_type, simd_width]): The SIMD value to reduce. Each lane in the warp contributes its own value.

Returns:

SIMD[val_type, simd_width]: A SIMD value containing the reduction result broadcast to all lanes in the warp.