IMPORTANT: To view this page as Markdown, append `.md` to the URL (e.g. /docs/manual/basics.md). For the complete Mojo documentation index, see llms.txt.
Version: Nightly

dual_elementwise

def dual_elementwise[rank: Int, //, func_0: def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None, func_1: def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None, simd_width: Int, *, target: StringSlice[StaticConstantOrigin] = StringSlice("gpu"), _trace_description: StringSlice[StaticConstantOrigin] = StringSlice("dual_elementwise")](shape_0: IndexList[rank], shape_1: IndexList[rank], context: DeviceContext)

Executes two elementwise functions over their respective shapes in a single GPU kernel launch. Each thread processes elements from both shapes, fusing two independent elementwise passes into one.
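The fusion described above can be illustrated with a small host-side model. This is only a conceptual sketch in Python, not the Mojo implementation: the names `dual_elementwise_model` and `unravel` are hypothetical, the row-major mapping from a flat thread id to an index is an assumption, and SIMD width, alignment, and the actual GPU launch are ignored. It shows one pass in which each simulated thread handles an element of each shape, guarded by that shape's element count.

```python
from math import prod


def unravel(flat, shape):
    """Convert a flat row-major index into a multi-dimensional index tuple."""
    idx = []
    for dim in reversed(shape):
        idx.append(flat % dim)
        flat //= dim
    return tuple(reversed(idx))


def dual_elementwise_model(func_0, func_1, shape_0, shape_1):
    """Hypothetical model: one loop stands in for the single kernel launch."""
    n0, n1 = prod(shape_0), prod(shape_1)
    # Each iteration plays the role of one thread (assumed work division).
    for tid in range(max(n0, n1)):
        if tid < n0:  # guard: this thread has an element in shape_0
            func_0(unravel(tid, shape_0))
        if tid < n1:  # guard: this thread has an element in shape_1
            func_1(unravel(tid, shape_1))


# Two independent outputs with different shapes but the same rank.
a = [[0] * 3 for _ in range(2)]
b = [[0] * 2 for _ in range(4)]


def fill_a(idx):
    i, j = idx
    a[i][j] = i * 3 + j


def fill_b(idx):
    i, j = idx
    b[i][j] = -(i * 2 + j)


dual_elementwise_model(fill_a, fill_b, (2, 3), (4, 2))
```

In this model the two body functions capture their own outputs, mirroring the `capturing` closures in the signature, and the shapes may differ in size as long as they share a rank.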

Parameters:

  • rank (Int): The rank shared by both shapes. Inferred from the arguments.
  • func_0 (def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None): The first body function.
  • func_1 (def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None): The second body function.
  • simd_width (Int): The SIMD vector width to use.
  • target (StringSlice[StaticConstantOrigin]): The target to run on (must be GPU). Defaults to "gpu".
  • _trace_description (StringSlice[StaticConstantOrigin]): Description used to label the operation in traces. Defaults to "dual_elementwise".

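Args:

  • shape_0 (IndexList[rank]): The shape over which func_0 is executed.
  • shape_1 (IndexList[rank]): The shape over which func_1 is executed.
  • context (DeviceContext): The device context used for the kernel launch.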

Raises:

If the operation fails.