Version: 1.0.0b2

dual_elementwise

def dual_elementwise[func_0: def[width: Int, alignment: Int = 1](Coord[_]) capturing -> None, func_1: def[width: Int, alignment: Int = 1](Coord[_]) capturing -> None, simd_width: Int, *, target: StringSlice[StaticConstantOrigin] = StringSlice("gpu"), _trace_description: StringSlice[StaticConstantOrigin] = StringSlice("dual_elementwise")](shape_0: Coord, shape_1: Coord, context: DeviceContext)

Executes two elementwise functions over their respective shapes in a single GPU kernel launch. Each thread processes elements from both shapes, fusing two independent elementwise passes into one.
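The fusion described above can be pictured with a small CPU-side model. This is only a conceptual sketch in Python, not the Mojo API: `dual_elementwise_model`, the work-item layout, and the narrower tail chunk are illustrative assumptions, and the real kernel's mapping of threads to elements is not specified here. Each loop iteration stands in for one GPU thread that handles a chunk from each shape.

```python
from itertools import product


def dual_elementwise_model(func_0, func_1, simd_width, shape_0, shape_1):
    """Hypothetical model: run func_0 over shape_0 and func_1 over shape_1 in one pass."""

    def work_items(shape):
        # Split the innermost dimension into chunks of at most simd_width
        # (assumption: the tail chunk is simply narrower).
        *outer, inner = shape
        for prefix in product(*(range(d) for d in outer)):
            for start in range(0, inner, simd_width):
                yield (*prefix, start), min(simd_width, inner - start)

    work_0 = list(work_items(shape_0))
    work_1 = list(work_items(shape_1))

    # Each iteration stands in for one GPU thread of the single launch.
    for tid in range(max(len(work_0), len(work_1))):
        if tid < len(work_0):
            coord, width = work_0[tid]
            func_0(width, coord)
        if tid < len(work_1):
            coord, width = work_1[tid]
            func_1(width, coord)


# Two independent elementwise passes that capture their own buffers.
src_a, dst_a = [1, 2, 3, 4, 5], [0] * 5
src_b, dst_b = [[1, 2], [3, 4]], [[0, 0], [0, 0]]


def double_a(width, coord):
    (i,) = coord
    for k in range(width):
        dst_a[i + k] = 2 * src_a[i + k]


def negate_b(width, coord):
    r, c = coord
    for k in range(width):
        dst_b[r][c + k] = -src_b[r][c + k]


dual_elementwise_model(double_a, negate_b, 2, (5,), (2, 2))
```

The two shapes do not need to match: in this model, a thread whose index falls beyond one shape's work list skips that function and still runs the other.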

Parameters:

  • func_0 (def[width: Int, alignment: Int = 1](Coord[_]) capturing -> None): The first body function.
  • func_1 (def[width: Int, alignment: Int = 1](Coord[_]) capturing -> None): The second body function.
  • simd_width (Int): The SIMD vector width to use.
  • target (StringSlice[StaticConstantOrigin]): The target to run on. Must be "gpu", which is also the default.
  • _trace_description (StringSlice[StaticConstantOrigin]): Description of the trace. Defaults to "dual_elementwise".

Args:

  • shape_0 (Coord): The shape for the first function.
  • shape_1 (Coord): The shape for the second function.
  • context (DeviceContext): The device context to use.

Raises:

If the operation fails.