elementwise
elementwise[func: def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None, simd_width: Int, *, use_blocking_impl: Bool = False, target: StringSlice[StaticConstantOrigin] = StringSlice("cpu"), _trace_description: StringSlice[StaticConstantOrigin] = StringSlice("elementwise")](shape: Int, ctx: Optional[DeviceContext] = None)
Executes func[width, rank](indices), possibly as sub-tasks, for a suitable combination of width and indices so as to cover shape. Returns when all sub-tasks have completed.
Parameters:
- func (
def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None): The body function. - simd_width (
Int): The SIMD vector width to use. - use_blocking_impl (
Bool): Do not invoke the function using asynchronous calls. - target (
StringSlice): The target to run on. - _trace_description (
StringSlice): Description of the trace.
Args:
Raises:
If the operation fails.
elementwise[rank: Int, //, func: def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None, simd_width: Int, *, use_blocking_impl: Bool = False, target: StringSlice[StaticConstantOrigin] = StringSlice("cpu"), _trace_description: StringSlice[StaticConstantOrigin] = StringSlice("elementwise")](shape: IndexList[rank, element_type=shape.element_type], ctx: Optional[DeviceContext] = None)
Executes func[width, rank](indices), possibly as sub-tasks, for a suitable combination of width and indices so as to cover shape. Returns when all sub-tasks have completed.
Parameters:
- rank (
Int): The rank of the buffer. - func (
def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None): The body function. - simd_width (
Int): The SIMD vector width to use. - use_blocking_impl (
Bool): Do not invoke the function using asynchronous calls. - target (
StringSlice): The target to run on. - _trace_description (
StringSlice): Description of the trace.
Args:
Raises:
If the operation fails.
elementwise[func: def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None, simd_width: Int, *, use_blocking_impl: Bool = False, target: StringSlice[StaticConstantOrigin] = StringSlice("cpu"), pdl_level: PDLLevel = PDLLevel(1), _trace_description: StringSlice[StaticConstantOrigin] = StringSlice("elementwise")](shape: Int, context: DeviceContext)
Executes func[width, rank](indices), possibly as sub-tasks, for a suitable combination of width and indices so as to cover shape. Returns when all sub-tasks have completed.
Parameters:
- func (
def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None): The body function. - simd_width (
Int): The SIMD vector width to use. - use_blocking_impl (
Bool): Do not invoke the function using asynchronous calls. - target (
StringSlice): The target to run on. - pdl_level (
PDLLevel): The PDL level controlling GPU kernel overlap behavior. - _trace_description (
StringSlice): Description of the trace.
Args:
- shape (
Int): The shape of the buffer. - context (
DeviceContext): The device context to use.
Raises:
If the operation fails.
elementwise[rank: Int, //, func: def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None, simd_width: Int, *, use_blocking_impl: Bool = False, target: StringSlice[StaticConstantOrigin] = StringSlice("cpu"), pdl_level: PDLLevel = PDLLevel(1), _trace_description: StringSlice[StaticConstantOrigin] = StringSlice("elementwise")](shape: IndexList[rank, element_type=shape.element_type], context: DeviceContext)
Executes func[width, rank](indices), possibly as sub-tasks, for a suitable combination of width and indices so as to cover shape. Returns when all sub-tasks have completed.
Parameters:
- rank (
Int): The rank of the buffer. - func (
def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None): The body function. - simd_width (
Int): The SIMD vector width to use. - use_blocking_impl (
Bool): Do not invoke the function using asynchronous calls. - target (
StringSlice): The target to run on. - pdl_level (
PDLLevel): The PDL level controlling GPU kernel overlap behavior. - _trace_description (
StringSlice): Description of the trace.
Args:
- shape (
IndexList): The shape of the buffer. - context (
DeviceContext): The device context to use.
Raises:
If the operation fails.
elementwise[rank: Int, //, func: def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None, simd_width: Int, *, use_blocking_impl: Bool = False, target: StringSlice[StaticConstantOrigin] = StringSlice("cpu"), pdl_level: PDLLevel = PDLLevel(1), _trace_description: StringSlice[StaticConstantOrigin] = StringSlice("elementwise")](shape: IndexList[rank, element_type=shape.element_type], context: DeviceContextPtr)
Executes func[width, rank](indices), possibly as sub-tasks, for a suitable combination of width and indices so as to cover shape. Returns when all sub-tasks have completed.
Parameters:
- rank (
Int): The rank of the buffer. - func (
def[width: Int, rank: Int, alignment: Int = 1](IndexList[rank]) capturing -> None): The body function. - simd_width (
Int): The SIMD vector width to use. - use_blocking_impl (
Bool): Do not invoke the function using asynchronous calls. - target (
StringSlice): The target to run on. - pdl_level (
PDLLevel): The PDL level controlling GPU kernel overlap behavior. - _trace_description (
StringSlice): Description of the trace.
Args:
- shape (
IndexList): The shape of the buffer. - context (
DeviceContextPtr): The device context to use.
Raises:
If the operation fails.