async_copy_wait_group
async_copy_wait_group(n: Int32)
Waits for the completion of the n most recently committed cp.async-groups.
This function blocks execution until the specified number of previously committed cp.async-groups have completed their memory transfers.
Notes:
- Only supported on NVIDIA GPUs.
- Maps to the cp.async.wait_group PTX instruction.
- Provides fine-grained control over asynchronous transfer synchronization.
- Can be used to implement a pipeline of asynchronous transfers.
Args:
- n (Int32): The number of pending cp.async-groups to wait for. Must be > 0.
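The pipelining use mentioned in the notes can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation. It assumes that async_copy_commit_group and async_copy_wait_group are both importable from gpu.memory, and that the wait follows the PTX behavior of cp.async.wait_group N, which returns once at most N of the most recently committed groups are still pending. The function name two_stage_wait_sketch is hypothetical, and the copies themselves are omitted.

```mojo
# Assumed import path; adjust it if your Mojo version places these elsewhere.
from gpu.memory import async_copy_commit_group, async_copy_wait_group


fn two_stage_wait_sketch():
    # Hypothetical helper, meant to be called from inside a GPU kernel
    # running on an NVIDIA device.

    # Stage 0: issue this stage's asynchronous copies here (omitted),
    # then close them into one cp.async-group.
    async_copy_commit_group()

    # Stage 1: issue the next stage's copies (omitted) and commit again.
    async_copy_commit_group()

    # Assumption: as with PTX cp.async.wait_group 1, this returns once at
    # most one committed group (stage 1) is still in flight, so stage 0's
    # data has arrived while stage 1 may still be loading.
    async_copy_wait_group(1)

    # A block-level barrier is typically still needed before other threads
    # read the shared-memory data written by stage 0 (omitted here).
```

Passing 1 rather than waiting for every group is what lets the later stage's transfer overlap with computation on the earlier stage's data.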