IMPORTANT: To view this page as Markdown, append `.md` to the URL (e.g. /docs/manual/basics.md). For the complete Mojo documentation index, see llms.txt.
Version: Nightly

DeviceGraph

struct DeviceGraph

Represents an instantiated device graph that can be replayed.

A DeviceGraph captures a sequence of GPU operations (such as kernel launches) as a reusable graph. Once instantiated from a DeviceGraphBuilder, the graph can be replayed any number of times with lower overhead than re-enqueueing each operation individually.

To obtain a DeviceGraph, use DeviceGraphBuilder.instantiate().

Implemented traits

AnyType, Copyable, ImplicitlyCopyable, ImplicitlyDestructible, Movable

Methods

__init__

__init__(out self, *, copy: Self)

Creates a copy of an existing device graph by incrementing its reference count.

Args:

  • copy (Self): The device graph to copy.

__del__

__del__(deinit self)

Releases resources associated with this device graph.

replay

replay(self)

Replays the captured sequence of GPU operations.

Submits the pre-captured sequence of operations for execution on the device. This is more efficient than re-enqueueing each operation individually because the graph has already been compiled and instantiated by the driver.

Example:

from std.gpu.host import DeviceContext

def kernel():
    print("replaying")

with DeviceContext() as ctx:
    var compiled_fn = ctx.compile_function[kernel, kernel]()
    var builder = ctx.create_graph_builder()
    builder.add_function(compiled_fn, grid_dim=1, block_dim=1)
    var graph = builder^.instantiate()
    graph.replay()
    graph.replay()  # replay as many times as needed
    ctx.synchronize()

Raises:

If replay fails.
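
Because replay can raise, a caller may want to handle the failure instead of letting it propagate. The following is a minimal sketch that reuses only the calls from the example above. It assumes the error surfaces through Mojo's standard try/except mechanism and that the caught error can be printed. The `main` wrapper and the message text are illustrative and not part of this API.

```mojo
from std.gpu.host import DeviceContext


def kernel():
    print("replaying")


# Assumption: main is marked `raises` because context creation,
# compilation, and graph construction may also raise.
def main() raises:
    with DeviceContext() as ctx:
        var compiled_fn = ctx.compile_function[kernel, kernel]()
        var builder = ctx.create_graph_builder()
        builder.add_function(compiled_fn, grid_dim=1, block_dim=1)
        var graph = builder^.instantiate()

        try:
            # Submit the captured operations, then wait for completion.
            graph.replay()
            ctx.synchronize()
        except e:
            # Illustrative handling only: report the error and continue.
            print("replay failed:", e)
```

Wrapping only the replay and synchronize calls keeps a replay failure separate from errors raised while the graph is being built.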