Structural Equality and Hashing#
TVM FFI provides structural_equal and structural_hash for the
object graph. These compare objects by content — recursively walking
fields — rather than by pointer identity.
The behavior is controlled by two layers of annotation on
py_class():
Type-level
structural_eq=— what role does this type play in the IR graph?Field-level
structural_eq=onfield()— should this field be skipped, or does it introduce new variable bindings?
This document explains what each annotation means, when to use it, and how they compose.
Type-Level Annotation#
The structural_eq parameter on @py_class declares how instances of the
type participate in structural equality and hashing:
@py_class(structural_eq="tree")
class Expr(Object):
...
Quick reference#
|
Meaning |
Use when… |
|---|---|---|
|
A regular IR node |
Default for most IR nodes |
|
An immutable value node (with pointer shortcut) |
The type has no transitive |
|
A node in a dataflow graph |
Pointer sharing is semantically meaningful |
|
A bound variable |
The type represents a variable binding |
|
A singleton |
Exactly one instance per logical identity (e.g. registry entries) |
|
Not comparable |
The type should never be compared structurally |
"tree" — The Default#
@py_class(structural_eq="tree")
class Add(Object):
lhs: Expr
rhs: Expr
Meaning: “This node is defined by its fields. Two nodes are equal if and only if all their fields are recursively equal.”
This is the right choice for the vast majority of IR nodes: expressions, statements, types, attributes, buffers, etc.
Example.
1 + 2 vs 1 + 2 → Equal
1 + 2 vs 1 + 3 → Not equal (rhs differs)
"const-tree" — Tree with a Fast Path#
@py_class(structural_eq="const-tree")
class DeviceMesh(Object):
shape: list[int]
device_ids: list[int]
Meaning: “Same as "tree", but if two references point to the same
object, they are guaranteed equal — skip the field comparison.”
This is purely a performance optimization. The only behavioral difference
from "tree" is that pointer identity short-circuits to True.
When is this safe?#
When the type satisfies two conditions:
Immutable — content doesn’t change after construction, so same-pointer always implies same-content.
No transitive
"var"children — skipping field traversal won’t cause variable mappings to be missed (see "var" — Bound Variables for why this matters).
Why not use it everywhere?#
Most IR nodes are immutable, but many transitively contain variables
(e.g., x + 1 contains the "var" node x). If the pointer
shortcut fires, the traversal skips x, and a variable mapping that should
have been established is silently missed.
The following diagram shows the danger. Suppose the + node were
incorrectly annotated as "const-tree". When comparing two trees that
share a sub-expression, the pointer shortcut fires on the shared node, and
the "var" x inside it is never visited — so no x ↔ y mapping
is recorded:
graph TD
subgraph "lhs"
LT["(_, _)"]
LE["x + 1"]
LX["x : var"]
LT -->|".0"| LE
LT -->|".1"| LX
LE -->|".lhs"| LX2["x"]
end
subgraph "rhs"
RT["(_, _)"]
RE["y + 1"]
RY["y : var"]
RT -->|".0"| RE
RT -->|".1"| RY
RE -->|".lhs"| RY2["y"]
end
LE -. "const-tree would skip here<br/>(misses x ↔ y mapping)" .-> RE
LX -. "Later comparison fails:<br/>x has no recorded mapping" .-> RY
style LE fill:#fff3cd
style RE fill:#fff3cd
style LX fill:#f8d7da
style RY fill:#f8d7da
style LX2 fill:#f8d7da
style RY2 fill:#f8d7da
"dag" — Sharing-Aware Comparison#
@py_class(structural_eq="dag")
class Binding(Object):
var: Var
value: Expr
Meaning: “This node lives in a graph where pointer sharing is semantically meaningful. Two graphs are equal only if they have the same content and the same sharing structure.”
Why it exists#
In dataflow IR, sharing matters. Consider:
# Program A: shared — compute once, use twice
let s = x + 1 in (s, s)
# Program B: independent — compute twice
(x + 1, x + 1)
Program A computes x + 1 once and references it twice; Program B
computes it independently twice. Under "tree" these are equal;
under "dag" they are not:
graph TD
subgraph "Program A — DAG"
TA["(_, _)"]
SA["s = x + 1"]
TA -->|".0"| SA
TA -->|".1"| SA
end
subgraph "Program B — Tree"
TB["(_, _)"]
A1["x + 1"]
A2["x + 1"]
TB -->|".0"| A1
TB -->|".1"| A2
end
SA -. "NOT EQUAL under dag<br/>(sharing structure differs)" .-> A1
style SA fill:#d4edda
style A1 fill:#d4edda
style A2 fill:#f8d7da
How "dag" detects sharing#
"dag" maintains a bijective (one-to-one) mapping between objects that
have been successfully compared. When the same object appears again, it
checks whether the pairing is consistent:
Comparing Program A vs Program B:
.0: s ↔ (x+1)₁ → content equal, record pairing: s ↔ (x+1)₁
.1: s ↔ (x+1)₂ → s already paired with (x+1)₁, not (x+1)₂
→ NOT EQUAL
The mapping is bijective: if a is paired with b, no other object
can pair with either a or b. This prevents false positives in both
directions.
Example of the reverse direction.
lhs: (a, b) rhs: (a, a) where a ≅ b (same content)
.0: a₁ ↔ a₂ → equal, record a₁ ↔ a₂
.1: b₁ ↔ a₂ → b₁ is new, but a₂ already paired with a₁
→ NOT EQUAL
Without the reverse check, the second comparison would proceed to content
comparison, find b₁ ≅ a₂, and incorrectly succeed.
Full comparison: "tree" vs "dag"#
Scenario |
|
|
|---|---|---|
both trees with same content |
Equal |
Equal |
both DAGs, same sharing shape |
Equal |
Equal |
|
Equal |
Not equal |
|
Equal |
Not equal |
"var" — Bound Variables#
@py_class(structural_eq="var")
class Var(Object):
name: str
Meaning: “This is a variable. Two variables are equal if they are bound in corresponding positions, not if they have the same name or content.”
The problem#
fun x → x + 1 should equal fun y → y + 1
Variables are not defined by their content (name, type annotation). They
are defined by where they are introduced and how they are used.
x and y above are interchangeable because they occupy the same
binding position and are used in the same way.
How it works: definition regions#
"var" works together with field(structural_eq="def") (see
Field-Level Annotations). A field marked structural_eq="def" is a
definition region — it’s where new variable bindings are introduced.
Inside a definition region: encountering two different variables establishes a correspondence (“treat
xas equivalent toy”).Outside a definition region: variables are only equal if a prior correspondence already exists, or they are the same pointer.
The following diagram traces the comparison of two alpha-equivalent functions:
sequenceDiagram
participant C as Comparator
participant L as lhs: fun x → x + 1
participant R as rhs: fun y → y + 1
Note over C: Field "params" has structural_eq="def"
C->>L: get params → [x]
C->>R: get params → [y]
Note over C: Enter definition region
C->>C: Compare x ↔ y: both are Vars
Note over C: Record mapping: x ↔ y
Note over C: Exit definition region
Note over C: Field "body" — normal region
C->>L: get body → x + 1
C->>R: get body → y + 1
C->>C: Compare + fields...
C->>C: x ↔ y: lookup finds x→y ✓
C->>C: 1 ↔ 1: equal ✓
Note over C: Result: EQUAL ✓
Without a definition region, the same variables would not be equal:
# Bare expressions, no enclosing function:
x + 1 vs y + 1 → NOT EQUAL (no definition region, different pointers)
Full comparison: with and without definition regions#
Scenario |
With |
Without |
|---|---|---|
|
Equal |
n/a |
|
Not equal (body uses |
n/a |
|
Equal (x↔a, y↔b) |
n/a |
|
Not equal (x↔a but body uses |
n/a |
|
n/a |
Not equal |
|
n/a |
Equal |
Inconsistent variable usage#
The bijective mapping catches inconsistencies. Consider:
fun (x, y) → x + x vs fun (a, b) → a + b
sequenceDiagram
participant C as Comparator
participant L as lhs: fun (x, y) → x + x
participant R as rhs: fun (a, b) → a + b
Note over C: Definition region (params)
C->>C: x ↔ a → record x↔a ✓
C->>C: y ↔ b → record y↔b ✓
Note over C: Body: x + x vs a + b
C->>C: x ↔ a → lookup x→a, matches ✓
C->>C: x ↔ b → lookup x→a, but rhs is b ≠ a → FAIL ✗
Note over C: Result: NOT EQUAL ✓
The map_free_vars flag#
structural_equal(lhs, rhs, map_free_vars=True) starts the comparison
in “definition region” mode. This is useful for comparing standalone
expressions where you want alpha-equivalence at the top level without an
enclosing function:
# With map_free_vars=True:
structural_equal(x + 1, y + 1, map_free_vars=True) # → True
# With map_free_vars=False (default):
structural_equal(x + 1, y + 1) # → False
"singleton" — Singletons#
@py_class(structural_eq="singleton")
class Op(Object):
name: str
Meaning: “There is exactly one instance of this object per logical identity. Pointer equality is the only valid comparison.”
No content comparison is ever performed. Different pointers are always unequal; same pointer is always equal.
op_conv = Op.get("nn.conv2d")
op_relu = Op.get("nn.relu")
structural_equal(op_conv, op_conv) # → True (same pointer)
structural_equal(op_conv, op_relu) # → False (different pointers)
Field-Level Annotations#
The structural_eq parameter on field() controls
how structural equality/hashing treats that specific field.
structural_eq="ignore" — Exclude a field#
@py_class(structural_eq="tree")
class MyNode(Object):
value: int
span: str = field(structural_eq="ignore")
Meaning: “This field is not part of the node’s structural identity. Skip it during comparison and hashing.”
Use for:
Source locations (
span) — where the node came from in source code doesn’t affect what it means.Cached/derived values — computed from other fields, would be redundant to compare.
Debug annotations — names, comments, metadata for human consumption.
structural_eq="def" — Definition region#
@py_class(structural_eq="tree")
class Lambda(Object):
params: list[Var] = field(structural_eq="def")
body: Expr
Meaning: “This field introduces new variable bindings. When comparing or hashing this field, allow new variable correspondences to be established.”
This is the counterpart to "var". A "var" type says “I am a
variable”; structural_eq="def" says “this field is where variables are
defined.” Together they enable alpha-equivalence: comparing functions up
to consistent variable renaming.
Use for:
Function parameter lists
Let-binding left-hand sides
Any field that introduces names into scope
Custom Equality and Hashing#
For types where the default field-by-field traversal is insufficient, you can register custom callbacks as type attributes:
``__s_equal__`` — custom equality logic
``__s_hash__`` — custom hashing logic
These are registered per type via EnsureTypeAttrColumn. When present,
they replace the default field iteration. The system still manages all
kind-specific logic ("dag" memoization, "var" mapping, etc.) — the
custom callback only controls which sub-values to compare/hash and in what
order.
All Kinds at a Glance#
The following diagram visualizes the five comparable kinds, arranged by how much structural information they track:
graph LR
UI["singleton<br/><i>pointer only</i>"]
TN["tree<br/><i>content only</i>"]
CTN["const-tree<br/><i>content + pointer shortcut</i>"]
DN["dag<br/><i>content + sharing</i>"]
FV["var<br/><i>content + binding position</i>"]
UI --- TN
TN --- CTN
TN --- DN
TN --- FV
style UI fill:#e2e3e5
style TN fill:#d4edda
style CTN fill:#d4edda
style DN fill:#cce5ff
style FV fill:#fff3cd
Content comparison |
Pointer shortcut |
Tracks sharing |
Tracks binding position |
|
|---|---|---|---|---|
|
No |
Yes (only) |
No |
No |
|
Yes |
No |
No |
No |
|
Yes |
Yes (fast path) |
No |
No |
|
Yes |
No |
Yes |
No |
|
Yes |
No |
No |
Yes |
Decision Guide#
When defining a new type:
graph TD
Start["New @py_class type"] --> Q1{"Singleton?<br/>(one instance per<br/>logical identity)"}
Q1 -->|Yes| UI["structural_eq="singleton""]
Q1 -->|No| Q2{"Represents a<br/>variable binding?"}
Q2 -->|Yes| FV["structural_eq="var""]
Q2 -->|No| Q3{"Pointer sharing<br/>semantically<br/>meaningful?"}
Q3 -->|Yes| DN["structural_eq="dag""]
Q3 -->|No| Q4{"Immutable AND<br/>no transitive<br/>var children?"}
Q4 -->|Yes| CTN["structural_eq="const-tree""]
Q4 -->|No| TN["structural_eq="tree""]
style UI fill:#e2e3e5
style FV fill:#fff3cd
style DN fill:#cce5ff
style CTN fill:#d4edda
style TN fill:#d4edda
For fields:
graph TD
Start["field() parameter"] --> Q1{"Irrelevant to<br/>structural identity?<br/>(span, cache, debug)"}
Q1 -->|Yes| IGN["structural_eq="ignore""]
Q1 -->|No| Q2{"Introduces new<br/>variable bindings?"}
Q2 -->|Yes| DEF["structural_eq="def""]
Q2 -->|No| NONE["No flag needed"]
style IGN fill:#f8d7da
style DEF fill:#fff3cd
style NONE fill:#d4edda
Worked Example#
Putting it all together for a function node with parameters, body, and source location:
@py_class(structural_eq="tree")
class Lambda(Object):
params: list[Var] = field(structural_eq="def")
body: Expr
span: str = field(structural_eq="ignore", default="")
@py_class(structural_eq="var")
class Var(Object):
name: str
@py_class(structural_eq="singleton")
class Op(Object):
name: str
With these annotations, alpha-equivalent functions are structurally equal:
# These two are structurally equal:
fun [x] → x + 1 (span="a.py:1")
fun [y] → y + 1 (span="b.py:5")
# - params has structural_eq="def" → x maps to y
# - body uses that mapping → (x + 1) ≅ (y + 1)
# - span has structural_eq="ignore" → locations don't matter
And in Python:
from tvm_ffi import structural_equal, structural_hash
x, y = Var("x"), Var("y")
f1 = Lambda([x], x + 1, span="a.py:1")
f2 = Lambda([y], y + 1, span="b.py:5")
assert structural_equal(f1, f2) # alpha-equivalent
assert structural_hash(f1) == structural_hash(f2) # same hash