April 10, 2026 · Marvin Amador · 8 min read

Inside graph collusion

What a 3-hop neighborhood actually tells you about a fraud ring — and how we turn it into a single number an adjuster can trust.

Tags: fraud, graph, engineering


Dana runs SIU at a mid-market auto carrier in the Midwest. On a Tuesday in March she opens CLM-8817, a rear-end collision filed overnight. It looks clean: two priors in 24 months, no SIU flags, body shop on the approved list. She is about to route it as low-risk and move on. (Names changed. The call happened.)

Then she clicks one tab further and sees the attorney who filed the claim also represented someone in a ring we tagged last quarter. Two hops through the graph. Nothing in the claim itself said so.

Most fraud models score a claim as if it lives alone — a row of features, one function, one number out. That works until you hit organized fraud, where the signal isn't in any single claim but in the pattern between them.

We treat a claim as a point in a graph.

The entity graph

Every claim pulls a constellation of entities with it: the claimant, the policy, the documents filed, the evidence atoms extracted from those documents. Shared entities draw edges. Shared entities across multiple claims form rings.

[Figure legend: Claim · Policy · Claimant · Document · Evidence]

A single claim (CLM-8817) with its immediate 1-hop neighborhood: policy, claimant, two documents, two evidence atoms.

This is the shape every claim exits intake with. Nothing fancy yet — a claim, its identity, and the artifacts it produced.
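To make "shared entities draw edges" concrete, here is a minimal sketch of one way to derive claim-to-claim links from an entity listing. The function name, the `(kind, id)` tuple shape, and the sample IDs are all hypothetical, chosen for illustration rather than taken from our production schema.

```python
from collections import defaultdict
from itertools import combinations


def shared_entity_edges(claim_entities):
    """Map each pair of claims to the set of entities they share.

    claim_entities: dict of claim id -> set of (kind, entity_id) tuples.
    The tuple shape is an assumption made for this sketch.
    """
    # Invert the mapping: entity -> claims that reference it.
    claims_by_entity = defaultdict(set)
    for claim_id, entities in claim_entities.items():
        for entity in entities:
            claims_by_entity[entity].add(claim_id)

    # Any entity referenced by two or more claims links each pair of them.
    edges = defaultdict(set)
    for entity, claims in claims_by_entity.items():
        for a, b in combinations(sorted(claims), 2):
            edges[(a, b)].add(entity)
    return dict(edges)


# Hypothetical sample data, not real claims.
sample = {
    "CLM-A": {("claimant", "C1"), ("policy", "P1")},
    "CLM-B": {("claimant", "C1"), ("policy", "P2")},
    "CLM-C": {("policy", "P9")},
}
print(shared_entity_edges(sample))
```

Only CLM-A and CLM-B end up connected, through the shared claimant. CLM-C shares nothing and stays isolated.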

Edge weights

Two claims share context. How much does that matter? We score shared entities additively, capped at 1.0:

Shared between two claims	Edge weight
Policy	0.6
Claimant	0.8
Policy and claimant	1.0 (cap)

Two overlays can raise an existing edge: same state adds 0.15, and loss dates within 7 days of each other add 0.20. Overlays nudge an edge that already exists; they never create one on their own. A shared state with nothing else shared is just geography.
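The scoring rule above fits in a few lines. This is a sketch under stated assumptions, not our production code: the function and parameter names are hypothetical, the 1.0 cap is assumed to apply after the overlays as well, and "within 7 days" is treated as inclusive.

```python
def edge_weight(shared_policy, shared_claimant, same_state=False, loss_days_apart=None):
    """Weight of the edge between two claims, in [0.0, 1.0]."""
    # Base weight: additive over shared entities.
    base = (0.6 if shared_policy else 0.0) + (0.8 if shared_claimant else 0.0)
    if base == 0.0:
        # Overlays never create an edge on their own.
        return 0.0

    overlay = 0.0
    if same_state:
        overlay += 0.15
    if loss_days_apart is not None and loss_days_apart <= 7:  # inclusive: assumption
        overlay += 0.20

    # Assumption: the cap applies to the total, overlays included.
    return min(1.0, base + overlay)


print(edge_weight(True, False))                    # policy only
print(edge_weight(True, True))                     # policy and claimant, capped
print(edge_weight(False, False, same_state=True))  # geography alone: no edge
```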

Traversal: the 3-hop neighborhood

With edges in hand, we ask a more interesting question: who is near this claim?

The default radius is three hops. Beyond that, the graph is the whole world — every claim is a few bridges from every other. Three hops is enough to catch a ring and short enough to keep the signal tight.
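A bounded breadth-first search is one straightforward way to collect a neighborhood like this. The sketch below assumes an undirected adjacency map in which entity nodes (claimants, attorneys, policies) count as hops. Apart from CLM-8817, J. Rivera, and CLM-8201, the node IDs are made up for illustration.

```python
from collections import deque


def neighborhood(adjacency, focal, max_hops=3):
    """Return {node: hop distance} for every node within max_hops of focal."""
    hops = {focal: 0}
    queue = deque([focal])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the radius
        for neighbor in adjacency.get(node, ()):
            if neighbor not in hops:  # first visit is the shortest path in BFS
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


# Hypothetical toy graph; edges are listed in both directions.
graph = {
    "CLM-8817": ["J. Rivera", "POL-1"],
    "J. Rivera": ["CLM-8817", "CLM-8201"],
    "POL-1": ["CLM-8817"],
    "CLM-8201": ["J. Rivera", "ATT-7"],
    "ATT-7": ["CLM-8201", "CLM-9000"],
    "CLM-9000": ["ATT-7"],
}
print(neighborhood(graph, "CLM-8817"))
```

In this toy graph CLM-9000 sits four hops out, so it never enters the result.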

Influence decays exponentially by hop:

Hop decay

w(hop) = 0.3^hop

A neighbor's contribution to the focal score shrinks by 70% per hop.
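As a quick check on how fast that falls off, here is the decay evaluated at each hop inside the default radius. The constant and function names are ours for this sketch.

```python
HOP_DECAY = 0.3  # the decay base from the formula above


def hop_weight(hop):
    """Multiplier applied to a neighbor's contribution at a given hop distance."""
    return HOP_DECAY ** hop


for hop in range(4):
    print(hop, round(hop_weight(hop), 4))
# 0 1.0
# 1 0.3
# 2 0.09
# 3 0.027
```

By the third hop a neighbor carries under 3% of the weight it would have as the focal claim itself.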

Why 0.3 and not 0.25 or 0.4? Honestly, it's a calibration artifact from an early backtest — someone picked a number, it scored well on a held-out set of confirmed rings, and nobody has had a good reason to move it since. We'll probably revisit it the next time precision drops.

[Figure legend: Claim · Claimant · Policy · Entity]

Focal claim CLM-8817 with its 3-hop neighborhood. The red dashed ring marks a claim the system already flagged as fraudulent — reachable in 2 hops through a shared claimant.

Two things jump out visually once the neighborhood is drawn:

1. Density. CLM-8817 is adjacent to a body shop that ties to three other claims. Regional concentration? Maybe. Or a coordinating hub.
2. A flagged neighbor two hops away, reached via shared claimant and shared attorney. That's the signal that would never surface in a row-wise model.

Signal fusion

Graph density is one signal. It never stands alone. The fraud score you see on an adjuster's screen is a weighted fuse of four components:

Signal components

Component	Score	Weight
Graph collusion	0.78	0.35
Tabular risk	0.45	0.25
Multimodal evidence	0.62	0.20
Adversarial stress	0.55	0.20

Fused fraud score (weighted fuse): 0.62

Each component has its own story. The graph score is high here: the neighborhood density plus the flagged 2-hop neighbor push it to 0.78. Tabular risk is modest — the claimant has two priors in 24 months, no SIU referrals. Multimodal evidence picked up a small anomaly in the estimate PDF. Adversarial stress flagged the loss narrative as coherent but not airtight.

Fused, the final lands at 0.62.
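The arithmetic behind that number is a plain weighted sum. A minimal sketch using the four scores and weights listed above follows. The dictionary keys and the weights-sum-to-one check are our own additions for illustration.

```python
# (score, weight) per component, as listed for CLM-8817.
COMPONENTS = {
    "graph_collusion": (0.78, 0.35),
    "tabular_risk": (0.45, 0.25),
    "multimodal_evidence": (0.62, 0.20),
    "adversarial_stress": (0.55, 0.20),
}


def fuse(components):
    """Weighted sum of component scores; weights are expected to total 1.0."""
    total_weight = sum(weight for _, weight in components.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    return sum(score * weight for score, weight in components.values())


fused = fuse(COMPONENTS)
print(round(fused, 2))  # 0.273 + 0.1125 + 0.124 + 0.11 = 0.6195, shown as 0.62
```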

Band and governance

We normalize the fused score to 0–100 and bucket it into three bands. Nothing auto-denies. Everything in the medium band or higher routes to a human with the full trace attached.

CLM-8817 · fraud score: 72 (HIGH)

Bands on the 0–100 scale: LOW ≤ 40 · MEDIUM 40–70 · HIGH ≥ 70
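The bucketing step can be sketched as a small lookup. It takes a score already on the 0–100 scale; the normalization itself is not modeled here. The function name is hypothetical, and the boundary handling (exactly 40 is LOW, exactly 70 is HIGH) is our reading of the thresholds above.

```python
def band(score_0_100):
    """Map a normalized 0-100 fraud score to LOW / MEDIUM / HIGH."""
    if not 0 <= score_0_100 <= 100:
        raise ValueError("score must be within 0-100")
    if score_0_100 >= 70:
        return "HIGH"
    if score_0_100 > 40:  # assumption: exactly 40 stays in LOW
        return "MEDIUM"
    return "LOW"


print(band(72))  # HIGH
print(band(55))  # MEDIUM
print(band(40))  # LOW
```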

Watching it happen live

The whole traversal streams over SSE. Every node the agent visits, every edge it traverses, every tool call it makes — there's an event for it.

Live trace · CLM-8817

GET /api/graph/claims

GraphNodeVisited · 0ms
{ "node_id": "CLM-8817", "type": "claim", "reason": "focal" }

ToolCallStarted · 12ms
{ "tool": "get_claim_risk_snapshot", "args": { "claim_id": "CLM-8817" } }

GraphEdgeTraversed · 38ms
{ "from": "CLM-8817", "to": "J. Rivera", "type": "CLAIMANT" }

GraphNodeVisited · 41ms
{ "node_id": "CLM-8201", "type": "claim", "hop": 2, "flagged": true }

ToolCallStarted · 58ms
{ "tool": "get_claim_graph_neighborhood", "args": { "max_hops": 3 } }

GraphNodeVisited · 140ms
{ "node_id": "fraud_signal", "score": 0.78, "edges": 10 }

For Dana's UI, we replay this stream into a live subgraph that expands as the agent thinks. For auditors, it's the record that the decision was reasoned, not improvised.
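For readers who want to consume a stream like this themselves, here is a minimal parser sketch. It assumes standard SSE framing: events are separated by a blank line, the event type sits in an `event:` field, and the JSON payload sits in a `data:` field. The trace above does not show the wire format, so that field layout is an assumption, and the parser ignores `id:` and `retry:` fields.

```python
import json


def parse_sse(stream_text):
    """Parse SSE text into a list of (event_name, payload_dict) tuples."""
    events = []
    for block in stream_text.strip().split("\n\n"):
        name, data_lines = "message", []  # "message" is the SSE default type
        for line in block.splitlines():
            if not line or line.startswith(":"):
                continue  # skip blanks and comment lines
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # SSE strips one leading space
            if field == "event":
                name = value
            elif field == "data":
                data_lines.append(value)
        if data_lines:
            events.append((name, json.loads("\n".join(data_lines))))
    return events


# Sample built from two of the trace events above; the framing is assumed.
SAMPLE = (
    'event: GraphNodeVisited\n'
    'data: {"node_id": "CLM-8817", "type": "claim", "reason": "focal"}\n'
    '\n'
    'event: GraphEdgeTraversed\n'
    'data: {"from": "CLM-8817", "to": "J. Rivera", "type": "CLAIMANT"}\n'
    '\n'
)
for name, payload in parse_sse(SAMPLE):
    print(name, payload)
```

A UI can feed each `(name, payload)` pair into its own graph state, adding a node on `GraphNodeVisited` and an edge on `GraphEdgeTraversed`.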

Feedback propagation

When an investigator marks a claim as confirmed fraud — or confirmed clean — that decision diffuses through the graph as a seed:

Seed propagation

s'(n) = s(n) + Σ_{k ∈ seeds} sign(k) · sim(n, k) · 0.3^{hop(n, k)}

Positive seeds raise scores on neighbors, negative seeds lower them; the effect decays with hop distance and similarity weight.
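Here is the update rule as a small function, to show how seeds of opposite sign combine. It is a sketch under stated assumptions: `sign` is +1 for confirmed fraud and -1 for confirmed clean, each seed is passed as a `(sign, similarity, hop)` tuple, and the numbers in the example are illustrative rather than taken from a real claim. The formula states no clamping, so the sketch applies none.

```python
HOP_DECAY = 0.3  # same decay base as the traversal


def propagate(base_score, seeds):
    """Apply s'(n) = s(n) + sum of sign * sim * 0.3**hop over all seeds.

    seeds: iterable of (sign, similarity, hop) tuples, where sign is +1 for
    confirmed fraud and -1 for confirmed clean (tuple layout is an assumption).
    """
    delta = sum(sign * sim * HOP_DECAY ** hop for sign, sim, hop in seeds)
    return base_score + delta


# Illustrative numbers only: a confirmed-fraud seed one hop away (sim 0.8)
# and a confirmed-clean seed two hops away (sim 1.0).
updated = propagate(0.50, [(+1, 0.8, 1), (-1, 1.0, 2)])
print(round(updated, 2))  # 0.50 + 0.24 - 0.09 = 0.65
```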

Two weeks after Dana confirmed CLM-8201 as fraud, a new claim came in from the same body shop. The graph already knew. Base graph score: 0.41 on the features alone. After seed propagation from its 2-hop neighbor: 0.68. Medium band, straight to SIU, no retraining in between.

Confirmed fraud pulls its neighborhood up. Confirmed-clean pulls it down. Next week's scoring inherits today's decisions without retraining a model.

If you run fraud at a carrier or TPA and want to see the graph view in action, we'll walk you through a live claim.
