Inside graph collusion
What a 3-hop neighborhood actually tells you about a fraud ring — and how we turn it into a single number an adjuster can trust.
Dana runs SIU at a mid-market auto carrier in the Midwest. On a Tuesday in March she opens CLM-8817, a rear-end collision filed overnight. It looks clean: two priors in 24 months, no SIU flags, body shop on the approved list. She is about to route it as low-risk and move on. (Names changed. The call happened.)
Then she clicks one tab further and sees the attorney who filed the claim also represented someone in a ring we tagged last quarter. Two hops through the graph. Nothing in the claim itself said so.
Most fraud models score a claim as if it lives alone — a row of features, one function, one number out. That works until you hit organized fraud, where the signal isn't in any single claim but in the pattern between them.
We treat a claim as a point in a graph.
The entity graph
Every claim pulls a constellation of entities with it: the claimant, the policy, the documents filed, the evidence atoms extracted from those documents. Shared entities draw edges. Shared entities across multiple claims form rings.
This is the shape every claim exits intake with. Nothing fancy yet — a claim, its identity, and the artifacts it produced.
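A minimal Python sketch of that shape, not the production schema. The field names, the `Claim` class, and the two helper functions are assumptions for illustration. It only indexes claimant and policy, since those are the two shared entities with stated edge weights.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Claim:
    # Hypothetical field names; the source only names the entity kinds.
    claim_id: str
    claimant: str
    policy: str
    documents: list = field(default_factory=list)
    evidence_atoms: list = field(default_factory=list)


def index_by_entity(claims):
    """Map each (entity_type, value) pair to the set of claim ids that reference it."""
    index = defaultdict(set)
    for claim in claims:
        index[("claimant", claim.claimant)].add(claim.claim_id)
        index[("policy", claim.policy)].add(claim.claim_id)
    return dict(index)


def shared_entities(index):
    """Keep only entities referenced by more than one claim; these are the ones that draw edges."""
    return {entity: ids for entity, ids in index.items() if len(ids) > 1}
```

Feeding it two claims with the same claimant and different policies leaves a single shared entity, which is the one edge between them.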
Edge weights
Two claims share context. How much does that matter? We score shared entities additively, capped at 1.0:
| Shared between two claims | Edge weight |
|---|---|
| Policy | 0.6 |
| Claimant | 0.8 |
| Policy and claimant | 1.0 (cap) |
Same state adds 0.15; loss dates within 7 days of each other add 0.20. Those are overlays: they nudge an existing edge but don't create one on their own. A shared state with nothing else shared is just geography.
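The scoring rule above is small enough to sketch in a few lines of Python. The function name and argument names are hypothetical. Two points are assumptions the text does not settle: the 1.0 cap is applied after the overlays, and "within 7 days" is treated as inclusive.

```python
BASE_WEIGHTS = {"policy": 0.6, "claimant": 0.8}
SAME_STATE_BONUS = 0.15
LOSS_DATE_BONUS = 0.20
LOSS_DATE_WINDOW_DAYS = 7
CAP = 1.0


def edge_weight(shared, same_state=False, loss_date_gap_days=None):
    """Score the edge between two claims from the set of entity types they share."""
    base = sum(BASE_WEIGHTS[kind] for kind in shared if kind in BASE_WEIGHTS)
    if base == 0:
        # Overlays never create an edge on their own.
        return 0.0
    weight = base
    if same_state:
        weight += SAME_STATE_BONUS
    if loss_date_gap_days is not None and abs(loss_date_gap_days) <= LOSS_DATE_WINDOW_DAYS:
        weight += LOSS_DATE_BONUS
    # Assumption: the cap applies to the total, overlays included.
    return min(weight, CAP)
```

A shared policy in the same state scores 0.6 + 0.15; a shared state with nothing else scores zero, so no edge is drawn.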
Traversal: the 3-hop neighborhood
With edges in hand, we ask a more interesting question: who is near this claim?
The default radius is three hops. Beyond that, the graph is the whole world — every claim is a few bridges from every other. Three hops is enough to catch a ring and short enough to keep the signal tight.
Influence decays exponentially by hop:
Hop decay
$$w(\mathrm{hop}) = 0.3^{\mathrm{hop}}$$
Why 0.3 and not 0.25 or 0.4? Honestly, it's a calibration artifact from an early backtest — someone picked a number, it scored well on a held-out set of confirmed rings, and nobody has had a good reason to move it since. We'll probably revisit it the next time precision drops.
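For readers who want the traversal spelled out, here is a sketch using a plain breadth-first search over an adjacency map. The graph representation and function names are assumptions, and the node labels in the usage note are made up.

```python
from collections import deque

HOP_DECAY = 0.3
MAX_HOPS = 3


def neighborhood(adjacency, focal, max_hops=MAX_HOPS):
    """Return {node: hop} for every node within max_hops of the focal node."""
    hops = {focal: 0}
    queue = deque([focal])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the radius
        for neighbor in adjacency.get(node, ()):
            if neighbor not in hops:
                # BFS guarantees the first visit is the shortest hop count.
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


def hop_weight(hop):
    """Influence of a node at the given hop distance: 0.3 ** hop."""
    return HOP_DECAY ** hop
```

On a chain `a - b - c - d - e`, a search from `a` returns `b`, `c`, and `d` at hops 1, 2, and 3 and never reaches `e`. The weights fall from 0.3 to 0.09 to 0.027, which is why a fourth hop would contribute almost nothing.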
Two things jump out visually once the neighborhood is drawn:
- Density. CLM-8817 is adjacent to a body shop that ties to three other claims. Regional concentration? Maybe. Or a coordinating hub.
- A flagged neighbor two hops away, reached via shared claimant and shared attorney. That's the signal that would never surface in a row-wise model.
Signal fusion
Graph density is one signal. It never stands alone. The fraud score you see on an adjuster's screen is a weighted fuse of four components:
Signal components
- Graph collusion: 0.78 × 0.35
- Tabular risk: 0.45 × 0.25
- Multimodal evidence: 0.62 × 0.20
- Adversarial stress: 0.55 × 0.20
Fused fraud score (weighted fuse): 0.62
Each component has its own story. The graph score is high here: the neighborhood density plus the flagged 2-hop neighbor push it to 0.78. Tabular risk is modest — the claimant has two priors in 24 months, no SIU referrals. Multimodal evidence picked up a small anomaly in the estimate PDF. Adversarial stress flagged the loss narrative as coherent but not airtight.
Fused, the final lands at 0.62.
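The arithmetic behind that number is a plain weighted sum: 0.273 + 0.1125 + 0.124 + 0.11 = 0.6195, which rounds to 0.62. A short sketch, with hypothetical names and an added check that the weights sum to 1 (the check is an assumption, not something the text requires):

```python
# (score, weight) per component, taken from the CLM-8817 example above.
COMPONENTS = {
    "graph_collusion": (0.78, 0.35),
    "tabular_risk": (0.45, 0.25),
    "multimodal_evidence": (0.62, 0.20),
    "adversarial_stress": (0.55, 0.20),
}


def fuse(components):
    """Weighted sum of component scores; rejects weights that do not sum to 1."""
    total_weight = sum(weight for _, weight in components.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    return sum(score * weight for score, weight in components.values())


fused = fuse(COMPONENTS)
print(round(fused, 2))  # 0.62
```

Because the graph component carries the largest weight, a high graph score moves the fused number more than an equally high score on any other component.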
Band and governance
We normalize the fused score to 0–100 and bucket into three bands. Nothing auto-denies. Everything at medium or above routes to a human with the full trace attached.
CLM-8817 · fraud score
Watching it happen live
The whole traversal streams over SSE. Every node the agent visits, every edge it traverses, every tool call it makes — there's an event for it.
Live trace · CLM-8817
GET /api/graph/claims
- GraphNodeVisited (0ms): { "node_id": "CLM-8817", "type": "claim", "reason": "focal" }
- ToolCallStarted (12ms): { "tool": "get_claim_risk_snapshot", "args": { "claim_id": "CLM-8817" } }
- GraphEdgeTraversed (38ms): { "from": "CLM-8817", "to": "J. Rivera", "type": "CLAIMANT" }
- GraphNodeVisited (41ms): { "node_id": "CLM-8201", "type": "claim", "hop": 2, "flagged": true }
- ToolCallStarted (58ms): { "tool": "get_claim_graph_neighborhood", "args": { "max_hops": 3 } }
- GraphNodeVisited (140ms): { "node_id": "fraud_signal", "score": 0.78, "edges": 10 }
For Dana's UI, we replay this stream into a live subgraph that expands as the agent thinks. For auditors, it's the record that the decision was reasoned, not improvised.
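Replaying a stream like this takes very little code. The sketch below assumes the wire format is standard SSE, with the event name in the `event:` field and a JSON payload in `data:`. The text shows event names and payloads but not the framing, so treat that as a guess. `parse_sse` and `replay` are hypothetical helper names.

```python
import json


def parse_sse(lines):
    """Yield (event_name, payload) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))


def replay(events):
    """Fold the event stream into the visited nodes and traversed edges."""
    nodes, edges = {}, []
    for name, payload in events:
        if name == "GraphNodeVisited":
            nodes[payload["node_id"]] = payload
        elif name == "GraphEdgeTraversed":
            edges.append((payload["from"], payload["to"], payload["type"]))
    return nodes, edges


SAMPLE = [
    'event: GraphNodeVisited\n',
    'data: { "node_id": "CLM-8817", "type": "claim", "reason": "focal" }\n',
    '\n',
    'event: GraphEdgeTraversed\n',
    'data: { "from": "CLM-8817", "to": "J. Rivera", "type": "CLAIMANT" }\n',
    '\n',
    'event: GraphNodeVisited\n',
    'data: { "node_id": "CLM-8201", "type": "claim", "hop": 2, "flagged": true }\n',
    '\n',
]

nodes, edges = replay(parse_sse(SAMPLE))
```

Since the subgraph is rebuilt purely from events, the same fold can drive a live view and an after-the-fact audit replay.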
Feedback propagation
When an investigator marks a claim as confirmed fraud — or confirmed clean — that decision diffuses through the graph as a seed:
Seed propagation
$$s'(n) = s(n) + \sum_{k} \operatorname{sign}(k) \cdot \operatorname{sim}(n, k) \cdot 0.3^{\operatorname{hop}(n, k)}$$
Two weeks after Dana confirmed CLM-8201 as fraud, a new claim came in from the same body shop. The graph already knew. Base graph score: 0.41 on the features alone. After seed propagation from its 2-hop neighbor: 0.68. Medium band, straight to SIU, no retraining in between.
Confirmed fraud pulls its neighborhood up. Confirmed-clean pulls it down. Next week's scoring inherits today's decisions without retraining a model.
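The propagation rule translates directly into code. In this sketch each seed is a confirmed decision described by a label, a similarity to the claim being scored, and a hop distance. The tuple layout, the label strings, and the inputs in the usage note are assumptions for illustration. How `sim` is computed is not covered here, so it is passed in as a number.

```python
HOP_DECAY = 0.3
SIGNS = {"fraud": 1.0, "clean": -1.0}


def propagate(base_score, seeds):
    """Apply s'(n) = s(n) + sum(sign(k) * sim(n, k) * 0.3 ** hop(n, k)) over seeds k.

    seeds: iterable of (label, similarity, hop) with label "fraud" or "clean".
    """
    delta = 0.0
    for label, similarity, hop in seeds:
        if label not in SIGNS:
            raise ValueError(f"unknown seed label: {label!r}")
        delta += SIGNS[label] * similarity * HOP_DECAY ** hop
    # The formula as written has no clamp, so none is applied here.
    return base_score + delta
```

With illustrative inputs, a confirmed-fraud seed one hop away at similarity 1.0 lifts a base score of 0.50 by 0.3. A confirmed-clean seed two hops away at similarity 0.5 lowers it by 0.045.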
If you run fraud at a carrier or TPA and want to see the graph view in action, we'll walk you through a live claim.