Cracking the Darkweb: How Clustering Exposed a Global Child Abuse Ring

In 2017, global investigators quietly launched a case that would unravel one of the darkest corners of the internet. Buried behind layers of Tor routing and pseudonymous Bitcoin payments, Welcome to Video was a child sexual abuse material (CSAM) marketplace operating out of South Korea. It didn't just trade illegal content; it industrialized it. Access came at the price of a fraction of a bitcoin, and buyers believed their transactions would disappear into the blockchain's fog.

They were wrong.

Using only open-source blockchain data and forensic techniques like clustering, investigators traced payments, linked wallets to names, and kicked down digital doors in 38 countries. In the end, 337 users were arrested (U.S. Department of Justice, 2019). Not with spyware. Not with surveillance. With math.

From Crime Scene to Pattern Recognition

The Welcome to Video case revealed that blockchain transactions, when analyzed correctly, can expose hidden criminal networks. But the question remained: how did investigators know where to look? This brings us to one of the most pivotal discoveries in blockchain analysis—a method born not in a government agency, but in academia.

The Myth of Anonymity Meets a Spreadsheet

Back in 2013, long before most of us had heard of Bitcoin, a doctoral student at UC San Diego asked a deceptively simple question: how many people are actually using this thing?

What Sarah Meiklejohn found was chaos: 12 million addresses, nearly 16 million transactions, and no obvious way to tell which wallets belonged to whom. But she had a hunch. If Bitcoin was pseudonymous, not anonymous, then patterns had to exist. So she started digging, transforming the entire blockchain into a massive searchable spreadsheet and running queries like a digital archaeologist brushing sand off buried bones (Greenberg, 2022).

That hunch became a breakthrough. Through behavioral patterns and transaction structures, she realized there was a way to collapse the illusion of anonymity. It started with a rule of thumb.

When Piggy Banks Talk

Investigators call it a heuristic, a pattern you can follow even when the data won't speak directly. One of the most effective is called the common input ownership heuristic. Here's how it works: if multiple Bitcoin addresses are used together to fund a transaction, odds are they're controlled by the same person. Think of it like someone paying for lunch using two different credit cards. You wouldn't assume they borrowed both from friends. You'd assume both cards are theirs.
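Here is a minimal sketch of that rule in Python. It assumes each transaction is a dictionary with an "inputs" list of address strings, which is a made-up format for illustration rather than any real tool's schema, and it uses a small union-find structure to merge addresses that ever fund a transaction together.

```python
from collections import defaultdict


class UnionFind:
    """Minimal disjoint-set structure for grouping addresses."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def cluster_by_common_inputs(transactions):
    """Group addresses that appear together as inputs to the same transaction."""
    uf = UnionFind()
    for tx in transactions:
        inputs = tx["inputs"]
        for addr in inputs:
            uf.find(addr)  # register every input address, even lone ones
        for addr in inputs[1:]:
            uf.union(inputs[0], addr)
    clusters = defaultdict(set)
    for addr in list(uf.parent):
        clusters[uf.find(addr)].add(addr)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical toy data: the addresses are placeholder strings.
txs = [
    {"inputs": ["A", "B"]},
    {"inputs": ["B", "C"]},
    {"inputs": ["D"]},
]
print(cluster_by_common_inputs(txs))  # [['A', 'B', 'C'], ['D']]
```

Notice that A and C never appear in the same transaction, yet they land in one cluster because B links them. That transitive merging is what lets a handful of sloppy transactions collapse thousands of addresses into a single entity.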

But Bitcoin has another quirk. Andy Greenberg calls it the "piggy bank problem" in Tracers in the Dark. Every Bitcoin address is like a sealed piggy bank (strictly speaking, each unspent output, or UTXO, sitting at that address). If you want to pay someone 6 BTC from an address holding 10, you can't just scoop out what you need. You smash the whole thing. The 6 goes to your recipient. The leftover 4? That goes to a change address, a brand-new piggy bank that your wallet creates just for you (Greenberg, 2022).
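The piggy bank arithmetic is simple enough to sketch. The function name and the optional fee parameter are my own additions for illustration. The example in the text ignores the miner fee, so the default here is zero, though a real transaction would leave slightly less change.

```python
SATS_PER_BTC = 100_000_000


def spend_utxo(utxo_sats, payment_sats, fee_sats=0):
    """Consume a whole UTXO and return (payment, change) values in satoshis."""
    change = utxo_sats - payment_sats - fee_sats
    if change < 0:
        raise ValueError("UTXO too small to cover payment and fee")
    return payment_sats, change


payment, change = spend_utxo(10 * SATS_PER_BTC, 6 * SATS_PER_BTC)
print(payment / SATS_PER_BTC, change / SATS_PER_BTC)  # 6.0 4.0
```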

To anyone watching the blockchain, both outputs look the same. But with just a bit of logic, you can often tell which one is the payment and which one is the change. If exactly one of the outputs goes to a brand-new address never seen on the blockchain before, there's a good chance it's the change address. And if it's the change address, it probably belongs to the sender too.
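As a rough sketch, the fresh-address test might look like the following. The set of previously seen addresses is assumed to come from your own index of the blockchain, and the address names are placeholders. The function deliberately refuses to guess when zero or several outputs are new.

```python
def guess_change_output(output_addrs, seen_addresses):
    """Return the single output address never seen before, or None if ambiguous."""
    fresh = [addr for addr in output_addrs if addr not in seen_addresses]
    return fresh[0] if len(fresh) == 1 else None


seen = {"merchant_addr"}  # hypothetical: addresses already observed on-chain
print(guess_change_output(["merchant_addr", "new_addr"], seen))  # new_addr
print(guess_change_output(["new_1", "new_2"], seen))  # None
```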

Link the inputs. Spot the change. Group the addresses. That's clustering.

Peeling the Onion: Heuristics That Push Attribution Further

Here's why this matters: in any UTXO-based blockchain like Bitcoin, identifying which output is the actual payment and which one is the change is not just a technical detail; it can determine whether an investigation moves forward or ends up chasing its own tail. If an investigator misclassifies the change address as the recipient, the trail can go cold fast, wasting hours or even days tracing the wrong cluster. On the other hand, correctly identifying the real payment address puts the investigation on solid ground, revealing downstream flows and potential real-world identities. It's a critical skill that separates seasoned investigators from those chasing dead ends.

The common input ownership heuristic gets you in the door. But once you're inside, things get murkier. Criminals aren't always sloppy, and Bitcoin doesn't hand over answers. It hides them in plain sight. That's why blockchain investigators don't just rely on one trick. They stack heuristics like tools in a forensics kit, each one sharpening the picture of who's actually behind a wallet.

Nominal Spend Heuristic

A lot of people think the smallest output in a Bitcoin transaction is the payment. But that's not always the case. Wallets are designed to find the most efficient way to spend the exact amount needed while keeping fees low. So instead of looking for the smallest number, investigators ask a better question: are all the inputs required to cover one specific output? If yes, then that output is likely the payment.

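For example, suppose two inputs of 1 BTC and 2 BTC fund outputs of 2.49 BTC and 0.5 BTC. Either input alone could have covered the 0.5, so a wallet that only wanted to pay 0.5 would not have bothered adding a second input. The 2.49 output needs both, which makes it the likely payment and the 0.5 the likely change, even though the 0.5 is the smaller of the two. But if the outputs are close in value, like 1.45 BTC and 1.55 BTC, it gets tricky, because neither one stands out. Switching to USD values sometimes helps clear things up. The goal isn't just to match numbers. It's to spot the intent behind the spend.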

This method isn't perfect. Wallet behavior can vary, and some transactions are crafted to mislead. But combined with other heuristics, it's a reliable starting point for figuring out who paid what, and to whom.
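One way to sketch the "are all the inputs required?" question in code is below. The function name is hypothetical, values are integers in satoshis, and the numbers are invented for illustration. The check ignores the miner fee, which makes it slightly conservative. An output counts as needing every input when dropping even the smallest input would leave too little to cover it.

```python
def likely_payment(input_values, output_values):
    """Return the index of the only output that needs every input, else None."""
    if len(input_values) < 2:
        return None  # with a single input the test tells us nothing
    without_smallest = sum(input_values) - min(input_values)
    candidates = [i for i, v in enumerate(output_values) if v > without_smallest]
    return candidates[0] if len(candidates) == 1 else None


# Inputs of 0.4 and 0.3 BTC; only the 0.65 BTC output needs both of them.
print(likely_payment([40_000_000, 30_000_000], [65_000_000, 4_000_000]))  # 0
# Outputs close in value: neither one stands out, so no guess is made.
print(likely_payment([40_000_000, 30_000_000], [35_000_000, 34_000_000]))  # None
```

Returning None on the ambiguous case is a deliberate choice here: a heuristic that stays silent is easier to combine with others than one that always guesses.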

Change Address Type Heuristic

Not all Bitcoin addresses look the same. Legacy addresses start with "1," nested SegWit with "3," and native SegWit with "bc1." Most wallet software prefers consistency and sends change back to the same address type. If a transaction's outputs include one address that matches the input type and another that doesn't, there's a good chance the matching one is the change. The idea extends the change-address analysis in Meiklejohn et al.'s foundational work, which predates SegWit (Meiklejohn et al., 2013).
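A minimal sketch of this check, classifying addresses purely by the prefixes listed above. The function names and the addresses are placeholders, not real ones. Keep in mind that a "3" prefix covers any script-hash address, not only nested SegWit, so treat the labels as approximations.

```python
def address_type(addr):
    """Classify a mainnet address by prefix (a rough approximation)."""
    if addr.startswith("bc1"):
        return "native_segwit"
    if addr.startswith("3"):
        return "nested_segwit"  # could also be another script-hash address
    if addr.startswith("1"):
        return "legacy"
    return "unknown"


def change_by_address_type(input_addrs, output_addrs):
    """Return the one output whose type matches the inputs, or None if unclear."""
    input_types = {address_type(a) for a in input_addrs}
    if len(input_types) != 1:
        return None  # mixed input types give no clean signal
    (in_type,) = input_types
    matches = [a for a in output_addrs if address_type(a) == in_type]
    return matches[0] if len(matches) == 1 else None


# Placeholder strings, not valid addresses.
print(change_by_address_type(["bc1qsender"], ["1payee", "bc1qchange"]))  # bc1qchange
print(change_by_address_type(["bc1qsender"], ["bc1qa", "bc1qb"]))  # None
```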

Multisig Heuristic

Multisig wallets require multiple keys to authorize a transaction, which adds an extra layer of security. These addresses have a distinct script type that makes them easily recognizable on-chain. When a transaction involves a multisig input or output alongside standard addresses, analysts can use that contrast to infer wallet structure or organizational control. If multiple transactions show consistent movement between a particular multisig and a standard address, it can help cluster them as being under the same entity's control.

Round Payment Heuristic

Humans are messy, but when we pay others, we like clean numbers like 0.1 BTC, 1.0 BTC, or 5.0 BTC. These round figures stick out on the blockchain: a round output suggests an external payment, while the uneven leftover next to it is often the sender's own change.
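In code, "round" needs a definition. The sketch below assumes a value is round when it is a whole multiple of 0.001 BTC (100,000 satoshis). That threshold is an arbitrary choice for illustration, and real tooling would likely tune it or also test roundness in fiat terms.

```python
def is_round(value_sats, unit_sats=100_000):
    """True when the value is a whole multiple of unit_sats (0.001 BTC by default)."""
    return value_sats > 0 and value_sats % unit_sats == 0


def round_payment_guess(output_values):
    """Return (payment_index, change_index) for a two-output transaction, or None."""
    if len(output_values) != 2:
        return None
    first, second = (is_round(v) for v in output_values)
    if first == second:
        return None  # both round or both uneven: no signal
    return (0, 1) if first else (1, 0)


print(round_payment_guess([10_000_000, 23_871_455]))  # (0, 1)
print(round_payment_guess([10_000_000, 50_000_000]))  # None
```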

Self-Change Heuristic

Despite warnings about address reuse, it still happens frequently, especially in older transactions or lazy wallets. When change is sent back to an address that's already appeared in the user's history, that's a huge clue. Meiklejohn's study identified this behavior and used it to tag massive address clusters belonging to services such as Silk Road and Mt. Gox (Meiklejohn et al., 2013). In forensic cases like Welcome to Video, this heuristic helped connect clusters when change addresses overlapped with previously used ones (Greenberg, 2022).
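The simplest version of this check is a set lookup. In the sketch below, the function name and the sender_history parameter are hypothetical, and the history is assumed to hold addresses already attributed to the sender by earlier clustering.

```python
def self_change_outputs(input_addrs, output_addrs, sender_history=frozenset()):
    """Return outputs that go back to an input address or a known sender address."""
    known = set(input_addrs) | set(sender_history)
    return [a for a in output_addrs if a in known]


# Change returned straight to the spending address.
print(self_change_outputs(["A"], ["B", "A"]))  # ['A']
# Change sent to an address already attributed to the sender.
print(self_change_outputs(["A"], ["B", "C"], sender_history={"C"}))  # ['C']
```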

Don't Get Too Comfortable

These heuristics are sharp, but they're not gospel. They're patterns. Educated guesses. Good ones, but guesses nonetheless. All of them rely on predictable behavior, and criminals, when they're smart, stop being predictable.

That's where CoinJoin comes in. CoinJoin breaks the rules. It takes multiple users, mashes their payments together in one transaction, and jumbles the inputs and outputs so the usual tricks don't work. It's like walking into a crime scene and finding everyone's fingerprints on every surface. Wallets like Wasabi and Samourai have made CoinJoin more accessible, and forensic tools struggle to untangle the aftermath without additional context or external identifiers.

In the next post, we'll dig into CoinJoin: what it is, how it works, and why it remains the closest thing Bitcoin has to a true disappearing act, at least for now.

References

U.S. Department of Justice. (2019). South Korean National and Hundreds of Others Charged Worldwide in Takedown of Largest Darknet Child Exploitation Market. Retrieved from https://www.justice.gov/opa/pr/south-korean-national-and-hundreds-others-charged-worldwide-takedown-largest-darknet-child

Greenberg, A. (2022). Tracers in the Dark: The Global Hunt for the Crime Lords of Cryptocurrency. Doubleday.

Meiklejohn, S., Pomarole, M., Jordan, G., Levchenko, K., McCoy, D., Voelker, G. M., & Savage, S. (2013). A Fistful of Bitcoins: Characterizing Payments Among Men with No Names. https://cseweb.ucsd.edu/~smeiklejohn/files/imc13.pdf
