
Toxic Recall Attack - Unwinding JoinMarket Case Study

Introduction

This report aims to help solve a cold case. The year is 2015: Bitcoin is still young but gaining in popularity. The venue is Reddit, where the alleged victim makes a plea for help on the /r/bitcoin subreddit:

"On the 9th of February 2015, somebody gained access to an online computer of ours with a BTC wallet holding 445 BTC... transferred those BTC to 5 different wallets... Those BTC were left sitting for roughly 1.5 years, until a few weeks ago when the thieves found out how nice bitcoin mixers are, and started to mix them, leaving nearly no traces..."

The Reddit post by u/gridchain concludes with a plea for aid in tracking the allegedly stolen funds. As far as we can tell, no help was provided.

Fig.1 — Screenshot of the Reddit post by /u/gridchain

Scope

In this report we will cover the following:

  • The timeline of events that occurred during the incident
  • An introduction to bitcoin privacy and CoinJoin theory
  • An introduction to the Toxic Recall Attack on flawed CoinJoin protocols
  • Assessment of likely destinations of the alleged stolen coins

We cannot attest to the validity of the claims of theft in the original post by u/gridchain. However, we have attempted to contact the user with the conclusions of this report to aid in the recovery of the allegedly stolen funds. At the time of publication, we have not received any return communication from the user.

If anyone can aid us in contacting the user, please reach out to us at investigations@oxtresearch.com.

A Timeline of Events

Based on the information provided in the Reddit post and additional background acquired from the blockchain, a rough timeline of the events is provided below.

15 January 2015 to 8 February 2015

  • u/gridchain's web wallet receives approximately 445 BTC in payouts from mining pools and coinbase (miner block reward) transactions to the following cluster: ANON-494272502.

9 February 2015

  • The user's web wallet is allegedly compromised. Funds are removed from the web wallet in a series of 5 transactions to the addresses provided in the Reddit post.
Table 1 — Alleged Withdrawal Destination of Stolen Funds
Date Received | Destination Address | Blockheight Received | BTC
09 Feb 2015 | 16a2pR6UDyeqv1ArQ8hGXJgqVCWfoqbdUr | 342641 | 45.868
09 Feb 2015 | 17MtkE39Ms9gcZBdAWS6QQCyd7qrKdVdzo | 342641 | 100
09 Feb 2015 | 1KNgyBny6S5sA9fxU8QJC3bLFHdDAKAabU | 342641 | 100
09 Feb 2015 | 12RrvE59LUgcRdgE5W4iPpjcr66GtW6YgV | 342640 | 100
09 Feb 2015 | 1EMChJbxPW7vTLyaTh3TBVMm9i8BUPFA1i | 342640 | 100

7 April 2017

  • After nearly two years of inactivity, the alleged stolen funds are moved for the first time.
  • 45.87 BTC from address [16a2pR...] are transferred via the following TxID (f1609...) and enter their first JoinMarket CoinJoin via TxID (49b5f...).

10 April 2017

  • 400 BTC from the remaining addresses in Table 1 above are transferred via the following TxID (136d7...) and enter their first JoinMarket CoinJoin via TxID (926aa...).

2 May 2017

  • The coins continue mixing and the last of the "unmixed change" associated with both initial mixer deposits is merged into the same CoinJoin transaction via TxID (a85b3...).

5 May 2017

  • u/gridchain posts about the events above on Reddit.

Bitcoin Privacy & Mixers

We have spent a considerable amount of time tracking the movement of coins across both custodial tumblers and non-custodial CoinJoin protocols. However, we have yet to cover the basics of bitcoin privacy and why CoinJoins are vital to providing users with a basic level of privacy.

Pseudonymous Bitcoin

Bitcoin transactions spend bitcoins to and from pseudonymous addresses rather than a bank account or name tied directly to an identity. Bitcoin addresses offer a powerful but fragile level of pseudonymous privacy.

Fig.2 — A Typical Bitcoin Transaction

The Transparent Ledger

In contrast to the traditional finance system, bitcoin's public ledger can be observed by any third party running a full node or with access to a web-based block explorer.

This allows any observer to construct a transaction graph showing the relationship between transaction inputs and outputs over a series of transactions.
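As a minimal sketch of this idea, the Python below links transactions by following which transaction spends the outputs of which. The `Tx` structure and the transaction IDs are hypothetical placeholders for illustration, not data from this case and not the OXT data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    txid: str
    # Each input names the (txid, output index) it spends.
    inputs: tuple
    n_outputs: int


def build_graph(txs):
    """Map each txid to the set of txids that spend one of its outputs."""
    known = {tx.txid for tx in txs}
    graph = defaultdict(set)
    for tx in txs:
        for prev_txid, _index in tx.inputs:
            if prev_txid in known:
                graph[prev_txid].add(tx.txid)
    return dict(graph)


# Hypothetical toy data: "a" funds "b" and "c"; "b" and "c" both fund "d".
txs = [
    Tx("a", (), 2),
    Tx("b", (("a", 0),), 1),
    Tx("c", (("a", 1),), 1),
    Tx("d", (("b", 0), ("c", 0)), 1),
]
graph = build_graph(txs)
```

Walking `graph` forward from a deposit, or backward from a destination, is the basic operation behind the graph views used later in this report.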

Fig.3 — A Typical Bitcoin Transaction Graph

Heuristics That Damage Bitcoin Transaction Privacy

Additional metadata, such as output amounts and wallet-specific transaction features, can be leveraged to infer more information about a bitcoin transaction. Two common heuristics are wallet fingerprinting and the round payment heuristic.

Fig.4 — Wallet Fingerprinting and Round Payment Amount Heuristics

Regardless of the heuristics applied to evaluate bitcoin transactions, a mathematical relationship exists between the inputs and outputs of most bitcoin transactions. These relationships are called deterministic links, and they indicate with mathematical certainty that a transaction input was used to pay an output.

Fig.5 — Deterministic Links Between Inputs and Outputs

In a properly constructed CoinJoin transaction, deterministic links between inputs and outputs are broken, instead creating probabilistic links between inputs and identical outputs.

The presence of deterministic links between inputs and outputs is evaluated with the CoinJoin sudoku algorithm, which has been integrated into the transaction privacy algorithm called Boltzmann.

Boltzmann evaluates CoinJoin transactions for deterministic and probabilistic links between inputs and outputs.

Fig.6 — A CoinJoin Transaction with Deterministic Links for "Unmixed Change" TxID (37c64...)

The JoinMarket CoinJoin shown in Fig. 6 above breaks the deterministic links between transaction inputs and outputs for the identical outputs.

Identical outputs are often described with a simple privacy metric called anonymity set. In this case there are four 4.1761 BTC outputs, so these outputs are assigned an initial anonymity set of four.
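The initial anonymity set can be read straight off the output amounts. The sketch below counts identical outputs; the four 4.1761 BTC outputs and the 389.82 BTC output come from the text, while the other change amounts are hypothetical placeholders.

```python
from collections import Counter
from decimal import Decimal


def anonymity_sets(output_amounts):
    """Return {amount: count} for every amount that appears more than once."""
    counts = Counter(output_amounts)
    return {amount: n for amount, n in counts.items() if n > 1}


# Four identical mix outputs plus non-identical "unmixed change" outputs.
# Only 4.1761 and 389.82 are taken from the text; 1.25 and 0.6 are made up.
outputs = [Decimal("4.1761")] * 4 + [
    Decimal("389.82"),
    Decimal("1.25"),
    Decimal("0.6"),
]
sets = anonymity_sets(outputs)
```

This is only the initial figure. As the rest of this report shows, the effective anonymity set can shrink once later spending behavior is taken into account.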

The non-identical outputs shown in the example transaction above are deterministically linked (100%) to the transaction inputs. These outputs are sometimes referred to as "unmixed change" and create a "peeling" style transaction graph.

These outputs are directly attributable to an initial mixer deposit and its associated toxic pre-mix history. The most obvious deterministic link in the transaction above is between input 3 (~394 BTC) and output 2 (389.82 BTC). In most cases, deterministic links become obvious when subtracting the mix output denomination from the targeted input.
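The subtraction described above can be sketched as follows. The input is quoted only as ~394 BTC, so the values here are rounded, and the `tolerance` parameter is an assumption added to absorb rounding and fees; the second input and its change output are hypothetical.

```python
from decimal import Decimal


def match_change(inputs, outputs, denomination, tolerance=Decimal("0.01")):
    """Pair each input with a change output within tolerance of input - denomination."""
    change_outputs = [o for o in outputs if o != denomination]
    links = {}
    for index, amount in enumerate(inputs):
        expected = amount - denomination
        for change in change_outputs:
            if abs(change - expected) <= tolerance:
                links[index] = change
                break
    return links


denomination = Decimal("4.1761")
inputs = [Decimal("394"), Decimal("6")]  # 394 is rounded; 6 is hypothetical
outputs = [denomination] * 4 + [Decimal("389.82"), Decimal("1.82")]
links = match_change(inputs, outputs, denomination)
```

A production tool would evaluate all input and output combinations, as Boltzmann does, rather than rely on a fixed tolerance.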

As described in this report, a transaction graph based attack combines the presence of "unmixed change" and previously seen mix outputs to retroactively attack mixer deposits.

The CoinJoin Coordinator

The privacy afforded by a CoinJoin protocol is dictated by the coordinator algorithm.

If a coordinator does not enforce each item in the list below, it creates the possibility of attacking user privacy through no fault of the user.

No Deterministic Links
Address reuse, a common privacy worst practice, should be rejected by the coordinator to preserve all users' privacy.

Sybil resistance
Fees should be taken prior to mixing. Fees taken directly in a mix transaction result in deterministic links ("unmixed change"). Taking fees outside of the mix allows for an "ideal" CoinJoin with no reliably discernible history and provides the opportunity for free remixing.

Identical Output Denominations and Pool Style Mixing
Non-identical outputs create unique fingerprints for each mix, which can be used as additional leverage in an attack. Users are often required to combine smaller outputs to meet a larger mix output denomination. Identical output denominations and a pool style mixer allow for higher anonymity sets than that obtained with a single mix.

Structural Liquidity Enforcement
New mixer liquidity should be required by the coordinator before triggering a mix. If new liquidity is not enforced, users will continuously remix with the same mix participants.

Limiting "Previously Seen" Mix Outputs
A mix transaction should limit the number of outputs from a previous mix to no more than one. This minimizes the risks of users remixing with the same entities and creates a dispersed transaction graph.
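One way to read this last rule is as a check that a coordinator runs on the inputs of a proposed mix. The sketch below is an interpretation under that assumption, not the logic of any real coordinator; the function name and the transaction IDs are hypothetical.

```python
from collections import Counter


def overrepresented_mixes(input_sources, known_mix_txids, limit=1):
    """Return prior mixes that contribute more than `limit` inputs to a proposed mix.

    input_sources:   txid that created each input of the proposed mix
    known_mix_txids: txids the coordinator recognizes as earlier mixes
    """
    counts = Counter(src for src in input_sources if src in known_mix_txids)
    return sorted(txid for txid, n in counts.items() if n > limit)


# Hypothetical proposal: two inputs come from "mix1", one from "mix2",
# and one from an ordinary deposit transaction.
proposal = ["mix1", "mix1", "deposit", "mix2"]
violations = overrepresented_mixes(proposal, {"mix1", "mix2"})
```

A coordinator enforcing the rule would refuse to trigger the mix while `violations` is non-empty.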

Coins On The Move

Now that we have presented the basics of bitcoin transaction and CoinJoin privacy, we can circle back to the mixing of u/gridchain's allegedly stolen coins.

After nearly two years of inactivity, the coins were mixed through JoinMarket. JoinMarket was the first CoinJoin protocol offering a trust-less, non-custodial mixing service.

As detailed above, the "unmixed change" outputs within a CoinJoin transaction are deterministically linked to the corresponding inputs. The presence of "unmixed change" creates a "peeling" pattern as mixed output amounts are subtracted from the deterministic links with each subsequent mix.

We have highlighted the unmixed change as it works its way through subsequent CoinJoins. Fully expanding the transaction graph of each CoinJoin reveals a very noisy transaction graph.

Fig.7 — Complete Transaction Graph Tracing the Path of "Unmixed Change" Through JoinMarket

The peeling pattern presents a unique metadata trail that indicates each initial mix that a user participates in. It's worth noting metadata indicating the first mixing round a user participates in is present in all CoinJoin implementations. However, the presence of "unmixed change" within a CoinJoin transaction creates a target for the attack described in this report.

This metadata becomes more apparent when we filter out the irrelevant transaction graph data. We do this by only expanding the "unmixed change" outputs (orange highlighted lines) from each mix that are attributable to the original mixer deposits (400 BTC and 45.87 BTC).

Fig.8 — Isolated Tx Graph Tracing the Path of "Unmixed Change" of Each Mixer Deposit (Tx Graph Bookmark)

📖
KEY TAKEAWAYS Figure 8

Hiding the unnecessary transaction graph data highlights the separate peeling chains.

The transaction graph feature on oxt.me automatically reveals outputs (blue lines) that are used as inputs in following transactions.

Three distinct mixing regimes become apparent in the 400 BTC peeling chain.

Three Distinct Regimes

JoinMarket incentivizes providing mixing liquidity by rewarding users with the Sybil resistance fee. This process is called market "making". On the other hand, liquidity consumers pay the Sybil resistance fee and act as market "takers".

Figure 8 above reveals three distinct mixing regimes for the path of the 400 BTC mixing deposit. The regimes are as follows:

First Regime — Maker

  • The "unmixed change" outputs receive a net mixing fee, indicating the controller of the original 400 BTC mixer deposit was acting as a market "maker".
  • The "unmixed change" outputs of the allegedly stolen coins are typically the only outputs sent to follow-up mixes, indicating a relatively high liquidity environment.
  • Mixed (anonymity set) outputs range from approximately 0.8 to 15.5 BTC in this regime.

Second Regime — Taker

  • The "unmixed change" outputs pay a net mixing fee, indicating the controller of the original 400 BTC mixer deposit was engaging in a "taker" role.
  • "Previously seen" mixed outputs (as shown by the blue outputs) are much more common in this regime, indicating remixing with the same users in a low liquidity environment.
  • These outputs are frequently sent to the same mixes as the "unmixed change" allowing the attack described below.
  • Mixed (anonymity set) outputs range from approximately 5.8 to 57.9 BTC.

Third Regime — Alternating Regime

  • This regime includes both market making and taking before being merged with the 45.87 BTC peeling chain.
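The maker and taker roles described above can be told apart from the sign of the net fee on the "unmixed change". The sketch below is a simplification that ignores how mining fees are split between participants; the amounts are hypothetical.

```python
from decimal import Decimal


def classify_role(input_amount, change_amount, denomination):
    """Classify a participant from the net fee visible on its unmixed change."""
    # With no fee at all, change would equal input minus the mixed output.
    net_fee = change_amount - (input_amount - denomination)
    if net_fee > 0:
        return "maker"  # change grew: the participant was paid a fee
    if net_fee < 0:
        return "taker"  # change shrank: the participant paid fees
    return "unknown"


# Hypothetical amounts in BTC.
maker_role = classify_role(Decimal("100"), Decimal("90.001"), Decimal("10"))
taker_role = classify_role(Decimal("100"), Decimal("89.998"), Decimal("10"))
```

Applying such a check to each step of a peeling chain is one way to see the regime changes described in this section.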

The different mixing regimes are highlighted by the increase in mix output denominations as shown in Fig. 9 below. The average mix output denomination was significantly higher during the taker regime.

Fig.9 — Mix Output Denominations During Different Mixing Regimes

📖
KEY TAKEAWAYS Figure 9

During the maker and alternating regimes the average mix denominations are 5.9 and 9.1 BTC, respectively.

During the taker regime, the mix output denominations average 27.1 BTC. The liquidity provided in this higher denomination regime is likely lower than the other regimes as evidenced by the common "previously seen" mix outputs.

The Toxic Recall Attack

We have named this attack the "Toxic Recall" Attack in homage to the film "Total Recall", based on the short story "We Can Remember It for You Wholesale" by Philip K. Dick.

Attack Assumptions:

  1. A user can only run one mixing client at a time.
  2. The mixing client prevents the combination of mixed outputs and "unmixed change". This is accomplished by JoinMarket's mix depth segregation of mixed and unmixed outputs.

Necessary attributes of the Toxic Recall Attack (TRA):

  1. The presence of "unmixed change" also known as toxic change within a mix transaction. The unmixed change is "toxic" because it is traceable (deterministically linked) to the original mixer deposit and associated premix history.
  2. The recall of "previously seen" mix outputs that are sent to future mixes indicating the same entities are remixing with each other.
  3. The continued toxic change and "previously seen" mix outputs are sent to the same future mix.

Under Assumption #2, JoinMarket never combines a user's mixed outputs with that user's "unmixed change". Therefore, any "previously seen" mixed output that is sent to a follow-up mix together with the toxic unmixed change CANNOT be controlled by the user who controls that toxic unmixed change.

This results in a lower anonymity set for the previous mixes for the entity controlling the toxic unmixed change.
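The elimination step can be sketched in a few lines of Python. This is an illustrative model under the two assumptions listed above, not the tooling used for this report; the outpoints, written as (txid, index) pairs, are hypothetical.

```python
from fractions import Fraction


def toxic_recall(mix_outputs, later_mixes, toxic_change):
    """Apply the elimination step to one attacked mix.

    mix_outputs:  outpoints of the identical outputs of the attacked mix
    later_mixes:  list of sets, each holding the outpoints spent by a later mix
    toxic_change: outpoints on the target's "unmixed change" chain
    """
    eliminated = set()
    for spent in later_mixes:
        if spent & toxic_change:  # the target's toxic change is an input here
            eliminated |= spent & set(mix_outputs)
    return [o for o in mix_outputs if o not in eliminated]


mix_outputs = [("mix1", 0), ("mix1", 1), ("mix1", 2)]
toxic_change = {("mix1", 3), ("mix2", 4)}
later_mixes = [
    {("mix1", 3), ("mix1", 0), ("other", 0)},  # toxic change + output 0
    {("mix2", 4), ("mix1", 2)},                # toxic change + output 2
    {("mix1", 1), ("other", 1)},               # no toxic change: no elimination
]
remaining = toxic_recall(mix_outputs, later_mixes, toxic_change)
probability = Fraction(1, len(remaining))
```

Here the anonymity set drops from three to one, so the single remaining output is attributed to the target. With two remaining outputs, each would carry a probability of 1/2.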

Deploying The Attack

This attack becomes more apparent when the transaction graph is focused on the taker regime. Below we highlight the "previously seen" anonymity set outputs that are remixed along with the toxic change.

Fig. 10 — Taker Regime and Weakest Mix Transactions in Peeling Chain (Tx Graph Bookmark)

📖
KEY TAKEAWAYS Figure 10

The orange highlighted mix outputs are "previously seen" mix outputs that have already been mixed with the 400 BTC deposit.

The previously seen mix outputs that are sent to future mixes with the toxic change of the 400 BTC deposit cannot be controlled by this entity.

As a result, we can eliminate these mixed outputs from their original mix to isolate the outputs as controlled by the 400 BTC mixer deposit. This is the Toxic Recall Attack.

We have highlighted the "weakest links" in this mix peeling chain for additional analysis.

Readers will have an easier time understanding this analysis by interacting with the included Transaction Graph Bookmarks. Peer review of this analysis is welcome.

Toxic Recall Attack Impact

The effects of the TRA on the weakest link mixes are highlighted in Table 2, below.

Table 2 — Toxic Recall Attack on 400 BTC Mixer Deposit
TXID | Mix Regime | Original Mix Anonymity Set | "Previously Seen" Mixed Outputs | Toxic Recall Attack Anonymity Set
TxID (12dea...) | Maker | 8 | 0 | 8
TxID (fbe8e...) | Maker | 5 | 2 | 3
TxID (d090e...) | Maker | 5 | 0 | 5
TxID (2dc4e...)* | Taker | 4 | 2 | 2
TxID (6d25a...)* | Taker | 3 | 2 | 1
TxID (fc92b...)* | Taker | 7 | 5 | 2
TxID (11dfc...)* | Taker | 7 | 5 | 2
TxID (70812...) | Taker | 6 | 1 | 5
TxID (7d5d4...) | Taker | 7 | 3 | 4
TxID (206c5...) | Taker | 5 | 1 | 4
TxID (32f4d...) | Taker | 6 | 0 | 6
TxID (1ca73...) | Taker | 6 | 0 | 6
TxID (52d23...) | Alt | 3 | 0 | 3

📖
KEY TAKEAWAYS Table 2

The TRA is applicable to 8 mixes in the 400 BTC "unmixed change" peeling chain.

4 of these mixes are "weakest links" occurring during the low liquidity taker regime that result in high probability attacks on the 400 BTC entity's mixed outputs.

The 4 weakest link mixes have an original total identical output count of 21.

The "weakest links" are the four mixes marked with an asterisk (*) in Table 2.

The TRA on the 400 BTC entity reduces the possible number of mix outputs (anonymity set) controlled by the entity to 7.

The 400 BTC entity controls 4 of the 7 remaining mixed outputs, one from each of the 4 weakest link mixes.
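The figures in these takeaways follow directly from Table 2, as the short check below shows. The rows are copied from the table, with the asterisked mixes flagged as weakest links; the variable names are our own.

```python
# (txid prefix, regime, original anonymity set, "previously seen" outputs, weakest link)
TABLE_2 = [
    ("12dea", "Maker", 8, 0, False),
    ("fbe8e", "Maker", 5, 2, False),
    ("d090e", "Maker", 5, 0, False),
    ("2dc4e", "Taker", 4, 2, True),
    ("6d25a", "Taker", 3, 2, True),
    ("fc92b", "Taker", 7, 5, True),
    ("11dfc", "Taker", 7, 5, True),
    ("70812", "Taker", 6, 1, False),
    ("7d5d4", "Taker", 7, 3, False),
    ("206c5", "Taker", 5, 1, False),
    ("32f4d", "Taker", 6, 0, False),
    ("1ca73", "Taker", 6, 0, False),
    ("52d23", "Alt", 3, 0, False),
]

# The attack applies wherever at least one "previously seen" output is present.
applicable = [row for row in TABLE_2 if row[3] > 0]
weakest = [row for row in TABLE_2 if row[4]]

original_total = sum(row[2] for row in weakest)
# Post-attack anonymity set = original set minus eliminated outputs.
remaining_total = sum(row[2] - row[3] for row in weakest)

print(len(applicable), original_total, remaining_total)
```

The script reports 8 applicable mixes, 21 original identical outputs across the weakest links, and 7 outputs remaining after the attack.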

Postmix Spending

So far we have used the Toxic Recall Attack (TRA) to isolate 7 remaining mix outputs as possibly belonging to the 400 BTC entity. Table 3 below shows the details for these outputs.

Table 3 — Toxic Recall Attack Isolated Mix Outputs
Fig 10 Item | Weakest Link TXID | Isolated Outputs from Mix | Isolated Output Index | Destination
A | TxID (2dc4e...) | 2 | 1 | Remixed via TxID (c4c82...)
B | TxID (2dc4e...) | 2 | 2 | Exits Mixer
C | TxID (6d25a...)*** | 1 | 5 | Remixed via (ba29e...)
D | TxID (fc92b...) | 2 | 10 | Remixed via (ba29e...)
E | TxID (fc92b...) | 2 | 8 | Remixed via (fb0a1...)
F | TxID (11dfc...) | 2 | 11 | Remixed via (c4c82...)
G | TxID (11dfc...) | 2 | 3 | Remixed via (af5a6...)

Each isolated output is highlighted and annotated in Fig. 11 below.

Fig.11 — TRA Isolated Mix Outputs (Tx Graph Bookmark)

📖
KEY TAKEAWAYS Table 3 and Figure 11

The Toxic Recall Attack was able to deanonymize the 400 BTC mix output from TxID (6d25a...). This output was remixed.

Of the remaining 6 isolated outputs, 5 are remixed.

Output 2 from TxID (2dc4e...) exited the mixer. This output has a 50% probability of belonging to the 400 BTC entity.

To simplify this analysis, we will follow Output 2 from TxID (2dc4e...) as it exits the mixer. With the context of this output in mind, we will afterwards circle back to the remixed coins isolated by the TRA.

Fig.12 — Attacked Output from Mix TxID (2dc4e...) Spent to Poloniex (Tx Graph Bookmark)

📖
KEY TAKEAWAYS Figure 12

The targeted output is spent via TxID (756c6...), then split and peeled in various directions.

Funds from several postmix spends are consolidated via TxID (ab1e6...).

TxID (ab1e6...) pays 76.69 BTC to address [16vBEu...] which is included in our Poloniex cluster.

Inputs to TxID (ab1e6...) are also sourced from the following JoinMarket CoinJoins via TxIDs (f79ca..., cbdf8...).

The TRA has given us a relatively high probability postmix spending destination attributable to the 400 BTC mixer deposit.

In a continuation of the attack, we will trace back through the history of address [16vBEu...] to see if any additional mixer outputs were sent to the address. The results of Part 1 of this attack are shown in Fig. 13 below.
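Backtracking of this kind is a breadth-first walk over input links that stops at the first mix it reaches on each path. The sketch below assumes a simple parent map and a known set of mix transactions; the transaction names and the `max_depth` cutoff are hypothetical.

```python
from collections import deque


def backtrack_to_mixes(start_txids, parents, mix_txids, max_depth=5):
    """Walk input links backwards and collect the first mixes reached.

    start_txids: transactions paying the destination address
    parents:     {txid: [txids whose outputs it spends]}
    mix_txids:   txids known to be CoinJoin mixes
    """
    found = set()
    seen = set(start_txids)
    queue = deque((txid, 0) for txid in start_txids)
    while queue:
        txid, depth = queue.popleft()
        if txid in mix_txids:
            found.add(txid)
            continue  # do not walk through a mix
        if depth == max_depth:
            continue
        for parent in parents.get(txid, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, depth + 1))
    return found


# Hypothetical history of two payments to the same destination address.
parents = {
    "pay1": ["merge1"],
    "merge1": ["mixA", "spend2"],
    "spend2": ["mixB"],
    "pay2": ["mixB", "funding"],
}
mixes = backtrack_to_mixes(["pay1", "pay2"], parents, {"mixA", "mixB", "mixC"})
```

Each mix found this way can then be compared against the outputs isolated by the Toxic Recall Attack.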

Fig.13 — Part 1 - Backtracking of Address [16vBEu...] History to JoinMarket Mixes (Tx Graph Bookmark)

📖
KEY TAKEAWAYS Figure 13

Four smaller mixed outputs from the 400 BTC peeling chain are merged and remixed via TxID (ba29e...), including de-anonymized and weakest link mix outputs. One of the three 103.46 BTC mix outputs from this mix is sent to [16vBEu...].

One of the two isolated weakest link mix outputs from TxID (11dfc...) is remixed via mix TxID (af5a6...). One of the outputs of that mix is merged with the 103.46 BTC output from A.

The postmix spend via TxID (16f43...) consolidates these outputs and pays 136.93 BTC to Poloniex Address [16vBEu...].

4 of the 7 original weakest link isolated outputs are linked to payments to Poloniex Address [16vBEu...].

We repeated the backtracking from Poloniex Address [16vBEu...] multiple times. The results of one of these iterations are shown in Figure 14 below.

Fig.14 — Part 2 - Backtracking of Address [16vBEu...] History to JoinMarket Mixes (Tx Graph Bookmark)

📖
KEY TAKEAWAYS Figure 14

Tracing back from Address [16vBEu...] via TxID (d1a91...) again leads to several post mix transactions.

Postmix spend of 2 BTC to Coinbase clustered address [13M2aB...] via TxID (9cd82...).

Additional postmix spends to Poloniex clustered addresses [13bVw1..., 17uer1..., 1CGaFy...] via TxIDs (a910e..., 4a2053..., 79987...).

We traced back several additional spends to [16vBEu...]. In total, the Toxic Recall attack opened the door for the tracking of 380 BTC to a final destination (out of 445 BTC mixed). A complete list of postmix spends, destination spends, and destination addresses is provided in the attached spreadsheet. Users are encouraged to evaluate the validity of this attack by using the included transaction graph bookmarks.

Thawing Out A Cold Case

While this data alone is not enough to confirm the alleged theft or the destination of the stolen coins, it provides a thread to pull on for follow-up investigations.

In this report, we have covered the following:

  • The transaction graph of a CoinJoin protocol can be used to attack mixed outputs.
  • The Toxic Recall Attack uses the presence of toxic "unmixed change" attributable to a user's original mixer deposit, together with "previously seen" mix outputs, to weaken mixes associated with the original mixer deposits.
  • While this case study focused specifically on JoinMarket CoinJoins, the Toxic Recall Attack is also applicable to Wasabi Wallet CoinJoins.
  • The presence of "unmixed change" and previously seen mix outputs permitted by old style CoinJoin coordinator technology results in severe structural flaws.
  • In a best-case scenario, the Toxic Recall Attack can be used to reduce the anonymity set of specific users.
  • In a worst-case scenario, this attack leads to de-anonymization of mix outputs.
  • Contrary to the claims of the developers of these CoinJoin protocols, "unmixed change" can be used to attack user privacy through no fault of the end user.
  • Findings from this OXT Research study are being used to reinforce the Whirlpool protocol when faced with similar attacks.
  • The coins associated with this alleged theft are likely irrecoverable. However, we have attempted to contact u/gridchain to provide the results of this analysis. If any readers can help us get in touch with u/gridchain, please contact us at investigations@oxtresearch.com.

Our analysis leveraged the power of the OXT platform in combination with unique knowledge of bitcoin mixers and a flexible targeted evaluation to track the movement of these funds.

The OXT Research Team is intimately familiar with the analysis of special situations affecting bitcoin markets. With the help of the OXT platform, we remain uniquely situated to provide targeted transparent analysis of this and other special situations affecting bitcoin.