How concentrated is block formation in Bitcoin? Which mining pools are dominant, and how long have they held that position? Are major pools in a position to collude and censor transactions? Which miners should developers and key stakeholders engage with? How exposed is Bitcoin to potentially malicious miners? Are major pools reinforcing their dominant positions, or is the industry becoming more competitive? (Ark Invest covered this last question in mid-2018.)

These questions matter a great deal for Bitcoin, and they cannot be answered without accurate data about block production. However, popular sources fall short in this respect. Bitcoinity and BTC.com both publish time-series data on block production, but both omit important detail and neither offers better than monthly resolution. As a result, determining which pools are gaining popularity and which are waning is challenging. That was the inspiration for this study.
By combining the Bitcoinity and BTC.com datasets, we were able to create this initial 3-year visualization of the development of mining pools.

Keep in mind that the mining pools listed are not individual entities; in most cases they are aggregations of hashpower. Individual miners, whether hobbyist operations or industrial farms, point their hashpower at pools to reduce variance in payouts. However, since mining pools have effective control over third-party hashpower in the short term, their ability to exercise discretion over the aggregated hashpower of contributing miners is worth contemplating. So we wanted a more detailed and comprehensive view of the evolution of pools over time, one that would also shrink the “other” field in the chart above.

Thus we parsed the coinbase outputs from Bitcoin’s last 450,000 blocks (excluding the first 100,000, because early miners did not put identifying information in the coinbase data) to reconstitute the pool data directly.

By coinbase data, we do not mean the popular exchange, but rather the field in the block generation transaction where miners can insert arbitrary data. It has become a convention for mining pools to insert identifying data in newly mined blocks, although this is purely voluntary (and miners are under no obligation to tell the truth!).
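As a rough illustration of the kind of matching involved, the sketch below scans the raw coinbase bytes of a block for known tag strings. The tag table and the function name `identify_pool` are illustrative assumptions, not the list or code used for this study; real attribution also has to cope with pools that change their tags over time or omit them entirely.

```python
# Hypothetical tag table: byte strings to look for in the coinbase field.
# These entries are illustrative assumptions, not the study's actual list.
POOL_TAGS = {
    b"/slush/": "Slushpool",
    b"AntPool": "Antpool",
    b"BTC.COM": "BTC.com",
}


def identify_pool(coinbase_data: bytes) -> str:
    """Return the first pool whose tag appears in the coinbase data.

    Blocks with no recognised tag are labelled "unknown". Because the
    field is free-form and self-reported, a match is a claim, not proof.
    """
    for tag, pool in POOL_TAGS.items():
        if tag in coinbase_data:
            return pool
    return "unknown"


if __name__ == "__main__":
    print(identify_pool(b"\x03\x8a\x0c\x08/slush/\x00\x01"))
    print(identify_pool(b"\x04\xff\xff\x00\x1d"))
```

A plain substring search keeps the sketch short; a fuller version would probably need per-pool patterns and date ranges.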

After cleaning the data, we were able to identify 37 individual mining pools or large solo miners that each mined at least 0.1% of the blocks in the period, plus 11 additional identifiable entities that fell below that threshold (these are aggregated into “other known” in the chart below).
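The 0.1% cut-off can be sketched as a simple bucketing step over per-block labels. This is a minimal illustration under assumptions: the function name `bucket_pools`, the `"unknown"` label, and the input format (one pool label per block) are hypothetical and not taken from the study's code.

```python
from collections import Counter


def bucket_pools(labels, threshold=0.001, unknown_label="unknown"):
    """Group per-block pool labels into charted pools, "other known", and unknown.

    labels: iterable with one pool label per block.
    threshold: minimum share of all blocks (0.001 = 0.1%) for a pool
               to be kept as its own series.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    buckets = {}
    other_known = 0
    for pool, n in counts.items():
        if pool == unknown_label:
            # Unidentified blocks stay in their own bucket.
            buckets[unknown_label] = n
        elif n / total >= threshold:
            buckets[pool] = n
        else:
            # Identified, but too small to chart individually.
            other_known += n
    buckets["other known"] = other_known
    return buckets


if __name__ == "__main__":
    sample = ["A"] * 1990 + ["B"] + ["unknown"] * 9
    print(bucket_pools(sample))
```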


The key can be hard to read, so here is the same chart with some major mining pools labeled on the chart:


The striking conclusion from the all-time chart is just how fallible major mining pools are: several influential pools which once controlled significant fractions of Bitcoin’s hashrate – BTC Guild, GHash, BTCC – are now totally defunct. Indeed, few pools seem to be truly persistent, F2Pool and Slushpool being notable exceptions. And while measuring real-world concentration is hard (for instance, Bitmain owns both Antpool and BTC.com, and has a stake in ViaBTC), the high die-off rate suggests that mining pool administration is a competitive industry. The miner concentration analysis is nevertheless confounded by the complex pool ownership dynamics at play.

One mystery arising from the chart is the resurgence of unknown miners. Between the latter half of 2015 and mid-2017, miners disclosing their identities in coinbase outputs accounted for the vast majority of all blocks. Through 2018, however, the share of unknown miners picked up. This may reflect the waning importance of miner signaling now that the Segwit saga is resolved, a newfound appreciation for privacy, or the emergence of miners who have something to hide.
Let’s take a closer look at the last four years.


Slushpool remains relevant, Antpool’s influence has moderately declined, F2Pool has declined considerably, and BTC.com has tapered off despite a position of strength. Notably, BTCC shut down altogether. BW.com also mined the last of their 12,414 blocks on July 30, 2018. Since mining pool operation is a generally undifferentiated business, such failures are frequent. Much-touted hardware manufacturer BitFury has tailed off their mining pool operation as they have diversified their business operations.
Interestingly, while F2Pool (43,657 blocks) and Antpool (40,304 blocks) lead the standings in terms of blocks mined, neither sits atop the leaderboard when it comes to raw BTC mined. This is because issuance per block has been declining, so early mining pools had access to a more plentiful stock of BTC. Thus BTC Guild eclipses its peers when it comes to total BTC mined. Note that this analysis omits blocks mined by pools in the pre-disclosure era: some mining pools were already operating before 2011/12, when self-identification became commonplace, and they are excluded by this methodology.
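To see why block counts and BTC totals can diverge, consider Bitcoin's issuance schedule: the block subsidy started at 50 BTC and halves every 210,000 blocks. The sketch below applies that schedule to two hypothetical pools; the function names and the block heights are made up for illustration, and transaction fees are ignored.

```python
HALVING_INTERVAL = 210_000          # blocks between subsidy halvings
INITIAL_SUBSIDY_SAT = 50 * 10**8    # 50 BTC, expressed in satoshis


def block_subsidy(height: int) -> int:
    """Newly issued satoshis for the block at the given height (fees excluded)."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return INITIAL_SUBSIDY_SAT >> halvings


def total_subsidy_btc(heights) -> float:
    """Total subsidy, in BTC, across the given block heights."""
    return sum(block_subsidy(h) for h in heights) / 10**8


if __name__ == "__main__":
    early_pool = [150_000] * 10   # 10 blocks in the 50 BTC era
    late_pool = [450_000] * 30    # 30 blocks in the 12.5 BTC era
    print(total_subsidy_btc(early_pool))  # 500.0
    print(total_subsidy_btc(late_pool))   # 375.0
```

In this toy case the later pool mines three times as many blocks yet collects less subsidy, which is the effect described above.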

Lastly, it’s worth calling attention to Slushpool’s remarkable persistence. Founded in 2010, Slushpool was dominant in the early era of pooled mining, accounting for 10-15% of Bitcoin’s daily hashrate for an extended period. It entered a trough in 2014 but surprisingly revived itself in the last twelve months, being one of the few mining pools to do so – normally they just trail off and die.

There are far more stories in this data. We have only begun to scratch the surface. In particular, an updated industry concentration measure (the Herfindahl-Hirschman index seems suitable) which tackles both the unknown miners and the joint control of multiple pools by single entities would be well warranted.
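As a starting point, the Herfindahl-Hirschman index is the sum of squared market shares, so with shares expressed as fractions it ranges from 1/N (N equal players) up to 1 (a single player). The sketch below computes it from block counts and optionally merges pools under a common owner first. The function name `hhi`, the `owners` mapping, and the sample counts are assumptions for illustration; how to treat unknown miners is left to the caller, since that remains an open question.

```python
def hhi(block_counts, owners=None):
    """Herfindahl-Hirschman index over block shares, as a fraction in (0, 1].

    block_counts: dict mapping pool name -> number of blocks mined.
    owners: optional dict mapping pool name -> controlling entity; pools
            that share an owner are merged before shares are computed.
    """
    owners = owners or {}
    merged = {}
    for pool, n in block_counts.items():
        entity = owners.get(pool, pool)
        merged[entity] = merged.get(entity, 0) + n
    total = sum(merged.values())
    if total == 0:
        raise ValueError("no blocks to measure")
    return sum((n / total) ** 2 for n in merged.values())


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    counts = {"Antpool": 25, "BTC.com": 25, "F2Pool": 50}
    print(hhi(counts))  # 0.375
    # Treating the two Bitmain-owned pools as one entity raises the index.
    bitmain = {"Antpool": "Bitmain", "BTC.com": "Bitmain"}
    print(hhi(counts, bitmain))  # 0.5
```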