Being certain is a lovely thing. Despite what many would allege about the poor finality of proof of work, the relative certainty it provides is part of the appeal. Once that inbound transaction is buried six confirmations deep, it’s almost certainly yours. Of course, even more certainty is achievable with an in-person cash transaction. But you can’t send those over the internet.

It’s no surprise that users of Bitcoin and other cryptocurrencies have a near-insatiable hunger for data. The tools that extract meaningful data from block headers and node information and convert them into a useful format are vital. Whether it’s coin.dance, fork.lol, flippening.watch, or segwit.party, this industry has always benefited from useful interpretations of non-human-readable data broadcast on various blockchains. Other services we use virtually every day are p2sh.info, blockchain.info, etherscan.io, ethgasstation.info, and jochen’s mempool stats.

These aren’t just for analysts – they are for normal businesses using cryptocurrency or typical users looking to set the right fee. Some of these interpretations have constraints, of course, and occasionally the data offered by free services is poor or misinterpreted. Since we run a website in that same category – we want to empower users to make good decisions – we feel that we have a duty to make the shortcomings of our data starkly clear.

The previous installment in this user-education series can be found here. In it, we describe the difficulties inherent to estimating on-chain transaction volume, and the constraints of our own estimate. For UTXO chains, our estimates are suboptimal. For account-based systems (Ethereum and its derivatives), we have a better idea, albeit not a perfect one.

This post is a brief attempt to qualify our data and to give you an idea of which figures we have a high degree of confidence in, and which ones we don’t.

Let’s jump straight to the CSV files available at Data Downloads. Virtually everything found on our charts page is built out of these pieces of data. It’s just a matter of knowing how to put them together. This is what the last two weeks of data look like for Bitcoin:

Let’s go through these columns carefully. First, you have the date. Not much to talk about there. We use the Gregorian calendar. Daily closes for our price quotes occur at 00:00 UTC (that’s 7 pm EST the previous evening).

Next you have txVolume(USD). That’s what we’re talking about when we say “on-chain transaction volume.” Simply put, it’s a broad and largely unadjusted measure of the total value of outputs on the blockchain on a given day. It is an answer to the question “approximately how much value, denominated in USD, circulates on the Bitcoin blockchain per day?”

That said, on-chain transaction volume in practice is a very hard thing to estimate properly. We discuss that at length in this post. Please read it! We don’t want to give anyone a false degree of certainty in the number. We are currently working through various methods to improve the estimate, and maybe get closer to blockchain.info’s adjusted figure – but that will take time. So bear with us.

The third column is txCount. That refers to the number of transactions happening on the public blockchain per day. Be aware that on low-fee blockchains, it’s really easy to fabricate a whole bunch of transactions. So this isn’t necessarily that reliable! Additionally, UTXO networks like Bitcoin can batch many payments into a single transaction, so txCount understates activity on those networks. You therefore have to be careful comparing the number of transactions on Ethereum with Bitcoin; by its very nature, Bitcoin typically carries more payments than that datapoint suggests. Naively comparing UTXO-based and account-based systems by transaction count is like watching motorway traffic and comparing the number of buses versus motorcycles to guess at how many people are making trips. Maybe there are the same number of buses and motorcycles – but each bus might have 50 people inside of it.

Here’s a very typical example of a batched Bitcoin transaction. One sender, lots of receivers.

This is fairly typical for Bitcoin. Maybe it’s a mining pool paying out, or an exchange paying multiple users at once. Batching transactions saves space, and is more cost-efficient, so it’s encouraged. It also means that just counting transactions for Bitcoin (or other UTXO chains) isn’t likely to yield a reliable estimate of how many actual transactions are occurring. The useful site outputs.today popped up recently to make this point. Of course, outputs.today might be including change outputs (we’re not sure if they are or not), which aren’t meaningful transfers. But even if you conservatively assumed that half of all bitcoin outputs tallied on outputs.today are change outputs, you would still have a higher number of outputs than just the raw number of transactions. So please be aware of the fact that Bitcoin transactions have the flexibility of email (one can send to many), constrained only by the blocksize and the willingness of miners to include large transactions.
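To make the gap between transactions and outputs concrete, here is a minimal Python sketch. The per-transaction output counts are made-up numbers for illustration, not real chain data, and the 50% change-output share is only the conservative assumption mentioned above, not a measured figure.

```python
def estimate_payments(outputs_per_tx, change_fraction=0.5):
    """Return (transaction count, total outputs, estimated payments).

    change_fraction is the assumed share of outputs that are change
    (an assumption for illustration, not a measured value).
    """
    tx_count = len(outputs_per_tx)
    total_outputs = sum(outputs_per_tx)
    estimated = total_outputs * (1 - change_fraction)
    return tx_count, total_outputs, estimated


# Hypothetical day: four ordinary 2-output transactions plus two batched payouts.
sample = [2, 2, 2, 50, 2, 30]
tx_count, total_outputs, payments = estimate_payments(sample)
print(tx_count, total_outputs, payments)
```

In this toy case, six transactions carry 88 outputs, and even after discarding half of them as change the estimate is well above the raw transaction count. The result depends entirely on how common batching is in the data you feed in.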

Next, marketcap(USD). This is of course the unit price multiplied by the number of units in circulation. There has been quite a bit of controversy over this indicator. We still like this post from the Sia guys on the topic. Marketcap or network value is definitely flawed. It becomes less tethered to reality the smaller the float is. Float means the ratio of actual circulating units to the total number of units. Ripple, for instance, has a fairly small float, so one should probably be skeptical of its “market cap.” OnChainFx is doing a lot of good work on the issue.
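The two definitions in this paragraph can be written out as a small Python sketch. All of the numbers are hypothetical and chosen only to show how a small float affects the headline figure; they do not describe any real asset.

```python
def market_cap(price_usd, circulating_units):
    """Unit price multiplied by the number of units in circulation."""
    return price_usd * circulating_units


def float_ratio(circulating_units, total_units):
    """Share of all existing units that are actually circulating."""
    return circulating_units / total_units


# Hypothetical asset: 1,000 units exist, only 250 circulate, price is $2.
price = 2.0
circulating = 250
total = 1_000

print(market_cap(price, circulating))   # based on circulating units
print(market_cap(price, total))         # what you get if every unit is counted
print(float_ratio(circulating, total))
```

With a float of 0.25, the figure based on all units is four times the figure based on circulating units, which is why the choice of supply number matters so much for low-float assets.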

Price is price. Not much to say about that one. We get it from CoinMarketCap, with all the caveats that entails. Be advised – it’s the opening price.

exchangevolume(USD) is, as you might expect, the dollar value of the volume at exchanges like GDAX and Bitfinex. We get this data from CoinMarketCap, which has a bit of a conflicted history with the figure, having deleted and re-added Korean exchange figures. It doesn’t include OTC trading, which is a meaningful portion of all global exchange. Remember that 0-fee exchange volume should be taken with a grain of salt.

Next, generatedCoins. This refers to the number of new coins that have been brought into existence on that day. We count up the actual number of newly-minted coins, rather than using the stated inflation figures (i.e. for bitcoin you should expect 12.5 per block, every ten minutes, giving you 12.5*6*24 = 1800 coins per day). You can see that we’ve been exceeding 1800 coins per day recently – this is due to lots of new hashpower coming online and making blocks arrive every 8 or 9 minutes rather than every 10. In practice, since hashpower is continually added to the system, Bitcoin inflation progresses slightly faster than its theoretical rate. This is also why our figures differ from those of other websites – we count up the actual number of new coins rather than just assuming the official inflation rates are correct.
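The arithmetic behind the 1800-coin figure, and the effect of faster blocks, can be checked with a short Python sketch. It assumes a constant average block interval over the whole day and the 12.5 BTC subsidy quoted above; real days vary block by block.

```python
BLOCK_REWARD_BTC = 12.5      # block subsidy quoted in the post
MINUTES_PER_DAY = 24 * 60


def expected_coins_per_day(block_interval_minutes, reward=BLOCK_REWARD_BTC):
    """New coins per day if blocks arrive at a constant average interval."""
    blocks_per_day = MINUTES_PER_DAY / block_interval_minutes
    return blocks_per_day * reward


for interval in (10, 9, 8):
    print(interval, expected_coins_per_day(interval))
```

At the 10-minute target this gives 1800 coins per day; at a 9-minute average it gives 2000, and at 8 minutes 2250. That is the sense in which faster blocks push issuance above the theoretical rate.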

Lastly, Fees. Fees in our data are based on the native currency, not USD. So on January 28th, fees totaled 168.25 BTC. That’s about $1.88m. This has been a source of confusion for many, so again – fees are counted in the native currency. You have to multiply by unit price to obtain the USD value of fees. Fees are interesting because they can’t really be faked – either you want to use the blockchain, and you’ll pay to do it, or you choose not to.
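As a minimal Python sketch of that conversion: the fee total is the figure quoted above, while the price used here is simply backed out of the rounded "$1.88m" figure, so treat it as an implied, approximate value rather than an actual quote from the dataset.

```python
def fees_in_usd(fees_native, price_usd):
    """Convert a fee total in the native currency to USD."""
    return fees_native * price_usd


fees_btc = 168.25               # January 28th fee total from the post
approx_fees_usd = 1_880_000     # the post's rounded "$1.88m"

# Price implied by the two rounded figures (an approximation, not a quote).
implied_price = approx_fees_usd / fees_btc
print(round(implied_price))

# Going the other way: native fees times unit price gives the USD value.
print(round(fees_in_usd(fees_btc, implied_price)))
```

The same multiplication applies to any row of the CSV: take the fees value and multiply it by that day's price.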

So to conclude, on-chain volume and transaction count can both be faked and can be tricky to estimate. Exchange volume must be viewed fairly skeptically. Market cap has a whole host of methodological issues. Generated coins and fees, however, are much more concrete.

We hope this helps you assess our data. Do not take it as gospel – use it only with the appropriate dose of skepticism.

2 comments

  1. Hi there,

    thank you for providing the dataset. It is very useful for my research.

    Here I have a question:

    In the file lsk.csv, in the beginning period, the marketcap is zero but the exchange volume is not. How can this happen? Is it an error in the construction?

    All the best, Cedric

    1. I might understand it after I read your post on the marketcap. Maybe there is no circulation in that moment.

      Thanks!
