4th update of 2021 on our Hive software work
# Hived work (blockchain node software)
Lately our blockchain work has focused on the hived plugin that injects operations, transactions, and block data into a PostgreSQL database (for use by modular hivemind apps such as wallets, games, etc). We’ve made solid progress on this plugin since my last report.
We’ve never had a really great name for this plugin, but we may end up calling it the “SQL account history plugin” since it can, as a side benefit, effectively replace the rocksdb-based account history plugin. This is because this week we added API calls to the hived plugin so that it can serve up this Postgres-stored data the same way the rocksdb account history plugin does. In other words, a hived node operator can make use of this plugin even if they don’t operate a hivemind node; they just need to spin up a Postgres database (although I still expect performance advantages from adding hivemind to the mix, as I discuss further below).
Performance was one of the primary drivers for this work, and the results so far have been as good as I anticipated (but surprising to some of our more pessimistic devs). Every API performance test done so far shows the new plugin performing as well as or better than the existing rocksdb plugin.
The new plugin should really shine for operation history requests where filtering by operation type is desirable (previously, performance issues forced us to place limits on what a single API call of this type could request, and I hope we’ll be able to eliminate this limitation).
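To make the “drop-in replacement” point concrete, here’s a minimal sketch of how a client would call the account history API on a node running the new plugin; the request is the standard account_history_api interface, so nothing changes for the consumer. The endpoint URL, account name, and filter bit position below are just illustrative assumptions, and the operation-filter parameter names should be double-checked against your hived version.

```python
import requests  # assumes the 'requests' package is available

# Illustrative local endpoint; substitute your own hived/API node URL.
HIVED_URL = "http://127.0.0.1:8090"

def get_account_history(account, start=-1, limit=100, op_filter_low=None):
    """Fetch account history via JSON-RPC. The request looks the same whether
    the node answers from the rocksdb plugin or the new SQL-backed plugin."""
    params = {"account": account, "start": start, "limit": limit}
    if op_filter_low is not None:
        # Optional operation-type filter (a 64-bit bitmask, one bit per op
        # type). The exact parameter name and bit layout depend on the node
        # version, so verify them before relying on this.
        params["operation_filter_low"] = op_filter_low
    payload = {
        "jsonrpc": "2.0",
        "method": "account_history_api.get_account_history",
        "params": params,
        "id": 1,
    }
    response = requests.post(HIVED_URL, json=payload)
    response.raise_for_status()
    return response.json()["result"]["history"]

# Example: recent operations for an account, filtered to one operation type.
if __name__ == "__main__":
    for index, entry in get_account_history("blocktrades", op_filter_low=1 << 2):
        print(index, entry["op"]["type"])
```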
I believe we can achieve even better results when we implement the account history API on hivemind itself rather than via hived, where I think serialization issues still slow down the results quite a lot on large data requests. For larger result queries, I think we could easily reduce response latency by 4x or more, and these new calls should also be less costly in terms of CPU loading.
Both the hived-based API and the hivemind-based API will employ the same SQL queries, so we’ll be able to re-use much of this week’s work on the hived plugin when implementing account history API support inside hivemind.
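To sketch what that sharing looks like in practice (all table, column, and connection names below are hypothetical placeholders, not the plugin’s actual schema), both API layers would end up issuing essentially the same kind of Postgres query:

```python
import psycopg2  # assumes psycopg2 is installed and a Postgres instance is reachable

# Hypothetical connection string and schema: the real plugin defines its own
# table layout, so treat every name below as a placeholder.
conn = psycopg2.connect("dbname=hive_ops user=hive")

# Illustrative query: most recent operations touching an account, optionally
# restricted to a set of operation type ids.
QUERY = """
    SELECT o.block_num, o.trx_in_block, o.op_type_id, o.body
      FROM operations o                  -- placeholder table name
      JOIN account_operations ao         -- placeholder join table
        ON ao.operation_id = o.id
     WHERE ao.account_id = %(account_id)s
       AND o.op_type_id = ANY(%(op_types)s)
     ORDER BY o.block_num DESC, o.trx_in_block DESC
     LIMIT %(limit)s
"""

with conn, conn.cursor() as cur:
    cur.execute(QUERY, {"account_id": 12345, "op_types": [2], "limit": 100})
    for block_num, trx_in_block, op_type_id, body in cur.fetchall():
        print(block_num, op_type_id)
```

The point is just that the query layer is shared; only the serialization and transport differ between serving the result from hived and serving it from hivemind.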
## Some benchmark results from the SQL account history plugin
A full reindex of hived for 49 million blocks while running the SQL account history plugin took around 11 hours (as opposed to 15 hours for the rocksdb-based account history plugin). While this performance is already fairly impressive, there’s a reasonable chance we can still improve on this, with the absolute lower bound being the time required by a consensus-only replay (6 hours).
It’s also important to note that while we’re comparing the reindex time to that of the old account history plugin as an example benchmark, one of the primary goals for this work is to speed up the reindex time of hivemind (and other modular hivemind variants), which currently takes too long (~4 days).
So an even more interesting benchmark will be to compare how much faster hivemind reindexing becomes when it uses the SQL account history plugin. In the past week we completed the work in hivemind to switch from pulling block data directly from hived to consuming the data pushed into Postgres by the new plugin, so we’ll be performing a full hivemind re-index on a Postgres database that has been pre-populated with 49M blocks worth of operations. It’s hard to predict yet exactly how long this new re-index process will take, but I’m very optimistic about the results.
The SQL account history plugin currently generates 967GB of storage when we run the 49M block reindex. This is more than 2x the size of rocksdb-based storage, but the data is being stored in more desirable forms that can be served up faster during API calls, and with much less CPU loading. This is also a bit of an unfair comparison, as we’re including data here on the SQL account history plugin side that previously had to also be stored in the hivemind database.
We still don’t know what the final size of hivemind will be with this new data introduced, other than that the upper bound at the current block height should be around 1.5TB. I expect to have an initial answer on this in the coming week, after we complete the full hivemind sync test.
## Other work on hived
* We have one dev assigned to work on incorporating the accounting virtual ops code from the BlockTrades version of the hived code. That work will likely be completed in the coming week.
* We performed a preliminary review of the current state of the SMT code. Unfortunately, we found that the code was far from complete (not only is much of it untested, a fair amount hasn’t yet been implemented). We have a partially complete report on the state of the C++ code for SMTs, if any developers would like to review it.
* We’re creating a new wallet_bridge_api_plugin to reduce future headaches associated with upgrading the command-line interface wallet when we make updates to hived’s API.
# Hivemind (2nd layer microservice for social media apps)
As mentioned above, we completed the work for enabling a hivemind sync operation to be performed from injected operations data instead of pulling the data from hived. We’re currently running functional and performance tests on this code.
We made another improvement to the speed of the hivemind initial sync process (we added threading support to the post-initial-sync phase, just before “live sync” begins), but we had to fight our way through a few errors this created along the way.
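For readers curious what “threading the post-sync work” means in practice, the general pattern is simply to run the independent finishing steps in a small worker pool instead of one after another. The task names below are made-up placeholders, not hivemind’s actual job list; this is only a sketch of the technique.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder post-sync steps; hivemind's real jobs differ, this just shows
# the "run independent finishing steps in parallel" pattern.
def rebuild_follow_counts(): ...
def refresh_community_stats(): ...
def vacuum_hot_tables(): ...

POST_SYNC_TASKS = [rebuild_follow_counts, refresh_community_stats, vacuum_hot_tables]

def run_post_sync_tasks(max_workers=4):
    """Run each independent post-sync step in its own worker thread and
    re-raise the first exception instead of silently swallowing it."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in POST_SYNC_TASKS]
        for future in futures:
            future.result()  # surfaces any exception from the worker
```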
We also fixed a constraint problem that could occur when upgrading an existing hivemind database to use the latest code.
We’re currently fixing bugs related to the follow code and decentralized lists. A big part of this work is creating a comprehensive set of tests for both the old and new features. I expect we will complete that work this week. Next we will merge the current development branch of hivemind into the master branch and make an official Hivemind release for deployment by Hive API node operators (these are the node operators that serve up Hive data to the frontend applications).
There’s a known issue with the testing system when it comes to measuring the performance of individual API calls, but we expect to fix it in the upcoming week. We’ve temporarily disabled the performance measurement code in the meantime to avoid false-positive test failures.
# Condenser and wallet (https://hive.blog)
We completed testing and deployed new versions of hive.blog and the web wallet with the latest changes from our team and @quochuy.
I’ve also assigned a couple of UI devs to compare/contrast the current state and functionality of the code bases for ecency and hive.blog.
# Near-term work plans and work in progress
On the hived side, we will continue to work on the governance changes discussed in our six-month roadmap post. As for the modular hivemind plugin, we have a few ideas we plan to try in order to further speed up the re-indexing process.
On the hivemind side, we’ll continue work on tests and documentation, and prepare the next release of hivemind after the follow code passes all the new tests.
We’ll also be continuing our work on modular hivemind. So far, I’m very pleased with our progress on this project and I’m optimistic about its use as a foundation for a smart-contract platform and other 2nd layer apps.