7th update of 2022 on BlockTrades work on Hive software

![blocktrades update.png](https://images.hive.blog/DQmSihw8Kz4U7TuCQa98DDdCzqbqPFRumuVWAbareiYZW1Z/blocktrades%20update.png)

Below are highlights of some of the Hive-related programming issues worked on by the BlockTrades team since my last post.

# Hived (blockchain node software) work

### Mirrornet (testnet that mirrors traffic from mainnet) to test p2p code

We started testing the develop branch of hived with the mirrornet code this week. In this process, we found and fixed a couple of problems with the mirrornet itself (it wasn’t compatible with the default Boost library on Ubuntu20, and transactions ported from blocks in the mainnet were using the block time as their expiration time instead of their original expiration time).

More importantly, we found a longstanding performance issue with the peer-to-peer code that limits the rate at which transactions can be shared over the network: each node requests at most one transaction at a time from each of its peers. So, for example, with an average latency of 100ms to all its peers, a node could get at most 30 transactions per peer during a block interval (3s block interval / 0.1s = 30).

This problem showed up very clearly when we were testing a mirrornet configured with only 3 nodes, with one node receiving the transactions from the mirrornet and sharing the transactions to a close-by node (1ms latency) and a far-away node (200ms latency). The close-by node was able to receive the transactions in a timely manner and fill its blocks with transactions, but the far-away node was receiving transactions at such a low rate that it got them only after they had either expired or been put into the other producer’s block, so it was only able to create 0-transaction blocks.

Empirically, we’ve observed this problem occasionally on the mainnet as well, for nodes that are far away from the other nodes during high-traffic times, but the problems have been transient and therefore difficult to analyze.
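The per-peer bound quoted above (3s block interval / 0.1s = 30) can be written out as a small calculation. This is only a back-of-the-envelope sketch, not hived code: the function name and the `in_flight` parameter are hypothetical, and the model assumes each request costs exactly one latency period and ignores transfer and processing time.

```python
BLOCK_INTERVAL_MS = 3000  # 3s block interval, as stated in the text


def max_tx_per_peer_per_block(latency_ms: int, in_flight: int = 1) -> int:
    """Rough upper bound on transactions fetched from one peer per block.

    Assumption: one request occupies one full latency period, and up to
    `in_flight` requests to the same peer may overlap. `in_flight=1`
    models the one-request-at-a-time behavior described above.
    """
    if latency_ms <= 0 or in_flight < 1:
        raise ValueError("latency_ms must be > 0 and in_flight must be >= 1")
    # Integer division: only fully completed request periods count.
    return (BLOCK_INTERVAL_MS // latency_ms) * in_flight


if __name__ == "__main__":
    # Latencies taken from the text: close-by node, average case, far-away node.
    for latency in (1, 100, 200):
        bound = max_tx_per_peer_per_block(latency)
        print(f"{latency}ms latency: at most {bound} transactions per block")
```

Under this simplified model, the far-away node at 200ms is capped at 15 transactions per peer per block, which is consistent with it falling behind while the close-by node keeps up.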
In the more constrained environment of the mirrornet, the problem was much easier to reproduce and analyze because we had full control of the network topology and access to all the nodes in the network. It was also helpful to be able to run all the nodes with the new p2p code: during our search for the performance bottleneck, that allowed us to discount potential performance problems related to the locking protocol between the p2p code and the blockchain processing code, since those had already been fixed.

### Planned p2p performance enhancement

To address the issue above, we’re updating the p2p code to allow a node to request more than one transaction at a time from a peer. This brings up the issue of how a node can best load balance its transaction requests across its peers, since multiple peers can have the same set of transactions the node wants to fetch.

Previously, load balancing was achieved automatically: since we only allowed one active request to any peer, the node would just round-robin its requests to peers, for the most part. Since we are now allowing multiple active requests to a peer, we need a new load-balancing algorithm. We’re using two simple heuristics for this: 1) the code won’t request more than 100 transactions from any given peer at a time and 2) we will favor low-latency peers over high-latency ones. This should result in transactions getting shared to peers as quickly as possible.

As a side note, I suspect that the above performance problem has probably been limiting average block size as well and preventing many “full” blocks, so once we’ve implemented the above enhancement we will also take a look at how the updated code functions in scenarios where we get a series of “full” blocks, since this has never been tested much (if at all) in the past.

### Completed implementation of OBI (one-block irreversibility) protocol

As mentioned last week, the new block finality protocol has been coded and is waiting for testing resources to free up.
I’ll publish a separate post about the implementation and effects of the new protocol on Monday.

# Hive Application Framework (HAF)

### Reduced size of hashes stored in HAF databases

As reported last week, we’ve changed the storage mechanism to store hashes in pure binary form, which reduces the size of the transactions table (and some indexes) and should also improve performance somewhat. Benchmarking showed that this reduced overall HAF database size by about 10%.

### Benchmarking of various HAF server configurations (using ZFS in particular)

We’ve continued to benchmark HAF running in various hardware and software configurations to figure out optimal configurations for HAF servers in terms of performance and cost effectiveness. The majority of the experiments we’ve done so far are recorded here (this doesn’t yet include some of the more exotic tests we’ve done, such as enabling huge_pages and using the postgres CLUSTER command, nor a set of various pg_bench tests we’ve performed): https://gitlab.syncad.com/hive/haf/-/wikis/benchmarks

### Dockerized version of HAF server

We’ve begun experimenting with various dockerized variations of a HAF server, to determine which configuration will offer the most performance while offering the easiest options for server setup and administration.

# HAF account history app (aka hafah)

We’ve created a dockerized version of hafah (with postgrest server) that will be used by CI tests, and we’ll be adding performance results for it to the benchmark table above soon (hopefully tonight). Assuming this version performs as well as expected based on previous standalone benchmarks, we’ll replace the current python-based hafah with the dockerized postgrest-based one on our production API node (api.hive.blog) in the next few days.
# Hivemind (social media middleware server used by web sites)

This week we identified and fixed the remaining test failures for the new HAF-based hivemind that replaces the old hivemind, so the next steps will be to do some cleanup work and then begin performance testing.

# Some upcoming tasks

* Implement p2p protocol changes to increase transaction-sharing bandwidth and test changes on mirrornet.
* Modify hived to process transactions containing either NAI-based assets or legacy assets.
* Complete work on resource credit rationalization.
* More dockerization and CI improvements for HAF and HAF apps.
* Collect benchmarks for a hafah app operating in “irreversible block mode” and compare to a hafah app operating in “normal” mode.
* Test postgrest-based hafah on production server (api.hive.blog).
* Cleanup and benchmarking of HAF-based hivemind. If this goes well, we will also deploy and test on our API node.
* Finish testing of new one-block irreversibility (OBI) code.

# When hardfork 26?

Two weeks ago we were mostly in a “testing-only” mode, so I was hopeful we could still complete the hardfork by the end of this month, since we only had one outstanding big change to hived (RC rationalization). But yesterday we discovered that we needed to code up two new features: a fix for the p2p bottleneck and support for transactions with NAI and legacy assets in hived (originally it appeared that this work was complete, but further investigation revealed that there were cases that weren’t covered).

With three significant code changes still to be made to hived, I think it is best to push back the hardfork date till the end of next month, in order to allow proper testing time for all these new changes once they are implemented (especially as the last two tasks are being done by the same developer).

See: 7th update of 2022 on BlockTrades work on Hive software by @blocktrades