What I know so far is that when a node starts for the first time, or after being offline for some time, Initial Block Download (IBD) begins. The node chooses one peer (the header sync peer) from its 8 outgoing connections. Initially, it sends getheaders messages only to this peer and receives headers from it. As the timestamps of the received headers approach the current time, the node starts sending getheaders requests to the remaining peers. Headers received from other peers can differ and can even form two chains starting from the genesis block.

What I'm interested in is at what point the node starts requesting blocks. Is it during IBD, or does it first download headers from all peers, build a complete header tree, and only then start downloading blocks for the chain with the most cumulative work and move the active chain tip?

By the way, if checkpoints are applied to header sync and, during IBD, I get from these other peers some header chain that starts before a checkpoint, how will my node accept it? If my sync peer gave me a wrong chain, I would stay blocked forever from getting the best chain.

Cosmos
  • Whenever there are headers in our header tree, and peers which we know to have blocks in that header tree that we don't have, we start asking them for those blocks. – Pieter Wuille Jan 08 '24 at 03:49
  • @PieterWuille Ok, thanks. What do you think about what I wrote regarding checkpoints? If my node chooses a bad peer, it would remain permanently blocked by that kind of attack. – Cosmos Jan 08 '24 at 03:55
  • There are various timeouts, at some point other peers get asked too. If a node gives us invalid blocks/headers, the network code may disconnect the peer too, etc. – Pieter Wuille Jan 08 '24 at 04:05
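
The rule from the first comment above can be sketched roughly as follows. This is only an illustration, not Bitcoin Core's code: the `Peer` class, the `blocks_to_request` function, and the plain-string hashes are hypothetical, and the sketch ignores timeouts, per-peer download limits, and any other conditions the real node applies.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    # Hypothetical model of a connected peer.
    name: str
    # Hashes of headers for which we believe this peer has the block.
    known_blocks: set = field(default_factory=set)


def blocks_to_request(header_tree, have_block, peers):
    """Assign each missing block to the first peer believed to have it.

    header_tree: iterable of header hashes already in our header tree.
    have_block:  set of hashes whose block data we already hold.
    Returns a dict {peer name: [hashes to request from that peer]}.
    """
    requests = {}
    assigned = set()
    for peer in peers:
        for block_hash in header_tree:
            if block_hash in have_block or block_hash in assigned:
                continue  # nothing to download, or already assigned
            if block_hash in peer.known_blocks:
                requests.setdefault(peer.name, []).append(block_hash)
                assigned.add(block_hash)
    return requests
```

Blocks that no connected peer is known to have are simply left unassigned until some peer announces them.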

1 Answer

What I'm interested in is at what point the node starts requesting blocks. Is it during IBD, or does it first download headers from all peers, build a complete header tree, and only then start downloading blocks for the chain with the most cumulative work and move the active chain tip?

There are two different processes.

The first is IBD. It is divided into two phases (because of header pre-sync). In the first phase, we select one peer as the sync peer and send it getheaders requests one at a time. As the timestamps of the received headers approach the current time, we start sending getheaders messages to other peers as well. Each peer responds with the corresponding headers; we validate them and store only tiny identifiers for them (nothing is stored in the block header tree at this point).
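
The switch from a single sync peer to all peers can be sketched as below. The 24-hour threshold is the figure given in the comments under this answer (the age of the best header in our header tree); the function name, its parameters, and the use of plain Unix timestamps are assumptions made for illustration only.

```python
import time

ONE_DAY = 24 * 60 * 60  # seconds; threshold taken from the comments below


def peers_to_ask_for_headers(peers, sync_peer, best_header_time, now=None):
    """Return the peers that should receive getheaders right now.

    best_header_time: timestamp of the best header in our header tree.
    While that header is a day or more old, only the sync peer is asked;
    once it is less than a day old, every peer is asked.
    """
    if now is None:
        now = time.time()
    if now - best_header_time < ONE_DAY:
        return list(peers)
    return [sync_peer]
```

For example, a node whose best header is ten days old would keep asking only its sync peer, while a node whose best header is an hour old would ask all of its peers.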

Only after a peer has sent us all the headers it has (signalled by a headers message with fewer than 2000 entries) do we move to the second phase.

In the second phase, we send getheaders messages only for headers that are part of a chain with enough cumulative work. There are two conditions: the cumulative work must be within one day of the current active chain tip, and it must be greater than a predefined minimum. Because the active chain tip at the very start is just the genesis block (or a few low-difficulty blocks), a malicious peer can easily satisfy the first condition, but not the second. In this phase it does not matter which peer we ask (the sync peer or any other); what matters is that we ask a peer that we know has the given header.
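
The two conditions can be combined into a single threshold, sketched below. Modelling "within one day of the tip" as 144 blocks' worth of work at the tip's difficulty is an assumption of this sketch (roughly one day at a 10-minute block interval), and all names and numbers are illustrative rather than taken from the client.

```python
BLOCKS_PER_DAY = 144  # assumption: about one day of 10-minute blocks


def work_threshold(tip_chain_work, tip_block_work, min_chain_work):
    """Work a header chain must reach before it is accepted.

    tip_chain_work: cumulative work of the active chain tip.
    tip_block_work: work of a single block at the tip's difficulty.
    min_chain_work: the predefined (hardcoded) minimum chain work.
    """
    one_day_of_work = min(BLOCKS_PER_DAY * tip_block_work, tip_chain_work)
    near_tip_work = tip_chain_work - one_day_of_work
    # Both conditions must hold, so the larger bound wins.
    return max(near_tip_work, min_chain_work)


def has_enough_work(header_chain_work, tip_chain_work, tip_block_work, min_chain_work):
    threshold = work_threshold(tip_chain_work, tip_block_work, min_chain_work)
    return header_chain_work >= threshold
```

With a fresh node (tip at genesis), the near-tip bound is zero, so only the predefined minimum protects against a low-work chain; once the node is synced, the near-tip bound dominates.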

You can read more about header pre-synchronization in this post.

The second process is roughly event-driven. Whenever a new header is added to the tree (the event), this process is triggered. If the various necessary conditions are met, a request for the block is sent to one of the peers that is believed to have it.

When a node has just started, initial synchronization "talks" to only one peer. So as the sync peer sends us headers and we validate them and put them in the tree, getdata requests for the corresponding blocks are sent to that same sync peer immediately.

Therefore, no! We do not first store all headers, build a complete tree, and then download blocks. Everything happens simultaneously. However, we do first download all headers for pre-sync, before doing anything else.

For this and much more information, see this post.

Cosmos
  • The pre-syncing process will prevent low-difficulty headers from being accepted, and without headers, no blocks will be downloaded/accepted either. – Pieter Wuille Jan 08 '24 at 05:07
  • @PieterWuille Fine, but at the beginning pre-sync starts from the genesis block. A low-difficulty chain that we get from a malicious peer will have more cumulative work than genesis alone, so will it be accepted? – Cosmos Jan 08 '24 at 05:12
  • No, there is a minimum amount of accumulated chain work hardcoded in the client, updated at every release. No headers chain below that value is accepted. See https://bitcoin.stackexchange.com/questions/76018/how-does-headers-first-prevent-fill-disk-attack/121235#121235: "(which means: enough to within one day of the active chain tip, and more than the preconfigured minimum chain work)". – Pieter Wuille Jan 08 '24 at 05:17
  • @PieterWuille Thanks, I've edited the answer. Is it okay now? – Cosmos Jan 08 '24 at 05:52
  • We do not wait for the headers sync to complete before requesting blocks (but pre-sync does need to complete, because no headers are added to the block header tree during presync). – Pieter Wuille Jan 08 '24 at 13:02
  • @PieterWuille Here you wrote that header presync only denotes the first download of headers from a peer and verifying that the chain has enough work, so the redownload and checking whether the headers match is not part of presync? However, as far as I know, we first need to redownload all of these headers, and if they ALL match, they are stored one by one in the header tree (no need to request the headers again, we already got them in the redownload). As they are added to the header tree, getdata messages for the blocks can be sent for them at the same time. – Cosmos Jan 08 '24 at 16:30
  • @PieterWuille What do you mean by presync in "but pre-sync does need to complete, because no headers..."? – Cosmos Jan 08 '24 at 16:32
  • Sorry, confusing naming. The concept of "headers presynchronization" consists of two (sequential) phases, the presync phase and the redownload phase. Nothing is added to the block header tree before the presync phase completes, but once that happens, during the redownload phase, blocks can start being requested immediately (before header synchronization completes). The "if they ALL match" refers to all the headers in the redownload buffer, not all headers in the entire chain. – Pieter Wuille Jan 08 '24 at 17:12
  • @PieterWuille Ok, so we take the entire header chain from a sync peer (not storing anything, just small IDs), calculate its work, and reject it if that is not enough. Otherwise, we start the second phase. That is the first phase (pre-sync). In the second phase (the redownload phase), we redownload as many headers as fit in the redownload buffer (for example 100 headers). If they match, we start header sync for these 100 headers (validation and storing in the tree). This also triggers block sync for them. And since the first 100 headers in the redownload buffer match, we continue by redownloading another 100 headers and checking that they match. – Cosmos Jan 08 '24 at 17:46
  • @PieterWuille If matching fails for something in the current redownload buffer, or header/block validation of it fails, further redownload is suspended. Is this correct? Also, at what moment do we start asking other peers for headers (not just the sync peer)? I know it is when the header timestamps start to approach the present time, but is that during the pre-sync phase or during redownload? – Cosmos Jan 08 '24 at 17:49
  • The redownload buffer as of Bitcoin Core is 14441 headers FWIW (significantly larger than what fits in a single headers message). I wouldn't say "we start header sync" only when the redownload buffer matches - the header sync process is the entire thing (from presync phase to redownload to eventual adding to block header tree). Header synchronization (the entire thing) is started by sending getheaders to a peer. We initially send it to one peer, but once the best header that actually made it into our header tree (regardless of how it got there) is less than 24h old, we ask everyone. – Pieter Wuille Jan 08 '24 at 17:58
  • @PieterWuille By "best header", do you mean the active chain tip, or the header that is last (by its number) in some sequence (it has no children but does not need to be the active chain tip; maybe it has no valid corresponding block, or one of its ancestors is invalid)? – Cosmos Jan 08 '24 at 18:14
  • The best header is just the best entry in the entire block header tree (it does not need to be the active chain tip). – Pieter Wuille Jan 08 '24 at 18:17
  • @PieterWuille I'm really sorry to spam and bother you, but the terminology is not entirely clear to me. What does "best header" mean? Is it the best in the sense that it is just the last one by sequence/serial number in the header tree that is still considered valid (it can become invalid if, during block synchronization, its block or the block of some ancestor turns out to be permanently invalid; then some earlier header by serial number that has not yet been declared invalid becomes the best)? So it does not have to be part of the chain with the most work, nor be the active chain tip, just be the last? – Cosmos Jan 08 '24 at 18:35
  • The best header is the one with the most accumulated work, and not known to be invalid or descendant of a known-to-be invalid block. It doesn't need to be the active chain tip, but it is the one that would become the active chain tip if we had all block data, and validation succeeds. – Pieter Wuille Jan 08 '24 at 18:43
  • @PieterWuille Thanks for all this help. You have done a lot for me, I appreciate it. – Cosmos Jan 08 '24 at 19:02
  • I realize I made a mistake in the comments above. We only start requesting blocks from a peer when they have announced/sent to us a header whose work exceeds the minimum chain work (the same hardcoded constant needed to get out of header presync). I've updated my other answer to reflect this. – Pieter Wuille Jan 08 '24 at 20:57
  • @PieterWuille Oh, okay, no problem. So the entire story about header sync, pre-sync, redownload, the moment when we ask other peers for their chains (not just the sync peer), etc. stays the same; the only difference is that we will not request a block immediately after its header is added to the tree, but only once we have downloaded enough headers that they are part of a chain in the header tree with enough chain work? Did I understand correctly? – Cosmos Jan 08 '24 at 21:28
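
The presync and redownload phases described in the comments above can be illustrated with a small state machine. This is a simplified sketch, not Bitcoin Core's implementation: the class name, the 4-byte salted-hash commitment stored for every header, and the `(hash, work)` tuples are assumptions made for illustration. Only the default buffer size of 14441 comes from the discussion above.

```python
import hashlib
from collections import deque


class HeadersSyncSketch:
    """Toy model of presync followed by redownload, for a single peer."""

    def __init__(self, min_chain_work, buffer_size=14441, salt=b"local-secret"):
        self.min_chain_work = min_chain_work
        self.buffer_size = buffer_size  # headers held back before release
        self.salt = salt                # keeps commitments unpredictable to peers
        self.phase = "presync"
        self.commitments = deque()      # small IDs remembered from presync
        self.total_work = 0
        self.buffer = deque()           # redownloaded, not yet released

    def _commit(self, header_hash):
        # Hypothetical commitment: 4 bytes of a salted hash per header.
        return hashlib.sha256(self.salt + header_hash).digest()[:4]

    def process(self, headers):
        """Feed one headers message: a list of (hash_bytes, work) pairs.

        Returns the hashes that may now be added to the block header tree.
        """
        if self.phase == "presync":
            # Remember only small commitments; store nothing in the tree.
            for header_hash, work in headers:
                self.commitments.append(self._commit(header_hash))
                self.total_work += work
            if self.total_work >= self.min_chain_work:
                self.phase = "redownload"
            return []
        if self.phase != "redownload":
            return []
        for header_hash, _work in headers:
            # Every redownloaded header must match what presync saw.
            if not self.commitments or self._commit(header_hash) != self.commitments.popleft():
                self.phase = "failed"
                self.buffer.clear()
                return []
            self.buffer.append(header_hash)
        done = not self.commitments
        released = []
        # Release headers once enough verified headers are buffered behind
        # them, or flush everything when the committed chain is exhausted.
        while self.buffer and (done or len(self.buffer) > self.buffer_size):
            released.append(self.buffer.popleft())
        if done:
            self.phase = "done"
        return released
```

In this model, block requests could begin as soon as `process` returns its first released headers during redownload, which is before the whole chain has been redownloaded.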