14

If I have transaction A and wish to verify that it is in the block, and I also have the hash of transaction B and the hash of the hashes of A and B, then can't I just hash A and then see if what I get by hashing A's hash and B's hash is equal to what is in the Merkle tree? Why do I have go all the way to the root?

3mrsh
  • 65
  • 5
Jeff
  • 273
  • 1
  • 2
  • 6

3 Answers3

19

If you have A, HB and HAB you can obviously check whether A fits. As Jestin noted this is essentially a full Merkle tree with two leaves.

However, as a thin client, you only have readily available the Merkle root (which is in the block header) and get told about A. The intermediate levels of the Merkle tree are not provided, therefore, to calculate them, you'd need the block's complete set of transactions.

https://i.stack.imgur.com/2Ep7y.png Image via Mastering Bitcoin

So, for a thin client, we calculate the Merkle branch instead. For the Merkle branch, we just need the transaction's position in the block's transaction list and the hashing partners at each level, instead of the complete set of transactions. A Merkle branch is impractical to fake because it would require finding of a hash collision (which is not doable, or mining wouldn't work). So by going up the tree and combining our result with the respective hashing partner at each level, we finally get the Merkle root. Thus we can prove membership of the transaction in the block.

In the example image, you only need to provide the blue information to link HK to the Merkle root, whereas checking the whole Merkle tree would require all transactions from A to P and the Merkle root.

Murch
  • 75,206
  • 34
  • 186
  • 622
  • But don't you need the path which in fact consists of intermediate levels? – Jeff Jan 09 '17 at 20:38
  • @Jeff: I've edited to clarify. :) – Murch Jan 09 '17 at 20:40
  • I assume the client who wants to verify the existence of the transaction tells the full node (?) the transaction id (?) and the full node then provides the path? What makes the client believe a given transaction is in a particular block? And when would we first detect that transaction K is not in the block? When you say "blue" I think you mean solid blue? – Jeff Jan 09 '17 at 20:48
  • Thin clients only get the block headers usually. They give a bloom filter to the full nodes which resolves to transactions they are interested in. In response full nodes tell them about transactions that are found with the bloom filter. The thin client then requests the Merkle branch for any transaction they are interested in. It is hard to prove that a transaction is not in a block, but it is easy to prove that it is. If nobody can provide proof that K is in the block, it probably isn't. –– Yes, I meant solid blue. – Murch Jan 09 '17 at 23:51
  • I am still not understanding why we need the full Merkle branch: why would not a single hashing partner be sufficient? – Jeff Jan 10 '17 at 17:09
  • 2
    @Jeff: If you were trying to verify that K is in the block, and you only get HL, you can merely calculate HKL, but you don't know whether HKL is actually in the tree. To check that, you have to hash it with HIJ, but again, you don't know if the resulting HIJKL is in the tree. So, you go up until you can calculate the Root, because if your calculation matches the information that you have from the block header, you know that the transaction fits the block header. – Murch Jan 10 '17 at 22:32
  • @Jeff: Meanwhile, you're fairly sure that the block header is correct, because it adheres to the difficulty requirement and it would be quite the investment to fake a block header that adheres the difficulty requirement. Even better when you learn about new block headers that reference the block header of your transaction's block and also adhere the difficulty requirement. – Murch Jan 10 '17 at 22:34
  • Thanks for your explanation. Is there an example with a small tree that demonstrates this? – Jeff Jan 13 '17 at 05:25
  • @Jeff: The example I gave corresponds to the image in the answer. Or what do you mean? – Murch Jan 13 '17 at 17:18
  • Well, one question is, one you say: You don't know whether HKL is actually in the tree, does this mean that a "malicious" full node might deliberately give you a false hashing partner? – Jeff Jan 13 '17 at 23:35
  • 1
    @Jeff: If they give us a wrong hashing partner, a few levels further up the result will not match the Merkle root, unless they managed a successful preimage attack i.e. found a second input that creates the same given output. The latter is thought to be infeasible for SHA-256d. – Murch Jan 14 '17 at 00:42
  • When starting with Hk and given Hl, how does one know which of (Hk + Hl) or (Hl + Hk) to compose when checking back up to the root? Does the Merkle Path include a branch-left/right indicator? – GoZoner May 15 '18 at 17:01
  • @GoZoner: Good question! The full information set includes the transaction you're getting the branch for, the transaction's position in the block's transaction list and the list of its hashing partners up to the Merkle root. You can calculate from the position whether it's the left or right hashing partner for each hashing partner. — I've updated my answer. – Murch May 15 '18 at 17:25
3

Merkle trees are used extensively by SPV nodes. SPV nodes don’t have all transactions and do not download full blocks, just block headers. In order to verify that a transaction is included in a block, without having to download all the transactions in the block, they use an authentication path, or a merkle path. To understand why we need authentication path, you need to understand how Merkle trees work.

A merkle path is used to prove inclusion of a data element. A node can prove that a transaction K is included in the block by producing a merkle path that is only four 32-byte hashes long (128 bytes total). The path consists of the four hashes (shown with a blue background in A merkle path used to prove inclusion of a data element) HL, HIJ, HMNOP, and HABCDEFGH. With those four hashes provided as an authentication path, any node can prove that HK (with a green background at the bottom of the diagram) is included in the merkle root by computing four additional pair-wise hashes HKL, HIJKL, HIJKLMNOP, and the merkle tree root (outlined in a dashed line in the pic below)merkle path

user2203937
  • 319
  • 2
  • 9
  • 1
    Are those intermediate hashes (e.g. HMNOP) cached somewhere? Or do you also need to compute them from the other transactions in the block? – Andrew Jan 27 '20 at 03:07
1

What you are proposing is essentially a Merkle tree with 2 transactions. The hash of the hashes is the Merkle root, and the hash of B is the rest of the Merkle path. You are going all the way to the root.

When you scale this up, you can provide the next highest node in the tree...but what verifies that it belongs in the tree? You'll need to provide the hashes all the way up in order to verify, because it's the Merkle root that is hashed into the proof of work algorithm.

Jestin
  • 8,812
  • 1
  • 22
  • 32
  • But why would this fail if transaction A was one of 100s of transactions? If A were not in the block, then A's hash would not be in the hash of the hashes of A and B, right? – Jeff Jan 09 '17 at 18:07
  • @Jeff, are you saying that 100 hashes would be concatenated and then hashed? If so, that's a o(n) solution, compared to the o(log n) solution that a Merkle tree provides. – Jestin Jan 09 '17 at 18:10
  • I may not be /probably am not understanding the procedure which involves a "Merkle path" but it seems to me that the absence of A in the block would become apparent without having to use all of the nodes in the path. – Jeff Jan 09 '17 at 18:14
  • 2
    I think the part that you are missing is that intermediate nodes in the tree are not always available, only the root. Therefore, the path back to the root must also be provided. Check the "transaction-data" section of the developer guide (https://bitcoin.org/en/developer-guide#transaction-data), specifically the part about Simplified Payment Verification. – Jestin Jan 09 '17 at 19:19