How To Calculate Merkle Root?

I found this example:


Current merkle hash list:
32650049a0418e4380db0af81788635d8b65424d397170b8499cdc28c4d27006
30861db96905c8dc8b99398ca1cd5bd5b84ac3264a4e1b3e65afa1bcee7540c4

Current merkle hash list:
d47780c084bad3830bcdaf6eace035e4c6cbf646d103795d22104fb105014ba3


I have tried to calculate the hash using PHP:

$tx1="32650049a0418e4380db0af81788635d8b65424d397170b8499cdc28c4d27006";
$tx2="30861db96905c8dc8b99398ca1cd5bd5b84ac3264a4e1b3e65afa1bcee7540c4";
$v=hash('sha256', hash('sha256',$tx1.$tx2) );

But I got:
2b5f377b4adab64f489a2f73605ffb448b8add5b708d218729f9dfc58f1f5fe2

What is wrong?
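
Two things commonly go wrong with this kind of calculation. The PHP snippet hashes the 128-character hex text rather than the 64 raw bytes it represents, and it hashes the hex output of the inner hash as text as well. Bitcoin also displays hashes in reversed byte order. The sketch below is in Python rather than PHP and assumes the example follows Bitcoin's conventions (double SHA-256 over the two hashes in internal byte order). The helper names `sha256d` and `merkle_parent` are made up for illustration, and the output has not been checked against the expected value here.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Double SHA-256 over raw bytes (not over hex text)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_parent(left_hex: str, right_hex: str) -> str:
    # Displayed hashes are byte-reversed, so convert to internal order first.
    left = bytes.fromhex(left_hex)[::-1]
    right = bytes.fromhex(right_hex)[::-1]
    # Hash the 64 raw bytes, then reverse again to get the display order.
    return sha256d(left + right)[::-1].hex()

tx1 = "32650049a0418e4380db0af81788635d8b65424d397170b8499cdc28c4d27006"
tx2 = "30861db96905c8dc8b99398ca1cd5bd5b84ac3264a4e1b3e65afa1bcee7540c4"
print(merkle_parent(tx1, tx2))
```

In PHP the equivalent ingredients would be `hex2bin`, `strrev`, and passing `true` as the third argument of `hash` to get raw binary output.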

hash – Adding instead of concatenating hashes in Merkle trees

There are a number of issues here, with different answers.

Can Merkle trees use a commutative operation in general to combine hashes?

Yes, but only if they aren’t intended to commit to the order of the leaves.

Clearly when a commutative operation is used, (A,B,C,D) and (D,C,A,B) will hash to the same thing. This is not a problem if the Merkle root is intended to be a commitment to the (multi)set of leaves, but it is if it is intended to be a commitment to the list.
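
A small sketch of that effect, assuming SHA-256 and a fixed four-leaf tree. `combine_add` is a hypothetical commutative combiner used only to show the ordering behaviour; it is not a recommendation.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def combine_add(x: bytes, y: bytes) -> bytes:
    # Hypothetical commutative combiner: add as 256-bit integers, then hash.
    total = (int.from_bytes(x, "big") + int.from_bytes(y, "big")) % (1 << 256)
    return h(total.to_bytes(32, "big"))

def combine_concat(x: bytes, y: bytes) -> bytes:
    # Usual order-sensitive combiner: hash of the concatenation.
    return h(x + y)

def root4(leaves, combine):
    # Root of a four-leaf tree: combine pairs, then combine the two results.
    a, b, c, d = (h(leaf) for leaf in leaves)
    return combine(combine(a, b), combine(c, d))

abcd = [b"A", b"B", b"C", b"D"]
dcab = [b"D", b"C", b"A", b"B"]
print(root4(abcd, combine_add) == root4(dcab, combine_add))        # True
print(root4(abcd, combine_concat) == root4(dcab, combine_concat))  # False
```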

Could the Bitcoin transaction Merkle tree have used a commutative operation?

Maybe, it’s hard to talk about hypotheticals.

The order of transactions in a block is relevant (transactions can spend outputs created by previous transactions in the same block), so you want to prevent a peer from reordering the transactions to invalidate it without breaking the commitment. Obviously alternative solutions could have been used here, either by encoding the order explicitly, or by performing a topological sort on the set of transactions before verification.

Obviously this cannot be changed anymore in the actual Bitcoin protocol without a very invasive hardfork.

Can you use addition or xor as commutative hash combination function?

Not without security reduction.

Imagine a 2-element Merkle tree with leaf elements A and B. Their hash is H = hash(leafhash(A) + leafhash(B)). An attacker who knows A and B can use a generalized birthday attack to find other leaves C and D such that leafhash(C) + leafhash(D) = leafhash(A) + leafhash(B). Perhaps surprisingly, this only needs ~2^128 work if leafhash is a generic (and secure) 256-bit hash function. By doing so, the attacker has managed to perform a second preimage attack on the Merkle tree construction, in only the square root of the time that would normally be expected for second preimage security.

There may be reductions in other security properties too.

Are there other feasible commutative hash combination functions then?

Yes, but they don’t shrink the data.

For example, if the hashes are treated as integers modulo a large prime, then combining child hashes x and y as (x+y, xy) works (because the sum and product uniquely identify the inputs, but not their order). When working in a large characteristic-2 finite field (e.g. GF(2^256)), (x+y, x^3+y^3) also works.
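
The reason the sum and product pin down the unordered pair in the prime case can be spelled out as follows. The names s, p and q are introduced here for the sum, the product and the prime modulus.

```latex
s = x + y, \qquad p = xy
\quad\Longrightarrow\quad
(t - x)(t - y) \equiv t^2 - s\,t + p \pmod{q}
```

So x and y are exactly the two roots of t^2 - s t + p modulo q, and swapping x and y leaves s and p unchanged.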

Another much simpler possibility is simply sorting the elements: combining x and y as (min(x,y),max(x,y)) works fine.

If using secure commutative combination functions doesn’t shrink the data, then is there any point?

It means that you can prove an element is in the Merkle tree without revealing its position.

This is a minor bandwidth gain (log2(n) bits for a tree with n elements), reduces some implementation complexity, and may be a slight privacy improvement if the positions are sensitive. In fact, this approach is used in the proposed Taproot script trees (concatenating the hashes after sorting), precisely for the reason above.

Disclaimer: I’m a co-author of that proposal.
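
The sorted-pair idea and the position-free proof can be sketched as follows. This is a simplified illustration using plain SHA-256. The actual Taproot proposal uses its own tagged hashes, which are not reproduced here, and the function names are made up for the example.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def combine_sorted(x: bytes, y: bytes) -> bytes:
    # Sort the two child hashes before concatenating, so order does not matter.
    lo, hi = sorted((x, y))
    return h(lo + hi)

def verify_unordered(leaf_hash: bytes, siblings, root: bytes) -> bool:
    # No left/right bits are needed: each step just sorts and hashes.
    node = leaf_hash
    for sibling in siblings:
        node = combine_sorted(node, sibling)
    return node == root

leaves = [h(x) for x in (b"A", b"B", b"C", b"D")]
ab = combine_sorted(leaves[0], leaves[1])
cd = combine_sorted(leaves[2], leaves[3])
root = combine_sorted(ab, cd)

# Proof for leaf C: its sibling D, then the hash of the other subtree.
print(verify_unordered(leaves[2], [leaves[3], ab], root))
```

The proof for C is just a list of sibling hashes. Nothing in it says whether C sat on the left or the right at any level.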

Are hashes of Merkle tree roots unique throughout whole blockchain?

The "Mastering Bitcoin" book says that each block header contains a 32-byte hash, the Merkle root, summarizing all the transactions in the block.

If I extract all these Merkle roots from all the blocks in the blockchain (longest branch only) and put them in a list, will that list have any duplicates?

bitcoin core – how does merkle root building work for empty array?

It’s known that even if no transactions get created, Bitcoin still creates a block with an empty array of transactions.

So I am curious what the Merkle root would be. How does Bitcoin compute the Merkle root for an empty array? If there is even one transaction in it, I understand completely what it does, but what about an empty array?

Question about Merkle path verification

Merkle trees in general are useful in the context of a prover-verifier model.

A prover Peggy wants to prove to verifier Victor that a merkle root R, which Victor knows ahead of time, commits to a tree which includes a specific leaf L.

To do so, Peggy would send the element L as well as the Merkle proof (or branch) containing all hashing partners that L is combined with to produce R. In your example diagram, that proof consists of Hk, Hij, Hmnop, and Habcdefgh. Victor uses these elements to recompute R, and compares the result with his pre-existing knowledge of R.

So to answer your question: certainly something has to keep Hk (or the elements that hash to it, so it can be recomputed), but it doesn’t need to be the same party as the one that does the verification. If there is only one party involved, there is nothing to prove.
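
The exchange between Peggy and Victor can be sketched as below. The sketch assumes the diagram is a 16-leaf tree over elements a to p, with L being the leaf l. It uses plain SHA-256 over concatenation rather than any particular protocol's hashing rule, and all function names are illustrative.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaf_hashes):
    """All tree levels, leaves first; assumes a power-of-two leaf count."""
    levels = [leaf_hashes]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def make_proof(levels, index):
    """Peggy's side: collect the hashing partner at every level."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling of the current node
        index >>= 1
    return proof

def verify(leaf_hash, index, proof, root):
    """Victor's side: recompute the root from the leaf and the partners."""
    node = leaf_hash
    for sibling in proof:
        # The low bit of the index says whether the node is a right child.
        node = h(sibling + node) if index & 1 else h(node + sibling)
        index >>= 1
    return node == root

names = "abcdefghijklmnop"
leaves = [h(c.encode()) for c in names]
levels = build_levels(leaves)
root = levels[-1][0]

idx = names.index("l")
proof = make_proof(levels, idx)  # Hk, Hij, Hmnop, Habcdefgh
print(verify(leaves[idx], idx, proof, root))
```

Only `make_proof` needs access to the tree (or the data to rebuild it). `verify` needs nothing beyond the leaf, its index, the four partner hashes and the root.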

bitcoin core – Do full nodes store the complete merkle tree or do they regenerate it when creating a merkle proof?

I understand what the merkle root is for. And I understand that blocks don’t store the merkle tree.

Question 1) Is there any place that the complete merkle trees get stored? I don’t mean the merkle root hashes since I know they are in the block headers.

Question 2) Let’s say a full node starts proving to a light node that a specific transaction is in Block J. How does the full node send the merkle branch to the light node? Does it loop through the transactions again to get the hashes and then sends the interior node hashes of transactions, or do full nodes already have the complete merkle tree (whole tree and each internal hash) stored somewhere?

transaction verification – Is this Merkle hash root problem existent in Bitcoin?

In the Wikipedia article about Merkle trees, I was just reading this, unable to understand where the problem lies:

Second preimage attack

The Merkle hash root does not indicate the tree depth, enabling a second-preimage attack in which an attacker creates a document other than the original that has the same Merkle hash root. For the example above, an attacker can create a new document containing two data blocks, where the first is hash 0-0 + hash 0-1, and the second is hash 1-0 + hash 1-1.

One simple fix is defined in Certificate Transparency: when computing leaf node hashes, a 0x00 byte is prepended to the hash data, while 0x01 is prepended when computing internal node hashes. Limiting the hash tree size is a prerequisite of some formal security proofs, and helps in making some proofs tighter. Some implementations limit the tree depth using hash tree depth prefixes before hashes, so any extracted hash chain is defined to be valid only if the prefix decreases at each step and is still positive when the leaf is reached.
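
A minimal sketch of the attack from the quoted passage, and of how the 0x00/0x01 prefixes block it. It assumes SHA-256 and a power-of-two number of data blocks; the function names and the block contents are invented for the example.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

# Scheme without any leaf/node distinction.
def leaf_naive(block): return h(block)
def node_naive(left, right): return h(left + right)

# Certificate Transparency style: 0x00 prefix for leaves, 0x01 for nodes.
def leaf_ct(block): return h(b"\x00" + block)
def node_ct(left, right): return h(b"\x01" + left + right)

def root(blocks, leaf, node):
    level = [leaf(b) for b in blocks]
    while len(level) > 1:
        level = [node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def forge(blocks, leaf):
    # Attacker's document: two blocks, each the concatenation of two leaf hashes.
    hs = [leaf(b) for b in blocks]
    return [hs[0] + hs[1], hs[2] + hs[3]]

original = [b"block 0-0", b"block 0-1", b"block 1-0", b"block 1-1"]

print(root(original, leaf_naive, node_naive)
      == root(forge(original, leaf_naive), leaf_naive, node_naive))  # True
print(root(original, leaf_ct, node_ct)
      == root(forge(original, leaf_ct), leaf_ct, node_ct))           # False
```

In the naive scheme the two-block document produces the same root as the four-block original. With the prefixes, a leaf hash can never be mistaken for an internal node hash, so the forged document gets a different root.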

My first questions are: Is this really a problem in Bitcoin? If yes, how is it being solved in Bitcoin core?

My second questions are: Could this problem be solved by storing the tree depth of each block directly in the block chain? Or speaking of Bitcoin, would that negatively affect the block validation procedure itself somehow?

merkle tree – How/where is the information regarding the mapping of a transaction to a particular block stored?

According to the Bitcoin white paper, a block contains Previous Hash, Nonce and Root Hash. The root hash is the Merkle tree root of all the transactions that have been confirmed in that particular block.

I read here in the fifth paragraph of the top answer that “a transaction that claims to have been from block #234133 we can get the transactions for that block, verify the Merkle tree, and know that the transaction is valid.”

So say a block m somewhere in the blockchain holds n transactions. Where and how is the mapping of a particular set of transactions to a particular block stored, given that the blockchain itself only contains the root hash of all transactions to save space? Are there other hidden components that haven’t been published in the white paper, and are there resources to get a comprehensive idea about them?

Thanks.

merkleblock – How does a Merkle proof differ from the Merkle tree?

Merkle proofs are not for blocks nor is there a singular “the merkle proof.” Merkle proofs are for transactions. They prove that a particular transaction is contained within a particular block.

A merkle proof begins with the transaction that is being proved. Then each branch in the merkle tree that cannot be derived from the transaction is provided, all the way up to the root. The result is a path from the merkle root to the transaction. A verifier can then use this path to compute the merkle root and check that it matches the one in the block header.
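
As a rough sketch, here is how such a proof could be produced from a block's list of transaction hashes and then checked. It follows Bitcoin's rules of double SHA-256 and duplicating the last hash on levels with an odd count. The txids are placeholders, everything is kept in raw internal byte order, and the function names are illustrative rather than taken from Bitcoin Core.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_branch(txids, index):
    """Return (root, branch) for the transaction at `index`."""
    level = list(txids)
    branch = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])     # duplicate the last hash on odd levels
        branch.append(level[index ^ 1])  # sibling needed to climb one level
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index >>= 1
    return level[0], branch

def root_from_branch(txid, index, branch):
    """Verifier side: fold the branch back up to a candidate root."""
    node = txid
    for sibling in branch:
        node = sha256d(sibling + node) if index & 1 else sha256d(node + sibling)
        index >>= 1
    return node

txids = [sha256d(bytes([i])) for i in range(5)]  # placeholder txids
root, branch = merkle_branch(txids, 4)
print(root_from_branch(txids[4], 4, branch) == root)
```

The verifier compares the recomputed value with the Merkle root in the block header. With a single txid the loop never runs, so the root is that txid and the branch is empty.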