
I want to extract output addresses from tx messages in any bitcoin pcap.

Currently, I extract tx messages whose output script starts with 0x76 or 0xA9, but I don't know how to handle the other script types built from the opcodes listed here:

https://en.bitcoin.it/wiki/Script

Saeed

1 Answer


Most outputs follow one of the following 'standard' output types:

Legacy Outputs

  1. P2PK: scriptPubKey: <public_key> OP_CHECKSIG. Output pays the public key directly and hence does not have a direct address.

  2. P2PKH: scriptPubKey: OP_DUP OP_HASH160 <hash160 of pubkey> OP_EQUALVERIFY OP_CHECKSIG. For this kind of output you just need to base58check-encode the hash160 of the pubkey with version 0x00, and you will get an address that starts with 1.

  3. P2SH: scriptPubKey: OP_HASH160 <hash160 of redeem_script> OP_EQUAL. Again, you just need to base58check-encode the 20-byte hash of the redeem_script with version 0x05, and you will get an address starting with 3.

  4. Multisig: scriptPubKey: M <public_key1>...<public_keyN> N OP_CHECKMULTISIG. In this case, you cannot get the addresses as the output pays the public keys directly.
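As a minimal sketch of the P2PKH and P2SH cases above, the following Python uses only the standard library. The helper names `base58check` and `legacy_address` are illustrative rather than taken from any particular library, and the byte patterns assume the mainnet version bytes 0x00 and 0x05 mentioned in the list.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check(version: int, payload: bytes) -> str:
    """Encode version byte + payload + 4-byte double-SHA256 checksum in base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = B58_ALPHABET[rem] + encoded
    # Every leading zero byte is represented by a leading '1'.
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + encoded

def legacy_address(script: bytes):
    """Return the address for a P2PKH or P2SH scriptPubKey, else None."""
    # P2PKH: OP_DUP OP_HASH160 <push 20> <hash> OP_EQUALVERIFY OP_CHECKSIG
    if len(script) == 25 and script[:3] == b"\x76\xa9\x14" and script[23:] == b"\x88\xac":
        return base58check(0x00, script[3:23])
    # P2SH: OP_HASH160 <push 20> <20-byte script hash> OP_EQUAL
    if len(script) == 23 and script[:2] == b"\xa9\x14" and script[22:] == b"\x87":
        return base58check(0x05, script[2:22])
    # P2PK, bare multisig and anything else have no address of this form.
    return None
```

For example, `base58check(0x00, bytes.fromhex("62e907b15cbf27d5425399ebf6f0fb50ebb88f18"))` gives `1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa`, the well-known address of the genesis block's public key.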

Segwit Outputs

  1. P2WPKH: scriptPubKey: <0x00 (version)> <hash160 of pubkey>. You will need to create a bech32 address from the version and the 20-byte pubkey hash, and you will get an address starting with bc1. You can find the python script for bech32 encoding here.
  2. P2WSH: scriptPubKey: <0x00 (version)> <sha256 of witness script>. Again a bech32 address starting with bc1, but the hash is 32 bytes rather than the 20 bytes used for P2WPKH.
  3. P2SH(P2WPKH): scriptPubKey: OP_HASH160 <hash160 of redeem_script> OP_EQUAL. The encoding is the same as legacy P2SH: base58check with version 0x05. For a more detailed overview of how the redeem_script and addresses are generated for P2SH(P2WPKH), see my other answer here.
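The bech32 step for version 0 outputs can be sketched as follows. This follows the encoding described in BIP173 but is a condensed illustration, not the reference script linked above. The function names are my own, and `hrp="bc"` assumes mainnet.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def bech32_polymod(values):
    """BCH checksum used by bech32 (constants from BIP173)."""
    gen = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ v
        for i in range(5):
            if (top >> i) & 1:
                chk ^= gen[i]
    return chk

def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]

def convertbits(data, frombits, tobits):
    """Regroup a byte sequence into tobits-wide values, zero-padding the tail."""
    acc, bits, out = 0, 0, []
    maxv = (1 << tobits) - 1
    for b in data:
        acc = (acc << frombits) | b
        bits += frombits
        while bits >= tobits:
            bits -= tobits
            out.append((acc >> bits) & maxv)
    if bits:
        out.append((acc << (tobits - bits)) & maxv)
    return out

def segwit_v0_address(script: bytes, hrp: str = "bc"):
    """Return the bech32 address for a version 0 P2WPKH/P2WSH script, else None."""
    # Expected layout: 0x00, push length (0x14 or 0x20), then the 20- or 32-byte hash.
    if len(script) in (22, 34) and script[0] == 0x00 and script[1] == len(script) - 2:
        data = [0] + convertbits(script[2:], 8, 5)  # witness version, then program
        polymod = bech32_polymod(hrp_expand(hrp) + data + [0] * 6) ^ 1
        checksum = [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]
        return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)
    return None
```

With the BIP173 example script `0014751e76e8199196d454941c45d1b3a323f1433bd6`, this returns `bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4`. Later witness versions use a different checksum constant (bech32m), which this sketch does not cover.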

OP_RETURN

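People often like to embed some data in the Bitcoin blockchain. For that, you use the OP_RETURN opcode. The scriptPubKey is: OP_RETURN <push opcode, e.g. OP_PUSHDATA1> <number of bytes to push> <data>. For short data (up to 75 bytes) the push opcode is itself the length byte. Such an output is unspendable and has no address.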
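To decide which of the cases in this answer applies to a given output, one option is to match the whole script pattern instead of looking only at the first byte. The sketch below is an assumption-laden heuristic: `classify_script` and its return labels are hypothetical names, and the multisig check only looks at the first and last opcodes without validating the keys in between.

```python
def classify_script(script: bytes) -> str:
    """Roughly classify a scriptPubKey by its byte pattern."""
    n = len(script)
    # 76 a9 14 <20 bytes> 88 ac
    if n == 25 and script[:3] == b"\x76\xa9\x14" and script[23:] == b"\x88\xac":
        return "P2PKH"
    # a9 14 <20 bytes> 87 (also covers P2SH-wrapped segwit, which looks identical)
    if n == 23 and script[:2] == b"\xa9\x14" and script[22] == 0x87:
        return "P2SH"
    # 00 14 <20 bytes>
    if n == 22 and script[:2] == b"\x00\x14":
        return "P2WPKH"
    # 00 20 <32 bytes>
    if n == 34 and script[:2] == b"\x00\x20":
        return "P2WSH"
    # <push 33 or 65> <pubkey> ac
    if n in (35, 67) and script[0] == n - 2 and script[-1] == 0xAC:
        return "P2PK"
    # 6a ... (data carrier, no address)
    if n >= 1 and script[0] == 0x6A:
        return "OP_RETURN"
    # OP_M ... OP_N ae, where OP_1..OP_16 are 0x51..0x60 (heuristic only)
    if n >= 3 and 0x51 <= script[0] <= 0x60 and 0x51 <= script[-2] <= 0x60 and script[-1] == 0xAE:
        return "MULTISIG"
    return "NONSTANDARD"
```

Note that a P2SH(P2WPKH) output cannot be told apart from a plain P2SH output by its scriptPubKey alone. The difference only becomes visible when the output is spent.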

Ugam Kamat
  • This is technically not true - you have outputs on the chain that don't match any of the above, the most obvious being P2PK outputs from the early blocks before hash160 addresses were commonly used – Raghav Sood May 19 '19 at 11:02
  • So, which category do tx messages whose output scripts start with 0x00 belong to? It seems to be P2SH (like output scripts that start with 0xA9), but the output address generated that way is not valid. – Saeed May 20 '19 at 09:57
  • @Saeed Those are segwit outputs. Legacy P2SH needs to have OP_HASH160 ... OP_EQUAL. A native segwit output, however, is 0x00 <hash>. To a non-segwit-aware client, this looks like an anyone-can-spend output, but segwit-aware clients will look for the unlocking data in the witness section of the transaction. – Ugam Kamat May 20 '19 at 10:13
  • @UgamKamat How can I distinguish between legacy outputs and segwit outputs? As you said before, there are 6 different ways to extract the output address, depending on whether the output is legacy or segwit. Are these all the ways? – Saeed May 25 '19 at 08:08
  • @saeed As I said earlier, those are the current STANDARD outputs. Bitcoin is programmable money, and as a result accepts many ways of locking an output, but those will not be standard. – Ugam Kamat May 25 '19 at 08:18