
As the title says, I'm trying to extract bc1 addresses using opcodes I've found while reading about Bitcoin online. However, I can't find the opcodes associated with bc1. From what I gather, the segwit outputs here are described as not having opcodes. How can this be? Surely I'm misunderstanding something. Can anyone elaborate?

Vojtěch Strnad

2 Answers


Native segwit scriptPubKeys are of the form OP_n + <data>, so first a single number opcode followed by a push of some data (called the witness program). Specifically:

  • For P2WPKH (pay to witness pubkey hash, BIP141), OP_0 followed by a push of 20 bytes. Those 20 bytes are the Hash160 of a public key. Their corresponding address format is defined by BIP173.
  • For P2WSH (pay to witness script hash, BIP141), OP_0 followed by a push of 32 bytes. Those 32 bytes are the SHA256 of a script. Their corresponding address format is defined by BIP173 as well.
  • For P2TR (pay to taproot, BIP341), OP_1 followed by a push of 32 bytes. Those 32 bytes are a tweaked x-only public key. Their corresponding address format is defined by BIP350.
  • For future witness versions, any OP_1 through OP_16 followed by a push of something between 2 and 40 bytes (except OP_1 followed by 32 bytes, which is P2TR). These too have addresses associated with them in BIP350, but no semantics (yet).

In general, native segwit outputs do not contain any "active" opcodes like OP_CHECKSIG or OP_HASH160. They're just stubs that push some data; the segwit (and taproot) consensus rules know how to interpret them.
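
To make the patterns above concrete, here is a minimal sketch of how a scriptPubKey that has already been extracted as raw bytes could be matched against them. The function names `parse_witness_program` and `classify` are made up for illustration and are not taken from any library; the byte values used are OP_0 = 0x00 and OP_1 through OP_16 = 0x51 through 0x60.

```python
def parse_witness_program(script: bytes):
    """Return (version, program) for a native segwit scriptPubKey, else None.

    Hypothetical helper: expects exactly one version opcode followed by
    one direct push of 2 to 40 bytes, and nothing else.
    """
    if not 4 <= len(script) <= 42:
        return None
    op = script[0]
    if op == 0x00:                 # OP_0
        version = 0
    elif 0x51 <= op <= 0x60:       # OP_1 .. OP_16
        version = op - 0x50
    else:
        return None
    push_len = script[1]           # a direct push: one length byte, then the data
    if not 2 <= push_len <= 40 or len(script) != 2 + push_len:
        return None
    return version, script[2:]


def classify(script: bytes) -> str:
    """Name the output type using the rules listed in the answer."""
    parsed = parse_witness_program(script)
    if parsed is None:
        return "not native segwit"
    version, program = parsed
    if version == 0:
        if len(program) == 20:
            return "P2WPKH"
        if len(program) == 32:
            return "P2WSH"
        return "invalid v0 program length"
    if version == 1 and len(program) == 32:
        return "P2TR"
    return "future witness version"
```

For example, `classify(bytes.fromhex("0014" + "00" * 20))` matches the P2WPKH pattern. Note that this only works on a script that was cut out of the transaction by a real decoder, not on a pattern search through a hex dump.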

Pieter Wuille
  • It's listed under "N/A" there, because direct pushes aren't considered opcodes. Pushing 32 bytes is just 0x20 (hex value of 32) followed by the 32 bytes being pushed in the script encoding. – Pieter Wuille Feb 27 '23 at 15:34
  • One last question. I realize that you have shown me the opcodes that are in front of the public key. What about the opcodes following the public keys? I deleted my reference to the opcodes from that site since it's not actually for bitcoin. – bitcoinluvr6969 Feb 27 '23 at 15:55
  • Are they followed by op_return? as per https://bitcoin.stackexchange.com/questions/87832/how-i-can-extract-all-bitcoin-output-addresses-from-tx-message?noredirect=1&lq=1 – bitcoinluvr6969 Feb 27 '23 at 16:02
  • No, OP_RETURN makes any output unspendable. It's only included in so-called "data carrier" outputs, which are deliberately not intended to be spendable. As I stated in my answer, native segwit outputs are of the form OP_n + some data push. There are never any other opcodes involved in the scriptPubKey. – Pieter Wuille Feb 27 '23 at 18:30
  • I'm sorry to keep bringing up more questions but this is fascinating. So, how exactly do you signify the end of what needs to be decoded as the public key then? If it's OP_n, then it signifies exactly how many bytes to pull out? – bitcoinluvr6969 Feb 27 '23 at 19:51
  • @bitcoinluvr6969 Look, I get that you're trying to reverse engineer the entire transaction format based on just a hex dump. This is foolish; it'll make you miss all sorts of edge cases even if it works. Implement a proper transaction decoder, and if need be, a Script parser. You don't need to guess these things; the script field has a well-defined length in a transaction, so with a proper decoder you'll never need to guess. Just trying to pattern match for known byte sequences will only work some of the time. You'll find the basics at https://en.bitcoin.it/wiki/Protocol_documentation, for example. – Pieter Wuille Feb 27 '23 at 19:52
  • I understand that you're somehow constrained to not use existing libraries to do this parsing for you. But that means there is only one option: learn how the actual protocol works. And that means starting at the beginning: learn how messages are encoded on the wire. Then learn how transaction (and maybe block) messages are encoded. And when you have all that, you'll see where the scripts appear and what to send to a script decoder. – Pieter Wuille Feb 27 '23 at 19:55
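
As a rough illustration of the point made in these comments that the script field has a well-defined length, the sketch below walks a serialized transaction and returns each output's value and scriptPubKey. It is not a complete decoder: `read_varint` and `output_scripts` are hypothetical names, there is no bounds or error checking, the witness data and locktime are not parsed, and the segwit marker and flag bytes (0x00 0x01 right after the version) are assumed to be the only extension to the legacy layout.

```python
def read_varint(buf: bytes, pos: int):
    """Read a variable-length integer (CompactSize); return (value, new_pos)."""
    first = buf[pos]
    if first < 0xFD:
        return first, pos + 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    value = int.from_bytes(buf[pos + 1:pos + 1 + width], "little")
    return value, pos + 1 + width


def output_scripts(raw_tx: bytes):
    """Return [(value_in_satoshi, scriptPubKey_bytes), ...] for a raw transaction."""
    pos = 4                                    # skip the 4-byte version
    if raw_tx[pos] == 0x00 and raw_tx[pos + 1] == 0x01:
        pos += 2                               # skip segwit marker and flag
    n_inputs, pos = read_varint(raw_tx, pos)
    for _ in range(n_inputs):
        pos += 36                              # previous txid (32) + output index (4)
        script_len, pos = read_varint(raw_tx, pos)
        pos += script_len + 4                  # scriptSig + sequence
    n_outputs, pos = read_varint(raw_tx, pos)
    outputs = []
    for _ in range(n_outputs):
        value = int.from_bytes(raw_tx[pos:pos + 8], "little")
        pos += 8
        script_len, pos = read_varint(raw_tx, pos)   # explicit length, no guessing
        outputs.append((value, raw_tx[pos:pos + script_len]))
        pos += script_len
    return outputs
```

Each returned script is exactly the byte string a script decoder should receive, so there is never a need to search for where the witness program ends.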

I think you may be misunderstanding how an address is constructed and where scripting opcodes fit into that. The bc1 prefix indicates the bech32 address format defined in BIP 173 for SegWit version 0 (and also broken down here), or its bech32m variant defined in BIP 350 for SegWit version 1 and later. The hash of the witness script (SegWit version 0) or the tweaked internal key (SegWit version 1) is included within that address. The opcodes used in the witness script or the tweak of the internal key are only revealed on spending from the address. So to view and analyze the opcodes, you need to look at the transactions spending from a particular address rather than the transactions sending to a particular address.
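
To show how the witness version and program end up inside a bc1 address, here is a minimal sketch of the encoding side, written from the checksum algorithm in BIP 173 and the changed constant in BIP 350. The helper names (`polymod`, `hrp_expand`, `to_5bit`, `segwit_address`) are my own, and the sketch does no validation of the version or program length, so treat it as an illustration rather than a reference implementation.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32_CONST = 1             # checksum constant for witness version 0 (BIP 173)
BECH32M_CONST = 0x2BC830A3   # checksum constant for version 1 and later (BIP 350)


def polymod(values):
    """BCH checksum over a list of 5-bit values, as specified in BIP 173."""
    gen = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i in range(5):
            if (top >> i) & 1:
                chk ^= gen[i]
    return chk


def hrp_expand(hrp):
    """Expand the human-readable part ("bc") for checksum computation."""
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def to_5bit(data: bytes):
    """Regroup 8-bit bytes into 5-bit values, zero-padding the last group."""
    acc = bits = 0
    out = []
    for byte in data:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out


def segwit_address(version: int, program: bytes, hrp: str = "bc") -> str:
    """Encode a witness version and program as a bech32 or bech32m address."""
    const = BECH32_CONST if version == 0 else BECH32M_CONST
    data = [version] + to_5bit(program)
    check = polymod(hrp_expand(hrp) + data + [0] * 6) ^ const
    checksum = [(check >> (5 * (5 - i))) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)
```

Calling `segwit_address(0, program)` with the 20 bytes that follow `0014` in a P2WPKH scriptPubKey gives the bc1q address for that output. Nothing in the address says which opcodes (if any) sit behind the hash or key; that only becomes visible in the spending transaction.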

Michael Folkson
  • I think you’re mixing up “witness script” and “witness program”. Native segwit output scripts are defined as consisting of <version> <witness_program>. In P2WSH the witness program is the hash of the witness script, so your third sentence should therefore read “The hash of the witness script…”. – Murch Feb 27 '23 at 15:24
  • Thanks @Murch. Updated. – Michael Folkson Feb 27 '23 at 17:36