Title: [Code] Generating addresses from scratch (Python 3.6+)
Post by: PowerGlove on December 28, 2022, 07:22:40 AM

I thought it might be useful to have a completely self-contained Python script that generates Bitcoin addresses (both legacy P2PKH addresses, as well as bech32 P2WPKH addresses).
Most examples I've seen resort to using third-party packages, which makes it difficult for someone reading the code to follow (in detail) each of the steps involved. Even using Python's standard library has pitfalls, because the cryptographic hash functions included in Python are based on OpenSSL, which means that decisions coming from that project sometimes affect Python (e.g. some installations require additional configuration to make RIPEMD-160 available).

The following script uses no external libraries and makes no use of the cryptographic routines in the standard library, either. That way, this script should still run many years from now, and won't be affected by the activity of package maintainers, or even OpenSSL algorithm deprecations. It's original code, written by me, specifically for this post. Whenever I consulted reference material, I attached the relevant link(s).

Code:
#!/usr/bin/env python3

What does it do?

It generates Bitcoin addresses and WIFs for a given scalar. It's capable of producing 6 different address types (P2WPKH and compressed/uncompressed P2PKH, for both mainnet and testnet); there are boolean (True/False) variables at the top of the script that you can modify to your liking (show_testnet, show_p2pkh_uncompressed, show_p2pkh_compressed, and show_p2wpkh).
By default, only show_p2pkh_compressed and show_p2wpkh are set to True, so when you run it you'll see output like the following:

       +------+--------------------+
       | Type | Legacy, Compressed |
    +--+------+--------------------+---------------+
    | Address | 1BgGZ9tcN4rm9KBzDn7KprQz87SZ26SAMH |
+---+---------+------------------------------------+-----------------------+
| Private Key | p2pkh:KwDiBf89QgGbjEhKnhXJuH7LrciVrZi3qYjgd9M7rFU73sVHnoWn |
+-------------+------------------------------------------------------------+

       +------+---------------+
       | Type | Native SegWit |
    +--+------+---------------+----------------------------+
    | Address | bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4 |
+---+---------+--------------------------------------------+----------------+
| Private Key | p2wpkh:KwDiBf89QgGbjEhKnhXJuH7LrciVrZi3qYjgd9M7rFU73sVHnoWn |
+-------------+-------------------------------------------------------------+

The above two addresses were generated from the scalar 0x1, like this:

Code:
$ python3 make_address.py 0x1

If you run it without arguments (that is, without supplying a scalar) it will use a randomly-generated 256-bit scalar, and could be used (for example) to make a cold wallet on an air-gapped computer.

How are you generating random scalars?

With the secrets module. Specifically, the secrets.randbits function. According to the Python documentation, that module will use "the most secure source of randomness that your operating system provides." If you don't trust that module to do a good job, then you can supply your own scalar (in hexadecimal) as a command-line argument.
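For readers who want to see the scalar-selection step in isolation, here is a minimal sketch. It is not taken from the script: the names SECP256K1_GROUP_ORDER, is_valid_scalar, random_scalar and parse_scalar are hypothetical, and the range check assumes the usual requirement that a private key is an integer between 1 and n - 1, where n is the order of the secp256k1 group.

```python
import secrets

# Order of the secp256k1 group (n). Assumption: a usable private key
# is an integer in the range [1, n - 1].
SECP256K1_GROUP_ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def is_valid_scalar(scalar: int) -> bool:
    """Return True if the scalar can serve as a secp256k1 private key."""
    return 1 <= scalar < SECP256K1_GROUP_ORDER


def random_scalar() -> int:
    """Draw 256 random bits and retry until the value is in range.

    Out-of-range draws are astronomically unlikely, but retrying keeps
    the result uniform over the valid range.
    """
    while True:
        candidate = secrets.randbits(256)
        if is_valid_scalar(candidate):
            return candidate


def parse_scalar(text: str) -> int:
    """Parse a hexadecimal scalar such as "0x1" (the "0x" prefix is optional)."""
    scalar = int(text, 16)
    if not is_valid_scalar(scalar):
        raise ValueError("scalar must be between 1 and n - 1")
    return scalar
```

A caller could then use parse_scalar(sys.argv[1]) when an argument is supplied and fall back to random_scalar() otherwise, which mirrors the one-argument-or-none interface described above.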
One way to randomly generate a 256-bit value would be to execute a command like:

Code:
$ dd if=/dev/urandom bs=1024 count=1 status=none | sha256sum -b --tag

Which will produce output that looks similar to this:

SHA256 (-) = 80aa0c48c65e9f2962e462c76c3e4f6d0c48a1e2a9359ecc47c8a12679f3ab98

Then you can take that value, prefix it with "0x" and supply it to the script, as follows:

Code:
$ python3 make_address.py 0x80aa0c48c65e9f2962e462c76c3e4f6d0c48a1e2a9359ecc47c8a12679f3ab98

Is this safe for me to use?

If your threat model doesn't include side-channel attacks, then I think it's pretty safe to use. I've done my best to ensure that this code won't produce a "bad" address (i.e. one that you won't be able to spend from later, because of a bug in my code). To make sure that possible behavior changes in future Python implementations won't silently lead to broken address generation, I've included a self-test that runs before the script does any further work. In the event that something about your version of Python or your system leads to incorrect addresses being generated, the script will fail with "fatal error: self-test failed."

I'm a very careful programmer, but I wouldn't consider myself an expert at finite fields, elliptic curves or hash functions, so although I've tested this code extensively, it's quite possible that something may have eluded me. Please keep that in mind before risking significant funds on an address generated by this script. I'm probably being overcautious, but I would hate to be responsible for anyone losing any of their precious sats.

Do you want feedback on your code?

Legitimate bug reports are greatly appreciated! Style tips, micro-optimizations (replacing % 64 with & 63, etc.) and the like, not so much. This code is not meant to be fast, pretty or idiomatic; it's only meant to be correct, and to map reasonably well to the included reference links.
If you spot a real bug (like a case I've mishandled in the EC point addition code, for example) I'll send you some merit. Please include a small test case, demonstrating how the bug leads to incorrect address generation.

What's with all the type annotations?

I used mypy (http://mypy-lang.org) as a static type checker to minimize bugs and to help me reason about the correctness of the code. The annotations themselves are inert (they don't affect execution) and don't require anything beyond vanilla Python being installed. You can remove them if you like, although I wouldn't recommend it. If you have mypy installed, then you can type check the code yourself, with:

Code:
$ mypy --strict make_address.py

Title: Re: [Code] Generating addresses from scratch (Python 3.6+)
Post by: witcher_sense on December 28, 2022, 12:55:30 PM

EDIT:
Legitimate bug reports are greatly appreciated! Style tips, micro-optimizations (replacing % 64 with & 63, etc.) and the like, not so much.

Sorry, didn't notice this remark, but still... 'code is read more often than it's written'.

Great work, buddy, congratz! I am not a professional reviewer, nor do I have Python skills like yours, but...

If I were you, I would rewrite these variables in uppercase to emphasize that they are constants and shouldn't be played with, so as not to mess things up. From:

Code:
secp256k1_field_order: int = 2**256 - 0x1000003d1

To:

Code:
SECP256K1_FIELD_ORDER: int = 2**256 - 0x1000003d1

Also, as a user I would like to have an option to control the variables below without modifying the source code (maybe via command-line arguments):

Code:
show_testnet: bool = False

And let me clarify: you use typing.List and typing.Tuple instead of the built-in objects just to make your code backward-compatible?

Title: Re: [Code] Generating addresses from scratch (Python 3.6+)
Post by: NotATether on December 28, 2022, 09:25:19 PM

Also, as a user I would like to have an option to control the variables below without modifying the source code (maybe via command-line arguments):

Code:
show_testnet: bool = False

And let me clarify: you use typing.List and typing.Tuple instead of the built-in objects just to make your code backward-compatible?

You should not hardcode parameters inside a script; you should get them from the command line using an argument-parsing library such as argparse, which is bundled with the Python standard library.

Title: Re: [Code] Generating addresses from scratch (Python 3.6+)
Post by: PowerGlove on December 31, 2022, 08:07:15 AM

Great work, buddy, congratz!

Thanks, man! ;)

If I were you, I would rewrite these variables in uppercase to emphasize that they are constants and shouldn't be played with, so as not to mess things up.

Yeah, PEP8-style "constants" are probably a good idea (especially in ripemd160() and sha256(), that u32 looks a lot like a variable).
I guess I'm a little biased away from all-caps constants after years of working on hellish C/C++ codebases that used the preprocessor far too much. In my defense, the constants in this script are surprisingly tamper-resistant; try changing any of them and you'll see what I mean. Either the self-test itself will fail, or something being driven by it will.

Also, as a user I would like to have an option to control the variables below without modifying the source code (maybe via command-line arguments):

Yep. Normally, I'd agree with you, but in this case it's a tradeoff I stand behind. Very few people need testnet or uncompressed legacy addresses, so the defaults already serve >99% of people. Keeping the interface super simple (no arguments or one argument) also plays into my future plans, because I'd like to write this program two more times, in two different languages (not anytime soon, I'm burned out on address generation stuff, for now) and I don't want to have to replicate argparse in those future programs.

And let me clarify: you use typing.List and typing.Tuple instead of the built-in objects just to make your code backward-compatible?

Yes. The secrets module was introduced in 3.6, so that's what I targeted. Subscripting the built-in types (without a 3.7+ __future__ import) was introduced in 3.9, and it (that is, nicer-looking type annotations) seemed like a silly thing to raise the minimum version for. However, it looks like I didn't think deeply enough about the right way to approach this, because the backward-compatible compromise I settled on has been deprecated as of 3.9 and will be removed in the future. I'll think about what to do (likely raise the minimum version to 3.9) and update the script at some point (it's not a pressing issue).

Title: Re: [Code] Generating addresses from scratch (Python 3.6+)
Post by: witcher_sense on December 31, 2022, 09:05:25 AM

Yes. The secrets module was introduced in 3.6, so that's what I targeted.
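To make the annotation tradeoff discussed above concrete, here is a small sketch. The function names are made up for illustration and do not come from the script. The first function uses typing.List and typing.Tuple, which work as far back as Python 3.6; the second subscripts the built-in types directly, which only works at runtime on Python 3.9 or later, so it is guarded by a version check here.

```python
import sys
from typing import List, Tuple


def min_max_compat(values: List[int]) -> Tuple[int, int]:
    """Annotated with typing.List / typing.Tuple (fine on Python 3.6+)."""
    return min(values), max(values)


if sys.version_info >= (3, 9):
    # Subscripting list and tuple directly raises TypeError on older
    # interpreters, so this definition is only evaluated on 3.9+.
    def min_max_builtin(values: list[int]) -> tuple[int, int]:
        """Same function, annotated with the built-in generic types."""
        return min(values), max(values)
```

Both definitions behave identically at runtime; the only difference is which interpreter versions can import the module, which is the consideration the minimum-version decision hinges on.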
Honestly, I wasn't aware of the existence of the secrets module until I read your code. Previously, when I needed cryptographically-secure random numbers, I was using the following methods (which I think are compatible with Python 3.6 and below):

Code:
import ssl

Title: Re: [Code] Generating addresses from scratch (Python 3.6+)
Post by: PowerGlove on January 01, 2023, 03:53:47 AM

[...]

Yup, there are a bunch of ways to get at cryptographically-secure random numbers in Python. They all have subtle tradeoffs (some practical, some hypothetical):

The ssl module is a big/ugly import for this script, especially considering one of my goals was to decouple address generation from OpenSSL (by not using hashlib, and writing the hash functions myself).
random.SystemRandom has a note that it's "Not available on all systems."
os.getrandom is nice, but it relies on a Linux-specific system call (this one (https://man7.org/linux/man-pages/man2/getrandom.2.html)).
os.urandom is a reasonable, cross-platform choice (despite the name, it doesn't always rely on /dev/urandom), and (in practice) is sufficient, but its documentation is worded a bit worryingly: "The returned data should be unpredictable enough for cryptographic applications, though its exact quality depends on the OS implementation."
The secrets module has nice, confidently-worded documentation that says it "provides access to the most secure source of randomness that your operating system provides."

Behind the scenes, there's very little difference between some of these approaches (for example, if you peek at the source code for secrets.randbits, here (https://github.com/python/cpython/blob/3.11/Lib/secrets.py#L23), you'll see that, for the moment, it's just a wrapper around random.SystemRandom.getrandbits). In my experience, future development is often informed by current documentation, so picking the right function (i.e.
one that does what you need now and will likely still do what you need years from now) is half about understanding the current implementation and half about guessing how that implementation might evolve over time (based on the documentation).

Title: Re: [Code] Generating addresses from scratch (Python 3.6+)
Post by: PowerGlove on January 02, 2023, 04:01:56 PM

os.urandom is a reasonable, cross-platform choice (despite the name, it doesn't always rely on /dev/urandom), and (in practice) is sufficient, but its documentation is worded a bit worryingly: "The returned data should be unpredictable enough for cryptographic applications, though its exact quality depends on the OS implementation." The secrets module has nice, confidently-worded documentation that says it "provides access to the most secure source of randomness that your operating system provides."

I find it really weird that both of them use the OS randomness source, but have different confidence about its security.

I like to think in terms of "promises" when reading documentation. So, by my reading, the documentation for os.urandom leaves the Python developers with the freedom to use a lower quality source of randomness in the future (though I can't imagine why they would). The secrets documentation confines them to using the best available (API accessible) entropy source. In practice, I'm guessing those two interfaces will be mostly equivalent (security-wise) over their lifetimes, but I still think it's prudent to select the interface with the more restrictive "contract".

Another piece of documentation weirdness is that random.SystemRandom is documented as "Not available on all systems", but (behind the scenes) it's being used to implement the secrets module (which is available on all systems). Again, that just means that (in practice) random.SystemRandom is actually available on all systems, but because of the documentation, you (as a user) can't rely on that fact (i.e.
it might not be true in the next Python release).

Title: Re: [Code] Generating addresses from scratch (Python 3.6+)
Post by: btctaipei on January 04, 2023, 08:20:47 AM

I thought it might be useful to have a completely self-contained Python script that generates Bitcoin addresses (both legacy P2PKH addresses, as well as bech32 P2WPKH addresses). Most examples I've seen resort to using third-party packages, which makes it difficult for someone reading the code to follow (in detail) each of the steps involved. Even using Python's standard library has pitfalls, because the cryptographic hash functions included in Python are based on OpenSSL, which means that decisions coming from that project sometimes affect Python (e.g. some installations require additional configuration to make RIPEMD-160 available). <SNIP>

Food court vendors taking bitcoin consume several dozen blank addresses daily, and now I can wrap a script around this to pump out pre-made QR code wallets that get crontab'ed daily for ssh ftp download to receive payment. Your script makes a unique payment for each microtransaction feasible and so stupidly easy! What you might not be aware of is that it saves me hours of Photoshop cut and paste weekly. Really appreciate this!

Title: Re: [Code] Generating addresses from scratch (Python 3.6+)
Post by: PowerGlove on January 05, 2023, 08:06:28 AM

[...]

I'm happy that this code is being used (especially for something cool like what you described) and that it's saving you some time every week, thanks for letting me know! ;)

Title: Re: [Code] Generating addresses from scratch (Python 3.6+)
Post by: digaran on February 19, 2023, 11:40:42 AM

Hello dev, what happens if we remove all uppercase letters from the address generator? We should be able to get either all-lowercase or all-uppercase addresses, right?
Title: Re: [Code] Generating addresses from scratch (Python 3.6+)
Post by: PowerGlove on February 23, 2023, 03:56:33 PM

Hello dev, what happens if we remove all uppercase letters from the address generator? We should be able to get either all-lowercase or all-uppercase addresses, right?

That depends.

Bech32 addresses can be either all lowercase, or all uppercase (according to BIP173, mixed case addresses are invalid). So, if you have an address like this:

bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4

Then you can safely convert it to uppercase:

BC1QW508D6QEJXTDG4Y5R3ZARVARY0C5XW7KV8F3T4

This would (for example) let you produce more space-efficient QR codes (by taking advantage of the "alphanumeric" mode, which doesn't support lowercase letters). If you want the script in the OP to generate uppercase Bech32 addresses for you, then change this line (in show_info):

Code:
address: str = p2pkh_from_point(point, uncompressed=uncompressed, testnet=testnet) if p2pkh else p2wpkh_from_point(point, testnet=testnet)

To this:

Code:
address: str = p2pkh_from_point(point, uncompressed=uncompressed, testnet=testnet) if p2pkh else p2wpkh_from_point(point, testnet=testnet).upper()

Legacy (Base58) addresses are case-sensitive and forcing them to come out a particular way is a bit more involved, and not what this script was designed for. Using a vanity address generator that accepts regular expressions is likely your best bet.
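The Bech32 case rule described above can be checked with a short sketch. This is not code from the script in the OP: the function names are hypothetical, the checksum routine follows the reference algorithm published in BIP173, and the check covers only the case rule and the checksum (it does not validate the witness version or program length). The QR_ALPHANUMERIC set lists the 45 characters supported by the QR code alphanumeric mode.

```python
from typing import List

BECH32_CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

# Characters available in QR "alphanumeric" mode (no lowercase letters).
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def bech32_polymod(values: List[int]) -> int:
    """Checksum polynomial from the BIP173 reference implementation."""
    generator = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= generator[i]
    return chk


def bech32_hrp_expand(hrp: str) -> List[int]:
    """Expand the human-readable part for checksum computation."""
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def bech32_checksum_ok(address: str) -> bool:
    """Return True for an all-lowercase or all-uppercase string whose
    Bech32 checksum verifies; mixed case is rejected outright."""
    if address != address.lower() and address != address.upper():
        return False
    address = address.lower()
    pos = address.rfind("1")  # the last "1" separates the HRP from the data
    if pos < 1 or pos + 7 > len(address):
        return False
    hrp, data_part = address[:pos], address[pos + 1:]
    if any(c not in BECH32_CHARSET for c in data_part):
        return False
    data = [BECH32_CHARSET.index(c) for c in data_part]
    return bech32_polymod(bech32_hrp_expand(hrp) + data) == 1


def fits_qr_alphanumeric(text: str) -> bool:
    """Return True if every character can be encoded in QR alphanumeric mode."""
    return all(c in QR_ALPHANUMERIC for c in text)
```

With these helpers, the lowercase address from this thread and its uppercase form both pass bech32_checksum_ok, a mixed-case variant fails, and only the uppercase form passes fits_qr_alphanumeric.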