[Code] Generating addresses from scratch (Python 3.6+)

PowerGlove (OP)

Hero Member

Offline

Activity: 508
Merit: 3978

[Code] Generating addresses from scratch (Python 3.6+)

December 28, 2022, 07:22:40 AM

Merited by ABCbits (22), hugeblack (15), Welsh (10), witcher_sense (10), dkbit98 (7), vapourminer (6), arulbero (5), NotATether (5), nc50lc (2), davis196 (1), DdmrDdmr (1), vv181 (1), n0nce (1), citb0in (1), pbies (1), btctaipei (1)

#1

I thought it might be useful to have a completely self-contained Python script that generates Bitcoin addresses (both legacy P2PKH addresses, as well as bech32 P2WPKH addresses).

Most examples I've seen resort to using third-party packages, which makes it difficult for someone reading the code to follow (in detail) each of the steps involved. Even using Python's standard library has pitfalls, because the cryptographic hash functions included in Python are based on OpenSSL, which means that decisions coming from that project sometimes affect Python (e.g. some installations require additional configuration to make RIPEMD-160 available).

The following script uses no external libraries and makes no use of the cryptographic routines in the standard library, either. That way, this script should still run many years from now, and won't be affected by the activity of package maintainers, or even OpenSSL algorithm deprecations.

It's original code, written by me, specifically for this post. Whenever I consulted reference material, I attached the relevant link(s).

Code:

#!/usr/bin/env python3

# make_address.py v2022.12.27 (https://bitcointalk.org/index.php?topic=5432111)

from typing import List, Tuple, Callable

from functools import reduce

import secrets

import sys

show_testnet: bool = False

show_p2pkh_uncompressed: bool = False

show_p2pkh_compressed: bool = True

show_p2wpkh: bool = True

secp256k1_field_order: int = 2**256 - 0x1000003d1

secp256k1_group_order: int = 2**256 - 0x14551231950b75fc4402da1732fc9bebf

secp256k1_generator: Tuple[int, int] = (0x79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798, 0x483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8)

u32: int = 0xffffffff

def fail_if(result: bool, message: str) -> None:

    if result:

        sys.exit('fatal error: ' + message)

def rotl32(x: int, s: int) -> int:

    # https://en.wikipedia.org/wiki/Circular_shift

    return (x << s | x >> (32 - s)) & u32

def ripemd160(x: bytes) -> bytes:

    # https://cacr.uwaterloo.ca/hac/about/chap9.pdf (algorithm 9.55, page 350) ['H1' -> 'h0', 'X' -> 'w']

    # this took some effort to derive from the above reference, I first implemented MD4 (from the same reference; algorithm 9.49, page 346) and then manipulated that into RIPEMD-160.

    yl: List[int] = [0] * 16 + [0x5a827999] * 16 + [0x6ed9eba1] * 16 + [0x8f1bbcdc] * 16 + [0xa953fd4e] * 16

    yr: List[int] = [0x50a28be6] * 16 + [0x5c4dd124] * 16 + [0x6d703ef3] * 16 + [0x7a6d76e9] * 16 + [0] * 16

    zl: List[int] = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] + [7, 4, 13, 1, 10, 6, 15, 3, 12, 0, 9, 5, 2, 14, 11, 8] + [3, 10, 14, 4, 9, 15, 8, 1, 2, 7, 0, 6, 13, 11, 5, 12] + [1, 9, 11, 10, 0, 8, 12, 4, 13, 3, 7, 15, 14, 5, 6, 2] + [4, 0, 5, 9, 7, 12, 2, 10, 14, 1, 3, 8, 11, 6, 15, 13]

    zr: List[int] = [5, 14, 7, 0, 9, 2, 11, 4, 13, 6, 15, 8, 1, 10, 3, 12] + [6, 11, 3, 7, 0, 13, 5, 10, 14, 15, 8, 12, 4, 9, 1, 2] + [15, 5, 1, 3, 7, 14, 6, 9, 11, 8, 12, 2, 10, 0, 4, 13] + [8, 6, 4, 1, 3, 11, 15, 0, 5, 12, 2, 13, 9, 7, 10, 14] + [12, 15, 10, 4, 1, 5, 8, 7, 6, 2, 13, 14, 0, 3, 9, 11]

    sl: List[int] = [11, 14, 15, 12, 5, 8, 7, 9, 11, 13, 14, 15, 6, 7, 9, 8] + [7, 6, 8, 13, 11, 9, 7, 15, 7, 12, 15, 9, 11, 7, 13, 12] + [11, 13, 6, 7, 14, 9, 13, 15, 14, 8, 13, 6, 5, 12, 7, 5] + [11, 12, 14, 15, 14, 15, 9, 8, 9, 14, 5, 6, 8, 6, 5, 12] + [9, 15, 5, 11, 6, 8, 13, 12, 5, 12, 13, 14, 11, 8, 5, 6]

    sr: List[int] = [8, 9, 9, 11, 13, 15, 15, 5, 7, 7, 8, 11, 14, 14, 12, 6] + [9, 13, 15, 7, 12, 8, 9, 11, 7, 7, 12, 7, 6, 15, 13, 11] + [9, 7, 15, 11, 8, 6, 6, 14, 12, 13, 5, 14, 13, 13, 7, 5] + [15, 5, 8, 11, 14, 14, 6, 14, 6, 9, 12, 9, 12, 5, 15, 8] + [8, 5, 12, 9, 12, 5, 14, 6, 8, 13, 6, 5, 15, 13, 11, 11]

    fx: List[Callable[[int, int, int], int]] = [lambda u, v, w: u ^ v ^ w] * 16 + [lambda u, v, w: (u & v) | (~u & w)] * 16 + [lambda u, v, w: (u | ~v) ^ w] * 16 + [lambda u, v, w: (u & w) | (v & ~w)] * 16 + [lambda u, v, w: u ^ (v | ~w)] * 16

    padded: bytes = x + b'\x80' + b'\x00' * (64 - (len(x)+1+8) % 64) + (len(x) * 8).to_bytes(8, 'little')

    h0, h1, h2, h3, h4 = 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, 0xc3d2e1f0

    for block_index in range(len(padded) // 64):

        block: bytes = padded[block_index*64:block_index*64+64]

        w: List[int] = [int.from_bytes(block[i*4:i*4+4], 'little') for i in range(16)]

        al, bl, cl, dl, el = h0, h1, h2, h3, h4

        ar, br, cr, dr, er = h0, h1, h2, h3, h4

        for j in range(80):

            tl = al + fx[j](bl, cl, dl) + w[zl[j]] + yl[j]

            tr = ar + fx[79-j](br, cr, dr) + w[zr[j]] + yr[j]

            al, bl, cl, dl, el = el, (el + rotl32(tl & u32, sl[j])) & u32, bl, rotl32(cl, 10), dl

            ar, br, cr, dr, er = er, (er + rotl32(tr & u32, sr[j])) & u32, br, rotl32(cr, 10), dr

        h0, h1, h2, h3, h4 = (h1 + cl + dr) & u32, (h2 + dl + er) & u32, (h3 + el + ar) & u32, (h4 + al + br) & u32, (h0 + bl + cr) & u32

    return h0.to_bytes(4, 'little') + h1.to_bytes(4, 'little') + h2.to_bytes(4, 'little') + h3.to_bytes(4, 'little') + h4.to_bytes(4, 'little')

def sha256(x: bytes) -> bytes:

    # https://en.wikipedia.org/wiki/SHA-2#Pseudocode

    # one thing that might catch (compared to the reference) is that I've replaced right rotations with left ones.

    k: List[int] = [0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2]

    padded: bytes = x + b'\x80' + b'\x00' * (64 - (len(x)+1+8) % 64) + (len(x) * 8).to_bytes(8, 'big')

    h0, h1, h2, h3, h4, h5, h6, h7 = 0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19

    for block_index in range(len(padded) // 64):

        block: bytes = padded[block_index*64:block_index*64+64]

        w: List[int] = [int.from_bytes(block[i*4:i*4+4], 'big') for i in range(16)] + [0] * 48

        for i in range(16, 64):

            s0: int = rotl32(w[i-15], 25) ^ rotl32(w[i-15], 14) ^ (w[i-15] >> 3)

            s1: int = rotl32(w[i-2], 15) ^ rotl32(w[i-2], 13) ^ (w[i-2] >> 10)

            w[i] = (w[i-16] + s0 + w[i-7] + s1) & u32

        a, b, c, d, e, f, g, h = h0, h1, h2, h3, h4, h5, h6, h7

        for i in range(64):

            t1: int = (rotl32(e, 26) ^ rotl32(e, 21) ^ rotl32(e, 7)) + ((e & f) ^ (~e & g)) + h + w[i] + k[i]

            t2: int = (rotl32(a, 30) ^ rotl32(a, 19) ^ rotl32(a, 10)) + ((a & b) ^ (a & c) ^ (b & c))

            a, b, c, d, e, f, g, h = (t1 + t2) & u32, a, b, c, (d + t1) & u32, e, f, g

        h0, h1, h2, h3, h4, h5, h6, h7 = (h0 + a) & u32, (h1 + b) & u32, (h2 + c) & u32, (h3 + d) & u32, (h4 + e) & u32, (h5 + f) & u32, (h6 + g) & u32, (h7 + h) & u32

    return h0.to_bytes(4, 'big') + h1.to_bytes(4, 'big') + h2.to_bytes(4, 'big') + h3.to_bytes(4, 'big') + h4.to_bytes(4, 'big') + h5.to_bytes(4, 'big') + h6.to_bytes(4, 'big') + h7.to_bytes(4, 'big')

def rcp(x: int) -> int:

    # https://en.wikipedia.org/wiki/Modular_multiplicative_inverse#Using_Euler's_theorem

    # on 3.8+, use the much faster built-in modular inverse (exponent -1), otherwise use Euler's theorem.

    return pow(x, -1 if sys.hexversion >= 0x030800f0 else secp256k1_field_order-2, secp256k1_field_order)

def is_on_curve(point: Tuple[int, int]) -> bool:

    fail_if(point[0] < 0 or point[0] >= secp256k1_field_order, 'x component is out of range.')

    fail_if(point[1] < 0 or point[1] >= secp256k1_field_order, 'y component is out of range.')

    return (point[1]**2 - point[0]**3) % secp256k1_field_order == 7

def add(a: Tuple[int, int], b: Tuple[int, int]) -> Tuple[int, int]:

    # https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication#Point_addition

    # https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication#Point_doubling

    fail_if(not is_on_curve(a) and a != (0, 0), 'point is neither on the curve nor the identity point.')

    fail_if(not is_on_curve(b) and b != (0, 0), 'point is neither on the curve nor the identity point.')

    if b == (0, 0):

        return a

    if a == (0, 0):

        return b

    if b == (a[0], secp256k1_field_order - a[1]):

        return (0, 0)

    slope: int = (a[1] - b[1]) * rcp(a[0] - b[0]) if a != b else 3 * a[0]**2 * rcp(2 * a[1])

    x: int = slope**2 - a[0] - b[0]

    y: int = slope * (a[0] - x) - a[1]

    return (x % secp256k1_field_order, y % secp256k1_field_order)

def scale(point: Tuple[int, int], scalar: int) -> Tuple[int, int]:

    # https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication#Double-and-add

    fail_if(not is_on_curve(point), 'point is not on the curve.')

    fail_if(scalar < 1 or scalar >= secp256k1_group_order, 'scalar is out of range.')

    return point if scalar == 1 else add(scale(add(point, point), scalar // 2), (0, 0) if scalar % 2 == 0 else point)

def rebase_bytes(x: bytes, *, base: int, min_length: int = 0) -> List[int]:

    # this construction shows up in both base58 and bech32.

    value: int = int.from_bytes(x, 'big')

    result: List[int] = []

    while value != 0:

        value, remainder = divmod(value, base)

        result.append(remainder)

    return [0] * (min_length - len(result)) + result[::-1]

def base58_encode(x: bytes) -> str:

    # https://en.bitcoin.it/wiki/Base58Check_encoding

    return '1' * (len(x) - len(x.lstrip(b'\x00'))) + ''.join('123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'[v] for v in rebase_bytes(x, base=58))

def wif_from_scalar(scalar: int, *, uncompressed: bool = False, testnet: bool = False) -> str:

    # https://en.bitcoin.it/wiki/Wallet_import_format

    fail_if(scalar < 1 or scalar >= secp256k1_group_order, 'scalar is out of range.')

    version: bytes = b'\xef' if testnet else b'\x80'

    payload: bytes = scalar.to_bytes(32, 'big') + (b'' if uncompressed else b'\x01')

    return base58_encode(version + payload + sha256(sha256(version + payload))[:4])

def p2pkh_from_point(point: Tuple[int, int], *, uncompressed: bool = False, testnet: bool = False) -> str:

    # https://en.bitcoin.it/wiki/Technical_background_of_version_1_Bitcoin_addresses

    # https://en.bitcoin.it/wiki/List_of_address_prefixes

    fail_if(not is_on_curve(point), 'point is not on the curve.')

    version: bytes = b'\x6f' if testnet else b'\x00'

    bytes_x: bytes = point[0].to_bytes(32, 'big')

    maybe_bytes_y: bytes = point[1].to_bytes(32, 'big') if uncompressed else b''

    prefix: bytes = b'\x04' if uncompressed else (b'\x02' if point[1] % 2 == 0 else b'\x03')

    payload: bytes = ripemd160(sha256(prefix + bytes_x + maybe_bytes_y))

    return base58_encode(version + payload + sha256(sha256(version + payload))[:4])

def bech32_checksum(x: List[int], *, testnet: bool = False) -> List[int]:

    # https://en.bitcoin.it/wiki/BIP_0173

    # this formulation bears little resemblance to the reference, but it fits my way of thinking better.

    expanded: List[int] = ([3, 3, 0, 20, 2] if testnet else [3, 3, 0, 2, 3]) + x + [0, 0, 0, 0, 0, 0]

    code: int = 1 ^ reduce(lambda a, b: (0x3b6a57b2 if a & (1 << 25) else 0) ^ (0x26508e6d if a & (1 << 26) else 0) ^ (0x1ea119fa if a & (1 << 27) else 0) ^ (0x3d4233dd if a & (1 << 28) else 0) ^ (0x2a1462b3 if a & (1 << 29) else 0) ^ ((a << 5 | b) & 0x3fffffff), expanded, 1)

    return [code >> 25, code >> 20 & 31, code >> 15 & 31, code >> 10 & 31, code >> 5 & 31, code & 31]

def bech32_encode(x: List[int]) -> str:

    # https://en.bitcoin.it/wiki/BIP_0173

    return ''.join('qpzry9x8gf2tvdw0s3jn54khce6mua7l'[v] for v in x)

def p2wpkh_from_point(point: Tuple[int, int], *, testnet: bool = False) -> str:

    # https://en.bitcoin.it/wiki/BIP_0173

    fail_if(not is_on_curve(point), 'point is not on the curve.')

    bytes_x: bytes = point[0].to_bytes(32, 'big')

    version: List[int] = [0]

    payload: List[int] = rebase_bytes(ripemd160(sha256((b'\x02' if point[1] % 2 == 0 else b'\x03') + bytes_x)), base=32, min_length=32)

    return ('tb' if testnet else 'bc') + '1' + bech32_encode(version + payload + bech32_checksum(version + payload, testnet=testnet))

def show_info(point: Tuple[int, int], scalar: int, *, p2pkh: bool = False, uncompressed: bool = False, testnet: bool = False) -> None:

    fail_if(not p2pkh and uncompressed, 'p2wpkh addresses are always compressed.')

    kind: str = ('Legacy' if p2pkh else 'Native SegWit') + ((', Uncompressed' if uncompressed else ', Compressed') if p2pkh else '') + (', Testnet' if testnet else '')

    address: str = p2pkh_from_point(point, uncompressed=uncompressed, testnet=testnet) if p2pkh else p2wpkh_from_point(point, testnet=testnet)

    private_key: str = (('' if uncompressed else 'p2pkh:') if p2pkh else 'p2wpkh:') + wif_from_scalar(scalar, uncompressed=uncompressed, testnet=testnet)

    print('       +------+' + '-' * len(kind) + '--+\n' + '       | Type | ' + kind + ' |\n    +--+------+' + '-' * len(kind) + '--+' + '-' * (len(address) - len(kind) - 1) + '+\n    | Address | ' + address + ' |\n+---+---------+' + '-' * len(address) + '--+' + '-' * (len(private_key) - len(address) - 1) + '+\n| Private Key | ' + private_key + ' |\n+-------------+' + '-' * len(private_key) + '--+')

def self_test() -> bool:

    # this function generates a test pattern consisting of 96 different addresses and 64 different WIFs, and then hashes the result to confirm its correctness.

    # the test pattern is slid 9170 places to slightly increase defect coverage (that offset includes some edge cases; ask me about it on Bitcointalk, if you're interested).

    pattern: str = ''

    for i in range(9171, 9171+16):

        scalar: int = (i * 0x9e3779b97f4a7c15f39cc0605cedc833477394a4b665b1d25e46d78ce158d565) % secp256k1_group_order

        point: Tuple[int, int] = scale(secp256k1_generator, scalar)

        pattern += wif_from_scalar(scalar, uncompressed=True, testnet=True) + wif_from_scalar(scalar, testnet=True) + wif_from_scalar(scalar, uncompressed=True) + wif_from_scalar(scalar)

        pattern += p2pkh_from_point(point, uncompressed=True, testnet=True) + p2pkh_from_point(point, testnet=True) + p2pkh_from_point(point, uncompressed=True) + p2pkh_from_point(point)

        pattern += p2wpkh_from_point(point, testnet=True) + p2wpkh_from_point(point)

    return base58_encode(ripemd160(pattern.encode('ascii'))) == '4RH6F51YXGjTEn5jnvxvFjmsN86j'

def main(args: List[str]) -> None:

    fail_if(len(args) not in { 0, 1 }, 'expected zero or one argument(s).')

    fail_if(len(args) == 1 and (len(args[0]) < 3 or args[0][:2].lower() != '0x' or any(c not in '0123456789abcdef' for c in args[0][2:].lower())), 'expected a private key in hexadecimal form.')

    fail_if(not self_test(), 'self-test failed.')

    scalar: int = secrets.randbits(256) if len(args) == 0 else int(args[0], 16)

    point: Tuple[int, int] = scale(secp256k1_generator, scalar)

    if show_p2pkh_uncompressed:

        show_info(point, scalar, p2pkh=True, uncompressed=True, testnet=show_testnet)

    if show_p2pkh_compressed:

        show_info(point, scalar, p2pkh=True, testnet=show_testnet)

    if show_p2wpkh:

        show_info(point, scalar, testnet=show_testnet)

if __name__ == '__main__':

    main(sys.argv[1:])

What does it do?

It generates Bitcoin addresses and WIFs for a given scalar. It's capable of producing 6 different address types (P2WPKH and compressed/uncompressed P2PKH, for both mainnet and testnet); there are boolean (True/False) variables at the top of the script that you can modify to your liking (show_testnet, show_p2pkh_uncompressed, show_p2pkh_compressed, and show_p2wpkh). By default, only show_p2pkh_compressed and show_p2wpkh are set to True, so when you run it you'll see output like the following:

+------+--------------------+ | Type | Legacy, Compressed | +--+------+--------------------+---------------+ | Address | 1BgGZ9tcN4rm9KBzDn7KprQz87SZ26SAMH | +---+---------+------------------------------------+-----------------------+ | Private Key | p2pkh:KwDiBf89QgGbjEhKnhXJuH7LrciVrZi3qYjgd9M7rFU73sVHnoWn | +-------------+------------------------------------------------------------+ +------+---------------+ | Type | Native SegWit | +--+------+---------------+----------------------------+ | Address | bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4 | +---+---------+--------------------------------------------+----------------+ | Private Key | p2wpkh:KwDiBf89QgGbjEhKnhXJuH7LrciVrZi3qYjgd9M7rFU73sVHnoWn | +-------------+-------------------------------------------------------------+

The above two addresses were generated from the scalar 0x1, like this:

Code:

$ python3 make_address.py 0x1

If you run it without arguments (that is, without supplying a scalar) it will use a randomly-generated 256-bit scalar, and could be used (for example) to make a cold wallet on an air-gapped computer.

How are you generating random scalars?

With the secrets module. Specifically, the secrets.randbits function. According to the Python documentation, that module will use "the most secure source of randomness that your operating system provides."

If you don't trust that module to do a good job, then you can supply your own scalar (in hexadecimal) as a command-line argument.

One way to randomly generate a 256-bit value would be to execute a command like:

Code:

$ dd if=/dev/urandom bs=1024 count=1 status=none | sha256sum -b --tag

Which will produce output that looks similar to this:

SHA256 (-) = 80aa0c48c65e9f2962e462c76c3e4f6d0c48a1e2a9359ecc47c8a12679f3ab98

Then you can take that value, prefix it with "0x" and supply it to the script, as follows:

Code:

$ python3 make_address.py 0x80aa0c48c65e9f2962e462c76c3e4f6d0c48a1e2a9359ecc47c8a12679f3ab98

Is this safe for me to use?

If your threat model doesn't include side-channel attacks, then I think it's pretty safe to use. I've done my best to ensure that this code won't produce a "bad" address (i.e. one that you won't be able to spend from later, because of a bug in my code). To make sure that possible behavior changes in future Python implementations won't silently lead to broken address generation, I've included a self-test that runs before the script does any further work. In the event that something about your version of Python or your system leads to incorrect addresses being generated, the script will fail with "fatal error: self-test failed."

I'm a very careful programmer, but I wouldn't consider myself an expert at finite fields, elliptic curves or hash functions, so although I've tested this code extensively, it's quite possible that something may have eluded me. Please keep that in mind before risking significant funds on an address generated by this script. I'm probably being overcautious, but I would hate to be responsible for anyone losing any of their precious sats.

Do you want feedback on your code?

Legitimate bug reports are greatly appreciated! Style tips, micro-optimizations (replacing % 64 with & 63, etc.) and the like, not so much.

This code is not meant to be fast, pretty or idiomatic; it's only meant to be correct, and to map reasonably well to the included reference links.

If you spot a real bug (like a case I've mishandled in the EC point addition code, for example) I'll send you some merit. Please include a small test case, demonstrating how the bug leads to incorrect address generation.

What's with all the type annotations?

I used mypy as a static type checker to minimize bugs and to help me reason about the correctness of the code. The annotations themselves are inert (they don't affect execution) and don't require anything beyond vanilla Python being installed. You can remove them if you like, although I wouldn't recommend it.

If you have mypy installed, then you can type check the code yourself, with:

Code:

$ mypy --strict make_address.py

witcher_sense

Legendary

Offline

Activity: 2310
Merit: 4313

🔐BitcoinMessage.Tools🔑

Re: [Code] Generating addresses from scratch (Python 3.6+)

December 28, 2022, 12:55:30 PM
Last edit: December 28, 2022, 01:17:53 PM by witcher_sense

#2

EDIT:

Quote from: PowerGlove on December 28, 2022, 07:22:40 AM

Legitimate bug reports are greatly appreciated! Style tips, micro-optimizations (replacing % 64 with & 63, etc.) and the like, not so much.

Sorry, didn't notice this remark, but still... 'code is read more often than it's written'.

Great work, buddy, congratz!

I am not a professional reviewer nor have I skills in Python like you do, but...

If I were you I would rewrite these variables in uppercase to emphasize these are constants and shouldn't be played with to not mess things up.

From:

Code:

secp256k1_field_order: int = 2**256 - 0x1000003d1

secp256k1_group_order: int = 2**256 - 0x14551231950b75fc4402da1732fc9bebf

secp256k1_generator: Tuple[int, int] = (0x79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798,
                                        0x483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8)

u32: int = 0xffffffff

To:

Code:

SECP256K1_FIELD_ORDER: int = 2**256 - 0x1000003d1

SECP256K1_GROUP_ORDER: int = 2**256 - 0x14551231950b75fc4402da1732fc9bebf

SECP256K1_GENERATOR: Tuple[int, int] = (0x79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798,
                                        0x483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8)

U32: int = 0xffffffff

Also, as a user I would like to have an option to control below variables without modifying the source code (maybe via command line as arguments):

Code:

show_testnet: bool = False

show_p2pkh_uncompressed: bool = False

show_p2pkh_compressed: bool = True

show_p2wpkh: bool = True

And let me clarify: you use typing.List and typing.Tuple instead of in-built objects just to make your code backward-compatible?

█▀▀▀
█
█
█
█
█
█
█
█
█
█
█
█▄▄▄

▀▀▀▀▀▀▀▀▀▀▀
e
▄▄▄▄▄▄▄▄▄▄▄

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
c.h.
▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄

▀▀▀█
█
█
█
█
█
█
█
█
█
█
█
▄▄▄█

▄██████▄▄▄
█████████████▄▄
███████████████
███████████████
███████████████
███████████████
███░░█████████
███▌▐█████████
█████████████
███████████▀
██████████▀
████████▀
░▀▀██▀▀

NotATether

Legendary

Offline

Activity: 1582
Merit: 6697

bitcoincleanup.com / bitmixlist.org

Re: [Code] Generating addresses from scratch (Python 3.6+)

December 28, 2022, 09:25:19 PM

#3

Quote from: witcher_sense on December 28, 2022, 12:55:30 PM

Also, as a user I would like to have an option to control below variables without modifying the source code (maybe via command line as arguments):

Code:

show_testnet: bool = False

show_p2pkh_uncompressed: bool = False

show_p2pkh_compressed: bool = True

show_p2wpkh: bool = True

And let me clarify: you use typing.List and typing.Tuple instead of in-built objects just to make your code backward-compatible?

You should not hardcode parameters inside a script, you need to get them from the command line using an arguments library such as argparse which is bundled with the Python library by default.

.
.^{BLACKJACK^♠FUN.}

███▄██████
██████████████▀
████████████
█████████████████
████████████████▄▄
░█████████████▀░▀▀
██████████████████
░██████████████
█████████████████▄
░██████████████▀
████████████
███████████████░██
██████████

CRYPTO CASINO &
SPORTS BETTING

│

.
REWARD SYSTEM
.ELITE SEAT..

│

▄▄███████▄▄
▄███████████████▄
███████████████████
█████████████████████
███████████████████████
█████████████████████████
█████████████████████████
█████████████████████████
███████████████████████
█████████████████████
███████████████████
▀███████████████▀
███████████████████

.

Whitepaper
JACK

PowerGlove (OP)

Hero Member

Offline

Activity: 508
Merit: 3978

Re: [Code] Generating addresses from scratch (Python 3.6+)

December 31, 2022, 08:07:15 AM

#4

Quote from: witcher_sense on December 28, 2022, 12:55:30 PM

Great work, buddy, congratz!

Thanks, man! Wink

Quote from: witcher_sense on December 28, 2022, 12:55:30 PM

If I were you I would rewrite these variables in uppercase to emphasize these are constants and shouldn't be played with to not mess things up.

Yeah, PEP8-style "constants" are probably a good idea (especially in ripemd160() and sha256(), that u32 looks a lot like a variable). I guess I'm a little biased away from all-caps constants after years of working on hellish C/C++ codebases that used the preprocessor far too much. In my defense, the constants in this script are surprisingly tamper-resistant; try changing any of them and you'll see what I mean. Either the self-test itself will fail, or something being driven by it will.

Quote from: witcher_sense on December 28, 2022, 12:55:30 PM

Also, as a user I would like to have an option to control below variables without modifying the source code (maybe via command line as arguments):

Yep. Normally, I'd agree with you, but in this case it's a tradeoff I stand behind. Very few people need testnet or uncompressed legacy addresses, so the defaults already serve >99% of people. Keeping the interface super simple (no arguments or one argument) also plays into my future plans, because I'd like to write this program two more times, in two different languages (not anytime soon, I'm burned-out on address generation stuff, for now) and I don't want to have to replicate argparse in those future programs.

Quote from: witcher_sense on December 28, 2022, 12:55:30 PM

And let me clarify: you use typing.List and typing.Tuple instead of in-built objects just to make your code backward-compatible?

Yes. The secrets module was introduced in 3.6, so that's what I targeted. Subscripting the built-in types (without a 3.7+ __future__ import) was introduced in 3.9, and it (that is, nicer-looking type annotations) seemed like a silly thing to raise the minimum version for. However, it looks like I didn't think deeply enough about the right way to approach this, because the backward-compatible compromise I settled on has been deprecated as of 3.9 and will be removed in the future. I'll think about what to do (likely raise the minimum version to 3.9) and update the script at some point (it's not a pressing issue).

witcher_sense

Legendary

Offline

Activity: 2310
Merit: 4313

🔐BitcoinMessage.Tools🔑

Re: [Code] Generating addresses from scratch (Python 3.6+)

December 31, 2022, 09:05:25 AM

#5

Quote from: PowerGlove on December 31, 2022, 08:07:15 AM

Yes. The secrets module was introduced in 3.6, so that's what I targeted.

Honestly, I wasn't aware of the existence of secrets module until I read your code. Previously, when I needed cryptographically-secure random numbers, I was using the following methods (which I think are compatible with Python 3.6 and below):

Code:

import ssl
import random
import os


print(hex(random.SystemRandom().getrandbits(256)))
print(f'0x{os.urandom(32).hex()}')
print(f'0x{ssl.RAND_bytes(32).hex()}')

What are the advantages of secrets module, are the random numbers it generates even more random, or is the way it generates numbers more secure when compared to alternatives?

█▀▀▀
█
█
█
█
█
█
█
█
█
█
█
█▄▄▄

▀▀▀▀▀▀▀▀▀▀▀
e
▄▄▄▄▄▄▄▄▄▄▄

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
c.h.
▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄

▀▀▀█
█
█
█
█
█
█
█
█
█
█
█
▄▄▄█

▄██████▄▄▄
█████████████▄▄
███████████████
███████████████
███████████████
███████████████
███░░█████████
███▌▐█████████
█████████████
███████████▀
██████████▀
████████▀
░▀▀██▀▀

PowerGlove (OP)

Hero Member

Offline

Activity: 508
Merit: 3978

Re: [Code] Generating addresses from scratch (Python 3.6+)

January 01, 2023, 03:53:47 AM

Merited by hugeblack (4), ABCbits (2), witcher_sense (2)

#6

Quote from: witcher_sense on December 31, 2022, 09:05:25 AM

[...]

Yup, there are a bunch of ways to get at cryptographically-secure random numbers in Python. They all have subtle tradeoffs (some practical, some hypothetical):

The ssl module is a big/ugly import for this script, especially considering one of my goals was to decouple address generation from OpenSSL (by not using hashlib, and writing the hash functions myself).

random.SystemRandom has a note that it's "Not available on all systems."

os.getrandom is nice, but it relies on a Linux-specific system call (this one).

os.urandom is a reasonable, cross-platform choice (despite the name, it doesn't always rely on /dev/urandom), and (in practice) is sufficient, but its documentation is worded a bit worryingly: "The returned data should be unpredictable enough for cryptographic applications, though its exact quality depends on the OS implementation."

The secrets module has nice, confidently-worded documentation that says it "provides access to the most secure source of randomness that your operating system provides."

Behind the scenes, there's very little difference between some of these approaches (for example, if you peek at the source code for secrets.randbits, here, you'll see that — for the moment — it's just a wrapper around random.SystemRandom.getrandbits).

In my experience, future development is often informed by current documentation, so picking the right function (i.e. one that does what you need now and will likely still do what you need years from now) is half about understanding the current implementation and half about guessing how that implementation might evolve over time (based on the documentation).

PowerGlove (OP)

Hero Member

Offline

Activity: 508
Merit: 3978

Re: [Code] Generating addresses from scratch (Python 3.6+)

January 02, 2023, 04:01:56 PM

Merited by ABCbits (1)

#7

Quote from: ETFbitcoin on January 02, 2023, 12:58:07 PM

Quote from: PowerGlove on January 01, 2023, 03:53:47 AM

os.urandom is a reasonable, cross-platform choice (despite the name, it doesn't always rely on /dev/urandom), and (in practice) is sufficient, but its documentation is worded a bit worryingly: "The returned data should be unpredictable enough for cryptographic applications, though its exact quality depends on the OS implementation."

The secrets module has nice, confidently-worded documentation that says it "provides access to the most secure source of randomness that your operating system provides."

I find it's really weird both of them use OS randomness source, but have different confidence about it's security.

Yep, I agree. Although, technically, those two sentences don't actually contradict each other. But, I still prefer the way the secrets documentation reads.

I like to think in terms of "promises" when reading documentation. So, by my reading, the documentation for os.urandom leaves the Python developers with the freedom to use a lower quality source of randomness in the future (though I can't imagine why they would). The secrets documentation confines them to using the best available (API accessible) entropy source. In practice, I'm guessing those two interfaces will be mostly equivalent (security-wise) over their lifetimes, but I still think it's prudent to select the interface with the more restrictive "contract".

Another piece of documentation weirdness is that random.SystemRandom is documented as "Not available on all systems", but (behind the scenes) it's being used to implement the secrets module (which is available on all systems). Again, that just means that (in practice) random.SystemRandom is actually available on all systems, but because of the documentation, you (as a user) can't rely on that fact (i.e. it might not be true in the next Python release).

btctaipei

Member

Offline

Activity: 141
Merit: 62

Re: [Code] Generating addresses from scratch (Python 3.6+)

January 04, 2023, 08:20:47 AM

Merited by PowerGlove (1)

#8

Quote from: PowerGlove on December 28, 2022, 07:22:40 AM

I thought it might be useful to have a completely self-contained Python script that generates Bitcoin addresses (both legacy P2PKH addresses, as well as bech32 P2WPKH addresses).

Most examples I've seen resort to using third-party packages, which makes it difficult for someone reading the code to follow (in detail) each of the steps involved. Even using Python's standard library has pitfalls, because the cryptographic hash functions included in Python are based on OpenSSL, which means that decisions coming from that project sometimes affect Python (e.g. some installations require additional configuration to make RIPEMD-160 available).
<SNIP>

food court vendors taking bitcoins consumes several dozens blank address daily and now I can wrap this around with script to pump out pre-made QR code wallets that get crontab'ed daily for ssh ftp download to receive payment. Your script makes unique payment for each microtransaction feasible and so stupidly easy!

What you might not be aware is that it saves me hours of Photoshop cut and paste weekly + Really appreciate this!

my PGP Key https://pastebin.com/b7nYutWC
pgp key server search 0x4BCC117F9EFB1A97

PowerGlove (OP)

Hero Member

Offline

Activity: 508
Merit: 3978

Re: [Code] Generating addresses from scratch (Python 3.6+)

January 05, 2023, 08:06:28 AM

#9

Quote from: btctaipei on January 04, 2023, 08:20:47 AM

[...]

I'm happy that this code is being used (especially for something cool like what you described) and that it's saving you some time every week, thanks for letting me know! Wink

digaran

Copper Member
Hero Member

Offline

Activity: 1330
Merit: 899

🖤😏

Re: [Code] Generating addresses from scratch (Python 3.6+)

February 19, 2023, 11:40:42 AM

#10

Hello dev, what happens if we remove all uppercase letters from the address generator? We should be able to get either lower-upper case addresses, right?

🖤😏

PowerGlove (OP)

Hero Member

Offline

Activity: 508
Merit: 3978

Re: [Code] Generating addresses from scratch (Python 3.6+)

February 23, 2023, 03:56:33 PM

Merited by ABCbits (2)

#11

Quote from: digaran on February 19, 2023, 11:40:42 AM

Hello dev, what happens if we remove all uppercase letters from the address generator? We should be able to get either lower-upper case addresses, right?

That depends.

Bech32 addresses can be either all lowercase, or all uppercase (according to BIP173, mixed case addresses are invalid). So, if you have an address like this:

bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4

Then you can safely convert it to uppercase:

BC1QW508D6QEJXTDG4Y5R3ZARVARY0C5XW7KV8F3T4

This would (for example) let you produce more space-efficient QR codes (by taking advantage of the "alphanumeric" mode, which doesn't support lowercase letters).

If you want the script in the OP to generate uppercase Bech32 addresses for you, then change this line (in show_info):

Code:

address: str = p2pkh_from_point(point, uncompressed=uncompressed, testnet=testnet) if p2pkh else p2wpkh_from_point(point, testnet=testnet)

To this:

Code:

address: str = p2pkh_from_point(point, uncompressed=uncompressed, testnet=testnet) if p2pkh else p2wpkh_from_point(point, testnet=testnet).upper()

Legacy (Base58) addresses are case-sensitive and forcing them to come out a particular way is a bit more involved, and not what this script was designed for. Using a vanity address generator that accepts regular expressions is likely your best bet.