Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: Kontakt on April 21, 2013, 09:33:36 PM



Title: Address generation
Post by: Kontakt on April 21, 2013, 09:33:36 PM
I've been trying to understand the generation of bitcoin addresses, and to further that end, I've been trying to write my own code to generate an address.

I understand the process, I just can't get it to work properly.

The example generation on the wiki has a public key of "0450863AD64A87AE8A2FE83C1AF1A8403CB53F53E486D8511DAD8A04887E5B23522CD470243453A299FA9E77237716103ABC11A1DF38855ED6F2EE187E9C582BA6", which it then says is hashed via SHA256 to return "600FFE422B4E00731A59557A5CCA46CC183944191006324A447BDB2D98D4B408".
However, I'm consistently getting "32511e82d56dcea68eb774094e25bab0f8bdd9bc1eca1ceeda38c7a43aceddce".

What am I doing wrong?


Title: Re: Address generation
Post by: jackjack on April 21, 2013, 09:39:26 PM
The public key, and for that matter all of the hexadecimal data, must not be hashed as strings; the hex is just a way of writing binary data.
For example, the string "0123afz" is "3031323361667a" in hexadecimal (see ASCII), aka "\x30\x31\x32\x33\x61\x66\x7a".

Doing it wrong (http://www.fileformat.info/tool/hash.htm?text=0450863AD64A87AE8A2FE83C1AF1A8403CB53F53E486D8511DAD8A04887E5B23522CD470243453A299FA9E77237716103ABC11A1DF38855ED6F2EE187E9C582BA6)
Doing it right (http://www.fileformat.info/tool/hash.htm?hex=0450863AD64A87AE8A2FE83C1AF1A8403CB53F53E486D8511DAD8A04887E5B23522CD470243453A299FA9E77237716103ABC11A1DF38855ED6F2EE187E9C582BA6)

ps: look at "Original bytes" too and notice how "045086..." as a string corresponds to "303435303836..." in binary


Title: Re: Address generation
Post by: Kontakt on April 21, 2013, 10:17:56 PM
Alright, that helps a lot, and I understand what I was doing wrong. Now I just need to figure out how to actually store the hex in c++.


Title: Re: Address generation
Post by: jackjack on April 21, 2013, 10:24:54 PM
Quote from: Kontakt on April 21, 2013, 10:17:56 PM
Now I just need to figure out how to actually store the hex in c++.

I'd say vector<char>.
You can also use strings, I believe... but pay attention to '\x00': it may end strings unexpectedly. I am no specialist in strings + binary + C++ though, so wait until someone confirms, and always test and learn before trusting your code.


Title: Re: Address generation
Post by: Kontakt on April 22, 2013, 04:26:11 PM
I got this code to work properly.

Code:
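#include <cstddef>
#include <cstdint>
#include <cstring>
#include <iomanip>
#include <iostream>
#include <openssl/sha.h>   // SHA256(); link with -lcrypto

using namespace std;

// Hash len raw bytes with SHA-256; out must hold 32 bytes.
uint8_t* sha2(const uint8_t *in, size_t len, uint8_t *out)
{
    SHA256(in, len, out);
    return out;
}

// Value of one hex digit (either case); the input is assumed to be valid hex.
static unsigned int hex_val(char c)
{
    if (c >= 'a') return c - 'a' + 10;
    if (c >= 'A') return c - 'A' + 10;
    return c - '0';
}

// Decode len hex characters (len must be even) into len/2 bytes.
uint8_t* hex_decode(const char *in, size_t len, uint8_t *out)
{
    for (size_t i = 0, t = 0; i + 1 < len; i += 2, ++t)
        out[t] = (uint8_t)((hex_val(in[i]) << 4) | hex_val(in[i + 1]));

    return out;
}

int main()
{
    const char pub_key[] = "0450863AD64A87AE8A2FE83C1AF1A8403CB53F53E486D8511DAD8A04887E5B23522CD470243453A299FA9E77237716103ABC11A1DF38855ED6F2EE187E9C582BA6";
    uint8_t res_sha[32];
    uint8_t res_tmp[65];
    // strlen gives the 130 hex digits; passing 131 (the array size, which
    // counts the terminating NUL) made the loop write one byte past res_tmp.
    hex_decode(pub_key, strlen(pub_key), res_tmp);
    for (int j = 0; j < 65; j++)
        cout << setw(2) << setfill('0') << hex << (int)res_tmp[j];
    cout << endl << endl;
    sha2(res_tmp, sizeof res_tmp, res_sha);
    for (int i = 0; i < 32; i++)
        cout << setw(2) << setfill('0') << hex << (int)res_sha[i];
    cout << endl;

    return 0;
}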


Title: Re: Address generation
Post by: Remember remember the 5th of November on April 22, 2013, 04:31:09 PM
I also made this mistake once. Hashing the hexadecimal representation rather than the binary, very very wrong.


Title: Re: Address generation
Post by: Kontakt on April 22, 2013, 08:19:49 PM
Now I'm just completely stuck at converting the final hash into the base58 address.


Title: Re: Address generation
Post by: Shevek on April 22, 2013, 08:59:24 PM
Quote from: Kontakt on April 22, 2013, 08:19:49 PM
Now I'm just completely stuck at converting the final hash into the base58 address.

Once you have the RIPEMD160 hash and the 4-byte checksum:

Code:
010966776006953D5567439E5E39F86A0D273BEED61967F6

then you must treat it as a single number in big-endian format (most significant digit first), like the usual base-10 numbers you see every day, but in base 16. Here is the number in decimal format:

Code:
25420294593250030202636073700053352635053786165627414518

Then you obtain the last base58 symbol doing:

Code:
25420294593250030202636073700053352635053786165627414518 mod 58 = 20

And 20 corresponds to the symbol "M" in the base58 alphabet used in bitcoin. Then

Code:
(25420294593250030202636073700053352635053786165627414518 - 20)/58 =
438280941262931555217863339656092286811272175269438181

And the process starts again for the next symbol. At the end you have:

Code:
6UwLL9Risc3QfPqBUvKofHmBQ7wMtjvM


You must add "1" at the beginning:

Code:
16UwLL9Risc3QfPqBUvKofHmBQ7wMtjvM

And you are done!


Title: Re: Address generation
Post by: grue on April 22, 2013, 09:02:48 PM
Quote from: jackjack on April 21, 2013, 10:24:54 PM
I'd say vector<char>
You can also use strings I believe... But pay attention to '\x00', it may end strings unexpectedly. I am no specialist with string + binary + c++ though, so wait until someone confirms and always try and learn before trusting your code

or just allocate 64 bytes of unsigned char. (or whatever size it needs to be)


Title: Re: Address generation
Post by: scintill on April 22, 2013, 09:11:35 PM
Shevek is right, but it is easier said than done. ;)  One minor clarification, each leading zero byte should be replaced with "1".  So if the hash happened to start with \x00 the address would start with 11.

You will probably want a pre-existing biginteger/bignumber library to do that math for you.  It might even include a routine that already does base conversions and just needs the base and the alphabet.  The Bitcoin base58 alphabet is "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz" (https://github.com/bitcoin/bitcoin/blob/master/src/base58.h#L26).


Title: Re: Address generation
Post by: Shevek on April 22, 2013, 09:55:06 PM
Quote from: scintill on April 22, 2013, 09:11:35 PM
One minor clarification, each leading zero byte should be replaced with "1".  So if the hash happened to start with \x00 the address would start with 11.

Uhmmm... are you sure!?

I think leading '\x00's result in shorter addresses; otherwise you would never see addresses beginning with "1R...", "1S...", "1T..." and so on; a second symbol beyond "Q" is only possible for hashes with leading '\x00's.


Title: Re: Address generation
Post by: scintill on April 22, 2013, 10:12:19 PM
Quote from: Shevek on April 22, 2013, 09:55:06 PM
Uhmmm... are you sure!? I think leading '\x00's result in shorter addresses; otherwise you would never see addresses beginning with "1R...", "1S...", "1T..." and so on; a second symbol beyond "Q" is only possible for hashes with leading '\x00's.

It is leading zero bytes, keep that in mind.  They couldn't affect the leading result character, if it were a straight conversion, right? Somewhat like writing 007345349 in decimal, it doesn't matter how many zeroes there are in front.  But yes, I am pretty sure:

The leading character '1', which has a value of zero in base58, is reserved for representing an entire leading zero byte, as when it is in a leading position, has no value as a base-58 symbol. There can be one or more leading '1's when necessary to represent one or more leading zero bytes.

Also in the source (https://github.com/bitcoin/bitcoin/blob/master/src/base58.h#L60) (the string is built in reverse), and see 111kzsNZ1w27kSGXwyov1ZvUGVLJMvLmJ (http://blockchain.info/address/111kzsNZ1w27kSGXwyov1ZvUGVLJMvLmJ) with three leading 1's and see how its hash160 starts with two zero bytes.


Title: Re: Address generation
Post by: kjj on April 23, 2013, 04:00:20 AM
1 in base58 means zero.  It is a special case at the beginning of a string to mean 8 bits of zero.  This allows the base58 encoding to preserve arbitrary bit strings, rather than just integers.


Title: Re: Address generation
Post by: Kontakt on April 23, 2013, 05:55:18 AM
How do I convert from hex to decimal like you have it?


Title: Re: Address generation
Post by: Remember remember the 5th of November on April 23, 2013, 06:05:48 AM
For Address Generation, I recommend reading this https://en.bitcoin.it/wiki/Technical_background_of_Bitcoin_addresses.

Here is a function in C for hex to binary taken from cgminer.

Code:
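#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Decode hexstr into len bytes at p; true only if exactly len bytes were read. */
bool hex2bin(unsigned char *p, const char *hexstr, size_t len)
{
    bool ret = false;

    while (*hexstr && len) {
        char hex_byte[4];
        unsigned int v;

        /* odd number of hex digits */
        if (!hexstr[1]) {
            return ret;
        }

        /* copy two digits into a NUL-terminated buffer for sscanf */
        memset(hex_byte, 0, 4);
        hex_byte[0] = hexstr[0];
        hex_byte[1] = hexstr[1];

        if (sscanf(hex_byte, "%x", &v) != 1) {
            return ret;
        }

        *p = (unsigned char) v;

        p++;
        hexstr += 2;
        len--;
    }

    /* cgminer wraps this test in likely(), a branch-hint macro defined
       elsewhere in its source; dropped here so the function compiles alone */
    if (len == 0 && *hexstr == 0)
        ret = true;
    return ret;
}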


Title: Re: Address generation
Post by: Kontakt on April 23, 2013, 08:21:20 AM
Quote from: Remember remember the 5th of November on April 23, 2013, 06:05:48 AM
For Address Generation, I recommend reading this https://en.bitcoin.it/wiki/Technical_background_of_Bitcoin_addresses.

I have that page pretty much memorized at this point. XD

Thanks for the code, I'll digest it once I've had some rest.


Title: Re: Address generation
Post by: Shevek on April 23, 2013, 09:47:32 AM
Quote from: scintill on April 22, 2013, 10:12:19 PM
It is leading zero bytes, keep that in mind.  They couldn't affect the leading result character, if it were a straight conversion, right? Somewhat like writing 007345349 in decimal, it doesn't matter how many zeroes there are in front.  But yes, I am pretty sure:


Yes, you are right. I confused a leading hex digit '0' with a leading zero byte '\x00'.