Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: tcatm on January 11, 2011, 04:05:05 AM



Title: Account names encoding
Post by: tcatm on January 11, 2011, 04:05:05 AM
It seems the RPC interface can't handle account names with non-ASCII characters properly. Copying and pasting a name from listaccounts into getaccountaddress creates a new account with yet another name.

Can we restrict account names to a limited charset (like [a-zA-Z0-9\-_]) or add UTF-8 support?
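
For illustration, a minimal sketch of the first option (restricting the charset). It is written in Python rather than bitcoind's C++, and the name is_valid_account_name is hypothetical, not an existing bitcoind function. It also assumes an empty name should be rejected, which may not be what the client wants.

```python
import re

# Whitelist taken from the charset proposed above: [a-zA-Z0-9\-_]
# (hypothetical helper, not part of bitcoind)
_ACCOUNT_NAME_RE = re.compile(r"[a-zA-Z0-9\-_]+")


def is_valid_account_name(name: str) -> bool:
    """Return True if name is non-empty and uses only the restricted charset."""
    return _ACCOUNT_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["savings_2011-01", "\u0e3f", ""]:
        print(repr(candidate), is_valid_account_name(candidate))
```

A check like this would sidestep the encoding problem entirely, at the cost of ruling out internationalized names.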


Title: Re: Account names encoding
Post by: grondilu on January 11, 2011, 04:22:06 AM
I don't use accounts much, but UTF-8 would be cool indeed.   (utf-8 is actually always cool)


Title: Re: Account names encoding
Post by: davout on January 11, 2011, 08:17:33 AM
Quote from: grondilu on January 11, 2011, 04:22:06 AM
I don't use accounts much, but UTF-8 would be cool indeed.   (utf-8 is actually always cool)

It also needs more cowbell


Title: Re: Account names encoding
Post by: Gavin Andresen on January 11, 2011, 03:22:43 PM
Character set issues give me headaches.

So I just ran a test at the command-line, moving 500 testnet bitcoins to an account named "฿"

The account created is named "\u00E0\u00B8\u00BF", which is not what I intended.  E0 B8 BF is the UTF-8 representation of the Unicode Thai Baht character (U+0E3F).

Thinking this through, trying hard not to get a headache...

My terminal window has:  LC_CTYPE=en_US.UTF-8
So when I copy&paste the Thai baht symbol, it is being encoded as UTF-8.

I pass a UTF-8 string to bitcoind, and it uses the JSON-Spirit library to convert it into a JSON string (which is defined to be Unicode... encoded using backslashes, I think: see http://www.ietf.org/rfc/rfc4627.txt ).  And there's the bug.  Maybe.  I think?
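
The symptom can be reproduced outside bitcoind. The sketch below uses Python purely for illustration and assumes the cause is that each UTF-8 byte gets treated as a separate character before JSON escaping. That assumption matches the account name seen above, but it has not been confirmed against the bitcoind or JSON-Spirit source. The function names are made up for this example.

```python
import json

BAHT = "\u0e3f"  # U+0E3F THAI CURRENCY SYMBOL BAHT


def escape_bytewise(text: str) -> str:
    """JSON-escape text after misreading every UTF-8 byte as one character.

    This mimics the suspected bug: the bytes E0 B8 BF are each promoted to
    their own code point (U+00E0, U+00B8, U+00BF) and escaped separately.
    """
    misread = text.encode("utf-8").decode("latin-1")
    return json.dumps(misread)


def escape_correctly(text: str) -> str:
    """JSON-escape text that was decoded properly into Unicode first."""
    return json.dumps(text)


if __name__ == "__main__":
    print(escape_bytewise(BAHT))   # three escapes, one per UTF-8 byte
    print(escape_correctly(BAHT))  # a single escape for U+0E3F
```

Apart from the case of the hex digits, the first result is the name the account ended up with.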

Command-line bitcoind should be looking at the locale and converting JSON strings to/from that locale.  Anybody motivated enough about internationalized account names (and send comments) to teach it to do that?
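
A rough sketch of that locale round trip, again in Python for illustration only. The helper names arg_to_json and json_to_terminal are hypothetical, and the optional encoding parameter exists only so the example can be run with a fixed encoding instead of whatever the local machine reports.

```python
import json
import locale
from typing import Optional


def _locale_encoding(encoding: Optional[str]) -> str:
    """Use the given encoding, or fall back to the locale's preferred one."""
    return encoding or locale.getpreferredencoding(False)


def arg_to_json(raw: bytes, encoding: Optional[str] = None) -> str:
    """Decode a raw command-line argument from the locale, then JSON-encode it."""
    text = raw.decode(_locale_encoding(encoding))
    return json.dumps(text)


def json_to_terminal(json_text: str, encoding: Optional[str] = None) -> bytes:
    """Decode a JSON string and re-encode it in the locale for printing."""
    text = json.loads(json_text)
    return text.encode(_locale_encoding(encoding))


if __name__ == "__main__":
    raw = b"\xe0\xb8\xbf"  # the Baht sign as typed in a UTF-8 terminal
    as_json = arg_to_json(raw, "utf-8")
    print(as_json)
    print(json_to_terminal(as_json, "utf-8") == raw)
```

With a UTF-8 locale the bytes survive the round trip unchanged, and the JSON text in between contains a single escape for the character rather than one per byte.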


Title: Re: Account names encoding
Post by: marcusaurelius on January 11, 2011, 08:42:01 PM
that is something opaque to john user with the mainline client, right? if not, please point me in the right direction.


Title: Re: Account names encoding
Post by: Gavin Andresen on January 11, 2011, 09:25:56 PM
RE: point you in the right direction:

File rpc.cpp, the CommandLineRPC method:

I suspect what needs to be done is to properly JSON encode any strings passed via the command line.

And then properly decode/recode any strings returned from the JSON-RPC call before printing out the result.
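
To make the two steps concrete, here is a small sketch of what "encode on the way in, decode on the way out" could look like. It is Python, not the actual CommandLineRPC code. The helpers build_request and parse_result are hypothetical, the reply shown is a hand-written example rather than real bitcoind output, and echoing the name back in the result is only a stand-in for any call that returns a string.

```python
import json


def build_request(method: str, params: list) -> bytes:
    """Serialize a JSON-RPC request; non-ASCII characters become \\uXXXX escapes."""
    body = json.dumps({"method": method, "params": params, "id": 1})
    # json.dumps escapes non-ASCII by default, so the body is pure ASCII.
    return body.encode("ascii")


def parse_result(reply: bytes):
    """Parse a JSON-RPC reply and return its result with escapes decoded."""
    return json.loads(reply.decode("utf-8"))["result"]


if __name__ == "__main__":
    account = "\u0e3f"  # already decoded from the terminal's encoding
    print(build_request("getaccountaddress", [account]))

    # Hand-written reply for illustration, not captured from bitcoind.
    reply = b'{"result": "\\u0e3f", "error": null, "id": 1}'
    print(parse_result(reply) == account)
```

The point is that the account name is a proper Unicode string on both sides of the call, and only the wire format uses escapes.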


Title: Re: Account names encoding
Post by: tcatm on January 11, 2011, 09:35:13 PM
It's probably not only the CommandLineRPC method that causes problems. UTF-8 account names created from js-remote are wrong, too.