Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: The Ferox on July 31, 2014, 07:31:19 PM



Title: How do they manage so many addresses?
Post by: The Ferox on July 31, 2014, 07:31:19 PM
I have a quick question that I am sure someone can answer in a way that makes it all make sense for me.

I recently managed to re-import a bunch of old addresses that are still receiving payments from faucets and old affiliate programs, but since the import my client has been running at the speed of molasses in January. I know this is not a hardware problem, because my server is only 2 months old and has 6 top-of-the-line Intel CPUs with Haswell cores and 128 GB of ECC RAM. That is way more raw computing power than a person should ever need for pretty much anything.

I have tried erasing the client and re-installing it, I have the Bitcoin client set to the highest priority on 12 cores, and I have had a data center engineer friend of mine check everything in the hardware config. I have run through everything on the software side as well, and the speed issue does not make sense: it takes almost an hour to load the client and half a day to scan the blockchain when I import a new address.

How do businesses like localbitcoins.com, blockchain.info, satoshidice and others manage tens of millions of addresses without running into "bloated" wallet issues?

Is there a better client for this than using Bitcoin-Qt and bitcoind directly?
Is there some sort of setting I am overlooking?
What am I missing?


<?php
if (BTC questions > dumb stuff a noob should know)
  echo "Yeah i know, i'm new to this stuff";
?>


Title: Re: How do they manage so many addresses?
Post by: espringe on July 31, 2014, 07:41:23 PM
I can't speak for other sites, but for example https://www.moneypot.com/ is currently monitoring a few million addresses. The approach is rather simple. In my case, I pre-generated all the addresses (using BIP32, but that's irrelevant). Next, I stored all those addresses in a hashmap, which effectively allows constant-time lookup by address.

Next, I listen on the bitcoin network and the blockchain. Whenever I see a new transaction, I look at all of its outputs and check whether any of them pays an address in the hashmap. If one does, I record the details in a database that has an index on the address column. Actually, an approach like this would easily allow me to monitor *every* address in the blockchain. Really, the only slow part of my scheme was generating the initial address set.
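The matching step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code behind any real site: the transaction shape (a dict with a "txid" and a list of (address, amount) outputs), the function names, and the placeholder addresses are all hypothetical, and a plain list stands in for the indexed database table.

```python
# Hypothetical data shape: a transaction is a dict with a "txid" string and
# an "outputs" list of (address, amount_in_satoshi) pairs.

def load_watched(addresses):
    """Build the lookup table: address -> running total received."""
    return {address: 0 for address in addresses}

def process_transaction(tx, watched, records):
    """Check every output of tx against the watched table.

    Matching outputs are appended to records (a stand-in for the database)
    and added to the running total for that address.
    """
    for index, (address, amount) in enumerate(tx["outputs"]):
        if address in watched:  # dict lookup, constant time on average
            watched[address] += amount
            records.append((address, tx["txid"], index, amount))

if __name__ == "__main__":
    watched = load_watched(["addr_A", "addr_B"])  # placeholder addresses
    records = []
    incoming = [
        {"txid": "tx1", "outputs": [("addr_A", 5000), ("addr_X", 100)]},
        {"txid": "tx2", "outputs": [("addr_Y", 700), ("addr_B", 2500)]},
    ]
    for tx in incoming:
        process_transaction(tx, watched, records)
    print(watched)
    print(records)
```

The work per transaction depends only on its number of outputs, not on how many addresses are being watched, which is why a few million addresses are not a problem for this scheme.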


Title: Re: How do they manage so many addresses?
Post by: The Ferox on July 31, 2014, 07:56:04 PM
Quote from: espringe on July 31, 2014, 07:41:23 PM
I can't speak for other sites, but for example https://www.moneypot.com/ is currently monitoring a few million addresses. The approach is rather simple. In my case, I pre-generated all the addresses (using BIP32, but that's irrelevant). Next, I stored all those addresses in a hashmap, which effectively allows constant-time lookup by address.

Next, I listen on the bitcoin network and the blockchain. Whenever I see a new transaction, I look at all of its outputs and check whether any of them pays an address in the hashmap. If one does, I record the details in a database that has an index on the address column. Actually, an approach like this would easily allow me to monitor *every* address in the blockchain. Really, the only slow part of my scheme was generating the initial address set.

Hey, thanks for that info. How would someone come across something similar without having to spend a crap ton of time coding it? Your solution seems logical and practical, but time for software development is not something I really have, nor do I have the funds, as I am sure having someone code something like that would cost a few hundred dollars at a minimum.

Really, the way you do it would be perfect in my case, as I do not need to sign or send transactions or do anything other than pull the balance on a bunch of addresses. I have a master list, in a .txt file, of just the addresses I need to monitor. Any ideas on quick solutions or shortcuts?


Title: Re: How do they manage so many addresses?
Post by: monsterer on August 03, 2014, 11:21:29 AM
Quote from: espringe on July 31, 2014, 07:41:23 PM
I can't speak for other sites, but for example https://www.moneypot.com/ is currently monitoring a few million addresses. The approach is rather simple. In my case, I pre-generated all the addresses (using BIP32, but that's irrelevant). Next, I stored all those addresses in a hashmap, which effectively allows constant-time lookup by address.

Obviously this approach only works if you know all your addresses in advance. If you had to import a new, unfamiliar address you still have the same problem :)

Cheers, Paul.
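To illustrate the point about unfamiliar addresses: a watcher that only checks new transactions has nothing on record for an address added later, so the past has to be walked once for it. A minimal sketch, assuming the history is available as a hypothetical list of transactions (each a dict with a "txid" and a list of (address, amount) outputs); the function name is made up for illustration.

```python
def rescan(history, address):
    """Walk every past transaction looking for outputs that pay address.

    history is a hypothetical list of transactions, each a dict with a
    "txid" string and an "outputs" list of (address, amount) pairs.
    The cost grows with the size of the whole history, no matter how few
    addresses are being looked for.
    """
    found = []
    for tx in history:
        for index, (out_address, amount) in enumerate(tx["outputs"]):
            if out_address == address:
                found.append((tx["txid"], index, amount))
    return found

if __name__ == "__main__":
    history = [
        {"txid": "tx1", "outputs": [("addr_old", 1000), ("addr_new", 300)]},
        {"txid": "tx2", "outputs": [("addr_new", 50)]},
    ]
    print(rescan(history, "addr_new"))
```

This full pass over the history is the cost paid for each address that was not known up front.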


Title: Re: How do they manage so many addresses?
Post by: The Ferox on August 05, 2014, 11:21:15 PM
I have never worked with hashmaps before. Does anyone have any good resources? I cannot seem to find an existing platform that I could adapt to make something similar, so it appears I am going to have to go at it alone and from the ground up.


Title: Re: How do they manage so many addresses?
Post by: BurtW on August 06, 2014, 03:29:34 AM
Quote from: The Ferox on August 05, 2014, 11:21:15 PM
I have never worked with hashmaps before. Does anyone have any good resources? I cannot seem to find an existing platform that I could adapt to make something similar, so it appears I am going to have to go at it alone and from the ground up.

In Java:

1) Call the constructor for a hash map object
2) Use it


Title: Re: How do they manage so many addresses?
Post by: rarkenin on August 08, 2014, 12:06:45 AM
Like BurtW said, write a Java program (using bitcoinj to listen for txns), and look into javadocs such as java.util.HashMap (http://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html) for more info.


Title: Re: How do they manage so many addresses?
Post by: espringe on August 08, 2014, 01:20:34 PM
Quote from: The Ferox on August 05, 2014, 11:21:15 PM
I have never worked with hashmaps before. Does anyone have any good resources? I cannot seem to find an existing platform that I could adapt to make something similar, so it appears I am going to have to go at it alone and from the ground up.

The "hash" part is an implementation detail. It might be a "map" (e.g. C++), a "hashmap" (e.g. Java), a "dict" (e.g. Python) or an "object" (e.g. JavaScript). Basically, anything that gives O(log n) lookup or better will be fine.
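As a rough illustration of "O(log n) or better", here is the same membership test done two ways in Python: with a hash-based set (constant time on average) and with binary search over a sorted list (O(log n), comparable to what a tree-based map gives). The address strings are placeholders and the helper name is made up for this sketch.

```python
import bisect

addresses = ["addr_C", "addr_A", "addr_B"]  # placeholder strings, not real addresses

# Hash-based: a built-in set (a dict behaves the same way for key lookup).
watched_set = set(addresses)

# Order-based: a sorted list searched with binary search, O(log n) per lookup.
watched_sorted = sorted(addresses)

def in_sorted(sorted_list, item):
    """Return True if item is present in sorted_list, using binary search."""
    pos = bisect.bisect_left(sorted_list, item)
    return pos < len(sorted_list) and sorted_list[pos] == item

if __name__ == "__main__":
    for candidate in ("addr_B", "addr_Z"):
        print(candidate, candidate in watched_set, in_sorted(watched_sorted, candidate))
```

Either structure answers "is this one of mine?" fast enough; what has to be avoided is a linear scan of the whole address list for every output.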


Title: Re: How do they manage so many addresses?
Post by: DarkHyudrA on August 08, 2014, 01:25:13 PM
The JavaDoc on Oracle's site should give enough examples of how to use a HashMap.
And trust me, it's ridiculously simple; that's why no one is answering you.


Title: Re: How do they manage so many addresses?
Post by: rarkenin on August 09, 2014, 12:17:43 PM
I could write up a Java-based program that does what you need. PM me if interested, with more specific details.


Title: Re: How do they manage so many addresses?
Post by: The Ferox on August 11, 2014, 12:32:23 AM
Quote from: rarkenin on August 09, 2014, 12:17:43 PM
I could write up a Java-based program that does what you need. PM me if interested, with more specific details.

What kind of price are you talking?


Title: Re: How do they manage so many addresses?
Post by: rarkenin on August 11, 2014, 12:42:19 AM
If I can be flexible with a timeframe (a month or two, as I'm a student with a rough schedule) then as little as 0.1 to 0.2 BTC. Please PM me with specific details.


Title: Re: How do they manage so many addresses?
Post by: kerimk2 on August 11, 2014, 12:43:54 AM
I don't think they store the addresses in a wallet. They probably just bulk-generate them, maybe using a PHP script or something, and insert them into a MySQL database so that the owner can search by address. You could also write a script that goes through all of the addresses in the MySQL database and checks for any new transactions.
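A minimal sketch of the database side of this idea. To keep it self-contained it uses Python's built-in sqlite3 module in place of MySQL, so this is a swapped-in technique and not what any of the sites mentioned actually run; the table layout, column names, and function names are assumptions made for illustration.

```python
import sqlite3

def create_store(connection):
    """Create the address table; the primary key gives an index on address."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS addresses ("
        " address TEXT PRIMARY KEY,"
        " received INTEGER NOT NULL DEFAULT 0)"
    )

def bulk_insert(connection, addresses):
    """Insert many addresses in one call, skipping duplicates."""
    connection.executemany(
        "INSERT OR IGNORE INTO addresses (address) VALUES (?)",
        [(address,) for address in addresses],
    )
    connection.commit()

def credit(connection, address, amount):
    """Add amount to a stored address; return True if the address is known."""
    cursor = connection.execute(
        "UPDATE addresses SET received = received + ? WHERE address = ?",
        (amount, address),
    )
    connection.commit()
    return cursor.rowcount == 1

def lookup(connection, address):
    """Return the total received by address, or None if it is not stored."""
    row = connection.execute(
        "SELECT received FROM addresses WHERE address = ?", (address,)
    ).fetchone()
    return None if row is None else row[0]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_store(conn)
    bulk_insert(conn, ["addr_A", "addr_B"])  # placeholder addresses
    credit(conn, "addr_A", 5000)
    print(lookup(conn, "addr_A"), lookup(conn, "addr_B"), lookup(conn, "addr_Z"))
```

Note that this tracks amounts received, not spendable balance; spent outputs would have to be tracked separately.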