JoelKatz
Legendary
Offline
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
|
|
July 17, 2011, 02:26:11 PM |
|
Can you explain the "hub" mode a little better?
The bitcoin code makes few outbound connections and makes them very slowly. It tries to make the connections to numerically diverse IP addresses. It also has a few design decisions that make it waste CPU under certain circumstances involving larger numbers of connections. The hub modes are aimed at two things. First, for mining controllers, they allow the client to get a larger mix of connections and to reconnect itself to more of the network more rapidly on a restart. Second, for people who want to run bitcoind nodes to provide the network with a reduced diameter, they make it easier to run nodes with many hundreds of connections. It is still a work in progress. The ultimate goal is to improve network reliability and reduce network diameter and propagation delay by greater interconnectedness between well-connected nodes.
|
I am an employee of Ripple. Follow me on Twitter @JoelKatz 1Joe1Katzci1rFcsr9HH7SLuHVnDy2aihZ BM-NBM3FRExVJSJJamV9ccgyWvQfratUHgN
|
|
|
Matt Corallo
|
|
July 17, 2011, 02:30:35 PM |
|
Can you explain the "hub" mode a little better?
Don't use it...period. Hub mode creates a ton more outgoing connections in a much more aggressive manner than stock bitcoin. This is very bad for the network as the number of nodes accepting incoming connections is relatively low and filling all of their slots that you can is really quite terrible. That said, hub mode is an attempt at solving actual problems, namely the "islanding" of nodes. This is largely caused by a large number of people on the network running old nodes, specifically those older than version 0.3.24 (including 0.3.23). This causes nodes to not correctly forward recent blocks. If you are running 0.3.23 or older (without having applied https://github.com/bitcoin/bitcoin/commit/497317453422611a077f7f195eb193d3bb597a9c) you are doing the network a great disservice and causing problems for others. If you are running 0.3.24 and are, like some, seeing islanding and not being able to download the latest blocks, hub mode theoretically solves the issue in a very brute-force way, to the detriment of others. The proper solution here is to -addnode the nodes of other large pools/miners/fallback nodes who are virtually guaranteed to have the latest blocks.
|
|
|
|
JoelKatz
|
|
July 17, 2011, 02:40:44 PM Last edit: July 17, 2011, 03:11:28 PM by JoelKatz |
|
Can you explain the "hub" mode a little better?
Don't use it...period. Hub mode creates a ton more outgoing connections in a much more aggressive manner than stock bitcoin. This is very bad for the network as the number of nodes accepting incoming connections is relatively low and filling all of their slots that you can is really quite terrible.
This was a valid criticism against the first hub mode patches. I was not aware of this issue and have changed the patches in such a way that I believe they help solve this problem rather than making it worse. The newer versions decrease the number of outgoing connections made over the initial versions. In the largest hub mode now, the outgoing connections are capped at 96. In exchange, a much larger number of inbound connections (over 500 instead of 125) is accepted, so on balance the network gains in the number of connections it can accept.
I 100% agree that nodes that are behind NAT or cannot accept inbound connections should never ever run in hub mode. Otherwise, I don't think this particular criticism is valid anymore. In my opinion, people who run well-maintained, stable clients on the latest build in the largest hub mode with good network connectivity are doing the bitcoin network a favor.
|
|
|
|
Matt Corallo
|
|
July 17, 2011, 04:13:55 PM |
|
The newer versions decrease the number of outgoing connections made over the initial versions. In the largest hub mode now, the outgoing connections is capped at 96. In exchange, a much larger number of inbound connections (over 500 instead of 125) is accepted, so on balance the network gains in the number of connections it can accept.
I wasn't aware you had dropped the outgoing cap, but 96 is still way, way, way too much. If you want more outgoing connections, fine, but maybe make 10 or 20, not 96. Accepting more incoming is great, and that should be encouraged. However, accepting more incoming than you make outgoing is not a solution, nor does it add more to the network than it takes away. Keep in mind that nodes attempt to keep a diverse connection pool (something that could be implemented better, though), so accepting 1000 connections does not make up for making 100 connections, because you are never going to get that many connections and because you are taking 100 slots on nodes in different /24s.
Again, let me just point out that though there is a valid argument for having something to help you get a better connection to more up-to-date nodes, hub mode is probably the worst possible way to fix the problem.
As a side note, if you are going to encourage people to patch 0.3.23, please, please include https://github.com/bitcoin/bitcoin/commit/497317453422611a077f7f195eb193d3bb597a9c as without it, you really aren't helping the network at all. Sorry, for some reason I still thought this was based on .23, not sure why I thought that.
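To make the "/24" point concrete, here is a minimal sketch of counting how many distinct /24 networks a set of peers spans. The helper names are hypothetical, and addresses are packed in host byte order for readability, which need not match how the client stores them.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Packs the four octets a.b.c.d into a host-order 32-bit value.
// (Hypothetical helper, for illustration only.)
uint32_t MakeIPv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint32_t(a) << 24) | (uint32_t(b) << 16) | (uint32_t(c) << 8) | uint32_t(d);
}

// Counts how many distinct /24 networks appear among the addresses.
size_t CountDistinctSlash24(const std::vector<uint32_t>& addrs)
{
    std::set<uint32_t> groups;
    for (size_t i = 0; i < addrs.size(); ++i)
        groups.insert(addrs[i] & 0xffffff00u);  // keep only the top 24 bits
    return groups.size();
}
```

Two peers at 10.0.0.1 and 10.0.0.200 count as one /24 here, so a node that holds many outbound connections spread over different /24s occupies slots in many separate groups.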
|
|
|
|
JoelKatz
|
|
July 17, 2011, 04:31:41 PM |
|
I wasnt aware you had dropped the outgoing cap, but 96 is still way, way, way too much. If you want more outgoing connections, fine, but maybe make 10 or 20, not 96. Accepting more incoming is great, and that should be encouraged. However, accepting more incoming than you make outgoing is not a solution, nor does it add more to the network than it takes away. Keep in mind that nodes attempt to keep a diverse connection pool (something that could be implemented better, though) so accepting 1000 connections does not make up for making 100 connections because you are never going to get that many connections and because you are taking 100 slots on nodes in different /24s.
I don't think this argument is valid. It seems valid if you think about one node, but think about 100 nodes on different /24s. Those 100 nodes running in hub mode are adding 500 plus connection slots each and in total they are doing so on 100 different /24s. Of course any one node can't add IP diversity. But so long as each node adds more capacity than it takes, there should be a net gain in IP diversity.
It was based on 0.3.23 until just recently. I absolutely agree that running in hub mode without that patch is a problem.
On the next version of the hub mode patches, I will reduce the number of outgoing connections further. In hub mode 2, recommended for mining controllers, I will reduce the outbound connections from 48 to 32. I feel this is needed because otherwise there is too much of a delay on a restart before the node has an adequate view of the network. I will reduce hub mode 3 from 96 to 32 as well. Hub mode 3 is intended for stable network nodes, and there is no particular reason to need a quick network view.
const unsigned HubModes[4][4]=
{ // outbound connections, total connections, IP mask, multithreaded connect
  { 8, 125, 0x0000ffff, 0 }, // Normal mode
- { 32, 200, 0x0000ffff, 0 }, // Small hub mode
- { 48, 384, 0x00ffffff, 1 }, // Medium hub mode
- { 96, 640, 0xffffffff, 1 } // Large hub mode
+ { 16, 200, 0x0000ffff, 0 }, // Small hub mode
+ { 32, 384, 0x00ffffff, 1 }, // Medium hub mode
+ { 32, 640, 0x00ffffff, 1 } // Large hub mode
};
Update: This change is in the current bitcoin-4diff that is up on my web server.
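The "adds more capacity than it takes" argument can be checked against the revised table with a little arithmetic. This is only a sketch: InboundSlots and NetSlots are hypothetical helpers, and it assumes the "total connections" column counts outbound and inbound connections together.

```cpp
#include <cassert>
#include <cstdio>

// Columns as in the post: outbound connections, total connections,
// IP mask, multithreaded connect. Values are the revised ones.
const unsigned HubModes[4][4] = {
    {  8, 125, 0x0000ffff, 0 },  // Normal mode
    { 16, 200, 0x0000ffff, 0 },  // Small hub mode
    { 32, 384, 0x00ffffff, 1 },  // Medium hub mode
    { 32, 640, 0x00ffffff, 1 }   // Large hub mode
};

// Hypothetical helper: inbound slots a node can offer, assuming the total
// includes the outbound connections.
unsigned InboundSlots(unsigned mode)
{
    assert(mode < 4);
    return HubModes[mode][1] - HubModes[mode][0];
}

// Hypothetical helper: slots offered to the network minus slots consumed on
// other nodes. Positive means the node offers more than it takes.
int NetSlots(unsigned mode)
{
    return static_cast<int>(InboundSlots(mode)) - static_cast<int>(HubModes[mode][0]);
}

// Prints the balance for every mode in the table.
void PrintSlotBalance()
{
    for (unsigned mode = 0; mode < 4; ++mode)
        std::printf("mode %u: inbound %u, net %d\n", mode, InboundSlots(mode), NetSlots(mode));
}
```

Under that assumption the large hub mode offers 608 inbound slots against 32 outbound. This counts offered slots only; it says nothing about whether those slots actually fill, which is the objection raised earlier in the thread.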
|
|
|
|
Matt Corallo
|
|
July 17, 2011, 04:38:17 PM |
|
On the next version of the hub mode patches, I will reduce the number of outgoing connections further. In hub mode 2, recommended for mining controllers, I will reduce the output connections from 48 to 32. I feel this is needed because otherwise there is too much of a delay on a restart before the node has an adequate view of the network. I will reduce hub mode 3 from 96 to 32 as well. Hub mode 3 is intended for stable network nodes, and there is no particular reason to need a quick network view.
32 seems like a reasonable number of outgoing connections. Just keep in mind (or I guess others who haven't upgraded should keep in mind) that the reason it takes so long for nodes to get a good connection is not because of a poor network, it is because nodes have not upgraded to 0.3.24. If you connect to a pre-.24 node and are more than a day or two out of date with the blockchain, you will be disconnected before you are able to get updated, which causes so many of the recent problems. Seriously people, when it is said that you NEED TO UPGRADE TO 0.3.24, there is a reason behind it.
|
|
|
|
JoelKatz
|
|
July 17, 2011, 04:47:56 PM |
|
Thanks for helping me make these patches better. I hope I didn't come across as hostile. I know we both want the same thing -- a network that can remain stable and reliable even with exponential growth.
|
|
|
|
Matt Corallo
|
|
July 17, 2011, 04:55:08 PM Last edit: July 17, 2011, 05:59:14 PM by Matt Corallo |
|
I hope I didn't come across as hostile.
Oh absolutely not, it is going toward solving a legitimate problem. Hopefully I didn't come across as hostile... All the work you are doing here is great, really hope we can get the asio/threaded RPC into mainline sometime soon (hopefully for a 4.1)
|
|
|
|
JoelKatz
|
|
July 17, 2011, 10:46:02 PM |
|
Actually, now that I think about it, reducing the number of outbound connections hubs make will weaken the network. My goal was to get a richly interconnected mesh of high-capacity hubs. And I guess I was thinking "we don't need to connect, they'll connect to us eventually". But this is not true -- if neither side is willing to make many outbound connections, then two hubs won't be likely to connect to each other. But if hubs make lots of outbound connections, then hubs will consume the inbound capacity of non-hubs, reducing the diversity of the network.
I've been thinking about a workaround. What I'm thinking is to have a NODE_HUB bit to indicate a large hub (with the same propagation/storage semantics as NODE_NETWORK). That way, hubs will know that nodes that set that bit are also capable of handling large numbers of incoming connections.
Hubs will first follow the normal mechanism for making outbound connections. But then they will preferentially make extra outbound connections strictly to other hubs to which they are not yet connected, to improve network interconnection density and speed up network propagation.
Once we attain a decent number of hubs, the net effect will be that each hub will have some 32 outbound connections mostly to other hubs, some 32 inbound connections from other hubs, and be able to support about 512 incoming connections from non-hubs.
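A rough sketch of how the extra hub-to-hub step might look. NODE_NETWORK is the existing service bit; NODE_HUB, its value, PeerInfo and SelectExtraHubPeers are all hypothetical names made up for illustration, not anything in the patches.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum : uint64_t {
    NODE_NETWORK = (1 << 0),  // existing service bit
    NODE_HUB     = (1 << 1)   // assumption: hypothetical bit, not an assigned value
};

// Hypothetical view of a known peer.
struct PeerInfo {
    uint64_t nServices;  // service bits the peer advertised
    bool fConnected;     // true if we already have a connection to it
};

// Returns the indexes of up to nMaxExtra peers that advertise NODE_HUB and
// that we are not yet connected to. Intended to run after the normal
// outbound connection logic has finished.
std::vector<size_t> SelectExtraHubPeers(const std::vector<PeerInfo>& peers, size_t nMaxExtra)
{
    std::vector<size_t> picked;
    for (size_t i = 0; i < peers.size() && picked.size() < nMaxExtra; ++i)
    {
        const bool fIsHub = (peers[i].nServices & NODE_HUB) != 0;
        if (fIsHub && !peers[i].fConnected)
            picked.push_back(i);
    }
    return picked;
}
```

Because non-hubs never set the bit, the extra connections land only on nodes that have said they can absorb them, which is the point of the proposal.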
|
|
|
|
Matt Corallo
|
|
July 17, 2011, 11:14:56 PM |
|
Yep, that is absolutely something that should be done, and it's been tossed around in one form or another quite a few times (typically under the supernode/client or sub-node name). You get a ton of supernodes which act like nodes do today, except that the client doesn't upgrade itself into a supernode unless it sees that it has good uptime, accepts incoming connections, etc., similar to the way gtk-gnutella does it. At that point whether or not hubs need more outgoing connections can be decided. I have a feeling there will be a small enough number of hubs/supernodes that any more than 8 would be needless, but that depends on any number of factors that can't really be decided now. See https://forum.bitcoin.org/index.php?topic=12286.0 and http://forum.bitcoin.org/index.php?topic=7972.msg116265#msg116265
|
|
|
|
JoelKatz
|
|
July 17, 2011, 11:51:52 PM |
|
I see a lot of good ideas in there. But I think people are letting the perfect be the enemy of the good. Nobody will ever agree on the best possible solution and complex solutions have possible complex failure modes. I think a baby step in the right direction might be a good idea.
You might only need 8 hubs/supernodes to handle the network from a technical standpoint, but I think you want vastly more because someone who controls a significant fraction of the network could potentially grab all of at least one client's connections and keep that client in the dark. If we move to concentrate power in a small number of supernodes, we may create new attack possibilities.
There are many other steps we could take specifically to deal with the 'mushroom' attack. For example, well-known DNS names under separate administration could each publish a recent block hash as a TXT record. If you aren't seeing the blocks published by most of those well-known names, you're not caught up to the real network. But ideally, the network would just be dense and diverse enough that this would be impractical.
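The comparison step of the DNS idea could be sketched like this. Fetching the TXT records is left out entirely; LooksCaughtUp and its parameters are hypothetical, and "most" is taken to mean a strict majority.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// 'published' holds one recent block hash per well-known DNS name (however
// they were fetched); 'known' holds hashes of blocks this node has seen.
// Returns true if this node has seen the blocks published by a strict
// majority of the names.
bool LooksCaughtUp(const std::vector<std::string>& published,
                   const std::set<std::string>& known)
{
    if (published.empty())
        return false;  // nothing to compare against, so make no claim

    size_t nSeen = 0;
    for (size_t i = 0; i < published.size(); ++i)
        if (known.count(published[i]) != 0)
            ++nSeen;

    return nSeen * 2 > published.size();
}
```

A node that fails this check is not necessarily under attack (it may simply be syncing), so a real client would presumably treat the result as a warning rather than proof.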
|
|
|
|
Matt Corallo
|
|
July 18, 2011, 02:20:54 AM |
|
ideally, the network would just be dense and diverse enough that this would be impractical.
Yep, very much would be. In fact it used to absolutely be; it's only recently that the block download stuff, plus possibly the IRC split and other factors, have made the network very unstable and poorly connected. My hope is that if people would get off their ass and upgrade to 0.3.24, it would get much better, but as it stands we don't even have a mac build and people are being very, very slow to upgrade.
|
|
|
|
Inaba
Legendary
Offline
Activity: 1260
Merit: 1000
|
|
July 18, 2011, 03:16:22 AM |
|
previewdiff seems to be working fine for me now... just wanted to report back.
I'm a little confused, is the new 4diff update incorporating previewdiff now? Should I be recompiling at this point?
|
If you're searching these lines for a point, you've probably missed it. There was never anything there in the first place.
|
|
|
backburn
Member
Offline
Activity: 111
Merit: 10
★Trash&Burn [TBC/TXB]★
|
|
July 18, 2011, 07:27:35 AM |
|
I had to revert back to the old 3.23 code today. The preview patch + 3.24 locked up the system, used up all available sockets after about 8-10hrs, twice. I had to kill -9 to get the process to end. Possibly erroring when 2nd block is found.
Any changes v.97 3.24 that might address this?
|
|
|
|
Caesium
|
|
July 18, 2011, 09:46:29 AM |
|
JoelKatz, I appear to have some sort of memory/map leak with 0.3.24 + 0.96. bitcoind has grown to the following:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24646 btc 20 0 9032m 246m 9136 S 2 6.2 13:02.59 bitcoind_0.3.24
9032M of mapped memory. If I check /proc/24646/smaps, I can see just over a thousand maps of 8192k but sadly they have no real information, here's a sample of one:
7f70c4ffb000-7f70c57fb000 rw-p 00000000 00:00 0
Size: 8192 kB
Rss: 16 kB
Pss: 16 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 16 kB
Referenced: 16 kB
Anonymous: 16 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
They appear to be increasing at approx one every few minutes. Any idea where they might be coming from? (edit: it's not leaking fds by the way, only 123 open descriptors and 108 of those are connections as reported by getinfo)
|
|
|
|
Inaba
|
|
July 18, 2011, 04:32:34 PM |
|
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26217 XXXX 20 0 9.8g 243m 7664 S 2 2.0 55:09.60 bitcoind
I am seeing similar to Caesium
|
|
|
|
JoelKatz
|
|
July 18, 2011, 05:22:17 PM Last edit: July 18, 2011, 07:54:35 PM by JoelKatz |
|
Argh, that's not good. On the bright side, it shouldn't be too terribly hard to find and fix. I'll try to have a fix in the next few hours.
The new 4-diff is the same as the preview diff with a few small fixes and rebased against the latest release version. It's the same as the old 4-diff plus the update patches minus the 'getblock' "precompute skeleton block" patch which was causing problems.
Update: There is no change so simple I can't screw it up somehow! See if this change fixes it:
--- new/util.h 2011-07-17 09:33:40.826055435 -0700
+++ util.h 2011-07-18 12:49:02.095159533 -0700
@@ -625,9 +625,10 @@ inline pthread_t CreateThread(void(*pfn)
         return (pthread_t)0;
     }
     if (!fWantHandle)
-        return (pthread_t)-1;
-    else
+    {
         pthread_detach(hthread);
+        return (pthread_t)-1;
+    }
     return hthread;
 }
The final CreateThread should look like this:
inline pthread_t CreateThread(void(*pfn)(void*), void* parg, bool fWantHandle=f
{
    pthread_t hthread = 0;
    int ret = pthread_create(&hthread, NULL, (void*(*)(void*))pfn, parg);
    if (ret != 0)
    {
        printf("Error: pthread_create() returned %d\n", ret);
        return (pthread_t)0;
    }
    if (!fWantHandle)
    {
        pthread_detach(hthread);
        return (pthread_t)-1;
    }
    return hthread;
}
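For anyone wondering why this matters: a joinable thread that is never joined or detached keeps its resources, including its stack mapping, after it exits. The growing list of 8192 kB anonymous maps is consistent with that, assuming the default thread stack size on the reporting systems was 8 MB. A minimal standalone sketch of the two correct paths follows; Worker, StartThread and RunAndJoin are hypothetical names, not code from the patch.

```cpp
#include <pthread.h>
#include <cstdio>

// Hypothetical worker used only for this sketch: increments the int it is given.
static void* Worker(void* parg)
{
    int* pn = static_cast<int*>(parg);
    *pn += 1;
    return NULL;
}

// Starts a thread. If the caller does not want a handle, the thread is
// detached so its resources are released when it exits. Otherwise the handle
// is stored in *phandle and the caller must pthread_join() it later.
// Returns false if pthread_create fails.
bool StartThread(void* (*pfn)(void*), void* parg, bool fWantHandle, pthread_t* phandle)
{
    pthread_t hthread;
    int ret = pthread_create(&hthread, NULL, pfn, parg);
    if (ret != 0)
    {
        std::printf("Error: pthread_create() returned %d\n", ret);
        return false;
    }
    if (!fWantHandle)
    {
        pthread_detach(hthread);
        return true;
    }
    *phandle = hthread;
    return true;
}

// Runs one joinable worker to completion and returns the value it produced,
// or -1 if the thread could not be started.
int RunAndJoin()
{
    int n = 41;
    pthread_t hthread;
    if (!StartThread(Worker, &n, true, &hthread))
        return -1;
    pthread_join(hthread, NULL);  // releases the thread's resources
    return n;
}
```

The original bug was the reverse pairing: threads whose handle was thrown away were left joinable, so nothing ever released them.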
|
|
|
|
backburn
|
|
July 18, 2011, 10:39:55 PM Last edit: July 19, 2011, 02:15:43 AM by backburn |
|
     if (!fWantHandle)
-        return (pthread_t)-1;
-    else
+    {
         pthread_detach(hthread);
+        return (pthread_t)-1;
+    }
     return hthread;
 }
UPDATE: Looks stable so far, running for 6 hrs. Compiling and will test today, thank you bunches.
|
|
|
|
Inaba
|
|
July 18, 2011, 10:41:15 PM |
|
Looks good so far, after running a couple hours.
|
|
|
|
Caesium
|
|
July 19, 2011, 09:25:10 AM |
|
Yep can confirm fixed here too
|
|
|
|
|