Lauda
Legendary
Offline
Activity: 2674
Merit: 2970
Terminated.
|
|
August 05, 2013, 05:43:59 PM |
|
I am also concluding that the HP9 client is a lot faster; still getting a block or even two with the Haswell beast! Indeed it is.
|
"The Times 03/Jan/2009 Chancellor on brink of second bailout for banks" 😼 Bitcoin Core ( onion)
|
|
|
1l1l11ll1l
Legendary
Offline
Activity: 1274
Merit: 1000
|
|
August 05, 2013, 07:49:50 PM |
|
So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?
|
|
|
|
paulthetafy
|
|
August 05, 2013, 08:15:11 PM |
|
So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?
You appear to have uncovered an issue with the miner at high thread counts - I get almost no performance gain when using more than 16 threads on a 32-core system. mikael / sunny, any thoughts on where this bottleneck might be?
|
|
|
|
Magic8Ball
Legendary
Offline
Activity: 1050
Merit: 1000
|
|
August 05, 2013, 08:27:17 PM |
|
Finally got a block after ages. I was using the 64-bit HP9 version.
|
|
|
|
1l1l11ll1l
Legendary
Offline
Activity: 1274
Merit: 1000
|
|
August 05, 2013, 08:28:36 PM |
|
So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?
You appear to have uncovered an issue with the miner at high thread counts - I get almost no performance gain when using more than 16 threads on a 32-core system. mikael / sunny, any thoughts on where this bottleneck might be? I'll try playing with the genproclimit=X setting.
|
|
|
|
dudeguy
Member
Offline
Activity: 182
Merit: 10
|
|
August 05, 2013, 08:51:20 PM |
|
So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?
You appear to have uncovered an issue with the miner at high thread counts - I get almost no performance gain when using more than 16 threads on a 32-core system. mikael / sunny, any thoughts on where this bottleneck might be? I think the consensus is a lack of L1 cache, or a lack of fast L1 cache. L2 and L3 also play a role, but as far as I understand they are much less important than L1. Someone please back me up; I heard this weeks ago.
|
|
|
|
ivanlabrie
|
|
August 05, 2013, 09:20:19 PM |
|
I found hyper-threading adds no performance increase on my end... so I run 4 threads on a Sandy Bridge i7 and it's faster.
|
|
|
|
B.T.Coin
|
|
August 05, 2013, 09:52:47 PM |
|
The easiest way to do this is to copy the wallet.dat from one machine to all of the others. When one mines a block, it will appear in the wallet on all machines. I was initially using dumpprivkey/importprivkey until I realised I could clone the wallet.dat. Just make sure the ones you are overwriting are empty before you copy the new wallet.dat over the top!
But this will only work for 100 blocks; then the wallets will "split", unless you change the keypoolsize (search this thread). I did know that it was a bad idea to run the same wallet on multiple computers, but now it's getting a bit clearer why, although I still don't fully understand. I think I may have a solution, but I would like to run it by you guys since most people on this forum probably have a better understanding of how the wallet works. What if I run the same wallet on all computers, but run a script once a day that copies the wallet.dat from my central PC to all the others, overwriting the wallet.dat on each machine (which was a clone of the original anyway)? This way the wallets can't drift apart after 100 blocks, since they get updated/replaced every day. Would this work, or could I lose coins this way? Obviously the mining program will be closed and restarted when the wallet gets replaced.
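The daily clone idea above can be sketched as a small script. This is a hedged sketch only: the paths are illustrative, the `clone_wallet` helper is hypothetical, and it assumes primecoind is stopped on the slave before the copy (the demo at the end works on throwaway files instead of real wallets).

```shell
#!/bin/sh
# Sketch of the daily wallet-clone idea; paths and helper are illustrative.
# Assumes primecoind is stopped on the slave before wallet.dat is replaced.
set -eu

clone_wallet() {
    master=$1    # master wallet.dat
    slavedir=$2  # slave node's data directory
    # Keep a timestamped backup of the wallet being overwritten, so a
    # slave that already forked and mined coins is not silently erased.
    if [ -f "$slavedir/wallet.dat" ]; then
        cp "$slavedir/wallet.dat" "$slavedir/wallet.dat.bak.$(date +%Y%m%d%H%M%S)"
    fi
    cp "$master" "$slavedir/wallet.dat"
}

# Demo on throwaway files (a real run would target ~/.primecoin):
demo=$(mktemp -d)
mkdir "$demo/slave"
echo master-keys > "$demo/master.dat"
echo forked-keys > "$demo/slave/wallet.dat"
clone_wallet "$demo/master.dat" "$demo/slave"
```

The backup step matters: as discussed below in the thread, a slave whose wallet forked may hold keys for already-mined coins, and overwriting without a backup destroys them.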
|
A fine is a tax you pay for something you did wrong. A tax is a fine you pay for something you did right.
|
|
|
matt4054
Legendary
Offline
Activity: 1946
Merit: 1035
|
|
August 05, 2013, 10:07:53 PM |
|
As far as I understand, the problem is that the wallets will diverge once the pool of (by default 100) addresses is exhausted: transactions to addresses newly generated from that point on will no longer be accessible/visible from the other instances of the wallet, i.e. they fork. If you overwrite the slaves with some "master" instance, then for any slave that forks before the master, newly generated coins will be lost when the forked slave wallet gets overwritten. I'm still waiting to reach that 100-block limit to match theory with reality.
|
|
|
|
Stinky_Pete
|
|
August 05, 2013, 10:22:39 PM |
|
I think you may lose unconfirmed mined blocks if you overwrite the wallet.
|
|
|
|
B.T.Coin
|
|
August 05, 2013, 10:23:33 PM |
|
it means that for any slave that forks before master, newly generated coins will be lost when the forked slave wallet gets overwritten.
That's why my idea was to clone the master to the slaves every day, so they can never reach a 100-block difference.
|
A fine is a tax you pay for something you did wrong. A tax is a fine you pay for something you did right.
|
|
|
mikaelh (OP)
|
|
August 05, 2013, 10:33:54 PM |
|
What if I run the same wallet on all computers, but run a script once a day that copies the wallet.dat from my central PC to all the others, overwriting the wallet.dat on each machine (which was a clone of the original anyway)? This way the wallets can't drift apart after 100 blocks, since they get updated/replaced every day. Would this work, or could I lose coins this way? Obviously the mining program will be closed and restarted when the wallet gets replaced.
I have to say this sounds potentially dangerous. If you are overwriting wallet files, you risk losing the private keys of addresses that may be holding coins. I think the best solution is to make one big wallet with thousands of keys. First, back up the old wallet files on all your nodes. Then run the client once with the parameter -keypool=10000, which will generate a big wallet file. Then you can distribute that new file to your mining nodes. Eventually you may need to make a new wallet file if the keys get exhausted, but that probably won't happen any time soon. Many people are using this solution and it's known to work.
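The big-wallet workflow above can be sketched as follows. Only the `-keypool=10000` flag comes from the post; the paths, the distribution step, and the `same_wallet` helper are hypothetical, and the demo at the end uses throwaway files rather than a real wallet.

```shell
#!/bin/sh
# Sketch of the big-wallet workflow; only -keypool=10000 is from the post.
set -eu

# 1) Back up every node's existing wallet first, e.g.:
#      cp ~/.primecoin/wallet.dat ~/wallet.dat.bak
# 2) Generate one big key pool on the master:
#      primecoind -keypool=10000
# 3) Stop the daemon, then distribute the resulting wallet.dat to each miner.

# Hypothetical helper: confirm two nodes really hold identical wallet
# files (byte-for-byte) before letting them mine.
same_wallet() {
    cmp -s "$1" "$2"
}

# Demo on throwaway files instead of real wallets:
demo=$(mktemp -d)
echo big-keypool-wallet > "$demo/master.dat"
cp "$demo/master.dat" "$demo/node1.dat"
if same_wallet "$demo/master.dat" "$demo/node1.dat"; then
    echo "wallets match"
fi
```

A byte-for-byte check after copying is a cheap way to catch a truncated or partial transfer before a node starts mining against the wrong wallet.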
|
|
|
|
matt4054
Legendary
Offline
Activity: 1946
Merit: 1035
|
|
August 05, 2013, 10:34:33 PM |
|
it means that for any slave that forks before master, newly generated coins will be lost when the forked slave wallet gets overwritten.
That's why my idea was to clone the master to the slaves every day, so they can never reach a 100 block difference. My understanding of the process is that primecoind will not update wallet.dat for every new generated address, but only once the pool is exhausted. So replicating master to slave on a regular basis would not help IMO, and will not address the issue of a slave that exhausted its pool before master and updated wallet.dat before master did. Generation after the fork on the slave (overwritten wallet.dat) would be lost. But again, I may be wrong on this, anyone who dev'd on the bitcoind wallet could give a more authoritative answer.
|
|
|
|
elebit
|
|
August 05, 2013, 10:44:49 PM |
|
But again, I may be wrong on this, anyone who dev'd on the bitcoind wallet could give a more authoritative answer.
Or you could just check for yourself: getinfo returns the size of the current pool of unused keys.
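That check can be scripted. In the sketch below the getinfo output is stubbed for illustration; a live check would pipe `primecoind getinfo` instead, and the low-water threshold is an arbitrary example.

```shell
#!/bin/sh
# Stubbed getinfo output; a real check would use: primecoind getinfo
getinfo_json='{ "blocks" : 12345, "keypoolsize" : 101 }'

# Extract keypoolsize without jq, using sed:
keypoolsize=$(printf '%s\n' "$getinfo_json" \
    | sed -n 's/.*"keypoolsize" *: *\([0-9][0-9]*\).*/\1/p')

echo "unused keys left: $keypoolsize"
# Arbitrary example threshold: warn before the pool runs dry and forks.
if [ "$keypoolsize" -lt 10 ]; then
    echo "warning: key pool nearly exhausted"
fi
```

Running something like this periodically on each node would flag an instance that is about to exhaust its pool and diverge from the others.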
|
|
|
|
matt4054
Legendary
Offline
Activity: 1946
Merit: 1035
|
|
August 05, 2013, 10:53:15 PM |
|
But again, I may be wrong on this, anyone who dev'd on the bitcoind wallet could give a more authoritative answer.
Or you could just check for yourself: getinfo returns the size of the current pool of unused keys. Thanks for pointing this out. The keypoolsize seems to remain constant at 101 with default settings on all of my instances using the same wallet.dat.
|
|
|
|
mikaelh (OP)
|
|
August 05, 2013, 10:56:34 PM |
|
I found hyper-threading adds no performance increase on my end... so I run 4 threads on a Sandy Bridge i7 and it's faster.
Are you running Windows? If so, which version? Hyper-threading performance depends on the CPU scheduler and lots of other things. The CPU scheduler in Windows isn't that great in my experience, but I haven't witnessed it actually being detrimental.
|
|
|
|
mikaelh (OP)
|
|
August 05, 2013, 11:06:42 PM |
|
So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?
You appear to have uncovered an issue with the miner at high thread counts - I get almost no performance gain when using more than 16 threads on a 32-core system. mikael / sunny, any thoughts on where this bottleneck might be?
Well, as far as I know there are two bottlenecks when it comes to scaling out:
1) Block generation. Only one thread at a time can be generating new blocks. This was already mostly fixed by Sunny.
2) Memory allocation. The default malloc implementation uses mutexes internally, which reduces performance when multiple threads try to allocate memory. This shouldn't be an issue with my client because I have reduced the number of memory allocations needed.
So as far as the code is concerned, there shouldn't really be any bottlenecks. If the caches on the CPU are completely inadequate, some performance issues would start appearing, but as far as I know most server CPUs have pretty big caches. And of course, if you have a VPS, remember that you may be sharing CPU time with other people's instances.
|
|
|
|
1l1l11ll1l
Legendary
Offline
Activity: 1274
Merit: 1000
|
|
August 05, 2013, 11:40:24 PM Last edit: August 05, 2013, 11:59:56 PM by 1l1l11ll1l |
|
So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?
You appear to have uncovered an issue with the miner at high thread counts - I get almost no performance gain when using more than 16 threads on a 32-core system. mikael / sunny, any thoughts on where this bottleneck might be? Well, as far as I know there are two bottlenecks when it comes to scaling out: 1) Block generation. Only one thread at a time can be generating new blocks. This was already mostly fixed by Sunny. 2) Memory allocation. The default malloc implementation uses mutexes internally, which reduces performance when multiple threads try to allocate memory. This shouldn't be an issue with my client because I have reduced the number of memory allocations needed. So as far as the code is concerned, there shouldn't really be any bottlenecks. If the caches on the CPU are completely inadequate, some performance issues would start appearing, but as far as I know most server CPUs have pretty big caches. And of course, if you have a VPS, remember that you may be sharing CPU time with other people's instances.
So looking at the specs, the 6100-series Opteron has 128KB of L1 cache per core; the 6200 and 6300 have 48KB per core. That's the only difference I can see. I tried running on fewer cores on the 16-core 2.2GHz 6276, but the performance was roughly half that of the 12-core 1.7GHz 6164 Opteron (chains per minute and chains per day). The PPS is 8300 with the 12-core and 7100 with the faster 16-core.
|
|
|
|
1l1l11ll1l
Legendary
Offline
Activity: 1274
Merit: 1000
|
|
August 06, 2013, 12:03:42 AM Last edit: August 06, 2013, 12:34:23 AM by 1l1l11ll1l |
|
Also, in case anyone is curious
24-core Opteron 6164HE 1.7GHz: "chainspermin" : 29, "chainsperday" : 1.67533939, "primespersec" : 8389,
32-core Opteron 6274 2.2GHz: "chainspermin" : 12, "chainsperday" : 0.71721642, "primespersec" : 7039,
4-core i7-2600k 3.4GHz: "chainspermin" : 8, "chainsperday" : 0.57826364, "primespersec" : 3170,
8-core L5520 2.26GHz: "chainspermin" : 8, "chainsperday" : 0.73978522, "primespersec" : 3628,
8-core L5420 2.5GHz: "chainspermin" : 14, "chainsperday" : 0.96020906, "primespersec" : 3490,
8-core X5355 2.66GHz: "chainspermin" : 15, "chainsperday" : 1.00721642, "primespersec" : 3670,
4-core Xeon 5160 3.0GHz: "chainspermin" : 7, "chainsperday" : 0.50713449, "primespersec" : 1859,
4-core Xeon 5130 2.0GHz "chainspermin" : 6, "chainsperday" : 0.34404084, "primespersec" : 1267,
Core 2 Duo 6300 1.86GHz: "chainspermin" : 3, "chainsperday" : 0.15991434, "primespersec" : 587,
All systems running 64-bit HP9
|
|
|
|
Dsfyu
Member
Offline
Activity: 75
Merit: 10
|
|
August 06, 2013, 01:41:14 AM |
|
I found hyper-threading adds no performance increase on my end... so I run 4 threads on a Sandy Bridge i7 and it's faster.
Are you running Windows? If so, which version? Hyper-threading performance depends on the CPU scheduler and lots of other things. The CPU scheduler in Windows isn't that great in my experience, but I haven't witnessed it actually being detrimental. I see something similar on my end right now, but not as bad. On my 3930K I can set genproclimit to 6 and I get 2517 PPS and 1.2 CPD, and with genproclimit set to 12 I get 2900 PPS and 1.4 CPD. Something seems wrong with this right now. I also set genproclimit to 1 and I'm getting about 450 PPS / 0.23 CPD. If the performance scaled linearly I would be getting ~5000 PPS and 2.7 or 2.8 CPD. Yes, I know that I should never expect anything like that, but it seems like the performance scales linearly up until hyper-threading is involved and then it steeply drops off. Edit: I just tried a few values between 6 and 12 and I'm getting at most a 100 PPS increase from one value to the next, and in some cases no significant increase whatsoever (going from 9 to 10 went from 2784 to 2832).
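A sweep like the one described can be scripted. In this sketch, `run_miner_at` is a hypothetical stand-in for restarting the miner with `-genproclimit=$n` and reading primespersec back; here it is stubbed with the figures quoted in the post so the loop is self-contained.

```shell
#!/bin/sh
# Hypothetical benchmark loop: the stub returns the PPS figures quoted
# in the post; a real run would restart primecoind -genproclimit=$n and
# query the miner for its primespersec instead.
run_miner_at() {
    case "$1" in
        6)  echo 2517 ;;
        9)  echo 2784 ;;
        10) echo 2832 ;;
        12) echo 2900 ;;
        *)  echo 0 ;;
    esac
}

for n in 6 9 10 12; do
    pps=$(run_miner_at "$n")
    echo "genproclimit=$n -> ${pps} pps"
done
```

Tabulating PPS per thread count this way makes the flattening curve obvious: the jump from 6 to 12 threads buys under 400 PPS, far from the near-doubling linear scaling would predict.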
|
Don't just trade, get paid to Atomic⚛Trade!!! Disclaimer: I am a noob. Assume I know nothing until proven otherwise.
|
|
|
|