Bitcoin Forum
December 12, 2024, 11:18:34 PM
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
Pages: « 1 ... 79 [80] 81 ... 131 »
Author Topic: [XPM] [ANN] Primecoin High Performance | HP14 released!  (Read 397655 times)
Lauda (Legendary; Activity: 2674, Merit: 2970)
August 05, 2013, 05:43:59 PM  #1581

> I am also concluding that the HP9 client is a lot faster, still getting a block or even two with the Haswell beast!

Indeed it is.

1l1l11ll1l (Legendary; Activity: 1274, Merit: 1000)
August 05, 2013, 07:49:50 PM  #1582

So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm now getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?

paulthetafy (Hero Member; Activity: 820, Merit: 1000)
August 05, 2013, 08:15:11 PM  #1583

> So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm now getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?

You appear to have uncovered an issue with the miner at high thread counts: I get almost no performance gain when using more than 16 threads on a 32-core system.

mikael / sunny, any thoughts on where this bottleneck might be?
Magic8Ball (Legendary; Activity: 1050, Merit: 1000)
August 05, 2013, 08:27:17 PM  #1584

Finally got a block after ages. Was using the 64-bit HP9 version.
1l1l11ll1l (Legendary; Activity: 1274, Merit: 1000)
August 05, 2013, 08:28:36 PM  #1585

> So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm now getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?
>
> You appear to have uncovered an issue with the miner at high thread counts: I get almost no performance gain when using more than 16 threads on a 32-core system.
>
> mikael / sunny, any thoughts on where this bottleneck might be?

I'll try playing with the genproclimit=X setting.

dudeguy (Member; Activity: 182, Merit: 10)
August 05, 2013, 08:51:20 PM  #1586

> So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm now getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?
>
> You appear to have uncovered an issue with the miner at high thread counts: I get almost no performance gain when using more than 16 threads on a 32-core system.
>
> mikael / sunny, any thoughts on where this bottleneck might be?

I think the consensus is a lack of L1 cache, or L1 cache that isn't fast enough. L2 and L3 also play a role, but as far as I understand they matter far less than L1.

Someone please back me up; it was weeks ago that I heard this.
ivanlabrie (Hero Member; Activity: 812, Merit: 1000)
August 05, 2013, 09:20:19 PM  #1587

I found hyper-threading adds no performance increase on my end, so I run 4 threads on a Sandy Bridge i7 and it's faster.
B.T.Coin (Sr. Member; Activity: 332, Merit: 250)
August 05, 2013, 09:52:47 PM  #1588

> The easiest way to do this is to copy the wallet.dat from one machine to all of the others. When one mines a block, it will appear in the wallet on all machines. I was initially using dumpprivkey/importprivkey until I realised I could clone the wallet.dat. Just make sure the ones you are overwriting are empty before you copy the new wallet.dat over the top!
>
> But this will only work for 100 blocks; then the wallets will "split", unless you change the keypoolsize (search this thread).

I did know that it was a bad idea to run the same wallet on multiple computers, but now it's getting a bit clearer why, although I still don't fully understand it. I think I may have a solution, but I'd like to run it by you guys, since most people on this forum probably have a better understanding of how the wallet works.

What if I run the same wallet on all computers, but run a script once a day that copies the wallet.dat from my central PC to all the others, overwriting the wallet.dat on each machine (which was a clone of the original anyway)? That way the wallets can't drift apart after 100 blocks, since they get replaced every day. Would this work, or could I lose coins this way? Obviously the mining program will be closed and restarted when the wallet gets replaced.

matt4054 (Legendary; Activity: 1946, Merit: 1035)
August 05, 2013, 10:07:53 PM  #1589

As far as I understand it, the problem is that the wallets diverge once the pool of (by default 100) pre-generated addresses is exhausted. From that point on, transactions to newly generated addresses are no longer accessible or visible from the other instances of the wallet, i.e. they fork. If you overwrite the slaves with some "master" instance, then for any slave that forks before the master does, newly generated coins will be lost when the forked slave wallet gets overwritten.

I'm still waiting to reach that 100-block limit to match theory with reality.
Stinky_Pete (Hero Member; Activity: 560, Merit: 500)
August 05, 2013, 10:22:39 PM  #1590

I think you may lose unconfirmed mined blocks if you overwrite the wallet.

B.T.Coin (Sr. Member; Activity: 332, Merit: 250)
August 05, 2013, 10:23:33 PM  #1591

> it means that for any slave that forks before master, newly generated coins will be lost when the forked slave wallet gets overwritten.

That's why my idea was to clone the master to the slaves every day, so they can never reach a 100-block difference.

mikaelh (OP; Sr. Member; Activity: 301, Merit: 250)
August 05, 2013, 10:33:54 PM  #1592

> What if I run the same wallet on all computers, but run a script once a day that copies the wallet.dat from my central PC to all the others, overwriting the wallet.dat on each machine (which was a clone of the original anyway)? That way the wallets can't drift apart after 100 blocks, since they get replaced every day. Would this work, or could I lose coins this way? Obviously the mining program will be closed and restarted when the wallet gets replaced.

I have to say this sounds potentially dangerous. If you are overwriting wallet files, you risk losing the private keys of addresses that may be holding coins.

I think the best solution is to make one big wallet with thousands of keys. First, back up the old wallet files from all your nodes. Then run the client once with the parameter -keypool=10000, which will generate a big wallet file. Then you can distribute that new file to your mining nodes. Eventually you may need to make a new wallet file if the keys get exhausted, but that probably won't happen any time soon. Many people are using this solution and it's known to work.
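For anyone scripting the rollout described above, here is a minimal sketch of the back-up-then-overwrite step. Only the idea of distributing one big-keypool wallet.dat comes from the post; the paths, node layout, and helper function are invented for illustration, and a real deployment would copy to remote machines (e.g. with scp), simulated here with local directories.

```python
import shutil
import tempfile
import time
from pathlib import Path

def distribute_wallet(master_wallet, node_dirs):
    """Copy the master's big-keypool wallet.dat to each node,
    backing up whatever wallet each node already had."""
    for node in node_dirs:
        target = node / "wallet.dat"
        if target.exists():
            # Never overwrite a wallet that hasn't been backed up first.
            backup = node / "wallet.dat.bak.{}".format(int(time.time()))
            shutil.copy2(target, backup)
        shutil.copy2(master_wallet, target)

# Demo with throwaway local directories standing in for mining nodes:
root = Path(tempfile.mkdtemp())
master = root / "wallet.dat"
master.write_bytes(b"big-keypool wallet")          # stand-in for the real file
nodes = [root / "node{}".format(i) for i in range(3)]
for n in nodes:
    n.mkdir()
    (n / "wallet.dat").write_bytes(b"old wallet")  # pre-existing wallet to save
distribute_wallet(master, nodes)
```

Note this only becomes safe once the master wallet has actually been regenerated with a large keypool; blindly overwriting ordinary wallets still risks losing keys, exactly as the post warns.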
matt4054 (Legendary; Activity: 1946, Merit: 1035)
August 05, 2013, 10:34:33 PM  #1593

> it means that for any slave that forks before master, newly generated coins will be lost when the forked slave wallet gets overwritten.
>
> That's why my idea was to clone the master to the slaves every day, so they can never reach a 100-block difference.

My understanding of the process is that primecoind does not update wallet.dat for every newly generated address, but only once the pool is exhausted. So replicating master to slaves on a regular basis would not help, IMO: it does not address the case of a slave that exhausts its pool and updates its wallet.dat before the master does. Anything generated after the fork on that slave would be lost when its wallet.dat is overwritten.

But again, I may be wrong on this; anyone who has worked on the bitcoind wallet could give a more authoritative answer.
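The forking argument above is easy to model. The sketch below is a toy simulation, not the real wallet code: "keys" are just random numbers, and refilling the pool from an RNG stands in for real key generation. Two byte-identical wallet copies agree for exactly the first 100 addresses and then diverge, because post-exhaustion key generation happens independently on each machine.

```python
import random

KEYPOOL = 100  # default pool of pre-generated keys

def make_wallet(seed):
    """Toy wallet: a pool of pre-generated 'keys' plus an RNG that stands
    in for fresh key generation once the pool runs dry."""
    rng = random.Random(seed)
    return {"pool": [rng.getrandbits(64) for _ in range(KEYPOOL)],
            "rng": rng, "used": []}

def next_address(wallet):
    if not wallet["pool"]:
        # Keypool exhausted: this wallet generates a brand-new key locally.
        wallet["pool"].append(wallet["rng"].getrandbits(64))
    key = wallet["pool"].pop(0)
    wallet["used"].append(key)
    return key

# Cloning wallet.dat copies the keypool byte for byte...
master = make_wallet(seed=42)
clone = make_wallet(seed=42)
# ...but key generation after exhaustion is independent on each box.
clone["rng"] = random.Random(7)

for _ in range(KEYPOOL + 1):
    next_address(master)
    next_address(clone)

print(master["used"][:KEYPOOL] == clone["used"][:KEYPOOL])  # shared pool
print(master["used"][KEYPOOL] == clone["used"][KEYPOOL])    # past the pool
```

The first comparison holds and the second fails (the post-exhaustion draws come from different RNG streams), which is the "split after 100 blocks" behaviour described in the thread.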
elebit (Sr. Member; Activity: 441, Merit: 250)
August 05, 2013, 10:44:49 PM  #1594

> But again, I may be wrong on this; anyone who has worked on the bitcoind wallet could give a more authoritative answer.

Or you could just check for yourself: getinfo returns the size of the current pool of unused keys.
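That check is easy to automate. A sketch: the JSON below is a fabricated sample of getinfo output (keypoolsize and keypoololdest are the field names used by the bitcoind-derived RPC, which Primecoin presumably inherits); in practice you would feed in the actual output of primecoind getinfo.

```python
import json

# Fabricated sample of `getinfo` output; the real output has more fields.
raw = """{
    "version": 80400,
    "balance": 0.00000000,
    "keypoololdest": 1375742639,
    "keypoolsize": 101
}"""

info = json.loads(raw)
print("unused keys left in pool:", info["keypoolsize"])
```

Comparing this value across nodes sharing one wallet.dat shows whether any of them is getting close to exhausting the shared pool.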
matt4054 (Legendary; Activity: 1946, Merit: 1035)
August 05, 2013, 10:53:15 PM  #1595

> But again, I may be wrong on this; anyone who has worked on the bitcoind wallet could give a more authoritative answer.
>
> Or you could just check for yourself: getinfo returns the size of the current pool of unused keys.

Thanks for pointing this out.

The keypoolsize seems to stay constant at 101 with default settings on all of my instances using the same wallet.dat.
mikaelh (OP; Sr. Member; Activity: 301, Merit: 250)
August 05, 2013, 10:56:34 PM  #1596

> I found hyper-threading adds no performance increase on my end, so I run 4 threads on a Sandy Bridge i7 and it's faster.

Are you running Windows? If so, which version? Hyper-threading performance depends on the CPU scheduler and lots of other things. The CPU scheduler in Windows isn't that great in my experience, but I haven't witnessed it actually being detrimental.
mikaelh (OP; Sr. Member; Activity: 301, Merit: 250)
August 05, 2013, 11:06:42 PM  #1597

> So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm now getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?
>
> You appear to have uncovered an issue with the miner at high thread counts: I get almost no performance gain when using more than 16 threads on a 32-core system.
>
> mikael / sunny, any thoughts on where this bottleneck might be?

Well, as far as I know there are two bottlenecks when it comes to scaling out:

1) Block generation. Only one thread at a time can be generating new blocks. This was already mostly fixed by Sunny.

2) Memory allocation. The default malloc implementation uses mutexes internally, which reduces performance when multiple threads try to allocate memory. This shouldn't be an issue with my client, because I have reduced the number of memory allocations needed.

So as far as the code is concerned, there shouldn't really be any bottlenecks. If the CPU's caches were completely inadequate, some performance issues would start appearing, but as far as I know most server CPUs have pretty big caches.

And of course, if you are on a VPS, remember that you may be sharing CPU time with other people's instances.
1l1l11ll1l (Legendary; Activity: 1274, Merit: 1000)
August 05, 2013, 11:40:24 PM  #1598
Last edit: August 05, 2013, 11:59:56 PM by 1l1l11ll1l

> So with HP9 on a 24-core 1.7GHz AMD system I was getting 8300 PPS. I just upgraded to a 32-core 2.2GHz setup and I'm now getting 7100 PPS. Anyone have experience with 32-core systems? What settings might I need to adjust?
>
> You appear to have uncovered an issue with the miner at high thread counts: I get almost no performance gain when using more than 16 threads on a 32-core system.
>
> mikael / sunny, any thoughts on where this bottleneck might be?
>
> Well, as far as I know there are two bottlenecks when it comes to scaling out:
>
> 1) Block generation. Only one thread at a time can be generating new blocks. This was already mostly fixed by Sunny.
>
> 2) Memory allocation. The default malloc implementation uses mutexes internally, which reduces performance when multiple threads try to allocate memory. This shouldn't be an issue with my client, because I have reduced the number of memory allocations needed.
>
> So as far as the code is concerned, there shouldn't really be any bottlenecks. If the CPU's caches were completely inadequate, some performance issues would start appearing, but as far as I know most server CPUs have pretty big caches.
>
> And of course, if you are on a VPS, remember that you may be sharing CPU time with other people's instances.

Looking at the specs, the 6100-series Opteron has 128 KB of L1 cache per core, while the 6200 and 6300 have 48 KB per core. That's the only difference I can see. I tried running on fewer cores on the 16-core 2.2GHz 6276, but performance was roughly half that of the 12-core 1.7GHz 6164 Opteron (in both chains per minute and chains per day). The PPS is 8300 with the 12-core and 7100 with the faster 16-core.



1l1l11ll1l (Legendary; Activity: 1274, Merit: 1000)
August 06, 2013, 12:03:42 AM  #1599
Last edit: August 06, 2013, 12:34:23 AM by 1l1l11ll1l

Also, in case anyone is curious:

24-core Opteron 6164HE 1.7GHz:
    "chainspermin" : 29,
    "chainsperday" : 1.67533939,
    "primespersec" : 8389,

32-core Opteron 6274 2.2GHz:
    "chainspermin" : 12,
    "chainsperday" : 0.71721642,
    "primespersec" : 7039,

4-core i7-2600K 3.4GHz:
    "chainspermin" : 8,
    "chainsperday" : 0.57826364,
    "primespersec" : 3170,

8-core Xeon L5520 2.26GHz:
    "chainspermin" : 8,
    "chainsperday" : 0.73978522,
    "primespersec" : 3628,

8-core Xeon L5420 2.5GHz:
    "chainspermin" : 14,
    "chainsperday" : 0.96020906,
    "primespersec" : 3490,

8-core Xeon X5355 2.66GHz:
    "chainspermin" : 15,
    "chainsperday" : 1.00721642,
    "primespersec" : 3670,

4-core Xeon 5160 3.0GHz:
    "chainspermin" : 7,
    "chainsperday" : 0.50713449,
    "primespersec" : 1859,

4-core Xeon 5130 2.0GHz:
    "chainspermin" : 6,
    "chainsperday" : 0.34404084,
    "primespersec" : 1267,

Core 2 Duo E6300 1.86GHz:
    "chainspermin" : 3,
    "chainsperday" : 0.15991434,
    "primespersec" : 587,

All systems running 64-bit HP9.
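A quick way to read those numbers is primes per second per core-GHz, which normalizes out core count and clock speed. A sketch using three of the figures from the list above:

```python
# primespersec / (cores * GHz), figures copied from the benchmark list above.
systems = {
    "24-core Opteron 6164HE 1.7GHz": (8389, 24, 1.7),
    "32-core Opteron 6274 2.2GHz":   (7039, 32, 2.2),
    "4-core i7-2600K 3.4GHz":        (3170, 4, 3.4),
}
per_core_ghz = {name: pps / (cores * ghz)
                for name, (pps, cores, ghz) in systems.items()}
for name, rate in sorted(per_core_ghz.items(), key=lambda kv: -kv[1]):
    print("{}: {:.1f} PPS per core-GHz".format(name, rate))
```

The 6274 works out to roughly 100 PPS per core-GHz against roughly 206 for the 6164HE, i.e. each of the newer cores does about half the per-clock work, which fits the L1-cache discussion earlier in the thread.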

Dsfyu (Member; Activity: 75, Merit: 10)
August 06, 2013, 01:41:14 AM  #1600

> I found hyper-threading adds no performance increase on my end, so I run 4 threads on a Sandy Bridge i7 and it's faster.
>
> Are you running Windows? If so, which version? Hyper-threading performance depends on the CPU scheduler and lots of other things. The CPU scheduler in Windows isn't that great in my experience, but I haven't witnessed it actually being detrimental.

I'm seeing something similar on my end right now, though not as bad. On my 3930K, with genproclimit set to 6 I get 2517 PPS and 1.2 CPD; with genproclimit set to 12 I get 2900 PPS and 1.4 CPD. Something seems wrong with that. With genproclimit set to 1 I get about 450 PPS / 0.23 CPD, so if performance scaled linearly I would be getting ~5000 PPS and 2.7 or 2.8 CPD. Yes, I know I should never expect perfect scaling, but performance seems to scale roughly linearly until hyper-threading gets involved, and then it drops off steeply.

Edit: I just tried a few values between 6 and 12 and got at most a 100 PPS increase from one to the next, and in some cases no significant increase at all (going from 9 to 10 only went from 2784 to 2832).
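Those figures can be turned into a scaling-efficiency number: measured PPS divided by (thread count times the single-thread PPS). A quick check using the values from the post above (the 3930K has 6 physical cores and 12 hardware threads):

```python
# Efficiency relative to perfect linear scaling, using the 3930K numbers
# reported above.
baseline = 450                      # PPS at genproclimit=1
measured = {6: 2517, 12: 2900}      # PPS at higher genproclimit values
efficiency = {n: pps / (n * baseline) for n, pps in measured.items()}
for n in sorted(efficiency):
    print("{:2d} threads: {:.0%} of linear scaling".format(n, efficiency[n]))
```

Six threads land at roughly 93% of linear scaling while twelve land near 54%, which quantifies the steep drop-off once hyper-threading gets involved.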
