Bitcoin Forum
Author Topic: [XPM] Primecoin Built-in Miner Sieve Performance Issue  (Read 69102 times)
Chemisist
Member
**
Offline Offline

Activity: 99
Merit: 10



View Profile
July 12, 2013, 11:16:41 PM
 #581

Ohh... that's clever.  The setgenerate true -1 command just sets it to the number of processors.  Is this the same effect that you get from setting thread-concurrency to something like 3-4 times the number of stream processors on AMD cards?  I'm thinking yes...
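
For anyone following along, here is a rough C++ sketch of what "-1 means the number of processors" amounts to. It is not taken from the Primecoin source: `resolveThreadCount` is a made-up name, and using `std::thread::hardware_concurrency()` is only my assumption about how a client could resolve the value.

```cpp
#include <thread>

// Hypothetical helper (not from the Primecoin source): turn a
// genproclimit-style setting into a concrete number of mining threads.
unsigned resolveThreadCount(int genProcLimit) {
    if (genProcLimit < 0) {
        // Negative means "auto": use the logical processor count.
        // hardware_concurrency() may return 0 when it cannot tell,
        // so fall back to a single thread in that case.
        unsigned n = std::thread::hardware_concurrency();
        return n == 0 ? 1u : n;
    }
    // Non-negative values are taken literally, so 80 on an 8-core chip
    // simply means 80 threads competing for 8 cores.
    return static_cast<unsigned>(genProcLimit);
}
```

With this sketch, `resolveThreadCount(-1)` gives the processor count of the machine, while `resolveThreadCount(80)` gives 80 regardless of how many cores exist.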

(edit: on second thought, I shouldn't have clicked "quote" on that long post... )

btc 1ChemaH12nRmd75M8BmPSiqd8x7B2wxFNF     ltc LaWX7jgJDyQ2oFaQYJvo5kqC1e1KYPoCfd     xpm Ab8NSgxHgGUJvHgSHYqMYBMWai6ZdsA91s
UNOE
Sr. Member
****
Offline Offline

Activity: 791
Merit: 271


This is personal


View Profile
July 12, 2013, 11:18:09 PM
 #582


16:00:58

getmininginfo


16:00:58

{
"blocks" : 24436,
"currentblocksize" : 1000,
"currentblocktx" : 0,
"errors" : "",
"generate" : true,
"genproclimit" : 80,
"primespersec" : 2359,
"pooledtx" : 0,
"testnet" : false
}


AMD FX 8120

Perhaps PPS isn't the actual goal after all, since I think this measurement is largely misunderstood.

I'd love to know how you are getting such a high PPS number on an FX 8120.

I think he's using Chemisist's 2nd release, which he posted about on the last page.

PoolMinor
Legendary
*
Offline Offline

Activity: 1843
Merit: 1338


XXXVII Fnord is toast without bread


View Profile
July 12, 2013, 11:18:29 PM
 #583



Ohh... that's clever.  The setgenerate true -1 command just sets it to the number of processors.  Is this the same effect that you get from setting thread-concurrency to something like 3-4 times the number of stream processors on AMD cards?  I'm thinking yes...


I am not finding any more blocks; I have not found any yet at these "higher" settings.

Btc=C2MF       Free BTC Poker
Being defeated is often a temporary condition. Giving up is what makes it permanent. -Marilyn vos Savant
altsay
Sr. Member
****
Offline Offline

Activity: 359
Merit: 250


View Profile
July 12, 2013, 11:19:41 PM
 #584


16:00:58

getmininginfo


16:00:58

{
"blocks" : 24436,
"currentblocksize" : 1000,
"currentblocktx" : 0,
"errors" : "",
"generate" : true,
"genproclimit" : 80,
"primespersec" : 2359,
"pooledtx" : 0,
"testnet" : false
}


AMD FX 8120

Perhaps PPS isn't the actual goal after all, since I think this measurement is largely misunderstood.

That's the conclusion that I've reached also.  I'm comparing the actual number of blocks generated over 10 minutes (though I should probably do it for longer) on the testnet between production Primecoin code and what I'm working on.

Though I should point out that with each change in genproclimit I saw better results up to 10x actual cores.

15:43:56
?
{
"blocks" : 24336,
"currentblocksize" : 18956,
"currentblocktx" : 1,
"errors" : "",
"generate" : true,
"genproclimit" : 320,
"primespersec" : 2196,
"pooledtx" : 1,
"testnet" : false
}


Mine is always -1 so far. How did you manage to increase that number?
Chemisist
Member
**
Offline Offline

Activity: 99
Merit: 10



View Profile
July 12, 2013, 11:20:26 PM
 #585

Alright, testing with the Core i7-950: 81/86 blocks found on testnet in 10 minutes with the current Sunny King code and 97/97 blocks found in 10 minutes with my code, with 8 threads running.  Testing this concept of overthreading now.

btc 1ChemaH12nRmd75M8BmPSiqd8x7B2wxFNF     ltc LaWX7jgJDyQ2oFaQYJvo5kqC1e1KYPoCfd     xpm Ab8NSgxHgGUJvHgSHYqMYBMWai6ZdsA91s
LazyOtto
Sr. Member
****
Offline Offline

Activity: 476
Merit: 250


View Profile
July 12, 2013, 11:21:22 PM
 #586

I think he's using Chemisist's 2nd release, which he posted about on the last page.
But he is actually pointing out something more interesting.

With proclimit == number-of-cores the cpu utilization will be 100%.

With proclimit > number-of-cores the same amount of cpu is being used, but the reported pps is higher.
PoolMinor
Legendary
*
Offline Offline

Activity: 1843
Merit: 1338


XXXVII Fnord is toast without bread


View Profile
July 12, 2013, 11:23:45 PM
 #587


16:00:58

getmininginfo


16:00:58

{
"blocks" : 24436,
"currentblocksize" : 1000,
"currentblocktx" : 0,
"errors" : "",
"generate" : true,
"genproclimit" : 80,
"primespersec" : 2359,
"pooledtx" : 0,
"testnet" : false
}


AMD FX 8120

Perhaps PPS isn't the actual goal after all, since I think this measurement is largely misunderstood.

That's the conclusion that I've reached also.  I'm comparing the actual number of blocks generated over 10 minutes (though I should probably do it for longer) on the testnet between production Primecoin code and what I'm working on.

Though I should point out that with each change in genproclimit I saw better results up to 10x actual cores.

15:43:56
?
{
"blocks" : 24336,
"currentblocksize" : 18956,
"currentblocktx" : 1,
"errors" : "",
"generate" : true,
"genproclimit" : 320,
"primespersec" : 2196,
"pooledtx" : 1,
"testnet" : false
}


Mine is always -1 so far. How did you manage to increase that number?

setgenerate true 80 ... or 1048576 <or any other number you wish>

Btc=C2MF       Free BTC Poker
Being defeated is often a temporary condition. Giving up is what makes it permanent. -Marilyn vos Savant
PoolMinor
Legendary
*
Offline Offline

Activity: 1843
Merit: 1338


XXXVII Fnord is toast without bread


View Profile
July 12, 2013, 11:26:36 PM
 #588

I think he's using Chemisist's 2nd release, which he posted about on the last page.
But he is actually pointing out something more interesting.

With proclimit == number-of-cores the cpu utilization will be 100%.

With proclimit > number-of-cores the same amount of cpu is being used, but the reported pps is higher.

I used the same process with YAC and was told I was wasting hash power by splitting the work into "micro-threads", which supposedly made it more difficult to solve. Or perhaps the person who said this didn't want me to use the tactic and used it themselves.

See the post here:

https://bitcointalk.org/index.php?topic=196196.msg2090126#msg2090126

Btc=C2MF       Free BTC Poker
Being defeated is often a temporary condition. Giving up is what makes it permanent. -Marilyn vos Savant
Chemisist
Member
**
Offline Offline

Activity: 99
Merit: 10



View Profile
July 12, 2013, 11:27:17 PM
 #589

I think he's using Chemisist's 2nd release, which he posted about on the last page.
But he is actually pointing out something more interesting.

With proclimit == number-of-cores the cpu utilization will be 100%.

With proclimit > number-of-cores the same amount of cpu is being used, but the reported pps is higher.

That's not possible, since the pps is based on the actual number of prime candidates processed.  This is correlated with block finding but apparently not equivalent to it.

btc 1ChemaH12nRmd75M8BmPSiqd8x7B2wxFNF     ltc LaWX7jgJDyQ2oFaQYJvo5kqC1e1KYPoCfd     xpm Ab8NSgxHgGUJvHgSHYqMYBMWai6ZdsA91s
itod
Legendary
*
Offline Offline

Activity: 1974
Merit: 1075


^ Will code for Bitcoins


View Profile
July 12, 2013, 11:27:37 PM
 #590

setgenerate true 80 ... or 1048576 <or any other number you wish>

Thinking out of the box, congrats.

Now waiting for Windows binaries of Chemisist's newest version with improved thread handling.
anonppcoin
Newbie
*
Offline Offline

Activity: 48
Merit: 0


View Profile
July 12, 2013, 11:29:35 PM
 #591

Updated Windows build using the new Chemisist source. Tuned for Intel Sandy Bridge and Ivy Bridge but compatible with other architectures.

https://www.dropbox.com/s/4k0xmuajxf5i4ly/primecoin0712v3-avx.zip

I'm seeing lower PPS than my v2 builds but I think that weaving will be better overall.
redphlegm
Sr. Member
****
Offline Offline

Activity: 246
Merit: 250


My spoon is too big!


View Profile
July 12, 2013, 11:30:06 PM
 #592

Alright, so I just updated my version (currently on github) such that each thread has an independent evolving weave timing parameter.  To compare mine with Sunny's most recent update, I used the testnet, where my version found 30 confirmed blocks in 10 minutes while the original code found 16 confirmed blocks.  I feel that this is a legitimate comparison because no other nodes were mining on the testnet at the time (I know this because my client found every consecutive block in both cases).  This comparison was performed on an IBM T61p laptop with a T9300 Core 2 Duo processor.  The current difficulty on the testnet is 5.4426.

Going to test this with the 8 threads on my Core i7 next.

Mind linking to your github? The speed this thread is updating is a bit overwhelming. Thanks in advance.

Whiskey Fund: (BTC) 1whiSKeYMRevsJMAQwU8NY1YhvPPMjTbM | (Ψ) ALcoHoLsKUfdmGfHVXEShtqrEkasihVyqW
AgentME
Member
**
Offline Offline

Activity: 84
Merit: 10


View Profile
July 12, 2013, 11:30:23 PM
 #593

Agreed. I noticed earlier if you cap off the sieve weaving time to almost nothing, you can easily get absurdly high PPS values but you won't actually earn blocks faster. There's a trade-off that needs to be analyzed closer.

The high pps number is due to the very low hard cap on the time set to check the actual sieve that has been produced (it's set to 10 ms in the current master branch on github, line 372 in prime.cpp).  So with the very short weaving time of whatever you decide to set, the sieve has a very large number of prime candidates, most of which satisfy the following check:
Code:
if(TargetGetLength(nProbablePrimeChainLength) >= 1)
     nPrimesHit++;

but many of which are not actually primes.  Anyway, I'm currently testing my code against Sunny's on the testnet (with the large thread count issue potentially fixed, fingers crossed) to see which can find more blocks in 10 minutes on my T9300 laptop.  Results to come shortly.
Isn't nProbablePrimeChainLength always zero if N+1 and N-1 both fail the FermatProbablePrimalityTest in ProbableCunninghamChainTest? Or does that get a ton of false positives when the sieve isn't woven much? (I imagine I probably just answered my own question.)

Anyway, good luck with finding the sweet spot in the trade-off!
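
To make the trade-off a bit more concrete, here is a toy C++ sketch. It is NOT the Primecoin sieve (no Cunningham chains, no Fermat test, no primorial); it is an ordinary sieve over small integers that is cut off after a chosen number of sieve primes, which is my stand-in for a short weave. `partialSieve` and `SieveResult` are names I made up for illustration. The point is only that a shorter weave leaves more surviving candidates, and a smaller share of them are actually prime.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Trial-division primality check; fine for the small numbers used here.
static bool isPrime(uint32_t n) {
    if (n < 2) return false;
    for (uint64_t d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

struct SieveResult {
    uint32_t survivors;     // candidates left after the (partial) sieve
    uint32_t actualPrimes;  // survivors that really are prime
};

// Sieve the range [2, limit) using only the first `sievePrimeCount` primes.
// A small count plays the role of a weave that was cut short.
SieveResult partialSieve(uint32_t limit, uint32_t sievePrimeCount) {
    assert(limit >= 2);
    std::vector<bool> composite(limit, false);
    uint32_t used = 0;
    for (uint32_t p = 2; p < limit && used < sievePrimeCount; ++p) {
        if (!isPrime(p)) continue;
        ++used;
        // Cross off multiples of p, keeping p itself as a candidate.
        for (uint64_t m = 2ULL * p; m < limit; m += p)
            composite[static_cast<std::size_t>(m)] = true;
    }
    SieveResult r{0, 0};
    for (uint32_t n = 2; n < limit; ++n) {
        if (composite[n]) continue;
        ++r.survivors;
        if (isPrime(n)) ++r.actualPrimes;
    }
    return r;
}
```

For example, `partialSieve(10000, 2)` (sieving with 2 and 3 only) leaves far more survivors than `partialSieve(10000, 25)` (sieving with every prime below 100), yet both contain the same number of real primes. More candidates per second is not the same thing as more useful candidates.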
altsay
Sr. Member
****
Offline Offline

Activity: 359
Merit: 250


View Profile
July 12, 2013, 11:33:32 PM
 #594

setgenerate true 80 ... or 1048576 <or any other number you wish>

Thinking out of the box, congrats.

Now waiting for Windows binaries of Chemisist's newest version with improved thread handling.

But it slows down the computer much more than it did at -1. I hope that doesn't increase the chance of orphans.
PoolMinor
Legendary
*
Offline Offline

Activity: 1843
Merit: 1338


XXXVII Fnord is toast without bread


View Profile
July 12, 2013, 11:35:52 PM
 #595

I could use some fine tuning for the AMD FX series; I have seen others asking for a release specific to this chip but have not seen any available.

Otherwise, for now I will keep testing the overthreading. For PPS rate, 10 threads per core (for me genproclimit=80) has been best so far.


16:30:45

{
"blocks" : 24598,
"currentblocksize" : 1000,
"currentblocktx" : 0,
"errors" : "",
"generate" : true,
"genproclimit" : 80,
"primespersec" : 2492,
"pooledtx" : 0,
"testnet" : false
}

Btc=C2MF       Free BTC Poker
Being defeated is often a temporary condition. Giving up is what makes it permanent. -Marilyn vos Savant
AgentME
Member
**
Offline Offline

Activity: 84
Merit: 10


View Profile
July 12, 2013, 11:36:11 PM
 #596

Alright, so I just updated my version (currently on github) such that each thread has an independent evolving weave timing parameter.  To compare mine with Sunny's most recent update, I used the testnet, where my version found 30 confirmed blocks in 10 minutes while the original code found 16 confirmed blocks.  I feel that this is a legitimate comparison because no other nodes were mining on the testnet at the time (I know this because my client found every consecutive block in both cases).  This comparison was performed on an IBM T61p laptop with a T9300 Core 2 Duo processor.  The current difficulty on the testnet is 5.4426.
Why make it a weave timing parameter and not just a weave count parameter? I think that would be a better metric, as a change in CPU load means the timing parameter's results will change a lot.
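
A minimal sketch of the difference, in C++. Everything here is hypothetical: `ToySieve`, `weaveForDuration` and `weaveForCount` are illustrative names, and `weaveStep()` is a dummy stand-in for one pass of the real Weave() loop, not the Primecoin code.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical stand-in for a sieve; each step just bumps a counter.
struct ToySieve {
    std::size_t stepsDone = 0;
    // Returns true while more weaving work remains (always, in this toy).
    bool weaveStep() { ++stepsDone; return true; }
};

// Time-bounded weave: how many steps get done depends on CPU load,
// so the same budget gives different sieve quality from run to run.
std::size_t weaveForDuration(ToySieve& s, std::chrono::milliseconds budget) {
    const auto deadline = std::chrono::steady_clock::now() + budget;
    const std::size_t start = s.stepsDone;
    while (std::chrono::steady_clock::now() < deadline && s.weaveStep()) {
    }
    return s.stepsDone - start;
}

// Count-bounded weave: always performs the same number of steps,
// no matter how many other threads are fighting for the core.
std::size_t weaveForCount(ToySieve& s, std::size_t maxSteps) {
    const std::size_t start = s.stepsDone;
    for (std::size_t i = 0; i < maxSteps && s.weaveStep(); ++i) {
    }
    return s.stepsDone - start;
}
```

With 80 threads on 8 cores, each thread gets a fraction of the wall-clock time, so a fixed 10 ms budget buys far fewer steps than it does with 8 threads. A count-based limit would keep the amount of weaving per sieve constant.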

Some of us on #eligius-prime were able, with Luke's help and others', to get it running. Now I'm just waiting to see if I can actually get a block.

[image]

Can you share your source code?  Did you modify Sunny's algorithm at all?
I think the biggest change in Luke's miner is that it moves the bnTwoInverse calculation out of Weave() and just pre-calculates it for all of the primes in GeneratePrimeTable(). I didn't get much more performance out of porting that change to primecoin but I didn't check too hard.
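
For what it's worth, the idea of pre-calculating can be sketched like this in C++. I am assuming that bnTwoInverse means the inverse of 2 modulo each sieve prime, as the name suggests; `precomputeTwoInverse` is a made-up name and this is neither Luke's nor Sunny's code. For an odd prime p the inverse of 2 is simply (p + 1) / 2, since 2 * (p + 1) / 2 = p + 1, which is 1 mod p.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: build the table of 2^-1 mod p once, next to the
// prime table, instead of recomputing the inverse on every weave.
std::vector<uint32_t> precomputeTwoInverse(const std::vector<uint32_t>& oddPrimes) {
    std::vector<uint32_t> inverses;
    inverses.reserve(oddPrimes.size());
    for (uint32_t p : oddPrimes) {
        assert(p > 2 && p % 2 == 1);  // 2 has no inverse modulo 2
        // (p + 1) / 2 written as p / 2 + 1 so it cannot overflow.
        inverses.push_back(p / 2 + 1);
    }
    return inverses;
}
```

The weave loop would then look the value up by prime index rather than calling a big-number modular inverse each time. Whether that saves much depends on how large a share of the weave time the inverse actually was.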
Chemisist
Member
**
Offline Offline

Activity: 99
Merit: 10



View Profile
July 12, 2013, 11:37:29 PM
 #597

Alright, so I just updated my version (currently on github) such that each thread has an independent evolving weave timing parameter.  To compare mine with Sunny's most recent update, I used the testnet, where my version found 30 confirmed blocks in 10 minutes while the original code found 16 confirmed blocks.  I feel that this is a legitimate comparison because no other nodes were mining on the testnet at the time (I know this because my client found every consecutive block in both cases).  This comparison was performed on an IBM T61p laptop with a T9300 Core 2 Duo processor.  The current difficulty on the testnet is 5.4426.

Going to test this with the 8 threads on my Core i7 next.

Mind linking to your github? The speed this thread is updating is a bit overwhelming. Thanks in advance.

Updated my profile website to link directly to it, just FYI, so you don't have to keep coming back here...

https://github.com/Chemisist/primecoin

btc 1ChemaH12nRmd75M8BmPSiqd8x7B2wxFNF     ltc LaWX7jgJDyQ2oFaQYJvo5kqC1e1KYPoCfd     xpm Ab8NSgxHgGUJvHgSHYqMYBMWai6ZdsA91s
AgentME
Member
**
Offline Offline

Activity: 84
Merit: 10


View Profile
July 12, 2013, 11:38:23 PM
 #598

I think he's using Chemisist's 2nd release, which he posted about on the last page.
But he is actually pointing out something more interesting.

With proclimit == number-of-cores the cpu utilization will be 100%.

With proclimit > number-of-cores the same amount of cpu is being used, but the reported pps is higher.
That's because less time is spent weaving with the threads fighting each other, and more false-positives are counted by the primespersec value.
anonppcoin
Newbie
*
Offline Offline

Activity: 48
Merit: 0


View Profile
July 12, 2013, 11:39:20 PM
 #599

Why make it a weave timing parameter and not just a weave count parameter? I think that would be a better metric, as a change in CPU load means the timing parameter's results will change a lot.


Agreed.
altsay
Sr. Member
****
Offline Offline

Activity: 359
Merit: 250


View Profile
July 12, 2013, 11:41:52 PM
 #600

I think he's using Chemisist's 2nd release, which he posted about on the last page.
But he is actually pointing out something more interesting.

With proclimit == number-of-cores the cpu utilization will be 100%.

With proclimit > number-of-cores the same amount of cpu is being used, but the reported pps is higher.
That's because less time is spent weaving with the threads fighting each other, and more false-positives are counted by the primespersec value.

So you're saying that leaving the setgenerate value at its default of -1 is the best way to observe the real pps?