Bitcoin Forum
April 16, 2021, 10:49:07 PM *
Author Topic: [XPM] Primecoin Built-in Miner Sieve Performance Issue  (Read 69059 times)
Chemisist
Member
**
Offline Offline

Activity: 99
Merit: 10



View Profile
July 12, 2013, 08:42:40 PM
 #541

I also notice a decline in PPS using Chemisist's code compare to the official (0.11). Getting an average of about 1800 PPS vs 2200 with official. I'm only running 12 threads as well, so it's not just affecting users with high thread counts.

Well, you're running with more threads than I was able to test (the highest I got to was 8 on my Core i7-950), so I'm not entirely sure why this is. It could be that all threads are trying to access a single variable that determines how long the sieve is woven for. I suppose all these threads could end up blocking each other and cause a significant portion of idle time. It would be far more effective to have a sieve weaving time variable for each individual Boost thread, but I'm not sure how to do this, as I am entirely unfamiliar with the Boost library (I'm not a C++ programmer :( )
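For anyone wanting to try the per-thread idea, here is a minimal sketch, assuming a C++11 compiler: marking the variable thread_local gives each mining thread its own copy to tune. The names tlsSieveBuildTime, minerThread and runMiners, and the tuning rule itself, are placeholders and not code from prime.cpp. Note that volatile does not take any lock, so whether the shared variable is really what hurts high thread counts is unconfirmed.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical per-thread replacement for the shared
// `static volatile int sieveBuildTime`: every thread that touches this
// variable gets its own independent copy (C++11 thread_local).
static thread_local int tlsSieveBuildTime = 0;

// Stand-in for a miner thread: it tunes only its own copy and reports the
// final value through `out`. The tuning rule here is a placeholder.
void minerThread(int threadId, int rounds, int* out)
{
    for (int i = 0; i < rounds; ++i)
        tlsSieveBuildTime += threadId;  // no other thread sees this write
    *out = tlsSieveBuildTime;
}

// Launch `nThreads` workers and collect each thread's final value.
std::vector<int> runMiners(int nThreads, int rounds)
{
    assert(nThreads > 0 && rounds >= 0);
    std::vector<int> results(nThreads, 0);
    std::vector<std::thread> pool;
    for (int t = 0; t < nThreads; ++t)
        pool.emplace_back(minerThread, t + 1, rounds, &results[t]);
    for (std::thread& th : pool)
        th.join();
    return results;
}
```

Calling runMiners(4, 100) should return 100, 200, 300 and 400, which shows the copies do not interfere with each other. On pre-C++11 toolchains, boost::thread_specific_ptr provides a similar per-thread slot.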

Actually with my FX 8350 I get just a bit less speed with your code than the official 0.11, and with my sempron 145 I get around 20% less with your code. Maybe it doesn't work well for AMD architectures?.

I wish I had a better answer than "I don't know."  Unfortunately, I've done zero testing on the amd platform, sorry!

btc 1ChemaH12nRmd75M8BmPSiqd8x7B2wxFNF     ltc LaWX7jgJDyQ2oFaQYJvo5kqC1e1KYPoCfd     xpm Ab8NSgxHgGUJvHgSHYqMYBMWai6ZdsA91s
TheSwede75
Full Member
***
Offline Offline

Activity: 224
Merit: 100



View Profile
July 12, 2013, 08:46:15 PM
 #542

I also notice a decline in PPS using Chemisist's code compare to the official (0.11). Getting an average of about 1800 PPS vs 2200 with official. I'm only running 12 threads as well, so it's not just affecting users with high thread counts.

Well, you're running with more threads than I was able to test (highest I got to was 8 on my core i7-950), so I'm not entirely sure why this is, though it could be that all threads are trying to access a single variable which is determining how long to let the sieve be woven for.  I suppose that all these threads could end up blocking each other and cause a significant portion of idle time.  It would be far more effective to have a sieve weaving time variable for each individual boost thread, but I'm not sure how to do this as I am entirely unfamiliar with the Boost library (I'm not a c++ programmer  Sad )

Actually with my FX 8350 I get just a bit less speed with your code than the official 0.11, and with my sempron 145 I get around 20% less with your code. Maybe it doesn't work well for AMD architectures?.

I wish I had a better answer than "I don't know."  Unfortunately, I've done zero testing on the amd platform, sorry!

Ironically, my AMD K10 gets better speed with O2 than with the official build, while my 1100T gets much better speed with the O2 build. I also suspect memory speed plays a fairly large factor.
Vilepickle
Member
**
Offline Offline

Activity: 76
Merit: 10


View Profile
July 12, 2013, 08:49:10 PM
 #543

I also notice a decline in PPS using Chemisist's code compare to the official (0.11). Getting an average of about 1800 PPS vs 2200 with official. I'm only running 12 threads as well, so it's not just affecting users with high thread counts.

Well, you're running with more threads than I was able to test (highest I got to was 8 on my core i7-950), so I'm not entirely sure why this is, though it could be that all threads are trying to access a single variable which is determining how long to let the sieve be woven for.  I suppose that all these threads could end up blocking each other and cause a significant portion of idle time.  It would be far more effective to have a sieve weaving time variable (see line 11 in my version of the prime.cpp: static volatile int sieveBuildTime = 0;) for each individual boost thread, but I'm not sure how to do this as I am entirely unfamiliar with the Boost library (I'm not a c++ programmer  Sad )

Tried yours with 30 threads on one box, I'm only pulling about 1800pps.  Latest official repo nets around 4000.

BTC: 14PzAZCW1k8aA4FFFZ55LxizQqwBR969ee
mumus
Sr. Member
****
Offline Offline

Activity: 291
Merit: 250



View Profile
July 12, 2013, 08:51:22 PM
 #544

Please note primespersecond is not an accurate measure of actual performance. It has some correlations but if sieve round is reduced too short you could see inflated pps but not really faster performance. Only block rate is an accurate measure of true performance.

I'm also testing my Windows build of Chemisist's version, and I can report that there is indeed an improvement in the PPS rate. However, I can achieve the same or even better result by adding gensieveroundlimitms=500 to the primecoin.conf file and using the official 0.1.1 release. In Chemisist's code the default is set to 400, compared to 1000 in Sunny's release.
So I'm not 100% sure that these new optimizations really make any difference. I would suggest that everybody check these new versions using gensieveroundlimitms=1000 in their config file.
Based on Sunny's post lowering gensieveroundlimitms value may improve PPS but ....
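Since block rate is the measure the quote above recommends, here is a rough sketch of how two builds could be compared on blocks found rather than PPS. It assumes block discovery behaves like a Poisson process, so a count of n blocks carries an uncertainty of about sqrt(n). The struct and function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

struct RateEstimate {
    double blocksPerHour;  // point estimate
    double stdError;       // one-sigma uncertainty, same units
};

// Assumes block discovery is a Poisson process, so a count of n blocks has
// a standard deviation of about sqrt(n).
RateEstimate estimateBlockRate(unsigned blocksFound, double hours)
{
    assert(hours > 0.0);
    RateEstimate r;
    r.blocksPerHour = blocksFound / hours;
    r.stdError = std::sqrt(static_cast<double>(blocksFound)) / hours;
    return r;
}

// True only when the two estimates differ by more than `sigmas` combined
// standard errors; otherwise the runs are too short to rank the builds.
bool clearlyDifferent(const RateEstimate& a, const RateEstimate& b,
                      double sigmas = 2.0)
{
    double combined = std::sqrt(a.stdError * a.stdError +
                                b.stdError * b.stdError);
    return std::fabs(a.blocksPerHour - b.blocksPerHour) > sigmas * combined;
}
```

For example, 9 blocks versus 16 blocks in one hour each is not a clear difference under this test (the gap of 7 is smaller than two combined standard errors of 5), so short runs with a handful of blocks say little about which build is faster.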
TheSwede75
Full Member
***
Offline Offline

Activity: 224
Merit: 100



View Profile
July 12, 2013, 08:52:01 PM
 #545

I also notice a decline in PPS using Chemisist's code compare to the official (0.11). Getting an average of about 1800 PPS vs 2200 with official. I'm only running 12 threads as well, so it's not just affecting users with high thread counts.

Well, you're running with more threads than I was able to test (highest I got to was 8 on my core i7-950), so I'm not entirely sure why this is, though it could be that all threads are trying to access a single variable which is determining how long to let the sieve be woven for.  I suppose that all these threads could end up blocking each other and cause a significant portion of idle time.  It would be far more effective to have a sieve weaving time variable (see line 11 in my version of the prime.cpp: static volatile int sieveBuildTime = 0;) for each individual boost thread, but I'm not sure how to do this as I am entirely unfamiliar with the Boost library (I'm not a c++ programmer  Sad )

Tried yours with 30 threads on one box, I'm only pulling about 1800pps.  Latest official repo nets around 4000.

I suspect that optimizing the CPU miner code much further is of little consequence to most people. Right now cluster instances and a few large miners are totally owning the coin with 100,000s+ PPS.

Next step in 'mainstreaming' the coin is a CUDA/GPU implementation.
kimosan
Hero Member
*****
Offline Offline

Activity: 644
Merit: 501


View Profile
July 12, 2013, 09:05:37 PM
Last edit: July 12, 2013, 09:28:34 PM by kimosan
 #546

-O2 generic build: (generally -O3 gives some errors at end but seems to run fine, I'll go the safe route first).

https://www.dropbox.com/s/stfb9t66tnp6yld/primecoin-chemisist-mod.rar

Been running your build for about 2 hrs. Found a block 10 minutes ago. i5 2500k 3-cores | 650-900 pps

(-O3 was slower)

Thank you for posting it.
anonppcoin
Newbie
*
Offline Offline

Activity: 48
Merit: 0


View Profile
July 12, 2013, 09:16:37 PM
 #547

k well either way.. for the noobs... if you install sunny's new client..

don't use this https://www.dropbox.com/s/f7fu0u0yk4i09il/primecoin0712v2-ivyonly.zip

it is making the wallet hang on splash screen..

I just copied the files into the wallet folder.. no worky

Are you sure you're on Ivy? Because 0710-avx is an mtune build while 0712v2 will only work on Ivy Bridge cpus.
dudeguy
Member
**
Offline Offline

Activity: 182
Merit: 10



View Profile
July 12, 2013, 09:16:57 PM
 #548

I also notice a decline in PPS using Chemisist's code compare to the official (0.11). Getting an average of about 1800 PPS vs 2200 with official. I'm only running 12 threads as well, so it's not just affecting users with high thread counts.

Well, you're running with more threads than I was able to test (highest I got to was 8 on my core i7-950), so I'm not entirely sure why this is, though it could be that all threads are trying to access a single variable which is determining how long to let the sieve be woven for.  I suppose that all these threads could end up blocking each other and cause a significant portion of idle time.  It would be far more effective to have a sieve weaving time variable (see line 11 in my version of the prime.cpp: static volatile int sieveBuildTime = 0;) for each individual boost thread, but I'm not sure how to do this as I am entirely unfamiliar with the Boost library (I'm not a c++ programmer  Sad )

Tried yours with 30 threads on one box, I'm only pulling about 1800pps.  Latest official repo nets around 4000.

I suspect that optimizing the CPU miner code much further is of little consequence to most people. Right now Cluster instances and a few large miners are totally owning the coin with 100.000s+ PPS.

Next step in 'mainstreaming' the coin is a CUDA/GPU implementation.

So basically we're FUBAR'd until GPU mining/mining pools come into play?
dudeguy
Member
**
Offline Offline

Activity: 182
Merit: 10



View Profile
July 12, 2013, 09:19:33 PM
 #549

k well either way.. for the noobs... if you install sunny's new client..

don't use this https://www.dropbox.com/s/f7fu0u0yk4i09il/primecoin0712v2-ivyonly.zip

it is making the wallet hang on splash screen..

I just copied the files into the wallet folder.. no worky

Are you sure you're on Ivy? Because 0710-avx is an mtune build while 0712v2 will only work on Ivy Bridge cpus.

Weird. I'm getting 50-100 higher PPS on your Sandy+Ivy V2 build than your Ivy only V2 build? Either way they are like the V2 Rockets of QT clients right now!


People need to donate to anonppcoin. He (assuming) put a lot of hard work into building.
drummerjdb666
Full Member
***
Offline Offline

Activity: 244
Merit: 101



View Profile
July 12, 2013, 09:22:42 PM
 #550

My latest Windows builds. From Chemisist source:

Tuned for Sandy and Ivy Intel Core processors (AVX), O3:

https://www.dropbox.com/s/18bgecwqzsmwsh2/primecoin0712v2-avx.zip


Ivy Bridge ONLY build:

https://www.dropbox.com/s/f7fu0u0yk4i09il/primecoin0712v2-ivyonly.zip

XPM: AR2BpBnitqXudN67Ncuc9FfYVT8u9jNe7a

If I use your build alone, I get the "cannot find port ... use listen=0" error. But if I copy your file over zalfrin's, it works and produces almost 2000 PPS on an Ivy Bridge 3770K :) Hopefully no more orphans.

oroqen
Sr. Member
****
Offline Offline

Activity: 280
Merit: 250



View Profile
July 12, 2013, 09:26:13 PM
 #551

k well either way.. for the noobs... if you install sunny's new client..

don't use this https://www.dropbox.com/s/f7fu0u0yk4i09il/primecoin0712v2-ivyonly.zip

it is making the wallet hang on splash screen..

I just copied the files into the wallet folder.. no worky

Are you sure you're on Ivy? Because 0710-avx is an mtune build while 0712v2 will only work on Ivy Bridge cpus.

Weird. I'm getting 50-100 higher PPS on your Sandy+Ivy V2 build than your Ivy only V2 build? Either way they are like the V2 Rockets of QT clients right now!


People need to donate to anonppcoin. He (assuming) put a lot of hard work into building.
There are several benchmarks on www.phoronix.com of newer versions of GCC with Ivy Bridge support, benchmarked against older versions. In my own experience, GCC 4.8 with core-avx-i was a performance loss, regardless of the software it compiled, compared to 4.6 with corei7-avx. I can only speak for Linux on that account; as always, your mileage may vary.
Chemisist
Member
**
Offline Offline

Activity: 99
Merit: 10



View Profile
July 12, 2013, 09:27:29 PM
 #552

After playing with the code a couple of hours, I think the PPS figure is misleading.

As far as I understood it, the algorithm is split in two parts : building a sieve, and trying each factor stored in the sieve with three different tests.
Each part has a variable execution time (determined by the number of prime factors you can find).
Recent modification of the code (and Chemisist's proposal) put a timeout in these parts so that the execution time invested in a possible solution is capped.
By capping these parts, there is a chance to abort the testing of a valid solution.

Thus, a complex trade-off arises : invest more time in the current candidate, or jump to the next candidate after a given period of time.
Chemisist suggests an adaptative approach (I just skimmed through your code, I may have misunderstood), that's very interesting Smiley

However, IMHO, that does only marginally improve the chances of finding a solution (read block), especially as difficulty rises.
Why ? Because the distribution of prime factors is very very difficult to predict.
Play along with the timeout values, this can lead to multiply or divide your PPS by 10.
But that does not improve your chances of finding a solution.

Anyway, here is my advice : do not focus on the PPS, it is not reliable measure of performance.

Thumbs up to Sunny King (and his team) for designing this coin. The proof-of-work proposed is brilliant and very interesting to play with.

I have a hard timeout (sieveBuildTime) on the weaving of the sieve (generally results in less than 1000-2000 prime candidates to check) and a soft timeout on the analysis of the resultant sieve (3* sieveBuildTime) which allows for all the prime candidates to be checked.  By completing the sieve (takes ~2-5 seconds depending on the machine) you reduce the 1000-2000 prime candidates down to 5-50.  However, the additional time taken to weave the sieve to reduce the numbers from (best case) 2000 -> 5 is far greater than the time it would take to check these additional 1995 prime candidates.  Thus, with my approach I try to equate the time taken to build the sieve with the time taken to analyze it, providing the hard cap for the weave and the soft cap for the checking.
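The two caps described above can be sketched roughly as follows. This is a minimal illustration, not the code from prime.cpp: runWithBudget, mineRound and the step callbacks are hypothetical, and the sketch enforces both caps the same way because the thread does not show how the soft cap differs in the real miner.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

typedef std::chrono::steady_clock Clock;

struct PhaseResult {
    unsigned steps;  // units of work completed
    bool finished;   // true if the work ran out before the budget did
};

// Calls `step` until it returns false (nothing left to do) or the time
// budget is spent. The clock is checked before every step, which makes
// this a hard cap at the granularity of one step.
PhaseResult runWithBudget(const std::function<bool()>& step,
                          std::chrono::milliseconds budget)
{
    assert(budget.count() >= 0);
    const Clock::time_point deadline = Clock::now() + budget;
    PhaseResult r = {0, false};
    while (Clock::now() < deadline) {
        if (!step()) { r.finished = true; break; }
        ++r.steps;
    }
    return r;
}

// One round in the spirit of the post: weave for at most `sieveBuildTime`,
// then test candidates for at most 3 * sieveBuildTime.
void mineRound(const std::function<bool()>& weaveStep,
               const std::function<bool()>& testStep,
               std::chrono::milliseconds sieveBuildTime,
               PhaseResult& weave, PhaseResult& check)
{
    weave = runWithBudget(weaveStep, sieveBuildTime);
    check = runWithBudget(testStep, 3 * sieveBuildTime);
}
```

Here a weave step would sieve by one more prime and a test step would check one surviving candidate. Each callback returns false once it has nothing left to do.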

Sunny, can you weigh in on this?

btc 1ChemaH12nRmd75M8BmPSiqd8x7B2wxFNF     ltc LaWX7jgJDyQ2oFaQYJvo5kqC1e1KYPoCfd     xpm Ab8NSgxHgGUJvHgSHYqMYBMWai6ZdsA91s
maco
Sr. Member
****
Offline Offline

Activity: 294
Merit: 250



View Profile
July 12, 2013, 09:45:22 PM
 #553

30 hours + mining ( not a single block mined ) I just started trying out Sunny's update: v0.1.1

On July 10, 2013 - I mined 5 blocks with 400 - 600 PPS .. note: I lost 4 orphans in that action, so it was technically 9 blocks mined, but 4 were lost, so 5 have been check marked. Anyways, the interesting thing is, I haven't touched a block. Anyone else getting the same results?
AgentME
Member
**
Offline Offline

Activity: 84
Merit: 10


View Profile
July 12, 2013, 09:50:03 PM
 #554

After playing with the code a couple of hours, I think the PPS figure is misleading.

As far as I understood it, the algorithm is split in two parts : building a sieve, and trying each factor stored in the sieve with three different tests.
Each part has a variable execution time (determined by the number of prime factors you can find).
Recent modification of the code (and Chemisist's proposal) put a timeout in these parts so that the execution time invested in a possible solution is capped.
By capping these parts, there is a chance to abort the testing of a valid solution.

Thus, a complex trade-off arises : invest more time in the current candidate, or jump to the next candidate after a given period of time.
Chemisist suggests an adaptative approach (I just skimmed through your code, I may have misunderstood), that's very interesting Smiley

However, IMHO, that does only marginally improve the chances of finding a solution (read block), especially as difficulty rises.
Why ? Because the distribution of prime factors is very very difficult to predict.
Play along with the timeout values, this can lead to multiply or divide your PPS by 10.
But that does not improve your chances of finding a solution.

Anyway, here is my advice : do not focus on the PPS, it is not reliable measure of performance.

Thumbs up to Sunny King (and his team) for designing this coin. The proof-of-work proposed is brilliant and very interesting to play with.
Agreed. I noticed earlier if you cap off the sieve weaving time to almost nothing, you can easily get absurdly high PPS values but you won't actually earn blocks faster. There's a trade-off that needs to be analyzed closer.
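To make the trade-off concrete, here is a back-of-the-envelope sketch under a simple additive model: time per round is the weaving time plus one primality test per surviving candidate. The model and the function names are assumptions for illustration, and the per-test cost would have to be measured on each machine.

```cpp
#include <cassert>

// Time for one round under a simple additive model (an assumption, not a
// measurement): weaving time plus one test per surviving candidate.
double roundTimeMs(double weaveMs, unsigned candidates,
                   double testMsPerCandidate)
{
    assert(weaveMs >= 0.0 && testMsPerCandidate >= 0.0);
    return weaveMs + candidates * testMsPerCandidate;
}

// Stopping the weave early pays off only when the extra tests it leaves
// behind cost less than the weaving time it saves.
bool shortWeaveIsFaster(double shortWeaveMs, unsigned shortCandidates,
                        double fullWeaveMs, unsigned fullCandidates,
                        double testMsPerCandidate)
{
    return roundTimeMs(shortWeaveMs, shortCandidates, testMsPerCandidate)
         < roundTimeMs(fullWeaveMs, fullCandidates, testMsPerCandidate);
}
```

Taking figures in the range mentioned in this thread (about 2000 candidates after a short weave, about 5 after a full weave of roughly 3 seconds) and an assumed 400 ms short weave, the short weave wins at a hypothetical 0.5 ms per test (1400 ms against about 3002 ms) but loses at 2 ms per test (4400 ms against 3010 ms). Both rounds cover the same candidates, so this compares real work done rather than PPS.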
jlspartz
Full Member
***
Offline Offline

Activity: 205
Merit: 100


View Profile
July 12, 2013, 09:59:33 PM
 #555

Got a quadro 5000 I'm waiting to use. Someone already working on cuda or opencl?
dudeguy
Member
**
Offline Offline

Activity: 182
Merit: 10



View Profile
July 12, 2013, 10:01:04 PM
 #556

30 hours + mining ( not a single block mined ) I just started trying out Sunny's update: v0.1.1

On July 10, 2013 - I mined 5 blocks with 400 - 600 PPS .. note: I lost 4 orphans in that action, so it was technically 9 blocks mined, but 4 were lost, so 5 have been check marked. Anyways, the interesting thing is, I haven't touched a block. Anyone else getting the same results?

Find an optimized QT client here for your processor. I'm on an i3 3225 and I'm getting more than you since the first optimized QT I downloaded. Prior to that though (July 10th) I was getting 0 blocks mined and was probably the most unlucky person mining primecoin.
altsay
Sr. Member
****
Offline Offline

Activity: 359
Merit: 250


View Profile
July 12, 2013, 10:10:05 PM
 #557

Play along with the timeout values, this can lead to multiply or divide your PPS by 10.
But that does not improve your chances of finding a solution.

What do you mean by timeout values?
drummerjdb666
Full Member
***
Offline Offline

Activity: 244
Merit: 101



View Profile
July 12, 2013, 10:12:42 PM
 #558

30 hours + mining ( not a single block mined ) I just started trying out Sunny's update: v0.1.1

On July 10, 2013 - I mined 5 blocks with 400 - 600 PPS .. note: I lost 4 orphans in that action, so it was technically 9 blocks mined, but 4 were lost, so 5 have been check marked. Anyways, the interesting thing is, I haven't touched a block. Anyone else getting the same results?

Find an optimized QT client here for your processor. I'm on an i3 3225 and I'm getting more than you since the first optimized QT I downloaded. Prior to that though (July 10th) I was getting 0 blocks mined and was probably the most unlucky person mining primecoin.

I have an i3 that has produced 6 blocks since the start... the second set of 3 were on the first mod client though, and got orphaned :( That was actually my gf's laptop... she was pissed lol! Only 63 primes instead of 100.
Chemisist
Member
**
Offline Offline

Activity: 99
Merit: 10



View Profile
July 12, 2013, 10:13:36 PM
 #559

After playing with the code a couple of hours, I think the PPS figure is misleading.

As far as I understood it, the algorithm is split in two parts : building a sieve, and trying each factor stored in the sieve with three different tests.
Each part has a variable execution time (determined by the number of prime factors you can find).
Recent modification of the code (and Chemisist's proposal) put a timeout in these parts so that the execution time invested in a possible solution is capped.
By capping these parts, there is a chance to abort the testing of a valid solution.

Thus, a complex trade-off arises : invest more time in the current candidate, or jump to the next candidate after a given period of time.
Chemisist suggests an adaptative approach (I just skimmed through your code, I may have misunderstood), that's very interesting Smiley

However, IMHO, that does only marginally improve the chances of finding a solution (read block), especially as difficulty rises.
Why ? Because the distribution of prime factors is very very difficult to predict.
Play along with the timeout values, this can lead to multiply or divide your PPS by 10.
But that does not improve your chances of finding a solution.

Anyway, here is my advice : do not focus on the PPS, it is not reliable measure of performance.

Thumbs up to Sunny King (and his team) for designing this coin. The proof-of-work proposed is brilliant and very interesting to play with.
Agreed. I noticed earlier if you cap off the sieve weaving time to almost nothing, you can easily get absurdly high PPS values but you won't actually earn blocks faster. There's a trade-off that needs to be analyzed closer.

The high pps number is due to the very low hard cap on the time set to check the actual sieve that has been produced (it's set to 10 ms in the current master branch on github, line 372 in prime.cpp).  So with the very short weaving time of whatever you decide to set, the sieve has a very large number of prime candidates, most of which satisfy the following check:
Code:
if(TargetGetLength(nProbablePrimeChainLength) >= 1)
     nPrimesHit++;

but many of which are not actually primes.  Anyway, I'm currently testing my code against Sunny's on the testnet (with the large thread count issue potentially fixed, fingers crossed) to see which can find more blocks in 10 minutes on my T9300 laptop.  Results to come shortly
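A simplified sketch of why the counter above can run far ahead of real progress: it counts every candidate whose probable chain length is at least 1, while a block needs a chain that reaches the target length. The sketch uses plain integer lengths instead of whatever encoding TargetGetLength unpacks, and RoundStats and tally are made-up names.

```cpp
#include <cassert>
#include <vector>

struct RoundStats {
    unsigned primesHit;    // candidates with probable chain length >= 1
    unsigned chainsFound;  // candidates whose chain reaches the target
};

// `chainLengths` holds the probable prime chain length measured for each
// candidate that survived the sieve (0 means no probable prime at all).
RoundStats tally(const std::vector<unsigned>& chainLengths,
                 unsigned targetLength)
{
    assert(targetLength >= 1);
    RoundStats s = {0, 0};
    for (unsigned len : chainLengths) {
        if (len >= 1) ++s.primesHit;               // what the PPS figure counts
        if (len >= targetLength) ++s.chainsFound;  // what can win a block
    }
    return s;
}
```

With lengths {0, 1, 1, 2, 1, 7} and a target of 7, this gives primesHit = 5 but chainsFound = 1. A barely woven sieve passes many more length-1 candidates through, which raises the first number without necessarily raising the second.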

btc 1ChemaH12nRmd75M8BmPSiqd8x7B2wxFNF     ltc LaWX7jgJDyQ2oFaQYJvo5kqC1e1KYPoCfd     xpm Ab8NSgxHgGUJvHgSHYqMYBMWai6ZdsA91s
reb0rn21
Legendary
*
Offline Offline

Activity: 1859
Merit: 1024


View Profile
July 12, 2013, 10:15:03 PM
 #560

Anyone with Haswell?
I am using primecoin0712v2-ivyonly.zip so far
