[XPM] Primecoin Built-in Miner Sieve Performance Issue

redphlegm

Sr. Member

Offline

Activity: 246
Merit: 250

My spoon is too big!

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 12, 2013, 11:43:17 PM

#601

Quote from: altsay on July 12, 2013, 11:41:52 PM

Quote from: AgentME on July 12, 2013, 11:38:23 PM

Quote from: LazyOtto on July 12, 2013, 11:21:22 PM

Quote from: UNOE on July 12, 2013, 11:18:09 PM

I think he's using Chemisist's 2nd release that he posted about on the last page

But he is actually pointing out something more interesting.

With proclimit == number-of-cores the cpu utilization will be 100%.

With proclimit > number-of-cores the same amount of cpu is being used, but the reported pps is higher.

That's because less time is spent weaving with the threads fighting each other, and more false-positives are counted by the primespersec value.

So you say leaving setgenerate value on its default which is -1 is the best way to observe the real pps?

I think Chemisist was checking the solved block rate on testnet over a 10-minute period. Have those results from the overthreading been tallied?

Whiskey Fund: (BTC) 1whiSKeYMRevsJMAQwU8NY1YhvPPMjTbM | (Ψ) ALcoHoLsKUfdmGfHVXEShtqrEkasihVyqW

LazyOtto

Sr. Member

Offline

Activity: 476
Merit: 250

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 12, 2013, 11:44:03 PM

#602

Quote from: altsay on July 12, 2013, 11:41:52 PM

So you say leaving setgenerate value on its default which is -1 is the best way to observe the real pps?

yes

Chemisist

Member

Offline

Activity: 99
Merit: 10

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 12, 2013, 11:44:20 PM

#603

Quote from: AgentME on July 12, 2013, 11:36:11 PM

Quote from: Chemisist on July 12, 2013, 10:35:25 PM

Alright, so I just updated my version (currently on GitHub) so that each thread has an independently evolving weave timing parameter. To compare mine with Sunny's most recent update, I used the testnet, where my version found 30 confirmed blocks in 10 minutes while the original code found 16 confirmed blocks. I feel that this is a legitimate comparison because there were no other nodes mining on the testnet at the time (I know this because my client found every consecutive block in both cases). This comparison was performed on an IBM T61p laptop with a T9300 Core 2 Duo processor. The current difficulty on the testnet is 5.4426.

Why make it a weave timing parameter and not just a weave count parameter? I think that would be a better metric, as a change in CPU load means the timing parameter's results will change a lot.

Quote from: Chemisist on July 12, 2013, 10:43:08 PM

Quote from: gateway on July 12, 2013, 10:38:54 PM

some of us on #eligius-prime were able with lukes help and others to get it running.. now im just waiting to see if i can actually get a block..

[image]

Can you share your source code? Did you modify Sunny's algorithm at all?

I think the biggest change in Luke's miner is that it moves the bnTwoInverse calculation out of Weave() and just pre-calculates it for all of the primes in GeneratePrimeTable(). I didn't get much more performance out of porting that change though to primecoin but I didn't check too hard.

Thanks for the update on Luke's code.
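As a rough illustration of the precalculation idea mentioned in the quote above (a sketch of my own, not Luke's or Sunny King's actual code): for an odd prime p, the inverse of 2 modulo p is (p + 1) / 2, so it can be stored once per prime when the table is built instead of being recomputed in every weave. The names PrimeEntry and BuildPrimeTable are made up for this example, and it uses plain 32-bit integers rather than big numbers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical table entry: a prime together with the inverse of 2 modulo it.
struct PrimeEntry {
    uint32_t prime;
    uint32_t twoInverse;  // satisfies (2 * twoInverse) % prime == 1
};

// Build a table of odd primes below `limit` with a simple sieve, storing the
// inverse of 2 for each one up front. For odd p, 2 * ((p + 1) / 2) = p + 1,
// which is 1 modulo p, so no modular-inverse routine is needed here.
std::vector<PrimeEntry> BuildPrimeTable(uint32_t limit) {
    std::vector<bool> composite(limit, false);
    std::vector<PrimeEntry> table;
    for (uint32_t n = 3; n < limit; n += 2) {
        if (composite[n]) continue;
        assert(n % 2 == 1);  // the (n + 1) / 2 shortcut only holds for odd n
        table.push_back({n, (n + 1) / 2});
        // Mark odd multiples of n, starting at n * n.
        for (uint64_t m = uint64_t(n) * n; m < limit; m += 2 * uint64_t(n))
            composite[m] = true;
    }
    return table;
}
```

A weave loop could then read entry.twoInverse directly instead of computing it on every pass.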

The rationale for the weave timing parameter instead of the weave count parameter is because the weave timing parameter requires only a call to "GetTimeMicros()" whereas determining the weave count parameter is far more intensive to calculate. To calculate it requires looping through all three arrays to find the values that are still false:

from prime.h:

Code:

    unsigned int GetCandidateCount()
    {
        unsigned int nCandidates = 0;
        for (unsigned int nMultiplier = 0; nMultiplier < nSieveSize; nMultiplier++)
        {
            if (!vfCompositeCunningham1[nMultiplier] ||
                !vfCompositeCunningham2[nMultiplier] ||
                !vfCompositeBiTwin[nMultiplier])
                nCandidates++;
        }
        return nCandidates;
    }

The vfComposite arrays all start out as "0", and when a value is found that is not a prime compatible with Sunny's algorithm, its entry gets set to 1. All vfComposite arrays are nMaxSieveSize in length:

Code:

static const unsigned int nMaxSieveSize = 1000000u;

So to calculate a weave count parameter requires 3 million boolean tests, plus the costs of a loop plus the increment operator for each time the if statement returns true.
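To make the cost difference concrete, here is a minimal, self-contained sketch (my own stand-in, not the prime.h code). CountCandidates mirrors the loop above over three std::vector<bool> arrays, while ElapsedMicros uses std::chrono::steady_clock in place of GetTimeMicros(). The struct and function names are hypothetical.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the three composite-flag arrays.
struct SieveFlags {
    std::vector<bool> cunningham1, cunningham2, biTwin;
    explicit SieveFlags(std::size_t n)
        : cunningham1(n, false), cunningham2(n, false), biTwin(n, false) {}
};

// O(n) scan: an index is still a candidate if at least one of the three
// arrays has not marked it composite. Up to three flag reads per index.
std::size_t CountCandidates(const SieveFlags& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.cunningham1.size(); ++i)
        if (!s.cunningham1[i] || !s.cunningham2[i] || !s.biTwin[i])
            ++count;
    return count;
}

// O(1) alternative: a single clock read, independent of the sieve size.
int64_t ElapsedMicros(std::chrono::steady_clock::time_point start) {
    using namespace std::chrono;
    return duration_cast<microseconds>(steady_clock::now() - start).count();
}
```

The scan grows with the sieve size, whereas the clock read costs the same no matter how large the arrays are.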

btc 1ChemaH12nRmd75M8BmPSiqd8x7B2wxFNF ltc LaWX7jgJDyQ2oFaQYJvo5kqC1e1KYPoCfd xpm Ab8NSgxHgGUJvHgSHYqMYBMWai6ZdsA91s

K1773R

Legendary

Offline

Activity: 1792
Merit: 1008

/dev/null

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 12, 2013, 11:45:20 PM

#604

Quote from: eule on July 12, 2013, 10:41:18 PM

Quote from: gateway on July 12, 2013, 10:38:54 PM

some of us on #eligius-prime were able with lukes help and others to get it running.. now im just waiting to see if i can actually get a block..

try testnet for tests!

./primecoind stop
./primecoind -testnet

i mined some blocks in -testnet in some minutes:
http://pastebin.com/GN1fafrm

[GPG Public Key]
BTC/DVC/TRC/FRC: 1K1773RbXRZVRQSSXe9N6N2MUFERvrdu6y ANC/XPM AK1773RTmRKtvbKBCrUu95UQg5iegrqyeA NMC: NK1773Rzv8b4ugmCgX789PbjewA9fL9Dy1 LTC: LKi773RBuPepQH8E6Zb1ponoCvgbU7hHmd EMC: EK1773RxUes1HX1YAGMZ1xVYBBRUCqfDoF BQC: bK1773R1APJz4yTgRkmdKQhjhiMyQpJgfN

Chemisist

Member

Offline

Activity: 99
Merit: 10

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 12, 2013, 11:46:29 PM

#605

Quote from: redphlegm on July 12, 2013, 11:43:17 PM

Quote from: altsay on July 12, 2013, 11:41:52 PM

Quote from: AgentME on July 12, 2013, 11:38:23 PM

Quote from: LazyOtto on July 12, 2013, 11:21:22 PM

Quote from: UNOE on July 12, 2013, 11:18:09 PM

I think he's using Chemisist's 2nd release that he posted about on the last page

But he is actually pointing out something more interesting.

With proclimit == number-of-cores the cpu utilization will be 100%.

With proclimit > number-of-cores the same amount of cpu is being used, but the reported pps is higher.

That's because less time is spent weaving with the threads fighting each other, and more false-positives are counted by the primespersec value.

So you say leaving setgenerate value on its default which is -1 is the best way to observe the real pps?

I think Chemisist was checking the solved block rate on testnet over a 10-minute period. Have those results from the overthreading been tallied?

I just ran one and with 40 threads on 8 cores giving me 61/62 confirmations over 10 minutes. There might be a maximum between 8 and 40, but I don't have the time right now to figure it out. Some friends just arrived so I am going to have to make an exit for the evening, unfortunately. I'll check in later tonight (maybe) or definitely tomorrow.


AgentME

Member

Offline

Activity: 84
Merit: 10

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 12, 2013, 11:49:32 PM

#606

Quote from: Chemisist on July 12, 2013, 11:44:20 PM

The rationale for the weave timing parameter instead of the weave count parameter is because the weave timing parameter requires only a call to "GetTimeMicros()" whereas determining the weave count parameter is far more intensive to calculate. To calculate it requires looping through all three arrays to find the values that are still false:

from prime.h:

Code:

    unsigned int GetCandidateCount()
    {
        unsigned int nCandidates = 0;
        for (unsigned int nMultiplier = 0; nMultiplier < nSieveSize; nMultiplier++)
        {
            if (!vfCompositeCunningham1[nMultiplier] ||
                !vfCompositeCunningham2[nMultiplier] ||
                !vfCompositeBiTwin[nMultiplier])
                nCandidates++;
        }
        return nCandidates;
    }

The vfComposite arrays all start out as "0", and when a value is found that is not a prime compatible with Sunny's algorithm, its entry gets set to 1. All vfComposite arrays are nMaxSieveSize in length:

Code:

static const unsigned int nMaxSieveSize = 1000000u;

So to calculate a weave count parameter requires 3 million boolean tests, plus the costs of a loop plus the increment operator for each time the if statement returns true.

No, I meant only a counter of how many times the Weave() function is called, not related to GetCandidateCount().

urubu

Member

Offline

Activity: 87
Merit: 10

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 12, 2013, 11:50:04 PM

#607

Quote from: anonppcoin on July 12, 2013, 06:52:41 PM

My latest Windows builds. From Chemisist source:

Tuned for Sandy and Ivy Intel Core processors (AVX), O3:

https://www.dropbox.com/s/18bgecwqzsmwsh2/primecoin0712v2-avx.zip

Ivy Bridge ONLY build:

https://www.dropbox.com/s/f7fu0u0yk4i09il/primecoin0712v2-ivyonly.zip

XPM: AR2BpBnitqXudN67Ncuc9FfYVT8u9jNe7a

Would your ivy bridge build be best for haswell?

Chemisist

Member

Offline

Activity: 99
Merit: 10

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 12, 2013, 11:53:22 PM

#608

Quote from: AgentME on July 12, 2013, 11:49:32 PM

Quote from: Chemisist on July 12, 2013, 11:44:20 PM

The rationale for the weave timing parameter instead of the weave count parameter is because the weave timing parameter requires only a call to "GetTimeMicros()" whereas determining the weave count parameter is far more intensive to calculate. To calculate it requires looping through all three arrays to find the values that are still false:

from prime.h:

Code:

    unsigned int GetCandidateCount()
    {
        unsigned int nCandidates = 0;
        for (unsigned int nMultiplier = 0; nMultiplier < nSieveSize; nMultiplier++)
        {
            if (!vfCompositeCunningham1[nMultiplier] ||
                !vfCompositeCunningham2[nMultiplier] ||
                !vfCompositeBiTwin[nMultiplier])
                nCandidates++;
        }
        return nCandidates;
    }

The vfComposite arrays all start out as "0", and when a value is found that is not a prime compatible with Sunny's algorithm, its entry gets set to 1. All vfComposite arrays are nMaxSieveSize in length:

Code:

static const unsigned int nMaxSieveSize = 1000000u;

So to calculate a weave count parameter requires 3 million boolean tests, plus the costs of a loop plus the increment operator for each time the if statement returns true.

No, I meant only a counter of how many times the Weave() function is called, not related to GetCandidateCount().

I don't call the Weave() function over and over and over like Sunny King's code does. I call it once and have a for loop inside the function, to eliminate the overhead of repeated function calls.


anonppcoin

Newbie

Offline

Activity: 48
Merit: 0

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 12, 2013, 11:55:42 PM

#609

Quote from: urubu on July 12, 2013, 11:50:04 PM

Quote from: anonppcoin on July 12, 2013, 06:52:41 PM

My latest Windows builds. From Chemisist source:

Tuned for Sandy and Ivy Intel Core processors (AVX), O3:

https://www.dropbox.com/s/18bgecwqzsmwsh2/primecoin0712v2-avx.zip

Ivy Bridge ONLY build:

https://www.dropbox.com/s/f7fu0u0yk4i09il/primecoin0712v2-ivyonly.zip

XPM: AR2BpBnitqXudN67Ncuc9FfYVT8u9jNe7a

Would your ivy bridge build be best for haswell?

The Ivy Bridge build will work well on Haswell. It doesn't have every instruction set available on Haswell, but most. I am probably done compiling for the night (yay, Friday!) but maybe another kind soul will build you a core-avx2 optimized daemon.
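For anyone unsure which build matches their CPU, a small sketch like the following can check at runtime. It relies on the GCC/Clang builtin __builtin_cpu_supports, which is only available on x86. The labels returned by PickBuild are placeholders of mine, not the names of the actual download archives.

```cpp
#include <string>

// Pure decision helper; the returned labels are illustrative placeholders.
std::string PickBuild(bool hasAvx2, bool hasAvx) {
    if (hasAvx2) return "avx2";
    if (hasAvx) return "avx";
    return "generic";
}

// Query the running CPU. Only GCC and Clang on x86 provide these builtins;
// on any other compiler or architecture this sketch returns "generic".
std::string DetectBuild() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return PickBuild(__builtin_cpu_supports("avx2") != 0,
                     __builtin_cpu_supports("avx") != 0);
#else
    return PickBuild(false, false);
#endif
}
```

A CPU without AVX would get "generic" here, which suggests it needs a build that does not assume AVX.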

redphlegm

Sr. Member

Offline

Activity: 246
Merit: 250

My spoon is too big!

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 12, 2013, 11:56:37 PM

#610

Quote from: anonppcoin on July 12, 2013, 11:55:42 PM

Quote from: urubu on July 12, 2013, 11:50:04 PM

Quote from: anonppcoin on July 12, 2013, 06:52:41 PM

My latest Windows builds. From Chemisist source:

Tuned for Sandy and Ivy Intel Core processors (AVX), O3:

https://www.dropbox.com/s/18bgecwqzsmwsh2/primecoin0712v2-avx.zip

Ivy Bridge ONLY build:

https://www.dropbox.com/s/f7fu0u0yk4i09il/primecoin0712v2-ivyonly.zip

XPM: AR2BpBnitqXudN67Ncuc9FfYVT8u9jNe7a

Would your ivy bridge build be best for haswell?

The Ivy Bridge build will work well on Haswell. It doesn't have every instruction set available on Haswell, but most. I am probably done compiling for the night (yay, Friday!) but maybe another kind soul will build you a core-avx2 optimized daemon.

How about my outdated Nehalem?


AgentME

Member

Offline

Activity: 84
Merit: 10

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 13, 2013, 12:00:09 AM

#611

Quote from: Chemisist on July 12, 2013, 11:53:22 PM

I don't call the Weave() function over and over and over like Sunny King's. I call it once and then have a for loop inside the function, to eliminate the overhead of continuous function calls

A little refactoring shouldn't stop a counter from being used instead of a timer.
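One way to read this suggestion, sketched under my own assumptions rather than taken from either codebase: keep the loop inside a single call, but stop after a fixed number of primes instead of after a time budget. WeaveByCount and its parameters are hypothetical names, and the sieve here is a plain multiples-of-p sieve, not Primecoin's Cunningham-chain sieve.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Mark multiples of the first `maxPrimes` entries of `primes` as composite.
// The loop bound is a count, so the amount of sieving done per call does not
// depend on CPU load, unlike a wall-clock budget.
// Returns how many table entries were processed.
std::size_t WeaveByCount(std::vector<bool>& composite,
                         const std::vector<uint32_t>& primes,
                         std::size_t maxPrimes) {
    std::size_t woven = 0;
    for (; woven < primes.size() && woven < maxPrimes; ++woven) {
        const uint32_t p = primes[woven];
        if (p == 0) continue;  // guard against a bad table entry
        for (std::size_t m = 2 * std::size_t(p); m < composite.size(); m += p)
            composite[m] = true;
    }
    return woven;
}
```

The count could then be tuned per thread in the same way the timing parameter is, with the difference that a given count always does the same amount of work.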

Chemisist

Member

Offline

Activity: 99
Merit: 10

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 13, 2013, 12:04:59 AM

#612

Quote from: AgentME on July 13, 2013, 12:00:09 AM

Quote from: Chemisist on July 12, 2013, 11:53:22 PM

I don't call the Weave() function over and over and over like Sunny King's. I call it once and then have a for loop inside the function, to eliminate the overhead of continuous function calls

A little refactoring shouldn't stop a counter from being used instead of a timer.

I haven't thought about doing it this way tbh.


tadakaluri

Hero Member

Offline

Activity: 616
Merit: 500

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 13, 2013, 12:07:32 AM

#613

Quote from: anonppcoin on July 12, 2013, 11:29:35 PM

Updated windows build using the new Chemisist source. Tuned for Intel Sandy and Ivy Bridge but compatible with other architecture.

https://www.dropbox.com/s/4k0xmuajxf5i4ly/primecoin0712v3-avx.zip

I'm seeing lower PPS than my v2 builds but I think that weaving will be better overall.

How do I use it? Overwrite the installed files, or run it from the downloaded folder itself?

fabrizziop

Hero Member

Offline

Activity: 506
Merit: 500

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 13, 2013, 12:17:37 AM

#614

Quote from: Chemisist on July 13, 2013, 12:04:59 AM

Quote from: AgentME on July 13, 2013, 12:00:09 AM

Quote from: Chemisist on July 12, 2013, 11:53:22 PM

I don't call the Weave() function over and over and over like Sunny King's. I call it once and then have a for loop inside the function, to eliminate the overhead of continuous function calls

A little refactoring shouldn't stop a counter from being used instead of a timer.

I haven't thought about doing it this way tbh.

I'm getting over 1600 PPS with the new version! Is that for real or what? I just compiled with -O2 -march=native.

https://www.dropbox.com/s/vx9wnzfws4zttg8/primecoin-chemisist-mod-v2-o2-amd.rar

it should run on most recent cpus.

Chemisist

Member

Offline

Activity: 99
Merit: 10

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 13, 2013, 12:25:14 AM

#615

Quote from: fabrizziop on July 13, 2013, 12:17:37 AM

Quote from: Chemisist on July 13, 2013, 12:04:59 AM

Quote from: AgentME on July 13, 2013, 12:00:09 AM

Quote from: Chemisist on July 12, 2013, 11:53:22 PM

I don't call the Weave() function over and over and over like Sunny King's. I call it once and then have a for loop inside the function, to eliminate the overhead of continuous function calls

A little refactoring shouldn't stop a counter from being used instead of a timer.

I haven't thought about doing it this way tbh.

I'm getting over 1600 PPS with the new version! Is that for real or what? I just compiled with -O2 -march=native.

Running mine versus the original on testnet shows that I mine 30 versus 16 with the original client in 10 minutes on a Core 2 Duo t9300. Running on an i7-950 on testnet generates 97 with mine and 81 with the original.
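Put as ratios, this is simple arithmetic on the figures above (the helper name is my own choosing):

```cpp
// Ratio of blocks found by the modified client to blocks found by the
// original client in the same test.
constexpr double Speedup(int modifiedBlocks, int originalBlocks) {
    return static_cast<double>(modifiedBlocks) / originalBlocks;
}

// Core 2 Duo T9300: 30 vs 16 blocks -> 1.875x
// i7-950:           97 vs 81 blocks -> about 1.20x
constexpr double kCore2Speedup = Speedup(30, 16);
constexpr double kI7Speedup = Speedup(97, 81);
```

So the gain on the older dual-core is much larger than on the i7-950, at least in these runs.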


romerun

Legendary

Offline

Activity: 1078
Merit: 1001

Bitcoin is new, makes sense to hodl.

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 13, 2013, 12:26:33 AM

#616

Although I get more pps from Chemisist's build, I have yet to find a block since switching from 1.1. It's been about 8 hours on 18 cores...

PoolMinor

Legendary

Offline

Activity: 1843
Merit: 1338

XXXVII Fnord is toast without bread

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 13, 2013, 12:50:49 AM

#617

This is my AMD Phenom II 710 X3 Unleashed to 4 cores.

Code:


13:48:22
"blocks" : 23683,
"generate" : true,
"genproclimit" : 3,
"primespersec" : 439,

16:39:46
"blocks" : 24634,
"generate" : true,
"genproclimit" : 3,
"primespersec" : 409,

16:39:58
setgenerate true 30

16:40:40
getmininginfo

16:40:40
{
"blocks" : 24639,
"currentblocksize" : 1000,
"currentblocktx" : 0,
"errors" : "",
"generate" : true,
"genproclimit" : 30,
"primespersec" : 624,
"pooledtx" : 0,
"testnet" : false
}

16:40:55
getmininginfo

16:40:55
{
"blocks" : 24641,
"currentblocksize" : 1000,
"currentblocktx" : 0,
"errors" : "",
"generate" : true,
"genproclimit" : 30,
"primespersec" : 624,
"pooledtx" : 0,
"testnet" : false
}

16:41:34
getprimespersec

16:41:34
903

17:20:25
getprimespersec

17:20:25
1043

17:21:35
getprimespersec

17:21:35
1073

B_tc=C²MF Free BTC Poker
Being defeated is often a temporary condition. Giving up is what makes it permanent. -Marilyn vos Savant

tinnvec

Newbie

Offline

Activity: 54
Merit: 0

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 13, 2013, 12:58:38 AM

#618

Quote from: PoolMinor on July 13, 2013, 12:50:49 AM

This is my AMD Phenom II 710 X3 Unleashed to 4 cores.

Looks right in line, my AMD Phenom II X4 920 is sitting around 1250 primes/sec

I also run on Linux, so I thought I'd share my little shell startup script in case others can use it:

Code:

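#!/bin/sh
# Start primecoind in the background from its install directory.
cd [INSERT PATH TO PRIMECOIND HERE]
./primecoind --daemon
# Re-run both commands every 2 seconds (watch's default interval).
watch './primecoind getbalance ; ./primecoind getmininginfo'
# Reached once watch exits: force-kill the daemon.
kill -9 $(pidof primecoind)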

This'll give you a little readout to watch your balance and miner info. When you quit (Ctrl+C), it will then kill the primecoind process for you.

drummerjdb666

Full Member

Offline

Activity: 244
Merit: 101

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 13, 2013, 01:10:09 AM

#619

Should I try using more threads than 8? It seems my 3770K won't go higher than 1700 pps, which is nice considering that when we started I was getting 400. I can't seem to get the 2k or 3k that other people are showing from their 3770Ks using the Ivy-only build, on Win7. I tried to compile on Ubuntu through my VM, but it seems I either failed or am using the wrong distro.

PoolMinor

Legendary

Offline

Activity: 1843
Merit: 1338

XXXVII Fnord is toast without bread

Re: [XPM] Primecoin Built-in Miner Sieve Performance Issue

July 13, 2013, 01:13:48 AM

#620

My point in showing those high PPS was to confirm or deny whether they had any bearing on finding more blocks or not. I have not found any blocks in the last 18 hours, only 5 total since start anyway. I made the higher thread count change within the last 3 hours and have not seen any difference, other than a higher number to look at.
