Bitcoin Forum
December 12, 2024, 11:37:11 PM *
News: Latest Bitcoin Core release: 28.0 [Torrent]
 
   Home   Help Search Login Register More  
Pages: « 1 ... 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 [63] 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 ... 199 »
  Print  
Author Topic: [XPM] [ANN] Primecoin Release - First Scientific Computing Cryptocurrency  (Read 688843 times)
Mike270
Full Member
***
Offline Offline

Activity: 158
Merit: 100


View Profile
July 10, 2013, 08:22:45 PM
 #1241


did you measure the speedup of every change? the first rule of optimization is: measure! otherwise you may be making it worse without knowing...
Hi gatra,

with those few blocks/day and primes/s being quite imprecise...
I left optimizations in when they didn't appear to cause an immediately visible (~10min) deterioration in primes/s.

just a few comments:
those bool arrays are probably no much better than the vectors. Each bool takes 1 byte of RAM. If you want to make that faster you have use an array of ints that is 1/32 of the size, and encode each bool in a single bit of the 32 bits. That will make arrays shorter, speeding up mallocs and making more bools fit in the cache.

Also, probably changing the call to size() by a variable, and splitting the loop look like not worth it. However the upper and lower_bound are definitly better
Yeah, I'm aware of the bool issue, and I thought let's give it a try in the malloc vs. memory tradeoff. It actually seems to have been the single optimization that brought a rather direct improvement to primes/sec (apart from Sunny Kings change).
Still, take a look at a profiler, those mallocs are by far the most dominant in the whole program.
I'm thinking of merging the three arrays into one and saying e.g. 1 bit means vfCompositeCunningham1, another one means vfCompositeCunningham2 etc.
since they are usually addressed one after another at the same index, this optimizes cache usage, and something like
            if (!vfCompositeCunningham1[nMultiplier] ||
                !vfCompositeCunningham2[nMultiplier] ||
                !vfCompositeBiTwin[nMultiplier])
can be reduced to
if (newarray[nMultiplier]!=0)

@tacotime: I did a

    CSieveOfEratosthenes(unsigned int nBits, uint256 hashBlockHeader, CBigNum& bnFixedMultiplier):
        vfCompositeCunningham1{0},
        vfCompositeCunningham2{0},
        vfCompositeBiTwin{0}
    {

instead - the compiler gives a strange warning about me having to supply a switch but this behaviour being the default anyway (Huh) but all bools appear to be false, so it appears to be working (I did a loop in the initializer for a while that was checking whether alls are false and if not would have output a warning, and as I saw no warning after ~30mins I then removed the check and am assuming that all is fine...
prime.h:96:2: Warnung: erweiterte Initialisierungsliste nur mit -std=c++11 oder -std=gnu++11 verfügbar [standardmäÃig aktiviert]
(Roughly translates to : Enhanced initialization list only when using -std=c++11 or -std=gnu++11 available (activated by default)
But when I supply either, I get errors in other modules...
tacotime
Legendary
*
Offline Offline

Activity: 1484
Merit: 1005



View Profile
July 10, 2013, 08:29:25 PM
 #1242

I think it'd be ideal to use a 2D array with the three composite values stored sequentially in memory as single bits each

Probably should be a std::bitset datatype too

Code:
XMR: 44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns
Mike270
Full Member
***
Offline Offline

Activity: 158
Merit: 100


View Profile
July 10, 2013, 08:31:54 PM
 #1243

Thanks Mike!
I did some change but i don't understand some of them, like those three :

-did the headerhash mod outlined earlier

https://bitcointalk.org/index.php?topic=251850.msg2689981#msg2689981
(But I didn't change the MaxSieveSize)

-after creating vPrimes array in prime.cpp, store vPrimes.size() in a static variably and replace all calls to it with the variable
in prime.cpp:
// Prime Table
std::vector<unsigned int> vPrimes;
unsigned int vPrimessz;    (<--- add this)

in GeneratePrimeTable(), add a last line:
vPrimessz=vPrimes.size();

Replace all other occurrences of vPrimes.size() with vPrimessz.

- in prime.cpp, CSieveOfEratosthenes::Weave() :
Split the for loop into two for loops, one going up to nChainLength, the other one going up to 2*nChainLength
Thus, you can safe the first first if+for clause in the first loop and leave away the if, and leave the if+for completely away in the second loop.

    for (; nBiTwinSeq < nChainLength; nBiTwinSeq++)
    {
[...]
        //if (nBiTwinSeq < nChainLength)
        for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
            vfCompositeBiTwin[nVariableMultiplier] = true;
        if (((nBiTwinSeq & 1u) == 0))
            for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
                vfCompositeCunningham1[nVariableMultiplier] = true;
        else
            for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
                vfCompositeCunningham2[nVariableMultiplier] = true;

[...]
    for (; nBiTwinSeq < 2 * nChainLength; nBiTwinSeq++)
    {
[...]
        //if (nBiTwinSeq < nChainLength)
        //    for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
        //        vfCompositeBiTwin[nVariableMultiplier] = true;
        if (((nBiTwinSeq & 1u) == 0))
            for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
                vfCompositeCunningham1[nVariableMultiplier] = true;
        else
            for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
                vfCompositeCunningham2[nVariableMultiplier] = true;


gudmunsn
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250



View Profile
July 10, 2013, 08:34:32 PM
 #1244

I think it'd be ideal to use a 2D array with the three composite values stored sequentially in memory as single bits each

Probably should be a std::bitset datatype too

+1

refer_2_me
Full Member
***
Offline Offline

Activity: 213
Merit: 100



View Profile
July 10, 2013, 08:35:50 PM
 #1245

I think it'd be ideal to use a 2D array with the three composite values stored sequentially in memory as single bits each

Probably should be a std::bitset datatype too

+1

When you guys get that figured out, pull requests would be much appreciated.

BTC: 1reFerkRnftob5YvbB112bbuwepC9XYLj
XPM: APQpPZCfEz3kejrYTfyACY1J9HrjnRf34Y
Sunny King (OP)
Legendary
*
Offline Offline

Activity: 1205
Merit: 1010



View Profile WWW
July 10, 2013, 08:38:02 PM
 #1246

Sorry for my earlier error on Market cap.  The newest exchange rates and coin total (168,186.31) yields a cap of around 68k which would be 15th just behind CNC.  The hourly mining rate is equivalent to 6 BTC (assuming 1 minute blocks and 20.41 coins per block as the last few have been) which wold be $500 at current exchange rates, still 3rd over.

Going according to plan isn't it  Wink
Looking forward to XPM's debut on the network mining income pie chart  Cool
Mike270
Full Member
***
Offline Offline

Activity: 158
Merit: 100


View Profile
July 10, 2013, 08:39:08 PM
 #1247

I think it'd be ideal to use a 2D array with the three composite values stored sequentially in memory as single bits each

Probably should be a std::bitset datatype too
Hm, that would save memory but cost more in terms of computing (fiddling with all the bit shift operations).
I think I'll first go for a plain int array and when I want to set a bit for the former arrays I can just do a |=,
and it will much simplify the ORing of all three arrays, so I'd guess it's only one malloc for the whole array and saving in computational efforts.
(Although I'm aware that those CBigNum probably much outweigh all these optimizations if one could save there)
96redformula
Full Member
***
Offline Offline

Activity: 224
Merit: 100


View Profile
July 10, 2013, 08:39:23 PM
 #1248

Sorry for my earlier error on Market cap.  The newest exchange rates and coin total (168,186.31) yields a cap of around 68k which would be 15th just behind CNC.  The hourly mining rate is equivalent to 6 BTC (assuming 1 minute blocks and 20.41 coins per block as the last few have been) which wold be $500 at current exchange rates, still 3rd over.

Going according to plan isn't it  Wink
Looking forward to XPM's debut on the network mining income pie chart  Cool

Are you doing the windows binary build?
RandyFolds
Sr. Member
****
Offline Offline

Activity: 448
Merit: 250



View Profile
July 10, 2013, 08:39:33 PM
 #1249

I think a lot of people are frustrated because they don't understand the difference between pooled and solo mining.

In pooled mining, (ie, what we all do with Bitcoin), the work is broken down into a ton of discrete chunks - each miner is compensated when a block is solved based on their proportion of work contributed - ie everyone gets a slow, steady and somewhat predictable payout.

Solo minings different - blocks don't get shared, so each individual gets a much greater payout, but whether they get it or not is a LOT more dependent on whether they're lucky enough to solve a block before someone else does. There have only been 8000 or so blocks solved so far. I've got the miner running on various computers at varying pps rates - during the first 17 hours or so, I solved no blocks at all. Then they started getting solved. And the thing was, the difference between the number of blocks solved from a computer running at 10 pps and another running at 150pps is much smaller than the difference in prime generation rate.

So, there are 8000 blocks awarded so far.

Lets say there are 500 people attempting to mine them and each person is using 3 computers to do so - on average, if they've all been mining since the beginning, each computer would have solved 5.333 blocks. But the distribution is a lot more random; some people will have gotten 5 blocks, others have gotten zero, and others have gotten 20+. But that's the thing with solo mining. And if, in the beginning, there were only 50 people mining during the first hour, 200 mining 6 hours later, and 500 24-hours later, of course there will be drop-offs in the rate that people see blocks get solved...

Given the newness of the coin, the lack of markets for it, etc, I'd be shocked if someone had already figured out how to have a GPU solve the problems - it's not just rewriting the existing SHA algorithm that bitcoin uses, changing a couple lines to make it primecoin ready, it's revising everything about the miner to handle a completely different type of work. Optmizing the CPU miner? Sure. But even so, I don't think they'ed be churning out coins in this environment, as, again, I think the payout has a lot more to do with luck at this point than anything else including raw horse power.

Putting together some charts to demonstrate the distribution of the rewards I've seen so far.... maybe someone will find it to be of interest, who knows?

But in the meantime, if people want to see more predictable coin generation, then maybe they should start a fund to pay a bounty to the first person to develop a mining pool? i assume it'll be a lot of work, because, again, the dynamics of this coin are much different than any of the others, including this time to "mature". Or, you could make an informal "pool" with someone you know and really trust - just have both of you mine to the same wallet.dat file, and, as you add machines, the randomness that a single miner sees will be evened out more and more...

That my two cents. Utterly useless, because I'm not a programmer. But so far, I'm pretty excited about this coin and I can't quite fathom why people are getting upset with it already?

People are getting upset because some people not sharing their compiled code are damaging their own reputation and the reputation of the coin. Also, it's just flat out unethical to steal blocks. 'Of course they are! This is the internetz where anything goes!' is the typical argument, however I'd like to think the crypto-community and particularly XPM miners are for the most part much better than the worst posters on the /b forums. Right now there are a lot of people proving me wrong. All you have to do is look at the statistics.
Behind ahead of the curve is now unethical? Ever hear of this thing called life?

So you pretty much outed yourself and someone not sharing your optimizations with the community. Can people please realize this and use the reputation button the way it was intended?

There's a rep system here? I see a WoT for TRADE, not bitching because you're butthurt.
fluffypony
Donator
Legendary
*
Offline Offline

Activity: 1274
Merit: 1060


GetMonero.org / MyMonero.com


View Profile WWW
July 10, 2013, 08:39:38 PM
 #1250

I think it'd be ideal to use a 2D array with the three composite values stored sequentially in memory as single bits each

Probably should be a std::bitset datatype too

+1

When you guys get that figured out, pull requests would be much appreciated.

I was literally about to ask why there haven't been pull requests for this:) Much easier working off a common codebase than trying to implement scattered patches.

gateway
Hero Member
*****
Offline Offline

Activity: 552
Merit: 500


View Profile
July 10, 2013, 08:41:03 PM
 #1251

Difficult to say, as Sunny King's change brings a huge boost to primes/sec measurement, and since we already know that this is not a real measurement....
I found three blocks in 1,5h whereas I only had 2 blocks in the 24h before that, but since we're talking luck this may well just be a statistical anomaly.

With all mentioned optimizations in place I now see
2013-07-10 19:54:30 primemeter   1229628 prime/h   9519403 test/h
2013-07-10 19:56:30 primemeter   1200380 prime/h   9385149 test/h
2013-07-10 19:58:31 primemeter   1059123 prime/h   8281702 test/h
2013-07-10 20:00:32 primemeter   1132707 prime/h   8806710 test/h

Whereas before I had
2013-07-09 16:01:06 primemeter    191133 prime/h   1230638 test/h
2013-07-09 16:03:07 primemeter    207187 prime/h   1333814 test/h
2013-07-09 16:05:07 primemeter    197670 prime/h   1302659 test/h
2013-07-09 16:07:08 primemeter    226844 prime/h   1538940 test/h
2013-07-09 16:08:09 primemeter    241089 prime/h   1653641 test/h
2013-07-09 16:10:09 primemeter    246644 prime/h   1574452 test/h

So if this helps, feel free to donate ;-)
XPM   ALLbe86QswwTRvx1LTVnKaVXCz6mcA6YUR
BTC   135oF6hF9uUbzq9x1vTuaoqKPab2j513f2

I'm thinking the biggest improvements can be done by
a) optimizing the algorithm itself, possible according to TacoTime's post
b) minimizing malloc()s
and I guess there's still lots of potential.
If I had more time I'd try looking into using smaller sieves that fit into CPU cache but doing more sieves (so that the range sieved remains the same)


nice, you building on windows or *nix?
Mike270
Full Member
***
Offline Offline

Activity: 158
Merit: 100


View Profile
July 10, 2013, 08:48:30 PM
 #1252

Difficult to say, as Sunny King's change brings a huge boost to primes/sec measurement, and since we already know that this is not a real measurement....
I found three blocks in 1,5h whereas I only had 2 blocks in the 24h before that, but since we're talking luck this may well just be a statistical anomaly.

With all mentioned optimizations in place I now see
2013-07-10 19:54:30 primemeter   1229628 prime/h   9519403 test/h
2013-07-10 19:56:30 primemeter   1200380 prime/h   9385149 test/h
2013-07-10 19:58:31 primemeter   1059123 prime/h   8281702 test/h
2013-07-10 20:00:32 primemeter   1132707 prime/h   8806710 test/h

Whereas before I had
2013-07-09 16:01:06 primemeter    191133 prime/h   1230638 test/h
2013-07-09 16:03:07 primemeter    207187 prime/h   1333814 test/h
2013-07-09 16:05:07 primemeter    197670 prime/h   1302659 test/h
2013-07-09 16:07:08 primemeter    226844 prime/h   1538940 test/h
2013-07-09 16:08:09 primemeter    241089 prime/h   1653641 test/h
2013-07-09 16:10:09 primemeter    246644 prime/h   1574452 test/h

So if this helps, feel free to donate ;-)
XPM   ALLbe86QswwTRvx1LTVnKaVXCz6mcA6YUR
BTC   135oF6hF9uUbzq9x1vTuaoqKPab2j513f2

I'm thinking the biggest improvements can be done by
a) optimizing the algorithm itself, possible according to TacoTime's post
b) minimizing malloc()s
and I guess there's still lots of potential.
If I had more time I'd try looking into using smaller sieves that fit into CPU cache but doing more sieves (so that the range sieved remains the same)


nice, you building on windows or *nix?

building on Ubuntu. Should I get phenomenal results (ahem) I'll try to follow the instructions on how to set up a windows build environment for my windows machines.
Processor on that machine is an i7-920 at stock speeds, primecoind mining with 6 threads at nice 20 (the machine also has other stuff to do, like running bfgminer on my fpgas :-) )
mustyoshi
Sr. Member
****
Offline Offline

Activity: 287
Merit: 250



View Profile
July 10, 2013, 08:52:25 PM
 #1253

I just had a transaction confirmed in under a minute.

The effective mining rate of the network is waaaaay faster than it should be.
craslovell
Legendary
*
Offline Offline

Activity: 1470
Merit: 1021



View Profile WWW
July 10, 2013, 08:52:51 PM
 #1254

Thanks for sharing Mike270 Smiley

We are now onto the 4th day of mining (74 hours since launch)...

Some rough observations:

  • Day 1 block spacing was in the range 40 - 110 seconds | difficulty 7 - 7.2
  • Day 2 block spacing was in the range 10 - 40 seconds | difficulty 7.2 - 7.33
  • Day 3 block spacing was in the range 15 - 25 seconds | difficulty 7.4 - 7.6

If you are running the original binary released by Sunny, then you have very little chance of finding a block in less than 15 seconds IMO

Yeah I'm watching all the blocks I'm not mining right now
Buffer Overflow
Legendary
*
Offline Offline

Activity: 1652
Merit: 1016



View Profile
July 10, 2013, 08:58:04 PM
 #1255

I just had a transaction confirmed in under a minute.

The effective mining rate of the network is waaaaay faster than it should be.

The per-block difficulty adjustment seems to lag behind.

96redformula
Full Member
***
Offline Offline

Activity: 224
Merit: 100


View Profile
July 10, 2013, 08:58:16 PM
 #1256

Well it seems the little guys are out now.  The big machines are up and running with the fixed code Sad.  

BTW seems like we have a 10 second gap between blocks now  Cry.
mustyoshi
Sr. Member
****
Offline Offline

Activity: 287
Merit: 250



View Profile
July 10, 2013, 09:00:44 PM
 #1257

Well it seems the little guys are out now.  The big machines are up and running with the fixed code Sad.  

BTW seems like we have a 10 second gap between blocks now  Cry.
My proposed p2pool client can't operate when the network speed is the same as the p2p chain speed is supposed to be.
craslovell
Legendary
*
Offline Offline

Activity: 1470
Merit: 1021



View Profile WWW
July 10, 2013, 09:03:54 PM
 #1258

Well it seems the little guys are out now.  The big machines are up and running with the fixed code Sad.  

BTW seems like we have a 10 second gap between blocks now  Cry.
My proposed p2pool client can't operate when the network speed is the same as the p2p chain speed is supposed to be.

Ahh that's disappointing. Too bad blocks are being found an entire 6 times faster than they're supoosed to be...
romerun
Legendary
*
Offline Offline

Activity: 1078
Merit: 1002


Bitcoin is new, makes sense to hodl.


View Profile
July 10, 2013, 09:08:27 PM
 #1259

where's the most efficient code ?

8 cores in total 24 hours got nothing.  Huh
bidji29
Sr. Member
****
Offline Offline

Activity: 392
Merit: 250


View Profile
July 10, 2013, 09:11:13 PM
 #1260

For the two for loop, i get that

Code:
// Weave the sieve for the prime
    unsigned int nChainLength = TargetGetLength(nBits);
    for (; nBiTwinSeq < nChainLength; nBiTwinSeq++)
    {
        // Find the first number that's divisible by this prime
        int nDelta = ((nBiTwinSeq % 2 == 0)? (-1) : 1);
        unsigned int nSolvedMultiplier = ((bnFixedInverse * (p - nDelta)) % p).getuint();
        if (nBiTwinSeq % 2 == 1)
            bnFixedInverse *= bnTwoInverse; // for next number in chain

        unsigned int nPrime = vPrimes[nPrimeSeq];
        if (nBiTwinSeq < nChainLength)
        for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
            vfCompositeBiTwin[nVariableMultiplier] = true;
        if (((nBiTwinSeq & 1u) == 0))
            for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
                vfCompositeCunningham1[nVariableMultiplier] = true;
        else
            for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
                vfCompositeCunningham2[nVariableMultiplier] = true;
    }

for (; nBiTwinSeq < 2 * nChainLength; nBiTwinSeq++)
    {
        // Find the first number that's divisible by this prime
        int nDelta = ((nBiTwinSeq % 2 == 0)? (-1) : 1);
        unsigned int nSolvedMultiplier = ((bnFixedInverse * (p - nDelta)) % p).getuint();
        if (nBiTwinSeq % 2 == 1)
            bnFixedInverse *= bnTwoInverse; // for next number in chain

        unsigned int nPrime = vPrimes[nPrimeSeq];
        if (nBiTwinSeq < nChainLength)
            for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
                vfCompositeBiTwin[nVariableMultiplier] = true;
        if (((nBiTwinSeq & 1u) == 0))
            for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
                vfCompositeCunningham1[nVariableMultiplier] = true;
        else
            for (unsigned int nVariableMultiplier = nSolvedMultiplier; nVariableMultiplier < nMaxSieveSize; nVariableMultiplier += nPrime)
                vfCompositeCunningham2[nVariableMultiplier] = true;
    }

i'm not sure of myself on this one

http://www.freebieservers.com/  100% FREE GAME SERVERS
Pages: « 1 ... 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 [63] 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 ... 199 »
  Print  
 
Jump to:  

Powered by MySQL Powered by PHP Powered by SMF 1.1.19 | SMF © 2006-2009, Simple Machines Valid XHTML 1.0! Valid CSS!