Bitcoin Forum
Author Topic: [ANN] SpreadCoin | Decentralize Everything (decentralized blockexplorer coming)  (Read 790359 times)
e1ghtSpace
Legendary
*
Offline Offline

Activity: 1526
Merit: 1001


Crypto since 2014


View Profile WWW
November 20, 2015, 11:18:38 PM
 #2121

Mr. Spread first created an AMD gpu miner, around 20th Nov 2014:

https://bitcointalk.org/index.php?topic=715435.msg9606917#msg9606917

And then during the next month it was decided that girino's GPU miner was "the better one" ?

https://github.com/girino/spreadcoinx11-sgminer

Anyway...
Ah so it might be easier to modify Mr. Spread's miner.
Anyway Girino has some explaining to do.
chrysophylax
Legendary
*
Offline Offline

Activity: 2828
Merit: 1091


--- ChainWorks Industries ---


View Profile WWW
November 21, 2015, 03:27:53 AM
 #2122

Looks like I oopsed, they're not hashing the same thing over 3k times, they're hashing an INSANE amount of data. What the fuck is all of this...?

i never used the amd miner wolf - ever ...

it never compiled properly for me - and never worked properly for me ...

thefarm ( nvidia based ) was the only thing that was working well based on sp's spreadminer ( which was obviously based on ccminer-tsiv ) and nonce-pool spreadcoin private pool ...

so if you are building an amd miner - could you also see ( if its part of your scope with the build ) if it can be integrated with the same sgminer as x11 / quark / lyra2rev2 / neoscrypt? ...

i am in the office again - so im back on irc ... i have some updates about farmamd ...

i have a softspot for spreadcoin ( and ftc for that matter ) but have never been happy about the algo itself ...

it has always been a difficult algo to implement on thefarm due to changing miners constantly whenever i wanted to accumulate ( spreadminer as opposed to ccminer-spmod ) - and im not the only one that has had this particular issue ... so an integration of the algo in the same sgminer would be a HUGE advantage ...

#crysx

e1ghtSpace
November 21, 2015, 05:18:23 AM
 #2123

Mr. Spread first created an AMD gpu miner, around 20th Nov 2014:

https://bitcointalk.org/index.php?topic=715435.msg9606917#msg9606917

And then during the next month it was decided that girino's GPU miner was "the better one" ?

https://github.com/girino/spreadcoinx11-sgminer

Anyway...
Ah so it might be easier to modify Mr. Spread's miner.
Anyway Girino has some explaining to do.

Girino's IS better, and as I said, this is not a crippling of the miner, it's just WTF.
It's not crippling? Ok that's good, I thought it was.
Does the hashing of all that data slow it down very much?
georgem (OP)
Legendary
*
Offline Offline

Activity: 1484
Merit: 1007


spreadcoin.info


View Profile WWW
November 21, 2015, 10:05:02 AM
Last edit: November 21, 2015, 10:27:50 AM by georgem
 #2124

Girino's IS better, and as I said, this is not a crippling of the miner, it's just WTF.

Can you be a little bit more specific please?

You used words like WTF, Insane, Barbaric and Ooops, so now I'm no longer sure I understand the point you are trying to make.

Nobody doubts that the miner's efficiency can be improved, and everybody knows that SpreadX11 is pretty exotic *

So... what gives?

Are you just baffled by it? In a good/bad way? ;)

*(SpreadX11 adds a MinerSignature to the block header and constructs a MAX SIZE block by padding it with data derived from the previousBlockHash. This means many thousands of iterations, depending on how "empty" the block is, but those are just simple multiplications and bit operations, no hashes. A hashWholeBlock value computed over that padded block is then added to the header, to prove that whoever found a block solution also signed the coinbase and knew the whole block's content before any pool did.)

georgem (OP)
November 21, 2015, 06:15:31 PM
Last edit: November 21, 2015, 06:33:53 PM by georgem
 #2125

I'm baffled as to why it's done the way it is. There's just a TON of data in there being hashed by SHA256, far larger than any block SPR has...

Yes, the idea is that to calculate hashWholeBlock we need to fill a complete MAX_BLOCK_SIZE block with data (that's why we have the "padding") and then Double-SHA256 it.

MAX_BLOCK_SIZE for spreadcoin is 200 Kbyte. https://github.com/spreadcoin/spreadcoin/blob/master/src/main.h#L35

So you have the header (less than 100 bytes), followed by a few tx (around 1 KByte, sometimes more), and then a giant padding (about 199 KBytes) to fill up the rest.

(This large pseudoblock exists only temporarily in memory BTW, just to create the hashWholeBlock hash.)

From the whitepaper:

Quote
Padding ensures that there is no incentive to mine empty blocks without transactions

So the way I understand this, it means that padding calculations are more numerous the fewer transactions you get into your block.
Vice versa: The bigger the block already is (more txs), the less padding has to be added/calculated.
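
To put rough numbers on that relationship, here is a minimal C++ sketch that counts the padding work for a given block content size. It assumes MAX_BLOCK_SIZE is 200,000 bytes (the thread only says "200 Kbyte", so check main.h for the exact constant) and that the last 8 words of the buffer are copied from hashPrevBlock rather than multiplied. The names `PaddingCost`, `EstimatePadding` and `kMaxBlockSize` are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: the thread only says "200 Kbyte"; 200,000 bytes is used here.
constexpr std::size_t kMaxBlockSize = 200000;

struct PaddingCost {
    std::size_t alignBytes;       // 0x07 bytes added to reach a multiple of 4
    std::size_t paddingBytes;     // bytes filled after the alignment bytes
    std::size_t multiplications;  // 32-bit multiplications needed to fill them
};

// contentBytes = header + transactions. It must not exceed maxBytes,
// and maxBytes is assumed to be a multiple of 4.
inline PaddingCost EstimatePadding(std::size_t contentBytes,
                                   std::size_t maxBytes = kMaxBlockSize)
{
    assert(maxBytes % 4 == 0 && contentBytes <= maxBytes);
    const std::size_t alignBytes = (4 - contentBytes % 4) % 4;
    const std::size_t fillWords = (maxBytes - contentBytes - alignBytes) / 4;
    // The last 8 words are a copy of hashPrevBlock; only the rest are multiplied.
    const std::size_t mults = fillWords > 8 ? fillWords - 8 : 0;
    return PaddingCost{alignBytes, fillWords * 4, mults};
}
```

Under these assumptions a 300-byte block needs 49,917 multiplications, while a block that already holds about 150 KB of transactions needs roughly a quarter of that.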

So in a way, padding is horrible for efficiency, since we mostly have 1-transaction blocks these days.
When this changes, padding will not carry as much weight as it does now.

I don't think the padding calculations are that heavy anyway, but I need to do some benchmarks and look into OpenCL / CUDA.

And yes, we are ALWAYS doing a double SHA256 calculation on a MAX_BLOCK_SIZE block (200 KByte).
This size doesn't change from block to block, it's always the maximum.
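
As a back-of-the-envelope check on what "always the maximum" costs, the sketch below counts SHA-256 compression-function calls for one double SHA-256 over a buffer of a given size. It relies only on the standard SHA-256 padding rule (one 0x80 byte plus an 8-byte length field, 64-byte chunks). The function names are hypothetical, and 200,000 bytes is an assumed reading of "200 KByte".

```cpp
#include <cstddef>

// SHA-256 works on 64-byte chunks. Padding appends one 0x80 byte and an
// 8-byte length field, so n message bytes need ceil((n + 9) / 64) chunks.
inline std::size_t Sha256Compressions(std::size_t messageBytes)
{
    return (messageBytes + 9 + 63) / 64;
}

// Double SHA-256: hash the message, then hash the resulting 32-byte digest.
inline std::size_t DoubleSha256Compressions(std::size_t messageBytes)
{
    return Sha256Compressions(messageBytes) + Sha256Compressions(32);
}
```

For 200,000 bytes this gives 3,127 compression calls per hashWholeBlock, which fits the "over 3k" figure mentioned earlier in the thread. Because the count is linear in the buffer size, a tenfold larger block would need about ten times as many calls.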

Why did Mr. Spread do it this way?
I'd have to guess, but probably to deliberately give GPUs a hard time?

Wasn't he playing the "CPU-only coin" card for some time?


coins101
Legendary
*
Offline Offline

Activity: 1456
Merit: 1000



View Profile
November 21, 2015, 06:33:06 PM
 #2126

I'm baffled as to why it's done the way it is. There's just a TON of data in there being hashed by SHA256, far larger than any block SPR has...

Yes, the idea is that for hashWholeBlock we need to fill a complete MAX_BLOCK_SIZE block with data (that's why we have the "padding").

MAX_BLOCK_SIZE for spreadcoin is 200 Kbyte. https://github.com/spreadcoin/spreadcoin/blob/master/src/main.h#L35

So you have the header (less than 100 bytes) , followed by a few tx (<1 KByte or more), and then a giant padding (199 KBytes) to fill up the rest.

(This large pseudoblock is only existing temporarily in memory BTW, just to create the hashWholeBlock hash.)

From the whitepaper:

Quote
Padding ensures that there is no incentive to mine empty blocks without transactions

So the way I understand this, it means that padding calculations are more numerous the fewer transactions you get into your block.
Vice versa: The bigger the block already is (more txs), the less padding has to be added/calculated.

So in a way, padding is horrible on efficiency since we are mostly having 1 transaction blocks these days.
When this changes, then padding will not carry as much weight as it does now.

And yes, we are ALWAYS doing a double SHA256 calculation on a MAX_BLOCK_SIZE block (200 KByte).
This size doesn't change from block to block, it's always the maximum.

Why Mr. Spread did it this way?
I'd have to guess, but probably to deliberately give GPUs a hard time?

Wasn't he playing the "CPU-Only-coin" - card for some time?



Possibly.

But maybe it's to give everyone equal weight. If you think about it, everyone effectively processes a full block. There is no incentive to choose more efficient blocks to mine to move on to the next one quickly.

I suppose if there is an incentive to fill blocks quickly, it sounds like it makes it more profitable to process blocks with transactions vs. padding.

What I don't get is why the AMD miner is so much worse than nvidia; and it has problems too.
georgem (OP)
November 21, 2015, 06:36:28 PM
 #2127

What I don't get is why the AMD miner is so much worse than nvidia; and it has problems too.

I can't really judge it (yet), but it's probably badly implemented.

georgem (OP)
November 21, 2015, 09:02:55 PM
Last edit: November 21, 2015, 09:17:29 PM by georgem
 #2128

What I don't get is why the AMD miner is so much worse than nvidia; and it has problems too.

I can't really judge it (yet), but it's probably badly implemented.

It's largely SPH code... really, really bad. About the same as the original darkcoin-mod.

EDIT: Idea! What's the padding made out of? Maybe I can shortcut the memory usage!

Padding is constructed starting with a copy of the 32 bytes of previousHashBlock (hashPrevBlock, as it is called in the code) as a seed, then a few bit operations, and then tens of thousands of multiplications while we fill the space (moving backwards through the block).
But we basically start with just those 32 bytes, and everything is derived from them.

Padding starts here:

https://github.com/spreadcoin/spreadcoin/blob/master/src/main.cpp#L1511

Code:
    while (BlockData.size() % 4 != 0)
        BlockData << uint8_t(7);

    // Fill rest of the buffer to ensure that there is no incentive to mine small blocks without transactions.
    uint32_t *pFillBegin = (uint32_t*)&BlockData[BlockData.size()];
    uint32_t *pFillEnd = (uint32_t*)&BlockData[MAX_BLOCK_SIZE];
    uint32_t *pFillFooter = std::max(pFillBegin, pFillEnd - 8);

    memcpy(pFillFooter, &hashPrevBlock, (pFillEnd - pFillFooter)*4);
    for (uint32_t *pI = pFillFooter; pI < pFillEnd; pI++)
        *pI |= 1;

    for (uint32_t *pI = pFillFooter - 1; pI >= pFillBegin; pI--)
        pI[0] = pI[3]*pI[7];

    BlockData.forsed_resize(MAX_BLOCK_SIZE);

First, we fill up the block from the left (right after the tx section) with a few 0x07 bytes, only as many as needed for the current size of the block data (header + txs) to be exactly divisible by 4.
If the size is already a multiple of 4, no such bytes are added.

Then we define the pointers pFillBegin, pFillEnd and pFillFooter, copy hashPrevBlock to the end (last 32 bytes) of this MAX_BLOCK_SIZE block, and fill all the empty bytes in between, moving backwards, 4 bytes per iteration.

Oh, and before we start iterating we also make these 32 bytes (or rather the 8 x 4-byte integers they consist of) "ODD" by doing the *pI |= 1 operation on each of them, so that they are not divisible by 2 anymore.
Then we just iterate backwards in 4-byte steps, always taking pI[3]*pI[7] and writing the multiplication result into pI[0], until we reach the transaction section (or those 0x07 bytes we created earlier, if they were necessary)...

That's about it.
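
For anyone who wants to experiment with the padding outside the wallet code, here is a self-contained sketch of the same steps on a plain byte vector. It is an illustration, not the SpreadCoin implementation: `PadBlock`, `WordAt` and `kMaxBlockSize` are hypothetical names, MAX_BLOCK_SIZE is assumed to be 200,000 bytes, and the pointer casts of the original are replaced by memcpy into a word array (which keeps the host byte order, as the casts do).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumption: "200 Kbyte" is taken as 200,000 bytes here.
constexpr std::size_t kMaxBlockSize = 200000;

// Hypothetical helper: pads serialized block data (header + txs) up to maxSize.
std::vector<uint8_t> PadBlock(std::vector<uint8_t> data,
                              const std::array<uint8_t, 32>& hashPrevBlock,
                              std::size_t maxSize = kMaxBlockSize)
{
    assert(maxSize % 4 == 0 && data.size() <= maxSize);

    // Step 1: align to a 4-byte boundary with 0x07 bytes.
    while (data.size() % 4 != 0)
        data.push_back(7);

    // Work on 32-bit words in host byte order.
    const std::size_t begin = data.size() / 4;
    const std::size_t end = maxSize / 4;
    std::vector<uint32_t> words(end, 0);
    if (!data.empty())
        std::memcpy(words.data(), data.data(), data.size());

    // Step 2: the footer (up to 8 words) is a copy of hashPrevBlock, forced odd.
    const std::size_t footer = std::max(begin, end >= 8 ? end - 8 : 0);
    if (footer < end)
        std::memcpy(words.data() + footer, hashPrevBlock.data(), (end - footer) * 4);
    for (std::size_t i = footer; i < end; ++i)
        words[i] |= 1;

    // Step 3: fill backwards; each word is the product of two later words (mod 2^32).
    for (std::size_t i = footer; i-- > begin;)
        words[i] = words[i + 3] * words[i + 7];

    std::vector<uint8_t> out(maxSize);
    if (maxSize != 0)
        std::memcpy(out.data(), words.data(), maxSize);
    return out;
}

// Reads the index-th 32-bit word of a padded block in host byte order.
uint32_t WordAt(const std::vector<uint8_t>& block, std::size_t index)
{
    uint32_t w;
    std::memcpy(&w, block.data() + index * 4, 4);
    return w;
}
```

One property falls out of the arithmetic: the footer words are forced odd and a product of odd numbers is odd, so every padding word stays odd and the sequence cannot collapse to zeros. That is presumably the reason for the |= 1 step, although the thread does not say so.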

So this large padding section doesn't have any regularity or repetition if you were expecting that.

It's pretty messy & chaotic data. :D

Mr. Spread really wants your GPU to double-SHA256 a 200Kbyte datastructure. All the time.

coins101
November 21, 2015, 09:26:02 PM
 #2129

I think it's a good idea, on paper. Tx's are given priority, the rest is there to make mining equal. With near full blocks, mining should be easy.

Perhaps we should cut the block sizes down to 1 tx ;D

1tx blocks, moving up to 1mb blocks by next year. Maybe we should just organise hard forks every 6 months.
georgem (OP)
November 21, 2015, 09:36:12 PM
Last edit: November 21, 2015, 10:49:09 PM by georgem
 #2130

Maybe I can shortcut the memory usage!

I told Mr. Spread about your idea.

He LOL'ed and...



Who is Mr. Spread?

coins101
November 21, 2015, 09:39:11 PM
 #2131

That looks like he's standing in front of an ASIC with vents. Has Mr. Spread already created an ASIC for SpreadX11?
georgem (OP)
November 21, 2015, 09:41:45 PM
 #2132

That looks like he's standing in front of an ASIC with vents. Has Mr. Spread already created an ASIC for SpreadX11?

Yep!
It's a hamster dung powered wooden ASIC alright.
He created it himself, just with the stuff that was lying around... ???

coins101
November 21, 2015, 09:51:13 PM
 #2133

That looks like he's standing in front of an ASIC with vents. Has Mr. Spread already created an ASIC for SpreadX11?

Yep!
It's a hamster dung powered wooden ASIC alright.
He created it himself, just with the stuff that was lying around... ???

Dude...He's ripping you off, stealing your electricity.

Quote
// Fill rest of the buffer to ensure that there is no incentive to mine small blocks without transactions.

So, that's clear now....the padding is to make mining equal. Just needs a better implementation, but still with the padding, I guess.
georgem (OP)
November 21, 2015, 10:30:50 PM
 #2134

What I don't get is why the AMD miner is so much worse than nvidia; and it has problems too.

I can't really judge it (yet), but it's probably badly implemented.

It's largely SPH code... really, really bad. About the same as the original darkcoin-mod.

EDIT: Idea! What's the padding made out of? Maybe I can shortcut the memory usage!

What I want to find out is how many times we actually need to calculate that double-SHA256 over the whole 200 KBytes.
Maybe we can skip / hold a few iterations.

georgem (OP)
November 21, 2015, 10:37:08 PM
 #2135

I think it's a good idea, on paper. Tx's are given priority, the rest is there to make mining equal. With near full blocks, mining should be easy.

Perhaps we should cut the block sizes down to 1 tx ;D

1tx blocks, moving up to 1mb blocks by next year. Maybe we should just organise hard forks every 6 months.

:D

One thing is for sure.
If we keep the algo as it is, and increase the block size to, say, 2 megabytes (10x), this will also make the padding / hashWholeBlock calculation 10x heavier on your GPU. :o

I wonder if this algo can be reduced in complexity while still maintaining the same results.
(But we can also introduce new complexity if it helps make everything MORE anti pool and pro solo-mining)

After all, a solo-miner is always also a full node. Bingo!

georgem (OP)
November 23, 2015, 07:18:53 PM
 #2136

It's not really the SHA256d that's bad, it's the REST of X11.

You mean the way the whole SPH library has been ported to OpenCL, right?

georgem (OP)
November 23, 2015, 07:44:34 PM
 #2137

It's not really the SHA256d that's bad, it's the REST of X11.

You mean the way the whole SPH library has been ported to OpenCL, right?

If you wanna call it that. It hasn't been ported so much as copypasted, and on top of this, SPH is a BAD library to use for any kind of speed-critical application. Its main purpose is to be portable across a wide range of CPUs, not perform well.

I understand now, thanks.
This means that the efficiency can probably be increased tenfold, I would guess... wow!

georgem (OP)
November 23, 2015, 08:02:32 PM
 #2138

Not quite - you need to take into consideration that the massive number of SHA256 hashes take MOST of the time. So the REMAINING code can probably be doubled in speed, and I'm not sure what I can do with the signature yet.

Right now, signature2 is not looking good - https://ottrbutt.com/tmp/spreadx11-sig2-analysis.png -- it's bigger than the code cache by a lot (code cache on GCN is 32KiB), and uses enough registers to limit it to one wave in flight. Since the kernel also uses some memory, it probably would benefit from more waves in flight.
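
The "one wave in flight" remark can be made concrete with a rough occupancy estimate. The sketch below is not taken from the thread: it assumes the figures AMD has published for GCN (256 vector registers per work-item slot on each SIMD, allocated in steps of 4, and at most 10 wavefronts per SIMD) and ignores the separate limits from scalar registers and local memory. `WavesPerSimd` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>

// Assumed GCN figures (from AMD's public documentation, not from the thread).
constexpr int kVgprsPerSimd = 256;    // vector registers available per lane
constexpr int kVgprGranularity = 4;   // registers are allocated in steps of 4
constexpr int kMaxWavesPerSimd = 10;  // hardware cap on resident wavefronts

// Upper bound on waves in flight per SIMD imposed by VGPR usage alone.
inline int WavesPerSimd(int vgprsPerWorkItem)
{
    assert(vgprsPerWorkItem >= 1 && vgprsPerWorkItem <= kVgprsPerSimd);
    const int allocated = (vgprsPerWorkItem + kVgprGranularity - 1)
                          / kVgprGranularity * kVgprGranularity;
    return std::min(kMaxWavesPerSimd, kVgprsPerSimd / allocated);
}
```

Under these assumptions a kernel drops to a single wave per SIMD once it needs more than 128 vector registers, so there is nothing left to hide memory latency with.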

Interesting,
so SHA256d is THE problem, although SPH is the most obvious thing that can be improved.

I need to look into this CodeXL thing to analyze kernels.

PS: mr. spread asks if you have any NSFW pics of naked hamster girls. :D

georgem (OP)
November 23, 2015, 08:32:25 PM
 #2139

SHA256d, at least the code itself, probably isn't going to get too much faster. HOWEVER, it can be improved, I think, by improving the structure of the kernel. It's partially unrolled, possibly wasting space. There's a tradeoff in rolling it up - I'll have to branch, or use conditional moves - but I'm pretty sure it'll be WELL worth it to decrease register usage and shrink code size.
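
To illustrate the kind of trade-off involved in rolling up a loop, here is a sketch of just the SHA-256 message schedule written two ways: fully expanded into 64 words, and "rolled" into a 16-word ring buffer that computes each word on demand. This is plain C++ for illustration, not the miner's OpenCL kernel (which is not shown in the thread), and the function and class names are made up. Only the sigma functions and the recurrence come from the SHA-256 specification.

```cpp
#include <array>
#include <cstdint>

inline uint32_t Rotr(uint32_t x, int n) { return (x >> n) | (x << (32 - n)); }
inline uint32_t SmallSigma0(uint32_t x) { return Rotr(x, 7) ^ Rotr(x, 18) ^ (x >> 3); }
inline uint32_t SmallSigma1(uint32_t x) { return Rotr(x, 17) ^ Rotr(x, 19) ^ (x >> 10); }

// Expanded style: compute all 64 schedule words up front (64 words of state).
std::array<uint32_t, 64> ScheduleExpanded(const std::array<uint32_t, 16>& block)
{
    std::array<uint32_t, 64> w{};
    for (int t = 0; t < 16; ++t)
        w[t] = block[t];
    for (int t = 16; t < 64; ++t)
        w[t] = SmallSigma1(w[t - 2]) + w[t - 7] + SmallSigma0(w[t - 15]) + w[t - 16];
    return w;
}

// Rolled style: keep only a 16-word ring buffer and derive W[t] when needed.
class RollingSchedule {
public:
    explicit RollingSchedule(const std::array<uint32_t, 16>& block) : w_(block) {}

    // Must be called with t = 0, 1, ..., 63 in order.
    uint32_t Next(int t)
    {
        if (t >= 16) {
            // Slot t & 15 still holds W[t-16], so accumulate the other terms onto it.
            w_[t & 15] += SmallSigma1(w_[(t - 2) & 15]) + w_[(t - 7) & 15]
                        + SmallSigma0(w_[(t - 15) & 15]);
        }
        return w_[t & 15];
    }

private:
    std::array<uint32_t, 16> w_;
};
```

The rolled version holds a quarter of the state at the price of index arithmetic on every step. Whether that pays off on a GPU depends on how the compiler handles the indexed array, so it would need measuring rather than assuming.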

Thanks again for providing this insight.

PS: No hamsters in my collection that I recall, but I *do* have a cute mouse: https://ottrbutt.com/tmp/3121bcd0f67852c01ae4a582bd4ab24e.jpg
It doesn't work for mr. spread if she has no "cheek pouches"... thanks but no thanks. :P

chrysophylax
November 25, 2015, 12:51:54 AM
 #2140

Can someone shed some light on this:

Code:
    uint64_t signature8[5];
    signature8[0] = psign[0];
    signature8[1] = psign[8];
    signature8[2] = psign[16];
    signature8[3] = psign[24];
    signature8[4] = psign[32];

    uint64_t signature[4];
    signature[0] = (DEC64LEng(psign +  0) >> 8) | (signature8[1] << 56);
    signature[1] = (DEC64LEng(psign +  8) >> 8) | (signature8[2] << 56);
    signature[2] = (DEC64LEng(psign + 16) >> 8) | (signature8[3] << 56);
    signature[3] = (DEC64LEng(psign + 24) >> 8) | (signature8[4] << 56);

    signature8[1] = signature[0] >> 56;
    signature8[2] = signature[1] >> 56;
    signature8[3] = signature[2] >> 56;
    signature8[4] = signature[3] >> 56;

    signbe[0] = SWAP8((signature[0] << 8) | signature8[0]);
    signbe[1] = SWAP8((signature[1] << 8) | signature8[1]);
    signbe[2] = SWAP8((signature[2] << 8) | signature8[2]);
    signbe[3] = SWAP8((signature[3] << 8) | signature8[3]);
    signbe[4] = (signature8[4] << 56) | 0x80000000000000;

Just... what is it even doing?

Got it. Fun fact, that can be replaced by this:

Code:
for(int i = 0; i < 4; ++i) signbe[i] = SWAP8(((ulong *)psign)[i]);

signbe[4] = (((ulong)psign[32]) << 56) | 0x80000000000000;
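
The claimed equivalence can be checked on a CPU with a small sketch. It assumes that DEC64LEng decodes eight bytes as a little-endian 64-bit integer and that SWAP8 reverses the byte order of a 64-bit value. Neither definition appears in the thread, so `LoadLE64` and `Swap64` below are stand-ins, and `PackLong` / `PackShort` are hypothetical names for the two variants. Byte-wise loads are used instead of the pointer cast so the check does not depend on host endianness or alignment.

```cpp
#include <array>
#include <cstdint>

// Stand-in for DEC64LEng (assumed: little-endian decode of 8 bytes).
inline uint64_t LoadLE64(const uint8_t* p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; --i)
        v = (v << 8) | p[i];
    return v;
}

// Stand-in for SWAP8 (assumed: reverse the 8 bytes of a 64-bit value).
inline uint64_t Swap64(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}

using SignBE = std::array<uint64_t, 5>;
constexpr uint64_t kPadBit = 0x80000000000000ULL;

// The long version, following the original statements step by step.
SignBE PackLong(const uint8_t* psign)
{
    uint64_t s8[5];
    for (int i = 0; i < 5; ++i)
        s8[i] = psign[8 * i];
    uint64_t s[4];
    for (int i = 0; i < 4; ++i)
        s[i] = (LoadLE64(psign + 8 * i) >> 8) | (s8[i + 1] << 56);
    for (int i = 0; i < 4; ++i)
        s8[i + 1] = s[i] >> 56;  // writes back the value it already had
    SignBE out{};
    for (int i = 0; i < 4; ++i)
        out[i] = Swap64((s[i] << 8) | s8[i]);
    out[4] = (s8[4] << 56) | kPadBit;
    return out;
}

// The short replacement: a byte-swapped load per word, plus the final byte.
SignBE PackShort(const uint8_t* psign)
{
    SignBE out{};
    for (int i = 0; i < 4; ++i)
        out[i] = Swap64(LoadLE64(psign + 8 * i));
    out[4] = (static_cast<uint64_t>(psign[32]) << 56) | kPadBit;
    return out;
}
```

Under those assumptions the shifts in the long version cancel out: each signature[i] is shifted right by 8 and then left by 8 again, with the dropped byte OR-ed back in, so both functions reduce to a big-endian read of the first 32 bytes followed by byte 32 and a 0x80 marker.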

Whoever wrote that needs to stay away from the alcohol...

hahaha ...

you will find the main drive for the creation of most crypto IS alcohol ... not innovation ;) ...

just kidding ...

thats awesome stuff wolf ...

#crysx
