djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
|
March 30, 2015, 09:50:33 PM |
|
One of the ways to do momentum hashing is to search for a triplet birthday; that is, find three nonces such that when the block is hashed with each of them, the resulting hashes match (to a certain width).
There are variations on the theme, some of which work by making it deliberately harder to construct an efficiently searchable compact index (forcing more table searching for each sub-nonce considered). But the idea is that in order to form any kind of valid *whole* momentum nonce you have find to some number of sub-nonces that have some particular mathematical relationship which you can only find by comparing their results, meaning you have to keep an indexed table of the intermediate results. The twin paradox says you can find pairs with some related property fairly quickly, but efficiently finding triplets require either lotsa luck or a pretty full table.
could be interesting... (haven't had time yet to read through the whole discussion)
|
djm34 facebook pageBTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw
|
|
|
smolen
|
|
March 31, 2015, 02:28:29 AM |
|
There are variations on the theme, some of which work by making it deliberately harder to construct an efficiently searchable compact index (forcing more table searching for each sub-nonce considered). It provokes to create efficiently searchable and update-able index, that's good for a database, but, I think, wrong for mining. The scarcest resource going to be memory bandwidth, we can sacrify memory and then CPU for it. The miner has not to be deterministic, some nonces could be thrown away. I'd start with set of buckets with ring buffer inside of each, advance through nonces in big chunks, newly calculated hashes are first uploaded (from cache to main memory) into buckets and then, when entire new chunk is distributed, download each bucket back to fast cache and search for triplets with hash table. Memory access going to be smooth, may be even some HDD bandwidth could be used
|
Of course I gave you bad advice. Good one is way out of your price range.
|
|
|
smolen
|
|
March 31, 2015, 02:45:03 AM |
|
More fun around Bitcoin block structure - the order of transactions in block is almost not restricted, so couple of KB of hidden data could be embedded into blockchain per 1MB block. If updating nonce2 becomes the bottleneck for mining and transaction fees will be significant, miner could iterate through transaction order by swapping children in nodes, optimal block will contain power of two number of transactions - 64, 128, 256 etc
|
Of course I gave you bad advice. Good one is way out of your price range.
|
|
|
Cryddit (OP)
Legendary
Offline
Activity: 924
Merit: 1132
|
|
March 31, 2015, 03:13:06 AM |
|
I figured out the solution to another problem; unfortunately this solution will completely screw with fine distinctions among hashing algorithms. Pretty much the only meaningful parameter remaining to tweak is the amount of memory it'll take, because no matter what happens, mining will be CPU-bound by the rate at which miners can calculate signatures.
And yes, that can be parallelized, and GPU'd, and ASIC'd. I can screw with momentum hash badly enough to make it remain memory size bound, and maybe even somewhat memory bandwidth bound. But it's gonna be ugly.
You guys want a laugh? Here's the current block header structure. See if you can spot the ringer. Hint: I regard mining pools as an existential threat to the stability of any cryptocurrency.
offset bytes block 0 4 0 block version 4 4 0 block height 12 8 0 UNIX timestamp 20 4 0 current target (of this algm, compressed form) 24 4 0 current hash algm 28 4 0 zero, do not use (future update space) 32 32 0 merkle root of current tx tree
0 32 1 hash of prev block 32 32 1 merkle root of current unspent txout set
0 20 2 current nonce (yes damnit, that's huge) 20 32 2 sig of all the above (including nonce) with privkey of coinbase txOut 52 12 2 reserved for SHA2 endbit & length field (it actually uses 8 bytes)
|
|
|
|
smolen
|
|
March 31, 2015, 03:24:09 AM |
|
And yes, that can be parallelized, and GPU'd, and ASIC'd.
Momentum as in Protoshares? If yes, I have a nice puzzle to solve! 20 32 2 sig of all the above (including nonce) with privkey of coinbase txOut
Useless. EDIT: even if coinbase is restricted to the only txout
|
Of course I gave you bad advice. Good one is way out of your price range.
|
|
|
Cryddit (OP)
Legendary
Offline
Activity: 924
Merit: 1132
|
|
March 31, 2015, 03:32:14 AM |
|
I'd start with set of buckets with ring buffer inside of each, advance through nonces in big chunks, newly calculated hashes are first uploaded (from cache to main memory) into buckets and then, when entire new chunk is distributed, download each bucket back to fast cache and search for triplets with hash table. Memory access going to be smooth, may be even some HDD bandwidth could be used Consider how you'd index a search for "triplets" with a hash collision in 35 bits -- except that any 4 of those 35 bits MUST be different between hash A and hash B, and a different 4 bits MUST be different between hash B and hash C. Suddenly, you are either trying to write nonces into many buckets of your table (use too much memory) or trying to read from about ten-thousand different unpredicted places in a simple hash-indexed table (cache abuse). Of course, you may be able to do that by the time you calculate another signature over the next subnonce....
|
|
|
|
Cryddit (OP)
Legendary
Offline
Activity: 924
Merit: 1132
|
|
March 31, 2015, 03:34:00 AM |
|
And yes, that can be parallelized, and GPU'd, and ASIC'd.
Momentum as in Protoshares? If yes, I have a nice puzzle to solve! 20 32 2 sig of all the above (including nonce) with privkey of coinbase txOut
Useless. EDIT: even if coinbase is restricted to the only txout How so? It is my objective to ensure that the person who finds the winning hash is able to spend the txOut. What better way than ensuring that he needs the Privkey in the first place to form the block that must be hashed?
|
|
|
|
smolen
|
|
March 31, 2015, 03:39:49 AM Last edit: March 31, 2015, 03:53:57 AM by smolen |
|
And yes, that can be parallelized, and GPU'd, and ASIC'd.
Momentum as in Protoshares? If yes, I have a nice puzzle to solve! 20 32 2 sig of all the above (including nonce) with privkey of coinbase txOut
Useless. EDIT: even if coinbase is restricted to the only txout How so? It is my objective to ensure that the person who finds the winning hash is able to spend the txOut. What better way than ensuring that he needs the Privkey in the first place to form the block that must be hashed? I've seen code that checks only the first txOut of coinbase, it could be 1 satoshi dust, the rest is transferred to unsiclosed pubkeyhash. If block can contain transaction that spends coinbase txout (almost sure it can), privkey of coinbase txout could be fully disclosed to everyone, the second transaction will transfer to undisclosed pubkey. EDIT: Wait, I was too fast. It's useless for protected closed source miner, such as my Smelter.
|
Of course I gave you bad advice. Good one is way out of your price range.
|
|
|
Cryddit (OP)
Legendary
Offline
Activity: 924
Merit: 1132
|
|
March 31, 2015, 03:50:17 AM |
|
If block can contain transaction that spends coinbase txout (almost sure it can), privkey of coinbase txout could be fully disclosed to everyone, the second transaction will transfer to undisclosed pubkey.
Coinbase txOut is not spendable until more than 100 blocks have passed since it was mined. Ah, I think I get the attack. You're saying that *NEXT* time the miner gets a block, he can spend the 1-satoshi txOut from the current coinbase and claim the rest as 'mining fee?' Good point. I'll go make sure it can't happen. coinbase can only have one txOut and it must be the full amount. EDIT: I'm editing this thing a lot.... Actually, nope. It suffices to ensure that the coinbase has only one txOut. If the miner doesn't make it for the full amount he's due, that's tough toenails; the rest isn't going to appear out of nowhere when he spends his undersized txOut.
|
|
|
|
smolen
|
|
March 31, 2015, 04:17:35 AM |
|
Coinbase txOut is not spendable until more than 100 blocks have passed since it was mined.
Yes, I was wrong about it. Thought it is wallet restriction It suffices to ensure that the coinbase has only one txOut.
Seems so. Now I have to invent new trick for my miner And what kind of 'ringer' did you mean initially? Something with hash? EDIT: May be the purpose of this header structure is to ensure that miners will do hash+signature per nonce, and encourage developing open-source ECC math for GPU?
|
Of course I gave you bad advice. Good one is way out of your price range.
|
|
|
Cryddit (OP)
Legendary
Offline
Activity: 924
Merit: 1132
|
|
March 31, 2015, 10:25:56 PM |
|
Actually opensource ecc code for the GPU would be pretty nice, and possibly worthwhile.
But mostly, what was going on with that was that I was trying to insure that nobody who didn't have the ability to spend the txOut would be finding the hash.
Pool mining is an existential threat to the future of cryptocurrency. I understand why people do it, but putting that much control (more specifically putting everyone that much closer to a 51% attack) into the hands of a pool admin is dangerous.
The other 'ringer' is the root of the merkle tree of unspent txOuts. That allows "rolling root" block chain compression where the oldest blocks can be allowed to fall off the end of the block chain after long enough.
|
|
|
|
smolen
|
|
April 01, 2015, 12:22:04 AM |
|
But mostly, what was going on with that was that I was trying to insure that nobody who didn't have the ability to spend the txOut would be finding the hash.
May be there is a way to provide miner with only partial knowledge of the key, say, miner will be able to sign only message with fixed prefix. Unlikely, but careful with hash used in signing algorithm. With ECDSA ASICs, fast cheap internet and slow PoW pools will continue business as usual. Homomorphic encryption. Seems not ready yet. Pool could prevent miner from spending block reward by ceasing his pending payments and broadcasting the key. So pool gets nothing (like in block withholding attack), but uncooperative miner with high probability takes a loss. Variant: association of miners cooperate, put some funds under escrow, establish shared secret (private key), reward is splitted in p2pool fashion. If uncooperative spend is detected, association waives reward by broadcasting the key and confiscates defector's funds (or burns all funds, if there is no way to determine rogue member). I guess we are on territory of reverse game theory here.
|
Of course I gave you bad advice. Good one is way out of your price range.
|
|
|
tromp
Legendary
Offline
Activity: 990
Merit: 1110
|
|
April 01, 2015, 02:57:55 AM |
|
But mostly, what was going on with that was that I was trying to insure that nobody who didn't have the ability to spend the txOut would be finding the hash.
The pool operator can give each participating miner pre-signed headers to work on. You can only prevent this by having PoW attempts be so cheap that the pool operator (who can likely do hardware accelerated signing) could not hand out signed headers fast enough to keep all miners busy. But your desire for a memory hard PoW suggests rather slow PoW attempts, in which case the pool operator can easily keep up...
|
|
|
|
Cryddit (OP)
Legendary
Offline
Activity: 924
Merit: 1132
|
|
April 01, 2015, 03:45:37 AM |
|
The pool operator can give each participating miner pre-signed headers to work on.
What good is a pre-signed header when each signature is only good for one nonce?
|
|
|
|
smolen
|
|
April 01, 2015, 11:59:40 AM |
|
Actually opensource ecc code for the GPU would be pretty nice, and possibly worthwhile.
Long residue math would be fun to do in OpenCL, but is any coin ready for bulk signature verification?
|
Of course I gave you bad advice. Good one is way out of your price range.
|
|
|
tromp
Legendary
Offline
Activity: 990
Merit: 1110
|
|
April 01, 2015, 02:39:17 PM |
|
The pool operator can give each participating miner pre-signed headers to work on.
What good is a pre-signed header when each signature is only good for one nonce? I thought your nonce field defined the 2^64 search space. Re-reading your earlier post, I now see that it defines the triple-nonce solution within the search space as well. But if every element in your search space requires a new signature then your PoW will be very compute intensive, your search space rather limited in size, and thus not particularly memory hard.
|
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
|
April 01, 2015, 03:09:06 PM |
|
The pool operator can give each participating miner pre-signed headers to work on.
What good is a pre-signed header when each signature is only good for one nonce? I thought your nonce field defined the 2^64 search space. Re-reading your earlier post, I now see that it defines the triple-nonce solution within the search space as well. But if every element in your search space requires a new signature then your PoW will be very compute intensive, your search space rather limited in size, and thus not particularly memory hard. actually it depends a bit how the 3 nonces are searched (it would become mem hard, if this implies to store the hashes generated by previous nonce). but yes, it would probably be possible to search for 3 nonces in one pass... could take quite some time... but I am wondering if it could be possible to search the 3 nonces in parallel or is it F1(nonce1) some_operation F2(nonce2) some_operation F3(nonce3)
|
djm34 facebook pageBTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw
|
|
|
Cryddit (OP)
Legendary
Offline
Activity: 924
Merit: 1132
|
|
April 01, 2015, 05:57:04 PM |
|
As I said, I haven't done much programming on GPUs so I don't know in great detail yet what their capabilities are. Right now I'm just working from the conventional knowledge that they're highly parallel and have 2Gbytes or so of memory - and memory bandwidth that's about 10x as fast as CPU memory bandwidth.
Right now the "Big Honkin' Server" task looks like searching for triplet subnonces whose hashes match - EXCEPT for one bit - for the first fifty bits. Because of the way the triple is represented, the nonces have to be within a 45-bit range of each other, but that's not really an issue because you can spend all day searching inside a 45-bit range.
You can compute them in parallel, but you're going to need a lot of data structure to store them in. For every one you compute you have fifty different places to look for a matching nonce. The more previous results you can store, the better the odds that you'll have a couple of matching nonces in that set of places and be able to make a triplet. In fact if you find more than two you can make more than one triplet.
That's a pretty brutal task. I have been trying to think of every possible way for a GPU program to "cheat" at it using its 10x memory bandwidth advantage in a 2Gbyte data structure - and then applying the same compression techniques to a 64G data structure to make the task even more brutal. This one is supposed to be for big-memory servers only, and GPUs will find a far more profitable use of their time chewing on the SHA256D stuff that the big-memory servers (which don't usually have any use for high-end graphics cards and aren't built with them) can't do nearly as fast as they can.
It's actually kind of fun - I haven't done this kind of "use every trick in the book to absolutely maximize performance given a particular set of resources" programming in years.
|
|
|
|
smolen
|
|
April 01, 2015, 07:47:28 PM |
|
Right now the "Big Honkin' Server" task looks like searching for triplet subnonces whose hashes match - EXCEPT for one bit - for the first fifty bits. Because of the way the triple is represented, the nonces have to be within a 45-bit range of each other, but that's not really an issue because you can spend all day searching inside a 45-bit range.
Draft idea: two values that differ only in one bit will be separated by Hamming distance of 1. Miner could backet hashes by Hamming weight, possibly discarding hashes with low and high weight, then generate Bloom filter for even backets and perform search for odd ones in two neighbor (even, +-1) backets.
|
Of course I gave you bad advice. Good one is way out of your price range.
|
|
|
AliassailA
Newbie
Offline
Activity: 50
Merit: 0
|
|
April 02, 2015, 12:50:56 AM |
|
It might work on a quad channel lga 2011... I think you would exceed bandwidth here under normal circumstances. If I am wrong it is because I'm stupid.
|
|
|
|
|