Bitcoin Forum
December 16, 2017, 07:25:36 AM *
Author Topic: The Ethereum Paradox  (Read 84474 times)
iamnotback (Sr. Member, Activity: 336)
June 10, 2016, 10:20:05 AM  #1081

I am just arguing why not be safer than sorry when so much is at stake? Why would you recommend adding unnecessary risk even where you think you are omniscient and there is no risk?

I only need good statistical properties of the hash function, not cryptographic security.

Pseudorandomness ("good statistical properties") is tied into the cryptographic security:

Quote from: Blake2 paper
BLAKE2 aims to provide the highest security level, be it in terms of classical notions as (second) preimage or collision resistance, or of theoretical notions as pseudorandomness (a.k.a. indistinguishability) or indifferentiability.

Relevant terms are 'indistinguishability' and 'indifferentiability'.

Siphash is sufficient for the use case it was designed for because the attacker doesn't know the input to the hash function. It is not the case that Siphash has been cryptanalysed for the case of your application of Siphash in Cuckoo Cycle wherein the attacker knows both the input and output of the hash function.
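For reference on the primitive under discussion: Cuckoo Cycle keys SipHash with the header hash and hashes nonces to derive edge endpoints, so the attacker indeed knows both input and output. Below is a from-the-spec pure-Python sketch of SipHash-c-d, convenient for experimenting with the 2-4 versus 4-8 round counts debated here; it is a readability sketch, not tromp's implementation, and not constant-time or fast.

```python
# Minimal SipHash-c-d sketch (pure Python). In Cuckoo Cycle the attacker
# knows the nonce (input) and the derived edge endpoint (output); only the
# 128-bit key, derived from the header, varies.

MASK = (1 << 64) - 1

def _rotl(x, b):
    return ((x << b) | (x >> (64 - b))) & MASK

def _sipround(v0, v1, v2, v3):
    v0 = (v0 + v1) & MASK; v1 = _rotl(v1, 13); v1 ^= v0; v0 = _rotl(v0, 32)
    v2 = (v2 + v3) & MASK; v3 = _rotl(v3, 16); v3 ^= v2
    v0 = (v0 + v3) & MASK; v3 = _rotl(v3, 21); v3 ^= v0
    v2 = (v2 + v1) & MASK; v1 = _rotl(v1, 17); v1 ^= v2; v2 = _rotl(v2, 32)
    return v0, v1, v2, v3

def siphash(key, data, c=2, d=4):
    """SipHash-c-d of `data` (bytes) under a 16-byte key."""
    k0 = int.from_bytes(key[:8], "little")
    k1 = int.from_bytes(key[8:16], "little")
    v0 = k0 ^ 0x736F6D6570736575
    v1 = k1 ^ 0x646F72616E646F6D
    v2 = k0 ^ 0x6C7967656E657261
    v3 = k1 ^ 0x7465646279746573
    # Zero-pad to a multiple of 8 bytes; the final byte encodes len mod 256.
    padded = data + b"\x00" * (7 - len(data) % 8) + bytes([len(data) & 0xFF])
    for i in range(0, len(padded), 8):
        m = int.from_bytes(padded[i:i + 8], "little")
        v3 ^= m
        for _ in range(c):
            v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
        v0 ^= m
    v2 ^= 0xFF
    for _ in range(d):
        v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
    return v0 ^ v1 ^ v2 ^ v3
```

With `c=4, d=8` the same function computes the siphash-4-8 variant mentioned below; Cuckoo Cycle's actual endpoint derivation (hashing 2*nonce+uorv modulo the node count) differs in detail and lives in tromp's C sources.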

Note that every cryptographic-hash submission Dan Bernstein has made has been rejected due to breakage; that appears to be the one area where he does not excel. His ARX rounds (from the ChaCha stream cipher) were redeployed in Blake2 by people who know how to make them cryptographically secure in the different setting of a hash function. I admire Dan a lot; he knows far more than I do about everything involving higher maths and cryptography.

In fact siphash-2-4 is already overkill for my purposes, with siphash-4-8 being close to cryptographically secure. Your attack is about as conceivable to me as P being equal to NP, in which case there is no cryptographically secure hash function anyway.

Fact? Based on what published cryptanalysis?

What I am saying is that the edge trimming eliminates most of the nodes from consideration and then you have a hash table representing a sparse array for the remainder of the nodes which aren't the pruned/trimmed leaf edges.

Yes, totally correct. But in my view the term "bucket" denotes a fixed size unordered container, and my current algorithms use no such thing.

Agreed, a 'bucket' would not be an ordered array element; that is the more precise use of the term. Those of us who started hacking in the 1970s and 1980s can be a bit looser with terminology (using 'bucket' in its general English sense to refer to an array element) than someone who came up predominantly in an academic setting, especially a non-native speaker of technical English, who would likely be stricter about definitions. I stand corrected on the more precise terminology.

Well my wink is about using maximum nonce counts, i.e. edge fraction M/N > 1/2.

My algorithms break down in that case, since the expected number of cycles becomes non-constant, causing the basic algorithm to overlook an increasing fraction of them, and edge-trimming fails altogether to remove a large fraction of edges.

Exactly. Wink In a few months, you'll know what I was referring to when it is published.
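The edge trimming discussed above can be sketched concretely: repeatedly count node degrees on one side of the bipartite graph and discard edges incident to a degree-1 node, since such edges cannot lie on a cycle. A toy version over a random graph, with sizes chosen purely for illustration:

```python
import random

def trim(edges, n, rounds=10):
    """Drop edges whose endpoint on the current side has degree 1.
    Degree-1 nodes cannot be on a cycle. Alternates sides each round."""
    for r in range(rounds):
        side = r % 2                       # 0: trim on u-side, 1: on v-side
        deg = [0] * n
        for e in edges:
            deg[e[side]] += 1
        edges = [e for e in edges if deg[e[side]] > 1]
    return edges

random.seed(1)
n = 1 << 12                                # nodes per side (illustrative)
m = n // 2                                 # edge fraction M/N = 1/2
edges = [(random.randrange(n), random.randrange(n)) for _ in range(m)]
survivors = trim(edges, n)
print(len(edges), len(survivors))          # the vast majority is trimmed
```

At M/N = 1/2 each round removes a large constant fraction of edges, which is why trimming pays off; as tromp notes, pushing M/N above 1/2 leaves a giant component where this stops working.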

My calculation was 2^15, that isn't a million.

You need millions to find 2^12 of them wanting to access a single row.

With 2^15 threads you expect 1 access per row.

Correct on 2^15 but not on the conclusion. We both agree the number of memory banks is irrelevant for this concern. Memory banks only determine how many row buffers are active simultaneously, which impacts maximum speed and whether we are memory bandwidth or memory latency bound on speed of execution.

So for the power consumption issue, if we have 2^29 counters (2 bits each) and 2^14 counters per memory page/row, then 2^16 (65536) h/w threads means we can stall threads (perhaps not permanently, to account for variance outliers) and be statistically assured of roughly 2 accesses per row. That already doubles the hash computations per memory page row.

Thus we don't need millions of h/w threads to make the algorithm computation bound on power consumption, and thus not ASIC resistant.

At 2^18 (262144) h/w threads we have 8 times more computation per memory page row than the CPU (and the computation will be orders-of-magnitude more power efficient on the ASIC).
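The arithmetic running through this exchange can be checked directly: with 2^29 two-bit counters at 2^14 counters per DRAM row there are 2^15 rows, so T stalled threads give an expected T/2^15 coalesced accesses (amortized hashes) per row activation. A quick sketch, with a simulation confirming random addresses really spread that evenly:

```python
import random

COUNTERS = 1 << 29                 # 2-bit counters in the trimming pass
COUNTERS_PER_ROW = 1 << 14         # counters per DRAM page/row
ROWS = COUNTERS // COUNTERS_PER_ROW  # 2^15 rows

def expected_accesses_per_row(threads):
    """Average coalesced accesses per row activation for `threads` threads."""
    return threads / ROWS

assert expected_accesses_per_row(1 << 15) == 1       # 1 access per row
assert expected_accesses_per_row(1 << 16) == 2       # ~2 per row
assert expected_accesses_per_row(1 << 18) == 8       # 8 hashes per row load
assert expected_accesses_per_row(1 << 20) == 32      # 32 per row
assert expected_accesses_per_row(1 << 27) == 1 << 12 # tromp's 2^12 figure

# Simulate 2^18 random counter addresses spreading over the rows.
random.seed(0)
hits = [0] * ROWS
for _ in range(1 << 18):
    hits[random.randrange(COUNTERS) // COUNTERS_PER_ROW] += 1
mean = sum(hits) / ROWS
print(mean)   # exactly 8 on average, Poisson spread around it
```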

A CUDA GPU can apparently run 671 million threads, although these are probably not synced on memory row buffers as we would need here (though the statistical spread might be sufficient without syncing). I think what you would have observed with a Kill-A-Watt meter is that, as threads increase, speed stays topped out at the memory-bandwidth bound while the GPU's power consumption decreases beyond a certain thread count (assuming GPU threads are very efficient, which might not be the case).

I don't think it is safe to assume that ASICs can't be designed to support millions of very efficient threads for this very customized computation, again sharing the compute transistors among only the threads that aren't stalled, so millions of instances of the compute units are not needed. The GPU may already do this, though probably not, since it is presumably optimized for performance rather than power consumption where there is a choice. Once memory-bandwidth bound, one might hope the GPU would minimize power by maximizing row-buffer coalescing, but that is perhaps too costly to implement in the general-purpose GPU case (whereas I am arguing it is feasible in an ASIC with a specific custom compute circuit). Apparently GPU memory coalescing is not powerful enough to do what we require.

Also remember I need a very fast proving time for instant transactions, which means that at 2^20 counters it could be trivially parallelized, losing ASIC resistance with only thousands of threads.

Actually, how ironic: using 2-bit counters instead of the original basic algorithm makes the ASIC resistance worse, because more counters fit in a row. Remember I wrote about this "bit array" in my 2013 rough-draft paper on memory-hard proof-of-work.
tromp (Hero Member, Activity: 505)
June 10, 2016, 11:37:41 AM  #1082

In fact siphash-2-4 is already overkill for my purposes, with siphash-4-8 being close to cryptographically secure. Your attack is about as conceivable to me as P being equal to NP, in which case there is no cryptographically secure hash function anyway.

Fact? Based on what published cryptanalysis.

I meant to say: "In fact I believe that siphash-2-4 is already overkill for my purposes".

Quote
You need millions to find 2^12 of them wanting to access a single row.
With 2^15 threads you expect 1 access per row.
Correct on 2^15 but not on the conclusion.

With 2^27 you expect 2^12 accesses per row. That's hundreds of millions.

Quote
A CUDA gpu can apparently do 671 million threads

Not in hardware. It can only run a few thousand in hardware (simultaneously).
So the 671 million have to run in thousands of batches, one after the other.

Quote
I don't think it is safe to assume that ASICs can't be designed to support millions of very efficient threads for this very customized computation

Your ASIC would require an insanely huge die and be orders of magnitude more expensive than the DRAM it tries to optimize access to.

Quote
and again sharing the computation transistors amongst only the threads that aren't stalled, thus not needing millions of instances of the compute units.

How would you know what to stall without doing its computation first??

Quote
Also remember I need for instant transactions to have a very fast proving time, which thus means at 2^20 counters it could be trivially parallelized losing ASIC resistance with only thousands of threads.

For such small instances, latency is not that relevant as it all fits in SRAM cache.
iamnotback (Sr. Member, Activity: 336)
June 10, 2016, 12:30:00 PM  #1083

You need millions to find 2^12 of them wanting to access a single row.
With 2^15 threads you expect 1 access per row.
Correct on 2^15 but not on the conclusion.

With 2^27 you expect 2^12 accesses per row. That's hundreds of millions.

That would give you 2^12 hashes per row-buffer load. That isn't necessary to break ASIC resistance. 2^18 (262144) threads gives 8 hashes per row-buffer load, which might be enough to make it computation bound, up to perhaps a 100-to-1 power-consumption advantage for the ASIC.

Or in any case, 2^20 (million threads) gives 32 hashes per row buffer.

And remember my use case is 2^20 counters not 2^29, so in my use case the ASIC only needs 1000s of threads to make it computation bound up to the limits of the maximum efficiency advantage of the ASIC.

A CUDA gpu can apparently do 671 million threads

Not in hardware. It can only run a few thousand in hardware (simultaneously).
So the 671 million have to run in thousands of batches, one after the other.

Yeah I also stated in the prior post that the GPU isn't likely optimized enough to cause major advantage. The ASIC is the big concern.

I don't think it is safe to assume that ASICs can't be designed to support millions of very efficient threads for this very customized computation

Your ASIC would require an insanely huge die and be orders of magnitude more expensive than the DRAM it tries to optimize access to.

Certainly not in my use case of 2^20 counters.

And in the 2^29 counters use case, please remember I said that you don't need millions of compute units. The compute units can be shared by the non-stalled threads, and even these can be pipelined and queued. So you don't need millions of compute-unit circuits; the millions are only logical threads.

and again sharing the computation transistors amongst only the threads that aren't stalled, thus not needing millions of instances of the compute units.

How would you know what to stall without doing its computation first??

That is irrelevant. Think it out. It is a statistical phenomenon. Not all the computations are performed monolithically, followed by all the syncs. These stages happen in parallel across 2^8 memory banks (row buffers), so we can have many threads stalled while some are not. And even the computations can be pipelined and queued. Let's not conflate the power consumption goal with the speed goal.

It is very very difficult to defeat the ASIC. I think I know how to do it though.

Also remember I need for instant transactions to have a very fast proving time, which thus means at 2^20 counters it could be trivially parallelized losing ASIC resistance with only thousands of threads.

For such small instances, latency is not that relevant as it all fits in SRAM cache.

Latency is a speed-bound concern. Our concern here is being power-consumption bound. That is what I am trying to get you to focus on: these are orthogonal issues.

The ASIC can choose to use DRAM instead and amortize the power consumption over the row buffer. So I am not yet sure if that would give the ASIC an advantage. Although I've read that SRAM is 10X faster with 10X the power consumption (thus neutral relative to DRAM), I am wondering whether, if the ASIC can coalesce (sync) all the reads in the DRAM row buffer, DRAM might not have a significant power advantage, rendering the algorithm computation bound for power consumption.

Do you have any idea? I was planning to study the equations for memory power consumption in the reference I cited upthread.
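Pending that study, the tradeoff can be framed with a toy energy model: energy per counter update is roughly the row-activation energy amortized over the updates coalesced into that activation, plus the hash energy. Every constant below is a hypothetical placeholder for illustration, not a figure from any datasheet or from the reference cited upthread:

```python
# Illustrative energy model; all constants are hypothetical placeholders,
# not measurements.
ROW_ACTIVATE_PJ = 1000.0   # energy to open one DRAM row (assumed)
HASH_PJ_CPU     = 500.0    # siphash energy on a CPU (assumed)
HASH_PJ_ASIC    = 5.0      # siphash energy on an ASIC (assumed 100x better)

def energy_per_update(hash_pj, updates_per_row_activation):
    """Amortize one row activation over the updates coalesced into it."""
    return ROW_ACTIVATE_PJ / updates_per_row_activation + hash_pj

cpu  = energy_per_update(HASH_PJ_CPU, 1)    # CPU: ~1 update per activation
asic = energy_per_update(HASH_PJ_ASIC, 32)  # ASIC: 32 coalesced updates
print(cpu, asic, cpu / asic)
```

Under these made-up numbers the ASIC's advantage comes mostly from amortizing the activation cost over coalesced updates, which is exactly the mechanism in dispute above.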

Note I think on Android it may be possible to turn off the cache.
tromp (Hero Member, Activity: 505)
June 10, 2016, 01:28:47 PM  #1084

The ASIC can choose to use DRAM instead and amortize the power consumption over the row buffer.

Here's a possible concrete design of what you are hinting at:

The ASIC will have as many (pipelined) siphash circuits as are needed to max out the memory
bandwidth of all the DRAM you're gonna hook up to it.
For each row on each memory bank, the ASIC has a queue of in-row counter-indices.
Each entry in the queue thus represents a "stalled thread".
The siphash circuits fill up these queues at a mostly uniform rate.
Once a queue fills up it is flushed to one of the memory controllers on the ASIC that
will perform the corresponding atomic updates on the memory row.

This looks pretty efficient, but all those queues together actually take up a lot of memory,
which the ASIC needs to access randomly.

So it seems we've just recreated the problem we were trying to solve.
But we can aim for the total queue memory to be maybe 1% of the total DRAM.

There's a lot of parameters here and many different ways to optimize with
combinations of speed, energy cost, and hardware cost...
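The queue design above can be sized with simple arithmetic: one queue per row, each entry an in-row counter index (14 bits for 2^14 counters per row), measured against the 2-bit-counter DRAM it serves. A sketch of that overhead calculation, under the parameters used in this thread:

```python
import math

COUNTERS = 1 << 29
COUNTERS_PER_ROW = 1 << 14
ROWS = COUNTERS // COUNTERS_PER_ROW             # 2^15 queues, one per row
INDEX_BITS = int(math.log2(COUNTERS_PER_ROW))   # 14 bits per in-row index
DRAM_BITS = COUNTERS * 2                        # 2-bit counters

def queue_overhead(depth):
    """Fraction of the counter DRAM consumed by queues of `depth` entries."""
    return ROWS * depth * INDEX_BITS / DRAM_BITS

# Depth-32 queues, i.e. 32 coalesced updates per row activation:
print(queue_overhead(32))   # ~1.4% of the counter memory
```

Queue depth trades directly against overhead: depth 32 lands near the "maybe 1% of the total DRAM" target, while depth 2^14 (queueing a whole row) would dwarf the counters themselves.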

iamnotback (Sr. Member, Activity: 336)
June 10, 2016, 02:15:52 PM  #1085

The ASIC can choose to use DRAM instead and amortize the power consumption over the row buffer.

Here's a possible concrete design of what you are hinting at:

The ASIC will have as many (pipelined) siphash circuits as are needed to max out the memory
bandwidth of all the DRAM you're gonna hook up to it.
For each row on each memory bank, the ASIC has a queue of in-row counter-indices.
Each entry in the queue thus represents a "stalled thread".
The siphash circuits fill up these queues at a mostly uniform rate.
Once a queue fills up it is flushed to one of the memory controllers on the ASIC that
will perform the corresponding atomic updates on the memory row.

This looks pretty efficient, but all those queues together actually take up a lot of memory,
which the ASIC needs to access randomly.

So it seems we've just recreated the problem we were trying to solve.
But we can aim for the total queue memory to be maybe 1% of the total DRAM.

There's a lot of parameters here and many different ways to optimize with
combinations of speed, energy cost, and hardware cost...

Right, we can't queue anywhere near 2^14 counter indices per row, else nothing is gained. Good point.

If the compute units are much more efficient and faster than we can fully leverage within the memory-bandwidth and counter-index-queue limits, we could also consider enumerating multiple nonces and discarding those that don't match a subset of page rows, i.e. processing the cuckoo table in chunks with multiple passes. That decreases the effective memory size for each pass. The edge pruning would need to be done after all passes. Can't the pruning be done without random access?

Basically what I am driving at is we can't rely on the row buffer load power cost being greater than only 1 hash. We'd need wider margins to be very sure of ASIC resistance, such as 1024 row buffer loads for 1 hash. Wink

tromp you are a smart guy, so please don't go publishing my idea before I get to market. You might be able to deduce what I am up to based on the prior paragraph.

You'll receive proper credit as you deserve it. This is your invention.

Don't ruin your future fame by deducing and publishing what I am up to prematurely, given the liberal hints I have given you, and thereby ruin my ability to leverage first-mover advantage in the market. If some shitcoin takes the idea and runs with it, the impact will be diluted. Please, for the love of God, don't start having delusions that Blockstream is your friend. Do you enjoy being used?
tromp (Hero Member, Activity: 505)
June 10, 2016, 03:25:38 PM  #1086

we could also consider enumerate multiple nonces and discarding those that don't match a subset of page rows, i.e. process the cuckoo table in chunks with multiple passes.

As mentioned before, this is exactly what higher values of PART_BITS achieve,
as you can see in cuckoo_miner.h and cuda_miner.cu

Each trimming round uses 2^PART_BITS passes with the number of counters reduced by the same factor.
With sufficient reduction, you end up using only one row in each memory bank.
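The PART_BITS partitioning tromp describes (in his cuckoo_miner.h and cuda_miner.cu sources) can be modeled abstractly: each trimming round becomes 2^PART_BITS passes, each pass holding counters for only one partition of the nodes, at the cost of re-hashing every edge in every pass. A toy model of that tradeoff, not his actual implementation:

```python
def part_bits_cost(n_nodes, n_edges, part_bits):
    """Counter memory and hash work for one trimming round with partitioning."""
    passes = 1 << part_bits
    counters_live = n_nodes // passes   # counters held at once
    hashes = n_edges * passes           # every edge re-hashed in every pass
    return counters_live, hashes

N = 1 << 29
M = N // 2
assert part_bits_cost(N, M, 0) == (N, M)              # no partitioning
assert part_bits_cost(N, M, 21) == (1 << 8, M << 21)  # 2^8 counters,
print(part_bits_cost(N, M, 21))                       # 2^21x the hashing
```

This matches the figures quoted downthread: PART_BITS=21 shrinks the live counters to 2^8 (one per memory bank) while multiplying hash computations by 2^21, over 2 million.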
iamnotback (Sr. Member, Activity: 336)
June 10, 2016, 06:01:23 PM  #1087

we could also consider enumerate multiple nonces and discarding those that don't match a subset of page rows, i.e. process the cuckoo table in chunks with multiple passes.

As mentioned before, this is exactly what higher values of PART_BITS achieve, as you can see in cuckoo_miner.h and cuda_miner.cu

Each trimming round uses 2^PART_BITS passes with the number of counters reduced by the same factor. With sufficient reduction, you end up using only one row in each memory bank.

What I meant before is that it isn't the same tradeoff as the h/w threads, because as you said it increases the number of hashes computed:

Alternatively, the algorithm parameter PART_BITS allows for a reduction in the number of counters in use at the same time, which is what your proposal essentially amounts to. Setting this to 21 will require only 2^8 counters, one per memory bank. But now your hash computations have increased by a factor 2^21, over 2 million.

No that is not equivalent to increasing the number of h/w threads and syncing them to pause until 2^13 of them are queued up to read a shared memory page/row.

The point is the ASIC can balance some of the two strategies to find the optimum. The CPU can't effectively leverage either strategy.
iamnotback (Sr. Member, Activity: 336)
June 11, 2016, 06:11:04 AM  #1088

I forgot to thank tromp for his very valuable interaction. Thanks John.
hv_ (Hero Member, Activity: 672)
June 11, 2016, 06:13:31 AM  #1089

I forgot to thank tromp for his very valuable interaction. Thanks John.

Thx to you both. Always great readings!

tromp (Hero Member, Activity: 505)
June 11, 2016, 10:15:03 AM  #1090

I forgot to thank tromp for his very valuable interaction. Thanks John.

You're welcome, Shelby.
r0ach (Legendary, Activity: 1260)
June 17, 2016, 06:59:59 PM  #1091


BitUsher (Hero Member, Activity: 840)
June 17, 2016, 07:09:51 PM  #1092

 Shocked

https://twitter.com/el33th4xor/status/743881413355773952

Ouch. The "Stalker Attack" is real. I'm told that the DAO hacker voted YES on every split proposal, enabling him to raid children DAOs.
AlexGR (Legendary, Activity: 1442)
June 17, 2016, 08:17:28 PM  #1093

Shocked

https://twitter.com/el33th4xor/status/743881413355773952

Ouch. The "Stalker Attack" is real. I'm told that the DAO hacker voted YES on every split proposal, enabling him to raid children DAOs.

BitUsher (Hero Member, Activity: 840)
June 17, 2016, 09:13:17 PM  #1094

Look at how they pumped this undertested and buggy shit!!!

https://twitter.com/stephantual/status/743093798302015488

"Holding so much energy, the Colossus of DAO is able to withstand all threats" -



Incredibly unethical!
illodin (Hero Member, Activity: 966)
June 17, 2016, 11:38:41 PM  #1095

Shocked

https://twitter.com/el33th4xor/status/743881413355773952

Ouch. The "Stalker Attack" is real. I'm told that the DAO hacker voted YES on every split proposal, enabling him to raid children DAOs.



iamnotback (Sr. Member, Activity: 336)
June 17, 2016, 11:40:27 PM  #1096

Look at how they pumped this undertested and buggy shit!!!

https://twitter.com/stephantual/status/743093798302015488

"Holding so much energy, the Colossus of DAO is able to withstand all threats" -



Incredibly unethical!

The theme of overpowering and disrupting the corrupt order of the world is a very idealistic and powerful meme.

Anyone marketing a DAO should be incorporating that meme.

Why unethical? Just because Tual can't code worth a shit doesn't mean the DAO as a concept is dead.
BitUsher (Hero Member, Activity: 840)
June 17, 2016, 11:46:03 PM  #1097

Just because Tual can't code worth a shit doesn't mean the DAO as a concept is dead.

Sure, DAOs and smart contracts in general are fine to research and test (not "The DAO"), but they are also incredibly difficult to execute and find use cases for, so I am a bit skeptical when people create ICO scams and crowdfund these toys. Like Bitcoin, people should create working concepts first and then allow the market to determine the value afterwards.
iamnotback (Sr. Member, Activity: 336)
June 17, 2016, 11:47:24 PM  #1098

Just because Tual can't code worth a shit doesn't mean the DAO as a concept is dead.

Sure, DAOs and smart contracts in general are fine to research and test (not "The DAO"), but they are also incredibly difficult to execute and find use cases for, so I am a bit skeptical when people create ICO scams and crowdfund these toys. Like Bitcoin, people should create working concepts first and then allow the market to determine the value afterwards.

Okay I agree that pooling $168 million into one DAO (on 2000 lines of unproven Sol code) was ridiculously risky.
BitUsher (Hero Member, Activity: 840)
June 17, 2016, 11:51:07 PM  #1099

Just because Tual can't code worth a shit doesn't mean the DAO as a concept is dead.

Sure, DAOs and smart contracts in general are fine to research and test (not "The DAO"), but they are also incredibly difficult to execute and find use cases for, so I am a bit skeptical when people create ICO scams and crowdfund these toys. Like Bitcoin, people should create working concepts first and then allow the market to determine the value afterwards.

Okay I agree that pooling $168 million into one DAO (on 2000 lines of unproven Sol code) was ridiculously risky.

Not just risky, but it likely involves some unethical sociopaths dishonestly pumping the hell out of this without any consideration of the consequences. We should know by now that ICOs are unacceptable in our space, given the number of failures. Even if Ethereum initially had noble intentions, the very nature of the ICO makes failure more likely by misaligning the incentives.

Even now Vitalik has to be reminded of his conflict of interest in the HF he's advocating. It is really sad, as many of these "projects" may be 419s, or indistinguishable from advance-fee fraud (AFF).
iamnotback (Sr. Member, Activity: 336)
June 18, 2016, 12:03:21 AM  #1100

Just because Tual can't code worth a shit doesn't mean the DAO as a concept is dead.

Sure, DAOs and smart contracts in general are fine to research and test (not "The DAO"), but they are also incredibly difficult to execute and find use cases for, so I am a bit skeptical when people create ICO scams and crowdfund these toys. Like Bitcoin, people should create working concepts first and then allow the market to determine the value afterwards.

Okay I agree that pooling $168 million into one DAO (on 2000 lines of unproven Sol code) was ridiculously risky.

Not just risky, but it likely involves some unethical sociopaths dishonestly pumping the hell out of this without any consideration of the consequences.

My theory is it's the Millennials, i.e. generational cultural differences.

I am thinking they just get really excited and rush in.

Sociopath seems too mean-spirited. I understand you want to enforce some self-regulation and discipline in altcoins, but please note I think that regulation is self-defeating. We can agree to disagree. I am an anarchist.

What you're seeing now is the real me. I was trying to be someone I wasn't, I guess to appease others.

Sorry, I am really a minarchist at heart, bordering on full anarchist.

...my internal compass throughout my life has always been non-interventionist...



We should know by now that ICOs are unacceptable in our space, given the number of failures. Even if Ethereum initially had noble intentions, the very nature of the ICO makes failure more likely by misaligning the incentives.

Even now Vitalik has to be reminded of his conflict of interest in the HF he's advocating. It is really sad, as many of these "projects" may be 419s, or indistinguishable from advance-fee fraud (AFF).

Yeah the insiders shouldn't control everything. But that was what a DAO was supposed to fix. Lol.