Bitcoin Forum
Pages: « 1 [2] 3 »  All
  Print  
Author Topic: [BBR] Boolberry Hash-on-blockchain discussion  (Read 6826 times)
smooth
Legendary
*
Offline Offline

Activity: 2968
Merit: 1198



View Profile
April 29, 2014, 11:52:10 PM
Last edit: April 30, 2014, 12:44:20 AM by smooth
 #21

I may not understand what you are trying to accomplish.

If H1 is slow but does not require a lot of random access to memory, then you can run H1 on a GPU or ASIC, then deliver a set of indexes into the blockchain to the node. If the blockchain fits in memory then you are doing a handful of memory accesses and the other work may dominate. If the blockchain does not fit in memory, then you are giving a huge advantage to people with large solid state drives (flash or battery-backed DRAM) or, probably better, the ability to store the blockchain in an in-memory KVS across multiple servers.

This may frustrate decentralization because you are better off just maintaining a connection to a node/pool with such a device than running a node yourself.

If you want to use the block chain for PoW like Ethereum to require miners to run nodes (but see above), then you can probably do something simple like:

B = block
E = hash function, such as Keccak
B(i) = blockchain data at index i (mod len(blockchain) or some such)

H1 = E(B)
H2 = E(B+1)
PoW = E(B(H1) + H2)

Could be repeated, but not sure that adds much.

Maybe that is close to what you propose, but again I don't see the point to using a scratchpad at all. The blockchain is essentially your scratchpad.
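A minimal sketch of the scheme above, for illustration only: sha3_256 stands in for Keccak, reading "B+1" as a one-byte tweak of the block blob and the chain as a list of byte strings are my assumptions about intent, not the actual Boolberry code.

```python
import hashlib

def E(data: bytes) -> bytes:
    """Stand-in for Keccak (sha3_256 used here for illustration)."""
    return hashlib.sha3_256(data).digest()

def pow_hash(block_blob: bytes, blockchain: list) -> bytes:
    """PoW = E(B(H1) + H2): one data-dependent lookup into the chain."""
    h1 = E(block_blob)
    h2 = E(block_blob + b"\x01")  # "B+1" from the post, read here as a tweak
    i = int.from_bytes(h1, "big") % len(blockchain)  # index mod chain length
    return E(blockchain[i] + h2)
```

The point of the lookup is that a miner must hold the chain data to compute the index dereference, whatever hash E actually is.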

crypto_zoidberg (OP)
Hero Member
*****
Offline Offline

Activity: 976
Merit: 646



View Profile WWW
April 30, 2014, 10:19:39 AM
Last edit: April 30, 2014, 11:07:12 AM by crypto_zoidberg
 #22

I may not understand what you are trying to accomplish.

If H1 is slow but does not require a lot of random access to memory, then you can run H1 on a GPU or ASIC, then deliver a set of indexes into the blockchain to the node.
That's why H1 has to be fast, as you wrote in the shortcomings.

If the blockchain fits in memory then you are doing a handful of memory accesses and the other work may dominate. If the blockchain does not fit in memory, then you are giving a huge advantage to people with large solid state drives (flash or battery DRAM) or probably better the ability to store the block chain in a memory kvs across multiple servers.

This may frustrate decentralization because you are better off just maintaining a connection to a node/pool with such a device than running node yourself.
Let's do some calculations to see what we have.
To get hashing data from a block we use:
1. Coinbase outs: usually 10*32 = 320 bytes.
2. Tx hashes: 32 * (from 1 to 80) (80 is the current Bitcoin transaction flow) = from 32 to 2560 bytes.

At 720 blocks per day the scratchpad will grow by between 92 MB and 758 MB per year. Enough to keep ASICs away, but fine for normal miners. Even if we are very successful and reach Bitcoin-like tx flow in the next ten years, the scratchpad will be about 10 GB, which is not a problem even now.
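A quick back-of-the-envelope check of these growth figures (constants taken from the list above; names are illustrative):

```python
# Scratchpad growth per year from per-block entry sizes quoted in the post.
BLOCKS_PER_DAY = 720
COINBASE_BYTES = 320        # ~10 outputs * 32 bytes each
TX_HASH_BYTES_MIN = 32      # 1 tx hash per block
TX_HASH_BYTES_MAX = 2560    # 80 tx hashes (Bitcoin-like flow)

def yearly_growth_mb(bytes_per_block: int) -> float:
    """MB added to the scratchpad per year at a fixed per-block size."""
    return BLOCKS_PER_DAY * 365 * bytes_per_block / 1e6

low = yearly_growth_mb(COINBASE_BYTES + TX_HASH_BYTES_MIN)
high = yearly_growth_mb(COINBASE_BYTES + TX_HASH_BYTES_MAX)
print(f"{low:.1f} MB/year .. {high:.1f} MB/year")
```

This reproduces the ~92 MB to ~757 MB range stated above.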

The real problem, I think, is supporting an SPV client with this approach.
What do you think?

If you want to use the block chain for PoW like Ethereum to require miners to run nodes (but see above), then you can probably do something simple like:
B = block
E = hash function, such as Keccak
B(i) = blockchain data at index i (mod len(blockchain) or some such)

H1 = E(B)
H2 = E(B+1)
PoW = E(B(H1) + H2)
Could be repeated, but not sure that adds much.
Maybe that is close to what you propose, but again I don't see the point to using a scratchpad at all. The blockchain is essentially your scratchpad.
We do something very similar:
E' - first-phase hash.
E  - final-phase hash.
H1' and H1'' - different parts (low and high) of the same hash, used to address random blocks.

H1 = E'(B)
PoW = E(H1 + E(B(H1')) + E(B(H1'')))

E' can be Keccak (at least it should be as fast as Keccak), but it's better to use a hash with a more complicated instruction set, as I said (64-bit multiplication, AES/SSE).
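The two-phase formula can be sketched as follows. This is illustrative only: sha3_256 stands in for both E and E', the halving of H1 into low/high parts and the scratchpad layout are assumptions about intent.

```python
import hashlib

def E(data: bytes) -> bytes:
    # Final-phase hash; sha3_256 stands in for Keccak.
    return hashlib.sha3_256(data).digest()

def E_prime(data: bytes) -> bytes:
    # First-phase hash; the post says this should use a richer instruction
    # mix (64-bit multiplies, AES/SSE). A domain-separated sha3_256 stands in.
    return hashlib.sha3_256(data + b"phase1").digest()

def pow_hash(block_blob: bytes, scratchpad: list) -> bytes:
    """PoW = E(H1 + E(B(H1')) + E(B(H1''))), H1'/H1'' = low/high halves of H1."""
    h1 = E_prime(block_blob)
    lo = int.from_bytes(h1[:16], "big") % len(scratchpad)   # H1' (low half)
    hi = int.from_bytes(h1[16:], "big") % len(scratchpad)   # H1'' (high half)
    return E(h1 + E(scratchpad[lo]) + E(scratchpad[hi]))
```

Two independent lookups per hash roughly doubles the memory pressure compared with the single-lookup variant quoted above.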

smooth
Legendary
*
Offline Offline

Activity: 2968
Merit: 1198



View Profile
May 01, 2014, 02:03:20 AM
 #23

The real problem, I think, is supporting an SPV client with this approach.
What do you think?

You most certainly can't have SPV clients if verifying the block header pow requires the entire block chain.

As far as the pow itself I'm still not quite sure what you are trying to accomplish. As far as the bitcoin transaction volume you mentioned, I consider that very low if you are designing for the long term.

But it seems you have thought this through a fair amount, and if you are launching in two days I doubt it makes sense to change the PoW right before launch, so I guess we'll just see how it goes. You can always hard fork, but that gets much more difficult to accomplish over time.
digicoin
Legendary
*
Offline Offline

Activity: 1106
Merit: 1000



View Profile
May 01, 2014, 09:03:37 AM
 #24

Is it possible to extend the daemon to return the full list of block header hashes instead of the full blockchain? What are the security implications of this approach? E.g., could a malicious/compromised node respond with a purposefully modified hash list?

I believe that this coin can take CryptoNote as its core technology, but it must separate itself from Bytecoin to make room for improvement. At least for the medium term.
crypto_zoidberg (OP)
Hero Member
*****
Offline Offline

Activity: 976
Merit: 646



View Profile WWW
May 02, 2014, 09:56:24 PM
 #25

You most certainly can't have SPV clients if verifying the block header pow requires the entire block chain.

As far as the pow itself I'm still not quite sure what you are trying to accomplish.
Sorry if I'm not clear.
We are looking for a way to be memory-hard (at mining) on the one hand, and on the other to use a wider CPU instruction set (if possible). In our opinion this makes sense for ASIC protection. Please correct me if I'm wrong.
At the same time we want to have a cheap PoW check operation (the reason why we are changing the CryptoNote PoW).

As far as the bitcoin transaction volume you mentioned, I consider that very low if you are designing for the long term.

But it seems you have thought this through a fair amount, and if you are launching in two days I doubt it makes sense to change the PoW right before launch, so I guess we'll just see how it goes. You can always hard fork, but that gets much more difficult to accomplish over time.
Probably it's better to spend one week on discussion before launch than to stress the network with a hard fork.

crypto_zoidberg (OP)
Hero Member
*****
Offline Offline

Activity: 976
Merit: 646



View Profile WWW
May 02, 2014, 11:10:08 PM
 #26

Is it possible to extend the daemon to return the full list of block header hashes instead of the full blockchain? What are the security implications of this approach? E.g., could a malicious/compromised node respond with a purposefully modified hash list?
Yes, for example we can make the daemon able to return randomly requested block headers. But I don't think it's necessary.

An SPV client has to keep a vector of block IDs, like our wallet does; that is 8 MB per year.
For an SPV client, each new block should be received together with the headers required to compute its PoW.
To check that these supplied headers are valid, you just need to compute the ID hash (cn_fast_hash, which is actually Keccak) of each header and verify that it equals the ID in the SPV client's vector at the corresponding height.

Even if a compromised node builds a PoW with fake headers, the SPV client is able to detect it.
So we probably need to extend the daemon to work with SPV clients by making it possible to send blocks coupled with the related PoW headers.

I think an SPV client could be built from the wallet + p2p layer + PoW.
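The verification step described above can be sketched as follows; a minimal illustration, with sha3_256 standing in for cn_fast_hash and the container names (`headers`, `known_ids`) being hypothetical:

```python
import hashlib

def cn_fast_hash(data: bytes) -> bytes:
    # The post says cn_fast_hash is effectively Keccak; sha3_256 stands in.
    return hashlib.sha3_256(data).digest()

def verify_supplied_headers(headers: dict, known_ids: list) -> bool:
    """SPV check sketch: each header supplied with a block must hash to the
    block ID the client already stores at the corresponding height.
    `headers` maps height -> raw header bytes; `known_ids` is the SPV ID vector."""
    for height, header in headers.items():
        if cn_fast_hash(header) != known_ids[height]:
            return False  # fake header from a compromised node
    return True
```

The client never needs the full blockchain, only the ID vector, which is what bounds SPV storage to the ~8 MB/year figure above.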

I believe that this coin can take CryptoNote as its core technology, but it must separate itself from Bytecoin to make room for improvement. At least for the medium term.
I'm not sure I understand what you mean about separating from Bytecoin.




otila
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250


View Profile
May 08, 2014, 06:16:36 PM
 #27

We are looking for a way to be memory-hard (at mining) on the one hand, and on the other to use a wider CPU instruction set (if possible). In our opinion this makes sense for ASIC protection. Please correct me if I'm wrong.

You can't make it memory-hard with 24-round non-optimized keccak, so why insist on using it?
superresistant
Legendary
*
Offline Offline

Activity: 2128
Merit: 1120



View Profile
May 10, 2014, 06:48:20 AM
 #28

1. Wide CPU instruction set
2. Memory-oriented algo
3. Small work time.

Memorycoin failed on the first two points (I don't know about the last). It was AES-NI-only: if you didn't have a recent CPU, you were very slow or it didn't work.
It was supposed to rely on the amount of RAM to counter botnet farming, but it didn't.
I think it is great to be memory-dependent.

What do you think about a minimum memory amount to mine?

otila
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250


View Profile
May 10, 2014, 08:21:33 AM
 #29

As you can see, working on a small amount of memory, 100000 hash operations take 3020 ms, while the same number of operations on a 100 MB scratchpad takes 8574 ms.
Such a difference (caused by cache overflow) points to real memory hardness, we think.

Compare memory read/written per second by the hash to memmove() speed. What do you get?

Does each round of keccak read from different areas of scratchpad?
crypto_zoidberg (OP)
Hero Member
*****
Offline Offline

Activity: 976
Merit: 646



View Profile WWW
May 10, 2014, 10:37:33 AM
 #30

1. Wide CPU instruction set
2. Memory-oriented algo
3. Small work time.

Memorycoin failed on the first two points (I don't know about the last). It was AES-NI-only: if you didn't have a recent CPU, you were very slow or it didn't work.
It was supposed to rely on the amount of RAM to counter botnet farming, but it didn't.
I think it is great to be memory-dependent.

What do you think about a minimum memory amount to mine?
Not very big.
It's not about a huge memory amount: the scratchpad is built from pseudorandom block data, such as hashes and tx keys, and will grow by about 90 MB/year. A huge scratchpad would make it almost impossible to have an SPV client.

crypto_zoidberg (OP)
Hero Member
*****
Offline Offline

Activity: 976
Merit: 646



View Profile WWW
May 10, 2014, 10:46:37 AM
 #31

As you can see, working on a small amount of memory, 100000 hash operations take 3020 ms, while the same number of operations on a 100 MB scratchpad takes 8574 ms.
Such a difference (caused by cache overflow) points to real memory hardness, we think.

Compare memory read/written per second by the hash to memmove() speed. What do you get?

Does each round of keccak read from different areas of scratchpad?

1. It gives a correlation between calculation time and memory wait time.
2. Yes, addressing is based on the state buffer.

hirschhornsalz
Newbie
*
Offline Offline

Activity: 16
Merit: 0


View Profile
May 10, 2014, 11:52:43 AM
 #32

Now let's imagine, just for a moment, that this currency will be really successful. Bitcoin's blockchain grows faster than linearly in time, so it makes sense to assume slow exponential growth for a successful altcoin too.

Code:
~/.bitcoin $ du -sh blocks
20G     blocks/

Now let's just assume you hit a 40 GB blockchain in 3 years. Are you sure there are enough nodes left with 64 GB of RAM? What about the distribution of such workstations?
crypto_zoidberg (OP)
Hero Member
*****
Offline Offline

Activity: 976
Merit: 646



View Profile WWW
May 10, 2014, 02:46:26 PM
 #33

Now let's imagine, just for a moment, that this currency will be really successful. Bitcoin's blockchain grows faster than linearly in time, so it makes sense to assume slow exponential growth for a successful altcoin too.

Code:
~/.bitcoin $ du -sh blocks
20G     blocks/

Now let's just assume you hit a 40 GB blockchain in 3 years. Are you sure there are enough nodes left with 64 GB of RAM? What about the distribution of such workstations?
I feel that I need to give a clearer description.
Each block's entry in the scratchpad does not depend on the number of transactions included in it.
It is fixed at about 320 bytes and uses prev_id, the Merkle root, the one-time coinbase key, and the hashed coinbase outs (usually 8).
I'll post a more detailed description.

otila
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250


View Profile
May 11, 2014, 06:24:50 AM
 #34

I feel that I need to give a clearer description.
Each block's entry in the scratchpad does not depend on the number of transactions included in it.
It is fixed at about 320 bytes and uses prev_id, the Merkle root, the one-time coinbase key, and the hashed coinbase outs (usually 8).

(With mixin_t) with width=1600, rate=1536, and capacity=(1600-1536)=64, you get collision resistance = 2^32 and (second) preimage resistance = 2^32.
However, data is mixed into the state each round and you use multiplication instead of XOR, so the security level of the construction is unknown.
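These figures follow from the generic sponge bound, under which all attacks cost about 2^(c/2) for capacity c. A quick check of the arithmetic (illustrative only):

```python
# Generic ("flat") sponge security implied by the parameters quoted above.
width, rate = 1600, 1536
capacity = width - rate          # c = 64 bits
security_bits = capacity // 2    # generic bound: all attacks ~2^(c/2) = 2^32
print(capacity, security_bits)
```

With the standard Keccak parameters (e.g., rate 1088, capacity 512) this bound would instead be 2^256, which is why shrinking the capacity to widen the rate is the concern being raised here.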
otila
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250


View Profile
May 13, 2014, 08:50:46 AM
 #35

now when mining in testnet, currency::get_blob_longhash takes 75% of CPU time, and 75% of CPU time in that function is spent in doing div instructions due to size() and operator[], not quite memory-hard Cry
smooth
Legendary
*
Offline Offline

Activity: 2968
Merit: 1198



View Profile
May 13, 2014, 09:00:45 AM
 #36

now when mining in testnet, currency::get_blob_longhash takes 75% of CPU time, and 75% of CPU time in that function is spent in doing div instructions due to size() and operator[], not quite memory-hard Cry

The block chain is tiny right? Probably all in near cache

otila
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250


View Profile
May 13, 2014, 10:22:37 AM
Last edit: May 13, 2014, 10:41:03 AM by otila
 #37

now when mining in testnet, currency::get_blob_longhash takes 75% of CPU time, and 75% of CPU time in that function is spent in doing div instructions due to size() and operator[], not quite memory-hard Cry

The block chain is tiny right? Probably all in near cache

EDIT: div cycle count on Sandy Bridge depends on the divisor; bigger divisors take more time (latency: 30-94 cycles). For comparison, the minimum L2 cache latency is 11 cycles.

The vector size could just as well be rounded up to the next power of two with some padding, replacing the modulus with a bitwise AND; instead of size() you would use a shift count.
I wouldn't like to uglify the code by implementing reciprocal-multiplication hacks.

But who knows, maybe GPUs and ASICs have slow 64-bit divide Roll Eyes
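The padding trick described above can be sketched like this (function names are illustrative, not from the Boolberry source):

```python
# If the scratchpad is padded to a power-of-two length, the data-dependent
# index can use a single bitwise AND instead of a div/modulus instruction.
def next_pow2(n: int) -> int:
    """Smallest power of two >= n."""
    return 1 << (n - 1).bit_length()

def index_mod(h: int, size: int) -> int:
    return h % size                 # compiles to a div on x86

def index_and(h: int, padded_size: int) -> int:
    return h & (padded_size - 1)    # one AND; requires padded_size = 2^k
```

The two indexing functions agree whenever the size is already a power of two; otherwise the padded entries must hold real (replicated or extended) data so the AND never selects an empty slot.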
smooth
Legendary
*
Offline Offline

Activity: 2968
Merit: 1198



View Profile
May 13, 2014, 11:18:44 AM
 #38

now when mining in testnet, currency::get_blob_longhash takes 75% of CPU time, and 75% of CPU time in that function is spent in doing div instructions due to size() and operator[], not quite memory-hard Cry

The block chain is tiny right? Probably all in near cache

EDIT: div cycle count on Sandy Bridge depends on the divisor; bigger divisors take more time (latency: 30-94 cycles). For comparison, the minimum L2 cache latency is 11 cycles.

Right, but main memory has much higher latency. The idea is for the blockchain data to (eventually) be in memory, not L2.

I agree replacing div with something faster is probably a good idea, but I haven't looked at this code at all.

otila
Sr. Member
****
Offline Offline

Activity: 336
Merit: 250


View Profile
May 14, 2014, 01:52:13 PM
Last edit: May 15, 2014, 01:25:37 PM by otila
 #39

Seems this miner stuff is obfuscated on purpose; the dev is probably running a 10x faster miner himself...
but I have two days to make a faster version Sad

Code:
#0  std::vector<crypto::hash, std::allocator<crypto::hash> >::operator[] (this=0x7f837d3fb950, __n=12302) at /usr/include/c++/4.9.0/bits/stl_vector.h:780
#1  0x0000000000c4e75c in currency::miner::<lambda(uint64_t)>::operator()(uint64_t) const (__closure=0x7f837d3fb4e0, index=9325784170990468272)
    at /c/boolberry/src/currency_core/miner.cpp:365
#2  0x0000000000c4f912 in currency::<lambda(uint64_t (&)[25], uint64_t (&)[24])>::operator()(crypto::state_t_m &, crypto::mixin_t &) const (__closure=0x7f837d3fb200, st=...,
    mix=...) at /c/boolberry/src/currency_core/currency_format_utils.h:189
#3  0x0000000000c4fdab in crypto::wild_keccak<crypto::mul_f, currency::get_blob_longhash(const blobdata&, crypto::hash&, uint64_t, callback_t) [with callback_t = currency::miner::worker_thread()::<lambda(uint64_t)>; currency::blobdata = std::basic_string<char>; uint64_t = long unsigned int]::<lambda(uint64_t (&)[25], uint64_t (&)[24])> >(const uint8_t *, size_t, uint8_t *, size_t, currency::<lambda(uint64_t (&)[25], uint64_t (&)[24])>) (
    in=0x7f837d3fb540 "\366+\307\351\330\b3pQ\264\067\061ǭVjf1\034s \237\224\233\327\016\226-\332xko", inlen=33,
    md=0x7f837d3fb540 "\366+\307\351\330\b3pQ\264\067\061ǭVjf1\034s \237\224\233\327\016\226-\332xko", mdlen=32, cb=...) at /c/boolberry/src/crypto/wild_keccak.h:134
#4  0x0000000000c4fb09 in crypto::wild_keccak_dbl<crypto::mul_f, currency::get_blob_longhash(const blobdata&, crypto::hash&, uint64_t, callback_t) [with callback_t = currency::miner::worker_thread()::<lambda(uint64_t)>; currency::blobdata = std::basic_string<char>; uint64_t = long unsigned int]::<lambda(uint64_t (&)[25], uint64_t (&)[24])> >(const uint8_t *, size_t, uint8_t *, size_t, currency::<lambda(uint64_t (&)[25], uint64_t (&)[24])>) (
    in=0x7f8381dba098 "\001-\261FVm\345O\330\061\021\237\257\210\200\204\364=\374\243\031.\023\254\350\233O\372\373\262\032~łz$i", inlen=76,
    md=0x7f837d3fb540 "\366+\307\351\330\b3pQ\264\067\061ǭVjf1\034s \237\224\233\327\016\226-\332xko", mdlen=32, cb=...) at /c/boolberry/src/crypto/wild_keccak.h:151
#5  0x0000000000c4fa8d in currency::get_blob_longhash<currency::miner::worker_thread()::<lambda(uint64_t)> >(const currency::blobdata &, crypto::hash &, uint64_t, currency::miner::<lambda(uint64_t)>) (
    bd="\001-\261FVm\345O\330\061\021\237\257\210\200\204\364=\374\243\031.\023\254\350\233O\372\373\262\032~łz$i\000\216\353қ\005QJ\034k0\023pq\024y\\\356\031\343\376\376\342\366\250{\340\327\363\344RSn\002c\f\277\236\001", res=..., height=2056, accessor=...) at /c/boolberry/src/currency_core/currency_format_utils.h:179
#6  0x0000000000c4ec65 in currency::miner::worker_thread (this=0x7fff82e7fdd8) at /c/boolberry/src/currency_core/miner.cpp:366
perl
Legendary
*
Offline Offline

Activity: 1918
Merit: 1190


View Profile
May 18, 2014, 10:48:44 PM
 #40

Do you have docs for how the hash algo is written, or did I misunderstand?


What is the point of a pool that doesn't take more than 20 people?
A pool needs to validate each miner's best result in order to validate a submit.
Validating 20 people uses more of the mining resources directly.