For Tornado, Raptor, and Online codes, the linear running time comes with no hidden big constant, hence they are theoretically the most efficient algorithms available today.
And the existing implementations that I've seen are not just several times slower than the code we use, but also several times slower than a well-implemented 16-bit RS code, at least for the message sizes of interest to us.
It sounds like you are parroting stuff from the internet written by solution partisans and by people without much practical experience-- without any experience with it yourself. We tried these things. They were slow. Feel free to knock yourself out looking into them further; I agree they are very interesting.
On a modern computer a "fast algorithm" that cannot make effective use of SIMD will lose access to (say) 93% of the CPU's peak throughput. It turns out that can hide a pretty big log n factor!
One thing to keep in mind when looking at algorithm asymptotics is that it is often reasonable to read "log n" as "a small constant", because no problem that fits in memory on your target computer will make log n larger than (say) 30... and often (but not always) any problem that doesn't fit in memory will have its time utterly dominated by IO costs, in which case cache behaviour is critical-- something asymptotics say nothing about. (Peeling decoders, in particular, have poor cache locality and are unfriendly to branch prediction.)
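To make the "log n is a small constant" reading concrete, here is a quick illustrative calculation (the memory sizes and 16-byte record size are my own example, not from the thread):

```python
import math

# Even for a problem filling 1 TiB of RAM with 16-byte records,
# log2(n) only reaches ~36 -- "a small constant" in practice.
for mem_bytes in (2**30, 2**34, 2**40):        # 1 GiB, 16 GiB, 1 TiB
    n = mem_bytes // 16                        # assumed 16-byte records
    print(f"{mem_bytes:>14} bytes -> log2(n) = {math.log2(n):.0f}")
```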
It's very easy to have a factor-of-thirty difference between implementations. So it's not especially surprising to see an O(N log N) algorithm matching an O(N) one... even on large inputs. I've also found that it's not that uncommon for O(N^2) algorithms to beat O(N log N) ones on realistic problems (often due to cache locality), and it's fairly common for O(N) algorithms to beat O(log N) ones on problems small enough to fit in L3 cache (usually due to branch prediction and parallelizability).
In any case, for the workload presented by fibre, the fastest code implemented for it that we've been able to find is a mixed-density quasi-random graph code... and the fairly close runner-up is an FFT-based 16-bit RS code (not coincidentally written by the same author).
For instance, suppose you want to tune RS parameters to support a 10% loss channel, i.e. a channel with 90% capacity. You need to choose n, k such that R = k/n <= 0.9. But even if you choose them to be, say, (8,10), you run a high risk of being unable to recover the original message from 8 out of 10 encoded blocks on a 30%-loss day.
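A quick sketch of that risk, assuming i.i.d. packet loss (the function name is mine; with an (n, k) MDS code you decode iff at least k of the n blocks arrive):

```python
from math import comb

def decode_failure_prob(n, k, loss):
    """P(fewer than k of n blocks arrive) under i.i.d. per-block loss."""
    p = 1.0 - loss  # per-block delivery probability
    ok = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))
    return 1.0 - ok

# A (k=8, n=10) code (rate 0.8) on a 30%-loss day fails most of the time.
print(decode_failure_prob(10, 8, 0.30))  # roughly 0.62
```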
In the bitcoin block relay problem, making no assumption about δ looks more reasonable, and that is achievable through the ratelessness of Online Codes.
You can just as well use one "ratelessly" by sending as much as is needed: you just end up with a minimum bound on the rate, arising from the RS symbol size, since you can only generate so many non-duplicate symbols. There is no overhead created by doing this (and in fact none of the other options are MDS codes, so they all have overhead; e.g. for ours the probability of a failed decode when you have enough data is 0.03^(extrapackets+1)-- negligible, but technically overhead).
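To see how quickly that residual failure probability vanishes, here is the figure from the text worked out for a few values of extra packets (the 0.03 base is the number quoted above; the function name is mine):

```python
def residual_failure_prob(extra_packets):
    """Decode-failure probability with `extra_packets` symbols beyond the
    minimum, using the 0.03^(extra+1) figure quoted for this code."""
    return 0.03 ** (extra_packets + 1)

for extra in range(4):
    print(extra, residual_failure_prob(extra))
# 0 extra packets -> 3%, 3 extra packets -> less than one in a million
```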
Say you have 512 bits you want to send. You can code them as 32 symbols in a 16-bit RS code, then send up to 65535 symbols (128 KB) (or 65503 non-systematic symbols), and the receiver will recover once it has received any 32 of them. So this code works at rates from 1.0 down to 0.00048. A true rateless code could drive the rate arbitrarily close to zero, but for our application, mempool losses alone mean we never need a code with a rate of less than 0.5.
A code with a minimum rate of 0.00048 is sufficient, even with 100% of the mempool missed, for up to _99.95%_ packet loss.
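Working through that arithmetic (all figures follow from the 512-bit message and 16-bit symbol size given above):

```python
SYMBOL_BITS = 16
message_bits = 512

k = message_bits // SYMBOL_BITS      # 32 source symbols
n_max = 2**SYMBOL_BITS - 1           # 65535 distinct symbols in a 16-bit RS code
min_rate = k / n_max                 # ~0.00048: the lowest reachable rate
max_loss = 1 - min_rate              # ~99.95% packet loss still decodable

print(k, n_max, f"{min_rate:.5f}", f"{max_loss:.2%}")
```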
While "we will not simultaneously lose more than 80% of all packets and have no mempool hits" is not technically "no assumption", it's close enough for engineering purposes: at 80% loss the FEC is not really the limiting consideration for system performance in any case (the limiting one is that any link currently losing 80% of packets is very likely about to start losing 100%). If you have more than 80% packet loss, you have worse things to worry about. Moreover, at 80% packet loss the efficiency lost by repeating packets is not particularly substantial (since you're losing most packets, it's unlikely that you received any particular duplicate), so if you really did want a lower-rate code you could just start duplicating packets. As an aside, a couple of messages up you were also trying to argue that packet loss wasn't relevant.
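The duplication argument can be checked with a little arithmetic, again assuming i.i.d. loss (the function names and the two-copies example are mine):

```python
def delivery_prob(loss, copies):
    """P(at least one of `copies` copies of a packet arrives)."""
    return 1 - loss**copies

def duplicate_waste(loss, copies):
    """Expected fraction of *received* packets that are redundant copies."""
    expected_received = copies * (1 - loss)
    useful = delivery_prob(loss, copies)   # at most one copy per packet helps
    return 1 - useful / expected_received

loss = 0.8
print(delivery_prob(loss, 2))    # 0.36, up from 0.2 for a single send
print(duplicate_waste(loss, 2))  # ~10% of received packets are duplicates
```

At 80% loss, doubling each packet nearly doubles the delivery probability while only ~10% of what actually arrives is a wasted duplicate, which is the sense in which the efficiency loss is "not particularly substantial".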
In practice, because we don't want to waste unbounded amounts of bandwidth on a down peer, fibre currently sends only up to 8x the size of the block, so we never use a rate less than 0.125. Even under extremely high packet loss, a rate of 0.333 is more than low enough to accommodate the loss with no mempool hits at all. The mean rate we need for a decode is about 0.94, since missing tx data and packet loss together run around 6%.
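A back-of-the-envelope version of that rate budget (the 8x cap and ~6% combined-loss figure are the ones quoted above; the normalization is mine):

```python
block_size = 1.0                      # normalized block size
max_sent = 8 * block_size             # fibre's cap on total transmission
min_rate = block_size / max_sent      # 0.125: lowest rate ever used

combined_loss = 0.06                  # ~6% packet loss + missing tx data
mean_rate_needed = 1 - combined_loss  # ~0.94: typical rate at decode

print(min_rate, mean_rate_needed)
```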
As for Raptor not being used in industry: you are correct, I mixed it up with something else.
I expect that we won't see those codes widely deployed until after the patents expire. There are few enough applications where true ratelessness is much more useful than the wide rate flexibility you can achieve with many fast, well-understood codes-- especially by using a product of two codes (like two layers of RS codes, or RS plus the duplication code I mentioned above).