creativex
|
|
December 24, 2012, 12:46:21 PM |
|
@ Team Avalon
What happened to the last Thursday update? (from TSMC?)
everything is going well, everything is going on time. Great! Any chance of pictures of chips or PCBs any time soon? They've already said they wouldn't publish this information, but that they wouldn't attempt to prevent customers from doing so. Seems we'll have to wait till they're in the wild to obtain this information.
|
|
|
|
nathanrees19
|
|
December 24, 2012, 01:02:27 PM |
|
@ Team Avalon
What happened to the last Thursday update? (from TSMC?)
everything is going well, everything is going on time. Great! Any chance of pictures of chips or PCBs any time soon? They've already said they wouldn't publish this information, but that they wouldn't attempt to prevent customers from doing so. Seems we'll have to wait till they're in the wild to obtain this information.
|
|
|
|
CoinHoarder
Legendary
Offline
Activity: 1484
Merit: 1026
In Cryptocoins I Trust
|
|
December 24, 2012, 05:10:38 PM |
|
Cool stuff, 27 days. Good luck team Avalon on reaching your release date! I hope you guys make it! *** doesn't have anything on team Avalon... maybe you guys should send a few of your employees over to help them out!
|
|
|
|
creativex
|
|
December 24, 2012, 05:25:56 PM |
|
Cool stuff, 27 days. Good luck team Avalon on reaching your release date! I hope you guys make it! I don't have any orders in with Avalon, but I hope they make it too. If only to light a fire under the companies that can't seem to get out of their own way. *** doesn't have anything on team Avalon... maybe you guys should send a few of your employees over to help them out! Please don't. Let the weak companies fend for themselves or die off. BTC will be better off without them if they can't swim.
|
|
|
|
420
|
|
December 24, 2012, 08:15:22 PM |
|
27 days...but can they get one to a customer before February 1? http://betsofbitco.in/item?id=1003
|
Donations: 1JVhKjUKSjBd7fPXQJsBs5P3Yphk38AqPr - TIPS the hacks, the hacks, secure your bits!
|
|
|
PuertoLibre
Legendary
Offline
Activity: 1890
Merit: 1003
|
|
December 24, 2012, 09:55:42 PM |
|
*** doesn't have anything on team Avalon... maybe you guys should send a few of your employees over to help them out! Please don't. Let the weak companies fend for themselves or die off. BTC will be better off without them if they can't swim. Survival of the most competent!? Wait, where does that leave Inaba? Seems like you're stacking a [un]fair deck. The other companies will need a handicrap handicap at least.
|
|
|
|
libertybuck
|
|
December 24, 2012, 11:48:45 PM |
|
They've already said they wouldn't publish this information, but that they wouldn't attempt to prevent customers from doing so. Seems we'll have to wait till they're in the wild to obtain this information.
I think you are correct. They should not publish this information. Doing so might land them in trouble.
|
|
|
|
Inaba
Legendary
Offline
Activity: 1260
Merit: 1000
|
|
December 25, 2012, 03:27:03 AM |
|
Survival of the most competent!? Wait, where does that leave Inaba?
It obviously leaves me so far ahead of you that you can't even see me flash my ass at you.
|
If you're searching these lines for a point, you've probably missed it. There was never anything there in the first place.
|
|
|
PuertoLibre
Legendary
Offline
Activity: 1890
Merit: 1003
|
|
December 25, 2012, 06:42:24 AM |
|
Survival of the most competent!? Wait, where does that leave Inaba?
It obviously leaves me so far ahead of you that you can't even see me flash my ass at you. Of course, that is what the slingshot was for. Your only hope of crossing the finish line!
|
|
|
|
libertybuck
|
|
December 25, 2012, 06:50:46 AM |
|
Even so creative picture in the world !
|
|
|
|
creativex
|
|
December 25, 2012, 04:02:35 PM |
|
Survival of the most competent!? Wait, where does that leave Inaba? It obviously leaves me so far ahead of you that you can't even see me flash my ass at you. Of course, that is what the slingshot was for. Your only hope of crossing the finish line! OWNED! lol
|
|
|
|
cosurgi
|
|
December 25, 2012, 06:42:56 PM |
|
maybe we could make bets whether they ship or not?
|
|
|
|
mobodick
|
|
December 25, 2012, 07:31:22 PM |
|
These chips crunch near a billion hashes per second. Losing a small handful of those each second is minuscule.
Mine along on your CPU if you wanna make up the difference and then some.
I get a feeling that a longer explanation is required for those unfamiliar with digital logic design. The issue isn't really about losing one in billions of hashes. It is about gaining timing margin (a.k.a. overclocking headroom) in the design. Of course Avalon's logic is secret, but I'm going to discuss the problem based on one of the open-source FPGA hashers. It had a critical timing path in the logic that latched the "golden nonce". Since the design was pipelined 125 stages deep, it had hardware that subtracted the constant 125 from the nonce counter before sending it out of the chip. Now we have two ways to speed up the above design:
1) Remove the 32-bit wide constant subtractor. This will gain a fraction of a nanosecond on every hash tried. It is very easy to subtract 125 in software from the nonce downloaded from the chip.
2) Acknowledge that a timing violation may occur and that the nonce latched may not be the exact one that solved the block, but the next one or the previous one, depending on the details of the latching logic. It is somewhat more involved, but still easily doable in software: recompute the hashes for nonce values n-126, n-125, n-124 and use the one that solved the block. Again, this will make the design more tolerant to overclocking for every hash tried inside the chip.
Obviously 1) cannot be applied to the ASIC chip or a closed-source FPGA bitstream. But method 2) remains applicable, just use a different set of test values.
Since it's a pipelined design, wouldn't removing the subtractor just reduce the latency of the pipeline instead of increasing the throughput? Even if this subtractor would prevent the re-loading of the pipeline, then you could pipeline the pipeline and the subtractor. Since the pipeline will not (I presume) produce a nonce to be latched on every clock, you have more than enough time to store the previous nonce on chip and subtract the number before sending it out to the controller.
At least I would make my 'store' circuit parallel to the actual pipeline so it can operate asynchronously.
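The software-side correction described in the quoted explanation can be sketched roughly as follows. This is only an illustration under assumptions: the function names, the `slack` parameter, and the idea that the chip hands back a raw counter value are hypothetical, and the 125-stage depth is simply the figure quoted above. The hashing itself is the standard double SHA-256 over an 80-byte block header whose last 4 bytes are the little-endian nonce.

```python
import hashlib
import struct

PIPELINE_DEPTH = 125  # depth quoted in the post; other designs will differ


def block_hash(header76: bytes, nonce: int) -> bytes:
    """Double SHA-256 of the 76-byte header prefix plus a 32-bit little-endian nonce."""
    header = header76 + struct.pack("<I", nonce & 0xFFFFFFFF)
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def meets_target(digest: bytes, target: int) -> bool:
    """The digest, read as a little-endian integer, must not exceed the target."""
    return int.from_bytes(digest, "little") <= target


def recover_nonce(header76: bytes, raw_counter: int, target: int,
                  depth: int = PIPELINE_DEPTH, slack: int = 1):
    """Try raw_counter - (depth - slack) .. raw_counter - (depth + slack).

    Returns the first candidate nonce whose hash meets the target, or None.
    With slack=0 this is plain "subtract the pipeline depth in software";
    with slack=1 it also tolerates a latch that was off by one clock.
    """
    for offset in range(depth - slack, depth + slack + 1):
        candidate = (raw_counter - offset) & 0xFFFFFFFF
        if meets_target(block_hash(header76, candidate), target):
            return candidate
    return None
```

The extra cost is at most three double hashes per reported result, which is negligible on the host compared with the hash rate inside the chip.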
|
|
|
|
2112
Legendary
Offline
Activity: 2128
Merit: 1073
|
|
December 25, 2012, 08:27:24 PM |
|
Since it's a pipelined design, wouldn't removing the subtractor just reduce the latency of the pipeline instead of increasing the throughput?
The overall speed of a pipelined logic design is limited by the speed of the slowest stage. In the design I mentioned, the last pipeline stage did what every other stage did, plus the zero comparator, the subtractor and a latch.
Even if this subtractor would prevent the re-loading of the pipeline, then you could pipeline the pipeline and the subtractor. Since the pipeline will not (I presume) produce a nonce to be latched on every clock, you have more than enough time to store the previous nonce on chip and subtract the number before sending it out to the controller. At least I would make my 'store' circuit parallel to the actual pipeline so it can operate asynchronously.
I don't think you've ever tried to use Xilinx ISE or something similar. The problem isn't: come up with a different, potentially faster design. The problem is: come up with a working design, one that the available tools will be capable of synthesizing and placing/routing sensibly. The overall structure of SHA-2 (which makes every output bit depend on every input bit in each round) is apparently hitting some worst-case behavior in the Xilinx toolchain. It takes close to a full day to run a single full implementation. And in many cases the toolchain either fails to converge to a working implementation or converges to something shamefully inefficient.
On this board 2^256 is frequently thrown around as a number so high that nobody will be able to check all of them. Compare this with the work demanded from the Xilinx placing tool: the 23038 SLICEs in the XC6SLX150 can be permuted in 23038! ways (I'm making a gross simplification of the "place" step), which is about 10^90499. Obviously all digital synthesis tools have to take some heuristic shortcuts through that vast space of available solutions. So the human art required from the designer is to figuratively take the poor toolchain by the hand and lead it/them to some safe place.
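For anyone who wants to check the order of magnitude quoted above, here is a small sketch. It uses the identity log10(n!) = lgamma(n + 1) / ln(10) and the slice count from the post; like the post, it treats placement as a bare permutation, which is a gross simplification.

```python
import math

SLICES = 23038  # slice count quoted in the post for the XC6SLX150

# log10(n!) computed via the log-gamma function, avoiding the huge integer
log10_placements = math.lgamma(SLICES + 1) / math.log(10)

# For comparison: the size of the 256-bit hash space, also as a power of ten
log10_hash_space = 256 * math.log10(2)

print(f"{SLICES}! is about 10^{log10_placements:.0f}")
print(f"2^256 is about 10^{log10_hash_space:.0f}")
```

The placement space is a number with more than ninety thousand decimal digits, against 78 digits for 2^256.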
|
|
|
|
mobodick
|
|
December 25, 2012, 09:16:37 PM |
|
Since it's a pipelined design, wouldn't removing the subtractor just reduce the latency of the pipeline instead of increasing the throughput?
The overall speed of a pipelined logic design is limited by the speed of the slowest stage. In the design I mentioned, the last pipeline stage did what every other stage did, plus the zero comparator, the subtractor and a latch.
Even if this subtractor would prevent the re-loading of the pipeline, then you could pipeline the pipeline and the subtractor. Since the pipeline will not (I presume) produce a nonce to be latched on every clock, you have more than enough time to store the previous nonce on chip and subtract the number before sending it out to the controller. At least I would make my 'store' circuit parallel to the actual pipeline so it can operate asynchronously.
I don't think you've ever tried to use Xilinx ISE or something similar.
true enough..
The problem isn't: come up with a different, potentially faster design. The problem is: come up with a working design, one that the available tools will be capable of synthesizing and placing/routing sensibly. The overall structure of SHA-2 (which makes every output bit depend on every input bit in each round) is apparently hitting some worst-case behavior in the Xilinx toolchain. It takes close to a full day to run a single full implementation. And in many cases the toolchain either fails to converge to a working implementation or converges to something shamefully inefficient.
lol., sorry I even mentioned it.. And no way to work around this?
On this board 2^256 is frequently thrown around as a number so high that nobody will be able to check all of them. Compare this with the work demanded from the Xilinx placing tool: the 23038 SLICEs in the XC6SLX150 can be permuted in 23038! ways (I'm making a gross simplification of the "place" step), which is about 10^90499. Obviously all digital synthesis tools have to take some heuristic shortcuts through that vast space of available solutions.
Well, that's why you program that thing, right? The information you give the synthesis tools reduces this space by very much. Anyway, it would be too off-topic to discuss it here. I take it it is not an easy task.
So the human art required from the designer is to figuratively take the poor toolchain by the hand and lead it/them to some safe place.
The blackbox art of FPGA programming...
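The point quoted at the top of this post, that a pipeline runs only as fast as its slowest stage, can be put into numbers with a toy model. The stage delays below are made up purely for illustration and do not come from any real design. The model assumes one hash leaves the pipeline per clock, so throughput equals the clock rate.

```python
def max_clock_mhz(stage_delays_ns):
    """One clock period must cover the slowest stage, so f_max = 1 / max(delay)."""
    return 1000.0 / max(stage_delays_ns)


# Hypothetical numbers: 124 ordinary stages, plus a last stage that also
# carries the comparator, subtractor and latch.
ordinary = [4.0] * 124
with_subtractor = ordinary + [5.0]
without_subtractor = ordinary + [4.2]

print(max_clock_mhz(with_subtractor))     # limited by the 5.0 ns stage
print(max_clock_mhz(without_subtractor))  # limited by the 4.2 ns stage
```

In this model, trimming logic from the one slow stage raises the clock for every stage, which is why it improves throughput and not just latency.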
|
|
|
|
420
|
|
December 25, 2012, 11:56:35 PM |
|
|
|
|
|
hardcore-fs
|
|
December 26, 2012, 04:21:42 AM |
|
These chips crunch near a billion hashes per second. Losing a small handful of those each second is minuscule.
Mine along on your CPU if you wanna make up the difference and then some.
I get a feeling that a longer explanation is required for those unfamiliar with digital logic design. The issue isn't really about losing one in billions of hashes. It is about gaining timing margin (a.k.a. overclocking headroom) in the design. Of course Avalon's logic is secret, but I'm going to discuss the problem based on one of the open-source FPGA hashers. It had a critical timing path in the logic that latched the "golden nonce". Since the design was pipelined 125 stages deep, it had hardware that subtracted the constant 125 from the nonce counter before sending it out of the chip. Now we have two ways to speed up the above design:
1) Remove the 32-bit wide constant subtractor. This will gain a fraction of a nanosecond on every hash tried. It is very easy to subtract 125 in software from the nonce downloaded from the chip.
2) Acknowledge that a timing violation may occur and that the nonce latched may not be the exact one that solved the block, but the next one or the previous one, depending on the details of the latching logic. It is somewhat more involved, but still easily doable in software: recompute the hashes for nonce values n-126, n-125, n-124 and use the one that solved the block. Again, this will make the design more tolerant to overclocking for every hash tried inside the chip.
Obviously 1) cannot be applied to the ASIC chip or a closed-source FPGA bitstream. But method 2) remains applicable, just use a different set of test values.
Since it's a pipelined design, wouldn't removing the subtractor just reduce the latency of the pipeline instead of increasing the throughput? Even if this subtractor would prevent the re-loading of the pipeline, then you could pipeline the pipeline and the subtractor. Since the pipeline will not (I presume) produce a nonce to be latched on every clock, you have more than enough time to store the previous nonce on chip and subtract the number before sending it out to the controller.
At least I would make my 'store' circuit parallel to the actual pipeline so it can operate asynchronously.
For Christ's sake, why the hell do people assume you need to do a subtraction when a 'nonce' is found? This is C programming at its worst, by people incapable of thinking in parallel. Once again for the noobs: the nonce is calculated BEFORE the SHA256(SHA256(x)); the product of this function is what is evaluated and dictates IF the nonce is a golden value. Therefore you have AT LEAST 120 clk cycles to calculate the nonce correction (subtraction) before it is needed (if at all). The subtraction is only an issue for people that should not be programming logic chips in the first place. If you have unused gates just "sitting around", please use them.
|
BTC:1PCTzvkZUFuUF7DA6aMEVjBUUp35wN5JtF
|
|
|
nathanrees19
|
|
December 26, 2012, 05:15:37 AM |
|
Therefore you have AT LEAST 120 clk cycles to calculate the nonce correction (subtraction) before it is needed (if at all).
Put it through a pipeline the same length as the main calculation - no subtraction at all!
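As a software model of that idea (a sketch only, not anyone's actual HDL): carry the nonce through a delay line as deep as the hash pipeline, so the nonce that falls out on a given clock is exactly the one whose hash result falls out on that clock. The `simulate` function and the `golden` set are hypothetical stand-ins; the real hash computation is replaced by a set-membership check.

```python
from collections import deque


def simulate(depth, cycles, golden):
    """Clock a depth-stage pipeline and report hits using both correction schemes."""
    hash_pipe = deque([None] * depth)   # stands in for the hashing stages
    nonce_pipe = deque([None] * depth)  # parallel delay line carrying the nonce
    by_subtraction, by_delay_line = [], []
    for counter in range(cycles):
        # A new nonce enters both pipelines on every clock.
        hash_pipe.append(counter in golden)  # placeholder for the real hash result
        nonce_pipe.append(counter)
        hit = hash_pipe.popleft()            # result for the nonce issued `depth` clocks ago
        delayed_nonce = nonce_pipe.popleft()
        if hit:
            by_subtraction.append(counter - depth)  # the counter has moved on by `depth`
            by_delay_line.append(delayed_nonce)     # no arithmetic needed
    return by_subtraction, by_delay_line


print(simulate(125, 1000, {7, 300}))
```

Both schemes report the same nonces. The delay line trades the subtractor for depth x 32 bits of extra storage and the wiring that goes with it.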
|
|
|
|
2112
Legendary
Offline
Activity: 2128
Merit: 1073
|
|
December 26, 2012, 06:39:56 PM |
|
Put it through a pipeline the same length as the main calculation - no subtraction at all!
I know that the above was meant to be a joke, but it helps to explain some salient choices that the designer has to make. Most of the FPGA designers for Bitcoin hashing used the XC6SLX150 chip, which has about 150k "gates" and costs about $200. hardcore-fs is working on the XC5VLX110T chip, which has about 110k "gates" and costs about $2000. So where's the catch? Spartan-6 has far fewer "wires" than Virtex-5, so designs on Spartan-6 are quite often routing-constrained: there are enough "gates", but not enough "wires" to connect them. And even if there are enough "wires", the gate interconnections may be longer and slower than in a design that uses fewer "gates".
Check out the extreme example of the routing-resource limitation: eldentyrell started working on his "hand-placed, auto-routed" design in October'11. He complained about auto-routing failing and being forced to hand-route until about March'12, when he disclosed that he had started using DSP slices for some adders to relieve the routing congestion for the general-purpose SLICEs. https://bitcointalk.org/index.php?topic=49971.msg793740#msg793740
The very same conceptual limitations will apply to ASIC synthesis. One can spend a lot of time optimizing performance for the particular design flow. Or one can accept most of the default choices to optimize the time it takes to start manufacturing.
|
|
|
|
mem
|
|
December 27, 2012, 02:17:39 AM |
|
Survival of the most competent!? Wait, where does that leave Inaba?
It obviously leaves me so far ahead of you that you can't even see me flash my ass at you. It leaves him gnashing his teeth, poor guy. So much ego and so little to justify it.
|
|
|
|
|