eldentyrell (OP)
Donator
Legendary
Offline
Activity: 980
Merit: 1004
felonious vagrancy, personified
|
|
June 16, 2012, 07:24:55 AM |
|
Xilinx guarantees their chips can tolerate operating junction temperatures up to 125°C (see page 2 of DS162) without damage, for all temperature grades (they're manufactured identically, then sorted by testing). That's the Absolute Maximum Rating. The footnote warns: "Exposure to Absolute Maximum Ratings conditions for extended periods of time might affect device reliability." So they guarantee it won't die immediately, but not that it won't fail after a few weeks or months of running out of spec, which is what ztex is worried about. I think you missed the part of my post where I showed that it takes 26 A to get the junction up to 125°C with a heatsink and fan.
|
The printing press heralded the end of the Dark Ages and made the Enlightenment possible, but it took another three centuries before any country managed to put freedom of the press beyond the reach of legislators. So it may take a while before cryptocurrencies are free of the AML-NSA-KYC surveillance plague.
|
|
|
eldentyrell (OP)
Donator
Legendary
Offline
Activity: 980
Merit: 1004
felonious vagrancy, personified
|
|
June 16, 2012, 07:28:55 AM Last edit: June 16, 2012, 07:56:34 AM by eldentyrell |
|
You can offer to replace/refund FPGAs that fail during usage of your software.
Intriguing. But how am I supposed to tell this apart from a chip damaged by failure to bother installing a heatsink and fan? Unless I can do that, I'll simply end up taking over your warranty liabilities...
|
|
|
|
eldentyrell (OP)
Donator
Legendary
Offline
Activity: 980
Merit: 1004
felonious vagrancy, personified
|
|
June 16, 2012, 07:31:37 AM |
|
Well, I am sitting here staring at PAR grinding slowly along. I don't know if I'll be able to stay awake until it finishes.
Assuming nothing goes wrong (big if), preview bitstreams in the morning.
|
|
|
|
eldentyrell (OP)
Donator
Legendary
Offline
Activity: 980
Merit: 1004
felonious vagrancy, personified
|
|
June 16, 2012, 07:37:16 AM Last edit: June 16, 2012, 09:02:32 AM by eldentyrell |
|
That quote's from a thread about a Virtex-5 device. As I recall those have metal heatspreaders. So a different chip, different packaging...
No. The whole reason for using junction temperature is that it's package-independent. The package determines the relationship between the air/board/case temperature and the junction temperature. Junction temperature is all that matters, but you can't measure it directly, so you compute it using thermal constants from the package.
...and built on a different process (65nm rather than 45nm).
Good point, but if the 45nm process really was more easily damaged by temperature you'd see that reflected in lower maximum junction temperatures in the datasheet. The fact that Xilinx didn't change them means it's unlikely there has been a major change in temperature tolerance. And we're only talking about one generation difference in process here -- it's not like 180nm vs 22nm or anything like that.
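For anyone following along, the "compute it using thermal constants" step is usually written as below. This is only a sketch: the 1.2 V core voltage and the 25°C ambient are assumptions for illustration, power on the other supply rails is ignored, and the resulting thermal resistance is back-calculated from the 26 A figure rather than taken from any datasheet.

```latex
% Junction temperature from ambient temperature, package/heatsink thermal
% resistance (junction to ambient), and dissipated power:
\[
  T_J = T_A + \theta_{JA}\, P, \qquad P \approx V_{CCINT}\, I_{CCINT}
\]
% Assumed values: V_CCINT = 1.2 V, T_A = 25 C. With I_CCINT = 26 A:
\[
  P \approx 1.2\,\mathrm{V} \times 26\,\mathrm{A} = 31.2\,\mathrm{W}
\]
% Effective thermal resistance that would put the junction at 125 C:
\[
  \theta_{JA} = \frac{T_J - T_A}{P}
              = \frac{125 - 25}{31.2}\ \mathrm{^\circ C/W}
              \approx 3.2\ \mathrm{^\circ C/W}
\]
```

The package only enters through the thermal resistance term; the 125°C limit on the junction itself stays the same.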
|
|
|
|
eldentyrell (OP)
Donator
Legendary
Offline
Activity: 980
Merit: 1004
felonious vagrancy, personified
|
|
June 17, 2012, 12:22:16 AM |
|
Hrm.
So, I have a bitstream that will run error-free on the ztex board at 170 MHz as long as I only use one of the three rings. I can also run any one ring at 170 MHz and the other two really slow (like 50 MHz slow). But if I use all three rings at full speed, I get errors all the way down to some pretty embarrassingly poor hash rates. I experienced a similar phenomenon on my own boards, but it wasn't nearly this severe and the optimal clock frequencies were still giving me 245+ MH/s on my SG-2 boards (ztex uses faster SG-3 chips).
I'll be doing some more experiments on the clock-rate/error relationship this evening, but the important questions require a new build in order to answer, and that's going to take 24-48 hours (sorry, folks). Still lots of tricks up my sleeve, but they take (build) time.
|
|
|
|
rjk
Sr. Member
Offline
Activity: 448
Merit: 250
1ngldh
|
|
June 17, 2012, 12:49:08 AM |
|
What are the specs of the box you are building on? Always good to know for comparison.
|
|
|
|
kano
Legendary
Offline
Activity: 4620
Merit: 1851
Linux since 1997 RedHat 4
|
|
June 17, 2012, 01:07:04 AM |
|
As of today, such a bitstream change would have to be manually handled.
No. There is an option to quit the miner if it is unable to contact the signcryption server. So you launch it from a three-line shell script:
#!/bin/bash
run-tml-miner
run-old-miner
Problem solved.
When you submit free patches to all the major mining software packages to support automatic failover to backup bitstreams I will agree with you.
I hereby open-source the above three-line shell script.
Quit and never go back - simplicity at its best. I like it.
|
|
|
|
Inspector 2211
|
|
June 17, 2012, 02:13:52 AM |
|
Hrm.
So, I have a bitstream that will run error-free on the ztex board at 170 MHz as long as I only use one of the three rings. I can also run any one ring at 170 MHz and the other two really slow (like 50 MHz slow). But if I use all three rings at full speed, I get errors all the way down to some pretty embarrassingly poor hash rates. I experienced a similar phenomenon on my own boards, but it wasn't nearly this severe and the optimal clock frequencies were still giving me 245+ MH/s on my SG-2 boards (ztex uses faster SG-3 chips).
I'll be doing some more experiments on the clock-rate/error relationship this evening, but the important questions require a new build in order to answer, and that's going to take 24-48 hours (sorry, folks). Still lots of tricks up my sleeve, but they take (build) time.
Bitfury experienced a similar thing. It's probably ground bounce INTERNALLY to the FPGA. Or something like that. Xilinx never designed their FPGAs in such a way that 95% of all flip-flops could switch at the same time. They just didn't. But that's what a miner does.
|
|
|
|
pusle
Member
Offline
Activity: 89
Merit: 10
|
|
June 17, 2012, 09:59:34 AM |
|
You have probably thought of this stuff already but here goes:
Hook up an oscilloscope to VCCINT, as close as possible to the FPGA. Make it look smooth on the scope at all times:
-Make sure a new midstate load etc. doesn't result in spikes.
-Stagger the rings' start time/midstate load/nonce wrap
-Use phase offset to interleave clock transitions for the different rings
-Ramp the clocks up gradually from idle
It could also be the PLL suffering from too much noise. Try changing the loop filter/bandwidth of the PLL. It might be hard, but try an external high-speed clock source (connection/termination to the board is critical).
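To put rough numbers on the phase-offset suggestion: this is a sketch that assumes all N rings run from clocks of the same frequency f (here N = 3 and f = 170 MHz, the figure mentioned earlier in the thread) and that the clock edges are spread evenly over one period.

```latex
% Evenly spaced phase for ring k of N (k = 0, ..., N-1):
\[
  \varphi_k = \frac{360^\circ \, k}{N}
  \quad\Rightarrow\quad
  \varphi_0 = 0^\circ,\ \varphi_1 = 120^\circ,\ \varphi_2 = 240^\circ
  \quad (N = 3)
\]
% Corresponding spacing in time between the rings' clock edges:
\[
  T = \frac{1}{f} = \frac{1}{170\,\mathrm{MHz}} \approx 5.88\,\mathrm{ns},
  \qquad
  \Delta t = \frac{T}{N} \approx 1.96\,\mathrm{ns}
\]
```

So each ring's rising edge would land roughly 2 ns after the previous ring's, instead of all three switching at once.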
|
|
|
|
ShadesOfMarble
Donator
Hero Member
Offline
Activity: 543
Merit: 500
|
|
June 17, 2012, 07:43:33 PM |
|
I don't know if I missed a statement for this question completely or it has not yet been answered...
What about eldentyrell's ("tricone mining") bitstream? Are you going to support it? Will it be the first bitstream working* with CM? Or are you focusing on your own bitstream?
(* working means more than 700 MH/s)
You really should be asking eldentyrell this question. Given his plans for a commission structure, it makes no sense for anyone other than himself to work on implementations.
So, eldentyrell, what do you say?
(CM = Cairnsmore board by Enterpoint)
|
|
|
|
makomk
|
|
June 17, 2012, 08:34:42 PM |
|
On ZTEX boards, the FPGA's JTAG signals are not even connected to the Cypress FX2 microcontroller.
That's unfortunate. I'm guessing he's not broken out the appropriate pins to allow the two to be connected either.
|
Quad XC6SLX150 Board: 860 MHash/s or so. SIGS ABOUT BUTTERFLY LABS ARE PAID ADS
|
|
|
eldentyrell (OP)
Donator
Legendary
Offline
Activity: 980
Merit: 1004
felonious vagrancy, personified
|
|
June 18, 2012, 05:40:55 AM |
|
Hookup an oscilloscope to the vccint, close as possible to the fpga.
Y'know, I was never any good with an oscilloscope. One of these days….
-Make sure new midstate load etc doesn't result in spikes.
Check. I deliberately don't stop the rings when loading nonces for this very reason; I just let garbage fly out the back end due to half-loaded work. The noise caused by that huge change in power consumption is not worth it.
-Stagger the rings start time/midstate load/nonce wrap
-Use phase offset to interleave clock transitions for the different rings
Well, they're on different clocks. However, I will build one where they all use the same clock so I can try this -- good ideas.
-Ramp the clocks up gradually from idle
Yes, already doing this.
It could also be the PLL suffering from too much noise. Try changing the loopfilter/bandwidth of the PLL.
Since it's only a jitter filter I have it on the lowest bandwidth setting. I'm also going to try dropping it altogether after finding a comment by Austin saying that Xilinx's PLLs are very sensitive to activity in nearby logic.
Might be hard but try an external high-speed clock source (connection/termination to the board is critical)
Unfortunately I don't have boards that can do that (SMA connectors, right?)
|
|
|
|
eldentyrell (OP)
Donator
Legendary
Offline
Activity: 980
Merit: 1004
felonious vagrancy, personified
|
|
June 18, 2012, 05:42:32 AM |
|
Ok, so, it appears that I can get the top and bottom rings running at the rated speed (I'm still using 150 MHz builds because they finish fast). But the middle ring only runs at 60% of expected speed unless the top+bottom rings are switched off (or running super slow).
If it runs stable overnight I will launch a high-frequency build and post those bitstreams when they finish. It won't be the predicted hashrate, but it should still be an improvement over what people have right now. And no commissions until I figure out wtf is really going on here.
|
|
|
|
eldentyrell (OP)
Donator
Legendary
Offline
Activity: 980
Merit: 1004
felonious vagrancy, personified
|
|
June 18, 2012, 05:45:21 AM |
|
Bitfury experienced a similar thing.
Yeah, I know… once I have fewer things on my to-do list I think me and him and anybody else interested ought to heckle forums.xilinx.com until they own up to this issue. I have been seeing the very same "center of the fabric drops out first" phenomenon, but until I read about his experiences I had it chalked up to my crappy homemade boards. Now that I'm seeing it on ztex's boards too I am kinda disappointed with X.
Xilinx never designed their FPGAs in such a way that 95% of all flip-flops could switch at the same time. They just didn't.
Maybe, but they steadfastly refuse to post maximum current ratings for their devices, and say over and over "run our power analysis tools, and if the tool says it's ok, it's ok". Well, all my designs pass their power analyses. Yet the voltage near the center of the chip is clearly sagging. Basically, the power analysis tools are effectively "part of the datasheet" and Xilinx has a serious datasheet error here.
|
|
|
|
eldentyrell (OP)
Donator
Legendary
Offline
Activity: 980
Merit: 1004
felonious vagrancy, personified
|
|
June 18, 2012, 05:49:26 AM |
|
quit and never go back - simplicity at its best. I like it.
Or, at least, manual intervention required to go back. I suppose a better idea would be a 3-line script that emails the operator to let him/her know that it has "downshifted".
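A rough sketch of what that "email the operator" variant might look like. run-tml-miner and run-old-miner are the names from the script quoted in this thread. Everything else (OPERATOR_EMAIL, notify_by_mail, run_with_failover, and a working mail command on the box) is an assumption for illustration, and it is a bit longer than three lines.

```shell
#!/bin/bash
# Sketch only. run-tml-miner and run-old-miner are the names used in the thread;
# OPERATOR_EMAIL, notify_by_mail and run_with_failover are hypothetical.

PRIMARY_MINER="${PRIMARY_MINER:-run-tml-miner}"
FALLBACK_MINER="${FALLBACK_MINER:-run-old-miner}"
NOTIFY_CMD="${NOTIFY_CMD:-notify_by_mail}"
OPERATOR_EMAIL="${OPERATOR_EMAIL:-root}"

# Send a one-line message to the operator; assumes a working mail(1) setup.
notify_by_mail() {
    echo "$1" | mail -s "miner downshifted" "$OPERATOR_EMAIL"
}

# Run the primary miner until it quits, tell the operator, then fall back.
run_with_failover() {
    local status=0
    "$PRIMARY_MINER" || status=$?
    # A failed notification should not stop the fallback miner from starting.
    "$NOTIFY_CMD" "primary miner exited with status $status; starting fallback" || true
    "$FALLBACK_MINER"
}
```

Ending the script with a call to run_with_failover gives the same "quit and never go back" behavior as before, plus one email when the downshift happens. Going back to the primary miner still takes manual intervention.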
|
|
|
|
DiabloD3
Legendary
Offline
Activity: 1162
Merit: 1000
DiabloMiner author
|
|
June 18, 2012, 06:06:23 AM |
|
Ok, so, it appears that I can get the top and bottom rings running at the rated speed (I'm still using 150 MHz builds because they finish fast). But the middle ring only runs at 60% of expected speed unless the top+bottom rings are switched off (or running super slow).
If it runs stable overnight I will launch a high-frequency build and post those bitstreams when they finish. It won't be the predicted hashrate, but it should still be an improvement over what people have right now. And no commissions until I figure out wtf is really going on here.
Sounds like you need prime numbers.
|
|
|
|
pieppiep
|
|
June 18, 2012, 06:47:35 AM |
|
What if you shift the clock of the middle ring? Maybe the internal voltage in the middle of the chip drops too much at each clock edge.
|
|
|
|
ztex
Donator
Sr. Member
Offline
Activity: 367
Merit: 250
ZTEX FPGA Boards
|
|
June 18, 2012, 01:54:14 PM |
|
Bitfury experienced a similar thing.
Yeah, I know… once I have fewer things on my to-do list I think me and him and anybody else interested ought to heckle forums.xilinx.com until they own up to this issue. I have been seeing the very same "center of the fabric drops out first" phenomenon, but until I read about his experiences I had it chalked up to my crappy homemade boards. Now that I'm seeing it on ztex's boards too I am kinda disappointed with X.
Inspector 2211 already mentioned it, and it is also hidden in the datasheet (the "Simultaneous switching" issue): the internal GND traces of the S6 seem to be a little bit weak.
|
|
|
|
pusle
Member
Offline
Activity: 89
Merit: 10
|
|
June 18, 2012, 02:57:03 PM |
|
Err, isn't the "Simultaneous switching" issue about I/O pins, not internal core logic?
|
|
|
|
ngzhang
|
|
June 18, 2012, 03:48:54 PM |
|
What if you shift the clock of the middle ring? Maybe the internal voltage in the middle of the chip drops too much at each clock edge.
A trade-off: the added GCLKs will consume more power. But I think that's worth trying.
|
|
|
|
|