Bitcoin Forum
May 08, 2024, 02:01:12 AM *
101  Bitcoin / Hardware / Re: New Official AMT Thread on: May 23, 2014, 11:15:35 PM
Well... in theory at least the tweaked cgminer for the A1 is on GitHub, so the code for that can be looked at (not by me - outta my scope of things). Are the calcs from it right and being reinterpreted by the web UI?

Oh, standard rule of thumb Bitmain uses for the Ants error % is
HW error % = HW/(diffA + diffR + HW) * 100

Following that, my S1s range from truly infinitesimal to 0.000012% error. I have never had my pool report rejects/errors/stales. Always 0/0/0.
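For anyone who wants to plug their own numbers into that rule of thumb, here is a minimal sketch of it. The function and parameter names are my own, not anything from cgminer or Bitmain, and the sample counters are made up for illustration.

```python
def hw_error_percent(diff_accepted: float, diff_rejected: float, hw_errors: int) -> float:
    """HW error % = HW / (diffA + diffR + HW) * 100, as in the rule of thumb quoted above."""
    total = diff_accepted + diff_rejected + hw_errors
    if total == 0:
        return 0.0  # nothing submitted yet, so there is no meaningful percentage
    return hw_errors / total * 100.0


# Illustrative (made-up) counters, not readings from a real miner
print(f"{hw_error_percent(1_000_000, 500, 12):.6f}%")
```

Note that this is a lifetime figure: all three counters only ever grow, so a burst of errors late in a long run barely moves it.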

Yes, a formula like that would be accurate to the "output", not the "workload". (Now, if both those were low, that would be another issue, other than one with a high workload and high errors. Which is what we should be seeing.)

Eg...
On a system that normally runs about 375-390 GHs...
1: Doing 380 GHs of work and 0% errors, thus Validity is 380 GHs "accepted" = Normal
2: Doing 380 GHs of work and 50% errors, thus Validity is 145 GHs "accepted" = Reboot needed (? Software errors, Initialization errors, possibly ?)
3: Doing 145 GHs of work and 0% errors, thus Validity is 145 GHs "accepted" = Stop miner (? Hardware critical failure, card dead or unit half dead ?)

Right now, for #2, both the screen and GUI indicate 360 GHs, as if nothing is wrong at all. #2 will eventually lead to #3 in time, or worse... If not rebooted or attended to.
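The three cases above could be told apart automatically if the software compared the raw workload against the accepted output. A rough sketch of that check follows. The 375 GHs floor comes from the "normally runs about" range in the example, while the 5% tolerance and all of the names are assumptions of mine, not anything the web-UI actually does.

```python
def diagnose(workload_ghs: float, valid_ghs: float,
             normal_min_ghs: float = 375.0, tolerance: float = 0.05) -> str:
    """Classify a miner by comparing raw workload with accepted ("valid") output."""
    workload_ok = workload_ghs >= normal_min_ghs
    # Valid output should track the workload closely when nothing is being rejected
    valid_ok = valid_ghs >= workload_ghs * (1.0 - tolerance)
    if workload_ok and valid_ok:
        return "normal"
    if workload_ok:
        return "reboot needed"   # full workload, but much of it is not accepted
    return "stop miner"          # the workload itself has dropped


# The three cases from the example above
for workload, valid in [(380, 380), (380, 145), (145, 145)]:
    print(workload, valid, diagnose(workload, valid))
```

The point of the design is that the displayed speed alone cannot separate case 1 from case 2; it takes both numbers.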

P.S. Seems the hardware errors are not persistent... they are accumulative. Might want to formulate a time-scale for that number, as the total number is irrelevant. It is the hardware-errors per second/avg which is needed for that formula. I am up to over 185 hardware errors, and so I am sure that would translate into 100% death, if that was how many errors there are at the moment, not just over time.

Eg, it has been running for 60 min, or 3600 seconds, so 185 errors in 3600 seconds = HW-Error rate of 0.0514/second, which is not bad. (Started with a lot, right out of the gate, now it is only one every few seconds.)
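Since the counter only ever goes up, the useful number is the change between two readings divided by the time between them. A small sketch of that, with names of my own choosing; the second pair of samples is invented just to show a recent-window rate.

```python
def hw_error_rate(prev_count: int, prev_time_s: float,
                  curr_count: int, curr_time_s: float) -> float:
    """Errors per second between two samples of the cumulative HW error counter."""
    elapsed = curr_time_s - prev_time_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (curr_count - prev_count) / elapsed


# Average since start, using the figures above: 185 errors in 3600 seconds
print(f"{hw_error_rate(0, 0, 185, 3600):.4f} errors/second since start")  # 0.0514

# Rate over the most recent minute (hypothetical samples)
print(f"{hw_error_rate(180, 3540, 185, 3600):.4f} errors/second in the last 60 s")
```

Sampling over a short recent window shows whether the errors are settling down or getting worse, which the lifetime total hides.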
102  Bitcoin / Hardware / Re: New Official AMT Thread on: May 23, 2014, 10:48:43 PM
Ok, one more frustrating issue...

Though this is not an issue with AMT, but with the software being used for mining-control.

The issue:

The number on the screen "Estimated hash-rate", is completely inaccurate, on many levels.
1: Several times while the machine was running, it has run with over 50% "device rejections". The screen still indicates the hashing-rate as "380 GHs", though the output is obviously only "190 GHs", as seen in the pools. (This is "rejected shares due to invalid, rejected at the RasPi".)
2: Every time, I have "hardware errors". They grow in time... 2 - 7 - 14 - 24 - 36 - 48... Still, the screen shows the hashing-rate as "380 GHs", though the output is obviously only about "290 GHs", as seen in the pools. (This is "failed hardware, not responding, not hashing".)

Worst is when I have 50% hardware rejected, and 48 hardware errors, which brings the output to "145 GHs", (290/2=145). Still the screen indicates a healthy "380 GHs". (Not sure if hardware errors are accumulative, or "total", but it seems like a "total", in the final output.)

I am not sure what it is using to get the GHs calculations, but it is not actually calculating them from the output of the machine. Seems it may be estimating based on "single best share", for whatever share is still being hashed. Thus, as long as one fraction of one chip was running, it would indicate "380 GHs", though the output may actually be only "2.3 GHs".

Seems each hardware error is roughly 2.3 GHs, in the state of the cards which I have. Meanwhile, the device-rejected percent is just a percent of what is being delivered down the chain, from non-failing portions. So, to determine the "actual estimated hashing rate", I have to roughly subtract 2.3 GHs for each error, then subtract the percentage of rejections from the Pi, and then subtract the small percentage of rejections from the pool-server.
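That correction can be written out as a short calculation. This is only a sketch of the rough estimate described above: the 2.3 GHs per error is my own observation for these cards, not a constant of the hardware, and the function and parameter names are made up for illustration.

```python
def estimated_valid_ghs(displayed_ghs: float, hw_errors: int,
                        device_rejected_pct: float, pool_rejected_pct: float,
                        ghs_per_hw_error: float = 2.3) -> float:
    """Rough 'actual' hash-rate: displayed rate minus the three kinds of loss."""
    # 1: subtract a fixed amount for each hardware error (assumed 2.3 GHs each)
    after_hw = max(displayed_ghs - hw_errors * ghs_per_hw_error, 0.0)
    # 2: remove the share of work rejected at the RasPi
    after_device = after_hw * (1.0 - device_rejected_pct / 100.0)
    # 3: remove the small share rejected by the pool server
    return after_device * (1.0 - pool_rejected_pct / 100.0)


# Displayed 380 GHs, 48 hardware errors, 50% device rejects, 1.3% pool rejects
print(f"{estimated_valid_ghs(380, 48, 50.0, 1.3):.1f} GHs")
```

The order matters: the flat per-error subtraction comes first, and the percentages are then applied to whatever is left.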

Might be one reason why you guys are having a difficult time "detecting" the failing or failed boards... Prior to shipping. Because the software is essentially fudging the "output" or the "production". (Not sure if this was done on purpose, to thwart the introduction of THs machines into the market, but I suspect it was not done by accident. No one can be stupid enough to have overlooked that, when programming. It had to be purposely reprogrammed, because the program which they made this from did account for all of that. Not AMT, the people who made this web-UI and the RasPi distro. The web-UI has access to that information, obviously, in the logs and output.)

If I had not checked it, I would have run for days, at less than 50%, thinking it was running 100%, as the screen and web-UI indicated.

Sample output from the "LOGS -> CgMiner" page... (Screens indicate "380 GHs". Pool indicates "145 GHs".)
Code:
Difficulty Accepted=>169984.0
Pool Rejected%=>1.3
Found Blocks=>0
Difficulty Rejected=>0.0
Device Rejected%=>54.8
Pool Stale%=>0.0
Work Utility=>16.48
Rejected=>0
Elapsed=>2418
Hardware Errors=>85
Accepted=>664
Network Blocks=>3
Local Work=>219307
Get Failures=>0
Difficulty Stale=>0.0
Total MH=>923765860.991
Device Hardware%=>11.3485
Discarded=>3708
Stale=>0
MHS av=>382077.53
Getworks=>64
MHS 5s=>388568.23
Best Share=>131249
Last getwork=>1393148966
Remote Failures=>0
Utility=>16.48

Pool Rejected%=>1.3
Device Rejected%=>54.8
Hardware Errors=>85
Device Hardware%=>11.3485
MHS av=>382077.53 (Not even possible with 50% rejects, which should not be counted.)
MHS 5s=>388568.23
Remote Failures=>0
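The log is plain "Key=>Value" text, so it is easy to pull the numbers out and cross-check the displayed speed against the shares that were actually accepted. The sketch below assumes that "Difficulty Accepted" is counted in difficulty-1 units and that one difficulty-1 share stands for about 2^32 hashes on average. The function names are my own, and only a few of the log lines are repeated here.

```python
LOG_TEXT = """\
Difficulty Accepted=>169984.0
Device Rejected%=>54.8
Elapsed=>2418
Hardware Errors=>85
MHS av=>382077.53
"""

HASHES_PER_DIFF1_SHARE = 2 ** 32  # average hashes behind one difficulty-1 share


def parse_stats(text: str) -> dict:
    """Turn 'Key=>Value' lines into a dict of floats, skipping anything else."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=>")
        if not sep:
            continue  # not a Key=>Value line
        try:
            stats[key.strip()] = float(value.split()[0])
        except (ValueError, IndexError):
            pass  # empty or non-numeric value
    return stats


def share_based_ghs(stats: dict) -> float:
    """Hash-rate implied by the accepted shares alone, in GH/s."""
    hashes = stats["Difficulty Accepted"] * HASHES_PER_DIFF1_SHARE
    return hashes / stats["Elapsed"] / 1e9


stats = parse_stats(LOG_TEXT)
print(f"Reported (MHS av): {stats['MHS av'] / 1000:.1f} GH/s")
print(f"From accepted shares: {share_based_ghs(stats):.1f} GH/s")
```

Over a short run like this one (about 40 minutes) the share-based figure is noisy, so treat it as a sanity check rather than an exact measurement.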

Suggestion...

In the Web-UI, Indicate the "Workload Output: 380 GHs", then the "Valid Output: 145 GHs". Also taking note of "Device rejected: 53%", and "Hardware Errors: 85"...

The screen on the machine should just read "Speed: 145 GHs", not "Speed: 380 GHs", since that should indicate our "Valid production". If that is low, we KNOW there is an issue, and can investigate further with the GUI, or reboot, or alert you of an issue before the issue becomes a major device failure. Otherwise, with everything seeming as if nothing was wrong, we would let it keep running, possibly burning up the device and causing more "expensive damage", which the MFG would be responsible for repairing. (Just trying to reduce your future potential losses here.)

At the moment, a simple reboot fixes this issue... for the "device rejected" errors. (That too, you could indicate on the miner screen and the web-UI, since you can detect that "error rate", easily.)

Works best if you shut it down and wait 30 seconds, to allow the unit to completely drain itself of power. Seems the caps hold a lot of power in the Pi, and some on the miner, causing it to linger in a low-power, half-alive, dysfunctional state. One which causes it to not boot correctly, if instantly turned off and back on, in a time-period below 20-seconds.
103  Bitcoin / Hardware / Re: New Official AMT Thread on: May 22, 2014, 01:58:37 PM

YES... that is the correct mounting and heat-sink setup I was looking for... xD

Didn't know about that version!

Those should have at-least one mounting to a screw though, with a spacer-obviously. That small thermal-tape will not hold that weight, once heated. I can peel the heat-sinks off by hand. (Have done that to four of them, where they were not seated correctly on the chip. Just got to be real slow about it. xD I am a master of tape removal! That is my only skill.)

That heat-sink covers the "span of the air flow" more, forcing the air through it. (Unlike the copper ones, where 90% of the air flows around it, not through it.)

That is hard to apply, when they hang over the chip like that, and are in two separate pieces. I would have made a spacer with regular double-sided tape, at the least, to support the excess overhang, on both sides, secured to the PCB.

One long one, across both those chips would have been better. But for the other four, an offset mounting would have been preferred, with the under-side support.

I am sure that was just another manufacturer's "substitutions".
104  Bitcoin / Hardware / Re: New Official AMT Thread on: May 22, 2014, 01:53:15 PM
Ok, latest update... xD

The boards with only one of the four caps/resistors around the A1, with the alternate other caps...
ISAWHIM,

We appreciate your diagnostic efforts thus far, they have been well documented and thorough. Send back the two non-working boards and we'll send you another 2 working boards plus an additional 2 non-working with issues...

So far, there are three that need to be sent back. One is the board which needs the A1 reworked or replaced. (That was one of the "good ones", the later designs. You would not have seen that the chip was out of place, unless you got all personal with the chip like I had. It started running fine, until the heat got to it, I assume. So I am not sure if the inductors or other power components were also hurt. They were all super hot.)

The other two were the boards which have just stopped responding to detection. (The ones with only the one cap at the A1.) Both have obvious screw-marks across that "COUT46" set of terminals, thus, that was "shorted" at one point in the component's operation. Though they were not shorted at the moment I was using them, or running them... I was not sure if that had something to do with the issue those boards were having.

If you could tell me what to look for on those... as they were both once operating perfectly. One at 208GHs, and one at 198GHs, prior to just stopping. Those, I believe, had the issue where the boards seemed to start mining faster and faster, just before failure. They eventually started hashing around 50-20GHs, which caused me to re-boot, which would normally seem to "reset" the operational state. But at the time of those cards' failures, the "last time they did this", re-booting resulted in the cards just failing to reply for operation.

Are those going to need a component swap? (Thus turning them into the other boards?)

As for the other two boards with the blue heat-sinks... I don't recall any "markings" on those. So, those might work, but may eventually have an issue, as I do recall someone posting "Does anyone have any boards with the blue heat-sinks that is still working". So, that leads me to believe, as well as the fact that they were devoid of heat-sinks, that they will have something to challenge me. xD

Though, I will give those a once-over, to see if I can see anything wrong, before mounting to the heat-sinks and plugging them in.

That shorted board, I will pull and inspect tonight.

While the unit is apart again, I will play with the individual card-shrouds. Then too, I guess I will order some thermal-pads. To facilitate some desired cooling, where I feel it could be improved on those operating units. The pads I was looking at, are expensive, at a consumer level. (They are all overpriced. xD) However, these had the best thermal attributes I could find, without containing metal-foils/shims...

They accept BTC, however, at this low BTC value, it is better to hold the BTC and pay with USD. (BTC has started that rise I have been talking about for the last 2 months. The last thing you want to do now, is cash-out BTC. xD)
http://www.frozencpu.com/cat/l2/g8/c487/list/p1/Thermal_Interface-Thermal_Pads_Tape.html

I would love to use the "Ultra extreme" stuff, which is 17.0 W/mK, or the "Extreme" which is 11.0 W/mK, but not at those prices. xD
I will be getting the "Premium" stuff, which is more than acceptable at 6.0 W/mK. That is 4x to 12x better than the crap others sell as "normal". (Normal thermal pads, the kind used and sold without numbers, are usually 0.5 W/mK to 1.5 W/mK. Eg, they are just airless-foam/rubber/silicone that is heat resistant and impregnated with aluminum-oxide. xD.)

The pads are only 0.5mm thick, which is enough to raise the heat-sink off the surface of the PCB, however, I may look at getting the 1.0mm thick stuff, which I believe is similar to the stuff used on the small heat-sinks. The extra lift will break any capacitance happening between the heat-sink and the PCB itself, and even allow some air-flow under the heat-sink, adding another 30% surface-area for direct cooling. (The under side of the heat-sink, and the top exposed side of the PCB itself.)
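To put the W/mK numbers and the pad thickness into perspective, the conduction resistance of a flat pad is thickness / (conductivity x area). Here is a quick sketch comparing the grades mentioned above. The 10 mm x 10 mm contact area is purely an assumption for illustration, and contact resistance at the two surfaces is ignored.

```python
def pad_resistance_k_per_w(thickness_mm: float, conductivity_w_mk: float,
                           area_mm2: float) -> float:
    """Conduction resistance of a flat pad in K/W: R = t / (k * A)."""
    thickness_m = thickness_mm / 1000.0
    area_m2 = area_mm2 / 1_000_000.0
    return thickness_m / (conductivity_w_mk * area_m2)


AREA_MM2 = 100.0  # assumed 10 mm x 10 mm contact patch, not a measured value

grades = {
    "Ultra extreme": 17.0,
    "Extreme": 11.0,
    "Premium": 6.0,
    "Normal (best case)": 1.5,
    "Normal (worst case)": 0.5,
}

for name, k in grades.items():
    thin = pad_resistance_k_per_w(0.5, k, AREA_MM2)
    thick = pad_resistance_k_per_w(1.0, k, AREA_MM2)
    print(f"{name:20s} 0.5mm: {thin:6.2f} K/W   1.0mm: {thick:6.2f} K/W")
```

Doubling the thickness doubles the resistance for the same material, so the move from 0.5mm to 1.0mm pads costs some thermal performance in exchange for the extra lift.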

I will be adding spacers under each screw, to "shim" the board to a matching height, enough to allow slight compression on the pads, without stressing the PCB. (This is not a solution that is as simple as it sounds. xD)

Some will be added to the tops of the inductors, as soon as I get, or make, a heat-sink for them.

AMT, a tip for applying thermal-pads, is to use a "flat roller" or just a bowed-surface...
1: Peel the backing off one side
2: Rest the pad on the curved surface
3: Using the edge of your nail, hold it in place, as you gently press the other edge to the surface to bond to... (Just the edge)
4: Roll the heat-pad down onto the surface while applying firm pressure. (That stops any large air-bubbles from forming.)

Here is the big trick to the next part...
1: Wait one day before applying this in a similar method, to the chip. (That allows the pad to "expand" back to normal.)
2: Rest one edge on the edge of a chip, and angle it downward, applying pressure towards the edge that is already in contact.
3: Use a heat-gun to heat the pad real hot... Below the SMT solder temps, obviously. (This blows-out any bubbles, by heat-evacuation/expansion.)
4: Apply slight pressure to the two taped/bonded items, and allow to cool. (That seals any "gaps", and when cooling, creates a vacuum that will pull the thermal-pad onto the surface for 100% microscopic contact.)

The alternative to that last step, is to have a vacuum chamber with a device that can apply pressure after the air has been evacuated and before the pressure has been returned to normal. That is an expensive unit. The heat trick works better anyways. Makes the bond stronger and pre-treats the unit for "operating conditions".

The issue with doing this by hand... You use your FINGER to press the pad onto the chip. The pad compresses in the center, creating a "dome". Once applied to the chip, that "dome" is now a sealed "bubble" of air and humidity. That decays the bond, stops the heat-pad from making contact where it is needed most, and ends-up in failure similar to the photo a few pages back.

I lied, there is another application method that works better, if the pads can be compressed. It involves pre-compressing them into a reverse-dome, or a bowl-like curve. However, that is best used for machines/robots to do the application, as they can hold the two surfaces flush, and lower them so the "now raised center", touches the chip first. That also stops them from having to wait 24-hours for the pad to "swell back to normal", between both applications. That swelling time is a big issue, after having been compressed.
105  Bitcoin / Hardware / Re: New Official AMT Thread on: May 22, 2014, 04:58:56 AM
Ok, latest update... xD

The boards with only one of the four caps/resistors around the A1, with the alternate other caps... are both non-responsive. (Both these originally came with heat-sinks.)

The four boards with copper heat-sinks, and all four caps/resistors, two are still running. One runs at 204GHs, one at 178GHs. Two do not function. Of the two which are not functioning, one is due to an A1 chip seemingly over-heating, due to improper mounting on the PCB. The other had no heat-sink, and had an "S" or a "5" written near the back-plane mount, in permanent marker. This card had no heat-sink originally, which was added from one of the failed cards. It seems to have a short, as it knocks the whole miner offline, stopping all cards from booting, as it seems to ground-out my single 12v rail.

There are two more cards I have yet to test, which have the blue-anodized aluminum heat-sinks on the A1's and no heat-sink on the back-side. I will attempt to place heat-sinks on those, from the other non-responding or malfunctioning cards.

The few fractions of a coin that I have earned, have just about equaled the additional money I had to spend, to attempt to get these working, and running.

I too, am down to only 2 working cards, of eight. However, those cards are working fine, with the exception of the one with obvious dead cores. There is nothing left I can do to resolve any issues myself. The only thing I could do, if the thermal-compound was the issue, would be to buy thermal-pads for the large heat-sink, and to slice the heat-sink in half, so it didn't expand as it heated. That, and add a pressure-clip to hold the front heat-sink to the back heat-sink, to sandwich the chips with some tension, for better thermal contact. However, the issue is not thermal, at the moment.

After testing these last two boards, I will contact AMT to have the non-functioning cards returned. Each will be clearly noted, as to the issues found. However, it seems that the issues are already known for the older boards.

I just want to remind anyone who has just walked-in to this post... This is NOT a reflection of a "finished AMT miner". This was one of the alternative options. I am simply being thorough in my posting, as to the specific issues, so that the information can be used to help with any possible resolution to the specific units I was sent. I was aware that the boards may have issues, as they were of various production versions.

Just had to add those last few lines. I didn't want people thinking that this was what THEY were going to be delivered.
106  Bitcoin / Hardware / Re: New Official AMT Thread on: May 20, 2014, 02:01:41 AM
Mmm. Guess we gave them something to chew on for a while

I requested some driver info so I can get these things working. That is my hold up at the moment. I might be able to get it working without the driver info, I am still working at it.

Why do you need a new driver when a couple of folks supposedly don't have a problem running the system?
Because some of us do...

That is like windows saying, why patch windows, it works for a few... xD

That assumes that the hardware is different. Don't we have the same hardware?

Apparently, not... I have two separate mountings of the same boards. But even with the same hardware, every individual component in every system is not the same, just "similar". You happened to get one with components that all work well in harmony. I, in this case, have components that obviously need "tuning" or "replacement". (Tuning to run them so they don't over-shoot "operational function". They start working fine, but as they run, they change operation, running into a state that can't be managed with the "stock settings", which can only be changed, at the moment, by PuTTYing into the RasPi.)

It is like your RAM, that you purchase, runs great at stock speeds and voltages... mine, purchased from the same MFG, starts producing errors, demanding less voltage and slower timing... (Or the same timing, but my setting of 20 kHz is actually running at 22 kHz, due to MFG of other components. Thus, I have to lie and set mine to 18 kHz to make it actually produce 20 kHz, to compensate for the drift of the other components, or for the variations in my component over yours.)
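The "lie to it" compensation is just a ratio, assuming the real output scales linearly with the setting. A tiny sketch, with names of my own choosing:

```python
def compensated_setting(target: float, nominal_setting: float,
                        measured_output: float) -> float:
    """Setting to request so the real output lands on target (assumes linear scaling)."""
    if measured_output <= 0:
        raise ValueError("measured output must be positive")
    return target * nominal_setting / measured_output


# A setting of 20 actually produces 22, so to really get 20 ask for a bit over 18
print(f"{compensated_setting(20.0, 20.0, 22.0):.2f}")
```

That works out to roughly the 18 used in the example above, and it only holds while the drift stays constant, which it does not once things heat up.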

Perfect example is your CPU... It is sold as 2.5GHz, but yours may be running at 2.4572332GHz, and mine at 2.637434234GHz, though we both have the exact same stock settings. (Which, if you notice, changes at every reboot. Your CPU never runs at the same exact frequency, ever, and drifts while running and heating up.)

In this case, 2.63+ causes my CPU to crash when I open notepad. xD That 0.007434234 extra GHz, is enough to push it over the edge.
107  Bitcoin / Hardware / Re: New Official AMT Thread on: May 20, 2014, 01:45:46 AM
Mmm. Guess we gave them something to chew on for a while

I requested some driver info so I can get these things working. That is my hold up at the moment. I might be able to get it working without the driver info, I am still working at it.

Why do you need a new driver when a couple of folks supposedly don't have a problem running the system?
Because some of us do...

That is like windows saying, why patch windows, it works for a few... xD

Though, for my situation, where I am seeing the miner increase in hashing over time, on the faulting cards... It will not make much difference. It was the faulting cards stopping the rest of the cards from operating normally. However, something is still causing the boot-up failure to stop at "MONITOR".

In all honesty, I want the new software, just so I can tune the hashing-power, where I can not now. Especially when the errors get so high, that the hashing-power indicated on the screen, is not reflected in the hashing-results in the pool, due to the errors being obviously not sent, or rejected by the server after being sent. However, that is not my primary concern at the moment. My concern is getting more than 2 cards running.

Also, getting hold of at-least one more heat-sink, so I can run my system as 3 and 3 cards. At the moment, I only have five heat-sinks. Three are on cards that no longer function. (Well, they no longer show in the system, when hooked-up.) The power-supply I have coming, has no use until I can get at-least three more cards running. So I at-least have five hashing.
108  Bitcoin / Hardware / Re: New Official AMT Thread on: May 20, 2014, 01:45:04 AM
Mmm. Guess we gave them something to chew on for a while

Yes I noticed the large negative ground-shielding, surrounded by large positive plates, which is the essence of an electrical capacitor. (Two plates of opposite polarity, separated by a layer of non-conductive medium.) Might explain the electrical run-off that is causing the VR to bump-up in time, which leads to ever-increasing voltage, leading to more capacitance, leading to more increasing voltage.

I am just not sure why one board does this, over another board, being all of the same design. With exception to the issue of substituted parts that may be leading to the failure. So hard to see some of the numbers of the smaller components.

I will be mapping-out the contacts tonight, and grinding-down one heat-sink, a fraction of a MM, which should be enough to avoid any thermal-grease issues, avoid all potential through-hole contacts, and give a more localized point of pressure contact. With less metal resting against the PCB itself.

I will be grinding the contact points about 1 MM smaller than the exposed metal, since the excess compound will squish-out to that small gap.

Going to look at the caps, and check the values.

P.S. I got my invoice today... Though it indicates the incorrect price. I paid $6089.00 not $5599.00 as the invoice indicates. (That was the earlier price, which I had just missed. I had to pay the $6089.00 price. Which is indicated on the AMT website invoice.)
109  Bitcoin / Hardware / Re: New Official AMT Thread on: May 19, 2014, 09:25:13 PM
Got my order on the way for my plexi-bending stuff... should be here by Friday. I will get it set up, and make two cases. One for the full-size 5-card standard design, and one with the 3-card mini design.

Then I will throw everything back into the tower, and show the heat-shroud thing. (Just can't show it running, since I only have a PSU large enough to run 3 cards, and I am down to 2 running cards at the moment.)

Now I have to go to work... I will look for more sources for low-profile, multi-fin heat-sinks when I get back. (I am trying to stay away from the thick heat-sinks, as those just retain heat. You need lots of thin fins, for the greater surface area, to reflect the heat off the aluminum, and into the humidity in the air.)

Also looking for some sheets of thermal-transfer pads, that I can cut-up to match the plates on the PCB's. As opposed to solid shims.
110  Bitcoin / Hardware / Re: New Official AMT Thread on: May 19, 2014, 09:14:39 PM
Can you point to some correct sized heat sinks for the other side?

I am still looking for one with the correct fin alignments...

That, or one larger one, like the one on the back... (well, not that large) Which can be cut-up to order, so the fins are oriented correctly.

However, they would still need to be mounted, have pressure applied, and be "ground/machined" so that they do not short any of the many components that are along the mounting-surface, around the small chip.

EG, the fins have to run the same way as the air-flow, but the heat-sinks have to be wide, not long. Wide, so they fill more of the air-flow for the width of the card. Because filling the length, just causes air-resistance that makes the air flow around the heat-sink, not through it. Sort of what is happening to the large heat-sink, without a guide to keep the air moving through the heat-sinks. It enters the fins, hits resistance, exits the top of the heat sink to the unrestricted gap between the cards, and flows right around the copper heat-sinks. There is very little air-flow passing through the fins, near the middle of the heat-sinks, due to this. Only on each of the ends.

The individual chips and inductors need one more like this... (but not this tall, just this proportion and alignment)



The one for the inductors has to have holes drilled to make room for the caps, and be shorter. One running the length of the inductors would run into the same issue as the large heat-sink... the length would cause too much resistance to make it functional, without a guide to keep the air inside of the fins. The air would just flow around it, cooling down nothing, just making more air-noise and drawing more unused power blowing air around all the hot components.
111  Bitcoin / Hardware / Re: New Official AMT Thread on: May 19, 2014, 08:56:12 PM
Anyone else get the "invoice" email with the revised dates? Mine was stated at 4/29/2013 when in fact I was promised miners within 3-4 weeks from the original date, in the end I got my miners in early April BUT they don't work and have no confirmation or update on whether I am getting replacements or not. I was offered 2 replacement miners on the 10th, I accepted but have gotten no follow up.


On the hardware side... damn, I had not even considered the heatsink as a source of the shorts... Just an oversight, since it's such an obvious and needed piece of hardware... Alternately, couldn't a small copper plate be inserted in between each chip? Sort of how CPU heatsinks function. Put a copper plate/shim in between the chips and the heatsink and it would actually eliminate that particular problem. Both sides greased up to keep it held in somehow. Maybe as a ghetto solution, crazy glue the corners to keep it fixed and grease up both sides of the shim to ensure contact with both. The copper thermal transfer would work in a pinch and would be a nice workaround to the shorting issues that MIGHT be caused by the heatsink...

Also sorry to hear your card died. This was the same issue I observed in almost every instance (one card was DOA and never worked). Most of the others did the same thing within minutes. The ones I have running now I figure are on borrowed time. The only working miner I have is an Antminer S2. Despite the shipping issues, it works well. All parts are solid AND well built. For A LOT less than the AMT miners were. The irony is the Antminer runs at 1.2 some of the time (usually 1056).

No invoice sent to me.

I would not advise using a copper-shim... or any shim, for the moment. (The screws would surely warp the boards, unless you had a similar shim washer on every screw-hole also. Though, that would also run the risk of poor contact, since the screws are so far away from the chips, and there is nothing besides the pressure of the PCB holding the thing against the heat-sink.)

Thermal transfer paste is not the same as thermal-transfer pads. The pads are heavily impregnated with a high thermal-transfer medium, and solid. The paste forms air-bubbles if it is thick, and thus, creates pockets of insulation, like Styrofoam. Shims, tend to react, if they are not the same metals. Thin shims, will react, and dissolve almost instantly. (Except gold-leaf-foil, which is too expensive to use as a shim, as you would need many grams.)

The best thing to use, would be the super-thin thermal-transfer pads, which are a specialty item to order. (Not the standard crap they give you for a CPU or RAM, which is too thick and has fiberglass mesh inside, normally. It will not "compress" without direct "above pressure", which these designs do not have at all. Thus, causing the PCB to warp, which may lift the SMT chips off the PCB, along with the traces themselves.)

Besides, mixing metals is the source of electrolytic corrosion. It creates a "battery", in essence, using the thermal paste and heat as the electrolytic medium. That is why gold-plate touching non-gold-plate is worse than using tinned-metal touching tinned-metal contacts. It is never real wise to use bi-metal components in direct contact with one another. (Especially where heat is applied, which speeds-up the chemical reaction.)

If you ever looked at a copper/aluminum heat-sink, where the copper touches the steel-case of the CPU-case, the copper turns black/green and is all corroded, looking like the moon's cratered surface. Unlike aluminum, which corrodes with a thin and hard patina that protects it. Yet, it is aluminum, so it reflects heat... It is sort of a catch-22. Copper absorbs reflected heat, transferring directly to the bonded aluminum, which then reflects the heat into the moisture in the air, causing the evaporation/humidity cooling that we call "air cooling", which has nothing to do with air at all. (Air is thermally inert, as it is essentially invisible to IR-radiation, which is why you feel IR-radiation from a candle nearly ten feet away, in a room devoid of moisture, but only 1 foot away in a room with normal humidity.)

112  Bitcoin / Hardware / Re: New Official AMT Thread on: May 19, 2014, 08:03:38 PM
Can always just solder a length of twisted-pair out to somewhere convenient outside the shroud from the chip cap.
A shot in the dark - is the flaky board one of the ones with the 10k resistors in place of caps? If so could be that the remaining caps are starting to fail from excessive ripple currents.

Buck inductor temps are something I've never been comfortable with for various reasons. They are actually rated to run pretty bloody hot with max ratings often 140C or more. At least they are one of the few things that heat does not really affect until insulation failure. If you can keep a finger on it without branding yourself then it's good. I'd worry more about the solder connections to them failing from temp cycling and the heat getting to nearby components.

Yea, those seem to be the largest source of heat, once the cards stop functioning. May be the coils' insulation failing. Those too, it seems, are not real flush-mounted to the board. There are exposed heat-pads under them, on the heat-sink side. However, the varnish actually raises the heat-sink from direct contact to the metal at all. (Looks like the heat-sinks should be machined to match the pads, but they are not. They are essentially using the heat-sink compound to "fill the gap" between the exposed heat-pad and the heat-sink, which is not the function of thermal grease. Thermal grease is designed to fill the microscopic gaps, not gaps of zero-contact. Which is where heat-pads are designed to function.)

Might also be why the heat-sinks are not showing as being needed, and the smaller ones seem to be over-heated. There is just not enough actual thermal-transfer to the larger heat-sink directly. Due to that large gap. (0.01mm is a large gap to thermal transfer, when there is 0 direct metal-metal contact. Not sure how thick the varnish layer on the PCB is. I would use aluminum foil shims, but then I need two applications of thermal grease. xD Plus, the grease tends to corrode foil. The silver/tin on the metal plates is already corroded. That should have been gold.)

I may just rip off the heat-sink on this one, make a pencil rubbing of the back of the board, and transfer that pattern to the heat-sink. Then grind it down by hand, to match the pads, for direct contact. Though, thermal-pads would be much better than all this extended effort.

One unit has a full sheet of thermal-pad across the whole backing. Not sure how wise that is... the cold spots will act as risers, not allowing the thermal pad to compress. While the hot parts will have an air-gap as the heated material thins-out. Though it will stop any possible shorting of the through-holes rubbing against the expanding and contracting heat-sink. (Which is something I also fear may be happening to the mystery working/failing card.)

Card completely failed today at 1:30 PM... Knocked the whole machine down to 0 GHs... Same as the one that had previously shorted. So now I am back to 385GHs. 1/3 of 1.2THs. (Still not sure why they sent me this. This is not what I asked for. They didn't ever reply back, about testing them, or send any testing information. So I assume they sent this as an attempt to fill my order. I got sort-of a mix of what I asked for, if a complete unit could not be sent, and what they were offering, but neither of any, but more and less. xD. I am so confused.)

This one was not missing any parts on it. All four of the CAPs and Resistors were where they should be, on each chip. There were also no screw-shorts on the cards.

Did notice another thing that bugs me a little, about the cards... The PCB sticks-out beyond the heat-sinks, by a fraction of a MM... This bugs me because the frame mounts firmly to the cards by the heat-sink, which has a metal-edge pressing hard against the PCB itself. This too stresses the PCB, and I am not 100% sure that the inner layers may or may not have any stray "copper traces" near the edges, which are in direct contact with the metal frame. Mostly, my concern is the compression of the PCB, as the aluminum expands and contracts from the heat. (This also being a concern due to the solid screw mounting which is firmly holding the non-expanding fiberglass PCB to the expanding aluminum heat-sink. Which is also being crushed between the mounting frame.)

Having the heat-sink in two separate halves would alleviate some of the expansion, if there was an adequate gap between them, and if the cards were not suspended by the heat-sink, through a firm mounting. However, that would require a complete redesign. The PCB would not sustain the stresses of the heavy heat-sink, in a setup like that.
113  Bitcoin / Hardware / Re: New Official AMT Thread on: May 19, 2014, 03:46:59 AM
Don't suppose your meter has a hi-lo hold function, does it? If it does, you can wire leads across one of the local chip caps and record the core voltage and see if it changes...

No, but I can get one... hard part is cooling while trying to probe, on a running card... Need a real big fan to run it without the shroud.

EDIT: (3:15 AM, oddball card is still running normal, so far.)

Going to make a per-card shroud and air-guide, to force air to the desired components. Seems to exit the fins and travel freely in the giant air-gap, which has less resistance, and not over the inductance coils enough. Again, this is where a tighter design and larger "wide" heat-sinks on the backs of the chips would really pay-off.

Got hold of some cheap Lexan. Came from picture frames someone was throwing-out. Free is always good. They are like Walmart frames, with cheap Lexan covers, instead of glass. Also got some insulation-foam-board, from construction they are doing at the place I work. That will make great space-filling material, to guide the air where it needs to flow.

Getting some click-pens, so I can pull the springs from them, to apply pressure to the copper heat-sinks. Just to test the idea. Just what I want... loose metal springs around all these electrical connections. xD (It was either that, or steel barrettes, which use spring-steel, and are cheap at the dollar-store.)
114  Bitcoin / Hardware / Re: New Official AMT Thread on: May 19, 2014, 02:21:29 AM
Just for kicks and giggles.. I powered-up the faulty card... Seems to be running back at full speed again. (10:21 PM GMT -5)

Getting about 590GHs now... It got faster! xD...

Seeing how long before it faults again, dropping back down to 20GHs... It had about 4 hours of cooling time. Might be that the coils are not mounted well enough, and could gain from having a heat-sink on top of them too.

I have a few CPU heat-sinks I can chop-up to fit those. Just need some thermal-transfer pads. Don't want to smear heat-sink compound across those suckers.

I did notice that power consumption is up to around 800w (796w) at the wall, while the machine runs about 590GHs. Seems to indicate the crystal failing, or the driver circuit relaying the clock-freq, failing or falling out of spec. (Higher freq translates to higher workloads, thus, it seems to be slipping towards turbo-mode, though it is still set on nominal settings. Not pure turbo, but faster than nominal. Or, if it is the voltage regulator failing, it is delivering more power, and thus, also slipping/drifting towards turbo-levels. That, or the higher voltage is the result of the clock-crystal slipping.)
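For a rough sense of what that wall reading means, here is a small sketch of the watts-per-GH/s arithmetic. The function name is made up for illustration. The 796 W at 590 GH/s pair is from this post; the 780 W and 580 GH/s figures come from two separate earlier posts, so pairing them as a baseline is an assumption.

```python
def watts_per_ghs(wall_watts: float, hashrate_ghs: float) -> float:
    """Wall power divided by hashrate, in watts per GH/s (hypothetical helper)."""
    if hashrate_ghs <= 0:
        raise ValueError("hashrate must be positive")
    return wall_watts / hashrate_ghs


# Reading from this post: 796 W at the wall while hashing about 590 GH/s.
now = watts_per_ghs(796, 590)

# Assumed baseline: 780 W and 580 GH/s, taken from two different earlier posts.
before = watts_per_ghs(780, 580)

print(round(now, 2))     # 1.35
print(round(before, 2))  # 1.34
```

If those pairings hold, the efficiency barely moved, so the extra power tracks the extra hashrate rather than pointing on its own to a failing regulator.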
115  Bitcoin / Hardware / Re: New Official AMT Thread on: May 19, 2014, 02:06:24 AM
Also, along earlier lines re: caps... Seems the earlier Ant S2 have an issue with that very thing. Bitmain has the info on where to add more across the local chip caps... https://bitmaintech.com/files/download/Guideline%20of%20Soldering%20Capacitor%20to%20S2.pdf

Wish I knew where to do something like that on the A1 boards...

Today I had another card just bottom-out. When it was dropping to 480GHs, it was one of the cards dropping down to 50-20GHs. Sadly, that was a card that ran at 208GHs previously. Not even sure where to troubleshoot that card. Like the one that was shorted, ironically in the same location on the backplane, it has done exactly what the shorted one had done.

I got about 3 days out of it... running at 208GHs. Ran fine the first day. Then dropped-out to 20GHs... Reset the miner, it ran about 18 hours, then dropped back down again... Then ran about 12 hours, and dropped back down again... Today, ran about 3 hours, dropped back down... Now it only runs about 5 minutes, before it drops back down.

Obviously, something is decaying.

Showed no hardware failures and no errors... it just didn't run any faster than 20GHs. The coils were hot, but not the chips, this time. (Voltage regulator issue, or a cap issue?)

Too tired to rip the whole unit apart again, to pull the card. Just unplugged the card power. Hope the other cards work, but that still leaves me with a total of only 1THs of cards. Running only 380GHs now, off two cards. Glad I waited to RMA.



Red is the unusual spike... It was hashing waaay higher than normal. (I assume the initial drift-up.)

Green is the resulting failure... I caught it here, and rebooted...

Blue is the end-result, failing after a few hours before I caught it... Resulting reboot did not "fix the issue" this time. I just left it unplugged. The final result is only 2 cards running for the remainder of the day.
116  Bitcoin / Hardware / Re: New Official AMT Thread on: May 18, 2014, 10:31:14 AM
Also I am going to setup PHPminer as the web interface for this firmware. Seems to work quite well for what we need.
https://phpminer.com/?page=screenshots

Once I get the driver issue out of the way and working I will update with PHPminer and put out an alpha 1 release. This will work for all coincraft A1 devices. Not just AMT's

Nice... Control... Just what we need to tune things.

Trying to figure out why my cards do not like to run for more than 18 hours... I keep catching them slipping down from 590GHs (3 cards), down to 480-380GHs. (Heat is not the issue, it was cold as hell the last few nights.) Seems software related. Since rebooting fixes it instantly. (Which takes a few boots for the cards to stop from getting stuck on the "MONITOR" stage still.)

It is like it may have a memory-leak, or if it is hardware, the frequency-controller is drifting. (Thus, causing it to reach the point of generating more errors, operating at a higher frequency. I'll have to see if it is errors causing the issue next time. If it is drifting down, then it may be operating too slow to sustain the cores communications. Or, with two separate clocks, it may just be falling out of sync. But that would be a hardware error. Guess an oscilloscope would come in handy to check that. I don't have a digital counter that can count that fast, accurately.)

But it does take about 18 hours, which I can normally catch. That, or just reboot every 12 hours. Not that big of a deal with only one machine. (This is where the 2 days, running for 24 hours would detect.)
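Instead of a blind 12-hour reboot, the check could be done on the hashrate itself. A minimal sketch of just the decision logic, assuming the hashrate gets sampled somewhere else (how to read it off the miner is not covered here). The function name, the 85% threshold and the three-sample rule are all made-up values for illustration, not anything from the firmware.

```python
from typing import Sequence


def needs_reboot(samples_ghs: Sequence[float],
                 nominal_ghs: float = 590.0,
                 threshold: float = 0.85,
                 consecutive: int = 3) -> bool:
    """Return True when the last `consecutive` samples all sit below
    threshold * nominal_ghs. All defaults are illustrative assumptions."""
    if len(samples_ghs) < consecutive:
        return False  # not enough data to decide yet
    limit = nominal_ghs * threshold
    return all(s < limit for s in samples_ghs[-consecutive:])


# Slipping from 590 GH/s down into the 480-380 range, as described above.
print(needs_reboot([590, 585, 480, 470, 380]))  # True
# A single dip that recovers should not trigger a reboot.
print(needs_reboot([590, 480, 590]))            # False
```

Requiring several low samples in a row keeps one bad reading from triggering a reboot.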

Selling my thermal imager... Before the BTC hike starts. (Looks like less than a week, and I don't think there will be a final dip before the hike.)

I'll use some money to buy the things I need to finish this lexan case, and try to get the other cards running. May end-up just selling this miner too. If I put enough glitter and glow on it, I should be able to get close to what I paid for it. (Minus the $60 in BTC I earned over the last few days. xD) In the hike, I should be earning closer to $300-$600 a day. For at-least 30-60 days, when it is all said and done. That will be nice, but I would have made more just buying BTC directly, again. Would be more if I could solo-mine alt-coins... But pools will have to do for now.

We'll see how things go, once I get some of those supplies in, after my thermal imager sells.
117  Bitcoin / Hardware / Re: New Official AMT Thread on: May 17, 2014, 03:59:29 AM
Polycarbonate (Lexan is a registered TM) great choice. Not only good temp before warping but also unlike acrylic it is hard to shatter.

I would be using "Lexan", the brand-name... Unless I can find another source.

Managed to work-out a shroud that fits perfectly into a 12" x 24" pre-cut piece, perfect shape for the 5-card setup. Seems to fit nice and snug. Just have to remove the back-plane, slide it in, and insert the back-plane back into place.

Ordered more Nichrome wire and another LED-dimmer for a controller, to bend it better. Blew my last strand of wire a few weeks ago. Left it on a few hours too long.

Ordering some nice thick Lexan to build two separate custom cases, and some thinner sheets for shrouds. Just need to find some 20-pin ribbon-cables and get some "U" channel to support the cards. Getting rid of that whole metal frame. Was going to chop it into two halves, but it is setup so awkward. Now all I need is more individual cards!

Reduced the whole 5-card setup into a 16" * 7.5" * 14.5" (deep) setup... smaller than my home PC. xD

Going to make it smaller when I split it into 2 units of 3 and 3, and chop the large heat-sinks down by half, snugly packing them together. (The 24-pin ribbon-cables will give me the ability to move them closer together. The excess length of the backplane just extends over the PSU, since 2 connections are not being used.) I got me a little GH-Cube. Though, adding the ribbon-cables will make it about 9.5" * 9.5" * 14.5" (deep). Ras-Pi also relocated to being above the boards, with the back-plane and ribbon-connectors.
118  Bitcoin / Hardware / Re: New Official AMT Thread on: May 17, 2014, 12:00:15 AM
AMT, I suggest that you run these for a full 24-hours, after assembly... Not just each card on a bench-test... actual assembled units...

After you have given them a once-over, for quality control. Looking-out for those long screws, which seem to be drilling into one of the mounted boards. Also looking for backwards mounted fans. (All fans should be mounted so you DO NOT see the sticker on the fan.)

(Drill holes in a metal jig, and cut all the screws to length with a die-grinder or cut-off wheel... Even tin-snips! They are all waaay too long. Better yet, get nylon screws, or push-in clips... or use cheap "L" trim to hold all fans in place, mounted with velcro.)

Then, wait 2 hours, with the unit off... and power it back up, running it for another 24 hours. That allows the components to burn-in, settle-down, and any post-issues to crop-up. Issues that are seeming to crop-up after bench-testing, then assembling, then shipping.

Also, get a thermal imager. You can easily see the problem components, before they get out of control, on the bench. Just print a photo of a "normal view", and use that to compare the view in your imager. You can set that up with a simple assembly-line rig, and a cheap 300w PSU. You don't need a HD thermal imager like I have. Just a cheap $300 imager with a 20x40 screen would suffice, once mounted to a testing-slide. You can get decent used HD thermal imagers on ebay, for about $600. Mine cost over $3000. It was overkill for my purposes, but worth it.

The heat-sinks are adequate, with shrouding. Might want to invest in some cheap high-temp plastic (Thin lexan would work, and is cheap 155c melting temp.). Cover the holes in the back of the tower, where the cards are. That, or pull the shroud up to the heat-sinks on that side, so air comes IN through the holes, and is guided out to the big fan. Same for the front-side of the tower, and the side with the backplane. As it is now, hot air is just blowing all over inside the case, and some air is being sucked-in, from the back, and instantly pulled right out, by the larger fan on top.

A simple "U" shaped shroud. (I actually use cardboard) Wrapped around the three exposed sides, to create that air-guide/channel, helps dramatically, to ensure cool operation. Running at 780w, with only 3 cards, the exhaust on my unit is barely even warm. Though, the small copper heat-sinks are the hottest component, they are adequate, until they are not. Working on a spring-tension to place between the copper sinks, and the larger heat-sinks on the opposite side. (On the last card, where the copper sinks are exposed without a matching large heat-sink, to get pressure from... I am adding a metal plate as a shroud. That will keep better pressure against the copper, and allow some heat to dissipate through the spring-tension, which also increases the surface area.)

Other than that... These things get an A- grade in my book. Would have been an A if there was a power-switch, jumpered to the PSU connection, or a splice-kit switch, which splices to the power-ready line, so we can safely turn-off the power without power-cycling the PSU by removal of the plug. Since, not all PSU's have a separate power switch, and not all power switches operate at "Safe" levels, as they were intended for emergency power killing when the computers power button fails.

P.S. The large heat-sink is twice as large as it needs to be. The fins could be half the height, and the backplane junctions could be spaced a lot closer together, if that was done. Though, ribbon-connectors would have been better, to the cards. You could also reduce the airflow to six fans, for six cards. Not to mention, setting it up for two PSU's, you would save about 50% in PSU costs. Six cards can easily run off two 800W PSU's, instead of one 1500W PSU. There would be ample room for two PSU's, and that large fan would not be required at all. The PSU's could safely exhaust or be intakes, for 100% of the required air-flow. (Being intakes, would be preferred.) Once I get another heat-sink, I will build a unit as described, to show you. xD. Got a spare heat-sink?
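A quick sanity check on the two-PSU idea, as a sketch only. It scales the 780 W, 3-card figure from above up to six cards. It assumes that 780 W was measured at the wall and that the PSU is about 90% efficient; both are guesses, and the function name is made up.

```python
def dc_load_watts(wall_watts_3_cards: float, cards: int,
                  psu_efficiency: float) -> float:
    """Estimate the DC load for `cards` cards from a 3-card wall reading.
    psu_efficiency is an assumed value, not a measured one."""
    per_card_wall = wall_watts_3_cards / 3
    return per_card_wall * cards * psu_efficiency


load = dc_load_watts(780, 6, 0.90)

print(round(load))            # 1404
print(round(2 * 800 - load))  # 196  (headroom with two 800 W PSUs)
print(round(1500 - load))     # 96   (headroom with one 1500 W PSU)
```

Under those assumptions both options cover six cards, with the pair of 800 W units leaving a bit more margin. The 780 W reading also includes the fans and the Ras-Pi, so the per-card number is on the high side.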

LEXAN 1/32" * 24" * 48" = $10.00 in bulk (Might get down to $7.00) This is enough to do 4x units. That is roughly $2.50 a piece.

Anyone want me to make a shroud, let me know. I'll make them for $10.00 shipped. I won't be getting bulk prices, obviously. I'll be getting home-depot prices. lol.

EDIT: Screw that, Home-Depot charges $67.00 for a $10.00 sheet only 24" * 36"... What a rip-off. lol. I'll get them online. Found a nice set of assorted springs for $5.00 that may work for adding tension to the copper heat-sinks. Looking for a better alternative, spring-steel "U" mounts.
119  Bitcoin / Hardware / Re: New Official AMT Thread on: May 16, 2014, 11:14:40 PM
The COUT's connections are a mix of CAPs and Resistors, it seems. However, though the R+ labeled SMD seems to be a resistor, it has a polarity? Might be a cap/resistor combo. That, or the polarity indicator is just redundant. (This is on all my boards that function without issue. Except the one where the A1 chip is not mounted flush, and has thus, overheated.)

I suspect that the short across COUT46, by a screw from the fans, has just shorted the card, enough to keep it from initializing.

The board with the A1 that was not mounted flush, I assume had "shut down" due to the internal over-heating, once it had gotten to that point. In any event, the failure of that single chip, has led me to believe that the through-connections, the serial in/out on each chip, demands that the chips are chained. Failure of one, would, in some cases, stop the whole board from responding.

Removal of the chip, would seem to imply that a jumper be soldered, to bridge the through-connections, where they would normally be communicating "through the chip", to get to the other chips down the line. However, without the jumper, it seems like it should be possible to communicate with the chips up to that removed chip.

The work, I assume is coming in through the U8 chip, and returning back to the Ras-Pi, through the U5 serial linked chip. U8 is top-right, by the four other chips, I assume setting-up the distributed workload through U1, U2, U3, U4. The U5 chip is at the end of the serial connections from the chips, going directly back to the Ras-Pi, which is why I assume it is returning work/solutions/shares directly back, to be processed for diff-targets, to determine if they are solutions or just shares.

However, I could have that backwards too... U8 might be sending the work to the chips, and U5 may be the return of processed data. However, I am about 90% sure I have the previous paragraph correct this time. xD

The chips are sort-of isolated in groups of four, by two separate clock-timers or frequency controllers. Chips U60 and U61 are tied to four independent 1-wire connections, which are tied to a chip that has the clock-crystal.

All the circuits in the center, seem to be just the voltage regulation. Odd that they all seem to be built as individual regulation units. Might be why the new versions just made one whole power-regulator that was simply more robust, as opposed to 8 individual mini-regulators. There is power-isolation, as the chip on mine, which overheated, had a matching voltage-regulator which was also overheating.

I wanted to remove the caps and resistors from that board, which had the non-flush overheating A1 chip, and place them on the other board with the previously shorted connection... However, I had to craft a special soldering head to desolder the SMD components. That is taking longer than expected. lol. It was easier to do that, than to attempt to rework the non-flush A1, which I assume is actually burned-out and shorted inside anyways.

I am not sure how easy it would be to jumper/bridge the small connections for the data-lines... However, there are tiny inline (resistors or caps, can't read the numbers on them)... in the two center datalines, of the four. Not sure if that would impact the integrity of the data transfer. This is a question only AMT or BITMINE or TECHNOBIT can answer. Again, I don't have the schematics, and have not taken a strong look at the pinouts on the A1's in a while. I had no intentions to EVER get this deep into the design process.

The A2 board does have a few sets of caps and resistors, of a similar form-factor, just outside the heat-mounting area. There also seems to be a lot less micro-caps for filtering, directly around the chips. Roughly about half. However I do not see any crystal clock components... They may have settled for on-chip clock controls with self-oscillators. The controller chip also looks more like an ATMEGA-32 or ATMEGA-64, or a RISC processor. I would assume RISC, due to it being cheaper, and having more sister-components.

Ordered a new PSU for my computer, so I can scavenge the PSU from that, since it is way overpowered for my PC, but ideal to run the last two cards with heat-sinks, that I have, once I transfer the heat-sinks to the other cards that came without heat-sinks. Should be here by tomorrow. Then, hopefully I will have almost 1THs total. 580GHs + ~400GHs = ~980GHs (Not including errors, which, oddly are counted towards the GHs output, even though they were not valid, thus, not a hash. Makes determining the actual output a pain in the ass. Always says 580GHs, but when errors start, and get worse, the output is actually only about 480GHs in the pool. Just requires a reboot, and the errors go away for another day. xD The program needs a frequency tuning area. Though you SET a specific frequency, the output is not "calibrated", thus, it is ballpark. Sometimes you have to tell it a higher or lower frequency to operate, to be that actual frequency. That is how you tune cards. That is something that 7970's demand, in order to operate at ideal speeds. Not sure why that was just reduced to "Power-save", "Normal" and "Turbo"... Those should be user-settable. However, limited to the MFG MIN and MAX, for obvious reasons. xD, Guess I just gotta PuTTY to get it done.)
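To put a number on the gap between what the screen says and what the pool sees, here is a small sketch. It uses the 580 GH/s displayed and roughly 480 GH/s pool-side figures from the post; the function name is hypothetical, and treating the whole gap as invalid work is an assumption.

```python
def invalid_fraction(displayed_ghs: float, pool_ghs: float) -> float:
    """Share of the displayed hashrate that never shows up at the pool.
    Assumes the whole difference is invalid work (hypothetical helper)."""
    if displayed_ghs <= 0:
        raise ValueError("displayed hashrate must be positive")
    return (displayed_ghs - pool_ghs) / displayed_ghs


# Screen says 580 GH/s, pool credits about 480 GH/s once errors pile up.
print(f"{invalid_fraction(580, 480):.1%}")  # 17.2%

# Expected total once the second PSU is in: 580 + ~400 GH/s.
print(580 + 400)  # 980
```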
120  Bitcoin / Hardware / Re: New Official AMT Thread on: May 16, 2014, 12:27:14 PM
I wonder how the power internal to the chips is bussed. One can assume that all Vdd pads are tied together at various points, question is, how much do they rely on the main feeds from the pads to be in balance to keep from burning out spots in the internal traces? I would think they can tolerate a local 2x overload at best.

edit: looking at that, the Kemet info is great - very good coverage on exactly what we are talking about on bypassing

Gave me an idea... lol... throwing the non-responding card back in...