Bitcoin Forum
Author Topic: [CLOSED] Bitmine CoinCraft A1 28nm chip distribution / DIY support  (Read 81190 times)
Gator-hex
March 19, 2014, 01:22:41 AM
Last edit: March 19, 2014, 04:24:33 PM by Gator-hex
 #481

Did Bitmine change chips specs?

Wasn't turbo 40GH? Now it is 33GH... But still the same price... But I do agree, I don't see a way to get them to 40GH with normal cooling...

Well, that's sad to see… I've still been hopeful that I could get 40 to work someday… It actually does kinda work; I just notice that when I overclock the chip I start dropping nonces (when running a known-nonce test case)…

EDIT2: And from what I see you can't get 33GH out at 0.85V, more like 0.985 to 1V. Also, 1W in Turbo mode is fantasy... More like 1.3 to 1.5W... Hard to say how much of that is loss in the chip's power supply...

I haven't played with increasing the core voltage beyond 0.85V - still on the list. If 1V really is necessary, that's going to be a decent chunk of power indeed!

Has anyone been able to get 33GH/s or higher to run?

The current pricing really does need to be adjusted. Several competitors are coming online very soon at well under $2/GH. I'd really prefer to stick with CoinCraft, as I have a working design and am very happy with the support I've seen (zefir plus free samples). But the current pricing level is going to make it very hard to be profitable for long...

My experience with the Technobit pre-production A1 chips is that they don't fire up until 850mV (Turbo), and then you'll be lucky to see 25GH (Normal).
They're in a top-30%/70%-bottom package, so you'll need a heatsink on both sides, and the power and temperatures rise quickly, so you'll need a lot of cooling to get anywhere near 33GH. You need to raise the voltage as you raise the clock or you'll get hardware errors, but 33GH is doable at 1040MHz(4x260)*/1000mV. Marto even managed to push it to 35GH at 1100MHz(4x275)*/1050mV.
* The Bitmine clock is 4x the Technobit clock.
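
The clock figures above can be cross-checked with a small sketch. It assumes only the footnote's 4x relationship between the Technobit and Bitmine clocks and the rule of thumb quoted elsewhere in this thread that the expected hashrate is Sys_clk * 32. The function names are made up for illustration.

```python
# Hypothetical helpers; only the 4x factor and the "Sys_clk * 32" rule
# come from the thread. Everything else is illustrative.
TECHNOBIT_TO_BITMINE = 4   # Bitmine clock = 4 x Technobit clock
HASHES_PER_CYCLE = 32      # expected hashrate = Sys_clk * 32


def bitmine_clock_mhz(technobit_mhz):
    """Convert a Technobit clock setting to the equivalent Bitmine clock."""
    return technobit_mhz * TECHNOBIT_TO_BITMINE


def expected_ghs(bitmine_mhz):
    """Nominal per-chip hashrate in GH/s, ignoring HW errors."""
    return bitmine_mhz * HASHES_PER_CYCLE / 1000.0


for technobit in (260, 275):
    sys_clk = bitmine_clock_mhz(technobit)
    print(f"{technobit} MHz Technobit -> {sys_clk} MHz Bitmine "
          f"-> {expected_ghs(sys_clk):.1f} GH/s nominal")
```

Under these assumptions the 4x260 and 4x275 settings come out slightly above 33 and 35 GH/s, which is consistent with the numbers quoted in the post.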

"With e-currency based on cryptographic proof, without the need to trust a third party middleman, money can be secure and transactions effortless." -- Satoshi
Lucko
March 19, 2014, 07:24:46 AM
Last edit: March 19, 2014, 04:00:30 PM by Lucko
 #482

This is also what I see. You get to crazy amounts of power usage and heat over 33... I don't think it is safe to run it over that with normal cooling. Even if you use copper coolers, I don't think the gain is enough to be worth the extra cost in power and price... When you get to about the 950 MHz range, the voltage needed to run the chips starts going up like crazy... From 0.9 V at 950 MHz to over 1 V at 1050 MHz...
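
To put a rough number on why that last step is so expensive, here is a sketch using the usual first-order CMOS approximation that dynamic power scales with frequency times voltage squared. That model is an assumption brought in for illustration, not something measured in this thread, and it ignores leakage and regulator losses.

```python
def relative_dynamic_power(f0_mhz, v0, f1_mhz, v1):
    """Ratio P1/P0 under the first-order model P ~ f * V^2 (leakage ignored)."""
    return (f1_mhz / f0_mhz) * (v1 / v0) ** 2


# Operating points from the post: 0.9 V at 950 MHz, about 1.0 V at 1050 MHz.
ratio = relative_dynamic_power(950, 0.9, 1050, 1.0)
speedup = 1050 / 950

print(f"clock gain: {(speedup - 1) * 100:.1f}%")
print(f"estimated power increase: {(ratio - 1) * 100:.1f}%")
```

With these two operating points the model suggests roughly a third more power for about a tenth more clock, which fits the observation that the gain is hardly worth it.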

EDIT: Looking at your signature, I think those Hex16B are about as good as an 8-chip A1 board... if we look at performance alone... Maybe even better.

Quote
6x Hex16B (Bitfury) 45GH boards 270GH/414W total (1.53W/GH)

Did anyone get close to the promised numbers so far? Then I'd know whether it is my board (and the HEX8 too) or just the chip...
Gator-hex
March 19, 2014, 04:14:02 PM
Last edit: March 26, 2014, 04:49:59 PM by Gator-hex
 #483

This is also what I see. You get to crazy amounts of power usage and heat over 33... I don't think it is safe to run it over that with normal cooling. Even if you use copper coolers, I don't think the gain is enough to be worth the extra cost in power and price... When you get to about the 950 MHz range, the voltage needed to run the chips starts going up like crazy... From 0.9 V at 950 MHz to over 1 V at 1050 MHz...

EDIT: Looking at your signature, I think those Hex16B are about as good as an 8-chip A1 board... if we look at performance alone... Maybe even better.

Quote
6x Hex16B (Bitfury) 45GH boards 270GH/414W total (1.53W/GH)

Did anyone get close to the promised numbers so far?

Yeah, the Hex8A1 is in the 1.5-1.6W/GH range when running at 8x33GH=264GH, and it runs a hell of a lot noisier fans than the 6x Hex16B do (1800 vs 4900rpm). (rolls eyes)
I opt to run them at 880MHz(4x220)*/910mV = 27GH per chip = 1.25W/GH, which is just about coolable with 3x Quiet F9 92mm 1800rpm fans.
I'd rather distribute the heat over more units and not have a headache! Bitfury2 now shows another 25% GH improvement too, but its reign looks like it will be short-lived: the new 40nm Avalon 3 promises 0.775W/GH, though I'm still waiting to see real-world results on that one.
* The Bitmine clock is 4x the Technobit clock.
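
The efficiency figures in this post can be reproduced with a few lines. This is only a sketch of the arithmetic: the W/GH values are the poster's, the function names are invented here, and the post does not say whether the power was measured at the wall or at the board.

```python
def watts_per_gh(total_watts, total_ghs):
    """Efficiency figure used in the thread: watts drawn per GH/s delivered."""
    return total_watts / total_ghs


def board_power_w(chips, ghs_per_chip, w_per_gh):
    """Total power implied by a per-chip hashrate and a W/GH figure."""
    return chips * ghs_per_chip * w_per_gh


print(f"6x Hex16B: {watts_per_gh(414, 270):.2f} W/GH")
print(f"Hex8A1 at 33GH/chip: {board_power_w(8, 33, 1.5):.0f} "
      f"to {board_power_w(8, 33, 1.6):.0f} W")
print(f"Hex8A1 at 27GH/chip: {board_power_w(8, 27, 1.25):.0f} W")
```

Seen this way, backing off from 33GH to 27GH per chip gives up a bit under a fifth of the hashrate while cutting the implied board power by roughly a third.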

totalslacker
March 19, 2014, 09:18:45 PM
 #484

I added code to trim my supply and the good news is that it looks like the board is stable at 35GH/s at 0.975V. To get any faster than that I need to get over the max 1.050V my supply is able to put out (to do that I need to disassemble the cooling and change a sense resistor - not terrible but it is a hassle).

A single chip (the board has four) almost works at 1.050V/40GH/s. If I run all four then the supply under load drops to about 1.030V which doesn't work too well.

Need to validate this on more than one board of course :)
Lucko
March 19, 2014, 10:45:17 PM
 #485

How much power do you need for that (40GH)?
totalslacker
March 19, 2014, 11:05:37 PM
 #486

I failed to measure current at 40GH - I can check that next time I try it.

35GH/s was 39A@0.975V = 38W.

I did try a longer run under bfgminer and was seeing some hardware errors (about 0.5%). Not sure why my test wasn't catching them - I was just running zefir's test vector over and over (and validating that the correct nonces were returned). Guess I need more test vectors :)

The board did start getting pretty hot. I have a water block attached to the bottom of the board but just a heatsink on top of the chips. The heatsink was sitting at around 35C, but the top of the board itself was getting to over 60C. I suspect I need more vias to better transfer heat to the bottom of the board. Or figure out a heatsink that can cover the board top itself.

Clearly immersion is the way to go :)
Bicknellski
March 20, 2014, 04:57:57 AM
 #487

Interesting.

Yes, two-phase cooling like they have in the Asicminer mining center might be a great option for these, as you could potentially dump the heatsinks and fans.

https://bitcointalk.org/index.php?topic=346134.0

http://www.enterprisetech.com/2013/11/24/3m-allied-control-cool-clusters-novec-bubble-bath/


http://www.allied-control.com/
http://www.clusteredsystems.com/

Dogie trust abuse, spam, bullying, conspiracy posts & insults to forum members. Ask the mods or admins to move Dogie's spam or off topic stalking posts to the link above.
silver71
March 20, 2014, 09:51:18 PM
 #488

I added code to trim my supply and the good news is that it looks like the board is stable at 35GH/s at 0.975V. To get any faster than that I need to get over the max 1.050V my supply is able to put out (to do that I need to disassemble the cooling and change a sense resistor - not terrible but it is a hassle).

A single chip (the board has four) almost works at 1.050V/40GH/s. If I run all four then the supply under load drops to about 1.030V which doesn't work too well.

Need to validate this on more than one board of course :)

It's useless to run any PSU right at its power limit, since a power flicker will reset your boards...

Use 2 or more... we have such a solution if you're not sure how... let me know.

smart solutions from Tesla's home country...
Bicknellski
March 21, 2014, 04:06:56 AM
 #489

The WPC EE says:

Quote
I am blinking the lights!

So:

1. 3.3V regulator working
2. JTAG interface to UC3C working
3. Processor is initializing
4. ASF now working (took moving to Studio 6.2, since all UC3C was broken in 6.1)
5. 12MHz oscillator is working
6. Can do one LED - all 3 colors are working

Yay! (Finally!)


tindela1
March 23, 2014, 11:06:13 PM
 #490

Here are our initial test results (2x4 IC configuration @ 0.85V):

https://i.imgur.com/HGszbm0.jpg

It only ran for a short period of time. And here is a snapshot of our build:

https://i.imgur.com/uDNUiKz.jpg


- Noncetech
mhmmd
March 25, 2014, 11:12:54 PM
 #491

Hello everybody,

I've found a mess in the BOM of the two-chip reference board!
So far the major issue is that two versions are mixed up: Ver. 1.0.a and Ver. 1.0.b.
The first evident difference is that the latest release is missing C503 and C504, and therefore has a different design.
The repository contains a mix of files from the two versions, and it is impossible, at least for me, to cross-check the diagram against the BOM; I was doing such a check, and finding some inconsistencies, when I discovered the problem.
If someone could help me set up a 100% error-free BOM, or at least provide me a diagram of Ver. 1.0.b, I would appreciate it very much and will be happy to send him/her a free PCB (I have 100 of them waiting to be populated).

Thank you
goodney
March 26, 2014, 12:07:07 AM
 #492


It only ran for a short period of time. And here is a snapshot of our build:




- Noncetech

Noncetech: how many A1's are under that heatsink/fan combo? Is the heatsink we see cooling the chips or the board? Do you have cooling on both sides? And finally, do you have current draw numbers?

Looks great though!

-a[g
tindela1
March 26, 2014, 01:13:45 AM
Last edit: March 26, 2014, 01:36:36 AM by tindela1
 #493


It only ran for a short period of time. And here is a snapshot of our build:

https://i.imgur.com/uDNUiKz.jpg


- Noncetech

Noncetech: how many A1's are under that heatsink/fan combo? Is the heatsink we see cooling the chips or the board? Do you have cooling on both sides? And finally, do you have current draw numbers?

Looks great though!

-a[g


The heatsink+fan combo seen in the picture is rated at about 0.22W/°C and cools the top side of the A1 chips. We have another, bigger 0.16W/°C heatsink+fan cooling the board. We measured some temperatures today at 800MHz: one board was quite stable at 65-67°C and the other at 55-58°C. Our FETs and inductors stabilized around 60°C.

We had a two-board stacked configuration. I guess the other board is not getting enough fresh air. We may need to adjust the upper heatsink alignment so that it's pushing the air out properly.

We have 8 chips under those heatsinks.

EDIT: We don't have proper equipment for accurate on-board current measurements at the moment, so we are unable to measure our buck's efficiency at different loads. But we had a wattage meter. I don't have numbers for the 25GH/s setting right now, but we were getting around 490-510W in total at ~490GH/s (16-chip configuration). The Raspberry and ATX PSU were drawing 25W, so that needs to be subtracted from the total to get the board-specific figure. All the chips overclocked quite nicely. We still need to determine the optimum spot.
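
The subtraction described in the EDIT can be sketched as follows. It uses only the figures quoted there (490-510W total, 25W overhead, ~490GH/s). The function name is hypothetical, and the result still includes the buck converter losses that could not be measured.

```python
def board_w_per_gh(wall_watts, overhead_watts, ghs):
    """W/GH after removing non-board overhead from a wall-meter reading."""
    return (wall_watts - overhead_watts) / ghs


OVERHEAD_W = 25    # Raspberry + ATX PSU draw, as stated in the post
HASHRATE_GHS = 490  # approximate total for the 16-chip configuration

low = board_w_per_gh(490, OVERHEAD_W, HASHRATE_GHS)
high = board_w_per_gh(510, OVERHEAD_W, HASHRATE_GHS)
print(f"board-level efficiency: {low:.2f} to {high:.2f} W/GH")
```

On those numbers the boards land just under 1 W/GH at roughly 30GH/s per chip, converter losses included.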

https://i.imgur.com/pUAjFZ8.jpg

- Noncetech
zefir (OP)
March 26, 2014, 08:25:55 AM
 #494

Info: Clarification on HW errors / hashrate

I got the below SW support request via PM which I think is relevant for other DIY projects and therefore want to respond here publicly.

Quote
Hello Zefir,

The chips are working nicely!

But when we tried to push it past the 1050MHz clock (all the way to 1200MHz), it seems that cgminer is showing us wrong results. Cgminer showed a slightly lower hashing speed than expected (Sys_clk * 32), but it kept going all the way to 38GH/s per chip. HW errors were very low, lower than at the 32GH/s settings. We did not have any rejects or stales.

I also checked the PLL setting trace and it corresponded to the datasheet (fbdiv = 71-78, pre- and postdivs were at 1, our ref clk is 16MHz).

Then we examined the pool's results: it was showing only 250GH/s - 330GH/s. When we switched back to a slower setting, the pool immediately showed higher hashing speeds.

Could you give us some advice on this, or point out the possible cause? (We are using the latest cgminer bitmine-A1-driver fork.)


Thank you in advance!

Hello,

a diverging hashrate at pool and cgminer simply means you are losing shares through HW errors.

What you need to consider is:

a) a detected HW error also implies that there were errors on true results; the related probability needs to be derived correctly, but I would assume that when you have a HW error rate of 5% it also means you are missing 5% of real results

b) the A1 uses the real target; that is, if your pool sends you diff256 work, the A1 filters out any result with lower difficulty. In that case, generating a HW error is (at least roughly; it needs a correct mathematical analysis) 256 times less probable (since you need to generate a wrong diff256 share), so you won't see many HW errors as difficulty increases. Equivalently, because of a), HW errors will cause the loss of wrongly calculated real shares.
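
To illustrate point b), the sketch below estimates how often a chip reports anything at all at a given difficulty. It relies on the standard approximation that a difficulty-1 result takes about 2^32 hashes on average, and it uses 32 GH/s per chip as in the quoted request. The function name is invented for this example.

```python
HASHES_PER_DIFF1 = 2 ** 32  # approximate expected hashes per difficulty-1 result


def results_per_second(hashrate_ghs, difficulty):
    """Expected rate of results meeting `difficulty` at the given hashrate."""
    return hashrate_ghs * 1e9 / (difficulty * HASHES_PER_DIFF1)


for diff in (1, 256):
    rate = results_per_second(32, diff)
    print(f"diff {diff}: {rate:.3f} results/s, about one every {1 / rate:.1f} s")
```

At diff256 a single chip returns a result only about once every half minute, so there are very few reported results for cgminer to flag as HW errors, even when the chip is miscalculating a noticeable fraction of its work.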


The current cgminer driver for the A1 is meant for a field deployment where optimal hashrate was measured before and PLL is not tuned by users. If you need to have some meaningful feedback on HW errors to tune your system, you can achieve this easily by letting the A1 report Diff1 shares. For that, you basically need to prevent setting the real target for the jobs with this patch:
Code:
diff --git a/driver-SPI-bitmine-A1.c b/driver-SPI-bitmine-A1.c
index 81df48d..0104c34 100644
--- a/driver-SPI-bitmine-A1.c
+++ b/driver-SPI-bitmine-A1.c
@@ -652,7 +652,6 @@ static uint8_t *create_job(uint8_t chip_id, uint8_t job_id, struct work *work)
        p1[0] = bswap_32(p2[0]);
        p1[1] = bswap_32(p2[1]);
        p1[2] = bswap_32(p2[2]);
-       p1[4] = get_diff(work->sdiff);
        return job;
 }



Good Luck
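
Once the chip reports Diff1 results, the counters become useful for tuning. The following is a minimal sketch of how one might turn them into an error rate and a corrected hashrate, under assumption a) above that the detected error rate also reflects the fraction of real results lost. The counter values are made up, and the error-rate definition is one reasonable choice, not necessarily what cgminer displays.

```python
def hw_error_rate(valid_diff1, hw_errors):
    """Fraction of returned diff1 results that failed verification."""
    total = valid_diff1 + hw_errors
    return hw_errors / total if total else 0.0


def effective_ghs(nominal_ghs, error_rate):
    """Nominal hashrate reduced by the lost fraction (assumption a) in the post)."""
    return nominal_ghs * (1.0 - error_rate)


# Hypothetical counters read after a tuning run at a nominal 38 GH/s.
rate = hw_error_rate(valid_diff1=950, hw_errors=50)
print(f"HW error rate: {rate:.1%}")
print(f"effective hashrate: {effective_ghs(38, rate):.1f} GH/s")
```

Comparing the effective figure across clock and voltage settings, rather than the nominal one, is one way to find the point where pushing the clock further stops paying off.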

totalslacker
March 26, 2014, 03:18:10 PM
 #495


Quote
But when we tried to push it past the 1050MHz clock (all the way to 1200MHz), it seems that cgminer is showing us wrong results. Cgminer showed a slightly lower hashing speed than expected (Sys_clk * 32), but it kept going all the way to 38GH/s per chip. HW errors were very low, lower than at the 32GH/s settings. We did not have any rejects or stales.


Hello,

a diverging hashrate at pool and cgminer simply means you are losing shares through HW errors.

What you need to consider is:

a) a detected HW error also implies that there were errors on true results; the related probability needs to be derived correctly, but I would assume that when you have a HW error rate of 5% it also means you are missing 5% of real results


I have noticed the same thing when pushing the hardware beyond 25GH/s. In my case I'm looping the test vector zefir had posted a while back. Since this has known nonces I can verify that the hardware is returning the correct nonce sequence. Irrespective of errors the time taken for each chip to finish a job always seems to correlate very closely to the configured hash rate.

I notice that the hardware tends to drop nonces before it starts to produce bad ones. As I push the chip harder and harder the "good" nonce rate drops to zero and bad nonces become frequent.
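
The distinction between dropped and bad nonces can be made explicit when looping a test vector with known results. Below is a minimal sketch of that comparison. The nonce values are invented for illustration and are not zefir's actual test vector, and the function name is hypothetical.

```python
def classify_nonces(expected, returned):
    """Split a chip's returned nonces into good, dropped, and bad sets."""
    expected_set = set(expected)
    returned_set = set(returned)
    return {
        "good": sorted(expected_set & returned_set),     # expected and returned
        "dropped": sorted(expected_set - returned_set),  # expected but missing
        "bad": sorted(returned_set - expected_set),      # returned but wrong
    }


# Made-up values for illustration only.
expected = [0x0A1B2C3D, 0x12345678, 0x9ABCDEF0]
returned = [0x12345678, 0xDEADBEEF]

result = classify_nonces(expected, returned)
for kind, nonces in result.items():
    print(kind, [f"{n:08x}" for n in nonces])
```

Tracking the dropped count separately matters because, as described above, it starts rising before any bad nonces show up, so it is the earlier warning sign.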

However, this ultimately is all a symptom of too low core voltage. 35GH/s gets pretty stable at around 1.050V. I had previously thought it stable at 0.975V but longer tests started producing more errors…

I've modified my supply to get higher output voltages but haven't gotten back to testing it yet.

You do need aggressive cooling at these voltages, so be careful! You can get away with short runs with minimal cooling, but even sitting idle at these voltages it's easy to generate enough heat to pop a chip (as I learned the other day when my code crashed in the debugger and I got distracted trying to figure out a bug I had been seeing from time to time).
silver71
March 27, 2014, 09:16:07 AM
 #496


Quote
But when we tried to push it past the 1050MHz clock (all the way to 1200MHz), it seems that cgminer is showing us wrong results. Cgminer showed a slightly lower hashing speed than expected (Sys_clk * 32), but it kept going all the way to 38GH/s per chip. HW errors were very low, lower than at the 32GH/s settings. We did not have any rejects or stales.


Hello,

a diverging hashrate at pool and cgminer simply means you are losing shares through HW errors.

What you need to consider is:

a) a detected HW error also implies that there were errors on true results; the related probability needs to be derived correctly, but I would assume that when you have a HW error rate of 5% it also means you are missing 5% of real results


I have noticed the same thing when pushing the hardware beyond 25GH/s. In my case I'm looping the test vector zefir had posted a while back. Since this has known nonces I can verify that the hardware is returning the correct nonce sequence. Irrespective of errors the time taken for each chip to finish a job always seems to correlate very closely to the configured hash rate.

I notice that the hardware tends to drop nonces before it starts to produce bad ones. As I push the chip harder and harder the "good" nonce rate drops to zero and bad nonces become frequent.

However, this ultimately is all a symptom of too low core voltage. 35GH/s gets pretty stable at around 1.050V. I had previously thought it stable at 0.975V but longer tests started producing more errors…

I've modified my supply to get higher output voltages but haven't gotten back to testing it yet.

You do need aggressive cooling at these voltages, so be careful! You can get away with short runs with minimal cooling, but even sitting idle at these voltages it's easy to generate enough heat to pop a chip (as I learned the other day when my code crashed in the debugger and I got distracted trying to figure out a bug I had been seeing from time to time).

Would isolating the SPI cables between uC and blades help? Shielding them like S/FTP LAN cable?

tindela1
March 27, 2014, 05:07:54 PM
 #497

I don't know if someone has already mentioned this, but we found out that engineering chips can be undervolted by first starting up at 0.85V and then, once the chips are hashing, adjusting the output voltage. This requires that your DC/DC feedback is designed to allow dynamic voltage adjustment without too much under-/overshoot.
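
One way to keep under-/overshoot small is to move the setpoint in small steps rather than in one jump. The sketch below shows that idea only. The 0.85V starting point is from the post, while the 800mV target, the 10mV step, the settle time, and the `set_voltage_mv` callback are all assumptions standing in for whatever your own supply control code provides.

```python
import time


def ramp_steps(start_mv, target_mv, step_mv=10):
    """Setpoints (mV) to walk from start to target, ending exactly on target."""
    if step_mv <= 0:
        raise ValueError("step_mv must be positive")
    direction = -1 if target_mv < start_mv else 1
    steps = list(range(start_mv + direction * step_mv, target_mv,
                       direction * step_mv))
    steps.append(target_mv)
    return steps


def undervolt(set_voltage_mv, start_mv=850, target_mv=800,
              step_mv=10, settle_s=0.5):
    """Apply each setpoint through a user-supplied callback, pausing in between."""
    for mv in ramp_steps(start_mv, target_mv, step_mv):
        set_voltage_mv(mv)     # hypothetical hook into the DC/DC trim code
        time.sleep(settle_s)   # give the regulator and chips time to settle


# Dry run: record the setpoints instead of driving real hardware.
applied = []
undervolt(applied.append, settle_s=0)
print(applied)
```

In practice you would also check the nonce or HW error counters after each step and stop lowering the voltage as soon as they start to degrade.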


- Noncetech
mazurov
April 09, 2014, 12:55:43 AM
 #498

Hello everybody,

I've found a mess in the BOM of the two-chip reference board!
So far the major issue is that two versions are mixed up: Ver. 1.0.a and Ver. 1.0.b.
The first evident difference is that the latest release is missing C503 and C504, and therefore has a different design.
The repository contains a mix of files from the two versions, and it is impossible, at least for me, to cross-check the diagram against the BOM; I was doing such a check, and finding some inconsistencies, when I discovered the problem.
If someone could help me set up a 100% error-free BOM, or at least provide me a diagram of Ver. 1.0.b, I would appreciate it very much and will be happy to send him/her a free PCB (I have 100 of them waiting to be populated).

Thank you

The only difference is R12. Take a look at my site, the notes should still be on the front page.
mazurov
April 11, 2014, 07:33:54 PM
 #499

Here is a TI TXB0106-based level translator I made to talk to A1s. It requires VCC from the MCU board for the high-voltage side to function and translate correctly. It also provides 1.8V for the A1s' data interface. The LDO on the left is a TI LP3871-1.8V. The circuit is trivial.

I need a second one and it takes too much time to build on a protoboard. I'm routing a PCB, will post when ready.

https://www.circuitsathome.com/wp/wp-content/uploads/2014/04/bc_levelshifter.jpg
[gadget]
April 13, 2014, 06:06:10 AM
 #500

Here are some more pics from mazurov and [gadget]'s build.

A few boards we've put together:

https://i.imgur.com/r1pUq5G.jpg

View of the DCDC area:

https://i.imgur.com/23Aiuzp.jpg

View of the A1 area:

https://i.imgur.com/GkYiUPf.jpg

Heatsink (let's see how far it takes us):

https://i.imgur.com/x7427xS.jpg

The one tool we couldn't have done without was the microscope:

https://i.imgur.com/AtWUq26.jpg

And for those who read this far, here is a small treat: a corrected BOM. I can verify that these parts will get you to a working board (at least during bring-up) :)

https://docs.google.com/spreadsheet/ccc?key=0AkO84VcUgOWgdFA5S0tGQTcxVVViX0I1VUlPaHhISEE&usp=sharing