Bitcoin Forum
Author Topic: The Habanero - 650GH/s - OOS  (Read 96000 times)
This is a self-moderated topic. If you do not want to be moderated by the person who started this topic, create a new topic.
Dryleaf
July 18, 2014, 02:19:28 AM
 #1041

You may want to raise the hash clock to about 850, as it isn't supposed to go below 550.
MedinaMiner
July 18, 2014, 02:48:16 AM
 #1042

Nevermind. I figured it out.

It was pilot error.

It's working now.

Next challenge is to get the PepperMining program working. For some reason it does not seem to be working. I probably have something incorrect on it also.

A document on these products would be welcome.

Regards,

James


I've just purchased a couple of Habaneros and am in the process of bringing them up.

Neither seems to work like I would expect them to.

I have read through all of this thread, researched the Habanero website, pulled hair... lost sleep... to no avail.

I think that the problem has to do with the Habanero making 'contact' with the miner. (Not sure of the technical terms)

It appears that the actual Habanero is working. I can issue a cgminer --benchmark and it starts up and merrily runs hashing away happily... too bad I can't grab some of those and toss them into my wallet.

So - here is my cg.bat - which has been edited many times to try different things, but nothing seems to be the correct combination...

cgminer -o http://stratum.btcguild.com:3333 -u {user} -p {xxxx} --hfa-name OZ --usb 2:3 --hfa-fan 100 --hfa-temp-target 0 --hfa-temp-overheat 103 --hfa-fail-drop 0 --hfa-hash-clock 300 --api-port 4028 --verbose 2>>logfile_hab.txt

I have found that --verbose is really handy in helping to debug... but the output is cryptic to me. Perhaps someone more knowledgeable will see what I am doing wrong...

 [2014-07-17 20:10:27] Started cgminer 4.4.1
 [2014-07-17 20:10:29] HFA: Found device with name OZ
 [2014-07-17 20:10:30] Probing for an alive pool
 [2014-07-17 20:10:30] HFA : Sending OP_USB_INIT with GWQ protocol specified
 [2014-07-17 20:10:30] Testing pool http://stratum.btcguild.com:3333
 [2014-07-17 20:10:30] HTTP request failed: Empty reply from server
 [2014-07-17 20:10:30] HTTP request failed: Empty reply from server
 [2014-07-17 20:10:31] JSON-RPC decode failed: (unknown reason)
 [2014-07-17 20:10:31] Pool 0 difficulty changed to 2
 [2014-07-17 20:10:31] Closing socket for stratum pool 0
 [2014-07-17 20:10:33] HFB :      firmware_rev:    0.5
 [2014-07-17 20:10:33] HFB :      hardware_rev:    1.1
 [2014-07-17 20:10:33] HFB :      serial number:   9e4c7500
 [2014-07-17 20:10:33] HFB :      hash clockrate:  300 Mhz
 [2014-07-17 20:10:33] HFB :      inflight_target: 768
 [2014-07-17 20:10:34] HFB :      sequence_modulus: 2048
 [2014-07-17 20:10:36] Testing pool http://stratum.btcguild.com:3333
 [2014-07-17 20:10:37] Pool 0 difficulty changed to 2
 [2014-07-17 20:10:37] Closing socket for stratum pool 0
 [2014-07-17 20:10:42] Testing pool http://stratum.btcguild.com:3333
 [2014-07-17 20:10:43] Pool 0 difficulty changed to 2
 [2014-07-17 20:10:43] Closing socket for stratum pool 0
 [2014-07-17 20:10:44] Waiting for work to be available from pools.
 [2014-07-17 20:10:48] Testing pool http://stratum.btcguild.com:3333
 [2014-07-17 20:10:48] Pool 0 difficulty changed to 2
 [2014-07-17 20:10:49] Closing socket for stratum pool 0

I have a few more details, but am not sure which ones may be relevant. Feel free to beat me over the head... I'm a newbie.

Please help... please???

Thanks in advance...

James

Taugeran
July 18, 2014, 08:35:03 PM
 #1043

In addition to --api-port 4028, you need

--api-listen
--api-network

This enables API listening and network access (so you can monitor from other computers on the same subnet).
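Once the API is listening, it can be polled from another machine on the subnet. Below is a minimal sketch, assuming the usual cgminer API behaviour of accepting a JSON command over TCP and answering with a JSON document terminated by a NUL byte. The helper names and the example address are made up for illustration.

```python
import json
import socket


def build_request(command, parameter=None):
    """Encode a cgminer API command as JSON bytes."""
    request = {"command": command}
    if parameter is not None:
        request["parameter"] = parameter
    return json.dumps(request).encode("ascii")


def parse_reply(raw):
    """Decode a reply, dropping the trailing NUL byte cgminer appends."""
    return json.loads(raw.rstrip(b"\x00").decode("ascii", errors="replace"))


def query(host, port=4028, command="summary", timeout=5.0):
    """Send one command and read until the miner closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(command))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_reply(b"".join(chunks))


# Example (hypothetical address of the mining box):
# print(query("192.168.1.50", 4028, "summary"))
```

If the connection is refused from another computer, the API options above are the first thing to check.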

Bitfury HW & Habañero : 1.625Th/s
tips/Donations: 1NoS89H3Mr6U5CmP4VwWzU2318JEMxHL1
Come join Coinbase
xjack
July 19, 2014, 12:52:36 AM
 #1044

Not-so-funny Habanero story...

Chip and Dabs have been running very well the past several weeks. As long as I can keep the ambient down, they purr along at ~1.4+ TH/s.

Since they're in my garage, I foolishly decided to leave them running while cleaning the crud and bugs out of the rads. In the process, I accidentally knocked Dabs off the shelf.

She was saved by the power cables, but not before ripping the USB connector off clean, pads and all.



https://dl.dropboxusercontent.com/u/30721962/dabs_fell_down.jpg

Also, when she went down, the cooling head bounced off the shelf itself, popping one of the hold down screws off.

I immediately cut the power and grabbed it up, carefully moving it to a bench to assess. No obvious board damage, and I was plugged to the end of the USB chain, so there was hope. Once I got Dabs apart and cleaned up, I noticed a solid mark on the cooling head mounting surface from the die contact, and a very small part of the corner chipped from die 3.

Long story short, I lapped the cooling head, cleaned her up and reassembled.  24 hours later she's running a little hotter than she used to, but still hashes normally.

Code:
 cgminer version 4.4.0 - Started: [2014-07-17 18:33:37]
----------------------------------------------------------------------------------------------------
 (5s):1.447T (1m):1.408T (5m):1.429T (15m):1.432T (avg):1.425Th/s
 A:30007296  R:82944  HW:360811  WU:19909.5/m | ST: 125  SS: 0  NB: 174  LW: 31329244  GF: 0  RF: 0
 Connected to stratum-lb48.btcguild.com diff 1.02K with stratum as user xxxxx
 Block: 40834981...  Diff:17.3G  Started: [19:40:05]  Best share: 217M
----------------------------------------------------------------------------------------------------
 [U]SB management [P]ool management [S]ettings [D]isplay options [Q]uit
 0: HFB Chip    : 950MHz  96C 100% 0.90V  | 717.7G / 719.9Gh/s WU:10057.5/m A:15199232 R:39936 HW:177519
 1: HFB Dabs    : 950MHz  98C 100% 0.90V  | 721.9G / 705.2Gh/s WU: 9852.0/m A:14808064 R:43008 HW:183292
----------------------------------------------------------------------------------------------------
 [2014-07-18 19:45:50] Accepted 189620a3 Diff 2.67K/1024 HFB 0 pool 0
 [2014-07-18 19:46:00] Accepted 3950c902 Diff 1.14K/1024 HFB 1 pool 0

Lesson learned - disconnect and carefully clean your miners on a bench, not while they're in use.

We're very lucky to have two USB ports on these!
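
The numbers in the status screen above are self-consistent. cgminer's WU figure counts difficulty-1 shares per minute, and each difficulty-1 share represents roughly 2^32 hashes on average, so WU converts directly to a hashrate. A rough sketch of that conversion (the function name is made up for illustration):

```python
# Approximate number of hashes needed, on average, to find one difficulty-1 share.
HASHES_PER_DIFF1_SHARE = 2 ** 32


def wu_to_hashrate(wu_per_minute):
    """Convert cgminer work utility (diff-1 shares per minute) to hashes per second."""
    return wu_per_minute * HASHES_PER_DIFF1_SHARE / 60.0


# WU values copied from the status screen in this post.
for label, wu in (("total", 19909.5), ("Chip", 10057.5), ("Dabs", 9852.0)):
    print(f"{label}: {wu_to_hashrate(wu) / 1e9:.1f} GH/s")
```

The results land within rounding of the (avg) 1.425Th/s, 719.9Gh/s and 705.2Gh/s that cgminer reports.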

xjack - 1xjackDMgJCLn1LDtbgh51DYw6uRgeHVb
Reputation thread - https://bitcointalk.org/index.php?topic=482124.0
daddyfatsax
July 19, 2014, 06:50:09 PM
 #1045

What is your voltage set at to run 950 on the clock? I think my voltage is at .960 or .970 and I can only get to 900MHz.
xjack
July 19, 2014, 09:55:35 PM
 #1046

Quote from: daddyfatsax on July 19, 2014, 06:50:09 PM
What is your voltage set at to run 950 on the clock? I think my voltage is at .960 or .970 and I can only get to 900MHz.

.980 on both with 1.1-1.2% HW.

Chip will run at .975, but the errors drop a hair at .980. He's maxed at 950MHz. Higher speeds/voltages yield less hashrate.

Dabs is a bit faster at .990/962MHz, but she's ambient temp sensitive at maxed voltage. She needs .995+ to run at 975, which gets 725GH/s but closer to 2% HW - Die 3 is a bit weak on that chip.


PCComf
July 20, 2014, 02:18:51 AM
 #1047

I have one Hab that has been running nicely since the day I got it, with no issues after the normal tweaking for clock and my target temperature. This evening it went offline suddenly, and when I checked it the power supply appeared to have tripped. The fans had stopped and everything was dead. There was no hint of magic smoke anywhere. After pulling the plug and letting it sit a bit, the PS appeared to reset as one would expect from a trip. I hooked everything back up, but as soon as I applied power it appeared to trip again. I pulled a server PS out of the garage that has been running a pair of S1's and hooked it up: same deal, except the server supply shuts down and then attempts to restart after a few seconds without my having to pull the plug. The Hab's power supply runs the pair of S1's no problem.

Not sure what might be the problem, but I'm expecting something wrong on the Hab. Is there anything I can do to troubleshoot further, or can I send this back for some diagnosis?

Thanks.
Taugeran
July 20, 2014, 02:55:27 AM
 #1048

that sounds like a direct short to ground. get a magnifying glass and good lighting or a bright flashlight.

look for: dust, solder blobs, misplaced components, fried traces near power circuitry, discoloration, charring, etc.


e.g. on a bitfury h-card i had running for a while. all of a sudden it stopped working. couldnt find bridges, etc. broke out DMM and power plane was 0V. so went searching and the power converter Vout traces were blackened, charred and delaminated.


*hopes nothing that serious has happened*

-Taug

btchedge
July 20, 2014, 04:33:40 AM
 #1049

Help. Trying to bring up a brand new board and can't figure out what's going on...

So here is the error I am getting when trying to hash...

Code:
 
[2014-07-19 23:14:07] HFA: Found device with name HB2
[2014-07-19 23:14:07] Hotplug: Hashfast added HFA 37
[2014-07-19 23:14:10] ERR: Asked to memcpy 0 bytes from usbutils.c _usb_read():3075
[2014-07-19 23:14:10] HFA : OP_USB_INIT failed! Operation status 1 (Reset timeout)

I tried flashing it with the latest firmware and this certainly doesn't look right...

Code:
.../firmware/firmware-20140713-0# ./field_firmware_update.py
('confirm is ', False)
('FIRMWARE_DIR is ', '.')
UC_HFU_FILE at './uc3.cropped.hfu'.
READSERIAL found at 'x86_64/readserial'.
HFUPDATE found at 'x86_64/hfupdate'.
ENTERLOADER found at 'x86_64/enterloader'.

HashFast Firmware Updater

Reading serial number from module.
Did not get valid magic characters: aa 32 00 00 aa 42 07 f3 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Entering Boot Loader...
Enumerating modules...
Found 1 modules.
Loading Firmware...
Updating module 0...
hfupdate v0.1
module chain config 1 master 1 slaves 0
module 0 version 0x80000003 crc 0x23ed99de
module 0 invalid serial number:
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff
done
hfupdate v0.1
module chain config 1 master 1 slaves 0
module 0 version 0x80000003 crc 0x23ed99de
module 0 invalid serial number:
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff
done

***FIRMWARE UPDATE COMPLETE


HashFast Firmware Updater Completed

Please enlighten me if you have any ideas. Thanks!

Who is John Galt?
Batshark
July 22, 2014, 05:53:19 PM
 #1050

You have the same problem as me. Mine just stopped hashing 5 days ago and gives this error. Teal said I need to get a multimeter to track it down.
PCComf
July 22, 2014, 06:42:10 PM
 #1051

Quote from: Taugeran on July 20, 2014, 02:55:27 AM
that sounds like a direct short to ground. get a magnifying glass and good lighting or a bright flashlight.

look for: dust, solder blobs, misplaced components, fried traces near power circuitry, discoloration, charring, etc.

e.g. on a bitfury h-card i had running for a while. all of a sudden it stopped working. couldnt find bridges, etc. broke out DMM and power plane was 0V. so went searching and the power converter Vout traces were blackened, charred and delaminated.

*hopes nothing that serious has happened*

-Taug

With a magnifying glass I see no dust, defects or burned traces. It appears as new.
btchedge
July 22, 2014, 09:05:27 PM
 #1052

Quote from: Batshark on July 22, 2014, 05:53:19 PM
You have the same problem as me. Mine just stopped hashing 5 days ago and gives this error. Teal said I need to get a multimeter to track it down.

The only difference is the "Operation status 1" versus "Operation status 20". I am working on trying to figure this out today. Will let you know what I find.

btchedge
July 23, 2014, 01:47:33 AM
 #1053

Quote from: Batshark on July 22, 2014, 05:53:19 PM
You have the same problem as me. Mine just stopped hashing 5 days ago and gives this error. Teal said I need to get a multimeter to track it down.

Well, I tried flashing the previous firmware and now the error says "Operation status 20.." This is really strange.

PCComf
July 23, 2014, 02:54:54 AM
 #1054

This gets stranger and stranger. I had some time this evening and took a chance with additional troubleshooting since I'm pretty dead in the water otherwise. The board powers up fine when the A, C and D PCI-E connectors are plugged in, but B is what causes the issue. I see nothing wrong with that connection.

If I try to get the board to hash in this state - I am not sure this is possible - I get the Operation status 20 error and it continuously re-detects the board. I have not touched anything to do with firmware, and since it doesn't seem to be helping the others with that problem, I'm not going to try yet. I am wondering if it is complaining because that one isn't powered up, and whether it is possible to disable a chip. I'm not even sure if things are set up that way - just hoping for some kind of way to get it partially alive again.

Thanks for any help.
btchedge
July 23, 2014, 07:31:18 AM
 #1055

In a conversation at some point I was advised by Dave that it was worth trying to disable certain dies to resolve this issue. Essentially the idea is to identify the bad die then set its voltage to ZERO so it will then be disregarded and allow the healthy dies to work. So using the version of the hcm tool I have the command is something like # ./hcm --write-die-settings 1:0@0 if you wanted to disable die 1. I tried this tonight but it did not solve my problem.

Hopefully Dave can chime in with further suggestions. I will report back if I have any luck tomorrow.

MrTeal (OP)
July 23, 2014, 03:32:09 PM
 #1056

Quote from: PCComf on July 23, 2014, 02:54:54 AM
This gets stranger and stranger. I had some time this evening and took a chance with additional troubleshooting since I'm pretty dead in the water otherwise. The board powers up fine when the A, C and D PCI-E connectors are plugged in, but B is what causes the issue. I see nothing wrong with that connection.

If I try to get the board to hash in this state - I am not sure this is possible - I get the Operation status 20 error and it continuously re-detects the board. I have not touched anything to do with firmware, and since it doesn't seem to be helping the others with that problem, I'm not going to try yet. I am wondering if it is complaining because that one isn't powered up, and whether it is possible to disable a chip. I'm not even sure if things are set up that way - just hoping for some kind of way to get it partially alive again.

Thanks for any help.

It's hard to tell without testing using a multimeter, but it sounds to me like the one input (B) is shorted. That would be why the PSU shuts off unless that cable is disconnected. You'll get the regulator programming error if you try to start running with the one cable disconnected, but you should be able to leave B off if you set the voltage to 0 for that die using hftool or JakeTri's modified cgminer. The firmware will then route around that die and not turn on that power supply.
MrTeal (OP)
July 23, 2014, 03:39:49 PM
 #1057

Quote from: Batshark on July 22, 2014, 05:53:19 PM
You have the same problem as me. Mine just stopped hashing 5 days ago and gives this error. Teal said I need to get a multimeter to track it down.

Having a multimeter would make it quicker, but it's not necessary. It is going to be difficult to do much without access to a Linux machine to adjust voltages or update firmware, though. You might need to send it back in this case. Send me a PM and I can arrange an RMA.
MrTeal (OP)
July 23, 2014, 03:43:22 PM
 #1058

Quote from: btchedge on July 23, 2014, 07:31:18 AM
In a conversation at some point I was advised by Dave that it was worth trying to disable certain dies to resolve this issue. Essentially the idea is to identify the bad die then set its voltage to ZERO so it will then be disregarded and allow the healthy dies to work. So using the version of the hcm tool I have the command is something like # ./hcm --write-die-settings 1:0@0 if you wanted to disable die 1. I tried this tonight but it did not solve my problem.

Hopefully Dave can chime in with further suggestions. I will report back if I have any luck tomorrow.

It's weird that it continues to give you the error 20 even with a die disabled. Did you try running them each individually to see if they'll run one at a time? I.e.,
./hftool.py -w 0:950@900/1:0@900/2:0@900/3:0@900
then ./hftool.py -w 0:0@900/1:950@900/2:0@900/3:0@900
etc.
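
The one-die-at-a-time test above can be scripted. This is a minimal sketch, assuming the `-w` argument is a `/`-separated list of `die:setting` entries exactly as in the two commands in this post, that `950@900` is the "on" setting and `0@900` the "off" setting used there, and that the board has four dies numbered 0-3. The helper name is made up; it only prints the commands so each one can be run and checked by hand.

```python
def die_settings(active_die, on="950@900", off="0@900", num_dies=4):
    """Build a settings string with only one die enabled.

    The on/off strings are copied from the example commands in the post;
    their exact meaning is defined by hftool, not by this sketch.
    """
    parts = []
    for die in range(num_dies):
        setting = on if die == active_die else off
        parts.append(f"{die}:{setting}")
    return "/".join(parts)


# Print one command per die; run them one at a time and test hashing after each.
for die in range(4):
    print(f"./hftool.py -w {die_settings(die)}")
```

The first two printed lines reproduce the two commands given above, and the last two continue the same pattern for dies 2 and 3.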
QuiveringGibbage
July 26, 2014, 10:57:55 AM
 #1059

Hi all,

Got two questions I was wondering if anyone can answer:

1. What are the consequences of setting a high voltage while only using a low clock speed? Say 350MHz at 960mV. Would hashing at such low rates cause overheating issues, just because the voltage has been set high? Just setting the voltage high doesn't mean you consume all the power, does it?

2. The HF-Tool has absolutely nothing to do with naming Habanero boards? It does not permanently record Names to boards and you do not need the HF-Tool to rename a Habanero board? Am I correct that the only way to change the name of the Habanero board is to use cgminer command --hfa-name <arg> ?

I ask this because I have a Habanero board set to 937MHz and 960mV with the HF-Tool. The board hashes fine at these settings at a rate of 665GH/s and temps of 100C.

But the Corsair H100i radiator was recently punctured and leaked fluid. It has been repaired with twin epoxy and the fluid refilled to 99.99%. cgminer is set at 200MHz and still says ZOMBIE. The message says it's overheating. I have noticed the voltage starts at 1.04V, but other Habaneros (that have stock settings) run at 0.86V. There is currently no access to a Linux box and the HF-Tool to downclock the board. The cgminer command is:
cgminer.exe --usb HFA:1 -o pool:3333 -u user -p pass --hfa-fan 100 --hfa-temp-target 0 --hfa-temp-overheat 104 --hfa-fail-drop 0 --hfa-hash-clock 200 --api-port 4029

Appreciate your time if you have any hints on how to get the board mining again,
QG

Bitcoin is at the tippity top of the mountain...but it's really only half way up.. Wink
MrTeal (OP)
July 29, 2014, 02:36:19 AM
 #1060

1. At least for the board (outside of the cooling system), running at 350MHz@960mV will consume about half the power of running at 700MHz@960mV, and efficiency will be about the same. There shouldn't be any overheating issues at that speed if your cooling is working.
2. No, the hftool isn't used for naming the boards.
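
The "about half the power" figure in answer 1 matches the usual first-order CMOS approximation, where dynamic power scales with V^2 x f. This is a minimal sketch assuming that model and ignoring leakage and regulator losses; the function name is illustrative and not part of hftool or cgminer.

```python
def relative_dynamic_power(mhz, millivolts, ref_mhz, ref_millivolts):
    """Dynamic power relative to a reference operating point, using P ~ V^2 * f."""
    return (millivolts / ref_millivolts) ** 2 * (mhz / ref_mhz)


power_ratio = relative_dynamic_power(350, 960, 700, 960)
hashrate_ratio = 350 / 700  # hashrate is assumed to scale linearly with clock

print(f"power:    {power_ratio:.0%} of the 700MHz@960mV figure")
print(f"hashrate: {hashrate_ratio:.0%} of the 700MHz figure")
print(f"energy per hash, relative: {power_ratio / hashrate_ratio:.2f}")
```

Under this model, power and hashrate both halve at the same voltage, so the energy per hash stays the same. Lowering the voltage as well would be what improves efficiency.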

I would double-check that the pump didn't die when the fluid all leaked, and then try to carefully reseat the head. The voltage reading is likely erroneous; it will sometimes report a weird value if it goes into shutdown before it actually samples a value.