Bitcoin Forum
Topic: Looking for FPGA cgminer testers.  (Read 3291 times)
P_Shep
June 18, 2012, 07:16:07 PM
 #1

I've been working on many MANY improvements for BFL support in cgminer.
I think I'm about ready to submit a pull request into cgminer, but I still need to test how it operates with other FPGAs.
Essentially it should have no effect on Icarus, Ztex and ModMiner, but since I don't have any of those, I can't test them, so I'm hoping some kind volunteers could compile my fork here and run it for a while:
https://github.com/pshep/cgminer

The most significant change for FPGAs is the inclusion of 'SICK' and 'DEAD' processing, which was previously reserved only for OpenCL devices. For Icarus, Ztex and ModMiner, this should tell you if they are sick or dead (for BFL it'll attempt to re-init the device).
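
As a rough illustration of the SICK/DEAD idea, the sketch below classifies a device by how long it has been silent. It uses the 60 s and 10 min thresholds this post gives for BFL devices; the function and constant names are invented for the example and are not cgminer's.

```python
import time

SICK_AFTER_S = 60        # no response for 60 s -> sick (threshold from the post)
DEAD_AFTER_S = 10 * 60   # no response for 10 min -> dead (threshold from the post)


def device_status(last_response, now=None):
    """Return 'ALIVE', 'SICK' or 'DEAD' from the seconds since the last response.

    last_response and now are timestamps on the same clock, in seconds.
    """
    if now is None:
        now = time.monotonic()
    silent_for = now - last_response
    if silent_for >= DEAD_AFTER_S:
        return "DEAD"
    if silent_for >= SICK_AFTER_S:
        return "SICK"
    return "ALIVE"
```

A watchdog loop could call this once per device and, for a BFL, trigger a re-init whenever the status is not ALIVE.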

For BFL devices, my changes do the following:
- Timeout to restart work if it's taking too long
  A nonce range should take just over 5s. Any longer and the device is throttling.
  Fixes the issue where a BFL appears to stall in cgminer.
- Count throttling as a zero hash error
- Temp taken in watchdog thread
  Now a disabled device will still return a current temperature, rather than the last value before disabling.
- Work restart on new work
  This was causing a very high stale rate for me...
  Previously on a work restart, cgminer would allow the BFL to continue with the stale block. Now this is checked, and while any nonces found in that time will be wasted, the work will be discarded and new work will be started immediately, not after the 5s the BFL takes to return results.
- Timing adjustments
  The 'wait for results' was hard-set to 4500ms before polling at 10ms intervals. With variation between systems and new firmwares of differing hash rates, a hard-set timer could be either inefficient (starts polling way before necessary) or wasteful (starts polling way after the result is ready). The auto-adjustment will find the correct wait time to minimize both polling and the delay in retrieving results.
- Device re-initialization
  When a device is disabled (for whatever reason - by the user or by cgminer) then re-enabled, the device is re-initialized, rather than assuming communications are still working.
- Sick / Dead monitoring
  As with OpenCL devices, BFL devices will be checked for being sick (60s no response) or dead (10 mins no response), and cgminer will try to re-initialize them.
- Improved logging
  Most logs now include the device in question, e.g.: "BFL0: took longer than 15s"
- Device start offset
  Delays the start of each device by a random time between 0 and 100ms so that they don't all make calls at exactly the same time.
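
To make the timing adjustment concrete, here is a minimal sketch of one way such an auto-adjusting wait could behave. It is not the code from the fork: the adjustment rule, the step size and all names are assumptions, and only the 4500ms starting value and the 10ms poll interval come from the list above.

```python
POLL_INTERVAL_MS = 10     # polling interval mentioned in the post
INITIAL_WAIT_MS = 4500    # the old hard-set wait, used here as a starting point


def adjust_wait(wait_ms, polls_needed, step_ms=10, min_wait_ms=0):
    """Return the wait to use before polling for the next result.

    polls_needed is how many polls the last job took before a result arrived.
    A single poll means the result may have been ready for a while, so wait a
    little less; several polls mean polling started too early, so wait longer.
    """
    if polls_needed <= 1:
        return max(min_wait_ms, wait_ms - step_ms)
    return wait_ms + (polls_needed - 1) * POLL_INTERVAL_MS


def simulate(job_time_ms, rounds=50):
    """Feed a constant job time through adjust_wait and return the final wait."""
    wait = INITIAL_WAIT_MS
    for _ in range(rounds):
        # Polls happen at wait, wait+10, wait+20, ... until the job is done.
        remaining = max(0, job_time_ms - wait)
        polls = 1 + -(-remaining // POLL_INTERVAL_MS)  # 1 + ceil(remaining / 10)
        wait = adjust_wait(wait, polls)
    return wait


if __name__ == "__main__":
    for job_ms in (4000, 5100):
        print(job_ms, "->", simulate(job_ms, rounds=200))
```

With a steady job time the wait settles within one poll interval of the real completion time, whether the device is faster or slower than the old fixed value assumed.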

If you can help, I'll be much appreciative.

Thanks :)


rudrigorc2
June 18, 2012, 11:55:00 PM
 #2

if you compile for the tplink mips toy I can test =)
P_Shep
June 19, 2012, 12:02:50 AM
 #3

I tried actually, just to see if my compiler would!

It wouldn't.

Actually I think the compiler would, but the libraries are all wrong for your kind of processor, so it got nowhere.
nedbert9
June 19, 2012, 12:07:06 AM
 #4




Really appreciated, P_Shep
nedbert9
June 19, 2012, 12:08:55 AM
 #5




I guess there's no hope for scan-serial to work for BFL's in Windows, eh?
P_Shep
June 19, 2012, 12:32:35 AM
 #6

Well I don't know... an ugly way might be just to try to open each port in turn. But then where do you stop... 8? 16? 100? That might take a while...
rjk
June 19, 2012, 12:35:45 AM
 #7

Can they be done in parallel? Ufasoft is able to do it, somehow.

Mining Rig Extraordinaire - the Trenton BPX6806 18-slot PCIe backplane [PICS] Dead project is dead, all hail the coming of the mighty ASIC!
Phraust
June 19, 2012, 06:01:38 AM
 #8

I tried it out on OSX, seems to work pretty well.

With the stock cgminer, I've been noticing a decline in hashrate over the course of about 4 hours (from 8.5gh down to 7.8), so I've been restarting it when I notice it dropping.  I've been running your version for the last five hours, and it's still up at 8.5gh.  Thanks a ton, I'll keep you updated.

I should add that after about 4 hours, one or more of the BFLs would drop below a U of 10 (normally around 8, sometimes down to 6).  This behavior started after I moved my rig back, so it might be how I laid everything out; I was thinking it was probably noise across all the USB cables.  Right now, they are all at or above 11.8, with one at 11.3, much much better.
af_newbie
June 19, 2012, 01:45:29 PM
 #9

Of course.  A separate pool of "test" threads (say 10-20) could signal the main "scan" thread when they are done, so that a new port can be assigned to be scanned.  The upper limit can be read from the OS.  The "scan" thread would assign "untested" ports to the worker "test" threads.

The BFL I/O code should use non-blocking, overlapped I/O so that scanning can be stopped if needed.
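
A small sketch of the thread-pool scan described above, using Python's standard thread pool, whose internal work queue plays the role of the "scan" thread handing untested ports to free workers. The probe function is a hypothetical stand-in for real serial I/O, and the port names and pool size are only examples; the overlapped-I/O cancellation part is not covered.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_ports(port_names, probe, workers=16):
    """Probe every port with a fixed pool of worker threads.

    probe(name) is a caller-supplied function that tries to talk to the port
    and returns True if a device answers. Ports that raise OSError (busy,
    missing, access denied) are treated as "no device".
    """
    names = list(port_names)

    def safe_probe(name):
        try:
            return bool(probe(name))
        except OSError:
            return False

    # The executor hands the next untested port to whichever worker is free.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = list(pool.map(safe_probe, names))
    return [name for name, hit in zip(names, hits) if hit]


def fake_probe(name):
    """Stand-in for real serial I/O: pretend devices sit on COM3 and COM7."""
    if name == "COM5":
        raise OSError("port busy")
    return name in {"COM3", "COM7"}


candidates = [f"COM{i}" for i in range(1, 101)]
found = scan_ports(candidates, fake_probe)
print(found)
```

With a real probe that has its own timeout, scanning 100 ports with 16 workers takes roughly 100/16 timeouts instead of 100.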
P_Shep
June 19, 2012, 03:46:48 PM
 #10

When/if this is accepted, I'll look into it. I also want to have it scan ports during operation so you can yank out and replace/add devices while it's running.
nedbert9
June 19, 2012, 04:39:42 PM
 #11

Not sure if this helps, but the COM identifiers reserved for use are recorded in the registry key below.  It keeps a record of every identifier that has ever been assigned, offline devices too.
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\COM Name Arbiter

The problem with this method is that unplugging and replugging devices without a reboot results in incrementing COM IDs.

Phraust
June 20, 2012, 01:44:14 AM
 #12

Just wanted to update, it's been 24 hours and it's working like a champ, a solid 8.5gh with no issues.
P_Shep
June 20, 2012, 03:44:35 AM
 #13

Great :)

Just need someone who has an FPGA other than BFL to test it...
kano
June 23, 2012, 03:47:21 PM
 #14

Oh - you have a thread.
You didn't mention that :)

Yeah been running for a while - but it's 1:45am
I'll leave it running overnight anyway.

If you show up shortly in #cgminer I'll give you the link to see my rig (or in the morning when I wake up)
But yeah it's mining and showing the same av MH/s mine does.

Pool: https://kano.is BTC: 1KanoiBupPiZfkwqB7rfLXAzPnoTshAVmb
CKPool and CGMiner developer, IRC FreeNode #ckpool and #cgminer kanoi
Help keep Bitcoin secure by mining on pools with Stratum, the best protocol to mine Bitcoins with ASIC hardware
fred0
June 25, 2012, 03:26:16 PM
 #15

Some numbers from testing BFL rev2 x16 running 800Mh/s Firmware

I disrupted the results by disconnecting the power on one unit mistakenly and did not notice, so results might be a teeny bit better for the pshep changes to cgminer.

Running under ubuntu 12.04 64-bit

CGMiner  Reject  Accepted  Util    MH/s   Getwork  Remote/Local/Discard/Found/HW Err/Network  Uptime hh:mm:ss  Rej %  Eff
std      8249    700825    174.83  12514  92038    8700853910200448                           66:48:43         1.16%  761%
pshep    588     367158    175.68  12575  39888    102353390368500213                         34:49:58         0.16%  920%
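
The two percentage columns can be cross-checked with a little arithmetic. The sketch below assumes the flattened rows read as Reject 8249 / Accepted 700825 / Getwork 92038 for std and Reject 588 / Accepted 367158 / Getwork 39888 for pshep, and that Rej % is rejected over all submitted shares while Eff is accepted shares per getwork. Both the parsing and the definitions are assumptions, chosen because they reproduce the percentages printed in the table.

```python
def reject_pct(rejected, accepted):
    """Rejected shares as a percentage of all submitted shares."""
    return 100.0 * rejected / (rejected + accepted)


def efficiency_pct(accepted, getworks):
    """Accepted shares per getwork request, as a percentage."""
    return 100.0 * accepted / getworks


# Values as read from the flattened table rows (an assumed parsing).
runs = {
    "std": {"rejected": 8249, "accepted": 700825, "getworks": 92038},
    "pshep": {"rejected": 588, "accepted": 367158, "getworks": 39888},
}

for name, r in runs.items():
    rej = reject_pct(r["rejected"], r["accepted"])
    eff = efficiency_pct(r["accepted"], r["getworks"])
    print(f"{name}: Rej {rej:.2f}%  Eff {eff:.0f}%")
```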



P_Shep
June 25, 2012, 04:20:27 PM
 #16

Thanks for that, Kano. I'm just about ready to submit a pull request now, just sorting out one more thing...

Looks good there, fred :)
kano
June 25, 2012, 05:02:34 PM
 #17

Not sure if you are doing this, so as I have mentioned to luke-jr, I'll mention it to you also :)
The BFL abort should only be done if --no-submit-stale is enabled and the getwork said to not submit stale.
(i.e. you need to somehow check those two before aborting the work)

Reasons:
1) If you abort work on a pool that allows stale shares, then when you abort you may be throwing away shares (since BFL doesn't tell you what shares you have worked out already when you abort the work) - so on such a pool (or a getwork that says to submit stale) you'd never abort the work
2) On p2pool only ~1 in every ~60 LPs (long polls) represents a real BTC LP - for all other LPs, if the stale work is a full difficulty block, it is a valid payable BTC block - and p2pool will send it to the bitcoind ... and throwing away blocks is bad :)

Of course no one in their right mind would mine on p2pool with a BFL since either you throw away blocks or you throw away shares - you must do one or the other with a BFL on p2pool.

P_Shep
June 25, 2012, 05:48:19 PM
 #18

Well, that's the thing, either way the work is lost, no? It's a matter of minimizing work lost. We can submit shares which may/may not be accepted, or we can restart work which we know will be accepted. As you say, this is only really a problem on P2Pool, which is a problem anyway, so what's lost?
P_Shep
June 25, 2012, 07:12:20 PM
 #19

Actually it'd be interesting to get real performance data from someone with BFLs mining on P2Pool with the existing 2.4.3 and with my version. I'm quite curious :)
kano
June 25, 2012, 11:58:04 PM
 #20

Well, no it's not an actual problem as such.
How the code must work is quite straightforward - as I said:
If --no-submit-stale is set and the getwork didn't say to submit stale, then yes, abort.
i.e. the choice is the user's, by using "--no-submit-stale", or the pool's, by saying to submit stale in the getwork
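
The rule above boils down to a two-input check. A minimal sketch follows, with made-up parameter names standing in for the --no-submit-stale option and for whatever flag the pool's getwork response sets; it does not mirror cgminer's actual variables.

```python
def should_abort_stale_work(no_submit_stale, pool_wants_stale):
    """Decide whether to abort work on the device when it goes stale.

    no_submit_stale  -- True if the user passed --no-submit-stale
    pool_wants_stale -- True if the pool's getwork asked for stale shares
    Abort only when the user opted out of stale shares AND the pool did not
    ask for them; otherwise let the device finish so no share is thrown away.
    """
    return no_submit_stale and not pool_wants_stale


# Print the full truth table for the two inputs.
for user_flag in (False, True):
    for pool_flag in (False, True):
        verdict = "abort" if should_abort_stale_work(user_flag, pool_flag) else "keep working"
        print(f"--no-submit-stale={user_flag}, pool submits stale={pool_flag}: {verdict}")
```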
