Bitcoin Forum
Author Topic: Butterfly Labs - Bitforce Single and Mini Rig Box  (Read 186939 times)
RoloTonyBrownTown
Sr. Member
****
Offline Offline

Activity: 350
Merit: 250



View Profile
February 26, 2012, 10:16:22 PM
 #161

i placed an order to pay with dwolla a few days ago, sent couple emails, PMs even called

trying to be patient... waiting for them to contact me

They're busy.  They obviously want your order, but I imagine it's full steam ahead trying to put together and post out all the pre-orders first.

BlackPrapor
Hero Member
*****
Offline Offline

Activity: 628
Merit: 504



View Profile WWW
February 26, 2012, 10:38:36 PM
 #162

i placed an order to pay with dwolla a few days ago, sent couple emails, PMs even called

trying to be patient... waiting for them to contact me

They're busy.  They obviously want your order, but I imagine it's full steam ahead trying to put together and post out all the pre-orders first.

I guess they didn't anticipate such huge demand, and the factory they've contracted with is just overloaded, plus the Chinese New Year holidays, plus this is the last year you can get 50 BTC per block =))). I hope they'll start production ASAP.

There is no place like 127.0.0.1
In blockchain we trust
makomk
Hero Member
*****
Offline Offline

Activity: 686
Merit: 564


View Profile
February 26, 2012, 10:45:44 PM
 #163

Does anyone know how difficult/easy these are to set up to do merged mining on P2Pool?

It shouldn't be any different.

The miner hardware (actual CPU/GPU/FPGA) simply gets a binary blob of data, combines it with a nonce that it increments, hashes it, and looks for nonces whose hashes are "small enough". It really has no concept of what it is doing.

It is the mining software which sets up those "work units" and returns proofs of work. 
It is the pool (p2pool daemon for p2pool) which ensures the blockheaders work for both chains.
It may not be that simple. p2pool has its own, much faster parallel block chain which means that unless you want a whole bunch of stales you need to have a reasonably low latency miner. Last time I looked the BitForce singles had a latency of several seconds between finding a share and reporting it because they didn't return any shares until they'd completely processed every nonce in the work unit. That would absolutely wreck your p2pool profits. Because they also threw away any shares when they got a new work unit, this wouldn't necessarily show up as stales either - you just wouldn't find as many shares as you should.


Quad XC6SLX150 Board: 860 MHash/s or so.
SIGS ABOUT BUTTERFLY LABS ARE PAID ADS
grue
Legendary
*
Offline Offline

Activity: 2058
Merit: 1452



View Profile
February 26, 2012, 10:48:14 PM
 #164

I guess they didn't anticipate such huge demand, and the factory they've contracted with is just overloaded, plus the Chinese New Year holidays, plus this is the last year you can get 50 BTC per block =))). I hope they'll start production ASAP.
that was like, a month ago :rolleyes:

It is pitch black. You are likely to be eaten by a grue.

cablepair
Hero Member
*****
Offline Offline

Activity: 896
Merit: 1000


Buy this account on March-2019. New Owner here!!


View Profile WWW
February 26, 2012, 11:11:48 PM
 #165

i placed an order to pay with dwolla a few days ago, sent couple emails, PMs even called

trying to be patient... waiting for them to contact me

They're busy.  They obviously want your order, but I imagine it's full steam ahead trying to put together and post out all the pre-orders first.

I guess they didn't anticipate such huge demand, and the factory they've contracted with is just overloaded, plus the Chinese New Year holidays, plus this is the last year you can get 50 BTC per block =))). I hope they'll start production ASAP.

well can you call sunny back and tell him to call me! I've got $1300 burning a hole in my Dwolla and I want to get my order in the queue!

(i ordered almost 3 days ago now with no response...)
DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1079


Gerald Davis


View Profile
February 26, 2012, 11:50:43 PM
 #166

Does anyone know how difficult/easy these are to set up to do merged mining on P2Pool?

It shouldn't be any different.

The miner hardware (actual CPU/GPU/FPGA) simply gets a binary blob of data, combines it with a nonce that it increments, hashes it, and looks for nonces whose hashes are "small enough". It really has no concept of what it is doing.

It is the mining software which sets up those "work units" and returns proofs of work. 
It is the pool (p2pool daemon for p2pool) which ensures the blockheaders work for both chains.
It may not be that simple. p2pool has its own, much faster parallel block chain which means that unless you want a whole bunch of stales you need to have a reasonably low latency miner. Last time I looked the BitForce singles had a latency of several seconds between finding a share and reporting it because they didn't return any shares until they'd completely processed every nonce in the work unit. That would absolutely wreck your p2pool profits. Because they also threw away any shares when they got a new work unit, this wouldn't necessarily show up as stales either - you just wouldn't find as many shares as you should.

There is no reason an FPGA can't have an intensity value just like GPUs do.  That is all intensity is doing: rather than working on 2^32 hashes at once, it works on a smaller subset.

Of course these stales would show up.  They are still valid hashes, just stale, and a good miner (like the latest version of cgminer) will submit stales if the pool advises the miner to do so (which p2pool does).  So if the FPGA has too high a stale rate, the bitstream could be improved to work on a smaller subset of the nonce range.
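To make the "smaller subset of the nonce range" idea concrete, here is a rough sketch in C of how a miner could carve the 2^32 nonce space into equal slices. The struct, the function name and the `bits` parameter are made up for illustration; nothing here comes from cgminer or from the BFL firmware.

```c
#include <stddef.h>
#include <stdint.h>

/* One slice of the 32-bit nonce space: [start, start + count). */
struct nonce_slice {
    uint32_t start;
    uint64_t count;   /* 64-bit so that a full 2^32 range still fits */
};

/*
 * Hypothetical helper: split the 2^32 nonce space into slices of
 * 2^bits nonces each and fill in slice number `index`.
 * `bits` plays the role of an intensity-like setting (1..32).
 * Returns 0 on success, -1 if an argument is out of range.
 */
int nonce_slice_for(unsigned bits, uint64_t index, struct nonce_slice *out)
{
    if (bits < 1 || bits > 32 || out == NULL)
        return -1;

    uint64_t count  = 1ULL << bits;           /* nonces per slice */
    uint64_t slices = (1ULL << 32) >> bits;   /* how many slices exist */

    if (index >= slices)
        return -1;

    out->start = (uint32_t)(index * count);
    out->count = count;
    return 0;
}
```

With `bits = 24` there are 256 slices of about 16.8 million nonces each; with `bits = 32` there is a single slice covering the whole range, which is what the hardware appears to do today.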
jddebug
Sr. Member
****
Offline Offline

Activity: 446
Merit: 250



View Profile
February 27, 2012, 02:32:58 AM
 #167

I got an email back from Sonny the other day. He's saying late this week for my singles.

I run Mac. Wondering if cgminer will compile for me and support the singles?
makomk
Hero Member
*****
Offline Offline

Activity: 686
Merit: 564


View Profile
February 27, 2012, 10:23:39 AM
 #168

There is no reason an FPGA can't have an intensity value just like GPUs do.  That is all intensity is doing: rather than working on 2^32 hashes at once, it works on a smaller subset.
There's no reason why it couldn't in theory, but as far as I can tell it doesn't support one right now. Could presumably be added by a firmware upgrade in the future though.

Of course these stales would show up.  They are still valid hashes, just stale, and a good miner (like the latest version of cgminer) will submit stales if the pool advises the miner to do so (which p2pool does).  So if the FPGA has too high a stale rate, the bitstream could be improved to work on a smaller subset of the nonce range.
Looks like cgminer will do actually. There are basically two choices. You can carry on working on every work unit to completion even though you get a longpoll, which is what cgminer does, or you can submit a new work unit to the single and it'll discard any results it found so far for the old work unit and start working on the new one. Either way you lose out, it's just a question of whether you lose by throwing away shares or lose by working on work units that are stale.

Quad XC6SLX150 Board: 860 MHash/s or so.
SIGS ABOUT BUTTERFLY LABS ARE PAID ADS
kano
Legendary
*
Offline Offline

Activity: 4620
Merit: 1851


Linux since 1997 RedHat 4


View Profile
February 27, 2012, 01:54:24 PM
 #169

...
Looks like cgminer will do actually. There are basically two choices. You can carry on working on every work unit to completion even though you get a longpoll, which is what cgminer does, or you can submit a new work unit to the single and it'll discard any results it found so far for the old work unit and start working on the new one. Either way you lose out, it's just a question of whether you lose by throwing away shares or lose by working on work units that are stale.
Hmmm, that suggests something rather unexpected.
I guess I'll have to verify you're correct next time I see Luke-jr in IRC ... coz that is a waste as you suggest
I was pretty sure that the BFL can handle nonce ranges but I'm not sure now :P

If the rest below seems off-topic - it actually isn't coz the code for both BFL and Icarus is very similar.
The Icarus code is a butchered copy of the BFL code but redesigned to handle the way Icarus works
(and that will be enhanced more once I get my 2 Icarus in the next 4 or 5 days)

With the current firmware in the Icarus, it has the problem of only ever returning 1 nonce.
Now that would seem bad ... but looking at what you said above, it actually isn't all that bad after all ...

For an LP, when you overwrite the current work, you know you only have a very small chance of throwing anything away
(no share nonces exist before the current point in the full nonce range otherwise it would have returned one already)
It could be working on a share nonce at the time you overwrote it - but that's quite unlikely - and the chance of that would be 1 in (2^32 divided by however many nonces could be checked in the amount of time it would take to overwrite the current work)
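As a back-of-envelope check of that "1 in (2^32 divided by ...)" estimate, a tiny C helper is sketched below. The function name is invented, and the hash rate and overwrite time you feed it are your own guesses, not measured Icarus or BFL figures.

```c
/*
 * Chance that a share nonce falls inside the window that is lost while
 * the current work is being overwritten, using the estimate from the
 * post: (nonces checked during the overwrite) / 2^32.
 * Both parameters are assumptions supplied by the caller.
 */
double overwrite_loss_chance(double hashes_per_sec, double overwrite_sec)
{
    double nonces_in_window = hashes_per_sec * overwrite_sec;
    return nonces_in_window / 4294967296.0;   /* 2^32 */
}
```

For example, a hypothetical 400 MH/s device and a 5 ms overwrite give 2,000,000 / 2^32, i.e. somewhere around 1 in 2000 per overwrite.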

The other case (that would seem bad but isn't really either) is when you get a nonce reply from Icarus: it has stopped work and you start another work, and thus have only a very small (but different) chance of throwing anything away if both FPGAs happen to find an answer at the same time
The result of this is of course you on average halve your efficiency - but that doesn't really mean anything worthy of concern

Nonce ranges would reduce the impact on BFL (if they exist)
That just adds the extra overhead of starting each nonce range and a smaller nonce range is the maximum wasted processing time (if the pool doesn't accept stale shares)

... oh and lastly, in cgminer you can also manually enable stale share submission with the --submit-stale option if the pool doesn't pass that info to cgminer

Pool: https://kano.is - low 0.5% fee PPLNS 3 Days - Most reliable Solo with ONLY 0.5% fee   Bitcointalk thread: Forum
Discord support invite at https://kano.is/ Majority developer of the ckpool code - k for kano
The ONLY active original developer of cgminer. Original master git: https://github.com/kanoi/cgminer
DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1079


Gerald Davis


View Profile
February 27, 2012, 02:10:40 PM
 #170

Looks like cgminer will do actually. There are basically two choices. You can carry on working on every work unit to completion even though you get a longpoll, which is what cgminer does, or you can submit a new work unit to the single and it'll discard any results it found so far for the old work unit and start working on the new one.

I don't think you can do that.  You certainly can't do that with a GPU.  Once it starts working it is asynchronous.  It simply works until completed.  This is why setting cgminer to an insane intensity like 17 is foolish, as you have now loaded the GPU down with 4 billion hashes.  It will take 10 to 20 or more seconds to complete, and if a longpoll occurs cgminer can only wait as the GPU hashes away on worthless data.

Looking at the (very limited) data BFL provided in cgminer thread it looks like the BFL  singles can't be interrupted once they begin a cycle.

Quote
Either way you lose out, it's just a question of whether you lose by throwing away shares or lose by working on work units that are stale.
Well the former is much better than the latter.  Once LP occurs any work completed but not submitted is worthless.  That doesn't change, but what matters is how much MORE worthless work will be completed.  If you can interrupt the single, you do zero hashes of further worthless work; if you can't, then on average you will lose an entire "batch workload" on each LP.  I can't tell from the data BFL provides if BFL singles work on a full 2^32 in one batch or use a smaller fixed batch size.  The optimal solution would be a firmware which allows an "intensity-like value" so the miner can give the hardware a starting nonce and # of nonces to perform.
rjk
Sr. Member
****
Offline Offline

Activity: 448
Merit: 250


1ngldh


View Profile
February 27, 2012, 02:13:20 PM
 #171

Looking at the (very limited) data BFL provided in cgminer thread it looks like the BFL  singles can't be interrupted once they begin a cycle.
They posted in the cgminer thread to give specifics of the interface spec. One of the commands does allow the work in progress to be discarded immediately and begin again from new work. Sorry I don't have a link just at the moment.

EDIT: I found it:
    * BUSY
    (Device is busy processing a block. You can still issue a job, but the previous process will be
     discarded and new process will start)

Mining Rig Extraordinaire - the Trenton BPX6806 18-slot PCIe backplane [PICS] Dead project is dead, all hail the coming of the mighty ASIC!
DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1079


Gerald Davis


View Profile
February 27, 2012, 02:17:16 PM
 #172

Looking at the (very limited) data BFL provided in cgminer thread it looks like the BFL  singles can't be interrupted once they begin a cycle.
They posted in the cgminer thread to give specifics of the interface spec. One of the commands does allow the work in progress to be discarded immediately and begin again from new work. Sorry I don't have a link just at the moment.

So the only other missing piece of the puzzle is does BFL return shares as found or does it wait until end of nonce range.

If shares are returned as found then there is no "loss" you simply "reset" the BFL at each LP and will have perfect efficiency.  If shares are only returned at the end of the nonce range (i.e. it performs 2^32 nonces and returns all shares found) then there will be lower efficiency the shorter the LP interval is.

I got to find that post.
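A crude way to put numbers on "lower efficiency the shorter the LP interval is": assume the device is reset at every LP, only reports at the end of a batch, and LPs arrive at a fixed interval. Under those assumptions (mine, not anything BFL has confirmed) only the whole batches that fit into one LP interval count. The function name is made up.

```c
/*
 * Fraction of hashing time that produces reportable results when
 * results are only returned at the end of a batch and the device is
 * reset at every longpoll. Simplified model: fixed LP interval, each
 * LP starts a fresh batch, a partially finished batch is lost.
 */
double batch_efficiency(double lp_interval_sec, double batch_sec)
{
    if (lp_interval_sec <= 0.0 || batch_sec <= 0.0)
        return 0.0;

    /* Number of whole batches that complete before the next LP. */
    double full_batches =
        (double)(unsigned long long)(lp_interval_sec / batch_sec);

    return full_batches * batch_sec / lp_interval_sec;
}
```

With a batch of roughly 5.4 seconds, a hypothetical 10-second LP interval comes out at roughly 54%, while a 600-second interval comes out above 99%. If shares are returned as found, none of this applies.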
rjk
Sr. Member
****
Offline Offline

Activity: 448
Merit: 250


1ngldh


View Profile
February 27, 2012, 02:19:07 PM
 #173

Looking at the (very limited) data BFL provided in cgminer thread it looks like the BFL  singles can't be interrupted once they begin a cycle.
They posted in the cgminer thread to give specifics of the interface spec. One of the commands does allow the work in progress to be discarded immediately and begin again from new work. Sorry I don't have a link just at the moment.

So the only other missing piece of the puzzle is does BFL return shares as found or does it wait until end of nonce range.

If shares are returned as found then there is no "loss" you simply "reset" the BFL at each LP and will have perfect efficiency.  If shares are only returned at the end of the nonce range (i.e. it performs 2^32 nonces and returns all shares found) then there will be lower efficiency the shorter the LP interval is.

I got to find that post.
https://bitcointalk.org/index.php?topic=28402.msg692304#msg692304

Mining Rig Extraordinaire - the Trenton BPX6806 18-slot PCIe backplane [PICS] Dead project is dead, all hail the coming of the mighty ASIC!
DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1079


Gerald Davis


View Profile
February 27, 2012, 02:22:01 PM
Last edit: February 27, 2012, 02:33:59 PM by DeathAndTaxes
 #174

Looking at the (very limited) data BFL provided in cgminer thread it looks like the BFL  singles can't be interrupted once they begin a cycle.
They posted in the cgminer thread to give specifics of the interface spec. One of the commands does allow the work in progress to be discarded immediately and begin again from new work. Sorry I don't have a link just at the moment.

So the only other missing piece of the puzzle is does BFL return shares as found or does it wait until end of nonce range.

If shares are returned as found then there is no "loss" you simply "reset" the BFL at each LP and will have perfect efficiency.  If shares are only returned at the end of the nonce range (i.e. it performs 2^32 nonces and returns all shares found) then there will be lower efficiency the shorter the LP interval is.

I got to find that post.
https://bitcointalk.org/index.php?topic=28402.msg692304#msg692304

So the wording is vague but it looks like it only returns nonce at the end of the block interval
Code:
3) Checking for results
-------------------------------
After the sent job, driver must keep asking the device for status (10ms is preferred polling interval).
Status command is 'ZFX'. Once sent, the unit may respond with one of the predefined responses:
  
    * BUSY
    (Device is busy processing a block. You can still issue a job, but the previous process will be
     discarded and new process will start)

    * IDLE
    (Device is not processing anything)

    * NONCE-FOUND:<8 Hexadecimal characters defining the found Nonce>, <8 Hexa decimal of the second found nonce>...
    Note: The last nonce will not be terminated by a comma. The byte-ordering is Little-Endian
    Example: NONCE-FOUND:1234ABCD,2468EFAB,1111BBBB   (3 valid nonces are discovered in this process)

    * NO-NONCE
    Processing has been finished, no valid nonce was detected.

It doesn't look like you can provide a nonce start value or nonce range, so I am once again assuming (the spec is very vaguely worded) it works only on the full 2^32 nonce range.

This would indicate some valid work is being "left behind".
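For reference, the NONCE-FOUND reply format quoted above could be parsed roughly like this. This is only a sketch: the function name is invented, and treating each 8-character group as a plain hex number read left to right is an assumption about the byte ordering, since the spec wording is not clear on it.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a status reply such as "NONCE-FOUND:1234ABCD,2468EFAB".
 * Stores up to `max` nonces in `out` and returns how many were stored,
 * or -1 if the line is not a well-formed NONCE-FOUND reply.
 * Assumption: each group of 8 hex digits is the nonce value as written.
 */
int parse_nonce_found(const char *line, uint32_t *out, int max)
{
    static const char prefix[] = "NONCE-FOUND:";
    size_t plen = sizeof(prefix) - 1;

    if (strncmp(line, prefix, plen) != 0)
        return -1;

    const char *p = line + plen;
    int n = 0;

    while (n < max) {
        char hex[9];

        /* Each nonce must be exactly 8 hexadecimal characters. */
        for (int i = 0; i < 8; i++) {
            if (!isxdigit((unsigned char)p[i]))
                return -1;
            hex[i] = p[i];
        }
        hex[8] = '\0';

        out[n++] = (uint32_t)strtoul(hex, NULL, 16);
        p += 8;

        /* The last nonce is not followed by a comma. */
        if (*p != ',')
            break;
        p++;
    }
    return n;
}
```

Fed the example line from the spec, it returns 3 and fills in 0x1234ABCD, 0x2468EFAB and 0x1111BBBB; a "NO-NONCE" or "BUSY" line returns -1.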
rjk
Sr. Member
****
Offline Offline

Activity: 448
Merit: 250


1ngldh


View Profile
February 27, 2012, 02:32:12 PM
 #175

Since each chip is ~400 MH/s it is ~10 seconds to complete a "batch" and thus you would lose a lot of potential work with p2pool.  I wonder if BFL can field update the firmware to provide either a fixed smaller nonce range or a user-variable nonce range (2 values added to the start job command: starting-nonce = uint32 value, and nonce-range, where # of hashes = 2^(nonce-range)).
Remember according to Inaba's tests, it seems to be single threaded - it got 500% efficiency and took about 50 seconds to complete a block header from what I recall. So apparently the nonce is already being split between the 2 FPGAs.

Also, it appears according to the spec (at least how I read it) that nonces are returned as found, not at the end of a cycle. However this would need testing. The polling interval is 10ms, so if it continues working after finding nonces, there would be no issue with p2pool, as long as those nonces were gathered DURING the cycle, and not at the end. Again, needs testing.

Mining Rig Extraordinaire - the Trenton BPX6806 18-slot PCIe backplane [PICS] Dead project is dead, all hail the coming of the mighty ASIC!
DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1079


Gerald Davis


View Profile
February 27, 2012, 02:37:44 PM
Last edit: February 27, 2012, 03:32:00 PM by DeathAndTaxes
 #176

Also, it appears according to the spec (at least how I read it) that nonces are returned as found, not at the end of a cycle. However this would need testing. The polling interval is 10ms, so if it continues working after finding nonces, there would be no issue with p2pool, as long as those nonces were gathered DURING the cycle, and not at the end. Again, needs testing.

Given only four statuses:
BUSY
IDLE
NONCE-FOUND
NO-NONCE


If status changes to "NONCE-FOUND" when it finds a nonce how do you know when it is finished w/ the entire nonce range?

I would have imagined that if the device returned nonces as found it would have statuses like:
IDLE
BUSY-NO-NONCE
BUSY-NONCE-FOUND
FINISHED-NO-NONCE
FINISHED-NONCE-FOUND

Looking at the code it would appear it waits for NONCE-FOUND or NO-NONCE and then loads new data which would indicate that the status "BUSY" is used until batch completes.
https://github.com/ckolivas/cgminer/blob/master/bitforce.c

Code:
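	while (1) {
		/* Poll the device status with the ZFX command. */
		BFwrite(fdDev, "ZFX", 3);
		BFgets(pdevbuf, sizeof(pdevbuf), fdDev);
		if (unlikely(!pdevbuf[0])) {
			applog(LOG_ERR, "Error reading from BitForce (ZFX)");
			return 0;
		}
		/* Keep polling only while the reply starts with 'B' (BUSY). */
		if (pdevbuf[0] != 'B')
			break;
		usleep(10000);
		i += 10;
	}
	applog(LOG_DEBUG, "BitForce waited %dms until %s\n", i, pdevbuf);
	work->blk.nonce = 0xffffffff;
	/* "NO-NONCE" is the only reply with '-' as its third character. */
	if (pdevbuf[2] == '-')
		return 0xffffffff;
	else if (strncasecmp(pdevbuf, "NONCE-FOUND", 11)) {
		/* Anything else that is not NONCE-FOUND is treated as an error. */
		applog(LOG_ERR, "BitForce result reports: %s", pdevbuf);
		return 0;
	}

	/* Skip past "NONCE-FOUND:" to the first 8-character nonce. */
	pnoncebuf = &pdevbuf[12];

	while (1) {
		hex2bin((void*)&nonce, pnoncebuf, 4);
#ifndef __BIG_ENDIAN__
		nonce = swab32(nonce);
#endif

		submit_nonce(thr, work, nonce);
		/* Nonces are comma separated; stop after the last one. */
		if (pnoncebuf[8] != ',')
			break;
		pnoncebuf += 9;
	}

	return 0xffffffff;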

I agree though: if the status changes to "NONCE-FOUND" as soon as a nonce is found then there are no issues w/ p2pool (or other shorter LP intervals).
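For comparison with the single-character checks in the cgminer excerpt, the four documented replies could also be told apart by matching the whole keyword. A sketch only; the enum and function names are invented and are not part of cgminer.

```c
#include <string.h>

/* The four replies listed in the spec, plus a catch-all. */
enum bf_status {
    BF_BUSY,
    BF_IDLE,
    BF_NONCE_FOUND,
    BF_NO_NONCE,
    BF_UNKNOWN
};

/* Classify a ZFX status reply by its leading keyword. */
enum bf_status bf_classify(const char *reply)
{
    if (strncmp(reply, "BUSY", 4) == 0)
        return BF_BUSY;
    if (strncmp(reply, "IDLE", 4) == 0)
        return BF_IDLE;
    if (strncmp(reply, "NONCE-FOUND", 11) == 0)
        return BF_NONCE_FOUND;
    if (strncmp(reply, "NO-NONCE", 8) == 0)
        return BF_NO_NONCE;
    return BF_UNKNOWN;
}
```

Note that with only these four states there is still no way to tell "found a nonce, still working" from "found a nonce, finished", which is exactly the ambiguity being discussed.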
zefir
Donator
Hero Member
*
Offline Offline

Activity: 919
Merit: 1000



View Profile
February 27, 2012, 05:12:29 PM
 #177

One thing to consider is the overhead for queuing work and checking results: assume the serial communication goes over 115.2kbps 8N1, sending a job request takes around 4ms, if you added starting nonce and length about 5ms. Checking for results adds another ms.

At 800 MH/s the BitForce needs about 5 secs for the whole nonce range. Splitting up the work into e.g. 64 chunks to get latency down to 80ms will cost you about 350ms for communication. That's 7% of the total time your BFL sits idle... Might still be worth considering as a means to prevent chips from running hot ;)
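The arithmetic above can be reproduced with a few lines of C. The function names are invented, the byte counts you pass in are guesses, and the model assumes the chip sits idle for the whole time a job is being sent and a result is being read.

```c
/* Time in ms to move `bytes` over a serial link at `baud`, 8N1
 * (start bit + 8 data bits + stop bit = 10 bits on the wire per byte). */
double serial_ms(double bytes, double baud)
{
    return bytes * 10.0 / baud * 1000.0;
}

/* Seconds needed to hash the full 2^32 nonce range at a given rate. */
double full_range_sec(double hashes_per_sec)
{
    return 4294967296.0 / hashes_per_sec;
}

/* Communication time as a fraction of the hashing time for one full
 * range, when the range is split into `chunks` pieces and each piece
 * costs `per_chunk_ms` of serial traffic (job request + result check). */
double comm_overhead_fraction(int chunks, double per_chunk_ms,
                              double hashes_per_sec)
{
    double total_comm_sec = chunks * per_chunk_ms / 1000.0;
    return total_comm_sec / full_range_sec(hashes_per_sec);
}
```

At 115200 baud a 46-byte message (an assumed size) takes about 4 ms, and 64 chunks at 6 ms each against a roughly 5.4-second full range works out to about 7%, in line with the figures in the post.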

jamesg
VIP
Legendary
*
Offline Offline

Activity: 1358
Merit: 1000


AKA: gigavps


View Profile
February 27, 2012, 05:23:11 PM
 #178

I have been watching this thread and want to thank everyone who has participated lately. It's great to see discussion about if/how to improve the BFL code and firmware without all of the BFL bashing going on.
pieppiep
Hero Member
*****
Offline Offline

Activity: 1596
Merit: 502


View Profile
February 27, 2012, 05:39:55 PM
 #179

One thing to consider is the overhead for queuing work and checking results: assume the serial communication goes over 115.2kbps 8N1, sending a job request takes around 4ms, if you added starting nonce and length about 5ms. Checking for results adds another ms.

At 800 MH/s the BitForce needs about 5 secs for the whole nonce range. Splitting up the work into e.g. 64 chunks to get latency down to 80ms will cost you about 350ms for communication. That's 7% of the total time your BFL sits idle... Might still be worth considering as a means to prevent chips from running hot ;)
Unless the input/output is buffered before sent to the sha256 engine.
In that case it doesn't matter how often you send new work.
cablepair
Hero Member
*****
Offline Offline

Activity: 896
Merit: 1000


Buy this account on March-2019. New Owner here!!


View Profile WWW
February 27, 2012, 06:01:54 PM
 #180

I have been watching this thread and want to thank everyone who has participated lately. It's great to see discussion about if/how to improve the BFL code and firmware without all of the BFL bashing going on.

Gigavps: any tips on how I can actually order their products bro? I placed an order four days ago, sent multiple emails and calls, and I can't get anyone from BFL to even talk to me. I know they must be busy but damn... Is it that hard for them to set up an order and take my money?