Bitcoin Forum
  Show Posts
1  Bitcoin / Pools / Re: BTCGuild and it's relation to DDoS attackers on: October 20, 2011, 06:36:01 PM
Man this is a real shitty situation.  Personally I think what he did ( switching the DNS to btcG ) was a bit grey-area and possibly immoral, but without it he/we wouldn't have gotten the piece of information that the attackers weren't hitting btcG, at least not yet.  I think he explained fairly well his reasoning for this, and I don't fault him for his decisions.  It may have been a better idea to try to talk it out between the operators before doing something like that, but I think that's another discussion.

My opinion on the whole btcG deal with the devil to avoid getting his servers DDoS'd - that to me is the more despicable and greater evil.  The second any of you pool operators agrees to a deal like this, it ENSURES that this will continue for a lot longer.  You can't collude with the enemy in a situation like this; it's the same reason you don't give in to terrorist demands.  You have essentially enabled these criminals to continue what they are doing, because they know it will work now.  The only way to deal with these criminals is the hard way - you ban them every chance you get, and you accept that your pools get DDoS'd and taken down.

I think if any of this were ever to be taken to court in a criminal case, the pool operators who have colluded with these criminals could potentially be seen as accomplices to these crimes.  Without the pools, these botnets would have nowhere to connect and pool mine.  These pools are a great way for these botnets to essentially "wash" their mining.  Push their mining to a large pooled mining operation ( a "safe" IP ) and it's much harder to track who these people are and where they are coming from.  It's one thing to blindly not know about their operations, but it's completely another to hash out a deal with these people and cave in to this extortion.

The second any one of you pool operators caves to their pressure, you have enabled these criminals to continue what they are doing, to the detriment of the entire BTC community.

The only problem I see is if the big 3 pools come together and draw a line in the sand, it may just push these botnets to smaller pools, potentially clearing all the smaller pools out and emphasizing the problem of too-large pools ( > 51% ).

But what would you guys rather do?  Cave in to these criminals, "paying" them protection in the form of pooling services?  Or make the BTC community a better place by showing these bastards that this is no place for botnet mining?

I guess the other "problem" here is that it's a lot of hash rate, and with pool fees and whatnot, it may be too sweet of a deal to the pool operators to collude with these criminals.
2  Other / Archival / Re: delete on: October 11, 2011, 10:57:44 PM
If you can prove, like you said, Then STFU and quit bullying people with your MOD status.

Yes, it was outright bullying, and RandyFolds is spot on.

Personally I believe that BCX has the ability to do this.  Has he actually done anything?  That I'm unsure about.

If BCX can successfully do a 51% double spend - then I'd love to see the transaction history of that occurring.

Personally I don't think the onus is on Maged to prove that he didn't attack... I think the onus is on BCX to prove that he did do an attack.  Or whoever made SC needs to prove that no attack has taken place...

Unless I'm wrong in assuming that Maged has nothing to do officially with SC?

And all of this attack vector stuff - if SC was as big as BTC, would a 51% attack really be feasible?  I think it's a bit apples-to-oranges to compare BTC to SC when the usage/acceptance rates they are experiencing are vastly different.

I really don't think what I've heard of SC describes something that I would "invest" in or follow; the solutions SC has offered don't really solve the problems in an acceptable fashion.  One of the key points of BTC is its decentralized nature - and this is one of the things SC has had to compromise to protect against a 51% attack.  That compromise alone in my mind defeats the very nature of BTC.  As a cryptocurrency SC may be very valid and may have a use, but as a decentralized cryptocurrency it does not.  Who controls these "trusted nodes"?  Why are they trusted?  Why should I trust this unknown anonymous central authority?

I still haven't seen a single alternative BTC chain that has something useful, actually useful, to offer.  BTC is hardly useful in itself right now, and yet it seems everyone is spending their energy trying to beat BTC.... I think what people need to do first is get BTC to a more generally accepted place.  If design considerations are a problem keeping BTC from gaining that widespread usage, then I may try to hop on some alternative chain.  But the only reason to do that would be that the chain has some sort of feature that would allow more widespread usage.  Maybe there are uses for side-by-side alt chains, but in my mind the use cases for these things are much smaller and much more focused and niche-like.  Things like faster block generation in my mind are good for things like confirmations - near-instant confirmations.

What is it that SC has to offer feature wise that makes it so much better than BTC?  And at what cost?
3  Other / Archival / Re: delete on: October 11, 2011, 07:15:50 PM
Diff is at 2791 now... indicating somewhere between your 10% and 15%.... try 13%, I think I read that somewhere as the "official" number.
Yup, the only thing broken in diff adjustment is its design. But otherwise it "works" as intended.
so it sounds like difficulty is increasing as planned.  The problem was the initial design having an initial difficulty of 1, and that doesn't seem like that much of a problem.  It does heavily favor whoever was mining at these initial low difficulty rates, since throwing huge amounts of processing power at the chain would not increase difficulty much.

If the initial difficulty was 10, the difficulty right now would be 10x what it is now.  If it started at 100 it would be 100x what it is now.  This was probably an unplanned problem due to the variables stated above.

The only problem I see with having the hard 10-15% difficulty increase cap is that the chain does not adjust well to large amounts of processing power being added.  That, coupled with the uneven % increases in the negative direction, and the "attack" mentioned a few posts back of strobing/throttling a large botnet onto the chain, seems like it could become a problem.  Not so much a huge problem, just that a large amount of processing power could be used on the chain for a longer amount of time due to the retargeting scheme - which would mean potential attackers could "harvest" more out of the chain without affecting the difficulty as much.

I also don't think the huge # of stales at the start was an attack.  It sounds more like the extremely low difficulty of 1, coupled with a huge influx of unexpected miners, caused a lot more problems than expected.  If you are generating a block every 1s, it's going to be real hard to keep the whole network on the same chain.  Propagation time alone for the blocks would probably take longer than generating a new block - and that seems to have been the main issue at the start of this chain.

As for central authority and % skimming of generated blocks that's a whole other issue not related to security, and more related to trust and design.  As people have been posting, that really is an issue with trust and with who controls that wallet.

Claims of 51% attack prevention - I don't believe there is enough evidence or proof here to refute or back up those claims.  I'd be wary of assuming it's true without proof, but I also wouldn't go out of my way to say it's not there....  There is no proof for either side, just claims....  Show me a 51% network-control double spend and I'll believe it doesn't have protection, or show us/explain to us in the code how a 51% attack couldn't succeed.  But without proof on either side, my bias would be to assume it isn't true, because there's no gain, only potential loss, in assuming it is true without proof.

edit: difficulty started at 8? or you could think of it as difficulty starting at 6.03 after first retarget?
6 * (1.13^69) = 27578.0688
4  Other / Archival / Re: delete on: October 11, 2011, 06:57:29 PM
The sources will be published, just some more patience...

Do you mind if I ask what reason you have to place so much trust in CoinHunter/RealSolid?

Satoshi did not ask nor want that any trust be placed in him, the code was open from day one.
It's a hard sell for me reading this thread.  I'm not saying CH is out to scam anyone, but he is relying on the blind trust of an anonymous internet user to back his claims... and to me that's just not enough....  Back up your claims of features, and they can be taken as features - otherwise it's all just talk.  Proof = backing, not talk
5  Other / Archival / Re: delete on: October 11, 2011, 06:54:25 PM
Now let's finish the math... at a 10 to 15% capped increase.... the first many adjustments were up by single digit values.... now we are in the double digit increases.  Go back home and pull out your math book and turn to the section on percentages again johny

For reference, Bitcoin difficulty increased roughly 18x (~100k -> 1.8m) between April and August, which took ~15 difficulty adjustments, while still roughly maintaining target block generation.  SC 2.0 has yet to even reach target block generation after 70 adjustments - it's still only 1/17th of where it should be.

Why do you insist on providing proof that you don't understand basic simple % calculations?
My next question is this:

If SC 2.0 handles rising difficulty so poorly, how is it going to handle falling difficulty?

Didn't SC 1.0 suffer 2-week blocks after the mandatory voluntary shutdown?
The MAIN FEATURE of SC1 was the fast responding difficulty. That continues in SC2.
Obviously the slow rising difficulty continues too.

So in answer to your question......Just Fine

12hrs in, 17k blocks found, and it's still at 6s/blk?  If it's retargeting every 240 blocks, that's 70 'adjustments' it's had, and still can't break 10s/blk.

My question remains.

So hold on, let me get this straight - I'd like you to show me how the % increase is working right now.

What's the difficulty right now?

Here's my calculation of 70 max adjustments starting from difficulty 1:
http://www.google.com/search?q=1*1.15^70

1 * 1.15^70 = 17735.72

so using max % adjustments, the difficulty should currently be at ~17k.... what is it at right now?

So let's use the "max" as 10% instead:

1 * (1.10^70) = 789.746957

hrm... still 789 there... what's the difficulty right now?

unless someone is doing a massive strobing of hash power at each retarget, I just don't see how the difficulty is increasing correctly...

someone correct my calculations, I don't know the exact state of things, just read through this thread
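
To put the calculations above into one formula ( my notation - c is the per-retarget cap, n the number of retargets, and this assumes the cap binds at every retarget ):

$D_n \le D_0 \, (1+c)^n$

so with $D_0 = 1$: $1.15^{70} \approx 17736$ and $1.10^{70} \approx 790$ - the two numbers computed above.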
6  Bitcoin / Hardware / Re: Official Open Source FPGA Bitcoin Miner (Smaller Devices Now Supported!) on: July 28, 2011, 04:22:08 AM
just got off a long day of work, brain hurts, so I haven't really gone over what you've written here so far, but skimming it, it definitely seems like it will help me understand everything.  When I get some free time I'll try to analyze it and work it all out, maybe even draw a single-cell diagram for anyone else trying to wrap their heads around it.

On another note - the 2-engine design I left running finally finished, and failed.  It ran out of placement sites - said it was short like 1k FFs and 3k LUTs to implement the design - and this was before routing, so it may not even have been able to attain 100 MHz there.  I am running it again, but it may take 2-3 days again - this time I've enabled some more aggressive optimizations in the Xilinx ISE.

We might have to settle for having 1 fully unrolled engine, and then 1 LOOP_LOG2=1 engine running and that should hopefully get to ~150MHz pretty easily.

Has anyone else gotten a 100MHz version working?  Should I compile a bit file for someone to test?  I hate flying blind without a target device to test on.
7  Bitcoin / Development & Technical Discussion / Re: Modular FPGA Miner Hardware Design Development on: July 28, 2011, 04:00:08 AM
li_gangyi: do you think there will be an inrush current issue with the voltage regulators you suggested?

The board I work on at work used to have issues with the regulators locking up if they could not grab a lot of amperage on power up.  In running mode, the system only sucks like under 1 A @ 24 V, but when you power it on, if you limit the current at the power supply to somewhere around 2-3 A, it will lock up the regulators, and they never get to their full voltage.  Our system has a bit more stuff on it that's sucking power on boot, like a DSP, FPGA, a bunch of 24-bit A2Ds, and probably a handful of other stuff I'm forgetting at the moment - and maybe that's part of the problem, all the chips turning on at the same time.  But I'm unsure if this is typical for voltage regulators, and I want to be sure that once this board gets fabbed/assembled, it will actually turn on without some deadbugging.

for the clocks - I don't think phase synchronization between the two FPGAs is an issue - they shouldn't be communicating with each other, so it doesn't matter if the phase relationship between the two FPGAs is synchronous - everything will be talking to the MCU.  Personally I would say go with 2 clock crystals, as you'll get a better clock signal this way - and we don't gain much from having the 2 clocks in phase, other than saving PCB area, routing, and crystal cost.  Having a pure clock signal with low jitter and an even duty cycle will prevent the FPGA from fucking up, because to get a fast hash engine out of the FPGA we are going to have to push it to its limit.  For this reason alone I would say go for 2 clock crystals.  But then again, if you guys are sure you won't get a polluted clock signal the single-crystal way, then it is viable.

Also, if you guys add any more IO connections to the FPGA - keep them all in the same bank.  I believe the Spartan6 has restrictions on which signals can be used in which clock domains based on quadrant.  You can get past the ISE by forcing it, but it's typically not good practice, because the tools will have to route that one signal a long distance, which will impact max core speed.  But the FPGA should have more than enough IO on that one Bank 2 you guys are already using.
8  Bitcoin / Hardware / Re: Official Open Source FPGA Bitcoin Miner (Smaller Devices Now Supported!) on: July 27, 2011, 02:07:43 AM
ok, so I'm not crazy - pre-routing got me to 140% LUTs, and the ISE has been at the map stage for something like 18 hours so far.... gonna let it run, I got a quad core, so doing parallel compiles is feasible.  The first global placement run took 4 hours for me.... and it's been stuck on the 2nd global placement run now - probably around 19 hours total running time so far... what's sad is this is the Map phase... Place and Route has yet to run =(

The other thing I've been told is your FPGA should never really exceed 60-70% usage pre-routing.... because a lot of resources are needed to get high-speed routing done...  And trying to pack 2 engines in there is likely nearing more like 90% usage...

in terms of pipelining, it's not so much the Hasher blocks I don't understand, it's the signals feeding into them.... For example, what are the different-length shift registers for?  What is the definition of cur_w#, and why do they need different lengths - or more specifically, what do the specific lengths correlate to?  The previous hasher's output?  And in a fully unrolled loop - why are shift registers even necessary?  Shouldn't each hasher's digester essentially have the "register" of the state in there?
edit: ohh wait nm, that's the message scheduler!

Maybe not a full block diagram outlining all the pipelined stages, but more of a "cell" diagram of a Hasher in terms of i.  E.g. Hasher i has connected to its input Hasher i-1's cur_w0.  Something like that might help me figure out exactly what's going on.

And I guess that loops into your question on signal naming.  Personally I would have some sort of prefix on every wire/reg.  In VHDL there is no distinction, and the behavior ( wire or reg ) is inferred from the design - e.g. a signal assigned a value in a clocked process ==> register - and I usually prefix all my signal names with sig_XXXXX.  One of the problems with single-letter variable names is that they are impossible to search a document for.  So you have a variable K - want to see how hard it is to search a document for references to the letter K?  If every K was instead sig_K, it would be much easier to find the references.  Basically, any single-letter variable name IMO is bad.

Some of the other signal names might be a mix of non-descriptive naming + my inexperience with the SHA algorithm.  For example, wtf does cur_w1 mean?  I understand _fb = feedback.  But I don't know what w1 or w0 or w14 or w9 do.  Also, I'm unsure what a _w means, or a _w1, or a _t1.  Or the cur_ prefix - not exactly sure what that means either.

And although it may be easy and quick to type, the shift register definition also has 2 single-letter registers, r and m - and this one isn't as bad because that stuff is internal, but imagine what a pain in the ass it is when you get a synthesis info/warning about some variable m - now I gotta search through all the source files by hand to look for a register m, because I can't just search for "m" in all the documents and get anything useful...

It might also help to organize the wire/reg definitions a little bit better.  The way it is now, definitions are strewn throughout the code.  I always prefer having my wire/reg/input/output declarations at the top of the module, like in software coding.  It may also help to separate out the modules a little bit more.  The sha256_transform is so complicated already - maybe move things like the digester or shift registers out of that source file, so the root sha256_transform module is more of a connectivity/hierarchy module defining the structure, not the function, of the sha256 transform.

But truthfully my understanding of the sha256 algorithm and its pipelined version is probably a little bit lacking, and that is not helping me understand the code/flow.
9  Bitcoin / Hardware / Re: Official Open Source FPGA Bitcoin Miner (Smaller Devices Now Supported!) on: July 26, 2011, 11:12:56 PM
good point, I should just be mapping the K's directly into the hashers as constants

I was able to achieve 100MHz and route it with the current design, slightly modified.  Changed the PLL to output clk0 so there's no clock division, and I changed the K/K_next assignments to what I described in my previous post.  This routes to a min clock period of 9.742ns for me.

It also seems like the worst case critical path related to the 100MHz clock is between Hasher[8] and Hasher[13], looks like an output of an adder in Hasher[8], adding rx_state and k_next gets registered into Hasher[13]'s shift_w1 register.

I also don't like this chipscope in here heh - it's treating the signals you are looking at as clocks and thus routes them on BUFGs through BUFGMUXs.  It's low fanout so it doesn't matter that much, but from what I'm used to with FPGA design, you don't really want to be using non-clock signals as clocks, i.e. using these signals in edge logic, a la always @ posedge(some_sig).  I should probably see how many resources these chipscope modules are taking up as well....
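
To illustrate the pattern I mean ( a toy example of mine, not code from the miner ): instead of clocking logic off a data signal, sample it in the real clock domain and detect the edge synchronously.
Code:
// Bad: using a data signal in edge logic promotes it onto clock routing:
//   always @ (posedge some_sig) count <= count + 1;
// Better: register the signal in the clk domain and detect the 0->1 edge.
module edge_count (
    input            clk,
    input            some_sig,
    output reg [7:0] count = 0
);
    reg some_sig_d = 0;
    always @ (posedge clk) begin
        some_sig_d <= some_sig;          // sample in the clk domain
        if (some_sig && !some_sig_d)     // synchronous rising-edge detect
            count <= count + 1;
    end
endmodule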

it'd be nice if there was a higher-level timing diagram/pipeline diagram for this process.  I'd love to know what exactly each unit should be doing at any one time, e.g. which "nonce"/hash block X is currently working on.

it's taken me a bit of time to figure out what's going where, and it really really really makes it hard to read and figure out what signals are what with the plethora of 1-letter signal/wire names....


and hrm... the 2 engine design still tells me it's 140% LUTs lol.... and this is without getting the RAM => LUT message.  BRAM/FIFO usage is 202/268.... hrm.... I wonder if this compile will also last 12+ hours
10  Bitcoin / Hardware / Re: Official Open Source FPGA Bitcoin Miner (Smaller Devices Now Supported!) on: July 26, 2011, 07:38:56 PM
hrm.... yeah been doing more testing... and it seems like I have high LUT usage because some of the "RAM" is being inferred as LUTs?

do you get any of these messages when you compile?
Quote
INFO:Xst:3218 - HDL ADVISOR - The RAM <Mram_HASHERS[0].K> will be implemented on LUTs either because you have described an asynchronous read or because of currently unsupported block RAM features. If you have described an asynchronous read, making it synchronous would allow you to take advantage of available block RAM resources, for optimized device usage and improved timings. Please refer to your documentation for coding guidelines.
    -----------------------------------------------------------------------
    | ram_type           | Distributed                         |          |
    -----------------------------------------------------------------------
    | Port A                                                              |
    |     aspect ratio   | 64-word x 32-bit                    |          |
    |     weA            | connected to signal <GND>           | high     |
    |     addrA          | connected to signal <n1055>         |          |
    |     diA            | connected to signal <GND>           |          |
    |     doA            | connected to signal <HASHERS[0].K>  |          |
    -----------------------------------------------------------------------

really odd....  it's not happening to all of the sha_transform modules though... it only seems to be one.... the 2nd one with the NUM_ROUNDS set to 61 it appears


also, I see things like this when it's synthesizing:
Quote
   Found 6x6-bit multiplier for signal <n1055> created at line 120.
    Found 6x32-bit multiplier for signal <n1057> created at line 127.
line 120 is:
Quote
assign K = Ks_mem[(NUM_ROUNDS/LOOP)*cnt+i];
line 127 is:
Quote
assign K_next = Ks_mem[(NUM_ROUNDS/LOOP)*((cnt+i) & (LOOP-1)) +i+1];
hrm... there has to be a better way to use those generate blocks so these values aren't parsed as signals/wires to be used at runtime, but rather generated as constant integers or lookup tables/muxes...

edit: update
if I use this for the K and K_next assignment when LOOP == 1, I don't get the LUT messages anymore:
Quote
`ifdef USE_RAM_FOR_KS
         if ( LOOP == 1) begin
            assign K = Ks_mem[ i ];
            assign K_next = Ks_mem[ i + 1 ];
         end else begin
...
I think the problem is that K and K_next are not assigned in a clocked block, so they become asynchronous combinatorial logic - and XST can't map that to a ROM?  Or maybe it's the use of a multiplier output as an address selector?  Something in there XST wasn't liking for me.
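
If that theory is right, the generic XST fix ( standard pattern - I haven't tried it against this exact design ) is to register the read so the memory has a synchronous port:
Code:
// Asynchronous read - XST implements the memory in distributed (LUT) RAM:
//   assign K = Ks_mem[k_addr];
// Synchronous read - eligible for block RAM inference, per INFO:Xst:3218.
// k_addr is a placeholder for whatever the cnt/i expression computes, and
// Ks_mem is assumed to be initialized elsewhere.
reg [31:0] Ks_mem [0:63];   // round-constant table
reg [ 5:0] k_addr;
reg [31:0] K_reg;
always @ (posedge clk)
    K_reg <= Ks_mem[k_addr];  // adds a cycle of latency the pipeline must absorb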

also, it seems the 1st round synthesizes much differently?
for the first sha block I get this:
Quote
   Summary:
   inferred  10 Adder/Subtractor(s).
   inferred 551 D-type flip-flop(s).
   inferred  17 Multiplexer(s).
Unit <sha256_transform_1> synthesized.

for the 2nd block I get this:
Quote
   Summary:
   inferred  62 RAM(s).
   inferred   2 Multiplier(s).
   inferred  63 Adder/Subtractor(s).
   inferred 295 D-type flip-flop(s).
   inferred  17 Multiplexer(s).
Unit <sha256_transform_2> synthesized.

why are these so different!?

first off, are they sharing the RAM for the K's?  It seems only the K's for the 2nd block are generated, but Xilinx might be optimizing across the hierarchy here.  But what about the # of adders/subtractors!?  Only 10 in the first block - how can that be?  Or is it shifting the position of the adders from the digester up to the higher module?


I also see this:
Quote
Synthesizing Unit <shifter_32b_9>.
    Related source file is "e:/bitcoin/lx150_makomk_test/hdl/sha256_transform.v".
        LENGTH = 8
WARNING:Xst:3035 - Index value(s) does not match array range for signal <m>, simulation mismatch.
which relates to the shift register code:
Quote
      reg [31:0] m[0:(LENGTH-2)];
      always @ (posedge clk)
      begin
         addr <= (addr + 1) % (LENGTH - 1);

now when I look at that, I'm not sure if that's correct, so let's say LENGTH = 8.  The first line says create a 32-bit register array with (8-2+1) elements, so 7 elements, but the addr modulus wraps around at 7 - e.g. once ( addr + 1 ) == 7, then addr becomes 0, not 7.  So we are missing the last element of the shift register.

I think this is just an indexing problem - LENGTH = 8 means 8 elements in the shift register, so you want reg [31:0] m[0:7], or reg [31:0] m[0:(LENGTH-1)].  Then below, on the addr assignment, you would want addr <= ( addr + 1 ) % ( LENGTH ), because using a LENGTH of 8, xxx % 8 will always return a value inclusively between 0 and 7.

Not sure how this is even working with one of the shift registers effectively 1 element short....
edit: seems if I "fix" this, it breaks it heh..... I need to look into this
ok, another edit update: it seems this code is correct because you also have a 32-bit register r in there that's separate from the m storage array.  And that also explains the different synthesis for this module.  It's using a RAM, a 32-bit register r, a 3-bit register addr, and a 9-bit adder for the next address, as opposed to just LENGTH*32 registers/FFs for the other types of shift registers... not sure which one is better here
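
For my own sanity, here's how I now read that module ( a sketch with my own port names, assuming read-before-write ordering on m ): the m array buffers a word for LENGTH-1 cycles and the output register r adds one more stage, so the total delay matches a LENGTH-deep flip-flop chain.
Code:
module shifter_32b #(parameter LENGTH = 8) (
    input             clk,
    input      [31:0] val_in,
    output reg [31:0] r                       // the separate output register
);
    reg [31:0] m [0:LENGTH-2];                // LENGTH-1 words of circular storage
    reg [ 4:0] addr = 0;                      // wide enough for LENGTH up to 32

    always @ (posedge clk) begin
        addr    <= (addr + 1) % (LENGTH - 1); // wraps over 0 .. LENGTH-2
        r       <= m[addr];                   // read the word written LENGTH-1 cycles ago
        m[addr] <= val_in;                    // then overwrite that slot
    end
endmodule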


on another note, I placed 2 cores ( 4 sha256 transforms ) into the design, it said I was using 140% LUTs, but it's still trying to route it right now?  It's been running for over 12 hours though....
11  Bitcoin / Hardware / Re: Official Open Source FPGA Bitcoin Miner (Smaller Devices Now Supported!) on: July 25, 2011, 02:31:42 AM
tried popping in two more sha cores to get 2 engines running ( fully unrolled ), ISE spit out this:
Quote
Slice Logic Utilization:
 Number of Slice Registers:           92543  out of  184304    50%  
 Number of Slice LUTs:                121337  out of  92152   131% (*)
    Number used as Logic:             113389  out of  92152   123% (*)
    Number used as Memory:             7948  out of  21680    36%  
       Number used as SRL:             7948

so it looks like, without a little bit of massaging, the current design uses up a bit more resources than we have....

I'm gonna try 2 hashing engines ( 4 cores ) running at LOOP_LOG2=1 - that should be able to fit, and then I'll see how far I can scale up the clocking while keeping it routable
12  Bitcoin / Hardware / Re: Official Open Source FPGA Bitcoin Miner (Smaller Devices Now Supported!) on: July 24, 2011, 05:39:24 PM
great reply, thanks, I have it successfully generating a golden nonce now in simulation, awesome.

I'm going to toy around and see if I can get this running faster than 100 MHz - or rather, routing at faster than 100 MHz.  I'm liking that ISE 13 has multi-core support for stuff like routing and simulation now!
13  Other / Beginners & Help / Re: First commercial ASIC miner specifications and pre-launch on: July 24, 2011, 03:11:14 PM
I think this smells like a scam.  I can't find the Google page right now, but some of their last press release was pretty much lifted from a textbook talking about a SHA core implementation based on a Pilchard design - or maybe that was the Pilchard design itself.  Even things like the polling communication.  If it helps anyone, I was googling for Pilchard, and I ended up on a Google Books page, and after reading the section that came up, it literally was almost word for word the same as their latest news description.

On the other hand, there are "ASIC"s out there that are literally just stripped-down FPGAs.  For example, you can design something on a Xilinx FPGA, and then once your design is set in stone, you can actually send it off to Xilinx to make "ASIC"s from that FPGA design.  It's basically just an FPGA without the FP ( Field Programmable Gate Array => Gate Array ).  It's not as cheap as a real ASIC, but it's definitely cheaper than an FPGA - per unit price.  Xilinx also expects volumes in the thousands if not millions to even provide the service.  But I could see cheaper Chinese companies doing this type of thing.  So I don't think it's out of this world for this product to be an "ASIC" in that sense of the definition.

Now back to the scam side.  480-500 MHash/s just sounds unfathomable for what I described above.  To obtain these types of hash rates you would need either multiple SHA cores running in parallel ( AND pipelined - I just don't see enough cores getting packed in ), or one core running at extremely fast clock rates.  I just don't see either of these happening in an FPGA-type ASIC.  I would expect maybe something on the order of 100-300 MHash.  As said earlier, you can't really do much to optimize the math/logic that goes on in the algorithm.

I also find it funny that they think using some sort of external memory could possibly be faster than using the on-chip BRAMs.  Or, if they are using the internal BRAMs, how they could possibly think that doing something like this could get them anywhere near such an efficiency boost in terms of clocking.  The critical path in the SHA logic is not the constant inputs - it's the adders.  32-bit adders that need to execute in one clock mean you have a 32-bit carry chain to deal with, and I would assume that is going to be the critical path here - 32-bit adder carry chains with inputs that need to be xor'd.

From the little information he has provided, IMO it just seems fake and, I hate to say it, very Chinese.  I love his reactions in that reddit thread - people criticize the legitimacy and he retorts with "FINE MAYBE WE SHOULDN'T GIVE YOU ANY INFORMATION" - LO-fucking-L.  I don't know who is still interested in this vaporware, but it's definitely got me not interested - who would want to deal with someone like this?

I'm not even sure if photos/videos of this thing "working" would cut it at this point.  Maybe if it was spur of the moment - someone asks for a pic/video of this thing running, and 5 minutes later these guys post a response because they actually have it running live.  But the way it is now, I'd expect any further "proof" to be highly constructed/photoshopped/faked.  How is he even supposed to "prove" he has one of these things up and running?  It's like an iPhone jailbreak video - it's a freaking video, it's so easy to fake that shit - so easy to shoot it 100 times till you stop making a mistake.  It's like one of those Japanese video magicians....

I don't know, but at least in my mind, until someone proves me wrong with some real solid proof - this just screams scammer to me.

Here are some questions I would ask, and expect to be answered, without giving away the technical details of the implementation of the design.

What is the core clock rate of the chip?  How many SHA engines are running?  How many clocks for each core to produce a hash?  Are they fully pipelined?
14  Bitcoin / Hardware / Re: Official Open Source FPGA Bitcoin Miner (Smaller Devices Now Supported!) on: July 23, 2011, 04:46:58 PM
ooh interesting stuff going on here for Spartan devices eh?  I need to check some of this out in my compiler as well.

The latest confirmed 50MHash/s on the lx150 - which codeset is that? the LX150_makomk directory?

and that's a lx150 or lx150t?

also, I see a testbench in there - do you maybe have a timing diagram of the expected/correct outcome for those inputs?

not too familiar with ISE 13 myself, or Verilog for that matter - I use mostly VHDL - but it also looks like you left a chipscope core in the project file on the github

also looks like the ucf is set up to receive a 100MHz clock and I don't see any clock dividers in the code?

edit:
hrm... so it seems you are using chipscope to communicate with the chip?  Interesting, I haven't seen that before - did you guys discuss that somewhere in this thread?  What software are you using to talk through the chipscope objects?
15  Bitcoin / Development & Technical Discussion / Re: Modular FPGA Miner Hardware Design Development on: July 23, 2011, 04:39:39 PM
With what I'm used to, that sounds perfectly fine.  We use a 100 MHz crystal - the PLL/DCM can easily turn that into 100 or 50 using the least noisy clk0 output, and we can get all kinds of different ranges with the clkfx output - multiply or divide the clock by, IIRC, any integer ratio between 1-256 over 1-256.

I'm just unsure of what clock rates we can obtain with the hashing code.  You know, I'll try to run some compiles right now and see what we can get - but now that I think about it, yeah, if someone else said they had it running around 100 MHz then that sounds about right; we can tune from there.
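
For reference, here's roughly how I'd expect the clkfx generation to look ( a sketch from memory - double-check the attribute ranges in ug382 before trusting it ):
Code:
// DCM_SP producing CLKFX = CLKIN * CLKFX_MULTIPLY / CLKFX_DIVIDE.
// clk_100mhz is assumed to come in on a GCLK-capable pad.
wire clkfx_unbuf, clk_hash;

DCM_SP #(
    .CLKIN_PERIOD   (10.0),      // 100 MHz crystal in
    .CLKFX_MULTIPLY (3),
    .CLKFX_DIVIDE   (2)          // 100 * 3/2 = 150 MHz out
) dcm0 (
    .CLKIN (clk_100mhz),
    .CLKFX (clkfx_unbuf),
    .RST   (1'b0)
    // CLKFB can be left unconnected when only the CLKFX outputs are used
);

BUFG bufg0 ( .I(clkfx_unbuf), .O(clk_hash) );  // drive it onto a global clock net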

There are a lot more projects in that open source fpga miner github now heh.... hrm.... I should probably go read that thread again

btw if anyone wants to read up on it:
Spartan 6 FPGA Clocking Resources
http://www.xilinx.com/support/documentation/user_guides/ug382.pdf
16  Bitcoin / Development & Technical Discussion / Re: Modular FPGA Miner Hardware Design Development on: July 23, 2011, 03:52:14 PM
FPGA Power considerations

Spartan 6 FPGA Power Management
http://www.xilinx.com/support/documentation/user_guides/ug394.pdf
Quote
The FPGA can only enter suspend mode if enabled in the configuration bitstream (see
Enable the Suspend Feature and Glitch Filtering, page 14). The SUSPEND pin must be Low
during power up and configuration. Once enabled through the bitstream, and the
SUSPEND_SYNC primitive is not present in the design, when the SUSPEND pin is
asserted, the FPGA unconditionally and quickly enters suspend mode.
...
There are four possible ways to exit suspend mode in a powered system:
• Drive the SUSPEND input Low, exiting suspend mode.
• If multi-pin wake-up mode is enabled, drive the SUSPEND input Low and then assert
any one of the user enabled SCP pins.
• Pulse the PROGRAM_B input Low to reset the FPGA and cause the FPGA to
reprogram.
• Power cycle the FPGA, causing the FPGA to reprogram.

sounds pretty simple to put this guy into suspend, and it will retain its programming in that state too; all that's needed is to enable it in the bitstream and assert the SUSPEND pin when you want it to go to sleep.  Sounds like you guys want to tie that to the MCU so it can control whether the FPGA is on or not.

there is also a hibernate mode, but it basically just sounds like a safer way to power up/down in hot-swapping situations

What is the configuration consensus again?  Bit-banging the MCU's I/O and a JTAG chain of the 2 FPGAs?

it also says:
Quote
Saving Power
...
The lowest power state is the quiescent state with no inputs toggling, all outputs disabled,
and no pull-up or pull-down resistors in use.

17  Bitcoin / Development & Technical Discussion / Re: Modular FPGA Miner Hardware Design Development on: July 23, 2011, 03:21:45 PM
I think another thing to note is the locations of everyone in this project.  If you plan on shipping these things around to different people for different tasks, the shipping costs alone may become very large.  I'm in the USA, California.  But it sounds like a lot of you guys are in Europe or Asia... I hate to say it, because it may mean it will be more difficult for me to get a board here in the US, but it almost sounds like you guys in Euro/Asia are closer together - and it might make more sense to get everything done semi-close to you guys, e.g. board fab/assembly/debugging.  I'd love to get my hands on one of these myself, so it kind of sucks if us US guys are the minority, but it goes both ways too....  Basically, to reduce costs we are forced to centralize things - we have to buy all the boards at once or it will cost much more $$$, and the same goes for the part orders.  Assembly may not be as bad, as li_gangyi seems to have that covered for now.  But again, if he's doing the assembly, and correct me if I'm wrong, he's in Singapore - that may be part of the decision right there.  I could do any soldering/assembly myself, except for the reflow/bga stuff.

I don't mind putting up some money for prototyping, but without getting any hardware myself I'd be much less likely to put money up ( I imagine I'd only want to part with USD$100-$200 if it meant I wouldn't see anything for a while - plus I'd love it if that meant I got some sort of discount later or guaranteed first production run etc... ).  If I knew I was going to get hardware for this, I may be more willing to spend around USD$300-$500 for prototyping.

Here's a thought - how hard are LX150's to re-sell?  Maybe we should try to acquire enough funds to get the first price break on a batch of 10 of those?  It sounds like we're going to need at least 1-2 prototypes, but it's sounding almost more like we might make 5-6 and ship them around to the various devs.  Dunno how much of a price savings it is, but if everyone is convinced that's the FPGA we're going to use, then maybe this might be a good cost-savings investment now?

edit: seems like @ digikey there is no bulk discounting at all for these chips

On another note, to the FPGA devs out here - what FPGA code are we currently going with?  I haven't kept up with the Open Source FPGA Miner project in a bit - have any major improvements been made?  I was thinking about setting up some simulation test benches in Xilinx and trying to further optimize what they've done.  IIRC the design they have made doesn't perfectly match up with the Xilinx architecture - and that's likely why people have been unable to fully maximize/optimize a hashing core for this chip.  I think Xilinx does much better with faster clocks and smaller logic - but I'm by far no guru or expert, this is just my gut feeling - we'd really have to tear deep down into timing analysis and RTL diagrams to really optimize well.
18  Bitcoin / Development & Technical Discussion / Re: Modular FPGA Miner Hardware Design Development on: July 23, 2011, 03:07:02 PM
OK, so it took me a day and a half, but I finally went through and read all 23 pages of this thread.  I kind of glossed over it, so I'm not 100% on all the decisions that were made, but I have a much better feel for where you guys are at in terms of the design and progress.

Some things I'd like to add to the discussion for now:

In terms of Xilinx licenses - as someone said above - Xilinx doesn't care too much where you get the software or license - as in the end you will require a Xilinx FPGA to program and that's really where they make their money.  If anyone cares I can show you where to obtain a less than reputable ( read: pirate ) ISE license for ISE 13 with everything unlocked ( including the LX150/T design targets )

The unconnected I/O are unimportant for this design.  You can tie them all to ground, you can tie them all to Vcc, you can leave them floating.  It doesn't matter - personally I'd say just leave them floating.  This design is not going to be requiring non-noisy I/O lines as we aren't going to be doing anything like a high speed bus on the I/O.  At my workplace we build industrial monitoring equipment and on a lot of our boards we just leave unconnected I/O hanging and set the ISE to either use a weak pull down or leave the I/O floating.  It's really not much of an issue unless you are requiring very high speed accurate I/O ( think 100-200MHz clock ranges ).  These settings can be found under right clicking on Generate Programming File -> Process Properties -> Configuration Options -> Unused IOB pins ( Pull Down, Pull Up, Floating )

Also, again I'm a bit unclear on how the JTAG is going to be set up.  But I would expect a JTAG header on every DIMM, or at least one JTAG header somewhere with all the FPGAs chained.

I would also expect a couple of LEDs.  You are probably going to want one between VCC and GND to show that the board has power, and you will probably also want an LED or two as debug outputs.  Typically there will also be an LED or two tied to the tx/rx lines of the USB bus so you can tell when communication is occurring.  In my mind, I would also love to have a couple of other test points - just unused I/O pins routed out to a TP - as these are very useful for debugging.  On the other hand, this design shouldn't require too much debugging as it's pretty dead simple - but in terms of future application/expansion, they may be helpful for debugging new features.

I would love to be able to route all the unused I/O to pin headers or test points, but I agree with what has been said above - the cost/complexity to do it is just not worth it.  Sure, you'd be able to use the board as a spartan6 dev board, but I don't think that's the goal of this project.  So to keep PCB complexity down I agree with you guys - route exactly what's needed, then possibly add a handful more I/Os for debugging/indicators, and a few more brought out for future expansion - either to DIMM pins or test points ( think extra comm protocols etc... ).  If not all DIMM pins are going to be routed ( not taken for power or I/O ), I would also prefer to have a TP for each of these unused DIMM pins - that way we could deadbug new features or bug fixes.  If we leave just enough room for error that we can hack something onto this prototype, it will save us a lot of time/effort later, because it'll likely help us skip a re-spin.  I agree with what was said above that this is a prototype - it should have a slight excess of what's necessary to help us debug/fix any potential problems/errors we may make before the first spin.


I'm unsure what internal clock rates any given miner design will be able to obtain inside the FPGA, but I would guess it's going to be around 25-50 MHz.  I would probably advise against using the same 25 MHz crystal for the MCU and the FPGAs, just because getting 25 MHz to 100 MHz using a PLL in the FPGA will require using the CLK_FX output to multiply the input clock, and this is generally a noisier clock solution.  It's definitely doable, but maybe not the best implementation.  I don't have any experience with Spartan6 devices either ( we use a lot of Spartan3s ), so I'm unsure if it's possible to get useful computation done in this chip at higher clock rates.

One thing I definitely do know is you will want to route your clock input into one of the global clock pins - basically there are certain pins in each bank that route closer/more directly to the BUFGMUXs that control the quadrant clock lines.  These clock lines are the best ones to use for distributing clocks throughout the FPGA, as they are built for this function and provide the least amount of clock skew.  You can definitely still take a clock in on any I/O pad and route it to one of these BUFGMUXs - it's just that sometimes that trace path ( between non-clock I/O and BUFGMUX ) is not the most optimal.  There are a bunch of different pins you can use, but I would stay away from the quadrant/side-locked clock inputs and just use one of the global clock inputs ( there should be at least 4 ).

The quadrant/side clocks are useful if you partition your FPGA device into regions based on clock domains - then you can free up global clock resources and only use clocks in one quadrant or one side if needed - but this isn't necessary for our design.  I envision one main clock for the hashing engine, and potentially one other clock for communications.  The hashing engine clock is the only one I'm worried about - the comm clock could be derived internally off a counter, as it isn't high speed and doesn't touch a lot of resources.  If you'd like to read up on it, ug382.pdf describes the clocking mechanisms for the Spartan6 family.

TL;DR So when you are looking to route the clock to the FPGA - make sure you connect to a GCLK I/O pin.
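
Something like this in the ucf ( the pin name is made up - look up a GCLK-capable pad for your package in the pinout tables ):
Code:
# Lock the oscillator to a global-clock-capable pad and give the tools a
# period target to close timing against.
NET "clk_100mhz" LOC = "V10" | IOSTANDARD = LVCMOS33 ;
NET "clk_100mhz" TNM_NET = "clk_100mhz" ;
TIMESPEC "TS_clk_100mhz" = PERIOD "clk_100mhz" 10 ns HIGH 50% ;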

19  Bitcoin / Development & Technical Discussion / Re: Modular FPGA Miner Hardware Design Development on: July 22, 2011, 12:21:14 AM
It's worse than that: we want to be able to bring down the FPGAs deliberately if they are drawing too much power. How much data would need to be moved across the bus every time we do this? So far the assumption is that bandwidth is negligible. Edit: 1.6MB can take minutes at standard serial speeds.
And that's a Spartan 3E I'm talking about - I imagine the 6 series has a lot more cells that need to be programmed.

Quote
Another reason for using the MCU is that it saves space on the FPGA for hashing. The MCU can translate between USB and SPI for example.

Edit2: MCUs can also be reprogrammed without software costing $3000 to produce the bitstream. This makes things like minor protocol changes or calibrating the built-in temperature sensor easy.
It's fairly easy to get an evaluation license for Xilinx - and I think it's even pretty easy to just re-generate more if you have more e-mail addresses....  Nobody in the industry really pays for Xilinx licenses - they just give them away when you buy chips typically - but the bad news is that it's unlikely they will even give you a license if you aren't purchasing semi-bulk.  I would guess it'd be kind of hard for hobbyists to nab a non-eval license.

So what exactly is the purpose of powering down FPGAs?  If this was my mining rig, I'd be running it at 100% at all times.  I see you mentioned the temperature sensor - so maybe in places without adequate cooling, or if a cooling system breaks, you might want to turn things off... again, I think I need to see a bit more of what you guys have designed so far.

The amount of FPGA space you will use for simple things like comm protocols is nearly negligible - especially when these modules are running at much reduced clock rates compared to the important stuff ( the hashing engine ).  Another thing to think about - see what other FPGAs in the family have the same footprint.  A prototype board doesn't need to have the most expensive FPGA if a cheaper, smaller FPGA can handle 1 hash engine, and this may help a little in the first round of prototyping.

I guess if the MCU is very cheap it's not that much of a hit to have one... but you are adding one more layer of complexity to the design: PC <==> USB chip <==> MCU <==> FPGA.  And the MCU itself will need to have code written for it...  I guess it matters what you really want the MCU to be able to accomplish - or what you think the MCU may be able to accomplish in the long run.

I think I need a bit more info on what the high-level design architecture is going to be.  How does the backplane/motherboard fit into this if each DIMM module has its own MCU and USB controller?  I would imagine ideally you would put these types of comm things onto the backplane/mobo itself - but I understand that in terms of prototyping and development it'd be easier to have it on the module itself, so you wouldn't need a backplane to debug/develop with it.

Also, I'd like to mention that I'd be willing to throw up a bit of cash ( a couple hundred dollars ) when prototyping time comes.  And I am interested in helping with the project.  I have access to Xilinx ISE tools and all kinds of hardware equipment ( logic analyzers, power supplies, o-scopes, sig gens, etc... ).  I am also a hobby programmer with experience in C/C++, Java, Python, Perl, PHP, etc...  Trained as an EE in school, but focused more on computer engineering, hence why I'm now a firmware engineer.  But yeah, I'd love to help develop this with you guys, and I have some money I could pool together with you guys to get the first round running.  I don't do much of the hardware layout - but I can always ask my coworkers to take a look at things.
20  Bitcoin / Development & Technical Discussion / Re: Modular FPGA Miner Hardware Design Development on: July 21, 2011, 11:40:16 PM
if we wanna say load the bitstream and everything from the uC, I don't think it'll be a good idea
At my work we use a lot of Spartan 3E 1600s and the bit files are somewhere around 1.6 megabytes....  No idea how large the bit files are for the 6 series....  USB could definitely handle it - but then you have to also think about the uC being able to handle it in a reasonable amount of time too.
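
Back-of-the-envelope, assuming a garden-variety 115200-baud UART link ( my assumption - nobody has named a speed ): 8N1 framing moves ~11,520 bytes/s, so

$t \approx \frac{1{,}600{,}000\ \text{bytes}}{11{,}520\ \text{bytes/s}} \approx 140\ \text{s}$

per FPGA, per power-up - a couple of minutes each time, which is why the "minutes" figure above sounds right to me.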

From what I've seen, typically for every FPGA you would have a SPI flash memory holding the programming - and on boot, the FPGA's bootloader can be told to talk to the SPI flash chip and load its programming from there.

I was dabbling around with some code to field-upgrade the firmware in the flash, and the Spartan 3E itself doesn't even have enough BRAM to hold the entire flash image - I'm going to have to go down the road of possibly having two flash chips, one with some "bootloader" code and the other with the actual running program - and to field-upgrade the firmware, I would have code that can access the 2nd flash and write directly to it.

But I guess if you program the chips through the USB comm interface, that's 1 less part you'll need ( flash memory ) - but it also means much more complicated uC code to be able to deal with programming the FPGAs.  And it also means that if the device loses power, it will need to be reprogrammed again - and the software communicating through USB would need to be able to catch this, or be restarted.  I guess it could work, but in my mind it just sounds kind of hacky.