YipYip (OP)
December 23, 2012, 11:40:09 AM
I understand that Litecoin mining was designed to be GPU-unfriendly, etc.
Will we ever see FPGA firmware for Litecoin mining, or even ASIC chips?
Is it just a case of economics, or is this technically not a reality?
BR0KK
December 23, 2012, 11:45:29 AM
As you can read here in the forum, most FPGA boards don't come with much RAM, and nobody has written a working LTC bitstream for them (LTC needs a large amount of RAM because of the scrypt algorithm).
ASICs would have to be fitted for LTC; the chip must be manufactured for that purpose only. Current (nonexistent, unicorn-blood-powered) ASICs are fitted for BTC, not LTC. An ASIC can only do the one thing it was built for, and in our case that is hashing BTC.
Blazr
December 23, 2012, 11:55:58 AM
Quote from: YipYip
I understand that Litecoin mining was designed to be GPU-unfriendly, etc. Will we ever see FPGA firmware for Litecoin mining, or even ASIC chips? Is it just a case of economics, or is this technically not a reality?
Economics. There is nothing stopping anyone from developing a Litecoin ASIC. I'm not entirely sure an FPGA would be more efficient than a GPU, though, due to RAM restrictions.
Endgame
December 23, 2012, 12:32:41 PM
Not sure about FPGAs, but as regards ASICs there are two possibilities:
1. Yes, if Litecoin becomes popular enough to justify investing a large amount of money in developing Litecoin ASICs.
2. No, in which case Litecoin will remain a GPU-mined coin.
Either way, the future looks pretty bright for Litecoin.
BR0KK
December 23, 2012, 12:43:47 PM
Wasn't RAM the expensive part of a Litecoin ASIC?
beekeeper
December 23, 2012, 12:50:31 PM
An ASIC for Bitcoin would be a joke compared with an ASIC for Litecoin. Good luck trying that.
tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
December 23, 2012, 06:35:12 PM
Quote from: BR0KK
Wasn't RAM the expensive part of a Litecoin ASIC?
Yes, you need really fast RAM, hence video cards work so well. They're actually quite a lot faster sequentially than the on-die L2 cache on most modern CPUs. You can get really high power efficiency with small ARM chips that have NEON and L2 caches, like Cortex-A9s (~2-3x the efficiency of video cards), but the chips are expensive.
You could either make an ASIC with the cache on-die (lots of space used...) or an ASIC that accesses GDDR5 through a high-performance bus, but you will probably never see the 20-100x increase in efficiency you get with SHA256-only mining. You also need about 45k circuits for a scrypt core (Salsa20/SHA256) versus 25k for a SHA256 core, so you use up more die space there too.
Most people who keep calling LTC a "scam" haven't looked at the hashing algorithm thoroughly.
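To put those circuit counts in perspective, here is a quick back-of-the-envelope sketch in Python (the 10M-gate die budget is a made-up round number for illustration; the 45k/25k figures are the ones quoted above):

    # Rough cores-per-die comparison using the circuit counts quoted above.
    # The 10M budget is a hypothetical round number, not a real chip.
    SCRYPT_CORE = 45_000      # scrypt core (Salsa20/SHA256), figure above
    SHA256_CORE = 25_000      # plain SHA256 core, figure above
    DIE_BUDGET = 10_000_000   # hypothetical logic budget for one die

    print(DIE_BUDGET // SHA256_CORE)  # 400 SHA256 cores
    print(DIE_BUDGET // SCRYPT_CORE)  # 222 scrypt cores, before any RAM

So even before you pay for the scratchpad RAM, you fit barely half as many scrypt cores on the same die.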
crazyates
Legendary
Offline
Activity: 952
Merit: 1000
December 23, 2012, 07:54:13 PM
Quote from: tacotime
You can get really high power efficiency with small ARM chips that have NEON and L2 caches, like Cortex-A9s (~2-3x the efficiency of video cards), but the chips are expensive.

I'm a little confused by what you mean by this. Could you clarify? The Litecoin hardware wiki shows these Cortex-A9s can't even reach 5 kH/s, and that's on a Gentoo armhf build with -mfpu=neon at 1400 MHz, but with DDR2-800 RAM. And even undervolted, you can't get to 1 kH/s per watt, so I don't consider that all that efficient. My MK802 II is only a Cortex-A8, but it should have NEON. The MK802 II also has DDR3 memory, so it might be a little faster than DDR2?
YipYip (OP)
December 23, 2012, 09:40:29 PM
Quote from: BR0KK
[...] Current (nonexistent, unicorn-blood-powered) ASICs are fitted for BTC, not LTC.

I heard that a fresh batch of unicorns was being delivered soon, and they have silver horns instead of gold, so they are prebuilt to do LTC.
YipYip (OP)
December 23, 2012, 09:46:48 PM
Quote from: tacotime
Yes, you need really fast RAM, hence video cards work so well. [...]

I have a situation where I (("thank you jesus") + ("bow down to satan our lord") + "whatever gets the job done") = $0 power costs. What would be the best card for a Litecoin rig (of hardware that I can buy today)?
tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
December 23, 2012, 10:03:25 PM
If you don't pay for power, unlocked 6950s are the best. If you pay for power, 7950s.
tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
December 23, 2012, 10:06:49 PM
Quote from: crazyates
I'm a little confused by what you mean by this. Could you clarify? The Litecoin hardware wiki shows these Cortex-A9s can't even reach 5 kH/s [...] And even undervolted, you can't get to 1 kH/s per watt, so I don't consider that all that efficient.

It was on one of those quad A9s, yeah, but I heard they were pulling 2 W, not 6 W like that person indicates, and getting ~8 kH/s. Maybe they didn't measure power consumption accurately, though.
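Just to show how much the power reading matters, a quick sketch using the ~8 kH/s figure (both wattages are the disputed readings above, not measurements of mine):

    # kH/s per watt for a quad Cortex-A9 doing ~8 kH/s, under the two
    # disputed power readings (the wiki's 6 W vs the reported 2 W)
    RATE_KHS = 8.0
    for watts in (6.0, 2.0):
        print(f"{watts} W -> {RATE_KHS / watts:.2f} kH/s per watt")

At 6 W it's ~1.3 kH/s per watt; at 2 W it's 4.0. The disputed power figure alone is a 3x swing in efficiency.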
YipYip (OP)
December 23, 2012, 11:04:13 PM
Quote from: tacotime
If you don't pay for power, unlocked 6950s are the best. If you pay for power, 7950s.

And finally, what motherboard, and how many cards can/should I load into that rig? I have ordered a batch of unicorn-blood-powered ASIC chips, but I'm starting to get over the BFL/bASIC BTC story... ASICs may be LTC's turbo boost.
YipYip (OP)
December 23, 2012, 11:06:24 PM
Quote from: tacotime
If you don't pay for power, unlocked 6950s are the best. If you pay for power, 7950s.

When you say unlocked, do you mean overclocked, or is this some type of unlocked firmware that allows a performance overclocking profile not available on a standard card?
BR0KK
December 24, 2012, 12:56:55 AM
No, he meant unlocking shader cores. The very first series of these cards was software-disabled, so you could flash them with a corresponding BIOS to enable the extra shaders. Not sure if that works with all generations.
jasinlee
December 27, 2012, 04:01:48 PM
According to greedi, the FPGAs/ASICs are already fabricated but not in use. I hope they stay that way, if this is true.
Desolator
December 28, 2012, 11:50:14 PM
There are very few devices in general that can beat a PC or laptop at RAM access times, especially a graphics card. In fact, I think GDDR5 is the fastest RAM that exists in any retail product. So if someone made an Application-Specific Integrated Circuit where the application it's specific to is Litecoin mining, what they'd end up with would be so similar to a graphics card that it wouldn't be much faster, if at all. In fact, it'd probably resemble a FirePro or other CAD/3D-intensive card, with a ton of memory at high access speeds.
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
December 28, 2012, 11:54:41 PM
Quote from: Desolator
There are very few devices in general that can beat a PC or laptop at RAM access times, especially a graphics card. [...] what they'd end up with would be so similar to a graphics card that it wouldn't be much faster, if at all.

The major differences:
1) scrypt doesn't require that much memory. Your 2 GB graphics card has about 1.9 GB sitting idle.
2) scrypt doesn't require that much memory bandwidth. Sure, it needs "fast" lookups, but fast is all relative; the latency to GPU main memory is actually pretty high.
3) scrypt can actually be made pretty memory-hard, but the creator, through either malice or ignorance, altered the default parameters, making it barely "memory hard" (which is why GPUs work as well as they do).
An FPGA with a 64-bit memory controller on a board with a DDR DIMM slot would work just fine. Now, LTC needs to grow by at least a factor of 100x before anyone is going to devote that kind of time or resources.
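For reference, scrypt's scratchpad is 128 * N * r bytes per instance. A quick sketch comparing Litecoin's parameters with the scrypt paper's interactive-login recommendation (the parameter values are the standard published ones, not taken from this thread):

    def scrypt_scratchpad_bytes(N: int, r: int) -> int:
        # ROMix keeps a table of N blocks, each 128*r bytes
        return 128 * N * r

    # Litecoin's parameters: N=1024, r=1 (p=1) -> 128 KiB
    print(scrypt_scratchpad_bytes(1024, 1) // 1024, "KiB")
    # scrypt paper's interactive recommendation: N=2**14, r=8 -> 16 MiB
    print(scrypt_scratchpad_bytes(2**14, 8) // 2**20, "MiB")

128 KiB per instance is what lets a GPU keep hundreds of instances in flight; the paper's defaults would not fit nearly as many.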
tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
December 29, 2012, 05:55:33 AM
It requires roughly 128 KB per process; for 1600 processes (e.g., a 5870) you then need about 200 MB of memory, although the algorithm as implemented typically uses a bit more. If you want to see what effect the GDDR5 memory has on mining scrypt, turn your memory clock down to about 25 MHz; at that speed you'll be pulling about the same bandwidth as the DDR memory you're referring to.
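A quick check of those numbers:

    # Aggregate scratchpad for many concurrent scrypt instances on a GPU,
    # using the figures above (128 KiB each, 1600 in flight on a 5870)
    instances = 1600
    per_instance_kib = 128
    print(instances * per_instance_kib / 1024, "MiB")  # 200.0 MiB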
Memory bandwidth on the internal bus is critical, which is why GPUs mine so much faster than CPUs (GPU memory bandwidth is 5-10x that of the CPU's L2 cache).
Now, if you can put 8000 scrypt cores on an ASIC clocked at ~150 MHz with 8 GB of DDR3 memory at 25 GB/s, congrats: you've made an ASIC exactly as fast as a GPU that maybe uses a little less power.
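To see why the 25 GB/s figure is the limiting resource: each scrypt hash (N=1024, r=1) writes and then randomly reads a 128 KiB scratchpad, roughly 256 KiB of traffic if every access goes out to external memory. A rough ceiling under that worst-case assumption:

    # Bandwidth-bound hash rate if every scratchpad access hits external
    # memory: 2 * N * 128 * r bytes of traffic per hash (write + read pass)
    N, r = 1024, 1
    traffic_per_hash = 2 * N * 128 * r   # 262144 bytes = 256 KiB
    bandwidth = 25e9                     # the 25 GB/s DDR3 figure above
    print(bandwidth / traffic_per_hash / 1e3, "kH/s")  # ~95 kH/s ceiling

On-die caches and buffers raise that ceiling, which is exactly the cache-versus-external-memory trade-off mentioned earlier in the thread.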
I welcome an FPGA miner to the scene, but don't be surprised if you build one and the results are underwhelming.
DeathAndTaxes
Donator
Legendary
Offline
Activity: 1218
Merit: 1079
Gerald Davis
December 29, 2012, 06:25:17 AM
Memory bandwidth doesn't tell the whole story.
GDDR5 has huge latency: hundreds of clock cycles are spent waiting when the GPU needs to pull from global memory (modern GPUs do have an L1 cache to avoid going to global memory, but it is relatively small, 8 KB to 16 KB).
This handicap is obvious in the scaling from CPU to GPU.
For example, if we look at a high-end CPU and GPU, say an i5 2600 vs. a 5970: typical performance on Bitcoin is something on the order of 20 MH/s vs. 400 MH/s, so the GPU has roughly 20x the throughput. On Litecoin, typical numbers might be more like 30 kH/s vs. 300 kH/s, only a 10x increase in throughput.
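The ratios from those example numbers:

    # GPU-over-CPU speedup for the example figures above
    btc_cpu, btc_gpu = 20e6, 400e6   # SHA256d hashes/s (i5 2600 vs 5970)
    ltc_cpu, ltc_gpu = 30e3, 300e3   # scrypt hashes/s
    print(btc_gpu / btc_cpu)  # 20.0x on Bitcoin
    print(ltc_gpu / ltc_cpu)  # 10.0x on Litecoin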
Despite the higher bandwidth and nearly 20x the integer ops per second, it "only" achieves 10x the throughput. The latency of GDDR5 (which can easily run 200 to 500 clock cycles) is the bottleneck. Note that the latency is in clock cycles, so slowing the memclock makes the problem worse: the latency in seconds increases as the clock gets turned down. The bursty nature of the scrypt algorithm makes it difficult for the GPU to pipeline the reads and writes. The bandwidth of a GPU might be like a dump truck, but when it takes 10 minutes to warm the dump truck up and you are only picking up a shovelful on each load, you really aren't getting the max performance out of the chip.
Still, let's say you are right and an ASIC core with 8 GB of DDR3 is only comparable to a high-end GPU at, say, half the electrical cost. It would still kill off GPUs given enough resources devoted to development. GDDR5 is incredibly expensive; that, combined with the high latency and high energy consumption, means ASICs would eventually overwhelm GPUs with improved efficiency. Now, I would point out that it wouldn't be the 100x performance breakthrough it is on Bitcoin, but given enough time and enough market value, anything will be replaced with dedicated hardware.