Author Topic: artforz and coblee gpu mining litecoin since the start?  (Read 32573 times)
wknight
Legendary
*
Offline Offline

Activity: 889
Merit: 1000


Bitcoin calls me an Orphan


View Profile WWW
February 10, 2012, 06:27:49 PM
 #41

Well, since everyone now knows that coblee and ArtForz have a top-secret GPU miner, and I am one of the dozen people allowed to use it, let me share it with all of you:




Code:
The miner is written in plain C# and doesn't need any special libraries besides what you have anyway. I haven't tested it, but since it is written in Mono on Windows it might be cross-platform compatible. For more hash you have to tweak lines 31 and 32, but don't expect too much; it's just a preview of what cobblefart is. By the way, the fun part is that you can set it to any pool you like. Its speed sucks on Bitcoin but is a bomb on SolidCoin and scrypt, and its wattage is awesome: not much more than simple idle power consumption at 200 kH/s.

Code: http://pastebin.com/aCR53QcY



We have found a bug in the code that makes it look like we are mining at 1 MH/s on a simple dual core. However, only 1 share was submitted, so again, no real proof.

Mining Both Bitcoin and Litecoin.
coblee
Donator
Legendary
*
Offline Offline

Activity: 1651
Merit: 1284


Creator of Litecoin. Cryptocurrency enthusiast.


View Profile
February 10, 2012, 06:43:46 PM
 #42

I also don't really care about the price because I'm not in this for the money, but on last check the price of each SC is nearly 4x that of LiteCoin.

Seriously, CoinHunter. If you didn't care about price, then why do you keep manipulating block rewards to try to reach dollar parity? The SC block reward is now only ~0.05 (1/1000 of Litecoin's). And the SC price is only about 3x that of LTC. That's rather weak in my opinion. Maybe that's why you come here to spread your FUD?

Roadhog2k5
Full Member
***
Offline Offline

Activity: 131
Merit: 100



View Profile
February 10, 2012, 07:21:20 PM
 #43

I also don't really care about the price because I'm not in this for the money, but on last check the price of each SC is nearly 4x that of LiteCoin.

Seriously, CoinHunter. If you didn't care about price, then why do you keep manipulating block rewards to try to reach dollar parity? The SC block reward is now only ~0.05 (1/1000 of Litecoin's). And the SC price is only about 3x that of LTC. That's rather weak in my opinion. Maybe that's why you come here to spread your FUD?
He needs someone to pick on since he is butthurt that he killed solidcoin again.
Snapman
Sr. Member
****
Offline Offline

Activity: 291
Merit: 250


BTCRadio Owner


View Profile WWW
February 10, 2012, 07:33:35 PM
 #44

Typical CoinHunter, continuing to be worthless to the cryptocurrency community.

Are you really that sad? Your coin comes to a crashing halt, and your response, instead of improving your system, is bashing another.
You would do quite well in Washington, DC (slumming it with the underlords of douche baggery)

BTCRadio: 17cafKShokyQCbaNuzaDo5HLoSnffMNPAs
Gabi
Legendary
*
Offline Offline

Activity: 1148
Merit: 1008


If you want to walk on water, get out of the boat


View Profile
February 10, 2012, 07:36:53 PM
 #45

Another pathetic attempt to attack Litecoin? Ridiculous.

Oh, and I have made some dozens of full custom chips at 28nm specifically tailored to mine bitcoins and litecoins. 1 terahash per chip. True story.

StewartJ
Sr. Member
****
Offline Offline

Activity: 392
Merit: 250



View Profile
February 10, 2012, 08:08:34 PM
 #46

I've learned to ignore the solidcoin shills.

They are actually bots programmed to rile you.



Mousepotato
Hero Member
*****
Offline Offline

Activity: 896
Merit: 1000


Seal Cub Clubbing Club


View Profile
February 10, 2012, 08:14:25 PM
 #47

Oh, and I have made some dozens of full custom chips at 28nm specifically tailored to mine bitcoins and litecoins. 1 terahash per chip. True story.

Ships in 4-6 weeks?

Mousepotato
wknight
Legendary
*
Offline Offline

Activity: 889
Merit: 1000


Bitcoin calls me an Orphan


View Profile WWW
February 10, 2012, 08:17:21 PM
Last edit: February 10, 2012, 08:49:01 PM by wknight
 #48

Ohh my goodness.. look out.. We have found the ultimate GPU miner for Litecoin!!!! We did this with a GeForce MX2 :)


Mining Both Bitcoin and Litecoin.
Schwede65
Sr. Member
****
Offline Offline

Activity: 309
Merit: 250


View Profile
February 10, 2012, 09:28:31 PM
 #49


It currently gets ~250KH/s on a 6990 .


The LTC-mining efficiency of CPUs and GPUs (in W per kH/s) seems to be similar:
core i3 / 3.1 GHz / 3 threads / ~15 W for mining alone / 12.5 kH/s => 1.2 W per kH/s
core i7 / 3.6 GHz / 7 threads / ~46 W for mining alone / 32 kH/s => 1.44 W per kH/s
6990 / ~350 W (I don't know the correct wattage) for 250 kH/s => 1.4 W per kH/s

But with more improvement there will be an even better W per kH/s rate for the 6990 and the other mining cards.

So the end of LTC CPU mining is not too far away, because
one 6990 does the job of ~8 core i7 or ~20 core i3.
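
For anyone who wants to re-run the numbers above, here is a minimal Python sketch. The wattage and hashrate figures are copied from the post (the 6990 wattage is the poster's own guess); the dictionary layout and the `watts_per_khs` helper are made up for illustration.

```python
# Figures copied from the post above; the 6990 wattage is the poster's guess.
rigs = {
    "core i3": {"watts": 15.0, "khs": 12.5},
    "core i7": {"watts": 46.0, "khs": 32.0},
    "hd 6990": {"watts": 350.0, "khs": 250.0},
}


def watts_per_khs(watts, khs):
    """Power cost of one kH/s of scrypt hashrate (lower is better)."""
    return watts / khs


efficiency = {name: watts_per_khs(r["watts"], r["khs"]) for name, r in rigs.items()}

# How many CPUs one 6990 replaces, by raw hashrate only.
i7_equiv = rigs["hd 6990"]["khs"] / rigs["core i7"]["khs"]
i3_equiv = rigs["hd 6990"]["khs"] / rigs["core i3"]["khs"]

for name, w in efficiency.items():
    print(f"{name}: {w:.3f} W per kH/s")
print(f"one 6990 ~= {i7_equiv:.1f} core i7 or {i3_equiv:.0f} core i3")
```

On these figures the three setups land within about 20% of each other per watt, so the case for the GPU rests on hashrate per device rather than on efficiency.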
DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1063


Gerald Davis


View Profile
February 10, 2012, 09:39:56 PM
 #50


It currently gets ~250KH/s on a 6990 .


The LTC-mining efficiency of CPUs and GPUs (in W per kH/s) seems to be similar:
core i3 / 3.1 GHz / 3 threads / ~15 W for mining alone / 12.5 kH/s => 1.2 W per kH/s
core i7 / 3.6 GHz / 7 threads / ~46 W for mining alone / 32 kH/s => 1.44 W per kH/s
6990 / ~350 W (I don't know the correct wattage) for 250 kH/s => 1.4 W per kH/s

But with more improvement there will be an even better W per kH/s rate for the 6990 and the other mining cards.

So the end of LTC CPU mining is not too far away, because
one 6990 does the job of ~8 core i7 or ~20 core i3.

A unicorn can run 500 miles on one gallon (of beer).  A gallon of beer is more expensive than a gallon of gas, but likely we are still seeing the end of hybrid cars, because one unicorn is equal to almost 10x Toyota Prius.  Plus, with improvement in unicorn-beer technology you are getting an even better m/goa rate (that's miles per gallon of alcohol).
Mousepotato
Hero Member
*****
Offline Offline

Activity: 896
Merit: 1000


Seal Cub Clubbing Club


View Profile
February 10, 2012, 09:48:06 PM
 #51

At 250 kH/s with your 6990 your daily yield is right around 160 LTC at current difficulty (estimating with http://www.litecoinpool.org/stats, and ignoring pool fees of course).  At an exchange rate of .002802 BTC per LTC that means you can exchange your daily take of LTC for roughly .44 BTC. 

Now if you were mining straight BTC with that 6990, you'd be doing around 820 MH/s which yields around .60 BTC per day, maybe a little more.  I'm guessing that you'd need that GPU Litecoin miner to hit around 350 kH/s or so before you start breaking even versus mining straight BTC.
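
The break-even estimate can be sketched in a few lines of Python. All inputs are the figures quoted in this post; the one added assumption is that LTC yield scales linearly with hashrate at fixed difficulty. The function and constant names are illustrative only.

```python
# Figures quoted in the post above (February 2012); none of them are current.
LTC_PER_DAY_AT_250 = 160.0   # LTC/day at 250 kH/s, ignoring pool fees
GPU_LTC_KHS = 250.0          # kH/s claimed for the scrypt GPU miner on a 6990
BTC_PER_LTC = 0.002802       # exchange rate used in the post
BTC_PER_DAY_DIRECT = 0.60    # BTC/day when mining Bitcoin directly at ~820 MH/s


def btc_via_ltc(khs):
    """Daily BTC value of scrypt mining, assuming yield is linear in kH/s."""
    return LTC_PER_DAY_AT_250 * (khs / GPU_LTC_KHS) * BTC_PER_LTC


def break_even_khs():
    """Scrypt hashrate at which LTC mining matches direct BTC mining."""
    return GPU_LTC_KHS * BTC_PER_DAY_DIRECT / btc_via_ltc(GPU_LTC_KHS)


daily_btc = btc_via_ltc(GPU_LTC_KHS)
needed_khs = break_even_khs()
print(f"LTC route at 250 kH/s: {daily_btc:.3f} BTC/day")
print(f"break-even hashrate:   {needed_khs:.0f} kH/s")
```

With these inputs the break-even point comes out a little under the ~350 kH/s guessed above, so the guess is on the conservative side.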

Mousepotato
ssvb
Newbie
*
Offline Offline

Activity: 39
Merit: 0


View Profile
February 10, 2012, 09:48:41 PM
 #52

I am no GPGPU expert, but I think ArtForz made some very good points in the following thread:
https://bitcointalk.org/index.php?topic=45849.0
CoinHunter could make his claims convincing by simply explaining how to address the GPU limitations outlined by ArtForz.
At least ArtForz was mistaken about Cell earlier :)

Let's just do some simple math. The PlayStation 3 has 6 SPE cores, each clocked at 3.2GHz, and 25GB/s of total memory bandwidth. Calculating one hash needs approximately 434176 ADD/ROL/XOR operations on 128-bit vectors in the performance-critical part of salsa20/8, which are executed in the even pipe (shuffles and the other instructions are executed in the odd pipe). Calculating one hash also needs 256KB of memory traffic (128KB is written sequentially, 128KB is read in scattered 128-byte chunks). So, taking into account that an SPE core can execute one instruction from the even pipe each cycle, the theoretical performance limit based on computational power is (6 * 3200000000) / 434176 ~= 44.2 khash/s. The theoretical performance limit based on memory bandwidth is 25GB / 256KB ~= 95.4 khash/s. There is a lot of headroom for the memory bandwidth, and arithmetic calculations are the bottleneck. That said, Cell has precise control over memory operations by scheduling DMA transfers, and it can overlap DMA transfers with calculations. This allows memory bandwidth to be utilized very efficiently for the scrypt algorithm.

This page seems to say that HD 6990 has 320GB/s of memory bandwidth. And here ArtForz tells us that it is possible to achieve < 20% peak BW with GPU. Doing some math again, we get 320GB * 0.2 / 256KB ~= 244 khash/s. Looks rather believable to me.

edit: corrected HD 6990 memory bandwidth (it is 320GB/s and not 350GB/s)
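
The three limits in this post can be reproduced with a short Python sketch. The operation count, per-hash memory traffic, bandwidths and the 20% figure are all taken from the post; the helper names are invented here, and using decimal GB with binary KB is an assumption chosen because it is what reproduces the quoted results.

```python
KB = 1024    # binary kilobyte, as in "256KB" per hash
GB = 1e9     # decimal gigabyte; this combination reproduces the post's figures

OPS_PER_HASH = 434_176        # even-pipe ADD/ROL/XOR vector ops per hash (from the post)
BYTES_PER_HASH = 256 * KB     # 128KB written + 128KB read per hash (from the post)


def compute_bound(cores, hz, ops_per_hash=OPS_PER_HASH):
    """Hashes/s if every core retires one relevant op per cycle."""
    return cores * hz / ops_per_hash


def bandwidth_bound(bytes_per_sec, usable_fraction=1.0):
    """Hashes/s if memory traffic is the only limit."""
    return bytes_per_sec * usable_fraction / BYTES_PER_HASH


cell_compute = compute_bound(cores=6, hz=3.2e9)
cell_memory = bandwidth_bound(25 * GB)
gpu_memory = bandwidth_bound(320 * GB, usable_fraction=0.2)

print(f"Cell, compute bound:          {cell_compute / 1000:.1f} khash/s")
print(f"Cell, bandwidth bound:        {cell_memory / 1000:.1f} khash/s")
print(f"HD 6990, 20% of peak BW:      {gpu_memory / 1000:.0f} khash/s")
```

These are upper bounds under the stated assumptions, not measurements; whichever bound is lower for a given device is the one that matters.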
Mousepotato
Hero Member
*****
Offline Offline

Activity: 896
Merit: 1000


Seal Cub Clubbing Club


View Profile
February 10, 2012, 09:53:12 PM
 #53

At least ArtForz was mistaken about Cell earlier :)

Let's just do some simple math. The PlayStation 3 has 6 SPE cores, each clocked at 3.2GHz, and 25GB/s of total memory bandwidth. Calculating one hash needs approximately 434176 ADD/ROL/XOR operations on 128-bit vectors in the performance-critical part of salsa20/8, which are executed in the even pipe (shuffles and the other instructions are executed in the odd pipe). Calculating one hash also needs 256KB of memory traffic (128KB is written sequentially, 128KB is read in scattered 128-byte chunks). So, taking into account that an SPE core can execute one instruction from the even pipe each cycle, the theoretical performance limit based on computational power is (6 * 3200000000) / 434176 ~= 44.2 khash/s. The theoretical performance limit based on memory bandwidth is 25GB / 256KB ~= 95.4 khash/s. There is a lot of headroom for the memory bandwidth, and arithmetic calculations are the bottleneck. That said, Cell has precise control over memory operations by scheduling DMA transfers, and it can overlap DMA transfers with calculations. This allows memory bandwidth to be utilized very efficiently for the scrypt algorithm.

This page seems to say that HD 6990 has 350GB/s of memory bandwidth. And here ArtForz tells us that it is possible to achieve < 20% peak BW with GPU. Doing some math again, we get 350GB * 0.2 / 256KB ~= 267 khash/s. Looks rather believable to me.

Check out the big brain on Brett! :o

Mousepotato
DeathAndTaxes
Donator
Legendary
*
Offline Offline

Activity: 1218
Merit: 1063


Gerald Davis


View Profile
February 10, 2012, 10:06:08 PM
 #54

I am no GPGPU expert, but I think ArtForz made some very good points in the following thread:
https://bitcointalk.org/index.php?topic=45849.0
CoinHunter could make his claims convincing by simply explaining how to address the GPU limitations outlined by ArtForz.
At least ArtForz was mistaken about Cell earlier :)

Let's just do some simple math. The PlayStation 3 has 6 SPE cores, each clocked at 3.2GHz, and 25GB/s of total memory bandwidth. Calculating one hash needs approximately 434176 ADD/ROL/XOR operations on 128-bit vectors in the performance-critical part of salsa20/8, which are executed in the even pipe (shuffles and the other instructions are executed in the odd pipe). Calculating one hash also needs 256KB of memory traffic (128KB is written sequentially, 128KB is read in scattered 128-byte chunks). So, taking into account that an SPE core can execute one instruction from the even pipe each cycle, the theoretical performance limit based on computational power is (6 * 3200000000) / 434176 ~= 44.2 khash/s. The theoretical performance limit based on memory bandwidth is 25GB / 256KB ~= 95.4 khash/s. There is a lot of headroom for the memory bandwidth, and arithmetic calculations are the bottleneck. That said, Cell has precise control over memory operations by scheduling DMA transfers, and it can overlap DMA transfers with calculations. This allows memory bandwidth to be utilized very efficiently for the scrypt algorithm.

This page seems to say that HD 6990 has 320GB/s of memory bandwidth. And here ArtForz tells us that it is possible to achieve < 20% peak BW with GPU. Doing some math again, we get 320GB * 0.2 / 256KB ~= 244 khash/s. Looks rather believable to me.

edit: corrected HD 6990 memory bandwidth (it is 320GB/s and not 350GB/s)

Bandwidth has nothing to do w/ scrypt.  LATENCY does.  Which is why the amount of L1 cache is so important.
tacotime
Legendary
*
Offline Offline

Activity: 1484
Merit: 1005



View Profile
February 10, 2012, 10:20:35 PM
 #55

I am no GPGPU expert, but I think ArtForz made some very good points in the following thread:
https://bitcointalk.org/index.php?topic=45849.0
CoinHunter could make his claims convincing by simply explaining how to address the GPU limitations outlined by ArtForz.
At least ArtForz was mistaken about Cell earlier :)

Let's just do some simple math. The PlayStation 3 has 6 SPE cores, each clocked at 3.2GHz, and 25GB/s of total memory bandwidth. Calculating one hash needs approximately 434176 ADD/ROL/XOR operations on 128-bit vectors in the performance-critical part of salsa20/8, which are executed in the even pipe (shuffles and the other instructions are executed in the odd pipe). Calculating one hash also needs 256KB of memory traffic (128KB is written sequentially, 128KB is read in scattered 128-byte chunks). So, taking into account that an SPE core can execute one instruction from the even pipe each cycle, the theoretical performance limit based on computational power is (6 * 3200000000) / 434176 ~= 44.2 khash/s. The theoretical performance limit based on memory bandwidth is 25GB / 256KB ~= 95.4 khash/s. There is a lot of headroom for the memory bandwidth, and arithmetic calculations are the bottleneck. That said, Cell has precise control over memory operations by scheduling DMA transfers, and it can overlap DMA transfers with calculations. This allows memory bandwidth to be utilized very efficiently for the scrypt algorithm.

This page seems to say that HD 6990 has 320GB/s of memory bandwidth. And here ArtForz tells us that it is possible to achieve < 20% peak BW with GPU. Doing some math again, we get 320GB * 0.2 / 256KB ~= 244 khash/s. Looks rather believable to me.

edit: corrected HD 6990 memory bandwidth (it is 320GB/s and not 350GB/s)

What about smix and the mul operations in scrypt?  I thought the reason for the speed of Cell as implemented in the PS3 (~35 kh/s) was due to the 256 KB of onboard local memory...  The slowdown in scrypt(1024,1,1) has little to do with the speed of the memory and everything to do with the speed of random accesses to that memory.  Cache (or onboard memory in the case of Cell) is way, way faster in terms of random access to data (L1 and L2 are 4 and 10 clock cycles respectively for an i7).

Quote
With DRAM memory, random access is never efficient. In fact, the GPU hardware looks at all memory addresses that the running threads want to access at a given cycle, and attempts to coalesce them into a single DRAM access - in case they are not random. Effectively the contiguous range from i to i+#threads is reverse-engineered from the explicitly computed i,i+1,i+2… - another cost of replicating the index in the first place. If the indexes are in fact random and can not be coalesced, the performance loss depends on “the degree of randomness”. This loss results from the DRAM architecture quite directly, the GPU being unable to do much about it - similarly to any other processor.

http://www.yosefk.com/blog/simd-simt-smt-parallelism-in-nvidia-gpus.html

GPUs generally have little onboard cache (16-32 KB) because the data they process is intended to be sequential (and it usually is for 3D applications).
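
To make the access pattern under discussion concrete, here is a rough Python sketch of the two phases of scrypt's ROMix loop for scrypt(1024,1,1): a sequential fill of 1024 blocks of 128 bytes, followed by 1024 reads whose index depends on the data just computed. The `mix` function is a stand-in built on SHAKE-256, NOT the real salsa20/8-based BlockMix, so the output is not a valid scrypt hash; only the shape of the memory traffic is meant to match.

```python
import hashlib

N = 1024       # cost parameter in scrypt(1024, 1, 1)
BLOCK = 128    # bytes per block when r = 1


def mix(block: bytes) -> bytes:
    """Stand-in for BlockMix (salsa20/8). Keeps the 128-byte shape only."""
    return hashlib.shake_256(block).digest(BLOCK)


def integerify(block: bytes) -> int:
    """Index from the low 32 bits (little-endian) of the last 64-byte half."""
    return int.from_bytes(block[-64:-60], "little") % N


def romix(x: bytes):
    v = []
    for _ in range(N):            # phase 1: sequential writes, N * 128 bytes
        v.append(x)
        x = mix(x)
    reads = []
    for _ in range(N):            # phase 2: data-dependent (scattered) reads
        j = integerify(x)         # next index is unknown until x is computed
        reads.append(j)
        x = mix(bytes(a ^ b for a, b in zip(x, v[j])))
    return x, reads


out, reads = romix(bytes(BLOCK))
bytes_written = N * BLOCK
bytes_read = len(reads) * BLOCK
print(f"written sequentially: {bytes_written // 1024} KB")
print(f"read in scattered {BLOCK}-byte chunks: {bytes_read // 1024} KB")
print(f"first read indices: {reads[:8]}")
```

Each read index depends on the previous mixing step, which is why a single hash cannot prefetch its own reads. Many independent hashes in flight can still overlap their waits, which is the point being argued from the bandwidth side of this thread.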

Code:
XMR: 44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns
ssvb
Newbie
*
Offline Offline

Activity: 39
Merit: 0


View Profile
February 10, 2012, 10:31:58 PM
 #56

Bandwidth has nothing to do w/ scrypt.  LATENCY does.  Which is why the amount of L1 cache is so important.
L1 cache is just less important than you think :) For example, my scrypt miner optimizations for Cell do not use the 256KB of fast local memory at all. It is insufficient for the 4x unrolling that is needed to eliminate pipeline stalls; without that unrolling, at least half of the performance would be lost. But scrypt is not memory-heavy enough, so I can easily get away with working from main memory and still have a lot of memory bandwidth headroom. LATENCY is not important in my case, because memory accesses are pipelined, get executed asynchronously, and do not block execution. But you can check the scrypt_spu_core8 function in the code yourself.

If GPUs have excessive computational resources, then even spending a lot of time waiting for memory (80% or so for each execution core) is likely not a problem, as long as all of the cores are competing for the precious memory bandwidth and fully saturating it. I did not think about GPU mining earlier simply because I had no experience with GPU programming and honestly did not expect GPUs to have that much memory bandwidth (more than a 10x advantage over Cell).
RoloTonyBrownTown
Sr. Member
****
Offline Offline

Activity: 350
Merit: 250



View Profile
February 10, 2012, 10:37:27 PM
 #57

What's funny to me about all this is that GPU mining the coin early, before anyone else could, is exactly what CoinHunter did with SC :D   Such a douche.  You should be banned from this forum.

Mousepotato
Hero Member
*****
Offline Offline

Activity: 896
Merit: 1000


Seal Cub Clubbing Club


View Profile
February 10, 2012, 10:47:05 PM
 #58

What's funny to me about all this is that GPU mining the coin early, before anyone else could, is exactly what CoinHunter did with SC :D   Such a douche.  You should be banned from this forum.

 Huh
Well.. yeah, SC started out as a GPU chain.

Mousepotato
P4man
Hero Member
*****
Offline Offline

Activity: 518
Merit: 500



View Profile
February 10, 2012, 10:53:14 PM
 #59

So I guess CH is really telling us it's time to quickly buy some litecoins, because difficulty is about to explode :)

Joshwaa
Hero Member
*****
Offline Offline

Activity: 497
Merit: 500



View Profile
February 10, 2012, 10:57:27 PM
 #60


It currently gets ~250KH/s on a 6990 .


The LTC-mining efficiency of CPUs and GPUs (in W per kH/s) seems to be similar:
core i3 / 3.1 GHz / 3 threads / ~15 W for mining alone / 12.5 kH/s => 1.2 W per kH/s
core i7 / 3.6 GHz / 7 threads / ~46 W for mining alone / 32 kH/s => 1.44 W per kH/s
6990 / ~350 W (I don't know the correct wattage) for 250 kH/s => 1.4 W per kH/s

But with more improvement there will be an even better W per kH/s rate for the 6990 and the other mining cards.

So the end of LTC CPU mining is not too far away, because
one 6990 does the job of ~8 core i7 or ~20 core i3.

A unicorn can run 500 miles on one gallon (of beer).  A gallon of beer is more expensive than a gallon of gas, but likely we are still seeing the end of hybrid cars, because one unicorn is equal to almost 10x Toyota Prius.  Plus, with improvement in unicorn-beer technology you are getting an even better m/goa rate (that's miles per gallon of alcohol).


OMG I laughed for about 4 mins from that one. Priceless!

Like what I said : 1JosHWaA2GywdZo9pmGLNJ5XSt8j7nzNiF
Don't like what I said : 1FuckU1u89U9nBKQu4rCHz16uF4RhpSTV