Bitcoin Forum
March 19, 2024, 07:43:51 AM *
Author Topic: NVIDIA Kepler (K20) from 134MHash/s to 330MHash/s with CUDA  (Read 73284 times)
charliemaggot
Member
**
Offline Offline

Activity: 79
Merit: 10



View Profile
April 06, 2013, 04:50:59 AM
 #61

Quote from: InqBit
I am getting a 200 error. I didn't see it covered above. I downloaded all the files in the bin folder into the same folder, is that correct?

@InqBit What cards do you have?

Quote from: InqBit
I have a GTX480 and GTS450

edit: Fermi cards, but figured it would run. Maybe not?

I edited my last post, try the 20 ptx file. I think psychocoder is working on more optimisations.
InqBit
Newbie
*
Offline Offline

Activity: 27
Merit: 0



View Profile
April 06, 2013, 04:54:23 AM
 #62

Heading out the door now, but will make the changes when I get back. Thanks!
InqBit
Newbie
*
Offline Offline

Activity: 27
Merit: 0



View Profile
April 06, 2013, 06:11:55 AM
 #63

Nice work. The 20 file got it going, and look at the results: 19 GHash/s from Nvidia cards! (The 30 file returned a 209 error.)

https://i.imgur.com/ag4W2YR.png

It's working but BTC is showing work as dupes. Not surprisingly...

GUIMiner ends up reporting as having connection issues fwiw.
GimpyPrime
Member
**
Offline Offline

Activity: 68
Merit: 10


View Profile
April 06, 2013, 07:04:04 AM
 #64

I've got it working now. With CUDA I get 100-110 MHash/s on a GTX 670 FTW (overclocked), with an i7 CPU.

Using OpenCL I get approximately the same range, sometimes even as high as 120-130 MHash/s.

Not expecting to become a bitcoin millionaire with a 670GTX lol, just doing this for fun. However whatever improvements got added do not appear to be assisting with my machine.

Curious though what is the difference between the bitcoinminercuda.20/30 files? I am wondering if there is something I need to compile myself here.
dentldir
Sr. Member
****
Offline Offline

Activity: 333
Merit: 250



View Profile
April 06, 2013, 08:32:51 AM
 #65

No luck with a 660ti on Windows with any of the .ptx files.  20 and 30 load but the binary crashes after the Target and Done allocating CUDA resource messages.  (rpcminer-mod-cuda.exe has stopped working).

Tried with 314.22 and the CUDA 5.0 dev driver default (306.xx I think?).

Can provide more info if needed.

Thanks.




1DentLdiRMv3dpmpmqWsQev8BUaty9vN3v
DastanX
Newbie
*
Offline Offline

Activity: 42
Merit: 0



View Profile
April 06, 2013, 02:25:08 PM
 #66

This is still very experimental. On Linux I get 350-370 MH/s, while on Windows I only get 120 MH/s at most.

I will try to make a workaround for it.
camaro69327
Newbie
*
Offline Offline

Activity: 59
Merit: 0



View Profile
April 06, 2013, 03:10:04 PM
 #67

Might as well jump in... First, no Linux here, just a point-and-click old guy on Win 7.

I have 2 x GTX 580. Using CGminer I get 160-200 MHash/s PER card (depends on the overclock and on whether I am using the computer or not).

Trying this, at first I had the "Unable to load CUDA module: 209" error. I grabbed the other .ptx file, bitcoinminercuda.20.ptx. Renamed it and...

Now I am getting "curl return Value = 7"

Kinda getting lost here...lol These are some of the Command lines tried....

rpcminer-mod-cuda.exe -aggression=8 -gpugrid=64 -gputhreads=384 -o - url=http://stratum.bitcoin.cz:3333 -user=####### -password=#####
rpcminer-mod-cuda.exe -aggression=8 -gpugrid=256 -gputhreads=512 - url=http://stratum.bitcoin.cz:3333 -user=###### -password=#####
rpcminer-mod-cuda.exe -url=http://stratum.bitcoin.cz:3333 -user=##### -password=####
rpcminer-mod-cuda.exe -url=http://localhost:8332 -user=##### -password=#### <<(set according to Bitcoin.conf)

"curl return Value = 7"

Thanks for all the hard work you guys do, would really like to get these cards working better (they are embarrassed to announce their terrible Hash rates to all the other cards on my network. Especially the 7970 getting 720 Mhash...lol).
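
On the "curl return Value = 7" message: in libcurl, error code 7 is CURLE_COULDNT_CONNECT, meaning the TCP connection to the given host and port could not be opened at all. A plain socket check is a quick way to rule out a wrong host or port before changing any miner options. This is only a rough sketch: the helper name `can_connect` is made up for illustration, and port 8332 is simply the one from the localhost command line above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Port taken from the localhost command line in this post; adjust to
    # whatever rpcport is set in your own bitcoin.conf.
    host, port = "localhost", 8332
    state = "reachable" if can_connect(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

If the port is not reachable, the miner cannot connect either, no matter which -aggression or -gpugrid values are passed.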
gateway
Hero Member
*****
Offline Offline

Activity: 552
Merit: 500


View Profile
April 06, 2013, 05:04:10 PM
 #68

Quote from: DastanX
This is still very experimental. On Linux I get 350-370 MH/s, while on Windows I only get 120 MH/s at most.

I will try to make a workaround for it.

Wow, that's a huge jump on Linux. It would be great to see that for us Windows users!

What card are you using?
gateway
Hero Member
*****
Offline Offline

Activity: 552
Merit: 500


View Profile
April 06, 2013, 05:05:26 PM
 #69

Quote from: camaro69327
Might as well jump in... First, no Linux here, just a point-and-click old guy on Win 7.

I have 2 x GTX 580. Using CGminer I get 160-200 MHash/s PER card (depends on the overclock and on whether I am using the computer or not).

Trying this, at first I had the "Unable to load CUDA module: 209" error. I grabbed the other .ptx file, bitcoinminercuda.20.ptx. Renamed it and...

Now I am getting "curl return Value = 7"

Kinda getting lost here...lol These are some of the Command lines tried....

rpcminer-mod-cuda.exe -aggression=8 -gpugrid=64 -gputhreads=384 -o - url=http://stratum.bitcoin.cz:3333 -user=####### -password=#####
rpcminer-mod-cuda.exe -aggression=8 -gpugrid=256 -gputhreads=512 - url=http://stratum.bitcoin.cz:3333 -user=###### -password=#####
rpcminer-mod-cuda.exe -url=http://stratum.bitcoin.cz:3333 -user=##### -password=####
rpcminer-mod-cuda.exe -url=http://localhost:8332 -user=##### -password=#### <<(set according to Bitcoin.conf)

"curl return Value = 7"

Thanks for all the hard work you guys do, would really like to get these cards working better (they are embarrassed to announce their terrible Hash rates to all the other cards on my network. Especially the 7970 getting 720 Mhash...lol).


Don't use the stratum URL, use this instead: btcguild.com:8332
peacefulmind
Full Member
***
Offline Offline

Activity: 196
Merit: 100


View Profile
April 06, 2013, 05:50:19 PM
 #70

I have 2x TITAN but they are on win7 for DirectX.

Quote from: FrictionlessCoin
"I think you are to hung up on this notion about 'pre-mining' being a No-No."
- from journeys into the dark depths of the alt coin forum....
psychocoder (OP)
Newbie
*
Offline Offline

Activity: 49
Merit: 0


View Profile
April 06, 2013, 06:16:11 PM
 #71

Please use a pool with getwork support; there is no real stratum support inside the miner.
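
To illustrate why a stratum port does not work here: getwork is a JSON-RPC call sent as an HTTP POST with basic authentication, while stratum speaks line-delimited JSON over a raw TCP socket. The sketch below only builds such a getwork request so its shape can be inspected. The function name `build_getwork_request` and the credentials are placeholders, not something taken from rpcminer itself.

```python
import base64
import json
import urllib.request


def build_getwork_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 'getwork' request with HTTP basic auth."""
    body = json.dumps({"method": "getwork", "params": [], "id": 1}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL and credentials; use the values from your pool or bitcoin.conf.
    req = build_getwork_request("http://localhost:8332", "rpcuser", "rpcpassword")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req, timeout=10)
```

A server that only speaks stratum will not answer this kind of HTTP request, which is why a getwork pool URL or a local stratum proxy is needed.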

Tonight (German time) I will post a new repository with my new code. I hope charliemaggot will create a Windows version.
The new code supports all old GPUs (back to the 9800 GTX, I think). No ptx file is needed.

I don't think we can get a good speedup on old GPUs, because they have very slow bit operations.

GPU Overview:

C1070 - old GPU - 30 Streaming Multiprocessors (SM) - ~53MHash/s
C2050 - old Fermi - 14 SM - ~ 90MHash/s
K20c - new Kepler - 13 SM - ~ 325MHash/s
charliemaggot
Member
**
Offline Offline

Activity: 79
Merit: 10



View Profile
April 06, 2013, 08:27:56 PM
Last edit: April 06, 2013, 08:39:06 PM by charliemaggot
 #72

@GimpyPrime The default bitcoinminercuda.ptx file was built for "compute 3.5", which came straight from the patch in psychocoder's first post in this thread. It is the latest optimised level, for NVIDIA Tesla K20 cards. You need to know the compute level of your card (https://developer.nvidia.com/cuda-gpus) and use the appropriate file. Your GTX 670 should be compute level 3.0, so you would need the .30 file.

You can obviously build them yourself, I was just trying to include as much as possible so it could just be downloaded and run, however I didn't include a note about which file was needed. If you have VS 2010 you can edit the buildcuda.bat file and change the compute number to be appropriate for your card - and change the compiler value if you are using VS2012.
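
As a rough sketch of that selection rule: PTX built for a given compute level loads on devices of that level or newer, so the file to pick is the highest build that does not exceed the card's compute capability. The card table below covers only GPUs mentioned in this thread. The name bitcoinminercuda.30.ptx is assumed by analogy with bitcoinminercuda.20.ptx, and `pick_ptx` is a made-up helper name.

```python
# Compute capability (major, minor) of cards mentioned in this thread,
# as listed on NVIDIA's CUDA GPUs page.
COMPUTE_CAPABILITY = {
    "GTS 450": (2, 1),
    "GTX 480": (2, 0),
    "GTX 580": (2, 0),
    "GTX 660 Ti": (3, 0),
    "GTX 670": (3, 0),
    "GTX 680": (3, 0),
    "GTX Titan": (3, 5),
    "Tesla K20": (3, 5),
}

# Available builds, highest first: (minimum capability, file name).
# The .30 file name is an assumption based on the naming of the .20 file.
PTX_BUILDS = [
    ((3, 5), "bitcoinminercuda.ptx"),     # default file, built for compute 3.5
    ((3, 0), "bitcoinminercuda.30.ptx"),
    ((2, 0), "bitcoinminercuda.20.ptx"),
]


def pick_ptx(card: str) -> str:
    """Return the newest ptx build that the given card can load."""
    capability = COMPUTE_CAPABILITY[card]
    for minimum, filename in PTX_BUILDS:
        if capability >= minimum:
            return filename
    raise ValueError(f"no ptx build available for compute capability {capability}")


if __name__ == "__main__":
    for card in ("GTX 670", "GTX 580", "Tesla K20"):
        print(card, "->", pick_ptx(card))
```

The selected file then has to be renamed to bitcoinminercuda.ptx next to the exe. For reference, in the CUDA driver API error 209 is CUDA_ERROR_NO_BINARY_FOR_GPU and error 200 is CUDA_ERROR_INVALID_IMAGE, which fits loading a ptx built for a newer compute level than the card supports.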

@dentldir Should work with the 30 file on your 660. Latest stable driver? Maybe try again after I rebuild it from psychocoder's latest changes.

@camaro69327 Your card is a 2.0 Fermi device, so not sure you would see much benefit from the Kepler (3.0) optimisation in this thread. Wait and see if psychocoder can create some better optimisations. The app is still using the getwork api, so you need to use http://api.bitcoin.cz:8332 or download their stratum proxy.

@psychocoder I'll start building it for Windows once you get changes done, if you could let me know where they are. Thanks.
camaro69327
Newbie
*
Offline Offline

Activity: 59
Merit: 0



View Profile
April 06, 2013, 09:30:09 PM
 #73

Thanks for the responses, guys. I did get this working; I had the wrong port for localhost. 9332 worked. I could only get one card hashing. I could get 160 easily; 214 MHash/s was another common reading.

300+ MHash/s, but only for 3 or 4 shares, then 0.

Back to CGminer and a steady 170-200 for now. I thought this was the generation I had. Fermi, not Kepler. Duh.




psychocoder (OP)
Newbie
*
Offline Offline

Activity: 49
Merit: 0


View Profile
April 06, 2013, 10:38:35 PM
 #74

I have checked in my code to https://github.com/psychocoder-germany/rpcminer-mod. The code is a little bit slower than my first patch. I moved some calculations to the CPU to save registers.
Now I only need 32 registers on Fermi and get 100% occupancy. There is no need to create ptx files, because all versions are inside the binary after compiling.

Sorry that I created a new repo, but it was my first git commit. I am old school and normally use svn^^

@charliemaggot: Please add Windows compile support.
psychocoder (OP)
Newbie
*
Offline Offline

Activity: 49
Merit: 0


View Profile
April 07, 2013, 12:11:15 AM
Last edit: April 07, 2013, 12:23:30 AM by psychocoder
 #75

OK, switched back to faster kernel.

GPU Overview update:

C1070 - old GPU - 30 Streaming Multiprocessors (SM) - ~46MHash/s (slows down with the last patched kernel)
C2050 - old Fermi - 14 SM - ~ 107MHash/s
K20c - new Kepler - 13 SM - ~ 325-350 MHash/s   (options: -aggression=11)

Note: the option -gputhreads changes nothing; all kernels are built to run 256 threads.
Limie
Member
**
Offline Offline

Activity: 70
Merit: 10



View Profile WWW
April 07, 2013, 12:15:27 AM
 #76

Watching both threads with anticipation.

XRP- rJZrZTkMYrqe94c1V6KS1gbYpcaJRQqcd8
gateway
Hero Member
*****
Offline Offline

Activity: 552
Merit: 500


View Profile
April 07, 2013, 12:23:02 AM
 #77

Quote from: psychocoder
OK, switched back to faster kernel.

GPU Overview update:

C1070 - old GPU - 30 Streaming Multiprocessors (SM) - ~46MHash/s (slows down with the last patched kernel)
C2050 - old Fermi - 14 SM - ~ 107MHash/s
K20c - new Kepler - 13 SM - ~ 325-350 MHash/s

What exactly is the K20c? Just so we know here. And what GPU are you using right now? I can't wait to see 300+ on my 680...
psychocoder (OP)
Newbie
*
Offline Offline

Activity: 49
Merit: 0


View Profile
April 07, 2013, 12:42:33 AM
 #78

The K20c is http://www.techpowerup.com/gpudb/564/NVIDIA_Tesla_K20c.html, a high-performance GPU card. These cards are made to run math calculations with floating point operations. It is not meant for a home PC. The home PC version with the same architecture is the GTX Titan http://www.techpowerup.com/gpudb/1996/.html

The GTX680 does not have the new bit rotate (funnel shift) instruction, so I think 300+ is not possible.
Theoretical calculation: 1006*8/(3733/160+1194/32)=132 MHash/s   (the magic numbers are the operation counts from the binary for this implementation)

IMO the GTX680 GPU is limited to max 132 MHash/s
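
The arithmetic behind that estimate can be spelled out as a short sketch. It reads 1006 as the GTX680 core clock in MHz, 8 as its number of multiprocessors, and 160 and 32 as the per-clock, per-multiprocessor throughput of the two instruction groups. That reading is an interpretation of the formula, not something stated in the post, and `estimate_mhash` is a made-up helper name.

```python
def estimate_mhash(clock_mhz, sm_count, op_groups):
    """Upper-bound hash rate in MHash/s.

    op_groups is a list of (instruction_count, instructions_per_clock_per_sm)
    pairs; each group adds count / rate clock cycles per hash on one SM.
    """
    cycles_per_hash = sum(count / rate for count, rate in op_groups)
    return clock_mhz * sm_count / cycles_per_hash


# Numbers taken from the formula in the post above.
gtx680 = estimate_mhash(1006, 8, [(3733, 160), (1194, 32)])
print(f"GTX680 upper bound: {gtx680:.1f} MHash/s")  # prints 132.7
```

In this formula the 1194 instructions that run at only 32 per clock account for more cycles (about 37.3) than the 3733 faster ones (about 23.3), so the slow group dominates the bound.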
wzl
Newbie
*
Offline Offline

Activity: 25
Merit: 0


View Profile
April 07, 2013, 01:02:08 AM
 #79

Just compiled it and I'm running it on a GTX680. CUDA is still new to me, so I'm playing with the parameters.
Unfortunately I don't have several thousand $ for a K20.
charliemaggot
Member
**
Offline Offline

Activity: 79
Merit: 10



View Profile
April 07, 2013, 05:58:23 AM
Last edit: April 07, 2013, 06:15:35 AM by charliemaggot
 #80

I have updated the Windows build with psychocoder's changes:

https://github.com/cdmackie/rpcminer-mod (use the master branch)

We'll merge them together shortly.

To just run, you only need the bin folder, and run the rpcminer-mod-cuda.exe. There is no need for the ptx files anymore.

To build yourself, you need MSVC 2010 and the CUDA SDK 5.x.

Please post any errors or successes.