Bitcoin Forum
Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX]
cbuchner1
April 16, 2013, 08:54:26 AM
 #301

> Thank you for your work! :)
> I think you did a great job!
>
> What I miss is a variable to control the system/GPU load.
> The --interactive flag does not really work for me, I even experienced greater desktop lags with "interactive 1"...

For interactive mode you need to let autotune choose a smaller workload. Manually specifying the same -l parameter as for non-interactive mode is not a good idea.

Interactive mode aims for around 60 individual CUDA kernel launches per second, with a millisecond of CPU+GPU sleep time in between. That should leave room for 60 frame updates per second on the display, so you can watch movies or porn or whatever while mining ;)
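
To put rough numbers on that, here is a minimal Python sketch. It assumes exactly 60 launches per second and exactly 1 ms of sleep per launch (the post only says "around"), and the function names are made up for illustration; they are not part of cudaMiner.

```python
def kernel_time_budget_ms(launches_per_second: float = 60.0, sleep_ms: float = 1.0) -> float:
    """Time left for one kernel launch after subtracting the sleep from each period."""
    period_ms = 1000.0 / launches_per_second
    return period_ms - sleep_ms


def hashes_per_launch(khash_per_s: float, budget_ms: float) -> int:
    """Hashes one launch may process to fit the budget at a given sustained rate."""
    # 1 kHash/s is the same as 1 hash per millisecond, so the product is in hashes.
    return int(khash_per_s * budget_ms)


budget = kernel_time_budget_ms()
print(f"budget per launch: {budget:.2f} ms")
print(f"hashes per launch at 200 kHash/s: {hashes_per_launch(200.0, budget)}")
```

Under these assumptions each launch gets a bit under 16 ms, so the launch size has to be picked for interactive mode specifically rather than carried over from a non-interactive run.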

Christian
cbuchner1
April 16, 2013, 09:13:28 AM
 #302

> Christian, could you just post the source to git and host the binaries there?

My only prior experience is with SourceForge, but I will see how I can get started on GitHub.

UPDATE: I think GitHub has removed the feature to serve binary distributions as separate downloads.
cbuchner1
April 16, 2013, 09:15:21 AM
 #303

> When compiling 04-14 in Linux (Ubuntu 12.04), I'm getting the following message not seen in 04-09:

It's a known problem. Try targeting a 32-bit executable, as shown in configure.sh.

g++-multilib, ia32-libs and libcurl4-dev:i386 should be installed before that.
SubNoize
April 16, 2013, 12:50:18 PM
 #304

Out of curiosity, how much further do you think you can push NVIDIA cards? Do you see any improvements coming any time soon, or will another large improvement only come from an unusual find?
portosTCM
April 16, 2013, 01:34:53 PM
 #305

Will you improve the SHA-256 version? I see that your miner achieves good khash rates, so I can't wait for a fully working GPU version :)
cbuchner1
April 16, 2013, 01:54:07 PM
 #306

> Out of curiosity, how much further do you think you can push NVIDIA cards? Do you see any improvements coming any time soon, or will another large improvement only come from an unusual find?

My crystal ball is currently malfunctioning. I advise that you consult a fortune teller of your choosing ;)
cbuchner1
April 16, 2013, 01:54:56 PM
 #307

> Will you improve the SHA-256 version? I see that your miner achieves good khash rates, so I can't wait for a fully working GPU version :)

No motivation to do so, as Bitcoin mining is so unprofitable.
Schleicher
April 16, 2013, 02:08:31 PM
 #308

> My only prior experience is with sourceforge, but I will see how I can get started on github.
> UPDATE: I think they've removed the feature to serve binary distributions as separate downloads.

SourceForge is OK, I think. You could use that for the binaries.

Bitcoin donations: 1H2BHSyuwLP9vqt2p3bK9G3mDJsAi7qChw
datguyian
April 16, 2013, 03:02:18 PM
 #309


Thanks for this! I've updated the sheet with the Quadro cards I've been messing with (600, 4000 and 4600). I haven't had much time to experiment with the settings, so I pretty much let autotune do its thing and then used whatever kernel it decided on (and noted it in the spreadsheet).

cbuchner1
April 16, 2013, 03:39:14 PM
 #310

A few days ago I ordered a used 560Ti 448 core edition (~130 Euros) because of the stellar performance figures.

I believe the high memory bandwidth of 500 series cards is mainly responsible for their performance. And it seems the core count vs. memory throughput is rather balanced for this type of application.

The Kepler series (6xx) seems to have too many CUDA cores and a memory interface that isn't any better than the 500 series. In other words: too much compute power in relation to bandwidth.

About future optimization possibilities:

I do believe that adding a LOOKUP_GAP implementation for factors 2 and 3 may boost the performance slightly, and more significantly for Kepler cards and the GTX Titan (250 kHash/s for a non-overclocked Titan seems really low).
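
For readers unfamiliar with the idea, here is a minimal Python sketch of what a LOOKUP_GAP does to scrypt's scratchpad loop: only every gap-th entry is stored, and the skipped ones are recomputed on demand. It assumes SHA-256 as a stand-in for scrypt's real BlockMix (Salsa20/8) and a simplified index function, so it illustrates the technique only and is not cudaMiner's kernel.

```python
import hashlib


def mix(block: bytes) -> bytes:
    # Stand-in for scrypt's BlockMix; SHA-256 just keeps the sketch short.
    return hashlib.sha256(block).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def index_of(x: bytes, n: int) -> int:
    # Simplified Integerify: first four bytes, little endian, modulo n.
    return int.from_bytes(x[:4], "little") % n


def romix_full(seed: bytes, n: int) -> bytes:
    """Reference loop: store all n scratchpad entries."""
    v, x = [], seed
    for _ in range(n):
        v.append(x)
        x = mix(x)
    for _ in range(n):
        x = mix(xor(x, v[index_of(x, n)]))
    return x


def romix_gap(seed: bytes, n: int, gap: int) -> bytes:
    """Same result, but only every gap-th entry is kept in memory."""
    v, x = [], seed
    for i in range(n):
        if i % gap == 0:
            v.append(x)
        x = mix(x)
    for _ in range(n):
        j = index_of(x, n)
        entry = v[j // gap]          # nearest stored entry at or below j
        for _ in range(j % gap):     # recompute the skipped steps
            entry = mix(entry)
        x = mix(xor(x, entry))
    return x


seed = mix(b"example input")  # 32 bytes, same width as the mix output
print(romix_full(seed, 64) == romix_gap(seed, 64, 2) == romix_gap(seed, 64, 3))
```

The scratchpad shrinks by roughly the gap factor, and the price is up to gap-1 extra mix calls per lookup.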

I think that by using some inline PTX assembly for the xor_salsa implementation we can get another slight boost, and maybe also a reduction in kernel register count.

I have doubts about the potential and feasibility of the texture cache. It would certainly work better with a very small scratchpad, since a smaller lookup table improves the cache hit-to-miss ratio. But that may require such a high LOOKUP_GAP value that any memory performance benefit is offset by the extra computation required.
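
The trade-off can be put into rough numbers with a small Python sketch. It assumes Litecoin's scrypt parameters (N=1024, r=1, so each scratchpad entry is 128 bytes) and that lookups hit all indices equally often; the function names are illustrative only.

```python
import math


def scratchpad_bytes(n: int = 1024, r: int = 1, gap: int = 1) -> int:
    """Memory per hash when only every gap-th entry of 128*r bytes is stored."""
    return math.ceil(n / gap) * 128 * r


def relative_compute(n: int = 1024, gap: int = 1) -> float:
    """Mix calls relative to gap=1, assuming each index is looked up once on average."""
    # Baseline: n calls to fill the scratchpad plus n calls in the lookup loop.
    extra = sum(j % gap for j in range(n))  # recomputation steps for skipped entries
    return (2 * n + extra) / (2 * n)


for gap in (1, 2, 3, 8):
    print(gap, scratchpad_bytes(gap=gap), round(relative_compute(gap=gap), 3))
```

Under these assumptions a gap of 2 halves the scratchpad for about 25% more mixing work, and the extra work keeps growing as the gap gets large enough to make the table cache-friendly.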

Christian
Nomusss
April 16, 2013, 06:50:04 PM
 #311

Got 185 kH/s on a GTX 680 (1214/6038).

Thanks for the software!
portosTCM
April 16, 2013, 06:52:50 PM
 #312

cudaMiner shows CPU usage near 100%. How can I fix it?
cbuchner1
April 16, 2013, 07:18:57 PM
 #313

cudaMiner's inconsistent CPU usage is a topic that I will be working on. You can currently only play with the -i flag to see if it makes a difference.

I think I found out what is wrong with the texture cache. I was not computing the texel coordinates correctly - in particular I failed to add a block+warp specific texel offset. Results now do validate, but I see a performance degradation instead of a gain.

I will have to determine whether it is better to use 2-dimensional texturing or a single 1-dimensional linear texture. I may even allow passing the dimensionality directly via the -C flag ;)
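
As an illustration of the addressing involved, here is a hedged Python sketch of how a per-block, per-warp offset could be added to a local texel index, and how the same linear index maps to 1D or 2D coordinates. The layout (one contiguous region per warp) and all names are assumptions made for this example; the post does not describe cudaMiner's actual memory layout.

```python
def texel_offset(block: int, warp: int, warps_per_block: int, texels_per_warp: int) -> int:
    """Start of the region owned by one warp, assuming contiguous per-warp regions."""
    return (block * warps_per_block + warp) * texels_per_warp


def texel_1d(block: int, warp: int, local: int,
             warps_per_block: int, texels_per_warp: int) -> int:
    """Linear texel index: the warp-specific offset plus the index within the region."""
    return texel_offset(block, warp, warps_per_block, texels_per_warp) + local


def texel_2d(index: int, width: int) -> tuple:
    """Map a linear index to (x, y) for a 2D texture that is 'width' texels wide."""
    y, x = divmod(index, width)
    return x, y


idx = texel_1d(block=1, warp=2, local=5, warps_per_block=3, texels_per_warp=100)
print(idx, texel_2d(idx, width=256))
```

Forgetting the offset term would make every warp read from the first region, which is one way results could fail to validate.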

Christian
FalconFour
April 16, 2013, 08:31:00 PM
 #314

Well, I definitely appreciate that someone's put some work into an nVidia miner! :D

Maybe I'm alone here, but I kinda think most of us *aren't* going to go out and buy all-new cards just to mine Litecoin. Maybe. Maybe not. I dunno. But the most valuable use I have for it now is going through a junk-pile at the shop and pulling out all the 8000-series and higher cards and building mining systems for them (while the shop owner and I work together mining Bitcoin, of course... hehe).

That said, the best card that's been in the pile so far is a 9800GT (which was kinda impressive - thought it was an 8800). So I've got a 9800GT and a 8800GTX working right now with this cudaMiner.

Here's the problem I ran into. Both are experiencing all-over-the-map performance variations. The 8800GTX was previously cranking out 34-36khps (with accepted results), then when I moved to a 64-bit Windows 7 SP1 install (previously 32-bit Vista SP0 from the initial OEM install), it shot up to ~44khps. However, after updating drivers and allowing me to crank the fan speed higher, it fell through the floor and lingers around 16khps.

And that 9800GT? It was cranking out 16khps, pretty pathetic, under a 32-bit Win7 SP0 install. When I moved that up to Win7 SP1 x64, it again shot up to ~24khps, but that also wasn't stable - next time I restarted the miner, it's only doing... EIGHT... YES... EIGHT! 8khps.


I've been playing with the different driver versions, and it seems that cudaMiner won't run (just silently crashes/exits without any output other than the initial banner - not even an error log entry) with any drivers below version 300. Can't make any sense of it... :/

feed the bird: 187CXEVzakbzcANsyhpAAoF2k6KJsc55P1 (BTC) / LiRzzXnwamFCHoNnWqEkZk9HknRmjNT7nU (LTC)
cbuchner1
April 16, 2013, 09:08:45 PM
 #315

CUDA drivers rel 304.54 (Linux) and 306.94 (Windows) or later are required for CUDA 5.0 apps like cudaMiner. Because I am not doing error checks yet, you will see the program crashing if these requirements are not met.
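
Until such checks exist in the program, a pre-flight check could look like the following Python sketch. The minimum versions are the ones stated above; how the installed driver version string is obtained is left out, and the function names are hypothetical.

```python
# Minimum driver releases for CUDA 5.0 apps, as stated in the post.
MIN_DRIVER = {"linux": (304, 54), "windows": (306, 94)}


def parse_version(text: str) -> tuple:
    """Turn '314.22' into (314, 22); components are compared as integers."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports_cuda5(platform: str, version: str) -> bool:
    """True if the driver version meets the minimum for the given platform."""
    return parse_version(version) >= MIN_DRIVER[platform.lower()]


print(driver_supports_cuda5("windows", "314.22"))
print(driver_supports_cuda5("windows", "301.42"))
```

Printing a clear message when the check fails would replace the silent exit described earlier in the thread.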

About 8800/9800 card performance: I am seeing the same performance variations with an NVIDIA 9600M GT on Windows: sometimes 4 kHash/s, sometimes 6 kHash/s.

On Linux I get a solid 9.6 kHash/s.

I am starting to believe there's something wrong with the Windows drivers for very old card models.

Could it be that the device is not clocking up for CUDA workloads? Have you tried running any kind of DirectX or OpenGL app simultaneously, to see if that gets it up to speed?

UPDATE: the texture cache feature seems to work in 1D and 2D modes now, but does not really make things faster yet. I do get accepted and verified shares though (happy!)

UPDATE2: I may have solved the excessive CPU utilization problem on Windows, too.

Christian
FalconFour
April 16, 2013, 11:30:12 PM
 #316

Well, these old cards don't have dynamic clocks for 2D/3D modes - which is why they get so damn hot while just sitting idle. I do however think that if Linux is giving so much of a performance boost, it'd be worth just dumping Ubuntu on these things to mine with them. They're "shell" computers anyway - optimally just going to sit up on a shelf connected to power and network, just being remote-controlled for mining. TONS of motherboards, hard drives, CPUs, memory sticks, and GPUs lying around that I'd love to put to work while the shop doesn't have to pay for power ;) You got a recommendation for a Linux distro that'll do the job best? :D

FalconFour
April 16, 2013, 11:34:27 PM
 #317

Whoa, never mind Linux. This just happened when I disabled Aero/desktop composition:


From 8khash to 28khash. Interesting. Now, wonder what the 8800GTX will do...

I do think the resolution of the auto-tune is a bit sketchy though. It seems to fly through the khash/sec timings far faster than it can get an accurate reading, which results in many test results just being all over the place (20... 18... 20... 22... 18...). There's stuff going on in the background (like drawing on the screen) that I'm sure causes some bumps in the readings that it doesn't test twice. Maybe increase the test duration for each step, and lock into multiples of two? I couldn't imagine "13x3" would serve any better purpose than a rounded number like 14x2 or such... (or am I wrong there?)
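
One common way to damp this kind of noise is to repeat each measurement and keep the median, so a single disturbed sample does not decide the result. A minimal Python sketch, assuming a hypothetical measure() callable that runs one timed test and returns a khash/s reading (this is not how cudaMiner's autotune is known to work):

```python
import statistics
from typing import Callable


def stable_rate(measure: Callable[[], float], repeats: int = 5) -> float:
    """Run the same test several times and return the median reading."""
    samples = [measure() for _ in range(repeats)]
    return statistics.median(samples)


# Demo with the kind of jittery readings described above.
readings = iter([20.0, 18.0, 20.0, 22.0, 18.0])
print(stable_rate(lambda: next(readings)))
```

The cost is a proportionally longer autotune run, so it pairs naturally with testing fewer configurations.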

cbuchner1
April 16, 2013, 11:42:01 PM
 #318

> From 8khash to 28khash. Interesting. Now, wonder what the 8800GTX will do...

That's more like what I would have expected from these cards.

Strangely, when I enable texture caching, the performance measured during autotune is about 10-25% higher than without the cache. But the performance achieved during actual mining is about 30% lower than without the cache. So why does the performance advantage turn into a disadvantage? This discrepancy needs to be understood before I can put out another version. I've even tried to completely randomize the input data during autotune - but no change. I really want to get that measured gain into the actual mining. I hope it's not just an illusion.
cbuchner1
April 16, 2013, 11:43:58 PM
 #319

> Any news on a secondary download source? Dropbox, GitHub, SourceForge?

You can get the source code from GitHub now, but not the binaries. For Linux compilation this should suffice.

FalconFour
April 17, 2013, 12:15:01 AM
 #320

> > From 8khash to 28khash. Interesting. Now, wonder what the 8800GTX will do...
>
> That's more like what I would have expected from these cards.
>
> Strangely, when I enable texture caching, the performance measured during autotune is about 10-25% higher than without the cache. But the performance achieved during actual mining is about 30% lower than without the cache. So why does the performance advantage turn into a disadvantage? This discrepancy needs to be understood before I can put out another version. I've even tried to completely randomize the input data during autotune - but no change. I really want to get that measured gain into the actual mining. I hope it's not just an illusion.

This could be related to the short auto-tune duration problem. Maybe the texture cache helps for a very short time but then slowly deteriorates (on the order of whole seconds, not milliseconds). Maybe try a narrower set of autotune parameters (it's unlikely that a card would ever see any autotune benefit in the sequential range from 20...100 iterations) and run longer tuning for each combination? Basically, not every cell in the autotune matrix needs to be checked, I think. :)
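
One way to avoid checking every cell is a coarse pass followed by a fine pass around the best coarse result, which leaves more time per measurement. A hedged Python sketch of that idea; measure(blocks, warps) is a hypothetical callable returning a khash/s reading, and nothing here reflects cudaMiner's real autotune code:

```python
from typing import Callable, Tuple


def coarse_then_fine(measure: Callable[[int, int], float],
                     max_blocks: int, max_warps: int, step: int = 4) -> Tuple[int, int]:
    """Search a coarse grid first, then refine the block count near the best cell."""
    coarse = [(b, w)
              for b in range(step, max_blocks + 1, step)
              for w in range(1, max_warps + 1)]
    b0, w0 = max(coarse, key=lambda cell: measure(*cell))
    # Fine pass: only block counts within one coarse step of the best cell.
    low, high = max(1, b0 - step + 1), min(max_blocks, b0 + step - 1)
    fine = [(b, w0) for b in range(low, high + 1)]
    return max(fine, key=lambda cell: measure(*cell))


# Synthetic stand-in for a real timing run, peaking at 14 blocks x 2 warps.
def fake_measure(blocks: int, warps: int) -> float:
    return 30.0 - (blocks - 14) ** 2 - (warps - 2) ** 2


print(coarse_then_fine(fake_measure, max_blocks=32, max_warps=4))
```

With 32 x 4 cells this takes 39 measurements instead of 128, so each one could run roughly three times longer for the same total tuning time. It only works if performance varies smoothly between neighbouring cells, which real cards may not guarantee.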

Also getting super-erratic behavior right now after updating drivers on the 8800GTX. It launches and identifies "compute capability 1.0", but it cranks out all zeroes on the autotune then crashes (hard). That's with the latest driver I just installed, 314.22. :/
