DiabloMiner GPU Miner

MacCompiler

Newbie

Offline

Activity: 53
Merit: 0

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 19, 2011, 09:51:51 PM

#481

I’ve packaged DiabloMiner with some helper scripts that will make it easier for new users to start mining on Mac OS. DiabloMiner will work like a normal application that you double-click on to open. Have a look at this thread for details and downloads.

MysteryMiner

Legendary

Offline

Activity: 1596
Merit: 1067

Death to enemies!

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 19, 2011, 10:29:03 PM

#482

I really like DiabloMiner but there are a few problems with it:

1. When I run the new DiabloMiner versions that ship an .exe instead of a .bat, it does not work. I get a black console screen for maybe 25 ms and it exits. Not enough time to even hit the Pause button to see what's wrong. I need to use the .bat file instead.

2. No speed improvement on the new version with BFI_INT. I even get a speed decrease: 260 Mh/s with the 2011-04-23 version and 250 Mh/s with 2011-05-19.

I use ATI HD5850 with Catalyst 10.11 and ATI SDK 2.1

bc1q59y5jp2rrwgxuekc8kjk6s8k2es73uawprre4j

DiabloD3 (OP)

Legendary

Offline

Activity: 1162
Merit: 1000

DiabloMiner author

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 02:26:05 AM

#483

Quote from: MysteryMiner on May 19, 2011, 10:29:03 PM

You run the exe the same way as the bat. The exe does not magically read your mind on what arguments you want to use.

The bat is probably running the old jar, which means, no, you're not running a new version of DiabloMiner.

DiabloMiner

DiabloD3 (OP)

Legendary

Offline

Activity: 1162
Merit: 1000

DiabloMiner author

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 05:38:11 AM

#484

Quote from: ryepdx on May 19, 2011, 09:43:31 PM

Quote from: DiabloD3 on May 19, 2011, 09:22:18 PM

Quote from: BOARBEAR on May 19, 2011, 08:12:07 PM

is there any way I can use the miner without installing java?
can you put it in a wrapper and compile the whole thing including the java?

Java does not work that way.

I call bullshit: http://gcc.gnu.org/java/

I'm aware of gcj, and I do not consider something that cannot run quite a few apps, and is a shitload slower at the ones it can run, an actually valid Java implementation. Oh, and last time I noticed, they didn't do JNI yet, so you can't run my miner with it.

DiabloMiner

ryepdx

Hero Member

Offline

Activity: 714
Merit: 500

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 05:42:05 AM

#485

Quote from: DiabloD3 on May 20, 2011, 05:38:11 AM

Oh, and last time I noticed, they didn't do JNI yet, so you can't run my miner with it.

Ah, okay. Got it.

Jaime Frontero

Full Member

Offline

Activity: 126
Merit: 100

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 06:07:39 AM

#486

two 5870s, CC 11.5, SDK 2.1, on Debian testing.

i don't know yet how much faster it is than your pre-BFI_INT release.

but a lot.

i'm putting in some extra fans and a rheostatic fan speed controller - it's so damn fast that i have to clock it down right now to keep temps under 85.

so going from the old version, max volted at 300 MemClock and 900 GPUClock, to the new version down-volted by almost 0.2, MemClock at 315 and GPUClock at 850; i picked up a bit over 100 Mh/s.

i'll have the new fans and controller in tomorrow. i have another box that i've experimented with fans on - just a single 5870, but i've learned a bit. i'm hoping for a maxed-out setup on the dual box, running at well under 75 degrees. we'll see.

DiabloD3 (OP)

Legendary

Offline

Activity: 1162
Merit: 1000

DiabloMiner author

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 07:02:19 AM

#487

Quote from: Jaime Frontero on May 20, 2011, 06:07:39 AM

At stock 850, 2 5870 should be in the neighborhood of 740 using -v 2 -w 128 on SDK 2.1.

BFI_INT adds around 10%.

DiabloMiner

DustinEwan

Newbie

Offline

Activity: 14
Merit: 0

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 07:05:41 AM

#488

I got the profiler working... that was a lot easier than I thought it would be. I haven't done too much Java outside of Google's DalvikVM, but it's not a true Java implementation so some things are done a little bit differently.

Anyway, I'm running the first batch of samples now

Are you going to be modifying the kernel much? I'm curious as to how phatk reduced the operation count by that amount...

DiabloD3 (OP)

Legendary

Offline

Activity: 1162
Merit: 1000

DiabloMiner author

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 07:10:24 AM

#489

Quote from: DustinEwan on May 20, 2011, 07:05:41 AM

Are you going to be modifying the kernel much? I'm curious as to how phatk reduced the operation count by that amount...

I did a lot of examining of phatk. I can't tell where he thinks he's saving cycles. Not only that, it runs exactly the same on SDK 2.1 and SDK 2.4 on my 5850 vs phoenix's standard kernel. Plus, if he is in fact exploiting anything, it probably isn't exploiting it as much as -v 3 -w 128 on mine on 69xx.

DiabloMiner

DustinEwan

Newbie

Offline

Activity: 14
Merit: 0

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 07:12:59 AM

#490

I completely agree with you... I've looked at the code for both and it's almost line for line exactly the same...

I tried looking for other SHA256 implementations, just in case anybody had come up with something clever besides the norm, but there's nothing out there really... in the CPU world Crypto++ is king and that's pretty much it.

DiabloD3 (OP)

Legendary

Offline

Activity: 1162
Merit: 1000

DiabloMiner author

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 07:40:28 AM

#491

Update: Added all of Dustin's suggestions, and also added a timeout for non-LP connections.

DiabloMiner

Jaime Frontero

Full Member

Offline

Activity: 126
Merit: 100

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 07:54:18 AM

#492

Quote from: DiabloD3 on May 20, 2011, 07:02:19 AM

Quote from: Jaime Frontero on May 20, 2011, 06:07:39 AM

At stock 850, 2 5870 should be in the neighborhood of 740 using -v 2 -w 128 on SDK 2.1.

BFI_INT adds around 10%.

pretty much.

i'm getting 746-748.

i'm hoping that once i get the voltage back up, and the GPUClock at 900 again, i'll be somewhere considerably closer to 800Mh/s.

by the way, Diablo - do you agree with the formula (picked up somewhere on this forum...) that the sweet spot for MemClocks is very close to:

GPUClock/3 + 14

?
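A quick sketch of what that rule of thumb works out to, assuming both clocks are in MHz. The class and method names are made up for illustration, and the formula itself is only the forum folklore quoted above, not something DiabloMiner computes:

```java
// Hypothetical helper; not part of DiabloMiner.
final class MemClockRule {
    /** Forum rule of thumb: MemClock is roughly GPUClock / 3 + 14 (both in MHz). */
    static double suggestedMemClock(double gpuClockMhz) {
        return gpuClockMhz / 3.0 + 14.0;
    }

    public static void main(String[] args) {
        // The two GPU clocks mentioned in this thread.
        for (int gpu : new int[] {850, 900}) {
            System.out.printf("GPUClock %d MHz -> MemClock ~%.0f MHz%n",
                    gpu, suggestedMemClock(gpu));
        }
    }
}
```

For a 900 MHz GPUClock this gives 314 MHz, which is in the same range as the 300 and 315 MemClock settings mentioned earlier in the thread.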

jedi95

Full Member

Offline

Activity: 219
Merit: 120

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 08:07:09 AM

#493

Quote from: DiabloD3 on May 20, 2011, 07:10:24 AM

Quote from: DustinEwan on May 20, 2011, 07:05:41 AM

Are you going to be modifying the kernel much? I'm curious as to how phatk reduced the operation count by that amount...

The key difference is not in the total number of instructions executed, but that they make better use of the 5-wide ALU design. Have a look at the ASM generated with AMD's KernelAnalyzer. Particularly the number of ALU ops. It's no faster than the poclbm kernel on 2.1, but for most people it eliminates the speed disadvantage of SDK 2.4.

It's also designed with VLIW5 in mind, so it's obviously not going to be optimal on VLIW4 hardware.

Phoenix Miner developer

Donations appreciated at:
1PHoenix9j9J3M6v3VQYWeXrHPPjf7y3rU

DiabloD3 (OP)

Legendary

Offline

Activity: 1162
Merit: 1000

DiabloMiner author

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 11:28:11 AM

#494

Quote from: jedi95 on May 20, 2011, 08:07:09 AM

Quote from: DiabloD3 on May 20, 2011, 07:10:24 AM

Quote from: DustinEwan on May 20, 2011, 07:05:41 AM

Are you going to be modifying the kernel much? I'm curious as to how phatk reduced the operation count by that amount...

Well the big problem is on 2.4 phoenix-poclbm and phatk give near identical results... and both are still slower than real poclbm on both 2.1 and 2.4. And -v 18 and 19 give interesting results on 58xx on 2.4 which beats phatk's lackluster speed.

So... ymm so fucking v.

DiabloMiner

DustinEwan

Newbie

Offline

Activity: 14
Merit: 0

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 11:34:19 AM

#495

From my first run of profiling the miner, I saw that you were spending about 2% of CPU time just building strings (mainly StringBuilder copying char arrays internally). The + operator is compiled into StringBuilder calls, which can be pretty slow. I ran into this in my game engine here at work and had come across this post at StackOverflow from a guy who implements his own (albeit primitive) class for string concatenation.

I forgot to save the profile for that one (the profiler automatically overwrites the output file every time and I'm lazy :P), but I reduced the CPU time spent on string building from 2% to <= 0.01%.

It's not much, but hey, it was easy and I knew how to do it :D
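For readers who want to see the general idea, here is a minimal sketch of cutting per-message allocation by reusing one StringBuilder instead of concatenating with +. It is not the patch from this post, nor the StackOverflow class it mentions; the class name, method, and message layout are assumptions for illustration only:

```java
// Illustrative only; names and message format are hypothetical.
final class StatusLine {
    // One builder reused across calls, so no new builder is allocated per message.
    private final StringBuilder sb = new StringBuilder(64);

    String format(long hashesPerSec, int accepted, int rejected) {
        sb.setLength(0); // drop the previous contents but keep the backing array
        sb.append(hashesPerSec / 1000).append(" khash/sec | accepted ")
          .append(accepted).append(" | rejected ").append(rejected);
        return sb.toString(); // still allocates the final String
    }

    public static void main(String[] args) {
        StatusLine line = new StatusLine();
        System.out.println(line.format(282_000_000L, 12, 0));
    }
}
```

A reused builder like this is not thread-safe, so each thread would need its own instance.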

Anyway, here is the latest trace I ran (a lot is left out; this is just the top 90% of CPU time).

Code:

CPU TIME (ms) BEGIN (total = 2239712) Fri May 20 19:40:42 2011
rank   self  accum   count trace method
   1 17.47% 17.47%      21 306858 java.lang.Object.wait
   2 17.46% 34.94%     828 306869 java.lang.ref.ReferenceQueue.remove
   3 16.52% 51.46%      16 319564 sun.net.www.http.KeepAliveCache.run
   4 15.74% 67.20% 7513448 319281 java.nio.DirectByteBuffer.getInt
   5  4.05% 71.25%     210 318093 java.net.SocketInputStream.read
   6  2.81% 74.05%   29347 319369 org.lwjgl.opencl.CL10.clEnqueueReadBuffer
   7  2.70% 76.75% 7513448 319278 java.nio.Buffer.checkIndex
   8  2.69% 79.44% 7513448 319279 java.nio.DirectByteBuffer.ix
   9  2.64% 82.08% 7513448 319280 java.nio.DirectByteBuffer.getInt
  10  1.80% 83.88%  675014 319312 org.lwjgl.opencl.CL10.clSetKernelArg
  11  1.36% 85.24%  675014 319313 org.lwjgl.opencl.InfoUtilFactory$CLKernelUtil.setArg
  12  1.01% 86.25%  675015 319298 java.lang.ThreadLocal.get
  13  1.00% 87.26%  675016 311203 java.lang.ThreadLocal$ThreadLocalMap.getEntry
  14  0.98% 88.24%  675015 319302 java.nio.DirectIntBufferU.put
  15  0.68% 88.92%   29348 319351 org.lwjgl.opencl.CL10.clEnqueueNDRangeKernel
  16  0.63% 89.55%  675015 319307 org.lwjgl.PointerWrapperAbstract.getPointer
  17  0.63% 90.18%  675012 319315 java.lang.ThreadLocal$ThreadLocalMap.access$000
  18  0.62% 90.80%  675015 319305 org.lwjgl.BufferChecks.checkBufferSize

Now I've started looking at some of the bigger stuff. The first 2 lines are from the garbage collector, so you can see that ~35% of the CPU time was spent on just garbage collecting, 17% of which was spent just blocking all the execution threads in order to do so. So I'm trying to figure out ways to improve that.

I don't really think that the netcode can be much faster, but another ~20% of cpu time is spent on that. So if the netcode can be improved, that will get us back into the kernel faster. The third line there is the thread that is used for keeping the HTTP 1.1 session alive. I don't know much about that, but maybe it's a lead.

Anyway, I'm done for now. Here is the new DiabloMiner.java with the new string builder.

Also:

Quote from: DiabloD3 on May 20, 2011, 11:28:11 AM

So... ymm so fucking v.

I totally agree with that, but I love your code and bitcoin is fascinating. So digging through this code is a great joy for me! Great work so far man, and in Java too! ;D

DiabloD3 (OP)

Legendary

Offline

Activity: 1162
Merit: 1000

DiabloMiner author

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 11:41:33 AM

#496

Quote from: DustinEwan on May 20, 2011, 11:34:19 AM

(...) but I reduced the CPU time spent on String building from 2% to <= .01%

It's not much, but hey, it was easy and I knew how to do it :D

Anyway, here is the latest trace I ran. (a lot is left out, this is just the top 90% of cpu time)

Code:

CPU TIME (ms) BEGIN (total = 2239712) Fri May 20 19:40:42 2011
rank   self  accum   count trace method
   1 17.47% 17.47%      21 306858 java.lang.Object.wait
   2 17.46% 34.94%     828 306869 java.lang.ref.ReferenceQueue.remove
   3 16.52% 51.46%      16 319564 sun.net.www.http.KeepAliveCache.run
   4 15.74% 67.20% 7513448 319281 java.nio.DirectByteBuffer.getInt
   5  4.05% 71.25%     210 318093 java.net.SocketInputStream.read
   6  2.81% 74.05%   29347 319369 org.lwjgl.opencl.CL10.clEnqueueReadBuffer
   7  2.70% 76.75% 7513448 319278 java.nio.Buffer.checkIndex
   8  2.69% 79.44% 7513448 319279 java.nio.DirectByteBuffer.ix
   9  2.64% 82.08% 7513448 319280 java.nio.DirectByteBuffer.getInt
  10  1.80% 83.88%  675014 319312 org.lwjgl.opencl.CL10.clSetKernelArg
  11  1.36% 85.24%  675014 319313 org.lwjgl.opencl.InfoUtilFactory$CLKernelUtil.setArg
  12  1.01% 86.25%  675015 319298 java.lang.ThreadLocal.get
  13  1.00% 87.26%  675016 311203 java.lang.ThreadLocal$ThreadLocalMap.getEntry
  14  0.98% 88.24%  675015 319302 java.nio.DirectIntBufferU.put
  15  0.68% 88.92%   29348 319351 org.lwjgl.opencl.CL10.clEnqueueNDRangeKernel
  16  0.63% 89.55%  675015 319307 org.lwjgl.PointerWrapperAbstract.getPointer
  17  0.63% 90.18%  675012 319315 java.lang.ThreadLocal$ThreadLocalMap.access$000
  18  0.62% 90.80%  675015 319305 org.lwjgl.BufferChecks.checkBufferSize

Quote from: DiabloD3 on May 20, 2011, 11:28:11 AM

So... ymm so fucking v.

I totally agree with that, but I love your code and bitcoin is fascinating. So digging through this code is a great joy for me! Great work so far man, and in Java too! ;D

You need to get in the habit of using GitHub to push merge requests.

Also, the thread pool is very important because it can keep HTTP connections open between getwork/sendwork calls (cutting down on network round trips). Further, I spawn 3 threads per GPU to cut down on the actual hit of blocking due to HTTP, and then on top of that, LP cuts down on needing to keep fetching work every 5 seconds (with LP it only fetches when the long poll returns asynchronously, or when nonce saturation occurs).
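As a rough sketch of the three-threads-per-GPU idea (this is not DiabloMiner's actual code; the class name, the GPU count, and the placeholder task are all assumptions), a fixed-size pool could be sized like this:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of sizing a worker pool per GPU.
final class WorkerPoolSketch {
    static final int THREADS_PER_GPU = 3; // figure taken from the post above

    /** Builds a fixed-size pool with THREADS_PER_GPU workers for each GPU. */
    static ThreadPoolExecutor newPool(int gpuCount) {
        int n = gpuCount * THREADS_PER_GPU;
        return new ThreadPoolExecutor(n, n, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newPool(2); // assume two GPUs
        for (int i = 0; i < pool.getCorePoolSize(); i++) {
            final int id = i;
            // Placeholder task: a real worker would block on an HTTP request here,
            // while the other workers keep the GPU supplied with work.
            pool.execute(() -> System.out.println("worker " + id + " would fetch work here"));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With several workers per GPU, one thread stalled on the network does not have to leave the card idle, which is the point made above.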

As for blocking on garbage collection, switching to Java 7 altogether would do a lot to improve that.

DiabloMiner

DiabloD3 (OP)

Legendary

Offline

Activity: 1162
Merit: 1000

DiabloMiner author

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 11:57:46 AM

#497

BTW, I am not going to accept a patch containing a custom concat setup. This is not C.

DiabloMiner

MysteryMiner

Legendary

Offline

Activity: 1596
Merit: 1067

Death to enemies!

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 12:25:04 PM

#498

I got the new version working! What I did:

1. I run the DiabloMiner-Windows.exe from command prompt with all arguments needed such as -u and -p

2. I need to manually specify the -v 2 argument to use vectors. Without vectors I get 248 Mh/s; with -v 2 I finally got 282 Mh/s instead of the former 260 Mh/s. The BFI_INT is a huge improvement.

3. I created .BAT file myself to run DiabloMiner-Windows.exe with all necessary arguments.

Quote

The bat is probably running the old jar, which means, no, you're not running a new version of DiabloMiner.

No, I'm not so stupid. I have known how to use and edit bat files since the MS-DOS 5.0 days. I check their contents before I run them.

And Thank You DiabloD3! If I ever find coins with Your miner, I will send you some of them!

bc1q59y5jp2rrwgxuekc8kjk6s8k2es73uawprre4j

OtaconEmmerich

Full Member

Offline

Activity: 235
Merit: 100

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 06:22:51 PM

#499

Quote from: DiabloD3 on May 20, 2011, 11:28:11 AM

Quote from: jedi95 on May 20, 2011, 08:07:09 AM

Quote from: DiabloD3 on May 20, 2011, 07:10:24 AM

Quote from: DustinEwan on May 20, 2011, 07:05:41 AM

Are you going to be modifying the kernel much? I'm curious as to how phatk reduced the operation count by that amount...

I've yet to replicate the same results on my system. In fact, with 2.4 phatk has beaten your miner every time. Every time I've tried anything other than -v 2 I get slower speeds.
This is on a Sapphire Extreme 5850 on Windows 7 x64.

toasty

Member

Offline

Activity: 90
Merit: 12

Re: Official DiabloMiner GPU Miner Thread (now with Long Poll and BFI_INT support)

May 20, 2011, 06:25:03 PM

#500

If this is just totally unsupported, feel free to smack me. I'm running on a MacPro with both a 5870 and a 5770 in it, which seems perfectly okay doing normal OS things, including games.

If I try running DiabloMiner without any special flags, I get:

[5/20/11 1:17:33 PM] Added ATI Radeon HD 5870 (#1) (10 CU, local work size of 256)
[5/20/11 1:17:34 PM] Added ATI Radeon HD 5870 (#2) (20 CU, local work size of 256)

which doesn't seem right. I'm guessing the 5770 is #1.

With no special flags at all, I'm getting roughly 125M/sec. If I use -D 1 to make it only attach to the first card, it only drops to roughly 100M/sec which leads me to believe something very inefficient is going on.

I've tried various combos of -f, -v and -w and don't seem to be able to do anything but make it worse.

Is this configuration just not going to work at all? Is there any way I can force it to only use the 5870 instead?
