Bitcoin Forum
Author Topic: DiabloMiner GPU Miner  (Read 671640 times)
DiabloD3 (Legendary, Activity: 1162, DiabloMiner author)
August 24, 2011, 11:20:53 PM  #981

As I just said to iopq, some kernels require more than others. Try 1/3rd, it will probably bring back your missing hashes.

Diablo,

with core/memory clocks of 860/287 and 980/327 and -v 18 I now get 795-797 MH/s, so it is 3-4 MH/s faster than the previous version.

Thanks a lot.

spiccioli

btw, what makes a kernel depend upon memory speed?

I thought there was no (or very, very little) video memory use in hashing the bitcoin chain, and up until now I always lowered the memory clock as much as I could to reduce energy consumption.
Well, it's not such a bad idea to keep the algorithm in memory to unroll back to the GPUs once the unroll is used up. It should be faster than referring to system memory. Of course, if it could be unrolled straight from the GPU back to the GPU once the unrolls nearly reach their end (one unroll away), it would be a lot better. But that would involve holding the entire code in a register or so and somehow converting it, which doesn't seem all that possible.

The program (the kernel) is kept loaded in graphics memory, but the compute units dump it when they switch to something else (EVERYTHING is a program, even rendering boring 2D desktop shit).

Radeons have multiple levels of graphics memory; the memory clock only controls the actual GDDR5 RAM chips (i.e., the "lowest" level as far as OpenCL is concerned). Kernel arguments and constants are stored in constant RAM (which, for all intents and purposes, is as fast as registers), and then there's scratch RAM belonging to the CU, which can be used to backfill register overflow (that isn't controlled by the memory clock, but seems to synchronize timings in some way). There are also multiple levels of caches for the CU and the texture processing units.
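As a concrete illustration of those address spaces, here is a minimal OpenCL C sketch (not DiabloMiner's actual kernel; the names are made up) of why a mining kernel barely touches the GDDR5 that the memory clock controls:

Code:
/* Minimal sketch, not DiabloMiner's real kernel. The kernel arguments live
 * in constant RAM, the working state lives in registers (or CU scratch RAM
 * if it spills), and global GDDR5 memory is only written on the rare
 * occasion a candidate share is found. */
__kernel void search(
    __constant uint *midstate,   /* kernel arguments/constants: constant RAM */
    __global   uint *output)     /* GDDR5: written only when a share is found */
{
    uint nonce = get_global_id(0);   /* private: lives in registers */
    uint state[8];                   /* registers, or scratch RAM on overflow */

    for (int i = 0; i < 8; ++i)
        state[i] = midstate[i];

    /* ... the SHA-256 rounds over midstate + nonce would go here ... */

    if (state[7] == 0u)              /* candidate share: the only global write */
        output[nonce & 0xffu] = nonce;
}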

DiabloD3 (Legendary, Activity: 1162, DiabloMiner author)
August 27, 2011, 05:09:59 AM  #982

Update: Make kernel arrays an option, default to off, use -a to turn on.

This should help users who saw a speed decrease after the phatk-like arrays were introduced, such as OS X, Nvidia, and SDK 2.1 users.

iopq (Hero Member, Activity: 560)
September 02, 2011, 03:06:41 PM  #983

Phateus posted this graph:

Does that look like 316 is the fastest? No, I'm pretty sure 410 is faster (vectors 2, worksize 256), right after the dip in speeds.

And, according to the graph, it obviously doesn't matter that much whether you're running 300-ish or 400-ish clocks.

DiabloD3 (Legendary, Activity: 1162, DiabloMiner author)
September 02, 2011, 04:50:19 PM  #984

Phateus posted this graph:

Does that look like 316 is the fastest? No, I'm pretty sure 410 is faster (vectors 2, worksize 256), right after the dip in speeds.

And, according to the graph, it obviously doesn't matter that much whether you're running 300-ish or 400-ish clocks.

Huh, I wonder what he's using for vectors; I assume he means uint4 = V4, etc. That graph is very interesting: it highlights the register spillover problem in the phatk design quite nicely.

I also wonder what card that is.
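For readers wondering what the vector width being compared here actually changes, a hedged sketch follows (illustrative only, not phatk or DiabloMiner source): with uint4, each work-item carries four nonces and roughly four times the live register state, which is where the spillover comes from.

Code:
/* Illustrative only -- not phatk or DiabloMiner source. With uint4 ("v4"),
 * each work-item tests four nonces per pass, so the compiler must keep
 * roughly four times as much live state in registers; push that too far and
 * the excess spills into the CU's scratch RAM. */
__kernel void search_v4(__constant uint *midstate, __global uint *output)
{
    uint base = (uint)get_global_id(0) * 4u;
    uint4 nonce = (uint4)(base, base + 1u, base + 2u, base + 3u);
    uint4 h7 = (uint4)(0u);          /* last hash word for each of the four lanes */

    /* ... four SHA-256 searches carried in the vector components ... */

    if (h7.x == 0u) output[0] = nonce.x;   /* check each lane for a share */
    if (h7.y == 0u) output[1] = nonce.y;
    if (h7.z == 0u) output[2] = nonce.z;
    if (h7.w == 0u) output[3] = nonce.w;
}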

iopq (Hero Member, Activity: 560)
September 04, 2011, 12:20:43 PM  #985

Phateus posted this graph:

Does that look like 316 is the fastest? No, I'm pretty sure 410 is faster (vectors 2, worksize 256), right after the dip in speeds.

And, according to the graph, it obviously doesn't matter that much whether you're running 300-ish or 400-ish clocks.

Huh, I wonder what he's using for vectors; I assume he means uint4 = V4, etc. That graph is very interesting: it highlights the register spillover problem in the phatk design quite nicely.

I also wonder what card that is.

5870 overclocked, and v4 is indeed uint4

DiabloD3 (Legendary, Activity: 1162, DiabloMiner author)
September 04, 2011, 12:40:53 PM  #986

Phateus posted this graph:

Does that look like 316 is the fastest? No, I'm pretty sure 410 is faster (vectors 2, worksize 256), right after the dip in speeds.

And, according to the graph, it obviously doesn't matter that much whether you're running 300-ish or 400-ish clocks.

Huh, I wonder what he's using for vectors; I assume he means uint4 = V4, etc. That graph is very interesting: it highlights the register spillover problem in the phatk design quite nicely.

I also wonder what card that is.

5870 overclocked, and v4 is indeed uint4

Those numbers might not be entirely valid then. (Some?) 1200 MHz cards do not seem to have the same timings as 1000 MHz cards, so 1/4th might work better. On my 5850 the peak seems to be around 1/3rd instead, and on some 5870s, from what I've heard, it's still 1/3rd.

TheMalon (Member, Activity: 70)
September 08, 2011, 09:27:17 AM  #987

Hi Diablo,
I have a Radeon 6670 running on Win7 64-bit, and after I upgraded Catalyst to the latest version (11.8, from 11.6) my reported hardware errors are between 20% and 25% and CPU usage is 40% to 50% (before, it was under 10%).

Any idea what I should do or where I should look for some info?

Thanks.
Edit: forgot to mention that I use the default configuration (launched the .exe with only the -o, -r, -u, and -p options).
DiabloD3 (Legendary, Activity: 1162, DiabloMiner author)
September 08, 2011, 04:44:05 PM  #988

Hi Diablo,
I have a Radeon 6670 running on Win7 64-bit, and after I upgraded Catalyst to the latest version (11.8, from 11.6) my reported hardware errors are between 20% and 25% and CPU usage is 40% to 50% (before, it was under 10%).

Any idea what I should do or where I should look for some info?

Thanks.
Edit: forgot to mention that I use the default configuration (launched the .exe with only the -o, -r, -u, and -p options).

Try adding -v 2 to see if it decreases HW errors.

Also, newer versions of Catalyst have a CPU use bug that affects all OpenCL apps. It cannot be fixed from within the app.

DiabloD3 (Legendary, Activity: 1162, DiabloMiner author)
September 11, 2011, 07:39:10 AM  #989

Update: Cut the network failure sleep in half; move execution threads from 2 to 3 to increase performance until AMD fixes the CPU usage bug.

TheMalon (Member, Activity: 70)
September 12, 2011, 09:08:21 AM  #990

Hi Diablo,
I have a Radeon 6670 running on Win7 64-bit, and after I upgraded Catalyst to the latest version (11.8, from 11.6) my reported hardware errors are between 20% and 25% and CPU usage is 40% to 50% (before, it was under 10%).

Any idea what I should do or where I should look for some info?

Thanks.
Edit: forgot to mention that I use the default configuration (launched the .exe with only the -o, -r, -u, and -p options).

Try adding -v 2 to see if it decreases HW errors.

Also, newer versions of Catalyst have a CPU use bug that affects all OpenCL apps. It cannot be fixed from within the app.
Thanks, Diablo.
The -v 2 option reduced the HW errors from 20-25% to 1-2% and increased the MH/s by 12%!
iopq (Hero Member, Activity: 560)
September 12, 2011, 09:20:07 AM  #991

Phateus posted this graph:

Does that look like 316 is the fastest? No, I'm pretty sure 410 is faster (vectors 2, worksize 256), right after the dip in speeds.

And, according to the graph, it obviously doesn't matter that much whether you're running 300-ish or 400-ish clocks.

Huh, I wonder what he's using for vectors; I assume he means uint4 = V4, etc. That graph is very interesting: it highlights the register spillover problem in the phatk design quite nicely.

I also wonder what card that is.

5870 overclocked, and v4 is indeed uint4

Those numbers might not be entirely valid then. (Some?) 1200 MHz cards do not seem to have the same timings as 1000 MHz cards, so 1/4th might work better. On my 5850 the peak seems to be around 1/3rd instead, and on some 5870s, from what I've heard, it's still 1/3rd.
On MY 5850, 275 is faster than 250 at a 725 core clock, so the peak is higher than 1/3.
I had to RMA it due to artifacts, so when I get a new one I'll test again and see if my new card differs.

DiabloD3 (Legendary, Activity: 1162, DiabloMiner author)
September 12, 2011, 12:13:52 PM  #992

Please note: Eligius is intentionally disabling rollntime for DiabloMiner users and tripling reject rates in the process. Use a different pool, such as btcguild, which maintains a reject rate below 0.5%.
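For readers unfamiliar with the term, a hedged sketch of what rollntime refers to (plain C, field layout per the Bitcoin block header; not DiabloMiner's or Eligius's actual code): when a pool advertises X-Roll-NTime over getwork, the miner may bump the header's 32-bit timestamp to manufacture fresh work locally instead of polling the pool again, and a pool that then rejects the rolled timestamps turns that saved traffic into rejected shares.

Code:
/* Hedged sketch of the rollntime getwork extension; not DiabloMiner's or
 * Eligius's actual code. */
#include <stdint.h>

struct block_header {           /* 80 bytes, as in the Bitcoin protocol */
    uint32_t version;
    uint8_t  prev_block[32];
    uint8_t  merkle_root[32];
    uint32_t ntime;             /* Unix timestamp: the field being "rolled" */
    uint32_t nbits;
    uint32_t nonce;
};

/* Derive the next unit of local work by bumping the timestamp. */
void roll_ntime(struct block_header *hdr)
{
    hdr->ntime += 1;
}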

Druas (Member, Activity: 78)
September 13, 2011, 07:37:51 AM  #993

And for future note, I'm going to treat all future bugs like this: If you're not using Eligius, it is not my problem.
Wait, so does this still apply?
DiabloD3 (Legendary, Activity: 1162, DiabloMiner author)
September 13, 2011, 08:07:11 AM  #994

And for future note, I'm going to treat all future bugs like this: If you're not using Eligius, it is not my problem.
Wait, so does this still apply?

That was never really true. I test on many of the large pools. But if I can't reproduce it, it's not a bug.

Druas (Member, Activity: 78)
September 13, 2011, 09:39:00 AM  #995

That was never really true. I test on many of the large pools. But if I can't reproduce it, it's not a bug.
Ah, well I could agree with that.
iopq (Hero Member, Activity: 560)
September 17, 2011, 05:36:15 AM  #996

No, it isn't a guideline. 1/3rd of the core clock for the memory clock sits in a zone where, on most Radeon 5xxx cards, it hits the stock memory timings correctly and incurs no speed loss for applications that don't rely on memory bandwidth.

If you're too low or too high, you incur a speed loss, or sometimes the card just locks up.

Some kernels require better compliance with this than others.

Except it is a guideline, because my 5750 is not stable with memory at 233 MHz.
My 5850 card is faster at slightly more than 1/3: its core clock is 725, and 275 is faster than both 242 and 300.

You can blame the kernel, but phatk 2.2 is the fastest kernel on both cards, and those timings are the fastest timings in practice.
Update: got a new card.
At a 725 core clock, a 275 memory clock is faster than 250, and 240 gives artifacts.

This is higher than 1/3; between 270 and 280 gives the best results.

DiabloD3 (Legendary, Activity: 1162, DiabloMiner author)
September 17, 2011, 06:12:04 AM  #997

No, it isn't a guideline. 1/3rd of the core clock for the memory clock sits in a zone where, on most Radeon 5xxx cards, it hits the stock memory timings correctly and incurs no speed loss for applications that don't rely on memory bandwidth.

If you're too low or too high, you incur a speed loss, or sometimes the card just locks up.

Some kernels require better compliance with this than others.

Except it is a guideline, because my 5750 is not stable with memory at 233 MHz.
My 5850 card is faster at slightly more than 1/3: its core clock is 725, and 275 is faster than both 242 and 300.

You can blame the kernel, but phatk 2.2 is the fastest kernel on both cards, and those timings are the fastest timings in practice.
Update: got a new card.
At a 725 core clock, a 275 memory clock is faster than 250, and 240 gives artifacts.

This is higher than 1/3; between 270 and 280 gives the best results.

Hrm, if the timing is off (i.e., set for 1200 instead of 1000), the best should be closer to 290.
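To put rough numbers on the ratios being debated, a back-of-the-envelope sketch in plain C (the figures come from the posts above, not from AMD documentation):

Code:
#include <stdio.h>

/* Illustrative arithmetic only, not a tuning tool: the 1/3 and 1/4 rules of
 * thumb applied to the 725 MHz core clock reported above, next to the
 * ~270-290 MHz sweet spot the posters actually measured. */
int main(void)
{
    int core = 725;   /* the 5850 core clock reported above */

    printf("core %d MHz -> 1/3 rule: %d MHz, 1/4 rule: %d MHz\n",
           core, core / 3, core / 4);
    printf("measured sweet spot reported above: ~270-290 MHz\n");
    return 0;
}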

DiabloD3 (Legendary, Activity: 1162, DiabloMiner author)
October 04, 2011, 03:28:38 AM  #998

Update: Try to drive the reject average further below 0.25%

TheMalon (Member, Activity: 70)
October 04, 2011, 08:58:54 AM  #999

Hi Diablo,
I have a Radeon 6670 running on Win7 64-bit, and after I upgraded Catalyst to the latest version (11.8, from 11.6) my reported hardware errors are between 20% and 25% and CPU usage is 40% to 50% (before, it was under 10%).

Any idea what I should do or where I should look for some info?

Thanks.
Edit: forgot to mention that I use the default configuration (launched the .exe with only the -o, -r, -u, and -p options).

Try adding -v 2 to see if it decreases HW errors.

Also, newer versions of Catalyst have a CPU use bug that affects all OpenCL apps. It cannot be fixed from within the app.
Thanks, Diablo.
The -v 2 option reduced the HW errors from 20-25% to 1-2% and increased the MH/s by 12%!

With the 11.9 driver I started to get 10% HW errors, but the high CPU usage disappeared.
I added -w 128 to the previous configuration and now it all works perfectly :) (HW errors <0.5%, another 4% MH/s gained, and CPU usage is 1%).
Thanks

DiabloD3 (Legendary, Activity: 1162, DiabloMiner author)
October 04, 2011, 04:14:00 PM  #1000

Hi Diablo,
I have a Radeon 6670 running on Win7 64-bit, and after I upgraded Catalyst to the latest version (11.8, from 11.6) my reported hardware errors are between 20% and 25% and CPU usage is 40% to 50% (before, it was under 10%).

Any idea what I should do or where I should look for some info?

Thanks.
Edit: forgot to mention that I use the default configuration (launched the .exe with only the -o, -r, -u, and -p options).

Try adding -v 2 to see if it decreases HW errors.

Also, newer versions of Catalyst have a CPU use bug that affects all OpenCL apps. It cannot be fixed from within the app.
Thanks, Diablo.
The -v 2 option reduced the HW errors from 20-25% to 1-2% and increased the MH/s by 12%!

With the 11.9 driver I started to get 10% HW errors, but the high CPU usage disappeared.
I added -w 128 to the previous configuration and now it all works perfectly :) (HW errors <0.5%, another 4% MH/s gained, and CPU usage is 1%).
Thanks



Huh, 10% you say? I wonder if that could be a driver bug.
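For anyone wondering what the -w flag that fixed this controls: assuming it maps to the OpenCL local work-group size (a host-side sketch in plain C against the standard OpenCL API, not DiabloMiner's actual Java code), this is roughly the knob it turns:

Code:
/* Hedged host-side sketch; assumes -w sets the local work-group size passed
 * to clEnqueueNDRangeKernel, which the posts above found worked best at 128
 * on the 6670. */
#include <CL/cl.h>

cl_int enqueue_search(cl_command_queue queue, cl_kernel kernel,
                      size_t global_items, size_t worksize /* e.g. 128 */)
{
    /* The global size must be a multiple of the local size. */
    size_t global = (global_items / worksize) * worksize;

    return clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
                                  &global, &worksize, 0, NULL, NULL);
}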
