Modified Kernel for Phoenix 1.5


ssateneth

Legendary

Offline

Activity: 1344
Merit: 1004

Re: Modified Kernel for Phoenix 1.5

August 10, 2011, 07:52:14 PM
Last edit: August 11, 2011, 04:32:23 AM by ssateneth

#241

Quote from: Phateus on August 10, 2011, 04:21:30 PM

Quote from: BOARBEAR on August 10, 2011, 09:40:16 AM

Quote from: Phateus on August 09, 2011, 10:43:16 PM

Quote from: metacontent on August 09, 2011, 09:16:34 PM

Why not make two separate kernels then?

VECTORS4 might one day be the better alternative, instead of doing all that work then why not start now and keep pace?

Because I have literally put in over 100 hours on the main kernel and have gotten almost nothing in donations. I just don't have the time to keep up with two kernels. If anyone feels like making a VECTORS4 branch, go for it... the source code is in the public domain and you can use it however you'd like. ;)

Also, from what I've gathered, there may be only 1 or 2 people interested in it... If you can lower your memory speed, I think VECTORS will always be faster than VECTORS4.

Now, I do like hearing feedback from everyone. I am just letting you know that it is not feasible to optimize the kernel for every possible configuration (SDK 2.1, 2.4, slow memory). Right now, the kernel is optimized for SDK 2.5 and the 68xx and 5xxx cards, assuming you pick the best memory clock speed for your card (somewhere around 1/3 of your core clock).

-Phateus

The thing is, VECTORS4 worked perfectly for me in version 2.1.
In version 2.2 it's broken.

As in it doesn't work at all, or that it is much slower? Just use version 2.1 then.

The behavior is as if it's not doing 4 nonces, but only doing 1 (i.e. no VECTORS option specified). My compute speed remained the same regardless of memory speed, which is exactly like your V1 result on the graph on page 1.


critical

Full Member

Offline

Activity: 160
Merit: 100

Re: Modified Kernel for Phoenix 1.5

August 11, 2011, 09:49:54 AM

#242

In GUIMiner, I keep getting an invalid buffer / unable to write to file error. I wonder why.

Diapolo

Hero Member

Offline

Activity: 769
Merit: 500

Re: Modified Kernel for Phoenix 1.5

August 11, 2011, 11:09:00 AM

#243

Quote from: znort987 on August 11, 2011, 10:21:34 AM

Just did a test:

Rig setup:
  Linuxcoin v0.2b (Linux version 2.6.38-2-amd64)
  Dual HD5970 (4 GPU cores in the rig)
  Mem clock @ 300Mhz
  Core clock @ 800Mhz
  VCore @ 1.125v
  AMD SDK 2.5
  Phoenix r100
  Phatk v2.2
  -v -k phatk BFI_INT VECTORS WORKSIZE=256 AGGRESSION=11 FASTLOOP=false

Result:
  Overall Rig rate: 1484 MH/s
  Rate per core: 371 MH/s

This is ~4MH/s faster than Diapolo's latest.

On 5970, phatk 2.2 is current king of the hill.

For the world to be perfect, this kernel needs to be integrated into cgminer

The last kernel releases show that it takes a bit of trial and error to find THE perfect kernel for a specific setup. Phateus and I use the KernelAnalyzer and our own setups as a first measurement of whether a new kernel got "faster". But many other factors come into play, like OS, driver, SDK, miner software and so on.

I would suggest that we try to create kernels based on the same kernel parameters for phatk and phatk-Diapolo, so that users are free to choose which kernel is used. One difference is that the CGMINER kernel uses the switch VECTORS2, where Phoenix used only VECTORS (which I changed to VECTORS2 in my last kernel releases). The kernels don't even need to use the same variable names (in fact they sometimes differ), as long as the main miner software passes the expected values to the kernel in a defined sequence.
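
A rough sketch of what such a shared convention could look like on the miner side. Everything here is hypothetical: `OPTION_ALIASES`, `ARG_ORDER`, `normalize_options` and `pack_kernel_args` are illustrative names, and the argument list is made up rather than taken from phatk, phatk-Diapolo or cgminer.

```python
# Hypothetical sketch: one miner-side convention shared by several kernels.
# None of these names come from Phoenix, cgminer, or phatk.

# Map each miner's spelling of an option onto one canonical switch.
OPTION_ALIASES = {"VECTORS": "VECTORS2"}

# The agreed order in which the miner hands values to the kernel (made up).
ARG_ORDER = ("state", "midstate", "base_nonce", "target")


def normalize_options(options):
    """Translate option aliases so every kernel sees the same switches."""
    return [OPTION_ALIASES.get(opt, opt) for opt in options]


def pack_kernel_args(values):
    """Return values in the agreed positional order.

    Kernel-side variable names are irrelevant; only the position matters.
    """
    missing = [name for name in ARG_ORDER if name not in values]
    if missing:
        raise KeyError("missing kernel arguments: %s" % ", ".join(missing))
    return [values[name] for name in ARG_ORDER]
```

With a convention like this, two kernels could be swapped freely as long as both read their arguments in the same positions.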

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo

MegaBux

Newbie

Offline

Activity: 31
Merit: 0

Re: Modified Kernel for Phoenix 1.5

August 11, 2011, 03:26:33 PM
Last edit: August 11, 2011, 05:19:23 PM by MegaBux

#244

Quote from: Phateus on May 11, 2011, 05:05:55 PM

As of version 2.1, phatk now has command line option "VECTORS4" which can be used instead of "VECTORS".
This option works on 4 nonces per thread instead of 2 and may increase speed mainly if you do not underclock your memory, but feel free to try it out. Note that if you use this, you will more than likely have to decrease your WORKSIZE to 128 or 64.
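
To illustrate what "4 nonces per thread instead of 2" means for the work split, here is a minimal sketch. It assumes each thread covers a run of consecutive nonces; that layout and the names `nonces_for_thread` and `threads_needed` are assumptions for illustration, and phatk's actual nonce layout may differ.

```python
def nonces_for_thread(base, thread_id, width):
    """Nonces covered by one thread when each thread handles `width` at once.

    Assumed layout: thread 0 gets base..base+width-1, thread 1 the next run,
    and so on, wrapping at 32 bits.
    """
    first = (base + thread_id * width) & 0xFFFFFFFF
    return [(first + lane) & 0xFFFFFFFF for lane in range(width)]


def threads_needed(total_nonces, width):
    """Threads required to cover total_nonces (rounded up)."""
    return -(-total_nonces // width)
```

Doubling the vector width halves the number of threads for the same nonce range, but each thread then needs roughly twice the registers, which is consistent with the advice to lower WORKSIZE.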

I'm using a 6770 @ 1.01 GHz with phatk 2.2. When I run the memory clock at 300 MHz with the VECTORS option, I get 234.5 MH/s. However, I can't seem to reap the benefits of VECTORS2 or VECTORS4 at a higher memory clock (i.e. 1.2 GHz). I've reduced the WORKSIZE from 256 to 128 and 64, and with these options I only reach between 204 and 213 MH/s.

Phateus (OP)

Newbie

Offline

Activity: 52
Merit: 0

Re: Modified Kernel for Phoenix 1.5

August 11, 2011, 04:33:14 PM

#245

Quote from: znort987 on August 11, 2011, 12:45:44 PM

Quote from: Diapolo on August 11, 2011, 11:09:00 AM

Quote from: znort987 on August 11, 2011, 10:21:34 AM

A good idea.

A further improvement: I'd like to have an option in my miner that spends ~2mn
benchmarking all the kernels available in the current directory (without talking to
a pool, i.e. doing pure SHA256 on bogus nonces), and picking the fastest for the
current rig.

For people with lots of different rigs/setups, that would save them the headache
of having to hand-tune each instance.
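
The benchmarking idea above could be sketched roughly as follows. This is not how any existing miner does it: `benchmark` and `cpu_reference_kernel` are hypothetical names, the "kernel" is a plain CPU stand-in so the example runs anywhere, and the code targets Python 3.

```python
import hashlib
import struct
import time


def double_sha256(header):
    """Bitcoin-style hash: SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def cpu_reference_kernel(header76, nonces):
    """Stand-in 'kernel': hashes each bogus nonce on the CPU."""
    for nonce in nonces:
        double_sha256(header76 + struct.pack("<I", nonce))


def benchmark(kernels, seconds=0.05, batch=256):
    """Run each candidate for a fixed time; return (best_name, rates in H/s)."""
    header76 = bytes(76)  # bogus 76-byte header prefix; no pool involved
    rates = {}
    for name, run in kernels.items():
        hashes = 0
        start = time.perf_counter()
        while time.perf_counter() - start < seconds:
            run(header76, range(hashes, hashes + batch))
            hashes += batch
        rates[name] = hashes / (time.perf_counter() - start)
    return max(rates, key=rates.get), rates
```

A real version would wrap each OpenCL kernel in a callable with the same shape and split the ~2 minute budget between them.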

What I am currently working on is a modified version of phoenix which runs multiple kernels with a single instance and a single work queue (to decrease excessive getwork).
I am also working on plugin support for it, so you can use various added features (such as built-in gui, Web interface, logger, autotune, variable aggression for when computer is idle, overclocking support, etc...)
This would make it tremendously easier for anyone to add features and you can still use whichever kernel works best for you.

As for cgminer support, I haven't tried it, are there any benefits over phoenix? I may fork that instead of phoenix and make the plugin support via command-line, lua or javascript, although I find that python is much easier to code than c (especially for cross platform support).

Phateus (OP)

Newbie

Offline

Activity: 52
Merit: 0

Re: Modified Kernel for Phoenix 1.5

August 11, 2011, 04:50:32 PM

#246

Quote from: MegaBux on August 11, 2011, 03:26:33 PM

Quote from: Phateus on May 11, 2011, 05:05:55 PM

I'm using a 6770 @ 1.01 GHz with phatk 2.2. When I run the memory clock at 300 MHz with the VECTORS option, I get 234.5 MH/s. However, I can't seem to reap the benefits of VECTORS2 or VECTORS4 at a higher memory clock (i.e. 1.2 GHz). I've reduced the WORKSIZE from 256 to 128 and 64 and can only seem to peak at 213 MH/s. With these options, I can only achieve between 204 and 213 MH/s.

I have found that VECTORS4 is extremely unreliable... even tiny changes in the kernel and other factors affect the hashrate tremendously... OpenCL gets really weird when you use a lot of registers. I added it in 2.1 because it was comparable to VECTORS in some situations, but changing the kernel slightly in 2.2 seems to have broken it (even though KernelAnalyzer says it uses fewer registers and fewer ALU ops... *sigh*).

For anyone wondering about new kernel improvements: I seem to be at a standstill. I have tried the following:

Removing all control flow operations (about 1MH/s slower)
Sending all kernel arguments in a buffer (about 1MH/s slower)
Using an atomic counter for the output so that the output buffer is written sequentially (about the same speed and only works on ATI xxx cards and newer)
Using an internal loop in the kernel to process multiple nonces (Either significantly slower or massive desktop lag)
Calling set_arg only once per getwork instead of once per kernel call (only faster when using very low aggression and FASTLOOP, I will add this to my next kernel release)
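
The last item in the list can be illustrated with a small sketch. `CountingKernel` is a stand-in that only counts calls (it is not a real OpenCL kernel object), and the argument layout (a few per-getwork values followed by one per-launch nonce base) is an assumption for illustration.

```python
class CountingKernel:
    """Stand-in for an OpenCL kernel object; only counts set_arg calls."""

    def __init__(self):
        self.args = {}
        self.set_arg_calls = 0

    def set_arg(self, index, value):
        self.args[index] = value
        self.set_arg_calls += 1


def run_naive(kernel, per_work_args, base_nonces):
    """Re-send every argument before every kernel launch."""
    for base in base_nonces:
        for index, value in enumerate(per_work_args):
            kernel.set_arg(index, value)
        kernel.set_arg(len(per_work_args), base)


def run_cached(kernel, per_work_args, base_nonces):
    """Send per-getwork arguments once; update only the nonce base per launch."""
    for index, value in enumerate(per_work_args):
        kernel.set_arg(index, value)
    for base in base_nonces:
        kernel.set_arg(len(per_work_args), base)
```

The saving is in host-side call overhead only, which fits the observation that it matters mainly at very low aggression, where launches are short and frequent.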

-Phateus

jedi95

Full Member

Offline

Activity: 219
Merit: 120

Re: Modified Kernel for Phoenix 1.5

August 11, 2011, 08:44:55 PM

#247

Quote from: Phateus on August 11, 2011, 04:33:14 PM

What I am currently working on is a modified version of phoenix which runs multiple kernels with a single instance and a single work queue (to decrease excessive getwork).
I am also working on plugin support for it, so you can use various added features (such as built-in gui, Web interface, logger, autotune, variable aggression for when computer is idle, overclocking support, etc...)
This would make it tremendously easier for anyone to add features and you can still use whichever kernel works best for you.

As for cgminer support, I haven't tried it, are there any benefits over phoenix? I may fork that instead of phoenix and make the plugin support via command-line, lua or javascript, although I find that python is much easier to code than c (especially for cross platform support).

In most cases you won't see much if any decrease in the number of getwork requests by running multiple kernels behind the same work queue. The reason for having a work queue in the first place is so that the miner only needs to ask for more work when the queue falls below a certain size. During normal operation Phoenix won't request more work than absolutely necessary. There might be a small benefit to doing this when the block changes, but aside from that the getwork count for a single instance running 2 kernels compared to 2 instances will be very close.
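
A toy model of the point about the work queue, under simplifying assumptions: every kernel consumes exactly one work unit per round, there are no block changes or stale work, and `WorkQueue` here is a hypothetical class, not Phoenix's implementation.

```python
from collections import deque


class WorkQueue:
    """Toy queue that fetches work only when it runs below `low_water`."""

    def __init__(self, low_water=2):
        self.low_water = low_water
        self.items = deque()
        self.getwork_calls = 0

    def _refill(self):
        while len(self.items) < self.low_water:
            self.getwork_calls += 1  # stands in for one getwork request
            self.items.append(self.getwork_calls)

    def take(self):
        self._refill()
        return self.items.popleft()


def getwork_count(instances, kernels_per_instance, rounds, low_water=2):
    """Total getwork requests when every kernel consumes one unit per round."""
    queues = [WorkQueue(low_water) for _ in range(instances)]
    for _ in range(rounds):
        for queue in queues:
            for _ in range(kernels_per_instance):
                queue.take()
    return sum(queue.getwork_calls for queue in queues)
```

In this model, one instance with two kernels and two instances with one kernel each differ only by the initial fill of the extra queue, so the request counts end up almost identical.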

That said, I am interested to see the results of the other changes you mentioned. Feel free to PM me if you have any questions.

Phoenix Miner developer

Donations appreciated at:
1PHoenix9j9J3M6v3VQYWeXrHPPjf7y3rU

deepceleron

Legendary

Offline

Activity: 1512
Merit: 1036

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 03:12:32 AM
Last edit: August 12, 2011, 04:30:40 AM by deepceleron

#248

Big Edit:

I looked again at the AMD APP SDK v2.5, trying to get it to not suck. This time I did one more thing: not only did I install the 2.5 SDK (on Catalyst 11.6), I also recompiled pyopencl 0.92 against the newer SDK. On phatk 2.2, changing just from SDK 2.4 to SDK 2.5 with a matching pyOpenCL gets a hair more Mhash/s:
SDK 2.4: 309.97
SDK 2.5: 310.10

Just to let people know: regarding the APP SDK, both the installed version and the version used to compile pyopencl seem to matter (not that this helps if you are using just the prepackaged Windows phoenix.exe).

Using a pyOpenCL newer than 0.92 gives a deprecation warning:

[0 Khash/sec] [0 Accepted] [0 Rejected] [RPC]
kernels\phatk\__init__.py:414: DeprecationWarning: 'enqueue_read_buffer' has been deprecated in version 2011.1. Please use enqueue_copy() instead.
  self.commandQueue, self.output_buf, self.output)
[11/08/2011 21:10:22] Server gave new work; passing to WorkQueue
[291.32 Mhash/sec] [0 Accepted] [0 Rejected] [RPC (+LP)]
kernels\phatk\__init__.py:427: DeprecationWarning: 'enqueue_write_buffer' has been deprecated in version 2011.1. Please use enqueue_copy() instead.
  self.commandQueue, self.output_buf, self.output)
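
One way to quiet these warnings while staying compatible with older pyOpenCL is a small shim like the sketch below. `read_output` and `write_output` are hypothetical helper names, and `cl` is whatever `pyopencl` module you imported, passed in as a parameter. The argument orders follow the log above and pyOpenCL's `enqueue_copy(queue, dest, src)`; blocking behaviour may differ between versions, so check yours.

```python
def read_output(cl, queue, device_buf, host_array):
    """Copy device_buf into host_array, preferring the non-deprecated call."""
    if hasattr(cl, "enqueue_copy"):
        # Newer pyOpenCL: destination first, then source.
        return cl.enqueue_copy(queue, host_array, device_buf)
    # Older pyOpenCL: device buffer first, then host array.
    return cl.enqueue_read_buffer(queue, device_buf, host_array)


def write_output(cl, queue, device_buf, host_array):
    """Copy host_array into device_buf, preferring the non-deprecated call."""
    if hasattr(cl, "enqueue_copy"):
        return cl.enqueue_copy(queue, device_buf, host_array)
    return cl.enqueue_write_buffer(queue, device_buf, host_array)
```

This only changes which host API is called; it is not expected to explain the hashrate difference reported here.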

Using pyOpenCL 2011.1.2 with the kernel in its current form gets me less mhash though:
SDK 2.4: 307.98
SDK 2.5: 307.84

(5830@955/350; Catalyst 11.6; Win7; py 2.6.6)

CYPER

Hero Member

Offline

Activity: 812
Merit: 502

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 03:24:26 AM

#249

Using the latest 2.2 version got quite a noticeable increase:

Before:
4x 440Mh/s = 1760Mh/s

After:
4x 446Mh/s = 1784Mh/s

My best settings are:
Worksize = 256
Aggression = 12
VECTORS

Tx2000

Full Member

Offline

Activity: 182
Merit: 100

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 03:46:11 AM

#250

Quote from: Phateus on August 11, 2011, 04:33:14 PM

Would definitely be interested in a cgminer fork. Don't get me wrong, phoenix is great and has always given me the best performance overall, but it does lack some of the more refined features the other poster listed above, such as failover and a nice static but updating command-line "UI". Seems like you and Diapolo are hitting the ceiling with phoenix anyway.

hugolp

Legendary

Offline

Activity: 1148
Merit: 1001

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 06:54:36 AM

#251

There is one thing I don't understand about the results of these modifications. They increase the hash rate, but they also increase power consumption. I always thought that since they make the kernel more efficient (same task with fewer instructions, less work for the GPU per hash), they should increase the hash rate without changing consumption too much. Does anyone know why the more efficient kernel is not also more energy efficient?

Also, if one of you guys is out of ideas to make the cards run faster, it could be interesting to target energy efficiency instead of speed. A lot of us are not interested in running our cards at the maximum MHash/s rate but are more interested in having a better MHash/J rate.
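
For reference, the MHash/J figure is just hash rate divided by power draw, since a watt is a joule per second. A minimal sketch, with hypothetical function names and no measured power numbers from this thread:

```python
def mhash_per_joule(mhash_per_sec, watts):
    """Energy efficiency: (MH/s) / (J/s) = MH/J."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return mhash_per_sec / watts


def breakeven_power_increase(old_rate, new_rate):
    """Fractional rise in power draw at which MH/J stays unchanged."""
    return new_rate / old_rate - 1.0
```

Using the 440 to 446 MH/s change reported earlier in the thread, `breakeven_power_increase(440, 446)` is about 0.014, so that kernel only improves MH/J if the card's power draw rises by less than roughly 1.4%.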

talldude

Member

Offline

Activity: 224
Merit: 10

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 01:23:02 PM

#252

It is more efficient - the more output per unit time you have, the more efficient it is since the card will be wasting less power sitting idle.

If you want to increase efficiency, that is a hardware thing - namely undervolt your card.

bcforum

Full Member

Offline

Activity: 140
Merit: 100

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 01:28:56 PM

#253

Quote from: hugolp on August 12, 2011, 06:54:36 AM

In theory, fewer ALU ops translates to less energy consumption. In practice, each ALU op uses a slightly different amount of power, and a kernel which runs 10x instruction A may burn more power than one which runs 12x instruction B. Unfortunately, instruction power numbers aren't documented anywhere, so it is almost impossible to optimize in a theoretical sense, and they could vary from GPU to GPU (due to minor manufacturing defects).

One of Diapolo's recent kernels lowered operating temperature by ~3C without changing hashrate significantly. Presumably that particular kernel is ~10% more power efficient than others.

If you found this post useful, feel free to share the wealth: 1E35gTBmJzPNJ3v72DX4wu4YtvHTWqNRbM

hugolp

Legendary

Offline

Activity: 1148
Merit: 1001

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 01:35:19 PM
Last edit: August 12, 2011, 02:29:51 PM by hugolp

#254

Quote from: bcforum on August 12, 2011, 01:28:56 PM

Thanks for the answer. Can you indicate the version of Diapolo's kernel you are referring to?

Phateus (OP)

Newbie

Offline

Activity: 52
Merit: 0

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 05:53:22 PM

#255

Quote from: Tx2000 on August 12, 2011, 03:46:11 AM

Quote from: Phateus on August 11, 2011, 04:33:14 PM

I will release a version that will work with cgminer early next week (looks like he has already implemented diapolo's old version).

We are hitting a ceiling with opencl in general (and perhaps with the current hardware). In one of the mining threads, vector76 and I were discussing the theoretical limit on hashing speeds... and unless there is a way to make the Maj() operation take 1 instruction, we are within about a percent of the theoretical limit on minimum number of instructions in the kernel unless we are missing something.
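
To make the Maj() remark concrete: a well-known identity expresses SHA-256's Maj as one XOR feeding one bit-select, i.e. two operations rather than one. The sketch below only checks that identity in plain Python; it is not taken from the phatk source, and `bitselect` here just mimics the semantics of OpenCL's built-in of the same name.

```python
import random


def bitselect(a, b, c):
    """OpenCL-style bitselect: take bits of b where c is 1, else bits of a."""
    return ((a & ~c) | (b & c)) & 0xFFFFFFFF


def maj_reference(x, y, z):
    """SHA-256 Maj as written in the specification."""
    return (x & y) ^ (x & z) ^ (y & z)


def maj_two_ops(x, y, z):
    """Maj as one XOR plus one bit-select.

    Where x and z agree, that shared bit is the majority; where they
    differ, y casts the deciding vote.
    """
    return bitselect(x, y, x ^ z)


def check_identity(trials=1000, seed=0):
    """Compare both forms on random 32-bit inputs; True if they always agree."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = (rng.getrandbits(32) for _ in range(3))
        if maj_reference(x, y, z) != maj_two_ops(x, y, z):
            return False
    return True
```

A single-instruction Maj would need hardware that computes the three-input majority directly, which is the gap being described.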

Now that doesn't mean that there is NO room for improvement, just that any other improvement will probably have to be faster hardware, a more efficient implementation of openCL by AMD or figuring out a better way to finagle the current openCL implementation to reduce the implementation overhead. But, unless there is a problem with pyopenCL, c and python should give equivalent speeds as long as they are just calling the openCL interface (the actual miner uses negligible resources). I suppose it could be possible to access the hardware drivers directly and run the kernel that way... but I don't see that as being feasible.

But, with all of that said, I have looked through some of his code, and it is some really clean code. Part of the reason I want to add these features is to learn more python (this is the first thing I have programmed in python), but it will probably be easier just to modify the cgminer code. Thanks for pointing out cgminer to me.

Tx2000

Full Member

Offline

Activity: 182
Merit: 100

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 06:04:56 PM

#256

Sent another donation your way. Look forward to your work on cgminer.

Phateus (OP)

Newbie

Offline

Activity: 52
Merit: 0

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 06:30:17 PM

#257

Quote from: Tx2000 on August 12, 2011, 06:04:56 PM

Sent another donation your way. Look forward to your work on cgminer.

Thanks

bcforum

Full Member

Offline

Activity: 140
Merit: 100

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 06:50:47 PM

#258

Quote from: hugolp on August 12, 2011, 01:35:19 PM

Thanks for the answer. Can you indicate the version of Diapolo's kernel you are referring to?

https://bitcointalk.org/index.php?topic=25860.msg428882#msg428882

If you found this post useful, feel free to share the wealth: 1E35gTBmJzPNJ3v72DX4wu4YtvHTWqNRbM

BOARBEAR

Member

Offline

Activity: 77
Merit: 10

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 07:38:31 PM

#259

I took a look at the comparison between version 2.2 and version 2.1
Could it be the __constant uint ConstW[128] change that broke VECTORS4?

Phateus (OP)

Newbie

Offline

Activity: 52
Merit: 0

Re: Modified Kernel for Phoenix 1.5

August 12, 2011, 08:01:13 PM

#260

Quote from: BOARBEAR on August 12, 2011, 07:38:31 PM

I took a look at the comparison between version 2.2 and version 2.1
Could it be the __constant uint ConstW[128] change that broke VECTORS4?

That change is inconsequential (I was trying some things that required the change but did not keep them)... the compiler doesn't use those values, so the code should be exactly the same either way (you can replace the code with the old code if you want to check).

You keep saying that it is broken.. if it does not run, post the errors.

I have found that on my card, VECTORS4 is much slower in version 2.2 than 2.1, but this is not a bug... it seems to be because openCL does not like allocating that many registers... Version 2.1 uses around 99.7% of instruction slots with VECTORS4 and I have tried many many ways to make it faster and more reliable (in 2.1), but I have given up on it. It is still in the release because I don't see any point in taking it out... but getting 2.2 to run as fast as 2.1 with VECTORS4 is not going to happen. Also, the differences between 2.1 and 2.2 with VECTORS are very tiny anyway (less than .5%)...

Getting into more detail about it: if you look at the graph on the main page of the thread, you can see the graph of VECTORS4 in version 2.1... in version 2.2, for some reason, the spike (and corresponding valley) is located higher (somewhere around 500). This could mean that it would be just as fast if you had 1500 MHz memory, but I have no idea why openCL reacts this way to changing the memory speed. There are way too many GPU architecture/GPU BIOS/PCIe bus/CPU-GPU transfer/driver/openCL implementation unknowns to try to predict this behavior.

-Phateus
