Bitcoin Forum
Author Topic: Phoenix 2 beta discussion  (Read 57908 times)
Bananington
Sr. Member
****
Offline Offline

Activity: 1400
Merit: 340



View Profile
February 09, 2012, 02:50:35 AM
 #81

Instead of the program creating a blank config file and aborting, that would be a good opportunity to run a little wizard questionnaire to set up the program:

"[02/08/2012 18:44:39] Welcome to Phoenix v2.0.0-rc1"
"I see this is the first time you've run Phoenix, please provide some information for your initial configuration:"

"Pool URL:"
[if not detected in URL:]
"Pool Port:"
"Pool Worker Username:"
"Pool Worker Password:"

"Available OpenCL processing devices:"
    [[0]] Juniper
    [[1]] Juniper
    [[2]] Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz

"Please input the number(s) of the devices to use (Enter= Use All):"

"Do you wish to autodetect the best settings (Y/N) (Enter=Auto)?:"

"- Manual Configuration -"

"Available OpenCL software kernels:"
    [[0]] opencl (default)
    [[1]] phatk2
    [[2]] diapolo

"Please select the software kernel (Enter=default):"

"Available worksizes: 64, 128, 256"
"Please enter the OpenCL kernel worksize (Enter=autodetect):"

"Available vector sizes: 2, 4"
"Please enter the desired work vector size (Enter=autodetect):"

"Available aggression setting (work unit size): 1-14 (default 6)"
"Please enter the desired aggression (Enter=default):"
 
"Thank you, parameters saved to configuration file phoenix.cfg!"
[18:46:23] Connected to server
[18:46:23] Server gave new work; passing to WorkQueue
[18:46:23] New block (WorkQueue)
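
A minimal sketch of how such a wizard could turn the answers into a config file, assuming the section and key names that appear in the phoenix.cfg examples posted in this thread ([general], backend, [cl:0:N], kernel, worksize, aggression). The names build_config and render are hypothetical and not part of Phoenix, and the platform index 0 in "cl:0:N" is an assumption:

```python
import configparser
import io


def build_config(pool_url, devices, kernel=None, worksize=None, aggression=None):
    """Turn wizard answers into a ConfigParser; None means "leave it to autodetect"."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["general"] = {"backend": pool_url}
    for index in devices:
        section = {"disabled": "false"}
        # Only write the values the user actually entered.
        if kernel is not None:
            section["kernel"] = kernel
        if worksize is not None:
            section["worksize"] = str(worksize)
        if aggression is not None:
            section["aggression"] = str(aggression)
        cfg["cl:0:%d" % index] = section
    return cfg


def render(cfg):
    """Return the config as text, as it would be saved to phoenix.cfg."""
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


example = build_config("http://user:password@pool.example:8332", [0, 1],
                       kernel="phatk2", worksize=128)
print(render(example))
```

Leaving a key out of the file, rather than writing a guessed value, keeps the "Enter=autodetect" answers distinguishable from explicit choices.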


I just blew my happiness load. +69

CFSworks (OP)
Member
**
Offline Offline

Activity: 63
Merit: 10


View Profile
February 09, 2012, 03:27:43 AM
 #82

Instead of the program creating a blank config file and aborting, that would be a good opportunity to run a little wizard questionnaire to set up the program:

The plan is actually to tell the user to fire up a web browser and have PhoenixWeb run the first-time setup wizard. The only trouble is that PhoenixWeb still needs a lot of work before it's ready to be included in the standard Phoenix download. (Right now it only has enough there for a "real" web developer to pick it up and work on it...)

Phoenix Miner developer

PGP/GPG key: FC5461A3
Personal donations: 1Abq88sPz2MjH4Yi8yZVCbfu1ZXRSP7id5
ssateneth
Legendary
*
Offline Offline

Activity: 1344
Merit: 1004



View Profile
February 09, 2012, 03:32:01 AM
 #83

Instead of the program creating a blank config file and aborting, that would be a good opportunity to run a little wizard questionnaire to set up the program:

The plan is actually to tell the user to fire up a web browser and have PhoenixWeb run the first-time setup wizard. The only trouble is that PhoenixWeb still needs a lot of work before it's ready to be included in the standard Phoenix download. (Right now it only has enough there for a "real" web developer to pick it up and work on it...)

CFSWorks, could you please verify the crc of current windows build of phoenix.exe is 89B9736A ? I keep getting told that the windows build was updated but i'm just not seeing it because the "new" and "old" crc32s are identical, so I don't believe anything was updated/changed.

CFSworks (OP)
Member
**
Offline Offline

Activity: 63
Merit: 10


View Profile
February 09, 2012, 03:43:57 AM
 #84

What I doing wrong? Ubuntu:

I'm getting what looks to be the same issue as mich, I'm on Debian 6:

Sorry about that! Apparently I don't know how to use setuptools correctly. :D

I just pushed a fix for that problem into the Github repository. You can try that if you like.

Because o has to come before p, opencl before phatk2. Maybe you should rename it 00-opencl. ;)

A better fix for that problem is to use a function which loads the kernel if it hasn't already been imported. I'm working on that now...

Phoenix Miner developer

PGP/GPG key: FC5461A3
Personal donations: 1Abq88sPz2MjH4Yi8yZVCbfu1ZXRSP7id5
CFSworks (OP)
Member
**
Offline Offline

Activity: 63
Merit: 10


View Profile
February 09, 2012, 03:49:59 AM
 #85


CFSWorks, could you please verify the crc of current windows build of phoenix.exe is 89B9736A ? I keep getting told that the windows build was updated but i'm just not seeing it because the "new" and "old" crc32s are identical, so I don't believe anything was updated/changed.

I just checked - the zipfile itself is identical. I'm going to go badger jedi95 to compile a new one.
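
For anyone who wants to run the same check locally, here is a small sketch using Python's zlib. The helper name crc32_hex is made up for this example, and the self-check uses the standard CRC-32 test string rather than a Phoenix file:

```python
import os
import tempfile
import zlib


def crc32_hex(path, chunk_size=65536):
    """Return the CRC32 of a file as 8 uppercase hex digits."""
    crc = 0
    with open(path, "rb") as fh:
        # Read in chunks so large downloads do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return "%08X" % (crc & 0xFFFFFFFF)


# Self-check against the standard CRC-32 test vector "123456789".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"123456789")
check = crc32_hex(tmp.name)
os.remove(tmp.name)
print(check)
```

Running crc32_hex on the old and the new download and getting the same value both times (as with 89B9736A above) would suggest the file was not actually replaced.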

Phoenix Miner developer

PGP/GPG key: FC5461A3
Personal donations: 1Abq88sPz2MjH4Yi8yZVCbfu1ZXRSP7id5
CFSworks (OP)
Member
**
Offline Offline

Activity: 63
Merit: 10


View Profile
February 09, 2012, 04:01:49 AM
 #86

-- snip -- (accidentally quoted the main post)

Phoenix Miner developer

PGP/GPG key: FC5461A3
Personal donations: 1Abq88sPz2MjH4Yi8yZVCbfu1ZXRSP7id5
ssateneth
Legendary
*
Offline Offline

Activity: 1344
Merit: 1004



View Profile
February 09, 2012, 04:47:29 AM
 #87


CFSWorks, could you please verify the crc of current windows build of phoenix.exe is 89B9736A ? I keep getting told that the windows build was updated but i'm just not seeing it because the "new" and "old" crc32s are identical, so I don't believe anything was updated/changed.

I just checked - the zipfile itself is identical. I'm going to go badger jedi95 to compile a new one.

Thanks! :)

Diapolo
Hero Member
*****
Offline Offline

Activity: 769
Merit: 500



View Profile WWW
February 09, 2012, 07:35:21 AM
 #88

Hashrate display is fixed on latest build, good job!

Another problem I observed with my own kernel (DiaKGCN): if I use a 7970 (Tahiti - GCN) on its own, everything is okay; if I use a 6550D (BeaverCreek - VLIW5) on its own, everything is okay; but if I try to use both of them together, there seems to be a problem.

Code:
[general]
autodetect = +cl -cpu
backend = http://XYZ:XYZ@pool.bitlc.net
ratesamples = 100
verbose = true

[cl:0:0]
disabled = false
kernel = diakgcn
aggression = 12
goffset = true
vectors2 = true
vectors4 = false
vectors8 = false
worksize = 256

[cl:0:1]
disabled = false
kernel = diakgcn
aggression = 12
goffset = true
vectors2 = false
vectors4 = false
vectors8 = true
worksize = 128

The hashrate of the devices run individually is ~540 MH/s and ~60 MH/s, which should add up to ~600 MH/s for the above config. The displayed rate is only 114 MH/s, so it seems the kernel does not use the settings supplied for each device but perhaps uses only the last supplied parameters (here, those for [cl:0:1]). This could well be a problem in my init / kernel, but it could also be a general problem. Any ideas?
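
One way to narrow this down is to list which options really differ between the device sections and compare that with what each kernel instance reports at startup. The sketch below uses Python's standard configparser, which is an assumption and not necessarily how Phoenix reads the file; device_settings is a hypothetical helper name:

```python
import configparser

CONFIG_TEXT = """
[general]
autodetect = +cl -cpu
backend = http://XYZ:XYZ@pool.bitlc.net

[cl:0:0]
disabled = false
kernel = diakgcn
aggression = 12
goffset = true
vectors2 = true
vectors4 = false
vectors8 = false
worksize = 256

[cl:0:1]
disabled = false
kernel = diakgcn
aggression = 12
goffset = true
vectors2 = false
vectors4 = false
vectors8 = true
worksize = 128
"""


def device_settings(text):
    """Return {section name: {option: value}} for every [cl:...] section."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(text)
    return {name: dict(cfg[name]) for name in cfg.sections() if name.startswith("cl:")}


settings = device_settings(CONFIG_TEXT)
all_keys = set().union(*settings.values())
# Options whose value is not the same in every device section.
differing = [key for key in sorted(all_keys)
             if len({s.get(key) for s in settings.values()}) > 1]
for key in differing:
    print(key, {name: s.get(key) for name, s in settings.items()})
```

If the kernel for [cl:0:0] logs the worksize or vector width listed for [cl:0:1], that would point at settings being shared between instances rather than at the kernel code itself.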

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
jedi95
Full Member
***
Offline Offline

Activity: 219
Merit: 120


View Profile
February 09, 2012, 07:48:56 AM
 #89

Hashrate display is fixed on latest build, good job!

Another problem I observed with my own kernel (DiaKGCN): if I use a 7970 (Tahiti - GCN) on its own, everything is okay; if I use a 6550D (BeaverCreek - VLIW5) on its own, everything is okay; but if I try to use both of them together, there seems to be a problem.

Code:
[general]
autodetect = +cl -cpu
backend = http://XYZ:XYZ@pool.bitlc.net
ratesamples = 100
verbose = true

[cl:0:0]
disabled = false
kernel = diakgcn
aggression = 12
goffset = true
vectors2 = true
vectors4 = false
vectors8 = false
worksize = 256

[cl:0:1]
disabled = false
kernel = diakgcn
aggression = 12
goffset = true
vectors2 = false
vectors4 = false
vectors8 = true
worksize = 128

The hashrate of the devices run individually is ~540 MH/s and ~60 MH/s, which should add up to ~600 MH/s for the above config. The displayed rate is only 114 MH/s, so it seems the kernel does not use the settings supplied for each device but perhaps uses only the last supplied parameters (here, those for [cl:0:1]). This could well be a problem in my init / kernel, but it could also be a general problem. Any ideas?

Dia

I'm going to need more info to figure this one out. Running multiple kernels works fine on one of my rigs with a 5870 + 5830.

Things to check:
Can you try this using opencl and/or phatk2 kernels for both devices?
What load % are you getting on each GPU?
Is the CPU load % being maxed out? (perhaps a bug in the CPU detection code you submitted?) "autodetect = +cl -cpu" would use the CPU if the detection code doesn't work.

Phoenix Miner developer

Donations appreciated at:
1PHoenix9j9J3M6v3VQYWeXrHPPjf7y3rU
Diapolo
Hero Member
*****
Offline Offline

Activity: 769
Merit: 500



View Profile WWW
February 09, 2012, 08:28:41 AM
Last edit: February 10, 2012, 06:55:35 AM by Diapolo
 #90

These pictures are not phatk, but DiaKGCN!

6550D:


7950:


I set an affinity to only one of my 4 CPU cores and utilisation jumps between 100% and 50% for that single core.
The CPU device was disabled via:
Code:
[cl:0:2]
disabled = true

Utilisation for the GPUs looks very weird...

Autoconfiguration for both GPUs leads to a higher hashrate, but the 7970 only reaches a GPU load of 75% at most. I guess there is a problem with setting different worksizes or vector widths for different GPUs here, and that autoconfig sets the same values for both devices, which is not optimal either and only makes the problem less obvious.

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
d3m0n1q_733rz
Sr. Member
****
Offline Offline

Activity: 378
Merit: 250



View Profile WWW
February 10, 2012, 06:06:59 AM
Last edit: February 10, 2012, 06:53:18 AM by d3m0n1q_733rz
 #91

6550D:


7950:


I set an affinity to only one of my 4 CPU cores and utilisation jumps between 100% and 50% for that single core.
The CPU device was disabled via:
Code:
[cl:0:2]
disabled = true

Utilisation for the GPUs looks very weird...

Autoconfiguration for both GPUs leads to a higher hashrate, but the 7970 only reaches a GPU load of 75% at most. I guess there is a problem with setting different worksizes or vector widths for different GPUs here, and that autoconfig sets the same values for both devices, which is not optimal either and only makes the problem less obvious.

Dia
I don't think that autoconfig actually sets Vector and worksize properly anyway.  When I used it all it set was the phatk2 kernel usage and no other parameters.

Speaking of Phatk2, I think something needs to be done with the nonce determination at the end of the program.  You see, it compares the different vectors of v and g for equivalence and then sets nonce to the corresponding value of W[3]'s vector.  Unfortunately, nonce is a uint and can contain only one vector's value.  This makes the assumption that only a single vector will be okay to use.  And, should more than one vector be proper to utilize, nonce is set to the highest vector's value.  So, one of two things needs to be done:
1)  Only check vectors as long as nonce remains 0.  Any remaining checks will be wasted.
2)  Expand nonce to the size of the vectors used and allow the use of multiple nonce, if found, to increase efficiency.

In any case, the nonce determination code needs to be modified.  I suggest doing a 64-bit atom_and of v and g and then pulling the values of W[3] out based upon each vector in v having all bits equal to 1.  This still has the problem of multiple nonces being possible, though.  But it will only be comparing bits in v to 1.  This also means that we can skip the nonce check altogether in the case that we have more results without a nonce than with one, by testing if v=0.  If v=0, don't bother pulling out the nonce values, because they're not there.  If not every bit in a vector of v is 1, don't pull out the corresponding value in W[3].  It saves a heck of a lot of time.

I made a miscalculation here.  Even if we AND the vectors, some bits will still probably match which means it still needs to pull the vectors apart to check each one against the constant of all 1 bits.  Too bad we can't AND or XOR an entire vector to equal 1 or 0 that I know of.
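
To make the overwrite problem concrete, here is a plain Python model of the behavior described above. It is not the phatk2 kernel code: the function names are hypothetical, the lists stand in for the vector lanes of v, g and W[3], and the lane values are made up:

```python
def single_nonce(v, g, w3):
    """Current behavior as described: one uint result, later lanes overwrite earlier ones."""
    nonce = 0
    for lane in range(len(v)):
        if v[lane] == g[lane]:
            nonce = w3[lane]
    return nonce


def first_nonce(v, g, w3):
    """Option 1: stop checking as soon as a nonce has been found."""
    for lane in range(len(v)):
        if v[lane] == g[lane]:
            return w3[lane]
    return 0


def all_nonces(v, g, w3):
    """Option 2: keep one result per matching lane instead of a single value."""
    return [w3[lane] for lane in range(len(v)) if v[lane] == g[lane]]


v = [7, 3, 9, 5]
g = [0, 3, 0, 5]          # lanes 1 and 3 match
w3 = [100, 101, 102, 103]

print(single_nonce(v, g, w3))  # 103: the match in lane 1 is lost
print(first_nonce(v, g, w3))   # 101
print(all_nonces(v, g, w3))    # [101, 103]
```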

Funroll_Loops, the theoretically quicker breakfast cereal!
Check out http://www.facebook.com/JupiterICT for all of your computing needs.  If you need it, we can get it.  We have solutions for your computing conundrums.  BTC accepted!  12HWUSguWXRCQKfkPeJygVR1ex5wbg3hAq
blandead
Newbie
*
Offline Offline

Activity: 46
Merit: 0


View Profile
February 11, 2012, 02:58:15 AM
 #92

I don't think that autoconfig actually sets Vector and worksize properly anyway.  When I used it all it set was the phatk2 kernel usage and no other parameters.

Speaking of Phatk2, I think something needs to be done with the nonce determination at the end of the program.  You see, it compares the different vectors of v and g for equivalence and then sets nonce to the corresponding value of W[3]'s vector.  Unfortunately, nonce is a uint and can contain only one vector's value.  This makes the assumption that only a single vector will be okay to use.  And, should more than one vector be proper to utilize, nonce is set to the highest vector's value.  So, one of two things needs to be done:
1)  Only check vectors as long as nonce remains 0.  Any remaining checks will be wasted.
2)  Expand nonce to the size of the vectors used and allow the use of multiple nonce, if found, to increase efficiency.

In any case, the nonce determination code needs to be modified.  I suggest doing a 64-bit atom_and of v and g and then pulling the values of W[3] out based upon each vector in v having all bits equal to 1.  This still has the problem of multiple nonces being possible, though.  But it will only be comparing bits in v to 1.  This also means that we can skip the nonce check altogether in the case that we have more results without a nonce than with one, by testing if v=0.  If v=0, don't bother pulling out the nonce values, because they're not there.  If not every bit in a vector of v is 1, don't pull out the corresponding value in W[3].  It saves a heck of a lot of time.

I made a miscalculation here.  Even if we AND the vectors, some bits will still probably match which means it still needs to pull the vectors apart to check each one against the constant of all 1 bits.  Too bad we can't AND or XOR an entire vector to equal 1 or 0 that I know of.

You can AND or XOR whatever you want if it's set up to do so
T atom_and (Q T*p, T val) | Read, Store (*p & val)
T atom_xor (Q T*p, T val) | Read, Store(*p ^ val)

The reason you can't expand the nonce is because the base is a uint (as you pointed out), only with floats or double can you do that

If you have float4 v; corresponding values are (v.x, v.s0), (v.y, v.s1), (v.z, v.s2), (v.w, v.s3)

Then just nest them together into v.xyzw or v.s0123
or v.lo (v.s01, v.xy) or v.hi (v.s23, v.zw) or v.odd(v.s13, v.yw) or v.even(v.s02, v.xz)
or whatever combination you want

There is a way to directly test sign bits btw,
intn signbit(floatn)

There is also a separate function to test if all bits are 1 compared to a constant

You can test for finite values, +infinity or -infinity, NaN, do a bitselect or a select function for vector types

Unfortunately the __init__.py does not have any of it set up properly; the way the buffer is created, stored, and read is very slow compared to what is possible now

Also, there is a way to create an offset function natively as well plus many more options
d3m0n1q_733rz
Sr. Member
****
Offline Offline

Activity: 378
Merit: 250



View Profile WWW
February 11, 2012, 09:40:15 AM
 #93

I don't think that autoconfig actually sets Vector and worksize properly anyway.  When I used it all it set was the phatk2 kernel usage and no other parameters.

Speaking of Phatk2, I think something needs to be done with the nonce determination at the end of the program.  You see, it compares the different vectors of v and g for equivalence and then sets nonce to the corresponding value of W[3]'s vector.  Unfortunately, nonce is a uint and can contain only one vector's value.  This makes the assumption that only a single vector will be okay to use.  And, should more than one vector be proper to utilize, nonce is set to the highest vector's value.  So, one of two things needs to be done:
1)  Only check vectors as long as nonce remains 0.  Any remaining checks will be wasted.
2)  Expand nonce to the size of the vectors used and allow the use of multiple nonce, if found, to increase efficiency.

In any case, the nonce determination code needs to be modified.  I suggest doing a 64-bit atom_and of v and g and then pulling the values of W[3] out based upon each vector in v having all bits equal to 1.  This still has the problem of multiple nonces being possible, though.  But it will only be comparing bits in v to 1.  This also means that we can skip the nonce check altogether in the case that we have more results without a nonce than with one, by testing if v=0.  If v=0, don't bother pulling out the nonce values, because they're not there.  If not every bit in a vector of v is 1, don't pull out the corresponding value in W[3].  It saves a heck of a lot of time.

I made a miscalculation here.  Even if we AND the vectors, some bits will still probably match which means it still needs to pull the vectors apart to check each one against the constant of all 1 bits.  Too bad we can't AND or XOR an entire vector to equal 1 or 0 that I know of.

You can AND or XOR whatever you want if it's set up to do so
T atom_and (Q T*p, T val) | Read, Store (*p & val)
T atom_xor (Q T*p, T val) | Read, Store(*p ^ val)

The reason you can't expand the nonce is because the base is a uint (as you pointed out), only with floats or double can you do that

If you have float4 v; corresponding values are (v.x, v.s0), (v.y, v.s1), (v.z, v.s2), (v.w, v.s3)

Then just nest them together into v.xyzw or v.s0123
or v.lo (v.s01, v.xy) or v.hi (v.s23, v.zw) or v.odd(v.s13, v.yw) or v.even(v.s02, v.xz)
or whatever combination you want

There is a way to directly test sign bits btw,
intn signbit(floatn)

There is also a separate function to test if all bits are 1 compared to a constant

You can test for finite values, +infinity or -infinity, NaN, do a bitselect or a select function for vector types

Unfortunately the __init__.py does not have any of it set up properly; the way the buffer is created, stored, and read is very slow compared to what is possible now

Also, there is a way to create an offset function natively as well plus many more options

Supposing we did fit two found nonce values into a single nonce, could the miner output them both from a uint2?  And the v.s0 is the same as v.x for uint2 or uint4 as well.  When you get into uint8 or above, they become explicitly v.s0 etc.  They're really just vector locations.
But would the miner know to cut the nonce apart or would it expect a single uint?

Funroll_Loops, the theoretically quicker breakfast cereal!
Check out http://www.facebook.com/JupiterICT for all of your computing needs.  If you need it, we can get it.  We have solutions for your computing conundrums.  BTC accepted!  12HWUSguWXRCQKfkPeJygVR1ex5wbg3hAq
iNs4nePT
Newbie
*
Offline Offline

Activity: 25
Merit: 0


View Profile
February 11, 2012, 03:28:00 PM
 #94

Edited: my bad, don't forget "agression" != "aggression" :)

Hash rate still seems a bit lower than phoenix 1.7.5 on an OC'd 6850


Is there any possibility to select the pool from command line?
Either passing the full url, or a configuration key name would be great.
jedi95
Full Member
***
Offline Offline

Activity: 219
Merit: 120


View Profile
February 11, 2012, 07:42:35 PM
 #95

Edited: my bad, don't forget "agression" != "aggression" :)

Hash rate still seems a bit lower than phoenix 1.7.5 on an OC'd 6850


Is there any possibility to select the pool from command line?
Either passing the full url, or a configuration key name would be great.


There are a few ways to do this.

First, the only argument Phoenix 2 accepts on the command line is the path to its config file. If none is supplied, it defaults to phoenix.cfg in the current working directory. Using this method you could have several config files (one per pool) and simply switch between them.

The second method is to use Phoenix 2's RPC interface to switch pools at runtime. The following code is an example of how to do this in Python:
Code:
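import jsonrpc  # JSON-RPC client module that provides ServiceProxy
# Connect to the RPC interface of the running Phoenix 2 instance
sp = jsonrpc.ServiceProxy('http://x:phoenix@localhost:7780')
# Point the "backend" option in [general] at the new pool
sp.setconfig('general', 'backend', 'http://user:password@pool.com:8332')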

This will cause Phoenix to switch to the new pool immediately.
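
For the first method (one config file per pool), a tiny launcher helper could look like the sketch below. The executable name "phoenix", the "<pool>.cfg" naming scheme and the function name phoenix_command are all assumptions made for this example:

```python
import os


def phoenix_command(pool_name, config_dir=".", executable="phoenix"):
    """Build the command line for one pool: the executable plus its config file path."""
    config_path = os.path.join(config_dir, pool_name + ".cfg")
    return [executable, config_path]


# The resulting list can be handed to subprocess.call() to start the miner.
print(phoenix_command("pool_a", config_dir="configs"))
```

The config path is the only argument passed, which matches the command-line behavior described above.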

Phoenix Miner developer

Donations appreciated at:
1PHoenix9j9J3M6v3VQYWeXrHPPjf7y3rU
blandead
Newbie
*
Offline Offline

Activity: 46
Merit: 0


View Profile
February 12, 2012, 08:47:14 PM
 #96

Supposing we did fit two found nonce values into a single nonce, could the miner output them both from a uint2?  And the v.s0 is the same as v.x for uint2 or uint4 as well.  When you get into uint8 or above, they become explicitly v.s0 etc.  They're really just vector locations.
But would the miner know to cut the nonce apart or would it expect a single uint?

Yes there is a way to output directly 4 values

and again it can't be a uint, it has to be a float

With float4, v.s0 and v.x are not the same, at least I don't think so, since it's still a 4-component variable (float4)
Diapolo
Hero Member
*****
Offline Offline

Activity: 769
Merit: 500



View Profile WWW
February 12, 2012, 09:00:41 PM
 #97

Supposing we did fit two found nonce values into a single nonce, could the miner output them both from a uint2?  And the v.s0 is the same as v.x for uint2 or uint4 as well.  When you get into uint8 or above, they become explicitly v.s0 etc.  They're really just vector locations.
But would the miner know to cut the nonce apart or would it expect a single uint?

Yes there is a way to output directly 4 values

and again it can't be a uint, it has to be a float

With float4, v.s0 and v.x are not the same, at least I don't think so, since it's still a 4-component variable (float4)

Why should .s0 and .x be not the same? For 4-component vectors it's possible to use .x .y .z .w or s0 s1 s2 s3 as per OpenCL spec.
To output 2 nonce values in one variable we can use ulong; that's what's currently used in DiaKGCN or my last phatk_dia (I did change that in my latest internal version, though). To output 4 values, vstore() could be used, I guess.
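
As a host-side illustration of the ulong idea, here is a Python sketch that packs two 32-bit nonces into one 64-bit value and splits them again. Which half holds which nonce is an assumption made for this example, not necessarily the layout DiaKGCN uses, and the function names are hypothetical:

```python
MASK32 = 0xFFFFFFFF


def pack_nonces(first, second):
    """Pack two 32-bit nonces into one 64-bit value (first in the low half)."""
    return ((second & MASK32) << 32) | (first & MASK32)


def unpack_nonces(packed):
    """Split a 64-bit value back into its two 32-bit halves."""
    return packed & MASK32, (packed >> 32) & MASK32


packed = pack_nonces(0xDEADBEEF, 0x12345678)
print(hex(packed))
print([hex(n) for n in unpack_nonces(packed)])
```

The miner side would need the unpacking step, since it otherwise expects a single uint per result.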

Dia

Liked my former work for Bitcoin Core? Drop me a donation via:
1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x
bitcoin:1PwnvixzVAKnAqp8LCV8iuv7ohzX2pbn5x?label=Diapolo
blandead
Newbie
*
Offline Offline

Activity: 46
Merit: 0


View Profile
February 13, 2012, 08:00:47 AM
 #98

Why should .s0 and .x be not the same? For 4-component vectors it's possible to use .x .y .z .w or s0 s1 s2 s3 as per OpenCL spec.
To output 2 nonce values in one variable we can use ulong; that's what's currently used in DiaKGCN or my last phatk_dia (I did change that in my latest internal version, though). To output 4 values, vstore() could be used, I guess.

Dia

My mistake, it is the same. Just be careful when using the .lo or .hi suffixes: with a float2, .lo refers to the same component as .x or .s0. Of course you can always use .lo.x if you want to be extra specific.

Sure, you can use ulong, but it takes an extra 10+ instructions to output the nonces while you upsample them, instead of outputting all nonces at once with just one output function.

A vstore() writes the vector data to memory, and a vload() reads it from memory. Probably with Async copies and Prefetch it can be done
d3m0n1q_733rz
Sr. Member
****
Offline Offline

Activity: 378
Merit: 250



View Profile WWW
February 13, 2012, 10:26:10 AM
 #99

Supposing we did fit two found nonce values into a single nonce, could the miner output them both from a uint2?  And the v.s0 is the same as v.x for uint2 or uint4 as well.  When you get into uint8 or above, they become explicitly v.s0 etc.  They're really just vector locations.
But would the miner know to cut the nonce apart or would it expect a single uint?

Yes there is a way to output directly 4 values

and again it can't be a uint, it has to be a float

With float4, v.s0 and v.x are not the same, at least I don't think so, since it's still a 4-component variable (float4)

Why should .s0 and .x be not the same? For 4-component vectors it's possible to use .x .y .z .w or s0 s1 s2 s3 as per OpenCL spec.
To output 2 nonce values in one variable we can use ulong; that's what's currently used in DiaKGCN or my last phatk_dia (I did change that in my latest internal version, though). To output 4 values, vstore() could be used, I guess.

Dia
Dia, didn't you use a conditional statement that directly output the nonce to the miner instead of storing them and then outputting them at the end?
Why should .s0 and .x be not the same? For 4-component vectors it's possible to use .x .y .z .w or s0 s1 s2 s3 as per OpenCL spec.
To output 2 nonce values in one variable we can use ulong; that's what's currently used in DiaKGCN or my last phatk_dia (I did change that in my latest internal version, though). To output 4 values, vstore() could be used, I guess.

Dia

My mistake, it is the same. Just be careful when using the .lo or .hi suffixes: with a float2, .lo refers to the same component as .x or .s0. Of course you can always use .lo.x if you want to be extra specific.

Sure, you can use ulong, but it takes an extra 10+ instructions to output the nonces while you upsample them, instead of outputting all nonces at once with just one output function.

A vstore() writes the vector data to memory, and a vload() reads it from memory. Probably with Async copies and Prefetch it can be done
In 2-component vectors, .x is the same as .s0, .even and .lo.  However, the only reason to use the last two is if you have code designed to handle multiple different vector types and you need to output the even or lower half of the results and don't want to rewrite the entire code for each vector type.
I tend to just bother with the .s# version as it's easier and the other appears to be becoming legacy (matter of opinion of course).  Just remember that the component after .s9 is .sa, as they're numbered in hexadecimal.
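
A small Python model of the component naming discussed here, with lists standing in for OpenCL vectors. The helper names are made up for this example; the selection rules (.lo/.hi are the lower/upper half, .even/.odd are the even/odd-indexed components, numeric indices run .s0 to .sf) follow the OpenCL spec:

```python
HEX_DIGITS = "0123456789abcdef"


def s_name(index):
    """Numeric component name for a lane index, e.g. 10 -> 'sa'."""
    return "s" + HEX_DIGITS[index]


def lo(vec):
    """Lower half of the vector."""
    return vec[: len(vec) // 2]


def hi(vec):
    """Upper half of the vector."""
    return vec[len(vec) // 2:]


def even(vec):
    """Even-indexed components (s0, s2, ...)."""
    return vec[0::2]


def odd(vec):
    """Odd-indexed components (s1, s3, ...)."""
    return vec[1::2]


v2 = [10, 20]
v4 = [1, 2, 3, 4]

print([s_name(i) for i in range(16)])
print(lo(v2), even(v2))   # both select only the first component (.x / .s0)
print(lo(v4), hi(v4), even(v4), odd(v4))
```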

Funroll_Loops, the theoretically quicker breakfast cereal!
Check out http://www.facebook.com/JupiterICT for all of your computing needs.  If you need it, we can get it.  We have solutions for your computing conundrums.  BTC accepted!  12HWUSguWXRCQKfkPeJygVR1ex5wbg3hAq
blandead
Newbie
*
Offline Offline

Activity: 46
Merit: 0


View Profile
February 13, 2012, 12:58:21 PM
 #100

This is a good point, might as well just use .s0 - .sf