peacefulmind
|
|
April 22, 2013, 08:43:39 PM |
|
I am using 4/22 version. Same cmd as before, however total has dropped from 520 to 410kH. Same clocks. Here is the .bat:
cudaminer.exe --url http://127.0.0.1:8332/ --userpass xxx.x:123 -i 0,0 -d 0,1 -m 1,1 -C 2,2 -l 84x4,84x4
This got 520kH on the 2x titan in 4/17 release.
|
"I think you are to hung up on this notion about 'pre-mining' being a No-No." - from journeys into the dark depths of the alt coin forum....
|
|
|
tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
|
|
April 22, 2013, 08:45:28 PM |
|
Do you feel you are at release candidate level yet? I want to add this to guiminer-scrypt when it hits maturity.
|
|
|
|
Misiolap
Newbie
Offline
Activity: 14
Merit: 0
|
|
April 22, 2013, 08:45:53 PM |
|
Isn't ulong the same as uint on 32 bit builds? On 64bit linux it breaks things, because ulong is 64bit and uint is 32bit.
|
|
|
|
cbuchner1 (OP)
|
|
April 22, 2013, 08:49:05 PM Last edit: April 22, 2013, 10:36:56 PM by cbuchner1 |
|
Do you feel you are at release candidate level yet? I want to add this to guiminer-scrypt when it hits maturity.
Hmm, I am probably not going to change the console output and command line options now. But stability (error checking) has to be improved before this can even hit beta status.
|
|
|
|
cbuchner1 (OP)
|
|
April 22, 2013, 08:50:11 PM Last edit: April 22, 2013, 09:02:21 PM by cbuchner1 |
|
Isn't ulong the same as uint on 32 bit builds? On 64bit linux it breaks things, because ulong is 64bit and uint is 32bit.
Any suggestion for a portable 32 bit type among 32 bit and 64 bit builds? I thought int changed size depending on architecture, long is always 32 bits, and long long is always 64 bits.
EDIT: I've been reading up on the differences between Microsoft's LLP64 model vs. Unix/Linux LP64 model. I will have to change a few things in the code, then.
Christian
|
|
|
|
cbuchner1 (OP)
|
|
April 22, 2013, 08:51:29 PM |
|
I am using 4/22 version. Same cmd as before, however total has dropped from 520 to 410kH. This got 520kH on the 2x titan in 4/17 release.
Wah! I need to empty a bottle of wine now.
|
|
|
|
K1773R
Legendary
Offline
Activity: 1792
Merit: 1008
/dev/null
|
|
April 22, 2013, 08:57:42 PM Last edit: April 22, 2013, 09:58:59 PM by K1773R |
|
wait, 1 titan is 7kh/s slower than my 580? that's sad
|
|
|
|
Misiolap
Newbie
Offline
Activity: 14
Merit: 0
|
|
April 22, 2013, 09:18:41 PM |
|
Any suggestion for a portable 32 bit type among 32 bit and 64 bit builds? I thought int changed size depending on architecture, long is always 32 bits, and long long is always 64 bits.
EDIT: I've been reading up on the differences between Microsoft's LLP64 model vs. Unix/Linux LP64 model. I will have to change a few things in the code, then.
Christian
For general purpose vars use uint*_t from <stdint.h>. I'm not sure what should be used for CUDA vector types for portability.
On 64bit linux: sizeof(ulong2): 16, sizeof(uint2): 8
On 32bit linux it's probably 8 for both.
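To make the size difference concrete, here is a minimal host-side sketch in plain C, not taken from the cudaminer sources. The names bits_of_uint32, bits_of_ulong and vec2_u32 are made up for illustration; vec2_u32 only stands in for a two-word vector such as CUDA's uint2.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: width of the fixed-width type from <stdint.h>.
   uint32_t is exactly 32 bits wherever it exists. */
size_t bits_of_uint32(void)
{
    return sizeof(uint32_t) * CHAR_BIT;
}

/* Hypothetical helper: width of unsigned long on the current build.
   Typically 32 on 32-bit builds and on 64-bit Windows (LLP64),
   and 64 on 64-bit Linux (LP64). */
size_t bits_of_ulong(void)
{
    return sizeof(unsigned long) * CHAR_BIT;
}

/* Stand-in for a two-element vector of 32-bit words (an assumption,
   not the real CUDA uint2 definition). Built from uint32_t, its
   size does not depend on the data model. */
typedef struct {
    uint32_t x;
    uint32_t y;
} vec2_u32;
```

Printing bits_of_ulong() on a 64bit linux build and on a 64bit win7 build should show the LP64 vs. LLP64 difference discussed above, while sizeof(vec2_u32) stays at 8 on both.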
|
|
|
|
peacefulmind
|
|
April 22, 2013, 09:18:48 PM |
|
I am using 4/22 version. Same cmd as before, however total has dropped from 520 to 410kH. This got 520kH on the 2x titan in 4/17 release.
Wah! I need to empty a bottle of wine now.
Christian,
Perhaps it is something in my setup? This is not made for a mining rig, I use it for day to day and gaming.
990x, 12GB RAM, 2x Titan, 5760x1200, SLI, 64bit win7
I have noticed when I set interactive to 1,1 it freezes, also when I try to let it auto-tune it freezes. I have a ton of games and applications on this machine - so it may be my system. Perhaps the new 4/22 build needs different settings than the ones I used on the 4/17 build? I will try some more.
My dedicated mining machines are lean and mean and using AMD RADEON on Linux/a few win7 - so it is hard to compare. You are trailblazing new ground!
|
|
|
|
cbuchner1 (OP)
|
|
April 22, 2013, 09:28:12 PM |
|
Okay, this would be the 2nd attempt for the day.
uint32_t becomes typedef'd as unsigned int
ulong2 becomes uint2
ulong4 becomes uint4
and Titan kernel now does uint2 based memory transactions in a shared memory buffer of [16+2] width, which should reduce warp serialization.
now where's my bottle of wine?
|
|
|
|
Misiolap
Newbie
Offline
Activity: 14
Merit: 0
|
|
April 22, 2013, 10:04:43 PM |
|
Additionally, shared buffers for 64bit builds must be 64bit aligned. If it's worth saving some memory on 32bit builds, something like this can be done:
#if __x86_64__
#define _64BIT_ALIGN 1
#else
#define _64BIT_ALIGN 0
#endif
And for each buffer:
__shared__ uint32_t X[WARPS_PER_BLOCK][WU_PER_WARP][32+1+_64BIT_ALIGN];
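The arithmetic behind the extra padding word can be checked on the host. The following is a rough sketch in plain C with made-up helper names (row_stride_bytes, rows_are_8_byte_aligned). It assumes the misaligned address comes from 64-bit accesses at the start of a row, which is my reading of the posts and not something confirmed from the kernel source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes in one innermost row of
   uint32_t X[...][...][32 + 1 + align_flag].
   align_flag plays the role of _64BIT_ALIGN (0 or 1). */
size_t row_stride_bytes(int align_flag)
{
    return (32 + 1 + (size_t)align_flag) * sizeof(uint32_t);
}

/* Hypothetical helper: returns 1 if every row starts at a multiple
   of 8 bytes from the start of the array, 0 otherwise. */
int rows_are_8_byte_aligned(int align_flag)
{
    return row_stride_bytes(align_flag) % 8 == 0;
}
```

With align_flag 0 a row is 33 words (132 bytes), so every second row starts on a boundary that is only 4-byte aligned. With align_flag 1 a row is 34 words (136 bytes) and every row starts on an 8-byte boundary, at the cost of one unused word per row.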
|
|
|
|
cbuchner1 (OP)
|
|
April 22, 2013, 10:22:09 PM |
|
Additionally shared buffers for 64bit builds must be 64bit aligned.
Does this also apply when targeting compute_10, sm_10 (which is done in salsa_kernel.cu)?
Christian
|
|
|
|
Misiolap
Newbie
Offline
Activity: 14
Merit: 0
|
|
April 22, 2013, 10:48:47 PM |
|
Yes, without this on 64bit the kernel dies due to Warp Misaligned Address on compute_10.
|
|
|
|
cbuchner1 (OP)
|
|
April 22, 2013, 10:50:58 PM |
|
Yes, without this on 64bit the kernel dies due to Warp Misaligned Address on compute_10.
ok then I'll put that in and make this a final upload for today. Thank you for the explanations and for developing the patch.
|
|
|
|
Misiolap
Newbie
Offline
Activity: 14
Merit: 0
|
|
April 22, 2013, 11:18:01 PM |
|
Great, now it works out-of-box for me (salsa_kernel), thanks.
|
|
|
|
dbabo
Newbie
Offline
Activity: 41
Merit: 0
|
|
April 22, 2013, 11:29:09 PM |
|
Yes, without this on 64bit the kernel dies due to Warp Misaligned Address on compute_10.
ok then I'll put that in and make this a final upload for today. Thank you for the explanations and for developing the patch.
well deserved bottle of wine shall be corked:
[2013-04-22 19:29:03] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-22 19:29:14] GPU #0: GeForce GT 430 with compute capability 2.1
[2013-04-22 19:29:14] GPU #0: interactive: 1, tex-cache: 0 , single-alloc: 0
[2013-04-22 19:29:14] GPU #0: Performing auto-tuning (Patience...)
[2013-04-22 19:29:21] GPU #0: 24.34 khash/s with configuration 4x6
[2013-04-22 19:29:21] GPU #0: using launch configuration 4x6
[2013-04-22 19:29:21] GPU #0: GeForce GT 430, 4608 hashes, 0.26 khash/s
[2013-04-22 19:29:21] GPU #0: GeForce GT 430, 1536 hashes, 15.62 khash/s
[2013-04-22 19:29:25] GPU #0: GeForce GT 430, 78336 hashes, 22.29 khash/s
[2013-04-22 19:29:30] GPU #0: GeForce GT 430, 112128 hashes, 22.11 khash/s
[2013-04-22 19:29:35] GPU #0: GeForce GT 430, 110592 hashes, 21.79 khash/s
[2013-04-22 19:29:40] GPU #0: GeForce GT 430, 109056 hashes, 21.37 khash/s
[2013-04-22 19:29:45] GPU #0: GeForce GT 430, 107520 hashes, 22.23 khash/s
and no i686 deps
first coin goes to you
thank you!
|
|
|
|
nst6563
|
|
April 22, 2013, 11:48:33 PM |
|
Yes, without this on 64bit the kernel dies due to Warp Misaligned Address on compute_10.
ok then I'll put that in and make this a final upload for today. Thank you for the explanations and for developing the patch.
well deserved bottle of wine shall be corked:
[2013-04-22 19:29:03] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-22 19:29:14] GPU #0: GeForce GT 430 with compute capability 2.1
[2013-04-22 19:29:14] GPU #0: interactive: 1, tex-cache: 0 , single-alloc: 0
[2013-04-22 19:29:14] GPU #0: Performing auto-tuning (Patience...)
[2013-04-22 19:29:21] GPU #0: 24.34 khash/s with configuration 4x6
[2013-04-22 19:29:21] GPU #0: using launch configuration 4x6
[2013-04-22 19:29:21] GPU #0: GeForce GT 430, 4608 hashes, 0.26 khash/s
[2013-04-22 19:29:21] GPU #0: GeForce GT 430, 1536 hashes, 15.62 khash/s
[2013-04-22 19:29:25] GPU #0: GeForce GT 430, 78336 hashes, 22.29 khash/s
[2013-04-22 19:29:30] GPU #0: GeForce GT 430, 112128 hashes, 22.11 khash/s
[2013-04-22 19:29:35] GPU #0: GeForce GT 430, 110592 hashes, 21.79 khash/s
[2013-04-22 19:29:40] GPU #0: GeForce GT 430, 109056 hashes, 21.37 khash/s
[2013-04-22 19:29:45] GPU #0: GeForce GT 430, 107520 hashes, 22.23 khash/s
and no i686 deps
first coin goes to you
thank you!
If you overclock that gt430 a bit you can get in the 30kh/s range. I currently get 36kh/s on my gt430 with configuration 20x8.
|
|
|
|
dbabo
Newbie
Offline
Activity: 41
Merit: 0
|
|
April 22, 2013, 11:53:23 PM |
|
... If you overclock that gt430 a bit you can get in the 30kh/s range. I currently get 36kh/s on my gt430 with configuration 20x8.
and how do i do that?
|
|
|
|
nst6563
|
|
April 23, 2013, 12:05:07 AM |
|
... If you overclock that gt430 a bit you can get in the 30kh/s range. I currently get 36kh/s on my gt430 with configuration 20x8.
and how do i do that?
Google a tool called NvidiaInspector (I think it's from TechPowerup). It will let you adjust the fan speeds, voltage, and clock speeds of the core/mem/shader of almost all nvidia cards. I use it to set the clocks on my gt430 card to 882MHz core, 810MHz mem, 1760MHz shader, .990v. That combo yields between 36kh/s and 38kh/s. If I go any higher than that, the driver crashes. Your mileage may vary on the clock speeds you can attain, though. I have an EVGA GT430, so I'm not sure how it compares to other flavors.
|
|
|
|
dbabo
Newbie
Offline
Activity: 41
Merit: 0
|
|
April 23, 2013, 12:38:30 AM |
|
Google a tool called NvidiaInspector (I think it's from TechPowerup). It will let you adjust the fan speeds, voltage, and clock speeds of the core/mem/shader of most all nvidia cards. ..
that seems to be win only. i don't have win pcs. but there seems to be an nvclock utility that should have some options. I'll look into it. Thank you for pointing out the opportunity to OC.
|
|
|
|
|