|
|
April 21, 2013, 12:27:34 AM |
|
What speed should I get with a GTX 460M or a GTX 680?
|
|
|
|
nst6563
|
|
April 21, 2013, 12:50:19 AM |
|
|
|
|
|
Misiolap
Newbie
Offline
Activity: 14
Merit: 0
|
|
April 21, 2013, 12:51:51 AM |
|
@K1773R: check out the patch for 64-bit systems, a few posts above yours.
|
|
|
|
K1773R
Legendary
Offline
Activity: 1792
Merit: 1008
/dev/null
|
|
April 21, 2013, 02:28:21 AM Last edit: April 21, 2013, 02:49:33 AM by K1773R |
|
ERR: never mind, I had the old CUDA 4.0... it works with CUDA 5.0.
|
|
|
|
hammz
Member
Offline
Activity: 143
Merit: 10
|
|
April 21, 2013, 05:35:47 AM |
|
Is there a way I can periodically terminate the cudaminer program without it crashing my video card driver?
I want to run a script that restarts cudaminer and my stratum proxy connection on a timer, but task-killing the cudaminer process in Windows also crashes the video card driver, so I lose my overclock settings.
|
|
|
|
Lacan82
|
|
April 21, 2013, 05:56:38 AM |
|
@hammz: Do you have the latest drivers? And which version are you running? This isn't an issue in the newer one.
|
|
|
|
borgopio
Newbie
Offline
Activity: 8
Merit: 0
|
|
April 21, 2013, 06:53:30 AM |
|
Cudaminer 2013-04-17 compiling errors
Cudaminer 2013-04-14 compiled fine on my system. I'm running it now and getting about 65 khash/s. My system is: AMD Phenom II, Linux 12.04, GeForce GTS 450.
When I attempt to compile 2013-04-17, I get several warnings like this:
./salsa_kernel.cu(164): Warning: Cannot tell what pointer points to, assuming global memory space
and this error:
titan_kernel.cu(377): error: identifier "usleep" is undefined
I assume they are coding errors and thought this would be helpful info.
Thanks for your work, Christian. As soon as I get some coins I can send you something.
|
|
|
|
hammz
Member
Offline
Activity: 143
Merit: 10
|
|
April 21, 2013, 11:07:44 AM |
|
@Lacan82: I'm on 04-17, the version from a few days ago. I'm not talking about using Ctrl-C from within the program; I'm task-killing it via an automated script.
|
|
|
|
dbabo
Newbie
Offline
Activity: 41
Merit: 0
|
|
April 21, 2013, 02:04:56 PM |
|
@borgopio: Some dependencies are missing; check the original post. Looks like you are on Ubuntu, right?
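Separately from any missing packages, an "identifier is undefined" error for usleep usually means that no declaration was visible at that point in the file. On POSIX systems usleep is declared in <unistd.h>. The sketch below only shows that the call compiles and behaves once the header is included. It assumes the cause really is a missing include, which the thread does not confirm, and the helper name sleep_and_measure_us is made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <unistd.h>   // POSIX header that declares usleep()

// Hypothetical helper (not from cudaminer): sleep for about `micros`
// microseconds and return the elapsed time actually measured.
inline long long sleep_and_measure_us(unsigned micros) {
    const auto start = std::chrono::steady_clock::now();
    usleep(micros);   // compiles only because <unistd.h> is included
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

If this builds on the affected machine, adding the same include near the top of titan_kernel.cu would be the first thing to try.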
|
|
|
|
borgopio
Newbie
Offline
Activity: 8
Merit: 0
|
|
April 21, 2013, 03:01:41 PM |
|
No dependencies are listed for Linux in the original post. I am running CUDA 5.0 and my graphics driver meets the minimum release requirements. 2013-04-14 compiles fine.
|
|
|
|
dbabo
Newbie
Offline
Activity: 41
Merit: 0
|
|
April 21, 2013, 03:10:46 PM |
|
@borgopio: g++-multilib and ia32-libs, possibly also libcurl4-dev. Are you on 32-bit?
|
|
|
|
borgopio
Newbie
Offline
Activity: 8
Merit: 0
|
|
April 21, 2013, 03:33:26 PM |
|
Yes, 32-bit. I installed g++-multilib and ia32-libs and recompiled. Same errors. There is more than one libcurl4 package to choose from.
|
|
|
|
KnowBuddy
Member
Offline
Activity: 69
Merit: 10
|
|
April 21, 2013, 03:59:34 PM |
|
@hammz: Isn't it possible to make the cudaminer.exe window active and send Ctrl-C using an automated script?
|
|
|
|
bitg
Newbie
Offline
Activity: 19
Merit: 0
|
|
April 21, 2013, 07:01:42 PM |
|
I am seeing a small increase with the 04-17 version too, about 3-5%.
I did notice that Ctrl-C is ignored when the connection to the pool isn't successful, forcing me to close the cmd window directly.
Anyway, great job with the CUDA implementation.
|
|
|
|
logdog16
Newbie
Offline
Activity: 19
Merit: 0
|
|
April 21, 2013, 07:51:54 PM |
|
On my 670 with 1D tex cache I am getting about 175 kHash/s.
Great work!
|
|
|
|
cbuchner1 (OP)
|
|
April 21, 2013, 09:53:26 PM Last edit: April 21, 2013, 10:04:09 PM by cbuchner1 |
|
@logdog16: A 570, though, would be significantly faster than your 670 (but also run significantly hotter). I am still trying to understand why the Kepler architecture has such a performance disadvantage with my current code. I did try some inline PTX assembly (looks horrid, check it out):

    __device__ void ROTL7(uint32_t &A0, const uint32_t &A1, const uint32_t &A2,
                          uint32_t &B0, const uint32_t &B1, const uint32_t &B2,
                          uint32_t &C0, const uint32_t &C1, const uint32_t &C2,
                          uint32_t &D0, const uint32_t &D1, const uint32_t &D2)
    {
        asm("{\n\t"
            " .reg .u32 tA1, tA2;\n\t"
            " .reg .u32 tB1, tB2;\n\t"
            " .reg .u32 tC1, tC2;\n\t"
            " .reg .u32 tD1, tD2;\n\t"
            " add.u32 tA1, %4, %5;\n\t"
            " add.u32 tB1, %6, %7;\n\t"
            " add.u32 tC1, %8, %9;\n\t"
            " add.u32 tD1, %10, %11;\n\t"
            " shl.b32 tA2, tA1, 7;\n\t"
            " shl.b32 tB2, tB1, 7;\n\t"
            " shl.b32 tC2, tC1, 7;\n\t"
            " shl.b32 tD2, tD1, 7;\n\t"
            " shr.b32 tA1, tA1, 25;\n\t"
            " shr.b32 tB1, tB1, 25;\n\t"
            " shr.b32 tC1, tC1, 25;\n\t"
            " shr.b32 tD1, tD1, 25;\n\t"
            " or.b32 tA1, tA1, tA2;\n\t"
            " or.b32 tB1, tB1, tB2;\n\t"
            " or.b32 tC1, tC1, tC2;\n\t"
            " or.b32 tD1, tD1, tD2;\n\t"
            " xor.b32 %0, %0, tA1;\n\t"
            " xor.b32 %1, %1, tB1;\n\t"
            " xor.b32 %2, %2, tC1;\n\t"
            " xor.b32 %3, %3, tD1;\n\t"
            "}"
            : "+r"(A0), "+r"(B0), "+r"(C0), "+r"(D0)
            : "r" (A1), "r" (A2), "r" (B1), "r" (B2),
              "r" (C1), "r" (C2), "r" (D1), "r" (D2));
    }

I also added instruction-level parallelism by formulating the CUDA code like this:

    #define ROTL7(A0, A1, A2, B0, B1, B2, C0, C1, C2, D0, D1, D2) \
    {\
        volatile uint32_t tA1 = A1 + A2, tB1 = B1 + B2, tC1 = C1 + C2, tD1 = D1 + D2;\
        volatile uint32_t tA2 = tA1<< 7, tB2 = tB1<< 7, tC2 = tC1<< 7, tD2 = tD1<< 7;\
        tA1 = tA1>>25; tB1 = tB1>>25; tC1 = tC1>>25; tD1 = tD1>>25;\
        tA2|= tA1 ; tB2|= tB1 ; tC2|= tC1 ; tD2|= tD1 ;\
        A0 ^= tA2 ; B0 ^= tB2 ; C0 ^= tC2 ; D0 ^= tD2 ;\
    }

But I couldn't actually get performance above what is already achieved. So in case you're wondering why there haven't been any updates: my experiments in getting more speed haven't been fruitful yet.
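For readers following along, both ROTL7 variants above compute the same thing per lane: add two words, rotate the sum left by 7 bits, and XOR the result into a third word, which is the add-rotate-xor step used in Salsa20. The host-side C++ sketch below restates that with four lanes interleaved in the same order as the macro and checks it against a one-lane reference. It is plain C++ rather than device code, and the names rotl32, arx7, arx7_x4 and lanes_match_reference are illustrative, not taken from cudaminer.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit word left by n bits (valid for 0 < n < 32).
inline uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << n) | (x >> (32u - n));
}

// One-lane reference: a0 ^= rotl(a1 + a2, 7).
inline void arx7(uint32_t &a0, uint32_t a1, uint32_t a2) {
    a0 ^= rotl32(a1 + a2, 7);
}

// Four independent lanes, each stage issued for all lanes before the next
// stage starts, mirroring the statement order of the macro in the post.
inline void arx7_x4(uint32_t (&x0)[4], const uint32_t (&x1)[4], const uint32_t (&x2)[4]) {
    uint32_t t1[4], t2[4];
    for (int i = 0; i < 4; ++i) t1[i] = x1[i] + x2[i];   // add (wraps mod 2^32)
    for (int i = 0; i < 4; ++i) t2[i] = t1[i] << 7;      // high part of the rotate
    for (int i = 0; i < 4; ++i) t1[i] = t1[i] >> 25;     // low part of the rotate
    for (int i = 0; i < 4; ++i) t2[i] |= t1[i];          // combine into rotl(sum, 7)
    for (int i = 0; i < 4; ++i) x0[i] ^= t2[i];          // xor into the target word
}

// Returns true when the interleaved version agrees with the reference.
inline bool lanes_match_reference() {
    const uint32_t x1[4] = {0xFFFFFFFFu, 0x12345678u, 0u, 0x80000000u};
    const uint32_t x2[4] = {1u, 0x9ABCDEF0u, 0u, 0x80000000u};
    uint32_t got[4]  = {0xDEADBEEFu, 0u, 0xFFFFFFFFu, 1u};
    uint32_t want[4] = {0xDEADBEEFu, 0u, 0xFFFFFFFFu, 1u};
    arx7_x4(got, x1, x2);
    for (int i = 0; i < 4; ++i) arx7(want[i], x1[i], x2[i]);
    for (int i = 0; i < 4; ++i) {
        if (got[i] != want[i]) return false;
    }
    return true;
}
```

A check like this only confirms that a rewritten kernel step is still correct; it says nothing about why Kepler is slower.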
|
|
|
|
Bakemono
Member
Offline
Activity: 85
Merit: 10
|
|
April 21, 2013, 09:58:02 PM |
|
13kh/s with GeForce GT 330M LAPTOP POWAh
|
|
|
|
InqBit
Newbie
Offline
Activity: 27
Merit: 0
|
|
April 21, 2013, 11:05:39 PM |
|
@cbuchner1: I assume you've seen this Kepler thread? https://bitcointalk.org/index.php?topic=163750.0;topicseen
|
|
|
|
|
termhn
|
|
April 22, 2013, 05:31:37 AM |
|
this is my 670gtx (GIGABYTE GV-N670OC-2GD) doing over 200khash/s
cudaminer.exe --url http://notroll.in:6332/ --userpass jasonharty24.4:12345 -i 0 -m 1 -C 2 -l 70x4
JESUS CHRIST that is a great OC!
|
|
|
|
|