Bitcoin Forum
Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX]
420
Hero Member
April 21, 2013, 12:27:34 AM
 #421

What speed should I get with a GTX 460M or a GTX 680?

nst6563
Sr. Member
April 21, 2013, 12:50:19 AM
 #422

Quote from: 420 on April 21, 2013, 12:27:34 AM
What speed should I get with a GTX 460M or a GTX 680

Check here, that should give you an idea:
https://docs.google.com/spreadsheet/ccc?key=0AjMqJzI7_dCvdG9fZFN1Vjd0WkFOZmtlejltd0JXbmc#gid=1

Misiolap
Newbie
April 21, 2013, 12:51:51 AM
 #423

@K1773R: check out the patch for 64-bit systems, a few posts above yours.

K1773R
Legendary
April 21, 2013, 02:28:21 AM
 #424

ERR: never mind, I had the old CUDA 4.0...
It works with 5.0 :)

hammz
Member
April 21, 2013, 05:35:47 AM
 #425

Is there a way I can periodically terminate the cudaminer program without it crashing my video card driver?

I want to run a script that restarts cudaminer and my stratum proxy connection on a timer, but task-killing the cudaminer process in Windows also crashes the video card driver, so I lose my overclock settings.
Lacan82
Sr. Member
April 21, 2013, 05:56:38 AM
 #426

Quote from: hammz on April 21, 2013, 05:35:47 AM
Is there a way I can periodically terminate the cudaminer program without it crashing my video card driver?
I want to run a script to restart cudaminer, and my stratum proxy connection, on a timer but task killing the cudaminer process in windows also crashes the video card, so I lose my overclock settings.

Do you have the latest drivers? And which version are you running? This isn't an issue in the newer one.
borgopio
Newbie
April 21, 2013, 06:53:30 AM
 #427

Cudaminer 2013-04-17 compiling errors

Cudaminer 2013-04-14 compiled fine on my system. I'm running it now and getting about 65 khash/s.
My system is:
AMD Phenom II
Linux 12.04
GeForce GTS 450

When I attempt to compile 2013-04-17, I get several errors like this:
"./salsa_kernel.cu(164): Warning: Cannot tell what pointer points to, assuming global memory space"
and this one:
"titan_kernel.cu(377): error: identifier "usleep" is undefined"

I assume they are coding errors and thought this would be helpful info.

Thanks for your work Christian. As soon as I get some coins I can send you something.
hammz
Member
April 21, 2013, 11:07:44 AM
 #428

Quote from: Lacan82 on April 21, 2013, 05:56:38 AM
Do you have the latest drivers? And which version are you running? This isn't an issue in the newer one.

04-17, the version from a few days ago.

I'm not talking about using Ctrl-C from within the program; I'm task-killing it via an automated script.
dbabo
Jr. Member
April 21, 2013, 02:04:56 PM
 #429

Quote from: borgopio on April 21, 2013, 06:53:30 AM
Cudaminer 2013-04-17 compiling errors
Cudaminer 2013-04-14 compiled fine on my system. I'm running it now and getting about 65 khash/s.
My system is:
AMD Phenom II
Linux 12.04
GeForce GTS 450
When I attempt to compile 2013-04-17, I get several errors like this:
"./salsa_kernel.cu(164): Warning: Cannot tell what pointer points to, assuming global memory space"
and this one:
"titan_kernel.cu(377): error: identifier "usleep" is undefined"
I assume they are coding errors and thought this would be helpful info.
Thanks for your work Christian. As soon as I get some coins I can send you something.

Some dependencies are missing; check the original post. It looks like you are on Ubuntu, right?
borgopio
Newbie
April 21, 2013, 03:01:41 PM
 #430

No dependencies are listed for Linux in the original post. I am running CUDA 5.0 and my graphics driver meets the minimum release requirements. 2013-04-14 compiles fine.
dbabo
Jr. Member
April 21, 2013, 03:10:46 PM
 #431

Quote from: borgopio on April 21, 2013, 03:01:41 PM
No dependencies are listed for Linux in the original post. I am running CUDA 5.0 and my graphics driver meets the minimum release requirements. 2013-04-14 compiles fine.

g++-multilib and ia32-libs, possibly also libcurl4-dev.

Are you on 32-bit?
borgopio
Newbie
April 21, 2013, 03:33:26 PM
 #432

Yes, 32-bit.
I installed g++-multilib and ia32-libs and recompiled. Same errors. There is more than one libcurl4 package to choose from.
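
A note on the `identifier "usleep" is undefined` error reported for titan_kernel.cu: on POSIX systems `usleep` is declared in `<unistd.h>`, so one possible cause (an assumption, not confirmed anywhere in this thread) is that the header is not included in that file on Linux builds. A minimal host-side sketch of what a working declaration looks like; `sleep_and_measure_us` is a made-up helper for illustration and is not part of cudaminer:

```cpp
#include <cassert>
#include <chrono>
#include <unistd.h>  // POSIX header that declares usleep()

// Hypothetical helper (not from cudaminer): sleeps for `usec` microseconds
// and returns how long the call actually took, in microseconds.
long long sleep_and_measure_us(useconds_t usec) {
    // POSIX only guarantees usleep() for arguments below one second.
    assert(usec < 1000000);
    const auto start = std::chrono::steady_clock::now();
    usleep(usec);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling `sleep_and_measure_us(2000)` should take roughly two milliseconds or a little more. Windows has no `usleep`; `Sleep()` from `<windows.h>` takes milliseconds instead, which may be why the call only shows up as a problem on Linux.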
KnowBuddy
Member
April 21, 2013, 03:59:34 PM
 #433

Quote from: hammz on April 21, 2013, 11:07:44 AM
04-17, the version from a few days ago.
I'm not talking about using Ctrl-C from within the program; I'm task-killing it via an automated script.

Isn't it possible to make the cudaminer.exe window active and send Ctrl-C to it from an automated script?
bitg
Newbie
April 21, 2013, 07:01:42 PM
 #434

I am seeing a small increase with the 04-17 version too, about 3-5%.

I did notice that Ctrl-C is ignored when the connection to the pool isn't successful, forcing me to close the cmd window directly.

Anyway, great job with the CUDA implementation.
logdog16
Newbie
April 21, 2013, 07:51:54 PM
 #435

On my 670 with 1D tex cache I am getting about 175 kHash/s.

Great work!
cbuchner1
Hero Member
April 21, 2013, 09:53:26 PM
 #436

Quote from: logdog16 on April 21, 2013, 07:51:54 PM
On my 670 with 1D tex cache I am getting about 175 kHash/s.

A 570 though would be significantly faster (but also run significantly hotter). I am still trying to understand why the Kepler architecture has such a performance disadvantage with my current code.

I did try some inline PTX assembly (looks horrid, check it out):

Code:
__device__ void ROTL7(uint32_t &A0, const uint32_t &A1, const uint32_t &A2,
                      uint32_t &B0, const uint32_t &B1, const uint32_t &B2,
                      uint32_t &C0, const uint32_t &C1, const uint32_t &C2,
                      uint32_t &D0, const uint32_t &D1, const uint32_t &D2)
{
    asm("{\n\t"
    "  .reg .u32 tA1, tA2;\n\t"
    "  .reg .u32 tB1, tB2;\n\t"
    "  .reg .u32 tC1, tC2;\n\t"
    "  .reg .u32 tD1, tD2;\n\t"
    "  add.u32 tA1, %4, %5;\n\t"
    "  add.u32 tB1, %6, %7;\n\t"
    "  add.u32 tC1, %8, %9;\n\t"
    "  add.u32 tD1, %10, %11;\n\t"
    "  shl.b32 tA2, tA1, 7;\n\t"
    "  shl.b32 tB2, tB1, 7;\n\t"
    "  shl.b32 tC2, tC1, 7;\n\t"
    "  shl.b32 tD2, tD1, 7;\n\t"
    "  shr.b32 tA1, tA1, 25;\n\t"
    "  shr.b32 tB1, tB1, 25;\n\t"
    "  shr.b32 tC1, tC1, 25;\n\t"
    "  shr.b32 tD1, tD1, 25;\n\t"
    "  or.b32 tA1, tA1, tA2;\n\t"
    "  or.b32 tB1, tB1, tB2;\n\t"
    "  or.b32 tC1, tC1, tC2;\n\t"
    "  or.b32 tD1, tD1, tD2;\n\t"
    "  xor.b32 %0, %0, tA1;\n\t"
    "  xor.b32 %1, %1, tB1;\n\t"
    "  xor.b32 %2, %2, tC1;\n\t"
    "  xor.b32 %3, %3, tD1;\n\t"
    "}"
    : "+r"(A0), "+r"(B0), "+r"(C0), "+r"(D0) : "r" (A1), "r" (A2), "r" (B1), "r" (B2), "r" (C1), "r" (C2), "r" (D1), "r" (D2));
}

I also tried adding instruction-level parallelism by formulating the CUDA code like this:

Code:
#define ROTL7(A0, A1, A2, B0, B1, B2, C0, C1, C2, D0, D1, D2)  \
{\
    volatile uint32_t tA1 = A1 + A2, tB1 = B1 + B2, tC1 = C1 + C2, tD1 = D1 + D2;\
    volatile uint32_t tA2 = tA1<< 7, tB2 = tB1<< 7, tC2 = tC1<< 7, tD2 = tD1<< 7;\
                      tA1 = tA1>>25; tB1 = tB1>>25; tC1 = tC1>>25; tD1 = tD1>>25;\
                      tA2|= tA1    ; tB2|= tB1    ; tC2|= tC1    ; tD2|= tD1    ;\
                      A0 ^= tA2    ; B0 ^= tB2    ; C0 ^= tC2    ; D0 ^= tD2    ;\
}

But I couldn't get performance above what is already achieved. So in case you're wondering why there haven't been any updates: my experiments in getting more speed haven't been fruitful yet.
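
For anyone reading along: both snippets compute the same thing per lane, `A0 ^= rotl(A1 + A2, 7)`, which is the rotate-by-7 step of the Salsa20 quarter-round, done for four independent lanes (A, B, C, D) at once. Below is a plain host-side C++ sketch that checks the shift/or formulation against a straightforward rotate. The names `rotl32`, `step_ref`, `step_x4` and `rotl7_forms_agree` are made up for this illustration, it drops the `volatile` qualifiers, and it says nothing about GPU performance:

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit word left by n bits, valid for 0 < n < 32.
static inline uint32_t rotl32(uint32_t x, unsigned n) {
    assert(n > 0 && n < 32);
    return (x << n) | (x >> (32u - n));
}

// Scalar reference for one lane: a0 ^ rotl(a1 + a2, 7).
static inline uint32_t step_ref(uint32_t a0, uint32_t a1, uint32_t a2) {
    return a0 ^ rotl32(a1 + a2, 7);
}

// The same step in the order the macro uses (add, shl, shr, or, xor),
// with each operation applied to all four independent lanes before the next.
static inline void step_x4(uint32_t x0[4], const uint32_t x1[4], const uint32_t x2[4]) {
    uint32_t t1[4], t2[4];
    for (int i = 0; i < 4; ++i) t1[i] = x1[i] + x2[i];
    for (int i = 0; i < 4; ++i) t2[i] = t1[i] << 7;
    for (int i = 0; i < 4; ++i) t1[i] = t1[i] >> 25;
    for (int i = 0; i < 4; ++i) t2[i] |= t1[i];
    for (int i = 0; i < 4; ++i) x0[i] ^= t2[i];
}

// Returns true if both forms agree on `rounds` batches of pseudo-random inputs.
bool rotl7_forms_agree(int rounds) {
    uint32_t s = 0x12345678u;  // xorshift32 state, any nonzero seed works
    auto next = [&s]() { s ^= s << 13; s ^= s >> 17; s ^= s << 5; return s; };
    for (int r = 0; r < rounds; ++r) {
        uint32_t x0[4], x1[4], x2[4], want[4];
        for (int i = 0; i < 4; ++i) {
            x0[i] = next();
            x1[i] = next();
            x2[i] = next();
            want[i] = step_ref(x0[i], x1[i], x2[i]);
        }
        step_x4(x0, x1, x2);
        for (int i = 0; i < 4; ++i) {
            if (x0[i] != want[i]) return false;
        }
    }
    return true;
}
```

Grouping the four lanes operation by operation is what exposes the instruction-level parallelism: none of the four adds depends on another, and the same holds for each later row.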

Bakemono
Member
April 21, 2013, 09:58:02 PM
 #437

13 kH/s with a GeForce GT 330M :P :P

LAPTOP POWAh 8)

InqBit
Newbie
April 21, 2013, 11:05:39 PM
 #438

Quote from: cbuchner1 on April 21, 2013, 09:53:26 PM
I am still trying to understand why the Kepler architecture has such a performance disadvantage with my current code.

I assume you've seen this Kepler thread?

https://bitcointalk.org/index.php?topic=163750.0;topicseen

jasonharty24
Newbie
April 22, 2013, 04:12:35 AM
 #439

This is my GTX 670 (GIGABYTE GV-N670OC-2GD) doing over 200 kHash/s:
http://s23.postimg.org/m7lni155j/cuda_miner.jpg

Code:
cudaminer.exe --url http://notroll.in:6332/ --userpass jasonharty24.4:12345 -i 0 -m 1 -C 2 -l 70x4
termhn
Member
April 22, 2013, 05:31:37 AM
 #440

Quote from: jasonharty24 on April 22, 2013, 04:12:35 AM
This is my GTX 670 (GIGABYTE GV-N670OC-2GD) doing over 200 kHash/s
Code:
cudaminer.exe --url http://notroll.in:6332/ --userpass jasonharty24.4:12345 -i 0 -m 1 -C 2 -l 70x4

JESUS CHRIST, that is a great OC!
