Bitcoin Forum
Author Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX]  (Read 3426930 times)
djm34
Legendary
July 17, 2014, 09:00:57 AM
 #17901

Can anyone send me a link to the latest NVminer source code? (merged)

It seems the funnel shift is missing in some of the 11 algorithms. (With a ROL, compute 5.0+ devices will get a boost.)


From Github (ccminer 1.2):

cuda_x11_luffa512.cu:

#define TWEAK(a0,a1,a2,a3,j)\
    a0 = (a0<<(j))|(a0>>(32-j));\
    a1 = (a1<<(j))|(a1>>(32-j));\
    a2 = (a2<<(j))|(a2>>(32-j));\
    a3 = (a3<<(j))|(a3>>(32-j));
   
#define MIXWORD(a0,a4)\
    a4 ^= a0;\
    a0  = (a0<<2) | (a0>>(30));\
    a0 ^= a4;\
    a4  = (a4<<14) | (a4>>(18));\
    a4 ^= a0;\
    a0  = (a0<<10) | (a0>>(22));\
    a0 ^= a4;\
    a4  = (a4<<1) | (a4>>(31));   

cuda_x11_cubehash512.cu:
   
#define ROTATEUPWARDS7(a) (((a) << 7) | ((a) >> 25))
#define ROTATEUPWARDS11(a) (((a) << 11) | ((a) >> 21))

etc..

By rewriting these macros, the shift+shift+or CUDA instructions can be replaced with a single rol (3 times faster) (Maxwell / compute 5.0+).
Which CUDA version?
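For readers following along, the macro rewrite proposed in the quoted post could look roughly like the sketch below. It is not code from ccminer: `rotl32` and `HOST_DEVICE` are hypothetical names, and the `__CUDA_ARCH__ >= 350` threshold is an assumption taken from the compute 3.5 figure mentioned later in this thread. On anything older, or on the host, it falls back to the original shift/shift/or.

```cpp
#include <cstdint>

// Lets the same file build with nvcc and with a plain host C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper (not from ccminer): rotate a 32-bit word left by n.
HOST_DEVICE static inline uint32_t rotl32(uint32_t x, uint32_t n)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
    // A funnel shift over the pair (x:x) behaves as a rotate.
    return __funnelshift_l(x, x, n);
#else
    // Portable fallback; the masks avoid an undefined shift by 32 when n == 0.
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
#endif
}

// The Luffa macros from the post, with each shift/shift/or replaced by rotl32.
#define TWEAK(a0,a1,a2,a3,j) \
    a0 = rotl32(a0, j); \
    a1 = rotl32(a1, j); \
    a2 = rotl32(a2, j); \
    a3 = rotl32(a3, j)

#define MIXWORD(a0,a4) \
    a4 ^= a0; \
    a0  = rotl32(a0, 2); \
    a0 ^= a4; \
    a4  = rotl32(a4, 14); \
    a4 ^= a0; \
    a0  = rotl32(a0, 10); \
    a0 ^= a4; \
    a4  = rotl32(a4, 1)

// The CubeHash rotates can reuse the same helper.
#define ROTATEUPWARDS7(a)  rotl32((a), 7)
#define ROTATEUPWARDS11(a) rotl32((a), 11)
```

Keeping the rotate in one helper means the per-algorithm macros stay unchanged in shape, so older cards still get the same results through the fallback branch.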

djm34 facebook page
BTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze
Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw
sp_
Legendary
July 17, 2014, 09:21:35 AM
Last edit: July 17, 2014, 09:42:02 AM by sp_
 #17902

CUDA 6.0


Check:

Version features and specifications

The funnel shift is available for compute 3.5 and higher.

http://en.wikipedia.org/wiki/CUDA

How to inline CUDA Assembly:
http://docs.nvidia.com/cuda/inline-ptx-assembly/index.html#axzz37iRLSsMj

Instruction set.


docs.nvidia.com/cuda/parallel-thread-execution/index.html#logic-and-shift-instructions

8.7.5.1. Logic and Shift Instructions:
(8.7.5.6. Logic and Shift Instructions: shf)


So the following macro should be converted to something like:

a0 = (a0<<(j))|(a0>>(32-j)); --> shf.l.wrap.b32 a0, a0, a0, j;
...

I have some time this weekend to do the full implementation. Just give me the latest branch to work on...
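A minimal sketch of the inline PTX route described above, assuming the PTX operand order `shf.l.wrap.b32 d, a, b, c` (destination, low word, high word, shift amount). The wrapper name `rotl32_shf` and the `HOST_DEVICE` macro are made up for illustration, and the arch threshold of 350 follows the compute 3.5 statement in this post. The host branch only exists so the file also builds without nvcc.

```cpp
#include <cstdint>

// Lets the same file build with nvcc and with a plain host C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical wrapper: rotate left through the funnel-shift PTX instruction.
HOST_DEVICE static inline uint32_t rotl32_shf(uint32_t x, uint32_t n)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
    uint32_t d;
    // Passing x as both the low and the high word turns the funnel shift
    // into a rotate; .wrap reduces the shift amount modulo 32.
    asm("shf.l.wrap.b32 %0, %1, %2, %3;"
        : "=r"(d)
        : "r"(x), "r"(x), "r"(n));
    return d;
#else
    // Fallback for older devices and for host code.
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
#endif
}
```

The `__funnelshift_l(x, x, n)` intrinsic should give the same result without hand-written PTX, so the asm form is mainly useful when you want to see or control the exact instruction.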


Team Black Miner (ETHB3 ETH ETC VTC KAWPOW FIROPOW EVRPROGPOW MEOWPOW + dual mining + tripple mining.. https://github.com/sp-hash/TeamBlackMiner
miner256
Newbie
July 17, 2014, 09:50:59 AM
 #17903


Not that I can see either.
All closed, and binaries for Windows only :-(

Looking forward to trying something when (if?) the source does get released though.

This weekend I am going to try KopiemTu 1.4 and see what that is like. I have been reading good things about it, and I like the idea of trying some tweaking of the card on Linux.
https://litecointalk.org/index.php?topic=16800.0

No sourcecode available in nvminer.zip.
yellowduck2
Hero Member
July 17, 2014, 10:00:44 AM
 #17904

Can anyone send me a link to the latest NVminer source code? (merged)

It seems the funnel shift is missing in some of the 11 algorithms. (With a ROL, compute 5.0+ devices will get a boost.)


From Github (ccminer 1.2):

cuda_x11_luffa512.cu:

#define TWEAK(a0,a1,a2,a3,j)\
    a0 = (a0<<(j))|(a0>>(32-j));\
    a1 = (a1<<(j))|(a1>>(32-j));\
    a2 = (a2<<(j))|(a2>>(32-j));\
    a3 = (a3<<(j))|(a3>>(32-j));

#define MIXWORD(a0,a4)\
    a4 ^= a0;\
    a0  = (a0<<2) | (a0>>(30));\
    a0 ^= a4;\
    a4  = (a4<<14) | (a4>>(18));\
    a4 ^= a0;\
    a0  = (a0<<10) | (a0>>(22));\
    a0 ^= a4;\
    a4  = (a4<<1) | (a4>>(31));

cuda_x11_cubehash512.cu:

#define ROTATEUPWARDS7(a) (((a) << 7) | ((a) >> 25))
#define ROTATEUPWARDS11(a) (((a) << 11) | ((a) >> 21))

etc..

By rewriting these macros, the shift+shift+or CUDA instructions can be replaced with a single rol (3 times faster) (Maxwell / compute 5.0+).

Hope you succeed in optimizing the code.
djm34
Legendary
July 17, 2014, 10:25:23 AM
 #17905

Can anyone send me a link to the latest NVminer source code? (merged)

It seems the funnel shift is missing in some of the 11 algorithms. (With a ROL, compute 5.0+ devices will get a boost.)


From Github (ccminer 1.2):

cuda_x11_luffa512.cu:

#define TWEAK(a0,a1,a2,a3,j)\
    a0 = (a0<<(j))|(a0>>(32-j));\
    a1 = (a1<<(j))|(a1>>(32-j));\
    a2 = (a2<<(j))|(a2>>(32-j));\
    a3 = (a3<<(j))|(a3>>(32-j));

#define MIXWORD(a0,a4)\
    a4 ^= a0;\
    a0  = (a0<<2) | (a0>>(30));\
    a0 ^= a4;\
    a4  = (a4<<14) | (a4>>(18));\
    a4 ^= a0;\
    a0  = (a0<<10) | (a0>>(22));\
    a0 ^= a4;\
    a4  = (a4<<1) | (a4>>(31));

cuda_x11_cubehash512.cu:

#define ROTATEUPWARDS7(a) (((a) << 7) | ((a) >> 25))
#define ROTATEUPWARDS11(a) (((a) << 11) | ((a) >> 21))

etc..

By rewriting these macros, the shift+shift+or CUDA instructions can be replaced with a single rol (3 times faster) (Maxwell / compute 5.0+).

Hope you succeed in optimizing the code.
The problem is that it may permanently break compatibility with other versions.

cayars
Full Member
July 17, 2014, 10:40:43 AM
 #17906

I've updated to the beta drivers 340.88 and the hash rate increased slightly. Now getting 7.1 Mh or so, but still not the 7.8/7.9 Mh you are getting. I also tried the minep.it pool but, as I was expecting, it didn't change anything.

Any more ideas?

I have not followed the whole conversation, but I am wondering if one of you is running on risers and the other not?

That's an interesting theory. I read somewhere that it could indeed affect the hash rate: http://cryptomining-blog.com/1276-first-impressions-from-a-6-card-mining-rig-using-geforce-gtx-750-ti-gpus/

I, on my side, am using risers. Powered ones, for that matter. The difference is rather significant: ~800/900 Kh

I am also using ASUS, not Gigabyte. Perhaps that could also be the reason.

Good observation.  No risers of any kind for me.
sp_
Legendary
July 17, 2014, 10:46:49 AM
 #17907

Christian has already broken compatibility for any hardware below compute 3.0. In the Killer Groestl implementation he uses compute 3.0+ instructions, like perm:

static __device__ uint32_t cuda_swab32(uint32_t x)
{
    return __byte_perm(x, 0, 0x0123);
}

I will implement the changes by using a compile-time check like this:

#if __CUDA_ARCH__ >= 130
    return (uint32_t)__double2hiint(__longlong_as_double(x));
#else
    return (uint32_t)(x >> 32);
#endif

So if you compile for compute 5.0 you will get the Maxwell funnel shift instead.

ccminer currently runs at the same speed for compute 3.0, 3.5 and 5.0. This is about to change.
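To make the compile-time dispatch above concrete, here is a small self-contained sketch of the two fragments with their fallbacks filled in. The function names `swab32` and `hiword`, and the `HOST_DEVICE` macro, are placeholders rather than the names used in ccminer; the only CUDA calls are `__byte_perm`, `__double2hiint` and `__longlong_as_double`, which the post already uses. The fallback branches are what a host compiler (or an older target) would get.

```cpp
#include <cstdint>

// Lets the same file build with nvcc and with a plain host C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Byte-swap a 32-bit word (placeholder name).
HOST_DEVICE static inline uint32_t swab32(uint32_t x)
{
#if defined(__CUDA_ARCH__)
    // Selector 0x0123 picks the bytes of x in reverse order.
    return __byte_perm(x, 0, 0x0123);
#else
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
#endif
}

// Upper 32 bits of a 64-bit word (placeholder name).
HOST_DEVICE static inline uint32_t hiword(uint64_t x)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 130
    // Reinterpret the 64 bits as a double and read its high half directly.
    return (uint32_t)__double2hiint(__longlong_as_double((long long)x));
#else
    return (uint32_t)(x >> 32);
#endif
}
```

Because `__CUDA_ARCH__` is only defined while nvcc compiles device code, every `#if` of this kind needs a branch that is valid when the macro is absent.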


cayars
Full Member
July 17, 2014, 10:54:16 AM
 #17908

CUDA 6.0


Check:

Version features and specifications

The funnel shift is available for compute 3.5 and higher.

http://en.wikipedia.org/wiki/CUDA

How to inline CUDA Assembly:
http://docs.nvidia.com/cuda/inline-ptx-assembly/index.html#axzz37iRLSsMj

Instruction set.


docs.nvidia.com/cuda/parallel-thread-execution/index.html#logic-and-shift-instructions

8.7.5.1. Logic and Shift Instructions:
(8.7.5.6. Logic and Shift Instructions: shf)


So the following macro should be converted to something like:

a0 = (a0<<(j))|(a0>>(32-j)); --> shf.l.wrap.b32 a0, a0, a0, j;
...

I have some time this weekend to do the full implementation. Just give me the latest branch to work on...



I won't have the source available until after the weekend, as I want to compile/test on Linux first and need to get the makefiles set up properly.

However, if you want to start playing now, pull down the latest from djm34 and use this to test with. https://github.com/djm34/ccminer

If I remember correctly, I tried to compile for compute 5.0 under CUDA 6 and it broke something in the FRESH algo. Because of this I went back to compute 3/3.5 on CUDA 5.5 for the last release.

So the point is: after trying your changes, please test every algo to make sure something else doesn't break under CUDA 6 and compute 5 (if you try this).

BTW, are you doing your testing on Windows or Linux?
NeuroticFish
Legendary
July 17, 2014, 11:00:22 AM
 #17909

:'( I am already bound to ancient versions of ccminer, because I still have compute 2.1...


Can't such areas be handled with #ifdef, keeping a not-so-good version for older GPUs?
But maybe I wish for too much. Obviously everybody mostly improves the versions for their own needs...

sp_
Legendary
July 17, 2014, 11:08:25 AM
 #17910

I won't have the source available until after the weekend, as I want to compile/test on Linux first and need to get the makefiles set up properly.
However, if you want to start playing now, pull down the latest from djm34 and use this to test with. https://github.com/djm34/ccminer
If I remember correctly, I tried to compile for compute 5.0 under CUDA 6 and it broke something in the FRESH algo. Because of this I went back to compute 3/3.5 on CUDA 5.5 for the last release.
So the point is: after trying your changes, please test every algo to make sure something else doesn't break under CUDA 6 and compute 5 (if you try this).
BTW, are you doing your testing on Windows or Linux?

Ok. I'll start with the djm34 branch. But in order to test all the algos I need the unified source code. I will be developing on Windows.

yellowduck2
Hero Member
July 17, 2014, 11:37:05 AM
 #17911

Can anyone send me a link to the latest NVminer source code? (merged)

It seems the funnel shift is missing in some of the 11 algorithms. (With a ROL, compute 5.0+ devices will get a boost.)


From Github (ccminer 1.2):

cuda_x11_luffa512.cu:

#define TWEAK(a0,a1,a2,a3,j)\
    a0 = (a0<<(j))|(a0>>(32-j));\
    a1 = (a1<<(j))|(a1>>(32-j));\
    a2 = (a2<<(j))|(a2>>(32-j));\
    a3 = (a3<<(j))|(a3>>(32-j));

#define MIXWORD(a0,a4)\
    a4 ^= a0;\
    a0  = (a0<<2) | (a0>>(30));\
    a0 ^= a4;\
    a4  = (a4<<14) | (a4>>(18));\
    a4 ^= a0;\
    a0  = (a0<<10) | (a0>>(22));\
    a0 ^= a4;\
    a4  = (a4<<1) | (a4>>(31));

cuda_x11_cubehash512.cu:

#define ROTATEUPWARDS7(a) (((a) << 7) | ((a) >> 25))
#define ROTATEUPWARDS11(a) (((a) << 11) | ((a) >> 21))

etc..

By rewriting these macros, the shift+shift+or CUDA instructions can be replaced with a single rol (3 times faster) (Maxwell / compute 5.0+).

Hope you succeed in optimizing the code.
The problem is that it may permanently break compatibility with other versions.

5.0 is the future. We need to get started and pave the way for the 800 series. By the time the 800 series is out, there will be perfectly optimized 5.0 code.
djm34
Legendary
July 17, 2014, 11:54:27 AM
 #17912

I won't have the source available until after the weekend, as I want to compile/test on Linux first and need to get the makefiles set up properly.
However, if you want to start playing now, pull down the latest from djm34 and use this to test with. https://github.com/djm34/ccminer
If I remember correctly, I tried to compile for compute 5.0 under CUDA 6 and it broke something in the FRESH algo. Because of this I went back to compute 3/3.5 on CUDA 5.5 for the last release.
So the point is: after trying your changes, please test every algo to make sure something else doesn't break under CUDA 6 and compute 5 (if you try this).
BTW, are you doing your testing on Windows or Linux?

Ok. I'll start with the djm34 branch. But in order to test all the algos I need the unified source code. I will be developing on Windows.
If possible, try to put the changes in cuda_helper.h rather than breaking everybody else's code...

polanskiman
Full Member
July 17, 2014, 03:07:41 PM
 #17913

I've updated to the beta drivers 340.88 and the hash rate increased slightly. Now getting 7.1 Mh or so, but still not the 7.8/7.9 Mh you are getting. I also tried the minep.it pool but, as I was expecting, it didn't change anything.

Any more ideas?

I have not followed the whole conversation, but I am wondering if one of you is running on risers and the other not?

That's an interesting theory. I read somewhere that it could indeed affect the hash rate: http://cryptomining-blog.com/1276-first-impressions-from-a-6-card-mining-rig-using-geforce-gtx-750-ti-gpus/

I, on my side, am using risers. Powered ones, for that matter. The difference is rather significant: ~800/900 Kh

I am also using ASUS, not Gigabyte. Perhaps that could also be the reason.

Good observation.  No risers of any kind for me.

Well I don't think it's a riser issue. I've connected the 2 GPUs directly to the slots of the mobo and I get the same hashrate as with risers... back to square one.
Schleicher
Hero Member
July 17, 2014, 04:29:17 PM
 #17914

:'( I am already bound to ancient versions of ccminer, because I still have compute 2.1...

Can't such areas be handled with #ifdef, keeping a not-so-good version for older GPUs?
But maybe I wish for too much. Obviously everybody mostly improves the versions for their own needs...
Sure, that's possible.
Example for rotate function in sha256:
Code:
#if __CUDA_ARCH__ < 350
#define rrot(x, bits) (((x) >> (bits)) | ((x) << (32 - (bits))))
#else
#define rrot(x, bits) __funnelshift_r((x), (x), (bits))
#endif
But usually there are other reasons for not supporting older cards
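The same guard can also be written as an inline function instead of a macro, which sidesteps operator-precedence surprises when the argument is an expression such as `a ^ b`. This is only a sketch: `HOST_DEVICE` and `sha256_bsig0` are names chosen here for illustration (the latter is the standard SHA-256 big sigma 0 rotation set 2/13/22), and the fallback branch is what runs on older cards or on the host.

```cpp
#include <cstdint>

// Lets the same file build with nvcc and with a plain host C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Rotate a 32-bit word right by `bits`.
HOST_DEVICE static inline uint32_t rrot(uint32_t x, uint32_t bits)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
    // Funnel shift right over the pair (x:x) acts as a rotate.
    return __funnelshift_r(x, x, bits);
#else
    // Fallback; the masks keep both shift counts in the range 0..31.
    bits &= 31u;
    return (x >> bits) | (x << ((32u - bits) & 31u));
#endif
}

// Illustrative use: the SHA-256 "big sigma 0" function.
HOST_DEVICE static inline uint32_t sha256_bsig0(uint32_t x)
{
    return rrot(x, 2) ^ rrot(x, 13) ^ rrot(x, 22);
}
```

Both branches return the same value for every input, so the choice only affects which instructions the compiler emits.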

d33_man
Member
July 17, 2014, 08:44:11 PM
 #17915

Hey,

I'm using ccminer 2.1 and keep getting the error "abnormal hashes, exiting with code 211!" when mining x11 on 4 750tis. Any advice as to what may be causing this? CCminer just shuts down after the error which occurs after a few hours mining.



Thanks
Newwsr
Sr. Member
July 17, 2014, 11:38:36 PM
 #17916

Hey,

I'm using ccminer 2.1 and keep getting the error "abnormal hashes, exiting with code 211!" when mining x11 on 4 750tis. Any advice as to what may be causing this? CCminer just shuts down after the error which occurs after a few hours mining.



Thanks


Friend, use ccminer v1.1. I use it here and have no problems.
Waldozaur12
Legendary
July 18, 2014, 01:01:42 AM
 #17917

Are there any viruses or trojans inside the cudaMiner software?
sp_
Legendary
July 18, 2014, 01:05:49 AM
Last edit: July 18, 2014, 01:25:03 AM by sp_
 #17918

Are there any viruses or trojans inside the cudaMiner software?

Claymore made $100,000 on the Monero miner with a 5% tip...

Poena
Newbie
July 18, 2014, 02:26:13 AM
 #17919

Are there any viruses or trojans inside the cudaMiner software?

Claymore made $100,000 on the Monero miner with a 5% tip...

Hidden tip or a legit fee for using his apps?
tarzanbigcity
Sr. Member
July 18, 2014, 03:47:47 AM
 #17920

Hey guys, perhaps we can rally some people here to support PIMP on the NVIDIA platform. You can donate to the dev here.

http://www.getpimp.org/features.html

3) develop and release pimp nvidia   17FEj7UEwH32PadCtWnyAS5hGYv7f99Lki   Jun 14th   0.23   2.00   1.77   11%

They are currently only 11% funded. But with our support perhaps we can complete that goal. I for one love the pimp platform and I think bringing it to nvidia would be a godsend.