SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

Genoil

Sr. Member

Offline

Activity: 438
Merit: 250

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 01:21:52 PM

#981

Quote from: mgmk on November 14, 2016, 11:30:34 AM

Build instructions for Windows:

1. Install AMD APP SDK for Windows from here http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/
2. Get cygwin from http://cygwin.com
3. Install cygwin with the packages:
- Devel/git
- Devel/make
- Devel/gcc-core
- Python/python3
4. Start cygwin64 terminal
5. Clone repo: git clone https://github.com/mbevand/silentarmy.git
6. cd silentarmy
7. Build: make OPENCL_HEADERS="/cygdrive/c/Program\ Files\ \(x86\)/AMD\ APP\ SDK/3.0/include/" LIBOPENCL="/cygdrive/c/Program\ Files\ \(x86\)/AMD\ APP\ SDK/3.0/lib/x86_64"
8. Copy cygwin1.dll: cp /cygdrive/c/cygwin64/bin/cygwin1.dll .
9. Get your build path by typing: cygpath . -w -a
10. Create start.bat with:
c:\cygwin64\bin\python3.4m silentarmy --instances=1 -c stratum+tcp://address -u x.x -p x --use 1,2
pause

Does the Cygwin port also have issues with multiple instances, like my MSVC port?

ETH: 0xeb9310b185455f863f526dab3d245809f6854b4d
BTC: 1Nu2fMCEBjmnLzqb8qUJpKgq5RoEWFhNcW

oslak

Sr. Member

Offline

Activity: 273
Merit: 250

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 01:34:02 PM

#982

Quote from: mgmk on November 14, 2016, 11:30:34 AM

Thank you very much for this. :D

ioglnx

Sr. Member

Offline

Activity: 574
Merit: 250

Fighting mob law and inquisition in this forum

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 01:40:44 PM

#983

Quote from: krnlx on November 14, 2016, 11:37:12 AM

Try replacing the ht_store function with this.
I'm getting a small speed increase (GTX 1070).

Code:

uint ht_store(uint round, __global char *ht, uint i,
	ulong xi0, ulong xi1, ulong xi2, ulong xi3, __global uint *rowCounters)
{
    uint    row;
    __global char       *p;
    uint                cnt;
    uint                tid = get_global_id(0);
    uint                tlid = get_local_id(0);
#if NR_ROWS_LOG == 16
    if (!(round % 2))
	row = (xi0 & 0xffff);
    else
	// if we have in hex: "ab cd ef..." (little endian xi0) then this
	// formula computes the row as 0xdebc. it skips the 'a' nibble as it
	// is part of the PREFIX. The Xi will be stored starting with "ef...";
	// 'e' will be considered padding and 'f' is part of the current PREFIX
	row = ((xi0 & 0xf00) << 4) | ((xi0 & 0xf00000) >> 12) |
	    ((xi0 & 0xf) << 4) | ((xi0 & 0xf000) >> 12);
#elif NR_ROWS_LOG == 18
    if (!(round % 2))
	row = (xi0 & 0xffff) | ((xi0 & 0xc00000) >> 6);
    else
	row = ((xi0 & 0xc0000) >> 2) |
	    ((xi0 & 0xf00) << 4) | ((xi0 & 0xf00000) >> 12) |
	    ((xi0 & 0xf) << 4) | ((xi0 & 0xf000) >> 12);
#elif NR_ROWS_LOG == 19
    if (!(round % 2))
	row = (xi0 & 0xffff) | ((xi0 & 0xe00000) >> 5);
    else
	row = ((xi0 & 0xe0000) >> 1) |
	    ((xi0 & 0xf00) << 4) | ((xi0 & 0xf00000) >> 12) |
	    ((xi0 & 0xf) << 4) | ((xi0 & 0xf000) >> 12);
#elif NR_ROWS_LOG == 20
    if (!(round % 2))
	row = (xi0 & 0xffff) | ((xi0 & 0xf00000) >> 4);
    else
	row = ((xi0 & 0xf0000) >> 0) |
	    ((xi0 & 0xf00) << 4) | ((xi0 & 0xf00000) >> 12) |
	    ((xi0 & 0xf) << 4) | ((xi0 & 0xf000) >> 12);
#else
#error "unsupported NR_ROWS_LOG"
#endif
    xi0 = (xi0 >> 16) | (xi1 << (64 - 16));
    xi1 = (xi1 >> 16) | (xi2 << (64 - 16));
    xi2 = (xi2 >> 16) | (xi3 << (64 - 16));
    p = ht + row * NR_SLOTS * SLOT_LEN;
    uint rowIdx = row/ROWS_PER_UINT;
    uint rowOffset = BITS_PER_ROW*(row%ROWS_PER_UINT);
    uint xcnt = atomic_add(rowCounters + rowIdx, 1 << rowOffset);
    xcnt = (xcnt >> rowOffset) & ROW_MASK;
    cnt = xcnt;
    if (cnt >= NR_SLOTS)
      {
	// avoid overflows
	atomic_sub(rowCounters + rowIdx, 1 << rowOffset);
	return 1;
      }
    p += cnt * SLOT_LEN + xi_offset_for_round(round);
    // store "i" (always 4 bytes before Xi)
//    *(__global uint *)(p - 4) = i;
    if (round == 0 || round == 1)
      {
	// store "i" (4 bytes), then 24 bytes of Xi: one 8-byte store
	// followed by a single 16-byte vector store
	ulong2 store;
	store.x = xi1;
	store.y = xi2;
	*(__global uint *)(p - 4) = i;
	*(__global ulong *)(p + 0) = xi0;
	*(__global ulong2 *)(p + 8) = store;
      }
    else if (round == 2)
      {
//	*(__global uint *)(p - 4) = i;
	// store 20 bytes

	*(__global ulong *)(p - 4) = ((ulong)i) | (xi0 << 32);
	*(__global ulong *)(p + 4) = (xi0 >> 32) | (xi1 << 32);
	*(__global ulong *)(p + 12) = (xi1 >> 32) | (xi2 << 32);

      }
    else if (round == 3)
      {
//	*(__global uint *)(p - 4) = i;
	// store 16 bytes
	//8 byte align	
	*(__global ulong *)(p - 4) = ((ulong)i) | (xi0 << 32);
	*(__global ulong *)(p + 4) = (xi0 >> 32) | (xi1 << 32);
	*(__global uint *)(p + 12) = (xi1 >> 32);
      }
    else if (round == 4)
      {
//	*(__global uint *)(p - 4) = i;
	// store 16 bytes
	*(__global uint *)(p - 4) = i;
	*(__global ulong *)(p + 0) = xi0;
	*(__global ulong *)(p + 8) = xi1;
      }
    else if (round == 5)
      {
	//*(__global uint *)(p - 4) = i;
	// store 12 bytes
//	*(__global uint *)(p - 4) = i;

	*(__global uint *)(p - 4) = i;
	*(__global ulong *)(p + 0) = xi0;	
	*(__global uint *)(p + 8) = xi1;
      }
    else if (round == 6 || round == 7)
      {
//	*(__global uint *)(p - 4) = i;
	// store 8 bytes
	*(__global ulong *)(p - 4) = ((ulong)i) | (xi0 << 32);
	*(__global uint *)(p + 4) = (xi0 >> 32);	
      }
    else if (round == 8)
      {
	//4 byte align
	*(__global uint *)(p - 4) = i;
	// store 4 bytes
	*(__global uint *)(p + 0) = xi0;

      }

//*(__global uint *)(p - 4) = i;
    return 0;
}
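
For readers following the snippet, here is a small host-side Python model of two steps in ht_store: the row index for the NR_ROWS_LOG == 16 branch, and the packed row counters that the kernel updates with atomic_add. It is only a sketch, not kernel code. The values of BITS_PER_ROW, ROWS_PER_UINT, ROW_MASK and NR_SLOTS are assumptions picked for illustration (the post does not give them), claim_slot is a hypothetical name, and the model is single-threaded, so it ignores the races that the real atomics handle.

```python
# Host-side model of two steps in ht_store; not kernel code.
# Assumed values (not given in the post): 8-bit counters, 4 per 32-bit word.
BITS_PER_ROW = 8
ROWS_PER_UINT = 4
ROW_MASK = 0xFF
NR_SLOTS = 3  # tiny limit so the overflow path is easy to see


def row_for_round(xi0, round_nr):
    """Row index as computed in the NR_ROWS_LOG == 16 branch."""
    if round_nr % 2 == 0:
        return xi0 & 0xFFFF
    # Odd rounds: skip the nibble that belongs to the previous PREFIX.
    return (((xi0 & 0xF00) << 4) | ((xi0 & 0xF00000) >> 12) |
            ((xi0 & 0xF) << 4) | ((xi0 & 0xF000) >> 12))


def claim_slot(row_counters, row):
    """Model of the atomic_add / atomic_sub pair.

    Returns the slot index inside the row, or None if the row is full.
    """
    row_idx = row // ROWS_PER_UINT
    row_offset = BITS_PER_ROW * (row % ROWS_PER_UINT)
    old = row_counters[row_idx]                      # atomic_add returns the old value
    row_counters[row_idx] = (old + (1 << row_offset)) & 0xFFFFFFFF
    cnt = (old >> row_offset) & ROW_MASK
    if cnt >= NR_SLOTS:
        # Row is full: undo the increment, as the kernel does with atomic_sub.
        row_counters[row_idx] = (row_counters[row_idx] - (1 << row_offset)) & 0xFFFFFFFF
        return None
    return cnt
```

With the bytes "ab cd ef" from the code comment (xi0 = 0xefcdab as a little-endian value), row_for_round(0xefcdab, 1) gives 0xdebc, and an even round gives 0xcdab.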

And part of xor_and_store

Code:

    else if (round == 3)
      {
	// xor 20 bytes
	uint one = *(__global uint *)a ^ *(__global uint *)b;
	
	

	uint4 loada = *(__global uint4 *)((__global char *)a + 4);
	uint4 loadb = *(__global uint4 *)((__global char *)b + 4);
	uint4 stor = loada ^ loadb;
	
	
	xi0 = ((ulong)one ) | ((ulong) stor.x << 32);
	xi1 = ((ulong)stor.y << 32) | ((ulong)stor.z );
	xi2 = stor.w;	


	//xi0 = half_aligned_long(a, 0) ^ half_aligned_long(b, 0);
	//xi1 = half_aligned_long(a, 8) ^ half_aligned_long(b, 8);
	//xi2 = well_aligned_int(a, 16) ^ well_aligned_int(b, 16);
      }
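
A quick way to sanity-check this kind of load merging is to model it on the host. The Python sketch below compares the commented-out path (two 64-bit reads plus one 32-bit read) with a single 32-bit read followed by a 16-byte vector read at offset 4. It assumes little-endian data and assumes half_aligned_long puts the lower-addressed word in the low half; the function names are made up for the sketch. Under those assumptions the two paths agree only when xi1 is built as stor.y | (stor.z << 32). The snippet as posted puts stor.y in the high half, which the swap_xi1_halves flag reproduces, so that line may be worth checking against the kernel source.

```python
def load_le32(buf, off):
    """Read a little-endian 32-bit word from a bytes object."""
    return int.from_bytes(buf[off:off + 4], "little")


def xor_round3_reference(a, b):
    """Model of the commented-out path: 8 + 8 + 4 bytes."""
    def half_aligned_long(p, off):  # assumption: low word first
        return load_le32(p, off) | (load_le32(p, off + 4) << 32)

    xi0 = half_aligned_long(a, 0) ^ half_aligned_long(b, 0)
    xi1 = half_aligned_long(a, 8) ^ half_aligned_long(b, 8)
    xi2 = load_le32(a, 16) ^ load_le32(b, 16)
    return xi0, xi1, xi2


def xor_round3_vector(a, b, swap_xi1_halves=False):
    """Model of the vectorized path: 4 bytes, then one 16-byte load at offset 4."""
    one = load_le32(a, 0) ^ load_le32(b, 0)
    # stor[0..3] correspond to stor.x, stor.y, stor.z, stor.w
    stor = [load_le32(a, 4 + 4 * k) ^ load_le32(b, 4 + 4 * k) for k in range(4)]
    xi0 = one | (stor[0] << 32)
    if swap_xi1_halves:
        xi1 = (stor[1] << 32) | stor[2]   # form used in the posted snippet
    else:
        xi1 = stor[1] | (stor[2] << 32)   # matches the reference under the assumption
    xi2 = stor[3]
    return xi0, xi1, xi2
```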

Are these changes made in kernel.cu? Am I right?

GTX 1080Ti rocks da house... seriously... this card is a beast³
Owning by now 18x GTX1080Ti :-D @serious love of efficiency

snk0752

Newbie

Offline

Activity: 39
Merit: 0

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 01:46:22 PM

#984

Quote from: ioglnx on November 14, 2016, 01:40:44 PM

Are these changes made in kernel.cu? Am I right?

input.cl

ioglnx

Sr. Member

Offline

Activity: 574
Merit: 250

Fighting mob law and inquisition in this forum

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 01:48:36 PM

#985

Quote from: snk0752 on November 14, 2016, 01:46:22 PM

Quote from: ioglnx on November 14, 2016, 01:40:44 PM

Are these changes made in kernel.cu? Am I right?

input.cl

Well, krnlx has a CUDA port of it, so I assume it's in kernel.cu, since there is no input.cl in his CUDA implementation. Unless I'm blind, I found these functions in kernel.cu too.

snk0752

Newbie

Offline

Activity: 39
Merit: 0

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 01:54:40 PM

#986

Quote from: ioglnx on November 14, 2016, 01:48:36 PM

Well, krnlx has a CUDA port of it, so I assume it's in kernel.cu, since there is no input.cl in his CUDA implementation. Unless I'm blind, I found these functions in kernel.cu too.

Ah, got it. I have no CUDA (I'm using the AMD OpenCL drivers), so I just adjusted input.cl in my case. Sorry for the confusion.

qqqq

Legendary

Offline

Activity: 1596
Merit: 1011

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 02:01:27 PM

#987

Why aren't the devs making Windows builds? I don't get it.

mgmk

Newbie

Offline

Activity: 3
Merit: 0

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 02:03:42 PM

#988

Quote from: Genoil on November 14, 2016, 01:21:52 PM

Does the Cygwin port also have issues with multiple instances, like my MSVC port?

Yes, using Cygwin results in the same performance issue when running multiple instances.

gross

Full Member

Offline

Activity: 236
Merit: 100

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 02:08:38 PM

#989

Quote from: mgmk on November 14, 2016, 11:30:34 AM

It doesn't work: all 3 devices show "0 sol/s".
Do I need to install both the 64-bit and the 32-bit AMD APP SDK? I only installed the 64-bit one.

eXtremal

Sr. Member

Offline

Activity: 2106
Merit: 282

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 02:12:57 PM

#990

Quote from: zawawa on November 14, 2016, 03:46:53 AM

Yeah, I think it's totally worth it to put up a serious open source competition against Sir Claymore ;)

We shall be back.

I'll try to reach Claymore's v5/optiminer speed in 2-3 days.

P.S. He has more motivation because his model (a miner with a fee) works perfectly, as opposed to mine (an open-source miner and my own pool): miners don't want to support coinsforall.io :(

Kubuxu

Newbie

Offline

Activity: 4
Merit: 0

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 02:50:13 PM

#991

Quote from: eXtremal on November 14, 2016, 02:12:57 PM

Quote from: zawawa on November 14, 2016, 03:46:53 AM

Yeah, I think it's totally worth it to put up a serious open source competition against Sir Claymore ;)

We shall be back.

What do you think about a FOSS miner with a dev fee? A small fee of 1-2%, split among the devs (address list in the sources, with weights). Anyone who builds it on their own has the chance to opt out. No enforcement, and no saying that people who opt out are bad.
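
The weighted split suggested here could be sketched like this in Python. Everything in it is hypothetical: the function name split_dev_fee, the placeholder address labels and the weights are made up for illustration, and expressing the fee as mining time per period is just one possible way to apply it.

```python
def split_dev_fee(period_s, fee_percent, weights):
    """Return the seconds of mining each dev address gets in one period.

    period_s    -- length of the accounting period in seconds
    fee_percent -- total dev fee in percent (0 models opting out)
    weights     -- mapping of address label -> integer weight
    """
    fee_s = period_s * fee_percent // 100       # total fee time in the period
    total_weight = sum(weights.values())
    # Integer division: any rounding remainder stays with the user.
    return {addr: fee_s * w // total_weight for addr, w in weights.items()}


# Placeholder labels, not real addresses.
shares = split_dev_fee(24 * 3600, 2, {"dev_a": 3, "dev_b": 2, "dev_c": 1})
```

For a 2% fee over 24 hours with weights 3:2:1 this yields 864, 576 and 288 seconds; setting fee_percent to 0 gives every address zero, which models opting out.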

There probably won't be much profit in that, but some motivation is better than none. As someone who has done OSS from the beginning and currently works for a FOSS company, I know that well.

Also, I am planning to rewrite the computation pipeline so that it always has some job queued up, thus reducing kernel downtime. The overhead is small right now, but once the limits of the GPUs are truly reached it might make a difference.

ioglnx

Sr. Member

Offline

Activity: 574
Merit: 250

Fighting mob law and inquisition in this forum

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 02:55:12 PM

#992

Hello Krnlx,

can you please post the complete kernel.cu? Your snippets aren't as complete as I thought at first glance. I'm getting this error:

Error   3   error : no operator "^" matches these operands   X:\Mining\sources\nheqminer-cuda-silentarmy\cuda_silentarmy\kernel.cu   496   1   cuda_silentarmy

   uint4 loada = *(__global uint4 *)((__global char *)a + 4);
   uint4 loadb = *(__global uint4 *)((__global char *)b + 4);
   uint4 stor = loada ^ loadb;

Or wasn't it meant for your CUDA port?

padrino

Legendary

Offline

Activity: 1428
Merit: 1000

https://www.bitworks.io

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 03:18:24 PM

#993

Quote from: eXtremal on November 14, 2016, 02:12:57 PM

Quote from: zawawa on November 14, 2016, 03:46:53 AM

Yeah, I think it's totally worth it to put up a serious open source competition against Sir Claymore ;)

We shall be back.

I had a rough time with your pool so I gave it up some time ago, but have no problem letting the miner mine for a dev fee, perhaps you can sync up with mrb and figure something out..

1CPi7VRihoF396gyYYcs2AdTEF8KQG2BCR
https://www.bitworks.io

cryptomined

Full Member

Offline

Activity: 168
Merit: 104

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 03:22:50 PM
Last edit: November 14, 2016, 03:33:18 PM by cryptomined

#994

Quote from: eXtremal on November 14, 2016, 02:12:57 PM

I would have mined on coinsforall.io from day one, but your pool seems to be blocked for Chinese IPs? Chinese miners were complaining that you blocked them from your pool from day 1. Is it you or your host? Everyone in China was complaining that it was not fair for them, and that the Zcash start was not fair because your pool did not allow direct Chinese connections.

It was a pain in the butt to connect through a VPN on Linux. My VPN keeps dropping when the screen turns off, so I would have had to always leave the monitor on, and since I was using just one monitor for 3 rigs, that was impossible.

Did/do you know about the issue with Chinese IPs not connecting to your pool? Is this done on purpose? If not, can you ask your host for a change of IP?

I liked your miner and was very upset that I was not able to use it efficiently on your pool. Really upsetting for me.

At least I left you a small tip, as I was using it for a few hours until I gave up due to the IP address situation. So you got my remaining balance from day 1, which today isn't worth jack, but still, I lost hours of mining on your pool on genesis day.

You may want to consider changing hosts, but maybe the damage is already done and there are too many pools now.

And anyway, how about adding a ZClassic pool?

https://cryptomined.com

Tmdz

Hero Member

Offline

Activity: 1008
Merit: 1000

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 03:35:15 PM

#995

Quote from: padrino on November 14, 2016, 03:18:24 PM

Quote from: eXtremal on November 14, 2016, 02:12:57 PM

Quote from: zawawa on November 14, 2016, 03:46:53 AM

Yeah, I think it's totally worth it to put up a serious open source competition against Sir Claymore ;)

We shall be back.

I had a rough time with your pool so I gave it up some time ago, but have no problem letting the miner mine for a dev fee, perhaps you can sync up with mrb and figure something out..

I like the way optiminer mines the dev fee, i.e. it mines it continuously instead of dropping all the hash to mine for the dev. That also gives the user an accurate picture of how much hash they are getting.

People don't like being stuck with or forced to use a single pool, that's just science.

krnlx

Full Member

Offline

Activity: 243
Merit: 105

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 04:02:55 PM

#996

Quote from: ioglnx on November 14, 2016, 02:55:12 PM

It's for OpenCL. CUDA has no native 128-bit XOR (I don't know about AMD or future cards). For CUDA you can test:
uint4 stor;
stor.x = loada.x ^ loadb.x;
stor.y = loada.y ^ loadb.y;
stor.z = loada.z ^ loadb.z;
stor.w = loada.w ^ loadb.w;

Genoil

Sr. Member

Offline

Activity: 438
Merit: 250

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 04:11:12 PM

#997

Quote from: krnlx on November 14, 2016, 04:02:55 PM

Quote from: ioglnx on November 14, 2016, 02:55:12 PM

It's for OpenCL. CUDA has no native 128-bit XOR (I don't know about AMD or future cards). For CUDA you can test:
uint4 stor;
stor.x = loada.x ^ loadb.x;
stor.y = loada.y ^ loadb.y;
stor.z = loada.z ^ loadb.z;
stor.w = loada.w ^ loadb.w;

Code:

static __device__ __forceinline__ uint4 operator^ (uint4 a, uint4 b) { return make_uint4(a.x ^ b.x, a.y ^ b.y, a.z ^ b.z, a.w ^ b.w); }

ioglnx

Sr. Member

Offline

Activity: 574
Merit: 250

Fighting mob law and inquisition in this forum

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 04:26:08 PM

#998

Thanks Genoil :-D

Edit: I could build the CUDA_SA now, let's see if it works :-D
But it doesn't work :-D Registers get spammed :-D It takes 5 min to unload the application :D

laik2

Sr. Member

Offline

Activity: 652
Merit: 266

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 05:42:13 PM

#999

Add CPU mining support for the dev fee, with 1 core mining to your T addresses.
Include an option to set the dev fee from 0 to 100%. I am aware that binary distributions are preferred due to speed, and that developers get more donations thanks to the existence of a dev fee, but mining 24/7 with 1 core to a dev address, or 1 hour at full speed every 24h, will make a huge difference in development speed.

P.S. We need solutions fast. Everyone with CPU/GPU power should donate ASAP for faster development. Other closed-source projects already support 150+ S/s on high-end AMD cards, and instability is the only thing that stops everyone using silentarmy from switching there. Since nobody seems to care about Claymore's and others' dev fees, only about speed, please donate by mining to the silentarmy devs' addresses. I will ask every active developer to put their address on the GitHub page so that each and every one of us can donate to you!

Miners Mining Platform [ MMP OS ] - https://app.mmpos.eu/

toptek

Legendary

Offline

Activity: 1274
Merit: 1000

Re: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070

November 14, 2016, 07:08:30 PM

#1000

Add a dev fee that we can turn on and off. If it doesn't slow us down when it's off, I'll keep it on for a few days at a time. I'll put it to use. I hope others would too.
