Bitcoin Forum
Author Topic: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070  (Read 209264 times)
krnlx
November 20, 2016, 09:41:16 PM
 #1281

Quote:
Why not use a vector store? Damn it, a few more weeks and I'll be up to speed with OpenCL.

A vector store address must be aligned to 16 bytes, which is not possible in any round because of the different offsets.
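
To illustrate the alignment constraint, here is a minimal C sketch (not SILENTARMY code). It assumes a 16-byte vector store is only allowed when the destination address is a multiple of 16, and falls back to four 32-bit stores otherwise. The names is_aligned16 and store4 are hypothetical, and memcpy stands in for the GPU store instructions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 when p sits on a 16-byte boundary, 0 otherwise. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/*
 * Hypothetical helper: writes four 32-bit words to dst.
 * Returns the number of store operations it modeled:
 * 1 for a single 16-byte "vector" store, 4 for scalar stores.
 */
static int store4(uint8_t *dst, const uint32_t w[4])
{
    if (is_aligned16(dst)) {
        memcpy(dst, w, 16);             /* stands in for one vector store */
        return 1;
    }
    for (int i = 0; i < 4; i++)         /* stands in for four 32-bit stores */
        memcpy(dst + 4 * i, &w[i], 4);
    return 4;
}
```

If the slot offset changes from round to round, as described above, the aligned branch would rarely be taken, which is the point being made about vector stores.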
zawawa
November 20, 2016, 09:57:52 PM
 #1282

Quote:
I have been tweaking the disassembled GCN code of SA's kernels, and there seems to be quite a bit of room for performance improvement, especially in global memory access, by reordering flat_store_dword and s_waitcnt in ht_store(). @eXtremal, how is your next batch of optimizations coming along? If it is almost ready, I will wait for it. Otherwise, I will optimize the OpenCL kernel myself and then tweak the GCN code.

Quote:
xor_and_store and ht_store must be rewritten and merged into one function.

The current data path is:

unaligned 32-bit reads in xor_and_store -> joined into 64 bits in half_aligned_long -> 64-bit XOR in xor_and_store -> in rounds 2, 4, 6, 8, a 256-bit shift on xi0xi1xi2xi3 in xor_and_store -> another 256-bit shift in ht_store -> split into 32-bit words and written in ht_store

It must be rewritten as one of:

unaligned 32-bit reads -> 32-bit XOR -> 256-bit shift -> 32-bit, 64-bit, or vector store
or
64-bit reads -> 64-bit XOR -> 256-bit shift in 64-bit words -> 64-bit or vector store
or
64-bit and 32-bit reads -> 64-bit and 32-bit XOR -> mixed 256-bit shift -> 64-bit, 32-bit, or vector store

depending on the round.
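
The first proposed variant (32-bit reads, 32-bit XOR, one 256-bit shift, 32-bit stores) can be sketched in plain C as below. This is only an illustration of the fused data path, not the actual kernel: the function name xor_shift_store, the little-endian word order, and the shift amount are assumptions, and memcpy stands in for unaligned global memory accesses.

```c
#include <stdint.h>
#include <string.h>

#define WORDS 8 /* 8 x 32 bits = 256 bits */

/*
 * Hypothetical fused step: XOR two 256-bit values word by word,
 * shift the 256-bit result right by shift_bits (0..31), and store
 * it as 32-bit words, all in a single pass.
 * Word 0 is treated as the least significant word.
 */
static void xor_shift_store(uint8_t *dst, const uint8_t *a,
                            const uint8_t *b, unsigned shift_bits)
{
    uint32_t x[WORDS];

    for (int i = 0; i < WORDS; i++) {
        uint32_t wa, wb;
        memcpy(&wa, a + 4 * i, 4);  /* unaligned-safe 32-bit read */
        memcpy(&wb, b + 4 * i, 4);
        x[i] = wa ^ wb;             /* 32-bit XOR */
    }

    for (int i = 0; i < WORDS; i++) {
        uint32_t lo = x[i];
        uint32_t hi = (i + 1 < WORDS) ? x[i + 1] : 0;
        /* Guard shift_bits == 0 to avoid an undefined 32-bit shift. */
        uint32_t out = shift_bits
            ? (lo >> shift_bits) | (hi << (32 - shift_bits))
            : lo;
        memcpy(dst + 4 * i, &out, 4); /* 32-bit store */
    }
}
```

Compared with the current path, the value is shifted once instead of twice and never has to be joined into 64-bit words and split again.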



Excellent suggestions! Let me get to them ASAP.

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
eXtremal
November 20, 2016, 10:04:06 PM
Last edit: November 20, 2016, 10:15:20 PM by eXtremal
 #1283

Speedup of +10-15%, NVIDIA only:
http://coinsforall.io/distr/nvidia/input.cl
http://coinsforall.io/distr/nvidia/param.h

Sorry, but I can't work more than 1 hour a day on the miner right now.

For other developers, this is what still needs to be done:
- Decrease NR_ROWS_LOG to 13 or 12. The ht_store function works much faster with low NR_ROWS values, and decreasing NR_ROWS also decreases the total number of slots, because you can use smaller values for the OVERHEAD constant.
- Optimize the Equihash round for large NR_SLOTS values. I began doing this in the last NVIDIA release, but it needs much more work.
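
The trade-off in the first point can be made concrete with a small C sketch. It assumes that slots per row are derived as 2^(21 - NR_ROWS_LOG) * OVERHEAD, where 2^21 is the approximate number of elements per round for Equihash (200,9). That relation and the OVERHEAD values used in the usage note are assumptions for illustration; check them against the param.h you are actually building.

```c
#include <stdint.h>

/* Equihash (200,9): roughly 2^21 elements per round. */
#define APX_NR_ELMS_LOG 21

/* Number of hash table rows for a given NR_ROWS_LOG. */
static uint32_t nr_rows(unsigned rows_log)
{
    return 1u << rows_log;
}

/* Slots per row: expected elements per row times a safety factor
 * (assumed relation, overhead plays the role of OVERHEAD). */
static uint32_t nr_slots(unsigned rows_log, unsigned overhead)
{
    return (1u << (APX_NR_ELMS_LOG - rows_log)) * overhead;
}

/* Total slots across the whole table. */
static uint64_t total_slots(unsigned rows_log, unsigned overhead)
{
    return (uint64_t)nr_rows(rows_log) * nr_slots(rows_log, overhead);
}
```

Under these assumptions the total is always 2^21 * OVERHEAD, so it only shrinks if fewer rows allow a smaller OVERHEAD. For example, with hypothetical values, nr_slots(20, 9) gives 18 slots per row, while nr_slots(12, 2) gives 1024 slots per row but less than a quarter of the total slots.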

zawawa
November 20, 2016, 10:09:08 PM
 #1284

No prob, thank you for your contributions!

laik2
November 20, 2016, 10:13:02 PM
 #1285

Did you make any progress with AMD?

Miners Mining Platform [ MMP OS ] - https://app.mmpos.eu/
eXtremal
November 20, 2016, 10:19:33 PM
 #1286

The last release doesn't work on AMD, but if I find the reason, it will give the same +10-15% on AMD cards. For more speedup, see my previous post.

zawawa
November 20, 2016, 10:34:04 PM
 #1287

Your last release is working on RX 480 with these modifications. Thanks a bunch!
Code:
// Number of rows and slots is affected by this. 20 offers the best performance
// but occasionally misses ~1% of solutions.
#ifdef cl_nv_pragma_unroll // NVIDIA
#define NR_ROWS_LOG                     16
#else
#define NR_ROWS_LOG                     18
#endif

// Setting this to 1 might make SILENTARMY faster, see TROUBLESHOOTING.md
#define OPTIM_SIMPLIFY_ROUND 1

// Number of collision items to track, per thread
#ifdef cl_nv_pragma_unroll // NVIDIA
#define THREADS_PER_ROW 32
#define ROWS_PER_WORKGROUP (64/THREADS_PER_ROW)
#define LDS_COLL_SIZE (NR_SLOTS * 24 * (64 / THREADS_PER_ROW))
#else
#define THREADS_PER_ROW 8
#define ROWS_PER_WORKGROUP (64/THREADS_PER_ROW)
#define LDS_COLL_SIZE (NR_SLOTS * 8 * (64 / THREADS_PER_ROW))
#endif
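
As a rough check of what the two branches above evaluate to, here is a small C sketch that mirrors the macro arithmetic with plain functions. The function names are hypothetical, and the NR_SLOTS value in the usage note is a made-up example, since it depends on NR_ROWS_LOG and OVERHEAD in the rest of param.h.

```c
#include <stdint.h>

/* Work-group size assumed by the macros above. */
#define WORKGROUP_SIZE 64

/* Mirrors ROWS_PER_WORKGROUP: rows handled by one work-group. */
static uint32_t rows_per_workgroup(uint32_t threads_per_row)
{
    return WORKGROUP_SIZE / threads_per_row;
}

/* Mirrors LDS_COLL_SIZE: collision entries kept in local memory.
 * per_slot is the multiplier from the macro (24 on NVIDIA, 8 otherwise). */
static uint32_t lds_coll_size(uint32_t nr_slots, uint32_t per_slot,
                              uint32_t threads_per_row)
{
    return nr_slots * per_slot * rows_per_workgroup(threads_per_row);
}
```

With THREADS_PER_ROW set to 8 (the non-NVIDIA branch) a work-group covers 8 rows, and with 32 (the NVIDIA branch) it covers 2. For a hypothetical NR_SLOTS of 24, lds_coll_size(24, 8, 8) yields 1536 entries.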

zawawa
November 20, 2016, 10:40:43 PM
 #1288

I am currently getting 100-114 sol/s with RX 480. This is very nice...

laik2
November 20, 2016, 10:46:59 PM
 #1289

102/103 here with 2080 OC.
Which card do you have?

zawawa
November 20, 2016, 10:49:10 PM
 #1290

I pushed recent changes to my repo, including my Win32 multithreading mod:
https://github.com/zawawawa/silentarmy
New Windows binaries will be available shortly.

zawawa
November 20, 2016, 10:51:44 PM
 #1291

XFX Black Edition with a modded BIOS.

zawawa
November 20, 2016, 10:54:20 PM
 #1292

Oh, my numbers are with 4 threads per GPU, too. Multithreading seems to be working well so far.

TIKCrazy
November 20, 2016, 10:56:31 PM
 #1293

Also check this: https://github.com/krnlx/silentarmy-nvmod
krnlx is working on silentarmy too.
ioglnx
November 20, 2016, 11:34:29 PM
 #1294

@eXtremal: Nice optimizations! I came close to hitting 500 sol/s with 2 GTX 1080s and 2 GTX 1070s :-D
Thanks

GTX 1080Ti rocks da house... seriously... this card is a beast³
Owning by now 18x GTX1080Ti :-D @serious love of efficiency
Amph
November 21, 2016, 07:45:39 AM
 #1295

Quote:
I pushed recent changes to my repo, including my Win32 multithreading mod:
https://github.com/zawawawa/silentarmy
New Windows binaries will be available shortly.

Does this version include eXtremal's additional code or any hashrate improvement, besides fixes for the known bugs?
zawawa
November 21, 2016, 07:53:03 AM
 #1296

Yes. I haven't uploaded binaries yet, though. I just got new ideas for optimization. Please wait.

disman
November 21, 2016, 09:33:28 AM
 #1297

138 sol/s: no longer interesting ... ((

Profit is rock-bottom.

200+ sol/s on a 1070... one might have thought.

I see a few sporting altruists are still around )))
Venon
November 21, 2016, 02:41:04 PM
 #1298

That is the reason why this thread is quiet now. If you pay $0.25/kWh, it is not profitable to mine ZEC.
laik2
November 21, 2016, 02:44:49 PM
 #1299

Yep... switched to Ethereum, which is more profitable right now.

laik2
November 21, 2016, 08:40:19 PM
 #1300

Is anyone working on improvements for AMD, especially the newer RX cards?
