Bitcoin Forum
May 02, 2024, 09:11:01 AM *
Author Topic: SILENTARMY v5: Zcash miner, 115 sol/s on R9 Nano, 70 sol/s on GTX 1070  (Read 209263 times)
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
December 04, 2016, 07:40:06 PM
 #1641

The implementation of the new algorithm turned out to be quite difficult, but it's coming along.
It's not surprising, however, given the fact that I'm up against professional cryptographers.
Just to let you guys know, the new algo involves a modified version of Wagner's algorithm,
which I call "castrated Wagner's," with reduced memory bandwidth without compromising
algorithm binding. My brain is fried thinking about this problem with so many gotchas day and night,
but we are at the end of the tunnel.

You've piqued my curiosity.  I never gave much thought to optimizing Wagner's algorithm, as I thought the algorithm binding limited what you could do.   Maybe I'll take a break from AMD GCN assembler docs and look back at the equihash paper.

p.s. You know your brain is working hard when you get a headache from just thinking!

After a day of racking my brain, any faster versions of the algorithm I've come up with result in close to 0 solutions.  I've had ideas that might improve only round 0 or round 8, but nothing that would make a material difference for the algorithm as a whole.  I'm not quite confident enough to say it's impossible to significantly optimize the algorithm, but I'm going back to reading AMD GCN architecture docs, which I know can help optimize the implementation.
laik2
Sr. Member
****
Offline Offline

Activity: 652
Merit: 266



View Profile WWW
December 04, 2016, 07:55:13 PM
 #1642

The implementation of the new algorithm turned out to be quite difficult, but it's coming along.
It's not surprising, however, given the fact that I'm up against professional cryptographers.
Just to let you guys know, the new algo involves a modified version of Wagner's algorithm,
which I call "castrated Wagner's," with reduced memory bandwidth without compromising
algorithm binding. My brain is fried thinking about this problem with so many gotchas day and night,
but we are at the end of the tunnel.

You've piqued my curiosity.  I never gave much thought to optimizing Wagner's algorithm, as I thought the algorithm binding limited what you could do.   Maybe I'll take a break from AMD GCN assembler docs and look back at the equihash paper.

p.s. You know your brain is working hard when you get a headache from just thinking!

After a day of racking my brain, any faster versions of the algorithm I've come up with result in close to 0 solutions.  I've had ideas that might improve only round 0 or round 8, but nothing that would make a material difference for the algorithm as a whole.  I'm not quite confident enough to say it's impossible to significantly optimize the algorithm, but I'm going back to reading AMD GCN architecture docs, which I know can help optimize the implementation.

It is possible. Claymore and Optiminer have done it. I don't see why the OSS community won't. I guess it will take more time, which is the barrier for all ZEC miners at the current trade value....
EDIT: The keyword in my post is "community". If you have an idea, just share it... I don't see any ideas popping up lately. I've been reading OpenCL and AMD GCN docs too, but my knowledge is pretty limited... if I see anything useful I think I can help, one way or another.

Miners Mining Platform [ MMP OS ] - https://app.mmpos.eu/
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
December 04, 2016, 09:23:44 PM
Last edit: December 04, 2016, 09:40:08 PM by nerdralph
 #1643

After a day of racking my brain, any faster versions of the algorithm I've come up with result in close to 0 solutions.  I've had ideas that might improve only round 0 or round 8, but nothing that would make a material difference for the algorithm as a whole.  I'm not quite confident enough to say it's impossible to significantly optimize the algorithm, but I'm going back to reading AMD GCN architecture docs, which I know can help optimize the implementation.

It is possible. Claymore and Optiminer have done it. I don't see why the OSS community won't. I guess it will take more time, which is the barrier for all ZEC miners at the current trade value....
EDIT: The keyword in my post is "community". If you have an idea, just share it... I don't see any ideas popping up lately. I've been reading OpenCL and AMD GCN docs too, but my knowledge is pretty limited... if I see anything useful I think I can help, one way or another.

No, they only optimized the implementation.  The algorithm, i.e. 8 rounds of bin sorts to find collisions on 20 bits followed by a final 40-bit collision search, has not been optimized.
I've already explained the optimizations that could be done in OpenCL, which would take ~5 million core clocks per round.  That would push the performance up to ~200 sol/s on an RX 470 clocked at 1250/1750.  With GCN assembler I believe I can get that down to ~3 million core clocks per round, with performance of over 300 sol/s on an RX 470.
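
As a rough illustration of the structure described above (rounds of bin sorts on a fixed number of bits, then a final collision on twice as many bits), here is a toy Python sketch of Wagner's algorithm. It is a sketch under assumptions, not how SILENTARMY or any other real miner is implemented: the function name `toy_wagner`, the small parameters (n=40, k=3, giving 2 binning rounds on 10 bits and a final 20-bit collision) and the seed are made up for illustration, and it skips the index-ordering rules that Equihash uses for algorithm binding. With Zcash's n=200, k=9 the same structure gives 8 rounds on 20 bits and a final 40-bit collision.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def toy_wagner(n=40, k=3, seed=b"toy"):
    """Toy Wagner's algorithm: find 2^k indices whose n-bit hashes XOR to zero.

    Hypothetical illustration only; parameters are far smaller than Zcash's.
    """
    c = n // (k + 1)          # collision bits per round (20 for n=200, k=9)
    mask = (1 << c) - 1

    def h(i):
        # n-bit pseudo-random value derived from the seed and the index
        d = hashlib.blake2b(seed + i.to_bytes(4, "little"), digest_size=8).digest()
        return int.from_bytes(d, "little") & ((1 << n) - 1)

    values = [h(i) for i in range(1 << (c + 1))]      # 2^(c+1) initial strings
    rows = [(v, (i,)) for i, v in enumerate(values)]  # (remaining bits, indices)

    # k-1 rounds: bin on the low c bits, XOR colliding pairs, drop those bits
    for _ in range(k - 1):
        bins = defaultdict(list)
        for x, idx in rows:
            bins[x & mask].append((x, idx))
        rows = []
        for bucket in bins.values():
            for (xa, ia), (xb, ib) in combinations(bucket, 2):
                if set(ia).isdisjoint(ib):            # never reuse an index
                    rows.append(((xa ^ xb) >> c, ia + ib))

    # final round: the remaining 2c bits must collide completely
    bins = defaultdict(list)
    for x, idx in rows:
        bins[x].append(idx)
    solutions = []
    for bucket in bins.values():
        for ia, ib in combinations(bucket, 2):
            if set(ia).isdisjoint(ib):
                solutions.append(ia + ib)
    return values, solutions


if __name__ == "__main__":
    values, solutions = toy_wagner()
    print(len(solutions), "solution(s) found")
```

Each returned solution is a tuple of 2^k indices whose hashes XOR to zero. A simple counting estimate gives roughly 2 solutions per run for these parameters, so an empty list is possible for some seeds.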

p.s. When I finish I'll probably go the closed-source route too.  While I don't expect to make thousands per day, I think with a 1-2% fee I could make a few hundred dollars per day as long as the ZEC price holds above $25.  You can talk "community" and sing kumbaya till you are blue in the face, but I expect closed-source miners will have the best performance for the next few months.  Claymore probably spends more than 40 hrs/wk on miner development, but he can be reasonably confident he'll get paid for his efforts.  While the $10K prize money may have been some incentive for Marc, now that the contest is over, the people working on open-source miners are primarily doing it out of fun.
laik2
Sr. Member
****
Offline Offline

Activity: 652
Merit: 266



View Profile WWW
December 04, 2016, 09:40:01 PM
Last edit: December 04, 2016, 10:40:18 PM by laik2
 #1644

OK... I don't really care whether it's closed or open source, as long as it works and is actively maintained.
Take Optiminer... truly crash-prone software with a 10% fee... people still use it for reasons unknown.
If you achieve the optimizations you claim you can reach, I will definitely use your work as long as you guarantee stability.

Miners Mining Platform [ MMP OS ] - https://app.mmpos.eu/
ghostfaceuk
Sr. Member
****
Offline Offline

Activity: 410
Merit: 250


View Profile
December 05, 2016, 07:26:31 AM
 #1645

After a day of racking my brain, any faster versions of the algorithm I've come up with result in close to 0 solutions.  I've had ideas that might improve only round 0 or round 8, but nothing that would make a material difference for the algorithm as a whole.  I'm not quite confident enough to say it's impossible to significantly optimize the algorithm, but I'm going back to reading AMD GCN architecture docs, which I know can help optimize the implementation.

It is possible. Claymore and Optiminer have done it. I don't see why the OSS community won't. I guess it will take more time, which is the barrier for all ZEC miners at the current trade value....
EDIT: The keyword in my post is "community". If you have an idea, just share it... I don't see any ideas popping up lately. I've been reading OpenCL and AMD GCN docs too, but my knowledge is pretty limited... if I see anything useful I think I can help, one way or another.

No, they only optimized the implementation.  The algorithm, i.e. 8 rounds of bin sorts to find collisions on 20 bits followed by a final 40-bit collision search, has not been optimized.
I've already explained the optimizations that could be done in OpenCL, which would take ~5 million core clocks per round.  That would push the performance up to ~200 sol/s on an RX 470 clocked at 1250/1750.  With GCN assembler I believe I can get that down to ~3 million core clocks per round, with performance of over 300 sol/s on an RX 470.

p.s. When I finish I'll probably go the closed-source route too.  While I don't expect to make thousands per day, I think with a 1-2% fee I could make a few hundred dollars per day as long as the ZEC price holds above $25.  You can talk "community" and sing kumbaya till you are blue in the face, but I expect closed-source miners will have the best performance for the next few months.  Claymore probably spends more than 40 hrs/wk on miner development, but he can be reasonably confident he'll get paid for his efforts.  While the $10K prize money may have been some incentive for Marc, now that the contest is over, the people working on open-source miners are primarily doing it out of fun.


Interesting to hear you think you may be able to squeeze that much from a 470.  I know there is probably a lot of work still required to fully optimise everything, but I was wondering: if you are going to proceed with the miner (be it open or closed source), will you be making a new thread for it, or will we be able to follow its progress here?
Amph
Legendary
*
Offline Offline

Activity: 3206
Merit: 1069



View Profile
December 05, 2016, 07:46:02 AM
 #1646

300 sol/s is impressive for a 470, especially after Claymore said that 30-50% gains are no longer possible, only small boosts from now on.

But 300 sol/s is nearly 100% faster, for a 470...
qwep1
Hero Member
*****
Offline Offline

Activity: 610
Merit: 500


View Profile
December 05, 2016, 08:19:40 AM
 #1647

300 sol/s is impressive for a 470, especially after Claymore said that 30-50% gains are no longer possible, only small boosts from now on.

But 300 sol/s is nearly 100% faster, for a 470...
Is that for a miner?

ghostfaceuk
Sr. Member
****
Offline Offline

Activity: 410
Merit: 250


View Profile
December 05, 2016, 08:38:57 AM
 #1648

300 sol/s is impressive for a 470, especially after Claymore said that 30-50% gains are no longer possible, only small boosts from now on.

But 300 sol/s is nearly 100% faster, for a 470...
Is that for a miner?

It's what nerdralph thinks he can get from a 470 with GCN optimisation.
Genamant
Full Member
***
Offline Offline

Activity: 730
Merit: 102


Trphy.io


View Profile
December 05, 2016, 01:08:09 PM
 #1649

300 sol/s is impressive for a 470, especially after Claymore said that 30-50% gains are no longer possible, only small boosts from now on.

But 300 sol/s is nearly 100% faster, for a 470...
Is that for a miner?

It's what nerdralph thinks he can get from a 470 with GCN optimisation.

He previously predicted the 470 could do only 160 H/s, but he might be right this time. But he said he would not share the fast miner.

laik2
Sr. Member
****
Offline Offline

Activity: 652
Merit: 266



View Profile WWW
December 05, 2016, 01:20:25 PM
 #1650

300 sol/s is impressive for a 470, especially after Claymore said that 30-50% gains are no longer possible, only small boosts from now on.

But 300 sol/s is nearly 100% faster, for a 470...
Is that for a miner?

It's what nerdralph thinks he can get from a 470 with GCN optimisation.

He previously predicted the 470 could do only 160 H/s, but he might be right this time. But he said he would not share the fast miner.
He will share it in binary form with a dev fee.

Miners Mining Platform [ MMP OS ] - https://app.mmpos.eu/
ghostfaceuk
Sr. Member
****
Offline Offline

Activity: 410
Merit: 250


View Profile
December 05, 2016, 01:26:47 PM
 #1651

300 sol/s is impressive for a 470, especially after Claymore said that 30-50% gains are no longer possible, only small boosts from now on.

But 300 sol/s is nearly 100% faster, for a 470...
Is that for a miner?

It's what nerdralph thinks he can get from a 470 with GCN optimisation.

He previously predicted the 470 could do only 160 H/s, but he might be right this time. But he said he would not share the fast miner.
He will share it in binary form with a dev fee.

Yeah, he said above it will be closed source and will have a dev fee of 1-2% built in.  I hope he does manage to get that kind of speed from the 470 cards; perhaps then he may be able to look at other algos and use his experience to boost the cards for them as well.  By releasing a miner that can use the memory to its full potential, and thus increase speeds above what is available right now for the various algos, I think more people will use it and help him make money so he can continue to develop and improve his miners, just like other coders.
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
December 05, 2016, 02:11:35 PM
 #1652

He previously predicted the 470 could do only 160 H/s, but he might be right this time. But he said he would not share the fast miner.

The 300 target is based on Marc's observation that the performance counters go up by 32 (one GDDR5 channel) for a single-byte write, and not 64 (a full cache line) as I initially expected.  This has not yet been measured in code that does sustained writes.  In the short term I should have some test code that will stress the memory controller and allow me to confirm the practical performance limits instead of just relying on datasheet specs.
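
To put rough numbers on why the write granularity matters, here is a back-of-the-envelope sketch in Python. The inputs are assumptions, not measurements: a 256-bit GDDR5 bus moving 4 transfers per memory clock at the 1750 MHz memory clock mentioned earlier in the thread, and every scattered write costing either 32 or 64 bytes of bus traffic. The names `peak_bandwidth_bytes` and `max_writes_per_second` are hypothetical helpers for illustration.

```python
def peak_bandwidth_bytes(mem_clock_hz=1750e6, bus_bits=256, transfers_per_clock=4):
    """Theoretical peak memory bandwidth in bytes per second (assumed figures)."""
    return mem_clock_hz * transfers_per_clock * bus_bits / 8


def max_writes_per_second(bytes_per_write, bandwidth=None):
    """Upper bound on scattered writes per second if each one costs
    bytes_per_write bytes of memory traffic."""
    if bandwidth is None:
        bandwidth = peak_bandwidth_bytes()
    return bandwidth / bytes_per_write


if __name__ == "__main__":
    bw = peak_bandwidth_bytes()
    print(f"peak bandwidth: {bw / 1e9:.0f} GB/s")
    for size in (32, 64):
        rate = max_writes_per_second(size)
        print(f"{size}-byte writes: {rate / 1e9:.2f} billion per second")
```

Under these assumptions, halving the cost of a write doubles the ceiling on writes per second. Sustained real-world throughput will be lower, which is what the planned test code is meant to measure.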
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
December 05, 2016, 02:14:16 PM
 #1653

If you achieve the optimizations you claim you can reach, I will definitely use your work as long as you guarantee stability.

After version 0.4, I had no problem getting Optiminer to run stable for days with safe overclock settings and good risers.  If Optiminer v 0.6 crashes on your rig, my miner probably will too.
nerdralph
Sr. Member
****
Offline Offline

Activity: 588
Merit: 251


View Profile
December 05, 2016, 02:20:58 PM
 #1654

300 sol/s is impressive for a 470, especially after Claymore said that 30-50% gains are no longer possible, only small boosts from now on.

Claymore may be limited to coding in OpenCL, which doesn't provide a means of using the GDS (global data share).  OpenCL also doesn't expose the SLC or GLC bits to control the caching policy.
https://community.amd.com/thread/208471

It's also possible that he's already tried GCN assembler and found out that the memory controller on GCN chips does not perform the way Marc and I think it should.
laik2
Sr. Member
****
Offline Offline

Activity: 652
Merit: 266



View Profile WWW
December 05, 2016, 02:22:39 PM
 #1655

If you achieve the optimizations you claim you can reach, I will definitely use your work as long as you guarantee stability.

After version 0.4, I had no problem getting Optiminer to run stable for days with safe overclock settings and good risers.  If Optiminer v 0.6 crashes on your rig, my miner probably will too.

Just to clarify: 4x RX 480, downclocked, without undervolt, measured 880/890 W at the wall, with fans spinning constantly at 70+% to stay below 60 degrees (tested Optiminer 0.6.0 last night).
Claymore v8 -i 2: 640/650 W, undervolt -100 mV, fans 35/40%, temp 59 degrees.
Silentarmy v5 (eXtremal opts + memleak fix): ~700 W, fans 25/30%, temps ~60 degrees.
I do not consider this normal. Do you?
S/s on Optiminer and Claymore are almost equal. I added 2x R9 390 and Claymore beats Optiminer by ~100 S/s, so I'm just using Win10 for now. Release a binary miner with an acceptable dev fee of 1-2% and Silentarmy's power and resource consumption, and I will use it.

Miners Mining Platform [ MMP OS ] - https://app.mmpos.eu/
Linit
Newbie
*
Offline Offline

Activity: 13
Merit: 0


View Profile
December 05, 2016, 05:26:49 PM
 #1656

Who is the developer of the new version of SA, zawawa or nerdralph?
qwep1
Hero Member
*****
Offline Offline

Activity: 610
Merit: 500


View Profile
December 05, 2016, 05:35:52 PM
 #1657

Is there a new version of the miner?

zawawa
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
December 05, 2016, 05:56:29 PM
 #1658

I will start my own fork of SA once the current rewrite is done as I am taking a different path.
It will be announced here once it is ready for public consumption. Thank you guys for the great work!

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
ioglnx
Sr. Member
****
Offline Offline

Activity: 574
Merit: 250

Fighting mob law and inquisition in this forum


View Profile
December 05, 2016, 05:58:33 PM
 #1659

I will start my own fork of SA once the current rewrite is done as I am taking a different path.
It will be announced here once it is ready for public consumption. Thank you guys for the great work!

Any idea when we'll see the first alpha, beta, or omega? Somehow this seems to be taking the wrong path, with everything splitting up... going closed source... bad feeling.

GTX 1080Ti rocks da house... seriously... this card is a beast³
Owning by now 18x GTX1080Ti :-D @serious love of efficiency
zawawa
Sr. Member
****
Offline Offline

Activity: 728
Merit: 304


Miner Developer


View Profile
December 05, 2016, 06:33:27 PM
 #1660

Oh, my fork will be open-source, so no worries. I will probably GPL it so that all the derivative works will be open-source, too.
I'm a "Free as in Freedom" kind of guy, so I am not particularly interested in closed-source miners.
If other devs would like to monetize their great skills, good luck to them! They have every right to do so.
It is just that I don't want to join the band for philosophical reasons.
All I can say about the time frame is, "It's done when it's done."
I am working VERY HARD on the new version, though.
I feel pretty bad about dragging you guys along, but you also need to understand that I have been working on Equihash for only two weeks while other devs have been doing so for months. We will see.

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ