Slava_K
|
 |
September 03, 2015, 08:47:27 PM |
|
More stable and higher hashrates. Enjoy.
Happy mining!
Lyra2REv2 on SM52 is 5% lower - but pool side is GOOD! On SM50 2-3% higher.
|
|
|
|
t-nelson
Member

Offline
Activity: 70
Merit: 10
|
 |
September 03, 2015, 09:31:21 PM |
|
Pool is looking good here now too. Good job. Now we just have to be careful polishing the porcelain. 
|
BTC: 1K4yxRwZB8DpFfCgeJnFinSqeU23dQFEMu DASH: XcRSCstQpLn8rgEyS6yH4Kcma4PfcGSJxe
|
|
|
hashbrown9000
|
 |
September 04, 2015, 01:01:58 AM |
|
@joblo, glad you got linux OC working. which release are you running?
it takes a little digging with google and man pages, and it's definitely not user friendly (i.e. work locally first to set up the initial OC of multiple cards, then deploy headless over SSH), but it does work.
|
|
|
|
joblo
Legendary
Offline
Activity: 1470
Merit: 1114
|
 |
September 04, 2015, 02:05:54 AM |
|
Quote from: hashbrown9000
@joblo, glad you got linux OC working. which release are you running?
it takes a little digging with google and man pages, and it's definitely not user friendly (i.e. work locally first to set up the initial OC of multiple cards, then deploy headless over SSH), but it does work.

Currently using Fedora 20. Although it's EOL, it's the last Fedora release supported by CUDA 6.5. I'm crossing my fingers that CUDA 7.5 gets optimized soon so I can upgrade to Fedora 22 or CentOS 7.
|
|
|
|
pokeytex
Legendary
Offline
Activity: 1504
Merit: 1002
|
 |
September 04, 2015, 02:08:29 AM |
|
@sp - is there any upgrade to the Spreadcoin miner planned in the near future?
|
|
|
|
tsiv
|
 |
September 04, 2015, 06:28:10 AM |
|
Trying to make a Windows build for the modified lyra is leaving me with a seriously puzzled face. Same exact code and nvcc generates something completely different that actually runs slower than the original, be it on a 970 or another 750 Ti. That in addition to VS insisting on rebuilding EVERYTHING after changing something for a single source file in the project file, gotta love it.
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
 |
September 04, 2015, 06:54:46 AM |
|
If you use cuda 7.5 you should build in 64 bit mode. The x86 compiler is broken.
|
|
|
|
tsiv
|
 |
September 04, 2015, 07:09:45 AM |
|
Nah, 6.5 on both boxes. Slightly older 6.5.12 on Linux and 6.5.19 (the latest 6.5 + compute 5.2 support I think) on Windows. Tried x64 builds too, doesn't seem to make much of a difference either way. Weird shit. I did manage to make the win build a little better by manually unrolling stuff, just looks like the win version of nvcc isn't really trying to figure stuff out itself. Which brings me back to weird shit.
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
 |
September 04, 2015, 07:13:08 AM |
|
If you share your code I can take a look 
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
 |
September 04, 2015, 07:16:37 AM |
|
If you look in my bmw256 mod this code would not unroll:
// #pragma unroll
// for (i = 0; i<2; i++)
//     Q[i + 16] = expand32_1(i + 16, M32, H, Q);
So I had to manually unroll it. And with the manual unroll I got fewer instructions and faster code.
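A minimal host-side sketch of what the manual unroll amounts to. The body of `expand32_1` below is a made-up placeholder (the real function lives in the bmw256 kernel and is not shown in this thread), and `loop_and_unrolled_agree` is a hypothetical helper added only to compare the two forms. It says nothing about which form nvcc compiles to faster code; it only shows that the two are interchangeable.

```cpp
#include <cstdint>

// Placeholder standing in for expand32_1 from the bmw256 kernel. The body is
// invented for illustration only; it is NOT the real BMW expansion function.
uint32_t expand32_1(int i, const uint32_t* M32, const uint32_t* H,
                    const uint32_t* Q) {
    return (M32[i % 16] + H[i % 16]) ^ Q[i - 16] ^ (Q[i - 1] << 1);
}

// Loop form: two iterations, left to the compiler to unroll (or not).
void expand_loop(const uint32_t* M32, const uint32_t* H, uint32_t* Q) {
    for (int i = 0; i < 2; i++)
        Q[i + 16] = expand32_1(i + 16, M32, H, Q);
}

// Manually unrolled form: every call now carries a literal index.
void expand_unrolled(const uint32_t* M32, const uint32_t* H, uint32_t* Q) {
    Q[16] = expand32_1(16, M32, H, Q);
    Q[17] = expand32_1(17, M32, H, Q);
}

// Runs both forms on the same made-up input and reports whether they agree.
bool loop_and_unrolled_agree() {
    uint32_t M32[16], H[16], Qa[32] = {0}, Qb[32] = {0};
    for (int i = 0; i < 16; i++) {
        M32[i] = 0x01010101u * (uint32_t)(i + 1);
        H[i] = 0x9E3779B9u ^ (uint32_t)i;
        Qa[i] = Qb[i] = (uint32_t)(i * 7 + 3);
    }
    expand_loop(M32, H, Qa);
    expand_unrolled(M32, H, Qb);
    return Qa[16] == Qb[16] && Qa[17] == Qb[17];
}
```

Calling `loop_and_unrolled_agree()` from a test or a `main` checks that the rewrite did not change the results, which is worth doing before comparing instruction counts or hashrates.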
|
|
|
|
chrysophylax
Legendary
Offline
Activity: 3094
Merit: 1093
--- ChainWorks Industries ---
|
 |
September 04, 2015, 07:26:39 AM |
|
Quote from: hashbrown9000
@joblo, glad you got linux OC working. which release are you running?
it takes a little digging with google and man pages, and it's definitely not user friendly (i.e. work locally first to set up the initial OC of multiple cards, then deploy headless over SSH), but it does work.

Quote from: joblo
Currently using Fedora 20. Although it's EOL, it's the last Fedora release supported by CUDA 6.5. I'm crossing my fingers that CUDA 7.5 gets optimized soon so I can upgrade to Fedora 22 or CentOS 7.

same here joblo ... all nvidia based miners - running fedora 20 x64 with all the latest dnf updates ... all amd based miners - running fedora 19 x64 with all the latest dnf updates also ... f20x64 - cuda 6.5 ...

i have one f22x64 cuda 7.0.28 machine that IS running - ccminer-spmod 1.5.64 using x11 - and its fine ... though hashrate is about 200KH under the 6.5 compiles ... many of the other algos ( including quark and lyra2v2 ) are 'cpu validation error' persistent ...

when the donation links are up and running - and all my other jobs are done ( official rename of granitecoin and logo and website ) - then ill work on the recompile and adjustment of oc and OS test also - with centos 7 x64 vps ... i have about 7 of those at the moment - and soon to grow to a LOT more vps in centos 7 x64 for various applications ...

i would be VERY interested if there is a dedicated page / link / site specifically for linux / fedora / oc - so that we can reference it all to ... if not - ill make one ... i think we all need it when trying to setup ( and also help ) the systems for the linux savvy ... there is just too much to wade through to get the 'right' info ...

im back online for the next few days - so off to compile the 'new' ccminer-spmod 1.5.65 in both cuda 6.5 ( f20x64 ) and 7.0 ( f22x64 ) ... wish me luck ...

btw - tsiv ... if you read this ... i have not heard back from the pm i sent you ... i would really like your details also - as i cant setup a donation server without them ... ill be publishing them in the next day or so when i iron out the little issues i have currently with them ... tanx ...

#crysx
|
|
|
|
t-nelson
Member

Offline
Activity: 70
Merit: 10
|
 |
September 04, 2015, 07:45:57 AM |
|
Quote from: tsiv
That in addition to VS insisting on rebuilding EVERYTHING after changing something for a single source file in the project file, gotta love it.

Pretty sure one of my PRs from yesterday should've taken care of that. Unless you're touching a header, in which case you're probably up a creek. I think every header includes every other header.
|
|
|
|
pallas
Legendary
Offline
Activity: 2716
Merit: 1094
Black Belt Developer
|
 |
September 04, 2015, 07:52:13 AM |
|
Quote from: sp_
If you look in my bmw256 mod this code would not unroll:
// #pragma unroll
// for (i = 0; i<2; i++)
//     Q[i + 16] = expand32_1(i + 16, M32, H, Q);
So I had to manually unroll it. And with the manual unroll I got fewer instructions and faster code.

That's interesting. Any idea why it's not unrolling it automatically in this case? Did you try this:
for (i = 16; i<18; i++)
    Q[i] = expand32_1(i, M32, H, Q);
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
 |
September 04, 2015, 08:12:13 AM |
|
Quote from: tsiv
Nah, 6.5 on both boxes. Slightly older 6.5.12 on Linux and 6.5.19 (the latest 6.5 + compute 5.2 support I think) on Windows. Tried x64 builds too, doesn't seem to make much of a difference either way. Weird shit. I did manage to make the win build a little better by manually unrolling stuff, just looks like the win version of nvcc isn't really trying to figure stuff out itself. Which brings me back to weird shit.

You should fork my branch and merge the lyra2 changes. My fork is already 500 kH/s faster than djm34's open-source version without modding lyra2 (only the other algos). Big donations are waiting. You made some improvements to x11 (simd) and x13, so your handle is still in the credits. But that was a year ago.
|
|
|
|
scryptr
Legendary
Offline
Activity: 1798
Merit: 1028
|
 |
September 04, 2015, 11:05:03 AM Last edit: September 04, 2015, 05:21:40 PM by scryptr |
|
11 Mh/s QUARK, RELEASE 65---

With a little tuning, and using the "cpu-mining", "-C" switch, I was able to get these results with my Win 7 x64 work computer and an EVGA GTX 960 SSC graphics card:

[Image: EVGA GTX 960 SSC mining Quark]

The card is mining with SP-mod release 65, and a +80 core/+240 mem overclock. There may be room for better and faster tuning, but this appears to be a stable setting for my machine and card. Higher overclocks bring as much as 11.2Mh/s results, but have been less stable.

Earlier this week, this card was mining Quark at 10.6Mh/s. With the cpu-mining switch, "-C", performance has improved. My other rigs on Linux have shown similar gains. My 6x EVGA 750ti FTW rig mines at 40Mh/s, up from 38.5Mh/s, and the cards run from 6.6Mh/s to 6.8Mh/s each. My EVGA GTX 970 FTW+ cards now mine Quark at 16.5Mh/s each, up from 14-15Mh/s each.

I hope the other bugs are worked out, I'd like to try solo-mining VertCoin. I also noticed the lower poolside VTC hash-rate reports within the last 2 releases, hope it is fixed. --scryptr

EDIT: Better results are obtained when using an intensity slightly less than the maximum acceptable/stable. My launch string:

./ccminer -a quark -i 23.9 -C --cpu-priority 5 -o stratum+tcp://quark.pool.com:port -u a -p x

[Image: EVGA GTX 960 SSC with OverClock, mining Quark]

My clocks are currently +90core/+270mem. My results are +160kh/s from the first (top) posted pic. Adjust per your hardware. --scryptr
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
 |
September 04, 2015, 11:52:26 AM |
|
Quote from: sp_
If you look in my bmw256 mod this code would not unroll:
// #pragma unroll
// for (i = 0; i<2; i++)
//     Q[i + 16] = expand32_1(i + 16, M32, H, Q);
So I had to manually unroll it. And with the manual unroll I got fewer instructions and faster code.

Quote from: pallas
That's interesting. Any idea why it's not unrolling it automatically in this case? Did you try this:
for (i = 16; i<18; i++)
    Q[i] = expand32_1(i, M32, H, Q);

I don't know. But I know that in my change the loop was working on constant data, and when I unrolled it manually the constant data no longer had to be calculated at run time, and fewer instructions was the result.
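One way to picture the "constant data" effect, as a sketch under stated assumptions: if part of the work depends only on the loop index, a literal index lets that part be evaluated at compile time. The names `index_constant`, `mix_runtime` and `mix_fixed` are hypothetical, and the multiplier is only an illustrative index-dependent constant, not code taken from the bmw256 mod. Whether a given compiler folds the runtime form as well depends on whether it unrolls the loop.

```cpp
#include <cstdint>

// Hypothetical term that depends only on the index i, never on the input data.
constexpr uint32_t index_constant(uint32_t i) {
    return i * 0x05555555u;
}

// Runtime-index form: unless the surrounding loop is unrolled, the
// multiplication by i may be carried out on every call.
inline uint32_t mix_runtime(uint32_t i, uint32_t m) {
    return m + index_constant(i);
}

// Fixed-index form: I is known at compile time, so the constant is
// guaranteed to be evaluated by the compiler and only the add remains.
template <uint32_t I>
constexpr uint32_t mix_fixed(uint32_t m) {
    constexpr uint32_t k = index_constant(I);
    return m + k;
}

// Compile-time checks: the constant for index 16 is available without running anything.
static_assert(index_constant(16) == 0x55555550u, "folded at compile time");
static_assert(mix_fixed<16>(1) == 0x55555551u, "only the add is left");
```

Both forms return the same value for the same index, for example `mix_runtime(17, 0)` and `mix_fixed<17>(0)`; the difference is only in how much work is left for run time.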
|
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
 |
September 04, 2015, 12:00:35 PM Last edit: September 04, 2015, 12:13:47 PM by djm34 |
|
Quote from: tsiv
Nah, 6.5 on both boxes. Slightly older 6.5.12 on Linux and 6.5.19 (the latest 6.5 + compute 5.2 support I think) on Windows. Tried x64 builds too, doesn't seem to make much of a difference either way. Weird shit. I did manage to make the win build a little better by manually unrolling stuff, just looks like the win version of nvcc isn't really trying to figure stuff out itself. Which brings me back to weird shit.

Quote from: sp_
You should fork my branch and merge the lyra2 changes. My fork is already 500 kH/s faster than djm34's open-source version without modding lyra2 (only the other algos). Big donations are waiting.

hmmm. I doubt that... I tried to use your modified kernels (cubehash, blakekeccak, bmw) and I mostly see no difference. there is some variability in the results, but over a medium/long run it settles at the same values I get with the standard kernels...
edit: actually the main difference I saw from my original settings came from raising the intensity (which is a parameter adjustable by the user even in my release)
|
djm34 facebook page BTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
 |
September 04, 2015, 12:12:01 PM |
|
Quote from: djm34
hmmm. I doubt that... I tried to use your modified kernels (cubehash, blakekeccak, bmw) and I mostly see no difference. there is some variability in the results, but over a medium/long run it settles at the same values I get with the standard kernels...

If the values go down over time it means that your cards are throttling, because of heat or too low voltage. On my gtx 970 the miner is mining 500 kH/s faster than yours. Release 62 standard clocks: (the 980ti is clocked at 1260 on the core)
|
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
 |
September 04, 2015, 12:24:43 PM |
|
Quote from: djm34
hmmm. I doubt that... I tried to use your modified kernels (cubehash, blakekeccak, bmw) and I mostly see no difference. there is some variability in the results, but over a medium/long run it settles at the same values I get with the standard kernels...

Quote from: sp_
If the values go down over time it means that your cards are throttling, because of heat or too low voltage. On my gtx 970 the miner is mining 500 kH/s faster than yours. Release 62 standard clocks: (the 980ti is clocked at 1260 on the core)

well, the argument isn't really relevant: if throttling happens, it happens in the same way for every kernel (slow or fast), so if a kernel is faster it will remain faster regardless of any throttling, and here that isn't the case... (test was done using default clock and tdp target of 100%)
|
|
|
|
pallas
Legendary
Offline
Activity: 2716
Merit: 1094
Black Belt Developer
|
 |
September 04, 2015, 12:28:26 PM |
|
Quote from: djm34
hmmm. I doubt that... I tried to use your modified kernels (cubehash, blakekeccak, bmw) and I mostly see no difference. there is some variability in the results, but over a medium/long run it settles at the same values I get with the standard kernels...

Quote from: sp_
If the values go down over time it means that your cards are throttling, because of heat or too low voltage. On my gtx 970 the miner is mining 500 kH/s faster than yours. Release 62 standard clocks: (the 980ti is clocked at 1260 on the core)

Quote from: djm34
well, the argument isn't really relevant: if throttling happens, it happens in the same way for every kernel (slow or fast), so if a kernel is faster it will remain faster regardless of any throttling, and here that isn't the case... (test was done using default clock and tdp target of 100%)

If a kernel is faster it probably also draws more power, which in turn means more heat and so a higher chance of throttling. If an enhancement to a kernel has the same performance/watt ratio as the original, the card may throttle and deliver the same performance using the same power but at a lower clock speed. I'm speaking generally, as I don't know if it's valid for this specific case.
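The general point can be put into a toy model. Everything here is an assumption made for illustration: the struct and function names are hypothetical, power is taken to scale linearly with hashrate (real cards also change voltage with clock), and no numbers come from measurements in this thread.

```cpp
#include <algorithm>

// Outcome of running one kernel on a power-limited card (toy model).
struct CappedRun {
    double hashrate;  // hashes per second actually achieved
    double clock_hz;  // clock the card settles at
};

// Toy model: hashrate = hashes_per_cycle * clock, power = joules_per_hash * hashrate.
// The card lowers its clock until power fits under power_cap_w.
CappedRun run_under_power_cap(double hashes_per_cycle, double joules_per_hash,
                              double max_clock_hz, double power_cap_w) {
    double uncapped = hashes_per_cycle * max_clock_hz;  // limit set by the clock
    double capped = power_cap_w / joules_per_hash;      // limit set by the power cap
    double hashrate = std::min(uncapped, capped);
    return {hashrate, hashrate / hashes_per_cycle};
}
```

With made-up inputs such as `run_under_power_cap(0.010, 1e-5, 1.4e9, 130.0)` for the original kernel and `run_under_power_cap(0.011, 1e-5, 1.4e9, 130.0)` for a kernel that does 10% more work per cycle at the same energy per hash, both runs hit the power cap and end at the same hashrate, with the second settling at a lower clock. If the cap is raised far enough that neither run is power-limited, the second kernel comes out ahead, which is the case djm34 describes.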
|
|
|
|
|