Bitcoin Forum
November 14, 2024, 08:59:06 AM *
Author Topic: CCminer(SP-MOD) Modded GPU kernels.  (Read 2347579 times)
bensam1231
Legendary
*
Offline Offline

Activity: 1764
Merit: 1024


View Profile
July 09, 2015, 04:08:37 AM
 #3961

7x970 - Lyra2 won't start (out of memory)
4GB system memory
7x 970?! Mobos for that are so expensive...
You need at least as much RAM as VRAM here, something like 28 GB, to run that kind of system.



Surely there has to be a workaround. I mean the memory/swap doesn't even seem to be allocated let alone used, not even for a second.
Something like initializing the cards one after the other instead of all at the same time or something? Or giving the cards different jobs instead of working together on one big job? I have no idea but I'm sure there's a way.

I also agree; the same thing was happening to me with Neoscrypt, and I just had to throw more memory at it even though system memory basically isn't used at all.

If it just uses it to 'load' into the VRAM on the miner, an asynchronous load should help (load each card into memory, then into VRAM, one at a time). Right now, though, I don't really see any indication of memory usage on the system.

I buy private Nvidia miners. Send information and/or inquiries to my PM box.
sp_ (OP)
Legendary
*
Offline Offline

Activity: 2954
Merit: 1087

Team Black developer


View Profile
July 09, 2015, 05:55:11 AM
 #3962

7x970 - Lyra2 won't start (out of memory)
4GB system memory

-Upgrade to the latest NVidia drivers (22-jun-2015)
-Add 16GB virtual ram

NVIDIA fixed a memory allocation bug in their latest driver.

If it still doesn't work, reduce the intensity,

e.g. -i 17
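
To illustrate why lowering the intensity can help, here is a rough sketch. It assumes the commonly described ccminer convention that an integer intensity n launches about 2^n threads per kernel call; that mapping and the function name are assumptions for illustration, not something stated in this thread. Fractional values such as 16.5 are not modelled.

```python
def threads_for_intensity(intensity: int) -> int:
    """Approximate threads per launch, ASSUMING threads = 2 ** intensity."""
    if not 0 <= intensity <= 31:
        raise ValueError("intensity outside the illustrative 0..31 range")
    return 1 << intensity


# Compare a few nearby settings.
for i in (16, 17, 18):
    print(f"-i {i}: {threads_for_intensity(i)} threads")
```

Under that assumption, each step down in intensity halves the number of threads in flight, so any buffers sized per thread would shrink with it.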

Team Black Miner (ETHB3 ETH ETC VTC KAWPOW FIROPOW EVRPROGPOW MEOWPOW + dual mining + tripple mining.. https://github.com/sp-hash/TeamBlackMiner
sp_ (OP)
Legendary
*
Offline Offline

Activity: 2954
Merit: 1087

Team Black developer


View Profile
July 09, 2015, 06:35:29 AM
 #3963

It's written on the main page: Yiimp is not an "autotrade" platform... So, like other pools, you mine the currency you want with the -right- currency address. I don't want to pay the whole of China, which is using SHA farms, in VTC (or BTC).
The pool is working and pays what is mined... I don't want a second full-time job as an exchange. Consider the fees a donation for the new algos... Some are set very high because we are doing "private" tests... you can still mine on those, but it's done to reduce "anonymous" users...

http://yiimp.ccminer.org/

This pool looks promising. Can you please add

sharkcoin(quark),
Digibyte(skein)
Myriadcoin(skein)


bensam1231
Legendary
*
Offline Offline

Activity: 1764
Merit: 1024


View Profile
July 09, 2015, 06:38:08 AM
 #3964

7x970 - Lyra2 won't start (out of memory)
4GB system memory

-Upgrade to the latest NVidia drivers (22-jun-2015)
-Add 16GB virtual ram

NVIDIA fixed a memory allocation bug in their latest driver.

If it still doesn't work, reduce the intensity,

e.g. -i 17

I understand the whole throw more memory at it thing or reduce intensity, we basically went over the same thing when I was having Neo problems due to out of memory. Is there any reason the system doesn't actually use any memory and it's still getting these messages? The memory usage goes up slightly, but if you look at Resource Monitor, the system is barely using any of the available memory... Pagefile or Hardware.

sp_ (OP)
Legendary
*
Offline Offline

Activity: 2954
Merit: 1087

Team Black developer


View Profile
July 09, 2015, 07:11:18 AM
Last edit: July 09, 2015, 07:23:29 AM by sp_
 #3965

SP_ RELEASE dot 54--
SP_'s reduction of register use in Lyra2 appears to have reduced some of the memory requirement.    I have been able to increase my intensity setting from my old standard of "-i 16.5" to higher values and see hash rate improvement, but the setting varies per machine.  My initial results, all for Lyra2:
  GTX 750 Ti FTW - 1080-1100 kH/s per card (Linux)
  GTX 750 Ti SC - 1140-1150 kH/s per card (Win 8)
  GTX 960 2GB SSC - 1220-1240 kH/s per card (Win 7)
  GTX 960 4GB FTW - 1220-1240 kH/s per card (Win 8)
  GTX 970 4GB FTW+ - 2 MH/s per card (Linux)
The windows machines allow for easy software overclocking.  I still need to learn the command line API flags for Linux, and probably need to re-install Linux with the latest drivers for proper use.
If I move to CUDA Toolkit 7.5, will the SP_ releases still compile on Linux?       --scryptr

Since I use half the threads per block compared to djm34's version, you can add 1 to the maximum intensity you used in the old miner and it still runs on the 750 Ti.
But running it without the intensity parameter should give a performance increase as well. Crypto Mining Blog measured it to be +150 kH/s on the GTX 980 with the default intensity.

Since I am not a registered CUDA developer, I cannot download CUDA 7.5 and test.

Epsylon3
Legendary
*
Offline Offline

Activity: 1484
Merit: 1082


ccminer/cpuminer developer


View Profile WWW
July 09, 2015, 07:31:35 AM
Last edit: July 09, 2015, 09:27:28 AM by Epsylon3
 #3966


The windows machines allow for easy software overclocking.  I still need to learn the command line API flags for Linux, and probably need to re-install Linux with the latest drivers for proper use.

If I move to CUDA Toolkit 7.5, will the SP_ releases still compile on Linux?       --scryptr

Yes, it will compile, but most of sp's "fine tuning" is done "at the register" level: a kernel uses a given number of registers, which can differ by platform (OS and 32/64-bit) and also by GPU SM version.
If you change the OS or the SDK, some kernels will need to be retuned to fit a certain number of registers (one extra register can reduce the overall speed a lot). I did this work on Linux with nvprof because it's faster to do and in general also benefits Windows...

The Linux driver in the CUDA 7.5 RC is 352.07, which is older than the one I recommend (352.21), which has the power limit features. Otherwise, you can install it easily on all distributions with the .run installer (tested on Ubuntu 14, Debian 7, Slackware and Fedora 22), and both can be installed at once.

Regarding the "overclocking" functions, I think NVIDIA took a step but didn't really finish the implementation... Power limit values seem to work; application clocks, I'm not sure, except that it changes the pstate to P0...

http://yiimp.ccminer.org/

This pool looks promising. Can you please add

sharkcoin(quark),
Digibyte(skein)
Myriadcoin(skein)


I'm currently working on a proper way to mine only the coin set by your address; that's why there is only one coin per algo for the moment...

BTC: 1FhDPLPpw18X4srecguG3MxJYe4a1JsZnd - My Projects: ccminer - cpuminer-multi - yiimp - Forum threads : ccminer - cpuminer-multi - yiimp
sp_ (OP)
Legendary
*
Offline Offline

Activity: 2954
Merit: 1087

Team Black developer


View Profile
July 09, 2015, 09:05:30 AM
 #3967

The compiler in CUDA 7 produces shitty code. All the AES algos got an increased register count and the program is spilling to memory; performance is lost, and the hash is broken.

Same on AMD (Omega drivers): performance is lost compared to the 14.6 and 14.7 drivers.


djm34
Legendary
*
Offline Offline

Activity: 1400
Merit: 1050


View Profile WWW
July 09, 2015, 09:08:22 AM
 #3968

7x970 - Lyra2 won't start (out of memory)
4GB system memory
7x 970?! Mobos for that are so expensive...
You need at least as much RAM as VRAM here, something like 28 GB, to run that kind of system.



Surely there has to be a workaround. I mean the memory/swap doesn't even seem to be allocated let alone used, not even for a second.
Something like initializing the cards one after the other instead of all at the same time or something? Or giving the cards different jobs instead of working together on one big job? I have no idea but I'm sure there's a way.

I also agree; the same thing was happening to me with Neoscrypt, and I just had to throw more memory at it even though system memory basically isn't used at all.

If it just uses it to 'load' into the VRAM on the miner, an asynchronous load should help (load each card into memory, then into VRAM, one at a time). Right now, though, I don't really see any indication of memory usage on the system.
If you open MSI Afterburner and watch both the RAM and pagefile graphs, you'll see it gets allocated (more on the pagefile than in memory), so maybe increasing the pagefile could work.
There isn't really a workaround on the code side: global memory variables have to be allocated from the host, and cudaMalloc works in mysterious ways...
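
As a worked example of the rule of thumb quoted in this thread (host memory at least equal to total VRAM), here is a small sketch. Counting RAM plus pagefile as the available host memory is an assumption based on the pagefile suggestion above, and the function name is hypothetical. It is a rough heuristic from the posts, not a documented CUDA requirement.

```python
def host_memory_budget_gb(num_cards, vram_per_card_gb, ram_gb, pagefile_gb=0):
    """Return (required, available, enough) under the thread's rule of thumb.

    ASSUMPTION: required host memory = total VRAM, and
    available host memory = physical RAM + pagefile.
    """
    required = num_cards * vram_per_card_gb
    available = ram_gb + pagefile_gb
    return required, available, available >= required


# The rig discussed here: 7 cards with 4 GB each, 4 GB of system RAM.
print(host_memory_budget_gb(7, 4, 4))       # no pagefile
print(host_memory_budget_gb(7, 4, 4, 16))   # with a 16 GB pagefile
print(host_memory_budget_gb(7, 4, 4, 24))   # with a 24 GB pagefile
```

By this heuristic alone, 4 GB of RAM only reaches the 28 GB figure with about 24 GB of pagefile; the driver update and a lower intensity mentioned earlier may reduce what is actually needed.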

djm34 facebook page
BTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze
Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw
djm34
Legendary
*
Offline Offline

Activity: 1400
Merit: 1050


View Profile WWW
July 09, 2015, 09:22:57 AM
 #3969

The compiler in CUDA 7 produces shitty code. All the AES algos got an increased register count and the program is spilling to memory; performance is lost, and the hash is broken.

Same on AMD (Omega drivers): performance is lost compared to the 14.6 and 14.7 drivers.


Did you get an access violation in cuda_hefty? How did you solve it?

CUDA 7.5 does, however, give some performance boost in Lyra.

The main problem is that it will be difficult to stay on some older version of cuda forever...

sp_ (OP)
Legendary
*
Offline Offline

Activity: 2954
Merit: 1087

Team Black developer


View Profile
July 09, 2015, 09:37:33 AM
 #3970

X11 with CUDA 7 is 10% slower. How fast is it with CUDA 7.5?

Epsylon3
Legendary
*
Offline Offline

Activity: 1484
Merit: 1082


ccminer/cpuminer developer


View Profile WWW
July 09, 2015, 09:59:07 AM
 #3971

Slower than 6.5 on Windows.

pallas
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
July 09, 2015, 11:04:02 AM
 #3972

Same on AMD (Omega drivers): performance is lost compared to the 14.6 and 14.7 drivers.

Depends on the hash and on the card's chip. Hawaii Groestl got a 25% boost on 14.12 and WhirlpoolX +10% on 15.3, for example.

sp_ (OP)
Legendary
*
Offline Offline

Activity: 2954
Merit: 1087

Team Black developer


View Profile
July 09, 2015, 12:49:33 PM
 #3973

http://hashpower.co/ (yaamp clone) is currently paying 0.7 BTC/GHASH for quark. Has anyone tried this pool?

pallas
Legendary
*
Offline Offline

Activity: 2716
Merit: 1094


Black Belt Developer


View Profile
July 09, 2015, 12:51:22 PM
 #3974

http://hashpower.co/ (yaamp clone) is currently paying 0.7 BTC/GHASH for quark. Has anyone tried this pool?

I did, and didn't get my payments because they used too low a transaction fee.
Furthermore, payments have been stopped AFAICS.

sp_ (OP)
Legendary
*
Offline Offline

Activity: 2954
Merit: 1087

Team Black developer


View Profile
July 09, 2015, 01:09:14 PM
 #3975

http://hashpower.co/ (yaamp clone) is currently paying 0.7 BTC/GHASH for quark. Has anyone tried this pool?
I did, and didn't get my payments because they used too low a transaction fee.
Furthermore, payments have been stopped AFAICS.

I will try it out, with payouts in DASH.

Neoscrypt 20BTC/GHASH

bensam1231
Legendary
*
Offline Offline

Activity: 1764
Merit: 1024


View Profile
July 09, 2015, 02:05:12 PM
 #3976

7x970 - Lyra2 won't start (out of memory)
4GB system memory
7x 970?! Mobos for that are so expensive...
You need at least as much RAM as VRAM here, something like 28 GB, to run that kind of system.



Surely there has to be a workaround. I mean the memory/swap doesn't even seem to be allocated let alone used, not even for a second.
Something like initializing the cards one after the other instead of all at the same time or something? Or giving the cards different jobs instead of working together on one big job? I have no idea but I'm sure there's a way.

I also agree; the same thing was happening to me with Neoscrypt, and I just had to throw more memory at it even though system memory basically isn't used at all.

If it just uses it to 'load' into the VRAM on the miner, an asynchronous load should help (load each card into memory, then into VRAM, one at a time). Right now, though, I don't really see any indication of memory usage on the system.
If you open MSI Afterburner and watch both the RAM and pagefile graphs, you'll see it gets allocated (more on the pagefile than in memory), so maybe increasing the pagefile could work.
There isn't really a workaround on the code side: global memory variables have to be allocated from the host, and cudaMalloc works in mysterious ways...

So loading one card at a time, waiting for its memory allocation, then loading the next wouldn't help fix this? Do you guys already do this? The memory usage does increase, but nothing indicates the system is anywhere close to out of memory. So when this happens, I would speculate it's a 'peak' allocation right at the start of mining, where a lot of things are loaded into memory and immediately moved into VRAM, and that instant is enough to push memory usage over the top.

Maybe I'm mistaken about that. It could also just be 'holding' the memory after it moves the data from system memory to VRAM, even though it won't really ever use that much memory again.

http://hashpower.co/ (yaamp clone) is currently paying 0.7 BTC/GHASH for quark. Has anyone tried this pool?
I did, and didn't get my payments because they used too low a transaction fee.
Furthermore, payments have been stopped AFAICS.

I will try it out, with payouts in DASH.

Neoscrypt 20BTC/GHASH

Looks as though that's about what Nicehash is currently paying.

Also keep in mind that because it's such a small pool, this could be finder's 'luck' and not an expected payout. Basically, the pool got lucky finding blocks, and it's not big enough to give you a realistic representation of payouts.

sp_ (OP)
Legendary
*
Offline Offline

Activity: 2954
Merit: 1087

Team Black developer


View Profile
July 09, 2015, 02:56:12 PM
Last edit: July 10, 2015, 07:51:46 AM by sp_
 #3977

Submitted a bugfix and a speedup in quark.

The GTX 970 Windforce is now peaking at 16130 on standard clocks (up from 15800).

dga
Hero Member
*****
Offline Offline

Activity: 737
Merit: 511


View Profile WWW
July 09, 2015, 03:28:58 PM
 #3978

Submitted a bugfix and a speedup in quark.

The GTX 970 Windforce is now peaking at 16130 on standard clocks (up from 15800).

Note that release 54 has a small bug in the hash that will report lower rates on the pool.
Just fyi, gtx 980:

Code:
[2015-07-09 15:24:05] GPU #0: result for nonce $0353D188 does not validate on CPU!
[2015-07-09 15:24:09] GPU #1: result for nonce $8BFDF7B0 does not validate on CPU!

Compiled with CUDA 7.0 on Ubuntu for the 980.

It does work under 6.5 on a 750 Ti. Not sure if it's the card or the CUDA version.
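
For anyone comparing builds, a small sketch for tallying these validation failures per GPU from a saved log. The regular expression is written against the two lines shown above only; the function and variable names are hypothetical, and other log formats are not handled.

```python
import re
from collections import Counter

# Matches lines like:
# [2015-07-09 15:24:05] GPU #0: result for nonce $0353D188 does not validate on CPU!
FAIL = re.compile(
    r"GPU #(\d+): result for nonce \$([0-9A-Fa-f]+) does not validate on CPU!"
)


def count_validation_failures(log_text):
    """Return a Counter mapping GPU index -> number of failed nonces."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAIL.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = (
    "[2015-07-09 15:24:05] GPU #0: result for nonce $0353D188 does not validate on CPU!\n"
    "[2015-07-09 15:24:09] GPU #1: result for nonce $8BFDF7B0 does not validate on CPU!\n"
)
print(dict(count_validation_failures(sample)))
```

Running the same tally on logs from a CUDA 6.5 build and a CUDA 7.0 build might help separate a card problem from a toolkit problem.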

sp_ (OP)
Legendary
*
Offline Offline

Activity: 2954
Merit: 1087

Team Black developer


View Profile
July 09, 2015, 04:55:04 PM
Last edit: July 09, 2015, 05:05:31 PM by sp_
 #3979

If you want a working miner for CUDA 7.0, this is the correct branch:

https://github.com/tpruvot/ccminer

I tried to compile my fork for CUDA 7.0 and started modding a bit, but the compiler wasn't good enough. Not worth the effort:
around a 10% drop in hashrate in all algos.

If you have time, please run tpruvot's CUDA 7.0 version of quark, compare the hashrate with my 6.5 release 54-git, and post your findings.

flipclip
Member
**
Offline Offline

Activity: 111
Merit: 10


View Profile
July 09, 2015, 05:49:47 PM
 #3980


How do you do checkouts then?  I'm used to the command line and using the sha for checkouts, so that is why I am wondering.

COMMAND LINE--

I use the command line, and refer to the commit number when posting about performance.  The sha will verify checksum, and is very precise for that purpose.  Commit numbers are sequential.

The line, "git clone https://github.com/sp-hash/ccminer", should clone the latest commit.  If I am wrong, please tell me!

--scryptr

Yes, that is the correct command to get the latest commit. I was just wondering if you used some type of command to get a previous commit via "COMMIT #". An example would be someone saying "COMMIT #820 is doing faster lyra still" (a completely made-up example, by the way) when you are already on COMMIT #843, so you use a git command to go back to COMMIT #820 and try compiling from that point in the past. I've always used "git checkout sha" to go "back in time" (hence the reason I always find the sha information more informative than "COMMIT #"), but I was thinking maybe I was missing a command.
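
One possible way to turn a "COMMIT #" into a sha, sketched under the assumption that the history is linear (no merges), so that commit number N is simply the Nth commit counting from the oldest. The helper names are hypothetical; the only git command used is `git rev-list --reverse HEAD`.

```python
import subprocess


def nth_commit(shas_oldest_first, number):
    """Return the sha for a 1-based commit number (oldest commit is #1)."""
    if not 1 <= number <= len(shas_oldest_first):
        raise ValueError(
            f"commit number {number} out of range 1..{len(shas_oldest_first)}"
        )
    return shas_oldest_first[number - 1]


def list_commits(repo_dir="."):
    """Ask git for every commit reachable from HEAD, oldest first."""
    result = subprocess.run(
        ["git", "rev-list", "--reverse", "HEAD"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.split()


# Usage inside a cloned repository (not executed here):
#   sha = nth_commit(list_commits(), 820)
#   then run: git checkout <sha>
```

With merge commits in the history the ordering is no longer a simple sequence, so the sha remains the unambiguous reference.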