pallas
Legendary
Offline
Activity: 2716
Merit: 1094
Black Belt Developer
|
|
September 07, 2015, 05:47:48 PM |
|
I have a 200KHASH lyra improvement ready for the 750ti (+5%), but it is slower on the 970. Need to tweak it some more.
Great! If I can help, just ask.
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2954
Merit: 1087
Team Black developer
|
|
September 07, 2015, 05:49:36 PM |
|
I have a 200KHASH lyra improvement ready for the 750ti (+5%), but it is slower on the 970. Need to tweak it some more.
Great! If I can help, just ask.
In the optimization I use half of the registers and more level 1 cache. Will submit soon.
|
|
|
|
Grout
|
|
September 07, 2015, 06:22:47 PM |
|
Hijacking SP_'s thread again (sorry), but you guys started it ;-)
I just finished filling my rig up with 750Tis (Gigabyte, 4 windforce OC, 2 LP) and I'm starting to see interesting heat patterns: the cards on the left suck air from the back of their neighbor and are significantly hotter. I can guess which card is where just by looking at the temperatures and fan speeds... The windforce are handling it like champs (hottest at 57°C), but the LP are struggling (hottest at 68°C). I put some case fans I had lying around on top of the cards, which brought them to 50/60, but I'd like a cleaner solution.
I'm contemplating 3 options and I'd like your input on which you think is the best one, or even a different one:
- Build a larger rig so I can spread the cards more (3cm between cards on the current one): not sure it would change much, since the cards don't dissipate enough to create a real convection movement between them
- Put a big dumb house fan in front of it and blow on the entire rig: simple, efficient, really ugly
- Encase the rig and control the air movement by letting air enter at the bottom and extracting it at the top with standard computer fans (I'm thinking 5*140mm, directly above the cards): interesting from an engineering standpoint, would even allow me to filter the intakes to limit dust buildup. But aerodynamics are hard, especially when there's a big mess of cables in the middle...
What do you think?
|
|
|
|
sp_ (OP)
|
|
September 07, 2015, 06:38:29 PM |
|
I have submitted now. I was unable to find the right launch config for the 970, so this optimization is just for the 750ti.
With tpb52=9 and the 50 kernel the 970 was just a bit slower. But I had to special-case the code and run separate code for compute 5.2 devices.
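For readers following along, the per-architecture special-casing described above could look roughly like the sketch below. This is not the ccminer code: the kernel names, the `select_config` helper and every threads-per-block value are placeholders made up for illustration. The only hard facts used are that the GTX 750 Ti is a compute 5.0 device and the GTX 970 is a compute 5.2 device.

```python
from typing import Dict, NamedTuple, Tuple


class LaunchConfig(NamedTuple):
    kernel: str  # hypothetical kernel variant name
    tpb: int     # threads per block (placeholder value, not a tuned one)


# Placeholder table keyed by (major, minor) compute capability.
# None of these names or numbers are taken from ccminer.
CONFIGS: Dict[Tuple[int, int], LaunchConfig] = {
    (5, 0): LaunchConfig("lyra2v2_optimized", 32),  # e.g. GTX 750 Ti
    (5, 2): LaunchConfig("lyra2v2_previous", 32),   # e.g. GTX 970 / 980
}

# Anything not listed falls back to the older, known-good path.
FALLBACK = LaunchConfig("lyra2v2_previous", 32)


def select_config(major: int, minor: int) -> LaunchConfig:
    """Return the launch config for a device's compute capability."""
    return CONFIGS.get((major, minor), FALLBACK)


if __name__ == "__main__":
    for name, cc in [("GTX 750 Ti", (5, 0)), ("GTX 970", (5, 2))]:
        print(name, select_config(*cc))
```

Keeping the choice in one table means a new launch config for the 970 only touches one entry once the right values are found.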
|
|
|
|
sp_ (OP)
|
|
September 07, 2015, 06:55:54 PM |
|
The 980ti is doing 15.7 MHASH with the opensource kernel. Core @ 1270 MHz, mem 1600.
|
|
|
|
sp_ (OP)
|
|
September 07, 2015, 07:09:45 PM Last edit: September 07, 2015, 07:28:07 PM by sp_ |
|
Djm34 posted this pic: His private kernel is doing 15.4MHASH on the 980
|
|
|
|
sp_ (OP)
|
|
September 07, 2015, 07:13:58 PM |
|
- Put a big dumb house fan in front of it and blow on the entire rig - simple, efficient, really ugly
This is what the scrypt miners did with their 6x 280x rigs. X11, quark and lyra2v2 use less memory and power, so less heat is generated.
|
|
|
|
sp_ (OP)
|
|
September 07, 2015, 07:20:51 PM |
|
The 980ti is doing 15.7 MHASH with the opensource kernel. Core @ 1270 MHz, mem 1600.
980Ti? Well, I was going to compare 290X to 980, but since I recently flashed my Fury to unlock the extra CUs (nothing wrong with them, lucky me) - I've got an air-cooled Fury X to compare with 980Ti. It does a little over 15.5MH/s peak, but can fluctuate to around 15.2MH/s - I'll leave Freya running to get a nice average.
Pretty good when the opensource kernel is doing 4MHASH on the r9 280x.
|
|
|
|
sp_ (OP)
|
|
September 07, 2015, 07:29:30 PM |
|
I know - but it's not good enough. There's two or three different methods I have thought of to optimize this - looks like this one is just all right. 280X is doing in excess of 7.6MH/s.
My 970 is doing 9.85MHASH with the opensource kernel. And the 980 around 12 I think. Not tested. And I have removed the shfl instruction from djm34's implementation, so you can try to convert it to OpenCL. he-he
|
|
|
|
sp_ (OP)
|
|
September 07, 2015, 07:42:49 PM Last edit: September 07, 2015, 08:28:52 PM by sp_ |
|
Another binary:
- Faster lyra2v2 on gtx 750ti (+5%)
- Faster quark. (Pallas)
- Fixed crash in the whirlpoolx algo on linux (T Nelson)
- Other stability fixes by T Nelson.
- Fixed broken hash in cuda 7.0 (x11) (but cuda 7 is slower)
1.5.66(sp-MOD) is available here: (07-09-2015) https://github.com/sp-hash/ccminer/releases/
The sourcecode is available here: https://github.com/sp-hash/ccminer
|
|
|
|
sp_ (OP)
|
|
September 07, 2015, 07:46:03 PM |
|
I know - but it's not good enough. There's two or three different methods I have thought of to optimize this - looks like this one is just all right. 280X is doing in excess of 7.6MH/s.
My 970 is doing 9.85MHASH with the opensource kernel. And the 980 around 12 I think. Not tested. And I have removed the shfl instruction from djm34's implementation, so you can try to convert it to OpenCL. he-he
I don't need ideas for improvements - I've got PLENTY to try. I need more time and energy lol
Yes, I think DJM34 did a full rewrite to reach the number he has now. I am only increasing the hashrate in small steps.
|
|
|
|
bensam1231
Legendary
Offline
Activity: 1764
Merit: 1024
|
|
September 07, 2015, 07:53:02 PM |
|
I think it's time to step up to a 980ti and have a look at how this thing works .. is it better? ... and I'm not asking about the hashrate ... #crysx
Looking at the BIOS of that card, it has an absolute maximum power consumption limit of 366W if I'm reading it correctly, which aligns perfectly with a techpowerup review. Of course that's with some crazy synthetic test like FurMark, and the usual peak consumption is about 300W. But even that is a lot. I think these bigger cards are all about scaling: they get somewhat inefficient in hash per watt at full speed but get pretty great if you decrease the power target, like I found with the 970 a while back (https://bitcointalk.org/index.php?topic=1091755.msg11636995#msg11636995). With downvolting it could be much more significant but I haven't tried it. So on one hand, low profit margins warrant efficiency with a lower power target, but then the initial card price is too much; from another point of view, if the profit margins were to increase in the future, pushing the cards to hash as fast as they can would be more profitable. Also, prices differ: in my case, with the prices I'm presented with, it isn't worth it for me to go for anything above 970s.
Yup, you can gain efficiency by lowering the TDP slider. However, you lose hashrate and clock speed doing it, to the point where it's not even worth buying the bigger cards anymore. Running a 970 at 960 speeds is kinda pointless.
I wouldn't use wood near any electronic equipment, unless the insurance agent is a friend of yours ;-) (In the event of fire, regardless of how much care you took, the insurance will not pay if they know you used wood.)
To each his own I guess, but I'm absolutely confident the wood wouldn't make any difference compared to aluminum in case of major failure.
Yeah, I use wood too. Wood has a pretty high ignition point. It's a great insulator against electricity and heat up to that point.
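On the power-target trade-off discussed earlier in this post, a quick way to see when a lower power target wins is to compare daily profit at two operating points. The sketch below is only illustrative: the two operating points, the electricity price and the revenue figures are made-up placeholders, not measurements from any card in this thread.

```python
def hashes_per_watt(hashrate_mhs: float, power_w: float) -> float:
    """Efficiency in MH/s per watt."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return hashrate_mhs / power_w


def daily_profit(hashrate_mhs: float, power_w: float,
                 revenue_per_mhs_day: float, price_per_kwh: float) -> float:
    """Daily revenue minus daily electricity cost, in the same currency."""
    revenue = hashrate_mhs * revenue_per_mhs_day
    energy_cost = (power_w / 1000.0) * 24.0 * price_per_kwh
    return revenue - energy_cost


# Placeholder operating points: (hashrate in MH/s, power draw in W).
FULL_POWER = (10.0, 150.0)
REDUCED_TARGET = (8.5, 100.0)
PRICE_PER_KWH = 0.10  # placeholder electricity price

if __name__ == "__main__":
    for label, point in [("full", FULL_POWER), ("reduced", REDUCED_TARGET)]:
        print(label, "MH/s per W:", hashes_per_watt(*point))
    for revenue in (0.05, 0.20):  # placeholder revenue per MH/s per day
        full = daily_profit(*FULL_POWER, revenue, PRICE_PER_KWH)
        reduced = daily_profit(*REDUCED_TARGET, revenue, PRICE_PER_KWH)
        print("revenue", revenue, "full", full, "reduced", reduced)
```

With these placeholder numbers the reduced power target is more profitable at the low revenue figure and the full-power setting is more profitable at the high one, which is the same argument made in the quoted post. Card purchase price is left out of the sketch.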
|
I buy private Nvidia miners. Send information and/or inquiries to my PM box.
|
|
|
|
sp_ (OP)
|
|
September 07, 2015, 08:06:07 PM |
|
Keep on the good work! Transaction ID: f1d3e7bd537cb79fc454cca016e9877c5b4c32223d657a8e8cd26a6c0bf04c51-000
Thanks a lot. PM me with your email if you want the private spreadcoinminer version 9.
|
|
|
|
joblo
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
September 07, 2015, 08:24:46 PM |
|
Hijacking SP_'s thread again (sorry), but you guys started it ;-)
I just finished filling my rig up with 750Tis (Gigabyte, 4 windforce OC, 2 LP) and I'm starting to see interesting heat patterns: the cards on the left suck air from the back of their neighbor and are significantly hotter. I can guess which card is where just by looking at the temperatures and fan speeds... The windforce are handling it like champs (hottest at 57°C), but the LP are struggling (hottest at 68°C). I put some case fans I had lying around on top of the cards, which brought them to 50/60, but I'd like a cleaner solution.
I'm contemplating 3 options and I'd like your input on which you think is the best one, or even a different one:
- Build a larger rig so I can spread the cards more (3cm between cards on the current one): not sure it would change much, since the cards don't dissipate enough to create a real convection movement between them
- Put a big dumb house fan in front of it and blow on the entire rig: simple, efficient, really ugly
- Encase the rig and control the air movement by letting air enter at the bottom and extracting it at the top with standard computer fans (I'm thinking 5*140mm, directly above the cards): interesting from an engineering standpoint, would even allow me to filter the intakes to limit dust buildup. But aerodynamics are hard, especially when there's a big mess of cables in the middle...
What do you think?
Probably not a useful suggestion given you already have the cards, but the reference cooler blows the hot air out the back of the case. It might cool better in a multi-card rig. I presume you already modified the default fan curve, but that probably won't help much if it's sucking hot air.
|
|
|
|
Grim
|
|
September 07, 2015, 08:28:54 PM Last edit: September 07, 2015, 09:09:59 PM by Grim |
|
I have a 200KHASH lyra improvement ready for the 750ti (+5%), but it is slower on the 970. Need to tweak it some more.
Great! If I can help, just ask.
In the optimization I use half of the registers and more level 1 cache. Will submit soon.
I bet the 970 also has a cut-down level 1 cache. (Its level 2 cache is cut down and actually smaller than that of a 750ti.)
Each maxwell SMM has its own level 1 cache, which is actually larger (96 kByte vs. 64 kByte) compared to the 750ti SMM. So nothing is cut in that regard, it seems.
|
|
|
|
djm34
Legendary
Offline
Activity: 1400
Merit: 1050
|
|
September 07, 2015, 08:32:10 PM |
|
I know - but it's not good enough. There's two or three different methods I have thought of to optimize this - looks like this one is just all right. 280X is doing in excess of 7.6MH/s.
My 970 is doing 9.85MHASH with the opensource kernel. And the 980 around 12 I think. Not tested. And I have removed the shfl instruction from djm34's implementation, so you can try to convert it to OpenCL. he-he
The public sgminer is already the "converted" lyra. Actually that shuffle instruction was there just to try something; it has no impact (positive or negative) on lyra kernel hashrate.
|
djm34 facebook page
BTC: 1NENYmxwZGHsKFmyjTc5WferTn5VTFb7Ze
Pledge for neoscrypt ccminer to that address: 16UoC4DmTz2pvhFvcfTQrzkPTrXkWijzXw
|
|
|
t-nelson
Member
Offline
Activity: 70
Merit: 10
|
|
September 07, 2015, 08:36:57 PM |
|
I have a 200KHASH lyra improvement ready for the 750ti (+5%), but it is slower on the 970. Need to tweak it some more.
Great! If I can help, just ask.
In the optimization I use half of the registers and more level 1 cache. Will submit soon.
I bet the 970 also has a cut-down level 1 cache. (Its level 2 cache is cut down and actually smaller than that of a 750ti.)
Weird performance on the 970? Nah... http://www.pcper.com/reviews/Graphics-Cards/NVIDIA-Discloses-Full-Memory-Structure-and-Limitations-GTX-970
I'm staying away from ass-hattery on that scale.
|
BTC: 1K4yxRwZB8DpFfCgeJnFinSqeU23dQFEMu DASH: XcRSCstQpLn8rgEyS6yH4Kcma4PfcGSJxe
|
|
|
Grim
|
|
September 07, 2015, 09:01:43 PM |
|
Actually the 970 is great for mining cuz each SMM costs about the same as in a 750ti. The 980 and 980ti charge 30%+ for the same SMM.
The memory and cache is gimped tho so that sucks for memory intensive algos but for compute intensive algos (eg. x11) the 970 is the ideal buy.
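The price-per-SMM comparison above is easy to redo with your own local prices. In the sketch below the SMM counts are the published ones for these cards (750 Ti: 5, 970: 13, 980: 16, 980 Ti: 22); the prices are placeholders only, not real quotes, so swap in whatever you can actually buy the cards for.

```python
from typing import Dict

# Published SMM counts for the Maxwell cards mentioned in this thread.
SMM_COUNT: Dict[str, int] = {
    "750ti": 5,
    "970": 13,
    "980": 16,
    "980ti": 22,
}

# PLACEHOLDER prices, for illustration only. Replace with local prices.
EXAMPLE_PRICES: Dict[str, float] = {
    "750ti": 130.0,
    "970": 330.0,
    "980": 500.0,
    "980ti": 650.0,
}


def price_per_smm(price: float, smm_count: int) -> float:
    """Card price divided by its number of SMMs."""
    if smm_count <= 0:
        raise ValueError("smm_count must be positive")
    return price / smm_count


def premium_vs(card: str, baseline: str, prices: Dict[str, float]) -> float:
    """Relative per-SMM premium of `card` over `baseline` (0.30 means +30%)."""
    base = price_per_smm(prices[baseline], SMM_COUNT[baseline])
    other = price_per_smm(prices[card], SMM_COUNT[card])
    return other / base - 1.0


if __name__ == "__main__":
    for card in SMM_COUNT:
        per_smm = price_per_smm(EXAMPLE_PRICES[card], SMM_COUNT[card])
        premium = premium_vs(card, "750ti", EXAMPLE_PRICES)
        print(card, round(per_smm, 2), "per SMM,", round(premium * 100, 1), "% vs 750ti")
```

This only compares raw SMM count per unit of money; it ignores clocks, memory and cache, which matter for the memory-intensive algos mentioned above.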
|
|
|
|
Grim
|
|
September 07, 2015, 09:13:27 PM |
|
I know - but it's not good enough. There's two or three different methods I have thought of to optimize this - looks like this one is just all right. 280X is doing in excess of 7.6MH/s.
My 970 is doing 9.85MHASH with the opensource kernel. And the 980 around 12 I think. Not tested. And I have removed the shfl instruction from djm34's implementation, so you can try to convert it to OpenCL. he-he
The public sgminer is already the "converted" lyra. Actually that shuffle instruction was there just to try something; it has no impact (positive or negative) on lyra kernel hashrate.
I wonder about the performance of the fury/fiji gpus in even more memory-intensive algos like cryptonight. With HBM tech the fury should shine there, shouldn't it?
|
|
|
|
|