tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
|
|
October 06, 2011, 04:03:13 PM |
|
Using gcc 4.6.1 with CFLAGS="-march=amdfam10 -O3 -Wall -msse2" ./configure
Getting 3.8 kh/s per core on a K10.5 @ 3.7GHz.
Guys what about AVX and AES-NI instructions in Sandy 2600K !? How far can I push my 2600K !?
AVX won't make the cache size bigger; it's intended to speed up floating-point instructions more than anything... I think scrypt is mostly integer-based.
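The cache point can be made concrete with a little arithmetic. This is a rough sketch, assuming Tenebrix uses scrypt with N=1024 and r=1 (not stated in this thread) and assuming the per-core L2 sizes listed in the comments. Under those assumptions the main scrypt scratchpad takes 128 * N * r bytes for each hash in flight:

```python
# Assumed scrypt parameters for Tenebrix (not stated in this thread).
N, r = 1024, 1

def scratchpad_bytes(n, r):
    """Size of scrypt's main scratchpad V: n blocks of 128 * r bytes."""
    return 128 * r * n

# Assumed per-core L2 sizes in KiB; check your own CPU's spec sheet.
l2_kib = {"K10 (Phenom II)": 512, "Sandy Bridge (2600K)": 256}

pad_kib = scratchpad_bytes(N, r) // 1024
for cpu, l2 in l2_kib.items():
    # How many whole scratchpads fit in one core's L2.
    print(f"{cpu}: {pad_kib} KiB scratchpad, {l2 // pad_kib} fit in {l2} KiB L2")
```

Wider vector instructions leave this footprint unchanged, which is the point being made about AVX.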
|
XMR: 44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns
|
|
|
r3v3rs3
Newbie
Offline
Activity: 22
Merit: 0
|
|
October 06, 2011, 04:04:46 PM |
|
Atom 330 @ 2.16 GHz: - 4 threads, 0.66 kH/s each - 2 threads, 0.91 kH/s each
Phenom II @ 3.6 GHz: - 4 threads, 3.62 kH/s each
Core 2 Duo (65 nm) @ 1.5 GHz: - 2 threads, 1.35 kH/s each
Debian/sid x86_64, latest cpuminer, CFLAGS="-O3 -march=whatever_cpu"
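Per-thread rates hide the total, so a quick sum helps when comparing thread counts. A minimal sketch using only the figures posted above; the `total_khs` helper name is made up for illustration:

```python
# Per-thread hash rates reported in the post above, in kH/s.
# Each entry: (label, thread count, kH/s per thread).
results = [
    ("Atom 330, 4 threads", 4, 0.66),
    ("Atom 330, 2 threads", 2, 0.91),
    ("Phenom II, 4 threads", 4, 3.62),
    ("Core 2 Duo, 2 threads", 2, 1.35),
]

def total_khs(threads, per_thread):
    """Aggregate rate: thread count multiplied by the per-thread rate."""
    return threads * per_thread

totals = {label: total_khs(n, rate) for label, n, rate in results}
for label, total in totals.items():
    print(f"{label}: {total:.2f} kH/s total")
```

On the Atom, the 4-thread run comes out ahead in total (about 2.64 vs 1.82 kH/s) even though each thread is slower.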
|
|
|
|
ArtForz
|
|
October 06, 2011, 08:45:09 PM |
|
Just pushed another small tweak, gets another 3% or so on K10s.
|
bitcoin: 1Fb77Xq5ePFER8GtKRn2KDbDTVpJKfKmpz i0coin: jNdvyvd6v6gV3kVJLD7HsB5ZwHyHwAkfdw
|
|
|
tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
|
|
October 06, 2011, 09:08:03 PM |
|
Just pushed another small tweak, gets another 3% or so on K10s.
Hey, for some reason after compiling the new code, the program no longer takes command line arguments. Not sure what's happening. It just returns the -h line no matter what I input. Also, I've been running the last version of your code at ~38kh/s and haven't gotten any blocks in about 6 hours. But maybe I'm just unlucky.
|
XMR: 44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns
|
|
|
ArtForz
|
|
October 06, 2011, 09:28:33 PM |
|
Just pushed another small tweak, gets another 3% or so on K10s.
Hey, For some reason after compiling the new code, the program no longer takes command line arguments? Not sure what's happening. It just returns the -h line no matter what I input. Also, I've been running the last version of your code at ~38kh/s and haven't gotten any blocks in about 6 hours. But maybe I'm just unlucky.
Not sure what's going on there (aka "can't reproduce issue"). My only guess is it's possibly related to removing the sha256 algos, but... that was even before I started doing the compiler's job for scrypt. Not sure what to do other than general hints along the lines of "start with a clean tree, CFLAGS="-whatever" ./configure; make".
Hrrrm... I guess you could revert the sha256 removal or drop the new scrypt.c into Tenebrix-miner and see if that also causes the same issues.
|
bitcoin: 1Fb77Xq5ePFER8GtKRn2KDbDTVpJKfKmpz i0coin: jNdvyvd6v6gV3kVJLD7HsB5ZwHyHwAkfdw
|
|
|
Beta-coiner1
|
|
October 06, 2011, 10:05:01 PM |
|
I definitely have some questions about this, considering I have rarely done CPU mining.
1. Are you guys mainly using a pool or solo mining?
2. Are there any details on optimizing further for Intel CPUs?
3. What GUI-based programs work with the Tenebrix fork?
Getting 1.72 kH/s per thread on an i5 @ 3.4 GHz.
|
|
|
|
TiagoTiago
|
|
October 07, 2011, 01:27:17 AM |
|
1. I'm on a pool.
2. I guess, but I haven't looked into it yet.
3. I don't think there is any GUI-based miner, but the text-based one that comes with it is relatively friendly: just answer a few questions and it runs.
|
(I dont always get new reply notifications, pls send a pm when you think it has happened) Wanna gimme some BTC/BCH for any or no reason? 1FmvtS66LFh6ycrXDwKRQTexGJw4UWiqDX The more you believe in Bitcoin, and the more you show you do to other people, the faster the real value will soar!
|
|
|
grod
|
|
October 07, 2011, 02:15:36 AM |
|
Up to 2.83 Khash/thread on an i7 920@2.8 ghz. Trying different compilers and optimizations, no code changes. llvm is by far the worst, got a high of 2.3 Khash/thread with that compiler. Yuck.
|
|
|
|
tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
|
|
October 07, 2011, 03:04:48 AM |
|
Up to 2.83 Khash/thread on an i7 920@2.8 ghz. Trying different compilers and optimizations, no code changes. llvm is by far the worst, got a high of 2.3 Khash/thread with that compiler. Yuck.
That's pretty wild, faster than the 1055T if that's with hyperthreading on. The Intel people in this thread are going to want to know what your configure options and compiler was for that.
|
XMR: 44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns
|
|
|
tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
|
|
October 07, 2011, 02:14:05 PM |
|
Not sure what's going on there (aka "can't reproduce issue"). My only guess is it's possibly related to removing the sha256 algos, but... that was even before I started doing the compilers job for scrypt. Not sure what to do other than general hints along the lines of "start with a clean tree, CFLAGS="-whatever" ./configure; make" Hrrrm... I guess you could revert the sha256 removal or drop the new scrypt.c into Tenebrix-miner and see if that also causes the same issues.
There is no scrypt.c in your latest released source of 1.0.2... I'm looking at the tar.gz and it's missing it. edit: It works just popping scrypt.c in, I'm getting 3.87kh/s now per core
|
XMR: 44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns
|
|
|
ArtForz
|
|
October 07, 2011, 02:16:00 PM |
|
Not sure what's going on there (aka "can't reproduce issue"). My only guess is it's possibly related to removing the sha256 algos, but... that was even before I started doing the compilers job for scrypt. Not sure what to do other than general hints along the lines of "start with a clean tree, CFLAGS="-whatever" ./configure; make" Hrrrm... I guess you could revert the sha256 removal or drop the new scrypt.c into Tenebrix-miner and see if that also causes the same issues.
There is no scrypt.c in your latest released source of 1.0.2... I'm looking at the tar.gz and it's missing it.
What release? The releases are done by Lolcust, I only keep a git at https://github.com/ArtForz/cpuminer
|
bitcoin: 1Fb77Xq5ePFER8GtKRn2KDbDTVpJKfKmpz i0coin: jNdvyvd6v6gV3kVJLD7HsB5ZwHyHwAkfdw
|
|
|
wknight
Legendary
Offline
Activity: 889
Merit: 1000
Bitcoin calls me an Orphan
|
|
October 07, 2011, 02:21:07 PM |
|
Anyone get a cpu miner to work on Solaris?
Yeah I know its a shot in the dark!
|
Mining Both Bitcoin and Litecoin.
|
|
|
grod
|
|
October 07, 2011, 02:23:21 PM |
|
That's pretty wild, faster than the 1055T if that's with hyperthreading on. The Intel people in this thread are going to want to know what your configure options and compiler was for that.
Nope, that's with HT off. Measurably less heat & power with HT off compared to on, and less than 1% difference in performance.
Latest ArtForz scrypt.c loses about 15% compared to the 2.85 high watermark with the same compiler & options on the previous version, FYI. That's some very impressive hand optimization for a particular architecture.
Unfortunately even though I'm 2.8% of the tenebrix network I'm only pulling in about 1 BTC/day with the two i7s I have mining this (which still outperforms my 5830s by a factor of 4 in terms of power/$) so it'll be a while till I feel like buying an x6 phenom to try the same compiler/options/minerd version bake-off with.
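The 2.8% figure allows a back-of-envelope estimate of the whole network's rate. A rough sketch only: it assumes both i7s run 4 threads each (quad core, HT off) at the 2.83 kH/s per thread reported earlier, which the post does not confirm, and the `implied_network_khs` helper is hypothetical:

```python
# Figures from the posts above; the thread count is an assumption.
per_thread_khs = 2.83    # reported best rate per thread on the i7 920
threads_per_cpu = 4      # assumed: quad core with HT off
cpus = 2                 # "the two i7s I have mining this"
share = 0.028            # "2.8% of the tenebrix network"

def implied_network_khs(own_khs, share):
    """Network rate implied by one miner's rate and its share of the total."""
    return own_khs / share

own = per_thread_khs * threads_per_cpu * cpus
network = implied_network_khs(own, share)
print(f"own rate: {own:.2f} kH/s, implied network: {network:.0f} kH/s")
```

Under those assumptions the network works out to somewhere around 800 kH/s. If the second i7 is a different model or runs a different thread count, the estimate shifts accordingly.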
|
|
|
|
ArtForz
|
|
October 07, 2011, 02:33:22 PM |
|
That's pretty wild, faster than the 1055T if that's with hyperthreading on. The Intel people in this thread are going to want to know what your configure options and compiler was for that.
Nope, that's with HT off. Measurably less heat & power with HT off compared to on, and less than 1% difference in performance. Latest ArtForz scrypt.c loses about 15% compared to the 2.85 high watermark with the same compiler & options on the previous version, FYI. That's some very impressive hand optimization for a particular architecture. Unfortunately even though I'm 2.8% of the tenebrix network I'm only pulling in about 1 BTC/day with the two i7s I have mining this (which still outperforms my 5830s by a factor of 4 in terms of power/$) so it'll be a while till I feel like buying an x6 phenom to try the same compiler/options/minerd version bake-off with.
Sad to hear that, as the current HEAD also improves speeds for K10 even when compiled with older gcc versions to near gcc-4.6.1 levels. Guess this one will have to stay in non-official state for now (looks like it also produces rather crap asm when compiled for 32-bit targets...).
|
bitcoin: 1Fb77Xq5ePFER8GtKRn2KDbDTVpJKfKmpz i0coin: jNdvyvd6v6gV3kVJLD7HsB5ZwHyHwAkfdw
|
|
|
tacotime
Legendary
Offline
Activity: 1484
Merit: 1005
|
|
October 07, 2011, 03:22:36 PM |
|
Not sure what's going on there (aka "can't reproduce issue"). My only guess is it's possibly related to removing the sha256 algos, but... that was even before I started doing the compilers job for scrypt. Not sure what to do other than general hints along the lines of "start with a clean tree, CFLAGS="-whatever" ./configure; make" Hrrrm... I guess you could revert the sha256 removal or drop the new scrypt.c into Tenebrix-miner and see if that also causes the same issues.
There is no scrypt.c in your latest released source of 1.0.2... I'm looking at the tar.gz and it's missing it.
What release? The releases are done by Lolcust, I only keep a git at https://github.com/ArtForz/cpuminer
Oh, I guess it's not a release then... you can get a compressed file of all the contents by clicking on the "Downloads" link on the top right of your page ( https://github.com/ArtForz/cpuminer ), but for some strange reason the latest one is missing files. Might be a github bug.
|
XMR: 44GBHzv6ZyQdJkjqZje6KLZ3xSyN1hBSFAnLP6EAqJtCRVzMzZmeXTC2AHKDS9aEDTRKmo6a6o9r9j86pYfhCWDkKjbtcns
|
|
|
bulanula
|
|
October 07, 2011, 07:43:19 PM |
|
Guys, why no Intel love !?
|
|
|
|
Bobnova
|
|
October 08, 2011, 03:23:35 AM |
|
For SandyBridge CPUs (Core i7 with four-digit model numbers):
1. git the package.
2. ./configure (screw the flags)
3. gedit Makefile
4. Find "CFLAGS = " and change that line to:
CFLAGS = -march=native -O3 -Wall -msse2 -msse3 -msse4.1 -msse4.2 -msse4 -mavx
Brought my performance from 2.6kh/s per core (running one thread per core) to 3.66-3.7kh/s per core.
Tested with 6 threads, +2 kh/s in total. HT works now! Just needed some hardcore flags, that's all.
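Rather than typing every -m flag by hand, the list can be derived from what the CPU reports. A minimal sketch, assuming a Linux box where /proc/cpuinfo has a "flags" line; the `cflags_for` and `read_cpu_flags` helpers and the mapping table are illustrative and not part of cpuminer:

```python
# Map /proc/cpuinfo feature names to the matching gcc options.
# "pni" is how Linux reports SSE3.
FEATURE_TO_GCC = {
    "sse2": "-msse2",
    "pni": "-msse3",
    "sse4_1": "-msse4.1",
    "sse4_2": "-msse4.2",
    "avx": "-mavx",
}

def cflags_for(cpu_flags, base="-march=native -O3 -Wall"):
    """Build a CFLAGS string from a whitespace-separated feature list."""
    present = set(cpu_flags.split())
    extra = [opt for feat, opt in FEATURE_TO_GCC.items() if feat in present]
    return " ".join([base] + extra)

def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the first 'flags' line from cpuinfo, or '' if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return line.split(":", 1)[1]
    except OSError:
        pass
    return ""

if __name__ == "__main__":
    print("CFLAGS = " + cflags_for(read_cpu_flags()))
```

The sketch leaves out -msse4 on the understanding that gcc treats it as shorthand for -msse4.1 plus -msse4.2. Paste the printed line into the Makefile as in the steps above.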
|
BTC: 1AURXf66t7pw65NwRiKukwPq1hLSiYLqbP
|
|
|
Bobnova
|
|
October 08, 2011, 03:57:25 AM Last edit: October 08, 2011, 04:16:31 AM by Bobnova |
|
I've managed to get vaguely close, but not really close enough, their (stupid, IMO) decision to gimp L2 is costing us badly.
It really is rather like nvidia vs ATI. nvidia has focused on floating point performance, in Folding@home and other floating point stuff they stomp ATI. ATI (AMD, whatever) has focused on integer performance. Along comes the integer based Bitcoin, and holy crap does ATI compute like mad.
Except in this case AMD spent 6 years doing minor improvements to a core that came out in 2005 (2004? Long ago) and spent the money earned from stomping the crap out of P4s on CEO paychecks. Intel meanwhile dumped money into R&D like mad and made the old P3 into the Core2, then with a nice performance lead they kept pouring money into R&D. AMD meanwhile got left in the dust, axed the dumbshit CEO and started a crash program to get caught up. Part of that program was to sacrifice some money in the form of CPU die size to get a little bit of performance, to get a little bit closer to Intel. It hurts their profit margins (which are crap, roughly speaking), but the die size spent on "excessive" L2 cache has paid off big time for scrypt-type CPU mining. Go figure. With Bulldozer they're cranking the L2 up somewhat, as it's on 32nm instead of 45nm and doesn't cost quite as badly, plus more L2 really does help performance, even with a ton of L3 floating around.
Anyway, if you're an Intel user, enable as many optimization flags as your CPU supports and that'll help somewhat, as will higher core speed, obviously (and not so obviously, since L3 speed = core speed on SandyBridge chips). If you're on socket 1156 or 1366 (three-digit Core i7 model numbers), crank up the "uncore" speed, as that's your L3 and L3 speed is the big bottleneck here. It also has the memory controller in it. RAM speed seems to help a bit too, though not a ton.
AMD users: Try turning up the CPU-NB (CPU NorthBridge) speed, that's your L3 cache+memory controller. It may help.
EDIT: In theory, -mavx can replace the entire -msseX mess, but in theory -march=native should do this whole lump automatically, and it doesn't.
|
BTC: 1AURXf66t7pw65NwRiKukwPq1hLSiYLqbP
|
|
|
bulanula
|
|
October 08, 2011, 01:57:42 PM |
|
For SandyBridge CPUs (corei7 with four digit model numbers).
git the package. ./configure (screw the flags) gedit Makefile find "CFLAGS = " Change that line to: CFLAGS = -march=native -O3 -Wall -msse2 -msse3 -msse4.1 -msse4.2 -msse4 -mavx Brought my performance from 2.6kh/s per core (running one thread per core) to 3.66-3.7kh/s per core.
Tested with 6 threads, +2 kh/s in total. HT works now! Just needed some hardcore flags, that's all.
So, on a 2600K with 8 threads I should get what !?
|
|
|
|
Bobnova
|
|
October 08, 2011, 02:57:30 PM |
|
More than what you got without compiling AVX and such in.
|
BTC: 1AURXf66t7pw65NwRiKukwPq1hLSiYLqbP
|
|
|
|