skaffen
|
|
August 26, 2013, 10:32:38 AM |
|
For example, what do you think would be faster:
A server with one dual-core AMD Opteron CPU @ 2.4 GHz
or,
A server with two physical Intel Xeon CPUs, each single core, with Hyper-Threading @ 3.6 GHz
As it turns out, the AMD will GREATLY outperform the Intel CPUs.
I respectfully disagree, sir or madam. I believe Intel has been ahead in the CPU race since the Core 2, or even the Core Duo, and since the i3/i5/i7 they have been getting further ahead, including in instructions per clock.
|
BTC: 1PwXSJnmnCMKTEw8JqtjdDVoRUhtMExiNP XPM: ANREvGk6CJYM5YWuW3eNankjGKS1ZYmAzP
|
|
|
Trillium
|
|
August 26, 2013, 10:33:20 AM |
|
Some stats for hp10 :
2 x Opteron 2376 = "chainsperday" : 2.54092020 <- Really old CPU
1 x Intel i7 3770K = "chainsperday" : 2.77305771
Still no blocks found though in almost 24h.
Hope others will share their stats so we can build a nice performance thread.
It would not work well, because it varies with difficulty.
---
I'm fairly certain that HP10 does NOT find blocks at 2 x the rate of HP9, despite the 2 x increase in chains per day value. About an hour after HP10 came out I spun up 500 x 8 core VPS instances and ran them for 6 hours. That was ~1000 CPD. I'm aware that the diff is now rising, which will skew things, but in that time I got about the same block rate (average 4 an hour) as I did when I last tested a week ago with HP9, when diff was 9.75. Allowing for some luck variation (which should be largely eliminated due to the high number of instances), I can say it is not twice as fast at finding blocks. I can't say how much faster it is without more testing, which I'm not doing, as running those 500 instances is NOT profitable by a long way.
My limited anecdotal experience suggests the same: it's not really ~100% improved mining. It sounds too good to be true anyway, but I'm sure the optimizations help somewhat.
Out of curiosity, how much did those 6 hours cost?
Also, we would expect to see changes in network diff and blocks found if there was really such a huge improvement from HP10: http://192.241.170.170/
Instead we see this is not the case.
|
BTC:1AaaAAAAaAAE2L1PXM1x9VDNqvcrfa9He6
|
|
|
Trillium
|
|
August 26, 2013, 10:35:28 AM |
|
For example, what do you think would be faster:
A server with one dual-core AMD Opteron CPU @ 2.4 GHz
or,
A server with two physical Intel Xeon CPUs, each single core, with Hyper-Threading @ 3.6 GHz
As it turns out, the AMD will GREATLY outperform the Intel CPUs.
I respectfully disagree, sir or madam. I believe Intel has been ahead in the CPU race since the Core 2, or even the Core Duo, and since the i3/i5/i7 they have been getting further ahead, including in instructions per clock.
I am not saying that one company offers superior tech to the other. I am just saying that generally speaking, especially from a historical point of view, AMD tends to do more ops per cycle. Of course there will be exceptions, and I admit I am not really familiar with the models from either company that are less than a year old.
|
BTC:1AaaAAAAaAAE2L1PXM1x9VDNqvcrfa9He6
|
|
|
paulthetafy
|
|
August 26, 2013, 10:47:50 AM |
|
---
I'm fairly certain that HP10 does NOT find blocks at 2 x the rate of HP9, despite the 2 x increase in chains per day value. About an hour after HP10 came out I spun up 500 x 8 core VPS instances and ran them for 6 hours. That was ~1000 CPD. I'm aware that the diff is now rising, which will skew things, but in that time I got about the same block rate (average 4 an hour) as I did when I last tested a week ago with HP9, when diff was 9.75. Allowing for some luck variation (which should be largely eliminated due to the high number of instances), I can say it is not twice as fast at finding blocks. I can't say how much faster it is without more testing, which I'm not doing, as running those 500 instances is NOT profitable by a long way.
My limited anecdotal experience suggests the same: it's not really ~100% improved mining. It sounds too good to be true anyway, but I'm sure the optimizations help somewhat.
Out of curiosity, how much did those 6 hours cost?
Also, we would expect to see changes in network diff and blocks found if there was really such a huge improvement from HP10: http://192.241.170.170/
Instead we see this is not the case.
Yeah, the charts do suggest a decrease in block time since HP10, which is why diff is rising. It looks to be around the 25% mark:
https://www.google.com/fusiontables/DataSource?docid=1XuJ-YSvvcWL-po53ra7UDy1u04Rprc-pclD8uwA#chartnew:id=7
500 x 8 cores for 6 hours was about $250 and I found 23 blocks, which works out somewhere around $170 worth. It was worth the test though: had it been a 100% increase, it would have given a positive ROI.
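A quick sanity check on those figures, as a minimal sketch. The inputs (6 hours, about $250 spent, 23 blocks, about $170 earned) are the ones reported in the post; the variable and function names are made up for illustration, and the "doubled" case simply assumes revenue scales linearly with the block rate at the same cost.

```python
# Figures as reported in the post; names below are illustrative only.
HOURS = 6.0
COST_USD = 250.0
BLOCKS_FOUND = 23
REVENUE_USD = 170.0


def roi(revenue, cost):
    """Return on investment as a fraction of cost (negative means a loss)."""
    return (revenue - cost) / cost


blocks_per_hour = BLOCKS_FOUND / HOURS

# ROI of the run as it actually happened.
actual_roi = roi(REVENUE_USD, COST_USD)

# Assumption: a true 2x speedup would double the blocks (and revenue)
# for the same cost.
doubled_roi = roi(2 * REVENUE_USD, COST_USD)

# Speedup factor needed just to cover the rental cost.
break_even_speedup = COST_USD / REVENUE_USD

print(f"block rate:          {blocks_per_hour:.2f} blocks/hour")
print(f"ROI as measured:     {actual_roi:+.0%}")
print(f"ROI if truly 2x:     {doubled_roi:+.0%}")
print(f"break-even speedup:  {break_even_speedup:.2f}x")
```

On these numbers the run lost roughly a third of what it cost, and a genuine doubling would indeed have turned it positive, which matches the claim in the post.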
|
|
|
|
arnuschky
|
|
August 26, 2013, 11:07:03 AM |
|
I'm fairly certain that HP10 does NOT find blocks at 2 x the rate of HP9, despite the 2 x increase in chains per day value.
I can confirm this.
|
|
|
|
mikaelh (OP)
|
|
August 26, 2013, 12:55:49 PM |
|
I'm fairly certain that HP10 does NOT find blocks at 2 x the rate of HP9, despite the 2 x increase in chains per day value.
I can confirm this.
Yup, it looks like the actual speedup doesn't match the chains/day estimate. I never did a full comparison between hp9 and hp10 myself on mainnet because it's pretty expensive to do that. So big thanks to the guys who did. I did try to adjust the chains/day estimate to account for effects of extending the sieve. It's possible there are some bugs with that. Of course the estimate isn't fully accurate in the first place either.
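To put a rough number on "not twice as fast", here is a sketch using the 23 blocks in 6 hours reported earlier in the thread. It rests on several assumptions that are not established in the thread: block finds follow a Poisson process, the HP9 baseline on a comparable fleet was about 4 blocks per hour, and difficulty stayed constant (it was actually rising, as the original report acknowledges). The function name is illustrative.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    return sum(
        math.exp(-lam) * lam**i / math.factorial(i)
        for i in range(k + 1)
    )


HOURS = 6
OBSERVED_BLOCKS = 23        # reported for the HP10 run
BASELINE_PER_HOUR = 4.0     # assumed HP9 rate ("average 4 an hour")

# Expected block counts over the run under each hypothesis.
expected_same = BASELINE_PER_HOUR * HOURS        # HP10 no faster than HP9
expected_double = 2 * BASELINE_PER_HOUR * HOURS  # HP10 truly 2x faster

# Chance of seeing 23 or fewer blocks under each hypothesis.
p_same = poisson_cdf(OBSERVED_BLOCKS, expected_same)
p_double = poisson_cdf(OBSERVED_BLOCKS, expected_double)

print(f"P(<= {OBSERVED_BLOCKS} blocks | same rate) = {p_same:.3f}")
print(f"P(<= {OBSERVED_BLOCKS} blocks | 2x rate)   = {p_double:.2e}")
```

Under these assumptions a result this low would be very unlikely if the block rate had really doubled. The rising difficulty works in the other direction, so this does not rule out a smaller real speedup.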
|
|
|
|
paulthetafy
|
|
August 26, 2013, 01:15:15 PM |
|
I'm fairly certain that HP10 does NOT find blocks at 2 x the rate of HP9, despite the 2 x increase in chains per day value.
I can confirm this.
Yup, it looks like the actual speedup doesn't match the chains/day estimate. I never did a full comparison between hp9 and hp10 myself on mainnet because it's pretty expensive to do that. So big thanks to the guys who did. I did try to adjust the chains/day estimate to account for effects of extending the sieve. It's possible there are some bugs with that. Of course the estimate isn't fully accurate in the first place either.
The CPD is a good reference point though. I think the key here is that we need a stable metric that we can use to benchmark performance, whether it's accurate in what it is attempting to measure or not. The downside now is that if you change the chains/day estimate to accurately reflect the sieve extension, when you release the next version people are going to complain that it's slower than HP10.
With regard to the sieveextensions factor, will changing this value be reflected in the CPD? I.e. if I change it and I see CPD increase, can I take that as a positive thing, regardless of whether the increase % is inaccurate?
Do you have any other ideas for performance improvements in the pipeline?
|
|
|
|
mikaelh (OP)
|
|
August 26, 2013, 01:48:05 PM |
|
I'm fairly certain that HP10 does NOT find blocks at 2 x the rate of HP9, despite the 2 x increase in chains per day value.
I can confirm this.
Yup, it looks like the actual speedup doesn't match the chains/day estimate. I never did a full comparison between hp9 and hp10 myself on mainnet because it's pretty expensive to do that. So big thanks to the guys who did. I did try to adjust the chains/day estimate to account for effects of extending the sieve. It's possible there are some bugs with that. Of course the estimate isn't fully accurate in the first place either.
The CPD is a good reference point though. I think the key here is that we need a stable metric that we can use to benchmark performance, whether it's accurate in what it is attempting to measure or not. The downside now is that if you change the chains/day estimate to accurately reflect the sieve extension, when you release the next version people are going to complain that it's slower than HP10.
With regard to the sieveextensions factor, will changing this value be reflected in the CPD? I.e. if I change it and I see CPD increase, can I take that as a positive thing, regardless of whether the increase % is inaccurate?
Do you have any other ideas for performance improvements in the pipeline?
I agree that it would be nice to have a stable metric. The issue is that if the metric is broken then it's simply misleading. The 'sieveextensions' parameter is reflected in chains/day but I have a feeling that it may be broken. And if it's broken, then you can't really trust it.
There are still things on my to-do list but I'm not sure if there's going to be anything big anymore.
|
|
|
|
paulthetafy
|
|
August 26, 2013, 02:58:34 PM |
|
I'm fairly certain that HP10 does NOT find blocks at 2 x the rate of HP9, despite the 2 x increase in chains per day value.
I can confirm this.
Yup, it looks like the actual speedup doesn't match the chains/day estimate. I never did a full comparison between hp9 and hp10 myself on mainnet because it's pretty expensive to do that. So big thanks to the guys who did. I did try to adjust the chains/day estimate to account for effects of extending the sieve. It's possible there are some bugs with that. Of course the estimate isn't fully accurate in the first place either.
The CPD is a good reference point though. I think the key here is that we need a stable metric that we can use to benchmark performance, whether it's accurate in what it is attempting to measure or not. The downside now is that if you change the chains/day estimate to accurately reflect the sieve extension, when you release the next version people are going to complain that it's slower than HP10.
With regard to the sieveextensions factor, will changing this value be reflected in the CPD? I.e. if I change it and I see CPD increase, can I take that as a positive thing, regardless of whether the increase % is inaccurate?
Do you have any other ideas for performance improvements in the pipeline?
I agree that it would be nice to have a stable metric. The issue is that if the metric is broken then it's simply misleading. The 'sieveextensions' parameter is reflected in chains/day but I have a feeling that it may be broken. And if it's broken, then you can't really trust it.
There are still things on my to-do list but I'm not sure if there's going to be anything big anymore.
Cool, thanks for the clarification. Just FYI, I did a major test (500-1000 machines from 4 to 16 cores for several hours) of both HP9 and HP10 on mainnet and for the most part found that the default settings you had selected were the optimal. When I did the HP9 test it was just about profitable enough to run for a full day and cover costs (though I had my wallet hacked/stolen along with my entire XPM collection!)
HP10 was not even close to profitable, so I won't be running any more tests unless there is a likelihood of 2x performance.
I think it's time to turn your attention to the GPU now, Mikael.
|
|
|
|
mikaelh (OP)
|
|
August 26, 2013, 03:25:53 PM Last edit: August 26, 2013, 03:56:46 PM by mikaelh |
|
Well, I found some small issues in the chains/day estimate with regard to the 'sieveextensions' parameter. The fix is now on GitHub. The estimate seems to have gone down by about 5%, so it's not a big issue. That also means that the estimate still doesn't match the actual block rates people have been reporting.
https://github.com/mikaelh2/primecoin/commit/42496a823b15fadd1a8809298c20310686d12ce9
|
|
|
|
gigawatt
|
|
August 26, 2013, 08:40:09 PM |
|
Cool, thanks for the clarification. Just FYI, I did a major test (500-1000 machines from 4 to 16 cores for several hours) of both HP9 and HP10 on mainnet and for the most part found that the default settings you had selected were the optimal.
Holy balls, and I thought running 20x cc2.8xlarge AWS instances was nuts. Out of curiosity, how does one come across so much computing power? Were you using AWS as well?
|
|
|
|
paulthetafy
|
|
August 26, 2013, 08:46:07 PM |
|
Cool, thanks for the clarification. Just FYI, I did a major test (500-1000 machines from 4 to 16 cores for several hours) of both HP9 and HP10 on mainnet and for the most part found that the default settings you had selected were the optimal.
Holy balls, and I thought running 20x cc2.8xlarge AWS instances was nuts. Out of curiosity, how does one come across so much computing power? Were you using AWS as well?
AWS with increased limits. Incidentally, the cc2.8xlarge is not the most cost effective for XPM, as the L1 cache appears to be a bottleneck. To prove it, try running just 16 threads instead of all 32 and you'll see little difference!! Or at least that was true for HP9. You can get more bang for buck with 8 cores.
|
|
|
|
gateway
|
|
August 26, 2013, 10:11:16 PM |
|
hmm hp10 blows up on my core i7 795 box..
|
|
|
|
Lyddite
|
|
August 26, 2013, 10:50:21 PM |
|
Incidentally the cc2.8xlarge is not the most cost effective for xpm as the L1 cache appears to be a bottleneck. To prove it, try running just 16 threads instead of all 32 and you'll see little difference!! Or at least that was true for hp9. You can get more bang for buck with 8 cores.
The slowdown is caused by Hyperthreading. The cc2.8xlarge instances have 32 logical cores, 16 physical cores.
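The two explanations fit together: Hyper-Threading siblings share their physical core's L1 cache, so a cache-bound workload gains little from the second logical thread. A minimal sketch of picking a thread count on that basis follows. The helper name and the default of 2 hardware threads per core are assumptions for illustration; `os.cpu_count()` reports logical CPUs, not physical cores.

```python
import os


def mining_threads(logical_cpus, threads_per_core=2):
    """Suggest one worker thread per physical core.

    threads_per_core is an assumption (2 with Hyper-Threading enabled,
    1 without); it is not detected automatically here.
    """
    if logical_cpus < 1 or threads_per_core < 1:
        raise ValueError("CPU counts must be positive")
    return max(1, logical_cpus // threads_per_core)


# cc2.8xlarge as described above: 32 logical cores, 16 physical.
print(mining_threads(32))

# The machine this runs on (os.cpu_count() may return None).
logical = os.cpu_count() or 1
print(f"{logical} logical CPUs -> {mining_threads(logical)} threads")
```

Whether halving the thread count actually costs nothing is worth measuring per machine, since the report above was for HP9 only.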
|
- Lyddite -
|
|
|
akspa
|
|
August 26, 2013, 11:06:53 PM |
|
You might need to reinstall your Visual C++ runtimes. I'm guessing either the 2010 or 2012 runtimes in this case.
|
|
|
|
atariguy
|
|
August 26, 2013, 11:14:29 PM |
|
Cool, thanks for the clarification. Just FYI, I did a major test (500-1000 machines from 4 to 16 cores for several hours) of both HP9 and HP10 on mainnet and for the most part found that the default settings you had selected were the optimal.
Holy balls, and I thought running 20x cc2.8xlarge AWS instances was nuts. Out of curiosity, how does one come across so much computing power? Were you using AWS as well?
AWS with increased limits. Incidentally, the cc2.8xlarge is not the most cost effective for XPM, as the L1 cache appears to be a bottleneck. To prove it, try running just 16 threads instead of all 32 and you'll see little difference!! Or at least that was true for HP9. You can get more bang for buck with 8 cores.
Want to know something really strange? I've been running 3 VMs on Azure. I'm doing the free trial, so I'm limited to 20 cores. I had 2 VMs with 8 cores and 1 with 4. Over about 5 days, 1 of the 8 core VMs got 4 blocks, and the other 2 got absolutely nothing. I'm burning through the $200 credit pretty fast, so I turned off the 2 "unlucky" ones, and I'm hoping the lucky streak continues on the one that's left. But once the free trial is over, it's gone as well. It certainly won't be paying for itself at this rate.
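How strange is that split, really? A back-of-the-envelope sketch, assuming each block lands on a VM with probability proportional to its core count and that blocks are independent of each other. Both are simplifying assumptions, and the names are illustrative.

```python
# VM sizes from the post: two 8-core VMs and one 4-core VM (20 cores total).
CORES = [8, 8, 4]
BLOCKS = 4

total_cores = sum(CORES)

# Probability that all 4 blocks land on one particular VM, for each VM.
p_all_on_vm = [(c / total_cores) ** BLOCKS for c in CORES]

# Probability that some single VM (any of the three) collects every block.
p_any_single_vm = sum(p_all_on_vm)

# Probability that it is one of the two 8-core VMs, as in the post.
p_one_of_8core = p_all_on_vm[0] + p_all_on_vm[1]

print(f"all {BLOCKS} blocks on one of the 8-core VMs: {p_one_of_8core:.4f}")
print(f"all {BLOCKS} blocks on any single VM:        {p_any_single_vm:.4f}")
```

Under these assumptions the outcome is roughly a 1-in-20 event: unusual, but not a sign that the idle VMs were misbehaving, and no reason to expect the "lucky" one to stay lucky.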
|
|
|
|
gigawatt
|
|
August 26, 2013, 11:20:47 PM |
|
Cool, thanks for the clarification. Just FYI, I did a major test (500-1000 machines from 4 to 16 cores for several hours) of both HP9 and HP10 on mainnet and for the most part found that the default settings you had selected were the optimal.
Holy balls, and I thought running 20x cc2.8xlarge AWS instances was nuts. Out of curiosity, how does one come across so much computing power? Were you using AWS as well?
AWS with increased limits. Incidentally, the cc2.8xlarge is not the most cost effective for XPM, as the L1 cache appears to be a bottleneck. To prove it, try running just 16 threads instead of all 32 and you'll see little difference!! Or at least that was true for HP9. You can get more bang for buck with 8 cores.
Which would you recommend? Multiple c1.xlarge or just using cc2.8xlarge with threads = core/2?
|
|
|
|
Tamis
|
|
August 27, 2013, 12:17:15 AM |
|
Multiple c1.xlarge
|
|
|
|
gateway
|
|
August 27, 2013, 12:56:51 AM |
|
Multiple c1.xlarge
Which AMI?
|
|
|
|
atariguy
|
|
August 27, 2013, 01:31:38 AM |
|
Multiple c1.xlarge
How close does that come to paying for itself these days?
|
|
|
|
|