Bitcoin Forum
December 05, 2016, 10:52:59 AM *
Pages: « 1 2 [3] 4 »  All
Author Topic: Post Your Hash/Sec and Hardware  (Read 17134 times)
Bitcoiner
Member, Activity: 70
July 15, 2010, 02:52:32 AM
 #41

Windows 7 64-bit
AMD Phenom II X4 810 2.61Ghz
2000 khash/sec
CPU temperature stabilises at 61C
MB temperature at 40C

I noticed something funny: After running for a long while, the khash/sec went down to 1750-1800. Is this normal? I see the same thing happening on my laptop where it went down to 275 after staying at 300 - 310 for a while.

Bitcoin 64-bit
Ubuntu 10.04 64-bit
AMD Phenom II X4 810 2.61Ghz
2450 khash/sec

So, somewhat faster than in Windows 7. However, Ubuntu lags a lot more than Windows does when Bitcoin is running at full bore.


Bitcoin 32-bit
Ubuntu 10.04 64-bit
AMD Phenom II X4 810 2.61Ghz
2150 khash/sec

Bitcoin 64-bit is faster than Bitcoin 32-bit by 300khash/sec, or one old laptop!

Want to thank me for this post? Donate here! Flip your coins over to: 13Cq8AmdrqewatRxEyU2xNuMvegbaLCvEe  Smiley
ichi
Member, Activity: 70
July 15, 2010, 08:23:07 AM
 #42

I've been comparing various setups on the same server, namely ...

Quad-Core AMD Opteron 2376
8GB memory

In all cases, I'm accessing the net via anonymous VPN.

I get the best results with Ubuntu 10.04 Desktop, installed via wubi on a 1TB WD RE3 7200rpm SATA ...

Ubuntu 10.04 Desktop x86
          Bitcoin x86
          15 connections
          67,556 blocks
          ~1,900 khash/s
          4 cores @ 100%
          system very responsive

Ubuntu 10.04 Desktop x64
          Bitcoin x64
          15 connections
          67,689 blocks
          ~2,150 khash/s
          4 cores @ 100%
          system very responsive

I've also looked at various VMs running on Windows Server 2008 x64 Standard with Hyper-V.  All of the guests live on the 1TB WD RE3 7200rpm SATA, and all have 4 CPUs and 4GB memory.  I get ...

Windows Server 2008 x64 Standard guest
          apparently para-virtualized
          Bitcoin x64
          8 connections
          67,561 blocks
          ~1,700 khash/s = 89% of native
          4 cores @ 100%
          system very responsive

Windows 7 x64 guest
          apparently not para-virtualized
          Bitcoin x64
          8 connections
          appears stuck at 1,581 blocks
          ~1,270 khash/s
          4 cores @ 100%
          system very responsive

Windows XP SP3 x86 guest
          apparently not para-virtualized
          Bitcoin x86
          8 connections
          67,578 blocks
          ~1,700 khash/s
          4 cores @ 100%
          system very responsive

Ubuntu 10.04 x86 guest
          apparently totally not para-virtualized
          Bitcoin x86
          15 connections
          67,609 blocks
          300-400 khash/s
          4 cores @ 100%
          system sluggish (even without Bitcoin)

More results coming soon ;)
ichi
Member, Activity: 70
July 16, 2010, 04:54:10 AM
 #43

dual Intel Xeon 5570
.    8 cores (hyperthreaded to 16) @ 5%-10% used
.    32GB memory @ 20% used
.    Windows Server 2008 x64 with Hyper-V
.         guest Windows Server 2008 x64
.         4 cores @ 100%
.         4GB memory @ 24% used
.         Bitcoin x64
.              8 connections
.              68,226 blocks
.              ~2,250 khash/s
.         system very responsive
nybble41
Full Member, Activity: 152
July 20, 2010, 12:56:35 AM
 #44

Core 2 Duo 6600 (2.4 GHz, 4MB cache)
2GB RAM
Debian Linux, kernel 2.6.34.1
BitCoin 0.3.0 Beta (32-bit x86), with custom SSE hash code

Very responsive. No significant change in interactive response.

1 thread: ~4200 khash/s
2 threads: ~7700 khash/s

EDIT: For reference, I was getting about 875 khash/s (two threads) with the stock client and platform-specific compile options (-O3 -march=nocona -msse -msse2 -mfpmath=sse,387). Reorganizing sha.cpp, within the bounds of plain C code, pushed that up to ~1000 khash/s.
Bitcoiner
Member, Activity: 70
July 20, 2010, 01:00:36 AM
 #45

Quote from: nybble41
    Core 2 Duo 6600 (2.4 GHz, 4MB cache)
    2GB RAM
    Debian Linux, kernel 2.6.34.1
    BitCoin 0.3.0 Beta (32-bit x86), with custom SSE hash code

    Very responsive. No significant change in interactive response.

    1 thread: ~4200 khash/s
    2 threads: ~7700 khash/s

What is this custom SSE hash code?

Want to thank me for this post? Donate here! Flip your coins over to: 13Cq8AmdrqewatRxEyU2xNuMvegbaLCvEe  Smiley
nybble41
Full Member, Activity: 152
July 20, 2010, 01:09:43 AM
 #46

Quote from: Bitcoiner
    Quote from: nybble41
        Core 2 Duo 6600 (2.4 GHz, 4MB cache)
        2GB RAM
        Debian Linux, kernel 2.6.34.1
        BitCoin 0.3.0 Beta (32-bit x86), with custom SSE hash code

        Very responsive. No significant change in interactive response.

        1 thread: ~4200 khash/s
        2 threads: ~7700 khash/s

    What is this custom SSE hash code?

Now, now. If I told you that then I would lose my advantage. :)

Suffice it to say that I leverage SSE instructions to calculate four hashes at once, per thread.
Bitcoiner
Member, Activity: 70
July 20, 2010, 02:11:49 AM
 #47

Quote from: nybble41
    Quote from: Bitcoiner
        Quote from: nybble41
            Core 2 Duo 6600 (2.4 GHz, 4MB cache)
            2GB RAM
            Debian Linux, kernel 2.6.34.1
            BitCoin 0.3.0 Beta (32-bit x86), with custom SSE hash code

            Very responsive. No significant change in interactive response.

            1 thread: ~4200 khash/s
            2 threads: ~7700 khash/s

        What is this custom SSE hash code?

    Now, now. If I told you that then I would lose my advantage. :)

    Suffice it to say that I leverage SSE instructions to calculate four hashes at once, per thread.

:o

Want to thank me for this post? Donate here! Flip your coins over to: 13Cq8AmdrqewatRxEyU2xNuMvegbaLCvEe  Smiley
Bitcoiner
Member, Activity: 70
July 26, 2010, 07:45:04 PM
 #48

OK, now for some absolutely incredible performance.

Credit to tcatm for the caching part of the SHA context - this offers absolutely brilliant performance. Additionally, the Intel compiler really comes into its own here as its parallelisation abilities give a massive performance boost over Visual Studio.

Performance: 4700khash/s on 4 cores, I think that speaks for itself.

I've included both the VS and Intel build, but there's really no comparison, the Intel build craps all over VS.

Grab SHA state caching Bitcoin here
Wow, this is the biggest jump I've ever seen. Nearly a 250% increase in speed from the stock version, amazing.  Now let's see how stable it is  Smiley

Windows 7 64-bit
AMD Phenom II X4 810 2.61Ghz
2000 khash/sec
CPU temperature stabilises at 61C
MB temperature at 40C

I noticed something funny: After running for a long while, the khash/sec went down to 1750-1800. Is this normal? I see the same thing happening on my laptop where it went down to 275 after staying at 300 - 310 for a while.

Bitcoin 64-bit
Ubuntu 10.04 64-bit
AMD Phenom II X4 810 2.61Ghz
2450 khash/sec

So, somewhat faster than in Windows 7. However, Ubuntu lags a lot more than Windows does when Bitcoin is running at full bore.

Windows 7 64-bit
AMD Phenom II X4 810 2.61Ghz

I get 4200 - 4400 khash/sec using the SHA state caching build above (Intel build).
Using the VS build I still get 3100 khash/sec.

Want to thank me for this post? Donate here! Flip your coins over to: 13Cq8AmdrqewatRxEyU2xNuMvegbaLCvEe  Smiley
RudeDude
Newbie, Activity: 11
July 26, 2010, 07:58:47 PM
 #49

I upgraded to the x64 optimized client on a box that I run on rare occasions:

Windows XP Pro x64, Intel Xeon X5550 @ 2.67GHz, 8 cores. - 10,400 khash/sec in the display, 9,000 in the hashmeter log.

The rate with stock client used to be about 4,300 khash/sec.

19sM3BSaoGByh3t6Hasr8YbApibXoPtW5Z
wereHamster
Newbie, Activity: 3
July 28, 2010, 03:39:51 PM
 #50

Quote from: nybble41
    1 thread: ~4200 khash/s
    2 threads: ~7700 khash/s

    Suffice it to say that I leverage SSE instructions to calculate four hashes at once, per thread.

I tried to implement your idea and my SSE code is almost exactly 4x as fast as the vanilla code (~2000 khash/s with one thread, up from ~500). However, when running two threads I only get ~3000 khash/s. There is room to optimize my code, but still, the improvement is way lower than yours.

I don't think pure SSE can be exactly four times as fast as well-optimized C code, mostly because SSE lacks rotate instructions. Did you implement SHA completely in SSE or are you mixing SSE and C?
nybble41
Full Member, Activity: 152
July 28, 2010, 04:58:42 PM
 #51

Quote from: wereHamster
    Quote from: nybble41
        1 thread: ~4200 khash/s
        2 threads: ~7700 khash/s

        Suffice it to say that I leverage SSE instructions to calculate four hashes at once, per thread.

    I tried to implement your idea and my SSE code is almost exactly 4x as fast as the vanilla code (~2000 khash/s with one thread, up from ~500). However, when running two threads I only get ~3000 khash/s. There is room to optimize my code, but still, the improvement is way lower than yours.

Actually, I re-examined my changes based on your results and I think I may have messed up the rate calculation. Since it calculates four hashes per loop iteration I multiplied the increment to nHashCounter by four, but in retrospect it's already accounting for that by incrementing the nonce four times as quickly. Ergo, the rate was displayed as 4x higher than actual. For reference, I've generated 300 BTC (six blocks) since I started using the program on the 11th of this month, which works out to about one block out of every 400 (0.25%). Is that about what you're getting? It looks like your version should be a bit faster than mine, which is only to be expected--this was my first attempt at using SSE, or for that matter any kind of SIMD optimization.

I should've known it wouldn't be quite so simple. Smiley
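
The double counting described above can be modeled in a few lines. This is only a sketch under assumptions: the loop shape and the names `count_hashes` and `corrected_rate` are hypothetical, not the client's code. It assumes the counter is derived from how far the nonce advanced, so multiplying by four again overstates the work by 4x.

```python
LANES = 4  # hashes computed per loop iteration (one per SSE lane)

def count_hashes(iterations, double_count):
    """Simulate a hash counter for a loop that tests LANES nonces per pass.

    The counter is assumed to follow the nonce, which already advances by
    LANES each pass. Multiplying by LANES again reproduces the 4x overstatement.
    """
    nonce = 0
    hash_counter = 0
    for _ in range(iterations):
        previous = nonce
        nonce += LANES                # nonce moves four times as quickly
        tested = nonce - previous     # nonces actually hashed this pass
        hash_counter += tested * (LANES if double_count else 1)
    return hash_counter

def corrected_rate(displayed_khash, inflation=LANES):
    """Undo an overstatement of exactly `inflation` times (an assumption)."""
    return displayed_khash / inflation

print(count_hashes(1000, double_count=False))
print(count_hashes(1000, double_count=True))
print(corrected_rate(4200), corrected_rate(7700))
```

If the displayed figures were inflated by exactly 4x, the earlier ~4200 and ~7700 khash/s readings would correspond to roughly 1050 and 1925 khash/s.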

Quote from: wereHamster
    I don't think pure SSE can be exactly four times as fast as well-optimized C code, mostly because SSE lacks rotate instructions. Did you implement SHA completely in SSE or are you mixing SSE and C?

I implemented the rotates as ((x >> y) | (x << (32-y))), using SSE opcodes for the shifts. It's implemented completely in C, using GCC's vector extensions--no direct assembly code. I did use intrinsics for the shift operations, since the shift operators aren't implemented for vectors. The rest looks much like the original version.
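
As a rough illustration of that rotate, here is a Python model of four 32-bit lanes. It is a sketch, not the poster's code: the function names are hypothetical, and plain lists stand in for a vector register. In C the two shifts and the OR would presumably map to the SSE2 intrinsics `_mm_srli_epi32`, `_mm_slli_epi32` and `_mm_or_si128`, which is why each rotate costs three operations instead of one.

```python
MASK32 = 0xFFFFFFFF  # keep results in 32 bits, as a uint32_t lane would

def rotr32(x, y):
    """Rotate a 32-bit value right by y bits (0 < y < 32) with two shifts and an OR."""
    return ((x >> y) | (x << (32 - y))) & MASK32

def rotr32_lanes(lanes, y):
    """Apply the same rotate to every lane, mimicking one 4-wide vector step."""
    return [rotr32(x, y) for x in lanes]

def big_sigma0_lanes(lanes):
    """SHA-256 Sigma0 (rotates by 2, 13 and 22, XORed together) on all lanes."""
    return [rotr32(x, 2) ^ rotr32(x, 13) ^ rotr32(x, 22) for x in lanes]

print([hex(v) for v in rotr32_lanes([1, 2, 4, 8], 1)])
```

The explicit mask is only needed because Python integers are unbounded; a fixed-width lane discards the overflow bits on its own.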
wereHamster
Newbie, Activity: 3
July 28, 2010, 05:11:31 PM
 #52


Quote from: nybble41
    Actually, I re-examined my changes based on your results and I think I may have messed up the rate calculation. Since it calculates four hashes per loop iteration I multiplied the increment to nHashCounter by four, but in retrospect it's already accounting for that by incrementing the nonce four times as quickly. Ergo, the rate was displayed as 4x higher than actual. For reference, I've generated 300 BTC (six blocks) since I started using the program on the 11th of this month, which works out to about one block out of every 400 (0.25%). Is that about what you're getting? It looks like your version should be a bit faster than mine, which is only to be expected--this was my first attempt at using SSE, or for that matter any kind of SIMD optimization.

    I should've known it wouldn't be quite so simple. :)

Oh yeah, I had to review the khash/s code a few times to get it right. At one point it displayed 12000 but I somehow didn't believe it :)
I settled for this: in each thread, for each hash that I calculate, increase a counter. Every couple iterations save the khash/s *for this thread* in an array. Then every 30 seconds one thread sums up all the khash/s values and prints the total.
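
A minimal sketch of that bookkeeping, assuming Python threads and `hashlib` in place of the real miner. All names, the header bytes and the reporting interval are hypothetical, and the totals are summed once at the end, not every 30 seconds. Each thread counts only the hashes it computed and writes its own khash/s into its own slot, so no hash is counted twice.

```python
import hashlib
import threading
import time

def worker(slot, rates, n_hashes, report_every=1000):
    """Hash n_hashes nonces and keep this thread's khash/s in rates[slot]."""
    header = b"hypothetical block header"
    start = time.perf_counter()
    count = 0
    for nonce in range(n_hashes):
        data = header + nonce.to_bytes(4, "little")
        hashlib.sha256(hashlib.sha256(data).digest()).digest()
        count += 1                       # one increment per hash computed
        if count % report_every == 0:    # save this thread's rate periodically
            elapsed = time.perf_counter() - start
            if elapsed > 0:
                rates[slot] = count / elapsed / 1000.0

def measure(n_threads=2, n_hashes=4000):
    """Run the workers, then sum the per-thread rates like the reporting thread."""
    rates = [0.0] * n_threads
    threads = [threading.Thread(target=worker, args=(i, rates, n_hashes))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rates, sum(rates)

rates, total = measure()
print(rates, total)
```

The numbers themselves mean little here, since Python threads do not hash in parallel the way native threads do; the point is only the per-thread slot plus a single summing step.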

Quote from: nybble41
    Quote from: wereHamster
        I don't think pure SSE can be exactly four times as fast as well-optimized C code, mostly because SSE lacks rotate instructions. Did you implement SHA completely in SSE or are you mixing SSE and C?

    I implemented the rotates as ((x >> y) | (x << (32-y))), using SSE opcodes for the shifts. It's implemented completely in C, using GCC's vector extensions--no direct assembly code. I did use intrinsics for the shift operations, since the shift operators aren't implemented for vectors. The rest looks much like the original version.

Yep, that's more or less what I did, except that I used intrinsics instead of the GCC vector extensions; I think that should be more portable. It was pretty easy: I took an existing implementation as a base and then only had to change some macros. Comparing my SSE version with the base yields a speedup of (only) 2.5x. It's not 4x, mainly because of the lack of rotate operations in SSE. Packing and unpacking also cause a small decrease in speed.

I also heard that some compilers can generate suboptimal, sometimes even outright wrong, code from intrinsics. I was advised to stay away from them and use pure assembler instead.
true
Jr. Member, Activity: 56
July 29, 2010, 03:22:49 AM
 #53

Phenom II X6 1090T @ 3.2ghz, WinXP 32bit:
~3600khash/sec
Edit: ~8650khash/sec with 0.3.6

Core 2 Quad Extreme Mobile @ 2.53ghz, Gentoo 32bit:
~2000khash/sec...but if the room is hot, the CPU will get up to 95C, it'll throttle and get ~1600khash/sec. I usually run it on 2 cores @ 1.6ghz, for ~630khash/sec
Edit: 2 cores @ 1.6ghz, getting ~1440khash/sec with 0.3.6


Core 2 Duo @ 2.53ghz, WinXP 32bit:
Was ~1200khash/sec, but fell back to ~800khash/sec for some reason and won't go higher?
Edit: ~2050khash/sec with 0.3.6...still slower by ratio than it was before for some reason

Some Celeron @ ~2ghz, WinXP 32bit:
~220khash/sec
Edit: ~550khash/sec with 0.3.6

Any 32-bit builds of the SSE client? Edit: Official build is quite fast now, but as everyone upgrades it won't matter

did you know that the internet is not a 1ATWN2bMDRRfo7Z2P8Fefvq791X6FT88WQ ?
calling me a troll gives me a massive raging boner
omegadraconis
Jr. Member, Activity: 39
July 29, 2010, 06:18:46 AM
 #54

My main desktop:
Windows 7 Pro
Intel Celeron Dual-Core 3200 @3.8ghz
2GB DDR2 Ram
Geforce 8800GTX

With the stock client I am getting 1300Khash/s.
With the x64bit optimized client I am getting 3300Khash/s

With OSX 10.6.3 and the cuda build on the same machine I get a peak of 5400Khash/s and a low of 4200Khash/s.

1HKYXgu9uLp8AQXabYrqbmAGqS73huNM7K
GeorgeH
Member, Activity: 83
October 14, 2010, 11:55:33 PM
 #55

Machine 1:
Core2Duo @ 2.4 ghz: 3khash
CUDA client & 8800 GTS 512MB: 30khash

Machine 2:
Atom 330N + ION, CUDA client: 2khash

1DSpPtPTGXTYjkZehPsiAbjkXLkB1jsZ2x
Aqualung
Sr. Member, Activity: 372
October 15, 2010, 03:58:50 AM
 #56

Windows 7 32-bit
Intel Core2Duo E7300 @ 3600MHz
2 GB RAM
GF 9800GT

In CPU miner mode I get 3300 khash/sec.
With the GPU CUDA client I get 18400 khash/sec.

used to be a miner
da2ce7
Legendary, Activity: 1218, Live and Let Live
October 15, 2010, 05:28:39 AM
 #57

AMD Phenom II 920 = 4 Mhash/sec
AMD Phenom II 955 = 6 Mhash/sec

One off NP-Hard.
sandos
Member, Activity: 106
October 15, 2010, 07:34:41 AM
 #58

I was surprised that my Athlon X2 4850e is actually faster than my laptop with a Core i5 430M. Two much older cores faster than four newer Intel cores? Seems odd, but whatever.

Athlon X2 4850e: 2.2 Mhash/s
Intel Core i5 430M: 1.8 Mhash/s
Aqualung
Sr. Member, Activity: 372
October 15, 2010, 08:41:42 AM
 #59

Desktop processors are always faster than laptop ones... and the same goes for GPUs.
For what it's worth, the i5 430M has 2 cores and 4 threads.

used to be a miner
GeorgeH
Member, Activity: 83
October 16, 2010, 02:23:40 AM
 #60

Zotac GeForce GTX 460: 53 Mhash @ default aggression
67 Mhash @ aggression = 14, overclocked to 850 MHz

1DSpPtPTGXTYjkZehPsiAbjkXLkB1jsZ2x