Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: ansible adams on July 14, 2010, 08:55:14 PM



Title: anyone tried running with VIA Padlock extensions?
Post by: ansible adams on July 14, 2010, 08:55:14 PM
I understand that the VIA C7 line of x86 processors has "Padlock" cryptographic acceleration CPU instructions built in. OpenSSL can support these extensions too.

I've only been able to find AES benchmarks on Padlock (with some very impressive speedup), but it supports SHA-256 acceleration too. According to VIA's website it can do up to 5 Gbit/sec of SHA-256 hashes, which I believe is about 19500 khash/sec! If you can get even close to that performance it should leave the beefiest multicore 64 bit systems in the dust.

Does anyone here have a VIA processor with these extensions? Does bitcoin automatically pick them up, or do you need to rebuild from source? It seems like this inexpensive processor line should be a real coinspinner!

Sun's Niagara hardware has similar cryptographic acceleration, and I believe there are separate cards you can buy for this purpose too, but I think the VIA line packs a mighty big wallop for a mighty small price. Unfortunately I don't have such a processor myself, but I'd be really interested if someone else manages to get the combination working.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: laszlo on July 14, 2010, 09:46:23 PM
That does sound interesting, I might go pick one of those up to experiment.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: laszlo on July 14, 2010, 09:47:13 PM
We would need to write code to use this hardware; there is no automatic acceleration, btw.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: Ground Loop on July 14, 2010, 10:44:50 PM
There's a "drivers/crypto/padlock-sha.c" driver implementation in the standard kernel.

How does the openssl speed benchmark compare to bitcoin's khash/s?
Code:
openssl speed -evp sha256

On my Core2Duo E8500, it's:
Code:
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha256           25568.41k    60726.70k   108968.11k   137848.27k   146604.46k


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: sgtstein on July 14, 2010, 10:46:56 PM
Yep, the openssl acceleration is already built into the kernel.

Check out these AES benchmarks on XP. I'm not sure what is available for SHA-256 on *nix.
http://www.logix.cz/michal/devel/padlock/bench.xp


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: sgtstein on July 14, 2010, 11:16:29 PM
While this appears to be a good idea for a low power server that can "keep up with the big boys", I think that it isn't the best method for it. It looks like OpenSSL DOES have a CUDA version floating around out there for linux boxes. That would be a HUGE increase over anything we could do in this hardware. I'm still going to be pursuing both paths for now. It would be awesome having both of my "big" boxes crunching for this.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: ansible adams on July 14, 2010, 11:40:01 PM
There's a "drivers/crypto/padlock-sha.c" driver implementation in the standard kernel.

How does the openssl speed benchmark compare to bitcoin's khash/s?
Code:
openssl speed -evp sha256

On my Core2Duo E8500, it's:
Code:
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha256           25568.41k    60726.70k   108968.11k   137848.27k   146604.46k

That's a good question. I'm not on my home machine right now so I can't compare khash/s to the OpenSSL benchmark. On the largest block size in that benchmark, it looks like your machine does about 1.2 Gbit/sec (146604.46 * 1000 * 8). So if the VIA chip can reach its full potential it would be about 4 times as fast.

I assumed that since bitcoin appears to be built on OpenSSL you would just need to rebuild it from source on your VIA machine with Padlock-aware OpenSSL, but maybe there is more to it.

I think the most common VIA use right now is in netbooks. I would be pretty amused if a little $350 netbook with this processor could keep up with an i5 or Phenom II. I bet a good CUDA hasher can thrash it, but I wouldn't be surprised if the VIA chip wins on hashes per dollar of hardware and per dollar of electricity.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jib on July 14, 2010, 11:51:31 PM
I assumed that since bitcoin appears to be built on OpenSSL you would just need to rebuild it from source on your VIA machine with Padlock-aware OpenSSL, but maybe there is more to it.

Bitcoin uses its own SHA256 code, not OpenSSL's. You'd need to modify Bitcoin to be Padlock-aware or to use OpenSSL's code.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jib on July 15, 2010, 12:15:32 AM
5 Gbit/sec of SHA-256 hashes, which I believe is about 19500 khash/sec

Your calculation seems to be based on 256 bits per hash. I think the 5Gbit/sec refers to the amount of data being hashed, not the size of the hash itself. So instead you should be estimating 80 bytes (640 bits) per hash, which is the size of the block header being hashed. And bitcoin actually uses SHA-256 twice for each hash (using the first result as input to a second SHA-256), so add another 256 bits. This means each hash requires processing 896 bits of input, so at 5Gbit/sec you get 5580 khash/sec.

Although, if we consider how SHA-256 works in more detail, it separates the input data into 512-bit blocks (padded to a whole number of blocks). So for the first hash we process two blocks (1024 bits) and for the second hash we process one block (512 bits), making 1536 bits total. At 5Gbit/sec that gives 3255 khash/sec.

This is ignoring the overhead of initialising the hardware for each SHA-256 we do, which might be significant. The 5Gbit/sec figure is probably for throughput when hashing a long stream of data, not for millions of small hashes. So the actual performance might be much lower.

I guess it's worth experimenting with, but it's almost certainly not going to be as awesome as originally claimed.
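
The three estimates in this post differ only in how many bits are charged to each attempt. A minimal C sketch of that arithmetic follows. The function name khash_per_sec and the enum names are made up for illustration, and the 5 Gbit/sec input is VIA's quoted peak figure, not a measurement.

```c
#include <assert.h>

/* khash/s achievable at a given raw hashing throughput, ignoring all
 * per-call setup overhead (which the post above warns may be significant). */
double khash_per_sec(double bits_per_sec, double bits_per_attempt)
{
    assert(bits_per_attempt > 0.0);
    return bits_per_sec / bits_per_attempt / 1000.0;
}

/* Bits charged to one attempt under each assumption made in the thread. */
enum {
    BITS_DIGEST_ONLY   = 256,        /* original estimate: size of the output  */
    BITS_RAW_INPUT     = 640 + 256,  /* 80-byte header plus 32-byte re-hash    */
    BITS_PADDED_BLOCKS = 3 * 512     /* two padded blocks, then one more block */
};
```

With 5e9 bits/sec as input, the three constants give roughly 19531, 5580 and 3255 khash/sec, which matches the figures quoted above after rounding.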


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: Ground Loop on July 15, 2010, 12:42:34 AM
Excellent insight into how the SHA256 stages are used.
The machine I quoted above seems to run about 1630 khash/sec, so there's one data point.

If openssl is hashing 512-bit chunks at a throughput rate of 60726.70k bytes per second, that's 485 Mbps.
If bitcoin on the same machine is doing 1,630,000 btc-hashes per second, and each btc-hash is effectively 1536 bits (three 512-bit hash inputs) through the same pipeline, that's a whopping 2503 Mbps.

Is bitcoin really running SHA256 at 5x the speed of openssl?

Is each "khash/sec" a whole attempt, or each single cycle through SHA256?
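
The two figures being compared here can be put on the same scale with a small C sketch. The helper names are hypothetical, and the three-blocks-per-attempt figure is the assumption taken from the earlier analysis in this thread.

```c
#include <assert.h>

/* "openssl speed" prints thousands of bytes per second; convert to Mbit/s. */
double openssl_k_to_mbps(double kbytes_per_sec)
{
    assert(kbytes_per_sec >= 0.0);
    return kbytes_per_sec * 1000.0 * 8.0 / 1e6;
}

/* SHA-256 block input consumed by a miner, in Mbit/s, assuming each
 * attempt runs blocks_per_attempt 512-bit blocks through the hash. */
double miner_mbps(double khash_per_sec, unsigned blocks_per_attempt)
{
    assert(blocks_per_attempt > 0);
    return khash_per_sec * 1000.0 * blocks_per_attempt * 512.0 / 1e6;
}
```

openssl_k_to_mbps(60726.70) is about 485.8 and miner_mbps(1630.0, 3) is about 2503.7, so the ratio is a little over 5. Two things may narrow that gap. A 64-byte message does not fit in one SHA-256 block once padding is added, so the 64-byte row costs two block operations per call; the 8192-byte row (about 1173 Mbit/s) is probably closer to the raw block rate. Also, openssl speed runs a single process unless told otherwise, so if the miner is using both cores of the E8500 the per-core difference would be smaller still.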


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: ansible adams on July 15, 2010, 01:15:29 AM
Thanks for the hashing analysis from a much more experienced perspective! I am still interested in what this little processor can do... even if I was off by a factor of about 10, it might still be competitive with much more expensive and energy-intensive desktop processors. I can get about 2100 khash/sec using all 4 cores of my 64-bit machine when the system is otherwise idle, and that certainly makes the fans blow a lot of hot air. I thought it might be possible for VIA to come out ahead because custom circuits (FPGA or ASIC) for some cryptographic functions have in the past proved orders of magnitude faster than general desktop processors or even GPUs.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jib on July 15, 2010, 01:18:19 AM
Thanks for the hashing analysis from a much more experienced perspective!

I'm not "experienced", I'm just another random person who reads stuff on the internet. Please don't trust anything I say :p


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: theymos on July 15, 2010, 01:55:53 AM
Is each "khash/sec" a whole attempt, or each single cycle through SHA256?

It's a whole attempt.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: sgtstein on July 15, 2010, 02:30:44 AM
I had been needing a fun system for a home server anyway... So, I just picked up a mobo/CPU combo off Newegg tonight: a VIA C7-D @ 1.6GHz. It will be running with 2GB of RAM, a 16GB CompactFlash disk (for Bitcoin), and a 120GB 2.5" laptop SATA drive for the OS etc. Hopefully I will have this up by the beginning of next week, once it gets here and I figure out how to get Linux onto it. I'll create a thread or continue this one as I build and test it.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: laszlo on July 15, 2010, 02:42:10 AM
A 'whole attempt' is actually 3 blocks. 2 blocks of data are hashed to produce a 256-bit result, then that result is padded to make another block and hashed again. Each block goes through the usual 64 rounds of SHA-256. The interesting part that generates the coins uses a simplified implementation in C++ rather than OpenSSL.
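
The block counts follow directly from the SHA-256 padding rule: the message is followed by one 0x80 byte, zero padding, and an 8-byte length, rounded up to a multiple of 64 bytes. A short C sketch, with hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 64-byte blocks SHA-256 processes for a msg_len-byte message:
 * message + 1 marker byte + 8 length bytes, rounded up to whole blocks. */
size_t sha256_block_count(size_t msg_len)
{
    assert(msg_len <= SIZE_MAX - 72);   /* keep the addition from wrapping */
    return (msg_len + 1 + 8 + 63) / 64;
}

/* Blocks in one mining attempt: hash the 80-byte header, then hash the
 * resulting 32-byte digest again. */
size_t blocks_per_attempt(void)
{
    return sha256_block_count(80) + sha256_block_count(32);
}
```

sha256_block_count(80) is 2 and sha256_block_count(32) is 1, giving the 3 blocks described above. The same formula shows why a 64-byte message needs 2 blocks: there is no room left for the padding.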


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: Ground Loop on July 15, 2010, 07:51:08 PM
If I was building dedicated bitminer hardware, I'd start with something like this:
http://www.xilinx.com/products/ipcenter/Fast_SHA-1-SHA-256_MD5_Hashing_cores.htm



Title: Re: anyone tried running with VIA Padlock extensions?
Post by: lfm on August 21, 2010, 05:26:18 AM
OK, I got a version 0.3.10 bitcoin running on a C7. It currently does about 1430 khash/s on a 1.8 GHz VIA C7.

It's not clear yet whether we can optimize it to do the 2-block hash (1 block pre-hashed) instead of the 3-block hash per nonce attempt. We'll be investigating that.
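
The 2-block idea rests on where the nonce sits in the 80-byte header. Assuming the usual header layout (4-byte version, 32-byte previous hash, 32-byte merkle root, 4-byte time, 4-byte bits, 4-byte nonce), the nonce starts at byte 76, which is in the second 64-byte block. The first block is therefore identical for every nonce and its intermediate state can be computed once. A sketch with made-up names:

```c
#include <assert.h>
#include <stddef.h>

enum {
    HEADER_LEN   = 80,  /* bytes in a block header                       */
    NONCE_OFFSET = 76,  /* 4 + 32 + 32 + 4 + 4 bytes precede the nonce   */
    SHA256_BLOCK = 64   /* bytes per SHA-256 input block                 */
};

/* Index of the 64-byte SHA-256 block that holds a given header byte. */
size_t block_index(size_t byte_offset)
{
    assert(byte_offset < HEADER_LEN);
    return byte_offset / SHA256_BLOCK;
}

/* Block operations needed per nonce: 2 for the header plus 1 for the
 * second hash, minus any leading blocks whose state was cached. */
unsigned compressions_per_nonce(int cache_leading_blocks)
{
    unsigned skipped = cache_leading_blocks ? (unsigned)block_index(NONCE_OFFSET) : 0u;
    return 2u + 1u - skipped;
}
```

Under these assumptions caching drops the work from 3 block operations to 2 per nonce, an upper bound of 1.5x on the speedup. Whether the PadLock instructions allow starting from a saved intermediate state is exactly the open question in the post above.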


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: lfm on August 21, 2010, 08:08:25 AM
There's a "drivers/crypto/padlock-sha.c" driver implementation in the standard kernel.

How does the openssl speed benchmark compare to bitcoin's khash/s?
Code:
openssl speed -evp sha256

On my Core2Duo E8500, it's:
Code:
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha256           25568.41k    60726.70k   108968.11k   137848.27k   146604.46k


I was unaware of OpenSSL support. I don't think I have it being used yet. At least I don't see any speed changes no matter what I have tried.

On my 1.8 GHz VIA C7 I currently get:

Code:
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha256            3379.47k     8061.39k    14412.11k    18119.94k    19447.70k

So it's still kind of slow. I can't tell whether this is with VIA support enabled or not.

Trying to figure out OpenSSL support for VIA padlock functions seems like a quagmire from what I see so far surfing the net.
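
One small piece of the puzzle can be checked without OpenSSL at all: whether the CPU reports the hash engine in the first place. To the best of my knowledge, Linux lists PadLock features on the "flags" line of /proc/cpuinfo, with "phe" and "phe_en" standing for the hash engine being present and enabled. The helper below is a hypothetical sketch that looks for a whole-word flag in such a line. It says nothing about whether OpenSSL actually uses the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `flag` occurs as a complete whitespace-separated token in
 * `flags` (for example the text of the "flags" line), otherwise 0. */
int has_cpu_flag(const char *flags, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = flags;

    assert(n > 0);
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += 1;  /* keep searching after a partial match such as "phe_en" */
    }
    return 0;
}
```

For example, has_cpu_flag(line, "phe_en") on a line containing "rng rng_en ace ace_en phe phe_en" returns 1, while searching for "phe" in a line that only contains "phe_en" returns 0.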


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: sgtstein on August 21, 2010, 03:05:03 PM
Download 0.3.10 HERE! (http://75.149.150.33/bitcoin-c7.tar.gz)!

---

Trying to figure out OpenSSL support for VIA padlock functions seems like a quagmire from what I see so far surfing the net.

Yeah, I completely agree. I've looked into it as well; a lot might need to be changed.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jgarzik on November 27, 2010, 06:31:27 AM
Download 0.3.10 HERE! (http://75.149.150.33/bitcoin-c7.tar.gz)!

Thanks for the inspiration.  After reading this, I added VIA padlock support to my CPU miner.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: sgtstein on November 27, 2010, 07:12:21 AM
Thanks for the inspiration.  After reading this, I added VIA padlock support to my CPU miner.

Cool! Did you ever figure out how to get midstate caching working with it? I thought the C7 was capable of it. I know the Nanos are, but I thought VIA had it somewhere on their site. I'll have to try a Windows build of it.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jgarzik on November 27, 2010, 07:26:46 AM
Thanks for the inspiration.  After reading this, I added VIA padlock support to my CPU miner.

Cool! Did you ever figure out how to get midstate caching working with it? I thought the C7 was capable of it. I know the Nanos are, but I thought VIA had it somewhere on their site. I'll have to try a Windows build of it.

I had not even gotten far enough to determine why your code lacked the midstate caching stuff :)  If you have the hardware (I don't), giving my miner a try would be really helpful.  I don't even have a simple "it works" confirmation on VIA yet.

If you happen to figure out anything interesting, I'll be happy to integrate it and post a new Windows build.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: sgtstein on November 27, 2010, 02:53:11 PM
I had not even gotten far enough to determine why your code lacked the midstate caching stuff :)  If you have the hardware (I don't), giving my miner a try would be really helpful.  I don't even have a simple "it works" confirmation on VIA yet.

If you happen to figure out anything interesting, I'll be happy to integrate it and post a new Windows build.

No problem. I'll see if I can find some time to boot a linux live cd and play with it. Otherwise it's running a Windows build right now.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jgarzik on November 27, 2010, 06:19:03 PM
I had not even gotten far enough to determine why your code lacked the midstate caching stuff :)  If you have the hardware (I don't), giving my miner a try would be really helpful.  I don't even have a simple "it works" confirmation on VIA yet.

If you happen to figure out anything interesting, I'll be happy to integrate it and post a new Windows build.

No problem. I'll see if I can find some time to boot a linux live cd and play with it. Otherwise it's running a Windows build right now.

You don't need Linux... there's a Windows build:  http://yyz.us/bitcoin/cpuminer-installer-0.2.zip


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: SawEfDir on December 03, 2010, 07:08:28 PM
I've tried cpuminer version 0.2.1 on a VIA Nano machine. I used the "via" algo with a bitcoin running on the testnet. The miner worked, but the results it generated seemed to be wrong; debug output is pasted below.
The system is a 64-bit Debian unstable machine with a VIA VB8001 motherboard. It's running a stepping 2 VIA Nano. The kernel seems to apply a workaround for Nanos with that stepping; perhaps something needs to be done in the miner code as well.

Code:
HashMeter(0): 16777216 hashes, 1589.69 khash/sec
DBG: found zeroes in hash:
9ec42e51b34b69fc2f7209f3e334afcfa563d1da21647832cd2b312c00000000
HashMeter(0): 6644792 hashes, 1606.76 khash/sec
PROOF OF WORK FOUND?  submitting...
DBG: sending RPC call:
{"method": "getwork", "params": [ "000000016f643cccfaa9574cd1a3369a23da6452fcf296587e4da572a008520300000001f1071376c66751bede719672dd1e9e3b2a3daeec709fc5bcaede21364748d93a4cf93ea21d05106000000000000000800000000000000000000000000000000000000000000000000000000000000000000000000000000080020000" ], "id":1}
PROOF OF WORK RESULT: false (booooo)


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jgarzik on December 03, 2010, 07:26:25 PM
I've tried cpuminer version 0.2.1 on a VIA Nano machine. I used the "via" algo with a bitcoin running on the testnet. The miner worked, but the results it generated seemed to be wrong; debug output is pasted below.
The system is a 64-bit Debian unstable machine with a VIA VB8001 motherboard. It's running a stepping 2 VIA Nano. The kernel seems to apply a workaround for Nanos with that stepping; perhaps something needs to be done in the miner code as well.

Code:
HashMeter(0): 16777216 hashes, 1589.69 khash/sec
DBG: found zeroes in hash:
9ec42e51b34b69fc2f7209f3e334afcfa563d1da21647832cd2b312c00000000
HashMeter(0): 6644792 hashes, 1606.76 khash/sec
PROOF OF WORK FOUND?  submitting...
DBG: sending RPC call:
{"method": "getwork", "params": [ "000000016f643cccfaa9574cd1a3369a23da6452fcf296587e4da572a008520300000001f1071376c66751bede719672dd1e9e3b2a3daeec709fc5bcaede21364748d93a4cf93ea21d05106000000000000000800000000000000000000000000000000000000000000000000000000000000000000000000000000080020000" ], "id":1}
PROOF OF WORK RESULT: false (booooo)

Actually, it looks like it is working, to me.  As explained in this thread (http://bitcointalk.org/index.php?topic=1925.0;all), cpuminer searches for an approximate number of leading zeroes in the hash.

It then submits that hash to bitcoin, for final verification.  Thus, it is normal for cpuminer to find several almost-solutions, before finding a real solution, depending on current difficulty.

The official bitcoin client works this way too -- it stops hashing when a certain number of zeroes appear.  However, it does so silently, whereas cpuminer prints something.
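
The two-stage check described here can be sketched in C. This is not cpuminer's actual code: the function names are invented, and the sketch assumes the hash is held as eight 32-bit words with the most significant word at index 7, which is what the trailing zeroes in the debug output suggest.

```c
#include <assert.h>
#include <stdint.h>

/* Stage 1, done by the miner: a cheap filter that only asks whether the
 * most significant 32 bits of the hash are zero. */
int looks_promising(const uint32_t hash[8])
{
    return hash[7] == 0;
}

/* Stage 2, done by bitcoin: the full comparison against the target,
 * from the most significant word down. Equality counts as success. */
int meets_target(const uint32_t hash[8], const uint32_t target[8])
{
    int i;

    assert(hash != target);
    for (i = 7; i >= 0; i--) {
        if (hash[i] < target[i])
            return 1;
        if (hash[i] > target[i])
            return 0;
    }
    return 1;
}
```

A hash can pass the first stage and still fail the second whenever the target has more than 32 leading zero bits or a small value in the next word, which is why "false (booooo)" shows up more often as difficulty rises.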


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jgarzik on December 03, 2010, 07:30:04 PM
You may find this patch to bitcoin helpful:

Code:
diff --git a/main.cpp b/main.cpp
index a1865a4..da85b0d 100644
--- a/main.cpp
+++ b/main.cpp
@@ -3273,8 +3273,11 @@ bool CheckWork(CBlock* pblock, CReserveKey& reservekey)
     uint256 hash = pblock->GetHash();
     uint256 hashTarget = CBigNum().SetCompact(pblock->nBits).getuint256();
 
-    if (hash > hashTarget)
+    if (hash > hashTarget) {
+           printf("proof-of-work check FAILED...\n  hash: %s\ntarget: %s\n",
+                  hash.GetHex().c_str(), hashTarget.GetHex().c_str());
         return false;
+    }
 
     //// debug print
     printf("BitcoinMiner:\n");


This will show the proper, byte-reversed hash, and how close you came to the target.   That is very helpful in verifying whether or not the algorithm is truly working.
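
For comparing the miner's debug line with what the patch prints, it helps to see what "byte-reversed" means in practice. The sketch below assumes the miner dumps the 32 hash bytes in memory order while bitcoin's GetHex prints them last byte first; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format 32 hash bytes as 64 lowercase hex digits, last byte first.
 * `out` must have room for 65 characters including the terminator. */
void hash_to_display_hex(const unsigned char hash[32], char out[65])
{
    size_t i;

    for (i = 0; i < 32; i++)
        snprintf(out + 2 * i, 3, "%02x", hash[31 - i]);
}

/* Compare two 64-digit lowercase hex strings as numbers. With equal
 * lengths and the same letter case, string order equals numeric order. */
int hex_exceeds(const char *hash_hex, const char *target_hex)
{
    assert(strlen(hash_hex) == 64 && strlen(target_hex) == 64);
    return strcmp(hash_hex, target_hex) > 0;
}
```

A hash whose memory dump ends in 00000000 comes out of hash_to_display_hex starting with 00000000, which is the form that can be compared digit by digit against the printed target.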


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: SawEfDir on December 03, 2010, 09:07:38 PM
Thanks for the info. I'll let the testnet client run for the night. What generate setting should bitcoind have? setgenerate set to true with limit to zero processors, or should setgenerate be set to false?


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jgarzik on December 03, 2010, 10:00:41 PM
Thanks for the info. I'll let the testnet client run for the night. What generate setting should bitcoind have? setgenerate set to true with limit to zero processors, or should setgenerate be set to false?

setgenerate controls the in-client miner.  So, it may be set, or not, as you choose.

These external miners use the 'getwork' JSON-RPC call, which works regardless of the setgenerate setting.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: lfm on December 06, 2010, 03:52:51 AM

jgarzik:

trying your cpu-miner on via:

Bug in the main program (segmentation violation): it needs an extra NULL check for the sparse array in the argument parsing
Code:
                      if (algo_names[i] != NULL &&
                            !strcmp(arg, algo_names[i])) {
 
Now it is reporting a clobbered stack, but I haven't found the cause yet.
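
To illustrate why the NULL check matters: when an array of names is indexed by an enum and some entries are left out (for instance because an algorithm is compiled out), the missing slots are NULL, and passing NULL to strcmp is undefined behaviour that typically crashes. The enum and table below are invented for illustration and are not cpuminer's real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum algo { ALGO_C, ALGO_4WAY, ALGO_VIA, ALGO_COUNT };  /* hypothetical */

/* Sparse table: ALGO_4WAY is left unset, as if compiled out, so its
 * slot is implicitly initialized to NULL. */
static const char *algo_names[ALGO_COUNT] = {
    [ALGO_C]   = "c",
    [ALGO_VIA] = "via",
};

/* Return the index of the algorithm named `arg`, or -1 if unknown. */
int find_algo(const char *arg)
{
    int i;

    assert(arg != NULL);
    for (i = 0; i < ALGO_COUNT; i++) {
        /* Skip empty slots before calling strcmp. */
        if (algo_names[i] != NULL && !strcmp(arg, algo_names[i]))
            return i;
    }
    return -1;
}
```

With the check in place, asking for a name behind the hole (such as "via") walks past the NULL slot safely, and asking for a missing name returns -1 instead of crashing.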


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: lfm on December 06, 2010, 04:14:16 AM
OK, in sha256_via.c, also align tmp_hash1 to 128 to avoid the stack clobber.
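
A sketch of what such an alignment fix might look like with GCC or Clang. It assumes "128" means 128 bytes and uses the compilers' aligned attribute; the buffer size and function names are illustrative, not taken from sha256_via.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIGEST_ALIGN 128  /* assumption: the "128" in the post is in bytes */

/* Return 1 if pointer p is a multiple of `alignment` bytes. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t)p % alignment) == 0;
}

/* Declare a stack buffer with an explicit alignment request and report
 * whether the compiler honoured it. */
int tmp_hash1_is_aligned(void)
{
    unsigned char tmp_hash1[32] __attribute__((aligned(DIGEST_ALIGN)));

    memset(tmp_hash1, 0, sizeof tmp_hash1);
    return is_aligned(tmp_hash1, DIGEST_ALIGN);
}
```

Without the attribute, a 32-byte array on the stack only gets whatever alignment the compiler picks by default, so hardware that expects a stricter alignment for its output buffer can end up writing where it should not.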


BTW, I am on a VIA C7, which is less capable than the VIA Nano (e.g. no SSE2 or 64-bit, but also lesser padlock support).

There was another problem compiling sha256_4way.c on my system. I had to disable some headers that caused errors when the compiler had no SSE support, thus:

Code:

#include <string.h>
#include <assert.h>

#ifdef WANT_SSE2_4WAY

#include <xmmintrin.h>
#include <stdint.h>
#include <stdio.h>
#include "miner.h"

#define NPAR 32


But I got it working eventually, at about the same speed as my old version of the main program, and it is easier to support.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jgarzik on December 06, 2010, 04:19:25 AM

jgarzik:

trying your cpu-miner on via:

Bug in the main program (segmentation violation): it needs an extra NULL check for the sparse array in the argument parsing
Code:
                      if (algo_names[i] != NULL &&
                            !strcmp(arg, algo_names[i])) {

Good catch.  Applied similar patch.

Thanks for taking a look!


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jgarzik on December 06, 2010, 06:01:38 AM
git updated with sha256_via, sha256_4way fixes.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: kabo on December 11, 2010, 11:09:02 PM
How many khashes per second are you currently getting on a via-padlock?


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: SawEfDir on December 12, 2010, 05:36:14 PM
I've tried that patch to main.cpp you suggested. Here is the output from minerd:

Code:
DBG: found zeroes in hash:
7e242ac3d2f4298e502efd7e4b3677cc287114488e1c77c4a1406cba00000000
HashMeter(0): 7994345 hashes, 1587.98 khash/sec
PROOF OF WORK FOUND?  submitting...
PROOF OF WORK RESULT: false (booooo)

The output from patched bitcoind is this:

Code:
proof-of-work check FAILED...
  hash: 9c581ce97e417b9ea6ffb2502041a46ad740a74567e55bbd636d8944cc552995
target: 0000000045120800000000000000000000000000000000000000000000000000

FYI, I'm using bitcoin on an amd64 Debian unstable machine. Both bitcoind and minerd were compiled natively for amd64.

I'll try other algos now, to see if this behaviour is independent of the selected algorithm or not.
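
The target printed by the patched bitcoind comes from expanding the compact "bits" field, as the SetCompact call in the patch suggests. The C sketch below shows that expansion under simplifying assumptions: it ignores the sign bit and only handles exponents from 3 to 32. The compact value 0x1c451208 is inferred from the target shown above; it is not stated anywhere in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Expand a compact "bits" value into a 32-byte big-endian target.
 * Top byte = length of the number in bytes, low 3 bytes = its leading
 * digits. Simplified: sign bit ignored, exponent limited to 3..32. */
void compact_to_target(uint32_t nbits, unsigned char target[32])
{
    unsigned exponent = nbits >> 24;
    uint32_t mantissa = nbits & 0x007fffffu;

    assert(exponent >= 3 && exponent <= 32);
    memset(target, 0, 32);
    target[32 - exponent]     = (unsigned char)((mantissa >> 16) & 0xff);
    target[32 - exponent + 1] = (unsigned char)((mantissa >> 8) & 0xff);
    target[32 - exponent + 2] = (unsigned char)(mantissa & 0xff);
}
```

Expanding 0x1c451208 this way gives four zero bytes followed by 45 12 08 and then zeroes, which matches the target line above. The hash bitcoind computed begins with 9c, far above that target, and it also has no leading zeroes at all, unlike the hash the miner reported. That suggests the two programs are not computing the same hash for this work unit, rather than the miner merely finding a near miss.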


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: lfm on December 13, 2010, 03:12:06 PM
How many khashes per second are you currently getting on a via-padlock?

On a VIA C7 at 1.8 GHz I get 1418 khash/sec on Linux.

Code:
$ cat /proc/cpuinfo
processor       : 0
vendor_id       : CentaurHauls
cpu family      : 6
model           : 13
model name      : VIA C7-D Processor 1800MHz
stepping        : 0
cpu MHz         : 1800.000
cache size      : 128 KB


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: SawEfDir on December 22, 2010, 05:23:29 PM
I'll try other algos now, to see if this behaviour is independent of the selected algorithm or not.

The -4way algo does seem to work alright and successfully generated a few coins within a day or so.

It seems there is something wrong with the padlock code for the VIA Nano, at least in 64-bit mode.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: jgarzik on December 22, 2010, 06:46:18 PM
It seems there is something wrong with the padlock code for the VIA Nano, at least in 64-bit mode.

Are you using cpuminer?  You need version 0.3.1 for sha256_via fixes.


Title: Re: anyone tried running with VIA Padlock extensions?
Post by: lfm on December 22, 2010, 09:00:50 PM
I'll try other algos now, to see if this behaviour is independent of the selected algorithm or not.

The -4way algo does seem to work alright and successfully generated a few coins within a day or so.

It seems there is something wrong with the padlock code for the VIA Nano, at least in 64-bit mode.

I don't think it supports 64-bit mode; it is only coded for the VIA C7 at the moment, and the C7 doesn't have 64-bit support. It should work compiled for 32-bit mode on the Nano, even under a 64-bit OS. If you want to get involved, the Nano has some extended hash instructions that I think would be useful for speeding it up. Making a separate sha256_nano module and keeping sha256_via separate for the C7 would be best for now, I think.

There may still be some problem with sha256_via even on the C7 in 32-bit mode. I'm not sure yet. I am doing a testnet run but have no definite results yet.