ufasoft (OP)
|
|
February 28, 2011, 05:58:39 AM |
|
I'm trying to use your miner with my pool http://deepbit.net but it looks like your implementation of the HTTP protocol differs from other miners. Why aren't you sending the "Authorization" field in the HTTP headers?
This is the WinInet HTTP implementation. It sends the "Authorization" field only after a request without authorization has failed.
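For reference, a minimal Python sketch of the alternative being asked about: attaching the Basic "Authorization" header to the very first request instead of waiting for a failed unauthenticated one. The helper name and the credentials are hypothetical, and this is not the miner's or WinInet's actual code.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic "Authorization" header.

    Hypothetical helper for illustration only; not taken from any miner.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


# Preemptive style: the header goes out with the first request,
# so the server never has to answer with an authentication challenge.
headers = {"Authorization": basic_auth_header("user", "pass")}
print(headers["Authorization"])
```

A client that only reacts to the challenge makes two round trips for a request that needs credentials, which is the behaviour described above.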
|
Bitcoin donations: 18X598V8rVdjy3Yg1cjZmnnv4SpPthuBeT
|
|
|
hendi
Newbie
Offline
Activity: 57
Merit: 0
|
|
February 28, 2011, 12:34:18 PM |
|
I'll give 50 BTC to the person that implements this code into jgarzik's miner. First 25 BTC when working code is released, the other 25 BTC when it's pushed into jgarzik's cpuminer git repository.
|
|
|
|
Fuma
Newbie
Offline
Activity: 9
Merit: 0
|
|
February 28, 2011, 09:56:29 PM Last edit: March 01, 2011, 07:08:55 PM by Fuma |
|
Hi, I can't get this to work for solo mining and I don't know why. Passing -o http://127.0.0.1:8332/ results in 0 hash. I guess the URL is incorrect, although I don't get an error connecting. Can someone please help with this? I can mine through the pools. Well, I got from 5 Mhash to 14 Mhash on 4 cores.
|
|
|
|
BOARBEAR
Member
Offline
Activity: 77
Merit: 10
|
|
March 02, 2011, 05:18:31 PM |
|
My 2 cores went from 3.5 MH/s to 9.8 MH/s. Is there a plan to further improve the program?
|
|
|
|
xenon481
|
|
March 03, 2011, 03:55:18 PM Last edit: March 03, 2011, 04:20:02 PM by xenon481 |
|
There appears to be a small memory leak somewhere in this program. I haven't looked at the code yet to try and find it.
It looks like it is leaking about 25MB per day.
Specs:
- Windows XP laptop
- Intel Core2 Duo 2.2GHz
- ~4.7MHash/sec
- Slush's pool
Edit: I've got my request time set to the default of every 15 seconds. And ~25MB/day is ~4KB per 15sec. So, it looks like the leak is probably a handle-leak every time a request is made.
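The per-request estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the leak is spread evenly over requests and taking 25MB as 25 * 1024 * 1024 bytes; the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def leak_per_request(bytes_per_day: float, request_interval_s: float) -> float:
    """Average bytes leaked per request, assuming every request leaks equally."""
    requests_per_day = SECONDS_PER_DAY / request_interval_s
    return bytes_per_day / requests_per_day


# 25 MB per day with one request every 15 seconds (5760 requests per day).
per_request = leak_per_request(25 * 1024 * 1024, 15)
print(f"{per_request / 1024:.1f} KiB per request")
```

The result lands a little above 4 KiB, consistent with the "~4KB per 15sec" figure.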
|
Tips Appreciated: 171TQ2wJg7bxj2q68VNibU75YZB22b7ZDr
|
|
|
chromicant
Newbie
Offline
Activity: 40
Merit: 0
|
|
March 03, 2011, 08:11:12 PM |
|
I'll give 50 BTC to the person that implements this code into jgarzik's miner. First 25 BTC when working code is released, the other 25 BTC when it's pushed into jgarzik's cpuminer git repository.
Please take a look at https://github.com/chromicant/cpuminer/tree/sse2
This is only for x86_64 machines running Linux!
I have successfully grafted most of the SSE2 code from the Windows version into jgarzik's CPU miner. It's still a tad slow (I had to remove some loops to port from the Win64 ABI to SysV's ABI for x86_64), but it's 2x faster than the 4way code in my repository (I haven't benchmarked it against jgarzik's latest version).
To use this, you need to check out the sse2 branch of the github project above. You will also need the most recent version of yasm (1.1.0 as of this writing). I haven't figured out how to add .asm/yasm files to auto*...so you'll have to do some manual steps.
Step 1: cd x86_64, run the build.sh script.
Step 2: autogen.sh / configure / make as you'd normally do for cpuminer
Step 3: ./minerd --url http:blah --userpass your:info --algo sse2_64
Step 4: Enjoy.
Watch this branch, since I have some more speedups that I think will work once I clean up some of the internals. This code was pushing 3200 khash/sec per core on my Core i5 760 @ 2.80GHz. So your mileage may vary. No warranty implied. Etc., etc., etc.
Next, after cleaning up the code some, is to get a proper merge going. Also, this code should be a good basis for my ARM NEON work...so if you want to burn the batteries up in your iPad2...
Please, if you like my work, donate at the address in my sig!
|
|
|
|
VastLite
Newbie
Offline
Activity: 32
Merit: 0
|
|
March 03, 2011, 08:25:27 PM Last edit: March 03, 2011, 09:03:23 PM by VastLite |
|
I have this odd problem of bitcoin-miner.exe working for a few seconds and then going idle for a few seconds as it works. I'm running a Phenom II X6 on Windows 7 64-bit. I'm using these switches: "bitcoin-miner.exe -t 5 -a 10 -o http://deepbit.net:8332 -u x@x.x -p x" I'm wondering if I'm doing something wrong maybe? Has anyone else had this problem? Any feedback would be greatly appreciated.
-edit: This also happens on my secondary computer running three threads on a quad-core. Is there some specific configuration that is required for 64-bit? That is what the two computers have in common, so that's the only thing I can think of, other than some connectivity problem with deepbit. I will try running as just a solo miner to test.
-edit 2: Running with an address of http://127.0.0.1:8332 fixes the problem. I think this is purely an issue of latency: I have satellite internet and the round-trip time for anything is ridiculous. For whatever reason, poclbm works okay. I guess I'll have to stick with GPU-only mining until I get better internet. Anyway, as far as performance goes, I get about 15 MHash/s using 5 threads on my AMD processor, so I don't think this is poorly optimized for AMD or anything. Anyone having performance issues on AMD might be hitting some other issue.
|
|
|
|
curator
Newbie
Offline
Activity: 21
Merit: 0
|
|
March 03, 2011, 08:30:51 PM Last edit: March 03, 2011, 08:46:41 PM by curator |
|
So I would like to run the miner on my work computer, but port 8332 is blocked. I'm able to get around this with distributed.net (cracking RC5-72 keys and OGR 27 nodes) by setting it to port 80. Would this work with the ufasoft miner?
Thanks, curator
EDIT: Also if this doesn't work out...could I solo mine with this version? N00b I know.
|
|
|
|
kseistrup
|
|
March 03, 2011, 08:36:29 PM |
|
Step 1: cd x86_64, run the build.sh script.
That one gives me an error in line 106:
$ ./build.sh
sha256_xmm_amd64.asm:106: error: expected `,'
I'm looking forward to trying this minerd as soon as I can build it. Cheers,
|
Klaus Alexander Seistrup
|
|
|
chromicant
Newbie
Offline
Activity: 40
Merit: 0
|
|
March 03, 2011, 08:38:07 PM |
|
Check your yasm version:
yasm --version
You probably have 0.8.0. You need 1.1.0. I had to compile it myself for some versions of Debian. YASM can be found here: http://www.tortall.net/projects/yasm/
|
|
|
|
kseistrup
|
|
March 03, 2011, 08:45:40 PM |
|
Check your yasm version: yasm --version You probably have 0.8.0. You need 1.1.0. I had to compile it myself for some versions of Debian. YASM can be found here: http://www.tortall.net/projects/yasm/
Thanks for the hint. I grabbed 1.1.0 from maverick, it installed on lucid without any problems. That helped. It's running at ~2450 khash/sec on a Core2Duo, as opposed to ~1050 khash/sec with the cryptopp algorithm. Sweet! Cheers,
|
Klaus Alexander Seistrup
|
|
|
chromicant
Newbie
Offline
Activity: 40
Merit: 0
|
|
March 03, 2011, 08:49:05 PM |
|
Ok.
I've gotten some reports that the performance on certain AMD cpus isn't as great as it could be.
If anyone has an AMD which shows this, and knows how to profile, I'd appreciate it. I have a gut feeling it's in the core code, so unless I learn what the architecture differences are that cause the problem, I may not have a good fix.
|
|
|
|
kseistrup
|
|
March 03, 2011, 09:11:42 PM |
|
It's running at ~2450 khash/sec on a Core2Duo, as opposed to […]
— and it has successfully submitted 2 POWs, so the calculations are working ok. Nice work! Cheers,
|
Klaus Alexander Seistrup
|
|
|
Raulo
|
|
March 03, 2011, 10:01:33 PM |
|
The chromicant code gives 1.18 Mhash/s per 1 GHz per physical core on Intel Core i5 when utilizing 4 threads (2 physical cores, 2 virtual), as opposed to 0.89 Mhash/s in the 4way version. The difference is almost 100% for a single-threaded run, but apparently the multithreading can make up for some of the 4way inefficiencies.
On K10 AMD, the new code is 0.93 Mhash/s per 1GHz per physical core compared to 1.13 Mhash/s for 4way version of jgarzik's cpuminer compiled with Intel Compiler (icc). It seems that the 4way code compiled with icc is fastest for K10 architecture.
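A small Python sketch of the normalization used in these comparisons (Mhash/s per GHz per physical core), so that figures from differently clocked CPUs can be put side by side. The function name is hypothetical, and the sample input is chromicant's reported 3200 khash/sec per core at 2.80GHz, not a new measurement.

```python
def mhash_per_ghz_per_core(khash_per_s: float, clock_ghz: float,
                           physical_cores: int = 1) -> float:
    """Normalize a hash rate to Mhash/s per GHz per physical core."""
    mhash_per_s = khash_per_s / 1000.0
    return mhash_per_s / clock_ghz / physical_cores


# Per-core figure reported earlier in the thread for a Core i5 760 @ 2.80GHz.
normalized = mhash_per_ghz_per_core(3200, 2.80)
print(f"{normalized:.2f} Mhash/s per GHz per core")
```

Pass the total hash rate together with the number of physical cores when only an aggregate figure is available.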
|
1HAoJag4C3XtAmQJAhE9FTAAJWFcrvpdLM
|
|
|
chromicant
Newbie
Offline
Activity: 40
Merit: 0
|
|
March 03, 2011, 10:11:41 PM |
|
Thanks.
Someone sent me the output of Intel's compiler on the 4way code. One thing that is different between the sse2_64 and 4way code is that the SSE2 core loop is unrolled. Also, the sse2_64 code is just SHA-256, which means you have to call it twice to get the hash you want.
This may be leading to some overhead.
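To illustrate the "call it twice" point: Bitcoin's proof-of-work hash is SHA-256 applied to the SHA-256 digest of the block header. Below is a minimal Python sketch using the standard hashlib module. The input is a placeholder byte string rather than a real 80-byte block header, and the helper name is hypothetical.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice: the second pass hashes the first pass's digest."""
    first = hashlib.sha256(data).digest()
    return hashlib.sha256(first).digest()


# Placeholder input for illustration; a miner would hash a block header here.
digest = double_sha256(b"hello")
print(digest.hex())
```

A core routine that only exposes a single SHA-256 pass therefore has to be invoked twice per candidate, which is where the per-call overhead mentioned above would come from.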
|
|
|
|
khal
|
|
March 03, 2011, 11:53:10 PM |
|
A big thanks to chromicant! My hashing speed has more than doubled!
Here are all the steps I followed on a Debian (testing) system to obtain the compiled version:
sudo apt-get install git automake1.7 libc6-dev libcurl4-openssl-dev
#YASM : http://pkgs.org/download/debian-sid/multimedia-main-amd64/yasm_1.1.0-0.0_amd64.deb.html
wget http://ftp.br.debian.org/debian-multimedia/pool/main/y/yasm/yasm_1.1.0-0.0_amd64.deb
sudo dpkg -i yasm_1.1.0-0.0_amd64.deb
git clone https://github.com/chromicant/cpuminer.git
cd cpuminer
git checkout remotes/origin/sse2
cd x86_64/
./build.sh
cd ..
./autogen.sh
./configure
make
Here is the compiled binary: http://dl.free.fr/tjbLyHclU
Hope it will be useful :p
|
|
|
|
chromicant
Newbie
Offline
Activity: 40
Merit: 0
|
|
March 04, 2011, 12:29:34 AM |
|
Thanks.
OK, so here are two instruction-level benchmarks of the ufasoft code, one on Core i5, the other on AMD Phenom. A super quick glance at it seems to indicate that bitwise integer SSE ops (psrld and friends) are dirt cheap on Intel chips and rather heavy on AMD. Download the profile here: profile.tar.bz2
Thanks for the data! The information is enlightening. It looks like you get a (serious) stall if you access memory then try to use the register on AMD, and the rotates seem to be killer...which is quite a shock. The memory loads impact Intel as well, but it seems to be not as much... I can definitely clean up some of the memory loads, but something makes me think that there's not much I can do to help the AMD case. Anyone know why you see impacts on the instruction pipeline due to a ps(r/l)ld? Is there a better instruction for this?
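Some background on why the shifts show up so heavily: SSE2 has no packed 32-bit rotate instruction, so each rotate in SHA-256 is normally built from a right shift, a left shift and an OR (psrld, pslld, por). The Python sketch below shows that construction on a single 32-bit word. It illustrates the arithmetic only and says nothing about the cause of the AMD stalls; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def rotr32(x: int, n: int) -> int:
    """Rotate a 32-bit word right by n bits using two shifts and an OR."""
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32


def big_sigma0(x: int) -> int:
    """SHA-256 Sigma0: three rotates XORed together, so six shifts per call."""
    return rotr32(x, 2) ^ rotr32(x, 13) ^ rotr32(x, 22)


print(hex(rotr32(0x12345678, 8)))
```

Each SHA-256 round uses several such rotates, so any extra cost per shift is multiplied many times over in the core loop.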
|
|
|
|
Abraham
Newbie
Offline
Activity: 1
Merit: 0
|
|
March 04, 2011, 06:57:35 AM |
|
Is there a way of using this software through a proxy such as Tor? I am behind an ISP NAT router that blocks port 8332.
|
|
|
|
Raulo
|
|
March 04, 2011, 07:33:59 AM |
|
Interesting discussion about the AMD and Intel differences. But it does not explain why the compiled 4way code is so much faster on AMD than on Intel. If you compare the best code for AMD and the best code for Intel, they are very close in terms of MH/s per GHz.
|
1HAoJag4C3XtAmQJAhE9FTAAJWFcrvpdLM
|
|
|
hendi
Newbie
Offline
Activity: 57
Merit: 0
|
|
March 04, 2011, 11:38:21 AM |
|
I'll give 50 BTC to the person that implements this code into jgarzik's miner. First 25 BTC when working code is released, the other 25 BTC when it's pushed into jgarzik's cpuminer git repository.
Please take a look at https://github.com/chromicant/cpuminer/tree/sse2 [...] Please, if you like my work, donate at the address in my sig!
Great work, thanks a lot! I've sent you the first 25 BTC
|
|
|
|
|