Bitcoin Forum
Author Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner  (Read 408694 times)
hmage
Member
**
Offline Offline

Activity: 83
Merit: 10


View Profile
May 12, 2016, 11:57:06 AM
 #601

I have given you the benefit of the doubt and tried to probe you for more info in areas where I didn't have the confidence
to call you out. But so far it's come up empty. When you challenge me on one of my strengths you'd better be well
prepared.

I don't care if I challenge you or not, I'm not here for your entertainment.

10 runs of cpuminer-opt are giving results that are consistently less than 10 runs of cpuminer-multi on the algos listed above. Simple as that.

You're free to ignore this fact, of course. But I thought it'd be nice if you knew it.
joblo
Legendary
*
Offline Offline

Activity: 1134
Merit: 1016


View Profile
May 12, 2016, 04:04:48 PM
 #602

I have given you the benefit of the doubt and tried to probe you for more info in areas where I didn't have the confidence
to call you out. But so far it's come up empty. When you challenge me on one of my strengths you'd better be well
prepared.

I don't care if I challenge you or not, I'm not here for your entertainment.

10 runs of cpuminer-opt are giving results that are consistently less than 10 runs of cpuminer-multi on the algos listed above. Simple as that.

You're free to ignore this fact, of course. But I thought it'd be nice if you knew it.

When I give you constructive feedback you seem to get angry which is counterproductive. I thank you for your work
but it was not enough to draw any conclusions. A 2% difference is statistically insignificant. But let's assume it is real.

You suggested it was caused by the use of function pointers by algo-gate. I countered that my measurements when
algo-gate was implemented showed an improvement. That disproves your theory, one that was not supported by any
evidence BTW. So if the difference is real it must be caused by something else. There are a lot of possibilities.
Differences in CPU architecture (I don't mean capabilities) can cause measurable differences between algos. Cache
size and organization, execution environment, memory interface, etc can all cause different algos to perform differently
on different CPUs. If you look at HOdl it performs well on an i7 but poorly on an i5 due to the smaller cache. As it turns
out it was specifically optimized for the size of the i7 cache.

You need to do your research, get your facts straight and present a coherent case if you want to get any attention,
especially when you are criticizing someone's work. I have a thick skin, thicker than yours apparently, so I can take
it and give it back. Put yourself in my position: how would you react to someone taking pot shots about what you're
doing wrong and how you should do things? Oh, I already know, you get angry.

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
hmage
Member
**
Offline Offline

Activity: 83
Merit: 10


View Profile
May 12, 2016, 05:15:37 PM
 #603

Okay then, explain this: https://gist.github.com/hmage/2a1fdbd7bdad252cd08c9b4166c5727a

on Core i5-4570S:
Code:
hmage@dhmd:~/test$ cat /proc/cpuinfo |fgrep name|head -1
model name      : Intel(R) Core(TM) i5-4570S CPU @ 2.90GHz
hmage@dhmd:~/test$ gcc dereference_bench.c -O2 -o dereference_bench && ./dereference_bench
      workfunc(): 0.002082 microseconds per call, 480308.777k per second
  workloopfunc(): 0.001774 microseconds per call, 563746.643k per second

on Core i7-4770:
Code:
hmage@vhmd:~$ cat /proc/cpuinfo |fgrep name|head -1
model name      : Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
hmage@vhmd:~$ gcc dereference_bench.c -O2 -o dereference_bench && ./dereference_bench
      workfunc(): 0.001776 microseconds per call, 562932.922k per second
  workloopfunc(): 0.001506 microseconds per call, 664150.879k per second


Dereferencing on every call _is_ a big performance hit, unless you have another explanation.

Latency numbers every programmer should know -- https://gist.github.com/hellerbarde/2843375

Oh, I already know, you get angry.

It looks to me that it was you who got angry. I apologise for my blunt approach.
joblo
Legendary
*
Offline Offline

Activity: 1134
Merit: 1016


View Profile
May 12, 2016, 05:46:02 PM
 #604

Okay then, explain this: https://gist.github.com/hmage/2a1fdbd7bdad252cd08c9b4166c5727a

on Core i5-4570S:
Code:
hmage@dhmd:~/test$ cat /proc/cpuinfo |fgrep name|head -1
model name      : Intel(R) Core(TM) i5-4570S CPU @ 2.90GHz
hmage@dhmd:~/test$ gcc dereference_bench.c -O2 -o dereference_bench && ./dereference_bench
      workfunc(): 0.002082 microseconds per call, 480308.777k per second
  workloopfunc(): 0.001774 microseconds per call, 563746.643k per second

on Core i7-4770:
Code:
hmage@vhmd:~$ cat /proc/cpuinfo |fgrep name|head -1
model name      : Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
hmage@vhmd:~$ gcc dereference_bench.c -O2 -o dereference_bench && ./dereference_bench
      workfunc(): 0.001776 microseconds per call, 562932.922k per second
  workloopfunc(): 0.001506 microseconds per call, 664150.879k per second


Dereferencing on every call _is_ a big performance hit, unless you have another explanation.

Oh, I already know, you get angry.

It looks to me that it was you who got angry. I apologise for my blunt approach.

A little impatient maybe but not really angry. I try to stick to the issues.

Yes, dereferencing a pointer to call a function adds overhead but it has to be taken in context.
How often does that occur in the big picture? Take scanhash, for example, the lowest level function
that is gated. Each scan takes seconds to run so the overhead of one extra pointer deref every few
seconds is immeasurable. Even if you go up a level to the miner_thread loop, there are maybe 20
gated function calls every loop. 20 extra derefs every few seconds is still immeasurable.

Any change of program flow has overhead, that's why function inlining and loop unrolling exist.
But if the code size of an unrolled loop overflows the cache you may end up losing more performance
from cache misses than you gained from inlining.

This might answer your question:

https://bitcointalk.org/index.php?topic=1326803.msg13770966#msg13770966

I clearly stated I did not predict a performance gain from algo-gate and if you dig deeper you may find
where I did acknowledge the overhead of the deref but was at a loss to explain why I observed a performance
gain. Maybe my observations were just noise, maybe some other change is responsible for the increase in
performance in spite of the gate. I just don't know. There are too many variables that can't be controlled so
I dismiss such observations without a solid case to back it up.

Finally what it comes down to, like any decision, is a balance. Algo-gate was never about performance it was
about a better architecture that made it easier for developers to add new algos to the miner with minimal
disruption to the existing code. I judged the performance cost to be negligible.

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
hmage
Member
**
Offline Offline

Activity: 83
Merit: 10


View Profile
May 12, 2016, 08:06:01 PM
 #605

I did acknowledge the overhead of the deref but was at a loss to explain why I observed a performance
gain.

You didn't provide numbers, unfortunately, and you didn't provide a way to recreate the benchmarks to verify your claims either, since there's no archive of older versions of cpuminer-opt to build against. If it were on github, for example, that would have been easier to test.

Each scan takes seconds to run so the overhead of one extra pointer deref every few
seconds is immeasurable. Even if you go up a level to the miner_thread loop. There are maybe 20
gated function calls every loop. 20 extra derefs every few seconds is still immeasurable.

That was the info I was looking for, thank you.

This whole debate was too long just because either I didn't communicate clearly enough that I am assuming it is done on every hash call or because you didn't recognize that when reading. Pseudocode should have been a big hint at that.

Either way, this debate is pointless, 20 calls a second isn't something to worry about. The observed slowdown must be caused by other factors.
joblo
Legendary
*
Offline Offline

Activity: 1134
Merit: 1016


View Profile
May 12, 2016, 09:22:55 PM
 #606

I did acknowledge the overhead of the deref but was at a loss to explain why I observed a performance
gain.

You didn't provide numbers, unfortunately, and you didn't provide a way to recreate the benchmarks to verify your claims either, since there's no archive of older versions of cpuminer-opt to build against. If it were on github, for example, that would have been easier to test.

Each scan takes seconds to run so the overhead of one extra pointer deref every few
seconds is immeasurable. Even if you go up a level to the miner_thread loop. There are maybe 20
gated function calls every loop. 20 extra derefs every few seconds is still immeasurable.

That was the info I was looking for, thank you.

This whole debate was too long just because either I didn't communicate clearly enough that I am assuming it is done on every hash call or because you didn't recognize that when reading. Pseudocode should have been a big hint at that.

Either way, this debate is pointless, 20 calls a second isn't something to worry about. The observed slowdown must be caused by other factors.


I think you hit the nail on the head when you said you made an assumption. That was, IMO, your biggest mistake and why I
kept repeating that you need to do your homework before bringing it to my attention. Had you done that you would have realized
yourself that the deref overhead was trivial and any observed performance diff was due to something else.

It was my assumption that you would have already done that. We both made assumptions, not a good idea.

I didn't have numbers because there was no way to run a controlled test with the necessary level of precision and accuracy.
And it's also why I suggested it wasn't worth your effort to go back and retest previous releases.

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
hmage
Member
**
Offline Offline

Activity: 83
Merit: 10


View Profile
May 12, 2016, 10:34:38 PM
 #607

It was my assumption that you would have already done that. We both made assumptions, not a good idea.

Yeap. I have only glanced briefly at the source code.

Anyway, I should apologise for my behaviour; it was unprofessional and that led to less productive results. You weren't perfect either, but everyone has faults since everyone is human. Every suggestion or problem report felt like a court trial, just because of how much work needed to be done on my end compared to what I saw being done on your end (you always asked for research or more data without seemingly doing any research of your own before passing judgement). I really like your work so far and appreciate it very much, though, and don't want to distract you from it more than I already have.
joblo
Legendary
*
Offline Offline

Activity: 1134
Merit: 1016


View Profile
May 12, 2016, 11:31:02 PM
 #608

It was my assumption that you would have already done that. We both made assumptions, not a good idea.

Yeap. I have only glanced briefly at the source code.

Anyway, I should apologise for my behaviour; it was unprofessional and that led to less productive results. You weren't perfect either, but everyone has faults since everyone is human. Every suggestion or problem report felt like a court trial, just because of how much work needed to be done on my end compared to what I saw being done on your end (you always asked for research or more data without seemingly doing any research of your own before passing judgement). I really like your work so far and appreciate it very much, though, and don't want to distract you from it more than I already have.

Your perception of a court trial is pretty accurate. I was thinking something similar, a lawyer gets one crack at presenting
a case. If the lawyer comes to court unprepared the case gets tossed and he doesn't get another chance.

Although I'm atheist a Bible passage comes to mind. Let he who is without sin throw the first stone. The implication being
that no one is without sin. I simply picked up the stones and threw them back.

An apology is not required, coming to an understanding and learning from it is more important, and applies to both of us.
Nevertheless you offered one and I accept. For my part I'm not one to apologize for my actions, too stubborn, I guess.
But in hindsight I think the timing was bad. I had just released v3.2 and had broken zr5, which was embarrassing, and was
trying to focus on that issue. In fact I am not pleased with the overall quality of my releases, too many bad ones.
I expect better of myself. Am I losing my edge or is it because I forgot what it was like to be on a steep learning curve
after so long being a subject matter expert? Yeah, I'm arrogant too.

No hard feelings. Cheers.

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
pallas
Legendary
*
Offline Offline

Activity: 1652
Merit: 1081


Black Belt Developer


View Profile
May 13, 2016, 07:38:16 AM
 #609

I agree, what counts is going ahead in the way of knowledge.
Everybody does it his way. Some just stand still but that's not the kind of people usually posting here.

joblo
Legendary
*
Offline Offline

Activity: 1134
Merit: 1016


View Profile
May 15, 2016, 03:18:55 AM
 #610

Download cpuminer-opt v3.2.2:

https://drive.google.com/file/d/0B0lVSGQYLJIZX1F4dHd2NlBHSXc/view?usp=sharing

I finally found the root cause of the zr5 bug. I still don't understand why it seemed to
work in v3.2.1, since the original bug from v3.2 was still present. This release is what
v3.2 should have been.

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
joblo
Legendary
*
Offline Offline

Activity: 1134
Merit: 1016


View Profile
May 15, 2016, 02:22:35 PM
 #611

I still have a lot to learn about c/c++. I got burned by pointer arithmetic this week. It seemed only logical
to me that  "p + n" would be a byte offset while "p[ i ]" would be scaled. Surprise, they're both scaled.

My next issue is how to consolidate the definitions of frequently used text strings. In my native language
it's a simple matter of defining the strings in a header file and referencing them in many source files. This approach
causes multi-def warnings in c/c++.

I often see #define macros but they result in the strings being copied by every reference.

Does anyone know of a way in c/c++ to have one definition with multiple references that don't make copies?

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
pallas
Legendary
*
Offline Offline

Activity: 1652
Merit: 1081


Black Belt Developer


View Profile
May 15, 2016, 03:13:51 PM
 #612

I still have a lot to learn about c/c++. I got burned by pointer arithmetic this week. It seemed only logical
to me that  "p + n" would be a byte offset while "p[ i ]" would be scaled. Surprise, they're both scaled.

My next issue is how to consolidate the definitions of frequently used text strings. In my native language
it's a simple matter of defining the strings in a header file and referencing them in many source files. This approach
causes multi-def warnings in c/c++.

I often see #define macros but they result in the strings being copied by every reference.

Does anyone know of a way in c/c++ to have one definition with multiple references that don't make copies?

-fmerge-constants
Attempt to merge identical constants (string constants and floating-point constants) across compilation units.
This option is the default for optimized compilation if the assembler and linker support it. Use -fno-merge-constants to inhibit this behavior.

Enabled at levels -O, -O2, -O3, -Os.

joblo
Legendary
*
Offline Offline

Activity: 1134
Merit: 1016


View Profile
May 15, 2016, 04:13:52 PM
 #613

I still have a lot to learn about c/c++. I got burned by pointer arithmetic this week. It seemed only logical
to me that  "p + n" would be a byte offset while "p[ i ]" would be scaled. Surprise, they're both scaled.

My next issue is how to consolidate the definitions of frequently used text strings. In my native language
it's a simple matter of defining the strings in a header file and referencing them in many source files. This approach
causes multi-def warnings in c/c++.

I often see #define macros but they result in the strings being copied by every reference.

Does anyone know of a way in c/c++ to have one definition with multiple references that don't make copies?

-fmerge-constants
Attempt to merge identical constants (string constants and floating-point constants) across compilation units.
This option is the default for optimized compilation if the assembler and linker support it. Use -fno-merge-constants to inhibit this behavior.

Enabled at levels -O, -O2, -O3, -Os.

Thanks Pallas.

The description of this option indicates it tries to merge multiple explicit definitions of the same constant which means
I'm worrying about nothing. I'm trying to merge explicitly defined identical constants by making a single
definition while it seems the compiler will do it transparently. Outsmarted by the compiler again, I just wish things
wouldn't break when the compiler overrides my code.

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
joblo
Legendary
*
Offline Offline

Activity: 1134
Merit: 1016


View Profile
May 19, 2016, 04:27:14 AM
 #614

cpuminer-opt v3.2.3 is released.

More restructuring, code cleanup and bug fixes. This should be the best release yet.

https://drive.google.com/file/d/0B0lVSGQYLJIZMWdsV21XM0tob0U/view?usp=sharing

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
Ayers
Legendary
*
Offline Offline

Activity: 1358
Merit: 1000


View Profile
May 19, 2016, 05:49:00 AM
 #615

Is this better than the Wolf0 version for hodl and espers?
joblo
Legendary
*
Offline Offline

Activity: 1134
Merit: 1016


View Profile
May 19, 2016, 01:28:52 PM
 #616

Is this better than the Wolf0 version for hodl and espers?

It should be equal to Wolf0 on hodl and esper is not implemented yet.

Edit: I'm taking a look at espers and I think I can improve it.

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
joblo
Legendary
*
Offline Offline

Activity: 1134
Merit: 1016


View Profile
May 19, 2016, 06:18:05 PM
 #617

cpuminer-opt v3.2.4 with support for hmq1725 (espers).

https://drive.google.com/file/d/0B0lVSGQYLJIZY1BVV1RGZFlJclU/view?usp=sharing

It's 56% faster than cpuminer-hmq1725 on CPUs with AES_NI, 17% faster with SSE2.
I recommend using the GPU port at suprnova, the CPU port has a rough start before the diff
settles.

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
monoxide
Hero Member
*****
Offline Offline

Activity: 774
Merit: 500



View Profile
May 19, 2016, 09:25:57 PM
 #618

cpuminer-opt v3.2.4 with support for hmq1725 (espers).

https://drive.google.com/file/d/0B0lVSGQYLJIZY1BVV1RGZFlJclU/view?usp=sharing

it's 56% faster than cpuminer-hmq1725 on CPUs with AES_NI, 17% faster with SSE2.
I recommend using the GPU port at suprnova, the CPU port has a rough start before the diff
settles.


Is there a way to get a Windows version?
joblo
Legendary
*
Offline Offline

Activity: 1134
Merit: 1016


View Profile
May 19, 2016, 09:33:53 PM
 #619

cpuminer-opt v3.2.4 with support for hmq1725 (espers).

https://drive.google.com/file/d/0B0lVSGQYLJIZY1BVV1RGZFlJclU/view?usp=sharing

it's 56% faster than cpuminer-hmq1725 on CPUs with AES_NI, 17% faster with SSE2.
I recommend using the GPU port at suprnova, the CPU port has a rough start before the diff
settles.


Is there a way to get a Windows version?


You can use Virtualbox to create a virtual Linux machine, same performance if you allocate
all CPUs to the VM.

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
Dabs
Staff
Legendary
*
Offline Offline

Activity: 2044
Merit: 1096



View Profile
May 19, 2016, 11:26:01 PM
 #620

What do you need to know? Everything should be in the README.md file. Debian should be no problem, nor any other
major distro. I didn't mention it because I assumed most Windows users would find Ubuntu less intimidating.

So, here's the tentative plan:

1. set up a new VM with all cores and 4 or 8 GB of RAM
2. install Debian on it, probably a net install of Debian 8
3. run build.sh
4. install whatever coin wallet software and set it up so I can mine to localhost.

This is almost what I do with my Windows VM; installed Win 10 on it, downloaded the wallet software, and ran three different versions of cpuminer on it. (I come from the ESPERS thread, there have been 3 versions of cpuminers there I think.)

I run Windows Server 2012 R2, so I use Hyper-V on one of my little boxes. I got it cheap, used, something like $600 USD for a dual quad core 48 GB ram rack server.

Escrow Service (Services) - GPG ID: 32AD7565, OTC ID: Dabs
All messages concerning escrow or with bitcoin addresses are GPG signed. Please verify.
CompTIA A+, Microsoft Certified Professional, MCSA: Windows 10; Windows Server 2012, MCSE: Cloud Platform and Infrastructure; Productivity; Messaging