Bitcoin Forum
July 22, 2018, 09:12:37 AM *
Author Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner  (Read 411541 times)
hmage
Member
**
Offline Offline

Activity: 83
Merit: 10


View Profile
May 12, 2016, 05:15:37 PM
 #601

Okay then, explain this: https://gist.github.com/hmage/2a1fdbd7bdad252cd08c9b4166c5727a

on Core i5-4570S:
Code:
hmage@dhmd:~/test$ cat /proc/cpuinfo |fgrep name|head -1
model name      : Intel(R) Core(TM) i5-4570S CPU @ 2.90GHz
hmage@dhmd:~/test$ gcc dereference_bench.c -O2 -o dereference_bench && ./dereference_bench
      workfunc(): 0.002082 microseconds per call, 480308.777k per second
  workloopfunc(): 0.001774 microseconds per call, 563746.643k per second

on Core i7-4770:
Code:
hmage@vhmd:~$ cat /proc/cpuinfo |fgrep name|head -1
model name      : Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
hmage@vhmd:~$ gcc dereference_bench.c -O2 -o dereference_bench && ./dereference_bench
      workfunc(): 0.001776 microseconds per call, 562932.922k per second
  workloopfunc(): 0.001506 microseconds per call, 664150.879k per second


Dereferencing on every call _is_ a big performance hit, unless you have another explanation.

Latency numbers every programmer should know -- https://gist.github.com/hellerbarde/2843375
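
For readers who cannot open the gist, here is a minimal sketch of the kind of comparison being described. It is not the gist's code: `work`, `work_ptr`, `run_indirect`, `run_direct` and `seconds_for` are hypothetical names, and the `volatile` qualifier is only there so the compiler has to reload the pointer on every call instead of hoisting it.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical work function: stands in for one very cheap hash step. */
static uint32_t work(uint32_t x) { return x * 2654435761u + 1u; }

/* volatile forces a real load of the pointer before every call. */
static uint32_t (*volatile work_ptr)(uint32_t) = work;

/* Case A: the pointer is dereferenced once per iteration. */
uint32_t run_indirect(uint32_t n) {
    uint32_t acc = 0;
    for (uint32_t i = 0; i < n; i++) acc = work_ptr(acc);
    return acc;
}

/* Case B: the loop calls the function directly, so it can be inlined. */
uint32_t run_direct(uint32_t n) {
    uint32_t acc = 0;
    for (uint32_t i = 0; i < n; i++) acc = work(acc);
    return acc;
}

/* Sanity check: both paths must compute the same value. */
void check_equivalent(uint32_t n) {
    assert(run_indirect(n) == run_direct(n));
}

/* CPU time in seconds spent in fn(n); the volatile sink keeps the result live. */
double seconds_for(uint32_t (*fn)(uint32_t), uint32_t n) {
    clock_t t0 = clock();
    volatile uint32_t sink = fn(n);
    (void)sink;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Calling `seconds_for(run_indirect, n)` and `seconds_for(run_direct, n)` with a large `n` gives two timings to compare. The gap only matters when the called function is as cheap as `work` here; the numbers depend heavily on compiler flags and CPU.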

Oh, I already know, you get angry.

It looks to me that it was you who got angry. I apologise for my blunt approach.
joblo
Legendary
*
Offline Offline

Activity: 1148
Merit: 1016


View Profile
May 12, 2016, 05:46:02 PM
 #602

A little impatient maybe but not really angry. I try to stick to the issues.

Yes, dereferencing a pointer to call a function adds overhead, but it has to be taken in context.
How often does that occur in the big picture? Take scanhash, for example, the lowest-level function
that is gated. Each scan takes seconds to run, so the overhead of one extra pointer deref every few
seconds is immeasurable. Even if you go up a level to the miner_thread loop, there are maybe 20
gated function calls every loop. 20 extra derefs every few seconds is still immeasurable.
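
To illustrate the point, here is a minimal sketch of a gated scan function. It is not the cpuminer-opt source: `gate_t`, `toy_scanhash`, `register_toy_algo` and `scan_once` are hypothetical names, and the "hash" is a placeholder multiplication. What it shows is that the pointer is dereferenced once per scan while the hot loop runs entirely inside the gated function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical gate: one function pointer per algo-specific entry point. */
typedef struct {
    int (*scanhash)(uint32_t start, uint32_t count, uint32_t *found);
} gate_t;

/* Toy scan: the per-nonce loop runs inside the gated function, so the
 * indirect call is paid once per scan, not once per hash. */
static int toy_scanhash(uint32_t start, uint32_t count, uint32_t *found) {
    for (uint32_t n = start; n < start + count; n++) {
        uint32_t h = n * 2654435761u;          /* stand-in for a real hash */
        if ((h >> 20) == 0) { *found = n; return 1; }
    }
    return 0;
}

static gate_t gate = { NULL };

/* Done once at startup when the algo is selected. */
void register_toy_algo(void) { gate.scanhash = toy_scanhash; }

/* One pointer dereference per call, regardless of how large count is. */
int scan_once(uint32_t start, uint32_t count, uint32_t *found) {
    assert(gate.scanhash != NULL);
    return gate.scanhash(start, count, found);
}
```

If each `scan_once` call covers seconds of hashing, the single indirect call is lost in the noise. The cost would only show up if the gate sat around every individual hash.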

Any change of program flow has overhead; that's why function inlining and loop unrolling exist.
But if the code size of an unrolled loop overflows the cache you may end up losing more performance
from cache misses than you gained from unrolling.

This might answer your question:

https://bitcointalk.org/index.php?topic=1326803.msg13770966#msg13770966

I clearly stated I did not predict a performance gain from algo-gate, and if you dig deeper you may find
where I did acknowledge the overhead of the deref but was at a loss to explain why I observed a performance
gain. Maybe my observations were just noise, maybe some other change is responsible for the increase in
performance in spite of the gate. I just don't know. There are too many variables that can't be controlled, so
I dismiss such observations without a solid case to back them up.

Finally, what it comes down to, like any decision, is a balance. Algo-gate was never about performance; it was
about a better architecture that made it easier for developers to add new algos to the miner with minimal
disruption to the existing code. I judged the performance cost to be negligible.

cpuminer-opt developer, https://bitcointalk.org/index.php?topic=1326803.0
BTC: 12tdvfF7KmAsihBXQXynT6E6th2c2pByTT,
ETH: 0x72122edabcae9d3f57eab0729305a425f6fef6d0
hmage
Member
**
Offline Offline

Activity: 83
Merit: 10


View Profile
May 12, 2016, 08:06:01 PM
 #603

I did acknowledge the overhead of the deref but was at a loss to explain why I observed a performance
gain.

You didn't provide numbers, unfortunately, and you didn't provide a way to recreate the benchmarks to verify your claims either, since there's no archive of older versions of cpuminer-opt to build against. If it were on github, for example, that would have been easier to test.

Each scan takes seconds to run so the overhead of one extra pointer deref every few
seconds is immeasurable. Even if you go up a level to the miner_thread loop. There are maybe 20
gated function calls every loop. 20 extra derefs every few seconds is still immeasurable.

That was the info I was looking for, thank you.

This whole debate ran too long, either because I didn't communicate clearly enough that I was assuming the deref happens on every hash call, or because you didn't recognize that when reading. The pseudocode should have been a big hint at that.

Either way, this debate is pointless, 20 calls a second isn't something to worry about. The observed slowdown must be caused by other factors.
joblo
Legendary
*
Offline Offline

Activity: 1148
Merit: 1016


View Profile
May 12, 2016, 09:22:55 PM
 #604

I think you hit the nail on the head when you said you made an assumption. That was, IMO, your biggest mistake and why I
kept repeating that you need to do your homework before bringing it to my attention. Had you done that you would have realized
yourself that the deref overhead was trivial and any observed performance diff was due to something else.

It was my assumption that you would have already done that. We both made assumptions, not a good idea.

I didn't have numbers because there was no way to run a controlled test with the necessary level of precision and accuracy.
And it's also why I suggested it wasn't worth your effort to go back and retest previous releases.

hmage
Member
**
Offline Offline

Activity: 83
Merit: 10


View Profile
May 12, 2016, 10:34:38 PM
 #605

It was my assumption that you would have already done that. We both made assumptions, not a good idea.

Yeap. I have only glanced briefly at the source code.

Anyway, I should apologise for my behaviour; it was unprofessional and that led to less productive results. You weren't perfect either, but everyone has faults since everyone is human. Every suggestion or problem report felt like a court trial, given how much work needed to be done on my end compared to what I saw being done on your end regarding the issue or suggestion (you always asked for research or just more data, seemingly without doing any research of your own before passing judgement). I really like your work so far and appreciate it very much, though, and don't want to distract you from it more than I already have.
joblo
Legendary
*
Offline Offline

Activity: 1148
Merit: 1016


View Profile
May 12, 2016, 11:31:02 PM
 #606

Your perception of a court trial is pretty accurate. I was thinking something similar, a lawyer gets one crack at presenting
a case. If the lawyer comes to court unprepared the case gets tossed and he doesn't get another chance.

Although I'm an atheist, a Bible passage comes to mind: let he who is without sin throw the first stone. The implication being
that no one is without sin. I simply picked up the stones and threw them back.

An apology is not required; coming to an understanding and learning from it is more important, and applies to both of us.
Nevertheless you offered one and I accept. For my part I'm not one to apologize for my actions, too stubborn, I guess.
But in hindsight I think the timing was bad. I had just released v3.2 and had broken zr5, which was embarrassing, and was
trying to focus on that issue. In fact I am not pleased with the overall quality of my releases, too many bad ones.
I expect better of myself. Am I losing my edge or is it because I forgot what it was like to be on a steep learning curve
after so long being a subject matter expert? Yeah, I'm arrogant too.

No hard feelings. Cheers.

pallas
Legendary
*
Offline Offline

Activity: 1722
Merit: 1083


Black Belt Developer


View Profile
May 13, 2016, 07:38:16 AM
 #607

I agree, what counts is going ahead in the way of knowledge.
Everybody does it his way. Some just stand still but that's not the kind of people usually posting here.

joblo
Legendary
*
Offline Offline

Activity: 1148
Merit: 1016


View Profile
May 15, 2016, 03:18:55 AM
 #608

Download cpuminer-opt v3.2.2:

https://drive.google.com/file/d/0B0lVSGQYLJIZX1F4dHd2NlBHSXc/view?usp=sharing

I finally found the root cause of the zr5 bug. I still don't understand why it seemed to
work in v3.2.1, since the original bug from v3.2 was still present. This release is what
v3.2 should have been.

joblo
Legendary
*
Offline Offline

Activity: 1148
Merit: 1016


View Profile
May 15, 2016, 02:22:35 PM
 #609

I still have a lot to learn about C/C++. I got burned by pointer arithmetic this week. It seemed only logical
to me that "p + n" would be a byte offset while "p[ i ]" would be scaled. Surprise, they're both scaled.
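
A small sketch of that rule, for anyone else who trips over it: in C, `p[n]` is defined as `*(p + n)`, so both forms advance by `n * sizeof(*p)` bytes. A true byte offset needs a cast to a character pointer first. The helper names below are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes between two addresses, measured through char pointers. */
ptrdiff_t byte_distance(const void *from, const void *to) {
    return (const char *)to - (const char *)from;
}

/* p + n and &p[n] are the same address by definition. */
int same_address(const uint32_t *p, size_t n) {
    return (p + n) == &p[n];
}

/* Read the element that sits `off` BYTES past p: cast to a byte pointer,
 * add the offset, then cast back. */
uint32_t load_at_byte_offset(const uint32_t *p, size_t off) {
    assert(off % sizeof(uint32_t) == 0);   /* keep the access aligned */
    return *(const uint32_t *)((const char *)p + off);
}
```

With `uint32_t a[4]`, `byte_distance(a, a + 2)` is `2 * sizeof(uint32_t)`, not 2, and `load_at_byte_offset(a, 2 * sizeof(uint32_t))` reads `a[2]`.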

My next issue is how to consolidate the definitions of frequently used text strings. In my native language
it's a simple matter of defining the strings in a header file and referencing them in many source files. This approach
causes multiple-definition warnings in C/C++.

I often see #define macros, but they result in the strings being copied by every reference.

Does anyone know of a way in C/C++ to have one definition with multiple references that don't make copies?
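
One common C pattern for this, shown as a sketch rather than a recommendation for cpuminer-opt specifically: declare the string `extern` in a header and define it in exactly one source file. The header name `miner_strings.h` and the identifiers below are hypothetical; everything is collapsed into one file here so it compiles on its own.

```c
#include <assert.h>
#include <string.h>

/* --- would live in a shared header, e.g. miner_strings.h (hypothetical) --- */
extern const char algo_name_default[];      /* declaration only, no storage */

/* --- would live in exactly one .c file --- */
const char algo_name_default[] = "hmq1725"; /* the single definition */

/* --- any other source file includes the header and uses the name --- */
const char *name_seen_by_module_a(void) { return algo_name_default; }
const char *name_seen_by_module_b(void) { return algo_name_default; }

/* Length helper; the assert documents that the string is never empty. */
size_t default_name_length(void) {
    assert(algo_name_default[0] != '\0');
    return strlen(algo_name_default);
}
```

Both accessor functions return the same address, because there is only one object. Putting the definition itself (with the initializer) in the header is what triggers the multiple-definition complaints, since every source file that includes it then defines its own copy.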

pallas
Legendary
*
Offline Offline

Activity: 1722
Merit: 1083


Black Belt Developer


View Profile
May 15, 2016, 03:13:51 PM
 #610

-fmerge-constants
Attempt to merge identical constants (string constants and floating-point constants) across compilation units.
This option is the default for optimized compilation if the assembler and linker support it. Use -fno-merge-constants to inhibit this behavior.

Enabled at levels -O, -O2, -O3, -Os.

joblo
Legendary
*
Offline Offline

Activity: 1148
Merit: 1016


View Profile
May 15, 2016, 04:13:52 PM
 #611

Thanks Pallas.

The description of this option indicates it tries to merge multiple explicit definitions of the same constant, which means
I'm worrying about nothing. I'm trying to merge explicitly defined identical constants by making a single
definition while it seems the compiler will do it transparently. Outsmarted by the compiler again. I just wish things
wouldn't break when the compiler overrides my code.

joblo
Legendary
*
Offline Offline

Activity: 1148
Merit: 1016


View Profile
May 19, 2016, 04:27:14 AM
 #612

cpuminer-opt v3.2.3 is released.

More restructuring, code cleanup and bug fixes. This should be the best release yet.

https://drive.google.com/file/d/0B0lVSGQYLJIZMWdsV21XM0tob0U/view?usp=sharing

Ayers
Legendary
*
Offline Offline

Activity: 1386
Merit: 1000


View Profile
May 19, 2016, 05:49:00 AM
 #613

Is this better than the Wolf0 version for hodl and espers?
joblo
Legendary
*
Offline Offline

Activity: 1148
Merit: 1016


View Profile
May 19, 2016, 01:28:52 PM
 #614

It should be equal to Wolf0 on hodl; espers is not implemented yet.

Edit: I'm taking a look at espers and I think I can improve it.

joblo
Legendary
*
Offline Offline

Activity: 1148
Merit: 1016


View Profile
May 19, 2016, 06:18:05 PM
 #615

cpuminer-opt v3.2.4 with support for hmq1725 (espers).

https://drive.google.com/file/d/0B0lVSGQYLJIZY1BVV1RGZFlJclU/view?usp=sharing

It's 56% faster than cpuminer-hmq1725 on CPUs with AES_NI, 17% faster with SSE2.
I recommend using the GPU port at suprnova; the CPU port has a rough start before the diff
settles.

monoxide
Hero Member
*****
Offline Offline

Activity: 774
Merit: 500



View Profile
May 19, 2016, 09:25:57 PM
 #616

Is there a way to get a Windows version?
joblo
Legendary
*
Offline Offline

Activity: 1148
Merit: 1016


View Profile
May 19, 2016, 09:33:53 PM
 #617

You can use VirtualBox to create a virtual Linux machine; you get the same performance if you allocate
all CPUs to the VM.

Dabs
Staff
Legendary
*
Offline Offline

Activity: 2100
Merit: 1104



View Profile
May 19, 2016, 11:26:01 PM
 #618

What do you need to know? Everything should be in the README.md file. Debian should be no problem, nor any other
major distro. I didn't mention it because I assumed most Windows users would find Ubuntu less intimidating.

So, here's the tentative plan:

1. set up a new VM with all cores and 4 or 8 GB of RAM
2. install Debian on it, probably a net install of Debian 8
3. run build.sh
4. install whatever coin wallet software and set it up so I can mine to localhost.

This is almost what I do with my Windows VM; installed Win 10 on it, downloaded the wallet software, and ran three different versions of cpuminer on it. (I come from the ESPERS thread, there have been 3 versions of cpuminers there I think.)

I run Windows Server 2012 R2, so I use Hyper-V on one of my little boxes. I got it cheap, used, something like $600 USD for a dual quad core 48 GB ram rack server.

Escrow Service (Services) - GPG ID: 32AD7565, OTC ID: Dabs
All messages concerning escrow or with bitcoin addresses are GPG signed. Please verify.
CompTIA A+, Microsoft Certified Professional, MCSA: Windows 10; Windows Server 2012, MCSE: Cloud Platform and Infrastructure; Productivity; Messaging
joblo
Legendary
*
Offline Offline

Activity: 1148
Merit: 1016


View Profile
May 19, 2016, 11:52:53 PM
 #619

Only stratum has been tested so I have no idea how it will behave if you try anything else.

Dabs
Staff
Legendary
*
Offline Offline

Activity: 2100
Merit: 1104



View Profile
May 20, 2016, 01:24:00 AM
 #620

The miners I tried used getwork gbt version 112. This fork should work similarly. I'm willing to do the testing. I've been solo mining ESPERS.
