Bitcoin Forum
June 21, 2024, 07:32:08 AM *
1  Alternate cryptocurrencies / Announcements (Altcoins) / Re: [ANN] Iridium - People are Power - PoW - No Premine - Community Built on: December 11, 2017, 05:53:57 PM
Might want to add a link to the MSVC runtime Redistributable installer.  If you don't have it already, it is required for the Windows installer to complete.
Otherwise, very nice job.  GUI looks slick.
What do you mean? Nobody has reported a problem with the Windows installer yet. Have you?
But OK: release updated to include it.

Most people probably have the Visual C Runtimes installed from other things.  I was installing on a VM that I use only for crypto and had recently reimaged, so VCRT hadn't been installed yet.  The installer errored out with messages about missing DLLs that are part of the runtime.
2  Alternate cryptocurrencies / Announcements (Altcoins) / Re: [ANN] Iridium - People are Power - PoW - No Premine - Community Built on: December 11, 2017, 02:59:24 PM
Might want to add a link to the MSVC runtime Redistributable installer.  If you don't have it already, it is required for the Windows installer to complete.

Otherwise, very nice job.  GUI looks slick.
3  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN]: cpuminer-opt v3.4.3, NEW faster m7m, Windows binaries on: August 26, 2016, 03:52:29 PM
Joblo --

Some further testing / updates for you:  Looks like there is an issue with compiling for AMD non-AES_NI capable processors and some older Intel processors --- but it seems to exist in the pristine 3.4.3 chain as well under GCC 6.1.0 so it does not appear to be due to the diffs I've made.

I don't have all the platforms to test binaries, but I have at least been able to successfully compile for all Intel architectures back as far as core2 --- the compile errors pop back up when I try to build with -march=nocona or earlier.  AMD builds work for anything newer than barcelona/amdfam10.


Thanks, that helps. I'm still a little concerned about being unable to both compile and test on the native HW. I'm pretty confident
your changes will not negatively impact other Intel architectures while helping Westmere but I'm not so sure about AMD.

AMD and Intel diverged between SSE4 and AVX. AMD was developing their own SSE5 which was not fully compatible with Intel's AVX.
They eventually converged but there may have been a period where AMD support was not aligned with Intel. This could mean the AVX
check does not work properly on some early AES AMD CPUs. This is somewhat speculative but plausible.

What it comes down to is whether I play it safe at the expense of Westmere performance or improve Westmere for a known and contributing
user at the risk of breaking some unknown AMD users. I'm leaning toward the latter.

I have some systems lying around with AMD CPUs.  I'll see what I've got that is running and run some tests if I can.
4  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN]: cpuminer-opt v3.4.3, NEW faster m7m, Windows binaries on: August 26, 2016, 03:02:41 PM
Joblo --

Some further testing / updates for you:  Looks like there is an issue with compiling for AMD non-AES_NI capable processors and some older Intel processors --- but it seems to exist in the pristine 3.4.3 chain as well under GCC 6.1.0 so it does not appear to be due to the diffs I've made.

I don't have all the platforms to test binaries, but I have at least been able to successfully compile for all Intel architectures back as far as core2 --- the compile errors pop back up when I try to build with -march=nocona or earlier.  AMD builds work for anything newer than barcelona/amdfam10.
5  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN]: cpuminer-opt v3.4.3, NEW faster m7m, Windows binaries on: August 25, 2016, 09:02:15 PM
joblo ....

Redid my set of changes on a clean copy of your 3.4.3 codebase.  With these changes it compiles on my Westmere CPU with -march=westmere.  Here are the diffs:

$ diff miner.h miner.h.orig
49a50,56
> #ifndef min
> #define min(a,b) (a>b ? b : a)
> #endif
> #ifndef max
> #define max(a,b) (a<b ? b : a)
> #endif
>

$ diff algo/blake/decred.c algo/blake/decred.c.orig
9,10d8
< #define min(a,b) (a>b ? b : a)
<
$ diff algo/hodl/aes.c algo/hodl/aes.c.orig
85a86,87
> #ifdef __AVX__
>
149a152,178
>
> #else    // NO AVX
>
> static inline __m128i AES256Core(__m128i State, const __m128i *ExpandedKey)
> {
>         State = _mm_xor_si128(State, ExpandedKey[0]);
>
>         for(int i = 1; i < 14; ++i) State = _mm_aesenc_si128(State, ExpandedKey[i]);
>
>         return(_mm_aesenclast_si128(State, ExpandedKey[14]));
> }
>
> void AES256CBC(__m128i *Ciphertext, const __m128i *Plaintext, const __m128i *ExpandedKey, __m128i IV, uint32_t BlockCount)
> {
>         __m128i State = _mm_xor_si128(Plaintext[0], IV);
>         State = AES256Core(State, ExpandedKey);
>         Ciphertext[0] = State;
>
>         for(int i = 1; i < BlockCount; ++i)
>         {
>                 State = _mm_xor_si128(Plaintext[i], Ciphertext[i - 1]);
>                 State = AES256Core(State, ExpandedKey);
>                 Ciphertext[i] = State;
>         }
> }
>
> #endif
$ diff algo/hodl/hodl-wolf.c algo/hodl/hodl-wolf.c.orig
58a59
> #ifdef __AVX__
129a131,196
>
> #else  // no AVX
>
>     uint32_t *pdata = work->data;
>     uint32_t *ptarget = work->target;
>     uint32_t BlockHdr[22], FinalPoW[8];
>     CacheEntry *Garbage = (CacheEntry*)hodl_scratchbuf;
>     CacheEntry Cache;
>     uint32_t CollisionCount = 0;
>
>     swab32_array( BlockHdr, pdata, 20 );
>         // Search for pattern in psuedorandom data
>         int searchNumber = COMPARE_SIZE / opt_n_threads;
>         int startLoc = threadNumber * searchNumber;
>
>         for(int32_t k = startLoc; k < startLoc + searchNumber && !work_restart[threadNumber].restart; k++)
>         {
>            // copy data to first l2 cache
>            memcpy(Cache.dwords, Garbage + k, GARBAGE_SLICE_SIZE);
> #ifndef NO_AES_NI
>            for(int j = 0; j < AES_ITERATIONS; j++)
>            {
>                 CacheEntry TmpXOR;
>                 __m128i ExpKey[16];
>
>                 // use last 4 bytes of first cache as next location
>                 uint32_t nextLocation = Cache.dwords[(GARBAGE_SLICE_SIZE >> 2)
>                                    - 1] & (COMPARE_SIZE - 1); //% COMPARE_SIZE;
>
>                 // Copy data from indicated location to second l2 cache -
>                 memcpy(&TmpXOR, Garbage + nextLocation, GARBAGE_SLICE_SIZE);
>                 //XOR location data into second cache
>                 for( int i = 0; i < (GARBAGE_SLICE_SIZE >> 4); ++i )
>                    TmpXOR.dqwords[i] = _mm_xor_si128( Cache.dqwords[i],
>                                                       TmpXOR.dqwords[i] );
>                 // Key is last 32b of TmpXOR
>                 // IV is last 16b of TmpXOR
>
>                 ExpandAESKey256( ExpKey, TmpXOR.dqwords +
>                                  (GARBAGE_SLICE_SIZE / sizeof(__m128i)) - 2 );
>                 AES256CBC( Cache.dqwords, TmpXOR.dqwords, ExpKey,
>                         TmpXOR.dqwords[ (GARBAGE_SLICE_SIZE / sizeof(__m128i))
>                                                              - 1 ], 256 );
>            }
> #endif
>            // use last X bits as solution
>            if( ( Cache.dwords[ (GARBAGE_SLICE_SIZE >> 2) - 1 ]
>                                          & (COMPARE_SIZE - 1) ) < 1000 )
>            {
>               BlockHdr[20] = k;
>               BlockHdr[21] = Cache.dwords[ (GARBAGE_SLICE_SIZE >> 2) - 2 ];
>               sha256d( (uint8_t *)FinalPoW, (uint8_t *)BlockHdr, 88 );
>               CollisionCount++;
>               if( FinalPoW[7] <= ptarget[7] )
>               {
>                   pdata[20] = swab32( BlockHdr[20] );
>                   pdata[21] = swab32( BlockHdr[21] );
>                   *hashes_done = CollisionCount;
>                   return(1);
>               }
>            }
>         }
>
>     *hashes_done = CollisionCount;
>     return(0);
>
> #endif
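
The non-AVX AES256CBC routine in the diff above follows the usual CBC chaining pattern: each plaintext block is XORed with the previous ciphertext block (the IV for the first block) and the result is encrypted. Below is a minimal sketch of just that chaining, for anyone following along without AES-NI hardware. The names toy_encrypt and toy_cbc_encrypt are made up for this example, and the "cipher" is a placeholder rotate-and-XOR, not AES and not secure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder block "cipher" for illustration only: NOT AES, not secure. */
static uint64_t toy_encrypt(uint64_t block, uint64_t key)
{
    block ^= key;
    return (block << 13) | (block >> 51);   /* rotate left by 13 bits */
}

/*
 * CBC chaining: ct[0] = E(pt[0] ^ iv), ct[i] = E(pt[i] ^ ct[i - 1]).
 * This mirrors the loop structure of the diff, with 64-bit toy blocks
 * standing in for the 128-bit __m128i blocks.
 */
static void toy_cbc_encrypt(uint64_t *ct, const uint64_t *pt,
                            uint64_t key, uint64_t iv, size_t count)
{
    uint64_t prev = iv;

    for (size_t i = 0; i < count; ++i) {
        ct[i] = toy_encrypt(pt[i] ^ prev, key);
        prev = ct[i];
    }
}
```

Because each block depends on the previous ciphertext, the loop cannot be parallelised across blocks, which is why the per-block AES-NI speedup matters so much here.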
6  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN]: cpuminer-opt v3.4.3, NEW faster m7m, Windows binaries on: August 25, 2016, 06:11:59 PM
Yes --- corei7 is definitely running without AES_NI. 

For HODL, you are excluding a whole bunch of AES_NI code that doesn't require AVX to execute.  The only part of Wolf's implementation that requires AVX is the SHA512 function in the initial scratchpad generation routine.  If you take out the AVX checks and "non-AVX" code from the rest of the implementation in algo/hodl/aes.c and algo/hodl/hodl-wolf.c it compiles for westmere and runs just fine with AES-NI enabled.  Running 24 threads with no affinity on my server I'm seeing about 215H/s without AES and close to 375H/s average with the modified version to allow the AES_NI code to run.
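
A rough sketch of what I mean by separating the two checks. GCC and Clang predefine __AES__ and __AVX__ independently, depending on -march (or -maes / -mavx), so the AES-NI path can be gated on AES alone and only the SHA512 scratchpad code needs the AVX test. The HAVE_* macros and the two function names are made up for this example and are not cpuminer-opt's actual identifiers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical feature switches for this sketch. */
#if defined(__AES__)
#  define HAVE_AES_PATH 1   /* AES-NI enabled for this build, e.g. -march=westmere */
#else
#  define HAVE_AES_PATH 0
#endif

#if defined(__AVX__)
#  define HAVE_AVX_PATH 1   /* AVX enabled for this build, e.g. -march=sandybridge */
#else
#  define HAVE_AVX_PATH 0
#endif

/* Which AES implementation this build would select. */
static const char *aes_path(void)
{
    return HAVE_AES_PATH ? "aes-ni" : "software";
}

/* Which scratchpad (SHA512) implementation this build would select. */
static const char *scratchpad_path(void)
{
    return HAVE_AVX_PATH ? "avx" : "generic";
}
```

Built with -march=westmere this should report "aes-ni" and "generic", which is the combination that currently gets dropped to the slow path.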
7  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN]: cpuminer-opt v3.4.3, NEW faster m7m, Windows binaries on: August 25, 2016, 06:02:28 PM
Joblo ---

I flattened the code in algo/hodl/aes.c and algo/hodl/hodl-wolf.c to remove the "non-AVX" code versions for everything but the SHA512 function at the top of hodl-wolf.c, and the code now compiles and runs for -march=westmere.

For cpuminer-corei7.exe from your download, mining HODL to nicehash with 12 threads isolated to the six cores on one CPU, I am getting performance in the 120-130 H/s range.

For cpuminer-westmere.exe that I compiled using the above modifications using the same configuration on the other CPU in my server I am seeing 240-250 H/s and it indicates AES optimizations ARE enabled.

8  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN]: cpuminer-opt v3.4.3, NEW faster m7m, Windows binaries on: August 25, 2016, 05:29:16 PM
Joblo ...

OK, had a chance to play around a bit with the GCC 6.1.0 compiling and I think I found a pretty simple fix to this problem at least.  The min/max macros which are causing collisions in the HODL C++ code are only referenced locally in the decred.c file, but are defined manually in miner.h. So ....

1) Comment out or remove the macro definitions for min and max in miner.h
2) add a local definition of the min macro to decred.c

After that I was able to get it to compile on one of my Haswell systems.  Still having trouble compiling on an older westmere based system due to some AES256CBC complaints.

Bob

Good work. I'll make the change proactively.

The AES256CBC problem may be related to AVX code in hodl-wolf. IIRC either Nehalem or Westmere have AES but not AVX. I may have
to tighten up the checking to force it to use the unoptimized version. Did you do a native compile? Have you tried corei7?
"gcc -Q -march=native --help=target" will tell you which arch is the default for native.

Westmere supports AES-NI but not AVX.  Nehalem doesn't support either.

I've successfully compiled for all the AVX platforms on my laptop - haswell corei5 but can't compile with march=westmere or with native on my dev virtual machine which is running on some older servers (Dual Westmere Hex-core) that I wanted to test on.  setting march=haswell on the older VM works fine and compiles haswell optimized code (which can't run locally obviously).

Appears to be some sort of a conflict in the capabilities check on the HODL AES code.

You're right, I only check for AES_NI, not AVX. This may affect some other algos that also have AVX code mixed in with AES.
If I can identify which ones are pure AES I can make a distinction otherwise I'll have to use non-AES code unless the CPU also
supports AVX.

I don't have the necessary HW to test but if you don't mind doing a little more work it would help a lot. There are
three groups of AES code. There is code used only by hodl, code only used by cryptonight and code shared among many algos
including x11. Those three should cover the entire spectrum of AES optimized code. Those that work on your Westmere can have
AES enabled without AVX. Those like Hodl will require a CPU with AVX before AES can be enabled.

It just occurred to me that you probably did a native compile. Do you know what arch the compiler mapped that to? A Windows
user reported success using the corei7 build on a Nehalem CPU. If yours is different you could try -march=corei7.

This will raise my confidence in the fix since I can't test it on the right HW.

Might have gotten lost in the thread above, but I was able to compile with AVX level features on the Westmere based system; they obviously just don't get detected or work.  The errors only seem to occur when I set the march to westmere or lower.

I've been looking through the code for hodl to try to figure out what seems to be causing the problem. It primarily seems to be from different versions of the AES256CBC routine that it is attempting to compile in simultaneously.

In going through though, I've come across a question for you with regards to your capabilities tests --- you seem to be excluding a lot of the AES_NI optimized code by wrapping it in the AVX segment even though there don't seem to be any AVX instructions in those code segments.  I haven't had a chance to look through it thoroughly, but on a quick scan the only part of wolf's code that utilizes AVX instructions is the SHA512 function used to generate the scratchpad.  The rest of the code should be able to be under #ifndef NO_AES_NI.

Bob
9  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN]: cpuminer-opt v3.4.3, NEW faster m7m, Windows binaries on: August 25, 2016, 04:08:54 PM
@joblo ---

Mining LYRA2RE to NiceHash:


CPU: Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
CPU features: SSE2 AES
SW built on Aug 24 2016 with GCC 4.8.3
SW features: SSE2
Algo features: SSE2 AES AVX AVX2
AES not available, starting mining with SSE2 optimizations...

Mining HODL to NiceHash:

CPU: Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz
CPU features: SSE2 AES
SW built on Aug 24 2016 with GCC 4.8.3
SW features: SSE2
Algo features: SSE2 AES AVX AVX2
AES not available, starting mining with SSE2 optimizations...
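
For reference, the "CPU features" line can be reproduced from CPUID leaf 1, where SSE2 is EDX bit 26, AES-NI is ECX bit 25 and AVX is ECX bit 28. A minimal sketch, assuming GCC or Clang on x86 (it uses __get_cpuid from <cpuid.h>); the struct and function names are made up for this example and are not the miner's own. It is simplified: a full AVX check should also confirm OS support for the YMM state (OSXSAVE plus XGETBV), which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1 feature bits. */
#define CPUID1_EDX_SSE2 (1u << 26)
#define CPUID1_ECX_AES  (1u << 25)
#define CPUID1_ECX_AVX  (1u << 28)

/* Hypothetical capability record for this sketch. */
struct cpu_caps {
    bool sse2;
    bool aes;
    bool avx;
};

/* Decode the ECX/EDX words returned by CPUID leaf 1. */
static struct cpu_caps decode_leaf1(uint32_t ecx, uint32_t edx)
{
    struct cpu_caps caps;

    caps.sse2 = (edx & CPUID1_EDX_SSE2) != 0;
    caps.aes  = (ecx & CPUID1_ECX_AES) != 0;
    caps.avx  = (ecx & CPUID1_ECX_AVX) != 0;
    return caps;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Query the running CPU; returns false if leaf 1 is unavailable. */
static bool query_cpu_caps(struct cpu_caps *out)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    *out = decode_leaf1(ecx, edx);
    return true;
}
#endif
```

On the X5650 the decoded result should be sse2 and aes set with avx clear, matching the "SSE2 AES" line in the output above.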

10  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN]: cpuminer-opt v3.4.3, NEW faster m7m, Windows binaries on: August 25, 2016, 02:14:37 PM
@joblo --- I'd be happy to test the AES w/o AVX configurations --- do you have a patch file or a link to the downloaded files with the patches?

11  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN]: cpuminer-opt v3.4.3, NEW faster m7m, Windows binaries on: August 24, 2016, 09:03:51 PM
Joblo ...

OK, had a chance to play around a bit with the GCC 6.1.0 compiling and I think I found a pretty simple fix to this problem at least.  The min/max macros which are causing collisions in the HODL C++ code are only referenced locally in the decred.c file, but are defined manually in miner.h. So ....

1) Comment out or remove the macro definitions for min and max in miner.h
2) add a local definition of the min macro to decred.c

After that I was able to get it to compile on one of my Haswell systems.  Still having trouble compiling on an older westmere based system due to some AES256CBC complaints.

Bob

Good work. I'll make the change proactively.

The AES256CBC problem may be related to AVX code in hodl-wolf. IIRC either Nehalem or Westmere have AES but not AVX. I may have
to tighten up the checking to force it to use the unoptimized version. Did you do a native compile? Have you tried corei7?
"gcc -Q -march=native --help=target" will tell you which arch is the default for native.

Westmere supports AES-NI but not AVX.  Nehalem doesn't support either.

I've successfully compiled for all the AVX platforms on my laptop - haswell corei5 but can't compile with march=westmere or with native on my dev virtual machine which is running on some older servers (Dual Westmere Hex-core) that I wanted to test on.  setting march=haswell on the older VM works fine and compiles haswell optimized code (which can't run locally obviously).

Appears to be some sort of a conflict in the capabilities check on the HODL AES code.
12  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN]: cpuminer-opt v3.4.3, NEW faster m7m, Windows binaries on: August 24, 2016, 08:14:20 PM
Joblo ...

OK, had a chance to play around a bit with the GCC 6.1.0 compiling and I think I found a pretty simple fix to this problem at least.  The min/max macros which are causing collisions in the HODL C++ code are only referenced locally in the decred.c file, but are defined manually in miner.h. So ....

1) Comment out or remove the macro definitions for min and max in miner.h
2) add a local definition of the min macro to decred.c

After that I was able to get it to compile on one of my Haswell systems.  Still having trouble compiling on an older westmere based system due to some AES256CBC complaints.

Bob
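
A minimal sketch of what a local definition could look like, assuming it lives in the one C file that needs it. Wrapping each argument in parentheses avoids precedence surprises when an argument is itself an expression: with a bare (a>b ? b : a), a call such as min(1 | 2, 4) would compare 2 > 4 first and return the wrong value. clamp_threads is a made-up helper, only there to show the macros in use.

```c
#include <assert.h>

/* Local, fully parenthesised definitions; the guards avoid redefinition warnings. */
#ifndef min
#define min(a, b) ((a) > (b) ? (b) : (a))
#endif
#ifndef max
#define max(a, b) ((a) < (b) ? (b) : (a))
#endif

/* Hypothetical helper: clamp a requested thread count into [1, limit]. */
static int clamp_threads(int requested, int limit)
{
    return max(1, min(requested, limit));
}
```

Like any function-like macro, these still evaluate an argument twice, so arguments with side effects (min(i++, n)) should be avoided.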
13  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN]: cpuminer-opt v3.4.2, NEW AVX2 optimizations, veltor algo on: August 24, 2016, 03:11:57 PM
Joblo ----

Playing around with this again after a while away.  Had to redo my windows build environment and having some compiling issues. 

Using MSYS2 with mingw-w64 that currently installs GCC 6.1.0 and associated tools.  It appears that there are some MIN/MAX Macro issues with building under 6.1.0, specifically on the HODL code:

Code:
gcc -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing  -I. -Iyes/include -Wno-pointer-sign -Wno-pointer-to-int-cast  -Wl,--stack,10485760 -Icompat/pthreads -O3 -march=native -Wall  -Iyes/include -MT algo/cpuminer-hmq1725.o -MD -MP -MF algo/.deps/cpuminer-hmq1725.Tpo -c -o algo/cpuminer-hmq1725.o `test -f 'algo/hmq1725.c' || echo './'`algo/hmq1725.c
g++ -DHAVE_CONFIG_H -I.  -Iyes/include -fno-strict-aliasing  -I. -Iyes/include  -O3 -march=native -Wall -std=gnu++11 -fpermissive -MT algo/hodl/cpuminer-hodl.o -MD -MP -MF algo/hodl/.deps/cpuminer-hodl.Tpo -c -o algo/hodl/cpuminer-hodl.o `test -f 'algo/hodl/hodl.cpp' || echo './'`algo/hodl/hodl.cpp
In file included from C:/msys64/mingw64/include/c++/6.1.0/bits/char_traits.h:39:0,
                 from C:/msys64/mingw64/include/c++/6.1.0/string:40,
                 from algo/hodl/hodl_uint256.h:13,
                 from algo/hodl/hodl.cpp:3:
C:/msys64/mingw64/include/c++/6.1.0/bits/stl_algobase.h:243:56: error: macro "min" passed 3 arguments, but takes just 2
     min(const _Tp& __a, const _Tp& __b, _Compare __comp)
                                                        ^
C:/msys64/mingw64/include/c++/6.1.0/bits/stl_algobase.h:265:56: error: macro "max" passed 3 arguments, but takes just 2
     max(const _Tp& __a, const _Tp& __b, _Compare __comp)
                                                        ^
In file included from C:/msys64/mingw64/include/c++/6.1.0/bits/stl_algo.h:60:0,
                 from C:/msys64/mingw64/include/c++/6.1.0/algorithm:62,
                 from algo/hodl/serialize.h:13,
                 from algo/hodl/block.h:9,
                 from algo/hodl/hodl.cpp:5:
C:/msys64/mingw64/include/c++/6.1.0/bits/algorithmfwd.h:362:41: error: macro "max" passed 3 arguments, but takes just 2
     max(const _Tp&, const _Tp&, _Compare);
                                         ^
C:/msys64/mingw64/include/c++/6.1.0/bits/algorithmfwd.h:375:41: error: macro "min" passed 3 arguments, but takes just 2
     min(const _Tp&, const _Tp&, _Compare);
                                         ^
C:/msys64/mingw64/include/c++/6.1.0/bits/algorithmfwd.h:403:30: error: macro "min" requires 2 arguments, but only 1 given
     min(initializer_list<_Tp>);
                              ^
C:/msys64/mingw64/include/c++/6.1.0/bits/algorithmfwd.h:413:30: error: macro "max" requires 2 arguments, but only 1 given
     max(initializer_list<_Tp>);
                              ^
In file included from C:/msys64/mingw64/include/c++/6.1.0/bits/uniform_int_dist.h:35:0,
                 from C:/msys64/mingw64/include/c++/6.1.0/bits/stl_algo.h:66,
                 from C:/msys64/mingw64/include/c++/6.1.0/algorithm:62,
                 from algo/hodl/serialize.h:13,
                 from algo/hodl/block.h:9,
                 from algo/hodl/hodl.cpp:5:

Any thoughts?
14  Alternate cryptocurrencies / Announcements (Altcoins) / Re: [ANN] Ħ [HODL] Interest on all Balances, no Staking, 4000%+ for Early Hodlers on: February 29, 2016, 06:31:31 PM
I played a bit with the wallet miner.
I could make the miner threads interrupt when a new block is received so they don't waste time on the old hash.
I also made the nonce incremental instead of random and other small enhancements.
I'm testing it and will eventually make a pull request in the next days.

Thread interruption would be an improvement - but take note, I'm expecting the wallet miner to be superseded quite soon by improved cpu miners - so it might not be worth the candle.

I made the nonce random to avoid the possibility of a user calculating the same hashes on different machines mining to the same address.

I've been looking at the source a bit myself as a learning experience ... a thought on the multiple miners ....
Wouldn't it be better to set the ExtraNonce value to a random amount instead of the hash nonce?  I thought that was one of the reasons the ExtraNonce value exists.  That way each miner would have a full nonce field to work with.
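
A rough sketch of the idea, assuming the usual Bitcoin-derived layout where the extra nonce goes into the coinbase and so changes the merkle root: pick the extra nonce randomly once per miner, then sweep the 32-bit header nonce sequentially. The struct and function names are made up for this example and are not taken from the wallet source; the caller supplies the random value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-miner work record for this sketch. */
struct work_unit {
    uint32_t extra_nonce;   /* randomised once per miner */
    uint32_t nonce;         /* swept sequentially from 0 */
};

/* Start a new work unit with a caller-supplied random extra nonce. */
static void work_init(struct work_unit *w, uint32_t random_value)
{
    w->extra_nonce = random_value;
    w->nonce = 0;
}

/*
 * Advance to the next candidate nonce.
 * Returns 1 on success, 0 once the 32-bit nonce space is exhausted
 * (at which point a real miner would pick a new extra nonce).
 */
static int work_next(struct work_unit *w)
{
    if (w->nonce == UINT32_MAX)
        return 0;
    w->nonce++;
    return 1;
}
```

Two machines mining to the same address would then differ in extra_nonce and could each walk the whole nonce range without repeating the other's hashes, barring an extra nonce collision.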
