Bitcoin Forum
  Show Posts
2801  Alternate cryptocurrencies / Pools (Altcoins) / Re: [POOL] NiceHash.com Scrypt/X11/X13/Neo/Lyra2/etc. profit switch Bitcoin payout on: January 28, 2016, 10:30:57 PM
Does this extend to Intel IGPU mining? I've seen some discussion of the topic and that at one time cgminer
supported it but I haven't found anything concrete to build on. If there is somewhere I can start I would like
to pursue this.

We haven't worked on Intel IGPU mining support yet. However, since Intel IGPUs generally support OpenCL, it should be possible to add support for these GPUs to sgminer. Our latest branch of sgminer is available here: https://github.com/nicehash/sgminer. Currently we don't have any plans to add Intel IGPU mining support to sgminer. However, this does sound like an interesting project, and if someone is willing to put in the development effort for this integration, we would definitely donate to it once the integration is complete.


Best regards,
NiceHash team.

Thanks for the quick reply.  I tried another version of sgminer and it didn't recognize the IGPU. Since it has
apparently worked in the past on cgminer I was hoping the development had already been done and it would
simply be an integration effort. I'll keep looking.
2802  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 10:03:28 PM
I was hoping to get a better response to my technical trolls but all I got was more bluster.
I was trying to find out if our skills were complementary. I am a complete noob when it comes
to cuda so I was hoping SP could implement some of my ideas with his knowledge of cuda.
When I provided a demonstration of my skills he responded with "silly you, that was cpu verification
code" and "why don't you do better", without ever considering the technical merit or other
applications for the changes I made.
He's more interested in selling what he has over and over again rather than providing anything new
that sells itself. I'm afraid SP has turned into a telemarketer.
Assembler for NVIDIA Maxwell architecture
https://github.com/NervanaSystems/maxas
Thanks, that will be useful when I learn how to use it. I'm looking for docs that describe the cuda
processor architecture in detail so I can determine things like how many loads to queue up to
fill the pipe, how many execution units there are, user cache management, etc. That kind of information
is necessary to maximize instruction throughput at the processor level. Do you know of any available
docs with this kind of info?

There is not much info available, but if you disassemble compiled code you will see that Maxwell is superscalar with 2 pipes: 2 instructions per cycle. It's able to execute instructions while writing to memory if the code is in the instruction cache. And to avoid ALU stalls you need to reorder your instructions carefully. There are vector instructions that can write bigger chunks of memory with fewer instructions... etc etc. The compiler is usually doing a good job here. Little to gain.. Ask DJM34 for more info. He is good at the random stuff...

Thanks again.

Have you tried interleaving memory accesses with arith instructions so they can be issued on the same clock?
When copying mem, do you issue the first load and the first store immediately after it? The first load fills the cache
line and the first store waits for the first bytes to become available. Then you can queue up enough loads to fill
the pipe and do other things while waiting for mem. Multi-buffering is a given, being careful not to overuse regs.

If you're doing a load, process, and store it's even better because you can have one instruction slot focused on memory
while the other does the processing.

These are things I'd like to try but haven't got the time. Although I've done similar in the past, there were no performance
tests that could quantify the effect, good or bad.

If you think this has merit give it a shot. Like I said if it works just keep it open because I could still implement it myself.
The hotter the code segments you choose the bigger the result should be. Some of the assembly routines would be logical
targets.
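
The load/process/store interleaving described above can be sketched in plain C (not CUDA). This is only an illustration under assumptions: `process()` is a hypothetical stand-in for the real arithmetic, and whether the reordering helps on any particular GPU or CPU is untested here, since the compiler or the hardware may already be doing it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-element work; stands in for the real arithmetic. */
static inline uint32_t process(uint32_t x)
{
    return (x << 3) ^ (x >> 5) ^ 0x9e3779b9u;
}

/* Straightforward version: load, process, store, one element at a time. */
void transform_naive(uint32_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = process(src[i]);
}

/* Software-pipelined version: the load for element i+1 is issued before
 * the arithmetic on element i, so memory latency can overlap ALU work.
 * Two temporaries act as a double buffer held in registers.
 * Assumes dst and src do not overlap. */
void transform_pipelined(uint32_t *dst, const uint32_t *src, size_t n)
{
    if (n == 0)
        return;

    uint32_t cur = src[0];              /* first load fills the cache line */
    for (size_t i = 0; i + 1 < n; i++) {
        uint32_t next = src[i + 1];     /* issue the next load early */
        dst[i] = process(cur);          /* work on data already in a register */
        cur = next;
    }
    dst[n - 1] = process(cur);          /* drain the last element */
}
```

Both functions produce identical output, so the only way to judge the pipelined form is to benchmark it against the naive one on a hot code segment.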
2803  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 08:58:26 PM
I was hoping to get a better response to my technical trolls but all I got was more bluster.
I was trying to find out if our skills were complementary. I am a complete noob when it comes
to cuda so I was hoping SP could implement some of my ideas with his knowledge of cuda.
When I provided a demonstration of my skills he responded with "silly you, that was cpu verification
code" and "why don't you do better", without ever considering the technical merit or other
applications for the changes I made.

He's more interested in selling what he has over and over again rather than providing anything new
that sells itself. I'm afraid SP has turned into a telemarketer.

Assembler for NVIDIA Maxwell architecture

https://github.com/NervanaSystems/maxas

Thanks, that will be useful when I learn how to use it. I'm looking for docs that describe the cuda
processor architecture in detail so I can determine things like how many loads to queue up to
fill the pipe, how many execution units there are, user cache management, etc. That kind of information
is necessary to maximize instruction throughput at the processor level. Do you know of any available
docs with this kind of info?
2804  Alternate cryptocurrencies / Pools (Altcoins) / Re: [POOL] NiceHash.com Scrypt/X11/X13/Neo/Lyra2/etc. profit switch Bitcoin payout on: January 28, 2016, 06:37:11 PM
Hello nicehash,

I have noticed you have taken an interest in mining software development including ASIC, AMD, Nvidia, and CPU
mining.

Does this extend to Intel IGPU mining? I've seen some discussion of the topic and that at one time cgminer
supported it but I haven't found anything concrete to build on. If there is somewhere I can start I would like
to pursue this.

I would also like to plug my recent fork of TPruvot's cpuminer-multi. It doesn't compile on Windows yet
but I'm working on it.

https://bitcointalk.org/index.php?topic=1326803.0

2805  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 06:20:16 PM
I was hoping to get a better response to my technical trolls but all I got was more bluster.
I was trying to find out if our skills were complementary. I am a complete noob when it comes
to cuda so I was hoping SP could implement some of my ideas with his knowledge of cuda.
When I provided a demonstration of my skills he responded with "silly you, that was cpu verification
code" and "why don't you do better", without ever considering the technical merit or other
applications for the changes I made.

He's more interested in selling what he has over and over again rather than providing anything new
that sells itself. I'm afraid SP has turned into a telemarketer.
2806  Alternate cryptocurrencies / Mining (Altcoins) / Re: Only GTX 970 miners. We will help each other on: January 28, 2016, 06:09:10 PM
Currently many of us are mining with the 970, with single or multiple cards. This section will be used as an information index about the GTX 970. We will discuss new coins, new algos, new mining software and profitability.
As a new miner, it is too hard to find the correct coin to mine. Hope this thread will be a nice place for new miners. Please mention the following:
1. What coin are you currently mining?
2. What is your miner?
3. What is the hash you are getting?

There are experienced miners and devs here. Hope they will come to help us too.


I don't think this needs to be specific to the 970. Why not broaden the scope? First, which Nvidia cards perform
better on which algos. For example the 750ti is an outstanding performer when mining lyra2v2. How about
AMD cards, CPU mining, various mining software, free and $$$?

To answer your questions I have a pair of EVGA 970s and I mine mostly with ccminer-1.5.74-SP_mod.
But there are other variations with their own benefits.

Also come join us in the miner SW threads. There's lots of discussion about squeezing more hash.

I will shamelessly plug my new fork of cpuminer called cpuminer-opt, meaning optimized.
It's the fastest cpu miner I am aware of and supports the most algos.

https://bitcointalk.org/index.php?topic=1326803.0
2807  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 04:50:42 PM
joblo, does your quark optimisation work at the end? not sure I understand your conversation with sp_ fully: where does the +30% come from?

Joblo's optimization impacts CPU validation of any found shares. This is usually insignificant, but since he's also mining with all CPU cores, it did have an impact for him. It was that his CPU mining was slowing down ccminer.

Joblo: You're invited for a beer over at #ccminer @freenode: there's friendlier dev talk there, some collaboration now and then, and certainly a lot less BS ;)

Thanks for the invite. I plan to join #ccminer (and github, and...) when things settle down, which they are beginning to do.
I've been so busy trying to get all the algos supported and delivering the quick optimizations that I'm only now starting to think
longer term.

I'm working on a design to modularize algos so that adding a new algo doesn't require any base code changes.
But that's a big feature that requires a lot of thought. I have high standards and don't want to present a half-baked plan.
2808  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 04:40:30 PM
joblo, does your quark optimisation work at the end? not sure I understand your conversation with sp_ fully: where does the +30% come from?

Mostly from more efficient management of the groestl ctx. Quark can run groestl twice per round, and it was
running the init function twice every time the hash function was called, and the hash function is called in a scanhash loop.
That's 2x the number of hash calls for something that only needs to be done once. That was a big boost though I
don't recall exactly how much. The reduction in the number of inits also helped other algos like x11.

I also created a fast reinit function that skipped the constants.  So now a full init is done once when scanhash
is called and any subsequent reinits that are necessary are fast. That alone added another 5%.

I have another idea to factor out the full init from scanhash so the ctx will be fully initted only once, ever, before
entering the thread loop.
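
As a rough sketch of the "full init once, fast reinit afterwards" idea: the context type, field names and functions below are all hypothetical and do not reflect the real groestl context or the cpuminer code. The only point is that the constants are built once per scan while the small mutable state is reset per hash.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_TABLE_WORDS 256
#define TOY_STATE_WORDS 16

/* Hypothetical hash context: a large constant table plus a small
 * per-hash state. This is NOT the real sph_groestl context layout. */
typedef struct {
    uint32_t table[TOY_TABLE_WORDS];  /* constants, identical for every hash */
    uint32_t state[TOY_STATE_WORDS];  /* mutable, must be reset per hash */
    uint64_t bytes_hashed;
} toy_ctx;

/* Full init: builds the constants and resets the state. */
void toy_init(toy_ctx *c)
{
    for (uint32_t i = 0; i < TOY_TABLE_WORDS; i++)
        c->table[i] = i * 0x01000193u + 0x811c9dc5u;
    memset(c->state, 0, sizeof c->state);
    c->bytes_hashed = 0;
}

/* Fast reinit: skips the constants and resets only the mutable fields. */
void toy_reinit(toy_ctx *c)
{
    memset(c->state, 0, sizeof c->state);
    c->bytes_hashed = 0;
}

/* Toy absorb step so the example has some state to reset. */
void toy_update(toy_ctx *c, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++)
        c->state[i % TOY_STATE_WORDS] ^= c->table[data[i]];
    c->bytes_hashed += len;
}

/* Scan loop shape: one full init up front, one fast reinit per nonce. */
uint32_t toy_scanhash(toy_ctx *c, uint32_t first_nonce, uint32_t count)
{
    uint32_t acc = 0;
    toy_init(c);
    for (uint32_t i = 0; i < count; i++) {
        uint32_t n = first_nonce + i;
        uint8_t buf[4] = { (uint8_t)n, (uint8_t)(n >> 8),
                           (uint8_t)(n >> 16), (uint8_t)(n >> 24) };
        toy_reinit(c);
        toy_update(c, buf, sizeof buf);
        acc ^= c->state[0];
    }
    return acc;
}
```

Moving `toy_init()` out of `toy_scanhash()` and calling it once per thread would correspond to the further step of initializing the ctx only once, ever.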
2809  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN] New Improved altcoin CPU miner with support for AES-NI on: January 28, 2016, 04:10:37 PM
Download 3.0.7 now. More stable than 3.0.6.

https://drive.google.com/file/d/0B0lVSGQYLJIZWjJpXzAtemRaUWc/view?usp=sharing
2810  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN] New Improved altcoin CPU miner with support for AES-NI on: January 28, 2016, 03:19:10 PM

Got it working somewhat, it just hasn't shown any hashrate yet.


Was the CPU working or just idling? Did you try other pools or algos?

I've streamlined the check in v3.0.7. The check for SSE2 wasn't working and
with the plan to drop the separate generic x86_64 target the SSE check isn't
needed anymore.

Starting in 3.0.7 it will display separately whether the CPU and the build
support AES_NI and select the appropriate target. The startup display
will also be directly linked to the target selection; previously there were
two separate checks.
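
For illustration only, here is one way a combined CPU/build check could look with GCC or Clang on x86. This is not the actual cpuminer-opt code: the target names and function names are hypothetical, and it assumes the compiler defines `__AES__` when AES-NI code generation is enabled and provides `__builtin_cpu_supports`.

```c
#include <stdbool.h>
#include <stdio.h>

/* Compile time: was this binary built with AES-NI enabled (e.g. -maes)? */
static bool build_has_aes_ni(void)
{
#ifdef __AES__
    return true;
#else
    return false;
#endif
}

/* Run time: does the CPU we are running on report AES-NI? */
static bool cpu_has_aes_ni(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("aes") != 0;
#else
    return false;   /* unknown platform: assume no AES-NI */
#endif
}

/* Hypothetical target names. */
typedef enum { TARGET_FALLBACK, TARGET_AES_NI } target_t;

/* AES-NI code can only run if both the CPU and the build support it. */
static target_t select_target(bool cpu_aes, bool build_aes)
{
    return (cpu_aes && build_aes) ? TARGET_AES_NI : TARGET_FALLBACK;
}

/* A single decision drives both the startup display and the selection,
 * so the two can never disagree. */
static target_t report_and_select(void)
{
    bool cpu = cpu_has_aes_ni();
    bool build = build_has_aes_ni();
    target_t t = select_target(cpu, build);
    printf("CPU AES_NI: %s, build AES_NI: %s, target: %s\n",
           cpu ? "yes" : "no", build ? "yes" : "no",
           t == TARGET_AES_NI ? "AES_NI" : "fallback");
    return t;
}
```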
2811  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 07:49:34 AM
While you were trolling my thread I added another 0.4% in the decred algo.
I will try to do 5% and include it in my donation miner.

Since I forked cpuminer I've increased performance up to 92 % (x13), 75% (x15), 36% (qubit)
and 27% (quark). I can't take credit for all of it because it was just plugging in faster
functions that already existed. But all the gains in quark are mine.
2812  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 07:17:41 AM
this is in the cpu verification code. The gpu code is different. There we have precalc tables of the states to avoid conditional branches.
Whatever it is it's faster.
The cpu verification is only done when the gpu finds a solution.
I know why changing the verification code made things faster. I was cpu mining 8
threads at the time so it was slowing down the CPU.

But in ccminer you can just remove the verification. It's there so that you can check if you break the hash when you change something.

Tried that in cpuminer, didn't help. I only managed to get another 1% out of c11, not sure why, expected more,
will take another look.

No other algos benefit from the fast ctx reinit but you should try it in ccminer, the GPU kernel,  that is.
2813  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 07:14:28 AM
You are an assembly language guy; do you reorder instructions to maximize instruction throughput?
It requires detailed knowledge of the processor such as how many instructions can be fetched per clock,
how many can be executed per clock, how deep is the memory buffer, does it delay writes to prioritize
reads, how big is a cache line, etc. I know none of this stuff, maybe you do and could use it to speed
up the hot spots.

This is something the compiler is very good at. The cudacore is a 3 + operation risc processor with up to 256 registers.
It is built for the compiler..

Sometimes you need to move code around, manually unroll some loops etc.. Verify the result with disassembling. (this is what DJM34 is calling random stuff)

But don't let the codesize grow too big, the instruction cache is small.
...

While you were trolling my thread I added another 0.4% in the decred algo.
I will try to do 5% and include it in my donation miner.

I was talking more about performing loads as soon as possible to give time for mem to respond before
you need the data. It also fills the cache line for subsequent loads. If cuda supports read priority you
can even issue a store before a load and the load will have priority. You just have to watch for register
conflicts.

There is also issuing different types of instructions on the same clock to improve superscalar
operation.

These kinds of things are hard for a normal compiler to do because they are specific to each processor,
but if anyone can do it it's cuda because they have one HW architecture, one run time system and
one compiler.

And another thing, you trolled me first. :)
2814  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 03:42:51 AM
You are an assembly language guy; do you reorder instructions to maximize instruction throughput?
It requires detailed knowledge of the processor such as how many instructions can be fetched per clock,
how many can be executed per clock, how deep is the memory buffer, does it delay writes to prioritize
reads, how big is a cache line, etc. I know none of this stuff, maybe you do and could use it to speed
up the hot spots.
2815  Alternate cryptocurrencies / Mining (Altcoins) / Re: [ANN] New Improved altcoin CPU miner with support for AES-NI on: January 28, 2016, 01:16:50 AM
Progress update.

I found 4% more hash in quark and I've tested some of the more obscure algos so another
3.0 update is coming before 3.1. I'll take another day to look for more low hanging fruit
and do a full suite of testing before releasing. I want this to be super stable.

Then I will start on windows, I promise.

V 3.0.7 almost ready.

Edit

I was checking some stats while testing and here is how much has been gained since
the project forked.

quark  + 27%
qubit  + 36%
x13    + 92%
x15    + 76%

It's come a long way.
2816  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 01:12:09 AM
this is in the cpu verification code. The gpu code is different. There we have precalc tables of the states to avoid conditional branches.
Whatever it is it's faster.

The cpu verification is only done when the gpu finds a solution.

I know why changing the verification code made things faster. I was cpu mining 8
threads at the time so it was slowing down the CPU.
2817  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 01:00:23 AM
this is in the cpu verification code. The gpu code is different. There we have precalc tables of the states to avoid conditional branches.
Whatever it is it's faster.

The cpu verification is only done when the gpu finds a solution.

I may not have realized I was looking at verification code at the time but I know what it is.
Maybe my changes can be applied to the GPU code and you'll get your 30%
2818  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 12:58:24 AM
this is in the cpu verification code. The gpu code is different. There we have precalc tables of the states to avoid conditional branches.

My changes have nothing to do with avoiding branches but avoiding work.
2819  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 12:54:04 AM
this is in the cpu verification code. The gpu code is different. There we have precalc tables of the states to avoid conditional branches.

Whatever it is it's faster.
2820  Alternate cryptocurrencies / Mining (Altcoins) / Re: CCminer(SP-MOD) Modded NVIDIA Maxwell kernels. on: January 28, 2016, 12:35:19 AM
when is the last time you delivered 30% in less than an hour?

Today. your quark kernel.

Since skein is much faster than groestl we only do skein and throw away 50% of the hashes.

    if (hash[0] & 0x8)
    {
        sph_groestl512_init(&ctx_groestl);
        sph_groestl512 (&ctx_groestl, (const void*) hash, 64);
        sph_groestl512_close(&ctx_groestl, (void*) hash);
    }
    else
    {
        sph_skein512_init(&ctx_skein);
        sph_skein512 (&ctx_skein, (const void*) hash, 64);
        sph_skein512_close(&ctx_skein, (void*) hash);
    }


There was an optimization made in cpuminer that  if it was determined that a second
round of groestl was necessary the existing hashes would be thrown away on the belief
it would take longer to complete the second groestl than to start over. It didn't work.

However, I might try ccminer's logic. cpuminer uses a state machine as
the engine. ccminer just uses a simple if.

I'm also going to look at other contexts. Selectively reinitializing the necessary fields may be
quicker than the current implementation of copying a saved initialized context.
Both are quicker than what ccminer does.
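
To make the two alternatives concrete, here is a small sketch comparing "copy a saved initialized context" with "reset only the fields that hashing modifies". The `demo_ctx` type and all function names are hypothetical and are not the sph_* API; which option is faster for a real context would have to be measured.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical context; NOT the real sph_* layout. */
typedef struct {
    uint32_t consts[64];   /* set up by init, never modified by hashing */
    uint32_t h[8];         /* chaining value, modified by hashing */
    uint32_t count;        /* bytes absorbed, modified by hashing */
} demo_ctx;

/* Full init, as would be repeated before every hash in the slow case. */
void demo_init(demo_ctx *c)
{
    for (uint32_t i = 0; i < 64; i++)
        c->consts[i] = 0x6a09e667u ^ (i * 0x9e3779b9u);
    for (uint32_t i = 0; i < 8; i++)
        c->h[i] = c->consts[i];
    c->count = 0;
}

/* Option 1: restore by copying a saved, freshly initialized context. */
void demo_restore_copy(demo_ctx *c, const demo_ctx *saved)
{
    memcpy(c, saved, sizeof *c);
}

/* Option 2: restore only the fields that hashing modifies.
 * Requires that c was fully initialized at least once before. */
void demo_restore_fields(demo_ctx *c, const demo_ctx *saved)
{
    memcpy(c->h, saved->h, sizeof c->h);
    c->count = 0;
}

/* Toy absorb step that dirties the mutable fields. */
void demo_update(demo_ctx *c, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++)
        c->h[i % 8] = (c->h[i % 8] ^ data[i]) * 0x01000193u
                      + c->consts[i % 64];
    c->count += (uint32_t)len;
}
```

Option 2 touches 36 bytes here instead of the whole struct, which is the kind of saving the selective approach is after; the gain depends entirely on how much of the real context is constant.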