Bitcoin Forum
December 13, 2017, 04:27:50 AM *
Author Topic: Gateless Gate Sharp 1.1.4: zawawa's open-source dual ETH/XMR/PASC/LBC miner  (Read 163549 times)
nerdralph
Sr. Member
****
Offline Offline

Activity: 406


View Profile
January 17, 2017, 09:39:59 PM
 #321

Let me guess, you're looking at GPU-Z and think it gives you a reliable measure of GDDR memory bandwidth?  If you want to pretend to be a miner developer, you should at least try to use the right tools.
http://gpuopen.com/compute-product/codexl/

I am not a developer and do not pretend to be one, but I do not need advanced tools to see that my GPU is not using all of its memory bandwidth; I can compare the ETH and ZEC MCU (memory controller load) figures. Just because you, as a developer, cannot prefetch into a table in cache for later use does not mean that others cannot do it. It is clear that Zcash is not memory bound.

So tell me, wise one, how can any developer get >40MB of data to fit into the L2 cache on something like a Rx 480?  Do you even know how big the cache is?

ecohash
Jr. Member
*
Offline Offline

Activity: 34


View Profile
January 17, 2017, 09:43:57 PM
 #322

Each RX 480 CU hosts four texture units, 16KB of L1 cache, a 64KB local data share, and register space for the vector and scalar units. AMD says it made a number of tweaks to improve the CU’s efficiency, including the addition of native FP16 (and Int16) support, tuned cache access and better instruction prefetching. Altogether, the changes purportedly yield up to 15% more performance per CU than the Radeon R9 290’s Hawaii GPU, which is based on a second-gen GCN architecture.
reb0rn21
Legendary
*
Offline Offline

Activity: 1246


View Profile
January 17, 2017, 10:23:01 PM
 #323

As far as I know, both the RX 4xx and the GTX 1070 have "just" 2MB of L2 cache.
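
To put the sizes quoted in the last few posts side by side, here is a rough back-of-the-envelope sketch. It assumes 36 compute units for an RX 480, uses the per-CU figures quoted in #322 and the 2MB L2 figure from this post, and takes the ">40MB" working set as exactly 40MB (a lower bound). The names are illustrative only and are not taken from any miner source.

```python
KB = 1024
MB = 1024 * KB

CU_COUNT = 36           # assumption: number of compute units on an RX 480
L1_PER_CU = 16 * KB     # per-CU L1 cache, as quoted in #322
LDS_PER_CU = 64 * KB    # per-CU local data share, as quoted in #322
L2_TOTAL = 2 * MB       # L2 size mentioned in this post
WORKING_SET = 40 * MB   # ">40MB" from #321, taken as a lower bound


def on_chip_summary():
    """Return total on-chip storage figures and the working-set-to-L2 ratio."""
    return {
        "l1_total_kb": CU_COUNT * L1_PER_CU // KB,
        "lds_total_kb": CU_COUNT * LDS_PER_CU // KB,
        "l2_total_kb": L2_TOTAL // KB,
        "working_set_over_l2": WORKING_SET / L2_TOTAL,
    }


if __name__ == "__main__":
    for name, value in on_chip_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the working set is at least 20 times the size of the L2, so only a small fraction of it can be resident on chip at any one time.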

th00ber
Hero Member
*****
Offline Offline

Activity: 624


View Profile
January 18, 2017, 12:21:18 AM
 #324

vcruntime140.dll is reported missing on both Win7 and Win10.
I tried reinstalling the VC Redist and downloading the missing lib.

But it's still not working... any tips on how to run this on Windows?
Jdope
Sr. Member
****
Offline Offline

Activity: 462


View Profile
January 18, 2017, 12:47:43 AM
 #325

This might not be the best place to ask, but what skills are used in making mining software like this? What are the core subjects one needs a good grasp of to have that low-level (I assume) knowledge?


zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 18, 2017, 01:16:23 AM
 #326

vcruntime140.dll is reported missing on both Win7 and Win10.
I tried reinstalling the VC Redist and downloading the missing lib.

But it's still not working... any tips on how to run this on Windows?

That's pretty weird... Are you using a 32-bit version of Windows by any chance?

Gateless Gate Sharp, an open-source ETH/XMR miner: http://bit.ly/2rJ2x4V
BTC: 1BHwDWVerUTiKxhHPf2ubqKKiBMiKQGomZ
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 18, 2017, 01:28:36 AM
 #327

This might not be the best place to ask, but what skills are used in making mining software like this? What are the core subjects one needs a good grasp of to have that low-level (I assume) knowledge?

I am pretty much self-taught as far as programming is concerned, so my approach to it is fairly idiosyncratic. I was originally interested in the internal workings of operating systems, device drivers, and compilers, and that background has definitely helped me so far. Now if only I could get this assembly version right...

cryptominer420
Full Member
***
Offline Offline

Activity: 183


View Profile
January 18, 2017, 03:04:17 AM
 #328

Well guys, I'm using Version 1.1 on suprnova with my 5 BIOS-modded PowerColor RX 470s clocked at 1250MHz core and 1720 RAM. If I set a static difficulty of 1500, my reported average hashrate on the pool side is 1152h/s over 1 hr, so that breaks down to an effective hashrate of 230.4h/s per card.
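
The per-card figure above is just the pool-side average divided evenly across the cards. A minimal sketch of that arithmetic follows; the function name is made up for illustration, and it assumes all five cards contribute equally, which the pool-side number alone cannot confirm.

```python
def per_card_hashrate(pool_hashrate, cards):
    """Split a pool-reported average hashrate evenly across identical cards."""
    if cards <= 0:
        raise ValueError("cards must be a positive integer")
    return pool_hashrate / cards


if __name__ == "__main__":
    # Figures from the post: 1152 h/s averaged over 1 hour, 5 cards.
    print(per_card_hashrate(1152, 5))
```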

BTC: 1Eeb9SoBeY7AQjjFn7YMJZMY7Jtw5gxxHs  ETH: 0x68e4EA3b7e60C8D6fC9BA92775ccE27Ca542D114
zawawa
Sr. Member
****
Offline Offline

Activity: 420


Miner Developer


View Profile
January 18, 2017, 06:14:03 AM
 #329

I was playing with the cryptonight kernel for a change and was able to get it to work on a GTX 1060 with "--gpu-threads 1". I also dug out a NeoScrypt kernel I optimized a while back, which runs at 780kh/s on an RX 480. I will include them in the next version.

toptek
Legendary
*
Offline Offline

Activity: 1120


View Profile
January 18, 2017, 06:16:58 AM
 #330

Well guys, I'm using Version 1.1 on suprnova with my 5 BIOS-modded PowerColor RX 470s clocked at 1250MHz core and 1720 RAM. If I set a static difficulty of 1500, my reported average hashrate on the pool side is 1152h/s over 1 hr, so that breaks down to an effective hashrate of 230.4h/s per card.


So I need to mod my 470 to get 230, like CM with the fee off at stock settings? No complaints; if I have to, I have to... I'd rather use Gateless Gate.



Would you be willing to post your exact settings?

Above all, before someone misjudges me: I'm not complaining, and I'm not against paying fees. That said, I know it takes a little longer and zawawa is doing his best to catch up, but if someone is actually getting such high speeds, please share your exact settings with us. This is not a demand; only if you're willing and don't mind. Otherwise, it's cool if not.
chronek
Sr. Member
****
Offline Offline

Activity: 262


View Profile
January 18, 2017, 07:41:19 AM
 #331

So tell me, wise one, how can any developer get >40MB of data to fit into the L2 cache on something like a Rx 480?  Do you even know how big the cache is?

Who said anything about getting all the data at once? Just do it asynchronously: while the table is being filled, it is being used at the same time, like a buffer. Right now memory is accessed only when the cores need it, but they do not need it all the time, which is why the MCU is not fully used. A buffer table would be filled continuously by a separate process, even in the gaps when the cores do not need memory, so more data would be ready to process later when the cores want it. It could be a table of only a few KB, but the benefits would be faster access and less waiting. It would require redesigning the whole work process, though: every calculation would need to push data to one table and get its result from a second one, and each unit would do only its own job, and only when data is in the table. It would not be simple.

Sorry, my English is not good and I cannot express everything I want to.
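
For readers trying to picture the "small table used like a buffer" idea, here is a minimal host-side sketch using a bounded queue between a loader thread and a worker thread. It is plain Python, not GPU code; the function name, the buffer size, and the squaring step standing in for "computation" are all assumptions made for illustration. It shows the buffering pattern only and says nothing about whether a miner kernel could be restructured this way.

```python
import queue
import threading


def run_pipeline(items, buffer_slots=4):
    """Feed items through a small bounded buffer from a loader to a worker."""
    buf = queue.Queue(maxsize=buffer_slots)  # the small "table"
    done = object()                          # sentinel marking end of input
    results = []

    def loader():
        # Stands in for the memory-fetch side: fills the buffer ahead of use.
        for item in items:
            buf.put(item)    # blocks while the buffer is full
        buf.put(done)

    def worker():
        # Stands in for the compute side: consumes whatever is ready.
        while True:
            item = buf.get()  # blocks while the buffer is empty
            if item is done:
                break
            results.append(item * item)  # placeholder computation

    threads = [threading.Thread(target=loader), threading.Thread(target=worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(range(10)))
```

Note that a buffer like this only smooths out gaps between the two sides. It cannot make the slower side any faster, so if the fetch side is the bottleneck, the worker still ends up waiting on it.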
nerdralph
Sr. Member
****
Offline Offline

Activity: 406


View Profile
January 18, 2017, 11:26:34 AM
 #332

So tell me, wise one, how can any developer get >40MB of data to fit into the L2 cache on something like a Rx 480?  Do you even know how big the cache is?

Who said anything about getting all the data at once? Just do it asynchronously: while the table is being filled, it is being used at the same time, like a buffer. Right now memory is accessed only when the cores need it, but they do not need it all the time, which is why the MCU is not fully used. A buffer table would be filled continuously by a separate process, even in the gaps when the cores do not need memory, so more data would be ready to process later when the cores want it. It could be a table of only a few KB, but the benefits would be faster access and less waiting. It would require redesigning the whole work process, though: every calculation would need to push data to one table and get its result from a second one, and each unit would do only its own job, and only when data is in the table. It would not be simple.

Sorry, my English is not good and I cannot express everything I want to.

No problem, you've clearly expressed that you are talking out your arse.
chronek
Sr. Member
****
Offline Offline

Activity: 262


View Profile
January 18, 2017, 11:42:12 AM
 #333

No problem, you've clearly expressed that you are talking out your arse.

And you have expressed that you cannot think creatively and that you prefer to reject any new ideas.
OhGodAGirl
Full Member
***
Offline Offline

Activity: 149

Look, I'm really not that interesting. Promise.


View Profile WWW
January 18, 2017, 11:44:39 AM
 #334

No problem, you've clearly expressed that you are talking out your arse.

And you have expressed that you cannot think creatively and that you prefer to reject any new ideas.

There's nothing to do with creative thinking here - there's a limit to how much data can fit on the cache. That's it. You can't add more. You're not being creative, you're being illogical.

1P1C58d4CUiEokjoAfWiZVTogZFAeAfawh
chronek
Sr. Member
****
Offline Offline

Activity: 262


View Profile
January 18, 2017, 12:07:48 PM
 #335

There's nothing to do with creative thinking here - there's a limit to how much data can fit on the cache. That's it. You can't add more. You're not being creative, you're being illogical.

You didn't read it, and neither did he: a 4KB table can fit in cache.
laik2
Sr. Member
****
Offline Offline

Activity: 392


View Profile
January 18, 2017, 03:40:11 PM
 #336

There's nothing to do with creative thinking here - there's a limit to how much data can fit on the cache. That's it. You can't add more. You're not being creative, you're being illogical.

You didn't read it, and neither did he: a 4KB table can fit in cache.

I may not understand much about OpenCL or graphics at all, but as a network engineer I still think that hardware limits cannot be exceeded for the purpose mentioned above.
There is no queue mechanism or the like that can be used to queue up the solution rate, AFAIK. The protocol specifics don't allow workarounds.

ZEC: t1KbbHtXqzSS6qHBaPZDKyWnzxhRjr9oCtW
th00ber
Hero Member
*****
Offline Offline

Activity: 624


View Profile
January 18, 2017, 03:43:24 PM
 #337

vcruntime140.dll is reported missing on both Win7 and Win10.
I tried reinstalling the VC Redist and downloading the missing lib.

But it's still not working... any tips on how to run this on Windows?

That's pretty weird... Are you using a 32-bit version of Windows by any chance?
64-bit on both... Do you have a release with the full DLL dependencies?
joaocha
Full Member
***
Offline Offline

Activity: 222


View Profile
January 18, 2017, 03:46:22 PM
 #338

vcruntime140.dll is reported missing on both Win7 and Win10.
I tried reinstalling the VC Redist and downloading the missing lib.

But it's still not working... any tips on how to run this on Windows?

That's pretty weird... Are you using a 32-bit version of Windows by any chance?
64-bit on both... Do you have a release with the full DLL dependencies?

https://www.microsoft.com/en-us/download/confirmation.aspx?id=48145
OhGodAGirl
Full Member
***
Offline Offline

Activity: 149

Look, I'm really not that interesting. Promise.


View Profile WWW
January 18, 2017, 03:53:13 PM
 #339

There's nothing to do with creative thinking here - there's a limit to how much data can fit on the cache. That's it. You can't add more. You're not being creative, you're being illogical.

You didn't read it, and neither did he: a 4KB table can fit in cache.

I may not understand much about OpenCL or graphics at all, but as a network engineer I still think that hardware limits cannot be exceeded for the purpose mentioned above.
There is no queue mechanism or the like that can be used to queue up the solution rate, AFAIK. The protocol specifics don't allow workarounds.

You're correct.

chronek
Sr. Member
****
Offline Offline

Activity: 262


View Profile
January 18, 2017, 04:18:25 PM
 #340

I may not understand much about OpenCL or graphics at all, but as a network engineer I still think that hardware limits cannot be exceeded for the purpose mentioned above.
There is no queue mechanism or the like that can be used to queue up the solution rate, AFAIK. The protocol specifics don't allow workarounds.

Yes, we cannot exceed hardware limits, but right now the miner uses 63% of the MCU and 80% of power, so it does not utilize the full hardware capacity. Why? I suspect that the miner's computation has two phases: one where it fetches from memory and the cores wait (are blocked), and a second where the cores compute and memory is not used.

Put simply:
Now the threads do: [[external memory read to registers][comp][external memory write result]][[external memory read to registers][comp][external memory write result]]
So for one part the cores are waiting, and for the other part memory is not used...

What I want (all at the same time):
thread1: [external memory read to cache][external memory read to cache][external memory read to cache]
thread2: [[read cache][comp][write cache]][[read cache][comp][write cache]][[read cache][comp][write cache]]
thread3: [external memory write result][external memory write result][external memory write result]

Yes, it has flaws in its logic, but why can't I discuss it?
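
The difference between the two schedules sketched above can be estimated with a toy timing model. The sketch below assumes every work item takes the same fixed time in each of the three stages (read, compute, write), that the stages can overlap perfectly, and that there is always enough buffer space between them. The function names and the example stage times are made up for illustration, and real GPU scheduling is far more complicated than this.

```python
def serial_time(n, load, compute, store):
    """Total time when each item is read, computed, and written back-to-back."""
    return n * (load + compute + store)


def pipelined_time(n, load, compute, store):
    """Total time for an ideal three-stage pipeline with fixed stage times.

    The first item pays the full latency; after that, one item completes
    every max(load, compute, store) time units.
    """
    if n == 0:
        return 0
    return (load + compute + store) + (n - 1) * max(load, compute, store)


if __name__ == "__main__":
    n, load, compute, store = 1000, 3, 2, 1  # hypothetical time units
    print("serial:   ", serial_time(n, load, compute, store))
    print("pipelined:", pipelined_time(n, load, compute, store))
```

Even in this idealized model, throughput is capped by the slowest stage. If the external memory read is that stage, overlapping hides the compute time but the workload is still bounded by memory, which is the point the other posters are making.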