sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
|
September 13, 2019, 07:15:06 AM Last edit: September 13, 2019, 07:45:47 AM by sp_ |
|
Yeah, no PTX, that's what I was saying. ==> RandomX
So to make a fast RandomX miner on NVIDIA, you can convert the RandomX code to PTX before execution (create a new PTX kernel for each block). Without optimizations, the NVIDIA cards are losing to the CPU. RandomX benchmarks: https://bitcointalk.org/index.php?topic=5176747.0
GPU | Cryptonight-R | RandomX
AMD
Vega 64 | 2200 H/s | 1225 H/s
RX 480/580 | 960-1000 H/s | 400-410 H/s
RX 560 4GB (1400/2200 MHz) | 495 H/s | 260 H/s
NVIDIA/EVGA
RTX 2080 Ti (1915/13600 MHz) | 960-1000 H/s | 400-410 H/s
GTX 1080 Ti (2037/11800 MHz) | 927 H/s | 1122 H/s
GTX 1070 Ti (1900/7600 MHz) | 625 H/s | 769 H/s
For CPUs:
CPU | Cryptonight-R | RandomX
AMD 3900X (4.25 GHz all-core, 3600 MHz RAM) | 1335 H/s | 13330 H/s
Ryzen 3700X | 1018 H/s | 6853 H/s
Ryzen 5 3600 | 803 H/s | 6580 H/s
Intel i9 9900K | 630 H/s | 2102 H/s
2x Xeon E5 2670 v2 | 930 H/s | 5815 H/s
Intel i7 7700K | 350 H/s | 2100 H/s
|
|
|
|
joblo
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
September 13, 2019, 04:33:06 PM |
|
The point with PTX is that it's a unified language for all NVIDIA GPU architectures.
The point is that it's only Nvidia GPU architectures. No ASIC, no FPGA, no Radeon, no CPU.
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
|
September 13, 2019, 08:24:57 PM |
|
The point with PTX is that it's a unified language for all NVIDIA GPU architectures.
The point is that it's only Nvidia GPU architectures. No ASIC, no FPGA, no Radeon, no CPU.
Doesn't need to be PTX. If you run on NVIDIA hardware, you convert the random stream of instructions to PTX. RandomX could be very profitable on NVIDIA hardware with a proper implementation...
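A minimal sketch of that conversion step, written in Python for readability: a toy generator derives a pseudo-random stream of register operations from a seed and translates it into PTX text. The three opcodes, the register count, and the names `random_program` and `to_ptx` are assumptions made for illustration; this is not the real RandomX instruction set or any miner's code. The emitted mnemonics (`add.u64`, `mul.lo.u64`, `xor.b64`) are ordinary PTX integer instructions, and the output is only a kernel body fragment (no `.entry`, loads, or stores).

```python
import random

# Hypothetical toy opcodes mapped to real PTX integer mnemonics.
# This is NOT the RandomX instruction set, only an illustration.
PTX_OPS = {
    "ADD": "add.u64",
    "MUL": "mul.lo.u64",
    "XOR": "xor.b64",
}
NUM_REGS = 8  # assumed register file size for the toy VM


def random_program(seed, length):
    """Derive a deterministic pseudo-random instruction stream from a seed."""
    rng = random.Random(seed)
    ops = sorted(PTX_OPS)
    return [
        (rng.choice(ops), rng.randrange(NUM_REGS), rng.randrange(NUM_REGS))
        for _ in range(length)
    ]


def to_ptx(program):
    """Translate (op, dst, src) tuples into straight-line PTX text."""
    lines = [f".reg .b64 %rd<{NUM_REGS}>;"]
    for op, dst, src in program:
        lines.append(f"{PTX_OPS[op]} %rd{dst}, %rd{dst}, %rd{src};")
    return "\n".join(lines)


if __name__ == "__main__":
    # A new seed (for example, one per block) yields a new kernel body.
    print(to_ptx(random_program(seed=2019, length=4)))
```

In a real miner the generated text would then be handed to the GPU driver's JIT compiler; that step is left out here.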
|
|
|
|
joblo
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
September 13, 2019, 10:37:05 PM |
|
Doesn't need to be PTX. If you run on NVIDIA hardware you convert the random stream of instructions to PTX. RandomX could be very profitable on NVIDIA hardware with a proper implementation...
Precisely. You can build an Nvidia-only proof of concept, but a real product will need its own pseudo-language that can be compiled to PTX/CUDA, OpenCL, and x86 native instructions, producing identical functionality. The language would have to be complex enough (in the CISC sense) that the FPGA can't decode it with a simple table lookup. That's a hell of a lot of work.
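The "identical functionality" requirement can be sketched with a toy example. Below, one small program in a hypothetical three-opcode language runs through two backends: an interpreter that decodes each instruction at run time, and a code generator that emits straight-line source once and runs it without decoding. The instruction set, `PROGRAM`, and both function names are assumptions for illustration; generated Python source stands in here for a PTX, OpenCL, or x86 backend.

```python
MASK = (1 << 64) - 1  # emulate 64-bit wraparound arithmetic

# Hypothetical toy program: (opcode, dst register, src register).
PROGRAM = [("ADD", 0, 1), ("MUL", 1, 2), ("XOR", 2, 0), ("MUL", 0, 2)]

# Source templates used by the code generator, one per opcode.
EXPR = {
    "ADD": "(r[{d}] + r[{s}]) & MASK",
    "MUL": "(r[{d}] * r[{s}]) & MASK",
    "XOR": "r[{d}] ^ r[{s}]",
}


def interpret(program, regs):
    """Backend 1: decode every instruction at run time."""
    r = list(regs)
    for op, d, s in program:
        if op == "ADD":
            r[d] = (r[d] + r[s]) & MASK
        elif op == "MUL":
            r[d] = (r[d] * r[s]) & MASK
        elif op == "XOR":
            r[d] = r[d] ^ r[s]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return r


def compile_program(program):
    """Backend 2: emit straight-line source once, then run it undecoded."""
    body = ["def kernel(regs):", "    r = list(regs)"]
    for op, d, s in program:
        body.append(f"    r[{d}] = " + EXPR[op].format(d=d, s=s))
    body.append("    return r")
    namespace = {"MASK": MASK}
    exec("\n".join(body), namespace)
    return namespace["kernel"]


if __name__ == "__main__":
    kernel = compile_program(PROGRAM)
    # Both backends must agree on every input.
    assert kernel([3, 5, 7]) == interpret(PROGRAM, [3, 5, 7])
```

Every additional backend has to pass the same equivalence check on every program the language can express, which is where most of the work would go.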
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
|
September 14, 2019, 05:05:48 AM |
|
The language would have to be complex enough (in the CISC sense) that the FPGA can't decode it with a simple table lookup. That's a hell of a lot of work.
FPGAs have limits on memory access and multipliers. Let's say the FPGA can do 32 multiplications and 32 memory accesses per cycle; then you might be able to run 32 instructions per cycle at 500 MHz. RandomX on the GPU doesn't need any memory access because the code is compiled, and you can run with 1024 threads at 2000 MHz. So the GPU can do 1024 instructions per cycle at 2000 MHz.
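The arithmetic behind that comparison, as a small sketch. It only restates the post's assumed figures (32 instructions per cycle at 500 MHz for the FPGA, 1024 per cycle at 2000 MHz for the GPU) as peak rates; the function name is hypothetical and nothing here models real hardware.

```python
def instructions_per_second(per_cycle, clock_mhz):
    """Peak instruction rate: instructions per cycle times cycles per second."""
    return per_cycle * clock_mhz * 1_000_000


def gpu_to_fpga_ratio():
    """Ratio of the two peak rates, using the figures assumed in the post."""
    fpga = instructions_per_second(32, 500)     # FPGA: 32 per cycle at 500 MHz
    gpu = instructions_per_second(1024, 2000)   # GPU: 1024 per cycle at 2000 MHz
    return gpu / fpga


if __name__ == "__main__":
    print(f"FPGA: {instructions_per_second(32, 500):.3e} instructions/s")
    print(f"GPU:  {instructions_per_second(1024, 2000):.3e} instructions/s")
    print(f"GPU/FPGA ratio: {gpu_to_fpga_ratio():.0f}")
```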
|
|
|
|
pallas
Legendary
Offline
Activity: 2716
Merit: 1094
Black Belt Developer
|
|
September 14, 2019, 05:47:38 AM |
|
The FPGA doesn't make N multiplications per cycle. It does N hashes per cycle, with N integer > 0 or, in the case of complex algorithms, 1/N.
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
|
September 14, 2019, 05:53:53 AM Last edit: September 14, 2019, 06:15:07 AM by sp_ |
|
The FPGA doesn't make N multiplications per cycle. It does N hashes per cycle, with N integer > 0 or, in the case of complex algorithms, 1/N.
Yes, but in RandomX the FPGA needs to do a memory read per cycle to determine the instruction to be executed, so the N hashes per cycle doesn't apply. The new limit is then N instructions, where N is limited by the number of memory accesses the chip can do per cycle. In older FPGA designs it was normal to have ASIC multipliers you could use to speed up multiplications (e.g. Altera Cyclone IV). The multiplication could also be done in code.
|
|
|
|
pallas
Legendary
Offline
Activity: 2716
Merit: 1094
Black Belt Developer
|
|
September 14, 2019, 06:11:31 AM |
|
True, but it's also true that you can fill the FPGA with custom-made cores, each executing RandomX instructions. FPGAs are plenty flexible, much more so than GPUs; it only takes much more time to optimise.
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
|
September 14, 2019, 06:30:33 AM |
|
With a compiled kernel, the GPU can execute 15000 RandomX instructions in 15 cycles per hash at 2000 MHz.
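One way to read this figure, assuming it follows from the 1024 instructions per cycle mentioned earlier in the thread (the post does not say so explicitly): 15000 instructions divided across 1024 lanes rounds up to 15 cycles. The function names below are hypothetical, and the result is a peak estimate only.

```python
import math


def cycles_per_hash(instructions, per_cycle):
    """Whole cycles needed if `per_cycle` instructions retire every cycle."""
    return math.ceil(instructions / per_cycle)


def seconds_per_hash(cycles, clock_mhz):
    """Wall-clock time for that many cycles at the given clock."""
    return cycles / (clock_mhz * 1_000_000)


if __name__ == "__main__":
    cycles = cycles_per_hash(15000, 1024)  # assumed: 1024 instructions per cycle
    print(cycles, "cycles per hash")
    print(seconds_per_hash(cycles, 2000), "seconds per hash at 2000 MHz")
```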
|
|
|
|
Kodaman
Jr. Member
Offline
Activity: 189
Merit: 2
|
|
September 24, 2019, 07:07:09 PM |
|
Maybe this place has the info. There are rumors that x16rv2 already has ASICs. Any feedback?
|
|
|
|
joblo
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
September 24, 2019, 07:52:41 PM |
|
If you don't produce a source it usually means you're starting the rumour.
|
|
|
|
Kodaman
Jr. Member
Offline
Activity: 189
Merit: 2
|
|
September 24, 2019, 07:58:20 PM |
|
If you don't produce a source it usually means you're starting the rumour.
iBeLink in California is the ASIC provider. Is the source good enough?
|
|
|
|
Kodaman
Jr. Member
Offline
Activity: 189
Merit: 2
|
|
September 24, 2019, 08:06:44 PM |
|
The X25X algo is the one and only GPU-only algo, not only for all Nvidia GPUs but also for cards that have 6 GB or less, because it doesn't require memory-hard operations. All the Nvidia cards from the 1050 Ti 2 GB up to the 1080 Ti can mine it without fear of ASICs and FPGAs. T-Rex has the algo optimised to the maximum; that is why there are no private miners or new faster miners for X25X. Today is the good information day lol
|
|
|
|
joblo
Legendary
Offline
Activity: 1470
Merit: 1114
|
|
September 24, 2019, 08:43:04 PM |
|
If you don't produce a source it usually means you're starting the rumour.
iBeLink in California is the ASIC provider. Is the source good enough
If iBeLink has announced it, it's not a rumour, it's fact. Where did the rumour originate? Is it your own speculation? There's nothing in x16rv2 that makes it more technically difficult than x16r to implement on ASIC or FPGA. It's probably only a matter of time and demand.
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
|
September 27, 2019, 08:16:25 AM Last edit: September 27, 2019, 08:28:26 AM by sp_ |
|
T-Rex has the algo optimised maximum that is why no private miners or no new faster miners for ....
T-Rex is slow and not profitable. Better to mine Beamv2 or Grin.
|
|
|
|
Kodaman
Jr. Member
Offline
Activity: 189
Merit: 2
|
|
September 27, 2019, 09:24:39 AM |
|
T-Rex has the algo optimised maximum that is why no private miners or no new faster miners for ....
T-Rex is slow and not profitable. Better to mine Beamv2 or Grin.
What about people with 1050 Tis or 1060 3 GB cards, or really any Nvidia cards with less than 8 GB? What should we mine, which algo and which miner?
|
|
|
|
sp_ (OP)
Legendary
Offline
Activity: 2926
Merit: 1087
Team Black developer
|
|
September 27, 2019, 09:31:23 AM Last edit: September 27, 2019, 09:49:34 AM by sp_ |
|
Beam can be mined with 3 GB cards: gminer (Windows 7, 8.1 or Linux).
Grin29 can be mined with 4/6 GB cards: gminer (Windows 7, 8.1 or Linux).
For 2 GB cards, Monero / RandomX?
|
|
|
|
Kodaman
Jr. Member
Offline
Activity: 189
Merit: 2
|
|
September 27, 2019, 03:32:19 PM |
|
Beam can be mined with 3gb cards. gminer. (windows 7,8.1 or linux) Grin29 can be mined with 4/6gb cards. gminer (windows 7,8.1 or linux) for 2gb cards monero / randomx?
Cool, thanks for the tip, but what about Windows 10 users?
|
|
|
|
scryptr
Legendary
Offline
Activity: 1797
Merit: 1028
|
|
September 28, 2019, 06:34:52 PM Last edit: September 28, 2019, 09:22:51 PM by scryptr |
|
Beam can be mined with 3gb cards. gminer. (windows 7,8.1 or linux) Grin29 can be mined with 4/6gb cards. gminer (windows 7,8.1 or linux) for 2gb cards monero / randomx?
RANDOMX IS A GOOD QUESTION--

I was just looking at SChernykh's GitHub. He has coded both CUDA and OpenCL versions for benchmarking RandomX on GPUs. I compiled and ran the CUDA version on my 1070 Ti (8 GB) rig, and got about 669 H/s RandomX per single 1070 Ti. I'll try on my 750 Ti (2 GB) rig in a little while. Both are Linux rigs.

Maybe you could plug the RandomX algo into SuprMiner. I noticed that none of the commercial, closed-source CCminer clones came out with an x16rv2 version until after you modded your SuprMiner source with it.

--scryptr
|
|
|
|
pallas
Legendary
Offline
Activity: 2716
Merit: 1094
Black Belt Developer
|
|
September 28, 2019, 07:01:33 PM |
|
The difference between x16rv2 and x16r is just Tiger, which is a pretty basic algorithm. It was already available as open-source CUDA code in my m7 miner years ago, and it is used in the commercial miner software supporting my x22i and x25x.
|
|
|
|
|