Bitcoin Forum
Topic: Is there a list with devices/cards and related jumps/sec for Pollard's kangaroo?
BitcoinADAB (OP)
July 07, 2021, 10:39:17 AM
Merited by ABCbits (1)
 #1

Hi, is there a list with devices/cards and related jumps/sec for Pollard's kangaroo (point addition)?

Thanks
BitcoinADAB (OP)
July 07, 2021, 10:57:41 AM
 #2

I found some:

OK, are the 204.56 MK/s and 165.57 MK/s speeds added together to give 370.13 MK/s?

No. The first value on the left is the combined speed of both the CPU and GPU.

The second value (165) measures the GPU speed only. That means the CPU speed is actually about 39 MK/s.
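
As a quick check of the arithmetic in that reply, here is a minimal sketch. The function name cpuSpeedMKs is made up for illustration and is not part of Kangaroo; it only assumes that the first figure on the status line is CPU plus GPU and the second is the GPU alone.

```cpp
#include <cassert>

// Hypothetical helper, not part of Kangaroo.
// Both arguments are in MK/s: the combined (CPU + GPU) figure and the
// GPU-only figure from the same status line.
double cpuSpeedMKs(double combinedMKs, double gpuMKs) {
    // The combined figure already contains the GPU share, so it can
    // never be smaller than the GPU-only figure.
    assert(combinedMKs >= gpuMKs);
    return combinedMKs - gpuMKs;
}
```

With the numbers quoted above, cpuSpeedMKs(204.56, 165.57) comes out at roughly 38.99 MK/s, which is the "about 39 MK/s" in the reply.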

Oh OK then, so which GPUs can reach, or get close to, 1 billion keys per second with Kangaroo?

A single RTX3090 can do 3 or 4 gigakeys/s, so it's not unreasonable to assume the other RTX 30s can search in excess of 1GK/s too.

but K/s and not j/s.
WanderingPhilospher
July 07, 2021, 04:38:45 PM
 #3

Quote
but K/s and not j/s.

In your mind, what is the difference between keys per second and jumps per second?
BitcoinADAB (OP)
July 07, 2021, 05:57:18 PM
 #4

K/s and j/s are the same, if K is the result of point addition.
But if K is the hashed form, then it is different.
WanderingPhilospher
July 07, 2021, 06:03:27 PM
 #5

Quote
K/s and j/s are the same, if K is the result of point addition.
But if K is the hashed form, then it is different.

In the Kangaroo program, each kangaroo jumps to a new point and converts the point to a pubkey. If the point matches the selected DP (distinguished point), the program records the point and its distance (priv key) and keeps on moving.

So using MK/s or J/s is the same for Kangaroo.
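
To make the "one jump = one point = one key" counting concrete, here is a toy sketch. It does no elliptic-curve math: a 64-bit mixing function stands in for the point addition, and the DP test assumes "distinguished" means the top dpBits bits of the x coordinate's most significant word are zero. The names isDistinguished, WalkStats and walk are made up for illustration and are not Kangaroo's.

```cpp
#include <cassert>
#include <cstdint>

// Assumed DP rule: the top dpBits bits of the word must all be zero.
bool isDistinguished(std::uint64_t xTopWord, unsigned dpBits) {
    assert(dpBits <= 64);
    if (dpBits == 0) return true;  // DP 0: every point qualifies
    return (xTopWord >> (64 - dpBits)) == 0;
}

struct WalkStats {
    std::uint64_t jumps = 0;   // points visited, i.e. keys counted
    std::uint64_t stored = 0;  // points that matched the DP rule
};

// Toy walk: each iteration is one jump and produces one new "point".
WalkStats walk(std::uint64_t seed, std::uint64_t nJumps, unsigned dpBits) {
    WalkStats s;
    std::uint64_t x = seed;
    for (std::uint64_t i = 0; i < nJumps; ++i) {
        // Stand-in for a point addition (splitmix64-style mixing).
        x += 0x9E3779B97F4A7C15ULL;
        std::uint64_t z = x;
        z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
        z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
        z ^= z >> 31;

        ++s.jumps;  // the speed counter moves once per jump
        if (isDistinguished(z, dpBits)) ++s.stored;  // only DPs are recorded
    }
    return s;
}
```

The jump counter advances on every iteration whether or not the point is stored, which is why the reported speed can be read as either keys/s or jumps/s.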

I am not sure if there is a one stop benchmark for the different cards or CPUs; you may have to go through the main Kangaroo post and look at screen shots that users posted and create your own list.
BitcoinADAB (OP)
July 07, 2021, 06:27:18 PM
 #6

Quote
I am not sure if there is a one stop benchmark for the different cards or CPUs; you may have to go through the main Kangaroo post and look at screen shots that users posted and create your own list.

Thanks
NotATether
July 07, 2021, 11:09:41 PM
Merited by ABCbits (2)
 #7

Kangaroo has a built-in benchmark option in "-check" that will benchmark the number of keys/second for your hardware, but with two things to note. First, you have to pass -gpu to get benchmarks for your GPU, otherwise you'll just get CPU benchmarks. Second, this is the part of the code in Check.cpp that actually does the benchmark:

Code:
  // Check on ComputePublicKeys
  for(int i = 0; i<nbKey; i++) {
    Int rnd;
    rnd.Rand(256);
    priv.push_back(rnd);  // <--- [1]
  }

  t0 = Timer::get_tick();
  for(int i = 0; i<nbKey; i++)
    pts1.push_back(secp->ComputePublicKey(&priv[i]));      // <--- [2]
  t1 = Timer::get_tick();
  ::printf("ComputePublicKey %d : %.3f KKey/s\n",nbKey,(double)nbKey / ((t1 - t0)*1000.0));

  t0 = Timer::get_tick();
  pts2 = secp->ComputePublicKeys(priv);
  t1 = Timer::get_tick();
  ::printf("ComputePublicKeys %d : %.3f KKey/s\n",nbKey,(double)nbKey / ((t1 - t0)*1000.0));

To benchmark any other int operation such as add, subtract, divide or multiply, just duplicate this snippet at the end. At [1] you might have to make another array of Ints for binary operations, and at [2] you have to replace ComputePublicKey with whatever you're trying to benchmark, such as (some int)->add(&priv).
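
The same timing pattern can be tried outside Kangaroo. This is a standalone sketch under loud assumptions: it uses std::chrono and plain 64-bit integers instead of Kangaroo's Timer and Int classes, and the names kiloOpsPerSecond and benchmarkKOps are made up for illustration.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts an operation count and elapsed seconds into thousands of ops/s,
// the same unit as the KKey/s figure printed by the snippet above.
double kiloOpsPerSecond(std::uint64_t ops, double seconds) {
    if (seconds <= 0.0) return 0.0;  // timer too coarse to measure anything
    return static_cast<double>(ops) / (seconds * 1000.0);
}

// Times a binary operation over two operand arrays (the "second array"
// mentioned for step [1]). The results are folded into `sink` so the
// compiler cannot discard the loop.
template <typename Op>
double benchmarkKOps(const std::vector<std::uint64_t>& a,
                     const std::vector<std::uint64_t>& b,
                     Op op, std::uint64_t& sink) {
    const std::size_t n = std::min(a.size(), b.size());
    const auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) sink += op(a[i], b[i]);
    const auto t1 = std::chrono::steady_clock::now();
    const double seconds = std::chrono::duration<double>(t1 - t0).count();
    return kiloOpsPerSecond(n, seconds);
}
```

Passing a lambda such as [](std::uint64_t x, std::uint64_t y) { return x * y; } as op times a multiplication; inside Kangaroo the lambda body would be the Int operation you want to measure.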

Theoretically, you can rent many boxes on vast.ai, one at a time, just to measure Kangaroo speed on each. But it's probably not a good idea unless they already have CUDA installed, or you'll consume precious run hours installing that instead of just building Kangaroo.

BitcoinADAB (OP)
July 08, 2021, 06:02:00 PM
 #8

Quote
Theoretically, you can rent many boxes on vast.ai, one at a time, just to measure Kangaroo speed on each. But it's probably not a good idea unless they already have CUDA installed, or you'll consume precious run hours installing that instead of just building Kangaroo.

Quote
I checked the vast.ai rent page (https://vast.ai/console/create/) and found out it's possible to select an OS image with the CUDA version you want (ranging from 6.5 to 11.4). You can also add an on-start script to install additional software immediately. So it should be a practical option if OP wishes to make a benchmark list.

Thanks, that is an informative site. My intention was not to make a benchmark list. I wanted to know what kind of devices we can use for point addition and the related j/s.

Do you know if there are already ASICs like these

not to mine coins but for point addition?
WanderingPhilospher
July 08, 2021, 07:35:04 PM
Merited by LoyceV (5), ABCbits (1)
 #9

Quote
Do you know if there are already ASICs like these

not to mine coins but for point addition?
Not publicly available, or they would be steep in cost. There are some who are trying to modify old ASICs to do point addition.

Quote
My intention was not to make a benchmark list. I wanted to know what for devices we can use for point addition and the related j/s.
Not sure what you are wanting then. All modern GPUs and CPUs can do point addition.

Quote
Hi, is there a list with devices/cards and related jumps/sec for Pollard's kangaroo (point addition)?
It sounded like you wanted a list of cards with their applicable speed for point addition.
Do some legwork, become informed.
BitcoinADAB (OP)
July 27, 2021, 05:17:10 PM
 #10

Quote
Hi, is there a list with devices/cards and related jumps/sec for Pollard's kangaroo (point addition)?

Thanks

Here is something... Thanks to DaveF.

Has anyone put together (or started to put together) a list of CPUs / video cards and the speed you can get out of them?
I know it's a newer project and Jean_Luc is working VERY VERY hard on it so getting accurate numbers is going to be a moving target. But for now all we can do is look through the thread and see who is running what to get a general idea.
So far I have pulled from this thread:

GPU: GPU #0 GeForce GTX 1080 Ti (28x128 cores) Grid(224x128)
914.418 MK/s (GPU 896.216 MK/s)

GPU: GPU #0 GeForce GTX 1050 Ti (6x128 cores) Grid(48x128)
220.180 MK/s (GPU 220.180 MK/s)

GPU: GPU #0 GeForce GT 520M (1x48 cores) Grid(8x128)
10.233 MK/s (GPU 7.026 MK/s)

GPU: GPU #0 GeForce RTX 2070 (36x64 cores) Grid(288x128)
1535.880 MK/s (GPU 1470.257 MK/s)

Added 30-April-2019

GPU: GPU #0 GeForce GTX 1060 3GB (9x128 cores) Grid(72x128)
321.929 MK/s (GPU 321.929 MK/s)

GPU: GPU #0 GeForce GTX 1080 (20x128 cores) Grid(160x128)
672.062 MK/s (GPU 672.062 MK/s)

Added 1-May-2019

GPU: GPU #0 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
GPU: GPU #3 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
GPU: GPU #2 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
GPU: GPU #1 Tesla V100-SXM2-16GB (80x64 cores) Grid(640x128)
7260.449 MK/s (GPU 7212.931 MK/s)
So 7260 / 4 = 1815 MK/s

GPU: GPU #0 GeForce GTX 750 (4x128 cores) Grid(32x128)
104.960 MK/s (GPU 94.405 MK/s) (2^32.12)

Added 3-May-2019
i7-7700K CPU Number of CPU thread: 8
22.092 MK/s (GPU 0.000 MK/s)

With -t 7
Number of CPU thread: 7
21.609 MK/s

Added 8-May-2019

EVGA RTX 2080 XC ULTRA
1427.967 MK/s (GPU 1424.946 MK/s)

Added 23-May-2019

GPU: GPU #0 GeForce GTX 1660 Ti
961.319 MK/s (GPU 961.319 MK/s)

GPU: GPU #0 GeForce RTX 2080 Ti (68x64 cores) Grid(544x128)
GPU: GPU #1 GeForce RTX 2080 Ti (68x64 cores) Grid(544x128)
5128.213 MK/s (GPU 5128.213 MK/s)
So 5128 / 2 = 2564 MK/s


Added 8-June-2019

GPU: GPU #0 GeForce GTX 960M (5x128 cores) Grid(40x128)
117.802 MK/s (GPU 117.802 MK/s)

Added 23-July-2019

GPU: GPU #0 GeForce GTX 1660 (22x64 cores) Grid(176x128)
839.061 MK/s (GPU 839.061 MK/s)

Added 25-July-2019

GPU: GPU #0 GeForce GTX 1650 (14x64 cores) Grid(112x128)
511.906 MK/s (GPU 511.906 MK/s) (2^36.97)


Added 21-Nov-2019

GPU: GPU #0 GeForce GTX 970 (13x128 cores) Grid(104x128)
360.322 MK/s (GPU 331.442 MK/s) (2^32.77)

Added 25-Nov-2019

GPU: GPU #0 GeForce GTX 980 (16x128 cores) Grid(128x128)
375.384 MK/s (GPU 375.384 MK/s)

GPU: GPU #0 GeForce RTX 2060 SUPER (34x64 cores) Grid(272x256)
[1361.71 Mkey/s][GPU 1361.71 Mkey/s]

GPU: GPU #0 GeForce RTX 2080 SUPER (48x64 cores) Grid(384x256)
[2001.52 Mkey/s][GPU 2001.52 Mkey/s]
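
If someone wants to turn a list like this into per-card numbers, a small parser is enough. This is only a sketch: the helper names parseSpeedLine and perCardMKs are made up for illustration, and it only handles the "x MK/s (GPU y MK/s)" format, not the bracketed "[x Mkey/s][GPU y Mkey/s]" lines at the end of the list.

```cpp
#include <cstdio>
#include <string>

// Parses a status line such as "914.418 MK/s (GPU 896.216 MK/s)".
// Returns true only if both the total and the GPU-only figure were read.
bool parseSpeedLine(const std::string& line, double& totalMKs, double& gpuMKs) {
    return std::sscanf(line.c_str(), "%lf MK/s (GPU %lf MK/s)",
                       &totalMKs, &gpuMKs) == 2;
}

// Averages a multi-GPU reading over the number of cards, the same way the
// "7260 / 4" and "5128 / 2" lines in the list do.
double perCardMKs(double speedMKs, int cardCount) {
    return cardCount > 0 ? speedMKs / cardCount : 0.0;
}
```

For the four V100s, parsing "7260.449 MK/s (GPU 7212.931 MK/s)" and dividing the total by 4 gives about 1815 MK/s per card, matching the figure in the list.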
BitcoinADAB (OP)
July 28, 2021, 05:59:59 PM
 #11

This is a benchmark post with speeds for the Kangaroo GPU solver. All the tests were made with the default DP and default grid size (calculated by the program). I guess that playing with the DP and grid size could change (increase or decrease) the speed. If somebody knows the optimal values, please let us know.

Code:
Card Model           Grid size      DP        Tested speed
---------------------------------------------------------------
GTX 1050 Ti          Grid(12x256)   DP 16     115 MKey/sec
GTX 1080 Ti          Grid(56x256)   DP 15     500 MKey/sec
Tesla T4 16Gb        Grid(80x128)   DP 14     565 MKey/sec
RTX 2080ti 11Gb      Grid(136x128)  DP 13     1225 MKey/sec
Tesla V100 32Gb      Grid(160x128)  DP 13     1420 MKey/sec
---------------------------------------------------------------
WanderingPhilospher
August 01, 2021, 12:40:51 AM
 #12

The program uses an algorithm to calculate the best DP so that the DP overhead does not increase. Very low DPs (like 0, 1 or 2) may slow the speed somewhat, because pretty much every point and distance gets written into RAM. Other than that possibility, the program gets the same speed regardless of DP size.
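
A rough way to see why only very low DPs hurt: assuming points are spread evenly, about one point in 2^DP is distinguished, so the number of points written to RAM per second is the speed divided by 2^DP. The function name dpPerSecond is made up for illustration and is not taken from Kangaroo.

```cpp
#include <cmath>

// Expected distinguished points stored per second, assuming a fraction of
// 2^-dpBits of all visited points matches the DP rule.
double dpPerSecond(double keysPerSecond, unsigned dpBits) {
    // ldexp(x, -n) computes x / 2^n
    return std::ldexp(keysPerSecond, -static_cast<int>(dpBits));
}
```

At 1225 MKey/sec with DP 13 (the RTX 2080ti row in the table above), that is roughly 150,000 stored points per second. With DP 0 it would be all 1225 million points per second, which is where the slowdown comes from.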