Author Topic: Ultra-Lightweight Database with Public Keys (for puzzle btc)  (Read 1293 times)
mcdouglasx (OP)
Member
**
Offline Offline

Activity: 348
Merit: 89

New ideas will be criticized and then admired.


View Profile WWW
September 24, 2024, 05:03:14 PM
 #21

Can this script be used on any puzzle that has the pubkey revealed, or does it only work on a specific bit wallet?

This script is only an idea with a basic implementation. Although you can use it, it does not guarantee good speed; wait for Alberto and his own implementation in Keyhunt, or if you have programming knowledge, implement it yourself in C.

BTC bc1qxs47ttydl8tmdv8vtygp7dy76lvayz3r6rdahu
mcdouglasx (OP)
Member
**
Offline Offline

Activity: 348
Merit: 89

New ideas will be criticized and then admired.


View Profile WWW
September 24, 2024, 07:08:44 PM
 #22

I have added 3 new files, including a custom XOR filter reduced to 8 bytes per entry, because this database does not need many elements and false collisions are unlikely; in any case, false positives are ignored by the script when comparing against the target. It uses Blake2, an efficient and fast hash for this task, and it is slightly lighter than using the PATT_DB.txt directly.
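For reference, here is a rough sketch (not the exact code in the repository) of how an entry can be reduced to an 8-byte Blake2 fingerprint with Python's hashlib before it goes into the filter; the entry encoding below is just an example:

Code:
import hashlib

def fingerprint8(entry: bytes) -> int:
    # Hash the database entry with BLAKE2b truncated to 8 bytes (64 bits);
    # this compact value is what goes into the XOR filter instead of the raw entry.
    return int.from_bytes(hashlib.blake2b(entry, digest_size=8).digest(), "little")

print(hex(fingerprint8(b"Bi:10 Pp:010101010101010")))   # example entry encoding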

Would Blake3 make a notable difference?

https://github.com/Mcdouglas-X/Private-Key-Search-and-Ultra-Lightweight-Database-with-Public-Keys

BTC bc1qxs47ttydl8tmdv8vtygp7dy76lvayz3r6rdahu
albert0bsd
Hero Member
*****
Offline Offline

Activity: 977
Merit: 685



View Profile
September 24, 2024, 07:14:14 PM
 #23

Can this script be used on any puzzle that has the pubkey revealed, or does it only work on a specific bit wallet?

This is the idea for a database... there is no complete script yet.

But to be honest I don't know what your intentions are here. Are you coming here to make fun of us?

Quote from another topic:
yous look like dogs, you only know how to bark and you don't solve anything! lol


I am available for hiring. Avatar and Signature available for rent.
Donations: bc1qjyhcjacmpc9pg8wes82lktyrjcwk4sdvqtm7ky
AbadomRSZ
Newbie
*
Offline Offline

Activity: 6
Merit: 0


View Profile
September 24, 2024, 08:11:47 PM
 #24

Can this script be used on any puzzle that has the pubkey revealed, or does it only work on a specific bit wallet?

This is the idea for a database... there is no complete script yet.

But to be honest I don't know what your intentions are here. Are you coming here to make fun of us?

Quote from another topic:
yous look like dogs, you only know how to bark and you don't solve anything! lol


OK! As for your question, my intention was to throw cold water on all this euphoria. I apologize if anyone was offended.
albert0bsd
Hero Member
*****
Offline Offline

Activity: 977
Merit: 685



View Profile
September 25, 2024, 03:22:57 PM
Last edit: September 25, 2024, 03:44:57 PM by albert0bsd
 #25

OK! As for your question, my intention was to throw cold water on all this euphoria. I apologize if anyone was offended.

I don't get offended by that; I am not a snowflake. I just wonder, if this topic about solving puzzles is so annoying for some users, why they bother to create new accounts just to make those comments.

Anyway, I am implementing this new approach just to check what the new speed is.
I believe that a laptop with 8 GB of RAM may be able to solve any key under 75 bits in a few seconds or minutes.




OK guys, I already completed and tested a BSGS version with this database, but the results are not as good as expected (this is only a first approach). Let me summarize the results: I tested it against the puzzle 66 public key on my laptop (11th Gen Intel(R) Core(TM) i5-1145G7 @ 2.60GHz with 16 GB of physical RAM).

17 GB of this database can store 17 * 2^33 items, in this case 146,028,888,064 keys (146 billion); stored in a bloom filter, those need only ~8 GB of RAM.
This test solved the key in ~10 minutes.

But the regular version of keyhunt, with only 4 GB of RAM used and only 1,073,741,824 keys (1 billion) stored in the bloom filter, needs only 6 minutes to solve it on the same hardware.
So that means that keyhunt with 8 GB of RAM used would solve that same public key in about 3 minutes.

I have some screenshots that I want to share with mcdouglasx, and I am also sharing the first code with him.

Not all is lost: I figured out how to change some calculations and maybe gain some x2 or x4 more speed with this Ultra-Lightweight Database.

I said that not all is lost because, even though the new approach needs about x64 more processing to prepare the data to be checked, the final speed is not x64 slower but only about x4 slower; so if I implement a new approach, I think the speed can at least be the same or a little faster.

There are other options apart from bloom filters: there are XOR filters and fuse filters, which supposedly use less data than a bloom filter. This may allow storing more items in the same RAM and speed up BSGS in general.
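As a rough sanity check (my own assumptions, not stated in this post): if each filter item groups 64 keys and the false-positive rate is around 2^-20, the standard optimal Bloom sizing lands close to the ~8 GB figure above:

Code:
import math

total_keys = 146_028_888_064        # 17 * 2**33 keys from the test above
keys_per_item = 64                  # assumption: one filter item per 64-key group
p = 2 ** -20                        # assumed false-positive rate

items = total_keys // keys_per_item
bits_per_item = -math.log(p) / math.log(2) ** 2   # optimal Bloom sizing formula
gib = items * bits_per_item / 8 / 2 ** 30

print(f"{items:,} items x {bits_per_item:.1f} bits/item ~ {gib:.1f} GiB")
# -> 2,281,701,376 items x 28.9 bits/item ~ 7.7 GiB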

So I still have hope in this database.

Regards!

I am available for hiring. Avatar and Signature available for rent.
Donations: bc1qjyhcjacmpc9pg8wes82lktyrjcwk4sdvqtm7ky
mcdouglasx (OP)
Member
**
Offline Offline

Activity: 348
Merit: 89

New ideas will be criticized and then admired.


View Profile WWW
September 25, 2024, 08:06:59 PM
 #26

I said that not all is lost because, even though the new approach needs about x64 more processing to prepare the data to be checked, the final speed is not x64 slower but only about x4 slower [...]

Be careful with masking the bits; this shouldn’t happen because even if you perform 64 more calculations (I suppose), the final total should at least have double the speed in the worst-case scenario.

BTC bc1qxs47ttydl8tmdv8vtygp7dy76lvayz3r6rdahu
albert0bsd
Hero Member
*****
Offline Offline

Activity: 977
Merit: 685



View Profile
September 25, 2024, 10:41:11 PM
Last edit: September 25, 2024, 11:13:24 PM by albert0bsd
 #27

I am not masking any bits. Let me give an example of what I am doing.

I have 2 files:
- Your database (1 bit per public key in DISK)
- The bloom filter (for fast query in RAM)


Let me show some data:
Code:
0111000101100000000110100010100111111111111111010011111101100101 (from key 1 Left to key 64 right    ITEM 1)
1011100010010000011110001100000100001010110011100111101011111110 (from key 65 left to key 128 right  ITEM 2)
1001000001111101101101111001000001101111010010000101000001011010 (from key 129 left to key 192 right ITEM 3)
...
etc up to some Gigabytes of data   (ITEM N)

That is just the binary string representation but internally in the program I just use 64 bit integer variables without any output format.

So I group those 64 keys into a single variable, and that is the value that I send to the bloom filter when initializing and constructing it; each of those items needs about 29 bits inside the bloom filter.
I don't have any problem building the bloom filter or querying it to check whether some value exists or not.
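In other words, something like this sketch (it assumes the earliest key of the group becomes the least-significant bit, per the "less significant on the left" convention below):

Code:
def pack64(parity_bits):
    # parity_bits[0] is the earliest key of the group and becomes the
    # least-significant bit of the 64-bit item sent to the bloom filter.
    assert len(parity_bits) == 64
    item = 0
    for j, b in enumerate(parity_bits):
        item |= (b & 1) << j
    return item

item1 = pack64([0, 1, 1, 1] + [0] * 60)   # toy data, not the real strings above
print(hex(item1))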

The problem is that we are looking for some unknown public key and we don't know its original value, so we don't know whether that key is aligned properly with our sequence.

For example, in the BSGS algorithm you need to do some subtractions from the target key and check each of those results against your database.
And we cannot do an exhaustive search in our Ultra-Lightweight Database, because that would take a lot of time with databases of some hundreds of gigabytes.

I opted to use the bloom filter for fast search.

Each time that I need to query the bloom filter I need to generate a 64-bit key based on the public key. For example, let's say that our public key's result is 2 (private key 2).
In this case I need to group keys from 2 to 65; the 64-bit value generated is 1110001011000000001101000101001111111111111110100111111011001011 (less significant on the left). If I query this value against the bloom filter, it is going to return false.

So I need to check the next sequences until one is found, or until we have shifted 64 keys more:

Code:
1110001011000000001101000101001111111111111110100111111011001011 from 2 to 65 (Bloom filter query: false)
1100010110000000011010001010011111111111111101001111110110010110 from 3 to 66 (Bloom filter query: false)
1000101100000000110100010100111111111111111010011111101100101101 from 4 to 67 (Bloom filter query: false)
0001011000000001101000101001111111111111110100111111011001011011 from 5 to 68 (Bloom filter query: false)
...
1011100010010000011110001100000100001010110011100111101011111110 from 65 to 128 (Bloom filter query: True -> HIT!!)

I have a method to handle false-positive HITs, and a method to determine the value of the target key based on a binary search with multiple queries to the bloom filter.

In fact, for each key that I need to check against the bloom filter, I need to calculate the next 128 keys just to get the correct sequences and guarantee that at least one of them is properly aligned with our bloom filter items.
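Roughly like this (a sketch under the same bit-order assumption; only one of the 65 windows can be aligned with the 64-key grouping used when the filter was built):

Code:
def candidate_items(parity_bits_128):
    # Build the 65 possible 64-bit items from a 128-key window:
    # keys [i .. i+63] for i = 0..64, earliest key as least-significant bit.
    assert len(parity_bits_128) == 128
    items = []
    for i in range(65):
        value = 0
        for j, b in enumerate(parity_bits_128[i:i + 64]):
            value |= (b & 1) << j
        items.append(value)
    return items

# hits = [bloom.check(v) for v in candidate_items(bits)]   # hypothetical filter API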

As I mentioned before in other posts, I've been thinking and rethinking about different approaches to handle this database, fast searches and collisions; this previous example is the best way that I've reached.

So if you or anyone else has a faster way to do it efficiently, please show us.

(Again, I am using my custom way to handle the database because I don't understand how your way works; all the Pp, Bi, Tb stuff, I don't get it.)

I am available for hiring. Avatar and Signature available for rent.
Donations: bc1qjyhcjacmpc9pg8wes82lktyrjcwk4sdvqtm7ky
mcdouglasx (OP)
Member
**
Offline Offline

Activity: 348
Merit: 89

New ideas will be criticized and then admired.


View Profile WWW
September 25, 2024, 11:26:50 PM
Last edit: September 26, 2024, 02:28:27 AM by mcdouglasx
 #28

In fact, for each key that I need to check against the bloom filter, I need to calculate the next 128 keys just to get the correct sequences and guarantee that at least one of them is properly aligned with our bloom filter items.

As I mentioned before in other posts, I've been thinking and rethinking about different approaches to handle this database, fast searches and collisions; this previous example is the best way that I've reached.

So if you or anyone else has a faster way to do it efficiently, please show us.


Maybe it would be more efficient to create a sliding-window type variable where, for example, you calculate keys 1-64, and from there on you only calculate one more key and the bits in the window move to 2-65, and so on consecutively for the duration of your sequence?
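Something like this rolling update, I mean (a sketch; it assumes the oldest key of the window sits in the least-significant bit, as in the earlier example):

Code:
MASK64 = (1 << 64) - 1

def slide(window, new_parity_bit):
    # Drop the oldest key (LSB) and insert the newly computed key's parity
    # as the new most-significant bit: keys 1-64 -> keys 2-65, and so on,
    # so only one extra key has to be calculated per step.
    return ((window >> 1) | ((new_parity_bit & 1) << 63)) & MASK64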

(Again, I am using my custom way to handle the database because I don't understand how your way works; all the Pp, Bi, Tb stuff, I don't get it.)


The patterns in this database are sequences of 010101… or 101010… of length 15 or more.

Suppose:


Process:


Initial target: pk = 1,000,000

Consecutive subtractions: 1 is subtracted consecutively 200,000 times.

Concatenation: These values are concatenated using 1 bit per key, creating a string of 200,000 bits.


Example for Db:

First pattern:

Start position of pattern (Bi): 10 (bits to pattern)

Pattern found (Pp): 010101010101010 (15 bits)

Bits to end of pattern (Tb): 25



Bi: 10, Pp: 010101010101010, Tb: 25



Second pattern:


position of pattern (Bi): 150,000 - 25 = 149,975

Pattern found (Pp): 01010101010101010 (17 bits)

Bits to end of pattern (Tb): 150,017



Bi: 149,975, Pp: 01010101010101010, Tb: 150,017



Database storage:

Code:
Bi: 10, Pp: 010101010101010, Tb: 25
Bi: 149,975, Pp: 01010101010101010, Tb: 150,017

If we use a XOR filter, we must store "Bi: 10, Pp: 010101010101010" as an item in the filter and, when it is found, return the Tb value "25".
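A minimal sketch of the pattern extraction just described (assuming Bi counts the bits since the end of the previous pattern and Tb the bits from the start of the string to the end of the pattern, as in the example):

Code:
def find_patterns(bitstring, min_len=15):
    # Scan a '0'/'1' string for alternating runs (0101.../1010...) of at least
    # min_len bits.  For each one, record (Bi, Pp, Tb):
    # Bi = bits between the end of the previous pattern and this one,
    # Pp = the pattern itself, Tb = bits from the start of the string to its end.
    patterns, prev_end = [], 0
    i, n = 0, len(bitstring)
    while i < n:
        j = i
        while j + 1 < n and bitstring[j + 1] != bitstring[j]:
            j += 1
        if j - i + 1 >= min_len:
            patterns.append((i - prev_end, bitstring[i:j + 1], j + 1))
            prev_end = j + 1
        i = j + 1
    return patterns

print(find_patterns("0011001100" + "010101010101010" + "00111"))
# -> [(10, '010101010101010', 25)]   (matches the first example above)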


Search script:



Generating a random point: Example pk = 999,997



Creating a bit string: From the random point, the subtractions are generated just like in the database.


Pattern search:


First pattern:

Start position of pattern (Bi): 7

Pattern found (Pp): 010101010101010 (15 bits)


We do not include Tb here because it is not needed for the comparison against the DB; however, it is used in the calculation of the private key, so keep its calculation in the code.

Bi: 7, Pp: 010101010101010


Second pattern:


Start position of pattern (Bi): 149,975

Pattern found (Pp): 01010101010101010 (17 bits)


Bi: 149,975, Pp: 01010101010101010



Database search: The last pattern found is searched.


Bi: 149,975, Pp: 01010101010101010



Match found:


return Tb: 150,017



Key calculation:

target = random_point - total_bits + Tb_in_db, where total_bits is the number of bits from the start of the search string to the end of the matched pattern (here 7 + 15 + 149,975 + 17 = 150,014):

(999,997 - 150,014 + 150,017) = 1,000,000



Sometimes there can be false positives, but you discard them by deriving the pubkey for the obtained pk and comparing it with the target.

Considerations to avoid false positives:

Search cycles: You should search for cycles of 250,000 keys to cover the entire database with a very low margin of error, but you can explore with fewer keys: 150, 100, 50.

Storage capacity: 12 billion keys can be stored in a little more than 300,000 elements in your database or XOR filter, with an incredibly small footprint.

Speed: for example, if you do 4 cycles per second, you will have 12,000,000,000 * 4 keys/s. By increasing the database, you will get even more speed.



BTC bc1qxs47ttydl8tmdv8vtygp7dy76lvayz3r6rdahu
albert0bsd
Hero Member
*****
Offline Offline

Activity: 977
Merit: 685



View Profile
September 26, 2024, 03:29:25 AM
 #29

Maybe it would be more efficient to create a sliding-window type variable where, for example, you calculate keys 1-64, and from there on you only calculate one more key and the bits in the window move to 2-65, and so on consecutively for the duration of your sequence?

Well, yes, I did that; in my previous example that was done. Since almost all the time those 128 items need to be calculated, I calculate all those keys and generate the whole array of 64 items, all of them of 64 bits, and then I proceed to check them against the bloom filter.



The patterns in this database are sequences of 010101… or 101010… of length 15 or more.

...

Thank you for your example; that is actually what I wanted to ask about next. Let me check and analyze it to see what I can do.

I am available for hiring. Avatar and Signature available for rent.
Donations: bc1qjyhcjacmpc9pg8wes82lktyrjcwk4sdvqtm7ky
albert0bsd
Hero Member
*****
Offline Offline

Activity: 977
Merit: 685



View Profile
September 27, 2024, 02:53:25 AM
 #30

Now I get your idea.

You should search for cycles of 250,000 keys to cover the entire database with a very low margin of error, but you can explore with fewer keys: 150, 100, 50.

150, 100, 50. Thousands, right?

Here is what I understand

Your Ultra-Lightweight Database is more like a reference of some patterns and their offsets relative to the next pattern, right?

So each time we need to query whether some random key X is in our database, we need to calculate some 50K to 200K keys next to our target and look for those patterns.

If some patterns match our database (pattern and offset against the other patterns in our database), we can easily calculate the value of that key X.

As far as I see, to keep this database of distances and patterns small, you are paying with more processing to find those matches.

If you do 4 cycles per second, you will have 12,000,000,000 * 4 keys/s. By increasing the database, you will get even more speed.

You need a point of comparison; you also need to calculate at what point your approach becomes better than the existing methods.

For example, in the test that I sent to you using my approach (bloom filter + BSGS), the speed was near 16 Petakeys/s (15,904,076,907,774,986 keys/s) using 8 GB of RAM.

But the current version of keyhunt did 26 Petakeys/s with 4 GB of RAM; if I used 8 GB like in the other test, it would do some 50 Petakeys/s.

And my program is slow compared against Kangaroo (kTimesG's CPU version), which can solve keys under 80 bits in 1 minute.

So I think that we need to calculate how many keys you need to have in your database just to become competitive against the existing tools.

The first thing that you need to do is measure how much time a CPU or GPU takes to calculate the 200K keys sequentially and search for the patterns.
Then calculate how many keys the database needs in order to reach some Exakeys/s, Zettakeys/s or Yottakeys/s.

One good thing about your approach is that the database can grow on demand: you only need to keep calculating keys and searching for the patterns, and once you find one, just append it to the DB.

To be honest, I think that it would be difficult to beat Kangaroo, but with a big enough DB it may be possible.

Let's do some calculations:

(2**80) / 60 seconds = 20148763660243819578436

Suppose that you take 1 second to determine whether the pattern is in our database or not.

That would mean that you need a database with at least 20148763660243819578436 elements:

>>> 2**75  <=  20148763660243819578436
False
>>> 2**74  <=  20148763660243819578436
True


So it's something between 2**74 and 2**75. Well, those are just rounded numbers; the real database may be nearer to 2^70 or something lower.
But one thing comes to my mind: we can't solve a 2^65 key with pure brute force alone.

How do you plan to build a database of around 2^60 items to compete against Kangaroo?

If we have enough computing power to build a 2^60 or 2^70 database, we may have some chance of being able to try something against puzzle 135.

Well, those are just round numbers that come to my mind at this moment.

I am available for hiring. Avatar and Signature available for rent.
Donations: bc1qjyhcjacmpc9pg8wes82lktyrjcwk4sdvqtm7ky
mcdouglasx (OP)
Member
**
Offline Offline

Activity: 348
Merit: 89

New ideas will be criticized and then admired.


View Profile WWW
September 27, 2024, 06:32:31 AM
 #31

Now I get your idea.

150, 100, 50. Thousands, right?

[...]

So I think that we need to calculate how many keys you need to have in your database just to become competitive against the existing tools.

[...]

How do you plan to build a database of around 2^60 items to compete against Kangaroo?


Yes, by 50, 100, 150 I mean thousands.

Focusing on your tests with your approach:

original BSGS: 1,073,741,824 keys stored

new BSGS: 146,028,888,064 keys stored

Dividing 146,028,888,064 by 128 gives 1,140,850,688.

I think you would have to be able to process more keys in the same time frame to equal the efficiency, and that's why it becomes slower: because of all the work that the 128 additional computations involve.

In this DB, 146,028,888,064 keys fit in less than 20 MB.

In my database with 120 million keys, the sequence rarely exceeds 200,000 keys between patterns. This suggests that 200,000 can be taken as an average maximum value for key cycles, and the way the patterns are packed allows saving large amounts of keys (or their reference points), which is efficient in terms of storage.
Unlike methods with fixed computing power like Kangaroo, we can keep moving forward by adding more keys to the database and gain a significant advantage in terms of scalability.

We can also create an additional mini-database with subtractions and additions from your target, have each of those pubs calculate a certain number of keys and patterns, and join them to the main database.

Example:

pub-a -13636775

pub-b +1368754


if the pattern is found and you get pub-a, you add 13636775 and you get the target pk.

if the pattern is found and you get pub-b, you subtract 1368754 and you get the target pk.


then you could have multiple reference points along the range which improves
the search system.
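As a sketch of that bookkeeping (the offsets are the ones from the example above; the recovered value is hypothetical):

Code:
# Offset applied to the target when each reference public key was created.
reference_offsets = {
    "target": 0,
    "pub-a": -13_636_775,   # pub-a = target - 13,636,775
    "pub-b": +1_368_754,    # pub-b = target + 1,368,754
}

def recover_target_pk(reference_name, recovered_pk):
    # Undo the known offset of the reference point whose pattern matched.
    return recovered_pk - reference_offsets[reference_name]

# e.g. a hit through pub-a: recovered pk + 13,636,775 gives the target's pk
print(recover_target_pk("pub-a", 986_363_225))   # hypothetical numbers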


Logic tells me that to achieve a favorable average for this database, you should be able to store more than 200k times the number of keys (I'm not entirely sure) that your current database can handle (which, in terms of space, is possible with this DB, because 200k times more is not a significant increase compared to your current DB).

But on the other hand, in theory, there would come a point where the cost of the 200k keys becomes profitable, because you would only have to exceed the break-even point of effort.

Now the complex thing is, what is the limit of patterns to store? I know that with Python it is
approximately 130kb for 120 million keys in a little more than 3 thousand elements.

What algorithm can this be integrated into? In a filter, how many searches can you do
per second on average? Can it be part of bsgs?

I would have to investigate what would be the most efficient way, because it is tied to how big the database is; so is your speed.

In the end this database is related to scalability, and I think the barrier of added effort can be broken and improvements can be seen.

BTC bc1qxs47ttydl8tmdv8vtygp7dy76lvayz3r6rdahu
albert0bsd
Hero Member
*****
Offline Offline

Activity: 977
Merit: 685



View Profile
September 27, 2024, 01:38:06 PM
 #32

I think you would have to be able to process more keys in the same time frame to equal the efficiency, and that's why it becomes slower: because of all the work that the 128 additional computations involve.

In addition to the extra 128 keys per query, I also need to check 65 items against the bloom filter.

Originally it was going to be 64 items, but since I already calculate the 128-bit window, there are 65 different sets of possible items:

1 to 64 - Item 1
2 to 65 - Item 2
3 to 66 - Item 3
...
64 to 127 - Item 64
65 to 128 - Item 65

Also, 1/64 = 0.015625, which is roughly 1.6% more processing in this part, so one extra item is not very noticeable, because bloom filter checks are faster than generating the keys.



Logic tells me that to achieve a favorable average for this database, you should be able to store more than 200k times the number of keys (I'm not entirely sure) that your current database can handle (which, in terms of space, is possible with this DB, because 200k times more is not a significant increase compared to your current DB).

Yes, the database should be at least 200K times larger than the current database.
And yes, my current database can't exceed the amount of RAM; that is the big limitation.
(A lot of people ask me to use an NVMe disk directly for this database instead of RAM.
In that case, to compensate for the speed difference between RAM and disk, the database needs to be about x1000 times larger;
so if you are using 128 GB of RAM, the NVMe version should be something like 128 TB.)
But that is a separate topic.

What algorithm can this be integrated into? In a filter, how many searches can you do
per second on average? Can it be part of bsgs?


Yes, definitely it can be added to BSGS; any method that can query whether a public key is between 1 and M keys can be used for BSGS.

And yes, we can also add your database to a bloom filter or any other kind of filter, like XOR or fuse filters.

how many searches can you do per second on average?

Let me test and measure it; usually those filters have some constant query time to check whether an item is inside or not.

So we can pack your items like Pattern:distance:something-else (preferably in byte format rather than as strings).
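For example, something like this fixed-width layout (just a sketch, not the format keyhunt actually uses):

Code:
import struct

def pack_record(pattern_bits: int, pattern_len: int, distance: int) -> bytes:
    # Assumed layout: pattern as a 64-bit field, its length in bits as one byte,
    # and the distance/offset as another 64-bit integer -> 17 bytes per record.
    return struct.pack("<QBQ", pattern_bits, pattern_len, distance)

record = pack_record(0b010101010101010, 15, 150_017)
print(len(record), record.hex())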


Now the complex thing is, what is the limit of patterns to store? I know that with Python it is
approximately 130kb for 120 million keys in a little more than 3 thousand elements.

A question here... why don't we limit the search to some pattern that is easy to recognize, like 0000000000000000 or 1111111111111111?
To be more efficient, those should be stored as bits, and some of them may be easy to search for from the CPU point of view, you know (ANDs, ORs and bit shifts).
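Something along these lines, I mean (a sketch; runs of zeros can be checked the same way on the complemented word):

Code:
MASK64 = (1 << 64) - 1

def has_run_of_ones(x: int, n: int = 15) -> bool:
    # Each set bit of x marks a run of `covered` consecutive ones; ANDing x with
    # a shifted copy of itself extends the marked run length until it reaches n.
    x &= MASK64
    covered = 1
    while x and covered < n:
        step = min(covered, n - covered)
        x &= x >> step
        covered += step
    return x != 0 and covered >= n

print(has_run_of_ones(((1 << 16) - 1) << 2, 15))   # True: 16 consecutive ones
print(has_run_of_ones(0b0101010101, 2))            # False: only isolated ones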


Now the complex thing is, what is the limit of patterns to store?

I think that once you choose some specific pattern, you should store all the matches that exist in your DB, no?
Because if we don't store some matches, then we may have some missing gaps and hence missing keys (HITs).

I know that with Python it is
approximately 130kb for 120 million keys in a little more than 3 thousand elements.

Is that size of 130 kB due to the length of the strings that you are storing?

I think that a proper data structure should use less space, but that is something that can be optimized later.

I am available for hiring. Avatar and Signature available for rent.
Donations: bc1qjyhcjacmpc9pg8wes82lktyrjcwk4sdvqtm7ky
mcdouglasx (OP)
Member
**
Offline Offline

Activity: 348
Merit: 89

New ideas will be criticized and then admired.


View Profile WWW
September 27, 2024, 03:36:28 PM
 #33

A question here... why don't we limit the search to some pattern that is easy to recognize, like 0000000000000000 or 1111111111111111?
To be more efficient, those should be stored as bits, and some of them may be easy to search for from the CPU point of view, you know (ANDs, ORs and bit shifts).

I verified it as you indicate, with repeated patterns of 111... and 000... of length 15 or more, and it stays roughly the same in terms of elements and distances, so you are right: it is a more efficient approach.

I don't choose a fixed pattern "111111111111111" or a fixed length of exactly 15, to minimize the risk of false positives; nor do I choose a shorter length, because the DB would grow; and if it is longer than 15, you would have to use more computing power as the cost of a lighter DB.

Apparently the best approach is length 14 or more with repeated patterns of 111 and 000, as you suggest.
In a quick test of 10 million keys, only 2 patterns exceeded a distance of 130 thousand keys, and only 3 additional patterns were stored with respect to the original,
which could reduce those 200 thousand keys in the search to a smaller cycle.

I think that once you choose some specific pattern, you should store all the matches that exist in your DB, no?

Yes, you store all the ones you find throughout the size of the db.


BTC bc1qxs47ttydl8tmdv8vtygp7dy76lvayz3r6rdahu
albert0bsd
Hero Member
*****
Offline Offline

Activity: 977
Merit: 685



View Profile
September 29, 2024, 02:18:01 AM
 #34

What algorithm can this be integrated into? In a filter, how many searches can you do
per second on average?

I did some tests on my laptop with an 11th Gen Intel(R) Core(TM) i5-1145G7 @ 2.60GHz processor.

I queried 134,217,728 random values against the already-created bloom filter.

Those 134 million queries took 5.316306 seconds = 3.960956633090973e-8 seconds per item, so about 39 nanoseconds per item,
or a speed of about 25 million queries per second (25,246,426 queries/s).

I should also mention that those 134 million items gave me 30 collisions in the bloom filter (possible false positives); those need to be handled by:
- checking the next pattern value in our sample, if it exists,
- or checking the current item against your Ultra-Lightweight Database.

I also did a test creating keys and building the bit array (not a string).

In my test I created 4 * 2^33 keys (2^33 = 8,589,934,592 per thread; 34,359,738,368 in total) and built the bit array at the same time. It took 7,100 seconds, which is a speed of 4,839,399 keys per second, but this was using 4 threads, so the speed per core is 1,209,849 keys/s (again, the time includes creating a bit array of the parity of the X value).

So for this CPU it would take about 0.165 seconds to create the 200 thousand keys and their respective bit array (this time doesn't include searching for patterns or queries to the bloom filter or database).



Logic tells me that to achieve a favorable average for this database, you should be able to store more than 200k times the number of keys

My previous test database was a database of 17 * 2**33 keys (146,028,888,064).
So 200K times that number is 29,205,777,612,800,000 keys, which is greater than 2**54,
and that is just to get the same speed, or am I wrong?


I hope my previous tests give an idea of the expected speeds of code already optimized for your database:

and only 3 additional patterns were stored with respect to the original.

What do you mean by this? What are the criteria for those additional patterns?

Can you elaborate a little more?

About the size of 15 bits: 2^15 = 32768, which means that on average you are going to find one match every 32768 keys.

For each new item you have a probability of 1/(2**15) that it is your pattern.

So what is the probability of not finding a match in 200K items?

P(NOT finding the item) = 1 - P(Finding the item)

P(NOT finding the item) = (1 - 1/(2**15)) = (2**15 - 1)/2**15

P(NOT finding the item in 200K) = ((2**15 - 1)/2**15 ) **  200000 = 0.002234

That is a probability of about 0.2%, which means that roughly 2 out of every 1000 sets of 200K keys will not find any match.
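A quick check of that number (assuming one independent 1/2^15 chance per key, as above):

Code:
p_none = (1 - 1 / 2 ** 15) ** 200_000
print(f"P(no match in 200K keys) = {p_none:.4f}")      # ~0.0022 -> about 0.2%
print(f"expected matches per 200K keys = {200_000 / 2 ** 15:.1f}")   # ~6.1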

I am available for hiring. Avatar and Signature available for rent.
Donations: bc1qjyhcjacmpc9pg8wes82lktyrjcwk4sdvqtm7ky
mcdouglasx (OP)
Member
**
Offline Offline

Activity: 348
Merit: 89

New ideas will be criticized and then admired.


View Profile WWW
September 29, 2024, 09:56:46 PM
Last edit: September 29, 2024, 10:11:39 PM by mcdouglasx
 #35

My previous test database was a database of 17 * 2**33 keys (146,028,888,064).
So 200K times that number is 29,205,777,612,800,000 keys, which is greater than 2**54,
and that is just to get the same speed, or am I wrong?
I hope my previous tests give an idea of the expected speeds of code already optimized for your database:

It is what my intuition tells me: that we should have a massive DB to be more efficient than the other algorithms, if only brute force is used.

Because the light DB is not a problem in terms of resource consumption once created, thanks to its structure.

This script roughly simulates what would happen when your DB is 200,000 times larger,

assuming that your speed is 1 Mk/s:

Code:
import random

# Toy simulation: the "original" DB does 1,000,000 brute-force draws per cycle,
# while the "light" DB does only 5 pattern-style lookups per cycle over a much
# smaller space; both are hunting for the same target string.
target = "11234567890987654321"

start, end = 10_000_000, 20_000_000      # search space for the original DB
start_b, end_b = 10, 20                  # much smaller space for the light DB

count_original = 0
count_light = 0

for outer in range(10):
    for cycle in range(10):
        for _ in range(1_000_000):       # brute-force style draws
            a = str(random.randint(start, end)) + "890987654321"
            if a == target:
                count_original += 1
        for _ in range(5):               # light-DB style lookups
            a = str(random.randint(start_b, end_b)) + "234567890987654321"
            if a == target:
                count_light += 1
    print("db original:", count_original)
    print("db light:", count_light)


As I see it, at this point, in theory, it would be from x2 to x6 more efficient using the light DB.

What do you mean by this? What are the criteria for those additional patterns?

Can you elaborate a little more?
I mean the change from 0101... patterns to 111... patterns, with length 14 or more; this did not significantly increase the number of patterns found, so the change does not affect the final size of the DB.

The dilemma is how long it would take to build such a large database.
Apparently you don't need 200,000 times more to match the performance.

BTC bc1qxs47ttydl8tmdv8vtygp7dy76lvayz3r6rdahu
kTimesG
Member
**
Offline Offline

Activity: 229
Merit: 29


View Profile
October 01, 2024, 10:14:35 AM
 #36

I guess the lesson here is that <Space x Time> is always a constant!

A B-Tree is already scalable, fast, and doesn't require 128x more work for a single query. THAT is your competition here, not some "fixed amount of processing". Kangaroo et al. can run concurrently on an unlimited number of computing devices and use one and the same central DB, which may not even need to be large at all, only adding a few DPs every now and then. However, those special values represent trillions of EC operations EACH.

Faster = Larger
Slower = Smaller
mcdouglasx (OP)
Member
**
Offline Offline

Activity: 348
Merit: 89

New ideas will be criticized and then admired.


View Profile WWW
October 01, 2024, 01:46:42 PM
Merited by ABCbits (1)
 #37

I guess the lesson here is that <Space x Time> is always a constant!

A B-Tree is already scalable, fast, and doesn't require 128x more work for a single query. THAT is your competition here, not some "fixed amount of processing". Kangaroo et al. can run concurrently on an unlimited number of computing devices and use one and the same central DB, which may not even need to be large at all, only adding a few DPs every now and then. However, those special values represent trillions of EC operations EACH.

Faster = Larger
Slower = Smaller

The B-tree is not a competitor here; the additional processing has nothing to do with the search speed in the database, which is fast enough. That is a different issue. As for Kangaroo, its problem is that it is probabilistic, and as the search space expands, its success rate deteriorates. You know very well that adding DPs doesn't work like magic.

BTC bc1qxs47ttydl8tmdv8vtygp7dy76lvayz3r6rdahu
kTimesG
Member
**
Offline Offline

Activity: 229
Merit: 29


View Profile
October 01, 2024, 03:50:46 PM
Last edit: October 01, 2024, 04:15:54 PM by kTimesG
 #38

The B-tree is not a competitor here; the additional processing has nothing to do with the search speed in the database, which is fast enough. That is a different issue. As for Kangaroo, its problem is that it is probabilistic, and as the search space expands, its success rate deteriorates. You know very well that adding DPs doesn't work like magic.

Of course it's not magic. But what do you mean by this: "as the search space expands, its success rate deteriorates"? I'd say it's the other way around: having more pseudo-entropy increases the chances of a collision, because you have better uniformity, so the probabilities are allowed to behave closer and closer to the predicted "success rate". So this "success rate" as you call it is actually worse for smaller ranges, due to lower entropy.

If you throw a coin in the air 1,000,000 times, there's a 99.99% chance the number of heads is in a very specific tight interval. If you throw it just 100 times, that confidence interval is a lot worse. Kangaroo works the same way: larger interval = higher confidence that a solution can be found; otherwise the Universe is broken.

Perhaps your DB has only a very specific use-case: brute force, which, while guaranteed, you might never live long enough to see finish, even if you count billions of trillions of keys per second. In contrast, do you know of an example of a private key that Kangaroo cannot find? Because I never encountered one. It's like trying to pinpoint a single atom in the Universe and stating "hey, look, this one can never be found this way" and discarding the algorithm as problematic because of this singular exception. While all the other atoms are part of a single DP "tree" into which they all merge, at some point.
mcdouglasx (OP)
Member
**
Offline Offline

Activity: 348
Merit: 89

New ideas will be criticized and then admired.


View Profile WWW
October 01, 2024, 05:25:36 PM
 #39

Of course it's not magic. But what do you mean by this: "as the search space expands, its success rate deteriorates"? [...] In contrast, do you know of an example of a private key that Kangaroo cannot find? Because I never encountered one.

Is it the same to use Kangaroo with 4 cores as with 12 cores? No, right? Because it is an algorithm that depends on computing power, which is limited (technology does not advance so quickly and the difficulty of the puzzles grows exponentially). That is, your probability decreases exponentially with time and the difficulty of the puzzle. It is not as you think, that Kangaroo works better in higher ranges; if that were the case, 256 bits would be in danger.

The correct thing to say is: Kangaroo is not useful in small ranges and, similarly, my database is not useful if you store small ranges of keys, because with this database the success rate grows over time without needing additional effort. I think that in the future this, along with BSGS, will be the most used approach, due to the advantages already described. You can even store random or sequential patterns and merge them with databases shared by other users. I know it will take time, but my vision is toward the future.

BTC bc1qxs47ttydl8tmdv8vtygp7dy76lvayz3r6rdahu
kTimesG
Member
**
Offline Offline

Activity: 229
Merit: 29


View Profile
October 01, 2024, 06:25:44 PM
 #40

Is it the same to use Kangaroo with 4 cores as with 12 cores? No, right? Because it is an algorithm that depends on computing power, which is limited (technology does not advance so quickly and the difficulty of the puzzles grows exponentially). That is, your probability decreases exponentially with time and the difficulty of the puzzle.


.. rest of BS ...

Let's cut straight to the chase: did you break the square root bound of finding a key in an interval? I am talking about the ECDLP problem in an interval.

If you did, congrats, I wait for the paper.

If you didn't (which is most likely), then we are definitely not on the same page, so engaging further in a discussion is useless, just like with COBRAS, since you didn't understand what the issues are. Shrinking some exakeys into a 1 KB file is not relevant to the subject.