Bitcoin Forum
April 27, 2024, 07:07:57 PM *
Author Topic: Measuring the randomness of a seed phrase  (Read 584 times)
jaydee3839 (OP)
Newbie
*
Offline Offline

Activity: 14
Merit: 34


View Profile
July 20, 2023, 02:43:06 PM
Merited by ranochigo (5), o_e_l_e_o (4), ABCbits (2), JayJuanGee (1)
 #21

Thank you all for the insights.  I hope it's understood, my intention isn't to be contrary, I'm trying to better understand this for myself and that it's purely for sake of the enjoyment of learning something new.  Further, I'm not trying to justify or convince anyone that a person can better produce a seed than a computer.


If I give you the following list of words:  
rookie, brand, fossil, soda, arena, neutral, mango, yellow, ticket, chair, reunion, husband

What I'm hearing is, the only way you can tell me, if this is a "quality" seed or not, is if you are told what generated it.  
  • If it was a CSPRNG that created it, then it's sufficiently random/unpatterned.  
  • If I created it, then by way of natural human cognitive bias, there must be/highly likely to be some pattern that is more guessable by some computer program.  

But there's no way a computer program can ever tell whether it's indeed quality or not based on the list itself.  We just *know* that if a human created it, it's certainly insufficient, and if a CSPRNG SW program created it, it is positively sufficient.

If what I've stated above correctly represents the consensus on this topic (albeit in a simplistic way), the logic of this still eludes me.  If there is agreement among the geniuses in the world that study this stuff for a living that this is indeed the case, then I guess I will just add it to one of life's many mysteries to me.  I'm new to this board, I hope I didn't immediately embarrass myself by asking the question!
garlonicon
Hero Member
*****
Offline Offline

Activity: 800
Merit: 1932


View Profile
July 20, 2023, 02:47:30 PM
Merited by JayJuanGee (1)
 #22

Quote
what sets the barrier between looking random and being random
There is no such barrier. Again, the same answer, and the same link I gave you is still relevant: https://xkcd.com/221/

Quote
So you don't want a completely random process, you want one that generates randomly looking numbers, which raises the question of which numbers are looking random, or more importantly, which ones don't?
Quote
You cannot have objectively trustless randomness. You can only have things that are random enough for your purposes, that is all you can get.

That means, you are good to go, if your key is generated in a way, that is hard to repeat by someone else. By this definition, 888 is not a good choice, because there are bots scanning keys from the base point upwards, and sweeping anything they could find. The same for Pi: it is a bad choice, if you use it directly. However, nullius created a puzzle based on that, and after many years, it is still not taken.

So, the answer to your question is: anything that could not be guessed by others, is random enough. Of course, if you are not an expert in cryptography, then trusting some wallet like Bitcoin Core is better than making your own algorithm from scratch.
BlackHatCoiner
Legendary
*
Offline Offline

Activity: 1498
Merit: 7294


Farewell, Leo


View Profile
July 20, 2023, 02:54:07 PM
Merited by JayJuanGee (1)
 #23

By this definition, 888 is not a good choice, because there are bots scanning keys from the base point upwards, and sweeping anything they could find.
That means that software should not mark 888 as a good choice, because while random, it is not secure. So there exists a subset of numbers that are random but not secure. Is it wrong to think that if we excluded that subset, we'd have better security? Seems to me like the problem lies in how to do it, and not in whether it's wise to do it.

ranochigo
Legendary
*
Offline Offline

Activity: 2954
Merit: 4165


View Profile
July 20, 2023, 02:58:46 PM
Merited by o_e_l_e_o (4), ABCbits (2)
 #24

That means that a software should not mark 888 as a good choice, because while random, it is not secure. So there exists a subset of numbers that are not secure but are random. Is it wrong to think that if we excluded that subset, we'd have better security? Seems to me like the problem lies how to do it, and not on if it's wise to do it.
The chances of your wallet selecting a number from that subset are astronomically low. You have 2^256 to choose from; I highly doubt you would ever get any address anywhere near those that were already tried. It would be a massive waste of resources to keep those indexes, and limiting the pool of numbers wouldn't be ideal either.

What I'm hearing is, the only way you can tell me, if this is a "quality" seed or not, is if you are told what generated it.  

If it was a CSPNG that created it, then it's sufficiently random/unpatterned.

If I created it, then by way of natural human cognitive bias, there must be/highly likely to be some pattern that is more guessable by some computer program.
Yep, that is correct. Cognitively, the human brain works by associating events together through memory. If a human were to think of a certain string of phrases, chances are the phrases appeared somewhere before and they chose that specific string based on some form of recollection. You could possibly prove it by scraping all the data there is on the internet, but the result might be inconclusive because that data isn't exhaustive.
But there's no way a computer program can ever tell if it's indeed quality or not based on the list itself.  We just *know* that if a human created it, it's certainly insufficient and if a CSPNG SW program created it, it is positively sufficient.
The former is generally true, but the latter would depend on the quality of the source of entropy (is it a deterministic or a stochastic process?), the way the entropy is processed and how it gets used. Desktop wallets generally use entropy given by /dev/random (which itself draws on multiple sources of entropy and debiases them before initializing the CSPRNG), sometimes XORed with other random data, and this is more than sufficient for our uses. There are flawed implementations out there, but so long as an implementation has been rigorously tested and is correct, it will be secure.
If what I've stated above correctly represents the consensus on this topic (albeit in a simplistic way), the logic of this still eludes me.  If there is agreement among the geniuses in the world that study this stuff for a living that this is indeed the case, then I guess I will just add it to one many of life's mysteries to me.  I'm new to this board, I hope I didn't immediately embarrass myself for asking the question!
It was a pretty fruitful discussion! My initial venture into cryptography was filled with questions like these as well, glad that you're bringing these topics up to discuss!
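For anyone who wants to see what "use the entropy given by the OS" looks like in practice, here is a minimal Python sketch. It is not how any particular wallet does it; the function name new_seed_entropy is made up for illustration, and it only relies on the standard library's secrets module, which draws from the operating system's CSPRNG.

```python
import secrets


def new_seed_entropy(num_bits: int = 128) -> bytes:
    """Return num_bits of entropy read from the operating system's CSPRNG.

    Hypothetical helper for illustration; 128 bits corresponds to a
    12-word BIP39 phrase, 256 bits to a 24-word phrase.
    """
    if num_bits not in (128, 160, 192, 224, 256):
        raise ValueError("BIP39 entropy must be 128-256 bits in steps of 32")
    return secrets.token_bytes(num_bits // 8)


if __name__ == "__main__":
    # The printed value is different on every run.
    print(new_seed_entropy().hex())
```

Nothing in the returned bytes tells you they are "good"; the confidence comes entirely from trusting the process that produced them.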

jaydee3839 (OP)
Newbie
*
Offline Offline

Activity: 14
Merit: 34


View Profile
July 20, 2023, 03:04:27 PM
 #25

The whole concept of "Don't trust, verify" comes with the fact that we are able to determine the authenticity of binary files with hash functions, or ability to inspect the code before compiling the code yourself. Entropy is unfortunately, something that you cannot measure and trying to evaluate a random process with certainty would be absurd (because then it won't be considered unpredictable anymore). ** Though note that urandom actually estimates the amount of entropy that is being added to the pool, but that involves a constant stream of data.
Thank you for the entire response.  I want to pick up on this point.  No issues with your statement of not being able to positively prove randomness for the reasons you stated.  I would think, though, that you could test the inverse: can you prove that a human-generated seed isn't random (or rather, is patterned/predictable)?  I'm not suggesting that all 2048^12 possibilities are scorable for randomness on a scale.  I'm saying that if we know some are not sufficient, we should know why they are not sufficient and to what degree they are patterned/predictable.  At some point we will hit a limit, but what is that point?  
LoyceV
Legendary
*
Offline Offline

Activity: 3290
Merit: 16557


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
July 20, 2023, 03:08:43 PM
 #26

One thing I never understood is what sets the barrier between looking random and being random. For example, number 888 has the same chances theoretically to be picked between 1 and ~2^256, but it shouldn't, even if the process was completely random, because anyone playing with strange numbers can compromise the key. So you don't want a completely random process, you want one that generates randomly looking numbers
If you pick a number between 1 and 2^256, you don't have to worry about "border cases". The search space is large enough to be absolutely sure it won't be 888 (I mean: 000000000000000000000000000000000000000000000000000000000000000000000000000888).
If you're a bank handing out 4 digit PIN codes, you may want to avoid codes like 0000, but that's only because the search space is very limited.
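To put rough numbers on the difference between the two search spaces, here is a small Python sketch. The figures are assumptions for illustration: the key range is approximated as 2^256, and the "swept" range of 2^32 low keys is a made-up stand-in for whatever the bots have scanned.

```python
from fractions import Fraction
import math

KEY_SPACE = 2**256    # approximate size of the private-key range
SWEPT_KEYS = 2**32    # hypothetical: low keys already scanned by bots
PIN_SPACE = 10**4     # number of 4-digit PIN codes

# Chance that a uniformly random key lands in the swept range.
p_key = Fraction(SWEPT_KEYS, KEY_SPACE)
# Chance that a uniformly random PIN is exactly 0000.
p_pin = Fraction(1, PIN_SPACE)

print(f"P(random key is in the swept range) = 2^{math.log2(p_key):.0f}")
print(f"P(random PIN is 0000)               = {float(p_pin)}")
```

The first probability is 2^-224, the second is 1 in 10,000, which is why special-casing "weak" values can be worth it for PIN codes and not for private keys.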

vjudeu
Hero Member
*****
Offline Offline

Activity: 663
Merit: 1527



View Profile
July 20, 2023, 03:13:45 PM
Merited by ABCbits (4), o_e_l_e_o (4), JayJuanGee (1)
 #27

Quote
Is it wrong to think that if we excluded that subset, we'd have better security?
We would have worse security. If you exclude 888 explicitly, then you could still reach 1234. If you exclude all numbers below 2^32, then you could still reach 2^33, and someone doubling the base point could sweep that after 33 point doublings. If you introduce a lot of exclusions, based on a lot of patterns you noticed, then you could downgrade your "n" to a smaller number than something around 2^256, and then you will no longer have 128-bit public key security.

By leaving things as they are, and picking a number from the full range, the whole strength is in the algorithm itself, that is battle tested by many users for many years. Since 2009, it is also covered by money they put on their keys, and that cryptography is so strong, that those who lost their keys, still cannot get their old coins. As long as you can see that many coins from the earliest blocks are not moved, you can be quite sure that the algorithm used by Bitcoin Core to generate them, is good enough.

Trying to change that is similar to wanting to design a hash function that would never produce a lot of leading zeroes, because such a result "looks non-random". Of course, you can pick any theoretically ideal, fully random number generator, and then produce a lot of data. Then, after producing 2^32 hashes, you could find one with 32 leading zero bits, and complain "hey, this random number generator produced a non-random number!". But this is not the case. If you needed 2^32 different results to get that single value, which has 224 significant bits instead of 256, then your generator is still good enough.

So, it is better to leave that unlikely opportunity to generate private key 888 than to "fix" it and make the cure worse than the disease.
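The leading-zeroes argument can be tried at a smaller scale. The sketch below is only an illustration: it uses SHA-256 over a counter as a stand-in for an ideal random generator, and looks for 12 leading zero bits instead of 32 so that it finishes quickly. The function names are made up.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def first_hash_with_zero_bits(k: int, limit: int = 1 << 24):
    """Hash successive counters until a digest has at least k leading zero bits."""
    for counter in range(limit):
        digest = hashlib.sha256(counter.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= k:
            return counter, digest
    raise RuntimeError("no such digest found within the limit")


if __name__ == "__main__":
    counter, digest = first_hash_with_zero_bits(12)
    print(counter, digest.hex())
```

A "patterned-looking" digest is expected to show up after roughly 2^12 attempts here, which is exactly what a good generator should do; a generator that never produced one would be the broken one.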

BlackHatCoiner
Legendary
*
Offline Offline

Activity: 1498
Merit: 7294


Farewell, Leo


View Profile
July 20, 2023, 03:18:25 PM
 #28

If you pick a number between 1 and 2^256, you don't have to worry about "border cases".
Sure, but don't you make it, in the very least, more secure if you exclude numbers like 1, 10, 888, 2^256 / 2 etc.? There's astronomically small chance of being selected, but there's an unnecessary chance. Unless there's a reason we shouldn't exclude that subset, which probably lies on the "how to" situation.

So, it is better to leave that unlikely opportunity to generate private key 888, than to "fix" it, and making cure worse than the disease.
So, the answer to why we shouldn't exclude that insecure subset is that we're likely to make the rest of the numbers in the set less secure. It makes some sense, yes.

LoyceV
Legendary
*
Offline Offline

Activity: 3290
Merit: 16557


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
July 20, 2023, 03:20:32 PM
 #29

Sure, but don't you make it, in the very least, more secure if you exclude numbers like 1, 10, 888, 2^256 / 2 etc.?
No. It's just as secure. Being 0.00000000000000000000000000000000000000000000000000000000000000000000001% more secure doesn't matter. It's just a waste of programming for something that's never going to happen.

ranochigo
Legendary
*
Offline Offline

Activity: 2954
Merit: 4165


View Profile
July 20, 2023, 03:32:19 PM
Merited by LoyceV (4), ABCbits (2), JayJuanGee (1)
 #30

Thank you for the entire response.  I want to pick up on this point.  No issues with your statement of not being able to positively prove randomness for the reasons you stated.  I would think though you could test the inverse.  Can you prove that a human-generate seed isn't random (or rather, is patterned/predictable).  I'm not suggesting that all 2048^12 possibilities are scorable for randomness on a scale.  I'm saying that if we know some are not sufficient, we should know why they are not sufficient and to what degree they are patterned/predictable.  At some point we will hit a limit, but what is that point?  
It depends, actually: in certain cases we can, and in some we can't. For example, if I ask you to think of 12 phrases, do you think you've chosen them because they have some sort of association with each other and with something that you've seen before? Some cases that I thought of:

Base case:
User selects the 11 words that they like the best and calculates the 12th, assuming a BIP39 checksum. It would be vulnerable for obvious reasons: it shouldn't take too long to build a dictionary of the most commonly associated phrases and bruteforce them. This would be very predictable; an attacker could build an RNN and scrape all of the known data sources for word associations.

Next best case:
User randomly selects 11 words from the wordlist by scrolling down the list on their computer, stopping at random times and recording the first phrase that they see, then calculates the last word for the checksum. Possibly random, but probably not: humans are inherently bad at estimation and they cannot possibly be always random at deciding stop points. This is still predictable.

Next case:
User lists out all of the words on a giant piece of paper, throws darts at it while blindfolded, and records all of the words that the darts land on. Possibly random, but not exactly random, because the way that the user throws the darts can be biased. This can still be somewhat predictable, though arguably less than the previous two.

Of course, these cases are non-exhaustive, but human errors are often present when entropy is involved. There are possibly cases where your selection can be random and unpredictable, or the words can still be associated with each other, depending on your actions and how much they compromise your ability to be random. You can select the words yourself and it can be random, but chances are that human influence would make the result less random and less secure than it could be. If you were instead to use known sources of randomness which have been put through rigorous testing and a debiasing/whitening algorithm, then your entropy would probably be much, much better.

Remember, bruteforcing is pretty quick, and anything that is predictable/less than random would narrow down the search range significantly. Leaving something like this up to chance wouldn't be ideal.
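To give a feel for how much "narrowing down" a human preference can cause, here is a rough Python sketch. The model is an assumption, not a measurement: it pretends the user only ever picks from a smaller set of words they like (the vocabulary sizes 500 and 100 are invented), picks uniformly within that set, and leaves the 7 free bits of the last word fully random. The result is therefore an upper bound on the real entropy.

```python
import math


def effective_bits(vocabulary_size: int,
                   chosen_words: int = 11,
                   free_bits_in_last_word: int = 7) -> float:
    """Upper bound on the entropy of a 12-word phrase when the first
    11 words are drawn uniformly from a vocabulary of the given size."""
    return chosen_words * math.log2(vocabulary_size) + free_bits_in_last_word


if __name__ == "__main__":
    for vocab in (2048, 500, 100):
        print(f"{vocab:>5} candidate words -> at most "
              f"{effective_bits(vocab):.1f} bits")
```

With the full 2048-word list the bound is the usual 128 bits; every word the user would never pick pushes it lower, and real word associations push it lower still.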

garlonicon
Hero Member
*****
Offline Offline

Activity: 800
Merit: 1932


View Profile
July 20, 2023, 03:37:41 PM
Merited by LoyceV (8), o_e_l_e_o (4), ABCbits (3), JayJuanGee (1)
 #31

Quote
It's just a waste of programming for something that's never going to happen.
Yes, in the best case, it is like writing code, that is almost equivalent to "if(false)". However, there are two important issues:
1. It makes code more complex, which means that in case of some bug, it is more likely to do something stupid. For example: by negating some condition, you could accidentally get the code that generates only keys with patterns, instead of only keys without it.
2. By picking "the list of numbers that should be excluded", implementers will fall into the trap of randomness. They are trying to exclude interesting numbers, and often end up with excluding too much. It is like trying to eliminate numbers that are divisible by 2, because they are even. Then by 3, because they are also non-random. And ending up with Ulam spiral, where those prime numbers (that were left after exclusion) can still form lines, so an attacker trying to linearly find keys, can still reach them.
jaydee3839 (OP)
Newbie
*
Offline Offline

Activity: 14
Merit: 34


View Profile
July 20, 2023, 04:31:14 PM
Last edit: July 20, 2023, 04:41:43 PM by jaydee3839
Merited by JayJuanGee (1)
 #32

Thank you for the entire response.  I want to pick up on this point.  No issues with your statement of not being able to positively prove randomness for the reasons you stated.  I would think though you could test the inverse.  Can you prove that a human-generate seed isn't random (or rather, is patterned/predictable).  I'm not suggesting that all 2048^12 possibilities are scorable for randomness on a scale.  I'm saying that if we know some are not sufficient, we should know why they are not sufficient and to what degree they are patterned/predictable.  At some point we will hit a limit, but what is that point?  
Depends actually, in certain cases, we can and in some I can't do so. For example, if I ask you to think of 12 phrases, do you think you've chosen them because they have some sort of association with each other and something that you've seen before? Some cases that I thought of:

Base case:
User select the best 11 words that they like the best and calculate the 12th, assuming BIP39 checksum. It would be vulnerable for obvious reasons, it shouldn't take too long to build a dictionary of the most commonly associated phrases and bruteforce them. This would be very predictable, build a RNN and scrape all of the known data sources for words association.

Next best case:
User randomly selects 11 words from the wordlist by scrolling down the list on their computer, stops at random timings and records down the first phrase that they see, and calculate the last word for checksum. Possibly random, but probably not, humans are inherently bad at estimation and they cannot possibly be always random at deciding stop points. This is still predictable.

Next case:
User lists out all of the words on a giant piece of paper and uses a dart to throw at it, while being blindfolded and records all of the words that the darts land on. Possibly random, but not exactly random because the way that the user throws the dart can result in it being biased. This can somewhat still be predictable, though arguably less than the previous two.

Of course, non-exhaustive cases but human errors are often present when entropy is involved. There are possibly cases where your selection can be random and unpredictable, or they can be still associated with each other depending on your actions and how much it compromises your ability to be random. You can select the words yourself and it can be random but chances are, human influence would result it being less random and less secure than it can be. Rather, if you were to use known sources of randomness which were put through rigorous testing and debiasing/whitening algorithm, then your entropy would probably be much much better than the former.

Remember, the speed at which bruteforcing is done is pretty quick and having anything that is predictable/less than random would narrow down the search range significantly. Leaving something like this up to chance wouldn't be a ideal.

What about:

Taking your favorite football running back, taking his career yards gained and converting that number to millimeters.  
x
GDP of Belgium (or whatever country of your choice) in 1981 (or whatever year), converted to Japanese yen (or whatever currency) in trillions (or to whatever point the '0's start due to rounding).
x
pi from 34 to 47 decimals (or choose the range randomly), find the nearest prime number
x
number of minutes between when your maternal grandparents were married to when the second tower fell on 9/11.

(you don't know who I am, my favorite team, when I was born, who my grandparents are).

Then from that string of numbers, systematize taking numbers between 1-2048 from the string (so any number 1-2048 has equal chance).  Mix up the resulting 12 outputs and draw them out of a bowl one at a time.

This would seem "sufficient" to me.  Thoughts?  Of course, it's a lot more effort than just using a generator, and you're liable to leave a trace of all the research being done here (and maybe that's part of the point), but as a thought experiment, I don't see how a system like this or something similar could be vulnerable to bruteforce.  Particularly if you know nothing about me, I don't see vulnerability in factors 1, 2 and 4 (number 3, ok, prime numbers get scarce as you go up...).  The only thing that could be a problem is that those factors may not generate as many digits as I would like, you'd need to come up with more and more such factors.  Also, you'd have to keep these factors off computers, which would require a lot of hand-calculating, then burn the evidence, etc.

Edit:  Also, I don't know if the multiplication of large numbers leaves vulnerabilities.  If so, other mathematical "mixing" functions could be substituted instead.
o_e_l_e_o
In memoriam
Legendary
*
Offline Offline

Activity: 2268
Merit: 18507


View Profile
July 20, 2023, 04:44:04 PM
 #33

What proof do we have that Pi is random, even if not definite? Do you mean it is very questionably random?
Obviously it's not random in the sense it is a constant which can be reliably reproduced over and over. But it is random in the sense that its digits are randomly uniformly distributed (as far as we can tell).

If I give you the following list of words:  
rookie, brand, fossil, soda, arena, neutral, mango, yellow, ticket, chair, reunion, husband
On a tangent here, but I can tell you that's not a "quality" seed phrase because it has an invalid checksum.
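For reference, this is roughly how such a checksum test works. The sketch below follows the BIP39 rule that the checksum is the first ENT/32 bits of the SHA-256 of the entropy. To keep it short it takes word indices (0 to 2047, the position of each word in the wordlist) rather than the words themselves, since the 2048-word list is not reproduced here; the function name is made up.

```python
import hashlib


def bip39_checksum_ok(indices) -> bool:
    """Check the BIP39 checksum of a phrase given as wordlist indices (0-2047)."""
    if len(indices) not in (12, 15, 18, 21, 24):
        raise ValueError("a BIP39 phrase has 12, 15, 18, 21 or 24 words")
    bits = 0
    for index in indices:
        if not 0 <= index < 2048:
            raise ValueError("word index out of range")
        bits = (bits << 11) | index          # each word carries 11 bits

    total_bits = len(indices) * 11
    checksum_len = total_bits // 33          # 4 bits for 12 words
    entropy_len = total_bits - checksum_len  # 128 bits for 12 words

    entropy = (bits >> checksum_len).to_bytes(entropy_len // 8, "big")
    checksum = bits & ((1 << checksum_len) - 1)
    expected = hashlib.sha256(entropy).digest()[0] >> (8 - checksum_len)
    return checksum == expected


if __name__ == "__main__":
    # "abandon" x11 + "about" is a standard test phrase: indices 0 and 3.
    print(bip39_checksum_ok([0] * 11 + [3]))
    print(bip39_checksum_ok([0] * 12))
```

For a 12-word phrase only 1 in 16 arbitrary word combinations passes, so a list of 12 words picked by hand will usually fail, whatever its "quality" otherwise.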

We just *know* that if a human created it, it's certainly insufficient and if a CSPNG SW program created it, it is positively sufficient.
It's more that if a human created it, then we know it will have less than 256 bits of entropy. The matching game ranochigo linked to on the first page shows that if you are manually picking 0s and 1s, you aren't random. If you randomly pick words from the list, there is an inherent bias and you aren't totally random there either. Even if you toss a coin, there is a human instinct that if you tossed TTTTTTTTTTTT to think "that's not random enough" and throw it out and redo those tosses. Will the seed phrase you end up with be completely insufficient and able to be hacked? Maybe, maybe not. But it will almost certainly have less than 256 bits of entropy.

Sure, but don't you make it, in the very least, more secure if you exclude numbers like 1, 10, 888, 2^256 / 2 etc.?
If you want to follow that logic, then we should also be excluding every key which has already been used? In fact, if you want a 256 bit key, then you need to immediately exclude all numbers with leading zeroes, which is half the range from 1 to 2^255.

Of course, it's a lot more effort than just using a generator, and you're liable to leave a trace of all the research being done here (and maybe that's part of the point), but as a thought experiment, I don't see how a system like this or something similar could be vulnerable to bruteforce.
To raw bruteforce with no knowledge of what you have done? No, probably not. But given that you've just typed all these things in to Google, there are now dozens of servers around the world that know you had a specific interest in these numbers at the same time for some reason.

If you don't trust your OS's /dev/urandom, then aside from getting a new OS, I would suggest the best way to manually generate a seed phrase is from coin flips, specifically using Von Neumann's algorithm as I have discussed here to remove any potential bias.
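For readers who haven't seen Von Neumann's algorithm, here is a minimal Python sketch of the idea: take the flips in pairs, keep the first flip of each pair whose two flips differ, and throw away pairs that match. It assumes the flips are independent and the coin's bias doesn't change between flips; the function name is made up.

```python
def von_neumann_debias(flips):
    """Turn a sequence of possibly biased coin flips (0/1) into unbiased bits.

    Pairs (0, 1) yield 0, pairs (1, 0) yield 1, and pairs (0, 0) or (1, 1)
    are discarded. A trailing unpaired flip is ignored.
    """
    unbiased = []
    iterator = iter(flips)
    for first, second in zip(iterator, iterator):
        if first != second:
            unbiased.append(first)
    return unbiased


if __name__ == "__main__":
    # Pairs: (0,1) (1,0) (1,1) (0,0) (1,0)
    print(von_neumann_debias([0, 1, 1, 0, 1, 1, 0, 0, 1, 0]))  # [0, 1, 1]
```

The price is extra flips: even a perfectly fair coin needs about four flips per output bit on average, so plan on several hundred flips for a 12-word phrase.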
jaydee3839 (OP)
Newbie
*
Offline Offline

Activity: 14
Merit: 34


View Profile
July 20, 2023, 06:08:31 PM
 #34


Of course, it's a lot more effort than just using a generator, and you're liable to leave a trace of all the research being done here (and maybe that's part of the point), but as a thought experiment, I don't see how a system like this or something similar could be vulnerable to bruteforce.
To raw bruteforce with no knowledge of what you have done? No, probably not. But given that you've just typed all these things in to Google, there are now dozens of servers around the world that know you had a specific interest in these numbers at the same time for some reason.


Thank you for continuing to engage with the thought experiment.  Well yes, if someone were to use something like this, the factors would have to be kept private and the calculations for each would have to be done offline, via library archives, books, family records, etc.  The interesting question to me isn't whether the result could be raw bruteforced, but rather whether it is demonstrably worse/lower quality/less random/less entropy (semantics in this regard aren't my strong point, choose the appropriate term) than a CSPRNG SW-generated phrase.

And again, from what I can tell throughout this whole thread, it is likely worse... but ultimately cannot be demonstrated as such.
ranochigo
Legendary
*
Offline Offline

Activity: 2954
Merit: 4165


View Profile
July 21, 2023, 03:01:57 AM
Last edit: July 25, 2023, 08:52:40 AM by ranochigo
Merited by hugeblack (4), o_e_l_e_o (4), ABCbits (3)
 #35

The interesting question to me, isn't whether the result could be raw bruteforced, but rather is it demonstratively worse/lower quality/less random/less entropy (semantics in this regard aren't my strongpoint, choose the appropriate term) than a CSPRNG SW-generated phrase.

And again, from what I can tell throughout this whole thread, is that it is likely worse... but ultimately cannot be demonstrated as such.
To respond to your proposed method, it might be difficult to bruteforce that, because no one would probably try to string the exact same multiple factors together to form a seed. However, the predictability of that would be far lower than what your CSPRNG gives you and for a very simple reason. Both of them usually works in the same way, but rather, the CSPRNG that is included usually takes in multiple non-deterministic and random variables (hardware interrupts, keyboard timings, timers, etc) and in addition to that, an algorithm to debias and whiten the entropy. When talking about absolutes, our algorithm is definitely more unpredictable because they are non-deterministic processes.

However, by using a bunch of factors that are known to everyone, the probability of your seeds being predicted is far higher than one which uses inputs from random processes which no one can feasibly guess. The advantage of having CSPRNG that is random enough such that any adversary has no idea of how the states where at the point of generation is what makes cryptography secure. Your selection of factors might be arbitrary and random (but arguably not) yet the factors themselves are pre-determined. Besides, it doesn't offer significant advantage, using known variables as a source of entropy rather than using a RNG.

The main question is whether it is consistently secure enough over multiple iterations. Probably not. The CSPRNG in your OS is definitely more consistently unpredictable in comparison.

o_e_l_e_o
In memoriam
Legendary
*
Offline Offline

Activity: 2268
Merit: 18507


View Profile
July 21, 2023, 07:00:51 AM
Last edit: July 21, 2023, 07:44:46 AM by o_e_l_e_o
 #36

The interesting question to me isn't whether the result could be raw bruteforced, but rather is it demonstrably worse/lower quality/less random/less entropy (semantics in this regard aren't my strong point, choose the appropriate term) than a CSPRNG SW-generated phrase.
Yes, it is demonstrably less random. Whether it is so much less random that it can be bruteforced depends on your starting points and how many factors you use, I would assume.

As ranochigo has explained, the random number generators you would use on your computer or hardware wallet to generate a seed phrase perform a similar process of taking a bunch of different numbers and combining them. However, you are picking constants which can be known to anybody who looks them up. An electronic CSPRNG will draw entropy from things like interrupt timings and thermal noise, which are impossible for an outside observer to know. You are proposing simply multiplying these numbers together, whereas your CSPRNG will use a combination of functions, including things like XOR and one-way hash functions, to combine these data in ways that are harder to predict.
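
To make the contrast between multiplying and hashing concrete, here is a minimal sketch. It is not how any particular OS CSPRNG is implemented; the function names and the sample numbers in the usage note are made up for illustration, and the factors are assumed to be non-negative integers.

```python
import hashlib
from math import prod


def combine_by_multiplying(factors: list[int]) -> int:
    """The approach proposed in the thread: multiply the numbers together."""
    return prod(factors)


def combine_by_hashing(factors: list[int]) -> bytes:
    """Mix the same numbers through a one-way hash (SHA-256).

    Each factor is length-prefixed so that two different lists of
    factors cannot turn into the same byte string before hashing.
    """
    h = hashlib.sha256()
    for n in factors:
        raw = n.to_bytes((n.bit_length() + 7) // 8 or 1, "big")
        h.update(len(raw).to_bytes(4, "big"))
        h.update(raw)
    return h.digest()
```

With plain multiplication, structure in the inputs survives into the output: `combine_by_multiplying([12000, 7, 13])` still ends in three zeros, and the order of the factors makes no difference. The hashed version depends on the order and shows no such pattern. Note that hashing only hides structure, it does not add entropy: if an attacker can guess the inputs, they can recompute the same digest.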
LoyceV
Legendary
*
Offline Offline

Activity: 3290
Merit: 16557


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
July 21, 2023, 08:32:36 AM
 #37

How about 9428367110839506348425063820855586539232765? Looks random, right? Except that it's part of the first million decimals of pi.
The same goes for seed phrases: you can create one based on a Shakespeare book. The seed will look random, but it's created deterministically. You can only tell it's not random once you find the source.
The issue here is there are potentially trillions of text inputs to analyze, so it becomes largely impractical to test the seed phrase against all of them
Exactly :) That's why it's impossible to check if a seemingly random string was created randomly.
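
A small sketch of that point, under my own assumptions: the text is expanded with SHA-256 in counter mode (just one convenient deterministic method, nothing from this thread), and the "test" is a crude count of 1 bits. The function names are hypothetical.

```python
import hashlib


def stream_from_text(text: str, n_bytes: int) -> bytes:
    """Deterministically expand a public text into n_bytes (SHA-256, counter mode)."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        block = hashlib.sha256(text.encode() + counter.to_bytes(8, "big")).digest()
        out += block
        counter += 1
    return bytes(out[:n_bytes])


def ones_fraction(data: bytes) -> float:
    """Fraction of bits set to 1; close to 0.5 for data that looks random."""
    ones = sum(bin(b).count("1") for b in data)
    return ones / (8 * len(data))
```

Calling `ones_fraction(stream_from_text("To be, or not to be", 4096))` should land very close to 0.5, the same as output from a real random number generator would, even though anyone who knows the source line can reproduce every byte. A statistical check on the output says nothing about how guessable the input was.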

This would seem "sufficient" to me.  Thoughts?  Of course, it's a lot more effort than just using a generator, and you're liable to leave a trace of all the research being done here (and maybe that's part of the point), but as a thought experiment, I don't see how a system like this or something similar could be vulnerable to bruteforce.
I'd say it's unlikely to be brute-forced. But I still don't see the point: if you want to create your own randomness, just flip coins.
If you want it to be something you can remember and reproduce later to restore your seed, then the examples you gave are terrible. Chances are you'll forget parts and won't be able to find all the details again.
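
For the coin-flip route, here is a rough sketch of how 128 flips map to a 12-word phrase under BIP39: 128 bits of entropy plus a 4-bit checksum (the first 4 bits of the SHA-256 of the entropy) give 132 bits, which are cut into 12 groups of 11 bits. The function name and the H/T notation are my own choices, and the result is a list of indices (0-2047) into the 2048-word list, which is not included here.

```python
import hashlib


def flips_to_word_indices(flips: str) -> list[int]:
    """Turn 128 coin flips written as 'H'/'T' into 12 BIP39 word indices."""
    if len(flips) != 128 or set(flips) - {"H", "T"}:
        raise ValueError("expected exactly 128 flips written as H or T")
    # Heads = 1, tails = 0 (an arbitrary convention).
    bits = "".join("1" if f == "H" else "0" for f in flips)
    entropy = int(bits, 2).to_bytes(16, "big")
    # Checksum for 128-bit entropy: the first 4 bits of SHA-256(entropy).
    checksum = format(hashlib.sha256(entropy).digest()[0] >> 4, "04b")
    full = bits + checksum  # 132 bits = 12 words x 11 bits
    return [int(full[i:i + 11], 2) for i in range(0, 132, 11)]
```

As a sanity check, 128 tails in a row gives eleven times index 0 followed by index 3, which matches the published all-zero test vector ("abandon" eleven times, then "about"). Doing the checksum step means running SHA-256 somewhere, so this is only as private as the machine you run it on.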

The interesting question to me isn't whether the result could be raw bruteforced, but rather is it demonstrably worse/lower quality/less random/less entropy (semantics in this regard aren't my strong point, choose the appropriate term) than a CSPRNG SW-generated phrase.
Why not use both? Create your own string, and simply add it to a random string coming from a random number generator. Kinda like the way split-key vanity addresses are created. As long as at least one of the strings is random, the result is random.
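
A minimal sketch of the "use both" idea, assuming XOR as the way of "adding" the two strings and SHA-256 to squeeze your own string down to 16 bytes. Both choices and the function names are mine, not a standard procedure.

```python
import hashlib
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def combine_with_csprng(own_string: str) -> bytes:
    """Mix a self-made string with 16 bytes from the OS CSPRNG."""
    own = hashlib.sha256(own_string.encode()).digest()[:16]
    rnd = secrets.token_bytes(16)  # the part that actually has to be random
    return xor_bytes(own, rnd)
```

The result is as unpredictable as the CSPRNG part as long as the two inputs are independent of each other, however weak your own string is. The flip side is that the output can no longer be rebuilt from your own string alone, so it has to be backed up like any other seed.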

ABCbits
Legendary
*
Offline Offline

Activity: 2856
Merit: 7407


Crypto Swap Exchange


View Profile
July 21, 2023, 09:36:56 AM
Merited by JayJuanGee (1)
 #38

--snip--

What about:

Taking your favorite football running back, taking his career yards gained and converting that number to millimeters.  
x
GDP of Belgium (or whatever country of your choice) in 1981 (or whatever year), converted to Japanese yen (or whatever currency) in trillions (or to whatever point the '0's start due to rounding).
x
pi from 34 to 47 decimals (or choose the range randomly), find the nearest prime number
x
number of minutes between when your maternal grandparents were married to when the second tower fell on 9/11.

(you don't know who I am, my favorite team, when I was born, who my grandparents are).

Then from that string of numbers, systematize taking numbers between 1-2048 from the string (so any number 1-2048 has equal chance).  Mix up the resulting 12 outputs and draw them out of a bowl one at a time.

This would seem "sufficient" to me.  Thoughts?  Of course, it's a lot more effort than just using a generator, and you're liable to leave a trace of all the research being done here (and maybe that's part of the point), but as a thought experiment, I don't see how a system like this or something similar could be vulnerable to bruteforce.  Particularly if you know nothing about me, I don't see vulnerability in factors 1, 2 and 4 (number 3, ok, prime numbers get scarce as you go up...).  The only thing that could be a problem is that those factors may not generate as many digits as I would like, you'd need to come up with more and more such factors.  Also, you'd have to keep these factors off computers, which would require a lot of hand-calculating, then burn the evidence, etc.

Edit:  Also, I don't know if the multiplication of large numbers leaves vulnerabilities.  If so, other mathematical "mixing" functions could be substituted instead.

For your example, I would worry more about:
1. Human error when entering a value (e.g. you enter the GDP of Belgium for 1982 rather than 2021) or when performing the calculation.
2. Whether you can reconstruct the seed phrase in the future. If you don't have a backup of the source data, you'll have to find it again through a Google search, where the information could differ for various reasons, such as number precision or history manipulation.

The interesting question to me isn't whether the result could be raw bruteforced, but rather is it demonstrably worse/lower quality/less random/less entropy (semantics in this regard aren't my strong point, choose the appropriate term) than a CSPRNG SW-generated phrase.
Why not use both? Create your own string, and simply add it to a random string coming from a random number generator. Kinda like the way split-key vanity addresses are created. As long as at least one of the strings is random, the result is random.

Or just feed your string to /dev/urandom instead. I believe you can do that with echo "example" >> /dev/urandom, although I don't know whether it's the proper way to do it.
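
For what it's worth, the same write can be done from Python. This is only a sketch with a made-up function name. As far as I know from the Linux random(4) man page, data written to /dev/urandom is mixed into the kernel's pool but does not raise the kernel's entropy count, so it should not hurt, but it is not credited as extra randomness either.

```python
import os


def mix_into_kernel_pool(data: bytes, device: str = "/dev/urandom") -> int:
    """Write data to the random device and return the number of bytes written.

    Assumes a Linux-style system where the device node exists and is
    writable by the current user.
    """
    fd = os.open(device, os.O_WRONLY)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)
```

Calling `mix_into_kernel_pool(b"example\n")` does roughly what the echo command above does. Whatever you read from the device afterwards still comes from the kernel's CSPRNG, so you cannot reproduce it later from your string.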

LoyceV
Legendary
*
Offline Offline

Activity: 3290
Merit: 16557


Thick-Skinned Gang Leader and Golden Feather 2021


View Profile WWW
July 21, 2023, 02:02:51 PM
Merited by JayJuanGee (1)
 #39

Or just feed your string to /dev/urandom instead. I believe you can do that with echo "example" >> /dev/urandom, although I don't know whether it's the proper way to do it.
I didn't know you could do this. Stack Exchange suggests feeding it with basically a microphone or webcam. To quote:
Quote
This is still big overkill.
I've never worried about this.

Synchronice
Hero Member
*****
Offline Offline

Activity: 840
Merit: 766


Watch Bitcoin Documentary - https://t.ly/v0Nim


View Profile
July 22, 2023, 02:38:52 PM
 #40

I was wondering if there are any measurement techniques (software tools) that can quantify the randomness of a seed phrase.  I've read numerous times that humans picking their own seed phrase is not advisable, because it would not have the level of randomness a (quality) computer-generated seed phrase would produce.  Therefore, there must be some test or method of measuring this.  I'm picturing something like a 0-100 scale, where the first word repeated 12 consecutive times would be 0 or extraordinarily close to 0, and the best entropy sources designed for seed phrase generation would be something close to 100, but there may be other ways to measure.

Is there anything like this?  I would think there would be, but I haven't come across it, nor have I heard anyone advertise to "test the randomness of your phrase", though I get the skepticism that entering the phrase into such a system introduces a risk (you'd only want to do it on a trusted, air-gapped device).

If nothing else, I'm curious as to "how bad" a human is at generating seed phrases randomly, versus a computer.
If we are able to measure the randomness of a seed phrase, then we will be able to configure bruteforce software so that it only tries seed phrases with a certain level of randomness, right? If so, then it remains questionable whether 99% randomness is better than 90% randomness.

And you definitely can't be more random than a computer, because a computer can calculate millions or billions of possibilities in a second and choose one option out of them, while your brain runs only one process at that moment and, at the same time, thinks about how random the result would be. A computer doesn't overthink randomness, but you do. You follow your own logic, in a certain way.
