This problem has been solved. With a modified BTCRecover I am running through my seeds at 24k seeds per second with a derivation depth of 5. If someone makes the same mistake in the future, here are the worst-case ETAs. I'm running 8 threads on an i7.
15 words all scrambled - Max 105 days
14 words all scrambled - Max 21 days
13 words all scrambled - Max 3 days
12 words all scrambled - Max 6 hours
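For anyone who wants to redo these estimates for their own case, the figures above can be reproduced with a few lines of Python. This is only a rough sketch: it assumes every ordered pick of 12 words from the candidates is tried at a steady 24k seeds per second, and `worst_case_days` is a name made up for this example.

```python
from math import perm  # Python 3.8+

SEEDS_PER_SECOND = 24_000  # rate quoted above; adjust for your own hardware
SEED_LENGTH = 12           # words in the real seed phrase


def worst_case_days(candidate_words, rate=SEEDS_PER_SECOND):
    """Days needed to try every ordered pick of 12 words from the candidates."""
    arrangements = perm(candidate_words, SEED_LENGTH)
    return arrangements / rate / 86_400  # 86,400 seconds per day


for n in (15, 14, 13, 12):
    days = worst_case_days(n)
    print(f"{n} words: {days:.1f} days ({days * 24:.1f} hours)")
```

Each extra stray word multiplies the search space a lot, so ruling out even one candidate word is worth more than a large speedup.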
Hi,
Long story short: some years ago I wrote down the 12 seed words from my Mycelium wallet. To make it less suspicious if found, I added some words to make sentences. However, I accidentally added words that are also part of the English BIP 39 wordlist. Fast forward to today. Memory is much more fragile than one would think: my easy-to-memorize order swap of the words was wrong. So I have too many words, and they are in the wrong order.
I have been working on this problem for a couple of days now, slowly going through larger and larger scopes of possible combinations. Generating 12-word seeds is fast, at some millions per second. Turning those seeds into private keys is expensive: I get around 36 per second. Knowing how fast vanitygen works (might be a different method though?) I feel like this is a tad slow, especially if I have to run through thousands of addresses. But generating addresses is even worse! Deriving them from the xpriv key, I can only generate one address every second.
I have accepted that this might take forever, but I would love to get some pointers and help from the community, and I'll make sure to reward anyone who contributes to solving this problem.
Having 14 possible words for a 12-word seed makes 43 589 145 600 possible arrangements. Let's say 5% of those give a correct checksum. That would be 2 179 457 280 combinations.
If I somehow managed to check 1000 addresses each second, it would take at most about a month to find the correct one. I reckon it should be possible to push that number higher. I am also fairly sure about some of the words, which should bring down the number of possible addresses.
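On the checksum estimate: a 12-word phrase encodes 128 bits of entropy plus 4 checksum bits, so a random arrangement passes with probability 1/16 (about 6%), close to the 5% guessed above. The check needs only one SHA-256, so it is cheap enough to run before any key derivation. A minimal sketch follows, assuming the words have already been mapped to their positions (0..2047) in the English wordlist; `checksum_ok` is an illustrative name, not a function from any library.

```python
import hashlib


def checksum_ok(indices):
    """Return True if 12 BIP39 word indices (0..2047) carry a valid checksum."""
    if len(indices) != 12:
        raise ValueError("this sketch only handles 12-word phrases")
    # Pack the twelve 11-bit indices into one 132-bit integer.
    bits = 0
    for i in indices:
        bits = (bits << 11) | i
    entropy = (bits >> 4).to_bytes(16, "big")  # first 128 bits
    checksum = bits & 0xF                      # last 4 bits
    # The checksum is the top 4 bits of SHA-256(entropy).
    return hashlib.sha256(entropy).digest()[0] >> 4 == checksum


# "abandon" x11 + "about" is a well-known valid phrase (indices 0 and 3).
print(checksum_ok([0] * 11 + [3]))  # True
print(checksum_ok([0] * 12))        # False
```

Filtering with this first means only about one in sixteen arrangements ever reaches the slow derivation step.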
I am using btctools for Python at the moment. I have no idea if it should take this long to generate. On the other hand, sites such as https://iancoleman.io/bip39/ generate 20 public keys in a second, so I am sure there must be a faster way than the one I am using.
My method right now:
1. Generate huge lists of possible seed combinations, ex. oven rifle phrase planet dirt true cinnamon kick first echo thing excuse
2. Run through the list line by line and generate the BIP32 root key, ex. xprv9s21ZrQH143K3HKXZ8ZPebpXnQbWRsQeKnoUbu7BzMpgtym7ya8hPaF2dmFS621C2BMnvCb3qYj4cL7GiVK1VNmnA7wxFtPmBT8U1xUW8D6
3. Derive the BIP44 address from this root key, ex. 1NF7rutG9zTiZ7HbYuqmik2Sbb8HwqJcqG
4. Check the derived addresses against the blockchain.
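Steps 1 and 2 can be sketched with just the Python standard library. This is a minimal sketch assuming the standard BIP39/BIP32 definitions and an empty passphrase; it is not necessarily how btctools does it, and the function names are made up for the example. The seed is PBKDF2-HMAC-SHA512 with 2048 rounds, which is a large part of why each candidate is slow. Step 3 (child keys and addresses) needs secp256k1 math and is left out.

```python
import hashlib
import hmac
from itertools import permutations


def mnemonic_to_seed(mnemonic, passphrase=""):
    """BIP39: stretch the phrase into a 64-byte seed (2048 PBKDF2 rounds).

    BIP39 also asks for NFKD normalization, a no-op for plain ASCII words.
    """
    return hashlib.pbkdf2_hmac(
        "sha512",
        mnemonic.encode("utf-8"),
        ("mnemonic" + passphrase).encode("utf-8"),
        2048,
    )


def master_key(seed):
    """BIP32: split HMAC-SHA512(seed) into master private key and chain code."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return digest[:32], digest[32:]


def candidate_phrases(words, length=12):
    """Lazily yield every ordered pick of `length` words (no giant list on disk)."""
    for combo in permutations(words, length):
        yield " ".join(combo)


words = "oven rifle phrase planet dirt true cinnamon kick first echo thing excuse".split()
first = next(candidate_phrases(words))
key, chain_code = master_key(mnemonic_to_seed(first))
print(first)
print(key.hex())
```

Generating candidates lazily avoids writing huge lists to disk, and since every candidate is independent the loop splits cleanly across CPU threads.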