This is gonna be approximately equal parts request for help and a document to get my thoughts organized, so strap in and take a ride on the struggle bus with me!
A friend of a friend contacted me a few weeks ago with a BTC recovery job he was hoping I could help with. On the phone, he described some issues accessing a Ledger hardware wallet with nearly a million dollars of bitcoin on it, and offered me 10% if I could recover it for him. I asked if he by any chance still had his recovery key, and he said yeah, he had just found it after it was lost for several years, but that it wasn't working. Thinking I was about to make the quickest 100k of my life, I of course agreed and we made plans to meet up.
I'm gonna TLDR this so I can get down to brass tacks more quickly:
The device in question is a Ledger Nano S, MCU 1.0. This is only important academically at this point because it has wiped itself after the wrong PIN was entered too many times.
The Ledger was given to him (supposedly around 2013 or 2014) by an old German guy who passed away in 2018, in partial payment for some construction work he did. It came with a note that says 9.1 BTC - $1397 at the top. This is weird to me, because IIRC the Ledger Nano S wasn't released until 2016, and I don't think there was any point in time post 2015 when 9.1 BTC was worth less than $2k.
The recovery key is handwritten, in cursive, by someone who learned to write cursive in probably the 1950s. It is EXTREMELY hard to read. There are crossouts, weird cursive flourishes, possible misspellings, etc.
While many of the 24 words seem to have exact matches in the BIP39 dictionary, several of them do not. There are at least 10 that I am iffy on, and 4 that I am EXTREMELY iffy on.
He has no idea what the receiving address the BTC is on might be.
Still, I wouldn't think this would present TOO much of a problem. As long as I can narrow down the search space we should be able to brute force a word or two easily, and MANY of them if we can narrow each uncertain position down to far fewer than the full 2048 dictionary words.
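To put rough numbers on that, here is a minimal sketch of the search-space arithmetic. It is not part of the recovery tool; the function names are made up for illustration, and the rate you pass in is whatever your own hardware manages. The divide-by-256 estimate assumes a 24-word phrase, where the last word carries 8 checksum bits, so only roughly 1 in 256 random combinations is a valid mnemonic.

```rust
/// Number of phrases to test, given how many candidate words remain in each
/// position (1 for a word you are sure of, 2048 for a fully unknown word).
/// Returns None if the product does not fit in a u128.
fn search_space(candidates_per_position: &[u32]) -> Option<u128> {
    candidates_per_position
        .iter()
        .try_fold(1u128, |acc, &n| acc.checked_mul(u128::from(n)))
}

/// Rough count of combinations that would pass the BIP39 checksum.
/// ASSUMPTION: 24-word phrase (8 checksum bits), candidates spread evenly.
fn expected_checksum_valid(space: u128) -> u128 {
    space / 256
}

/// Whole seconds (rounded up) needed to try `space` phrases at a given rate.
/// Returns None for a rate of zero.
fn seconds_to_exhaust(space: u128, phrases_per_second: u128) -> Option<u128> {
    if phrases_per_second == 0 {
        return None;
    }
    let remainder = space % phrases_per_second;
    Some(space / phrases_per_second + u128::from(remainder != 0))
}
```

For example, two fully unknown words give 2048 * 2048 = 4,194,304 phrases, and every additional unknown word multiplies that by up to 2048, which is why trimming each position's candidate list matters so much.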
I've been working away at this for two weeks now, have written a suite of extremely performant wallet recovery software (which I will get to in a bit and yes I do intend to open source once I'm done with this attempt), and have tried several trillion seed combinations, and I am starting to wonder if there's either some quirk of the wallet derivation process circa 2014 that I have missed, or if the coins aren't there.
Initially I tried the seed recovery component of btcrecover. It has some amazing features (such as the addressdb support for when you don't know the address you are hoping to derive), but I ended up frustrated with the speed and most of all with the seed selection / wordlist expansion portion.
So, I rewrote it in Rust as a modular program that can be plugged together with pipes. It takes a tokenfile that is organized with one word of your seed phrase per line. If you want to check multiple words in a particular position you can put them all on that line separated by spaces, and it will test every combination of those words across positions. It also supports several simple rule-based blocks, like:
[all] : All BIP39 dictionary words
[len:4] : All 4-character words
[!len:4] : All words NOT 4 characters
[len:4-6] : All 4-6 character words (shortest to longest)
[len:6-4] : All 4-6 character words (longest to shortest)
[len:4,6] : All 4 and 6 character words
[first:b] : All words starting with 'b'
[!first:b] : All words NOT starting with 'b'
[last:y] : All words ending with 'y'
[!last:y] : All words NOT ending with 'y'
[last:at] : All words ending with 'at'
[!last:at] : All words NOT ending with 'at'
[has:qt] : All words containing 'qt'
[!has:t] : All words not containing 't'
[len:7 first:b !last:y] : Complex combinations
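For anyone curious how blocks like these can be expanded, here is a minimal sketch of a matcher. This is an illustration only, not the code from the tool: the function names are hypothetical, it filters in wordlist order (so the shortest-to-longest / longest-to-shortest ordering is not implemented), and it assumes lowercase ASCII words as in the English BIP39 list.

```rust
/// Check a length spec such as "4", "4-6", "6-4" or "4,6" against a length.
fn len_matches(spec: &str, len: usize) -> Option<bool> {
    if let Some((a, b)) = spec.split_once('-') {
        let a: usize = a.parse().ok()?;
        let b: usize = b.parse().ok()?;
        return Some(len >= a.min(b) && len <= a.max(b));
    }
    // Single value or comma-separated list; any unparsable part is an error.
    let lens: Option<Vec<usize>> = spec.split(',').map(|p| p.parse().ok()).collect();
    Some(lens?.contains(&len))
}

/// Evaluate one term, e.g. "len:4-6" or "!first:b". None means a bad term.
fn term_matches(term: &str, word: &str) -> Option<bool> {
    let (negate, body) = match term.strip_prefix('!') {
        Some(rest) => (true, rest),
        None => (false, term),
    };
    if body == "all" {
        return Some(!negate);
    }
    let (key, value) = body.split_once(':')?;
    let hit = match key {
        "len" => len_matches(value, word.len())?, // byte length; fine for ASCII
        "first" => word.starts_with(value),
        "last" => word.ends_with(value),
        "has" => word.contains(value),
        _ => return None,
    };
    Some(hit != negate)
}

/// Expand a block like "[len:7 first:b !last:y]" against a wordlist.
/// All terms inside the brackets must match (logical AND).
fn expand_block<'a>(block: &str, wordlist: &[&'a str]) -> Option<Vec<&'a str>> {
    let inner = block.strip_prefix('[')?.strip_suffix(']')?;
    let terms: Vec<&str> = inner.split_whitespace().collect();
    let mut out = Vec::new();
    for &word in wordlist {
        let mut keep = true;
        for term in &terms {
            keep &= term_matches(term, word)?;
        }
        if keep {
            out.push(word);
        }
    }
    Some(out)
}
```

Called with a small slice of dictionary words such as `["able", "baby", "bacon", "balcony", "between", "boat", "cat", "zoo"]`, `expand_block("[len:7 first:b !last:y]", &words)` keeps only `between`, since `balcony` ends in 'y'.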
You can pipe this into skipper, which skips any sequence of words piped into its input that could have been generated by any of the tokenfiles in the skip folder. Then you pipe that into the recovery program, which supports the same addressdb files as btcrecover.
It's actually EXTREMELY fast. I get around 700k phrases per second on my 10-core i9 desktop. It checks only the first address of each of the three standard BTC derivation paths:
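The skip check itself boils down to a per-position membership test. Here is a minimal sketch, assuming each already-searched tokenfile has been expanded into one set of candidate words per position. The names `covered_by` and `should_skip` are hypothetical and not taken from the tool.

```rust
use std::collections::HashSet;

/// True if `phrase` could have been generated by a tokenfile whose expanded
/// per-position candidate sets are `positions`: same length, and every word
/// appears among the candidates for its position.
fn covered_by(phrase: &[&str], positions: &[HashSet<&str>]) -> bool {
    phrase.len() == positions.len()
        && phrase
            .iter()
            .zip(positions)
            .all(|(word, candidates)| candidates.contains(word))
}

/// True if any previously searched tokenfile already covers this phrase.
fn should_skip(phrase: &[&str], searched: &[Vec<HashSet<&str>>]) -> bool {
    searched.iter().any(|tokenfile| covered_by(phrase, tokenfile))
}
```

Because each position is checked independently, the cost per phrase is one hash lookup per word per skip file, with no need to enumerate what the old tokenfile generated.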
legacy: DerivationPath::from_str("m/44'/0'/0'/0")?,
segwit_compat: DerivationPath::from_str("m/49'/0'/0'/0")?,
native_segwit: DerivationPath::from_str("m/84'/0'/0'/0")?,
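Checking more than the first address on each path is a matter of appending a child index to each of these bases. The sketch below only builds the path strings, using plain `String`s rather than `DerivationPath` so it stands alone. `address_paths` is a hypothetical helper, and each extra index means additional key derivation and hashing per phrase, so expect throughput to drop as `count` grows.

```rust
/// The three base paths from the post (external chain, account 0).
const BASES: [(&str, &str); 3] = [
    ("legacy", "m/44'/0'/0'/0"),
    ("segwit_compat", "m/49'/0'/0'/0"),
    ("native_segwit", "m/84'/0'/0'/0"),
];

/// Build the full paths for the first `count` receive addresses on each base,
/// e.g. "m/44'/0'/0'/0/0", "m/44'/0'/0'/0/1", ...
fn address_paths(count: u32) -> Vec<(&'static str, String)> {
    let mut paths = Vec::new();
    for &(name, base) in BASES.iter() {
        for index in 0..count {
            paths.push((name, format!("{}/{}", base, index)));
        }
    }
    paths
}
```

With `count = 1` this reproduces the current behaviour (index 0 only); `count = 20` would cover the first 20 receive addresses on each of the three paths, 60 in total.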
After running this for a few days I decided that I would need more CPU power and a way to coordinate the work.
Soooo, I wrote an orchestration server very similar to the old BTC mining pools, plus a small glue program called worker that connects to your server, grabs a configurable block of work, and then reports progress and/or success.
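One way to carve a tokenfile into blocks of work is to number every candidate phrase and hand out index ranges. The sketch below shows that idea under my own assumptions. It is not how the actual server does it, and `block_range` and `phrase_at` are hypothetical names. It treats the phrase index as a mixed-radix number, with the last word position varying fastest.

```rust
/// Half-open index range [start, end) covered by block number `block`,
/// or None if the block lies past the end of the search space.
fn block_range(block: u128, block_size: u128, total: u128) -> Option<(u128, u128)> {
    let start = block.checked_mul(block_size)?;
    if block_size == 0 || start >= total {
        return None;
    }
    Some((start, total.min(start.saturating_add(block_size))))
}

/// Decode a linear index into a concrete phrase, given the candidate words
/// for each position. Returns None if the index is out of range or any
/// position has no candidates.
fn phrase_at<'a>(mut index: u128, positions: &[Vec<&'a str>]) -> Option<Vec<&'a str>> {
    let mut phrase = vec![""; positions.len()];
    // Walk from the last position to the first, peeling off one "digit" each.
    for (slot, candidates) in phrase.iter_mut().zip(positions).rev() {
        let n = candidates.len() as u128;
        if n == 0 {
            return None;
        }
        *slot = candidates[(index % n) as usize];
        index /= n;
    }
    if index == 0 {
        Some(phrase)
    } else {
        None
    }
}
```

With this layout a worker only needs the tokenfile plus a (start, end) pair, and the server only has to track which block numbers have been completed.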
Currently I have around 50 spot instances on AWS doing a combined 25M phrases per second, but I'm starting to lose hope.
Anyway, this got pretty rambly. Here's what I am wondering:
- The actual physical wallet has the BTC, LTC, Ethereum and Ripple apps installed on it. Is this the set that comes preinstalled, or is it possible that the dude installed some of them (which might indicate that the coins are actually on a different chain)?
- Did Ledger ever use a different derivation path for wallet derivation? This is from the pre-Ledger Live era.
- Does anyone remember if the old Chrome extension or whatever presented a user with an actual list of derived addresses or just the first one? (Maybe I should be checking the first 10 or 20 addresses.)
- The writing at the top (9.1BTC - $1397) has me wondering if he possibly created the wallet several years before the Ledger was created and simply restored the seed onto it. If this is the case, might whatever he used have used a different derivation path? Does anyone have any info on what other wallet programs that did BIP39 phrases were out at that time?
Thanks for coming to my TED talk. I've talked the dude up to coughing up 30% of the wallet at this point, cause it's proving to be a pain in the ass. And yes, I will open source the software as soon as I'm done (or as soon as I am satisfied that I haven't accidentally checked any of my tokenfiles into git at any point in the development process. I would hate to find the wallet only to discover that someone found it before me and swept it clean!)