It seems your tool can pick out what's inside the href= of an HTML anchor tag. Can it detect the topic title too? That "subscribe" hit was part of a topic's content.
I just search all files for a string, then copy the 5 lines above it and take the username from the HTML. It won't include the title this way. I can change that, but I don't think it adds much.
The big advantage of doing it this way: it takes 0.5 s to search through 0.5 GB of data.
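For anyone who wants to try the same idea, here is a minimal Python sketch of a "find the string, keep the 5 lines above it, pull the username out of the HTML" search. It is only an illustration, not the tool used above: the function names, the `*.html` file pattern, and the assumption that the username is the link text of an anchor whose href contains `action=profile` are all hypothetical and would need adjusting to the real files. It also makes no claim to match the speed quoted above.

```python
import re
from collections import deque
from pathlib import Path

# ASSUMPTION: the username is the link text of a profile anchor whose
# href contains "action=profile". Adjust this to the actual HTML.
USERNAME_RE = re.compile(
    r'<a[^>]*href="[^"]*action=profile[^"]*"[^>]*>([^<]+)</a>'
)


def find_hits(lines, needle, context=5):
    """Yield (lines_above, matching_line) for every line containing needle."""
    before = deque(maxlen=context)  # rolling window of the previous lines
    for line in lines:
        if needle in line:
            yield list(before), line
        before.append(line)


def extract_username(context_lines):
    """Return the username from the nearest profile link above the hit, or None."""
    for line in reversed(context_lines):
        match = USERNAME_RE.search(line)
        if match:
            return match.group(1).strip()
    return None


def search_files(root, needle, pattern="*.html"):
    """Search every file under root (hypothetical layout) for needle."""
    for path in Path(root).rglob(pattern):
        with open(path, encoding="utf-8", errors="replace") as fh:
            for context_lines, line in find_hits(fh, needle):
                yield path, extract_username(context_lines), line.strip()
```

Because only the 5 lines above each hit are kept, anything further up in the post (such as the topic title) is never seen, and a needle that occurs in every post's boilerplate matches every post.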
Your question made me realize a limitation: if I search for "Started by", every post returns a positive hit.
And this gives one interesting fact: 17,436 Newbies have posted on the forum in just over 4 days.