Bitcoin Forum

Bitcoin => Development & Technical Discussion => Topic started by: brokenscript on May 08, 2014, 09:07:00 AM



Title: Searching OP_RETURN data.
Post by: brokenscript on May 08, 2014, 09:07:00 AM
After doing a bunch of reading about OP_RETURN, I was burning to see what sorts of things are being put on the blockchain. Apart from coinsecrets.org, I couldn't find any easy way to find them, though. In particular, I was wondering if there is a web service that allows one to do regex searches on the data stored by OP_RETURNs.

On that matter, is anyone else curious what's in there? If there's enough interest, and a service to search the data doesn't yet exist, I suppose I could modify the ABE blockchain explorer to make it easy to search OP_RETURNs. Being lazy, I'm hoping someone else has already done the work...


Title: Re: Searching OP_RETURN data.
Post by: apxu on May 08, 2014, 10:12:27 AM
http://webbtc.com/scripts/op_return


Title: Re: Searching OP_RETURN data.
Post by: brokenscript on May 08, 2014, 10:34:36 AM
Thank you!
... but I think I must be somewhat stupid this morning: how do I actually search the data in the op_returns? The search box does not seem to search the data in the op_returns (putting in the data of the top transaction, for example, simply returns a not found). Reading the docs, it seems like I can't do it from the described API either. What am I missing?


Title: Re: Searching OP_RETURN data.
Post by: apxu on May 08, 2014, 10:52:40 AM
Look at the number of OP_RETURN scripts. I see only 1173.
You can grab all of them manually :) No need for regexps and complicated technologies.


Title: Re: Searching OP_RETURN data.
Post by: brokenscript on May 08, 2014, 02:56:16 PM
Maybe there are only 1173 now, but that will surely grow. Anyway, grabbing them all from a website and then stripping them out seems somewhat more complicated than simply getting ABE (or similar) to dump the entire blockchain into a database, and then pulling out what one wants. The problem is, doing so means:
1. Having to wait for the dump of the blockchain into a database,
2. having to carry the database around.

The latter is not really an option if I suddenly get curious on my (long) train commute to work, and want to check something on my mobile phone.

So maybe the question, put somewhat more generally: is there a database of the whole blockchain somewhere that one can query? Or even just somewhere where one can do general regex searches against the outputs and inputs of transactions on the blockchain? If not, would there be interest (apart from my own) in having such a database available? I would have some trouble hosting it, but would be more than happy to set one up if someone else can lend me somewhere to put it...


Title: Re: Searching OP_RETURN data.
Post by: mriou on May 08, 2014, 07:01:22 PM
Wouldn't something like this fit your need?

http://dev.blockcypher.com/

You could set up a webhook that will call you for any new block transaction. From there you just need to check the first byte of each output script, and if it's OP_RETURN (0x6a), do whatever you'd like to do to analyze it.
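
A minimal sketch of that first-byte check in Python, assuming the webhook hands you each output script as a hex string (that payload shape is an assumption here, not a documented format). The helper name op_return_payload is made up for illustration, and it only decodes the common case of a single data push after OP_RETURN.

```python
OP_RETURN = 0x6A      # opcode that marks a provably unspendable data output
OP_PUSHDATA1 = 0x4C   # "next byte is the length of the data to push"


def op_return_payload(script_hex):
    """Return the bytes carried by an OP_RETURN output script.

    Returns None if the script does not start with OP_RETURN.
    Hypothetical helper: only a single push after OP_RETURN is decoded;
    anything else is returned as the raw remainder of the script.
    """
    script = bytes.fromhex(script_hex)
    if not script or script[0] != OP_RETURN:
        return None

    rest = script[1:]
    if not rest:
        return b""  # bare OP_RETURN with no data

    opcode = rest[0]
    if 1 <= opcode <= 0x4B:
        # Opcodes 0x01..0x4b push that many bytes directly.
        start, length = 1, opcode
    elif opcode == OP_PUSHDATA1 and len(rest) >= 2:
        # OP_PUSHDATA1 is followed by a one-byte length.
        start, length = 2, rest[1]
    else:
        return rest  # unusual layout: hand back the undecoded remainder

    # Slicing silently tolerates a truncated push.
    return rest[start:start + length]
```

Calling op_return_payload on every output script of an incoming transaction and skipping the None results leaves only the OP_RETURN data, which can then be matched against whatever pattern you care about.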


Title: Re: Searching OP_RETURN data.
Post by: brokenscript on May 09, 2014, 08:24:03 AM
That is certainly a nice API, but doesn't quite do what I'd like. If my goal were to keep an updated list of all OP_RETURNs with a certain property, your suggestion would be perfect, but what I really want is the ability to search OP_RETURNs without having to store an updated list (for example, I might want to ask for the first ten OP_RETURNs matching PATTERN).

In a perfect world, someone would have the blockchain as a searchable database somewhere, so that I could simply send an SQL request like
SELECT TX_ID FROM OUTPUTS WHERE SCRIPT = REGEX
and it would return the answers (or, given that many requests are likely to have a lot of matches, a block of matching records and a cursor to get the next block). In JSON-RPC, one might imagine the API working something like this:

{
"method" : "query",
"params" : ["SELECT TX_ID, SCRIPT FROM OUTPUTS WHERE SCRIPT = REGEX ORDER BY TIMESTAMP ASC"],
"id" : 1
}

and getting a response like

{
"result" : {"cursor-id" : CURSOR, "response" : [{"TXID" : ..., "SCRIPT" : ...}, {"TXID" : ..., "SCRIPT" : ...}, ... k responses ... , {"TXID" : ..., "SCRIPT" : ...}] },
"error" : null,
"id" : 1
}


The next batch of matches can then be fetched by sending something like

{
"method" : "nextmatches",
"params" : [CURSOR],
"id" : 1
}

which gets a response in the same format as the one to the original query.
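
To make the query/nextmatches idea concrete, here is a rough Python sketch against SQLite. Everything in it is an assumption for illustration: the OutputSearch class, the OUTPUTS table layout (scripts stored as hex text), the page size, and the placeholder transaction IDs. It also takes only the regex as a parameter rather than a full SQL string, which is a simplification of the request format above.

```python
import re
import sqlite3
import uuid

PAGE_SIZE = 2  # the "k" responses per page; kept tiny so the demo pages


def _regexp(pattern, value):
    # SQLite has no built-in REGEXP; "X REGEXP Y" calls regexp(Y, X).
    return int(value is not None and re.search(pattern, value) is not None)


class OutputSearch:
    """Hypothetical server side of the query/nextmatches methods."""

    def __init__(self, db):
        self.db = db
        self.db.create_function("REGEXP", 2, _regexp)
        self.cursors = {}  # cursor-id -> open sqlite3 cursor

    def _page(self, cursor_id):
        rows = self.cursors[cursor_id].fetchmany(PAGE_SIZE)
        if len(rows) < PAGE_SIZE:
            # Short page means the result set is exhausted.
            del self.cursors[cursor_id]
            cursor_id = None
        return {
            "cursor-id": cursor_id,
            "response": [{"TXID": t, "SCRIPT": s} for t, s in rows],
        }

    def query(self, pattern):
        cur = self.db.execute(
            "SELECT TX_ID, SCRIPT FROM OUTPUTS "
            "WHERE SCRIPT REGEXP ? ORDER BY TIMESTAMP ASC",
            (pattern,),
        )
        cursor_id = uuid.uuid4().hex
        self.cursors[cursor_id] = cur
        return self._page(cursor_id)

    def nextmatches(self, cursor_id):
        return self._page(cursor_id)


def demo_db():
    """In-memory table with made-up rows, standing in for a real dump."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE OUTPUTS (TX_ID TEXT, SCRIPT TEXT, TIMESTAMP INTEGER)")
    db.executemany(
        "INSERT INTO OUTPUTS VALUES (?, ?, ?)",
        [
            ("tx1", "6a0568656c6c6f", 1),                 # OP_RETURN "hello"
            ("tx2", "76a914" + "00" * 20 + "88ac", 2),    # ordinary pay-to-pubkey-hash
            ("tx3", "6a03666f6f", 3),                     # OP_RETURN "foo"
            ("tx4", "6a03626172", 4),                     # OP_RETURN "bar"
        ],
    )
    return db
```

With this, OutputSearch(demo_db()).query("^6a") returns the first page of scripts that start with the OP_RETURN byte, and passing the returned cursor-id to nextmatches returns the rest. The pattern needs the ^ anchor because the hex 6a can also appear in the middle of other scripts.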

A very concrete application would be to find the first appearance of a particular transaction type, or an OP_RETURN with data matching some pattern. Perhaps this is even more interesting for testnet: after all, people can try out all sorts of funny transactions there. It seems rather inefficient that if one is interested in querying the blockchain generally, one has to build (and maintain) a database locally.