You could retrieve every block and then discard all transactions without OP_RETURN outputs. However, each block is invalid without the transactions hashed into it, and blocks cannot be verified without retaining the previous block, so you'd still need a full node to do this properly. Alternatively, you could maintain a buffer of only the last 144 blocks (approximately one day) and verify based on those (equivalent to updating a Bitcoin-Qt checkpoint every hour), which should be sufficient unless you're worried about a longer temporary fork.
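As a rough illustration of the rolling-buffer idea, here is a minimal Python sketch. It assumes you already have raw 80-byte block headers from somewhere; the `HeaderBuffer` class and its method names are hypothetical, not part of any Bitcoin library. It only checks that each new header points at the previous one, so it does not validate proof-of-work, the Merkle root, or the transactions themselves.

```python
import hashlib
from collections import deque

BUFFER_SIZE = 144  # roughly one day of blocks


def header_hash(header: bytes) -> bytes:
    """Double SHA-256 of an 80-byte block header (internal byte order)."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def prev_hash(header: bytes) -> bytes:
    """Previous-block hash field: bytes 4..36 of the header."""
    return header[4:36]


class HeaderBuffer:
    """Hypothetical helper that keeps only the most recent headers."""

    def __init__(self, size: int = BUFFER_SIZE):
        # deque with maxlen drops the oldest header automatically
        self.headers = deque(maxlen=size)

    def append(self, header: bytes) -> bool:
        """Add a header if it links to the current tip; return success."""
        if len(header) != 80:
            raise ValueError("block header must be exactly 80 bytes")
        if self.headers and prev_hash(header) != header_hash(self.headers[-1]):
            return False  # does not extend our tip (possible fork or bad data)
        self.headers.append(header)
        return True
```

A header that fails to link is simply rejected here; handling a fork deeper than the buffer is exactly the case this approach cannot cover.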
This uses quite a lot of bandwidth as you have to download every block. Is there not a better way to do it?
Not really, unless you use a third-party service that filters the transactions for you. There's no way to know whether a transaction contains an OP_RETURN output without inspecting it, so you'd need to receive every transaction (and thus every block) regardless. However, you don't necessarily need to store the transactions without OP_RETURN outputs, which would save considerable disk space at the (potential) cost of verification.
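To show what "checking it" amounts to, here is a small Python sketch of the filtering step. It assumes each transaction has already been parsed into a list of output scripts (scriptPubKeys) as bytes; that representation and the function names are my own for illustration, and parsing raw blocks is left out.

```python
OP_RETURN = 0x6A  # opcode that marks a provably unspendable data output


def is_op_return(script_pubkey: bytes) -> bool:
    """True if the output script begins with OP_RETURN."""
    return len(script_pubkey) > 0 and script_pubkey[0] == OP_RETURN


def keep_op_return_txs(transactions):
    """Keep only transactions with at least one OP_RETURN output.

    `transactions` is assumed to be a list of transactions, each given
    as a list of output scripts (bytes).
    """
    return [tx for tx in transactions if any(is_op_return(s) for s in tx)]
```

Note that every transaction still has to be downloaded and looked at once; only the storage afterwards shrinks.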