What are spy nodes? I have never heard of them. Do they pose any risks, and what is their purpose in connecting to our node?
Spy nodes are clients that usually aren't actual full nodes (although they can be). They usually don't even have a blockchain and only pretend to. These clients connect to many different full nodes to literally spy on them. It is not always malicious, though; sometimes they are just gathering statistics (like web crawlers).
For example, one goal could be to find a link between a transaction and an IP address. The spy node connects to as many nodes as it can at the same time. If it sees tx1 coming from IPx for the first time, then (some time later) sees tx2 spending outputs of tx1 and also coming from IPx, and so on, it can eventually conclude that addresses a, b, c, d belong to IPx. From there it could be possible to link IPx to the person's identity, hence deanonymizing the transactions.
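To make the idea concrete, here is a rough sketch of that "first seen" heuristic in Python. It is only an illustration under simplifying assumptions, not how any real spy client is known to work: the function names, the (timestamp, ip, txid) observation format and the tx_addresses lookup are all made up for this example.

```python
from collections import defaultdict

def first_seen_ip(observations):
    """Map each txid to the IP that announced it earliest.

    observations: iterable of (timestamp, ip, txid) tuples, a
    hypothetical log of what the spy node heard from its peers.
    """
    first = {}
    for _ts, ip, txid in sorted(observations):
        # sorted() orders by timestamp, so the first hit per txid wins
        first.setdefault(txid, ip)
    return first

def addresses_by_ip(observations, tx_addresses):
    """Group addresses under the IP that first relayed each transaction.

    tx_addresses: dict txid -> set of addresses involved in that tx
    (assumed to be known from the public transaction data).
    """
    linked = defaultdict(set)
    for txid, ip in first_seen_ip(observations).items():
        linked[ip].update(tx_addresses.get(txid, ()))
    return dict(linked)

# tx1 and tx2 are both heard from IPx before any other peer relays them
observations = [
    (10.0, "IPx", "tx1"),
    (10.4, "IPy", "tx1"),
    (55.0, "IPx", "tx2"),
    (55.2, "IPz", "tx2"),
]
tx_addresses = {"tx1": {"a", "b"}, "tx2": {"b", "c", "d"}}
linked = addresses_by_ip(observations, tx_addresses)
```

In practice the guess is much noisier than this, since the first IP the spy hears a transaction from is not necessarily the one that created it.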
The only real risk is to your privacy (although I should add that a lot of good work has been done by the core team to make such attempts as hard as possible), and of course they waste your resources.
I am able to get the peer info from the command 'getpeerinfo', but how do I identify whether a node is a spy node or a genuine one?
That's hard to say since I'm not running core, but here are the simplest behaviors that make identifying some of them trivial:
- One of them is literally called "snoopy" (the client name in its version message)
- They can't reply to getdata, getblocks, getheaders, etc. since they don't have any blockchain
- The version message some of them use during the handshake is buggy: if you send them a false block height, they start advertising it!
- Some of them keep coming and going (they don't remain connected)
- They also don't ask for the same things a normal node would, such as checking their headers against yours first to sync
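Some of the signs above can be checked roughly against the getpeerinfo output. The sketch below is only a starting point under my own assumptions: the spy_hints function, the height_tolerance threshold and the way the hints are worded are invented for this example, and it relies on the subver, synced_headers, synced_blocks, startingheight and bytesrecv_per_msg fields that Bitcoin Core reports per peer. A match is a hint, not proof: a freshly connected peer or an honest node that is still syncing can trigger some of these too.

```python
def spy_hints(peer, our_height, height_tolerance=144):
    """Return a list of reasons why a peer looks suspicious.

    peer: one entry (a dict) from the parsed getpeerinfo JSON.
    our_height: our own current block height.
    height_tolerance: hypothetical threshold, about one day of blocks.
    """
    hints = []

    # Client name advertised in the version message
    if "snoopy" in peer.get("subver", "").lower():
        hints.append("client name contains 'snoopy'")

    # -1 means we have no headers/blocks in common with this peer yet
    if peer.get("synced_headers", -1) < 0 and peer.get("synced_blocks", -1) < 0:
        hints.append("no headers or blocks received from peer")

    # Height the peer claimed in its version message
    start = peer.get("startingheight")
    if start is not None and abs(start - our_height) > height_tolerance:
        hints.append("advertised height is far from ours")

    # A normal node usually asks for headers soon after connecting
    if "getheaders" not in peer.get("bytesrecv_per_msg", {}):
        hints.append("peer never sent getheaders")

    return hints

suspicious = {
    "addr": "203.0.113.5:8333",
    "subver": "/Snoopy:0.1/",
    "synced_headers": -1,
    "synced_blocks": -1,
    "startingheight": 0,
    "bytesrecv_per_msg": {"version": 126, "verack": 24},
}
normal = {
    "addr": "198.51.100.7:8333",
    "subver": "/Satoshi:25.0.0/",
    "synced_headers": 800000,
    "synced_blocks": 800000,
    "startingheight": 799990,
    "bytesrecv_per_msg": {"version": 126, "verack": 24, "getheaders": 1053},
}
```

To run it on a live node you could load the peers with something like json.loads(subprocess.check_output(["bitcoin-cli", "getpeerinfo"])) and call spy_hints on each entry. The "keep coming and going" behavior can't be seen from a single snapshot; you would have to compare several runs over time.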