bitnodes.io can only detect nodes that are accepting incoming connections, not all Bitcoin nodes. To get a better view of the whole network and see all nodes (or at least many more than just those accepting incoming connections), you'd have to run powerful servers with multiple nodes that accept incoming connections and receive connections from a large part of the network.
The way to detect the first type of node is through their public announcements (the addr messages they send out). However, the nodes you connect to and send a getaddr message will not send you every announcement they know of, which means your "crawler" has to connect to a lot of nodes, and for a long time, to build as large a database as possible.
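To make the getaddr/addr exchange more concrete, here is a minimal sketch in Python of the two pieces a crawler needs at the wire level: framing a getaddr request and decoding an addr reply. It assumes the legacy addr format (not the newer addrv2) and mainnet magic bytes. The function names are my own for illustration, and the handshake and socket handling that a real crawler needs are left out.

```python
import hashlib
import ipaddress
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # Bitcoin mainnet network magic


def checksum(payload: bytes) -> bytes:
    """First 4 bytes of double SHA-256, as used in the P2P message header."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def build_message(command: str, payload: bytes = b"") -> bytes:
    """Wrap a payload in the 24-byte header: magic, command, length, checksum."""
    header = struct.pack(
        "<4s12sI4s",
        MAINNET_MAGIC,
        command.encode("ascii"),  # struct pads the command to 12 bytes with NULs
        len(payload),
        checksum(payload),
    )
    return header + payload


def read_varint(data: bytes, offset: int):
    """Decode a Bitcoin variable-length integer; return (value, next_offset)."""
    first = data[offset]
    if first < 0xFD:
        return first, offset + 1
    size, fmt = {0xFD: (2, "<H"), 0xFE: (4, "<I"), 0xFF: (8, "<Q")}[first]
    (value,) = struct.unpack_from(fmt, data, offset + 1)
    return value, offset + 1 + size


def parse_addr(payload: bytes):
    """Decode a legacy addr payload into (host, port, services, timestamp) tuples."""
    count, offset = read_varint(payload, 0)
    peers = []
    for _ in range(count):
        # Each entry: 4-byte time, 8-byte services, 16-byte IP, 2-byte port.
        timestamp, services = struct.unpack_from("<IQ", payload, offset)
        ip = ipaddress.IPv6Address(payload[offset + 12:offset + 28])
        (port,) = struct.unpack_from(">H", payload, offset + 28)  # port is big-endian
        host = str(ip.ipv4_mapped or ip)  # IPv4 peers arrive as IPv4-mapped IPv6
        peers.append((host, port, services, timestamp))
        offset += 30
    return peers
```

A crawler would send build_message("getaddr") to each peer after the version/verack handshake, feed every addr payload it receives into parse_addr, and merge the results into its database before moving on to the newly learned peers.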
Their location is a matter of categorizing IP addresses and mapping them to geography.
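As a rough illustration of that mapping step, the sketch below tallies nodes per country from a prefix table. The table here is hypothetical: it uses reserved documentation ranges and made-up country labels, where a real setup would load a GeoIP database instead.

```python
import ipaddress
from collections import Counter

# Hypothetical prefix-to-country table (documentation ranges, made-up labels).
# A real crawler would load this from a GeoIP database.
PREFIX_TO_COUNTRY = {
    ipaddress.ip_network("192.0.2.0/24"): "AA",
    ipaddress.ip_network("198.51.100.0/24"): "BB",
}


def country_of(ip: str) -> str:
    """Return the country label of the first matching prefix, else 'unknown'."""
    addr = ipaddress.ip_address(ip)
    for network, country in PREFIX_TO_COUNTRY.items():
        if addr in network:
            return country
    return "unknown"


def nodes_per_country(ips):
    """Count how many of the given node IPs fall in each country."""
    return Counter(country_of(ip) for ip in ips)
```

Calling nodes_per_country on the list of addresses collected by the crawler gives the per-country counts that a node map displays.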
The work being done on bitnodes.io is great, since other companies/services use it as a data source for their websites. Take, for instance, bitrawr[1], which uses the data from bitnodes to showcase how Bitcoin is spread around the world. Right at the end of the page there is also an important note that is worth taking into account when looking at the ~17,700 active nodes:
Bitnodes uses Bitcoin protocol version 70001 (i.e. >= /Satoshi:0.8.x/), so nodes running an older protocol version will be skipped.
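To show what such a cutoff looks like in practice, here is a small sketch in Python that reads the protocol version and user agent from a peer's version message payload and applies the 70001 threshold from the note above. This is not Bitnodes' actual code. The function names are my own, and the parser assumes the standard version payload layout with a user agent shorter than 253 bytes.

```python
import struct

MIN_PROTOCOL_VERSION = 70001  # threshold quoted in the note above

# Fields before the user agent:
# version(4) + services(8) + timestamp(8) + addr_recv(26) + addr_from(26) + nonce(8)
USER_AGENT_OFFSET = 80


def parse_version(payload: bytes):
    """Return (protocol_version, user_agent) from a version message payload."""
    (version,) = struct.unpack_from("<i", payload, 0)
    user_agent = ""
    if len(payload) > USER_AGENT_OFFSET:
        # Assumption: the length prefix is a one-byte varint (user agent < 253 bytes).
        ua_len = payload[USER_AGENT_OFFSET]
        start = USER_AGENT_OFFSET + 1
        user_agent = payload[start:start + ua_len].decode("ascii", errors="replace")
    return version, user_agent


def is_skipped(payload: bytes) -> bool:
    """True if the peer advertises a protocol version below the threshold."""
    version, _ = parse_version(payload)
    return version < MIN_PROTOCOL_VERSION
```

A node that answers with a version below 70001 would simply never make it into the count, which is why the published total undercounts the reachable network.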
As a closing remark, I once saved these[2][3] links shared by Luke Dashjr[4], which supposedly come from scraping/listening to the nodes and recording each entry. I don't know if they are still being updated, but considering who wrote the scraper, the data can probably be seen as reliable. I do like this[3] section of his website, where we can see the number of nodes running each Bitcoin protocol. I wonder if there is some way to verify these numbers?
[1] https://www.bitrawr.com/terminal/bitcoin-node-map
[2] https://luke.dashjr.org/programs/bitcoin/files/charts/software.html
[3] https://luke.dashjr.org/programs/bitcoin/files/charts/services.html
[4] https://nitter.net/LukeDashjr