Izlude
Newbie
Offline
Activity: 19
Merit: 0
May 31, 2013, 02:06:30 PM
That would be nice. I know there are a couple of lists of bitcoin sites, but there are probably hundreds or thousands of site owners (myself included) whose sites aren't on any of them. What do you suppose would be a good tag to put in my header to be found by such a search engine (or even by Google, at the moment)?
r3wt (OP)
May 31, 2013, 02:07:30 PM
No offense, but that's just a slapped-together hodgepodge of code. I'll upload my search engine to a web host so you can see what a real search engine looks like... stand by.
r3wt (OP)
May 31, 2013, 02:55:47 PM
Here you go. This is what a search engine looks like: http://coinbit.pw/search/search.php
btceic
May 31, 2013, 03:02:02 PM
We should propose a BTC meta tag, the purpose of which is to easily mark a website as bitcoin-related, rather than having to infer that from the content of the site itself. It might look like the tag sketched below.
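A sketch of what such a tag could look like (the name "btc" follows the concrete example btceic gives later in this thread; nothing here is standardized, and the address is a placeholder):

    <meta name="btc" content="YOUR-BITCOIN-ADDRESS"/>

Any crawler that finds this in a page's head could classify the site as bitcoin-related without scanning its body text.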
phoenix_nl
Newbie
Offline
Activity: 22
Merit: 0
May 31, 2013, 03:28:29 PM
Interesting idea, but I think it would be hard to cover all bitcoin-related sites. You'd still need Google for complete coverage, I'm afraid.
r3wt (OP)
May 31, 2013, 03:35:56 PM
Quote from: phoenix_nl
Interesting idea, but I think it would be hard to cover all bitcoin-related sites. You'd still need Google for complete coverage, I'm afraid.
You don't seem to understand how this works, so I'll explain it. Google works as a spider; every search engine does. It crawls domains for links and tags, indexes those tags, and matches them against a list of links. When you search Google, you pull up the list of links that most closely match your search term. So if I set my search engine to index to a depth of six, and all bitcoin sites carried some random tag like "hashBTC1321ljwljaslsdjaflasorewralesfjlsadfjasdl99999999999", I could set the spider loose to leave the starting domain, and it would find every website on the internet with that tag in its meta tag list (a rough sketch of such a crawl loop is below). The catch is that this would take months; the spider would just be burning bandwidth crawling the internet.

The solution is what Google does: "Webmaster Tools" is a fancy, multi-featured way of letting website owners do part of the heavy lifting of the indexing process. That is the next step in my development. Right now, I've got a form that takes your email address, a description of your website, and the URL of your homepage, and makes you fill out a captcha. When you hit submit, if it verifies, your site is added to a queue to be indexed by the spider.

For my idea to be possible on a global scale, I would need a virtual private server with very liberal MySQL connection allowances. The trick is to run multiple instances of the spider at once to crawl pages in parallel and cut the indexing time. You also have temp tables which hold site information until the spider crawls it, and those have to be cleared manually. So in essence, you can compete with Google, but you have to have a damn nice hosting setup with flexible memory and MySQL database access. My ideal setup would be 100 simultaneous database connections, unlimited bandwidth, 3 databases, and 10 GB of disk space.
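Roughly, the crawl loop described above might look like this in PHP (a minimal sketch, not the actual coinbit.pw code; the depth limit of six matches the post, while the start URL and printing in place of real indexing are illustrative):

    <?php
    // Fetch a page's HTML, or null on failure.
    function fetch_page($url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);
        $html = curl_exec($ch);
        curl_close($ch);
        return $html === false ? null : $html;
    }

    // Breadth-first crawl to a fixed depth, indexing meta tags as we go.
    function crawl($startUrl, $maxDepth = 6) {
        $queue   = array(array($startUrl, 0)); // (url, depth) pairs
        $visited = array();

        while ($queue) {
            list($url, $depth) = array_shift($queue);
            if ($depth > $maxDepth || isset($visited[$url])) continue;
            $visited[$url] = true;

            $html = fetch_page($url);
            if ($html === null) continue;

            $doc = new DOMDocument();
            libxml_use_internal_errors(true); // tolerate real-world HTML
            $doc->loadHTML($html);

            // "Index" the page's meta tags (here we just print them).
            foreach ($doc->getElementsByTagName('meta') as $meta) {
                echo $url . ' meta: ' . $meta->getAttribute('name')
                     . ' = ' . $meta->getAttribute('content') . "\n";
            }

            // Follow links; the spider "leaves the domain" because
            // no host filter is applied here.
            foreach ($doc->getElementsByTagName('a') as $a) {
                $href = $a->getAttribute('href');
                if (strpos($href, 'http') === 0 && !isset($visited[$href])) {
                    $queue[] = array($href, $depth + 1);
                }
            }
        }
    }

    crawl('http://example.com/', 6);

The depth limit and the visited set are what keep a crawl like this from running forever; without the webmaster-tools-style submission queue, the frontier still grows far faster than one machine can drain it, which is the bandwidth problem the post describes.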
r3wt (OP)
May 31, 2013, 03:44:08 PM
Just a little bit outta my price range, broski... I'm nowhere near ready to deploy yet anyway; I'm still making some changes to the code.

Edit: besides, I already have my crawler, or as I call it "the spider", set up how I want it. I'm trying to make a firm plan of action for deploying it at large scale. I was thinking of setting up external access to my DB and creating nodes on as many free webhosting accounts as possible, each with some little dinky front website to run my crawler on. This way I could deploy quite a few spiders at once. The key is finding a webhost who will let me connect to the database externally and from multiple IP addresses simultaneously. Then I was thinking about building a shell script that moves seamlessly from one crawl to the next, so that I can manage my nodes efficiently (the worker loop is sketched below).
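The post describes a shell script; for consistency with the rest of the thread, here is the same worker-node idea sketched in PHP instead. Each free-host node connects to one central MySQL database, claims the next URL from a shared queue, crawls it, and repeats. The host, credentials, and the `crawl_queue` table with its columns (id, url, status, claimed_by) are all assumptions for illustration, and `crawl()` is the spider sketch from the earlier post:

    <?php
    $db = new mysqli('db.example.com', 'spider_user', 'secret', 'searchengine');
    if ($db->connect_error) {
        die('DB connection failed: ' . $db->connect_error);
    }

    // Unique token so two nodes never grab the same queue row.
    $node = uniqid('node-', true);

    while (true) {
        // Claim one pending URL by stamping it with this node's token
        // (ORDER BY ... LIMIT in UPDATE is MySQL-specific).
        $db->query("UPDATE crawl_queue
                    SET status = 'claimed', claimed_by = '$node'
                    WHERE status = 'pending'
                    ORDER BY id LIMIT 1");

        $row = $db->query("SELECT id, url FROM crawl_queue
                           WHERE claimed_by = '$node' AND status = 'claimed'
                           LIMIT 1")->fetch_assoc();

        if (!$row) {      // queue empty: wait before polling again
            sleep(30);
            continue;
        }

        crawl($row['url'], 6); // the crawl() sketch from the earlier post

        $db->query("UPDATE crawl_queue SET status = 'done'
                    WHERE id = " . (int)$row['id']);
    }

This is why the webhost has to allow external MySQL connections from multiple IPs at once: every node holds its own connection to the one central database, so the node count is capped by the simultaneous-connection allowance.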
btceic
May 31, 2013, 03:45:40 PM
Quote from: phoenix_nl
Interesting idea, but I think it would be hard to cover all bitcoin-related sites. ...
Quote from: r3wt
... if all bitcoin sites had some random tag ... the spider would find every website on the internet with that tag in its meta tag list.
Each site can use their own address: <meta name="btc" content="1NS5Bj6PDKc7P59q9XoJEGiBgeyfXh6q8j"/>
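Picking that tag up on the crawler side is straightforward. A minimal sketch, assuming the meta name "btc" convention proposed above (`get_btc_tag` is a hypothetical helper, not code from the thread):

    <?php
    // Return the declared bitcoin address, or null if the page has no btc tag.
    function get_btc_tag($html) {
        $doc = new DOMDocument();
        libxml_use_internal_errors(true); // tolerate real-world HTML
        $doc->loadHTML($html);
        foreach ($doc->getElementsByTagName('meta') as $meta) {
            if (strtolower($meta->getAttribute('name')) === 'btc') {
                return $meta->getAttribute('content');
            }
        }
        return null;
    }

    // Example: this page is flagged as bitcoin-related via its address.
    $html = '<html><head>'
          . '<meta name="btc" content="1NS5Bj6PDKc7P59q9XoJEGiBgeyfXh6q8j"/>'
          . '</head><body></body></html>';
    var_dump(get_btc_tag($html)); // string(34) "1NS5Bj6PDKc7P59q9XoJEGiBgeyfXh6q8j"

A nice side effect of using an address as the tag value is that the spider gets a machine-readable way to tip or pay the site, not just a boolean "this site is about bitcoin" signal.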
btceic
May 31, 2013, 03:52:13 PM
Quote from: r3wt
... the key is finding a webhost who will let me connect to the database externally and from multiple IP addresses simultaneously ...
They have a free plan.
r3wt (OP)
May 31, 2013, 03:56:26 PM
Quote from: btceic
They have a free plan.
Doesn't do what I need.
Ethicoin
Newbie
Offline
Activity: 28
Merit: 0
May 31, 2013, 04:49:57 PM
Great idea, but you are not the first:
http://www.bitcoinsphere.com/
http://bitcoinmagazine.com/the-bitcoin-search-engine-launches/
There was another one around a while ago (I forget the name), but it seems to have faded into obscurity. However, I really like some of your ideas, such as a crypto "better business bureau". Ethicoin, when it is released, will function as a crypto charity (among many other things). In fact, your project is eligible for funding. PM me if you would like to be added to the Ethicoin receiver files upon release (we will have a thread for this soon in the alt coin forum).