Gabriel Weinberg, founder of the newer search engine DuckDuckGo, recently discovered that his servers are being hit with a flood of spam queries for all sorts of seemingly random searches. He notes that while he can block these botnets from hammering his servers with the same query over and over again, the traffic has left him with a question.
In his own words from his blog, he wonders:
“if other search engines include this errant traffic in their query counts. We work hard to keep them completely out because they would overwhelm our real direct queries #s and therefore distort our perception of progress.”
And while Gabriel makes a solid point and raises a fair question about the quality of the search and query numbers being reported, I think he's missing the simplest answer. He has managed to block the botnets at the firewall level to keep them from skewing his query counts. Given that the other search engines, Google, Bing and Yahoo, have been around far longer, it's reasonable to assume they dealt with these false searches long ago. From an SEO perspective, this kind of activity is a form of query spam: bots making it appear that the websites in question have received hundreds of thousands of queries and result impressions from these malicious users. It's a game that was dealt with years ago by both Google and Bing, so it's almost entirely a non-issue.
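To illustrate the kind of filtering described above, here is a minimal sketch of how a search engine might exclude repeat-query bot traffic from its "real" query counts. The threshold, function name, and log format are all assumptions for illustration, not DuckDuckGo's actual method:

```python
from collections import defaultdict

# Hypothetical repeat limit: past this many identical queries from one
# client, further repeats are treated as bot noise and not counted.
REPEAT_THRESHOLD = 5

def count_real_queries(log):
    """Count queries while discarding excessive repeats per client.

    log: iterable of (client_ip, query) tuples, in arrival order.
    """
    repeats = defaultdict(int)
    real = 0
    for ip, query in log:
        repeats[(ip, query)] += 1
        # Only count this query while the client stays under the limit.
        if repeats[(ip, query)] <= REPEAT_THRESHOLD:
            real += 1
    return real
```

A human issuing a handful of distinct searches is counted in full, while a botnet node repeating one query hundreds of times contributes at most `REPEAT_THRESHOLD` to the total.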
I think the more realistic explanation for the botnet traffic on the new search engine is a very simple problem that anyone who runs a website with an input box and no validation can relate to. It's just spam, probing for an exploit or a kink in the code of whatever website software it has picked up on. It's argued that small search engines like Blekko and DuckDuckGo offer better-quality results because they are smaller and less bloated than their big brothers. In time, however, I can see it being realized that the larger these small engines become, the harder it will be to deliver incredibly fast results (less than half a second) while maintaining a complex index of hundreds of billions of pages. Google just last year reached the 3 trillion pages indexed mark, a number which would cripple most data centers in existence.
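The unvalidated-input-box problem mentioned above can be sketched in a few lines. This is a hypothetical first-pass filter, not any engine's real code; the length limit and the handful of injection-probe patterns are illustrative assumptions, nowhere near an exhaustive defense:

```python
import re

# Hypothetical limits and patterns for a search box's first-pass filter.
MAX_QUERY_LEN = 256
# A few classic exploit-probe signatures: script tags, SQL injection,
# directory traversal, and null-byte tricks. Illustrative only.
SUSPICIOUS = re.compile(r"(<script\b|union\s+select|\.\./|%00)", re.IGNORECASE)

def is_acceptable_query(q: str) -> bool:
    """Reject empty, oversized, or obviously malicious-looking queries."""
    q = q.strip()
    if not q or len(q) > MAX_QUERY_LEN:
        return False  # empty or absurdly long input
    if SUSPICIOUS.search(q):
        return False  # matches a known probing pattern
    return True
```

Even a crude gate like this stops the bulk of the automated probing traffic before it ever reaches the search backend, which is exactly the class of spam a young engine with a public input box tends to attract.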