Google's automated Artificial Intelligence (AI)-based systems detect around 25 billion spammy pages in Search every day, ensuring they don't rank in the results people see. The company did not share the specific techniques it uses to fight spam because "that would weaken our protections and ultimately make Search much less useful", but it did share details about spammy behaviour that can be detected systematically.
"A low-quality page might include the right words and phrases that match what you searched for, so our language systems wouldn't be able to detect unhelpful pages from content alone. The telltale signs of spam are in the behavioural tactics used and how they try to manipulate our ranking systems against our Webmaster Guidelines," the company said in a statement on Tuesday.
Last year, the company observed that more than 25 billion of the pages it discovers each day are spammy.
"If each of those pages were a page in a book, that would be more than 20 million copies of ‘War & Peace' each day," said Google. "If you've ever gone into your spam folder in Gmail, that's akin to what Search results would be like without our spam detection capabilities," it added.
Google said it has designed its systems to place the most relevant and reliable web pages at the top of results. The Webmaster Guidelines detail the types of spammy behaviour that are discouraged and can lead to lower rankings: everything from scraping pages and keyword stuffing to participating in link schemes and implementing sneaky redirects. Google admitted that fighting spam is a never-ending battle, a constant game of cat-and-mouse against existing and new spammy behaviours.
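Keyword stuffing, for instance, is the practice of loading a page with the phrases a site wants to rank for until the text no longer reads naturally. The sketch below is a minimal, purely illustrative example of how such a pattern could be flagged with a simple phrase-density check; it is not Google's method, and the keyword_density function and sample page text are hypothetical.

```python
# Illustrative only: a naive keyword-density check, not Google's actual spam detection.
# Pages where one search phrase makes up an outsized share of the visible text
# are a classic sign of keyword stuffing.
import re

def keyword_density(text: str, phrase: str) -> float:
    """Return the fraction of words on the page accounted for by `phrase`."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    # Slide a window over the page and count matches of the target phrase.
    hits = sum(
        1 for i in range(len(words) - n + 1)
        if words[i:i + n] == phrase_words
    )
    return hits * n / len(words)

# Hypothetical stuffed page: the phrase dominates the text.
page = "cheap flights cheap flights book cheap flights now cheap flights deals " * 10
print(f"density: {keyword_density(page, 'cheap flights'):.0%}")  # prints a very high share
```

A real system would weigh many behavioural signals together rather than a single threshold, which is part of why Google says content alone is not enough to identify spam.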
"As with anything, our automated systems aren't perfect. That's why we also supplement them with human review, a team that does its own spam sleuthing to understand if content or sites are violating our guidelines," said the company.