To understand SEO, you have to understand the search engine, and indexing is one of the major functions of our good friend Google. What exactly is search engine indexing? How exactly does a search engine do it?
You over here, you over there…
Since this is the fourth lesson, I assume you already know that a search engine spider is a piece of software developed for a search engine. Its purpose is to go through every page of your site, categorize it and place it into a database.
Google’s search engine spider, better known as Googlebot, gives the Google indexer the full text of the pages that it crawls, and these pages are then stored in Google’s index database. The index is then sorted alphabetically by search term.
Each index entry stores a list of documents in which the term is found. It also pinpoints the location within the text where the term occurs.
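To make that a little more concrete, here is a minimal sketch in Python of an index that works the way I just described: each term points to the documents that contain it, plus the word positions where it shows up. This is only an illustration and not Google's actual code. The function name build_index and the two sample pages are made up for the example.

```python
from collections import defaultdict


def build_index(documents):
    """Map each term to {doc_id: [word positions]} (hypothetical helper)."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        # Split on whitespace and remember where each word sits in the text.
        for position, term in enumerate(text.split()):
            index[term].setdefault(doc_id, []).append(position)
    # Sort the entries alphabetically by term, as described above.
    return dict(sorted(index.items()))


# Made-up sample pages, already lowercased to keep the sketch simple.
pages = {
    "page1": "seo starts with indexing",
    "page2": "indexing comes before ranking and ranking follows indexing",
}

index = build_index(pages)
print(index["indexing"])  # {'page1': [3], 'page2': [0, 7]}
```

Looking up a term is now a single dictionary access, which is the whole point of building the index ahead of time instead of scanning every page at search time.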
To keep its performance at an optimum speed, Google ignores stop words. Stop words are words like is, are, on, how, or, of, why, this, that, etc.
Stop words are so common within web pages that they do virtually nothing to narrow down a search or help specify a query. As such, they are discarded and ignored by Google. The indexer also ignores multiple spaces and some punctuation, and it converts all letters to lowercase to help improve its performance.
Again, not all indexers are equal, but this is how indexers generally function. Search engine indexing is all about saving a webpage’s data in its database for easy retrieval and ranking, which will be the topics of our next two lessons.
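Here is a rough sketch in Python of that cleanup step: lowercase everything, drop punctuation, collapse extra spaces, and throw away the stop words. Treat it as an illustration under my own assumptions and not as how Google really does it. The STOP_WORDS set only holds the sample words listed above, the normalize function is a made-up name, and for simplicity it strips all punctuation even though a real indexer only ignores some of it.

```python
import re

# Assumption: only the sample stop words from this lesson, not a real list.
STOP_WORDS = {"is", "are", "on", "how", "or", "of", "why", "this", "that"}


def normalize(text):
    """Return the terms an indexer might keep (hypothetical helper)."""
    text = text.lower()                   # convert all letters to lowercase
    text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation with a space
    tokens = text.split()                 # split() also collapses repeated spaces
    return [t for t in tokens if t not in STOP_WORDS]


terms = normalize("Why is   SEO important?  This is HOW indexing works!")
print(terms)  # ['seo', 'important', 'indexing', 'works']
```

Nine words go in and only four come out, so the index has far fewer entries to store and search.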
Until then, for those who are not yet part of our SEO Hacker Facebook group community, please do join us.
Sean Patrick Si
SEO Hacker Founder and SEO Specialist