How Search Engines Work
To understand how search engines rank sites, we need to examine three areas:
- how search engines find pages;
- what search engines look for on a page;
- and how search engines compare different pages against a user’s search criteria.
How Search Engines Find Pages
Search engines find pages in one of two ways: by following links from pages they already know about, or through direct submission. For submissions, most engines provide an ADDURL link on their Home or Help page. The ADDURL link allows users to submit either a single Web page or their whole site. A list of ADDURL pages for the major search engines is available at: Web Site Search Engine
Some engines ask you to submit just the domain (e.g., www.yourdomain.com/), while others allow individual page submissions. Always read the submission guidelines before submitting a page: getting it wrong may get your site banned.
What Search Engines Look for on a Page
Once a page has been submitted, the search engine uses a software SPIDER to look at the site. This program extracts different pieces of information from the site, such as MetaTags content, the text on each page, the text contained in comment tags, image alt tags and form tags.
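The kind of extraction described above can be sketched with Python's standard-library HTML parser. This is an illustrative toy, not any real engine's spider; the example page and its content are invented.

```python
from html.parser import HTMLParser

class PageExtractor(HTMLParser):
    """Collect the kinds of content a spider might read from a page:
    visible text, MetaTag content, image alt text and comment text."""

    def __init__(self):
        super().__init__()
        self.text = []      # visible body text fragments
        self.meta = {}      # name -> content from <meta> tags
        self.alt_text = []  # alt attributes of <img> tags
        self.comments = []  # text inside <!-- comments -->

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "name" in attrs:
            self.meta[attrs["name"].lower()] = attrs.get("content", "")
        elif tag == "img" and attrs.get("alt"):
            self.alt_text.append(attrs["alt"])

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())

    def handle_comment(self, data):
        self.comments.append(data.strip())

extractor = PageExtractor()
extractor.feed(
    '<html><head><meta name="keywords" content="cars, loans"></head>'
    '<body><!-- hidden note --><p>Cheap <b>cars</b></p>'
    '<img src="car.gif" alt="red car"></body></html>'
)
print(extractor.meta["keywords"])  # cars, loans
print(extractor.alt_text)          # ['red car']
```

A real spider would fetch pages over HTTP and handle malformed markup, but the separation of body text, MetaTags, alt text and comments is the part that matters here.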
Each search engine looks for the information it requires, and each is different. Search engines also look at links on each page and may add those links to their database for spidering at a later date. Spiders prefer plain text links to image maps and redirected links (e.g., links routed through redirection scripts). Links containing a query string (marked by a ?) will generally not be followed, as these could lead the spider into infinite loops within the site, or to hundreds of different versions of the same page.
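The link-gathering step can be sketched as follows, including the rule of skipping URLs that contain a query string. The base URL and links are invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Gather <a href> links for later spidering, skipping any URL
    with a query string ('?'), as described above."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        url = urljoin(self.base_url, href)
        if "?" in url:
            return  # dynamic URL: could loop or duplicate pages
        self.links.append(url)

collector = LinkCollector("http://www.yourdomain.com/")
collector.feed('<a href="/about.html">About</a>'
               '<a href="/search?cat=1">Search</a>')
print(collector.links)  # ['http://www.yourdomain.com/about.html']
```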
The search engine spider examines the code on the page and extracts text from the programming code. The text is then examined to assess the theme of the page. In doing this, the spider looks at the following:
- words which appear regularly throughout the page;
- words appearing in Metatags;
- link anchor text;
- and emphasised text (such as words in bold or italics).
These give the engine an indication of the overall theme of the page, so that a search for ‘cars’ will bring back lots of pages with cars appearing in them.
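A crude version of this theme assessment can be sketched by counting words and giving a bonus to words that also appear in MetaTags or emphasised text. The weights here are invented for illustration; no real engine publishes its values.

```python
from collections import Counter
import re

def page_theme(body_text, meta_keywords, emphasised, top_n=3):
    """Guess a page's theme: count body words, with assumed bonuses
    for words that also appear in MetaTags or emphasised text."""
    words = re.findall(r"[a-z]+", body_text.lower())
    counts = Counter(words)
    for w in meta_keywords:
        counts[w.lower()] += 3  # assumed bonus for MetaTag words
    for w in emphasised:
        counts[w.lower()] += 2  # assumed bonus for emphasised words
    return [w for w, _ in counts.most_common(top_n)]

theme = page_theme(
    "Used cars for sale. Our cars are cheap and all cars are tested.",
    meta_keywords=["cars", "loans"],
    emphasised=["cheap"],
)
print(theme)  # 'cars' ranks first
```

On this toy page, 'cars' dominates, so a search for 'cars' would be a strong match for it.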
Finding the Pages a User has Asked for
After matching the user’s search query against the pages in its database, the search engine has to decide which pages are most likely to be of use to the surfer. Each search engine has its own ALGORITHM, a mathematical calculation which gives more importance to words appearing, for instance, in Metatags than to words appearing in the body text. Each engine is looking for what it believes is the best match for the user. By grading each page according to its algorithm, the engine is able to decide that page A is a closer match than page B for this user.
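The idea of weighting words by where they appear can be sketched like this. The weights and pages are invented; each real engine uses its own secret values.

```python
# Assumed weights for illustration only.
WEIGHTS = {"metatags": 3.0, "anchor_text": 2.0, "body": 1.0}

def score_page(page, query):
    """Score a page for a one-word query by summing weighted
    occurrences in each location. A sketch, not a real algorithm."""
    q = query.lower()
    return sum(WEIGHTS[loc] * text.lower().split().count(q)
               for loc, text in page.items())

page_a = {"metatags": "cars loans",
          "anchor_text": "cheap cars",
          "body": "cars cars"}
page_b = {"metatags": "loans",
          "anchor_text": "cars",
          "body": "cars"}

# Page A scores higher, so the engine would rank it above page B.
print(score_page(page_a, "cars"))  # 7.0
print(score_page(page_b, "cars"))  # 3.0
```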
Engines also look at off-page criteria, such as the number of links pointing to a site, or whether those linking pages are also relevant to the search. Other factors include the age of the page and whether it is listed in edited directories, such as Yahoo and Looksmart.
How to Achieve Top Ranking in a Search
To achieve top-ranking pages, it is necessary to reverse-engineer the algorithm used by each search engine. This can be done by examining top-ranking pages in popular searches. For instance, look at the top 20 sites for a phrase such as ‘loans’ and a pattern will emerge. This pattern can indicate which factors the particular search engine rewards. Examples of such factors are:
- the number of words on the page (word count);
- how frequently the keyword appears on the page (keyword density);
- or how near the start of the page the keyword appears (keyword prominence).
The more searches and pages you examine, the easier it gets to recognize a pattern behind the results.
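The three factors listed above are simple enough to compute directly. Here is one possible way to measure them; the exact formulas (especially for prominence) are assumptions, and the sample text is invented.

```python
import re

def keyword_metrics(page_text, keyword):
    """Compute the three illustrative factors: word count,
    keyword density and keyword prominence."""
    words = re.findall(r"[a-z']+", page_text.lower())
    kw = keyword.lower()
    count = words.count(kw)
    density = count / len(words) if words else 0.0
    # Prominence: 1.0 if the keyword is the very first word,
    # falling towards 0 the later it first appears (assumed formula).
    try:
        first = words.index(kw)
        prominence = 1.0 - first / len(words)
    except ValueError:
        prominence = 0.0
    return {"word_count": len(words),
            "keyword_density": round(density, 3),
            "keyword_prominence": round(prominence, 3)}

metrics = keyword_metrics(
    "Loans for everyone: cheap loans, fast loans.", "loans")
print(metrics)
# {'word_count': 7, 'keyword_density': 0.429, 'keyword_prominence': 1.0}
```

Running such measurements over the top 20 results for a search is one way to make the pattern-spotting described above concrete.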
Unfortunately, some sites are able to hide their real code by delivering different pages to search engine spiders than to normal users, a technique known as cloaking. They achieve this by examining the IP address and User-Agent of each visitor before serving the appropriate page. A high-ranking page may also be swapped for a differently coded page: as soon as the page reaches the top of the search results, it is automatically switched. You should therefore be careful that the page you look at is in fact the same page that actually got to the top position. You will often be able to spot this because the description on the search engine may differ from the text on the page.
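The trick described above can be sketched in a few lines. The spider names and page filenames are invented; this is shown only to explain the technique, not to recommend it, since engines may ban sites caught doing it.

```python
KNOWN_SPIDER_AGENTS = ("googlebot", "slurp")  # illustrative names only

def choose_page(user_agent):
    """Sketch of cloaking: return one page to recognised spiders
    and a different page to ordinary visitors."""
    ua = (user_agent or "").lower()
    if any(bot in ua for bot in KNOWN_SPIDER_AGENTS):
        return "keyword-stuffed-page.html"  # served to spiders
    return "flashy-design-page.html"        # served to normal users

print(choose_page("Googlebot/2.1"))  # keyword-stuffed-page.html
print(choose_page("Mozilla/5.0"))    # flashy-design-page.html
```

A real cloaking setup would also check visitor IP addresses against known spider ranges, which is why the mismatch is often only visible in the search result description.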
Why Text is King for Search Engines
Pages which contain little text, because of the use of images or flash animation, are unlikely to do well in search engines. This is because they give the spider little to read and, therefore, little with which to assess what the page is actually about. Search engines cannot read text contained within an image or animation. Similarly, they struggle as words become more deeply buried within tables. Your website designers may have created a fabulous looking site, but is it really search engine-friendly?
Text is king for the search engines. Anything which gets in the way of descriptive text will affect the position achievable on the engines. A search engine-friendly site consists of plain text, with targeted phrases repeated throughout the page. However, compromise is always necessary in the design. Even so, it is worth bearing in mind that some site designs and techniques ruin any chance of achieving top ranking in search engines. This, in turn, can have a devastating effect on your sales.
It is therefore worth considering creating a text-only version of your site to run alongside the main site. This will give search engines a greater chance of picking up your site content. Text-only versions should be designed for text-only browsers such as Lynx. Try viewing a page from your site at www.delorie.com/web/ses.cgi to see how it looks to a search engine. You may be surprised.