Search Engine Optimisation
How to make sure your site is found on the search engines: simple steps to optimising your site…

It is vital that the pages on your site are optimised for the search engines. If they are not, your customers will not be able to find you.

Information retrieval is the science of searching for information. All search engine algorithms are based on information retrieval theories and methods. These algorithms are closely guarded secrets, but knowing the fundamentals of information retrieval is the key to unlocking these secrets of search engine optimisation.

Using these fundamentals, I will explain how you can build a site which will be readily understood by the search engines and therefore easily visible to the visitors you want to attract.

How the search engines work

Anyone who uses the web knows the frustration of typing a phrase into a search engine only to find the results return pages about a similar – but unrelated – item. The companies running the search engines are aware of this too and, to make their results more accurate, they are turning to increasingly sophisticated information retrieval techniques.

Gone are the days when improving your ranking on the search engines was a simple case of repeating a number of keywords and phrases on your pages. In the past, to achieve page one ranking, all you needed to do was to have more keywords on your pages than any of your competitors. Now the search engines regard this as spamming.

Google recommends that web pages should be written to be easily read by humans, as opposed to search engines (see Google's quality guidelines).

Information Retrieval and the Web

Information retrieval uses a mathematical approach to determine the weighting of specific phrases in a website. Once the weighting is calculated, this can be compared with other websites to determine which is the most relevant. This is a far more accurate method of ranking web pages because the meaning, or semantics, of the page text is captured. This makes for a system which is far less open to abuse than simply weighing up the density of relevant key words in the text.

The way the weighting of given keywords in a website is calculated varies with each search engine, but the methods are basically the same.

Here is the Five Step Process Behind Search Engine Information Retrieval...

1) Linearization:

The average webpage contains more than the text you see. There is also code embedded in the content which tells the browser your visitors are using how to display the page. The first thing a search engine does, when it reads a page on your site, is remove this code, a process which is called linearization.

How linearization affects you: The more code you have on each web page, the more difficult it is for the search engine to perform linearization with any meaningful result. For example, if your page displays tables which are defined in HTML, that is, with the definition embedded in the page content, the search engine will remove the code defining the table. It will then read just the text. Some search engines may read the text column by column; others may read it row by row. In other words, the search engine may not read the text in your table the way you intended. The best way to get around this is to use cascading style sheets. These allow you to put much of the formatting information in a separate document, leaving just a handful of codes in your content. This, in turn, makes the meaning of each page clearer to the search engines.
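To make the idea concrete, here is a minimal sketch of linearization using Python's standard html.parser module. It is only an illustration of the principle, not how any particular search engine does it; the Lineariser class, the linearise function and the sample table are my own inventions for this example.

```python
from html.parser import HTMLParser


class Lineariser(HTMLParser):
    """Collects the visible text of a page in source order, dropping the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def linearise(html):
    """Return the page text as one line, in the order it appears in the source."""
    parser = Lineariser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# A hypothetical two-column table: products on the left, prices on the right.
page = """
<table>
  <tr><td>Red torch</td><td>5 pounds</td></tr>
  <tr><td>Blue torch</td><td>7 pounds</td></tr>
</table>
"""

print(linearise(page))
```

This sketch reads the table row by row because that is the order of the cells in the source. A reader that walked the same table column by column would see the two products first and the two prices afterwards, which is exactly the ambiguity described above.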

2) Removal of 'stop words':

Once the search engine has performed linearization, the next step it takes is to remove stop words from the text. These are words which appear very often, such as conjunctions, prepositions and pronouns: words like “if”, “but”, “and” or “to”.
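As a rough illustration, stop word removal can be sketched in a few lines of Python. The STOP_WORDS list and the remove_stop_words function below are assumptions made for this example; real search engines use much longer lists and do not publish them.

```python
import re

# A tiny, hypothetical stop list; real search engines use much longer ones.
STOP_WORDS = {"a", "an", "and", "but", "for", "if", "in", "of", "on", "or", "the", "to"}


def remove_stop_words(text):
    """Lower-case the text, split it into words and drop the stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


print(remove_stop_words("If you need torches for cars, look to the boot and the glovebox."))
```

Only the words that carry meaning, such as "torches", "cars" and "glovebox", survive this step and go forward to the later stages.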

How the removal of stop words affects you: If you are using keyword density techniques to optimise your site, that is, if you have typed in your key phrase again and again, there is a very real danger that the search engine will see it as a stop word, too, and remove it. This would mean that when your target audience searched for those phrases or words on the web, your site would be invisible to them.

3) Local context analysis and the Lexicographical Tree:

Next, the search engine aims to establish the context of the subjects within the page. Every sentence has a subject and a predicate. The subject is usually at the beginning of the sentence and refers to what the sentence is about. The predicate is the rest of the sentence that gives information about the subject. Within the predicate is usually a verb and an object. The object is normally the thing that is affected by the verb.

Local context analysis attempts to determine the subject, verb and object of each sentence in each paragraph or page. Using each subject found, it collects all the associated objects and builds a two tier hierarchy. Then for each object, it collects all the associated verbs, synonyms and other words and builds the third tier of information. This three tier hierarchy of information is known as a 'lexicographical tree'.

For any given subject, the richer your description, the bigger the tree will be. Local context analysis will result in a relevancy score for each subject and object based on the size of the tree. This relevancy score will be used later to determine the overall weighting of your web page.
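The sketch below shows one way the three tier hierarchy could be represented and sized in Python. It does not parse sentences: the subject, verb and object triples are hand-labelled and hypothetical, and counting the entries under a subject is only my stand-in for the relevancy score, since the search engines do not publish their own formulas.

```python
from collections import defaultdict

# Hypothetical, hand-labelled (subject, verb, object) triples standing in for
# the output of a sentence parser, which this sketch does not implement.
TRIPLES = [
    ("torch", "lights", "road"),
    ("torch", "illuminates", "road"),
    ("torch", "fits", "glovebox"),
    ("car", "carries", "torch"),
]


def build_tree(triples):
    """Build a three tier hierarchy: subject -> object -> set of verbs."""
    tree = defaultdict(lambda: defaultdict(set))
    for subject, verb, obj in triples:
        tree[subject][obj].add(verb)
    return tree


def tree_size(tree, subject):
    """Count the objects and verbs under a subject, as a stand-in relevancy score."""
    objects = tree.get(subject, {})
    return len(objects) + sum(len(verbs) for verbs in objects.values())


tree = build_tree(TRIPLES)
for subject in sorted(tree):
    print(subject, tree_size(tree, subject))
```

In this sample, "torch" is described with more objects and more verbs than "car", so its tree is larger and it scores higher, which is the effect described above.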

How building the lexicographical tree affects you: If the sentences in your text are not properly constructed, the search engine may not pick up on the interrelationship of the nouns and verbs in your text, and this may cause it to catalogue your site incorrectly. This is why Google advises the use of good grammar and readable text: the search engine picks out subjects, objects and verbs from each sentence.

4) Latent Semantic Indexing:

Having established the relevancy of the subjects and objects in each web page, the next step is to look for other pages on the website that appear similar or have equally high relevancy scores for the same keywords. Part of this process investigates the use of synonyms to test how often the same or semantically similar words are used. This builds an index of semantics as the use of synonyms will help give the subject or object more meaning.

This can result in a higher ranking for a given page that may not even contain an exact match for the search keywords in question. This is because the derived latent semantics discovered on this page may be more relevant.
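A much simplified sketch of the synonym idea follows. Please note the swap: published descriptions of latent semantic indexing derive word associations statistically, by applying singular value decomposition to a term-by-document matrix, whereas this example uses a small hand-written synonym table. The SYNONYMS table and both functions are hypothetical and exist only to show how a page can match a query without containing its exact words.

```python
import re

# Hypothetical synonym groups; a real system would learn these associations
# from a large collection of pages rather than from a hand-written table.
SYNONYMS = {
    "torch": "torch", "torches": "torch", "flashlight": "torch", "flashlights": "torch",
    "car": "car", "cars": "car", "automobile": "car", "vehicle": "car", "vehicles": "car",
}


def concepts(text):
    """Map each known word in the text to its concept and return the set of concepts."""
    words = re.findall(r"[a-z]+", text.lower())
    return {SYNONYMS[word] for word in words if word in SYNONYMS}


def concept_overlap(query, page_text):
    """Fraction of the query's concepts that also appear in the page text."""
    wanted = concepts(query)
    if not wanted:
        return 0.0
    return len(wanted & concepts(page_text)) / len(wanted)


page = "A bright flashlight to keep in your vehicle for roadside emergencies."
print(concept_overlap("torches for cars", page))
```

The sample page contains neither "torches" nor "cars", yet it covers both concepts in the query because "flashlight" and "vehicle" are treated as equivalents.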

How Latent Semantic Indexing affects you: The better your description of the subjects and objects within each sentence, the more accurately the search engine can pin down what your site is about and hence determine the most relevant page. This will vastly increase the number of appropriate hits you will receive from the visitors you aim to attract.

However, while varying your vocabulary is good, beware of using words in an unusual context, even if it is grammatically correct to do so, as it may skew your results. It is often useful to relate an object being written about to senses or visual images – especially in direct copy writing – but pick your words carefully.

For example, a page describing a children’s ABC poster recently stated that ordering through the company’s online shop was “a piece of cake”. Shortly afterwards the site statistics started showing visitors who had been searching for information about cake decorating. An alternative way of putting it, without losing the informal tone or diluting the relevancy of the page, might have been “ordering is as easy as ABC”.

5) Term Vector Analysis

Having carried out Local Context Analysis and Latent Semantic Indexing, the search engine uses a mathematical algorithm on the results, called Term Vector Analysis, to give each page a score for its total relevance, or weighting, to the search query. Term vector analysis is a mathematical method of determining the relevancy of multiple terms or keywords. This is performed by putting each keyword on an axis on a graph, and marking the relevancy score for each of those keywords for a given page. The result will be a vector having an angle and a magnitude. This then gives a method of comparing different pages for given combinations of keywords. The vector with the highest magnitude and the closest angle to the search query will have the largest weighting.

See below for an example of the Term Vector Analysis Graph for the search terms "torches for cars":

Compare the weighting for 'SITE A', with a relevancy score of 0.7 for torches and 0.2 for cars, with 'SITE B', with a relevancy score of 0.6 for torches and 0.5 for cars. The graph clearly shows that 'SITE B' is the closer match to the ideal weighting and therefore 'SITE B' will be ranked higher by the search engines.
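The same comparison can be worked through numerically. The sketch below uses the relevancy scores quoted above for the two sites. The query vector of (1.0, 1.0) is my assumption that both search terms matter equally, and cosine similarity is a common way of measuring how close two vectors are in angle, not necessarily the measure any given search engine uses.

```python
import math


def magnitude(vector):
    """Length of the vector."""
    return math.sqrt(sum(value * value for value in vector))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means the same direction."""
    return sum(x * y for x, y in zip(a, b)) / (magnitude(a) * magnitude(b))


# Axes are (torches, cars). The site scores come from the example in the text;
# the query vector is an assumption that both terms matter equally.
query = (1.0, 1.0)
sites = {"SITE A": (0.7, 0.2), "SITE B": (0.6, 0.5)}

for name, vector in sites.items():
    print(name, round(magnitude(vector), 3), round(cosine_similarity(vector, query), 3))
```

On these figures 'SITE B' has both the longer vector and the angle closer to the query, so it wins on both counts, which agrees with the ranking given above.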

How Term Vector Analysis affects you: You need to build a high relevancy score for the important keywords. The calculated weighting for combinations of your important keywords on your pages will consequently be higher. This will result in a greater chance of your web page being found on the search engines.

What you can do to give your pages a powerful advantage

Now you know how the search engines work. Here are seven steps you can take to make sure your pages are search engine-friendly.

  1. Keep your pages as uncluttered as possible: Use cascading style sheets where you can. These keep much of the formatting for your page separate from the content. The result will be text that is easier for the search engines to read. Where you need to include formatting, for example, on page titles or text headings, decide on three or four different types, sizes or colours which you are going to use and stick to them. This keeps things straightforward for your human readers, as well as limiting the amount of code the search engine will have to wade through.
  2. Use meta tags: These are invisible text headers at the beginning of each page which give the search engines more information about your site. You can use these to explain many things including what the page is about or whereabouts in the world you are based – handy if you wish to address an audience which resides mainly in your own country.
  3. Aim to comply with formatting standards: Make sure any coding you have added to format your pages is correct. As the internet standards improve, the search engines will inevitably be programmed to favour pages which comply with the standards. Try to make sure your site complies with W3C standards – you can check each of your pages for errors using the W3C Markup Validation Service. The content management systems we offer will be W3C compliant, if you use them with the WYSIWYG editors switched on.
  4. Research your keywords: Take time to determine which words and phrases the visitors you wish to attract will type into the search engines – the popular keywords may not be the ones which spring to your mind. If possible, ask others what they would use. Consider using keyword research tools, like Wordtracker or - if you specifically need to search for a UK database of keywords - Trellian.
  5. Use correct grammar: Construct your sentences properly. The search engines follow the rules of good grammar to carry out Local Context Analysis. If you want them to understand your pages, you need to use good grammar.
  6. Use synonyms and metaphors: If you use the same key words repeatedly the search engines may treat them as stop words and ignore them. Using alternatives reinforces what your pages are about, meaning they can be catalogued more accurately. That said, be careful using colloquialisms, slang or less well known alternative meanings or contexts for common words.
  7. Keep your text tight: Make sure you have clearly defined subject matter for each page and stick to it. The more closely you stick to your chosen topic the easier it is for the search engines to catalogue.
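Steps 1 and 2 above lend themselves to a quick automated check. The following is a minimal sketch, again using Python's standard html.parser module, which lists the meta tags on a page and counts the formatting that is still embedded in the content. The PageAudit class, the audit function and the sample page are hypothetical, and the check is far less thorough than a proper validation service.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Records named meta tags and counts formatting embedded in the content."""

    # Obsolete presentational elements that a style sheet can replace.
    PRESENTATIONAL = {"font", "center"}

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.inline_formatting = 0

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        name = attributes.get("name")
        if tag == "meta" and name:
            self.meta[name.lower()] = attributes.get("content") or ""
        if tag in self.PRESENTATIONAL or "style" in attributes:
            self.inline_formatting += 1


def audit(html):
    """Return the page's meta tags and the number of embedded formatting codes."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    return parser.meta, parser.inline_formatting


# A hypothetical page with one meta tag and two pieces of embedded formatting.
page = """
<html><head>
<meta name="description" content="Hand-held torches designed to be kept in cars.">
</head><body>
<p style="color: red"><font size="4">Torches for cars</font></p>
</body></html>
"""

meta, formatting_count = audit(page)
print(meta)
print(formatting_count)
```

A page with a description meta tag and a formatting count of zero is following both steps. Anything counted here is a candidate for moving into your style sheet.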

In summary, for your customers to find you, you need to keep in mind how the search engines work, not only when you conceive the design and page structure of your site but also when you write the content. If you don’t do this, all the time and effort you expend on making a website may well be for nothing.


However, the good news is that it doesn’t cost anything to optimise your pages yourself, and just by trying these simple steps you can achieve a marked increase in your site's visibility on the search engines.

Find out for yourself how you can maximise your business potential by incorporating Netflare Marketing Support. Call 0800 107 4662 and ask to speak to an advisor.