Search Engine Visibility of Website Development and Design

Search engine traffic is vital to a web site; without it, chances are the site will never fulfil its marketing functions. It is essential that the search engines can see the entire publicly visible web site, index it fully and consider it relevant for its chosen keywords.

Search engine optimisation has its own chapter in this textbook, but here are the key considerations when it comes to web development and design.

Labelling things correctly: URLs, alt tags, title tags and meta data

URLs, alt tags, title tags and meta data all describe a web site and its pages to both search engine spiders and people. (And don’t worry; these words are all described to you below!) Chances are, clear descriptive use of these elements will appeal to both.

URLs

URLs should be as brief and descriptive as possible. This may mean that URLs require server side rewriting so as to cope with dynamic parameters in URLs. Does that sound a little heavy? The examples below should make this clearer:

Comparison of URLs for Cube World, a toy for sale on both sites:

Firebox.com - www.firebox.com/index.html?dir=firebox&action=product&pid=1201

Gizoo.co.uk - www.gizoo.co.uk/Products/toysgames/Interactive/CubeWorld2.htm

The first example has dynamic parameters (shown by the question mark and the ampersands) and uses categories that make sense to the database (e.g. pid=1201) but make little sense to the user.

The second example is far more user friendly, and clearly indicates where in the site the user is. You even start getting a good idea of the architecture of the web site from just one URL!

More than two dynamic parameters in a URL increase the risk that the URL will not be spidered, in which case the search engine would not index the content on that page at all.

Lastly, well written URLs can make great anchor text. If another site is linking to yours and they use just the URL, the search engine will do a better job of knowing what the page is about if you have a descriptive URL.
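The "more than two dynamic parameters" rule of thumb can be checked mechanically. The snippet below is a minimal sketch rather than a definitive tool: the function names and the threshold default are assumptions made for illustration, and it simply counts the query-string parameters in the two Cube World URLs above.

```python
from urllib.parse import urlsplit, parse_qsl


def count_dynamic_parameters(url: str) -> int:
    """Return how many query-string parameters a URL carries."""
    # urlsplit only recognises the host if a scheme is present, so add one if absent.
    if "://" not in url:
        url = "http://" + url
    query = urlsplit(url).query
    # keep_blank_values=True so that "?a=&b=" still counts as two parameters.
    return len(parse_qsl(query, keep_blank_values=True))


def may_be_hard_to_spider(url: str, max_params: int = 2) -> bool:
    """Flag URLs that exceed the rule-of-thumb limit (assumed default: two)."""
    return count_dynamic_parameters(url) > max_params


firebox = "www.firebox.com/index.html?dir=firebox&action=product&pid=1201"
gizoo = "www.gizoo.co.uk/Products/toysgames/Interactive/CubeWorld2.htm"

print(count_dynamic_parameters(firebox), may_be_hard_to_spider(firebox))  # 3 True
print(count_dynamic_parameters(gizoo), may_be_hard_to_spider(gizoo))      # 0 False
```

A check like this could be run over a list of a site's URLs to find candidates for server-side rewriting.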

Alt tags

Have you ever waited for a page to load, and seen little boxes of writing where the images should be? Sometimes they say things like “topimg.jpg”, and sometimes they are much clearer and you have “Cocktails at sunset at Camps Bay”. Since search engines read text, not images, descriptive tags are the only way to tell them what the images are, but these are still essentially for users. Text readers for browsers will also read out these tags to tell the user what is there. Meaningful descriptions certainly sound a lot better than “image1”, “image2”, “image3”.
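To find images whose alt text is missing or unhelpful, a page can be scanned automatically. This is a rough sketch using Python's standard HTML parser; the class name, the sample markup and the pattern used to spot file-name-like text are all assumptions for illustration, not a complete accessibility check.

```python
import re
from html.parser import HTMLParser

# Assumed heuristic: alt text that looks like a bare file name or "image1" is unhelpful.
FILENAME_LIKE = re.compile(r"^\S+\.(jpe?g|png|gif)$|^image\d*$", re.IGNORECASE)


class AltTextAudit(HTMLParser):
    """Collects (src, alt) pairs for images with missing or unhelpful alt text."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = (attributes.get("alt") or "").strip()
        if not alt or FILENAME_LIKE.match(alt):
            self.problems.append((attributes.get("src"), alt))


# Hypothetical markup for demonstration.
sample = """
<img src="topimg.jpg" alt="topimg.jpg">
<img src="sunset.jpg" alt="Cocktails at sunset at Camps Bay">
<img src="logo.png">
"""

audit = AltTextAudit()
audit.feed(sample)
print(audit.problems)  # [('topimg.jpg', 'topimg.jpg'), ('logo.png', '')]
```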

Title attribute

Just as you can have the alt tag on an image HTML element, you can have a title attribute on almost any HTML element - most commonly on a link. This is the text that is seen when a user hovers over the element with their mouse pointer. It is used to describe the element, or what the link is about. As this is text, it will also be read by search engine spiders.

Title tags

Title tags, which supply the text that appears in the top bar of your browser, are used by search engines to determine the content of that page. They are also often used by search engines as the link text on the search engine results page, so targeted title tags help to drive clickthrough rates. Title tags should be clear and concise (it’s a general rule of thumb that all tags be clear and concise, you’ll find). Title tags are also used when bookmarking a web page.

Meta tags

Meta tags are where the developer can fill in information about a web page. These tags are not normally seen by users. If you right click on a page in a browser and select “view source”, you should see a list of entries beginning with <meta name=. These are the meta data. In the past, the meta tags were used extensively by search engine spiders, but since so many people used this to try to manipulate search results, they are now less important. Meta data now act to provide context and relevancy rather than higher rankings. However, the meta tag called “description” often appears on the search engine results page (SERP) as the snippet of text to describe the web page being linked to. This is illustrated in the image above. If the description is accurate, well-written and relevant to the searcher’s query, it is more likely to be used by the search engine. And if it meets all those criteria, it also means the link is more likely to be clicked on by the searcher.
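To see what a spider would pick up from the title tag and the “description” meta tag, the two can be pulled out of a page's source. The following is a small sketch with Python's standard HTML parser; the class name and the sample page (including its title and description text) are invented for illustration.

```python
from html.parser import HTMLParser


class HeadSummary(HTMLParser):
    """Extracts the <title> text and the content of <meta name="description">."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "description":
                self.description = attributes.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Hypothetical page source for demonstration.
page = """<html><head>
<title>Cube World Interactive Toys</title>
<meta name="description" content="Stackable cubes with stick figures inside.">
</head><body></body></html>"""

summary = HeadSummary()
summary.feed(page)
print(summary.title)
print(summary.description)
```

A page where either value comes back empty, or where many pages share the same values, is a candidate for clearer labelling.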

Search engine optimised copy

The chapters on online copywriting and search engine optimisation provide details on writing copy for online use and for SEO benefit. When it comes to web development, the copy that is shown on the web page needs to be kept separate from the code that tells the browser how to display the web page. This means that the search engine spider can discern easily between what is content to be read (and hence scanned by the spider) and what are instructions to the browser. CSS (cascading style sheets) can take care of that, and is covered further in this chapter. The following kinds of text cannot be indexed by search engines:

  •    Text embedded in a Java Application or a Macromedia Flash File
  •    Text in an image file (that’s why you need descriptive alt tags and title attributes)
  •    Text only accessible after submitting a form, logging in, etc. 

If the search engine cannot see the text on the page, it cannot spider and index that page.

Information architecture

Well organised information is as vital for search engines as it is for users. An effective link structure benefits search rankings and helps to ensure that a search engine indexes every page of your site.

Make use of a sitemap, linked to and from every other page in the site. The search engine spiders follow the links on a page, and this way they will be able to index the whole site. A well planned sitemap will also ensure that every page on the site is within a few clicks of the home page.

There are two kinds of sitemap that can be used: an HTML sitemap, which a visitor to the web site can see, use and make sense of, and an XML sitemap, which contains additional information for the search engine spiders. An XML sitemap can be submitted to search engines to promote full and regular indexing. A dynamically generated sitemap will update automatically when content is added.
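As a rough illustration of a dynamically generated XML sitemap, the sketch below builds one from a list of pages using the sitemaps.org schema. The function name, the page URLs and the dates are assumptions for the example; in practice the list would come from the site's own content database.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from an iterable of (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages and dates for demonstration.
pages = [
    ("http://www.websitename.com/", "2012-05-01"),
    ("http://www.websitename.com/products/", "2012-05-14"),
]
print(build_sitemap(pages))
```

Because the XML is produced from the page list each time, adding a page to the list is all it takes to keep the sitemap current.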

Using a category structure that flows from broad to narrow also indicates to search engines that your site is highly relevant, and covers a topic in-depth.

Canonical issues: there can be only one

Have you noticed that sometimes several URLs can all give you the same web page?

For example:

http://www.websitename.com

http://websitename.com

http://www.websitename.com/index.html

All the above can be used for the same home page of a web site. However, search engines see these as three separate pages with duplicate content. Search engines look for unique documents and content, and when duplicates are encountered, a search engine will select one as canonical, and display that page in the SERPs. However, it will also dish out a lower rank to that page, and all its copies. Any value is diluted by having multiple versions.

Lazy webmasters sometimes forget to put any kind of redirect in place, meaning that http://websitename.com doesn’t exist while http://www.websitename.com does. This is termed “Lame-Ass Syndrome” (LAS), a fitting moniker.

Having multiple pages with the same content, however that came about, hurts the web site’s search engine rankings. There is a solution: 301 re-directs can be used to point all versions to a single, canonical version.
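The logic behind such a redirect can be sketched in a few lines. This is only an illustration of the decision, not a web server configuration: the preferred host, the list of default document names and the function names are all assumptions, and a real site would usually express the same rule in its server's rewrite settings.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.websitename.com"           # assumed preferred host
DEFAULT_DOCUMENTS = ("index.html", "index.htm")  # assumed default file names


def canonical_url(url: str) -> str:
    """Map any variant of a URL onto the single canonical version."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == CANONICAL_HOST[len("www."):]:
        host = CANONICAL_HOST
    path = parts.path or "/"
    for name in DEFAULT_DOCUMENTS:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


def redirect_for(url: str):
    """Return (301, target) if url is a non-canonical variant, else None."""
    target = canonical_url(url)
    parts = urlsplit(url)
    # An empty path and "/" are the same request, so do not redirect between them.
    requested = urlunsplit((parts.scheme, parts.netloc, parts.path or "/", parts.query, ""))
    return None if requested == target else (301, target)


for variant in (
    "http://www.websitename.com",
    "http://websitename.com",
    "http://www.websitename.com/index.html",
):
    print(variant, "->", redirect_for(variant))
```

With these assumptions, the first URL needs no redirect, while the other two are both pointed to http://www.websitename.com/ with a 301.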

robots.txt

A robots.txt file restricts a search engine spider from crawling and indexing certain pages of a web site by giving instructions to the search engine spider, or bot. This is called the Robots Exclusion Protocol. So, if there are pages or directories on a web site that should not appear in the SERPs, the robots.txt file should be used to indicate this to search engines.

If a search engine robot wants to crawl a web site URL, e.g.

http://www.websitename.com/welcome.html

it will first check for

http://www.websitename.com/robots.txt

Visiting the second URL will show a text file with:

User-agent: *
Disallow: /

Here, User-agent: * means that the instruction is for all bots. If the instruction is for specific bots only, they should be identified here. The Disallow: / is an instruction that no pages of the web site should be indexed. If there are only certain pages or directories that should not be indexed, they should be listed here instead.

For example, if there is both an HTML and a PDF version of the same content, the wise web master will instruct search engine bots to index only one of the two to avoid being penalised for duplicate content.

The robots.txt file is publicly accessible, so although it does not show restricted content, it can give an idea of the content that a web site owner wants to keep private.

A robots.txt file needs to be created for each subdomain.
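It can be useful to test how a well-behaved bot would read a set of rules before publishing them. The sketch below uses the robots.txt parser in Python's standard library; the helper name, the bot name “ExampleBot” and the /pdf/ directory are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules_text: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the given rules let the named bot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(agent, url)


# The rules shown earlier: every bot is kept out of the whole site.
block_everything = """\
User-agent: *
Disallow: /
"""

# A hypothetical alternative: only the PDF copies are kept out.
block_pdfs_only = """\
User-agent: *
Disallow: /pdf/
"""

print(allowed(block_everything, "http://www.websitename.com/welcome.html"))    # False
print(allowed(block_pdfs_only, "http://www.websitename.com/welcome.html"))     # True
print(allowed(block_pdfs_only, "http://www.websitename.com/pdf/welcome.pdf"))  # False
```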

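Here is a robots.txt file with additional information. Each Disallow rule goes on its own line. Note that wildcard patterns, Request-rate, Crawl-delay and Visit-time are extensions to the original protocol, and not every bot honours them:

User-agent: *
Disallow: /*.mp3
Disallow: /*.wmv
Disallow: /*.swf
Disallow: /*.rm
Request-rate: 1/5
Crawl-delay: 5
Visit-time: 0001-1300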

Instructions to search engine robots can also be given in the meta tags. This means that instructions can still be given if you only have access to the meta tags and not to the robots.txt file.
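As a small illustration, the sketch below reads the widely used “robots” meta tag out of a page and lists its directives. The class name and the sample page are assumptions; “noindex” and “nofollow” are the common directive values asking a bot not to index the page and not to follow its links.

```python
from html.parser import HTMLParser


class RobotsMeta(HTMLParser):
    """Collects the directives found in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


# Hypothetical page source for demonstration.
page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'

reader = RobotsMeta()
reader.feed(page)
print(sorted(reader.directives))  # ['nofollow', 'noindex']
```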

Make sure it’s not broken

Make sure that both visitors to your web site and search engines can see it all by following these guidelines: 

  • Check for broken links – anything that you click that gives an error should be considered broken and in need of fixing.
  • Validate your HTML and CSS in accordance with W3C guidelines.
  • Make sure all forms and applications work as they ought to.
  • Keep file size as small as possible and never greater than 150K for a page. This ensures a faster download speed for users, and means that the content can be fully cached by the search engines.
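Two of these checks, broken links and page size, lend themselves to automation. The following is a minimal sketch, not a full site checker: the names are invented, 150K is read as 150 × 1024 bytes (an assumption), and the status lookup is a stand-in so the example runs without network access. In practice, fetch_status would wrap a real HTTP client.

```python
from html.parser import HTMLParser

MAX_PAGE_BYTES = 150 * 1024  # assumed reading of the 150K ceiling


class LinkCollector(HTMLParser):
    """Collects the href of every <a> element on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def audit_page(html: str, fetch_status):
    """Return (broken_links, too_heavy) for one page.

    fetch_status is any callable that maps a URL to an HTTP status code.
    """
    collector = LinkCollector()
    collector.feed(html)
    broken = [link for link in collector.links if fetch_status(link) >= 400]
    too_heavy = len(html.encode("utf-8")) > MAX_PAGE_BYTES
    return broken, too_heavy


# Stand-in status table and page, both hypothetical.
fake_statuses = {"/products/": 200, "/old-page.html": 404}
page = '<a href="/products/">Products</a> <a href="/old-page.html">Old</a>'

print(audit_page(page, lambda url: fake_statuses.get(url, 200)))
# (['/old-page.html'], False)
```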
