Luca Vassalli's Website - Home page

3. How a spider views your website

3.1 Images, Flash content and JavaScript

When the World Wide Web was born, it was mainly a text-based medium. Sounds, images and complex animations were either very rare or completely unheard of. Not surprisingly, the first major search engines that came around a couple years later were built to classify and rank pages largely based on textual content.
Unfortunately, while things have changed for webmasters, they have not changed much for search engines.
Nowadays it is almost impossible to find a website which does not use images or Flash content; text is still very important, but some websites are developed wholly in Flash, or use images to display a fancy font. This is extremely bad from an SEO point of view: although spiders try to extract what information they can from these graphical elements, heavy use of images or Flash content may really hurt the ranking of your website. Since there is no point in having a very fine website with no visitors, you should find a good trade-off between search engine friendliness and graphical appeal.
First, when you use images they must have a descriptive filename and you must use the "alt" attribute to give a complete description. This attribute is also important for screen readers based on speech synthesisers: these browsers read out the alt text, so you should make it usable while still including the search term. For instance, the following are a bad and a good example of an image tag:
<img src="pic1.jpg" alt="Tom and Sarah">
<img src="wedding_party_Tom_Sarah.jpg" alt="A picture of Tom and Sarah at the Naismiths' wedding party at Pollock Park on 25th June 2006">
It is fine to use some images for illustration and decoration, but they should never be used for navigation or for displaying text.
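As a sketch of the "real text, not images" rule, the contrast below shows a heading baked into an image (invisible to the spider) against the same words as plain HTML styled with CSS (the filename, class name and font are illustrative assumptions, not part of any real site):

```html
<!-- Bad: the heading text lives inside an image, so spiders cannot index it -->
<img src="welcome_banner.jpg" alt="Welcome to our wedding photography site">

<!-- Better: real text, styled with CSS; fully visible to spiders and screen readers -->
<style>
  /* illustrative styling only; any decorative font could be used here */
  .fancy-heading { font-family: Georgia, serif; color: #7a2e2e; }
</style>
<h1 class="fancy-heading">Welcome to our wedding photography site</h1>
```

The styled heading looks just as "fancy" to the visitor, but its words remain ordinary indexable text.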
A similar approach must be followed for Flash animations. A Flash introduction to a website is a top-ranking killer, because spiders cannot index a Flash movie directly, as they do with a plain page of text. Spiders index filenames, which have to be chosen as carefully as for images, but not the content inside. This is true for all the search engines: although there are differences in the way they handle Flash content, all of them index only a minimal part of it.
There are also other reasons not to use Flash animations: it has been shown that users tend to skip Flash introductions and go straight to the information they are interested in; they consume bandwidth, and for a user on a slow connection this may mean losing a potential reader; and they are expensive, since they require expert designers. The right rule should be: Flash is good for enhancing a story, but not for telling it. So if you want to use it, do so with care.
You must also avoid using Flash for navigation: text links are the only SEO approved way to build site navigation.
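A minimal sketch of such SEO-approved navigation, built from plain text links (the page names and anchor text below are assumptions for illustration):

```html
<!-- Plain text links: every target is crawlable, and the anchor text carries keywords -->
<ul>
  <li><a href="index.html">Home</a></li>
  <li><a href="wedding_photos.html">Wedding photos</a></li>
  <li><a href="contact.html">Contact</a></li>
</ul>
```

Each link is both a navigation element for the visitor and a crawlable, keyword-bearing path for the spider.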
However, whenever you do decide to use Flash, there are some workarounds to limit the damage.
Using metadata is important in this case: Flash development tools make it easy to add metadata to your movies, and it is a chance to describe them to the spider.
It is better to provide an alternative page made only of HTML content, so that the user can skip the Flash version. In general it is a good habit, not only for SEO but also for usability, to give users the choice between Flash and plain HTML. It requires double the effort, but you will be paid back.
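One common pattern for offering an HTML alternative is to nest fallback markup inside the element that embeds the movie: browsers without Flash, and spiders, see the inner HTML instead. A hedged sketch, with filenames and text that are purely illustrative:

```html
<!-- Fallback HTML inside the embedding element; spiders index the inner markup -->
<object type="application/x-shockwave-flash" data="intro.swf" width="600" height="400">
  <!-- Shown when the Flash movie cannot be rendered -->
  <h1>Tom and Sarah's Wedding Photography</h1>
  <p>Browse our <a href="gallery.html">photo gallery</a> or
     <a href="about.html">read about our services</a>.</p>
</object>
```

The fallback content doubles as the plain-HTML version for users who cannot, or prefer not to, load the animation.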
Using an ad hoc tool to translate Flash into HTML can save many working hours. This is the job of one of the handiest applications in the Flash Search Engine SDK, called "swf2html". Obviously you still have to check the automatically produced output, verifying for instance the colours: if the font happened to be the same colour as the background, you might be banned for hidden text. You also need to remove possible duplicate content and adjust the page so that the keyword-rich content ends up in the title and headings, or at the beginning of the page.
You can also use tools, like the one at http://www.se-flash.com/, which show how different search engines actually see your page with Flash content; in particular, if you are using the SDK, they provide one more check of the accuracy of the extracted text. There are also more general tools, like http://www.webconfs.com/search-engine-spider-simulator.php, which can be used to analyse what a spider sees of a page, for instance to verify whether any JavaScript parts include important keywords or links.
JavaScript is in fact another kind of content which is not analysed by the spiders. This means that if you have, for example, a JavaScript menu, it will not be read by the crawler. To solve the problem you should always use the <noscript> tag, including inside it all the links of the JavaScript menu. You should also avoid putting JavaScript instructions at the top of the HTML file, the area where the most important keywords are supposed to be found; it is better to put them in a separate file, otherwise you force the spider to wade through something it is not at all interested in before it can read the text. While the major search engines handle such unfriendly pages quite well, filling your pages with non-HTML code is more likely to hurt than to help you. Furthermore, the less the search engine knows about what kind of CSS and JavaScript you use, the better. This is true in particular if you use SEO spam techniques, but even with mild ethical SEO there is no guarantee that what is allowed today will not be forbidden in the future. If you put the .js file in a separate directory and protect it with robots.txt, the spider is less likely to analyse it.
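The recommendations above can be combined in one page skeleton: the menu script lives in an external file, and a <noscript> block repeats the links as plain HTML for spiders and script-less browsers. A sketch, with filenames, the menu element and the link targets all assumed for illustration:

```html
<html>
<head>
  <title>Wedding photography in Glasgow</title>
  <!-- Script kept in an external file so keyword-rich text stays near the top -->
  <script src="/js/menu.js"></script>
</head>
<body>
  <div id="menu"></div> <!-- filled in by menu.js for ordinary visitors -->
  <noscript>
    <!-- Plain links duplicating the JavaScript menu, readable by spiders -->
    <a href="index.html">Home</a>
    <a href="gallery.html">Gallery</a>
    <a href="contact.html">Contact</a>
  </noscript>
</body>
</html>
```

To discourage spiders from fetching the script itself, the /js/ directory can then be disallowed in robots.txt with the two lines "User-agent: *" and "Disallow: /js/".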