
SEO for URLs and externally linked files on websites & ‘blogs

Google is a reader of websites who’s best kept happy with sensible structure and strong content. Google will read almost anything — or should I say almost any file…


Because the world wide web isn’t just about web pages written in HTML (Hyper-Text Mark-up Language), Google will happily read (and, if deemed appropriate, index) a number of file types, including:

  • Adobe Portable Document Format (.pdf)
  • Adobe PostScript (.ps)
  • Atom and RSS feeds (.atom, .rss)
  • Autodesk Design Web Format (.dwf)
  • Google Earth (.kml, .kmz)
  • Lotus 1-2-3 (.wk1, .wk2, .wk3, .wk4, .wk5, .wki, .wks, .wku)
  • Lotus WordPro (.lwp)
  • MacWrite (.mw)
  • Microsoft Excel (.xls)
  • Microsoft PowerPoint (.ppt)
  • Microsoft Word (.doc)
  • Microsoft Works (.wks, .wps, .wdb)
  • Microsoft Write (.wri)
  • Open Document Format (.odt)
  • Rich Text Format (.rtf)
  • Shockwave Flash (.swf)
  • Text (.ans, .txt)
  • Wireless Markup Language (.wml, .wap)
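
If you want to take stock of which downloads on your own website fall into these categories, a small script can do the sorting for you. What follows is only a minimal sketch in Python: the extensions are copied from the list above (which may well have changed since this was written), and the function name `is_listed_file_type` and the example URLs are my own inventions for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the list above; not an authoritative or current list.
LISTED_EXTENSIONS = frozenset({
    ".pdf", ".ps", ".atom", ".rss", ".dwf", ".kml", ".kmz",
    ".wk1", ".wk2", ".wk3", ".wk4", ".wk5", ".wki", ".wks", ".wku",
    ".lwp", ".mw", ".xls", ".ppt", ".doc", ".wps", ".wdb", ".wri",
    ".odt", ".rtf", ".swf", ".ans", ".txt", ".wml", ".wap",
})


def is_listed_file_type(url: str) -> bool:
    """Return True if the URL's path ends in one of the listed extensions."""
    # urlparse separates the path from any query string or fragment,
    # so "report.pdf?v=2" is still recognised as a PDF.
    path = urlparse(url).path
    suffix = PurePosixPath(path).suffix.lower()
    return suffix in LISTED_EXTENSIONS


if __name__ == "__main__":
    # Hypothetical URLs, purely for demonstration.
    for url in ("https://example.com/files/Brochure.PDF?v=2",
                "https://example.com/about.html"):
        print(url, is_listed_file_type(url))
```

The check looks only at the file extension, so it says nothing about whether Google will actually index a given file.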

Think of all those times you see a PDF document appear in a SERP (Search Engine Results Page). That’s no accident.

PDF search result link on Google

Keep in mind at all times that the search engines are very specific about what they want from your website. If your website makes it difficult for them to do their job, they will go elsewhere, possibly to a competitor’s website.

What does this mean for SEO?

If you’re a webmaster, or are in the throes of developing a new website, there’s a chance you’re going to include all kinds of different files for download, some of them from the list above.

These reserves of content aren’t really external to the rest of your website simply because you can’t edit them with Adobe Dreamweaver, or your favourite IDE. So in many ways, the rules of SEO (Search Engine Optimization) still apply.

Linking to external files

An example of the text surrounding a URL

These stores of data & information are often linked to from within a website. The anchor text you use will offer Google a guide as to how best to approach this store of knowledge.

For instance, Google examines the surrounding text of a URL as well as its immediate anchor text (see illustration above).

So by looking to the anchor text and examining the surrounding text, Google will look to the referenced file and index it accordingly (see illustrations below).

An example of weak anchor text

An example of strong anchor text
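
To find weak anchor text on your own pages, you could scan the HTML for links whose visible text says nothing about the file they point to. The sketch below uses Python's standard `html.parser` module. The set of "weak" phrases, the class and function names, and the sample markup are all assumptions of mine, not anything Google publishes.

```python
from html.parser import HTMLParser

# Hypothetical set of phrases treated as weak anchor text; adjust to taste.
WEAK_PHRASES = {"click here", "here", "download", "read more", "this link"}


class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs from a piece of HTML."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (href, text) pairs
        self._href = None  # href of the <a> element currently open, if any
        self._text = []    # pieces of text seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace so the comparison below is reliable.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def weak_links(html):
    """Return the (href, text) pairs whose anchor text is a known weak phrase."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return [(href, text) for href, text in parser.links
            if text.lower() in WEAK_PHRASES]


if __name__ == "__main__":
    # Made-up markup showing one weak and one strong link to the same file.
    sample = (
        '<p>Our price list: <a href="/files/prices.pdf">click here</a>.</p>'
        '<p>Download the <a href="/files/prices.pdf">widget price list (PDF)</a>.</p>'
    )
    print(weak_links(sample))
```

This only flags anchor text that matches the phrase list exactly; judging the surrounding text is still a job for a human editor.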

Optimizing external files

You can think of these external files as being very much an active part of your website. Indeed, any optimization of the text (commonly referred to as copy) within those documents could have a very real impact on the relevance of your website to people performing a search via Google.

Furthermore, URLs contained in PDF documents are handled in much the same way as they are in regular web pages. This can have a profound effect on your website.

For a start, when people download these documents, those links will refer the reader back to your website. Also, and arguably more importantly, those links will build towards increasing the strength of your internal link structure.

What that means is, instead of having the same list of link buttons in the header of your website or ‘blog with just a list of recent articles down the left- or right-hand side of the page, you’re offering links to related content within the body of the web pages themselves.

Essentially, you’re telling the search engines that previous web pages and / or articles are still relevant today. Also, the search engines are able to enter into your website via means in addition to the regular navigational menu structure.

External files and usability

Sometimes, people think external files, like Microsoft Office documents and Adobe PDFs, are a wasted opportunity: why not just add the content of those files into the website proper? From a usability point of view, that’s a good argument.

The moment someone clicks on a link to a PDF document, something else happens that is beyond the remit of the web browser: either a new browser window or a window belonging to another application is spawned.

Lots of businesses now use Adobe Reader, but some still do not. Even if people do have the software to view PDFs, there’s the inconvenience of working with another application when all they really want to do is browse.

However, if they’ve installed the correct plugins or add-ons, they can view those documents within their favourite web browser.


The takeaway SEO advice here is:

  1. ensure those external documents linked to from your website are optimized for the search engines the same way your regular web pages are;
  2. if you’re using keywords & key phrases, ensure they’re all present in any / all external documents that require them;
  3. make good use of anchor text as well as surrounding text;
  4. use common file formats you’re confident your visitors will be able to view, such as PDF, Microsoft Office et cetera;
  5. if you’re unsure whether someone has the correct software to view your files, then offer an option for the visitor to download the software — subject to licensing.
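
The second point on that list lends itself to a quick check. Here is a minimal sketch in Python, assuming you have already extracted the plain text from the document with whatever tool you prefer (that step isn't shown here). The function name `missing_key_phrases` and the sample text are hypothetical.

```python
def missing_key_phrases(document_text, key_phrases):
    """Return the key phrases that do not appear in the document text.

    Matching ignores case and treats any run of whitespace as one space,
    so a phrase split across a line break in the document still counts.
    """
    normalised = " ".join(document_text.lower().split())
    return [phrase for phrase in key_phrases
            if " ".join(phrase.lower().split()) not in normalised]


if __name__ == "__main__":
    # Made-up text standing in for the extracted contents of a PDF.
    text = "Widget  Price\nList for trade customers"
    print(missing_key_phrases(text, ["price list", "delivery times"]))
```

This is a plain substring match, so it won't catch plurals or other variations of a phrase.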

SEO is about the whole of the website, not just your web pages. Now that you’re able to think more holistically about those files that make up your website, why not see if you can push your website or ‘blog further up those search results pages…


By Wayne Smallman

Wayne is the man behind the Blah, Blah! Technology website, and the creator of the Under Cloud, a digital research assistant for journalists and academics.

10 replies on “SEO for URLs and externally linked files on websites & ‘blogs”

Great Tips Wayne as always, very useful reference about context in relation to anchor text.

There is much more to that, particularly when using a virtual theming strategy like Wikipedia to connect contextual information (the anchor indicates internal relevance and makes it easier to rank for when combined with external links).

Search engines (in my experimentation), assign a higher relevance score when simple principles such as proximity are optimized for your on page factors that group key phrases and anchors consistently.

Hi Wayne,

Great article. At work, we are currently looking into accessibility for PDFs, a topic which I think we’ll see more of in future.

I guess the old rule applies: accessibility and SEO go hand in hand. An accessible PDF is likely to be well optimised for SEO as well.

Thanks for this interesting piece of information. I’ve just launched a website for the organisation where I work that contains a lot of external pdf’s. I already knew that these would be included in serp’s, but I never realized that I should consider their accessibility and content in the same fashion as I do with normal html pages.

Question for everyone: if you *only* look at seo, would you choose to put information in an html page or a pdf? Assume that, apart from the format, everything is the same: the number and types of links to the document, the content, et cetera.
