Why do websites break all by themselves?

November 14, 2012

Ross Gerring

One of the most frustrating aspects of owning a website is that it will sometimes “break”. One day something is working just fine, the next day it’s not.

So for example, one day your ‘Contact Us’ form is successfully sending emails to your sales team, and the next day it’s not. Or yesterday customers could successfully complete a purchase on your e-commerce store, but today they can’t. Or today a link on your website to another website works just fine, but tomorrow it shows as a broken link. Annoying!

It can be time-consuming, and therefore expensive, to fix a broken site. Cause and effect are, on occasion, far from obvious. For example, sometimes a coding change in one part of a site can have an undesirable knock-on effect in an apparently unrelated part of the site. Or perhaps a problem is being reported by one or more of your customers, but you’re struggling to reproduce the error yourself.

Particularly aggravating is where, apparently, no-one actually did anything to the site itself. It appeared to break all by itself. How is this possible?

The bottom line is that websites and related public-facing technologies operate in an extremely dynamic (some would say ‘hostile’!) environment. There are lots and lots of things going on around a website that can cause things to break, temporarily (yes, some issues appear to fix themselves!) or permanently, until a web developer or other IT professional is able to fix things up.

Here are a few reasons how and why websites sometimes appear to break all by themselves:

  1. Browser changes/versions, i.e. changes to the software that people use to view and interact with your website. A website can only be future-proofed to a degree. So your website might work just great on Internet Explorer 9, but a subtle difference when you upgrade to Internet Explorer 10 causes your site to misbehave. And don’t forget that there are multiple browsers (e.g. IE, Chrome, Firefox, Safari), multiple versions of those browsers, multiple operating systems and versions upon which these browsers run (Windows, Mac, Linux), and multiple hardware platforms that run all this software (PCs, tablets, smartphones, etc.). Changes in *any* of these have a tiny but real potential to impact how a website behaves.
  2. Software updates, e.g. security patches. These can, and typically need to, be applied directly to your site and/or to the software environment around it (e.g. to the web server, operating system, database, or programming language used by your site). Apparently innocent and well-intentioned updates to any of these components have the slight but real potential to break your site to some degree or another.
  3. User error. Today’s Content Management Systems (e.g. WordPress, Drupal) are powerful things! They can give potentially inexperienced users the ability to manipulate websites in a multitude of ways, i.e. far more than just managing text content. Despite the best of intentions, a CMS user might make a change to a site, unaware that it has unintended consequences for other parts (e.g. functionality) of the site. And sometimes it can be very challenging for a support person to trace the problem back to the actions of the CMS user. Having too many administrative users increases the potential for errors on the site. We suggest limiting the administrative roles and permissions to one or two key people, with more ‘basic’ roles allocated to staff who only need to edit content etc.
  4. Changes/updates to 3rd party systems or software. Perhaps that ‘Contact Us’ form on your site is working just fine, but the mail server that receives and stores your email is misbehaving. Or maybe your site integrates with Twitter or Facebook, but Twitter or Facebook just made a fundamental change to the rules by which sites are allowed to integrate with them. Or maybe the 3rd party service that processes credit card payments on your site is experiencing technical difficulties. Or that page on another website that you were linking to from your site is broken because the other site just had a major update with lots of renamed pages.
  5. Hardware failure, e.g. errors on a hard drive causing intermittent glitches.
  6. Computer viruses. A computer virus on a PC can make a website do very strange things.
  7. Pop-up blockers, or other software specific to the IT environment of the website visitor. So things like anti-virus or parental control software (e.g. Net Nanny) can cause apparent problems on a website – such as making it appear completely unavailable – but without the computer user understanding why.
  8. Firewalls. Some organisations – typically government – have very “locked down” IT environments. A website operates perfectly when viewed outside the organisation, but misbehaves when viewed inside the organisation.
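
For the more technically minded, the broken-link scenario in point 4 is one of the easier ones to keep an eye on. The Python sketch below is only an illustration, not a finished tool: it lists the links on a page that point to other websites, so that they can be re-checked from time to time. The names `LinkCollector` and `external_links` are made up for this example, and it uses only Python's standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_links(html, page_url):
    """Return absolute http(s) URLs that point to a different host than page_url."""
    collector = LinkCollector()
    collector.feed(html)
    own_host = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links like "/about"
        parsed = urlparse(absolute)
        # Skip mailto:, tel:, etc., and anything on our own site.
        if parsed.scheme in ("http", "https") and parsed.netloc != own_host:
            links.append(absolute)
    return links
```

Given the HTML of one of your pages and that page's address, `external_links` returns the outbound links, which are exactly the ones that can break without anyone touching your site.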

What’s the solution?

In the first instance it’s about education so that people have a better understanding of the challenges.

Thereafter it’s about the speed and efficiency with which issues are detected, reported and resolved. The detection and reporting of issues can be automated up to a point, but we’ll always want and need humans to assist with this. Therefore it’s important to encourage and welcome issue reporting, and to make the process as painless and useful as possible.
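
As a rough idea of what automated detection can look like, here is a minimal Python sketch that requests a list of pages and reports any that fail to respond or that return an error status. It is an assumption-laden example rather than a recommended monitoring setup: the function names `fetch_status` and `check_pages`, the 10 second timeout and the `User-Agent` string are all invented for illustration. It also only catches pages that are outright unavailable; a form that loads but silently fails to send email would still need a human (or a more thorough test) to notice.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response was received."""
    request = urllib.request.Request(url, headers={"User-Agent": "site-check-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 500.
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar.
        return None


def check_pages(urls, fetch=fetch_status):
    """Return a list of (url, problem) tuples for every page that looks broken."""
    problems = []
    for url in urls:
        status = fetch(url)
        if status is None:
            problems.append((url, "no response"))
        elif status >= 400:
            problems.append((url, f"HTTP {status}"))
    return problems
```

Run on a schedule against your most important pages, something like `check_pages` could email its results to whoever looks after the site. The `fetch` parameter exists so the logic can be tried out without touching the network.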

And whatever is learned from each experience should be used to build more robust, fault-tolerant systems in the future.