How this site broke and got back online

2018-05-31

in Geek stuff, Internet matters

The world is now full of technology that needs regular software updates to fix security vulnerabilities as they are publicly reported. This includes all of your computers (including cell phones, smart devices like TVs and sensors, and network equipment like routers). It definitely applies to website content management systems (CMS) like WordPress.

That’s why when WordPress 4.9.4 was released in February, my hosting provider DreamHost implemented my ongoing instruction to automatically update the software.

How WordPress works

For those who don’t know, WordPress stores posts, comments, and all sorts of other things inside a database based on open-source software called MySQL. The other big piece of WordPress is the programming language PHP.

You can think of the MySQL database as where WordPress stores everything it knows, and PHP as the machinery that lets WordPress operate and serve up what you ask for. You might think of websites as being like newspapers: all set up and formatted before you have anything to do with them. Actually, modern websites are created dynamically as your web browser talks to the server and the software on the server makes decisions about what to send you.

For example, consider the web address:

https://www.sindark.com/page/50/

WordPress is set up to show a certain number of posts per page, and then to allow users to scroll back through older pages if they wish.

When your web browser visits https://www.sindark.com/page/50/ some pretty complicated stuff happens on the server side. It works out how many posts there should be on each page, works out what should be on the 50th page, goes into the MySQL database to find the titles and contents of posts, as well as their authors and the number of comments on them, and then it puts together an HTML page which your browser displays.
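The "works out what should be on the 50th page" step can be sketched roughly as follows. This is a simplified illustration and not WordPress's actual code: the function names, the figure of 10 posts per page, and the list of made-up post titles are all assumptions for the sake of the example.

```python
def page_window(page, posts_per_page=10):
    """Return (offset, limit) for a 1-indexed page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    # Skip everything that belongs to the earlier pages.
    offset = (page - 1) * posts_per_page
    return offset, posts_per_page


def posts_for_page(all_posts, page, posts_per_page=10):
    """Pick out the slice of posts that belongs on one page."""
    offset, limit = page_window(page, posts_per_page)
    return all_posts[offset:offset + limit]


# Hypothetical data: 1,000 post titles, newest first.
posts = [f"Post {n}" for n in range(1000, 0, -1)]
page_50 = posts_for_page(posts, 50)
print(page_50[0], "...", page_50[-1])  # Post 510 ... Post 501
```

A real server would hand the same two numbers to the database (something like a LIMIT and an OFFSET in the query) rather than loading every post first.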

Exactly how everything looks visually in WordPress depends a lot on themes. These are collections of files that tell WordPress what typefaces to use, where to locate design elements, what to show on each page, and more.

For years this site used a premium paid theme called Thesis. Specifically, it used the latest version of Thesis 1. Sometime around 2012, Thesis 2 was released but, whereas Thesis 1 allowed non-expert website operators to set up the look they wanted with simple menus, in my opinion Thesis 2 doesn’t help all that much in designing a look from scratch and requires essentially a web designer’s capabilities to use.

So, the site was using what was arguably an antiquated theme before the WordPress 4.9.4 update was installed.

How the site broke

For someone as non-expert as me, a big piece of software like WordPress is like a space station. It’s complicated and I don’t begin to understand how most of it works, but I can see when things have gone badly wrong, such as when the station modules are full of smoke or, worse, literally nothing.

WordPress themes store information in the MySQL database, such as the location and content of menus.

It’s not that Thesis 1.8.9 (the latest version of Thesis 1) is incompatible with WordPress 4.9.4. My old climate-focused site BuryCoal still uses the theme and upgraded just fine, as did my professional photography site durablepigments.com.

Computers make a lot fewer mistakes than humans, but they do happen. A file you download can have some of its contents incorrectly transmitted, and a processor can perform an operation incorrectly on data. Of course, bugs in software can produce errors too.

First, some kind of error broke the back-end system that allows a WordPress site operator to create new posts, manage comments, change how the site looks, and so on. At that stage, all the visitor-facing parts of the site still looked normal. I just couldn’t manage the site from my end as usual.

I put in some effort trying to fix the site, eventually leading to it going down completely. This highlighted the importance of not allowing my ignorance and DreamHost’s limits to permanently wreck the old database. It had problems that kept the site from working, but it was still a good copy of all my posts and all your comments.

Fixing the site

Job number one was to avoid destroying all the years’ worth of content on the site. Tinkering with the MySQL database, undertaken by an absolute non-expert, carried a considerable risk.

This site is hosted using the least expensive plan DreamHost provides, which is called shared hosting. The name is a little misleading, because even sites on more expensive plans “share” the computer server where they operate with other sites. Those higher-end plans, however, promise you a certain amount of resources like RAM. On shared hosting, an unknown number of sites are all sharing those resources which, among other things, makes it possible for a big jump in popularity on some totally unrelated site to slow down yours.

Shared hosting has other limitations. Crucially, in this case, DreamHost limits which tools you can use to work with your MySQL databases. Through their website they provide a tool called phpMyAdmin which theoretically lets you do things like modify the content of databases, export their contents, and import contents into a new blank database.

Unfortunately, phpMyAdmin suffers from one huge limitation that crops up commonly in shared hosting. If you ask the server to handle too much data, it gets overwhelmed and gives up. This happens to me constantly when I try to upload photos to the site (indeed, that frustration is a big reason I have been considering leaving shared hosting and/or DreamHost). For a site with as many posts and comments as this one, a lot of what I read online suggested that this could be a problem. One major alternative, copying the database using Secure Shell (SSH), isn’t allowed for shared hosting users.

At the beginning of March, I was struggling with efforts to make a copy of my MySQL database to tinker with without risk of breaking the original.

There’s actually a bigger problem, though. Think for a moment about a typewriter. It has all the letters of the alphabet, punctuation, and probably some special symbols like & and ^. With computers, there are different character encodings which similarly include letters and symbols. A basic one, ASCII from 1963, doesn’t handle much more than the typewriter. It basically includes Arabic numerals 0–9, upper and lowercase letters from the English alphabet, and standard punctuation.

But people use computers in languages other than English which include diacritical marks and characters not used in the English alphabet. People also use special punctuation marks like en dashes and em dashes. Partly for these reasons, Unicode was developed starting in the late 1980s, eventually allowing people to use all sorts of characters. WordPress, like many computer systems, now uses UTF-8, a character encoding that can represent every Unicode character.
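To make the difference concrete, here is a small sketch showing how a few characters are stored. The sample characters and function names are my own choices for illustration; the only thing relied on is Python's built-in UTF-8 and ASCII handling.

```python
# Sample characters: a plain letter, an accented letter,
# the Greek capital delta, and an em dash (written as an escape).
samples = ["A", "\u00e9", "\u0394", "\u2014"]


def utf8_hex(ch):
    """The bytes used to store one character in UTF-8, as hex."""
    return ch.encode("utf-8").hex(" ")


def fits_in_ascii(ch):
    """True only if the character exists in plain ASCII."""
    return ch.isascii()


for ch in samples:
    code_point = "U+%04X" % ord(ch)
    print(repr(ch), code_point, utf8_hex(ch), fits_in_ascii(ch))
```

Only the plain "A" fits in ASCII and takes a single byte. The others need two or three bytes each in UTF-8, and those multi-byte sequences are exactly what gets mangled when software guesses the wrong encoding.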

To summarize: WordPress is software that helps you turn content like the text of blog posts into a website people can access. It stores that content in a MySQL database, and the content of that database is encoded using UTF-8.

This next bit is a little tricky and probably won’t have occurred to most web users. Using a system like UTF-8 can be risky in a variety of ways. For example, it contains characters from foreign alphabets which look indistinguishable from English letters but which are known to be different by computers. This could allow somebody, for instance, to register a website that looks visually like google.com but which is actually run by the person who made the site with the non-English characters.
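A minimal sketch of the lookalike problem, using a made-up example in which the two letters "o" in google.com are swapped for the Cyrillic letter that looks the same. The variable and function names are hypothetical; `unicodedata` is part of Python's standard library.

```python
import unicodedata

latin = "google.com"
# Hypothetical lookalike: both "o" characters are replaced with
# CYRILLIC SMALL LETTER O (U+043E), written here as escapes.
lookalike = "g\u043e\u043egle.com"


def non_ascii_report(name):
    """List (position, official Unicode name) for every non-ASCII character."""
    return [(i, unicodedata.name(ch)) for i, ch in enumerate(name) if not ch.isascii()]


print(latin == lookalike)           # False: the computer sees different characters
print(non_ascii_report(latin))      # []
print(non_ascii_report(lookalike))
```

On screen the two strings can look identical, depending on the typeface, but the comparison fails because the underlying code points differ.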

Importing content into a MySQL database can also go wrong when the exporting and importing sides disagree about the character encoding. In that case, phpMyAdmin will take certain non-ASCII characters and replace them with what looks like gibberish on import. So, the Greek letter delta (Δ) imported into phpMyAdmin becomes Δ and ‘smart’ quotes, which I hate because of these kinds of problems, but which the Thesis theme uses, turn into “ and â€.
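This kind of gibberish can be reproduced in a few lines. The sketch below assumes the cause is UTF-8 bytes being read back as Windows-1252 (which, as far as I can tell, is what MySQL means by "latin1"); the function names are made up for illustration.

```python
def garble(text):
    """Store text as UTF-8 bytes, then misread those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text):
    """Undo garble(): recover the original bytes and read them as UTF-8."""
    return text.encode("cp1252").decode("utf-8")


delta = "\u0394"       # Greek capital letter delta
left_quote = "\u201c"  # left 'smart' double quote

print(garble(delta))       # Δ
print(garble(left_quote))  # “

# As long as nothing was lost along the way, the damage is reversible.
assert repair(garble(delta)) == delta
assert repair(garble(left_quote)) == left_quote
```

The right-hand smart quote is a slightly different case: its third byte (0x9D) has no character assigned in Windows-1252, which may be why it tends to show up as a truncated two-character sequence.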

So, even when I succeeded in importing my old database into a new one (to be able to fix the site without risk of breaking the original), the new version contained many thousands of errors. I didn’t want to keep adding to a site full of errors, since I realized it should eventually be possible to get a properly copied database.


The fix

Anyway, it turned out that the pretty basic steps I had been asking DreamHost to use all along worked fine as soon as I found a customer service representative willing to read through and implement them.

I’m not the first person who had this problem with character encoding and phpMyAdmin. Early on I found a website called Orthogonal Thought which describes the problem and some ways to fix it.

Unfortunately, the fixes are done via SSH, which DreamHost doesn’t allow with MySQL for those on shared hosting. I had to get someone on DreamHost’s side to run these commands.

And so began an agonizing process of submitting customer service ‘tickets’, as requests for help are often called in the world of information technology. In each I tried to explain what the problem was, and in each I directed the tech support person to the post on Orthogonal Thought along with a request that they make a copy of my database with characters intact.

DreamHost tech support person after tech support person then did one of three things: refused to help because they thought this problem was something I should fix (despite how the necessary tools are denied to those on shared hosting), made a copy of the database where the character encoding was still broken, or made a copy of the database that somehow didn’t work with WordPress at all. In March, “John R” gave me the “not our problem” treatment, while the efforts of other tech support personnel yielded a set of unfixed databases through April and May.

I sought help from other forums and expressed my frustration on Twitter, leading to many messages from other web hosting companies explaining how bad DreamHost shared hosting is. In many cases, the people operating Twitter accounts for other hosting companies provided me with tech support via Twitter, trying to find ways to copy the database properly myself.

After months with the site down, in desperation I started tweeting at all the people who describe themselves as DreamHost employees in their Twitter bios, like @DreamHostBrett (whose Twitter handle is in their newsletters), “WordPress Core Developer” Mike Schroder, and “Product Marketing Manager” Jennifer Kay. None of them responded to me, but this prompted another round of exchanges with the DreamHost tech support Twitter account @DreamHostCare.

Finally, a day ago, one of their tech support people emailed me to say they had made a good copy of the database. Indeed, they finally had.

Aware that other people have had and will have this problem, I asked for the solution they used and was told by email: “Per our manager ‘I made sure to include --default-character-set=latin1’ and changed it to ‘changed latin1 to utf8’”. They had used one of the fixes from the blog post which I had been sending them all along.
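My reading of that email, which is an assumption and not something DreamHost confirmed in detail, is that the database was exported with mysqldump's --default-character-set=latin1 option so the stored bytes came out untouched, and then the "latin1" labels inside the exported file were changed to "utf8" before importing it. The second step might look something like the sketch below. The sample dump text, table name, and function name are illustrative only, not the real file.

```python
import re

# Illustrative fragment of a dump file, as might be produced by something like:
#   mysqldump --default-character-set=latin1 some_database > dump.sql
# (database name hypothetical). This is not the real dump.
SAMPLE_DUMP = """\
/*!40101 SET NAMES latin1 */;
CREATE TABLE `wp_posts` (
  `post_title` text NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
INSERT INTO `wp_posts` VALUES ('Delta: \u0394');
"""


def relabel_charset(dump_sql, old="latin1", new="utf8"):
    """Change only the charset declarations; the row data is left alone."""
    old_pat = re.escape(old)
    dump_sql = re.sub(rf"\bSET NAMES {old_pat}\b", f"SET NAMES {new}", dump_sql)
    dump_sql = re.sub(rf"\bCHARSET={old_pat}\b", f"CHARSET={new}", dump_sql)
    return dump_sql


print(relabel_charset(SAMPLE_DUMP))
```

The point of the design is that the text itself is never converted: only the labels telling MySQL how to interpret the bytes are corrected.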

There doesn’t seem to be much appetite at DreamHost for looking into and fixing problems with their customer service. That, plus all the site reliability problems that have cropped up due to shared hosting over the years, has me still searching for alternatives. Probably, I will test out another hosting provider with a site dedicated to my PhD research and move everything over there once I am confident it’s better.

I hope some people from DreamHost will read this and reflect on what it says about the effectiveness of their tech support. One huge problem is how every time a new ticket is created it seems to get randomly assigned to someone new who doesn’t understand the background to the problem. I have been told there is also no way to elevate the problem to the attention of a manager when it proves beyond the capabilities of the first-line tech support people. Unwillingness or inability to follow simple instructions has been the problem all along here, and I would like to hear that they have some intention of making things better.

If they want to credit me back for the nearly four months my site was down, I would be open to that too.


Milan June 9, 2018 at 1:37 pm

Here’s DreamHost complaining again that my site is crashing. Their story is that it is demanding too many resources for a site on shared hosting, but look at my pathetic traffic levels over the last year:

For comparison, here is the last ten years including times in Ottawa when I was a lot more popular:

If DreamHost shared hosting can’t handle 50-100 visitors per day, I think they need to change their advertising.
