Yesterday, one of our Amazon EC2 servers “spontaneously” rebooted, leaving the production environment of our new secret project in shambles. The problem was that we hadn’t yet written monit scripts to check that the database and application servers came back up after a reboot, as they should have. So I spent today writing and testing monit scripts for our unicorn servers. One thing led to another, so I did the same for our Postgres server. This in turn led to the realisation that Postgres 9.0.0 (the current release) can’t read database files written by 9.0beta4 (which we had installed because 9.0.0 wasn’t available at the time). So it was back to 9.0beta4, dumping all databases, re-installing 9.0.0, and re-creating the databases (as per these instructions). Fun and joy!
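For the curious, the dump-and-restore dance can be sketched roughly like this. It is a minimal sketch, not necessarily what the linked instructions do: it assumes `pg_dumpall` and `psql` are on the PATH, that the superuser is called `postgres`, and that the dump goes to a file of your choosing. The function names and the prompt in the middle are made up for illustration.

```python
import subprocess


def dump_command(dump_file, user="postgres"):
    """Build the command that dumps every database in the cluster to one SQL file."""
    # Assumption: the superuser role is "postgres"; adjust to your setup.
    return ["pg_dumpall", "-U", user, "-f", str(dump_file)]


def restore_command(dump_file, user="postgres"):
    """Build the command that replays the SQL dump into a fresh cluster."""
    # psql connects to the default "postgres" database; the dump itself
    # contains the statements that re-create the other databases.
    return ["psql", "-U", user, "-f", str(dump_file), "postgres"]


def run(command):
    """Echo a command, run it, and stop on the first failure."""
    print("running:", " ".join(command))
    subprocess.run(command, check=True)


def migrate(dump_file):
    """Hypothetical end-to-end helper; the reinstall in the middle stays manual."""
    run(dump_command(dump_file))      # while the old 9.0beta4 server is running
    input("Install 9.0.0, initialise a new cluster, start it, then press Enter... ")
    run(restore_command(dump_file))   # against the new 9.0.0 server
```

Calling `migrate("/tmp/all_databases.sql")` (the path is just an example) would dump from the old server, wait while you swap versions, and then load everything back in.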
...