Daily Log, 2010-11-03
Yesterday, one of our Amazon EC2 servers “spontaneously” rebooted, leaving the production environment of our new secret project in shambles. The problem was that we hadn’t yet written monit scripts that watched over if the different database and application servers were restarted as they should have been in such a case. So I spent today writing monit scripts for our unicorn servers and testing them. One thing leads to another, so I also did the same thing for our Postgres server. This in turn led to the realisation that Postgres 9.0.0 (which is current now) doesn’t read database files written by 9.0beta4 (which was installed due to the non-availability at that time of 9.0.0). So back to 9.0beta4, dumping all databases, re-installing 9.0.0, re-creating the databases (as per these instructions). Fun and joy!
Keith in the meantime continued to work on the Widget and did a lot of compatibility stuff to make it work on pages that already load jQuery etc. He now dynamically loads jQuery if it’s not on the page already. Our widget works better and better.
After lunch with an old colleague (we collaborated with him on Quint-Essenz.ch) we did some impromptu usability testing on our sign-up / registration process. In 5 minutes we had uncovered half a dozen flaws and one serious error. To be repeated (with other people)
And as the day ends, I’m still struggling with monit. Monit seems to behave totally erratic, not being able to start services (like postgresql), silently failing to start our unicorns (with the very little helpful message: “failed to start”). I think I have wasted enough time with this for now - might look into another solution tomorrow…