Why MozillaZine Went Down
Friday January 16th, 2004
Regular visitors will have noticed that MozillaZine vanished from the Internet for almost 24 hours, beginning at approximately 1:45am UTC on Friday (late Thursday USA time). The outage started when the server failed to come back up after a planned reboot following a software update. Further investigation by the support personnel of our host, EV1Servers.net (formerly Rackshack), revealed that the machine had experienced a kernel panic. The support engineers then installed a new hard drive formatted with EV1Servers.net's standard Linux setup (though ironically this ran into problems due to a failed IDE cable). We could then access the machine and rebuild our pre-update configuration. We successfully transferred all the important data from the old disk (including articles, forum messages, weblog posts and emails) and the site is now running as normal.
We've taken the opportunity to switch the server to Pacific Time (GMT -0800), providing better consistency with Mozilla Foundation tools such as Bugzilla and Bonsai. Previously the site used Central Time (GMT -0600) and, before that, Eastern Time (GMT -0500). However, it took us a couple of attempts to get it right, so there may be some time-related errors present (such as a few inaccurate forum post timestamps).
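For readers trying to reconcile timestamps across the zones mentioned above, here is a minimal Python sketch of the conversion. It is only an illustration, not the site's actual code: the fixed offsets are the ones quoted in the article, daylight saving time is ignored, and the names PACIFIC, CENTRAL, EASTERN, outage_start_utc and local_times are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed winter offsets as quoted in the article (assumption: no daylight saving).
PACIFIC = timezone(timedelta(hours=-8), "PST")
CENTRAL = timezone(timedelta(hours=-6), "CST")
EASTERN = timezone(timedelta(hours=-5), "EST")

# Approximate start of the outage, as stated in the article.
outage_start_utc = datetime(2004, 1, 16, 1, 45, tzinfo=timezone.utc)


def local_times(moment):
    """Return the same instant formatted in each zone the site has used."""
    return {
        tz.tzname(None): moment.astimezone(tz).strftime("%Y-%m-%d %H:%M")
        for tz in (PACIFIC, CENTRAL, EASTERN)
    }


for name, text in local_times(outage_start_utc).items():
    print(name, text)
```

In all three zones the outage start falls on January 15th, which is why the article describes it as late Thursday in USA time. A post stamped in Central Time reads two hours later than the same post stamped in Pacific Time.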
The disruption affected all services including email. If you sent any email to mozillazine.org addresses during the time the server was down, please resend, as it most likely did not get through. We at MozillaZine would like to apologise to all our visitors for any inconvenience caused by the downtime and wish to extend our gratitude to everyone who helped to get the site up and running again, especially James "Imajes" Cox.
Update: We think everything's working now but please read our latest outage update for more information.
#1 test from buff
Friday January 16th, 2004 4:24 AM
I think it is working. What happened? Server die?
Friday January 16th, 2004 4:32 AM
Nothing to see here. Move along.
#3 Article updates and pending reboot
Friday January 16th, 2004 4:54 PM
The two comments above this one were posted before the main article text went live (there was just a placeholder story before then). That was also before the clock was changed to Pacific Time.
Note that we expect a further brief period of downtime when the server is shut down to have its old disk removed.
#6 Re: Article updates and pending reboot
Friday January 16th, 2004 5:45 PM
"That was also before the clock was changed to Pacific Time."
Though come to think of it, the times on the posts still aren't PST even now. Hmm.
Friday January 16th, 2004 5:03 PM
Friday January 16th, 2004 5:04 PM
Quite a night. Being MozillaZine-less was sure interesting. I had to actually do other stuff... such as homework or cleaning my room. :)
Thanks, all, for getting this back up without too much heartache :)
#7 Where we are
Friday January 16th, 2004 6:08 PM
OK, so the server time change hasn't quite gone as expected. Post times still seem to be Central Standard Time (except the first two, which were made before the clock was reset) though the server time is Pacific Standard Time now.
I'm not convinced new mail is being delivered to mozillazine.org addresses. When we fix this, it's possible that all the mail sent while the server was down may start coming through. However, if you think you should get a reply and don't, please resend. Mail was also a problem when we first set the server up in the summer.
I think everything else works though.
#8 Re: Where we are
Friday January 16th, 2004 6:20 PM
"I think everything else works though."
Except weblog comments. And maybe some other things.
Friday January 16th, 2004 9:49 PM
The real cause was the millions of people simultaneously connecting to mz when they found out 1.6 was being released [/sarcasm]
Friday January 16th, 2004 9:56 PM
Like a legitimate "Denial of Service" attack?
#11 Re: Reply
Friday January 16th, 2004 10:20 PM
Nah, I was on IRC when it broke. Feels odd sleeping at night instead of prowling the MozillaZine forums though.
#12 Quick update
Saturday January 17th, 2004 2:23 AM
For those who were awake, we just completed our final reboot, and are back to normal for the most part. Blog comments are fixed. Time is now correct, other than main page articles, which are still using CST; we'll work on this tomorrow. We're going to be making everything on the site PST-based, rather than a mix of time zones. Mail is still not functional. It doesn't seem to be bouncing yet, so the machine may be receiving it and just not dropping it into our inboxes. If you emailed us, please hang in there; hopefully it hasn't been lost. We'll let you know when mail is up and running again, so you can go back to telling us you clicked the under 13 link when signing up for the forums ;P.
Thanks for your patience folks, it's been a long 2 days. After close to 40 hours of working on this, I'm finally going to get some sleep :).
#16 Re: Quick update
Saturday January 17th, 2004 2:47 PM
Mail is working now!
#13 IDE cable!?
Saturday January 17th, 2004 8:44 AM
"though ironically this ran into problems due to a failed IDE cable"
Though both SCSI and ATA disks are IDE (integrated drive electronics), ATA disks are what are commonly (wrongly) referred to as IDE disks. So, assuming the term here meant an ATA cable, what on earth are these people doing running a server with ATA disks?
Saturday January 17th, 2004 10:44 AM
Not to make any assumptions, but how do you know they aren't running RAID 1 ATA drives? SCSI on small servers is overrated as parallel ATA drives have nearly caught up in features to their SCSI counterparts. Also, parallel ATA drives use IDE connectors, so assuming they have parallel ATA drives, then the statement is true enough.
Saturday January 17th, 2004 1:17 PM
Best of luck to you all. I haven't personally found any problems with the site, though never say never.
#17 yehp ev1servers suck ass
Monday January 19th, 2004 1:35 PM
yehp, figure I'll get my reply in, lol. ev1servers just sucks ass, period. the company is a joke and they don't know left from right.. I wouldn't trust them with any servers... *goes back to his cave*