Network & Power Outage

Andem
#1
Dear Member,


As you have probably realised, Canadian Content has suffered a serious outage of over 30 hours, which has been linked to an explosion in the data centre where our Domain Name Services server is hosted. We also have a legacy server in the same building, on the same floor, which was affected by the outage.


The outage occurred at 7:29pm Eastern time on 31 May 08 and was caused by a serious fire in the main power supply base. Three walls were blown out, and we're thankful to report that nobody was injured. Fire safety personnel initially rejected the plan to switch to the backup power supply because the damage to the facility was too severe to give an immediate go-ahead. Our service provider has assured us that no hardware other than the power supply was damaged, and our site will be sustained by the backup power supply, which has now been approved by the fire chief.


We've never had an outage of such magnitude before, nor have we ever witnessed such damage. We have learned from this and are now working on having a 3rd and 4th DNS service set up, as well as a skeleton backup server with some features stripped, in case an event such as this occurs again.


Unfortunately, the redundancy we had set up in the past did not help us, as the “big mamma” server hosting our DNS also suffered the outage.


Rest assured, this was not brought on by a security breach but by a very real, serious explosion over the weekend.


A few grey hairs later, I'd like to welcome you back online to our web site. Hopefully a break from the Internet on Saturday night and during Sunday has given you some time to relax.


Christopher Walsh
Administration
(P.S. It could take half a day or more for everybody to be able to access the site again.)
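
The redundancy problem described above (every DNS server sitting in the one building that lost power) can be illustrated with a rough sketch. This is only an illustration: the nameserver names and site labels are made-up placeholders, not the real Canadian Content setup, and the function name is hypothetical.

```python
from collections import Counter


def sites_that_take_dns_down(nameserver_sites):
    """Return the sites whose loss would leave no nameserver reachable.

    nameserver_sites maps a nameserver name to the site hosting it.
    All names and sites used here are hypothetical placeholders.
    """
    per_site = Counter(nameserver_sites.values())
    total = len(nameserver_sites)
    # A site is a single point of failure if every nameserver lives there.
    return sorted(site for site, count in per_site.items() if count == total)


# Assumed "before" layout: both nameservers in the same data centre.
before = {"ns1": "site-a", "ns2": "site-a"}
# Assumed "after" layout: a 3rd and 4th DNS service hosted elsewhere.
after = {"ns1": "site-a", "ns2": "site-a", "ns3": "site-b", "ns4": "site-c"}

print(sites_that_take_dns_down(before))
print(sites_that_take_dns_down(after))
```

With the assumed "before" layout the one shared site is reported as a single point of failure; once extra nameservers sit in other sites, no single site can take DNS down on its own.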
 
hermanntrude
#2
wowee. three walls? what causes that kind of explosion? holy crap
 
Andem
#3
Quote: Originally Posted by hermanntrude View Post

wowee. three walls? what causes that kind of explosion? holy crap

That's what they've told me. More details have not been released, nor do I expect them to be, as it seems like negligence and hence bad PR. I'll keep you all updated, though, if I receive more details.
 
Zan
#4
I'm sure the last couple days were pretty rough for you Andem. Glad to see us back up.
 
Andem
#5
To be honest, today I took a walk around the city and read a book in the shade in a park near the Reichstag. There was absolutely nothing that I could do to remedy the situation.
 
Zan
#6
lol good for you. I didn't fare so well - my Sunday morning routine was shattered - with nothing to distract me I was forced to be.... *gasp*... productive!
 
Praxius
#7
I bet you it was TERRORISM!!!!



But glad to hear nobody was hurt... geez, I had to venture into my old forums for a bit there....
 
Kreskin
#8
Chris, I followed that link to the forum you sent about the outage. There were sure some ticked off people, and rightfully so considering some were online stores. When the smoke eventually clears it could be very costly for that company, considering the number of customers affected. I found it somewhat educational to watch that mess unfold. If it can happen to them, one of the biggest server management companies in the world, anything can happen on the internet. Those of us using smaller firms had better make sure we can react to a catastrophic problem at the average webhost. Webhosts and the people who use them should never take anything for granted.
 
MikeyDB
#9
We all believe that there's an unlimited supply of gasoline.....We all believe that nothing mankind can do to our global environment is of any concern....We believe that a communications system built as a defense against interruption as a side effect of war and armed hostilities is capable of mounting a defense against anything that happens.....

We believe that building huge war machines will bring peace....we believe that appearance is more important than fact.....
 
scratch
#10
Quote: Originally Posted by Zan View Post

lol good for you. I didn't fare so well - my Sunday morning routine was shattered - with nothing to distract me I was forced to be.... *gasp*... productive!

I was in withdrawal for a while but was able to clear up some e-mails and read the latest Cornwell book. The fact that no-one was hurt is a blessing.
 
#juan
#11
Glad it's back up and running, Chris. What was surprising was the length of time it was down, but hey, we're back up and running.....All is well.....
 
shadowshiv
#12
I am not sure how I missed this thread (I'll blame Zan as I know she'll forgive me).

I am glad that no one was hurt. I was beginning to think it was my computer again, but since I wasn't getting any email notification of posts, I was beginning to think that it was something else.

At least you were able to get a break, Chris.
 
Andem
#13
A little update: It seems the data centre staff are still struggling with something. I can't tell if it's only me, a server or two of ours, or their system which is having issues resolving domain names. From what I've been told, they are still running on generators so it *could* be a little bit of a bumpy road still ahead.

The secondary backups I've ordered are not yet up and running!
 
Andem
#14
Update:

I've just received word that we could experience another 6-8 hours of downtime on either Saturday, June 7th _or_ Saturday, June 14th to get us off of backup power and back onto the main power supply as the power supply and backup systems are restored to their original states.

Hoping our new backup server will be up by then, I will attempt to move over to the backup servers on the Thursday before and stay there until the work is done. I'm hoping this will be completely seamless.
 
dancing-loon
#15
Quote: Originally Posted by Andem View Post

Update: Hoping our new backup server will be up by then, I will attempt to move over to the backup servers on the Thursday before until the work is done. I'm hoping this will be completely seamless.

You are a genius, Chris! From the bottom of my heart I thank you for your dedication to our enjoyment.
 
darkbeaver
#16
I hope none of my work was damaged.
 
Dexter Sinister
#17
Ah, I wondered what was going on. I got timed out so often trying to connect to CC recently that I gave up for a few days, and began to wonder if the site had gone south permanently. My ISP seems to have had a lot of trouble at the same time, my email connectivity's been timing out a lot the last few days too, and I've just recently replaced the old wireless B router in my house with a new higher speed one. So many things happening at once... nice to know it's not me misconfiguring things. I hate it when I think I might have done something wrong and can't figure it out or fix it.
 
Andem
#18
Dexter: This site would never go down permanently without giving members the opportunity, well in advance, to collect anything they may have posted or to back up their private messages. Expect CC to be online for many years to come!
 
Dexter Sinister
#19
Quote: Originally Posted by Andem View Post

Dexter: This site would never go down permanently without giving members the opportunity, well in advance, to collect anything they may have posted or to back up their private messages.

Yeah, intellectually I knew that, but I'm sure you know how anxiety works. CC is important to me, I care, and when things I care about don't seem to be working, I worry about them. I'm not always entirely rational...
 
Andem
#20
Update:

We now have our forums set up on a backup server. We're now waiting on our backup DNS servers to become global and updated by ISPs around the world.

Further downtime should now be completely avoided. According to our data providers, a minimum downtime of 6-8 hours and a maximum of 48 hours will occur in that datacentre. We should avoid this entirely if we meet the deadline of Friday afternoon Central time, and our preparations are 99% complete.
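
For anyone curious what "waiting on DNS to become global" looks like from one machine, here is a minimal sketch. The hostname and IP address below are placeholders (a reserved test name and a documentation-range address), not our real records, and the function name is made up. It only shows what your own ISP's resolver currently answers, not what the rest of the world sees.

```python
import socket


def has_propagated(hostname, expected_ip, resolve=socket.gethostbyname):
    """Return True if hostname currently resolves to expected_ip from here.

    resolve can be swapped out for testing; by default the operating
    system's resolver is used via socket.gethostbyname.
    """
    try:
        return resolve(hostname) == expected_ip
    except OSError:
        # Lookup failures (socket.gaierror is an OSError) count as "not yet".
        return False


# Placeholder values: "forums.example.invalid" and 192.0.2.10 are assumptions.
print(has_propagated("forums.example.invalid", "192.0.2.10"))
```

Running the same check from several networks over a day or so gives a rough picture of how far the new records have spread.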
 
hermanntrude
#21
impressive stuff. well done
 
MikeyDB
#22
Don't know if this situation is connected to the power issues, but I've been getting a great deal more adware and spurious interruptions since the board was down then put back on line. Windows XP-Pro....(firewall on) doesn't seem up to the task of blocking this junk and my Google toolbar keeps telling me it's blocking ads....

I've picked up one trojan and deleted several adware bombardments.

One might reasonably speculate that after ten years and millions of patches and "updates" Microsoft Windows would be able to assure some level of security but hey...it's Windows...a lousy product that's been pushed to the top of the heap by companies that care more about hawking products than about being associated with a quality product. Just like everything else to emerge from the technological boom....lots of products, mostly junk, and little effort at quality of any kind.
 
shadowshiv
#23
Quote: Originally Posted by Andem View Post

Dexter: This site would never go down permanently without giving members the opportunity, well in advance, to collect anything they may have posted or to back up their private messages. Expect CC to be online for many years to come!

Statements like this are just one of the many reasons why I will be buying the first round.
 
Tonington
#24
Weird time stamp on some posts. I posted in one thread, and just before this posted in another, and the first post came back as the last one posted chronologically when I went to forum home. Quirky.
 
Andem
#25
Update:

We experienced some more downtime earlier today. They had some issues with routing in the data centre which screwed our DNS service. I was planning on getting us over to the backup server today, the day before the expected replacement of the power supply systems.

I've moved the forums over to another server as a backup so hopefully these issues are a thing of the past.

I'm thoroughly disappointed in the way this has been dealt with by the data centre staff.
 
