
Thread: 27th of Feb. 2012: 5 hour downtime

  1. #1
    tdev


    Developer
    Join Date
    Apr 2007
    Location
    Germany
    Age
    29
    Posts
    10,444
    Blog Entries
    75
    Country: Germany

    27th of Feb. 2012: 5 hour downtime

    I will try to elaborate a bit on what happened and why it took so long:
    • scheduled a remote LARA system so I could look at the server console.
    • got the LARA and shut down all virtual servers to check whether some BIOS settings were screwed up (~5 minutes)
    • after exiting the BIOS, the server refused to boot: it could not read the kernel image anymore (~10 minutes)
    • booted into the rescue system, looked at the health of the RAID controller and noticed that one of the three discs that caused the trouble earlier had finally given up (~15 minutes)
    • called the datacenter and talked with a technician about how to proceed (~15 minutes)
    • was instructed to gather the logs and reply to the open ticket to get the disc replaced (~5 minutes)
    • waited for the datacenter guys to become active after they got the ticket (~45 minutes)
    • waited for the server to come up again after it was powered down for the disc exchange (~3 hours)
    • got an email from the support guy saying that they had exchanged the disc and powered the server up again. Great, but the server did not respond, so something had to be broken
    • rebooted the server into the rescue system again (~5 minutes)
    • saw that the root FS was broken and contained lots of errors (~5 minutes)
    • uploaded 3 GB of backups over my slow home connection back onto the server's root disc to replace the faulty files (~2 hours)
    • after the backups were restored, restarted the server: no response after reboot (~5 minutes)
    • issued another hardware reset and the system finally came up: 6 virtual machines started, each with at least one virtual HDD attached. Some machines required a reboot since their root partition was, for whatever reason, read-only. Fixed them one by one (~15 minutes)
    • everything running now, but all timed jobs kicked in at once: the server built the outstanding things, ran cron jobs and wrote hundreds of emails -> server was slow (~20 minutes)
    • all running smoothly now, caches filling up, speed increasing
    • the database backup kicked in and brought the machine down (~15 minutes) because RAID performance was not great yet
    Code:
    State is now:   OK       (01:14h 2012.02.27)
    State before:   ERROR    (was about 5 hours)
    Service uptime: 99.8%
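
    The per-step estimates from the list above can be added up with a few lines of Python. This is only a rough sketch: the step labels are shortened paraphrases, the two steps without an estimate are left out, and the steps are assumed to run strictly one after another.

```python
# Per-step estimates in minutes, copied from the list in the post.
# The labels are shortened paraphrases, not the original wording.
STEPS = [
    ("check BIOS via LARA", 5),
    ("server refuses to boot", 10),
    ("rescue system, RAID health check", 15),
    ("call with datacenter technician", 15),
    ("gather logs, reply to ticket", 5),
    ("wait for datacenter to react", 45),
    ("wait for disc exchange", 180),
    ("reboot into rescue system", 5),
    ("inspect broken root FS", 5),
    ("upload 3 GB of backups", 120),
    ("restart, no response", 5),
    ("hardware reset, fix virtual machines", 15),
    ("timed jobs catch up", 20),
    ("database backup stalls machine", 15),
]


def total_minutes(steps):
    """Sum the estimates, assuming the steps do not overlap."""
    return sum(minutes for _, minutes in steps)


def format_duration(minutes):
    """Render a minute count as 'H h MM min'."""
    hours, rest = divmod(minutes, 60)
    return f"{hours} h {rest:02d} min"


if __name__ == "__main__":
    total = total_minutes(STEPS)
    print(f"sum of step estimates: {total} min ({format_duration(total)})")
```

    The sum (460 minutes) is higher than the roughly 5 hours the monitor reports, so the estimates are clearly approximate, and some steps (for example the slow period after everything was running again) may not count as downtime from the monitor's point of view.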
    
    And this is what a fatal disc problem looks like:
    [Attached image: fubar.png]

    The RAID system is currently rebuilding, only 8% completed so far; it will take a few more days x|
    Code:
    Logical device Task:
       Logical device                 : 0
       Task ID                        : 100
       Current operation              : Rebuild
       Status                         : In Progress
       Priority                       : Low
       Percentage complete            : 8
    
    
    Logical device Task:
       Logical device                 : 1
       Task ID                        : 101
       Current operation              : Rebuild
       Status                         : In Progress
       Priority                       : Low
       Percentage complete            : 22
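
    To keep an eye on a rebuild like this without reading the output by hand, the status text can be parsed. The following is a minimal sketch that assumes the output always has the "Key : Value" layout shown above; the function names are made up for illustration, and the sample text is simply the output from this post.

```python
# Sample status text, copied from the output shown in the post.
SAMPLE = """\
Logical device Task:
   Logical device                 : 0
   Task ID                        : 100
   Current operation              : Rebuild
   Status                         : In Progress
   Priority                       : Low
   Percentage complete            : 8


Logical device Task:
   Logical device                 : 1
   Task ID                        : 101
   Current operation              : Rebuild
   Status                         : In Progress
   Priority                       : Low
   Percentage complete            : 22
"""


def parse_tasks(text):
    """Split the status text into one dict per 'Logical device Task:' block."""
    tasks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("Task:"):
            # A block header starts a new task record.
            current = {}
            tasks.append(current)
            continue
        if current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return tasks


def rebuild_progress(tasks):
    """Map logical device number to rebuild percentage."""
    return {
        int(task["Logical device"]): int(task["Percentage complete"])
        for task in tasks
        if task.get("Current operation") == "Rebuild"
    }


if __name__ == "__main__":
    for device, percent in rebuild_progress(parse_tasks(SAMPLE)).items():
        print(f"logical device {device}: {percent}% rebuilt")
```

    In practice the text would come from the RAID controller's status tool rather than a hard-coded string; how that tool is invoked is not shown in this post, so it is left out here.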
    Last edited by tdev; 02-27-12 at 12:10 AM.

  2. one user found this post helpful: rahulroy9202
  3. #2
    Nickster 7
    Join Date
    Nov 2010
    Location
    USA
    Posts
    1,017
    Country: United States

    Re: 27th of Feb. 2012: 5 hour downtime

    Oh. I wondered why RoR wasn't working!
    Truck driving skill: 11/10 Looking for good diesel engine sounds? Check out: http://www.youtube.com/user/frontiertruck

  4. #4
    anonymous1

    Join Date
    May 2011
    Posts
    2,883
    Country: United States

    Re: 27th of Feb. 2012: 5 hour downtime

    Thanks tdev, you are the best and never let us down.

  5. #5
    aljowen
    Join Date
    Jul 2009
    Location
    Inside your computer
    Posts
    893
    Country: UK

    Re: 27th of Feb. 2012: 5 hour downtime

    Well done on fixing everything. Nice to know what happened as well, even if I don't fully understand it.

  6. #6
    kevinmce

    Moderator
    Join Date
    May 2007
    Location
    Eindhoven, Nederland
    Posts
    1,352
    Country: Ireland

    Re: 27th of Feb. 2012: 5 hour downtime

    Nice job Tom, like a pro!
    It's all mad!

  7. #7
    Sushi
    Join Date
    Oct 2008
    Location
    SJ Sharks Nation, CA
    Age
    18
    Posts
    1,876
    Country: China

    Re: 27th of Feb. 2012: 5 hour downtime

    Good work. Too bad your Twitter exploded with people complaining about unexpected errors...

  8. one user found this post helpful: sputnik_1
  9. #8
    sputnik_1

    Join Date
    Oct 2008
    Posts
    741
    Country: Germany

    Re: 27th of Feb. 2012: 5 hour downtime

    Originally posted by Sushi:
        Good work. Too bad your Twitter exploded with people complaining about unexpected errors...
    Totally agree. In my opinion, at least one person should really apologize to tdev for his behavior there.
