I will try to elaborate a bit on what happened and why it took so long:
- scheduled a remote LARA system to look at the server console.
- got the LARA and shut down all virtual servers to check whether any BIOS settings were messed up (~5 minutes)
- after exiting the BIOS, the server refused to boot: it could no longer read the kernel image (~10 minutes)
- booted into the rescue system, checked the health of the RAID controller and noticed that one of the three discs that had caused the trouble earlier had finally given up (~15 minutes)
- called the datacenter and talked with a technician about how to proceed (~15 minutes)
- was instructed to gather the logs and reply to the open ticket to get the disc replaced (~5 minutes)
- waited for the datacenter guys to become active after they got the ticket (~45 minutes)
- waited for the server to come up again after being powered down for the disc exchange (~3 hours)
- got an email from the support guy saying that they had exchanged the disc and powered the server up again. Great, but no response, so something must be broken
- rebooted the server into the rescue system again (~5 minutes)
- saw that the root FS was broken and contained lots of errors (~5 minutes)
- uploaded 3 GB of backups over my slow home connection back onto the server's root disc to replace the faulty files (~2 hours)
- after restoring the backups, restarted the server: no response after the reboot (~5 minutes)
- issued another hardware reset and the system finally came up: 6 virtual machines started, each with at least one virtual HDD attached. Some machines required a reboot because their root partition was read-only for whatever reason. Fixed them one by one (~15 minutes)
- all running now, and all timed jobs kick in at once: the server builds the outstanding things, runs cron jobs, writes hundreds of emails -> server is slow (~20 minutes)
- all running smoothly now, caches filling up, speed increases
- database backup kicked in and brought down the machine (~15 minutes) because RAID performance is not great yet
Code:
State is now: OK (01:14h 2012.02.27)
State before: ERROR (was about 5 hours)
Service uptime: 99.8%
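As a rough sanity check on the numbers above, here is a small sketch of the uptime arithmetic. It assumes the monitor computes uptime as a plain ratio of up time to total time and that this outage was the only downtime in the measured window; the post does not state the window length, so the result is only an implied figure. The function name is made up for illustration.

```python
def implied_window_hours(downtime_hours, uptime_percent):
    """Length of the measurement window that would yield the given uptime.

    Assumption: uptime = (window - downtime) / window, with a single outage.
    """
    downtime_fraction = 1.0 - uptime_percent / 100.0
    return downtime_hours / downtime_fraction


# Values taken from the status output above: about 5 hours in ERROR, 99.8% uptime.
window = implied_window_hours(5.0, 99.8)
print(round(window), "hours, about", round(window / 24), "days")
```

Under those assumptions, 5 hours of downtime at 99.8% uptime corresponds to a window of roughly 2500 hours, or a bit over three months.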
And this is what a fatal disc problem looks like:

The RAID system is currently rebuilding, only 8% completed; it will take a few more days x|
Code:
Logical device Task:
Logical device : 0
Task ID : 100
Current operation : Rebuild
Status : In Progress
Priority : Low
Percentage complete : 8
Logical device Task:
Logical device : 1
Task ID : 101
Current operation : Rebuild
Status : In Progress
Priority : Low
Percentage complete : 22
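If you want to keep an eye on the rebuild without reading the output by hand, something like the following sketch could pull the percentages out of it. It assumes the controller output keeps the "key : value" layout shown above; the function names and the SAMPLE constant are only for illustration and are not part of any controller tool.

```python
# Sample text copied from the controller output above.
SAMPLE = """\
Logical device Task:
Logical device : 0
Task ID : 100
Current operation : Rebuild
Status : In Progress
Priority : Low
Percentage complete : 8
Logical device Task:
Logical device : 1
Task ID : 101
Current operation : Rebuild
Status : In Progress
Priority : Low
Percentage complete : 22
"""


def parse_tasks(text):
    """Split the output into one dict per 'Logical device Task:' block."""
    tasks = []
    for line in text.splitlines():
        line = line.strip()
        if line == "Logical device Task:":
            tasks.append({})  # a new task block starts here
            continue
        if " : " in line and tasks:
            key, value = line.split(" : ", 1)
            tasks[-1][key.strip()] = value.strip()
    return tasks


def rebuild_progress(text):
    """Map logical device number -> percent complete for running rebuilds."""
    return {
        int(task["Logical device"]): int(task["Percentage complete"])
        for task in parse_tasks(text)
        if task.get("Current operation") == "Rebuild"
    }


progress = rebuild_progress(SAMPLE)
print(progress)  # {0: 8, 1: 22}
```

Running this from a cron job and logging the result would give a simple history of how fast the rebuild is actually moving.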