This shows you the differences between two versions of the page.
resource:sc_lab_info:sclab:boot [2016/04/12 22:00] rim created |
resource:sc_lab_info:sclab:boot [2023/02/15 12:46] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Booting SC Lab Machines ====== | ||
- | In case anyone needs to boot the machines, here is what needs to be done: | ||
- | |||
- | ====== SC ====== | ||
- | sc is the top machine in the rack | ||
- | - Check that the external backup disk is turned on (it might not restart after a crash) - sc can boot without the backup disk, but it won't be able to back up properly if the disk is only turned on after sc has booted | ||
- | - Boot the machine (this can take 15 minutes, and for the first 5 minutes there may not be anything showing on the screen while it does its memory check) | ||
- | - Restart scteach (preferably by logging into sc on an nx console and running scteach in that) - normally, Bob will look after this | ||
- | |||
- | ====== SC1 ====== | ||
- | sc1 is the second machine from the bottom of the rack | ||
- | - Turn on the external RAID array (the bottom machine in the rack) - sc1 _cannot boot without this_. | ||
- | - Check that the external backup disk is turned on (it might not restart after a crash) - sc1 can boot without the backup disk, but it won't be able to back up properly if the disk is only turned on after sc1 has booted | ||
- | - Turn on at least one blade (the blade won't boot properly at this point, but if none are turned on, sc1's cluster control software won't come up properly) | ||
- | - Boot the machine (this can take a long time, though not as long as sc) | ||
- | - If, after about ten minutes, the system is still syncing the RAID disks (all the RAID lights are flashing), then the boot has probably failed, and you should reboot | ||
- | - This problem might have been due to a flaky disk that eventually failed completely and had to be replaced, but we aren't completely sure | ||
- | - Boot or reboot the blades once sc1 is running fully. You should do this even if pbsmon shows them as running. | ||
- | - Run pbsmon (in the education menu on sc1) to check that all the blades boot properly | ||
- | - Restart sc1a (preferably by logging into sc on an nx console and running scteach in that) - normally, Bob will look after this | ||
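
The sc1 ordering above can be sketched as a small checklist helper. This is only an illustrative sketch: the step keys, `SC1_BOOT_STEPS`, `next_step`, and `can_boot_sc1` are hypothetical names that do not exist on the lab machines, and the code does not talk to any hardware or to pbsmon. It only records which steps come first and which ones sc1 depends on before it is booted.

```python
# Hypothetical checklist for the sc1 boot order described above.
# Step keys and wording are illustrative; nothing here touches real hardware.
SC1_BOOT_STEPS = [
    ("raid_on", "Turn on the external RAID array"),
    ("backup_disk_on", "Check that the external backup disk is turned on"),
    ("one_blade_on", "Turn on at least one blade"),
    ("boot_sc1", "Boot sc1"),
    ("reboot_blades", "Boot or reboot the blades once sc1 is running fully"),
    ("check_pbsmon", "Run pbsmon to check that all the blades booted"),
]

# Steps sc1 depends on before booting: it cannot boot without the RAID array,
# and its cluster control software needs at least one blade turned on.
# The backup disk is deliberately absent: sc1 can boot without it.
REQUIRED_BEFORE_BOOT = {"raid_on", "one_blade_on"}


def next_step(done):
    """Return the key of the first step not yet completed, or None if all are done."""
    for key, _description in SC1_BOOT_STEPS:
        if key not in done:
            return key
    return None


def can_boot_sc1(done):
    """Return True once every step that sc1 depends on has been completed."""
    return REQUIRED_BEFORE_BOOT <= set(done)


if __name__ == "__main__":
    done = set()
    for key, description in SC1_BOOT_STEPS:
        if key == "boot_sc1" and not can_boot_sc1(done):
            raise SystemExit("RAID array and at least one blade must be on first")
        print(f"[ ] {description}")
        done.add(key)
```

Keeping the prerequisites in a separate set from the ordered list keeps the distinction visible between steps that are merely recommended before boot (the backup disk) and steps that sc1 depends on.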