August 2018 Power Outage Plan

From CSCWiki

Revision as of 17:43, 16 July 2018

There is a planned power outage in MC from Tuesday, August 21 to Friday, August 24.

There is also a one-day outage in DC, which will complicate keeping services up during the entire outage.

Timeline

Before Monday, August 20

  • Complete plan for outage
  • Move equipment to DC (if necessary)
  • Take backups of LDAP and Kerberos, and download offsite

Monday, August 20

  • Shut down general-use computing services
  • Transfer computing services to redundant / temporary systems

After the outage

  • Begin restoring normal services

Systems

Mirror

TODO. Syscom is currently working with CSCF to identify a plan for mirror.

Website

The CSC website is a static site, and will be straightforward to maintain during the outage.

All user and club sites are hosted in home directories, which will be unavailable during the outage, so we will display an outage page (with a 503 status code) in their place.
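As a minimal sketch of the outage-page behaviour, the following Python handler answers every request with a 503 and a Retry-After header. It is illustrative only: the production web server would more likely do this in its own configuration, and the names `OutageHandler`, `make_server`, `OUTAGE_BODY` and the one-hour `RETRY_AFTER_SECONDS` value are assumptions, not part of the plan.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page body and retry hint; neither comes from the plan.
OUTAGE_BODY = (
    b"<html><body><h1>Service unavailable</h1>"
    b"<p>User and club sites are offline during the power outage.</p>"
    b"</body></html>"
)
RETRY_AFTER_SECONDS = 3600


class OutageHandler(BaseHTTPRequestHandler):
    """Answer every GET or HEAD request with a 503 outage page."""

    def _send_outage(self, include_body):
        self.send_response(503)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(OUTAGE_BODY)))
        # Retry-After tells well-behaved clients and crawlers to come back later.
        self.send_header("Retry-After", str(RETRY_AFTER_SECONDS))
        self.end_headers()
        if include_body:
            self.wfile.write(OUTAGE_BODY)

    def do_GET(self):
        self._send_outage(include_body=True)

    def do_HEAD(self):
        self._send_outage(include_body=False)

    def log_message(self, format, *args):
        # Silence per-request logging in this sketch.
        pass


def make_server(host="127.0.0.1", port=0):
    """Create the server; port 0 lets the OS pick a free port."""
    return HTTPServer((host, port), OutageHandler)
```

Calling `make_server(port=8080).serve_forever()` would serve the page locally. A 503 (rather than a 200 or 404) signals that the condition is temporary, so search engines should not drop the affected pages.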

Mail

Since the outage lasts most of a week, we need to maintain email services throughout. An initial plan by ztseguin and jxpryde:

  • rsync users' .forward, .procmailrc and .maildir to a local directory on mail, allowing mail to continue as expected

However, this requires:

  • Users' .procmailrc files must not reference scripts, programs, or other files in their home directories, since those will be unavailable
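The requirement above could be checked ahead of time by scanning each copied .procmailrc for lines that point back into the home directory. The sketch below is a coarse filter meant to produce a list for human review, not a procmail parser; the function name `find_home_references` and the example home path are hypothetical.

```python
import re

# Common ways a procmailrc line can point into a home directory:
# $HOME, ${HOME}, or a path beginning with ~/ .
HOME_PATTERN = re.compile(r"\$HOME\b|\$\{HOME\}|(?<![\w/])~/")


def find_home_references(procmailrc_text, home_dir):
    """Return (line_number, line) pairs that appear to reference the home directory.

    home_dir is the user's absolute home path (for example "/users/alice",
    a made-up value); lines containing it literally are flagged as well.
    Blank lines and comment lines are ignored.
    """
    hits = []
    for lineno, line in enumerate(procmailrc_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if HOME_PATTERN.search(stripped) or home_dir in stripped:
            hits.append((lineno, stripped))
    return hits
```

Flagged lines are not necessarily a problem (a rule may point at a file that the rsync step also copies), but each one is worth a look before the outage starts.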

Authentication

Authentication is located in both MC and PHY.

While the MC node is down, the PHY node can continue to answer authentication requests. However, updating membership and changing passwords will not be possible until the MC node is back up.

We may consider moving auth1 to DC for the outage.

DNS

CSC's DNS service is located in both MC and PHY.

We may consider moving the MC DNS node to DC, but this is not necessary to maintain services during the outage.

NOTE: The MC node is the master node, so we will need to ensure that the SOA record's expire value is long enough that the PHY node doesn't stop serving zones during the outage.
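The arithmetic behind the SOA concern can be sketched as follows. A secondary stops serving a zone once the SOA expire interval has passed without a successful refresh from the master, so the expire value must exceed the time the master is unreachable. The exact outage times, the one-day safety margin, and the names in the sample record are assumptions for illustration, and the parser only handles values written as plain seconds.

```python
from datetime import datetime, timedelta


def parse_soa_expire(soa_rdata):
    """Extract the expire field (in seconds) from SOA RDATA text.

    Expected field order: MNAME RNAME SERIAL REFRESH RETRY EXPIRE MINIMUM.
    Assumes every timer is written as a plain number of seconds.
    """
    fields = soa_rdata.split()
    if len(fields) != 7:
        raise ValueError("expected 7 SOA fields, got %d" % len(fields))
    return int(fields[5])


def expire_covers_outage(expire_seconds, outage_start, outage_end,
                         margin=timedelta(days=1)):
    """Return True if the secondary keeps serving for the whole outage.

    The margin allows for the secondary's last successful refresh having
    happened some time before the master actually went down.
    """
    required = (outage_end - outage_start) + margin
    return timedelta(seconds=expire_seconds) >= required


# Assumed outage window: start of Tuesday, August 21 to end of Friday, August 24.
OUTAGE_START = datetime(2018, 8, 21, 0, 0)
OUTAGE_END = datetime(2018, 8, 25, 0, 0)
```

For example, `expire_covers_outage(parse_soa_expire("ns1.example.org. hostmaster.example.org. 2018071601 3600 600 604800 3600"), OUTAGE_START, OUTAGE_END)` checks a hypothetical record with a seven-day expire against the assumed window.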

Additional Resources