New CSC Machine: Difference between revisions


Revision as of 21:23, 29 July 2022

Firmware Updates

Vendors such as Dell provide firmware updates that should be applied before putting new machines into service. Even if the machine's warranty has expired, security updates are still made available.

It is recommended to use the following sequence when updating firmware on the Dell PowerEdge servers ([1]):

  1. iDRAC
  2. Lifecycle Controller
  3. BIOS
  4. Diagnostics
  5. OS Driver Pack
  6. RAID
  7. NIC
  8. PSU
  9. CPLD
  10. Other update

Booting

  • Put the TFTP image in place (you may skip this if the same distribution/architecture pair has been installed before).

e.g. extract http://mirror.csclub.uwaterloo.ca/ubuntu/dists/oneiric/main/installer-amd64/current/images/netboot/netboot.tar.gz to caffeine:/srv/tftp/oneiric-amd64

  • Force network boot in the BIOS. This may be called "Legacy LAN" or other such cryptic things. If this doesn't work, boot from CD or USB instead.

It is preferred to use the "alternate" Ubuntu installer image, based on debian-installer, instead of the Ubiquity installer. This installer supports software RAID and LVM out of the box, and will generally make your life easier. If installing Debian, this is the usual installer, so don't sweat it.

  • Most of our newer servers (e.g. PowerEdge R815) need non-free firmware in order to boot. This means that if you are using a new netboot image, it is highly recommended to include the entire non-free firmware bundle in the boot image. See [2] for more information.
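The TFTP image step above can be sketched as a small shell helper. This is only an illustration: the function names are hypothetical, the URL pattern is generalized from the single oneiric/amd64 example, and /srv/tftp is assumed to be the TFTP root as on caffeine.

```shell
# Build the netboot tarball URL for a release/architecture pair,
# following the pattern of the oneiric/amd64 example above (assumption).
netboot_url() {
    dist="$1"; arch="$2"
    printf 'http://mirror.csclub.uwaterloo.ca/ubuntu/dists/%s/main/installer-%s/current/images/netboot/netboot.tar.gz\n' "$dist" "$arch"
}

# Download and unpack the image into <tftp_root>/<dist>-<arch>.
# tftp_root defaults to /srv/tftp (the path used on caffeine above).
install_netboot() {
    dist="$1"; arch="$2"; tftp_root="${3:-/srv/tftp}"
    dest="$tftp_root/$dist-$arch"
    mkdir -p "$dest" &&
        curl -fsSL "$(netboot_url "$dist" "$arch")" | tar -xz -C "$dest"
}
```

Running `install_netboot oneiric amd64` on the TFTP host would then populate /srv/tftp/oneiric-amd64.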

Installing

debian-installer

At least in expert mode, you can choose a custom mirror (top of the countries list) and give the path for mirror directly. This will make installation super-fast compared to installing from anywhere else.

Please install to LVM volumes, as this is our standard configuration on all machines where possible. It allows more flexible partitioning across available volumes. Since GRUB 2, even /boot may be on LVM; this is the preferred configuration for simplicity, except when legacy partitioning setups make this inconvenient.

You may enable unattended upgrades, but do not enable Canonical's remote management service or any such nonsense. This is mostly a straightforward Debian/Ubuntu install.

Ubiquity

Ubiquity is the Ubuntu GUI installer. To give it LVM support, run the following in the live session before starting the installer:

apt install lvm2

If you still can't see the partitions (even if lvscan sees them, but no devices exist), run vgscan and vgchange -ay as root. Now the partitioner should be able to see them. We prefer to use LVM for partitions. Since GRUB 2, even /boot may be on LVM; this is the preferred configuration for simplicity, except when legacy partitioning setups make this inconvenient.

After installing with Ubiquity, you must also add LVM support to the newly installed system, and in particular its initramfs.

mount /dev/vg0/root /mnt
mount /dev/sda1 /mnt/boot
chroot /mnt
apt install lvm2

You should see update-initramfs run as part of the package installation. Once it finishes, reboot.

After Installing

Add the machine's name to ~git/public/hosts.git, and run the ansible playbook (https://git.uwaterloo.ca/csc/playbooks/blob/master/update-hosts.yml) to distribute the updated hosts file to all machines.


apt

If you did not do so during installation, change all references in /etc/apt/sources.list to use mirror instead of the usual mirrors.

Also add support for the CSC packages. Add the following to /etc/apt/sources.list.d/csclub.list (or copy from another host):

deb http://debian.csclub.uwaterloo.ca/ <distribution> main contrib non-free
deb-src http://debian.csclub.uwaterloo.ca/ <distribution> main contrib non-free

You'll also need the CSC archive signing key (if curl is not installed, install it).

curl -s http://debian.csclub.uwaterloo.ca/csclub.asc | apt-key add -

You should now run apt-get update to reflect these changes.
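As a sketch, the two repository lines above can be generated for any distribution codename. The function name is hypothetical; the URL and components are taken from the lines above.

```shell
# Print the CSC repository entries for the given distribution codename.
csclub_sources_list() {
    dist="$1"
    for kind in deb deb-src; do
        printf '%s http://debian.csclub.uwaterloo.ca/ %s main contrib non-free\n' "$kind" "$dist"
    done
}

# Example (as root):
#   csclub_sources_list jessie > /etc/apt/sources.list.d/csclub.list
#   apt-get update
```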

For unattended upgrades in the future, install the unattended-upgrades package and copy /etc/apt/apt.conf from another host.

network

Note that inapt currently uninstalls NetworkManager, which is what Ubuntu uses by default to configure the network. Once this completes, open /etc/network/interfaces and set up a static networking configuration (otherwise, networking will not come back up on reboot). It should look something like this (NOTE: csc-storage is only for servers in the machine room):

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
        address 129.97.134.xxx
        netmask 255.255.255.0
        gateway 129.97.134.1

iface eth0 inet6 static
        address 2620:101:f000:4901:c5c::XXXX
        netmask 64
        gateway 2620:101:f000:4901::1
 
# csc-storage
auto eth0.530
iface eth0.530 inet static
        address 172.19.168.xxx
        netmask 255.255.255.224
        vlan-raw-device eth0

iface eth0.530 inet6 static
        address fd74:6b6a:8eca:4903:c5c::xx
        netmask 64

Keys

If this is a reinstall of an existing host, copy back the SSH host keys and /etc/krb5.keytab from its former incarnation. Otherwise, create a new Kerberos principal and copy the keytab over, as follows (run from the host in question):

kadmin -p sysadmin/admin   # or any other admin principal; the password for this one is the usual root password
addprinc -randkey host/[hostname].csclub.uwaterloo.ca
ktadd host/[hostname].csclub.uwaterloo.ca

This will generate a new principal (you can skip this step if one already exists) and add it to the local Kerberos keytab.
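The same steps can be run non-interactively with kadmin's -q option. This is a minimal sketch assuming the MIT Kerberos client tools; the function names are hypothetical, and each kadmin call will still prompt for the admin password.

```shell
# Print the host principal name for a short hostname.
host_principal() {
    printf 'host/%s.csclub.uwaterloo.ca\n' "$1"
}

# Create the principal, add it to the local keytab, then list the keytab
# so the new entry can be checked. Run as root on the new host.
create_host_keytab() {
    princ="$(host_principal "$1")"
    kadmin -p sysadmin/admin -q "addprinc -randkey $princ" &&
        kadmin -p sysadmin/admin -q "ktadd $princ" &&
        klist -k /etc/krb5.keytab
}
```

If the principal already exists, skip the addprinc step and run only the ktadd query.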

Also copy /etc/ssl/certs/GlobalSign_Intermediate_Root_SHA256_G2.pem from another host, as many of our services use a certificate issued by this CA.

Configuration

General

The following config files are needed to work in the CSC environment (examples given below for an office terminal; perhaps refer to another host if preferred).

/etc/nsswitch.conf

# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.

passwd:         files ldap
group:          files ldap
shadow:         files ldap
sudoers:        files ldap

hosts:          files dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

/etc/ldap/ldap.conf

# $OpenLDAP: pkg/ldap/libraries/libldap/ldap.conf,v 1.9 2000/09/04 19:57:01 kurt Exp $
#
# LDAP Defaults
#

# See ldap.conf(5) for details
# This file should be world readable but not world writable.

BASE   dc=csclub, dc=uwaterloo, dc=ca
URI     ldap://ldap1.csclub.uwaterloo.ca ldap://ldap2.csclub.uwaterloo.ca

SIZELIMIT      0

TLS_CACERT      /etc/ssl/certs/GlobalSign_Intermediate_Root_SHA256_G2.pem
TLS_CACERTFILE /etc/ssl/certs/GlobalSign_Intermediate_Root_SHA256_G2.pem

SUDOERS_BASE    ou=SUDOers,dc=csclub,dc=uwaterloo,dc=ca

Also make /etc/sudo-ldap.conf a symlink to the above file (/etc/ldap/ldap.conf). On Debian, install the sudo-ldap package as well.
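A minimal sketch of the symlink step follows. The function name and its optional root-prefix argument are hypothetical; the prefix is there only so the step can be tried against a scratch directory first.

```shell
# Point /etc/sudo-ldap.conf at /etc/ldap/ldap.conf.
# The optional argument is a directory prefix (empty means the real /etc).
link_sudo_ldap_conf() {
    root="${1:-}"
    ln -sfn /etc/ldap/ldap.conf "$root/etc/sudo-ldap.conf"
}

# Example (as root):
#   link_sudo_ldap_conf
```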

/etc/nslcd.conf

# /etc/nslcd.conf
# nslcd configuration file. See nslcd.conf(5)
# for details.

# The user and group nslcd should run as.
uid nslcd
gid nslcd

# The location at which the LDAP server(s) should be reachable.
uri ldap://ldap1.csclub.uwaterloo.ca
uri ldap://ldap2.csclub.uwaterloo.ca

# The search base that will be used for all queries.
base dc=csclub,dc=uwaterloo,dc=ca

# use the uniqueMember attribute for group membership
# (not applicable on Debian squeeze)
map group member uniqueMember

/etc/krb5.conf

[libdefaults]
        default_realm = CSCLUB.UWATERLOO.CA
        forwardable = true
        proxiable = true
        dns_lookup_kdc = false
        dns_lookup_realm = false

[realms]
        CSCLUB.UWATERLOO.CA = {
                kdc = kdc1.csclub.uwaterloo.ca
                kdc = kdc2.csclub.uwaterloo.ca
                admin_server = kadmin.csclub.uwaterloo.ca
        }
(rest omitted for brevity)

Update: allow_weak_crypto is basically a no-op in recent Kerberos versions - but this is not a problem as any linux kernel with version >= 2.6.38.2 can use any cipher available to the kernel to grab tickets from the KDC for the purpose of NFS sec=krb5. Notably, this means you can use ciphersuites less craptastic than des-cbc-crc (the only one that used to work prior to this kernel revision) for NFS sec=krb5 mounts. Therefore, allow_weak_crypto has been removed from /etc/krb5.conf on all our machines.

Furthermore, the lines dns_lookup_kdc and dns_lookup_realm have been added - they are needed to stop the KDC from throwing its arms in the air and giving up if IST's DNS servers ever explode - an event that has happened in the recent past far more often than I'd like it to.

Notably, allow_weak_crypto is currently needed to mount /users (/music and /scratch are sec=sys and thus will always mount, even when krb5 is down and/or broken). Otherwise, you will get a mysterious "permission denied" error (even though the server claims to have authenticated the mount successfully).

/etc/pam.d/common-account

#
# /etc/pam.d/common-account - authorization settings common to all services
#

# here are the per-package modules (the "Primary" block)
account        [success=1 new_authtok_reqd=done default=ignore]        pam_unix.so 
# here's the fallback if no module succeeds
account        requisite                       pam_deny.so
# prime the stack with a positive return value if there isn't one already;
# this avoids us returning an error just because nothing sets a success code
# since the modules above will each just jump around
account        required                        pam_permit.so
# and here are more per-package modules (the "Additional" block)
account        required                        pam_krb5.so minimum_uid=10000
# end of pam-auth-update config

# Make sure the user is up to date. System accounts and syscom are exempt.
account [success=2 default=ignore]     pam_succeed_if.so quiet uid < 10000
account [success=1 default=ignore]     pam_succeed_if.so quiet user ingroup syscom
account required        pam_csc.so

This file is notably different on syscom-only hosts. Look at an existing syscom-only host to see the difference.

Alter /etc/default/nfs-common to enable statd, and more importantly gssd (needed for Kerberos NFS mounts). Start both daemons manually for now.
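The edit to /etc/default/nfs-common can be sketched as below. It assumes the Debian/Ubuntu NEED_STATD and NEED_GSSD variables and GNU sed; the function name is hypothetical.

```shell
# Set NEED_STATD and NEED_GSSD to "yes", adding the lines if they are missing.
# The file defaults to /etc/default/nfs-common.
enable_nfs_daemons() {
    conf="${1:-/etc/default/nfs-common}"
    for var in NEED_STATD NEED_GSSD; do
        if grep -q "^$var=" "$conf"; then
            sed -i "s/^$var=.*/$var=yes/" "$conf"
        else
            printf '%s=yes\n' "$var" >> "$conf"
        fi
    done
}
```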

Add /users, /music and /scratch to /etc/fstab (as appropriate for the machine's role), create their mount points and mount them. Note that /music and /scratch are sec=sys whereas /users is sec=krb5. Exceptions are granted on a case-by-case basis for servers only; office terminals are always sec=krb5 for security reasons.

To allow single sign-on as root (primarily useful for pushing files to all machines simultaneously), put the following in /root/.k5login:

sysadmin/admin@CSCLUB.UWATERLOO.CA

Also copy the following files from another CSC host:

  • /etc/ssh/ssh_config and /etc/ssh/sshd_config (for single sign-on)
  • /etc/ssh/ssh_known_hosts (to remove hostkey warnings within our network)
  • /etc/hosts (for host tab completion and emergency name resolution)
  • /etc/resolv.conf (to use IST's nameservers and search csclub/uwaterloo domains. Only required if you are not using /etc/network/interfaces to configure DNS)

Display Manager

LightDM (with unity-greeter) is the current display manager of choice for CSC office terminals. Copy /etc/lightdm/lightdm.conf from another CSC machine to configure it properly. If kdm or another display manager gets installed, please ensure that you continue to choose LightDM as the default display manager.

Please leave AccountsService enabled, as LightDM and certain parts of the GNOME packages work better when it is available.

The Unity greeter configuration is now in gsettings. We currently have a novelty wallpaper configured. To configure this, copy /usr/local/share/backgrounds/tarkin.png from another machine and run:

sudo -u lightdm dbus-launch gsettings set com.canonical.unity-greeter background /usr/local/share/backgrounds/tarkin.png

User-Defined Session

For some reason, Ubuntu does not install a session file for a session that just launches whatever is in the user's ~/.xsession. To fix this, put the following into /usr/share/xsessions/xsession.desktop:

[Desktop Entry]
Name=User-defined session
Exec=/etc/X11/Xsession

Audio

On an office terminal, copy /etc/pulse/default.pa from another office terminal.

If this is to be the machine that actually plays audio (currently nullsleep), the setup is slightly more complicated. You'll need to set up MPD and PulseAudio to receive connections, and store the PulseAudio cookie in ~audio, with appropriate permissions so that only the audio group can access it. If this is a new audio machine, you'll also need to change default.pa on all office terminals to point to it.

Tweaks

On Ubuntu precise, even when gnome-keyring is uninstalled, it leaves a config file behind that causes error messages. Remove /etc/pkcs11/modules/gnome-keyring-module to fix this.

On Ubuntu saucy or newer, edit /etc/sysctl.d/10-magic-sysrq and change the value to 244.
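The SysRq tweak can be sketched as follows. It assumes the file contains a line of the form `kernel.sysrq = N` and that GNU sed is available; the function name is hypothetical. The file path is passed explicitly because the exact file name may differ between releases.

```shell
# Rewrite the kernel.sysrq setting in the given sysctl file.
set_sysrq() {
    value="$1"; conf="$2"
    sed -i "s/^kernel\.sysrq[[:space:]]*=.*/kernel.sysrq = $value/" "$conf"
}

# Example (as root), then load the new value without rebooting:
#   set_sysrq 244 /etc/sysctl.d/10-magic-sysrq
#   sysctl -p /etc/sysctl.d/10-magic-sysrq
```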

Records

You probably already created the host in the University IPAM system beforehand. If not, please do so.

Please also add the host to the Machine List here on the Wiki, and to /users/syscom/csc-machines (and csc-office-machines, if applicable).

Munin (System Monitoring)

If the new machine is not a container, you probably want to have it participate in the Munin cluster. Run apt-get install munin-node to install the monitoring client, then edit the file /etc/munin/munin-node.conf. Look for a line that says allow ^127\.0\.0\.1$ and add the following on a new line immediately below it: allow ^129\.97\.134\.51$ (this is the IP address for munin.csclub). Save the file, then /etc/init.d/munin-node restart and update-rc.d munin-node defaults.

Then, ssh into munin.csclub and edit the file /etc/munin/munin.conf and add the following lines to the end:

[NEW-MACHINE-NAME.csclub]
addr 129.97.134.###
use_node_name yes
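Both Munin edits can be sketched as shell helpers. The function names are hypothetical, and the first helper assumes munin-node.conf contains the stock `allow ^127\.0\.0\.1$` line exactly as quoted above. It is not idempotent, so run it once per machine.

```shell
# Insert the munin.csclub allow line directly below the localhost allow line.
allow_munin_master() {
    conf="$1"
    tmp="$(mktemp)" &&
        awk '{ print }
             $0 == "allow ^127\\.0\\.0\\.1$" { print "allow ^129\\.97\\.134\\.51$" }' "$conf" > "$tmp" &&
        cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Print the munin.conf stanza for a new machine (short hostname, IPv4 address).
munin_host_stanza() {
    printf '[%s.csclub]\naddr %s\nuse_node_name yes\n' "$1" "$2"
}

# Example (as root):
#   on the new machine:  allow_munin_master /etc/munin/munin-node.conf
#   on munin.csclub:     munin_host_stanza NEW-MACHINE-NAME 129.97.134.### >> /etc/munin/munin.conf
```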

Prometheus (System Monitoring)

We are currently using Prometheus to monitor our systems. On the new machine, install prometheus-node-exporter and stunnel.

Change /etc/default/prometheus-node-exporter to this:

ARGS="--web.listen-address=localhost:9101"

and start prometheus-node-exporter.service.

Then set up stunnel. Create /etc/stunnel/prometheus-node-exporter.conf with this content:

setuid = stunnel4
setgid = stunnel4
pid = /var/run/stunnel4/exporter.pid

debug = 7

[prometheus-node-exporter]
accept = 0.0.0.0:9100
connect = 127.0.0.1:9101
CAfile = /etc/stunnel/tls/server.crt
cert = /etc/stunnel/tls/node.crt
key = /etc/stunnel/tls/node.key
verifyPeer = yes

Copy node.crt, node.key and server.crt into the directory that the stunnel configuration above refers to (/etc/stunnel/tls), either from prometheus:/opt/prometheus/tls or from the same location on another machine.

Finally, start stunnel4.service.

If it's a new machine, you'll also need to add it to the list of monitored targets in prometheus:/opt/prometheus/prometheus.yml. Add it under a suitable label (or create a new label) in the 'node_exporter' job.

New Distribution

If you're adding a new distribution, there are a couple of steps you'll need to take to update the CSClub Debian repository on sodium-benzoate/mirror.

The steps to add a new Debian release (jessie in the examples) are as follows; modify as necessary:

Step 0: Create a GPG key

Use "gpg --gen-key" or something like that. Skip this if you already have one.

Step 1: Add to Uploaders

The /srv/debian/conf/uploaders file on mirror contains the list of people who can upload. Add your GPG key id to this file. Use "gpg --list-secret-keys" to find out the key ID. You also need to import your key into the mirror's gpg homedir as follows:

gpg --export $KEYID | sudo env GNUPGHOME=/srv/debian/gpg gpg --import

You only need to do this step once.

Step 2: Add Distro

Add a new section to /srv/debian/conf/distributions:

Origin: CSC
Label: Debian
Codename: jessie
Architectures: alpha amd64 i386 mips mipsel sparc powerpc armel source
Components: main contrib non-free
Uploaders: uploaders
Update: dell chrome
SignWith: yes
Log: jessie.log
 --changes notifier

And update the Allow line in /srv/debian/conf/incoming:

Allow: jessie>jessie oldstable>squeeze stable>wheezy lucid>lucid maverick>maverick oneiric>oneiric precise>precise quantal>quantal
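Updating the Allow line can be sketched as below. The function name is hypothetical and GNU sed is assumed. It prepends a `codename>codename` mapping, as in the jessie example above, and does nothing if the mapping is already present.

```shell
# Add "<codename>><codename>" to the Allow line of the incoming config.
# The file defaults to /srv/debian/conf/incoming.
add_incoming_allow() {
    codename="$1"; conf="${2:-/srv/debian/conf/incoming}"
    # Leave the file alone if the mapping is already listed.
    if grep -Eq "^Allow:.*[[:space:]]$codename>$codename([[:space:]]|\$)" "$conf"; then
        return 0
    fi
    sed -i "s/^Allow: */Allow: $codename>$codename /" "$conf"
}
```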

Step 3: Update from Sources

Run:

sudo env GNUPGHOME=/srv/debian/gpg /srv/debian/bin/rrr-update

If all went well you should see the new distribution listed at http://debian.csclub.uwaterloo.ca/dists/

Step 4: CSC Packages

Now that we've got our new distribution set up we need to generate our packages and have them uploaded. Namely, ceo and libpam-csc. For libpam-csc:

Get the package:

git clone https://git.csclub.uwaterloo.ca/public/libpam-csc.git
cd libpam-csc

Update change log:

EMAIL=[you]@csclub.uwaterloo.ca NAME="Your Name" dch -i

Update as necessary, e.g.:

libpam-csc (1.10jessie0) jessie; urgency=low

  * Packaging for jessie.

 -- Your Name <[you]@csclub.uwaterloo.ca>  Thu, 10 Oct 2013 22:08:48 -0400

Build! (You may need to install various build dependencies; the build will complain about any that are missing.)

debuild -kYOURKEYID

Yay, it built; now let's upload it to the repo. The build process will create a PACKAGE.changes file in the parent directory (replace PACKAGE with the actual package name).

Copy the dupload file from corn-syrup and dupload:

mv /etc/dupload /etc/dupload.bak
scp corn-syrup:/etc/dupload /etc/dupload
dupload libpam-csc_1.10jessie0_amd64.changes
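The .changes file name passed to dupload follows the package_version_arch pattern. The helpers below are a sketch with hypothetical function names. The version scheme (upstream version, then codename, then 0) is inferred from the single 1.10jessie0 example above and is an assumption.

```shell
# Compose a per-distribution version string, e.g. 1.10 + jessie -> 1.10jessie0.
csc_version() {
    printf '%s%s0\n' "$1" "$2"
}

# Compose the .changes file name that debuild leaves in the parent directory.
changes_file() {
    printf '%s_%s_%s.changes\n' "$1" "$2" "$3"
}

# Example:
#   dupload "$(changes_file libpam-csc "$(csc_version 1.10 jessie)" amd64)"
```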

Finally, log into mirror and run "sudo /srv/debian/bin/rrr-incoming". This is supposed to happen automatically every few minutes, but it is always faster to run it manually.

And you're done. For CEO, see https://git.csclub.uwaterloo.ca/public/pyceo/src/branch/master/PACKAGING.md