<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.csclub.uwaterloo.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Merenber</id>
	<title>CSCWiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.csclub.uwaterloo.ca/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Merenber"/>
	<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/Special:Contributions/Merenber"/>
	<updated>2026-05-14T07:35:19Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.44.5</generator>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=LDAP&amp;diff=5321</id>
		<title>LDAP</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=LDAP&amp;diff=5321"/>
		<updated>2025-01-17T01:36:01Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Changing a user&amp;#039;s username */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;We use [http://www.openldap.org/ OpenLDAP] for directory services. Our primary LDAP server is [[Machine_List#auth1|auth1]] and our secondary LDAP server is [[Machine_List#auth2|auth2]].&lt;br /&gt;
&lt;br /&gt;
=== ehashman&#039;s Guide to Setting up OpenLDAP on Debian ===&lt;br /&gt;
&lt;br /&gt;
Welcome to my nightmare.&lt;br /&gt;
&lt;br /&gt;
==== What is LDAP? ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;&#039;&#039;&#039;LDAP:&#039;&#039;&#039; Lightweight Directory Access Protocol&lt;br /&gt;
&lt;br /&gt;
An open, vendor-neutral, industry standard application protocol for accessing and maintaining distributed directory information services over an Internet Protocol (IP) network. — [https://en.wikipedia.org/wiki/Lightweight_Directory_Access_Protocol Wikipedia: LDAP]&lt;br /&gt;
&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
In this case, &amp;amp;quot;directory&amp;amp;quot; refers to the user directory, like on an old-school Rolodex. Many groups use LDAP to maintain their user directory, including the University (the &amp;amp;quot;WatIAM&amp;amp;quot; identity management system), the Computer Science Club, and even the UW Amateur Radio Club.&lt;br /&gt;
&lt;br /&gt;
This is a guide documenting how to set up LDAP on a Debian Linux system.&lt;br /&gt;
&lt;br /&gt;
==== First steps ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Ensure that openldap is installed on the machine:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# apt-get install slapd ldap-utils&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Debian will do a lot of magic and set up a skeleton LDAP server and get it running. We need to configure that further.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Let&#039;s set up logging before we forget. Create a log directory and log file under &amp;lt;code&amp;gt;/var/log&amp;lt;/code&amp;gt;:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# mkdir /var/log/ldap&lt;br /&gt;
# touch /var/log/ldap.log&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Set ownership correctly:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# chown openldap:openldap /var/log/ldap&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Set up rsyslog to dump the LDAP logs into &amp;lt;code&amp;gt;/var/log/ldap.log&amp;lt;/code&amp;gt; by adding the following lines:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# vim /etc/rsyslog.conf&lt;br /&gt;
...&lt;br /&gt;
# Grab ldap logs, don&#039;t duplicate in syslog&lt;br /&gt;
local4.*                        /var/log/ldap.log&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Set up log rotation for these by creating the file [https://git.uwaterloo.ca/wics/documentation/blob/master/ldap/logrotate.d.ldap &amp;lt;code&amp;gt;/etc/logrotate.d/ldap&amp;lt;/code&amp;gt;] with the following contents:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;/var/log/ldap/*log {&lt;br /&gt;
    weekly&lt;br /&gt;
    missingok&lt;br /&gt;
    rotate 1000&lt;br /&gt;
    compress&lt;br /&gt;
    delaycompress&lt;br /&gt;
    notifempty&lt;br /&gt;
    create 0640 openldap adm&lt;br /&gt;
    postrotate&lt;br /&gt;
        if [ -f /var/run/slapd/slapd.pid ]; then&lt;br /&gt;
            /etc/init.d/slapd restart &amp;amp;gt;/dev/null 2&amp;amp;gt;&amp;amp;amp;1&lt;br /&gt;
        fi&lt;br /&gt;
    endscript&lt;br /&gt;
}&lt;br /&gt;
&lt;br /&gt;
/var/log/ldap.log {&lt;br /&gt;
    weekly&lt;br /&gt;
    missingok&lt;br /&gt;
    rotate 24&lt;br /&gt;
    compress&lt;br /&gt;
    delaycompress&lt;br /&gt;
    notifempty&lt;br /&gt;
}&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;As of OpenLDAP 2.4, slapd doesn&#039;t actually create a config file for us. Apparently, this is a &amp;amp;quot;feature&amp;amp;quot;: the OpenLDAP maintainers think we should want to set this up via dynamic queries. We don&#039;t, so the first thing we need is our [https://git.uwaterloo.ca/wics/documentation/blob/master/ldap/slapd.conf &amp;lt;code&amp;gt;slapd.conf&amp;lt;/code&amp;gt;] file.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===== Building &amp;lt;code&amp;gt;slapd.conf&amp;lt;/code&amp;gt; from scratch =====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Get a copy to work with:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# scp uid@auth1.csclub.uwaterloo.ca:/etc/ldap/slapd.conf /etc/ldap/  ## you need CSC root for this&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;You&#039;ll want to comment out the TLS lines, and anything referring to Kerberos and access for now. You&#039;ll also want to comment out lines specifically referring to syscom and office staff.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Make sure you remove the reference to &amp;lt;code&amp;gt;nonMemberTerm&amp;lt;/code&amp;gt; as an index, as we&#039;re going to remove this field.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;You&#039;ll also need to generate a root password for the LDAP to bootstrap auth, like so:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# slappasswd&lt;br /&gt;
New password: &lt;br /&gt;
Re-enter new password:&lt;br /&gt;
{SSHA}longhash&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Add this line below &amp;lt;code&amp;gt;rootdn&amp;lt;/code&amp;gt; in the &amp;lt;code&amp;gt;slapd.conf&amp;lt;/code&amp;gt;:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;rootpw          {SSHA}longhash&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Now we want to edit all instances of &amp;amp;quot;csclub&amp;amp;quot; to be &amp;amp;quot;wics&amp;amp;quot; instead, e.g.:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;suffix     &amp;amp;quot;dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;&lt;br /&gt;
rootdn     &amp;amp;quot;cn=root,dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Next, we need to grab all the relevant schemas:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;scp -r uid@auth1.csclub.uwaterloo.ca:/etc/ldap/schema/ /tmp/schemas&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Use the include directives to help you find the ones you need. I noticed we were missing &amp;lt;code&amp;gt;sudo.schema&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;csc.schema&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;rfc2307bis.schema&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Open up the [https://git.uwaterloo.ca/wics/documentation/blob/master/ldap/csc.schema &amp;lt;code&amp;gt;csc.schema&amp;lt;/code&amp;gt;] for editing; we&#039;re not using it verbatim. Remove the attributes &amp;lt;code&amp;gt;studentid&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;nonMemberTerm&amp;lt;/code&amp;gt; and the objectclass &amp;lt;code&amp;gt;club&amp;lt;/code&amp;gt;. Also make sure you change the OID so we don&#039;t clash with the CSC. Because we didn&#039;t want to go through the process of requesting a [http://pen.iana.org/pen/PenApplication.page PEN number], we chose arbitrarily to use 26338, which belongs to IWICS Inc.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;We also need to can the auto-generated config files, so do that:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# rm -rf /etc/ldap/slapd.d/*&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Also nuke the auto-generated database:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# rm /var/lib/ldap/__db.*&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Configure the database:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# cp /usr/share/slapd/DB_CONFIG /var/lib/ldap/&lt;br /&gt;
# chown openldap:openldap /var/lib/ldap/DB_CONFIG &amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Now we can generate the new configuration files:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# slaptest -f /etc/ldap/slapd.conf -F /etc/ldap/slapd.d/&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;And ensure that the permissions are all set correctly, lest this break something:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# chown -R openldap:openldap /etc/ldap/slapd.d&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;If at this point you get a nasty error, such as&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;5657d4db hdb_db_open: database &amp;amp;quot;dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;: db_open(/var/lib/ldap/id2entry.bdb) failed: No such file or directory (2).&lt;br /&gt;
5657d4db backend_startup_one (type=hdb, suffix=&amp;amp;quot;dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;): bi_db_open failed! (2)&lt;br /&gt;
slap_startup failed (test would succeed using the -u switch)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;Just try restarting slapd, and see if that fixes the problem:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# service slapd stop&lt;br /&gt;
# service slapd start&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Congratulations! Your LDAP service is now configured and running.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Getting TLS Up and Running ====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Now that we have our LDAP service, we&#039;ll want to be able to serve encrypted traffic. This is especially important for any remote access, since binding to LDAP (i.e. sending it a password for auth) occurs over plaintext, and we don&#039;t want to leak our admin password.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Our first step is to copy our SSL certificates into the correct places. Public ones go into &amp;lt;code&amp;gt;/etc/ssl/certs/&amp;lt;/code&amp;gt; and private ones go into &amp;lt;code&amp;gt;/etc/ssl/private/&amp;lt;/code&amp;gt;.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Since the LDAP daemon needs to be able to read our private cert, we need to grant LDAP access to the private folder:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# chgrp openldap /etc/ssl/private &lt;br /&gt;
# chmod g+x /etc/ssl/private&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Next, uncomment the TLS-related settings in &amp;lt;code&amp;gt;slapd.conf&amp;lt;/code&amp;gt;. These are &amp;lt;code&amp;gt;TLSCertificateFile&amp;lt;/code&amp;gt; (the public cert), &amp;lt;code&amp;gt;TLSCertificateKeyFile&amp;lt;/code&amp;gt; (the private key), &amp;lt;code&amp;gt;TLSCACertificateFile&amp;lt;/code&amp;gt; (the intermediate CA cert), and &amp;lt;code&amp;gt;TLSVerifyClient&amp;lt;/code&amp;gt; (set to &amp;amp;quot;allow&amp;amp;quot;).&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# enable TLS connections&lt;br /&gt;
TLSCertificateFile      /etc/ssl/certs/wics-wildcard.crt&lt;br /&gt;
TLSCertificateKeyFile   /etc/ssl/private/wics-wildcard.key&lt;br /&gt;
&lt;br /&gt;
# enable TLS client authentication&lt;br /&gt;
TLSCACertificateFile    /etc/ssl/certs/GlobalSign_Intermediate_Root_SHA256_G2.pem&lt;br /&gt;
TLSVerifyClient         allow&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Update all your LDAP settings:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# rm -rf /etc/ldap/slapd.d/*&lt;br /&gt;
# slaptest -f /etc/ldap/slapd.conf -F /etc/ldap/slapd.d/&lt;br /&gt;
# chown -R openldap:openldap /etc/ldap/slapd.d&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;And last, ensure that LDAP will actually serve &amp;lt;code&amp;gt;ldaps://&amp;lt;/code&amp;gt; by modifying the init script variables in &amp;lt;code&amp;gt;/etc/default/&amp;lt;/code&amp;gt;:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# vim /etc/default/slapd&lt;br /&gt;
...&lt;br /&gt;
SLAPD_SERVICES=&amp;amp;quot;ldap:/// ldapi:/// ldaps:///&amp;amp;quot;&lt;br /&gt;
...&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Now you can restart the LDAP server:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# service slapd restart&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;And assuming this is successful, test to ensure LDAP is serving on port 636 for &amp;lt;code&amp;gt;ldaps://&amp;lt;/code&amp;gt;:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# netstat -ntaup&lt;br /&gt;
Active Internet connections (servers and established)&lt;br /&gt;
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name&lt;br /&gt;
tcp        0      0 0.0.0.0:389             0.0.0.0:*               LISTEN      22847/slapd     &lt;br /&gt;
tcp        0      0 0.0.0.0:636             0.0.0.0:*               LISTEN      22847/slapd &amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Populating the Database ====&lt;br /&gt;
&lt;br /&gt;
Now you&#039;ll need to start adding objects to the database. While we&#039;ll want to mostly do this programmatically, there are a few entries we&#039;ll need to bootstrap.&lt;br /&gt;
&lt;br /&gt;
===== Root Entries =====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Start by creating a file [https://git.uwaterloo.ca/wics/documentation/blob/master/ldap/tree.ldif &amp;lt;code&amp;gt;tree.ldif&amp;lt;/code&amp;gt;] to create a few necessary &amp;amp;quot;roots&amp;amp;quot; in our LDAP tree, with the contents:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;dn: dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: dcObject&lt;br /&gt;
objectClass: organization&lt;br /&gt;
o: Women in Computer Science&lt;br /&gt;
dc: wics&lt;br /&gt;
&lt;br /&gt;
dn: ou=People,dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: organizationalUnit&lt;br /&gt;
ou: People&lt;br /&gt;
&lt;br /&gt;
dn: ou=Group,dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: organizationalUnit&lt;br /&gt;
ou: Group&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Now attempt an LDAP add, using the password you set earlier:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# ldapadd -cxWD cn=root,dc=wics,dc=uwaterloo,dc=ca -f tree.ldif&lt;br /&gt;
Enter LDAP Password:&lt;br /&gt;
adding new entry &amp;amp;quot;dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;&lt;br /&gt;
&lt;br /&gt;
adding new entry &amp;amp;quot;ou=People,dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;&lt;br /&gt;
&lt;br /&gt;
adding new entry &amp;amp;quot;ou=Group,dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Test that everything turned out okay, by performing a query of the entire database:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# ldapsearch -x -h localhost&lt;br /&gt;
# extended LDIF&lt;br /&gt;
#&lt;br /&gt;
# LDAPv3&lt;br /&gt;
# base &amp;amp;lt;dc=wics,dc=uwaterloo,dc=ca&amp;amp;gt; (default) with scope subtree&lt;br /&gt;
# filter: (objectclass=*)&lt;br /&gt;
# requesting: ALL&lt;br /&gt;
#&lt;br /&gt;
&lt;br /&gt;
# wics.uwaterloo.ca&lt;br /&gt;
dn: dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: dcObject&lt;br /&gt;
objectClass: organization&lt;br /&gt;
o: Women in Computer Science&lt;br /&gt;
dc: wics&lt;br /&gt;
&lt;br /&gt;
# People, wics.uwaterloo.ca&lt;br /&gt;
dn: ou=People,dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: organizationalUnit&lt;br /&gt;
ou: People&lt;br /&gt;
&lt;br /&gt;
# Group, wics.uwaterloo.ca&lt;br /&gt;
dn: ou=Group,dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: organizationalUnit&lt;br /&gt;
ou: Group&lt;br /&gt;
&lt;br /&gt;
# search result&lt;br /&gt;
search: 2&lt;br /&gt;
result: 0 Success&lt;br /&gt;
&lt;br /&gt;
# numResponses: 4&lt;br /&gt;
# numEntries: 3&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===== Users and Groups =====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Next, add users to track the current GID and UID. This will save us from querying the entire database every time we make a new user or group. Create this file, [https://git.uwaterloo.ca/wics/documentation/blob/master/ldap/nextxid.ldif &amp;lt;code&amp;gt;nextxid.ldif&amp;lt;/code&amp;gt;]:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;dn: uid=nextuid,ou=People,dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
cn: nextuid&lt;br /&gt;
objectClass: account&lt;br /&gt;
objectClass: posixAccount&lt;br /&gt;
objectClass: top&lt;br /&gt;
uidNumber: 20000&lt;br /&gt;
gidNumber: 20000&lt;br /&gt;
homeDirectory: /dev/null&lt;br /&gt;
&lt;br /&gt;
dn: cn=nextgid,ou=Group,dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: group&lt;br /&gt;
objectClass: posixGroup&lt;br /&gt;
objectClass: top&lt;br /&gt;
gidNumber: 10000&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;p&amp;gt;You&#039;ll see here that our first GID is 10000 and our first UID is 20000 (a query for reading these counters back appears after this list).&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Now add them, like you did with the roots of the tree:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# ldapadd -cxWD cn=root,dc=wics,dc=uwaterloo,dc=ca -f nextxid.ldif&lt;br /&gt;
Enter LDAP Password:&lt;br /&gt;
adding new entry &amp;amp;quot;uid=nextuid,ou=People,dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;&lt;br /&gt;
&lt;br /&gt;
adding new entry &amp;amp;quot;cn=nextgid,ou=Group,dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
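To read the counters back later, a quick query works (a usage example; anonymous read access is assumed, as in the earlier search):&lt;br /&gt;
&amp;lt;pre&amp;gt;# ldapsearch -x -h localhost -b uid=nextuid,ou=People,dc=wics,dc=uwaterloo,dc=ca uidNumber&amp;lt;/pre&amp;gt;&lt;br /&gt;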
&lt;br /&gt;
===== Special &amp;lt;code&amp;gt;sudo&amp;lt;/code&amp;gt; Entries =====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;We also need to add a sudoers OU with a defaults object for default sudo settings. We also need entries for syscom, such that members of the syscom group can use sudo on all hosts, and for termcom, whose members can use sudo on only the office terminals. Call this one [https://git.uwaterloo.ca/wics/documentation/blob/master/ldap/sudoers.ldif &amp;lt;code&amp;gt;sudoers.ldif&amp;lt;/code&amp;gt;]:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;dn: ou=SUDOers,dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: organizationalUnit&lt;br /&gt;
ou: SUDOers&lt;br /&gt;
&lt;br /&gt;
dn: cn=defaults,ou=SUDOers,dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: top&lt;br /&gt;
objectClass: sudoRole&lt;br /&gt;
cn: defaults&lt;br /&gt;
sudoOption: !lecture&lt;br /&gt;
sudoOption: env_reset&lt;br /&gt;
sudoOption: listpw=never&lt;br /&gt;
sudoOption: mailto=&amp;amp;quot;wics-sys@lists.uwaterloo.ca&amp;amp;quot;&lt;br /&gt;
sudoOption: shell_noargs&lt;br /&gt;
&lt;br /&gt;
dn: cn=%syscom,ou=SUDOers,dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: top&lt;br /&gt;
objectClass: sudoRole&lt;br /&gt;
cn: %syscom&lt;br /&gt;
sudoUser: %syscom&lt;br /&gt;
sudoHost: ALL&lt;br /&gt;
sudoCommand: ALL&lt;br /&gt;
sudoRunAsUser: ALL&lt;br /&gt;
&lt;br /&gt;
dn: cn=%termcom,ou=SUDOers,dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: top&lt;br /&gt;
objectClass: sudoRole&lt;br /&gt;
cn: %termcom&lt;br /&gt;
sudoUser: %termcom&lt;br /&gt;
sudoHost: honk&lt;br /&gt;
sudoHost: hiss&lt;br /&gt;
sudoHost: gosling&lt;br /&gt;
sudoCommand: ALL&lt;br /&gt;
sudoRunAsUser: ALL&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Now add them:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# ldapadd -cxWD cn=root,dc=wics,dc=uwaterloo,dc=ca -f sudoers.ldif&lt;br /&gt;
Enter LDAP Password:&lt;br /&gt;
adding new entry &amp;amp;quot;ou=SUDOers,dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;&lt;br /&gt;
&lt;br /&gt;
adding new entry &amp;amp;quot;cn=defaults,ou=SUDOers,dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;&lt;br /&gt;
&lt;br /&gt;
adding new entry &amp;amp;quot;cn=%syscom,ou=SUDOers,dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;&lt;br /&gt;
&lt;br /&gt;
adding new entry &amp;amp;quot;cn=%termcom,ou=SUDOers,dc=wics,dc=uwaterloo,dc=ca&amp;amp;quot;&amp;lt;/pre&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;Last, add some special local groups via [https://git.uwaterloo.ca/wics/documentation/blob/master/ldap/local-groups.ldif &amp;lt;code&amp;gt;local-groups.ldif&amp;lt;/code&amp;gt;]:&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;# ldapadd -cxWD cn=root,dc=wics,dc=uwaterloo,dc=ca -f local-groups.ldif&amp;lt;/pre&amp;gt;&lt;br /&gt;
The local groups are special because they are usually present on all systems, but we want to be able to add users to them at the LDAP level. For instance, the audio group controls access to sound equipment, and the adm group controls log read access. (A sketch of what such an entry might look like follows this list.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;p&amp;gt;That&#039;s all the entries we have to add manually! Now we can use software for the rest. See [[ceo|&amp;lt;code&amp;gt;ceo&amp;lt;/code&amp;gt;]] for more details.&amp;lt;/p&amp;gt;&amp;lt;/li&amp;gt;&amp;lt;/ul&amp;gt;&lt;br /&gt;
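For reference, an entry in &amp;lt;code&amp;gt;local-groups.ldif&amp;lt;/code&amp;gt; might look something like the sketch below (illustrative only, not taken from the real file; the GID must match the system group, and the member shown is hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;dn: cn=audio,ou=Group,dc=wics,dc=uwaterloo,dc=ca&lt;br /&gt;
objectClass: group&lt;br /&gt;
objectClass: posixGroup&lt;br /&gt;
objectClass: top&lt;br /&gt;
cn: audio&lt;br /&gt;
gidNumber: 29&lt;br /&gt;
memberUid: ctdalek&amp;lt;/pre&amp;gt;&lt;br /&gt;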
&lt;br /&gt;
&lt;br /&gt;
=== Querying LDAP ===&lt;br /&gt;
&lt;br /&gt;
There are many tools available for issuing LDAP queries. Queries should be issued to &amp;lt;tt&amp;gt;ldap1.csclub.uwaterloo.ca&amp;lt;/tt&amp;gt;. The search base you almost certainly want is &amp;lt;tt&amp;gt;dc=csclub,dc=uwaterloo,dc=ca&amp;lt;/tt&amp;gt;. Read access is available without authentication; [[Kerberos]] is used to authenticate commands which require it.&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
&lt;br /&gt;
 ldapsearch -x -h ldap1.csclub.uwaterloo.ca -b dc=csclub,dc=uwaterloo,dc=ca uid=ctdalek&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;tt&amp;gt;-x&amp;lt;/tt&amp;gt; option causes &amp;lt;tt&amp;gt;ldapsearch&amp;lt;/tt&amp;gt; to switch to simple authentication rather than trying to authenticate via SASL (which will fail if you do not have a Kerberos ticket).&lt;br /&gt;
&lt;br /&gt;
The University LDAP server (uwldap.uwaterloo.ca) can also be queried like this. Again, use &amp;quot;simple authentication&amp;quot; as read access is available (from on campus) without authentication. SASL authentication will fail without additional parameters.&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
&lt;br /&gt;
 ldapsearch -x -h uwldap.uwaterloo.ca -b dc=uwaterloo,dc=ca &amp;quot;cn=Prabhakar Ragde&amp;quot;&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
While &amp;lt;tt&amp;gt;ldap1.csclub.uwaterloo.ca&amp;lt;/tt&amp;gt; ([[Machine_List#auth1|auth1]]) is the LDAP master, an up-to-date replica is available on &amp;lt;tt&amp;gt;ldap2.csclub.uwaterloo.ca&amp;lt;/tt&amp;gt; ([[Machine_List#auth2|auth2]]).&lt;br /&gt;
&lt;br /&gt;
In order to replicate changes from the master, the slave maintains an authenticated connection to the master which provides it with full read access to all changes.&lt;br /&gt;
&lt;br /&gt;
Specifically, &amp;lt;tt&amp;gt;/etc/systemd/system/k5start-slapd.service&amp;lt;/tt&amp;gt; maintains an active Kerberos ticket for &amp;lt;tt&amp;gt;ldap/auth2.csclub.uwaterloo.ca@CSCLUB.UWATERLOO.CA&amp;lt;/tt&amp;gt; in &amp;lt;tt&amp;gt;/var/run/slapd/krb5cc&amp;lt;/tt&amp;gt;. This is then used to authenticate the slave to the master, which maps this principal to &amp;lt;tt&amp;gt;cn=ldap-slave,dc=csclub,dc=uwaterloo,dc=ca&amp;lt;/tt&amp;gt;, which in turn has full read privileges.&lt;br /&gt;
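For illustration, a minimal unit along these lines might look like the following (a sketch under assumptions, not the actual file on auth2; the keytab path and refresh interval are guesses):&lt;br /&gt;
&lt;br /&gt;
 [Unit]&lt;br /&gt;
 Description=k5start ticket cache for slapd replication&lt;br /&gt;
 Before=slapd.service&lt;br /&gt;
 &lt;br /&gt;
 [Service]&lt;br /&gt;
 # -U: take the principal from the keytab; -K 10: refresh every 10 minutes&lt;br /&gt;
 ExecStart=/usr/bin/k5start -U -f /etc/krb5.keytab -K 10 -o openldap -k /var/run/slapd/krb5cc&lt;br /&gt;
 Restart=always&lt;br /&gt;
 &lt;br /&gt;
 [Install]&lt;br /&gt;
 WantedBy=multi-user.target&lt;br /&gt;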
&lt;br /&gt;
In the event of master failure, all hosts should fail LDAP reads seamlessly over to the slave.&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Modifying LDAP entry ===&lt;br /&gt;
&lt;br /&gt;
Editing entries can easily be done with &amp;lt;code&amp;gt;ldapvi&amp;lt;/code&amp;gt;. First search for the entry using &amp;lt;code&amp;gt;ldapsearch&amp;lt;/code&amp;gt; as above, then change &amp;lt;code&amp;gt;ldapsearch -x&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;ldapvi -Y GSSAPI&amp;lt;/code&amp;gt; in the same command to make your edits.&lt;br /&gt;
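For example, reusing the query from above:&lt;br /&gt;
&lt;br /&gt;
 ldapvi -Y GSSAPI -h ldap1.csclub.uwaterloo.ca -b dc=csclub,dc=uwaterloo,dc=ca uid=ctdalek&lt;br /&gt;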
&lt;br /&gt;
Note that if your &amp;lt;tt&amp;gt;EDITOR&amp;lt;/tt&amp;gt; environment variable is set to something not available, it will produce errors like&lt;br /&gt;
&lt;br /&gt;
 error (misc.c line 180): No such file or directory&lt;br /&gt;
 editor died&lt;br /&gt;
 error (ldapvi.c line 83): No such file or directory&lt;br /&gt;
&lt;br /&gt;
This can be fixed by overriding the variable, e.g.&lt;br /&gt;
&lt;br /&gt;
 EDITOR=vi ldapvi ******&lt;br /&gt;
&lt;br /&gt;
==== Changing a user&#039;s username ====&lt;br /&gt;
&lt;br /&gt;
Only a member of the Systems Committee can change a user&#039;s username. &#039;&#039;&#039;At all times, a user&#039;s username must match the user&#039;s username in WatIAM.&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
All changes to an account MUST be done in person so that identity can be confirmed. If a member cannot attend in person, then an alternate method of identity verification may be chosen by the Systems Administrator.&lt;br /&gt;
&lt;br /&gt;
# Edit entries in LDAP (&amp;lt;code&amp;gt;ldapvi -Y GSSAPI&amp;lt;/code&amp;gt;)&lt;br /&gt;
#* Find and replace the user&#039;s old username with the new one (&amp;lt;code&amp;gt;%s/$OLD/$NEW/g&amp;lt;/code&amp;gt;)&lt;br /&gt;
# Change the user&#039;s Kerberos principal (on auth1, &amp;lt;code&amp;gt;renprinc $OLD $NEW&amp;lt;/code&amp;gt;)&lt;br /&gt;
# Move the user&#039;s home directory (on phosphoric-acid, &amp;lt;code&amp;gt;mv /users/$OLD /users/$NEW&amp;lt;/code&amp;gt;)&lt;br /&gt;
# Modify the user&#039;s ~/.forward file if their old username is in it.&lt;br /&gt;
# Change the user&#039;s csc-general (and csc-industry, if subscribed) email address from &amp;lt;code&amp;gt;$OLD@csclub.uwaterloo.ca&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;$NEW@csclub.uwaterloo.ca&amp;lt;/code&amp;gt;&lt;br /&gt;
#* https://mailman.csclub.uwaterloo.ca/admin/csc-general&lt;br /&gt;
# If the user has vhosts on caffeine, update them to point to their new username&lt;br /&gt;
&lt;br /&gt;
If the user&#039;s account has been around for a while, and they request it, forward email from their old username to their new one.&lt;br /&gt;
&lt;br /&gt;
# Edit &amp;lt;code&amp;gt;/etc/aliases&amp;lt;/code&amp;gt; on mail. &amp;lt;code&amp;gt;$OLD: $NEW&amp;lt;/code&amp;gt;&lt;br /&gt;
# Run &amp;lt;code&amp;gt;newaliases&amp;lt;/code&amp;gt;&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=SSL&amp;diff=5291</id>
		<title>SSL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=SSL&amp;diff=5291"/>
		<updated>2024-11-07T13:59:36Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* GlobalSign */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== GlobalSign ==&lt;br /&gt;
&lt;br /&gt;
The CSC currently has an SSL certificate from GlobalSign for *.csclub.uwaterloo.ca, provided at no cost to us through IST. GlobalSign likes to take a long time to respond to certificate signing requests (CSRs) for wildcard certs, so our CSR really needs to be handed off to IST at least 2 weeks in advance. Renewing early costs nothing: the new expiry date will be the old expiry date + 1 year (plus a 30-day bonus). Having an invalid cert for any length of time leads to terrible breakage, followed by terrible workarounds and prolonged problems.&lt;br /&gt;
&lt;br /&gt;
When the certificate is due to expire in a month or two, syscom should (but apparently doesn&#039;t always) get an email notification. This will include a renewal link. Otherwise, use the [https://uwaterloo.ca/information-systems-technology/about/organizational-structure/information-security-services/certificate-authority/globalsign-signed-x5093-certificates/self-service-globalsign-ssl-certificates IST-CA self service system]. Please keep a copy of the key, CSR and (once issued) certificate in &amp;lt;tt&amp;gt;/users/sysadmin/certs&amp;lt;/tt&amp;gt;. The OpenSSL examples linked there are good for generating a 2048-bit RSA key and a corresponding CSR. It&#039;s probably a good idea to change the private key (as it&#039;s not that much effort anyway). Just make sure your CSR is for &amp;lt;tt&amp;gt;*.csclub.uwaterloo.ca&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
At the self-service portal, these options worked in 2013. If you need IST assistance, [mailto:ist-ca@uwaterloo.ca ist-ca@uwaterloo.ca] is the email address you should contact.&lt;br /&gt;
  Products: OrganizationSSL&lt;br /&gt;
  SSL Certificate Type: Wildcard SSL Certificate&lt;br /&gt;
  Validity Period: 1 year&lt;br /&gt;
  Are you switching from a Competitor? No, I am not switching&lt;br /&gt;
  Are you renewing this Certificate? Yes (paste current certificate)&lt;br /&gt;
  30-day bonus: Yes (why not?)&lt;br /&gt;
  Add specific Subject Alternative Names (SANs): No (*.csclub.uwaterloo.ca automatically adds csclub.uwaterloo.ca as a SAN)&lt;br /&gt;
  Enter Certificate Signing Request (CSR): Yes (paste CSR)&lt;br /&gt;
  Contact Information:&lt;br /&gt;
    First Name: Computer Science Club&lt;br /&gt;
    Last Name: Systems Committee&lt;br /&gt;
    Telephone: +1 519 888 4567 x33870&lt;br /&gt;
    Email Address: syscom@csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
=== Helpful links ===&lt;br /&gt;
* [https://support.globalsign.com/ssl/ssl-certificates-installation/generate-csr-openssl How to generate a new CSR and private key]&lt;br /&gt;
* [https://uwaterloo.atlassian.net/wiki/spaces/ISTKB/pages/262013183/How+to+obtain+a+new+GlobalSign+certificate+or+renew+an+existing+one How to obtain a new GlobalSign certificate or renew an existing one]&lt;br /&gt;
* [https://system.globalsign.com/bm/public/certificate/poporder.do?domain=PAR12271n5w6s27pvg8d92v4150t GlobalSign UWaterloo self-service page]&lt;br /&gt;
* [https://support.globalsign.com/ca-certificates/intermediate-certificates/organizationssl-intermediate-certificates GlobalSign intermediate certificate] (needed to create a certificate chain; see below)&lt;br /&gt;
&lt;br /&gt;
=== OpenSSL cheat sheet ===&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Generate a new CSR and private key (do this in a new directory):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl req -out csclub.uwaterloo.ca.csr -new -newkey rsa:2048 -keyout csclub.uwaterloo.ca.key -nodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Enter the following information at the prompts:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Country Name (2 letter code) [AU]:CA&lt;br /&gt;
State or Province Name (full name) [Some-State]:Ontario&lt;br /&gt;
Locality Name (eg, city) []:Waterloo&lt;br /&gt;
Organization Name (eg, company) [Internet Widgits Pty Ltd]:University of Waterloo&lt;br /&gt;
Organizational Unit Name (eg, section) []:Computer Science Club&lt;br /&gt;
Common Name (e.g. server FQDN or YOUR name) []:*.csclub.uwaterloo.ca&lt;br /&gt;
Email Address []:systems-committee@csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
Please enter the following &#039;extra&#039; attributes&lt;br /&gt;
to be sent with your certificate request&lt;br /&gt;
A challenge password []:&lt;br /&gt;
An optional company name []:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
View the information inside a CSR:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl req -noout -text -in csclub.uwaterloo.ca.csr&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
View the information inside a private key:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl pkey -noout -text -in csclub.uwaterloo.ca.key&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
View information inside a certificate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl x509 -noout -text -in csclub.uwaterloo.ca.crt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== csclub.cloud ===&lt;br /&gt;
Once a year, someone from IST will ask us to create a temporary TXT record for csclub.cloud to prove to GlobalSign that we own it. This must be created at the &amp;lt;b&amp;gt;root&amp;lt;/b&amp;gt; of the domain. Since this zone is managed dynamically (via the acme.sh script on biloba, see below), we need to freeze the domain and update /var/lib/bind/db.csclub.cloud directly.&lt;br /&gt;
&lt;br /&gt;
Once you&#039;re on the correct server (dns1, not biloba), here are the steps:&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;rndc freeze csclub.cloud&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Open /var/lib/bind/db.csclub.cloud and add a new TXT record. It&#039;ll look something like&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
@ TXT &amp;quot;_globalsign-domain-verification=blablabla&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
In the same file, make sure to also update the SOA serial number. It should generally be YYYYMMDDNN where NN is a monotonically increasing counter (YYYYMMDD is the current date).&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;rndc reload&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Run a DNS query to make sure you can see the TXT record:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dig -t txt @dns1 csclub.cloud&lt;br /&gt;
dig -t txt @dns2 csclub.cloud&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Email back the person from IST and let them know that we created the TXT record.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Once the certificate has been renewed, delete the TXT record, update the SOA serial number, and run &amp;lt;code&amp;gt;rndc reload&amp;lt;/code&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;rndc thaw csclub.cloud&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Certificate Files ==&lt;br /&gt;
Let&#039;s say you obtain a new certificate for *.csclub.uwaterloo.ca. Here are the files which should be stored in the certs folder:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;csclub.uwaterloo.ca.key: private key created by openssl&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;csclub.uwaterloo.ca.csr: certificate signing request created by openssl&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;order: order number from GlobalSign&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;csclub.uwaterloo.ca.crt: certificate created by GlobalSign&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;globalsign-intermediate.crt: intermediate certificate from GlobalSign, obtainable from [https://support.globalsign.com/ca-certificates/intermediate-certificates/organizationssl-intermediate-certificates here]. As of this writing, we use the &amp;quot;OrganizationSSL SHA-256 R3 Intermediate Certificate&amp;quot;. Just click the &amp;quot;View in Base64&amp;quot; button and copy the contents.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;There is an alternative way to get the intermediate certificate: if you run &amp;lt;code&amp;gt;openssl x509 -noout -text -in csclub.uwaterloo.ca.crt&amp;lt;/code&amp;gt;, under X509v3 extensions &amp;gt; Authority Information Access, there should be a field called &amp;quot;CA Issuers&amp;quot; containing a URL like http://secure.globalsign.com/cacert/gsrsaovsslca2018.crt. You can download that file and convert it to PEM:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget https://secure.globalsign.com/cacert/gsrsaovsslca2018.crt&lt;br /&gt;
openssl x509 -inform der -in gsrsaovsslca2018.crt -out globalsign-intermediate.crt&lt;br /&gt;
rm gsrsaovsslca2018.crt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;csclub.uwaterloo.ca.chain: create this with the following command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cat csclub.uwaterloo.ca.crt globalsign-intermediate.crt &amp;gt; csclub.uwaterloo.ca.chain&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;csclub.uwaterloo.ca.pem: create this with the following command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cat csclub.uwaterloo.ca.key csclub.uwaterloo.ca.chain &amp;gt; csclub.uwaterloo.ca.pem&lt;br /&gt;
chmod 600 csclub.uwaterloo.ca.pem&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Certificate Locations ==&lt;br /&gt;
&lt;br /&gt;
Keep a copy of newly generated certificates in /users/sysadmin/certs.&lt;br /&gt;
&lt;br /&gt;
Below is a list of places you&#039;ll need to put the new certificate to keep our services running. The private key (if applicable) should be kept next to the certificate with the extension .key.&lt;br /&gt;
&lt;br /&gt;
* caffeine:/etc/ssl/private/csclub-wildcard.crt (for Apache)&lt;br /&gt;
* coffee:/etc/ssl/private/csclub.uwaterloo.ca (for PostgreSQL and MariaDB)&lt;br /&gt;
* &amp;lt;s&amp;gt;mail:/etc/ssl/private/csclub-wildcard.crt (for Apache, Postfix and Dovecot)&amp;lt;/s&amp;gt; (UPDATE: we use certbot now for these)&lt;br /&gt;
* mailman:/etc/ssl/private/csclub-wildcard-chain.crt (for Apache)&lt;br /&gt;
* rt:/etc/ssl/private/csclub-wildcard.crt (for Apache)&lt;br /&gt;
* potassium-benzoate:/etc/ssl/private/csclub-wildcard.crt (for nginx)&lt;br /&gt;
* phosphoric-acid:/etc/ssl/private/csclub-wildcard-chain.crt (for ceod)&lt;br /&gt;
* auth1:/etc/ssl/private/csclub-wildcard.crt (for slapd, make sure to &amp;lt;code&amp;gt;sudo service slapd restart&amp;lt;/code&amp;gt;)&lt;br /&gt;
* auth2:/etc/ssl/private/csclub-wildcard.crt (for slapd, make sure to &amp;lt;code&amp;gt;sudo service slapd restart&amp;lt;/code&amp;gt;)&lt;br /&gt;
* mattermost:/etc/ssl/private/csclub-wildcard.crt (for nginx)&lt;br /&gt;
* load-balancer-0(1|2):/etc/ssl/private/csclub.uwaterloo.ca (for haproxy) [temporarily down 2020]&lt;br /&gt;
* chat:/etc/ssl/private/csclub-wildcard-chain.crt (for nginx)&lt;br /&gt;
* prometheus:/etc/ssl/private/csclub-wildcard-chain.crt (for Apache)&lt;br /&gt;
* bigbluebutton:/etc/nginx/ssl/csclub-wildcard-chain.crt (podman container on xylitol)&lt;br /&gt;
* icy:/etc/ssl/private/csclub-wildcard.pem (for Icecast)&lt;br /&gt;
* chamomile:/etc/ssl/private/cloud.csclub.uwaterloo.ca.chain.crt, /etc/ssl/private/csclub.cloud.chain, /etc/ssl/private/csclub.uwaterloo.ca.chain (for nginx)&lt;br /&gt;
* biloba:/etc/ssl/private/cloud.csclub.uwaterloo.ca.chain.crt, /etc/ssl/private/csclub.cloud.chain, /etc/ssl/private/csclub.uwaterloo.ca.chain (for nginx)&lt;br /&gt;
* nextcloud (nspawn container inside guayusa): /etc/ssl/private/csclub.uwaterloo.ca.chain (for nginx)&lt;br /&gt;
* citric-acid (runs vaultwarden): /etc/ssl/private/csclub.uwaterloo.ca.{chain,key} (for nginx)&lt;br /&gt;
&lt;br /&gt;
Some services (e.g. Dovecot, Postfix) prefer to have the certificate chain in one file. Concatenate the appropriate intermediate root to the end of the certificate and store this as csclub-wildcard-chain.crt.&lt;br /&gt;
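For example, following the same pattern used for csclub.uwaterloo.ca.chain above:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cat csclub.uwaterloo.ca.crt globalsign-intermediate.crt &amp;gt; csclub-wildcard-chain.crt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;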
&lt;br /&gt;
=== More certificate locations ===&lt;br /&gt;
We have some SSL certificates which are not used by web servers, but still need to be renewed eventually.&lt;br /&gt;
&lt;br /&gt;
==== Prometheus node exporter ====&lt;br /&gt;
All of our Prometheus node exporters are using mTLS via stunnel (every bare-metal host, as well as caffeine, coffee and mail, is running this exporter). The certificates (both client and server) are set to expire in &amp;lt;b&amp;gt;September 2031&amp;lt;/b&amp;gt;; before then, create new keypairs in /opt/prometheus/tls, and deploy the new server.crt, node.crt and node.key to /etc/stunnel/tls on all machines. Restart prometheus and all of the node exporters.&lt;br /&gt;
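The exact commands depend on how the current keypairs were generated; as a rough sketch (assuming plain self-signed certificates that the stunnel peers verify directly, which may not match the real setup):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /opt/prometheus/tls&lt;br /&gt;
# hypothetical subjects; reuse whatever naming the existing certs have&lt;br /&gt;
openssl req -x509 -newkey rsa:4096 -nodes -days 3650 -keyout server.key -out server.crt -subj &amp;quot;/CN=prometheus&amp;quot;&lt;br /&gt;
openssl req -x509 -newkey rsa:4096 -nodes -days 3650 -keyout node.key -out node.crt -subj &amp;quot;/CN=node-exporter&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;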
&lt;br /&gt;
==== ADFS ====&lt;br /&gt;
See [[ADFS]]. When the university&#039;s IdP certificate expires (&amp;lt;b&amp;gt;October 2025&amp;lt;/b&amp;gt;), we can just download a new one and restart Apache; when our own certificate expires (&amp;lt;b&amp;gt;July 2031&amp;lt;/b&amp;gt;), we need to submit a new form to IST (please do this &amp;lt;i&amp;gt;before&amp;lt;/i&amp;gt; the cert expires).&lt;br /&gt;
&lt;br /&gt;
==== Keycloak ====&lt;br /&gt;
See [[Keycloak]]. When the saml-passthrough certificate expires (&amp;lt;b&amp;gt;January 2032&amp;lt;/b&amp;gt;), you need to create a new keypair in /srv/saml-passthrough on caffeine, and upload the new certificate into the Keycloak UI (IdP settings). When the Keycloak SP certificate expires (&amp;lt;b&amp;gt;December 2031&amp;lt;/b&amp;gt;), make sure to create a new keypair and upload it to the Keycloak UI (Realm Settings).&lt;br /&gt;
&lt;br /&gt;
== letsencrypt ==&lt;br /&gt;
&lt;br /&gt;
We support letsencrypt for our virtual hosts with custom domains. We use the &amp;lt;tt&amp;gt;certbot&amp;lt;/tt&amp;gt; package from the Debian repositories with a configuration file at &amp;lt;tt&amp;gt;/etc/letsencrypt/cli.ini&amp;lt;/tt&amp;gt;, and a systemd timer to handle renewals.&lt;br /&gt;
&lt;br /&gt;
The setup for a new domain is:&lt;br /&gt;
&lt;br /&gt;
# Become &amp;lt;tt&amp;gt;certbot&amp;lt;/tt&amp;gt; on caffeine with &amp;lt;tt&amp;gt;sudo -u certbot bash&amp;lt;/tt&amp;gt; or similar.&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;certbot certonly -c /etc/letsencrypt/cli.ini -d DOMAIN --logs-dir /tmp&amp;lt;/tt&amp;gt;. The logs-dir isn&#039;t important and is only needed for troubleshooting.&lt;br /&gt;
# Set up the Apache site configuration using the example below (the Apache config is in /etc/apache2). Note the permanent redirect to https.&lt;br /&gt;
# Make sure to commit your changes when you&#039;re done.&lt;br /&gt;
# Reload the Apache config with &amp;lt;tt&amp;gt;sudo systemctl reload apache2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;VirtualHost *:80&amp;gt;&lt;br /&gt;
     ServerName example.com&lt;br /&gt;
     ServerAlias *.example.com&lt;br /&gt;
     ServerAdmin example@csclub.uwaterloo.ca&lt;br /&gt;
 &lt;br /&gt;
     #DocumentRoot /users/example/www/&lt;br /&gt;
     Redirect permanent / https://example.com/&lt;br /&gt;
 &lt;br /&gt;
     ErrorLog /var/log/apache2/example-error.log&lt;br /&gt;
     CustomLog /var/log/apache2/example-access.log combined&lt;br /&gt;
 &amp;lt;/VirtualHost&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
 &amp;lt;VirtualHost csclub:443&amp;gt;&lt;br /&gt;
     SSLEngine on&lt;br /&gt;
     SSLCertificateFile /etc/letsencrypt/live/example.com/fullchain.pem&lt;br /&gt;
     SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem&lt;br /&gt;
     SSLStrictSNIVHostCheck on&lt;br /&gt;
 &lt;br /&gt;
     ServerName example.com&lt;br /&gt;
     ServerAlias *.example.com&lt;br /&gt;
     ServerAdmin example@csclub.uwaterloo.ca&lt;br /&gt;
 &lt;br /&gt;
     DocumentRoot /users/example/www&lt;br /&gt;
 &lt;br /&gt;
     ErrorLog /var/log/apache2/example-error.log&lt;br /&gt;
     CustomLog /var/log/apache2/example-access.log combined&lt;br /&gt;
 &amp;lt;/VirtualHost&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== acme.sh ==&lt;br /&gt;
We are using [https://github.com/acmesh-official/acme.sh acme.sh] for provisioning SSL certificates for some of our *.csclub.cloud domains. It is currently set up under /root/.acme.sh on biloba.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;NOTE&amp;lt;/b&amp;gt;: acme.sh has a cron job which automatically renews certificates before they expire and reloads NGINX, so you do not have to do anything after issuing and installing a certificate (i.e. &amp;quot;set-and-forget&amp;quot;).&lt;br /&gt;
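You can confirm the renewal cron entry is present with something like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
crontab -l | grep acme.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;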
&lt;br /&gt;
=== How to add a new SSL cert for a custom domain on CSC cloud ===&lt;br /&gt;
Note: you do not need to acquire a new cert if the requested domain is directly on csclub.cloud, e.g. app1.csclub.cloud. We can re-use our wildcard cert on csclub.cloud for that. However, if a user requests a multi-level domain on csclub.cloud, or a domain hosted on an external registrar, then you will need to create a new cert.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s say user &amp;lt;code&amp;gt;ctdalek&amp;lt;/code&amp;gt; wants &amp;lt;code&amp;gt;mydomain.com&amp;lt;/code&amp;gt; to point to a VM on CSC cloud.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
TLDR:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Obtain the cert.&lt;br /&gt;
# If a subdomain was also requested, pass the -d option multiple times, e.g.&lt;br /&gt;
# `-d mydomain.com -d sub.mydomain.com`. Make sure the &amp;quot;main&amp;quot; domain is specified first.&lt;br /&gt;
acme.sh --issue -d mydomain.com -w /var/www&lt;br /&gt;
&lt;br /&gt;
# Install the cert.&lt;br /&gt;
# If a subdomain was also requested, only specify the &amp;quot;main&amp;quot; domain.&lt;br /&gt;
acme.sh --install-cert -d mydomain.com \&lt;br /&gt;
    --key-file /etc/nginx/ceod/member-ssl/mydomain.com.key \&lt;br /&gt;
    --fullchain-file /etc/nginx/ceod/member-ssl/mydomain.com.chain \&lt;br /&gt;
    --reloadcmd &amp;quot;/root/bin/reload-nginx.sh&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Create a vhost file.&lt;br /&gt;
# Look at the other files in the same directory for inspiration.&lt;br /&gt;
# Make sure the file starts with the username and an underscore, e.g. &amp;quot;ctdalek_&amp;quot;,&lt;br /&gt;
# because this is how ceod keeps track of the vhosts.&lt;br /&gt;
# Make sure to set the custom domain name(s) and paths to the SSL key/cert.&lt;br /&gt;
vim /etc/nginx/ceod/member-vhosts/ctdalek_mydomain.com&lt;br /&gt;
&lt;br /&gt;
# Finally, reload NGINX on both biloba and chamomile. The /etc/nginx/ceod directory&lt;br /&gt;
# is shared between them.&lt;br /&gt;
/root/bin/reload-nginx.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Installation ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /opt&lt;br /&gt;
git clone --depth 1 https://github.com/acmesh-official/acme.sh&lt;br /&gt;
cd acme.sh&lt;br /&gt;
./acme.sh --install -m syscom@csclub.uwaterloo.ca&lt;br /&gt;
. &amp;quot;/root/.acme.sh/acme.sh.env&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Important&amp;lt;/b&amp;gt;: If invoking acme.sh from another program, it needs the environment variables set in acme.sh.env. Currently, that is just&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
LE_WORKING_DIR=&amp;quot;/root/.acme.sh&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For testing purposes, make sure to use the Let&#039;s Encrypt test server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh --set-default-ca --server letsencrypt_test&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== NGINX setup ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir -p /var/www/.well-known/acme-challenge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Add the following snippet to your default NGINX file (e.g. /etc/nginx/sites-enabled/default):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  # For Let&#039;s Encrypt&lt;br /&gt;
  location /.well-known/acme-challenge/ {&lt;br /&gt;
    alias /var/www/.well-known/acme-challenge/;&lt;br /&gt;
  }&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now assuming that biloba has the IP address for *.csclub.cloud, you can test that everything is working:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh --issue -d app.merenber.csclub.cloud -w /var/www&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To install a certificate after it&#039;s been issued:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh --install-cert -d app.merenber.csclub.cloud \&lt;br /&gt;
    --key-file /etc/nginx/ceod/member-ssl/app.merenber.csclub.cloud.key \&lt;br /&gt;
    --fullchain-file /etc/nginx/ceod/member-ssl/app.merenber.csclub.cloud.chain \&lt;br /&gt;
    --reloadcmd &amp;quot;/root/bin/reload-nginx.sh&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
At this point, you should add your NGINX vhost file which uses that SSL certificate.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
To remove a certificate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh --remove -d app.merenber.csclub.cloud&lt;br /&gt;
rm -r /root/.acme.sh/app.merenber.csclub.cloud&lt;br /&gt;
rm /etc/nginx/ceod/member-ssl/app.merenber.csclub.cloud.chain&lt;br /&gt;
rm /etc/nginx/ceod/member-ssl/app.merenber.csclub.cloud.key&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Don&#039;t forget to remove the NGINX vhost file too.&lt;br /&gt;
&lt;br /&gt;
Once you think you&#039;re ready, use a real ACME provider, e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh --set-default-ca --server letsencrypt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Since we have a [https://zerossl.com ZeroSSL] account, and ZeroSSL has no rate limit, we are going to use that instead:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh  --register-account  --server zerossl \&lt;br /&gt;
        --eab-kid  xxxxxxxxxxxx  \&lt;br /&gt;
        --eab-hmac-key  xxxxxxxxx&lt;br /&gt;
acme.sh --set-default-ca  --server zerossl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== DNS challenge ===&lt;br /&gt;
To obtain a wildcard certificate (e.g. *.k8s.csclub.cloud), you will need to perform the DNS-01 challenge. We are going to use nsupdate to interact with our BIND9 server on dns1.&lt;br /&gt;
&lt;br /&gt;
On dns1, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tsig-keygen csc-cloud&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Paste the output into the appropriate section in /etc/bind/named.conf.local. Also paste it into a file somewhere on biloba, e.g. /etc/csc/csc-cloud-tsig.key.&lt;br /&gt;
&lt;br /&gt;
Add the following to the csclub.cloud zone block:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  allow-update {&lt;br /&gt;
    !{&lt;br /&gt;
      !127.0.0.1;&lt;br /&gt;
      !::1;&lt;br /&gt;
      !129.97.134.0/24;&lt;br /&gt;
      !2620:101:f000:4901::/64;&lt;br /&gt;
      any;&lt;br /&gt;
    };&lt;br /&gt;
    key csc-cloud;&lt;br /&gt;
  };&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(We&#039;re restricting updates to requests signed with the key, and only from the given IP ranges. See https://serverfault.com/a/417229.)&lt;br /&gt;
&lt;br /&gt;
The &#039;bind&#039; user can&#039;t write to files under /etc/bind, so we&#039;re going to move our zone file to /var/lib/bind instead.&lt;br /&gt;
Comment out &#039;file &amp;quot;/etc/bind/db.csclub.cloud&amp;quot;;&#039; from named.conf.local and add this line below it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  file &amp;quot;/var/lib/bind/db.csclub.cloud&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  cp /etc/bind/db.csclub.cloud /var/lib/bind/db.csclub.cloud&lt;br /&gt;
  chown bind:bind /var/lib/bind/db.csclub.cloud&lt;br /&gt;
  rndc reload&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On biloba, check that everything&#039;s working:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  nsupdate -k /etc/csc/csc-cloud-tsig.key -v &amp;lt;&amp;lt;EOF&lt;br /&gt;
  update add test.csclub.cloud 300 A 0.0.0.0&lt;br /&gt;
  send&lt;br /&gt;
  EOF&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Use a tool such as &amp;lt;code&amp;gt;dig&amp;lt;/code&amp;gt; to make sure that the update was successful.&lt;br /&gt;
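For instance, following the dig examples earlier on this page:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  dig -t a @dns1.csclub.uwaterloo.ca test.csclub.cloud&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;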
If it worked, you can delete the record:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  nsupdate -k /etc/csc/csc-cloud-tsig.key -v &amp;lt;&amp;lt;EOF&lt;br /&gt;
  delete test.csclub.cloud&lt;br /&gt;
  send&lt;br /&gt;
  EOF&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now we are ready to actually perform the challenge with acme.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  export NSUPDATE_SERVER=&amp;quot;dns1.csclub.uwaterloo.ca&amp;quot;&lt;br /&gt;
  export NSUPDATE_KEY=&amp;quot;/etc/csc/csc-cloud-tsig.key&amp;quot;&lt;br /&gt;
  acme.sh --issue --dns dns_nsupdate -d &#039;k8s.csclub.cloud&#039; -d &#039;*.k8s.csclub.cloud&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(If something goes wrong, use the &amp;lt;code&amp;gt;--debug&amp;lt;/code&amp;gt; flag.)&lt;br /&gt;
&lt;br /&gt;
If all went well, just install the certificate as usual:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  acme.sh --install-cert -d k8s.csclub.cloud \&lt;br /&gt;
    --key-file /etc/nginx/ceod/syscom-ssl/k8s.csclub.cloud.key \&lt;br /&gt;
    --fullchain-file /etc/nginx/ceod/syscom-ssl/k8s.csclub.cloud.chain \&lt;br /&gt;
    --reloadcmd &#039;systemctl reload nginx&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=CloudStack&amp;diff=5279</id>
		<title>CloudStack</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=CloudStack&amp;diff=5279"/>
		<updated>2024-09-19T01:34:40Z</updated>

		<summary type="html">&lt;p&gt;Merenber: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;We are using [https://cloudstack.apache.org/ Apache CloudStack] to provide VMs-as-a-service to members. Our user documentation is here: https://docs.cloud.csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
Prerequisite reading:&lt;br /&gt;
&lt;br /&gt;
* [[Ceph]]&lt;br /&gt;
* [[Cloud Networking]]&lt;br /&gt;
&lt;br /&gt;
Official CloudStack documentation: http://docs.cloudstack.apache.org/en/4.16.0.0/&lt;br /&gt;
&lt;br /&gt;
== Rebooting machines ==&lt;br /&gt;
I&#039;m going to start with this first because this is what future sysadmins are most interested in. If you reboot one of the CloudStack hosts (as of this writing: biloba, ginkgo and chamomile), then I suggest you perform a live migration of all of the VMs on that host to the other machines (see [[#Sequential reboot]]).&lt;br /&gt;
&lt;br /&gt;
If this is not possible (e.g. there is not enough capacity on the other machines), then CloudStack will most likely shut down the VMs automatically. &amp;lt;b&amp;gt;You are responsible for restarting them manually after the reboot.&amp;lt;/b&amp;gt; You will also need to manually restart any Kubernetes clusters.&lt;br /&gt;
&lt;br /&gt;
Note: if the cloudstack-agent.service is having trouble reconnecting to the management servers after a reboot, just do a &amp;lt;code&amp;gt;systemctl restart cloudstack-agent&amp;lt;/code&amp;gt; and cross your fingers.&lt;br /&gt;
&lt;br /&gt;
=== Sequential reboot ===&lt;br /&gt;
If it is possible to reboot the machines one at a time (e.g. for a software upgrade), then any downtime can be avoided. Log in to the web UI as admin, go to Infrastructure &amp;gt; Hosts, hover over the three-dots button for a particular host, then press the &amp;quot;Enable Maintenance Mode&amp;quot; button.&lt;br /&gt;
[[File:Cloudstack-enable-maintenance-mode-button.png|1000px]]&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Wait for the VMs to be migrated to the other machines (press the Refresh button to update the table). If you see an error which says &amp;quot;ErrorInPrepareForMaintenance&amp;quot;, just wait it out. If more than 20 minutes have passed and there is still no progress, take the host out of maintenance mode, and put it back into maintenance mode. If this still does not work, restart the management server.&lt;br /&gt;
&lt;br /&gt;
When a host is in maintenance mode, it should look like this:&lt;br /&gt;
[[File:Cloudstack-host-in-maintenance-mode.png|1000px]]&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Once all VMs have been migrated, do whatever you need to do on the physical host; once it is back up, take it back out of maintenance mode from the web UI. Repeat for any other hosts which need to be taken offline.&lt;br /&gt;
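The same operations are also available through the API if you prefer cloudmonkey (see [[#CLI]] below); a sketch, where the host UUID comes from the output of the first command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
listHosts type=Routing filter=id,name,resourcestate&lt;br /&gt;
prepareHostForMaintenance id=&amp;lt;host-uuid&amp;gt;&lt;br /&gt;
cancelHostMaintenance id=&amp;lt;host-uuid&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;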
&lt;br /&gt;
== Unexpected reboot ==&lt;br /&gt;
Sometimes a network interface fails on a machine after the switches in MC are rebooted (looking at you, riboflavin). Or a machine randomly goes offline in the middle of the night (looking at you, ginkgo). Point is, sometimes a machine needs to be rebooted, or is forcefully rebooted, without preparation. Unfortunately, &amp;lt;strong&amp;gt;CloudStack is unable to recover gracefully from an unexpected reboot&amp;lt;/strong&amp;gt;. This means that &amp;lt;strong&amp;gt;manual intervention is required&amp;lt;/strong&amp;gt; to get the VMs back into a working state.&lt;br /&gt;
&lt;br /&gt;
Once the machine has come back online, perform the following:&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;All of the VMs which were on that machine will eventually transition to the Stopped state. Wait for this to happen first (from the web UI).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Go to Infrastructure -&amp;gt; Management servers and make sure that both biloba and chamomile are present and running. If not, you may need to restart the management server on the machine (&amp;lt;code&amp;gt;systemctl restart cloudstack-management&amp;lt;/code&amp;gt;). Watch the journald logs for any error messages.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Go to Infrastructure -&amp;gt; Hosts and make sure that all three hosts (biloba, chamomile and ginkgo) are present and running. If not, you may need to restart the agent on the machine (&amp;lt;code&amp;gt;systemctl restart cloudstack-agent&amp;lt;/code&amp;gt;). Watch the journald logs for any error messages.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If you restart cloudstack-agent, restart virtlogd as well, just for good measure. Watch the journald logs for any error messages. (A summary of these commands follows this list.)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Restart ONE of the stopped VMs and make sure that it transitions to the Started state. If more than 20 minutes pass and it still hasn&#039;t started, restart the management servers and try again.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Restart the rest of the stopped VMs.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
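For reference, the commands involved in the steps above (run each on the relevant machine):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# on biloba/chamomile, if a management server is missing:&lt;br /&gt;
systemctl restart cloudstack-management&lt;br /&gt;
journalctl -fu cloudstack-management&lt;br /&gt;
# on the host that came back up, if it is missing from the Hosts list:&lt;br /&gt;
systemctl restart cloudstack-agent virtlogd&lt;br /&gt;
journalctl -fu cloudstack-agent&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;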
&lt;br /&gt;
== Administration ==&lt;br /&gt;
To log in with the admin account, use the following credentials in the web UI:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Username: admin&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Password: &amp;lt;i&amp;gt;stored in the usual place&amp;lt;/i&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Domain: &amp;lt;i&amp;gt;leave this empty&amp;lt;/i&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There is another admin account for the Members domain. This is necessary to create projects in the Members domain which regular members can access. Note that this account has fewer privileges than the root admin account above (it has the DomainAdmin role instead of the RootAdmin role).&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Username: membersadmin&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Password: &amp;lt;i&amp;gt;stored in the usual place&amp;lt;/i&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Domain: Members&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that there are two management servers, one on each of biloba and chamomile (chamomile is a hot standby for biloba). If you restart one of them, you should restart the other as well.&lt;br /&gt;
&lt;br /&gt;
=== CLI ===&lt;br /&gt;
CloudStack has a CLI called [https://github.com/apache/cloudstack-cloudmonkey cloudmonkey] which is already set up on biloba. Just run &amp;lt;code&amp;gt;cmk&amp;lt;/code&amp;gt; as root to start it up.&lt;br /&gt;
&lt;br /&gt;
Cloudmonkey is basically a shell for the API (https://cloudstack.apache.org/api/apidocs-4.16/). For example, to list all domains:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
listDomains details=min&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Run &amp;lt;code&amp;gt;somecommand -h&amp;lt;/code&amp;gt; to see all parameters for a particular command (or browse the API documentation).&lt;br /&gt;
See https://github.com/apache/cloudstack-cloudmonkey for more details.&lt;br /&gt;
&lt;br /&gt;
== Building packages ==&lt;br /&gt;
While CloudStack does provide .deb packages for Ubuntu, unfortunately these don&#039;t work on Debian (the &#039;qemu-kvm&#039; dependency is a virtual package on Debian, but not on Ubuntu). So we&#039;re going to build our own packages instead.&lt;br /&gt;
&lt;br /&gt;
We&#039;re going to perform the build in a Podman container to avoid polluting the host machine with unnecessary packages. There&#039;s a container called cloudstack-build on biloba which you can re-use. If you create a new container, make sure to use the same Podman image as the release for which you&#039;re building (e.g. &#039;debian:bullseye&#039;).&lt;br /&gt;
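If you need to create the container from scratch, something like this should work (a sketch; &#039;cloudstack-build&#039; matches the name of the existing container):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# create the build container (once):&lt;br /&gt;
podman run -it --name cloudstack-build debian:bullseye bash&lt;br /&gt;
# re-enter it later:&lt;br /&gt;
podman start -a cloudstack-build&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;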
&lt;br /&gt;
The instructions below are adapted from http://docs.cloudstack.apache.org/en/latest/installguide/building_from_source.html&lt;br /&gt;
&lt;br /&gt;
Inside the container, install the dependencies:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install maven openjdk-11-jdk libws-commons-util-java libcommons-codec-java libcommons-httpclient-java liblog4j1.2-java genisoimage devscripts debhelper python3-setuptools&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Install Node.js 12 as well (Debian bullseye&#039;s version happens to be 12):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install nodejs npm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Build the node-sass module (see [https://github.com/sass/node-sass/issues/1579 this issue] to see why this is necessary):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd ui &amp;amp;&amp;amp; npm install &amp;amp;&amp;amp; npm rebuild node-sass &amp;amp;&amp;amp; cd ..&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The python3-mysql.connector package is not available in bullseye, so we&#039;re going to download and install it from the sid release:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
curl -LOJ http://ftp.ca.debian.org/debian/pool/main/m/mysql-connector-python/python3-mysql.connector_8.0.15-2_all.deb&lt;br /&gt;
apt install ./python3-mysql.connector_8.0.15-2_all.deb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Download the CloudStack source code:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
curl -LOJ http://mirror.csclub.uwaterloo.ca/apache/cloudstack/releases/4.16.0.0/apache-cloudstack-4.16.0.0-src.tar.bz2&lt;br /&gt;
tar -jxvf apache-cloudstack-4.16.0.0-src.tar.bz2&lt;br /&gt;
cd apache-cloudstack-4.16.0.0-src&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Download the Maven dependencies:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mvn -P deps&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now open debian/control and perform the following changes:&lt;br /&gt;
&lt;br /&gt;
* Replace &#039;qemu-kvm (&amp;gt;=2.5)&#039; with &#039;qemu-system-x86 (&amp;gt;= 1:5.2)&#039; in the dependencies of cloudstack-agent&lt;br /&gt;
* Remove dh-systemd as a build dependency of cloudstack (it&#039;s included in debhelper)&lt;br /&gt;
&lt;br /&gt;
Now open debian/rules and add the following flags to the &amp;lt;code&amp;gt;mvn&amp;lt;/code&amp;gt; command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-Dmaven.test.skip=true -Dclean.skip=true -Dcheckstyle.skip&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now open debian/changelog and change &#039;unstable&#039; to &#039;bullseye&#039;.&lt;br /&gt;
&lt;br /&gt;
As of this writing, there is a [https://gitlab.com/libvirt/libvirt/-/issues/161 bug in libvirt] which prevents VMs with more than 4GB of RAM from being created on hosts with cgroups2. Until that issue is fixed, we need to modify the source code. Since we&#039;re already building a custom CloudStack package, it&#039;s easier to patch CloudStack than to patch libvirt. Paste something like the following into debian/patches/fix-cgroups2-cpu-weight.patch:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Description: Workaround for libvirt trying to write a value to the cgroups v2&lt;br /&gt;
  cpu.weight controller which is greater than the maximum (10000). The&lt;br /&gt;
  libvirt developers are currently discussing a solution.&lt;br /&gt;
Forwarded: not-needed&lt;br /&gt;
Origin: upstream, https://gitlab.com/libvirt/libvirt/-/issues/161&lt;br /&gt;
Author: Max Erenberg &amp;lt;merenber@csclub.uwaterloo.ca&amp;gt;&lt;br /&gt;
Last-Update: 2021-12-03&lt;br /&gt;
Index: apache-cloudstack-4.16.0.0-src/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtVMDef.java&lt;br /&gt;
===================================================================&lt;br /&gt;
--- apache-cloudstack-4.16.0.0-src.orig/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtVMDef.java&lt;br /&gt;
+++ apache-cloudstack-4.16.0.0-src/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtVMDef.java&lt;br /&gt;
@@ -1483,6 +1483,10 @@ public class LibvirtVMDef {&lt;br /&gt;
         static final int MAX_PERIOD = 1000000;&lt;br /&gt;
 &lt;br /&gt;
         public void setShares(int shares) {&lt;br /&gt;
+           // Clamp the value to the cgroups v2 cpu.weight maximum until&lt;br /&gt;
+           // upstream libvirt gets fixed:&lt;br /&gt;
+           // https://gitlab.com/libvirt/libvirt/-/issues/161&lt;br /&gt;
+           shares = Math.min(shares, 10000);&lt;br /&gt;
             _shares = shares;&lt;br /&gt;
         }&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
I think you have to manually modify that LibvirtVMDef.java file to incorporate those changes (I could be wrong on this, but that&#039;s how I did it).&lt;br /&gt;
&lt;br /&gt;
Then paste the following into debian/patches/00list:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fix-cgroups2-cpu-weight&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, import your GPG key into the container (make sure to delete it afterwards!), and build the packages:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
debuild -k&amp;lt;YOUR_GPG_KEY_ID&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There should already be a .dupload.conf in the /root directory in the cloudstack-build container; if you need another copy, ask a syscom member. Open /root/.ssh/config and change the User parameter to your username. Finally, go to /root and upload the packages to potassium-benzoate (replace the version number):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dupload cloudstack_4.16.0.0+1_amd64.changes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Incompatibility with Debian 12 packages ==&lt;br /&gt;
After upgrading ginkgo to bookworm, we discovered that libvirt 8+ was incompatible with CloudStack 4.16.0.0. See https://www.shapeblue.com/advisory-on-libvirt-8-compatibility-issues-with-cloudstack/ for details. So we built new packages from the 4.16.1.0 branch of ShapeBlue&#039;s GitHub repository. For some reason the cloudstack-management process failed with some errors from SLF4J, so we needed to download some JARs:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget -O /usr/share/cloudstack-management/lib/log4j-1.2.17.jar https://repo1.maven.org/maven2/log4j/log4j/1.2.17/log4j-1.2.17.jar &lt;br /&gt;
wget -O /usr/share/cloudstack-management/lib/slf4j-log4j12-1.6.6.jar https://repo1.maven.org/maven2/org/slf4j/slf4j-log4j12/1.6.6/slf4j-log4j12-1.6.6.jar&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
See https://stackoverflow.com/a/70528383 for details.&lt;br /&gt;
&lt;br /&gt;
We also encountered some kind of Java 11 -&amp;gt; 17 incompatibility issue, so the following parameter was added to the JAVA_OPTS variable in /etc/default/cloudstack-management:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
--add-opens java.base/java.lang=ALL-UNNAMED&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
See https://stackoverflow.com/a/41265267 for details. Note that this file is NOT a shell script so you cannot use variable interpolation. You must modify the value of JAVA_OPTS directly.&lt;br /&gt;
&lt;br /&gt;
== Database setup ==&lt;br /&gt;
We are using master-master replication between two MariaDB instances on biloba and chamomile. See [https://mariadb.com/kb/en/setting-up-replication/ here] and [https://tunnelix.com/simple-master-master-replication-on-mariadb/ here] for instructions on how to set this up.&lt;br /&gt;
&lt;br /&gt;
To avoid split-brain syndrome, mariadb.cloud.csclub.uwaterloo.ca points to a virtual IP shared by biloba and chamomile via keepalived. This means that only one host is actually handling requests at any moment; the other is a hot standby.&lt;br /&gt;
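For reference, the keepalived configuration for this looks something like the following sketch (the interface, router ID, priority and VIP below are placeholders, not our actual values):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
vrrp_instance mariadb {&lt;br /&gt;
    state MASTER              # BACKUP on the standby host&lt;br /&gt;
    interface br529           # placeholder: the management-network interface&lt;br /&gt;
    virtual_router_id 51      # placeholder&lt;br /&gt;
    priority 100              # use a lower priority on the standby&lt;br /&gt;
    advert_int 1&lt;br /&gt;
    virtual_ipaddress {&lt;br /&gt;
        172.19.168.30/27      # placeholder for the mariadb.cloud.csclub.uwaterloo.ca VIP&lt;br /&gt;
    }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;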
&lt;br /&gt;
Also add the following parameters to /etc/mysql/my.cnf on the hosts running MariaDB:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[mysqld]&lt;br /&gt;
innodb_rollback_on_timeout=1&lt;br /&gt;
innodb_lock_wait_timeout=600&lt;br /&gt;
max_connections=350&lt;br /&gt;
log-bin=mysql-bin&lt;br /&gt;
binlog-format = &#039;ROW&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Also comment out (or remove) the following line in /etc/mysql/mariadb.conf.d/50-server.cnf:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bind-address = 127.0.0.1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now restart MariaDB.&lt;br /&gt;
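After restarting, it&#039;s worth verifying on each host that both replication threads are running; a quick check:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mysql -e &amp;quot;SHOW SLAVE STATUS\G&amp;quot; | grep -E &#039;Slave_(IO|SQL)_Running&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;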
&lt;br /&gt;
== Management server setup ==&lt;br /&gt;
Install the management server from our Debian repository:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install cloudstack-management&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the database scripts:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cloudstack-setup-databases cloud:password@localhost --deploy-as=root&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(Replace &#039;password&#039; with a strong password.)&lt;br /&gt;
&lt;br /&gt;
Open /etc/cloudstack/management/db.properties and replace all instances of &#039;localhost&#039; with &#039;mariadb.cloud.csclub.uwaterloo.ca&#039;.&lt;br /&gt;
&lt;br /&gt;
Open /etc/cloudstack/management/server.properties and set &#039;bind-interface&#039; to 127.0.0.1 (CloudStack is being reverse proxied behind NGINX).&lt;br /&gt;
&lt;br /&gt;
Run some more scripts:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cloudstack-setup-management&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Mount the cloudstack-secondary CephFS volume at /mnt/cloudstack-secondary:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir /mnt/cloudstack-secondary&lt;br /&gt;
mount -t nfs4 -o port=2049 ceph-nfs.cloud.csclub.uwaterloo.ca:/cloudstack-secondary /mnt/cloudstack-secondary&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now download the management VM template:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
/usr/share/cloudstack-common/scripts/storage/secondary/cloud-install-sys-tmplt -m /mnt/cloudstack-secondary/ -u https://download.cloudstack.org/systemvm/4.16/systemvmtemplate-4.16.0-kvm.qcow2.bz2 -h kvm -F&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The management server will run on port 8080 by default, so reverse proxy it from NGINX:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
location / {&lt;br /&gt;
  proxy_pass http://localhost:8080;&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Compute node setup ==&lt;br /&gt;
Install packages:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install cloudstack-agent libvirt-daemon-driver-storage-rbd qemu-block-extra&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Create a new user for CloudStack:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
useradd -s /bin/bash -d /nonexistent -M cloudstack&lt;br /&gt;
# set the password&lt;br /&gt;
passwd cloudstack&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Add the following to /etc/sudoers:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cloudstack ALL=(ALL) NOPASSWD:ALL     &lt;br /&gt;
Defaults:cloudstack !requiretty&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(There is a way to restrict this, but I was never able to get it to work.)&lt;br /&gt;
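Use &amp;lt;code&amp;gt;visudo&amp;lt;/code&amp;gt; when editing so that a syntax error doesn&#039;t lock you out of sudo.&lt;br /&gt;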
&lt;br /&gt;
=== Network setup ===&lt;br /&gt;
The /etc/network/interfaces file should look something like this (taking ginkgo as an example):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
auto enp3s0f0&lt;br /&gt;
iface enp3s0f0 inet manual&lt;br /&gt;
&lt;br /&gt;
auto ens1f0np0&lt;br /&gt;
iface ens1f0np0 inet manual&lt;br /&gt;
&lt;br /&gt;
# csc-cloud management&lt;br /&gt;
auto enp3s0f0.529&lt;br /&gt;
iface enp3s0f0.529 inet manual&lt;br /&gt;
&lt;br /&gt;
auto br529&lt;br /&gt;
iface br529 inet static&lt;br /&gt;
    bridge_ports enp3s0f0.529&lt;br /&gt;
    address 172.19.168.22/27&lt;br /&gt;
iface br529 inet6 static&lt;br /&gt;
    bridge_ports enp3s0f0.529&lt;br /&gt;
    address fd74:6b6a:8eca:4902::22/64&lt;br /&gt;
&lt;br /&gt;
# csc-cloud provider&lt;br /&gt;
auto ens1f0np0.425&lt;br /&gt;
iface ens1f0np0.425 inet manual&lt;br /&gt;
&lt;br /&gt;
auto br425&lt;br /&gt;
iface br425 inet manual&lt;br /&gt;
    bridge_ports ens1f0np0.425&lt;br /&gt;
&lt;br /&gt;
# csc server network&lt;br /&gt;
auto ens1f0np0.134&lt;br /&gt;
iface ens1f0np0.134 inet manual&lt;br /&gt;
&lt;br /&gt;
auto br134&lt;br /&gt;
iface br134 inet static&lt;br /&gt;
    bridge_ports ens1f0np0.134&lt;br /&gt;
    address 129.97.134.148/24&lt;br /&gt;
    gateway 129.97.134.1&lt;br /&gt;
iface br134 inet6 static&lt;br /&gt;
    bridge_ports ens1f0np0.134&lt;br /&gt;
    address 2620:101:f000:4901:c5c::148/64&lt;br /&gt;
    gateway 2620:101:f000:4901::1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Add/modify the following lines to /etc/cloudstack/agent.properties:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
private.network.device=br529&lt;br /&gt;
guest.network.device=br425&lt;br /&gt;
public.network.device=br425&lt;br /&gt;
host=172.19.168.23,172.19.168.24@static&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
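(The &amp;lt;code&amp;gt;@static&amp;lt;/code&amp;gt; suffix on the &amp;lt;code&amp;gt;host&amp;lt;/code&amp;gt; setting selects the agent&#039;s management-server load-balancing algorithm; &amp;lt;code&amp;gt;static&amp;lt;/code&amp;gt; means the agent tries the listed servers in order. &amp;lt;code&amp;gt;roundrobin&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;shuffle&amp;lt;/code&amp;gt; are also supported.)&lt;br /&gt;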
&lt;br /&gt;
=== libvirtd setup ===&lt;br /&gt;
Add/modify the following lines in /etc/libvirt/libvirtd.conf:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
listen_tls = 0&lt;br /&gt;
listen_tcp = 1&lt;br /&gt;
tcp_port = &amp;quot;16509&amp;quot;&lt;br /&gt;
auth_tcp = &amp;quot;none&amp;quot;&lt;br /&gt;
mdns_adv = 0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Uncomment the following line in /etc/default/libvirtd:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
LIBVIRTD_ARGS=&amp;quot;--listen&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Make sure the following lines are present in /etc/libvirt/qemu.conf:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
security_driver=&amp;quot;none&amp;quot;&lt;br /&gt;
user=&amp;quot;root&amp;quot;&lt;br /&gt;
group=&amp;quot;root&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl mask libvirtd.socket&lt;br /&gt;
systemctl mask libvirtd-ro.socket&lt;br /&gt;
systemctl mask libvirtd-admin.socket&lt;br /&gt;
systemctl restart libvirtd&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
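To sanity-check that libvirtd is now listening on TCP (assuming the settings above):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ss -tlnp | grep 16509&lt;br /&gt;
virsh -c qemu+tcp://127.0.0.1/system list --all&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;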
&lt;br /&gt;
== Management server setup (cont&#039;d) ==&lt;br /&gt;
Now start the cloudstack-management systemd service and visit the web UI (https://cloud.csclub.uwaterloo.ca). The default login credentials are username &#039;admin&#039; and password &#039;password&#039;. Start the setup walkthrough (you will be prompted to change the password). Make sure to choose Basic Networking.&lt;br /&gt;
&lt;br /&gt;
The walkthrough is almost certainly going to fail (at least, it did for me). Don&#039;t panic when this happens; just abort the walkthrough, and set up everything else manually. Once primary and secondary storage have been set up, and at least one host has been added, enable the Pod, Cluster and Zone (there should only be one of each).&lt;br /&gt;
&lt;br /&gt;
=== Primary Storage ===&lt;br /&gt;
* Type: RBD&lt;br /&gt;
* IP address: ceph-mon.cloud.csclub.uwaterloo.ca&lt;br /&gt;
* Scope: zone&lt;br /&gt;
* Get the credentials which you created in [[Ceph#CloudStack_Primary_Storage]]&lt;br /&gt;
&lt;br /&gt;
=== Secondary Storage ===&lt;br /&gt;
* Type: NFS&lt;br /&gt;
* Host: ceph-nfs.cloud.csclub.uwaterloo.ca:2049&lt;br /&gt;
* Path: /cloudstack-secondary&lt;br /&gt;
&lt;br /&gt;
=== Global settings ===&lt;br /&gt;
Some global settings which you&#039;ll need to set from the web UI:&lt;br /&gt;
&lt;br /&gt;
* ca.plugin.root.auth.strictness: false (this always caused issues for me, so I just disabled it)&lt;br /&gt;
* host: 172.19.168.23,172.19.168.24  (the VLAN 529 addresses of biloba and chamomile)&lt;br /&gt;
&lt;br /&gt;
=== Adding a host ===&lt;br /&gt;
This is an extremely painful process which I am almost certainly doing wrong. It usually takes me 7-8 attempts to add a single host (that&#039;s not an exaggeration). This is what it looks like:&lt;br /&gt;
&lt;br /&gt;
* Stop cloudstack-agent service&lt;br /&gt;
* Configure /etc/cloudstack/agent.properties&lt;br /&gt;
* Add a host from the CloudStack UI&lt;br /&gt;
* Start cloudstack-agent.service&lt;br /&gt;
&lt;br /&gt;
The reason this takes several attempts is that cloudstack-agent actually &amp;lt;i&amp;gt;overwrites&amp;lt;/i&amp;gt; your agent.properties file. If/when you notice that this has happened, restart the whole process again.&lt;br /&gt;
&lt;br /&gt;
=== Accessing the System VMs ===&lt;br /&gt;
If you need to SSH into one of the System VMs, get its link-local address from the web UI, and run e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -i /var/lib/cloudstack/management/.ssh/id_rsa -p 3922 root@169.254.232.179&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Some more global settings ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
allow.user.expunge.recover.vm = true&lt;br /&gt;
allow.user.view.destroyed.vm = true&lt;br /&gt;
expunge.delay = 1&lt;br /&gt;
expunge.interval = 1&lt;br /&gt;
network.securitygroups.defaultadding = false&lt;br /&gt;
allow.public.user.templates = false&lt;br /&gt;
vm.network.throttling.rate = 0&lt;br /&gt;
network.throttling.rate = 0&lt;br /&gt;
cpu.overprovisioning.factor = 4.0&lt;br /&gt;
allow.user.create.projects = false&lt;br /&gt;
max.project.cpus = 8&lt;br /&gt;
max.project.memory = 8192&lt;br /&gt;
max.project.primary.storage = 40&lt;br /&gt;
max.project.secondary.storage = 20&lt;br /&gt;
max.account.cpus = 8&lt;br /&gt;
max.account.memory = 8192&lt;br /&gt;
max.account.primary.storage = 40&lt;br /&gt;
max.account.secondary.storage = 20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;NOTE&amp;lt;/b&amp;gt;: the &amp;lt;code&amp;gt;cpu.overprovisioning.factor&amp;lt;/code&amp;gt; setting also needs to be set for existing clusters. Go to Infrastructure -&amp;gt; Clusters -&amp;gt; Cluster1 -&amp;gt; Settings and set it accordingly.&lt;br /&gt;
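Alternatively, these settings can be applied from cloudmonkey; a sketch (the second command sets the cluster-level override mentioned in the note, with the cluster UUID taken from &amp;lt;code&amp;gt;listClusters&amp;lt;/code&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
updateConfiguration name=cpu.overprovisioning.factor value=4.0&lt;br /&gt;
updateConfiguration name=cpu.overprovisioning.factor value=4.0 clusterid=&amp;lt;cluster-uuid&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;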
&lt;br /&gt;
=== Firewall ===&lt;br /&gt;
Since we disabled certificate validation from the clients, we&#039;re going to use some iptables-fu on all of the CloudStack hosts (to make our lives easier, we&#039;re going to use the same rules on the management and agent servers):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
iptables -N CLOUDSTACK-SERVICES&lt;br /&gt;
iptables -A INPUT -j CLOUDSTACK-SERVICES&lt;br /&gt;
iptables -A CLOUDSTACK-SERVICES -i lo -j RETURN&lt;br /&gt;
iptables -A CLOUDSTACK-SERVICES -s 172.19.168.0/27 -j RETURN&lt;br /&gt;
iptables -A CLOUDSTACK-SERVICES -p tcp -m multiport --dports 16509,16514,45335,41047,8250 -j REJECT&lt;br /&gt;
iptables-save &amp;gt; /etc/iptables/rules.v4&lt;br /&gt;
&lt;br /&gt;
ip6tables -N CLOUDSTACK-SERVICES&lt;br /&gt;
ip6tables -A INPUT -j CLOUDSTACK-SERVICES&lt;br /&gt;
ip6tables -A CLOUDSTACK-SERVICES -i lo -j RETURN&lt;br /&gt;
ip6tables -A CLOUDSTACK-SERVICES -s fd74:6b6a:8eca:4902::/64 -j RETURN&lt;br /&gt;
ip6tables -A CLOUDSTACK-SERVICES -p tcp -m multiport --dports 16509,16514,45335,41047,8250 -j REJECT&lt;br /&gt;
ip6tables-save &amp;gt; /etc/iptables/rules.v6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
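(This assumes the iptables-persistent package is installed, so that the rules saved in /etc/iptables/rules.v4 and rules.v6 are restored at boot.)&lt;br /&gt;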
&lt;br /&gt;
=== LDAP authentication ===&lt;br /&gt;
Go to Global Settings in the UI, type &#039;ldap&#039; in the search bar, and configure the parameters as needed. Make sure the mail attribute is set to &#039;mailLocalAddress&#039;.&lt;br /&gt;
&lt;br /&gt;
Create a new domain called &#039;Members&#039;. Then go to &#039;LDAP Configuration&#039;, click the &#039;Configure LDAP +&#039; button, and add a new LDAP config linked to the domain you just created.&lt;br /&gt;
&lt;br /&gt;
[[ceo]] handles the creation of CloudStack accounts, so create an API key + secret token and add it to /etc/csc/ceod.ini on biloba.&lt;br /&gt;
&lt;br /&gt;
=== Templates ===&lt;br /&gt;
This deserves an entire page of its own - see [[CloudStack Templates]].&lt;br /&gt;
&lt;br /&gt;
=== Kubernetes ===&lt;br /&gt;
This deserves an entire page of its own - see [[Kubernetes]].&lt;br /&gt;
&lt;br /&gt;
== Upgrading CloudStack ==&lt;br /&gt;
Please be &amp;lt;b&amp;gt;extremely&amp;lt;/b&amp;gt; careful if you decide to upgrade CloudStack. The last time I tried to perform an upgrade (from 4.15 to 4.16), the agents refused to connect to the management servers (or maybe it was the other way around?), and I ended up having to &amp;lt;b&amp;gt;wipe the entire CloudStack installation clean and start again from scratch&amp;lt;/b&amp;gt;. Therefore it is fair to say that nobody has ever managed to successfully upgrade CloudStack on our machines. Do this at your own risk.&lt;br /&gt;
&lt;br /&gt;
If you decide to perform an upgrade, then at the very least, you will need to back up the MariaDB databases (&#039;cloud&#039; and &#039;cloud_usage&#039;), as well as the /etc/cloudstack and /var/lib/cloudstack folders on each of biloba, chamomile and ginkgo.&lt;br /&gt;
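A sketch of the backup step (the output filenames are arbitrary, and mysqldump is assumed to have root access over the local socket):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mysqldump --databases cloud cloud_usage &amp;gt; cloudstack-db-backup.sql&lt;br /&gt;
tar czf cloudstack-etc-backup.tar.gz /etc/cloudstack /var/lib/cloudstack&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Also, good luck.&lt;/div&gt;</summary>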
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Systemd&amp;diff=5278</id>
		<title>Systemd</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Systemd&amp;diff=5278"/>
		<updated>2024-09-13T01:32:24Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Services */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page contains some tips and tricks for writing systemd units on CSC machines.&lt;br /&gt;
&lt;br /&gt;
== Services ==&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If your service should not be restarted by systemd automatically (e.g. because it has its own retry mechanism), set &amp;lt;code&amp;gt;Restart=no&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If your service should be restarted by systemd automatically, make sure you set &amp;lt;code&amp;gt;RestartSec&amp;lt;/code&amp;gt; to a reasonable value so that it does not restart too quickly&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If your service does not need to keep any persistent state on disk, consider using &amp;lt;code&amp;gt;DynamicUser=yes&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If you are running your service as root just so you can read a secret from a file, consider using &amp;lt;code&amp;gt;DynamicUser=yes&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;LoadCredential&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Consider using ProtectSystem, ProtectHome, etc. See https://manpages.debian.org/stable/systemd/systemd.exec.5.en.html for details.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If your service needs to accept network connections (i.e. is a server), use &amp;lt;code&amp;gt;After=network.target&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If your service needs to create network connections (i.e. is a client), use &amp;lt;code&amp;gt;After=network-online.target&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If your service needs to look up LDAP users, use &amp;lt;code&amp;gt;After=nslcd.service sssd.service&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;If your service needs to access a folder on a networked filesystem, use &amp;lt;code&amp;gt;RequiresMountsFor&amp;lt;/code&amp;gt; (a combined example follows this list)&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
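Putting several of these tips together, a unit for a hypothetical network-client service might look like the following (the service name, binary and paths are invented for illustration):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Example client service (hypothetical)&lt;br /&gt;
After=network-online.target&lt;br /&gt;
Wants=network-online.target&lt;br /&gt;
RequiresMountsFor=/users&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
ExecStart=/usr/local/bin/example-client&lt;br /&gt;
Restart=on-failure&lt;br /&gt;
RestartSec=30&lt;br /&gt;
DynamicUser=yes&lt;br /&gt;
# the service reads the secret from $CREDENTIALS_DIRECTORY/api-token&lt;br /&gt;
LoadCredential=api-token:/etc/csc/example-token&lt;br /&gt;
ProtectSystem=strict&lt;br /&gt;
ProtectHome=yes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;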
&lt;br /&gt;
== Timers ==&lt;br /&gt;
Unlike cron, systemd timers do not send email alerts if the job fails. However, you can create your own alerts using &amp;lt;code&amp;gt;OnFailure=&amp;lt;/code&amp;gt;. Paste the following into /usr/local/bin/csc-systemd-email and make it executable:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
# Adapted from https://wiki.archlinux.org/title/systemd/Timers#MAILTO&lt;br /&gt;
&lt;br /&gt;
set -e&lt;br /&gt;
&lt;br /&gt;
if [[ $# -ne 2 ]]; then&lt;br /&gt;
  echo &amp;quot;Usage: $0 &amp;lt;address&amp;gt; &amp;lt;unit&amp;gt;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
  exit 1&lt;br /&gt;
fi&lt;br /&gt;
&lt;br /&gt;
FROM=&amp;quot;Systemd &amp;lt;root@$HOSTNAME&amp;gt;&amp;quot;&lt;br /&gt;
TO=&amp;quot;$1&amp;quot;&lt;br /&gt;
if ! [[ $TO =~ @ ]]; then&lt;br /&gt;
  TO=&amp;quot;$TO@csclub.uwaterloo.ca&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
SUBJECT=&amp;quot;Systemd &amp;lt;root@$HOSTNAME&amp;gt; Unit &#039;$2&#039; failed&amp;quot;&lt;br /&gt;
MESSAGE=&amp;quot;$(systemctl status --full &amp;quot;$2&amp;quot; || true)&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Don&#039;t use the Postfix sendmail. It creates a new spool file and also&lt;br /&gt;
# forks to the background, which we don&#039;t want.&lt;br /&gt;
if [[ -x /usr/sbin/ssmtp ]]; then&lt;br /&gt;
  /usr/sbin/ssmtp -t &amp;lt;&amp;lt;EOF&lt;br /&gt;
To: $TO&lt;br /&gt;
From: $FROM&lt;br /&gt;
Subject: $SUBJECT&lt;br /&gt;
Content-Transfer-Encoding: 8bit&lt;br /&gt;
Content-Type: text/plain; charset=UTF-8&lt;br /&gt;
&lt;br /&gt;
$MESSAGE&lt;br /&gt;
EOF&lt;br /&gt;
elif [[ -x /usr/bin/mutt ]]; then&lt;br /&gt;
  EMAIL=&amp;quot;$FROM&amp;quot; /usr/bin/mutt -F /dev/null -e &amp;quot;set copy=no&amp;quot; -s &amp;quot;$SUBJECT&amp;quot; -- &amp;quot;$TO&amp;quot; &amp;lt;&amp;lt;&amp;lt; &amp;quot;$MESSAGE&amp;quot;&lt;br /&gt;
else&lt;br /&gt;
  echo &amp;quot;Could not find program to email&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
  exit 1&lt;br /&gt;
fi&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Next, paste the following into /etc/systemd/system/csc-email-on-failure@.service:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Send email alert when %i fails&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
ExecStart=/usr/local/bin/csc-systemd-email root@csclub.uwaterloo.ca %i&lt;br /&gt;
# Do not use DynamicUser=true until this issue gets fixed:&lt;br /&gt;
# https://github.com/systemd/systemd/issues/22737&lt;br /&gt;
User=nobody&lt;br /&gt;
# Need to be in the adm group to read journald logs&lt;br /&gt;
Group=adm&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then run &amp;lt;code&amp;gt;systemctl daemon-reload&amp;lt;/code&amp;gt;. Now, all you need to do is add the following line to the &amp;lt;code&amp;gt;[Unit]&amp;lt;/code&amp;gt; section of any service for which you would like to receive email alerts:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
OnFailure=csc-email-on-failure@%n.service&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;strong&amp;gt;IMPORTANT&amp;lt;/strong&amp;gt;: make sure you have the following setting in /etc/ssmtp/ssmtp.conf:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
FromLineOverride=NO&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Otherwise, Mailman 3 will reject the message because the Envelope From does not have a FQDN.&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Main_Page&amp;diff=5275</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Main_Page&amp;diff=5275"/>
		<updated>2024-09-13T01:15:41Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Software Infrastructure */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This is the Wiki of the [[Computer Science Club]]. Feel free to start adding pages and information.&lt;br /&gt;
&lt;br /&gt;
[[Special:AllPages]]&lt;br /&gt;
&lt;br /&gt;
== Member/Club Rep Documentation ==&lt;br /&gt;
To access our Linux machines, see [[How to SSH]] and select one of the general-use machines from [[Machine List#General-Use Servers]].&lt;br /&gt;
&lt;br /&gt;
To host a website, see [[Web Hosting]]. If you are trying to host websites for clubs, see [[Club Hosting]].&lt;br /&gt;
&lt;br /&gt;
To use our VPS services (similar to Linode and Amazon EC2), see [https://docs.cloud.csclub.uwaterloo.ca/ CSC Cloud Documentation]. Note that you&#039;ll need to activate your account on one of CSC&#039;s machines before using the management panel.&lt;br /&gt;
&lt;br /&gt;
To view instructions on playing music at the office, see [[Music]].&lt;br /&gt;
&lt;br /&gt;
To use our Nextcloud instance (similar to Google Drive and Dropbox), go to [https://files.csclub.uwaterloo.ca CSC Files].&lt;br /&gt;
&lt;br /&gt;
=== Guides ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[New Member Guide]]&lt;br /&gt;
* [[Club Hosting]]&lt;br /&gt;
* [[Web Hosting]]&lt;br /&gt;
* [[Git Hosting]]&lt;br /&gt;
* [[How to IRC]]&lt;br /&gt;
* [[How to SSH]]&lt;br /&gt;
* [[MySQL]]&lt;br /&gt;
* [[PostgreSQL]]&lt;br /&gt;
* [https://docs.cloud.csclub.uwaterloo.ca/ CSC Cloud Documentation]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== News and Events ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Meetings]]&lt;br /&gt;
* [[Talks]]&lt;br /&gt;
* [[Projects]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Committees Documentation ==&lt;br /&gt;
=== Club Operation ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Budget Guide]]&lt;br /&gt;
* [[ceo]]&lt;br /&gt;
* [[Exec Manual]]&lt;br /&gt;
* [[MEF Guide]]&lt;br /&gt;
* [[Office Policies]]&lt;br /&gt;
* [[Office Staff]]&lt;br /&gt;
* [[Sysadmin Guide]]&lt;br /&gt;
* [[How to (Extra) Ban Someone]]&lt;br /&gt;
* [[SCS Guide]]&lt;br /&gt;
* [[Kerberos |Password Reset]]&lt;br /&gt;
* [[Keys and Fobs]]&lt;br /&gt;
* [[Talks Guide]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Hardware Infrastructure (the bare metals) ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Disk Drive RMA Process]]&lt;br /&gt;
* [[Machine List]]&lt;br /&gt;
* [[IPMI101]]&lt;br /&gt;
* [[New NetApp]]&lt;br /&gt;
* [[Switches]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Software Infrastructure ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[ADFS]]&lt;br /&gt;
* [[Backups]]&lt;br /&gt;
* [[DNS]]&lt;br /&gt;
* [[Debian Repository]]&lt;br /&gt;
* [[Firewall]]&lt;br /&gt;
* [[Kerberos]]&lt;br /&gt;
* [[Keycloak]]&lt;br /&gt;
* [[KVM]]&lt;br /&gt;
* [[LDAP]]&lt;br /&gt;
* [[Network]]&lt;br /&gt;
* [[New CSC Machine]]&lt;br /&gt;
* [[Observability]]&lt;br /&gt;
* [[OID Assignment]]&lt;br /&gt;
* [[Podman]]&lt;br /&gt;
* [[Scratch]]&lt;br /&gt;
* [[SNMP]]&lt;br /&gt;
* [[SSL]]&lt;br /&gt;
* [[Syscom Todo]]&lt;br /&gt;
* [[Systemd]]&lt;br /&gt;
* [[Systemd-nspawn]]&lt;br /&gt;
* [[Two-Factor Authentication]]&lt;br /&gt;
* [[UID/GID Assignment]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Services ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Application List]]&lt;br /&gt;
* [[BigBlueButton]]&lt;br /&gt;
* [[Mail]]&lt;br /&gt;
* [[Mailing Lists]]&lt;br /&gt;
* [[Mirror]]&lt;br /&gt;
* [[Music]]&lt;br /&gt;
* [[Nextcloud]]&lt;br /&gt;
* [[Printing]]&lt;br /&gt;
* [[Pulseaudio]]&lt;br /&gt;
* [[Webmail]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== CSC Cloud ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Ceph]]&lt;br /&gt;
* [[Cloud Networking]]&lt;br /&gt;
* [[CloudStack]]&lt;br /&gt;
* [[CloudStack Templates]]&lt;br /&gt;
* [[Kubernetes]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Miscellaneous ==&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Acronyms]]&lt;br /&gt;
* [[Budget]]&lt;br /&gt;
* [[Executive]]&lt;br /&gt;
* [[Past Executive]]&lt;br /&gt;
* [[History]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Historical ==&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Robot Arm]]&lt;br /&gt;
* [[Webcams]]&lt;br /&gt;
* [[Website]]&lt;br /&gt;
* [[Digital Cutter]]&lt;br /&gt;
* [[Electronics]]&lt;br /&gt;
* [[NetApp]]&lt;br /&gt;
* [[Frosh]]&lt;br /&gt;
* [[Virtualization (LXC Containers)]]&lt;br /&gt;
* [[Serial Connections]]&lt;br /&gt;
* [[Library]]&lt;br /&gt;
* [[MEF Proposals]]&lt;br /&gt;
* [[Proposed Constitution Changes]]&lt;br /&gt;
* [[NFS/Kerberos]]&lt;br /&gt;
* [[Hardware]]&lt;br /&gt;
* [[Imapd Guide]]&lt;br /&gt;
__NOTOC__&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=SSL&amp;diff=5273</id>
		<title>SSL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=SSL&amp;diff=5273"/>
		<updated>2024-09-11T13:01:13Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* How to add a new SSL cert for a custom domain on CSC cloud */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== GlobalSign ==&lt;br /&gt;
&lt;br /&gt;
The CSC currently has an SSL Certificate from GlobalSign for *.csclub.uwaterloo.ca provided at no cost to us through IST. GlobalSign likes to take a long time to respond to certificate signing requests (CSR) for wildcard certs, so our CSR really needs to be handed off to IST at least 2 weeks in advance. You can renew sooner; the new certificate&#039;s expiry date will be the old expiry date + 1 year (+ a 30-day bonus). Having an invalid cert for any length of time leads to terrible breakage, followed by terrible workarounds and prolonged problems.&lt;br /&gt;
&lt;br /&gt;
When the certificate is due to expire in a month or two, syscom should (but apparently doesn&#039;t always) get an email notification. This will include a renewal link. Otherwise, use the [https://uwaterloo.ca/information-systems-technology/about/organizational-structure/information-security-services/certificate-authority/globalsign-signed-x5093-certificates/self-service-globalsign-ssl-certificates IST-CA self service system]. Please keep a copy of the key, CSR and (once issued) certificate in &amp;lt;tt&amp;gt;/home/sysadmin/certs&amp;lt;/tt&amp;gt;. The OpenSSL examples linked there are good to generate a 2048-bit RSA key and a corresponding CSR. It&#039;s probably a good idea to change the private key (as it&#039;s not that much effort anyway). Just make sure your CSR is for &amp;lt;tt&amp;gt;*.csclub.uwaterloo.ca&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
At the self-service portal, these options worked in 2013. If you need IST assistance, [mailto:ist-ca@uwaterloo.ca ist-ca@uwaterloo.ca] is the email address you should contact.&lt;br /&gt;
  Products: OrganizationSSL&lt;br /&gt;
  SSL Certificate Type: Wildcard SSL Certificate&lt;br /&gt;
  Validity Period: 1 year&lt;br /&gt;
  Are you switching from a Competitor? No, I am not switching&lt;br /&gt;
  Are you renewing this Certificate? Yes (paste current certificate)&lt;br /&gt;
  30-day bonus: Yes (why not?)&lt;br /&gt;
  Add specific Subject Alternative Names (SANs): No (*.csclub.uwaterloo.ca automatically adds csclub.uwaterloo.ca as a SAN)&lt;br /&gt;
  Enter Certificate Signing Request (CSR): Yes (paste CSR)&lt;br /&gt;
  Contact Information:&lt;br /&gt;
    First Name: Computer Science Club&lt;br /&gt;
    Last Name: Systems Committee&lt;br /&gt;
    Telephone: +1 519 888 4567 x33870&lt;br /&gt;
    Email Address: syscom@csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
=== Helpful links ===&lt;br /&gt;
* [https://support.globalsign.com/ssl/ssl-certificates-installation/generate-csr-openssl How to generate a new CSR and private key]&lt;br /&gt;
* [https://uwaterloo.atlassian.net/wiki/spaces/ISTKB/pages/262013183/How+to+obtain+a+new+GlobalSign+certificate+or+renew+an+existing+one How to obtain a new GlobalSign certificate or renew an existing one]&lt;br /&gt;
* [https://system.globalsign.com/bm/public/certificate/poporder.do?domain=PAR12271n5w6s27pvg8d92v4150t GlobalSign UWaterloo self-service page]&lt;br /&gt;
* [https://support.globalsign.com/ca-certificates/intermediate-certificates/organizationssl-intermediate-certificates GlobalSign intermediate certificate] (needed to create a certificate chain; see below)&lt;br /&gt;
&lt;br /&gt;
=== OpenSSL cheat sheet ===&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Generate a new CSR and private key (do this in a new directory):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl req -out csclub.uwaterloo.ca.csr -new -newkey rsa:2048 -keyout csclub.uwaterloo.ca.key -nodes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Enter the following information at the prompts:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Country Name (2 letter code) [AU]:CA&lt;br /&gt;
State or Province Name (full name) [Some-State]:Ontario&lt;br /&gt;
Locality Name (eg, city) []:Waterloo&lt;br /&gt;
Organization Name (eg, company) [Internet Widgits Pty Ltd]:University of Waterloo&lt;br /&gt;
Organizational Unit Name (eg, section) []:Computer Science Club&lt;br /&gt;
Common Name (e.g. server FQDN or YOUR name) []:*.csclub.uwaterloo.ca&lt;br /&gt;
Email Address []:systems-committee@csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
Please enter the following &#039;extra&#039; attributes&lt;br /&gt;
to be sent with your certificate request&lt;br /&gt;
A challenge password []:&lt;br /&gt;
An optional company name []:&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
View the information inside a CSR:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl req -noout -text -in csclub.uwaterloo.ca.csr&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
View the information inside a private key:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl pkey -noout -text -in csclub.uwaterloo.ca.key&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
View information inside a certificate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl x509 -noout -text -in csclub.uwaterloo.ca.crt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
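A related check that often comes in handy: confirm that a private key actually matches a certificate by comparing their public-key moduli (plain OpenSSL; the filenames follow the cheat sheet above):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# The two hashes should be identical&lt;br /&gt;
openssl x509 -noout -modulus -in csclub.uwaterloo.ca.crt | sha256sum&lt;br /&gt;
openssl rsa -noout -modulus -in csclub.uwaterloo.ca.key | sha256sum&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;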
&lt;br /&gt;
=== csclub.cloud ===&lt;br /&gt;
Once a year, someone from IST will ask us to create a temporary TXT record for csclub.cloud to prove to GlobalSign that we own it. This must be created at the &amp;lt;b&amp;gt;root&amp;lt;/b&amp;gt; of the domain. Since this zone is managed dynamically (via the acme.sh script on biloba, see below), we need to freeze the zone and update /var/lib/bind/db.csclub.cloud directly.&lt;br /&gt;
&lt;br /&gt;
Once you&#039;re on the correct server (dns1, not biloba), here are the steps:&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;rndc freeze csclub.cloud&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Open /var/lib/bind/db.csclub.cloud and add a new TXT record. It&#039;ll look something like&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
@ IN TXT &amp;quot;_globalsign-domain-verification=blablabla&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
In the same file, make sure to also update the SOA serial number. It should generally be YYYYMMDDNN, where YYYYMMDD is the current date and NN is a counter that increases monotonically within the day (e.g. 2024091101).&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;rndc reload&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Run a DNS query to make sure you can see the TXT record:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dig -t txt @dns1 csclub.cloud&lt;br /&gt;
dig -t txt @dns2 csclub.cloud&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Email back the person from IST and let them know that we created the TXT record.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Once the certificate has been renewed, delete the TXT record, update the SOA serial number, and run &amp;lt;code&amp;gt;rndc reload&amp;lt;/code&amp;gt;.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;rndc thaw csclub.cloud&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Certificate Files ==&lt;br /&gt;
Let&#039;s say you obtain a new certificate for *.csclub.uwaterloo.ca. Here are the files which should be stored in the certs folder:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;csclub.uwaterloo.ca.key: private key created by openssl&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;csclub.uwaterloo.ca.csr: certificate signing request created by openssl&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;order: order number from GlobalSign&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;csclub.uwaterloo.ca.crt: certificate created by GlobalSign&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;globalsign-intermediate.crt: intermediate certificate from GlobalSign, obtainable from [https://support.globalsign.com/ca-certificates/intermediate-certificates/organizationssl-intermediate-certificates here]. As of this writing, we use the &amp;quot;OrganizationSSL SHA-256 R3 Intermediate Certificate&amp;quot;. Just click the &amp;quot;View in Base64&amp;quot; button and copy the contents.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;There is an alternative way to get the intermediate certificate: if you run &amp;lt;code&amp;gt;openssl x509 -noout -text -in csclub.uwaterloo.ca.crt&amp;lt;/code&amp;gt;, under X509v3 extensions &amp;gt; Authority Information Access, there should be a field called &amp;quot;CA Issuers&amp;quot; which has a URL which looks like http://secure.globalsign.com/cacert/gsrsaovsslca2018.crt. You can download that file and convert it to PEM:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget https://secure.globalsign.com/cacert/gsrsaovsslca2018.crt&lt;br /&gt;
openssl x509 -inform der -in gsrsaovsslca2018.crt -out globalsign-intermediate.crt&lt;br /&gt;
rm gsrsaovsslca2018.crt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;csclub.uwaterloo.ca.chain: create this with the following command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cat csclub.uwaterloo.ca.crt globalsign-intermediate.crt &amp;gt; csclub.uwaterloo.ca.chain&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;csclub.uwaterloo.ca.pem: create this with the following command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cat csclub.uwaterloo.ca.key csclub.uwaterloo.ca.chain &amp;gt; csclub.uwaterloo.ca.pem&lt;br /&gt;
chmod 600 csclub.uwaterloo.ca.pem&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
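Before deploying the new files anywhere, it&#039;s worth sanity-checking that the chain fits together. A quick check (this assumes the GlobalSign root is already in the system trust store, which it normally is):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Should print &amp;quot;csclub.uwaterloo.ca.crt: OK&amp;quot;&lt;br /&gt;
openssl verify -untrusted globalsign-intermediate.crt csclub.uwaterloo.ca.crt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;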
&lt;br /&gt;
== Certificate Locations ==&lt;br /&gt;
&lt;br /&gt;
Keep a copy of newly generated certificates in /users/sysadmin/certs.&lt;br /&gt;
&lt;br /&gt;
Below is a list of places where you&#039;ll need to put the new certificate to keep our services running. The private key (if applicable) should be kept next to the certificate with the extension .key.&lt;br /&gt;
&lt;br /&gt;
* caffeine:/etc/ssl/private/csclub-wildcard.crt (for Apache)&lt;br /&gt;
* coffee:/etc/ssl/private/csclub.uwaterloo.ca (for PostgreSQL and MariaDB)&lt;br /&gt;
* &amp;lt;s&amp;gt;mail:/etc/ssl/private/csclub-wildcard.crt (for Apache, Postfix and Dovecot)&amp;lt;/s&amp;gt; (UPDATE: we use certbot now for these)&lt;br /&gt;
* mailman:/etc/ssl/private/csclub-wildcard-chain.crt (for Apache)&lt;br /&gt;
* rt:/etc/ssl/private/csclub-wildcard.crt (for Apache)&lt;br /&gt;
* potassium-benzoate:/etc/ssl/private/csclub-wildcard.crt (for nginx)&lt;br /&gt;
* phosphoric-acid:/etc/ssl/private/csclub-wildcard-chain.crt (for ceod)&lt;br /&gt;
* auth1:/etc/ssl/private/csclub-wildcard.crt (for slapd, make sure to &amp;lt;code&amp;gt;sudo service slapd restart&amp;lt;/code&amp;gt;)&lt;br /&gt;
* auth2:/etc/ssl/private/csclub-wildcard.crt (for slapd, make sure to &amp;lt;code&amp;gt;sudo service slapd restart&amp;lt;/code&amp;gt;)&lt;br /&gt;
* mattermost:/etc/ssl/private/csclub-wildcard.crt (for nginx)&lt;br /&gt;
* load-balancer-0(1|2):/etc/ssl/private/csclub.uwaterloo.ca (for haproxy) [temporarily down 2020]&lt;br /&gt;
* chat:/etc/ssl/private/csclub-wildcard-chain.crt (for nginx)&lt;br /&gt;
* prometheus:/etc/ssl/private/csclub-wildcard-chain.crt (for Apache)&lt;br /&gt;
* bigbluebutton:/etc/nginx/ssl/csclub-wildcard-chain.crt (podman container on xylitol)&lt;br /&gt;
* icy:/etc/ssl/private/csclub-wildcard.pem (for Icecast)&lt;br /&gt;
* chamomile:/etc/ssl/private/cloud.csclub.uwaterloo.ca.chain.crt, /etc/ssl/private/csclub.cloud.chain, /etc/ssl/private/csclub.uwaterloo.ca.chain (for nginx)&lt;br /&gt;
* biloba:/etc/ssl/private/cloud.csclub.uwaterloo.ca.chain.crt, /etc/ssl/private/csclub.cloud.chain, /etc/ssl/private/csclub.uwaterloo.ca.chain (for nginx)&lt;br /&gt;
* nextcloud (nspawn container inside guayusa): /etc/ssl/private/csclub.uwaterloo.ca.chain (for nginx)&lt;br /&gt;
* citric-acid (runs vaultwarden): /etc/ssl/private/csclub.uwaterloo.ca.{chain,key} (for nginx)&lt;br /&gt;
&lt;br /&gt;
Some services (e.g. Dovecot, Postfix) prefer to have the certificate chain in one file. Concatenate the appropriate intermediate certificate to the end of the certificate and store this as csclub-wildcard-chain.crt.&lt;br /&gt;
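For example, assuming the filenames used elsewhere on this page:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cat csclub-wildcard.crt globalsign-intermediate.crt &amp;gt; csclub-wildcard-chain.crt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;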
&lt;br /&gt;
=== More certificate locations ===&lt;br /&gt;
We have some SSL certificates which are not used by web servers, but still need to be renewed eventually.&lt;br /&gt;
&lt;br /&gt;
==== Prometheus node exporter ====&lt;br /&gt;
All of our Prometheus node exporters are using mTLS via stunnel (every bare-metal host, as well as caffeine, coffee and mail, is running this exporter). The certificates (both client and server) are set to expire in &amp;lt;b&amp;gt;September 2031&amp;lt;/b&amp;gt;; before then, create new keypairs in /opt/prometheus/tls, and deploy the new server.crt, node.crt and node.key to /etc/stunnel/tls on all machines. Restart prometheus and all of the node exporters.&lt;br /&gt;
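A minimal sketch for generating a replacement keypair with OpenSSL (the subject name, key size and lifetime here are placeholders; match whatever the existing certs in /opt/prometheus/tls use, and keep the client and server certs consistent so mTLS keeps working):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# 10-year self-signed keypair; adjust -subj and -days as appropriate&lt;br /&gt;
openssl req -x509 -newkey rsa:2048 -nodes -days 3650 \&lt;br /&gt;
    -subj &amp;quot;/CN=node&amp;quot; -keyout node.key -out node.crt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;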
&lt;br /&gt;
==== ADFS ====&lt;br /&gt;
See [[ADFS]]. When the university&#039;s IdP certificate expires (&amp;lt;b&amp;gt;October 2025&amp;lt;/b&amp;gt;), we can just download a new one and restart Apache; when our own certificate expires (&amp;lt;b&amp;gt;July 2031&amp;lt;/b&amp;gt;), we need to submit a new form to IST (please do this &amp;lt;i&amp;gt;before&amp;lt;/i&amp;gt; the cert expires).&lt;br /&gt;
&lt;br /&gt;
==== Keycloak ====&lt;br /&gt;
See [[Keycloak]]. When the saml-passthrough certificate expires (&amp;lt;b&amp;gt;January 2032&amp;lt;/b&amp;gt;), you need to create a new keypair in /srv/saml-passthrough on caffeine, and upload the new certificate into the Keycloak UI (IdP settings). When the Keycloak SP certificate expires (&amp;lt;b&amp;gt;December 2031&amp;lt;/b&amp;gt;), make sure to create a new keypair and upload it to the Keycloak UI (Realm Settings).&lt;br /&gt;
&lt;br /&gt;
== letsencrypt ==&lt;br /&gt;
&lt;br /&gt;
We support letsencrypt for our virtual hosts with custom domains. We use the &amp;lt;tt&amp;gt;certbot&amp;lt;/tt&amp;gt; package from the Debian repositories with a configuration file at &amp;lt;tt&amp;gt;/etc/letsencrypt/cli.ini&amp;lt;/tt&amp;gt;, and a systemd timer to handle renewals.&lt;br /&gt;
&lt;br /&gt;
The setup for a new domain is:&lt;br /&gt;
&lt;br /&gt;
# Become &amp;lt;tt&amp;gt;certbot&amp;lt;/tt&amp;gt; on caffeine with &amp;lt;tt&amp;gt;sudo -u certbot bash&amp;lt;/tt&amp;gt; or similar.&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;certbot certonly -c /etc/letsencrypt/cli.ini -d DOMAIN --logs-dir /tmp&amp;lt;/tt&amp;gt;. The logs-dir isn&#039;t important and is only needed for troubleshooting.&lt;br /&gt;
# Set up the Apache site configuration using the example below (the Apache config lives in /etc/apache2). Note the permanent redirect to HTTPS.&lt;br /&gt;
# Make sure to commit your changes when you&#039;re done.&lt;br /&gt;
# Reload the Apache config with &amp;lt;tt&amp;gt;sudo systemctl reload apache2&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;VirtualHost *:80&amp;gt;&lt;br /&gt;
     ServerName example.com&lt;br /&gt;
     ServerAlias *.example.com&lt;br /&gt;
     ServerAdmin example@csclub.uwaterloo.ca&lt;br /&gt;
 &lt;br /&gt;
     #DocumentRoot /users/example/www/&lt;br /&gt;
     Redirect permanent / https://example.com/&lt;br /&gt;
 &lt;br /&gt;
     ErrorLog /var/log/apache2/example-error.log&lt;br /&gt;
     CustomLog /var/log/apache2/example-access.log combined&lt;br /&gt;
 &amp;lt;/VirtualHost&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
 &amp;lt;VirtualHost csclub:443&amp;gt;&lt;br /&gt;
     SSLEngine on&lt;br /&gt;
     SSLCertificateFile /etc/letsencrypt/live/example.com/fullchain.pem&lt;br /&gt;
     SSLCertificateKeyFile /etc/letsencrypt/live/example.com/privkey.pem&lt;br /&gt;
     SSLStrictSNIVHostCheck on&lt;br /&gt;
 &lt;br /&gt;
     ServerName example.com&lt;br /&gt;
     ServerAlias *.example.com&lt;br /&gt;
     ServerAdmin example@csclub.uwaterloo.ca&lt;br /&gt;
 &lt;br /&gt;
     DocumentRoot /users/example/www&lt;br /&gt;
 &lt;br /&gt;
     ErrorLog /var/log/apache2/example-error.log&lt;br /&gt;
     CustomLog /var/log/apache2/example-access.log combined&lt;br /&gt;
 &amp;lt;/VirtualHost&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== acme.sh ==&lt;br /&gt;
We are using [https://github.com/acmesh-official/acme.sh acme.sh] for provisioning SSL certificates for some of our *.csclub.cloud domains. It is currently set up under /root/.acme.sh on biloba.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;NOTE&amp;lt;/b&amp;gt;: acme.sh has a cron job which automatically renews certificates before they expire and reloads NGINX, so you do not have to do anything after issuing and installing a certificate (i.e. &amp;quot;set-and-forget&amp;quot;).&lt;br /&gt;
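If you want to double-check that the renewal cron job is actually installed, something like this (as root on biloba) should show it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
crontab -l | grep acme.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;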
&lt;br /&gt;
=== How to add a new SSL cert for a custom domain on CSC cloud ===&lt;br /&gt;
Note: you do not need to acquire a new cert if the requested domain is directly on csclub.cloud, e.g. app1.csclub.cloud. We can re-use our wildcard cert on csclub.cloud for that. However, if a user requests a multi-level domain on csclub.cloud, or a domain hosted on an external registrar, then you will need to create a new cert.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s say user &amp;lt;code&amp;gt;ctdalek&amp;lt;/code&amp;gt; wants &amp;lt;code&amp;gt;mydomain.com&amp;lt;/code&amp;gt; to point to a VM on CSC cloud.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
TLDR:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Obtain the cert.&lt;br /&gt;
# If a subdomain was also requested, pass the -d option multiple times, e.g.&lt;br /&gt;
# `-d mydomain.com -d sub.mydomain.com`. Make sure the &amp;quot;main&amp;quot; domain is specified first.&lt;br /&gt;
acme.sh --issue -d mydomain.com -w /var/www&lt;br /&gt;
&lt;br /&gt;
# Install the cert.&lt;br /&gt;
# If a subdomain was also requested, only specify the &amp;quot;main&amp;quot; domain.&lt;br /&gt;
acme.sh --install-cert -d mydomain.com \&lt;br /&gt;
    --key-file /etc/nginx/ceod/member-ssl/mydomain.com.key \&lt;br /&gt;
    --fullchain-file /etc/nginx/ceod/member-ssl/mydomain.com.chain \&lt;br /&gt;
    --reloadcmd &amp;quot;/root/bin/reload-nginx.sh&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Create a vhost file.&lt;br /&gt;
# Look at the other files in the same directory for inspiration.&lt;br /&gt;
# Make sure the file starts with the username and an underscore, e.g. &amp;quot;ctdalek_&amp;quot;,&lt;br /&gt;
# because this is how ceod keeps track of the vhosts.&lt;br /&gt;
# Make sure to set the custom domain name(s) and paths to the SSL key/cert.&lt;br /&gt;
vim /etc/nginx/ceod/member-vhosts/ctdalek_mydomain.com&lt;br /&gt;
&lt;br /&gt;
# Finally, reload NGINX on both biloba and chamomile. The /etc/nginx/ceod directory&lt;br /&gt;
# is shared between them.&lt;br /&gt;
/root/bin/reload-nginx.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Installation ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /opt&lt;br /&gt;
git clone --depth 1 https://github.com/acmesh-official/acme.sh&lt;br /&gt;
cd acme.sh&lt;br /&gt;
./acme.sh --install -m syscom@csclub.uwaterloo.ca&lt;br /&gt;
. &amp;quot;/root/.acme.sh/acme.sh.env&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Important&amp;lt;/b&amp;gt;: If invoking acme.sh from another program, it needs the environment variables set in acme.sh.env. Currently, that is just&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
LE_WORKING_DIR=&amp;quot;/root/.acme.sh&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For testing purposes, make sure to use the Let&#039;s Encrypt test server:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh --set-default-ca --server letsencrypt_test&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== NGINX setup ===&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir -p /var/www/.well-known/acme-challenge&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Add the following snippet to your default NGINX file (e.g. /etc/nginx/sites-enabled/default):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  # For Let&#039;s Encrypt&lt;br /&gt;
  location /.well-known/acme-challenge/ {&lt;br /&gt;
    alias /var/www/.well-known/acme-challenge/;&lt;br /&gt;
  }&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now, assuming that *.csclub.cloud resolves to biloba&#039;s IP address, you can test that everything is working:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh --issue -d app.merenber.csclub.cloud -w /var/www&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To install a certificate after it&#039;s been issued:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh --install-cert -d app.merenber.csclub.cloud \&lt;br /&gt;
    --key-file /etc/nginx/ceod/member-ssl/app.merenber.csclub.cloud.key \&lt;br /&gt;
    --fullchain-file /etc/nginx/ceod/member-ssl/app.merenber.csclub.cloud.chain \&lt;br /&gt;
    --reloadcmd &amp;quot;/root/bin/reload-nginx.sh&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
At this point, you should add your NGINX vhost file which uses that SSL certificate.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
To remove a certificate:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh --remove -d app.merenber.csclub.cloud&lt;br /&gt;
rm -r /root/.acme.sh/app.merenber.csclub.cloud&lt;br /&gt;
rm /etc/nginx/ceod/member-ssl/app.merenber.csclub.cloud.chain&lt;br /&gt;
rm /etc/nginx/ceod/member-ssl/app.merenber.csclub.cloud.key&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Don&#039;t forget to remove the NGINX vhost file too.&lt;br /&gt;
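For example (the exact filename is hypothetical and depends on the naming convention described above):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
rm /etc/nginx/ceod/member-vhosts/merenber_app.merenber.csclub.cloud&lt;br /&gt;
/root/bin/reload-nginx.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;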
&lt;br /&gt;
Once you think you&#039;re ready, use a real ACME provider, e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh --set-default-ca --server letsencrypt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Since we have a [https://zerossl.com ZeroSSL] account, and ZeroSSL has no rate limit, we are going to use that instead:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
acme.sh  --register-account  --server zerossl \&lt;br /&gt;
        --eab-kid  xxxxxxxxxxxx  \&lt;br /&gt;
        --eab-hmac-key  xxxxxxxxx&lt;br /&gt;
acme.sh --set-default-ca  --server zerossl&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== DNS challenge ===&lt;br /&gt;
To obtain a wildcard certificate (e.g. *.k8s.csclub.cloud), you will need to perform the DNS-01 challenge. We are going to use nsupdate to interact with our BIND9 server on dns1.&lt;br /&gt;
&lt;br /&gt;
On dns1, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tsig-keygen csc-cloud&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Paste the output into the appropriate section in /etc/bind/named.conf.local. Also paste it into a file somewhere on biloba, e.g. /etc/csc/csc-cloud-tsig.key.&lt;br /&gt;
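The output is a named.conf-style key block, something like this (secret redacted):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
key &amp;quot;csc-cloud&amp;quot; {&lt;br /&gt;
        algorithm hmac-sha256;&lt;br /&gt;
        secret &amp;quot;...&amp;quot;;&lt;br /&gt;
};&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;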
&lt;br /&gt;
Add the following to the csclub.cloud zone block:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  allow-update {&lt;br /&gt;
    !{&lt;br /&gt;
      !127.0.0.1;&lt;br /&gt;
      !::1;&lt;br /&gt;
      !129.97.134.0/24;&lt;br /&gt;
      !2620:101:f000:4901::/64;&lt;br /&gt;
      any;&lt;br /&gt;
    };&lt;br /&gt;
    key csc-cloud;&lt;br /&gt;
  };&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(We&#039;re basically trying to restrict updates to the given IP ranges. See https://serverfault.com/a/417229.)&lt;br /&gt;
&lt;br /&gt;
The &#039;bind&#039; user can&#039;t write to files under /etc/bind, so we&#039;re going to move our zone file to /var/lib/bind instead.&lt;br /&gt;
Comment out &#039;file &amp;quot;/etc/bind/db.csclub.cloud&amp;quot;;&#039; from named.conf.local and add this line below it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  file &amp;quot;/var/lib/bind/db.csclub.cloud&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  cp /etc/bind/db.csclub.cloud /var/lib/bind/db.csclub.cloud&lt;br /&gt;
  chown bind:bind /var/lib/bind/db.csclub.cloud&lt;br /&gt;
  rndc reload&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On biloba, check that everything&#039;s working:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
nsupdate -k /etc/csc/csc-cloud-tsig.key -v &amp;lt;&amp;lt;EOF&lt;br /&gt;
update add test.csclub.cloud 300 A 0.0.0.0&lt;br /&gt;
send&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Use a tool such as &amp;lt;code&amp;gt;dig&amp;lt;/code&amp;gt; to make sure that the update was successful.&lt;br /&gt;
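For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dig +short -t a @dns1.csclub.uwaterloo.ca test.csclub.cloud&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;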
If it worked, you can delete the record:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
nsupdate -k /etc/csc/csc-cloud-tsig.key -v &amp;lt;&amp;lt;EOF&lt;br /&gt;
update delete test.csclub.cloud&lt;br /&gt;
send&lt;br /&gt;
EOF&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now we are ready to actually perform the challenge with acme.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  export NSUPDATE_SERVER=&amp;quot;dns1.csclub.uwaterloo.ca&amp;quot;&lt;br /&gt;
  export NSUPDATE_KEY=&amp;quot;/etc/csc/csc-cloud-tsig.key&amp;quot;&lt;br /&gt;
  acme.sh --issue --dns dns_nsupdate -d &#039;k8s.csclub.cloud&#039; -d &#039;*.k8s.csclub.cloud&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
(If something goes wrong, use the &amp;lt;code&amp;gt;--debug&amp;lt;/code&amp;gt; flag.)&lt;br /&gt;
&lt;br /&gt;
If all went well, just install the certificate as usual:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  acme.sh --install-cert -d k8s.csclub.cloud \&lt;br /&gt;
    --key-file /etc/nginx/ceod/syscom-ssl/k8s.csclub.cloud.key \&lt;br /&gt;
    --fullchain-file /etc/nginx/ceod/syscom-ssl/k8s.csclub.cloud.chain \&lt;br /&gt;
    --reloadcmd &#039;systemctl reload nginx&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Podman&amp;diff=5268</id>
		<title>Podman</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Podman&amp;diff=5268"/>
		<updated>2024-08-16T13:28:43Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Networking */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[https://podman.io/ Podman] is a very neat Docker-compatible container solution. Some of the advantages it has over Docker are:&lt;br /&gt;
&lt;br /&gt;
* no daemon (uses a fork-and-exec model)&lt;br /&gt;
* systemd can run inside containers very easily&lt;br /&gt;
* containers can become systemd services on the host&lt;br /&gt;
* non-root users can run containers&lt;br /&gt;
&lt;br /&gt;
== Installation ==&lt;br /&gt;
As of bullseye, podman is available in the official Debian repositories. I suggest installing it from the unstable distribution, since podman 3.2 has many useful improvements over previous versions:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install -t unstable podman podman-docker &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The podman-docker package provides a wrapper script so that running the command &#039;docker&#039; will invoke podman. Recent versions of podman also provide API compatibility with Docker, which means that docker-compose will actually work out of the box. (For non-root users, you will need to set the DOCKER_HOST environment variable to &amp;lt;code&amp;gt;unix://$XDG_RUNTIME_DIR/podman/podman.sock&amp;lt;/code&amp;gt;).&lt;br /&gt;
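For example, a rootless user could point docker-compose at the Podman socket like this (a sketch; the podman.socket user unit ships with the podman package):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Start the Docker-compatible API socket for this user&lt;br /&gt;
systemctl --user enable --now podman.socket&lt;br /&gt;
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock&lt;br /&gt;
docker-compose up -d&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;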
&lt;br /&gt;
I suggest adding the following to /etc/containers/registries.conf so that podman automatically pulls images from docker.io instead of quay.io:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[registries.search]&lt;br /&gt;
registries = [&#039;docker.io&#039;]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Networking ==&lt;br /&gt;
As of this writing (2024-08-16), the latest network backend in Podman is [https://github.com/containers/netavark netavark]. Hosts which are still using the legacy CNI backend should switch to netavark as soon as possible, because support for CNI will be removed in Podman 5.0. Unfortunately, the officially recommended way to migrate from CNI to netavark is to run &amp;quot;podman system reset&amp;quot;, which deletes &#039;&#039;&#039;everything&#039;&#039;&#039; (containers, images, networks, etc.). This is usually undesirable. Here&#039;s what I suggest instead (assuming you don&#039;t have custom Podman networks):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Stop all running containers.&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;echo -n netavark &amp;amp;gt; /var/lib/containers/storage/defaultNetworkBackend&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Restart the stopped containers.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you had custom networks before, this is trickier. You will need to manually convert the CNI JSON file into the netavark JSON format (under /etc/containers/networks).&lt;br /&gt;
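Alternatively, it may be simpler to note down the network&#039;s settings and recreate it under netavark instead of converting the JSON by hand (a sketch; the network name and subnet are hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# While still on CNI: record the subnets, gateways and options&lt;br /&gt;
podman network inspect mynet&lt;br /&gt;
# After switching the backend: recreate it with the same settings&lt;br /&gt;
podman network rm mynet&lt;br /&gt;
podman network create --subnet 10.89.0.0/24 --gateway 10.89.0.1 mynet&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;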
&lt;br /&gt;
=== Directly exposing a container to a public network ===&lt;br /&gt;
The easiest way to do this, in my opinion, is with a macvlan network. Here&#039;s an example of how this was done for [[BigBlueButton]] on xylitol:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman network create \&lt;br /&gt;
  --driver=macvlan \&lt;br /&gt;
  --ipv6 \&lt;br /&gt;
  --opt parent=br0 \&lt;br /&gt;
  --subnet=129.97.134.0/24 \&lt;br /&gt;
  --gateway=129.97.134.1 \&lt;br /&gt;
  --subnet=2620:101:f000:4901:c5c::0/64 \&lt;br /&gt;
  --gateway=2620:101:f000:4901::1 \&lt;br /&gt;
  bbbnet&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then create a pod in which the containers will be run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman pod create \&lt;br /&gt;
  --name bbbpod \&lt;br /&gt;
  --network bbbnet \&lt;br /&gt;
  --share net \&lt;br /&gt;
  --ip=129.97.134.173 \&lt;br /&gt;
  --ip6=2620:101:f000:4901:c5c::173&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Systemd ==&lt;br /&gt;
Podman integrates with systemd in both directions - systemd can run in podman, and podman can run in systemd.&lt;br /&gt;
&lt;br /&gt;
=== Systemd in podman ===&lt;br /&gt;
To run systemd in podman, just create a Dockerfile like the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
FROM ubuntu:bionic&lt;br /&gt;
&lt;br /&gt;
ENV DEBIAN_FRONTEND=noninteractive&lt;br /&gt;
RUN apt update &amp;amp;&amp;amp; apt install -y systemd&lt;br /&gt;
# Remove the root password so you can log in on the container console&lt;br /&gt;
RUN passwd -d root&lt;br /&gt;
&lt;br /&gt;
# On Ubuntu, the systemd binary lives at /lib/systemd/systemd&lt;br /&gt;
CMD [ &amp;quot;/lib/systemd/systemd&amp;quot; ]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman build --privileged -t ubuntu-systemd:bionic -f ubuntu-bionic-systemd.Dockerfile&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you&#039;re running this as root, I suggest using the --privileged flag. I am pretty sure that there are some specific capabilities you can add instead to make it work (via the --cap-add flag), but this is easier.&lt;br /&gt;
&lt;br /&gt;
Then, to run a container with this image:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run -it --privileged ubuntu-systemd:bionic&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Podman in systemd ===&lt;br /&gt;
Podman has a built-in command to generate systemd service files to start containers and pods. For example, let&#039;s say we have a pod named bbbpod. Run the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman generate systemd --files --name bbbpod&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This will create .service files for the pod and the containers inside it. Now you just need to enable them:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mv *.service /etc/systemd/system/&lt;br /&gt;
systemctl daemon-reload &lt;br /&gt;
systemctl enable pod-bbbpod.service&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you now run &amp;lt;code&amp;gt;systemctl start pod-bbbpod&amp;lt;/code&amp;gt;, the pod and its containers will start.&lt;br /&gt;
&lt;br /&gt;
== Pods ==&lt;br /&gt;
Podman pods are similar to Kubernetes pods; they can share namespaces with each other, such as network namespaces and UTS namespaces. In this example, we will use a network namespace.&lt;br /&gt;
&lt;br /&gt;
First, we create a pod in the network we previously created:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman pod create --network bbbnet --name bbbpod --share net&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run a container inside the pod:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run -it --name bbb --hostname bbb --pod bbbpod --privileged ubuntu-systemd:bionic&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
You can add more containers to the pod:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run -d --name greenlight --pod bbbpod --env-file $PWD/env bigbluebutton/greenlight:v2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The bbb and greenlight containers can now communicate with each other over localhost.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Important&amp;lt;/b&amp;gt;: Make sure to edit /etc/hostname and /etc/network/interfaces (or whichever network manager you decide to use) in each container.&lt;br /&gt;
&lt;br /&gt;
== Volumes ==&lt;br /&gt;
Unfortunately podman does not currently have functionality to allocate a separate volume to each container. Instead, I suggest mounting each root-level folder in a separate volume.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s say you created a new LVM volume mounted at /vm/bigbluebutton. Then create your container like the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run ... --name bbb -v /vm/bigbluebutton/bin:/bin -v /vm/bigbluebutton/boot:/boot -v /vm/bigbluebutton/etc:/etc -v /vm/bigbluebutton/home:/home -v /vm/bigbluebutton/lib:/lib -v /vm/bigbluebutton/lib64:/lib64 -v /vm/bigbluebutton/media:/media -v /vm/bigbluebutton/mnt:/mnt -v /vm/bigbluebutton/opt:/opt -v /vm/bigbluebutton/root:/root -v /vm/bigbluebutton/sbin:/sbin -v /vm/bigbluebutton/srv:/srv -v /vm/bigbluebutton/usr:/usr -v /vm/bigbluebutton/var:/var ubuntu-systemd:bionic&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is also a good idea to mount /var/lib/containers in a separate LVM volume to avoid running out of space on the host.&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Podman&amp;diff=5267</id>
		<title>Podman</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Podman&amp;diff=5267"/>
		<updated>2024-08-16T06:03:30Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Networking */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[https://podman.io/ Podman] is a very neat Docker-compatible container solution. Some of the advantages it has over Docker are:&lt;br /&gt;
&lt;br /&gt;
* no daemon (uses a fork-and-exec model)&lt;br /&gt;
* systemd can run inside containers very easily&lt;br /&gt;
* containers can become systemd services on the host&lt;br /&gt;
* non-root users can run containers&lt;br /&gt;
&lt;br /&gt;
== Installation ==&lt;br /&gt;
As of bullseye, podman is available in the official Debian repositories. I suggest installing it from the unstable distribution, since podman 3.2 has many useful improvements over previous versions:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install -t unstable podman podman-docker &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The podman-docker package provides a wrapper script so that running the command &#039;docker&#039; will invoke podman. Recent versions of podman also provide API compatibility with Docker, which means that docker-compose will actually work out of the box. (For non-root users, you will need to set the DOCKER_HOST environment variable to &amp;lt;code&amp;gt;unix://$XDG_RUNTIME_DIR/podman/podman.sock&amp;lt;/code&amp;gt;).&lt;br /&gt;
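For example, a non-root user could enable the Podman API socket and point docker-compose at it like this (a minimal sketch; the podman.socket user unit ships with recent podman packages):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl --user enable --now podman.socket&lt;br /&gt;
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock&lt;br /&gt;
docker-compose up -d&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;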
&lt;br /&gt;
I suggest adding the following to /etc/containers/registries.conf so that podman automatically pulls images from docker.io instead of quay.io:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[registries.search]&lt;br /&gt;
registries = [&#039;docker.io&#039;]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Networking ==&lt;br /&gt;
As of this writing (2024-08-16), the latest network backend in Podman is [https://github.com/containers/netavark netavark]. Hosts which are still using the legacy CNI backend should switch to netavark as soon as possible, because support for CNI will be removed in Podman 5.0. Unfortunately, the officially recommended way to migrate from CNI to netavark is to run &amp;quot;podman system reset&amp;quot;, which deletes &#039;&#039;&#039;everything&#039;&#039;&#039; (containers, images, networks, etc.). This is usually undesirable. Here&#039;s what I suggest instead (assuming you don&#039;t have custom Podman networks):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Stop all running containers.&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;echo -n netavark &amp;amp;gt; /var/lib/containers/storage/defaultNetworkBackend&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Restart the stopped containers.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you had custom networks before, this is trickier. You will need to manually convert the CNI JSON file into the netavark JSON format (under /etc/containers/networks).&lt;br /&gt;
&lt;br /&gt;
=== Directly exposing a container to a public network ===&lt;br /&gt;
The easiest way to do this, in my opinion, is with a macvlan network. Here&#039;s an example of how this was done for [[BigBlueButton]] on xylitol:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman network create \&lt;br /&gt;
  --driver=macvlan \&lt;br /&gt;
  --ipv6 \&lt;br /&gt;
  --opt parent=br0 \&lt;br /&gt;
  --subnet=129.97.134.0/24 \&lt;br /&gt;
  --gateway=129.97.134.1 \&lt;br /&gt;
  --subnet=2620:101:f000:4901:c5c::0/64 \&lt;br /&gt;
  --gateway=2620:101:f000:4901::1 \&lt;br /&gt;
  bbbnet&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
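You can confirm that the network was created with the expected driver and subnets:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman network inspect bbbnet&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;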
&lt;br /&gt;
== Systemd ==&lt;br /&gt;
Podman integrates with systemd in both directions: systemd can run in podman, and podman can run in systemd.&lt;br /&gt;
&lt;br /&gt;
=== Systemd in podman ===&lt;br /&gt;
To run systemd in podman, just create a Dockerfile like the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
FROM ubuntu:bionic  &lt;br /&gt;
    &lt;br /&gt;
ENV DEBIAN_FRONTEND=noninteractive    &lt;br /&gt;
RUN apt update &amp;amp;&amp;amp; apt install -y systemd&lt;br /&gt;
RUN passwd -d root    &lt;br /&gt;
    &lt;br /&gt;
CMD [ &amp;quot;/bin/systemd&amp;quot; ]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman build --privileged -t ubuntu-systemd:bionic -f ubuntu-bionic-systemd.Dockerfile&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you&#039;re running this as root, I suggest using the --privileged flag. I am pretty sure there are some specific capabilities you can add instead (via the --cap-add flag) to make it work, but this is easier.&lt;br /&gt;
&lt;br /&gt;
Then, to run a container with this image:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run -it --privileged ubuntu-systemd:bionic&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Podman in systemd ===&lt;br /&gt;
Podman has a built-in command to generate systemd service files to start containers and pods. For example, let&#039;s say we have a pod named bbbpod. Run the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman generate systemd --files --name bbbpod&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This will create .service files for the pod and the containers inside it. Now you just need to enable them:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mv *.service /etc/systemd/system/&lt;br /&gt;
systemctl daemon-reload &lt;br /&gt;
systemctl enable pod-bbbpod.service&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you now run &amp;lt;code&amp;gt;systemctl start pod-bbbpod&amp;lt;/code&amp;gt;, the pod and its containers will start.&lt;br /&gt;
&lt;br /&gt;
== Pods ==&lt;br /&gt;
Podman pods are similar to Kubernetes pods: containers in the same pod can share namespaces, such as network namespaces and UTS namespaces. In this example, we will share a network namespace.&lt;br /&gt;
&lt;br /&gt;
First, we create a pod in the network we previously created:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman pod create --network bbbnet --name bbbpod --share net&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run a container inside the pod:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run -it --name bbb --hostname bbb --pod bbbpod --privileged ubuntu-systemd:bionic&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
You can add more containers to the pod:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run -d --name greenlight --pod bbbpod --env-file $PWD/env bigbluebutton/greenlight:v2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The bbb and greenlight containers can now communicate with each other over localhost.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Important&amp;lt;/b&amp;gt;: Make sure to edit /etc/hostname and /etc/network/interfaces (or whichever network manager you decide to use) in each container.&lt;br /&gt;
&lt;br /&gt;
== Volumes ==&lt;br /&gt;
Unfortunately podman does not currently have functionality to allocate a separate volume to each container. Instead, I suggest mounting each root-level folder in a separate volume.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s say you created a new LVM volume mounted at /vm/bigbluebutton. Then create your container like the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run ... --name bbb -v /vm/bigbluebutton/bin:/bin -v /vm/bigbluebutton/boot:/boot -v /vm/bigbluebutton/etc:/etc -v /vm/bigbluebutton/home:/home -v /vm/bigbluebutton/lib:/lib -v /vm/bigbluebutton/lib64:/lib64 -v /vm/bigbluebutton/media:/media -v /vm/bigbluebutton/mnt:/mnt -v /vm/bigbluebutton/opt:/opt -v /vm/bigbluebutton/root:/root -v /vm/bigbluebutton/sbin:/sbin -v /vm/bigbluebutton/srv:/srv -v /vm/bigbluebutton/usr:/usr -v /vm/bigbluebutton/var:/var ubuntu-systemd:bionic&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is also a good idea to mount /var/lib/containers in a separate LVM volume to avoid running out of space on the host.&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Podman&amp;diff=5266</id>
		<title>Podman</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Podman&amp;diff=5266"/>
		<updated>2024-08-16T06:02:44Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Networking */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[https://podman.io/ Podman] is a very neat Docker-compatible container solution. Some of the advantages it has over Docker are:&lt;br /&gt;
&lt;br /&gt;
* no daemon (uses a fork-and-exec model)&lt;br /&gt;
* systemd can run inside containers very easily&lt;br /&gt;
* containers can become systemd services on the host&lt;br /&gt;
* non-root users can run containers&lt;br /&gt;
&lt;br /&gt;
== Installation ==&lt;br /&gt;
As of bullseye, podman is available in the official Debian repositories. I suggest installing it from the unstable distribution, since podman 3.2 has many useful improvements over previous versions:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install -t unstable podman podman-docker &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The podman-docker package provides a wrapper script so that running the command &#039;docker&#039; will invoke podman. Recent versions of podman also provide API compatibility with Docker, which means that docker-compose will actually work out of the box. (For non-root users, you will need to set the DOCKER_HOST environment variable to &amp;lt;code&amp;gt;unix://$XDG_RUNTIME_DIR/podman/podman.sock&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
I suggest adding the following to /etc/containers/registries.conf so that podman automatically pulls images from docker.io instead of quay.io:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[registries.search]&lt;br /&gt;
registries = [&#039;docker.io&#039;]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Networking ==&lt;br /&gt;
As of this writing (2024-08-16), the latest network backend in Podman is [https://github.com/containers/netavark netavark]. Hosts which are still using the legacy CNI backend should switch to netavark as soon as possible, because support for CNI will be removed in Podman 5.0. Unfortunately, the officially recommended way to migrate from CNI to netavark is to run &amp;quot;podman system reset&amp;quot;, which deletes &#039;&#039;&#039;everything&#039;&#039;&#039; (containers, images, networks, etc.). This is usually undesirable. Here&#039;s what I suggest instead (assuming you don&#039;t have custom Podman networks):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Stop all running containers.&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;echo -n netavark &amp;amp;gt; /var/lib/containers/storage/defaultNetworkBackend&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
  &amp;lt;li&amp;gt;Restart the stopped containers.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you had custom networks before, this is trickier. You will need to manually convert the CNI JSON file into the netavark JSON format (under /etc/containers/networks).&lt;br /&gt;
&lt;br /&gt;
=== Directly exposing a container to a public network ===&lt;br /&gt;
The easiest way to do this, in my opinion, is with a macvlan network. Here&#039;s an example of how this was done for [[BigBlueButton]] on xylitol:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman network create \&lt;br /&gt;
  --driver=macvlan \&lt;br /&gt;
  --ipv6 \&lt;br /&gt;
  --opt parent=br0 \&lt;br /&gt;
  --subnet=129.97.134.0/24 \&lt;br /&gt;
  --gateway=129.97.134.1 \&lt;br /&gt;
  --subnet=2620:101:f000:4901:c5c::0/64 \&lt;br /&gt;
  --gateway=2620:101:f000:4901::1 \&lt;br /&gt;
  bbbnet&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Systemd ==&lt;br /&gt;
Podman integrates with systemd in both directions: systemd can run in podman, and podman can run in systemd.&lt;br /&gt;
&lt;br /&gt;
=== Systemd in podman ===&lt;br /&gt;
To run systemd in podman, just create a Dockerfile like the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
FROM ubuntu:bionic  &lt;br /&gt;
    &lt;br /&gt;
ENV DEBIAN_FRONTEND=noninteractive    &lt;br /&gt;
RUN apt update &amp;amp;&amp;amp; apt install -y systemd&lt;br /&gt;
RUN passwd -d root    &lt;br /&gt;
    &lt;br /&gt;
CMD [ &amp;quot;/bin/systemd&amp;quot; ]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman build --privileged -t ubuntu-systemd:bionic -f ubuntu-bionic-systemd.Dockerfile&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you&#039;re running this as root, I suggest using the --privileged flag. I am pretty sure there are some specific capabilities you can add instead (via the --cap-add flag) to make it work, but this is easier.&lt;br /&gt;
&lt;br /&gt;
Then, to run a container with this image:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run -it --privileged ubuntu-systemd:bionic&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Podman in systemd ===&lt;br /&gt;
Podman has a built-in command to generate systemd service files to start containers and pods. For example, let&#039;s say we have a pod named bbbpod. Run the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman generate systemd --files --name bbbpod&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This will create .service files for the pod and the containers inside it. Now you just need to enable them:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mv *.service /etc/systemd/system/&lt;br /&gt;
systemctl daemon-reload &lt;br /&gt;
systemctl enable pod-bbbpod.service&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you now run &amp;lt;code&amp;gt;systemctl start pod-bbbpod&amp;lt;/code&amp;gt;, the pod and its containers will start.&lt;br /&gt;
&lt;br /&gt;
== Pods ==&lt;br /&gt;
Podman pods are similar to Kubernetes pods: containers in the same pod can share namespaces, such as network namespaces and UTS namespaces. In this example, we will share a network namespace.&lt;br /&gt;
&lt;br /&gt;
First, we create a pod in the network we previously created:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman pod create --network bbbnet --name bbbpod --share net&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run a container inside the pod:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run -it --name bbb --hostname bbb --pod bbbpod --privileged ubuntu-systemd:bionic&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
You can add more containers to the pod:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run -d --name greenlight --pod bbbpod --env-file $PWD/env bigbluebutton/greenlight:v2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The bbb and greenlight containers can now communicate with each other over localhost.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;Important&amp;lt;/b&amp;gt;: Make sure to edit /etc/hostname and /etc/network/interfaces (or whichever network manager you decide to use) in each container.&lt;br /&gt;
&lt;br /&gt;
== Volumes ==&lt;br /&gt;
Unfortunately podman does not currently have functionality to allocate a separate volume to each container. Instead, I suggest mounting each root-level folder in a separate volume.&lt;br /&gt;
&lt;br /&gt;
Let&#039;s say you created a new LVM volume mounted at /vm/bigbluebutton. Then create your container like the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run ... --name bbb -v /vm/bigbluebutton/bin:/bin -v /vm/bigbluebutton/boot:/boot -v /vm/bigbluebutton/etc:/etc -v /vm/bigbluebutton/home:/home -v /vm/bigbluebutton/lib:/lib -v /vm/bigbluebutton/lib64:/lib64 -v /vm/bigbluebutton/media:/media -v /vm/bigbluebutton/mnt:/mnt -v /vm/bigbluebutton/opt:/opt -v /vm/bigbluebutton/root:/root -v /vm/bigbluebutton/sbin:/sbin -v /vm/bigbluebutton/srv:/srv -v /vm/bigbluebutton/usr:/usr -v /vm/bigbluebutton/var:/var ubuntu-systemd:bionic&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It is also a good idea to mount /var/lib/containers in a separate LVM volume to avoid running out of space on the host.&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5245</id>
		<title>IPMI101</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5245"/>
		<updated>2024-04-03T05:08:21Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* carbonated-water */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Guide to IPMI (IPMI 101) =&lt;br /&gt;
&lt;br /&gt;
IPMI is a necessary evil. Let’s learn to make the best of it.&lt;br /&gt;
&lt;br /&gt;
== Setting up IPMI ==&lt;br /&gt;
&lt;br /&gt;
# Install ipmitool&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# apt-get install ipmitool&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Load IPMI modules (they are included in most upstream kernels)&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You may also need a kernel module specific to your motherboard’s manufacturer, as some BMCs/LOMs do not conform to the IPMI spec and thus need a translation layer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# modprobe ipmi_*&amp;lt;/pre&amp;gt;&lt;br /&gt;
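Keep in mind that modprobe does not expand wildcards by itself, so treat the above as shorthand. On most systems the two modules you actually want (an assumption based on mainline kernels) are loaded with:&lt;br /&gt;
&amp;lt;pre&amp;gt;# modprobe -a ipmi_si ipmi_devintf&amp;lt;/pre&amp;gt;&lt;br /&gt;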
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Locally connect to the &amp;lt;code&amp;gt;/dev/ipmi&amp;lt;/code&amp;gt; interface&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; help&lt;br /&gt;
&amp;amp;gt; mc info&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Securing IPMI ==&lt;br /&gt;
&lt;br /&gt;
Note that root on the machine is root on the BMC and vice versa.&lt;br /&gt;
&lt;br /&gt;
# User administration&lt;br /&gt;
&lt;br /&gt;
(Re)set the password, rename the admin account to root, and delete any extra users, as they can have surprising privileges. You may have to use the BMC’s web interface to delete accounts.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; user list 1&lt;br /&gt;
ID Name ...&lt;br /&gt;
2  ADMIN ...&lt;br /&gt;
&amp;amp;gt; user set password 2&lt;br /&gt;
User id 2: *******&lt;br /&gt;
User id 2: *******&lt;br /&gt;
&amp;amp;gt; user set username 2 root&lt;br /&gt;
&amp;amp;gt; user disable $other_user_ids&amp;lt;/pre&amp;gt;&lt;br /&gt;
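It is also worth confirming that the remaining account has the privilege level you expect on the LAN channel. For example (user ID 2, privilege level 4 = ADMINISTRATOR, channel 1; adjust all three for your hardware):&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool user priv 2 4 1&amp;lt;/pre&amp;gt;&lt;br /&gt;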
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Disable NULL password and cipher suite 0&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that $channel is usually 0 but can range from 0 to 10, and there can be multiple NICs and therefore multiple channels to fix. In the cipher_privs string, each of the 15 characters sets the maximum privilege for cipher suites 0 through 14 (X marks a suite as unusable, a grants ADMIN), so the example below disables every suite except cipher suite 3.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel auth ADMIN MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth CALLBACK MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth USER MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth OPERATOR MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel cipher_privs XXXaXXXXXXXXXXX&lt;br /&gt;
&amp;amp;gt; lan print $channel&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Configuring networking ==&lt;br /&gt;
&lt;br /&gt;
Note once again that there are sometimes multiple channels. To find the correct channel, it is helpful to use trial and error and/or an ARP scanner to find the correct MAC address. Usually the channel is 0, but I have seen 1, 8 and 17, especially when there are multiple NICs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel ipsrc static&lt;br /&gt;
&amp;amp;gt; lan set $channel ipaddr 10.15.134.?&lt;br /&gt;
&amp;amp;gt; lan set $channel defgw ipaddr 10.15.134.1&lt;br /&gt;
&amp;amp;gt; lan set $channel netmask 255.255.255.0&lt;br /&gt;
// if you have vlan tagging enabled on the switch port, useful for a shared NIC&lt;br /&gt;
&amp;amp;gt; lan set $channel vlan id 520&amp;lt;/pre&amp;gt;&lt;br /&gt;
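Once networking is configured, you can verify remote access from another machine (lanplus selects the IPMI v2.0 interface; substitute the address you assigned above):&lt;br /&gt;
&amp;lt;pre&amp;gt;$ ipmitool -I lanplus -H 10.15.134.? -U root mc info&amp;lt;/pre&amp;gt;&lt;br /&gt;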
== Configuring Serial over LAN ==&lt;br /&gt;
&lt;br /&gt;
To enable serial over LAN, you need to ensure that it is enabled in your BIOS or EFI setup utility, and take note of the baud rate; 115200 is used as an example below. Note that GRUB is the only boot loader that takes input via serial properly, in my experience. Syslinux failed horribly on corn-syrup.&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/default/grub.d/99-csclub.cfg:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
GRUB_CMDLINE_LINUX=&amp;amp;quot;console=tty1 console=ttyS1,115200n8&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_INPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_OUTPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_SERIAL_COMMAND=&amp;amp;quot;serial --speed=115200 --unit=1 --word=8 --parity=no --stop=1&amp;amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and then run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;// on debian based distros&lt;br /&gt;
// Yay, Debian magic :\&lt;br /&gt;
# update-grub&lt;br /&gt;
// on upstream packages (Arch, Fedora, etc.)&lt;br /&gt;
# grub-mkconfig -o /boot/grub/grub.cfg&lt;br /&gt;
&lt;br /&gt;
# reboot&amp;lt;/pre&amp;gt;&lt;br /&gt;
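After the reboot, you should be able to attach to the serial console over the network:&lt;br /&gt;
&amp;lt;pre&amp;gt;$ ipmitool -I lanplus -H 10.15.134.? -U root sol activate&amp;lt;/pre&amp;gt;&lt;br /&gt;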
&lt;br /&gt;
= iDRAC =&lt;br /&gt;
== riboflavin ==&lt;br /&gt;
riboflavin is using iDRAC 6. The web console can be viewed from https://riboflavin-ipmi.csclub.uwaterloo.ca; if you are not on campus, you can use a [[How_to_SSH#SOCKS_proxy|SOCKS proxy]]. Unfortunately, the virtual console uses Java Web Start, which is now deprecated. Here&#039;s a workaround which you can use instead.&lt;br /&gt;
&lt;br /&gt;
From the web UI, go to the &amp;quot;Console/Media&amp;quot; tab and click the &amp;quot;Launch virtual console&amp;quot; button. This will download a file whose name starts with &amp;quot;viewer.jnlp&amp;quot;. Now go to https://www.java.com and download JRE 8; any later version will not have support for JWS (note that OpenJDK will not work; JWS was a proprietary framework from Sun/Oracle). Unpack the tarball, open jre1.8.0_391/lib/security/java.security in a text editor, and comment out the following properties (note that each property spans multiple lines):&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.certpath.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.jar.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.tls.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are off-campus, you will need to set up some proxying so that the Java application can access ports 443 and 5900 on riboflavin-ipmi. In the example below, I am using caffeine as a jump host, but any machine on campus should do:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L 5443:localhost:5443 -L 5900:localhost:5900 caffeine.csclub.uwaterloo.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now on caffeine, open a tmux/screen session, and run the following commands in two different panes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5443,fork TCP:riboflavin-ipmi:443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5900,fork TCP:riboflavin-ipmi:5900&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Back on your personal machine, open the viewer.jnlp file in a text editor and perform the following:&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Replace all instances of &amp;lt;code&amp;gt;riboflavin-ipmi.csclub.uwaterloo.ca:443&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost:5443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, the first &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; child element should say &amp;lt;code&amp;gt;ip=riboflavin-ipmi&amp;lt;/code&amp;gt;. Replace this with &amp;lt;code&amp;gt;ip=localhost&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, there are child &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; elements for &amp;lt;code&amp;gt;user&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;passwd&amp;lt;/code&amp;gt;. For some reason these are set to numbers; set them to the username and password for IPMI (the username should be &amp;lt;code&amp;gt;root&amp;lt;/code&amp;gt;).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jre1.8.0_391/bin/javaws viewer.jnlp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If all goes well, the virtual console should eventually appear:&lt;br /&gt;
[[File:Riboflavin-idrac-virtual-console.png|1000px]]&lt;br /&gt;
&lt;br /&gt;
== carbonated-water ==&lt;br /&gt;
carbonated-water is also using iDRAC 6, but seems to have some kind of TLS certificate configuration which prevents modern browsers from loading its web UI. So we&#039;re going to run an old version of Firefox inside a Podman container instead:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run --name firefox -it -e DISPLAY --net=host -v $XAUTHORITY:/root/.Xauthority -v /tmp/.X11-unix:/tmp/.X11-unix debian:9-slim bash&lt;br /&gt;
sed -i &#039;s/deb\.debian\.org/archive.debian.org/&#039; /etc/apt/sources.list&lt;br /&gt;
sed -i &#039;s/security\.debian\.org/archive.debian.org/&#039; /etc/apt/sources.list&lt;br /&gt;
sed -i &#039;/stretch-updates/d&#039; /etc/apt/sources.list&lt;br /&gt;
apt update&lt;br /&gt;
apt install firefox-esr&lt;br /&gt;
firefox&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, follow the instructions here to set up a SOCKS proxy: [[How to SSH#SOCKS proxy]]&lt;br /&gt;
&lt;br /&gt;
Now visit https://carbonated-water-ipmi.csclub.uwaterloo.ca from Firefox, log in using the IPMI credentials, and download the JNLP file. Copy it from the Podman container to your computer (replace &amp;quot;viewer.jnlp&amp;quot; with the full file name):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman cp firefox:/root/Downloads/viewer.jnlp launch.jnlp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Follow the same steps as done for riboflavin to edit the JDK settings and JNLP file. In addition, there are a few more settings which we need to tweak:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;jre1.8.0_391/bin/ControlPanel&amp;lt;/code&amp;gt;, go to the Advanced tab, scroll down and check &amp;quot;TLS 1.0&amp;quot; and &amp;quot;TLS 1.1&amp;quot;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;We also need to disable OCSP. In the same window, set &amp;quot;Check for signed code certificate revocation using&amp;quot; to &amp;quot;Certificate Revocation Lists (CRLs)&amp;quot; and set &amp;quot;Check for TLS certificate revocation using&amp;quot; to &amp;quot;Certificate Revocation Lists (CRLs)&amp;quot; (see [https://www.kunxi.org/2015/01/bypass-the-certpathvalidatorexception-caused-by-malformed-ocsp-response/ here] for the reference).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
[[File:java-control-panel-advanced.png]]&lt;br /&gt;
&lt;br /&gt;
Now you can launch the JNLP file as usual.&lt;br /&gt;
&lt;br /&gt;
= Supermicro =&lt;br /&gt;
== ginkgo ==&lt;br /&gt;
To access the virtual console on ginkgo, the steps are the same as those for riboflavin, with the following changes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;In the launch.jnlp file, in the root &amp;lt;code&amp;gt;&amp;lt;jnlp&amp;gt;&amp;lt;/code&amp;gt; tag, change the value of the &amp;lt;code&amp;gt;codebase&amp;lt;/code&amp;gt; attribute from &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://ginkgo-ipmi.csclub.uwaterloo.ca:443&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://localhost:5443&amp;lt;/code&amp;gt;. Next, in the first &amp;lt;code&amp;gt;&amp;lt;argument&amp;gt;&amp;lt;/code&amp;gt; element under &amp;lt;code&amp;gt;&amp;lt;application-desc&amp;gt;&amp;lt;/code&amp;gt;, replace &amp;lt;code&amp;gt;ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost&amp;lt;/code&amp;gt;. These are the only changes which you should make to this file (unless you are already on the campus network, in which case you do not need to modify this file at all).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;jre1.8.0_391/bin/ControlPanel&amp;lt;/code&amp;gt;, go to the Security tab, click &amp;quot;Edit Site List&amp;quot;, and add &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; as an exception.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5244</id>
		<title>IPMI101</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5244"/>
		<updated>2024-04-03T05:07:15Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* carbonated-water */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Guide to IPMI (IPMI 101) =&lt;br /&gt;
&lt;br /&gt;
IPMI is a necessary evil. Let’s learn to make the best of it.&lt;br /&gt;
&lt;br /&gt;
== Setting up IPMI ==&lt;br /&gt;
&lt;br /&gt;
# Install ipmitool&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# apt-get install ipmitool&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Load IPMI modules (they are included in most upstream kernels)&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You may also need a kernel module specific to your motherboard’s manufacturer, as some BMCs/LOMs do not conform to the IPMI spec and thus need a translation layer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# modprobe ipmi_*&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Locally connect to the &amp;lt;code&amp;gt;/dev/ipmi&amp;lt;/code&amp;gt; interface&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; help&lt;br /&gt;
&amp;amp;gt; mc info&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Securing IPMI ==&lt;br /&gt;
&lt;br /&gt;
Note that root on the machine is root on the BMC and vice versa.&lt;br /&gt;
&lt;br /&gt;
# User administration&lt;br /&gt;
&lt;br /&gt;
(Re)set the password, rename the admin account to root, and delete any extra users, as they can have surprising privileges. You may have to use the BMC’s web interface to delete accounts.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; user list 1&lt;br /&gt;
ID Name ...&lt;br /&gt;
2  ADMIN ...&lt;br /&gt;
&amp;amp;gt; user set password 2&lt;br /&gt;
User id 2: *******&lt;br /&gt;
User id 2: *******&lt;br /&gt;
&amp;amp;gt; user set username 2 root&lt;br /&gt;
&amp;amp;gt; user disable $other_user_ids&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Disable NULL password and cipher suite 0&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that $channel is usually 0 but can range from 0 to 10, and there can be multiple NICs and therefore multiple channels to fix. In the cipher_privs string, each of the 15 characters sets the maximum privilege for cipher suites 0 through 14 (X marks a suite as unusable, a grants ADMIN), so the example below disables every suite except cipher suite 3.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel auth ADMIN MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth CALLBACK MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth USER MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth OPERATOR MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel cipher_privs XXXaXXXXXXXXXXX&lt;br /&gt;
&amp;amp;gt; lan print $channel&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Configuring networking ==&lt;br /&gt;
&lt;br /&gt;
Note once again that there are sometimes multiple channels. To find the correct channel, it is helpful to use trial and error and/or an ARP scanner to find the correct MAC address. Usually the channel is 0, but I have seen 1, 8 and 17, especially when there are multiple NICs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel ipsrc static&lt;br /&gt;
&amp;amp;gt; lan set $channel ipaddr 10.15.134.?&lt;br /&gt;
&amp;amp;gt; lan set $channel defgw ipaddr 10.15.134.1&lt;br /&gt;
&amp;amp;gt; lan set $channel netmask 255.255.255.0&lt;br /&gt;
// if you have vlan tagging enabled on the switch port, useful for a shared NIC&lt;br /&gt;
&amp;amp;gt; lan set $channel vlan id 520&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Configuring Serial over LAN ==&lt;br /&gt;
&lt;br /&gt;
To enable serial over LAN, you need to ensure that it is enabled in your BIOS or EFI setup utility, and take note of the baud rate; 115200 is used as an example below. Note that GRUB is the only boot loader that takes input via serial properly, in my experience. Syslinux failed horribly on corn-syrup.&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/default/grub.d/99-csclub.cfg:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
GRUB_CMDLINE_LINUX=&amp;amp;quot;console=tty1 console=ttyS1,115200n8&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_INPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_OUTPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_SERIAL_COMMAND=&amp;amp;quot;serial --speed=115200 --unit=1 --word=8 --parity=no --stop=1&amp;amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and then run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;// on debian based distros&lt;br /&gt;
// Yay, Debian magic :\&lt;br /&gt;
# update-grub&lt;br /&gt;
// on upstream packages (Arch, Fedora, etc.)&lt;br /&gt;
# grub-mkconfig -o /boot/grub/grub.cfg&lt;br /&gt;
&lt;br /&gt;
# reboot&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= iDRAC =&lt;br /&gt;
== riboflavin ==&lt;br /&gt;
riboflavin is using iDRAC 6. The web console can be viewed from https://riboflavin-ipmi.csclub.uwaterloo.ca; if you are not on campus, you can use a [[How_to_SSH#SOCKS_proxy|SOCKS proxy]]. Unfortunately, the virtual console uses Java Web Start, which is now deprecated. Here&#039;s a workaround which you can use instead.&lt;br /&gt;
&lt;br /&gt;
From the web UI, go to the &amp;quot;Console/Media&amp;quot; tab and click the &amp;quot;Launch virtual console&amp;quot; button. This will download a file whose name starts with &amp;quot;viewer.jnlp&amp;quot;. Now go to https://www.java.com and download JRE 8; any later version will not have support for JWS (note that OpenJDK will not work; JWS was a proprietary framework from Sun/Oracle). Unpack the tarball, open jre1.8.0_391/lib/security/java.security in a text editor, and comment out the following properties (note that each property spans multiple lines):&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.certpath.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.jar.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.tls.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are off-campus, you will need to set up some proxying so that the Java application can access ports 443 and 5900 on riboflavin-ipmi. In the example below, I am using caffeine as a jump host, but any machine on campus should do:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L 5443:localhost:5443 -L 5900:localhost:5900 caffeine.csclub.uwaterloo.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now on caffeine, open a tmux/screen session, and run the following commands in two different panes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5443,fork TCP:riboflavin-ipmi:443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5900,fork TCP:riboflavin-ipmi:5900&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Back on your personal machine, open the viewer.jnlp file in a text editor and perform the following:&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Replace all instances of &amp;lt;code&amp;gt;riboflavin-ipmi.csclub.uwaterloo.ca:443&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost:5443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, the first &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; child element should say &amp;lt;code&amp;gt;ip=riboflavin-ipmi&amp;lt;/code&amp;gt;. Replace this with &amp;lt;code&amp;gt;ip=localhost&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, there are child &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; elements for &amp;lt;code&amp;gt;user&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;passwd&amp;lt;/code&amp;gt;. For some reason these are set to numbers; set them to the username and password for IPMI (the username should be &amp;lt;code&amp;gt;root&amp;lt;/code&amp;gt;).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jre1.8.0_391/bin/javaws viewer.jnlp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If all goes well, the virtual console should eventually appear:&lt;br /&gt;
[[File:Riboflavin-idrac-virtual-console.png|1000px]]&lt;br /&gt;
&lt;br /&gt;
== carbonated-water ==&lt;br /&gt;
carbonated-water is also using iDRAC 6, but seems to have some kind of TLS certificate configuration which prevents modern browsers from loading its web UI. So we&#039;re going to run an old version of Firefox inside a Podman container instead:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run --name firefox -it -e DISPLAY --net=host -v $XAUTHORITY:/root/.Xauthority -v /tmp/.X11-unix:/tmp/.X11-unix debian:9-slim bash&lt;br /&gt;
sed -i &#039;s/deb\.debian\.org/archive.debian.org/&#039; /etc/apt/sources.list&lt;br /&gt;
sed -i &#039;s/security\.debian\.org/archive.debian.org/&#039; /etc/apt/sources.list&lt;br /&gt;
sed -i &#039;/stretch-updates/d&#039; /etc/apt/sources.list&lt;br /&gt;
apt update&lt;br /&gt;
apt install firefox-esr&lt;br /&gt;
firefox&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, follow the instructions here to set up a SOCKS proxy: [[How to SSH#SOCKS proxy]]&lt;br /&gt;
&lt;br /&gt;
Now visit https://carbonated-water-ipmi.csclub.uwaterloo.ca from Firefox, log in using the IPMI credentials, and download the JNLP file. Copy it from the Podman container to your computer (replace &amp;quot;viewer.jnlp&amp;quot; with the full file name):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman cp firefox:/root/Downloads/viewer.jnlp launch.jnlp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Follow the same steps as done for riboflavin to edit the JDK settings and JNLP file. In addition, there are a few more settings which we need to tweak:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;jre1.8.0_391/bin/ControlPanel&amp;lt;/code&amp;gt;, go to the Advanced tab, scroll down and check &amp;quot;TLS 1.0&amp;quot; and &amp;quot;TLS 1.1&amp;quot;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;We also need to disable OCSP. In the same window, set &amp;quot;Check for signed code certificate revocation using&amp;quot; to &amp;quot;Certificate Revocation Lists (CRLs)&amp;quot; and set &amp;quot;Check for TLS certificate revocation using&amp;quot; to &amp;quot;Certificate Revocation Lists (CRLs)&amp;quot; (see [https://www.kunxi.org/2015/01/bypass-the-certpathvalidatorexception-caused-by-malformed-ocsp-response/ here] for the reference).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
[[File:java-control-panel-advanced.png]]&lt;br /&gt;
&lt;br /&gt;
Now you can launch the JNLP file as usual.&lt;br /&gt;
&lt;br /&gt;
= Supermicro =&lt;br /&gt;
== ginkgo ==&lt;br /&gt;
To access the virtual console on ginkgo, the steps are the same as those for riboflavin, with the following changes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;In the launch.jnlp file, in the root &amp;lt;code&amp;gt;&amp;lt;jnlp&amp;gt;&amp;lt;/code&amp;gt; tag, change the value of the &amp;lt;code&amp;gt;codebase&amp;lt;/code&amp;gt; attribute from &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://ginkgo-ipmi.csclub.uwaterloo.ca:443&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://localhost:5443&amp;lt;/code&amp;gt;. Next, in the first &amp;lt;code&amp;gt;&amp;lt;argument&amp;gt;&amp;lt;/code&amp;gt; element under &amp;lt;code&amp;gt;&amp;lt;application-desc&amp;gt;&amp;lt;/code&amp;gt;, replace &amp;lt;code&amp;gt;ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost&amp;lt;/code&amp;gt;. These are the only changes which you should make to this file (unless you are already on the campus network, in which case you do not need to modify this file at all).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;jre1.8.0_391/bin/ControlPanel&amp;lt;/code&amp;gt;, go to the Security tab, click &amp;quot;Edit Site List&amp;quot;, and add &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; as an exception.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5243</id>
		<title>IPMI101</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5243"/>
		<updated>2024-04-03T05:06:53Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* carbonated-water */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Guide to IPMI (IPMI 101) =&lt;br /&gt;
&lt;br /&gt;
IPMI is a necessary evil. Let’s learn to make the best of it.&lt;br /&gt;
&lt;br /&gt;
== Setting up IPMI ==&lt;br /&gt;
&lt;br /&gt;
# Install ipmitool&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# apt-get install ipmitool&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Load IPMI modules (they are included in most upstream kernels)&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You may also need a kernel module specific to your motherboard’s manufacturer, as some BMCs/LOMs do not conform to the IPMI spec and thus need a translation layer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# modprobe ipmi_*&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Locally connect to the &amp;lt;code&amp;gt;/dev/ipmi&amp;lt;/code&amp;gt; interface&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; help&lt;br /&gt;
&amp;amp;gt; mc info&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Securing IPMI ==&lt;br /&gt;
&lt;br /&gt;
Note that root on the machine is root on the BMC and vice versa.&lt;br /&gt;
&lt;br /&gt;
# User administration&lt;br /&gt;
&lt;br /&gt;
(Re)set the password, rename the admin account to root, and delete any extra users, as they can have surprising privileges. You may have to use the BMC’s web interface to delete accounts.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; user list 1&lt;br /&gt;
ID Name ...&lt;br /&gt;
2  ADMIN ...&lt;br /&gt;
&amp;amp;gt; user set password 2&lt;br /&gt;
User id 2: *******&lt;br /&gt;
User id 2: *******&lt;br /&gt;
&amp;amp;gt; user set username 2 root&lt;br /&gt;
&amp;amp;gt; user disable $other_user_ids&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Disable NULL password and cipher suite 0&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that $channel is usually 0 but can range from 0 to 10, and there can be multiple NICs and therefore multiple channels to fix. In the cipher_privs string, each of the 15 characters sets the maximum privilege for cipher suites 0 through 14 (X marks a suite as unusable, a grants ADMIN), so the example below disables every suite except cipher suite 3.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel auth ADMIN MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth CALLBACK MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth USER MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth OPERATOR MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel cipher_privs XXXaXXXXXXXXXXX&lt;br /&gt;
&amp;amp;gt; lan print $channel&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Configuring networking ==&lt;br /&gt;
&lt;br /&gt;
Note once again that there are sometimes multiple channels. To find the correct channel, it is helpful to use trial and error and/or an ARP scanner to find the correct MAC address. Usually the channel is 0, but I have seen 1, 8 and 17, especially when there are multiple NICs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel ipsrc static&lt;br /&gt;
&amp;amp;gt; lan set $channel ipaddr 10.15.134.?&lt;br /&gt;
&amp;amp;gt; lan set $channel defgw ipaddr 10.15.134.1&lt;br /&gt;
&amp;amp;gt; lan set $channel netmask 255.255.255.0&lt;br /&gt;
// if you have vlan tagging enabled on the switch port, useful for a shared NIC&lt;br /&gt;
&amp;amp;gt; lan set $channel vlan id 520&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Configuring Serial over LAN ==&lt;br /&gt;
&lt;br /&gt;
To enable serial over LAN, you need to ensure that it is enabled in your BIOS or EFI setup utility, and take note of the baud rate; 115200 is used as an example below. Note that GRUB is the only boot loader that takes input via serial properly, in my experience. Syslinux failed horribly on corn-syrup.&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/default/grub.d/99-csclub.cfg:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
GRUB_CMDLINE_LINUX=&amp;amp;quot;console=tty1 console=ttyS1,115200n8&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_INPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_OUTPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_SERIAL_COMMAND=&amp;amp;quot;serial --speed=115200 --unit=1 --word=8 --parity=no --stop=1&amp;amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and then run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;// on debian based distros&lt;br /&gt;
// Yay, Debian magic :\&lt;br /&gt;
# update-grub&lt;br /&gt;
// on upstream packages (Arch, Fedora, etc.)&lt;br /&gt;
# grub-mkconfig -o /boot/grub/grub.cfg&lt;br /&gt;
&lt;br /&gt;
# reboot&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= iDRAC =&lt;br /&gt;
== riboflavin ==&lt;br /&gt;
riboflavin is using iDRAC 6. The web console can be viewed from https://riboflavin-ipmi.csclub.uwaterloo.ca; if you are not on campus, you can use a [[How_to_SSH#SOCKS_proxy|SOCKS proxy]]. Unfortunately, the virtual console uses Java Web Start, which is now deprecated. Here&#039;s a workaround which you can use instead.&lt;br /&gt;
&lt;br /&gt;
From the web UI, go to the &amp;quot;Console/Media&amp;quot; tab and click the &amp;quot;Launch virtual console&amp;quot; button. This will download a file whose name starts with &amp;quot;viewer.jnlp&amp;quot;. Now go to https://www.java.com and download JRE 8; any later version will not have support for JWS (note that OpenJDK will not work; JWS was a proprietary framework from Sun/Oracle). Unpack the tarball, open jre1.8.0_391/lib/security/java.security in a text editor, and comment out the following properties (note that each property spans multiple lines):&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.certpath.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.jar.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.tls.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are off-campus, you will need to set up some proxying so that the Java application can access ports 443 and 5900 on riboflavin-ipmi. In the example below, I am using caffeine as a jump host, but any machine on campus should do:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L 5443:localhost:5443 -L 5900:localhost:5900 caffeine.csclub.uwaterloo.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now on caffeine, open a tmux/screen session, and run the following commands in two different panes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5443,fork TCP:riboflavin-ipmi:443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5900,fork TCP:riboflavin-ipmi:5900&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Back on your personal machine, open the viewer.jnlp file in a text editor and perform the following:&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Replace all instances of &amp;lt;code&amp;gt;riboflavin-ipmi.csclub.uwaterloo.ca:443&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost:5443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, the first &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; child element should say &amp;lt;code&amp;gt;ip=riboflavin-ipmi&amp;lt;/code&amp;gt;. Replace this with &amp;lt;code&amp;gt;ip=localhost&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, there are child &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; elements for &amp;lt;code&amp;gt;user&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;passwd&amp;lt;/code&amp;gt;. For some reason these are set to numbers; set them to the username and password for IPMI (the username should be &amp;lt;code&amp;gt;root&amp;lt;/code&amp;gt;).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jre1.8.0_391/bin/javaws viewer.jnlp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If all goes well, the virtual console should eventually appear:&lt;br /&gt;
[[File:Riboflavin-idrac-virtual-console.png|1000px]]&lt;br /&gt;
&lt;br /&gt;
== carbonated-water ==&lt;br /&gt;
carbonated-water is also using iDRAC 6, but seems to have some kind of TLS certificate configuration which prevents modern browsers from loading its web UI. So we&#039;re going to run an old version of Firefox inside a Podman container instead:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run --name firefox -it -e DISPLAY --net=host -v $XAUTHORITY:/root/.Xauthority -v /tmp/.X11-unix:/tmp/.X11-unix debian:9-slim bash&lt;br /&gt;
sed -i &#039;s/deb\.debian\.org/archive.debian.org/&#039; /etc/apt/sources.list&lt;br /&gt;
sed -i &#039;s/security\.debian\.org/archive.debian.org/&#039; /etc/apt/sources.list&lt;br /&gt;
sed -i &#039;/stretch-updates/d&#039; /etc/apt/sources.list&lt;br /&gt;
apt update&lt;br /&gt;
apt install firefox-esr&lt;br /&gt;
firefox&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, follow the instructions here to set up a SOCKS proxy: [[How to SSH#SOCKS proxy]]&lt;br /&gt;
&lt;br /&gt;
Now visit https://carbonated-water-ipmi.csclub.uwaterloo.ca from Firefox, log in using the IPMI credentials, and download the JNLP file. Copy it from the Podman container to your computer (replace &amp;quot;viewer.jnlp&amp;quot; with the full file name):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman cp firefox:/root/Downloads/viewer.jnlp launch.jnlp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Follow the same steps as done for riboflavin to edit the JDK settings and JNLP file. In addition, there are a few more settings which we need to tweak:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;jre1.8.0_391/bin/ControlPanel&amp;lt;/code&amp;gt;, go to the Advanced tab, scroll down and check &amp;quot;TLS 1.0&amp;quot; and &amp;quot;TLS 1.1&amp;quot;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;We also need to disable OCSP. In the same window, set &amp;quot;Check for signed code certificate revocation using&amp;quot; to &amp;quot;Certificate Revocation Lists (CRLs)&amp;quot; and set &amp;quot;Check for TLS certificate revocation using&amp;quot; to &amp;quot;Certificate Revocation Lists (CRLs)&amp;quot; (see [https://www.kunxi.org/2015/01/bypass-the-certpathvalidatorexception-caused-by-malformed-ocsp-response/ here] for the reference).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
[[File:java-control-panel-advanced.png]]&lt;br /&gt;
Now you can launch the JNLP file as usual.&lt;br /&gt;
&lt;br /&gt;
= Supermicro =&lt;br /&gt;
== ginkgo ==&lt;br /&gt;
To access the virtual console on ginkgo, the steps are the same as those for riboflavin, with the following changes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;In the launch.jnlp file, in the root &amp;lt;code&amp;gt;&amp;lt;jnlp&amp;gt;&amp;lt;/code&amp;gt; tag, change the value of the &amp;lt;code&amp;gt;codebase&amp;lt;/code&amp;gt; attribute from &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://ginkgo-ipmi.csclub.uwaterloo.ca:443&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://localhost:5443&amp;lt;/code&amp;gt;. Next, in the first &amp;lt;code&amp;gt;&amp;lt;argument&amp;gt;&amp;lt;/code&amp;gt; element under &amp;lt;code&amp;gt;&amp;lt;application-desc&amp;gt;&amp;lt;/code&amp;gt;, replace &amp;lt;code&amp;gt;ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost&amp;lt;/code&amp;gt;. These are the only changes which you should make to this file (unless you are already on the campus network, in which case you do not need to modify this file at all).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;jre1.8.0_391/bin/ControlPanel&amp;lt;/code&amp;gt;, go to the Security tab, click &amp;quot;Edit Site List&amp;quot;, and add &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; as an exception.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=File:Java-control-panel-advanced.png&amp;diff=5242</id>
		<title>File:Java-control-panel-advanced.png</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=File:Java-control-panel-advanced.png&amp;diff=5242"/>
		<updated>2024-04-03T05:04:19Z</updated>

		<summary type="html">&lt;p&gt;Merenber: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5241</id>
		<title>IPMI101</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5241"/>
		<updated>2024-04-03T05:02:30Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* iDRAC */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Guide to IPMI (IPMI 101) =&lt;br /&gt;
&lt;br /&gt;
IPMI is a necessary evil. Let’s learn to make the best of it.&lt;br /&gt;
&lt;br /&gt;
== Setting up IPMI ==&lt;br /&gt;
&lt;br /&gt;
# Install ipmitool&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# apt-get install ipmitool&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Load IPMI modules (they are included in most upstream kernels)&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You may also need a kernel module specific to your motherboard’s manufacturer, as some BMC/LOMs do not conform to the IPMI spec and thus need a translation layer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# modprobe ipmi_si ipmi_devintf&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Locally connect to the &amp;lt;code&amp;gt;/dev/ipmi&amp;lt;/code&amp;gt; interface&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; help&lt;br /&gt;
&amp;amp;gt; mc info&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Securing IPMI ==&lt;br /&gt;
&lt;br /&gt;
Note that root on the machine is root on the BMC and vice versa.&lt;br /&gt;
&lt;br /&gt;
# User administration&lt;br /&gt;
&lt;br /&gt;
(Re)set the password, rename the admin account to root, and delete any extra users, as they can have surprising privileges. You may have to use the BMC’s web interface to delete accounts.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; user list 1&lt;br /&gt;
ID Name ...&lt;br /&gt;
2  ADMIN ...&lt;br /&gt;
&amp;amp;gt; user set password 2&lt;br /&gt;
User id 2: *******&lt;br /&gt;
User id 2: *******&lt;br /&gt;
&amp;amp;gt; user set username 2 root&lt;br /&gt;
&amp;amp;gt; user disable $other_user_ids&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Disable NULL password and cipher suite 0&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that $channel is usually 0 but can range from 0 to 10; there can be multiple NICs, and therefore multiple channels to fix.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel auth ADMIN MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth CALLBACK MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth USER MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth OPERATOR MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel cipher_privs XXXaXXXXXXXXXXX&lt;br /&gt;
&amp;amp;gt; lan print $channel&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Configuring networking ==&lt;br /&gt;
&lt;br /&gt;
Note once again that there are sometimes multiple channels; to find the correct one, use trial and error and/or an ARP scanner to find the correct MAC address. Usually the channel is 0, but I have seen 1, 8 and 17, especially when there are multiple NICs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel ipsrc static&lt;br /&gt;
&amp;amp;gt; lan set $channel ipaddr 10.15.134.?&lt;br /&gt;
&amp;amp;gt; lan set $channel defgw ipaddr 10.15.134.1&lt;br /&gt;
&amp;amp;gt; lan set $channel netmask 255.255.255.0&lt;br /&gt;
// if you have vlan tagging enabled on the switch port, useful for a shared NIC&lt;br /&gt;
&amp;amp;gt; lan set $channel vlan id 520&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Configuring Serial over LAN ==&lt;br /&gt;
&lt;br /&gt;
To enable serial over LAN you need to ensure that it is enabled in your BIOS or EFI setup utility and further note the baud rate. 115200 is used as an example below. Note that GRUB is the only boot loader that takes input via serial properly, in my experience. Syslinux failed horribly on corn-syrup.&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/default/grub.d/99-csclub.cfg:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
GRUB_CMDLINE_LINUX=&amp;amp;quot;console=tty1 console=ttyS1,115200n8&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_INPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_OUTPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_SERIAL_COMMAND=&amp;amp;quot;serial --speed=115200 --unit=1 --word=8 --parity=no --stop=1&amp;amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and then run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;// on debian based distros&lt;br /&gt;
// Yay, Debian magic :\&lt;br /&gt;
# update-grub&lt;br /&gt;
// on upstream packages (Arch, Fedora, etc.)&lt;br /&gt;
# grub-mkconfig -o /boot/grub/grub.cfg&lt;br /&gt;
&lt;br /&gt;
# reboot&amp;lt;/pre&amp;gt;&lt;br /&gt;
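&lt;br /&gt;
Once the machine is back up, you can attach to the serial console over the network with something like the following (a sketch; substitute your BMC&#039;s hostname, use &amp;lt;code&amp;gt;-a&amp;lt;/code&amp;gt; to be prompted for the password, and exit the session with &amp;lt;code&amp;gt;~.&amp;lt;/code&amp;gt;):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;ipmitool -I lanplus -H riboflavin-ipmi.csclub.uwaterloo.ca -U root -a sol activate&amp;lt;/pre&amp;gt;&lt;br /&gt;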
&lt;br /&gt;
= iDRAC =&lt;br /&gt;
== riboflavin ==&lt;br /&gt;
riboflavin is using iDRAC 6. The web console can be viewed from https://riboflavin-ipmi.csclub.uwaterloo.ca; if you are not on campus, you can use a [[How_to_SSH#SOCKS_proxy|SOCKS proxy]]. Unfortunately, the virtual console uses Java Web Start, which is now deprecated. Here&#039;s a workaround which you can use instead.&lt;br /&gt;
&lt;br /&gt;
From the web UI, go to the &amp;quot;Console/Media&amp;quot; tab and click the &amp;quot;Launch virtual console&amp;quot; button. This will download a file whose name starts with &amp;quot;viewer.jnlp&amp;quot;. Now go to https://www.java.com and download JRE 8; any later version will not have support for JWS (note that OpenJDK will not work; JWS was a proprietary framework from Sun/Oracle). Unpack the tarball, open jre1.8.0_391/lib/security/java.security in a text editor, and comment out the following properties (note that each property spans multiple lines; see the example after this list):&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.certpath.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.jar.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.tls.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
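&lt;br /&gt;
For example, the relevant part of java.security will look roughly like this once commented out (the exact algorithm lists vary between JRE releases):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# jdk.tls.disabledAlgorithms=SSLv3, TLSv1, TLSv1.1, RC4, DES, MD5withRSA, \&lt;br /&gt;
#     DH keySize &amp;lt; 1024, EC keySize &amp;lt; 224, 3DES_EDE_CBC, anon, NULL&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;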
&lt;br /&gt;
If you are off-campus, you will need to set up some proxying so that the Java application can access ports 443 and 5900 on riboflavin-ipmi. In the example below, I am using caffeine as a jump host, but any machine on campus should do:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L 5443:localhost:5443 -L 5900:localhost:5900 caffeine.csclub.uwaterloo.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now on caffeine, open a tmux/screen session, and run the following commands in two different panes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5443,fork TCP:riboflavin-ipmi:443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5900,fork TCP:riboflavin-ipmi:5900&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Back on your personal machine, open the viewer.jnlp file in a text editor and perform the following:&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Replace all instances of &amp;lt;code&amp;gt;riboflavin-ipmi.csclub.uwaterloo.ca:443&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost:5443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, the first &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; child element should say &amp;lt;code&amp;gt;ip=riboflavin-ipmi&amp;lt;/code&amp;gt;. Replace this with &amp;lt;code&amp;gt;ip=localhost&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, there are child &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; elements for &amp;lt;code&amp;gt;user&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;passwd&amp;lt;/code&amp;gt;. For some reason these are set to numbers; set these to the username and password for IPMI (username should be &amp;lt;code&amp;gt;root&amp;lt;/code&amp;gt;).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jre1.8.0_391/bin/javaws viewer.jnlp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If all goes well, the virtual console should eventually appear:&lt;br /&gt;
[[File:Riboflavin-idrac-virtual-console.png|1000px]]&lt;br /&gt;
&lt;br /&gt;
== carbonated-water ==&lt;br /&gt;
carbonated-water is also using iDRAC 6, but seems to have some kind of TLS certificate configuration which prevents modern browsers from accessing its web UI. So we&#039;re going to run an old version of Firefox inside a Podman container instead:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman run --name firefox -it -e DISPLAY --net=host -v $XAUTHORITY:/root/.Xauthority -v /tmp/.X11-unix:/tmp/.X11-unix debian:9-slim bash&lt;br /&gt;
sed -i &#039;s/deb\.debian\.org/archive.debian.org/&#039; /etc/apt/sources.list&lt;br /&gt;
sed -i &#039;s/security\.debian\.org/archive.debian.org/&#039; /etc/apt/sources.list&lt;br /&gt;
sed -i &#039;/stretch-updates/d&#039; /etc/apt/sources.list&lt;br /&gt;
apt update&lt;br /&gt;
apt install firefox-esr&lt;br /&gt;
firefox&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, follow the instructions here to set up a SOCKS proxy: [[How to SSH#SOCKS proxy]]&lt;br /&gt;
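For reference, this usually amounts to something like the following (a sketch; see the linked page for the club&#039;s exact instructions), after which Firefox is configured to use localhost:1080 as a SOCKS v5 proxy:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -D 1080 caffeine.csclub.uwaterloo.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;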
&lt;br /&gt;
Now visit https://carbonated-water-ipmi.csclub.uwaterloo.ca from Firefox, log in using the IPMI credentials, and download the JNLP file. Copy it from the Podman container to your computer (replace &amp;quot;viewer.jnlp&amp;quot; with the full file name):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
podman cp firefox:/root/Downloads/viewer.jnlp launch.jnlp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Follow the same steps as for riboflavin to edit the JDK settings and JNLP file. In addition, there are a few more settings to tweak:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;jre1.8.0_391/bin/ControlPanel&amp;lt;/code&amp;gt;, go to the Advanced tab, scroll down and check &amp;quot;TLS 1.0&amp;quot; and &amp;quot;TLS 1.1&amp;quot;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;We also need to disable OCSP. In the same window, set &amp;quot;Check for signed code certificate revocation using&amp;quot; to &amp;quot;Certificate Revocation Lists (CRLs)&amp;quot; and set &amp;quot;Check for TLS certificate revocation using&amp;quot; to &amp;quot;Certificate Revocation Lists (CRLs)&amp;quot; (see [https://www.kunxi.org/2015/01/bypass-the-certpathvalidatorexception-caused-by-malformed-ocsp-response/ here] for reference).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Supermicro =&lt;br /&gt;
== ginkgo ==&lt;br /&gt;
To access the virtual console on ginkgo, the steps are the same as those for riboflavin, with the following changes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;In the launch.jnlp file, in the root &amp;lt;code&amp;gt;&amp;lt;jnlp&amp;gt;&amp;lt;/code&amp;gt; tag, change the value of the &amp;lt;code&amp;gt;codebase&amp;lt;/code&amp;gt; attribute from &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://ginkgo-ipmi.csclub.uwaterloo.ca:443&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://localhost:5443&amp;lt;/code&amp;gt;. Next, in the first &amp;lt;code&amp;gt;&amp;lt;argument&amp;gt;&amp;lt;/code&amp;gt; element under &amp;lt;code&amp;gt;&amp;lt;application-desc&amp;gt;&amp;lt;/code&amp;gt;, replace &amp;lt;code&amp;gt;ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost&amp;lt;/code&amp;gt;. These are the only changes which you should make to this file (unless you are already on the campus network, in which case you do not need to modify this file at all).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;jre1.8.0_391/bin/ControlPanel&amp;lt;/code&amp;gt;, go to the Security tab, click &amp;quot;Edit Site List&amp;quot;, and add &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; as an exception.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5240</id>
		<title>MySQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5240"/>
		<updated>2024-03-30T15:03:11Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Backups */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
Note: the database on caffeine is actually MariaDB, not MySQL. Although they are mostly compatible, there are some incompatibilities to be aware of. See [https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/ MariaDB versus MySQL: Compatibility] for details.&lt;br /&gt;
&lt;br /&gt;
=== Creating databases ===&lt;br /&gt;
&lt;br /&gt;
Users can create their own MySQL databases through [[ceo]]. Users emailing syscom asking for a MySQL database should be directed to do so. The process is as follows:&lt;br /&gt;
&lt;br /&gt;
# SSH into any [[Machine_List|CSC machine]].&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;ceo&amp;lt;/tt&amp;gt;.&lt;br /&gt;
# Select &amp;quot;Create MySQL database&amp;quot; and follow the instructions.&lt;br /&gt;
# Login info will be stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory.&lt;br /&gt;
# You can now connect to the MySQL database (from [[Machine_List#caffeine|caffeine]] only).&lt;br /&gt;
&lt;br /&gt;
=== Deleting databases ===&lt;br /&gt;
&lt;br /&gt;
Users can delete their own MySQL databases. &lt;br /&gt;
&lt;br /&gt;
SSH into [[Machine_List#caffeine|caffeine]].&lt;br /&gt;
 mysql -u yourusernamehere -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
 DROP DATABASE your_database_name;&lt;br /&gt;
The login info and database name were saved to &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory when the database was created.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually ===&lt;br /&gt;
To create a MySQL database manually on caffeine, first connect to the database as root:&lt;br /&gt;
&lt;br /&gt;
 $ mysql -uroot -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
&lt;br /&gt;
Then run the following SQL statements:&lt;br /&gt;
&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;%&#039; IDENTIFIED BY &#039;longrandompassword&#039;;&lt;br /&gt;
 CREATE DATABASE someuser;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;%&#039;;&lt;br /&gt;
&lt;br /&gt;
This will allow users to connect locally without a password, and connect remotely with a password.&lt;br /&gt;
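&lt;br /&gt;
For example, the user could then connect like this (hypothetical names; the local connection must be run as the matching Unix account, since unix_socket authentication is based on the OS username):&lt;br /&gt;
&lt;br /&gt;
 $ mysql -u someuser someuser&lt;br /&gt;
 $ mysql -h caffeine.csclub.uwaterloo.ca -u someuser -p someuser&lt;br /&gt;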
&lt;br /&gt;
For random passwords run &amp;lt;code&amp;gt;pwgen -s 20 1&amp;lt;/code&amp;gt;. For the administrative passwords see /users/sysadmin/passwords/mysql.&lt;br /&gt;
&lt;br /&gt;
Write a file (usually ~club/mysql) in the club&#039;s home directory, readable only by them, containing the following:&lt;br /&gt;
&lt;br /&gt;
 Username: clubuserid&lt;br /&gt;
 Password: longrandompassword&lt;br /&gt;
 Hostname: localhost&lt;br /&gt;
&lt;br /&gt;
Try not to send passwords via plaintext email.&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
See the history of this page for information on the previous replication setup.&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
&lt;br /&gt;
We use [https://mariadb.com/kb/en/mariabackup-overview/ mariabackup] to take periodic backups. It is currently installed and configured on both caffeine and coffee.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing mariabackup on coffee, and sending the backups to corn-syrup.&lt;br /&gt;
&lt;br /&gt;
First, install the mariadb-backup package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install mariadb-backup&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Next, create an SSH key pair for the mysql user:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir /var/mariadb&lt;br /&gt;
chown mysql:mysql /var/mariadb&lt;br /&gt;
su -s /bin/bash mysql&lt;br /&gt;
cd /var/mariadb&lt;br /&gt;
mkdir .ssh&lt;br /&gt;
chmod 700 .ssh&lt;br /&gt;
 # Choose /var/mariadb/.ssh/id_ed25519 for the path&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the public key (/var/mariadb/.ssh/id_ed25519.pub) into /users/syscom/.ssh/authorized_keys on corn-syrup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... mysql@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Also create the folder &amp;lt;code&amp;gt;/users/syscom/backups/coffee/mariabackup&amp;lt;/code&amp;gt;. We will store the backups here.&lt;br /&gt;
&lt;br /&gt;
We will use a hacky bash script to try to emulate the same behaviour as pgBackRest. We will compress and stream each backup to a folder on corn-syrup in the format &amp;lt;code&amp;gt;1701678356-F&amp;lt;/code&amp;gt;, where the number is a Unix epoch timestamp and the letter at the end is one of F, D or I (for full, differential or incremental backups). Full backups do not depend on any other backups. Differential backups depend on the latest full backup before them. Incremental backups depend on the latest backup before them (of any type).&lt;br /&gt;
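&lt;br /&gt;
For example, the backup folder on corn-syrup might look like this (hypothetical timestamps); restoring the last incremental here requires 1701678356-F and 1701764756-D in addition to the incremental itself:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
1701678356-F    full&lt;br /&gt;
1701721556-I    incremental, depends on 1701678356-F&lt;br /&gt;
1701764756-D    differential, depends on 1701678356-F&lt;br /&gt;
1701807956-I    incremental, depends on 1701764756-D&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;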
&lt;br /&gt;
On coffee, paste the following into e.g. /var/mariadb/bin/backup-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
RETENTION_FULL=2&lt;br /&gt;
RETENTION_DIFF=4&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
# $USER doesn&#039;t seem to be defined when we run this from cron&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 &amp;lt;full|diff|incr&amp;gt;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backup_type=$1&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = full ]; then&lt;br /&gt;
    backup_type_letter=F&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = diff ]; then&lt;br /&gt;
    backup_type_letter=D&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    backup_type_letter=I&lt;br /&gt;
else&lt;br /&gt;
    echo &amp;quot;Backup type must be one of &#039;full&#039;, &#039;diff&#039; or &#039;incr&#039;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if ! pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;MariaDB is not running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariabackup &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;mariabackup is already running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Delete temporary files left behind by previous run, if there are any&lt;br /&gt;
$SSH -- &amp;quot;rm -rf $SSH_FOLDER/*.tmp&amp;quot;&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
incremental_basedir_args=&lt;br /&gt;
old_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
new_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $old_checkpoint_dir $new_checkpoint_dir&amp;quot; EXIT&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = diff -o &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    # Find a backup which we can use as a base.&lt;br /&gt;
    # For incr, this can be any type; for diff, this must be a full backup.&lt;br /&gt;
    base_backup=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        backup=${backups[i]}&lt;br /&gt;
        if [ $backup_type = incr ] || [[ $backup =~ -F$ ]]; then&lt;br /&gt;
            base_backup=$backup&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$base_backup&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find base backup for $backup_type type&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
    # Copy the xtrabackup_checkpoints file from the base backup into a&lt;br /&gt;
    # temporary directory, and use it in the mariabackup command.&lt;br /&gt;
    scp $SSH_ARGS &amp;quot;$SSH_USER@$SSH_HOST:$SSH_FOLDER/$base_backup/xtrabackup_*&amp;quot; $old_checkpoint_dir/&lt;br /&gt;
    incremental_basedir_args=&amp;quot;--incremental-basedir=$old_checkpoint_dir&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
compress_level=6&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    # Use a lower compression level to go faster&lt;br /&gt;
    compress_level=5&lt;br /&gt;
fi&lt;br /&gt;
foldername=&amp;quot;$(date +%s)-$backup_type_letter&amp;quot;&lt;br /&gt;
# First copy to a temporary dir, then rename the temporary dir to the&lt;br /&gt;
# desired dir name (in case our process gets killed)&lt;br /&gt;
mariabackup --user=mysql --backup $incremental_basedir_args --stream=xbstream --extra-lsndir=$new_checkpoint_dir \&lt;br /&gt;
    | nice zstd -$compress_level -T4 \&lt;br /&gt;
    | $SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; mkdir $foldername.tmp &amp;amp;&amp;amp; cat &amp;gt; $foldername.tmp/data.xb.zst&amp;quot;&lt;br /&gt;
scp $SSH_ARGS $new_checkpoint_dir/* $SSH_USER@$SSH_HOST:$SSH_FOLDER/$foldername.tmp/&lt;br /&gt;
$SSH -- &amp;quot;mv $SSH_FOLDER/$foldername.tmp $SSH_FOLDER/$foldername&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Delete old backups&lt;br /&gt;
if [ $backup_type = incr ]; then&lt;br /&gt;
    # We don&#039;t delete backups when making an incr backup, since we only&lt;br /&gt;
    # have retention limits for full and diff&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    retention=$RETENTION_FULL&lt;br /&gt;
else&lt;br /&gt;
    retention=$RETENTION_DIFF&lt;br /&gt;
fi&lt;br /&gt;
num_backups_of_same_type=1&lt;br /&gt;
backups_to_delete=()&lt;br /&gt;
for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
    backup=${backups[i]}&lt;br /&gt;
    if ! [[ $backup =~ -${backup_type_letter}$ ]]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    ((num_backups_of_same_type++))&lt;br /&gt;
    if [ $num_backups_of_same_type -lt $retention ]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    if [ $backup_type = full ]; then&lt;br /&gt;
        # Delete everything before the last full backup which we want to&lt;br /&gt;
        # keep&lt;br /&gt;
        pat=&#039;^&#039;&lt;br /&gt;
    else&lt;br /&gt;
        # Delete all the diff and incr backups before the last diff backup&lt;br /&gt;
        # which we want to keep&lt;br /&gt;
        pat=&#039;-[DI]$&#039;&lt;br /&gt;
    fi&lt;br /&gt;
    for ((j=$i-1; j&amp;gt;=0; j--)); do&lt;br /&gt;
        backup=${backups[j]}&lt;br /&gt;
        if [[ $backup =~ $pat ]]; then&lt;br /&gt;
            backups_to_delete+=($backup)&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    break&lt;br /&gt;
done&lt;br /&gt;
if [ ${#backups_to_delete[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups to delete&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
$SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; rm -r ${backups_to_delete[@]}&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The script should be invoked with exactly one argument which must be one of &amp;quot;full&amp;quot;, &amp;quot;diff&amp;quot; or &amp;quot;incr&amp;quot;.&lt;br /&gt;
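&lt;br /&gt;
For example, to take a manual full backup (assuming the script has been made executable):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su -s /bin/bash mysql -c &#039;/var/mariadb/bin/backup-mariadb.sh full&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;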
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
We are going to use systemd timers because they are much nicer to use than cron. Install /usr/local/bin/csc-systemd-email and /etc/systemd/system/csc-email-on-failure@.service on the target machine so that we get emails for failed jobs (there should be a copy of this on caffeine).&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup@.service:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (%i)&lt;br /&gt;
Documentation=https://wiki.csclub.uwaterloo.ca/MySQL#Backups&lt;br /&gt;
OnFailure=csc-email-on-failure@%n.service&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
User=mysql&lt;br /&gt;
ExecStart=/var/mariadb/bin/backup-mariadb.sh %i&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-full.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (full)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Full backup at 00:20 every Sunday and Wednesday&lt;br /&gt;
OnCalendar=Sun,Wed *-*-* 00:20:00&lt;br /&gt;
Unit=mariadb-backup@full.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-diff.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (diff)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Differential backup at 00:35 every day&lt;br /&gt;
OnCalendar=*-*-* 00:35:00&lt;br /&gt;
Unit=mariadb-backup@diff.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-incr.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (incr)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Incremental backup at the 50th minute of every hour&lt;br /&gt;
OnCalendar=*-*-* *:50:00&lt;br /&gt;
Unit=mariadb-backup@incr.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, enable and start the timers:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable --now mariadb-backup-full.timer&lt;br /&gt;
systemctl enable --now mariadb-backup-diff.timer&lt;br /&gt;
systemctl enable --now mariadb-backup-incr.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
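&lt;br /&gt;
You can verify that the timers are scheduled as expected with:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl list-timers &#039;mariadb-backup-*&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;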
&lt;br /&gt;
==== Restore ====&lt;br /&gt;
Paste the following into e.g. /var/mariadb/bin/restore-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
shopt -s dotglob&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -gt 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 [backup folder, e.g. 1701678356-I]&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;Please stop MariaDB first&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
if [ ${#backups[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups found&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -eq 1 ]; then&lt;br /&gt;
    last_backup_idx=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        if [ ${backups[i]} = &amp;quot;$1&amp;quot; ]; then&lt;br /&gt;
            last_backup_idx=$i&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$last_backup_idx&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find $1 on remote&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
else&lt;br /&gt;
    last_backup_idx=$(( ${#backups[@]} - 1 ))&lt;br /&gt;
fi&lt;br /&gt;
last_full_backup_idx=&lt;br /&gt;
for ((i=$last_backup_idx; i&amp;gt;=0; i--)); do&lt;br /&gt;
    if [[ ${backups[i]} =~ -F$ ]]; then&lt;br /&gt;
        last_full_backup_idx=$i&lt;br /&gt;
        break&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ -z &amp;quot;$last_full_backup_idx&amp;quot; ]; then&lt;br /&gt;
    echo &amp;quot;Could not find full backup for ${backups[last_backup_idx]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backups_to_use=()&lt;br /&gt;
if [[ ${backups[last_backup_idx]} =~ -F$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a full backup, we only need that one backup&lt;br /&gt;
    backups_to_use=(${backups[last_backup_idx]})&lt;br /&gt;
elif [[ ${backups[last_backup_idx]} =~ -D$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a diff backup, we only need that one backup and the&lt;br /&gt;
    # first full backup before it&lt;br /&gt;
    backups_to_use=(${backups[last_full_backup_idx]} ${backups[last_backup_idx]})&lt;br /&gt;
else&lt;br /&gt;
    # If we&#039;re restoring an incr backup, we need all the backups from it to&lt;br /&gt;
    # the first diff backup before it, and the first full backup before that.&lt;br /&gt;
    # If there is no diff backup between it and the last full backup, then&lt;br /&gt;
    # we need everything between it and the last full backup.&lt;br /&gt;
    for ((i=$last_backup_idx; i&amp;gt;=$last_full_backup_idx; i--)); do&lt;br /&gt;
        backups_to_use=(${backups[i]} ${backups_to_use[@]})&lt;br /&gt;
        if [[ ${backups[i]} =~ -D$ ]]; then&lt;br /&gt;
            backups_to_use=(${backups[last_full_backup_idx]} ${backups_to_use[@]})&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
fi&lt;br /&gt;
base_dir=$(mktemp -d)&lt;br /&gt;
incr_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $base_dir $incr_dir&amp;quot; EXIT&lt;br /&gt;
for backup in ${backups_to_use[@]}; do&lt;br /&gt;
    if [[ $backup =~ -F$ ]]; then&lt;br /&gt;
        backup_dir=$base_dir&lt;br /&gt;
    else&lt;br /&gt;
        backup_dir=$incr_dir&lt;br /&gt;
    fi&lt;br /&gt;
    $SSH -- &amp;quot;cat $SSH_FOLDER/$backup/data.xb.zst&amp;quot; | zstd -d | mbstream -x -C $backup_dir&lt;br /&gt;
    incremental_dir_args=&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        incremental_dir_args=&amp;quot;--incremental-dir=$incr_dir&amp;quot;&lt;br /&gt;
    fi&lt;br /&gt;
    mariabackup --prepare --target-dir=$base_dir $incremental_dir_args&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        rm -rf $incr_dir/*&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ &amp;quot;$(/bin/ls -1 /var/lib/mysql | wc -l)&amp;quot; -gt 0 ]; then&lt;br /&gt;
    read -p &amp;quot;Everything under /var/lib/mysql will be deleted. Continue (y/n)? &amp;quot; yn&lt;br /&gt;
    yn=${yn,,}  # convert to lower case&lt;br /&gt;
    if [ &amp;quot;$yn&amp;quot; = y -o &amp;quot;$yn&amp;quot; = yes ]; then&lt;br /&gt;
        rm -rf /var/lib/mysql/*&lt;br /&gt;
    else&lt;br /&gt;
        echo &amp;quot;Aborting.&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
fi&lt;br /&gt;
mariabackup --move-back --target-dir=$base_dir&lt;br /&gt;
echo &amp;quot;Restoration succeeded, please restart MariaDB&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Make sure to stop MariaDB before restoring a backup. If this script is invoked without any arguments, the latest backup found on corn-syrup will be used; a single argument may also be specified, which must be the name of one of the backup folders stored on corn-syrup.&lt;br /&gt;
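&lt;br /&gt;
For example (the folder name here is hypothetical; omit the argument to restore the latest backup):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su -s /bin/bash mysql -c &#039;/var/mariadb/bin/restore-mariadb.sh 1701678356-F&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;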
&lt;br /&gt;
[[Category:Software]]&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=PostgreSQL&amp;diff=5239</id>
		<title>PostgreSQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=PostgreSQL&amp;diff=5239"/>
		<updated>2024-03-30T14:47:16Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Installation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
PostgreSQL is available as a service for members on caffeine. Just run &amp;lt;code&amp;gt;ceo postgresql create&amp;lt;/code&amp;gt; to create a new database for your account. As of this writing, club reps cannot create PostgreSQL databases for their clubs via ceo, so they will need to send an email to syscom instead.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
We are also running a Postgres database on coffee, which is not available to members. Any software installed by syscom should use this database instead of the one on caffeine.&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually on caffeine ===&lt;br /&gt;
See [https://git.csclub.uwaterloo.ca/public/pyceo/src/commit/392ec153d0a1a9f4068a5ba3c4e4ecb2279ebab4/ceod/db/PostgreSQLService.py#L58 how ceo does it].&lt;br /&gt;
&lt;br /&gt;
=== Upgrades ===&lt;br /&gt;
Upgrading Postgres is more difficult than upgrading MySQL; when you upgrade the Debian version on a machine, a newer version of Postgres will be installed but the old version will remain and the data will not be migrated. &amp;lt;strong&amp;gt;You are responsible for manually upgrading the database yourself&amp;lt;/strong&amp;gt; on all machines where Postgres is installed (currently, just coffee and caffeine).&lt;br /&gt;
&lt;br /&gt;
Here&#039;s the Debian-specific way to do it (steps adapted from [https://www.pontikis.net/blog/update-postgres-major-version-in-debian here]). In the example below, we will assume that we are upgrading from Postgres 13 to 15.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
First, take a full backup of the database. &amp;lt;strong&amp;gt;DO NOT SKIP THIS STEP.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pg_dumpall | xz -T0 &amp;gt; dump.sql.xz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Drop the &amp;lt;strong&amp;gt;new&amp;lt;/strong&amp;gt; database, which should be empty at this point. &amp;lt;strong&amp;gt;Make sure that you are not dropping the old database instead!&amp;lt;/strong&amp;gt; You can run &amp;lt;code&amp;gt;pg_lsclusters&amp;lt;/code&amp;gt; to see which database versions are present.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Make sure that this is the NEW version, not the old version!&lt;br /&gt;
pg_dropcluster --stop 15 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Upgrade the cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pg_upgradecluster -v 15 13 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Run psql and make sure that the databases are present:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres -c psql&lt;br /&gt;
\l&lt;br /&gt;
\q&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Once we are sure that everything is working, drop the old database:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Make sure that this is the OLD version, not the new version!&lt;br /&gt;
pg_dropcluster --stop 13 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
It is now safe to purge the old postgres package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt purge postgresql-13&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
We use [https://pgbackrest.org pgBackRest] for Postgres backups. It has already been installed on coffee and caffeine.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing pgbackrest on coffee, and using corn-syrup to store the backups (via SSH).&lt;br /&gt;
&lt;br /&gt;
The pgbackrest package in bookworm is too old and doesn&#039;t support SFTP, so we&#039;re going to download the packages we need from trixie instead (from trixie onward, this should no longer be necessary):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# On coffee&lt;br /&gt;
wget http://mirror.csclub.uwaterloo.ca/debian/pool/main/p/pgbackrest/pgbackrest_2.48-1_amd64.deb&lt;br /&gt;
wget http://mirror.csclub.uwaterloo.ca/debian/pool/main/libz/libzstd/libzstd1_1.5.5+dfsg2-2_amd64.deb&lt;br /&gt;
apt install ./pgbackrest_2.48-1_amd64.deb ./libzstd1_1.5.5+dfsg2-2_amd64.deb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Switch to the postgres user and create a new SSH key:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Log in to corn-syrup, switch to the syscom user, and paste the public key you created earlier into ~/.ssh/authorized_keys:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... postgres@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Create a folder to store the backups:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir ~/backups/coffee/pgbackrest&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, on coffee, paste something like the following into /etc/pgbackrest.conf. &amp;lt;strong&amp;gt;Make sure to adjust repo1-path and pg1-path.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[global]&lt;br /&gt;
repo1-retention-full=2&lt;br /&gt;
repo1-retention-diff=4&lt;br /&gt;
repo1-bundle=y&lt;br /&gt;
repo1-type=sftp&lt;br /&gt;
repo1-sftp-host=corn-syrup&lt;br /&gt;
repo1-sftp-host-user=syscom&lt;br /&gt;
repo1-path=/users/syscom/backups/coffee/pgbackrest&lt;br /&gt;
repo1-sftp-private-key-file=/var/lib/postgresql/.ssh/id_ed25519&lt;br /&gt;
repo1-sftp-public-key-file=/var/lib/postgresql/.ssh/id_ed25519.pub&lt;br /&gt;
repo1-sftp-host-key-hash-type=sha256&lt;br /&gt;
repo1-sftp-host-key-check-type=none&lt;br /&gt;
start-fast=y&lt;br /&gt;
log-level-console=info&lt;br /&gt;
process-max=4&lt;br /&gt;
compress-type=zst&lt;br /&gt;
&lt;br /&gt;
[main]&lt;br /&gt;
pg1-path=/var/lib/postgresql/15/main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The config above will keep two full backups and at least four differential backups. See https://pgbackrest.org/user-guide.html#retention for more details.&lt;br /&gt;
&lt;br /&gt;
Next, open /etc/postgresql/15/main/postgresql.conf and add/edit the following lines:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
archive_mode = on&lt;br /&gt;
archive_command = &#039;pgbackrest --stanza=main archive-push %p&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
See https://pgbackrest.org/user-guide.html#quickstart/configure-archiving for more details.&lt;br /&gt;
&lt;br /&gt;
Next, restart Postgres:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl restart postgresql@15-main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Switch to the postgres user, create the main stanza, and run the first backup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
pgbackrest --stanza=main stanza-create&lt;br /&gt;
pgbackrest --stanza=main check&lt;br /&gt;
pgbackrest --stanza=main backup --type=full&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Upgrades ====&lt;br /&gt;
Normally, whenever you upgrade Postgres, you have to manually edit /etc/pgbackrest.conf and run the &amp;quot;stanza-upgrade&amp;quot; command. To make this easier for future sysadmins, I wrote a wrapper script around pgbackrest which does this automatically if it detects that Postgres was upgraded. Paste the following into /var/lib/postgresql/bin/pgbackrest-wrapper.sh and make it executable:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
set -ex&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != postgres ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the postgres user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Use the full path to ls to avoid bash aliases&lt;br /&gt;
mapfile -t pg_versions &amp;lt; &amp;lt;(/bin/ls -1 /var/lib/postgresql | grep -P &#039;^\d+$&#039;)&lt;br /&gt;
if [ ${#pg_versions[@]} -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Expected to find 1 Postgres version, found ${#pg_versions[@]} instead: ${pg_versions[*]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
pg_ver=${pg_versions[0]}&lt;br /&gt;
mapfile -t pgbr_versions &amp;lt; &amp;lt;(grep -oP &#039;/var/lib/postgresql/\K(\d+)&#039; /etc/pgbackrest.conf)&lt;br /&gt;
if [ ${#pgbr_versions[@]} -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Expected to find 1 pgBackRest folder, found ${#pgbr_versions[@]} instead: ${pgbr_versions[*]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
pgbr_ver=${pgbr_versions[0]}&lt;br /&gt;
if [ $pg_ver -eq $pgbr_ver ]; then&lt;br /&gt;
    # pgbackrest.conf is up to date, so just run the backup normally&lt;br /&gt;
    pgbackrest &amp;quot;$@&amp;quot;&lt;br /&gt;
    exit 0&lt;br /&gt;
elif [ $pg_ver -lt $pgbr_ver ]; then&lt;br /&gt;
    echo &amp;quot;pgBackRest does not support downgrades - you will have to fix this manually&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# sed -i needs to create a temporary file, and the postgres user doesn&#039;t have&lt;br /&gt;
# write permissions on /etc, so write to a temporary file first&lt;br /&gt;
sed &amp;quot;s,/var/lib/postgresql/$pgbr_ver,/var/lib/postgresql/$pg_ver,&amp;quot; /etc/pgbackrest.conf &amp;gt; /tmp/pgbackrest.conf&lt;br /&gt;
cp /tmp/pgbackrest.conf /etc/pgbackrest.conf&lt;br /&gt;
rm /tmp/pgbackrest.conf&lt;br /&gt;
pgbackrest --stanza=main stanza-upgrade&lt;br /&gt;
pgbackrest --stanza=main check&lt;br /&gt;
# Run the backup&lt;br /&gt;
pgbackrest &amp;quot;$@&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now we can just pass pgbackrest parameters directly to this script, e.g. &amp;lt;code&amp;gt;pgbackrest-wrapper.sh --stanza=main backup&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
We are going to use systemd timers because they are much nicer to use than cron. Install /usr/local/bin/csc-systemd-email and /etc/systemd/system/csc-email-on-failure@.service on the target machine so that we get emails for failed jobs (there should be a copy of this on caffeine). &lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup@.service: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (%i)&lt;br /&gt;
Documentation=https://wiki.csclub.uwaterloo.ca/PostgreSQL#Backups&lt;br /&gt;
OnFailure=csc-email-on-failure@%n.service&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
User=postgres&lt;br /&gt;
ExecStart=/var/lib/postgresql/bin/pgbackrest-wrapper.sh --stanza=main backup --type=%i&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup-full.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (full)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Full backup at 00:15 every Sunday and Wednesday&lt;br /&gt;
OnCalendar=Sun,Wed *-*-* 00:15:00&lt;br /&gt;
Unit=postgres-backup@full.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup-diff.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (diff)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Differential backup at 00:30 every day&lt;br /&gt;
OnCalendar=*-*-* 00:30:00&lt;br /&gt;
Unit=postgres-backup@diff.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup-incr.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (incr)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Incremental backup at the 45th minute of every hour&lt;br /&gt;
OnCalendar=*-*-* *:45:00&lt;br /&gt;
Unit=postgres-backup@incr.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, enable and start the timers:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable --now postgres-backup-full.timer&lt;br /&gt;
systemctl enable --now postgres-backup-diff.timer&lt;br /&gt;
systemctl enable --now postgres-backup-incr.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Restore ====&lt;br /&gt;
Suppose we want to restore the latest backup, and the installed Postgres is 15. First, make sure that you actually have at least one backup present for this version:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres -c &#039;pgbackrest --stanza=main info&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, stop the database and delete all of the files:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl stop postgresql@15-main&lt;br /&gt;
rm -rf /var/lib/postgresql/15/main/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now switch to the postgres user and run the &amp;quot;restore&amp;quot; command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
pgbackrest --stanza=main restore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you start Postgres, everything should be in a working state:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl start postgresql@15-main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to restore a backup which is not the latest version, pass the &amp;lt;code&amp;gt;--set&amp;lt;/code&amp;gt; argument to pgbackrest, as sketched below. See https://pgbackrest.org/user-guide.html#restore for more details.&lt;br /&gt;
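&lt;br /&gt;
For example (the backup label here is hypothetical; list real labels with &amp;lt;code&amp;gt;pgbackrest --stanza=main info&amp;lt;/code&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pgbackrest --stanza=main --set=20240330-144716F restore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>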
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=PostgreSQL&amp;diff=5232</id>
		<title>PostgreSQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=PostgreSQL&amp;diff=5232"/>
		<updated>2024-03-16T09:54:48Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Cron */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
PostgreSQL is available as a service for members on caffeine. Just run &amp;lt;code&amp;gt;ceo postgresql create&amp;lt;/code&amp;gt; to create a new database for your account. As of this writing, club reps cannot create PostgreSQL databases for their clubs via ceo, so they will need to send an email to syscom instead.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
We are also running a Postgres database on coffee, which is not available to members. Any software installed by syscom should use this database instead of the one on caffeine.&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually on caffeine ===&lt;br /&gt;
See [https://git.csclub.uwaterloo.ca/public/pyceo/src/commit/392ec153d0a1a9f4068a5ba3c4e4ecb2279ebab4/ceod/db/PostgreSQLService.py#L58 how ceo does it].&lt;br /&gt;
&lt;br /&gt;
=== Upgrades ===&lt;br /&gt;
Upgrading Postgres is more difficult than upgrading MySQL; when you upgrade the Debian version on a machine, a newer version of Postgres will be installed but the old version will remain and the data will not be migrated. &amp;lt;strong&amp;gt;You are responsible for manually upgrading the database yourself&amp;lt;/strong&amp;gt; on all machines where Postgres is installed (currently, just coffee and caffeine).&lt;br /&gt;
&lt;br /&gt;
Here&#039;s the Debian-specific way to do it (steps adapted from [https://www.pontikis.net/blog/update-postgres-major-version-in-debian here]). In the example below, we will assume that we are upgrading from Postgres 13 to 15.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
First, take a full backup of the database. &amp;lt;strong&amp;gt;DO NOT SKIP THIS STEP.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pg_dumpall | xz -T0 &amp;gt; dump.sql.xz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Drop the &amp;lt;strong&amp;gt;new&amp;lt;/strong&amp;gt; database, which should be empty at this point. &amp;lt;strong&amp;gt;Make sure that you are not dropping the old database instead!&amp;lt;/strong&amp;gt; You can run &amp;lt;code&amp;gt;pg_lsclusters&amp;lt;/code&amp;gt; to see which database versions are present.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Make sure that this is the NEW version, not the old version!&lt;br /&gt;
pg_dropcluster --stop 15 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Upgrade the cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pg_upgradecluster -v 15 13 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Run psql and make sure that the databases are present:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres -c psql&lt;br /&gt;
\l&lt;br /&gt;
\q&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Once we are sure that everything is working, drop the old database:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Make sure that this is the OLD version, not the new version!&lt;br /&gt;
pg_dropcluster --stop 13 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
It is now safe to purge the old postgres package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt purge postgresql-13&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
We use [https://pgbackrest.org pgBackRest] for Postgres backups. It has already been installed on coffee and caffeine.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing pgbackrest on coffee, and using corn-syrup to store the backups (via SSH).&lt;br /&gt;
&lt;br /&gt;
The pgbackrest package in bookworm is too old and doesn&#039;t support SFTP, so we&#039;re going to download the packages we need from trixie instead (from trixie onward, this should no longer be necessary):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# On coffee&lt;br /&gt;
wget http://mirror.csclub.uwaterloo.ca/debian/pool/main/p/pgbackrest/pgbackrest_2.48-1_amd64.deb&lt;br /&gt;
wget http://mirror.csclub.uwaterloo.ca/debian/pool/main/libz/libzstd/libzstd1_1.5.5+dfsg2-2_amd64.deb&lt;br /&gt;
apt install ./pgbackrest_2.48-1_amd64.deb ./libzstd1_1.5.5+dfsg2-2_amd64.deb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Switch to the postgres user and create a new SSH key:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Log in to corn-syrup, switch to the syscom user, and paste the public key you created earlier into ~/.ssh/authorized_keys:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... postgres@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Create a folder to store the backups:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir ~/backups/coffee/pgbackrest&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, on coffee, paste something like the following into /etc/pgbackrest.conf. &amp;lt;strong&amp;gt;Make sure to adjust repo1-path and pg1-path.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[global]&lt;br /&gt;
repo1-retention-full=2&lt;br /&gt;
repo1-retention-diff=4&lt;br /&gt;
repo1-bundle=y&lt;br /&gt;
repo1-type=sftp&lt;br /&gt;
repo1-sftp-host=corn-syrup&lt;br /&gt;
repo1-sftp-host-user=syscom&lt;br /&gt;
repo1-path=/users/syscom/backups/coffee/pgbackrest&lt;br /&gt;
repo1-sftp-private-key-file=/var/lib/postgresql/.ssh/id_ed25519&lt;br /&gt;
repo1-sftp-public-key-file=/var/lib/postgresql/.ssh/id_ed25519.pub&lt;br /&gt;
repo1-sftp-host-key-hash-type=sha256&lt;br /&gt;
repo1-sftp-host-key-check-type=none&lt;br /&gt;
start-fast=y&lt;br /&gt;
log-level-console=info&lt;br /&gt;
process-max=4&lt;br /&gt;
compress-type=lz4&lt;br /&gt;
&lt;br /&gt;
[main]&lt;br /&gt;
pg1-path=/var/lib/postgresql/15/main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The config above will keep two full backups and at least four differential backups. See https://pgbackrest.org/user-guide.html#retention for more details.&lt;br /&gt;
&lt;br /&gt;
Next, open /etc/postgresql/15/main/postgresql.conf and add/edit the following lines:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
archive_mode = on&lt;br /&gt;
archive_command = &#039;pgbackrest --stanza=main archive-push %p&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
See https://pgbackrest.org/user-guide.html#quickstart/configure-archiving for more details.&lt;br /&gt;
&lt;br /&gt;
Next, restart Postgres:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl restart postgresql@15-main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Switch to the postgres user, create the main stanza, and run the first backup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
pgbackrest --stanza=main stanza-create&lt;br /&gt;
pgbackrest --stanza=main check&lt;br /&gt;
pgbackrest --stanza=main backup --type=full&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Upgrades ====&lt;br /&gt;
Normally, whenever you upgrade Postgres, you have to manually edit /etc/pgbackrest.conf and run the &amp;quot;stanza-upgrade&amp;quot; command. To make this easier for future sysadmins, I wrote a wrapper script around pgbackrest which does this automatically if it detects that Postgres was upgraded. Paste the following into /var/lib/postgresql/bin/pgbackrest-wrapper.sh and make it executable:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
set -ex&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != postgres ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the postgres user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Use the full path to ls to avoid bash aliases&lt;br /&gt;
mapfile -t pg_versions &amp;lt; &amp;lt;(/bin/ls -1 /var/lib/postgresql | grep -P &#039;^\d+$&#039;)&lt;br /&gt;
if [ ${#pg_versions[@]} -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Expected to find 1 Postgres version, found ${#pg_versions[@]} instead: ${pg_versions[*]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
pg_ver=${pg_versions[0]}&lt;br /&gt;
mapfile -t pgbr_versions &amp;lt; &amp;lt;(grep -oP &#039;/var/lib/postgresql/\K(\d+)&#039; /etc/pgbackrest.conf)&lt;br /&gt;
if [ ${#pgbr_versions[@]} -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Expected to find 1 pgBackRest folder, found ${#pgbr_versions[@]} instead: ${pgbr_versions[*]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
pgbr_ver=${pgbr_versions[0]}&lt;br /&gt;
if [ $pg_ver -eq $pgbr_ver ]; then&lt;br /&gt;
    # pgbackrest.conf is up to date, so just run the backup normally&lt;br /&gt;
    pgbackrest &amp;quot;$@&amp;quot;&lt;br /&gt;
    exit 0&lt;br /&gt;
elif [ $pg_ver -lt $pgbr_ver ]; then&lt;br /&gt;
    echo &amp;quot;pgBackRest does not support downgrades - you will have to fix this manually&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# sed -i needs to create a temporary file, and the postgres user doesn&#039;t have&lt;br /&gt;
# write permissions on /etc, so write to a temporary file first&lt;br /&gt;
sed &amp;quot;s,/var/lib/postgresql/$pgbr_ver,/var/lib/postgresql/$pg_ver,&amp;quot; /etc/pgbackrest.conf &amp;gt; /tmp/pgbackrest.conf&lt;br /&gt;
cp /tmp/pgbackrest.conf /etc/pgbackrest.conf&lt;br /&gt;
rm /tmp/pgbackrest.conf&lt;br /&gt;
pgbackrest --stanza=main stanza-upgrade&lt;br /&gt;
pgbackrest --stanza=main check&lt;br /&gt;
# Run the backup&lt;br /&gt;
pgbackrest &amp;quot;$@&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now we can just pass pgbackrest parameters directly to this script, e.g. &amp;lt;code&amp;gt;pgbackrest-wrapper.sh --stanza=main backup&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
We are going to use systemd timers because they are much nicer to use than cron. Install /usr/local/bin/csc-systemd-email and /etc/systemd/system/csc-email-on-failure@.service on the target machine so that we get emails for failed jobs (there should be a copy of this on caffeine). &lt;br /&gt;
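If the copy on caffeine is ever unavailable, here is a minimal sketch of what &amp;lt;code&amp;gt;csc-email-on-failure@.service&amp;lt;/code&amp;gt; might look like (this assumes csc-systemd-email takes the failed unit name as its only argument; check the actual copy on caffeine first):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Send failure email for %i&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
# Assumption: csc-systemd-email accepts the failed unit name as an argument&lt;br /&gt;
ExecStart=/usr/local/bin/csc-systemd-email %i&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;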
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup@.service: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (%i)&lt;br /&gt;
Documentation=https://wiki.csclub.uwaterloo.ca/PostgreSQL#Backups&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
User=postgres&lt;br /&gt;
ExecStart=/var/lib/postgresql/bin/pgbackrest-wrapper.sh --stanza=main backup --type=%i&lt;br /&gt;
&lt;br /&gt;
[Unit]&lt;br /&gt;
OnFailure=csc-email-on-failure@%n.service&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup-full.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (full)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Full backup at 00:15 every Sunday and Wednesday&lt;br /&gt;
OnCalendar=Sun,Wed *-*-* 00:15:00&lt;br /&gt;
Unit=postgres-backup@full.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup-diff.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (diff)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Differential backup at 00:30 every day&lt;br /&gt;
OnCalendar=*-*-* 00:30:00&lt;br /&gt;
Unit=postgres-backup@diff.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup-incr.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (incr)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Incremental backup at the 45th minute of every hour&lt;br /&gt;
OnCalendar=*-*-* *:45:00&lt;br /&gt;
Unit=postgres-backup@incr.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, enable and start the timers:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable --now postgres-backup-full.timer&lt;br /&gt;
systemctl enable --now postgres-backup-diff.timer&lt;br /&gt;
systemctl enable --now postgres-backup-incr.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
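You can confirm that the timers are scheduled:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl list-timers &#039;postgres-backup-*&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;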
&lt;br /&gt;
==== Restore ====&lt;br /&gt;
Suppose we want to restore the latest backup, and the installed Postgres is 15. First, make sure that you actually have at least one backup present for this version:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres -c &#039;pgbackrest --stanza=main info&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, stop the database and delete all of the files:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl stop postgresql@15-main&lt;br /&gt;
rm -rf /var/lib/postgresql/15/main/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now switch to the postgres user and run the &amp;quot;restore&amp;quot; command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
pgbackrest --stanza=main restore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you start Postgres, everything should be in a working state:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl start postgresql@15-main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to restore a backup other than the latest one, pass the &amp;lt;code&amp;gt;--set&amp;lt;/code&amp;gt; argument to pgbackrest. See https://pgbackrest.org/user-guide.html#restore for more details.&lt;br /&gt;
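For example, to restore a specific backup set (the label below is illustrative; use one of the labels printed by the &amp;lt;code&amp;gt;info&amp;lt;/code&amp;gt; command):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
# Find the label of the backup you want, e.g. 20240316-001500F&lt;br /&gt;
pgbackrest --stanza=main info&lt;br /&gt;
pgbackrest --stanza=main --set=20240316-001500F restore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>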
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=PostgreSQL&amp;diff=5231</id>
		<title>PostgreSQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=PostgreSQL&amp;diff=5231"/>
		<updated>2024-03-16T09:24:18Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Cron */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
PostgreSQL is available as a service for members on caffeine. Just run &amp;lt;code&amp;gt;ceo postgresql create&amp;lt;/code&amp;gt; to create a new database for your account. As of this writing, club reps cannot create PostgreSQL databases for their clubs via ceo, so they will need to send an email to syscom instead.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
We are also running a Postgres database on coffee, which is not available to members. Any software installed by syscom should use this database instead of the one on caffeine.&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually on caffeine ===&lt;br /&gt;
See [https://git.csclub.uwaterloo.ca/public/pyceo/src/commit/392ec153d0a1a9f4068a5ba3c4e4ecb2279ebab4/ceod/db/PostgreSQLService.py#L58 how ceo does it].&lt;br /&gt;
&lt;br /&gt;
=== Upgrades ===&lt;br /&gt;
Upgrading Postgres is more difficult than upgrading MySQL; when you upgrade the Debian version on a machine, a newer version of Postgres will be installed, but the old version will remain and the data will not be migrated. &amp;lt;strong&amp;gt;You must manually upgrade the database&amp;lt;/strong&amp;gt; on every machine where Postgres is installed (currently, just coffee and caffeine).&lt;br /&gt;
&lt;br /&gt;
Here&#039;s the Debian-specific way to do it (steps adapted from [https://www.pontikis.net/blog/update-postgres-major-version-in-debian here]). In the example below, we will assume that we are upgrading from Postgres 13 to 15.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
First, take a full backup of the database. &amp;lt;strong&amp;gt;DO NOT SKIP THIS STEP.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pg_dumpall | xz -T0 &amp;gt; dump.sql.xz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Drop the &amp;lt;strong&amp;gt;new&amp;lt;/strong&amp;gt; database, which should be empty at this point. &amp;lt;strong&amp;gt;Make sure that you are not dropping the old database instead!&amp;lt;/strong&amp;gt; You can run &amp;lt;code&amp;gt;pg_lsclusters&amp;lt;/code&amp;gt; to see which database versions are present.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Make sure that this is the NEW version, not the old version!&lt;br /&gt;
pg_dropcluster --stop 15 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Upgrade the cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pg_upgradecluster -v 15 13 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Run psql and make sure that the databases are present:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres -c psql&lt;br /&gt;
\l&lt;br /&gt;
\q&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Once we are sure that everything is working, drop the old database:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Make sure that this is the OLD version, not the new version!&lt;br /&gt;
pg_dropcluster --stop 13 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
It is now safe to purge the old postgres package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt purge postgresql-13&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
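After the upgrade, you can double-check that only the new cluster remains:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pg_lsclusters&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;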
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
We use [https://pgbackrest.org pgBackRest] for Postgres backups. It has already been installed on coffee and caffeine.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing pgbackrest on coffee, and using corn-syrup to store the backups (via SSH).&lt;br /&gt;
&lt;br /&gt;
The pgbackrest package in bookworm is too old and doesn&#039;t support SFTP, so we&#039;re going to download the packages we need from trixie instead (from trixie onward, this should no longer be necessary):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# On coffee&lt;br /&gt;
wget http://mirror.csclub.uwaterloo.ca/debian/pool/main/p/pgbackrest/pgbackrest_2.48-1_amd64.deb&lt;br /&gt;
wget http://mirror.csclub.uwaterloo.ca/debian/pool/main/libz/libzstd/libzstd1_1.5.5+dfsg2-2_amd64.deb&lt;br /&gt;
apt install ./pgbackrest_2.48-1_amd64.deb ./libzstd1_1.5.5+dfsg2-2_amd64.deb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Switch to the postgres user and create a new SSH key:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Login to corn-syrup, switch to the syscom user, and paste the public key you created earlier into ~/.ssh/authorized_keys:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... postgres@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Create a folder to store the backups:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir ~/backups/coffee/pgbackrest&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, on coffee, paste something like the following into /etc/pgbackrest.conf. &amp;lt;strong&amp;gt;Make sure to adjust repo1-path and pg1-path.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[global]&lt;br /&gt;
repo1-retention-full=2&lt;br /&gt;
repo1-retention-diff=4&lt;br /&gt;
repo1-bundle=y&lt;br /&gt;
repo1-type=sftp&lt;br /&gt;
repo1-sftp-host=corn-syrup&lt;br /&gt;
repo1-sftp-host-user=syscom&lt;br /&gt;
repo1-path=/users/syscom/backups/coffee/pgbackrest&lt;br /&gt;
repo1-sftp-private-key-file=/var/lib/postgresql/.ssh/id_ed25519&lt;br /&gt;
repo1-sftp-public-key-file=/var/lib/postgresql/.ssh/id_ed25519.pub&lt;br /&gt;
repo1-sftp-host-key-hash-type=sha256&lt;br /&gt;
repo1-sftp-host-key-check-type=none&lt;br /&gt;
start-fast=y&lt;br /&gt;
log-level-console=info&lt;br /&gt;
process-max=4&lt;br /&gt;
compress-type=lz4&lt;br /&gt;
&lt;br /&gt;
[main]&lt;br /&gt;
pg1-path=/var/lib/postgresql/15/main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The config above will keep two full backups and at least four differential backups. See https://pgbackrest.org/user-guide.html#retention for more details.&lt;br /&gt;
&lt;br /&gt;
Next, open /etc/postgresql/15/main/postgresql.conf and add/edit the following lines:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
archive_mode = on&lt;br /&gt;
archive_command = &#039;pgbackrest --stanza=main archive-push %p&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
See https://pgbackrest.org/user-guide.html#quickstart/configure-archiving for more details.&lt;br /&gt;
&lt;br /&gt;
Next, restart Postgres:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl restart postgresql@15-main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Switch to the postgres user, create the main stanza, and run the first backup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
pgbackrest --stanza=main stanza-create&lt;br /&gt;
pgbackrest --stanza=main check&lt;br /&gt;
pgbackrest --stanza=main backup --type=full&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Upgrades ====&lt;br /&gt;
Normally, whenever you upgrade Postgres, you have to manually edit /etc/pgbackrest.conf and run the &amp;quot;stanza-upgrade&amp;quot; command. To make this easier for future sysadmins, I wrote a wrapper script around pgbackrest which does this automatically if it detects that Postgres was upgraded. Paste the following into /var/lib/postgresql/bin/pgbackrest-wrapper.sh and make it executable:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
set -ex&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != postgres ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the postgres user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Use the full path to ls to avoid bash aliases&lt;br /&gt;
mapfile -t pg_versions &amp;lt; &amp;lt;(/bin/ls -1 /var/lib/postgresql | grep -P &#039;^\d+$&#039;)&lt;br /&gt;
if [ ${#pg_versions[@]} -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Expected to find 1 Postgres version, found ${#pg_versions[@]} instead: ${pg_versions[*]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
pg_ver=${pg_versions[0]}&lt;br /&gt;
mapfile -t pgbr_versions &amp;lt; &amp;lt;(grep -oP &#039;/var/lib/postgresql/\K(\d+)&#039; /etc/pgbackrest.conf)&lt;br /&gt;
if [ ${#pgbr_versions[@]} -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Expected to find 1 pgBackRest folder, found ${#pgbr_versions[@]} instead: ${pgbr_versions[*]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
pgbr_ver=${pgbr_versions[0]}&lt;br /&gt;
if [ $pg_ver -eq $pgbr_ver ]; then&lt;br /&gt;
    # pgbackrest.conf is up to date, so just run the backup normally&lt;br /&gt;
    pgbackrest &amp;quot;$@&amp;quot;&lt;br /&gt;
    exit 0&lt;br /&gt;
elif [ $pg_ver -lt $pgbr_ver ]; then&lt;br /&gt;
    echo &amp;quot;pgBackRest does not support downgrades - you will have to fix this manually&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# sed -i needs to create a temporary file, and the postgres user doesn&#039;t have&lt;br /&gt;
# write permissions on /etc, so write to a temporary file first&lt;br /&gt;
sed &amp;quot;s,/var/lib/postgresql/$pgbr_ver,/var/lib/postgresql/$pg_ver,&amp;quot; /etc/pgbackrest.conf &amp;gt; /tmp/pgbackrest.conf&lt;br /&gt;
cp /tmp/pgbackrest.conf /etc/pgbackrest.conf&lt;br /&gt;
rm /tmp/pgbackrest.conf&lt;br /&gt;
pgbackrest --stanza=main stanza-upgrade&lt;br /&gt;
pgbackrest --stanza=main check&lt;br /&gt;
# Run the backup&lt;br /&gt;
pgbackrest &amp;quot;$@&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now we can just pass pgbackrest parameters directly to this script, e.g. &amp;lt;code&amp;gt;pgbackrest-wrapper.sh --stanza=main backup&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
We are going to use systemd timers because they are much nicer to use than cron. Install /usr/local/bin/csc-systemd-email and /etc/systemd/system/csc-email-on-failure@.service on the target machine so that we get emails for failed jobs (there should be a copy of this on caffeine). &lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup@.service: &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (%i)&lt;br /&gt;
Documentation=https://wiki.csclub.uwaterloo.ca/PostgreSQL#Backups&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
User=postgres&lt;br /&gt;
ExecStart=/var/lib/postgresql/bin/pgbackrest-wrapper.sh --stanza=main backup --type=%i&lt;br /&gt;
&lt;br /&gt;
[Unit]&lt;br /&gt;
OnFailure=csc-email-on-failure@%n.service&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup-full.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (full)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Full backup at 00:15 every Sunday and Wednesday&lt;br /&gt;
OnCalendar=Sun,Wed *-*-* 00:15:00&lt;br /&gt;
Unit=postgres-backup@full.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup-diff.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (diff)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Differential backup at 00:30 every day&lt;br /&gt;
OnCalendar=*-*-* 00:30:00&lt;br /&gt;
Unit=postgres-backup@diff.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/postgres-backup-incr.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Postgres backup (incr)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Incremental backup at the 45th minute of every hour&lt;br /&gt;
OnCalendar=*-*-* *:45:00&lt;br /&gt;
Unit=postgres-backup@incr.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, enable and start the timers:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable --now postgres-backup-full.timer&lt;br /&gt;
systemctl enable --now postgres-backup-diff.timer&lt;br /&gt;
systemctl enable --now postgres-backup-incr.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Restore ====&lt;br /&gt;
Suppose we want to restore the latest backup, and the installed Postgres is 15. First, make sure that you actually have at least one backup present for this version:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres -c &#039;pgbackrest --stanza=main info&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, stop the database and delete all of the files:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl stop postgresql@15-main&lt;br /&gt;
rm -rf /var/lib/postgresql/15/main/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now switch to the postgres user and run the &amp;quot;restore&amp;quot; command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
pgbackrest --stanza=main restore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you start Postgres, everything should be in a working state:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl start postgresql@15-main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to restore a backup other than the latest one, pass the &amp;lt;code&amp;gt;--set&amp;lt;/code&amp;gt; argument to pgbackrest. See https://pgbackrest.org/user-guide.html#restore for more details.&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5230</id>
		<title>MySQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5230"/>
		<updated>2024-03-16T08:51:31Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Cron */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
Note: the database on caffeine is actually MariaDB, not MySQL. Although they are mostly compatible, there are some incompatibilities to be aware of. See [https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/ MariaDB versus MySQL: Compatibility] for details.&lt;br /&gt;
&lt;br /&gt;
=== Creating databases ===&lt;br /&gt;
&lt;br /&gt;
Users can create their own MySQL databases through [[ceo]]. Users emailing syscom asking for a MySQL database should be directed to do so. The process is as follows:&lt;br /&gt;
&lt;br /&gt;
# SSH into any [[Machine_List|CSC machine]].&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;ceo&amp;lt;/tt&amp;gt;.&lt;br /&gt;
# Select &amp;quot;Create MySQL database&amp;quot; and follow the instructions.&lt;br /&gt;
# Login info will be stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory.&lt;br /&gt;
# You can now connect to the MySQL database (from [[Machine_List#caffeine|caffeine]] only).&lt;br /&gt;
&lt;br /&gt;
=== Deleting databases ===&lt;br /&gt;
&lt;br /&gt;
Users can delete their own MySQL databases. &lt;br /&gt;
&lt;br /&gt;
SSH into [[Machine_List#caffeine|caffeine]].&lt;br /&gt;
 mysql -u yourusernamehere -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
 DROP DATABASE yourdatabasenamehere;&lt;br /&gt;
The login info and database name were saved to &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory when the database was created.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually ===&lt;br /&gt;
To create a MySQL database manually on caffeine, first connect to the database as root:&lt;br /&gt;
&lt;br /&gt;
 $ mysql -uroot -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
&lt;br /&gt;
Then run the following SQL statements:&lt;br /&gt;
&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;%&#039; IDENTIFIED BY &#039;longrandompassword&#039;;&lt;br /&gt;
 CREATE DATABASE someuser;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;%&#039;;&lt;br /&gt;
&lt;br /&gt;
This will allow users to connect locally without a password, and connect remotely with a password.&lt;br /&gt;
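For example, a user would connect in one of these two ways (the remote invocation assumes the client host is allowed to reach the server; the hostname is illustrative):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Locally, over the unix socket (no password needed)&lt;br /&gt;
mysql -u someuser someuser&lt;br /&gt;
# Remotely, with the password&lt;br /&gt;
mysql -h caffeine -u someuser -p someuser&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;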
&lt;br /&gt;
For random passwords run &amp;lt;code&amp;gt;pwgen -s 20 1&amp;lt;/code&amp;gt;. For the administrative passwords see /users/sysadmin/passwords/mysql.&lt;br /&gt;
&lt;br /&gt;
Write a file (usually ~club/mysql), readable only by the club, to the club&#039;s homedir containing the following:&lt;br /&gt;
&lt;br /&gt;
 Username: clubuserid&lt;br /&gt;
 Password: longrandompassword&lt;br /&gt;
 Hostname: localhost&lt;br /&gt;
&lt;br /&gt;
Try not to send passwords via plaintext email.&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
See the history of this page for information on the previous replication setup.&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
&lt;br /&gt;
We use [https://mariadb.com/kb/en/mariabackup-overview/ mariabackup] to take periodic backups. It is currently installed and configured on both caffeine and coffee.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing mariabackup on coffee, and sending the backups to corn-syrup.&lt;br /&gt;
&lt;br /&gt;
First, install the mariadb-backup package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install mariadb-backup&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Next, create an SSH key pair for the mysql user:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir /var/mariadb&lt;br /&gt;
chown mysql:mysql /var/mariadb&lt;br /&gt;
su -s /bin/bash mysql&lt;br /&gt;
cd /var/mariadb&lt;br /&gt;
mkdir .ssh&lt;br /&gt;
chmod 700 .ssh&lt;br /&gt;
# Choose /var/mariadb/.ssh/id_ed25519 for the path&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the public key (/var/mariadb/.ssh/id_ed25519.pub) into /users/syscom/.ssh/authorized_keys on corn-syrup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... mysql@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Also create the folder &amp;lt;code&amp;gt;/users/syscom/backups/coffee/mariabackup&amp;lt;/code&amp;gt;. We will store the backups here.&lt;br /&gt;
&lt;br /&gt;
We will use a hacky bash script to try to emulate the same behaviour as pgBackRest. We will compress and stream each backup to a folder on corn-syrup in the format &amp;lt;code&amp;gt;1701678356-F&amp;lt;/code&amp;gt;, where the number is a Unix epoch timestamp and the letter at the end is one of F, D or I (for full, differential or incremental backups). Full backups do not depend on any other backups. Differential backups depend on the latest full backup before them. Incremental backups depend on the latest backup before them (of any type).&lt;br /&gt;
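For example, with made-up timestamps, a sequence of backups and their dependencies might look like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
1701678356-F   full  (depends on nothing)&lt;br /&gt;
1701700000-I   incr  (depends on 1701678356-F, the latest backup before it)&lt;br /&gt;
1701764756-D   diff  (depends on 1701678356-F, the latest full backup)&lt;br /&gt;
1701800000-I   incr  (depends on 1701764756-D, the latest backup before it)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;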
&lt;br /&gt;
On coffee, paste the following into e.g. /var/mariadb/bin/backup-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
RETENTION_FULL=2&lt;br /&gt;
RETENTION_DIFF=4&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
# $USER doesn&#039;t seem to be defined when we run this from cron&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 &amp;lt;full|diff|incr&amp;gt;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backup_type=$1&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = full ]; then&lt;br /&gt;
    backup_type_letter=F&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = diff ]; then&lt;br /&gt;
    backup_type_letter=D&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    backup_type_letter=I&lt;br /&gt;
else&lt;br /&gt;
    echo &amp;quot;Backup type must be one of &#039;full&#039;, &#039;diff&#039; or &#039;incr&#039;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if ! pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;MariaDB is not running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariabackup &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;mariabackup is already running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Delete temporary files left behind by previous run, if there are any&lt;br /&gt;
$SSH -- &amp;quot;rm -rf $SSH_FOLDER/*.tmp&amp;quot;&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
incremental_basedir_args=&lt;br /&gt;
old_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
new_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $old_checkpoint_dir $new_checkpoint_dir&amp;quot; EXIT&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = diff -o &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    # Find a backup which we can use as a base.&lt;br /&gt;
    # For incr, this can be any type; for diff, this must be a full backup.&lt;br /&gt;
    base_backup=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        backup=${backups[i]}&lt;br /&gt;
        if [ $backup_type = incr ] || [[ $backup =~ -F$ ]]; then&lt;br /&gt;
            base_backup=$backup&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$base_backup&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find base backup for $backup_type type&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
    # Copy the xtrabackup_checkpoints file from the base backup into a&lt;br /&gt;
    # temporary directory, and use it in the mariabackup command.&lt;br /&gt;
    scp $SSH_ARGS &amp;quot;$SSH_USER@$SSH_HOST:$SSH_FOLDER/$base_backup/xtrabackup_*&amp;quot; $old_checkpoint_dir/&lt;br /&gt;
    incremental_basedir_args=&amp;quot;--incremental-basedir=$old_checkpoint_dir&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
compress_level=6&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    # Use a lower compression level to go faster&lt;br /&gt;
    compress_level=5&lt;br /&gt;
fi&lt;br /&gt;
foldername=&amp;quot;$(date +%s)-$backup_type_letter&amp;quot;&lt;br /&gt;
# First copy to a temporary dir, then rename the temporary dir to the&lt;br /&gt;
# desired dir name (in case our process gets killed)&lt;br /&gt;
mariabackup --user=mysql --backup $incremental_basedir_args --stream=xbstream --extra-lsndir=$new_checkpoint_dir \&lt;br /&gt;
    | nice xz -$compress_level -T4 \&lt;br /&gt;
    | $SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; mkdir $foldername.tmp &amp;amp;&amp;amp; cat &amp;gt; $foldername.tmp/data.xb.xz&amp;quot;&lt;br /&gt;
scp $SSH_ARGS $new_checkpoint_dir/* $SSH_USER@$SSH_HOST:$SSH_FOLDER/$foldername.tmp/&lt;br /&gt;
$SSH -- &amp;quot;mv $SSH_FOLDER/$foldername.tmp $SSH_FOLDER/$foldername&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Delete old backups&lt;br /&gt;
if [ $backup_type = incr ]; then&lt;br /&gt;
    # We don&#039;t delete backups when making an incr backup, since we only&lt;br /&gt;
    # have retention limits for full and diff&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    retention=$RETENTION_FULL&lt;br /&gt;
else&lt;br /&gt;
    retention=$RETENTION_DIFF&lt;br /&gt;
fi&lt;br /&gt;
num_backups_of_same_type=1&lt;br /&gt;
backups_to_delete=()&lt;br /&gt;
for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
    backup=${backups[i]}&lt;br /&gt;
    if ! [[ $backup =~ -${backup_type_letter}$ ]]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    ((num_backups_of_same_type++))&lt;br /&gt;
    if [ $num_backups_of_same_type -lt $retention ]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    if [ $backup_type = full ]; then&lt;br /&gt;
        # Delete everything before the last full backup which we want to&lt;br /&gt;
        # keep&lt;br /&gt;
        pat=&#039;^&#039;&lt;br /&gt;
    else&lt;br /&gt;
        # Delete all the diff and incr backups before the last diff backup&lt;br /&gt;
        # which we want to keep&lt;br /&gt;
        pat=&#039;-[DI]$&#039;&lt;br /&gt;
    fi&lt;br /&gt;
    for ((j=$i-1; j&amp;gt;=0; j--)); do&lt;br /&gt;
        backup=${backups[j]}&lt;br /&gt;
        if [[ $backup =~ $pat ]]; then&lt;br /&gt;
            backups_to_delete+=($backup)&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    break&lt;br /&gt;
done&lt;br /&gt;
if [ ${#backups_to_delete[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups to delete&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
$SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; rm -r ${backups_to_delete[@]}&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The script should be invoked with exactly one argument which must be one of &amp;quot;full&amp;quot;, &amp;quot;diff&amp;quot; or &amp;quot;incr&amp;quot;.&lt;br /&gt;
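For example, to take a full backup manually:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su -s /bin/bash mysql&lt;br /&gt;
/var/mariadb/bin/backup-mariadb.sh full&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;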
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
We are going to use systemd timers because they are much nicer to use than cron. Install /usr/local/bin/csc-systemd-email and /etc/systemd/system/csc-email-on-failure@.service on the target machine so that we get emails for failed jobs (there should be a copy of this on caffeine).&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup@.service:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (%i)&lt;br /&gt;
Documentation=https://wiki.csclub.uwaterloo.ca/MySQL#Backups&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
User=mysql&lt;br /&gt;
ExecStart=/var/mariadb/bin/backup-mariadb.sh %i&lt;br /&gt;
&lt;br /&gt;
[Unit]&lt;br /&gt;
OnFailure=csc-email-on-failure@%n.service&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-full.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (full)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Full backup at 00:20 every Sunday and Wednesday&lt;br /&gt;
OnCalendar=Sun,Wed *-*-* 00:20:00&lt;br /&gt;
Unit=mariadb-backup@full.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-diff.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (diff)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Differential backup at 00:35 every day&lt;br /&gt;
OnCalendar=*-*-* 00:35:00&lt;br /&gt;
Unit=mariadb-backup@diff.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-incr.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (incr)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Incremental backup at the 50th minute of every hour&lt;br /&gt;
OnCalendar=*-*-* *:50:00&lt;br /&gt;
Unit=mariadb-backup@incr.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, enable and start the timers:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable --now mariadb-backup-full.timer&lt;br /&gt;
systemctl enable --now mariadb-backup-diff.timer&lt;br /&gt;
systemctl enable --now mariadb-backup-incr.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
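As with the Postgres timers, you can confirm that they are scheduled:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl list-timers &#039;mariadb-backup-*&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;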
&lt;br /&gt;
==== Restore ====&lt;br /&gt;
Paste the following into e.g. /var/mariadb/bin/restore-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
shopt -s dotglob&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -gt 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 [backup name, e.g. 1701678356-F]&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;Please stop MariaDB first&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
if [ ${#backups[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups found&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -eq 1 ]; then&lt;br /&gt;
    last_backup_idx=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        if [ ${backups[i]} = &amp;quot;$1&amp;quot; ]; then&lt;br /&gt;
            last_backup_idx=$i&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$last_backup_idx&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find $1 on remote&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
else&lt;br /&gt;
    last_backup_idx=$(( ${#backups[@]} - 1 ))&lt;br /&gt;
fi&lt;br /&gt;
last_full_backup_idx=&lt;br /&gt;
for ((i=$last_backup_idx; i&amp;gt;=0; i--)); do&lt;br /&gt;
    if [[ ${backups[i]} =~ -F$ ]]; then&lt;br /&gt;
        last_full_backup_idx=$i&lt;br /&gt;
        break&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ -z &amp;quot;$last_full_backup_idx&amp;quot; ]; then&lt;br /&gt;
    echo &amp;quot;Could not find full backup for ${backups[last_backup_idx]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backups_to_use=()&lt;br /&gt;
if [[ ${backups[last_backup_idx]} =~ -F$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a full backup, we only need that one backup&lt;br /&gt;
    backups_to_use=(${backups[last_backup_idx]})&lt;br /&gt;
elif [[ ${backups[last_backup_idx]} =~ -D$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a diff backup, we only need that one backup and the&lt;br /&gt;
    # first full backup before it&lt;br /&gt;
    backups_to_use=(${backups[last_full_backup_idx]} ${backups[last_backup_idx]})&lt;br /&gt;
else&lt;br /&gt;
    # If we&#039;re restoring an incr backup, we need all the backups from it to&lt;br /&gt;
    # the first diff backup before it, and the first full backup before that.&lt;br /&gt;
    # If there is no diff backup between it and the last full backup, then&lt;br /&gt;
    # we need everything between it and the last full backup.&lt;br /&gt;
    for ((i=$last_backup_idx; i&amp;gt;=$last_full_backup_idx; i--)); do&lt;br /&gt;
        backups_to_use=(${backups[i]} ${backups_to_use[@]})&lt;br /&gt;
        if [[ ${backups[i]} =~ -D$ ]]; then&lt;br /&gt;
            backups_to_use=(${backups[last_full_backup_idx]} ${backups_to_use[@]})&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
fi&lt;br /&gt;
base_dir=$(mktemp -d)&lt;br /&gt;
incr_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $base_dir $incr_dir&amp;quot; EXIT&lt;br /&gt;
for backup in ${backups_to_use[@]}; do&lt;br /&gt;
    if [[ $backup =~ -F$ ]]; then&lt;br /&gt;
        backup_dir=$base_dir&lt;br /&gt;
    else&lt;br /&gt;
        backup_dir=$incr_dir&lt;br /&gt;
    fi&lt;br /&gt;
    $SSH -- &amp;quot;cat $SSH_FOLDER/$backup/data.xb.xz&amp;quot; | xz -d | mbstream -x -C $backup_dir&lt;br /&gt;
    incremental_dir_args=&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        incremental_dir_args=&amp;quot;--incremental-dir=$incr_dir&amp;quot;&lt;br /&gt;
    fi&lt;br /&gt;
    mariabackup --prepare --target-dir=$base_dir $incremental_dir_args&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        rm -rf $incr_dir/*&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ &amp;quot;$(/bin/ls -1 /var/lib/mysql | wc -l)&amp;quot; -gt 0 ]; then&lt;br /&gt;
    read -p &amp;quot;Everything under /var/lib/mysql will be deleted. Continue (y/n)? &amp;quot; yn&lt;br /&gt;
    yn=${yn,,}  # convert to lower case&lt;br /&gt;
    if [ &amp;quot;$yn&amp;quot; = y -o &amp;quot;$yn&amp;quot; = yes ]; then&lt;br /&gt;
        rm -rf /var/lib/mysql/*&lt;br /&gt;
    else&lt;br /&gt;
        echo &amp;quot;Aborting.&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
fi&lt;br /&gt;
mariabackup --move-back --target-dir=$base_dir&lt;br /&gt;
echo &amp;quot;Restoration succeeded, please restart MariaDB&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Make sure to stop MariaDB before restoring a backup. If this script is invoked without any arguments, the latest backup found on corn-syrup will be used; a single argument may also be specified, which must be the name of one of the backup folders stored on corn-syrup.&lt;br /&gt;
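For example, a restore looks like this (the backup name in the second invocation is illustrative):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl stop mariadb&lt;br /&gt;
su -s /bin/bash mysql&lt;br /&gt;
# Restore the latest backup...&lt;br /&gt;
/var/mariadb/bin/restore-mariadb.sh&lt;br /&gt;
# ...or a specific one, by folder name&lt;br /&gt;
/var/mariadb/bin/restore-mariadb.sh 1701678356-F&lt;br /&gt;
exit&lt;br /&gt;
systemctl start mariadb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;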
&lt;br /&gt;
[[Category:Software]]&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5229</id>
		<title>MySQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5229"/>
		<updated>2024-03-16T08:50:58Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Cron */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
Note: the database on caffeine is actually MariaDB, not MySQL. Although they are mostly compatible, there are some incompatibilities to be aware of. See [https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/ MariaDB versus MySQL: Compatibility] for details.&lt;br /&gt;
&lt;br /&gt;
=== Creating databases ===&lt;br /&gt;
&lt;br /&gt;
Users can create their own MySQL databases through [[ceo]]. Users emailing syscom asking for a MySQL database should be directed to do so. The process is as follows:&lt;br /&gt;
&lt;br /&gt;
# SSH into any [[Machine_List|CSC machine]].&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;ceo&amp;lt;/tt&amp;gt;.&lt;br /&gt;
# Select &amp;quot;Create MySQL database&amp;quot; and follow the instructions.&lt;br /&gt;
# Login info will be stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory.&lt;br /&gt;
# You can now connect to the MySQL database (from [[Machine_List#caffeine|caffeine]] only).&lt;br /&gt;
&lt;br /&gt;
=== Deleting databases ===&lt;br /&gt;
&lt;br /&gt;
Users can delete their own MySQL databases. &lt;br /&gt;
&lt;br /&gt;
SSH into [[Machine_List#caffeine|caffeine]].&lt;br /&gt;
 mysql -u yourusernamehere -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
 DROP DATABASE yourdatabasenamehere;&lt;br /&gt;
The login info and database name were saved to &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory when the database was created.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually ===&lt;br /&gt;
To create a MySQL database manually on caffeine, first connect to the database as root:&lt;br /&gt;
&lt;br /&gt;
 $ mysql -uroot -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
&lt;br /&gt;
Then run the following SQL statements:&lt;br /&gt;
&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;%&#039; IDENTIFIED BY &#039;longrandompassword&#039;;&lt;br /&gt;
 CREATE DATABASE someuser;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;%&#039;;&lt;br /&gt;
&lt;br /&gt;
This will allow users to connect locally without a password, and connect remotely with a password.&lt;br /&gt;
&lt;br /&gt;
For random passwords run &amp;lt;code&amp;gt;pwgen -s 20 1&amp;lt;/code&amp;gt;. For the administrative passwords see /users/sysadmin/passwords/mysql.&lt;br /&gt;
&lt;br /&gt;
Write a file (usually ~club/mysql), readable only by the club, to the club&#039;s homedir containing the following:&lt;br /&gt;
&lt;br /&gt;
 Username: clubuserid&lt;br /&gt;
 Password: longrandompassword&lt;br /&gt;
 Hostname: localhost&lt;br /&gt;
&lt;br /&gt;
Try not to send passwords via plaintext email.&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
See the history of this page for information on the previous replication setup.&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
&lt;br /&gt;
We use [https://mariadb.com/kb/en/mariabackup-overview/ mariabackup] to take periodic backups. It is currently installed and configured on both caffeine and coffee.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing mariabackup on coffee, and sending the backups to corn-syrup.&lt;br /&gt;
&lt;br /&gt;
First, install the mariadb-backup package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install mariadb-backup&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Next, create an SSH key pair for the mysql user:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir /var/mariadb&lt;br /&gt;
chown mysql:mysql /var/mariadb&lt;br /&gt;
su -s /bin/bash mysql&lt;br /&gt;
cd /var/mariadb&lt;br /&gt;
mkdir .ssh&lt;br /&gt;
chmod 700 .ssh&lt;br /&gt;
# Choose /var/mariadb/.ssh/id_ed25519 for the path&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the public key (/var/mariadb/.ssh/id_ed25519.pub) into /users/syscom/.ssh/authorized_keys on corn-syrup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... mysql@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Also create the folder &amp;lt;code&amp;gt;/users/syscom/backups/coffee/mariabackup&amp;lt;/code&amp;gt;. We will store the backups here.&lt;br /&gt;
&lt;br /&gt;
We will use a hacky bash script to try to emulate the same behaviour as pgBackRest. We will compress and stream each backup to a folder on corn-syrup in the format &amp;lt;code&amp;gt;1701678356-F&amp;lt;/code&amp;gt;, where the number is a Unix epoch timestamp and the letter at the end is one of F, D or I (for full, differential or incremental backups). Full backups do not depend on any other backups. Differential backups depend on the latest full backup before them. Incremental backups depend on the latest backup before them (of any type).&lt;br /&gt;
&lt;br /&gt;
On coffee, paste the following into e.g. /var/mariadb/bin/backup-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
RETENTION_FULL=2&lt;br /&gt;
RETENTION_DIFF=4&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
# $USER doesn&#039;t seem to be defined when we run this from cron&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 &amp;lt;full|diff|incr&amp;gt;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backup_type=$1&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = full ]; then&lt;br /&gt;
    backup_type_letter=F&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = diff ]; then&lt;br /&gt;
    backup_type_letter=D&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    backup_type_letter=I&lt;br /&gt;
else&lt;br /&gt;
    echo &amp;quot;Backup type must be one of &#039;full&#039;, &#039;diff&#039; or &#039;incr&#039;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if ! pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;MariaDB is not running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariabackup &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;mariabackup is already running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Delete temporary files left behind by previous run, if there are any&lt;br /&gt;
$SSH -- &amp;quot;rm -rf $SSH_FOLDER/*.tmp&amp;quot;&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
incremental_basedir_args=&lt;br /&gt;
old_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
new_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $old_checkpoint_dir $new_checkpoint_dir&amp;quot; EXIT&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = diff -o &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    # Find a backup which we can use as a base.&lt;br /&gt;
    # For incr, this can be any type; for diff, this must be a full backup.&lt;br /&gt;
    base_backup=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        backup=${backups[i]}&lt;br /&gt;
        if [ $backup_type = incr ] || [[ $backup =~ -F$ ]]; then&lt;br /&gt;
            base_backup=$backup&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$base_backup&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find base backup for $backup_type type&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
    # Copy the xtrabackup_checkpoints file from the base backup into a&lt;br /&gt;
    # temporary directory, and use it in the mariabackup command.&lt;br /&gt;
    scp $SSH_ARGS &amp;quot;$SSH_USER@$SSH_HOST:$SSH_FOLDER/$base_backup/xtrabackup_*&amp;quot; $old_checkpoint_dir/&lt;br /&gt;
    incremental_basedir_args=&amp;quot;--incremental-basedir=$old_checkpoint_dir&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
compress_level=6&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    # Use a lower compression level to go faster&lt;br /&gt;
    compress_level=5&lt;br /&gt;
fi&lt;br /&gt;
foldername=&amp;quot;$(date +%s)-$backup_type_letter&amp;quot;&lt;br /&gt;
# First copy to a temporary dir, then rename the temporary dir to the&lt;br /&gt;
# desired dir name (in case our process gets killed)&lt;br /&gt;
mariabackup --user=mysql --backup $incremental_basedir_args --stream=xbstream --extra-lsndir=$new_checkpoint_dir \&lt;br /&gt;
    | nice xz -$compress_level -T4 \&lt;br /&gt;
    | $SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; mkdir $foldername.tmp &amp;amp;&amp;amp; cat &amp;gt; $foldername.tmp/data.xb.xz&amp;quot;&lt;br /&gt;
scp $SSH_ARGS $new_checkpoint_dir/* $SSH_USER@$SSH_HOST:$SSH_FOLDER/$foldername.tmp/&lt;br /&gt;
$SSH -- &amp;quot;mv $SSH_FOLDER/$foldername.tmp $SSH_FOLDER/$foldername&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Delete old backups&lt;br /&gt;
if [ $backup_type = incr ]; then&lt;br /&gt;
    # We don&#039;t delete backups when making an incr backup, since we only&lt;br /&gt;
    # have retention limits for full and diff&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    retention=$RETENTION_FULL&lt;br /&gt;
else&lt;br /&gt;
    retention=$RETENTION_DIFF&lt;br /&gt;
fi&lt;br /&gt;
num_backups_of_same_type=1&lt;br /&gt;
backups_to_delete=()&lt;br /&gt;
for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
    backup=${backups[i]}&lt;br /&gt;
    if ! [[ $backup =~ -${backup_type_letter}$ ]]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    ((num_backups_of_same_type++))&lt;br /&gt;
    if [ $num_backups_of_same_type -lt $retention ]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    if [ $backup_type = full ]; then&lt;br /&gt;
        # Delete everything before the last full backup which we want to&lt;br /&gt;
        # keep&lt;br /&gt;
        pat=&#039;^&#039;&lt;br /&gt;
    else&lt;br /&gt;
        # Delete all the diff and incr backups before the last diff backup&lt;br /&gt;
        # which we want to keep&lt;br /&gt;
        pat=&#039;-[DI]$&#039;&lt;br /&gt;
    fi&lt;br /&gt;
    for ((j=$i-1; j&amp;gt;=0; j--)); do&lt;br /&gt;
        backup=${backups[j]}&lt;br /&gt;
        if [[ $backup =~ $pat ]]; then&lt;br /&gt;
            backups_to_delete+=($backup)&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    break&lt;br /&gt;
done&lt;br /&gt;
if [ ${#backups_to_delete[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups to delete&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
$SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; rm -r ${backups_to_delete[@]}&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The script should be invoked with exactly one argument which must be one of &amp;quot;full&amp;quot;, &amp;quot;diff&amp;quot; or &amp;quot;incr&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
We are going to use systemd timers because they are much nicer to use than cron. Install /etc/systemd/system/csc-email-on-failure@.service on the target machine so that we get emails for failed jobs (there should be a copy of this on caffeine).&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup@.service:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (%i)&lt;br /&gt;
Documentation=https://wiki.csclub.uwaterloo.ca/MySQL#Backups&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
User=mysql&lt;br /&gt;
ExecStart=/var/mariadb/bin/backup-mariadb.sh %i&lt;br /&gt;
&lt;br /&gt;
[Unit]&lt;br /&gt;
OnFailure=csc-email-on-failure@%n.service&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-full.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (full)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Full backup at 00:20 every Sunday and Wednesday&lt;br /&gt;
OnCalendar=Sun,Wed *-*-* 00:20:00&lt;br /&gt;
Unit=mariadb-backup@full.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-diff.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (diff)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Differential backup at 00:35 every day&lt;br /&gt;
OnCalendar=*-*-* 00:35:00&lt;br /&gt;
Unit=mariadb-backup@diff.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-incr.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (incr)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Incremental backup at the 50th minute of every hour&lt;br /&gt;
OnCalendar=*-*-* *:50:00&lt;br /&gt;
Unit=mariadb-backup@incr.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, enable and start the timers:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable --now mariadb-backup-full.timer&lt;br /&gt;
systemctl enable --now mariadb-backup-diff.timer&lt;br /&gt;
systemctl enable --now mariadb-backup-incr.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Restore ====&lt;br /&gt;
Paste the following into e.g. /var/mariadb/bin/restore-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
shopt -s dotglob&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -gt 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 [backup name, e.g. 1701678356-F]&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;Please stop MariaDB first&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
if [ ${#backups[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups found&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -eq 1 ]; then&lt;br /&gt;
    last_backup_idx=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        if [ ${backups[i]} = &amp;quot;$1&amp;quot; ]; then&lt;br /&gt;
            last_backup_idx=$i&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$last_backup_idx&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find $1 on remote&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
else&lt;br /&gt;
    last_backup_idx=$(( ${#backups[@]} - 1 ))&lt;br /&gt;
fi&lt;br /&gt;
last_full_backup_idx=&lt;br /&gt;
for ((i=$last_backup_idx; i&amp;gt;=0; i--)); do&lt;br /&gt;
    if [[ ${backups[i]} =~ -F$ ]]; then&lt;br /&gt;
        last_full_backup_idx=$i&lt;br /&gt;
        break&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ -z &amp;quot;$last_full_backup_idx&amp;quot; ]; then&lt;br /&gt;
    echo &amp;quot;Could not find full backup for ${backups[last_backup_idx]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backups_to_use=()&lt;br /&gt;
if [[ ${backups[last_backup_idx]} =~ -F$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a full backup, we only need that one backup&lt;br /&gt;
    backups_to_use=(${backups[last_backup_idx]})&lt;br /&gt;
elif [[ ${backups[last_backup_idx]} =~ -D$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a diff backup, we only need that one backup and the&lt;br /&gt;
    # first full backup before it&lt;br /&gt;
    backups_to_use=(${backups[last_full_backup_idx]} ${backups[last_backup_idx]})&lt;br /&gt;
else&lt;br /&gt;
    # If we&#039;re restoring an incr backup, we need all the backups from it to&lt;br /&gt;
    # the first diff backup before it, and the first full backup before that.&lt;br /&gt;
    # If there is no diff backup between it and the last full backup, then&lt;br /&gt;
    # we need everything between it and the last full backup.&lt;br /&gt;
    for ((i=$last_backup_idx; i&amp;gt;=$last_full_backup_idx; i--)); do&lt;br /&gt;
        backups_to_use=(${backups[i]} ${backups_to_use[@]})&lt;br /&gt;
        if [[ ${backups[i]} =~ -D$ ]]; then&lt;br /&gt;
            backups_to_use=(${backups[last_full_backup_idx]} ${backups_to_use[@]})&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
fi&lt;br /&gt;
base_dir=$(mktemp -d)&lt;br /&gt;
incr_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $base_dir $incr_dir&amp;quot; EXIT&lt;br /&gt;
for backup in ${backups_to_use[@]}; do&lt;br /&gt;
    if [[ $backup =~ -F$ ]]; then&lt;br /&gt;
        backup_dir=$base_dir&lt;br /&gt;
    else&lt;br /&gt;
        backup_dir=$incr_dir&lt;br /&gt;
    fi&lt;br /&gt;
    $SSH -- &amp;quot;cat $SSH_FOLDER/$backup/data.xb.xz&amp;quot; | xz -d | mbstream -x -C $backup_dir&lt;br /&gt;
    incremental_dir_args=&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        incremental_dir_args=&amp;quot;--incremental-dir=$incr_dir&amp;quot;&lt;br /&gt;
    fi&lt;br /&gt;
    mariabackup --prepare --target-dir=$base_dir $incremental_dir_args&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        rm -rf $incr_dir/*&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ &amp;quot;$(/bin/ls -1 /var/lib/mysql | wc -l)&amp;quot; -gt 0 ]; then&lt;br /&gt;
    read -p &amp;quot;Everything under /var/lib/mysql will be deleted. Continue (y/n)? &amp;quot; yn&lt;br /&gt;
    yn=${yn,,}  # convert to lower case&lt;br /&gt;
    if [ &amp;quot;$yn&amp;quot; = y -o &amp;quot;$yn&amp;quot; = yes ]; then&lt;br /&gt;
        rm -rf /var/lib/mysql/*&lt;br /&gt;
    else&lt;br /&gt;
        echo &amp;quot;Aborting.&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
fi&lt;br /&gt;
mariabackup --move-back --target-dir=$base_dir&lt;br /&gt;
echo &amp;quot;Restoration succeeded, please restart MariaDB&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Make sure to stop MariaDB before restoring a backup. If this script is invoked without any arguments, the latest backup found on corn-syrup will be used; a single argument may also be specified, which must be the name of one of the backup folders stored on corn-syrup.&lt;br /&gt;
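A typical restore session might look like the following (the backup name below is just an example; omit it to use the latest backup):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl stop mariadb&lt;br /&gt;
su -s /bin/bash mysql -c &#039;/var/mariadb/bin/restore-mariadb.sh 1701678356-F&#039;&lt;br /&gt;
systemctl start mariadb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;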
&lt;br /&gt;
[[Category:Software]]&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5228</id>
		<title>MySQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5228"/>
		<updated>2024-03-16T08:48:15Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Cron */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
Note: the database on caffeine is actually MariaDB, not MySQL. Although they are mostly compatible, there are some incompatibilities to be aware of. See [https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/ MariaDB versus MySQL: Compatibility] for details.&lt;br /&gt;
&lt;br /&gt;
=== Creating databases ===&lt;br /&gt;
&lt;br /&gt;
Users can create their own MySQL databases through [[ceo]]. Users emailing syscom asking for a MySQL database should be directed to do so. The process is as follows:&lt;br /&gt;
&lt;br /&gt;
# SSH into any [[Machine_List|CSC machine]].&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;ceo&amp;lt;/tt&amp;gt;.&lt;br /&gt;
# Select &amp;quot;Create MySQL database&amp;quot; and follow the instructions.&lt;br /&gt;
# Login info will be stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory.&lt;br /&gt;
# You can now connect to the MySQL database (from [[Machine_List#caffeine|caffeine]] only).&lt;br /&gt;
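For example, on caffeine (the exact username and database name are listed in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cat ~/ceo-mysql-info&lt;br /&gt;
mysql -u yourusernamehere -p yourdatabasename&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;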
&lt;br /&gt;
=== Deleting databases ===&lt;br /&gt;
&lt;br /&gt;
Users can delete their own MySQL databases. &lt;br /&gt;
&lt;br /&gt;
SSH into [[Machine_List#caffeine|caffeine]].&lt;br /&gt;
 mysql -u yourusernamehere -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
 DROP DATABASE yourdatabasename;&lt;br /&gt;
The login info and database name were stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory when the database was created.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually ===&lt;br /&gt;
To create a MySQL database manually on caffeine, first connect to the database as root:&lt;br /&gt;
&lt;br /&gt;
 $ mysql -uroot -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
&lt;br /&gt;
Then run the following SQL statements:&lt;br /&gt;
&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;%&#039; IDENTIFIED BY &#039;longrandompassword&#039;;&lt;br /&gt;
 CREATE DATABASE someuser;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;%&#039;;&lt;br /&gt;
&lt;br /&gt;
This will allow users to connect locally without a password, and connect remotely with a password.&lt;br /&gt;
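As a quick sanity check of both modes (the remote hostname below is an example):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# As the someuser system account: unix socket auth, no password prompt&lt;br /&gt;
mysql -u someuser someuser&lt;br /&gt;
# From another host: prompts for the password&lt;br /&gt;
mysql -u someuser -p -h caffeine.csclub.uwaterloo.ca someuser&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;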
&lt;br /&gt;
For random passwords run &amp;lt;code&amp;gt;pwgen -s 20 1&amp;lt;/code&amp;gt;. For the administrative passwords see /users/sysadmin/passwords/mysql.&lt;br /&gt;
&lt;br /&gt;
Write a file (usually ~club/mysql) to the club&#039;s homedir readable only by them containing the following:&lt;br /&gt;
&lt;br /&gt;
 Username: clubuserid&lt;br /&gt;
 Password: longrandompassword&lt;br /&gt;
 Hostname: localhost&lt;br /&gt;
&lt;br /&gt;
Try not to send passwords via plaintext email.&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
See the history of this page for information on the previous replication setup.&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
&lt;br /&gt;
We use [https://mariadb.com/kb/en/mariabackup-overview/ mariabackup] to take periodic backups. It is currently installed and configured on both caffeine and coffee.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing mariabackup on coffee, and sending the backups to corn-syrup.&lt;br /&gt;
&lt;br /&gt;
First, install the mariadb-backup package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install mariadb-backup&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Next, create an SSH key pair for the mysql user:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir /var/mariadb&lt;br /&gt;
chown mysql:mysql /var/mariadb&lt;br /&gt;
su -s /bin/bash mysql&lt;br /&gt;
cd /var/mariadb&lt;br /&gt;
mkdir .ssh&lt;br /&gt;
chmod 700 .ssh&lt;br /&gt;
 # Choose /var/mariadb/.ssh/id_ed25519 for the path&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the public key (/var/mariadb/.ssh/id_ed25519.pub) into /users/syscom/.ssh/authorized_keys on corn-syrup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... mysql@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Also create the folder &amp;lt;code&amp;gt;/users/syscom/backups/coffee/mariabackup&amp;lt;/code&amp;gt;. We will store the backups here.&lt;br /&gt;
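On corn-syrup, that is:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir -p /users/syscom/backups/coffee/mariabackup&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;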
&lt;br /&gt;
We will use a hacky bash script to try to emulate the same behaviour as pgBackRest. We will compress and stream each backup to a folder on corn-syrup in the format &amp;lt;code&amp;gt;1701678356-F&amp;lt;/code&amp;gt;, where the number is a Unix epoch timestamp and the letter at the end is one of F, D or I (for full, differential or incremental backups). Full backups do not depend on any other backups. Differential backups depend on the latest full backup before them. Incremental backups depend on the latest backup before them (of any type).&lt;br /&gt;
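As a concrete (hypothetical) example, given the following folders on corn-syrup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
1701678356-F   # full; depends on nothing&lt;br /&gt;
1701681956-I   # incr; depends on 1701678356-F&lt;br /&gt;
1701721556-D   # diff; depends on 1701678356-F&lt;br /&gt;
1701725156-I   # incr; depends on 1701721556-D&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
restoring 1701725156-I requires 1701678356-F, then 1701721556-D, then 1701725156-I, in that order.&lt;br /&gt;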
&lt;br /&gt;
On coffee, paste the following into e.g. /var/mariadb/bin/backup-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
RETENTION_FULL=2&lt;br /&gt;
RETENTION_DIFF=4&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
# $USER doesn&#039;t seem to be defined when we run this from cron&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 &amp;lt;full|diff|incr&amp;gt;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backup_type=$1&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = full ]; then&lt;br /&gt;
    backup_type_letter=F&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = diff ]; then&lt;br /&gt;
    backup_type_letter=D&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    backup_type_letter=I&lt;br /&gt;
else&lt;br /&gt;
    echo &amp;quot;Backup type must be one of &#039;full&#039;, &#039;diff&#039; or &#039;incr&#039;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if ! pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;MariaDB is not running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariabackup &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;mariabackup is already running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Delete temporary files left behind by previous run, if there are any&lt;br /&gt;
$SSH -- &amp;quot;rm -rf $SSH_FOLDER/*.tmp&amp;quot;&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
incremental_basedir_args=&lt;br /&gt;
old_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
new_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $old_checkpoint_dir $new_checkpoint_dir&amp;quot; EXIT&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = diff -o &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    # Find a backup which we can use as a base.&lt;br /&gt;
    # For incr, this can be any type; for diff, this must be a full backup.&lt;br /&gt;
    base_backup=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        backup=${backups[i]}&lt;br /&gt;
        if [ $backup_type = incr ] || [[ $backup =~ -F$ ]]; then&lt;br /&gt;
            base_backup=$backup&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$base_backup&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find base backup for $backup_type type&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
    # Copy the xtrabackup_checkpoints file from the base backup into a&lt;br /&gt;
    # temporary directory, and use it in the mariabackup command.&lt;br /&gt;
    scp $SSH_ARGS &amp;quot;$SSH_USER@$SSH_HOST:$SSH_FOLDER/$base_backup/xtrabackup_*&amp;quot; $old_checkpoint_dir/&lt;br /&gt;
    incremental_basedir_args=&amp;quot;--incremental-basedir=$old_checkpoint_dir&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
compress_level=6&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    # Use a lower compression level to go faster&lt;br /&gt;
    compress_level=5&lt;br /&gt;
fi&lt;br /&gt;
foldername=&amp;quot;$(date +%s)-$backup_type_letter&amp;quot;&lt;br /&gt;
# First copy to a temporary dir, then rename the temporary dir to the&lt;br /&gt;
# desired dir name (in case our process gets killed)&lt;br /&gt;
mariabackup --user=mysql --backup $incremental_basedir_args --stream=xbstream --extra-lsndir=$new_checkpoint_dir \&lt;br /&gt;
    | nice xz -$compress_level -T4 \&lt;br /&gt;
    | $SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; mkdir $foldername.tmp &amp;amp;&amp;amp; cat &amp;gt; $foldername.tmp/data.xb.xz&amp;quot;&lt;br /&gt;
scp $SSH_ARGS $new_checkpoint_dir/* $SSH_USER@$SSH_HOST:$SSH_FOLDER/$foldername.tmp/&lt;br /&gt;
$SSH -- &amp;quot;mv $SSH_FOLDER/$foldername.tmp $SSH_FOLDER/$foldername&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Delete old backups&lt;br /&gt;
if [ $backup_type = incr ]; then&lt;br /&gt;
    # We don&#039;t delete backups when making an incr backup, since we only&lt;br /&gt;
    # have retention limits for full and diff&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    retention=$RETENTION_FULL&lt;br /&gt;
else&lt;br /&gt;
    retention=$RETENTION_DIFF&lt;br /&gt;
fi&lt;br /&gt;
num_backups_of_same_type=1&lt;br /&gt;
backups_to_delete=()&lt;br /&gt;
for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
    backup=${backups[i]}&lt;br /&gt;
    if ! [[ $backup =~ -${backup_type_letter}$ ]]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    ((num_backups_of_same_type++))&lt;br /&gt;
    if [ $num_backups_of_same_type -lt $retention ]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    if [ $backup_type = full ]; then&lt;br /&gt;
        # Delete everything before the last full backup which we want to&lt;br /&gt;
        # keep&lt;br /&gt;
        pat=&#039;^&#039;&lt;br /&gt;
    else&lt;br /&gt;
        # Delete all the diff and incr backups before the last diff backup&lt;br /&gt;
        # which we want to keep&lt;br /&gt;
        pat=&#039;-[DI]$&#039;&lt;br /&gt;
    fi&lt;br /&gt;
    for ((j=$i-1; j&amp;gt;=0; j--)); do&lt;br /&gt;
        backup=${backups[j]}&lt;br /&gt;
        if [[ $backup =~ $pat ]]; then&lt;br /&gt;
            backups_to_delete+=($backup)&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    break&lt;br /&gt;
done&lt;br /&gt;
if [ ${#backups_to_delete[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups to delete&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
$SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; rm -r ${backups_to_delete[@]}&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The script should be invoked with exactly one argument which must be one of &amp;quot;full&amp;quot;, &amp;quot;diff&amp;quot; or &amp;quot;incr&amp;quot;.&lt;br /&gt;
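Before relying on the schedule, it&#039;s worth running one full backup by hand to confirm that the SSH key and remote folder work:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
chmod +x /var/mariadb/bin/backup-mariadb.sh&lt;br /&gt;
su -s /bin/bash mysql -c &#039;/var/mariadb/bin/backup-mariadb.sh full&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;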
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
We are going to use systemd timers because they are much nicer to use than cron. Install /etc/systemd/system/csc-email-on-failure@.service on the target machine so that we get emails for failed jobs (there should be a copy of this unit on caffeine).&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup@.service:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (%i)&lt;br /&gt;
Documentation=https://wiki.csclub.uwaterloo.ca/MySQL#Backups&lt;br /&gt;
OnFailure=csc-email-on-failure@%n.service&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
User=mysql&lt;br /&gt;
ExecStart=/var/mariadb/bin/backup-mariadb.sh %i&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-full.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (full)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Full backup at 00:20 every Sunday and Wednesday&lt;br /&gt;
OnCalendar=Sun,Wed *-*-* 00:20:00&lt;br /&gt;
Unit=mariadb-backup@full.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-diff.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (diff)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Differential backup at 00:35 every day&lt;br /&gt;
OnCalendar=*-*-* 00:35:00&lt;br /&gt;
Unit=mariadb-backup@diff.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/systemd/system/mariadb-backup-incr.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=MariaDB backup (incr)&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
# Incremental backup at the 50th minute of every hour&lt;br /&gt;
OnCalendar=*-*-* *:50:00&lt;br /&gt;
Unit=mariadb-backup@incr.service&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, enable and start the timers:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable --now mariadb-backup-full.timer&lt;br /&gt;
systemctl enable --now mariadb-backup-diff.timer&lt;br /&gt;
systemctl enable --now mariadb-backup-incr.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Restore ====&lt;br /&gt;
Paste the following into e.g. /var/mariadb/bin/restore-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
shopt -s dotglob&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -gt 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 [0123456789-I]&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;Please stop MariaDB first&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
if [ ${#backups[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups found&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -eq 1 ]; then&lt;br /&gt;
    last_backup_idx=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        if [ ${backups[i]} = &amp;quot;$1&amp;quot; ]; then&lt;br /&gt;
            last_backup_idx=$i&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$last_backup_idx&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find $1 on remote&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
else&lt;br /&gt;
    last_backup_idx=$(( ${#backups[@]} - 1 ))&lt;br /&gt;
fi&lt;br /&gt;
last_full_backup_idx=&lt;br /&gt;
for ((i=$last_backup_idx; i&amp;gt;=0; i--)); do&lt;br /&gt;
    if [[ ${backups[i]} =~ -F$ ]]; then&lt;br /&gt;
        last_full_backup_idx=$i&lt;br /&gt;
        break&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ -z &amp;quot;$last_full_backup_idx&amp;quot; ]; then&lt;br /&gt;
    echo &amp;quot;Could not find full backup for ${backups[last_backup_idx]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backups_to_use=()&lt;br /&gt;
if [[ ${backups[last_backup_idx]} =~ -F$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a full backup, we only need that one backup&lt;br /&gt;
    backups_to_use=(${backups[last_backup_idx]})&lt;br /&gt;
elif [[ ${backups[last_backup_idx]} =~ -D$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a diff backup, we only need that one backup and the&lt;br /&gt;
    # first full backup before it&lt;br /&gt;
    backups_to_use=(${backups[last_full_backup_idx]} ${backups[last_backup_idx]})&lt;br /&gt;
else&lt;br /&gt;
    # If we&#039;re restoring an incr backup, we need all the backups from it to&lt;br /&gt;
    # the first diff backup before it, and the first full backup before that.&lt;br /&gt;
    # If there is no diff backup between it and the last full backup, then&lt;br /&gt;
    # we need everything between it and the last full backup.&lt;br /&gt;
    for ((i=$last_backup_idx; i&amp;gt;=$last_full_backup_idx; i--)); do&lt;br /&gt;
        backups_to_use=(${backups[i]} ${backups_to_use[@]})&lt;br /&gt;
        if [[ ${backups[i]} =~ -D$ ]]; then&lt;br /&gt;
            backups_to_use=(${backups[last_full_backup_idx]} ${backups_to_use[@]})&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
fi&lt;br /&gt;
base_dir=$(mktemp -d)&lt;br /&gt;
incr_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $base_dir $incr_dir&amp;quot; EXIT&lt;br /&gt;
for backup in ${backups_to_use[@]}; do&lt;br /&gt;
    if [[ $backup =~ -F$ ]]; then&lt;br /&gt;
        backup_dir=$base_dir&lt;br /&gt;
    else&lt;br /&gt;
        backup_dir=$incr_dir&lt;br /&gt;
    fi&lt;br /&gt;
    $SSH -- &amp;quot;cat $SSH_FOLDER/$backup/data.xb.xz&amp;quot; | xz -d | mbstream -x -C $backup_dir&lt;br /&gt;
    incremental_dir_args=&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        incremental_dir_args=&amp;quot;--incremental-dir=$incr_dir&amp;quot;&lt;br /&gt;
    fi&lt;br /&gt;
    mariabackup --prepare --target-dir=$base_dir $incremental_dir_args&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        rm -rf $incr_dir/*&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ &amp;quot;$(/bin/ls -1 /var/lib/mysql | wc -l)&amp;quot; -gt 0 ]; then&lt;br /&gt;
    read -p &amp;quot;Everything under /var/lib/mysql will be deleted. Continue (y/n)? &amp;quot; yn&lt;br /&gt;
    yn=${yn,,}  # convert to lower case&lt;br /&gt;
    if [ &amp;quot;$yn&amp;quot; = y -o &amp;quot;$yn&amp;quot; = yes ]; then&lt;br /&gt;
        rm -rf /var/lib/mysql/*&lt;br /&gt;
    else&lt;br /&gt;
        echo &amp;quot;Aborting.&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
fi&lt;br /&gt;
mariabackup --move-back --target-dir=$base_dir&lt;br /&gt;
echo &amp;quot;Restoration succeeded, please restart MariaDB&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Make sure to stop MariaDB before restoring a backup. If this script is invoked without any arguments, the latest backup found on corn-syrup will be used; a single argument may also be specified, which must be the name of one of the backup folders stored on corn-syrup.&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5227</id>
		<title>MySQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5227"/>
		<updated>2024-03-11T16:40:09Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Installation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
Note: the database on caffeine is actually MariaDB, not MySQL. Although they are mostly compatible, there are some incompatibilities to be aware of. See [https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/ MariaDB versus MySQL: Compatibility] for details.&lt;br /&gt;
&lt;br /&gt;
=== Creating databases ===&lt;br /&gt;
&lt;br /&gt;
Users can create their own MySQL databases through [[ceo]]. Users emailing syscom asking for a MySQL database should be directed to do so. The process is as follows:&lt;br /&gt;
&lt;br /&gt;
# SSH into any [[Machine_List|CSC machine]].&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;ceo&amp;lt;/tt&amp;gt;.&lt;br /&gt;
# Select &amp;quot;Create MySQL database&amp;quot; and follow the instructions.&lt;br /&gt;
# Login info will be stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory.&lt;br /&gt;
# You can now connect to the MySQL database (from [[Machine_List#caffeine|caffeine]] only).&lt;br /&gt;
&lt;br /&gt;
=== Deleting databases ===&lt;br /&gt;
&lt;br /&gt;
Users can delete their own MySQL databases. &lt;br /&gt;
&lt;br /&gt;
SSH into [[Machine_List#caffeine|caffeine]].&lt;br /&gt;
 mysql -u yourusernamehere -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
 DROP DATABASE yourdatabasename;&lt;br /&gt;
The login info and database name were stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory when the database was created.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually ===&lt;br /&gt;
To create a MySQL database manually on caffeine, first connect to the database as root:&lt;br /&gt;
&lt;br /&gt;
 $ mysql -uroot -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
&lt;br /&gt;
Then run the following SQL statements:&lt;br /&gt;
&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;%&#039; IDENTIFIED BY &#039;longrandompassword&#039;;&lt;br /&gt;
 CREATE DATABASE someuser;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;%&#039;;&lt;br /&gt;
&lt;br /&gt;
This will allow users to connect locally without a password, and connect remotely with a password.&lt;br /&gt;
&lt;br /&gt;
For random passwords run &amp;lt;code&amp;gt;pwgen -s 20 1&amp;lt;/code&amp;gt;. For the administrative passwords see /users/sysadmin/passwords/mysql.&lt;br /&gt;
&lt;br /&gt;
Write a file (usually ~club/mysql) to the club&#039;s homedir readable only by them containing the following:&lt;br /&gt;
&lt;br /&gt;
 Username: clubuserid&lt;br /&gt;
 Password: longrandompassword&lt;br /&gt;
 Hostname: localhost&lt;br /&gt;
&lt;br /&gt;
Try not to send passwords via plaintext email.&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
See the history of this page for information on the previous replication setup.&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
&lt;br /&gt;
We use [https://mariadb.com/kb/en/mariabackup-overview/ mariabackup] to take periodic backups. It is currently installed and configured on both caffeine and coffee.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing mariabackup on coffee, and sending the backups to corn-syrup.&lt;br /&gt;
&lt;br /&gt;
First, install the mariadb-backup package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install mariadb-backup&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Next, create an SSH key pair for the mysql user:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir /var/mariadb&lt;br /&gt;
chown mysql:mysql /var/mariadb&lt;br /&gt;
su -s /bin/bash mysql&lt;br /&gt;
cd /var/mariadb&lt;br /&gt;
mkdir .ssh&lt;br /&gt;
chmod 700 .ssh&lt;br /&gt;
 # Choose /var/mariadb/.ssh/id_ed25519 for the path&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the public key (/var/mariadb/.ssh/id_ed25519.pub) into /users/syscom/.ssh/authorized_keys on corn-syrup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... mysql@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Also create the folder &amp;lt;code&amp;gt;/users/syscom/backups/coffee/mariabackup&amp;lt;/code&amp;gt;. We will store the backups here.&lt;br /&gt;
&lt;br /&gt;
We will use a hacky bash script to try to emulate the same behaviour as pgBackRest. We will compress and stream each backup to a folder on corn-syrup in the format &amp;lt;code&amp;gt;1701678356-F&amp;lt;/code&amp;gt;, where the number is a Unix epoch timestamp and the letter at the end is one of F, D or I (for full, differential or incremental backups). Full backups do not depend on any other backups. Differential backups depend on the latest full backup before them. Incremental backups depend on the latest backup before them (of any type).&lt;br /&gt;
&lt;br /&gt;
On coffee, paste the following into e.g. /var/mariadb/bin/backup-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
RETENTION_FULL=2&lt;br /&gt;
RETENTION_DIFF=4&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
# $USER doesn&#039;t seem to be defined when we run this from cron&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 &amp;lt;full|diff|incr&amp;gt;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backup_type=$1&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = full ]; then&lt;br /&gt;
    backup_type_letter=F&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = diff ]; then&lt;br /&gt;
    backup_type_letter=D&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    backup_type_letter=I&lt;br /&gt;
else&lt;br /&gt;
    echo &amp;quot;Backup type must be one of &#039;full&#039;, &#039;diff&#039; or &#039;incr&#039;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if ! pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;MariaDB is not running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariabackup &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;mariabackup is already running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Delete temporary files left behind by previous run, if there are any&lt;br /&gt;
$SSH -- &amp;quot;rm -rf $SSH_FOLDER/*.tmp&amp;quot;&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
incremental_basedir_args=&lt;br /&gt;
old_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
new_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $old_checkpoint_dir $new_checkpoint_dir&amp;quot; EXIT&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = diff -o &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    # Find a backup which we can use as a base.&lt;br /&gt;
    # For incr, this can be any type; for diff, this must be a full backup.&lt;br /&gt;
    base_backup=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        backup=${backups[i]}&lt;br /&gt;
        if [ $backup_type = incr ] || [[ $backup =~ -F$ ]]; then&lt;br /&gt;
            base_backup=$backup&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$base_backup&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find base backup for $backup_type type&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
    # Copy the xtrabackup_checkpoints file from the base backup into a&lt;br /&gt;
    # temporary directory, and use it in the mariabackup command.&lt;br /&gt;
    scp $SSH_ARGS &amp;quot;$SSH_USER@$SSH_HOST:$SSH_FOLDER/$base_backup/xtrabackup_*&amp;quot; $old_checkpoint_dir/&lt;br /&gt;
    incremental_basedir_args=&amp;quot;--incremental-basedir=$old_checkpoint_dir&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
compress_level=6&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    # Use a lower compression level to go faster&lt;br /&gt;
    compress_level=5&lt;br /&gt;
fi&lt;br /&gt;
foldername=&amp;quot;$(date +%s)-$backup_type_letter&amp;quot;&lt;br /&gt;
# First copy to a temporary dir, then rename the temporary dir to the&lt;br /&gt;
# desired dir name (in case our process gets killed)&lt;br /&gt;
mariabackup --user=mysql --backup $incremental_basedir_args --stream=xbstream --extra-lsndir=$new_checkpoint_dir \&lt;br /&gt;
    | nice xz -$compress_level -T4 \&lt;br /&gt;
    | $SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; mkdir $foldername.tmp &amp;amp;&amp;amp; cat &amp;gt; $foldername.tmp/data.xb.xz&amp;quot;&lt;br /&gt;
scp $SSH_ARGS $new_checkpoint_dir/* $SSH_USER@$SSH_HOST:$SSH_FOLDER/$foldername.tmp/&lt;br /&gt;
$SSH -- &amp;quot;mv $SSH_FOLDER/$foldername.tmp $SSH_FOLDER/$foldername&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Delete old backups&lt;br /&gt;
if [ $backup_type = incr ]; then&lt;br /&gt;
    # We don&#039;t delete backups when making an incr backup, since we only&lt;br /&gt;
    # have retention limits for full and diff&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    retention=$RETENTION_FULL&lt;br /&gt;
else&lt;br /&gt;
    retention=$RETENTION_DIFF&lt;br /&gt;
fi&lt;br /&gt;
num_backups_of_same_type=1&lt;br /&gt;
backups_to_delete=()&lt;br /&gt;
for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
    backup=${backups[i]}&lt;br /&gt;
    if ! [[ $backup =~ -${backup_type_letter}$ ]]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    ((num_backups_of_same_type++))&lt;br /&gt;
    if [ $num_backups_of_same_type -lt $retention ]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    if [ $backup_type = full ]; then&lt;br /&gt;
        # Delete everything before the last full backup which we want to&lt;br /&gt;
        # keep&lt;br /&gt;
        pat=&#039;^&#039;&lt;br /&gt;
    else&lt;br /&gt;
        # Delete all the diff and incr backups before the last diff backup&lt;br /&gt;
        # which we want to keep&lt;br /&gt;
        pat=&#039;-[DI]$&#039;&lt;br /&gt;
    fi&lt;br /&gt;
    for ((j=$i-1; j&amp;gt;=0; j--)); do&lt;br /&gt;
        backup=${backups[j]}&lt;br /&gt;
        if [[ $backup =~ $pat ]]; then&lt;br /&gt;
            backups_to_delete+=($backup)&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    break&lt;br /&gt;
done&lt;br /&gt;
if [ ${#backups_to_delete[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups to delete&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
$SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; rm -r ${backups_to_delete[@]}&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The script should be invoked with exactly one argument which must be one of &amp;quot;full&amp;quot;, &amp;quot;diff&amp;quot; or &amp;quot;incr&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
Paste something like the following into e.g. /etc/cron.d/mariadb_backup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
MAILTO=root@csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
# Full backup at 00:20 every Sunday and Wednesday&lt;br /&gt;
20 0 * * 0,3 mysql chronic /var/mariadb/bin/backup-mariadb.sh full&lt;br /&gt;
# Differential backup at 00:35 every day&lt;br /&gt;
35 0 * * * mysql chronic /var/mariadb/bin/backup-mariadb.sh diff&lt;br /&gt;
# Incremental backup at the 50th minute of every hour&lt;br /&gt;
50 * * * * mysql chronic /var/mariadb/bin/backup-mariadb.sh incr&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
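Note that &amp;lt;code&amp;gt;chronic&amp;lt;/code&amp;gt;, which suppresses output unless the command fails (so cron only emails on errors), comes from the moreutils package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install moreutils&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;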
&lt;br /&gt;
==== Restore ====&lt;br /&gt;
Paste the following into e.g. /var/mariadb/bin/restore-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
shopt -s dotglob&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -gt 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 [0123456789-I]&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;Please stop MariaDB first&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
if [ ${#backups[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups found&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -eq 1 ]; then&lt;br /&gt;
    last_backup_idx=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        if [ ${backups[i]} = &amp;quot;$1&amp;quot; ]; then&lt;br /&gt;
            last_backup_idx=$i&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$last_backup_idx&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find $1 on remote&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
else&lt;br /&gt;
    last_backup_idx=$(( ${#backups[@]} - 1 ))&lt;br /&gt;
fi&lt;br /&gt;
last_full_backup_idx=&lt;br /&gt;
for ((i=$last_backup_idx; i&amp;gt;=0; i--)); do&lt;br /&gt;
    if [[ ${backups[i]} =~ -F$ ]]; then&lt;br /&gt;
        last_full_backup_idx=$i&lt;br /&gt;
        break&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ -z &amp;quot;$last_full_backup_idx&amp;quot; ]; then&lt;br /&gt;
    echo &amp;quot;Could not find full backup for ${backups[last_backup_idx]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backups_to_use=()&lt;br /&gt;
if [[ ${backups[last_backup_idx]} =~ -F$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a full backup, we only need that one backup&lt;br /&gt;
    backups_to_use=(${backups[last_backup_idx]})&lt;br /&gt;
elif [[ ${backups[last_backup_idx]} =~ -D$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a diff backup, we only need that one backup and the&lt;br /&gt;
    # first full backup before it&lt;br /&gt;
    backups_to_use=(${backups[last_full_backup_idx]} ${backups[last_backup_idx]})&lt;br /&gt;
else&lt;br /&gt;
    # If we&#039;re restoring an incr backup, we need all the backups from it to&lt;br /&gt;
    # the first diff backup before it, and the first full backup before that.&lt;br /&gt;
    # If there is no diff backup between it and the last full backup, then&lt;br /&gt;
    # we need everything between it and the last full backup.&lt;br /&gt;
    for ((i=$last_backup_idx; i&amp;gt;=$last_full_backup_idx; i--)); do&lt;br /&gt;
        backups_to_use=(${backups[i]} ${backups_to_use[@]})&lt;br /&gt;
        if [[ ${backups[i]} =~ -D$ ]]; then&lt;br /&gt;
            backups_to_use=(${backups[last_full_backup_idx]} ${backups_to_use[@]})&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
fi&lt;br /&gt;
base_dir=$(mktemp -d)&lt;br /&gt;
incr_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $base_dir $incr_dir&amp;quot; EXIT&lt;br /&gt;
for backup in ${backups_to_use[@]}; do&lt;br /&gt;
    if [[ $backup =~ -F$ ]]; then&lt;br /&gt;
        backup_dir=$base_dir&lt;br /&gt;
    else&lt;br /&gt;
        backup_dir=$incr_dir&lt;br /&gt;
    fi&lt;br /&gt;
    $SSH -- &amp;quot;cat $SSH_FOLDER/$backup/data.xb.xz&amp;quot; | xz -d | mbstream -x -C $backup_dir&lt;br /&gt;
    incremental_dir_args=&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        incremental_dir_args=&amp;quot;--incremental-dir=$incr_dir&amp;quot;&lt;br /&gt;
    fi&lt;br /&gt;
    mariabackup --prepare --target-dir=$base_dir $incremental_dir_args&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        rm -rf $incr_dir/*&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ &amp;quot;$(/bin/ls -1 /var/lib/mysql | wc -l)&amp;quot; -gt 0 ]; then&lt;br /&gt;
    read -p &amp;quot;Everything under /var/lib/mysql will be deleted. Continue (y/n)? &amp;quot; yn&lt;br /&gt;
    yn=${yn,,}  # convert to lower case&lt;br /&gt;
    if [ &amp;quot;$yn&amp;quot; = y -o &amp;quot;$yn&amp;quot; = yes ]; then&lt;br /&gt;
        rm -rf /var/lib/mysql/*&lt;br /&gt;
    else&lt;br /&gt;
        echo &amp;quot;Aborting.&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
fi&lt;br /&gt;
mariabackup --move-back --target-dir=$base_dir&lt;br /&gt;
echo &amp;quot;Restoration succeeded, please restart MariaDB&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Make sure to stop MariaDB before restoring a backup. If this script is invoked without any arguments, the latest backup found on corn-syrup will be used; a single argument may also be specified, which must be the name of one of the backup folders stored on corn-syrup.&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Ceph&amp;diff=5211</id>
		<title>Ceph</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Ceph&amp;diff=5211"/>
		<updated>2024-02-10T11:03:27Z</updated>

		<summary type="html">&lt;p&gt;Merenber: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;We are running a three-node [https://ceph.io Ceph] cluster on riboflavin, ginkgo and biloba for the purpose of cloud storage. Most Ceph services are running on riboflavin or ginkgo; biloba is just providing a tiny bit of extra storage space.&lt;br /&gt;
&lt;br /&gt;
Official documentation: https://docs.ceph.com/en/latest/&lt;br /&gt;
&lt;br /&gt;
At the time this page was written, the latest version of Ceph was &#039;Pacific&#039;; check the website above to see what the latest version is.&lt;br /&gt;
&lt;br /&gt;
== Bootstrap ==&lt;br /&gt;
The instructions below were adapted from https://docs.ceph.com/en/pacific/cephadm/install/.&lt;br /&gt;
&lt;br /&gt;
riboflavin was used as the bootstrap host, since it has the most storage.&lt;br /&gt;
&lt;br /&gt;
Add the following to /etc/apt/sources.list.d/ceph.list:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
deb http://mirror.csclub.uwaterloo.ca/ceph/debian-pacific/ bullseye main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Download the Ceph release key for the Debian packages:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget -O /etc/apt/trusted.gpg.d/ceph.release.gpg https://download.ceph.com/keys/release.gpg&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt update&lt;br /&gt;
apt install cephadm podman&lt;br /&gt;
cephadm bootstrap --mon-ip 172.19.168.25&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For the rest of the instructions below, the &amp;lt;code&amp;gt;ceph&amp;lt;/code&amp;gt; command can be run inside a Podman container by running &amp;lt;code&amp;gt;cephadm shell&amp;lt;/code&amp;gt;. Alternatively, you can install the &amp;lt;code&amp;gt;ceph-common&amp;lt;/code&amp;gt; package to run &amp;lt;code&amp;gt;ceph&amp;lt;/code&amp;gt; directly on the host.&lt;br /&gt;
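For example, either of the following should show cluster status:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# One-off command inside the container&lt;br /&gt;
cephadm shell -- ceph status&lt;br /&gt;
# Or directly on the host&lt;br /&gt;
apt install ceph-common&lt;br /&gt;
ceph status&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;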
&lt;br /&gt;
Add the disks for riboflavin:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph orch daemon add osd riboflavin:/dev/sdb&lt;br /&gt;
ceph orch daemon add osd riboflavin:/dev/sdc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;b&amp;gt;Note&amp;lt;/b&amp;gt;: Unfortunately Ceph didn&#039;t like it when I used one of the /dev/disk/by-id paths, so I had to use the /dev/sdX paths instead. I&#039;m not sure what&#039;ll happen if the device names change at boot. Let&#039;s just cross our fingers and pray.&lt;br /&gt;
&lt;br /&gt;
Add more hosts:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph orch host add ginkgo 172.19.168.22 --labels _admin&lt;br /&gt;
ceph orch host add biloba 172.19.168.23&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Add each available disk on each of the additional hosts.&lt;br /&gt;
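For example (the device paths below are placeholders; check &amp;lt;code&amp;gt;lsblk&amp;lt;/code&amp;gt; on each host first):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph orch daemon add osd ginkgo:/dev/sdb&lt;br /&gt;
ceph orch daemon add osd biloba:/dev/sdb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;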
&lt;br /&gt;
Disable unnecessary services:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph orch rm alertmanager&lt;br /&gt;
ceph orch rm grafana&lt;br /&gt;
ceph orch rm node-exporter&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Set the autoscale profile to scale-up instead of scale-down:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph osd pool set autoscale-profile scale-up&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Set the default pool replication factor to 2 instead of 3:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph config set global osd_pool_default_size 2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Deploy the Managers and Monitors on riboflavin and ginkgo only:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph orch apply mon --placement &#039;2 riboflavin ginkgo&#039;&lt;br /&gt;
ceph orch apply mgr --placement &#039;2 riboflavin ginkgo&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CloudStack Primary Storage ==&lt;br /&gt;
We are using RBD (RADOS Block Device) for CloudStack primary storage. The instructions below were adapted from https://docs.ceph.com/en/pacific/rbd/rbd-cloudstack/.&lt;br /&gt;
&lt;br /&gt;
Create and initialize a pool:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph osd pool create cloudstack&lt;br /&gt;
rbd pool init cloudstack&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Create a user for CloudStack:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph auth get-or-create client.cloudstack mon &#039;profile rbd&#039; osd &#039;profile rbd pool=cloudstack&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Make a backup of this key. There is currently a copy in /etc/ceph/ceph.client.cloudstack.keyring on biloba. If you want to use the &amp;lt;code&amp;gt;ceph&amp;lt;/code&amp;gt; command with this set of credentials, use the &amp;lt;code&amp;gt;-n&amp;lt;/code&amp;gt; flag, e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph -n client.cloudstack status&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== RBD commands ===&lt;br /&gt;
Here are some RBD commands which might be useful:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
List images (i.e. block devices) in the cloudstack pool:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
rbd ls -p cloudstack&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
View snapshots for an image:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
rbd snap ls cloudstack/265dc008-4db5-11ec-b585-32ee6075b19b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Unprotect a snapshot:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
rbd snap unprotect cloudstack/265dc008-4db5-11ec-b585-32ee6075b19b@cloudstack-base-snap&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Purge all snapshots for an image (after unprotecting them):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
rbd snap purge cloudstack/265dc008-4db5-11ec-b585-32ee6075b19b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Delete an image:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
rbd rm cloudstack/265dc008-4db5-11ec-b585-32ee6075b19b&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
A quick &#039;n dirty script to delete all images in the pool:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
rbd ls -p cloudstack | while read image; do rbd snap unprotect cloudstack/$image@cloudstack-base-snap; done&lt;br /&gt;
rbd ls -p cloudstack | while read image; do rbd snap purge cloudstack/$image; done&lt;br /&gt;
rbd ls -p cloudstack | while read image; do rbd rm cloudstack/$image; done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CloudStack Secondary Storage ==&lt;br /&gt;
We are using NFS (v4) for CloudStack secondary storage. The steps below were adapted from:&lt;br /&gt;
&lt;br /&gt;
* https://docs.ceph.com/en/pacific/cephfs/&lt;br /&gt;
* https://docs.ceph.com/en/pacific/cephadm/nfs/&lt;br /&gt;
* https://docs.ceph.com/en/pacific/mgr/nfs/#mgr-nfs&lt;br /&gt;
&lt;br /&gt;
Create a new CephFS filesystem:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph fs volume create cloudstack-secondary&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Enable the NFS module:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph mgr module enable nfs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Create a cluster placed on two hosts:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph nfs cluster create cloudstack-nfs --placement &#039;2 riboflavin ginkgo&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
View cluster info:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph nfs cluster ls&lt;br /&gt;
ceph nfs cluster info cloudstack-nfs&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now create a CephFS export:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph nfs export create cephfs cloudstack-secondary cloudstack-nfs /cloudstack-secondary /&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
View export info:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph nfs export ls cloudstack-nfs&lt;br /&gt;
ceph nfs export get cloudstack-nfs /cloudstack-secondary&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now on the clients, we can just mount the NFS export normally:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir /mnt/cloudstack-secondary&lt;br /&gt;
mount -t nfs4 -o port=2049 ceph-nfs.cloud.csclub.uwaterloo.ca:/cloudstack-secondary /mnt/cloudstack-secondary&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
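To make the mount persistent across reboots, an /etc/fstab entry along these lines should work:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph-nfs.cloud.csclub.uwaterloo.ca:/cloudstack-secondary /mnt/cloudstack-secondary nfs4 port=2049 0 0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;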
&lt;br /&gt;
=== Security ===&lt;br /&gt;
The NFS module in Ceph is just [https://github.com/nfs-ganesha/nfs-ganesha NFS-Ganesha], which does theoretically support ACLs, but I wasn&#039;t able to get it to work. I kept on getting some weird Python error. So we&#039;re going to use our iptables-fu instead (on riboflavin and ginkgo; make sure iptables-persistent is installed):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
iptables -N CEPH-NFS&lt;br /&gt;
iptables -A INPUT -j CEPH-NFS&lt;br /&gt;
iptables -A CEPH-NFS -s 172.19.168.0/27 -j RETURN&lt;br /&gt;
iptables -A CEPH-NFS -p tcp --dport 2049 -j REJECT&lt;br /&gt;
iptables -A CEPH-NFS -p udp --dport 2049 -j REJECT&lt;br /&gt;
iptables-save &amp;gt; /etc/iptables/rules.v4&lt;br /&gt;
&lt;br /&gt;
ip6tables -N CEPH-NFS&lt;br /&gt;
ip6tables -A INPUT -j CEPH-NFS&lt;br /&gt;
ip6tables -A CEPH-NFS -s fd74:6b6a:8eca:4902::/64 -j RETURN&lt;br /&gt;
ip6tables -A CEPH-NFS -p tcp --dport 2049 -j REJECT&lt;br /&gt;
ip6tables -A CEPH-NFS -p udp --dport 2049 -j REJECT&lt;br /&gt;
ip6tables-save &amp;gt; /etc/iptables/rules.v6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Dashboard ==&lt;br /&gt;
There is a web dashboard for Ceph running on riboflavin which is useful to get a holistic view of the system. You will need to do a port-forward over SSH:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L 8443:172.19.168.25:8443 riboflavin&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now if you visit https://localhost:8443 (ignore the HTTPS warning), you can login to the dashboard. Credentials are stored in the usual place.&lt;br /&gt;
&lt;br /&gt;
== Adding a new disk ==&lt;br /&gt;
Let&#039;s say we added a new disk /dev/sdg to ginkgo. Log in to one of the Ceph management servers (riboflavin or ginkgo), then run&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# clear any metadata at the start of the disk&lt;br /&gt;
dd if=/dev/zero of=/dev/sdg bs=1M count=10 conv=fsync&lt;br /&gt;
# Run this from inside `cephadm shell`&lt;br /&gt;
ceph orch daemon add osd ginkgo:/dev/sdg&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
And that&#039;s it! You can run &amp;lt;code&amp;gt;ceph status&amp;lt;/code&amp;gt; to see the progress of the PGs getting rebalanced.&lt;br /&gt;
&lt;br /&gt;
== Recovering from a disk failure ==&lt;br /&gt;
Check which placement group(s) failed:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Run this from `cephadm shell`&lt;br /&gt;
ceph health detail&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The output will look something like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent&lt;br /&gt;
[ERR] OSD_SCRUB_ERRORS: 1 scrub errors&lt;br /&gt;
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent&lt;br /&gt;
    pg 2.5 is active+clean+inconsistent, acting [3,0]&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This means that placement group 2.5 failed and is in OSDs 3 and 0. Since our cluster has a replication factor of 2, one of those OSDs will be on the machine with the failed disk, and the other OSD will be on a healthy machine. Run this to see which machines have which OSDs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph osd tree&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Repairing the placement group ===&lt;br /&gt;
If the disk failure might have been intermittent, try and see if we can repair the PG first. See https://docs.ceph.com/en/pacific/rados/operations/pg-repair/ for details.&lt;br /&gt;
&lt;br /&gt;
=== Removing or replacing a disk ===&lt;br /&gt;
First, find the OSD corresponding to the failed disk:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Run this from `cephadm shell`&lt;br /&gt;
ceph-volume lvm list&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Read these pages:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;https://docs.ceph.com/en/pacific/rados/operations/add-or-rm-osds&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/operations_guide/handling-a-disk-failure&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here&#039;s the TLDR (assuming OSD 3 has the disk which failed):&lt;br /&gt;
&lt;br /&gt;
First, take the OSD out of the cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph osd out osd.3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Wait until the data is backfilled to the other OSDs (this could take a long time):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph status&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Remove the OSD daemon, then purge the OSD completely:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph orch daemon rm osd.3 --force&lt;br /&gt;
ceph osd purge osd.3 --yes-i-really-mean-it&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Destroy the LVM logical volume and volume group:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph-volume lvm zap --destroy --osd-id 3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
At this point, the hard drive can be removed.&lt;br /&gt;
&lt;br /&gt;
After the drive has been replaced, zap it and add it to the cluster normally:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dd if=/dev/zero of=/dev/sde bs=1M count=10 conv=fsync&lt;br /&gt;
ceph orch daemon add osd ginkgo:/dev/sde&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Reducing log verbosity ==&lt;br /&gt;
By default, debug messages are enabled and written to stderr (so they end up in the journald logs, because the daemons run in Podman). Unfortunately there is a [https://tracker.ceph.com/issues/49161 bug in Ceph] which seems to always cause debug messages to be enabled on stderr specifically. So we will log to syslog instead (which is just systemd-journald on Debian).&lt;br /&gt;
&lt;br /&gt;
Run &amp;lt;code&amp;gt;cephadm shell&amp;lt;/code&amp;gt; on riboflavin, then run the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph config set global mon_cluster_log_file_level info&lt;br /&gt;
ceph config set global log_to_stderr false&lt;br /&gt;
ceph config set global mon_cluster_log_to_stderr false&lt;br /&gt;
ceph config set global mon_cluster_log_to_syslog true&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
These settings should take effect on all of the Ceph hosts immediately. See https://docs.ceph.com/en/pacific/rados/configuration/ceph-conf/ for reference.&lt;br /&gt;
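To verify that the new values are active, you can list the non-default settings and confirm they are present:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Run this from `cephadm shell`&lt;br /&gt;
ceph config dump | grep -E &#039;log_to_(stderr|syslog)&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;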
&lt;br /&gt;
== Miscellaneous commands ==&lt;br /&gt;
Here are some commands which may be useful. See the [https://docs.ceph.com/en/latest/man/8/ceph/ man page] for a full reference.&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Show devices:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ceph orch device ls&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note: this doesn&#039;t actually show all of the individual disks. I think it might have to do with the hardware RAID controllers.&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Show OSDs (Object Storage Daemons) on the current host (this needs to be run from &amp;lt;code&amp;gt;cephadm shell&amp;lt;/code&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;ceph-volume lvm list&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Show services:&lt;br /&gt;
&amp;lt;pre&amp;gt;ceph orch ls&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Show daemons of those services:&lt;br /&gt;
&amp;lt;pre&amp;gt;ceph orch ps&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Show non-default config settings:&lt;br /&gt;
&amp;lt;pre&amp;gt;ceph config dump&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Show pools:&lt;br /&gt;
&amp;lt;pre&amp;gt;ceph osd pool ls detail&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
List users:&lt;br /&gt;
&amp;lt;pre&amp;gt;ceph auth ls&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5187</id>
		<title>IPMI101</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5187"/>
		<updated>2023-12-22T19:51:57Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Supermicro */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Guide to IPMI (IPMI 101) =&lt;br /&gt;
&lt;br /&gt;
IPMI is a necessary evil. Let’s learn to make the best of it.&lt;br /&gt;
&lt;br /&gt;
== Setting up IPMI ==&lt;br /&gt;
&lt;br /&gt;
# Install ipmitool&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# apt-get install ipmitool&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Load IPMI modules (they are included in most upstream kernels)&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You may also need a kernel module specific to your motherboard’s manufacturer, as some BMC/LOMs do not conform to the IPMI spec and thus need a translation layer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# modprobe ipmi_devintf&lt;br /&gt;
# modprobe ipmi_si&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Locally connect to the &amp;lt;code&amp;gt;/dev/ipmi&amp;lt;/code&amp;gt; interface&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; help&lt;br /&gt;
&amp;amp;gt; mc info&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Securing IPMI ==&lt;br /&gt;
&lt;br /&gt;
Note that root on the machine is root on the BMC and vice versa.&lt;br /&gt;
&lt;br /&gt;
# User administration&lt;br /&gt;
&lt;br /&gt;
(Re)set the password, rename the admin account to root, and delete any extra users, as they can have surprising privileges. You may have to use the BMC’s web interface to delete accounts.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; user list 1&lt;br /&gt;
ID Name ...&lt;br /&gt;
2  ADMIN ...&lt;br /&gt;
&amp;amp;gt; user set password 2&lt;br /&gt;
User id 2: *******&lt;br /&gt;
User id 2: *******&lt;br /&gt;
&amp;amp;gt; user set username 2 root&lt;br /&gt;
&amp;amp;gt; user disable $other_user_ids&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Disable NULL password and cipher suite 0&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that $channel is usually 0 but can range from 0 to 10; there can also be multiple NICs, and so multiple channels to fix.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel auth ADMIN MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth CALLBACK MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth USER MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth OPERATOR MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel cipher_privs XXXaXXXXXXXXXXX&lt;br /&gt;
&amp;amp;gt; lan print $channel&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Configuring networking ==&lt;br /&gt;
&lt;br /&gt;
Note once again that there are sometimes multiple channels. To find the correct channel, it is helpful to use trial and error and/or an ARP scanner to find the correct MAC address. Usually the channel is 0, but I have seen 1, 8 and 17, especially when there are multiple NICs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel ipsrc static&lt;br /&gt;
&amp;amp;gt; lan set $channel ipaddr 10.15.134.?&lt;br /&gt;
&amp;amp;gt; lan set $channel defgw ipaddr 10.15.134.1&lt;br /&gt;
&amp;amp;gt; lan set $channel netmask 255.255.255.0&lt;br /&gt;
// if you have vlan tagging enabled on the switch port, useful for a shared NIC&lt;br /&gt;
&amp;amp;gt; lan set $channel vlan id 520&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Configuring Serial over LAN ==&lt;br /&gt;
&lt;br /&gt;
To enable Serial over LAN, you need to ensure that it is enabled in your BIOS or EFI setup utility, and note the baud rate; 115200 is used as an example below. Note that GRUB is the only boot loader that takes input via serial properly, in my experience. Syslinux failed horribly on corn-syrup.&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/default/grub.d/99-csclub.cfg:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
GRUB_CMDLINE_LINUX=&amp;amp;quot;console=tty1 console=ttyS1,115200n8&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_INPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_OUTPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_SERIAL_COMMAND=&amp;amp;quot;serial --speed=115200 --unit=1 --word=8 --parity=no --stop=1&amp;amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and then run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;// on debian based distros&lt;br /&gt;
// Yay, Debian magic :\&lt;br /&gt;
# update-grub&lt;br /&gt;
// on upstream packages (Arch, Fedora, etc.)&lt;br /&gt;
# grub-mkconfig -o /boot/grub/grub.cfg&lt;br /&gt;
&lt;br /&gt;
# reboot&amp;lt;/pre&amp;gt;&lt;br /&gt;
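Once the machine has rebooted, you should be able to attach to the serial console over the network. A minimal sketch ($bmc_host stands in for the BMC&#039;s hostname; you will be prompted for the password):&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool -I lanplus -H $bmc_host -U root sol activate&amp;lt;/pre&amp;gt;&lt;br /&gt;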
&lt;br /&gt;
= iDRAC =&lt;br /&gt;
== riboflavin ==&lt;br /&gt;
riboflavin is using iDRAC 6. The web console can be viewed from https://riboflavin-ipmi.csclub.uwaterloo.ca; if you are not on campus, you can use a [[How_to_SSH#SOCKS_proxy|SOCKS proxy]]. Unfortunately, the virtual console uses Java Web Start, which is now deprecated. Here&#039;s a workaround which you can use instead.&lt;br /&gt;
&lt;br /&gt;
From the web UI, go to the &amp;quot;Console/Media&amp;quot; tab and click the &amp;quot;Launch virtual console&amp;quot; button. This will download a file whose name starts with &amp;quot;viewer.jnlp&amp;quot;. Now go to https://www.java.com and download JRE 8; any later version will not have support for JWS (note that OpenJDK will not work; JWS was a proprietary framework from Sun/Oracle). Unpack the tarball, open jre1.8.0_391/lib/security/java.security in a text editor, and comment out the following properties (note that each property spans multiple lines):&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.certpath.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.jar.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.tls.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
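For example (the archive name depends on the exact JRE build you downloaded):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tar xzf jre-8u391-linux-x64.tar.gz&lt;br /&gt;
vim jre1.8.0_391/lib/security/java.security&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;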
&lt;br /&gt;
If you are off-campus, you will need to set up some proxying so that the Java application can access ports 443 and 5900 on riboflavin-ipmi. In the example below, I am using caffeine as a jump host, but any machine on campus should do:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L 5443:localhost:5443 -L 5900:localhost:5900 caffeine.csclub.uwaterloo.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now on caffeine, open a tmux/screen session, and run the following commands in two different panes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5443,fork TCP:riboflavin-ipmi:443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5900,fork TCP:riboflavin-ipmi:5900&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Back on your personal machine, open the viewer.jnlp file in a text editor and perform the following:&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Replace all instances of &amp;lt;code&amp;gt;riboflavin-ipmi.csclub.uwaterloo.ca:443&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost:5443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, the first &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; child element should say &amp;lt;code&amp;gt;ip=riboflavin-ipmi&amp;lt;/code&amp;gt;. Replace this with &amp;lt;code&amp;gt;ip=localhost&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, there are child &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; elements for &amp;lt;code&amp;gt;user&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;passwd&amp;lt;/code&amp;gt;. For some reason these are set to numbers; set these to the username and password for IPMI (username should be &amp;lt;code&amp;gt;root&amp;lt;/code&amp;gt;).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jre1.8.0_391/bin/javaws viewer.jnlp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If all goes well, the virtual console should eventually appear:&lt;br /&gt;
[[File:Riboflavin-idrac-virtual-console.png|1000px]]&lt;br /&gt;
&lt;br /&gt;
= Supermicro =&lt;br /&gt;
== ginkgo ==&lt;br /&gt;
To access the virtual console on ginkgo, the steps are the same as those for riboflavin, with the following changes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;In the launch.jnlp file, in the root &amp;lt;code&amp;gt;&amp;lt;jnlp&amp;gt;&amp;lt;/code&amp;gt; tag, change the value of the &amp;lt;code&amp;gt;codebase&amp;lt;/code&amp;gt; attribute from &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://ginkgo-ipmi.csclub.uwaterloo.ca:443&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://localhost:5443&amp;lt;/code&amp;gt;. Next, in the first &amp;lt;code&amp;gt;&amp;lt;argument&amp;gt;&amp;lt;/code&amp;gt; element under &amp;lt;code&amp;gt;&amp;lt;application-desc&amp;gt;&amp;lt;/code&amp;gt;, replace &amp;lt;code&amp;gt;ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost&amp;lt;/code&amp;gt;. These are the only changes which you should make to this file (unless you are already on the campus network, in which case you do not need to modify this file at all).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;jre1.8.0_391/bin/ControlPanel&amp;lt;/code&amp;gt;, go to the Security tab, click &amp;quot;Edit Site List&amp;quot;, and add &amp;lt;code&amp;gt;https&amp;lt;nowiki/&amp;gt;://ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; as an exception.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=New_CSC_Machine&amp;diff=5186</id>
		<title>New CSC Machine</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=New_CSC_Machine&amp;diff=5186"/>
		<updated>2023-12-19T22:30:35Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* apt */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Firmware Updates =&lt;br /&gt;
&lt;br /&gt;
Vendors such as Dell provide firmware updates that should be applied before putting new machines into service. Even if the machine&#039;s warranty has expired, security updates are still made available.&lt;br /&gt;
&lt;br /&gt;
It is recommended to use the following sequence when updating firmware on the Dell PowerEdge servers ([https://downloads.dell.com/solutions/general-solution-resources/White%20Papers/Recommended%20Workflow%20for%20Performing%20Firmware%20Updates%20on%20PowerEdge%20Servers.pdf]):&lt;br /&gt;
&lt;br /&gt;
# iDRAC&lt;br /&gt;
# Lifecycle Controller&lt;br /&gt;
# BIOS&lt;br /&gt;
# Diagnostics&lt;br /&gt;
# OS Driver Pack&lt;br /&gt;
# RAID&lt;br /&gt;
# NIC&lt;br /&gt;
# PSU&lt;br /&gt;
# CPLD&lt;br /&gt;
# Other updates&lt;br /&gt;
For consumer grade hardware, go to the motherboard vendor&#039;s website and find the way to upgrade the firmware.&lt;br /&gt;
&lt;br /&gt;
= Booting =&lt;br /&gt;
&lt;br /&gt;
* Put the TFTP image in place (if dist-arch pair installed before, you may skip this).&lt;br /&gt;
e.g. extract http://mirror.csclub.uwaterloo.ca/ubuntu/dists/oneiric/main/installer-amd64/current/images/netboot/netboot.tar.gz to caffeine:/srv/tftp/oneiric-amd64&lt;br /&gt;
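A minimal sketch of that extraction (run on caffeine; the paths follow the example above):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir -p /srv/tftp/oneiric-amd64&lt;br /&gt;
cd /srv/tftp/oneiric-amd64&lt;br /&gt;
wget http://mirror.csclub.uwaterloo.ca/ubuntu/dists/oneiric/main/installer-amd64/current/images/netboot/netboot.tar.gz&lt;br /&gt;
tar xzf netboot.tar.gz &amp;amp;&amp;amp; rm netboot.tar.gz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;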
&lt;br /&gt;
* Force network boot in the BIOS. This may be called &amp;quot;Legacy LAN&amp;quot; or other such cryptic things. If this doesn&#039;t work, boot from CD or USB instead.&lt;br /&gt;
&lt;br /&gt;
It is preferred to use the &amp;quot;alternate&amp;quot; Ubuntu installer image, based on debian-installer, instead of the Ubiquity installer. This installer supports software RAID and LVM out of the box, and will generally make your life easier. If installing Debian, this is the usual installer, so don&#039;t sweat it.&lt;br /&gt;
&lt;br /&gt;
* Most of our newer servers (e.g. PowerEdge R815) need non-free firmware in order to boot. This means that if you are using a new netboot image, it is highly recommended to include the entire non-free firmware bundle in the boot image. See [https://wiki.debian.org/DebianInstaller/NetbootFirmware] for more information.&lt;br /&gt;
* For office terminals, create a boot USB (via dd, for example) and boot from USB.&lt;br /&gt;
&lt;br /&gt;
= Installing =&lt;br /&gt;
&lt;br /&gt;
== debian-installer ==&lt;br /&gt;
&lt;br /&gt;
At least in expert mode, you can choose a custom mirror (top of the countries list) and give the path for mirror directly. This will make installation super-fast compared to installing from anywhere else.&lt;br /&gt;
&lt;br /&gt;
Please install to LVM volumes, as this is our standard configuration on all machines where possible. It allows more flexible partitioning across available volumes. Since GRUB 2, even /boot may be on LVM; this is the preferred configuration for simplicity, except when legacy partitioning setups make this inconvenient.&lt;br /&gt;
&lt;br /&gt;
You may enable unattended upgrades, but do not enable Canonical&#039;s remote management service or any such nonsense. This is mostly a straightforward Debian/Ubuntu install.&lt;br /&gt;
&lt;br /&gt;
= After Installing =&lt;br /&gt;
&lt;br /&gt;
Add the machine&#039;s name to ~git/public/hosts.git, and run the ansible playbook (https://git.uwaterloo.ca/csc/playbooks/blob/master/update-hosts.yml) to distribute the updated hosts file to all machines.&lt;br /&gt;
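A sketch of the playbook step (the clone URL is from above; the inventory is assumed to already be configured):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
git clone https://git.uwaterloo.ca/csc/playbooks.git&lt;br /&gt;
cd playbooks&lt;br /&gt;
ansible-playbook update-hosts.yml&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;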
== apt ==&lt;br /&gt;
&lt;br /&gt;
Delete/clear the file &amp;lt;tt&amp;gt;/etc/apt/sources.list&amp;lt;/tt&amp;gt; and paste something like the following into &amp;lt;tt&amp;gt;/etc/apt/sources.list.d/debian.sources&amp;lt;/tt&amp;gt; (replace &amp;quot;bookworm&amp;quot; with the current Debian stable codename):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Types: deb&lt;br /&gt;
URIs: http://mirror.csclub.uwaterloo.ca/debian&lt;br /&gt;
Suites: bookworm bookworm-updates bookworm-backports&lt;br /&gt;
Components: main contrib non-free non-free-firmware&lt;br /&gt;
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg&lt;br /&gt;
&lt;br /&gt;
Types: deb&lt;br /&gt;
URIs: http://mirror.csclub.uwaterloo.ca/debian-security&lt;br /&gt;
Suites: bookworm-security&lt;br /&gt;
Components: main contrib non-free non-free-firmware&lt;br /&gt;
Signed-By: /usr/share/keyrings/debian-archive-keyring.gpg&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Install the CSC archive signing key:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget -O /etc/apt/keyrings/csclub.gpg http://debian.csclub.uwaterloo.ca/csclub.gpg&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the following into &amp;lt;tt&amp;gt;/etc/apt/sources.list.d/csclub.sources&amp;lt;/tt&amp;gt; (or copy from another host):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Types: deb&lt;br /&gt;
URIs: http://debian.csclub.uwaterloo.ca&lt;br /&gt;
Suites: bookworm&lt;br /&gt;
Components: main&lt;br /&gt;
Signed-By: /etc/apt/keyrings/csclub.gpg&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In order to make Debian use packages in our repository by default, set our repository to the highest priority. Create &amp;lt;code&amp;gt;/etc/apt/preferences.d/99-csclub&amp;lt;/code&amp;gt;: &amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
Package: *&lt;br /&gt;
Pin: origin debian.csclub.uwaterloo.ca&lt;br /&gt;
Pin-Priority: 1001&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;You should now run &amp;lt;tt&amp;gt;apt-get update&amp;lt;/tt&amp;gt; to reflect these changes.&lt;br /&gt;
&lt;br /&gt;
For unattended upgrades in the future, install the &amp;lt;tt&amp;gt;unattended-upgrades&amp;lt;/tt&amp;gt; package and copy &amp;lt;tt&amp;gt;/etc/apt/apt.conf&amp;lt;/tt&amp;gt; from another host.&lt;br /&gt;
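For example ($other_host stands in for any existing CSC machine):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt-get install unattended-upgrades&lt;br /&gt;
scp $other_host:/etc/apt/apt.conf /etc/apt/apt.conf&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;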
&lt;br /&gt;
== Network ==&lt;br /&gt;
&lt;br /&gt;
Note that Debian 11 will use NetworkManager or &amp;lt;code&amp;gt;/etc/network/interfaces&amp;lt;/code&amp;gt; by default if you install a desktop environment, which doesn&#039;t seem to do DHCPv6 nicely. For simplicity and consistency across machines, we will use &amp;lt;code&amp;gt;systemd-networkd&amp;lt;/code&amp;gt;. First stop and disable NetworkManager:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
systemctl disable --now NetworkManager.service networking.service&lt;br /&gt;
apt autoremove network-manager&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;Then, create a network configuration file at &amp;lt;code&amp;gt;/etc/systemd/network/10-wired.network&amp;lt;/code&amp;gt;:&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
[Match]&lt;br /&gt;
# Check the interface name using `ip a`&lt;br /&gt;
Name=enp3s0&lt;br /&gt;
&lt;br /&gt;
[Network]&lt;br /&gt;
# DHCP for IPv4 should work just fine&lt;br /&gt;
DHCP=ipv4&lt;br /&gt;
# IPv6 doesn&#039;t seem to work properly. Manually set them here&lt;br /&gt;
Address=ALLOCATED_IPv6_ADDRESS&lt;br /&gt;
Gateway=IPv6_GATEWAY&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;Then start and enable &amp;lt;code&amp;gt;systemd-networkd.service&amp;lt;/code&amp;gt;. Also remember to specify the campus DNS at &amp;lt;code&amp;gt;/etc/resolv.conf&amp;lt;/code&amp;gt;. You can copy it from another CSC machine.&lt;br /&gt;
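A sketch of those two steps ($other_host stands in for any existing CSC machine):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl enable --now systemd-networkd.service&lt;br /&gt;
scp $other_host:/etc/resolv.conf /etc/resolv.conf&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;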
&lt;br /&gt;
== Kerberos keys ==&lt;br /&gt;
&lt;br /&gt;
If this is a reinstall of an existing host, copy back the SSH host keys and &amp;lt;tt&amp;gt;/etc/krb5.keytab&amp;lt;/tt&amp;gt; from its former incarnation. Otherwise, create a new Kerberos principal and copy the keytab over, as follows (run from the host in question):&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
kadmin -p sysadmin/admin   # or any other admin principal; the password for this one is the usual root password&lt;br /&gt;
addprinc -randkey host/[hostname].csclub.uwaterloo.ca&lt;br /&gt;
ktadd host/[hostname].csclub.uwaterloo.ca&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This will generate a new principal (you can skip this step if one already exists) and add it to the local Kerberos keytab.&lt;br /&gt;
&lt;br /&gt;
== Configuration ==&lt;br /&gt;
&lt;br /&gt;
=== General ===&lt;br /&gt;
Install packages that we will need:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
apt install krb5-user nfs-common nslcd sudo-ldap&lt;br /&gt;
# This package is automatically installed already, but we need to install our version so that NFS can connect to our crappy NetApp server&lt;br /&gt;
apt install --reinstall libk5crypto3&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;The following config files are needed to work in the CSC environment (examples given below for an office terminal; perhaps refer to another host if preferred).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;/etc/nsswitch.conf&amp;lt;/tt&amp;gt;&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
# /etc/nsswitch.conf&lt;br /&gt;
#&lt;br /&gt;
# Example configuration of GNU Name Service Switch functionality.&lt;br /&gt;
# If you have the `glibc-doc-reference&#039; and `info&#039; packages installed, try:&lt;br /&gt;
# `info libc &amp;quot;Name Service Switch&amp;quot;&#039; for information about this file.&lt;br /&gt;
&lt;br /&gt;
passwd:         files systemd ldap&lt;br /&gt;
group:          files systemd ldap&lt;br /&gt;
shadow:         files ldap&lt;br /&gt;
gshadow:        files ldap&lt;br /&gt;
sudoers:        files ldap&lt;br /&gt;
&lt;br /&gt;
hosts:          files dns&lt;br /&gt;
networks:       files&lt;br /&gt;
&lt;br /&gt;
protocols:      db files&lt;br /&gt;
services:       db files&lt;br /&gt;
ethers:         db files&lt;br /&gt;
rpc:            db files&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&amp;lt;tt&amp;gt;/etc/ldap/ldap.conf&amp;lt;/tt&amp;gt;&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
#&lt;br /&gt;
# LDAP Defaults&lt;br /&gt;
#&lt;br /&gt;
&lt;br /&gt;
# See ldap.conf(5) for details&lt;br /&gt;
# This file should be world readable but not world writable.&lt;br /&gt;
&lt;br /&gt;
BASE    dc=csclub, dc=uwaterloo, dc=ca&lt;br /&gt;
URI     ldaps://ldap1.csclub.uwaterloo.ca ldaps://ldap2.csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
SIZELIMIT       0&lt;br /&gt;
&lt;br /&gt;
TLS_CACERT      /etc/ssl/certs/ca-certificates.crt&lt;br /&gt;
TLS_CACERTFILE  /etc/ssl/certs/ca-certificates.crt&lt;br /&gt;
&lt;br /&gt;
SUDOERS_BASE ou=SUDOers,dc=csclub,dc=uwaterloo,dc=ca&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;Also make &amp;lt;tt&amp;gt;/etc/sudo-ldap.conf&amp;lt;/tt&amp;gt; a symlink to the above. On debian, install &amp;lt;tt&amp;gt;sudo-ldap&amp;lt;/tt&amp;gt; package too.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;tt&amp;gt;/etc/nslcd.conf&amp;lt;/tt&amp;gt;&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
# /etc/nslcd.conf&lt;br /&gt;
# nslcd configuration file. See nslcd.conf(5)&lt;br /&gt;
# for details.&lt;br /&gt;
&lt;br /&gt;
# The user and group nslcd should run as.&lt;br /&gt;
uid nslcd&lt;br /&gt;
gid nslcd&lt;br /&gt;
&lt;br /&gt;
# The location at which the LDAP server(s) should be reachable.&lt;br /&gt;
uri ldaps://ldap1.csclub.uwaterloo.ca&lt;br /&gt;
uri ldaps://ldap2.csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
# The search base that will be used for all queries.&lt;br /&gt;
base dc=csclub, dc=uwaterloo, dc=ca&lt;br /&gt;
&lt;br /&gt;
# The LDAP protocol version to use.&lt;br /&gt;
#ldap_version 3&lt;br /&gt;
&lt;br /&gt;
# The DN to bind with for normal lookups.&lt;br /&gt;
#binddn cn=annonymous,dc=example,dc=net&lt;br /&gt;
#bindpw secret&lt;br /&gt;
&lt;br /&gt;
# The DN used for password modifications by root.&lt;br /&gt;
#rootpwmoddn cn=admin,dc=example,dc=com&lt;br /&gt;
&lt;br /&gt;
# SSL options&lt;br /&gt;
#ssl off&lt;br /&gt;
tls_reqcert demand&lt;br /&gt;
tls_cacertfile /etc/ssl/certs/ca-certificates.crt&lt;br /&gt;
&lt;br /&gt;
# The search scope.&lt;br /&gt;
#scope sub&lt;br /&gt;
&lt;br /&gt;
map group member uniqueMember&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&amp;lt;tt&amp;gt;/etc/krb5.conf&amp;lt;/tt&amp;gt;&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
[libdefaults]&lt;br /&gt;
  default_realm = CSCLUB.UWATERLOO.CA&lt;br /&gt;
  forwardable = true&lt;br /&gt;
  proxiable = true&lt;br /&gt;
  dns_lookup_kdc = false&lt;br /&gt;
  dns_lookup_realm = false&lt;br /&gt;
  allow_weak_crypto = true&lt;br /&gt;
&lt;br /&gt;
[realms]&lt;br /&gt;
  CSCLUB.UWATERLOO.CA = {&lt;br /&gt;
    kdc = kdc1.csclub.uwaterloo.ca&lt;br /&gt;
    kdc = kdc2.csclub.uwaterloo.ca&lt;br /&gt;
    admin_server = kadmin.csclub.uwaterloo.ca&lt;br /&gt;
  }&lt;br /&gt;
(rest omitted for brevity, see any CSC machine)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;Notably, &amp;lt;tt&amp;gt;allow_weak_crypto&amp;lt;/tt&amp;gt; is currently needed to mount &amp;lt;tt&amp;gt;/users&amp;lt;/tt&amp;gt; (&amp;lt;tt&amp;gt;/music&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt; are sec=sys and thus will always mount, even when krb5 is down and/or broken). Otherwise, you will get a mysterious &amp;quot;permission denied&amp;quot; error (even though the server claims to have authenticated the mount successfully).&lt;br /&gt;
&lt;br /&gt;
Furthermore, the lines &amp;lt;tt&amp;gt;dns_lookup_kdc&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;dns_lookup_realm&amp;lt;/tt&amp;gt; have been added; they are needed to stop Kerberos from throwing its arms in the air and giving up if IST&#039;s DNS servers ever explode, an event that has happened in the recent past far more often than I&#039;d like it to.&lt;br /&gt;
&lt;br /&gt;
Change all lines in &amp;lt;tt&amp;gt;/etc/pam.d/common-*&amp;lt;/tt&amp;gt; to have &amp;lt;tt&amp;gt;minimum_uid=10000&amp;lt;/tt&amp;gt; so that Kerberos won&#039;t interfere with local users. Note that pam configs are notably different on syscom-only hosts. Look at an existing syscom-only host to see the difference.&lt;br /&gt;
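For illustration, a pam_krb5 line should end up looking something like this (module options other than &amp;lt;tt&amp;gt;minimum_uid&amp;lt;/tt&amp;gt; vary by file; compare with an existing host):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
auth    [success=2 default=ignore]    pam_krb5.so minimum_uid=10000&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;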
&lt;br /&gt;
Alter &amp;lt;tt&amp;gt;/etc/default/nfs-common&amp;lt;/tt&amp;gt; &amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
# Alter these lines:&lt;br /&gt;
NEED_STATD=1&lt;br /&gt;
NEED_GSSD=1&lt;br /&gt;
# -l for gssd is to allow legacy crypto suites&lt;br /&gt;
RPCGSSDOPTS=&amp;quot;-v -l&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;to enable &amp;lt;tt&amp;gt;statd&amp;lt;/tt&amp;gt;, and more importantly &amp;lt;tt&amp;gt;gssd&amp;lt;/tt&amp;gt; (needed for Kerberos NFS mounts). Start &amp;lt;code&amp;gt;rpc-statd.service&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;rpc-gssd.service&amp;lt;/code&amp;gt; manually for now.&lt;br /&gt;
&lt;br /&gt;
Add &amp;lt;tt&amp;gt;/users&amp;lt;/tt&amp;gt;, &amp;lt;tt&amp;gt;/music&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt; to &amp;lt;tt&amp;gt;/etc/fstab&amp;lt;/tt&amp;gt; (as appropriate for the machine&#039;s role), make their mount points and mount them. Note that &amp;lt;tt&amp;gt;/scratch&amp;lt;/tt&amp;gt; is sec=sys whereas &amp;lt;tt&amp;gt;/music&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;/users&amp;lt;/tt&amp;gt; are sec=krb5p (with exceptions granted on a case-by-case basis for servers only; office terminals are always sec=krb5p for security reasons). A sketch of the corresponding fstab entries is below.&lt;br /&gt;
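This sketch assumes a hypothetical NFS server named fs00; copy the real server name and mount options from an existing host&#039;s fstab:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# fs00 is a placeholder; take the real entries from another CSC machine&lt;br /&gt;
fs00:/users    /users    nfs  sec=krb5p  0  0&lt;br /&gt;
fs00:/scratch  /scratch  nfs  sec=sys    0  0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;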
&lt;br /&gt;
To allow single sign-on as &amp;lt;tt&amp;gt;root&amp;lt;/tt&amp;gt; (primarily useful for pushing files to all machines simultaneously), put the following in &amp;lt;tt&amp;gt;/root/.k5login&amp;lt;/tt&amp;gt;:&lt;br /&gt;
 sysadmin/admin@CSCLUB.UWATERLOO.CA&lt;br /&gt;
&lt;br /&gt;
Also copy the following files from another CSC host:&lt;br /&gt;
* &amp;lt;tt&amp;gt;/etc/ssh/ssh_config&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;/etc/ssh/sshd_config&amp;lt;/tt&amp;gt; (for single sign-on)&lt;br /&gt;
* &amp;lt;tt&amp;gt;/etc/ssh/ssh_known_hosts&amp;lt;/tt&amp;gt; (to remove hostkey warnings within our network)&lt;br /&gt;
* &amp;lt;tt&amp;gt;/etc/hosts&amp;lt;/tt&amp;gt; (for host tab completion and emergency name resolution)&lt;br /&gt;
* &amp;lt;tt&amp;gt;/etc/resolv.conf&amp;lt;/tt&amp;gt; (to use IST&#039;s nameservers and search csclub/uwaterloo domains. Only required if you are not using &amp;lt;tt&amp;gt;/etc/network/interfaces&amp;lt;/tt&amp;gt; to configure DNS)&lt;br /&gt;
&lt;br /&gt;
=== Audio ===&lt;br /&gt;
&lt;br /&gt;
On an office terminal, copy &amp;lt;tt&amp;gt;/etc/pulse/default.pa&amp;lt;/tt&amp;gt; from another office terminal.&lt;br /&gt;
&lt;br /&gt;
If this is to be the machine that actually plays audio (currently &amp;lt;tt&amp;gt;nullsleep&amp;lt;/tt&amp;gt;), the setup is slightly more complicated. You&#039;ll need to set up MPD and PipeWire to receive connections, and store the PulseAudio cookie in &amp;lt;tt&amp;gt;~audio&amp;lt;/tt&amp;gt;, with appropriate permissions so that only the &amp;lt;tt&amp;gt;audio&amp;lt;/tt&amp;gt; group can access it. If this is a new audio machine, you&#039;ll also need to change &amp;lt;tt&amp;gt;default.pa&amp;lt;/tt&amp;gt; on all office terminals to point to it.&lt;br /&gt;
&lt;br /&gt;
=== Password ===&lt;br /&gt;
Change the root password to the specified password in the usual place. If it&#039;s an office terminal, change the local user&#039;s password to the one specified in the usual place.&lt;br /&gt;
&lt;br /&gt;
=== Prevent suspend and hibernation (Office Terminal) ===&lt;br /&gt;
Set &amp;lt;code&amp;gt;AllowSuspend&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;AllowHibernation&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;AllowSuspendThenHibernate&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;AllowHybridSleep&amp;lt;/code&amp;gt; all to &amp;lt;code&amp;gt;no&amp;lt;/code&amp;gt; in &amp;lt;code&amp;gt;/etc/systemd/sleep.conf&amp;lt;/code&amp;gt;, and reboot.&lt;br /&gt;
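The resulting &amp;lt;code&amp;gt;/etc/systemd/sleep.conf&amp;lt;/code&amp;gt; should contain:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Sleep]&lt;br /&gt;
AllowSuspend=no&lt;br /&gt;
AllowHibernation=no&lt;br /&gt;
AllowSuspendThenHibernate=no&lt;br /&gt;
AllowHybridSleep=no&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;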
&lt;br /&gt;
== Records ==&lt;br /&gt;
&lt;br /&gt;
You probably already created the host in the University IPAM system beforehand. If not, please do so.&lt;br /&gt;
&lt;br /&gt;
Please also add the host to the [[Machine List]] here on the Wiki.&lt;br /&gt;
&lt;br /&gt;
== Munin (System Monitoring) ==&lt;br /&gt;
&lt;br /&gt;
If the new machine is not a container, you probably want to have it participate in the Munin cluster. Run &amp;lt;tt&amp;gt;apt-get install munin-node&amp;lt;/tt&amp;gt; to install the monitoring client, then&lt;br /&gt;
edit the file /etc/munin/munin-node.conf. Look for a line that says &amp;lt;tt&amp;gt;allow ^127\.0\.0\.1$&amp;lt;/tt&amp;gt; and add the following on a new line immediately below it:&lt;br /&gt;
&amp;lt;tt&amp;gt;allow ^129\.97\.134\.51$&amp;lt;/tt&amp;gt; (this is the IP address for munin.csclub). Save the file, then &amp;lt;tt&amp;gt;/etc/init.d/munin-node restart&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;update-rc.d munin-node defaults&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Then, ssh into munin.csclub and edit the file /etc/munin/munin.conf and add the following lines to the end:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[NEW-MACHINE-NAME.csclub]&lt;br /&gt;
addr 129.97.134.###&lt;br /&gt;
use_node_name yes&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Prometheus (System Monitoring) ==&lt;br /&gt;
&lt;br /&gt;
We are currently using Prometheus to monitor our systems. On the new machine, install &amp;lt;tt&amp;gt;prometheus-node-exporter&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;stunnel&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
Change &amp;lt;tt&amp;gt;/etc/default/prometheus-node-exporter&amp;lt;/tt&amp;gt; to this: &lt;br /&gt;
&lt;br /&gt;
 ARGS=&amp;quot;--web.listen-address=localhost:9101&amp;quot;&lt;br /&gt;
&lt;br /&gt;
and start &amp;lt;tt&amp;gt;prometheus-node-exporter.service&amp;lt;/tt&amp;gt;. &lt;br /&gt;
&lt;br /&gt;
Then set up stunnel. Create &amp;lt;tt&amp;gt;/etc/stunnel/prometheus-node-exporter.conf&amp;lt;/tt&amp;gt; with this content:&lt;br /&gt;
&lt;br /&gt;
 setuid = stunnel4&lt;br /&gt;
 setgid = stunnel4&lt;br /&gt;
 pid = /var/run/stunnel4/exporter.pid&lt;br /&gt;
 &lt;br /&gt;
 debug = 7&lt;br /&gt;
 &lt;br /&gt;
 [prometheus-node-exporter]&lt;br /&gt;
 accept = 0.0.0.0:9100&lt;br /&gt;
 connect = 127.0.0.1:9101&lt;br /&gt;
 CAfile = /etc/stunnel/tls/server.crt&lt;br /&gt;
 cert = /etc/stunnel/tls/node.crt&lt;br /&gt;
 key = /etc/stunnel/tls/node.key&lt;br /&gt;
 verifyPeer = yes&lt;br /&gt;
&lt;br /&gt;
Copy &amp;lt;tt&amp;gt;/etc/stunnel/tls/{node.crt, node.key, server.crt}&amp;lt;/tt&amp;gt; from &amp;lt;tt&amp;gt;prometheus:/opt/prometheus/tls&amp;lt;/tt&amp;gt; or the same location on other machines.&lt;br /&gt;
&lt;br /&gt;
Finally, start &amp;lt;tt&amp;gt;stunnel4.service&amp;lt;/tt&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
If it&#039;s a new machine, you&#039;ll also need to add it to the list of monitored hosts at &amp;lt;tt&amp;gt;prometheus:/opt/prometheus/prometheus.yml&amp;lt;/tt&amp;gt;. Add it under a suitable label (or create a new label) in the &#039;node_exporter&#039; job, along the lines of the sketch below.&lt;br /&gt;
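A sketch of such an entry (the hostname and label name are illustrative; follow the structure already present in the file):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
  - job_name: &#039;node_exporter&#039;&lt;br /&gt;
    static_configs:&lt;br /&gt;
      - targets: [&#039;new-machine.csclub.uwaterloo.ca:9100&#039;]&lt;br /&gt;
        labels:&lt;br /&gt;
          group: &#039;servers&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;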
&lt;br /&gt;
= New Distribution =&lt;br /&gt;
&lt;br /&gt;
If you&#039;re adding a new distribution, there are a couple of steps you&#039;ll need to take to update the CSClub Debian repository on [[Machine_List#sodium_benzoate|sodium-benzoate/mirror]].&lt;br /&gt;
&lt;br /&gt;
The steps to add a new Debian release (in the examples, jessie) are as follows; modify as necessary:&lt;br /&gt;
&lt;br /&gt;
=== Step 0: Create a GPG key ===&lt;br /&gt;
&lt;br /&gt;
Use &amp;quot;gpg --gen-key&amp;quot; or something like that. Skip this if you already have one.&lt;br /&gt;
&lt;br /&gt;
=== Step 1: Add to Uploaders ===&lt;br /&gt;
&lt;br /&gt;
The /srv/debian/conf/uploaders file on mirror contains the list of people who can upload. Add your GPG key ID to this file. Use &amp;quot;gpg --list-secret-keys&amp;quot; to find out the key ID. You also need to import your key into the mirror&#039;s gpg homedir as follows:&lt;br /&gt;
&lt;br /&gt;
 gpg --export $KEYID | sudo env GNUPGHOME=/srv/debian/gpg gpg --import&lt;br /&gt;
&lt;br /&gt;
You only need to do this step once.&lt;br /&gt;
&lt;br /&gt;
=== Step 2: Add Distro ===&lt;br /&gt;
&lt;br /&gt;
Add a new section to /srv/debian/conf/distributions:&lt;br /&gt;
&lt;br /&gt;
 Origin: CSC&lt;br /&gt;
 Label: Debian&lt;br /&gt;
 Codename: &#039;&#039;&#039;jessie&#039;&#039;&#039;&lt;br /&gt;
 Architectures: alpha amd64 i386 mips mipsel sparc powerpc armel source&lt;br /&gt;
 Components: main contrib non-free&lt;br /&gt;
 Uploaders: uploaders&lt;br /&gt;
 Update: dell chrome&lt;br /&gt;
 SignWith: yes&lt;br /&gt;
 Log: &#039;&#039;&#039;jessie&#039;&#039;&#039;.log&lt;br /&gt;
  --changes notifier&lt;br /&gt;
&lt;br /&gt;
And update the &#039;&#039;&#039;Allow&#039;&#039;&#039; line in /srv/debian/conf/incoming:&lt;br /&gt;
&lt;br /&gt;
 Allow: &#039;&#039;&#039;jessie&amp;gt;jessie&#039;&#039;&#039; oldstable&amp;gt;squeeze stable&amp;gt;wheezy lucid&amp;gt;lucid maverick&amp;gt;maverick oneiric&amp;gt;oneiric precise&amp;gt;precise quantal&amp;gt;quantal&lt;br /&gt;
&lt;br /&gt;
=== Step 3: Update from Sources ===&lt;br /&gt;
&lt;br /&gt;
Run:&lt;br /&gt;
&lt;br /&gt;
 sudo env GNUPGHOME=/srv/debian/gpg /srv/debian/bin/rrr-update&lt;br /&gt;
&lt;br /&gt;
If all went well you should see the new distribution listed at http://debian.csclub.uwaterloo.ca/dists/&lt;br /&gt;
&lt;br /&gt;
=== Step 4: CSC Packages ===&lt;br /&gt;
&lt;br /&gt;
Now that we&#039;ve got our new distribution set up we need to generate our packages and have them uploaded. Namely, ceo and libpam-csc. For libpam-csc:&lt;br /&gt;
&lt;br /&gt;
Get the package:&lt;br /&gt;
&lt;br /&gt;
 git clone https://git.csclub.uwaterloo.ca/public/libpam-csc.git&lt;br /&gt;
 cd libpam-csc&lt;br /&gt;
&lt;br /&gt;
Update change log:&lt;br /&gt;
&lt;br /&gt;
 EMAIL=[you]@csclub.uwaterloo.ca NAME=&amp;quot;Your Name&amp;quot; dch -i&lt;br /&gt;
&lt;br /&gt;
Update as necessary, i.e:&lt;br /&gt;
&lt;br /&gt;
 libpam-csc (1.10&#039;&#039;&#039;jessie0&#039;&#039;&#039;) &#039;&#039;&#039;jessie&#039;&#039;&#039;; urgency=low&lt;br /&gt;
 &lt;br /&gt;
   * Packaging for jessie.&lt;br /&gt;
 &lt;br /&gt;
  -- Your Name &amp;lt;[you]@csclub.uwaterloo.ca&amp;gt;  Thu, 10 Oct 2013 22:08:48 -0400&lt;br /&gt;
&lt;br /&gt;
Build! (You may need to install various dependencies, which it will yell at you if you don&#039;t have.)&lt;br /&gt;
&lt;br /&gt;
 debuild -k&#039;&#039;&#039;YOURKEYID&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
Yay, it built! Now let&#039;s upload it to the repo. The build process will create a PACKAGE.changes file in the parent directory (replace PACKAGE with the actual package name).&lt;br /&gt;
&lt;br /&gt;
Copy the dupload config from corn-syrup and run dupload:&lt;br /&gt;
&lt;br /&gt;
 mv /etc/dupload.conf /etc/dupload.conf.bak&lt;br /&gt;
 scp corn-syrup:/etc/dupload.conf /etc/dupload.conf&lt;br /&gt;
 dupload libpam-csc_1.10jessie0_amd64.changes&lt;br /&gt;
&lt;br /&gt;
Finally, log into mirror and type &amp;quot;sudo /srv/debian/bin/rrr-incoming&amp;quot;. This is supposed to happen once every few minutes; however, it is always faster to run it manually.&lt;br /&gt;
&lt;br /&gt;
And you&#039;re done. For CEO, see https://git.csclub.uwaterloo.ca/public/pyceo/src/branch/master/PACKAGING.md&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Observability&amp;diff=5185</id>
		<title>Observability</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Observability&amp;diff=5185"/>
		<updated>2023-12-16T02:06:50Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Schema */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are [https://www.oreilly.com/library/view/distributed-systems-observability/9781492033431/ch04.html three pillars of observability]: metrics, logging and tracing. We are only interested in the first two.&lt;br /&gt;
&lt;br /&gt;
== Metrics ==&lt;br /&gt;
All of our machines are, or at least should be, running the Prometheus node exporter. This collects machine metrics (e.g. RAM used, disk space), which are scraped by the Prometheus server running at https://prometheus.csclub.uwaterloo.ca (currently a VM on phosphoric-acid). There are a few specialized exporters running on several other machines: a Postfix exporter is running on mail, an Apache exporter is running on caffeine, and an NGINX exporter is running on potassium-benzoate. There is also a custom exporter written by syscom running on potassium-benzoate for mirror stats.&lt;br /&gt;
&lt;br /&gt;
Most of the exporters use mutual TLS authentication with the Prometheus server. I set the expiration date for the TLS certs to 10 years. If you are reading this and it is 2031 or later, then go update the certs.&lt;br /&gt;
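To check when a cert expires (using the node exporter cert path from the machine setup docs as an example):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl x509 -in /etc/stunnel/tls/node.crt -noout -enddate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;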
&lt;br /&gt;
I highly suggest becoming familiar with [https://prometheus.io/docs/prometheus/latest/querying/basics/ PromQL], the query language for Prometheus. You can run and visualize some queries at https://prometheus.csclub.uwaterloo.ca/prometheus. For example, here is a query to determine which machines are up or down:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
up{job=&amp;quot;node_exporter&amp;quot;}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Here&#039;s how we determine if a machine has NFS mounted. This will return 1 for machines which have NFS mounted, but will not return any records for machines which do not have NFS mounted. (We ignore the actual value of node_filesystem_device_error because it returns 1 for machines using Kerberized NFS.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
count by (instance) (node_filesystem_device_error{mountpoint=&amp;quot;/users&amp;quot;, fstype=&amp;quot;nfs&amp;quot;})&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Now this is a rather complicated expression which can return one of three values:&lt;br /&gt;
* 0: the machine is down&lt;br /&gt;
* 1: the machine is up, but NFS is not mounted&lt;br /&gt;
* 2: the machine is up and NFS is mounted&lt;br /&gt;
The [https://prometheus.io/docs/prometheus/latest/querying/operators/#logical-set-binary-operators or operator] in PromQL is key here.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sum by (instance) (&lt;br /&gt;
  (count by (instance) (node_filesystem_device_error{mountpoint=&amp;quot;/users&amp;quot;, fstype=&amp;quot;nfs&amp;quot;}))&lt;br /&gt;
  or up{job=&amp;quot;node_exporter&amp;quot;}&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
We also use [https://prometheus.io/docs/alerting/latest/alertmanager/ AlertManager] to send email alerts from Prometheus metrics. We should figure out how to also send messages to IRC or similar.&lt;br /&gt;
&lt;br /&gt;
We also use the [https://github.com/prometheus/blackbox_exporter Blackbox prober exporter] to check if some of our web-based services are up.&lt;br /&gt;
&lt;br /&gt;
We make some pretty charts on Grafana (https://prometheus.csclub.uwaterloo.ca) from PromQL queries. Grafana also has an &#039;Explore&#039; page where you can test out some queries before making chart panels from them.&lt;br /&gt;
&lt;br /&gt;
== Logging ==&lt;br /&gt;
We now use [https://vector.dev/ Vector] for collecting and transforming logs, and [https://clickhouse.com/ ClickHouse] for storing log data.&lt;br /&gt;
&lt;br /&gt;
=== ClickHouse ===&lt;br /&gt;
ClickHouse is a very fast OLAP database which has great documentation for storing and analyzing [https://clickhouse.com/use-cases/logging-and-metrics logging and metrics]. Unfortunately, the CPU on phosphoric-acid (which hosts the prometheus VM) is so old that when we try to install the official deb package, the following error occurs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Instruction check fail. The CPU does not support SSSE3 instruction set.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
So we&#039;re going to download the &amp;quot;compat&amp;quot; version instead:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /root&lt;br /&gt;
wget https://s3.amazonaws.com/clickhouse-builds/master/amd64compat/clickhouse&lt;br /&gt;
chmod +x clickhouse&lt;br /&gt;
./clickhouse install&lt;br /&gt;
rm clickhouse&lt;br /&gt;
wget -O /etc/systemd/system/clickhouse-server.service https://github.com/ClickHouse/ClickHouse/raw/master/packages/clickhouse-server.service&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable clickhouse-server&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, systemd limits the number of threads which a service can create, so we&#039;ll want to disable that. Run &amp;lt;code&amp;gt;systemctl edit clickhouse-server&amp;lt;/code&amp;gt; and paste the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Service]&lt;br /&gt;
TasksMax=infinity&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, paste the following into /etc/clickhouse-server/users.d/csclub-users.xml:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;clickhouse&amp;gt;&lt;br /&gt;
  &amp;lt;profiles&amp;gt;&lt;br /&gt;
    &amp;lt;default&amp;gt;&lt;br /&gt;
      &amp;lt;!-- disable logs (using too much disk space) --&amp;gt;&lt;br /&gt;
      &amp;lt;log_queries replace=&amp;quot;replace&amp;quot;&amp;gt;0&amp;lt;/log_queries&amp;gt;&lt;br /&gt;
      &amp;lt;log_query_threads replace=&amp;quot;replace&amp;quot;&amp;gt;0&amp;lt;/log_query_threads&amp;gt;&lt;br /&gt;
    &amp;lt;/default&amp;gt;&lt;br /&gt;
    &amp;lt;readonly&amp;gt;&lt;br /&gt;
      &amp;lt;!-- readonly=2 still allows Grafana to change settings in its queries --&amp;gt;&lt;br /&gt;
      &amp;lt;readonly&amp;gt;2&amp;lt;/readonly&amp;gt;&lt;br /&gt;
    &amp;lt;/readonly&amp;gt;&lt;br /&gt;
  &amp;lt;/profiles&amp;gt;&lt;br /&gt;
  &amp;lt;users&amp;gt;&lt;br /&gt;
    &amp;lt;default&amp;gt;&lt;br /&gt;
      &amp;lt;!-- The default user should only be allowed to connect from localhost --&amp;gt;&lt;br /&gt;
      &amp;lt;networks&amp;gt;&lt;br /&gt;
        &amp;lt;ip&amp;gt;::1&amp;lt;/ip&amp;gt;&lt;br /&gt;
        &amp;lt;ip&amp;gt;127.0.0.1&amp;lt;/ip&amp;gt;&lt;br /&gt;
      &amp;lt;/networks&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Allow the default user to create new users --&amp;gt;&lt;br /&gt;
      &amp;lt;access_management&amp;gt;1&amp;lt;/access_management&amp;gt;&lt;br /&gt;
      &amp;lt;named_collection_control&amp;gt;1&amp;lt;/named_collection_control&amp;gt;&lt;br /&gt;
      &amp;lt;show_named_collections&amp;gt;1&amp;lt;/show_named_collections&amp;gt;&lt;br /&gt;
      &amp;lt;show_named_collections_secrets&amp;gt;1&amp;lt;/show_named_collections_secrets&amp;gt;&lt;br /&gt;
    &amp;lt;/default&amp;gt;&lt;br /&gt;
  &amp;lt;/users&amp;gt;&lt;br /&gt;
&amp;lt;/clickhouse&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then paste the following into /etc/clickhouse-server/config.d/zzz-csclub.xml (we need the zzz prefix because the configuration files are merged in alphabetical order, and we want ours to be applied last):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;clickhouse&amp;gt;&lt;br /&gt;
  &amp;lt;listen_host&amp;gt;127.0.0.1&amp;lt;/listen_host&amp;gt;&lt;br /&gt;
  &amp;lt;listen_host&amp;gt;::1&amp;lt;/listen_host&amp;gt;&lt;br /&gt;
  &amp;lt;logger&amp;gt;&lt;br /&gt;
    &amp;lt;level&amp;gt;information&amp;lt;/level&amp;gt;&lt;br /&gt;
    &amp;lt;size&amp;gt;100M&amp;lt;/size&amp;gt;&lt;br /&gt;
    &amp;lt;count&amp;gt;10&amp;lt;/count&amp;gt;&lt;br /&gt;
  &amp;lt;/logger&amp;gt;&lt;br /&gt;
  &amp;lt;mysql_port&amp;gt;&amp;lt;/mysql_port&amp;gt;&lt;br /&gt;
  &amp;lt;postgresql_port&amp;gt;&amp;lt;/postgresql_port&amp;gt;&lt;br /&gt;
&lt;br /&gt;
  &amp;lt;!-- disable logs (using too much disk space) --&amp;gt;&lt;br /&gt;
  &amp;lt;asynchronous_metric_log remove=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;metric_log remove=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;query_thread_log remove=&amp;quot;1&amp;quot; /&amp;gt;&lt;br /&gt;
  &amp;lt;query_log remove=&amp;quot;1&amp;quot; /&amp;gt;&lt;br /&gt;
  &amp;lt;query_views_log remove=&amp;quot;1&amp;quot; /&amp;gt;&lt;br /&gt;
  &amp;lt;part_log remove=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;session_log remove=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;text_log remove=&amp;quot;1&amp;quot; /&amp;gt;&lt;br /&gt;
  &amp;lt;trace_log remove=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/clickhouse&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run &amp;lt;code&amp;gt;systemctl restart clickhouse-server&amp;lt;/code&amp;gt; and make sure that it&#039;s running.&lt;br /&gt;
&lt;br /&gt;
==== Schema ====&lt;br /&gt;
Run &amp;lt;code&amp;gt;clickhouse-client&amp;lt;/code&amp;gt; to get a SQL shell. First we need to create a new database and some users:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE DATABASE vector;&lt;br /&gt;
CREATE USER vector IDENTIFIED BY &#039;REPLACE_ME&#039;;&lt;br /&gt;
GRANT ALL ON vector.* TO vector;&lt;br /&gt;
CREATE USER grafana IDENTIFIED BY &#039;REPLACE_ME&#039; SETTINGS PROFILE &#039;readonly&#039;;&lt;br /&gt;
GRANT SHOW DATABASES, SHOW TABLES, SELECT, DICTGET ON *.* TO grafana;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In some of our tables, we&#039;ll store the two-letter country code instead of a country&#039;s full name to save space, so we&#039;ll create a [https://clickhouse.com/docs/en/sql-reference/dictionaries dictionary] to look up a country&#039;s full name. Exit the SQL shell, then download the CSV file:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget -O /var/lib/clickhouse/user_files/country_codes.csv &#039;https://datahub.io/core/country-list/r/data.csv&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run &amp;lt;code&amp;gt;clickhouse-client&amp;lt;/code&amp;gt; and create the dictionary:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE DICTIONARY vector.country_codes_dictionary&lt;br /&gt;
(&lt;br /&gt;
    Name String,&lt;br /&gt;
    Code String&lt;br /&gt;
)&lt;br /&gt;
PRIMARY KEY Code&lt;br /&gt;
SOURCE(FILE(path &#039;/var/lib/clickhouse/user_files/country_codes.csv&#039; FORMAT &#039;CSVWithNames&#039;))&lt;br /&gt;
LIFETIME(MIN 0 MAX 0)&lt;br /&gt;
LAYOUT(HASHED_ARRAY());&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Perform a SELECT to force the dictionary to load and verify its contents:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT * FROM country_codes_dictionary;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now we need to create the tables for storing our actual log data (after they are transformed by Vector).&lt;br /&gt;
Create a table for failed SSH logins:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.failed_ssh_logins&lt;br /&gt;
(&lt;br /&gt;
    host LowCardinality(String),&lt;br /&gt;
    timestamp DateTime,&lt;br /&gt;
    ip_address IPv6,&lt;br /&gt;
    username String,&lt;br /&gt;
    country_code LowCardinality(String)&lt;br /&gt;
)&lt;br /&gt;
ENGINE = MergeTree()&lt;br /&gt;
PRIMARY KEY (host, timestamp)&lt;br /&gt;
TTL timestamp + INTERVAL 1 MONTH DELETE;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
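As an illustrative query against this table, here are the most frequently targeted usernames over the past day:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT username, count() AS attempts&lt;br /&gt;
FROM vector.failed_ssh_logins&lt;br /&gt;
WHERE timestamp &amp;gt; now() - INTERVAL 1 DAY&lt;br /&gt;
GROUP BY username&lt;br /&gt;
ORDER BY attempts DESC&lt;br /&gt;
LIMIT 10;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;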
&lt;br /&gt;
Create a table for storing mirror requests:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    timestamp DateTime CODEC(Delta, ZSTD),&lt;br /&gt;
    ip_address IPv6,&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    user_agent String,&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    region_name String,&lt;br /&gt;
    city String&lt;br /&gt;
)&lt;br /&gt;
ENGINE = MergeTree()&lt;br /&gt;
PRIMARY KEY (distro, timestamp, country_code, region_name, city)&lt;br /&gt;
TTL timestamp + INTERVAL 1 WEEK DELETE;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
One of ClickHouse&#039;s great features is [https://clickhouse.com/docs/en/guides/developer/cascading-materialized-views Materialized Views]. These allow us to automatically &amp;quot;forward&amp;quot; data from one table to another, and the second table can use a different storage engine to aggregate data and save space.&lt;br /&gt;
&lt;br /&gt;
We want to calculate the total number of requests and bytes sent for each distro, so let&#039;s create a table and view for that:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_by_distro&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    num_requests UInt64,&lt;br /&gt;
    bytes_sent UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((num_requests, bytes_sent))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date, country_code)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_by_distro_mv&lt;br /&gt;
TO vector.mirror_requests_agg_by_distro&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) AS date,&lt;br /&gt;
    country_code,&lt;br /&gt;
    sum(1) AS num_requests,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
GROUP BY distro, date, country_code;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We also want some stats for Canada specifically:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_canada&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    region_name LowCardinality(String),&lt;br /&gt;
    city String,&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    num_requests UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((bytes_sent, num_requests))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date, region_name, city)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_canada_mv&lt;br /&gt;
TO vector.mirror_requests_agg_canada&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) as date,&lt;br /&gt;
    region_name,&lt;br /&gt;
    city,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent,&lt;br /&gt;
    sum(1) AS num_requests&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
WHERE country_code = &#039;CA&#039;&lt;br /&gt;
GROUP BY distro, date, region_name, city;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We also want to keep stats just for the university:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_uw&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    num_requests UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((bytes_sent, num_requests))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_uw_mv&lt;br /&gt;
TO vector.mirror_requests_agg_uw&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) as date,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent,&lt;br /&gt;
    sum(1) AS num_requests&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
WHERE isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:129.97.0.0/112&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:10.0.0.0/104&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:172.16.0.0/108&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:192.168.0.0/112&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;2620:101:f000::/47&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;fd74:6b6a:8eca::/47&#039;)&lt;br /&gt;
GROUP BY distro, date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
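&lt;br /&gt;
The odd-looking prefixes are because the ip_address column stores IPv4 addresses in IPv4-mapped IPv6 form (::ffff:a.b.c.d), so an IPv4 /16 becomes a /112 (96 mapped bits + 16), a /8 becomes a /104, and so on. A quick sanity check:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT isIPAddressInRange(&#039;::ffff:129.97.1.2&#039;, &#039;::ffff:129.97.0.0/112&#039;);  -- returns 1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;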
&lt;br /&gt;
Finally, we&#039;ll store some stats for IP subnets:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_ip&lt;br /&gt;
(&lt;br /&gt;
    timestamp DateTime CODEC(Delta, ZSTD),&lt;br /&gt;
    cidr_start IPv6,&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    num_requests UInt64,&lt;br /&gt;
    bytes_sent UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((num_requests, bytes_sent))&lt;br /&gt;
PRIMARY KEY (timestamp, cidr_start, country_code)&lt;br /&gt;
TTL timestamp + toIntervalWeek(2);&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_ip_mv TO vector.mirror_requests_agg_ip AS&lt;br /&gt;
SELECT&lt;br /&gt;
    toStartOfFiveMinutes(timestamp) AS timestamp,&lt;br /&gt;
    IPv6CIDRToRange(ip_address, 120).1 AS cidr_start,&lt;br /&gt;
    country_code,&lt;br /&gt;
    sum(1) AS num_requests,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
GROUP BY&lt;br /&gt;
    timestamp,&lt;br /&gt;
    cidr_start,&lt;br /&gt;
    country_code;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
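&lt;br /&gt;
Each row covers a /120 (i.e. an IPv4 /24 for mapped addresses) in five-minute buckets. As a rough example, something like this should surface the busiest subnets over the past day:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    cidr_start,&lt;br /&gt;
    sum(num_requests) AS reqs,&lt;br /&gt;
    formatReadableSize(sum(bytes_sent)) AS traffic&lt;br /&gt;
FROM vector.mirror_requests_agg_ip&lt;br /&gt;
WHERE timestamp &amp;gt;= now() - INTERVAL 1 DAY&lt;br /&gt;
GROUP BY cidr_start&lt;br /&gt;
ORDER BY reqs DESC&lt;br /&gt;
LIMIT 10;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;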
&lt;br /&gt;
=== GeoIP database ===&lt;br /&gt;
We&#039;ll want to look up geographic information for the IP addresses in our data. To do this, we&#039;ll use the [https://dev.maxmind.com/geoip/geolite2-free-geolocation-data MaxMind GeoLite2 databases]. Syscom already has a MaxMind account; the password is stored in the usual place. Install the latest geoipupdate package from [https://github.com/maxmind/geoipupdate/releases here], then edit /etc/GeoIP.conf as necessary (use the syscom account ID and license key). Set &amp;lt;code&amp;gt;EditionIDs&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;GeoLite2-City&amp;lt;/code&amp;gt; only.&lt;br /&gt;
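&lt;br /&gt;
The relevant part of /etc/GeoIP.conf should end up looking something like this (placeholder values):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
AccountID 123456&lt;br /&gt;
LicenseKey 0123456789abcdef&lt;br /&gt;
EditionIDs GeoLite2-City&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;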
&lt;br /&gt;
We&#039;ll use a systemd timer to run the geoipupdate script periodically. Paste the following into /etc/systemd/system/geoipupdate.service:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=GeoIP Update&lt;br /&gt;
Documentation=https://dev.maxmind.com/geoip/updating-databases&lt;br /&gt;
After=network-online.target&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
ExecStart=/usr/bin/geoipupdate&lt;br /&gt;
Nice=19&lt;br /&gt;
IOSchedulingClass=idle&lt;br /&gt;
IOSchedulingPriority=7&lt;br /&gt;
ProtectSystem=strict&lt;br /&gt;
ReadWritePaths=/usr/share/GeoIP&lt;br /&gt;
ProtectHome=true&lt;br /&gt;
PrivateTmp=true&lt;br /&gt;
PrivateDevices=true&lt;br /&gt;
ProtectHostname=true&lt;br /&gt;
ProtectClock=true&lt;br /&gt;
ProtectKernelTunables=true&lt;br /&gt;
ProtectKernelModules=true&lt;br /&gt;
ProtectKernelLogs=true&lt;br /&gt;
ProtectControlGroups=true&lt;br /&gt;
LockPersonality=true&lt;br /&gt;
RestrictRealtime=true&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Run &amp;lt;code&amp;gt;systemctl daemon-reload&amp;lt;/code&amp;gt; and then &amp;lt;code&amp;gt;systemctl start geoipupdate&amp;lt;/code&amp;gt; to download the database for the first time.&lt;br /&gt;
&lt;br /&gt;
Now paste the following into /etc/systemd/system/geoipupdate.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Automatic GeoIP database update&lt;br /&gt;
Documentation=https://dev.maxmind.com/geoip/updating-databases&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
OnCalendar=monthly&lt;br /&gt;
RandomizedDelaySec=12h&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable geoipupdate.timer&lt;br /&gt;
systemctl start geoipupdate.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
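&lt;br /&gt;
To confirm that the timer is scheduled and the database landed where we expect:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl list-timers geoipupdate.timer&lt;br /&gt;
ls -l /usr/share/GeoIP/GeoLite2-City.mmdb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;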
&lt;br /&gt;
=== Vector ===&lt;br /&gt;
Vector allows you to create directed acyclic graphs (DAGs) for collecting and processing logs, which gives us a lot of flexibility. It also has a built-in scripting language, [https://vector.dev/docs/reference/vrl/ Vector Remap Language (VRL)], for slicing and dicing data. This allows us to remove fields which we don&#039;t need, add new fields which we do need, enrich an event with extra data, etc.&lt;br /&gt;
&lt;br /&gt;
Our data pipeline looks like this: Vector agents -&amp;gt; Vector aggregator -&amp;gt; ClickHouse. We use Grafana for visualization.&lt;br /&gt;
&lt;br /&gt;
We use mutual TLS between the agents and the aggregator to make sure that random people can&#039;t send us garbage data:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl req -newkey rsa:2048 -nodes -keyout aggregator.key -x509 -out aggregator.crt -days 36500&lt;br /&gt;
openssl req -newkey rsa:2048 -nodes -keyout agent.key -x509 -out agent.crt -days 36500&lt;br /&gt;
chown vector:vector *.crt *.key&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
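&lt;br /&gt;
These are plain self-signed certs, with each side pinning the other&#039;s cert as its CA; that&#039;s also why &amp;lt;code&amp;gt;verify_hostname&amp;lt;/code&amp;gt; is disabled in the configs below. To check when a cert expires:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl x509 -in aggregator.crt -noout -enddate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;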
&lt;br /&gt;
Here is what our vector.toml looks like on the general-use machines; currently, we only use it for collecting failed SSH login attempts.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[sources.source_sshd]&lt;br /&gt;
type = &amp;quot;journald&amp;quot;&lt;br /&gt;
include_units = [&amp;quot;ssh.service&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
[transforms.remap_sshd]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_sshd&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  parsed, err = parse_regex(&lt;br /&gt;
    .message, r&#039;^(?:Connection (?:closed|reset)|Disconnected) (?:by|from) (?:invalid|authenticating) user (?P&amp;lt;user&amp;gt;[^ ]+) (?P&amp;lt;ip&amp;gt;[0-9.a-f:]+)&#039;&lt;br /&gt;
  )&lt;br /&gt;
  if is_null(err) {&lt;br /&gt;
    . = {&lt;br /&gt;
      &amp;quot;username&amp;quot;: parsed.user,&lt;br /&gt;
      &amp;quot;ip_address&amp;quot;: parsed.ip,&lt;br /&gt;
      &amp;quot;host&amp;quot;: .host,&lt;br /&gt;
      &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
      &amp;quot;job&amp;quot;: &amp;quot;vector-sshd&amp;quot;&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.filter_sshd]&lt;br /&gt;
type = &amp;quot;filter&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;remap_sshd&amp;quot;]&lt;br /&gt;
condition = &#039;.job == &amp;quot;vector-sshd&amp;quot;&#039;&lt;br /&gt;
&lt;br /&gt;
[sinks.aggregator]&lt;br /&gt;
type = &amp;quot;vector&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;filter_sshd&amp;quot;]&lt;br /&gt;
address = &amp;quot;prometheus:5045&amp;quot;&lt;br /&gt;
  [sinks.aggregator.tls]&lt;br /&gt;
  enabled = true&lt;br /&gt;
  ca_file = &amp;quot;/etc/vector/aggregator.crt&amp;quot;&lt;br /&gt;
  crt_file = &amp;quot;/etc/vector/agent.crt&amp;quot;&lt;br /&gt;
  key_file = &amp;quot;/etc/vector/agent.key&amp;quot;&lt;br /&gt;
  verify_hostname = false&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
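&lt;br /&gt;
After editing the config, it&#039;s worth validating it before restarting the service:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
vector validate /etc/vector/vector.toml&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;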
&lt;br /&gt;
The agent on potassium-benzoate collects NGINX logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[sources.source_nginx]&lt;br /&gt;
type = &amp;quot;file&amp;quot;&lt;br /&gt;
include = [&amp;quot;/var/log/nginx/access.log&amp;quot;]&lt;br /&gt;
max_read_bytes = 65536&lt;br /&gt;
&lt;br /&gt;
[transforms.remap_nginx]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_nginx&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot; &lt;br /&gt;
  parsed_log, err = parse_nginx_log(.message, &amp;quot;combined&amp;quot;)&lt;br /&gt;
  status = parsed_log.status&lt;br /&gt;
  request = string!(parsed_log.request || &amp;quot;&amp;quot;)&lt;br /&gt;
  if is_null(err) &amp;amp;&amp;amp; status == 200 {&lt;br /&gt;
    parsed_path, err = parse_regex(request, r&#039;^GET /+(?P&amp;lt;distro&amp;gt;[^/? ]+)&#039;)&lt;br /&gt;
    distro = parsed_path.distro&lt;br /&gt;
    ignore = [&lt;br /&gt;
      &amp;quot;server-status&amp;quot;, &amp;quot;stats&amp;quot;, &amp;quot;robots.txt&amp;quot;,&lt;br /&gt;
      &amp;quot;include&amp;quot;, &amp;quot;pub&amp;quot;, &amp;quot;news&amp;quot;, &amp;quot;index.html&amp;quot;, &amp;quot;sync.json&amp;quot;, &amp;quot;ups&amp;quot;,&lt;br /&gt;
      &amp;quot;pool&amp;quot;, &amp;quot;dists&amp;quot;, &amp;quot;csclub.asc&amp;quot;, &amp;quot;csclub.gpg&amp;quot;&lt;br /&gt;
    ]&lt;br /&gt;
    if (&lt;br /&gt;
      is_null(err) &amp;amp;&amp;amp; !includes(ignore, distro) &amp;amp;&amp;amp; !contains(request, &amp;quot;..&amp;quot;) &amp;amp;&amp;amp;&lt;br /&gt;
      !starts_with(request, &amp;quot;#&amp;quot;) &amp;amp;&amp;amp; !starts_with(request, &amp;quot;%&amp;quot;) &amp;amp;&amp;amp; !starts_with(request, &amp;quot;.&amp;quot;)&lt;br /&gt;
    ) {&lt;br /&gt;
      . = {&lt;br /&gt;
        &amp;quot;distro&amp;quot;: distro,&lt;br /&gt;
        &amp;quot;user_agent&amp;quot;: parsed_log.agent,&lt;br /&gt;
        &amp;quot;ip_address&amp;quot;: parsed_log.client,&lt;br /&gt;
        &amp;quot;bytes_sent&amp;quot;: parsed_log.size,&lt;br /&gt;
        &amp;quot;timestamp&amp;quot;: parsed_log.timestamp,&lt;br /&gt;
        &amp;quot;job&amp;quot;: &amp;quot;vector-mirror&amp;quot;&lt;br /&gt;
      }&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
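&lt;br /&gt;
Remap programs like the one above can be tried out interactively: the &amp;lt;code&amp;gt;vector vrl&amp;lt;/code&amp;gt; subcommand starts a VRL REPL, where something like this should work:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ vector vrl&lt;br /&gt;
. = {&amp;quot;message&amp;quot;: &amp;quot;GET /ubuntu/dists/jammy/Release HTTP/1.1&amp;quot;}&lt;br /&gt;
parse_regex!(.message, r&#039;^GET /+(?P&amp;lt;distro&amp;gt;[^/? ]+)&#039;)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;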
&lt;br /&gt;
Finally, here&#039;s the aggregator config, which collects data from each agent and then inserts it into ClickHouse:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[enrichment_tables.enrich_geoip]                                                    &lt;br /&gt;
type = &amp;quot;geoip&amp;quot;                                                                      &lt;br /&gt;
path = &amp;quot;/usr/share/GeoIP/GeoLite2-City.mmdb&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[sources.source_agents]      &lt;br /&gt;
type = &amp;quot;vector&amp;quot;&lt;br /&gt;
address = &amp;quot;[::]:5045&amp;quot;&lt;br /&gt;
  [sources.source_agents.tls]&lt;br /&gt;
  enabled = true&lt;br /&gt;
  ca_file = &amp;quot;/etc/vector/agent.crt&amp;quot;&lt;br /&gt;
  crt_file = &amp;quot;/etc/vector/aggregator.crt&amp;quot;&lt;br /&gt;
  key_file = &amp;quot;/etc/vector/aggregator.key&amp;quot;&lt;br /&gt;
  verify_hostname = false&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_route]&lt;br /&gt;
type = &amp;quot;route&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_agents&amp;quot;]&lt;br /&gt;
route.sshd = &#039;.job == &amp;quot;vector-sshd&amp;quot;&#039;&lt;br /&gt;
route.mirror = &#039;.job == &amp;quot;vector-mirror&amp;quot;&#039;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_sshd]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route.sshd&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot; &lt;br /&gt;
  ipinfo = get_enrichment_table_record(&amp;quot;enrich_geoip&amp;quot;, {&amp;quot;ip&amp;quot;: .ip_address}) ?? {}&lt;br /&gt;
  if is_ipv4!(.ip_address) &amp;amp;&amp;amp; (ip_cidr_contains!(&amp;quot;10.0.0.0/8&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;172.16.0.0/12&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;192.168.0.0/16&amp;quot;, .ip_address)) {&lt;br /&gt;
    ipinfo.country_code = &amp;quot;CA&amp;quot;;&lt;br /&gt;
    ipinfo.region_name = &amp;quot;Ontario&amp;quot;;&lt;br /&gt;
    ipinfo.city_name = &amp;quot;Waterloo&amp;quot;;&lt;br /&gt;
  }&lt;br /&gt;
  . = {&lt;br /&gt;
    &amp;quot;host&amp;quot;: .host,&lt;br /&gt;
    &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
    &amp;quot;username&amp;quot;: .username,&lt;br /&gt;
    &amp;quot;ip_address&amp;quot;: ip_to_ipv6!(.ip_address),&lt;br /&gt;
    &amp;quot;country_code&amp;quot;: ipinfo.country_code || &amp;quot;&amp;quot;&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_mirror]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route.mirror&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  ipinfo = get_enrichment_table_record(&amp;quot;enrich_geoip&amp;quot;, {&amp;quot;ip&amp;quot;: .ip_address}) ?? {}&lt;br /&gt;
  if is_ipv4!(.ip_address) &amp;amp;&amp;amp; (ip_cidr_contains!(&amp;quot;10.0.0.0/8&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;172.16.0.0/12&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;192.168.0.0/16&amp;quot;, .ip_address)) {&lt;br /&gt;
    ipinfo.country_code = &amp;quot;CA&amp;quot;;&lt;br /&gt;
    ipinfo.region_name = &amp;quot;Ontario&amp;quot;;&lt;br /&gt;
    ipinfo.city_name = &amp;quot;Waterloo&amp;quot;;&lt;br /&gt;
  }&lt;br /&gt;
  . = {&lt;br /&gt;
    &amp;quot;distro&amp;quot;: .distro,&lt;br /&gt;
    &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
    &amp;quot;ip_address&amp;quot;: ip_to_ipv6!(.ip_address),&lt;br /&gt;
    &amp;quot;bytes_sent&amp;quot;: .bytes_sent,&lt;br /&gt;
    &amp;quot;user_agent&amp;quot;: .user_agent,&lt;br /&gt;
    &amp;quot;country_code&amp;quot;: ipinfo.country_code || &amp;quot;&amp;quot;,&lt;br /&gt;
    &amp;quot;region_name&amp;quot;: ipinfo.region_name || &amp;quot;&amp;quot;,&lt;br /&gt;
    &amp;quot;city&amp;quot;: ipinfo.city_name || &amp;quot;&amp;quot;&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_unmatched]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route._unmatched&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  log(&amp;quot;unrecognized job: &amp;quot; + string!(.job || &amp;quot;null&amp;quot;), level: &amp;quot;warn&amp;quot;)&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[sinks.sink_unmatched]&lt;br /&gt;
type = &amp;quot;blackhole&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_unmatched&amp;quot;]&lt;br /&gt;
print_interval_secs = 0&lt;br /&gt;
&lt;br /&gt;
[sinks.clickhouse_sshd]&lt;br /&gt;
type = &amp;quot;clickhouse&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_sshd&amp;quot;]&lt;br /&gt;
encoding.timestamp_format = &amp;quot;unix&amp;quot;&lt;br /&gt;
endpoint = &amp;quot;$CLICKHOUSE_ENDPOINT&amp;quot;&lt;br /&gt;
database = &amp;quot;$CLICKHOUSE_DATABASE&amp;quot;&lt;br /&gt;
table = &amp;quot;failed_ssh_logins&amp;quot;&lt;br /&gt;
  [sinks.clickhouse_sshd.auth]&lt;br /&gt;
  strategy = &amp;quot;basic&amp;quot;&lt;br /&gt;
  user = &amp;quot;$CLICKHOUSE_USER&amp;quot;&lt;br /&gt;
  password = &amp;quot;$CLICKHOUSE_PASSWORD&amp;quot;&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
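&lt;br /&gt;
The &amp;lt;code&amp;gt;$CLICKHOUSE_*&amp;lt;/code&amp;gt; values are interpolated from environment variables when Vector starts. If the packaged systemd unit is in use, they can be set in /etc/default/vector (its EnvironmentFile); hypothetical values would look like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CLICKHOUSE_ENDPOINT=http://localhost:8123&lt;br /&gt;
CLICKHOUSE_DATABASE=vector&lt;br /&gt;
CLICKHOUSE_USER=vector&lt;br /&gt;
CLICKHOUSE_PASSWORD=REPLACE_ME&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;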
&lt;br /&gt;
=== Beats, Logstash and Loki (old) ===&lt;br /&gt;
We previously used Elastic Beats, Logstash and Grafana Loki for collecting and storing logs. One day I tried to upgrade Logstash and it exploded so badly that I figured it would be easier to just switch to Vector instead.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;The sections below are kept for historical purposes only and are no longer accurate.&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use a combination of [https://www.elastic.co/beats/ Elastic Beats], [https://www.elastic.co/logstash/ Logstash] and [https://grafana.com/oss/loki/ Loki] for collecting, storing and querying our logs; for visualization, we use Grafana. Logstash and Loki are currently both running in the prometheus VM.&lt;br /&gt;
&lt;br /&gt;
The reason why I chose Loki over Elasticsearch is because Loki is &amp;lt;i&amp;gt;very&amp;lt;/i&amp;gt; space efficient with regards to storage. It also consumes way less RAM and CPU. This means that we can collect a lot of logs without worrying too much about resource usage.&lt;br /&gt;
&lt;br /&gt;
We have Journalbeat and/or Filebeat running on some of our machines to collect logs from sshd, Apache and NGINX. The Beats send these logs to Logstash, which does some pre-processing. The most useful contribution by Logstash is its GeoIP plugin, which allows us to enrich the logs with some geographical information from IP addresses (e.g. add city and country). Logstash sends these logs to Loki, and we can then view these from Grafana.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note&#039;&#039;&#039;: Sometimes the Loki output plugin for Logstash disappears after a reboot or an upgrade. If you see Logstash complaining about this in the journald logs, run this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /usr/share/logstash&lt;br /&gt;
bin/logstash-plugin install logstash-output-loki&lt;br /&gt;
systemctl restart logstash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
See [https://grafana.com/docs/loki/latest/clients/logstash/ here] for details.&lt;br /&gt;
&lt;br /&gt;
The language for querying logs in Loki is [https://grafana.com/docs/loki/latest/logql/ LogQL], which, syntactically, is very similar to PromQL. If you have already learned PromQL, then you should be able to pick up LogQL very easily. You can try out some LogQL queries from the &#039;Explore&#039; page on Grafana; make sure you toggle the data source to &#039;Loki&#039; in the top left corner. For the &#039;topk&#039; queries, you will also want to toggle &#039;Query type&#039; to &#039;Instant&#039; rather than &#039;Range&#039;.&lt;br /&gt;
&lt;br /&gt;
==== LogQL examples ====&lt;br /&gt;
This query returns the number of failed SSH login attempts per host for a given time range:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sum by (hostname) (&lt;br /&gt;
  count_over_time(&lt;br /&gt;
    {job=&amp;quot;logstash-sshd&amp;quot;} [$__range]&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that &amp;lt;code&amp;gt;$__range&amp;lt;/code&amp;gt; is a special [https://grafana.com/docs/grafana/latest/variables/variable-types/global-variables/ global variable] in Grafana which is equal to the time range in the top right corner of a chart.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here are the top 10 IP addresses from which failed SSH login attempts arrived, for a given host and time range:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(10,&lt;br /&gt;
  sum by (ip_address) (&lt;br /&gt;
    count_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-sshd&amp;quot;,hostname=&amp;quot;$hostname&amp;quot;} | json | __error__ = &amp;quot;&amp;quot;&lt;br /&gt;
      [$__range]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;$hostname&amp;lt;/code&amp;gt; is a dashboard variable, which can be configured from the dashboard&#039;s settings.&lt;br /&gt;
&lt;br /&gt;
I configured Logstash to send logs to Loki as JSON, but it&#039;s a rather hacky solution, so occasionally invalid JSON is sent.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
This query returns the number of HTTP requests for the top 15 distros on our mirror over the last hour:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(15,&lt;br /&gt;
  sum by (distro) (&lt;br /&gt;
    count_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-nginx&amp;quot;} | json | __error__ = &amp;quot;&amp;quot; | distro != &amp;quot;server-status&amp;quot;&lt;br /&gt;
      [1h]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
This query returns the total number of bytes sent over HTTP for the top 15 distros over the last hour. Note the use of the &amp;lt;code&amp;gt;unwrap&amp;lt;/code&amp;gt; operator.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(15,&lt;br /&gt;
  sum by (distro) (&lt;br /&gt;
    sum_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-nginx&amp;quot;} | json | __error__ = &amp;quot;&amp;quot; | distro != &amp;quot;server-status&amp;quot; | unwrap bytes&lt;br /&gt;
      [1h]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
You can see more examples on the Mirror Requests dashboard on Grafana.&lt;br /&gt;
&lt;br /&gt;
==== Avoid high cardinality ====&lt;br /&gt;
For both Prometheus and Loki, you must [https://prometheus.io/docs/practices/naming/#labels avoid high cardinality] labels at all costs. By high cardinality, I mean labels which can take on a very large number of values; for example, using a label to store IP addresses would be a very bad idea. This is because Prometheus and Loki use labels to store metrics/logs efficiently with compression; when two metrics have two different sets of labels, they cannot be stored together, which increases the storage space usage.&lt;br /&gt;
&lt;br /&gt;
With Loki, you can extract labels from your logs inside your query dynamically. One way to do this is with the &amp;lt;code&amp;gt;json&amp;lt;/code&amp;gt; operator; there are other ways to do this as well (see the LogQL docs). This basically means that we get infinite cardinality from our logs, the tradeoff being that queries may take longer to execute.&lt;br /&gt;
&lt;br /&gt;
Also, be very careful about what you send to Loki from Logstash - [https://grafana.com/docs/loki/latest/clients/logstash/#usage-and-configuration every field in a Logstash message becomes a Loki label]. Usage of the &amp;lt;code&amp;gt;prune&amp;lt;/code&amp;gt; command in Logstash is highly recommended.&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Observability&amp;diff=5184</id>
		<title>Observability</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Observability&amp;diff=5184"/>
		<updated>2023-12-16T02:05:16Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* ClickHouse */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are [https://www.oreilly.com/library/view/distributed-systems-observability/9781492033431/ch04.html three pillars of observability]: metrics, logging and tracing. We are only interested in the first two.&lt;br /&gt;
&lt;br /&gt;
== Metrics ==&lt;br /&gt;
All of our machines are, or at least should be, running the Prometheus node exporter. This exposes machine metrics (e.g. RAM used, disk space), which are scraped by the Prometheus server running at https://prometheus.csclub.uwaterloo.ca (currently a VM on phosphoric-acid). There are a few specialized exporters running on several other machines; a Postfix exporter is running on mail, an Apache exporter is running on caffeine, and an NGINX exporter is running on potassium-benzoate. There is also a custom exporter written by syscom running on potassium-benzoate for mirror stats.&lt;br /&gt;
&lt;br /&gt;
Most of the exporters use mutual TLS authentication with the Prometheus server. I set the expiration date for the TLS certs to 10 years. If you are reading this and it is 2031 or later, then go update the certs.&lt;br /&gt;
&lt;br /&gt;
I highly suggest becoming familiar with [https://prometheus.io/docs/prometheus/latest/querying/basics/ PromQL], the query language for Prometheus. You can run and visualize some queries at https://prometheus.csclub.uwaterloo.ca/prometheus. For example, here is a query to determine which machines are up or down:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
up{job=&amp;quot;node_exporter&amp;quot;}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Here&#039;s how we determine if a machine has NFS mounted. This will return 1 for machines which have NFS mounted, but will not return any records for machines which do not have NFS mounted. (We ignore the actual value of node_filesystem_device_error because it returns 1 for machines using Kerberized NFS.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
count by (instance) (node_filesystem_device_error{mountpoint=&amp;quot;/users&amp;quot;, fstype=&amp;quot;nfs&amp;quot;})&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Now this is a rather complicated expression which can return one of three values:&lt;br /&gt;
* 0: the machine is down&lt;br /&gt;
* 1: the machine is up, but NFS is not mounted&lt;br /&gt;
* 2: the machine is up and NFS is mounted&lt;br /&gt;
The [https://prometheus.io/docs/prometheus/latest/querying/operators/#logical-set-binary-operators or operator] in PromQL is key here.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sum by (instance) (&lt;br /&gt;
  (count by (instance) (node_filesystem_device_error{mountpoint=&amp;quot;/users&amp;quot;, fstype=&amp;quot;nfs&amp;quot;}))&lt;br /&gt;
  or up{job=&amp;quot;node_exporter&amp;quot;}&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
We also use [https://prometheus.io/docs/alerting/latest/alertmanager/ AlertManager] to send email alerts from Prometheus metrics. We should figure out how to also send messages to IRC or similar.&lt;br /&gt;
&lt;br /&gt;
We also use the [https://github.com/prometheus/blackbox_exporter Blackbox prober exporter] to check if some of our web-based services are up.&lt;br /&gt;
&lt;br /&gt;
We make some pretty charts on Grafana (https://prometheus.csclub.uwaterloo.ca) from PromQL queries. Grafana also has an &#039;Explore&#039; page where you can test out some queries before making chart panels from them.&lt;br /&gt;
&lt;br /&gt;
== Logging ==&lt;br /&gt;
We now use [https://vector.dev/ Vector] for collecting and transforming logs, and [https://clickhouse.com/ ClickHouse] for storing log data.&lt;br /&gt;
&lt;br /&gt;
=== ClickHouse ===&lt;br /&gt;
ClickHouse is a very fast OLAP database which has great documentation for storing and analyzing [https://clickhouse.com/use-cases/logging-and-metrics logging and metrics]. Unfortunately, the CPU on phosphoric-acid (which hosts the prometheus VM) is so old that when we try to install the official deb package, the following error occurs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Instruction check fail. The CPU does not support SSSE3 instruction set.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
So we&#039;re going to download the &amp;quot;compat&amp;quot; version instead:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /root&lt;br /&gt;
wget https://s3.amazonaws.com/clickhouse-builds/master/amd64compat/clickhouse&lt;br /&gt;
chmod +x clickhouse&lt;br /&gt;
./clickhouse install&lt;br /&gt;
rm clickhouse&lt;br /&gt;
wget -O /etc/systemd/system/clickhouse-server.service https://github.com/ClickHouse/ClickHouse/raw/master/packages/clickhouse-server.service&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable clickhouse-server&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, systemd limits the number of threads which a service can create, so we&#039;ll want to disable that. Run &amp;lt;code&amp;gt;systemctl edit clickhouse-server&amp;lt;/code&amp;gt; and paste the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Service]&lt;br /&gt;
TasksMax=infinity&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
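You can confirm the override took effect with:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl show clickhouse-server -p TasksMax&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;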
Next, paste the following into /etc/clickhouse-server/users.d/csclub-users.xml:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;clickhouse&amp;gt;&lt;br /&gt;
  &amp;lt;profiles&amp;gt;&lt;br /&gt;
    &amp;lt;default&amp;gt;&lt;br /&gt;
      &amp;lt;!-- disable logs (using too much disk space) --&amp;gt;&lt;br /&gt;
      &amp;lt;log_queries replace=&amp;quot;replace&amp;quot;&amp;gt;0&amp;lt;/log_queries&amp;gt;&lt;br /&gt;
      &amp;lt;log_query_threads replace=&amp;quot;replace&amp;quot;&amp;gt;0&amp;lt;/log_query_threads&amp;gt;&lt;br /&gt;
    &amp;lt;/default&amp;gt;&lt;br /&gt;
    &amp;lt;readonly&amp;gt;&lt;br /&gt;
      &amp;lt;!-- readonly=2 still lets Grafana change settings in its queries --&amp;gt;&lt;br /&gt;
      &amp;lt;readonly&amp;gt;2&amp;lt;/readonly&amp;gt;&lt;br /&gt;
    &amp;lt;/readonly&amp;gt;&lt;br /&gt;
  &amp;lt;/profiles&amp;gt;&lt;br /&gt;
  &amp;lt;users&amp;gt;&lt;br /&gt;
    &amp;lt;default&amp;gt;&lt;br /&gt;
      &amp;lt;!-- The default user should only be allowed to connect from localhost --&amp;gt;&lt;br /&gt;
      &amp;lt;networks&amp;gt;&lt;br /&gt;
        &amp;lt;ip&amp;gt;::1&amp;lt;/ip&amp;gt;&lt;br /&gt;
        &amp;lt;ip&amp;gt;127.0.0.1&amp;lt;/ip&amp;gt;&lt;br /&gt;
      &amp;lt;/networks&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Allow the default user to create new users --&amp;gt;&lt;br /&gt;
      &amp;lt;access_management&amp;gt;1&amp;lt;/access_management&amp;gt;&lt;br /&gt;
      &amp;lt;named_collection_control&amp;gt;1&amp;lt;/named_collection_control&amp;gt;&lt;br /&gt;
      &amp;lt;show_named_collections&amp;gt;1&amp;lt;/show_named_collections&amp;gt;&lt;br /&gt;
      &amp;lt;show_named_collections_secrets&amp;gt;1&amp;lt;/show_named_collections_secrets&amp;gt;&lt;br /&gt;
    &amp;lt;/default&amp;gt;&lt;br /&gt;
  &amp;lt;/users&amp;gt;&lt;br /&gt;
&amp;lt;/clickhouse&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then paste the following into /etc/clickhouse-server/config.d/zzz-csclub.xml (we need the zzz prefix because the configuration files are merged in alphabetical order, and we want ours to be applied last):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;clickhouse&amp;gt;&lt;br /&gt;
  &amp;lt;listen_host&amp;gt;127.0.0.1&amp;lt;/listen_host&amp;gt;&lt;br /&gt;
  &amp;lt;listen_host&amp;gt;::1&amp;lt;/listen_host&amp;gt;&lt;br /&gt;
  &amp;lt;logger&amp;gt;&lt;br /&gt;
    &amp;lt;level&amp;gt;information&amp;lt;/level&amp;gt;&lt;br /&gt;
    &amp;lt;size&amp;gt;100M&amp;lt;/size&amp;gt;&lt;br /&gt;
    &amp;lt;count&amp;gt;10&amp;lt;/count&amp;gt;&lt;br /&gt;
  &amp;lt;/logger&amp;gt;&lt;br /&gt;
  &amp;lt;mysql_port&amp;gt;&amp;lt;/mysql_port&amp;gt;&lt;br /&gt;
  &amp;lt;postgresql_port&amp;gt;&amp;lt;/postgresql_port&amp;gt;&lt;br /&gt;
&lt;br /&gt;
  &amp;lt;!-- disable logs (using too much disk space) --&amp;gt;&lt;br /&gt;
  &amp;lt;asynchronous_metric_log remove=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;metric_log remove=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;query_thread_log remove=&amp;quot;1&amp;quot; /&amp;gt;&lt;br /&gt;
  &amp;lt;query_log remove=&amp;quot;1&amp;quot; /&amp;gt;&lt;br /&gt;
  &amp;lt;query_views_log remove=&amp;quot;1&amp;quot; /&amp;gt;&lt;br /&gt;
  &amp;lt;part_log remove=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;session_log remove=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;text_log remove=&amp;quot;1&amp;quot; /&amp;gt;&lt;br /&gt;
  &amp;lt;trace_log remove=&amp;quot;1&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/clickhouse&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run &amp;lt;code&amp;gt;systemctl restart clickhouse-server&amp;lt;/code&amp;gt; and make sure that it&#039;s running.&lt;br /&gt;
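One quick way to check:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl status clickhouse-server&lt;br /&gt;
clickhouse-client --query &#039;SELECT version()&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;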
&lt;br /&gt;
==== Schema ====&lt;br /&gt;
Run &amp;lt;code&amp;gt;clickhouse-client&amp;lt;/code&amp;gt; to get a SQL shell. First we need to create a new database and some users:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE DATABASE vector;&lt;br /&gt;
CREATE USER vector IDENTIFIED BY &#039;REPLACE_ME&#039;;&lt;br /&gt;
GRANT ALL ON vector.* TO vector;&lt;br /&gt;
CREATE USER grafana IDENTIFIED BY &#039;REPLACE_ME&#039; SETTINGS PROFILE &#039;readonly&#039;;&lt;br /&gt;
GRANT SHOW DATABASES, SHOW TABLES, SELECT ON *.* TO grafana;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
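To sanity-check the new accounts (passwords here are placeholders):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
clickhouse-client --user vector --password REPLACE_ME --query &#039;SELECT currentUser()&#039;&lt;br /&gt;
clickhouse-client --user grafana --password REPLACE_ME --query &#039;SHOW DATABASES&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;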
&lt;br /&gt;
In some of our tables, we&#039;ll store the two-letter country code instead of a country&#039;s full name to save space. So we&#039;ll create a [https://clickhouse.com/docs/en/sql-reference/dictionaries dictionary] so that we can look up a country&#039;s full name. Exit the SQL shell, then download the CSV file:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget -O /var/lib/clickhouse/user_files/country_codes.csv &#039;https://datahub.io/core/country-list/r/data.csv&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run &amp;lt;code&amp;gt;clickhouse-client&amp;lt;/code&amp;gt; and create the dictionary:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE DICTIONARY vector.country_codes_dictionary&lt;br /&gt;
(&lt;br /&gt;
    Name String,&lt;br /&gt;
    Code String&lt;br /&gt;
)&lt;br /&gt;
PRIMARY KEY Code&lt;br /&gt;
SOURCE(FILE(path &#039;/var/lib/clickhouse/user_files/country_codes.csv&#039; FORMAT &#039;CSVWithNames&#039;))&lt;br /&gt;
LIFETIME(MIN 0 MAX 0)&lt;br /&gt;
LAYOUT(HASHED_ARRAY());&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Perform a SELECT to force the dictionary to load:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT * FROM vector.country_codes_dictionary;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
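Dictionaries can also be queried like ordinary tables, which makes spot-checking easy:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT Name FROM vector.country_codes_dictionary WHERE Code = &#039;CA&#039;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;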
&lt;br /&gt;
Now we need to create the tables for storing our actual log data (after they are transformed by Vector).&lt;br /&gt;
Create a table for failed SSH logins:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.failed_ssh_logins&lt;br /&gt;
(&lt;br /&gt;
    host LowCardinality(String),&lt;br /&gt;
    timestamp DateTime,&lt;br /&gt;
    ip_address IPv6,&lt;br /&gt;
    username String,&lt;br /&gt;
    country_code LowCardinality(String)&lt;br /&gt;
)&lt;br /&gt;
ENGINE = MergeTree()&lt;br /&gt;
PRIMARY KEY (host, timestamp)&lt;br /&gt;
TTL timestamp + INTERVAL 1 MONTH DELETE;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
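&lt;br /&gt;
As a rough example, something like this should show which hosts are getting hammered:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    host,&lt;br /&gt;
    count() AS attempts,&lt;br /&gt;
    uniqExact(ip_address) AS distinct_ips&lt;br /&gt;
FROM vector.failed_ssh_logins&lt;br /&gt;
WHERE timestamp &amp;gt;= now() - INTERVAL 1 DAY&lt;br /&gt;
GROUP BY host&lt;br /&gt;
ORDER BY attempts DESC;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;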
&lt;br /&gt;
Create a table for storing mirror requests:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    timestamp DateTime CODEC(Delta, ZSTD),&lt;br /&gt;
    ip_address IPv6,&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    user_agent String,&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    region_name String,&lt;br /&gt;
    city String&lt;br /&gt;
)&lt;br /&gt;
ENGINE = MergeTree()&lt;br /&gt;
PRIMARY KEY (distro, timestamp, country_code, region_name, city)&lt;br /&gt;
TTL timestamp + INTERVAL 1 WEEK DELETE;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
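&lt;br /&gt;
The base table only keeps a week of raw rows (see the TTL), so it&#039;s mainly useful for ad-hoc queries, e.g.:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    country_code,&lt;br /&gt;
    count() AS reqs,&lt;br /&gt;
    formatReadableSize(sum(bytes_sent)) AS traffic&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
WHERE timestamp &amp;gt;= now() - INTERVAL 1 HOUR&lt;br /&gt;
GROUP BY country_code&lt;br /&gt;
ORDER BY reqs DESC&lt;br /&gt;
LIMIT 10;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;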
&lt;br /&gt;
One of ClickHouse&#039;s great features is [https://clickhouse.com/docs/en/guides/developer/cascading-materialized-views Materialized Views]. These allow us to automatically &amp;quot;forward&amp;quot; data from one table to another, and the second table can use a different storage engine to aggregate data and save space.&lt;br /&gt;
&lt;br /&gt;
We want to calculate the total number of requests and bytes sent for each distro, so let&#039;s create a table and view for that:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_by_distro&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    num_requests UInt64,&lt;br /&gt;
    bytes_sent UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((num_requests, bytes_sent))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date, country_code)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_by_distro_mv&lt;br /&gt;
TO vector.mirror_requests_agg_by_distro&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) AS date,&lt;br /&gt;
    country_code,&lt;br /&gt;
    sum(1) AS num_requests,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
GROUP BY distro, date, country_code;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We also want some stats for Canada specifically:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_canada&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    region_name LowCardinality(String),&lt;br /&gt;
    city String,&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    num_requests UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((bytes_sent, num_requests))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date, region_name, city)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_canada_mv&lt;br /&gt;
TO vector.mirror_requests_agg_canada&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) AS date,&lt;br /&gt;
    region_name,&lt;br /&gt;
    city,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent,&lt;br /&gt;
    sum(1) AS num_requests&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
WHERE country_code = &#039;CA&#039;&lt;br /&gt;
GROUP BY distro, date, region_name, city;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We also want to keep stats just for the university:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_uw&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    num_requests UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((bytes_sent, num_requests))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_uw_mv&lt;br /&gt;
TO vector.mirror_requests_agg_uw&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) AS date,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent,&lt;br /&gt;
    sum(1) AS num_requests&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
WHERE isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:129.97.0.0/112&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:10.0.0.0/104&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:172.16.0.0/108&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:192.168.0.0/112&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;2620:101:f000::/47&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;fd74:6b6a:8eca::/47&#039;)&lt;br /&gt;
GROUP BY distro, date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, we&#039;ll store some stats for IP subnets:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_ip&lt;br /&gt;
(&lt;br /&gt;
    timestamp DateTime CODEC(Delta, ZSTD),&lt;br /&gt;
    cidr_start IPv6,&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    num_requests UInt64,&lt;br /&gt;
    bytes_sent UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((num_requests, bytes_sent))&lt;br /&gt;
PRIMARY KEY (timestamp, cidr_start, country_code)&lt;br /&gt;
TTL timestamp + toIntervalWeek(2);&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_ip_mv TO vector.mirror_requests_agg_ip AS&lt;br /&gt;
SELECT&lt;br /&gt;
    toStartOfFiveMinutes(timestamp) AS timestamp,&lt;br /&gt;
    IPv6CIDRToRange(ip_address, 120).1 AS cidr_start,&lt;br /&gt;
    country_code,&lt;br /&gt;
    sum(1) AS num_requests,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
GROUP BY&lt;br /&gt;
    timestamp,&lt;br /&gt;
    cidr_start,&lt;br /&gt;
    country_code;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== GeoIP database ===&lt;br /&gt;
We&#039;ll want to look up geographic information for the IP addresses in our data. To do this, we&#039;ll use the [https://dev.maxmind.com/geoip/geolite2-free-geolocation-data MaxMind GeoLite2 databases]. Syscom already has a MaxMind account; the password is stored in the usual place. Install the latest geoipupdate package from [https://github.com/maxmind/geoipupdate/releases here], then edit /etc/GeoIP.conf as necessary (use the syscom account ID and license key). Set &amp;lt;code&amp;gt;EditionIDs&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;GeoLite2-City&amp;lt;/code&amp;gt; only.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll use a systemd timer to run the geoipupdate script periodically. Paste the following into /etc/systemd/system/geoipupdate.service:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=GeoIP Update&lt;br /&gt;
Documentation=https://dev.maxmind.com/geoip/updating-databases&lt;br /&gt;
After=network-online.target&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
ExecStart=/usr/bin/geoipupdate&lt;br /&gt;
Nice=19&lt;br /&gt;
IOSchedulingClass=idle&lt;br /&gt;
IOSchedulingPriority=7&lt;br /&gt;
ProtectSystem=strict&lt;br /&gt;
ReadWritePaths=/usr/share/GeoIP&lt;br /&gt;
ProtectHome=true&lt;br /&gt;
PrivateTmp=true&lt;br /&gt;
PrivateDevices=true&lt;br /&gt;
ProtectHostname=true&lt;br /&gt;
ProtectClock=true&lt;br /&gt;
ProtectKernelTunables=true&lt;br /&gt;
ProtectKernelModules=true&lt;br /&gt;
ProtectKernelLogs=true&lt;br /&gt;
ProtectControlGroups=true&lt;br /&gt;
LockPersonality=true&lt;br /&gt;
RestrictRealtime=true&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Run &amp;lt;code&amp;gt;systemctl daemon-reload&amp;lt;/code&amp;gt; and then &amp;lt;code&amp;gt;systemctl start geoipupdate&amp;lt;/code&amp;gt; to download the database for the first time.&lt;br /&gt;
&lt;br /&gt;
Now paste the following into /etc/systemd/system/geoipupdate.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Automatic GeoIP database update&lt;br /&gt;
Documentation=https://dev.maxmind.com/geoip/updating-databases&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
OnCalendar=monthly&lt;br /&gt;
RandomizedDelaySec=12h&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable geoipupdate.timer&lt;br /&gt;
systemctl start geoipupdate.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Vector ===&lt;br /&gt;
Vector allows you to create directed acyclic graphs (DAGs) for collecting and processing logs, which gives us a lot of flexibility. It also has a built-in scripting language, [https://vector.dev/docs/reference/vrl/ Vector Remap Language (VRL)], for slicing and dicing data. This allows us to remove fields which we don&#039;t need, add new fields which we do need, enrich an event with extra data, etc.&lt;br /&gt;
&lt;br /&gt;
Our data pipeline looks like this: Vector agents -&amp;gt; Vector aggregator -&amp;gt; ClickHouse. We use Grafana for visualization.&lt;br /&gt;
&lt;br /&gt;
We use mutual TLS between the agents and the aggregator to make sure that random people can&#039;t send us garbage data:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl req -newkey rsa:2048 -nodes -keyout aggregator.key -x509 -out aggregator.crt -days 36500&lt;br /&gt;
openssl req -newkey rsa:2048 -nodes -keyout agent.key -x509 -out agent.crt -days 36500&lt;br /&gt;
chown vector:vector *.crt *.key&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here is what our vector.toml looks like on the general-use machines; currently, we only use it for collecting failed SSH login attempts.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[sources.source_sshd]&lt;br /&gt;
type = &amp;quot;journald&amp;quot;&lt;br /&gt;
include_units = [&amp;quot;ssh.service&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
[transforms.remap_sshd]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_sshd&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  parsed, err = parse_regex(&lt;br /&gt;
    .message, r&#039;^(?:Connection (?:closed|reset)|Disconnected) (?:by|from) (?:invalid|authenticating) user (?P&amp;lt;user&amp;gt;[^ ]+) (?P&amp;lt;ip&amp;gt;[0-9.a-f:]+)&#039;&lt;br /&gt;
  )&lt;br /&gt;
  if is_null(err) {&lt;br /&gt;
    . = {&lt;br /&gt;
      &amp;quot;username&amp;quot;: parsed.user,&lt;br /&gt;
      &amp;quot;ip_address&amp;quot;: parsed.ip,&lt;br /&gt;
      &amp;quot;host&amp;quot;: .host,&lt;br /&gt;
      &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
      &amp;quot;job&amp;quot;: &amp;quot;vector-sshd&amp;quot;&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.filter_sshd]&lt;br /&gt;
type = &amp;quot;filter&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;remap_sshd&amp;quot;]&lt;br /&gt;
condition = &#039;.job == &amp;quot;vector-sshd&amp;quot;&#039;&lt;br /&gt;
&lt;br /&gt;
[sinks.aggregator]&lt;br /&gt;
type = &amp;quot;vector&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;filter_sshd&amp;quot;]&lt;br /&gt;
address = &amp;quot;prometheus:5045&amp;quot;&lt;br /&gt;
  [sinks.aggregator.tls]&lt;br /&gt;
  enabled = true&lt;br /&gt;
  ca_file = &amp;quot;/etc/vector/aggregator.crt&amp;quot;&lt;br /&gt;
  crt_file = &amp;quot;/etc/vector/agent.crt&amp;quot;&lt;br /&gt;
  key_file = &amp;quot;/etc/vector/agent.key&amp;quot;&lt;br /&gt;
  verify_hostname = false&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The agent on potassium-benzoate collects NGINX logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[sources.source_nginx]&lt;br /&gt;
type = &amp;quot;file&amp;quot;&lt;br /&gt;
include = [&amp;quot;/var/log/nginx/access.log&amp;quot;]&lt;br /&gt;
max_read_bytes = 65536&lt;br /&gt;
&lt;br /&gt;
[transforms.remap_nginx]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_nginx&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot; &lt;br /&gt;
  parsed_log, err = parse_nginx_log(.message, &amp;quot;combined&amp;quot;)&lt;br /&gt;
  status = parsed_log.status&lt;br /&gt;
  request = string!(parsed_log.request || &amp;quot;&amp;quot;)&lt;br /&gt;
  if is_null(err) &amp;amp;&amp;amp; status == 200 {&lt;br /&gt;
    parsed_path, err = parse_regex(request, r&#039;^GET /+(?P&amp;lt;distro&amp;gt;[^/? ]+)&#039;)&lt;br /&gt;
    distro = parsed_path.distro&lt;br /&gt;
    ignore = [&lt;br /&gt;
      &amp;quot;server-status&amp;quot;, &amp;quot;stats&amp;quot;, &amp;quot;robots.txt&amp;quot;,&lt;br /&gt;
      &amp;quot;include&amp;quot;, &amp;quot;pub&amp;quot;, &amp;quot;news&amp;quot;, &amp;quot;index.html&amp;quot;, &amp;quot;sync.json&amp;quot;, &amp;quot;ups&amp;quot;,&lt;br /&gt;
      &amp;quot;pool&amp;quot;, &amp;quot;dists&amp;quot;, &amp;quot;csclub.asc&amp;quot;, &amp;quot;csclub.gpg&amp;quot;&lt;br /&gt;
    ]&lt;br /&gt;
    if (&lt;br /&gt;
      is_null(err) &amp;amp;&amp;amp; !includes(ignore, distro) &amp;amp;&amp;amp; !contains(request, &amp;quot;..&amp;quot;) &amp;amp;&amp;amp;&lt;br /&gt;
      !starts_with(request, &amp;quot;#&amp;quot;) &amp;amp;&amp;amp; !starts_with(request, &amp;quot;%&amp;quot;) &amp;amp;&amp;amp; !starts_with(request, &amp;quot;.&amp;quot;)&lt;br /&gt;
    ) {&lt;br /&gt;
      . = {&lt;br /&gt;
        &amp;quot;distro&amp;quot;: distro,&lt;br /&gt;
        &amp;quot;user_agent&amp;quot;: parsed_log.agent,&lt;br /&gt;
        &amp;quot;ip_address&amp;quot;: parsed_log.client,&lt;br /&gt;
        &amp;quot;bytes_sent&amp;quot;: parsed_log.size,&lt;br /&gt;
        &amp;quot;timestamp&amp;quot;: parsed_log.timestamp,&lt;br /&gt;
        &amp;quot;job&amp;quot;: &amp;quot;vector-mirror&amp;quot;&lt;br /&gt;
      }&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, here&#039;s the aggregator config, which collects data from each agent and then inserts it into ClickHouse:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[enrichment_tables.enrich_geoip]                                                    &lt;br /&gt;
type = &amp;quot;geoip&amp;quot;                                                                      &lt;br /&gt;
path = &amp;quot;/usr/share/GeoIP/GeoLite2-City.mmdb&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[sources.source_agents]      &lt;br /&gt;
type = &amp;quot;vector&amp;quot;&lt;br /&gt;
address = &amp;quot;[::]:5045&amp;quot;&lt;br /&gt;
  [sources.source_agents.tls]&lt;br /&gt;
  enabled = true&lt;br /&gt;
  ca_file = &amp;quot;/etc/vector/agent.crt&amp;quot;&lt;br /&gt;
  crt_file = &amp;quot;/etc/vector/aggregator.crt&amp;quot;&lt;br /&gt;
  key_file = &amp;quot;/etc/vector/aggregator.key&amp;quot;&lt;br /&gt;
  verify_hostname = false&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_route]&lt;br /&gt;
type = &amp;quot;route&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_agents&amp;quot;]&lt;br /&gt;
route.sshd = &#039;.job == &amp;quot;vector-sshd&amp;quot;&#039;&lt;br /&gt;
route.mirror = &#039;.job == &amp;quot;vector-mirror&amp;quot;&#039;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_sshd]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route.sshd&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot; &lt;br /&gt;
  ipinfo = get_enrichment_table_record(&amp;quot;enrich_geoip&amp;quot;, {&amp;quot;ip&amp;quot;: .ip_address}) ?? {}&lt;br /&gt;
  if is_ipv4!(.ip_address) &amp;amp;&amp;amp; (ip_cidr_contains!(&amp;quot;10.0.0.0/8&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;172.16.0.0/12&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;192.168.0.0/16&amp;quot;, .ip_address)) {&lt;br /&gt;
    ipinfo.country_code = &amp;quot;CA&amp;quot;;&lt;br /&gt;
    ipinfo.region_name = &amp;quot;Ontario&amp;quot;;&lt;br /&gt;
    ipinfo.city_name = &amp;quot;Waterloo&amp;quot;;&lt;br /&gt;
  }&lt;br /&gt;
  . = {&lt;br /&gt;
    &amp;quot;host&amp;quot;: .host,&lt;br /&gt;
    &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
    &amp;quot;username&amp;quot;: .username,&lt;br /&gt;
    &amp;quot;ip_address&amp;quot;: ip_to_ipv6!(.ip_address),&lt;br /&gt;
    &amp;quot;country_code&amp;quot;: ipinfo.country_code || &amp;quot;&amp;quot;&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_mirror]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route.mirror&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  ipinfo = get_enrichment_table_record(&amp;quot;enrich_geoip&amp;quot;, {&amp;quot;ip&amp;quot;: .ip_address}) ?? {}&lt;br /&gt;
  if is_ipv4!(.ip_address) &amp;amp;&amp;amp; (ip_cidr_contains!(&amp;quot;10.0.0.0/8&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;172.16.0.0/12&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;192.168.0.0/16&amp;quot;, .ip_address)) {&lt;br /&gt;
    ipinfo.country_code = &amp;quot;CA&amp;quot;;&lt;br /&gt;
    ipinfo.region_name = &amp;quot;Ontario&amp;quot;;&lt;br /&gt;
    ipinfo.city_name = &amp;quot;Waterloo&amp;quot;;&lt;br /&gt;
  }&lt;br /&gt;
  . = {&lt;br /&gt;
    &amp;quot;distro&amp;quot;: .distro,&lt;br /&gt;
    &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
    &amp;quot;ip_address&amp;quot;: ip_to_ipv6!(.ip_address),&lt;br /&gt;
    &amp;quot;bytes_sent&amp;quot;: .bytes_sent,&lt;br /&gt;
    &amp;quot;user_agent&amp;quot;: .user_agent,&lt;br /&gt;
    &amp;quot;country_code&amp;quot;: ipinfo.country_code || &amp;quot;&amp;quot;,&lt;br /&gt;
    &amp;quot;region_name&amp;quot;: ipinfo.region_name || &amp;quot;&amp;quot;,&lt;br /&gt;
    &amp;quot;city&amp;quot;: ipinfo.city_name || &amp;quot;&amp;quot;&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_unmatched]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route._unmatched&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  log(&amp;quot;unrecognized job: &amp;quot; + string!(.job || &amp;quot;null&amp;quot;), level: &amp;quot;warn&amp;quot;)&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[sinks.sink_unmatched]&lt;br /&gt;
type = &amp;quot;blackhole&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_unmatched&amp;quot;]&lt;br /&gt;
print_interval_secs = 0&lt;br /&gt;
&lt;br /&gt;
[sinks.clickhouse_sshd]&lt;br /&gt;
type = &amp;quot;clickhouse&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_sshd&amp;quot;]&lt;br /&gt;
encoding.timestamp_format = &amp;quot;unix&amp;quot;&lt;br /&gt;
endpoint = &amp;quot;$CLICKHOUSE_ENDPOINT&amp;quot;&lt;br /&gt;
database = &amp;quot;$CLICKHOUSE_DATABASE&amp;quot;&lt;br /&gt;
table = &amp;quot;failed_ssh_logins&amp;quot;&lt;br /&gt;
  [sinks.clickhouse_sshd.auth]&lt;br /&gt;
  strategy = &amp;quot;basic&amp;quot;&lt;br /&gt;
  user = &amp;quot;$CLICKHOUSE_USER&amp;quot;&lt;br /&gt;
  password = &amp;quot;$CLICKHOUSE_PASSWORD&amp;quot;&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Beats, Logstash and Loki (old) ===&lt;br /&gt;
We previously used Elastic Beats, Logstash and Grafana Loki for collecting and storing logs. One day I tried to upgrade Logstash and it exploded so badly that I figured it would be easier to just switch to Vector instead.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;The sections below are kept for historical purposes only and are no longer accurate.&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use a combination of [https://www.elastic.co/beats/ Elastic Beats], [https://www.elastic.co/logstash/ Logstash] and [https://grafana.com/oss/loki/ Loki] for collecting, storing and querying our logs; for visualization, we use Grafana. Logstash and Loki are currently both running in the prometheus VM.&lt;br /&gt;
&lt;br /&gt;
The reason why I chose Loki over Elasticsearch is because Loki is &amp;lt;i&amp;gt;very&amp;lt;/i&amp;gt; space efficient with regards to storage. It also consumes way less RAM and CPU. This means that we can collect a lot of logs without worrying too much about resource usage.&lt;br /&gt;
&lt;br /&gt;
We have Journalbeat and/or Filebeat running on some of our machines to collect logs from sshd, Apache and NGINX. The Beats send these logs to Logstash, which does some pre-processing. The most useful contribution by Logstash is its GeoIP plugin, which allows us to enrich the logs with some geographical information from IP addresses (e.g. add city and country). Logstash sends these logs to Loki, and we can then view these from Grafana.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note&#039;&#039;&#039;: Sometimes the Loki output plugin for Logstash disappears after a reboot or an upgrade. If you see Logstash complaining about this in the journald logs, run this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /usr/share/logstash&lt;br /&gt;
bin/logstash-plugin install logstash-output-loki&lt;br /&gt;
systemctl restart logstash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
See [https://grafana.com/docs/loki/latest/clients/logstash/ here] for details.&lt;br /&gt;
&lt;br /&gt;
The language for querying logs in Loki is [https://grafana.com/docs/loki/latest/logql/ LogQL], which, syntactically, is very similar to PromQL. If you have already learned PromQL, then you should be able to pick up LogQL very easily. You can try out some LogQL queries from the &#039;Explore&#039; page on Grafana; make sure you toggle the data source to &#039;Loki&#039; in the top left corner. For the &#039;topk&#039; queries, you will also want to toggle &#039;Query type&#039; to &#039;Instant&#039; rather than &#039;Range&#039;.&lt;br /&gt;
&lt;br /&gt;
==== LogQL examples ====&lt;br /&gt;
This query returns the number of failed SSH login attempts per host for a given time range:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sum by (hostname) (&lt;br /&gt;
  count_over_time(&lt;br /&gt;
    {job=&amp;quot;logstash-sshd&amp;quot;} [$__range]&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that &amp;lt;code&amp;gt;$__range&amp;lt;/code&amp;gt; is a special [https://grafana.com/docs/grafana/latest/variables/variable-types/global-variables/ global variable] in Grafana which is equal to the time range in the top right corner of a chart.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here are the top 10 IP addresses from which failed SSH login attempts arrived, for a given host and time range:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(10,&lt;br /&gt;
  sum by (ip_address) (&lt;br /&gt;
    count_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-sshd&amp;quot;,hostname=&amp;quot;$hostname&amp;quot;} | json | __error__ = &amp;quot;&amp;quot;&lt;br /&gt;
      [$__range]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;$hostname&amp;lt;/code&amp;gt; is a chart variable, which can be configured from a chart&#039;s settings.&lt;br /&gt;
&lt;br /&gt;
I configured Logstash to send logs to Loki as JSON, but it&#039;s a rather hacky solution, so occasionally invalid JSON is sent.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here is the number of HTTP requests for each of the top 15 distros on our mirror over the last hour:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(15,&lt;br /&gt;
  sum by (distro) (&lt;br /&gt;
    count_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-nginx&amp;quot;} | json | __error__ = &amp;quot;&amp;quot; | distro != &amp;quot;server-status&amp;quot;&lt;br /&gt;
      [1h]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here is the total number of bytes sent over HTTP for the top 15 distros over the last hour. Note the use of the &amp;lt;code&amp;gt;unwrap&amp;lt;/code&amp;gt; operator.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(15,&lt;br /&gt;
  sum by (distro) (&lt;br /&gt;
    sum_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-nginx&amp;quot;} | json | __error__ = &amp;quot;&amp;quot; | distro != &amp;quot;server-status&amp;quot; | unwrap bytes&lt;br /&gt;
      [1h]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
You can see more examples on the Mirror Requests dashboard on Grafana.&lt;br /&gt;
&lt;br /&gt;
==== Avoid high cardinality ====&lt;br /&gt;
For both Prometheus and Loki, you must [https://prometheus.io/docs/practices/naming/#labels avoid high cardinality] labels at all costs. By high cardinality, I mean labels which can take on a very large number of values; for example, using a label to store IP addresses would be a very bad idea. This is because Prometheus and Loki index and compress data by label set; every distinct combination of label values creates a separate time series (or log stream in Loki), so a high-cardinality label multiplies the number of series and drives up storage usage.&lt;br /&gt;
&lt;br /&gt;
With Loki, you can extract labels from your logs inside your query dynamically. One way to do this is with the &amp;lt;code&amp;gt;json&amp;lt;/code&amp;gt; operator; there are other ways to do this as well (see the LogQL docs). This basically means that we get infinite cardinality from our logs, the tradeoff being that queries may take longer to execute.&lt;br /&gt;
&lt;br /&gt;
Also, be very careful about what you send to Loki from Logstash - [https://grafana.com/docs/loki/latest/clients/logstash/#usage-and-configuration every field in a Logstash message becomes a Loki label]. Usage of the &amp;lt;code&amp;gt;prune&amp;lt;/code&amp;gt; command in Logstash is highly recommended.&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Observability&amp;diff=5183</id>
		<title>Observability</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Observability&amp;diff=5183"/>
		<updated>2023-12-16T02:04:04Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* ClickHouse */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are [https://www.oreilly.com/library/view/distributed-systems-observability/9781492033431/ch04.html three pillars of observability]: metrics, logging and tracing. We are only interested in the first two.&lt;br /&gt;
&lt;br /&gt;
== Metrics ==&lt;br /&gt;
All of our machines are, or at least should be, running the Prometheus node exporter. This collects and sends machine metrics (e.g. RAM used, disk space) to the Prometheus server running at https://prometheus.csclub.uwaterloo.ca (currently a VM on phosphoric-acid). There are a few specialized exporters running on several other machines; a Postfix exporter is running on mail, an Apache exporter is running on caffeine, and an NGINX exporter is running on potassium-benzoate. There is also a custom exporter written by syscom running on potassium-benzoate for mirror stats.&lt;br /&gt;
&lt;br /&gt;
Most of the exporters use mutual TLS authentication with the Prometheus server. I set the expiration date for the TLS certs to 10 years. If you are reading this and it is 2031 or later, then go update the certs.&lt;br /&gt;
&lt;br /&gt;
I highly suggest becoming familiar with [https://prometheus.io/docs/prometheus/latest/querying/basics/ PromQL], the query language for Prometheus. You can run and visualize some queries at https://prometheus.csclub.uwaterloo.ca/prometheus. For example, here is a query to determine which machines are up or down:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
up{job=&amp;quot;node_exporter&amp;quot;}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Here&#039;s how we determine if a machine has NFS mounted. This will return 1 for machines which have NFS mounted, but will not return any records for machines which do not have NFS mounted. (We ignore the actual value of node_filesystem_device_error because it returns 1 for machines using Kerberized NFS.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
count by (instance) (node_filesystem_device_error{mountpoint=&amp;quot;/users&amp;quot;, fstype=&amp;quot;nfs&amp;quot;})&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Now this is a rather complicated expression which can return one of three values:&lt;br /&gt;
* 0: the machine is down&lt;br /&gt;
* 1: the machine is up, but NFS is not mounted&lt;br /&gt;
* 2: the machine is up and NFS is mounted&lt;br /&gt;
The [https://prometheus.io/docs/prometheus/latest/querying/operators/#logical-set-binary-operators or operator] in PromQL is key here.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sum by (instance) (&lt;br /&gt;
  (count by (instance) (node_filesystem_device_error{mountpoint=&amp;quot;/users&amp;quot;, fstype=&amp;quot;nfs&amp;quot;}))&lt;br /&gt;
  or up{job=&amp;quot;node_exporter&amp;quot;}&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
We also use [https://prometheus.io/docs/alerting/latest/alertmanager/ AlertManager] to send email alerts from Prometheus metrics. We should figure out how to also send messages to IRC or similar.&lt;br /&gt;
&lt;br /&gt;
We also use the [https://github.com/prometheus/blackbox_exporter Blackbox prober exporter] to check if some of our web-based services are up.&lt;br /&gt;
&lt;br /&gt;
We make some pretty charts on Grafana (https://prometheus.csclub.uwaterloo.ca) from PromQL queries. Grafana also has an &#039;Explore&#039; page where you can test out some queries before making chart panels from them.&lt;br /&gt;
&lt;br /&gt;
== Logging ==&lt;br /&gt;
We now use [https://vector.dev/ Vector] for collecting and transforming logs, and [https://clickhouse.com/ ClickHouse] for storing log data.&lt;br /&gt;
&lt;br /&gt;
=== ClickHouse ===&lt;br /&gt;
ClickHouse is a very fast OLAP database which has great documentation for storing and analyzing [https://clickhouse.com/use-cases/logging-and-metrics logging and metrics]. Unfortunately, the CPU on phosphoric-acid (which hosts the prometheus VM) is so old that when we try to install the official deb package, the following error occurs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Instruction check fail. The CPU does not support SSSE3 instruction set.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
So we&#039;re going to download the &amp;quot;compat&amp;quot; version instead:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /root&lt;br /&gt;
wget https://s3.amazonaws.com/clickhouse-builds/master/amd64compat/clickhouse&lt;br /&gt;
chmod +x clickhouse&lt;br /&gt;
./clickhouse install&lt;br /&gt;
rm clickhouse&lt;br /&gt;
wget -O /etc/systemd/system/clickhouse-server.service https://github.com/ClickHouse/ClickHouse/raw/master/packages/clickhouse-server.service&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable clickhouse-server&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, systemd limits the number of threads which a service can create, so we&#039;ll want to disable that. Run &amp;lt;code&amp;gt;systemctl edit clickhouse-server&amp;lt;/code&amp;gt; and paste the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Service]&lt;br /&gt;
TasksMax=infinity&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, paste the following into /etc/clickhouse-server/users.d/csclub-users.xml:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;clickhouse&amp;gt;&lt;br /&gt;
  &amp;lt;profiles&amp;gt;&lt;br /&gt;
    &amp;lt;default&amp;gt;&lt;br /&gt;
      &amp;lt;!-- disable query logging (it uses too much disk space) --&amp;gt;&lt;br /&gt;
      &amp;lt;log_queries replace=&amp;quot;replace&amp;quot;&amp;gt;0&amp;lt;/log_queries&amp;gt;&lt;br /&gt;
      &amp;lt;log_query_threads replace=&amp;quot;replace&amp;quot;&amp;gt;0&amp;lt;/log_query_threads&amp;gt;&lt;br /&gt;
    &amp;lt;/default&amp;gt;&lt;br /&gt;
    &amp;lt;readonly&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Grafana needs to be able to change settings in queries --&amp;gt;&lt;br /&gt;
      &amp;lt;readonly&amp;gt;2&amp;lt;/readonly&amp;gt;&lt;br /&gt;
    &amp;lt;/readonly&amp;gt;&lt;br /&gt;
  &amp;lt;/profiles&amp;gt;&lt;br /&gt;
  &amp;lt;users&amp;gt;&lt;br /&gt;
    &amp;lt;default&amp;gt;&lt;br /&gt;
      &amp;lt;!-- The default user should only be allowed to connect from localhost --&amp;gt;&lt;br /&gt;
      &amp;lt;networks&amp;gt;&lt;br /&gt;
        &amp;lt;ip&amp;gt;::1&amp;lt;/ip&amp;gt;&lt;br /&gt;
        &amp;lt;ip&amp;gt;127.0.0.1&amp;lt;/ip&amp;gt;&lt;br /&gt;
      &amp;lt;/networks&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Allow the default user to create new users --&amp;gt;&lt;br /&gt;
      &amp;lt;access_management&amp;gt;1&amp;lt;/access_management&amp;gt;&lt;br /&gt;
      &amp;lt;named_collection_control&amp;gt;1&amp;lt;/named_collection_control&amp;gt;&lt;br /&gt;
      &amp;lt;show_named_collections&amp;gt;1&amp;lt;/show_named_collections&amp;gt;&lt;br /&gt;
      &amp;lt;show_named_collections_secrets&amp;gt;1&amp;lt;/show_named_collections_secrets&amp;gt;&lt;br /&gt;
    &amp;lt;/default&amp;gt;&lt;br /&gt;
  &amp;lt;/users&amp;gt;&lt;br /&gt;
&amp;lt;/clickhouse&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then paste the following into /etc/clickhouse-server/config.d/zzz-csclub.xml (we need the zzz prefix because the configuration files are merged in alphabetical order, and we want ours to be applied last):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;clickhouse&amp;gt;&lt;br /&gt;
  &amp;lt;listen_host&amp;gt;127.0.0.1&amp;lt;/listen_host&amp;gt;&lt;br /&gt;
  &amp;lt;listen_host&amp;gt;::1&amp;lt;/listen_host&amp;gt;&lt;br /&gt;
  &amp;lt;logger&amp;gt;&lt;br /&gt;
    &amp;lt;level&amp;gt;information&amp;lt;/level&amp;gt;&lt;br /&gt;
    &amp;lt;size&amp;gt;100M&amp;lt;/size&amp;gt;&lt;br /&gt;
    &amp;lt;count&amp;gt;10&amp;lt;/count&amp;gt;&lt;br /&gt;
  &amp;lt;/logger&amp;gt;&lt;br /&gt;
  &amp;lt;mysql_port&amp;gt;&amp;lt;/mysql_port&amp;gt;&lt;br /&gt;
  &amp;lt;postgresql_port&amp;gt;&amp;lt;/postgresql_port&amp;gt;&lt;br /&gt;
&amp;lt;/clickhouse&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run &amp;lt;code&amp;gt;systemctl restart clickhouse-server&amp;lt;/code&amp;gt; and make sure that it&#039;s running.&lt;br /&gt;
&lt;br /&gt;
==== Schema ====&lt;br /&gt;
Run &amp;lt;code&amp;gt;clickhouse-client&amp;lt;/code&amp;gt; to get a SQL shell. First we need to create a new database and some users:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE DATABASE vector;&lt;br /&gt;
CREATE USER vector IDENTIFIED BY &#039;REPLACE_ME&#039;;&lt;br /&gt;
GRANT ALL ON vector.* TO vector;&lt;br /&gt;
CREATE USER grafana IDENTIFIED BY &#039;REPLACE_ME&#039; SETTINGS PROFILE &#039;readonly&#039;;&lt;br /&gt;
GRANT SHOW DATABASES, SHOW TABLES, SELECT ON *.* TO grafana;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
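You can sanity-check the resulting permissions with &amp;lt;code&amp;gt;SHOW GRANTS&amp;lt;/code&amp;gt;, e.g.:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SHOW GRANTS FOR vector;&lt;br /&gt;
SHOW GRANTS FOR grafana;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;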
&lt;br /&gt;
In some of our tables, we&#039;ll store the two-letter country code instead of a country&#039;s full name to save space. So we&#039;ll create a [https://clickhouse.com/docs/en/sql-reference/dictionaries dictionary] so that we can look up a country&#039;s full name. Exit the SQL shell, then download the CSV file:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget -O /var/lib/clickhouse/user_files/country_codes.csv &#039;https://datahub.io/core/country-list/r/data.csv&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run &amp;lt;code&amp;gt;clickhouse-client&amp;lt;/code&amp;gt; and create the dictionary:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE DICTIONARY vector.country_codes_dictionary&lt;br /&gt;
(&lt;br /&gt;
    Name String,&lt;br /&gt;
    Code String&lt;br /&gt;
)&lt;br /&gt;
PRIMARY KEY Code&lt;br /&gt;
SOURCE(FILE(path &#039;/var/lib/clickhouse/user_files/country_codes.csv&#039; FORMAT &#039;CSVWithNames&#039;))&lt;br /&gt;
LIFETIME(MIN 0 MAX 0)&lt;br /&gt;
LAYOUT(HASHED_ARRAY());&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Perform a SELECT to trigger the initial load of the dictionary:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT * FROM country_codes_dictionary;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
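The dictionary can then be queried with &amp;lt;code&amp;gt;dictGet&amp;lt;/code&amp;gt;; for example, the following should return &#039;Canada&#039;:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT dictGet(&#039;vector.country_codes_dictionary&#039;, &#039;Name&#039;, &#039;CA&#039;) AS country_name;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;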
&lt;br /&gt;
Now we need to create the tables for storing our actual log data (after they are transformed by Vector).&lt;br /&gt;
Create a table for failed SSH logins:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.failed_ssh_logins&lt;br /&gt;
(&lt;br /&gt;
    host LowCardinality(String),&lt;br /&gt;
    timestamp DateTime,&lt;br /&gt;
    ip_address IPv6,&lt;br /&gt;
    username String,&lt;br /&gt;
    country_code LowCardinality(String)&lt;br /&gt;
)&lt;br /&gt;
ENGINE = MergeTree()&lt;br /&gt;
PRIMARY KEY (host, timestamp)&lt;br /&gt;
TTL timestamp + INTERVAL 1 MONTH DELETE;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Create a table for storing mirror requests:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    timestamp DateTime CODEC(Delta, ZSTD),&lt;br /&gt;
    ip_address IPv6,&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    user_agent String,&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    region_name String,&lt;br /&gt;
    city String&lt;br /&gt;
)&lt;br /&gt;
ENGINE = MergeTree()&lt;br /&gt;
PRIMARY KEY (distro, timestamp, country_code, region_name, city)&lt;br /&gt;
TTL timestamp + INTERVAL 1 WEEK DELETE;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
One of ClickHouse&#039;s great features is [https://clickhouse.com/docs/en/guides/developer/cascading-materialized-views Materialized Views]. These allow us to automatically &amp;quot;forward&amp;quot; data from one table to another, and the second table can use a different storage engine to aggregate data and save space.&lt;br /&gt;
&lt;br /&gt;
We want to calculate the total number of requests and bytes sent for each distro, so let&#039;s create a table and view for that:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_by_distro&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    num_requests UInt64,&lt;br /&gt;
    bytes_sent UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((num_requests, bytes_sent))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date, country_code)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_by_distro_mv&lt;br /&gt;
TO vector.mirror_requests_agg_by_distro&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) AS date,&lt;br /&gt;
    country_code,&lt;br /&gt;
    sum(1) AS num_requests,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
GROUP BY distro, date, country_code;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
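Keep in mind that SummingMergeTree only collapses rows with the same key during background merges, so always re-aggregate with &amp;lt;code&amp;gt;sum()&amp;lt;/code&amp;gt; when querying. For example (adjust to taste), here are the top 10 distros by traffic over the last 30 days:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    sum(num_requests) AS requests,&lt;br /&gt;
    formatReadableSize(sum(bytes_sent)) AS traffic&lt;br /&gt;
FROM vector.mirror_requests_agg_by_distro&lt;br /&gt;
WHERE date &amp;gt;= today() - 30&lt;br /&gt;
GROUP BY distro&lt;br /&gt;
ORDER BY sum(bytes_sent) DESC&lt;br /&gt;
LIMIT 10;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;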
&lt;br /&gt;
We also want some stats for Canada specifically:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_canada&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    region_name LowCardinality(String),&lt;br /&gt;
    city String,&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    num_requests UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((bytes_sent, num_requests))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date, region_name, city)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_canada_mv&lt;br /&gt;
TO vector.mirror_requests_agg_canada&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) as date,&lt;br /&gt;
    region_name,&lt;br /&gt;
    city,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent,&lt;br /&gt;
    sum(1) AS num_requests&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
WHERE country_code = &#039;CA&#039;&lt;br /&gt;
GROUP BY distro, date, region_name, city;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We also want to keep stats just for the university:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_uw&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    num_requests UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((bytes_sent, num_requests))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_uw_mv&lt;br /&gt;
TO vector.mirror_requests_agg_uw&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) as date,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent,&lt;br /&gt;
    sum(1) AS num_requests&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
WHERE isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:129.97.0.0/112&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:10.0.0.0/104&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:172.16.0.0/108&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:192.168.0.0/112&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;2620:101:f000::/47&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;fd74:6b6a:8eca::/47&#039;)&lt;br /&gt;
GROUP BY distro, date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, we&#039;ll store some stats for IP subnets:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_ip&lt;br /&gt;
(&lt;br /&gt;
    timestamp DateTime CODEC(Delta, ZSTD),&lt;br /&gt;
    cidr_start IPv6,&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    num_requests UInt64,&lt;br /&gt;
    bytes_sent UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((num_requests, bytes_sent))&lt;br /&gt;
PRIMARY KEY (timestamp, cidr_start, country_code)&lt;br /&gt;
TTL timestamp + toIntervalWeek(2);&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_ip_mv TO vector.mirror_requests_agg_ip AS&lt;br /&gt;
SELECT&lt;br /&gt;
    toStartOfFiveMinutes(timestamp) AS timestamp,&lt;br /&gt;
    IPv6CIDRToRange(ip_address, 120).1 AS cidr_start,&lt;br /&gt;
    country_code,&lt;br /&gt;
    sum(1) AS num_requests,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
GROUP BY&lt;br /&gt;
    timestamp,&lt;br /&gt;
    cidr_start,&lt;br /&gt;
    country_code;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
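The same caveat about re-aggregating applies here. For example, here are the busiest /120 subnets over the past day:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    IPv6NumToString(cidr_start) AS subnet,&lt;br /&gt;
    sum(num_requests) AS requests,&lt;br /&gt;
    formatReadableSize(sum(bytes_sent)) AS traffic&lt;br /&gt;
FROM vector.mirror_requests_agg_ip&lt;br /&gt;
WHERE timestamp &amp;gt;= now() - INTERVAL 1 DAY&lt;br /&gt;
GROUP BY cidr_start&lt;br /&gt;
ORDER BY requests DESC&lt;br /&gt;
LIMIT 10;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;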
&lt;br /&gt;
=== GeoIP database ===&lt;br /&gt;
We&#039;ll want to look up geographic information for the IP addresses in our data. To do this, we&#039;ll use the [https://dev.maxmind.com/geoip/geolite2-free-geolocation-data MaxMind GeoLite2 databases]. Syscom already has a MaxMind account; the password is stored in the usual place. Install the latest geoipupdate package from [https://github.com/maxmind/geoipupdate/releases here], then edit /etc/GeoIP.conf as necessary (use the syscom account ID and license key). Set &amp;lt;code&amp;gt;EditionIDs&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;GeoLite2-City&amp;lt;/code&amp;gt; only.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll use a systemd timer to run the geoipupdate script periodically. Paste the following into /etc/systemd/system/geoipupdate.service:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=GeoIP Update&lt;br /&gt;
Documentation=https://dev.maxmind.com/geoip/updating-databases&lt;br /&gt;
After=network-online.target&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
ExecStart=/usr/bin/geoipupdate&lt;br /&gt;
Nice=19&lt;br /&gt;
IOSchedulingClass=idle&lt;br /&gt;
IOSchedulingPriority=7&lt;br /&gt;
ProtectSystem=strict&lt;br /&gt;
ReadWritePaths=/usr/share/GeoIP&lt;br /&gt;
ProtectHome=true&lt;br /&gt;
PrivateTmp=true&lt;br /&gt;
PrivateDevices=true&lt;br /&gt;
ProtectHostname=true&lt;br /&gt;
ProtectClock=true&lt;br /&gt;
ProtectKernelTunables=true&lt;br /&gt;
ProtectKernelModules=true&lt;br /&gt;
ProtectKernelLogs=true&lt;br /&gt;
ProtectControlGroups=true&lt;br /&gt;
LockPersonality=true&lt;br /&gt;
RestrictRealtime=true&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Run &amp;lt;code&amp;gt;systemctl daemon-reload&amp;lt;/code&amp;gt; and then &amp;lt;code&amp;gt;systemctl start geoipupdate&amp;lt;/code&amp;gt; to download the database for the first time.&lt;br /&gt;
&lt;br /&gt;
Now paste the following into /etc/systemd/system/geoipupdate.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Automatic GeoIP database update&lt;br /&gt;
Documentation=https://dev.maxmind.com/geoip/updating-databases&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
OnCalendar=monthly&lt;br /&gt;
RandomizedDelaySec=12h&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable geoipupdate.timer&lt;br /&gt;
systemctl start geoipupdate.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Vector ===&lt;br /&gt;
Vector allows you to create directed acyclic graphs (DAGs) for collecting and processing logs, which gives us a lot of flexibility. It also has a built-in scripting language, [https://vector.dev/docs/reference/vrl/ Vector Remap Language (VRL)] for slicing and dicing data. This allows us to remove fields which we don&#039;t need, add new fields which we do need, enrich an event with extra data, etc.&lt;br /&gt;
&lt;br /&gt;
Our data pipeline looks like this: Vector agents -&amp;gt; Vector aggregator -&amp;gt; ClickHouse. We use Grafana for visualization.&lt;br /&gt;
&lt;br /&gt;
We use mutual TLS between the agents and the aggregator to make sure that random people can&#039;t send us garbage data:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl req -newkey rsa:2048 -nodes -keyout aggregator.key -x509 -out aggregator.crt -days 36500&lt;br /&gt;
openssl req -newkey rsa:2048 -nodes -keyout agent.key -x509 -out agent.crt -days 36500&lt;br /&gt;
chown vector:vector *.crt *.key&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here is what our vector.toml looks like on the general-use machines; currently, we only use it for collecting failed SSH login attempts.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[sources.source_sshd]&lt;br /&gt;
type = &amp;quot;journald&amp;quot;&lt;br /&gt;
include_units = [&amp;quot;ssh.service&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
[transforms.remap_sshd]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_sshd&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  parsed, err = parse_regex(&lt;br /&gt;
    .message, r&#039;^(?:Connection (?:closed|reset)|Disconnected) (?:by|from) (?:invalid|authenticating) user (?P&amp;lt;user&amp;gt;[^ ]+) (?P&amp;lt;ip&amp;gt;[0-9.a-f:]+)&#039;&lt;br /&gt;
  )&lt;br /&gt;
  if is_null(err) {&lt;br /&gt;
    . = {&lt;br /&gt;
      &amp;quot;username&amp;quot;: parsed.user,&lt;br /&gt;
      &amp;quot;ip_address&amp;quot;: parsed.ip,&lt;br /&gt;
      &amp;quot;host&amp;quot;: .host,&lt;br /&gt;
      &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
      &amp;quot;job&amp;quot;: &amp;quot;vector-sshd&amp;quot;&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.filter_sshd]&lt;br /&gt;
type = &amp;quot;filter&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;remap_sshd&amp;quot;]&lt;br /&gt;
condition = &#039;.job == &amp;quot;vector-sshd&amp;quot;&#039;&lt;br /&gt;
&lt;br /&gt;
[sinks.aggregator]&lt;br /&gt;
type = &amp;quot;vector&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;filter_sshd&amp;quot;]&lt;br /&gt;
address = &amp;quot;prometheus:5045&amp;quot;&lt;br /&gt;
  [sinks.aggregator.tls]&lt;br /&gt;
  enabled = true&lt;br /&gt;
  ca_file = &amp;quot;/etc/vector/aggregator.crt&amp;quot;&lt;br /&gt;
  crt_file = &amp;quot;/etc/vector/agent.crt&amp;quot;&lt;br /&gt;
  key_file = &amp;quot;/etc/vector/agent.key&amp;quot;&lt;br /&gt;
  verify_hostname = false&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The agent on potassium-benzoate collects NGINX logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[sources.source_nginx]&lt;br /&gt;
type = &amp;quot;file&amp;quot;&lt;br /&gt;
include = [&amp;quot;/var/log/nginx/access.log&amp;quot;]&lt;br /&gt;
max_read_bytes = 65536&lt;br /&gt;
&lt;br /&gt;
[transforms.remap_nginx]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_nginx&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot; &lt;br /&gt;
  parsed_log, err = parse_nginx_log(.message, &amp;quot;combined&amp;quot;)&lt;br /&gt;
  status = parsed_log.status&lt;br /&gt;
  request = string!(parsed_log.request || &amp;quot;&amp;quot;)&lt;br /&gt;
  if is_null(err) &amp;amp;&amp;amp; status == 200 {&lt;br /&gt;
    parsed_path, err = parse_regex(request, r&#039;^GET /+(?P&amp;lt;distro&amp;gt;[^/? ]+)&#039;)&lt;br /&gt;
    distro = parsed_path.distro&lt;br /&gt;
    ignore = [&lt;br /&gt;
      &amp;quot;server-status&amp;quot;, &amp;quot;stats&amp;quot;, &amp;quot;robots.txt&amp;quot;,&lt;br /&gt;
      &amp;quot;include&amp;quot;, &amp;quot;pub&amp;quot;, &amp;quot;news&amp;quot;, &amp;quot;index.html&amp;quot;, &amp;quot;sync.json&amp;quot;, &amp;quot;ups&amp;quot;,&lt;br /&gt;
      &amp;quot;pool&amp;quot;, &amp;quot;dists&amp;quot;, &amp;quot;csclub.asc&amp;quot;, &amp;quot;csclub.gpg&amp;quot;&lt;br /&gt;
    ]&lt;br /&gt;
    if (&lt;br /&gt;
      is_null(err) &amp;amp;&amp;amp; !includes(ignore, distro) &amp;amp;&amp;amp; !contains(request, &amp;quot;..&amp;quot;) &amp;amp;&amp;amp;&lt;br /&gt;
      !starts_with(request, &amp;quot;#&amp;quot;) &amp;amp;&amp;amp; !starts_with(request, &amp;quot;%&amp;quot;) &amp;amp;&amp;amp; !starts_with(request, &amp;quot;.&amp;quot;)&lt;br /&gt;
    ) {&lt;br /&gt;
      . = {&lt;br /&gt;
        &amp;quot;distro&amp;quot;: distro,&lt;br /&gt;
        &amp;quot;user_agent&amp;quot;: parsed_log.agent,&lt;br /&gt;
        &amp;quot;ip_address&amp;quot;: parsed_log.client,&lt;br /&gt;
        &amp;quot;bytes_sent&amp;quot;: parsed_log.size,&lt;br /&gt;
        &amp;quot;timestamp&amp;quot;: parsed_log.timestamp,&lt;br /&gt;
        &amp;quot;job&amp;quot;: &amp;quot;vector-mirror&amp;quot;&lt;br /&gt;
      }&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, here&#039;s the aggregator config, which collects data from each agent and then inserts it into ClickHouse:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[enrichment_tables.enrich_geoip]&lt;br /&gt;
type = &amp;quot;geoip&amp;quot;&lt;br /&gt;
path = &amp;quot;/usr/share/GeoIP/GeoLite2-City.mmdb&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[sources.source_agents]&lt;br /&gt;
type = &amp;quot;vector&amp;quot;&lt;br /&gt;
address = &amp;quot;[::]:5045&amp;quot;&lt;br /&gt;
  [sources.source_agents.tls]&lt;br /&gt;
  enabled = true&lt;br /&gt;
  ca_file = &amp;quot;/etc/vector/agent.crt&amp;quot;&lt;br /&gt;
  crt_file = &amp;quot;/etc/vector/aggregator.crt&amp;quot;&lt;br /&gt;
  key_file = &amp;quot;/etc/vector/aggregator.key&amp;quot;&lt;br /&gt;
  verify_hostname = false&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_route]&lt;br /&gt;
type = &amp;quot;route&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_agents&amp;quot;]&lt;br /&gt;
route.sshd = &#039;.job == &amp;quot;vector-sshd&amp;quot;&#039;&lt;br /&gt;
route.mirror = &#039;.job == &amp;quot;vector-mirror&amp;quot;&#039;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_sshd]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route.sshd&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot; &lt;br /&gt;
  ipinfo = get_enrichment_table_record(&amp;quot;enrich_geoip&amp;quot;, {&amp;quot;ip&amp;quot;: .ip_address}) ?? {}&lt;br /&gt;
  if is_ipv4!(.ip_address) &amp;amp;&amp;amp; (ip_cidr_contains!(&amp;quot;10.0.0.0/8&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;172.16.0.0/12&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;192.168.0.0/16&amp;quot;, .ip_address)) {&lt;br /&gt;
    ipinfo.country_code = &amp;quot;CA&amp;quot;;&lt;br /&gt;
    ipinfo.region_name = &amp;quot;Ontario&amp;quot;;&lt;br /&gt;
    ipinfo.city_name = &amp;quot;Waterloo&amp;quot;;&lt;br /&gt;
  }&lt;br /&gt;
  . = {&lt;br /&gt;
    &amp;quot;host&amp;quot;: .host,&lt;br /&gt;
    &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
    &amp;quot;username&amp;quot;: .username,&lt;br /&gt;
    &amp;quot;ip_address&amp;quot;: ip_to_ipv6!(.ip_address),&lt;br /&gt;
    &amp;quot;country_code&amp;quot;: ipinfo.country_code || &amp;quot;&amp;quot;&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_mirror]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route.mirror&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  ipinfo = get_enrichment_table_record(&amp;quot;enrich_geoip&amp;quot;, {&amp;quot;ip&amp;quot;: .ip_address}) ?? {}&lt;br /&gt;
  if is_ipv4!(.ip_address) &amp;amp;&amp;amp; (ip_cidr_contains!(&amp;quot;10.0.0.0/8&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;172.16.0.0/12&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;192.168.0.0/16&amp;quot;, .ip_address)) {&lt;br /&gt;
    ipinfo.country_code = &amp;quot;CA&amp;quot;;&lt;br /&gt;
    ipinfo.region_name = &amp;quot;Ontario&amp;quot;;&lt;br /&gt;
    ipinfo.city_name = &amp;quot;Waterloo&amp;quot;;&lt;br /&gt;
  }&lt;br /&gt;
  . = {&lt;br /&gt;
    &amp;quot;distro&amp;quot;: .distro,&lt;br /&gt;
    &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
    &amp;quot;ip_address&amp;quot;: ip_to_ipv6!(.ip_address),&lt;br /&gt;
    &amp;quot;bytes_sent&amp;quot;: .bytes_sent,&lt;br /&gt;
    &amp;quot;user_agent&amp;quot;: .user_agent,&lt;br /&gt;
    &amp;quot;country_code&amp;quot;: ipinfo.country_code || &amp;quot;&amp;quot;,&lt;br /&gt;
    &amp;quot;region_name&amp;quot;: ipinfo.region_name || &amp;quot;&amp;quot;,&lt;br /&gt;
    &amp;quot;city&amp;quot;: ipinfo.city_name || &amp;quot;&amp;quot;&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_unmatched]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route._unmatched&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  log(&amp;quot;unrecognized job: &amp;quot; + string!(.job || &amp;quot;null&amp;quot;), level: &amp;quot;warn&amp;quot;)&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[sinks.sink_unmatched]&lt;br /&gt;
type = &amp;quot;blackhole&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_unmatched&amp;quot;]&lt;br /&gt;
print_interval_secs = 0&lt;br /&gt;
&lt;br /&gt;
[sinks.clickhouse_sshd]&lt;br /&gt;
type = &amp;quot;clickhouse&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_sshd&amp;quot;]&lt;br /&gt;
encoding.timestamp_format = &amp;quot;unix&amp;quot;&lt;br /&gt;
endpoint = &amp;quot;$CLICKHOUSE_ENDPOINT&amp;quot;&lt;br /&gt;
database = &amp;quot;$CLICKHOUSE_DATABASE&amp;quot;&lt;br /&gt;
table = &amp;quot;failed_ssh_logins&amp;quot;&lt;br /&gt;
  [sinks.clickhouse_sshd.auth]&lt;br /&gt;
  strategy = &amp;quot;basic&amp;quot;&lt;br /&gt;
  user = &amp;quot;$CLICKHOUSE_USER&amp;quot;&lt;br /&gt;
  password = &amp;quot;$CLICKHOUSE_PASSWORD&amp;quot;&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Beats, Logstash and Loki (old) ===&lt;br /&gt;
We previously used Elastic Beats, Logstash and Grafana Loki for collecting and storing logs. One day I tried to upgrade Logstash and it exploded so badly that I figured it would be easier to just switch to Vector instead.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;The sections below are kept for historical purposes only and are no longer accurate.&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use a combination of [https://www.elastic.co/beats/ Elastic Beats], [https://www.elastic.co/logstash/ Logstash] and [https://grafana.com/oss/loki/ Loki] for collecting, storing and querying our logs; for visualization, we use Grafana. Logstash and Loki are currently both running in the prometheus VM.&lt;br /&gt;
&lt;br /&gt;
I chose Loki over Elasticsearch because Loki is &amp;lt;i&amp;gt;very&amp;lt;/i&amp;gt; space-efficient in terms of storage. It also consumes far less RAM and CPU. This means that we can collect a lot of logs without worrying too much about resource usage.&lt;br /&gt;
&lt;br /&gt;
We have Journalbeat and/or Filebeat running on some of our machines to collect logs from sshd, Apache and NGINX. The Beats send these logs to Logstash, which does some pre-processing. The most useful contribution by Logstash is its GeoIP plugin, which allows us to enrich the logs with some geographical information from IP addresses (e.g. add city and country). Logstash sends these logs to Loki, and we can then view these from Grafana.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note&#039;&#039;&#039;: Sometimes the Loki output plugin for Logstash disappears after a reboot or an upgrade. If you see Logstash complaining about this in the journald logs, run this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /usr/share/logstash&lt;br /&gt;
bin/logstash-plugin install logstash-output-loki&lt;br /&gt;
systemctl restart logstash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
See [https://grafana.com/docs/loki/latest/clients/logstash/ here] for details.&lt;br /&gt;
&lt;br /&gt;
The language for querying logs in Loki is [https://grafana.com/docs/loki/latest/logql/ LogQL], which, syntactically, is very similar to PromQL. If you have already learned PromQL, then you should be able to pick up LogQL very easily. You can try out some LogQL queries from the &#039;Explore&#039; page on Grafana; make sure you toggle the data source to &#039;Loki&#039; in the top left corner. For the &#039;topk&#039; queries, you will also want to toggle &#039;Query type&#039; to &#039;Instant&#039; rather than &#039;Range&#039;.&lt;br /&gt;
&lt;br /&gt;
==== LogQL examples ====&lt;br /&gt;
Here is the number of failed SSH login attempts for each host over a given time range:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sum by (hostname) (&lt;br /&gt;
  count_over_time(&lt;br /&gt;
    {job=&amp;quot;logstash-sshd&amp;quot;} [$__range]&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that &amp;lt;code&amp;gt;$__range&amp;lt;/code&amp;gt; is a special [https://grafana.com/docs/grafana/latest/variables/variable-types/global-variables/ global variable] in Grafana which is equal to the time range in the top right corner of a chart.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here are the top 10 IP addresses from which failed SSH login attempts arrived, for a given host and time range:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(10,&lt;br /&gt;
  sum by (ip_address) (&lt;br /&gt;
    count_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-sshd&amp;quot;,hostname=&amp;quot;$hostname&amp;quot;} | json | __error__ = &amp;quot;&amp;quot;&lt;br /&gt;
      [$__range]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;code&amp;gt;$hostname&amp;lt;/code&amp;gt; is a chart variable, which can be configured from a chart&#039;s settings.&lt;br /&gt;
&lt;br /&gt;
I configured Logstash to send logs to Loki as JSON, but it&#039;s a rather hacky solution, so occasionally invalid JSON is sent.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here is the number of HTTP requests for each of the top 15 distros on our mirror over the last hour:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(15,&lt;br /&gt;
  sum by (distro) (&lt;br /&gt;
    count_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-nginx&amp;quot;} | json | __error__ = &amp;quot;&amp;quot; | distro != &amp;quot;server-status&amp;quot;&lt;br /&gt;
      [1h]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here is the total number of bytes sent over HTTP for the top 15 distros over the last hour. Note the use of the &amp;lt;code&amp;gt;unwrap&amp;lt;/code&amp;gt; operator.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(15,&lt;br /&gt;
  sum by (distro) (&lt;br /&gt;
    sum_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-nginx&amp;quot;} | json | __error__ = &amp;quot;&amp;quot; | distro != &amp;quot;server-status&amp;quot; | unwrap bytes&lt;br /&gt;
      [1h]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
You can see more examples on the Mirror Requests dashboard on Grafana.&lt;br /&gt;
&lt;br /&gt;
==== Avoid high cardinality ====&lt;br /&gt;
For both Prometheus and Loki, you must [https://prometheus.io/docs/practices/naming/#labels avoid high cardinality] labels at all costs. By high cardinality, I mean labels which can take on a very large number of values; for example, using a label to store IP addresses would be a very bad idea. This is because Prometheus and Loki index and compress data by label set; every distinct combination of label values creates a separate time series (or log stream in Loki), so a high-cardinality label multiplies the number of series and drives up storage usage.&lt;br /&gt;
&lt;br /&gt;
With Loki, you can extract labels from your logs inside your query dynamically. One way to do this is with the &amp;lt;code&amp;gt;json&amp;lt;/code&amp;gt; operator; there are other ways to do this as well (see the LogQL docs). This basically means that we get infinite cardinality from our logs, the tradeoff being that queries may take longer to execute.&lt;br /&gt;
&lt;br /&gt;
Also, be very careful about what you send to Loki from Logstash - [https://grafana.com/docs/loki/latest/clients/logstash/#usage-and-configuration every field in a Logstash message becomes a Loki label]. Usage of the &amp;lt;code&amp;gt;prune&amp;lt;/code&amp;gt; command in Logstash is highly recommended.&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Observability&amp;diff=5182</id>
		<title>Observability</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Observability&amp;diff=5182"/>
		<updated>2023-12-16T01:23:40Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* ClickHouse */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are [https://www.oreilly.com/library/view/distributed-systems-observability/9781492033431/ch04.html three pillars of observability]: metrics, logging and tracing. We are only interested in the first two.&lt;br /&gt;
&lt;br /&gt;
== Metrics ==&lt;br /&gt;
All of our machines are, or at least should be, running the Prometheus node exporter. This collects and sends machine metrics (e.g. RAM used, disk space) to the Prometheus server running at https://prometheus.csclub.uwaterloo.ca (currently a VM on phosphoric-acid). There are a few specialized exporters running on several other machines; a Postfix exporter is running on mail, an Apache exporter is running on caffeine, and an NGINX exporter is running on potassium-benzoate. There is also a custom exporter written by syscom running on potassium-benzoate for mirror stats.&lt;br /&gt;
&lt;br /&gt;
Most of the exporters use mutual TLS authentication with the Prometheus server. I set the expiration date for the TLS certs to 10 years. If you are reading this and it is 2031 or later, then go update the certs.&lt;br /&gt;
&lt;br /&gt;
I highly suggest becoming familiar with [https://prometheus.io/docs/prometheus/latest/querying/basics/ PromQL], the query language for Prometheus. You can run and visualize some queries at https://prometheus.csclub.uwaterloo.ca/prometheus. For example, here is a query to determine which machines are up or down:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
up{job=&amp;quot;node_exporter&amp;quot;}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Here&#039;s how we determine if a machine has NFS mounted. This will return 1 for machines which have NFS mounted, but will not return any records for machines which do not have NFS mounted. (We ignore the actual value of node_filesystem_device_error because it returns 1 for machines using Kerberized NFS.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
count by (instance) (node_filesystem_device_error{mountpoint=&amp;quot;/users&amp;quot;, fstype=&amp;quot;nfs&amp;quot;})&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Now this is a rather complicated expression which can return one of three values:&lt;br /&gt;
* 0: the machine is down&lt;br /&gt;
* 1: the machine is up, but NFS is not mounted&lt;br /&gt;
* 2: the machine is up and NFS is mounted&lt;br /&gt;
The [https://prometheus.io/docs/prometheus/latest/querying/operators/#logical-set-binary-operators or operator] in PromQL is key here.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sum by (instance) (&lt;br /&gt;
  (count by (instance) (node_filesystem_device_error{mountpoint=&amp;quot;/users&amp;quot;, fstype=&amp;quot;nfs&amp;quot;}))&lt;br /&gt;
  or up{job=&amp;quot;node_exporter&amp;quot;}&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
We also use [https://prometheus.io/docs/alerting/latest/alertmanager/ AlertManager] to send email alerts from Prometheus metrics. We should figure out how to also send messages to IRC or similar.&lt;br /&gt;
&lt;br /&gt;
We also use the [https://github.com/prometheus/blackbox_exporter Blackbox prober exporter] to check if some of our web-based services are up.&lt;br /&gt;
&lt;br /&gt;
We make some pretty charts on Grafana (https://prometheus.csclub.uwaterloo.ca) from PromQL queries. Grafana also has an &#039;Explore&#039; page where you can test out some queries before making chart panels from them.&lt;br /&gt;
&lt;br /&gt;
== Logging ==&lt;br /&gt;
We now use [https://vector.dev/ Vector] for collecting and transforming logs, and [https://clickhouse.com/ ClickHouse] for storing log data.&lt;br /&gt;
&lt;br /&gt;
=== ClickHouse ===&lt;br /&gt;
ClickHouse is a very fast OLAP database which has great documentation for storing and analyzing [https://clickhouse.com/use-cases/logging-and-metrics logging and metrics]. Unfortunately, the CPU on phosphoric-acid (which hosts the prometheus VM) is so old that when we try to install the official deb package, the following error occurs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Instruction check fail. The CPU does not support SSSE3 instruction set.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
So we&#039;re going to download the &amp;quot;compat&amp;quot; version instead:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /root&lt;br /&gt;
wget https://s3.amazonaws.com/clickhouse-builds/master/amd64compat/clickhouse&lt;br /&gt;
chmod +x clickhouse&lt;br /&gt;
./clickhouse install&lt;br /&gt;
rm clickhouse&lt;br /&gt;
wget -O /etc/systemd/system/clickhouse-server.service https://github.com/ClickHouse/ClickHouse/raw/master/packages/clickhouse-server.service&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable clickhouse-server&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, systemd limits the number of threads which a service can create, so we&#039;ll want to disable that. Run &amp;lt;code&amp;gt;systemctl edit clickhouse-server&amp;lt;/code&amp;gt; and paste the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Service]&lt;br /&gt;
TasksMax=infinity&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, paste the following into /etc/clickhouse-server/users.d/csclub-users.xml:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;clickhouse&amp;gt;&lt;br /&gt;
  &amp;lt;readonly&amp;gt;&lt;br /&gt;
    &amp;lt;!-- Grafana needs to be able to change settings in queries --&amp;gt;&lt;br /&gt;
    &amp;lt;readonly&amp;gt;2&amp;lt;/readonly&amp;gt;&lt;br /&gt;
  &amp;lt;/readonly&amp;gt;&lt;br /&gt;
  &amp;lt;users&amp;gt;&lt;br /&gt;
    &amp;lt;default&amp;gt;&lt;br /&gt;
      &amp;lt;!-- The default user should only be allowed to connect from localhost --&amp;gt;&lt;br /&gt;
      &amp;lt;networks&amp;gt;&lt;br /&gt;
        &amp;lt;ip&amp;gt;::1&amp;lt;/ip&amp;gt;&lt;br /&gt;
        &amp;lt;ip&amp;gt;127.0.0.1&amp;lt;/ip&amp;gt;&lt;br /&gt;
      &amp;lt;/networks&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Allow the default user to create new users --&amp;gt;&lt;br /&gt;
      &amp;lt;access_management&amp;gt;1&amp;lt;/access_management&amp;gt;&lt;br /&gt;
      &amp;lt;named_collection_control&amp;gt;1&amp;lt;/named_collection_control&amp;gt;&lt;br /&gt;
      &amp;lt;show_named_collections&amp;gt;1&amp;lt;/show_named_collections&amp;gt;&lt;br /&gt;
      &amp;lt;show_named_collections_secrets&amp;gt;1&amp;lt;/show_named_collections_secrets&amp;gt;&lt;br /&gt;
    &amp;lt;/default&amp;gt;&lt;br /&gt;
  &amp;lt;/users&amp;gt;&lt;br /&gt;
&amp;lt;/clickhouse&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then paste the following into /etc/clickhouse-server/config.d/zzz-csclub.xml (we need the zzz prefix because the configuration files are merged in alphabetical order, and we want ours to be applied last):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;clickhouse&amp;gt;&lt;br /&gt;
  &amp;lt;listen_host&amp;gt;127.0.0.1&amp;lt;/listen_host&amp;gt;&lt;br /&gt;
  &amp;lt;listen_host&amp;gt;::1&amp;lt;/listen_host&amp;gt;&lt;br /&gt;
  &amp;lt;logger&amp;gt;&lt;br /&gt;
    &amp;lt;level&amp;gt;information&amp;lt;/level&amp;gt;&lt;br /&gt;
    &amp;lt;size&amp;gt;100M&amp;lt;/size&amp;gt;&lt;br /&gt;
    &amp;lt;count&amp;gt;10&amp;lt;/count&amp;gt;&lt;br /&gt;
  &amp;lt;/logger&amp;gt;&lt;br /&gt;
  &amp;lt;mysql_port&amp;gt;&amp;lt;/mysql_port&amp;gt;&lt;br /&gt;
  &amp;lt;postgresql_port&amp;gt;&amp;lt;/postgresql_port&amp;gt;&lt;br /&gt;
&amp;lt;/clickhouse&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run &amp;lt;code&amp;gt;systemctl restart clickhouse-server&amp;lt;/code&amp;gt; and make sure that it&#039;s running.&lt;br /&gt;
&lt;br /&gt;
==== Schema ====&lt;br /&gt;
Run &amp;lt;code&amp;gt;clickhouse-client&amp;lt;/code&amp;gt; to get a SQL shell. First we need to create a new database and some users:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE DATABASE vector;&lt;br /&gt;
CREATE USER vector IDENTIFIED BY &#039;REPLACE_ME&#039;;&lt;br /&gt;
GRANT ALL ON vector.* TO vector;&lt;br /&gt;
CREATE USER grafana IDENTIFIED BY &#039;REPLACE_ME&#039; SETTINGS PROFILE &#039;readonly&#039;;&lt;br /&gt;
GRANT SHOW DATABASES, SHOW TABLES, SELECT ON *.* TO grafana;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
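You can sanity-check the resulting permissions with &amp;lt;code&amp;gt;SHOW GRANTS&amp;lt;/code&amp;gt;, e.g.:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SHOW GRANTS FOR vector;&lt;br /&gt;
SHOW GRANTS FOR grafana;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;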
&lt;br /&gt;
In some of our tables, we&#039;ll store the two-letter country code instead of a country&#039;s full name to save space. So we&#039;ll create a [https://clickhouse.com/docs/en/sql-reference/dictionaries dictionary] so that we can look up a country&#039;s full name. Exit the SQL shell, then download the CSV file:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget -O /var/lib/clickhouse/user_files/country_codes.csv &#039;https://datahub.io/core/country-list/r/data.csv&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run &amp;lt;code&amp;gt;clickhouse-client&amp;lt;/code&amp;gt; and create the dictionary:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE DICTIONARY vector.country_codes_dictionary&lt;br /&gt;
(&lt;br /&gt;
    Name String,&lt;br /&gt;
    Code String&lt;br /&gt;
)&lt;br /&gt;
PRIMARY KEY Code&lt;br /&gt;
SOURCE(FILE(path &#039;/var/lib/clickhouse/user_files/country_codes.csv&#039; FORMAT &#039;CSVWithNames&#039;))&lt;br /&gt;
LIFETIME(MIN 0 MAX 0)&lt;br /&gt;
LAYOUT(HASHED_ARRAY());&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Perform a SELECT to trigger the initial load of the dictionary:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT * FROM country_codes_dictionary;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now we need to create the tables for storing our actual log data (after they are transformed by Vector).&lt;br /&gt;
Create a table for failed SSH logins:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.failed_ssh_logins&lt;br /&gt;
(&lt;br /&gt;
    host LowCardinality(String),&lt;br /&gt;
    timestamp DateTime,&lt;br /&gt;
    ip_address IPv6,&lt;br /&gt;
    username String,&lt;br /&gt;
    country_code LowCardinality(String)&lt;br /&gt;
)&lt;br /&gt;
ENGINE = MergeTree()&lt;br /&gt;
PRIMARY KEY (host, timestamp)&lt;br /&gt;
TTL timestamp + INTERVAL 1 MONTH DELETE;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Create a table for storing mirror requests:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    timestamp DateTime CODEC(Delta, ZSTD),&lt;br /&gt;
    ip_address IPv6,&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    user_agent String,&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    region_name String,&lt;br /&gt;
    city String&lt;br /&gt;
)&lt;br /&gt;
ENGINE = MergeTree()&lt;br /&gt;
PRIMARY KEY (distro, timestamp, country_code, region_name, city)&lt;br /&gt;
TTL timestamp + INTERVAL 1 WEEK DELETE;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
One of ClickHouse&#039;s great features is [https://clickhouse.com/docs/en/guides/developer/cascading-materialized-views Materialized Views]. These allow us to automatically &amp;quot;forward&amp;quot; data from one table to another, and the second table can use a different storage engine to aggregate data and save space.&lt;br /&gt;
&lt;br /&gt;
We want to calculate the total number of requests and bytes sent for each distro, so let&#039;s create a table and view for that:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_by_distro&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    num_requests UInt64,&lt;br /&gt;
    bytes_sent UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((num_requests, bytes_sent))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date, country_code)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_by_distro_mv&lt;br /&gt;
TO vector.mirror_requests_agg_by_distro&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) AS date,&lt;br /&gt;
    country_code,&lt;br /&gt;
    sum(1) AS num_requests,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
GROUP BY distro, date, country_code;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
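Keep in mind that SummingMergeTree only collapses rows with the same key during background merges, so always re-aggregate with &amp;lt;code&amp;gt;sum()&amp;lt;/code&amp;gt; when querying. For example (adjust to taste), here are the top 10 distros by traffic over the last 30 days:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    sum(num_requests) AS requests,&lt;br /&gt;
    formatReadableSize(sum(bytes_sent)) AS traffic&lt;br /&gt;
FROM vector.mirror_requests_agg_by_distro&lt;br /&gt;
WHERE date &amp;gt;= today() - 30&lt;br /&gt;
GROUP BY distro&lt;br /&gt;
ORDER BY sum(bytes_sent) DESC&lt;br /&gt;
LIMIT 10;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;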
&lt;br /&gt;
We also want some stats for Canada specifically:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_canada&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    region_name LowCardinality(String),&lt;br /&gt;
    city String,&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    num_requests UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((bytes_sent, num_requests))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date, region_name, city)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_canada_mv&lt;br /&gt;
TO vector.mirror_requests_agg_canada&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) as date,&lt;br /&gt;
    region_name,&lt;br /&gt;
    city,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent,&lt;br /&gt;
    sum(1) AS num_requests&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
WHERE country_code = &#039;CA&#039;&lt;br /&gt;
GROUP BY distro, date, region_name, city;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We also want to keep stats just for the university:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_uw&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    num_requests UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((bytes_sent, num_requests))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_uw_mv&lt;br /&gt;
TO vector.mirror_requests_agg_uw&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) as date,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent,&lt;br /&gt;
    sum(1) AS num_requests&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
WHERE isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:129.97.0.0/112&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:10.0.0.0/104&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:172.16.0.0/108&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:192.168.0.0/112&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;2620:101:f000::/47&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;fd74:6b6a:8eca::/47&#039;)&lt;br /&gt;
GROUP BY distro, date;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, we&#039;ll store some stats for IP subnets:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_ip&lt;br /&gt;
(&lt;br /&gt;
    timestamp DateTime CODEC(Delta, ZSTD),&lt;br /&gt;
    cidr_start IPv6,&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    num_requests UInt64,&lt;br /&gt;
    bytes_sent UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((num_requests, bytes_sent))&lt;br /&gt;
PRIMARY KEY (timestamp, cidr_start, country_code)&lt;br /&gt;
TTL timestamp + toIntervalWeek(2);&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_ip_mv TO vector.mirror_requests_agg_ip AS&lt;br /&gt;
SELECT&lt;br /&gt;
    toStartOfFiveMinutes(timestamp) AS timestamp,&lt;br /&gt;
    IPv6CIDRToRange(ip_address, 120).1 AS cidr_start,&lt;br /&gt;
    country_code,&lt;br /&gt;
    sum(1) AS num_requests,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
GROUP BY&lt;br /&gt;
    timestamp,&lt;br /&gt;
    cidr_start,&lt;br /&gt;
    country_code;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
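&lt;br /&gt;
Since rows are bucketed into five-minute windows and /120 subnets (a /120 on an IPv4-mapped address corresponds to an IPv4 /24), this table is handy for spotting heavy hitters. An example query:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Example: busiest /120 subnets, last hour&lt;br /&gt;
SELECT&lt;br /&gt;
    cidr_start,&lt;br /&gt;
    sum(num_requests) AS total_requests,&lt;br /&gt;
    formatReadableSize(sum(bytes_sent)) AS total_sent&lt;br /&gt;
FROM vector.mirror_requests_agg_ip&lt;br /&gt;
WHERE timestamp &amp;gt;= now() - INTERVAL 1 HOUR&lt;br /&gt;
GROUP BY cidr_start&lt;br /&gt;
ORDER BY total_requests DESC&lt;br /&gt;
LIMIT 10;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;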
&lt;br /&gt;
=== GeoIP database ===&lt;br /&gt;
We&#039;ll want to look up geographic information for the IP addresses in our data. To do this, we&#039;ll use the [https://dev.maxmind.com/geoip/geolite2-free-geolocation-data MaxMind GeoLite2 databases]. Syscom already has a MaxMind account; the password is stored in the usual place. Install the latest geoipupdate package from [https://github.com/maxmind/geoipupdate/releases here], then edit /etc/GeoIP.conf as necessary (use the syscom account ID and license key). Set &amp;lt;code&amp;gt;EditionIDs&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;GeoLite2-City&amp;lt;/code&amp;gt; only.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll use a systemd timer to run the geoipupdate script periodically. Paste the following into /etc/systemd/system/geoipupdate.service:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=GeoIP Update&lt;br /&gt;
Documentation=https://dev.maxmind.com/geoip/updating-databases&lt;br /&gt;
After=network-online.target&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
ExecStart=/usr/bin/geoipupdate&lt;br /&gt;
Nice=19&lt;br /&gt;
IOSchedulingClass=idle&lt;br /&gt;
IOSchedulingPriority=7&lt;br /&gt;
ProtectSystem=strict&lt;br /&gt;
ReadWritePaths=/usr/share/GeoIP&lt;br /&gt;
ProtectHome=true&lt;br /&gt;
PrivateTmp=true&lt;br /&gt;
PrivateDevices=true&lt;br /&gt;
ProtectHostname=true&lt;br /&gt;
ProtectClock=true&lt;br /&gt;
ProtectKernelTunables=true&lt;br /&gt;
ProtectKernelModules=true&lt;br /&gt;
ProtectKernelLogs=true&lt;br /&gt;
ProtectControlGroups=true&lt;br /&gt;
LockPersonality=true&lt;br /&gt;
RestrictRealtime=true&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Run &amp;lt;code&amp;gt;systemctl daemon-reload&amp;lt;/code&amp;gt; and then &amp;lt;code&amp;gt;systemctl start geoipupdate&amp;lt;/code&amp;gt; to download the database for the first time.&lt;br /&gt;
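If the download succeeded, the database file should exist with a recent timestamp; check the journal if anything looks off:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ls -lh /usr/share/GeoIP/GeoLite2-City.mmdb&lt;br /&gt;
journalctl -u geoipupdate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;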
&lt;br /&gt;
Now paste the following into /etc/systemd/system/geoipupdate.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Automatic GeoIP database update&lt;br /&gt;
Documentation=https://dev.maxmind.com/geoip/updating-databases&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
OnCalendar=monthly&lt;br /&gt;
RandomizedDelaySec=12h&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable geoipupdate.timer&lt;br /&gt;
systemctl start geoipupdate.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
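You can confirm that the timer is scheduled with:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl list-timers geoipupdate.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;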
&lt;br /&gt;
=== Vector ===&lt;br /&gt;
Vector allows you to create directed acyclic graphs (DAGs) for collecting and processing logs, which gives us a lot of flexibility. It also has a built-in scripting language, [https://vector.dev/docs/reference/vrl/ Vector Remap Language (VRL)] for slicing and dicing data. This allows us to remove fields which we don&#039;t need, add new fields which we do need, enrich an event with extra data, etc.&lt;br /&gt;
&lt;br /&gt;
Our data pipeline looks like this: Vector agents -&amp;gt; Vector aggregator -&amp;gt; ClickHouse. We use Grafana for visualization.&lt;br /&gt;
&lt;br /&gt;
We use mutual TLS between the agents and the aggregator to make sure that random people can&#039;t send us garbage data:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl req -newkey rsa:2048 -nodes -keyout aggregator.key -x509 -out aggregator.crt -days 36500&lt;br /&gt;
openssl req -newkey rsa:2048 -nodes -keyout agent.key -x509 -out agent.crt -days 36500&lt;br /&gt;
chown vector:vector *.crt *.key&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
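Note that each side simply trusts the other&#039;s long-lived self-signed certificate directly: the agent&#039;s cert is the aggregator&#039;s ca_file and vice versa (see the configs below). You can double-check what you generated with:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl x509 -in aggregator.crt -noout -subject -enddate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;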
&lt;br /&gt;
Here is what our vector.toml looks like on the general-use machines; currently, we only use it for collecting failed SSH login attempts.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[sources.source_sshd]&lt;br /&gt;
type = &amp;quot;journald&amp;quot;&lt;br /&gt;
include_units = [&amp;quot;ssh.service&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
[transforms.remap_sshd]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_sshd&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  parsed, err = parse_regex(&lt;br /&gt;
    .message, r&#039;^(?:Connection (?:closed|reset)|Disconnected) (?:by|from) (?:invalid|authenticating) user (?P&amp;lt;user&amp;gt;[^ ]+) (?P&amp;lt;ip&amp;gt;[0-9.a-f:]+)&#039;&lt;br /&gt;
  )&lt;br /&gt;
  if is_null(err) {&lt;br /&gt;
    . = {&lt;br /&gt;
      &amp;quot;username&amp;quot;: parsed.user,&lt;br /&gt;
      &amp;quot;ip_address&amp;quot;: parsed.ip,&lt;br /&gt;
      &amp;quot;host&amp;quot;: .host,&lt;br /&gt;
      &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
      &amp;quot;job&amp;quot;: &amp;quot;vector-sshd&amp;quot;&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.filter_sshd]&lt;br /&gt;
type = &amp;quot;filter&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;remap_sshd&amp;quot;]&lt;br /&gt;
condition = &#039;.job == &amp;quot;vector-sshd&amp;quot;&#039;&lt;br /&gt;
&lt;br /&gt;
[sinks.aggregator]&lt;br /&gt;
type = &amp;quot;vector&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;filter_sshd&amp;quot;]&lt;br /&gt;
address = &amp;quot;prometheus:5045&amp;quot;&lt;br /&gt;
  [sinks.aggregator.tls]&lt;br /&gt;
  enabled = true&lt;br /&gt;
  ca_file = &amp;quot;/etc/vector/aggregator.crt&amp;quot;&lt;br /&gt;
  crt_file = &amp;quot;/etc/vector/agent.crt&amp;quot;&lt;br /&gt;
  key_file = &amp;quot;/etc/vector/agent.key&amp;quot;&lt;br /&gt;
  verify_hostname = false&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
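&lt;br /&gt;
After editing vector.toml, it&#039;s worth validating it before restarting the service (adjust the path if your config lives elsewhere):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
vector validate /etc/vector/vector.toml&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;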
&lt;br /&gt;
The agent on potassium-benzoate collects NGINX logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[sources.source_nginx]&lt;br /&gt;
type = &amp;quot;file&amp;quot;&lt;br /&gt;
include = [&amp;quot;/var/log/nginx/access.log&amp;quot;]&lt;br /&gt;
max_read_bytes = 65536&lt;br /&gt;
&lt;br /&gt;
[transforms.remap_nginx]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_nginx&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot; &lt;br /&gt;
  parsed_log, err = parse_nginx_log(.message, &amp;quot;combined&amp;quot;)&lt;br /&gt;
  status = parsed_log.status&lt;br /&gt;
  request = string!(parsed_log.request || &amp;quot;&amp;quot;)&lt;br /&gt;
  if is_null(err) &amp;amp;&amp;amp; status == 200 {&lt;br /&gt;
    parsed_path, err = parse_regex(request, r&#039;^GET /+(?P&amp;lt;distro&amp;gt;[^/? ]+)&#039;)&lt;br /&gt;
    distro = parsed_path.distro&lt;br /&gt;
    ignore = [&lt;br /&gt;
      &amp;quot;server-status&amp;quot;, &amp;quot;stats&amp;quot;, &amp;quot;robots.txt&amp;quot;,&lt;br /&gt;
      &amp;quot;include&amp;quot;, &amp;quot;pub&amp;quot;, &amp;quot;news&amp;quot;, &amp;quot;index.html&amp;quot;, &amp;quot;sync.json&amp;quot;, &amp;quot;ups&amp;quot;,&lt;br /&gt;
      &amp;quot;pool&amp;quot;, &amp;quot;dists&amp;quot;, &amp;quot;csclub.asc&amp;quot;, &amp;quot;csclub.gpg&amp;quot;&lt;br /&gt;
    ]&lt;br /&gt;
    if (&lt;br /&gt;
      is_null(err) &amp;amp;&amp;amp; !includes(ignore, distro) &amp;amp;&amp;amp; !contains(request, &amp;quot;..&amp;quot;) &amp;amp;&amp;amp;&lt;br /&gt;
      !starts_with(request, &amp;quot;#&amp;quot;) &amp;amp;&amp;amp; !starts_with(request, &amp;quot;%&amp;quot;) &amp;amp;&amp;amp; !starts_with(request, &amp;quot;.&amp;quot;)&lt;br /&gt;
    ) {&lt;br /&gt;
      . = {&lt;br /&gt;
        &amp;quot;distro&amp;quot;: distro,&lt;br /&gt;
        &amp;quot;user_agent&amp;quot;: parsed_log.agent,&lt;br /&gt;
        &amp;quot;ip_address&amp;quot;: parsed_log.client,&lt;br /&gt;
        &amp;quot;bytes_sent&amp;quot;: parsed_log.size,&lt;br /&gt;
        &amp;quot;timestamp&amp;quot;: parsed_log.timestamp,&lt;br /&gt;
        &amp;quot;job&amp;quot;: &amp;quot;vector-mirror&amp;quot;&lt;br /&gt;
      }&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, here&#039;s the aggregator config, which collects data from each agent and then inserts it into ClickHouse:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[enrichment_tables.enrich_geoip]&lt;br /&gt;
type = &amp;quot;geoip&amp;quot;&lt;br /&gt;
path = &amp;quot;/usr/share/GeoIP/GeoLite2-City.mmdb&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[sources.source_agents]&lt;br /&gt;
type = &amp;quot;vector&amp;quot;&lt;br /&gt;
address = &amp;quot;[::]:5045&amp;quot;&lt;br /&gt;
  [sources.source_agents.tls]&lt;br /&gt;
  enabled = true&lt;br /&gt;
  ca_file = &amp;quot;/etc/vector/agent.crt&amp;quot;&lt;br /&gt;
  crt_file = &amp;quot;/etc/vector/aggregator.crt&amp;quot;&lt;br /&gt;
  key_file = &amp;quot;/etc/vector/aggregator.key&amp;quot;&lt;br /&gt;
  verify_hostname = false&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_route]&lt;br /&gt;
type = &amp;quot;route&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_agents&amp;quot;]&lt;br /&gt;
route.sshd = &#039;.job == &amp;quot;vector-sshd&amp;quot;&#039;&lt;br /&gt;
route.mirror = &#039;.job == &amp;quot;vector-mirror&amp;quot;&#039;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_sshd]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route.sshd&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot; &lt;br /&gt;
  ipinfo = get_enrichment_table_record(&amp;quot;enrich_geoip&amp;quot;, {&amp;quot;ip&amp;quot;: .ip_address}) ?? {}&lt;br /&gt;
  if is_ipv4!(.ip_address) &amp;amp;&amp;amp; (ip_cidr_contains!(&amp;quot;10.0.0.0/8&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;172.16.0.0/12&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;192.168.0.0/16&amp;quot;, .ip_address)) {&lt;br /&gt;
    ipinfo.country_code = &amp;quot;CA&amp;quot;;&lt;br /&gt;
    ipinfo.region_name = &amp;quot;Ontario&amp;quot;;&lt;br /&gt;
    ipinfo.city_name = &amp;quot;Waterloo&amp;quot;;&lt;br /&gt;
  }&lt;br /&gt;
  . = {&lt;br /&gt;
    &amp;quot;host&amp;quot;: .host,&lt;br /&gt;
    &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
    &amp;quot;username&amp;quot;: .username,&lt;br /&gt;
    &amp;quot;ip_address&amp;quot;: ip_to_ipv6!(.ip_address),&lt;br /&gt;
    &amp;quot;country_code&amp;quot;: ipinfo.country_code || &amp;quot;&amp;quot;&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_mirror]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route.mirror&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  ipinfo = get_enrichment_table_record(&amp;quot;enrich_geoip&amp;quot;, {&amp;quot;ip&amp;quot;: .ip_address}) ?? {}&lt;br /&gt;
  if is_ipv4!(.ip_address) &amp;amp;&amp;amp; (ip_cidr_contains!(&amp;quot;10.0.0.0/8&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;172.16.0.0/12&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;192.168.0.0/16&amp;quot;, .ip_address)) {&lt;br /&gt;
    ipinfo.country_code = &amp;quot;CA&amp;quot;;&lt;br /&gt;
    ipinfo.region_name = &amp;quot;Ontario&amp;quot;;&lt;br /&gt;
    ipinfo.city_name = &amp;quot;Waterloo&amp;quot;;&lt;br /&gt;
  }&lt;br /&gt;
  . = {&lt;br /&gt;
    &amp;quot;distro&amp;quot;: .distro,&lt;br /&gt;
    &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
    &amp;quot;ip_address&amp;quot;: ip_to_ipv6!(.ip_address),&lt;br /&gt;
    &amp;quot;bytes_sent&amp;quot;: .bytes_sent,&lt;br /&gt;
    &amp;quot;user_agent&amp;quot;: .user_agent,&lt;br /&gt;
    &amp;quot;country_code&amp;quot;: ipinfo.country_code || &amp;quot;&amp;quot;,&lt;br /&gt;
    &amp;quot;region_name&amp;quot;: ipinfo.region_name || &amp;quot;&amp;quot;,&lt;br /&gt;
    &amp;quot;city&amp;quot;: ipinfo.city_name || &amp;quot;&amp;quot;&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_unmatched]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route._unmatched&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  log(&amp;quot;unrecognized job: &amp;quot; + string!(.job || &amp;quot;null&amp;quot;), level: &amp;quot;warn&amp;quot;)&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[sinks.sink_unmatched]&lt;br /&gt;
type = &amp;quot;blackhole&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_unmatched&amp;quot;]&lt;br /&gt;
print_interval_secs = 0&lt;br /&gt;
&lt;br /&gt;
[sinks.clickhouse_sshd]&lt;br /&gt;
type = &amp;quot;clickhouse&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_sshd&amp;quot;]&lt;br /&gt;
encoding.timestamp_format = &amp;quot;unix&amp;quot;&lt;br /&gt;
endpoint = &amp;quot;$CLICKHOUSE_ENDPOINT&amp;quot;&lt;br /&gt;
database = &amp;quot;$CLICKHOUSE_DATABASE&amp;quot;&lt;br /&gt;
table = &amp;quot;failed_ssh_logins&amp;quot;&lt;br /&gt;
  [sinks.clickhouse_sshd.auth]&lt;br /&gt;
  strategy = &amp;quot;basic&amp;quot;&lt;br /&gt;
  user = &amp;quot;$CLICKHOUSE_USER&amp;quot;&lt;br /&gt;
  password = &amp;quot;$CLICKHOUSE_PASSWORD&amp;quot;&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
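&lt;br /&gt;
Once everything is running, a quick way to confirm that events are flowing end to end is to count recent rows on the ClickHouse side, e.g.:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
-- Example: failed SSH logins ingested over the last day&lt;br /&gt;
SELECT count() FROM vector.failed_ssh_logins WHERE timestamp &amp;gt; now() - INTERVAL 1 DAY;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;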
&lt;br /&gt;
=== Beats, Logstash and Loki (old) ===&lt;br /&gt;
We previously used Elastic Beats, Logstash and Grafana Loki for collecting and storing logs. One day I tried to upgrade Logstash and it exploded so badly that I figured it would be easier to just switch to Vector instead.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;The sections below are kept for historical purposes only and are no longer accurate.&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use a combination of [https://www.elastic.co/beats/ Elastic Beats], [https://www.elastic.co/logstash/ Logstash] and [https://grafana.com/oss/loki/ Loki] for collecting, storing and querying our logs; for visualization, we use Grafana. Logstash and Loki are currently both running in the prometheus VM.&lt;br /&gt;
&lt;br /&gt;
The reason why I chose Loki over Elasticsearch is because Loki is &amp;lt;i&amp;gt;very&amp;lt;/i&amp;gt; space efficient with regards to storage. It also consumes way less RAM and CPU. This means that we can collect a lot of logs without worrying too much about resource usage.&lt;br /&gt;
&lt;br /&gt;
We have Journalbeat and/or Filebeat running on some of our machines to collect logs from sshd, Apache and NGINX. The Beats send these logs to Logstash, which does some pre-processing. The most useful contribution by Logstash is its GeoIP plugin, which allows us to enrich the logs with some geographical information from IP addresses (e.g. add city and country). Logstash sends these logs to Loki, and we can then view these from Grafana.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note&#039;&#039;&#039;: Sometimes the Loki output plugin for Logstash disappears after a reboot or an upgrade. If you see Logstash complaining about this in the journald logs, run this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /usr/share/logstash&lt;br /&gt;
bin/logstash-plugin install logstash-output-loki&lt;br /&gt;
systemctl restart logstash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
See [https://grafana.com/docs/loki/latest/clients/logstash/ here] for details.&lt;br /&gt;
&lt;br /&gt;
The language for querying logs in Loki is [https://grafana.com/docs/loki/latest/logql/ LogQL], which, syntactically, is very similar to PromQL. If you have already learned PromQL, then you should be able to pick up LogQL very easily. You can try out some LogQL queries from the &#039;Explore&#039; page on Grafana; make sure you toggle the data source to &#039;Loki&#039; in the top left corner. For the &#039;topk&#039; queries, you will also want to toggle &#039;Query type&#039; to &#039;Instant&#039; rather than &#039;Range&#039;.&lt;br /&gt;
&lt;br /&gt;
==== LogQL examples ====&lt;br /&gt;
Here is the number of failed SSH login attempts per host over a given time range:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sum by (hostname) (&lt;br /&gt;
  count_over_time(&lt;br /&gt;
    {job=&amp;quot;logstash-sshd&amp;quot;} [$__range]&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that &amp;lt;code&amp;gt;$__range&amp;lt;/code&amp;gt; is a special [https://grafana.com/docs/grafana/latest/variables/variable-types/global-variables/ global variable] in Grafana which is equal to the time range in the top right corner of a chart.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here are the top 10 IP addresses from which failed SSH login attempts arrived, for a given host and time range:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(10,&lt;br /&gt;
  sum by (ip_address) (&lt;br /&gt;
    count_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-sshd&amp;quot;,hostname=&amp;quot;$hostname&amp;quot;} | json | __error__ = &amp;quot;&amp;quot;&lt;br /&gt;
      [$__range]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
$hostname is a chart variable, which can be configured from a chart&#039;s settings.&lt;br /&gt;
&lt;br /&gt;
I configured Logstash to send logs to Loki as JSON, but it&#039;s a rather hacky solution, so occasionally invalid JSON is sent.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here is the number of HTTP requests for the top 15 distros on our mirror over the last hour:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(15,&lt;br /&gt;
  sum by (distro) (&lt;br /&gt;
    count_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-nginx&amp;quot;} | json | __error__ = &amp;quot;&amp;quot; | distro != &amp;quot;server-status&amp;quot;&lt;br /&gt;
      [1h]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here is the total number of bytes sent over HTTP for the top 15 distros over the last hour. Note the use of the &amp;lt;code&amp;gt;unwrap&amp;lt;/code&amp;gt; operator.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(15,&lt;br /&gt;
  sum by (distro) (&lt;br /&gt;
    sum_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-nginx&amp;quot;} | json | __error__ = &amp;quot;&amp;quot; | distro != &amp;quot;server-status&amp;quot; | unwrap bytes&lt;br /&gt;
      [1h]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
You can see more examples on the Mirror Requests dashboard on Grafana.&lt;br /&gt;
&lt;br /&gt;
==== Avoid high cardinality ====&lt;br /&gt;
For both Prometheus and Loki, you must [https://prometheus.io/docs/practices/naming/#labels avoid high cardinality] labels at all costs. By high cardinality, I mean labels which can take on a very large number of values; for example, using a label to store IP addresses would be a very bad idea. This is because Prometheus and Loki index data by label set: every distinct combination of label values creates a separate time series (or log stream) which must be indexed and stored on its own, so high-cardinality labels blow up storage and memory usage.&lt;br /&gt;
&lt;br /&gt;
With Loki, you can extract labels from your logs dynamically inside your query. One way to do this is with the &amp;lt;code&amp;gt;json&amp;lt;/code&amp;gt; operator; there are other ways as well (see the LogQL docs). This effectively gives us unlimited cardinality from our logs; the tradeoff is that queries may take longer to execute.&lt;br /&gt;
&lt;br /&gt;
Also, be very careful about what you send to Loki from Logstash - [https://grafana.com/docs/loki/latest/clients/logstash/#usage-and-configuration every field in a Logstash message becomes a Loki label]. Usage of the &amp;lt;code&amp;gt;prune&amp;lt;/code&amp;gt; filter in Logstash is highly recommended.&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Observability&amp;diff=5181</id>
		<title>Observability</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Observability&amp;diff=5181"/>
		<updated>2023-12-16T00:06:09Z</updated>

		<summary type="html">&lt;p&gt;Merenber: Don&amp;#039;t remove country_codes.csv&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;There are [https://www.oreilly.com/library/view/distributed-systems-observability/9781492033431/ch04.html three pillars of observability]: metrics, logging and tracing. We are only interested in the first two.&lt;br /&gt;
&lt;br /&gt;
== Metrics ==&lt;br /&gt;
All of our machines are, or at least should be, running the Prometheus node exporter. This collects and sends machine metrics (e.g. RAM used, disk space) to the Prometheus server running at https://prometheus.csclub.uwaterloo.ca (currently a VM on phosphoric-acid). There are a few specialized exporters running on several other machines; a Postfix exporter is running on mail, an Apache exporter is running on caffeine, and an NGINX expoter is running on potassium-benzoate. There is also a custom exporter written by syscom running on potassium-benzoate for mirror stats.&lt;br /&gt;
&lt;br /&gt;
Most of the exporters use mutual TLS authentication with the Prometheus server. I set the expiration date for the TLS certs to 10 years. If you are reading this and it is 2031 or later, then go update the certs.&lt;br /&gt;
&lt;br /&gt;
I highly suggest becoming familiar with [https://prometheus.io/docs/prometheus/latest/querying/basics/ PromQL], the query language for Prometheus. You can run and visualize some queries at https://prometheus.csclub.uwaterloo.ca/prometheus. For example, here is a query to determine which machines are up or down:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
up{job=&amp;quot;node_exporter&amp;quot;}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Here&#039;s how we determine if a machine has NFS mounted. This will return 1 for machines which have NFS mounted, but will not return any records for machines which do not have NFS mounted. (We ignore the actual value of node_filesystem_device_error because it returns 1 for machines using Kerberized NFS.)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
count by (instance) (node_filesystem_device_error{mountpoint=&amp;quot;/users&amp;quot;, fstype=&amp;quot;nfs&amp;quot;})&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Now this is a rather complicated expression which can return one of three values:&lt;br /&gt;
* 0: the machine is down&lt;br /&gt;
* 1: the machine is up, but NFS is not mounted&lt;br /&gt;
* 2: the machine is up and NFS is mounted&lt;br /&gt;
The [https://prometheus.io/docs/prometheus/latest/querying/operators/#logical-set-binary-operators or operator] in PromQL is key here.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sum by (instance) (&lt;br /&gt;
  (count by (instance) (node_filesystem_device_error{mountpoint=&amp;quot;/users&amp;quot;, fstype=&amp;quot;nfs&amp;quot;}))&lt;br /&gt;
  or up{job=&amp;quot;node_exporter&amp;quot;}&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
We also use [https://prometheus.io/docs/alerting/latest/alertmanager/ AlertManager] to send email alerts from Prometheus metrics. We should figure out how to also send messages to IRC or similar.&lt;br /&gt;
&lt;br /&gt;
We also use the [https://github.com/prometheus/blackbox_exporter Blackbox prober exporter] to check if some of our web-based services are up.&lt;br /&gt;
&lt;br /&gt;
We make some pretty charts on Grafana (https://prometheus.csclub.uwaterloo.ca) from PromQL queries. Grafana also has an &#039;Explorer&#039; page where you can test out some queries before making chart panels from them.&lt;br /&gt;
&lt;br /&gt;
== Logging ==&lt;br /&gt;
We now use [https://vector.dev/ Vector] for collecting and transforming logs, and [https://clickhouse.com/ ClickHouse] for storing log data.&lt;br /&gt;
&lt;br /&gt;
=== ClickHouse ===&lt;br /&gt;
ClickHouse is a very fast OLAP database which has great documentation for storing and analyzing [https://clickhouse.com/use-cases/logging-and-metrics logging and metrics]. Unfortunately, the CPU on phosphoric-acid (which hosts the prometheus VM) is so old that when we try to install the official deb package, the following error occurs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Instruction check fail. The CPU does not support SSSE3 instruction set.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
So we&#039;re going to download the &amp;quot;compat&amp;quot; version instead:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /root&lt;br /&gt;
wget https://s3.amazonaws.com/clickhouse-builds/master/amd64compat/clickhouse&lt;br /&gt;
chmod +x clickhouse&lt;br /&gt;
./clickhouse install&lt;br /&gt;
rm clickhouse&lt;br /&gt;
wget -O /etc/systemd/system/clickhouse-server.service https://github.com/ClickHouse/ClickHouse/raw/master/packages/clickhouse-server.service&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable clickhouse-server&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
By default, systemd limits the number of threads which a service can create, so we&#039;ll want to disable that. Run &amp;lt;code&amp;gt;systemctl edit clickhouse-server&amp;lt;/code&amp;gt; and paste the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Service]&lt;br /&gt;
TasksMax=infinity&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, paste the following into /etc/clickhouse-server/users.d/csclub-users.xml:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;clickhouse&amp;gt;&lt;br /&gt;
  &amp;lt;readonly&amp;gt;&lt;br /&gt;
    &amp;lt;!-- Grafana needs to change settings queries --&amp;gt;&lt;br /&gt;
    &amp;lt;readonly&amp;gt;2&amp;lt;/readonly&amp;gt;&lt;br /&gt;
  &amp;lt;/readonly&amp;gt;&lt;br /&gt;
  &amp;lt;users&amp;gt;&lt;br /&gt;
    &amp;lt;default&amp;gt;&lt;br /&gt;
      &amp;lt;!-- The default user should only be allowed to connect from localhost --&amp;gt;&lt;br /&gt;
      &amp;lt;networks&amp;gt;&lt;br /&gt;
        &amp;lt;ip&amp;gt;::1&amp;lt;/ip&amp;gt;&lt;br /&gt;
        &amp;lt;ip&amp;gt;127.0.0.1&amp;lt;/ip&amp;gt;&lt;br /&gt;
      &amp;lt;/networks&amp;gt;&lt;br /&gt;
      &amp;lt;!-- Allow the default user to create new users --&amp;gt;&lt;br /&gt;
      &amp;lt;access_management&amp;gt;1&amp;lt;/access_management&amp;gt;&lt;br /&gt;
      &amp;lt;named_collection_control&amp;gt;1&amp;lt;/named_collection_control&amp;gt;&lt;br /&gt;
      &amp;lt;show_named_collections&amp;gt;1&amp;lt;/show_named_collections&amp;gt;&lt;br /&gt;
      &amp;lt;show_named_collections_secrets&amp;gt;1&amp;lt;/show_named_collections_secrets&amp;gt;&lt;br /&gt;
    &amp;lt;/default&amp;gt;&lt;br /&gt;
  &amp;lt;/users&amp;gt;&lt;br /&gt;
&amp;lt;/clickhouse&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then paste the following into /etc/clickhouse-server/config/zzz-csclub.xml (we need the zzz prefix because the configuration files are merged in alphabetical order, and we want ours to be applied last):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;clickhouse&amp;gt;&lt;br /&gt;
  &amp;lt;listen_host&amp;gt;127.0.0.1&amp;lt;/listen_host&amp;gt;&lt;br /&gt;
  &amp;lt;listen_host&amp;gt;::1&amp;lt;/listen_host&amp;gt;&lt;br /&gt;
  &amp;lt;logger&amp;gt;&lt;br /&gt;
    &amp;lt;level&amp;gt;information&amp;lt;/level&amp;gt;&lt;br /&gt;
    &amp;lt;size&amp;gt;100M&amp;lt;/size&amp;gt;&lt;br /&gt;
    &amp;lt;count&amp;gt;10&amp;lt;/count&amp;gt;&lt;br /&gt;
  &amp;lt;/logger&amp;gt;&lt;br /&gt;
  &amp;lt;mysql_port&amp;gt;&amp;lt;/mysql_port&amp;gt;&lt;br /&gt;
  &amp;lt;postgresql_port&amp;gt;&amp;lt;/postgresql_port&amp;gt;&lt;br /&gt;
&amp;lt;/clickhouse&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run &amp;lt;code&amp;gt;systemctl restart clickhouse-server&amp;lt;/code&amp;gt; and make sure that it&#039;s running.&lt;br /&gt;
&lt;br /&gt;
==== Schema ====&lt;br /&gt;
Run &amp;lt;code&amp;gt;clickhouse-client&amp;lt;/code&amp;gt; to get a SQL shell. First we need to create a new database and some users:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE DATABASE vector;&lt;br /&gt;
CREATE USER vector IDENTIFIED BY &#039;REPLACE_ME&#039;;&lt;br /&gt;
GRANT ALL ON vector.* TO vector;&lt;br /&gt;
CREATE USER grafana IDENTIFIED BY &#039;REPLACE_ME&#039; SETTINGS PROFILE &#039;readonly&#039;;&lt;br /&gt;
GRANT SHOW DATABASES, SHOW TABLES, SELECT ON *.* TO grafana;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In some of our tables, we&#039;ll store the two-letter country code instead of a country&#039;s full name to save space. So we&#039;ll create a [https://clickhouse.com/docs/en/sql-reference/dictionaries dictionary] so that we can look up a country&#039;s full name. Exit the SQL shell, then, download the CSV file:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wget -O /var/lib/clickhouse/user_files/country_codes.csv &#039;https://datahub.io/core/country-list/r/data.csv&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run &amp;lt;code&amp;gt;clickhouse-client&amp;lt;/code&amp;gt; and create the dictionary:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE DICTIONARY country_codes_dictionary&lt;br /&gt;
(&lt;br /&gt;
    Name String,&lt;br /&gt;
    Code String&lt;br /&gt;
)&lt;br /&gt;
PRIMARY KEY Code&lt;br /&gt;
SOURCE(FILE(path &#039;/var/lib/clickhouse/user_files/country_codes.csv&#039; FORMAT &#039;CSVWithNames&#039;))&lt;br /&gt;
LIFETIME(MIN 0 MAX 0)&lt;br /&gt;
LAYOUT(HASHED_ARRAY());&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Perform a SELECT to fill the table:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
SELECT * FROM country_codes_dictionary;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now we need to create the tables for storing our actual log data (after they are transformed by Vector).&lt;br /&gt;
Create a table for failed SSH logins:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.failed_ssh_logins&lt;br /&gt;
(&lt;br /&gt;
    host LowCardinality(String),&lt;br /&gt;
    timestamp DateTime,&lt;br /&gt;
    ip_address IPv6,&lt;br /&gt;
    username String,&lt;br /&gt;
    country_code LowCardinality(String)&lt;br /&gt;
)&lt;br /&gt;
ENGINE = MergeTree()&lt;br /&gt;
PRIMARY KEY (host, timestamp)&lt;br /&gt;
TTL timestamp + INTERVAL 1 MONTH DELETE;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Create a table for storing mirror requests:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    timestamp DateTime CODEC(Delta, ZSTD),&lt;br /&gt;
    ip_address IPv6,&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    user_agent String,&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    region_name String,&lt;br /&gt;
    city String&lt;br /&gt;
)&lt;br /&gt;
ENGINE = MergeTree()&lt;br /&gt;
PRIMARY KEY (distro, timestamp, country_code, region_name, city)&lt;br /&gt;
TTL timestamp + INTERVAL 1 WEEK DELETE;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
One of ClickHouse&#039;s great features is [https://clickhouse.com/docs/en/guides/developer/cascading-materialized-views Materialized Views]. These allow us to automatically &amp;quot;forward&amp;quot; data from one table to another, and the second table can use a different storage engine to aggregate data and save space.&lt;br /&gt;
&lt;br /&gt;
We want to calculate the total number of requests and bytes sent for each distro, so let&#039;s create a table and view for that:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_by_distro&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    num_requests UInt64,&lt;br /&gt;
    bytes_sent UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((num_requests, bytes_sent))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date, country_code)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_by_distro_mv&lt;br /&gt;
TO vector.mirror_requests_agg_by_distro&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) AS date,&lt;br /&gt;
    country_code,&lt;br /&gt;
    sum(1) AS num_requests,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
GROUP BY distro, date, country_code;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We also wants some stats for Canada specifically:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_canada&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    region_name LowCardinality(String),&lt;br /&gt;
    city String,&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    num_requests UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((bytes_sent, num_requests))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date, region_name, city)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_canada_mv&lt;br /&gt;
TO vector.mirror_requests_agg_canada&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) as date,&lt;br /&gt;
    region_name,&lt;br /&gt;
    city,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent,&lt;br /&gt;
    sum(1) AS num_requests&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
WHERE country_code = &#039;CA&#039;&lt;br /&gt;
GROUP BY distro, date, region_name, city;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We also want to keep stats just for the university:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_uw&lt;br /&gt;
(&lt;br /&gt;
    distro LowCardinality(String),&lt;br /&gt;
    date Date CODEC(Delta, ZSTD),&lt;br /&gt;
    bytes_sent UInt64,&lt;br /&gt;
    num_requests UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((bytes_sent, num_requests))&lt;br /&gt;
PRIMARY KEY (distro, toStartOfMonth(date), date)&lt;br /&gt;
TTL date + INTERVAL 1 MONTH&lt;br /&gt;
        GROUP BY distro, toStartOfMonth(date)&lt;br /&gt;
        SET num_requests = sum(num_requests),&lt;br /&gt;
            bytes_sent = sum(bytes_sent),&lt;br /&gt;
    date + INTERVAL 2 YEAR DELETE;&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_uw_mv&lt;br /&gt;
TO vector.mirror_requests_agg_uw&lt;br /&gt;
AS&lt;br /&gt;
SELECT&lt;br /&gt;
    distro,&lt;br /&gt;
    toDate(timestamp) as date,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent,&lt;br /&gt;
    sum(1) AS num_requests&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
WHERE isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:129.97.0.0/112&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:10.0.0.0/104&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:172.16.0.0/108&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;::ffff:192.168.0.0/112&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;2620:101:f000::/47&#039;)&lt;br /&gt;
   OR isIPAddressInRange(IPv6NumToString(ip_address), &#039;fd74:6b6a:8eca::/47&#039;)&lt;br /&gt;
GROUP BY distro, date, region_name, city;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, we&#039;ll store some stats for IP subnets:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
CREATE TABLE vector.mirror_requests_agg_ip&lt;br /&gt;
(&lt;br /&gt;
    timestamp DateTime CODEC(Delta, ZSTD),&lt;br /&gt;
    cidr_start IPv6,&lt;br /&gt;
    country_code LowCardinality(String),&lt;br /&gt;
    num_requests UInt64,&lt;br /&gt;
    bytes_sent UInt64&lt;br /&gt;
)&lt;br /&gt;
ENGINE = SummingMergeTree((num_requests, bytes_sent))&lt;br /&gt;
PRIMARY KEY (timestamp, cidr_start, country_code)&lt;br /&gt;
TTL timestamp + toIntervalWeek(2);&lt;br /&gt;
&lt;br /&gt;
CREATE MATERIALIZED VIEW vector.mirror_requests_agg_ip_mv TO vector.mirror_requests_agg_ip AS&lt;br /&gt;
SELECT&lt;br /&gt;
    toStartOfFiveMinutes(timestamp) AS timestamp,&lt;br /&gt;
    IPv6CIDRToRange(ip_address, 120).1 AS cidr_start,&lt;br /&gt;
    country_code,&lt;br /&gt;
    sum(1) AS num_requests,&lt;br /&gt;
    sum(bytes_sent) AS bytes_sent&lt;br /&gt;
FROM vector.mirror_requests&lt;br /&gt;
GROUP BY&lt;br /&gt;
    timestamp,&lt;br /&gt;
    cidr_start,&lt;br /&gt;
    country_code;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== GeoIP database ===&lt;br /&gt;
We&#039;ll want to look up geographic information for the IP addresses in our data. To do this, we&#039;ll use the [https://dev.maxmind.com/geoip/geolite2-free-geolocation-data MaxMind GeoLite2 databases]. Syscom already has a MaxMind account; the password is stored in the usual place. Install the latest geoipupdate package from [https://github.com/maxmind/geoipupdate/releases here], then edit /etc/GeoIP.conf as necessary (use the syscom account ID and license key). Set &amp;lt;code&amp;gt;EditionIDs&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;GeoLite2-City&amp;lt;/code&amp;gt; only.&lt;br /&gt;
&lt;br /&gt;
We&#039;ll use a systemd timer to run the geoipupdate script periodically. Paste the following into /etc/systemd/system/geoipupdate.service:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=GeoIP Update&lt;br /&gt;
Documentation=https://dev.maxmind.com/geoip/updating-databases&lt;br /&gt;
After=network-online.target&lt;br /&gt;
&lt;br /&gt;
[Service]&lt;br /&gt;
Type=oneshot&lt;br /&gt;
ExecStart=/usr/bin/geoipupdate&lt;br /&gt;
Nice=19&lt;br /&gt;
IOSchedulingClass=idle&lt;br /&gt;
IOSchedulingPriority=7&lt;br /&gt;
ProtectSystem=strict&lt;br /&gt;
ReadWritePaths=/usr/share/GeoIP&lt;br /&gt;
ProtectHome=true&lt;br /&gt;
PrivateTmp=true&lt;br /&gt;
PrivateDevices=true&lt;br /&gt;
ProtectHostname=true&lt;br /&gt;
ProtectClock=true&lt;br /&gt;
ProtectKernelTunables=true&lt;br /&gt;
ProtectKernelModules=true&lt;br /&gt;
ProtectKernelLogs=true&lt;br /&gt;
ProtectControlGroups=true&lt;br /&gt;
LockPersonality=true&lt;br /&gt;
RestrictRealtime=true&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Run &amp;lt;code&amp;gt;systemctl daemon-reload&amp;lt;/code&amp;gt; and then &amp;lt;code&amp;gt;systemctl start geoipupdate&amp;lt;/code&amp;gt; to download the database for the first time.&lt;br /&gt;
&lt;br /&gt;
Now paste the following into /etc/systemd/system/geoipupdate.timer:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[Unit]&lt;br /&gt;
Description=Automatic GeoIP database update&lt;br /&gt;
Documentation=https://dev.maxmind.com/geoip/updating-databases&lt;br /&gt;
&lt;br /&gt;
[Timer]&lt;br /&gt;
OnCalendar=monthly&lt;br /&gt;
RandomizedDelaySec=12h&lt;br /&gt;
Persistent=true&lt;br /&gt;
&lt;br /&gt;
[Install]&lt;br /&gt;
WantedBy=timers.target&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl daemon-reload&lt;br /&gt;
systemctl enable geoipupdate.timer&lt;br /&gt;
systemctl start geoipupdate.timer&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Vector ===&lt;br /&gt;
Vector allows you to create directed acyclic graphs (DAGs) for collecting and processing logs, which gives us a lot of flexibility. It also has a built-in scripting language, [https://vector.dev/docs/reference/vrl/ Vector Remap Language (VRL)] for slicing and dicing data. This allows us to remove fields which we don&#039;t need, add new fields which we do need, enrich an event with extra data, etc.&lt;br /&gt;
&lt;br /&gt;
Our data pipeline looks like this: Vector agents -&amp;gt; Vector aggregator -&amp;gt; ClickHouse. We use Grafana for visualization.&lt;br /&gt;
&lt;br /&gt;
We use mutual TLS between the agents and the aggregator to make sure that random people can&#039;t send us garbage data:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
openssl req -newkey rsa:2048 -nodes -keyout aggregator.key -x509 -out aggregator.crt -days 36500&lt;br /&gt;
openssl req -newkey rsa:2048 -nodes -keyout agent.key -x509 -out agent.crt -days 36500&lt;br /&gt;
chown vector:vector *.crt *.key&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Here is what our vector.toml looks like on the general-use machines; currently, we only use it for collecting failed SSH login attempts.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[sources.source_sshd]&lt;br /&gt;
type = &amp;quot;journald&amp;quot;&lt;br /&gt;
include_units = [&amp;quot;ssh.service&amp;quot;]&lt;br /&gt;
&lt;br /&gt;
[transforms.remap_sshd]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_sshd&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  parsed, err = parse_regex(&lt;br /&gt;
    .message, r&#039;^(?:Connection (?:closed|reset)|Disconnected) (?:by|from) (?:invalid|authenticating) user (?P&amp;lt;user&amp;gt;[^ ]+) (?P&amp;lt;ip&amp;gt;[0-9.a-f:]+)&#039;&lt;br /&gt;
  )&lt;br /&gt;
  if is_null(err) {&lt;br /&gt;
    . = {&lt;br /&gt;
      &amp;quot;username&amp;quot;: parsed.user,&lt;br /&gt;
      &amp;quot;ip_address&amp;quot;: parsed.ip,&lt;br /&gt;
      &amp;quot;host&amp;quot;: .host,&lt;br /&gt;
      &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
      &amp;quot;job&amp;quot;: &amp;quot;vector-sshd&amp;quot;&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.filter_sshd]&lt;br /&gt;
type = &amp;quot;filter&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;remap_sshd&amp;quot;]&lt;br /&gt;
condition = &#039;.job == &amp;quot;vector-sshd&amp;quot;&#039;&lt;br /&gt;
&lt;br /&gt;
[sinks.aggregator]&lt;br /&gt;
type = &amp;quot;vector&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;filter_sshd&amp;quot;]&lt;br /&gt;
address = &amp;quot;prometheus:5045&amp;quot;&lt;br /&gt;
  [sinks.aggregator.tls]&lt;br /&gt;
  enabled = true&lt;br /&gt;
  ca_file = &amp;quot;/etc/vector/aggregator.crt&amp;quot;&lt;br /&gt;
  crt_file = &amp;quot;/etc/vector/agent.crt&amp;quot;&lt;br /&gt;
  key_file = &amp;quot;/etc/vector/agent.key&amp;quot;&lt;br /&gt;
  verify_hostname = false&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The agent on potassium-benzoate collects NGINX logs:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[sources.source_nginx]&lt;br /&gt;
type = &amp;quot;file&amp;quot;&lt;br /&gt;
include = [&amp;quot;/var/log/nginx/access.log&amp;quot;]&lt;br /&gt;
max_read_bytes = 65536&lt;br /&gt;
&lt;br /&gt;
[transforms.remap_nginx]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_nginx&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot; &lt;br /&gt;
  parsed_log, err = parse_nginx_log(.message, &amp;quot;combined&amp;quot;)&lt;br /&gt;
  status = parsed_log.status&lt;br /&gt;
  request = string!(parsed_log.request || &amp;quot;&amp;quot;)&lt;br /&gt;
  if is_null(err) &amp;amp;&amp;amp; status == 200 {&lt;br /&gt;
    parsed_path, err = parse_regex(request, r&#039;^GET /+(?P&amp;lt;distro&amp;gt;[^/? ]+)&#039;)&lt;br /&gt;
    distro = parsed_path.distro&lt;br /&gt;
    ignore = [&lt;br /&gt;
      &amp;quot;server-status&amp;quot;, &amp;quot;stats&amp;quot;, &amp;quot;robots.txt&amp;quot;,&lt;br /&gt;
      &amp;quot;include&amp;quot;, &amp;quot;pub&amp;quot;, &amp;quot;news&amp;quot;, &amp;quot;index.html&amp;quot;, &amp;quot;sync.json&amp;quot;, &amp;quot;ups&amp;quot;,&lt;br /&gt;
      &amp;quot;pool&amp;quot;, &amp;quot;dists&amp;quot;, &amp;quot;csclub.asc&amp;quot;, &amp;quot;csclub.gpg&amp;quot;&lt;br /&gt;
    ]&lt;br /&gt;
    if (&lt;br /&gt;
      is_null(err) &amp;amp;&amp;amp; !includes(ignore, distro) &amp;amp;&amp;amp; !contains(request, &amp;quot;..&amp;quot;) &amp;amp;&amp;amp;&lt;br /&gt;
      !starts_with(request, &amp;quot;#&amp;quot;) &amp;amp;&amp;amp; !starts_with(request, &amp;quot;%&amp;quot;) &amp;amp;&amp;amp; !starts_with(request, &amp;quot;.&amp;quot;)&lt;br /&gt;
    ) {&lt;br /&gt;
      . = {&lt;br /&gt;
        &amp;quot;distro&amp;quot;: distro,&lt;br /&gt;
        &amp;quot;user_agent&amp;quot;: parsed_log.agent,&lt;br /&gt;
        &amp;quot;ip_address&amp;quot;: parsed_log.client,&lt;br /&gt;
        &amp;quot;bytes_sent&amp;quot;: parsed_log.size,&lt;br /&gt;
        &amp;quot;timestamp&amp;quot;: parsed_log.timestamp,&lt;br /&gt;
        &amp;quot;job&amp;quot;: &amp;quot;vector-mirror&amp;quot;&lt;br /&gt;
      }&lt;br /&gt;
    }&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Finally, here&#039;s the aggregator config, which collects data from each agent and then inserts it into ClickHouse:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[enrichment_tables.enrich_geoip]                                                    &lt;br /&gt;
type = &amp;quot;geoip&amp;quot;                                                                      &lt;br /&gt;
path = &amp;quot;/usr/share/GeoIP/GeoLite2-City.mmdb&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[sources.source_agents]      &lt;br /&gt;
type = &amp;quot;vector&amp;quot;&lt;br /&gt;
address = &amp;quot;[::]:5045&amp;quot;&lt;br /&gt;
  [sources.source_agents.tls]&lt;br /&gt;
  enabled = true&lt;br /&gt;
  ca_file = &amp;quot;/etc/vector/agent.crt&amp;quot;&lt;br /&gt;
  crt_file = &amp;quot;/etc/vector/aggregator.crt&amp;quot;&lt;br /&gt;
  key_file = &amp;quot;/etc/vector/aggregator.key&amp;quot;&lt;br /&gt;
  verify_hostname = false&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_route]&lt;br /&gt;
type = &amp;quot;route&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;source_agents&amp;quot;]&lt;br /&gt;
route.sshd = &#039;.job == &amp;quot;vector-sshd&amp;quot;&#039;&lt;br /&gt;
route.mirror = &#039;.job == &amp;quot;vector-mirror&amp;quot;&#039;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_sshd]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route.sshd&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot; &lt;br /&gt;
  ipinfo = get_enrichment_table_record(&amp;quot;enrich_geoip&amp;quot;, {&amp;quot;ip&amp;quot;: .ip_address}) ?? {}&lt;br /&gt;
  if is_ipv4!(.ip_address) &amp;amp;&amp;amp; (ip_cidr_contains!(&amp;quot;10.0.0.0/8&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;172.16.0.0/12&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;192.168.0.0/16&amp;quot;, .ip_address)) {&lt;br /&gt;
    ipinfo.country_code = &amp;quot;CA&amp;quot;;&lt;br /&gt;
    ipinfo.region_name = &amp;quot;Ontario&amp;quot;;&lt;br /&gt;
    ipinfo.city_name = &amp;quot;Waterloo&amp;quot;;&lt;br /&gt;
  }&lt;br /&gt;
  . = {&lt;br /&gt;
    &amp;quot;host&amp;quot;: .host,&lt;br /&gt;
    &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
    &amp;quot;username&amp;quot;: .username,&lt;br /&gt;
    &amp;quot;ip_address&amp;quot;: ip_to_ipv6!(.ip_address),&lt;br /&gt;
    &amp;quot;country_code&amp;quot;: ipinfo.country_code || &amp;quot;&amp;quot;&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_mirror]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route.mirror&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  ipinfo = get_enrichment_table_record(&amp;quot;enrich_geoip&amp;quot;, {&amp;quot;ip&amp;quot;: .ip_address}) ?? {}&lt;br /&gt;
  if is_ipv4!(.ip_address) &amp;amp;&amp;amp; (ip_cidr_contains!(&amp;quot;10.0.0.0/8&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;172.16.0.0/12&amp;quot;, .ip_address) \&lt;br /&gt;
                            || ip_cidr_contains!(&amp;quot;192.168.0.0/16&amp;quot;, .ip_address)) {&lt;br /&gt;
    ipinfo.country_code = &amp;quot;CA&amp;quot;;&lt;br /&gt;
    ipinfo.region_name = &amp;quot;Ontario&amp;quot;;&lt;br /&gt;
    ipinfo.city_name = &amp;quot;Waterloo&amp;quot;;&lt;br /&gt;
  }&lt;br /&gt;
  . = {&lt;br /&gt;
    &amp;quot;distro&amp;quot;: .distro,&lt;br /&gt;
    &amp;quot;timestamp&amp;quot;: .timestamp,&lt;br /&gt;
    &amp;quot;ip_address&amp;quot;: ip_to_ipv6!(.ip_address),&lt;br /&gt;
    &amp;quot;bytes_sent&amp;quot;: .bytes_sent,&lt;br /&gt;
    &amp;quot;user_agent&amp;quot;: .user_agent,&lt;br /&gt;
    &amp;quot;country_code&amp;quot;: ipinfo.country_code || &amp;quot;&amp;quot;,&lt;br /&gt;
    &amp;quot;region_name&amp;quot;: ipinfo.region_name || &amp;quot;&amp;quot;,&lt;br /&gt;
    &amp;quot;city&amp;quot;: ipinfo.city_name || &amp;quot;&amp;quot;&lt;br /&gt;
  }&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[transforms.transform_unmatched]&lt;br /&gt;
type = &amp;quot;remap&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_route._unmatched&amp;quot;]&lt;br /&gt;
source = &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
  log(&amp;quot;unrecognized job: &amp;quot; + string!(.job || &amp;quot;null&amp;quot;), level: &amp;quot;warn&amp;quot;)&lt;br /&gt;
&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
[sinks.sink_unmatched]&lt;br /&gt;
type = &amp;quot;blackhole&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_unmatched&amp;quot;]&lt;br /&gt;
print_interval_secs = 0&lt;br /&gt;
&lt;br /&gt;
[sinks.clickhouse_sshd]&lt;br /&gt;
type = &amp;quot;clickhouse&amp;quot;&lt;br /&gt;
inputs = [&amp;quot;transform_sshd&amp;quot;]&lt;br /&gt;
encoding.timestamp_format = &amp;quot;unix&amp;quot;&lt;br /&gt;
endpoint = &amp;quot;$CLICKHOUSE_ENDPOINT&amp;quot;&lt;br /&gt;
database = &amp;quot;$CLICKHOUSE_DATABASE&amp;quot;&lt;br /&gt;
table = &amp;quot;failed_ssh_logins&amp;quot;&lt;br /&gt;
  [sinks.clickhouse_sshd.auth]&lt;br /&gt;
  strategy = &amp;quot;basic&amp;quot;&lt;br /&gt;
  user = &amp;quot;$CLICKHOUSE_USER&amp;quot;&lt;br /&gt;
  password = &amp;quot;$CLICKHOUSE_PASSWORD&amp;quot;&lt;br /&gt;
&lt;br /&gt;
...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Beats, Logstash and Loki (old) ===&lt;br /&gt;
We previously used Elastic Beats, Logstash and Grafana Loki for collecting and storing logs. One day I tried to upgrade Logstash and it exploded so badly that I figured it would be easier to just switch to Vector instead.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;b&amp;gt;The sections below are kept for historical purposes only and are no longer accurate.&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We use a combination of [https://www.elastic.co/beats/ Elastic Beats], [https://www.elastic.co/logstash/ Logstash] and [https://grafana.com/oss/loki/ Loki] for collecting, storing and querying our logs; for visualization, we use Grafana. Logstash and Loki are currently both running in the prometheus VM.&lt;br /&gt;
&lt;br /&gt;
The reason why I chose Loki over Elasticsearch is because Loki is &amp;lt;i&amp;gt;very&amp;lt;/i&amp;gt; space efficient with regards to storage. It also consumes way less RAM and CPU. This means that we can collect a lot of logs without worrying too much about resource usage.&lt;br /&gt;
&lt;br /&gt;
We have Journalbeat and/or Filebeat running on some of our machines to collect logs from sshd, Apache and NGINX. The Beats send these logs to Logstash, which does some pre-processing. The most useful contribution by Logstash is its GeoIP plugin, which allows us to enrich the logs with some geographical information from IP addresses (e.g. add city and country). Logstash sends these logs to Loki, and we can then view these from Grafana.&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note&#039;&#039;&#039;: Sometimes the Loki output plugin for Logstash disappears after a reboot or an upgrade. If you see Logstash complaining about this in the journald logs, run this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /usr/share/logstash&lt;br /&gt;
bin/logstash-plugin install logstash-output-loki&lt;br /&gt;
systemctl restart logstash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
See [https://grafana.com/docs/loki/latest/clients/logstash/ here] for details.&lt;br /&gt;
&lt;br /&gt;
The language for querying logs in Loki is [https://grafana.com/docs/loki/latest/logql/ LogQL], which, syntactically, is very similar to PromQL. If you have already learned PromQL, then you should be able to pick up LogQL very easily. You can try out some LogQL queries from the &#039;Explore&#039; page on Grafana; make sure you toggle the data source to &#039;Loki&#039; in the top left corner. For the &#039;topk&#039; queries, you will also want to toggle &#039;Query type&#039; to &#039;Instant&#039; rather than &#039;Range&#039;.&lt;br /&gt;
&lt;br /&gt;
==== LogQL examples ====&lt;br /&gt;
Here are the number of failed SSH login attempts for each host for a given time range:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sum by (hostname) (&lt;br /&gt;
  count_over_time(&lt;br /&gt;
    {job=&amp;quot;logstash-sshd&amp;quot;} [$__range]&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Note that &amp;lt;code&amp;gt;$__range&amp;lt;/code&amp;gt; is a special [https://grafana.com/docs/grafana/latest/variables/variable-types/global-variables/ global variable] in Grafana which is equal to the time range in the top right corner of a chart.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here are the top 10 IP addresses from which failed SSH login attempts arrived, for a given host and time range:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(10,&lt;br /&gt;
  sum by (ip_address) (&lt;br /&gt;
    count_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-sshd&amp;quot;,hostname=&amp;quot;$hostname&amp;quot;} | json | __error__ = &amp;quot;&amp;quot;&lt;br /&gt;
      [$__range]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
$hostname is a chart variable, which can be configured from a chart&#039;s settings.&lt;br /&gt;
&lt;br /&gt;
I configured Logstash to send logs to Loki as JSON, but it&#039;s a rather hacky solution, so occasionally invalid JSON is sent.&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here are the number of HTTP requests for the 15 distros on our mirror from the last hour:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(15,&lt;br /&gt;
  sum by (distro) (&lt;br /&gt;
    count_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-nginx&amp;quot;} | json | __error__ = &amp;quot;&amp;quot; | distro != &amp;quot;server-status&amp;quot;&lt;br /&gt;
      [1h]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;br&amp;gt;&amp;lt;br&amp;gt;&lt;br /&gt;
Here are the number of total bytes sent over HTTP for the top 15 distros from the last hour. Note the use of the &amp;lt;code&amp;gt;unwrap&amp;lt;/code&amp;gt; operator.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
topk(15,&lt;br /&gt;
  sum by (distro) (&lt;br /&gt;
    sum_over_time(&lt;br /&gt;
      {job=&amp;quot;logstash-nginx&amp;quot;} | json | __error__ = &amp;quot;&amp;quot; | distro != &amp;quot;server-status&amp;quot; | unwrap bytes&lt;br /&gt;
      [1h]&lt;br /&gt;
    )&lt;br /&gt;
  )&lt;br /&gt;
)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
You can see more examples on the Mirror Requests dashboard on Grafana.&lt;br /&gt;
&lt;br /&gt;
==== Avoid high cardinality ====&lt;br /&gt;
For both Prometheus and Loki, you must [https://prometheus.io/docs/practices/naming/#labels avoid high cardinality] labels at all costs. By high cardinality, I mean labels which can take on a very large number of values; for example, using a label to store IP addresses would be a very bad idea. This is because Prometheus and Loki use labels to store metrics/logs efficiently with compression; when two metrics have different label sets, they cannot be stored together, so every new combination of label values increases storage space usage.&lt;br /&gt;
&lt;br /&gt;
With Loki, you can extract labels from your logs dynamically inside your query. One way to do this is with the &amp;lt;code&amp;gt;json&amp;lt;/code&amp;gt; operator; there are other ways as well (see the LogQL docs). This effectively gives us unlimited cardinality from our logs, the tradeoff being that queries may take longer to execute.&lt;br /&gt;
&lt;br /&gt;
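For example, to view only the failed SSH attempts arriving from a single address (a hypothetical IP), you can filter on a field parsed out of the JSON at query time instead of storing it as a label:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
{job=&amp;quot;logstash-sshd&amp;quot;} | json | __error__ = &amp;quot;&amp;quot; | ip_address = &amp;quot;192.0.2.1&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;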
Also, be very careful about what you send to Loki from Logstash - [https://grafana.com/docs/loki/latest/clients/logstash/#usage-and-configuration every field in a Logstash message becomes a Loki label]. Using the &amp;lt;code&amp;gt;prune&amp;lt;/code&amp;gt; filter in Logstash to drop unneeded fields is highly recommended.&lt;br /&gt;
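&lt;br /&gt;
For example, a filter block like the following (a sketch; the whitelisted field names are hypothetical and must match your pipeline) keeps label cardinality bounded:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
filter {&lt;br /&gt;
  prune {&lt;br /&gt;
    # Drop every field whose name does not match one of these patterns&lt;br /&gt;
    whitelist_names =&amp;gt; [&amp;quot;^message$&amp;quot;, &amp;quot;^hostname$&amp;quot;, &amp;quot;^job$&amp;quot;]&lt;br /&gt;
  }&lt;br /&gt;
}&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>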
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5180</id>
		<title>IPMI101</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=IPMI101&amp;diff=5180"/>
		<updated>2023-12-12T06:07:51Z</updated>

		<summary type="html">&lt;p&gt;Merenber: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Guide to IPMI (IPMI 101) =&lt;br /&gt;
&lt;br /&gt;
IPMI is a necessary evil. Let’s learn to make the best of it.&lt;br /&gt;
&lt;br /&gt;
== Setting up IPMI ==&lt;br /&gt;
&lt;br /&gt;
# Install ipmitool&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# apt-get install ipmitool&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Load IPMI modules (they are included in most upstream kernels)&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You may also need a kernel module specific to your motherboard&#039;s manufacturer, as some BMCs/LOMs do not conform to the IPMI spec and thus need a translation layer.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# modprobe ipmi_*&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;3&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Locally connect to the &amp;lt;code&amp;gt;/dev/ipmi&amp;lt;/code&amp;gt; interface&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; help&lt;br /&gt;
&amp;amp;gt; mc info&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Securing IPMI ==&lt;br /&gt;
&lt;br /&gt;
Note that root on the machine is root on the BMC and vice versa.&lt;br /&gt;
&lt;br /&gt;
# User administration&lt;br /&gt;
&lt;br /&gt;
(Re)set the password, rename the admin account to root, and delete any extra users, as they can have surprising privileges. You may have to use the BMC&#039;s web interface to delete accounts.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; user list 1&lt;br /&gt;
ID Name ...&lt;br /&gt;
2  ADMIN ...&lt;br /&gt;
&amp;amp;gt; user set password 2&lt;br /&gt;
User id 2: *******&lt;br /&gt;
User id 2: *******&lt;br /&gt;
&amp;amp;gt; user set username 2 root&lt;br /&gt;
&amp;amp;gt; user disable $other_user_ids&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;ol start=&amp;quot;2&amp;quot; style=&amp;quot;list-style-type: decimal;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Disable NULL password and cipher suite 0&amp;lt;/li&amp;gt;&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that $channel is usually 0, but it can range from 0 to 10, and there can be multiple NICs and so multiple channels to fix. In the cipher_privs string below, each character position corresponds to a cipher suite ID starting from 0; &#039;X&#039; disables a suite and &#039;a&#039; grants it admin privilege, so the example disables everything except cipher suite 3.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel auth ADMIN MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth CALLBACK MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth USER MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel auth OPERATOR MD5&lt;br /&gt;
&amp;amp;gt; lan set $channel cipher_privs XXXaXXXXXXXXXXX&lt;br /&gt;
&amp;amp;gt; lan print $channel&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Configuring networking ==&lt;br /&gt;
&lt;br /&gt;
Note once again that there are sometimes multiple channels; to find the correct channel, use trial and error and/or an ARP scanner to find the correct MAC address. Usually the channel is 0, but I have seen 1, 8 and 17, especially when there are multiple NICs.&lt;br /&gt;
&lt;br /&gt;
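Something like the following loop can help locate active channels (a sketch; the channel range and grep pattern may need adjusting for your BMC):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
for c in $(seq 0 17); do&lt;br /&gt;
    # Valid channels print their LAN settings, including the MAC address&lt;br /&gt;
    ipmitool lan print $c 2&amp;gt;/dev/null | grep -iq &#039;mac address&#039; &amp;amp;&amp;amp; echo &amp;quot;channel $c&amp;quot;&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;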
&amp;lt;pre&amp;gt;# ipmitool shell&lt;br /&gt;
&amp;amp;gt; lan print $channel&lt;br /&gt;
&amp;amp;gt; lan set $channel ipsrc static&lt;br /&gt;
&amp;amp;gt; lan set $channel ipaddr 10.15.134.?&lt;br /&gt;
&amp;amp;gt; lan set $channel defgw ipaddr 10.15.134.1&lt;br /&gt;
&amp;amp;gt; lan set $channel netmask 255.255.255.0&lt;br /&gt;
// if you have vlan tagging enabled on the switch port, useful for a shared NIC&lt;br /&gt;
&amp;amp;gt; lan set $channel vlan id 520&amp;lt;/pre&amp;gt;&lt;br /&gt;
== Configuring Serial over LAN ==&lt;br /&gt;
&lt;br /&gt;
To enable Serial over LAN, ensure that it is enabled in your BIOS or EFI setup utility, and note the baud rate; 115200 is used as an example below. Note that, in my experience, GRUB is the only boot loader that takes input via serial properly; Syslinux failed horribly on corn-syrup.&lt;br /&gt;
&lt;br /&gt;
Paste the following into /etc/default/grub.d/99-csclub.cfg:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
GRUB_CMDLINE_LINUX=&amp;amp;quot;console=tty1 console=ttyS1,115200n8&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_INPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_TERMINAL_OUTPUT=&amp;amp;quot;console serial&amp;amp;quot;&lt;br /&gt;
GRUB_SERIAL_COMMAND=&amp;amp;quot;serial --speed=115200 --unit=1 --word=8 --parity=no --stop=1&amp;amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
and then run:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;// on Debian-based distros&lt;br /&gt;
// Yay, Debian magic :\&lt;br /&gt;
# update-grub&lt;br /&gt;
// on upstream packages (Arch, Fedora, etc.)&lt;br /&gt;
# grub-mkconfig -o /boot/grub/grub.cfg&lt;br /&gt;
&lt;br /&gt;
# reboot&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
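Once SOL is enabled and the BMC is reachable over the network, you can attach to the serial console remotely with something like the following (a sketch; substitute your BMC&#039;s hostname):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ipmitool -I lanplus -H $bmc_host -U root sol activate&lt;br /&gt;
// type ~. to terminate the session&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;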
= iDRAC =&lt;br /&gt;
== riboflavin ==&lt;br /&gt;
riboflavin is using iDRAC 6. The web console can be viewed from https://riboflavin-ipmi.csclub.uwaterloo.ca; if you are not on campus, you can use a [[How_to_SSH#SOCKS_proxy|SOCKS proxy]]. Unfortunately, the virtual console uses Java Web Start, which is now deprecated. Here&#039;s a workaround which you can use instead.&lt;br /&gt;
&lt;br /&gt;
From the web UI, go to the &amp;quot;Console/Media&amp;quot; tab and click the &amp;quot;Launch virtual console&amp;quot; button. This will download a file whose name starts with &amp;quot;viewer.jnlp&amp;quot;. Now go to https://www.java.com and download JRE 8; any later version will not have support for JWS (note that OpenJDK will not work; JWS was a proprietary framework from Sun/Oracle). Unpack the tarball, open jre1.8.0_391/lib/security/java.security in a text editor, and comment out the following properties (note that each property spans multiple lines):&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.certpath.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.jar.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;jdk.tls.disabledAlgorithms&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you are off-campus, you will need to set up some proxying so that the Java application can access ports 443 and 5900 on riboflavin-ipmi. In the example below, I am using caffeine as a jump host, but any machine on campus should do:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh -L 5443:localhost:5443 -L 5900:localhost:5900 caffeine.csclub.uwaterloo.ca&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now on caffeine, open a tmux/screen session, and run the following commands in two different panes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5443,fork TCP:riboflavin-ipmi:443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&amp;lt;code&amp;gt;socat TCP-LISTEN:5900,fork TCP:riboflavin-ipmi:5900&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Back on your personal machine, open the viewer.jnlp file in a text editor and perform the following edits (a scripted sketch of the first two follows the list):&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Replace all instances of &amp;lt;code&amp;gt;riboflavin-ipmi.csclub.uwaterloo.ca:443&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost:5443&amp;lt;/code&amp;gt;&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, the first &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; child element should say &amp;lt;code&amp;gt;ip=riboflavin-ipmi&amp;lt;/code&amp;gt;. Replace this with &amp;lt;code&amp;gt;ip=localhost&amp;lt;/code&amp;gt;.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Under the &amp;lt;code&amp;gt;application-desc&amp;lt;/code&amp;gt; element, there are child &amp;lt;code&amp;gt;argument&amp;lt;/code&amp;gt; elements for &amp;lt;code&amp;gt;user&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;passwd&amp;lt;/code&amp;gt;. For some reason these are set to numbers; set these to the username and password for IPMI (the username should be &amp;lt;code&amp;gt;root&amp;lt;/code&amp;gt;).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
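&lt;br /&gt;
The first two edits can be scripted if you prefer (a sketch assuming GNU sed; the user/passwd arguments from step 3 still need to be filled in by hand):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sed -i \&lt;br /&gt;
    -e &#039;s/riboflavin-ipmi.csclub.uwaterloo.ca:443/localhost:5443/g&#039; \&lt;br /&gt;
    -e &#039;s/ip=riboflavin-ipmi/ip=localhost/&#039; \&lt;br /&gt;
    viewer.jnlp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;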
&lt;br /&gt;
Now run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
jre1.8.0_391/bin/javaws viewer.jnlp&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If all goes well, the virtual console should eventually appear:&lt;br /&gt;
[[File:Riboflavin-idrac-virtual-console.png|1000px]]&lt;br /&gt;
&lt;br /&gt;
= Supermicro =&lt;br /&gt;
== ginkgo ==&lt;br /&gt;
To access the virtual console on ginkgo, the steps are the same as those for riboflavin, with the following changes:&lt;br /&gt;
&amp;lt;ul&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;In the launch.jnlp file, in the first &amp;lt;code&amp;gt;&amp;lt;argument&amp;gt;&amp;lt;/code&amp;gt; element under &amp;lt;code&amp;gt;&amp;lt;application-desc&amp;gt;&amp;lt;/code&amp;gt;, replace &amp;lt;code&amp;gt;ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;localhost&amp;lt;/code&amp;gt;. This is the only change which you should make to this file (unless you are already on the campus network, in which case you do not need to modify this file at all).&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;Run &amp;lt;code&amp;gt;jre1.8.0_391/bin/ControlPanel&amp;lt;/code&amp;gt;, go to the Security tab, click &amp;quot;Edit Site List&amp;quot;, and add &amp;lt;code&amp;gt;https://ginkgo-ipmi.csclub.uwaterloo.ca&amp;lt;/code&amp;gt; as an exception.&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ul&amp;gt;&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=File:Riboflavin-idrac-virtual-console.png&amp;diff=5177</id>
		<title>File:Riboflavin-idrac-virtual-console.png</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=File:Riboflavin-idrac-virtual-console.png&amp;diff=5177"/>
		<updated>2023-12-08T04:58:51Z</updated>

		<summary type="html">&lt;p&gt;Merenber: iDRAC virtual console for riboflavin&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Summary ==&lt;br /&gt;
iDRAC virtual console for riboflavin&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5176</id>
		<title>MySQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5176"/>
		<updated>2023-12-05T18:23:13Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Backups */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
Note: the database on caffeine is actually MariaDB, not MySQL. Although they are mostly compatible, there are some incompatibilities to be aware of. See [https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/ MariaDB versus MySQL: Compatibility] for details.&lt;br /&gt;
&lt;br /&gt;
=== Creating databases ===&lt;br /&gt;
&lt;br /&gt;
Users can create their own MySQL databases through [[ceo]]. Users emailing syscom asking for a MySQL database should be directed to do so. The process is as follows:&lt;br /&gt;
&lt;br /&gt;
# SSH into any [[Machine_List|CSC machine]].&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;ceo&amp;lt;/tt&amp;gt;.&lt;br /&gt;
# Select &amp;quot;Create MySQL database&amp;quot; and follow the instructions.&lt;br /&gt;
# Login info will be stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory.&lt;br /&gt;
# You can now connect to the MySQL database (from [[Machine_List#caffeine|caffeine]] only).&lt;br /&gt;
&lt;br /&gt;
=== Deleting databases ===&lt;br /&gt;
&lt;br /&gt;
Users can delete their own MySQL databases. &lt;br /&gt;
&lt;br /&gt;
SSH into [[Machine_List#caffeine|caffeine]].&lt;br /&gt;
 mysql -u yourusernamehere -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
 DROP DATABASE yourdatabasenamehere;&lt;br /&gt;
The login info and database name were written to &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory when the database was created.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually ===&lt;br /&gt;
To create a MySQL database manually on caffeine, first connect to the database as root:&lt;br /&gt;
&lt;br /&gt;
 $ mysql -uroot -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
&lt;br /&gt;
Then run the following SQL statements:&lt;br /&gt;
&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;%&#039; IDENTIFIED BY &#039;longrandompassword&#039;;&lt;br /&gt;
 CREATE DATABASE someuser;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* TO &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* TO &#039;someuser&#039;@&#039;%&#039;;&lt;br /&gt;
&lt;br /&gt;
This will allow users to connect locally without a password, and connect remotely with a password.&lt;br /&gt;
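&lt;br /&gt;
For example, with the someuser database above, both of the following should work (a sketch; the remote form prompts for longrandompassword):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Locally on caffeine, as the someuser account, over the unix socket:&lt;br /&gt;
$ mysql someuser&lt;br /&gt;
# Remotely, over TCP:&lt;br /&gt;
$ mysql -h caffeine.csclub.uwaterloo.ca -u someuser -p someuser&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;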
&lt;br /&gt;
For random passwords run &amp;lt;code&amp;gt;pwgen -s 20 1&amp;lt;/code&amp;gt;. For the administrative passwords see /users/sysadmin/passwords/mysql.&lt;br /&gt;
&lt;br /&gt;
Write a file (usually ~club/mysql) in the club&#039;s home directory, readable only by them, containing the following:&lt;br /&gt;
&lt;br /&gt;
 Username: clubuserid&lt;br /&gt;
 Password: longrandompassword&lt;br /&gt;
 Hostname: localhost&lt;br /&gt;
&lt;br /&gt;
Try not to send passwords via plaintext email.&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
See the history of this page for information on the previous replication setup.&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
&lt;br /&gt;
We use [https://mariadb.com/kb/en/mariabackup-overview/ mariabackup] to take periodic backups. It is currently installed and configured on both caffeine and coffee.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing mariabackup on coffee, and sending the backups to corn-syrup.&lt;br /&gt;
&lt;br /&gt;
First, install the mariadb-backup package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install mariadb-backup&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Next, create an SSH key pair for the mysql user:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir /var/mariadb&lt;br /&gt;
chown mysql:mysql /var/mariadb&lt;br /&gt;
su -s /bin/bash mysql&lt;br /&gt;
cd /var/mariadb&lt;br /&gt;
mkdir .ssh&lt;br /&gt;
chmod 700 .ssh&lt;br /&gt;
 # Choose /var/mariadb/.ssh/id_ed25519 for the path&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the public key (/var/mariadb/.ssh/id_ed25519.pub) into /users/syscom/.ssh/authorized_keys on corn-syrup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... mysql@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Also create the folder &amp;lt;code&amp;gt;/users/syscom/backups/coffee/mariabackup&amp;lt;/code&amp;gt;. We will store the backups here.&lt;br /&gt;
&lt;br /&gt;
We will use a hacky bash script to try to emulate the same behaviour as pgBackRest. We will compress and stream each backup to a folder on corn-syrup in the format &amp;lt;code&amp;gt;1701678356-F&amp;lt;/code&amp;gt;, where the number is a Unix epoch timestamp and the letter at the end is one of F, D or I (for full, differential or incremental backups). Full backups do not depend on any other backups. Differential backups depend on the latest full backup before them. Incremental backups depend on the latest backup before them (of any type).&lt;br /&gt;
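&lt;br /&gt;
For example, after a few days the backup folder on corn-syrup might look like this (hypothetical timestamps):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
$ ls /users/syscom/backups/coffee/mariabackup&lt;br /&gt;
1701678356-F  1701764756-D  1701768356-I  1701851156-D&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;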
&lt;br /&gt;
On coffee, paste the following into e.g. /var/mariadb/bin/backup-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
RETENTION_FULL=2&lt;br /&gt;
RETENTION_DIFF=4&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
# $USER doesn&#039;t seem to be defined when we run this from cron&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 &amp;lt;full|diff|incr&amp;gt;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backup_type=$1&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = full ]; then&lt;br /&gt;
    backup_type_letter=F&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = diff ]; then&lt;br /&gt;
    backup_type_letter=D&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    backup_type_letter=I&lt;br /&gt;
else&lt;br /&gt;
    echo &amp;quot;Backup type must be one of &#039;full&#039;, &#039;diff&#039; or &#039;incr&#039;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if ! pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;MariaDB is not running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariabackup &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;mariabackup is already running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Delete temporary files left behind by previous run, if there are any&lt;br /&gt;
$SSH -- &amp;quot;rm -rf $SSH_FOLDER/*.tmp&amp;quot;&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
incremental_basedir_args=&lt;br /&gt;
old_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
new_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $old_checkpoint_dir $new_checkpoint_dir&amp;quot; EXIT&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = diff -o &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    # Find a backup which we can use as a base.&lt;br /&gt;
    # For incr, this can be any type; for diff, this must be a full backup.&lt;br /&gt;
    base_backup=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        backup=${backups[i]}&lt;br /&gt;
        if [ $backup_type = incr ] || [[ $backup =~ -F$ ]]; then&lt;br /&gt;
            base_backup=$backup&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$base_backup&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find base backup for $backup_type type&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
    # Copy the xtrabackup_checkpoints file from the base backup into a&lt;br /&gt;
    # temporary directory, and use it in the mariabackup command.&lt;br /&gt;
    scp $SSH_ARGS &amp;quot;$SSH_USER@$SSH_HOST:$SSH_FOLDER/$base_backup/xtrabackup_*&amp;quot; $old_checkpoint_dir/&lt;br /&gt;
    incremental_basedir_args=&amp;quot;--incremental-basedir=$old_checkpoint_dir&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
compress_level=6&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    # Use a lower compression level to go faster&lt;br /&gt;
    compress_level=5&lt;br /&gt;
fi&lt;br /&gt;
foldername=&amp;quot;$(date +%s)-$backup_type_letter&amp;quot;&lt;br /&gt;
# First copy to a temporary dir, then rename the temporary dir to the&lt;br /&gt;
# desired dir name (in case our process gets killed)&lt;br /&gt;
mariabackup --user=mysql --backup $incremental_basedir_args --stream=xbstream --extra-lsndir=$new_checkpoint_dir \&lt;br /&gt;
    | nice xz -$compress_level -T0 \&lt;br /&gt;
    | $SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; mkdir $foldername.tmp &amp;amp;&amp;amp; cat &amp;gt; $foldername.tmp/data.xb.xz&amp;quot;&lt;br /&gt;
scp $SSH_ARGS $new_checkpoint_dir/* $SSH_USER@$SSH_HOST:$SSH_FOLDER/$foldername.tmp/&lt;br /&gt;
$SSH -- &amp;quot;mv $SSH_FOLDER/$foldername.tmp $SSH_FOLDER/$foldername&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Delete old backups&lt;br /&gt;
if [ $backup_type = incr ]; then&lt;br /&gt;
    # We don&#039;t delete backups when making an incr backup, since we only&lt;br /&gt;
    # have retention limits for full and diff&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    retention=$RETENTION_FULL&lt;br /&gt;
else&lt;br /&gt;
    retention=$RETENTION_DIFF&lt;br /&gt;
fi&lt;br /&gt;
num_backups_of_same_type=1&lt;br /&gt;
backups_to_delete=()&lt;br /&gt;
for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
    backup=${backups[i]}&lt;br /&gt;
    if ! [[ $backup =~ -${backup_type_letter}$ ]]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    ((num_backups_of_same_type++))&lt;br /&gt;
    if [ $num_backups_of_same_type -lt $retention ]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    if [ $backup_type = full ]; then&lt;br /&gt;
        # Delete everything before the last full backup which we want to&lt;br /&gt;
        # keep&lt;br /&gt;
        pat=&#039;^&#039;&lt;br /&gt;
    else&lt;br /&gt;
        # Delete all the diff and incr backups before the last diff backup&lt;br /&gt;
        # which we want to keep&lt;br /&gt;
        pat=&#039;-[DI]$&#039;&lt;br /&gt;
    fi&lt;br /&gt;
    for ((j=$i-1; j&amp;gt;=0; j--)); do&lt;br /&gt;
        backup=${backups[j]}&lt;br /&gt;
        if [[ $backup =~ $pat ]]; then&lt;br /&gt;
            backups_to_delete+=($backup)&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    break&lt;br /&gt;
done&lt;br /&gt;
if [ ${#backups_to_delete[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups to delete&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
$SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; rm -r ${backups_to_delete[@]}&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The script should be invoked with exactly one argument, which must be one of &amp;quot;full&amp;quot;, &amp;quot;diff&amp;quot; or &amp;quot;incr&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
Paste something like the following into e.g. /etc/cron.d/mariadb_backup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
MAILTO=root@csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
# Full backup at 00:20 every Sunday and Wednesday&lt;br /&gt;
20 0 * * 0,3 mysql chronic /var/mariadb/bin/backup-mariadb.sh full&lt;br /&gt;
# Differential backup at 00:35 every day&lt;br /&gt;
35 0 * * * mysql chronic /var/mariadb/bin/backup-mariadb.sh diff&lt;br /&gt;
# Incremental backup at the 50th minute of every hour&lt;br /&gt;
50 * * * * mysql chronic /var/mariadb/bin/backup-mariadb.sh incr&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Restore ====&lt;br /&gt;
Paste the following into e.g. /var/mariadb/bin/restore-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
shopt -s dotglob&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -gt 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 [0123456789-I]&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;Please stop MariaDB first&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
if [ ${#backups[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups found&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -eq 1 ]; then&lt;br /&gt;
    last_backup_idx=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        if [ ${backups[i]} = &amp;quot;$1&amp;quot; ]; then&lt;br /&gt;
            last_backup_idx=$i&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$last_backup_idx&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find $1 on remote&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
else&lt;br /&gt;
    last_backup_idx=$(( ${#backups[@]} - 1 ))&lt;br /&gt;
fi&lt;br /&gt;
last_full_backup_idx=&lt;br /&gt;
for ((i=$last_backup_idx; i&amp;gt;=0; i--)); do&lt;br /&gt;
    if [[ ${backups[i]} =~ -F$ ]]; then&lt;br /&gt;
        last_full_backup_idx=$i&lt;br /&gt;
        break&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ -z &amp;quot;$last_full_backup_idx&amp;quot; ]; then&lt;br /&gt;
    echo &amp;quot;Could not find full backup for ${backups[last_backup_idx]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backups_to_use=()&lt;br /&gt;
if [[ ${backups[last_backup_idx]} =~ -F$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a full backup, we only need that one backup&lt;br /&gt;
    backups_to_use=(${backups[last_backup_idx]})&lt;br /&gt;
elif [[ ${backups[last_backup_idx]} =~ -D$ ]]; then&lt;br /&gt;
    # If we&#039;re restoring a diff backup, we only need that one backup and the&lt;br /&gt;
    # first full backup before it&lt;br /&gt;
    backups_to_use=(${backups[last_full_backup_idx]} ${backups[last_backup_idx]})&lt;br /&gt;
else&lt;br /&gt;
    # If we&#039;re restoring an incr backup, we need all the backups from it to&lt;br /&gt;
    # the first diff backup before it, and the first full backup before that.&lt;br /&gt;
    # If there is no diff backup between it and the last full backup, then&lt;br /&gt;
    # we need everything between it and the last full backup.&lt;br /&gt;
    for ((i=$last_backup_idx; i&amp;gt;=$last_full_backup_idx; i--)); do&lt;br /&gt;
        backups_to_use=(${backups[i]} ${backups_to_use[@]})&lt;br /&gt;
        if [[ ${backups[i]} =~ -D$ ]]; then&lt;br /&gt;
            backups_to_use=(${backups[last_full_backup_idx]} ${backups_to_use[@]})&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
fi&lt;br /&gt;
base_dir=$(mktemp -d)&lt;br /&gt;
incr_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $base_dir $incr_dir&amp;quot; EXIT&lt;br /&gt;
for backup in ${backups_to_use[@]}; do&lt;br /&gt;
    if [[ $backup =~ -F$ ]]; then&lt;br /&gt;
        backup_dir=$base_dir&lt;br /&gt;
    else&lt;br /&gt;
        backup_dir=$incr_dir&lt;br /&gt;
    fi&lt;br /&gt;
    $SSH -- &amp;quot;cat $SSH_FOLDER/$backup/data.xb.xz&amp;quot; | xz -d | mbstream -x -C $backup_dir&lt;br /&gt;
    incremental_dir_args=&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        incremental_dir_args=&amp;quot;--incremental-dir=$incr_dir&amp;quot;&lt;br /&gt;
    fi&lt;br /&gt;
    mariabackup --prepare --target-dir=$base_dir $incremental_dir_args&lt;br /&gt;
    if [ $backup_dir = $incr_dir ]; then&lt;br /&gt;
        rm -rf $incr_dir/*&lt;br /&gt;
    fi&lt;br /&gt;
done&lt;br /&gt;
if [ &amp;quot;$(/bin/ls -1 /var/lib/mysql | wc -l)&amp;quot; -gt 0 ]; then&lt;br /&gt;
    read -p &amp;quot;Everything under /var/lib/mysql will be deleted. Continue (y/n)? &amp;quot; yn&lt;br /&gt;
    yn=${yn,,}  # convert to lower case&lt;br /&gt;
    if [ &amp;quot;$yn&amp;quot; = y -o &amp;quot;$yn&amp;quot; = yes ]; then&lt;br /&gt;
        rm -rf /var/lib/mysql/*&lt;br /&gt;
    else&lt;br /&gt;
        echo &amp;quot;Aborting.&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
fi&lt;br /&gt;
mariabackup --move-back --target-dir=$base_dir&lt;br /&gt;
echo &amp;quot;Restoration succeeded, please restart MariaDB&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Make sure to stop MariaDB before restoring a backup. If this script is invoked without any arguments, the latest backup found on corn-syrup will be used; a single argument may also be specified, which must be the name of one of the backup folders stored on corn-syrup.&lt;br /&gt;
&lt;br /&gt;
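For example (a sketch; the backup name is hypothetical):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl stop mariadb&lt;br /&gt;
su -s /bin/bash mysql&lt;br /&gt;
# Restore the latest backup...&lt;br /&gt;
/var/mariadb/bin/restore-mariadb.sh&lt;br /&gt;
# ...or a specific one&lt;br /&gt;
/var/mariadb/bin/restore-mariadb.sh 1701678356-F&lt;br /&gt;
exit&lt;br /&gt;
systemctl start mariadb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;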
[[Category:Software]]&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5175</id>
		<title>MySQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5175"/>
		<updated>2023-12-05T18:17:46Z</updated>

		<summary type="html">&lt;p&gt;Merenber: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
Note: the database on caffeine is actually MariaDB, not MySQL. Although they are mostly compatible, there are some incompatibilities to be aware of. See [https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/ MariaDB versus MySQL: Compatibility] for details.&lt;br /&gt;
&lt;br /&gt;
=== Creating databases ===&lt;br /&gt;
&lt;br /&gt;
Users can create their own MySQL databases through [[ceo]]. Users emailing syscom asking for a MySQL database should be directed to do so. The process is as follows:&lt;br /&gt;
&lt;br /&gt;
# SSH into any [[Machine_List|CSC machine]].&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;ceo&amp;lt;/tt&amp;gt;.&lt;br /&gt;
# Select &amp;quot;Create MySQL database&amp;quot; and follow the instructions.&lt;br /&gt;
# Login info will be stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory.&lt;br /&gt;
# You can now connect to the MySQL database (from [[Machine_List#caffeine|caffeine]] only).&lt;br /&gt;
&lt;br /&gt;
=== Deleting databases ===&lt;br /&gt;
&lt;br /&gt;
Users can delete their own MySQL databases. &lt;br /&gt;
&lt;br /&gt;
SSH into [[Machine_List#caffeine|caffeine]].&lt;br /&gt;
 mysql -u yourusernamehere -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
 DROP DATABASE yourdatabasenamehere;&lt;br /&gt;
The login info and database name were written to &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory when the database was created.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually ===&lt;br /&gt;
To create a MySQL database manually on caffeine, first connect to the database as root:&lt;br /&gt;
&lt;br /&gt;
 $ mysql -uroot -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
&lt;br /&gt;
Then run the following SQL statements:&lt;br /&gt;
&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;%&#039; IDENTIFIED BY &#039;longrandompassword&#039;;&lt;br /&gt;
 CREATE DATABASE someuser;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* TO &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* TO &#039;someuser&#039;@&#039;%&#039;;&lt;br /&gt;
&lt;br /&gt;
This will allow users to connect locally without a password, and connect remotely with a password.&lt;br /&gt;
&lt;br /&gt;
For random passwords run &amp;lt;code&amp;gt;pwgen -s 20 1&amp;lt;/code&amp;gt;. For the administrative passwords see /users/sysadmin/passwords/mysql.&lt;br /&gt;
&lt;br /&gt;
Write a file (usually ~club/mysql) in the club&#039;s home directory, readable only by them, containing the following:&lt;br /&gt;
&lt;br /&gt;
 Username: clubuserid&lt;br /&gt;
 Password: longrandompassword&lt;br /&gt;
 Hostname: localhost&lt;br /&gt;
&lt;br /&gt;
Try not to send passwords via plaintext email.&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
See the history of this page for information on the previous replication setup.&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
&lt;br /&gt;
We use [https://mariadb.com/kb/en/mariabackup-overview/ mariabackup] to take periodic backups. It is currently installed and configured on both caffeine and coffee.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing mariabackup on coffee, and sending the backups to corn-syrup.&lt;br /&gt;
&lt;br /&gt;
First, install the mariadb-backup package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install mariadb-backup&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Next, create an SSH key pair for the mysql user:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir /var/mariadb&lt;br /&gt;
chown mysql:mysql /var/mariadb&lt;br /&gt;
su -s /bin/bash mysql&lt;br /&gt;
cd /var/mariadb&lt;br /&gt;
mkdir .ssh&lt;br /&gt;
chmod 700 .ssh&lt;br /&gt;
 # Choose /var/mariadb/.ssh/id_ed25519 for the path&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the public key (/var/mariadb/.ssh/id_ed25519.pub) into /users/syscom/.ssh/authorized_keys on corn-syrup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... mysql@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Also create the folder &amp;lt;code&amp;gt;/users/syscom/backups/coffee/mariabackup&amp;lt;/code&amp;gt;. We will store the backups here.&lt;br /&gt;
&lt;br /&gt;
We will use a hacky bash script to try to emulate the same behaviour as pgBackRest. We will compress and stream each backup to a folder on corn-syrup in the format &amp;lt;code&amp;gt;1701678356-F&amp;lt;/code&amp;gt;, where the number is a Unix epoch timestamp and the letter at the end is one of F, D or I (for full, differential or incremental backups). Full backups do not depend on any other backups. Differential backups depend on the latest full backup before them. Incremental backups depend on the latest backup before them (of any type).&lt;br /&gt;
&lt;br /&gt;
On coffee, paste the following into e.g. /var/mariadb/bin/backup-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
RETENTION_FULL=2&lt;br /&gt;
RETENTION_DIFF=4&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
# $USER doesn&#039;t seem to be defined when we run this from cron&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 &amp;lt;full|diff|incr&amp;gt;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backup_type=$1&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = full ]; then&lt;br /&gt;
    backup_type_letter=F&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = diff ]; then&lt;br /&gt;
    backup_type_letter=D&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    backup_type_letter=I&lt;br /&gt;
else&lt;br /&gt;
    echo &amp;quot;Backup type must be one of &#039;full&#039;, &#039;diff&#039; or &#039;incr&#039;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if ! pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;MariaDB is not running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariabackup &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;mariabackup is already running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Delete temporary files left behind by previous run, if there are any&lt;br /&gt;
$SSH -- &amp;quot;rm -rf $SSH_FOLDER/*.tmp&amp;quot;&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
incremental_basedir_args=&lt;br /&gt;
old_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
new_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $old_checkpoint_dir $new_checkpoint_dir&amp;quot; EXIT&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = diff -o &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    # Find a backup which we can use as a base.&lt;br /&gt;
    # For incr, this can be any type; for diff, this must be a full backup.&lt;br /&gt;
    base_backup=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        backup=${backups[i]}&lt;br /&gt;
        if [ $backup_type = incr ] || [[ $backup =~ -F$ ]]; then&lt;br /&gt;
            base_backup=$backup&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$base_backup&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find base backup for $backup_type type&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
    # Copy the xtrabackup_checkpoints file from the base backup into a&lt;br /&gt;
    # temporary directory, and use it in the mariabackup command.&lt;br /&gt;
    scp $SSH_ARGS &amp;quot;$SSH_USER@$SSH_HOST:$SSH_FOLDER/$base_backup/xtrabackup_*&amp;quot; $old_checkpoint_dir/&lt;br /&gt;
    incremental_basedir_args=&amp;quot;--incremental-basedir=$old_checkpoint_dir&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
compress_level=6&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    # Use a lower compression level to go faster&lt;br /&gt;
    compress_level=5&lt;br /&gt;
fi&lt;br /&gt;
foldername=&amp;quot;$(date +%s)-$backup_type_letter&amp;quot;&lt;br /&gt;
# First copy to a temporary dir, then rename the temporary dir to the&lt;br /&gt;
# desired dir name (in case our process gets killed)&lt;br /&gt;
mariabackup --user=mysql --backup $incremental_basedir_args --stream=xbstream --extra-lsndir=$new_checkpoint_dir \&lt;br /&gt;
    | nice xz -$compress_level -T0 \&lt;br /&gt;
    | $SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; mkdir $foldername.tmp &amp;amp;&amp;amp; cat &amp;gt; $foldername.tmp/data.xb.xz&amp;quot;&lt;br /&gt;
scp $SSH_ARGS $new_checkpoint_dir/* $SSH_USER@$SSH_HOST:$SSH_FOLDER/$foldername.tmp/&lt;br /&gt;
$SSH -- &amp;quot;mv $SSH_FOLDER/$foldername.tmp $SSH_FOLDER/$foldername&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Delete old backups&lt;br /&gt;
if [ $backup_type = incr ]; then&lt;br /&gt;
    # We don&#039;t delete backups when making an incr backup, since we only&lt;br /&gt;
    # have retention limits for full and diff&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    retention=$RETENTION_FULL&lt;br /&gt;
else&lt;br /&gt;
    retention=$RETENTION_DIFF&lt;br /&gt;
fi&lt;br /&gt;
num_backups_of_same_type=1&lt;br /&gt;
backups_to_delete=()&lt;br /&gt;
for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
    backup=${backups[i]}&lt;br /&gt;
    if ! [[ $backup =~ -${backup_type_letter}$ ]]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    ((num_backups_of_same_type++))&lt;br /&gt;
    if [ $num_backups_of_same_type -lt $retention ]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    if [ $backup_type = full ]; then&lt;br /&gt;
        # Delete everything before the last full backup which we want to&lt;br /&gt;
        # keep&lt;br /&gt;
        pat=&#039;^&#039;&lt;br /&gt;
    else&lt;br /&gt;
        # Delete all the diff and incr backups before the last diff backup&lt;br /&gt;
        # which we want to keep&lt;br /&gt;
        pat=&#039;-[DI]$&#039;&lt;br /&gt;
    fi&lt;br /&gt;
    for ((j=$i-1; j&amp;gt;=0; j--)); do&lt;br /&gt;
        backup=${backups[j]}&lt;br /&gt;
        if [[ $backup =~ $pat ]]; then&lt;br /&gt;
            backups_to_delete+=($backup)&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    break&lt;br /&gt;
done&lt;br /&gt;
if [ ${#backups_to_delete[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups to delete&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
$SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; rm -r ${backups_to_delete[@]}&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The script should be invoked with exactly one argument which must be one of &amp;quot;full&amp;quot;, &amp;quot;diff&amp;quot; or &amp;quot;incr&amp;quot;.&lt;br /&gt;
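&lt;br /&gt;
For example, to take a full backup by hand (a sketch; run as root on coffee):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su -s /bin/bash mysql -c &#039;/var/mariadb/bin/backup-mariadb.sh full&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;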
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
Paste something like the following into e.g. /etc/cron.d/mariadb_backup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
MAILTO=root@csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
# Full backup at 00:20 every Sunday and Wednesday&lt;br /&gt;
20 0 * * 0,3 mysql chronic /var/mariadb/bin/backup-mariadb.sh full&lt;br /&gt;
# Differential backup at 00:35 every day&lt;br /&gt;
35 0 * * * mysql chronic /var/mariadb/bin/backup-mariadb.sh diff&lt;br /&gt;
# Incremental backup at the 50th minute of every hour&lt;br /&gt;
50 * * * * mysql chronic /var/mariadb/bin/backup-mariadb.sh incr&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
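&lt;br /&gt;
Note that &amp;lt;code&amp;gt;chronic&amp;lt;/code&amp;gt; (from the moreutils package) suppresses a command&#039;s output unless it exits non-zero, so cron only emails MAILTO when a backup actually fails.&lt;br /&gt;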
&lt;br /&gt;
[[Category:Software]]&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5174</id>
		<title>MySQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5174"/>
		<updated>2023-12-05T18:15:25Z</updated>

		<summary type="html">&lt;p&gt;Merenber: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
Note: the database on caffeine is actually MariaDB, not MySQL. Although they are mostly compatible, there are some incompatibilities to be aware of. See [https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/ MariaDB versus MySQL: Compatibility] for details.&lt;br /&gt;
&lt;br /&gt;
=== Creating databases ===&lt;br /&gt;
&lt;br /&gt;
Users can create their own MySQL databases through [[ceo]]. Users emailing syscom asking for a MySQL database should be directed to do so. The process is as follows:&lt;br /&gt;
&lt;br /&gt;
# SSH into any [[Machine_List|CSC machine]].&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;ceo&amp;lt;/tt&amp;gt;.&lt;br /&gt;
# Select &amp;quot;Create MySQL database&amp;quot; and follow the instructions.&lt;br /&gt;
# Login info will be stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory.&lt;br /&gt;
# You can now connect to the MySQL database (from [[Machine_List#caffeine|caffeine]] only), as shown below.&lt;br /&gt;
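&lt;br /&gt;
For example, a minimal session might look like this (a sketch; the hostname and database name are placeholders, and the real values are recorded in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt;):&lt;br /&gt;
 ssh caffeine.csclub.uwaterloo.ca&lt;br /&gt;
 mysql -u yourusernamehere -p yourdatabasenamehere&lt;br /&gt;
 Enter password: ******&lt;br /&gt;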
&lt;br /&gt;
=== Deleting databases ===&lt;br /&gt;
&lt;br /&gt;
Users can delete their own MySQL databases. &lt;br /&gt;
&lt;br /&gt;
SSH into [[Machine_List#caffeine|caffeine]].&lt;br /&gt;
 mysql -u yourusernamehere -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
 DROP DATABASE yourdatabasenamehere;&lt;br /&gt;
The login info and database name were written to &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory when the database was created.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually ===&lt;br /&gt;
To create a MySQL database manually on caffeine, first connect to the database as root:&lt;br /&gt;
&lt;br /&gt;
 $ mysql -uroot -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
&lt;br /&gt;
Then run the following SQL statements:&lt;br /&gt;
&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;%&#039; IDENTIFIED BY &#039;longrandompassword&#039;;&lt;br /&gt;
 CREATE DATABASE someuser;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;%&#039;;&lt;br /&gt;
&lt;br /&gt;
This will allow users to connect locally without a password, and connect remotely with a password.&lt;br /&gt;
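&lt;br /&gt;
To sanity-check both paths (a sketch, assuming the account is named someuser): run the first command on caffeine while logged in as someuser, and the second from another CSC machine:&lt;br /&gt;
 $ mysql someuser                               # local, via unix_socket, no password&lt;br /&gt;
 $ mysql -h caffeine -u someuser -p someuser    # remote, prompts for the password&lt;br /&gt;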
&lt;br /&gt;
For random passwords run &amp;lt;code&amp;gt;pwgen -s 20 1&amp;lt;/code&amp;gt;. For the administrative passwords see /users/sysadmin/passwords/mysql.&lt;br /&gt;
&lt;br /&gt;
Write a file (usually ~club/mysql) to the club&#039;s homedir readable only by them containing the following:&lt;br /&gt;
&lt;br /&gt;
 Username: clubuserid&lt;br /&gt;
 Password: longrandompassword&lt;br /&gt;
 Hostname: localhost&lt;br /&gt;
&lt;br /&gt;
Try not to send passwords via plaintext email.&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
See the history of this page for information on the previous replication setup.&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
&lt;br /&gt;
We use [https://mariadb.com/kb/en/mariabackup-overview/ mariabackup] to take periodic backups. It is currently installed and configured on both caffeine and coffee.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing mariabackup on coffee, and sending the backups to corn-syrup.&lt;br /&gt;
&lt;br /&gt;
First, install the mariadb-backup package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install mariadb-backup&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Next, create an SSH key pair for the mysql user:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir /var/mariadb&lt;br /&gt;
chown mysql:mysql /var/mariadb&lt;br /&gt;
su -s /bin/bash mysql&lt;br /&gt;
cd /var/mariadb&lt;br /&gt;
mkdir .ssh&lt;br /&gt;
chmod 700 .ssh&lt;br /&gt;
 # Choose /var/mariadb/.ssh/id_ed25519 for the path&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Paste the public key (/var/mariadb/.ssh/id_ed25519.pub) into /users/syscom/.ssh/authorized_keys on corn-syrup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... mysql@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Also create the folder &amp;lt;code&amp;gt;/users/syscom/backups/coffee/mariabackup&amp;lt;/code&amp;gt;. We will store the backups here.&lt;br /&gt;
&lt;br /&gt;
We will use a hacky bash script to try to emulate the same behaviour as pgBackRest. We will compress and stream each backup to a folder on corn-syrup in the format &amp;lt;code&amp;gt;1701678356-F&amp;lt;/code&amp;gt;, where the number is a Unix epoch timestamp and the letter at the end is one of F, D or I (for full, differential or incremental backups). Full backups do not depend on any other backups. Differential backups depend on the latest full backup before them. Incremental backups depend on the latest backup before them (of any type).&lt;br /&gt;
&lt;br /&gt;
On coffee, paste the following into e.g. /var/mariadb/bin/backup-mariadb.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
RETENTION_FULL=2&lt;br /&gt;
RETENTION_DIFF=4&lt;br /&gt;
SSH_KEY=/var/mariadb/.ssh/id_ed25519&lt;br /&gt;
SSH_USER=syscom&lt;br /&gt;
SSH_HOST=corn-syrup&lt;br /&gt;
SSH_FOLDER=/users/$SSH_USER/backups/$(hostname)/mariabackup&lt;br /&gt;
SSH_ARGS=&amp;quot;-i $SSH_KEY -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null&amp;quot;&lt;br /&gt;
SSH=&amp;quot;ssh $SSH_ARGS $SSH_USER@$SSH_HOST&amp;quot;&lt;br /&gt;
&lt;br /&gt;
set -euxo pipefail&lt;br /&gt;
# $USER doesn&#039;t seem to be defined when we run this from cron&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != mysql ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the mysql user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if [ $# -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Usage: $0 &amp;lt;full|diff|incr&amp;gt;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
backup_type=$1&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = full ]; then&lt;br /&gt;
    backup_type_letter=F&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = diff ]; then&lt;br /&gt;
    backup_type_letter=D&lt;br /&gt;
elif [ &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    backup_type_letter=I&lt;br /&gt;
else&lt;br /&gt;
    echo &amp;quot;Backup type must be one of &#039;full&#039;, &#039;diff&#039; or &#039;incr&#039;&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if ! pgrep mariadbd &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;MariaDB is not running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
if pgrep mariabackup &amp;gt;/dev/null; then&lt;br /&gt;
    echo &amp;quot;mariabackup is already running&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Delete temporary files left behind by previous run, if there are any&lt;br /&gt;
$SSH -- &amp;quot;rm -rf $SSH_FOLDER/*.tmp&amp;quot;&lt;br /&gt;
# Get a list of all backups in chronological order&lt;br /&gt;
mapfile -t backups &amp;lt; &amp;lt;($SSH -- &amp;quot;/bin/ls -1 $SSH_FOLDER | grep -P &#039;^\\d+-[FDI]$&#039; | sort&amp;quot;)&lt;br /&gt;
incremental_basedir_args=&lt;br /&gt;
old_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
new_checkpoint_dir=$(mktemp -d)&lt;br /&gt;
trap &amp;quot;rm -rf $old_checkpoint_dir $new_checkpoint_dir&amp;quot; EXIT&lt;br /&gt;
if [ &amp;quot;$backup_type&amp;quot; = diff -o &amp;quot;$backup_type&amp;quot; = incr ]; then&lt;br /&gt;
    # Find a backup which we can use as a base.&lt;br /&gt;
    # For incr, this can be any type; for diff, this must be a full backup.&lt;br /&gt;
    base_backup=&lt;br /&gt;
    for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
        backup=${backups[i]}&lt;br /&gt;
        if [ $backup_type = incr ] || [[ $backup =~ -F$ ]]; then&lt;br /&gt;
            base_backup=$backup&lt;br /&gt;
            break&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    if [ -z &amp;quot;$base_backup&amp;quot; ]; then&lt;br /&gt;
        echo &amp;quot;Could not find base backup for $backup_type type&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
        exit 1&lt;br /&gt;
    fi&lt;br /&gt;
    # Copy the xtrabackup_checkpoints file from the base backup into a&lt;br /&gt;
    # temporary directory, and use it in the mariabackup command.&lt;br /&gt;
    scp $SSH_ARGS &amp;quot;$SSH_USER@$SSH_HOST:$SSH_FOLDER/$base_backup/xtrabackup_*&amp;quot; $old_checkpoint_dir/&lt;br /&gt;
    incremental_basedir_args=&amp;quot;--incremental-basedir=$old_checkpoint_dir&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
compress_level=6&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    # Use a lower compression level to go faster&lt;br /&gt;
    compress_level=5&lt;br /&gt;
fi&lt;br /&gt;
foldername=&amp;quot;$(date +%s)-$backup_type_letter&amp;quot;&lt;br /&gt;
# First copy to a temporary dir, then rename the temporary dir to the&lt;br /&gt;
# desired dir name (in case our process gets killed)&lt;br /&gt;
mariabackup --user=mysql --backup $incremental_basedir_args --stream=xbstream --extra-lsndir=$new_checkpoint_dir \&lt;br /&gt;
    | nice xz -$compress_level -T0 \&lt;br /&gt;
    | $SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; mkdir $foldername.tmp &amp;amp;&amp;amp; cat &amp;gt; $foldername.tmp/data.xb.xz&amp;quot;&lt;br /&gt;
scp $SSH_ARGS $new_checkpoint_dir/* $SSH_USER@$SSH_HOST:$SSH_FOLDER/$foldername.tmp/&lt;br /&gt;
$SSH -- &amp;quot;mv $SSH_FOLDER/$foldername.tmp $SSH_FOLDER/$foldername&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Delete old backups&lt;br /&gt;
if [ $backup_type = incr ]; then&lt;br /&gt;
    # We don&#039;t delete backups when making an incr backup, since we only&lt;br /&gt;
    # have retention limits for full and diff&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
if [ $backup_type = full ]; then&lt;br /&gt;
    retention=$RETENTION_FULL&lt;br /&gt;
else&lt;br /&gt;
    retention=$RETENTION_DIFF&lt;br /&gt;
fi&lt;br /&gt;
num_backups_of_same_type=1&lt;br /&gt;
backups_to_delete=()&lt;br /&gt;
for ((i=${#backups[@]}-1; i&amp;gt;=0; i--)); do&lt;br /&gt;
    backup=${backups[i]}&lt;br /&gt;
    if ! [[ $backup =~ -${backup_type_letter}$ ]]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    ((num_backups_of_same_type++))&lt;br /&gt;
    if [ $num_backups_of_same_type -lt $retention ]; then&lt;br /&gt;
        continue&lt;br /&gt;
    fi&lt;br /&gt;
    if [ $backup_type = full ]; then&lt;br /&gt;
        # Delete everything before the last full backup which we want to&lt;br /&gt;
        # keep&lt;br /&gt;
        pat=&#039;^&#039;&lt;br /&gt;
    else&lt;br /&gt;
        # Delete all the diff and incr backups before the last diff backup&lt;br /&gt;
        # which we want to keep&lt;br /&gt;
        pat=&#039;-[DI]$&#039;&lt;br /&gt;
    fi&lt;br /&gt;
    for ((j=$i-1; j&amp;gt;=0; j--)); do&lt;br /&gt;
        backup=${backups[j]}&lt;br /&gt;
        if [[ $backup =~ $pat ]]; then&lt;br /&gt;
            backups_to_delete+=($backup)&lt;br /&gt;
        fi&lt;br /&gt;
    done&lt;br /&gt;
    break&lt;br /&gt;
done&lt;br /&gt;
if [ ${#backups_to_delete[@]} -eq 0 ]; then&lt;br /&gt;
    echo &amp;quot;No backups to delete&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit&lt;br /&gt;
fi&lt;br /&gt;
$SSH -- &amp;quot;cd $SSH_FOLDER &amp;amp;&amp;amp; rm -r ${backups_to_delete[@]}&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The script should be invoked with exactly one argument which must be one of &amp;quot;full&amp;quot;, &amp;quot;diff&amp;quot; or &amp;quot;incr&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5173</id>
		<title>MySQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5173"/>
		<updated>2023-12-05T18:04:28Z</updated>

		<summary type="html">&lt;p&gt;Merenber: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
Note: the database on caffeine is actually MariaDB, not MySQL. Although they are mostly compatible, there are some incompatibilities to be aware of. See [https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/ MariaDB versus MySQL: Compatibility] for details.&lt;br /&gt;
&lt;br /&gt;
=== Creating databases ===&lt;br /&gt;
&lt;br /&gt;
Users can create their own MySQL databases through [[ceo]]. Users emailing syscom asking for a MySQL database should be directed to do so. The process is as follows:&lt;br /&gt;
&lt;br /&gt;
# SSH into any [[Machine_List|CSC machine]].&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;ceo&amp;lt;/tt&amp;gt;.&lt;br /&gt;
# Select &amp;quot;Create MySQL database&amp;quot; and follow the instructions.&lt;br /&gt;
# Login info will be stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory.&lt;br /&gt;
# You can now connect to the MySQL database (from [[Machine_List#caffeine|caffeine]] only).&lt;br /&gt;
&lt;br /&gt;
=== Deleting databases ===&lt;br /&gt;
&lt;br /&gt;
Users can delete their own MySQL databases. &lt;br /&gt;
&lt;br /&gt;
SSH into [[Machine_List#caffeine|caffeine]].&lt;br /&gt;
 mysql -u yourusernamehere -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
 DROP DATABASE yourdatabasenamehere;&lt;br /&gt;
The login info and database name were written to &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory when the database was created.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually ===&lt;br /&gt;
To create a MySQL database manually on caffeine, first connect to the database as root:&lt;br /&gt;
&lt;br /&gt;
 $ mysql -uroot -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
&lt;br /&gt;
Then run the following SQL statements:&lt;br /&gt;
&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;%&#039; IDENTIFIED BY &#039;longrandompassword&#039;;&lt;br /&gt;
 CREATE DATABASE someuser;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;%&#039;;&lt;br /&gt;
&lt;br /&gt;
This will allow users to connect locally without a password, and connect remotely with a password.&lt;br /&gt;
&lt;br /&gt;
For random passwords run &amp;lt;code&amp;gt;pwgen -s 20 1&amp;lt;/code&amp;gt;. For the administrative passwords see /users/sysadmin/passwords/mysql.&lt;br /&gt;
&lt;br /&gt;
Write a file (usually ~club/mysql) to the club&#039;s homedir readable only by them containing the following:&lt;br /&gt;
&lt;br /&gt;
 Username: clubuserid&lt;br /&gt;
 Password: longrandompassword&lt;br /&gt;
 Hostname: localhost&lt;br /&gt;
&lt;br /&gt;
Try not to send passwords via plaintext email.&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
See the history of this page for information on the previous replication setup.&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5172</id>
		<title>MySQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5172"/>
		<updated>2023-12-05T18:02:49Z</updated>

		<summary type="html">&lt;p&gt;Merenber: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
&lt;br /&gt;
=== Creating databases ===&lt;br /&gt;
&lt;br /&gt;
Users can create their own MySQL databases through [[ceo]]. Users emailing syscom asking for a MySQL database should be directed to do so. The process is as follows:&lt;br /&gt;
&lt;br /&gt;
# SSH into any [[Machine_List|CSC machine]].&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;ceo&amp;lt;/tt&amp;gt;.&lt;br /&gt;
# Select &amp;quot;Create MySQL database&amp;quot; and follow the instructions.&lt;br /&gt;
# Login info will be stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory.&lt;br /&gt;
# You can now connect to the MySQL database (from [[Machine_List#caffeine|caffeine]] only).&lt;br /&gt;
&lt;br /&gt;
=== Deleting databases ===&lt;br /&gt;
&lt;br /&gt;
Users can delete their own MySQL databases. &lt;br /&gt;
&lt;br /&gt;
SSH into [[Machine_List#caffeine|caffeine]].&lt;br /&gt;
 mysql -u yourusernamehere -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
 DROP DATABASE yourdatabasenamehere;&lt;br /&gt;
The login info and database name were written to &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory when the database was created.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
&lt;br /&gt;
=== Manually ===&lt;br /&gt;
To create a MySQL database manually, first connect to the database as root:&lt;br /&gt;
&lt;br /&gt;
 $ mysql -uroot -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
&lt;br /&gt;
Then run the following SQL statements:&lt;br /&gt;
&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;%&#039; IDENTIFIED BY &#039;longrandompassword&#039;;&lt;br /&gt;
 CREATE DATABASE someuser;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;%&#039;;&lt;br /&gt;
&lt;br /&gt;
This will allow users to connect locally without a password, and connect remotely with a password.&lt;br /&gt;
&lt;br /&gt;
For random passwords run &amp;lt;code&amp;gt;pwgen -s 20 1&amp;lt;/code&amp;gt;. For the administrative passwords see /users/sysadmin/passwords/mysql.&lt;br /&gt;
&lt;br /&gt;
Write a file (usually ~club/mysql) to the club&#039;s homedir readable only by them containing the following:&lt;br /&gt;
&lt;br /&gt;
 Username: clubuserid&lt;br /&gt;
 Password: longrandompassword&lt;br /&gt;
 Hostname: localhost&lt;br /&gt;
&lt;br /&gt;
Try not to send passwords via plaintext email.&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
See the history of this page for information on the previous replication setup.&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5171</id>
		<title>MySQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=MySQL&amp;diff=5171"/>
		<updated>2023-12-05T17:59:48Z</updated>

		<summary type="html">&lt;p&gt;Merenber: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
Note: the database on caffeine is actually MariaDB, not MySQL. Although they are mostly compatible, there are some incompatibilities to be aware of. See [https://mariadb.com/kb/en/mariadb-vs-mysql-compatibility/ MariaDB versus MySQL: Compatibility] for details.&lt;br /&gt;
&lt;br /&gt;
=== Creating databases ===&lt;br /&gt;
&lt;br /&gt;
Users can create their own MySQL databases through [[ceo]]. Users emailing syscom asking for a MySQL database should be directed to do so. The process is as follows:&lt;br /&gt;
&lt;br /&gt;
# SSH into any [[Machine_List|CSC machine]].&lt;br /&gt;
# Run &amp;lt;tt&amp;gt;ceo&amp;lt;/tt&amp;gt;.&lt;br /&gt;
# Select &amp;quot;Create MySQL database&amp;quot; and follow the instructions.&lt;br /&gt;
# Login info will be stored in &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory.&lt;br /&gt;
# You can now connect to the MySQL database (from [[Machine_List#caffeine|caffeine]] only).&lt;br /&gt;
&lt;br /&gt;
=== Deleting databases ===&lt;br /&gt;
&lt;br /&gt;
Users can delete their own MySQL databases.&lt;br /&gt;
&lt;br /&gt;
SSH into [[Machine_List#caffeine|caffeine]].&lt;br /&gt;
 mysql -u yourusernamehere -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
 DROP DATABASE yourdatabasenamehere;&lt;br /&gt;
The login info and database name were written to &amp;lt;tt&amp;gt;ceo-mysql-info&amp;lt;/tt&amp;gt; in your home directory when the database was created.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually ===&lt;br /&gt;
To create a MySQL database manually on caffeine, first connect to the database as root:&lt;br /&gt;
&lt;br /&gt;
 $ mysql -uroot -p&lt;br /&gt;
 Enter password: ******&lt;br /&gt;
&lt;br /&gt;
Then run the following SQL statements:&lt;br /&gt;
&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 CREATE USER &#039;someuser&#039;@&#039;%&#039; IDENTIFIED BY &#039;longrandompassword&#039;;&lt;br /&gt;
 CREATE DATABASE someuser;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;localhost&#039; IDENTIFIED VIA unix_socket;&lt;br /&gt;
 GRANT ALL PRIVILEGES ON someuser.* to &#039;someuser&#039;@&#039;%&#039;;&lt;br /&gt;
&lt;br /&gt;
This will allow users to connect locally without a password, and connect remotely with a password.&lt;br /&gt;
&lt;br /&gt;
For random passwords run &amp;lt;code&amp;gt;pwgen -s 20 1&amp;lt;/code&amp;gt;. For the administrative passwords see /users/sysadmin/passwords/mysql.&lt;br /&gt;
&lt;br /&gt;
Write a file (usually ~club/mysql) to the club&#039;s homedir readable only by them containing the following:&lt;br /&gt;
&lt;br /&gt;
 Username: clubuserid&lt;br /&gt;
 Password: longrandompassword&lt;br /&gt;
 Hostname: localhost&lt;br /&gt;
&lt;br /&gt;
Try not to send passwords via plaintext email.&lt;br /&gt;
&lt;br /&gt;
=== Replication ===&lt;br /&gt;
&lt;br /&gt;
See the history of this page for information on the previous replication setup.&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=PostgreSQL&amp;diff=5169</id>
		<title>PostgreSQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=PostgreSQL&amp;diff=5169"/>
		<updated>2023-12-02T23:11:43Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Upgrades */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
PostgreSQL is available as a service for members on caffeine. Just run &amp;lt;code&amp;gt;ceo postgresql create&amp;lt;/code&amp;gt; to create a new database for your account. As of this writing, club reps cannot create PostgreSQL databases for their clubs via ceo, so they will need to send an email to syscom instead.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
We are also running a Postgres database on coffee, which is not available to members. Any software installed by syscom should use this database instead of the one on caffeine.&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually on caffeine ===&lt;br /&gt;
See [https://git.csclub.uwaterloo.ca/public/pyceo/src/commit/392ec153d0a1a9f4068a5ba3c4e4ecb2279ebab4/ceod/db/PostgreSQLService.py#L58 how ceo does it].&lt;br /&gt;
&lt;br /&gt;
=== Upgrades ===&lt;br /&gt;
Upgrading Postgres is more difficult than upgrading MySQL; when you upgrade the Debian version on a machine, a newer version of Postgres will be installed but the old version will remain and the data will not be migrated. &amp;lt;strong&amp;gt;You are responsible for manually upgrading the database yourself&amp;lt;/strong&amp;gt; on all machines where Postgres is installed (currently, just coffee and caffeine).&lt;br /&gt;
&lt;br /&gt;
Here&#039;s the Debian-specific way to do it (steps adapted from [https://www.pontikis.net/blog/update-postgres-major-version-in-debian here]). In the example below, we will assume that we are upgrading from Postgres 13 to 15.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
First, take a full backup of the database. &amp;lt;strong&amp;gt;DO NOT SKIP THIS STEP.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pg_dumpall | xz -T0 &amp;gt; dump.sql.xz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Drop the &amp;lt;strong&amp;gt;new&amp;lt;/strong&amp;gt; database, which should be empty at this point. &amp;lt;strong&amp;gt;Make sure that you are not dropping the old database instead!&amp;lt;/strong&amp;gt; You can run &amp;lt;code&amp;gt;pg_lsclusters&amp;lt;/code&amp;gt; to see which database versions are present.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Make sure that this is the NEW version, not the old version!&lt;br /&gt;
pg_dropcluster --stop 15 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Upgrade the cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pg_upgradecluster -v 15 13 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Run psql and make sure that the databases are present:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres -c psql&lt;br /&gt;
\l&lt;br /&gt;
\q&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Once we are sure that everything is working, drop the old database:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Make sure that this is the OLD version, not the new version!&lt;br /&gt;
pg_dropcluster --stop 13 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
It is now safe to purge the old postgres package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt purge postgresql-13&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
We use [https://pgbackrest.org pgBackRest] for Postgres backups. It has already been installed on coffee and caffeine.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing pgbackrest on coffee, and using corn-syrup to store the backups (via SSH).&lt;br /&gt;
&lt;br /&gt;
The pgbackrest package in bookworm is too old and doesn&#039;t support SFTP, so we&#039;re going to download the packages we need from trixie instead (from trixie onward, this should no longer be necessary):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# On coffee&lt;br /&gt;
wget http://mirror.csclub.uwaterloo.ca/debian/pool/main/p/pgbackrest/pgbackrest_2.48-1_amd64.deb&lt;br /&gt;
wget http://mirror.csclub.uwaterloo.ca/debian/pool/main/libz/libzstd/libzstd1_1.5.5+dfsg2-2_amd64.deb&lt;br /&gt;
apt install ./pgbackrest_2.48-1_amd64.deb ./libzstd1_1.5.5+dfsg2-2_amd64.deb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Switch to the postgres user and create a new SSH key:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Log in to corn-syrup, switch to the syscom user, and paste the public key you created earlier into ~/.ssh/authorized_keys:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... postgres@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Create a folder to store the backups:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir -p ~/backups/coffee/pgbackrest&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, on coffee, paste something like the following into /etc/pgbackrest.conf. &amp;lt;strong&amp;gt;Make sure to adjust repo1-path and pg1-path.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[global]&lt;br /&gt;
repo1-retention-full=2&lt;br /&gt;
repo1-retention-diff=4&lt;br /&gt;
repo1-bundle=y&lt;br /&gt;
repo1-type=sftp&lt;br /&gt;
repo1-sftp-host=corn-syrup&lt;br /&gt;
repo1-sftp-host-user=syscom&lt;br /&gt;
repo1-path=/users/syscom/backups/coffee/pgbackrest&lt;br /&gt;
repo1-sftp-private-key-file=/var/lib/postgresql/.ssh/id_ed25519&lt;br /&gt;
repo1-sftp-public-key-file=/var/lib/postgresql/.ssh/id_ed25519.pub&lt;br /&gt;
repo1-sftp-host-key-hash-type=sha256&lt;br /&gt;
repo1-sftp-host-key-check-type=none&lt;br /&gt;
start-fast=y&lt;br /&gt;
log-level-console=info&lt;br /&gt;
process-max=4&lt;br /&gt;
compress-type=lz4&lt;br /&gt;
&lt;br /&gt;
[main]&lt;br /&gt;
pg1-path=/var/lib/postgresql/15/main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The config above will keep two full backups and at least four differential backups. See https://pgbackrest.org/user-guide.html#retention for more details.&lt;br /&gt;
&lt;br /&gt;
Next, open /etc/postgresql/15/main/postgresql.conf and add/edit the following lines:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
archive_mode = on&lt;br /&gt;
archive_command = &#039;pgbackrest --stanza=main archive-push %p&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
See https://pgbackrest.org/user-guide.html#quickstart/configure-archiving for more details.&lt;br /&gt;
&lt;br /&gt;
Next, restart Postgres:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl restart postgresql@15-main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Switch to the postgres user, create the main stanza, and run the first backup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
pgbackrest --stanza=main stanza-create&lt;br /&gt;
pgbackrest --stanza=main check&lt;br /&gt;
pgbackrest --stanza=main backup --type=full&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
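&lt;br /&gt;
To confirm that the backup landed in the repository, inspect it (still as the postgres user):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pgbackrest --stanza=main info&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;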
&lt;br /&gt;
==== Upgrades ====&lt;br /&gt;
Normally, whenever you upgrade Postgres, you have to manually edit /etc/pgbackrest.conf and run the &amp;quot;stanza-upgrade&amp;quot; command. To make this easier for future sysadmins, I wrote a wrapper script around pgbackrest which does this automatically if it detects that Postgres was upgraded. Paste the following into /var/lib/postgresql/bin/pgbackrest-wrapper.sh and make it executable:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
set -ex&lt;br /&gt;
if [ &amp;quot;$(id -un)&amp;quot; != postgres ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the postgres user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Use the full path to ls to avoid bash aliases&lt;br /&gt;
mapfile -t pg_versions &amp;lt; &amp;lt;(/bin/ls -1 /var/lib/postgresql | grep -P &#039;^\d+$&#039;)&lt;br /&gt;
if [ ${#pg_versions[@]} -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Expected to find 1 Postgres version, found ${#pg_versions[@]} instead: ${pg_versions[*]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
pg_ver=${pg_versions[0]}&lt;br /&gt;
mapfile -t pgbr_versions &amp;lt; &amp;lt;(grep -oP &#039;/var/lib/postgresql/\K(\d+)&#039; /etc/pgbackrest.conf)&lt;br /&gt;
if [ ${#pgbr_versions[@]} -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Expected to find 1 pgBackRest folder, found ${#pgbr_versions[@]} instead: ${pgbr_versions[*]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
pgbr_ver=${pgbr_versions[0]}&lt;br /&gt;
if [ $pg_ver -eq $pgbr_ver ]; then&lt;br /&gt;
    # pgbackrest.conf is up to date, so just run the backup normally&lt;br /&gt;
    pgbackrest &amp;quot;$@&amp;quot;&lt;br /&gt;
    exit 0&lt;br /&gt;
elif [ $pg_ver -lt $pgbr_ver ]; then&lt;br /&gt;
    echo &amp;quot;pgBackRest does not support downgrades - you will have to fix this manually&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# sed -i needs to create a temporary file, and the postgres user doesn&#039;t have&lt;br /&gt;
# write permissions on /etc, so write to a temporary file first&lt;br /&gt;
sed &amp;quot;s,/var/lib/postgresql/$pgbr_ver,/var/lib/postgresql/$pg_ver,&amp;quot; /etc/pgbackrest.conf &amp;gt; /tmp/pgbackrest.conf&lt;br /&gt;
cp /tmp/pgbackrest.conf /etc/pgbackrest.conf&lt;br /&gt;
rm /tmp/pgbackrest.conf&lt;br /&gt;
pgbackrest --stanza=main stanza-upgrade&lt;br /&gt;
pgbackrest --stanza=main check&lt;br /&gt;
# Run the backup&lt;br /&gt;
pgbackrest &amp;quot;$@&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now we can just pass pgbackrest parameters directly to this script, e.g. &amp;lt;code&amp;gt;pgbackrest-wrapper.sh --stanza=main backup&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
We want backups to be taken periodically. Paste the following into e.g. /etc/cron.d/postgres_backup (this file must be owned by root):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
MAILTO=root@csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
# Full backup at 00:15 every Sunday and Wednesday&lt;br /&gt;
15 0 * * 0,3 postgres chronic ~/bin/pgbackrest-wrapper.sh --stanza=main backup --type=full&lt;br /&gt;
# Differential backup at 00:30 every day&lt;br /&gt;
30 0 * * * postgres chronic ~/bin/pgbackrest-wrapper.sh --stanza=main backup --type=diff&lt;br /&gt;
# Incremental backup at the 45th minute of every hour&lt;br /&gt;
45 * * * * postgres chronic ~/bin/pgbackrest-wrapper.sh --stanza=main backup --type=incr&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Restore ====&lt;br /&gt;
Suppose we want to restore the latest backup, and the installed Postgres is 15. First, make sure that you actually have at least one backup present for this version:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres -c &#039;pgbackrest --stanza=main info&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, stop the database and delete all of the files:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl stop postgresql@15-main&lt;br /&gt;
rm -rf /var/lib/postgresql/15/main/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now switch to the postgres user and run the &amp;quot;restore&amp;quot; command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
pgbackrest --stanza=main restore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you start Postgres, everything should be in a working state:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl start postgresql@15-main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to restore a backup which is not the latest version, pass the &amp;lt;code&amp;gt;--set&amp;lt;/code&amp;gt; argument to pgbackrest. See https://pgbackrest.org/user-guide.html#restore for more details.&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=Main_Page&amp;diff=5165</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=Main_Page&amp;diff=5165"/>
		<updated>2023-11-25T16:52:52Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Guides */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This is the Wiki of the [[Computer Science Club]]. Feel free to start adding pages and information.&lt;br /&gt;
&lt;br /&gt;
[[Special:AllPages]]&lt;br /&gt;
&lt;br /&gt;
== Member/Club Rep Documentation ==&lt;br /&gt;
To access our Linux machines, see [[How to SSH]] and select one of the general-use machines from [[Machine List#General-Use Servers]].&lt;br /&gt;
&lt;br /&gt;
To host a website, see [[Web Hosting]]. If you are trying to host websites for clubs, see [[Club Hosting]].&lt;br /&gt;
&lt;br /&gt;
To use our VPS services (similar to Linode and Amazon EC2), see [https://docs.cloud.csclub.uwaterloo.ca/ CSC Cloud Documentation]. Note that you&#039;ll need to activate your account on one of CSC&#039;s machines before using the management panel.&lt;br /&gt;
&lt;br /&gt;
To view instructions on playing music at the office, see [[Music]].&lt;br /&gt;
&lt;br /&gt;
To use our Nextcloud instance (similar to Google Drive and Dropbox), go to [https://files.csclub.uwaterloo.ca CSC Files].&lt;br /&gt;
&lt;br /&gt;
=== Guides ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[New Member Guide]]&lt;br /&gt;
* [[Club Hosting]]&lt;br /&gt;
* [[Web Hosting]]&lt;br /&gt;
* [[Git Hosting]]&lt;br /&gt;
* [[How to IRC]]&lt;br /&gt;
* [[How to SSH]]&lt;br /&gt;
* [[MySQL]]&lt;br /&gt;
* [[PostgreSQL]]&lt;br /&gt;
* [https://docs.cloud.csclub.uwaterloo.ca/ CSC Cloud Documentation]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== News and Events ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Meetings]]&lt;br /&gt;
* [[Talks]]&lt;br /&gt;
* [[Projects]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Committees Documentation ==&lt;br /&gt;
=== Club Operation ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Budget Guide]]&lt;br /&gt;
* [[ceo]]&lt;br /&gt;
* [[Exec Manual]]&lt;br /&gt;
* [[MEF Guide]]&lt;br /&gt;
* [[Office Policies]]&lt;br /&gt;
* [[Office Staff]]&lt;br /&gt;
* [[Sysadmin Guide]]&lt;br /&gt;
* [[How to (Extra) Ban Someone]]&lt;br /&gt;
* [[SCS Guide]]&lt;br /&gt;
* [[Kerberos |Password Reset]]&lt;br /&gt;
* [[Keys and Fobs]]&lt;br /&gt;
&lt;br /&gt;
* [[Talks Guide]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Hardware Infrastructure (the bare metals) ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Disk Drive RMA Process]]&lt;br /&gt;
* [[Machine List]]&lt;br /&gt;
* [[IPMI101]]&lt;br /&gt;
* [[New NetApp]]&lt;br /&gt;
* [[Switches]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Software Infrastructure ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[ADFS]]&lt;br /&gt;
* [[Backups]]&lt;br /&gt;
* [[DNS]]&lt;br /&gt;
* [[Debian Repository]]&lt;br /&gt;
* [[Firewall]]&lt;br /&gt;
* [[Kerberos]]&lt;br /&gt;
* [[Keycloak]]&lt;br /&gt;
* [[KVM]]&lt;br /&gt;
* [[LDAP]]&lt;br /&gt;
* [[Network]]&lt;br /&gt;
* [[New CSC Machine]]&lt;br /&gt;
* [[Observability]]&lt;br /&gt;
* [[OID Assignment]]&lt;br /&gt;
* [[Podman]]&lt;br /&gt;
* [[Scratch]]&lt;br /&gt;
* [[SNMP]]&lt;br /&gt;
* [[SSL]]&lt;br /&gt;
* [[Syscom Todo]]&lt;br /&gt;
* [[Systemd-nspawn]]&lt;br /&gt;
* [[Two-Factor Authentication]]&lt;br /&gt;
* [[UID/GID Assignment]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Services ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Application List]]&lt;br /&gt;
* [[BigBlueButton]]&lt;br /&gt;
* [[Mail]]&lt;br /&gt;
* [[Mailing Lists]]&lt;br /&gt;
* [[Mirror]]&lt;br /&gt;
* [[Music]]&lt;br /&gt;
* [[Nextcloud]]&lt;br /&gt;
* [[Printing]]&lt;br /&gt;
* [[Pulseaudio]]&lt;br /&gt;
* [[Webmail]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== CSC Cloud ===&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Ceph]]&lt;br /&gt;
* [[Cloud Networking]]&lt;br /&gt;
* [[CloudStack]]&lt;br /&gt;
* [[CloudStack Templates]]&lt;br /&gt;
* [[Kubernetes]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Miscellaneous ==&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Acronyms]]&lt;br /&gt;
* [[Budget]]&lt;br /&gt;
* [[Executive]]&lt;br /&gt;
* [[Past Executive]]&lt;br /&gt;
* [[History]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Historical ==&lt;br /&gt;
&amp;lt;div style=&amp;quot;-webkit-column-count:3; -moz-column-count:3; column-count:3;&amp;quot;&amp;gt;&lt;br /&gt;
* [[Robot Arm]]&lt;br /&gt;
* [[Webcams]]&lt;br /&gt;
* [[Website]]&lt;br /&gt;
* [[Digital Cutter]]&lt;br /&gt;
* [[Electronics]]&lt;br /&gt;
* [[NetApp]]&lt;br /&gt;
* [[Frosh]]&lt;br /&gt;
* [[Virtualization (LXC Containers)]]&lt;br /&gt;
* [[Serial Connections]]&lt;br /&gt;
* [[Library]]&lt;br /&gt;
* [[MEF Proposals]]&lt;br /&gt;
* [[Proposed Constitution Changes]]&lt;br /&gt;
* [[NFS/Kerberos]]&lt;br /&gt;
* [[Hardware]]&lt;br /&gt;
* [[Imapd Guide]]&lt;br /&gt;
__NOTOC__&lt;/div&gt;</summary>
		<author><name>Merenber</name></author>
	</entry>
	<entry>
		<id>https://wiki.csclub.uwaterloo.ca/index.php?title=PostgreSQL&amp;diff=5164</id>
		<title>PostgreSQL</title>
		<link rel="alternate" type="text/html" href="https://wiki.csclub.uwaterloo.ca/index.php?title=PostgreSQL&amp;diff=5164"/>
		<updated>2023-11-25T16:22:46Z</updated>

		<summary type="html">&lt;p&gt;Merenber: /* Backups */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== For members ==&lt;br /&gt;
PostgreSQL is available as a service for members on caffeine. Just run &amp;lt;code&amp;gt;ceo postgresql create&amp;lt;/code&amp;gt; to create a new database for your account. As of this writing, club reps cannot create PostgreSQL databases for their clubs via ceo, so they will need to send an email to syscom instead.&lt;br /&gt;
&lt;br /&gt;
== For syscom ==&lt;br /&gt;
We are also running a Postgres database on coffee, which is not available to members. Any software installed by syscom should use this database instead of the one on caffeine.&lt;br /&gt;
&lt;br /&gt;
=== Creating a database manually on caffeine ===&lt;br /&gt;
See [https://git.csclub.uwaterloo.ca/public/pyceo/src/commit/392ec153d0a1a9f4068a5ba3c4e4ecb2279ebab4/ceod/db/PostgreSQLService.py#L58 how ceo does it].&lt;br /&gt;
&lt;br /&gt;
=== Upgrades ===&lt;br /&gt;
Upgrading Postgres is more difficult than upgrading MySQL; when you upgrade the Debian version on a machine, a newer version of Postgres will be installed but the old version will remain and the data will not be migrated. &amp;lt;strong&amp;gt;You are responsible for manually upgrading the database yourself&amp;lt;/strong&amp;gt; on all machines where Postgres is installed (currently, just coffee and caffeine).&lt;br /&gt;
&lt;br /&gt;
Here&#039;s the Debian-specific way to do it (steps adapted from [https://www.pontikis.net/blog/update-postgres-major-version-in-debian here]). In the example below, we will assume that we are upgrading from Postgres 13 to 15.&lt;br /&gt;
&amp;lt;ol&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
First, take a full backup of the database. &amp;lt;strong&amp;gt;DO NOT SKIP THIS STEP.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pg_dumpall | xz -T0 &amp;gt; dump.sql.xz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Drop the &amp;lt;strong&amp;gt;new&amp;lt;/strong&amp;gt; database, which should be empty at this point. &amp;lt;strong&amp;gt;Make sure that you are not dropping the old database instead!&amp;lt;/strong&amp;gt; You can run &amp;lt;code&amp;gt;pg_lsclusters&amp;lt;/code&amp;gt; to see which database versions are present.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Make sure that this is the NEW version, not the old version!&lt;br /&gt;
pg_dropcluster --stop 15 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Upgrade the cluster:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pg_upgradecluster -v 15 13 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Run psql and make sure that the databases are present:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres -c psql&lt;br /&gt;
\l&lt;br /&gt;
\q&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
Once we are sure that everything is working, drop the old database:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Make sure that this is the OLD version, not the new version!&lt;br /&gt;
pg_dropcluster --stop 13 main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;li&amp;gt;&lt;br /&gt;
It is now safe to purge the old postgres package:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt purge postgresql-13&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;lt;/li&amp;gt;&lt;br /&gt;
&amp;lt;/ol&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Backups ===&lt;br /&gt;
We use [https://pgbackrest.org pgBackRest] for Postgres backups. It has already been installed on coffee and caffeine.&lt;br /&gt;
&lt;br /&gt;
==== Installation ====&lt;br /&gt;
In the example below, we will be installing pgbackrest on coffee, and using corn-syrup to store the backups (via SSH).&lt;br /&gt;
&lt;br /&gt;
The pgbackrest package in bookworm is too old and doesn&#039;t support SFTP, so we&#039;re going to download the packages we need from trixie instead (from trixie onward, this should no longer be necessary):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# On coffee&lt;br /&gt;
wget http://mirror.csclub.uwaterloo.ca/debian/pool/main/p/pgbackrest/pgbackrest_2.48-1_amd64.deb&lt;br /&gt;
wget http://mirror.csclub.uwaterloo.ca/debian/pool/main/libz/libzstd/libzstd1_1.5.5+dfsg2-2_amd64.deb&lt;br /&gt;
apt install ./pgbackrest_2.48-1_amd64.deb ./libzstd1_1.5.5+dfsg2-2_amd64.deb&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Switch to the postgres user and create a new SSH key:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
ssh-keygen -t ed25519&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Log in to corn-syrup, switch to the syscom user, and paste the public key you created earlier into ~/.ssh/authorized_keys:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
restrict ssh-ed25519 AAAAC3Nza... postgres@coffee&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Create a folder to store the backups:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
mkdir -p ~/backups/coffee/pgbackrest&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, on coffee, paste something like the following into /etc/pgbackrest.conf. &amp;lt;strong&amp;gt;Make sure to adjust repo1-path and pg1-path.&amp;lt;/strong&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
[global]&lt;br /&gt;
repo1-retention-full=2&lt;br /&gt;
repo1-retention-diff=4&lt;br /&gt;
repo1-bundle=y&lt;br /&gt;
repo1-type=sftp&lt;br /&gt;
repo1-sftp-host=corn-syrup&lt;br /&gt;
repo1-sftp-host-user=syscom&lt;br /&gt;
repo1-path=/users/syscom/backups/coffee/pgbackrest&lt;br /&gt;
repo1-sftp-private-key-file=/var/lib/postgresql/.ssh/id_ed25519&lt;br /&gt;
repo1-sftp-public-key-file=/var/lib/postgresql/.ssh/id_ed25519.pub&lt;br /&gt;
repo1-sftp-host-key-hash-type=sha256&lt;br /&gt;
repo1-sftp-host-key-check-type=none&lt;br /&gt;
start-fast=y&lt;br /&gt;
log-level-console=info&lt;br /&gt;
process-max=4&lt;br /&gt;
compress-type=lz4&lt;br /&gt;
&lt;br /&gt;
[main]&lt;br /&gt;
pg1-path=/var/lib/postgresql/15/main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The config above will keep two full backups and at least four differential backups. See https://pgbackrest.org/user-guide.html#retention for more details.&lt;br /&gt;
&lt;br /&gt;
Next, open /etc/postgresql/15/main/postgresql.conf and add/edit the following lines:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
archive_mode = on&lt;br /&gt;
archive_command = &#039;pgbackrest --stanza=main archive-push %p&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
See https://pgbackrest.org/user-guide.html#quickstart/configure-archiving for more details.&lt;br /&gt;
&lt;br /&gt;
Next, restart Postgres:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl restart postgresql@15-main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Switch to the postgres user, create the main stanza, and run the first backup:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
pgbackrest --stanza=main stanza-create&lt;br /&gt;
pgbackrest --stanza=main check&lt;br /&gt;
pgbackrest --stanza=main backup --type=full&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Upgrades ====&lt;br /&gt;
Normally, whenever you upgrade Postgres, you have to manually edit /etc/pgbackrest.conf and run the &amp;quot;stanza-upgrade&amp;quot; command. To make this easier for future sysadmins, I wrote a wrapper script around pgbackrest which does this automatically if it detects that Postgres was upgraded. Paste the following into /var/lib/postgresql/bin/pgbackrest-wrapper.sh and make it executable:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
set -ex&lt;br /&gt;
if [ &amp;quot;$USER&amp;quot; != postgres ]; then&lt;br /&gt;
    echo &amp;quot;This script should run as the postgres user&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# Use the full path to ls to avoid bash aliases&lt;br /&gt;
mapfile -t pg_versions &amp;lt; &amp;lt;(/bin/ls -1 /var/lib/postgresql | grep -P &#039;^\d+$&#039;)&lt;br /&gt;
if [ ${#pg_versions[@]} -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Expected to find 1 Postgres version, found ${#pg_versions[@]} instead: ${pg_versions[*]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
pg_ver=${pg_versions[0]}&lt;br /&gt;
mapfile -t pgbr_versions &amp;lt; &amp;lt;(grep -oP &#039;/var/lib/postgresql/\K(\d+)&#039; /etc/pgbackrest.conf)&lt;br /&gt;
if [ ${#pgbr_versions[@]} -ne 1 ]; then&lt;br /&gt;
    echo &amp;quot;Expected to find 1 pgBackRest folder, found ${#pgbr_versions[@]} instead: ${pgbr_versions[*]}&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
pgbr_ver=${pgbr_versions[0]}&lt;br /&gt;
if [ $pg_ver -eq $pgbr_ver ]; then&lt;br /&gt;
    # pgbackrest.conf is up to date, so just run the backup normally&lt;br /&gt;
    pgbackrest &amp;quot;$@&amp;quot;&lt;br /&gt;
    exit 0&lt;br /&gt;
elif [ $pg_ver -lt $pgbr_ver ]; then&lt;br /&gt;
    echo &amp;quot;pgBackRest does not support downgrades - you will have to fix this manually&amp;quot; &amp;gt;&amp;amp;2&lt;br /&gt;
    exit 1&lt;br /&gt;
fi&lt;br /&gt;
# sed -i needs to create a temporary file, and the postgres user doesn&#039;t have&lt;br /&gt;
# write permissions on /etc, so write to a temporary file first&lt;br /&gt;
sed &amp;quot;s,/var/lib/postgresql/$pgbr_ver,/var/lib/postgresql/$pg_ver,&amp;quot; /etc/pgbackrest.conf &amp;gt; /tmp/pgbackrest.conf&lt;br /&gt;
cp /tmp/pgbackrest.conf /etc/pgbackrest.conf&lt;br /&gt;
rm /tmp/pgbackrest.conf&lt;br /&gt;
pgbackrest --stanza=main stanza-upgrade&lt;br /&gt;
pgbackrest --stanza=main check&lt;br /&gt;
# Run the backup&lt;br /&gt;
pgbackrest &amp;quot;$@&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now we can just pass pgbackrest parameters directly to this script, e.g. &amp;lt;code&amp;gt;pgbackrest-wrapper.sh --stanza=main backup&amp;lt;/code&amp;gt;.&lt;br /&gt;
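For reference, installing the script might look like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
mkdir -p ~/bin&lt;br /&gt;
vim ~/bin/pgbackrest-wrapper.sh   # paste the script above&lt;br /&gt;
chmod +x ~/bin/pgbackrest-wrapper.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;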
&lt;br /&gt;
==== Cron ====&lt;br /&gt;
We want backups to be taken periodically. Paste the following into e.g. /etc/cron.d/postgres_backup (this file must be owned by root):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
MAILTO=root@csclub.uwaterloo.ca&lt;br /&gt;
&lt;br /&gt;
# Full backup at 00:15 every Sunday and Wednesday&lt;br /&gt;
15 0 * * 0,3 postgres chronic ~/bin/pgbackrest-wrapper.sh --stanza=main backup --type=full&lt;br /&gt;
# Differential backup at 00:30 every day&lt;br /&gt;
30 0 * * * postgres chronic ~/bin/pgbackrest-wrapper.sh --stanza=main backup --type=diff&lt;br /&gt;
# Incremental backup at the 45th minute of every hour&lt;br /&gt;
45 * * * * postgres chronic ~/bin/pgbackrest-wrapper.sh --stanza=main backup --type=incr&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
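The &amp;lt;code&amp;gt;chronic&amp;lt;/code&amp;gt; wrapper used above comes from the moreutils package; it suppresses all output unless the command fails, so cron only sends mail on errors. Make sure it is installed:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
apt install moreutils&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;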
&lt;br /&gt;
==== Restore ====&lt;br /&gt;
Suppose we want to restore the latest backup and the installed Postgres version is 15. First, make sure that you actually have at least one backup present for this version:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres -c &#039;pgbackrest --stanza=main info&#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Next, stop the database and delete all of the files:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl stop postgresql@15-main&lt;br /&gt;
rm -rf /var/lib/postgresql/15/main/*&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now switch to the postgres user and run the &amp;quot;restore&amp;quot; command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
su - postgres&lt;br /&gt;
pgbackrest --stanza=main restore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
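As an alternative to deleting the data directory first, pgBackRest also supports delta restore, which checksums the existing files and only rewrites the ones that differ:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pgbackrest --stanza=main --delta restore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;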
Now start Postgres; everything should be back in a working state:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
systemctl start postgresql@15-main&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to restore a backup other than the latest one, pass the &amp;lt;code&amp;gt;--set&amp;lt;/code&amp;gt; option to pgbackrest. See https://pgbackrest.org/user-guide.html#restore for more details.&lt;br /&gt;
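For example, to restore a specific full backup (the label below is hypothetical; copy a real one from the &amp;lt;code&amp;gt;info&amp;lt;/code&amp;gt; output):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# replace the label with one listed by &#039;pgbackrest --stanza=main info&#039;&lt;br /&gt;
pgbackrest --stanza=main --set=20231125-001530F restore&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>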
		<author><name>Merenber</name></author>
	</entry>
</feed>