Cloud Networking
This page gives an overview of the network topology of the CSC cloud.
For the sections below, ".csc" is an abbreviation for ".csclub.uwaterloo.ca".
Hostnames and IP addresses
Some important hostnames and IP addresses to know about:
- VLAN 529 (172.19.168.0/27, fd74:6b6a:8eca:4902::/64) is the cloud management network
- VLAN 425 (172.19.134.0/24, 2620:101:f000:4903::/64) is the cloud user network
- Due to routing decisions, everything in VLAN 529 is accessible from VLAN 425, so make sure to use iptables if there's some service running in VLAN 529 which shouldn't be accessible by regular users. See the routers section below.
- riboflavin, ginkgo and biloba are on VLAN 529 and VLAN 134
- Each of these hosts has a ".cloud" domain record which resolves to their VLAN 529 address, e.g. riboflavin.cloud.csc
- Most Ceph services are explicitly placed on riboflavin and ginkgo, since they have the most storage
- ceph-nfs.cloud.csc and ceph-mon.cloud.csc are CNAMEs for riboflavin and ginkgo
- biloba and chamomile are the two CloudStack management servers
- cloud.csc (and csclub.cloud) point to a virtual IP address (in VLAN 529) shared by biloba and chamomile via keepalived
- CloudStack keeps cookie sessions in-memory, so it is important that only one machine is the "active" CloudStack management server at any time (otherwise, if you connect to a different machine, your cookie session will suddenly be invalid)
- mariadb.cloud.csc points to this virtual IP address. We are using master-master replication between two MariaDB instances on biloba and chamomile. To avoid split-brain syndrome, it is very important that only one instance gets written to at any time.
- We are authoritative for the *.cloud.csc and *.csclub.cloud DNS zones, so do not use Infoblox for these (except for PTR records)
- All *.csclub.cloud records point to the cloud.csc virtual IP
- Most *.csclub.cloud TLS certificates are managed with acme.sh
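The address plan at the top of this list can be checked programmatically, for example when deciding which iptables source prefixes to match. The following is a minimal sketch using Python's standard `ipaddress` module. The `VLANS` table and the `vlan_of` helper are hypothetical names introduced here for illustration, and only the two cloud VLANs listed above are included.

```python
import ipaddress

# Prefixes copied from the list above; the dictionary layout is an assumption.
VLANS = {
    529: ("cloud management", ["172.19.168.0/27", "fd74:6b6a:8eca:4902::/64"]),
    425: ("cloud user", ["172.19.134.0/24", "2620:101:f000:4903::/64"]),
}


def vlan_of(address):
    """Return (vlan_id, description) for an address, or None if it is in neither VLAN."""
    ip = ipaddress.ip_address(address)
    for vlan_id, (description, prefixes) in VLANS.items():
        for prefix in prefixes:
            network = ipaddress.ip_network(prefix)
            # Compare IP versions first so IPv4 and IPv6 objects are never mixed.
            if ip.version == network.version and ip in network:
                return vlan_id, description
    return None


if __name__ == "__main__":
    for addr in ("172.19.168.19", "172.19.134.254", "129.97.134.18"):
        print(addr, "->", vlan_of(addr))
```

An address outside both prefixes, such as a VLAN 134 address, yields `None`, which makes the helper usable as a quick guard in provisioning scripts.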
Cloud routers
The two cloud routers are router1.cloud.csc and router2.cloud.csc. They each have 3 IP addresses: one for each of VLANs 134, 425 and 529. router1.cloud.csc resolves to the VLAN 529 IP, and router1-cloud.csc resolves to the VLAN 134 IP (same for router2).
The two routers also share the virtual IP addresses for router.cloud.csc (VLAN 529) and nat-gateway.cloud.csc (VLAN 425) via keepalived. For incoming VLAN 425 traffic, the routers strip off the VLAN 425 header and replace it with a VLAN 529 header, which means that VLAN 529 is fully accessible from VLAN 425. (This is actually useful for us, since it lets Kubernetes access Ceph storage.) Use iptables on any VLAN 529 host running a service that regular users should not reach.
Setup
The routers are systemd-nspawn containers running on biloba and chamomile. Standard container setup (FQDN, DNS, etc.) applies.
For reference, here's the .nspawn file for router1:
    [Exec]
    Boot=yes
    Hostname=router1
    PrivateUsers=no

    [Network]
    Bridge=br134
    # These are manually bridged to the host; see
    # /etc/systemd/system/systemd-nspawn@router1.service.d/override.conf
    VirtualEthernetExtra=ve-router1-529:veth529
    VirtualEthernetExtra=ve-router1-425:veth425
Here's the aforementioned systemd drop-in:
    [Service]
    ExecStartPost=/usr/sbin/ip link set dev ve-router1-529 master br529
    ExecStartPost=/usr/sbin/ip link set dev ve-router1-529 up
    ExecStartPost=/usr/sbin/ip link set dev ve-router1-425 master br425
    ExecStartPost=/usr/sbin/ip link set dev ve-router1-425 up
Here's the /etc/network/interfaces in router1:
    auto host0
    iface host0 inet static
        address 129.97.134.18/24
        gateway 129.97.134.1
    iface host0 inet6 static
        pre-up sysctl net.ipv6.conf.host0.accept_ra=0
        pre-up sysctl net.ipv6.conf.host0.accept_dad=0
        address 2620:101:f000:4901:c5c::18/64
        gateway 2620:101:f000:4901::1

    auto veth529
    iface veth529 inet static
        address 172.19.168.19/27
    iface veth529 inet6 static
        pre-up sysctl net.ipv6.conf.veth529.accept_ra=0
        pre-up sysctl net.ipv6.conf.veth529.accept_dad=0
        address fd74:6b6a:8eca:4902::19/64

    auto veth425
    iface veth425 inet manual
    iface veth425 inet6 manual
        pre-up sysctl net.ipv6.conf.veth425.accept_ra=0
        pre-up sysctl net.ipv6.conf.veth425.accept_dad=0
It is very important to stop every interface from accepting IPv6 router advertisements (accept_ra=0). Note that veth425 has no dedicated IP address of its own; its only address is the virtual IP provided by keepalived.
Here's the /etc/keepalived/keepalived.conf in router1:
    vrrp_instance VI_1 {
        state MASTER
        interface veth529
        virtual_router_id 21
        priority 255
        advert_int 2
        authentication {
            auth_type PASS
            auth_pass ********
        }
        virtual_ipaddress {
            # VLAN 529 router
            172.19.168.21/27
            # VLAN 425 router
            172.19.134.254/24 dev veth425
        }
        virtual_ipaddress_excluded {
            # VLAN 529 router
            fd74:6b6a:8eca:4902::21/64
            # VLAN 425 router
            2620:101:f000:4903::254/64 dev veth425
        }
    }
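To illustrate how the `priority` setting decides which router holds the virtual IPs, here is a conceptual sketch of VRRP master selection. It is not keepalived's implementation: it only models the rule that the live router with the highest priority becomes master, and it ignores tie-breaking and timers. The `Router` class and `elect_master` function are hypothetical names, and router2's priority of 100 is a placeholder assumption, since its configuration is not shown on this page.

```python
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    priority: int       # VRRP priority, 1-255; higher wins
    alive: bool = True  # whether the router is currently sending adverts


def elect_master(routers):
    """Return the live router with the highest priority, or None if all are down."""
    candidates = [r for r in routers if r.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.priority)


if __name__ == "__main__":
    # router1's priority (255) is from the config above.
    # router2's priority (100) is an assumed placeholder value.
    both_up = [Router("router1", 255), Router("router2", 100)]
    print(elect_master(both_up).name)

    router1_down = [Router("router1", 255, alive=False), Router("router2", 100)]
    print(elect_master(router1_down).name)
```

Under these assumptions, router1 holds the virtual IPs whenever it is up, and router2 takes them over only while router1 is down.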
Here's a snippet from /etc/sysctl.conf in router1:
    net.ipv4.ip_forward=1
    net.ipv6.conf.all.forwarding=1
Here are the iptables rules in router1:
    iptables -t nat -N CLOUD-ROUTER
    iptables -t nat -A POSTROUTING -j CLOUD-ROUTER
    iptables -t nat -A CLOUD-ROUTER -d 10.0.0.0/8 -j RETURN
    iptables -t nat -A CLOUD-ROUTER -d 172.16.0.0/12 -j RETURN
    iptables -t nat -A CLOUD-ROUTER -d 129.97.0.0/16 -j RETURN
    iptables -t nat -A CLOUD-ROUTER -s 172.19.168.0/27 -j SNAT --to-source 129.97.134.18
    iptables -t nat -A CLOUD-ROUTER -s 172.19.134.0/24 -j SNAT --to-source 129.97.134.18
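Note that traffic to private and on-campus destinations (10.0.0.0/8, 172.16.0.0/12 and 129.97.0.0/16) hits a RETURN rule before the SNAT rules, so it is forwarded without NAT.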
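The effect of the rule order in the CLOUD-ROUTER chain can be sketched as a small first-match simulation. This is an illustrative IPv4-only model, not a replacement for testing on the router itself. The `RULES` table and the `translated_source` function are hypothetical names, and the destination addresses in the usage note are arbitrary examples.

```python
import ipaddress

# Rules transcribed in order from the CLOUD-ROUTER chain above.
# Each entry is (match field, prefix, action); action None stands for RETURN.
SNAT_SOURCE = "129.97.134.18"
RULES = [
    ("dst", "10.0.0.0/8", None),
    ("dst", "172.16.0.0/12", None),
    ("dst", "129.97.0.0/16", None),
    ("src", "172.19.168.0/27", SNAT_SOURCE),
    ("src", "172.19.134.0/24", SNAT_SOURCE),
]


def translated_source(src, dst):
    """Return the source address a packet has after the chain (IPv4 only)."""
    packet = {"src": ipaddress.ip_address(src), "dst": ipaddress.ip_address(dst)}
    for field, prefix, action in RULES:
        if packet[field] in ipaddress.ip_network(prefix):
            # First matching rule wins: RETURN keeps the source, SNAT rewrites it.
            return src if action is None else action
    return src  # no rule matched, so the packet is forwarded unchanged


if __name__ == "__main__":
    print(translated_source("172.19.134.10", "8.8.8.8"))
    print(translated_source("172.19.134.10", "172.19.168.5"))
```

In this model, a VLAN 425 host reaching an off-campus address leaves with the router's public address, while the same host reaching a VLAN 529 address keeps its original source. That is why iptables rules on VLAN 529 hosts can match on the VLAN 425 prefix.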