Filer

NOTE This page describes Filer Generation 3, which was put into production in Fall 2025. For previous generations of filers, see New NetApp (2017-2025) and NetApp (2013-2017).

In Fall 2023, MFCF donated their FAS8040 NetApp filers to us, alongside several DS4243 disk shelves.

We decided to connect the disk shelves directly to one of our servers, since it is hard to keep syscom/termcom trained on NetApp's proprietary system, and we can mostly get away with using just one or two disk shelves for our storage needs anyway.

Physical Configuration

Currently ranch is used as the head unit, and only one of the disk shelves (one of the middle ones) is connected to it. A QSFP+ (SFF-8436) to external Mini-SAS (SFF-8088) cable is used to connect the disk shelf to a SAS2308 HBA card. Note that according to the Unraid Forum (https://forums.unraid.net/topic/89444-how-to-configure-a-netapp-ds4243-shelf-in-unraid/, archived at https://web.archive.org/web/20250108225045/https://forums.unraid.net/topic/89444-how-to-configure-a-netapp-ds4243-shelf-in-unraid/), it should be connected to the port marked with a black rectangle on the top IOM at the back of the disk shelf. Also make sure all PSUs (we currently have 2 PSUs and 2 blank fillers) are connected and powered on. If everything is connected correctly, you should not see any amber LEDs on the PSUs. After the filer is booted, you should see a green/blue LNK LED next to the connected QSFP port on the disk shelf.
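As a software-side sanity check (a sketch, assuming the lsscsi package is installed on ranch), the shelf and its disks should be visible from the OS once spinup has finished:

lsscsi -g                         # the shelf should appear as an enclosure device alongside the disks
lsblk -o NAME,SIZE,MODEL,SERIAL   # each shelf disk should show up as a roughly 2TB block device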

A total of 24x 2TB disks are available, but 3 of them have shown signs of failure, so we only use 21 of them right now.

They are all 2TB drives from ~2010, so we should consider replacing them.
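If a drive starts looking suspicious, its health can be checked with smartctl from the smartmontools package (a sketch; sdX is a placeholder for the real device name, and these are SAS drives, so the output is SCSI-style rather than ATA SMART attributes):

smartctl -x /dev/sdX   # overall health status, error counters and defect list for one drive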

Configuration

ranch runs regular Debian with ZFS, so that we can share the technology stack with mirror.

A pitfall of the disk shelf is that it only spins up its disks after the system has booted, and it takes quite some time to do so (6 disks every 12 seconds, according to https://docs.netapp.com/p/ontap-systems/platforms/Installation-And-Service-Guide.pdf), so ZFS freaks out when only part of the pool is visible and reports the pool as SUSPENDED. Running systemctl edit zfs-import-cache.service and putting in the following should fix this by delaying the ZFS import for 3 minutes so the disk shelf has time to finish initialization:

[Service]
ExecStartPre=/bin/sleep 180
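systemctl edit writes the snippet above to a drop-in file (/etc/systemd/system/zfs-import-cache.service.d/override.conf) and reloads systemd by itself. To confirm the override is in effect, check the merged unit:

systemctl cat zfs-import-cache.service   # the drop-in with ExecStartPre=/bin/sleep 180 should be listed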

NFS

We use ZFS's sharenfs property to set the NFS configuration for each dataset. This is done so that those NFS shares only start after ZFS is ready.

As before, we export sec=sys (so no authentication) on the special MC storage VLAN (VLAN 530, containing 172.19.168.32/27 and fd74:6b6a:8eca:4903::/64). This VLAN is only connected to trusted machines (NetApp, CSC servers in the MC 3015 or DC 3558 machine rooms).
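For example, a dataset exported read-write with sec=sys to the storage VLAN only could be set up roughly like this (a sketch; tank/scratch is a placeholder dataset name, the option string uses exports(5) syntax, and the real datasets and subnets should be substituted):

zfs set sharenfs='sec=sys,rw=@172.19.168.32/27' tank/scratch
zfs get sharenfs tank/scratch   # verify the property, and therefore the export options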

All other machines use sec=krb5p. By default, NFS clients need an nfs/ Kerberos principal, but since all CSC machines need to mount /users anyway, we just reuse the host/ Kerberos principal. This is done by running systemctl edit rpc-svcgssd.service and adding the following (see the rpc.svcgssd manual for more information):

[Service]
ExecStart=
ExecStart=/usr/sbin/rpc.svcgssd -n
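After saving the override, restart the daemon so the new flag takes effect, and check that the host/ principal being reused is actually present in the keytab (a sketch, assuming the default keytab path):

systemctl restart rpc-svcgssd.service
klist -k /etc/krb5.keytab | grep host/   # the host/ principal should be listed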

We disabled NFSv3 and NFSv4.0 in /etc/nfs.conf since all of the machines are expected to run recent versions of Debian.
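The relevant part of /etc/nfs.conf looks roughly like this (a sketch; the exact keys are documented in nfs.conf(5)):

[nfsd]
vers3=n
vers4.0=n

The protocol versions actually offered by the running server can be double-checked with cat /proc/fs/nfsd/versions.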