As of Fall 2013, the CSC has a NetApp FAS3000 series which is capable of hosting network shares. It was donated to us by CSCF. It is also pretty old. Like, Pentium IV old.
All the manuals are hosted in ~sysadmin/netapp-docs/
Relevant docs for storage modification are: smg.pdf, sysadmin.pdf
iSCSI documentation is in ontop/bsag.pdf
While the NetApp supports both NFS and CIFS, neither export option provides the versatility or the options we want from a network fileshare (for instance, no device authentication is supported). Instead, we have configured the NetApp to export iSCSI block devices to be mounted on aspartame. Therefore, aspartame now replaces ginseng as the primary CSC fileserver.
- Filer: the controller unit for the NetApp. Currently psilodump.
- Disk shelf: where the physical disks live. Can be plugged into a filer or directly into another machine.
- RAID: "Redundant Array of Independent Disks", used to improve reliability and protect against disk failures.
- RAID-DP: "Double Parity" RAID, similar to RAID6 in failure tolerance (implemented like RAID4 but with two dedicated parity disks). A RAID group can survive up to two simultaneous disk failures without losing data.
- aggr: An aggregate of disks. This is a list of physical disks, similar to selecting the physical devices used for LVM.
- vol: A volume consisting of some space on an aggregate. In general, we use the whole aggregate for a volume. RAID level is set at the volume. Similar to an LVM volume group.
- lun: "Logical Unit Number" The LUN is a device addressed by the SCSI protocol, and looks like a disk to the user. We usually use the whole volume for a single LUN. This is similar to an LVM logical volume.
- aggr status -r aggr_name: Shows aggregate status
- disk show -v: Shows disks, and which filer they are owned by (currently all by psilodump)
- storage: storage related things
- disk assign: Assigns orphaned disks to a filer
- vol: Volume stuffs
Should aspartame get totally hosed, or should things stay stable for so long that every sysadmin who knew this setup has graduated, here is how to access, configure, and fully set up iSCSI on the NetApp and aspartame.
Configuration mechanisms are accessible via SSH or the serial interface, but only through aspartame, which the NetApp is directly plugged into. The NetApp is not visible on 134net at all.
The private IP is 10.15.134.130, only available from aspartame on the interface with IP 10.15.134.1. You may have to remove the default route from the routing table in order to successfully contact the machine with ssh.
- shelf 1
- 14x136GB 10,000RPM FibreChannel disks
- Currently disconnected, could be connected to psilodump or directly to another machine.
- shelf 2
- 14x136GB 10,000RPM FibreChannel disks
- Currently assigned to psilodump
- shelf 3
- 14x500GB 7,200RPM ATA disks
- Currently assigned to psilodump
- shelf 4
- 14x500GB 7,200RPM ATA disks
- Currently assigned to psilodump
- Root aggregate volume, in RAID-DP
- Music aggregate volume, in RAID-DP
- Users aggregate volume, in RAID-DP
- Backups volume for CSC videos, in RAID-DP
- Root volume.
- Music volume. This volume is not accessible via NFS or CIFS. It contains only the iSCSI LUN /vol/vol1music/lun0 .
- Users volume. This volume is not accessible via NFS or CIFS. It contains only the iSCSI LUN /vol/vol2users/lun0 .
- Backup volume for videos. This volume is not accessible via NFS or CIFS. It contains only the iSCSI LUN /vol/vol3backup/lun0 .
Enabling iSCSI and Auth (one-time setup)
Enable iSCSI and configure default authentication.
options iscsi.enable on
iscsi nodename iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca
iscsi security default -s CHAP -p yoursecurepassword -n psilodump
where yoursecurepassword is replaced with something actually secure. For iSCSI hosts, the target will be on node iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca with username psilodump and password yoursecurepassword.
Setting up a new disk aggregate, volume, and LUN
1. Log in to the NetApp. You'll either need access to the physical serial console or to ssh as root to psilodump's private IP (10.15.134.130). Credentials are stored in /users/sysadmin .
2. To get information on the available disks, run the command:
aggr status -r
This command will return three lists: Active aggregates with their assigned disks, spare disks, and disks managed by the partner. An aggregate is roughly equivalent to an LVM volume group: It is a collection of physical disks, possibly across multiple disk shelves and with various RAID levels applied, which may host one or more logical volumes. Do not proceed if there are fewer than three spare disks of each type available. Refer to the NetApp documentation to add more disks or release disks from existing aggregates.
3. Choose a list of disks for your new aggregate. The available space will be approximately 2/3 of the total disk space.
4. Create the aggregate as follows:
aggr create aggrN -t raid_dp -d [disk-list]
where [disk-list] is a list of the form AA:BB CC:DD ... containing the identifiers for the disks you wish to use to create the aggregate.
5. Retrieve the aggregate information. You will need to know the available space for the next step.
aggr show_space aggrN
6. Create a volume in the aggregate:
vol create volNfoo -s volume aggrN XXXK
where XXX is the total available space in aggrN. You may need to choose a smaller number due to hidden size constraints and rounding. If you can't seem to find the right size, pick one much smaller, and then use the command
vol size volNfoo +XXX
to grow the volume. This command will tell you how much available space remains, unlike `vol create`, so you don't need to keep guessing.
7. Disable snapshotting and access time update. Neither will be needed for exporting an iSCSI LUN.
vol options volNfoo no_atime_update on
vol options volNfoo nosnap on
snap reserve volNfoo 0
8. Create a LUN on your volume:
lun create -s XXXK -t linux /vol/volNfoo/lun0
where XXXK is the amount of available space on the volume, as shown by the command df.
9. Create an iSCSI initiator group and add all of your hosts to it:
igroup create -i -t linux volNfoo_group
igroup add volNfoo_group iqn.1993-08.org.debian:01:123456789
igroup add volNfoo_group iqn.1993-08.org.debian:01:981287231
...
The node identifiers given to the igroup add command will soon be able to access the iSCSI LUN you created above.
10. Map the LUN to the iSCSI initiator group:
lun map /vol/volNfoo/lun0 volNfoo_group
You're done! Any host in the initiator group should now be able to access the LUN you've created as a block device.
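The steps above always run the same commands with different names plugged in, so it can help to print the whole sequence before typing it at the filer prompt. The following is a minimal sketch, not a tool we actually have: the function name, the example disk identifiers, sizes, and the "test" volume suffix are all made up for illustration. It only builds strings and never talks to the NetApp.

```python
def provisioning_commands(n, name, disks, vol_size_k, lun_size_k, initiators):
    """Return the filer commands from steps 4-10 above, in order.

    n, name      -- hypothetical aggregate number and volume suffix (aggrN, volNname)
    disks        -- disk identifiers chosen from the spare list
    vol_size_k   -- volume size in K (see step 6 for why this may need tuning)
    lun_size_k   -- LUN size in K (see step 8)
    initiators   -- iSCSI initiator node names of the hosts that need access
    """
    aggr = f"aggr{n}"
    vol = f"vol{n}{name}"
    lun = f"/vol/{vol}/lun0"
    group = f"{vol}_group"
    cmds = [
        f"aggr create {aggr} -t raid_dp -d {' '.join(disks)}",
        f"aggr show_space {aggr}",
        f"vol create {vol} -s volume {aggr} {vol_size_k}K",
        f"vol options {vol} no_atime_update on",
        f"vol options {vol} nosnap on",
        f"snap reserve {vol} 0",
        f"lun create -s {lun_size_k}K -t linux {lun}",
        f"igroup create -i -t linux {group}",
    ]
    cmds += [f"igroup add {group} {iqn}" for iqn in initiators]
    cmds.append(f"lun map {lun} {group}")
    return cmds


if __name__ == "__main__":
    # All values below are placeholders, not real CSC configuration.
    for line in provisioning_commands(
        4, "test", ["0a.39", "0a.44", "0c.40"], 1000000, 900000,
        ["iqn.1993-08.org.debian:01:123456789"],
    ):
        print(line)
```

Review the printed commands by hand before pasting them in; in particular the sizes in steps 6 and 8 usually need adjusting against what the filer reports.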
Expanding an aggregate, volume, and LUN
1. Start by getting the aggregate's status, e.g.
psilodump> aggr status -r aggr3
Aggregate aggr3 (online, raid_dp) (block checksums)
Plex /aggr3/plex0 (online, normal, active)
RAID group /aggr3/plex0/rg0 (normal)
RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------    --------------
dparity   0c.32   0c    2   0   FC:B   -  FCAL 10000 136000/278528000  139072/284820800
parity    0c.33   0c    2   1   FC:B   -  FCAL 10000 136000/278528000  139072/284820800
data      0a.34   0a    2   2   FC:A   -  FCAL 10000 136000/278528000  139072/284820800
...
2. Now determine the available spare disks:
psilodump> aggr status -s
Spare disks
RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
--------- ------  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare     0a.41   0a    2   9   FC:A   -  FCAL 10000 136000/278528000  137104/280790184
spare     0c.38   0c    2   6   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
spare     0c.37   0c    2   5   FC:B   -  FCAL 10000 136000/278528000  137422/281442144
...
3. Select disks by device number and add them to the aggregate, using the following command. (Use the -n flag if you want to test your command syntax with a dry run.)
psilodump> aggr add aggr3 -g rg0 -d 0a.39 0a.44 0c.40 0c.45
Addition of 4 disks to the aggregate has completed.
Wed Dec 16 19:55:09 EST [psilodump: raid.vol.disk.add.done:notice]: Addition of Disk /aggr3/plex0/rg0/0c.45 Shelf 2 Bay 13 [NETAPP X274_HJURE146F10 NA14] S/N [404W6272] to aggregate aggr3 has completed successfully
...
4. Now fight with `vol size` to resize the volume:
psilodump> df -A aggr3
Aggregate               kbytes       used      avail capacity
aggr3                833369408  357122492  476246916      43%
psilodump> vol size vol3backup +476246000k
vol size: Insufficient space to grow this volume with its guarantee enabled; maximum growth is +473602692k.
psilodump> vol size vol3backup +473602692k
vol size: Flexible volume 'vol3backup' size set to 828725892k.
5. Last, fight with `lun resize` to increase the lun size:
psilodump> lun resize /vol/vol3backup/lun0 +473602692k
lun resize: No space left on device
lun resize: max size: 788g (846844657664)
psilodump> lun resize /vol/vol3backup/lun0 846844657664
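In both of the last two steps the trick is the same: ask for too much, then reuse the limit the filer prints in its error message. As a rough sketch of that, the two helpers below pull the limit out of messages shaped like the ones in the transcripts above. The function names are hypothetical, and the regular expressions are only known to match the exact wording shown on this page.

```python
import re


def max_growth_k(message):
    """Extract the 'maximum growth is +NNNk' value (in K) from a vol size error."""
    match = re.search(r"maximum growth is \+(\d+)k", message)
    return int(match.group(1)) if match else None


def max_lun_bytes(message):
    """Extract the byte count in parentheses from a 'lun resize: max size' line."""
    match = re.search(r"max size: \S+ \((\d+)\)", message)
    return int(match.group(1)) if match else None


vol_error = (
    "vol size: Insufficient space to grow this volume with its guarantee "
    "enabled; maximum growth is +473602692k."
)
lun_error = "lun resize: max size: 788g (846844657664)"

growth = max_growth_k(vol_error)
lun_bytes = max_lun_bytes(lun_error)

# These are the arguments used for the second attempt in each transcript.
print(f"vol size vol3backup +{growth}k")
print(f"lun resize /vol/vol3backup/lun0 {lun_bytes}")
```

Note that the second `lun resize` takes an absolute size in bytes rather than a `+` increment, which is why the value in parentheses is passed through unchanged.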
apt-get install open-iscsi
node.startup = manual
discovery.sendtargets.auth.authmethod=CHAP
discovery.sendtargets.auth.username=username
discovery.sendtargets.auth.password=password
node.session.auth.authmethod=CHAP
node.session.auth.username=username
node.session.auth.password=password
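A missing or commented-out CHAP line is an easy way to end up with failed logins later, so a quick check of the settings can save some head scratching. This is a small sketch under assumptions: it assumes the settings above live in a `key = value` style file (on Debian this is normally /etc/iscsi/iscsid.conf, which this page does not name explicitly), and the function names are invented for the example.

```python
REQUIRED_KEYS = (
    "discovery.sendtargets.auth.authmethod",
    "discovery.sendtargets.auth.username",
    "discovery.sendtargets.auth.password",
    "node.session.auth.authmethod",
    "node.session.auth.username",
    "node.session.auth.password",
)


def parse_conf(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_chap_settings(text):
    """Return the CHAP-related keys that are not set in the given config text."""
    settings = parse_conf(text)
    return [key for key in REQUIRED_KEYS if key not in settings]


example = """
node.startup = manual
discovery.sendtargets.auth.authmethod=CHAP
discovery.sendtargets.auth.username=username
discovery.sendtargets.auth.password=password
node.session.auth.authmethod=CHAP
node.session.auth.username=username
node.session.auth.password=password
"""
print(missing_chap_settings(example))
```

To check a real file, read it with `open(path).read()` and pass the text in; the path is left to the caller on purpose.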
Start open-iscsi service:
service open-iscsi start
Scan for iSCSI devices from the NetApp:
iscsiadm --mode discovery --type st --portal psilodump
This should dump out a ton of information, for example:
[fe80::XXXX:XXXX:XXXX:XXXX]:3260,2001 iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca
[fe80::XXXX:XXXX:XXXX:XXXX]:3260,2000 iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca
[fe80::XXXX:XXXX:XXXX:XXXX]:3260,2002 iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca
[fe80::XXXX:XXXX:XXXX:XXXX]:3260,1000 iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca
10.15.134.131:3260,2002 iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca
184.108.40.206:3260,2001 iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca
10.15.134.130:3260,2000 iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca
220.127.116.11:3260,1000 iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca
The .130 IPs correspond to one filer, and the .131 IPs correspond to the other filer. Currently we are only using one of the filers (psilodump).
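Each discovery line has the shape `portal:port,tag target`, so picking out the portals that belong to one filer is a matter of splitting the line and filtering on the address. The sketch below does that for output shaped like the example above; the function names are hypothetical and the parser has only been thought through against the line formats shown on this page (bracketed IPv6 and plain IPv4).

```python
def parse_discovery(output):
    """Turn sendtargets discovery output into a list of portal records."""
    records = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        portal_and_tag, target = parts
        portal, _, tag = portal_and_tag.rpartition(",")
        host, _, port = portal.rpartition(":")
        records.append({
            "host": host.strip("[]"),  # drop the brackets around IPv6 addresses
            "port": int(port),
            "tag": int(tag),
            "target": target,
        })
    return records


def portals_ending_in(records, suffix):
    """Keep only the records whose address ends with the given suffix."""
    return [r for r in records if r["host"].endswith(suffix)]


sample = """\
10.15.134.131:3260,2002 iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca
10.15.134.130:3260,2000 iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca
"""
for record in portals_ending_in(parse_discovery(sample), ".130"):
    print(record["host"], record["port"], record["tag"])
```

The record for 10.15.134.130 with tag 2000 is the one used in the login test that follows.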
This also populates the /etc/iscsi/nodes/iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca directory with all possible ways to access the NetApp. For testing purposes (i.e. node.startup = manual), this is okay.
Test to see if you can get the iSCSI device to show up correctly:
iscsiadm --mode node --targetname "iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca" --portal 10.15.134.130:3260 --login
This should produce output similar to:
Logging in to [iface: default, target: iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca, portal: 10.15.134.130,3260]
Login to [iface: default, target: iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca, portal: 10.15.134.130,3260]: successful
Check /dev/disk/by-path/ip* to ensure new disks show up:
# ls -l /dev/disk/by-path/ip*
/dev/disk/by-path/ip-10.15.134.130:3260-iscsi-iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca-lun-0 -> ../../sda
/dev/disk/by-path/ip-10.15.134.130:3260-iscsi-iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca-lun-0-part1 -> ../../sda1
/dev/disk/by-path/ip-10.15.134.130:3260-iscsi-iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca-lun-1 -> ../../sdb
/dev/disk/by-path/ip-10.15.134.130:3260-iscsi-iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca-lun-1-part1 -> ../../sdb1
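The by-path names encode the portal, the target, the LUN number, and optionally a partition number, which makes it possible to check which LUNs showed up without squinting at long symlink names. Here is a small sketch that parses names of the form shown above. The function name and the regular expression are illustrative assumptions based only on the listing on this page.

```python
import re

BY_PATH_NAME = re.compile(
    r"^ip-(?P<portal>.+)-iscsi-(?P<target>.+)-lun-(?P<lun>\d+)"
    r"(?:-part(?P<part>\d+))?$"
)


def parse_by_path(name):
    """Split a /dev/disk/by-path entry name into portal, target, LUN and partition.

    Returns None for entries that are not iSCSI disks.
    """
    match = BY_PATH_NAME.match(name)
    if not match:
        return None
    part = match.group("part")
    return {
        "portal": match.group("portal"),
        "target": match.group("target"),
        "lun": int(match.group("lun")),
        "part": int(part) if part is not None else None,
    }


names = [
    "ip-10.15.134.130:3260-iscsi-iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca-lun-0",
    "ip-10.15.134.130:3260-iscsi-iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca-lun-1-part1",
]
for name in names:
    info = parse_by_path(name)
    print(info["portal"], "lun", info["lun"], "part", info["part"])
```

On a real host the names would come from `os.listdir("/dev/disk/by-path")`, and `os.readlink` on each entry gives the `../../sdX` device it points at.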
If this fails, check all your configuration again.
If this succeeds, you are now ready to try autoconnecting the iSCSI device.
Delete all extraneous entries from /etc/iscsi/nodes/iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca . This prevents the startup script from (a) hanging, and (b) being very upset. All that is left should be the interface you intend to connect through:
# ls -l /etc/iscsi/nodes/iqn.1992-08.com.netapp:psilodump.csclub.uwaterloo.ca/
10.15.134.130,3260,2000
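Before deleting anything from the nodes directory, it is worth listing exactly which entries would go. The sketch below is a dry run only: it takes a list of entry names and the one entry to keep, and returns the rest. The function name is hypothetical, and nothing here removes files.

```python
def extraneous_entries(entries, keep):
    """Return, sorted, every entry name except the one we intend to keep."""
    return sorted(entry for entry in entries if entry != keep)


# Entry names follow the "address,port,tag" pattern from the listing above;
# the second one is an example of a portal we do not connect through.
entries = [
    "10.15.134.130,3260,2000",
    "10.15.134.131,3260,2002",
]
for entry in extraneous_entries(entries, keep="10.15.134.130,3260,2000"):
    print("would delete:", entry)
```

On aspartame the entry list would come from `os.listdir()` on the nodes directory named above; the actual deletion is left as a manual step so that a typo in `keep` cannot wipe the working entry.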
node.startup = automatic
For the init.d script to work correctly (i.e. properly mount things) we need to add a sleep to allow the device to settle: Edit /etc/init.d/open-iscsi roughly around line 127 to add a "sleep 1":
...
# Now let's mount
sleep 1
log_daemon_msg "Mounting network filesystems"
MOUNT_RESULT=1
if mount -a -O _netdev >/dev/null 2>&1; then
	MOUNT_RESULT=0
	break
fi
log_end_msg $MOUNT_RESULT
...
Now we can restart the service:
service open-iscsi restart
Now you can configure partitions and mountpoints.
Exporting Kerberized NFS from Debian Squeeze
The default kernel in Debian squeeze (stable, 2.6.32) does not support the necessary crypto suites to export kerberized NFS to newer kernels. You MUST upgrade the kernel, nfs-common, and nfs-kernel-server packages to AT LEAST squeeze-backports.
iSCSI block device mount optimizations
tmyklebu made some changes to /sys/block/sda/queue. The following is now in /etc/rc.local on aspartame:
echo 2048 > /sys/block/sda/queue/read_ahead_kb
echo 32768 > /sys/block/sda/queue/max_sectors_kb
echo 4096 > /sys/block/sda/queue/nr_requests
echo noop > /sys/block/sda/queue/scheduler
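The rc.local lines hard-code sda, which breaks if the iSCSI disk ever comes up under a different name. As a sketch of one way around that, the helper below builds the same four writes for any device name. The function names are made up for illustration, the values are the ones from rc.local above, and `apply_writes` needs root and a kernel that accepts these values.

```python
QUEUE_SETTINGS = (
    ("read_ahead_kb", "2048"),
    ("max_sectors_kb", "32768"),
    ("nr_requests", "4096"),
    ("scheduler", "noop"),
)


def queue_writes(device, sys_block="/sys/block"):
    """Return (path, value) pairs equivalent to the echo lines for one device."""
    return [
        (f"{sys_block}/{device}/queue/{name}", value)
        for name, value in QUEUE_SETTINGS
    ]


def apply_writes(writes):
    """Write each value to its sysfs file (requires root; not run here)."""
    for path, value in writes:
        with open(path, "w") as handle:
            handle.write(value)


# Print the plan for sda without touching /sys.
for path, value in queue_writes("sda"):
    print(f"echo {value} > {path}")
```

Calling `apply_writes(queue_writes("sdb"))` would apply the same tuning to a second LUN, assuming it shows up as sdb.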
We should increase the iSCSI settings node.session.queue_depth and node.session.cmds_max during the next maintenance window.
Transferring old files from ginseng
- On ginseng, use parted to set up the mounted iscsi drive as an ext4 primary partition (setting up a partition of size >2TB requires care and a GPT)
- Compiled star in /root on ginseng
- Transferred files with the following Makefile (assuming original user directories in /export/users, destination volume in /mnt/iscsi, make -j8):
foo := $(wildcard /export/users/*)
bar := $(patsubst /export/users/%,/mnt/iscsi/%,$(foo))
all: $(bar)
/mnt/iscsi/%: /export/users/%
	# echo $@ $<
	~/star-1.5.2/star/OBJ/x86_64-linux-cc/star \
		-copy -p -acl artype=exustar \
		-C /export/users $(notdir $<) /mnt/iscsi
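The Makefile is terse, so here is what it works out to: one star invocation per top-level directory in /export/users, each producing the matching directory under /mnt/iscsi, with `make -j8` running eight of them at a time. The sketch below only builds the per-directory command lists. The function name and the user names are hypothetical, and the star path is shortened to plain `star` for readability.

```python
def copy_jobs(src_root, dst_root, names, star="star"):
    """Return (destination, command) pairs mirroring the Makefile's pattern rule.

    Each command corresponds to:
        star -copy -p -acl artype=exustar -C src_root NAME dst_root
    """
    jobs = []
    for name in sorted(names):
        command = [
            star, "-copy", "-p", "-acl", "artype=exustar",
            "-C", src_root, name, dst_root,
        ]
        jobs.append((f"{dst_root}/{name}", command))
    return jobs


# "alice" and "bob" stand in for real user directory names.
for destination, command in copy_jobs("/export/users", "/mnt/iscsi", ["bob", "alice"]):
    print(destination, "<-", " ".join(command))
```

One thing the Makefile gives for free that this sketch does not: a target that already exists under /mnt/iscsi is skipped on a rerun, which makes an interrupted transfer easy to resume.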
- On ginseng, authenticate with iSCSI target (psilodump.csclub.uwaterloo.ca lun0).
- Unmount /dev/mapper/vg0-users
- Copy users filesystem directly to iSCSI target:
dd if=/dev/mapper/vg0-users of=/path/to/psilodump:lun0 bs=8M
- Resize users filesystem on destination partition to fit: