Systemd-nspawn

M5choo (talk | contribs)
Add info for migrating resource limit infos and nspawn configs
 
Latest revision as of 15:40, 30 May 2026

systemd-nspawn is a simpler alternative to LXC which works well on modern versions of Debian (and, unlike LXC, it does not break very critical systemd services running in containers). For "pet" containers, we should be using systemd-nspawn; for "cattle" containers, Podman is more appropriate.

Some light reading:

Quickstart

In the example below, we will create a container called 'machine1'.

Create a directory for the rootfs:

mkdir /var/lib/machines/machine1

Or, if you are using an LVM volume, just create a symlink in /var/lib/machines to where the LV is mounted:

ln -s /vm/machine1 /var/lib/machines/machine1

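Now bootstrap the rootfs (the target "." assumes you have changed into the rootfs directory, e.g. with cd /var/lib/machines/machine1):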

debootstrap --variant=minbase --include=dbus,systemd-container,vim bookworm . http://mirror.csclub.uwaterloo.ca/debian

Note that the systemd-container package must be installed in the guest.

Now do a bit of setup in the rootfs:

chroot /var/lib/machines/machine1
# Only do this if you want to use `machinectl login`
passwd -d root
cat <<EOF >>/etc/securetty
pts/0
pts/1
pts/2
pts/3
EOF
# set hostname
echo machine1 > /etc/hostname
# set FQDN
vim /etc/hosts
# Use systemd-networkd for network management. See 
vim /etc/systemd/network/10-hostbr0.network
exit
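
For reference, the network file edited above could look like the following. This is a minimal sketch, not the configuration used on any particular machine: it assumes the bridge hands out addresses via DHCP, and the helper name write_host0_network is hypothetical. The interface name host0 is the container-side name of the bridged veth pair.

```shell
# Hypothetical helper: write a minimal systemd-networkd config for host0
# into a container rootfs. Assumption: addresses are assigned via DHCP.
write_host0_network() {
    rootfs="$1"   # e.g. /var/lib/machines/machine1
    mkdir -p "$rootfs/etc/systemd/network"
    cat > "$rootfs/etc/systemd/network/10-hostbr0.network" <<'EOF'
[Match]
Name=host0

[Network]
DHCP=yes
EOF
}
```

Run it from the host as write_host0_network /var/lib/machines/machine1. systemd-networkd also has to be enabled in the guest (systemctl enable systemd-networkd inside the chroot) for the file to take effect.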

Now paste the following into /etc/systemd/nspawn/machine1.nspawn:

[Exec]
Boot=yes
Hostname=machine1
PrivateUsers=no

[Network]
Bridge=br0

Replace 'br0' with the bridge interface on the host to which the container should be attached (a veth pair will be created when the container starts up).

Also make sure to set 'PrivateUsers=no', because by default systemd-nspawn uses some randomized UID/GID mapping which makes it difficult to migrate the container to a different system.

Now start the container:

systemctl start systemd-nspawn@machine1

Or alternatively, using machinectl:

machinectl start machine1

To log in to a container via an emulated serial console (I don't recommend doing this, since the TTY gets screwed up):

machinectl login machine1

Attach to a running container (similar to lxc-attach):

machinectl shell machine1

Note: if you see the error sh: 2: exec: : Permission denied, append /bin/bash to the end of the command:

machinectl shell machine1 /bin/bash

Important: make sure the container starts up at boot:

systemctl enable systemd-nspawn@machine1

Multiple network interfaces

Unfortunately systemd does not have a built-in way to create multiple bridged network interfaces (see https://github.com/systemd/systemd/issues/11087). Thankfully, it's not too difficult to accomplish this using the VirtualEthernetExtra option and a systemd drop-in; the idea is to create some extra veth pairs and then manually attach them to the extra bridges.

Let's say you have three bridges on the host: br0, br1 and br2, and you want the container to be attached to all three. Make your nspawn file look like this:

...
[Network]
Bridge=br0
# These will be manually bridged to the host
VirtualEthernetExtra=ve-machine1-1:veth1
VirtualEthernetExtra=ve-machine1-2:veth2

Now run systemctl edit systemd-nspawn@machine1 and paste the following:

[Service]
ExecStartPost=/usr/sbin/ip link set dev ve-machine1-1 master br1
ExecStartPost=/usr/sbin/ip link set dev ve-machine1-1 up
ExecStartPost=/usr/sbin/ip link set dev ve-machine1-2 master br2
ExecStartPost=/usr/sbin/ip link set dev ve-machine1-2 up

In the container, there will be three interfaces:

  • host0, which is attached to br0 on the host
  • veth1, which is attached to br1 on the host
  • veth2, which is attached to br2 on the host

Make sure you update /etc/systemd/network/10-hostbr0.network in the container accordingly, so that the extra interfaces are configured as well.
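
The drop-in shown earlier follows a regular pattern, so it can be generated for any number of extra bridges. The function below is only a sketch; the name print_bridge_dropin is hypothetical, and it assumes the host-side interfaces are named ve-<machine>-1, ve-<machine>-2, and so on, as in the example.

```shell
# Hypothetical helper: print a [Service] drop-in that attaches the extra
# host-side veth interfaces of a container to the given bridges.
# Assumption: the interfaces are named ve-<machine>-<n>, counting from 1.
print_bridge_dropin() {
    machine="$1"
    shift
    echo "[Service]"
    i=1
    for bridge in "$@"; do
        echo "ExecStartPost=/usr/sbin/ip link set dev ve-$machine-$i master $bridge"
        echo "ExecStartPost=/usr/sbin/ip link set dev ve-$machine-$i up"
        i=$((i + 1))
    done
}
```

For example, print_bridge_dropin machine1 br1 br2 prints the drop-in from this section, ready to paste into systemctl edit systemd-nspawn@machine1. Keep in mind that Linux interface names are limited to 15 characters, so long machine names need a shorter prefix.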

Migrate to LXC

Prerequisites

  • These instructions assume that the source systemd-nspawn container runs Debian.
  • Source machine running the systemd-nspawn container (Debian 12 base).
  • Target Proxmox VE node with CLI access.
  • Network connectivity (or storage media) to transfer the container archive.
  • Matching processor architectures between source and destination (e.g., both amd64).

Step 1: Package the Source Container

Shut the container down cleanly before archiving it, so that no files change while tar is reading them.

  1. Log into the nspawn host and shut down the target container:
    machinectl stop <container_name>
    
  2. Confirm the container no longer appears in the list of running machines:
    machinectl list
    
  3. Create a compressed tarball of the root filesystem (typically located under /var/lib/machines/):
    tar -cvzf container_backup.tar.gz -C /var/lib/machines/<container_name> .
    
    Note: The -C flag makes tar change into the rootfs directory before archiving, so the top level of the archive is the root filesystem itself (./etc, ./usr, ...) rather than a var/lib/machines/<container_name>/ prefix. Proxmox expects the root filesystem at the top level of the archive.
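
The packaging step can be wrapped in a small function that also checks the archive layout. This is a sketch under assumptions: the function name pack_rootfs is hypothetical, GNU tar is assumed, and --numeric-owner is an addition not present in the command above (it stores UIDs/GIDs as numbers so that ownership does not depend on the user databases of either host).

```shell
# Hypothetical wrapper around the tar command above (assumes GNU tar).
# Creates the archive, then checks that ./etc/ sits at its top level.
pack_rootfs() {
    rootfs="$1"    # e.g. /var/lib/machines/<container_name>
    archive="$2"   # e.g. container_backup.tar.gz
    # --numeric-owner is an assumption added here, see the text above.
    tar --numeric-owner -czf "$archive" -C "$rootfs" . || return 1
    # Count entries below ./etc/ ; a rootfs archive must have at least one.
    count=$(tar -tzf "$archive" | grep -c '^\./etc/')
    [ "$count" -ge 1 ]
}
```

Usage: pack_rootfs /var/lib/machines/<container_name> container_backup.tar.gz && echo "archive looks like a rootfs".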

Step 1.1: Extract Resource Limits and Metadata

systemd-nspawn keeps execution settings in .nspawn files, while resource limits are usually set in overrides of the systemd-nspawn@ unit. Record both so that the Proxmox container can be given equivalent settings.

  1. Show the unit and any drop-in overrides for the container:
    systemctl cat systemd-nspawn@<container_name>.service
    
  2. Check for a manual configuration file (usually in /etc/systemd/nspawn/):
    cat /etc/systemd/nspawn/<container_name>.nspawn
    
  3. Note the following values to be used during the pct create or pct set phase:
    • CPU Shares: Look for CPUWeight or CPUShares.
    • Memory Limits: Look for MemoryMax or MemoryLimit.
    • Environment Variables: Look for Environment= lines.
    • Bind Mounts: Look for Bind= or BindReadOnly= entries which will need to be recreated as "Mount Points" in Proxmox.
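
The values listed above can be pulled out of both sources with a single grep. The function below is a sketch; its name, show_limits, is hypothetical, and it only matches the setting names mentioned in this step.

```shell
# Hypothetical helper: print the resource, environment and bind-mount
# settings named in this step from the given files ("-" reads stdin).
show_limits() {
    grep -hE '^(CPUWeight|CPUShares|MemoryMax|MemoryLimit|Environment|Bind|BindReadOnly)=' "$@"
}
```

Usage: systemctl cat systemd-nspawn@<container_name>.service | show_limits - /etc/systemd/nspawn/<container_name>.nspawn. No output (and a non-zero exit status) means none of these settings are present.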

Step 2: Transfer the Payload to Proxmox

Copy the archive to the Proxmox storage that holds container templates.

  1. Copy the file to the default template cache path of your Proxmox server with scp:
    scp container_backup.tar.gz root@<proxmox_ip>:/var/lib/vz/template/cache/
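
To make sure the archive arrived intact, its checksum can be compared on both sides. This is an optional sketch; the helper name sha_of is hypothetical, and it assumes sha256sum (GNU coreutils) is available on both hosts.

```shell
# Hypothetical helper: print only the SHA-256 hex digest of a file.
sha_of() {
    sha256sum "$1" | cut -d' ' -f1
}
```

Run sha_of container_backup.tar.gz on the nspawn host and compare the result with the output of ssh root@<proxmox_ip> sha256sum /var/lib/vz/template/cache/container_backup.tar.gz. The two digests should be identical.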
    

Step 3: Provision and Initialize the LXC Instance

Create the container with pct on the Proxmox node. Pick an unused container ID (VMID), e.g., 105.

  1. Run pct create with options that suit the container. The following should work fine in most scenarios, but don't forget to verify it for each migration case:
    pct create <VMID> /var/lib/vz/template/cache/container_backup.tar.gz \
      --hostname my-migrated-container \
      --storage vm \
      --net0 name=eth0,bridge=vmbr0,ip=dhcp \
      --ostype debian \
      --arch amd64
    

Parameter Breakdown

Parameter | Type | Purpose / Operational Notes
<VMID> | Integer | Unused numeric ID that identifies the new container.
Archive path | File path | Path to the transferred .tar.gz archive.
--storage | Storage ID | Storage on which the container's root disk is created (e.g. local or local-lvm; the example above uses a storage named vm).
--ostype | String | Selects the distribution-specific setup that Proxmox applies inside the guest, such as network configuration. Use debian regardless of the Debian point release.
--net0 | Interface string | Sets the interface name inside the container, the host bridge it attaches to, and the network configuration method (e.g., DHCP).

Step 4: Post-Migration Integration Tasks

Before enabling start at boot, check the following:

  • Resource Alignment: Apply the values recorded in Step 1.1 (CPU, RAM, mounts) in the Proxmox UI under the Resources tab, or via pct set <VMID> -memory <MB> -cores <number>, so that the container keeps its original limits.
  • Systemd Nesting: Enable nesting to allow the guest systemd to manage its internal units. Navigate to Options -> Features -> Check nesting, or run:
    pct set <VMID> --features nesting=1
    
  • User Privilege Level Constraints: Unprivileged containers are the safer choice. Whether a container is unprivileged is decided when it is created (the --unprivileged option of pct create) and is not meant to be changed afterwards; pct config <VMID> shows the current setting. If the nspawn container needs direct device access or other privileged operations, recreate it as a privileged container.
  • Network Configuration: The systemd-networkd configuration carried over from nspawn (e.g. /etc/systemd/network/10-hostbr0.network, which matches the interface host0) may conflict with, or simply not apply to, the eth0 interface that Proxmox sets up. If the network does not come up, adapt or remove the old configuration inside the container.
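
pct set takes the memory size in MB, while systemd's MemoryMax accepts bytes or a K, M, G or T suffix (base 1024). The conversion can be sketched as follows; the function name to_mib is hypothetical, and values such as infinity or percentages are deliberately rejected because they have no direct equivalent here.

```shell
# Hypothetical helper: convert a systemd memory value (bytes, or with a
# K/M/G/T suffix, base 1024) to whole MiB for use with pct set -memory.
to_mib() {
    value="$1"
    number="${value%[KMGT]}"          # strip one trailing unit suffix
    case "$number" in
        ''|*[!0-9]*)                  # not a plain non-negative integer
            echo "unsupported value: $value" >&2
            return 1 ;;
    esac
    case "$value" in
        *K) echo $((number / 1024)) ;;
        *M) echo "$number" ;;
        *G) echo $((number * 1024)) ;;
        *T) echo $((number * 1024 * 1024)) ;;
        *)  echo $((number / 1024 / 1024)) ;;   # plain bytes
    esac
}
```

For example, a container with MemoryMax=2G would be given pct set <VMID> -memory "$(to_mib 2G)", which expands to 2048.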

Step 5: Start and Verify Instance Health

  1. Start the new container:
    pct start <VMID>
    
  2. Open a shell inside the container:
    pct enter <VMID>
    
  3. Check the overall system state and the status of its services from inside the container:
    systemctl status
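
For a quicker check, systemctl is-system-running condenses the system state into a single word. The function below is a sketch of how that word might be interpreted; its name, describe_state, is hypothetical.

```shell
# Hypothetical helper: turn the output of "systemctl is-system-running"
# into a short hint about what to do next.
describe_state() {
    case "$1" in
        running)               echo "ok: all units started" ;;
        degraded)              echo "warning: some units failed, run 'systemctl --failed'" ;;
        initializing|starting) echo "wait: boot still in progress" ;;
        *)                     echo "check manually: state is '$1'" ;;
    esac
}
```

From the Proxmox host this could be used as describe_state "$(pct exec <VMID> -- systemctl is-system-running)".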
    

Step 6: Decommission the Source Instance

Once the container is confirmed functional on Proxmox, the original files and configurations should be removed from the nspawn host to reclaim storage.

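  1. Permanently delete the container image and its associated configuration files. machinectl terminate is only needed if the container was started again after Step 1 (it reports an error when the machine is not running). If the unit was enabled at boot, also disable it with systemctl disable systemd-nspawn@<container_name>.
    machinectl terminate <container_name>
    machinectl remove <container_name>
    
  2. Verify the image is no longer listed:
    machinectl list-images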
    
  3. Manually remove the transferred tarball from the nspawn host's local storage:
    rm container_backup.tar.gz
    

(Optional) Step 7: Update container

It's time to update apt repos and upgrade packages!

apt update
apt upgrade -y
apt full-upgrade -y
apt autoremove --purge

Optionally, if the container is running on Debian 12 and you want Debian 13, upgrade the container to Debian 13 after consulting other Syscom members. (Good luck dealing with /etc/apt/sources.list{,.d} configs)
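
For the release upgrade, the codename in the apt sources has to change from bookworm (Debian 12) to trixie (Debian 13). The function below is only a sketch of that one step, assuming GNU sed; the name switch_release is hypothetical, and a real upgrade should still follow the Debian release notes. Back up the files first, since they are edited in place.

```shell
# Hypothetical helper: replace the codename bookworm with trixie in the
# given apt source files, in place (assumes GNU sed for -i and \b).
# Suites such as bookworm-updates become trixie-updates.
switch_release() {
    sed -i 's/\bbookworm\b/trixie/g' "$@"
}
```

Usage inside the container: switch_release /etc/apt/sources.list, plus any files under /etc/apt/sources.list.d/ that mention bookworm, followed by apt update and apt full-upgrade.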