Journey of a Home-based Personal Cloud Storage Project

SCALE 21x

Julien RIOU

March 16, 2024

2007

Ubuntu Party, Paris

Ubuntu Party

May 2007

Los Angeles

Los Angeles

August 2007

17 years later

Who am I?

Summary

  1. Why?
  2. History
  3. Infrastructure
  4. Data management
  5. Alerting
  6. Observability
  7. Automation
  8. What’s next?
  9. Takeaways

Why?

Home-based Personal Cloud Storage, why on earth?

  • Never lose data again
  • Control my data
  • Learn new stuff
  • Have fun!

History

Apartment

2013

Apartment

USB drives

USB drive

USB drives

  • Hard to find
  • NTFS (because Microsoft Windows)
  • Physically plug, automount
  • Umount/eject, unplug

Network Attached Storage (NAS)

Network Attached Storage

Shared NAS

  • Desktop PC
  • Home office
  • SMB shares with Samba
  • Breaking upgrades

New job

2015

New job

  • Major cloud provider in Europe
  • Discount price on HDDs (not anymore)
  • OpenZFS (NFS, CIFS)
  • GNU/Linux on servers and desktops

Small storage

  • Must be small and silent
  • Synology design
  • 3x4TB HDD at discount price
  • Intel NUC motherboard, PCI RAID card
  • FreeBSD for built-in OpenZFS support

Motherboard sizes

First server

Copying data…

🔊

Big storage

  • Classic ATX tower
  • 3x2TB HDD at discount price
  • FreeBSD

Baby

2018

Baby

  • Moved the computers down to the basement
  • Time better spent with my baby

New house

2019

House

  • More space!
  • Noise is not an issue anymore
  • Secure basement

Old storage

  • Rebuilt my main computer
  • Re-used my old computer as a storage server
    • The first computer I ever built, back in 2008
  • 3x1TB HDD from my stock

Issues

  • USB stick not bootable
  • FreeBSD 12 CD-ROM had a Lua error
    • FreeBSD 11 too
    • Debian 10 worked
  • Freezes
    • Hard reboot
  • Fully replaced and upgraded today (3x2TB)

Recap

2024

Description    Name      Capacity (TiB)  OS
Big storage    storage1  5.45            Debian
Old storage    storage2  5.45            Debian
Small storage  storage3  10.9            Debian

Infrastructure

Map

Infrastructure map

Clients

Operating systems

  • No more Microsoft Windows
  • Ubuntu and friends

Network File System (NFS)

  • Easy to set up
  • Easy to maintain
  • Mount a remote directory locally
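
A minimal sketch of such a setup (paths and network range are illustrative, not the actual configuration):

# On the server: export a directory to the home subnet
echo '/storage/julien 192.168.1.0/24(rw,sync,no_subtree_check)' >> /etc/exports
exportfs -ra                                       # reload the export table
# On a client: mount the remote directory locally
mkdir -p /mnt/julien
mount -t nfs storage1:/storage/julien /mnt/julien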

Seafile

  • Harder to install and maintain
  • User friendly
    • Drive client, Web UI (seahub)
  • Keep files in sync
    • Pinned full files, full files and placeholders

Connectivity

Static IP address

ISP modem settings

  • SSH, HTTP and HTTPS closed by default
  • Port mapping
  • Request the ISP to set security level to low
  • It worked at the apartment, not in the house

OpenVPN

  • Virtual Private Network (VPN)
  • Client-server model
  • Authentication with certificates
  • TLS
  • Client-to-client allowed
  • Static IP address assignment to clients

Custom settings

topology subnet                                 ; declare a subnet like home
server 10.xx.xx.xx 255.xx.xx.xx                 ; with the range you like
client-to-client                                ; allow clients to talk to each other
client-config-dir /etc/openvpn/ccd              ; static IP configuration per client
ifconfig-pool-persist /var/log/openvpn/ipp.txt  ; IP lease settings
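
A hedged example of the per-client static address assignment via client-config-dir (client name "laptop" and addresses are placeholders):

# the ccd file name must match the client certificate common name
mkdir -p /etc/openvpn/ccd
echo 'ifconfig-push 10.xx.xx.100 255.xx.xx.xx' > /etc/openvpn/ccd/laptop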

Infrastructure map with VPS

Remote administration

  • Secure Shell protocol (SSH)
  • Login and execute commands on a remote host
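
For example, with key-based authentication (user and host names are illustrative):

ssh-copy-id julien@storage1              # install the public key once
ssh julien@storage1 'sudo zpool status'  # run a one-off command on the remote host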

Data management

Disk management

OpenZFS

  • Zettabyte File System (ZFS)
  • Volume manager, RAID-Z
  • Filesystems
  • Snapshots
    • Performance!
    • Replication, cloning, rollback
  • Compression, encryption
  • Production ready, even on Linux
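
A hedged sketch of how such a pool and its datasets can be created (disk and dataset names are illustrative; the real pools are shown on the next slides):

zpool create storage raidz1 sda sdb sdc          # single-parity RAID-Z over three disks
zfs set compression=lz4 storage                  # transparent lz4 compression
zfs create storage/julien                        # one filesystem per user
zfs snapshot storage/julien@manual_2024-02-27    # snapshots are instant and cheap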

RAID-Z

storage1 ~ # zpool status
  pool: storage
 state: ONLINE
  scan: scrub repaired 0B in 02:59:40 with 0 errors on Sun Feb 11 03:23:41 2024
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0

errors: No known data errors

RAID-Z

storage1 ~ # zpool list
NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
storage  5.44T  2.89T  2.55T        -         -     8%    53%  1.00x    ONLINE  -

Compression

storage1 ~ # zfs get compression storage
NAME     PROPERTY     VALUE           SOURCE
storage  compression  lz4             local

Filesystems

storage1 ~ # zfs list -t filesystem
NAME              USED  AVAIL     REFER  MOUNTPOINT
storage          1.93T  1.58T      139K  /storage
storage/julien    348G  1.58T      338G  /storage/julien

Snapshots

storage1 ~ # zfs list -t snapshot -r storage/julien | tail -n 3
storage/julien@autosnap_2024-02-25_00:00:01_daily       0B      -      338G  -
storage/julien@autosnap_2024-02-26_00:00:02_daily       0B      -      338G  -
storage/julien@autosnap_2024-02-27_00:00:02_daily       0B      -      338G  -

Replication

# initial replication: send the first snapshot in full
zfs send POOL/FS@SNAPSHOT-1                       | ssh REMOTE_HOST zfs recv POOL/FS
# then incremental sends: only the changes between two snapshots travel over SSH
zfs send -i POOL/FS@SNAPSHOT-1 POOL/FS@SNAPSHOT-2 | ssh REMOTE_HOST zfs recv POOL/FS
zfs send -i POOL/FS@SNAPSHOT-2 POOL/FS@SNAPSHOT-3 | ssh REMOTE_HOST zfs recv POOL/FS

Snapshot management

Sanoid

Policy-driven snapshot management tool for ZFS filesystems

  • Take snapshots
  • Pre and post snapshot scripts
  • Prune snapshots
  • Monitoring (health, capacity)

Templates configuration

[template_main]
    hourly = 0
    daily = 31
    monthly = 12
    yearly = 10
    autosnap = yes
    autoprune = yes

[template_archive]
    hourly = 0
    daily = 31
    monthly = 12
    yearly = 10
    autosnap = no
    autoprune = yes

Policies

[storage/julien]
    use_template = main

[storage/dad]
    use_template = archive

Job definition

systemd service

storage1 ~ # systemctl cat sanoid.service 
# /lib/systemd/system/sanoid.service
[Unit]
Description=Snapshot ZFS filesystems
Documentation=man:sanoid(8)
Requires=local-fs.target
After=local-fs.target
Before=sanoid-prune.service
Wants=sanoid-prune.service
ConditionFileNotEmpty=/etc/sanoid/sanoid.conf

[Service]
Type=oneshot
Environment=TZ=UTC
ExecStart=/usr/sbin/sanoid --take-snapshots --verbose

Job scheduling

systemd timer

storage1 ~ # systemctl cat sanoid.timer
# /lib/systemd/system/sanoid.timer
[Unit]
Description=Run Sanoid Every 15 Minutes

[Timer]
OnCalendar=*:0/15
Persistent=true

[Install]
WantedBy=timers.target
storage1 ~ # systemctl list-timers sanoid.timer --all
NEXT                        LEFT       LAST                        PASSED       UNIT         ACTIVATES     
Tue 2024-02-27 09:00:00 CET 11min left Tue 2024-02-27 08:45:01 CET 3min 20s ago sanoid.timer sanoid.service

1 timers listed.

Snapshot replication

  • Syncoid
    • included with Sanoid
  • rsync-like
  • Resume on interruption
  • Bandwidth control

Usage

/usr/sbin/syncoid                       \
    storage/julien                      \
    zfs@REMOTE_STORAGE:storage/julien   \
    --no-sync-snap                      \
    --source-bwlimit=512k 

Added to /opt/syncoid.sh script

Job definition

systemd service

storage1 ~ # systemctl cat syncoid.service 
# /etc/systemd/system/syncoid.service
[Unit]
Description=Send ZFS snapshots created by Sanoid
Requires=zfs.target
After=zfs.target

[Service]
Type=oneshot
User=zfs
ExecStart=-/opt/syncoid.sh

[Install]
WantedBy=multi-user.target

Job scheduling

systemd timer

storage1 ~ # systemctl cat syncoid.timer 
# /etc/systemd/system/syncoid.timer
[Unit]
Description=Run Syncoid every night

[Timer]
OnCalendar=*-*-* 00,04:30:00 UTC
AccuracySec=1us
RandomizedDelaySec=2h30

[Install]
WantedBy=timers.target

Client replication

Client replication

Replication overview

Health

storage1 ~ # sanoid --monitor-snapshots
OK: all monitored datasets (storage/dad, storage/julien) have fresh snapshots
storage1 ~ # sanoid --monitor-health
OK ZPOOL storage : ONLINE {Size:5.44T Free:2.55T Cap:53%} 

Alerting

Nagios

  • Nagios Core
  • Simple configuration files
  • Web UI
  • Plugins
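
On Debian, the configuration can be checked and reloaded using the package defaults (adjust paths and unit name if yours differ):

nagios4 -v /etc/nagios4/nagios.cfg    # verify the configuration files
systemctl reload nagios4              # apply the changes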

Welcome to pilote!

Raspberry Pi

Components

  • Hosts
  • Hostgroups
  • Services
  • Notifications

Nagios components

Host

/etc/nagios4/conf.d/hosts.cfg

define host {
    use         home-host
    host_name   storage1
    alias       storage1
    address     169.254.0.1
}

Hostgroups

/etc/nagios4/conf.d/hostgroups.cfg

define hostgroup {
    hostgroup_name  storage-servers
    alias           Storage servers
    members         storage1,storage2,storage3
}

Services commands

  • check_ping
  • check_nrpe
    • Nagios Remote Plugin Executor
  • check_http
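
A plugin can be exercised by hand from the Nagios host before wiring it into a service (plugin path is the Debian default):

/usr/lib/nagios/plugins/check_nrpe -H storage1 -c check_zfs_snapshots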

Services states

  • OK
  • WARNING
  • CRITICAL
  • UNKNOWN

Service configuration

define service {
    use                 home-service
    hostgroup_name      storage-servers 
    service_description zfs_snapshots
    check_command       check_nrpe!check_zfs_snapshots
}

NRPE agent

/etc/nagios/nrpe_local.cfg

command[check_zfs_snapshots]=/usr/bin/sudo /usr/sbin/sanoid --monitor-snapshots
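
Because the command goes through sudo, the NRPE user needs a matching sudoers entry; a minimal sketch, assuming the agent runs as the nagios user:

echo 'nagios ALL=(root) NOPASSWD: /usr/sbin/sanoid' | tee /etc/sudoers.d/nrpe-sanoid
visudo -cf /etc/sudoers.d/nrpe-sanoid    # syntax check before relying on it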

Notifications

Send Nagios notifications to a Telegram Messenger channel.

notify-by-telegram

PROBLEM notification

RECOVERY notification

Web UI

External access

Nagios external access

Observability

  • Disk space evolution
  • Network stability
  • Temperature in the room
  • Power consumption

TIG stack

  • Telegraf
  • InfluxDB
  • Grafana

Telegraf

The plugin-driven server agent for collecting & reporting metrics.

https://github.com/influxdata/telegraf

Inputs

[[inputs.cpu]]
  percpu = false
  totalcpu = true
  collect_cpu_time = false
  report_active = false

[[inputs.diskio]]
  devices = ['sda', 'sdb', 'sdc', 'sdd']

Outputs

[[outputs.influxdb]]
  urls = ["https://x.x.x.x:8088"]
  database = "metrics"
  skip_database_creation = true
  username = "telegraf"
  password = "****"
  insecure_skip_verify = true
  content_encoding = "gzip"
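
The agent can be run once in test mode to check that the inputs produce metrics before enabling the service:

telegraf --config /etc/telegraf/telegraf.conf --test    # gather once, print to stdout, send nothing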

InfluxDB

Scalable datastore for metrics, events and real-time analytics

https://github.com/influxdata/influxdb

Grafana

The open-source platform for monitoring and observability

https://github.com/grafana/grafana

Grafana dashboard

Overview

Notes

Sensors

  • Temperature
  • Humidity
  • Noise

Hardware

  • Arduino Uno (Elegoo Uno R3)
    • Powered by USB
  • DHT22 sensor (temperature, humidity)
  • KY-037 sensor (sound)
  • Breadboard
  • Cables

Arduino circuit

Software

Sketch

Definitions

#include <DHT.h>

#define KYPIN A0  // analog pin where KY-037 sensor is connected
#define DHTPIN 2  // digital pin where DHT22 sensor is connected

DHT dht(DHTPIN, DHT22); // initialize DHT22 object

float h;  // humidity
float t;  // temperature
int s;    // sound

Setup

void setup()
{
    Serial.begin(9600);
    dht.begin();
}

Main loop (1/2)

void loop()
{
    // sensors need some time to produce valid values
    delay(2000);

    // read values from sensors
    h = dht.readHumidity();
    t = dht.readTemperature();
    s = analogRead(KYPIN);

Main loop (2/2)

    // print "<humidity>,<temperature>,<sound>" (CSV-like)
    if (!isnan(h) && !isnan(t) && !isnan(s)) {
        Serial.print(h);
        Serial.print(",");
        Serial.print(t);
        Serial.print(",");
        Serial.println(s);
    }
}
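
The CSV stream can then be read on the host the board is plugged into, for instance (device node and baud rate are assumptions):

stty -F /dev/ttyACM0 9600 raw    # assumed serial device for the Arduino Uno
head -n 3 /dev/ttyACM0           # prints "<humidity>,<temperature>,<sound>" lines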

Multiplexing

What is the temperature?

Ambient temperature

Humidity

Ambient humidity

Noise

Ambient noise

Power consumption

How much will it cost?

Average monthly electricity wholesale price in Belgium from January 2019 to January 2024 (in euros per megawatt-hour)

Electricity chart

© Statista 2024

Household electricity prices worldwide in June 2023, by select country (in U.S. dollars per kilowatt-hour)

Electricity chart per household

© Statista 2024

Wattmeter

Wattmeter

Uninterruptible power supply (UPS)

APC Back-UPS Pro 550

Yearly cost

Power consumption dashboard

$7/y

In real life

storage1

storage2

storage3

Automation

Failures happen

MicroSD cards with I/O errors

Flood or fire in the house

Deployments

  1. Install the operating system
  2. Install and configure software
  3. Restore data (optional)

Ansible

Ansible is a radically simple IT automation system.

https://github.com/ansible/ansible

Concepts

  • Inventory: combination of
    • Hosts: remote machines to manage
    • Groups: hosts sharing a common attribute
  • Playbook: list of tasks executed in order, on groups
    • Roles: groups of tasks that can be shared with the world
    • Tasks: module + arguments
      • Modules: smallest unit of code to execute on hosts
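
These concepts map directly to the command line; for example, a quick ad-hoc run of the ping module against an inventory group:

ansible storage -i inventory/hosts -m ansible.builtin.ping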

Inventory

inventory/hosts file

[all]
vps ansible_host=xxx.xxx.xxx.xxx
pilote ansible_host=xxx.xxx.xxx.xxx
metrics ansible_host=xxx.xxx.xxx.xxx
storage1 ansible_host=xxx.xxx.xxx.xxx
storage2 ansible_host=xxx.xxx.xxx.xxx
storage3 ansible_host=xxx.xxx.xxx.xxx

[storage]
storage1 ansible_host=xxx.xxx.xxx.xxx
storage2 ansible_host=xxx.xxx.xxx.xxx
storage3 ansible_host=xxx.xxx.xxx.xxx

Playbook overview

site.yml

- import_playbook: common.yml
- import_playbook: storage.yml
- import_playbook: ...

common.yml

- hosts: all
  roles:
    - common

storage.yml

- hosts: storage
  roles:
    - zfs
    - openvpn
    - sanoid
    - ...

- hosts: storage1
  roles:
    - nfs

Role example

roles/sanoid/
├── defaults
│   └── main.yml
├── handlers
│   └── main.yml
├── tasks
│   └── main.yml
└── templates
    ├── sanoid.conf.j2
    ├── syncoid.service.j2
    ├── syncoid.sh.j2
    └── syncoid.timer.j2

Module examples
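
A couple of hedged ad-hoc invocations to illustrate commonly used modules (package and unit names reuse those seen earlier; adapt as needed):

ansible storage -i inventory/hosts -m ansible.builtin.apt \
    -a "name=sanoid state=present" --become
ansible storage -i inventory/hosts -m ansible.builtin.systemd \
    -a "name=sanoid.timer state=started enabled=true" --become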

Template example

Task

- name: Deploy Syncoid script
  ansible.builtin.template:
    src: syncoid.sh.j2
    dest: /opt/syncoid.sh
    owner: zfs
    group: root
    mode: "0750"

Template using Jinja2

#!/bin/bash
{{ ansible_managed | comment }}

{% for dataset in sanoid_main_datasets %}
{% for destination in syncoid_destinations %}
echo "Sending {{ dataset }} to {{ destination }}"
/usr/sbin/syncoid {{ dataset }} {{ syncoid_user }}@{{ destination }}:{{ dataset }} \
    --no-sync-snap \
    {% if syncoid_source_bwlimit %}--source-bwlimit={{ syncoid_source_bwlimit }} {% endif %}
{% endfor %}
{% endfor %}

Result on the managed host

#!/bin/bash
#
# Ansible managed
#

echo "Sending storage/julien to xxx.xxx.xxx.xxx"
/usr/sbin/syncoid storage/julien xxx@xxx.xxx.xxx.xxx:storage/julien \
    --no-sync-snap \
    --source-bwlimit=512k 
echo "Sending storage/dad to xxx.xxx.xxx.xxx"
/usr/sbin/syncoid storage/dad xxx@xxx.xxx.xxx.xxx:storage/dad \
    --no-sync-snap \
    --source-bwlimit=512k

Upgrades

upgrade.yml

- name: Upgrade systems
  hosts: all
  tasks:
    - include_tasks: tasks/apt-upgrade.yml

tasks/apt-upgrade.yml

- name: Run apt upgrade
  ansible.builtin.apt:
    update_cache: true
    upgrade: dist

CLI

ansible-playbook site.yml
ansible-playbook upgrade.yml
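
Useful variations with standard ansible-playbook flags:

ansible-playbook site.yml --check --diff     # dry run, show what would change
ansible-playbook site.yml --limit storage    # restrict the run to one group or host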

What’s next?

  • Open-source my Ansible code base
  • Automate certificate management
  • Use ZFS encryption
  • Use Prometheus for metrics
  • Forward logs
  • Handle mobile phones

Takeaways

  • Self-hosting is not that hard
  • Consider using TrueNAS
  • FOSS is awesome!
  • Enjoy what you are doing

Thank you

🙏

Questions
