Journey of a Home-based Personal Cloud Storage Project

SCALE 21x

Julien RIOU

March 16, 2024



2007


Ubuntu Party, Paris

Ubuntu Party

May 2007

It all started in 2007, when I was in high school. My dad and I were looking for something to do together on a weekend, to kill time and boredom. Then I heard about this thing called Ubuntu. An event called an “Ubuntu Party” was scheduled at La Cité des Sciences in Paris: a day open to every curious person like us, full of talks about FOSS. Something clicked. That day, I knew what I wanted to do.

Photo by Yannick Croissant


Los Angeles

Los Angeles

August 2007

Later that year, my dream at the time came true. We took a road trip along the West Coast, landing right here, in Los Angeles. My first trip overseas.


17 years later

And here we are, 17 years later, in the same place, to share my experience of running a personal project on free and open source software, based on the idea that we don’t need to depend on cloud providers to hold our own data.

From the bottom of my heart, I would like to thank the organizers for selecting my talk. It means a lot to me.


Who am I?


Summary

  1. Why?
  2. History
  3. Infrastructure
  4. Data management
  5. Alerting
  6. Observability
  7. Automation
  8. What’s next?
  9. Takeaways

Why?

Home-based Personal Cloud Storage, why on earth?


History


Apartment

2013

Apartment

Apartment icons created by Smashicons - Flaticon


USB drives

USB drive


USB drives


Network Attached Storage (NAS)

Network Attached Storage


Shared NAS


New job

2015

New job

Office icon created by Backwoods - Flaticon


Small storage


Motherboard sizes

Link to the image


First server


Copying data…


🔊

I started copying data to the storage space when a loud alarm began waking everybody up in the building. It was unbearable.

Later, I found out that the noise was coming from the disk backplane, not the motherboard. It has a buzzer that emits a sound sequence depending on the detected anomaly. At the apartment, and at my current house in the summer, the temperature in the room was too high (more than 29°C). I moved the host to a cooler place. Problem solved.


Big storage


Baby

2018

Baby

Baby icon created by Smashicons - Flaticon


New house

2019

House

House icon created by Freepik - Flaticon


Old storage


Issues


Recap

2024

Description     Name      Capacity (TiB)  OS
Big storage     storage1  5.45            Debian
Old storage     storage2  5.45            Debian
Small storage   storage3  10.9            Debian

I replaced FreeBSD with Debian because I fell back into my comfort zone when setting up configuration management.



Infrastructure


Map

Infrastructure map


Clients


Operating systems


Network File System (NFS)


See the docs
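On the server side, NFS boils down to an /etc/exports file; a minimal sketch, where the dataset path comes from the ZFS section below and the subnet and mount options are assumptions:

```
# /etc/exports -- export the ZFS dataset to the home subnet (subnet is a placeholder)
/storage/julien  10.xx.xx.0/24(rw,sync,no_subtree_check)
```

After editing the file, exportfs -ra reloads the export table without restarting the server.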


Connectivity


Static IP address

Static IP option

Have a static IP address to access your server remotely over the Internet.

33 USD/month

Example of a static IP address option on an existing Internet subscription from a well-known Belgian ISP. This option is absolutely not worth the price.

Red icon created by hqrloveq - Flaticon


ISP modem settings

  • SSH, HTTP and HTTPS closed by default
  • Port mapping
  • Request the ISP to set security level to low
  • It worked at the apartment, not in the house



Custom settings

topology subnet                                 ; declare a subnet like home
server 10.xx.xx.xx 255.xx.xx.xx                 ; with the range you like
client-to-client                                ; allow clients to talk to each other
client-config-dir /etc/openvpn/ccd              ; static IP configuration per client
ifconfig-pool-persist /var/log/openvpn/ipp.txt  ; IP lease settings
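client-config-dir points at per-client files, each named after the client certificate’s common name, which can pin a static VPN address. A minimal sketch, with placeholder addresses:

```
; /etc/openvpn/ccd/storage1 -- pin this client to a fixed VPN address
ifconfig-push 10.xx.xx.xx 255.xx.xx.xx
```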

Infrastructure map with VPS



Remote administration


Data management


Disk management

OpenZFS

ZFS stands for Zettabyte File System, and a zettabyte is equivalent to 10²¹ bytes. 10²¹ is also the estimated number of grains of sand on Earth.

Source


RAID-Z

storage1 ~ # zpool status
  pool: storage
 state: ONLINE
  scan: scrub repaired 0B in 02:59:40 with 0 errors on Sun Feb 11 03:23:41 2024
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0

errors: No known data errors

RAID-Z

storage1 ~ # zpool list
NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
storage  5.44T  2.89T  2.55T        -         -     8%    53%  1.00x    ONLINE  -

Compression

storage1 ~ # zfs get compression storage
NAME     PROPERTY     VALUE           SOURCE
storage  compression  lz4             local
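Enabling compression is a one-liner; note that the property only affects blocks written after it is set:

```
zfs set compression=lz4 storage    # new writes only; existing data stays as-is
zfs get compressratio storage      # check the ratio actually achieved
```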

Filesystems

storage1 ~ # zfs list -t filesystem
NAME              USED  AVAIL     REFER  MOUNTPOINT
storage          1.93T  1.58T      139K  /storage
storage/julien    348G  1.58T      338G  /storage/julien

Snapshots

storage1 ~ # zfs list -t snapshot -r storage/julien | tail -n 3
storage/julien@autosnap_2024-02-25_00:00:01_daily       0B      -      338G  -
storage/julien@autosnap_2024-02-26_00:00:02_daily       0B      -      338G  -
storage/julien@autosnap_2024-02-27_00:00:02_daily       0B      -      338G  -

Snapshots are free thanks to copy-on-write


Replication

zfs send POOL/FS@SNAPSHOT-1                       | ssh REMOTE_HOST zfs recv POOL/FS
zfs send -i POOL/FS@SNAPSHOT-1 POOL/FS@SNAPSHOT-2 | ssh REMOTE_HOST zfs recv POOL/FS
zfs send -i POOL/FS@SNAPSHOT-2 POOL/FS@SNAPSHOT-3 | ssh REMOTE_HOST zfs recv POOL/FS

Snapshot management

Sanoid

Policy-driven snapshot management tool for ZFS filesystems


Templates configuration

[template_main]
    hourly = 0
    daily = 31
    monthly = 12
    yearly = 10
    autosnap = yes
    autoprune = yes

[template_archive]
    hourly = 0
    daily = 31
    monthly = 12
    yearly = 10
    autosnap = no
    autoprune = yes

Policies

[storage/julien]
    use_template = main

[storage/dad]
    use_template = archive

Job definition

systemd service

storage1 ~ # systemctl cat sanoid.service 
# /lib/systemd/system/sanoid.service
[Unit]
Description=Snapshot ZFS filesystems
Documentation=man:sanoid(8)
Requires=local-fs.target
After=local-fs.target
Before=sanoid-prune.service
Wants=sanoid-prune.service
ConditionFileNotEmpty=/etc/sanoid/sanoid.conf

[Service]
Type=oneshot
Environment=TZ=UTC
ExecStart=/usr/sbin/sanoid --take-snapshots --verbose

Included in the Debian package


Job scheduling

systemd timer

storage1 ~ # systemctl cat sanoid.timer
# /lib/systemd/system/sanoid.timer
[Unit]
Description=Run Sanoid Every 15 Minutes

[Timer]
OnCalendar=*:0/15
Persistent=true

[Install]
WantedBy=timers.target
storage1 ~ # systemctl list-timers sanoid.timer --all
NEXT                        LEFT       LAST                        PASSED       UNIT         ACTIVATES     
Tue 2024-02-27 09:00:00 CET 11min left Tue 2024-02-27 08:45:01 CET 3min 20s ago sanoid.timer sanoid.service

1 timers listed.

Included in the Debian package


Snapshot replication


Usage

/usr/sbin/syncoid                       \
    storage/julien                      \
    zfs@REMOTE_STORAGE:storage/julien   \
    --no-sync-snap                      \
    --source-bwlimit=512k 

Added to /opt/syncoid.sh script


Job definition

systemd service

storage1 ~ # systemctl cat syncoid.service 
# /etc/systemd/system/syncoid.service
[Unit]
Description=Send ZFS snapshots created by Sanoid
Requires=zfs.target
After=zfs.target

[Service]
Type=oneshot
User=zfs
ExecStart=-/opt/syncoid.sh

[Install]
WantedBy=multi-user.target

Job scheduling

systemd timer

storage1 ~ # systemctl cat syncoid.timer 
# /etc/systemd/system/syncoid.timer
[Unit]
Description=Run Syncoid every night

[Timer]
OnCalendar=*-*-* 00,04:30:00 UTC
AccuracySec=1us
RandomizedDelaySec=2h30

[Install]
WantedBy=timers.target

Client replication

Client replication


Replication overview

The most important rule is to forbid writes to the same dataset in two different locations at the same time.
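One way to enforce that rule, sketched here as an idea rather than my exact setup, is to mark the dataset read-only on the receiving side; zfs recv still applies incoming snapshots, since the property only guards the POSIX layer:

```
zfs set readonly=on storage/julien    # on the replica: local writes are refused
zfs get readonly storage/julien       # verify the property
```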


Health

storage1 ~ # sanoid --monitor-snapshots
OK: all monitored datasets (storage/dad, storage/julien) have fresh snapshots
storage1 ~ # sanoid --monitor-health
OK ZPOOL storage : ONLINE {Size:5.44T Free:2.55T Cap:53%} 

Alerting


Nagios

Nagios logo


Welcome to pilote!

Raspberry Pi

“Pilote” is the French word for “driver”. This host is responsible for multiple tasks, including monitoring, running automation playbooks, and scheduling backups.


Components


Nagios components


Host

/etc/nagios4/conf.d/hosts.cfg

define host {
    use         home-host
    host_name   storage1
    alias       storage1
    address     169.254.0.1
}

Hostgroups

/etc/nagios4/conf.d/hostgroups.cfg

define hostgroup {
    hostgroup_name  storage-servers
    alias           Storage servers
    members         storage1,storage2,storage3
}

use refers to a template that defines notifications and check intervals.


Services commands

NRPE - Agent and Plugin Explained


Services states


Service configuration

define service {
    use                 home-service
    hostgroup_name      storage-servers 
    service_description zfs_snapshots
    check_command       check_nrpe!check_zfs_snapshots
}

NRPE agent

/etc/nagios/nrpe_local.cfg

command[check_zfs_snapshots]=/usr/bin/sudo /usr/sbin/sanoid --monitor-snapshots
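The sudo call implies a matching sudoers entry; a minimal sketch, assuming the NRPE agent runs as the nagios user:

```
# /etc/sudoers.d/nrpe -- allow only sanoid's monitor commands, nothing else
nagios ALL=(root) NOPASSWD: /usr/sbin/sanoid --monitor-snapshots, /usr/sbin/sanoid --monitor-health
```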

Notifications

Send Nagios notifications to a Telegram Messenger channel.

notify-by-telegram


PROBLEM notification


RECOVERY notification


Web UI


External access

Nagios external access



Observability


TIG stack


The plugin-driven server agent for collecting & reporting metrics.

https://github.com/influxdata/telegraf

Telegraf logo


Inputs

[[inputs.cpu]]
  percpu = false
  totalcpu = true
  collect_cpu_time = false
  report_active = false

[[inputs.diskio]]
  devices = ['sda', 'sdb', 'sdc', 'sdd']

Input filters: apcupsd, cpu, disk, diskio, exec, kernel, mem, mqtt_consumer, net, ping, processes, sensors, smart, swap, system, zfs
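The exec input in that list is the generic escape hatch for custom collectors; one way it could be wired up (the script path is a hypothetical placeholder):

```toml
[[inputs.exec]]
  commands = ["/usr/local/bin/read-sensors"]  # hypothetical script emitting line protocol
  data_format = "influx"
  interval = "30s"
```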


Outputs

[[outputs.influxdb]]
  urls = ["https://x.x.x.x:8088"]
  database = "metrics"
  skip_database_creation = true
  username = "telegraf"
  password = "****"
  insecure_skip_verify = true
  content_encoding = "gzip"

Scalable datastore for metrics, events and real-time analytics

https://github.com/influxdata/influxdb

InfluxDB logo


The open-source platform for monitoring and observability

https://github.com/grafana/grafana

Compared to Graylog and Kibana, Grafana is my favorite interface: elegant, fast, and, to me, the future.

Grafana logo


Grafana dashboard


Overview

Database icon created by Freepik - Flaticon


Notes


Sensors


Hardware

Arduino logo


Arduino circuit


Software


Sketch

Definitions

#include <DHT.h>

#define KYPIN A0  // analog pin where KY-037 sensor is connected
#define DHTPIN 2  // digital pin where DHT22 sensor is connected

DHT dht(DHTPIN, DHT22); // initialize DHT22 object

float h;  // humidity
float t;  // temperature
int s;    // sound

Setup

void setup()
{
    Serial.begin(9600);
    dht.begin();
}

Main loop (1/2)

void loop()
{
    // sensors need some time to produce valid values
    delay(2000);

    // read values from sensors
    h = dht.readHumidity();
    t = dht.readTemperature();
    s = analogRead(KYPIN);

Main loop (2/2)

    // print "<humidity>,<temperature>,<sound>" (CSV-like)
    // analogRead() always returns a valid int, so only the DHT reads can be NaN
    if (!isnan(h) && !isnan(t)) {
        Serial.print(h);
        Serial.print(",");
        Serial.print(t);
        Serial.print(",");
        Serial.println(s);
    }
}
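On the receiving host, the CSV line printed by the sketch splits cleanly with POSIX parameter expansion; a minimal sketch using a hard-coded example reading instead of an actual serial port:

```shell
# Split "<humidity>,<temperature>,<sound>" as printed by the sketch
line="45.20,21.50,312"      # example reading; in practice, read from /dev/ttyACM0
h=${line%%,*}               # humidity: everything before the first comma
rest=${line#*,}
t=${rest%%,*}               # temperature: between the two commas
s=${rest#*,}                # sound: everything after the second comma
echo "humidity=$h% temperature=${t}C sound=$s"
```

From there, the values can be forwarded to the metrics pipeline in whatever format it expects.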

Multiplexing



How is the temperature?

Ambient temperature


Humidity

Ambient humidity


Noise

Ambient noise

Link to the confused gif


Power consumption

How much will it cost?


Average monthly electricity wholesale price in Belgium from January 2019 to January 2024 (in euros per megawatt-hour)

Electricity chart

© Statista 2024


Household electricity prices worldwide in June 2023, by select country (in U.S. dollars per kilowatt-hour)

Electricity chart per household

© Statista 2024


Wattmeter

Wattmeter


Uninterruptible power supply (UPS)

APC Back-UPS Pro 550

Yearly cost

Power consumption dashboard

$7/y


In real life


storage1


storage2


storage3


Automation


Failures happen

MicroSD cards with I/O errors

Flood or fire in the house

Link to The Escalated Quickly meme


Deployments

  1. Install the operating system
  2. Install and configure software
  3. Restore data (optional)

Installing and configuring software is the most time- and effort-consuming task, and it is the one that can be automated.


Ansible

Ansible is a radically simple IT automation system.

https://github.com/ansible/ansible


Concepts


Inventory

inventory/hosts file

[all]
vps ansible_host=xxx.xxx.xxx.xxx
pilote ansible_host=xxx.xxx.xxx.xxx
metrics ansible_host=xxx.xxx.xxx.xxx
storage1 ansible_host=xxx.xxx.xxx.xxx
storage2 ansible_host=xxx.xxx.xxx.xxx
storage3 ansible_host=xxx.xxx.xxx.xxx

[storage]
storage1 ansible_host=xxx.xxx.xxx.xxx
storage2 ansible_host=xxx.xxx.xxx.xxx
storage3 ansible_host=xxx.xxx.xxx.xxx

Playbook overview

site.yml

- import_playbook: common.yml
- import_playbook: storage.yml
- import_playbook: ...

common.yml

- hosts: all
  roles:
    - common

storage.yml

- hosts: storage
  roles:
    - zfs
    - openvpn
    - sanoid
    - ...

- hosts: storage1
  roles:
    - nfs

Role example

roles/sanoid/
├── defaults
│   └── main.yml
├── handlers
│   └── main.yml
├── tasks
│   └── main.yml
└── templates
    ├── sanoid.conf.j2
    ├── syncoid.service.j2
    ├── syncoid.sh.j2
    └── syncoid.timer.j2

Module examples


Template example

Task

- name: Deploy Syncoid script
  ansible.builtin.template:
    src: syncoid.sh.j2
    dest: /opt/syncoid.sh
    owner: zfs
    group: root
    mode: "0750"

Template using Jinja2

#!/bin/bash
{{ ansible_managed | comment }}

{% for dataset in sanoid_main_datasets %}
{% for destination in syncoid_destinations %}
echo "Sending {{ dataset }} to {{ destination }}"
/usr/sbin/syncoid {{ dataset }} {{ syncoid_user }}@{{ destination }}:{{ dataset }} \
    --no-sync-snap \
    {% if syncoid_source_bwlimit %}--source-bwlimit={{ syncoid_source_bwlimit }} {% endif %}
{% endfor %}
{% endfor %}

Result on the managed host

#!/bin/bash
#
# Ansible managed
#

echo "Sending storage/julien to xxx.xxx.xxx.xxx"
/usr/sbin/syncoid storage/julien xxx@xxx.xxx.xxx.xxx:storage/julien \
    --no-sync-snap \
    --source-bwlimit=512k 
echo "Sending storage/dad to xxx.xxx.xxx.xxx"
/usr/sbin/syncoid storage/dad xxx@xxx.xxx.xxx.xxx:storage/dad \
    --no-sync-snap \
    --source-bwlimit=512k

Upgrades

upgrade.yml

- name: Upgrade systems
  hosts: all
  tasks:
    - include_tasks: tasks/apt-upgrade.yml

tasks/apt-upgrade.yml

- name: Run apt upgrade
  ansible.builtin.apt:
    update_cache: true
    upgrade: dist

CLI

ansible-playbook site.yml
ansible-playbook upgrade.yml

What’s next?


Takeaways


Thank you

🙏


Questions