Automating Internal Databases Operations at OVHcloud with Ansible
CfgMgmtCamp 2024
Julien RIOU
February 6, 2024
Speaker
Summary
Who are we?
Managed infrastructure
Management tools
Ansible code base
Real-world examples
Implementation
Development
What’s next?
Who are we?
Major cloud provider in Europe
Datacenters worldwide
Baremetal servers, public & private cloud, managed services
Managed infrastructure
3 DBMS (MySQL, MongoDB, PostgreSQL)
7 autonomous infrastructures worldwide
500+ servers
2000+ databases
100+ clusters
Highly secure environments
Cluster example
Shared (mutualized) environment
Infrastructure as Code
Manage infrastructure lifecycle
Create, replace, destroy
Scale up, down
Providers: OVH, vSphere, phpipam, AWS
Use standard providers first
Configuration management
Manage operating system security hardening
Install and configure packages (including DBMS)
Agent is run manually on internal databases
One-shot operations
Requests from users
Maintenance
Orchestration of multiple tasks
Acting on external resources
Operation examples
Bootstrap clusters
Create/move/delete databases, users, permissions
Test/apply schema migrations
Minor/major upgrades
Reboot and decrypt servers, clusters
Daily restores
Automation
Reduce human errors
Free human time and energy
Focus on what’s important
Code base
Architecture of a playbook
Playbook
  Play
    include task
    include task
  Play
Reusable tasks
No role, only tasks
Located under tasks directory
One task = one module
Tasks can be included by one or more playbooks
Naming convention is scope-action.yml
Idempotence
TDD for Task-Driven Development (joke)
Sometimes, more than one module is used in a task
We try to keep tasks (very) small
Sometimes, modules are directly used in playbooks
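A minimal sketch of the layout described above. The file names `tasks/sqlmigrate-directories.yml` and `schema-migrate.yml` are hypothetical and only illustrate the scope-action.yml convention; they are not taken from the actual code base.

```yaml
# tasks/sqlmigrate-directories.yml (hypothetical file, named scope-action.yml)
# One small task wrapping one module; re-running it changes nothing (idempotent).
- name: create sql-migrate directories
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
  loop:
    - /etc/sqlmigrate
    - /var/lib/sqlmigrate
---
# schema-migrate.yml (hypothetical playbook at the repository root)
- name: prepare sql-migrate on the primary
  hosts: "{{ database_name }}:&subrole_primary"
  tasks:
    # Any playbook can include the same task file.
    - name: include the reusable task
      ansible.builtin.include_tasks: tasks/sqlmigrate-directories.yml
```

Keeping each file this small is what lets several playbooks share it without pulling in unrelated steps.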
Real-world examples
Schema migrations
Database creation
Minor upgrades
Database migrations
Schema migrations
Applications evolve all the time
Database schemas too
Reviewed and applied by DBAs
Schema migrations
sql-migrate
-- +migrate Up
create table author (
    id bigserial primary key,
    name text not null
);
create table talk (
    id bigserial primary key,
    title text not null,
    author_id bigint not null references author(id)
);
-- +migrate Down
drop table author, talk;
Schema migrations
Move forward with sql-migrate up
Rollback with sql-migrate down
Playbook overview
- name: check arguments
  hosts: all
  run_once: true
  delegate_to: localhost
  tasks:
    - name: check variable schema_url # fail fast
    - name: check variable database_name # fail fast
- name: update database to the latest schema migration
  hosts: "{{ database_name }}:&subrole_primary"
  tasks:
    - name: create sql-migrate directories
    - name: create sql-migrate configuration file
    - name: clone schema
    - name: run migrations
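The "fail fast" checks are only named in the overview. One possible way to write them, assuming the `ansible.builtin.assert` module is used (the original implementation is not shown), is the following task list for the first play:

```yaml
# Sketch of the argument checks; the messages are illustrative.
- name: check variable schema_url
  ansible.builtin.assert:
    that:
      - schema_url is defined
      - schema_url | length > 0
    fail_msg: "schema_url is required (URL of the schema git repository)"

- name: check variable database_name
  ansible.builtin.assert:
    that:
      - database_name is defined
      - database_name | length > 0
    fail_msg: "database_name is required"
```

Failing in the first play stops the run before any database host is touched.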
Playbook tasks
- name: create sql-migrate directories
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
  loop:
    - /etc/sqlmigrate
    - /var/lib/sqlmigrate
- name: create sql-migrate configuration file
  ansible.builtin.template:
    src: sqlmigrate/database.yml.j2
    dest: "/etc/sqlmigrate/{{ database_name }}.yml"
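The template itself is not shown in the slides. A hypothetical `sqlmigrate/database.yml.j2` could look like the sketch below, assuming PostgreSQL, a local socket connection, and migration files at the root of the cloned repository. The environment is named `development` because that is the one sql-migrate uses when no `-env` option is passed.

```yaml
# sqlmigrate/database.yml.j2 (hypothetical content, not the original template)
development:
  dialect: postgres
  # Connection string; host and sslmode are placeholder values.
  datasource: "dbname={{ database_name }} host=/var/run/postgresql sslmode=disable"
  # Directory holding the *.sql migration files (the cloned schema repository).
  dir: "/var/lib/sqlmigrate/{{ database_name }}"
  # Table in which sql-migrate records applied migrations.
  table: migrations
```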
Playbook tasks
- name: clone schema repository
  ansible.builtin.git:
    repo: "{{ schema_url }}"
    dest: "/var/lib/sqlmigrate/{{ database_name }}"
    version: "{{ branch|default('master') }}" # branch or tag
    force: true
  environment:
    TMPDIR: /run
- name: run migrations
  ansible.builtin.command:
    cmd: sql-migrate up -config /etc/sqlmigrate/{{ database_name }}.yml
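Testing a migration before applying it, and rolling it back afterwards, can reuse the same configuration file. The sketch below assumes the `-dryrun` option of sql-migrate and a hypothetical `rollback` variable; neither task appears in the original playbook.

```yaml
- name: preview pending migrations (dry run)
  ansible.builtin.command:
    cmd: sql-migrate up -dryrun -config /etc/sqlmigrate/{{ database_name }}.yml
  register: sqlmigrate_preview
  changed_when: false # a dry run does not modify the database

- name: show what would be applied
  ansible.builtin.debug:
    var: sqlmigrate_preview.stdout_lines

- name: roll back with sql-migrate down
  ansible.builtin.command:
    cmd: sql-migrate down -config /etc/sqlmigrate/{{ database_name }}.yml
  when: rollback | default(false) | bool # hypothetical switch, off by default
```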
Database creation
Just run CREATE DATABASE.
Database creation
Check arguments
Select an available cluster
Create git repository
Run CREATE DATABASE (using a module)
Create secrets
Create roles and users (for applications, humans)
Link the database to the git repository
Run schema migrations
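Two of these steps, sketched for PostgreSQL only with the community.postgresql collection. The role naming scheme and the `app_password` variable are assumptions; cluster selection, git repository and secret handling are left out.

```yaml
- name: run CREATE DATABASE (using a module)
  community.postgresql.postgresql_db:
    name: "{{ database_name }}"
    state: present
  become: true
  become_user: postgres

- name: create the application role
  community.postgresql.postgresql_user:
    name: "{{ database_name }}_app" # hypothetical naming scheme
    password: "{{ app_password }}" # hypothetical variable, e.g. read from a secret store
    state: present
  become: true
  become_user: postgres
  no_log: true # keep the password out of the job output
```

Both modules are idempotent, so the playbook can be re-run safely if a later step fails.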
Minor upgrades
Ensure software is up to date
Minor upgrades
Upgrade packages (DBMS, system)
Reboot (if needed)
Restart DBMS (if needed)
Order by role criticality
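A possible shape for this ordering, assuming Debian-like hosts, a hypothetical `cluster` variable, a hypothetical `subrole_replica` group and a hypothetical task file `tasks/system-upgrade.yml`. Less critical nodes go first, one host at a time, and the primary goes last. Restarting the DBMS is omitted.

```yaml
# tasks/system-upgrade.yml (hypothetical file)
- name: upgrade packages (DBMS, system)
  ansible.builtin.apt:
    update_cache: true
    upgrade: safe

- name: check whether a reboot is required
  ansible.builtin.stat:
    path: /var/run/reboot-required # flag file on Debian and Ubuntu
  register: reboot_required

- name: reboot (if needed)
  ansible.builtin.reboot:
  when: reboot_required.stat.exists
---
# minor-upgrade.yml (hypothetical playbook)
- name: minor upgrade, replicas first
  hosts: "{{ cluster }}:&subrole_replica"
  serial: 1 # one host at a time
  become: true
  tasks:
    - name: upgrade the host
      ansible.builtin.include_tasks: tasks/system-upgrade.yml

- name: minor upgrade, primary last
  hosts: "{{ cluster }}:&subrole_primary"
  become: true
  tasks:
    - name: upgrade the host
      ansible.builtin.include_tasks: tasks/system-upgrade.yml
```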
Database migration
Cluster is about to reach maximum capacity
Colocate or spread logical divisions
Isolate noisy neighbours
Major upgrades
Database migration
Move one or more databases from one cluster to another
Set up logical replication
Promote
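For PostgreSQL, the replication step could be sketched with the publication and subscription modules of community.postgresql. Everything named here (`source_cluster`, `destination_cluster`, `source_host`, the replication user and its password variable) is hypothetical, and the original playbooks may work differently. Logical replication does not copy the schema, so the destination database is assumed to exist with its tables already created.

```yaml
- name: publish the database on the source cluster
  hosts: "{{ source_cluster }}:&subrole_primary"
  become: true
  become_user: postgres
  tasks:
    - name: create a publication (all tables when no table list is given)
      community.postgresql.postgresql_publication:
        name: "migration_{{ database_name }}"
        login_db: "{{ database_name }}"
        state: present

- name: subscribe from the destination cluster
  hosts: "{{ destination_cluster }}:&subrole_primary"
  become: true
  become_user: postgres
  tasks:
    - name: create the subscription
      community.postgresql.postgresql_subscription:
        name: "migration_{{ database_name }}"
        login_db: "{{ database_name }}"
        publications:
          - "migration_{{ database_name }}"
        connparams:
          host: "{{ source_host }}"
          dbname: "{{ database_name }}"
          user: replication_user # hypothetical role
          password: "{{ replication_password }}"
        state: present
      no_log: true # connection parameters include a password
```

The promote step (switching applications to the destination once it has caught up) is not covered by this sketch.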
Database migration
Moved out of a datacenter last year with this method
400+ databases
16.78 TiB
Under 30 minutes of downtime for the datacenter move
Big focus on playbook execution time
Thanks to Ansible
External collections
community.general
community.mysql
community.mongodb
community.postgresql
Internal collections
ovhcloud.internal
ovhcloud.mysqlsh
ovhcloud.patronictl
ovhcloud.sqlmigrate
Implementation
How we use Ansible
Secure Shell (SSH)
How can we securely connect to remote hosts to perform actions?
The Bastion
The Bastion is an SSH gateway
Secure environments include PCI DSS and SecNumCloud
Ansible + The Bastion
“Ansible Wrapper”
[ssh_connection]
pipelining = True
private_key_file = ~/.ssh/id_ed25519
ssh_executable = /usr/share/ansible/plugins/bastion/sshwrapper.py
sftp_executable = /usr/share/ansible/plugins/bastion/sftpbastion.sh
transfer_method = sftp
retries = 3
https://github.com/ovh/the-bastion-ansible-wrapper
Inventory
Where can we find our hosts to perform operations?
Consul
Consul
Nodes
name, IP address, meta(data)
Services
Access control list (ACL) with tokens
Encryption
Static configuration
Node meta
server_type
postgresql, mysql, filer, …
role
cluster identifier
Dynamic configuration
Node “subrole”
Database services
Where is my database?
How to use the inventory?
With a limit option
ansible server_type_postgresql -m ping
ansible-playbook -l server_type_postgresql playbook.yml
Group combination
& for intersection (AND)
: for multiple groups (OR)
! for exclusion (NOT)
ansible-playbook -l 'test:&subrole_primary' playbook.yml
ansible-playbook -l 'server_type_postgresql:server_type_mysql' playbook.yml
ansible-playbook -l 'server_type_postgresql:!cluster_99' playbook.yml
Execution environments
Where does Ansible run?
Admin server
Virtual machine
Access via SSH
Shared environment
No API
AWX
Ansible orchestration
Running on Kubernetes
Personal accounts (via SSO/SAML)
REST API, web interface, CLI
Notifications (alerting, chat)
https://github.com/ansible/awx
Concepts
Organization, projects, teams, users, privileges
Inventory source
Source Control (Git) and Machine (SSH) credentials
Job templates
Scheduled jobs
Notification templates
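These concepts can themselves be declared with Ansible through the awx.awx collection. The sketch below ties a job template to a daily schedule, as one might for the daily restores mentioned earlier. Every object name (organization, project, inventory, credential, playbook) is hypothetical, and the connection to the AWX API is assumed to come from the collection's environment variables or configuration file.

```yaml
- name: declare a scheduled job in AWX
  hosts: localhost
  gather_facts: false
  tasks:
    - name: create the job template
      awx.awx.job_template:
        name: daily-restore # hypothetical names throughout
        organization: databases
        project: ansible-databases # Git project holding the playbooks
        inventory: consul # inventory fed by an inventory source
        playbook: restore.yml
        credentials:
          - bastion-ssh # Machine (SSH) credential
        state: present

    - name: run it every day
      awx.awx.schedule:
        name: daily-restore-schedule
        unified_job_template: daily-restore
        rrule: "DTSTART:20240206T030000Z RRULE:FREQ=DAILY;INTERVAL=1"
        state: present
```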