Real-Time Feeding of a Data Lake with PostgreSQL and Debezium

PostgreSQL Users Group Belgium

Julien RIOU

February 13, 2024

Speaker

Summary

  • Who are we?
  • Internal databases
  • Data Lake
  • ETL
  • CDC
  • Other uses
  • The future

Who are we?

[Image: OVHcloud logo and OVHcloud universes]

Internal databases

[Image: OVHcloud universes pointing to PostgreSQL]

Statistics

  • 3 DBMS (MySQL, MongoDB, PostgreSQL)
  • 7 autonomous infrastructures worldwide
  • 500+ servers
  • 2000+ databases
  • 100+ clusters
  • Highly secure environments

Cluster example

[Image: cluster example]

Shared environments

[Image: shared databases]

Analytics needs

  • Billing
  • Revenue
  • Enterprise strategy
  • KPIs
  • Fraud detection
  • Power consumption
  • Metadata analysis (from JIRA)
  • Detection of support teams' working hours

Mix of workloads