Handling alerts at OVH-Scale with Apache Flink

Abstract

OVH relies heavily on metrics to monitor its entire infrastructure. By providing both a low-level and a business-level view, these metrics help teams run our services day to day. After managing more than 300 TB of telemetry, we started working on an alerting solution on top of this huge data lake, and we chose Apache Flink to handle alerting at this scale. Today, this project manages the alerting of flagship OVH products such as Public Cloud Instances and Kubernetes.

This talk is an experience report that will cover:

  • What is Apache Flink?
  • How to develop a Flink job from scratch
  • Deploying and operating a Flink cluster

Occurrences

  • FinistDevs, 2019
  • Devoxx France, 2019

Resources

Photos and tweets