Baptiste Azéma
I’m just a simple man, trying to make my way in the universe.
👨💻 Professional Experiences
Sifflet - Team Lead | 2022 > now
Building a data observability SaaS platform for customers worldwide.
Internal expert responsible for architecting, designing and developing a sub-system of Sifflet.
Work closely with the Product team to create collaborative and realistic roadmaps. Daily communication and decisions with Product Managers.
Lead a team of 4 software engineers to deliver the company roadmap.
Java, Python
Temporal.io, k8s, Docker, MySQL, AWS, GCP
Gitlab, Linear, Notion, ArgoCD, Sentry, Grafana
Shadow - Data Engineer / DataOps | 2020 > 2021
Tech Lead
Allow management and departments to make the right decisions based on data. Democratise data access across the entire company. Design and development of multiple products in order to deliver meaningful insights and forecasts.
Worldwide data platform. Collect, structure and serve real-time data for the company. Processing millions of events per second. Worldwide Clickhouse cluster, administration of 10 Kafka clusters, APIs receiving data, scrapers.
Kafka, Clickhouse, InfluxDB, Faust, FastAPI, Kubernetes
Metabase, Grafana
Python, SQL, Docker, Ansible, Gitlab, GCP
Renault Digital - Data Engineer | 2018 > 2020
Datamart and AI - Tech Lead | 2 years
Creation of a datamart.
- Design and build a scalable environment for ML models, for both training and serving predictions.
- Four machine learning models deployed to production.
- Web application deployed on Kubernetes.
- Create and maintain data pipelines to extract, transform and load data from various data sources.
- Automation of CI/CD pipelines.
- Infrastructure as Code with Terraform.
4 data engineers, 4 data scientists
Google Cloud, BigQuery, Composer, PubSub, Scikit-learn, Spark, Kubernetes, VueJS, Flask
Gitlab, GCR, Docker, Terraform, Python, SQL - Scrum
Data replication on-premise to GCP | 3 months
Daily replication of 50 Hive databases (Hadoop) to BigQuery (GCP) — 2 developers
Hadoop, GCP, Spark, Oozie, Composer, Gitlab, Terraform, Python
Sopra Steria - Data Engineer | 2016 > 2018
Air France | 1 year 2 months
Near real-time platform providing advanced analytics and machine learning predictions to help front-line staff improve operational performance. Deployed and maintained 2 ML models in production providing real-time predictions - 4 data engineers, 2 data scientists, 1 Android developer
SparkStreaming, SparkML, Kafka, MongoDB, HortonWorks
BitBucket, Java - Kanban
Airbus | 6 months
MapReduce application used to process and analyse flight test data (high volume) — 4 developers
MapReduce, Hadoop, HBase, Apache Phoenix, HortonWorks, Java, C# - Scrum
EDF | 2 months
API-oriented application providing IAM policies to clients of EDF's professional website - 4 developers
OrientDB, HBase, Spring Boot, HortonWorks, Jenkins, Cucumber, Java - SAFe
Misc
- Apache Spark trainer - 2-day training, delivered 4 times to external and internal audiences
- AI/sentiment analysis - collect and analyse social media data during European Championship 2016 - deployed to IBM Cloud Bluemix
- 2 web applications based on the MEAN stack - MongoDB ExpressJS AngularJS NodeJS - deployed to IBM Cloud Bluemix
🎓 Education & Certifications
2021 | Udacity - Machine Learning Engineer Nanodegree
Machine Learning Engineer Nanodegree — CREDENTIALS
Learn advanced machine learning techniques and algorithms and how to package and deploy your models to a production environment.
2019 | Google Cloud Platform (GCP)
Certified Professional Data Engineer on Google Cloud Platform — CREDENTIALS
The Professional Data Engineer exam assesses your ability to: Design data processing systems, build and operationalize data processing systems, operationalize machine learning models, ensure solution quality.
2011 > 2016 | Engineering Degree
ICAM - Institut Catholique des Arts et Métiers - Toulouse, France
General engineering training covering technical, scientific and personal development.
🧘🏻♂️ Personal projects
- Bazema_pokemon - Classify images as Pokémon - Python, PyTorch, transfer learning ResNet-50, CNN, Image classification
- Topper - Parse and process log files of music listening - Python, OOP, PyPI, GitHub Actions
- Tagging - Web application for music tagging purpose - VueJS, Flask, SQLite, Docker
- Tagger - Music genre classifier using Keras
- Domination - Real-time data processing - Kafka, Faust, Python, Clickhouse, Docker
- Bazema linker - Parse and process csv files - Python, Pandas, SQL
🌴 What else..?
Triathlon enthusiast 🏊♂️ / 🚴 / 🏃
Finisher of the Trialong at Bois-le-Roi 2023 | L Format: 1.9km swim / 85km bike / 21km run — RESULTS
Finisher of the Paris Triathlon 2022 | Olympic format: 1.5km swim / 40km bike / 10km run
Finisher of the Paris Triathlon 2019 | Olympic format: 1.5km swim / 40km bike / 10km run — RESULTS
2014 | Volunteering - backpacking
Four months of independent travel in Western Canada (Vancouver, Calgary): volunteering on an organic farm (WWOOFing), playing with bison, backpacking
📨 Contact
baptiste[at]azema[dot]tech