Vacancy

Azure Site Reliability Engineer

Brussels

We are seeking an experienced Azure Site Reliability Engineer to join our Engineering chapter team. You will play a critical role in ensuring the reliability, scalability, monitoring, and performance of our cloud-based services in the Consumer Centricity product organization. Your responsibilities will include designing and managing our infrastructure and implementing best practices. You will work within cross-functional teams to improve systems and processes and to ensure uptime and efficiency.

Responsibilities

  • Automation and CI/CD: Design, create, and maintain automation frameworks for deploying, scaling, and managing production environments.

  • System Monitoring and Maintenance: Implement and manage monitoring tools to ensure system health and performance. Proactively identify and fix issues before they impact users.

  • Incident Management: Respond to and resolve incidents in a timely manner, perform root cause analysis, and implement measures to prevent recurrence.

  • Performance Optimization: Analyze system performance and implement improvements to ensure scalability and efficiency.

  • Capacity Planning: Conduct capacity planning assessments to predict system needs and ensure resources are in place to handle growth.

  • Collaboration: Work closely with development teams to integrate system reliability into the development lifecycle through continuous integration and deployment practices.

  • Documentation: Create and maintain comprehensive documentation related to systems architecture, configuration, and operational procedures.

  • Tool Development: Develop and maintain internal tools to streamline processes and improve system reliability.

  • Security: Ensure that security controls are implemented, monitored, and maintained across all systems.

  • Service Level Objectives (SLOs): Define and track SLOs to ensure reliability metrics meet business requirements.

  • On-call Support: Participate in on-call rotations to provide 24/7 support for critical systems and infrastructure.

Ideal Profile

  • Experience: Minimum of 5 years in a Site Reliability Engineer or DevOps role with extensive experience in Microsoft Azure.

  • Proficiency in scripting with Python, PowerShell, and the Azure CLI.

  • Experience with containerization technologies (Docker, Kubernetes).

  • Proficiency in Azure Cloud services (VMs, Storage, Networking, etc.).

  • Experience with Infrastructure as Code (IaC) tools such as Terraform, ARM templates, or Bicep to automate secure provisioning and configuration of Azure resources.

  • Strong experience with monitoring, logging, and alerting tools such as Azure Monitor, Application Insights, or Log Analytics, and with third-party solutions like Grafana, Splunk, or Elastic Stack.

  • Strong understanding of cloud networking, hybrid cloud, and virtual networking concepts (e.g., VPNs, subnets, NSGs, load balancing, hub-and-spoke topologies).

  • Experience in Azure governance and cost management using Azure Cost Management, Azure Policies, and management groups.

Preferred

  • Microsoft Azure certifications, such as Azure Solutions Architect Expert or Azure DevOps Engineer Expert.

  • Experience with the following technologies: Kong, Event Hubs, Dapr.

  • Additional languages: French (B1), Dutch (B1).

Soft Skills

  • Excellent problem-solving and analytical abilities.

  • Strong communication and collaboration skills.

  • Ability to work in a fast-paced environment and manage multiple priorities.

  • Languages: English (C1).