The System Engineer provides advanced operational support for Linux-based IT infrastructure, with a strong focus on system monitoring, observability, and reliability. They ensure platform stability, performance, and availability through proactive monitoring, alerting, and continuous analysis of system health.
The engineer is responsible for timely resolution of incidents and service requests, leveraging monitoring data and diagnostics to identify and address OS-level issues. They perform routine maintenance and implement system changes in line with established processes and standards, ensuring minimal service disruption.
A key aspect of the role is the design, implementation, and continuous improvement of monitoring solutions, including metrics collection, logging, alerting, and dashboards. The engineer uses these capabilities to detect anomalies early and improve overall service quality.
The engineer contributes to problem management by performing root cause analysis based on monitoring data, logs, and system behavior, and by proposing and implementing long-term corrective actions. They collaborate with cross-functional teams to ensure systems are secure, compliant, and effectively monitored end-to-end.
In addition to operational duties, the engineer participates in R&D activities, evaluating new technologies and observability tooling, and contributing to solution prototypes. They drive automation through scripting and Infrastructure-as-Code to enhance monitoring coverage, reduce manual intervention, and increase platform reliability.
Ideal Profile
You speak, read, and write English and either French or Dutch fluently.
At least 4 years of experience in IT operations, managing containerized, virtualized, and/or physical Linux-based infrastructure in large-scale environments (enterprise, governmental, or supranational).
Monitoring & Observability
Proven experience in the design, implementation, and operation of monitoring and alerting solutions
Hands-on experience with PRTG is strongly preferred
Solid understanding of:
Metrics collection, log aggregation, and alerting strategies
Event correlation, anomaly detection, and performance baselining
Incident detection and reduction of MTTD/MTTR
Linux Platform Engineering
End-to-end lifecycle management of Red Hat Enterprise Linux (or equivalent) environments
Experience with enterprise tooling, including:
Red Hat Satellite 6.x (provisioning, patching, lifecycle management)
System performance tuning, troubleshooting, and OS-level diagnostics
Automation & Infrastructure as Code
Strong experience with Ansible Automation Platform (playbooks, roles, automation workflows)
Familiarity with Infrastructure-as-Code principles and pipeline integration
Experience integrating automation with monitoring and operational workflows
CI/CD & Version Control
Practical experience with Git-based workflows (GitLab, Azure DevOps, or similar)
Understanding of CI/CD pipelines for infrastructure and configuration deployment
Container Technologies
Experience with the Red Hat container ecosystem, including:
Podman / OCI containers
Understanding of container monitoring and lifecycle operations
IT Service Management (ITSM)
Experience working within structured ITIL-aligned processes:
Incident, problem, and change management
Service request handling
Strong focus on monitoring-driven operations, diagnostics, and forensics
Languages
Dutch or French: Full professional / native proficiency
English: Professional working proficiency
Preferred Experience and Skills
Infrastructure & Platform Ecosystem
Exposure to hardware lifecycle management (servers, firmware updates, lifecycle planning)
Experience with VMware-based virtual environments
Familiarity with NetApp storage systems
Security & Detection
Experience with endpoint detection and protection tools:
Trend Micro Deep Security, ClamAV, or equivalent
Understanding of integrating security signals into monitoring/alerting pipelines
Network & Platform Integration
Basic operational knowledge of network/security platforms:
Palo Alto, Check Point, F5, Infoblox
Ability to correlate infrastructure and network events for incident analysis
Scripting & Automation
Solid scripting skills in Bash and/or Python
Ability to develop automation for monitoring, remediation, and reporting
Vendor & Lifecycle Management
Experience with vendor coordination, support processes, and licensing management