IT Service Performance and Reliability Manager
We're seeking an IT Service Performance and Reliability Manager to take ownership of performance, capacity, and resilience across critical IT services. This role focuses on keeping customer-facing services fast, reliable, and fully observable, while driving continuous improvement.

You will lead observability across services, ensuring effective monitoring and actionable insights. You'll manage capacity and performance through forecasting and trend analysis, identifying risks early and driving improvements. You'll ensure resilience and availability are built into services from the outset, while supporting continuity planning and risk management. Working closely with technical teams and stakeholders, you'll help resolve issues and deliver ongoing service improvements.

Key Requirements
- Experience managing capacity and performance in IT environments
- Hands-on experience with AWS and Azure
- Strong knowledge of ITIL v3/v4 (certification required)
- Experience with monitoring/observability tools (e.g. Zabbix, Grafana, Kibana, OpenSearch)
- Knowledge of Windows and Linux server environments
- Scripting skills (e.g. Python, PowerShell, Node.js)
- Experience integrating data via APIs, webhooks, or messaging
- Strong analytical, problem-solving, and stakeholder management skills

Desirable:
- DevOps exposure
- Knowledge of network infrastructure and communications protocols
- Experience with social alarm platforms

If you're looking for a role where you can make a tangible impact on service performance and resilience, we ...