Description
We are looking for an Azure Monitoring SME (m/f/d) for our customer in the energy sector.
Start: ASAP
Duration: 5 months+
Capacity: Full-time
Location: Remote
Responsibilities:
– Design and implement an end-to-end (E2E) monitoring strategy for Azure using cloud best practices. Define a long-term monitoring roadmap and agree it with the company stakeholders.
– Manage all cloud monitoring activities.
– Configure and maintain all monitoring agents and their configuration, including all deployment, ongoing maintenance, and update activities.
– Organize the checking and testing of monitoring rules that track Azure infrastructure capacity, with alerts for compute, storage, and network.
– Organize the development of workbooks and the creation of Azure dashboards that show the status of all resources of business applications in the Azure portal, once results have been presented to and approved by the HaCT team.
– Oversee the implementation of the end-to-end process for monitoring the availability of different Azure services with Azure Health and by enabling Application Insights web tests.
– Develop guidelines for implementing application performance monitoring and alerts with Application Insights resources.
– Organize custom alerts using Kusto queries in Log Analytics, along with standard alerting rules for all Azure virtual machines.
– Validate the Twilio (SMS & voice forwarding) configuration monthly and check that SMS notifications are delivered to the relevant stakeholders.
– Take the integration between Azure alerting and SNOW incident raising to the next level by making use of the SNOW APIs.
– Oversee the redirection of alerts to different systems and then decommission Request Tracker (custom ticketing system).
– Consult with the security team to implement Azure Sentinel as the new SIEM.
– Define and implement a monitoring framework for Azure PaaS services.
– Provide recommendations for the company’s AIOps initiative to extract further insights and value from the monitoring data.
– Coordinate with application teams and other technical teams within Hosting and Cloud Technologies (HaCT) to resolve complex problems and to recommend system and process improvements.
– Provide remote workshops and training, with recommendations and suggested solutions, after each improvement measure.
Requirements:
– Good knowledge of infrastructure monitoring tools (Zabbix, Dynatrace, Datadog, etc.)
– Azure monitoring skills
– Agent deployment and management for IaaS services; configuration of PaaS services for monitoring; configuration of Azure alerting; Log Analytics and Kusto experience
– Experience with Windows/Linux operating systems
– Team leadership skills: communication, organization, teamwork, customer orientation, and ability to innovate
– Problem-solving & interpersonal skills
Nice to have:
– Zabbix experience (to complete our migration from Zabbix to native Azure monitoring)
– Experience with ServiceNow integration (event handling)