Description
Tasks
Design, develop, and implement new functionality or tooling on ELK clusters
Actively monitor the clusters and proactively fix problems
Design Elasticsearch indices that store data efficiently while optimizing for both performance and future growth
Configure Logstash, Filebeat, Metricbeat, and possibly other ELK Stack components to collect and store the data needed to meet requirements efficiently
Develop test scenarios for new software installations, upgrades, and migrations. Automate the rollout to all environments using Puppet or custom scripts
Consultancy and recommendations - Verify and fine-tune the clusters' performance. Propose changes, validate them with the customer, then implement them. Offer consultancy and document best practices
Maintain or refactor Puppet modules, classes, manifests, and resources
Maintain or create Ansible playbooks.
Conduct proofs of concept and experiments on new ideas, technologies, and features
Incident/problem/change management
Monitoring and logging - Proactively monitor system health and performance. Detect and remediate problems. Create alerts, react appropriately when they trigger, and evaluate reports and logs
Capacity management - Assess cluster sizing against data growth trends. Expand the clusters if necessary. Also check the available quota
Backup & restoration
Continuous Deployment - Maintain and continuously improve the service offering, configuration solutions, and tools, and explore new technologies
Skills
Strong experience (8+ years) in implementing and managing the ELK stack (Elasticsearch, Logstash, Kibana/Grafana)
Shall have prior experience in cluster provisioning and management, upgrades, backup/restore, patch management, performance tuning, etc.
Proficiency with Elastic Stack components, especially Elasticsearch, Logstash, and Kibana, is required
Proficient with Ansible-, Docker-, and Kubernetes-based deployment and administration
Must be proficient with Python
Excellent knowledge of both Windows and Linux operating systems
Shall be well versed in Linux administration tasks and automation using Puppet, Jenkins, etc.
Shall be able to provide consulting and best practices to application teams that consume ELK services
Adaptability to working with multiple teams on projects with varying degrees of flexibility/rigidity at different points in the development cycle
Previous experience working in an environment with formally structured IT Operational processes: work request ticket management, incident management, change management, and problem management
Proficiency in scripting languages such as Python, Ruby, Perl, and/or Bash
Experience working with Git and supporting CI/CD pipelines
Ability to develop and maintain positive working relationships
Ability to work in a team environment and independently as needed
Ability to adapt to change and work well under pressure
Ability to multitask and manage numerous projects
Ability to take on internal operational initiatives as a prime or lead
Excellent communication, organizational, interpersonal, problem solving, and documentation skills
Experience running and supporting a global 24x7 Internet-based service or product is considered an asset