75% Remote: Domain Architect Observability (f/m/d)

Berlin, Berlin - Hybrid

Keywords

Metrics, Data Centers, IaaS, Traffic Flow, Kubernetes, Amazon Web Services, Databases, DevOps, VMware ESX Server, Forecasting, Hypervisor, Identity Management, Open Source, Public Cloud, Prometheus, SNMP, Virtualization, Datadog, Data Logging, Testing, Grafana, Reporting Tools, Commercial Off-The-Shelf, AppDynamics, Dynatrace

Description

For our client we are looking for a Domain Architect Observability (f/m/d).

Key details:
Start: May 2024
Duration: until 31.12.2024, extension possible
Workload: Full-time
Location: Berlin, 75% remote (3 weeks remote / 1 week Berlin), up to 50% onsite

Role:
The infrastructure product group offers data center services that are provided via a software stack to other product lines within the program. The Observability Architect is a member of the Infrastructure Architecture team.
The architect is responsible for aligning with the strategy and vision of the Lead Infrastructure Architect and with other architects in the group (e.g., network, storage, software architects).
The Observability Architect is responsible for the architecture of the infrastructure-wide observability platform, which provides observability (metrics, events, alerting, remediation) for the core infrastructure technology teams (Network, Compute, Virtualization, Storage, Security and Software) and for products provided to Infrastructure customers.
The infrastructure observability platform must also integrate with the program-wide observability platform, which encompasses other technical towers (k8s, DevOps, Data, IAM, etc.).
The architect does this in conjunction with, and through consultation with, the other infrastructure technology architects and the Lead Infrastructure Architect.

Targets:
The Observability Architect is responsible for the following technology areas:
- Infrastructure core metrics, events and alerting (Compute, Network, Storage, Security, IaaS)
- Customer metrics, events and alerting
- Security events, alerting and remediation
- IaaS metrics, events and alerting
- Program-wide observability integration
- Internal Infrastructure Observability Platform
- Customer Observability Platform
- QA Team / Testing Observability Platform

Skills:
- MUST understand and have proven experience with observability (metrics, events, logging, alerting) for physical data center infrastructure, e.g., network devices, compute platforms, storage platforms, security platforms.
- Have extensive experience with various observability platforms.
- Able to provide solutions using COTS, open source and custom solutions (code) to fulfil end-to-end requirements.
- OpenTelemetry
- At least 3 of:
o Loki, Grafana, Prometheus, Datadog
o AppDynamics, Dynatrace
o ELK
- Reporting tools and customizing reports for specific audiences
- Integrations with custom-written software (e.g., Python) exposing metrics, events and alerting
- Integrations with hypervisor platforms (KVM, ESXi)
- ML experience, applied to forecasting infrastructure hardware scaling based on usage stats/metrics
- eBPF experience to extract traffic flow metrics and analyze potential issues (performance, security)
- k8s CNI knowledge (e.g., Cilium) to extract traffic flow metrics and analyze potential issues (performance, security)
- Database knowledge to support observability platform data storage requirements (e.g., time-series, graph)
- SNMP
- Packet Analysis, IPFIX, sFlow
- KPI Reporting
- Fluent English in speech and writing (at least C1)

** PLEASE NOTE THAT EXPERIENCE IN PUBLIC CLOUDS (AZURE, GCP, AWS, ETC.) IS NOT RELEVANT FOR THIS ROLE. THIS IS A PRIVATE ON-PREMISES CLOUD BUILT FROM THE GROUND UP **
Start: 06.2024
Duration: 8 months (extension possible)
From: Nemensis AG
Posted: 12.04.2024
Contact: Benjamin Walker
Project ID: 2738961
Industry: IT
Contract type: Freelance
Work mode: 80% remote