Serge Kalinin

available

Last update: 13.03.2025

Big Data, Cloud, DevOps

Degree: PhD in Physics
Hourly/daily rate: on request
Languages: German (business fluent) | English (business fluent) | French (good) | Russian (native)

Keywords

Big Data, Amazon Web Services, Microsoft Azure, Databases, DevOps, File systems, MySQL, Performance tuning, Transport Layer Security, Google Cloud (+150 more)

Attachments

skalinin_130325.pdf

Skills

DynamoDB, Amazon EC2, S3, SQS, AWS, Ansible, Flink, Flume, HBase, Apache, Hadoop, Hive, Kafka, Oozie, Spark, YARN, Zookeeper, Web Application Firewall, API, neural networks, backend, backups, batch processing, Big Data, BGP, C, C++, Cacti, Cassandra, Nexus, Citrix, ClickStream, CloudFoundry, Cloudera, Impala, CloudWatch, network components, programming, Containerization, data analysis, data governance, data integration, streaming, data replication, Databases, DevOps, distributed computing, Docker, DNS, dynamic routing, software packages, XML, ETL, file systems, firewalls, Flask, Fraud Detection, Ganglia, GPFS, genetic algorithms, Git, Google Cloud Platform, Grafana, Graph database, Grid Computing, HAProxy, HDFS, HDInsight, high performance computing, DB2, Isilon, JIRA, JSON, Java, JavaScript, Juniper, Jupyter, Kerberos, KVM, Kibana, Kotlin, Kubernetes, LDAP, Linux, Linux servers, machine learning, MapR, Mathematica, memcached, Mesos, microservices, Microsoft Azure, Excel, Microsoft Office, PowerPoint, Microsoft SQL Server, Microsoft Word, MongoDB, Monte Carlo simulation, MPLS, MySQL, NAGIOS, Netapp, NetBeans, network protocols, routers, NGINX, object oriented programming, OSPF, OpenLDAP, Oracle, PCI DSS, PHP, package management, Parsing, Pattern recognition, peering, performance tuning, Performance optimization, Postfix, PostgreSQL, Prometheus, Pulumi, Puppet, PyTest, Python, REST API, Ruby, SQL, Scala, Agile/SCRUM, search engines, server installation, Serverless, software development, Software quality assurance, software tools, Spark streaming, Sphinx, Sqoop, State Machine, statistical tests, SVN, support vector machines, high availability, System Architect, Tableau, Terraform, TLS/SSL, TypeScript, Unit tests, authentication, Vagrant, VBA, VPNs, VirtualBox, Virtualization, VMware, web applications, webpages, XEN

Project history

04/2023 - present
Senior DevOps
Atruvia GmbH (banking and financial services, 1000-5000 employees)

  • Development of REST APIs and ETLs
  • Data Governance
  • Security
  • Data validation (see the sketch after this list)
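
Below is a minimal Python sketch of the kind of validate-then-load ETL step listed above; the record layout, field names, and validation rules are illustrative assumptions, not details of the actual Atruvia project.

"""Sketch of an ETL step with validation (hypothetical record layout)."""
from dataclasses import dataclass

@dataclass
class Transaction:
    account_id: str
    amount_cents: int
    currency: str

def validate(raw: dict) -> Transaction:
    # Reject records that violate basic invariants before loading.
    if not raw.get("account_id"):
        raise ValueError("missing account_id")
    if raw.get("currency") not in {"EUR", "USD"}:
        raise ValueError(f"unsupported currency: {raw.get('currency')!r}")
    return Transaction(raw["account_id"], int(raw["amount_cents"]), raw["currency"])

def etl(records):
    loaded, rejected = [], []
    for raw in records:
        try:
            loaded.append(validate(raw))
        except (ValueError, KeyError, TypeError) as exc:
            rejected.append((raw, str(exc)))
    return loaded, rejected

if __name__ == "__main__":
    rows = [{"account_id": "A1", "amount_cents": "1200", "currency": "EUR"},
            {"account_id": "", "amount_cents": 5, "currency": "EUR"}]
    good, bad = etl(rows)
    print(f"{len(good)} loaded, {len(bad)} rejected")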

10/2021 - 03/2023
Senior DevOps
Otto GmbH & Co KG (Internet and information technology, 1000-5000 employees)

  • Search Engine Optimization
  • Development of services and ETLs
  • MLOps
  • AWS
  • Monitoring
  • Backups

07/2018 - 09/2021
Senior Big Data
REWE Systems

- Conceptualization and implementation of a hybrid environment on Google Cloud Platform
- Provisioning of GCP infrastructure with Terraform and later with Ansible
- Redundant connectivity and encryption of data between GCP and on-premise systems
- Provisioning of MapR and Spark environments on GCP
- Setup of real-time data replication from on-premise tables to GCP
- Integration with REWE services (Active Directory, DNS, Instana, etc.)
- Development of a REST API for machine learning models using Flask (see the sketch after this list)
- Implementation of persistent storage based on MapR for a Kubernetes cluster
- Operation of MapR clusters: upgrades, extensions, troubleshooting of services and applications
- Synchronization of a Kafka cluster with MapR Streams using Kafka Connect
- Design and implementation of ETL pipelines; synchronization and integration of MapR clusters with different data sources (e.g. DB2 and Teradata warehouses)
- Onboarding of new internal REWE customers to MapR platforms
- Consulting management on technical topics and future developments in the Big Data field
- Proposals and PoCs for solutions to security topics (e.g. constrained delegation on F5, authentication for OpenTSDB)
- Developer in data science projects:
  - Development of market classification models
  - Visualization of data and predictions with Jupyter and Grafana
  - Integration with JIRA
- 3rd-level support
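
As a rough illustration of the Flask REST API mentioned above, here is a minimal Python sketch; the /predict endpoint, payload shape, and dummy scoring function are assumptions rather than the actual REWE implementation.

"""Sketch of a Flask REST API in front of an ML model (names are hypothetical)."""
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict(features):
    # Stand-in for the real model; returns a dummy score.
    return sum(features) / max(len(features), 1)

@app.route("/predict", methods=["POST"])
def predict_endpoint():
    payload = request.get_json(force=True)
    features = payload.get("features")
    if not isinstance(features, list):
        return jsonify(error="'features' must be a list of numbers"), 400
    return jsonify(score=predict(features))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)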

09/2016 - 05/2018
Senior Big Data
Allianz Technology SE

- Management of a large-scale, multi-tenant, secure, and highly available Hadoop
infrastructure supporting rapid data growth for a wide spectrum of innovative
customers
- Pre-sales: onboarding of new customers
- Providing architectural guidance, planning, estimating cluster capacity,
and creating roadmaps for Hadoop cluster deployments
- Design, implementation, and maintenance of enterprise-level Hadoop security
environments (Kerberos, LDAP/AD, Sentry, encryption in motion, encryption at rest)
- Installation and configuration of Hadoop multi-tenant environments; updates,
patches, version upgrades
- Creating runbooks for troubleshooting, cluster recovery, and routine cluster
maintenance
- Troubleshooting Hadoop-related application, component, and infrastructure
issues at large scale
- 3rd-level support (DevOps) for business-critical applications and use cases
- Evaluation of and proposals for new tools and technologies to meet the needs
of the global organization (Allianz Group)
- Working closely with infrastructure, network, database, application, business
intelligence, and data science units
- Developer in Fraud Detection projects, including machine learning
- Design and setup of a Microsoft Revolution (Microsoft R Open) data science
model training platform for Fraud Detection, on Microsoft Azure and on premise,
using Docker and Terraform
- Developer in Supply Chain Analytics projects (e.g. GraphServer, which allows
executing graph queries on data stored on HDFS)
- Transformation of the team's internal processes according to the Agile/SCRUM
framework
- Developer of Kafka-based use cases (see the sketch after this list):
  - ClickStream
    - Producer: aggregator for streamed URLs clicked on webpages, fed via a
      REST API or other sources (e.g. Oracle)
    - Consumer: Flink job that, after pre-processing (sanity checks, extraction
      of time information), puts the data on HDFS as XML files
    - Used stack: Java, Kafka, Cloudera, SASL, TLS/SSL, Sentry, YARN, Flink, Cassandra
  - Classification of documents
    - Producer: custom-written producer that reads documents from a shared
      file system and writes them into Kafka
    - Consumer: Spark Streaming job that, after pre-processing, sends documents
      to a UIMA platform for classification; the classified data is then stored
      on HDFS for further batch processing
    - Used stack: Java, Kafka, Spark (streaming), Cloudera, SASL, TLS/SSL, Sentry, YARN, UIMA
  - Graph database (PoC): managing graphs via a Kafka interface
    - Producer: Twitter, news agency sites, etc.
    - Consumer: converts articles and messages into graph queries and executes
      them on the graphs using Gremlin
    - Used stack: Java, Python, Kafka, Cassandra, Gremlin, KeyLines (for graph
      visualization; JavaScript), Google Cloud
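
As a rough illustration of the producer/consumer pattern behind these use cases, here is a minimal Python sketch using the kafka-python client (the original implementations used Java, Flink, and Spark); the topic name and message shape are assumptions.

"""Sketch of a ClickStream-style Kafka producer and consumer (kafka-python)."""
import json
import time

from kafka import KafkaConsumer, KafkaProducer

# Producer side: publish one click event per URL (hypothetical topic name).
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("clickstream", {"url": "https://example.com", "ts": time.time()})
producer.flush()

# Consumer side: sanity-check each event before writing it out (here: stdout).
consumer = KafkaConsumer(
    "clickstream",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop iterating when no new messages arrive
)
for message in consumer:
    event = message.value
    if "url" in event and "ts" in event:  # sanity check, as in the Flink job
        print(event["url"], event["ts"])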

06/2014 - 07/2016
System Architect Web Operations
The unbelievable Machine Company GmbH

- Development and implementation of precise, scalable IT solutions for Big
Data and Web projects as DevOps
- Development and implementation of security and operational concepts for the
major Hadoop distributions MapR, Hortonworks, and Cloudera, including
data analysis, development of models, data governance, data protection,
and data integration; advanced Big Data know-how
- Installation and testing of multi-tenant Kafka clusters
- Planning and design of Big Data platforms; implementation of high-performance
Hadoop applications
- Consulting for customers who want to migrate to Hadoop environments

- Administration, automation, performance tuning, and securing of Linux
servers
- Planning, optimization, administration, and monitoring of dynamic routing:
BGP, OSPF, MPLS, peering
- Operation of network components: firewalls, load balancers, routers, switches
- Operation of mass-storage systems
- Provisioning of systems with Puppet and Cobbler
- Consulting the management on technical questions
- Monitoring and project documentation
- Software error analysis, identification of bottlenecks, and finding solutions
- 24x7 on-call service
- Migration of client services to AWS:
  - Provisioning and setup of VMs and software packages (Apache HTTP, NGINX)
  - Setup of VPNs
  - Evaluation and performance tests of DynamoDB (see the sketch after this list)
- Development of the architecture, implementation, and setup of monitoring for
an HDInsight analytics platform on Microsoft Azure:
  - Containerization and orchestration with Docker and Terraform
  - Used Hadoop components: Hortonworks, Spark with MLlib, Hive, Kafka
  - Storage: Azure Storage
  - Monitoring: NAGIOS, Kibana, Grafana
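
A minimal Python sketch of the kind of DynamoDB write-latency test mentioned above, using boto3; the table name, key schema, and item sizes are assumptions.

"""Sketch of a crude DynamoDB performance test with boto3 (hypothetical table)."""
import time

import boto3

# Hypothetical table with string partition key "pk"; not the client's actual schema.
table = boto3.resource("dynamodb", region_name="eu-central-1").Table("perf-test")

start = time.perf_counter()
for i in range(100):
    table.put_item(Item={"pk": f"item-{i}", "payload": "x" * 1024})
elapsed = time.perf_counter() - start
print(f"100 writes in {elapsed:.2f}s ({elapsed * 10:.1f} ms/write)")

# Spot-check a read after the write burst.
resp = table.get_item(Key={"pk": "item-0"})
print("read back:", resp.get("Item") is not None)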

09/2012 - 06/2014
System Operations
Werkenntwen GmbH

- System administration of one of the largest social networks in Germany,
werkennt-wen.de. The network had more than 9 million users and was a
subsidiary of RTL Interactive
- Operation, maintenance, extension, and monitoring of the platform (500
Hewlett Packard servers) and the following services: MySQL, memcached,
Postfix, SSL, Scality (mass storage), the search engines Elasticsearch
and Sphinx, Apache, NGINX
- Development and implementation of a consolidation of data initially spread
over 120 MySQL servers
- Optimization of the Scality-based mass-storage system to reduce latency
and achieve better load distribution
- Development and implementation of an infrastructure information database
(which server, which rack, which IP, etc.), including a frontend
- Automation of the server installation process with Fully Automatic
Installation (FAI)
- Automation of package management with Puppet
- Development of a MySQL backup tool based on Percona XtraBackup
- Development and implementation of mass-storage backups: full and incremental
- Virtualization of services with KVM, XEN, and VMware
- Support for the IT infrastructure in the office
- Monitoring with Cacti, Munin, and NAGIOS; development of NAGIOS checks
(see the sketch after this list)
- 24x7 on-call service
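
A minimal sketch of a NAGIOS check in Python, following the standard plugin convention (exit codes 0/1/2/3 with a one-line status message); the disk-space subject and thresholds are assumptions, not one of the actual checks.

#!/usr/bin/env python3
"""Sketch of a NAGIOS-style check: free disk space (hypothetical thresholds)."""
import shutil
import sys

# NAGIOS plugin exit codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

WARN_PCT, CRIT_PCT = 20.0, 10.0  # assumed thresholds, not from the profile

def main(path="/"):
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"UNKNOWN - cannot stat {path}: {exc}")
        return UNKNOWN
    free_pct = usage.free / usage.total * 100
    if free_pct < CRIT_PCT:
        print(f"CRITICAL - {free_pct:.1f}% free on {path}")
        return CRITICAL
    if free_pct < WARN_PCT:
        print(f"WARNING - {free_pct:.1f}% free on {path}")
        return WARNING
    print(f"OK - {free_pct:.1f}% free on {path}")
    return OK

if __name__ == "__main__":
    sys.exit(main(sys.argv[1] if len(sys.argv) > 1 else "/"))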

01/2009 - 09/2012
Postdoc
Bergische Universität Wuppertal

- ATLAS experiment at CERN (Geneva, Switzerland): Grid Computing and
top-quark physics
- Statistics-based separation of signal from background: statistical tests,
estimation of statistical and systematic errors, data unfolding, likelihoods,
Monte Carlo simulation (see the sketch after this list). Improvement of
measured results using multivariate methods: neural networks, genetic
algorithms, and support vector machines. Programming languages used: C++, Python
- ATLAS note "A search for ttbar resonances in the lepton plus jets channel
in 35 pb-1 of pp collisions at 7 TeV"
- Operation, maintenance, monitoring, and extension of a WLCG cluster in
Wuppertal (1500 cores, 1 PB of storage)
- Installation and performance optimization of a MySQL server for the
CREAM-CE service
- Development, implementation, and lead of the web application "Accounting
for the network filesystem": PostgreSQL database, Python API, JavaScript,
Apache; database optimization
- Development, implementation, and project management of a web application
"Accounting for the mass-storage system dCache"
- Evaluation of the performance of different mass-storage systems and network
protocols: dCache, GPFS, Panasas
- Administration of the cluster infrastructure with a focus on performance,
high availability, and security
- Virtualization with XEN and VMware
- Installation, administration, and optimization of batch systems: Oracle
Grid Engine, Portable Batch System
- Evaluation of Amazon EC2 services for high energy physics: scalability,
storage
- Technical support for other ATLAS Tier-2 centers worldwide
- Developer of the Job Execution Monitor: a remote debugger for jobs on the Grid
- Organization and project management of dCache tutorials
- WissGrid project: extraction and validation of metadata for data formats
used in high energy physics; long-term archiving
- Software quality assurance and tests
- Lectures on distributed computing
- Talks at the conference "Computing in High Energy Physics 2012", at dCache
workshops, and at Deutsche Physikalische Gesellschaft meetings
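
As a rough illustration of the Monte Carlo side of this work, here is a toy Python sketch estimating a background-only p-value from Poisson pseudo-experiments; all numbers are made up for illustration.

"""Toy Monte Carlo significance estimate (illustrative numbers only)."""
import numpy as np

rng = np.random.default_rng(42)

B_EXP = 100.0        # assumed expected background yield
N_OBS = 130          # assumed observed event count
N_TOYS = 1_000_000   # number of pseudo-experiments

# Generate background-only pseudo-experiments and count how often a pure
# background fluctuation reaches or exceeds the observed yield.
toys = rng.poisson(B_EXP, size=N_TOYS)
p_value = np.mean(toys >= N_OBS)
print(f"background-only p-value ~ {p_value:.2e}")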

10/2006 - 12/2008
Postdoc
Rheinisch-Westfälische Technische Hochschule Aachen

- CMS experiment at CERN: Grid Computing and top-quark physics
- Development of software tools to simulate top-quark processes based on
the Monte Carlo technique
- Pattern recognition in physics processes
- Support and administration of the Tier-2 cluster in Aachen
- Virtualization with VMware
- Automation of OS provisioning
- Performance optimization of the Lustre network filesystem
- Software quality assurance and tests
- Lectures on elementary particle physics
- Publications and talks

Certificates

AWS Certified Data Engineer - Associate
2024

Willingness to travel

Available worldwide