Description
To qualify for the role, you must have a minimum of 4 years of relevant DevOps and data-wrangling experience in a (big) data environment
Experience with the Hadoop big data ecosystem, including some of the following components:
Storage: HDFS, MongoDB, PostgreSQL, HBase, Cassandra
Tools: Kafka, Mesos, Docker, Spark, Hive, YARN, …
Programming knowledge in Scala and/or Python is a plus
Excellent knowledge of the Linux environment
Knowledge of continuous development/integration pipelines, including rules to test and validate code (Git, Jenkins, test frameworks)
Tasks & responsibilities:
Build data pipelines starting from an RDBMS: capture change events, transfer them into a Kafka broker, consume the events from the cluster with Spark and Spark Streaming, generate metadata tables in the Hive metastore, and generate data marts that will be exposed via Solr, HBase, and Impala.
CDC and stream processing inside the Hadoop stack
English-speaking; no other language required.
If you have the required skills and are interested in applying, please send your CV now for immediate consideration.