Description
Required skills:
- Hands-on experience with Hadoop/Big Data and Data Lake architecture (HDFS, Hadoop MapReduce and streaming, Spark and Spark Streaming)
- Hands-on experience with data governance and data lineage
- Conceptual knowledge of Data Lake design, especially data zones (landing, raw, hot, cold, etc.) and appropriate user access rights
- Knowledge of building Hadoop-based services such as Spark Streaming in Java, plus Kafka, Hive, Sqoop, and Oozie
- Hands-on experience with data ingestion (message queuing, Kafka, NiFi) and downstream data provisioning
- Knowledge of Elasticsearch, Neo4j, and traditional relational databases (Oracle, PostgreSQL)
- Knowledge of data processing tools (e.g. Spark, NiFi, HiveQL, Impala, R, KNIME, PipelinePilot)
- Experience developing Java web applications, with a focus on building REST APIs that connect to Hadoop and SAP HANA
- Ability to work with large volumes of data to derive business intelligence
- Ability to analyze data, uncover information, derive insights, and propose data-driven strategies
- Knowledge of object-oriented programming languages such as Java is a must
- Knowledge of installing, configuring, maintaining, and securing Hadoop
- In-depth knowledge of and experience with the Hadoop ecosystem and architecture (including HDFS)
- Experience monitoring the above components in a production system via Ambari, Kibana, and other tools
- An analytical mindset and the ability to learn, unlearn, and relearn
- Excellent communication skills for presenting your own ideas, concepts, and solutions to various stakeholders