Data Science Team / Data Engineer / Airflow / Spark / Prefect

Berlin, Berlin  ‐ Remote
This project is archived and no longer active.

Description

The Client
The client is the first company in Germany to have developed a data room solution.

The company is applying ML and other Big Data technologies in order to create value and a unique experience for their customers.

The Project
You will be part of the Data Science Team, whose mission is to use state-of-the-art technology not only to extract data but also to analyse and learn from it, applying Machine Learning and other Big Data technologies to create real value and a unique experience for our customers.

They are shifting the tech stack from Airflow to Prefect and need a senior data engineer to guide them through the process while also coding hands-on. They want to use Prefect to build a new data transformation structure and feed the data into a Spark pipeline before making it available to third parties via a GraphQL API. An ideal candidate would have experience working with such a structure.

Responsibilities
Transform raw data to perform insightful analytics
Design, implement and maintain data systems and pipelines in the Spark/Airflow and Dask/Prefect stacks
Refactor and transform data for machine learning-based modelling and information extraction
Consolidate, harmonise and structure data coming from different sources
Advocate for data quality, reliability, security and consistency
Design and implement data infrastructure in AWS
Mentor junior engineers and help them develop further
Be an integrated part of the existing team and support through hands-on coding

Requirements

Must-Haves
You can start no later than the end of November
You are available full-time for the duration of the project
You have demonstrated experience with service-oriented architecture from previous projects
Strong experience with Data Architecture / Engineering
Strong experience with data processing, especially for image and text
Strong hands-on experience with Airflow, Prefect, Spark and Dask
Senior-level experience with Python and database design (SQL and NoSQL)
Experience with ETL and ML workflows run as microservices in AWS
Experience in setting up Docker containers

Nice-to-Haves
Experience with AWS Glue and Amazon SageMaker
You can visit the Berlin office every two weeks (the office is open with limited capacity and follows all guidelines from the health authorities)
You are German-speaking

Team
You will work in a service-oriented Scrum team.
You will report to the Engineering Lead / Product Owner.

Tech Stack
TypeScript
Node.js
Python
GraphQL
MongoDB
Postgres
RESTful APIs
AWS (SQS, SNS, S3)
Spark
Airflow
Dask
Prefect


Please answer these questions if you are interested:
On which of your projects have you worked with Prefect?
On which of your projects have you worked with Spark?
On which of your projects have you used Dask?
On which of your projects have you used AWS Glue and/or Amazon SageMaker?
What is your earliest start date?
When could you do interviews with the client?
Are you available full-time throughout the full duration of the project?
Are you able to visit the Berlin office once every two weeks?
What would be your day rate for this project?
What open questions do you have about the project?

Start: November 2020
Duration: 5 months (extension possible)
From: MVPF Global Talent Solutions GmbH
Posted: 30.10.2020
Contact: Levin Wense
Project ID: 1992036
Contract type: Freelance
Work arrangement: 100% remote