Profile picture of Florian Hinzpeter, Senior Data Science Consultant | Big Data, Machine Learning, MLOps & Responsible AI, from St. Gallen

Florian Hinzpeter

Partially available

Last update: 17.04.2024

Senior Data Science Consultant | Big Data, Machine Learning, MLOps & Responsible AI

Company: Scalable Analytics GmbH
Degree: PhD in Theoretical Physics
Languages: German (native) | English (business fluent) | Portuguese (good) | Spanish (business fluent)

Keywords

Microsoft Azure, Artificial Intelligence, Cluster Analysis, Python, Machine Learning, SQL, Apache Spark, Deep Learning, Databricks, APIs

Attachments

CV-FlorianHinzpeter_200224.pdf
Profile-Florian-Hinzpeter-EN_200224.pdf

Skills

Machine Learning
  • Tools
    • Scikit-Learn
    • CatBoost
    • XGBoost
    • SciPy
    • SparkML
  • Concepts
    • Supervised Learning (Random Forest, Boosted Trees, Support Vector Machine, Naive Bayes, K-NN)
    • Unsupervised Learning (Dimensionality Reduction: PCA, t-SNE, UMAP;
      Clustering: K-Means, DBSCAN, Hierarchical Clustering)
Deep Learning
  • Tools
    • PyTorch
    • Hugging Face
  • Concepts
    • Autoencoders, RNNs, LSTM, Transformers
    • Computer Vision (CNNs)
    • Generative AI: Large Language Models (LLM), Prompt Engineering (e.g. with LangChain), Fine Tuning, Parameter Efficient Fine Tuning, Generative Adversarial Networks
Responsible AI

Data Engineering
  • SQL
  • Python (Pandas)
  • PySpark
  • Spark SQL
  • Delta Lake
  • Databricks
Azure Cloud
  • Azure Data Factory
  • Azure Blob Storage
  • Azure Machine Learning Studio
  • Azure Active Directory
  • RBAC
  • Databricks
  • Infrastructure as Code (ARM, Bicep)
Software Engineering & DevOps
  • Python
  • pytest
  • Flask
  • FastAPI
  • Git
  • CI/CD
  • GitLab
  • Azure DevOps
  • Poetry
  • Pipenv
  • Docker
  • Bash
  • PowerShell

Project history

09/2022 - 12/2022
Predicting the length of hospital stays
Positive Thinking Company GmbH (Insurance, 1000-5000 employees)

Project roles:
Lead Software Developer and Senior Data Scientist

Tasks:
I developed the software for an ML pipeline that predicts the length of hospital stays.
I took the lead on technical and analytical as well as project management tasks.
The technical and analytical tasks included:
  • Analytical Problem Assessment
  • PoC development and evaluation
  • ETL and Data Selection
  • ML Software Development in Python
Tools:
Python (Pandas, Scikit-Learn, PyTorch, CatBoost), SQL, Git and GitLab, Poetry

Methods:
CatBoost Regression Model, OOP in Python, Scikit-Learn ML Pipelines, Feature Engineering and Selection

01/2022 - 12/2022
Recommendation Engine for health insurance products
Positive Thinking Company GmbH (Insurance, 1000-5000 employees)

Project roles:
Lead Software Developer and Senior Data Scientist

Tasks:
I developed ML software that generates product recommendations for health insurance products.
I took the lead on technical and analytical as well as project management tasks.
The technical and analytical tasks included:
  • ETL
  • Data Discovery and Selection
  • ML Software Development in Python
  • Operationalization of the model as a batch process
Tools:
Python (Pandas, pytest, SciPy), SQL, Git and GitLab, Poetry, Docker

Methods:
Probabilistic Graphical Models, OOP in Python, Unit and Integration Testing, CI/CD Pipeline

09/2022 - 11/2022
Responsible AI Compliance Assessment with respect to the requirements of the EU AI Act
Positive Thinking Company (Banking and Financial Services, 1000-5000 employees)

Project roles:
AI Consultant for Responsible & Ethical AI

Tasks:
Creation of an evaluation framework including:
  1. a survey to determine the level of maturity in the area of responsible AI
  2. an in-depth evaluation of the survey and classification of the level of maturity
  3. an assessment of potential regulatory breaches with respect to the EU AI Act
  4. the development of a future-proof Responsible AI roadmap

07/2022 - 09/2022
Design and implementation of a CI/CD Process (Pipeline)
Positive Thinking Company GmbH (Insurance, 5000-10,000 employees)

Project roles:
DevOps Architect and Software Developer

Tasks:
Conceptual design and implementation of a generic CI/CD pipeline in GitLab

Tools:
GitLab, Bash, pytest, pylint

Methods:
CI/CD, Unit and Integration Testing

09/2020 - 12/2021
Design and build of a Data Science Platform around Azure Databricks
Positive Thinking Company GmbH (Automotive and Vehicle Manufacturing, 250-500 employees)

Project roles:
Cloud Architect, Data Engineer

Tasks:
  • Conceptual design of the data science platform
    • Determination of the required resources and their interconnection
    • Selection of an environment separation pattern (e.g. Dev/Test/Prod)
    • Leveraging the Data Lakehouse architecture
  • Implementation
    • Setup of the resources using Infrastructure as Code
    • Connecting resources, such as mounting the Data Lake to Databricks or connecting Azure Key Vaults to a Databricks secret scope
    • Setting up assets and resources within the Databricks workspace using the Databricks CLI
  • Data Engineering
    • Integrating siloed data sources into a central Lakehouse
    • Building Data Integration as well as ETL and ELT Pipelines
  • Cloud Admin
    • Developing a permission and role concept
    • Developing a permission group structure in Azure Active Directory
    • Designing a data ownership concept
    • Securing Data using ACLs and Databricks Unity Catalog
    • Synchronizing identity management between Azure and Databricks via SCIM
Tools:
Databricks, Azure Cloud (Azure Data Factory, Azure Machine Learning Studio, Azure Data Lake Gen2, VNets),
Bicep and ARM templates, Bash and PowerShell, Azure CLI and Databricks CLI, Azure DevOps

Methods:
Infrastructure as Code (IaC), Continuous Deployment Pipelines

02/2020 - 07/2021
Predicting used car prices from historical sales transaction data
Positive Thinking Company GmbH (Automotive and Vehicle Manufacturing, 250-500 employees)

Project roles:
Senior Data Scientist

Tasks:
I developed the software for an ML pipeline that predicts the prices of used cars.
I took the lead on the technical and analytical tasks:
  • Exploratory Data Analysis
  • PoC development and evaluation
  • ML Pipeline implementation
  • Data and model governance
Tools:
Python (Pandas, PySpark, CatBoost), Git and Azure DevOps, MLflow, Databricks

Methods:
CatBoost Regression Model, ETL, Feature Engineering and Selection

10/2019 - 06/2020
Extraction of accident patterns from historical data of vehicle repairs
Positive Thinking Company GmbH (Automotive and Vehicle Manufacturing, 250-500 employees)

Project roles:
Senior Data Scientist

Tasks:
I developed the ML software to extract common accident patterns from vehicle repair data:
  • Exploratory Data Analysis
  • PoC development and evaluation
  • ML Pipeline implementation
  • Close coordination with domain experts (e.g. topic annotation)
Tools:
Python (Pandas, SparkML, Scikit-Learn), Git and Azure DevOps, MLflow, Databricks

Methods:
Autoencoder, K-Means clustering, dimensionality reduction (UMAP, t-SNE), ETL, Topic modelling (Latent Dirichlet Allocation)

10/2019 - 11/2019
Automated email classification and information extraction
Positive Thinking Company GmbH (Automotive and Vehicle Manufacturing, 500-1000 employees)

Project roles:
Data Scientist

Tasks:
I developed the software for an ML pipeline that classifies emails into various categories based on the email body text.
I took the lead on the technical and analytical tasks:
  • Exploratory Data Analysis
  • Text data preprocessing, Encoding and Tokenization
  • ML Pipeline implementation
Tools:
Python (Pandas, NLTK, Scikit-Learn), Git, Jupyter Notebook

Methods:
NLP modelling based on bag-of-words features and a Support Vector Machine

06/2019 - 10/2019
Predicting real estate prices

Project roles:
Data Scientist

Tasks:
I developed an ML pipeline to predict real estate prices from publicly available US data.
This project was part of a Kaggle challenge I took part in:
  • Exploratory Data Analysis
  • ML Pipeline implementation
  • Extensive tuning of the data quality and pipeline parameters (competition context)
Tools:
Python (Pandas, Scikit-Learn, XGBoost), Jupyter Notebook

Methods:
Boosted Tree Regression, advanced Feature Engineering (e.g. hierarchical target encoding)

01/2017 - 06/2019
Monte Carlo Simulations of spatial arrangements and Finite Element Modelling
Technical University of Munich (Other)

Project roles:
Researcher, Data Scientist

Tasks:
This project applied Monte Carlo simulation to generate an ensemble of spatial organizations of reactive particles. For each generated instance of the ensemble, finite element simulation was used to solve the Reaction-Diffusion equations and compute various resulting observables.
The resulting modelling data was then analyzed and evaluated.
 
Tools:
Python (Matplotlib, Seaborn), Matlab, C++, Comsol Multiphysics

Methods:
Monte Carlo Simulation, Finite Element Simulation, Data Visualization, Regression Modelling and Correlation Analysis

01/2014 - 12/2015
Development of methods and algorithms for the numerical optimization of the distribution of chemical reactants with respect to their throughput
Technical University of Munich

Project roles:
Researcher and Software Developer

Tasks:
Numerical solution of Reaction-Diffusion PDEs and optimization of the reactant distribution with respect to reaction flux

Tools:
Matlab, C++

Methods:
Numerical solution of spatial differential equations on a lattice, numerical optimization

Certificates

Databricks Certified Data Engineer Associate
2023
CI/CD YAML Pipelines with Azure DevOps
2023
Databricks Certified Machine Learning Associate
2022
Python 3: Deep Dive (Functional)
2022
PyTorch for Deep Learning and Computer Vision
2021
REST APIs with Flask and Python
2021
Deployment of Machine Learning Models
2021
Big Data Modeling and Management Systems
2020
Introduction to Big Data
2020
Fundamentals of Reinforcement Learning
2020
Build Basic Generative Adversarial Networks (GANs)
2020
Build Better Generative Adversarial Networks (GANs)
2020
Machine Learning
2019
Neural Networks and Deep Learning
2019
Structuring Machine Learning Projects
2019
Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning
2019
Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
2019
Sequence Models
2019
Convolutional Neural Networks
2019
Introduction to Data Science in Python
2019
Applied Machine Learning in Python
2019
Complete SQL Bootcamp
2019
Version Control with Git
2019
Object-Oriented Programming in Java
edX
2018

Willingness to travel

Available worldwide
Generally willing to travel