Description
For one of our clients, we are looking for a Data Scientist (f/m/d).
Skills:
• Python programming (pandas, NumPy, seaborn)
• Hands-on experience with cloud deployment of ML models and MLOps, preferably on AWS SageMaker
• Experience building machine learning and deep learning models with related Python packages such as scikit-learn, PyTorch, Keras, and TensorFlow
• Experience building end-to-end ML pipelines: data pre-processing, model fitting, hyperparameter tuning, model validation, and model deployment
• Hands-on experience working with standard data mining and ML modeling processes, e.g. CRISP-DM, in the context of model life cycle management
Nice-to-have skills:
• Experience with the RDKit Python SDK
• Experience with machine-learning-based drug discovery, i.e., molecular property prediction and de novo molecule generation
Tasks:
• Build an understanding of the existing ML models
• Create a gap assessment of the models
• Reproduce the scoring/unit testing of existing models
• Add FAIR metadata/formats to models and curated datasets
• Augment the scoring script for each model
• Register models on SageMaker
• Deploy models on SageMaker in the test environment
• Validate the deployed models in the test environment
• Expose models as an API in the production environment using SageMaker
• Build SageMaker ML pipelines (random forest, decision trees, logistic regression, etc.) for pre-processing, model training/validation, performance metrics reporting, and deployment
• Establish an understanding of the required similarity measures
• Define an algorithm for similarity measures against the required catalogues
• Implement similarity search algorithms using RDKit
• Build pipelines for dimensionality reduction algorithms
Start: ASAP
Location: Remote
Duration: 12+ months