Santhosh Parampottupadam (available)

Santhosh Parampottupadam

Data Scientist, Data Scientist (R&D), Big Data Hadoop Developer

  • 10787 Berlin Freelancer in
  • Degree: Masters in Cloud Computing
  • Hourly/daily rate: not specified
  • Languages: German (basic) | English (native)
  • Last update: 06.04.2020

Stanford University ML

AI awards sponsored by Microsoft and SAP

AWS certification

Cloudera Certification

XGBoost, Deep Learning, Neural Network, Stacked Ensemble model, Hyperparameter, Machine Learning, NLP, Random Forest, DT, GBM, Ensemble, Grid Search, DFS, Pandas, NumPy, Matplotlib, Plotly, Scikit, spaCy, PyTorch, TensorFlow, ScikitLearn, Java, Python, UNIX, Bash Scripting, R, Big Data, Hadoop, Spark, Map-Reduce, HBase, Hive, Sqoop, Impala, NoSQL, MapR, Cloudera, Cloud, AWS, IBM Bluemix, MySQL, Oracle, Version Control, Git, Gradle, IDEs, Jupyter Notebook, VS Code, PyCharm, Docker, Tableau, Windows, Linux, Mac OS, Scikit-learn, Histogram, Facebook, feature engineering, Bigdata, Kaspersky, Algorithm, CI/CD, Jenkins, POC, algorithms, extract, transform and load, caching, Amazon AWS, MapReduce, DB, bash, map reduce, RDBMS, HDFS
  • 08/2019 - present

    • Softgarden
  • Data Scientist


  • * Deployed H2O, Scikit-learn, and PyTorch based salary prediction models for recruiting managers
    * Improved model performance by removing outliers identified through distribution analysis (histograms, box plots, correlation plots)
    * Increased the final machine learning model performance by 5% using ensemble models of deep learning, XGBoost, and GBM
    * Increased the machine learning model performance by creating new 300-dimensional Smooth Inverse Frequency (SIF) vectors
    * Working closely with product owners and stakeholders to build new machine learning use cases for customers
    * Developing an NLP and Facebook fastText based workforce recruitment recommendation model
    * Performing statistical analysis to identify relationships in the data and improve feature engineering for ML models
    * Cleaning vast amounts of structured and unstructured data for machine learning models using NLP and Big Data tools
    * Creating a machine learning pipeline for production following Flake8 and Pylint conventions

  • 02/2019 - 07/2019

    • Kaspersky Lab R&D
  • Data Scientist
  • * Prevention of Domain Generation Algorithm (DGA) using Random Forest and deep learning
    * Created fail-fast machine learning models as POCs to understand model performance and feature contributions
    * Worked on a lateral movement detection algorithm using neural networks
    * CI/CD using Jenkins and Docker
    * Note: The company shut down its entire Ireland office, so all employees had to leave.

  • 03/2018 - 12/2018

    • First Data Corporation, R&D
  • Data Scientist (R&D)
  • * Developed a deep learning based credit and debit card fraud detection model using historic banking data
    * Improved machine learning model performance by 7% using hyperparameter grid search to obtain the best model
    * Performed statistical correlation analysis and plotting to find outliers and feature importance with targeted data
    * Deployed an NLP-based obfuscation detection algorithm, built on sentence embeddings, to stop fraudsters
    * Created a POC to reach out to clients with a web-based machine learning model for credit risk analysis
    * Developed fail-fast machine learning algorithms and stacked ensemble models using automated machine learning (AutoML)
    * Automated Hadoop Spark jobs to collect, extract, transform, and load data for ML models from EDH to local

  • 10/2014 - 12/2016

    • Tata Consultancy Services Ltd
  • Big Data Hadoop Developer
  • * Provided design recommendations and thought leadership to stakeholders to analyze new data sources.
    * Improved Hadoop job performance by 11% using HBase and Hive to enhance the caching logic and data partitions.
    * Worked with Amazon AWS to make client data faster, more accessible, and more accurate at an optimized cost.
    * Worked with terabytes of live/hourly datasets in an agile environment and managed a team of 4.
    * Developed MapReduce jobs to process, transform, and populate staging tables and store the refined data in distributed DBs.
    * Developed bash scripts to automate the MapReduce jobs and scripts for pulling data from RDBMS to HDFS using Sqoop.
    * Created reusable components that run on top of HDFS to meet general Hadoop requirements.
    * As part of a total cost of ownership (TCO) effort, migrated the project to AWS cloud and validated the performance and cost as a POC.
    * Mentored campus graduates in the Big Data ecosystem and trained them in MapReduce, HBase, Hive, and other tools.
    * Monitored Hadoop jobs and troubleshot production cluster issues.