Keywords
Skills
Project History
* Deployed H2O-, scikit-learn-, and PyTorch-based salary prediction models for recruiting managers
* Improved model performance by removing outliers identified through distribution analysis (histograms, box plots, correlation plots)
* Increased final machine learning model performance by 5% with an ensemble of deep learning, XGBoost, and GBM models
* Increased machine learning model performance by creating new 300-dimension Smooth Inverse Frequency (SIF) sentence vectors
* Worked closely with product owners and stakeholders to build new machine learning use cases for customers
* Developed an NLP- and Facebook fastText-based workforce recruitment recommendation model
* Performed statistical analysis to uncover feature relationships and support better feature engineering for ML models
* Cleaned vast amounts of structured and unstructured data for machine learning models using NLP and big data tools
* Built machine learning pipelines for production following Flake8 and Pylint conventions
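The Smooth Inverse Frequency vectors mentioned above can be sketched as follows. This is an illustrative NumPy implementation of the general SIF technique (weighted word-vector average, then removal of the first principal component), not the original project code; the function name, inputs, and the smoothing constant `a` are assumptions.

```python
import numpy as np

def sif_embeddings(sentences, word_vectors, word_freq, a=1e-3):
    """Smooth Inverse Frequency sentence embeddings (illustrative sketch).

    sentences: list of token lists
    word_vectors: dict token -> np.ndarray of shape (dim,)
    word_freq: dict token -> estimated unigram probability p(w)
    """
    dim = len(next(iter(word_vectors.values())))
    emb = np.zeros((len(sentences), dim))
    for i, sent in enumerate(sentences):
        tokens = [t for t in sent if t in word_vectors]
        if not tokens:
            continue
        # weight a / (a + p(w)) down-weights frequent, uninformative words
        weights = np.array([a / (a + word_freq.get(t, 0.0)) for t in tokens])
        vecs = np.array([word_vectors[t] for t in tokens])
        emb[i] = weights @ vecs / len(tokens)
    # remove the projection onto the first principal component
    # (the "common discourse" direction shared by all sentences)
    u, _, _ = np.linalg.svd(emb.T, full_matrices=False)
    pc = u[:, 0]
    return emb - emb @ np.outer(pc, pc)
```

In practice the word vectors would come from a pre-trained model (e.g. fastText) and the frequencies from a corpus unigram count.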
* Created fail-fast machine learning models as POCs to understand model performance and feature contributions
* Worked on a lateral movement detection algorithm based on neural networks
* Set up CI/CD using Jenkins and Docker
* Note: the company closed its entire Ireland office, so all employees had to leave.
* Improved machine learning model performance by 7% using hyperparameter grid search to select the best model
* Performed statistical correlation analysis and plotting to identify outliers and feature importance against the target variable
* Deployed an NLP-based obfuscation detection algorithm using sentence embeddings to stop fraudsters
* Created a web-based machine learning POC for credit risk analysis to reach out to clients
* Developed fail-fast machine learning algorithms and stacked ensemble models using automated machine learning (AutoML)
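The hyperparameter grid search mentioned above can be illustrated with a minimal scikit-learn sketch. The estimator, parameter grid, and synthetic data are illustrative assumptions, not the original project setup:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# synthetic stand-in for the project data
X, y = make_classification(n_samples=200, random_state=0)

# exhaustive search over a small, illustrative hyperparameter grid
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

`GridSearchCV` cross-validates every parameter combination and refits the best one, which is the standard way to obtain the "best model" from a grid search.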
* Automated Hadoop/Spark jobs to collect, extract, transform, and load data from the EDH to local ML models
* Improved Hadoop job performance by 11% using HBase and Hive to enhance caching logic and data partitioning
* Worked with Amazon AWS to make client data faster, more accessible, and more accurate at optimized cost
* Worked with terabytes of live, hourly updated data in an agile environment and managed a four-member team
* Developed MapReduce jobs to process, transform, and populate staging tables, storing the refined data in distributed databases
* Developed Bash scripts to automate MapReduce jobs and to pull data from RDBMS into HDFS using Sqoop
* Created reusable components that run on top of HDFS to meet common Hadoop requirements
* As part of a total-cost-of-ownership assessment, migrated the project to AWS as a POC and validated performance and cost
* Mentored campus graduates on the Big Data ecosystem and trained them in MapReduce, HBase, Hive, and related tools
* Monitored Hadoop jobs and troubleshot production cluster issues