Keywords
Skills
- 7+ years of experience with Natural Language Processing (NLP)
- Building semantic graphs, i.e. knowledge and ontology graphs
- Full pipeline for NLP tasks: tagging, parsing, Named Entity Recognition (NER), relation extraction
- Statistical analysis and modeling
- Machine learning: scikit-learn, Keras, TensorFlow
- 9+ years of experience with Python
- Spark, Spark SQL, Spark ML, Elasticsearch with Kibana (ELK)
- Docker, AWS, Grafana, Prometheus, Go, Ansible (offered indirectly)
Project history
- 2017.09 - 2018.01 NLP specialist for an e-commerce company
- Analyzed product names, descriptions, etc. to improve search quality
- Developed a novel POS tagger; based on Python (NLTK, scikit-learn), Git, and an AWS Hadoop cluster
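The core idea behind a statistical POS tagger of the kind mentioned above can be sketched in a few lines of pure Python; this is an illustrative toy (a unigram most-frequent-tag baseline), not the project code, and the tiny tagged corpus is invented for the example:

```python
from collections import Counter, defaultdict

# Toy tagged corpus (invented for illustration); each pair is (word, POS tag).
TAGGED = [
    ("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"),
    ("the", "DET"), ("dog", "NOUN"), ("ran", "VERB"),
    ("a", "DET"), ("cat", "NOUN"), ("ran", "VERB"),
]

def train_unigram_tagger(tagged):
    """For each word, remember its most frequent tag in the training data."""
    counts = defaultdict(Counter)
    for word, tag_ in tagged:
        counts[word][tag_] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(model, tokens, default="NOUN"):
    """Tag each token with its most frequent training tag, else a default."""
    return [(t, model.get(t, default)) for t in tokens]

model = train_unigram_tagger(TAGGED)
print(tag(model, ["the", "dog", "sat"]))  # [('the', 'DET'), ('dog', 'NOUN'), ('sat', 'VERB')]
```

NLTK ships this baseline as `nltk.UnigramTagger`; real taggers add context and backoff on top of it.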
- 2017.01 - 2017.09 Data scientist for an automobile company
- Knowledge transfer
- Product owner for a WebApp to improve data quality (R, Shiny); successfully released
- Data visualization (Qliksense)
- Reverse engineering of existing Visual Basic code
- POC for task automation to reduce enterprise costs
- 2017.03 - 2017.06 In-house projects
- Used Kafka to process real-time data streams
- Forwarded Kafka's output to Spark / Spark ML for further processing: detecting data features, running statistical analyses, and developing predictive models
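The statistical-analysis step of such a streaming pipeline can be illustrated with Welford's online algorithm, which maintains a running mean and variance over an unbounded stream without storing it. This is a minimal pure-Python sketch standing in for the Spark ML job; the sample values are invented:

```python
class RunningStats:
    """Welford's online algorithm: incremental mean and variance of a stream."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / self.n if self.n else 0.0

stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:  # simulated stream messages
    stats.update(value)
print(round(stats.mean, 2), round(stats.variance, 2))  # 5.0 4.0
```

In the real pipeline the loop body would consume messages from a Kafka topic instead of a hard-coded list; the numerics are identical.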
- 2016.08 - 2017.01 Data scientist in the Data-Analytics Group of metaFinanz GmbH
- Specialized in web-mining and text-analytics
- Crawled websites and applied text-analytics techniques to extract information
- Built dashboards using Shiny
- Natural language processing for German: POS tagging and stemming based on statistical inference; topic and sentiment analysis of news articles
- Project experience with IBM Watson: technical consulting on the architecture and principles of IBM Watson regarding the construction of a domain-specific ontology graph and Bayesian inference. The use case was automatic diagnostics in healthcare systems.
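The Bayesian-inference part of such diagnostics boils down to Bayes' rule; a minimal worked example for a hypothetical diagnostic test (all probabilities invented for illustration):

```python
# Bayes' rule: P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
p_disease = 0.01             # prior prevalence (invented)
p_pos_given_disease = 0.95   # sensitivity (invented)
p_pos_given_healthy = 0.05   # false-positive rate (invented)

# Total probability of a positive test, over diseased and healthy patients:
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # 0.161
```

Even with a sensitive test, a low prior prevalence keeps the posterior modest, which is exactly the kind of reasoning an ontology-backed diagnostic system has to encode.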
- 2015.01 - 2016.07 Software architect and developer for a personal information-retrieval project
- Information retrieval and knowledge discovery with NLP and text-mining techniques
- Data source: scientific papers (20+ million papers)
- Implemented in Python
- Activities:
- crawled online databases of scientific papers and online English dictionaries
- constructed a method to identify concepts from raw text
- calculated semantic and lexical statistics of the concepts
- categorized concepts using statistical inference
- developed a long-/short-term-memory mechanism to find related concepts, thereby obtaining a knowledge network
- classified papers by topic
- Expertise: text mining, natural language processing (NLP), statistical modeling, semantic search, regular expressions
- Demo: http://munich-datageeks.de/2016/09/13/kun-lu-text-mining-on-academic-publications/
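The related-concept step above can be approximated with plain co-occurrence statistics. This sketch scores concept pairs with pointwise mutual information (PMI) over toy documents; it is a simple stand-in for the memory mechanism described above, and the corpus is invented:

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus of tokenized "papers" (invented for illustration).
DOCS = [
    ["neural", "network", "training"],
    ["neural", "network", "inference"],
    ["graph", "ontology", "inference"],
    ["graph", "ontology", "network"],
]

def pmi_scores(docs):
    """PMI between concept pairs, based on document-level co-occurrence."""
    n = len(docs)
    df = Counter()       # document frequency per concept
    pair_df = Counter()  # document frequency per (sorted) concept pair
    for doc in docs:
        concepts = sorted(set(doc))
        df.update(concepts)
        pair_df.update(combinations(concepts, 2))
    return {
        (a, b): math.log((cnt / n) / ((df[a] / n) * (df[b] / n)))
        for (a, b), cnt in pair_df.items()
    }

scores = pmi_scores(DOCS)
# Concepts that co-occur more often than chance predicts get positive PMI:
print(round(scores[("network", "neural")], 3))  # 0.288
```

Pairs with high positive PMI become edges of the knowledge network; pairs at or below zero co-occur no more often than chance and are dropped.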
- 2015.06 - 2015.12 Big-Data engineer (Professional Consultant), SHS Viveon
- Member of the Data-Warehousing Team specialized in Big-Data
- Participated in various internal and external big-data projects:
- integrated different data sources with Spark and Spark SQL in order to improve the ETL process
- built dashboards with Elasticsearch and Kibana
- parsed SAS files with Logstash
- extracted data from contract photos (written in Scala)
- optimized data models for business reporting
- Tools/techniques: Spark, Scala, Elasticsearch, Logstash, Hadoop, MySQL, Microsoft SQL Server, SSAS, regular expressions
Willingness to travel
Available in the following countries
Germany, Austria, and Switzerland
- Within Germany:
- Flexible
- 1 day per week remote desired, but not required