Objectives

With a strong foundation in semantic web technologies, including SPARQL, OWL, and RDF, honed through a knowledge graph course and practical experience in my master's thesis on knowledge graph entity type prediction, I aim to deepen my expertise at ISWS. My background, further enriched by constructing databases with DBpedia and Wikidata, positions me to contribute significantly to ISWS while enhancing my research skills for my ongoing PhD in knowledge graphs and language models.

Work Experience

R&D NLU Language Specialist

10.2020 - 09.2022
Cerence, Remote

- Processed unstructured data (intent and slot annotation, syntax and discourse correction and diversification, ASR annotation) used for dialogue system training and testing, and analyzed errors to improve intent and slot recognition accuracy by 10%.

- Significantly increased the speed of annotating new data by developing an annotation tool with Windows Presentation Foundation (WPF). The tool assembles multiple reference files on a single page with click-to-link and inserts the required tags through clickable use cases, replacing manual typing of tags.

Marketing Data Analyst - Intern

07.2019 - 08.2019
TINNO Mobile/Wiko Mobile, China

- Laid the groundwork for marketing departments to move from purely manual collection of competitors' product and sales information to real-time automated scraping by handing over a set of Python data-scraping code and workflows.

Research Analyst - Intern

12.2016 - 06.2017
E-Research & Solutions, Macao

- Analyzed social media text data to gauge public opinion, emphasizing the enhancement of granularity in sentiment analysis model classifications.

- Wrote opinion analysis reports for delivery to clients and for participation in competitions, and wrote "Data Mining" brochures.

Languages

- Mandarin (Native), Cantonese (Native)

- English (C1), German (A2)

Project Experience

Applied Contrastive Learning to Fine-grained Entity Type Classification

07.2021 - 03.2023

- Master's thesis

- Utilized SPARQL to extract artificial product entity data from DBpedia, and associated types and properties from Wikidata.

- Evaluated and enhanced performance using models including CNN, LSTM, BERT, ALBERT, and RoBERTa through Contrastive Learning.

Deep Learning Graph-based Dependency Parser

12.2020 - 03.2021

- Used the Chu-Liu-Edmonds algorithm as the baseline, with a Bi-LSTM model and Deep Biaffine Attention as the improved model. Achieved UAS 87.1% and LAS 85% on the English dataset, and UAS 85.9% and LAS 81.9% on the German dataset.

Predicting Author's Personality by Text

07.2020 - 09.2020

- Crawled user comments and users' self-tagged MBTI personality types from personality type forums, collecting a total of 22,422 texts.

- Used four binary Bi-LSTM models in place of a single 16-class model to achieve an average accuracy of 20%.

- Using BERT as the pre-trained model, obtained an F1 score of 31% for the 16-class model and 70% for the four binary classification models.

Optimizing Reinforcement Learning Policies with Emotion Signals

04.2020 - 07.2020

- Optimized RL policies by capturing emotion signals from the log files of rule-based simulator-system conversations in the dialogue system.

Twitter Sentiment Analysis with Ordered Neuron LSTM

04.2019 - 07.2019

- Built eight classification models using Naïve Bayes and Ordered Neuron LSTM, with the Stance Sentiment Emotion Corpus (SSEC) as training and testing data, obtaining an average accuracy of 60%.

Language Learning Chatbot

12.2018 - 01.2019

- Implemented a web application with Django (HTML, CSS, JS, Python) that allows users to learn basic vocabulary in Arabic, Chinese and German.

Programs

AITalents Competition - Empowering Business with AI Technology

11.2020 - 01.2021
TechQuartier

- Collaborated with the AI company SONEAN on a real-time platform that uses AI technology to improve the efficiency of machine supply chain operations.

- Proposed a global intelligent business system solution for building partnership networks and ESG indices for each entity:

- Established real-time crawling of business news and other news, with real-time tagging of environmental, social, and governance (ESG) signal indices and sentiment indices for each company entity, to help buyers select suppliers.
- Used Docker and Neo4j to display company network graphs on the web, showing competitors' partnership networks and suppliers' various indices in real time.

- Personal responsibilities:

- Wrote Python code for crawling news and business reports and for building a sentiment analysis system.

- Built graphs and wrote Cypher code for Neo4j queries.

- Created an animated pitch video.

IBM Female Mentoring Program

04.2020 - 12.2020
IBM & University of Stuttgart

- A nine-month training program for twelve selected female students in computer science-related disciplines on topics including AI, data science and technology consulting.

Other Skills

- Graphic Design, Painting (>20 years), Presentation (Demo video).

- News Interview (5 years).