👤 About Me

Wan Yee Ki, Kenneth

  • 📍 Based in Hong Kong
  • 🎓 MSc in Computing (AI & ML) from Imperial College London - Graduated with Distinction
  • 🎓 First-class Honors in Computer Science from The Chinese University of Hong Kong
  • 💼 Strong in database and information systems
  • 🔢 Proficient in mathematics, statistics, and data analytics
  • 💻 Experienced in data engineering, deep learning, and web development
  • 🤖 Specialized in LLM-based automation and operational optimization techniques
Job Interests: Data Scientist, Data Engineer, Data Analyst, AI Engineer

💼 Experience

Associate Data Scientist - OOCL, Hong Kong
September 2024 – Present
Project 1: LLM-Based Workflow Automation
  • Developed and deployed a Large Language Model (LLM) solution to parse unstructured data from emails, images, and tables, incorporating preprocessing pipelines to extract key information, automate database queries, and categorize content
  • Implemented an embedding-based Retrieval-Augmented Generation (RAG) pipeline with Maximum Marginal Relevance (MMR) to enhance LLM accuracy by retrieving relevant and diverse historical samples, adapting to varied input patterns
  • Used LangChain for scalable LLM application development and Langfuse for performance monitoring
Project 2: Copilot for Business Strategy and Knowledge Management
  • Built a LangGraph-based LLM pipeline to extract and organize domain-specific knowledge from presentation slides
  • Utilized GraphRAG to structure extracted knowledge, enhancing LLM outputs with relevant, context-specific information
  • Designed a standalone knowledge base system with a Streamlit-based UI that lets business users insert, update, and delete knowledge entries, backed by a PostgreSQL database with pgvector for efficient embedding storage and an MCP server for querying relevant data
Project 3: Operational Optimization
  • Developed linear programming models to optimize operational planning processes, improving efficiency through data-driven decision-making
  • Maintained automated data preprocessing pipelines to clean and prepare historical data for analytical tasks
  • Built evaluation pipelines to assess and refine planning strategies
  • Applied stochastic optimization techniques to address uncertainty in operational data, incorporating hard and soft constraints with dynamic penalty mechanisms to model user behavior accurately
Research Intern (AI Music) - Huawei, Hong Kong
February 2023 – July 2023
  • Conducted in-depth research on music generation and music-related classification tasks
  • Developed deep learning models for music classification using Stochastic Weight Averaging, mixup, and gradient clipping, surpassing state-of-the-art benchmarks
  • Contributed to the development of content generation
QA Engineer Intern - Viu (PCCW Media), Hong Kong
June 2021 – September 2021

Role focused mainly on data engineering

  • Conducted quality assurance (QA) on mobile applications and web pages to ensure data accuracy and completeness
  • Tracked and processed data using tools such as dbt, Databricks, DBeaver, and Spark
  • Queried data from Redshift and S3 to extract insights and create reports
  • Built interactive Tableau dashboards to visualize and analyze data
  • Used Atlassian tools (Jira, Confluence, and Bitbucket) for project management and collaboration

🎓 Education

Master of Science, Computing (Artificial Intelligence and Machine Learning)
Imperial College London, 2023-2024

- Graduated with Distinction (September 2024)
- MSc Project: Data mining medical records of cannabis therapy in the UK

  • Term 1:
    • Introduction to Machine Learning
    • Mathematics for Machine Learning
    • Computational Finance
    • Computer Vision
    • Prolog
  • Term 2:
    • Machine Learning for Imaging
    • Natural Language Processing
    • Software Engineering for Machine Learning Systems
    • Deep Graph-Based Learning
Bachelor of Science, Computer Science (Stream: Database and Information Systems)
The Chinese University of Hong Kong, 2019-2023

- Graduated with First Class Honors (GPA: 3.595 / 4.000)
- Minor in Data Analytics and Informatics
- Dean’s List for Academic Excellence: 2021-2022, 2022-2023
- Final Year Project: Music Chord Detection

  • Highlighted Math Courses:
    • Calculus for Engineers
    • Linear Algebra for Engineers
    • Multivariable Calculus for Engineers
    • Games and Strategic Thinking
    • Discrete Mathematics for Engineers
    • Probability for Engineers
    • Statistics for Engineers
  • Highlighted Computer Science Courses:
    • Fundamentals of Machine Learning
    • Fundamentals of Artificial Intelligence
    • Introduction to Database Systems
    • Introduction to Operating Systems
    • Introduction to Data Science
    • E-Commerce Data Mining

🛠 Skills

Programming Languages: Python (Advanced), JavaScript, SQL, Prolog, C
Python Ecosystem
ML/AI: PyTorch, TensorFlow, scikit-learn, LangChain, LangGraph, Hugging Face Transformers
Data & Analytics: NumPy, Pandas, PySpark, Matplotlib, Seaborn, Librosa
Web & UI: Streamlit, Selenium, BeautifulSoup
AI/ML Specialization
LLM & NLP: Retrieval-Augmented Generation (RAG), GraphRAG, Vector Embeddings
Optimization: Linear Programming, Stochastic Optimization
Tools: Langfuse, MCP (Model Context Protocol)
Data Engineering & Infrastructure
Databases: PostgreSQL, pgvector, Amazon Redshift, MongoDB, MySQL
Platforms: Databricks, Apache Spark, dbt, AWS S3
Tools: DBeaver, Tableau
Development & Collaboration
Web: Node.js, Express.js, React.js, Bootstrap, HTML/CSS
DevOps: Git, Atlassian Suite (Jira, Confluence, Bitbucket)
Languages: English (Fluent), Chinese (Native)

🚀 GitHub Repositories

Virus

A Discord bot developed with discord.py

NightBlue

A dark theme for VS Code that supports various programming languages

Music Genre Classification

My first deep learning project - music genre classification using CNNs.

LeetCode_Solution

My solutions to LeetCode problems in Python, C, and SQL.

Codewars_Solution

My solutions to Codewars problems in Python, C, and Java.

comfyui_controlnet_preprocessors

Adds ControlNet preprocessor nodes to ComfyUI