Projects

Selected projects that highlight data engineering, ML experimentation, and applied AI systems.

Automatic KPI Interpretation with Multimodal LLMs

Automatic KPI Interpretation with Multimodal LLMs

Interpreting business KPI dashboards with multimodal LLMs inside KNIME.

  • - Built a modular LLM workflow to explain KPI shifts with structured summaries.
  • - Highlighted anomalies and trends using multimodal dashboard inputs.
  • - Delivered consistent, human-readable explanations for stakeholders.
GenAIKNIMELLMAnalytics
Semantic Segmentation on CamVid

Semantic Segmentation on CamVid

From-scratch U-Net-inspired model compared with MobileNetV2 and DeepLab via transfer learning.

  • - Implemented a full training and evaluation pipeline with reproducible experiments.
  • - Benchmarked transfer learning variants and tracked class-level metrics.
  • - Packaged data preprocessing, augmentation, and reporting into one workflow.
Computer VisionSegmentationTransfer Learning
Vision-Language-Action (VLA) Models for Robotics

Vision-Language-Action (VLA) Models for Robotics

Multimodal agents for robotic pick-and-place following prompt-specified features.

  • - Research activity (band 25CE114) with currently private repositories (paper under submission).
  • - Benchmarked teleoperation interfaces (with 1-to-1 kynematic map) to train Imitation Learning models.
  • - Experimented extensions of ACT model to support multi-tasking.
RoboticsVLAMultimodalDatasetEvaluation
Emotion Analysis of Stock Tweets

Emotion Analysis of Stock Tweets

Multi-class emotion classification and topic modeling in stock market tweets.

  • - Emotion classification using TF-IDF, Word2Vec, and contextual embeddings (BERTweet, Distil-RoBERTa).
  • - Topic modeling with LDA and BERTopic.
  • - Integration of emoji features and sentiment lexicons (VADER, NRC, Bing Liu).
NLPSentiment AnalysisStock Market
LLM Belief Bias Evaluation

LLM Belief Bias Evaluation

Evaluating belief bias in (local) large language models using syllogistic reasoning tasks.

  • - Investigated belief bias in (local) large language models using syllogistic reasoning tasks.
  • - Designed a questionnaire to measure the influence of prior beliefs on logical reasoning performance.
  • - Benchmarked LLaMa3.2:1b, Mistral:7b and Qwen3:8b.
NLPLLM EvaluationCognitive Science
Hourly Time Series Traffic Forecasting

Hourly Time Series Traffic Forecasting

Forecasting hourly traffic congestion indicator using classical and deep learning models.

  • - Compared SARIMA, UCM and deep learning models in the forecast of a hourly time series.
  • - Engineered time-based and lag features to enhance model performance.
  • - Evaluated models using MAE metrics.
Time SeriesForecastingDeep Learning
Jailbreak Game with LLMs

Jailbreak Game with LLMs

A text-based jailbreak game where players try to make a LLM reveal a secret.

  • - Developed a text-based jailbreak game using local LLMs.
  • - Implemented difficult levels, with instruction prompt depending on the level.
  • - Created UI for the user to chat with the model and test if they found the secret.
NLPLLMGame Development
Benchmarking Portfolio Optimization Techniques

Benchmarking Portfolio Optimization Techniques

A benchmark with multiple optimization strategies on real historical stock data from the Nasdaq.

  • - Evaluated portfolio optimization techniques using historical stock data.
  • - Explored the Markowitz Optimization and LSTM-based forecasting.
  • - Created a benchmarking framework for portfolio optimization.
Portfolio OptimizationFinanceDeep Learning
Abstractive Summarization on XSum Dataset

Abstractive Summarization on XSum Dataset

A benchmark with multiple models and NLP strategies applied to the XSum dataset.

  • - Compared summarization performance across GRU, attention baselines, transformer models (T5-small, Flan-T5-base) and local LLMs.
  • - Evaluate zero-shot, one-shot and few-shot prompting strategies, and also experimenting PEFT methods (LoRA, Prefix Tuning, ...).
  • - Analyze outputs with ROUGE and BERTScore, and used explainability techniques to interpret models focus.
Abstractive SummarizationNLPDeep Learning
MATLAB Modelling of a Chopper Preamplifier for Proton Sound Detectors (BSc Thesis)

MATLAB Modelling of a Chopper Preamplifier for Proton Sound Detectors (BSc Thesis)

Bachelor’s thesis: modelling and analyzing a chopper-stabilized preamplifier chain for proton sound detector readout in MATLAB.

  • - Developed a MATLAB model of the chopper preamplifier signal chain to study noise, gain, and stability trade-offs.
  • - Simulated key operating conditions and design parameters to evaluate performance and guide design choices.
  • - Documented the modelling approach, assumptions, and results in a reproducible thesis workflow and codebase.
MATLABSignal ProcessingAnalog ElectronicsModeling
Lex-RAG Lab

Lex-RAG Lab

Quick demo of Lex-RAG pipelines for document retrieval and question answering (currently no vector store, but BM25).

  • - Implemented retrieval pipelines using lexical search (e.g., BM25).
  • - Built an end-to-end workflow for chunking, indexing, querying, and answer generation over custom documents.
  • - More to be done on retrieval quality and prompting strategies to improve grounded, citation-aware responses.
RAGInformation RetrievalLLMNLP
Text2Topic: Multi-Label Topic Classification (Data Science Lab)

Text2Topic: Multi-Label Topic Classification (Data Science Lab)

Multi-label topic classification to automate tagging and improve content organization for Doppiozero’s CMS, benchmarked across classical ML, pretrained embeddings, and LLM approaches.

  • - Explored and cleaned a large-scale dataset (~200k articles) and defined a practical labeling policy (≤3 labels/article) using probability thresholds to reduce noisy tags.
  • - Addressed class imbalance with targeted text augmentation (synonym replacement, back-translation, sentence shuffling, contextual BERT augmentation) to build a balanced training set.
  • - Benchmarked Random Forest (TF-IDF), Universal Sentence Encoder embeddings, BART zero-shot, a custom neural model, and prompted LLMs; USE emerged as the strongest overall traditional approach in the comparison.
NLPMulti-Label ClassificationText EmbeddingsBenchmarking
KPIs-Datawarehouse

KPIs-Datawarehouse

A KNIME workflow to automate the computation and storage of 6 key business KPIs into a SQLite database.

  • - Built an automated ETL pipeline to extract 6 business KPIs for various years and save them into a structured SQLite database.
  • - Implemented modular logic where each KPI is computed in a dedicated sub-workflow before being aggregated.
  • - Configured dynamic database interactions using DB Writer for initial setup and DB Insert for incremental year-by-year updates.
  • - Optimized the storage architecture to serve as the backend for an interactive 6 KPIs Data App dashboard.
KNIMEFinanceETLSQLiteAnalytics
Abstractive SummarizationAnalog ElectronicsAnalyticsBenchmarkingCognitive ScienceComputer VisionDatasetDeep LearningETLEvaluationFinanceForecastingGame DevelopmentGenAIInformation RetrievalKNIMELLMLLM EvaluationMATLABModelingMulti-Label ClassificationMultimodalNLPPortfolio OptimizationRAGRoboticsSQLiteSegmentationSentiment AnalysisSignal ProcessingStock MarketText EmbeddingsTime SeriesTransfer LearningVLA