Keshav Singh

Senior Software Engineer & Data Scientist

Zurich, Switzerland

About Me

Passionate software engineer with more than 5 years of experience in SQL and Spark query optimization, machine learning, generative AI, and the design, development, and debugging of database system features. Skilled in C++, SQL, Python, PyTorch, and various ML libraries. Excellent problem-solving, research, and collaboration abilities. Seeking an impactful software engineering role at the intersection of query processing, AI, and machine learning.

Technical Skills

Programming Languages and tools
C++, SQL, Python, Java, Git, Jenkins, Shell, CMake
ML Frameworks
PyTorch, Keras, Scikit-learn, LangChain, LangGraph, Spark-ML, CrewAI
Data tools
Palantir Foundry, Databricks, Oracle DB, MySQL HeatWave, Airflow, Dataiku, Neo4j, MS SQL Server, AWS Aurora

Professional Experience

Senior Member of Technical Staff

Oracle MySQL HeatWave Optimizer Team
May 2022 – Present

Zurich, Switzerland


  • Enhancing the MySQL HeatWave optimizer with machine learning and algorithmic techniques, yielding performance gains of up to 27x on analytics database benchmarks (JOB, TPC-H, TPC-DS). Languages used: C++, Python, Java, and Shell.
  • Developing vector processing and vector embedding features for RAG as part of MySQL HeatWave Generative AI support. Implementing an HNSW vector index in C++ to accelerate similarity search queries.
  • Developing a Chain-of-Thought reasoning feature over relational data for the MySQL-AI Enterprise release, using LangChain.
  • Benchmarking and optimizing MySQL HeatWave query processing performance on the latest ARM architecture. Optimizations included process NUMA locality, L3 cache usage, and instruction prefetching.
  • Performing peer code reviews and feature design reviews.
  • Filed 2 US patent applications (currently pending) on applying machine learning methods to query optimization.
  • Developing and deploying machine learning models for cardinality estimation and query optimization decisions.
  • Testing infrastructure enhancements using Git, Jenkins, Java and Python.
  • Mentoring interns and new members of the query processing team.

Data Scientist Consultant

Unit8
Feb 2021 – April 2022

Lausanne, Switzerland


  • Developed scalable data pipelines (up to a billion records) and a robust data ontology for Swiss Re using PySpark, Elasticsearch, Palantir Foundry, and MS SQL Server, powering dashboards used by the CEO, CFO, MDs, and thousands of analysts.
  • Developed AI and ML models for various client use cases, such as pricing models for reinsurance contracts using hierarchical encoding and XGBoost.
  • Improved Anti-Money Laundering models at Credit Suisse using graph machine learning and feature extraction on knowledge graphs built on Neo4j via DeepWalk and Node2Vec.
  • Developed a Model Catalog for the AI Center of Excellence at Swiss Re, leveraging Palantir Model auto-registry and a Slate dashboard. The dashboard served as the main hub for model governance and approvals.
  • Designed and reviewed the data platform architecture for Julius Baer, identifying current and potential future scalability, usability, and reliability issues and recommending the architecture changes needed. The final solution leverages Apache Calcite, Hive, Iceberg, and Dataiku.

Data Science Intern

Credit Suisse
March 2020 – Aug 2020

Lausanne, Switzerland


  • Developed and deployed graph machine learning models for financial compliance and fraud detection using Python, Palantir Foundry, PySpark, and Neo4j.

Deep Learning Engineer (Part Time)

Siemens Healthineers
July 2019 – Feb 2020

Lausanne, Switzerland


  • Research and development of an ML-based pipeline for calculating mean upper cervical cord area from MRI images. Developed an end-to-end pipeline combining image segmentation and classification with an InceptionResNet model and a custom loss function based on Dice and focal losses, achieving a high IoU score in the final output.
  • Summer Deep Learning Research Intern (July to September 2019): hyperparameter optimization of deep learning models aimed at better medical diagnosis from MRI images.
  • The project was integrated into the Siemens pipeline and improved a clinical segmentation network by 10%.

Quantitative Developer

CIGP
March 2018 – Feb 2019

Hong Kong, Geneva, Switzerland


  • Joined the Asset Management team as a full-time intern from March to September, then continued part time until February, researching and implementing: (1) efficient frontier optimization of portfolios using statistical optimization methods on market pricing data; (2) deep sentiment analysis of large-cap US equities using a news corpus; (3) deep Q reinforcement learning for optimal portfolio allocation.

Education

M.S. in Communication Systems
EPFL

Aug 2017 – Feb 2021
Lausanne, Switzerland
5.45/6 GPA

B.S. in Electrical Engineering and Computer Science
City University of Hong Kong

Aug 2013 – May 2017
Hong Kong
4.1/4.3 GPA

Notable Projects

NxTreasury
Technical Advisor (CTO as a service) • September 2024 – Present

Web development of https://www.nxtreasury.com using a Bootstrap and ReactJS frontend and a Python Flask backend. Integration with the Ethereum blockchain (web3) for transaction execution. Architecture setup and DevOps deployment using GitHub Actions and Azure Web Apps. Fine-tuning an LLM on AWS for financial contract management, with CoT reasoning using LangChain. Deploying a fleet of agents for transaction execution and risk screening using CrewAI and LangGraph. Applied for and was awarded the Azure Startup grant, NVIDIA startup grant, and AWS Activate grant. Watch the demo video at https://www.youtube.com/watch?v=9w6u0tVnlAs

Python, vLLM, CrewAI, LangGraph
Metoo Analysis
Student project • 2018

Big data analysis of the MeToo movement on Twitter using Spark.

Python, Spark
ChatBot
Student project • 2018

Building a chatbot using a sequence-to-sequence ML model implemented with a Transformer.

Python, LLMs
AggMo
Student project • 2020

Implementing a machine learning framework in PyTorch with the custom AggMo optimizer.

Python, PyTorch
NSM operators
Student project • 2019

Implementing NSM and DSM database storage systems and their associated operators in Java.

Java
Cube operators
Student project • 2020

Implementing the Cube operator and similarity join operations using Spark in Scala, designed to operate efficiently on huge datasets.

Java, Scala
Vote Graphs
Student project • 2019

Using transductive learning and graph signal processing to predict the votes of US senators.

Python, Graphs

Publications

MGSAT: Multilayer Graph Self-Attention Transformer

Authors: Keshav Singh, Mireille El Gheche, Pascal Frossard

2024 • ResearchGate

Detection of resistive open and short defects in FDSOI under delay-based test: Optimal VDD and body biasing conditions

Authors: Amit Karel, Florence Azais, Keshav Singh

2017 • 2017 22nd IEEE European Test Symposium

Get In Touch

I'm always interested in discussing new opportunities, innovative projects, and collaborations in the field of software engineering and data science.