Siddhartha Gairola

Snowmass Mountain, Snowmass Village, CO, USA (photo taken by me in March 2020).

I am an ELLIS PhD student advised by Prof. Bernt Schiele (Max Planck Institute for Informatics, Saarbrücken) and Prof. Francesco Locatello (Institute of Science and Technology Austria, ISTA). During my PhD I will be affiliated with both the Max Planck Institute for Informatics and ISTA.

Previously, I was a Research Fellow at Microsoft Research (MSR), India in the Technology for Emerging Markets group, where I worked on applications of Computer Vision, Image Processing and Machine Learning to developing low-cost diagnostic solutions for healthcare. Before that, I worked as a Research Intern at Adobe Inc. in the Media and Data Science Research Group on image understanding tasks.

I completed my Master's and B.Tech with Honours in May, 2020 at IIIT Hyderabad, where I was advised by Prof. PJ Narayanan. My work was mainly on learning robust unsupervised style representations for image recognition and retrieval tasks.

News

  • Feb 2026 DAVE pre-print on distribution-aware attribution for ViTs released on arXiv.
  • Jan 2025 "How to Probe" accepted at ICLR 2025.
  • 2025 SmartKC++ accepted at WACV 2025.
  • Oct 2024 Invited talk at GMUM Workshop, Jagiellonian University on XAI in Visual Models.
  • Sept 2022 Started PhD at Max Planck Institute for Informatics as part of the ELLIS PhD program.
  • 2022 Auto-retinoscopy and keratoconus classifier papers published at IMWUT and EMBC.
  • 2021 SmartKC published at IMWUT; RespireNet at EMBC 2021.

Research

Current Research: I am broadly interested in Artificial Intelligence (Machine Learning), Computer Vision, Image Processing, Natural Language Processing and their applications to real-world problems. I am particularly interested in building reliable (robust) systems that model visual perception with limited supervision. To this end, for my PhD I will be exploring two major directions:

  1. Interpretability and Robustness of Deep Neural Networks
  2. Learning Powerful (unsupervised) Object-Centric Representations

When I am not working on my research, I like to play the piano and guitar, listen to music, read non-fiction, ride motorcycles, and go for a run or hike. I am also fascinated by paradoxes; you can find some here. (I wish I had Hermione's Time-Turner so I could fit as much into a day as I'd like.) (See more at Personal.)

Computer Vision, Image Processing and Machine Learning

How to Probe: Simple Yet Effective Techniques for Improving Post-hoc Explanations

ICLR

Siddhartha Gairola, Moritz Boehle, Francesco Locatello and Bernt Schiele

International Conference on Learning Representations (ICLR), 2025

We show that training details of a pre-trained model's classification layer (<10% of parameters) play a crucial role in post-hoc explanations, and present simple adjustments that significantly enhance explanation quality across multiple visual pre-training frameworks.

SimPropNet: Improved Similarity Propagation for Few Shot Segmentation

IJCAI

Siddhartha Gairola, Ayush Chopra, Mayur Hemani and Balaji K.

International Joint Conference on Artificial Intelligence (IJCAI), 2020

Improved similarity propagation for better one-shot and few-shot image segmentation.

Unsupervised Image Style Embeddings for Retrieval and Recognition Tasks

WACV

Siddhartha Gairola, Rajvi Shah, P.J. Narayanan

IEEE Winter Conference on Applications of Computer Vision (WACV '20), 2020

An unsupervised protocol for learning a neural embedding of the visual style of images. The protocol uses no categorical labels, relying instead on a proxy measure for finding stylistically similar and dissimilar images.

Applied ML, Vision, HCI for Healthcare

SmartKC++: Improving Performance of Smartphone-Based Corneal Topographers

WACV

Vaibhav Ganatra, Siddhartha Gairola, Nipun Kwatra, Mohit Jain, et al.

IEEE Winter Conference on Applications of Computer Vision (WACV '25), 2025

Improving the SmartKC image processing pipeline, making it more robust and accurate.

Towards Automating Retinoscopy for Refractive Error Diagnosis

IMWUT

Aditya Aggarwal, Siddhartha Gairola, Nipun Kwatra, Mohit Jain, et al.

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), Volume 6, Issue 3, 2022

We automate retinoscopy using a smartphone attached to a retinoscope, achieving 91.0% sensitivity and 0.75D mean error for refractive error screening.

Keratoconus Classifier for Smartphone-based Corneal Topographer

EMBC

Siddhartha Gairola, Nipun Kwatra, Mohit Jain, et al.

IEEE Engineering in Medicine & Biology Society (EMBC), 2022

A dual-head CNN for classifying keratoconus on SmartKC heatmaps, achieving 91.3% sensitivity and 94.2% specificity using a 2-stage transfer learning strategy.

Smartphone based Corneal Topographer

IMWUT

Siddhartha Gairola, Nipun Kwatra and Mohit Jain

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), Volume 5, Issue 4, 2021

SmartKC is a low-cost smartphone-based corneal topographer that enables mass screening for keratoconus.

RespireNet: A Deep Neural Network for Accurately Detecting Abnormal Lung Sounds in Limited Data Setting

EMBC

Siddhartha Gairola, Francis Tom, Nipun Kwatra and Mohit Jain

IEEE Engineering in Medicine & Biology Society (EMBC), 2021

RespireNet is a CNN-based model with several novel techniques: device-specific fine-tuning, concatenation-based augmentation, blank region clipping, and smart padding. We improve upon the state of the art on the ICBHI dataset by 2.2%.

Multimodal Machine Learning: Vision and NLP

Master's Thesis

For my Master's research I was supervised by Prof. P. J. Narayanan at CVIT, IIIT Hyderabad. My research covered two tasks: (1) representation learning for image style search and retrieval, and (2) color-consistent background replacement.

Image Representations for Style Retrieval, Recognition and Background Replacement Tasks

Siddhartha Gairola

Master's Thesis, IIIT Hyderabad, 2020

Work Experience

Microsoft Research, Research Fellow (Aug 2020 - Aug 2022)
Microsoft Research, Research Intern (Jan 2020 - July 2020)
Adobe Inc., Research Intern (Jun 2019 - Jan 2020)

Education

Max Planck Institute for Informatics & Saarland University
Ph.D. Student, Computer Science (Sept. 2022 - present)
International Institute of Information Technology - Hyderabad
Master of Science (MS) by Research, Computer Science (2018-2020)
Bachelor of Technology (BTech) with Honours, Computer Science (2014-2018)
St. Joseph's Academy, Dehradun
Senior Secondary, ISC (2012-2013)
Secondary, ICSE (2010-2011)

Academic Service

Organizer: Computer Vision for Developing Countries Workshop at ACCV 2024, ICCV 2025

Reviewer: ICML 2023-25; NeurIPS 2022-23, 2025; ICLR 2022-24; CVPR 2024-26; ECCV 2024; ICCV 2025; IHCI 2021

Talks and Presentations

Intriguing Applications and Overlooked Pitfalls of XAI in Visual Models
GMUM Workshop, Jagiellonian University, October 2024
RespireNet: A DNN for Accurately Detecting Abnormal Lung Sounds in Limited Data Setting
EMBC 2021
SimPropNet: Improved Similarity Propagation for Few-shot Image Segmentation
IJCAI 2020
Unsupervised Image Style Embeddings for Retrieval and Recognition Tasks
IEEE WACV 2020

Teaching Experience

Saarland University

Worked as a Teaching Assistant at Saarland University for the course listed below. Duties involved setting questions for assignments and examinations, and grading papers.

  1. Elements of Data Science and Artificial Intelligence (Winter 2023, 2024)

IIIT Hyderabad

Worked as a Teaching Assistant at IIIT Hyderabad for the courses listed below. Duties involved conducting regular tutorials, grading papers, setting assignment questions and conducting evaluations.

  1. Digital Logic and Processors (Monsoon 2016)
  2. Artificial Intelligence (Spring 2017)
  3. Digital Image Processing (Monsoon 2017)
  4. Computer Vision (Spring 2018)
  5. Digital Image Processing (Monsoon 2018)
  6. Computer Graphics (Spring 2019)

Useful Resources and Writings

  1. Notes and Resources on How to Do Research as a Young Researcher: a collection I maintain on how to do research, which has proved immensely useful to me (a gold mine).
  2. Resources on writing academic papers.
  3. Resources on reviewing scientific papers.
  4. Reproducibility checklist by Joelle Pineau.
  5. Writings, resources and FAQs on the graduate school (PhD) application process, which I update sporadically.
  6. I sometimes blog on Medium about research, general thoughts and some personal things.

Open-Source Contribution

Google Summer of Code

I contribute actively to two open-source organizations, Scilab and LibreOffice. I have been working with Scilab for the past 3 years. My proposals were selected twice (2017 and 2018) as Google Summer of Code (GSoC) projects.

Google Summer of Code 2017
Project Details: Implemented a C/C++ wrapper for the MATLAB MEX API on top of the current Scilab API.

Google Summer of Code 2018
Project Details: Implemented a demo in C/C++ and Scilab as a working example for the MEX library in Scilab.