Jae Myung Kim

I am a PhD student at the University of Tübingen and a member of the ELLIS and IMPRS-IS programs, advised by Zeynep Akata (TU Munich, Helmholtz Munich) and Cordelia Schmid (Google, Inria). Previously, I completed my B.S. and M.S. degrees at Seoul National University.

I am broadly interested in efficient data-centric approaches. At the moment I'm working on the following questions:

  1. Synthetic data as training data: Exploring how generative models can be leveraged as a robust data source for pretraining or downstream tasks.
  2. Weak alignment: Multi-modal data are typically only weakly aligned. How can we learn better representations from weakly aligned data?
  3. Zero-shot and few-shot learning: How can we better leverage foundation models with access to minimal data?
In addition to these questions, I have previously worked on building reliable models, covering topics such as explainability (XAI), bias, and uncertainty.

Email  /  Google Scholar  /  LinkedIn

profile photo

Selected papers

Synthetic Training Data · Few-shot Learning

DataDream: Few-shot Guided Dataset Generation


Jae Myung Kim*, Jessica Bader*, Stephan Alaniz, Cordelia Schmid, Zeynep Akata
ECCV 2024
arxiv / code

We generate a synthetic dataset guided by few-shot real samples, which more faithfully represents the real data distribution of the targeted classification task.

Explainability

Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models


Nishad Singhi, Jae Myung Kim, Karsten Roth, Zeynep Akata
ECCV 2024
arxiv / code

We learn concept relations to realign concept assignments post-intervention in CBMs. This effectively reduces the number of necessary interventions to reach a target performance.

Zero-shot Learning

Waffling around for Performance: Visual Classification with Random Words and Broad Concepts


Karsten Roth*, Jae Myung Kim*, A. Sophia Koepke, Oriol Vinyals, Cordelia Schmid, Zeynep Akata
ICCV 2023
arxiv / code

We achieve comparable zero-shot CLIP performance without access to external models by using random characters and random word descriptors.

Bias · Synthetic Training Data

Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval


Jae Myung Kim, A. Sophia Koepke, Cordelia Schmid, Zeynep Akata
CVPR Workshop 2023
arxiv

We find that image-text retrieval models commonly memorize spurious correlations in the training data. We introduce a metric that measures a model’s robustness to such spurious correlations, and de-bias the models by fine-tuning them on a controlled synthetic dataset.

Weak Alignment · Explainability

Bridging the Gap between Model Explanations in Partially Annotated Multi-label Classification


Youngwook Kim, Jae Myung Kim, Jieun Jeong, Cordelia Schmid, Zeynep Akata, Jungwoo Lee
CVPR 2023
arxiv / code

We observe that the explanations of two models, one trained with full labels and one with partial labels, highlight similar regions but at different scales. We then propose to boost the attribution scores of the model trained with partial labels so that its explanations resemble those of the model trained with full labels.

Weak Alignment

Large Loss Matters in Weakly Supervised Multi-Label Classification


Youngwook Kim*, Jae Myung Kim*, Zeynep Akata, Jungwoo Lee
CVPR 2022
arxiv / code

We frame the partially labeled setting as a noisy multi-label classification task and observe a memorization effect. We then propose to reject or correct high-loss samples, preventing the memorization of noisy labels.

Explainability

Keep CALM and Improve Visual Feature Attribution


Jae Myung Kim*, Junsuk Choe*, Zeynep Akata, Seong Joon Oh
ICCV 2021
arxiv / code

Class Activation Mapping (CAM) is widely used for visual feature attribution, but its reliance on ad-hoc calibration steps outside the training graph limits its interpretability. We address this issue by introducing a latent variable for cue location, an explanation by itself, in the training graph.


Design and source code from Leonid Keselman's website
Keyword color-coding inspired by Seong Joon Oh's website