I am a Research Scientist at UC Berkeley, working with Prof. Trevor Darrell. I completed my PhD at the Max Planck Institute for Informatics under the supervision of Prof. Bernt Schiele. My research is at the intersection of vision and language. I am interested in a variety of tasks, including image and video description, visual grounding, and visual question answering. Recently, I have been focusing on building explainable models, addressing bias in existing vision and language models, and detecting semantic mismatch in the context of multimodal misinformation.
My old MPII homepage is here.
Also see the web page for our Berkeley group here.
You can reach me via firstname.lastname at berkeley.edu
- 1 paper accepted to EMNLP 2021.
- 1 paper accepted to the NeurIPS 2021 Track Datasets and Benchmarks.
- We are holding another LSMDC this year; submission servers are open!
- I am co-organizing the 4th Workshop on Closing the Loop Between Vision and Language (in conjunction with ICCV 2021). We invite paper submissions!
- I gave a talk at the 2021 VizWiz Grand Challenge Workshop, in conjunction with CVPR 2021.
- I gave a talk at the 2nd Workshop on Advances in Language and Vision Research (ALVR), in conjunction with NAACL 2021.
- Recognized as an Outstanding Reviewer at CVPR 2021.
- 1 paper accepted to ICML 2021.
- I am serving as an Area Chair for ICCV 2021.
- I gave a talk at the 2nd Workshop on Video Turing Test: Toward Human-Level Video Story Understanding, in conjunction with ECCV 2020.
- 1 paper accepted to ECCV 2020.
- I gave talks at the Visual Question Answering and Dialog Workshop and The End-of-End-to-End: A Video Understanding Pentathlon, in conjunction with CVPR 2020.
- 1 paper accepted to CVPR 2020.
- I was recognized as a Best Reviewer at NeurIPS 2019.
- Our work on “Robust Change Captioning” is one of the Best Paper Nominations at ICCV 2019!
- I was recognized as an Outstanding Reviewer at ICCV 2019.
- 2 papers accepted to ICCV 2019, including 1 Oral.
- I co-organized the Workshop on Closing the Loop Between Vision and Language and The Large Scale Movie Description Challenge (LSMDC), at ICCV 2019.
- A short paper accepted to ACL 2019.
- I was recognized as an Outstanding Reviewer at CVPR 2019.
- 1 paper accepted to CVPR 2019 for an Oral presentation.
- I co-organized the Workshop on Fairness Accountability Transparency and Ethics in Computer Vision (at CVPR 2019).
- I was recognized as a Best Reviewer at EMNLP 2018.
- 1 paper accepted to ACCV 2018.
- 1 paper accepted to NeurIPS 2018.
- 1 paper accepted to EMNLP 2018.
- I was recognized as an Outstanding Reviewer at CVPR 2018.
- 2 papers accepted to ECCV 2018.
- 2 papers accepted to CVPR 2018, including one spotlight.
- I am honored to be a recipient of the Otto Hahn Medal for 2017.
Sheng Shen, Liunian Harold Li, Hao Tan, Mohit Bansal, Anna Rohrbach, Kai-Wei Chang, Zhewei Yao, Kurt Keutzer
Medhini Narasimhan, Anna Rohrbach, Trevor Darrell
- DETReg: Unsupervised Pretraining with Region Priors for Object Detection
Amir Bar, Xin Wang, Vadim Kantorov, Colorado J. Reed, Roei Herzig, Gal Chechik, Anna Rohrbach, Trevor Darrell, Amir Globerson
- NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media
Grace Luo, Trevor Darrell, Anna Rohrbach
- Benchmark for Compositional Text-to-Image Synthesis
Dong Huk Park, Samaneh Azadi, Xihui Liu, Trevor Darrell, Anna Rohrbach
NeurIPS 2021 Track Datasets and Benchmarks
- Compositional Video Synthesis with Action Graphs
Amir Bar, Roei Herzig, Xiaolong Wang, Anna Rohrbach, Gal Chechik, Trevor Darrell, Amir Globerson
- Identity-Aware Multi-Sentence Video Description
Jae Sung Park, Trevor Darrell, Anna Rohrbach
- Advisable Learning for Self-driving Vehicles by Internalizing Observation-to-Action Rules
Jinkyu Kim, Suhong Moon, Anna Rohrbach, Trevor Darrell, John Canny
- Language-Conditioned Graph Networks for Relational Reasoning
Ronghang Hu, Anna Rohrbach, Trevor Darrell, Kate Saenko
- Robust Change Captioning
Dong Huk Park, Trevor Darrell, Anna Rohrbach
ICCV 2019, Oral, Best Paper Nomination
- Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation
Ronghang Hu, Daniel Fried, Anna Rohrbach, Dan Klein, Trevor Darrell, Kate Saenko
- Adversarial Inference for Multi-Sentence Video Description
Jae Sung Park, Marcus Rohrbach, Trevor Darrell, Anna Rohrbach
CVPR 2019, Oral
- Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried*, Ronghang Hu*, Volkan Cirik*, Anna Rohrbach, Jacob Andreas, Louis-Philippe Morency, Taylor Berg-Kirkpatrick, Kate Saenko, Dan Klein**, and Trevor Darrell**
NeurIPS 2018, *, ** indicate equal contribution
- Video Object Segmentation with Language Referring Expressions
Anna Khoreva, Anna Rohrbach, and Bernt Schiele
- Object Hallucination in Image Captioning
Anna Rohrbach*, Lisa Anne Hendricks*, Kaylee Burns, Trevor Darrell, and Kate Saenko
EMNLP 2018, * indicates equal contribution
- Women also Snowboard: Overcoming Bias in Captioning Models
Lisa Anne Hendricks*, Kaylee Burns*, Kate Saenko, Trevor Darrell, Anna Rohrbach
ECCV 2018, * indicates equal contribution
- Textual Explanations for Self-Driving Vehicles
Jinkyu Kim, Anna Rohrbach, Trevor Darrell, John Canny, and Zeynep Akata
- Multimodal Explanations: Justifying Decisions and Pointing to the Evidence
Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Anna Rohrbach, Bernt Schiele, Trevor Darrell, and Marcus Rohrbach
CVPR 2018, Spotlight
- Fooling Vision and Language Models Despite Localization and Attention Mechanisms
Xiaojun Xu, Xinyun Chen, Chang Liu, Anna Rohrbach, Trevor Darrell, and Dawn Song
The complete list of publications is available on my Google Scholar profile.