I am a Research Scientist at UC Berkeley, working with Prof. Trevor Darrell. I completed my PhD at the Max Planck Institute for Informatics under the supervision of Prof. Bernt Schiele. My research is at the intersection of vision and language. I am interested in a variety of tasks, including image and video description, visual grounding, and visual question answering. Recently, I have been focusing on building explainable models, addressing bias in existing vision and language models, and detecting semantic mismatch in the context of multimodal misinformation.
Excited to share that I will be joining TU Darmstadt (Germany) as a W3-Professor starting September 2023 (further supported by a €2M LOEWE Start Professorship)! I am looking for prospective PhD students and postdocs. If interested, please reach out via email (see below) and include your latest CV.
I am Ukrainian and I stand with my people against Russian aggression. One-pager with tiny instructions that make a huge difference.
My old MPII homepage is here.
Also see the web page for our UC Berkeley group here.
You can reach me via: contact @ anna-rohrbach.net
News
2023
- I will be joining TU Darmstadt (Germany) as a full W3-Professor starting September 2023!
- I have been awarded a €2M LOEWE Start Professorship from the state of Hesse.
- I am serving as an Area Chair for ICCV 2023.
2022
- Congrats to my team for winning the Ego4D PNR Temporal Localization Challenge 2022, technical report here!
- Recognized as an Outstanding Reviewer at CVPR 2022
- I am serving as an Area Chair for NeurIPS Datasets and Benchmarks 2022
- An open letter from engineers and researchers around the world to IEEE Spectrum: Open Letter: IEEE Spectrum editors apparently fell for Russian propaganda
- Check out our blog post on Accelerating Ukraine Intelligence Analysis with Computer Vision on Synthetic Aperture Radar Imagery
2021
- Recognized as an Outstanding Reviewer at NeurIPS 2021.
- I co-organized the 4th Workshop on Closing the Loop Between Vision and Language (in conjunction with ICCV 2021).
- I gave a talk at the 2021 VizWiz Grand Challenge Workshop, in conjunction with CVPR 2021.
- I gave a talk at the 2nd Workshop on Advances in Language and Vision Research (ALVR), in conjunction with NAACL 2021.
- Recognized as an Outstanding Reviewer at CVPR 2021.
- I am serving as an Area Chair for ICCV 2021.
2020
- I gave a talk at the 2nd Workshop on Video Turing Test: Toward Human-Level Video Story Understanding, in conjunction with ECCV 2020.
- I gave talks at the Visual Question Answering and Dialog Workshop and The End-of-End-to-End: A Video Understanding Pentathlon, in conjunction with CVPR 2020.
2019
- Recognized as a Best Reviewer at NeurIPS 2019.
- Our work on “Robust Change Captioning” received a Best Paper Nomination at ICCV 2019!
- I was recognized as an Outstanding Reviewer at ICCV 2019.
- I co-organized the Workshop on Closing the Loop Between Vision and Language and The Large Scale Movie Description Challenge (LSMDC), at ICCV 2019.
- I was recognized as an Outstanding Reviewer at CVPR 2019.
- I co-organized the Workshop on Fairness Accountability Transparency and Ethics in Computer Vision (at CVPR 2019).
2018
- I was recognized as a Best Reviewer at EMNLP 2018.
- I was recognized as an Outstanding Reviewer at CVPR 2018.
- I am honored to be a recipient of the Otto Hahn Medal for 2017.
Preprints and Technical Reports
- Shape-Guided Diffusion with Inside-Outside Attention
Dong Huk Park, Grace Luo, Clayton Toste, Samaneh Azadi, Xihui Liu, Maka Karalashvili, Anna Rohrbach, Trevor Darrell
- Structured Video Tokens @ Ego4D PNR Temporal Localization Challenge 2022
Elad Ben-Avraham, Roei Herzig, Karttikeya Mangalam, Amir Bar, Anna Rohrbach, Leonid Karlinsky, Trevor Darrell, Amir Globerson
Recent Publications
- MammalNet: A Large-scale Video Benchmark for Mammal Recognition and Behavior Understanding
Jun Chen*, Ming Hu*, Darren Coker, Michael Berumen, Blair Costelloe, Sara Beery, Anna Rohrbach, Mohamed Elhoseiny
CVPR 2023, * indicates equal contribution
-
Lisa Dunlap, Clara Mohri, Devin Guillory, Han Zhang, Trevor Darrell, Joseph E Gonzalez, Aditi Raghunathan, Anna Rohrbach
ICLR 2023, Notable-top-25% (aka Spotlight)
- Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion
Shruti Agarwal, Liwen Hu, Evonne Ng, Trevor Darrell, Hao Li, Anna Rohrbach
WACV 2023
- More Control for Free! Image Synthesis with Semantic Diffusion Guidance
Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell
WACV 2023
- G^3: Geolocation via Guidebook Grounding
Grace Luo*, Giscard Biamby*, Trevor Darrell, Daniel Fried, Anna Rohrbach
Findings of EMNLP 2022, * indicates equal contribution
- Focus! Relevant and Sufficient Context Selection for News Image Captioning
Mingyang Zhou, Grace Luo, Anna Rohrbach, Zhou Yu
Findings of EMNLP 2022
- K-LITE: Learning Transferable Visual Models with External Knowledge
Sheng Shen*, Chunyuan Li*, Xiaowei Hu*, Jianwei Yang, Yujia Xie, Pengchuan Zhang, Zhe Gan, Lijuan Wang, Lu Yuan, Ce Liu, Kurt Keutzer, Trevor Darrell, Anna Rohrbach, Jianfeng Gao
NeurIPS 2022, * indicates equal contribution, Oral
- Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens
Elad Ben-Avraham, Roei Herzig, Karttikeya Mangalam, Amir Bar, Anna Rohrbach, Leonid Karlinsky, Trevor Darrell, Amir Globerson
NeurIPS 2022
- TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency
Medhini Narasimhan, Arsha Nagrani, Chen Sun, Michael Rubinstein, Trevor Darrell*, Anna Rohrbach*, Cordelia Schmid*
ECCV 2022, * indicates equal contribution
- Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly
Spencer Whitehead, Suzanne Petryk, Vedaad Shakib, Joseph Gonzalez, Trevor Darrell, Anna Rohrbach, Marcus Rohrbach
ECCV 2022
- The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning
Jack Hessel, Jena D Hwang, Jae Sung Park, Rowan Zellers, Chandra Bhagavatula, Anna Rohrbach, Kate Saenko, Yejin Choi
ECCV 2022, Oral
- Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal Misinformation
Giscard Biamby, Grace Luo, Trevor Darrell, Anna Rohrbach
NAACL 2022
- Exposing the Limits of Video-Text Models through Contrast Sets
Jae Sung Park, Sheng Shen, Ali Farhadi, Trevor Darrell, Yejin Choi, Anna Rohrbach
NAACL 2022
- On Guiding Visual Attention with Language Specification
Suzanne Petryk, Lisa Dunlap, Keyan Nasseri, Joseph Gonzalez, Trevor Darrell, Anna Rohrbach
CVPR 2022
- Object-Region Video Transformers
Roei Herzig, Elad Ben-Avraham, Karttikeya Mangalam, Amir Bar, Gal Chechik, Anna Rohrbach, Trevor Darrell, Amir Globerson
CVPR 2022
- DETReg: Unsupervised Pretraining with Region Priors for Object Detection
Amir Bar, Xin Wang, Vadim Kantorov, Colorado J. Reed, Roei Herzig, Gal Chechik, Anna Rohrbach, Trevor Darrell, Amir Globerson
CVPR 2022
- ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension
Sanjay Subramanian, William Merrill, Trevor Darrell, Matt Gardner, Sameer Singh, Anna Rohrbach
ACL 2022
-
Sheng Shen, Liunian Harold Li, Hao Tan, Mohit Bansal, Anna Rohrbach, Kai-Wei Chang, Zhewei Yao, Kurt Keutzer
ICLR 2022
-
Medhini Narasimhan, Anna Rohrbach, Trevor Darrell
NeurIPS 2021
- NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media
Grace Luo, Trevor Darrell, Anna Rohrbach
EMNLP 2021, Oral
- Benchmark for Compositional Text-to-Image Synthesis
Dong Huk Park, Samaneh Azadi, Xihui Liu, Trevor Darrell, Anna Rohrbach
NeurIPS 2021 Datasets and Benchmarks Track
- Compositional Video Synthesis with Action Graphs
Amir Bar, Roei Herzig, Xiaolong Wang, Anna Rohrbach, Gal Chechik, Trevor Darrell, Amir Globerson
ICML 2021
- Identity-Aware Multi-Sentence Video Description
Jae Sung Park, Trevor Darrell, Anna Rohrbach
ECCV 2020
- Advisable Learning for Self-driving Vehicles by Internalizing Observation-to-Action Rules
Jinkyu Kim, Suhong Moon, Anna Rohrbach, Trevor Darrell, John Canny
CVPR 2020
- Language-Conditioned Graph Networks for Relational Reasoning
Ronghang Hu, Anna Rohrbach, Trevor Darrell, Kate Saenko
ICCV 2019
- Robust Change Captioning
Dong Huk Park, Trevor Darrell, Anna Rohrbach
ICCV 2019, Oral, Best Paper Nomination
- Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation
Ronghang Hu, Daniel Fried, Anna Rohrbach, Dan Klein, Trevor Darrell, Kate Saenko
ACL 2019
- Adversarial Inference for Multi-Sentence Video Description
Jae Sung Park, Marcus Rohrbach, Trevor Darrell, Anna Rohrbach
CVPR 2019, Oral
- Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried*, Ronghang Hu*, Volkan Cirik*, Anna Rohrbach, Jacob Andreas, Louis-Philippe Morency, Taylor Berg-Kirkpatrick, Kate Saenko, Dan Klein**, and Trevor Darrell**
NeurIPS 2018, *, ** indicate equal contribution
- Video Object Segmentation with Language Referring Expressions
Anna Khoreva, Anna Rohrbach, and Bernt Schiele
ACCV 2018
- Object Hallucination in Image Captioning
Anna Rohrbach*, Lisa Anne Hendricks*, Kaylee Burns, Trevor Darrell, and Kate Saenko
EMNLP 2018, * indicates equal contribution
- Women also Snowboard: Overcoming Bias in Captioning Models
Lisa Anne Hendricks*, Kaylee Burns*, Kate Saenko, Trevor Darrell, Anna Rohrbach
ECCV 2018, * indicates equal contribution
- Textual Explanations for Self-Driving Vehicles
Jinkyu Kim, Anna Rohrbach, Trevor Darrell, John Canny, and Zeynep Akata
ECCV 2018
- Multimodal Explanations: Justifying Decisions and Pointing to the Evidence
Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Anna Rohrbach, Bernt Schiele, Trevor Darrell, and Marcus Rohrbach
CVPR 2018, Spotlight
The complete list of publications is available on my Google Scholar profile.