We analyze the performance of existing VQA models on logically composed questions.
We curate two large-scale datasets, VQA-Compose and VQA-Supplement, that contain logically composed binary questions.
We show that our novel modules enable a model to answer logically composed questions while retaining its performance on VQA data.
VQA models struggle with negation, antonyms, conjunction, and disjunction!
We provide a dataset, a new model, and a detailed analysis!
Our dataset contains questions composed with negation, antonyms, conjunctions, and disjunctions.
Our model learns to identify the type of question and the type of logical connective in the question to aid question answering.
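To make the idea of a logically composed binary question concrete, here is a minimal sketch in Python. It assumes that the answer to a composed yes/no question follows from the answers to its component questions by ordinary Boolean logic. The class `BinaryQuestion`, the helpers `negate`, `conjoin`, and `disjoin`, the symbolic question text, and the `connective` label are all hypothetical names chosen for illustration; they are not the templates or code used to build VQA-Compose or VQA-Supplement. Antonyms are left out because they need a lexical resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryQuestion:
    text: str                 # yes/no question about an image (symbolic form)
    answer: bool              # True means "yes"
    connective: str = "none"  # hypothetical label: "none", "not", "and", "or"


def negate(q: BinaryQuestion) -> BinaryQuestion:
    # Negation flips the answer of the component question.
    return BinaryQuestion(f"not({q.text})", not q.answer, "not")


def conjoin(q1: BinaryQuestion, q2: BinaryQuestion) -> BinaryQuestion:
    # A conjunction is "yes" only if both components are "yes".
    return BinaryQuestion(
        f"({q1.text}) and ({q2.text})", q1.answer and q2.answer, "and"
    )


def disjoin(q1: BinaryQuestion, q2: BinaryQuestion) -> BinaryQuestion:
    # A disjunction is "yes" if at least one component is "yes".
    return BinaryQuestion(
        f"({q1.text}) or ({q2.text})", q1.answer or q2.answer, "or"
    )


# Two illustrative questions about the same (imaginary) image.
q1 = BinaryQuestion("is the man wearing a hat", True)
q2 = BinaryQuestion("is there a dog in the picture", False)

composed = [negate(q1), conjoin(q1, q2), disjoin(q1, negate(q2))]
for q in composed:
    label = "yes" if q.answer else "no"
    print(f"{q.text} -> {label} [outermost connective: {q.connective}]")
```

The `connective` field records only the outermost connective, which mirrors the kind of label a model could be trained to predict alongside the answer.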
Gokhale, T., Banerjee, P., Baral, C., & Yang, Y. (2020). VQA-LOL: Visual Question Answering under the Lens of Logic.
Copyright © Tejas Gokhale, 2020