Medical Visual Question Answering (VQA) has emerged as a promising way to enhance clinical decision-making and patient interactions. Given a medical image and a corresponding question, medical VQA aims to predict an informative answer by reasoning over the visual and textual information. However, datasets with limited samples constrain the generalization of medical VQA models, reducing their accuracy on unseen medical samples. Existing works have tried to solve this problem with meta-learning or self-supervised learning, but they still fail to achieve satisfactory performance on medical VQA when samples are insufficient. To address this problem, we propose multimodal hierarchical knowledge distillation for medical VQA (MHKD-MVQA). As the primary novelty of MHKD-MVQA, we distill knowledge not only from the output layer but also from the intermediate layers, which exploits the knowledge in limited samples to a greater extent. Meanwhile, medical images and questions are embedded in a shared latent space, enabling our model to handle multimodal samples. We evaluate our model on two medical VQA datasets, VQA-Med 2019 and VQA-RAD, where MHKD-MVQA achieves state-of-the-art performance and outperforms baselines by 3.6% and 1.6%, respectively. Extensive experiments also highlight the generalization of knowledge distillation through an analysis of class activation maps on medical images with respect to specific questions.
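The abstract states that knowledge is distilled from both the output and the intermediate layers, but this page gives no implementation details. The following is a minimal PyTorch sketch of what such a hierarchical distillation loss typically looks like, not the authors' code: the function name, the temperature and alpha values, and the assumption that student and teacher features already share shapes (e.g. via projection layers) are all illustrative.

```python
import torch
import torch.nn.functional as F

def hierarchical_kd_loss(student_logits, teacher_logits,
                         student_feats, teacher_feats,
                         temperature=4.0, alpha=0.5):
    """Illustrative hierarchical KD loss: output-level distillation
    plus layer-wise matching of intermediate features.

    student_feats / teacher_feats are lists of intermediate feature
    tensors, assumed to have matching shapes (hypothetical setup,
    e.g. after projection layers).
    """
    # Output-level KD: soften both distributions with a temperature
    # and match them with KL divergence (standard Hinton-style KD).
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    output_loss = F.kl_div(log_soft_student, soft_teacher,
                           reduction="batchmean") * temperature ** 2

    # Intermediate-layer KD: penalize the distance between hidden
    # representations layer by layer.
    feature_loss = sum(F.mse_loss(s, t)
                       for s, t in zip(student_feats, teacher_feats))

    # Weighted combination of the two distillation terms.
    return alpha * output_loss + (1 - alpha) * feature_loss
```

In this sketch, distilling the intermediate layers is what makes the scheme "hierarchical": the student is supervised at several depths of the network rather than only at the final prediction, which is the property the abstract credits for making better use of limited training samples.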
Jianfeng Wang, Shuokang Huang, Huifang Du, Yu Qin, Haofen Wang, Wenqiang Zhang (2022). MHKD-MVQA: Multimodal Hierarchical Knowledge Distillation for Medical Visual Question Answering. 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 567-574, DOI: 10.1109/BIBM55620.2022.9995473.
Type: Article
Year: 2022
Authors: 6
Datasets: 0
Total Files: 0
Language: English
Journal: 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
DOI: 10.1109/BIBM55620.2022.9995473