Visual Modality Examples

Bimodal Presentation Speeds up Auditory Processing and Slows Down Visual Processing

Many situations require the simultaneous processing of auditory and visual information; however, stimuli presented to one sensory modality can sometimes interfere with processing in a second sensory ...

GitHub

visual_modality_prompt_for_adapting_vision-language_object_detectors.md

Description: [ICCV 2025][Object Detection][Visual Prompt] This paper proposes ModPrompt, an encoder-decoder-based visual prompting strategy that adapts vision-language object detectors (e.g., ...

IEEE

AVCaps: An Audio-Visual Dataset With Modality-Specific Captions

Abstract: This paper introduces AVCaps, an audio-visual dataset that contains separate textual captions for the audio, visual, and audio-visual contents of video clips. The dataset contains 2061 video ...

Microsoft

Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser

Audio-visual learning has been a major pillar of multi-modal machine learning, where the community has mostly focused on its modality-aligned setting, i.e., the audio and visual modalities are both assumed ...

IEEE

Bidirectional Cross-Modal Collaborative Alignment via Semantic-Guided Visual Embeddings for Partially Relevant Video Retrieval

Abstract: Partially Relevant Video Retrieval (PRVR) aims to retrieve videos that match a given textual query only partially. This task is inherently challenging due to the modality gap between text ...

GitHub

CIMB-MVQA: Causal Intervention on Modality-specific Biases for Medical Visual Question Answering

Medical Visual Question Answering (Med-VQA) aims to combine medical image understanding with clinical language reasoning, enabling automatic answering of natural language questions grounded on medical ...
