Abstract: Visual tracking faces challenges in effectively leveraging long-term video-level contextual information due to error accumulation caused by appearance ...