Vision-Language Adaptive Clustering and Meta-Adaptation for Unsupervised Few-Shot Action Recognition
Abstract: Unsupervised few-shot action recognition is a practical but challenging task in which knowledge learned from unlabeled videos is adapted to novel action classes given only limited labeled data. Large contrastive vision-language models (VLMs) have recently shown promise in skeleton-based action recognition. However, given the lack of skeleton frame-text training datasets for VLMs, ...
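For context, the contrastive VLMs the abstract refers to are trained with a CLIP-style symmetric contrastive objective. The sketch below is background only, not the paper's (truncated) method: a minimal Python/PyTorch illustration of that objective applied to paired skeleton-sequence and text embeddings. All names and the temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(skeleton_emb: torch.Tensor,
                          text_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    skeleton_emb, text_emb: (batch, dim) tensors; row i of each
    forms a positive (skeleton, text) pair. This is the generic
    CLIP-style objective, not the method proposed in the paper.
    """
    # L2-normalize so dot products are cosine similarities
    skeleton_emb = F.normalize(skeleton_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # (batch, batch) similarity matrix, sharpened by temperature
    logits = skeleton_emb @ text_emb.t() / temperature

    # Matching pairs sit on the diagonal
    targets = torch.arange(logits.size(0), device=logits.device)

    # Cross-entropy in both directions: skeleton->text, text->skeleton
    loss_s2t = F.cross_entropy(logits, targets)
    loss_t2s = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_s2t + loss_t2s)

if __name__ == "__main__":
    # Toy usage with random embeddings standing in for encoder outputs
    skel = torch.randn(8, 512)
    text = torch.randn(8, 512)
    print(clip_contrastive_loss(skel, text).item())
```

The symmetric form (averaging both matching directions) is the standard choice for vision-language pretraining; it pulls each skeleton embedding toward its paired text description while pushing it away from the other descriptions in the batch.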