Video_Retrieval
-
Semantic Role Aware Correlation Transformer for Text to Video Retrieval
Burak Satar, Hongyuan Zhu, Xavier Bresson, Joo Hwee Lim
arXiv_CV
Transformer
Embedding
Pose
Relation
Attention
Video_Retrieval
Matching
PDF
-
RoME: Role-aware Mixture-of-Expert Transformer for Text-to-Video Retrieval
Burak Satar, Hongyuan Zhu, Hanwang Zhang, Joo Hwee Lim
arXiv_CV
Transformer
Embedding
Pose
Relation
Attention
Video_Retrieval
PDF
-
SLIC: Self-Supervised Learning with Iterative Clustering for Human Action Videos
Salar Hosseini Khorasgani, Yuxuan Chen, Florian Shkurti
arXiv_CV
Self-Supervised
Pose
Contrastive_Learning
Action
Classification
Image_Classification
Video_Retrieval
PDF
-
LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling
Linjie Li, Zhe Gan, Kevin Lin, Chung-Ching Lin, Zicheng Liu, Ce Liu, Lijuan Wang
arXiv_CV
Video_Caption
Zero-Shot
Face
Few-Shot
Caption
Language_Model
Video_Retrieval
PDF
-
Revealing Single Frame Bias for Video-and-Language Learning
Jie Lei, Tamara L. Berg, Mohit Bansal
arXiv_AI
Recognition
Pose
Action_Recognition
Action
Inference
Language_Model
Video_Retrieval
PDF
-
Revisiting the 'Video' in Video-Language Understanding
Shyamal Buch, Cristóbal Eyzaguirre, Adrien Gaidon, Jiajun Wu, Li Fei-Fei, Juan Carlos Niebles
arXiv_AI
Self-Supervised
Pose
Language_Model
Video_Retrieval
PDF
-
Cross-Architecture Self-supervised Video Representation Learning
Sheng Guo, Zihua Xiong, Yujie Zhong, Limin Wang, Xiaobo Guo, Bing Han, Weilin Huang
arXiv_CV
Transformer
Recognition
3D
Represenation_Learning
Self-Supervised
Contrastive_Learning
Action_Recognition
Action
Video_Retrieval
PDF
-
VRAG: Region Attention Graphs for Content-Based Video Retrieval
Kennard Ng, Ser-Nam Lim, Gim Hee Lee
arXiv_CV
Embedding
Relation
Attention
Recommendation
Video_Retrieval
PDF
-
A CLIP-Hitchhiker's Guide to Long Video Retrieval
Max Bain, Arsha Nagrani, Gül Varol, Andrew Zisserman
arXiv_CV
Embedding
Video_Retrieval
PDF
-
Learning to Retrieve Videos by Asking Questions
Avinash Madasu, Junier Oliva, Gedas Bertasius
arXiv_AI
Pose
Action
Video_Retrieval
PDF
-
TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition
Haodong Duan, Nanxuan Zhao, Kai Chen, Dahua Lin
arXiv_CV
Recognition
Represenation_Learning
Self-Supervised
Action_Recognition
Action
Classification
Video_Retrieval
PDF
-
CenterCLIP: Token Clustering for Efficient Text-Video Retrieval
Shuai Zhao, Linchao Zhu, Xiaohan Wang, Yi Yang
arXiv_CV
Transformer
Action
Relation
Inference
Video_Retrieval
PDF
-
Learn to Understand Negation in Video Retrieval
Ziyue Wang, Aozhu Chen, Fan Hu, Xirong Li
arXiv_CV
Pose
Deep_Learning
Caption
Video_Retrieval
PDF
-
Relevance-based Margin for Contrastively-trained Video Retrieval Models
Alex Falcon, Swathikiran Sudhakaran, Giuseppe Serra, Sergio Escalera, Oswald Lanz
arXiv_CV
Embedding
Pose
GAN
Video_Retrieval
PDF
-
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval
Yuying Ge, Yixiao Ge, Xihui Liu, Alex Jinpeng Wang, Jianping Wu, Ying Shan, Xiaohu Qie, Ping Luo
arXiv_CV
Reconstruction
Recognition
Bert
Zero-Shot
Action_Recognition
Action
Prediction
Video_Retrieval
PDF
-
A Survey of Video-based Action Quality Assessment
Shunli Wang, Dingkang Yang, Peng Zhai, Qing Yu, Tao Suo, Zhan Sun, Ka Li, Lihua Zhang
arXiv_CV
Surveillance
Recognition
Survey
Action_Recognition
Action
Medical
Video_Retrieval
PDF
-
Modality-Balanced Embedding for Video Retrieval
Xun Wang, Bingqing Ke, Xuanping Li, Fangyu Liu, Mingyu Zhang, Xiao Liang, Qiushi Xiao, Yue Yu
arXiv_AI
Embedding
Pose
Attention
Video_Retrieval
Matching
PDF
-
COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval
Haoyu Lu, Nanyi Fei, Yuqi Huo, Yizhao Gao, Zhiwu Lu, Ji-Rong Wen
arXiv_CV
Image_Caption
Pose
Contrastive_Learning
Face
Action
Attention
Inference
Language_Model
Video_Retrieval
PDF
-
Probabilistic Representations for Video Contrastive Learning
Jungin Park, Jiyoung Lee, Ig-Jae Kim, Kwanghoon Sohn
arXiv_CV
Embedding
Recognition
Represenation_Learning
Self-Supervised
Pose
Contrastive_Learning
Action_Recognition
Action
Video_Retrieval
PDF
-
HunYuan_tvr for Text-Video Retrivial
Shaobo Min, Weijie Kong, Rong-Cheng Tu, Dihong Gong, Chengfei Cai, Wenzhe Zhao, Chenyang Liu, Sixiao Zheng, Hongfa Wang, Zhifeng Li, Wei Liu
arXiv_CV
Embedding
Enhancement
Pose
Contrastive_Learning
Action
Relation
Attention
Caption
Activity
Video_Retrieval
PDF
-
Temporal Alignment Networks for Long-term Video
Tengda Han, Weidi Xie, Andrew Zisserman
arXiv_CV
Segmentation
Recognition
Video_Caption
Weakly_Supervised
Zero-Shot
Sparse
Pose
Action_Recognition
Action
Video_Retrieval
PDF
-
ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound
Yan-Bo Lin, Jie Lei, Mohit Bansal, Gedas Bertasius
arXiv_CL
Transformer
Pose
Action
Activity
Video_Retrieval
PDF
-
Learning Audio-Video Modalities from Image Captions
Arsha Nagrani, Paul Hongsuck Seo, Bryan Seybold, Anja Hauth, Santiago Manen, Chen Sun, Cordelia Schmid
arXiv_CV
Image_Caption
Video_Caption
Pose
Caption
Video_Retrieval
Matching
PDF
-
CREATE: A Benchmark for Chinese Short Video Retrieval and Title Generation
Ziqi Zhang, Yuxin Chen, Zongyang Ma, Zhongang Qi, Chunfeng Yuan, Bing Li, Ying Shan, Weiming Hu
arXiv_CV
Video_Caption
Pose
Caption
Video_Retrieval
PDF
-
Controllable Augmentations for Video Representation Learning
Rui Qian, Weiyao Lin, John See, Dian Li
arXiv_CV
Recognition
Represenation_Learning
Self-Supervised
Pose
Contrastive_Learning
Action_Recognition
Action
Relation
Video_Retrieval
PDF
-
X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval
Satya Krishna Gorti, Noel Vouitsis, Junwei Ma, Keyvan Golestan, Maksims Volkovs, Animesh Garg, Guangwei Yu
arXiv_CV
Pose
Attention
Video_Retrieval
PDF
-
FitCLIP: Refining Large-Scale Pretrained Image-Text Models for Zero-Shot Video Understanding Tasks
Santiago Castro, Fabian Caba Heilbron
arXiv_CV
Recognition
Video_Caption
Zero-Shot
Action_Recognition
Action
Video_Retrieval
PDF
-
How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs
Hazel Doughty, Cees G. M. Snoek
arXiv_CV
Recognition
Pose
Action
Video_Retrieval
PDF
-
Learning video retrieval models with relevance-aware online mining
Alex Falcon, Giuseppe Serra, Oswald Lanz
arXiv_CV
Embedding
Pose
Deep_Learning
Attention
Caption
Video_Retrieval
PDF
-
Revitalize Region Feature for Democratizing Video-Language Pre-training
Guanyu Cai, Yixiao Ge, Alex Jinpeng Wang, Rui Yan, Xudong Lin, Ying Shan, Lianghua He, Xiaohu Qie, Jianping Wu, Mike Zheng Shou
arXiv_CV
OCR
Sparse
Regularization
Relation
Video_Retrieval
PDF
-
All in One: Exploring Unified Video-Language Pre-training
Alex Jinpeng Wang, Yixiao Ge, Rui Yan, Yuying Ge, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, Xiaohu Qie, Mike Zheng Shou
arXiv_CV
Transformer
Bert
Represenation_Learning
Language_Model
Video_Retrieval
PDF
-
Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data
Shuyan Zhou, Li Zhang, Yue Yang, Qing Lyu, Pengcheng Yin, Chris Callison-Burch, Graham Neubig
arXiv_CL
Knowledge
Video_Retrieval
PDF
-
Disentangled Representation Learning for Text-Video Retrieval
Qiang Wang, Yanhao Zhang, Yun Zheng, Pan Pan, Xian-Sheng Hua
arXiv_CV
Optimization
Represenation_Learning
Regularization
Pose
Action
Relation
Video_Retrieval
Matching
PDF
-
MDMMT-2: Multidomain Multimodal Transformer for Video Retrieval, One More Step Towards Generalization
Alexander Kunitsyn, Maksim Kalashnikov, Maksim Dzabraev, Andrei Ivaniuta
arXiv_CV
Transformer
Knowledge
Video_Retrieval
PDF
-
Live Laparoscopic Video Retrieval with Compressed Uncertainty
Tong Yu, Pietro Mascagni, Juan Verde, Jacques Marescaux, Didier Mutter, Nicolas Padoy
arXiv_CV
Pose
Medical
Video_Retrieval
PDF
-
VScript: Controllable Script Generation with Audio-Visual Presentation
Ziwei Ji, Yan Xu, I-Tsun Cheng, Samuel Cahyawijaya, Rita Frieske, Etsuko Ishii, Min Zeng, Andrea Madotto, Pascale Fung
arXiv_CL
Speech
Face
Summarization
Video_Retrieval
PDF
-
NEWSKVQA: Knowledge-Aware News Video Question Answering
Pranay Gupta, Manish Gupta
arXiv_CV
Surveillance
Video_Indexing
Knowledge
Pose
VQA
Summarization
Activity
QA
Video_Retrieval
PDF
-
Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval
Jinpeng Wang, Bin Chen, Dongliang Liao, Ziyun Zeng, Gongfu Li, Shu-Tao Xia, Jin Xu
arXiv_CV
Transformer
Embedding
Quantization
Represenation_Learning
Pose
Contrastive_Learning
Attention
Video_Retrieval
PDF
-
Reading-strategy Inspired Visual Representation Learning for Text-to-Video Retrieval
Jianfeng Dong, Yabing Wang, Xianke Chen, Xiaoye Qu, Xirong Li, Yuan He, Xun Wang
arXiv_AI
Represenation_Learning
Review
Pose
Video_Retrieval
PDF
-
End-to-end Generative Pretraining for Multimodal Video Captioning
Paul Hongsuck Seo, Arsha Nagrani, Anurag Arnab, Cordelia Schmid
arXiv_AI
Video_Caption
Speech
Pose
Action
Classification
Caption
QA
Video_Retrieval
PDF
-
Self-supervised Video Representation Learning with Cascade Positive Retrieval
Cheng-En Wu, Farley Lai, Yu Hen Hu, Asim Kadav
arXiv_CV
Recognition
Represenation_Learning
Self-Supervised
Contrastive_Learning
Action_Recognition
Action
Video_Retrieval
PDF
-
BridgeFormer: Bridging Video-text Retrieval with Multiple Choice Questions
Yuying Ge, Yixiao Ge, Xihui Liu, Dian Li, Ying Shan, Xiaohu Qie, Ping Luo
arXiv_CV
Recognition
Zero-Shot
Action_Recognition
Action
Attention
Video_Retrieval
PDF
-
Multi-query Video Retrieval
Zeyu Wang, Yu Wu, Karthik Narasimhan, Olga Russakovsky
arXiv_CV
Pose
Attention
Video_Retrieval
PDF
-
Sign Language Video Retrieval with Free-Form Textual Queries
Amanda Duarte, Samuel Albanie, Xavier Giró-i-Nieto, Gül Varol
arXiv_AI
Embedding
Recognition
Pose
Attention
Video_Retrieval
PDF
-
Sound and Visual Representation Learning with Multiple Pretraining Tasks
Arun Balajee Vasudevan, Dengxin Dai, Luc Van Gool
arXiv_CV
Image_Caption
Super_Resolution
Represenation_Learning
Self-Supervised
Pose
Classification
Detection
Prediction
Video_Retrieval
PDF
-
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C.H. Hoi
arXiv_CV
Transformer
Pose
Action
Detection
Object_Detection
QA
Video_Retrieval
PDF
-
Vision Transformer Based Video Hashing Retrieval for Tracing the Source of Fake Videos
Pengfei Pei, Xianfeng Zhao, Jinchuan Li, Yun Cao, Xiaowei Yi
arXiv_CV
Transformer
Pose
Detection
Video_Retrieval
PDF
-
Self-supervised Spatiotemporal Representation Learning by Exploiting Video Continuity
Hanwen Liang, Niamul Quader, Zhixiang Chi, Lizhe Chen, Peng Dai, Juwei Lu, Yang Wang
arXiv_CV
Recognition
Represenation_Learning
Self-Supervised
Action_Localization
Pose
Action_Recognition
Action
Video_Retrieval
PDF
-
Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
arXiv_CV
Transformer
Recognition
Video_Caption
Zero-Shot
Pose
Action_Recognition
Action
Few-Shot
Language_Model
Video_Retrieval
PDF
-
Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval
Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Hilde Kuehne
arXiv_CV
Transformer
Embedding
Zero-Shot
Action_Localization
Pose
Action
Classification
Attention
Video_Retrieval
PDF
-
STC-mix: Space, Time, Channel mixing for Self-supervised Video Representation
Srijan Das, Michael S. Ryoo
arXiv_CV
Recognition
Represenation_Learning
Knowledge
Self-Supervised
Pose
Action_Recognition
Action
Video_Retrieval
PDF
-
Time-Equivariant Contrastive Video Representation Learning
Simon Jenni, Hailin Jin
arXiv_CV
Recognition
Represenation_Learning
Self-Supervised
Pose
Contrastive_Learning
Action_Recognition
Action
Classification
Video_Retrieval
PDF
-
TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning
Yang Liu, Keze Wang, Lingbo Liu, Haoyuan Lan, Liang Lin
arXiv_CV
Recognition
Represenation_Learning
Knowledge
Self-Supervised
Pose
Contrastive_Learning
Action_Recognition
Action
Relation
Prediction
Video_Retrieval
PDF
-
Lightweight Attentional Feature Fusion for Video Retrieval by Text
Fan Hu, Aozhu Chen, Ziyue Wang, Fangming Zhou, Xirong Li
arXiv_CV
Pose
Relation
Attention
Video_Retrieval
PDF
-
AssistSR: Affordance-centric Question-driven Video Segment Retrieval
Stan Weixian Lei, Yuxuan Wang, Dongxing Mao, Difei Gao, Mike Zheng Shou
arXiv_CV
Video_Retrieval
PDF
-
Video Content Classification using Deep Learning
Pradyumn Patil, Vishwajeet Pawar, Yashraj Pawar, Shruti Pisal
arXiv_AI
RNN
Action
Classification
Deep_Learning
CNN
Video_Retrieval
PDF
-
VIOLET : End-to-End Video-Language Transformers with Masked Visual-token Modeling
Tsu-Jui Fu, Linjie Li, Zhe Gan, Kevin Lin, William Yang Wang, Lijuan Wang, Zicheng Liu
arXiv_CV
Transformer
Video_Caption
Sparse
Video_Retrieval
PDF
-
Florence: A New Foundation Model for Computer Vision
Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, Jianfeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang
arXiv_AI
Image_Caption
Transfer_Learning
Recognition
Zero-Shot
Pose
Action_Recognition
Action
Classification
Detection
VQA
Few-Shot
Object_Detection
Caption
QA
Video_Retrieval
PDF
-
Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions
Hongwei Xue, Tiankai Hang, Yanhong Zeng, Yuchong Sun, Bei Liu, Huan Yang, Jianlong Fu, Baining Guo
arXiv_CV
Transformer
Embedding
Super_Resolution
Zero-Shot
Pose
Action
Video_Retrieval
PDF
-
Induce, Edit, Retrieve: Language Grounded Multimodal Schema for Instructional Video Retrieval
Yue Yang, Joongwon Kim, Artemis Panagopoulou, Mark Yatskar, Chris Callison-Burch
arXiv_CV
Zero-Shot
Pose
Language_Model
Video_Retrieval
PDF
-
SwAMP: Swapped Assignment of Multi-Modal Pairs for Cross-Modal Retrieval
Minyoung Kim
arXiv_CV
Embedding
Sketch
Image_Retrieval
Pose
Contrastive_Learning
Video_Retrieval
PDF
-
Masking Modalities for Cross-modal Video Retrieval
Valentin Gabeur, Arsha Nagrani, Chen Sun, Karteek Alahari, Cordelia Schmid
arXiv_CV
Speech
Video_Retrieval
PDF
-
Visual Spatio-temporal Relation-enhanced Network for Cross-modal Text-Video Retrieval
Ning Han, Jingjing Chen, Guangyi Xiao, Yawen Zeng, Chuhao Shi, Hao Chen
arXiv_CV
Transformer
Embedding
3D
Pose
Action
Relation
Visual_Relation
CNN
Video_Retrieval
PDF
-
Domain Adaptation in Multi-View Embedding for Cross-Modal Video Retrieval
Jonathan Munro, Michael Wray, Diane Larlus, Gabriela Csurka, Dima Damen
arXiv_CV
Embedding
Unsupervised
Pose
Action
Caption
Video_Retrieval
PDF
-
Coarse to Fine: Video Retrieval before Moment Localization
Zijian Gao, Huanyu Liu, Jingyu Liu
arXiv_CV
Video_Retrieval
PDF
-
ViSeRet: A simple yet effective approach to moment retrieval via fine-grained video segmentation
Aiden Seungjoon Lee, Hanseok Oh, Minjoon Seo
arXiv_AI
Surveillance
Segmentation
Video_Retrieval
PDF
-
Spatio-Temporal Video Representation Learning for AI Based Video Playback Style Prediction
Rishubh Parihar, Gaurav Ramola, Ranajit Saha, Ravi Kini, Aniket Rege, Sudha Velusamy
arXiv_AI
Recognition
Video_Caption
Represenation_Learning
Pose
Action_Recognition
Action
Classification
Prediction
Recommendation
Video_Retrieval
PDF
-
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu, Gargi Ghosh, Po-Yao Huang, Dmytro Okhonko, Armen Aghajanyan, Florian Metze, Luke Zettlemoyer, Christoph Feichtenhofer
arXiv_CV
Transformer
Segmentation
Zero-Shot
Action_Localization
Action
QA
Video_Retrieval
PDF
-
Self-Supervised Video Representation Learning by Video Incoherence Detection
Haozhi Cao, Yuecong Xu, Jianfei Yang, Kezhi Mao, Lihua Xie, Jianxiong Yin, Simon See
arXiv_CV
Recognition
Represenation_Learning
Self-Supervised
Pose
Contrastive_Learning
Action_Recognition
Action
Detection
Video_Retrieval
PDF
-
CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval
Zhijian Hou, Chong-Wah Ngo, Wing Kwong Chan
arXiv_AI
Represenation_Learning
Pose
Attention
Video_Retrieval
PDF
-
TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment
Jianwei Yang, Yonatan Bisk, Jianfeng Gao
arXiv_AI
Transformer
Segmentation
Represenation_Learning
Contrastive_Learning
Action
Activity
Language_Model
Video_Retrieval
PDF
-
Self-Supervised Video Representation Learning with Meta-Contrastive Network
Yuanze Lin, Xun Guo, Yan Lu
arXiv_AI
Recognition
Represenation_Learning
Self-Supervised
Pose
Contrastive_Learning
Action_Recognition
Action
Video_Retrieval
PDF
-
Video Contrastive Learning with Global Context
Haofei Kuang, Yi Zhu, Zhi Zhang, Xinyu Li, Joseph Tighe, Sören Schwertfeger, Cyrill Stachniss, Mu Li
arXiv_AI
Image_Caption
Represenation_Learning
Regularization
Self-Supervised
Action_Localization
Pose
Contrastive_Learning
Action
Classification
Video_Retrieval
PDF
-
Boosting Video Captioning with Dynamic Loss Network
Nasibullah, Partha Pratim Mohanta
arXiv_CV
Surveillance
Video_Caption
Face
Classification
Deep_Learning
Detection
Object_Detection
Caption
Image_Classification
Video_Retrieval
PDF
-
Use of Affective Visual Information for Summarization of Human-Centric Videos
Berkay Köprü, Engin Erzin
arXiv_CV
Embedding
Recognition
Pose
Emotion
Face
Attention
Summarization
Video_Retrieval
PDF
-
Video 3D Sampling for Self-supervised Representation Learning
Wei Li, Dezhao Luo, Bo Fang, Yu Zhou, Weiping Wang
arXiv_CV
Recognition
3D
Represenation_Learning
Self-Supervised
Pose
Action_Recognition
Action
Video_Retrieval
PDF
-
How Incomplete is Contrastive Learning? An Inter-intra Variant Dual Representation Method for Self-supervised Video Recognition
Lin Zhang, Qi She, Zhengyang Shen, Changhu Wang
arXiv_AI
Recognition
Represenation_Learning
Self-Supervised
Pose
Contrastive_Learning
Classification
Video_Retrieval
PDF
-
DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval
Giorgos Kordopatis-Zilos, Christos Tzelepis, Symeon Papadopoulos, Ioannis Kompatsiaris, Ioannis Patras
arXiv_CV
Video_Indexing
Knowledge
Pose
Video_Retrieval
PDF
-
Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze
I. Hipiny, H. Ujir, J.L. Minoi, S.F. Samson Juan, M.A. Khairuddin, M.S. Sunar
arXiv_CV
Segmentation
Tracking
Unsupervised
Recognition
Action
Activity
Video_Retrieval
Matching
PDF
-
Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting
Martine Toering, Ioannis Gatopoulos, Maarten Stol, Vincent Tao Hu
arXiv_CV
Embedding
Recognition
3D
Optimization
Represenation_Learning
Self-Supervised
Pose
Contrastive_Learning
Action_Recognition
Action
Optical_Flow
Inference
Prediction
Video_Retrieval
Matching
PDF
-
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li, Jie Lei, Zhe Gan, Licheng Yu, Yen-Chun Chen, Rohit Pillai, Yu Cheng, Luowei Zhou, Xin Eric Wang, William Yang Wang, Tamara Lee Berg, Mohit Bansal, Jingjing Liu, Lijuan Wang, Zicheng Liu
arXiv_CV
Video_Caption
Knowledge
Caption
Video_Retrieval
PDF
-
ASCNet: Self-supervised Video Representation Learning with Appearance-Speed Consistency
Deng Huang, Wenhao Wu, Weiwen Hu, Xu Liu, Dongliang He, Zhihua Wu, Xiangmiao Wu, Mingkui Tan, Errui Ding
arXiv_CV
Unsupervised
Recognition
Optimization
Represenation_Learning
Self-Supervised
Pose
Action_Recognition
Action
Video_Retrieval
PDF
-
SSAN: Separable Self-Attention Network for Video Representation Learning
Xudong Guo, Xun Guo, Yan Lu
arXiv_CV
Embedding
Recognition
Represenation_Learning
Pose
Action_Recognition
Action
Relation
Attention
Video_Retrieval
PDF
-
Action in Mind: A Neural Network Approach to Action Recognition and Segmentation
Zahra Gharaee
arXiv_AI
Surveillance
Segmentation
Recognition
3D
Pose
Action_Recognition
Action
GAN
Video_Retrieval
PDF
-
TRECVID 2020: A comprehensive campaign for evaluating video retrieval tasks across multiple application domains
George Awad, Asad A. Butt, Keith Curtis, Jonathan Fiscus, Afzal Godil, Yooyoung Lee, Andrew Delgado, Jesse Zhang, Eliot Godard, Baptiste Chocot, Lukas Diduch, Jeffrey Liu, Alan F. Smeaton, Yvette Graham, Gareth J. F. Jones, Wessel Kraaij, Georges Quenot
arXiv_AI
GAN
Summarization
Video_Retrieval
PDF
-
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos
Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie Boggust, Rameswar Panda, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Michael Picheny, Shih-Fu Chang
arXiv_CV
Embedding
Zero-Shot
Self-Supervised
Action_Localization
Pose
Contrastive_Learning
Action
Attention
Video_Retrieval
PDF
-
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, Boqing Gong
arXiv_AI
Transformer
Recognition
Self-Supervised
Action_Recognition
Action
Classification
Image_Classification
Video_Retrieval
PDF
-
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
Xiaohan Wang, Linchao Zhu, Yi Yang
arXiv_CV
Embedding
Optimization
Pose
Action
Video_Retrieval
Matching
PDF
-
TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval
Ioana Croitoru, Simion-Vlad Bogolin, Yang Liu, Samuel Albanie, Marius Leordeanu, Hailin Jin, Andrew Zisserman
arXiv_CV
Pose
Video_Retrieval
PDF
-
Self-supervised Video Retrieval Transformer Network
Xiangteng He, Yulin Pan, Mingqian Tang, Yiliang Lv
arXiv_CV
Transformer
Self-Supervised
Pose
Action
Video_Retrieval
PDF
-
Object Priors for Classifying and Localizing Unseen Actions
Pascal Mettes, William Thong, Cees G. M. Snoek
arXiv_CV
Embedding
Action_Localization
Pose
Action
Classification
Detection
Relation
Object_Detection
Video_Retrieval
Matching
PDF
-
Self-supervised Video Representation Learning by Context and Motion Decoupling
Lianghua Huang, Yu Liu, Bin Wang, Pan Pan, Yinghui Xu, Rong Jin
arXiv_AI
Recognition
Represenation_Learning
Regularization
Self-Supervised
Action_Recognition
Action
Prediction
Video_Retrieval
Matching
PDF
-
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain, Arsha Nagrani, Gül Varol, Andrew Zisserman
arXiv_CV
Embedding
Video_Caption
Pose
Attention
Caption
Video_Retrieval
PDF
-
CUPID: Adaptive Curation of Pre-training Data for Video-and-Language Representation Learning
Luowei Zhou, Jingjing Liu, Yu Cheng, Zhe Gan, Lei Zhang
arXiv_CV
Video_Caption
Represenation_Learning
Salient
Pose
Caption
Video_Retrieval
PDF
-
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
Maksim Dzabraev, Maksim Kalashnikov, Stepan Komkov, Aleksandr Petiushko
arXiv_CV
Transformer
Video_Caption
Caption
Activity
Video_Retrieval
PDF
-
On Semantic Similarity in Video Retrieval
Michael Wray, Hazel Doughty, Dima Damen
arXiv_CV
Pose
Caption
Video_Retrieval
PDF
-
A Straightforward Framework For Video Retrieval Using CLIP
Jesús Andrés Portillo-Quintero, José Carlos Ortiz-Bayliss, Hugo Terashima-Marín
arXiv_CV
Video_Retrieval
PDF
-
Win-Fail Action Recognition
Paritosh Parmar, Brendan Morris
arXiv_CV
Recognition
Action_Recognition
Action
Video_Retrieval
PDF
-
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei, Linjie Li, Luowei Zhou, Zhe Gan, Tamara L. Berg, Mohit Bansal, Jingjing Liu
arXiv_CV
Bert
Sparse
Pose
Activity
Language_Model
Video_Retrieval
PDF
-
TCLR: Temporal Contrastive Learning for Video Representation
Ishan Dave, Rohit Gupta, Mamshad Nayeem Rizve, Mubarak Shah
arXiv_CV
Image_Caption
Recognition
Video_Caption
3D
Represenation_Learning
Self-Supervised
Contrastive_Learning
Action_Recognition
Action
Classification
Video_Retrieval
PDF
-
Temporal Contrastive Graph for Self-supervised Video Representation Learning
Yang Liu, Keze Wang, Haoyuan Lan, Liang Lin
arXiv_CV
Embedding
Recognition
Represenation_Learning
Knowledge
Self-Supervised
Pose
Contrastive_Learning
Action_Recognition
Action
Relation
Prediction
Video_Retrieval
PDF
-
Bag of Genres for Video Retrieval
Leonardo A. Duarte, Otávio A. B. Penatti, Jurandy Almeida
arXiv_CV
Pose
Action
Classification
Video_Retrieval
PDF
-
SEA: Sentence Encoder Assembly for Video Retrieval by Textual Queries
Xirong Li, Fangming Zhou, Chaoxi Xu, Jiaqi Ji, Gang Yang
arXiv_AI
Represenation_Learning
Pose
Video_Retrieval
Matching
PDF
-
A Hierarchical Multi-Modal Encoder for Moment Localization in Video Corpus
Bowen Zhang, Hexiang Hu, Joonseok Lee, Ming Zhao, Sheide Chammas, Vihan Jain, Eugene Ie, Fei Sha
arXiv_CV
Pose
Caption
Activity
Language_Model
Video_Retrieval
PDF
-
Graph Based Temporal Aggregation for Video Retrieval
Arvind Srinivasan, Aprameya Bharadwaj, Aveek Saha, Subramanyam Natarajan
arXiv_CV
Pose
Video_Retrieval
PDF
-
Self-Supervised Video Representation Using Pretext-Contrastive Learning
Li Tao, Xueting Wang, Toshihiko Yamasaki
arXiv_CV
Recognition
Optimization
Self-Supervised
Pose
Contrastive_Learning
Video_Retrieval
PDF
-
RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning
Peihao Chen, Deng Huang, Dongliang He, Xiang Long, Runhao Zeng, Shilei Wen, Mingkui Tan, Chuang Gan
arXiv_CV
Unsupervised
Recognition
Represenation_Learning
Self-Supervised
Pose
Action_Recognition
Action
Prediction
Video_Retrieval
PDF
-
Self-supervised Co-training for Video Representation Learning
Tengda Han, Weidi Xie, Andrew Zisserman
arXiv_CV
Recognition
Represenation_Learning
Self-Supervised
Pose
Contrastive_Learning
Action_Recognition
Action
Optical_Flow
Video_Retrieval
PDF
-
Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning
Pavlos Avgoustinakis, Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Andreas L. Symeonidis, Ioannis Kompatsiaris
arXiv_SD
Transfer_Learning
Pose
CNN
Video_Retrieval
PDF
-
Support-set bottlenecks for video-text representation learning
Mandela Patrick, Po-Yao Huang, Yuki Asano, Florian Metze, Alexander Hauptmann, João Henriques, Andrea Vedaldi
arXiv_CV
Represenation_Learning
Pose
Contrastive_Learning
Action
Caption
Activity
Video_Retrieval
PDF
-
Encode the Unseen: Predictive Video Hashing for Scalable Mid-Stream Retrieval
Tong Yu, Nicolas Padoy
arXiv_CV
Pose
Activity
Video_Retrieval
PDF
-
TRECVID 2019: An Evaluation Campaign to Benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & Retrieval
George Awad, Asad A. Butt, Keith Curtis, Yooyoung Lee, Jonathan Fiscus, Afzal Godil, Andrew Delgado, Jesse Zhang, Eliot Godard, Lukas Diduch, Alan F. Smeaton, Yvette Graham, Wessel Kraaij, Georges Quenot
arXiv_AI
Video_Caption
Detection
GAN
Caption
Activity
Video_Retrieval
Matching
PDF
-
Hybrid Space Learning for Language-based Video Retrieval
Jianfeng Dong, Xirong Li, Chaoxi Xu, Gang Yang, Xun Wang
arXiv_CV
Pose
Video_Retrieval
Matching
PDF
-
Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics
Jiangliu Wang, Jianbo Jiao, Linchao Bao, Shengfeng He, Wei Liu, Yun-hui Liu
arXiv_CV
Recognition
3D
Represenation_Learning
Self-Supervised
Pose
Action_Recognition
Action
Video_Retrieval
PDF
-
Discriminative Residual Analysis for Image Set Classification with Posture and Age Variations
Chuan-Xian Ren, You-Wei Luo, Xiao-Lin Xu, Dao-Qing Dai, Hong Yan
arXiv_CV
Image_Caption
Recognition
Regularization
Pose
Classification
Relation
Caption
Video_Retrieval
PDF
-
Self-supervised Video Representation Learning by Pace Prediction
Jiangliu Wang, Jianbo Jiao, Yun-Hui Liu
arXiv_CV
Recognition
Represenation_Learning
Self-Supervised
Pose
Contrastive_Learning
Action_Recognition
Action
Prediction
Video_Retrieval
PDF
-
The VISIONE Video Search System: Exploiting Off-the-Shelf Text Search Engines for Large-Scale Video Retrieval
Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Franca Debole, Fabrizio Falchi, Claudio Gennaro, Lucia Vadicamo, Claudio Vairo
arXiv_CV
Relation
Video_Retrieval
PDF
-
Exploring Relations in Untrimmed Videos for Self-Supervised Learning
Dezhao Luo, Bo Fang, Yu Zhou, Yucan Zhou, Dayan Wu, Weiping Wang
arXiv_CV
Recognition
3D
Self-Supervised
Pose
Action_Recognition
Action
Detection
Relation
Video_Retrieval
PDF
-
Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework
Li Tao, Xueting Wang, Toshihiko Yamasaki
arXiv_CV
Recognition
Represenation_Learning
Self-Supervised
Pose
Contrastive_Learning
Relation
CNN
Video_Retrieval
PDF
-
Context Encoding for Video Retrieval with Contrastive Learning
Jie Shao, Xin Wen, Bingchen Zhao, Changhu Wang, Xiangyang Xue
arXiv_CV
Represenation_Learning
Pose
Contrastive_Learning
Recommendation
Video_Retrieval
PDF
-
Memory-augmented Dense Predictive Coding for Video Representation Learning
Tengda Han, Weidi Xie, Andrew Zisserman
arXiv_CV
Unsupervised
Recognition
Represenation_Learning
Self-Supervised
Pose
Action_Recognition
Action
Classification
Attention
Optical_Flow
Video_Retrieval
PDF
-
The End-of-End-to-End: A Video Understanding Pentathlon Challenge
Samuel Albanie, Yang Liu, Arsha Nagrani, Antoine Miech, Ernesto Coto, Ivan Laptev, Rahul Sukthankar, Bernard Ghanem, Andrew Zisserman, Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid, Shizhe Chen, Yida Zhao, Qin Jin, Kaixu Cui, Hui Liu, Chen Wang, Yudong Jiang, Xiaoshuai Hao
arXiv_CV
Recognition
Video_Caption
Video_Retrieval
PDF
-
Multi-modal Transformer for Video Retrieval
Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid
arXiv_CV
Transformer
Embedding
Caption
Video_Retrieval
PDF
-
Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval
Xun Yang, Jianfeng Dong, Yixin Cao, Xun Wang, Meng Wang, Tat-Seng Chua
arXiv_CV
Embedding
Pose
Video_Retrieval
Matching
PDF
-
Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning
Yuan Yao, Chang Liu, Dezhao Luo, Yu Zhou, Qixiang Ye
arXiv_CV
Recognition
Represenation_Learning
Self-Supervised
Pose
Action_Recognition
Action
Classification
Attention
Video_Retrieval
PDF
-
Exploiting Visual Semantic Reasoning for Video-Text Retrieval
Zerun Feng, Zhimin Zeng, Caili Guo, Zheng Li
arXiv_CV
Pose
Action
Relation
Attention
CNN
Video_Retrieval
PDF
-
Condensed Movies: Story Based Retrieval with Contextual Embeddings
Max Bain, Arsha Nagrani, Andrew Brown, Andrew Zisserman
arXiv_CV
Embedding
Speech
Pose
Face
Video_Retrieval
PDF
-
Multiple Visual-Semantic Embedding for Video Retrieval from Query Sentence
Huy Manh Nguyen, Tomo Miyazaki, Yoshihiro Sugaya, Shinichiro Omachi
arXiv_CV
Embedding
Pose
Relation
Video_Retrieval
Matching
PDF
-
Targeted Attack for Deep Hashing based Retrieval
Jiawang Bai, Bin Chen, Yiming Li, Dongxian Wu, Weiwei Guo, Shu-tao Xia, En-hui Yang
arXiv_CV
Optimization
Image_Retrieval
Adversarial
Pose
Video_Retrieval
PDF
-
SpeedNet: Learning the Speediness in Videos
Sagie Benaim, Ariel Ephrat, Oran Lang, Inbar Mosseri, William T. Freeman, Michael Rubinstein, Michal Irani, Tali Dekel
arXiv_CV
arXiv_CV
Recognition
Self-Supervised
Action_Recognition
Action
Classification
Prediction
Video_Retrieval
PDF
-
AMIL: Adversarial Multi Instance Learning for Human Pose Estimation
Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Jie Yang
arXiv_CV
arXiv_CV
Surveillance
Pose_Estimation
Adversarial
Pose
Face
Action
GAN
Video_Retrieval
PDF
-
Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning
Elad Amrani, Rami Ben-Ari, Daniel Rotman, Alex Bronstein
arXiv_CV
arXiv_CV
Represenation_Learning
Self-Supervised
Pose
Relation
Video_Retrieval
PDF
-
Fine-Grained Instance-Level Sketch-Based Video Retrieval
Peng Xu, Kun Liu, Tao Xiang, Timothy M. Hospedales, Zhanyu Ma, Jun Guo, Yi-Zhe Song
arXiv_CV
arXiv_CV
Weakly_Supervised
Sketch
Image_Retrieval
Pose
Relation
Video_Retrieval
PDF
-
UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation
Huaishao Luo, Lei Ji, Botian Shi, Haoyang Huang, Nan Duan, Tianrui Li, Xilin Chen, Ming Zhou
arXiv_CV
arXiv_CV
Transformer
Video_Caption
Bert
Pose
Caption
Video_Retrieval
PDF
-
End-to-End Learning of Visual Representations from Uncurated Instructional Videos
Antoine Miech, Jean-Baptiste Alayrac, Lucas Smaira, Ivan Laptev, Josef Sivic, Andrew Zisserman
arXiv_CV
arXiv_CV
Segmentation
Recognition
Self-Supervised
Action_Localization
Pose
Action_Recognition
Action
Video_Retrieval
PDF
-
A Proposal-based Approach for Activity Image-to-Video Retrieval
Ruicong Xu, Li Niu, Jianfu Zhang, Liqing Zhang
arXiv_CV
arXiv_CV
3D
Adversarial
Pose
Classification
Activity
CNN
Video_Retrieval
PDF
-
Deep Heterogeneous Hashing for Face Video Retrieval
Shishi Qiao, Ruiping Wang, Shiguang Shan, Xilin Chen
arXiv_CV
arXiv_CV
Optimization
Face
Video_Retrieval
Matching
PDF
-
ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning
Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Patras, Ioannis Kompatsiaris
arXiv_CV
arXiv_CV
Pose
Relation
CNN
Video_Retrieval
Matching
PDF
-
Fine-Grained Action Retrieval Through Multiple Parts-of-Speech Embeddings
Michael Wray, Diane Larlus, Gabriela Csurka, Dima Damen
arXiv_CV
arXiv_CV
Embedding
Zero-Shot
Speech
Pose
Action
Caption
Video_Retrieval
PDF
-
Central Similarity Hashing via Hadamard matrix
Li Yuan, Tao Wang, Xiaopeng Zhang, Zequn Jie, Francis EH Tay, Jiashi Feng
arXiv_CV
arXiv_CV
Pose
Relation
Video_Retrieval
PDF
-
Use What You Have: Video Retrieval Using Representations From Collaborative Experts
Yang Liu, Samuel Albanie, Arsha Nagrani, Andrew Zisserman
arXiv_CV
arXiv_CV
Embedding
OCR
Knowledge
Speech
Pose
Activity
Video_Retrieval
PDF
-
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech, Dimitri Zhukov, Jean-Baptiste Alayrac, Makarand Tapaswi, Ivan Laptev, Josef Sivic
arXiv_CV
arXiv_CV
Embedding
Action_Localization
Pose
Action
Caption
Video_Retrieval
PDF
-
Spatio-temporal Video Re-localization by Warp LSTM
Yang Feng, Lin Ma, Wei Liu, Jiebo Luo
arXiv_CV
arXiv_CV
RNN
Pose
GAN
Video_Retrieval
PDF
-
Interactive Video Retrieval with Dialog
Sho Maeoki, Kohei Uehara, Tatsuya Harada
arXiv_CV
arXiv_CV
Pose
Action
Video_Retrieval
PDF
-
Improving MAE against CCE under Label Noise
Xinshao Wang, Elyor Kodirov, Yang Hua, Neil M. Robertson
arXiv_CV
arXiv_CV
Pose
Classification
Deep_Learning
Image_Classification
Video_Retrieval
PDF
-
Unsupervised Data Uncertainty Learning in Visual Retrieval Systems
Ahmed Taha, Yi-Ting Chen, Teruhisa Misu, Abhinav Shrivastava, Larry Davis
arXiv_CV
arXiv_CV
Embedding
Unsupervised
Pose
Video_Retrieval
PDF
-
Dual Dense Encoding for Zero-Example Video Retrieval
Jianfeng Dong, Xirong Li, Chaoxi Xu, Shouling Ji, Xun Wang
arXiv_CV
arXiv_CV
Pose
Deep_Learning
Video_Retrieval
PDF
-
FIVR: Fine-grained Incident Video Retrieval
Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Patras, Ioannis Kompatsiaris
arXiv_CV
arXiv_CV
Video_Retrieval
PDF
-
Person Search in Videos with One Portrait Through Visual and Temporal Links
Qingqiu Huang, Wentao Liu, Dahua Lin
arXiv_CV
arXiv_CV
Person_Re-identification
Pose
Re-identification
Video_Retrieval
PDF
-
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation
Hang Zhou, Yu Liu, Ziwei Liu, Ping Luo, Xiaogang Wang
arXiv_CV
arXiv_CV
Adversarial
Speech
Pose
Face
Video_Retrieval
PDF
-
Human Action Recognition and Prediction: A Survey
Yu Kong, Yun Fu
arXiv_CV
arXiv_CV
Surveillance
Recognition
Survey
Action_Recognition
Action
Autonomous
Prediction
Video_Retrieval
PDF
-
Hashing with Mutual Information
Fatih Cakir, Kun He, Sarah Adel Bargal, Stan Sclaroff
arXiv_CV
arXiv_CV
Embedding
Gradient_Descent
Image_Retrieval
Pose
Video_Retrieval
PDF
-
Semantic Image Retrieval by Uniting Deep Neural Networks and Cognitive Architectures
Alexey Potapov, Innokentii Zhdanov, Oleg Scherbakov, Nikolai Skorobogatko, Hugo Latapie, Enzo Fenoglio
arXiv_CV
arXiv_CV
Image_Retrieval
Pose
Deep_Learning
Detection
Object_Detection
Video_Retrieval
PDF
-
ECO: Efficient Convolutional Network for Online Video Understanding
Mohammadreza Zolfaghari, Kamaljeet Singh, Thomas Brox
arXiv_CV
arXiv_CV
Video_Caption
Action
Classification
Relation
Caption
CNN
Video_Retrieval
PDF
-
Text-to-Clip Video Retrieval with Early Fusion and Re-Captioning
Huijuan Xu, Kun He, Leonid Sigal, Stan Sclaroff, Kate Saenko
arXiv_CV
arXiv_CV
Embedding
Video_Caption
Pose
Caption
Video_Retrieval
PDF
-
Learning a Text-Video Embedding from Incomplete and Heterogeneous Data
Antoine Miech, Ivan Laptev, Josef Sivic
arXiv_CV
arXiv_CV
Embedding
Pose
Face
Caption
Video_Retrieval
PDF
-
Dense-Captioning Events in Videos
Ranjay Krishna, Kenji Hata, Frederic Ren, Li Fei-Fei, Juan Carlos Niebles
arXiv_CV
arXiv_CV
Pose
Caption
Activity
Video_Retrieval
PDF
-
Learning Language-Visual Embedding for Movie Understanding with Natural-Language
Atousa Torabi, Niket Tandon, Leonid Sigal
arXiv_CV
arXiv_CV
Embedding
Knowledge
Pose
Caption
Activity
Language_Model
Video_Retrieval
PDF
-
Multimodal Approach for Video Surveillance Indexing and Retrieval
Ali Wali, Adel M. Alimi
arXiv_CV
arXiv_CV
Surveillance
Action
Video_Retrieval
PDF