This project addresses a central challenge: making egocentric videos watchable. First-person videos are typically long, unedited streams, which makes them tedious to watch and visually unpleasant. Previous efforts have tried to accelerate them while preserving smoothness, since naive fast-forwarding amplifies the natural motion of the recorder's body and makes the video nauseating. We tackle this challenge with adaptive frame sampling based on semantic information extracted from the images.
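The core idea can be illustrated with a minimal sketch: score each frame by its semantic relevance and keep proportionally more frames from high-scoring segments, so relevant moments play closer to real time while irrelevant stretches are skipped faster. The sketch below is illustrative only; the scoring stub, function names, and budget heuristic are assumptions, not the exact method of the publications listed here.

```python
# A minimal, illustrative sketch of semantic-driven adaptive frame sampling.
# The semantic scorer is a stand-in (a real system would use, e.g., face or
# pedestrian detectors); the budget heuristic is an assumed simplification.

import numpy as np


def semantic_scores(num_frames: int) -> np.ndarray:
    """Stand-in for a real semantic extractor; returns scores in [0, 1],
    where higher means the frame is more relevant."""
    rng = np.random.default_rng(0)
    return rng.random(num_frames)


def adaptive_sample(scores: np.ndarray, target_speedup: float) -> list[int]:
    """Select frame indices so that roughly 1 in `target_speedup` frames is
    kept overall, with semantically rich frames kept more often."""
    keep_rate = 1.0 / target_speedup
    # Scale scores so their total equals the number of frames we want to keep.
    weights = scores / scores.sum() * (len(scores) * keep_rate)

    selected, budget = [], 0.0
    for idx, w in enumerate(weights):
        budget += w                # accumulate semantic "credit"
        if budget >= 1.0:          # emit a frame whenever credit reaches 1
            selected.append(idx)
            budget -= 1.0
    return selected


if __name__ == "__main__":
    scores = semantic_scores(num_frames=300)
    frames = adaptive_sample(scores, target_speedup=10.0)
    print(f"kept {len(frames)} of {len(scores)} frames "
          f"(effective speed-up ~{len(scores) / len(frames):.1f}x)")
```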


Publications


[TPAMI 2023] Washington Ramos, Michel Silva, Edson Araujo, Victor Moura, Keller Oliveira, Leandro Soriano Marcolino, Erickson R. Nascimento. Text-Driven Video Acceleration: A Weakly-Supervised Reinforcement Learning Method, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023.
Visit the page for code and paper access.

[SIBGRAPI 2021] Diognei de Matos, Washington Ramos, Luiz Romanhol, Erickson R. Nascimento. Musical Hyperlapse: A Multimodal Approach to Accelerate First-Person Videos, 2021 34th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), 2021.
Visit the page for code and paper access.

[TPAMI 2021] Michel M. Silva, Washington L. S. Ramos, Mario F. M. Campos, Erickson R. Nascimento. A Sparse Sampling-based framework for Semantic Fast-Forward of First-Person Videos, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021.
Visit the page for code and paper access.

[CVPRW 2020] Alan Neves, Michel Silva, Mario Campos, Erickson R. Nascimento. A gaze driven fast-forward method for first-person videos, Sixth International Workshop on Egocentric Perception, Interaction and Computing at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (EPIC@CVPR), 2020.
Visit the page for code and paper access.

[CVPR 2020] Washington L. S. Ramos, Michel M. Silva, Edson R. Araujo, Leandro S. Marcolino, Erickson R. Nascimento. Straight to the Point: Fast-forwarding Videos via Reinforcement Learning Using Textual Data, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
Visit the page for code and paper access.

[WACV 2020] Washington L. S. Ramos, Michel M. Silva, Edson R. Araujo, Alan C. Neves, Erickson R. Nascimento. Personalizing Fast-Forward Videos Based on Visual and Textual Features from Social Network, IEEE Winter Conference on Applications of Computer Vision (WACV), 2020.
Visit the page for code and paper access.

[CVPRW 2018] Vinicius S. Furlan, Ruzena Bajcsy, Erickson R. Nascimento. Fast forwarding Egocentric Videos by Listening and Watching, IEEE Conference on Computer Vision and Pattern Recognition Sight and Sound Workshop (CVPRW), 2018.
Visit the page for more information.

[JVCI 2018] Michel M. Silva, Washington L. S. Ramos, Felipe C. Chamone, Joao P. K. Ferreira, Mario F. M. Campos, Erickson R. Nascimento. Making a long story short: A Multi-Importance fast-forwarding egocentric videos with the emphasis on relevant objects, Journal of Visual Communication and Image Representation (JVCI), 2018.
Visit the page for more information and paper access.

[CVPR 2018] Michel M. Silva, Washington L. S. Ramos, Joao P. K. Ferreira, Felipe C. Chamone, Mario F. M. Campos, Erickson R. Nascimento. A Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person Videos, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
Visit the page for more information and paper access.

[EPIC 2016] Michel M. Silva, Washington L. S. Ramos, Joao P. K. Ferreira, Mario F. M. Campos, Erickson R. Nascimento. Towards Semantic Fast-Forward and Stabilized Egocentric Videos, First International Workshop on Egocentric Perception, Interaction, and Computing at European Conference on Computer Vision (EPIC@ECCV), 2016.
Visit the page for code and paper access.

[ICIP 2016] Washington L. S. Ramos, Michel M. Silva, Mario F. M. Campos, Erickson R. Nascimento. Fast-Forward Video Based on Semantic Extraction, IEEE International Conference on Image Processing (ICIP), 2016.
Visit the page for code and paper access.

Datasets


[CVPR 2018] DoMSEV – 80-hour Dataset of Multimodal Semantic Egocentric Videos.
Visit the dataset page for video information and download.

[EPIC 2016] Semantic Dataset – First-person videos recorded and labeled with respect to faces and pedestrians.
Visit the dataset page for video information and download.

[MSHP 2021] Musical Hyperlapse Dataset – Videos and songs spanning different contents and emotions.
Visit the dataset page for more information.

Acknowledgment


This project is supported by CNPq, CAPES, and FAPEMIG.

Team



Edson Roteia Araujo Junior

MSc Student

Diognei de Matos

Researcher

Former Members




Victor Hugo Silva Moura

Undergraduate Student

Keller Clayderman Martins de Oliveira

Undergraduate Student

Alan Carvalho Neves

MSc Student

Felipe Cadar Chamone

PhD Student

Vinicius Signori Furlan

MSc Student

João Pedro Klock Ferreira

Undergraduate Student

Gabriel Lima Canguçu

Undergraduate Student

Luiz Henrique Romanhol Ferreira

Undergraduate Student