Semantic video research has employed crowdsourcing techniques on social web video data sets such as comments, tags, and annotations, but these data sets require an extra effort on behalf of the user. We propose a pulse modeling method, which analyzes implicit user interactions within web video, such as rewind. In particular, we have modeled the user information seeking behavior as a time series and the semantic regions as a discrete pulse of fixed width. We constructed these pulses from user interactions with a documentary video that has a very rich visual style with too many cuts and camera angles/frames for the same scene. Next, we calculated the correlation coefficient between dynamically detected user pulses at the local maximums and the reference pulse. We have found when people are actively seeking for information in a video, their activity (these pulses) significantly matches the semantics of the video. This proposed pulse analysis method complements previous work in content-based information retrieval and provides an additional user-based dimension for modeling the semantics of a web video.


Avlonitis, M., Chorianopoulos, K., and Shamma, D.A. 2012. Crowdsourcing user interactions within web video through pulse modeling. Proceedings of the ACM multimedia 2012 workshop on Crowdsourcing for multimedia - CrowdMM ’12, ACM Press, 19.   BibTeX