Crowdsourced Automatic Zoom and Scroll for Video Retargeting

Screen size and display resolution limit the experience of watching videos on mobile devices. The viewing experience can be improved by determining the important or interesting regions within the video (called regions of interest, or ROIs) and displaying only the ROIs to the viewer. Previous work focuses on analyzing the video content with visual attention models to infer the ROIs. Such content-based techniques, however, have limitations. In this paper, we propose an alternative paradigm for inferring ROIs from a video. We crowdsource the ROIs from a large number of users through their implicit viewing behavior in a zoom-and-pan interface, and infer the ROIs from their collective wisdom. A retargeted video, consisting of relevant shots determined from historical user behavior, can be automatically generated and replayed to subsequent users who would prefer a less interactive viewing experience. This paper presents how we collect the user traces, infer the ROIs and their dynamics, group the ROIs into shots, and automatically reframe those shots to improve the aesthetics of the video. A user study with 48 participants shows that our automatically retargeted video is of comparable quality to one handcrafted by an expert user.

Fig. 1: Four frames and a few viewports (first row), heatmaps and detected ROIs (second row), and retargeted frames including reframing techniques (last row).
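To make the heatmap and ROI rows of Fig. 1 more concrete, here is a minimal sketch of how viewports recorded from several users could be accumulated into a per-frame heatmap and reduced to a single ROI. It is an illustration only, not the paper's method: the function names (`build_heatmap`, `detect_roi`), the `(x, y, width, height)` viewport format, and the simple "fraction of the peak" threshold are all assumptions made for this example.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical viewport format: (x, y, width, height) in frame pixels.
Viewport = Tuple[int, int, int, int]


def build_heatmap(viewports: Sequence[Viewport],
                  frame_w: int, frame_h: int) -> List[List[int]]:
    """Count, for every pixel of one frame, how many user viewports cover it."""
    heat = [[0] * frame_w for _ in range(frame_h)]
    for x, y, w, h in viewports:
        # Clip each viewport to the frame before accumulating.
        for row in range(max(0, y), min(frame_h, y + h)):
            for col in range(max(0, x), min(frame_w, x + w)):
                heat[row][col] += 1
    return heat


def detect_roi(heat: List[List[int]], ratio: float = 0.5) -> Optional[Viewport]:
    """Return the bounding box of pixels whose count is >= ratio * peak.

    Assumption: a plain threshold stands in for whatever ROI detection
    the paper actually uses. Returns None if no viewport touched the frame.
    """
    peak = max(max(row) for row in heat)
    if peak == 0:
        return None
    threshold = ratio * peak
    hot = [(r, c)
           for r, row in enumerate(heat)
           for c, value in enumerate(row)
           if value >= threshold]
    rows = [r for r, _ in hot]
    cols = [c for _, c in hot]
    x0, y0 = min(cols), min(rows)
    return (x0, y0, max(cols) - x0 + 1, max(rows) - y0 + 1)


if __name__ == "__main__":
    # Three users zoomed on overlapping regions of a 10x10 frame.
    traces = [(2, 2, 4, 4), (3, 3, 4, 4), (4, 4, 4, 4)]
    heat = build_heatmap(traces, frame_w=10, frame_h=10)
    print(detect_roi(heat, ratio=0.5))
```

Lowering `ratio` widens the ROI toward the union of all viewports, while raising it toward 1.0 shrinks the ROI to the area that every user looked at; a real system would also need to handle several ROIs per frame and track them over time.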

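The work consists of several steps: collecting the user traces, inferring the ROIs and their dynamics, grouping the ROIs into shots, and automatically reframing those shots.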

Retargeted video

Original video

A. Carlier, V. Charvillat, W.T. Ooi, R. Grigoras and G. Morin: Crowdsourced Automatic Zoom and Scroll for Video Retargeting. ACMMM’10, pp. 201-210.