This paper introduces a new paradigm for interacting with zoomable video. Our interaction technique reduces the number of zooms and pans required by providing recommended viewports to users, replacing multiple zoom and pan actions with a single click on a recommended viewport. The efficacy of our technique lies in the quality of the recommended viewports, which need to match the user's intention, track movement in the scene, and frame the scene properly. To this end, we propose a hybrid method in which content analysis and crowdsourcing complement each other to recommend viewports. We first compute a preliminary set of recommended viewports by analyzing the content of the video. These viewports track moving objects in the scene and are framed without violating basic aesthetic rules. To improve the relevance of the recommendations, we collect viewing statistics as users watch a video, and use the viewports they select to reinforce the importance of certain recommendations and penalize others. New recommendations not previously recognized by content analysis may also emerge. The resulting recommended viewports converge towards the regions of the video that are relevant to users. A user study involving 70 participants shows that a user interface built with our paradigm leads to more zooms into more informative regions with fewer interactions.
The work consists of several steps:
Step 1: Building Importance Maps. This step is achieved by combining the outputs of several algorithms: saliency maps, motion detection, and face detection.
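The fusion of these cues can be sketched as a weighted sum of normalized per-frame maps. The equal-weight default and the linear combination below are assumptions for illustration, not necessarily the project's exact fusion rule:

```python
import numpy as np

def combine_importance_maps(maps, weights=None):
    """Fuse per-frame cue maps (e.g. saliency, motion, face detection)
    into one importance map via a weighted sum of normalized maps.
    Equal weights and linear blending are illustrative assumptions."""
    if weights is None:
        weights = [1.0 / len(maps)] * len(maps)
    combined = np.zeros_like(maps[0], dtype=float)
    for m, w in zip(maps, weights):
        m = m.astype(float)
        rng = m.max() - m.min()
        if rng > 0:                       # normalize each cue to [0, 1]
            m = (m - m.min()) / rng
        combined += w * m
    return combined

# Example: fuse three 4x4 cue maps into one importance map
saliency = np.random.rand(4, 4)
motion   = np.random.rand(4, 4)
faces    = np.random.rand(4, 4)
importance = combine_importance_maps([saliency, motion, faces])
```

Because the weights sum to one and each cue is normalized, the combined map stays in [0, 1], which keeps the downstream viewport scoring comparable across frames.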
Step 2: Computing Viewports. We place viewports (rectangles) on the frame so as to maximize the heat (given by the importance map) inside each one.
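One way to find the rectangle with maximum heat is an exhaustive sliding-window search accelerated with a summed-area table, so each window sum costs O(1). The exhaustive search and fixed window size below are assumptions for illustration; the project may optimize viewport placement differently:

```python
import numpy as np

def best_viewport(importance, w, h):
    """Slide a w x h window over the importance map and return the
    top-left corner (x, y) maximizing total heat, plus that heat.
    Uses a summed-area table for O(1) window sums (illustrative sketch)."""
    H, W = importance.shape
    # summed-area table with a zero border row/column
    sat = np.zeros((H + 1, W + 1))
    sat[1:, 1:] = np.cumsum(np.cumsum(importance, axis=0), axis=1)
    best, best_xy = -np.inf, (0, 0)
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            s = (sat[y + h, x + w] - sat[y, x + w]
                 - sat[y + h, x] + sat[y, x])
            if s > best:
                best, best_xy = s, (x, y)
    return best_xy, best

heat = np.zeros((6, 8))
heat[2:4, 3:6] = 1.0              # a hot 2x3 region
(x, y), score = best_viewport(heat, w=3, h=2)
# → (3, 2) with score 6.0
```

Multiple recommendations can be obtained by repeating the search after suppressing the heat already covered by a chosen viewport.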
Step 3: Zoomable Video with Recommendations. We create a Zoomable Video Interface that allows users to zoom and pan over the frame. Clickable recommendations are displayed in the form of semi-transparent white rectangles; these recommendations correspond to the viewports computed in the previous step.
Step 4: Generating Interest Maps. User behaviour while interacting with the Zoomable Video Interface is collected and used to produce interest maps (as described in the Video Retargeting Project).
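Turning collected user behaviour into an interest map can be sketched as a simple vote count: each viewport a user selected votes for the pixels it covers, and the accumulated votes are normalized. This vote-counting scheme is an assumed stand-in for the project's exact weighting:

```python
import numpy as np

def interest_map(frame_shape, selected_viewports):
    """Accumulate user-selected viewports (x, y, w, h) into a per-frame
    interest map normalized to [0, 1]. Plain vote counting is an
    illustrative assumption, not the project's exact method."""
    H, W = frame_shape
    votes = np.zeros((H, W))
    for (x, y, w, h) in selected_viewports:
        votes[y:y + h, x:x + w] += 1.0
    if votes.max() > 0:
        votes /= votes.max()
    return votes

# Two users zoomed into overlapping regions of a 6x8 frame;
# the overlap receives the highest interest.
m = interest_map((6, 8), [(1, 1, 4, 3), (2, 2, 4, 3)])
```

Regions where many users' selections overlap thus stand out, while regions nobody zoomed into stay at zero.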
Step 5: Building New (Combined) Importance Maps. The interest maps are combined with the outputs of the algorithms used to compute importance maps in Step 1, producing new importance maps.
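This combination can be sketched as a blend between the content-based map and the crowdsourced interest map; the blending parameter alpha and the linear blend are assumptions for illustration:

```python
import numpy as np

def update_importance(content_map, interest_map, alpha=0.5):
    """Blend the content-based importance map (Step 1) with the
    crowdsourced interest map (Step 4). Regions users zoomed into are
    boosted, ignored regions are penalized. The linear blend and the
    value of alpha are illustrative assumptions."""
    return alpha * content_map + (1.0 - alpha) * interest_map

content  = np.array([[0.8, 0.1], [0.2, 0.0]])
interest = np.array([[0.2, 0.9], [0.0, 0.0]])
updated = update_importance(content, interest, alpha=0.5)
# the region at (0, 1), missed by content analysis, now emerges
```

This is how recommendations not detected by content analysis can surface: a region with low content-based importance but high user interest gains weight in the new map.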
Step 6: Computing New Viewports. Step 2 is reapplied to the new importance maps generated in Step 5.
Step 7: Zoomable Video with New Recommendations. The Zoomable Video Interface now proposes the recommendations computed in Step 6.
A. Carlier, R. Guntur, V. Charvillat, W. T. Ooi: Combining Content-based Analysis and Crowdsourcing to Improve User Interaction with Zoomable Video. ACM MM'11, pp. 43-52.
A. Carlier, A. Shafiei, J. Badie, S. Bensiali, W. T. Ooi: COZI: Crowdsourced and Content-based Zoomable Video Player. ACM MM'11, pp. 829-830.