Results of Year 3
In the third year of the project, we further evaluated the AXES PRO system, finalized the AXES RESEARCH system, and started preparing for AXES HOME.
The AXES PROFESSIONAL System
The AXES PRO system targets media professionals. These people search the archive for content on a daily basis, mostly for re-use.
AXES PRO was developed in the first two years of the project (see video walkthrough). We started Year 3 with an evaluation of this system by real-world users. Six image searchers tested AXES PRO, searching in a repository of 500 hours of NISV content.
From the user study we conclude that professional users are highly interested in the type of audio-visual search that AXES advocates. They were impressed by the professional look and smooth operation of the system. The lower accuracy of ‘noisy’ content-based automatic tools remains a point of attention, especially compared to the very accurate archivist-provided metadata they are used to working with.
Based on their feedback, we put extra effort into improving the quality of the search results. We also resolved some interface issues. A walkthrough video showing the functionality of AXES PRO was produced and is available on our website. Demonstrations were held at various events, giving people the opportunity to try out the system for themselves.
The AXES RESEARCH System
We then moved our attention to the second user group, consisting of journalists and academic researchers. To collect the user requirements, we surveyed academic researchers and journalists, with 615 respondents in total.
This user group mainly wants to browse and keep track of their search results. Instead of re-use, the focus is more on investigation and research. Taking these requirements into account, we built the AXES RESEARCH system (see video walkthrough).
In contrast with the AXES PRO interface, and in line with the differences in user requirements between professional and research users, the AXES RESEARCH user interface places a greater emphasis on linking, browsing, collaboration, and personalization. New functionalities have been added, such as a simplified search bar, a virtual cutter, and the possibility to add notes or create collections.
We again use on-the-fly model learning, which proved very useful in AXES PRO, but extend it with a set of pretrained models. Pretrained models allow instantaneous retrieval and accurate search results. If no pretrained model is available for a query, the system automatically switches to the “on-the-fly” mode: the textual query of the user is sent to a web image search engine like Google Images or Flickr, which retrieves a number of images that are used to train a visual classifier. With this visual classifier, and appropriate preprocessing and indexing of the video corpus, the system can then retrieve videos of the same person or object (e.g. a building facade) from the archive within seconds. As such, the user can search for any person or location, not only those present in the thesaurus. We also offer searching on generic categories, either pretrained or created on-the-fly, of persons (e.g. man, child, Muslim), face attributes (e.g. glasses, moustache), locations (e.g. countryside, city), objects (e.g. chair, car, school), and events (e.g. mountain climbing, demonstrating). Combined, this gives users unprecedented freedom in defining their information needs and searching directly in the audiovisual content.
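The switch between pretrained and on-the-fly models described above can be sketched as follows. This is a minimal illustration, not the AXES implementation: the function names (`search`, `train_linear_svm`, `fetch_web_examples`), the tiny feature vectors, and the training settings are all assumptions made for the example. The classifier is a plain linear SVM without a bias term, trained by subgradient descent, and the web image search is replaced by a caller-supplied function that returns feature vectors for the downloaded images.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(w: Vector, x: Vector) -> float:
    """Inner product of a weight vector and a feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svm(positives: List[Vector], negatives: List[Vector],
                     epochs: int = 200, lr: float = 0.1,
                     reg: float = 0.01) -> List[float]:
    """Train a linear SVM (hinge loss, L2 penalty) by subgradient descent.

    Hypothetical helper: no bias term, fixed learning rate.
    """
    data = [(x, 1.0) for x in positives] + [(x, -1.0) for x in negatives]
    dim = len(data[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in data:
            margin = y * dot(w, x)
            for i in range(dim):
                # The hinge term only contributes when the margin is violated.
                grad = reg * w[i] - (y * x[i] if margin < 1.0 else 0.0)
                w[i] -= lr * grad
    return w


def search(query: str,
           pretrained: Dict[str, Vector],
           fetch_web_examples: Callable[[str], List[Vector]],
           negatives: List[Vector],
           index: Dict[str, Vector],
           top_k: int = 3) -> Tuple[str, List[str]]:
    """Rank indexed videos for a query; return (mode, ranked video ids)."""
    if query in pretrained:
        mode, w = "pretrained", pretrained[query]
    else:
        # Assumed stand-in for querying a web image search engine and
        # extracting features from the returned images.
        mode = "on-the-fly"
        w = train_linear_svm(fetch_web_examples(query), negatives)
    ranked = sorted(index, key=lambda vid: dot(w, index[vid]), reverse=True)
    return mode, ranked[:top_k]
```

In this sketch the video corpus is assumed to be indexed in advance as one feature vector per video, so answering a query only requires one inner product per indexed item, whichever mode supplied the weights.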
To improve the quality of the returned results, we put a lot of effort into finding the best possible image and video representations. These should capture the essence of a scene, allowing simple models (e.g. linear support vector machines) to discriminate relevant from irrelevant content. At the same time, scalability towards large archives should not be compromised. While the on-the-fly methods need to find a compromise between accuracy and processing speed, the pretrained models can rely on larger and/or cleaner (preselected) training datasets and higher-dimensional representations. As a result, they give more accurate search results and improve the user experience. We were able to improve the performance of almost all components significantly.
Instead of entering keywords, a search can also be based on one or more images, either a keyframe from a video found in another search or an image uploaded by the user. The currently supported options for such similarity search are instance search (i.e. the same scene or object, e.g. a house façade or book cover) and face search. Another technology that proved very valuable is speech recognition, i.e. a speech-to-text tool. This converts the spoken audio to a time-coded transcript, which can be searched, read and downloaded. Finally, we further investigated hyperlinking, i.e. linking anchors in a video fragment with target data found elsewhere in the archive.
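To illustrate why a time-coded transcript is useful for search, the sketch below looks up a spoken phrase and returns the time span of each occurrence, so that a player could jump straight to the matching fragment. The `Word` record and the `find_phrase` function are hypothetical names chosen for this example, and the word-level timing format is an assumption; the actual AXES transcript format is not described here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video


def find_phrase(transcript: List[Word], phrase: str) -> List[Tuple[float, float]]:
    """Return the (start, end) times of every occurrence of the phrase."""
    terms = phrase.lower().split()
    if not terms:
        return []
    # Compare case-insensitively and ignore trailing punctuation.
    tokens = [w.text.lower().strip(".,!?;:") for w in transcript]
    hits = []
    for i in range(len(tokens) - len(terms) + 1):
        if tokens[i:i + len(terms)] == terms:
            last = transcript[i + len(terms) - 1]
            hits.append((transcript[i].start, last.end))
    return hits
```

A linear scan like this is enough for a single video; searching a whole archive would normally rely on a text index instead, which is outside the scope of this sketch.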
AXES RESEARCH was evaluated by academics and journalists, who received it with enthusiasm and deemed it very useful. Segment identification, the videocutter editing tool, and personalisation were most appreciated. Suggestions for an improved user interface and more search transparency are being taken into consideration for the next development phase. The resulting enhancements will be incorporated in the final prototype, AXES HOME, which is specifically designed for, and will be tested by, home users.
In addition to regular end-user testing, major parts of the underlying system and individual components, for example event detection, took part in benchmark evaluations such as TRECVID (video retrieval evaluation workshop), MediaEval (multimedia benchmark evaluations), and THUMOS (large-scale action recognition challenge). In all cases, the AXES team achieved excellent results (e.g. first place in THUMOS action recognition and in TRECVID multimedia event detection).
The AXES system was successfully deployed at three different locations: first at the integrator’s site (Airbus Defence and Aerospace, Paris), and subsequently at the sites of two user partners, the Netherlands Institute for Sound and Vision (in Hilversum) and the BBC (in London). Deployment at the user partners’ sites proved that local installation is feasible and demonstrated the flexibility and adaptability of the system. It also allowed the user partners to try, use, and demonstrate the system in their own environment.
In parallel with the development of the AXES RESEARCH system, we started preparing the AXES HOME system. This system is specifically geared towards the home user, with some special features such as tablet support and novel recommendation features.
Based on feedback from user surveys, a first mockup of the AXES HOME system has been developed (see screenshot), and research on how to realize it is ongoing.