Progress in multimedia analysis

The same point also helps in full-text retrieval. When texts are included as counter-examples in the query, the computer may be able to determine the proper response much more quickly.

For query by example, a distinction should be made between external examples brought in by the user and internal examples, where the user has selected an item from the database. When the example is external, the query example is in practice not annotated, so the system can only search for similar items on the basis of the content descriptors described above. When the example is internal, similarity can also be based on the annotations of the items.

In practice the user will not get the answer directly from one of the above query types, but will engage in an interactive session with the system, where advanced visualization and relevance feedback from the user are used iteratively to bring the user closer to the desired information. Ideally, the system participates actively in finding the best solution by posing the most informative questions or showing the most informative results to the user.

Figure 3. Example of an advanced visualization tool where the user gives feedback to the system by indicating relevant and non-relevant items.

Interactivity poses heavy demands on the computing, storage, and display capacity of the system. Users want immediate feedback on their queries, but this might require computing a large set of relevant descriptors if external examples are used, and then comparing the descriptors of all elements in the dataset with the query. Advanced database techniques are required to limit the search. In addition, interactive search stretches the functionality of the presentation devices to the limit. Nevertheless, interactivity compensates for the inability of the computer to take account of context.
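The loop described above can be made concrete with a minimal sketch: items and the query are represented as feature-descriptor vectors, ranking is nearest-neighbour search over those vectors, and the user's relevant/non-relevant markings update the query in a Rocchio-style fashion (moving it toward examples and away from counter-examples). All names and the choice of Euclidean distance here are illustrative assumptions, not the system described in the text.

```python
import numpy as np

def query_by_example(query_vec, collection, k=3):
    """Rank collection items by Euclidean distance between descriptor
    vectors and return the indices of the k most similar items."""
    dists = np.linalg.norm(collection - query_vec, axis=1)
    return np.argsort(dists)[:k]

def rocchio_update(query_vec, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """One round of relevance feedback: move the query toward the mean
    of the relevant items and away from the mean of the counter-examples.
    The weights alpha/beta/gamma are conventional illustrative values."""
    new_q = alpha * query_vec
    if len(relevant):
        new_q = new_q + beta * np.mean(relevant, axis=0)
    if len(non_relevant):
        new_q = new_q - gamma * np.mean(non_relevant, axis=0)
    return new_q
```

Iterating these two steps, rank then update, is the interactive session sketched above; in a real system the exhaustive distance computation would be replaced by the indexing techniques the text alludes to.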
In a full interaction scheme, not only the query may be modified, but also what is to be considered similar, and what are to be considered good examples and counter-examples. With relevance feedback and visual presentation of the best results (see the figure), current content-based retrieval systems only scratch the surface of what is to be expected in the near future.

ARNOLD W.M. SMEULDERS, FRANCISKA DE JONG AND MARCEL WORRING

MULTIMEDIA INFORMATION TECHNOLOGY AND THE ANNOTATION OF VIDEO

In this section we review the state of the art in the multimedia information analysis disciplines: computer vision, text processing, and audio processing, followed by interaction and machine learning.

Computer vision started in the sixties with occasional pictures of space and medical images. Processing was concentrated on large computers. In the early nineties, personal computers became sufficiently powerful to hold a digital image, popularizing picture computation. Digital storage of pictures, and family communication with pictures through the internet, followed later. Digital image sensors are now found in many devices. It is estimated that more than half of all new cameras are digital, as well as a quarter of all family video devices. Hence, computer vision has developed from an esoteric science into a necessary ingredient of the information society in just 15 years.

An essential step forward was the recognition that precise segmentation of an object in the foreground against the background is unattainable. There is evidence that even humans break down images into named objects only when necessary. To identify a scene, it may be sufficient to recognize just a few details. A typical example is an orange circle somewhere in the middle of a picture signifying a setting sun. Another typical example, relating to texture, is a patch of striped skin immediately identifying the presence of a tiger or a zebra. And a typical example of a characteristic spatial arrangement is a face.
Now it can be understood why Hawaiian sunsets, faces, and tigers are frequently used in demonstrations of video search systems. But more progress is required to develop their success into a general capability of recognizing items in any image [Fergus 2003].

Figure 4. Sketch of the flow of information in a system for interactive visual annotation and query by example.
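The "orange circle signifying a setting sun" example can be illustrated with a deliberately crude heuristic: count the fraction of roughly orange pixels in the central region of an image and call the scene a sunset candidate when that fraction is large. The thresholds and function names below are invented for illustration; real systems use far richer descriptors.

```python
import numpy as np

def central_orange_fraction(img):
    """Fraction of pixels in the central half of an H x W x 3 uint8 RGB
    image that look orange: red high, green moderate, blue low.
    Thresholds are crude values chosen purely for illustration."""
    h, w, _ = img.shape
    center = img[h // 4: 3 * h // 4, w // 4: 3 * w // 4]
    r = center[..., 0].astype(int)
    g = center[..., 1].astype(int)
    b = center[..., 2].astype(int)
    orange = (r > 180) & (g > 60) & (g < 160) & (b < 100)
    return orange.mean()

def looks_like_sunset(img, threshold=0.2):
    """Flag an image as a sunset candidate when enough of its
    central region is orange -- a toy detector, not a classifier."""
    return central_orange_fraction(img) > threshold
```

The point of the passage, that a few salient details can suffice to identify a scene, is exactly why such a shallow cue already works on many sunset photographs, and also why it fails on, say, a photograph of an orange.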


Jaarboeken Stichting Archiefpublicaties | 2005 | pagina 53