b. Several studies note that as technology develops, new value can be assigned to records; this is particularly true of Cloud services. For example, Instagram is used as both a "storage box" of personal photos and a space to share information about users' identity and activities.12 Should the archival management system capture and preserve the profile in place at the moment of creation or transmission of each record? Additional complexities arise when new people enter the picture. The collaborative nature of social media platforms encourages the creation of new records (or new representations of existing records) via linkages, embedded content and comments. "Likes," tags, and participation by others on photos add new value to those possessions, but such metadata can easily become obscured in the interface, if not trapped in the application where it is recorded. The additional information added by others might be considered as context-of-creation metadata (in the case of collaborative environments such as Google Drive) or context-of-use metadata, such as "likes" and "shares" in a social media platform. Both forms of context suggest that archival systems will need a method to represent the role that a particular user played in modifying or adding to the core record, that is to say, the original "creation" developed by the original "author," "creator," or "collector" of a particular work (Bak, Hill, 2015, pp. 101-161). Archival descriptive records might somehow capture and fix these new associations as some representation of provenance.13

Context is and has always been a fluid entity in time, that is, it changes as time passes. What is new today is that context has become a fluid entity in space, that is, it changes as we look at it from a different perspective.
For example, a document stored in Google Drive or a similar Cloud-storage service may be represented as belonging to one folder for the original creator and a different folder for a contributor granted permission to update the document. Given the collaborative nature of these tools, it appears that in general the same document belongs to different folders according to the agent - be it an individual or a system - that interacts with the document.14 Similarly, social media postings appear at a particular point in a stream of posts. The specific stream is produced by the interaction of object metadata with user preferences and choices, and these of course vary for different users at different times; as users comment on or annotate that record, evidence about its use accrues alongside the original post.

The consequential question is whether the standards and tools available to archivists will allow them to preserve both the records and the complex relationships reflecting their creation and use, which represent a major part of their context. A preliminary question should be whether archivists agree that such a network of relationships needs to be preserved. If so, what can be done to help them implement a cohesive set of archival services that are suitable to the Cloud-based environment in which many people live their digital lives? Should archivists stick to a static, single perspective that frames data and metadata once they cross the archival threshold, or should they adopt a more flexible approach where different perspectives may coexist? What metadata should be retained? For what purposes? Furthermore, how much metadata is enough? In the digital environment, metadata associated with or embedded into records may provide relevant information on the provenance of either the records themselves or the systems in which they reside.
However, if the scope of provenance is broadened to include societal provenance,15 the list of sources from which metadata can be drawn needs to be extended to include materials documenting aspects of both the society at large and the specific communities in which the records have been created, managed and used.

Linked Data

The most promising model for describing digital resources is RDF (Resource Description Framework).16 Its very simple design is based on the notion of a triple, that is, a statement consisting of a subject, a predicate, and an object, describing some elemental aspect of a resource. RDF is a fundamental component of the Semantic Web architecture, since it makes it possible - along with other Web technologies - to publish and interlink structured data that can support semantic queries, i.e., queries that enable the retrieval of both explicit and implicit information.17 Data published on the Web according to this architecture are called Linked Data.18 Ontologies complement and enhance the power of Linked Data, as they are formal specifications of a shared conceptualization, and act as a cornerstone in defining a knowledge domain. Tim Berners-Lee established four simple rules for creating Linked Data:

"1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
4. Include links to other URIs so that they can discover more things" (Berners-Lee, 2006).

It is interesting to note that Linked Data seem to be a perfect fit for the nebula of data objects mentioned above: statements can be linked to other statements,

archives in liquid times

12 The term "storage box" is used by Odom, Sellen, Harper and Thereska to illustrate how casual users may treat networked environments as a place to make digital materials accessible across different physical places or as an alternative place of storage for backup purposes. See William Odom et al., "Lost in Translation: Understanding the Possession of Digital Things in the Cloud," in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Austin, 2012 (New York, NY: ACM, 2012), 781-790.
13 New representations of provenance as a more complete set of information about actions taken in the origination and subsequent handling of a digital object can be represented in records complying with the requirements of the PROV Ontology. See Paolo Missier, Khalid Belhajjame and James Cheney, "The W3C PROV Family of Specifications for Modelling Provenance Metadata," in Proceedings of the 16th International Conference on Extending Database Technology, Austin, 2012 (New York, NY: ACM, 2013), 773.
14 Please note that we are not referring to the case in which a document is assigned to different folders for records management purposes. We are referring to the fact that a specific document gets a different context according to the user that interacts with it.

giovanni michetti
provenance in the archives: the challenge of the digital

15 Societal provenance is a term used to mean provenance in the broader sociocultural dimension. Records creation, management, use and preservation are sociocultural phenomena. Therefore, provenance may be interpreted taking into account the sociocultural dimension as the context in which all actions take place.
16 For more information on RDF, see https://www.w3.org/RDF/.
17 The triples describe resources, so they may be interpreted as metadata, that is, data about data. However, it is important to highlight that being metadata is not an ontological property, since there is no such thing as metadata per se. Some data are called metadata, because a special value is assigned to them - they are recognized as conveying information on some specific dimension considered as being relevant in a given context. For example, dates are usually considered metadata, because of the relevance of the temporal dimension. At the same time, dates are data, because they are usually embedded into documents, that is, they are an integral part of the datum. There is neither antithesis nor contradiction - everything is data. Sometimes it is called metadata to highlight its special value.
18 RDF is a data architecture, while Linked Data is a way of publishing RDF data.
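
The triple model described under Linked Data, and the earlier observation that the same document may sit in different folders depending on the agent who interacts with it, can be illustrated with a small sketch. It is not taken from the chapter and does not use a real RDF library: the namespace http://example.org/, the predicate names (createdBy, contributedBy, ofDocument, forAgent, inFolder) and the helper functions are all hypothetical, chosen only to show how statements about one resource can link to other resources so that several perspectives coexist.

```python
from typing import Dict, Iterator, List, NamedTuple, Optional


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


EX = "http://example.org/"  # hypothetical namespace, for illustration only

STORE: List[Triple] = [
    # Who created the document and who later contributed to it.
    Triple(EX + "doc/report", EX + "createdBy", EX + "agent/alice"),
    Triple(EX + "doc/report", EX + "contributedBy", EX + "agent/bob"),
    # Each "placement" is itself a resource, so the folder in which the
    # document appears can be stated separately for each agent.
    Triple(EX + "placement/1", EX + "ofDocument", EX + "doc/report"),
    Triple(EX + "placement/1", EX + "forAgent", EX + "agent/alice"),
    Triple(EX + "placement/1", EX + "inFolder", EX + "folder/projects"),
    Triple(EX + "placement/2", EX + "ofDocument", EX + "doc/report"),
    Triple(EX + "placement/2", EX + "forAgent", EX + "agent/bob"),
    Triple(EX + "placement/2", EX + "inFolder", EX + "folder/shared-with-me"),
]


def match(store: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for t in store:
        if ((s is None or t.subject == s)
                and (p is None or t.predicate == p)
                and (o is None or t.obj == o)):
            yield t


def folders_by_agent(store: List[Triple], doc: str) -> Dict[str, str]:
    """Follow the links from each placement of a document to its agent and folder."""
    views: Dict[str, str] = {}
    for placement in match(store, p=EX + "ofDocument", o=doc):
        agent = next(match(store, s=placement.subject, p=EX + "forAgent")).obj
        folder = next(match(store, s=placement.subject, p=EX + "inFolder")).obj
        views[agent] = folder
    return views


if __name__ == "__main__":
    # Print the folder in which each agent sees the same document.
    for agent, folder in folders_by_agent(STORE, EX + "doc/report").items():
        print(agent, "->", folder)
```

In this sketch neither folder is privileged as the "true" location of the document: both placements are recorded as linked statements, which is one possible way of keeping multiple perspectives side by side rather than fixing a single one.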


Jaarboeken Stichting Archiefpublicaties | 2017 | | pagina 120