understand the meanings and biases hidden in our professional tools, practices and
theories. "Recognizing the presence of an underlying paradigm and understanding
the values it conveys is not difficult when we deal with concepts, principles and
categories, while it may be tricky when we deal with technical, apparently neutral
standards. In fact, different technologies may rely on different philosophies."
(Michetti, 2015, p. 155) So far, archivists and records managers have focused on the
documentary object as a whole. RDF and Linked Data are almost a Copernican
revolution, because they rely on information atoms that - in theory - can be
aggregated and manipulated at will. This is the perfect solution for those like Greg
Bak who advocate item-level thinking (Bak, 2012). However, the adoption of
XML, RDF, Linked Open Data and other technologies is more than a technical
option: it is rather the choice of a specific knowledge paradigm, and by no means a neutral one.
In the case of Linked Data, the graph is not only the symbolic representation of the
network of relationships among the entities that make up the archival description.
It is also the form taken by data, the structure that houses the descriptions, the
container that gives shape to our vision of the world. To paraphrase Bowker and Star,
there is nothing wrong with that. However, we need to understand the profound
significance of this approach.
The graph offers many advantages, but its strength - that is, the potential to create a
network of connections that can be expanded indefinitely - can prove to be a limit.
For example, if we consider EAD, it is evident that its limit resides in its design, that
is, in conceiving and modelling an archival description as a document. As a matter of
fact, EAD provides a digital replica of the paper object. However, this approach
still has some justification, once we recognize that archival description is
an autonomous work. In fact, in addition to practical and operational purposes,
archival description also performs a fundamental function of mediation between sources
and users, and supports the authenticity of the sources. In a graph, it can be difficult
to recognize the boundaries of a given archival description. With Linked Data,
Anyone can say Anything about Anything:25 once we accept this so-called triple-A
principle, links explode - that is the beauty of Linked Data - boundaries
disappear, and users can enter the description directly from anywhere in the graph. In a sense, this
is a profound form of disintermediation that is destined to grow as visualization
techniques and strategies occupy the archival space, dominated so far by the written
word, narrative and hierarchical diagrams. The complex network of relationships
underlying - rather, making up - an archive can now be represented in a myriad
of ways. This is not a criticism of Linked Data: the graph paradigm is indeed a
promising data architecture. This is rather an exploration of the possible limits and
dangers of this paradigm. In short, archivists should investigate this transformation
process, which is slowly moving archival description in the direction of
bibliographic description: high fragmentation of information, and reduction of the
narrative dimension.
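The triple-A principle can be illustrated with a minimal sketch, here in plain Python rather than an RDF library, and with hypothetical URIs and values: two independent sources publish triples about the same resource, and merging the graphs is a simple union of statements in which conflicts coexist and the boundaries between descriptions dissolve.

```python
# A triple is modelled as a (subject, predicate, object) tuple.
# All URIs and values below are hypothetical.
archive_a = {
    ("ex:fonds42", "dc:creator", "Smith, J."),
    ("ex:fonds42", "ex:extent", "12 boxes"),
}

# Anyone can say Anything about Anything: a second, unrelated source
# asserts a conflicting creator for the very same resource.
community_b = {
    ("ex:fonds42", "dc:creator", "Jones, M."),
    ("ex:fonds42", "ex:subject", "labour history"),
}

# Merging RDF graphs is just set union: nothing prevents or even flags
# the conflict, and the boundary between the two descriptions is lost.
merged = archive_a | community_b

creators = {o for (s, p, o) in merged if p == "dc:creator"}
print(sorted(creators))  # two irreconcilable statements coexist
```

The sketch makes the point concrete: the merged graph is perfectly well-formed RDF, yet it contains two incompatible assertions about the creator, and nothing in the data model itself tells us which one to trust.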
Finally, it should be noted that the effects of the triple-A principle are multiplied
when combined with the Open World Assumption (OWA). Roughly speaking, this
assumption states that the absence of a statement does not imply a statement
of absence (for example, the absence of a date of birth does not mean that the
person was never born).26 Under these conditions, what value should be attributed to
the statements (i.e., the triples)? The question is not trivial and indeed takes us back
to issues such as source of authority and technical expertise, which have a deep
connection with provenance and thus should be taken into account when designing
new models for archival description. Strategies are needed to assess users' trust in
relation to the quality of information on provenance. After all, this brings us back to
the trust issue that Tim Berners-Lee already identified at the top of the Semantic Web
stack (Berners-Lee, 2000).
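The difference between closed-world and open-world readings of missing data can be sketched as follows, again in plain Python with hypothetical data: under the closed-world reading an absent statement is treated as false, while under the OWA it is merely unknown.

```python
# Hypothetical triples about a person; note that there is no
# date-of-birth statement at all.
triples = {
    ("ex:person1", "foaf:name", "Ada"),
    ("ex:person1", "ex:occupation", "archivist"),
}

def closed_world(subject, predicate):
    """Closed World Assumption: what is not stated is taken to be false."""
    return any(s == subject and p == predicate for (s, p, o) in triples)

def open_world(subject, predicate):
    """Open World Assumption: what is not stated is simply unknown."""
    if any(s == subject and p == predicate for (s, p, o) in triples):
        return True
    return "unknown"  # absence of a statement is not a statement of absence

print(closed_world("ex:person1", "ex:dateOfBirth"))  # False
print(open_world("ex:person1", "ex:dateOfBirth"))    # unknown
```

Under the OWA, a query about the missing date of birth cannot conclude anything, which is precisely why the value and trustworthiness of the statements that *are* present become a central question.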
Conclusions
As already stated and discussed, the Principle of Provenance is a pillar of Archival
Science, originally intended to prevent the intermingling of documents from
different origins, in order to maintain the identity of a body of records. Peter Scott
challenged such a view. As a consequence, provenance in the archival domain moved
from a simplistic one-to-one relationship to a multi-dimensional approach, and
started being understood as a network of relationships between objects, agents and
functions. Conceptual debate pushed the boundaries of provenance further: the
established orthodoxies cracked under the weight of societal, parallel and
community provenance. The digital environment and new technologies have
presented unpredictable challenges to the concept of provenance: not only are
digital objects often the result of an aggregation of several different pieces, but it is
also extremely easy to mix and re-use them, to a point where it may be very difficult to
trace their provenance. Cloud Computing has complicated the picture further, due
to the limited control that can be exercised over Cloud service providers and
their procedures. As a result, the archival functions are compromised, since objects
get their meaning from their context, and provenance plays a major role in
identifying and determining such context: whenever provenance is flawed, so is
context, hence the overall meaning of an object. Moreover, any lack of control over
provenance determines uncertainty, which in turn affects trust in digital objects,
thus hindering the implementation of the top level of the Semantic Web stack
designed by Tim Berners-Lee.
However, new technologies also provide means to cope with such complexity.
Resource Description Framework (RDF) and ontologies can be used to represent
provenance through new standards and models in a granular and articulated way
that was not conceivable before the advent of computers. Provenance is slowly
taking the form of a network of triples, that is, a complex set of interrelated
statements that is apparently distant from the original Principle of Provenance, yet
archives in liquid times
25 "To facilitate operation at Internet scale, RDF is an open-world framework that allows anyone to say
anything about anything. In general, it is not assumed that all information about any topic is available.
A consequence of this is that RDF cannot prevent anyone from making nonsensical or inconsistent
assertions, and applications that build upon RDF must find ways to deal with conflicting sources of
information." World Wide Web Consortium, Resource Description Framework (RDF): Concepts and Abstract
Data Model, W3C Working Draft 29 August 2002, eds. Graham Klyne and Jeremy Carroll, accessed October
6, 2017, https://www.w3.org/TR/2002/WD-rdf-concepts-20020829/#xtocid48014.
giovanni michetti provenance in the archives: the challenge of the digital
26 The Open World Assumption codifies the informal notion that in general no single agent or observer has
complete knowledge. Not surprisingly, the Semantic Web makes the Open World Assumption.