Today there is a strong need to improve the quality of metadata descriptions of digital cultural heritage information, to harmonise information between different domains, and to make collections available as machine-readable and linked data. Linked open data, in combination with accepted international standards and support for those standards, is a first step towards increased use of high-quality cultural heritage data and towards interconnecting different data sets.
Digisam has been participating in a project coordinated by the National Archives of Sweden, involving, among others, the British Museum as a partner. The aim of the project was to examine whether archival information can be harmonised with CIDOC CRM, what the conditions are for making data from archives and museums interoperable with this model, and how those processes could be facilitated by a support service. The project has primarily tested the service 3M (Mapping Memory Manager), developed by FORTH.
The aim of the project was also to discuss various alternatives for the design of persistent identifiers (PID), the unique code strings for identification of digital objects. A workshop on persistent identifiers was organised to identify and discuss current routines and systems in the LAM sector.
Archives & Collections
Challenges we faced in describing archival information with the CIDOC CRM RDF model included defining the role of “archives creator”, the description of “volume” and object-based descriptions.
In CIDOC-CRM a person or organisation can be assigned different roles. An information object can be created by a “creator”. But the creator in the archival context is not necessarily the same person as the one who created the information. A person or organisation can, in the role of records creator or archive creator, receive information created by others. This means that in the archival context there is a difference between “archive” and “collection”. A collection is a selection of items collected on the basis of a specific theme or choice, which could form an archive but does not necessarily need to. A collection requires creators, but may have been acquired through several independent collecting activities.
Another challenge is the concept of ”volume” which goes back to a time when the archives were usually paper documents, which means that the volume represented both the physical object (cardboard box) and the logical object (information content of the documents in the box). This complicates the use of CIDOC-CRM as it requires a clear distinction between the physical and the logical nature (for example for updates) – though in the real archive world, much of the information on a volume is related to either the physical or the logical description of the object.
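The physical/logical split described above can be sketched in a few lines of Python. This is only an illustration, not the mapping used in the project: the example.org URIs and the function name split_volume are hypothetical, and the class label E22_Man-Made_Object follows older CIDOC CRM releases (newer releases name it E22 Human-Made Object).

```python
# Minimal sketch: one archival "volume" expressed as two linked CIDOC CRM entities.
# Assumption: all example.org URIs are hypothetical placeholders.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"


def split_volume(volume_id):
    """Return triples describing a volume as a physical carrier plus its content."""
    box = f"{EX}volume/{volume_id}/physical"     # the cardboard box
    content = f"{EX}volume/{volume_id}/content"  # the information in the documents
    return [
        (box, RDF_TYPE, CRM + "E22_Man-Made_Object"),
        (content, RDF_TYPE, CRM + "E73_Information_Object"),
        # P128 "carries" connects the physical carrier to the information it holds
        (box, CRM + "P128_carries", content),
    ]


for triple in split_volume("A1"):
    print(triple)
```

With this split, a statement about the box (for example its storage location) and a statement about the content (for example its subject) attach to different nodes, which is the distinction the model asks for.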
On the other hand, additional information could be added during the adaptation to a more object-oriented description. A letter might have a completely different value for researchers and users, and would be interesting as more than a document. It may have to do with, for example, specific material, special ink, etc.
The results of the mappings between archival data and CIDOC-CRM indicate that there are currently challenges with regard to the specific requirements for the description of archival information. While there is great potential in the ability to link information descriptions, it has also become obvious that an initiative towards harmonisation of the descriptions should be taken on a more general level.
Linking archival information and museum information
Given the challenges that we met when mapping archival information to the CIDOC CRM, we decided to test how links between archival and museum information could be created using the CIDOC CRM model. After a few searches in the material we decided to run a test with some photographs by the photographer Viktor Lundgren. In the collection system of the museum Murberget, we found a large number of photographs by Viktor Lundgren, including some showing horses.

Photo: Viktor Lundgren CC BY
The photographs were described at the object level, and it was quite straightforward to do a mapping of the information that was included.

To find a basic common level based on the metadata that was available in both the museum system and the archival system, and using CIDOC CRM as a starting point, we drew up the following model, which includes information from both the archival information system and the collection system:

In the archival system NAD (National Archive Database) we found a photography collection by Janrik Bromé where we got a hit on the same Viktor Lundgren, however not as a photographer but as the subject of a photograph. It is probably a self-portrait that he made to send out as a Christmas card, with the text on the back of the card: “Merry Christmas and a Happy New Year! Best wishes Viktor Lundgr” (the rest is missing).
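The way one person can connect records from the two systems can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's actual mapping: the example.org URIs, the sample triples and the helper roles_of are hypothetical, while P14, P62 and P108 are standard CIDOC CRM properties.

```python
# Minimal sketch: the same person node appears in museum and archival triples.
# Assumption: every example.org URI below is a hypothetical placeholder.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
EX = "http://example.org/"

person = EX + "person/viktor-lundgren"

# Museum side: a production event carried out by the photographer.
museum_triples = [
    (EX + "museum/production/1", CRM + "P108_has_produced", EX + "museum/photo/horses-1"),
    (EX + "museum/production/1", CRM + "P14_carried_out_by", person),
]

# Archive side: a photograph that depicts the same person.
archive_triples = [
    (EX + "archive/photo/christmas-card", CRM + "P62_depicts", person),
]


def roles_of(person_uri, triples):
    """Collect the roles in which a person occurs as the object of a triple."""
    roles = set()
    for _subject, predicate, obj in triples:
        if obj != person_uri:
            continue
        if predicate == CRM + "P14_carried_out_by":
            roles.add("producer")
        elif predicate == CRM + "P62_depicts":
            roles.add("depicted")
    return roles


print(sorted(roles_of(person, museum_triples + archive_triples)))
```

Because both data sets point at one shared person identifier, a single lookup returns Lundgren both as producer and as depicted subject. This only works if the two systems agree on the identifier, which is why authority records matter here.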

The image below shows the result of the mapping on the basis of archival information, a graphic representation of the hierarchical structures, expressed through relationships in the CIDOC CRM.

Once we had established the basic information about Viktor Lundgren we could easily find much more information about him in NAD, including church records (birth records, parish book) and county judicial archives (probate). We could even find information about Lundgren as a writer, together with an authority file on him as a writer, in the National Library database Libris and in VIAF (Virtual International Authority File).
However, our search for the photographer Viktor Lundgren in the national photographer register in Kulturnav, the web platform for authority files, did not give any match, even though there was an authority record of him as a photographer embedded in a metadata record from Sundsvall museum. In Kulturnav it is possible to cooperate on the authority lists, and in our working group the question came up of how to add a single authority record to the national photographer database. We contacted Kulturnav/Nordic Museum, and they published an authority record for the photographer Viktor Lundgren in the register so that we could link to it. This raised an interesting question about identifiers. Generally, authority files should not be duplicated, but here there are two different authority lists: one list for writers, pointing to Lundgren in his role as a writer, and the national photographer register, pointing to his role as a photographer. There is no doubt that the authority record for Lundgren should be part of both lists, reflecting his different roles, but is there a need for two separate persistent identifiers (with a “SameAs” connection), or should an identifier from Libris/VIAF be re-used? Technically, both approaches are possible, and we look forward to dealing with this question in our future work.
For other authority files (for example terms like “photographer”) we used TMP2 (ThesaurusManagement Platform), a web platform for collaborating on and publishing thesauri and authority files. There, we could link information in the metadata to terms such as “photographer”, “Professional photography” and “Black-and-white photograph”, to name a few.
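The two identifier options can be compared in a small sketch. It is purely illustrative: the identifiers are hypothetical example.org placeholders rather than real Libris, VIAF or Kulturnav identifiers, the dictionary layout and the helper registers_for are invented for the example, and only the owl:sameAs URI is a real vocabulary term.

```python
# Minimal sketch of the two options: separate identifiers joined by sameAs,
# or one identifier re-used in both registers.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Assumption: hypothetical placeholder identifiers, not real authority URIs.
WRITER_ID = "http://example.org/writers/lundgren"
PHOTOGRAPHER_ID = "http://example.org/photographers/lundgren"

# Option A: each register mints its own identifier, connected by a sameAs link.
option_a = {
    "registers": {"writers": {WRITER_ID}, "photographers": {PHOTOGRAPHER_ID}},
    "links": [(PHOTOGRAPHER_ID, OWL_SAME_AS, WRITER_ID)],
}

# Option B: the photographer register re-uses the existing writer identifier.
option_b = {
    "registers": {"writers": {WRITER_ID}, "photographers": {WRITER_ID}},
    "links": [],
}


def registers_for(identifier, model):
    """List registers holding the identifier or anything linked to it by sameAs.

    A single pass over the links, which is enough for the one-link case here.
    """
    equivalent = {identifier}
    for subject, predicate, obj in model["links"]:
        if predicate == OWL_SAME_AS and (subject in equivalent or obj in equivalent):
            equivalent |= {subject, obj}
    return sorted(name for name, ids in model["registers"].items() if ids & equivalent)


print(registers_for(WRITER_ID, option_a))
print(registers_for(WRITER_ID, option_b))
```

Both options let a consumer discover that Lundgren belongs to both registers. The difference is that option A requires the consumer to follow sameAs links, while option B makes one register dependent on an identifier maintained elsewhere.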
Results
Regarding the interoperability between archives and museums, and the information that could be harmonised by use of CIDOC CRM, there are both opportunities and challenges. The results of the mappings between archival data and CIDOC-CRM RDF show that there are challenges with regard to the specific requirements for the description of archival information. Given current limitations, the difficulties lie primarily in describing the material itself, because the information is not mapped on the same level, but also in finding a way to express some specific terms, such as “archival volume”.
Today, the focus when linking information between different metadata models is on linking information through authority files. There is also great potential in linking information by creating interoperability between data models, which is what we explored with the help of CIDOC CRM in the tests carried out. It is also clear that a comprehensive initiative should be taken on a more general level. In the library domain, similar challenges have been addressed, and adjustments have been made on a global level in cooperation with ICOM / CIDOC by developing adaptations of CIDOC CRM for library materials, including authority data: the FRBR, FRAD and FRSAD models and FRBRoo. This means that library and museum data today have a common conceptual model for the description of information.
Do you have personal experience of the linking of information from archives and museums? Have you been working on harmonisation of these data models? We are grateful for your comments and views on the project, either directly here on the blog or by email to sanja.halling@riksarkivet.se (note: deadline for feedback is May 23).
Lina Marklund and Sanja Halling