By Greg Bailey
In January 2014 I started my position at Texas A&M University with Cushing Memorial Library and Archives, which holds the University Archives and Special Collections at A&M. Our only digital presence consisted of a Flickr account hosting items from the University Archives and some items from Special Collections that had been put into the Institutional Repository (OAK Trust). Eight months after I started, a new Associate Dean of Special Collections and Director of Cushing Library was hired. The new director and I began to voice our opinion that we needed not only to increase our presence on the web but also to have a system to handle both digitized and born digital materials. In time the Dean of the Libraries organized a retreat for interested parties, and out of that a task force was formed to investigate Digital Asset Management (DAM) tools and to come up with a recommendation for implementation.
In the fall of 2014 the task force was established with the objective of investigating and making recommendations for a solution or solutions that would enable the Texas A&M University Libraries to store, display, and preserve digitized and born digital university records and research. In the spring of 2015, the charge expanded to include attention to broader campus needs.
After defining an assessment process and expanding our scope to include campus, the task force first worked to conduct a campus needs assessment, to identify and develop use cases, and to distill core requirements. This became the basis of our testing rubrics. We ran multiple stages of assessment to identify and test systems, as well as to analyze the results of those tests. A recommendation was reached on the basis of this analysis and further inquiries.
Our analysis of twenty-six systems allowed us to confidently assert that no one digital asset management product would meet library and campus needs. Broadly, “digital asset management consists of management tasks and decisions surrounding the ingestion, annotation, cataloguing, storage, retrieval, and distribution” of image, multimedia, and text files. These tasks are performed by systems (DAMS) that differ in their approach to functions and range of associated capabilities. Given campus needs, and our experience as a leading developer with DSpace, which the Libraries uses as our IR, the task force was attuned to the particular importance of the data models embedded in these systems, which guide and constrain other functionality.
We were convinced that modular solutions to discrete needs for storing, displaying, and preserving digital assets are warranted, and that these solutions are likely to require customization. We recommended building a digital asset management ecosystem (DAME) rather than attempting to meet all needs with a single DAMS.
The choice of the word “ecosystem,” as opposed to “system” (as with a DAMS), is explained by the DAME’s emphasis on a distributed service architecture. This is an architecture in which the discrete roles of a DAMS are handled not by one application, but instead by a collection of applications, each one suited for the role it plays. The DAME’s structure will certainly vary from institution to institution, and in fact this flexibility is perhaps the DAME’s strongest quality. In general, a DAME’s ecosystem will be divided into the following layers:
- File service
In the DAME, the management layer is conceived of as a collection of web services that handle record creation, curation, and discovery. It does not, itself, handle the actual assets, but instead records the assets’ location and metadata, and allows for the management and retrieval of this information. The management layer should be composed of at least two elements, the first being a custom web service and the second a repository with a fully featured application programming interface (API). The repository application can be one of the many popular DAMS solutions that are currently in use, the only requirement being that it exposes all desired functionality through an API.
It may seem that a repository with a fully featured API would be sufficient to satisfy the needs of a management layer, but there are several good reasons for including a custom web service in this layer. The first reason is that this web service will act as an interface for all communication with the management layer, and by so doing, it makes the DAME repository agnostic. All other applications in the ecosystem will be programmed against the consistent API of the custom service, and the job of interfacing with the repository’s API is left solely to the custom web service. If the decision is made to switch repositories, the only thing that needs to be updated in the DAME will be the custom web service, and the rest of the ecosystem will not realize the change took place. The second reason for this separation is that it allows you to employ multiple repository solutions side-by-side, with the web service aggregating responses. Finally, in record retrieval, the authorization and authentication of the user can be handled by the custom web service, relieving the repository of any need to be compatible with the institution’s authentication and authorization strategy.
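The three reasons above can be illustrated with a minimal sketch. It is not the Libraries' implementation: the names `Record`, `RepositoryAdapter`, `InMemoryRepository`, and `ManagementService` are hypothetical, and the in-memory repository merely stands in for a client of a real repository API. The point is only that callers talk to one service, which aggregates across repositories and decides what a given user may see.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol, Set


@dataclass(frozen=True)
class Record:
    """What the management layer tracks: an asset's location and metadata."""
    record_id: str
    asset_uri: str
    metadata: Dict[str, str]
    restricted: bool = False


class RepositoryAdapter(Protocol):
    """Hypothetical contract that each repository backend must satisfy."""

    def search(self, term: str) -> List[Record]: ...


class InMemoryRepository:
    """Stand-in for a real repository API client (illustration only)."""

    def __init__(self, records: List[Record]) -> None:
        self._records = records

    def search(self, term: str) -> List[Record]:
        needle = term.lower()
        return [r for r in self._records
                if needle in r.metadata.get("title", "").lower()]


class ManagementService:
    """Single entry point that the rest of the ecosystem programs against."""

    def __init__(self, repositories: List[RepositoryAdapter],
                 authorized_users: Set[str]) -> None:
        self._repositories = repositories
        self._authorized_users = authorized_users

    def search(self, term: str, user: Optional[str] = None) -> List[Record]:
        # Aggregate responses from every configured repository.
        hits = [r for repo in self._repositories for r in repo.search(term)]
        # Authorization is decided here, not inside the repositories.
        if user in self._authorized_users:
            return hits
        return [r for r in hits if not r.restricted]


archives = InMemoryRepository([
    Record("ua-1", "file:///assets/ua-1.tif", {"title": "Campus yearbook"}),
    Record("ua-2", "file:///assets/ua-2.pdf", {"title": "Campus budget memo"},
           restricted=True),
])
special = InMemoryRepository([
    Record("sc-1", "file:///assets/sc-1.tif", {"title": "Campus map"}),
])
service = ManagementService([archives, special], authorized_users={"archivist"})

public_ids = [r.record_id for r in service.search("campus")]
staff_ids = [r.record_id for r in service.search("campus", user="archivist")]
```

Replacing a repository in this sketch means writing a new adapter class with the same `search` method; nothing that calls `ManagementService` would need to change.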
This management layer thus communicates with the persistence layer, which is not, by necessity, one of the more complicated portions of the DAME’s architecture. It is simply the data source, or collection of data sources, needed to support the repository. Most repositories that would work well in the DAME are likely to have varied options when it comes to persistence, making the persistence layer one of the more flexible aspects of the DAME. In general this layer will store the assets’ URI, metadata, and possibly even application-specific information needed by the presentation layer.
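As a rough illustration of what a persisted record might hold, the sketch below stores the three kinds of information named above: the asset's URI, its metadata, and optional application-specific hints for the presentation layer. The class name `PersistedRecord`, its field names, and the JSON serialization are assumptions made for the example, not a description of any particular repository's schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class PersistedRecord:
    """Hypothetical shape of one entry in the persistence layer."""
    asset_uri: str                      # where the file service keeps the asset
    metadata: Dict[str, str]            # descriptive metadata
    presentation_hints: Dict[str, str] = field(default_factory=dict)

    def to_json(self) -> str:
        # Serialize to a form that most data sources can store.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "PersistedRecord":
        # Rebuild the record from its stored form.
        return cls(**json.loads(payload))


record = PersistedRecord(
    asset_uri="file:///assets/ua-1.tif",
    metadata={"title": "Campus yearbook"},
    presentation_hints={"viewer": "image"},
)
restored = PersistedRecord.from_json(record.to_json())
```

Note that the asset itself never appears in the record; only its location does, which is what lets the data source behind this layer vary.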
The preservation layer, which had already been under development, would continue and be integrated into the new system. A processing layer would be connected to local redundant storage. That local storage would also be connected to dark archive storage, which is rarely accessed.
Every system that we tested consisted of different tools and components, bundled together as a single system. Part of the argument for a DAME over a DAMS is the ability to determine the components in these bundles locally, and to swap them out to meet evolving needs.
With that in mind, the task force recommended the deployment of modular digital asset management components to meet the complex needs of the Texas A&M University Libraries and campus. These include:
- The deployment of a system to manage and store digital assets and metadata. Our recommended open-source system is Fedora 4, to be coupled with Blacklight and Solr for search and retrieval. Solr indexes content managed by the repository, and Blacklight enables search and retrieval across the indexed content.
- The development of custom user interfaces as appropriate (likely, public user interface and administrative interfaces).
- The deployment of a triple store to enable linked data, along with Apache Camel and Fuseki as the basis of connecting Fedora to the triple store and to Solr indexing software.
- The deployment of an exhibition system. Our recommended open-source exhibition layer would be Spotlight, which is an extension to Blacklight and will easily integrate into our DAME.
- The deployment of a preservation system that would consist of Artefactual’s Archivematica connected to localized redundant storage. The redundant storage is connected to the dark archives of the Digital Preservation Network (DPN) and Amazon’s Glacier via DuraCloud.
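One way to picture the modularity argument is to treat the recommendations above as a mapping from roles to components, where any one role can be reassigned without disturbing the others. The sketch below is illustrative only: the role labels paraphrase the list above, `swap_component` is a hypothetical helper, and `ExampleExhibitTool` is a made-up placeholder rather than a real product.

```python
from typing import Dict, List

# Roles paraphrased from the task force's recommendations (labels are assumptions).
ECOSYSTEM: Dict[str, List[str]] = {
    "asset and metadata management": ["Fedora 4"],
    "search and retrieval": ["Solr", "Blacklight"],
    "user interfaces": ["custom public interface", "custom administrative interface"],
    "linked data": ["triple store", "Apache Camel", "Fuseki"],
    "exhibition": ["Spotlight"],
    "preservation": ["Archivematica", "local redundant storage", "dark archives"],
}


def swap_component(ecosystem: Dict[str, List[str]], role: str,
                   components: List[str]) -> Dict[str, List[str]]:
    """Return a copy of the ecosystem with one role reassigned."""
    if role not in ecosystem:
        raise KeyError(f"unknown role: {role}")
    # Copy every list so the original mapping is left untouched.
    updated = {name: list(parts) for name, parts in ecosystem.items()}
    updated[role] = list(components)
    return updated


# Hypothetical swap: only the exhibition role changes.
revised = swap_component(ECOSYSTEM, "exhibition", ["ExampleExhibitTool"])
```

In a bundled DAMS the equivalent change would usually mean replacing the whole system.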
The development of the ecosystem has started. The Libraries’ IT team has started working on bringing up Fedora 4, along with the other components recommended by the task force. As mentioned above the preservation layer had already been in development, and the final kinks are being worked out in that part of the system. The hope is that the ecosystem will be fully functional within a year.
Overall, the work of the task force was beneficial. We had input from a number of stakeholders, who brought forward desired functionality that any one group of users might not have considered. Special Collections had a very strong presence on the task force, as did our preservation unit, which had very similar ideas because the two groups regularly work together. The addition of subject/reference librarians and cataloging, along with the expertise of the Digital Initiatives group (Library IT), brought yet other perspectives. Having some university representatives also gave us an idea of what units around Texas A&M require when dealing with digital materials. The task force sent out surveys to a number of units on campus, and we were able to gather a large amount of useful information. At a minimum, I now know of some units that have large amounts of electronic files that we will have to prepare for in the near future as we bring up the DAME and continue to develop our digital archiving process at Texas A&M. In the end, this diverse group, with expertise in a number of areas, allowed us to test a large number of software solutions. We were able to robustly test the functionality of these solutions and to collect data on the strengths and weaknesses of the different software. The solution of a DAME built on Fedora 4 and incorporating a number of other open-source solutions might not work for other institutions, as we are heavily reliant on the expertise of our IT staff to bring all of these components together, but the process of creating a task force with a diverse membership (including people from outside the library) was beneficial. We now have buy-in that had not existed before from multiple units in the library, as well as interest from outside the Libraries, specifically in the area of materials related to the University Archives.
Greg Bailey is the University Archivist at Texas A&M University, a position he has held since January 2014. Prior to that, he served as the University Archivist and Records Manager at Stephen F. Austin State University. He is currently a member of the College and University Archives Section’s Steering Committee.