Assessing Digital Asset Management Tools at Texas A&M University

By Greg Bailey

In January 2014 I started my position at Texas A&M University with Cushing Memorial Library and Archives, which holds the University Archives and Special Collections at A&M.  Our only digital presence consisted of a Flickr account hosting items from the University Archives, plus some items from Special Collections that had been deposited in the institutional repository (OAK Trust).  Eight months after I started, a new Associate Dean of Special Collections and Director of Cushing Library was hired.  The new director and I began to voice our opinion that we needed to increase our presence on the web and also needed a system to handle both digitized and born-digital materials. In time the Dean of the Libraries organized a retreat for interested parties, and out of that a task force was formed to investigate Digital Asset Management (DAM) tools and to come up with a recommendation for implementation.

In the fall of 2014 the task force was established with the objective of investigating and making recommendations for a solution or solutions that would enable the Texas A&M University Libraries to store, display, and preserve digitized and born-digital university records and research.  In the spring of 2015, the charge expanded to include attention to broader campus needs.

After defining an assessment process and expanding our scope to include campus, the task force first worked to conduct a campus needs assessment, to identify and develop use cases, and to distill core requirements. This became the basis of our testing rubrics. We ran multiple stages of assessment to identify and test systems, as well as to analyze the results of those tests. A recommendation was reached on the basis of this analysis and further inquiries.

Our analysis of twenty-six systems allowed us to confidently assert that no one digital asset management product would meet library and campus needs. Broadly, “digital asset management consists of management tasks and decisions surrounding the ingestion, annotation, cataloguing, storage, retrieval, and distribution” of image, multimedia, and text files.[1] These tasks are performed by systems (DAMS) that differ in their approach to functions and range of associated capabilities. Given campus needs, and our experience as a leading developer with DSpace, which the Libraries use as our institutional repository (IR), the task force was attuned to the particular importance of the data models embedded in these systems, which guide and constrain other functionality.

Digital Asset Management Ecosystem model. Image created by Jeremy Huff, Senior Software Applications Developer for the TAMU Libraries.

We were convinced that modular solutions to discrete needs for storing, displaying, and preserving digital assets are warranted, and that these solutions are likely to require customization. We recommended building a digital asset management ecosystem (DAME) rather than attempting to meet all needs with a single DAMS.

The choice of the word ecosystem, as opposed to “system” (as with a DAMS), reflects the DAME’s emphasis on a distributed service architecture. This is an architecture in which the discrete roles of a DAMS are handled not by one application, but instead by a collection of applications, each one suited to the role it plays. The DAME’s structure will certainly vary from institution to institution, and in fact this flexibility is perhaps the DAME’s strongest quality. In general, a DAME will be divided into the following layers:

  • Management
  • Persistence
  • Presentation
  • Authorization
  • File service
  • Storage
  • Preservation

In the DAME, the management layer is conceived of as a collection of web services that handle record creation, curation, and discovery. It does not, itself, handle the actual assets, but instead records the assets’ locations and metadata, and allows for the management and retrieval of this information. The management layer should comprise at least two elements: a custom web service, and a repository with a fully featured application programming interface (API). The repository application can be one of the many popular DAMS solutions currently in use; the only requirement is that it expose all desired functionality through an API.

It may seem that a repository with a fully featured API would be sufficient to satisfy the needs of a management layer, but there are several good reasons for including a custom web service in this layer. The first is that this web service acts as the interface for all communication with the management layer, and in so doing makes the DAME repository-agnostic. All other applications in the ecosystem are programmed against the consistent API of the custom service, and the job of interfacing with the repository’s API is left solely to the custom web service. If the decision is made to switch repositories, the only thing that needs to be updated in the DAME is the custom web service; the rest of the ecosystem will never notice the change. The second reason for this separation is that it allows you to employ multiple repository solutions side by side, with the web service aggregating responses. Finally, in record retrieval, the authorization and authentication of the user can be handled by the custom web service, relieving the repository of any need to be compatible with the institution’s authentication and authorization strategy.
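
The facade idea above can be sketched in a few lines. This is a minimal illustration of the pattern, assuming hypothetical class and method names (RepositoryClient, FedoraClient, ManagementService) rather than anything from an actual implementation:

```python
# Minimal sketch of a repository-agnostic management layer.
# All names here are hypothetical illustrations of the pattern,
# not actual TAMU code.
from abc import ABC, abstractmethod

class RepositoryClient(ABC):
    """Adapter interface: one implementation per backing repository."""
    @abstractmethod
    def get_record(self, record_id: str) -> dict: ...

class FedoraClient(RepositoryClient):
    """In practice this would call the repository's own REST API."""
    def get_record(self, record_id: str) -> dict:
        return {"id": record_id, "source": "fedora"}

class ManagementService:
    """The custom web service: the rest of the ecosystem sees only this API."""
    def __init__(self, repositories):
        self.repositories = repositories

    def get_record(self, record_id: str, user: str) -> dict:
        # Authentication/authorization against the institution's own
        # strategy would happen here, so the repositories never need to.
        results = [repo.get_record(record_id) for repo in self.repositories]
        # Multiple repositories can sit side by side; the service
        # aggregates their responses into one consistent shape.
        return {"id": record_id, "records": results}

svc = ManagementService([FedoraClient()])
record = svc.get_record("obj-1", user="staff-member")
```

Swapping repositories then means writing one new adapter class; nothing else in the ecosystem changes.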

This management layer thus communicates with the persistence layer, which is not, by necessity, one of the more complicated portions of the DAME’s architecture. It is simply the data source, or collection of data sources, needed to support the repository. Most repositories that would work well in the DAME are likely to have varied options when it comes to persistence, making the persistence layer one of the more flexible aspects of the DAME. In general this layer will store the assets’ URI, metadata, and possibly even application-specific information needed by the presentation layer.

The preservation layer, which had already been under development, would continue to be built out and would be integrated into the new system.  A processing layer would be connected to local redundant storage.  That local storage would in turn be connected to rarely accessed dark archive storage.

Every system that we tested consisted of different tools and components, bundled together as a single system. Part of the argument for a DAME over a DAMS is the ability to determine the components in these bundles locally, and to swap them out to meet evolving needs.

With that in mind, the task force recommended the deployment of modular digital asset management components to meet the complex needs of the Texas A&M University Libraries and campus. These include:

  • The deployment of a system to manage and store digital assets and metadata. Our recommended open-source system is Fedora 4, to be coupled with Blacklight and Solr for search and retrieval. Solr indexes content managed by the repository, and Blacklight enables search and retrieval across the indexed content.
  • The development of custom user interfaces as appropriate (likely, public user interface and administrative interfaces).
  • The deployment of a triple store to enable linked data, along with Apache Camel and Fuseki as the basis of connecting Fedora to the triple store and to Solr indexing software.
  • The deployment of an exhibition system.  Our recommended open-source exhibition layer would be Spotlight, which is an extension to Blacklight and will easily integrate into our DAME.
  • The deployment of a preservation system consisting of Artefactual’s Archivematica connected to localized redundant storage.  That redundant storage is in turn connected to the dark archives of the Digital Preservation Network (DPN) and Amazon’s Glacier via DuraCloud.
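
The search-and-retrieval pairing in the first recommendation can be made concrete with a small sketch: Blacklight retrieves records by issuing HTTP queries against Solr’s select endpoint over the index built from repository content. The core name and field name below are hypothetical placeholders, not a production configuration.

```python
# Sketch of the Blacklight/Solr pairing: discovery happens via HTTP
# queries to Solr's "select" endpoint. The core name and field name
# below are hypothetical placeholders.
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr/dame-core"  # hypothetical core name

def select_url(query: str, rows: int = 10) -> str:
    """Build the kind of select URL a discovery layer issues to Solr."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{SOLR_BASE}/select?{urlencode(params)}"

url = select_url("title_tesim:yearbook")
print(url)
# → http://localhost:8983/solr/dame-core/select?q=title_tesim%3Ayearbook&rows=10&wt=json
```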

The development of the ecosystem has started.  The Libraries’ IT team has begun bringing up Fedora 4, along with the other components recommended by the task force.  As mentioned above, the preservation layer had already been in development, and the final kinks in that part of the system are being worked out.  The hope is that the ecosystem will be fully functional within a year.

Overall, the work of the task force was beneficial.  We had input from a number of stakeholders who brought forward desired functionality that any one group of users might not have considered.  Special Collections had a very strong presence on the task force, as did our preservation unit, which had very similar ideas since the two groups regularly work together. The addition of subject/reference librarians and cataloging staff, along with the expertise of the Digital Initiatives group (Library IT), brought yet other perspectives. Having some university representatives also gave us an idea of what units around Texas A&M require when dealing with digital materials.  The task force sent out surveys to a number of units on campus, and we were able to gather a large amount of useful information.  At a minimum, I now know of some units with large amounts of electronic files that we will have to prepare for in the near future as we bring up the DAME and continue to develop our digital archiving process at Texas A&M.  In the end, this diverse group with expertise in a number of areas allowed us to test a large number of software solutions, robustly assess their functionality, and collect data on the strengths and weaknesses of the different packages.  The solution of a DAME built on Fedora 4 and a number of other open-source components might not work for other institutions, as we are heavily reliant on the expertise of our IT staff to bring all of these components together, but the process of creating a task force from a diverse group (including those outside the library) was beneficial.  We now have buy-in that had not existed before from multiple units in the library, and interest from outside the Libraries, specifically in the area of materials related to the University Archives.

Greg Bailey is the University Archivist at Texas A&M University, a position he has held since January 2014.  Prior to that, he served as the University Archivist and Records Manager at Stephen F. Austin State University.  He is currently a member of the College and University Archives Section’s Steering Committee.


May You Live in Interesting Times: Responding to Administrative Change at the University of Louisville

Grawemeyer Hall, University of Louisville, 2014. Image courtesy University of Louisville.

By Carrie Daniels

Over the past year, the University of Louisville has experienced an unprecedented level of administrative turnover in the presidency, the board of trustees, and the leadership of the University of Louisville Foundation (ULF), the charitable organization that helps support the activities of the University. The unfolding events added urgency to the development and clarification of some of the University Archives’ policies: issues we had been “working on” for years – including email preservation – could no longer be considered a theoretical problem.

While it is difficult to identify the true “beginning” of the turmoil, by March 2016, the Board of Trustees and Faculty Senate were discussing votes of no confidence in the president. There were also concerns about the president’s dual role as head of both the university and the ULF. We wanted to preserve a record of the events, of course, but we also wanted to capture the impact it had on attitudes and opinions within our larger community. We had a longstanding tradition of supplementing the official university record with clippings from newspapers, magazines, and (more recently) blogs, and we continued collecting these materials. However, the colleague who had been responsible for this activity for decades found himself newly concerned: what if the president’s office learned we had several file folders’ worth of newspaper clippings containing comments critical of him? Would the Archives, or the Libraries as a whole, suffer as a result? After a brief discussion at a staff meeting, we agreed we had to continue “clipping” in order to preserve a complete and accurate picture. I offered to take any responsibility for the activity, trusting in the twin shields of our duty to preserve the record as objectively as possible and my status as a tenured faculty member. As it turned out, we were the least of the president’s office’s concerns.

In June 2016, the Republican governor dissolved the existing board and replaced it with a new, smaller board. He also announced that the president would tender his resignation to the new board as soon as it was installed. The Democratic attorney general filed suit, arguing that the governor did not have the authority to dissolve the board; a judge reinstated the old board pending the outcome of the suit. We were unsure who our “real” board of trustees was, but nonetheless, the president negotiated his exit and departed.

While my colleague continued to clip print articles, we also knew there was a lot more going on online. Stories were breaking daily about the president, his exit, the board of trustees (both of them), the governor, and the attorney general. As we had for a special project in 2009, we used a short-term Archive-It account. Our Archivist for Manuscript Collections and I worked from Google alerts to create an individual “seed” for each story. While we still didn’t have the financial resources to use Archive-It on an ongoing basis, we made as much use of it as we could. When we exhausted the remaining space on our Archive-It account, we began preserving web-based stories as PDFs. Given our time and budgetary constraints, this seemed our best alternative: PDF/A is an acceptable preservation format; the vast majority of the stories did not contain relevant links, only links to advertisements; and PDFs are easy to provide access to.

With the president’s exit, I also realized we had come to a major fork in the road: I had to talk to the President’s Office about obtaining his electronic files, particularly his email. This was completely new territory for us. The Archives was at an interesting juncture in other ways, as well: our records manager had recently departed, and I was working with a couple of other colleagues to cover his responsibilities while we searched for a new Archivist for Records Management. In the interviews I did my best to explain our tentative entry into electronic records – as a founding member of the MetaArchive Cooperative, we had plenty of experience with digital preservation, but less with the ingest of digital files from university offices – and hoped we could recruit someone who was interested in developing this program with me. (We did!)

At the same time, I pursued access to the former president’s files. I contacted the interim president, who was now responsible for the records of the Office of the President. He was immediately supportive, but I still had to convince Information Technology (IT) that copies of the files could – in fact, should – be transferred to us. In my initial conversations with IT, I learned that the former president had not saved many files to his assigned network space; the assumption was that his assistants created his documents. But he had plenty of email.

And here we ran into a surprising roadblock. While the University Archivist is named in the university’s governance document (the “Redbook”) as the custodian of university records, IT was nervous enough to confer with the University Counsel’s office. And while I had anticipated concerns about the speed with which we might make the material available, the Counsel’s office was actually worried about attorney-client privilege. That is, they were concerned that by releasing privileged email to us, they would essentially be sharing them with a third party, and thus nullifying privilege. Like most college presidents, ours had been named in suits against the university, sometimes simply by virtue of being the head of the institution. We ultimately agreed that email between the former president and individuals at specific law firms (identified by the domain name in their email addresses) could be filtered out of the material we received. While this is somewhat less than optimal, we know the files will be maintained by IT pending several “litigation holds” (i.e., they cannot be destroyed until the litigation is resolved), giving us a chance to follow up with them again after the dust has settled.   
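
The domain-based filter we agreed on can be sketched roughly as follows. The firm domains and the message structure here are invented for illustration only; the real filtering was done by IT on the mail system side.

```python
# Rough sketch of filtering out attorney-client email by sender/recipient
# domain before transfer. Domains and message format are hypothetical.
PRIVILEGED_DOMAINS = {"examplefirm.com", "counselpartners.com"}  # hypothetical

def domain_of(address: str) -> str:
    """Return the domain portion of an email address, lowercased."""
    return address.rsplit("@", 1)[-1].lower()

def is_privileged(msg: dict) -> bool:
    """True if any party to the message is at a flagged law-firm domain."""
    parties = [msg["from"], *msg["to"]]
    return any(domain_of(a) in PRIVILEGED_DOMAINS for a in parties)

messages = [
    {"from": "president@louisville.edu", "to": ["dean@louisville.edu"]},
    {"from": "president@louisville.edu", "to": ["partner@examplefirm.com"]},
]
# Only messages with no flagged party are released to the Archives.
transferable = [m for m in messages if not is_privileged(m)]
```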

Our new Archivist for Records Management worked with IT’s specialist to use Library of Congress’s Bagger application to “bag” the .pst files (in 10 GB “chunks”) and transfer them to the Archives. We still have to face the issues of long-term preservation and access, but at least we have them in our possession.
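
What Bagger produces is a standard BagIt bag: the payload files under a data/ directory, alongside a checksum manifest and a bagit.txt declaration. The following stdlib-only sketch shows that structure; the file name is a stand-in, not an actual transfer.

```python
# Minimal stdlib sketch of BagIt "bag in place" packaging -- the same
# layout the Bagger application produces (BagIt v0.97). The .pst file
# name below is a stand-in for illustration.
import hashlib
import shutil
import tempfile
from pathlib import Path

def make_bag(directory: Path) -> None:
    """Bag a directory in place: move payload under data/, write tag files."""
    data = directory / "data"
    data.mkdir()
    manifest_lines = []
    for f in sorted(directory.iterdir()):
        if f.name == "data" or f.is_dir():
            continue
        target = data / f.name
        shutil.move(str(f), target)
        # Record a SHA-256 checksum for each payload file.
        digest = hashlib.sha256(target.read_bytes()).hexdigest()
        manifest_lines.append(f"{digest}  data/{f.name}")
    (directory / "manifest-sha256.txt").write_text("\n".join(manifest_lines) + "\n")
    (directory / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n")

# Example: bag a directory holding one (stand-in) .pst chunk.
tmp = Path(tempfile.mkdtemp())
(tmp / "president-email-001.pst").write_bytes(b"stand-in payload")
make_bag(tmp)
print(sorted(p.name for p in tmp.iterdir()))
# → ['bagit.txt', 'data', 'manifest-sha256.txt']
```

The manifest lets the receiving side verify every chunk arrived intact before ingest.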

In January 2017, we learned that our interim president was departing as well. In his case, it was to take the presidency at another institution, so the circumstances were happier. And since we had worked through the technical and organizational issues, the process of transferring his email went off without a hitch. While we certainly expected to cross these bridges under calmer circumstances, I am almost (almost) grateful that we were forced into action. We might do things differently next time, but we were able to develop and act on a reasonable plan in a short period of time. The approaches we worked out under these pressing circumstances are at least a starting point – something concrete we can modify and build on – rather than theoretical musings.

Carrie Daniels is University Archivist and Director of Archives and Special Collections at the University of Louisville. She holds an MSLIS from Simmons College and an EdM from the Harvard Graduate School of Education.