MDL-HT Digital Preservation Prototype

FYI, this project is being managed in Basecamp.

This proposal has been approved by John Butler, Bob Horton, and Bill DeJohn. It is partly based on the “MDL-HathiTrust Image Ingest Prototype Project” charter produced by John Butler and last revised on 100906. The title of this document was changed slightly on 101208, but otherwise the content remains as it was.

The Minnesota Digital Library (MDL) seeks to explore a common infrastructure strategy that will bring the state a significantly enhanced capacity for preserving and accessing its cultural heritage. The MDL perceives both the opportunity and the wisdom in providing large-scale digital content repository services for Minnesota, and considers establishing a shared digital preservation service a valuable first step. Earlier this summer the MDL hired me to facilitate the production of a report on the image preservation needs of a number of stakeholders around the state of Minnesota.

As the project charter states: “The purpose of this project is to pilot and, therefore, demonstrate, the technological and organizational potential of a scaleable digital preservation program and service for cultural heritage stewardship across Minnesota.”

I have been asked to be the project manager for this effort. My role would be to build the project work plan, manage the project through its completion, and coordinate the evaluative reports upon conclusion. I would work closely with MDL and HathiTrust stakeholders, ensuring strong communication among the developers, stakeholders, and sponsors.

In addition to the project manager, the project team would consist of a digital content and software development coordinator (Jason Roy), a software developer (Bill Tantzen), a digital preservation consultant (TBD), and a liaison from HathiTrust (TBD).

For the purposes of this document, the project sponsors will be considered to be John Butler (UMN), Bob Horton (MHS), and Bill DeJohn (Minitex).

Deliverables

Deliverable 1: Work Plan.

Cost: $2,500

Requires an initial meeting with the sponsors to make sure everyone agrees on the desired outcomes, followed by drafting a work plan with a clear timeline, reviewing the draft with sponsors and stakeholders, and revising as needed. Completion expected in mid-September.

Deliverable 2: Requirements.

Cost: $3,000

Requires at least one, and more likely two, meetings with the project team to define the requirements of the project. Drafting and reviewing of these requirements should be complete by the end of September.

Deliverable 3: Specifications.

Cost: $3,000

While the first draft of the specifications will be prepared by early October, they will need constant revision on a project with a timeline this tight. The project team will consider progress and changes to the specifications every two weeks, and the draft specifications will reflect these changes continuously as the project progresses.

Deliverable 4: Prototype.

Cost: $14,500

The heart of this project will be to develop the workflow to move data from Minnesota into HathiTrust, demonstrate that workflow by moving a defined set of images into HathiTrust, and work with HathiTrust to define the appropriate display for these images in that system. During the period from early October to the end of November we will be actively working on the extraction of data from systems at MDL and MHS and the ingest of this data into HathiTrust. This will be an iterative process, starting with simpler formats and progressing to the more complex. I expect to devote 25% of my time to the project during this phase, meeting weekly with Jason and Bill while staying in close contact with HathiTrust. I would also consult with the sponsors and the rest of the project team as needed. If the sponsors wish, we can also have a monthly phone meeting to keep everyone updated on progress.

Deliverable 5: Evaluation.

Cost: $2,500

Work closely with the digital preservation consultant to evaluate the prototype results and produce a report for the sponsors. This report would include an assessment of the technology we used, the costs of scaling this to broader participation, and the organizational challenges ahead, in particular the governance issues this presents for MDL. Report completed before the end of December.

Deliverable 6: Discussion of Next Steps.

Cost: $500

Available for a meeting of the sponsors in January to discuss the evaluation report and next steps in the digital preservation effort.

Optional Opportunity

Deliverable 7: Attendance at iPRES 2010

Cost: registration and travel support

I believe the iPRES 2010 meeting is ideally suited to helping us share word of our effort and learn more about best practices and potential partners. iPRES 2010 will be the seventh in the series of annual international conferences that bring together researchers and practitioners from around the world to explore the latest trends, innovations, and practices in preserving our digital heritage. It includes sessions on preservation services, metadata and object properties, preservation planning and evaluation, and PREMIS implementation that all sound directly applicable to the task before us in this project.

This meeting is from September 19 to 24 in Vienna, Austria, so if we are to send a representative we have to make that decision immediately. I estimate travel, lodging, meals, and registration to total a bit under $5,000. I may be able to save a bit by contacting friends and avoiding lodging expenses, but it would be best to assume worst-case expenses.

Scope

While the MDL intends its preservation efforts to encompass a wide variety of formats as its preservation strategy evolves, this particular project aims to demonstrate the preservation of a relatively specific type of data: digital images. I will work to keep the scope focused on this objective and resist efforts to generalize the conversation and solutions developed at this stage. While the process we use to preserve digital images may well inform the process we use to preserve other formats (such as audio or video), we must acknowledge that the specific workflow and tactics we adopt will focus on the needs of image collections.

Timeline

Work would start immediately upon receipt of a Minitex contract and conclude in January 2011.

Cost

As a consultant, I do not charge by the hour. The services above will cost $26,000. I will submit an invoice for $8,500 at the end of September 2010, for $7,000 at the end of October, for $7,500 at the end of November, for $2,500 in mid-December, and for the final $500 in January 2011. Any travel outside the Twin Cities that is required for this work would be covered at typical University of Minnesota reimbursement rates and billed separately. Any expenses for facilities and travel for other participants in face-to-face meetings are also not included. Given the spaces available for free, facilities should not be an issue, but travel costs for a HathiTrust participant in face-to-face meetings could be substantial.
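As a quick arithmetic check on the figures in this proposal, the short Python sketch below sums the six deliverable costs and the five invoice amounts and compares each against the stated $26,000 total. The variable names are illustrative only, and Deliverable 7 is left out because its cost is given only as registration and travel support.

```python
# Deliverable costs as listed under "Deliverables" (USD).
# Deliverable 7 (optional conference attendance) is excluded because
# the proposal does not give it a fixed fee.
deliverable_costs = {
    "1: Work Plan": 2500,
    "2: Requirements": 3000,
    "3: Specifications": 3000,
    "4: Prototype": 14500,
    "5: Evaluation": 2500,
    "6: Discussion of Next Steps": 500,
}

# Invoice schedule as listed under "Cost" (USD).
invoices = {
    "end of September 2010": 8500,
    "end of October 2010": 7000,
    "end of November 2010": 7500,
    "mid-December 2010": 2500,
    "January 2011": 500,
}

stated_total = 26000  # total fee stated in the Cost section

total_deliverables = sum(deliverable_costs.values())
total_invoices = sum(invoices.values())

print("Deliverables total:", total_deliverables)
print("Invoices total:", total_invoices)
print("Both match stated total:",
      total_deliverables == stated_total == total_invoices)
```

Both sums agree with the stated total, so the invoice schedule accounts for every deliverable fee with nothing left over.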
