Questions for RCA project

NOTE: This is a set of questions accepted by AHD on 3 December 2007 on behalf of the RCA group. It is an outline of the conversation we would like to have with a set of researchers at the University of Minnesota. Thanks for your interest, …Eric

This interview is part of an assessment for the Research Cyberinfrastructure Alliance sponsored by the Office of Information Technology, University Libraries, and the Office of the Vice President for Research. Together with collegiate units (CLA, AHC, IT, CFANS, CBS), we are working to understand how researchers manage their computing needs and their digital data. Your input will help shape new services at a campus-wide level. We are trying to identify which campus-wide investments will best benefit our research community. We appreciate your participation.

[Note about question numbers: each [#.#] question will be asked. Any question below that level in the hierarchy is optional and will be asked only if it helps to clarify the answer being given.]

[1.1] Describe the process you follow in securing and implementing “high level” computing in support of your research.

[Take notes and possibly modify interview strategy based on response.]

Gathering

[2.1] Describe the computing resources you use to gather the data for your research.

[2.1.1] If some of this computing equipment is in the lab: what needs compel you to keep it there?
[2.1.2] If some of this computing equipment is housed in remote facilities:
- [2.1.2.1] Where is this equipment hosted?
- [2.1.2.2] What benefit is there for you in this remote arrangement?

[2.2] How much raw data do you gather?

[2.2.1] How is this data stored as it is being captured?
[2.2.2] What kind of backups of raw data are kept?
[2.2.3] What is the rate at which the data accumulates?
[2.2.4] What kind of bandwidth is consumed moving data from place to place as it arrives?

[2.3] What formats is this data in?

[2.3.1] Who specifies the standards for these formats?

[2.4] Are there security considerations for this raw data?

[2.4.1] Who is allowed to see the data? Who is not?
[2.4.2] What parts of the data might be considered “private” and need special protection?

Analysis

[3.1] What tools do you use to analyze the data gathered?

[3.1.1] Describe the hardware used for this analysis. Where is it? Who uses it? Who owns it?
[3.1.2] Describe the software used. Who licenses it?
[3.1.3] How do you check the data’s source and provenance to validate its authenticity?

[3.2] What new data sets are created during the analysis of the raw data?

[3.2.1] How large are these data sets?
[3.2.2] Where do you keep this data while it is being manipulated?
[3.2.3] How much bandwidth do you need to move it between machines?

[3.3] What identity or other information do you scrub from the raw data?

[3.3.1] When, in the life of this data, does such scrubbing occur?

[3.4.1] Is there a wider audience? Do they see scrubbed or specific data?

Storage

[4.1] Do you currently store (or have plans to store) data after the completion of your research project?

[4.1.1] What data do you (plan to) store for the long term?
[4.1.2] What does “long term” mean to you? How long must these archives last? How valuable is the data after 6 months?
[4.1.3] What are your projections for the growth of your data over time?

[4.2] Who would need to access the archived data?

[4.2.1] Are there security considerations for your data? Is it “private data”?
[4.2.2] Will special software be required to make use of the data downstream?

[4.3] Is it more important to archive the raw data from your research or the data sets that result from your analysis? Or are both equally important?

[4.3.1] If data was scrubbed for security or privacy reasons, do you need to archive the sanitized version, the unsanitized version, or both?

[4.4] In what ways is access to archival data important or not-so-important to the publishing and dissemination of your research findings?

Staffing & Funding

[5.1] What staff positions within your research group are responsible for helping to maintain your technology infrastructure?

[5.1.1] What differences are there for staff involved in the support of computing equipment used for gathering, analysis, and storage?
[5.1.2] Describe the roles and expectations of this staff.

[5.2] What departmental or school staff do you depend on for technology support?

[5.2.1] Describe the roles and expectations of this staff.

[5.3] What campus-wide technology support facilitates your research?

[5.4] Please describe any technology resources you depend on which are hosted outside the University altogether.

[5.5] How are your computing staff and equipment funded?

[5.5.1] Is it part of a departmental budget? A research grant? ICR money?

Concerns

[Only to be discussed if time permits; this section is expendable.]

[6.1] Do you feel secure in your research group’s grasp of the computing technology you rely on?

[6.1.1] What are you afraid might be falling through the cracks?

[6.2] What computing services do you feel you most need in order to carry out your research effectively?

[6.2.1] How well are those needs being served right now?
[6.2.2] How do you imagine these needs may change in the next five to ten years?

Conclusion