Data Management Plans Meeting May 21 2015



Attendees

   Chris Brown, Jisc
   Anna Clements, St. Andrews
   Joy Davidson, University of Glasgow
   Catherine Jones, Science & Technology Facilities Council
   Rachel Proudfoot, University of Leeds
   David Baker, Casrai
   Sheri Belisle, Casrai


Agenda

   Welcome and call to order
   Review actions from last meeting
   Ethics use case profile (David)
       Review updates
   Next steps

Supporting Materials

   Previous Minutes
   WG Charter
   Ethics Data Profile

Agreed Actions

   Include Internal ID in the values list for ID Types and allow more than one to be selected.
       Action owner: Casrai
       Completed by: June 1
   Insert new Research Dataset suggested definition.
       Action owner: Casrai
       Completed by: June 1
   Look back to find information on the Data Types and Data Capture Modes lists to see where they came from.
       Action owner: Casrai
       Completed by: June 1
   Discuss next meeting timing and content.
       Action owner: David and Anna
       Completed by: Week of June 1 at ARMA


Meeting brought to order at 2:03 PM BST.

Welcome. Actions from the last meeting have been completed.

Agenda Item 3 – Ethics Use Case Profile.

We’ll focus on items in the profile that can be resolved relatively easily in the call; the remainder will be addressed after the meeting (working directly with the commenters) until the profile is ready for wider review.

Lists tab. The lists need work with subject experts, except ID Types, which needs more values; this group is welcome to suggest more types that could be added. The list currently addresses external IDs, but there may be a need for internal IDs. One type/value could be Internal ID. Should it be possible to capture more than one ID? The hope is that there will be a shift to external IDs, but for now both are likely needed. Action: Include Internal ID in the values list for ID Types and allow more than one to be selected.
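The agreed action can be pictured with a minimal Python sketch, assuming a record simply holds a list of typed identifiers. Only the Internal ID value comes from the discussion; the class names, the placeholder external types and the sample values are hypothetical and are not taken from the profile.

```python
from dataclasses import dataclass, field
from enum import Enum


class IdType(Enum):
    # Only INTERNAL_ID comes from the minutes. The external values are
    # hypothetical placeholders, not the profile's actual list.
    INTERNAL_ID = "Internal ID"
    EXTERNAL_ID_A = "External ID (placeholder A)"
    EXTERNAL_ID_B = "External ID (placeholder B)"


@dataclass(frozen=True)
class Identifier:
    id_type: IdType
    value: str


@dataclass
class ProfileRecord:
    # A list rather than a single field, so that more than one ID
    # (internal and external) can be captured for the same record.
    identifiers: list[Identifier] = field(default_factory=list)

    def add_identifier(self, id_type: IdType, value: str) -> None:
        if not value.strip():
            raise ValueError("identifier value must not be empty")
        self.identifiers.append(Identifier(id_type, value.strip()))

    def ids_of_type(self, id_type: IdType) -> list[str]:
        return [i.value for i in self.identifiers if i.id_type == id_type]


# Hypothetical sample values, for illustration only.
record = ProfileRecord()
record.add_identifier(IdType.INTERNAL_ID, "LOCAL-0001")
record.add_identifier(IdType.EXTERNAL_ID_A, "ext-12345")
```

Keeping the type and the value together in one entry means an institution could record its internal ID now and add an external one later without changing the structure.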

Research Dataset. Focus on the definition. It’s a general definition at this time, being used to separate research datasets from administrative datasets. Needs work. Suggested as a thumbnail: “it is a set of data used for research purposes, that may or may not have been created by a research project”. This would cover data captured in a non-research source that’s now being used for research. The distinction is helpful. Terminology should be kept as close as possible to what’s currently in use. Attribute names should have come from the inputs, so they should not be invented. This definition is meant to be a starting point; it can have an extended definition and be refined in review. The hope is that the same definition can be used across all use cases. Not limited to digital data? Needs to be considered; this can be part of the extended definition. A common generic definition would be the start. The Research Data Alliance (RDA) has presented definitions, but they’re very technical and perhaps implementation-specific. Outputs from the RDA definitions group will be published into the Casrai dictionary. Research Councils come from the perspective that data that comes from research they’ve funded is research data, no matter its source. It’s somewhat circular, but perhaps it needs to be in order to capture the concept. Is it true that it’s always used for research purposes? Not necessarily, but the suggested definition doesn’t imply exclusivity. Should it be “may or may not be used for research”? No, because then it wouldn’t be ‘research data’. If data is made available on a gov’t site, it’s not research data until someone identifies it as the subject of research. We want the definition to be challenged so it can be improved. Action: insert new suggested definition.

Other Capture Modes and Other Type. These are an attempt to handle an “other” field. “Other” decreases data integrity. “Other” should be used as a place to enter something not in the list. Groups like this one can then look for patterns that emerge in those fields to improve the lists. This method will become standard at Casrai, and versioning would be introduced as the lists expand and improve. At the end of a cycle, funders (for example), as Casrai members, would submit whatever data they collected with regard to the standard so that a new version of the standard can be created and, over time, “other” fields can be phased out. This should improve data quality over time.
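As a rough illustration of looking for patterns in “other” entries, the Python sketch below counts free-text values after light normalisation and reports those that recur and are not already in the list. The function names, the threshold and the sample values are assumptions made for the example (only Survey appears in the discussion); this is not a Casrai procedure.

```python
from collections import Counter


def normalise(text: str) -> str:
    # Lower-case and collapse whitespace so trivial variants count together.
    return " ".join(text.lower().split())


def candidate_values(other_entries, known_values, min_count=2):
    """Return (value, count) pairs for recurring "other" entries that are
    not already in the controlled list, most frequent first."""
    known = {normalise(v) for v in known_values}
    counts = Counter(
        normalise(e) for e in other_entries if e and normalise(e)
    )
    return [
        (value, n)
        for value, n in counts.most_common()
        if n >= min_count and value not in known
    ]


# Hypothetical data: "Survey" is mentioned in the minutes, the rest is invented.
known = ["Survey", "Interview"]
others = ["Focus group", "focus  group", "Survey", "Diary", "FOCUS GROUP", "survey"]
candidates = candidate_values(others, known)
```

In this made-up sample, "focus group" recurs and would be a candidate for the next version of the list, while "Survey" typed into the “other” field is ignored because it already exists as a value.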

Data Types and Data Capture Modes overlap. For example, Survey is part of both. Is this a concern? Conceptually they’re different elements, but the values seem too similar. We can address this offline with either a corrected list, or perhaps one of the two isn’t needed in the profile. Action: Sheri to look back to find information on these two lists to see where they came from.
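One simple way to see how far the two lists overlap is a case-insensitive set intersection, sketched below in Python. The list contents are hypothetical apart from Survey, which is the one shared value named in the discussion, and the function name is an assumption.

```python
def overlapping_values(list_a, list_b):
    """Return the values present in both lists, ignoring case and
    extra whitespace, in sorted order."""
    def key(value):
        return " ".join(value.lower().split())

    return sorted({key(v) for v in list_a} & {key(v) for v in list_b})


# Hypothetical list contents; only "Survey" is taken from the minutes.
data_types = ["Survey", "Image", "Interview transcript"]
capture_modes = ["survey", "Sensor", "Interview"]
overlap = overlapping_values(data_types, capture_modes)
```

An exact-match check like this only finds identical labels; values that are similar in meaning but worded differently would still need review by subject experts.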

There are many other comments that need to be processed. These will be addressed offline, directly with the commenters and subject experts, to move closer to a final version. We’re aiming for “good enough for review” so that others can comment and improve. Some comments from the original profile (Cathy Pink’s) have already been moved over to the profile; they are identified by “original sheet”. (Note, for example, that consent in the survey might have to be identified.)

Propose moving the next meeting from mid-June (16th) to the end of June. May be able to have a wrap-up at that point where a group of profiles will be presented. A wrap-up may encourage attendance. Action: David and Anna to discuss next week.

Meeting adjourned at 2:46 PM BST.