Evaluating “data-intensive capability” across the Environmental & Earth Sciences – introducing a new profiling tool for community self-assessment
Joint Workshop: School of Information Sciences, University of Pittsburgh; UKOLN Informatics, University of Bath; Microsoft Research Connections; Research Data Alliance
Overview: The partner organisations have collectively developed a Community Capability Model Framework (CCMF) for Data-Intensive Research [1], building on the principles described in The Fourth Paradigm [2]. The CCMF explores readiness for data-intensive science, and we have developed a profiling tool that can be applied to different communities and domains. We have chosen the environmental and earth sciences as a “deep-dive” area and are now seeking views from ESIP members to create and collect capability profiles. The work is also part of the global Research Data Alliance initiative, and we have formed a CCM Interest Group [3].
Objectives: This breakout session will include a scene-setting presentation, an opportunity for hands-on testing of the profiling tool, and time to give feedback and discuss ESIP community views. We aim to collect completed Capability Profiles from participants during the course of the ESIP meeting.
Outcomes: Participants will gain an understanding of the CCMF and its aims; participants will also complete a capability assessment of their own community using the CCMF Capability Profile Tool. They will have the opportunity to discuss findings and provide feedback on the CCMF and Capability Profile Tool, as well as consider further applications.
[1] CCMF White Paper, http://communitymodel.sharepoint.com
[2] The Fourth Paradigm, http://research.microsoft.com/en-us/collaboration/fourthparadigm/
[3] RDA CCM-IG, http://rd-alliance.org/working-groups/community-capability-model-wg.html/
Taking the group through a tool that has been developed with Microsoft Research Connections
The speaker begins by talking through the background context of the program
Reference to the Fourth Paradigm
Data is at the center of transformative research practices
Self-assessment: from systems, to diagnosis, to action
This can be done at different levels:
- PI
- Funding Level
- Federation Level
- Project Level
Title: CCM Framework http://www.communitymodel.sharepoint.com/
Developed over a year through six international workshops
Produced case studies
Culminated in a white paper (available from the web page)
Model is comprehensive: not just technical
8 capability factors:
- Research Culture
- Collaboration
- Skills and Training
- Openness
- Technical Infrastructure
- Common Practices
- Economic and Business Models
- Legal, Ethical and Commercial
3 case studies in 2012
Lists funding bodies, institutions, and researchers
In 2013 an RDA Interest Group was formed
Lists aims and activities of the interest group
Developed the tool and are testing it in “deep-dive” areas as well as light-touch areas
CCM-IG Capability Profile Template
Scorecard-based tool: categories for each of a series of characteristics
A series of assessments to find where your system or project sits
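For illustration, a minimal sketch of how a completed profile could be represented and summarised programmatically. The factor and category names follow the session notes, but the numeric scale, field names, and class structure are assumptions for illustration, not the official CCMF template:

```python
# Hypothetical representation of a capability profile as a simple data structure.
# The 1-5 scoring scale and category names are assumptions, not the CCMF's own levels.
from dataclasses import dataclass, field

@dataclass
class CapabilityFactor:
    name: str
    # One score per category/characteristic within the factor, e.g. 1 (low) to 5 (high)
    scores: dict[str, int] = field(default_factory=dict)

    def average(self) -> float:
        return sum(self.scores.values()) / len(self.scores)

@dataclass
class CapabilityProfile:
    community: str
    factors: list[CapabilityFactor]

    def summary(self) -> dict[str, float]:
        """Average score per capability factor: a rough 'where you sit' view."""
        return {f.name: f.average() for f in self.factors}

# Example: a partial profile covering two of the eight factors
profile = CapabilityProfile(
    community="Example ESIP member community",
    factors=[
        CapabilityFactor("Collaboration", {"within discipline": 3, "across sectors": 2, "with public": 1}),
        CapabilityFactor("Openness", {"published literature": 4, "data": 2, "methodologies": 3}),
    ],
)
print(profile.summary())
```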
Walking through the Administrative section: about you and your data
Collaboration: Disciplinary - from your project - from your institution - in terms of your data
Addresses collaboration across sectors, across disciplines, within disciplines, and with the public
Q: how to define collaboration with the public
Skills and Training:
Thinking about your discipline and instruction in data management, data collection and description, data description and identification, copyright, etc.
Q: who is the audience in the evaluation?
Discussion of issues of semantics (“student” could be rephrased as “user”):
- Suggestion to try applying it to a domain
- Phrased in a way that is appropriate to the domain of research
- Phrases that help you recognize yourself; additional phrases that refer to users
Openness
related to data
in the course of research, published literature, data specifically, methodologies, reuse of existing data
Q: Why would reuse of data be related to openness?
A: discussion of classification and choices of categories; people may disagree
Technical Infrastructure
Range of categories: tools, tools support, curation, discovery and access, integration and collaboration platforms, visualizations and platforms for citizen science
Q: discussion of wording on category 5 relating to tools
Common Practices
In terms of your discipline: data formats, data collection methods, standard vocabularies, semantics, data packaging and transfer
Economic and Business Models:
sustainability of funding for research
geographic scale of funding, physical size, funding for infrastructure, size and geographic scale of infrastructure funding, ROI
Q: potential issue of one component canceling another out, given the eccentricities of funding; worth looking at other potential pairs of responses that would offset each other
Suggestion: might add some other examples that people are familiar with
Legal, Ethical and Commercial:
In respect to data
legal and regulatory framework, management of ethical responsibilities, management of commercial constraints
Q: Asking about clarification of commercial constraints
A: Thinking about a consortium project with both academic and private funding partners
Research Culture:
Entrepreneurship, innovation, and risk; reward models for researchers; quality and validation frameworks (the speaker stressed the significance of this last row)
Asks if participants are willing to work as a group to create profiles for specific areas
http://www.communitymodel.sharepoint.com/
Could use this for a center or for a project
Karl - could use for center at the University of New Mexico
Use for center at George Mason University
Q: curiosity about weighting categories depending on the scale: project or center, data-intensive or research-intensive
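A hedged sketch of how such weighting might work, assuming per-factor average scores as in the earlier sketch; the weights and factor names are purely illustrative and not defined by the CCMF:

```python
# Hypothetical weighting of factor scores by context (e.g. project vs. center).
# The weight values and factor names below are made-up examples.
def weighted_overall(factor_averages: dict[str, float],
                     weights: dict[str, float]) -> float:
    """Combine per-factor average scores into a single weighted score."""
    total_weight = sum(weights.get(name, 1.0) for name in factor_averages)
    return sum(score * weights.get(name, 1.0)
               for name, score in factor_averages.items()) / total_weight

# A data-intensive center might weight Technical Infrastructure more heavily
# than a small project would; unlisted factors default to a weight of 1.0.
center_weights = {"Technical Infrastructure": 2.0, "Skills and Training": 1.5}
print(weighted_overall({"Technical Infrastructure": 3.0,
                        "Skills and Training": 2.0,
                        "Openness": 4.0}, center_weights))
```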
Suggestion from Liz for people trying the framework out to make notes as they fill it in
Carol: sees this as a tool for ESIP as a whole - to uncover certain strengths and weaknesses - gap analysis
- Perhaps have everyone fill it out and see what consistency emerges
Addresses the struggle to assess the effects internally, or to capture the community's effectiveness
- May not yield publishable results, but as a tool it is simple and effective
Karl: the ability to roll up or characterize an organization as a whole - ability to see the range based on perspectives - where you sit on the value chain
Acknowledging the idea of the evaluation perspective of the individual
Also interesting for measuring impacts or changes: repeat the analysis in the Federation or elsewhere to see how programs are changing
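A rough sketch, under the same assumed score representation as above, of rolling up individual responses and comparing two assessment rounds; the function names and data layout are hypothetical, not part of the profiling tool:

```python
# Hypothetical roll-up of individual profiles and comparison between assessment rounds.
from statistics import mean

def roll_up(profiles: list[dict[str, float]]) -> dict[str, float]:
    """Average each factor's score across all respondents in an organisation."""
    factors = {name for p in profiles for name in p}
    return {name: mean(p[name] for p in profiles if name in p) for name in factors}

def change_between_rounds(before: dict[str, float],
                          after: dict[str, float]) -> dict[str, float]:
    """Per-factor change between two rounds (positive = improvement)."""
    return {name: after[name] - before[name] for name in after if name in before}

# Example: two respondents per round, two factors; numbers are invented.
round_one = roll_up([{"Openness": 2, "Collaboration": 3}, {"Openness": 3, "Collaboration": 4}])
round_two = roll_up([{"Openness": 4, "Collaboration": 3}, {"Openness": 3, "Collaboration": 4}])
print(change_between_rounds(round_one, round_two))
```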
Mentions discussions about whether this tool is indeed a scorecard
Rebecca mentions that it is important to do this longitudinally as opposed to past anonymous assessments
Q: Carol asks how can ESIP help you?
A: Executive committee (some selection bias) could evaluate this
Or it could be done on a larger scale using the mailing list, to look at these perspectives statistically
mentions the importance of being sensitive to how information and responses would be used.
there would need to be a statement up front about safety of information
Thinking about IRBs (the person who administers the instrument must go through the IRB)
Speaking of the timeframe, it may be something that could be reintroduced at the summer meeting
Q: delineating between pre- and post-project assessment
A: discussions of interest in its use with NSF and with Microsoft
Carol: has a hunch that depending on who you ask there will be very different responses across domain communities within the organization
Knowing where the members spend their time will be helpful to get the full perspective
Speaker ends session and provides email address for further discussion.