Content-based Identifiers for Iterative Forecasts: A Proposal

Key Info
Description - a brief synopsis, abstract or summary of what the learning resource is about: 

Iterative forecasts pose particular challenges for archival data storage and retrieval. In an iterative forecast, data about the past and present must be downloaded and fed into an algorithm that outputs a forecast data product. Previous forecasts must also be scored against the realized values in the latest observations. Content-based identifiers provide a convenient way to consistently identify inputs, outputs, and associated scripts. These identifiers are:
(1) location-agnostic – they don’t depend on a URL or other location-based authority (such as a DOI)
(2) reproducible – the same data file always has the same identifier
(3) frictionless – cheap and easy to generate with widely available software, with no authentication or network connection required
(4) sticky – the identifier cannot become unstuck or separated from the content
(5) compatible – most existing infrastructure, including DataONE, can readily use these identifiers
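
As a rough illustration of properties (1) to (3), a content-based identifier can be computed as a cryptographic hash of a file's bytes. The sketch below is a minimal example in Python using only the standard library; it is not the speaker's implementation or the API of the R packages mentioned in this description. The function names `content_id` and `file_content_id`, the choice of SHA-256, and the `hash://sha256/` prefix are assumptions made for illustration.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return an identifier derived only from the bytes themselves.

    Assumption: SHA-256 and the 'hash://sha256/' prefix are illustrative
    choices, not something specified by this resource description.
    """
    digest = hashlib.sha256(data).hexdigest()
    return f"hash://sha256/{digest}"


def file_content_id(path, chunk_size: int = 1 << 20) -> str:
    """Compute the same identifier for a file, reading it in chunks
    so that large data files need not fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return f"hash://sha256/{hasher.hexdigest()}"


if __name__ == "__main__":
    # Hypothetical forecast input: identical bytes give an identical identifier,
    # regardless of file name, location, or who computes it.
    observations = b"date,value\n2020-10-13,4.2\n"
    print(content_id(observations))
```

Because the identifier depends only on the content, the inputs, outputs, and scripts of each forecast iteration could each be hashed this way and recorded together, with no registration step or network access.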

In this webinar, the speaker will illustrate an example iterative forecasting workflow. In the process, he will highlight some newly developed R packages that make such workflows easier.

Authoring Person(s) Name: 
Carl Boettiger
Authoring Organization(s) Name: 
University of California, Berkeley
License - link to legal statement specifying the copyright status of the learning resource: 
Creative Commons Attribution 4.0 International - CC BY 4.0
Access Cost: 
No fee
Primary language(s) in which the learning resource was originally published or made available: 
English
More info about
Keywords - short phrases describing what the learning resource is about: 
Data archiving
Data skills education
Data storage
Persistent Identifiers (PID)
Programming
R software
Subject Discipline - subject domain(s) toward which the learning resource is targeted: 
Physical Sciences and Mathematics: Environmental Sciences
Published / Broadcast: 
Tuesday, October 13, 2020
Publisher - organization credited with publishing or broadcasting the learning resource: 
DataONE
Media Type - designation of the form in which the content of the learning resource is represented, e.g., moving image: 
Interactive Resource - requires a user to take action or make a request in order for the content to be understood, executed or experienced.
Educational Info
Purpose - primary educational reason for which the learning resource was created: 
Professional Development - increasing knowledge and capabilities related to managing the data produced, used or re-used, curated and/or archived.
Learning Resource Type - category of the learning resource from the point of view of a professional educator: 
Learning Activity - guided or unguided activity engaged in by a learner to acquire skills, concepts, or knowledge that may or may not be defined by a lesson. Examples: data exercises, data recipes.
Target Audience - intended audience for which the learning resource was created: 
Citizen scientist
Data manager
Data professional
Data supporter
Early-career research scientist
Graduate student
Librarian
Mid-career research scientist
Repository manager
Research scientist
Undergraduate student
Intended time to complete - approximate amount of time the average student will take to complete the learning resource: 
Up to 1 hour