Skip to main content

OpenRefine funded to improve its reproducibility

· 2 min read
Antonin Delpeuch

We are pleased to announce that the Chan Zuckerberg Initiative (CZI) has awarded OpenRefine a $310,000 grant to fund its development in 2023 and 2024. This award is part of the fifth round of CZI’s Essential Open Source Software for Science program.

Our proposal is centered around improving the reproducibility of OpenRefine workflows. This aims to address a range of long standing issues, such as:

  • limitations of the Undo/Redo functionality, such as the lack of ability to undo an operation in the middle of the history list (#183) or unintentional loss of work after using the undo feature and applying a new operation (#369, #3184)
  • the difficulty in reusing and sharing OpenRefine workflows, due to the lack of error handling when applying series of operations and the difficulty to adapt a workflow to a new project (for instance, with different column names)
  • the lack of support for running OpenRefine workflows as part of larger pipelines, without using the web UI interactively. This covers for instance the lack of import metadata in the JSON export of the history (#460).

This grant is a natural follow up to our earlier grant from the same program, EOSS1, which focused on scaling the data processing capabilities of the tool. Our work will build on the new architecture developed for this earlier grant, to be released in the 4.0 release series of the tool (currently in alpha version).

Watch this space and our forum for more news about this initative. We hope to be able to give more frequent updates and prototypes that can be tested by the community than in the previous EOSS1 grant, to enable faster feedback cycles.