Skip to main content Link Menu Expand (external link) Left Arrow Right Arrow Document Search Copy Copied

Selecting Libraries for Reproducible Software Projects

Selecting the right software libraries, including packages and tools, for reproducible software projects invovles balancing simplicity and functionality, which are important qualities for the sustainability and durability of these projects. Below we provide you with some key considerations to keep in mind when choosing packages to ensure that they align with the goals of your projects and contribute to its long-term success.

Key Considerations for Selection

  • Minimize Dependencies

    Select packages that come with minimal external dependencies, including the dependencies of the dependencies (the full dependency chain). Limiting the size of the technology stack reduces the risk of dependencies becoming outdated, unsupported or unavailable over time.

  • Local Dependencies and Resources

    Check if all dependencies and resources can be stored locally and do not require network access. Any resources (e.g. code, data, assets, credentials/authentication/authorization) that are stored on the network that are retrieved during the process may not be available in the future. Furthermore, networked resources will make it more challenging to preserve all dependencies and data in a way that maintains operability. For example, avoid libraries that update automatically from the internet upon initialisation, use data stored on public servers like Google Sheets or Google Drive, or require third party accounts.

  • Documentation

    Packages require clear instructions for installation, configuration and usage. This knowledge may not be available in the future and a community of support may have ended, which makes it difficult to revive, modify or reuse a preserved product. All documentation of dependencies needs to be localized for preservation.

  • Community Support

    Check the availability of community support for the package. A broad, active community can helps by providing support, updates and enhancements, needed both for current developement and future maintenance.

  • Open Source and Licensing

    Open source packages allow for collaboration and long-term maintenance by ensuring that everyone can access, modify, and redeploy the software. Use packages that are freely available and may be redistributed and modified. Look for packages with public licenses that allow this free reuse. Avoid packages that require fees or have restrictions on deployment of related software. These restrictions will apply to any software that uses them, so it is best to avoid these restrictions on dependencies.

  • Version Control

    Check if the package uses a version control system (e.g. GitHub) to track changes and manage updates. Version control helps in understanding the evolution of the package and facilitates collaboration among developers. Cloud version control systems encourage communities of support and broad access to the source code or resource. A copy of the exact versions of all dependencies, downloaded from the version control system, should be preserved with the project.

  • Resource Efficiency

    Choose packages that require minimal power consumption. Beyond the immediate benefits, this will allow the software to be executed in environments without access to high-end computing infrastructure. This can be a difficult quality to evaluate.