Summary
The learning module discussed five technologies and methodologies that help shape the future of research activities.
-
Computational Notebooks
Computational notebooks are versatile tools that integrate code, text, and visualizations. They support reproducible data analysis, facilitate collaboration, and enhance educational experiences. These interactive documents enable researchers and educators to explore data, share insights, and develop complex models in a cohesive environment.
-
Evaluating Research Impact through Linked Open Data (LOD)
Measuring research impact is important to both faculty and their academic institutions. LOD sources provide links and metrics that demonstrate the tangible benefits of research, which can strengthen professional reputations and lead to career advancement. Furthermore, as the pricing of subscription-based bibliometric services continues to rise, LOD bibliometric databases such as OpenAlex are a viable option for measuring current activities and providing a data-driven basis for future research publication strategies.
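As a rough illustration of how open bibliometric data can feed simple impact metrics, the sketch below summarizes a list of work records. It assumes the records carry a `cited_by_count` field and that works can be listed from `https://api.openalex.org/works` with an `authorships.author.id` filter, as the public OpenAlex API did at the time of writing. The function names (`fetch_works`, `h_index`, `summarize_works`) are hypothetical helpers, not part of any OpenAlex client, and the fetch retrieves only a single page of results.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public OpenAlex works endpoint; check the current API docs.
OPENALEX_WORKS_URL = "https://api.openalex.org/works"


def fetch_works(author_id, per_page=200):
    """Fetch one page of works for an OpenAlex author ID (needs network access)."""
    query = urllib.parse.urlencode({
        "filter": f"authorships.author.id:{author_id}",
        "per-page": per_page,
    })
    with urllib.request.urlopen(f"{OPENALEX_WORKS_URL}?{query}") as response:
        return json.load(response)["results"]


def h_index(citation_counts):
    """Return the largest h such that h works have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def summarize_works(works):
    """Compute simple impact metrics from a list of work records (dicts)."""
    counts = [work.get("cited_by_count", 0) for work in works]
    return {
        "works": len(works),
        "total_citations": sum(counts),
        "h_index": h_index(counts),
    }
```

Calling `summarize_works(fetch_works(author_id))` with a valid OpenAlex author ID would produce the summary for the first page of that author's works. Metrics of this kind are only one view of impact and are best read alongside qualitative evidence.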
-
Using Generative AI to Evaluate Quality of Data Dictionaries
Generative AI can be a useful tool for evaluating, and thereby improving, data dictionaries. It can offer objective assessments of the quality, consistency, and completeness of data documentation.
-
Practical Guidance for Data Anonymization and De-identification
Effective anonymization and de-identification techniques are vital for protecting data privacy while maintaining the utility of datasets for research. The module covered best practices and considerations for anonymizing data while preserving the granularity needed for research analysis, along with software tools that support these activities.
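The trade-off between privacy and granularity can be illustrated with k-anonymity, one common measure in which every combination of quasi-identifier values must be shared by at least k records. The sketch below is a minimal illustration under assumed inputs: the field names (`age`, `postal`, `diagnosis`), the sample records, and the helpers `k_anonymity` and `generalize_age` are all hypothetical. Real projects would more likely rely on dedicated tools such as sdcMicro, ARX, or Amnesia.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0


def generalize_age(age, width=10):
    """Replace an exact age with a band such as '30-39' (coarser, but safer)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Hypothetical sample data, for illustration only.
records = [
    {"age": 34, "postal": "V6T", "diagnosis": "A"},
    {"age": 36, "postal": "V6T", "diagnosis": "B"},
    {"age": 52, "postal": "V5K", "diagnosis": "A"},
    {"age": 58, "postal": "V5K", "diagnosis": "C"},
]

# With exact ages, every record is unique on (age, postal).
k_before = k_anonymity(records, ["age", "postal"])

# After banding ages, records fall into larger groups.
generalized = [{**r, "age": generalize_age(r["age"])} for r in records]
k_after = k_anonymity(generalized, ["age", "postal"])
```

In this example the raw data is only 1-anonymous, while banding ages into decades makes it 2-anonymous at the cost of losing exact ages.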
-
Research Impact of Software Developers
Software developers play a crucial role in advancing scientific discoveries. Measuring and citing the work of developers who support research acknowledges their contributions and helps sustain ongoing and future work.
References
- LinkedIn. (n.d.). How do you select the right dimensions for your data skills and statistics? Retrieved March 17, 2025, from https://www.linkedin.com/advice/3/how-do-you-select-right-dimensions-your-data-skills-statistics
- Templ, M., Kowarik, A., & Meindl, B. (2015). Statistical disclosure control for micro-data using the R package sdcMicro. Journal of Statistical Software, 67(4), 1–36. https://doi.org/10.18637/jss.v067.i04
- Katz, D. (2014). Transitive credit as a means to address social and technological concerns stemming from citation and attribution of digital products. Journal of Open Research Software, 2, 1–4. https://doi.org/10.5334/jors.be
- El Emam, K., & Dankar, F. K. (2008). Protecting privacy using k-anonymity. Journal of the American Medical Informatics Association, 15(5), 627–637. https://doi.org/10.1197/jamia.M2716
- Office of the Privacy Commissioner of Canada. (n.d.). The Personal Information Protection and Electronic Documents Act (PIPEDA) brief. Office of the Privacy Commissioner of Canada. https://www.priv.gc.ca/en/privacy-topics/privacy-laws-in-canada/the-personal-information-protection-and-electronic-documents-act-pipeda/pipeda_brief/
- OpenAIRE. (n.d.). Amnesia: Anonymize your data before publishing. Retrieved March 17, 2025, from https://www.openaire.eu/amnesia-guide
- OSF. (n.d.). How to make a data dictionary. Retrieved March 17, 2025, from https://help.osf.io/article/217-how-to-make-a-data-dictionary
- Portage COVID-19 Working Group, Kristi Thompson, Erin Clary, Lucia Costanzo, Beth Knazook, Nick Rochlin, Felicity Tayler, Jane Fry, Chantal Ripp, Kathy Szigeti, Qian Zhang, Roger Reka, Minglu Wang, Rebecca Dickson, Mark Leggott, & Melanie Parlette-Stewart. (2020). De-identification Guidance (Version 2). Zenodo. https://doi.org/10.5281/zenodo.4270551
- Prasser, F., Eicher, J., Spengler, H., Bild, R., & Kuhn, K. A. (2020). Flexible data anonymization using ARX - Current status and challenges ahead. Software: Practice and Experience, 50(7), 1277–1304. https://doi.org/10.1002/spe.2812
- Priem, J., Piwowar, H., & Orr, R. (2022). OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts. ArXiv. https://arxiv.org/abs/2205.01833
- Research Data Management Support, Dorien Huijser, Neha Moopen, Jacques Flores, Mercedes Beltrán, Kasper de Bruijn, Johathan de Bruin, Desiree Capel, Freek Dijkstra, Jonas Folkers, Andreas Franzke, Joris de Graaf, Saskia van den Hout, Frans Huigen, Rik D.T. Janssen, Katharina Jovic, Leon Kessels, Sanne Kleerebezem, Danny de Koning-van Nieuwamerongen, … Felix Weijdema. (2024). Data Privacy Handbook (v2024.07.12) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.12731721
- Research Excellence Framework. (n.d.). REF 2021 results. Retrieved March 17, 2025, from https://results2021.ref.ac.uk/
- Sochat, V. (2022). CiteLang: Modeling the research software ecosystem. Journal of Open Source Software, 7(77), 4458. https://doi.org/10.21105/joss.04458
- UBC Library Research Commons. (n.d.). Data dictionary. Retrieved March 17, 2025, from https://ubc-library-rc.github.io/rdm/content/07_data_dictionary.html
- University of Pennsylvania Libraries. (n.d.). Data management: Data dictionary. Retrieved March 17, 2025, from https://guides.library.upenn.edu/c.php?g=564157&p=9554907