Tracking Article Processing Charges (APCs) for a given institution
In this notebook, we will query the OpenAlex API to answer the following questions:
- How much are researchers at my institution paying in APCs?
- Which journals/publishers are collecting the most APCs from researchers at my institution?
- How much money are my organization’s researchers saving in discounted APC charges from our transformative/read-publish agreements?
Most organizations do not have an effective way of tracking the APCs that their researchers pay to publish in open access journals. By estimating how much money is going to APCs each year, and which publishers are collecting the most APCs, libraries can make more informed decisions around the details of the read-publish agreements they have with various publishers.
APC-able Works
Before starting this analysis, it is important to define which types of works are subject to APCs and which are not.
While a work may include contributions from a number of different institutions, the APC is typically the responsibility of the work’s corresponding author.
In addition, open access works published in Gold or Hybrid OA journals are subject to APCs, while works with Green, Diamond, or Bronze OA status are not.
Finally, APCs are not typically charged for editorial content submitted to an open access journal.
Thus, for the purposes of this notebook, APC-able works must have the following characteristics:
- Original articles or reviews
- Published in a Gold or Hybrid OA journal
- Corresponding author is a researcher at our institution.
Surveying APCs by journal/publisher
Steps
- We retrieve all works whose corresponding authors are researchers at the institution
- We get the journal/publisher and APC list price for each publication
- We sum the APCs by journal/publisher
Input
For inputs, we first need to identify the Research Organization Registry (ROR) ID for our institution. In this example we will use the ROR ID for McMaster University (https://ror.org/02fa3aq29). You can search and substitute your own institution’s ROR here: https://ror.org/search.
Next, we identify the publication year we are interested in analyzing. If the details of your institution’s specific transformative/read-publish agreements change from year to year, you will want to limit your analysis to a single year.
Finally, because editorial content is not typically subject to APCs, we will limit our search to works with the publication types “article” or “review”.
SAVE_CSV = False # flag to determine whether to save the output as a CSV file
# input
ror_id = "https://ror.org/02fa3aq29"
publication_year = 2024
publication_types = ["article", "review"]
publication_oa_statuses = ["gold", "hybrid"]
Get OpenAlex ID of the given institution
We only want publications whose corresponding authors are affiliated with McMaster University. However, because OpenAlex currently does not support filtering corresponding institutions by ROR ID, we first need to find the OpenAlex ID for McMaster using the institutions entity type.
Our search criteria are as follows:
ror
: ROR ID of the institution (ror:https://ror.org/02fa3aq29)
Now we need to build a URL for the query from the following parameters:
- Starting point is the base URL of the OpenAlex API:
https://api.openalex.org/
- We append the entity type to it:
https://api.openalex.org/institutions
- All criteria need to go into the query parameter filter that is added after a question mark:
https://api.openalex.org/institutions?filter=
- To construct the filter value we take the criteria we specified and concatenate them using commas as separators:
https://api.openalex.org/institutions?filter=ror:https://ror.org/02fa3aq29
import requests
# construct the url using the provided ror id
url = f"https://api.openalex.org/institutions?filter=ror:{ror_id}"
# send a get request to the constructed url
response = requests.get(url)
# parse the response json data
json_data = response.json()
# extract the institution id from the first result
institution_id = json_data["results"][0]["id"] # https://openalex.org/I98251732
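As a small practical aside: OpenAlex serves faster and more consistent responses from its “polite pool” when you identify yourself with a mailto query parameter. A minimal sketch of a URL builder that optionally adds it (the email address would be your own; the helper name is ours, not part of the notebook above):

```python
def build_institution_url(ror_id, email=None):
    # base query, identical to the one used above
    url = f"https://api.openalex.org/institutions?filter=ror:{ror_id}"
    if email:
        # identifying yourself via mailto joins the OpenAlex "polite pool"
        url += f"&mailto={email}"
    return url
```

The same parameter can be appended to every request in this notebook.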
Get all APC-able works published by researchers at the institution
Our search criteria are as follows:
corresponding_institution_ids
: institution affiliated with the corresponding authors of a work, as an OpenAlex ID (corresponding_institution_ids:https://openalex.org/I98251732)
publication_year
: the year the work was published (publication_year:2024)
type
: the type of the work (type:article|review)
oa_status
: the OA status of the work (oa_status:gold|hybrid)
Now we need to build a URL for the query from the following parameters:
- Starting point is the base URL of the OpenAlex API:
https://api.openalex.org/
- We append the entity type to it:
https://api.openalex.org/works
- All criteria need to go into the query parameter filter that is added after a question mark:
https://api.openalex.org/works?filter=
- To construct the filter value we take the criteria we specified and concatenate them using commas as separators:
https://api.openalex.org/works?filter=corresponding_institution_ids:https://openalex.org/I98251732,publication_year:2024,type:article|review,oa_status:gold|hybrid&page=1&per-page=50
import numpy as np
import pandas as pd
def get_works_by_institution(institution_id, publication_year, publication_types, page=1, items_per_page=50):
    # construct the api url with the given institution id, publication year, publication types, page number, and items per page
    # (note: publication_oa_statuses is read from the module-level input defined above)
    url = f"https://api.openalex.org/works?filter=corresponding_institution_ids:{institution_id},publication_year:{publication_year},type:{'|'.join(publication_types)},oa_status:{'|'.join(publication_oa_statuses)}&page={page}&per-page={items_per_page}"
    # send a GET request to the api and parse the json response
    response = requests.get(url)
    json_data = response.json()
    # convert the json response to a dataframe
    df_json = pd.DataFrame.from_dict(json_data["results"])
    next_page = True
    if df_json.empty:  # an empty dataframe means no more pages are available
        next_page = False
    # if there are more pages, recursively fetch the next page
    if next_page:
        df_json_next_page = get_works_by_institution(institution_id, publication_year, publication_types, page=page + 1, items_per_page=items_per_page)
        df_json = pd.concat([df_json, df_json_next_page])
    return df_json
df_works = get_works_by_institution(institution_id, publication_year, publication_types)
if SAVE_CSV:
df_works.to_csv(f"institution_works_{publication_year}.csv", index=True)
df_works
id | doi | title | display_name | publication_year | publication_date | ids | language | primary_location | type | ... | versions | referenced_works_count | referenced_works | related_works | abstract_inverted_index | abstract_inverted_index_v3 | cited_by_api_url | counts_by_year | updated_date | created_date | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | https://openalex.org/W4391340020 | https://doi.org/10.1002/adfm.202314520 | Borophene Based 3D Extrusion Printed Nanocompo... | Borophene Based 3D Extrusion Printed Nanocompo... | 2024 | 2024-01-30 | {'openalex': 'https://openalex.org/W4391340020... | en | {'is_oa': True, 'landing_page_url': 'https://d... | article | ... | [] | 72 | [https://openalex.org/W1963693172, https://ope... | [https://openalex.org/W4313334364, https://ope... | {'Abstract': [0], 'Herein,': [1], 'a': [2, 10,... | None | https://api.openalex.org/works?filter=cites:W4... | [{'year': 2025, 'cited_by_count': 7}, {'year':... | 2025-03-17T19:08:42.310588 | 2024-01-31 |
1 | https://openalex.org/W4391115198 | https://doi.org/10.1002/anie.202318665 | Development of Better Aptamers: Structured Lib... | Development of Better Aptamers: Structured Lib... | 2024 | 2024-01-23 | {'openalex': 'https://openalex.org/W4391115198... | en | {'is_oa': True, 'landing_page_url': 'https://d... | review | ... | [] | 204 | [https://openalex.org/W1509198841, https://ope... | [https://openalex.org/W4313237059, https://ope... | {'Abstract': [0], 'Systematic': [1], 'evolutio... | None | https://api.openalex.org/works?filter=cites:W4... | [{'year': 2025, 'cited_by_count': 11}, {'year'... | 2025-03-14T02:50:47.690863 | 2024-01-23 |
2 | https://openalex.org/W4392384326 | https://doi.org/10.1016/s2666-7568(24)00007-2 | Prevalence of multimorbidity and polypharmacy ... | Prevalence of multimorbidity and polypharmacy ... | 2024 | 2024-03-04 | {'openalex': 'https://openalex.org/W4392384326... | en | {'is_oa': True, 'landing_page_url': 'https://d... | review | ... | [] | 127 | [https://openalex.org/W1056226603, https://ope... | [https://openalex.org/W3196849760, https://ope... | {'Multimorbidity': [0], '(multiple': [1, 5], '... | None | https://api.openalex.org/works?filter=cites:W4... | [{'year': 2025, 'cited_by_count': 8}, {'year':... | 2025-03-12T16:35:56.335359 | 2024-03-05 |
3 | https://openalex.org/W4391180419 | https://doi.org/10.1016/j.molliq.2024.124105 | Rapid and effective antibiotics elimination fr... | Rapid and effective antibiotics elimination fr... | 2024 | 2024-01-24 | {'openalex': 'https://openalex.org/W4391180419... | en | {'is_oa': True, 'landing_page_url': 'https://d... | article | ... | [] | 62 | [https://openalex.org/W1988636074, https://ope... | [https://openalex.org/W888754083, https://open... | {'Tetracycline': [0], '(TC)': [1], 'and': [2, ... | None | https://api.openalex.org/works?filter=cites:W4... | [{'year': 2025, 'cited_by_count': 7}, {'year':... | 2025-03-11T00:28:29.517999 | 2024-01-25 |
4 | https://openalex.org/W4390610871 | https://doi.org/10.1016/j.snb.2024.135282 | Interleukin-6 electrochemical sensor using pol... | Interleukin-6 electrochemical sensor using pol... | 2024 | 2024-01-05 | {'openalex': 'https://openalex.org/W4390610871... | en | {'is_oa': True, 'landing_page_url': 'https://d... | article | ... | [] | 67 | [https://openalex.org/W1274997689, https://ope... | [https://openalex.org/W2385756659, https://ope... | {'Interleukin-6': [0], '(IL-6)': [1], 'is': [2... | None | https://api.openalex.org/works?filter=cites:W4... | [{'year': 2025, 'cited_by_count': 2}, {'year':... | 2025-03-04T15:16:23.540238 | 2024-01-07 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
47 | https://openalex.org/W4400490490 | https://doi.org/10.1371/journal.pone.0306075 | Receipt of Opioid Agonist Treatment in provinc... | Receipt of Opioid Agonist Treatment in provinc... | 2024 | 2024-07-10 | {'openalex': 'https://openalex.org/W4400490490... | en | {'is_oa': True, 'landing_page_url': 'https://d... | article | ... | [] | 58 | [https://openalex.org/W1533521761, https://ope... | [https://openalex.org/W3177241792, https://ope... | {'In': [0], 'many': [1], 'jurisdictions,': [2]... | None | https://api.openalex.org/works?filter=cites:W4... | [] | 2025-03-18T07:35:19.142891 | 2024-07-11 |
48 | https://openalex.org/W4400515928 | https://doi.org/10.1111/sjop.13053 | Sociability across Eastern–Western cultures: I... | Sociability across Eastern–Western cultures: I... | 2024 | 2024-07-09 | {'openalex': 'https://openalex.org/W4400515928... | en | {'is_oa': True, 'landing_page_url': 'https://d... | article | ... | [] | 67 | [https://openalex.org/W1492236751, https://ope... | [https://openalex.org/W68679956, https://opena... | {'In': [0], 'this': [1, 59], 'study,': [2], 'w... | None | https://api.openalex.org/works?filter=cites:W4... | [] | 2025-03-18T07:43:08.509042 | 2024-07-11 |
49 | https://openalex.org/W4401060634 | https://doi.org/10.1016/j.vaccine.2024.07.023 | The role of influenza Hemagglutination-Inhibit... | The role of influenza Hemagglutination-Inhibit... | 2024 | 2024-07-28 | {'openalex': 'https://openalex.org/W4401060634... | en | {'is_oa': True, 'landing_page_url': 'https://d... | article | ... | [] | 29 | [https://openalex.org/W1970848116, https://ope... | [https://openalex.org/W3023742952, https://ope... | {'Influenza': [0], 'vaccination': [1, 17], 'ma... | None | https://api.openalex.org/works?filter=cites:W4... | [] | 2025-03-18T08:45:08.115726 | 2024-07-31 |
0 | https://openalex.org/W4403306060 | https://doi.org/10.3390/ijms252010897 | Myeloid GSK3α Deficiency Reduces Lesional Infl... | Myeloid GSK3α Deficiency Reduces Lesional Infl... | 2024 | 2024-10-10 | {'openalex': 'https://openalex.org/W4403306060... | en | {'is_oa': True, 'landing_page_url': 'https://d... | article | ... | [] | 27 | [https://openalex.org/W1487848458, https://ope... | [https://openalex.org/W73242545, https://opena... | {'The': [0, 50, 100], 'molecular': [1], 'mecha... | None | https://api.openalex.org/works?filter=cites:W4... | [] | 2025-03-18T12:07:28.678331 | 2024-10-11 |
1 | https://openalex.org/W4405314055 | https://doi.org/10.1002/ksa.12556 | Anterior cruciate ligament reconstruction with... | Anterior cruciate ligament reconstruction with... | 2024 | 2024-12-12 | {'openalex': 'https://openalex.org/W4405314055... | en | {'is_oa': True, 'landing_page_url': 'https://d... | review | ... | [] | 50 | [https://openalex.org/W1958440778, https://ope... | [https://openalex.org/W4401483477, https://ope... | {'Abstract': [0], 'Purpose': [1], 'This': [2],... | None | https://api.openalex.org/works?filter=cites:W4... | [] | 2025-03-18T12:34:56.261172 | 2024-12-13 |
952 rows × 51 columns
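Note that OpenAlex’s basic page-based paging only reaches the first 10,000 results; for institutions with a very large output, cursor paging (starting from cursor=*) is the documented alternative. A sketch that separates the paging loop from the HTTP call, so the loop itself is testable; fetch_page is a hypothetical helper you would implement with requests, appending &cursor={cursor} to the works URL instead of &page={page}:

```python
def collect_pages(fetch_page):
    """Accumulate results across cursor pages. fetch_page(cursor) must return a
    dict shaped like an OpenAlex response: {'results': [...], 'meta': {'next_cursor': ...}}."""
    results, cursor = [], "*"  # OpenAlex cursor paging starts at '*'
    while cursor:
        payload = fetch_page(cursor)
        results.extend(payload["results"])
        # OpenAlex returns the next cursor in meta; None once exhausted
        cursor = payload["meta"].get("next_cursor")
    return results
```

The collected list of work records can then be passed to pd.DataFrame.from_dict as above.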
Get Journals/Publishers and APCs in USD
In a work entity object, there is information about the journal (primary_location) and the journal’s APC list price (apc_list).
The apc_list price is derived from the Directory of Open Access Journals (DOAJ), which compiles APC data currently available on publishers’ websites.
It should be noted that not all publishers list APC prices on their websites, meaning that not all works will have an apc_list price in OpenAlex. In these cases, we will infer the APC price based on the mean APC price of those works for which apc_list data is available.
In addition, even when an APC price is listed on a publisher’s website, there is no guarantee that this is the final price our authors paid for publication.
For these reasons, results of this notebook must be understood as best available estimates.
# extract 'value_usd' from 'apc_list' if it is a dictionary (i.e. 'apc_list' exists in the work record); otherwise, set to null
df_works["apc_list_usd"] = df_works["apc_list"].apply(lambda apc_list: apc_list["value_usd"] if isinstance(apc_list, dict) else np.nan)
# extract 'id' and 'name' from 'source' within 'primary_location' if 'source' exists; otherwise, set to null
df_works["source_id"] = df_works["primary_location"].apply(lambda location: location["source"]["id"] if location["source"] else np.nan)
df_works["source_name"] = df_works["primary_location"].apply(lambda location: location["source"]["display_name"] if location["source"] else np.nan)
# extract 'host_organization' from 'source' within 'primary_location' if 'source' exists; otherwise, set to null
df_works["source_host_organization"] = df_works["primary_location"].apply(lambda location: location["source"]["host_organization"] if location["source"] else np.nan)
# extract 'issn' and 'issn_l' from 'source' within 'primary_location' if 'source' exists; otherwise, set to null
df_works["source_issn"] = df_works["primary_location"].apply(lambda location: location["source"]["issn"] if location["source"] else np.nan)
df_works["source_issn_l"] = df_works["primary_location"].apply(lambda location: location["source"]["issn_l"] if location["source"] else np.nan)
# calculate the average apc where 'apc_list_usd' is not null
apc_mean = df_works[df_works["apc_list_usd"].notnull()]["apc_list_usd"].mean()
# fill null values in 'apc_list_usd' with the calculated average
df_works["apc_list_usd"] = df_works["apc_list_usd"].fillna(apc_mean)
# fill null values in 'source_id', 'source_name', 'source_issn' and 'source_issn_l'
df_works["source_id"] = df_works["source_id"].fillna("unknown source")
df_works["source_name"] = df_works["source_name"].fillna("unknown source")
df_works["source_host_organization"] = df_works["source_host_organization"].fillna("unknown source")
df_works["source_issn"] = df_works["source_issn"].fillna("unknown source")
df_works["source_issn_l"] = df_works["source_issn_l"].fillna("unknown source")
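Mean imputation can skew the totals when many list prices are missing, so it is worth recording what share of works was imputed before filling. A toy illustration of the same steps with hypothetical prices (in the notebook, compute the share on df_works before the fillna call):

```python
import numpy as np
import pandas as pd

# toy stand-in for df_works: one of three works lacks a list price
toy = pd.DataFrame({"apc_list_usd": [2000.0, np.nan, 3000.0]})

missing_share = toy["apc_list_usd"].isna().mean()  # fraction of works to impute
apc_mean = toy["apc_list_usd"].mean()              # mean skips NaN by default
toy["apc_list_usd"] = toy["apc_list_usd"].fillna(apc_mean)
```

Reporting missing_share alongside the final totals makes the “best available estimate” caveat concrete.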
Get Publisher Display Name (Optional)
OpenAlex identifies publishers with a unique identifier called an OpenAlex ID. The following code translates these OpenAlex IDs into publisher display names for easier analysis.
import re
CHUNK_SIZE = 5
def get_source_host_organization_display_name(publisher_ids):
    def get_source_host_organization_publisher_display_name(publisher_ids):
        def get_source_host_organization_publisher_display_name_by_chunk(publisher_ids_chunk):
            # construct the api url using the chunk of publisher ids
            url = f"https://api.openalex.org/publishers?filter=ids.openalex:{'|'.join(publisher_ids_chunk)}"
            # send a GET request to the api and parse the json response
            response = requests.get(url)
            json_data = response.json()
            # convert the json response to a dataframe and return the relevant columns
            df_json = pd.DataFrame.from_dict(json_data["results"])
            return df_json[["id", "display_name"]]
        # if there are no publisher ids, return an empty dataframe
        if len(publisher_ids) < 1:
            return pd.DataFrame()
        # split the publisher ids into chunks and apply the function to each chunk
        chunks = np.array_split(publisher_ids, np.ceil(len(publisher_ids) / CHUNK_SIZE))
        df_chunks = pd.DataFrame({"chunk": chunks})
        return pd.concat(df_chunks["chunk"].apply(get_source_host_organization_publisher_display_name_by_chunk).tolist())
    def get_source_host_organization_institution_display_name(institution_ids):
        def get_source_host_organization_institution_display_name_by_chunk(institution_ids_chunk):
            # construct the api url using the chunk of institution ids
            url = f"https://api.openalex.org/institutions?filter=id:{'|'.join(institution_ids_chunk)}"
            # send a GET request to the api and parse the json response
            response = requests.get(url)
            json_data = response.json()
            # convert the json response to a dataframe and return the relevant columns
            df_json = pd.DataFrame.from_dict(json_data["results"])
            return df_json[["id", "display_name"]]
        # if there are no institution ids, return an empty dataframe
        if len(institution_ids) < 1:
            return pd.DataFrame()
        # split the institution ids into chunks and apply the function to each chunk
        chunks = np.array_split(institution_ids, np.ceil(len(institution_ids) / CHUNK_SIZE))
        df_chunks = pd.DataFrame({"chunk": chunks})
        return pd.concat(df_chunks["chunk"].apply(get_source_host_organization_institution_display_name_by_chunk).tolist())
    # some host organizations are publishers (P... ids), others are institutions (I... ids)
    publishers = list(filter(lambda s: re.search(r"https:\/\/openalex\.org\/P", s), publisher_ids))
    institutions = list(filter(lambda s: re.search(r"https:\/\/openalex\.org\/I", s), publisher_ids))
    # create a dataframe with a default entry for unknown source
    df_lookup = pd.DataFrame.from_dict({"id": ["unknown source"], "display_name": ["unknown source"]})
    # concatenate the dataframes with publisher and institution display names
    df_lookup = pd.concat([df_lookup, get_source_host_organization_publisher_display_name(publishers), get_source_host_organization_institution_display_name(institutions)], ignore_index=True)
    return df_lookup
# get the display names for unique source_host_organization ids in df_works
df_lookup = get_source_host_organization_display_name(df_works["source_host_organization"].unique())
if SAVE_CSV:
df_lookup.to_csv(f"source_host_organization_lookup.csv", index=True)
df_lookup
id | display_name | |
---|---|---|
0 | unknown source | unknown source |
1 | https://openalex.org/P4310320990 | Elsevier BV |
2 | https://openalex.org/P4310320595 | Wiley |
3 | https://openalex.org/P4310319908 | Nature Portfolio |
4 | https://openalex.org/P4310320556 | Royal Society of Chemistry |
... | ... | ... |
57 | https://openalex.org/P4310312949 | Surveillance Studies Network |
58 | https://openalex.org/P4310311647 | University of Oxford |
59 | https://openalex.org/P4310319847 | Routledge |
60 | https://openalex.org/P4310319955 | Academic Press |
61 | https://openalex.org/P4324113769 | Iter Press |
62 rows × 2 columns
# update the 'source_host_organization' with the corresponding display names
df_works["source_host_organization"] = df_works["source_host_organization"].apply(lambda publisher: df_lookup[df_lookup["id"] == publisher]["display_name"].squeeze())
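The row-wise lookup above rescans df_lookup once per work; an equivalent and faster mapping builds a plain dict once and applies Series.map. Sketched on toy stand-ins for df_lookup and the host-organization column:

```python
import pandas as pd

# toy stand-ins (in the notebook: df_lookup and df_works["source_host_organization"])
df_lookup_toy = pd.DataFrame({
    "id": ["https://openalex.org/P4310320595", "unknown source"],
    "display_name": ["Wiley", "unknown source"],
})
orgs = pd.Series(["https://openalex.org/P4310320595", "unknown source"])

# build an id -> display_name dict once, then map the whole column in one pass
lookup = dict(zip(df_lookup_toy["id"], df_lookup_toy["display_name"]))
mapped = orgs.map(lookup)
```

For a few hundred rows the difference is negligible, but the dict version scales linearly rather than quadratically.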
Aggregate APCs Data
Here, we build a dataframe containing the number of APC-able works and the estimated total APC cost for each journal.
# group the dataframe by 'source_id' and 'source_issn_l'
# and aggregate 'source_issn' by taking the maximum value (in this case the common issn list of strings)
# and aggregate 'source_host_organization' by taking the maximum value (in this case the common string name of the source's host organization)
# and aggregate 'source_name' by taking the maximum value (in this case the common string name of the source)
# and 'id' by counting
# and 'apc_list_usd' by summing
df_apc = df_works.groupby(["source_id", "source_issn_l"]).agg({"source_issn": "max", "source_name": "max", "source_host_organization": "max", "id": "count", "apc_list_usd": "sum"})
# rename the 'id' column to 'num_publications' and 'apc_list_usd' column to 'apc_usd'
df_apc.rename(columns={"id": "num_publications", "apc_list_usd": "apc_usd"}, inplace=True)
if SAVE_CSV:
df_apc.to_csv(f"apc_usd_by_source.csv", index=True)
df_apc
source_issn | source_name | source_host_organization | num_publications | apc_usd | ||
---|---|---|---|---|---|---|
source_id | source_issn_l | |||||
https://openalex.org/S100014455 | 1756-0500 | [1756-0500] | BMC Research Notes | BioMed Central | 1 | 1361.0 |
https://openalex.org/S100662246 | 1748-2623 | [1748-2623, 1748-2631] | International Journal of Qualitative Studies o... | Taylor & Francis | 1 | 1790.0 |
https://openalex.org/S100695177 | 0004-6256 | [0004-6256, 1538-3881] | The Astronomical Journal | Institute of Physics | 1 | 4499.0 |
https://openalex.org/S10134376 | 2071-1050 | [2071-1050] | Sustainability | Multidisciplinary Digital Publishing Institute | 2 | 4764.0 |
https://openalex.org/S101949793 | 1424-8220 | [1424-8220] | Sensors | Multidisciplinary Digital Publishing Institute | 5 | 12990.0 |
... | ... | ... | ... | ... | ... | ... |
https://openalex.org/S98651283 | 0022-4227 | [0022-4227, 1467-9809] | Journal of Religious History | Wiley | 1 | 2630.0 |
https://openalex.org/S99352657 | 0935-9648 | [0935-9648, 1521-4095] | Advanced Materials | unknown source | 2 | 10500.0 |
https://openalex.org/S99498898 | 1567-5394 | [1567-5394, 1878-562X] | Bioelectrochemistry | Elsevier BV | 1 | 3370.0 |
https://openalex.org/S99546260 | 1836-9561 | [1836-9561, 1836-9553] | Journal of physiotherapy | Elsevier BV | 1 | 3450.0 |
https://openalex.org/S99985186 | 1360-8592 | [1360-8592, 1532-9283] | Journal of Bodywork and Movement Therapies | Elsevier BV | 1 | 2670.0 |
576 rows × 5 columns
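To answer the “which publishers collect the most APCs” question directly, the journal-level table can be rolled up again by host organization. A sketch on a toy stand-in for df_apc, with values borrowed from the table above:

```python
import pandas as pd

# toy stand-in for df_apc (in the notebook, group df_apc itself)
df_apc_toy = pd.DataFrame({
    "source_host_organization": ["Elsevier BV", "Wiley", "Elsevier BV"],
    "num_publications": [1, 2, 3],
    "apc_usd": [3370.0, 2630.0, 3450.0],
})

# rank host organizations by estimated APC spend
df_by_publisher = (
    df_apc_toy.groupby("source_host_organization")
              .agg(num_publications=("num_publications", "sum"),
                   apc_usd=("apc_usd", "sum"))
              .sort_values("apc_usd", ascending=False)
)
```

On the real data, df_by_publisher.head(10) gives the top-ten publishers by estimated spend.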
Estimating the total (non-discounted) APC spend
total_apc = df_apc["apc_usd"].sum()
print(f"Estimated total (non-discounted) APC cost in {publication_year}: USD {round(total_apc, 2)}.")
Estimated total (non-discounted) APC cost in 2024: USD 2754479.12.
Calculating Discounted APCs
Steps
- We load the given list of read-publish agreement discounts
- We check which journals are covered by an agreement, matching on ISSN
- We calculate the APCs actually paid by applying the discounts to the list prices estimated above
Input
Assume we have a list of read-publish agreement discounts in CSV format, discount-list.csv. The file includes the following necessary attributes:
issn
: ISSN of the journal
discount
: value of the discount, either a number or a percentage
is_flatrate
: flag indicating whether the discount is a flat rate discount or a percentage discount
You can download a template discount-list.csv here and update it with the details of your institution’s own APC discounts.
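For reference, a minimal discount-list.csv could look like the following; the ISSNs and discount values here are hypothetical placeholders, not real agreement terms:

```
issn,discount,is_flatrate
1756-0500,500,True
1424-8220,0.15,False
```

A row with is_flatrate set to True is read as a fixed USD amount to subtract; False means the discount is a fraction of the list price.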
# input
df_discount = pd.read_csv("discount-list.csv")
import typing
def get_discount(issn: typing.List[str] | str, apc: float) -> float:
    # check if issn is a string; if so, convert it to a list
    if isinstance(issn, str):
        issn = [issn]
    # filter the discount dataframe to get rows where 'issn' is in the provided issn list
    discount_rows = df_discount[df_discount["issn"].isin(issn)]
    # if no discount rows are found, return the original apc
    if discount_rows.empty:
        return apc
    # get the first row from the filtered discount rows
    discount_row = discount_rows.iloc[0]
    # if the discount is a flat rate, subtract the discount from the apc
    if discount_row["is_flatrate"]:
        return apc - discount_row["discount"]
    else:
        # if the discount is a percentage, apply the discount to the apc
        return apc * (1 - discount_row["discount"])
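The row-wise get_discount lookup is fine at this scale; for much larger tables, a vectorized alternative is to explode the ISSN lists and merge once against the discount table. A sketch on toy frames with hypothetical discounts (in the notebook you would use df_apc and df_discount):

```python
import pandas as pd

# toy stand-ins with hypothetical values
df_apc_toy = pd.DataFrame({
    "source_issn": [["1756-0500"], ["0004-6256", "1538-3881"], ["9999-9999"]],
    "apc_usd": [1361.0, 4499.0, 2000.0],
})
df_discount_toy = pd.DataFrame({
    "issn": ["1756-0500", "1538-3881"],
    "discount": [500.0, 0.20],
    "is_flatrate": [True, False],
})

# give each ISSN its own row, keeping the original row number in an 'index' column
exploded = df_apc_toy.explode("source_issn").reset_index()
merged = exploded.merge(df_discount_toy, left_on="source_issn",
                        right_on="issn", how="left")
# keep the first matching discount per original row, mirroring get_discount
matched = merged.dropna(subset=["issn"]).groupby("index").first()

def apply_row_discount(row):
    # flat-rate discounts subtract a fixed amount; percentages scale the APC
    if row["is_flatrate"]:
        return row["apc_usd"] - row["discount"]
    return row["apc_usd"] * (1 - row["discount"])

# default to the undiscounted APC, then overwrite the rows that matched
df_apc_toy["discounted_apc_usd"] = df_apc_toy["apc_usd"]
df_apc_toy.loc[matched.index, "discounted_apc_usd"] = matched.apply(apply_row_discount, axis=1)
```

Both approaches take the first matching discount when a journal carries multiple ISSNs, so their results agree.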
Apply Discounts to Aggregated APC Data
Here, we apply the APC discounts to the aggregated APC data. This produces a dataframe and .csv
file that includes the number of APC-able publications and the discounted APC cost for each journal.
# apply the get_discount function to each row of the dataframe to calculate the discounted apc and store it in a new column 'discounted_apc_usd'
df_apc["discounted_apc_usd"] = df_apc.apply(lambda x: get_discount(issn=x["source_issn"], apc=x["apc_usd"]), axis=1)
if SAVE_CSV:
df_apc.to_csv(f"apc_usd_with_discounts_by_source.csv", index=True)
df_apc
source_issn | source_name | source_host_organization | num_publications | apc_usd | discounted_apc_usd | ||
---|---|---|---|---|---|---|---|
source_id | source_issn_l | ||||||
https://openalex.org/S100014455 | 1756-0500 | [1756-0500] | BMC Research Notes | BioMed Central | 1 | 1361.0 | 1361.0 |
https://openalex.org/S100662246 | 1748-2623 | [1748-2623, 1748-2631] | International Journal of Qualitative Studies o... | Taylor & Francis | 1 | 1790.0 | 1790.0 |
https://openalex.org/S100695177 | 0004-6256 | [0004-6256, 1538-3881] | The Astronomical Journal | Institute of Physics | 1 | 4499.0 | 4499.0 |
https://openalex.org/S10134376 | 2071-1050 | [2071-1050] | Sustainability | Multidisciplinary Digital Publishing Institute | 2 | 4764.0 | 4764.0 |
https://openalex.org/S101949793 | 1424-8220 | [1424-8220] | Sensors | Multidisciplinary Digital Publishing Institute | 5 | 12990.0 | 12990.0 |
... | ... | ... | ... | ... | ... | ... | ... |
https://openalex.org/S98651283 | 0022-4227 | [0022-4227, 1467-9809] | Journal of Religious History | Wiley | 1 | 2630.0 | 2630.0 |
https://openalex.org/S99352657 | 0935-9648 | [0935-9648, 1521-4095] | Advanced Materials | unknown source | 2 | 10500.0 | 10500.0 |
https://openalex.org/S99498898 | 1567-5394 | [1567-5394, 1878-562X] | Bioelectrochemistry | Elsevier BV | 1 | 3370.0 | 3370.0 |
https://openalex.org/S99546260 | 1836-9561 | [1836-9561, 1836-9553] | Journal of physiotherapy | Elsevier BV | 1 | 3450.0 | 3450.0 |
https://openalex.org/S99985186 | 1360-8592 | [1360-8592, 1532-9283] | Journal of Bodywork and Movement Therapies | Elsevier BV | 1 | 2670.0 | 2670.0 |
576 rows × 6 columns
Estimating the total discounted APC cost
total_apc_discount = df_apc["discounted_apc_usd"].sum()
print(f"Estimated APC cost (including discounts) for {publication_year}: USD {round(total_apc_discount, 2)}.")
Estimated APC cost (including discounts) for 2024: USD 2584462.85.
Estimating the total APC savings of our institution’s read-publish agreements
print(f"Estimated APC saving in {publication_year}: USD {round(total_apc - total_apc_discount, 2)}.")
Estimated APC saving in 2024: USD 170016.27.