NASA and Open Access Publishing
Michael J. Kurtz (ADS Project Scientist)
06 Apr 2020
Introduction
NASA has long been a (some would say the) leader in providing open access to scientific data. This commitment continues with the recent whitepaper: NASA’s Science Mission Directorate’s Strategy for Data Management and Computing for Groundbreaking Science 2019-2024, by the Strategic Data Management Working Group, approved by Thomas H. Zurbuchen, Associate Administrator for the Science Mission Directorate (SMD), on 17 Dec 2019.
A portion of the whitepaper discusses the scholarly literature and NASA’s strategy for improving access to it. In this blog post I discuss the current status of Open Access (OA) for NASA refereed research articles. First, the relevant excerpts from the 23-page document:
Goal 1: Develop and Implement Capabilities to Enable Open Science
Strategy 1.4: Increase transparency into how science data are being used through a free and open unified journal server. Such a system enables the public to freely access journal articles based on NASA data. Open access provides the public with clear evidence of the linkages between mission investments and scientific results. Broad access also facilitates cross-disciplinary research by removing barriers to collaboration. Recognizing that there are limitations to how many articles a researcher can reasonably be expected to read in a year, migration to a single server that employs machine-assisted learning will also benefit the community. The computing industry and library science have already advanced to a reasonably mature state, so NASA should proactively research and develop these technologies to organize information and knowledge in an easily searchable way.
Finding 9: SMD and the scientific community at large are interested in having better transparency into how science data are being used. Tracking of publications based on NASA data is inconsistent across individual flight projects and divisions. Further, access policies vary between publications, including differences in pricing structures and when articles become available for free. For example, some publications charge different amounts based on when an article was published while others allow free access to preprint copies of articles and charge for the final, printed version. While some communities already have systems that allow for open access to journals, this varies by discipline and subdiscipline.
Recommendation 9: SMD should create a free and open, unified journal server along the lines of PubSpace, ADS or ERS to make science papers more accessible to the public. NASA Science should also consider adopting the National Science Foundation’s requirements to reinstate the need for grant recipients to provide copies of all published research as part of their annual reports.
History
At the beginning of the 1990s, three organizations were founded which, combined, have essentially solved the open access problem for High Energy Physics (HEP) and Astrophysics. At that time both HEP and Astronomy had a well-developed preprint culture, so Paul Ginsparg’s Los Alamos Preprint Server (now arXiv) was immediately successful and quickly spread from HEP to astrophysics. The SLAC/Spires database (now INSPIRE), which was the first website in North America, included the preprints in its new web-based search system. Both of these organizations were funded by DoE.
Independently the Astrophysics Data System (ADS) built an internet search system based on the abstracts of astronomy research papers and negotiated with the publishers to digitize and put online copies of their journals, perhaps with embargos for the most recent issues. By 1996 the ADS also included the LANL preprints in their database. The ADS was (and continues to be) funded by NASA.
These three initiatives succeeded and have become key parts of the research infrastructure. By the turn of the century a large majority of papers in HEP and Astrophysics were posted in the renamed arXiv, the ADS had digitized all the main pre-electronic astronomy journal articles, and the astronomy journals all adopted embargo policies which allowed free access to articles after a short proprietary period (currently ApJ and A&A are 1 year, MNRAS is 3 years). The physics journals, which digitized their own back issues, have never allowed free access to their older material.
Well before the beginning of the Open Access Era with the publication of the Budapest Open Access Initiative (BOAI) declaration in 2002, astronomy was an almost completely OA discipline, in no small measure because of the support NASA gave to the ADS.
Data
We use the ADS full text database to search for refereed papers which acknowledge the use of NASA resources, or are written by NASA center personnel. The ADS has a near-complete set of the full text of recent astrophysics, physics, and geophysics journals in XML format. The ADS also knows, by analyzing the rights metadata in the XML document, if and when an article becomes OA. Additionally we index arXiv, and we match incoming journal articles with their arXiv versions where possible. We thus know if there is a green OA version of the article. (We adopt the standard nomenclature: green OA is author driven and free; gold OA is publisher driven and is normally paid for through author charges.)
Limitations on the data are that we do not include papers which have been deposited in PubSpace, nor papers which are posted on the new AGU-supported preprint server ESSOAr. Most of the 617 papers dated 2018 (our sample year) in PubSpace are not from journals which are indexed by the ADS, but perhaps 30% are. Nearly all the astrophysics papers in this group are also in arXiv, but nearly none of the Planetary Science (PS) or Geophysics papers are, although some of these are available from the journals as gold OA. Properly taking these into account would increase the percentage of PS papers which have OA versions by a few percent. The ESSOAr is too new to affect the results here, but is growing rapidly, and will be a factor in the future. Additionally the ADS database is not perfect: some published papers are not properly linked to their arXiv versions and for some hybrid journals we may miss an OA article. These problems are too small to significantly affect our results.
Experiment
The entire experiment can be done with a few simple ADS queries. We ask for the peer-reviewed (property:refereed) journal articles from 2018 (year:2018) and examine their properties. We restrict the results to articles which either acknowledge NASA or are written by someone from a NASA center (ack:NASA or aff:NASA). We look for OA papers from any source (property:openaccess) and from arXiv (property:eprint_openaccess).
We look at the two major US publishers of NASA funded research, the AAS (bibstem:(apj or apjl or apjs or aj)) and AGU (bibstem:(jgr* or georl)), and the leading non-US astronomy publications (bibstem:mnras, bibstem:a&a). We extract heliophysics articles from the AAS journals using a keyword search (=keyword:sun) and from the main European heliophysics journal (bibstem:soph). We look at two important pure planetary science journals (bibstem:icar, bibstem:jgre) and a pure space physics journal (bibstem:jgra). We also look at a set of Elsevier journals which are primarily planetary and space science, but have some astrophysics and earth science articles as well (bibstem:(jastp or gecoa or adspr or p&ss or e&psl or ssrv or lssr)). Finally we look at the AGU journals which are more earth science related (bibstem:(jgrb or jgrc or jgrd or jgrf)).
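As a minimal sketch, the query strings described above can be assembled from their parts. The field names and journal selections are copied from the text; the function name, the dictionary keys, and the choice to join terms with spaces (relying on ADS treating adjacent terms as an implicit AND) are assumptions made for illustration. The sketch only builds strings and does not contact ADS.

```python
# Sketch only: composes the ADS query strings described in the text.
# build_query and JOURNAL_SETS are illustrative names, not part of ADS.

BASE = "property:refereed year:2018"   # peer-reviewed articles from the sample year
NASA = "(ack:NASA or aff:NASA)"        # acknowledges NASA or has a NASA affiliation

# Journal selections, quoted from the text.
JOURNAL_SETS = {
    "aas": "bibstem:(apj or apjl or apjs or aj)",
    "agu": "bibstem:(jgr* or georl)",
    "mnras": "bibstem:mnras",
    "a&a": "bibstem:a&a",
    "apj-sun": "bibstem:(apj or apjl or apjs or aj) =keyword:sun",
    "soph": "bibstem:soph",
    "icar": "bibstem:icar",
    "jgre": "bibstem:jgre",
    "jgra": "bibstem:jgra",
    "elsevier": "bibstem:(jastp or gecoa or adspr or p&ss or e&psl or ssrv or lssr)",
    "jgr(bcdf)": "bibstem:(jgrb or jgrc or jgrd or jgrf)",
}

# Open access selections, quoted from the text.
OA_FILTERS = {
    "any": "property:openaccess",           # OA from any source
    "arxiv": "property:eprint_openaccess",  # OA via arXiv
}


def build_query(journals=None, nasa=False, oa=None):
    """Return one query string.

    journals: a key of JOURNAL_SETS, or None for no journal restriction
    nasa:     True to keep only NASA-related articles
    oa:       "any", "arxiv", or None for no OA restriction
    """
    parts = [BASE]
    if journals is not None:
        parts.append(JOURNAL_SETS[journals])
    if nasa:
        parts.append(NASA)
    if oa is not None:
        parts.append(OA_FILTERS[oa])
    # Assumption: space-separated terms are combined with an implicit AND.
    return " ".join(parts)


if __name__ == "__main__":
    for oa in (None, "any", "arxiv"):
        print(build_query("agu", nasa=True, oa=oa))
```

Each string can be pasted into the ADS search box; the three variants printed for a journal set give the total, the OA count, and the arXiv count for that set.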
The links below run the relevant queries; in ADS, click on the “Publications” facet to see the distribution of articles as a function of journal. The data is as of 18 Mar 2020. Before looking in detail at selected publishers/journals, we show the totals for refereed articles in the ADS for physics and astronomy (a+p) and just for astronomy (a), both in total and NASA-related.
Results
Category | Total | No OA | OA no arXiv | arXiv |
---|---|---|---|---|
a+p/all | 268962 | 196960 | 27902 | 44100 |
a+p/NASA | 8451 | 2717 | 1260 | 4474 |
a/all | 28628 | 8770 | 3620 | 16238 |
a/NASA | 6574 | 1473 | 703 | 4398 |
aas/all | 4429 | 0 | 633 | 3796 |
aas/NASA | 2242 | 0 | 265 | 1977 |
agu/all | 4529 | 3369 | 1056 | 104 |
agu/NASA | 1519 | 1102 | 356 | 61 |
mnras/all | 3897 | 345 | 0 | 3552 |
mnras/NASA | 1297 | 62 | 0 | 1235 |
a&a/all | 1921 | 0 | 236 | 1684 |
a&a/NASA | 614 | 0 | 50 | 564 |
icar/all | 402 | 265 | 27 | 90 |
icar/NASA | 204 | 149 | 13 | 42 |
jgre/all | 178 | 114 | 43 | 21 |
jgre/NASA | 118 | 73 | 28 | 17 |
jgra/all | 693 | 496 | 163 | 34 |
jgra/NASA | 467 | 314 | 123 | 30 |
soph/all | 167 | 62 | 14 | 91 |
soph/NASA | 67 | 7 | 2 | 59 |
apj-sun/all | 441 | 0 | 146 | 295 |
apj-sun/NASA | 267 | 0 | 80 | 187 |
jgr(bcdf)/all | 1970 | 1394 | 554 | 22 |
jgr(bcdf)/NASA | 411 | 293 | 116 | 2 |
elsevier/all | 2245 | 1916 | 185 | 144 |
elsevier/NASA | 380 | 306 | 42 | 32 |
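The percentages used when reading this table are simple ratios of its columns. The sketch below recomputes a few of them from counts copied out of the table; the function names are illustrative, and only a subset of rows is included. Closed-access fractions are taken as "No OA" divided by "Total", since the three right-hand columns do not sum exactly to the total in every row.

```python
# Counts copied from the table: (total, no OA, OA but not on arXiv, on arXiv).
COUNTS = {
    "a+p/NASA": (8451, 2717, 1260, 4474),
    "aas/all": (4429, 0, 633, 3796),
    "aas/NASA": (2242, 0, 265, 1977),
    "agu/all": (4529, 3369, 1056, 104),
    "agu/NASA": (1519, 1102, 356, 61),
    "mnras/all": (3897, 345, 0, 3552),
    "mnras/NASA": (1297, 62, 0, 1235),
}


def closed_fraction(row):
    """Fraction of articles in a row with no OA version."""
    total, no_oa, _, _ = COUNTS[row]
    return no_oa / total


def arxiv_share_of_oa(row):
    """Fraction of the OA articles in a row that are on arXiv (green OA)."""
    _, _, oa_no_arxiv, arxiv = COUNTS[row]
    oa_total = oa_no_arxiv + arxiv
    return arxiv / oa_total if oa_total else 0.0


def nasa_share(journal):
    """Fraction of a journal set's articles that are NASA-related."""
    return COUNTS[journal + "/NASA"][0] / COUNTS[journal + "/all"][0]


def share_of_nasa_closed(row, reference="a+p/NASA"):
    """Fraction of all non-OA NASA articles (reference row) found in one row."""
    return COUNTS[row][1] / COUNTS[reference][1]


if __name__ == "__main__":
    for journal in ("aas", "agu", "mnras"):
        print(
            f"{journal}: NASA share {nasa_share(journal):.1%}, "
            f"closed {closed_fraction(journal + '/all'):.1%}, "
            f"arXiv share of OA {arxiv_share_of_oa(journal + '/all'):.1%}"
        )
    print(f"AGU share of non-OA NASA articles: {share_of_nasa_closed('agu/NASA'):.1%}")
```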
Discussion
Clearly NASA is a very important research partner. About half of articles in AAS journals have a NASA contribution, as well as a third in AGU journals and also a third in the two main European astronomy journals MNRAS and A&A.
It is also clear that, within a given journal or group of journals, the fraction of OA articles is essentially the same for NASA papers as for non-NASA papers. There are, however, large differences from journal to journal and field to field, both in OA fraction, and in the use of arXiv.
We will divide the discussion by NASA’s Science Mission Directorate’s four science divisions.
Astrophysics. The astrophysics journal literature has been de facto OA for more than two decades. Among the main journals only MNRAS has an embargo period longer than one year; even there only 9% of year-old articles are closed access, and for NASA articles this is down to 4%. Open Access is the cultural norm in astrophysics.
Heliophysics. As a star in the solar system the sun occupies its own niche, and heliophysics is a somewhat separate culture. Most heliophysics articles are OA, but compared with the rest of astrophysics less of this is due to arXiv, and more to AAS articles becoming OA after one year. The larger fraction of NASA versus non-NASA SoPh articles in arXiv is the only place in this study where NASA policy may have influenced the OA fraction; there are other possible explanations, however.
Planetary Science. There is a clear cultural difference between PS and Astro. A large majority of PS articles are not OA, and arXiv is rarely used. This is true both for planets (Icar + JGRE) and space physics (JGRA). NASA is involved with more than half of these articles. The AAS OA PS Journal is too new to be included here.
Earth Science. The ADS does not cover ES; many of the main journals are absent in our database. We choose the non-PS parts of JGR as indicators of the OA status of NASA ES articles; this may not be correct. The JGR ES papers show OA/non-OA/arXiv properties very similar to JGRA (Space Physics), indicating that among the AGU journals, and by extension, members, the OA culture is small (~25%), and predominantly gold, not green. Spot checks of other ES journals in ADS, for Climate and Atmosphere studies, support this conclusion, but with small (<4%) NASA content we do not include them here.
Elsevier. Elsevier is the leading publisher of PS research articles. In addition to Icar we look at seven journals which primarily cover planetary and space science, with some overlap in earth science (in particular E&PSL). The high similarity of these journals with the AGU journals, along with the lower fraction of NASA articles, suggests that the cultural differences are real and widespread.
Clearly there are cultural differences at play here. Astrophysics is the main outlier, for the historical reasons discussed above. Heliophysics mostly has followed the lead of astro. The study of solar system planets lies between astro and space physics or earth science, which are similar. The range of OA goes from 95+% to 20% and the green/gold OA distinction runs from almost all green to almost all gold.
There is very little or no effect from different publishers, funding sources, or nationalities. In particular, NASA funding has had little or no measurable influence on whether an article is OA, or not. The use of PubSpace does not change this; the 2717 NASA-related papers in ADS which are not OA would be reduced by less than 5%, were the full analysis to be made.
Recommendations
Moving scholarly publishing to OA means modifying research funding methods, and fundamentally changing the business models for a complex $20B/yr industry. We leave this task to others.
41% of non-OA NASA-related articles are published in AGU journals. With its recent creation of the ESSOAr preprint server the AGU has shown its strong support for the green OA model. The AGU members are obvious first targets in any effort by NASA to change hearts and minds (culture) toward OA.
We suggest that NASA partner with the AGU so that authors who acknowledge NASA support, or have NASA affiliations, upon acceptance of their manuscripts would be strongly advised by the journal to post their accepted manuscript in ESSOAr and thus fulfill the NASA mandate. The NSF, which also funds a large amount of AGU published research, might also be interested in participating in this.
It would then become necessary for the ADS to index the ESSOAr.
Besides the immediate effect of making more AGU articles OA, this could begin a cultural change, similar to how arXiv has long affected physics, and more recently mathematics and computer science.