Abstract
The relationship between access to nature and public health outcomes has been well-studied and established in the literature. However, most studies use simple greenness indices as a proxy for access to nature, which ignores the “quality” of the nature since greenness indices are not able to predict biodiversity. My objective was to investigate the relationship between citizen scientist collected biodiversity data from the eBird platform, urban greenness and four human health outcomes (asthma, coronary heart disease, and self-assessed mental and physical health). I mapped and tested for correlations among eBird record species richness, greenness as NDVI and PLACES human health data in urban census tracts located in three metro areas/ecological zones (Albany, NY: eastern deciduous forest, Kansas City, MO: tallgrass prairie, and Phoenix, AZ: Sonoran Desert). eBird species richness was related to greenness, measures of urbanization and several human health factors; however, the correlations varied by metro area and in strength. Provided confounders are controlled for, eBird data could help to refine models surrounding relationships between public health and nature access.
It is established in the literature that proximity and access to “nature” and green space is related to subjective wellness and health outcomes (Jennings et al. 2016, Mavoa et al. 2019). Greenness indices are often used to evaluate the availability of natural areas in urban landscapes, and while NDVI is often a good estimator of green spaces, in some situations it may be over-simplification to suggest that the health benefits of nature areas are simply a function of local greenness (Markevych et al. 2017, Reid et al. 2018, Jarvis et al. 2020). There is also evidence that exposure to biodiversity, that is, the “quality” of the green space and the variety of habitats it offers, influences health outcomes (Fuller et al. 2007, Hanski et al. 2012, Shanahan et al. 2015, Mills et al. 2020). It is often assumed that greenness and biodiversity are synonymous and yet a functional crop field can be low in biodiversity but have a high NDVI. While there have been many attempts to model diversity based on vegetative reflectance, effectiveness and portability of these methods vary (Wang and Gamon 2019, Leveau et al. 2020), and ground truthing biodiversity is costly. Investigation of the mechanisms behind the relationship between greenness and health outcomes continues (Markevych et al. 2017); this study proposes to further probe this question using citizen gathered data from the eBird platform (Cornell Lab of Ornithology 2020), particularly in urban areas.
With rapid increases in data and data-driven decisions, citizen science has begun to play a larger role in scientific investigations, despite its inherent flaws. eBird, which was initiated in 2002, is a well-established, crowdsourced, georeferenced database to which more than 600,000 citizens and scientists submit bird sightings. The quantity of eBird data collected by citizen birders may be useful in further elucidating relationships among greenness, biodiversity, and human health. Urban development tends to homogenize bird communities, meaning that species richness and diversity decrease at high urban densities (Melles 2005, Tratalos et al. 2007), and trends in bird diversity and density may indicate biodiversity status in general. I may be able to use eBird contributor record species richness to provide a proxy measure of biodiversity and relate this to greenness and human health.
My objectives were to: 1) Map eBird records of bird species and determine bird species richness at the census tract level, and 2) Relate eBird record species richness to census tract level NDVI, development intensity and local human health outcome measures to elucidate relationships.
Methods
I mapped eBird, greenness as NDVI and CDC PLACES human health data in urban census tracts located in three metro areas (Albany, NY: eastern deciduous forest, Kansas City, MO: tallgrass prairie, and Phoenix, AZ: Sonoran Desert). I purposely located the areas in three distinct ecological zones to capture variation in response due to bird species distributions and climate and to avoid potentially faulty conclusions that can occur when using a single site (Fuertes et al. 2014). Temporally, I limited my analyses to only 2018 eBird sightings and NDVI data; I chose 2018 because the population density (Manson et al. 2020) and CDC public health PLACES data are dated to 2018.
Data processing/analysis
I downloaded census, public health, satellite reflectance, eBird and land use data (Table 1). For each metro area, I selected urban tracts from census data (Manson et al. 2020) based on calculated population density > 1500 people/square mile and at least 40% developed area based on USGS-NLCD 2011 land cover rasters (2014). I calculated the number of unique bird species sighted and mean NDVI values for each tract and spatially joined these and the PLACES health data to the urban tract polygons (USGS 2018, Centers for Disease Control et al. 2020). I calculated bivariate Spearman’s correlation coefficients to determine associations among eBird record species richness, NDVI, and public health outcomes,. I focused on four health outcomes (asthma, coronary heart disease and self-rated mental and physical health based on prior work (Hanski et al. 2012, James et al. 2015) and the self-rating aspect of the general mental health and physical health fields. I also calculated and visualized relationships among eBird species richness, NDVI and population density using the same methods. I used ArcGIS 10.6.1 and R for all data analysis and visualization (Buckley 2015, Esri 2017, R. Core Team 2020).
Results
While eBird richness correlated with several public health outcomes and with NDVI, it did not, overall, appear to be a proxy for greenness. In fact, neither richness nor greenness related to the four public health outcomes to a degree that would justify their use as predictors due to wide scatter and apparent (unmeasured) differences among metro areas. When combining census tract data across all three metro areas combined, eBird record species richness has a weak positive correlation with NDVI (Rs = 0.25, Table 2, Fig. 1) and is weakly negatively correlated with human population density and the aerial proportion developed (Rs = -0.44, -0.30, respectively, Table 2, Fig. 1). When aggregated across metro areas, there are weak negative relationships between species richness and current adult asthma, self-assessed poor mental health, and self-assessed poor physical health (Rs = -0.38, -0.43, -.38 respectively, Table 2) whereas NDVI is less correlated with these three health outcomes. However, relationships among the variables differed depending on the metro area (Tables 3, 4, 5, Figs. 2-4).
While NDVI was positively correlated with species richness in all three metro areas, the correlation was weaker in Kansas City (Rs = 0.16, Table 4, Fig. S1, S2) than in Phoenix (Rs = 0.33, Table 3, Fig. S3, S4) and Albany (Rs = 0.43, Table 5, Fig. S5, S6). Species richness decreased with increasing population density and development in all three metro areas, but the relationship was strongest in the Albany, NY area (Tables 3, 4, 5, Fig. 2, S1, S3, S5). Associations between species richness and the four health outcomes also differed among the metro areas. Species richness was more strongly correlated with asthma (Rs = -0.43), mental health (Rs = -0.45) and physical health (Rs = -0.40) than NDVI was (Rs = -0.06 ns, Rs = -0.15, Rs = -0.05 ns, respectively) in Kansas City (Table 4, Fig. 3, 4, S2); however, the NDVI was more correlated with those health outcomes than species richness in Phoenix (Table 3, Fig. 3, 4, S4) and the Albany area, NY (Table 5, Fig. 3, 4, S6). Chronic heart disease did not correlate with or is only very weakly correlated with any of the urbanization, greenness or eBird variables, regardless of metro area (Tables 2, 3, 4, 5).
Discussion
My objective was to determine whether eBird data might be an improved estimator of nature access over simple greenness indices. While eBird species richness was related to greenness, measures of urbanization and several human health factors, the correlations varied by metro area and were not always strong. eBird data was a clearer correlate with health factors than NDVI in one metro area (Kansas City) but was not an improvement over NDVI in for Phoenix or the Albany area of NY. However, the spatial relationship between asthma and mental health and bird species richness was striking. Both asthma and poor mental health increased as bird diversity and NDVI decreased such that city centers and areas of high urbanization are readily apparent on bivariate choropleth maps (Fig. 3, 4). Despite the weakness of some relationships, eBird and other volunteer gathered data could be useful covariates when modeling access to nature and biodiversity and implications for public health, particularly if some limiting factors were addressed. One should consider the ecological and cultural situation of the population center, biases associated with volunteer collected data, potential confounders including socio-economic status, and limitations on data precision.
Differences in relationships depending on the metro area and location should not be discounted-blanket statements regarding eBird data or greenness and public health based on single location studies should be avoided (Fuertes et al. 2014). For example, greenness in the Phoenix area is concentrated in ephemeral waterways and in heavily populated and irrigated areas whereas greenness has a completely different distribution in upstate New York-a much more humid climate. Bird population distributions and whether the metro area is part of a migratory fly-zone should also be considered. Further, cultural differences within and across cities may influence birding and contribution activities and public values placed on parks and other open spaces.
Beyond considering the location of the investigation, controlling for other variables that may confound the relationships among NDVI, eBird record species richness and public health outcomes might make eBird and other volunteer gathered biodiversity data more useful. Citizen science data are vulnerable to bias since observation records depend on the knowledge, geographic location, biases, and effort of the contributor. Data such as these are typically concentrated around population centers since most citizen scientists are associated with urban, suburban and exurban areas (Zhang 2020). I attempted to control for this bias by limiting my investigation to only ‘urban’ census tracts with population densities > 1500 people/square mile and at least 40% development. However, variables such as effort and knowledge of the birder were not controlled for in this study, which may have biased the results. Socioeconomic factors should also be controlled for when attempting to model the relationships between nature access and human health. Often, green space, healthy food access and access to medical services are all limited in neighborhoods with lower socioeconomic status, implying that while green space may be a correlate to poor health, the relationship may not be causal (Jarvis et al. 2020). I did not control for SES in this study; however, the PLACES data does include an estimate of health insurance access which may serve as a proxy for SES (Centers for Disease Control et al. 2020) that one could investigate in the future. Other factors also interact with health outcomes; while asthma and allergy prevalence has been related to the proximity of green space (James et al. 2015, Shanahan et al. 2015, Sbihi et al. 2019), asthma may also be related to air pollution, which tends to be higher in urbanized areas.
Limitations in the spatial and numerical precision of the data may also have influenced the results of this investigation. A census tract can have a large or small spatial footprint depending on population density. When one aggregates fine scale data such as NDVI rasters or eBird points to larger scale polygons such as census tract, data, details, and nuance are lost or compressed with loss. In addition, human health data provided in the PLACES data is modeled and interpolated, further limiting precision and impinging relationship discernment. Therefore, the spatial coarseness of the public health data and confounders associated with eBird sightings may limit their utility as a predictor of public health measures. Finer scale public health data would alleviate this limitation. In the interest of expediency, I limited my dataset to greenness and eBird data from only one year. An investigation across years would be informative and might enhance the value of eBird and other volunteer collected data toward understanding relationships between access to nature and public health outcomes.
Despite the limitations and potential confounders, eBird data does appear to have some value in understanding the relationships among proximity to greenness, biodiversity, and human health measures.
Conclusion
Greenness indices are often used as a proxy for access to nature when studying the relationship between nature and public health; however, greenness indices do not provide measure of the “quality” of the proximate nature experience. One possible remedy for this oversight is the use of citizen science biodiversity data, such as eBird records, to relate greenness and species richness-a measure of habitat quality. In this study, I found that eBird record richness was correlated with measures of greenness, urban development, and public health outcome estimates in all three metro areas investigated; however, the variability of the data and potential confounders do not recommend eBird data as an improved estimator of nature access above and beyond simple greenness indices. However, eBird and other volunteer collected biodiversity data could inform models as a covariate.
Data Availability
All data are publicly available via the CDC website, USGS, US Census and Cornell lab of Ornithology (upon request)
https://ebird.org/science/use-ebird-data/download-ebird-data-products