Graphing Seasonality in eBird Bird Sightings

R
eBird
Author

Conor Tompkins

Published

March 31, 2020

Over the winter I became interested in birding. Sitting in your backyard doing nothing but watching birds fly around is quite relaxing. Naturally, I am looking for ways to optimize and quantify this relaxing activity. eBird lets you track your bird sightings and research which birds are common or rare in your area. Luckily, the folks at rOpenSci have the {rebird} package, which provides an easy interface to the eBird API.

In this post I will graph the seasonality of observation frequency for the 10 most frequently observed birds in Pennsylvania. Frequency in this context is the percentage of eBird checklists that include the bird during a given period.
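To make the definition of frequency concrete, here is a minimal sketch on made-up data. The `checklists` table and its column names are hypothetical and do not reflect eBird's actual data format; the sketch only illustrates that frequency is the share of checklists in a period that report a species.

```r
library(dplyr)

# Hypothetical toy data: one row per species reported on a checklist
checklists <- tibble(
  checklist_id = c(1, 1, 2, 3, 3, 4),
  period       = "January-1",
  species      = c("Blue Jay", "Northern Cardinal", "Blue Jay",
                   "Northern Cardinal", "Blue Jay", "American Crow")
)

# Number of distinct checklists submitted in each period
n_checklists <- checklists %>%
  group_by(period) %>%
  summarize(n_total = n_distinct(checklist_id))

# Share of checklists in each period that include each species
toy_freq <- checklists %>%
  distinct(period, species, checklist_id) %>%
  count(period, species, name = "n_with_species") %>%
  left_join(n_checklists, by = "period") %>%
  mutate(frequency = n_with_species / n_total)

toy_freq
```

In this toy example Blue Jay appears on 3 of the 4 checklists, so its frequency for the period is 0.75.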

Load up packages:

library(tidyverse)
library(lubridate)
library(vroom)
library(janitor)
library(rebird)
library(hrbrthemes)
library(ggrepel)
library(gganimate)

theme_set(theme_ipsum())

options(scipen = 99, digits = 2)

The ebirdfreq function takes a location and time period and returns the frequency and sample size for the birds returned in the query.

df_freq_raw <- ebirdfreq(loctype = 'states', loc = 'US-PA', startyear = 2019,
                         endyear = 2019, startmonth = 1, endmonth = 12)
df_freq_raw
# A tibble: 22,272 × 3
   com_name                     month_qt   frequency
   <chr>                        <chr>          <dbl>
 1 Black-bellied Whistling-Duck January-1          0
 2 Black-bellied Whistling-Duck January-2          0
 3 Black-bellied Whistling-Duck January-3          0
 4 Black-bellied Whistling-Duck January-4          0
 5 Black-bellied Whistling-Duck February-1         0
 6 Black-bellied Whistling-Duck February-2         0
 7 Black-bellied Whistling-Duck February-3         0
 8 Black-bellied Whistling-Duck February-4         0
 9 Black-bellied Whistling-Duck March-1            0
10 Black-bellied Whistling-Duck March-2            0
# ℹ 22,262 more rows

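The next step does some light data munging: it splits month_qt into separate month and week columns, converts the month to an ordered factor, and adds a state column.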

df_freq_clean <- df_freq_raw %>% 
  separate(month_qt, into = c("month", "week")) %>% 
  mutate(week = as.numeric(week),
         month = ymd(str_c("2019", month, "01", sep = "-")),
         month = month(month, label = TRUE, abbr = TRUE),
         state = "PA") %>% 
  rename(common_name = com_name) %>% 
  arrange(common_name, month, week)

df_freq_clean
# A tibble: 22,272 × 5
   common_name        month  week frequency state
   <chr>              <ord> <dbl>     <dbl> <chr>
 1 Acadian Flycatcher Jan       1         0 PA   
 2 Acadian Flycatcher Jan       2         0 PA   
 3 Acadian Flycatcher Jan       3         0 PA   
 4 Acadian Flycatcher Jan       4         0 PA   
 5 Acadian Flycatcher Feb       1         0 PA   
 6 Acadian Flycatcher Feb       2         0 PA   
 7 Acadian Flycatcher Feb       3         0 PA   
 8 Acadian Flycatcher Feb       4         0 PA   
 9 Acadian Flycatcher Mar       1         0 PA   
10 Acadian Flycatcher Mar       2         0 PA   
# ℹ 22,262 more rows
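As an illustration of the parsing step, here is a small sketch on two made-up `month_qt` values. It is an alternative to the `ymd()`/`month()` route above, not a replacement for it: it uses base R's built-in `month.name` and `month.abb` vectors, so it needs no placeholder year, and it assumes the month names are in English as in the output shown.

```r
library(dplyr)
library(tidyr)

# Two made-up values in the same "Month-week" shape as month_qt
toy <- tibble(month_qt = c("January-1", "December-4"))

toy_clean <- toy %>%
  # Split "January-1" into "January" and "1" at the hyphen
  separate(month_qt, into = c("month", "week"), sep = "-") %>%
  mutate(
    week = as.numeric(week),
    # Look up each full month name and store its abbreviation as an
    # ordered factor, so Jan < Feb < ... < Dec when sorting and plotting
    month = factor(month.abb[match(month, month.name)],
                   levels = month.abb, ordered = TRUE)
  )

toy_clean
```

Storing the month as an ordered factor matters later: it is what keeps the months in calendar order on the x-axis rather than alphabetical order.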

This takes the month-week time series and summarizes to the month level:

df_month <- df_freq_clean %>% 
  group_by(common_name, month) %>% 
  summarize(frequency_mean = mean(frequency) %>% round(2)) %>%
  ungroup()

df_month
# A tibble: 5,568 × 3
   common_name        month frequency_mean
   <chr>              <ord>          <dbl>
 1 Acadian Flycatcher Jan             0   
 2 Acadian Flycatcher Feb             0   
 3 Acadian Flycatcher Mar             0   
 4 Acadian Flycatcher Apr             0   
 5 Acadian Flycatcher May             0.04
 6 Acadian Flycatcher Jun             0.06
 7 Acadian Flycatcher Jul             0.05
 8 Acadian Flycatcher Aug             0.02
 9 Acadian Flycatcher Sep             0   
10 Acadian Flycatcher Oct             0   
# ℹ 5,558 more rows
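The same summary pattern can be sketched on a toy table. The weekly values below are invented for illustration and are not the real eBird figures. The sketch also uses `.groups = "drop"`, an option in dplyr 1.0.0 and later that returns an ungrouped result without a separate `ungroup()` call.

```r
library(dplyr)

# Made-up weekly frequencies for one species over two months
toy_weekly <- tibble(
  common_name = "Acadian Flycatcher",
  month       = rep(c("May", "Jun"), each = 4),
  frequency   = c(0.01, 0.03, 0.05, 0.07, 0.06, 0.06, 0.05, 0.07)
)

# Average the four weekly values within each species-month pair
toy_month <- toy_weekly %>%
  group_by(common_name, month) %>%
  summarize(frequency_mean = round(mean(frequency), 2), .groups = "drop")

toy_month
```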

Here I find the top 10 birds in terms of average observation frequency across all of the month-week periods in the year:

df_top_birds <- df_freq_clean %>% 
  group_by(common_name) %>% 
  summarize(frequency_mean = mean(frequency) %>% round(2)) %>% 
  ungroup() %>% 
  arrange(desc(frequency_mean)) %>% 
  select(common_name) %>% 
  slice(1:10)

df_top_birds
# A tibble: 10 × 1
   common_name           
   <chr>                 
 1 Northern Cardinal     
 2 Blue Jay              
 3 Mourning Dove         
 4 American Robin        
 5 Song Sparrow          
 6 American Crow         
 7 Red-bellied Woodpecker
 8 American Goldfinch    
 9 Carolina Wren         
10 Downy Woodpecker      
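For comparison, a ranking like this can also be written with `slice_max()` from dplyr 1.0.0 and later. The sketch below uses made-up species and values. It ranks on the unrounded mean, which is a design choice I am assuming here rather than what the code above does; rounding before ranking can create ties that are then broken by row order.

```r
library(dplyr)

# Made-up frequencies for three hypothetical species
toy <- tibble(
  common_name = c("A", "A", "B", "B", "C", "C"),
  frequency   = c(0.5, 0.7, 0.1, 0.3, 0.4, 0.4)
)

toy_top <- toy %>%
  group_by(common_name) %>%
  summarize(frequency_mean = mean(frequency)) %>%
  # Keep the 2 species with the highest mean; with_ties = FALSE
  # guarantees exactly n rows even if two means are equal
  slice_max(frequency_mean, n = 2, with_ties = FALSE) %>%
  select(common_name)

toy_top
```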

This basic line graph shows some of the seasonal pattern, but it fails to show the cyclical nature of the data: the line stops at December instead of wrapping around to January.

df_month %>% 
  semi_join(df_top_birds, by = "common_name") %>% 
  ggplot(aes(month, frequency_mean, group = common_name)) +
    geom_line() +
    scale_y_percent() +
    labs(title = "Bird observation frequency",
         subtitle = "Top 10 birds in PA, 2019",
         x = NULL,
         y = "Mean frequency",
         caption = "Data from ebird.org. @conorotompkins")

I use coord_polar to change the coordinate system to match the cyclical flow of the months:

df_month %>% 
  semi_join(df_top_birds, by = "common_name") %>% 
  ggplot(aes(month, frequency_mean, group = common_name)) +
    # linewidth replaces size for lines in ggplot2 3.4.0 and later
    geom_polygon(color = "black", fill = NA, linewidth = .5) +
    coord_polar() +
    scale_y_percent() +
    labs(title = "Bird observation frequency",
         subtitle = "Top 10 birds in PA, 2019",
         x = NULL,
         y = "Mean frequency",
         caption = "Data from ebird.org. @conorotompkins")
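The choice of `geom_polygon()` over `geom_line()` matters in polar coordinates. A minimal sketch with invented monthly values for a single hypothetical species shows the idea:

```r
library(ggplot2)

# Invented monthly values for one hypothetical species
toy <- data.frame(
  month = factor(month.abb, levels = month.abb),
  frequency_mean = c(0.30, 0.28, 0.25, 0.20, 0.15, 0.12,
                     0.12, 0.14, 0.18, 0.22, 0.27, 0.30)
)

p_polar <- ggplot(toy, aes(month, frequency_mean, group = 1)) +
  # geom_polygon() joins the last point (Dec) back to the first (Jan),
  # whereas geom_line() would leave the ring open between Dec and Jan
  geom_polygon(color = "black", fill = NA) +
  coord_polar()

p_polar
```

Swapping `geom_polygon()` for `geom_line()` in this sketch is a quick way to see the gap between December and January.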

gganimate lets me focus on one species at a time while showing all the data.

plot_animated <- df_month %>% 
  semi_join(df_top_birds, by = "common_name") %>% 
  mutate(common_name = fct_inorder(common_name)) %>% 
  ggplot(aes(month, frequency_mean)) +
  # Background layer: every species in grey. The species column is renamed so
  # that transition_manual(common_name) does not split this layer into frames.
  geom_polygon(data = df_month %>% rename(name = common_name),
               aes(group = name),
               color = "grey", fill = NA, linewidth = .5) +
  # Foreground layer: the species shown in the current frame
  geom_polygon(aes(group = common_name),
               color = "blue", fill = NA, linewidth = 1.2) +
  coord_polar() +
  #facet_wrap(~common_name) +
  scale_y_percent() +
  labs(subtitle = "Most frequently observed birds in PA (2019)",
       x = NULL,
       y = "Frequency of observation",
       caption = "Data from ebird.org. @conorotompkins") +
  theme(plot.margin = margin(2, 2, 2, 2),
        plot.title = element_text(color = "blue"))

plot_animated +
  transition_manual(common_name) +
  ggtitle("{current_frame}")
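The two-layer pattern can be sketched on a tiny made-up data set. The values and the `toy_*` names are hypothetical. As I understand gganimate's behavior, a layer whose data lacks the column passed to `transition_manual()` is drawn in every frame, which is why the background copy renames the species column.

```r
library(ggplot2)
library(gganimate)

# Made-up values for two species over three months
toy <- data.frame(
  common_name = rep(c("Blue Jay", "American Robin"), each = 3),
  month = factor(rep(c("Jan", "Feb", "Mar"), times = 2),
                 levels = c("Jan", "Feb", "Mar")),
  frequency_mean = c(0.30, 0.28, 0.25, 0.20, 0.25, 0.35)
)

# Background copy: same rows, but the species column has a different name,
# so transition_manual(common_name) has nothing to split this layer on
toy_background <- toy
names(toy_background)[names(toy_background) == "common_name"] <- "name"

toy_anim <- ggplot(toy, aes(month, frequency_mean)) +
  geom_line(data = toy_background, aes(group = name), color = "grey") +
  geom_line(aes(group = common_name), color = "blue") +
  transition_manual(common_name) +
  ggtitle("{current_frame}")
```

To render and save the result, `animate(toy_anim, nframes = 2, fps = 1)` followed by `anim_save("toy.gif")` should work, assuming a renderer such as the gifski package is installed; the file name is only an example.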