R help and debugging

Christine Stawitz

ECS Federal in support of NOAA Fisheries

March 24, 2020

About me

  • Undergraduate degree in Systems Engineering, CS

  • Microsoft

  • PhD in Quantitative Ecology

  • Now: creating generalized tools for fisheries modeling at ECS Federal for NOAA

What CS training did not give me

  • More elegant code
  • Less buggy code

What CS training did give me

  • Constantly foiled by weak typing
  • Really good at Googling through documentation
  • Very reliant on debugging

Outline

  • Documentation types and uses
  • Diagnosing problems with inputs & environment
  • Debugging to find code problems

Documentation

Types

  • User guides, vignettes, tutorials
  • Examples
  • Function reference

User guides, tutorials, and vignettes

Best when starting to use a new package for the first time. Step-by-step reference with guidance on syntax, dependencies, and workflow.
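
A minimal sketch of finding a package's vignettes from the console. The grid package is used only because it ships with R; substitute the package you are learning.

```r
# List the vignettes bundled with a package (grid is just a stand-in)
available <- vignette(package = "grid")
vignette_names <- available$results[, "Item"]
print(vignette_names)

# In an interactive session, open one by name or browse them all:
# vignette(vignette_names[1], package = "grid")
# browseVignettes("grid")
```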

Examples

Best for understanding syntax and testing environment.

tibble(vec_col = 1:10) %>%
  mutate(vec_sum = sum(vec_col))
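
Every help page carries its own examples, and running them is a quick environment test. A sketch using mean() from base R as a stand-in for the function you care about:

```r
# Fetch the example code from the help page for mean() without running it
example_lines <- example("mean", package = "base",
                         character.only = TRUE, give.lines = TRUE)
cat(example_lines, sep = "\n")

# Run the examples and echo each call with its result
example("mean", package = "base", character.only = TRUE, ask = FALSE)
```

If a package's own examples fail, the problem is more likely the environment than your code.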

Function reference

  • a brief description of the function/class/dataset
  • a description of all of the function arguments, including what type of object (integer, string, etc.) each argument is, what the argument is named, acceptable values for the argument (if it must match a specific input), and the default value for the argument
  • a description of the object(s) returned from the function
  • an example detailing syntax of how the function is used
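
The argument names and defaults listed on a reference page can also be checked from the console. A small sketch, with sd() from the stats package chosen purely as an illustration:

```r
# Open the reference page in an interactive session:
# ?sd
# help("sd", package = "stats")

# Inspect argument names and default values programmatically
sd_args <- formals(stats::sd)
names(sd_args)   # "x" "na.rm"
sd_args$na.rm    # FALSE
```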

Finding errors

  • Environment errors
  • Input errors
  • Code errors & debugging

Environment errors

tibble(vec_col = 1:10) %>%
  mutate(vec_sum = sum(vec_col))
## Error in tibble(vec_col = 1:10) %>% mutate(vec_sum = sum(vec_col)): could not find function "%>%"

Environment errors

library(dplyr)  # library() stops with an error if dplyr is missing; require() only warns
tibble(vec_col = 1:10) %>%
  mutate(vec_sum = sum(vec_col))
## # A tibble: 10 x 2
##    vec_col vec_sum
##      <int>   <int>
##  1       1      55
##  2       2      55
##  3       3      55
##  4       4      55
##  5       5      55
##  6       6      55
##  7       7      55
##  8       8      55
##  9       9      55
## 10      10      55
  • Restart R
  • Use package examples
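
Before restarting, a few base R calls show what the session actually has loaded. This is one possible set of checks, not an exhaustive list:

```r
# Which packages are attached in this session?
attached <- search()
print(attached)

# Is the pipe visible from here? FALSE means nothing attached provides it.
pipe_available <- exists("%>%")

# Is dplyr installed at all? (loads the namespace without attaching it)
dplyr_installed <- requireNamespace("dplyr", quietly = TRUE)

# R and package versions, useful when asking for help
print(sessionInfo())
```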

Input errors

input_data <- list(c(1, 5, 7),
                   5,
                   c(10, 10, 11))
tibble(list_col = input_data) %>%
  mutate(list_sum = sum(list_col))
## Error in sum(list_col): invalid 'type' (list) of argument
str(input_data)
## List of 3
##  $ : num [1:3] 1 5 7
##  $ : num 5
##  $ : num [1:3] 10 10 11
  • Cross-reference function documentation
  • str() to see input structure
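
Once str() shows the input is a list, there are two plausible fixes, depending on what was intended. Both are sketched here in base R; which one is right is an assumption about the goal, not something the error tells you:

```r
input_data <- list(c(1, 5, 7),
                   5,
                   c(10, 10, 11))

# Option 1: one sum per list element
per_element <- vapply(input_data, sum, numeric(1))
per_element   # 13 5 31

# Option 2: one grand total after flattening the list
grand_total <- sum(unlist(input_data))
grand_total   # 49
```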

Debugging

RStudio::conf talk

Restart R - still useful!

foo <- c(1, 2, 3)

# Bug: the body reads the global input_data instead of the dat argument
sum_the_cols <- function(dat) {
  tibble(list_col = input_data) %>%
    mutate(list_sum = sum(list_col))
}

sum_the_cols(foo)
## Error in sum(list_col): invalid 'type' (list) of argument

Restart R - still useful!

rm(input_data)

sum_the_cols(foo)
## Error in eval_tidy(xs[[i]], unique_output): object 'input_data' not found
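
A function that quietly depends on a leftover global can also be spotted without restarting. The sketch below assumes the codetools package is available (it is a recommended package that usually ships with R); count_items() is a hypothetical base R stand-in for the function above.

```r
input_data <- list(c(1, 5, 7), 5)

# Same mistake in miniature: the body ignores its argument
count_items <- function(dat) {
  length(input_data)
}

# List the non-local variables the function depends on
globals <- codetools::findGlobals(count_items, merge = FALSE)$variables
globals   # includes "input_data"
```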

debugonce()

# Use it for other people's code
debugonce(mutate)
foo <- c(1, 2, 3)

# Use it for your own
sum_the_cols <- function(dat) {
  tibble(list_col = dat) %>%
    mutate(list_sum = sum(list_col))
}

debugonce(sum_the_cols)
sum_the_cols(foo)
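
debugonce() flags a function for a single call. Its sibling debug() keeps the flag until undebug() is called. A sketch with a hypothetical helper, sum_values(), plus the commands you can type at the debugger prompt:

```r
sum_values <- function(dat) {
  sum(dat)
}

debug(sum_values)                       # every call now opens the debugger
was_flagged <- isdebugged(sum_values)   # TRUE
undebug(sum_values)                     # turn it off again
still_flagged <- isdebugged(sum_values) # FALSE

# At the Browse[n]> prompt:
#   n  run the next line
#   s  step into the function call on the current line
#   c  continue to the end (or the next breakpoint)
#   Q  quit the debugger
```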

browser()

input_data <- list(c(1, 5, 7),
                   5,
                   c(10, 10, 11))
foo <- 1:10

sum_the_cols <- function(dat) {
  browser()

  tibble(list_col = input_data) %>%
    mutate(list_sum = sum(list_col))
}

sum_the_cols(input_data)
## Called from: sum_the_cols(input_data)
## debug at <text>#9: tibble(list_col = input_data) %>% mutate(list_sum = sum(list_col))
## Error in sum(list_col): invalid 'type' (list) of argument
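
browser() can also pause only when something looks wrong, through its expr argument. A minimal sketch, where sum_values() is a hypothetical helper and the is.list() condition is just one example of a check:

```r
sum_values <- function(dat) {
  # Pause here only when the input is a list; otherwise run straight through
  browser(expr = is.list(dat))
  sum(dat)
}

total <- sum_values(c(1, 2, 3))   # numeric input, so no pause
total                             # 6
```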

Preventing future bugs

foo <- c(1, 2, 3)

sum_the_cols <- function(dat) {
  # Fail early with a clear message instead of a cryptic error from sum()
  if (is.list(dat)) {
    stop("dat must not be a list: sum() doesn't work on list inputs!")
  }
  tibble(list_col = dat) %>%
    mutate(list_sum = sum(list_col))
}

sum_the_cols(input_data)
## Error in sum_the_cols(input_data): dat must not be a list: sum() doesn't work on list inputs!
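
For simple checks, stopifnot() is a shorter alternative to if plus stop(). A sketch with a hypothetical base R helper; tryCatch() is used here only to capture the message so the script keeps running:

```r
sum_values <- function(dat) {
  stopifnot(is.numeric(dat))   # error unless dat is a numeric vector
  sum(dat)
}

good_total <- sum_values(c(1, 2, 3))   # 6

# Capture the error message produced by a list input
message_seen <- tryCatch(
  sum_values(list(1, 2)),
  error = function(e) conditionMessage(e)
)
message_seen   # "is.numeric(dat) is not TRUE"
```

The trade-off: stopifnot() is terse, while stop() lets you write a message aimed at the user.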

Summary

  • Use user guides & vignettes before you start coding
  • Start with simple checks: rule out environment errors by restarting R and running package examples
  • Next, check your inputs against the function reference
  • Debug!
  • If it’s someone else’s package, create a reprex
  • If it’s your own code, use stop() to catch the problem early next time!
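
A reprex is a small, self-contained script that reproduces the problem on someone else's machine. The sketch below shows the ingredients by hand: tiny inline input, the failing call, and the R version. The reprex package's reprex() function can render such a script with its output for pasting into an issue; that step is left out here.

```r
# 1. Smallest input that still triggers the problem
input_data <- list(c(1, 5, 7), 5)

# 2. The failing call, with the error message captured so the script finishes
outcome <- tryCatch(
  sum(input_data),
  error = function(e) conditionMessage(e)
)
print(outcome)

# 3. Version information the maintainer will ask for
r_version <- R.version.string
print(r_version)
```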

Thanks!

Source for example code