Documenting Code

Brendan Casey edited this page Sep 20, 2025 · 2 revisions

Proper documentation is crucial for ensuring code reproducibility. It allows colleagues and reviewers to fully understand your rationale, methods, and outputs. While there are many ways to document code, here are some recommendations:

Code header

Scripts should start with a header that includes the following components:

| Component | Description |
| --- | --- |
| Title | A brief and descriptive name for the script that summarizes its purpose. |
| Author | The name of the person or team responsible for writing the code. |
| Date | The creation or last modification date of the code, formatted as YYYY-MM-DD for consistency. |
| Inputs | A list of the input data, files, or parameters required for the script to run, including file paths or formats. |
| Outputs | A description of the output produced by the script, including file names, formats, and what the results represent. |
| Notes | A concise explanation of what the code does, its purpose, and any important details about its function. You can also use this section to list proposed improvements for the code for future iterations. |

Example header in YAML-like format:

# ---
# title: "Title"
# author: "Your Name"
# created: "YYYY-MM-DD"
# inputs: [list the required input files]
# outputs: [list the output files produced by the script]
# notes: 
#   "This script performs [describe the main purpose of the script].
#   The script uses [briefly describe data or object inputs] to 
#   [briefly describe the main steps or processes]. The script
#   produces [describe the final output]."
# ---
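A header like the one above could also be generated from R rather than typed by hand. The sketch below is one possible approach and is not part of the r_module.R template: `make_header()` and its arguments are hypothetical names introduced here, and the title and file paths in the usage example are placeholders.

```r
# Hypothetical helper (not part of the r_module.R template): builds the
# commented header as a character vector, one element per line.
make_header <- function(title, author, inputs, outputs, notes,
                        created = format(Sys.Date(), "%Y-%m-%d")) {
  # Quote the notes and wrap them so each commented line stays short
  wrapped_notes <- strwrap(paste0("\"", notes, "\""), width = 60)
  c(
    "# ---",
    sprintf("# title: \"%s\"", title),
    sprintf("# author: \"%s\"", author),
    sprintf("# created: \"%s\"", created),
    sprintf("# inputs: [%s]", paste(inputs, collapse = ", ")),
    sprintf("# outputs: [%s]", paste(outputs, collapse = ", ")),
    "# notes:",
    paste0("#   ", wrapped_notes),
    "# ---"
  )
}

# Placeholder values for illustration only
header <- make_header(
  title   = "Clean survey data",
  author  = "Your Name",
  inputs  = "data/raw/survey.csv",
  outputs = "data/clean/survey_clean.csv",
  notes   = paste(
    "This script removes incomplete records from the raw survey",
    "data and writes a cleaned copy."
  ),
  created = "2025-01-31"
)
writeLines(header)
```

The returned lines can be pasted at the top of a new script or written to a file with `writeLines(header, "my_script.R")`.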

Code body

The body of your script should be divided into clear, well-labeled, numbered sections for easy navigation:

Setup

Start with a setup section. This section will usually include the following subsections:

  • Load packages: List all the required R packages, each accompanied by a comment explaining its use and the package version.
## 1.1 Load packages ----
library(tidyverse)   # data manipulation and visualization (version: 1.3.1)
library(lubridate)   # date-time manipulation (version: 1.7.10)
  • Import data: Describe the data and objects that are being loaded.
## 1.2 Import data ----
# Description of data
# data <- read.csv("path/to/your/data.csv")
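Version numbers written into comments by hand can drift out of date. As a minimal sketch, they can be looked up from the installed packages with `utils::packageVersion()`. The helper name `library_lines()` is hypothetical, and the example uses the base packages `stats` and `utils` only so that it runs on any R installation.

```r
# Hypothetical helper: builds `library()` lines annotated with the
# currently installed version of each package.
library_lines <- function(packages, purposes) {
  stopifnot(length(packages) == length(purposes))
  versions <- vapply(
    packages,
    function(pkg) as.character(utils::packageVersion(pkg)),
    character(1),
    USE.NAMES = FALSE
  )
  sprintf("library(%s)   # %s (version: %s)", packages, purposes, versions)
}

pkg_lines <- library_lines(
  packages = c("stats", "utils"),
  purposes = c("statistical functions", "utility functions")
)
writeLines(pkg_lines)
```

Calling `sessionInfo()` at the end of a script is another way to record the versions of all attached packages alongside the output.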

Section headings

Each subsequent section of the script should include a descriptive numbered heading. Beneath the heading, include a comment describing the section's purpose. This makes it easier for others to understand the logical flow of the script. For example:

# 2. Data Cleaning ---- 
# This section handles the preprocessing and cleaning of the input 
# data. It removes missing values, filters unnecessary rows, and 
# transforms variables.

Subdivide sections as needed with descriptive numbered subheadings:

## 2.1 Filter data ----
# This step filters the data to keep only relevant observations.
filtered_data <- data %>% filter(variable == "value")

End each heading with at least four trailing dashes (-), equal signs (=), or pound signs (#). RStudio treats such comment lines as code sections, which makes them foldable and navigable from the Jump To menu at the bottom of the editor.
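To check which lines of a script would act as section headings, the trailing-marker rule can be approximated with a regular expression. This is a rough sketch that assumes a heading is a comment line ending in at least four dashes, equal signs, or pound signs; RStudio's own parser may differ in edge cases, and `list_sections()` is a hypothetical name.

```r
# Hypothetical helper: returns the heading text of every comment line
# that ends in four or more dashes, equal signs, or pound signs.
list_sections <- function(script_lines) {
  marker <- "(-{4,}|={4,}|#{4,})"
  is_heading <- grepl(
    paste0("^[[:space:]]*#.*", marker, "[[:space:]]*$"),
    script_lines
  )
  headings <- script_lines[is_heading]
  # Strip the leading comment characters, then the trailing marker run
  headings <- sub("^[[:space:]]*#+[[:space:]]*", "", headings)
  sub(paste0("[[:space:]]*", marker, "[[:space:]]*$"), "", headings)
}

example_script <- c(
  "# 2. Data Cleaning ----",
  "# This section handles the preprocessing of the input data.",
  "## 2.1 Filter data ----",
  "filtered_data <- subset(data, variable == \"value\")"
)
list_sections(example_script)
# returns c("2. Data Cleaning", "2.1 Filter data")
```

Running it on `readLines("my_script.R")` gives a quick outline of a script outside RStudio.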

Create new R files using a template

You can create new R files based on the r_module.R template by adding a function to your .Rprofile file.

  1. Run the following command in your R console to open the global .Rprofile for editing:
file.edit("~/.Rprofile")
  2. Add the following function and save:
#' Create a New R Module Based on a GitHub Template
#'
#' This function creates a new R script using a template stored on 
#' GitHub. The template is downloaded directly from the provided 
#' URL, ensuring the latest version is used.
#'
#' @param filename Character. The name of the new R script file to 
#' be created. Defaults to "new_script.R".
#' @return Opens the newly created R script in the editor.
#' 
#' @examples
#' # Example usage of the function
#' new_r_module("my_new_script.R")
#' # This will create a new R script named "my_new_script.R" using 
#' # the GitHub template.
#' 
new_r_module <- function(filename = "new_script.R") {
  # Step 1: Define the GitHub URL for the template
  github_template_url <- "https://raw.githubusercontent.com/bgcasey/code_standards/main/templates/r_module.R"
  
  # Step 2: Download the template from GitHub to a temporary file
  temp_template <- tempfile(fileext = ".R")
  tryCatch(
    {
      download.file(github_template_url, temp_template, quiet = TRUE)
      message("Template downloaded successfully!")
    },
    error = function(e) {
      stop("Failed to download template: ", e$message)
    }
  )
  
  # Step 3: Copy the template to the specified filename
  # file.copy() returns FALSE on failure instead of raising an error,
  # so check its result explicitly
  copied <- file.copy(temp_template, filename, overwrite = TRUE)
  if (!copied) {
    stop("Failed to create the new R module: ", filename)
  }
  message("New R module created: ", filename)
  
  # Step 4: Open the new file in the editor
  file.edit(filename)
  
  # Step 5: Return the file name invisibly
  return(invisible(filename))
}
  3. Restart R.

  4. Now you can use the function to create new R files.

new_r_module("my_new_script.R")

This will create a new R script named "my_new_script.R" using the r_module.R template.
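Because `new_r_module()` copies with `overwrite = TRUE`, calling it with the name of an existing script replaces that file. If that is a concern, a guard can be added before the copy. The sketch below is one possible variant, not the wiki's function: `new_file_from_template()` is a hypothetical name, and it copies from a local template path rather than downloading from GitHub so that it can run offline.

```r
# Hypothetical variant: copy a local template, but refuse to overwrite
# an existing file.
new_file_from_template <- function(template, filename) {
  if (!file.exists(template)) {
    stop("Template not found: ", template)
  }
  if (file.exists(filename)) {
    stop("File already exists, not overwriting: ", filename)
  }
  # file.copy() returns FALSE on failure rather than raising an error
  if (!file.copy(template, filename)) {
    stop("Failed to create: ", filename)
  }
  invisible(filename)
}

# Demonstration with temporary files standing in for a real template
template <- tempfile(fileext = ".R")
writeLines(c("# ---", "# title: \"Title\"", "# ---"), template)

new_script <- tempfile(fileext = ".R")
new_file_from_template(template, new_script)

# A second call with the same name stops instead of overwriting
second_try <- tryCatch(
  new_file_from_template(template, new_script),
  error = function(e) conditionMessage(e)
)
```

The same `file.exists()` check could be placed at the top of `new_r_module()` if overwriting should never happen silently.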