Documenting Code
Proper documentation is crucial for code reproducibility. It allows colleagues and reviewers to fully understand your rationale, methods, and outputs. While there are many ways to document code, here are some recommendations:
Scripts should start with a header that includes the following components:
| Component | Description |
|---|---|
| Title | A brief and descriptive name for the script that summarizes its purpose. |
| Author | The name of the person or team responsible for writing the code. |
| Date | The creation or last modification date of the code, formatted as YYYY-MM-DD for consistency. |
| Inputs | A list of the input data, files, or parameters required for the script to run, including file paths or formats. |
| Outputs | A description of the output produced by the script, including file names, formats, and what the results represent. |
| Notes | A concise explanation of what the code does, its purpose, and any important details about its function. You can also use this section to list proposed improvements for the code for future iterations. |
Example header in YAML-like format:
# ---
# title: "Title"
# author: "Your Name"
# created: "YYYY-MM-DD"
# inputs: [list the required input files]
# outputs: [list the output files produced by the script]
# notes:
# "This script performs [describe the main purpose of the script].
# The script uses [briefly describe data or object inputs] to
# [briefly describe the main steps or processes]. The script
# produces [describe the final output]."
# ---
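The header can also be generated rather than typed by hand. The following is a minimal sketch, assuming a hypothetical helper named `make_header()` that is not part of these guidelines; the file paths and title in the usage lines are placeholders for illustration only.

```r
# Hypothetical helper (an assumption, not part of the guidelines):
# builds the commented header described above as a character vector.
make_header <- function(title, author, inputs, outputs, notes,
                        created = format(Sys.Date(), "%Y-%m-%d")) {
  # Quote the notes and wrap them so each comment line stays short
  wrapped_notes <- strwrap(paste0('"', notes, '"'), width = 65)
  c(
    "# ---",
    sprintf('# title: "%s"', title),
    sprintf('# author: "%s"', author),
    sprintf('# created: "%s"', created),
    sprintf("# inputs: [%s]", paste(inputs, collapse = ", ")),
    sprintf("# outputs: [%s]", paste(outputs, collapse = ", ")),
    "# notes:",
    paste0("# ", wrapped_notes),
    "# ---"
  )
}

# Placeholder values for illustration only
header <- make_header(
  title = "Clean survey data",
  author = "Your Name",
  inputs = "data/raw/survey.csv",
  outputs = "data/clean/survey_clean.csv",
  notes = "This script removes missing values and filters the survey data.",
  created = "2024-01-15"
)
writeLines(header)
```

Leaving `created` unset falls back to the current date in YYYY-MM-DD format, which keeps the date field consistent across scripts.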
The body of your script should be divided into clear, well-labeled, numbered sections for easy navigation:
Start with a setup section. This section will usually include the following subsections:
- Load packages: List all the required R packages, each with a comment noting its purpose and version.
## 1.1 Load packages ----
library(tidyverse) # data manipulation and visualization (version: 1.3.1)
library(lubridate) # date-time manipulation (version: 1.7.10)
- Import data: Describe the data and objects that are being loaded.
## 1.2 Import data ----
# Description of data
# data <- read.csv("path/to/your/data.csv")
Each subsequent section of the script should include a descriptive numbered heading. Beneath the heading, include a comment describing the section's purpose. This makes it easier for others to understand the logical flow of the script. For example:
# 2. Data Cleaning ----
# This section handles the preprocessing and cleaning of the input
# data. It removes missing values, filters unnecessary rows, and
# transforms variables.
Subdivide sections as needed with descriptive numbered subheadings:
## 2.1 Filter data ----
# This step filters the data to keep only relevant observations.
filtered_data <- data %>% filter(variable == "value")
End each heading with at least four trailing dashes (-), equal signs (=), or hash signs (#). RStudio then treats the heading as the start of a discrete section that is foldable and navigable from the Jump To menu at the bottom of the editor.
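To check which comment lines in a script follow this convention, a rough approximation of the rule can be written as a regular expression. This is a sketch only: `is_section_heading()` is a hypothetical helper, and the pattern is an assumption about the rule as described above, not RStudio's actual implementation.

```r
# Approximation (an assumption, not RStudio's own logic): a section
# heading is a comment line that ends in at least four dashes,
# equal signs, or hash signs.
is_section_heading <- function(lines) {
  pattern <- "^[[:space:]]*#.*(-{4,}|={4,}|#{4,})[[:space:]]*$"
  grepl(pattern, lines)
}

# Illustrative script lines; the last heading has only three dashes
script_lines <- c(
  "# 1. Setup ----",
  "## 1.1 Load packages ----",
  "library(tidyverse) # data manipulation and visualization",
  "# 2. Data Cleaning ====",
  "# This section handles the preprocessing of the input data.",
  "## 2.1 Filter data ---"
)

headings <- script_lines[is_section_heading(script_lines)]
print(headings)
```

In this example, the heading with only three trailing dashes is not detected, which is the kind of slip that keeps a section out of the Jump To menu.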
You can create new R files based on the r_module.R template by adding a function to your .Rprofile file.
- Run the following command in your R console to open the global .Rprofile for editing:
file.edit("~/.Rprofile")
- Add the following function and save:
#' Create a New R Module Based on a GitHub Template
#'
#' This function creates a new R script using a template stored on
#' GitHub. The template is downloaded directly from the provided
#' URL, ensuring the latest version is used.
#'
#' @param filename Character. The name of the new R script file to
#' be created. Defaults to "new_script.R".
#' @return Opens the newly created R script in the editor.
#'
#' @examples
#' # Example usage of the function
#' new_r_module("my_new_script.R")
#' # This will create a new R script named "my_new_script.R" using
#' # the GitHub template.
#'
new_r_module <- function(filename = "new_script.R") {
  # Step 1: Define the GitHub URL for the template
  github_template_url <- "https://raw.githubusercontent.com/bgcasey/code_standards/main/templates/r_module.R"
  # Step 2: Download the template from GitHub to a temporary file
  temp_template <- tempfile(fileext = ".R")
  tryCatch(
    {
      download.file(github_template_url, temp_template, quiet = TRUE)
      message("Template downloaded successfully!")
    },
    error = function(e) {
      stop("Failed to download template: ", e$message)
    }
  )
  # Step 3: Copy the template to the specified filename.
  # file.copy() returns FALSE on failure instead of raising an error,
  # so its result is checked explicitly.
  copied <- file.copy(temp_template, filename, overwrite = TRUE)
  if (!copied) {
    stop("Failed to create the new R module: ", filename)
  }
  message("New R module created: ", filename)
  # Step 4: Open the new file in the editor
  file.edit(filename)
  # Step 5: Return the file name invisibly
  invisible(filename)
}
- Restart R.
- Now you can use the function to create new R files:
new_r_module("my_new_script.R")
This will create a new R script named "my_new_script.R" using the r_module.R template.
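Because the function copies the template with `overwrite = TRUE`, calling it with the name of an existing script replaces that script's contents. One way to guard against this is a thin wrapper, sketched below. `new_r_module_safe()` and `stub_create()` are hypothetical names introduced for illustration; the stub stands in for `new_r_module()` so the example runs without network access.

```r
# Hypothetical wrapper (an assumption, not part of the original
# function): refuses to overwrite an existing file. `create` defaults
# to new_r_module(), assumed to be defined already via .Rprofile.
new_r_module_safe <- function(filename = "new_script.R",
                              create = new_r_module) {
  if (file.exists(filename)) {
    stop("File already exists, not overwriting: ", filename)
  }
  create(filename)
}

# Stand-in creator so the sketch runs offline
stub_create <- function(filename) {
  writeLines("# template placeholder", filename)
  invisible(filename)
}

demo_file <- tempfile(fileext = ".R")

# First call creates the file
new_r_module_safe(demo_file, create = stub_create)

# Second call stops instead of overwriting; capture the message
result <- tryCatch(
  new_r_module_safe(demo_file, create = stub_create),
  error = function(e) conditionMessage(e)
)
print(result)
```

In everyday use the `create` argument would be left at its default, so the wrapper simply calls `new_r_module()` after the existence check.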