Tips: Folder Structure

Folder Structure Best Practices

The following is a suggested structure for organizing data and code related to a research project in a way that others can more easily understand and replicate your project. We've created a template folder (download here) that includes a basic folder structure, readme file, and suggested headings for code in R and Stata. These specifics may not work for every situation, but the concept can be adapted to just about any project. In general:

- Data folders should be broken down into raw data, cleaned data, and/or final data for analysis (and any other categories that may be necessary).
- Data folders should not contain code.
- Code subfolders should be numbered in the order in which they should be run.
  - E.g. 01_Cleaning, 02_Data_Prep, 03_Analysis.
  - For each of these folders, it is helpful to have a master run file (e.g. a master do-file for those using Stata) that runs all of the code within that folder.
  - There should also be a single master file that runs all of the other "sub-master" files. Ideally, someone downloading your code and data should be able to replicate the entire project by running that one file.
- Results should be saved to a separate output folder.
- The master folder should contain a readme file with information that helps the user navigate and understand the folder contents.
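
The guidelines above can be sketched in a few lines of code. The example below is a minimal illustration in Python, not part of the downloadable template: the folder names (`Data/Raw`, `Code/01_Cleaning`, `Output`, and so on), the `README.txt` file name, and the helper functions `scaffold` and `run_order` are all assumptions made for this example. The same idea carries over to R or Stata.

```python
from pathlib import Path

# Hypothetical folder names chosen to match the guidelines above;
# they are not taken from the downloadable template.
DATA_DIRS = ["Data/Raw", "Data/Cleaned", "Data/Final"]
CODE_DIRS = ["Code/01_Cleaning", "Code/02_Data_Prep", "Code/03_Analysis"]
OTHER_DIRS = ["Output"]


def scaffold(root):
    """Create the project skeleton under `root` and return the root path."""
    root = Path(root)
    for rel in DATA_DIRS + CODE_DIRS + OTHER_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    # Add a placeholder readme only if the project does not have one yet.
    readme = root / "README.txt"
    if not readme.exists():
        readme.write_text("Describe the folder contents and how to run the code.\n")
    return root


def run_order(code_root, suffix=".do"):
    """List scripts in the order a master file would run them.

    Subfolders are visited in name order, so zero-padded prefixes
    (01_, 02_, ...) determine the sequence. Scripts inside each
    subfolder are sorted by name as well.
    """
    ordered = []
    for folder in sorted(p for p in Path(code_root).iterdir() if p.is_dir()):
        ordered.extend(sorted(folder.glob(f"*{suffix}")))
    return ordered
```

Calling `scaffold("MyProject")` creates the empty skeleton, and `run_order("MyProject/Code")` returns the scripts in the sequence a master file would execute them. Zero-padding the prefixes (01, 02, ... rather than 1, 2, ...) keeps alphabetical order and run order identical once a project has ten or more steps.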

Below is a screenshot of a folder structure following these guidelines:
