This script pulls dependency versions (currently only gem and Ruby versions) out of Gemfile.lock and stores them in snapshot files in JSON format for easy consumption. In other words, it transforms a Gemfile.lock into JSON. It then uses these JSON files to create diffs between two snapshots to track dependency updates over time.
Currently, the script takes snapshots of gem versions and ruby versions. If the ruby version is not called out in the Gemfile.lock file, it looks at the .ruby-version file in the directory specified by the repos.repo-name.gemfile_dir config.
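The extraction step described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `parse_lockfile`, its `ruby_version_fallback` parameter, and the output shape are hypothetical names chosen for the example. It relies only on the standard Gemfile.lock layout, where resolved gems are indented four spaces under `specs:` and the Ruby version follows a `RUBY VERSION` header.

```python
import json
import re

# Resolved gems are indented by exactly four spaces: "    rake (13.0.6)".
# Their own dependencies are indented six spaces and are skipped.
SPEC_LINE = re.compile(r"^ {4}(\S+) \((.+)\)$")
# The Ruby version sits on the indented line after the "RUBY VERSION" header.
RUBY_LINE = re.compile(r"^\s+ruby (\S+)")


def parse_lockfile(text, ruby_version_fallback=None):
    """Return {"ruby": version_or_None, "gems": {name: version}} (hypothetical shape)."""
    gems = {}
    ruby = None
    section = None
    for line in text.splitlines():
        if line and not line.startswith(" "):
            section = line.strip()  # e.g. "GEM", "DEPENDENCIES", "RUBY VERSION"
            continue
        if section in ("GEM", "GIT", "PATH"):
            match = SPEC_LINE.match(line)
            if match:
                gems[match.group(1)] = match.group(2)
        elif section == "RUBY VERSION":
            match = RUBY_LINE.match(line)
            if match:
                ruby = match.group(1)
    if ruby is None and ruby_version_fallback:
        # Mirrors the .ruby-version fallback described above.
        ruby = ruby_version_fallback.strip()
    return {"ruby": ruby, "gems": gems}


SAMPLE = """\
GEM
  remote: https://rubygems.org/
  specs:
    rack (2.2.4)
    rails (4.2.11)
      rack (~> 2.0)

RUBY VERSION
   ruby 2.7.6p219
"""

snapshot = parse_lockfile(SAMPLE)
print(json.dumps(snapshot, indent=2))
```

When the lockfile has no `RUBY VERSION` section, passing the contents of .ruby-version as `ruby_version_fallback` fills in the Ruby version instead.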
This script was built in the following environment, and is therefore the recommended setup:
- macOS 12.6 or newer.
- Python 3.
- pyenv as the Python version manager and pyenv-virtualenv as the virtualenv manager. pyenv reads the .python-version file and switches to that Python version when you cd into this directory.
- If you use pyenv and pyenv-virtualenv, install the required version of Python (if it isn't already in the pyenv versions list) and create the virtualenv:

      pyenv install `cat .python-version`
      pyenv virtualenv `cat .python-version` dep

- Install the required package dependencies:

      pip install -r requirements.txt

- To pull data from GitHub, the environment variable GITHUB_TOKEN must be set to a token with permission to read the repositories specified in the config file. Create one in your GitHub account settings if you don't already have one.
- Create your configs/default.json config file. You can start from the sample:

      cp configs/default_sample.json configs/default.json
If you're using pyenv and pyenv-virtualenv, you need to activate the virtualenv with pyenv activate dep, assuming you called the virtualenv dep at creation during setup.
By default, if you just run ./dep, the script will try to use the config file configs/default.json. If this file doesn't exist, it will fail. You can explicitly specify which config file to use with the -c flag. For example:
    ./dep -c services

This will make the script use the config file configs/services.json.
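As an illustration of how a `-c` value could map to a file under configs/, here is a minimal sketch using `argparse`. The names `resolve_config_path` and `parse_args` are hypothetical, and tolerating a trailing `.json` is an assumption made for the example, not confirmed behavior of the script.

```python
import argparse
from pathlib import Path

CONFIGS_DIR = Path("configs")


def resolve_config_path(name="default"):
    """Map a -c value such as "services" to configs/services.json."""
    # Assumption: a trailing ".json" is tolerated; the real script may differ.
    if name.endswith(".json"):
        name = name[: -len(".json")]
    return CONFIGS_DIR / f"{name}.json"


def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="dep")
    parser.add_argument("-c", dest="config", default="default",
                        help="config name, resolved to configs/<name>.json")
    parser.add_argument("-d", dest="diff_from", default=None,
                        help="name of an old snapshot to diff against")
    return parser.parse_args(argv)


args = parse_args(["-c", "services"])
config_path = resolve_config_path(args.config)
print(config_path.as_posix())
```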
Every time the script runs successfully, it will create a json snapshot file in snapshots/.
Use the -d flag to create a diff file against an old snapshot. Diff files are csv files stored in diffs/.
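To make the diff step concrete, the sketch below compares two snapshots and renders the changes as CSV. The snapshot shape (`{repo: {dependency: version}}`), the column names, and the function names are all assumptions for illustration; the real snapshot and diff layouts may differ.

```python
import csv
import io


def diff_snapshots(old, new):
    """Yield (repo, dependency, old_version, new_version) for every change."""
    for repo in sorted(set(old) | set(new)):
        before = old.get(repo, {})
        after = new.get(repo, {})
        for dep in sorted(set(before) | set(after)):
            if before.get(dep) != after.get(dep):
                # An empty string marks a dependency that was added or removed.
                yield repo, dep, before.get(dep, ""), after.get(dep, "")


def diff_to_csv(old, new):
    """Render the changes between two snapshots as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["repo", "dependency", "from", "to"])
    writer.writerows(diff_snapshots(old, new))
    return buffer.getvalue()


old = {"repo1": {"ruby": "2.7.6", "rails": "3.2.22", "rake": "13.0.6"}}
new = {"repo1": {"ruby": "2.7.6", "rails": "4.2.11", "rack": "2.2.4"}}
report = diff_to_csv(old, new)
print(report)
```

Unchanged dependencies (here, `ruby`) are left out, so the report lists only what moved between the two snapshots.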
- To pull the lockfiles from GitHub, use a config file like this:

      {
        "owner": "github-owner",
        "repos": {
          "repo1": { },
          "repo2": { }
        }
      }

- To read the repos from a local directory instead, use a config file like this:

      {
        "repos": {
          "repo1": { },
          "repo2": { }
        },
        "force_debug_mode": true,
        "debug": {
          "repos_dir": "/Users/reposdirectory"
        }
      }
- Let's say you have a config file configs/services.json like this:

      {
        "owner": "github-owner",
        "repos": {
          "repo1": { },
          "repo2": { }
        }
      }

- Let's also say you have a few snapshots already taken:

      snapshots/
      |
      +----- services_2022-10-01.json
      |
      +----- services_2022-11-01.json

- You can capture the dependency changes between one of these old snapshots and today with:

      ./dep -c services -d services_2022-11-01

  If today is 2022-12-01, this will create the diff file diffs/services_2022-11-01_2022-12-01.csv.
This is useful when you are in the process of upgrading rails and you run tests on two different versions of rails using different Gemfile.lock files following a process similar to Github's.
For example, let's say you have a repo called repo1 which is upgrading from Rails 3 to Rails 4, using Gemfile.lock and Gemfile_next.lock respectively. Here's how we can get the diff:
- Use two different config files; let's call them core_r3.json and core_r4.json:

      // core_r3.json
      {
        "owner": "github-owner",
        "repos": {
          "repo1": { "gemfile_name": "Gemfile.lock" }
        }
      }

      // core_r4.json
      {
        "owner": "github-owner",
        "repos": {
          "repo1": { "gemfile_name": "Gemfile_next.lock" }
        }
      }

- Take a snapshot using core_r3.json:

      $ ./dep -c core_r3.json
      Using config file core_r3.json
      Wrote snapshots/core_r3_2022-12-07.json

- Take a snapshot using core_r4.json and diff it against the snapshot just taken for Rails 3:

      $ ./dep -c core_r4.json -d core_r3_2022-12-07
      Wrote snapshots/core_r4_2022-12-07-v1.json
      Wrote diffs/core_r3_2022-12-07_core_r4_2022-12-07.csv

That's it. The diff file diffs/core_r3_2022-12-07_core_r4_2022-12-07.csv will contain all the changes.
Config files are JSON files that live in the configs/ directory and have a defined structure. If no config file is specified, the script uses configs/default.json.
All config files are gitignored.
Config files have the following structure:
    {
      "owner": "<github-org-or-owner>",
      "repos": {
        "<repo-name-1>": {
          "gemfile_dir": "/path/to/directory",
          "gemfile_name": "Gemfile.lock"
        },
        "<repo-name-2>": { ... },
        // ...
      },
      "force_debug_mode": false,
      "debug": {
        "repos_dir": "/path/to/directory"
      }
    }

    {
      "owner": "<github-org-or-owner>",
      // ...
    }

Used when making API calls to GitHub. For example, when pulling Gemfile.lock files, the API call path includes the owner name.
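For illustration, here is one way the owner could end up in an API call path, using the GitHub REST "get repository content" endpoint. This is a sketch under assumptions: `build_lockfile_request` is a hypothetical helper, and the script may use a different endpoint or HTTP client. The request is only constructed here, not sent.

```python
import os
import urllib.request
from urllib.parse import quote


def build_lockfile_request(owner, repo, gemfile_dir="/",
                           gemfile_name="Gemfile.lock", token=None):
    """Build (but do not send) a request for a lockfile's raw contents."""
    # Fall back to the GITHUB_TOKEN environment variable when no token is passed.
    token = token or os.environ.get("GITHUB_TOKEN", "")
    # Repository paths in the API are relative, so drop leading/trailing slashes.
    path = f"{gemfile_dir.strip('/')}/{gemfile_name}".lstrip("/")
    url = (f"https://api.github.com/repos/{quote(owner)}/{quote(repo)}"
           f"/contents/{quote(path)}")
    return urllib.request.Request(url, headers={
        # Ask for the raw file body instead of the JSON metadata wrapper.
        "Accept": "application/vnd.github.raw+json",
        "Authorization": f"Bearer {token}",
    })


request = build_lockfile_request("github-owner", "repo1", token="example-token")
print(request.full_url)
# To actually fetch the file:
#   text = urllib.request.urlopen(request).read().decode("utf-8")
```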
    {
      "repos": { /* ... */ },
      // ...
    }

Repos to report on. Each key within this dictionary is a repo name. The values are configured as follows.
    {
      "repos": {
        "<repo-name>": {
          "gemfile_dir": "/path/to/directory",
          // ...
        }
      },
      // ...
    }

Defaults to "/" if not present. This is the directory where the script searches for Gemfile.lock and, if the Ruby version is not found in Gemfile.lock, for .ruby-version.
    {
      "repos": {
        "<repo-name>": {
          "gemfile_name": "Gemfile.lock",
          // ...
        }
      },
      // ...
    }

Defaults to "Gemfile.lock". This is the name of the lockfile within the repo. Together with the gemfile_dir config, this is how the script finds the file. If you specify neither config, the script looks for "/Gemfile.lock".
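The two defaults combine as in the sketch below. `lockfile_path` and `ruby_version_path` are hypothetical helper names; only the default values "/" and "Gemfile.lock" come from the description above.

```python
import posixpath

DEFAULT_GEMFILE_DIR = "/"
DEFAULT_GEMFILE_NAME = "Gemfile.lock"


def lockfile_path(repo_config):
    """Return the repo-relative lockfile path, applying the documented defaults."""
    directory = repo_config.get("gemfile_dir", DEFAULT_GEMFILE_DIR)
    name = repo_config.get("gemfile_name", DEFAULT_GEMFILE_NAME)
    # The leading "/" anchors the result at the repo root.
    return posixpath.join("/", directory, name)


def ruby_version_path(repo_config):
    """Return where .ruby-version is expected: next to the lockfile."""
    directory = repo_config.get("gemfile_dir", DEFAULT_GEMFILE_DIR)
    return posixpath.join("/", directory, ".ruby-version")


print(lockfile_path({}))
print(lockfile_path({"gemfile_dir": "/path/to/directory",
                     "gemfile_name": "Gemfile_next.lock"}))
```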
    {
      "force_debug_mode": false,
      // ...
    }

This config allows you to avoid making any API calls to GitHub and only look at the repos locally. The script looks for them in the directory specified by the debug.repos_dir config.
This config name will likely change in the future to force_local_mode.
    {
      "debug": { /* ... */ },
      // ...
    }

Settings relevant only when the script runs in debug mode.
This config name will likely change in the future to local.
    {
      "debug": {
        "repos_dir": "/path/to/directory",
        // ...
      },
      // ...
    }

When running in debug mode, this is the directory where the script looks for the repos instead of making API calls to GitHub.
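A rough sketch of how local lookup might work is shown below. It assumes each repo is checked out at `<repos_dir>/<repo-name>`, which is not stated in this document; `use_local_mode` and `local_lockfile` are hypothetical names.

```python
from pathlib import Path


def use_local_mode(config):
    """True when the config asks to skip GitHub API calls."""
    return bool(config.get("force_debug_mode", False))


def local_lockfile(config, repo_name):
    """Return the expected on-disk lockfile path for a repo in local mode."""
    repo_config = config["repos"][repo_name]
    repos_dir = Path(config["debug"]["repos_dir"])
    # Assumption: each repo lives in <repos_dir>/<repo-name>.
    gemfile_dir = repo_config.get("gemfile_dir", "/").strip("/")
    gemfile_name = repo_config.get("gemfile_name", "Gemfile.lock")
    return repos_dir / repo_name / gemfile_dir / gemfile_name


config = {
    "repos": {"repo1": {}, "repo2": {"gemfile_dir": "/services/api"}},
    "force_debug_mode": True,
    "debug": {"repos_dir": "/Users/reposdirectory"},
}

if use_local_mode(config):
    for name in config["repos"]:
        print(local_lockfile(config, name).as_posix())
```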
Snapshots are gitignored json files stored in snapshots/.
They contain all the dependency versions for each repo specified in the repos config. The script consumes these files to create the diffs. A new snapshot is created every time the script runs successfully.
The format is snapshots/<config>_<date>.json where <config> is the name of the config file used to take the snapshot, and the <date> is the day when the snapshot was taken. For example snapshots/default_2022-12-07.json.
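The naming scheme can be expressed as a small helper. `snapshot_path` is a hypothetical name used only for this sketch.

```python
import datetime
from pathlib import Path


def snapshot_path(config_name, day=None):
    """Build snapshots/<config>_<date>.json, defaulting to today's date."""
    day = day or datetime.date.today()
    return Path("snapshots") / f"{config_name}_{day.isoformat()}.json"


print(snapshot_path("default", datetime.date(2022, 12, 7)).as_posix())
```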
Diffs are gitignored CSV files stored in diffs/. They're created only when passing the -d flag.
The format is diffs/<config-from>_<date-from>_<config-to>_<date-to>.csv where:
- <config-from> is the name of the config file used in the snapshot we're diffing from.
- <date-from> is the date of the snapshot we're diffing from.
- <config-to> is the name of the config file used in the snapshot we're diffing to, i.e. the snapshot taken as part of this run.
- <date-to> is the date of the snapshot we're diffing to, i.e. the snapshot taken as part of this run.

When both the snapshot we're diffing from and the snapshot we're diffing to use the same config file, the _<config-to> part is removed. E.g. diffs/default_2022-11-01_2022-12-01.csv.
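The diff naming rule, including the same-config shortcut, can be sketched as follows. `split_snapshot_name` and `diff_filename` are hypothetical helpers; splitting on the last underscore assumes the date never contains one, which holds for the YYYY-MM-DD dates shown in this document.

```python
def split_snapshot_name(snapshot_name):
    """Split "core_r3_2022-12-07" into ("core_r3", "2022-12-07")."""
    # Config names may contain underscores, so split on the last one only.
    config_name, date = snapshot_name.rsplit("_", 1)
    return config_name, date


def diff_filename(config_from, date_from, config_to, date_to):
    """Build the diff path, dropping <config-to> when both configs match."""
    if config_from == config_to:
        return f"diffs/{config_from}_{date_from}_{date_to}.csv"
    return f"diffs/{config_from}_{date_from}_{config_to}_{date_to}.csv"


config_from, date_from = split_snapshot_name("core_r3_2022-12-07")
print(diff_filename(config_from, date_from, "core_r4", "2022-12-07"))
print(diff_filename("default", "2022-11-01", "default", "2022-12-01"))
```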