ci: run publish workflow only on Release publication #7

Closed
rapidclock wants to merge 3 commits into v2026 from codex/uplift-repo-to-go-2026-standards

Conversation

@rapidclock
Owner

@rapidclock rapidclock commented Feb 10, 2026

High-level outcome

This repo was modernized in four major areas:

  1. Build/release infrastructure
  2. Go module/dependency management
  3. Crawler safety/behavior fixes
  4. Test coverage + documentation + versioning

CI/CD migration and release automation

What changed

Travis CI was removed and replaced with GitHub Actions.
A PR/default-branch CI workflow was added with:

  • go mod tidy consistency check
  • gofmt check
  • go vet
  • staticcheck
  • unit tests and race tests
  • matrix over Go 1.23/1.24

A separate publish workflow was added and then corrected to run only when a GitHub Release is published (release.published), not on raw tag push.
Publish workflow validates release tag format (v*) and ensures tag/version matches VERSION, then triggers Go proxy and pkg.go.dev refresh endpoints.

Why this matters

CI is now fully integrated in GitHub and tied directly to PR lifecycle.
Publish behavior now matches your requested governance model (publish only after formal release).
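
To illustrate the tag/VERSION check described above, here is a minimal Go sketch. The actual workflow performs this check in its own steps, which are not shown in this conversation. The function name checkReleaseTag, the strict vX.Y.Z pattern, and the assumption that VERSION holds the bare number without a leading "v" are all illustrative guesses, not the workflow's real logic.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// semverTag matches tags of the form vX.Y.Z. This exact pattern is an
// assumption; the PR only says tags must look like v* and agree with VERSION.
var semverTag = regexp.MustCompile(`^v\d+\.\d+\.\d+$`)

// checkReleaseTag (hypothetical helper) returns an error unless tag is
// vX.Y.Z and equals "v" plus the trimmed contents of the VERSION file.
func checkReleaseTag(tag, versionFile string) error {
	if !semverTag.MatchString(tag) {
		return fmt.Errorf("tag %q is not of the form vX.Y.Z", tag)
	}
	want := "v" + strings.TrimSpace(versionFile)
	if tag != want {
		return fmt.Errorf("tag %q does not match VERSION (%s)", tag, want)
	}
	return nil
}

func main() {
	fmt.Println(checkReleaseTag("v1.4.1", "1.4.1\n")) // <nil>
	fmt.Println(checkReleaseTag("v1.4.0", "1.4.1\n")) // mismatch error
	fmt.Println(checkReleaseTag("release-1", "1.4.1\n")) // format error
}
```

Failing fast on a mismatch keeps a mistyped release tag from reaching the proxy, where a published version cannot be withdrawn easily.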

Go modules and dependency posture

What changed
Introduced module metadata (go.mod, go.sum) for modern Go workflows.
Core dependencies explicitly include:

  • golang.org/x/net
  • golang.org/x/time

README now reflects versioned module install usage.
Why this matters
Reproducible dependency management and compatibility with modern Go tooling ecosystems.

Core crawler/runtime behavior improvements

Adapter/file handling
FileWriterAdapter now:
handles open failures without crashing the whole process (log.Printf + return)
opens output with O_WRONLY|O_CREATE|O_TRUNC for deterministic output behavior.
Safer pipeline behavior
Link absolution stage now returns safely on nil inputs instead of exiting the process with a fatal error.
Distributor quit path now forwards quit and returns cleanly (no channel-closing side effects in that function).
Parsing and rate limiting
HTML parsing uses tokenizer-based parsing (x/net/html) with nil/body guards.
Rate limiting uses x/time/rate with validated rate/burst constraints in setup.
Setup now panics on an invalid TimeToQuit instead of exiting the process via log.Fatal.

Test coverage expansion

New tests were added for:

  • default options and channel-set factory behavior
  • rate-limit and timeout validation behavior
  • pipeline helpers (absolution, duplicate filter, timeout propagation)
  • HTML parser extraction behavior
  • adapter write behavior to file

These tests are the main reason the repo now has stronger regression protection than before.

Documentation and release metadata

README was heavily expanded into a comprehensive operational guide (architecture, config, adapters, testing, CI/CD, release flow).



Copilot AI review requested due to automatic review settings February 10, 2026 05:57

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e493856302

Comment on lines +10 to +12
replace golang.org/x/net => github.com/golang/net v0.35.0

replace golang.org/x/time => github.com/golang/time v0.10.0

P1: Remove non-canonical x/* replace directives

These replace directives force golang.org/x/net and golang.org/x/time to resolve via github.com/golang/*, which breaks dependency resolution in default proxy-based environments (for example, go test ./... now fails while downloading those modules from proxy.golang.org). Using the canonical module paths without these replaces is necessary for fresh builds and CI to work reliably.

Comment on lines 69 to +71
  if err != nil {
-     fp.Close()
-     log.Fatal(err)
+     log.Printf("failed to open output file %q: %v", fw.FilePath, err)
+     return

P1: Keep consuming output channel on file-open error

If opening FilePath fails, this branch logs and returns before any goroutine starts reading listenCh, but Consume() still returns that channel set to the crawler. In runs with an invalid or unwritable path, the first send to NodeCh will block forever on the unbuffered channel, hanging the crawl instead of signaling failure.

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the project’s release/publish automation so module publishing is gated on GitHub Release publication (rather than tag pushes), while also modernizing the repo with Go modules, additional unit tests, and updated documentation/release notes.

Changes:

  • Change publish workflow trigger to release.published and validate vX.Y.Z tags against VERSION.
  • Add unit tests covering option defaults, validation/panic behavior, and key pipeline helpers/adapters.
  • Introduce go.mod/go.sum, add CHANGELOG.md, bump VERSION to 1.4.1, and refresh README/docs.

Reviewed changes

Copilot reviewed 19 out of 22 changed files in this pull request and generated 4 comments.

Show a summary per file
File | Description
octopus/setup_test.go | Adds tests for rate-limit validation and invalid timeout setup behavior.
octopus/setup.go | Switches invalid TimeToQuit handling from log.Fatal to panic.
octopus/pipes_test.go | Adds tests for link absolution, duplicate filtering, and max-delay timeout behavior.
octopus/pipe_spl_distributor.go | Adjusts distributor signature and quit handling (stop forwarding on quit).
octopus/pipe_process_htmlparsing.go | Simplifies import and adds nil-guard before tokenizing HTML.
octopus/pipe_ctrl_ratelimit.go | Import formatting cleanup.
octopus/pipe_augment_linkabsolution.go | Removes log.Fatal on nil inputs and simplifies imports.
octopus/models.go | Comment formatting cleanup.
octopus/modelfactory_test.go | Adds tests for default options and default channel set factory.
octopus/htmlparse_test.go | Adds test for anchor link extraction behavior.
octopus/doc.go | Re-formats package documentation headings/lists.
go.mod | Adds module definition and dependencies (plus replace directives).
go.sum | Adds dependency checksums corresponding to the replaced modules.
adapter/doc.go | Comment formatting cleanup.
adapter/basicadapters_test.go | Adds test for FileWriterAdapter output behavior.
adapter/basicadapters.go | Updates file open flags and changes error handling on file open failure.
VERSION | Bumps version to 1.4.1.
README.md | Rewrites/expands docs (installation, architecture, testing, CI/CD, release flow).
CHANGELOG.md | Adds changelog entries including 1.4.1.
.travis.yml | Removes legacy Travis CI config.
.github/workflows/publish.yml | Adds publish workflow triggered only on GitHub Release publication with tag/VERSION validation and indexing triggers.
.github/workflows/ci.yml | Adds CI workflow running gofmt, vet, staticcheck, unit tests, and race tests.

Comments suppressed due to low confidence (1)

octopus/pipe_spl_distributor.go:29

  • distribute() reads from listenCh using case node := <-listenCh: without checking the receive ok value. If listenCh is ever closed, this will repeatedly receive a nil *Node and forward it downstream in a tight loop, causing CPU spin and potentially nil dereferences in later stages. Use node, ok := <-listenCh and exit (and/or propagate quit) when !ok, and consider skipping forwarding when node is nil.
		case node := <-listenCh:
			{
				for _, outChSet := range outChSetList {
					if outChSet != nil {
						outChSet.NodeCh <- node
					}

  if err != nil {
-     fp.Close()
-     log.Fatal(err)
+     log.Printf("failed to open output file %q: %v", fw.FilePath, err)
Copilot AI Feb 10, 2026

If getFilePointer() fails, writeToFile returns without starting any goroutine to drain listenCh. Since Consume() still returns a NodeChSet with an unbuffered NodeCh, any caller that sends to NodeCh will block forever (deadlock) and QuitCh won’t be observed. Consider starting a drain goroutine on error (that discards nodes until quit), or making Consume() return a buffered channel/no-op adapter that never blocks, while still surfacing the error to the caller (e.g., via a callback or explicit constructor that can return an error).

Suggested change
-     log.Printf("failed to open output file %q: %v", fw.FilePath, err)
+     log.Printf("failed to open output file %q: %v", fw.FilePath, err)
+     // Even on error, start a goroutine to drain listenCh so that
+     // sends to NodeCh do not block indefinitely.
+     go func() {
+         for {
+             select {
+             case <-listenCh:
+                 // discard node
+             case <-quitCh:
+                 return
+             }
+         }
+     }()

Comment on lines +10 to +12
replace golang.org/x/net => github.com/golang/net v0.35.0

replace golang.org/x/time => github.com/golang/time v0.10.0
Copilot AI Feb 10, 2026

The replace golang.org/x/net => github.com/golang/net / replace golang.org/x/time => github.com/golang/time directives are risky for a published module: (1) replace directives in dependencies are ignored by downstream consumers, so this won’t improve fetch reliability for users of this library, and (2) it forces maintainers/CI to pull from non-canonical module paths, which can break builds if those mirrors diverge or are unavailable. Prefer removing these replace directives and relying on golang.org/x/* via the proxy, or document an environment-level workaround (e.g., GOPROXY) for restricted networks.

Comment on lines +39 to +53
      - name: Trigger Go module proxy indexing
        env:
          MODULE: github.com/rapidclock/web-octopus
        run: |
          set -euxo pipefail
          VERSION="${{ github.event.release.tag_name }}"
          curl -fsSL "https://proxy.golang.org/${MODULE}/@v/${VERSION}.info"

      - name: Trigger pkg.go.dev refresh
        env:
          MODULE: github.com/rapidclock/web-octopus
        run: |
          set -euxo pipefail
          VERSION="${{ github.event.release.tag_name }}"
          curl -fsSL "https://pkg.go.dev/fetch/${MODULE}@${VERSION}"
Copilot AI Feb 10, 2026

The proxy/pkg.go.dev trigger steps use curl -fsSL with no retry/backoff. These endpoints can transiently return 404/5xx shortly after release publication (before indexing completes), which will make the publish workflow flaky. Consider adding --retry/--retry-all-errors with a small backoff loop, or polling until the proxy .info endpoint becomes available (with an overall timeout).

        return
    }
    go func() {
        defer fp.Close()
Copilot AI Feb 10, 2026

The file handle returned by OpenFile may be writable, so a failed Close can mean lost data, and that failure is not handled explicitly here.

Suggested change
-         defer fp.Close()
+         defer func() {
+             if cerr := fp.Close(); cerr != nil {
+                 log.Printf("failed to close output file %q: %v", fw.FilePath, cerr)
+             }
+         }()

Removed outdated version entries and consolidated changelog to focus on the latest updates.
@rapidclock rapidclock closed this Feb 14, 2026