Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
58 changes: 58 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
name: CI

on:
pull_request:
types: [opened, synchronize, reopened, ready_for_review]
push:
branches: [main, master]

permissions:
contents: read

jobs:
test-and-lint:
runs-on: ubuntu-latest
timeout-minutes: 20

strategy:
fail-fast: false
matrix:
go-version: ['1.23.x', '1.24.x']

steps:
- name: Checkout
uses: actions/checkout@v4

- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: ${{ matrix.go-version }}
cache: true

- name: Verify module graph
run: |
go mod tidy
git diff --exit-code -- go.mod go.sum

- name: Format check
run: |
unformatted=$(gofmt -l $(git ls-files '*.go'))
if [ -n "$unformatted" ]; then
echo "These files are not gofmt-formatted:"
echo "$unformatted"
exit 1
fi

- name: Vet
run: go vet ./...

- name: Staticcheck
run: |
go install honnef.co/go/tools/cmd/staticcheck@latest
staticcheck ./...

- name: Unit tests
run: go test ./...

- name: Race tests
run: go test -race ./...
53 changes: 53 additions & 0 deletions .github/workflows/publish.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
name: Publish

on:
release:
types: [published]

permissions:
contents: read

jobs:
publish-go-module:
runs-on: ubuntu-latest
timeout-minutes: 10

steps:
- name: Checkout
uses: actions/checkout@v4

- name: Setup Go
uses: actions/setup-go@v5
with:
go-version: '1.23.x'

- name: Validate release tag format and VERSION match
run: |
TAG="${{ github.event.release.tag_name }}"
case "$TAG" in
v*) ;;
*)
echo "Release tag must start with 'v' (got: $TAG)"
exit 1
;;
esac

TAG_NO_V="${TAG#v}"
VERSION_FILE=$(tr -d '[:space:]' < VERSION)
test "$TAG_NO_V" = "$VERSION_FILE"

- name: Trigger Go module proxy indexing
env:
MODULE: github.com/rapidclock/web-octopus
run: |
set -euxo pipefail
VERSION="${{ github.event.release.tag_name }}"
curl -fsSL "https://proxy.golang.org/${MODULE}/@v/${VERSION}.info"

- name: Trigger pkg.go.dev refresh
env:
MODULE: github.com/rapidclock/web-octopus
run: |
set -euxo pipefail
VERSION="${{ github.event.release.tag_name }}"
curl -fsSL "https://pkg.go.dev/fetch/${MODULE}@${VERSION}"
Comment on lines +39 to +53
Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proxy/pkg.go.dev trigger steps use curl -fsSL with no retry/backoff. These endpoints can transiently return 404/5xx shortly after release publication (before indexing completes), which will make the publish workflow flaky. Consider adding --retry/--retry-all-errors with a small backoff loop, or polling until the proxy .info endpoint becomes available (with an overall timeout).

Copilot uses AI. Check for mistakes.
13 changes: 0 additions & 13 deletions .travis.yml

This file was deleted.

29 changes: 29 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
# Changelog

All notable changes to this project are documented in this file.

## [1.3.0] - 2026-02-10
### Changed
- Updated publish automation to run **only** when a GitHub Release is published (not on every tag push).
- Added strict release-tag validation in publish workflow and kept VERSION/tag consistency checks.
- Removed legacy Travis CI configuration in favor of GitHub Actions.
- Restored robust HTML parsing using `golang.org/x/net/html` tokenizer instead of regex-based extraction.
- Restored mature rate limiting with `golang.org/x/time/rate` for predictable throttling semantics.
- Added module `replace` directives to GitHub mirrors for `golang.org/x/*` to improve fetch reliability in restricted environments.
- Fixed distributor shutdown to stop forwarding after quit signal without closing inbound channels owned by upstream producers.
- Replaced `log.Fatal` behavior in core library paths with non-process-terminating logic (`panic` for invalid setup value and early return for nil or file-open failure paths).
- Improved file adapter open flags to use write-only, create, and truncate semantics for predictable output.

### Added
- Added GitHub Actions CI workflow for pull requests and default-branch pushes with module, formatting, vet, static analysis, test, and race checks.
- Added GitHub Actions publish workflow on version tags to trigger Go proxy and pkg.go.dev indexing.
- Added parser-focused unit coverage to validate anchor extraction behavior.
- Go module support (`go.mod`) with explicit dependency versions.
- Unit tests for crawler defaults and factory helpers.
- Unit tests for rate-limit validation and timeout setup behavior.
- Unit tests for pipeline helper behavior (link absolution, duplicate filter, timeout propagation).
- Unit test coverage for `adapter.FileWriterAdapter` write behavior.
- Comprehensive README covering installation, architecture, configuration, adapters, testing, and release policy.

### Notes
- This version focuses on modernization and maintainability without changing the fundamental crawler pipeline design.
Loading
Loading