Skip to content

Allow CD in any region to deploy the app#1916

Draft
lionello wants to merge 2 commits intomainfrom
lio/cd-region
Draft

Allow CD in any region to deploy the app#1916
lionello wants to merge 2 commits intomainfrom
lio/cd-region

Conversation

@lionello
Copy link
Member

@lionello lionello commented Feb 10, 2026

Description

Currently, the CD CloudFormation stack has to be in the same region as the region the app will be deployed too. This now requires a CD stack per region.

This PR allows us to use the CD stack in any region to deploy to app, to other regions.

Linked Issues

Checklist

  • I have performed a self-review of my code
  • I have added appropriate tests
  • I have updated the Defang CLI docs and/or README to reflect my changes, if necessary

Summary by CodeRabbit

  • New Features

    • AWS Service Quotas checks enabled; dedicated CD-region handling for region-aware operations
    • Automatic S3 bucket region detection and improved public ECR image existence checks
    • New CD log type and merged live-log tailing for consolidated logs
  • Bug Fixes

    • Clearer upload failure message
    • Global verbose flag now causes storage listings to include verbose results
  • Refactor

    • Split config-loading paths for CD vs. application contexts and reorganized log/query flows

@lionello lionello changed the title Allow any CD in any region to deploy the app Allow CD in any region to deploy the app Feb 10, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 10, 2026

📝 Walkthrough

Walkthrough

Per-region AWS/CD changes: S3 bucket discovery by prefix, new S3Lister and ListBucketsByPrefix, CD vs app config loaders (LoadConfigForCD/ForApp), per-region CloudWatch log querying and LiveTailStream merging, several Aws methods promoted to package-level, and cd listing now respects global verbose flag.

Changes

Cohort / File(s) Summary
CLI: cd listing
src/cmd/cli/command/cd.go
CdListFromStorage now passes (all
BYOC / AWS client & logs
src/pkg/cli/client/byoc/aws/byoc.go, src/pkg/cli/client/byoc/aws/list.go
Introduced per-region log querying (CDRegion vs app Region), split/merged LiveTailStreams, replaced some LoadConfig calls with LoadConfigForApp/LoadConfigForCD, switched to CheckImageExistOnPublicECR, adjusted servicequotas usage and other internal refactors.
S3 bucket discovery & listing
src/pkg/clouds/aws/s3.go, src/pkg/cli/client/byoc/aws/list.go
Added S3Lister interface and ListBucketsByPrefix; listPulumiStacksAllRegions now consumes aws.S3Lister and iterates buckets-with-regions returned by ListBucketsByPrefix instead of per-bucket GetBucketLocation calls.
AWS core: config, Region, ARNs, and method promotions
src/pkg/clouds/aws/common.go, src/pkg/clouds/aws/ec2.go, src/pkg/clouds/aws/publicecr.go, src/pkg/clouds/aws/region/region.go
Region type aliased, added MakeARN and Aws.MakeARN, LoadConfig renamed/added variants (LoadConfigForApp), promoted some Aws methods to package-level functions (e.g., RevokeDefaultSecurityGroupRules, CheckImageExistOnPublicECR), and added Region.Values export.
ECS / CD region-awareness
src/pkg/clouds/aws/ecs/...
.../cfn/setup.go, .../common.go, info.go, run.go, stop.go, tail.go, upload.go, ..._test.go
Added CDRegion to AwsEcs, LoadConfigForCD, MakeCdARN, region-aware CloudFormation client creation with describeStacksAllRegions (search S3 to find stack region), annotated CFN errors, and updated tests to reflect new constructors and flows.
CloudWatch / live-tail merging & log types
src/pkg/clouds/aws/cw/stream.go, src/pkg/logs/log_type.go
Added MergeLiveTailStreams to combine multiple LiveTailStream sources; introduced LogTypeCD and LogTypeUnspecified and extended mappings to parse/use CD log type.
Secrets & config loader updates
src/pkg/clouds/aws/secrets.go, src/pkg/clouds/aws/secrets_test.go
Replaced LoadConfig calls with LoadConfigForApp in secrets APIs and tests; no logic changes beyond loader selection.
Misc tests & small fixes
src/pkg/cli/compose/context.go, src/pkg/cli/configSet_test.go, src/pkg/stacks/manager.go, src/pkg/clouds/aws/ecs/cfn/setup_test.go, src/pkg/clouds/aws/publicecr_test.go
Minor error message wording changes, mock receiver fixes, comment tweaks, and test adjustments to package-level function usage and new config loaders.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant CLI as CLI
participant BYOC as BYOC/AWS client
participant S3 as S3 (ListBucketsByPrefix)
participant CFN as CloudFormation
participant CW_US as CloudWatch (us-east-1 / CDRegion)
participant CW_APP as CloudWatch (app Region)
participant MERGER as MergeLiveTailStreams
CLI->>BYOC: request logs / cd list / stack ops
BYOC->>S3: ListBucketsByPrefix(prefix)
S3-->>BYOC: buckets-with-regions
BYOC->>CFN: describeStacksAllRegions(bucket regions)
CFN-->>BYOC: stack region / outputs
BYOC->>CW_US: query CDRegion log groups
BYOC->>CW_APP: query app Region log groups
CW_US-->>BYOC: LiveTailStream_US
CW_APP-->>BYOC: LiveTailStream_APP
BYOC->>MERGER: pass LiveTailStream_US, LiveTailStream_APP
MERGER-->>CLI: merged LiveTailStream events

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • jordanstephens
  • edwardrf

Poem

"A rabbit hopped through buckets and logs,
I stitched regions, merged tails, chased fogs.
Found stacks in S3, made ARNs bright,
Merged streams by moon, by day and night.
Carrot cheers — deploys take flight!" 🐰✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: allowing CD deployment from any region rather than requiring co-location with the application. It reflects the primary objective of the PR.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch lio/cd-region

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

level=warning msg="[linters_context] running gomodguard failed: unable to read module file go.mod: current working directory must have a go.mod file: if you are not using go modules it is suggested to disable this linter"
level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/pkg/cli/client/byoc/aws/list.go (1)

107-122: ⚠️ Potential issue | 🟠 Major

Fix loop-variable capture in the goroutine.

bucket is captured from the range loop without being passed as a parameter (line 111), so all goroutines will reference the same bucket variable which gets reassigned across iterations. Capture by shadowing the loop variable or pass it as a parameter to the goroutine.

Proposed fix
 	for bucket, bucketRegion := range buckets {
+		bucket := bucket
+		region := bucketRegion
 		wg.Add(1)
-		go func(region aws.Region) {
+		go func(bucket string, region aws.Region) {
 			defer wg.Done()
 			stacks, err := listPulumiStacksInBucket(ctx, region, bucket)
 			if err != nil {
 				return
 			}
 			for stack := range stacks {
 				select {
 				case <-ctx.Done():
 					return
 				case stackCh <- fmt.Sprintf("%s [%s]", stack, region):
 				}
 			}
-		}(bucketRegion)
+		}(bucket, region)
 	}
src/pkg/clouds/aws/ecs/common.go (1)

21-44: ⚠️ Potential issue | 🟠 Major

Fix error handling in LoadConfigForCD to avoid state mutation on failure.

The current implementation sets CDRegion even when LoadDefaultConfig returns an error (line 41 executes before the error is returned). Additionally, when CDRegion is empty, passing it as an empty string to LoadDefaultConfig doesn't trigger normal AWS region resolution—instead, it propagates the empty value.

Fall back to LoadConfigForApp when CDRegion is unset (which validates the region and handles fallback logic), then only call LoadDefaultConfig with a confirmed non-empty region. This ensures state mutations only occur on success.

💡 Proposed fix
 func (a *AwsEcs) LoadConfigForCD(ctx context.Context) (awsx.Config, error) {
-	cfg, err := aws.LoadDefaultConfig(ctx, a.CDRegion)
-	// If we don't have an region override for CD, use the current AWS region
-	if a.CDRegion == "" {
-		a.CDRegion = aws.Region(cfg.Region)
-	}
-	return cfg, err
+	if a.CDRegion == "" {
+		cfg, err := a.LoadConfigForApp(ctx)
+		if err != nil {
+			return cfg, err
+		}
+		a.CDRegion = aws.Region(cfg.Region)
+		return cfg, nil
+	}
+
+	cfg, err := aws.LoadDefaultConfig(ctx, a.CDRegion)
+	if err != nil {
+		return cfg, err
+	}
+	return cfg, nil
 }
src/pkg/clouds/aws/ecs/cfn/setup.go (1)

71-165: ⚠️ Potential issue | 🟠 Major

Recreate the CloudFormation client after CDRegion changes in updateStackAndWait.

When describeStacksAllRegions finds the stack in a different region, it updates a.CDRegion, but updateStackAndWait continues using the cfn client created with the original region. The subsequent UpdateStack call will target the wrong region. This same issue exists in createStackAndWait and the teardown method.

🔧 Suggested fix
 func (a *AwsEcsCfn) updateStackAndWait(ctx context.Context, templateBody string, parameters []cfnTypes.Parameter) error {
 	cfn, err := a.newCloudFormationClient(ctx)
 	if err != nil {
 		return err
 	}

+	initialRegion := a.CDRegion
 	// Check the template version first, to avoid updating to an outdated template; TODO: can we use StackPolicy/Conditions instead?
 	if dso, err := a.describeStacksAllRegions(ctx, cfn); err == nil && len(dso.Stacks) == 1 {
 		for _, output := range dso.Stacks[0].Outputs {
 			if *output.OutputKey == OutputsTemplateVersion {
 				deployedRev, _ := strconv.Atoi(*output.OutputValue)
 				if deployedRev > TemplateRevision {
 					return fmt.Errorf("This CLI has an older CloudFormation template than the deployed %s stack: please update the CLI", a.stackName)
 				}
 			}
 		}
 		// Set "Use previous value" for parameters not in the new parameters list
 		newParams := map[string]struct{}{}
 		for _, newParam := range parameters {
 			newParams[*newParam.ParameterKey] = struct{}{}
 		}
 		for _, param := range dso.Stacks[0].Parameters {
 			if _, ok := newParams[*param.ParameterKey]; !ok {
 				parameters = append(parameters, cfnTypes.Parameter{
 					ParameterKey:     param.ParameterKey,
 					UsePreviousValue: ptr.Bool(true),
 				})
 			}
 		}
 	}
+	if a.CDRegion != initialRegion {
+		cfn, err = a.newCloudFormationClient(ctx)
+		if err != nil {
+			return err
+		}
+	}

 	uso, err := cfn.UpdateStack(ctx, &cloudformation.UpdateStackInput{
 		Capabilities: []cfnTypes.Capability{cfnTypes.CapabilityCapabilityNamedIam},
 		Parameters:   parameters,
 		StackName:    ptr.String(a.stackName),
 		TemplateBody: ptr.String(templateBody),
 	})
🤖 Fix all issues with AI agents
In `@src/pkg/cli/client/byoc/aws/byoc.go`:
- Around line 683-686: Currently a single CloudWatch client (cwClient created
via cw.NewCloudWatchLogsClient with b.driver.CDRegion) is used for all log
queries, but CloudWatch APIs are region-scoped; create a second CloudWatch
client (e.g., appCWClient) by calling cw.NewCloudWatchLogsClient with
b.driver.AppRegion and keep the existing cwClient for CDRegion. Update any code
paths that call region-locked APIs (FilterLogEvents, StartLiveTail, and any
place that builds ARNs with makeAppLogGroupARN) to use appCWClient when the log
group ARN is for the app region and use cwClient for CD-region ARNs; follow the
existing TODO in stream.go as guidance and ensure both clients are constructed
where cwClient is currently created (reference cwClient, b.driver.CDRegion, new
appCWClient, b.driver.AppRegion, and makeAppLogGroupARN).

In `@src/pkg/cli/compose/context.go`:
- Line 268: The error message incorrectly labels resp.Status (a string like "403
Forbidden") as a "status code"; update the return in the upload handling to use
resp.StatusCode (int) in the formatted message, e.g. "Upload failed: HTTP PUT
status code %v", resp.StatusCode, or alternatively change the label to "status"
if you want to keep resp.Status; locate the return using resp in the upload
function in context.go and make the replacement accordingly.

In `@src/pkg/clouds/aws/ec2.go`:
- Around line 13-15: RevokeDefaultSecurityGroupRules calls LoadDefaultConfig
with region which may be empty (CDRegion can be initialized to ""), and
LoadDefaultConfig always applies WithRegion causing SDK to ignore env/shared
config and later error; update LoadDefaultConfig (or the call site) to guard
against empty region: only apply WithRegion when region != "" (or validate and
return an error if region is empty) so the AWS SDK can fallback to its normal
region resolution; reference the LoadDefaultConfig function and the
RevokeDefaultSecurityGroupRules call to locate the change.

In `@src/pkg/clouds/aws/ecs/cfn/setup_test.go`:
- Around line 49-68: The test creates cfnOtherRegion via New(...) but calls
cfnOtherRegion.SetUp(...) before mirroring the main cfn's config and env, so
default RetainBucket/Spot and missing DockerHub envs can trigger stack updates
and flakiness; fix by copying the relevant fields from the primary cfn to
cfnOtherRegion (e.g., set cfnOtherRegion.RetainBucket = cfn.RetainBucket,
cfnOtherRegion.Spot = cfn.Spot) and copy any DockerHub-related environment
variables used by SetUp (or otherwise apply the same config/env used when the
main cfn was created) before invoking cfnOtherRegion.SetUp(ctx, testContainers).

In `@src/pkg/clouds/aws/ecs/info.go`:
- Around line 13-16: The Info method on AwsEcs currently calls LoadConfigForCD
while Stop calls LoadConfigForApp; update Info to call LoadConfigForApp instead
of LoadConfigForCD (or, if there is a deliberate reason to use the CD config,
add a concise comment in the Info method explaining why LoadConfigForCD is
required) so the configuration-loading approach is consistent with Stop
(reference symbols: AwsEcs.Info, AwsEcs.Stop, LoadConfigForCD, LoadConfigForApp,
and AwsEcsCfn.GetInfo).

In `@src/pkg/clouds/aws/publicecr.go`:
- Around line 25-26: The CheckImageExistOnPublicECR function hardcodes
"us-east-1" when calling LoadDefaultConfig which excludes the us-west-2 public
ECR endpoint; update the implementation to support both regions or accept region
configuration: either add a region parameter (e.g., region string) to
CheckImageExistOnPublicECR and pass it into LoadDefaultConfig, or iterate over a
list ["us-east-1","us-west-2"] and try LoadDefaultConfig/describe calls against
each until one succeeds; ensure any new config option is validated and errors
are returned if both regions fail, and update references to
CheckImageExistOnPublicECR accordingly.
🧹 Nitpick comments (1)
src/cmd/cli/command/cd.go (1)

144-144: Consider documenting that --verbose now queries all regions.

The allRegions parameter in CdListFromStorage controls both querying all regions and displaying region info. By adding || global.Verbose, the --verbose flag now also triggers querying all regions—not just showing extra details for the current region.

This is likely intentional per the commit message, but users might be surprised that --verbose changes the scope of the query (potentially slower). Consider:

  1. Adding a comment here explaining the intent, or
  2. Updating the --verbose help text to clarify this behavior

Comment on lines +49 to +68
t.Run("Use other region", func(t *testing.T) {
cfnOtherRegion := New("crun-test-"+user, aws.Region("us-east-1"))
err := cfnOtherRegion.SetUp(ctx, testContainers)
if err != nil {
t.Fatal(err)
}
assert.Equal(t, cfn.Region, cfnOtherRegion.Region)
assert.Equal(t, cfn.AccountID, cfnOtherRegion.AccountID)
assert.Equal(t, cfn.BucketName, cfnOtherRegion.BucketName)
assert.Equal(t, cfn.CIRoleARN, cfnOtherRegion.CIRoleARN)
assert.Equal(t, cfn.ClusterName, cfnOtherRegion.ClusterName)
assert.Equal(t, cfn.DefaultSecurityGroupID, cfnOtherRegion.DefaultSecurityGroupID)
assert.Equal(t, cfn.LogGroupARN, cfnOtherRegion.LogGroupARN)
assert.Equal(t, cfn.RetainBucket, cfnOtherRegion.RetainBucket)
assert.Equal(t, cfn.SecurityGroupID, cfnOtherRegion.SecurityGroupID)
assert.Equal(t, cfn.Spot, cfnOtherRegion.Spot)
assert.Equal(t, cfn.SubNetID, cfnOtherRegion.SubNetID)
assert.Equal(t, cfn.TaskDefARN, cfnOtherRegion.TaskDefARN)
assert.Equal(t, cfn.VpcID, cfnOtherRegion.VpcID)
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Align cfnOtherRegion config/env before SetUp to avoid unintended stack updates.

cfnOtherRegion runs with default RetainBucket/Spot and no DockerHub envs, but SetUp uses those values to update the stack. That can both mutate the existing stack and make the equality assertions for RetainBucket/Spot flaky. Mirror the main test’s config/env (or use a read‑only path) before calling SetUp.

🛠️ Minimal fix to keep config/env consistent
 t.Run("Use other region", func(t *testing.T) {
+	t.Setenv("DEFANG_NO_CACHE", "0")
+	t.Setenv("DOCKERHUB_USERNAME", "defanglabs2")
+	t.Setenv("DOCKERHUB_ACCESS_TOKEN", "defanglabs")
 	cfnOtherRegion := New("crun-test-"+user, aws.Region("us-east-1"))
+	cfnOtherRegion.RetainBucket = cfn.RetainBucket
+	cfnOtherRegion.Spot = cfn.Spot
 	err := cfnOtherRegion.SetUp(ctx, testContainers)
 	if err != nil {
 		t.Fatal(err)
 	}
🤖 Prompt for AI Agents
In `@src/pkg/clouds/aws/ecs/cfn/setup_test.go` around lines 49 - 68, The test
creates cfnOtherRegion via New(...) but calls cfnOtherRegion.SetUp(...) before
mirroring the main cfn's config and env, so default RetainBucket/Spot and
missing DockerHub envs can trigger stack updates and flakiness; fix by copying
the relevant fields from the primary cfn to cfnOtherRegion (e.g., set
cfnOtherRegion.RetainBucket = cfn.RetainBucket, cfnOtherRegion.Spot = cfn.Spot)
and copy any DockerHub-related environment variables used by SetUp (or otherwise
apply the same config/env used when the main cfn was created) before invoking
cfnOtherRegion.SetUp(ctx, testContainers).

Comment on lines 13 to 16
func (a AwsEcs) Info(ctx context.Context, id TaskArn) (*clouds.TaskInfo, error) {
cfg, err := a.LoadConfig(ctx)
cfg, err := a.LoadConfigForCD(ctx)
if err != nil {
return nil, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find all call sites of .Info method in Go files
rg -nP '\.Info\s*\(' --type=go -C3

Repository: DefangLabs/defang

Length of output: 22820


🏁 Script executed:

#!/bin/bash
# Read the Info method definition and surrounding context
cat -n src/pkg/clouds/aws/ecs/info.go | head -30

echo "---"

# Read the call site in cfn/setup.go
cat -n src/pkg/clouds/aws/ecs/cfn/setup.go | sed -n '340,360p'

echo "---"

# Search for LoadConfigForCD and LoadConfigForApp to understand the difference
rg -n "LoadConfigFor(CD|App)" --type=go -B2 -A2

Repository: DefangLabs/defang

Length of output: 8591


🏁 Script executed:

#!/bin/bash
# Check the AwsEcs struct definition and ClusterName semantics
rg -n "ClusterName" --type=go -B2 -A2 src/pkg/clouds/aws/ecs/

echo "---"

# Check the AwsEcsCfn initialization to understand intent
rg -n "AwsEcsCfn" --type=go -B3 -A3 | head -60

Repository: DefangLabs/defang

Length of output: 9546


🏁 Script executed:

#!/bin/bash
# Check AwsEcs struct and CDRegion initialization
cat -n src/pkg/clouds/aws/ecs/common.go | head -60

echo "---"

# Verify how CDRegion is set in AwsEcsCfn
cat -n src/pkg/clouds/aws/ecs/cfn/setup.go | sed -n '40,52p'

Repository: DefangLabs/defang

Length of output: 2341


Consider using LoadConfigForApp for consistency with Stop().

Info() currently uses LoadConfigForCD while Stop() uses LoadConfigForApp for the same AwsEcs struct. Although both regions are initialized to the same value in AwsEcsCfn (line 48 of cfn/setup.go: "assume CD runs in the same region as the app"), this inconsistency is semantically confusing. Since Info() is only called from AwsEcsCfn.GetInfo() context and operates on the CD cluster, the current implementation works, but consider aligning with Stop() for clarity, or add a comment documenting why Info() requires LoadConfigForCD when both region configs are identical in practice.

🤖 Prompt for AI Agents
In `@src/pkg/clouds/aws/ecs/info.go` around lines 13 - 16, The Info method on
AwsEcs currently calls LoadConfigForCD while Stop calls LoadConfigForApp; update
Info to call LoadConfigForApp instead of LoadConfigForCD (or, if there is a
deliberate reason to use the CD config, add a concise comment in the Info method
explaining why LoadConfigForCD is required) so the configuration-loading
approach is consistent with Stop (reference symbols: AwsEcs.Info, AwsEcs.Stop,
LoadConfigForCD, LoadConfigForApp, and AwsEcsCfn.GetInfo).

Comment on lines +25 to +26
func CheckImageExistOnPublicECR(ctx context.Context, repo, tag string) (bool, error) {
cfg, err := LoadDefaultConfig(ctx, "us-east-1") // ECR Public is only in us-east-1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

AWS ECR Public registry available regions 2024 2025

💡 Result:

As of 2024–2025, Amazon ECR Public is available in only two AWS commercial regions (these are the only regions with ECR Public API endpoints):

Notable exclusions called out in AWS docs:

Citations:


Reconsider hardcoded us-east-1 region—ECR Public is also available in us-west-2.

AWS ECR Public is available in two commercial regions: us-east-1 and us-west-2. The current implementation hardcodes only us-east-1, unnecessarily limiting functionality. Consider either supporting both regions or making the region configurable to account for deployments in us-west-2.

🤖 Prompt for AI Agents
In `@src/pkg/clouds/aws/publicecr.go` around lines 25 - 26, The
CheckImageExistOnPublicECR function hardcodes "us-east-1" when calling
LoadDefaultConfig which excludes the us-west-2 public ECR endpoint; update the
implementation to support both regions or accept region configuration: either
add a region parameter (e.g., region string) to CheckImageExistOnPublicECR and
pass it into LoadDefaultConfig, or iterate over a list ["us-east-1","us-west-2"]
and try LoadDefaultConfig/describe calls against each until one succeeds; ensure
any new config option is validated and errors are returned if both regions fail,
and update references to CheckImageExistOnPublicECR accordingly.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/pkg/clouds/aws/ecs/cfn/setup.go (2)

265-276: ⚠️ Potential issue | 🟠 Major

Use cross‑region discovery in FillOutputs.

FillOutputs still calls DescribeStacks only in the current CD region. If the CD stack lives in another region, this path fails even though describeStacksAllRegions exists.

✅ Proposed fix
-	dso, err := cfn.DescribeStacks(ctx, &cloudformation.DescribeStacksInput{
-		StackName: ptr.String(a.stackName),
-	})
-	if err != nil {
-		return annotateCfnError(err)
-	}
+	dso, err := a.describeStacksAllRegions(ctx, cfn)
+	if err != nil {
+		return err
+	}

345-368: ⚠️ Potential issue | 🟠 Major

TearDown doesn't resolve cross-region stacks.

If the CD stack exists in another region, TearDown will target the wrong region and fail to delete the stack. The updateStackAndWait method correctly handles this by calling describeStacksAllRegions to discover the actual stack location before operating; TearDown should follow the same pattern.

✅ Proposed fix
 	cfn, err := a.newCloudFormationClient(ctx)
 	if err != nil {
 		return err
 	}
+	origRegion := a.CDRegion
+	if _, err := a.describeStacksAllRegions(ctx, cfn); err != nil {
+		return err
+	}
+	if a.CDRegion != origRegion {
+		cfn, err = a.newCloudFormationClient(ctx)
+		if err != nil {
+			return err
+		}
+	}
🤖 Fix all issues with AI agents
In `@src/pkg/clouds/aws/ecs/cfn/setup.go`:
- Around line 103-110: The CloudFormation client created by
a.newCloudFormationClient(ctx) is region-bound but describeStacksAllRegions(ctx,
cfn) may change a.CDRegion; after that call, detect if a.CDRegion changed and
recreate the client (call a.newCloudFormationClient(ctx) again and reassign cfn)
before any subsequent calls such as UpdateStack or the waiter, propagating and
returning any error from the second creation so all operations use the correct
region-bound client.
- Around line 63-100: The build fails because withRegion is referenced in
describeStacksAllRegions but not implemented; add a helper function named
withRegion that returns a CloudFormation per-operation option to override the
region (signature: withRegion(region string) func(*cloudformation.Options) or
the equivalent option type used by the aws-sdk-go-v2 cloudformation client),
ensure it sets the Region field to the provided region, and use that function
where describeStacksAllRegions currently passes withRegion(bucketRegion) so
DescribeStacks calls run against the target region.

Comment on lines +103 to 110
cfn, err := a.newCloudFormationClient(ctx)
if err != nil {
return err
}

// Check the template version first, to avoid updating to an outdated template; TODO: can we use StackPolicy/Conditions instead?
// TODO: should check all regions
if dso, err := cfn.DescribeStacks(ctx, &cloudformation.DescribeStacksInput{StackName: &a.stackName}); err == nil && len(dso.Stacks) == 1 {
if dso, err := a.describeStacksAllRegions(ctx, cfn); err == nil && len(dso.Stacks) == 1 {
for _, output := range dso.Stacks[0].Outputs {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

aws-sdk-go-v2 CloudFormation client region binding configuration

💡 Result:

In aws-sdk-go-v2, the CloudFormation client’s region ultimately comes from the aws.Config you pass to cloudformation.NewFromConfig, and a region is required. (pkg.go.dev)

Typical (recommended) setup: bind region when loading config

cfg, err := config.LoadDefaultConfig(ctx,
    config.WithRegion("us-west-2"),
)
if err != nil { /* handle */ }

cfn := cloudformation.NewFromConfig(cfg)

The SDK has no default region; you must provide one via config.WithRegion, AWS_REGION, or shared config files. (docs.aws.amazon.com)

Environment/shared-config setup (no code change)

Set AWS_REGION, or set region = ... in ~/.aws/config, then:

cfg, err := config.LoadDefaultConfig(ctx)
cfn := cloudformation.NewFromConfig(cfg)

(docs.aws.amazon.com)

Per-client (or per-call) region differences

There isn’t a service-client WithRegion(...) helper; instead, create a new config (or client) per region:

cfg1, _ := config.LoadDefaultConfig(ctx, config.WithRegion("us-west-2"))
cfg2, _ := config.LoadDefaultConfig(ctx, config.WithRegion("us-east-1"))

cfnWest := cloudformation.NewFromConfig(cfg1)
cfnEast := cloudformation.NewFromConfig(cfg2)

cloudformation client options include Region and mark it as required. (pkg.go.dev)

Note on WithDefaultRegion

config.WithDefaultRegion(...) exists, but config.WithRegion(...) takes precedence when resolving the final region. (pkg.go.dev)

Citations:


🏁 Script executed:

# First, let's examine the setup.go file to understand the context
find . -type f -name "setup.go" | grep -E "ecs/cfn"

Repository: DefangLabs/defang

Length of output: 98


🏁 Script executed:

# Find describeStacksAllRegions to understand what it does
rg "describeStacksAllRegions" -A 20 -B 5

Repository: DefangLabs/defang

Length of output: 3904


🏁 Script executed:

# Look for CDRegion modification in the codebase
rg "CDRegion\s*=" -B 3 -A 3

Repository: DefangLabs/defang

Length of output: 1120


🏁 Script executed:

# Examine the full setup.go file around lines 103-110 and 149
cat -n src/pkg/clouds/aws/ecs/cfn/setup.go | sed -n '95,160p'

Repository: DefangLabs/defang

Length of output: 3326


🏁 Script executed:

# Look for UpdateStack and other cfn operations after describeStacksAllRegions
rg "UpdateStack|cfn\." src/pkg/clouds/aws/ecs/cfn/setup.go | head -20

Repository: DefangLabs/defang

Length of output: 582


Recreate the CloudFormation client if CDRegion changes.

The CloudFormation client is bound to the region at creation time (line 103). However, describeStacksAllRegions (line 109) can update a.CDRegion when discovering the stack in a different region. Subsequent calls to UpdateStack (line 133) and the waiter (line 150) will then operate against the wrong region, causing failures.

The proposed fix is necessary: after describeStacksAllRegions returns, recreate the CloudFormation client if a.CDRegion has changed to ensure all subsequent operations target the correct region.

🤖 Prompt for AI Agents
In `@src/pkg/clouds/aws/ecs/cfn/setup.go` around lines 103 - 110, The
CloudFormation client created by a.newCloudFormationClient(ctx) is region-bound
but describeStacksAllRegions(ctx, cfn) may change a.CDRegion; after that call,
detect if a.CDRegion changed and recreate the client (call
a.newCloudFormationClient(ctx) again and reassign cfn) before any subsequent
calls such as UpdateStack or the waiter, propagating and returning any error
from the second creation so all operations use the correct region-bound client.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/pkg/clouds/aws/ecs/common.go (1)

21-44: ⚠️ Potential issue | 🟠 Major

Default CD region before loading config.

When CDRegion is empty, passing it to LoadDefaultConfig doesn't guarantee cfg.Region will be populated (AWS SDK resolution depends on environment/profile configuration). This can leave CDRegion empty, breaking ARN generation in MakeCdARN. Default to a.Region before loading the config, following the safer validation pattern used in LoadConfigForApp.

🔧 Suggested fix
 func (a *AwsEcs) LoadConfigForCD(ctx context.Context) (awsx.Config, error) {
-	cfg, err := aws.LoadDefaultConfig(ctx, a.CDRegion)
-	// If we don't have an region override for CD, use the current AWS region
-	if a.CDRegion == "" {
-		a.CDRegion = aws.Region(cfg.Region)
-	}
+	region := a.CDRegion
+	if region == "" {
+		region = a.Region
+	}
+	cfg, err := aws.LoadDefaultConfig(ctx, region)
+	if a.CDRegion == "" {
+		a.CDRegion = region
+	}
 	return cfg, err
 }
src/pkg/clouds/aws/ecs/cfn/setup.go (1)

157-162: ⚠️ Potential issue | 🟡 Minor

Use CDRegion in CloudFormation console links.
When CD and app regions differ, the current error message points to the wrong region.

🔧 Suggested fix
-		return fmt.Errorf("failed to update CloudFormation stack: check the CloudFormation console (https://%s.console.aws.amazon.com/cloudformation/home) for the %q stack to learn more: %w", a.AwsEcs.Region, a.stackName, err)
+		return fmt.Errorf("failed to update CloudFormation stack: check the CloudFormation console (https://%s.console.aws.amazon.com/cloudformation/home) for the %q stack to learn more: %w", a.CDRegion, a.stackName, err)
@@
-		return fmt.Errorf("failed to create CloudFormation stack: check the CloudFormation console (https://%s.console.aws.amazon.com/cloudformation/home) for the %q stack to learn more: %w", a.AwsEcs.Region, a.stackName, err)
+		return fmt.Errorf("failed to create CloudFormation stack: check the CloudFormation console (https://%s.console.aws.amazon.com/cloudformation/home) for the %q stack to learn more: %w", a.CDRegion, a.stackName, err)

Also applies to: 189-195

🤖 Fix all issues with AI agents
In `@src/pkg/cli/client/byoc/aws/byoc.go`:
- Line 43: The package-level mutable variable quotaClient can race across
concurrent deploys; move initialization off the package scope by adding a field
on the ByocAws struct (e.g., ByocAws.quotaClient) or by threading the client
through call sites, and update validateGPUResources (and any callers) to accept
the client as a parameter (or use the ByocAws field) so each request uses its
own client instance rather than mutating the package-level quotaClient.
- Around line 770-788: The function queryLogGroupsInRegion currently ignores
errsChan when evtsChan is non-nil; change it to drain errsChan and surface any
errors instead of dropping them: after calling cw.QueryLogGroups(ctx, cwClient,
start, end, limit, lgis...) read all errors from errsChan into a slice, and if
that slice is non-empty return nil and errors.Join(errs...) (rather than
returning the static stream), otherwise return cw.NewStaticLogStream(evtsChan,
func() {}), nil; reference symbols: queryLogGroupsInRegion, cw.QueryLogGroups,
errsChan, evtsChan, cw.NewStaticLogStream.

In `@src/pkg/clouds/aws/cw/stream.go`:
- Around line 147-171: MergeLiveTailStreams spawns goroutines that concurrently
assign to es.err and capture the loop variable s; fix by introducing a mutex
(e.g., errMu) to serialize writes to es.err and by passing s as a parameter to
the goroutine (go func(s LiveTailStream) { ... }(s)). Inside each goroutine call
es.pipeEvents(ctx, s) and, if it returns a non-nil error, lock errMu, set es.err
if not already set (or aggregate), then unlock; keep wg usage and closing es.ch
as-is.

"github.com/aws/aws-sdk-go-v2/service/route53"
"github.com/aws/aws-sdk-go-v2/service/s3"
"github.com/aws/aws-sdk-go-v2/service/secretsmanager"
"github.com/aws/aws-sdk-go-v2/service/servicequotas"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid package-level quotaClient mutation during deploys.
Assigning a shared client per request can race and leak config across concurrent deployments. Prefer storing it on ByocAws or passing it into validateGPUResources.

Also applies to: 213-213

🤖 Prompt for AI Agents
In `@src/pkg/cli/client/byoc/aws/byoc.go` at line 43, The package-level mutable
variable quotaClient can race across concurrent deploys; move initialization off
the package scope by adding a field on the ByocAws struct (e.g.,
ByocAws.quotaClient) or by threading the client through call sites, and update
validateGPUResources (and any callers) to accept the client as a parameter (or
use the ByocAws field) so each request uses its own client instance rather than
mutating the package-level quotaClient.

Comment on lines +770 to 788
func (b *ByocAws) queryLogGroupsInRegion(ctx context.Context, region aws.Region, start, end time.Time, follow bool, limit int32, lgis []cw.LogGroupInput) (cw.LiveTailStream, error) {
cwClient, err := cw.NewCloudWatchLogsClient(ctx, region)
if err != nil {
return nil, err
}
if follow {
return cw.QueryAndTailLogGroups(ctx, cwClient, start, end, lgis...)
}
evtsChan, errsChan := cw.QueryLogGroups(ctx, cwClient, start, end, limit, lgis...)
if evtsChan == nil {
var errs []error
for err := range errsChan {
errs = append(errs, err)
}
return nil, errors.Join(errs...)
}
// TODO: any errors from errsChan should be reported but get dropped
return cw.NewStaticLogStream(evtsChan, func() {}), nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Surface log-query errors when not following.
errsChan is currently dropped, so partial failures are silent.

💡 Suggested fix
 	evtsChan, errsChan := cw.QueryLogGroups(ctx, cwClient, start, end, limit, lgis...)
 	if evtsChan == nil {
 		var errs []error
 		for err := range errsChan {
 			errs = append(errs, err)
 		}
 		return nil, errors.Join(errs...)
 	}
-	// TODO: any errors from errsChan should be reported but get dropped
+	go func() {
+		var errs []error
+		for err := range errsChan {
+			errs = append(errs, err)
+		}
+		if len(errs) > 0 {
+			term.Warnf("Log query errors in %s: %v", region, errors.Join(errs...))
+		}
+	}()
 	return cw.NewStaticLogStream(evtsChan, func() {}), nil
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (b *ByocAws) queryLogGroupsInRegion(ctx context.Context, region aws.Region, start, end time.Time, follow bool, limit int32, lgis []cw.LogGroupInput) (cw.LiveTailStream, error) {
cwClient, err := cw.NewCloudWatchLogsClient(ctx, region)
if err != nil {
return nil, err
}
if follow {
return cw.QueryAndTailLogGroups(ctx, cwClient, start, end, lgis...)
}
evtsChan, errsChan := cw.QueryLogGroups(ctx, cwClient, start, end, limit, lgis...)
if evtsChan == nil {
var errs []error
for err := range errsChan {
errs = append(errs, err)
}
return nil, errors.Join(errs...)
}
// TODO: any errors from errsChan should be reported but get dropped
return cw.NewStaticLogStream(evtsChan, func() {}), nil
}
func (b *ByocAws) queryLogGroupsInRegion(ctx context.Context, region aws.Region, start, end time.Time, follow bool, limit int32, lgis []cw.LogGroupInput) (cw.LiveTailStream, error) {
cwClient, err := cw.NewCloudWatchLogsClient(ctx, region)
if err != nil {
return nil, err
}
if follow {
return cw.QueryAndTailLogGroups(ctx, cwClient, start, end, lgis...)
}
evtsChan, errsChan := cw.QueryLogGroups(ctx, cwClient, start, end, limit, lgis...)
if evtsChan == nil {
var errs []error
for err := range errsChan {
errs = append(errs, err)
}
return nil, errors.Join(errs...)
}
go func() {
var errs []error
for err := range errsChan {
errs = append(errs, err)
}
if len(errs) > 0 {
term.Warnf("Log query errors in %s: %v", region, errors.Join(errs...))
}
}()
return cw.NewStaticLogStream(evtsChan, func() {}), nil
}
🤖 Prompt for AI Agents
In `@src/pkg/cli/client/byoc/aws/byoc.go` around lines 770 - 788, The function
queryLogGroupsInRegion currently ignores errsChan when evtsChan is non-nil;
change it to drain errsChan and surface any errors instead of dropping them:
after calling cw.QueryLogGroups(ctx, cwClient, start, end, limit, lgis...) read
all errors from errsChan into a slice, and if that slice is non-empty return nil
and errors.Join(errs...) (rather than returning the static stream), otherwise
return cw.NewStaticLogStream(evtsChan, func() {}), nil; reference symbols:
queryLogGroupsInRegion, cw.QueryLogGroups, errsChan, evtsChan,
cw.NewStaticLogStream.

Comment on lines +147 to +171
// MergeLiveTailStreams merges multiple LiveTailStreams into one.
func MergeLiveTailStreams(streams ...LiveTailStream) LiveTailStream {
if len(streams) == 1 {
return streams[0]
}
ctx, cancel := context.WithCancel(context.Background())
es := newEventStream(func() {
cancel()
for _, s := range streams {
s.Close()
}
})
var wg sync.WaitGroup
for _, s := range streams {
wg.Add(1)
go func() {
defer wg.Done()
es.err = es.pipeEvents(ctx, s)
}()
}
go func() {
wg.Wait()
close(es.ch)
}()
return es
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's read the file to understand the context
cat -n src/pkg/clouds/aws/cw/stream.go | head -180 | tail -60

Repository: DefangLabs/defang

Length of output: 1742


🏁 Script executed:

# Get the full function and understand the types involved
cat -n src/pkg/clouds/aws/cw/stream.go | sed -n '100,200p'

Repository: DefangLabs/defang

Length of output: 2831


🏁 Script executed:

# Search for type definitions in the file
rg -A 5 "type (LiveTailStream|EventStream)" src/pkg/clouds/aws/cw/stream.go

Repository: DefangLabs/defang

Length of output: 43


🏁 Script executed:

# Check what pipeEvents returns
rg -B 2 -A 10 "func.*pipeEvents" src/pkg/clouds/aws/cw/stream.go

Repository: DefangLabs/defang

Length of output: 709


🏁 Script executed:

# Check Go version requirement
cat go.mod | head -5

Repository: DefangLabs/defang

Length of output: 99


🏁 Script executed:

# Find go.mod file
fd -n "go.mod" . | head -5

Repository: DefangLabs/defang

Length of output: 291


🏁 Script executed:

# Find go.mod file
fd "go.mod"

Repository: DefangLabs/defang

Length of output: 71


🏁 Script executed:

# Check Go version requirement
head -5 src/go.mod

Repository: DefangLabs/defang

Length of output: 131


🏁 Script executed:

# Let's also check the eventStream struct definition to understand the full context
rg -B 10 "type eventStream struct" src/pkg/clouds/aws/cw/stream.go

Repository: DefangLabs/defang

Length of output: 319


🏁 Script executed:

# Get the full eventStream struct definition
rg -A 10 "type eventStream struct" src/pkg/clouds/aws/cw/stream.go

Repository: DefangLabs/defang

Length of output: 285


Fix concurrent writes to es.err and loop variable capture in goroutine.

The es.err field is written by multiple goroutines without synchronization, creating a data race that can cause undefined behavior and lost errors. Additionally, the loop variable s is captured by the goroutine closure; while Go 1.22+ fixed per-iteration scoping, explicit parameter passing is the clearer, safer practice.

Guard error assignment with a mutex and pass s as a goroutine parameter to prevent capture issues.

🔧 Suggested fix
 func MergeLiveTailStreams(streams ...LiveTailStream) LiveTailStream {
@@
 	es := newEventStream(func() {
 		cancel()
 		for _, s := range streams {
 			s.Close()
 		}
 	})
 	var wg sync.WaitGroup
+	var errMu sync.Mutex
+	setErr := func(err error) {
+		if err == nil {
+			return
+		}
+		errMu.Lock()
+		if es.err == nil {
+			es.err = err
+		}
+		errMu.Unlock()
+	}
 	for _, s := range streams {
+		s := s
 		wg.Add(1)
-		go func() {
+		go func(s LiveTailStream) {
 			defer wg.Done()
-			es.err = es.pipeEvents(ctx, s)
-		}()
+			if err := es.pipeEvents(ctx, s); err != nil {
+				setErr(err)
+			}
+		}(s)
 	}
🤖 Prompt for AI Agents
In `@src/pkg/clouds/aws/cw/stream.go` around lines 147 - 171, MergeLiveTailStreams
spawns goroutines that concurrently assign to es.err and capture the loop
variable s; fix by introducing a mutex (e.g., errMu) to serialize writes to
es.err and by passing s as a parameter to the goroutine (go func(s
LiveTailStream) { ... }(s)). Inside each goroutine call es.pipeEvents(ctx, s)
and, if it returns a non-nil error, lock errMu, set es.err if not already set
(or aggregate), then unlock; keep wg usage and closing es.ch as-is.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/pkg/logs/log_type.go (1)

26-30: ⚠️ Potential issue | 🟡 Minor

Add LogTypeCD to AllLogTypes to keep String()/errors accurate.

String() builds composite labels from AllLogTypes, and ErrInvalidLogType prints AllLogTypes. With CD omitted, combinations like CD,RUN stringify as RUN, and error messages won’t list CD as valid.

🛠️ Proposed fix
 var AllLogTypes = []LogType{
+	LogTypeCD,
 	LogTypeRun,
 	LogTypeBuild,
 	LogTypeAll,
 }
🤖 Fix all issues with AI agents
In `@src/pkg/clouds/aws/ecs/common.go`:
- Around line 37-43: LoadConfigForCD must validate that aws.LoadDefaultConfig
actually resolved a region: after calling aws.LoadDefaultConfig(ctx, a.CDRegion)
check cfg.Region and if it's empty return a descriptive error (similar to
LoadConfigForApp) instead of silently leaving a.CDRegion empty; if cfg.Region is
present set a.CDRegion = aws.Region(cfg.Region) and return cfg, nil. Update the
same validation in the analogous block at lines 73-79; reference:
AwsEcs.LoadConfigForCD, a.CDRegion, aws.LoadDefaultConfig, and MakeCdARN to
ensure downstream ARN construction never receives an empty region.

In `@src/pkg/logs/log_type.go`:
- Around line 18-23: The LogType bit values were shifted by introducing
`LogTypeCD = 1 << iota`, altering `LogTypeRun` and `LogTypeBuild` numeric
values; revert to explicit constants (e.g., set `LogTypeCD = 1`, `LogTypeRun =
2`, `LogTypeBuild = 4`, keep `LogTypeAll = 0xFFFFFFFF`) to preserve backward
compatibility, and add `LogTypeCD` to the `AllLogTypes` slice used by `String()`
so the CD flag appears when combined with other flags (e.g., `LogTypeCD |
LogTypeRun`).
🧹 Nitpick comments (3)
src/pkg/clouds/aws/common.go (1)

23-32: Hardcoded "aws" partition may fail in GovCloud/China regions.

The ARN construction uses "aws" as the partition, but AWS GovCloud uses "aws-us-gov" and China regions use "aws-cn". If this tool is intended to support those regions, the partition should be derived from the region.

💡 Optional: Derive partition from region
+func partitionFromRegion(region Region) string {
+	r := string(region)
+	if strings.HasPrefix(r, "us-gov-") {
+		return "aws-us-gov"
+	}
+	if strings.HasPrefix(r, "cn-") {
+		return "aws-cn"
+	}
+	return "aws"
+}
+
 func MakeARN(service string, region Region, accountID, resource string) string {
 	return strings.Join([]string{
 		"arn",
-		"aws",
+		partitionFromRegion(region),
 		service,
 		string(region),
 		accountID,
 		resource,
 	}, ":")
 }
src/pkg/cli/client/byoc/aws/byoc.go (1)

764-766: Error when both CD and app log groups are empty may be overly strict.

If both cdLgis and appLgis are empty (e.g., stack not yet created, or specific log types filtered out), this returns an error. Consider whether this should be a warning that returns an empty stream instead.

💡 Consider returning empty stream instead of error
 	if len(streams) == 0 {
-		return nil, errors.New("no log groups to query")
+		term.Warn("No log groups available to query")
+		// Return an empty stream that closes immediately
+		ctx, cancel := context.WithCancel(context.Background())
+		cancel()
+		return cw.NewStaticLogStream(nil, func() {}), nil
 	}
src/pkg/clouds/aws/ecs/cfn/setup.go (1)

97-106: Consider logging errors during cross-region discovery.

When DescribeStacks fails in the loop, the error is silently ignored. While this is reasonable for "stack not found" errors, other errors (permissions, network issues) might be worth logging at debug level to aid troubleshooting.

🔧 Optional: Add debug logging for failed region checks
 	for _, bucketRegion := range buckets {
 		if bucketRegion == a.CDRegion {
 			continue // skip, already done above
 		}
-		if dso, err := cfn.DescribeStacks(ctx, input, withRegion(bucketRegion)); err == nil {
+		dso, err := cfn.DescribeStacks(ctx, input, withRegion(bucketRegion))
+		if err != nil {
+			term.Debugf("DescribeStacks in %s failed: %v", bucketRegion, err)
+			continue
+		}
+		if dso != nil {
 			a.CDRegion = bucketRegion
 			term.Debug("Reusing CloudFormation stack in region", bucketRegion)
 			return dso, nil
 		}
 	}

Comment on lines +37 to +43
func (a *AwsEcs) LoadConfigForCD(ctx context.Context) (awsx.Config, error) {
cfg, err := aws.LoadDefaultConfig(ctx, a.CDRegion)
// If we don't have an region override for CD, use the current AWS region
if a.CDRegion == "" {
a.CDRegion = aws.Region(cfg.Region)
}
return cfg, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's examine the file structure and locate the relevant functions
cd /tmp && find . -name "common.go" -path "*/aws/ecs/*" 2>/dev/null | head -5

Repository: DefangLabs/defang

Length of output: 43


🏁 Script executed:

#!/bin/bash
# Search for LoadConfigForCD, MakeCdARN, and CDRegion usage
rg -n "LoadConfigForCD|MakeCdARN|CDRegion" --type=go -B 2 -A 5 | head -100

Repository: DefangLabs/defang

Length of output: 6761


🏁 Script executed:

#!/bin/bash
# Find the aws wrapper to understand LoadDefaultConfig behavior
rg -n "func.*LoadDefaultConfig" --type=go -A 10

Repository: DefangLabs/defang

Length of output: 1785


🏁 Script executed:

#!/bin/bash
# Search for AwsEcs struct definition to see all fields
rg -n "type AwsEcs struct" --type=go -A 20

Repository: DefangLabs/defang

Length of output: 1486


🏁 Script executed:

#!/bin/bash
# Read the full common.go file to see MakeCdARN and understand the complete picture
cat -n src/pkg/clouds/aws/ecs/common.go | head -100

Repository: DefangLabs/defang

Length of output: 2575


🏁 Script executed:

#!/bin/bash
# Search for all AwsEcs struct initializations to see if CDRegion is always set
rg -n "AwsEcs\s*{" --type=go -A 15 | grep -E "AwsEcs|CDRegion"

Repository: DefangLabs/defang

Length of output: 144


🏁 Script executed:

#!/bin/bash
# Search for places where AwsEcs is created without explicit CDRegion initialization
rg -n "AwsEcs\{" --type=go -A 10

Repository: DefangLabs/defang

Length of output: 816


🏁 Script executed:

#!/bin/bash
# Find all AwsEcs initializations across the codebase
rg -n "AwsEcs\{" --type=go -B 2 -A 20 | grep -v "^--$"

Repository: DefangLabs/defang

Length of output: 1576


🏁 Script executed:

#!/bin/bash
# Also search for cases where AwsEcs might be embedded or initialized differently
rg -n "AwsEcs:" --type=go -B 5 -A 10 | head -80

Repository: DefangLabs/defang

Length of output: 1101


🏁 Script executed:

#!/bin/bash
# Search for any code paths that create AwsEcs without setting CDRegion
rg -n "new.*AwsEcs|&AwsEcs" --type=go -B 3 -A 10

Repository: DefangLabs/defang

Length of output: 888


🏁 Script executed:

#!/bin/bash
# Find the NewAwsEcsCfn function signature to see where region comes from
rg -n "func NewAwsEcsCfn" --type=go -A 5

Repository: DefangLabs/defang

Length of output: 43


🏁 Script executed:

#!/bin/bash
# Search for calls to NewAwsEcsCfn to see if region is always provided
rg -n "NewAwsEcsCfn" --type=go -B 2 -A 2

Repository: DefangLabs/defang

Length of output: 43


🏁 Script executed:

#!/bin/bash
# Check the aws.LoadDefaultConfig behavior with empty region more carefully
cat -n src/pkg/clouds/aws/common.go | sed -n '50,80p'

Repository: DefangLabs/defang

Length of output: 1109


🏁 Script executed:

#!/bin/bash
# Search for NewAwsEcsCfn definition and callers more broadly
rg -n "func.*NewAwsEcsCfn|NewAwsEcsCfn\(" --type=go

Repository: DefangLabs/defang

Length of output: 43


🏁 Script executed:

#!/bin/bash
# Search in cfn directory for the function definition
find . -path "*/aws/ecs/cfn/*" -name "*.go" -exec grep -l "NewAwsEcsCfn" {} \;

Repository: DefangLabs/defang

Length of output: 43


🏁 Script executed:

#!/bin/bash
# Check the setup.go file for the NewAwsEcsCfn function
cat -n src/pkg/clouds/aws/ecs/cfn/setup.go | head -60

Repository: DefangLabs/defang

Length of output: 2020


🏁 Script executed:

#!/bin/bash
# Search for all files in the ecs package to see if there are other places where AwsEcs is created
find src/pkg/clouds/aws/ecs -name "*.go" -type f | xargs grep -l "AwsEcs"

Repository: DefangLabs/defang

Length of output: 285


🏁 Script executed:

#!/bin/bash
# Check if AwsEcs can be instantiated in other ways or if the cfn package is the only entry point
rg -n "type.*AwsEcs|AwsEcs{" --type=go src/pkg/clouds/aws/ecs/

Repository: DefangLabs/defang

Length of output: 164


🏁 Script executed:

#!/bin/bash
# Check AWS SDK behavior: what happens to cfg.Region when empty region is passed
# Let's look at how region is used in the codebase
rg -n "cfg\.Region" --type=go -B 2 -A 2

Repository: DefangLabs/defang

Length of output: 1339


🏁 Script executed:

#!/bin/bash
# Let's look at the full aws/common.go to understand the error handling
cat -n src/pkg/clouds/aws/common.go | sed -n '35,50p'

Repository: DefangLabs/defang

Length of output: 745


🏁 Script executed:

#!/bin/bash
# Search for callers of the New function in cfn/setup.go
rg -n "cfn\.New|New\(" src/pkg/clouds/aws/ecs/cfn/ --type=go -B 2 -A 2

Repository: DefangLabs/defang

Length of output: 1851


🏁 Script executed:

#!/bin/bash
# Let's check all initialization code in the codebase for AwsEcsCfn or the New function
git grep -n "New(" src/pkg/clouds/aws/ecs/cfn/ | grep -v "NewFromConfig\|New\(stack\|New\(cfn"

Repository: DefangLabs/defang

Length of output: 84


🏁 Script executed:

#!/bin/bash
# Let's check if there are any other ways AwsEcs instances are created
rg -n "AwsEcs\{|&AwsEcs\{" --type=go -B 5 -A 10

Repository: DefangLabs/defang

Length of output: 1101


🏁 Script executed:

#!/bin/bash
# Check the actual AWS SDK behavior by looking at how the region is handled
# Let's see what happens when empty region is passed to config.WithRegion
cat -n src/pkg/clouds/aws/common.go | sed -n '54,75p'

Repository: DefangLabs/defang

Length of output: 896


🏁 Script executed:

#!/bin/bash
# Check if there are any tests that show the expected behavior
find . -name "*test.go" -path "*/aws/ecs/*" -exec grep -l "CDRegion\|LoadConfigForCD" {} \;

Repository: DefangLabs/defang

Length of output: 43


🏁 Script executed:

#!/bin/bash
# Let's search for where the New function is called from to see who provides the region
rg -n 'New\(' --type=go src/pkg/cli src/pkg/cmd | grep -i "aws\|ecs\|cfn" | head -20

Repository: DefangLabs/defang

Length of output: 2381


🏁 Script executed:

#!/bin/bash
# Search for factory functions or entry points that create AwsEcsCfn instances
rg -n "NewAwsEcsCfn|cfn.New" --type=go -B 3 -A 3

Repository: DefangLabs/defang

Length of output: 2866


🏁 Script executed:

#!/bin/bash
# Check if there's any code path where region could be empty when passed to New
rg -n "aws\.Region\(\"\"" --type=go

Repository: DefangLabs/defang

Length of output: 628


Guard against empty CDRegion from unresolved LoadDefaultConfig.

Code paths in byoc.go:136 pass empty region to New(), which initializes CDRegion to empty string with a comment "default region". However, LoadConfigForCD does not validate that cfg.Region is populated after calling LoadDefaultConfig, unlike LoadConfigForApp which explicitly errors on empty region. If LoadDefaultConfig returns empty cfg.Region, then CDRegion remains empty and propagates into MakeCdARN, creating malformed ARNs.

Add validation to ensure a region is resolved:

func (a *AwsEcs) LoadConfigForCD(ctx context.Context) (awsx.Config, error) {
	cfg, err := aws.LoadDefaultConfig(ctx, a.CDRegion)
+	if err != nil {
+		return cfg, err
+	}
	// If we don't have an region override for CD, use the current AWS region
	if a.CDRegion == "" {
		a.CDRegion = aws.Region(cfg.Region)
	}
+	if a.CDRegion == "" {
+		return cfg, errors.New("missing AWS region: set AWS_REGION or edit your AWS profile at ~/.aws/config")
+	}
	return cfg, err
}

Also applies to: 73-79

🤖 Prompt for AI Agents
In `@src/pkg/clouds/aws/ecs/common.go` around lines 37 - 43, LoadConfigForCD must
validate that aws.LoadDefaultConfig actually resolved a region: after calling
aws.LoadDefaultConfig(ctx, a.CDRegion) check cfg.Region and if it's empty return
a descriptive error (similar to LoadConfigForApp) instead of silently leaving
a.CDRegion empty; if cfg.Region is present set a.CDRegion =
aws.Region(cfg.Region) and return cfg, nil. Update the same validation in the
analogous block at lines 73-79; reference: AwsEcs.LoadConfigForCD, a.CDRegion,
aws.LoadDefaultConfig, and MakeCdARN to ensure downstream ARN construction never
receives an empty region.

Comment on lines 18 to +23
const (
_LogTypeUnused LogType = 1 << iota
LogTypeUnspecified LogType = 0
LogTypeCD LogType = 1 << iota
LogTypeRun
LogTypeBuild
LogTypeUnspecified LogType = 0
LogTypeAll LogType = 0xFFFFFFFF
LogTypeAll LogType = 0xFFFFFFFF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the current state of the file
cat -n src/pkg/logs/log_type.go | head -50

Repository: DefangLabs/defang

Length of output: 1339


🏁 Script executed:

# Check git history to see what the original values were
git log --oneline -20 src/pkg/logs/log_type.go 2>/dev/null || echo "Git history not available"

Repository: DefangLabs/defang

Length of output: 109


🏁 Script executed:

# Check for any git diff to see what changed
git diff HEAD~1 src/pkg/logs/log_type.go 2>/dev/null || echo "Git diff not available"

Repository: DefangLabs/defang

Length of output: 83


🏁 Script executed:

# Search for usages of LogType values throughout the codebase
rg -n --type=go '\bLogType(Run|Build|CD|Unspecified|All)\b' -C 2

Repository: DefangLabs/defang

Length of output: 17808


🏁 Script executed:

# Search for serialization patterns (JSON, protobuf, etc.)
rg -n --type=go '(json|proto|marshal|unmarshal|serialize|encode|decode)' -i --type=go src/pkg/logs/ -C 1

Repository: DefangLabs/defang

Length of output: 1133


🏁 Script executed:

# Search for database or file storage of log types
rg -n --type=go '(database|store|persist|save|read.*log.*type)' -i --type=go -C 2

Repository: DefangLabs/defang

Length of output: 50374


🏁 Script executed:

# Verify iota behavior by examining the const block more carefully
# and check if LogType is used in protobuf definitions
find . -name "*.proto" -type f | xargs rg -l "LogType" 2>/dev/null || echo "No proto files with LogType"

Repository: DefangLabs/defang

Length of output: 99


🏁 Script executed:

# Look for any marshaling/unmarshaling of LogType values
rg -n --type=go "LogType.*json|json.*LogType|proto.*LogType" -A 2 -B 2

Repository: DefangLabs/defang

Length of output: 828


🏁 Script executed:

# Check if LogType values are hardcoded anywhere (e.g., in tests, configs, or comparisons)
rg -n --type=go "LogType.*=.*[0-9]|== *[0-9]" | head -30

Repository: DefangLabs/defang

Length of output: 2041


🏁 Script executed:

# Look at the full log_type.go file to see String() and other methods that might reveal value usage
sed -n '49,130p' src/pkg/logs/log_type.go

Repository: DefangLabs/defang

Length of output: 1109


🏁 Script executed:

# Check git history for this file to see if LogTypeCD is new
git log --oneline --all -- src/pkg/logs/log_type.go | head -5

Repository: DefangLabs/defang

Length of output: 186


🏁 Script executed:

# Check if there's any reference to old bit values in git history
git show HEAD~1:src/pkg/logs/log_type.go 2>/dev/null | grep -A 5 "const (" | head -10 || echo "Cannot show previous version"

Repository: DefangLabs/defang

Length of output: 43


🏁 Script executed:

# Check the previous commit before the current one to see original values
git show db86cab6~1:src/pkg/logs/log_type.go 2>/dev/null | sed -n '18,30p'

Repository: DefangLabs/defang

Length of output: 43


🏁 Script executed:

# Verify what the commit "LogType bitfield" established
git show 920b0362:src/pkg/logs/log_type.go 2>/dev/null | sed -n '18,30p'

Repository: DefangLabs/defang

Length of output: 251


🏁 Script executed:

# Now check what was in the commit that added CD
git show db86cab6:src/pkg/logs/log_type.go 2>/dev/null | sed -n '18,35p'

Repository: DefangLabs/defang

Length of output: 382


Use explicit bit positions to preserve Run and Build values.

Inserting LogTypeCD = 1 << iota shifts subsequent values: Run moves from 2→4, Build from 4→8. This breaks compatibility with any persisted or serialized log type values that previously stored Run=2 or Build=4. Use explicit bit positions instead to maintain backward compatibility:

const (
	LogTypeUnspecified LogType = 0
	LogTypeCD          LogType = 1 << 0
	LogTypeRun         LogType = 1 << 1
	LogTypeBuild       LogType = 1 << 2
	LogTypeAll         LogType = 0xFFFFFFFF
)

Additionally, LogTypeCD is missing from the AllLogTypes slice, which will cause it to be omitted from the String() output when combined with other flags (e.g., LogTypeCD | LogTypeRun will only display "RUN"). Add it to the slice.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const (
_LogTypeUnused LogType = 1 << iota
LogTypeUnspecified LogType = 0
LogTypeCD LogType = 1 << iota
LogTypeRun
LogTypeBuild
LogTypeUnspecified LogType = 0
LogTypeAll LogType = 0xFFFFFFFF
LogTypeAll LogType = 0xFFFFFFFF
const (
LogTypeUnspecified LogType = 0
LogTypeCD LogType = 1 << 0
LogTypeRun LogType = 1 << 1
LogTypeBuild LogType = 1 << 2
LogTypeAll LogType = 0xFFFFFFFF
)
🤖 Prompt for AI Agents
In `@src/pkg/logs/log_type.go` around lines 18 - 23, The LogType bit values were
shifted by introducing `LogTypeCD = 1 << iota`, altering `LogTypeRun` and
`LogTypeBuild` numeric values; revert to explicit constants (e.g., set
`LogTypeCD = 1`, `LogTypeRun = 2`, `LogTypeBuild = 4`, keep `LogTypeAll =
0xFFFFFFFF`) to preserve backward compatibility, and add `LogTypeCD` to the
`AllLogTypes` slice used by `String()` so the CD flag appears when combined with
other flags (e.g., `LogTypeCD | LogTypeRun`).

@lionello lionello marked this pull request as draft February 11, 2026 17:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments