Commit Graph

12040 Commits

Author SHA1 Message Date
dependabot[bot]
735516cf0a chore(deps): bump github.com/shirou/gopsutil/v4 from 4.25.9 to 4.25.10 (#7457)
* chore(deps): bump github.com/shirou/gopsutil/v4 from 4.25.9 to 4.25.10

Bumps [github.com/shirou/gopsutil/v4](https://github.com/shirou/gopsutil) from 4.25.9 to 4.25.10.
- [Release notes](https://github.com/shirou/gopsutil/releases)
- [Commits](https://github.com/shirou/gopsutil/compare/v4.25.9...v4.25.10)

---
updated-dependencies:
- dependency-name: github.com/shirou/gopsutil/v4
  dependency-version: 4.25.10
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* go mod tidy

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: chrislu <chris.lu@gmail.com>
2025-11-10 12:29:00 -08:00
dependabot[bot]
ca8e7739be chore(deps): bump helm/chart-testing-action from 2.7.0 to 2.8.0 (#7454)
Bumps [helm/chart-testing-action](https://github.com/helm/chart-testing-action) from 2.7.0 to 2.8.0.
- [Release notes](https://github.com/helm/chart-testing-action/releases)
- [Commits](https://github.com/helm/chart-testing-action/compare/v2.7.0...v2.8.0)

---
updated-dependencies:
- dependency-name: helm/chart-testing-action
  dependency-version: 2.8.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-10 11:55:02 -08:00
dependabot[bot]
6b82a7cadc chore(deps): bump docker/metadata-action from 5.8.0 to 5.9.0 (#7456)
Bumps [docker/metadata-action](https://github.com/docker/metadata-action) from 5.8.0 to 5.9.0.
- [Release notes](https://github.com/docker/metadata-action/releases)
- [Commits](c1e51972af...318604b99e)

---
updated-dependencies:
- dependency-name: docker/metadata-action
  dependency-version: 5.9.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-10 11:54:54 -08:00
dependabot[bot]
d6a77b639a chore(deps): bump docker/setup-qemu-action from 3.6.0 to 3.7.0 (#7455)
Bumps [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action) from 3.6.0 to 3.7.0.
- [Release notes](https://github.com/docker/setup-qemu-action/releases)
- [Commits](29109295f8...c7c5346462)

---
updated-dependencies:
- dependency-name: docker/setup-qemu-action
  dependency-version: 3.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-10 11:54:45 -08:00
dependabot[bot]
d37396b420 chore(deps): bump github.com/Azure/azure-sdk-for-go/sdk/azcore from 1.19.1 to 1.20.0 (#7458)
chore(deps): bump github.com/Azure/azure-sdk-for-go/sdk/azcore

Bumps [github.com/Azure/azure-sdk-for-go/sdk/azcore](https://github.com/Azure/azure-sdk-for-go) from 1.19.1 to 1.20.0.
- [Release notes](https://github.com/Azure/azure-sdk-for-go/releases)
- [Commits](https://github.com/Azure/azure-sdk-for-go/compare/sdk/azcore/v1.19.1...sdk/azcore/v1.20.0)

---
updated-dependencies:
- dependency-name: github.com/Azure/azure-sdk-for-go/sdk/azcore
  dependency-version: 1.20.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-10 11:54:30 -08:00
dependabot[bot]
c93bf8b29d chore(deps): bump github.com/hashicorp/vault/api from 1.20.0 to 1.22.0 (#7461)
Bumps [github.com/hashicorp/vault/api](https://github.com/hashicorp/vault) from 1.20.0 to 1.22.0.
- [Release notes](https://github.com/hashicorp/vault/releases)
- [Changelog](https://github.com/hashicorp/vault/blob/main/CHANGELOG.md)
- [Commits](https://github.com/hashicorp/vault/compare/v1.20.0...api/v1.22.0)

---
updated-dependencies:
- dependency-name: github.com/hashicorp/vault/api
  dependency-version: 1.22.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-10 11:54:11 -08:00
dependabot[bot]
4a34a2290f chore(deps): bump github.com/Azure/azure-sdk-for-go/sdk/storage/azblob from 1.6.2 to 1.6.3 (#7460)
chore(deps): bump github.com/Azure/azure-sdk-for-go/sdk/storage/azblob

Bumps [github.com/Azure/azure-sdk-for-go/sdk/storage/azblob](https://github.com/Azure/azure-sdk-for-go) from 1.6.2 to 1.6.3.
- [Release notes](https://github.com/Azure/azure-sdk-for-go/releases)
- [Commits](https://github.com/Azure/azure-sdk-for-go/compare/sdk/storage/azblob/v1.6.2...sdk/storage/azblob/v1.6.3)

---
updated-dependencies:
- dependency-name: github.com/Azure/azure-sdk-for-go/sdk/storage/azblob
  dependency-version: 1.6.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-10 11:54:00 -08:00
Lisandro Pin
76e4a51964 Unify the parameter to disable dry-run on weed shell commands to -apply (instead of -force). (#7450)
* Unify the parameter to disable dry-run on weed shell commands to --apply (instead of --force).

* lint

* refactor

* Execution Order Corrected

* handle deprecated force flag

* fix help messages

* Refactoring: Using flag.FlagSet.Visit()

* consistent with other commands

* Checks for both flags

* fix toml files

---------

Co-authored-by: chrislu <chris.lu@gmail.com>
2025-11-09 19:58:38 -08:00
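The deprecated-flag handling mentioned in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: `parseApply` and its return values are hypothetical names, and only the standard `flag` package calls (`NewFlagSet`, `Bool`, `Parse`, `Visit`) are assumed. `Visit` calls back only for flags that were set on the command line, which is what lets a command detect that the old `-force` spelling was used.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseApply is a hypothetical helper: it accepts -apply and the deprecated
// -force, and reports whether the deprecated spelling was given explicitly.
func parseApply(args []string) (apply bool, usedForce bool, err error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on parse errors
	applyFlag := fs.Bool("apply", false, "apply the changes (default is dry-run)")
	forceFlag := fs.Bool("force", false, "deprecated: use -apply instead")
	if err = fs.Parse(args); err != nil {
		return false, false, err
	}
	// Visit only visits flags that were actually set, so "-force=false"
	// can be told apart from "-force was never given".
	fs.Visit(func(f *flag.Flag) {
		if f.Name == "force" {
			usedForce = true
		}
	})
	return *applyFlag || *forceFlag, usedForce, nil
}

func main() {
	apply, deprecated, err := parseApply([]string{"-force"})
	fmt.Println(apply, deprecated, err)
}
```

A caller could print a deprecation warning when the second return value is true while still honoring the old flag.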
Chris Lu
2a05af2e14 docker: fix /data ownership and permission (#7451)
* docker: fix /data ownership and permission

* chown if not owned by seaweed user

* fix github tests

* comments

* fix the unquoted variables in the case pattern matching

* Update docker/entrypoint.sh

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Update docker/entrypoint.sh

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Update entrypoint.sh

* Update entrypoint.sh

* Update docker/entrypoint.sh

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-08 01:10:33 -08:00
Lisandro Pin
5fef4145a4 Fix date string parsing bug for the SQL Engine. (#7446)
`SQLEngine.valueToTime()` always parses dates as UTC (via `time.Parse()`),
regardless of the TZ assumptions of the different date formats.
2025-11-06 12:07:29 -08:00
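The difference this commit describes can be seen with the standard library alone. The sketch below is not the SQL engine's code: `parseLocal`, `layout`, and `pst` are hypothetical names chosen for illustration. It shows that `time.Parse` treats a zone-less timestamp as UTC, while `time.ParseInLocation` interprets the same string in a caller-supplied location.

```go
package main

import (
	"fmt"
	"time"
)

// layout has no zone offset, so the parser must be told which zone to assume.
const layout = "2006-01-02 15:04:05"

// pst is a fixed UTC-8 zone, used so the sketch does not need a tz database.
var pst = time.FixedZone("UTC-8", -8*60*60)

// parseLocal is a hypothetical helper: it parses a zone-less timestamp as
// wall-clock time in loc instead of in UTC.
func parseLocal(value string, loc *time.Location) (time.Time, error) {
	return time.ParseInLocation(layout, value, loc)
}

func main() {
	const value = "2025-11-06 12:00:00"

	asUTC, err := time.Parse(layout, value) // always interpreted as UTC
	if err != nil {
		panic(err)
	}
	asLocal, err := parseLocal(value, pst) // interpreted as UTC-8 wall-clock time
	if err != nil {
		panic(err)
	}
	// The same string denotes two instants eight hours apart.
	fmt.Println(asLocal.Sub(asUTC)) // 8h0m0s
}
```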
Konstantin Lebedev
084b377f87 do delete expired entries on s3 list request (#7426)
* do delete expired entries on s3 list request
https://github.com/seaweedfs/seaweedfs/issues/6837

* disable delete expires s3 entry in filer

* pass opt allowDeleteObjectsByTTL to all servers

* delete on get and head

* add lifecycle expiration s3 tests

* fix opt allowDeleteObjectsByTTL for server

* fix test lifecycle expiration

* fix IsExpired

* fix locationPrefix for updateEntriesTTL

* fix s3tests

* resolve coderabbitai

* GetS3ExpireTime on filer

* go mod

* clear TtlSeconds for volume

* move s3 delete expired entry to filer

* filer delete meta and data

* delete unused func removeExpiredObject

* test s3 put

* test s3 put multipart

* allowDeleteObjectsByTTL by default

* fix pipeline tests

* rm duplicate SeaweedFSExpiresS3

* revert expiration tests

* fix updateTTL

* rm log

* resolve comment

* fix delete version object

* fix S3Versioning

* fix delete on FindEntry

* fix delete chunks

* fix sqlite not support concurrent writes/reads

* move deletion out of listing transaction; delete entries and empty folders

* Revert "fix sqlite not support concurrent writes/reads"

This reverts commit 5d5da14e0e.

* clearer handling on recursive empty directory deletion

* handle listing errors

* struct copying

* reuse code to delete empty folders

* use iterative approach with a queue to avoid recursive WithFilerClient calls

* to stop a gRPC stream from the client-side callback, return a specific error, e.g., io.EOF

* still issue UpdateEntry when the flag must be added

* errors join

* join path

* cleaner

* add context, sort directories by depth (deepest first) to avoid redundant checks

* batched operation, refactoring

* prevent deleting bucket

* constant

* reuse code

* more logging

* refactoring

* s3 TTL time

* Safety check

---------

Co-authored-by: chrislu <chris.lu@gmail.com>
2025-11-05 22:05:54 -08:00
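Two of the steps in this commit, walking directories with a queue instead of recursion and sorting them deepest first before checking for emptiness, can be sketched as below. This is an illustration under assumptions: `collectDirs`, `sortDeepestFirst`, and the in-memory `children` map are hypothetical stand-ins for the filer calls, which are not shown in the log.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// collectDirs walks a directory tree breadth-first with an explicit queue
// instead of recursion. children maps a directory to its subdirectories
// (a hypothetical in-memory stand-in for a filer listing).
func collectDirs(root string, children map[string][]string) []string {
	var out []string
	queue := []string{root}
	for len(queue) > 0 {
		dir := queue[0]
		queue = queue[1:]
		out = append(out, dir)
		queue = append(queue, children[dir]...)
	}
	return out
}

// sortDeepestFirst orders paths so that children come before their parents,
// which lets a parent be checked only after its subdirectories are handled.
func sortDeepestFirst(dirs []string) {
	sort.SliceStable(dirs, func(i, j int) bool {
		return strings.Count(dirs[i], "/") > strings.Count(dirs[j], "/")
	})
}

func main() {
	children := map[string][]string{
		"/buckets/b1":   {"/buckets/b1/a"},
		"/buckets/b1/a": {"/buckets/b1/a/b"},
	}
	dirs := collectDirs("/buckets/b1", children)
	sortDeepestFirst(dirs)
	fmt.Println(dirs) // [/buckets/b1/a/b /buckets/b1/a /buckets/b1]
}
```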
chrislu
cc444b1868 muted texts 2025-11-04 22:17:21 -08:00
chrislu
ca8cd631ff Update admin.css 2025-11-04 22:11:19 -08:00
chrislu
82f2c3757f muted admin UI color 2025-11-04 22:09:32 -08:00
Chris Lu
ecdbe572ca master: fix negative active volumes (#7440)
* fix negative active volumes

* address comments

* simplify
2025-11-04 21:50:04 -08:00
Federico A. Corazza
17b23f61e1 Don't make nginx the default ingress controller (#7436) 2025-11-04 13:44:29 -08:00
Lisandro Pin
f466ff1412 Nit: use time.Durations instead of constants in seconds. (#7438)
Nit: use `time.Durations` instead of constants in seconds. Makes for slightly more readable code.
2025-11-04 13:02:22 -08:00
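As a small illustration of the readability point, compare an integer constant in seconds with a `time.Duration` constant. The constant names below are hypothetical and do not come from the commit.

```go
package main

import (
	"fmt"
	"time"
)

// Before: the unit lives only in the name, and every caller has to convert.
const retryIntervalSeconds = 30

// After: the unit is part of the type and the value prints itself.
const retryInterval = 30 * time.Second

func main() {
	// The old style needs an explicit conversion at each use site.
	fmt.Println(time.Duration(retryIntervalSeconds)*time.Second == retryInterval) // true
	fmt.Println(retryInterval)                                                    // 30s
}
```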
chrislu
f4f2718ba0 adjust test 2025-11-03 16:22:20 -08:00
dependabot[bot]
ac5108c301 chore(deps): bump go.mongodb.org/mongo-driver from 1.17.4 to 1.17.6 (#7430)
Bumps [go.mongodb.org/mongo-driver](https://github.com/mongodb/mongo-go-driver) from 1.17.4 to 1.17.6.
- [Release notes](https://github.com/mongodb/mongo-go-driver/releases)
- [Commits](https://github.com/mongodb/mongo-go-driver/compare/v1.17.4...v1.17.6)

---
updated-dependencies:
- dependency-name: go.mongodb.org/mongo-driver
  dependency-version: 1.17.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2025-11-03 16:19:23 -08:00
dependabot[bot]
d592fcbe5c chore(deps): bump github.com/aws/aws-sdk-go-v2/credentials from 1.18.19 to 1.18.20 (#7432)
chore(deps): bump github.com/aws/aws-sdk-go-v2/credentials

Bumps [github.com/aws/aws-sdk-go-v2/credentials](https://github.com/aws/aws-sdk-go-v2) from 1.18.19 to 1.18.20.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/config/v1.18.20/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.18.19...config/v1.18.20)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/credentials
  dependency-version: 1.18.20
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-03 16:19:07 -08:00
Chris Lu
498ac8903f S3: prevent deleting buckets with object locking (#7434)
* prevent deleting buckets with object locking

* addressing comments

* Update s3api_bucket_handlers.go

* address comments

* early return

* refactor

* simplify

* constant

* go fmt
2025-11-03 15:27:20 -08:00
chrislu
a154ef9a0f 4.00 4.00 2025-11-03 13:39:39 -08:00
dependabot[bot]
6d00d84721 chore(deps): bump helm/kind-action from 1.12.0 to 1.13.0 (#7428)
Bumps [helm/kind-action](https://github.com/helm/kind-action) from 1.12.0 to 1.13.0.
- [Release notes](https://github.com/helm/kind-action/releases)
- [Commits](https://github.com/helm/kind-action/compare/v1.12.0...v1.13.0)

---
updated-dependencies:
- dependency-name: helm/kind-action
  dependency-version: 1.13.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-03 12:27:13 -08:00
dependabot[bot]
29255f286e chore(deps): bump cloud.google.com/go/storage from 1.57.0 to 1.57.1 (#7431)
Bumps [cloud.google.com/go/storage](https://github.com/googleapis/google-cloud-go) from 1.57.0 to 1.57.1.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
- [Commits](https://github.com/googleapis/google-cloud-go/compare/spanner/v1.57.0...storage/v1.57.1)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/storage
  dependency-version: 1.57.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-03 12:26:36 -08:00
dependabot[bot]
499ab47eaa chore(deps): bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.88.3 to 1.89.1 (#7433)
chore(deps): bump github.com/aws/aws-sdk-go-v2/service/s3

Bumps [github.com/aws/aws-sdk-go-v2/service/s3](https://github.com/aws/aws-sdk-go-v2) from 1.88.3 to 1.89.1.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/changelog-template.json)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/service/s3/v1.88.3...service/s3/v1.89.1)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/service/s3
  dependency-version: 1.89.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-03 12:26:26 -08:00
chrislu
43cdd22133 4.00 2025-11-03 09:35:32 -08:00
chrislu
20a2e672d2 4.00 2025-11-02 22:08:38 -08:00
Lisandro Pin
1668c1042b Rework collection resolution for ec.rebuild, in preparation for parallelization. (#7420)
* Rework collection resolution for `ec.rebuild`, in preparation for parallelization.

See https://github.com/seaweedfs/seaweedfs/issues/7416 .

* simplify

* Update weed/shell/command_ec_rebuild.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

---------

Co-authored-by: chrislu <chris.lu@gmail.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-02 08:54:37 -08:00
Chris Lu
69c49859fa fix go install (#7425)
fix https://github.com/seaweedfs/seaweedfs/issues/7424
2025-11-02 08:42:19 -08:00
Chris Lu
f234455b76 Filer: separate context for streaming (#7423)
* separate context for streaming

* Update weed/server/filer_server_handlers_read.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-11-01 23:25:56 -07:00
chrislu
b2fd31c08b fix volume utilization icon rendering 2025-11-01 13:42:25 -07:00
chrislu
c56a0a0ebd fix: handle 'default' collection filter in cluster volumes page
- Update matchesCollection to recognize 'default' as filter for empty collection
- Remove incorrect conversion of 'default' to empty string in handlers
- Fixes issue where ?collection=default would show all collections instead of just default collection
2025-11-01 13:08:29 -07:00
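The filter rule described in this commit might look like the sketch below. The function name `matchesCollection` comes from the commit message, but its signature and the treatment of an empty filter as "match everything" are assumptions made for illustration.

```go
package main

import "fmt"

// matchesCollection reports whether a volume's collection passes the filter.
// Signature and empty-filter behavior are assumptions, not the admin UI code.
func matchesCollection(volumeCollection, filter string) bool {
	if filter == "" {
		return true // assumed: no filter means every collection matches
	}
	if filter == "default" {
		return volumeCollection == "" // "default" names the empty collection
	}
	return volumeCollection == filter
}

func main() {
	fmt.Println(matchesCollection("", "default"))       // true
	fmt.Println(matchesCollection("photos", "default")) // false
	fmt.Println(matchesCollection("photos", "photos"))  // true
}
```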
chrislu
fb46a8a61f adjust volume server link 2025-11-01 12:40:32 -07:00
Chris Lu
bdc20d1c1e S3: load bucket object locking configuration if not found in cache (#7422)
* load bucket object locking configuration if not found in cache

* fix cache building, more specific error, add back metrics
2025-10-31 22:35:09 -07:00
Chris Lu
b7e3284fc5 S3: fix TestSignedStreamingUploadInvalidSignature test (#7421)
* Added continue statements after all state transitions in the state machine to ensure immediate state processing

* simplify

* remove redundant continue clause

* ensure wrong signature
2025-10-31 20:59:44 -07:00
Chris Lu
f096b067fd weed master add peers=none option for faster startup (#7419)
* weed master -peers=none

* single master mode only when peers is none

* refactoring

* revert duplicated code

* revert

* Update weed/command/master.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* preventing "none" passed to other components if master is not started

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-31 18:29:16 -07:00
Chris Lu
5ab49e2971 Adjust cli option (#7418)
* adjust "weed benchmark" CLI to use readOnly/writeOnly

* consistently use "-master" CLI option

* If both -readOnly and -writeOnly are specified, the current logic silently allows it with -writeOnly taking precedence. This is confusing and could lead to unexpected behavior.
2025-10-31 17:08:00 -07:00
chrislu
58acc14d2c avoid unnecessary fail fast
fix https://github.com/seaweedfs/seaweedfs/issues/7417
2025-10-31 12:49:04 -07:00
chrislu
f00ae727b7 detect ipv6 2025-10-31 11:58:10 -07:00
Chris Lu
d745e6e41d Fix masterclient vidmap race condition (#7412)
* fallback to check master

* clean up

* parsing

* refactor

* handle parse error

* return error

* avoid dup lookup

* use batch key

* dedup lookup logic

* address comments

* errors.Join(lookupErrors...)

* add a comment

* Fix: Critical data race in MasterClient vidMap

Fixes a critical data race where resetVidMap() was writing to the vidMap
pointer while other methods were reading it concurrently without synchronization.

Changes:
- Removed embedded *vidMap from MasterClient
- Added vidMapLock (sync.RWMutex) to protect vidMap pointer access
- Created safe accessor methods (GetLocations, GetDataCenter, etc.)
- Updated all direct vidMap accesses to use thread-safe methods
- Updated resetVidMap() to acquire write lock during pointer swap

The vidMap already has internal locking for its operations, but this fix
protects the vidMap pointer itself from concurrent read/write races.

Verified with: go test -race ./weed/wdclient/...

Impact:
- Prevents potential panics from concurrent pointer access
- No performance impact - uses RWMutex for read-heavy workloads
- Maintains backward compatibility through wrapper methods

* fmt

* Fix: Critical data race in MasterClient vidMap

Fixes a critical data race where resetVidMap() was writing to the vidMap
pointer while other methods were reading it concurrently without synchronization.

Changes:
- Removed embedded *vidMap from MasterClient struct
- Added vidMapLock (sync.RWMutex) to protect vidMap pointer access
- Created minimal public accessor methods for external packages:
  * GetLocations, GetLocationsClone, GetVidLocations
  * LookupFileId, LookupVolumeServerUrl
  * GetDataCenter
- Internal code directly locks and accesses vidMap (no extra indirection)
- Updated resetVidMap() to acquire write lock during pointer swap
- Updated shell/commands.go to use GetDataCenter() method

Design Philosophy:
- vidMap already has internal locking for its map operations
- This fix specifically protects the vidMap *pointer* from concurrent access
- Public methods for external callers, direct locking for internal use
- Minimizes wrapper overhead while maintaining thread safety

Verified with: go test -race ./weed/wdclient/... (passes)

Impact:
- Prevents potential panics/crashes from data races
- Minimal performance impact (RWMutex for read-heavy workload)
- Maintains full backward compatibility

* fix more concurrent access

* reduce lock scope

* Optimize vidMap locking for better concurrency

Improved locking strategy based on the understanding that:
- vidMapLock protects the vidMap pointer from concurrent swaps
- vidMap has internal locks that protect its data structures

Changes:
1. Read operations: Grab pointer with RLock, release immediately, then operate
   - Reduces lock hold time
   - Allows resetVidMap to proceed sooner
   - Methods: GetLocations, GetLocationsClone, GetVidLocations,
     LookupVolumeServerUrl, GetDataCenter

2. Write operations: Changed from Lock() to RLock()
   - RLock prevents pointer swap during operation
   - Allows concurrent readers and other writers (serialized by vidMap's lock)
   - Methods: addLocation, deleteLocation, addEcLocation, deleteEcLocation

Benefits:
- Significantly reduced lock contention
- Better concurrent performance under load
- Still prevents all race conditions

Verified with: go test -race ./weed/wdclient/... (passes)

* Further reduce lock contention in LookupVolumeIdsWithFallback

Optimized two loops that were holding RLock for extended periods:

Before:
- Held RLock during entire loop iteration
- Included string parsing and cache lookups
- Could block resetVidMap for significant time with large batches

After:
- Grab vidMap pointer with brief RLock
- Release lock immediately
- Perform all loop operations on local pointer

Impact:
- First loop: Cache check on initial volumeIds
- Second loop: Double-check after singleflight wait

Benefits:
- Minimal lock hold time (just pointer copy)
- resetVidMap no longer blocked by long loops
- Better concurrent performance with large volume ID lists
- Still thread-safe (vidMap methods have internal locks)

Verified with: go test -race ./weed/wdclient/... (passes)

* Add clarifying comments to vidMap helper functions

Added inline documentation to each helper function (addLocation, deleteLocation,
addEcLocation, deleteEcLocation) explaining the two-level locking strategy:

- RLock on vidMapLock prevents resetVidMap from swapping the pointer
- vidMap has internal locks that protect the actual map mutations
- This design provides optimal concurrency

The comments make it clear why RLock (not Lock) is correct and intentional,
preventing future confusion about the locking strategy.

* Improve encapsulation: Add shallowClone() method to vidMap

Added a shallowClone() method to vidMap to improve encapsulation and prevent
MasterClient from directly accessing vidMap's internal fields.

Changes:
1. Added vidMap.shallowClone() in vid_map.go
   - Encapsulates the shallow copy logic within vidMap
   - Makes vidMap responsible for its own state representation
   - Documented that caller is responsible for thread safety

2. Simplified resetVidMap() in masterclient.go
   - Uses tail := mc.vidMap.shallowClone() instead of manual field access
   - Cleaner, more maintainable code
   - Better adherence to encapsulation principles

Benefits:
- Improved code organization and maintainability
- vidMap internals are now properly encapsulated
- Easier to modify vidMap structure in the future
- More self-documenting code

Verified with: go test -race ./weed/wdclient/... (passes)

* Optimize locking: Reduce lock acquisitions and use helper methods

Two optimizations to further reduce lock contention and improve code consistency:

1. LookupFileIdWithFallback: Eliminated redundant lock acquisition
   - Before: Two separate locks to get vidMap and dataCenter
   - After: Single lock gets both values together
   - Benefit: 50% reduction in lock/unlock overhead for this hot path

2. KeepConnected: Use GetDataCenter() helper for consistency
   - Before: Manual lock/unlock to access DataCenter field
   - After: Use existing GetDataCenter() helper method
   - Benefit: Better encapsulation and code consistency

Impact:
- Reduced lock contention in high-traffic lookup path
- More consistent use of accessor methods throughout codebase
- Cleaner, more maintainable code

Verified with: go test -race ./weed/wdclient/... (passes)

* Refactor: Extract common locking patterns into helper methods

Eliminated code duplication by introducing two helper methods that encapsulate
the common locking patterns used throughout MasterClient:

1. getStableVidMap() - For read operations
   - Acquires lock, gets pointer, releases immediately
   - Returns stable snapshot for thread-safe reads
   - Used by: GetLocations, GetLocationsClone, GetVidLocations,
     LookupFileId, LookupVolumeServerUrl, GetDataCenter

2. withCurrentVidMap(f func(vm *vidMap)) - For write operations
   - Holds RLock during callback execution
   - Prevents pointer swap while allowing concurrent operations
   - Used by: addLocation, deleteLocation, addEcLocation, deleteEcLocation

Benefits:
- Reduced code duplication (eliminated 48 lines of repetitive locking code)
- Centralized locking logic makes it easier to understand and maintain
- Self-documenting pattern through named helper methods
- Easier to modify locking strategy in the future (single point of change)
- Improved readability - accessor methods are now one-liners

Code size reduction: ~40% fewer lines for accessor/helper methods

Verified with: go test -race ./weed/wdclient/... (passes)

* consistent

* Fix cache pointer race condition with atomic.Pointer

Use atomic.Pointer for vidMap cache field to prevent data races
during cache trimming in resetVidMap. This addresses the race condition
where concurrent GetLocations calls could read the cache pointer while
resetVidMap is modifying it during cache chain trimming.

Changes:
- Changed cache field from *vidMap to atomic.Pointer[vidMap]
- Updated all cache access to use Load() and Store() atomic operations
- Updated shallowClone, GetLocations, deleteLocation, deleteEcLocation
- Updated resetVidMap to use atomic operations for cache trimming

* Merge: Resolve conflict in deleteEcLocation - keep atomic.Pointer and fix bug

Resolved merge conflict by combining:
1. Atomic pointer access pattern (from HEAD): cache.Load()
2. Correct method call (from fix): deleteEcLocation (not deleteLocation)

Resolution:
- Before (HEAD): cachedMap.deleteLocation() - WRONG, reintroduced bug
- Before (fix): vc.cache.deleteEcLocation() - RIGHT method, old pattern
- After (merged): cachedMap.deleteEcLocation() - RIGHT method, new pattern

This preserves both improvements:
✓ Thread-safe atomic.Pointer access pattern
✓ Correct recursive call to deleteEcLocation

Verified with: go test -race ./weed/wdclient/... (passes)

* Update vid_map.go

* remove shallow clone

* simplify
2025-10-30 23:36:06 -07:00
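The two-level locking strategy described in this commit can be sketched as follows. This is a simplified illustration, not the code in `weed/wdclient`: the helper names `getStableVidMap`, `withCurrentVidMap`, and `resetVidMap` come from the commit message, while the struct fields, the `add` and `get` methods, and the reset behavior (the real one keeps the old map as a cache) are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// vidMap stands in for the real structure: it guards its own map with an
// internal lock. Field and method names are illustrative.
type vidMap struct {
	mu        sync.RWMutex
	locations map[uint32][]string
}

func newVidMap() *vidMap { return &vidMap{locations: make(map[uint32][]string)} }

func (vm *vidMap) add(vid uint32, url string) {
	vm.mu.Lock()
	defer vm.mu.Unlock()
	vm.locations[vid] = append(vm.locations[vid], url)
}

func (vm *vidMap) get(vid uint32) []string {
	vm.mu.RLock()
	defer vm.mu.RUnlock()
	return append([]string(nil), vm.locations[vid]...)
}

type masterClient struct {
	vidMapLock sync.RWMutex // protects the vidMap pointer, not its contents
	vidMap     *vidMap
}

// getStableVidMap copies the pointer under a brief read lock; the caller then
// works on that snapshot and relies on vidMap's internal locking.
func (mc *masterClient) getStableVidMap() *vidMap {
	mc.vidMapLock.RLock()
	defer mc.vidMapLock.RUnlock()
	return mc.vidMap
}

// withCurrentVidMap holds the read lock for the whole callback, so a reset
// cannot swap the pointer in the middle of an update.
func (mc *masterClient) withCurrentVidMap(f func(vm *vidMap)) {
	mc.vidMapLock.RLock()
	defer mc.vidMapLock.RUnlock()
	f(mc.vidMap)
}

// resetVidMap swaps the pointer under the write lock (simplified: the old map
// is simply dropped here).
func (mc *masterClient) resetVidMap() {
	mc.vidMapLock.Lock()
	defer mc.vidMapLock.Unlock()
	mc.vidMap = newVidMap()
}

func main() {
	mc := &masterClient{vidMap: newVidMap()}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			mc.withCurrentVidMap(func(vm *vidMap) {
				vm.add(1, fmt.Sprintf("server-%d:8080", i))
			})
			_ = mc.getStableVidMap().get(1)
			if i == 4 {
				mc.resetVidMap()
			}
		}(i)
	}
	wg.Wait()
	fmt.Println(len(mc.getStableVidMap().get(1)) <= 8) // true
}
```

Running a sketch like this with `go run -race` is one way to check that the pointer swap and the concurrent readers do not race.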
Chris Lu
9f07bca9cc Fix IPv6 host header formatting to match AWS SDK behavior (#7414)
* Add nginx reverse proxy documentation for S3 API

Fixes #7407

Add comprehensive documentation and example configuration for using
nginx as a reverse proxy with SeaweedFS S3 API while maintaining AWS
Signature V4 authentication compatibility.

Changes:
- Add docker/nginx/README.md with detailed setup guide
- Add docker/nginx/s3-example.conf with working configuration
- Update docker/nginx/proxy.conf with important S3 notes

The documentation covers:
- Critical requirements for AWS Signature V4 authentication
- Common mistakes and why they break S3 authentication
- Complete working nginx configurations
- Debugging tips and troubleshooting
- Performance tuning recommendations

* Fix IPv6 host header formatting to match AWS SDK behavior

Follow-up to PR #7403

When a default port (80 for HTTP, 443 for HTTPS) is stripped from an
IPv6 address, the square brackets should also be removed to match AWS
SDK behavior for S3 signature calculation.

Reference: https://github.com/aws/aws-sdk-go-v2/blob/main/aws/signer/internal/v4/host.go
The AWS SDK's stripPort function explicitly removes brackets when
returning an IPv6 address without a port.

Changes:
- Update extractHostHeader to strip brackets from IPv6 addresses when
  no port or default port is used
- Update test expectations to match AWS SDK behavior
- Add detailed comments explaining the AWS SDK compatibility requirement

This ensures S3 signature validation works correctly with IPv6 addresses
behind reverse proxies, matching AWS S3 canonical request format.

Fixes the issue raised in PR #7403 comment:
https://github.com/seaweedfs/seaweedfs/pull/7403#issuecomment-3471105438

* Update docker/nginx/README.md

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Revert "Merge branch 'fix-ipv6-brackets-default-port' of https://github.com/seaweedfs/seaweedfs into fix-ipv6-brackets-default-port"

This reverts commit cca3f3985f, reversing
changes made to 2b8f9de78e.

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-10-30 21:06:00 -07:00
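The host normalization rule from this commit (strip a default port, and strip the IPv6 brackets along with it) can be sketched with the standard `net` package. `normalizeHost` is a hypothetical helper written for illustration; it is not the `extractHostHeader` function from the commit, whose signature is not shown in the log.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// normalizeHost is a hypothetical helper: it drops the scheme's default port
// and, when no port remains, also drops the brackets around an IPv6 literal.
func normalizeHost(host, scheme string) string {
	defaultPort := "80"
	if scheme == "https" {
		defaultPort = "443"
	}
	h, port, err := net.SplitHostPort(host)
	if err != nil {
		// No port present: remove IPv6 brackets if there are any.
		return strings.TrimSuffix(strings.TrimPrefix(host, "["), "]")
	}
	if port == defaultPort {
		return h // SplitHostPort has already removed the brackets
	}
	return host // non-default port: keep brackets and port as they are
}

func main() {
	fmt.Println(normalizeHost("[::1]:443", "https"))     // ::1
	fmt.Println(normalizeHost("[::1]:8080", "https"))    // [::1]:8080
	fmt.Println(normalizeHost("[::1]", "http"))          // ::1
	fmt.Println(normalizeHost("example.com:80", "http")) // example.com
}
```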
Chris Lu
5810aba763 Filer: fallback to check master (#7411)
* fallback to check master

* clean up

* parsing

* refactor

* handle parse error

* return error

* avoid dup lookup

* use batch key

* dedup lookup logic

* address comments

* errors.Join(lookupErrors...)

* add a comment
2025-10-30 20:18:21 -07:00
Chris Lu
ba07b3e4c6 network: Adaptive timeout (#7410)
* server can start when no network for local dev

* fixed "superfluous response.WriteHeader call" warning

* adaptive based on last write time

* more doc

* refactoring
2025-10-30 16:43:29 -07:00
chrislu
f5a57a6463 fixed "superfluous response.WriteHeader call" warning 2025-10-30 16:13:54 -07:00
chrislu
a6da3eb770 server can start when no network for local dev 2025-10-30 16:13:54 -07:00
Guilherme Moreira Rodrigues
db35159a41 [Helm Chart] add missing apiVersion and kind in PVC templates for better compatibility with GitOps tools (#7408)
* fix: add missing apiVersion and kind in PVC templates

* fix: correct PVC template condition in SeaweedFS filer StatefulSet
2025-10-30 14:31:54 -07:00
Chris Lu
d00a2a8707 Fix S3 bucket policy ARN validation to accept AWS ARNs and simplified formats (#7409)
* Fix S3 bucket policy ARN validation to accept AWS ARNs and simplified formats

Fixes #7252

The bucket policy validation was rejecting valid AWS-style ARNs and
simplified resource formats, causing validation failures with the
error 'resource X does not match bucket X' even when they were
identical strings.

Changes:
- Updated validateResourceForBucket() to accept three formats:
  1. AWS-style ARNs: arn:aws:s3:::bucket-name[/*|/path]
  2. SeaweedFS ARNs: arn:seaweed:s3:::bucket-name[/*|/path]
  3. Simplified formats: bucket-name[/*|/path]

- Added comprehensive test coverage for all three formats
- Added specific test cases from issue #7252 to prevent regression

This ensures compatibility with standard AWS S3 bucket policies
while maintaining support for SeaweedFS-specific ARN format.

* Refactor validateResourceForBucket to reduce code duplication

Simplified the validation logic by stripping ARN prefixes first,
then performing validation on the remaining resource path.
This reduces code duplication and improves maintainability while
maintaining identical functionality.

Addresses review feedback from Gemini Code Assist.

* Use strings.CutPrefix for cleaner ARN prefix stripping

Replace strings.HasPrefix checks with strings.CutPrefix for more
idiomatic Go code. This function is available in Go 1.20+ and
provides cleaner conditional logic with the combined check and
extraction.

Addresses review feedback from Gemini Code Assist.
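
The prefix-stripping approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `stripARNPrefix` and `resourceMatchesBucket` are hypothetical names, and the matching rule (exact bucket name, or bucket name followed by `/`) is an assumption based on the three formats listed in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// stripARNPrefix removes a leading AWS-style or SeaweedFS-style S3 ARN prefix
// and returns the remaining "bucket[/path]" part. Simplified resources that
// carry no ARN prefix are returned unchanged. (Hypothetical helper.)
func stripARNPrefix(resource string) string {
	if rest, ok := strings.CutPrefix(resource, "arn:aws:s3:::"); ok {
		return rest
	}
	if rest, ok := strings.CutPrefix(resource, "arn:seaweed:s3:::"); ok {
		return rest
	}
	return resource
}

// resourceMatchesBucket reports whether the resource refers to the bucket
// itself or to a path under it. This is an assumed simplification of what
// validateResourceForBucket checks.
func resourceMatchesBucket(resource, bucket string) bool {
	rest := stripARNPrefix(resource)
	return rest == bucket || strings.HasPrefix(rest, bucket+"/")
}

func main() {
	resources := []string{
		"arn:aws:s3:::my-bucket/*",
		"arn:seaweed:s3:::my-bucket",
		"my-bucket/path/to/object",
		"arn:aws:s3:::other-bucket/*",
	}
	for _, r := range resources {
		fmt.Printf("%-32s %v\n", r, resourceMatchesBucket(r, "my-bucket"))
	}
}
```

Stripping the prefix first means the bucket comparison is written once instead of once per ARN format, which is the duplication the refactor commit mentions.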
2025-10-30 11:00:31 -07:00
Chris Lu
8a032bf57d fix add user command (#7406)
* fix add user command

* add folder /etc/seaweedfs
2025-10-29 19:41:04 -07:00
Dmitriy Pavlov
9b6b564235 Filer: Add retry mechanism for failed file deletions (#7402)
* Filer: Add retry mechanism for failed file deletions

Implement a retry queue with exponential backoff for handling transient
deletion failures, particularly when volumes are temporarily read-only.

Key features:
- Automatic retry for retryable errors (read-only volumes, network issues)
- Exponential backoff: 5min → 10min → 20min → ... (max 6 hours)
- Maximum 10 retry attempts per file before giving up
- Separate goroutine processing retry queue every minute
- Enhanced logging with retry/permanent error classification

This addresses the issue where file deletions fail when volumes are
temporarily read-only (tiered volumes, maintenance, etc.) and these
deletions were previously lost.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Update weed/filer/filer_deletion.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Filer: Add retry mechanism for failed file deletions

Implement a retry queue with exponential backoff for handling transient
deletion failures, particularly when volumes are temporarily read-only.

Key features:
- Automatic retry for retryable errors (read-only volumes, network issues)
- Exponential backoff: 5min → 10min → 20min → ... (max 6 hours)
- Maximum 10 retry attempts per file before giving up
- Separate goroutine processing retry queue every minute
- Map-based retry queue for O(1) lookups and deletions
- Enhanced logging with retry/permanent error classification
- Consistent error detail limiting (max 10 total errors logged)
- Graceful shutdown support with quit channel for both processors

This addresses the issue where file deletions fail when volumes are
temporarily read-only (tiered volumes, maintenance, etc.) and these
deletions were previously lost.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Filer: Replace magic numbers with named constants in retry processor

Replace hardcoded values with package-level constants for better
maintainability:
- DeletionRetryPollInterval (1 minute): interval for checking retry queue
- DeletionRetryBatchSize (1000): max items to process per iteration

This improves code readability and makes configuration changes easier.

* Filer: Optimize retry queue with min-heap data structure

Replace map-based retry queue with a min-heap for better scalability
and deterministic ordering.

Performance improvements:
- GetReadyItems: O(N) → O(K log N) where K is items retrieved
- AddOrUpdate: O(1) → O(log N) (acceptable trade-off)
- Early exit when checking ready items (heap top is earliest)
- No full iteration over all items while holding lock

Benefits:
- Deterministic processing order (earliest NextRetryAt first)
- Better scalability for large retry queues (thousands of items)
- Reduced lock contention duration
- Memory efficient (no separate slice reconstruction)

Implementation:
- Min-heap ordered by NextRetryAt using container/heap
- Dual index: heap for ordering + map for O(1) FileId lookups
- heap.Fix() used when updating existing items
- Comprehensive complexity documentation in comments

This addresses the performance bottleneck identified in GetReadyItems
where iterating over the entire map with a write lock could block
other goroutines in high-failure scenarios.

* Filer: Modernize heap interface and improve error handling docs

1. Replace interface{} with any in heap methods
   - Addresses modern Go style (Go 1.18+)
   - Improves code readability

2. Enhance isRetryableError documentation
   - Acknowledge string matching brittleness
   - Add comprehensive TODO for future improvements:
     * Use HTTP status codes (503, 429, etc.)
     * Implement structured error types with errors.Is/As
     * Extract gRPC status codes
     * Add error wrapping for better context
   - Document each error pattern with context
   - Add defensive check for empty error strings

Current implementation remains pragmatic for initial release while
documenting a clear path for future robustness improvements. String
matching is acceptable for now but should be replaced with structured
error checking when refactoring the deletion pipeline.
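
For illustration, a string-matching classifier of the kind described above might look like the sketch below. The function name `looksRetryable` and the exact pattern strings are assumptions; the patterns are drawn from the error categories named in this PR's commit messages, not copied from the actual isRetryableError implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// retryablePatterns lists substrings that suggest a transient failure.
// The exact strings are assumed for this sketch.
var retryablePatterns = []string{
	"read-only", "read only", // volume temporarily not writable
	"timeout", "deadline exceeded", "context canceled", // also covers "i/o timeout"
	"lookup error", "lookup failed", // volume discovery issues
	"connection refused", "broken pipe", // network errors
	"too many requests", "service unavailable", // backpressure
	"temporarily unavailable", "try again", // generic transient errors
}

// looksRetryable reports whether an error message matches a known transient
// pattern. An empty message is treated as non-retryable (defensive check).
func looksRetryable(errMsg string) bool {
	if errMsg == "" {
		return false
	}
	lower := strings.ToLower(errMsg)
	for _, p := range retryablePatterns {
		if strings.Contains(lower, p) {
			return true
		}
	}
	return false
}

func main() {
	for _, msg := range []string{
		"volume 3 is read-only",
		"dial tcp 10.0.0.1:8080: i/o timeout",
		"permission denied",
		"",
	} {
		fmt.Printf("%q retryable=%v\n", msg, looksRetryable(msg))
	}
}
```

As the commit message itself notes, substring matching is brittle: a rewording of an upstream error silently changes its classification, which is why structured errors checked with errors.Is/As are listed as the intended replacement.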

* Filer: Refactor deletion processors for better readability

Extract large callback functions into dedicated private methods to
improve code organization and maintainability.

Changes:
1. Extract processDeletionBatch method
   - Handles deletion of a batch of file IDs
   - Classifies errors (success, not found, retryable, permanent)
   - Manages retry queue additions
   - Consolidates logging logic

2. Extract processRetryBatch method
   - Handles retry attempts for previously failed deletions
   - Processes retry results and updates queue
   - Symmetric to processDeletionBatch for consistency

Benefits:
- Main loop functions (loopProcessingDeletion, loopProcessingDeletionRetry)
  are now concise and focused on orchestration
- Business logic is separated into testable methods
- Reduced nesting depth improves readability
- Easier to understand control flow at a glance
- Better separation of concerns

The refactored methods follow the single responsibility principle,
making the codebase more maintainable and easier to extend.

* Update weed/filer/filer_deletion.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Filer: Fix critical retry count bug and add comprehensive error patterns

Critical bug fixes from PR review:

1. Fix RetryCount reset bug (CRITICAL)
   - Problem: When items are re-queued via AddOrUpdate, RetryCount
     resets to 1, breaking exponential backoff
   - Solution: Add RequeueForRetry() method that preserves retry state
   - Impact: Ensures proper exponential backoff progression

2. Add overflow protection in backoff calculation
   - Check shift amount > 63 to prevent bit-shift overflow
   - Additional safety: check if delay <= 0 or > MaxRetryDelay
   - Protects against arithmetic overflow in extreme cases

3. Expand retryable error patterns
   - Added: timeout, deadline exceeded, context canceled
   - Added: lookup error/failed (volume discovery issues)
   - Added: connection refused, broken pipe (network errors)
   - Added: too many requests, service unavailable (backpressure)
   - Added: temporarily unavailable, try again (transient errors)
   - Added: i/o timeout (network timeouts)

Benefits:
- Retry mechanism now works correctly across re-queues
- More robust against edge cases and overflow
- Better coverage of transient failure scenarios
- Improved resilience in high-failure environments

Addresses feedback from CodeRabbit and Gemini Code Assist in PR #7402.

* Filer: Add persistence docs and comprehensive unit tests

Documentation improvements:

1. Document in-memory queue limitation
   - Acknowledge that retry queue is volatile (lost on restart)
   - Document trade-offs and future persistence options
   - Provide clear path for production hardening
   - Note eventual consistency through main deletion queue

Unit test coverage:

1. TestDeletionRetryQueue_AddAndRetrieve
   - Basic add/retrieve operations
   - Verify items not ready before delay elapsed

2. TestDeletionRetryQueue_ExponentialBackoff
   - Verify exponential backoff progression (5m→10m→20m→40m→80m)
   - Validate delay calculations with timing tolerance

3. TestDeletionRetryQueue_OverflowProtection
   - Test high retry counts (60+) that could cause overflow
   - Verify capping at MaxRetryDelay

4. TestDeletionRetryQueue_MaxAttemptsReached
   - Verify items discarded after MaxRetryAttempts
   - Confirm proper queue cleanup

5. TestIsRetryableError
   - Comprehensive error pattern coverage
   - Test all retryable error types (timeout, connection, lookup, etc.)
   - Verify non-retryable errors correctly identified

6. TestDeletionRetryQueue_HeapOrdering
   - Verify min-heap property maintained
   - Test items processed in NextRetryAt order
   - Validate heap.Init() integration

All tests passing. Addresses PR feedback on testing requirements.

* Filer: Add code quality improvements for deletion retry

Address PR feedback with minor optimizations:
- Add MaxLoggedErrorDetails constant (replaces magic number 10)
- Pre-allocate slices and maps in processRetryBatch for efficiency
- Improve log message formatting to use constant

These changes improve code maintainability and runtime performance
without altering functionality.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactoring retrying

* use constant

* assert

* address comment

* refactor

* address comments

* dedup

* process retried deletions

* address comment

* check in-flight items also; dedup code

* refactoring

* refactoring

* simplify

* reset heap

* more efficient

* add DeletionBatchSize as a constant; Permanent > Retryable > Success > Not Found

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: chrislu <chris.lu@gmail.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2025-10-29 18:31:23 -07:00
zuzuviewer
7e624d5355 * Fix s3 auth with proxy request (#7403)
* * Fix s3 auth with proxy request

* * 6649 Add unit test for signature v4

* address comments

* fix for tests

* ipv6

* address comments

* setting scheme

Works for both cases (direct HTTPS and behind proxy)

* trim for ipv6

* Corrected Scheme Precedence Order

* trim

* accurate

---------

Co-authored-by: chrislu <chris.lu@gmail.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2025-10-29 18:01:18 -07:00