Improve safety for weed shell's ec.encode. (#6773)

Improve safety for weed shell's `ec.encode`.

The current process for `ec.encode` is:

1. EC shards for a volume are generated and added to a single server
2. The original volume is deleted
3. EC shards get re-balanced across the entire topology

It is then possible to lose data between steps 2 and 3, if the underlying volume storage/server/rack/DC happens to fail, for whatever reason, because at that point every shard still lives on one server and the original volume is already gone.

As a fix, this MR reworks `ec.encode` so:

* Newly created EC shards are spread across all locations for the source volume.
* Source volumes are deleted only after EC shards are converted and balanced.
2025-09-20 02:47:58 +08:00 · 2025-05-09 18:01:32 +02:00
parent 2ae5b480a6
commit 848d1f7c34
2 changed files with 63 additions and 35 deletions
--- a/weed/shell/command_ec_common.go
+++ b/weed/shell/command_ec_common.go
@@ -134,6 +134,14 @@ func NewErrorWaitGroup(maxConcurrency int) *ErrorWaitGroup {
 	}
 }

+func (ewg *ErrorWaitGroup) Reset() {
+	close(ewg.wgSem)
+
+	ewg.wg = &sync.WaitGroup{}
+	ewg.wgSem = make(chan bool, ewg.maxConcurrency)
+	ewg.errors = nil
+}
+
 func (ewg *ErrorWaitGroup) Add(f ErrorWaitGroupTask) {
 	if ewg.maxConcurrency <= 1 {
 		// Keep run order deterministic when parallelization is off