Vacuum: Reclaim disk space by removing deleted files
Erasure Coding: Convert volumes to erasure-coded format for storage efficiency
Remote Upload: Upload volumes to remote/cloud storage
Replication: Fix replication issues and maintain data consistency
Balance: Redistribute volumes across volume servers for load balancing

Workers automatically register with the admin server and receive tasks based on their capabilities and current load.

Usage

weed worker [options]

Options

Option	Default	Description
`-admin`	localhost:23646	Admin server address
`-capabilities`	vacuum,erasure_coding,balance	Comma-separated list of task types this worker can handle
`-maxConcurrent`	2	Maximum number of concurrent tasks
`-heartbeat`	30s	Heartbeat interval to admin server
`-taskInterval`	5s	Task request interval

Examples

Basic Usage

# Start worker connecting to local admin server
weed worker -admin=localhost:23646

# Connect to remote admin server
weed worker -admin=admin.example.com:23646

# Start worker with custom admin server and port
weed worker -admin=192.168.1.100:8080

Capability Configuration

# Worker that only handles vacuum tasks
weed worker -admin=localhost:23646 -capabilities=vacuum

# Worker that handles vacuum and replication tasks
weed worker -admin=localhost:23646 -capabilities=vacuum,replication

# Worker with every capability listed explicitly (the default set is vacuum,erasure_coding,balance)
weed worker -admin=localhost:23646 -capabilities=vacuum,ec,remote,replication,balance

# Worker using capability aliases
weed worker -admin=localhost:23646 -capabilities=vacuum,ec,remote,replication

Performance Tuning

# High-performance worker with more concurrent tasks
weed worker -admin=localhost:23646 -maxConcurrent=8

# More frequent task requests for busy clusters
weed worker -admin=localhost:23646 -taskInterval=2s

# Custom heartbeat interval
weed worker -admin=localhost:23646 -heartbeat=10s

Task Capabilities

Workers can be configured to handle specific types of maintenance tasks:

Available Task Types

Capability	Description
`vacuum`	Reclaim disk space by removing deleted files
`erasure_coding`	Convert volumes to erasure-coded format
`balance`	Redistribute volumes for load balancing
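When generating or validating a `-capabilities` value in deployment tooling, it can help to normalize the list first. The following is a minimal sketch, not part of `weed`: `parse_capabilities` is a hypothetical helper, the mapping of `ec` to `erasure_coding` is inferred from the alias example above, and `remote` and `replication` are accepted only because they appear in the examples even though the table lists three types.

```python
# Names the sketch accepts. The last two come from the examples on this page,
# not from the task type table, so treat them as assumptions.
KNOWN = ("vacuum", "erasure_coding", "balance", "remote", "replication")

# Assumed alias mapping, inferred from the "capability aliases" example.
ALIASES = {"ec": "erasure_coding"}


def parse_capabilities(value):
    """Split a comma-separated capability string into canonical, de-duplicated names."""
    result = []
    for raw in value.split(","):
        name = raw.strip().lower()
        if not name:
            continue  # tolerate stray commas
        name = ALIASES.get(name, name)
        if name not in KNOWN:
            raise ValueError(f"unknown capability: {raw.strip()!r}")
        if name not in result:
            result.append(name)
    return result


print(parse_capabilities("vacuum,ec,balance"))  # ['vacuum', 'erasure_coding', 'balance']
```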

Worker Architecture

Worker Lifecycle

Registration: Worker connects to admin server via gRPC
Capabilities: Worker reports its capabilities to admin
Task Request: Worker periodically requests tasks from admin
Task Execution: Worker processes assigned tasks
Heartbeat: Worker sends periodic heartbeats to admin
Graceful Shutdown: Worker completes current tasks before stopping

Connection Details

Protocol: gRPC connection to admin server
Port: Admin HTTP port + 10000 (e.g., admin on 23646 → gRPC on 33646)
Security: Supports TLS using [grpc.worker] configuration
Fallback: Falls back to an insecure connection if TLS is unavailable
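The port rule above is plain arithmetic, which makes it easy to check which port a firewall rule must allow. A minimal sketch, assuming the admin address has the `host:port` form used in the examples; `grpc_address` is a hypothetical helper and not part of `weed`.

```python
def grpc_address(admin_address, offset=10000):
    """Derive the gRPC address from an admin HTTP address of the form host:port."""
    host, sep, port = admin_address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {admin_address!r}")
    grpc_port = int(port) + offset
    if grpc_port > 65535:
        # An HTTP port above 55535 cannot be shifted by 10000.
        raise ValueError(f"derived gRPC port {grpc_port} exceeds 65535")
    return f"{host}:{grpc_port}"


print(grpc_address("localhost:23646"))  # localhost:33646
```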

Configuration

Security Configuration

Workers read TLS configuration from security.toml:

[grpc.worker]
cert = "/etc/ssl/worker.crt"
key = "/etc/ssl/worker.key"
ca = "/etc/ssl/ca.crt"
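Before starting a worker, it may be worth confirming that the three files named in `[grpc.worker]` exist and are readable. The sketch below is an illustration only: `missing_tls_files` is a hypothetical helper that takes the already-parsed `[grpc.worker]` table as a dictionary, and it does not validate the certificates themselves.

```python
import os


def missing_tls_files(worker_tls):
    """Return (key, path) pairs from [grpc.worker] that are unset or unreadable."""
    problems = []
    for key in ("cert", "key", "ca"):
        path = worker_tls.get(key, "")
        if not path or not os.path.isfile(path) or not os.access(path, os.R_OK):
            problems.append((key, path))
    return problems


# The values below mirror the security.toml example above.
worker_tls = {
    "cert": "/etc/ssl/worker.crt",
    "key": "/etc/ssl/worker.key",
    "ca": "/etc/ssl/ca.crt",
}
for key, path in missing_tls_files(worker_tls):
    print(f"{key}: {path or '(not set)'} is missing or unreadable")
```

On Python 3.11 or later, the dictionary can be read from the file with `tomllib.load(f)["grpc"]["worker"]`, where `f` is `security.toml` opened in binary mode.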

Worker Identification

Worker ID: Automatically generated unique identifier
Address: Worker's network address (auto-detected)
Capabilities: Reported task capabilities
Status: Current worker status (active, idle, busy)

Task Processing

Concurrent Task Handling

Max Concurrent: Configurable via -maxConcurrent (default: 2)
Task Queue: Workers maintain internal task queues
Load Balancing: Admin distributes tasks based on worker load
Task Completion: Workers report task completion status
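The effect of a concurrency limit such as `-maxConcurrent` can be illustrated with a bounded thread pool. This is a sketch of the general technique, not the worker's actual implementation; `run_bounded` and `fake_task` are hypothetical names, and the sleep stands in for real work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_bounded(tasks, handler, max_concurrent=2):
    """Run handler(task) for every task with at most max_concurrent in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(handler, tasks))  # results keep the input order


# Demonstration: record the highest number of handlers running at once.
lock = threading.Lock()
stats = {"running": 0, "peak": 0}


def fake_task(name):
    with lock:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
    time.sleep(0.05)  # stand-in for real work
    with lock:
        stats["running"] -= 1
    return f"{name}: done"


results = run_bounded(["vacuum-1", "vacuum-2", "balance-1", "ec-1"], fake_task)
print(results)
print("peak concurrency:", stats["peak"])  # never exceeds max_concurrent
```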

Task Request Cycle

1. Worker requests tasks from admin server
2. Admin assigns tasks based on worker capabilities and load
3. Worker processes tasks concurrently
4. Worker reports task completion/failure
5. Cycle repeats based on -taskInterval
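The cycle above can be sketched as a small simulation. Everything here is hypothetical and for illustration only: `FakeAdmin` is an in-memory stand-in for the admin server, the task dictionaries are invented, and the real exchange happens over gRPC. The batch is executed sequentially to keep the output deterministic, whereas a real worker processes tasks concurrently.

```python
import time
from collections import deque


class FakeAdmin:
    """In-memory stand-in for the admin server (hypothetical)."""

    def __init__(self, tasks):
        self.pending = deque(tasks)
        self.results = {}

    def request_task(self, capabilities, current_load, max_concurrent):
        # Hand out nothing if the worker is already at capacity.
        if current_load >= max_concurrent:
            return None
        # Otherwise assign the oldest pending task the worker can handle.
        for index, task in enumerate(self.pending):
            if task["type"] in capabilities:
                del self.pending[index]
                return task
        return None

    def report(self, task_id, succeeded):
        self.results[task_id] = "completed" if succeeded else "failed"


def run_cycles(admin, capabilities, execute, cycles,
               max_concurrent=2, task_interval=5.0, sleep=time.sleep):
    for _ in range(cycles):
        batch = []
        while True:
            task = admin.request_task(capabilities, len(batch), max_concurrent)
            if task is None:
                break
            batch.append(task)
        for task in batch:
            try:
                execute(task)
                succeeded = True
            except Exception:
                succeeded = False
            admin.report(task["id"], succeeded)
        sleep(task_interval)  # corresponds to -taskInterval


admin = FakeAdmin([
    {"id": 1, "type": "vacuum"},
    {"id": 2, "type": "balance"},
    {"id": 3, "type": "vacuum"},
])
run_cycles(admin, {"vacuum"}, execute=lambda task: None, cycles=2,
           sleep=lambda seconds: None)
print(admin.results)  # {1: 'completed', 3: 'completed'}
```

The balance task stays pending because this worker only advertises `vacuum`.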

Monitoring and Status

Worker Status

Workers report the following status information:

Worker ID: Unique identifier
Current Load: Number of active tasks
Capabilities: Supported task types
Last Heartbeat: Timestamp of last heartbeat
Tasks Completed: Total completed tasks
Tasks Failed: Total failed tasks
Uptime: Worker uptime duration
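For tooling that consumes this status information, the fields above map naturally onto a small record type. The sketch below is only one possible shape: the class name, field names, and the `failure_ratio` helper are hypothetical and do not reflect the admin server's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class WorkerStatus:
    worker_id: str
    capabilities: list
    started_at: datetime
    last_heartbeat: datetime
    current_load: int = 0
    tasks_completed: int = 0
    tasks_failed: int = 0

    def uptime(self, now):
        """Time elapsed since the worker started."""
        return now - self.started_at

    def failure_ratio(self):
        """Share of finished tasks that failed (0.0 if nothing has finished)."""
        finished = self.tasks_completed + self.tasks_failed
        return self.tasks_failed / finished if finished else 0.0


start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
now = start + timedelta(hours=2, minutes=30)
status = WorkerStatus(
    worker_id="worker-1",
    capabilities=["vacuum", "balance"],
    started_at=start,
    last_heartbeat=now - timedelta(seconds=12),
    current_load=1,
    tasks_completed=18,
    tasks_failed=2,
)
print(status.uptime(now), status.failure_ratio())  # 2:30:00 0.1
```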

Health Monitoring

Heartbeat: Periodic heartbeat to admin server
Task Timeout: Tasks have configurable timeouts
Error Reporting: Failed tasks are reported to admin
Automatic Retry: Failed tasks may be retried
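One common way to use heartbeats for health monitoring is to treat a worker as stale once several consecutive heartbeats are overdue. The sketch below illustrates that idea only; the threshold of three missed heartbeats and the `is_stale` helper are assumptions, and the admin server's actual policy is not documented here.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_heartbeat, now,
             heartbeat_interval=timedelta(seconds=30), missed_allowed=3):
    """True if more than missed_allowed heartbeat intervals have passed in silence."""
    return now - last_heartbeat > heartbeat_interval * missed_allowed


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_stale(now - timedelta(seconds=60), now))   # False
print(is_stale(now - timedelta(seconds=120), now))  # True
```

A shorter `-heartbeat` interval lets such a check react sooner, at the cost of more traffic to the admin server.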

Best Practices

Deployment

Multiple Workers: Deploy multiple workers for redundancy
Capability Specialization: Consider specialized workers for specific tasks
Resource Allocation: Ensure adequate CPU and memory for concurrent tasks
Network Connectivity: Ensure reliable connection to admin server
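A specialized fleet can be described as data and turned into command lines by a deployment script. The sketch below uses only the flags documented in the Options table; `worker_command`, the fleet layout, and the `admin.example.com` address are illustrative assumptions.

```python
import shlex


def worker_command(admin, capabilities, max_concurrent=2):
    """Build the argument list for one `weed worker` process."""
    if not capabilities:
        raise ValueError("at least one capability is required")
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return [
        "weed", "worker",
        f"-admin={admin}",
        f"-capabilities={','.join(capabilities)}",
        f"-maxConcurrent={max_concurrent}",
    ]


# Hypothetical fleet: one vacuum-focused worker and one erasure coding worker.
fleet = [
    (["vacuum"], 4),
    (["erasure_coding"], 1),
]
for capabilities, max_concurrent in fleet:
    argv = worker_command("admin.example.com:23646", capabilities, max_concurrent)
    print(shlex.join(argv))
```

The argument lists can be passed to `subprocess.Popen` directly or written into service unit files.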

Performance

Concurrent Tasks: Tune -maxConcurrent based on available resources
Task Interval: Adjust -taskInterval based on cluster activity
Heartbeat Frequency: Balance between responsiveness and overhead
Resource Monitoring: Monitor worker resource usage

Security

TLS Configuration: Use TLS for production deployments
Network Security: Secure communication between workers and admin
Access Control: Limit worker deployment to trusted systems
Certificate Management: Manage and rotate TLS certificates

Troubleshooting

Common Issues

Cannot connect to admin server:
- Verify admin server address and port
- Check network connectivity
- Ensure admin server is running
- Verify gRPC port (admin HTTP port + 10000)

No tasks received:
- Check worker capabilities match available tasks
- Verify worker registration with admin
- Check admin server logs for task assignment
- Ensure worker is not overloaded

TLS connection failures:
- Verify security.toml configuration
- Check certificate paths and permissions
- Ensure certificates are valid
- Check certificate compatibility

Task execution failures:
- Check worker logs for error details
- Verify worker has necessary permissions
- Check disk space and resources
- Ensure target volumes are accessible
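For the connectivity checks above, a plain TCP probe of the admin gRPC port quickly separates network problems from configuration problems. This is a minimal sketch: `can_connect` is a hypothetical helper, and it only confirms that something accepts TCP connections on the port, not that it speaks gRPC or that TLS is set up correctly.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# 33646 is the gRPC port for an admin server on the default HTTP port 23646.
print("admin gRPC reachable:", can_connect("127.0.0.1", 33646))
```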

Debug Information

Enable debug logging:

# Run with verbose logging
weed worker -admin=localhost:23646 -v=4

Worker Logs

Workers log important events:

Connection status to admin server
Task assignments and completion
Error conditions and failures
Heartbeat and health information

Task-Specific Information

Vacuum Tasks

Purpose: Reclaim disk space from deleted files
Requirements: Access to volume servers
Duration: Varies based on volume size and deleted data
Impact: Temporary increase in I/O during vacuum process

Erasure Coding Tasks

Purpose: Convert volumes to erasure-coded format
Requirements: Multiple volume servers for redundancy
Duration: Long-running, depends on volume size
Impact: Reduces storage requirements but increases complexity
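The storage reduction can be estimated with simple arithmetic. The sketch below assumes a Reed-Solomon layout of 10 data shards and 4 parity shards, which is an assumption not stated on this page; check the erasure coding documentation for the layout your cluster uses. `storage_overhead` is a hypothetical helper.

```python
def storage_overhead(data_shards, parity_shards):
    """Raw bytes stored per byte of user data for an erasure-coded volume."""
    if data_shards < 1 or parity_shards < 0:
        raise ValueError("need at least one data shard and no negative parity count")
    return (data_shards + parity_shards) / data_shards


# Assumed 10+4 layout, compared with keeping 2 full copies of each volume.
ec = storage_overhead(10, 4)
replicated = 2.0
print(ec)               # 1.4
print(ec < replicated)  # True
```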

Remote Upload Tasks

Purpose: Upload volumes to remote/cloud storage
Requirements: Cloud storage credentials and connectivity
Duration: Depends on volume size and upload bandwidth
Impact: Enables tiered storage and backup strategies

Replication Tasks

Purpose: Fix replication consistency issues
Requirements: Access to master and volume servers
Duration: Usually quick; depends on the replication factor
Impact: Ensures data consistency and availability

Balance Tasks

Purpose: Redistribute volumes across volume servers
Requirements: Multiple volume servers
Duration: Depends on data movement requirements
Impact: Improves cluster load distribution

Related Commands

weed admin: Start admin server that manages workers
weed master: Start master servers
weed volume: Start volume servers
weed scaffold: Generate configuration files

This is still work in progress!
