Table of Contents
- Filer Notification Webhook
- Overview
- Limitations and Considerations
- Configuration
- Configuration Parameters
- Event Types
- Webhook Payload Format
- Example Payloads
- Event Filtering
- Retry and Error Handling
- Performance Tuning
- Security Considerations
- Example Webhook Receiver (Node.js/Express)
- Example Webhook Receiver (Python/Flask)
- Troubleshooting
- Monitoring
- Related Documentation
- Use Cases
Filer Notification Webhook
The webhook notification feature allows SeaweedFS filers to send real-time file system events to external HTTP endpoints. This enables integration with external systems for monitoring, auditing, data processing pipelines, and event-driven workflows.
Overview
The webhook notification system sends HTTP POST requests containing file system event data (create, update, delete, rename operations) to a configured endpoint. The system includes features like retry logic, concurrent workers, event filtering, and authentication support.
Architecture: Push Model
Important: The webhook notification system uses a push model, where SeaweedFS actively sends events to your HTTP endpoint. This architecture is designed for low to moderate traffic scenarios where you need real-time event notifications for triggering actions.
When to Use Webhooks:
- Real-time monitoring and alerting (low volume)
- Triggering workflows on specific file events
- Audit logging for compliance
- Integration with external systems for selected directories
- Development and testing environments
When NOT to Use Webhooks:
- High-traffic production environments with thousands of events per second
- Bulk file processing operations
- Large-scale data synchronization
- Scenarios requiring guaranteed message delivery order
For High-Traffic Scenarios, consider these alternatives:
- Filer Metadata Events: Pull model using local event logs that can be consumed at your own pace
- Message Queue Systems: Use Kafka, AWS SQS, or Google Pub/Sub notification backends for scalable event processing
- Direct gRPC Subscription: Use the SubscribeMetadata RPC API for efficient streaming of metadata changes
The webhook system includes buffering (default 10,000 events) and retry logic, but under sustained high load, events may be dropped if the buffer fills up or if your webhook endpoint cannot keep up with the event rate.
Limitations and Considerations
Push Model Constraints
The webhook system operates as a push model where SeaweedFS actively pushes events to your endpoint. This has important implications:
Traffic Limitations:
- Best suited for low to moderate traffic (< 100 events/second sustained)
- Each event requires an HTTP round-trip to your endpoint
- Network latency and endpoint processing time directly impact throughput
- Under sustained high load, the internal buffer may fill up, causing events to be dropped
Delivery Guarantees:
- At-least-once delivery: With retries enabled, events may be delivered more than once (implement idempotency)
- No strict ordering: Events may arrive out of order, especially with retries
- Best-effort delivery: Events may be lost if buffer overflows or all retries fail
Blocking Behavior:
- Slow webhook endpoints can create backpressure on the filer
- Failed endpoints with max retries can delay event processing
- Consider setting reasonable timeout and retry limits
Alternatives for High-Traffic Scenarios
If you're experiencing or expecting high event volumes, consider:
- Kafka Notifications (notification.kafka): Industry-standard message queue with high throughput
- AWS SQS (notification.aws_sqs): Managed queue service with unlimited scaling
- Google Pub/Sub (notification.google_pub_sub): Managed pub/sub with global scale
- Pull-based Event Logs: Read from the /topics/.system/log directory at your own pace (see Filer Metadata Events)
- Direct gRPC Subscription: Use the SubscribeMetadata RPC for efficient streaming
Configuration
Basic Setup
Add the webhook configuration to your notification.toml file. This file should be placed in one of these locations (in descending priority):
- ./notification.toml
- $HOME/.seaweedfs/notification.toml
- /etc/seaweedfs/notification.toml
Minimal Configuration Example
[notification.webhook]
enabled = true
endpoint = "https://your-server.com/webhook"
Complete Configuration Example
[notification.webhook]
enabled = true
# Required: The HTTP endpoint to receive webhook notifications
endpoint = "https://your-server.com/webhook"
# Optional: Bearer token for authentication
bearer_token = "your-secret-token-here"
# Optional: HTTP request timeout in seconds (default: 10, range: 1-300)
timeout_seconds = 10
# Optional: Maximum number of retry attempts (default: 3, range: 0-10)
max_retries = 3
# Optional: Initial backoff delay in seconds (default: 3, range: 1-60)
backoff_seconds = 3
# Optional: Maximum backoff delay in seconds (default: 30, range: backoff_seconds-300)
max_backoff_seconds = 30
# Optional: Number of concurrent worker threads (default: 5, range: 1-100)
workers = 5
# Optional: Internal buffer size for queued events (default: 10000, range: 100-1000000)
buffer_size = 10000
# Optional: Filter by event types (if empty, all events are sent)
# Valid values: "create", "update", "delete", "rename"
event_types = ["create", "delete"]
# Optional: Filter by path prefixes (if empty, all paths are monitored)
path_prefixes = ["/important", "/data"]
Configuration Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| enabled | boolean | Yes | false | - | Enable/disable webhook notifications |
| endpoint | string | Yes | - | Valid URL | HTTP endpoint to receive webhook POST requests |
| bearer_token | string | No | "" | - | Bearer token for the Authorization header |
| timeout_seconds | integer | No | 10 | 1-300 | HTTP request timeout |
| max_retries | integer | No | 3 | 0-10 | Number of retry attempts on failure |
| backoff_seconds | integer | No | 3 | 1-60 | Initial backoff delay between retries |
| max_backoff_seconds | integer | No | 30 | backoff_seconds-300 | Maximum backoff delay (exponential backoff) |
| workers | integer | No | 5 | 1-100 | Number of concurrent worker threads |
| buffer_size | integer | No | 10000 | 100-1000000 | Internal queue buffer size |
| event_types | array | No | all | create, update, delete, rename | Filter events by type |
| path_prefixes | array | No | all | - | Filter events by path prefix |
Event Types
The webhook notification system supports four types of file system events:
1. Create Event
Triggered when a new file or directory is created.
- Detection: new_entry is present, old_entry is null
- Event Type: "create"
2. Update Event
Triggered when an existing file or directory is modified.
- Detection: Both old_entry and new_entry are present, no path change
- Event Type: "update"
3. Delete Event
Triggered when a file or directory is deleted.
- Detection: old_entry is present, new_entry is null
- Event Type: "delete"
4. Rename Event
Triggered when a file or directory is moved or renamed.
- Detection: Both old_entry and new_entry are present, and new_parent_path is specified
- Event Type: "rename"
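The four detection rules above can be expressed as a small classifier. This is an illustrative sketch based on the payload fields documented on this page (old_entry, new_entry, new_parent_path), not code from SeaweedFS itself:

```python
def classify_event(message):
    """Classify a webhook message using the detection rules above."""
    old = message.get("old_entry")
    new = message.get("new_entry")
    if old is None and new is not None:
        return "create"
    if old is not None and new is None:
        return "delete"
    if old is not None and new is not None:
        # A populated new_parent_path marks a move/rename;
        # otherwise the same path was modified in place.
        if message.get("new_parent_path"):
            return "rename"
        return "update"
    return "unknown"
```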
Webhook Payload Format
HTTP Request Details
- Method: POST
- Content-Type: application/json
- Authorization: Bearer {bearer_token} (if configured)
Payload Structure
{
"key": "/path/to/file.txt",
"event_type": "create",
"message": {
"old_entry": null,
"new_entry": {
"name": "file.txt",
"is_directory": false,
"attributes": {
"file_size": 1024,
"mtime": 1733616000,
"file_mode": 420,
"uid": 1000,
"gid": 1000,
"mime": "text/plain"
},
"chunks": [
{
"file_id": "3,01637037d6",
"offset": 0,
"size": 1024,
"mtime": 1733616000
}
]
},
"delete_chunks": false,
"new_parent_path": "",
"is_from_other_cluster": false
}
}
Example Payloads
Create File Event
{
"key": "/documents/report.pdf",
"event_type": "create",
"message": {
"old_entry": null,
"new_entry": {
"name": "report.pdf",
"is_directory": false,
"attributes": {
"file_size": 524288,
"mtime": 1733616000,
"file_mode": 420,
"uid": 1000,
"gid": 1000,
"mime": "application/pdf"
},
"chunks": [
{
"file_id": "4,023f8a9c2e",
"offset": 0,
"size": 524288,
"mtime": 1733616000
}
]
},
"delete_chunks": false,
"new_parent_path": ""
}
}
Update File Event
{
"key": "/documents/report.pdf",
"event_type": "update",
"message": {
"old_entry": {
"name": "report.pdf",
"is_directory": false,
"attributes": {
"file_size": 524288,
"mtime": 1733616000,
"file_mode": 420,
"uid": 1000,
"gid": 1000,
"mime": "application/pdf"
}
},
"new_entry": {
"name": "report.pdf",
"is_directory": false,
"attributes": {
"file_size": 612352,
"mtime": 1733617200,
"file_mode": 420,
"uid": 1000,
"gid": 1000,
"mime": "application/pdf"
},
"chunks": [
{
"file_id": "5,034b9d1a3f",
"offset": 0,
"size": 612352,
"mtime": 1733617200
}
]
},
"delete_chunks": true,
"new_parent_path": ""
}
}
Delete File Event
{
"key": "/documents/old_file.txt",
"event_type": "delete",
"message": {
"old_entry": {
"name": "old_file.txt",
"is_directory": false,
"attributes": {
"file_size": 2048,
"mtime": 1733610000,
"file_mode": 420,
"uid": 1000,
"gid": 1000,
"mime": "text/plain"
}
},
"new_entry": null,
"delete_chunks": true,
"new_parent_path": ""
}
}
Rename/Move File Event
{
"key": "/documents/old_name.txt",
"event_type": "rename",
"message": {
"old_entry": {
"name": "old_name.txt",
"is_directory": false,
"attributes": {
"file_size": 1024,
"mtime": 1733616000,
"file_mode": 420,
"uid": 1000,
"gid": 1000,
"mime": "text/plain"
}
},
"new_entry": {
"name": "new_name.txt",
"is_directory": false,
"attributes": {
"file_size": 1024,
"mtime": 1733616000,
"file_mode": 420,
"uid": 1000,
"gid": 1000,
"mime": "text/plain"
}
},
"delete_chunks": false,
"new_parent_path": "/archive"
}
}
Create Directory Event
{
"key": "/data/new_folder",
"event_type": "create",
"message": {
"old_entry": null,
"new_entry": {
"name": "new_folder",
"is_directory": true,
"attributes": {
"file_size": 0,
"mtime": 1733616000,
"file_mode": 493,
"uid": 1000,
"gid": 1000
},
"chunks": []
},
"delete_chunks": false,
"new_parent_path": ""
}
}
Event Filtering
Filter by Event Types
To receive only specific types of events, use the event_types parameter:
[notification.webhook]
enabled = true
endpoint = "https://your-server.com/webhook"
# Only receive create and delete events
event_types = ["create", "delete"]
Filter by Path Prefixes
To monitor only specific directories, use the path_prefixes parameter:
[notification.webhook]
enabled = true
endpoint = "https://your-server.com/webhook"
# Only monitor /important and /data directories
path_prefixes = ["/important", "/data"]
Combined Filtering
You can combine both filters:
[notification.webhook]
enabled = true
endpoint = "https://your-server.com/webhook"
# Only receive create/delete events from /important directory
event_types = ["create", "delete"]
path_prefixes = ["/important"]
Retry and Error Handling
The webhook system includes robust error handling:
- Automatic Retries: Failed requests are automatically retried up to max_retries times
- Exponential Backoff: Retry delays increase exponentially from backoff_seconds to max_backoff_seconds
- Dead Letter Queue: After exhausting retries, failed messages are logged for debugging
- Status Code Validation: Only 2xx status codes are considered successful
Example Retry Configuration
[notification.webhook]
enabled = true
endpoint = "https://your-server.com/webhook"
max_retries = 5
backoff_seconds = 2
max_backoff_seconds = 60
This configuration will retry up to 5 times with delays: 2s, 4s, 8s, 16s, 32s.
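The doubling-with-cap schedule can be computed as follows. This is a sketch of the behavior described above; SeaweedFS's internal retry loop may differ in details such as jitter:

```python
def backoff_delays(max_retries, backoff_seconds, max_backoff_seconds):
    """Exponential backoff: double the delay each attempt, capped at the max."""
    return [min(backoff_seconds * (2 ** i), max_backoff_seconds)
            for i in range(max_retries)]

# With the configuration above:
# backoff_delays(5, 2, 60) -> [2, 4, 8, 16, 32]
```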
Performance Tuning
⚠️ Note: While you can tune these parameters, remember that webhooks use a push model best suited for low to moderate traffic. For high-traffic scenarios (>1000 events/second sustained), consider using Kafka, message queues, or the pull-based metadata event logs instead.
Concurrent Workers
Adjust the number of concurrent workers based on your webhook endpoint's capacity:
[notification.webhook]
enabled = true
endpoint = "https://your-server.com/webhook"
workers = 10 # Increase for higher throughput
Trade-offs: More workers increase throughput but also increase the load on your webhook endpoint and network connections.
Buffer Size
Increase buffer size for high-volume environments to prevent event loss during traffic bursts:
[notification.webhook]
enabled = true
endpoint = "https://your-server.com/webhook"
buffer_size = 50000 # Handle more concurrent events
Trade-offs: Larger buffers consume more memory and may delay detection of delivery failures. If events are being generated faster than they can be delivered, increasing the buffer only delays the inevitable - you need to either increase processing capacity or switch to a pull-based model.
Security Considerations
Authentication
Use bearer token authentication to secure your webhook endpoint:
[notification.webhook]
enabled = true
endpoint = "https://your-server.com/webhook"
bearer_token = "your-secret-token-here"
The token will be sent as: Authorization: Bearer your-secret-token-here
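On the receiver side, prefer a constant-time comparison when validating the token, so that response timing does not leak how much of a guessed token matched. A minimal sketch using Python's standard library (the token value is a placeholder):

```python
import hmac

EXPECTED_TOKEN = "your-secret-token-here"  # placeholder; load from secure config

def token_valid(auth_header):
    """Check an Authorization header against the configured bearer token."""
    if not auth_header or not auth_header.startswith("Bearer "):
        return False
    presented = auth_header[len("Bearer "):]
    # hmac.compare_digest runs in time independent of where the strings differ.
    return hmac.compare_digest(presented, EXPECTED_TOKEN)
```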
HTTPS
Always use HTTPS endpoints in production:
[notification.webhook]
enabled = true
endpoint = "https://your-server.com/webhook" # Use HTTPS
Webhook Receiver Validation
Your webhook receiver should:
- Validate the bearer token (if configured)
- Validate the request content type is application/json
- Parse and validate the JSON payload structure
- Respond with 2xx status code for successful processing
- Implement idempotency to handle potential duplicate events
Example Webhook Receiver (Node.js/Express)
const express = require('express');
const app = express();
app.use(express.json());
// Stub handlers -- replace with your own processing logic
const handleCreate = (key, message) => { /* ... */ };
const handleUpdate = (key, message) => { /* ... */ };
const handleDelete = (key, message) => { /* ... */ };
const handleRename = (key, message) => { /* ... */ };
app.post('/webhook', (req, res) => {
// Validate bearer token
const authHeader = req.headers.authorization;
const expectedToken = 'Bearer your-secret-token-here';
if (authHeader !== expectedToken) {
return res.status(401).json({ error: 'Unauthorized' });
}
// Process webhook payload
const { key, event_type, message } = req.body;
console.log(`Received ${event_type} event for: ${key}`);
// Process based on event type
switch (event_type) {
case 'create':
handleCreate(key, message);
break;
case 'update':
handleUpdate(key, message);
break;
case 'delete':
handleDelete(key, message);
break;
case 'rename':
handleRename(key, message);
break;
}
// Return success response
res.status(200).json({ success: true });
});
app.listen(3000, () => {
console.log('Webhook receiver listening on port 3000');
});
Example Webhook Receiver (Python/Flask)
from flask import Flask, request, jsonify
app = Flask(__name__)
EXPECTED_TOKEN = "your-secret-token-here"
# Stub handlers -- replace with your own processing logic
def handle_create(key, message): pass
def handle_update(key, message): pass
def handle_delete(key, message): pass
def handle_rename(key, message): pass
@app.route('/webhook', methods=['POST'])
def webhook():
# Validate bearer token
auth_header = request.headers.get('Authorization')
if auth_header != f'Bearer {EXPECTED_TOKEN}':
return jsonify({'error': 'Unauthorized'}), 401
# Parse webhook payload
data = request.json
key = data.get('key')
event_type = data.get('event_type')
message = data.get('message')
print(f"Received {event_type} event for: {key}")
# Process based on event type
if event_type == 'create':
handle_create(key, message)
elif event_type == 'update':
handle_update(key, message)
elif event_type == 'delete':
handle_delete(key, message)
elif event_type == 'rename':
handle_rename(key, message)
# Return success response
return jsonify({'success': True}), 200
if __name__ == '__main__':
app.run(host='0.0.0.0', port=3000)
Troubleshooting
Enable Debug Logging
Check SeaweedFS logs for webhook-related errors:
weed filer -v=1
Common Issues
- Webhook not receiving events
  - Verify enabled = true in the configuration
  - Check the endpoint URL is correct and accessible
  - Verify the filer is using the notification.toml file
  - Check event/path filters aren't blocking events
- Authentication failures
  - Verify bearer_token matches on both sides
  - Check the Authorization header format
- Timeout errors
  - Increase timeout_seconds
  - Optimize webhook receiver response time
  - Check network connectivity
- Events being dropped
  - Increase buffer_size
  - Increase workers for higher throughput
  - Check dead letter queue logs
- Webhook endpoint receiving duplicate events
  - This is expected behavior with retry logic
  - Implement idempotency in your webhook receiver
- Consistent high latency or event backlog
  - Check if the event rate exceeds webhook capacity
  - Monitor buffer utilization in logs
  - Consider migrating to Kafka, SQS, or pull-based event logs for higher throughput
  - Webhooks are a push model designed for low to moderate traffic; sustained high traffic requires alternative architectures
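Because retries mean at-least-once delivery, a receiver can deduplicate on a key derived from the payload. The sketch below keeps seen keys in memory and uses (key, event_type, mtime) as the deduplication key; both choices are illustrative assumptions, and a production receiver would typically use Redis or a database instead:

```python
seen = set()  # in-memory only; illustrative, not durable across restarts

def dedupe_key(payload):
    """Derive a deduplication key; mtime serves as a change marker."""
    msg = payload["message"]
    entry = msg.get("new_entry") or msg.get("old_entry") or {}
    mtime = entry.get("attributes", {}).get("mtime", 0)
    return (payload["key"], payload["event_type"], mtime)

def handle_once(payload, handler):
    """Invoke handler only the first time a given event is seen."""
    k = dedupe_key(payload)
    if k in seen:
        return False  # duplicate delivery from a retry; skip it
    seen.add(k)
    handler(payload)
    return True
```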
Monitoring
Monitor webhook health through logs:
# Watch for failed webhook deliveries
weed filer -v=1 | grep webhook
# Look for dead letter queue messages (failed after all retries)
weed filer -v=1 | grep "dead letter"
Related Documentation
- Filer Metadata Events - Understanding the underlying event system
- Filer Setup - Basic filer configuration
- Notification Configuration - Example notification.toml
Use Cases
Appropriate Webhook Use Cases (Low-Moderate Traffic)
- Real-time Alerting: Send notifications to Slack, email, or monitoring systems for critical file events
- Selective Audit Logging: Track file system changes for specific sensitive directories
- Triggered Workflows: Start business processes when specific files are uploaded (e.g., invoice processing)
- Development/Test Environments: Real-time event monitoring during development
- Configuration Change Detection: Monitor configuration directories for changes
- Compliance Notifications: Alert on access or modifications to regulated data
- Backup Triggers: Trigger backups for specific critical files or directories
Consider Alternatives For (High Traffic)
- Large-scale Search Indexing: Use Kafka → Elasticsearch pipeline for high-volume indexing
- Bulk Data Processing: Use pull-based event logs or Kafka for processing thousands of files
- Content Distribution at Scale: Use Kafka or message queues for reliable high-volume sync
- Data Lake Integration: Use Kafka or direct event log consumption for streaming at scale
- High-frequency Monitoring: Use pull-based metrics or dedicated monitoring integrations
Rule of Thumb: If you're processing more than 50-100 events per second sustained, or if you have large batch operations, webhooks are likely not the right tool. Use Kafka, message queues, or pull-based event logs instead.