mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2025-11-24 08:46:54 +08:00
Page:
SQL Quick Reference
Pages
AWS CLI with SeaweedFS
AWS IAM CLI
Actual Users
Admin UI
Amazon IAM API
Amazon S3 API
Applications
Async Backup
Async Filer Metadata Backup
Async Replication to Cloud
Async Replication to another Filer
Benchmark SeaweedFS as a GlusterFS replacement
Benchmarks from jinleileiking
Benchmarks
Cache Remote Storage
Choosing a Filer Store
Client Libraries
Cloud Drive Architecture
Cloud Drive Benefits
Cloud Drive Quick Setup
Cloud Monitoring
Cloud Tier
Components
Configure Remote Storage
Customize Filer Store
Data Backup
Data Structure for Large Files
Deployment to Kubernetes and Minikube
Directories and Files
Docker Compose for S3
Docker Image Registry with SeaweedFS
Environment Variables
Erasure Coding for warm storage
Error reporting to sentry
FAQ
FIO benchmark
FUSE Mount
Failover Master Server
File Operations Quick Reference
Filer Active Active cross cluster continuous synchronization
Filer Cassandra Setup
Filer Change Data Capture
Filer Commands and Operations
Filer Data Encryption
Filer JWT Use
Filer Metadata Events
Filer Redis Setup
Filer Server API
Filer Setup
Filer Store Replication
Filer Stores
Filer as a Key Large Value Store
Gateway to Remote Object Storage
Getting Started
HDFS via S3 connector
Hadoop Benchmark
Hadoop Compatible File System
Hardware
Hobbyest Tinkerer scale on premises tutorial
Home
Independent Benchmarks
Kafka to Kafka Gateway to SMQ to SQL
Keycloak Integration
Kubernetes Backups and Recovery with K8up
Large File Handling
Load Command Line Options from a file
Master Server API
Migrate to Filer Store
Mount Remote Storage
Optimization
Path Specific Configuration
Path Specific Filer Store
PostgreSQL compatible Server weed db
Production Setup
Pub Sub to SMQ to SQL
Replication
Run Blob Storage on Public Internet
Run Presto on SeaweedFS
S3 API Audit log
S3 API Benchmark
S3 API FAQ
S3 Bucket Quota
S3 CORS
S3 Conditional Operations
S3 Credentials
S3 Nginx Proxy
S3 Object Lock and Retention
S3 Object Versioning
SQL Queries on Message Queue
SQL Quick Reference
SRV Service Discovery
Seaweed Message Queue
SeaweedFS Java Client
SeaweedFS in Docker Swarm
Security Configuration
Security Overview
Server Side Encryption SSE C
Server Side Encryption SSE KMS
Server Side Encryption
Server Startup via Systemd
Store file with a Time To Live
Structured Data Lake with SMQ and SQL
Super Large Directories
System Metrics
TensorFlow with SeaweedFS
Tiered Storage
UrBackup with SeaweedFS
Use Cases
Volume Files Structure
Volume Management
Volume Server API
WebDAV
Words from SeaweedFS Users
Worker
fstab
nodejs with Seaweed S3
rclone with SeaweedFS
restic with SeaweedFS
run HBase on SeaweedFS
run Spark on SeaweedFS
s3cmd with SeaweedFS
weed shell
Clone
2
SQL Quick Reference
chrislusf edited this page 2025-09-08 02:05:25 -07:00
SQL Quick Reference
Quick reference guide for SeaweedFS SQL queries on Message Queue topics.
Commands
Start Database Server
# Basic (development)
weed db
# Production with MD5 auth
weed db -auth=md5 -users='{"admin":"secret"}' -host=0.0.0.0
# With TLS encryption
weed db -auth=md5 -users="@users.json" -tls-cert=server.crt -tls-key=server.key
Interactive CLI
# Start SQL shell
weed sql
# Execute single query
weed sql -exec="SHOW TABLES"
Authentication
Credential Formats
# JSON inline
-users='{"user1":"pass1","user2":"pass2"}'
# JSON file
-users="@/path/to/users.json"
Auth Methods
trust- No auth (dev only)md5- Hash + salt (recommended)password- Clear text (TLS required)
Client Connections
psql
# Basic connection
psql -h localhost -p 5432 -U admin -d default
# With password
PGPASSWORD=secret psql -h localhost -p 5432 -U admin -d default
# Connection string
psql "postgresql://admin:secret@localhost:5432/default"
Programming Languages
# Python
import psycopg2
conn = psycopg2.connect(host="localhost", port=5432, user="admin", password="secret", database="default")
// Java
String url = "jdbc:postgresql://localhost:5432/default";
Connection conn = DriverManager.getConnection(url, "admin", "secret");
// Go
db, err := sql.Open("postgres", "host=localhost port=5432 user=admin password=secret dbname=default sslmode=disable")
SQL Operations
Schema Commands
SHOW DATABASES; -- List MQ namespaces
USE namespace_name; -- Switch database
SHOW TABLES; -- List MQ topics
DESCRIBE table_name; -- Show table schema
-- Note: CREATE TABLE, DROP TABLE and ALTER TABLE not supported
Query Commands
-- Basic queries
SELECT * FROM events LIMIT 10;
SELECT * FROM events WHERE _ts > '2025-01-01';
SELECT * FROM events WHERE status IN ('active', 'completed');
SELECT COUNT(*) FROM events;
-- NULL checking operations
SELECT * FROM events WHERE status IS NULL;
SELECT * FROM events WHERE status IS NOT NULL;
SELECT * FROM events WHERE user_id IS NOT NULL AND status = 'active';
-- System columns (available on all tables)
SELECT _ts, _key, _source, * FROM events;
-- Aggregations (optimized)
SELECT COUNT(*) FROM events;
SELECT MIN(timestamp), MAX(timestamp) FROM events;
Time-based Queries
-- Time filtering with system timestamp column (_ts)
-- Automatic string-to-timestamp conversion for _ts system column
SELECT * FROM events
WHERE _ts >= '2025-01-01'
AND _ts < '2025-02-01'
LIMIT 1000;
-- INTERVAL expressions and timestamp arithmetic
SELECT NOW() - INTERVAL '1 hour';
SELECT * FROM events WHERE _ts >= NOW() - INTERVAL '24 hours';
SELECT * FROM events WHERE _ts >= CURRENT_TIMESTAMP - INTERVAL '1 day';
-- BETWEEN clauses with INTERVAL arithmetic
SELECT * FROM events WHERE _ts BETWEEN NOW() - INTERVAL '1 week' AND NOW();
SELECT * FROM events WHERE _ts BETWEEN '2025-01-01' AND '2025-12-31';
-- Current time functions
SELECT NOW(), CURRENT_TIMESTAMP, CURRENT_DATE FROM events LIMIT 1;
System Columns
Every topic includes these system columns:
_ts- Message timestamp (formatted timestamp, supports automatic string-to-timestamp conversion in WHERE clauses)_key- Message partition key_source- Data source ("live" or parquet file path)
Note: The _ts column supports automatic parsing of timestamp strings in WHERE clauses. Formats supported: '2025-01-01', '2025-01-01T15:30:00Z', '2025-01-01 15:30:00', etc.
NULL Value Handling
NULL Checking Operations
-- Check for NULL values
SELECT * FROM events WHERE description IS NULL;
-- Check for non-NULL values
SELECT * FROM events WHERE user_id IS NOT NULL;
-- Combine with other conditions
SELECT * FROM events
WHERE user_id IS NOT NULL
AND status = 'active'
AND _ts >= '2025-01-01';
-- Filter out records with missing data
SELECT * FROM events WHERE user_id IS NOT NULL AND description IS NOT NULL;
NULL Value Semantics
- Empty strings are treated as valid values (not NULL)
- Missing fields in records are considered NULL
- Boolean, numeric, and timestamp values are never NULL once present
- Bytes values are treated as non-NULL even if empty
- NULL values are excluded from aggregate functions like
COUNT(column_name)
Troubleshooting
Check Status
-- Verify tables exist
SHOW TABLES;
-- Check data sources
SELECT _source FROM table_name LIMIT 100;
-- Verify time range
SELECT MIN(timestamp), MAX(timestamp) FROM table_name;
Common Issues
- No data: Check
SHOW TABLESand topic names - Auth failed: Verify credentials in users file
- Timeouts: Increase
-idle-timeoutsetting - Slow queries: Add WHERE clauses and LIMIT
- NULL filtering: Use
IS NULL/IS NOT NULLinstead of= NULL/!= NULL
Debug Mode
# Enable verbose logging
export GLOG_v=2
weed db -v=2 ...
Performance Tips
-
Filter by time for large datasets:
WHERE _ts >= '2025-01-01' AND _ts < '2025-02-01' -
Use LIMIT for exploration:
SELECT * FROM events WHERE _ts > '2025-01-01' LIMIT 1000 -
Fast aggregations (basic functions only):
SELECT COUNT(*) FROM events; -- Optimized SELECT MIN(timestamp), MAX(timestamp) FROM events; -- Optimized -
Check data sources:
SELECT _source, _ts FROM events LIMIT 100; -
Filter NULL values for cleaner results:
SELECT * FROM events WHERE user_id IS NOT NULL LIMIT 100;
Limitations
Not Supported:
ORDER BY,GROUP BY,HAVINGclausesJOINoperations between tablesCREATE TABLE,DROP TABLE,ALTER TABLEstatements- Window functions and subqueries
- Complex aggregations with grouping
Links
For detailed documentation, see SQL Queries on Message Queue.
Introduction
API
Configuration
- Replication
- Store file with a Time To Live
- Failover Master Server
- Erasure coding for warm storage
- Server Startup via Systemd
- Environment Variables
Filer
- Filer Setup
- Directories and Files
- File Operations Quick Reference
- Data Structure for Large Files
- Filer Data Encryption
- Filer Commands and Operations
- Filer JWT Use
Filer Stores
- Filer Cassandra Setup
- Filer Redis Setup
- Super Large Directories
- Path-Specific Filer Store
- Choosing a Filer Store
- Customize Filer Store
Management
Advanced Filer Configurations
- Migrate to Filer Store
- Add New Filer Store
- Filer Store Replication
- Filer Active Active cross cluster continuous synchronization
- Filer as a Key-Large-Value Store
- Path Specific Configuration
- Filer Change Data Capture
FUSE Mount
WebDAV
Cloud Drive
- Cloud Drive Benefits
- Cloud Drive Architecture
- Configure Remote Storage
- Mount Remote Storage
- Cache Remote Storage
- Cloud Drive Quick Setup
- Gateway to Remote Object Storage
AWS S3 API
- S3 Credentials
- Amazon S3 API
- S3 Conditional Operations
- S3 CORS
- S3 Object Lock and Retention
- S3 Object Versioning
- AWS CLI with SeaweedFS
- s3cmd with SeaweedFS
- rclone with SeaweedFS
- restic with SeaweedFS
- nodejs with Seaweed S3
- S3 API Benchmark
- S3 API FAQ
- S3 Bucket Quota
- S3 API Audit log
- S3 Nginx Proxy
- Docker Compose for S3
Server-Side Encryption
AWS IAM
Machine Learning
HDFS
- Hadoop Compatible File System
- run Spark on SeaweedFS
- run HBase on SeaweedFS
- run Presto on SeaweedFS
- Hadoop Benchmark
- HDFS via S3 connector
Replication and Backup
- Async Replication to another Filer [Deprecated]
- Async Backup
- Async Filer Metadata Backup
- Async Replication to Cloud [Deprecated]
- Kubernetes Backups and Recovery with K8up
Metadata Change Events
Messaging
- Structured Data Lake with SMQ and SQL
- Seaweed Message Queue
- SQL Queries on Message Queue
- SQL Quick Reference
- PostgreSQL-compatible Server weed db
- Pub-Sub to SMQ to SQL
- Kafka to Kafka Gateway to SMQ to SQL
Use Cases
Operations
Advanced
- Large File Handling
- Optimization
- Volume Management
- Tiered Storage
- Cloud Tier
- Cloud Monitoring
- Load Command Line Options from a file
- SRV Service Discovery
- Volume Files Structure