2025-09-02 08:25:29 -07:00
# PostgreSQL Wire Protocol Support for SeaweedFS
## Overview
This design adds native PostgreSQL wire protocol support to SeaweedFS, enabling compatibility with all PostgreSQL clients, tools, and drivers without requiring custom implementations.
## Benefits
### Universal Compatibility
- **Standard PostgreSQL Clients**: psql, pgAdmin, Adminer, etc.
- **JDBC/ODBC Drivers**: Use standard PostgreSQL drivers
- **BI Tools**: Tableau, Power BI, Grafana, Superset with native PostgreSQL connectors
- **ORMs**: Hibernate, ActiveRecord, Django ORM, etc.
- **Programming Languages**: Native PostgreSQL libraries in Python (psycopg2), Node.js (pg), Go (lib/pq), etc.
### Enterprise Integration
- **Existing Infrastructure**: Drop-in replacement for PostgreSQL in read-only scenarios
- **Migration Path**: Easy transition from PostgreSQL-based analytics
- **Tool Ecosystem**: Leverage entire PostgreSQL ecosystem
## Architecture
```
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ PostgreSQL │ │ PostgreSQL │ │ SeaweedFS │
│ Clients │◄──►│ Protocol │◄──►│ SQL Engine │
│ (psql, etc.) │ │ Server │ │ │
└─────────────────┘ └──────────────────┘ └─────────────────┘
│
▼
┌──────────────────┐
│ Authentication │
│ & Session Mgmt │
└──────────────────┘
```
## Core Components
### 1. PostgreSQL Wire Protocol Handler
```go
// PostgreSQL message types
const (
PG_MSG_STARTUP = 0x00 // Startup message
PG_MSG_QUERY = 'Q' // Simple query
PG_MSG_PARSE = 'P' // Parse (prepared statement)
PG_MSG_BIND = 'B' // Bind parameters
PG_MSG_EXECUTE = 'E' // Execute prepared statement
PG_MSG_DESCRIBE = 'D' // Describe statement/portal
PG_MSG_CLOSE = 'C' // Close statement/portal
PG_MSG_FLUSH = 'H' // Flush
PG_MSG_SYNC = 'S' // Sync
PG_MSG_TERMINATE = 'X' // Terminate connection
PG_MSG_PASSWORD = 'p' // Password message
)
// PostgreSQL response types
const (
PG_RESP_AUTH_OK = 'R' // Authentication OK
PG_RESP_AUTH_REQ = 'R' // Authentication request
PG_RESP_BACKEND_KEY = 'K' // Backend key data
PG_RESP_PARAMETER = 'S' // Parameter status
PG_RESP_READY = 'Z' // Ready for query
PG_RESP_COMMAND = 'C' // Command complete
PG_RESP_DATA_ROW = 'D' // Data row
PG_RESP_ROW_DESC = 'T' // Row description
PG_RESP_PARSE_COMPLETE = '1' // Parse complete
PG_RESP_BIND_COMPLETE = '2' // Bind complete
PG_RESP_CLOSE_COMPLETE = '3' // Close complete
PG_RESP_ERROR = 'E' // Error response
PG_RESP_NOTICE = 'N' // Notice response
)
```
### 2. Session Management
```go
type PostgreSQLSession struct {
conn net.Conn
reader *bufio.Reader
writer *bufio.Writer
authenticated bool
username string
database string
parameters map[string]string
preparedStmts map[string]*PreparedStatement
portals map[string]*Portal
transactionState TransactionState
processID uint32
secretKey uint32
}
type PreparedStatement struct {
name string
query string
paramTypes []uint32
fields []FieldDescription
}
type Portal struct {
name string
statement string
parameters [][]byte
suspended bool
}
```
### 3. SQL Translation Layer
```go
type PostgreSQLTranslator struct {
dialectMap map[string]string
}
// Translates PostgreSQL-specific SQL to SeaweedFS SQL
func (t *PostgreSQLTranslator) TranslateQuery(pgSQL string) (string, error) {
// Handle PostgreSQL-specific syntax:
// - SELECT version() -> SELECT 'SeaweedFS 1.0'
// - SELECT current_database() -> SELECT 'default'
// - SELECT current_user -> SELECT 'seaweedfs'
// - \d commands -> SHOW TABLES/DESCRIBE equivalents
// - PostgreSQL system catalogs -> SeaweedFS equivalents
}
```
### 4. Data Type Mapping
```go
var PostgreSQLTypeMap = map[string]uint32{
"TEXT": 25, // PostgreSQL TEXT type
"VARCHAR": 1043, // PostgreSQL VARCHAR type
"INTEGER": 23, // PostgreSQL INTEGER type
"BIGINT": 20, // PostgreSQL BIGINT type
"FLOAT": 701, // PostgreSQL FLOAT8 type
"BOOLEAN": 16, // PostgreSQL BOOLEAN type
"TIMESTAMP": 1114, // PostgreSQL TIMESTAMP type
"JSON": 114, // PostgreSQL JSON type
}
func SeaweedToPostgreSQLType(seaweedType string) uint32 {
if pgType, exists := PostgreSQLTypeMap[strings.ToUpper(seaweedType)]; exists {
return pgType
}
return 25 // Default to TEXT
}
```
## Protocol Implementation
### 1. Connection Flow
```
Client Server
│ │
├─ StartupMessage ────────────►│
│ ├─ AuthenticationOk
│ ├─ ParameterStatus (multiple)
│ ├─ BackendKeyData
│ └─ ReadyForQuery
│ │
├─ Query('SELECT 1') ─────────►│
│ ├─ RowDescription
│ ├─ DataRow
│ ├─ CommandComplete
│ └─ ReadyForQuery
│ │
├─ Parse('stmt1', 'SELECT $1')►│
│ └─ ParseComplete
├─ Bind('portal1', 'stmt1')───►│
│ └─ BindComplete
├─ Execute('portal1')─────────►│
│ ├─ DataRow (multiple)
│ └─ CommandComplete
├─ Sync ──────────────────────►│
│ └─ ReadyForQuery
│ │
├─ Terminate ─────────────────►│
│ └─ [Connection closed]
```
### 2. Authentication
```go
type AuthMethod int
const (
AuthTrust AuthMethod = iota
AuthPassword
AuthMD5
AuthSASL
)
func (s *PostgreSQLServer) handleAuthentication(session *PostgreSQLSession) error {
switch s.authMethod {
case AuthTrust:
return s.sendAuthenticationOk(session)
case AuthPassword:
return s.handlePasswordAuth(session)
case AuthMD5:
return s.handleMD5Auth(session)
default:
return fmt.Errorf("unsupported auth method")
}
}
```
### 3. Query Processing
```go
func (s *PostgreSQLServer) handleSimpleQuery(session *PostgreSQLSession, query string) error {
// 1. Translate PostgreSQL SQL to SeaweedFS SQL
translatedQuery, err := s.translator.TranslateQuery(query)
if err != nil {
return s.sendError(session, err)
}
// 2. Execute using existing SQL engine
result, err := s.sqlEngine.ExecuteSQL(context.Background(), translatedQuery)
if err != nil {
return s.sendError(session, err)
}
// 3. Send results in PostgreSQL format
err = s.sendRowDescription(session, result.Columns)
if err != nil {
return err
}
for _, row := range result.Rows {
err = s.sendDataRow(session, row)
if err != nil {
return err
}
}
return s.sendCommandComplete(session, fmt.Sprintf("SELECT %d", len(result.Rows)))
}
```
## System Catalogs Support
PostgreSQL clients expect certain system catalogs. We'll implement views for key ones:
```sql
-- pg_tables equivalent
SELECT
'default' as schemaname,
table_name as tablename,
'seaweedfs' as tableowner,
NULL as tablespace,
false as hasindexes,
false as hasrules,
false as hastriggers
FROM information_schema.tables;
-- pg_database equivalent
SELECT
database_name as datname,
'seaweedfs' as datdba,
'UTF8' as encoding,
'C' as datcollate,
'C' as datctype
FROM information_schema.schemata;
-- pg_version equivalent
SELECT 'SeaweedFS 1.0 (PostgreSQL 14.0 compatible)' as version;
```
## Configuration
### Server Configuration
```go
type PostgreSQLServerConfig struct {
Host string
Port int
Database string
AuthMethod AuthMethod
Users map[string]string // username -> password
TLSConfig *tls.Config
MaxConns int
IdleTimeout time.Duration
}
```
### Client Connection String
```bash
# Standard PostgreSQL connection strings work
psql "host=localhost port=5432 dbname=default user=seaweedfs"
PGPASSWORD=secret psql -h localhost -p 5432 -U seaweedfs -d default
# JDBC URL
jdbc:postgresql://localhost:5432/default?user=seaweedfs& password=secret
```
## Command Line Interface
```bash
# Start PostgreSQL protocol server
2025-09-02 11:02:08 -07:00
weed db -port=5432 -auth=trust
2025-09-03 22:57:47 -07:00
weed db -port=5432 -auth=password -users="admin:secret;readonly:pass"
2025-09-02 11:02:08 -07:00
weed db -port=5432 -tls-cert=server.crt -tls-key=server.key
2025-09-02 08:25:29 -07:00
# Configuration options
-host=localhost # Listen host
-port=5432 # PostgreSQL standard port
-auth=trust|password|md5 # Authentication method
2025-09-03 22:57:47 -07:00
-users=user:pass;user2:pass2 # User credentials (password/md5 auth) - use semicolons to separate users
2025-09-02 08:25:29 -07:00
-database=default # Default database name
-max-connections=100 # Maximum concurrent connections
-idle-timeout=1h # Connection idle timeout
-tls-cert="" # TLS certificate file
-tls-key="" # TLS private key file
```
## Client Compatibility Testing
### Essential Clients
- **psql**: PostgreSQL command line client
- **pgAdmin**: Web-based administration tool
- **DBeaver**: Universal database tool
- **DataGrip**: JetBrains database IDE
### Programming Language Drivers
- **Python**: psycopg2, asyncpg
- **Java**: PostgreSQL JDBC driver
- **Node.js**: pg, node-postgres
- **Go**: lib/pq, pgx
- **.NET**: Npgsql
### BI Tools
- **Grafana**: PostgreSQL data source
- **Superset**: PostgreSQL connector
- **Tableau**: PostgreSQL native connector
- **Power BI**: PostgreSQL connector
## Implementation Plan
1. **Phase 1** : Basic wire protocol and simple queries
2. **Phase 2** : Extended query protocol (prepared statements)
3. **Phase 3** : System catalog views
4. **Phase 4** : Advanced features (transactions, notifications)
5. **Phase 5** : Performance optimization and caching
## Limitations
### Read-Only Access
- INSERT/UPDATE/DELETE operations not supported
- Returns appropriate error messages for write operations
### Partial SQL Compatibility
- Subset of PostgreSQL SQL features
- SeaweedFS-specific limitations apply
### System Features
- No stored procedures/functions
- No triggers or constraints
- No user-defined types
- Limited transaction support (mostly no-op)
## Security Considerations
### Authentication
- Support for trust, password, and MD5 authentication
- TLS encryption support
- User access control
### SQL Injection Prevention
- Prepared statements with parameter binding
- Input validation and sanitization
- Query complexity limits
## Performance Optimizations
### Connection Pooling
- Configurable maximum connections
- Connection reuse and idle timeout
- Memory efficient session management
### Query Caching
- Prepared statement caching
- Result set caching for repeated queries
- Metadata caching
### Protocol Efficiency
- Binary result format support
- Batch query processing
- Streaming large result sets
This design provides a comprehensive PostgreSQL wire protocol implementation that makes SeaweedFS accessible to the entire PostgreSQL ecosystem while maintaining compatibility and performance.