Mount Remote Storage
Chris Lu edited this page 2026-01-15 01:11:57 -08:00

After Configure Remote Storage, you will have a remote storage named cloud1.

Mount a Remote Storage

Now you can run remote.mount in weed shell:

> remote.mount -h
Usage of remote.mount:
  -dir string
    	a directory in filer
  -nonempty
    	allows the mounting over a non-empty directory
  -remote string
    	a directory in remote storage, ex. <storageName>/<bucket>/path/to/dir

> help remote.mount
  remote.mount	# mount remote storage and pull its metadata

	# assume a remote storage is configured with name "cloud1"
	remote.configure -name=cloud1 -type=s3 -s3.access_key=xyz -s3.secret_key=yyy

	# mount and pull one bucket
	remote.mount -dir=/xxx -remote=cloud1/bucket
	# mount and pull one directory in the bucket
	remote.mount -dir=/xxx -remote=cloud1/bucket/dir1

	# after mount, start a separate process to write updates to remote storage
	weed filer.remote.sync -filer=<filerHost>:<filerPort> -dir=/xxx

With remote.mount, you can mount one bucket or any directory in the bucket.

Cache Metadata

This remote.mount will also pull down all metadata from the remote storage.

Later, any metadata operations will be fast, since they only need to read the local metadata.

remote.unmount will drop all local metadata and cached file content.

Repeatedly Update Metadata

Sometimes the data on the cloud changes and the local metadata becomes stale. Unmounting and then mounting again would work, but it is costly, since all the data has to be cached again.

To refresh the metadata, you can run this on the mounted directory or any of its subdirectories, e.g.:

	remote.meta.sync -dir=/xxx
	remote.meta.sync -dir=/xxx/sub/dir

This will update the local metadata accordingly while keeping the file contents that have not changed.

If the data on the cloud changes often, you can create a cron job to run it, or add the command to the admin scripts defined in master.toml so that it runs regularly.
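For example, the cron approach can look like the following crontab entry, which pipes the command into weed shell non-interactively (a sketch: the master address localhost:9333 is an assumption; adjust it for your cluster):

```
# Refresh remote metadata for /xxx once an hour.
# weed shell reads commands from stdin, so this runs non-interactively.
0 * * * * echo "remote.meta.sync -dir=/xxx" | weed shell -master=localhost:9333
```

Alternatively, the same remote.meta.sync -dir=/xxx line can go into the scripts section of master.toml (the [master.maintenance] block), and the master will run it periodically.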

Write Back Changes to Remote Storage

If the mounted directory is only for reading, you can skip this step.

If local changes need to be synchronized to the cloud storage, you have two options:

Option 1: Continuous Sync with weed filer.remote.sync

For continuous, real-time synchronization, start a separate process: weed filer.remote.sync -dir=/xxx. This process listens to filer change events and automatically writes any changes back to the remote storage.

weed filer.remote.sync -filer=<filerHost>:<filerPort> -dir=/xxx

The process is designed to be worry-free. It should automatically resume if stopped, and can reconnect automatically.

Use this when:

  • You need real-time synchronization of all changes
  • Files are being continuously created, modified, or deleted
  • You want automatic background synchronization
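One way to get the worry-free behavior in practice is to run the process under a supervisor. A minimal systemd unit sketch, assuming weed is installed at /usr/local/bin/weed, the filer listens on localhost:8888, and the mount directory is /xxx:

```
# /etc/systemd/system/weed-remote-sync.service (hypothetical path and name)
[Unit]
Description=SeaweedFS write-back sync for the /xxx mount
After=network-online.target

[Service]
ExecStart=/usr/local/bin/weed filer.remote.sync -filer=localhost:8888 -dir=/xxx
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```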

Option 2: On-Demand Sync with remote.copy.local

For on-demand, batch synchronization of local-only files, use the remote.copy.local command in weed shell. This is useful for one-time or scheduled backups.

Use this when:

  • You have existing local files that were never synced to remote
  • You deleted filer logs and need to re-sync existing files
  • You want to run periodic batch backups via cron
  • You need fine-grained control over which files to sync (using filters)
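For the cron case, a small wrapper script can compose the weed shell commands and print them; piping its output into weed shell then runs the batch non-interactively. A sketch, where the mount directory /buckets/mybucket and the master address localhost:9333 are assumptions:

```shell
#!/bin/sh
# backup_local_only.sh -- compose a weed shell batch that backs up local-only files.
# The mount directory below is an assumption; adjust it for your cluster.
MOUNT_DIR="/buckets/mybucket"

# First a dry run to log what would be copied, then the real copy.
cmds="remote.copy.local -dir=$MOUNT_DIR -dryRun=true
remote.copy.local -dir=$MOUNT_DIR"

# Print the batch; in cron, pipe it into weed shell instead:
#   printf '%s\n' "$cmds" | weed shell -master=localhost:9333
printf '%s\n' "$cmds"
```

A crontab entry could then look like `0 3 * * * /path/to/backup_local_only.sh | weed shell -master=localhost:9333`.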

Copy Local-Only Files to Remote

The remote.copy.local command synchronizes local-only files to remote storage. This is useful when you have files that were created locally and need to be backed up to remote storage.

> help remote.copy.local
  remote.copy.local	# copy local-only files to remote storage for mounted directories

	# copy all local-only files in a mounted directory to remote
	remote.copy.local -dir=/xxx

	# copy with filters
	remote.copy.local -dir=/xxx/some/sub/dir -include=*.pdf
	remote.copy.local -dir=/xxx/some/sub/dir -exclude=*.txt
	remote.copy.local -minSize=1024000    # copy files larger than ~1MB
	remote.copy.local -maxSize=10240000   # copy files smaller than ~10MB

	# force update even if remote file exists
	remote.copy.local -dir=/xxx -forceUpdate=true

	# dry run to see what would be copied
	remote.copy.local -dir=/xxx -dryRun=true

	This command only copies files that don't have remote entries yet.
	Files that are already synchronized with remote storage are skipped.

	The actual data copying goes through volume servers in parallel.

Command Comparison

Here's a comprehensive comparison of remote storage synchronization and caching commands:

Global Command Comparison

| Feature | weed filer.remote.sync | remote.copy.local | remote.cache | remote.uncache |
|---|---|---|---|---|
| Type | Long-running daemon | Shell command (batch) | Shell command (batch) | Shell command (batch) |
| Purpose | Continuous sync (Local → Remote) | Backup local-only files (Local → Remote) | Download to local cache (Remote → Local) | Free up local storage (Local → Delete) |
| Data Flow | Bidirectional (metadata) | Local → Remote | Remote → Local | Local → (Deleted) |
| Trigger | Automatic (file events) | Manual / Cron | Manual / Cron | Manual / Cron |
| File Scope | All changes | Local-only files | Files on remote | Cached files |
| Filters | None (syncs all) | Yes (include/exclude) | Yes (include/exclude) | Yes (include/exclude) |
| Force | Automatic | -forceUpdate | N/A | N/A |
| Dry Run | No | -dryRun | -dryRun | -dryRun |
| Use Case | Active writes, consistency | Backups, recovery | Warm-up, pre-fetching | Save disk space |

Typical Workflow

  1. Mount remote storage: remote.mount -dir=/buckets/mybucket -remote=s3_1/bucket
  2. Start continuous sync (optional): weed filer.remote.sync -filer=localhost:8888 -dir=/buckets/mybucket
  3. Create local files: Write files directly to /buckets/mybucket/
  4. Batch backup (if not using filer.remote.sync): remote.copy.local -dir=/buckets/mybucket
  5. Free up space: remote.uncache -dir=/buckets/mybucket -minSize=10240000
  6. Warm up cache: remote.cache -dir=/buckets/mybucket -include=*.pdf

Unmount a Remote Storage

Similarly, running remote.unmount -dir=/xxx unmounts a remote storage. Note that all cached data and metadata will be deleted. If weed filer.remote.sync -filer=<filerHost>:<filerPort> -dir=/xxx was not running, local updates that have not yet been uploaded to the remote storage will be lost.

The weed filer.remote.sync process will stop as soon as it sees that the directory is unmounted, so local deletions will not propagate back to the cloud, avoiding possible data loss.
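To reduce the risk of losing local updates, you can flush local-only files to the remote storage before unmounting. A sketch of the sequence in weed shell (the dry run first shows what would otherwise be left behind):

```
> remote.copy.local -dir=/xxx -dryRun=true
> remote.copy.local -dir=/xxx
> remote.unmount -dir=/xxx
```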