Extra MongoDB Tools

August 27, 2025

This repository provides supplementary tools for MongoDB, supporting both backup and restoration workflows:

  • mongo-archive – Dumps MongoDB data to disk and uploads it to supported cloud storage services.
  • mongo-unarchive – Downloads archived dumps from cloud storage and restores them into a live MongoDB database.

🚀 Building the Tools

To build the binaries from source:

  1. Clone the repository:

    git clone https://github.com/egose/database-tools
    cd database-tools
    
  2. Install dependencies and build:

    go mod tidy
    make build
    

    This will install dependencies and build the binaries into the dist/ directory.

Installation

You can install mongo-archive and mongo-unarchive in two ways:

1. Install via asdf

If you use asdf to manage CLI tools, you can install the mongodb-database-tools plugin and make both CLIs available globally.

# Add the mongodb-database-tools plugin (only once)
asdf plugin add mongodb-database-tools

# Install the desired version
asdf install mongodb-database-tools <latest-version>

# Set it as the global version
asdf global mongodb-database-tools <latest-version>
# Or set it locally for a project
asdf local mongodb-database-tools <latest-version>

After installation, you can run:

mongo-archive --version
mongo-unarchive --version

2. Download from GitHub Releases

You can also manually download the prebuilt binaries from the official releases page:

Releases: https://github.com/egose/database-tools/releases

  1. Visit the releases page and select the desired version.
  2. Download the binary for your operating system and architecture.
  3. Make the binaries executable and move them into a directory in your PATH:

    chmod +x mongo-archive mongo-unarchive
    sudo mv mongo-archive mongo-unarchive /usr/local/bin/

Verify Installation

Run the following commands to confirm the installed version:

mongo-archive --version
mongo-unarchive --version

⚙️ Configuration: CLI Flags & Environment Variables

Both mongo-archive and mongo-unarchive follow the conventions of MongoDB’s native tools (e.g., mongodump, mongorestore), using similar command-line arguments. Configuration values can also be passed via environment variables for convenience or container-based execution.
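The mapping between the two is mechanical: uppercase the flag name, replace hyphens with underscores, and prefix it with `MONGOARCHIVE__` (or `MONGOUNARCHIVE__`). A small illustrative helper (not part of the tools) makes the convention concrete:

```shell
# Hypothetical helper showing the flag-to-environment-variable naming
# convention used by these tools; it is not shipped with them.
flag_to_env() {
  # $1: tool prefix (MONGOARCHIVE or MONGOUNARCHIVE)
  # $2: flag name without the leading dashes
  printf '%s__%s\n' "$1" "$(printf '%s' "$2" | tr 'a-z-' 'A-Z_')"
}

flag_to_env MONGOARCHIVE ssl-ca-file        # MONGOARCHIVE__SSL_CA_FILE
flag_to_env MONGOUNARCHIVE num-parallel-collections
```

So `--ssl-ca-file=/certs/ca.pem` on the command line and `MONGOARCHIVE__SSL_CA_FILE=/certs/ca.pem` in the environment configure the same option.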

📦 mongo-archive

Functionality

  • Dumps MongoDB data locally.
  • Uploads the dump to cloud storage (Azure Blob, AWS S3, or Google Cloud Storage).
  • Can be run once or as a cron-scheduled job.

Parameters

| Flag | Environment Variable | Type | Description |
|------|----------------------|------|-------------|
| uri | MONGOARCHIVE__URI | string | MongoDB URI connection string |
| db | MONGOARCHIVE__DB | string | Database to use |
| collection | MONGOARCHIVE__COLLECTION | string | Collection to use |
| host | MONGOARCHIVE__HOST | string | MongoDB host to connect to (for replica sets: setname/host1,host2) |
| port | MONGOARCHIVE__PORT | string | MongoDB port (can also use --host hostname:port) |
| ssl | MONGOARCHIVE__SSL | bool | Connect to a mongod or mongos that has SSL enabled |
| ssl-ca-file | MONGOARCHIVE__SSL_CA_FILE | string | .pem file containing the root certificate chain from the CA |
| ssl-pem-key-file | MONGOARCHIVE__SSL_PEM_KEY_FILE | string | .pem file containing the certificate and key |
| ssl-pem-key-password | MONGOARCHIVE__SSL_PEM_KEY_PASSWORD | string | Password to decrypt the sslPEMKeyFile |
| ssl-crl-file | MONGOARCHIVE__SSL_CRL_FILE | string | .pem file containing the certificate revocation list |
| ssl-allow-invalid-certificates | MONGOARCHIVE__SSL_ALLOW_INVALID_CERTIFICATES | bool | Bypass validation for server certificates |
| ssl-allow-invalid-hostnames | MONGOARCHIVE__SSL_ALLOW_INVALID_HOSTNAMES | bool | Bypass validation for server hostnames |
| ssl-fips-mode | MONGOARCHIVE__SSL_FIPS_MODE | bool | Use FIPS mode of the installed OpenSSL library |
| username | MONGOARCHIVE__USERNAME | string | Username for authentication |
| password | MONGOARCHIVE__PASSWORD | string | Password for authentication |
| authentication-database | MONGOARCHIVE__AUTHENTICATION_DATABASE | string | Database that holds the user's credentials |
| authentication-mechanism | MONGOARCHIVE__AUTHENTICATION_MECHANISM | string | Authentication mechanism to use |
| gssapi-service-name | MONGOARCHIVE__GSSAPI_SERVICE_NAME | string | Service name for GSSAPI/Kerberos auth (default: mongodb) |
| gssapi-host-name | MONGOARCHIVE__GSSAPI_HOST_NAME | string | Hostname for GSSAPI/Kerberos auth (default: server address) |
| uri-prune | MONGOARCHIVE__URI_PRUNE | bool | Prune MongoDB URI connection string (remove credentials etc.) |
| query | MONGOARCHIVE__QUERY | string | Query filter as v2 Extended JSON string |
| query-file | MONGOARCHIVE__QUERY_FILE | string | Path to file containing query filter (v2 Extended JSON) |
| read-preference | MONGOARCHIVE__READ_PREFERENCE | string | Preference mode (e.g., nearest) or preference JSON object |
| force-table-scan | MONGOARCHIVE__FORCE_TABLE_SCAN | bool | Force a table scan |
| verbose | MONGOARCHIVE__VERBOSE | string | More detailed log output (-vvvvv or --verbose=N) |
| quiet | MONGOARCHIVE__QUIET | bool | Hide all log output |
| az-endpoint | MONGOARCHIVE__AZ_ENDPOINT | string | Azure Blob Storage emulator hostname and port |
| az-account-name | MONGOARCHIVE__AZ_ACCOUNT_NAME | string | Azure Blob Storage account name |
| az-account-key | MONGOARCHIVE__AZ_ACCOUNT_KEY | string | Azure Blob Storage account key |
| az-container-name | MONGOARCHIVE__AZ_CONTAINER_NAME | string | Azure Blob Storage container name |
| aws-endpoint | MONGOARCHIVE__AWS_ENDPOINT | string | AWS endpoint URL (hostname only or fully qualified URI) |
| aws-access-key-id | MONGOARCHIVE__AWS_ACCESS_KEY_ID | string | AWS access key associated with an IAM account |
| aws-secret-access-key | MONGOARCHIVE__AWS_SECRET_ACCESS_KEY | string | AWS secret key associated with the access key |
| aws-region | MONGOARCHIVE__AWS_REGION | string | AWS region to send requests to |
| aws-bucket | MONGOARCHIVE__AWS_BUCKET | string | AWS S3 bucket name |
| aws-s3-force-path-style | MONGOARCHIVE__AWS_S3_FORCE_PATH_STYLE | bool | Force path-style S3 addressing instead of virtual-hosted |
| gcp-endpoint | MONGOARCHIVE__GCP_ENDPOINT | string | GCP endpoint URL |
| gcp-bucket | MONGOARCHIVE__GCP_BUCKET | string | GCP storage bucket name |
| gcp-creds-file | MONGOARCHIVE__GCP_CREDS_FILE | string | Path to GCP service account credentials file |
| gcp-project-id | MONGOARCHIVE__GCP_PROJECT_ID | string | GCP project ID |
| gcp-private-key-id | MONGOARCHIVE__GCP_PRIVATE_KEY_ID | string | GCP private key ID |
| gcp-private-key | MONGOARCHIVE__GCP_PRIVATE_KEY | string | GCP private key |
| gcp-client-email | MONGOARCHIVE__GCP_CLIENT_EMAIL | string | GCP client email |
| gcp-client-id | MONGOARCHIVE__GCP_CLIENT_ID | string | GCP client ID |
| local-path | MONGOARCHIVE__LOCAL_PATH | string | Local directory path to store backups |
| expiry-days | MONGOARCHIVE__EXPIRY_DAYS | string | Max age (in days) for archives to be retained |
| rocketchat-webhook-url | MONGOARCHIVE__ROCKETCHAT_WEBHOOK_URL | string | Rocket.Chat webhook URL |
| rocketchat-webhook-prefix | MONGOARCHIVE__ROCKETCHAT_WEBHOOK_PREFIX | string | Prefix for Rocket.Chat webhook messages |
| rocketchat-notify-on-failure-only | MONGOARCHIVE__ROCKETCHAT_NOTIFY_ON_FAILURE_ONLY | bool | Send Rocket.Chat notifications only on failure |
| cron | MONGOARCHIVE__CRON | bool | Run a cron scheduler and block current execution path |
| cron-expression | MONGOARCHIVE__CRON_EXPRESSION | string | Cron schedule expression |
| tz | MONGOARCHIVE__TZ | string | User-specified time zone (see GNU TZ variable format) |
| keep | MONGOARCHIVE__KEEP | bool | Keep data dump after completion |
| version | (no env var) | bool | Show the version and exit |
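The `query` / `query-file` options expect a filter in v2 Extended JSON, where BSON types such as dates are wrapped in `$`-prefixed keys. A minimal sketch (the field name `createdAt` and the file name `query.json` are illustrative; `python3` is assumed only for the syntax check):

```shell
# Write a v2 Extended JSON filter to a file for use with --query-file.
# The {"$date": ...} wrapper is how Extended JSON represents a BSON date.
cat > query.json <<'EOF'
{ "createdAt": { "$gt": { "$date": "2025-01-01T00:00:00Z" } } }
EOF

# Quick syntactic check before running the tool.
python3 -m json.tool query.json
```

A dump restricted to matching documents would then be run as `mongo-archive --db=<dbname> --query-file=query.json ...`.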

🔄 mongo-unarchive

Functionality

  • Downloads archived MongoDB dumps from supported cloud storage.
  • Restores the data to a MongoDB database.
  • Supports applying update operations post-restore using a JSON configuration.

Parameters

| Flag | Environment Variable | Type | Description |
|------|----------------------|------|-------------|
| verbose | MONGOUNARCHIVE__VERBOSE | string | More detailed log output (-vvvvv or --verbose=N) |
| quiet | MONGOUNARCHIVE__QUIET | bool | Hide all log output |
| host | MONGOUNARCHIVE__HOST | string | MongoDB host to connect to (for replica sets: setname/host1,host2) |
| port | MONGOUNARCHIVE__PORT | string | MongoDB port (can also use --host hostname:port) |
| ssl | MONGOUNARCHIVE__SSL | bool | Connect to a mongod or mongos that has SSL enabled |
| ssl-ca-file | MONGOUNARCHIVE__SSL_CA_FILE | string | .pem file containing the root certificate chain from the CA |
| ssl-pem-key-file | MONGOUNARCHIVE__SSL_PEM_KEY_FILE | string | .pem file containing the certificate and key |
| ssl-pem-key-password | MONGOUNARCHIVE__SSL_PEM_KEY_PASSWORD | string | Password to decrypt the sslPEMKeyFile |
| ssl-crl-file | MONGOUNARCHIVE__SSL_CRL_FILE | string | .pem file containing the certificate revocation list |
| ssl-allow-invalid-certificates | MONGOUNARCHIVE__SSL_ALLOW_INVALID_CERTIFICATES | bool | Bypass validation for server certificates |
| ssl-allow-invalid-hostnames | MONGOUNARCHIVE__SSL_ALLOW_INVALID_HOSTNAMES | bool | Bypass validation for server hostnames |
| ssl-fips-mode | MONGOUNARCHIVE__SSL_FIPS_MODE | bool | Use FIPS mode of the installed OpenSSL library |
| username | MONGOUNARCHIVE__USERNAME | string | Username for authentication |
| password | MONGOUNARCHIVE__PASSWORD | string | Password for authentication |
| authentication-database | MONGOUNARCHIVE__AUTHENTICATION_DATABASE | string | Database that holds the user's credentials |
| authentication-mechanism | MONGOUNARCHIVE__AUTHENTICATION_MECHANISM | string | Authentication mechanism to use |
| gssapi-service-name | MONGOUNARCHIVE__GSSAPI_SERVICE_NAME | string | Service name for GSSAPI/Kerberos auth (default: mongodb) |
| gssapi-host-name | MONGOUNARCHIVE__GSSAPI_HOST_NAME | string | Hostname for GSSAPI/Kerberos auth (default: server address) |
| uri | MONGOUNARCHIVE__URI | string | MongoDB URI connection string |
| uri-prune | MONGOUNARCHIVE__URI_PRUNE | bool | Prune MongoDB URI connection string (remove credentials etc.) |
| db | MONGOUNARCHIVE__DB | string | Database to use |
| collection | MONGOUNARCHIVE__COLLECTION | string | Collection to use |
| ns-exclude | MONGOUNARCHIVE__NS_EXCLUDE | string | Exclude matching namespaces |
| ns-include | MONGOUNARCHIVE__NS_INCLUDE | string | Include matching namespaces |
| ns-from | MONGOUNARCHIVE__NS_FROM | string | Rename matching namespaces (requires matching ns-to) |
| ns-to | MONGOUNARCHIVE__NS_TO | string | Rename matched namespaces (requires matching ns-from) |
| drop | MONGOUNARCHIVE__DROP | bool | Drop each collection before import |
| dry-run | MONGOUNARCHIVE__DRY_RUN | bool | View summary without importing anything (recommended with verbosity) |
| write-concern | MONGOUNARCHIVE__WRITE_CONCERN | string | Write concern options |
| no-index-restore | MONGOUNARCHIVE__NO_INDEX_RESTORE | bool | Do not restore indexes |
| no-options-restore | MONGOUNARCHIVE__NO_OPTIONS_RESTORE | bool | Do not restore collection options |
| keep-index-version | MONGOUNARCHIVE__KEEP_INDEX_VERSION | bool | Do not update index version |
| maintain-insertion-order | MONGOUNARCHIVE__MAINTAIN_INSERTION_ORDER | bool | Restore documents in the order they appear in the input source; also enables --stopOnError and restricts insertion workers to 1 |
| num-parallel-collections | MONGOUNARCHIVE__NUM_PARALLEL_COLLECTIONS | string | Number of collections to restore in parallel (default: 4) |
| num-insertion-workers-per-collection | MONGOUNARCHIVE__NUM_INSERTION_WORKERS_PER_COLLECTION | string | Number of insert operations to run concurrently per collection (default: 1) |
| stop-on-error | MONGOUNARCHIVE__STOP_ON_ERROR | bool | Halt after any insertion error instead of continuing |
| bypass-document-validation | MONGOUNARCHIVE__BYPASS_DOCUMENT_VALIDATION | bool | Bypass document validation |
| preserve-uuid | MONGOUNARCHIVE__PRESERVE_UUID | bool | Preserve original collection UUIDs (requires --drop) |
| az-endpoint | MONGOUNARCHIVE__AZ_ENDPOINT | string | Azure Blob Storage emulator hostname and port |
| az-account-name | MONGOUNARCHIVE__AZ_ACCOUNT_NAME | string | Azure Blob Storage account name |
| az-account-key | MONGOUNARCHIVE__AZ_ACCOUNT_KEY | string | Azure Blob Storage account key |
| az-container-name | MONGOUNARCHIVE__AZ_CONTAINER_NAME | string | Azure Blob Storage container name |
| aws-endpoint | MONGOUNARCHIVE__AWS_ENDPOINT | string | AWS endpoint URL (hostname only or fully qualified URI) |
| aws-access-key-id | MONGOUNARCHIVE__AWS_ACCESS_KEY_ID | string | AWS access key associated with an IAM account |
| aws-secret-access-key | MONGOUNARCHIVE__AWS_SECRET_ACCESS_KEY | string | AWS secret key associated with the access key |
| aws-region | MONGOUNARCHIVE__AWS_REGION | string | AWS region to send requests to |
| aws-bucket | MONGOUNARCHIVE__AWS_BUCKET | string | AWS S3 bucket name |
| aws-s3-force-path-style | MONGOUNARCHIVE__AWS_S3_FORCE_PATH_STYLE | bool | Force path-style S3 addressing instead of virtual-hosted |
| gcp-endpoint | MONGOUNARCHIVE__GCP_ENDPOINT | string | GCP endpoint URL |
| gcp-bucket | MONGOUNARCHIVE__GCP_BUCKET | string | GCP storage bucket name |
| gcp-creds-file | MONGOUNARCHIVE__GCP_CREDS_FILE | string | Path to GCP service account credentials file |
| gcp-project-id | MONGOUNARCHIVE__GCP_PROJECT_ID | string | GCP project ID |
| gcp-private-key-id | MONGOUNARCHIVE__GCP_PRIVATE_KEY_ID | string | GCP private key ID |
| gcp-private-key | MONGOUNARCHIVE__GCP_PRIVATE_KEY | string | GCP private key |
| gcp-client-email | MONGOUNARCHIVE__GCP_CLIENT_EMAIL | string | GCP client email |
| gcp-client-id | MONGOUNARCHIVE__GCP_CLIENT_ID | string | GCP client ID |
| local-path | MONGOUNARCHIVE__LOCAL_PATH | string | Local directory path to store backups |
| object-name | MONGOUNARCHIVE__OBJECT_NAME | string | Object name of the archived file in the storage (optional) |
| dir | MONGOUNARCHIVE__DIR | string | Directory containing the dumped files |
| updates | MONGOUNARCHIVE__UPDATES | string | Array of update specifications in JSON string format |
| updates-file | MONGOUNARCHIVE__UPDATES_FILE | string | Path to a file containing an array of update specifications |
| keep | MONGOUNARCHIVE__KEEP | bool | Keep data dump after completion |
| version | (no env var) | bool | Show the version and exit |

🧪 Usage Examples

Dump a Database to Azure Storage

mongo-archive \
  --uri="mongodb://<username>:<password>@cluster0.mongodb.net/" \
  --db=<dbname> \
  --az-account-name=<az_account_name> \
  --az-account-key=<az_account_key> \
  --az-container-name=<az_container_name>

Schedule Regular Backups with Cron

mongo-archive \
  --uri="mongodb://<username>:<password>@cluster0.mongodb.net/" \
  --db=<dbname> \
  --az-account-name=<az_account_name> \
  --az-account-key=<az_account_key> \
  --az-container-name=<az_container_name> \
  --cron \
  --cron-expression="0 * * * *"
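Assuming `--cron-expression` takes the standard five-field cron format (the Kubernetes example later in this document uses the same syntax), the fields read as follows:

```shell
# ┌──────── minute (0-59)
# │ ┌────── hour (0-23)
# │ │ ┌──── day of month (1-31)
# │ │ │ ┌── month (1-12)
# │ │ │ │ ┌ day of week (0-6, Sunday = 0)
# 0 * * * *     -> at minute 0 of every hour (the example above)
# 0 12 * * *    -> every day at 12:00
```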

Restore from Azure Storage

mongo-unarchive \
  --uri="mongodb://localhost:27017" \
  --db=<dbname> \
  --az-account-name=<az_account_name> \
  --az-account-key=<az_account_key> \
  --az-container-name=<az_container_name>

Restore and Apply Updates

mongo-unarchive \
  --uri="mongodb://localhost:27017" \
  --db=<dbname> \
  --az-account-name=<az_account_name> \
  --az-account-key=<az_account_key> \
  --az-container-name=<az_container_name> \
  --updates-file=/home/nonroot/updates.json

Sample updates.json

[
  {
    "collection": "users",
    "filter": {
      "email": { "$exists": true }
    },
    "update": [
      {
        "$set": {
          "email": {
            "$replaceOne": {
              "input": "$email",
              "find": "@",
              "replacement": "_"
            }
          }
        }
      }
    ]
  }
]
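Because a malformed updates file only surfaces at restore time, it can be worth checking that it parses before running mongo-unarchive. A minimal sketch (`python3` is assumed to be available; the `flagged` field is a hypothetical example, not part of the tools):

```shell
# Write a minimal updates file and verify it parses as JSON before
# handing it to mongo-unarchive via --updates-file.
cat > updates.json <<'EOF'
[
  {
    "collection": "users",
    "filter": { "email": { "$exists": true } },
    "update": [
      { "$set": { "flagged": true } }
    ]
  }
]
EOF

python3 -m json.tool updates.json > /dev/null && echo "updates.json parses"
```

Note that this only validates JSON syntax; whether the filter and update stages are semantically valid is still checked by MongoDB at restore time.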

🐳 Running with Docker

docker run --rm \
  -v "$(pwd)/tmp:/tmp" \
  -e MONGOARCHIVE__DUMP_PATH=/tmp/datadump \
  ghcr.io/egose/database-tools:latest \
  mongo-archive \
  --uri="mongodb://<username>:<password>@cluster0.mongodb.net/" \
  --db=<dbname> \
  --az-account-name=<az_account_name> \
  --az-account-key=<az_account_key> \
  --az-container-name=<az_container_name> \
  --keep
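Rather than repeating flags on the command line, the same settings can be kept in an env file and passed with Docker's standard `--env-file` option. A sketch with placeholder values (the file name `archive.env` is arbitrary):

```shell
# archive.env holds KEY=VALUE pairs in the format docker --env-file expects.
# All values below are placeholders.
cat > archive.env <<'EOF'
MONGOARCHIVE__URI=mongodb://user:password@cluster0.mongodb.net/
MONGOARCHIVE__DB=mydb
MONGOARCHIVE__AZ_ACCOUNT_NAME=myaccount
MONGOARCHIVE__AZ_ACCOUNT_KEY=myaccountkey
MONGOARCHIVE__AZ_CONTAINER_NAME=mybackup
EOF
```

The container would then be started with `docker run --rm --env-file archive.env ghcr.io/egose/database-tools:latest mongo-archive`, keeping credentials out of shell history.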

☁️ Running as a Kubernetes CronJob

apiVersion: batch/v1
kind: CronJob
metadata:
  name: mongo-archive
spec:
  schedule: "0 12 * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      backoffLimit: 3
      template:
        spec:
          restartPolicy: Never
          initContainers:
            - name: backup-permission
              image: alpine:3.18
              command: ["/bin/sh", "-c"]
              args:
                - |
                  rm -rf /tmp/*;
                  adduser -D -u 1000 nonroot;
                  chown nonroot:nonroot /tmp;
              volumeMounts:
                - mountPath: /tmp
                  name: backup-volume
          containers:
            - name: backup-job
              image: ghcr.io/egose/database-tools:<latest-version>
              command: ["/bin/sh", "-c"]
              args:
                - mongo-archive --db=mydb --read-preference=primary --force-table-scan
              env:
                - name: MONGOARCHIVE__URI
                  value: "mongodb+srv://user:password@cluster0.my.mongodb.net"
                - name: MONGOARCHIVE__AZ_ACCOUNT_NAME
                  value: mystorage
                - name: MONGOARCHIVE__AZ_ACCOUNT_KEY
                  value: myaccountkey
                - name: MONGOARCHIVE__AZ_CONTAINER_NAME
                  value: mybackup
              volumeMounts:
                - mountPath: /tmp
                  name: backup-volume
          volumes:
            - name: backup-volume
              persistentVolumeClaim:
                claimName: backup-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
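The manifest above embeds the storage account key in plain text. Kubernetes' standard Secret mechanism (`valueFrom`/`secretKeyRef`) keeps it out of the manifest; a sketch, assuming a Secret named mongo-archive-secrets has been created with an az-account-key entry:

```yaml
# Replaces the plain-text env entry in the CronJob container spec.
env:
  - name: MONGOARCHIVE__AZ_ACCOUNT_KEY
    valueFrom:
      secretKeyRef:
        name: mongo-archive-secrets   # hypothetical Secret name
        key: az-account-key
```

The Secret itself can be created with `kubectl create secret generic mongo-archive-secrets --from-literal=az-account-key=<az_account_key>`.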

🗂️ Backlog

To be documented.