@firdausai
What is this PR about?

Backing up a database located on a worker node (a different server) fails with a "Container not found" error, because the backup command runs on the manager server while the container lives on another server.

This PR attempts to fix it by deploying a service to the worker node whenever a backup is triggered, with the sole purpose of running the backup command. When a backup is triggered, it checks whether the backup service already exists and deletes it if so, then deploys the service. The backup service exits once the command finishes running.
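
For readers who want to see the delete-then-deploy flow as Docker CLI calls, here is a minimal sketch. It is not the code in this PR: the function name, the service name, the image, the `DOCKER` override, and the choice to bind-mount the Docker socket are all assumptions made for illustration. It only relies on standard `docker service rm` and `docker service create` flags.

```shell
# Hypothetical helper, not taken from this PR.
# Usage: run_backup_service <service-name> <node-label-expr> <image> <command...>
#   <node-label-expr> is the part after "node.labels.", e.g. type==database
run_backup_service() {
  local name="$1" label="$2" image="$3"
  shift 3
  # DOCKER can be overridden (e.g. DOCKER=echo) for a dry run; assumption for this sketch.
  local docker="${DOCKER:-docker}"

  # Step 1: remove a leftover service from an earlier backup, if any.
  # A missing service is not an error here, so the exit status is ignored.
  "$docker" service rm "$name" >/dev/null 2>&1 || true

  # Step 2: create a run-once service pinned to the node that hosts the database.
  # --restart-condition none keeps Swarm from restarting the task after it exits.
  # The socket mount is an assumption: it lets the task talk to the worker's Docker daemon.
  "$docker" service create \
    --detach \
    --name "$name" \
    --restart-condition none \
    --constraint "node.labels.$label" \
    --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
    "$image" "$@"
}

# Example call (all names are placeholders):
# run_backup_service pg-backup 'type==database' docker:cli sh -c '<backup command>'
```

With `--restart-condition none` the task is not restarted after the command exits, but the service object itself stays registered until it is removed, which is why the sketch deletes any existing service before creating a new one.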

Important note:

  • This PR and its commit only address the Postgres database. If this approach is accepted, I can add commits for the other databases to this PR.
  • When working on this locally on a MacBook, I simulated a second worker with docker-in-docker (see the command below). Running the backup command on that simulated worker hits a socket limitation and throws an error, so I tested it as follows: I logged the backup command that would run on my local worker with console.log, copied it, changed the necessary values to match my prod environment (service name, target worker label, and existing database service name), and ran it on prod. The command successfully uploaded a backup to my S3.
docker run -d --privileged \
  --hostname worker-1 \
  --name swarm-worker-1 \
  --label type=database \
  docker:dind

Checklist

Before submitting this PR, please make sure that:

Issues related (if applicable)

closes #3516

Screenshots (if applicable)

Development

Successfully merging this pull request may close these issues.

Database backup fail when database is in worker node
