Workflow containers are not cleaned up after being deleted by Podman.

Article: 100064346
Last Published: 2024-09-12
Product(s): Appliances, CloudPoint

Problem

Workflow containers are not cleaned up after Podman deletes them.

Error Message

On Podman-based systems, the minimum workflow containers cannot be created because of a container name conflict.
The workflow containers are:
General: Used by all jobs except long-running jobs.
System: Used by jobs such as Discovery.
Longrun: Used by all long-running jobs.
Normally, a workflow container exits after an idle period.
Every job that uses one of these workflow containers fails because the container cannot be launched.

The following error message appears in flexsnap.log:

75f4651896b2bb07374e904230ab5805d9926233f2384c77ab0fa7e3dda85cc0: "2024-02-15T03:13:16.685839020+00:00  Feb 15 03:13:16 flexsnap-coordinator flexsnap-coordinator[8] Thread-4579 coordinator: INFO - handle_update_sources: update sources for agent agent.04f1861b959745d391979347698511e4 source ids ['aws-115237957516-us-gov-west-1']"
09ee6e4dcaaf7efe94d175984a192f02de6f374b445c61f0c201ce664923c69c: "2024-02-15T03:13:16.691982115+00:00  Feb 15 03:13:16 flexsnap-agent.04f1861b959745d391979347698511e4 flexsnap-agent-agent.04f1861b959745d391979347698511e4[8] Poll detect_asset_changes@3600secs agent: INFO - detect_asset_changes: Asset discovery completed"
282baa345890e7c9f66ed3bdc129a98ad51322b30cf3eeca083549fd71957d45: "2024-02-15T03:13:16.697226531+00:00  Feb 15 03:13:16 flexsnap-notification flexsnap-notification[8] Thread-5231 flexsnap.alert_consumers.nbu_api: ERROR - Failed to fetch token."
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00  Feb 15 03:13:16 flexsnap-listener flexsnap-listener[7] Thread-4705 flexsnap.connectors.base: ERROR - Request failed unexpectedly"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00  Traceback (most recent call last):"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00    File ""/opt/VRTScloudpoint/lib/flexsnap/connectors/base.py"", line 112, in run"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00      func(self, user_id, **converted_values)"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00    File ""/opt/VRTScloudpoint/bin/flexsnap-listener.py"", line 257, in register"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00      listener.create_min_wf_runners()"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00    File ""/opt/VRTScloudpoint/lib/flexsnap/listener_container.py"", line 306, in create_min_wf_runners"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00      self._create_workflow_container(cn_name,"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00    File ""/opt/VRTScloudpoint/lib/flexsnap/listener_container.py"", line 365, in _create_workflow_container"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00      self.cm.run(imagename, name=service_name,detach=True,"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00    File""/opt/VRTScloudpoint/lib/flexsnap/containerd.py"", line 1495, in run"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00      cid = self.create_container(image,**kwargs)[""Id""]"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00    File""/opt/VRTScloudpoint/lib/flexsnap/containerd.py"", line 894, in _wfunc"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00      return self.__prepare_response(req, kwargs)"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00    File""/opt/VRTScloudpoint/lib/flexsnap/containerd.py"", line 849, in __prepare_response"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00      raise Exception(err)"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974746028+00:00  Exception: container create: creating container storage:the container name ""flexsnap-workflow-longrun-0-min"" is already in use by 53037fa21338f1a2a264242c0d24155f1416797ccad7fb5643b8e32e0eb71902. You have to remove that container to be able to reuse that name: that name is already in use"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974819915+00:00  Exception in thread Thread-4705:"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974819915+00:00  Traceback (most recent call last):"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.974819915+00:00    File ""/usr/lib64/python3.9/threading.py"", line 980,in _bootstrap_inner"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.975143662+00:00      self.run()"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.975143662+00:00    File""/opt/VRTScloudpoint/lib/flexsnap/connectors/base.py"", line 133, in run"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.975239750+00:00      self.send_exception(e, sys.exc_info()[2])"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.975239750+00:00    File ""/opt/VRTScloudpoint/lib/flexsnap/connectors/base.py"", line 232, in send_exception"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.975350847+00:00      raise e"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.975350847+00:00    File ""/opt/VRTScloudpoint/lib/flexsnap/connectors/base.py"", line 112, in run"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.975471419+00:00      func(self, user_id, **converted_values)"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.975471419+00:00    File ""/opt/VRTScloudpoint/bin/flexsnap-listener.py"",line 257, in register"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00      listener.create_min_wf_runners()"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00    File ""/opt/VRTScloudpoint/lib/flexsnap/listener_container.py"", line 306, in create_min_wf_runners"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00      self._create_workflow_container(cn_name,"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00    File ""/opt/VRTScloudpoint/lib/flexsnap/listener_container.py"", line 365, in _create_workflow_container"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00      self.cm.run(imagename, name=service_name, detach=True,"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00    File ""/opt/VRTScloudpoint/lib/flexsnap/containerd.py"", line 1495, in run"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00      cid = self.create_container(image, **kwargs)[""Id""]"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00    File ""/opt/VRTScloudpoint/lib/flexsnap/containerd.py"", line 894, in _wfunc"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00      return self.__prepare_response(req, kwargs)"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00    File ""/opt/VRTScloudpoint/lib/flexsnap/containerd.py"", line 849, in __prepare_response"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00      raise Exception(err)"
1e00cf4cb7b167aa37d53a2ec8c9a40820b7eeb5fc63349aad9d05d6cb5990ef: "2024-02-15T03:13:16.977017883+00:00  Exception: container create: creating container storage: the container name ""flexsnap-workflow-longrun-0-min"" is already in use by 53037fa21338f1a2a264242c0d24155f1416797ccad7fb5643b8e32e0eb71902. You have to remove that container to be able to reuse that name: that name is already in use"

aaece47e2a6b4308f960ac97fbd9fb5d4c0c192d193662d3f7f26b7c6138f861: "2024-02-15T03:13:19.746152757+00:00  2024-02-15 03:13:19.745606+00:00 [info] <0.888.0> accepting AMQP connection <0.888.0> ([fd00::1:8:10]:43304 -> [fd00::1:8:e]:5671)[0m"
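
To locate the conflict quickly in a long log, the relevant lines can be filtered out. The following is a minimal sketch, not part of the product: the function name find_name_conflicts is hypothetical, and it assumes the error text has the same shape as in the excerpt above. It reads log text on standard input and prints each conflicting container name together with the ID that still holds it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<container name> <holding container ID>" for each
# name-conflict error found on stdin, with duplicates removed.
find_name_conflicts() {
    # Keep only the lines that report a name already in use, then extract
    # the quoted container name and the hexadecimal ID that follows it.
    grep -F 'is already in use by' \
        | sed -E 's/.*container name "+([^"]+)"+ is already in use by ([0-9a-f]+).*/\1 \2/' \
        | sort -u
}
```

For example, run `find_name_conflicts < flexsnap.log` from the directory that holds the log. The log location is not stated in this article, so adjust the path for your host.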

Cause

Podman keeps each container's file-system hierarchy and configuration in /var/lib/containers/storage/overlay-containers. When a container is deleted from the host, its file-system hierarchy and configuration are normally removed from that location. Occasionally, for reasons that are not yet known, the file-system hierarchy or configuration remains on the host after the container is deleted. Snapshot Manager then runs into this condition, and on-demand workflow containers may fail to launch because of a container name conflict.
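
One way to check for such leftovers is to compare the directories in that location with the containers Podman still knows about. The following is a read-only sketch under stated assumptions: the function name list_orphan_dirs is hypothetical, and it assumes that each subdirectory of overlay-containers is named after a full container ID. It deletes nothing; it only prints directory names that have no matching ID.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list subdirectories of STORAGE_DIR whose names do not
# appear in the list of known container IDs supplied on stdin (one per line).
# Usage: list_orphan_dirs STORAGE_DIR < known_ids
list_orphan_dirs() (
    storage_dir=$1
    known_ids=$(cat)    # full container IDs read from stdin
    for dir in "$storage_dir"/*/; do
        [ -d "$dir" ] || continue    # skip when the glob matches nothing
        id=$(basename "$dir")
        # Report the directory only if no known ID matches its name exactly.
        if ! printf '%s\n' "$known_ids" | grep -qxF -- "$id"; then
            printf '%s\n' "$id"
        fi
    done
)
```

A possible invocation is `podman ps -a -q --no-trunc | list_orphan_dirs /var/lib/containers/storage/overlay-containers`. Any ID it prints is a candidate for the stale state described above.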

Solution

1. Disable the NetBackup Snapshot Manager from the Web UI.
2. Stop the NetBackup Snapshot Manager containers:
    
flexsnap_configure stop

3. Remove all the flexsnap containers:
    
# docker rm $(docker ps -a -q)
# podman rm $(podman ps -a -q)

4. Clean any exited podman containers:
    
podman container cleanup --all

5. Remove all dangling and unused images from local storage:
    
podman image prune --all

6. Remove all unused containers (both dangling and unreferenced), networks, and images from local storage:
    
podman system prune --all

7. Remove everything under /var/lib/containers/:
    
rm -rf /var/lib/containers/*

If there is an issue with the installed Podman version, also run the following commands in order:

  • yum remove podman
  • reboot
  • yum install podman
  • systemctl enable podman.socket
  • systemctl start podman.socket
  • systemctl enable podman-restart
  • systemctl start podman-restart


8. If there is no issue with the installed Podman version, reboot the host and continue with the remaining steps:

reboot

9. Run the preinstall script to load the NetBackup Snapshot Manager images:

./flexsnap_preinstall.sh

10. Install the NetBackup Snapshot Manager to create the flexsnap containers:

flexsnap_configure install -i

11. Enable NetBackup Snapshot Manager again from the Web UI.
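
Between the reboot in step 8 and the preinstall script in step 9, it can be useful to confirm that the cleanup left no flexsnap containers behind. The following is a minimal sketch: the function name flexsnap_containers is hypothetical, and it assumes that Snapshot Manager container names start with "flexsnap", as they do in the log excerpt. It reads "ID NAME" pairs on standard input and prints only the flexsnap entries.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "ID NAME" pairs on stdin and print the pairs whose
# container name starts with "flexsnap".
flexsnap_containers() {
    awk '$2 ~ /^flexsnap/ { print $1, $2 }'
}
```

A possible invocation is `podman ps -a --format '{{.ID}} {{.Names}}' | flexsnap_containers`. Empty output at that point suggests the old containers are gone; after step 10, the same command should list the newly created containers.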
