Managing Swarm environment with portainer: could not get it to work

Posts: 3
Topic starter
(@ifs77)
Active Member
Joined: 21 hours ago

I'm a newbie homelabber and was very impressed by Brandon's YouTube video "Best Container Server Setup". It seems that Swarm + Portainer really is an ideal middle ground between bare Docker and Kubernetes, so thanks to Brandon for highlighting this solution.
I tried to reproduce Brandon's setup in my lab, but gave up after two full days of effort.
I couldn't connect the Portainer server to the agents on the nodes. It simply doesn't work: I get "Client.Timeout exceeded while awaiting headers" every time I press the "Connect" button in the environment creation dialog. The only option that works is connecting to Swarm via the socket, but that approach doesn't give you the Cluster Visualizer, which makes it mostly useless.
I made numerous attempts to bootstrap it. I started by carefully repeating all the steps from the video, then went to the Portainer instance (which runs on a node outside the cluster), copied the commands provided by the wizard to a destination node and, after deploying the agents, tried to connect Portainer to them. As soon as that failed, I went looking for a cause: fiddling with UFW on Ubuntu 24.04, iptables and DNS; redeploying the nodes with and without Keepalived; reinstalling several Portainer versions; installing Portainer both outside and inside the cluster; recreating the VMs from the full Ubuntu image instead of the cloud image; and so on. I ended up trying to roll this out on Debian instead of Ubuntu, but it didn't work either.
As far as I can tell from some of the comments on the YouTube video, this is a common problem and may be a bug in Portainer. I found similar complaints on their GitHub and elsewhere on the internet, but unfortunately no suitable solution or explanation.
Given all that, I would consider it a bug and give up on it entirely, but I saw it working in the video.
Could somebody advise me on what I might be doing wrong and what the right path is?
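One way to narrow down a timeout like this is to check, from the Portainer host, whether the agent's TCP port can be reached at all on each node. The following is only a minimal sketch under stated assumptions: `check_tcp` is a hypothetical helper name, the node IPs in the usage comment are placeholders, and 9001 is assumed to be the port the agent is published on.

```shell
#!/bin/sh
# check_tcp HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
# Relies on bash's /dev/tcp pseudo-device, so bash must be installed.
check_tcp() {
  timeout "${3:-3}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage (run on the Portainer host; the IPs below are placeholders):
#   for node in 10.0.0.61 10.0.0.62 10.0.0.63; do
#     if check_tcp "$node" 9001; then
#       echo "$node:9001 reachable"
#     else
#       echo "$node:9001 NOT reachable"
#     fi
#   done
```

A refused or timed-out connection here would point at the network path or the published port rather than at the Portainer UI itself.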
---
Related links that I've used:
- https://docs.portainer.io/start/install/server/swarm/linux
- https://github.com/portainer/portainer/issues/11362
- https://github.com/portainer/portainer/issues/10602

4 Replies
Brandon Lee
Posts: 423
Admin
(@brandon-lee)
Member
Joined: 14 years ago

@ifs77 welcome to the forums! I noticed a couple of others mention in the comments they were having issues as well. Give me more details on how you set things up. Hopefully we can figure out what is going on there. 👍

Brandon Lee
Admin
(@brandon-lee)
Joined: 14 years ago

Member
Posts: 423

@ifs77 Also, just re-reading your post, did you set up the Docker Swarm cluster using the service command from Portainer?

The command looks like this:

docker network create \
  --driver overlay \
  portainer_agent_network

docker service create \
  --name portainer_agent \
  --network portainer_agent_network \
  -p 9001:9001/tcp \
  --mode global \
  --constraint 'node.platform.os == linux' \
  --mount type=bind,src=//var/run/docker.sock,dst=/var/run/docker.sock \
  --mount type=bind,src=//var/lib/docker/volumes,dst=/var/lib/docker/volumes \
  --mount type=bind,src=//,dst=/host \
  portainer/agent:2.21.5

Also, I am running Keepalived with a virtual IP address that I pointed the wizard to when onboarding my swarm cluster. The above command sets up the agent in "global" mode which means it will run on each of your swarm nodes.
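Because the service runs in global mode, one quick sanity check is to compare the number of agent tasks with the number of nodes in the swarm. The sketch below assumes it is run on a manager node and that the service keeps the name `portainer_agent` from the command above; `agent_coverage` is a hypothetical helper name, not something Portainer provides.

```shell
#!/bin/sh
# agent_coverage [SERVICE_NAME]
# Prints "<agent tasks with desired state running>/<swarm nodes>", e.g. "3/3".
# Must be run on a Swarm manager node.
agent_coverage() {
  service="${1:-portainer_agent}"
  running=$(docker service ps "$service" --filter desired-state=running -q | wc -l | tr -d ' ')
  nodes=$(docker node ls -q | wc -l | tr -d ' ')
  echo "$running/$nodes"
}

# Usage:
#   agent_coverage                      # compare task count and node count
#   docker service ps portainer_agent   # per-node task state and error column
```

If the two numbers differ, the error column of `docker service ps` is a reasonable place to look next. Note that the node count includes every node in the swarm, so nodes that are down or excluded by the OS constraint would also produce a mismatch.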

Posts: 3
Topic starter
(@ifs77)
Active Member
Joined: 21 hours ago

Hello Brandon,
Thank you for the quick reply!
Yes, I can easily give you more details, as I was taking notes of every step I did. I'll attach a file to this post (just change the extension to .md, because .md files are not allowed on this forum).
Yes, I was executing the commands from Portainer's wizard. I tried to connect with the virtual IP first; then, after it threw an error, I tried the individual IPs of all the nodes. Then I decided to eliminate Keepalived from the chain because I suspected it was the source of the error. I rolled all VMs back to bare Ubuntu and repeated all the steps without Keepalived, using the individual node IPs, but the result was the same.

(@ifs77)
Joined: 21 hours ago

Active Member
Posts: 3

Sorry, I didn't describe my infrastructure.
I have 2 hardware servers, one running ESXi and the other Proxmox 8.3.2. My first attempt was to distribute the Swarm nodes between those 2 servers; then I decided to simplify things and run them all on the single Proxmox server. The firewall service is disabled in Proxmox at the datacenter and node levels, and I don't have any filtering or restriction rules, VLANs, etc. in my home network. It's just a simple network with one subnet, one gateway (the router), DHCP from the router, and a single DNS server on Pi-hole. All VMs were using DHCP.
While my Swarm node VMs were running, their FQDNs were resolvable with nslookup from my desktop (Windows and macOS):

Server:         10.0.0.254
Address:        10.0.0.254#53
Name:   swarm-3.babylon-8.local
Address: 10.0.0.63

However, nslookup gave strange output in Ubuntu on the Swarm nodes themselves:

;; Got SERVFAIL reply from 127.0.0.53
Server: 127.0.0.53
Address: 127.0.0.53#53
** server can't find swarm-3.babylon-8.local: SERVFAIL

I don't have this issue in Debian, but my Debian Swarm was also not reachable from Portainer.
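127.0.0.53 is the local stub resolver of systemd-resolved, so the SERVFAIL reply comes from the node's own resolver, not directly from 10.0.0.254. One possible explanation, stated here as an assumption and not verified against this setup, is that `.local` names are reserved for multicast DNS and systemd-resolved may not forward them to a unicast DNS server by default. The sketch below shows one way to compare the two resolution paths; `is_mdns_name` and `diagnose_dns` are hypothetical helper names, and the server address is taken from the nslookup output above.

```shell
#!/bin/sh
# is_mdns_name NAME
# Succeeds if NAME ends in ".local", the domain reserved for mDNS.
is_mdns_name() {
  case "$1" in
    *.local|*.local.) return 0 ;;
    *) return 1 ;;
  esac
}

# diagnose_dns NAME SERVER
# Resolves NAME through the local stub resolver and directly against SERVER,
# so that the two answers can be compared side by side.
diagnose_dns() {
  if is_mdns_name "$1"; then
    echo "note: $1 is under .local, which is reserved for mDNS"
  fi
  echo "--- via systemd-resolved ---"
  resolvectl query "$1"
  echo "--- directly against $2 ---"
  nslookup "$1" "$2"
}

# Usage (on a Swarm node):
#   diagnose_dns swarm-3.babylon-8.local 10.0.0.254
```

Since connecting by individual node IP failed as well, the DNS issue may turn out to be unrelated to the Portainer timeout.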
