Docker MTU issues and solutions
If you want to use Docker on servers or virtual machines, technical limitations can sometimes lead to a situation in which (even without any intentional restriction) it is not possible to reach the outside world from a Docker container.
Docker MTU configuration
A common problem when operating Docker within a virtualization infrastructure is that the network cards provided to virtual machines do not have the default MTU of 1500. This is often the case, for example, when working in a cloud infrastructure (e.g. OpenStack). The Docker daemon does not check the MTU of the outgoing connection at startup; therefore, the Docker MTU is set to 1500.
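You can see which MTU the daemon has configured for its default bridge network by inspecting it; a minimal check, assuming the standard network name bridge:

docker network inspect bridge \
  --format '{{ index .Options "com.docker.network.driver.mtu" }}'
# On an unconfigured daemon this typically prints 1500.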
Detecting the problem
With the command ip link you can display the locally configured network cards and their MTU:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1454 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether aa:bb:cc:dd:ee:ff brd ff:ff:ff:ff:ff:ff
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether uu:vv:ww:xx:yy:zz brd ff:ff:ff:ff:ff:ff
If the outgoing interface (in this case ens3) has an MTU smaller than 1500, some action is required. If it is greater than or equal to 1500, this problem does not apply to you.
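To confirm that an MTU mismatch (rather than DNS or routing) is the culprit, you can probe the path MTU with ping and the don't-fragment flag; a rough sketch, assuming iputils ping and that ICMP is not filtered along the path:

# 1472 bytes of payload + 28 bytes of IP/ICMP headers = 1500 bytes on the wire.
ping -c 1 -M do -s 1472 example.com   # expected to fail behind a 1454 uplink ("Message too long")
ping -c 1 -M do -s 1426 example.com   # 1426 + 28 = 1454, expected to succeed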
Solving the problem (docker daemon)
To solve the problem, you need to configure the Docker daemon in such a way that the virtual network card of newly created containers gets an MTU that is smaller than or equal to that of the outgoing network card. For this purpose create the file /etc/docker/daemon.json with the following content:
{ "mtu": 1454 }
In this example, I chose 1454 as the value, as this corresponds to the MTU of the outgoing network card (ens3). After restarting the Docker daemon, the MTU of newly created containers should be adjusted accordingly. Note, however, that docker-compose creates a new (bridge) network for every docker-compose environment by default, so an extra step is needed there (see the next section).
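A minimal restart-and-verify sequence could look like this (assuming systemd and the busybox image):

sudo systemctl restart docker
# A new container on the default bridge should now report mtu 1454 on eth0.
docker run --rm busybox ip link show eth0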
Solving the problem (docker-compose)
If you work with docker-compose, you will notice that containers created by docker-compose do not inherit the MTU from the daemon. This happens because the mtu entry in the /etc/docker/daemon.json file only affects the default bridge. Therefore, you have to specify the MTU explicitly in the docker-compose.yml for the newly created network:
...
networks:
  default:
    driver: bridge
    driver_opts:
      com.docker.network.driver.mtu: 1454
After rebuilding the docker-compose environment (docker-compose down; docker-compose up), the containers should use the modified MTU.
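To check that the option was picked up, inspect the network that docker-compose created; a sketch assuming a compose project called myproject (the network is usually named <project>_default):

docker network inspect myproject_default \
  --format '{{ index .Options "com.docker.network.driver.mtu" }}'
# Should print 1454.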
I personally don’t like this solution, because the docker-compose files have to be specially adapted to their environment and therefore lose their portability. Unfortunately, I am not aware of any other solution to this problem at the moment.
15 COMMENTS
Thanks for this, solved my problem. I spent a couple of days trying to work this out. I’m running Docker on a single Ubuntu host, but behind a VPN which is also configured on the host. The containers could ping out and resolve DNS fine, but couldn’t transfer files. Turns out the VPN connection was using an MTU of 1400, and Docker was using the default of 1500.
Is there a Kubernetes/Calico setting that sets MTUs for new containers like your docker-compose example? I updated the Calico MTU as follows, to no effect: kubectl edit configmaps calico-config -n kube-system
(Red Hat Enterprise Linux. /etc/docker/daemon.json value is just getting ignored. docker0 network still shows default 1500)
Hi,
I guess all the Calico nodes need to be restarted. I usually use Kubespray for installing a Kubernetes cluster, which also has support for setting a Calico MTU, so maybe looking at the code there can help you understand how the MTU can be configured. Maybe you can check https://mlohr.com/kubernetes-cluster-on-hetzner-bare-metal-servers/ for more information.
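One way to restart the Calico pods after changing the ConfigMap is a rolling restart of the calico-node DaemonSet; a sketch, assuming the default resource names from Calico's manifests:

# Restart calico-node so it picks up the changed calico-config ConfigMap.
kubectl -n kube-system rollout restart daemonset/calico-node
kubectl -n kube-system rollout status daemonset/calico-node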
Best regards
Matthias
Thx for the description. I think you can keep the docker-compose portability by using a docker-compose.override.yaml, which is read and merged with the default YAML. That way each server can have its own settings there, and the main YAML can be versioned.
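For illustration, such an override could look roughly like this (a docker-compose.override.yml next to the main file, kept out of version control; the MTU value is host-specific):

# docker-compose.override.yml
networks:
  default:
    driver: bridge
    driver_opts:
      com.docker.network.driver.mtu: 1454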
Incredible. This fixed what seemed to be a completely intractable problem. The symptom for me was that all networking was fine except when I was sending API POST requests with larger than usual payloads. The response would time out after 5 minutes with a socket hang up.
While I was trying to figure out why I couldn’t connect to the host network (which I never did figure out … ) I found the command:
docker network inspect bridge
When I ran that I saw “mtu”:1500 and that gave me a new avenue to search.
So I’m going to post here in the comments to hopefully widen the search surface area:
If you’re seeing a socket hang up / ECONNRESET in Node.js when calling an HTTP/HTTPS REST API, but you only see it in Docker and not when run straight from Node, check whether you need to edit the MTU of your default Docker bridge network.
Thank you SO MUCH for posting this article. I thought I was going to have to walk away from an unsolved problem after spending approximately 6 hours trying to figure it out!
Glad to help 🙂
Hello,
I’ve also been bitten by this: I run Docker on my home DSL router, where my PPPoE internet connection leaves me with an MTU of 1492.
I also use firewalld on the router. With older (I believe < 1.0) firewalld versions, one needed to enable TCP MSS clamping using iptables rules to work around MTU issues for the internal network. IIRC this was a POSTROUTING rule for me. With newer versions of firewalld, iptables has been replaced with nftables, so the iptables rules no longer apply. The best solution is to use policy objects (a sketch of the commands follows below): https://firewalld.org/2020/10/tcp-mss-clamp
I had this solution in place, but only for traffic from the internal network zone to the external network zone (= DSL). Today I found that every machine in my internal network was working w.r.t. a specific site that caused problems. But docker containers on my router could not access the site in question. Turns out I had forgotten to apply the policy also to firewalld's "docker" zone!
I like this solution because the default MTU of 1500 can stay in place for the default bridge network and for the bridges created by docker-compose. There's no need for overrides. Firewalld (or just nftables) manages MTU for all containers.
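A rough sketch of the policy-object commands (zone names are examples; based on the firewalld article linked above, which requires firewalld >= 1.0):

# Clamp TCP MSS to the path MTU for traffic from the docker zone towards the external (DSL) zone.
firewall-cmd --permanent --new-policy clamp-mss
firewall-cmd --permanent --policy clamp-mss --add-ingress-zone docker
firewall-cmd --permanent --policy clamp-mss --add-egress-zone external
firewall-cmd --permanent --policy clamp-mss --add-rich-rule='rule tcp-mss-clamp value="pmtu"'
firewall-cmd --reload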
Thanks for the pointer in the right direction!
Alex
Hello,
thanks for the article. Helped me a lot after several hours of debugging.
I have now written a small bash script to detect an MTU mismatch between a given Docker container and the default uplink interface. It also proposes additional fixes, such as setting the MTU of the Docker interface with ip link, which can be helpful if the container has already been created (no down & up required); the ip link approach is sketched below.
Feel free to use/improve/add.
Usage: ./docker_mtu_checker.sh
Returns: Either a message with a match or a mismatch with the MTU values
Source: https://gitlab.com/-/snippets/2343542
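A rough sketch of the ip link approach mentioned above (interface and container names as well as the MTU value are placeholders; this is not taken from the linked script):

# Lower the MTU of the host-side Docker bridge to match the uplink.
sudo ip link set dev docker0 mtu 1454
# Adjust eth0 of an already running container from the host via its network namespace.
PID=$(docker inspect --format '{{.State.Pid}}' my_container)
sudo nsenter --target "$PID" --net ip link set dev eth0 mtu 1454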
[…] The problem is described here, though I used the solution described here. […]
[…] look at this : https://mlohr.com/docker-mtu/ […]
> I personally don’t like this solution, because the docker-compose files have to be specially adapted to their environment and therefore lose their portability. Unfortunately, I am not aware of any other solution to this problem at the moment.
Matthias and others: there’s an upcoming fix in Docker/Moby which makes it possible to apply this to all networks being created by default. Unfortunately, it is only available in Docker 24.0 and greater, but still. Here is the PR: https://github.com/moby/moby/pull/43197
Thanks for pointing this out! I also don’t like the current solution that much, but somehow I needed to get things to work. From my first impression, this new feature would definitely improve the situation.
Agree, it seems like it will make things better. We’ll just have to wait some year(s) for it to be available. 😉
Thanks for the hint about “default-network-opt”, Per. And also to Matthias for this article!
We ran into this problem when we added 3 additional servers to a Docker Swarm cluster. For the new servers a request to get a 9 KB image would fail, while a request for a 1 KB image was working. Apparently, there is a difference between Ubuntu 20.04 and 22.04 (or some other configuration). By setting a custom MTU value in /etc/docker/daemon.json using “default-network-opt” on the new servers, it was working correctly.
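For reference, with Docker 24.0 or newer the daemon-wide default for newly created bridge networks can be set in /etc/docker/daemon.json roughly like this (the key name follows the PR linked above; treat it as a sketch and check the documentation of your Docker version):

{
  "mtu": 1454,
  "default-network-opts": {
    "bridge": {
      "com.docker.network.driver.mtu": "1454"
    }
  }
}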
Nice to see people from such great companies read my blog posts, and nice to be of help with docker 🙂