Docker MTU issues and solutions
If you want to use Docker on servers or virtual machines, technical limitations can sometimes lead to a situation in which (even without any intentional restriction) it is not possible to reach the outside world from a Docker container.
Docker MTU configuration
A common problem when operating Docker within a virtualization infrastructure is that the network cards provided to virtual machines do not have the default MTU of 1500. This is often the case, for example, when working in a cloud infrastructure (e.g. OpenStack). The Docker daemon does not check the MTU of the outgoing connection at startup; therefore, the Docker MTU is set to 1500.
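You can see which MTU the daemon has configured for its default bridge network by inspecting it; a minimal check, assuming the standard network name bridge:

docker network inspect bridge \
  --format '{{ index .Options "com.docker.network.driver.mtu" }}'
# On an unconfigured daemon this typically prints 1500.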
Detecting the problem
With the command ip link you can display the locally configured network cards and their MTU:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1454 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether aa:bb:cc:dd:ee:ff brd ff:ff:ff:ff:ff:ff
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether uu:vv:ww:xx:yy:zz brd ff:ff:ff:ff:ff:ff
If the outgoing interface (in this case ens3) has an MTU smaller than 1500, some action is required. If it is greater than or equal to 1500, this problem does not apply to you.
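To confirm that an MTU mismatch (rather than DNS or routing) is the culprit, you can probe the path MTU with ping and the don't-fragment flag; a rough sketch, assuming iputils ping and that ICMP is not filtered along the path:

# 1472 bytes of payload + 28 bytes of IP/ICMP headers = 1500 bytes on the wire.
ping -c 1 -M do -s 1472 example.com   # expected to fail behind a 1454 uplink ("Message too long")
ping -c 1 -M do -s 1426 example.com   # 1426 + 28 = 1454, expected to succeed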
Solving the problem (docker daemon)
To solve the problem, you need to configure the Docker daemon in such a way that the virtual network card of newly created containers gets an MTU that is smaller than or equal to that of the outgoing network card. For this purpose create the file /etc/docker/daemon.json with the following content:
{ "mtu": 1454 }
In this example, I chose 1454 as the value, as this corresponds to the MTU of the outgoing network card (ens3). After restarting the Docker daemon, the MTU of newly created containers should be adjusted accordingly. Note, however, that docker-compose creates a new (bridge) network for every docker-compose environment by default, so an extra step is needed there (see the next section).
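A minimal restart-and-verify sequence could look like this (assuming systemd and the busybox image):

sudo systemctl restart docker
# A new container on the default bridge should now report mtu 1454 on eth0.
docker run --rm busybox ip link show eth0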
Solving the problem (docker-compose)
If you work with docker-compose, you will notice that containers created by docker-compose do not inherit the MTU from the daemon. This happens because the mtu entry in the /etc/docker/daemon.json file only affects the default bridge. Therefore, you have to specify the MTU explicitly in the docker-compose.yml for the newly created network:
...
networks:
  default:
    driver: bridge
    driver_opts:
      com.docker.network.driver.mtu: 1454
After rebuilding the docker-compose environment (docker-compose down; docker-compose up), the containers should use the modified MTU.
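To check that the option was picked up, inspect the network that docker-compose created; a sketch assuming a compose project called myproject (the network is usually named <project>_default):

docker network inspect myproject_default \
  --format '{{ index .Options "com.docker.network.driver.mtu" }}'
# Should print 1454.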
I personally don’t like this solution, because the docker-compose files have to be specially adapted to their environment and therefore lose their portability. Unfortunately, I am not aware of any other solution to this problem at the moment.
15 COMMENTS
Thanks for this, solved my problem. I spent a couple of days trying to work this out. I’m running Docker on a single Ubuntu host, but behind a VPN which is also configured on the host. The containers could ping out and resolve DNS fine, but couldn’t transfer files. Turns out the VPN connection was using an MTU of 1400, and Docker was using the default of 1500.
Is there a Kubernetes/Calico setting that sets MTUs for new containers like your docker-compose example? I updated the Calico MTU as follows, to no effect: kubectl edit configmaps calico-config -n kube-system
(Red Hat Enterprise Linux. /etc/docker/daemon.json value is just getting ignored. docker0 network still shows default 1500)
Hi,
I guess all the Calico nodes need to be restarted. I usually use Kubespray for installing a Kubernetes cluster, which also has support for setting a Calico MTU, so maybe looking at the code there can help you understand how the MTU can be configured. Maybe you can check https://mlohr.com/kubernetes-cluster-on-hetzner-bare-metal-servers/ for more information.
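One way to restart the Calico pods after changing the ConfigMap is a rolling restart of the calico-node DaemonSet; a sketch, assuming the default resource names from Calico's manifests:

# Restart calico-node so it picks up the changed calico-config ConfigMap.
kubectl -n kube-system rollout restart daemonset/calico-node
kubectl -n kube-system rollout status daemonset/calico-node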
Best regards
Matthias
Thx for the description. I think you can keep the docker-compose portability by using a docker-compose.override.yaml, which is read and merged with the default YAML. That way each server can have its own settings there, and the main YAML can be versioned.
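For illustration, such an override could look roughly like this (a docker-compose.override.yml next to the main file, kept out of version control; the MTU value is host-specific):

# docker-compose.override.yml
networks:
  default:
    driver: bridge
    driver_opts:
      com.docker.network.driver.mtu: 1454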
Incredible. This fixed what seemed to be a completely intractable problem. The symptom for me was that all networking was fine except when I was sending API POST requests with larger than usual payloads. The response would time out after 5 minutes with a socket hang up.
While I was trying to figure out why I couldn’t connect to the host network (which I never did figure out … ) I found the command:
docker network inspect bridge
When I ran that I saw “mtu”:1500 and that gave me a new avenue to search.
So I’m going to post here in the comments to hopefully widen the search surface area:
If you’re seeing a socket hang up / ECONNRESET in Node.js when calling an HTTP/HTTPS REST API, but you only see it in Docker and not when run straight from Node, check whether you need to edit the MTU of your default Docker bridge network.
Thank you SO MUCH for posting this article. I thought I was going to have to walk away from an unsolved problem after spending approximately 6 hours trying to figure it out!
Glad to help 🙂
Hello,
I’ve also been bitten by this: I run Docker on my home DSL router, where my PPPoE internet connection leaves me with an MTU of 1492.
I also use firewalld on the router. With older (I believe < 1.0) firewalld versions, one needed to enable TCP MSS clamping using iptables rules to work around MTU issues for the internal network. IIRC this was a POSTROUTING rule for me. With newer versions of firewalld, iptables has been replaced with nftables, so the iptables rules no longer apply. The best solution is to use policy objects (a sketch of the commands follows below): https://firewalld.org/2020/10/tcp-mss-clamp
I had this solution in place, but only for traffic from the internal network zone to the external network zone (= DSL). Today I found that every machine in my internal network was working w.r.t. a specific site that caused problems. But docker containers on my router could not access the site in question. Turns out I had forgotten to apply the policy also to firewalld's "docker" zone!
I like this solution because the default MTU of 1500 can stay in place for the default bridge network and for the bridges created by docker-compose. There's no need for overrides. Firewalld (or just nftables) manages MTU for all containers.
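A rough sketch of the policy-object commands (zone names are examples; based on the firewalld article linked above, which requires firewalld >= 1.0):

# Clamp TCP MSS to the path MTU for traffic from the docker zone towards the external (DSL) zone.
firewall-cmd --permanent --new-policy clamp-mss
firewall-cmd --permanent --policy clamp-mss --add-ingress-zone docker
firewall-cmd --permanent --policy clamp-mss --add-egress-zone external
firewall-cmd --permanent --policy clamp-mss --add-rich-rule='rule tcp-mss-clamp value="pmtu"'
firewall-cmd --reload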
Thanks for the pointer in the right direction!
Alex
Hello,
thanks for the article. Helped me a lot after several hours of debugging.
I have now written a small bash script to detect an MTU mismatch between a given Docker container and the default uplink interface. It also proposes additional fixes, such as setting the MTU of the Docker interface with ip link, which can be helpful if the container has already been created (no down & up required); the ip link approach is sketched below.
Feel free to use/improve/add.
Usage: ./docker_mtu_checker.sh
Returns: Either a message with a match or a mismatch with the MTU values
Source: https://gitlab.com/-/snippets/2343542
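A rough sketch of the ip link approach mentioned above (interface and container names as well as the MTU value are placeholders; this is not taken from the linked script):

# Lower the MTU of the host-side Docker bridge to match the uplink.
sudo ip link set dev docker0 mtu 1454
# Adjust eth0 of an already running container from the host via its network namespace.
PID=$(docker inspect --format '{{.State.Pid}}' my_container)
sudo nsenter --target "$PID" --net ip link set dev eth0 mtu 1454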
[…] The problem is described here, though I used the solution described here. […]
[…] look at this : https://mlohr.com/docker-mtu/ […]
> I personally don’t like this solution, because the docker-compose files have to be specially adapted to their environment and therefore lose their portability. Unfortunately, I am not aware of any other solution to this problem at the moment.
Matthias and others: there’s an upcoming fix in Docker/Moby which makes it possible to apply this to all networks being created by default. Unfortunately, it is only available in Docker 24.0 and greater, but still. Here is the PR: https://github.com/moby/moby/pull/43197
Thanks for pointing this out! I also don’t like the current solution that much, but somehow I needed to get things to work. From my first impression, this new feature would definitely improve the situation.
Agree, it seems like it will make things better. We’ll just have to wait some year(s) for it to be available. 😉
Thanks for the hint about “default-network-opt”, Per. And also to Matthias for this article!
We ran into this problem when we added 3 additional servers to a Docker Swarm cluster. For the new servers a request to get a 9 KB image would fail, while a request for a 1 KB image was working. Apparently, there is a difference between Ubuntu 20.04 and 22.04 (or some other configuration). By setting a custom MTU value in /etc/docker/daemon.json using “default-network-opt” on the new servers, it was working correctly.
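For reference, with Docker 24.0 or newer the daemon-wide default for newly created bridge networks can be set in /etc/docker/daemon.json roughly like this (the key name follows the PR linked above; treat it as a sketch and check the documentation of your Docker version):

{
  "mtu": 1454,
  "default-network-opts": {
    "bridge": {
      "com.docker.network.driver.mtu": "1454"
    }
  }
}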
Nice to see people from such great companies read my blog posts, and nice to be of help with docker 🙂