Docker network timeouts
-
You may have noticed the lack of oneboxing around here. I updated iframely on the server and now all the requests time out. For instance:
-- [21-05-13 15:55:28]:347 -- plugin response: {"plugin":"htmlparser","response":"timeout","uri":"https://i.imgur.com/AGnnrMR.jpg"} Processing error: timeout
I added curl to the image and it's able to pull those down no problem from within the container. Anyone have any ideas?
-
Is the node thing using the right nameserver information?
-
@Tsaukpaetra what is "nameserver information?"
-
@boomzilla said in Docker network timeouts:
@Tsaukpaetra what is "nameserver information?"
I don't know how to express it. Like, if it cached the DNS for the IP address but that's no longer right so it's asking the wrong server?
-
@Tsaukpaetra said in Docker network timeouts:
@boomzilla said in Docker network timeouts:
@Tsaukpaetra what is "nameserver information?"
I don't know how to express it. Like, if it cached the DNS for the IP address but that's no longer right so it's asking the wrong server?
Uh...seems unlikely. It happens now for every domain. Requests from ping and curl seem fine.
I just restarted the container (which I'd think would invalidate any caching going on inside the app itself) and then this showed up in the logs:
-- [21-05-13 18:51:15]:33 172.18.0.254 - Loading /iframely for https://www.nbcnews.com/news/us-news/she-called-out-health-care-misinfo-tiktok-then-trolls-found-n1265011
-- [21-05-13 18:51:20]:33 -- plugin response: {"plugin":"htmlparser","response":"timeout","uri":"https://www.nbcnews.com/news/us-news/she-called-out-health-care-misinfo-tiktok-then-trolls-found-n1265011"} Processing error: timeout
-
I guess the "good" news is that I have iframely in a docker container in my development VM and I'm getting similar timeouts.
-
Wireshark capture (in way over my head here):
So...line 4 you can see where I ran curl against my iframely endpoint, which is at 172.17.0.2. At 7 it gets the IP address for wikipedia, 208.80.154.224. Then there's a bit of back and forth.
Between 22 and 23 there's almost a 5 second delay. Timeout is set at...5 seconds, after which the 408 response comes back to me.
Then long after that there's some more traffic (at 161s and then again at 180s).
-
For comparison, running iframely not in docker:
Oneboxing twitter. Lots more traffic, but the response comes back instantly.
-
@boomzilla Stupid idea, but are the ingress and egress ports properly mapped for docker?
-
@Benjamin-Hall uh...hopefully. I'm basically running the Dockerfile here:
It appears to be successfully talking to the outside world. And I can ping and curl from in there. I'm starting to think it's an iframely configuration thing. But watching wireshark...it seems to be able to reach the outside world on its own.
-
@boomzilla said in Docker network timeouts:
I updated iframely on the server and now all the requests time out.
Do you still have the previous container that worked?
If you didn't do a docker image prune, it should still be there, just no longer tagged (it will still have the correct repository name, but <none> instead of the version), so you should be able to fish it out. Then you can 1) run the old version again until you find why the new one does not work and 2) compare the old and new version.
@boomzilla said in Docker network timeouts:
For comparison, running iframely not in docker
Is that exactly the same version?
-
@Benjamin-Hall said in Docker network timeouts:
@boomzilla Stupid idea, but are the ingress and egress ports properly mapped for docker?
I don't think any port mapping is happening here at all. I suppose this is a Linux docker, and there the container just gets its own IP address and that is masqueraded by iptables using the default masquerade functionality.
-
@boomzilla The only application data originating from iframely is frame 19, and that's only 146 bytes of TLS data. I'm not sure on the details of TLS1.3, but ChangeCipherSpec is 1 byte and I think record overhead is still 5 bytes each, so that's 143 bytes for ApplicationData. On earlier TLS versions I'd also be expecting 32 or so bytes for a Finished message here, but since wireshark isn't listing any further handshake activity I'd guess that doesn't apply anymore.
I'd hazard there's probably 111 bytes of actual request data. That's not very much, but if they use a low-overhead HTTP library (or they're hand-rolling requests) it's possible.
What surprises me way more though is that the wikipedia server starts sending data right away. Even in HTTP 2.0, I'm pretty sure the first protocol communication is from the client not the server. And if it was out of sequence wireshark would say so in the overview.
-
@Bulb said in Docker network timeouts:
@boomzilla said in Docker network timeouts:
I updated iframely on the server and now all the requests time out.
Do you still have the previous container that worked?
If you didn't do a docker image prune, it should still be there, just no longer tagged (it will still have the correct repository name, but <none> instead of the version), so you should be able to fish it out. Then you can 1) run the old version again until you find why the new one does not work and 2) compare the old and new version.
How do I "fish it out?" I definitely haven't done the pruning thing.
@boomzilla said in Docker network timeouts:
For comparison, running iframely not in docker
Is that exactly the same version?
Yes, from the HEAD of the git repo.
-
@boomzilla said in Docker network timeouts:
How do I "fish it out?" I definitely haven't done the pruning thing.
Ah ha. docker image history.
    $ docker image history iframely
    IMAGE          CREATED        CREATED BY                                      SIZE     COMMENT
    62ad11de8362   31 hours ago   /bin/sh -c #(nop) ENTRYPOINT ["/iframely/...    0B
    7fd48517e92f   31 hours ago   /bin/sh -c #(nop) USER [iframely]               0B
    440fed056314   31 hours ago   /bin/sh -c #(nop) COPY dir:2c2c06f5d84e99a...   1.23MB
    05994a1569ee   31 hours ago   /bin/sh -c yarn install --pure-lockfile --...   76.5MB
    787849561500   31 hours ago   /bin/sh -c #(nop) COPY multi:369b38c1382bc...   91.1kB
    6365ac938bc0   31 hours ago   /bin/sh -c #(nop) ENV NODE_ENV=local            0B
    872706dbdcbe   31 hours ago   /bin/sh -c addgroup -S iframelygroup && ad...   4.88kB
    79841b944ec3   31 hours ago   /bin/sh -c #(nop) WORKDIR /iframely             0B
    55eb8bd24be0   31 hours ago   /bin/sh -c #(nop) EXPOSE 8061/tcp               0B
    1448646743d0   7 months ago   /bin/sh -c #(nop) CMD ["node"]                  0B
So, the one from 7 months ago should be the old working version. Welp...that didn't work.
    docker run -d --name wtdwtf-iframely \
        --network wtdwtf \
        --restart unless-stopped \
        --ip 172.18.0.105 \
        --volumes-from wtdwtf-iframely-old \
        1448646743d0
Error response from daemon: lstat /home/docker/overlay2/5057be58e3f9b22f248dfcbe1163212c53ffaafe1357f2220c34b1e571470c64/merged/iframely/lib/plugins/validators/sync: no such file or directory
-
OK...well, managed to restore back to broken container.
-
@boomzilla said in Docker network timeouts:
@boomzilla said in Docker network timeouts:
How do I "fish it out?" I definitely haven't done the pruning thing.
Ah ha. docker image history.
No. That's history of the image itself. Not images that previously held that name.
    $ docker image history iframely
    IMAGE          CREATED        CREATED BY                                      SIZE     COMMENT
    62ad11de8362   31 hours ago   /bin/sh -c #(nop) ENTRYPOINT ["/iframely/...    0B
    7fd48517e92f   31 hours ago   /bin/sh -c #(nop) USER [iframely]               0B
    440fed056314   31 hours ago   /bin/sh -c #(nop) COPY dir:2c2c06f5d84e99a...   1.23MB
    05994a1569ee   31 hours ago   /bin/sh -c yarn install --pure-lockfile --...   76.5MB
    787849561500   31 hours ago   /bin/sh -c #(nop) COPY multi:369b38c1382bc...   91.1kB
    6365ac938bc0   31 hours ago   /bin/sh -c #(nop) ENV NODE_ENV=local            0B
    872706dbdcbe   31 hours ago   /bin/sh -c addgroup -S iframelygroup && ad...   4.88kB
    79841b944ec3   31 hours ago   /bin/sh -c #(nop) WORKDIR /iframely             0B
    55eb8bd24be0   31 hours ago   /bin/sh -c #(nop) EXPOSE 8061/tcp               0B
    1448646743d0   7 months ago   /bin/sh -c #(nop) CMD ["node"]                  0B
So, the one from 7 months ago should be the old working version. Welp...that didn't work.
No, it is the base image from which the iframely one is created.
You want docker images and there will be the iframely latest, which is the current one, and some iframely <none> ones that are the previous versions.
Note that all iframely images must have been created by the same command, so one created with CMD ["node"] cannot be it; that's the base image with node.
-
@Bulb hmm...nope, I don't see anything like that. I guess maybe I'll try setting up a non-dockerized iframely today while I try to get the dockerized version figured out.
-
@boomzilla What does docker images give you?
-
    boomzilla@what:~$ docker images
    REPOSITORY                       TAG                IMAGE ID       CREATED         SIZE
    iframely                         latest             fecf3ea05ad9   14 hours ago    168MB
    boomzillawtf/tdwtf               latest             e25f64dcdd55   44 hours ago    1.41GB
    <none>                           <none>             62ad11de8362   45 hours ago    168MB
    <none>                           <none>             71aad97f1576   4 weeks ago     1.42GB
    iframely                         last_good          1448646743d0   7 months ago    89.7MB
    iframely_last_good               latest             1448646743d0   7 months ago    89.7MB
    node                             12.18-alpine3.12   1448646743d0   7 months ago    89.7MB
    nodebb/docker                    v1.14.3            865b04cf3552   8 months ago    1.31GB
    <none>                           <none>             96e5496e9abc   11 months ago   1.44GB
    <none>                           <none>             2c6d1522ad59   11 months ago   1.44GB
    <none>                           <none>             a0972793c0f4   11 months ago   1.44GB
    nodebb/docker                    v1.14.0-7          daefbeb3c29d   11 months ago   1.34GB
    nodebb/docker                    v1.13.3            863d1db2f8b6   12 months ago   1.34GB
    inedo-prod-forum_nodebb          latest             6fb04b55620d   14 months ago   1.38GB
    <none>                           <none>             dfe61cd4ca6b   15 months ago   1.41GB
    nodebb/docker                    v1.13.2            caf2186b9fba   15 months ago   1.31GB
    inedo-test-forum_nodebb          latest             13c7685e59f3   16 months ago   1.38GB
    nodebb/docker                    v1.13.1            02ec667f0760   17 months ago   1.31GB
    nodebb/docker                    v1.13.0            c586ea99bb46   18 months ago   1.31GB
    nodebb/docker                    v1.12.2            a8a69ce22819   18 months ago   1.31GB
    nodebb/docker                    latest             2255287fb030   18 months ago   1.31GB
    node                             lts                11e92fc50c4a   19 months ago   908MB
    postgres                         11-alpine          bb6408c77dbf   21 months ago   72.5MB
    redis                            5-alpine           72e76053ebb7   24 months ago   50.9MB
    postgres                         latest             9c116111eb08   2 years ago     312MB
    nodebb/docker                    v1.12.0            3d6e2017922e   2 years ago     1.39GB
    <none>                           <none>             787d4441173c   2 years ago     727MB
    mcr.microsoft.com/mssql/server   latest             885d07287041   2 years ago     1.45GB
    postgres                         10                 63f824b22e5b   2 years ago     236MB
EDIT: that iframely_last_good stuff was me.
-
@boomzilla Time to create iframely_last_good.old.BAK
-
@boomzilla That's strange. I have a lot of something <none>, which is the older versions of things.
-
@boomzilla said in Docker network timeouts:
I guess maybe I'll try setting up a non-dockerized iframely today while I try to get the dockerized version figured out.
So...that's running, but now I think I'm fighting with nginx to allow anything to talk to it.
-
@Bulb said in Docker network timeouts:
@boomzilla That's strange. I have a lot of something <none>, which is the older versions of things.
In Ben's update scripts for docker images, he renames the old container, starts a new one, does some stuff with it, then deletes the old one:
    docker build -t iframely github.com/itteco/iframely
    docker rename wtdwtf-iframely wtdwtf-iframely-old
    docker stop wtdwtf-iframely-old
    docker run -d --name wtdwtf-iframely --network wtdwtf --restart unless-stopped --ip 172.18.0.105 iframely
    docker cp 99_wtdwtf_autoplay.js wtdwtf-iframely:/iframely/lib/plugins/validators/sync/99_wtdwtf_autoplay.js
    docker cp 99_wtdwtf_mastodon.js wtdwtf-iframely:/iframely/plugins/links/99_wtdwtf_mastodon.js
    docker cp mastodon-embed.js wtdwtf-iframely:/iframely/static/js/mastodon-embed.js
    docker cp config.local.js wtdwtf-iframely:/iframely/config.local.js
    docker restart wtdwtf-iframely
    docker rm -v wtdwtf-iframely-old
I suppose the docker rm removes the image, too?
-
@boomzilla No, rm doesn't remove the image, rmi is needed for that. The build will build a new iframely:latest (:latest is the default if you don't give a tag).
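One way to avoid losing track of the previous image is to give it a second tag before rebuilding. This is only a sketch: keep_then_build is a made-up helper name, the prev-YYYYMMDD default tag is an arbitrary choice, and the build source is the one from the update script quoted above.

```shell
# keep_then_build: hypothetical helper, not part of the original
# update script. Before rebuilding, it gives the current
# iframely:latest a second tag so the old image stays easy to find.
# The default tag name (prev-YYYYMMDD) is an arbitrary choice.
keep_then_build() {
  keep_tag=${1:-prev-$(date +%Y%m%d)}
  docker tag iframely:latest "iframely:$keep_tag" &&
    docker build -t iframely github.com/itteco/iframely
}

# Example (not run here): keep_then_build last_good
```

Note that the build is skipped if the tag step fails, for instance when no iframely:latest exists yet.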
-
@Bulb ok, so...building like that just overwrites latest and doesn't leave a history?
-
@boomzilla It overwrites the latest. And there isn't anything like history, really, that's a misnomer. But the image remains there.
Now what I didn't expect, but looking at my images confirms, is that while pulling a new image:tag leaves the old one as image <none>, building a new image:tag leaves the old one as <none> <none>. So the old image would be one of those, but then you have to inspect them and guess from the content of that which image is what.
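The inspect-and-guess step could be sketched roughly like this, assuming a Docker CLI that supports the dangling filter. describe_untagged is a made-up name; it prints the ID, creation time and entrypoint of every untagged image so the candidates can be told apart.

```shell
# describe_untagged: hypothetical helper, not part of Docker.
# For every dangling (<none> <none>) image, print its ID, creation
# timestamp, and configured entrypoint on one line.
describe_untagged() {
  docker images --filter dangling=true --quiet --no-trunc |
    while read -r id; do
      docker inspect -f '{{ .Id }} {{ .Created }} {{ .Config.Entrypoint }}' "$id"
    done
}

# Example (not run here): describe_untagged | sort -k2
```

Sorting on the second field orders the candidates by creation time, which is usually the quickest way to spot the one built before an update.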
-
-
@boomzilla said in Docker network timeouts:
Now to figure out changed.
This is IT. We all know the answer is "absolutely nothing"
-
@izzion said in Docker network timeouts:
@boomzilla said in Docker network timeouts:
Now to figure out changed.
This is IT. We all know the answer is "absolutely nothing"
Well...that image is listing at "2 years ago." But...HTF do I get a specific date? Looking at github, v1.3.0 was released in March 2019 and v1.3.1 May 31, 2019. Either one could be fuzzed into that.
-
    $ docker inspect -f '{{ .Created }}' 787d4441173c
    2018-10-22T20:56:50.368184966Z
Two years.
Anywho, that puts it at v1.2.7.
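For what it's worth, a build date can be mapped to a release from inside a checkout of the repo. A rough sketch, assuming the image was built from whatever was on the current branch at that date; tag_before is a made-up helper name.

```shell
# tag_before: hypothetical helper; run it inside a checkout of the
# repository. It finds the last commit on the current branch made
# before DATE, then asks git for the most recent tag reachable
# from that commit.
tag_before() {   # usage: tag_before 2018-10-22
  commit=$(git rev-list -1 --before="$1" HEAD) || return 1
  [ -n "$commit" ] || return 1
  git describe --tags --abbrev=0 "$commit"
}
```

This only narrows things down to the latest tag at or before that commit; the image may well have been built from untagged commits made after it.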
-
This is the Dockerfile diff:
    boomzilla@boomzilla:~/iframely$ git diff v1.2.7..v1.6.0 Dockerfile
    diff --git a/Dockerfile b/Dockerfile
    index fcf9f609..5ec5ff35 100644
    --- a/Dockerfile
    +++ b/Dockerfile
    @@ -1,18 +1,22 @@
    -FROM node:5.8
    +FROM node:12.18-alpine3.12
     
     EXPOSE 8061
     
    -COPY . /iframely
    -
     WORKDIR /iframely
     
    -RUN DEPS="libkrb5-dev" \
    -    apt-get update && \
    -    apt-get install -q -y --no-install-recommends $DEPS && \
    -    npm install -g forever && \
    -    npm install && \
    -    apt-get purge -y --auto-remove $DEPS && \
    -    apt-get autoremove && \
    -    apt-get clean
    +# Create new non-root user
    +RUN addgroup -S iframelygroup && adduser -S iframely -G iframelygroup
    +
    +# This will change the config to `config.<VALUE>.js` and the express server to change its behaviour.
    +# You should overwrite this on the CLI with `-e NODE_ENV=production`.
    +ENV NODE_ENV=local
    +
    +## Utilize docker layer cache
    +COPY package.json yarn.lock /iframely/
    +RUN yarn install --pure-lockfile --production
    +
    +COPY . /iframely
    +
    +USER iframely
     
    -ENTRYPOINT ["/iframely/docker/entrypoint.sh"]
    +ENTRYPOINT [ "/iframely/docker/entrypoint.sh" ]
One thing that stands out to me is "Create new non-root user".
-
Testing in my vm shows that adding an iframely user to the system makes everything work. Because apparently the user inside the container runs as that user on the system. But...there was no iframely user, so...
Going to try this again...
-
@boomzilla said in Docker network timeouts:
Testing in my vm shows that adding an iframely user to the system makes everything work. Because apparently the user inside the container runs as that user on the system. But...there was no iframely user, so...
Going to try this again...
Nope. Added an iframely user but still doesn't work here. :-(
-
@boomzilla said in Docker network timeouts:
+# You should overwrite this on the CLI with `-e NODE_ENV=production`.
Is that relevant?
-
@PleegWat no, we use the local named config file. I'm pretty sure it's about the users now. As I said, when I fixed that in my local VM it started working. But adding an iframely user on this server didn't change it. Not sure why.
https://medium.com/@mccode/processes-in-containers-should-not-run-as-root-2feae3f0df3b
-
@boomzilla If the user on the server was created after the container, is the user in the container appropriately associated with the one in the host? Notably, is their user ID the same?
-
@PleegWat probably not. However, I recreated the user and built a fresh new container and it still didn't work. So reverted to the last good once again.
-
@PleegWat also, if I su to iframely I can successfully use curl.
-
@boomzilla said in Docker network timeouts:
I'm pretty sure it's about the users now. As I said, when I fixed that in my local VM it started working. But adding an iframely user on this server didn't change it. Not sure why.
Docker does, normally, not remap UIDs, so the UID seen inside the container is the same as seen outside of it. But inside the container the user info is going to be looked up in the /etc/passwd inside the container, so that's where the user must exist. Whether the user exists in the outside /etc/passwd is not relevant.
It is a bit more complicated than that though and it's good in this case. The user isn't going to be looked up in /etc/passwd, it is going to be looked up using the getpwent function. And there is libnss-unknown that makes getpwent return a valid result for all users. So I got used to installing that in all docker images that I build (and set HOME environment variable to what I want to use as default home) and that takes care of user id issues whatever user I run the container under (as usually the user id has to match owner of whatever directories I mount into it).
-
… note that libnss-unknown only synthesizes a pw record for any uid, not any name. This is for the benefit of programs that don't trust the environment and look up the user they are running as.
Normally nothing should change user in docker, so looking up other users shouldn't be needed. You set the default user in a given image with the USER instruction in the Dockerfile and override it with the -u option to run or create (for the main process) or exec (if you want an additional shell; especially useful if you want root in a container with a non-root default). I use the numeric id of myself when I am mounting directories in my home (I learned to use dockerized build tools for my projects so I can have different versions for each as needed, so there I mount the checkout), and a random numeric ID for services that don't care.
If the service or its start-up script are written so that they insist on doing su to give up root permissions, you will have to create the target user in the /etc/passwd (and group in /etc/group) inside the container.
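The -u override described above might look like this in practice. Both helper names are made up for illustration; the flags themselves (--rm, -u, -it) are standard docker run and docker exec options.

```shell
# Hypothetical helpers illustrating the -u option; names are made up.

# Run a one-off command in an image as an arbitrary numeric UID:GID.
# Images that define an ENTRYPOINT will receive CMD... as arguments
# to that entrypoint instead of running it directly.
run_as() {   # usage: run_as UID:GID IMAGE CMD...
  ids=$1; image=$2; shift 2
  docker run --rm -u "$ids" "$image" "$@"
}

# Open a root shell in a running container whose default user is
# non-root.
root_shell() {   # usage: root_shell CONTAINER
  docker exec -it -u 0 "$1" sh
}

# Examples (not run here):
#   run_as "$(id -u):$(id -g)" alpine:3.12 id
#   root_shell wtdwtf-iframely
```

The first example prints the numeric ids the process actually runs with, which is a quick way to see that no user lookup is needed for the override to take effect.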
-
@Bulb said in Docker network timeouts:
Normally nothing should change user in docker, so looking up other users shouldn't be needed. You set the default user in given image with the USER instruction in Dockerfile
This is what the iframely Dockerfile does. It was added sometime since we last updated in October 2018.
If the service or its start-up script are written so that they insist on doing su to give up root permissions, you will have to create the target user in the /etc/passwd (and group in /etc/group) inside the container.
Our server now has an iframely user and group. But unlike my VM, it didn't seem to fix the problem. I was out camping with the scouts this weekend so I haven't been doing much, but at least now I have something to research.
-
@boomzilla said in Docker network timeouts:
Normally nothing should change user in docker, so looking up other users shouldn't be needed. You set the default user in given image with the USER instruction in Dockerfile
This is what the iframely Dockerfile does. It was added sometime since we last updated in October 2018.
But does it also create the user (something like RUN adduser …)? It must either do that, or install the libnss-unknown.
@boomzilla said in Docker network timeouts:
Our server now has an iframely user and group.
Users outside the container are not relevant. The kernel just compares the UID to the UID of the files it tries to access, and to 0 for some extra privileges, and never looks in /etc/passwd or anywhere else on the filesystem. The only things that might care about the user definition are node itself, some library it uses, or some start-up script that tries to do su.
Also, UID has no effect on network except you can't bind port <1024 if UID != 0 (binding port <1024 in the container makes little sense; you can always bind a high port inside and expose it as the low one (80 and/or 443)). The only way I can imagine user affects network is that the SSL library tries to access some data in home, looks up home in passwd, and blows up if it fails. Which is all inside the container.
-
@Bulb said in Docker network timeouts:
@boomzilla said in Docker network timeouts:
Normally nothing should change user in docker, so looking up other users shouldn't be needed. You set the default user in given image with the USER instruction in Dockerfile
This is what the iframely Dockerfile does. It was added sometime since we last updated in October 2018.
But does it also create the user (something like RUN adduser …)? It must either do that, or install the libnss-unknown.
    FROM node:12.18-alpine3.12

    EXPOSE 8061

    WORKDIR /iframely

    # Create new non-root user
    RUN addgroup -S iframelygroup && adduser -S iframely -G iframelygroup

    # This will change the config to `config.<VALUE>.js` and the express server to change its behaviour.
    # You should overwrite this on the CLI with `-e NODE_ENV=production`.
    ENV NODE_ENV=local

    ## Utilize docker layer cache
    COPY package.json yarn.lock /iframely/
    RUN yarn install --pure-lockfile --production

    COPY . /iframely

    USER iframely

    ENTRYPOINT [ "/iframely/docker/entrypoint.sh" ]
@boomzilla said in Docker network timeouts:
Our server now has an iframely user and group.
Also, UID has no effect on network except you can't bind port <1024 if UID != 0 (binding port <1024 in the container makes little sense; you can always bind a high port inside and expose it as the low one (80 and/or 443). The only way I can imagine user affects network is that the SSL library tries to access some data in home, looks up home in passwd, and blows up if it fails. Which is all inside the container.
Except that (on my VM) it was suddenly able to reach external sites. I confirmed that the iframely user can successfully curl both on my VM and on the WTDWTF server.
From outside, requesting a onebox from the old iframely:
    $ curl 172.18.0.105:8061/iframely?url=https://twitter.com
    {"meta":{"theme-color":"#ffffff","canonical":"https://twitter.com/","site":"Twitter"},"links":[{"href":"https://abs.twimg.com/responsive-web/client-web-legacy/icon-ios.b1fc7275.png","rel":["apple-touch-icon","icon","ssl"],"type":"image/png","media":{"width":192,"height":192}},{"href":"https://abs.twimg.com/responsive-web/client-web-legacy/icon-svg.168b89d5.svg","rel":["mask-icon","icon","ssl"],"type":"image/svg"},{"href":"https://abs.twimg.com/favicons/twitter.ico","rel":["shortcut","icon","ssl"],"type":"image/x-icon"}],"rel":[]}
...and then using the latest:
    $ curl 172.18.0.106:8061/iframely?url=https://twitter.com
    {"error":{"source":"iframely","code":408,"message":"Timeout"}
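The same comparison can be made with timing included, which shows whether a failure lines up with the 5 second timeout seen in the capture. A sketch only: probe is a made-up helper name, and the 10 second cap is an arbitrary value chosen to be longer than iframely's own timeout.

```shell
# probe: hypothetical helper. Requests a onebox for URL from an
# iframely instance at HOST:PORT and prints the HTTP status code
# and the total request time in seconds, discarding the body.
probe() {   # usage: probe HOST:PORT URL
  curl -s -o /dev/null --max-time 10 -G \
    --data-urlencode "url=$2" \
    -w '%{http_code} %{time_total}\n' \
    "http://$1/iframely"
}

# Examples (not run here):
#   probe 172.18.0.105:8061 https://twitter.com
#   probe 172.18.0.106:8061 https://twitter.com
```

Passing the target through --data-urlencode with -G keeps URLs that contain their own query strings from being mangled.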
-
@boomzilla said in Docker network timeouts:
Except that (on my VM) it was suddenly able to reach external sites.
But that was out-of-docker, if I understood right, wasn't it? Which also means glibc rather than musl—the container is Alpine-based.
-
@Bulb said in Docker network timeouts:
@boomzilla said in Docker network timeouts:
Except that (on my VM) it was suddenly able to reach external sites.
But that was out-of-docker, if I understood right, wasn't it? Which also means glibc rather than musl—the container is Alpine-based.
Yes, iframely in docker works locally after adding the user but not on the server.
And yes, the container appears to be Alpine-based (I guess...based on the first line FROM node:12.18-alpine3.12). I have no idea what that means.
-
@boomzilla said in Docker network timeouts:
Yes, iframely in docker works locally after adding the user but not on the server.
Now that is … . That makes no, absolutely no, sense to me. The container is effectively a separate machine that only shares kernel. It shouldn't be affected by content of some silly file outside of it.
@boomzilla said in Docker network timeouts:
And yes, the container appears to be Alpine-based. I have no idea what that means.
It means it is using the minimal standard C library implementation called musl. It isn't as tested as the GNU one, so sometimes it might have issues it otherwise shouldn't.
Now another question? What is the docker version on the server and in the test VM? We had some weird problems on build agents after updating to 20.10 at work, though I didn't notice on either of my machines.
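Both points can be checked from the command line. A small sketch with made-up helper names: the first reads /etc/os-release from an image (an Alpine image reports ID=alpine there), the second asks the daemon for its version.

```shell
# image_os: hypothetical helper. Prints /etc/os-release from an
# image, bypassing whatever ENTRYPOINT the image defines.
image_os() {   # usage: image_os IMAGE
  docker run --rm --entrypoint cat "$1" /etc/os-release
}

# engine_version: hypothetical helper. Prints the version of the
# Docker daemon the client is talking to.
engine_version() {
  docker version --format '{{ .Server.Version }}'
}

# Examples (not run here):
#   image_os iframely
#   engine_version
```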
-
@Bulb said in Docker network timeouts:
Now another question? What is the docker version on the server and in the test VM? We had some weird problems on build agents after updating to 20.10 at work, though I didn't notice on either of my machines.
Ah! Good question...
    boomzilla@what:~$ docker --version
    Docker version 17.05.0-ce, build 89658be
    boomzilla@boomzilla:~/docker$ docker --version
    Docker version 19.03.13, build cd8016b6bc
-
Hrm...looks like the server here is using the package manager's docker-engine package. I have docker installed as a snap package locally. But I don't remember exactly how I did that.
-
17.05 is really Ancient, but my impression is that that was the first version that worked reliably (another project around here is still using 1.13 under its OpenShift 3.11, which was pretty broken, but RedHat somehow managed to kick it into a, working, square ball and make the whole house of cards stand). I have no known issues with that.
Did you build the container on the server, or did you build it on your test VM and copy it over?