ImportLayer hangs indefinitely after importing images. #2342

Open
geofffranks opened this issue Dec 16, 2024 · 1 comment

Comments

@geofffranks

We have noticed that, starting with the November Windows Update cycle, VMs that have KB5046615 (and maybe also KB5046268) fail to load certain image layers into HCS. We've traced the hangs down to this syscall in hcsshim/internal/wclayer's _importLayer() function, but can't figure out what is going on at the OS level. It could be Windows, it could be how we're calling ImportLayer, it could be how we're building the image, or it could be something else. We can find no errors or Windows-level diagnostics for this, and since the function never returns, we have no idea what the issue could be. We've updated to the latest versions of hcsshim and docker, with no luck.

We've tried isolating layers of our image to determine if anything weird stands out. Last Wednesday we reproducibly determined that one specific layer was failing (the layer was removing a single file from the image). When we resumed testing on Friday, the same layer was working, which makes it seem less likely that the problem is related to how we're building our image or importing it.

Is there anything we can do to get more information out of Windows about why this is failing, or troubleshoot further?

Windows Servers with KB5046615, as well as those with the newer December updates, are affected. They cannot import any of our Windows images built after Sept 10th. Strangely, they can import our images built before then (the latest working build was from July 9th). Without the KB5046615 patch, all of our images are imported without issue.

Last image working across all OS versions: cloudfoundry/windows2016fs:2019.0.168
First image breaking when imported on an OS with KB5046615 applied: cloudfoundry/windows2016fs:2019.0.169
Current image: cloudfoundry/windows2016:2019
Dockerfile used for building
Logic used for importing the image

@geofffranks
Author

Additionally, we have unit tests failing for our groot-windows software (code that's doing the image layer importing). The tests hang indefinitely while trying to load the images from these two dockerfiles:

https://github.com/cloudfoundry/wg-app-platform-runtime-ci/tree/main/winc-release/dockerfiles/whiteout
https://github.com/cloudfoundry/wg-app-platform-runtime-ci/tree/main/winc-release/dockerfiles/link
