2
nitwhiz
4y

Anyone ever passed docker builds between stages in gitlab ci?

I'm googling my ass off and doing it via caching atm, but it's unreliable. Artifacts are no option either for 2 gigs of image.
It'd be nice to drop this `docker save --output image.tar` solution altogether.

Am I the only one trying to have separate build, test and deploy stages for their docker builds?

Comments
  • 2
    I would just push it to a registry and pass the tag. The usual idea with Docker is that the image is the same build result for every env and stage, and you supply configuration using volumes, config maps and secrets.
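
A minimal sketch of the "push to a registry and pass the tag" idea, assuming the job has already run `docker login` against the project registry. `CI_REGISTRY_IMAGE` and `CI_COMMIT_SHORT_SHA` are GitLab's predefined CI variables; the helper names `image_ref`, `build_and_push` and `pull_for_test` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: pass an image between CI stages through the registry.
# image_ref, build_and_push and pull_for_test are hypothetical helper names.

# One unique image reference per commit, so every stage agrees on it.
image_ref() {
  printf '%s:%s\n' "${CI_REGISTRY_IMAGE}" "${CI_COMMIT_SHORT_SHA}"
}

# Build stage: build once and push under the per-commit tag.
build_and_push() {
  local ref
  ref="$(image_ref)"
  docker build --tag "${ref}" . && docker push "${ref}"
}

# Later stages: pull exactly the image the build stage produced.
pull_for_test() {
  docker pull "$(image_ref)"
}
```

The build job would call `build_and_push`, and the test and deploy jobs would call `pull_for_test`. Only the tag travels between stages, and the stages derive it from the commit instead of passing a file around.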
  • 2
    As far as I know caching is for storing things like node_modules between multiple pipelines on the same runner. To pass stuff between stages of the pipeline, you'd use artifacts. What's stopping you from using them?

    If artifacts don't work, you could try the GitLab Docker registry. I think it can be configured so that only project members can pull/push.
  • 0
    Yes, artifacts are the way to go.

    I build a base image whenever the dependencies or similar change, pass it on to build the app image, and pass that on to be tested. If the tests succeed, I release both base and app into the registry. When a new pipeline starts and nothing affecting the base has changed, it just pulls the base from the registry, builds the app image and passes that one along via artifacts again.

    Also, if your image is 2 gigs in size you have usually done something wrong: using the wrong base for your docker build, not clearing the package cache, shipping way too much other shit nobody needs (looking at you, Ubuntu and Debian images), or overusing the RUN command and creating a fuckton of layers. (Each RUN always creates a new layer, and that layer is immutable afterwards, so no later RUN can "change things" in a previous layer. If you delete stuff, it still exists in the previous layer.)
  • 0
    @SortOfTested yes, but that might upload a broken image.

    Take a look at my comment to see how I work around that.
  • 0
    @fuckwit
    That's why you have incremental and CI. Nothing gets minored until the incremental passes tests. You can also just create a CI repo in the registry and do incrementals in that. Shipping artifacts around is a great way to leak IP.

    You don't have to worry about overusing RUN if you use multi-stage builds and copy from intermediate aliased stages. The final image then carries only the layers of its last stage, not every RUN layer from the earlier ones.
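
To make the multi-stage point concrete, here is a hedged sketch that writes an example Dockerfile from shell. The base images `node:18` and `node:18-slim`, the paths, the `npm run build` script and the function name `write_multistage_dockerfile` are all illustrative assumptions, not anything from this thread's actual project.

```shell
#!/usr/bin/env bash
# Sketch only: emit a two-stage Dockerfile. Paths, base images and the
# function name are hypothetical.
write_multistage_dockerfile() {
  cat > "${1:-Dockerfile}" <<'EOF'
# Stage 1: build tooling and intermediate RUN layers stay in this stage.
FROM node:18 AS build
WORKDIR /src
COPY . .
RUN npm ci && npm run build

# Stage 2: only this stage ends up in the tagged image.
FROM node:18-slim
WORKDIR /app
COPY --from=build /src/dist ./dist
CMD ["node", "dist/index.js"]
EOF
}
```

However many RUN steps the `build` stage has, the resulting image contains the layers of the second base image plus one layer for the `COPY --from=build`.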
  • 0
    @fuckwit it's a very simple image for cordova builds: nodejs, java, gradle, ant, maven, android sdk + android platform tools for various platforms, cordova. That's the bare minimum of requirements for android builds. Caches cleared, nodejs:stretch as base (as alpine and sdkmanager aren't friends and debian makes the most sense for further adjustments I want to do)

    @SortOfTested I don't even get what that means, could you elaborate or send me some resources?

    @korrat jup. That's why I'm trying to use artifacts. But even docker images with less than 2 gigs would take some time to be passed around.

    As all jobs are linked against the docker:x.y.z-dind service, why isn't there a way to keep the service up between all jobs? That way the image would be built by the build stage and be available (like a local image on your host) in the next stages. Doesn't it make the most sense for this to be an option?
  • 0
    @SortOfTested Can you explain some more how that could result in loss of IP? Most of the time the different stages run on the same host, and even if they don't, shipping those things over the internet is never a good idea, performance reasons aside.

    Yes, but only if you use those multi-stage builds. I've seen a lot of popular images that don't use them and also don't clean up after their 'apt update && apt upgrade'.
  • 0
    @fuckwit the layer with all the necessary android platforms is 1.5 gigs. i don't really get what you're trying to get across.

    i do remove the apt cache and its friends.
  • 1
    @nitwhiz no, it's fine then. Nothing anybody could change about that.

    I was targeting SortOfTested with my reply and stating what I noticed in a lot of images.
  • 1
    @fuckwit oh i'm sorry then.

    yeah, looking forward to her answers, too :D
  • 0
    for the record, i made some kind of "cache" myself. the jobs' docker containers mount a host volume used for storing the images as /image-cache/${REPO_PATH}/${BRANCH_NAME}/... and loading them from there.

    no unnecessary artifact up/downloads or cache overhead - it's pretty fast for 4gigs of image - i can work with that!
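
A rough sketch of such a host-volume image cache, under a few assumptions: the path layout `/image-cache/<repo>/<branch>/` follows the comment above, GitLab's predefined `CI_PROJECT_PATH` and `CI_COMMIT_REF_SLUG` stand in for `${REPO_PATH}` and `${BRANCH_NAME}`, and the file name `image.tar`, the variable `IMAGE_CACHE_ROOT` and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: store and restore image tarballs on a mounted host volume.
# IMAGE_CACHE_ROOT and all function names are hypothetical.

# Print where the tarball for the current project and branch lives.
cache_file() {
  printf '%s/%s/%s/image.tar\n' \
    "${IMAGE_CACHE_ROOT:-/image-cache}" "${CI_PROJECT_PATH}" "${CI_COMMIT_REF_SLUG}"
}

# Build stage: write the freshly built image into the cache directory.
save_to_cache() {
  local image="$1" file
  file="$(cache_file)"
  mkdir -p "$(dirname "${file}")" && docker save --output "${file}" "${image}"
}

# Later stages: load the image back; returns non-zero if nothing is cached.
load_from_cache() {
  local file
  file="$(cache_file)"
  [ -f "${file}" ] && docker load --input "${file}"
}
```

This only works when all jobs of a pipeline land on a runner host that mounts the same volume, which is the trade-off against the registry approach.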
  • 0
    @nitwhiz Can't you specify the services in default?
  • 0
    @korrat jup, but they are clean for every job. No images kept anywhere between the jobs.
  • 0
    @fuckwit
    If you ever pass around tarballs of your software, it makes it that much easier for them to walk out the door.
  • 1
    @nitwhiz
    Lots of density there, so take this out of the original context, mostly as a glossary:

    Multi-part/stage builds:
    https://docs.docker.com/develop/....

    The last bit is important. Only the last stage outputs a persistent image. It should consist only of copy ops from the previous stages. This is basically "rebase and flatten" for the multiple images produced during this type of build.

    Docker registries and repos
    https://docs.docker.com/registry/
    https://docs.docker.com/docker-hub/...

    Assume single registry.

    You can choose to use a single repo and trigger builds on commit that produce images, tagged as incremental until they pass all testing/envs. Once a single image has passed through all envs and tests, a releasable tag is added for it in your repo in the registry. Retagging in this scenario does not duplicate data, so there's only metadata cost.

    The multi-repo strategy just splits the "testing" repo from staging. On successful completion of the testing stages in the "test" repo, have a post-success task push the image to the "release" repo with a minor.incremental tag. The release CI stage then pulls only from the "release" repo.

    Note: Many registry impls support cross-repo identity of individual images. This is useful to avoid multi-repo replication on retagging.

    Cleanup steps can also be added to onfailure if you want to clean incremental image layers that failed.

    The layering allows you to return the smallest amount of data in a single overlay (https://wiki.archlinux.org/index.ph...). My usual layer cake looks like:

    App (new per build, this is what passes through app ci/cd)
    ------
    Reusable deps (spring has a compilation strategy for this now as well, also reusable, separate repo)
    ------
    Runtimes/tools/core scripts (reusable, separate repo)
    ------
    Base
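
The "test" repo to "release" repo promotion described in this comment could look roughly like the following. This is a sketch under assumptions: the repository names and version numbers in the usage line are placeholders, and `release_ref` and `promote` are hypothetical helpers, not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch only: promote a tested image by retagging it into a release repo.
# release_ref and promote are hypothetical helper names.

# Print the release reference for a minor.incremental version.
release_ref() {
  local repo="$1" minor="$2" incremental="$3"
  printf '%s:%s.%s\n' "${repo}" "${minor}" "${incremental}"
}

# Pull the image that passed testing, add the release tag and push it.
promote() {
  local tested="$1" release="$2"
  docker pull "${tested}" \
    && docker tag "${tested}" "${release}" \
    && docker push "${release}"
}
```

A post-success job might then run `promote "registry.example.com/app/test:abc1234" "$(release_ref registry.example.com/app/release 3 17)"`. Within one registry the layers are usually already present, so the push mostly adds metadata.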