GuideDevOps
Lesson 8 of 11

Artifact Management

Part of the CI/CD Pipelines tutorial series.

What Are Build Artifacts?

A build artifact is the output of your CI/CD build stage — the deployable result of compiling, bundling, or packaging your application. It's the thing you actually deploy to production.

Source Code (what developers write)
        ↓
   Build Process
        ↓
Build Artifact (what gets deployed)

Types of Artifacts

| Artifact Type | Example | Used By |
| --- | --- | --- |
| Docker Image | myapp:v1.2.3 | Kubernetes, ECS, Docker Swarm |
| JAR/WAR | myapp-1.2.3.jar | Java applications (Spring Boot) |
| Binary | myapp-linux-amd64 | Go, Rust, C++ applications |
| NPM Package | @mycompany/ui-lib | JavaScript libraries |
| Python Wheel | mypackage-1.0.0-py3-none-any.whl | Python libraries |
| Static Bundle | dist/ folder (HTML, CSS, JS) | Websites, SPAs |
| Helm Chart | myapp-chart-1.2.3.tgz | Kubernetes deployments |
| Terraform Module | Versioned module source (Git tag or registry release), not the local .terraform/ cache | Infrastructure provisioning |
| Test Reports | junit.xml, coverage/ | Quality tracking |
| SBOM | sbom.json | Security & compliance |

Why Artifact Management Matters

Without Artifact Management

Developer A builds locally → deploys to staging
Developer B builds locally → deploys to production
                                ↓
Different build environments → Different results
                                ↓
"It works on staging but crashes in production!" 🐛

With Artifact Management

CI pipeline builds once → artifact stored in registry
                              ↓
Same artifact → deployed to staging (tested ✅)
Same artifact → deployed to production (identical ✅)
                              ↓
"What runs in staging is EXACTLY what runs in production" ✅

Key Benefits

| Benefit | Why It Matters |
| --- | --- |
| Consistency | Same artifact in staging and production, so no surprises |
| Traceability | Know exactly which commit produced which artifact |
| Rollback | Pull the previous version's artifact and deploy instantly |
| Auditability | Who built it, when, from what code, with what dependencies |
| Reproducibility | Any team member can deploy any version at any time |
| Speed | Build once, deploy many times, with no rebuilding |
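
The consistency and traceability benefits rest on being able to show that two environments hold byte-identical artifacts. A minimal Python sketch, assuming the artifacts are plain files on disk (the `artifact_digest` and `same_artifact` helpers are hypothetical names, not part of any registry API), compares SHA-256 digests:

```python
import hashlib
from pathlib import Path


def artifact_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large artifacts do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_artifact(staging: Path, production: Path) -> bool:
    """True when both paths hold byte-identical content."""
    return artifact_digest(staging) == artifact_digest(production)
```

Container registries apply the same idea: every image is addressable by a content digest in addition to its tags, so comparing digests is a stronger check than comparing tag names.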

Container Registries (Docker Images)

Docker images are among the most common artifacts in modern DevOps. They are stored in container registries.

Popular Container Registries

| Registry | Best For | Free Tier |
| --- | --- | --- |
| Docker Hub | Open-source, public images | 1 private repo |
| Amazon ECR | AWS-native workloads | 500 MB (free tier) |
| Google Artifact Registry | GCP-native workloads | 500 MB |
| Azure Container Registry | Azure-native workloads | Basic tier |
| GitHub Container Registry | GitHub-based projects | Free for public |
| GitLab Container Registry | GitLab-based projects | Built-in, free |
| Harbor | Self-hosted, enterprise | Open-source |
| JFrog Artifactory | Multi-format, enterprise | Cloud free tier |

Building & Pushing Docker Images

# GitHub Actions - Build & Push to Docker Hub
build-and-push:
  name: Build Docker Image
  runs-on: ubuntu-latest
 
  steps:
    - uses: actions/checkout@v4
 
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3
 
    - name: Login to Docker Hub
      uses: docker/login-action@v3
      with:
        username: ${{ secrets.DOCKER_USERNAME }}
        password: ${{ secrets.DOCKER_PASSWORD }}
 
    - name: Extract metadata (tags, labels)
      id: meta
      uses: docker/metadata-action@v5
      with:
        images: mycompany/myapp
        tags: |
          # Tag with git SHA (unique per commit)
          type=sha,prefix={{branch}}-
 
          # Tag with branch name
          type=ref,event=branch
 
          # Tag with semantic version (from git tag)
          type=semver,pattern={{version}}
          type=semver,pattern={{major}}.{{minor}}
          type=semver,pattern={{major}}
 
          # Tag latest for main branch
          type=raw,value=latest,enable={{is_default_branch}}
 
    - name: Build and push
      uses: docker/build-push-action@v5
      with:
        context: .
        push: true
        tags: ${{ steps.meta.outputs.tags }}
        labels: ${{ steps.meta.outputs.labels }}
        cache-from: type=gha
        cache-to: type=gha,mode=max
        build-args: |
          BUILD_DATE=${{ github.event.head_commit.timestamp }}
          GIT_SHA=${{ github.sha }}
          VERSION=${{ steps.meta.outputs.version }}

Pushing to AWS ECR

build-and-push-ecr:
  name: Build & Push to ECR
  runs-on: ubuntu-latest
 
  steps:
    - uses: actions/checkout@v4
 
    - name: Configure AWS credentials
      uses: aws-actions/configure-aws-credentials@v4
      with:
        aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
        aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        aws-region: us-east-1
 
    - name: Login to Amazon ECR
      id: login-ecr
      uses: aws-actions/amazon-ecr-login@v2
 
    - name: Build, tag, and push image
      env:
        ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
        ECR_REPOSITORY: myapp
        IMAGE_TAG: ${{ github.sha }}
      run: |
        # Build once, then point both tags at the same image
        docker build -t "$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" .
        docker tag "$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG" "$ECR_REGISTRY/$ECR_REPOSITORY:latest"
        docker push "$ECR_REGISTRY/$ECR_REPOSITORY:$IMAGE_TAG"
        docker push "$ECR_REGISTRY/$ECR_REPOSITORY:latest"

Pushing to GitHub Container Registry

build-and-push-ghcr:
  name: Build & Push to GHCR
  runs-on: ubuntu-latest
  permissions:
    contents: read
    packages: write
 
  steps:
    - uses: actions/checkout@v4
 
    - name: Login to GitHub Container Registry
      uses: docker/login-action@v3
      with:
        registry: ghcr.io
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
 
    - name: Build and push
      uses: docker/build-push-action@v5
      with:
        context: .
        push: true
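        # Image names must be lowercase; lowercase github.repository first
        # if the owner or repository name contains capitals
        tags: |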
          ghcr.io/${{ github.repository }}:${{ github.sha }}
          ghcr.io/${{ github.repository }}:latest

Artifact Versioning Strategies

Versioning is critical — you need to know exactly which version is running in production.

Strategy 1: Git SHA (Most Common for Containers)

myapp:abc123f    ← Unique to every commit
myapp:latest     ← Always points to newest build

Pros: Unique per commit, easy to trace back to code
Cons: Not human-readable

Strategy 2: Semantic Versioning (Libraries & Releases)

myapp:1.2.3      ← Major.Minor.Patch
myapp:1.2        ← Points to latest patch (1.2.x)
myapp:1          ← Points to latest minor (1.x.x)

Pros: Human-readable, clear upgrade path
Cons: Requires manual version bumps (or automated with tools)
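
The floating `1.2` and `1` tags follow mechanically from the full version. As a rough Python sketch (the `semver_tags` helper is hypothetical, and it deliberately accepts only plain MAJOR.MINOR.PATCH tags, with an optional leading `v`, rather than the full SemVer grammar):

```python
import re

# Plain MAJOR.MINOR.PATCH with an optional leading "v"; pre-release and
# build-metadata suffixes are out of scope for this sketch.
SEMVER = re.compile(r"v?(\d+)\.(\d+)\.(\d+)")


def semver_tags(git_tag: str) -> list[str]:
    """Expand a release tag such as 'v1.2.3' into ['1.2.3', '1.2', '1']."""
    match = SEMVER.fullmatch(git_tag)
    if match is None:
        raise ValueError(f"not a plain MAJOR.MINOR.PATCH tag: {git_tag!r}")
    major, minor, patch = match.groups()
    return [f"{major}.{minor}.{patch}", f"{major}.{minor}", major]


print(semver_tags("v1.2.3"))  # ['1.2.3', '1.2', '1']
```

Only the full `1.2.3` tag identifies one build; the shorter tags are meant to move whenever a newer patch or minor release is published.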

Strategy 3: Branch + SHA (Best of Both)

myapp:main-abc123f      ← Main branch, specific commit
myapp:develop-def456a   ← Develop branch, specific commit
myapp:feature-login-789 ← Feature branch

Pros: Know which branch AND which commit
Cons: More tags to manage
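
One practical wrinkle with branch-based tags is that branch names such as `feature/login` contain characters a Docker tag cannot hold: tags allow only letters, digits, `_`, `.` and `-`, must not start with `.` or `-`, and are capped at 128 characters. A small Python sketch of the sanitising step, assuming a hypothetical `branch_sha_tag` helper and a 7-character short SHA:

```python
import re


def branch_sha_tag(branch: str, sha: str, sha_length: int = 7) -> str:
    """Build a '<branch>-<short sha>' string that is valid as a Docker tag."""
    # Replace anything outside [A-Za-z0-9_.-] (for example '/') with '-'.
    safe_branch = re.sub(r"[^A-Za-z0-9_.-]+", "-", branch)
    # A tag may not start with '.' or '-'.
    safe_branch = safe_branch.lstrip(".-")
    if not safe_branch:
        raise ValueError(f"branch {branch!r} has no characters usable in a tag")
    short_sha = sha[:sha_length]
    # Tags are limited to 128 characters; trim the branch part if needed.
    max_branch = 128 - len(short_sha) - 1
    return f"{safe_branch[:max_branch]}-{short_sha}"


print(branch_sha_tag("main", "abc123f0de"))          # main-abc123f
print(branch_sha_tag("feature/login", "789abcdef"))  # feature-login-789abcd
```

Tools that generate tags for you usually do a comparable clean-up, so this is mainly useful when tags are assembled by hand in a script.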

Strategy 4: Date-Based

myapp:2026-04-09-143022    ← Date and time of build
myapp:20260409             ← Daily build

Pros: Easy to see how old a deployment is
Cons: Not traceable to a specific commit

Recommended: Combined Approach

# docker/metadata-action configuration
tags: |
  # main-abc123f
  type=sha,prefix={{branch}}-
  # 1.2.3 (from git tags)
  type=semver,pattern={{version}}
  # latest, only on the default branch
  type=raw,value=latest,enable={{is_default_branch}}

Package Registries

For libraries and packages (not containers), each language has its own registry.

NPM (JavaScript/TypeScript)

publish-npm:
  name: Publish to NPM
  runs-on: ubuntu-latest
  if: startsWith(github.ref, 'refs/tags/v')
 
  steps:
    - uses: actions/checkout@v4
 
    - uses: actions/setup-node@v4
      with:
        node-version: '20'
        registry-url: 'https://registry.npmjs.org'
 
    - run: npm ci
    - run: npm test
    - run: npm run build
 
    - name: Publish to NPM
      run: npm publish --access public
      env:
        NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

PyPI (Python)

publish-pypi:
  name: Publish to PyPI
  runs-on: ubuntu-latest
  if: startsWith(github.ref, 'refs/tags/v')
 
  steps:
    - uses: actions/checkout@v4
 
    - uses: actions/setup-python@v5
      with:
        python-version: '3.11'
 
    - name: Install build tools
      run: pip install build twine
 
    - name: Build package
      run: python -m build
 
    - name: Publish to PyPI
      run: twine upload dist/*
      env:
        TWINE_USERNAME: __token__
        TWINE_PASSWORD: ${{ secrets.PYPI_TOKEN }}

Maven Central (Java)

publish-maven:
  name: Publish to Maven Central
  runs-on: ubuntu-latest
  if: startsWith(github.ref, 'refs/tags/v')
 
  steps:
    - uses: actions/checkout@v4
 
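    - uses: actions/setup-java@v4
      with:
        java-version: '17'
        distribution: 'temurin'
        cache: 'maven'
        # Generates a settings.xml that reads credentials from the env vars named below
        server-id: ossrh                   # Assumed id; must match the repository id in pom.xml
        server-username: MAVEN_USERNAME
        server-password: MAVEN_PASSWORD
        gpg-private-key: ${{ secrets.GPG_PRIVATE_KEY }}  # Assumed secret name
        gpg-passphrase: MAVEN_GPG_PASSPHRASE
 
    - name: Build and publish
      run: mvn deploy -DskipTests
      env:
        MAVEN_USERNAME: ${{ secrets.OSSRH_USERNAME }}
        MAVEN_PASSWORD: ${{ secrets.OSSRH_PASSWORD }}
        MAVEN_GPG_PASSPHRASE: ${{ secrets.GPG_PASSPHRASE }}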

Pipeline Artifacts (Build Outputs)

Not all artifacts are packages. CI/CD pipelines also produce temporary artifacts that are shared between stages.

Passing Artifacts Between Jobs

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run build
 
      # Upload build output for other jobs
      - uses: actions/upload-artifact@v4
        with:
          name: build-output
          path: dist/
          retention-days: 7        # Auto-delete after 7 days
          if-no-files-found: error # Fail if build produced nothing
 
  test:
    needs: build
    runs-on: ubuntu-latest
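    steps:
      # Test files and dependencies come from the repo; only dist/ comes from the build job
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npx playwright install --with-deps
 
      # Download the build from previous job
      - uses: actions/download-artifact@v4
        with:
          name: build-output
          path: dist/
 
      - run: npx serve -s dist &
      - run: npx playwright test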
 
  deploy:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: build-output
          path: dist/
 
      - run: |
          aws s3 sync dist/ s3://my-bucket/ --delete

GitLab CI Artifacts

build:
  stage: build
  script:
    - npm ci
    - npm run build
  artifacts:
    paths:
      - dist/
      - node_modules/
    expire_in: 1 day      # Auto-cleanup
 
test:
  stage: test
  script:
    - npm test
  dependencies:
    - build  # Download artifacts from build job
  artifacts:
    when: always           # Upload even on failure
    paths:
      - coverage/
    reports:
      junit: junit.xml     # Written by the test run, parsed by GitLab for test results
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml

Software Bill of Materials (SBOM)

An SBOM is a complete inventory of all components in your software — every library, every dependency, every tool used to build it.

Why SBOM Matters

Your App
  ├── <package>@<version>
  │    ├── <dependency>@<version>
  │    ├── <dependency>@<version>
  │    │    └── <transitive dependency>@<version>
  │    └── ... (47 more packages)
  ├── <package>@<version>
  └── ... (200+ total packages)

Question: "Does your app use log4j?"
Without SBOM: "Let me check... 🤷"
With SBOM:    "No, confirmed in 2 seconds ✅"
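
Answering that question from an SBOM is a simple lookup. A minimal Python sketch, assuming an SPDX JSON document whose top-level `packages` list carries `name` and `versionInfo` fields (the `find_packages` helper and the inline sample data are illustrative, not taken from a real project):

```python
import json


def find_packages(sbom: dict, needle: str) -> list[tuple[str, str]]:
    """Return (name, version) pairs whose name contains `needle`."""
    needle = needle.lower()
    hits = []
    for package in sbom.get("packages", []):
        name = package.get("name", "")
        if needle in name.lower():
            hits.append((name, package.get("versionInfo", "unknown")))
    return hits


# Tiny inline SBOM with illustrative entries, standing in for a real
# file such as sbom-source.spdx.json loaded with json.load().
example = json.loads("""
{
  "spdxVersion": "SPDX-2.3",
  "packages": [
    {"name": "left-pad", "versionInfo": "1.3.0"},
    {"name": "log4j-core", "versionInfo": "2.14.1"}
  ]
}
""")

print(find_packages(example, "log4j"))    # [('log4j-core', '2.14.1')]
print(find_packages(example, "openssl"))  # []
```

A substring match is deliberately loose here; a real audit would more likely match on package identifiers such as purl values where the SBOM provides them.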

Generating SBOM in CI

generate-sbom:
  name: Generate SBOM
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
 
    # Generate SBOM for source code
    - name: Generate SPDX SBOM
      uses: anchore/sbom-action@v0
      with:
        path: .
        format: spdx-json
        output-file: sbom-source.spdx.json
 
    # Generate SBOM for Docker image
    - name: Generate container SBOM
      run: |
        docker run --rm \
          -v /var/run/docker.sock:/var/run/docker.sock \
          anchore/syft:latest \
          myapp:${{ github.sha }} \
          -o spdx-json > sbom-container.spdx.json
 
    - uses: actions/upload-artifact@v4
      with:
        name: sbom
        path: sbom-*.spdx.json

Artifact Retention & Cleanup

Artifacts consume storage and cost money. You need a cleanup strategy.

Retention Policies

| Artifact Type | Retention | Why |
| --- | --- | --- |
| Production images | 90 days minimum | Rollback capability |
| Staging images | 30 days | Debugging recent deploys |
| Feature branch images | 7 days | Temporary, clean up fast |
| Build logs | 14 days | Debugging pipeline issues |
| Test reports | 30 days | Trend analysis |
| Coverage reports | 30 days | Coverage tracking |
| SBOM | Lifetime of release | Compliance requirement |
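
A retention table like this one translates directly into a lookup plus an age check. The Python sketch below is one possible encoding, with hypothetical artifact-kind keys and an `is_expired` helper; it treats "lifetime of release" as "never expire automatically" and reads "90 days minimum" as a 90-day floor:

```python
from datetime import datetime, timedelta
from typing import Optional

# Retention windows in days, taken from the table above.
# None means "keep for the lifetime of the release" (never auto-expire).
RETENTION_DAYS: dict[str, Optional[int]] = {
    "production-image": 90,
    "staging-image": 30,
    "feature-branch-image": 7,
    "build-log": 14,
    "test-report": 30,
    "coverage-report": 30,
    "sbom": None,
}


def is_expired(kind: str, created_at: datetime, now: datetime) -> bool:
    """True when an artifact of this kind is past its retention window."""
    days = RETENTION_DAYS[kind]
    if days is None:
        return False
    return now - created_at > timedelta(days=days)


now = datetime(2026, 4, 9)
print(is_expired("feature-branch-image", now - timedelta(days=8), now))  # True
print(is_expired("sbom", now - timedelta(days=5000), now))               # False
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test.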

Automated Cleanup

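# GitHub Actions - Clean old container images
# `schedule` is a workflow-level trigger under `on:`, not a job key
on:
  schedule:
    - cron: '0 3 * * SUN'  # Weekly at 03:00 UTC on Sunday
 
# Job definition (goes under `jobs:` like the other examples)
cleanup:
  name: Cleanup Old Images
  runs-on: ubuntu-latest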
 
  steps:
    - name: Delete old untagged images
      uses: actions/delete-package-versions@v5
      with:
        package-name: myapp
        package-type: container
        min-versions-to-keep: 10
        delete-only-untagged-versions: true
 
    - name: Apply ECR lifecycle policy
      run: |
        # ECR stores this policy on the repository and enforces it from then on
        aws ecr put-lifecycle-policy \
          --repository-name myapp \
          --lifecycle-policy-text '{
            "rules": [
              {
                "rulePriority": 1,
                "description": "Expire untagged images older than 7 days",
                "selection": {
                  "tagStatus": "untagged",
                  "countType": "sinceImagePushed",
                  "countUnit": "days",
                  "countNumber": 7
                },
                "action": {
                  "type": "expire"
                }
              },
              {
                "rulePriority": 2,
                "description": "Keep only last 20 tagged images",
                "selection": {
                  "tagStatus": "tagged",
                  "tagPrefixList": ["develop-"],
                  "countType": "imageCountMoreThan",
                  "countNumber": 20
                },
                "action": {
                  "type": "expire"
                }
              }
            ]
          }'
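
To reason about what the second rule will remove before applying it, the count-based selection can be approximated locally. This Python sketch is only a model of the idea (keep the newest N images whose tag starts with a prefix, expire the rest), not ECR's implementation; the `Image` type and `expire_beyond_count` helper are hypothetical:

```python
from datetime import datetime
from typing import NamedTuple


class Image(NamedTuple):
    tag: str
    pushed_at: datetime


def expire_beyond_count(images: list[Image], prefix: str, keep: int) -> list[Image]:
    """Return images matching `prefix` that fall outside the newest `keep`."""
    matching = [image for image in images if image.tag.startswith(prefix)]
    # Newest first, so everything after the first `keep` entries is expired.
    matching.sort(key=lambda image: image.pushed_at, reverse=True)
    return matching[keep:]
```

Called with `prefix="develop-"` and `keep=20`, it returns the images a rule like the one above would be expected to expire, leaving images with other prefixes untouched.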

Artifact Security

Image Signing (Cosign)

Sign your images to prove they came from your CI pipeline:

sign-image:
  name: Sign Container Image
  runs-on: ubuntu-latest
  needs: build-and-push
 
  steps:
    - name: Install Cosign
      uses: sigstore/cosign-installer@v3
 
    - name: Sign the image
      run: |
        cosign sign --yes \
          --key env://COSIGN_PRIVATE_KEY \
          myregistry.io/myapp:${{ github.sha }}
      env:
        COSIGN_PRIVATE_KEY: ${{ secrets.COSIGN_PRIVATE_KEY }}
        COSIGN_PASSWORD: ${{ secrets.COSIGN_PASSWORD }}

Image Scanning Before Deployment

scan-image:
  name: Scan Image for Vulnerabilities
  runs-on: ubuntu-latest
  needs: build-and-push
 
  steps:
    - name: Scan with Trivy
      # Pin to a release tag or commit SHA in real pipelines; @master is a moving target
      uses: aquasecurity/trivy-action@master
      with:
        image-ref: myregistry.io/myapp:${{ github.sha }}
        format: 'table'
        exit-code: '1'           # Fail on findings
        severity: 'CRITICAL,HIGH'
        ignore-unfixed: true     # Skip CVEs with no fix yet

Artifact Management Best Practices

✅ DO This

Build once, deploy everywhere: Same artifact from CI to staging to production

Tag with git SHA: Always trace back to the exact commit

Sign your artifacts: Prove they came from your pipeline

Scan for vulnerabilities: Before an image is deployed from the registry

Generate SBOM: Know what's inside your artifacts

Set retention policies: Prevent storage costs from spiraling

Use immutable tags: Once v1.2.3 is published, never overwrite it

❌ DON'T Do This

Don't rebuild for each environment: Different builds = different bugs

Don't use latest in production: It's a moving target, not a version

Don't store artifacts locally: Use a proper registry with access control

Don't skip vulnerability scanning: A scan usually adds little time to a pipeline

Don't keep artifacts forever: Clean up old develop/feature branch artifacts