An end-to-end deployment workflow should do more than “deploy on push.” A reliable pipeline builds the application deterministically, deploys safely (often to a staging slot), verifies with automated tests, and promotes via a controlled release step (like a slot swap) only when quality gates pass.
Below is a practical blueprint based on a real GitHub Actions workflow (with identifiers and business names generalized). The goal is to help a DevOps engineer design a workflow that’s secure, maintainable, and production-ready.
What the example workflow does (high-level)
This workflow implements a clean release path:
- Trigger
  - Runs automatically on push to main, only when UI code changes
  - Also supports manual runs (workflow_dispatch) with inputs to control behavior
- Build
  - Logs into the cloud provider
  - Pulls secrets from a centralized secrets store (e.g., Key Vault)
  - Runs a reusable build action
- Deploy
  - Deploys the build artifact to a target slot (default: staging)
- Test
  - Runs UI E2E tests (optional)
  - Runs Mobile E2E tests (optional)
- Promote
  - Performs a slot swap only if the required tests succeeded
  - Supports multiple paths:
    - Swap after both UI+Mobile tests
    - Swap after UI-only tests
    - Swap after Mobile-only tests
This is the essence of a robust progressive delivery pipeline: promote only after verification.
Best practices demonstrated (and how to apply them)
1) Use path filters to avoid unnecessary runs
If a workflow deploys a UI, it should not run when backend docs change.
- Pattern: trigger on push with paths: ['UI/**']
- Benefit: lower CI cost, faster signal, less noise
Tip: Pair this with a separate backend workflow if you have independent deployables.
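As a rough illustration of the tip above, a companion backend workflow could carry its own path filter. The file name, the API/ directory, and the placeholder build step are assumptions made for this sketch; none of them come from the original workflow.

```yaml
# Hypothetical companion workflow: .github/workflows/backend.yml
name: Backend - Build

on:
  push:
    branches: [ main ]
    paths:
      - "API/**"         # assumed backend source directory
      - "!API/docs/**"   # negated pattern: doc-only changes do not trigger a run

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build backend
        run: echo "backend build goes here"   # placeholder step
```

Pattern order matters: the negated entry has to come after the positive one it narrows.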
2) Provide a manual “release console” with workflow_dispatch inputs
The workflow allows operators to choose:
- slot: staging vs production
- runUiE2E / runMobileE2E: enable/disable post-deploy tests
That’s the right idea: manual runs should be intentional and parameterized.
Best practices for inputs:
- Use choice inputs for controlled values
- Pick safe defaults (e.g., deploy to staging, tests enabled)
- Document each input clearly (what it does, when to use it)
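One possible way to apply these points is sketched below. It assumes boolean inputs instead of "true"/"false" choice strings, which is a variation and not what the original workflow uses. The description wording is illustrative.

```yaml
on:
  workflow_dispatch:
    inputs:
      slot:
        description: "Slot to deploy to (staging is the safe default)"
        type: choice
        options: [ staging, production ]
        default: staging
      runUiE2E:
        description: "Run the UI E2E suite after deployment"
        type: boolean
        default: true

jobs:
  ui_e2e:
    # With a boolean input, the inputs context holds a real boolean,
    # so no string comparison against 'true' is needed here.
    if: ${{ github.event_name != 'workflow_dispatch' || inputs.runUiE2E }}
    uses: ./.github/workflows/ui_e2e_tests.yml
```

If you read the value through github.event.inputs instead, it arrives as a string, so the two styles should not be mixed in one workflow.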
3) Make “staging slot first” the default release strategy
Deploying to a staging slot and swapping is one of the safest patterns for web apps:
- Deploy to staging slot → warm up → validate with E2E → swap to production
- Benefit: reduces downtime and supports quick rollback (swap back)
Add-on recommendation: include a warmup/smoke step before E2E (hit health endpoints, prime caches).
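A warmup step along these lines could sit at the end of the deploy job. This is a minimal sketch: STAGING_HEALTH_URL is a hypothetical repository variable, and the retry counts are arbitrary starting values.

```yaml
    steps:
      - name: Warm up staging slot
        shell: bash   # windows-latest defaults to PowerShell, so bash is requested explicitly
        env:
          HEALTH_URL: ${{ vars.STAGING_HEALTH_URL }}   # hypothetical variable
        run: |
          # --fail turns HTTP errors into a non-zero exit code.
          # --retry covers transient failures (timeouts and transient 5xx responses);
          # --retry-connrefused also retries while the slot is still starting.
          curl --fail --silent --show-error \
               --retry 12 --retry-delay 10 --retry-connrefused \
               --output /dev/null "$HEALTH_URL"
```

If the endpoint never returns a successful response, the step fails and the E2E jobs that depend on deploy do not start.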
4) Separate Build and Deploy jobs with clear boundaries
The workflow uses:
- a build job
- a deploy job that depends on build
This separation is critical:
- Build can run unit tests and produce immutable artifacts
- Deploy consumes the artifact and applies environment-specific config
- Failures are easier to diagnose
Extra hardening: upload build output as an artifact and download it in deploy. Avoid rebuilding in deploy.
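The hand-off between the two jobs might look like the sketch below. The artifact name, the UI/dist output directory, and the packagePath input on the deploy action are assumptions for illustration; the local actions in the original workflow may expose different inputs.

```yaml
jobs:
  build:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build web
        uses: ./.github/actions/build_web
      - name: Upload build output
        uses: actions/upload-artifact@v4
        with:
          name: web-build
          path: UI/dist              # assumed build output directory
          if-no-files-found: error   # fail early if the build produced nothing
          retention-days: 7

  deploy:
    needs: build
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v4    # still needed to resolve the local deploy action
      - name: Download build output
        uses: actions/download-artifact@v4
        with:
          name: web-build
          path: artifact
      - name: Deploy to slot
        uses: ./.github/actions/deploy_web
        with:
          packagePath: artifact      # hypothetical input name
```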
5) Centralize secrets, but keep the pipeline in control
The workflow retrieves runtime secrets from a secrets manager (Key Vault-like) and injects them into build steps.
Best practices:
- Use a secrets store for rotation and auditing
- Keep secrets out of logs (masking, avoid echo)
- Fetch only what the job needs (principle of least privilege)
Strong recommendation: prefer OIDC-based cloud auth over long-lived JSON credentials when possible (reduces credential leakage risk).
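For reference, an OIDC login could be sketched as follows. This assumes Azure and the azure/login action, which the generalized workflow does not name, and it assumes a federated credential has already been configured for the repository. The secret names are placeholders.

```yaml
permissions:
  id-token: write   # allows the job to request an OIDC token
  contents: read    # still needed for checkout

jobs:
  build:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v4
      - name: Cloud login via OIDC
        uses: azure/login@v2
        with:
          # Identifiers only; no client secret or JSON credential is stored.
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
```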
6) Use reusable actions and reusable workflows for consistency
The example uses:
- a reusable build action (e.g., .github/actions/build_web)
- a reusable deploy action (e.g., .github/actions/deploy_web)
- reusable workflows for E2E suites and slot swap
This is exactly how to scale:
- One build action used across multiple apps
- One E2E workflow versioned and shared
- One slot swap workflow that standardizes promotion behavior
Best practice: version/pin reusable workflow references (tags/SHAs) when reused across repos.
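When the shared workflows live in another repository, pinned references might look like this sketch. The example-org/shared-workflows repository is hypothetical, and the SHA is left as a placeholder to fill in.

```yaml
jobs:
  ui_e2e:
    # Pinned to a release tag: readable, but tags can be moved.
    uses: example-org/shared-workflows/.github/workflows/ui_e2e_tests.yml@v1

  swap_slot:
    needs: ui_e2e
    # Pinned to a full commit SHA: immutable. Replace the placeholder.
    uses: example-org/shared-workflows/.github/workflows/slot_swap.yml@<full-commit-sha>
```

Local references such as ./.github/workflows/slot_swap.yml always run the version from the calling commit, so pinning only applies to cross-repository reuse.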
7) Gate promotion using conditional logic + job dependencies
Promotion (slot swap) occurs only when:
- the required test jobs ran, and
- their results are success, and
- the manual inputs indicate which tests are required
This is a powerful pattern: quality gates encoded in the workflow.
Design guidance:
- Keep gates explicit and readable (avoid overly clever boolean expressions)
- Ensure “skip tests” cannot accidentally promote unless intentionally allowed
- Consider a required approval step (GitHub Environments) for production swap
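One way to keep the gate explicit while still covering the UI-only and Mobile-only paths is a single swap job like the sketch below. It is an assumption about how the paths could be combined, not the original workflow's logic, which may use separate swap jobs.

```yaml
jobs:
  swap_slot:
    needs: [deploy, ui_e2e, mobile_e2e]
    # !cancelled() replaces the implicit success() check, which would otherwise
    # skip this job whenever one of the test jobs was skipped.
    if: >-
      !cancelled()
      && needs.deploy.result == 'success'
      && (needs.ui_e2e.result == 'success' || needs.ui_e2e.result == 'skipped')
      && (needs.mobile_e2e.result == 'success' || needs.mobile_e2e.result == 'skipped')
      && (needs.ui_e2e.result == 'success' || needs.mobile_e2e.result == 'success')
    uses: ./.github/workflows/slot_swap.yml
    with:
      environment: Production
      slot: ${{ inputs.slot || 'staging' }}
    secrets: inherit
```

The last clause is what stops a run with both suites disabled from promoting: at least one suite must have actually succeeded.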
A generic reference workflow (cleaned up example)
Below is a genericized version of the same structure. This is not copied verbatim; it’s a template illustrating the best-practice shape.
name: Prod - Web App Deployment

on:
  push:
    branches: [ main ]
    paths:
      - "UI/**"
  workflow_dispatch:
    inputs:
      slot:
        description: "Target slot"
        required: false
        default: "staging"
        type: choice
        options: [ staging, production ]
      runUiE2E:
        description: "Run UI E2E after deployment"
        required: false
        default: "true"
        type: choice
        options: [ "true", "false" ]
      runMobileE2E:
        description: "Run Mobile E2E after deployment"
        required: false
        default: "true"
        type: choice
        options: [ "true", "false" ]

env:
  # inputs.slot is empty on push events, so fall back to staging
  SLOT_NAME: ${{ inputs.slot || 'staging' }}

jobs:
  build:
    runs-on: windows-latest
    environment: Production
    steps:
      - uses: actions/checkout@v4
      - name: Cloud login (OIDC recommended)
        uses: cloud/login@v2
        with:
          # prefer OIDC or short-lived credentials
          creds: ${{ secrets.CLOUD_CREDENTIALS }}
      - name: Load secrets
        id: secrets
        uses: cloud/get-secrets@v1
        with:
          vault: ${{ secrets.SECRETS_VAULT_NAME }}
          secrets: "APP_CLIENT_ID,APP_TENANT_ID,OBSERVABILITY_CONN_STRING"
      - name: Build web
        uses: ./.github/actions/build_web
        with:
          clientId: ${{ steps.secrets.outputs.APP_CLIENT_ID }}
          tenantId: ${{ steps.secrets.outputs.APP_TENANT_ID }}
          observability: ${{ steps.secrets.outputs.OBSERVABILITY_CONN_STRING }}
          skipTestRun: "false"

  deploy:
    runs-on: windows-latest
    needs: build
    environment:
      name: Production
      url: ${{ steps.deploy.outputs.webapp-url }}
    steps:
      - uses: actions/checkout@v4
      - name: Cloud login
        uses: cloud/login@v2
        with:
          creds: ${{ secrets.CLOUD_CREDENTIALS }}
      - name: Deploy to slot
        id: deploy
        uses: ./.github/actions/deploy_web
        with:
          appName: ${{ secrets.WEB_APP_NAME }}
          slotName: ${{ env.SLOT_NAME }}
          resourceGroup: ${{ vars.RESOURCE_GROUP }}

  ui_e2e:
    needs: deploy
    # Push-triggered runs always test; manual runs test only when requested
    if: ${{ github.event_name != 'workflow_dispatch' || inputs.runUiE2E == 'true' }}
    uses: ./.github/workflows/ui_e2e_tests.yml
    with:
      environment: Production
      useStagingUrl: true
    secrets: inherit

  mobile_e2e:
    needs: deploy
    if: ${{ github.event_name != 'workflow_dispatch' || inputs.runMobileE2E == 'true' }}
    uses: ./.github/workflows/mobile_e2e_tests.yml
    with:
      environment: Production
      useStagingUrl: true
    secrets: inherit

  swap_slot:
    needs: [ui_e2e, mobile_e2e]
    # No ${{ }} wrapper and a stripped block scalar (>-): a wrapped expression
    # followed by a trailing newline is read as a non-empty, always-true string.
    if: >-
      (github.event_name != 'workflow_dispatch' ||
        (inputs.runUiE2E == 'true' && inputs.runMobileE2E == 'true'))
      && needs.ui_e2e.result == 'success'
      && needs.mobile_e2e.result == 'success'
    uses: ./.github/workflows/slot_swap.yml
    with:
      environment: Production
      # The env context is not available in `with` for a reusable workflow call,
      # so the slot is derived from inputs directly.
      slot: ${{ inputs.slot || 'staging' }}
    secrets: inherit
Production-hardening checklist (what I would add next)
- Pin action versions: move from older major versions to current (actions/checkout@v4, updated cloud login actions), and consider pinning to commit SHA for high assurance.
- Use OIDC for cloud auth: avoid storing long-lived credentials.
- Add concurrency: prevent two deployments to the same environment/slot from racing.
Example idea: concurrency: prod-web-${{ inputs.slot || 'staging' }}
- Artifact integrity: build once, deploy the exact artifact (store and verify checksum).
- Smoke tests before E2E: quick health checks, then run expensive suites.
- Observability annotations: write deployment metadata to your logging/monitoring platform.
- Rollback procedure: a manual workflow to swap back, redeploy previous artifact, or restore configuration.
- Environment approvals: require reviewers for production promotion (GitHub Environments).
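The concurrency item from the checklist can be sketched as a workflow-level block. The group name follows the example idea above; choosing cancel-in-progress: false is an assumption about the desired behavior.

```yaml
concurrency:
  # One group per target slot; push events have no inputs, so default to staging.
  group: prod-web-${{ inputs.slot || 'staging' }}
  # Let a running deployment finish instead of interrupting it halfway.
  cancel-in-progress: false
```

With this setting a newer run waits for the running one. Only one run can be pending per group, so a third run replaces the one already waiting.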
Closing thoughts
A strong deployment workflow is opinionated: it assumes deployments are frequent, failures happen, and promotions should be earned through automated checks. The build→deploy→test→promote pattern (especially with slot-based release) is a proven foundation that scales from a single app to an entire platform — without turning releases into a hero-driven event.