CI/CD — Smartsapp¶
This document defines how code moves from a developer's machine to production.
Smartsapp follows:
- Trunk-based development
- Immutable container images
- Version creation during staging promotion
- Manual promotion to production
- Zero-downtime deployments
1. Platform¶
| Component | Technology |
|---|---|
| CI/CD Orchestration | Devtron |
| Container Runtime | Kubernetes (RKE2) |
| Cluster Management | Rancher |
| Image Builds | Multi-stage Containerfiles |
| Container Registry | OCI-compatible registry |
Each deployable application has its own Devtron pipeline (CI + CD).
2. Branching Strategy¶
All engineers integrate into a single branch: main.
Principles¶
- Feature branches are short-lived.
- main is always deployable.
- Incomplete features are hidden behind feature flags.
- Code is integrated continuously.
- Every commit to main must build and pass tests.
3. Image Build Strategy¶
Every commit to main produces a container image.
Continuous Integration Images¶
When a commit is pushed to main:
- CI builds the container image.
- The image is tagged with the commit SHA.
Example:
sha-f4e8a12
This image is:
- Automatically deployed to dev
- Used for integration validation
- Not considered a release
SHA images are temporary validation artifacts.
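The tagging rule above can be sketched as a small shell function. This is an illustration, not the pipeline's actual script: it assumes the tag is `sha-` followed by the first 7 characters of the commit SHA, which matches the `sha-f4e8a12` example but is not stated explicitly in this document. The function name `sha_image_tag` is hypothetical.

```shell
#!/bin/sh
# Derive the CI image tag from a full commit SHA.
# Assumption: "sha-" + first 7 characters, as in the sha-f4e8a12 example.
sha_image_tag() {
  short=$(printf '%s' "$1" | cut -c1-7)
  printf 'sha-%s\n' "$short"
}

# In CI the argument would typically come from: git rev-parse HEAD
sha_image_tag "f4e8a12c9d0b7e6a5f4e3d2c1b0a9f8e7d6c5b4a"   # prints sha-f4e8a12
```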
4. Versioning and Releases¶
A semantic version is created when promoting from dev to staging.
Developers do not manually create Git tags.
Version Creation Process¶
When promoting a build to staging:
- The system identifies the latest Git tag.
- The next semantic version is calculated (typically a PATCH increment).
- A new Git tag (e.g., v1.14.0) is created automatically.
- The tag is pushed to the repository.
- The existing SHA image is retagged with the semantic version (1.14.0).
- Release notes are generated automatically from commit history.
- The versioned image is deployed to staging.
No rebuild occurs.
The same image is later promoted to production.
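As a rough sketch of the tag-and-retag part of this process, the function below prints the commands a promotion might run, without executing them. Several names are assumptions: `registry.example.com/smartsapp/backend` is a placeholder image name, and `skopeo copy` stands in for whatever retagging tool the pipeline actually uses (this document does not name one). The point it illustrates is that the versioned image is a copy of the existing SHA image, not a new build.

```shell
#!/bin/sh
# Print (do not execute) the commands for promoting a SHA image to a version.
# Assumptions: hypothetical image name; skopeo as a stand-in retagging tool.
promotion_plan() {
  image=$1                  # e.g. registry.example.com/smartsapp/backend
  sha_tag=$2                # e.g. sha-f4e8a12
  git_tag=$3                # e.g. v1.14.0
  commit=${sha_tag#sha-}    # the commit the image was built from
  image_tag=${git_tag#v}    # image tags drop the leading "v"
  echo "git tag $git_tag $commit"
  echo "git push origin $git_tag"
  echo "skopeo copy docker://$image:$sha_tag docker://$image:$image_tag"
}

promotion_plan registry.example.com/smartsapp/backend sha-f4e8a12 v1.14.0
```

Tagging the commit that produced the image, rather than whatever `HEAD` happens to be at promotion time, keeps the Git tag and the image content aligned.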
Version Increment Policy¶
By default, each staging promotion increments PATCH:
1.14.0
1.14.1
1.14.2
This may be extended in future to support automated major/minor bumps.
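The default PATCH increment can be expressed in a few lines of shell. This is a minimal sketch that assumes tags always have the form `vMAJOR.MINOR.PATCH` with no pre-release or build suffix; the function name `next_patch_version` is hypothetical.

```shell
#!/bin/sh
# Compute the next PATCH version from the latest Git tag.
# Assumption: tags look like vMAJOR.MINOR.PATCH with no suffix.
next_patch_version() {
  version=${1#v}            # strip the leading "v"
  major=${version%%.*}
  rest=${version#*.}
  minor=${rest%%.*}
  patch=${rest#*.}
  printf 'v%s.%s.%s\n' "$major" "$minor" "$((patch + 1))"
}

# The latest tag would typically come from: git describe --tags --abbrev=0
next_patch_version v1.14.0   # prints v1.14.1
```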
5. Release Notes¶
Release notes are generated automatically from the Git history between the previous version tag and the new version tag.
Example command:
git log <previous-tag>..HEAD --pretty=format:"- %s"
Release notes are:
- Stored in CHANGELOG.md at the repository root
- Updated automatically during staging promotion
- Available for auditing
CHANGELOG.md is the authoritative record of releases.
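One way to turn the commit history into a changelog section is sketched below. The `## <version>` heading layout is an assumption made for illustration, since this document does not specify the format of CHANGELOG.md, and `changelog_section` is a hypothetical helper name.

```shell
#!/bin/sh
# Build one changelog section from commit subjects read on stdin.
# Assumption: the "## <version>" heading layout is illustrative only.
changelog_section() {
  printf '## %s\n\n' "$1"
  # "|| [ -n ... ]" keeps the last line even without a trailing newline,
  # which is how git log --pretty=format: ends its output.
  while IFS= read -r subject || [ -n "$subject" ]; do
    if [ -n "$subject" ]; then
      printf '%s\n' "- $subject"
    fi
  done
  return 0
}

# Typical use (not run here):
#   git log "<previous-tag>..HEAD" --pretty=format:%s | changelog_section v1.14.1
printf 'fix login redirect\nadd canteen report\n' | changelog_section v1.14.1
```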
6. Environments¶
dev → staging → production
| Environment | Purpose | Replicas |
|---|---|---|
| dev | Continuous validation | 1 |
| staging | Release verification | 1 |
| production | Live traffic | 3 |
Rules¶
- Dev updates automatically on every commit.
- Staging promotion creates a version.
- Production only runs semantic version images.
- Promotion to production requires manual approval.
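The rule that production only runs semantic version images could be enforced with a guard like the one below. It is a sketch under the assumption that release image tags are exactly `MAJOR.MINOR.PATCH` (digits and dots only); `is_release_tag` is a hypothetical name, not part of Devtron.

```shell
#!/bin/sh
# Succeed only for image tags of the form MAJOR.MINOR.PATCH.
# Assumption: release tags contain digits and dots only.
is_release_tag() {
  case $1 in
    *[!0-9.]*)   return 1 ;;   # other characters, e.g. sha-f4e8a12
    *.*.*.*)     return 1 ;;   # too many components
    .*|*.|*..*)  return 1 ;;   # an empty component
    *.*.*)       return 0 ;;   # exactly three numeric components
    *)           return 1 ;;
  esac
}

for tag in 1.14.0 sha-f4e8a12; do
  if is_release_tag "$tag"; then
    echo "$tag: allowed in production"
  else
    echo "$tag: rejected"
  fi
done
```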
7. Immutable Promotion Model¶
Images are built once.
Promotion moves the same image through environments:
Commit → sha-abc123 → dev
Promote → 1.14.0 → staging
Promote → 1.14.0 → production
There is no rebuild between environments.
Production runs the exact image tested in staging.
8. Monorepo Deployment Model¶
The Smartsapp monorepo contains multiple deployable applications.
Each application:
- Is registered separately in Devtron
- Points to the same repository
- Uses a path filter to trigger builds only when relevant files change
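The effect of a path filter can be illustrated with a short prefix check. Devtron's own filter configuration is not shown in this document, so this is only a sketch of the idea; `touches_path` is a hypothetical helper.

```shell
#!/bin/sh
# Succeed if any changed file (one path per line on stdin) lies under a prefix.
touches_path() {
  prefix=$1
  while IFS= read -r file || [ -n "$file" ]; do
    case $file in
      "$prefix"*) return 0 ;;
    esac
  done
  return 1
}

# Typical use (not run here):
#   git diff --name-only HEAD~1 HEAD | touches_path backend/
if printf 'backend/build.gradle\nREADME.md\n' | touches_path backend/; then
  echo "backend build triggered"
fi
```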
Backend — System API¶
| Property | Value |
|---|---|
| Repo Path | backend/ |
| Containerfile | backend/Containerfile |
| Build | Gradle 8.12 + JDK 21 |
| Runtime | Eclipse Temurin 21 JRE |
| Deploy Target | Kubernetes |
| Health Endpoint | /actuator/health |
Frontend — Admin Portal¶
| Property | Value |
|---|---|
| Repo Path | ui-clients/frontend-web-apps/apps/admin-portal/ |
| Containerfile | Multi-stage (Node → Nginx) |
| Deploy Target | Kubernetes |
| Health Endpoint | / |
Mobile Applications¶
| Property | Value |
|---|---|
| Repo Paths | ui-clients/mobile-hybrid-apps/apps/... |
| Output | APK / IPA |
| Distribution | App Stores |
Mobile applications are built separately and are not deployed via Kubernetes.
9. Pipeline Structure¶
Bitbucket Pipelines runs on every push to main.
Pipeline Steps (main branch)¶
- Run Test Suite + Admin Portal Vitests + Admin Portal E2E Playwright Smoke (parallel): the Gradle build and tests with Testcontainers (Postgres + Kafka) run alongside the admin-portal vitest + coverage delta gate and the Playwright @smoke browser tier (see docs/AI_CONTEXT.md > "CI gates" for the full gate model).
- Build & Push Images (parallel): Backend, System Documentation, Admin Portal, Parent Portal, Coverage Dashboard.
- Update Infrastructure: kubectl apply for K8s manifests (probes, resources, config).
- Deploy (parallel): Backend, System Documentation, Admin Portal, Parent Portal, Coverage Dashboard via kubectl set image.
- Post-deploy (parallel): Smoke Test Backend (Newman/Postman), Deploy System Directory.
Pipeline Steps (pull requests)¶
- Run Test Suite + Admin Portal Vitests + Admin Portal E2E Playwright Smoke (parallel): same backend and frontend gates as main. A PR cannot merge with a failing pipeline (this requires the "Successful builds" merge check in the Bitbucket repo settings).
Smoke tests verify: health endpoints, OpenAPI docs, and key module APIs (school, canteen, sample). The collection lives at e2e/postman/smoke-tests.json.
Staging¶
- Manual promotion.
- Triggers automatic version creation.
- Deploys semantic version image.
Production¶
- Manual promotion.
- Deploys same semantic version image.
- No rebuild.
10. Deployment Strategy¶
The backend uses RollingUpdate with maxUnavailable: 0 and maxSurge: 1 for zero-downtime releases.
How It Works¶
- New pod is created with the updated image.
- New pod must pass startup and readiness probes.
- Once ready, the old pod is terminated.
- Traffic shifts seamlessly — at least one healthy pod is always serving.
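The pod-count guarantees behind these steps follow from simple arithmetic: during a rolling update Kubernetes keeps at least `replicas - maxUnavailable` pods available and runs at most `replicas + maxSurge` pods in total. The sketch below applies that to the replica counts from the Environments table; it assumes both settings are absolute numbers rather than percentages, and `rollout_bounds` is a hypothetical helper.

```shell
#!/bin/sh
# Pod-count bounds during a RollingUpdate.
# Assumption: maxUnavailable and maxSurge are absolute numbers, not percentages.
rollout_bounds() {
  replicas=$1
  max_unavailable=$2
  max_surge=$3
  echo "min_available=$((replicas - max_unavailable)) max_total=$((replicas + max_surge))"
}

rollout_bounds 3 0 1   # production: prints min_available=3 max_total=4
rollout_bounds 1 0 1   # staging:    prints min_available=1 max_total=2
```

With `maxUnavailable: 0`, an old pod is removed only after its replacement is ready, which is why even a single-replica environment never drops to zero serving pods.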
Health Probes¶
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 10
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 120
  periodSeconds: 15
  timeoutSeconds: 10
  failureThreshold: 5
startupProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 20
  periodSeconds: 10
  timeoutSeconds: 10
  failureThreshold: 30
- Startup gives the JVM up to ~5 minutes to boot and settle Kafka consumers before liveness kicks in.
- Liveness checks only livenessState, so Kafka/Redis outages do not restart the pod.
- Readiness checks only readinessState and gates traffic to healthy pods.
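The "~5 minutes" startup figure can be checked from the probe values: the budget is roughly `initialDelaySeconds + failureThreshold × periodSeconds`. This is an approximation that ignores per-request timeouts, and `probe_budget_seconds` is a hypothetical helper.

```shell
#!/bin/sh
# Approximate time a probe tolerates before the container is restarted:
# initialDelaySeconds + failureThreshold * periodSeconds (timeouts ignored).
probe_budget_seconds() {
  initial_delay=$1
  failure_threshold=$2
  period=$3
  echo $((initial_delay + failure_threshold * period))
}

probe_budget_seconds 20 30 10   # startupProbe values: prints 320
```

320 seconds is about 5 minutes 20 seconds, consistent with the startup allowance described above.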
11. Rollback¶
If the green deployment fails (pods crash or never become ready), the Service selector remains on blue. No traffic is affected.
If an issue is detected after switchover:
- Revert the Service selector to the blue pods.
- Traffic returns to the previous version instantly.
- No rebuild or redeployment is needed.
Blue pods are retained for a configurable period after switchover to enable this fast rollback.
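A selector revert of this kind might look like the command printed below. Every name here is an assumption: the Service name `backend` and the `slot` label key are placeholders, since this document does not state which labels the Service selects on. The function only prints the command so it can be reviewed before anyone runs it.

```shell
#!/bin/sh
# Print the kubectl command that points a Service back at the previous pods.
# Assumptions: Service name "backend" and label key "slot" are hypothetical.
rollback_command() {
  service=$1
  slot=$2
  printf 'kubectl patch service %s -p %s\n' \
    "$service" "'{\"spec\":{\"selector\":{\"slot\":\"$slot\"}}}'"
}

rollback_command backend blue
# prints: kubectl patch service backend -p '{"spec":{"selector":{"slot":"blue"}}}'
```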
For older versions, use Devtron:
- Open the application in Devtron.
- Go to Deployment History.
- Select a previous version.
- Click Rollback.
12. Hotfix Protocol¶
Use this protocol only when production is down or critically degraded and the fix is small, well-understood, and time-sensitive.
When to use¶
- Production outage or critical degradation
- The fix is isolated (1-3 files) and the root cause is confirmed
- Waiting for the full test suite (~10 min) is not acceptable
Who can authorize¶
A hotfix requires verbal or written approval from the tech lead or the on-call engineer before merging.
Steps¶
- Branch and fix. Create a short-lived branch from main with the fix.
- Commit with the [hotfix] tag, for example:
  fix(auth): patch session token crash [hotfix]
  Use --no-verify to skip the local pre-commit test hook:
  git commit --no-verify -m "fix(auth): patch session token crash [hotfix]"
- Merge to main. The CI pipeline detects [hotfix] in the commit message and skips tests; it only builds the JAR and container image.
- Verify on dev. Confirm the fix works on the dev environment.
- Fast-track promote. Promote through staging to production using the normal promotion flow.
- Mandatory follow-up (within 24 hours). Push a normal commit (without [hotfix]) that runs the full test suite against the hotfix code. This ensures test coverage is restored.
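The commit-message check that the pipeline performs in the merge step could be as simple as the sketch below. The actual pipeline script is not shown in this document, so treat this as an illustration; `is_hotfix_commit` is a hypothetical name.

```shell
#!/bin/sh
# Succeed if a commit message contains the literal marker [hotfix].
is_hotfix_commit() {
  case $1 in
    *"[hotfix]"*) return 0 ;;   # quoted, so the brackets match literally
    *)            return 1 ;;
  esac
}

# In CI the message would typically come from: git log -1 --pretty=%B
if is_hotfix_commit "fix(auth): patch session token crash [hotfix]"; then
  echo "hotfix: build only, tests skipped"
else
  echo "normal commit: run full test suite"
fi
```

Quoting the pattern matters: an unquoted `[hotfix]` would be read as a character class and match any message containing one of those letters.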
Audit trail¶
All hotfix commits are searchable:
git log --grep='\[hotfix\]'
Rollback¶
If the hotfix makes things worse, use the existing rollback procedure (Section 11) — the blue-green deployment model allows instant traffic reversion.