
CI/CD — Smartsapp

This document defines how code moves from a developer's machine to production.

Smartsapp follows:

  • Trunk-based development
  • Immutable container images
  • Version creation during staging promotion
  • Manual promotion to production
  • Zero-downtime deployments

1. Platform

Component             Technology
CI/CD Orchestration   Devtron
Container Runtime     Kubernetes (RKE2)
Cluster Management    Rancher
Image Builds          Multi-stage Containerfiles
Container Registry    OCI-compatible registry

Each deployable application has its own Devtron pipeline (CI + CD).


2. Branching Strategy

All engineers integrate into a single branch: main.

Principles

  • Feature branches are short-lived.
  • main is always deployable.
  • Incomplete features are hidden behind feature flags.
  • Code is integrated continuously.
  • Every commit to main must build and pass tests.

3. Image Build Strategy

Every commit to main produces a container image.

Continuous Integration Images

When a commit is pushed to main:

  1. CI builds the container image.
  2. The image is tagged with the commit SHA.

Example:

sha-f4e8a12

This image is:

  • Automatically deployed to dev
  • Used for integration validation
  • Not considered a release

SHA images are temporary validation artifacts.
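As a minimal sketch, the tagging step might derive the image tag from the commit SHA like this (the `sha_tag` helper is illustrative, not the actual pipeline code; only the `sha-` prefix convention comes from this document):

```shell
#!/bin/sh
# Illustrative only: derive the temporary validation tag from a commit SHA.
# The sha- prefix matches the convention shown above (e.g. sha-f4e8a12).
sha_tag() {
  commit="$1"
  printf 'sha-%s\n' "$(printf '%s' "$commit" | cut -c1-7)"
}

sha_tag "f4e8a12c9b0d4e6f8a1b2c3d4e5f6a7b8c9d0e1f"   # -> sha-f4e8a12
```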


4. Versioning and Releases

A semantic version is created when promoting from dev to staging.

Developers do not manually create Git tags.

Version Creation Process

When promoting a build to staging:

  1. The system identifies the latest Git tag.
  2. The next semantic version is calculated (typically a PATCH increment).
  3. A new Git tag (e.g., v1.14.0) is created automatically.
  4. The tag is pushed to the repository.
  5. The existing SHA image is retagged with the semantic version (1.14.0).
  6. Release notes are generated automatically from commit history.
  7. The versioned image is deployed to staging.

No rebuild occurs.

The same image is later promoted to production.

Version Increment Policy

By default, each staging promotion increments PATCH:

1.14.0
1.14.1
1.14.2

This may be extended in the future to support automated major/minor bumps.
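The patch-increment calculation in step 2 can be sketched as follows (a minimal sketch assuming tags follow the vMAJOR.MINOR.PATCH form shown above; the `next_patch` helper name is illustrative):

```shell
#!/bin/sh
# Illustrative: compute the next PATCH version from the latest Git tag.
# Assumes tags look like v1.14.0; in the real pipeline the input would
# typically come from `git describe --tags --abbrev=0`.
next_patch() {
  tag="${1#v}"            # strip leading "v" -> 1.14.0
  major="${tag%%.*}"      # 1
  rest="${tag#*.}"        # 14.0
  minor="${rest%%.*}"     # 14
  patch="${rest#*.}"      # 0
  printf 'v%s.%s.%s\n' "$major" "$minor" "$((patch + 1))"
}

next_patch v1.14.0   # -> v1.14.1
```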


5. Release Notes

Release notes are generated automatically from the Git history between the previous version tag and the new version tag.

Example command:

git log <previous-tag>..<new-tag> --pretty=format:"- %s"

Release notes are:

  • Stored in CHANGELOG.md at the repository root
  • Updated automatically during staging promotion
  • Available for auditing

CHANGELOG.md is the authoritative record of releases.
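The generation step might look like the following sketch (the function names and CHANGELOG section layout are assumptions; only the `git log` command itself comes from the example above):

```shell
#!/bin/sh
# Illustrative: collect commit subjects between two tags and prepend
# them to CHANGELOG.md as a new release section.
release_notes() {
  prev="$1" next="$2"
  git log "${prev}..${next}" --pretty=format:"- %s"
}

prepend_changelog() {
  version="$1" notes="$2"
  { printf '## %s\n\n%s\n\n' "$version" "$notes"; cat CHANGELOG.md 2>/dev/null; } > CHANGELOG.tmp
  mv CHANGELOG.tmp CHANGELOG.md
}
```

Prepending keeps the newest release at the top of CHANGELOG.md, which suits its role as the authoritative release record.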


6. Environments

dev → staging → production

Environment   Purpose                 Replicas
dev           Continuous validation   1
staging       Release verification    1
production    Live traffic            3

Rules

  • Dev updates automatically on every commit.
  • Staging promotion creates a version.
  • Production only runs semantic version images.
  • Promotion to production requires manual approval.

7. Immutable Promotion Model

Images are built once.

Promotion moves the same image through environments:

Commit → sha-abc123 → dev
Promote → 1.14.0 → staging
Promote → 1.14.0 → production

There is no rebuild between environments.

Production runs the exact image tested in staging.
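One way to retag without rebuilding is a registry-side copy. The sketch below uses a DRY_RUN guard and `crane cp`; the tool choice and registry path are assumptions, not the documented pipeline:

```shell
#!/bin/sh
# Illustrative: promote an existing SHA image by adding a semantic
# version tag in the registry. With DRY_RUN=1 (the default here) the
# command is only printed, not executed.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$@"; else "$@"; fi; }

promote_image() {
  registry="$1" app="$2" sha="$3" version="$4"
  # crane cp copies the image registry-to-registry; no rebuild occurs.
  run crane cp "${registry}/${app}:${sha}" "${registry}/${app}:${version}"
}

promote_image registry.example.com backend sha-f4e8a12 1.14.0
```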


8. Monorepo Deployment Model

The Smartsapp monorepo contains multiple deployable applications.

Each application:

  • Is registered separately in Devtron
  • Points to the same repository
  • Uses a path filter to trigger builds only when relevant files change

Backend — System API

Repo Path         backend/
Containerfile     backend/Containerfile
Build             Gradle 8.12 + JDK 21
Runtime           Eclipse Temurin 21 JRE
Deploy Target     Kubernetes
Health Endpoint   /actuator/health

Frontend — Admin Portal

Repo Path         ui-clients/frontend-web-apps/apps/admin-portal/
Containerfile     Multi-stage (Node → Nginx)
Deploy Target     Kubernetes
Health Endpoint   /

Mobile Applications

Repo Paths     ui-clients/mobile-hybrid-apps/apps/...
Output         APK / IPA
Distribution   App Stores

Mobile applications are built separately and are not deployed via Kubernetes.


9. Pipeline Structure

Bitbucket Pipelines runs on every push to main.

Pipeline Steps (main branch)

  1. Run Test Suite + Admin Portal Vitests + Admin Portal E2E Playwright Smoke (parallel) — the Gradle build and tests with Testcontainers (Postgres + Kafka) run alongside the admin-portal Vitest suite with its coverage delta gate and the Playwright @smoke browser tier (see docs/AI_CONTEXT.md > "CI gates" for the full gate model).
  2. Build & Push Images (parallel) — Backend, System Documentation, Admin Portal, Parent Portal, Coverage Dashboard.
  3. Update Infrastructure — kubectl apply for K8s manifests (probes, resources, config).
  4. Deploy (parallel) — Backend, System Documentation, Admin Portal, Parent Portal, Coverage Dashboard via kubectl set image.
  5. Post-deploy (parallel) — Smoke Test Backend (Newman/Postman), Deploy System Directory.
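The deploy step (4) can be sketched as follows (the Deployment and container names are illustrative; the DRY_RUN guard only prints the command instead of running it):

```shell
#!/bin/sh
# Illustrative: update a Deployment's container image in place, which
# triggers the rolling update described in Section 10.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$@"; else "$@"; fi; }

deploy() {
  app="$1" image="$2"
  run kubectl set image "deployment/${app}" "${app}=${image}"
}

deploy backend registry.example.com/backend:sha-f4e8a12
```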

Pipeline Steps (pull requests)

  1. Run Test Suite + Admin Portal Vitests + Admin Portal E2E Playwright Smoke (parallel) — same backend and frontend gates as main. PR cannot merge with a failing pipeline (requires "Successful builds" merge check in Bitbucket repo settings).

Smoke tests verify: health endpoints, OpenAPI docs, and key module APIs (school, canteen, sample). The collection lives at e2e/postman/smoke-tests.json.

Staging

  • Manual promotion.
  • Triggers automatic version creation.
  • Deploys semantic version image.

Production

  • Manual promotion.
  • Deploys same semantic version image.
  • No rebuild.

10. Deployment Strategy

The backend uses RollingUpdate with maxUnavailable: 0 and maxSurge: 1 for zero-downtime releases.
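In Deployment manifest terms, this corresponds to a strategy stanza like the following (a standard Kubernetes fragment restating the values above, not a copy of the actual manifest):

```yaml
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0   # never remove a serving pod before its replacement is ready
    maxSurge: 1         # allow one extra pod during the rollout
```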

How It Works

  1. New pod is created with the updated image.
  2. New pod must pass startup and readiness probes.
  3. Once ready, the old pod is terminated.
  4. Traffic shifts seamlessly — at least one healthy pod is always serving.

Health Probes

readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 10

livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 120
  periodSeconds: 15
  timeoutSeconds: 10
  failureThreshold: 5

startupProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 20
  periodSeconds: 10
  timeoutSeconds: 10
  failureThreshold: 30

  • Startup gives the JVM up to ~5 minutes to boot and settle Kafka consumers before liveness kicks in.
  • Liveness checks only livenessState — Kafka/Redis outages do not restart the pod.
  • Readiness checks only readinessState — gates traffic to healthy pods.

11. Rollback

If the green deployment fails (pods crash or never become ready), the Service selector remains on blue. No traffic is affected.

If an issue is detected after switchover:

  1. Revert the Service selector to the blue pods.
  2. Traffic returns to the previous version instantly.
  3. No rebuild or redeployment is needed.

Blue pods are retained for a configurable period after switchover to enable this fast rollback.

For older versions, use Devtron:

  1. Open the application in Devtron.
  2. Go to Deployment History.
  3. Select a previous version.
  4. Click Rollback.

12. Hotfix Protocol

Use this protocol only when production is down or critically degraded and the fix is small, well-understood, and time-sensitive.

When to use

  • Production outage or critical degradation
  • The fix is isolated (1-3 files) and the root cause is confirmed
  • Waiting for the full test suite (~10 min) is not acceptable

Who can authorize

A hotfix requires verbal or written approval from the tech lead or the on-call engineer before merging.

Steps

  1. Branch and fix. Create a short-lived branch from main with the fix.

  2. Commit with [hotfix] tag.

    fix(auth): patch session token crash [hotfix]
    

    Use --no-verify to skip the local pre-commit test hook:

    git commit --no-verify -m "fix(auth): patch session token crash [hotfix]"
    
  3. Merge to main. The CI pipeline detects [hotfix] in the commit message and skips tests — it only builds the JAR and container image.

  4. Verify on dev. Confirm the fix works on the dev environment.

  5. Fast-track promote. Promote through staging to production using the normal promotion flow.

  6. Mandatory follow-up (within 24 hours). Push a normal commit (without [hotfix]) that runs the full test suite against the hotfix code. This ensures test coverage is restored.
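The `[hotfix]` detection in step 3 might be sketched as follows (the function name and pipeline wiring are assumptions; only the marker convention comes from this document):

```shell
#!/bin/sh
# Illustrative: decide whether the CI run should skip the test stage
# based on the [hotfix] marker in the commit message.
is_hotfix() {
  case "$1" in
    *"[hotfix]"*) return 0 ;;
    *)            return 1 ;;
  esac
}

if is_hotfix "fix(auth): patch session token crash [hotfix]"; then
  echo "skip tests: build JAR and image only"
fi
```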

Audit trail

All hotfix commits are searchable:

git log --grep='\[hotfix\]'

Rollback

If the hotfix makes things worse, use the existing rollback procedure (Section 11) — the blue-green deployment model allows instant traffic reversion.


References