Change Management Process
This ITIL-aligned change management process ensures that all changes are handled in a controlled, efficient way to minimise risk and service disruption.
Change Types
Standard Change
Pre-approved, low risk, well-documented.
- Follows an established, tested procedure
- No CAB approval required
- Implemented by authorised engineers
- Examples: password resets, user provisioning, approved patch deployment, standard firewall rule additions
- Must be logged in ServiceNow for audit trail
Normal Change
Requires CAB review and approval.
- Non-standard, carries moderate to high risk
- Requires RFC submission with risk assessment
- Reviewed at weekly CAB meeting
- Examples: server migrations, network re-architecture, application upgrades, new service deployment
- Minimum 5 business days lead time for CAB
Emergency Change
Urgent, bypasses standard CAB process.
- Required to resolve a P1/P2 incident or critical security vulnerability
- Approved by Emergency CAB (eCAB): two senior engineers + SDM
- Must be documented retrospectively within 24 hours
- Subject to a full Post-Implementation Review (PIR)
- Target: fewer than 5% of all changes should be emergency changes
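The emergency-change target can be checked against a log export with a few lines of code. The following is a minimal sketch, assuming each logged change carries a type label of "standard", "normal" or "emergency"; the function names and the label strings are illustrative and are not taken from ServiceNow.

```python
from collections import Counter

# From the policy above: fewer than 5% of all changes should be emergency changes.
EMERGENCY_TARGET = 0.05


def emergency_ratio(change_types):
    """Return the fraction of changes whose type label is 'emergency'.

    `change_types` is any iterable of type labels (hypothetical format).
    """
    counts = Counter(change_types)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no changes logged, so nothing to measure
    return counts["emergency"] / total


def within_emergency_target(change_types):
    """True when the emergency share is strictly below the 5% target."""
    return emergency_ratio(change_types) < EMERGENCY_TARGET
```

For example, a month with 60 standard, 37 normal and 3 emergency changes has an emergency share of 3% and meets the target.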
Normal Change Process Flow
Change Advisory Board (CAB)
CAB Membership
| Role | Name | Responsibility |
|---|---|---|
| CAB Chair | James O'Brien | Chairs meeting, final approval authority, conflict resolution |
| Service Delivery Rep | Sarah Chen (delegate) | Assesses client impact and SLA risk |
| Security Rep | Priya Sharma (delegate) | Assesses security implications |
| Solutions Architect | Rotating | Technical feasibility and architecture alignment |
| Project Delivery Rep | Marcus Webb (delegate) | Assesses project dependencies and scheduling |
| Change Manager | Designated | Presents RFCs, records decisions, tracks actions |
CAB Meeting Schedule
- Weekly CAB: Every Wednesday, 10:00–11:00 AEST (Microsoft Teams)
- RFC Submission Deadline: Monday 17:00 AEST (2 business days before CAB)
- Emergency CAB (eCAB): Convened within 30 minutes via PagerDuty escalation, any time 24/7
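The schedule above implies a simple rule for which CAB meeting will review a given RFC: the first Wednesday 10:00 meeting whose preceding Monday 17:00 deadline has not yet passed. The sketch below illustrates that rule. It assumes AEST is a fixed UTC+10 offset with no daylight saving, and `next_cab_for` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: AEST treated as a fixed UTC+10 offset (no daylight saving).
AEST = timezone(timedelta(hours=10), "AEST")


def next_cab_for(submitted_at):
    """Return the start of the first weekly CAB that can review an RFC.

    `submitted_at` must be a timezone-aware datetime. An RFC makes a given
    Wednesday 10:00 CAB only if it was submitted by 17:00 on the Monday
    immediately before it.
    """
    local = submitted_at.astimezone(AEST)
    # Wednesday is weekday 2; find the Wednesday on or after the submission date.
    days_to_wed = (2 - local.weekday()) % 7
    cab = (local + timedelta(days=days_to_wed)).replace(
        hour=10, minute=0, second=0, microsecond=0
    )
    while True:
        deadline = (cab - timedelta(days=2)).replace(hour=17)
        if local <= deadline:
            return cab
        cab += timedelta(days=7)  # deadline missed, try the following week
```

An RFC submitted on a Monday at 16:00 lands in that week's CAB, while one submitted at 17:30 the same day rolls over to the following Wednesday.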
CAB Decision Outcomes
| Decision | Description | Next Step |
|---|---|---|
| Approved | RFC accepted as submitted | Schedule and implement per plan |
| Approved with Conditions | RFC accepted with modifications required | Amend RFC, confirm conditions met, then implement |
| Rejected | RFC not approved due to unacceptable risk or incomplete information | Revise and resubmit at next CAB, or withdraw |
| Deferred | More information needed, or scheduling conflict | Provide additional info and resubmit |
RFC Template
📄 Request for Change (RFC)
Risk Assessment Matrix
Every normal and emergency change must undergo risk assessment using the following matrix. The overall risk level determines the approval path.
| | Low Likelihood | Medium Likelihood | High Likelihood |
|---|---|---|---|
| High Impact | Medium Risk | High Risk | Critical Risk |
| Medium Impact | Low Risk | Medium Risk | High Risk |
| Low Impact | Low Risk | Low Risk | Medium Risk |
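The matrix can be expressed as a lookup keyed on impact and likelihood, which is useful when pre-filling the risk level on an RFC form. This is a sketch only: the dictionary mirrors the cells of the table, while the function name and the lowercase labels are illustrative choices.

```python
# Keys are (impact, likelihood); values are the risk levels from the matrix.
RISK_MATRIX = {
    ("high", "low"): "Medium",
    ("high", "medium"): "High",
    ("high", "high"): "Critical",
    ("medium", "low"): "Low",
    ("medium", "medium"): "Medium",
    ("medium", "high"): "High",
    ("low", "low"): "Low",
    ("low", "medium"): "Low",
    ("low", "high"): "Medium",
}


def risk_level(impact, likelihood):
    """Look up the overall risk level for an impact/likelihood pair."""
    key = (impact.strip().lower(), likelihood.strip().lower())
    if key not in RISK_MATRIX:
        raise ValueError(f"Unknown impact/likelihood combination: {key}")
    return RISK_MATRIX[key]
```

For instance, `risk_level("High", "Medium")` returns `"High"`, and an unrecognised label raises `ValueError` rather than silently defaulting to a low rating.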
Risk Level Approval Requirements
| Risk Level | Approval Required | Additional Requirements |
|---|---|---|
| Low | Team Lead | Standard implementation plan required |
| Medium | CAB approval | Peer review of implementation and rollback plans |
| High | CAB + Head of Infrastructure | Pre-implementation test in staging environment required |
| Critical | CAB + VP Operations | Full DR test, client executive approval, on-call team during window |
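The approval table can likewise be held as data so that a tool can show the required approvers once the risk level is known. The sketch below copies the table rows; the `ApprovalRule` class and `approval_for` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovalRule:
    approvers: tuple   # who must sign off
    additional: str    # additional requirements from the table


# Rows copied from the approval requirements table.
APPROVAL_RULES = {
    "Low": ApprovalRule(("Team Lead",), "Standard implementation plan required"),
    "Medium": ApprovalRule(
        ("CAB",), "Peer review of implementation and rollback plans"
    ),
    "High": ApprovalRule(
        ("CAB", "Head of Infrastructure"),
        "Pre-implementation test in staging environment required",
    ),
    "Critical": ApprovalRule(
        ("CAB", "VP Operations"),
        "Full DR test, client executive approval, on-call team during window",
    ),
}


def approval_for(level):
    """Return the approval rule for a risk level (case-insensitive)."""
    key = level.strip().capitalize()
    if key not in APPROVAL_RULES:
        raise ValueError(f"Unknown risk level: {level!r}")
    return APPROVAL_RULES[key]
```

Keeping the rules in one dictionary means a change to the approval policy is a one-line edit rather than a change to branching logic.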
Change Calendar
All approved changes are published to the shared change calendar in ServiceNow and Microsoft Teams. The following maintenance windows are standard:
| Window | Schedule | Scope | Client Impact |
|---|---|---|---|
| Standard Maintenance | Tuesday 22:00 – Wednesday 02:00 AEST (weekly) | Routine patches, minor config changes | Minimal: brief service interruptions possible |
| Extended Maintenance | Saturday 22:00 – Sunday 06:00 AEST (monthly, 1st weekend) | Major upgrades, migrations, infrastructure changes | Moderate: planned outages communicated 5 business days in advance |
| Emergency Window | Any time (as needed) | Critical security patches, P1 incident fixes | Variable: communicated as soon as possible, minimum 1 hour notice where possible |
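A scheduling tool can verify that a planned change fits inside the weekly standard maintenance window before it is placed on the calendar. The sketch below covers only the standard window (Tuesday 22:00 to Wednesday 02:00). It assumes AEST is a fixed UTC+10 offset, and both function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: AEST treated as a fixed UTC+10 offset (no daylight saving).
AEST = timezone(timedelta(hours=10), "AEST")


def standard_window_containing(moment):
    """Return (start, end) of the standard window containing `moment`, or None.

    `moment` must be a timezone-aware datetime.
    """
    local = moment.astimezone(AEST)
    # The window opens on Tuesday (weekday 1) at 22:00 and lasts four hours.
    days_since_tue = (local.weekday() - 1) % 7
    start = (local - timedelta(days=days_since_tue)).replace(
        hour=22, minute=0, second=0, microsecond=0
    )
    if start > local:
        start -= timedelta(days=7)  # before 22:00 on a Tuesday
    end = start + timedelta(hours=4)
    return (start, end) if start <= local < end else None


def fits_standard_window(start, duration):
    """True if a change starting at `start` finishes before the window closes."""
    window = standard_window_containing(start)
    return window is not None and start + duration <= window[1]
```

A 45-minute change starting at 01:00 on Wednesday fits, whereas a two-hour change starting at the same time would overrun the 02:00 close and should be moved or rescheduled to the extended window.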
Rollback Procedures
Every change must have a documented rollback plan. The following principles apply:
- Rollback trigger: Clearly define the criteria that trigger a rollback (e.g., service unavailable for > 15 min post-change, error rate exceeds 10%, performance degrades by > 50%)
- Rollback authority: The implementing engineer can initiate rollback. For changes affecting multiple clients, SDM approval is required before rollback unless the situation is critical.
- Rollback window: Must be achievable within the remaining maintenance window. If rollback cannot complete before the window closes, escalate to CAB Chair immediately.
- Snapshots/Backups: Take VM snapshots and configuration backups immediately before implementation. Verify backup integrity before proceeding.
- Communication: If rollback is initiated, notify affected clients within 15 minutes with revised timeline.
- Post-rollback: Log a failed change in ServiceNow. Schedule PIR within 3 business days. Revised RFC required for re-attempt.
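The example rollback triggers listed above can be encoded as explicit checks so that the decision is not left to judgement during the change window. The sketch below uses the example thresholds from the list (15 minutes, 10%, 50%). Measuring performance degradation as a latency increase over a pre-change baseline is an assumption, as are the class and field names.

```python
from dataclasses import dataclass


@dataclass
class PostChangeMetrics:
    minutes_unavailable: float   # continuous unavailability since the change
    error_rate: float            # fraction of failed requests, 0.0 to 1.0
    baseline_latency_ms: float   # latency measured before the change
    current_latency_ms: float    # latency measured after the change


def rollback_reasons(metrics):
    """Return the list of triggers that fired; an empty list means no rollback."""
    reasons = []
    if metrics.minutes_unavailable > 15:
        reasons.append("service unavailable for more than 15 minutes")
    if metrics.error_rate > 0.10:
        reasons.append("error rate exceeds 10%")
    # Assumption: "performance degrades" is read as latency rising above baseline.
    if metrics.baseline_latency_ms > 0:
        degradation = (
            metrics.current_latency_ms - metrics.baseline_latency_ms
        ) / metrics.baseline_latency_ms
        if degradation > 0.50:
            reasons.append("performance degraded by more than 50%")
    return reasons
```

Returning the reasons rather than a bare boolean gives the engineer text that can be pasted directly into the failed-change record and the client notification.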
Post-Implementation Review
All high-risk, critical-risk, emergency, and failed changes require a Post-Implementation Review (PIR) within 5 business days.
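The PIR rule can be captured as a small predicate plus a due-date calculation. This is a sketch under stated assumptions: the 5 business days are counted from the implementation date, only weekends are skipped (public holidays are ignored), and the function names are illustrative.

```python
from datetime import date, timedelta


def requires_pir(risk_level, change_type, failed=False):
    """True for high-risk, critical-risk, emergency, or failed changes."""
    return (
        failed
        or change_type.strip().lower() == "emergency"
        or risk_level.strip().lower() in {"high", "critical"}
    )


def add_business_days(start, days):
    """Return the date `days` business days after `start` (weekends skipped).

    Assumption: public holidays are not taken into account.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def pir_due_date(implemented_on):
    """PIR deadline: 5 business days after implementation (assumed start point)."""
    return add_business_days(implemented_on, 5)
```

Under these assumptions, a change implemented on Wednesday 5 June 2024 would have its PIR due by Wednesday 12 June 2024.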
PIR Checklist
- Was the change implemented as planned?
- Was the change completed within the scheduled window?
- Were there any unexpected issues during implementation?
- Was the rollback plan adequate (if tested or invoked)?
- Has the change achieved its stated objectives?
- Were there any incidents caused by the change within 72 hours?
- Have monitoring alerts been reviewed post-change?
- Has documentation been updated (network diagrams, runbooks, CMDB)?
- Should the change procedure be standardised for future use?
- Were there lessons learned that should be shared?
AI-Assisted Change Impact Analysis
ASI AI Sentinel provides automated change impact analysis for all RFCs.
What AI Analyses
- Dependency mapping: Automatically identifies upstream/downstream service dependencies from the CMDB
- Historical correlation: Analyses past changes to similar CIs and identifies common failure modes
- Conflict detection: Checks for scheduling conflicts with other approved changes
- Blast radius estimation: Calculates the number of users and services potentially affected
- Risk score: Generates a 0-100 risk score based on change complexity, affected CIs, time of day, and historical success rates
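To make the dependency-mapping and blast-radius ideas concrete, the sketch below walks a dependency graph breadth-first to find every service downstream of a changed CI. It is not how ASI AI Sentinel works internally, which this page does not describe; the graph format, the CI names and the function name are all hypothetical.

```python
from collections import deque


def downstream_services(dependents, changed_ci):
    """Return every CI that depends, directly or indirectly, on `changed_ci`.

    `dependents` maps a CI name to the CIs that depend on it directly
    (a hypothetical export format, not a real CMDB schema).
    """
    affected = set()
    queue = deque([changed_ci])
    while queue:
        ci = queue.popleft()
        for dependent in dependents.get(ci, ()):
            # Skip CIs already visited so that cycles do not loop forever.
            if dependent not in affected and dependent != changed_ci:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical example graph: a database backing two applications.
example_cmdb = {
    "db-cluster-01": ["billing-api", "crm-app"],
    "billing-api": ["client-portal"],
    "crm-app": ["client-portal"],
}
affected = downstream_services(example_cmdb, "db-cluster-01")
print(sorted(affected))
```

In this example, a change to `db-cluster-01` reaches three downstream services. The size of that set is one simple proxy for blast radius; a fuller estimate would also weight each service by its user count.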
AI Recommendations
- Suggested maintenance window based on lowest usage patterns
- Recommended rollback checkpoints
- Similar successful change implementations for reference
- Flagged risks that require additional mitigation
- Estimated implementation duration based on historical data