Service Delivery & SLA Management - ASI AI Solutions Wiki

Service Tiers

ASI AI Solutions offers three managed service tiers. Each tier includes AI-powered monitoring via ASI AI Sentinel, with increasing levels of proactive support and dedicated resources.


Platinum

Premium

Response Time: 15 minutes (P1)
Support Hours: 24/7/365
Dedicated SDM: Yes, named
AI Monitoring: Full suite + predictive
Service Reviews: Monthly
On-site Support: Included (scheduled)
Uptime SLA: 99.99%
Proactive Optimisation: Quarterly


Gold

Standard

Response Time: 1 hour (P1)
Support Hours: 24/7 for P1-P2; Business Hours for P3-P4
Dedicated SDM: Shared (1:5 ratio)
AI Monitoring: Full suite
Service Reviews: Quarterly
On-site Support: Chargeable
Uptime SLA: 99.95%
Proactive Optimisation: Bi-annual


Silver

Essential

Response Time: 4 hours (P1)
Support Hours: Business Hours (7am-7pm AEST)
Dedicated SDM: Shared (1:10 ratio)
AI Monitoring: Core monitoring
Service Reviews: Quarterly
On-site Support: Chargeable
Uptime SLA: 99.9%
Proactive Optimisation: Annual
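
To make the Uptime SLA figures concrete, the sketch below converts each tier's percentage into a downtime budget. The 30-day measurement period is an assumption for illustration; this page does not state the period over which availability is measured, and `allowed_downtime_minutes` is a hypothetical helper name.

```python
# Uptime SLA percentages from the three tiers above.
TIER_UPTIME = {"Platinum": 99.99, "Gold": 99.95, "Silver": 99.9}


def allowed_downtime_minutes(uptime_sla_percent, period_days=30):
    """Return the downtime budget, in minutes, implied by an uptime SLA.

    period_days defaults to 30, which is an assumed measurement period.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_sla_percent / 100)


for tier, sla in TIER_UPTIME.items():
    budget = allowed_downtime_minutes(sla)
    print(f"{tier}: {budget:.1f} minutes of downtime per 30 days")
```

Under that assumption the budgets work out to roughly 4.3 minutes (Platinum), 21.6 minutes (Gold), and 43.2 minutes (Silver) per 30 days.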

Incident Priority Matrix

Incidents are classified using a combination of Impact (number of users/business functions affected) and Urgency (time sensitivity) to determine priority.

Priority	Definition	Example	Platinum Response	Gold Response	Silver Response	Resolution Target
P1 - Critical	Complete service outage or critical business function unavailable. Affects all/most users.	Email system down, ERP unavailable, site-wide network outage	15 min	1 hr	4 hr	4 hours
P2 - High	Major degradation of a key service. Significant number of users impacted. Workaround may exist.	Shared drive slow, VPN dropping intermittently, backup failures	30 min	2 hr	8 hr	8 hours
P3 - Medium	Minor service impact. Single user or small group affected. Workaround available.	Single user can't print, Outlook add-in not loading, password reset	1 hr	4 hr	1 BD	24 hours
P4 - Low	Informational or cosmetic issue. No operational impact.	Feature request, how-to question, non-urgent change	4 hr	1 BD	2 BD	5 business days

BD = Business Day (Mon-Fri, 07:00-19:00 AEST, excl. Australian public holidays)
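
The matrix above can be read as a simple lookup. The following is a minimal sketch of that lookup; the names `RESPONSE_TARGETS`, `RESOLUTION_TARGETS`, and `sla_targets` are illustrative and are not part of any ASI tooling.

```python
# Response targets by priority and tier, copied from the matrix above.
RESPONSE_TARGETS = {
    "P1": {"Platinum": "15 min", "Gold": "1 hr", "Silver": "4 hr"},
    "P2": {"Platinum": "30 min", "Gold": "2 hr", "Silver": "8 hr"},
    "P3": {"Platinum": "1 hr", "Gold": "4 hr", "Silver": "1 BD"},
    "P4": {"Platinum": "4 hr", "Gold": "1 BD", "Silver": "2 BD"},
}

# Resolution targets are the same for every tier.
RESOLUTION_TARGETS = {
    "P1": "4 hours",
    "P2": "8 hours",
    "P3": "24 hours",
    "P4": "5 business days",
}


def sla_targets(priority, tier):
    """Return (response target, resolution target) for a priority and tier."""
    try:
        return RESPONSE_TARGETS[priority][tier], RESOLUTION_TARGETS[priority]
    except KeyError:
        raise ValueError(f"Unknown priority or tier: {priority!r}, {tier!r}") from None


print(sla_targets("P2", "Gold"))
```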

Impact × Urgency Matrix

	High Urgency	Medium Urgency	Low Urgency
High Impact	P1	P2	P3
Medium Impact	P2	P3	P4
Low Impact	P3	P4	P4
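
As an illustration, the Impact × Urgency Matrix can be encoded as a dictionary keyed on the (impact, urgency) pair. This is a sketch only; `PRIORITY_MATRIX` and `classify_priority` are hypothetical names.

```python
# (impact, urgency) -> priority, transcribed from the matrix above.
PRIORITY_MATRIX = {
    ("High", "High"): "P1",
    ("High", "Medium"): "P2",
    ("High", "Low"): "P3",
    ("Medium", "High"): "P2",
    ("Medium", "Medium"): "P3",
    ("Medium", "Low"): "P4",
    ("Low", "High"): "P3",
    ("Low", "Medium"): "P4",
    ("Low", "Low"): "P4",
}


def classify_priority(impact, urgency):
    """Return the incident priority for the given impact and urgency levels."""
    key = (impact.strip().capitalize(), urgency.strip().capitalize())
    if key not in PRIORITY_MATRIX:
        raise ValueError(f"Impact and urgency must be High, Medium or Low: {key}")
    return PRIORITY_MATRIX[key]


print(classify_priority("High", "Medium"))
```

For example, a high-impact incident with medium urgency is classified as P2.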

Escalation Procedures

Functional Escalation

When the current support tier cannot resolve an incident within the allocated time, it is escalated to the next tier.

Level	Team	Time Elapsed
L1	AI Triage	0-15 min
L2	Service Desk	15-60 min
L3	Specialist	1-4 hrs
L4	Vendor/Architect	4+ hrs
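
The functional escalation windows can be sketched as a threshold lookup on elapsed time. The handling of exact boundaries is an assumption: this page does not say which level owns an incident at exactly 15 minutes, so the sketch moves it to the next level at the boundary. `support_level` is a hypothetical helper.

```python
import bisect

# Upper bounds (in minutes) of the L1, L2 and L3 windows listed above.
ESCALATION_BOUNDARIES = [15, 60, 240]
SUPPORT_LEVELS = [
    "L1: AI Triage",
    "L2: Service Desk",
    "L3: Specialist",
    "L4: Vendor/Architect",
]


def support_level(elapsed_minutes):
    """Return the support level expected to own an unresolved incident.

    Assumption: an incident moves up a level as soon as it reaches a boundary.
    """
    if elapsed_minutes < 0:
        raise ValueError("elapsed_minutes must not be negative")
    index = bisect.bisect_right(ESCALATION_BOUNDARIES, elapsed_minutes)
    return SUPPORT_LEVELS[index]


print(support_level(45))
```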

Hierarchical Escalation

Hierarchical escalation applies when management attention or authority is needed (e.g., resource allocation or client communication during major incidents).

Priority and Time Elapsed	Escalation To	Action Required
P1 at 30 min	Service Delivery Manager	Notified and monitors. Ensures resources assigned.
P1 at 1 hr	Head of Service Delivery	Briefed. Authorises additional resources. Client executive notified.
P1 at 2 hrs	VP of Operations	Briefed. May invoke Major Incident Process. Executive bridge call.
P1 at 4 hrs	CEO	Briefed if business-critical client or reputational risk.
P2 at 4 hrs	Service Delivery Manager	Notified and ensures resolution path.
P2 at 8 hrs	Head of Service Delivery	Briefed. Additional resources allocated.
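
A minimal sketch of the hierarchical escalation table as data, returning every role that should have been engaged by a given point in an incident. The names are illustrative. Note that the CEO briefing at 4 hours is conditional (business-critical client or reputational risk), which this sketch does not evaluate.

```python
# (minutes elapsed, role) pairs per priority, transcribed from the table above.
HIERARCHICAL_ESCALATIONS = {
    "P1": [
        (30, "Service Delivery Manager"),
        (60, "Head of Service Delivery"),
        (120, "VP of Operations"),
        (240, "CEO"),  # conditional in practice; see the table above
    ],
    "P2": [
        (240, "Service Delivery Manager"),
        (480, "Head of Service Delivery"),
    ],
}


def escalations_due(priority, elapsed_minutes):
    """Return the roles that should have been escalated to by this time."""
    steps = HIERARCHICAL_ESCALATIONS.get(priority, [])
    return [role for threshold, role in steps if elapsed_minutes >= threshold]


print(escalations_due("P1", 90))
```

P3 and P4 incidents return an empty list because the table defines no hierarchical escalation for them.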

SLA Reporting & Review

Monthly SLA Report Contents

Executive summary of service performance
SLA compliance by priority level (response time and resolution time)
Total incident count and trend analysis
Top 10 incident categories
First-contact resolution rate
Customer satisfaction score (CSAT)
AI Sentinel proactive detection rate (incidents caught before user impact)
Service availability percentage vs SLA target
Pending changes and upcoming maintenance
Recommendations and improvement actions

Review Cadence

Review Type	Frequency	Attendees	Focus
Operational Review	Weekly	SDM, Team Leads	Ticket queue, SLA at risk, resource utilisation
Monthly Service Review	Monthly	SDM, Client Primary Contact	SLA performance, incidents, upcoming changes
Quarterly Business Review (QBR)	Quarterly	Head of SD, Client Exec Sponsor, SDM	Strategic alignment, service improvement, roadmap
Annual Service Review	Annually	VP Ops, Client CIO/CTO	Contract review, strategic planning, innovation

Service Improvement Plans

A Service Improvement Plan (SIP) is initiated when:

SLA compliance falls below 95% for any priority level in a calendar month
Customer satisfaction (CSAT) drops below 4.0/5.0 for two consecutive months
A Major Incident PIR identifies systemic process failures
Client formally requests a SIP through their SDM
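
The four triggers above can be expressed as a single check. This is a sketch under stated assumptions: the function name and input shapes are hypothetical, SLA compliance is given as a percentage per priority for one calendar month, and CSAT is a list of monthly averages in chronological order.

```python
def sip_triggers(sla_compliance, monthly_csat, pir_systemic_failure, client_requested):
    """Return the list of reasons a Service Improvement Plan should be initiated.

    sla_compliance: dict of priority -> compliance percentage for the month.
    monthly_csat: monthly CSAT averages (out of 5.0), oldest first.
    """
    reasons = []
    for priority, percent in sorted(sla_compliance.items()):
        if percent < 95.0:
            reasons.append(f"SLA compliance for {priority} below 95% ({percent}%)")
    last_two = monthly_csat[-2:]
    if len(last_two) == 2 and all(score < 4.0 for score in last_two):
        reasons.append("CSAT below 4.0/5.0 for two consecutive months")
    if pir_systemic_failure:
        reasons.append("Major Incident PIR identified systemic process failures")
    if client_requested:
        reasons.append("Client formally requested a SIP")
    return reasons


print(sip_triggers({"P1": 96.2, "P2": 93.0}, [4.4, 3.9, 3.8], False, False))
```

An empty list means none of the triggers is met.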

SIP Structure

Problem Statement: Clear description of the service gap
Root Cause Analysis: 5-Whys or fishbone analysis
Improvement Actions: Specific, measurable actions with owners and due dates
Success Criteria: How we will know the improvement worked
Review Schedule: Weekly progress reviews until targets met
Closure: Formal closure when success criteria achieved for 30 consecutive days

Customer Satisfaction Measurement

CSAT (Per-Ticket)

A CSAT survey is sent automatically via ServiceNow after every resolved ticket.

Scale: 1-5 stars
Target: ≥ 4.5/5.0 average
Response rate target: ≥ 30%
Scores of 1-2 trigger automatic SDM follow-up within 4 hours
Results reviewed weekly in operational review
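
As a rough illustration of how the per-ticket figures above fit together, the sketch below summarises one batch of survey results. `csat_summary` and its input shapes are hypothetical; it does not reflect how ServiceNow stores or reports survey data.

```python
def csat_summary(scores, surveys_sent):
    """Summarise per-ticket CSAT results against the targets above.

    scores: list of 1-5 star ratings received.
    surveys_sent: number of surveys sent in the same period.
    """
    if surveys_sent <= 0:
        raise ValueError("surveys_sent must be positive")
    average = sum(scores) / len(scores) if scores else None
    response_rate = len(scores) / surveys_sent
    return {
        "average": average,
        "response_rate": response_rate,
        "meets_csat_target": average is not None and average >= 4.5,
        "meets_response_target": response_rate >= 0.30,
        # Scores of 1-2 require SDM follow-up within 4 hours.
        "follow_up_count": sum(1 for score in scores if score <= 2),
    }


print(csat_summary([5, 5, 4, 2, 5], surveys_sent=10))
```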

NPS (Net Promoter Score)

An NPS survey is sent quarterly by email to all client stakeholders.

Scale: 0-10 ("How likely are you to recommend ASI?")
Target: NPS ≥ 50
Detractors (0-6) receive personal follow-up from Head of SD
Results presented at QBR and internal leadership meeting
Trend analysis tracks improvements quarter-over-quarter
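
For reference, NPS is the percentage of promoters minus the percentage of detractors. The sketch below uses the standard NPS bands: promoters score 9-10 and passives 7-8. Only the detractor band (0-6) is stated on this page, so the other two bands are taken from the standard definition. The function name is illustrative.

```python
def net_promoter_score(ratings):
    """Return NPS (-100 to +100) from a list of 0-10 survey ratings."""
    if not ratings:
        raise ValueError("At least one rating is required")
    promoters = sum(1 for rating in ratings if rating >= 9)
    detractors = sum(1 for rating in ratings if rating <= 6)
    # Passives (7-8) count towards the total but not towards either group.
    return 100 * (promoters - detractors) / len(ratings)


ratings = [10, 9, 9, 8, 7, 6, 10, 10, 3, 9]
print(net_promoter_score(ratings))
```

In this example there are 6 promoters and 2 detractors out of 10 responses, giving an NPS of 40, which would fall short of the ≥ 50 target.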

Template: Monthly Service Review Agenda

Monthly Service Review Meeting Agenda

Client Name: [Client name]
Date & Time: [DD/MM/YYYY HH:MM AEST]
Attendees: [ASI SDM, Client Primary Contact, optional: technical leads]

Agenda Item 1 - Executive Summary (5 min)

Overall service health: Green/Amber/Red. Key highlights and lowlights.

Agenda Item 2 - SLA Performance (10 min)

Response and resolution compliance by priority. Availability metrics. Comparison to previous month.

Agenda Item 3 - Incident Analysis (10 min)

Total incidents, top categories, trends. Major incidents (if any) and PIR outcomes.

Agenda Item 4 - AI Monitoring Insights (5 min)

Proactive detections, predictive alerts acted upon, false positive rate tuning.

Agenda Item 5 - Change & Project Updates (5 min)

Changes completed, upcoming changes, project milestone updates.

Agenda Item 6 - Security Posture (5 min)

Security incidents, vulnerability scan results, compliance status.

Agenda Item 7 - Improvement Actions (5 min)

Open SIP items, recommendations, optimisation opportunities.

Agenda Item 8 - AOB & Next Steps (5 min)

Client questions, action items, confirm next meeting date.

AI-Powered Monitoring & Alerting

ASI AI Sentinel is our proprietary AI monitoring platform deployed to all managed clients. It provides real-time infrastructure monitoring with machine learning-driven anomaly detection.

Monitoring Coverage

Category	Metrics Monitored	Alert Threshold	AI Enhancement
Compute	CPU utilisation, memory usage, process count	CPU > 85% for 5 min, Memory > 90%	Predictive scaling recommendations based on historical patterns
Storage	Disk usage, IOPS, latency, disk health (SMART)	Disk > 85%, I/O latency > 20 ms	Capacity forecasting with 30/60/90 day projections
Network	Bandwidth, packet loss, latency, interface errors	Packet loss > 1%, Latency > 100ms	Anomaly detection on traffic patterns for DDoS/exfiltration
Application	Response time, error rate, availability, transaction volume	Error rate > 5%, Response > 3s	Correlation of application errors with infrastructure events
Security	Failed logins, privilege escalation, file integrity, SIEM events	Per security policy rules	Behavioural analysis for insider threat detection
Cloud	Azure/AWS/GCP resource health, cost anomalies, config drift	Cost > 20% above forecast, config drift detected	Cost optimisation recommendations, right-sizing suggestions
Backup	Backup success/failure, backup duration, RPO compliance	Any backup failure, RPO breach	Predictive backup window optimisation
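
To show how a sustained threshold such as "CPU > 85% for 5 min" differs from an instantaneous one such as "Memory > 90%", here is a minimal sketch. It is not AI Sentinel's implementation, which is not described on this page. It assumes one CPU sample per minute, and the function names are hypothetical.

```python
def sustained_breach(samples, threshold, window):
    """True if the most recent `window` samples all exceed `threshold`."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = samples[-window:]
    return len(recent) == window and all(value > threshold for value in recent)


def compute_alerts(cpu_samples, memory_percent):
    """Evaluate the Compute thresholds from the table above.

    cpu_samples: CPU utilisation percentages, one per minute (assumed), oldest first.
    memory_percent: latest memory usage percentage.
    """
    alerts = []
    if sustained_breach(cpu_samples, threshold=85.0, window=5):
        alerts.append("CPU > 85% for 5 min")
    if memory_percent > 90.0:
        alerts.append("Memory > 90%")
    return alerts


print(compute_alerts([80, 86, 90, 88, 91, 87], memory_percent=92))
```

A single dip below the threshold inside the window suppresses the CPU alert, which is the point of the "for 5 min" qualifier.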


AI Sentinel Proactive Detection Rate: The target is to detect and auto-remediate 40% of potential incidents before user impact. Current performance is 38% (February 2026). Auto-remediation covers actions such as service restarts, disk cleanup, certificate renewals, and DNS failover.

Key Performance Indicators

Average Availability: 99.97%
SLA Compliance (P1): 96.2%
CSAT Score: 4.6/5.0
Net Promoter Score: +54
First Contact Resolution: 72%
AI Proactive Detection: 38%