We’re often asked: “What service commitments can you make?” Agency Service Level Agreements (SLAs) are those baseline commitments. For us, often an agency for agencies, the Service Level Agreement is a partner promise that comes with defined consequences if broken.
Background To Our Service Level Agreements
“Change is the only constant,” a saying attributed to Heraclitus, is something we often hear and know to be true. In the context of services, it points to agile and its principles, which provide a framework for managing constant change and ensuring partner success. And within any model that supports execution, customers or partners should be involved.
There is a lot going on behind the curtain.
For success to happen, there need to be structures that aid collaboration and coordination, with or without a fixed cadence, as needs dictate. Beyond the internal agency team that is working to build a solution, there is a complete support team engaged in architecting and building the right operational system to ensure efficient and effective agency-to-agency collaboration (in other words, doing the right things at the right time).
This is especially true for agency support services, because customer requests can vary in urgency, complexity, and risk level. The collaboration generally aims to reduce resolution and communication time while ensuring an effective solution.
All of this, when converted to measurable metrics and success criteria, ultimately defines the Service Level Agreement. These SLAs help us set clear, measurable guidelines and eliminate confusion by defining what is and what is not acceptable. Ultimately, this establishes clarity about our commitment to our partners.
Tailoring Service Level Agreements: The Basics
The success of a support engagement is largely determined by how much an agency is able to reduce response and resolution time for the customer. How far it can do so depends on the complexity, scope, and velocity of the incoming work, and it may require re-architecting the operational ecosystem. For example, an agency may want to create an offshore team for a strategic partner to accommodate their time zone, ensuring availability for efficient response and resolution.
What does this depend on?
Tailoring most often does not depend on the actual process of ticket collection, ticket response, or ticket closure, for which agencies normally have established baselines and frameworks. Rather, it is based on:
- How the subunits in an agency are structured (to aid efficiency)
- The capability tiers within an agency (to manage complexity and delegation)
- The engineers managing the service requests (their value system, including softer skills such as communication)
For instance, a certain type of ticket created for an incident that occurred on the backend server may need to be pushed to a particular kind of functional crew (Centers of Excellence) or a different capability tier. The complexity and risk level of the ticket determine the capability tier to which the ticket should be delegated.
The more tiers there are available, the more advanced the SLA definitions, giving clients complete coverage across a range of issues with varying complexity.
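As a rough illustration of the delegation described above, the routing decision can be sketched in a few lines of Python. This is only a minimal sketch: the tier names, the 1-to-5 scoring scales, and the thresholds are hypothetical and would be replaced by whatever an agency actually agrees with its partner.

```python
from dataclasses import dataclass

# Hypothetical tier names, for illustration only.
TIERS = (
    "tier_1_general_support",
    "tier_2_specialists",
    "tier_3_center_of_excellence",
)


@dataclass
class Ticket:
    summary: str
    complexity: int  # assumed scale: 1 (simple) to 5 (very complex)
    risk: int        # assumed scale: 1 (low) to 5 (high)


def route(ticket: Ticket) -> str:
    """Pick a capability tier from the higher of complexity and risk."""
    score = max(ticket.complexity, ticket.risk)
    if score >= 4:   # assumed threshold
        return TIERS[2]
    if score >= 3:   # assumed threshold
        return TIERS[1]
    return TIERS[0]
```

Taking the higher of the two scores is a deliberately cautious choice here: a simple but high-risk ticket still goes to a more capable tier.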
Global teams have an effect.
With geographically distributed teams, SLA definitions can gain another degree of flexibility (to aid efficiency), particularly if the company allows its team members to work a certain number of planned and proactively communicated hours. We can have team members working within a client’s time zone, as required. We have found that this model scales well, because it makes it feasible to build pods (fully functional in capability) covering more geographies, ultimately speeding up resolution times for our clients through strategic "geo-positioning."
Designing More Holistic SLAs: Key Considerations
While support SLAs will, first and foremost, deal with aspects of defect resolution, a holistic SLA must cover more than just reactive maintenance needs. Maintenance partners also need to consider their partners’ longer-term needs along with short-term return on investment.
What about “preventive” maintenance?
Once support teams begin working on tickets, they’ll often find that there are other, deeper problems on the end-client’s site. For example, the site may not adhere to prescribed standards, which could mean that further down the line it will not be scalable and the code will not be optimized for performance. As we scale, proactive, corrective, and adaptive maintenance become separate streams that demand separate centers of excellence.
At this point, support agencies can either resolve the superficial issue and close the ticket, or they can view this as an opportunity to consult with the client about considerations that will prove valuable in the long term. Mature agencies will take the opportunity to offer consultative value to their clients, helping them move closer to their long-term vision on their success journey.
SLAs should therefore be designed to include implicit, non-functional expectations which are also part of maintenance. These can be based around security requirements, performance optimization, and scalability (e.g. the number of users, adding more functions, etc.). In our experience, maintenance services are generally supported in parallel with managed end-to-end projects.
It’s important to address “technical debt” before anything else.
Another important consideration for support agencies to keep in mind at the start of the engagement is that end-clients’ websites may be burdened with some degree of technical debt, that is: the implied cost of additional rework caused by choosing an easy solution in the past instead of using a better approach.
Once it is clear that there is technical debt associated with a project, the first “obvious” priority should be to address it. The very first milestone should be focused on resolving technical debt, so that SLAs can be reliably defined and adhered to.
If this is not done at the start of the project, any effort expended on fixing bugs, making enhancements, or tailoring SLAs is unlikely to have the expected impact, as unresolved technical debt is likely to create complications further down the line. Engineers may find themselves fixing some aspects, but in the process unearthing several other issues with the end-client’s website.
At later stages in the project, agencies may find that clients do not wish to pay for the cost of fixing new bugs as well as all the hidden issues that are revealed in the process (which may be rooted in unaddressed technical debt).
This might seem basic, but it’s important for support partners to bring any kind of technical debt to the attention of clients in order to build a collaborative, consultative, and symbiotic relationship.
Agencies that are looking to enhance any partnership and progress towards shared value creation will want to raise such issues early and address them proactively. Doing so may offer no immediate returns, but it does increase the probability of having a successful long-term relationship.
Driving SLA Success: Preventing Breach
For service requests, the most important consideration is usually how quickly communication can be established with the end-client to update them on floor developments and on the expected time to resolution. That estimate depends on the severity and risk level of the service request (urgent, critical, normal, or low priority) and on the complexity of the solution. The speed of communication is determined by how fast the ticket can be recorded and by the support team’s time to first response. System cycle time and ticket cycle time are equally important to consider.
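To make the time-to-first-response driver concrete, here is a minimal Python sketch of a breach check. The four priority names come from the text above; the target durations and function names are placeholders chosen for illustration, not commitments from any real agreement.

```python
from datetime import datetime, timedelta

# Placeholder first-response targets per priority (assumed values).
FIRST_RESPONSE_TARGETS = {
    "urgent": timedelta(minutes=30),
    "critical": timedelta(hours=1),
    "normal": timedelta(hours=8),
    "low": timedelta(hours=24),
}


def first_response_due(recorded_at: datetime, priority: str) -> datetime:
    """Latest moment a first response can be sent without breaching."""
    return recorded_at + FIRST_RESPONSE_TARGETS[priority]


def is_first_response_breached(
    recorded_at: datetime, priority: str, responded_at: datetime
) -> bool:
    """True if the first response came after the target for this priority."""
    return responded_at > first_response_due(recorded_at, priority)
```

Note that the clock starts when the ticket is recorded, which is why slow ticket intake eats directly into the response window.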
SLAs can be tweaked to include the above drivers once the customer and the servicing organization have collaboratively identified what works.
Effective Discovery: If an issue needs more research, causes a risk spike, doesn’t have a direct solution, or is a known issue, usually the best that support partners can do is to provide a workaround. In such a case, it becomes imperative to inform clients about the time needed to build a patch, for instance. Success is determined by how quickly support staff are able to establish effective communication with the end-client to source solutions to their challenge or opportunity.
Capability Tiers & Value System: For urgent or critical requests, which may cause website downtime and possible business losses for clients, success usually depends on two factors:
- Whether engineers are available on the floor to handle that particular request
- Whether those engineers have the right capability
Careful consideration and planning in weaving these two factors together during operational execution adds to customer success. It is imperative for support agencies to ensure that service requests are assigned to engineers at the appropriate capability level who can think beyond just providing a quick fix. All things considered, nobody really benefits from a quick or “hot” fix; they benefit from partners’ efforts towards genuine, consultative value creation: reducing these problems and proactively evolving.
Empathy: Any effective partnership has to be grounded in a thorough understanding of the client’s vision (read: long-term goals!) as well as a sense of ownership. Challenges exist, but customer success depends on how well clients are actively listened to, acknowledged, understood, and supported with a mutually agreeable, co-architected solution.
Communication: The key factor in determining successful outcomes. There are two broad types of communication failures experienced:
- Lack of a timely reply to clients: the team may already be working on the issue, but if the client has not been kept informed, their experience is likely to be negative.
- Failure to give clients a satisfactory explanation: even when an issue cannot be resolved in time, this should be communicated in a way that conveys the reasoning, the efforts made, and what comes next.
Handling SLA Breaches: Broken Promises
An SLA breach is unacceptable. It can result in lost revenues and end-clients, as well as seriously damaging PR for the partner agency. In all such cases, it is vital for support staff to act prudently and methodically to restore service as early as possible.
What do we do if this happens?
It’s important to understand what led to the breach, and promptly resolve any internal challenges through a joint retrospective. Transparency, ownership, and effective communication with end-clients can help to restore broken trust and create positive outcomes.
Any service provider has to be sensitive to the serious repercussions of SLA breaches for partners and end-clients, hence the need for measures to protect our partners in the event of an SLA breach. Depending on whether work is left incomplete, found to be sub-quality, or delayed, some providers should and do refund varying percentages of the project budget to partners.
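The refund idea can be expressed as simple arithmetic. The Python sketch below is purely illustrative: the three breach types mirror the ones named above, but the percentages are invented placeholders and not terms any provider has published.

```python
from decimal import Decimal

# Invented placeholder rates; real percentages are set in each agreement.
REFUND_RATES = {
    "incomplete": Decimal("0.20"),
    "sub_quality": Decimal("0.10"),
    "delayed": Decimal("0.05"),
}


def refund_amount(project_budget: Decimal, breach_type: str) -> Decimal:
    """Refund owed to the partner, rounded to two decimal places."""
    rate = REFUND_RATES[breach_type]
    return (project_budget * rate).quantize(Decimal("0.01"))
```

`Decimal` is used rather than floating point so that monetary amounts round predictably.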
Service Level Agreements & Trust
Ultimately, the goal of an SLA is to foster accountability and a sense of trust in the partner agency. Well-designed SLAs protect the interests of both parties and ensure that any issues can be quickly and fairly resolved. When these are meticulously detailed and effectively used, they help create successful engagements and better outcomes for end-clients. And that is exactly what we’re striving to do each and every time.