Four Problem Management SLAs you really can't live without

Simon Higginson

This article has been contributed by Simon Higginson.

Problem Management is the intriguing discipline of the Service Management suite.  The IT Department is continually being asked to be proactive not reactive.

Often in IT we presuppose what our customers in the business require, then give them a solution to issues that they didn’t know that they had.  But what happens when that business customer is asking IT for a permanent solution to an issue we might not have known that we had, or to an issue where we know only a sticking plaster fix is in place?

Your Problem Manager is the key

Stepping up to the plate is the Problem Manager, the individual focussed on reacting to, and managing, issues that have already happened. They can’t really help but have a reactive mindset, rooted in the analysis of fact.  The incident might be closed but the Problem Manager is the person entrusted with ensuring that appropriate steps are taken to guarantee the incident doesn’t repeat itself.  It can be a stressful role: the systems were down, the company has lost (and may still be losing) money, and trading has been impacted.  People want to know what is being done.  So what SLAs can be put in place between the Problem Manager and the service owner to support the Problem Manager’s activities and maybe give them breathing space, whilst at the same time ensuring that there is some focus on resolution?

Let’s look at the four problem management SLAs that you really can’t live without.

#1 – Provision of Problem Management reference number

A simple SLA to get you started.  This is simply an acknowledgement by the problem management team that the problem has been logged, referenced and is in the workflow of the team.  It provides reassurance that the problem is going to be dealt with.

#2 – Time to get to the root cause of the issue

So this is where some breathing space is provided.  The message being given in this particular SLA is that there is a distinction between incident management and problem management.  Incident management has resulted in a temporary fix to an issue; now it is the turn of problem management to actually work out what lay at the heart of the matter – what was the root cause.

Note this is an SLA about identifying and not resolving the root cause – that could take a significant time period involving redevelopment of code.

The outcome that is being measured by the SLA is going to be the production of a deliverable, perhaps in the form of a brief document or even just an email that highlights the results of the root cause analysis.  Each company will have to determine its own policy on what that deliverable might contain, but the SLA is there to measure the time between the formal closure of the incident and the formal delivery of problem management’s root cause analysis deliverable.
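As a rough illustration only, the measurement described above could be sketched as follows. The record fields, the reference format and the target duration are all assumptions made for the example; each company would substitute its own policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ProblemRecord:
    """Hypothetical problem record; the field names are illustrative only."""
    reference: str
    incident_closed_at: datetime
    rca_delivered_at: Optional[datetime] = None


def rca_sla_status(record: ProblemRecord, target: timedelta, now: datetime) -> str:
    """Return 'met', 'breached' or 'in progress' for the root cause SLA.

    The clock starts at formal incident closure and stops when the
    root cause analysis deliverable is formally provided.
    """
    deadline = record.incident_closed_at + target
    if record.rca_delivered_at is not None:
        return "met" if record.rca_delivered_at <= deadline else "breached"
    return "breached" if now > deadline else "in progress"
```

Note that the function only looks at when the root cause was identified and reported, not at when it was fixed, which matches the distinction drawn above.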

#3 – Measurement of provision of Root Cause Analysis documentation.  To be provided within X working days of initial notification.

So, you’ve acknowledged receipt of the problem, and you’ve determined the root cause. The next SLA is in place to ensure that a formal document is delivered in a timely fashion. It should have a set format and set down the timeline of events that caused the problem, and the actions that have been taken to provide a workaround. It should then list all of the actions and recommendations needed to fix the problem, each with a clearly identified owner and a realistic completion date. A suggested target would be 3 working days for simple problems, rising to 5 and 10 working days for increasingly complex ones.
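A minimal sketch of how those suggested targets might be turned into due dates is shown below. The tier names ("simple", "moderate", "complex") are assumptions, and the calculation counts Monday to Friday only; it deliberately ignores public holidays, which a real calendar would need to handle.

```python
from datetime import date, timedelta

# Suggested targets from the text, in working days. The tier names are assumed.
RCA_TARGET_WORKING_DAYS = {"simple": 3, "moderate": 5, "complex": 10}


def add_working_days(start: date, working_days: int) -> date:
    """Step forward from start, counting Monday to Friday only."""
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def rca_document_due(notified_on: date, complexity: str) -> date:
    """Due date for the root cause analysis document, by complexity tier."""
    return add_working_days(notified_on, RCA_TARGET_WORKING_DAYS[complexity])
```

For example, a simple problem notified on Thursday 24 January 2013 would be due on Tuesday 29 January, because the intervening weekend does not count.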

#4 – Measurement of progress on root cause analysis actions as agreed (Target dates not to change more than twice)

In the previous SLA we have measured the time to produce the root cause analysis.  This SLA takes over where the previous clock stopped.

The root cause analysis work will have identified actions that need to be undertaken and implemented to effect a permanent fix to the original issue and allow the sticking plaster solution to be superseded.

However, not all resolutions will be equal in complexity, effort and duration, so there will be an initial estimate of a target date for live implementation of a permanent fix.  Moving the target completion date is allowed, but this SLA limits how often that can occur, to prevent action timescales drifting.
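The limit on moving target dates can be sketched as a simple counter on each action. This is an illustration under assumptions: the class and field names are hypothetical, and the limit of two changes is taken from the SLA heading above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

MAX_TARGET_DATE_CHANGES = 2  # from the SLA heading: not more than twice


@dataclass
class FixAction:
    """Hypothetical action arising from a root cause analysis."""
    description: str
    owner: str
    target_date: date
    previous_targets: List[date] = field(default_factory=list)

    def move_target(self, new_date: date) -> None:
        """Record the old target date, then adopt the new one."""
        self.previous_targets.append(self.target_date)
        self.target_date = new_date

    @property
    def sla_breached(self) -> bool:
        """True once the target date has moved more often than allowed."""
        return len(self.previous_targets) > MAX_TARGET_DATE_CHANGES
```

Keeping the history of earlier target dates, rather than just a count, also gives the Problem Manager something concrete to show the service owner when timescales start to drift.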

This article has been contributed by Simon Higginson of Frimley Green Ltd. Simon’s expertise is helping clients get the best out of their service suppliers and creating win-win partnerships.

Everything is improvement

Traditionally Continual Service Improvement (CSI) is too often thought of as the last bit we put in place when formalising ITSM.  In fact, we need to start with CSI, and we need to plan a whole portfolio of improvements encompassing formal projects, planned changes, and improvements done as part of business-as-usual (BAU) operations.  And the ITIL ‘process’ is the wrong unit of work for those improvements, despite what The Books tell you. Work with me here as I take you through a series of premises to reach these conclusions and see where it takes us.

In my last article, I said service portfolio management is a superset of organisational change management.  Service portfolio decisions are decisions about what new services go ahead and what changes are allowed to update existing services, often balancing them off against each other and against the demands of keeping the production services running.  Everything we change is service improvement. Why else would we do it?  If we define improvement as increasing value or reducing risk, then everything we change should be to improve the services to our customers, either directly or indirectly.
Therefore our improvement programme should manage and prioritise all change.  Change management and service improvement planning are one and the same.

Everything is improvement

First premise: Everything we change is service improvement

Look at a recent Union Pacific Railroad quarterly earnings report.  (The other US mega-railroad, BNSF, is now the personal train-set of Warren Buffett – that’s a real man’s toy – but luckily UP is still publicly listed and tells us what it is up to).

I don’t think UP management let one group decide to get into the fracking materials business and allowed another to decide to double track the Sunset Route.  Governors and executive management have an overall figure in mind for capital spend.   They allocate that money across both new services and infrastructure upgrades.

They manage the new and existing services as a portfolio.  If the new fracking sand traffic requires purchase of a thousand new covered hoppers then the El Paso Intermodal Yard expansion may have to wait.  Or maybe they borrow the money for the hoppers against the expected revenues because the rail-yard expansion can’t wait.  Or they squeeze operational budgets.  Either way the decisions are taken holistically: offsetting new services against BAU and balancing each change against the others.

Our improvement programme should manage and prioritise all change, including changes to introduce or upgrade (or retire) services, and changes to improve BAU operations.  Change management and service portfolio management are both aspects of the same improvement planning activity.  Service portfolio management makes the decisions; change management works out the details and puts them into effect.

It is all one portfolio

Second premise: Improvement planning comes first

Our CSI plan is the FIRST thing we put together, not some afterthought we put in place after an ‘improvement’ project or – shudder – ‘ITIL Implementation’ project.
UP don’t rush off and do $3.6 billion in capital improvements then start planning the minor improvements later.  Nor do they allow their regular track maintenance teams to spend any more than essential on the parts of the Sunset Route that are going to be torn up and double tracked in the next few years.  They run down infrastructure that they know is going to be replaced.  So the BAU improvements have to be planned in conjunction with major improvement projects.  It is all one portfolio, even if separate teams manage the sub-portfolios.  Sure, miscommunications happen in the real world, but the intent is to prevent waste, duplication, shortages and conflicts.

Welcome to the real world

Third premise: we don’t have enough resource to execute all desired improvements

In the perfect world all trains would be controlled by automated systems that flawlessly controlled them, eliminating human error, running trains so close they were within sight of each other for maximum track utilisation, and never ever crashing or derailing a train.  Every few years governments legislate towards this, because political correctness says it is not enough to be one of the safest modes of transport around: not even one person may be allowed to die, ever.  The airlines can tell a similar story.   This irrational decision-making forces railroads to spend billions that otherwise would be allocated to better trackwork, new lines, or upgraded rolling stock and locos.  The analogy with – say – CMDB is a strong one: never mind all the other clearly more important projects, IT people can’t bear the idea of imperfect data or uncertain answers.
Even if our portfolio decision-making were rational, we can’t do everything we’d like to, in any organisation.  Look at a picture of all the practices involved in running IT

You can’t do everything

The meaning of most of these labels should be self-evident.  You can find out more here.  Ask yourself which of those activities (practices, functions, processes…  whatever you want to call them) could use some improvement in your organisation.  I’m betting most of them.
So even without available funds being gobbled up by projects inspired by political correctness, a barmy new boss, or a genuine need in the business, what would be the probability of you getting approval and money for projects to improve all of them?  Even if you work at Google and money is no problem, assuming a mad boss signed off on all of them what chance would you have of actually getting them all done?  Hellooooo!!!

What are we doing wrong?

Fourth premise: there is something very wrong with the way we approach ITSM improvement projects, which causes them to become overly big and complex and disruptive.  This is because we choose the wrong unit of work for improvements.

How to cover everything that needs to be looked at?  The key word there is ‘needs’.  We should understand what our business goals for service are, derive from those goals the required outcomes from service delivery, then focus on improvements that deliver those required outcomes … and nothing else.

One way to improve focus is to work on smaller units than a whole practice.  A major shortcoming of many IT service management projects is that they take the ITIL ‘processes’ as the building blocks of the programme.  ‘We will do Incident first’.  ‘We can’t do Change until we have done Configuration’.  Even some of the official ITIL books promote this thinking.

Put another way, you don’t eat an elephant one leg at a time: you eat it one steak at a time… and one mouthful at a time within the meal.  Especially when the elephant has about 80 legs.

Don’t eat the whole elephant

We must decompose the service management practices into smaller, more achievable units of work, which we assemble Lego-style into a solution to the current need.  The objective is not to eat the elephant, it is to get some good meals out of it.
Or to get back to railroads: the Sunset Route is identified as a critical bottleneck that needs to be improved, so they look at trackwork, yards, dispatching practices, traffic flows, alternate routes, partner and customer agreements…. Every practice of that one part of the business is considered.  Then a programme of improvements is put in place that includes a big capital project like double-tracking as much of it as is essential; but also includes lots of local minor improvements across all practices – not improvements for their own sake, not improvements to every aspect of every practice, just a collection of improvements assembled to relieve the congestion on Sunset.

Make improvement real

So take these four premises and consider the conclusions we can draw from them:

  1. Everything we change is service improvement.
  2. Improvement planning comes first.
  3. We don’t have enough resource to execute all desired improvements.
  4. We choose the wrong unit of work for improvements.

We should begin our strategic planning of operations by putting in place a service improvement programme.  That programme should encompass all change and BAU: i.e. it manages the service portfolio.

The task of “eating 80-plus elephant’s legs” is overwhelming. We can’t improve everything about every aspect of doing IT.   Some sort of expediency and pragmatism is required to make it manageable.  A first step down that road is to stop trying to fix things practice-by-practice, one ITIL “process” at a time.

Focus on needs

We must focus on what is needed.  To understand the word ‘needed’ we go back to the desired business outcomes.  Then we can make a list of the improvement outputs that will deliver those outcomes, and hence the pieces of work we need to do.

Even then we will find that the list can be daunting, and some sort of ruthless expediency will have to be applied to choose what does and doesn’t get done.

The other challenge will be resourcing the improvements, no matter how ruthlessly we cut down the list.  Almost all of us work in an environment of shrinking budgets and desperate shortages of every resource: time, people and money.  One way to address this – as I’ve already hinted – is to do some of the work as part of BAU.

These are all aspects of my public-domain improvement planning method, Tipu:

  • Alignment to business outcomes
  • Ruthless decision making
  • Doing much of the work as part of our day jobs

More of this in my next article when we look closer at the Tipu approach.

Barclay Rae: Assessment Criteria for Service Catalogue

Editor’s Note: We are very pleased to welcome Barclay Rae as Analyst and contributor to The ITSM Review. Barclay is a well respected pillar of the ITSM community and we look forward to publishing his first analysis.

I will soon be starting a competitive analysis of Service Catalogue offerings in the ITSM market.

As with previous reviews completed on The ITSM Review my goal will be to highlight the key strengths, competitive differentiators and innovation in the industry.

The criteria I will use for my assessments are published below. If you have any comments or recommendations, please leave a comment below or contact us.


Service Catalogue Assessment Criteria

Service Catalogue Products

Offerings should provide the following high-level functionality:

  1. Service Design – the ability to create a database of service records, containing a number of business and technical attributes, processes and workflows
  2. Service Structure – the ability to organise and structure these services into a hierarchy of services and service offerings, ideally useable in a graphical format
  3. User Request Portal – a user friendly/external facing portal that provides users with an intuitive User Interface to request services
  4. Request Fulfilment – request management workflow functionality that can be easily used and configured by system users
  5. SLA and event management – the ability (in the software or by integration) to define universal and bespoke levels of SLA which are then automated and escalated through an event management process – ideally linking with Incident, Problem and Change Management functionality
  6. Demand Management – the ability to provide real time allocation and monitoring of Service consumption, with financial calculations
  7. Dashboard – real-time user-friendly graphical monitoring and analysis of usage, trends and metrics across services and to various stakeholders
  8. Service Reporting – the ability to present output that summarises individual and bundled service performance, consumption, SLA and event performance, in user-friendly, portable and graphical format

Specific Requirements

  1. Service Design – the ability to create a database of service records, containing a number of business and technical attributes, processes and workflows, including:
    • Service design form – to create a service
    • Add attributes – e.g.  description, user groups, business owner, service owner, type, portfolio status, SLA details, Key metric details, cost/price, support model details (1st/2nd/3rd level), warranty details, vendor details, resolver group details, escalation, CIs and technical details, parent and child services, offerings, unit cost and price, user-created attributes
    • Creation of services in hierarchy – sub-services and service offerings (specific tasks)
    • Provide pre-filled service form templates
  2. Service Structure – the ability to organise and structure these services into a hierarchy of services and service offerings, ideally useable in a graphical format
    • Create a hierarchy of services, subservices and service offerings
    • Present this hierarchy in a graphical format
    • Link the service structure to internal or external CIs and CMDBs
    • Be able to access (drill down) service information via the graphical hierarchy
    • Provide pre-filled service structure templates
  3. User Request Portal – a user-friendly/external facing portal that provides users with an intuitive User Interface to request services
    • Intuitive UI – look and feel of commercial shopping sites
    • Ability to present short simple options for service fulfilment – based on ‘bundled’ services where appropriate (e.g. new user)
    • Ability to present variable options and presentations based on login/access
    • Shopping basket
    • Option on full set of individual services
  4. Request Fulfilment – request management workflow functionality that can be easily used and configured by system users
    • Task management – creation of standard tasks and assignment to owners
    • Ability to cope with multiple task types and associated variable SLAs
    • Escalation and notification of task breaching
    • Output to portal for users to be able to track progress
    • Internal dashboard facility to track progress of requests and SLA progress
    • Provide pre-filled workflow templates
  5. SLA and event management – the ability (in the software or by integration) to define universal and bespoke levels of SLA which are then automated and escalated through an event management process – ideally linking with Incident, Problem and Change Management functionality
    • ITSM process linkage to manage events against service definitions
    • Real-time links to event management across internal and external ITSM systems
  6. Demand Management – the ability to provide real time allocation and monitoring of Service consumption, with financial calculations
    • Ability to set budgets and allocated consumption levels for services and offerings
    • Monitoring and analysis capability to track consumption and escalate thresholds and breaches
    • Financial analysis of supply and demand on services
    • Provide pre-filled budget model templates
  7. Dashboard – real-time user-friendly graphical monitoring and analysis of usage, trends and metrics across services and to various stakeholders
    • User-configurable views
    • Ability to track multiple views on service, SLA, demand consumption etc
    • Provide pre-defined and configurable dashboard templates
  8. Service Reporting – the ability to present output that summarises individual and bundled service performance, consumption, SLA and event performance, in user-friendly, portable and graphical format
    • User-configurable views
    • Ability to track multiple views on service, SLA, demand consumption etc.
    • Ability to combine multiple metrics into service bundles
    • Set thresholds for RAG indicators
    • Provide pre-defined and configurable dashboard templates

General Requirements

  1. User-configurable forms, tables, workflows
  2. Role-based security access – to allow control of available requests
  3. Integration with ITSM tools event management and CMDB (for SC only products)
  4. Integration with CMDB (ITSM tools) and ability to link CI components into Service Bundles
  5. Open system for real-time integration with financial management and other monitoring tools
  6. Vendors (SC only and ITSM tools) should have established proven links with other ITSM, CMDB and financial management packages
  7. Product should provide templates and pre-filled forms and structure to act as a basic starting point where possible – service structure, service attributes, SLAs, workflows etc.
  8. Vendors should provide expertise and guidance in the implementation of the tool and relevant processes and project requirements around Service Catalogue – e.g. with workshops and training as well as implementation consultancy

What is your view, what have we missed?

Please leave a comment below or contact us. Similarly if you are a vendor and would like to be included in our review, please contact us.

CGI/Logica gains 5-star Service Desk Institute accreditation

Tessa Troubridge, Managing Director, SDI

Logica is positively beaming with a friendly welcoming smile this month after receiving news that it has been awarded 5-star certification by the Service Desk Institute (SDI) for its UK service desk.

Now part of CGI Group Inc. as a trading entity, this is apparently the first time that any organisation has achieved the 5-star standard.

The CGI/Logica UK service desk team, based in South Wales, supports more than 180 clients across the public and private sector. To award the 5-star certification SDI carried out a four-day audit incorporating feedback from clients and staff, and worked alongside members to understand how the team provides services to a broad range of organisations.

NOTE: In terms of form and function, the 5-star service desk certification (introduced by SDI in 2012) is said to be a definition of the “ultimate levels” of quality and delivery for world-class service desks.

It found that true integration of the service desk with the wider service management functions demonstrated combined strength and commitment to delivery excellence.

Tim Gregory, UK President, CGI, said: “The SDI Service Desk Certification is testament to the hard work of the team and their commitment to providing outstanding levels of service. We invest a lot of time in our members with in-depth training upfront so they have the skills to best help meet client’s diverse needs. We also encourage the team to spend time with our clients to greater understand their overall objectives and how their business works. Investing this time from the outset, allows us to offer our clients an unrivalled level of service and, as is proven by our accreditation.”

Tessa Troubridge, Managing Director, SDI, said, “Achieving 4 star on two consecutive occasions for the SDI Service Desk Certification programme is a tremendous accolade in its own right and to be recognised as a 5* world class service desk is a truly outstanding achievement. I am delighted and proud that we have been able to certify CGI/Logica as the first 5* world class service desk.”

Troubridge also said that the service desk here is extremely impressive with a remarkable people culture. Every team member displays a tangible passion, enthusiasm and drive to deliver not only excellent customer service but to provide added value as part of every single customer engagement.

Talking of Logica’s WOW factor, Troubridge says that the culture here is evidenced throughout the fabric of the organisation, the processes in place and the unique approach to team work to enhance the customer experience.

“It is in the DNA of each of the team members, their team leaders and across all levels of management and is driven both top down and bottom up.  This exceptional people culture is one of the real WOW factors of the service desk of which they should be extremely proud and which all other service desks should aspire to achieve.”

Review: itSMF UK Tooling Event [January 2013]

itSMF UK Chair Colin Rudd

itSMF UK Tooling Event, London, January 25th 2013

I attended the itSMF UK Tooling event on 25th January in central London.

That week in the UK was bitterly cold with lots of snow – so this event had low turnout or cancelled written all over it.

However, hats off to the itSMF UK events crew who managed to persuade around 100 ITSM folks to brave the snow and ice and discuss service management tools and technology.

The event blurb stated:

“Finding the right ITSM products and implementing them correctly is a challenge for any organization, and keeping abreast of the latest software developments is becoming increasingly difficult as users have less and less time available to explore the options.

itSMF UK’s ITSM Software Tools Forum offers an unprecedented opportunity to bring vendors, consultants and potential buyers together under one roof to discuss product selection and implementation.”

Running for Ashley

itSMF UK Chairman Colin Rudd was our opening speaker and guide for the day. Colin began by painting the big ITSM picture and discussing the 50,000ft view on what we are aiming to achieve with the practice of ITSM.

Colin’s opening served as a useful orientation and allowed delegates, who had taken a day out from being at the rock face of day-to-day ITSM, to gain the right perspective.

Colin also urged us to support the itSMF UK team with their Reading Half Marathon charity run in support of long time itSMF supporter Ashley Hanna.

Colin Rudd, John Windebank, Ben Clacy, Mark Lilycrop, Rosemary Gurney and Barry Corless will be running for Macmillan Cancer Support on the 17th March – make a donation here:

http://www.justgiving.com/ashleysbigchallenge

After Colin’s introduction we heard from Cherwell, Marval, Hornbill, 2E2, BMC and Topdesk.

CHERWELL (8/10)

An old adage for presenters to keep their message clear and concise is to:

  1. Tell ’em what you’re gonna tell ’em
  2. Tell ’em
  3. and then tell ’em what you’ve told ’em

Simon Kent from Cherwell opened vendor presentations with a textbook example of this method in action.

He told us the leading Cherwell value points were: Ease of use, business value, service automation and innovation. He then proceeded to hammer each point home concisely by letting the technology do the talking.

MARVAL (7/10)

This is the first Marval pitch I’ve seen without Don Page. Whilst Don is clearly a leading pillar of the ITSM community and someone you won’t forget in a hurry, I thought it was refreshing to see a Marval presentation minus Don.

Underneath the façade of humour and expletives lies a solid ITSM company with a solid offering. The team are clearly service oriented and interested in the long-game consultative sale rather than just punting software. Good presentation from Tom West-Robinson, I look forward to seeing him present again.

HORNBILL (6/10)

The two presentations prior to Hornbill were focussed on ease of use, codeless configuration and DIY development. Prospective customers are perhaps thinking “If we swap our existing tool for something else we don’t want to re-mortgage the business to pay for the configuration”.

With this in mind, I felt the Hornbill proposition looked a little dated (versions aside).

Patrick’s presentation was good as per usual and Hornbill’s ‘Make IT Happen’ is a great approach but, given this is a tooling event, Patrick could have given us more to showcase the actual technology.

Quote du Jour from Patrick:

 “Renting software doesn’t make you any better at running it“ – Patrick Bolger, Hornbill

2e2 (1/10)

Martyn Birchall from 2e2 opened his pitch by stating that he ‘got bored with own PowerPoint’ and ‘preferred to make things interactive’. What a refreshing change – an interactive session before lunch? Alas, Martyn then proceeded to plod through his PowerPoint and not allow for interaction. I won’t dwell on his painful pitch since 2e2 unfortunately seem to have bitten the dust since the event.

BMC (8/10)

Andrew Smith provided a live demonstration of Remedyforce which included harnessing the enterprise social platform Chatter into service management work streams. Remedyforce will look very cosy and familiar for anyone working with the force.com platform. It was a good showcase and attracted the most questions and interaction throughout the day.

For a big lumbering publicly listed conglomerate the demo showed surprising innovation. I also liked the tool BMC use to help potential prospects navigate the portfolio.

TOPDESK (7/10)

Last to present was Rob Goldsworth of TOPdesk, who stated that ‘ITSM is not an IT function’ and emphasized the use of their technology in HR, Facilities, CRM and so on.

Apart from a small own goal with ITIL certification semantics, Rob gave us a good tour of the compelling features within TOPdesk via a live demo. In particular I liked the Kanban-style instant visualization of work in hand and resources available. Similarly the resource planner and process mapping tools look very well thought out. It was a good enough demo to whet your appetite without being too mechanical.

Whistle Stop Tour of ITSM Tools

In short, I thought this was a good event. It was well attended, had a good mixture of exhibitors and provided a great opportunity for prospective buyers to network with peers and engage with software companies without the formality of the normal sales process.

Note: This is just my opinion, as an itSMF member of an itSMF event. If you wish to share your own opinion on this or any other event please feel free to use the ITSM Review platform.

Coming Soon: Axios, BMC, Cherwell, NetSupport, TOPdesk & Nexthink Slog it out

Axios, BMC, Cherwell, NetSupport, TOPdesk and Nexthink are confirmed participants for our upcoming ‘Incident and Problem Management’ review.

Our assessment Criteria at a Glance:

  • Logging & Categorization
  • Tracking
  • Lifecycle Tracking
  • Prioritisation
  • Escalations
  • Major Incidents and Problems
  • Incident and Problem Models
  • Incident and Problem Closure

Full details of the assessment criteria can be found here.

Reviewer: Ros Satar 

All results will be published free of charge without registration on The ITSM Review. You may wish to subscribe to the ITSM Review newsletter (top right of this page) or follow us on Twitter to receive a notification when it is published.