Tuesday, 19 January 2016

2016 - for testing it's that kind of year, again, again

2016 is progressing as planned and most projects are up to speed again after the holiday season. Deadlines are approaching and a lot of testing is taking place. It feels familiar and safe, but wait, something feels a bit different this year.

That's because 2016 is a leap year: one of those years with one more day in the calendar. A full extra day for testing. Oh joy. However, that extra day should also prompt a bit of extra scrutiny from you, dear tester, test manager, QA specialist or whatever title allows you to spend most of your time on testing.


Remember the last leap year, 2012? Cloud was perhaps the most hyped field within IT. Microsoft had spent the previous years pushing Azure to a lot of strategic customers across the entire planet when disaster struck: on 29 February 2012 a leap-day bug took Azure down for many hours. This of course prompted a lot of jokes; within my organisation the joke was "Office 364" for some time. It probably also meant that Microsoft had a lot of fan mail from various lawyers.

To Microsoft this was a PR disaster because it was felt by so many end users in so many different places at the same time - and because Microsoft had promised that it was safe to move business-critical platforms to the Cloud. Well, only in a "normal" year, it seems.

Leap year bugs are a problem because the root cause can be difficult to spot before the problem occurs in real life; they are side effects of side effects. So take a look at your test plans. If you plan to go live during February, panic a bit. Even if your releases fall after the 29th, do a little brainstorm to find out whether that extra day will affect any functionality in your project scope - like end of month/quarter/year. Or simply try to figure out what will happen this year on the 28th and 29th of February and on the 1st of March, and whether end-of-March will be affected.
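To make that brainstorm concrete, here is a minimal Python sketch of the classic failure mode: naively computing "the same date next year" falls apart on 29 February. The `next_anniversary` helper is hypothetical, invented for illustration only:

```python
from datetime import date

def next_anniversary(d: date) -> date:
    """Naive 'same date next year' logic - breaks on 29 February,
    because replace() raises ValueError for an invalid resulting date."""
    return d.replace(year=d.year + 1)

# The boundary dates worth a dedicated test case each
# (assumed project year 2016; adjust to your own go-live window):
boundaries = [date(2016, 2, 28), date(2016, 2, 29), date(2016, 3, 1),
              date(2016, 3, 31), date(2016, 12, 31)]

for d in boundaries:
    try:
        print(d, "->", next_anniversary(d))
    except ValueError:
        print(d, "-> no such date next year, needs explicit handling")
```

The point is not this particular helper but the pattern: any date arithmetic done by fiddling with year, month or day fields in isolation deserves a test on exactly these boundaries.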

That extra day in February is so nice since it is a free extra day in most project plans, but it will slap you in the face unless you test for it and know that everything works according to specification or assumption. Wikipedia has a short list of known leap year bugs for inspiration to get you started.

If you don't remember what you've done for the past 4 years in terms of development and testing, have a cup of coffee with your favourite friend - the portfolio manager - and ask if she has a list of projects that have gone live in that period. Or maybe the deployment manager, or DevOps. Actually, right now might be a good time for a few cups of productive coffee. All in the name of defect prevention.

Thursday, 7 January 2016

Performance testing - a real world experience and day-zero considerations

Welcome to 2016. Why not kick off the new year with an old story? Some years back I was test manager for a large project aiming at launching a business-critical system. The test strategy recommended a few structured performance test activities aiming to prove that the system would actually be able to deal with the expected number of users, that the morning log-on peak could be handled, and that system resources would be freed up as users logged off.

All of these recommendations were approved by project management and a separate team was set up to design, implement and execute the necessary tests. This would be done late in the project, which is entirely normal: do the testing at a point where the software is sufficiently mature that you can trust the results and use them for a go/no-go decision. So far, so good.

Since the project would replace an existing solution, we didn't have to guess too much about user behaviour for normal use: just look in the log files and find the patterns. A large part of the testing was designed around this knowledge.
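As a sketch of that log mining - assuming a hypothetical log format with one `LOGIN` line per event, which is not the format from the actual project - simply bucketing log-ons per minute is enough to reveal the morning peak you need to model:

```python
from collections import Counter
from datetime import datetime

# Hypothetical access-log lines: "<ISO timestamp> <event> <user>"
sample_log = [
    "2012-03-05T08:01:12 LOGIN anna",
    "2012-03-05T08:01:45 LOGIN bob",
    "2012-03-05T08:02:03 LOGIN carl",
    "2012-03-05T09:15:30 LOGIN dora",
]

def logons_per_minute(lines):
    """Bucket LOGIN events into minute-of-day counts."""
    buckets = Counter()
    for line in lines:
        stamp, event, _user = line.split()
        if event == "LOGIN":
            minute = datetime.fromisoformat(stamp).strftime("%H:%M")
            buckets[minute] += 1
    return buckets

print(logons_per_minute(sample_log).most_common(3))
```

Feed the real production logs through something like this and you have an evidence-based arrival profile for the load scripts instead of a guess.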

Then we consulted the implementation team to figure out how they expected to roll out the solution to the organisation. We returned with the knowledge of a "big bang" implementation. There were no alternatives, so we needed that as a scenario too. How would the solution scale and behave on the first day, when everybody had an email in their inbox saying "please log on to this brand new and super good system"?

No problems so far. Since the organisation was spread across two time zones, some of the expected peak load was taken off, and we didn't have to face the cruel "100% of users at the same time" scenario. Emails to different parts of the organisation could be sent out to groups of users at, say, 10-15 minute intervals to avoid a tidal wave of concurrent log-ons. A good, pragmatic idea; it was agreed in the project and executed by the implementation team.


The one thing we didn't take into account was how organisations, and especially middle management, work. Middle managers tend to forward a lot of mail, in ways not always known or controlled by a project like ours. So in the real world we succeeded with our performance testing but failed on day zero.

As soon as middle managers started to receive the "Important information - log on to this new system" mail, they did what they always do with this kind of information: passed it on. Not only to their own organisation but across the organisation, using mail groups that would hit 30, 50 or 100 people at a time. They were used to this in their daily operational life, and to them this was just another operational morning.

The result was that the peaks of log-ons were completely different from what we had expected and planned - and tested. Not to the extent of a complete meltdown, but there were short outages during the first couple of hours - and of course some angry and concerned users who needed feedback and reassurance that they could trust a system that was mission critical for them.
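A rough back-of-the-envelope simulation (all numbers invented for illustration, not taken from the actual project) shows how forwarding collapses carefully staggered waves into one peak:

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable
USERS_PER_GROUP, GROUPS = 200, 10

def peak_logons_per_minute(wave_starts, spread_minutes):
    """Busiest single minute, given each wave's users log on within
    spread_minutes of receiving their mail (uniformly at random)."""
    per_minute = {}
    for start in wave_starts:
        for _ in range(USERS_PER_GROUP):
            minute = start + random.randint(0, spread_minutes)
            per_minute[minute] = per_minute.get(minute, 0) + 1
    return max(per_minute.values())

# Plan: one mail group every 15 minutes, users trickling in over ~30 minutes.
planned = peak_logons_per_minute([i * 15 for i in range(GROUPS)], 30)
# Reality: forwards push every group's mail out within the first 10 minutes,
# and curious users log on within ~5 minutes of getting it.
fanout = peak_logons_per_minute([random.randint(0, 10) for _ in range(GROUPS)], 5)

print("planned peak log-ons per minute:", planned)
print("fan-out peak log-ons per minute:", fanout)
```

Even this toy model produces a fan-out peak several times the planned one - which is roughly what hit us that morning.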

Lesson learned: think a bit outside the box. Not always the worst case scenario, but closer than you might think. Even though you have a lot of knowledge to build on, always consider performance testing for day-zero scenarios as something truly special. First impressions last, especially for real-life users.