Background

LabKey performs the vast majority of product testing during the development cycle of a new release. The development of every new feature includes buddy testing, creation of automated unit tests, and creation of browser-based integration tests. Our automated servers run large suites of tests against every commit and even larger suites on a nightly and weekly basis to identify new bugs and regressions. We distribute monthly sprint builds to many clients, encouraging them to exercise these builds on their test servers and promptly report problems they find in new and existing functionality. After our final (stabilization) sprint, we push bi-weekly release candidates to our clients and ask them to validate these on their servers using their data. This culminates in LabKey making an official release of a build that has been thoroughly tested by us and many of our clients, typically a couple of weeks after the end of the stabilization sprint.

Our clients often find bugs in released builds. In most cases, we fix these problems as part of the next release cycle. We don’t typically hotfix released products, for several reasons:

  1. Risk. Hotfixes completely bypass the standard testing that takes place during the development cycle. These fixes are often deployed to production servers shortly after being committed, with limited opportunity to verify the fix. The bigger problem, though, is the risk of "unintended consequences." Like all other code changes, a hotfix can cause additional (often more severe) issues in other parts of the system. A hotfix provides no opportunity to detect these follow-on issues before production deployment.
  2. Focus. At the point when a potential hotfix is identified, developers are deeply engaged in implementing features for the next release. Asking several developers to stop feature work and focus instead on a hotfix often prevents them from finishing one or more scrum board features.
  3. Cost. Producing a hotfix is typically three to five times as costly as fixing the exact same issue during the development cycle. To mitigate the risks mentioned above, we must be extremely conservative with hotfixes. We start with an evaluation process that involves senior management and the client. We then design, discuss, implement, and test several potential solutions, trying to find the fix that best addresses the issue while minimizing impact on other functionality. All hotfixes are risk-assessed by senior management and code reviewed by one or more developers familiar with the area. Testers must attempt to verify the change immediately. Often, the isolated hotfix solution is not an appropriate long-term solution; in these cases, the hotfix changes are rolled back and replaced with a more comprehensive fix in the next release. All of this additional overhead makes hotfixes very time consuming and expensive.

Policy

We evaluate every hotfix candidate using the following factors and questions:

  1. Severity. How bad is the problem? Does the problem involve a security exposure or data loss? Is functionality blocked? If so, how important is that functionality?
  2. Scope. How many people will be affected by this issue? To what extent will it impair their work? Are other clients affected?
  3. Workarounds. Are there reasonable steps that avoid the problem? Can those affected be shown these steps?
  4. Regression status. Is the bug:
    • A new problem with previously working functionality?
    • A problem with new functionality?
    • An old problem that’s been in the product for one or more previous releases?
  5. Cost of fixing. How long will it take to implement and test a fix?
  6. Risk of fixing. How invasive are the changes? What’s the likelihood that these changes will produce unintended consequences?
  7. Time. How long has the release been available? How long before a new release is made?

Evaluating a hotfix candidate is a subjective risk vs. reward trade-off. In most cases, we and our clients find that the reward is simply not worth the risk and cost. But, as hinted in #7 above, the length of time since the last release does affect the evaluation. A critical issue discovered shortly after release needs to be evaluated seriously, but an issue that isn’t reported until three months into a release is almost certainly not a high priority (we release new versions every four months). Combining this temporal element with the other factors leads to some general guidelines that we use to quickly assess whether an issue is a hotfix candidate.

The following issue types may be hotfix candidates. How long each type remains a candidate after release varies: one month after release, two months after release, or always (until the next release), depending on severity:

  • Security issue
  • Significant data loss issue
  • Blocking issue in old functionality (regression)
  • Blocking issue in new functionality
  • Performance issue

The following are not hotfix candidates:

  • Issue present in a previous release
  • New feature or improvement request
  • Issue with a reasonable workaround
  • Issue with limited impact

The above guidelines are not hard-and-fast rules. The risks or costs of a fix may preclude an otherwise worthy hotfix. On the other hand, we'll occasionally take a simple, low-risk fix that doesn’t meet these criteria.

We encourage all clients to test new functionality promptly (as the sprint builds are made available) and perform regular regression testing of important existing functionality. Reporting all issues before public release is the best way to avoid hotfixes entirely.
