Testing: How do you encourage programmers to stop relying on testers too much?

Bottom line: This is a cultural problem.

I come from the point of view that proficient programmers write unit tests for at least the most complex parts of their code. The problem is that not everyone shares my point of view. I know people who have been developing code longer than I have been alive and who do test their code, but not necessarily with automated tests. That said, some things in software development are too simple to be worth testing, and tests for them have no real value.

There are also different levels of testing, each with a different rate of intermittent failures. From a management perspective, you must understand where each level adds value.

  • Unit tests: check the implementation details and error handling as close to the logic as possible.
  • Integration tests: check that the system meets its specifications.
  • User interface tests: check that the application behaves according to the requirements.

Unsurprisingly, the further you move away from the individual code units, the more fragile your tests become. The more pieces you have working together, the more likely it is that something will fail intermittently. That means the closer to the unit level you can detect a problem, the more reliable and valuable the test will be.
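As a rough illustration of the unit level described above, here is a minimal sketch using Python's standard unittest module. The function parse_percentage is hypothetical and exists only for this example; the point is that the tests sit right next to the logic and cover the error handling as well as the happy path.

```python
import unittest


def parse_percentage(text: str) -> float:
    """Hypothetical unit under test: turn a string like '25%' into 0.25."""
    cleaned = text.strip()
    if not cleaned.endswith("%"):
        raise ValueError(f"missing percent sign: {text!r}")
    value = float(cleaned[:-1])  # float() raises ValueError on non-numeric input
    if not 0 <= value <= 100:
        raise ValueError(f"out of range: {text!r}")
    return value / 100


class ParsePercentageTest(unittest.TestCase):
    def test_parses_valid_input(self):
        self.assertAlmostEqual(parse_percentage(" 25% "), 0.25)

    def test_rejects_missing_percent_sign(self):
        with self.assertRaises(ValueError):
            parse_percentage("25")

    def test_rejects_non_numeric_value(self):
        with self.assertRaises(ValueError):
            parse_percentage("abc%")

    def test_rejects_out_of_range_value(self):
        with self.assertRaises(ValueError):
            parse_percentage("150%")
```

Saved in a file, this can be run with "python -m unittest". Nothing here depends on a database, a network, or a user interface, which is why tests at this level tend to fail only when the code is actually wrong.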

Automation cost

Automation takes time. Time costs money. It takes longer to write an automated test than to test a feature manually once. However, each time an automated test runs, it runs in a fraction of the time of the manual test. From a management point of view, you want to make sure you get a good return on your investment. I highly recommend automated tests for the highest-risk code, meaning the code that makes the application useless if it breaks, so that regressions (broken code) are detected as soon as they occur. If the unit tests fail, the developer should not be able to push their code.

In general, it helps to have some guidelines and a way to ensure that the high-risk code is covered.

  • Unit tests should have at least 25% coverage. (Personally, I prefer higher, but for a team without unit testing, this is a good place to start.)
  • Unit and integration testing should be prioritized for the high-risk code first.
  • The Definition of Done should require one or both of the following:
    • Code peer review (pull requests are a great way to organize them)
    • Unit test coverage meets the minimum criteria
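To make the guidelines above concrete, here is a minimal sketch of how such a Definition of Done check could be expressed. All names (ChangeStatus, meets_definition_of_done, and so on) are hypothetical, and the line counts are assumed to come from whatever coverage tool the team already uses. Treating a change with no measurable lines as fully covered is also an assumption of this sketch.

```python
from dataclasses import dataclass

MIN_UNIT_COVERAGE = 0.25  # the 25% starting point suggested above


@dataclass
class ChangeStatus:
    """Hypothetical summary of one change, e.g. one pull request."""
    peer_reviewed: bool
    covered_lines: int
    total_lines: int


def unit_coverage(change: ChangeStatus) -> float:
    """Fraction of lines exercised by unit tests."""
    if change.total_lines == 0:
        return 1.0  # assumption: nothing to cover counts as covered
    return change.covered_lines / change.total_lines


def meets_definition_of_done(change: ChangeStatus, require_both: bool = False) -> bool:
    """Apply the 'one or both' rule: peer review and/or minimum coverage."""
    coverage_ok = unit_coverage(change) >= MIN_UNIT_COVERAGE
    if require_both:
        return change.peer_reviewed and coverage_ok
    return change.peer_reviewed or coverage_ok


# A reviewed change with no unit tests passes the lenient rule only.
reviewed_only = ChangeStatus(peer_reviewed=True, covered_lines=0, total_lines=100)
print(meets_definition_of_done(reviewed_only))                     # True
print(meets_definition_of_done(reviewed_only, require_both=True))  # False
```

Starting with the lenient rule and switching to require_both later is one way to raise the bar gradually without blocking a team that has no unit tests yet.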

Manual test cost

Manual testing takes time. Time costs money. While it is faster to test a feature manually once, it takes the same amount of time every time you test it. You need to keep testing finished features to protect yourself from regressions. Regressions are features in your application that used to work and no longer do.

The hidden cost of manual testing is that testers are people, and sometimes people skip tests by accident. If you thought writing automated tests was tedious, try testing the same features click by click, over and over again.
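The trade-off between the two costs can be sketched as a simple break-even calculation. If writing the automated test costs A minutes up front, each automated run costs a minutes, and each manual run costs m minutes, automation is cheaper once the number of runs n satisfies A + n*a < n*m, that is, n > A / (m - a). The function name and the numbers below are made up for illustration; real figures would come from your own team.

```python
import math
from typing import Optional


def break_even_runs(automation_minutes: float,
                    automated_run_minutes: float,
                    manual_run_minutes: float) -> Optional[int]:
    """Smallest number of runs at which automating is strictly cheaper.

    Returns None when an automated run saves no time over a manual run,
    because the up-front investment is then never recovered.
    """
    saved_per_run = manual_run_minutes - automated_run_minutes
    if saved_per_run <= 0:
        return None
    # Smallest integer n with n > automation_minutes / saved_per_run.
    return math.floor(automation_minutes / saved_per_run) + 1


# Hypothetical numbers: 2 hours to write the test, 30 seconds per automated
# run, 10 minutes per manual run.
print(break_even_runs(120, 0.5, 10))  # 13
```

With these assumed numbers the automated test pays for itself on the 13th run. For code that is retested on every release or every push, that point arrives quickly; for a feature tested once a year, it may never arrive.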

Optimizing your investment

This is where management and development need to be on the same page. If quality is important to the company, the company must be willing to invest in quality. If quality is not important, just skip the testing. Your users will complain, and you may or may not be embarrassed by the problems they complain about. That said, if the app is mission critical, then quality should be important.

  • Automate high-risk code testing
    • The risk can be high due to the complexity of the solution.
    • The risk can be high because of how essential the feature is.
  • Don't write tests for code that is too simple to fail (like getters and setters)
  • Manually test things that are too complicated to test automatically (like drag / drop)
  • Invest in simplicity
    • A requirement on its own can be simple enough, yet still conflict with other requirements.
    • Be ready to remove features so that the application meets current needs.
  • Define "done" clearly and unambiguously
    • When the work to be done is unclear, developers and testers have different opinions on what is right.
    • A few more minutes in a meeting with three people can save days of work and rework caused by different definitions of done.

Summary

The company culture is currently in a no-win situation. Cultural changes are easier when management has bought in. They are also easier when the team itself introduces disciplines that help it be more effective. To that end, I would prioritize defining "done" before anything to do with how testing is done.

It's great that you're collecting metrics. It's not great how those metrics are currently used. A better way is to watch the trends in those metrics as you introduce more structure to how your team develops software. For example, if time to completion improves and the number of test failures drops because you spend more time defining what should be done, then that is a win. If you increase your automated test coverage to 50% and don't see any improvement in the number of test failures, then maybe 25% is good enough.

Software development is a team activity. The more people work together as a team, the better everyone's attitude will be. The better you set your team up for success, the more success your team will experience.