Optimization: why does it take so long to test optimality when starting from an optimal solution?

So I'm solving larger instances of some linear-binary program using cplex.
The formulations of the problem that I am using are compatible with the integers, which means that almost all my instances can be resolved in the root node.
Also, I have a pretty good heuristic to calculate solutions. The heuristic is quite fast and almost always offers the optimal solution worldwide.

By combining cplex with heuristics I do not achieve the performance gains I expected. I provide the optimal global solution for cplex as a "hot start solution", but it still takes quite some time for cplex to demonstrate its optimality (cplex chooses the simple duplex for relaxation).

I may lack a bit of theoretical understanding, but why can not the dual-simplex simply build a foundation for the optimal solution provided and show that there is nothing else to do?