The automated system feeds millions of pieces of input to Mathematica, and checks that the output obtained from them is correct. Often there is some subtlety in doing such checking: one must account for different behavior of randomized algorithms and for such issues as differences in machine-precision arithmetic on different computers.
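The two subtleties mentioned above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not Mathematica's actual test infrastructure; the names `outputs_match`, `run_test`, and `sample_mean` and the tolerance value are all assumptions. Numeric outputs are compared up to a relative tolerance rather than exactly, and a randomized computation is made repeatable by fixing the seed before each run.

```python
import math
import random

def outputs_match(actual, expected, rel_tol=1e-12):
    """Compare numbers up to a relative tolerance; compare anything else exactly."""
    numeric = (int, float)
    if isinstance(actual, numeric) and isinstance(expected, numeric):
        return math.isclose(actual, expected, rel_tol=rel_tol)
    return actual == expected

def run_test(func, args, expected, seed=None):
    """Run one test case, fixing the seed first so randomized code is repeatable."""
    if seed is not None:
        random.seed(seed)
    return outputs_match(func(*args), expected)

# Machine arithmetic: 0.1 + 0.2 is not exactly 0.3 in binary floating point,
# so an exact comparison fails while a tolerant one succeeds.
exact_ok = (0.1 + 0.2 == 0.3)
tolerant_ok = run_test(lambda a, b: a + b, (0.1, 0.2), 0.3)

# Randomized algorithm (hypothetical example): the same seed gives the same sample.
def sample_mean(n):
    return sum(random.random() for _ in range(n)) / n

random.seed(42)
reference = sample_mean(1000)
seeded_ok = run_test(sample_mean, (1000,), reference, seed=42)
```

A real test system would also need a policy for choosing the tolerance on each platform; the fixed `rel_tol` here is only a placeholder.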
There is also a special instrumented version of Mathematica which is set up to perform internal consistency tests. This version of Mathematica runs at a small fraction of the speed of the real Mathematica, but at every step it checks internal memory consistency, interruptibility, and so on.
The instrumented version of Mathematica also records which pieces of Mathematica source code have been accessed, allowing one to confirm that all of the various internal functions in Mathematica have been exercised by the tests given.
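The idea of recording which code a test run has reached can be sketched as follows. This is an illustrative Python analogue under stated assumptions, not a description of how the instrumented Mathematica works internally: the decorator `instrumented`, the sets `registry` and `called`, and the four small functions are all hypothetical.

```python
import functools

registry = set()  # names of all functions that the tests are expected to reach
called = set()    # names of functions actually reached during the test run

def instrumented(func):
    """Register func and record its name each time it is called."""
    registry.add(func.__name__)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        called.add(func.__name__)
        return func(*args, **kwargs)

    return wrapper

@instrumented
def add(a, b):
    return a + b

@instrumented
def negate(a):
    return -a

@instrumented
def subtract(a, b):
    # Reaches add and negate indirectly.
    return add(a, negate(b))

@instrumented
def multiply(a, b):
    return a * b

# A "test suite" consisting of a single case.
result = subtract(5, 3)

# Functions that no test has exercised yet.
unexercised = sorted(registry - called)  # ['multiply']
```

The list of unexercised functions shows where further tests are needed. As with the instrumented version described above, the extra bookkeeping makes every call slower.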
All standard Mathematica tests are routinely run on current versions of Mathematica, on each different computer system. Depending on the speed of the computer system, these tests take from a few hours to a few days of computer time.
The standards of correctness for Mathematica are certainly much higher than for typical mathematical proofs. But just as long proofs will inevitably contain errors that go undetected for many years, so also a complex software system such as Mathematica will contain errors that go undetected even after millions of people have used it.
Doubtless there will be times when Mathematica does things you do not expect. But you should realize that it is vastly more likely that there is something wrong with your input to Mathematica, or with your understanding of what is happening, than with the internal code of the Mathematica system itself.
If you do believe that you have found a genuine error in Mathematica, then you should contact Wolfram Research Technical Support, so that the error can be corrected in future versions.