How do you check the quality of a complex product? And what do you do when that product is a complex optimization algorithm?

Normally, testing is performed "bottom up": you isolate the small components and test them first, then integrate them into larger components that are tested together, until everything is integrated into the final product, which is tested as a whole.

But how do you proceed if you can check that a result is correct, but cannot prove that it is the best possible solution?

For today's well-known timetable information systems this problem occurs systematically, because their algorithms were originally designed in the days of the Intel 286. At that time (the late 1980s), developers were forced to employ heuristics that could not guarantee optimal quality in order to achieve good results with acceptable response times.

For this reason, both the manufacturers of such systems and their large customers have always faced the question of how to achieve the best results, and at what cost (computing time, working hours). The method of choice was usually regression testing, in which the old version was tested against the new version. This was very laborious and progress came only in small steps, yet it was still not possible to guarantee optimal results.
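A regression test of this kind can be sketched roughly as follows. This is only an illustration: the function name `regression_diff`, the query labels, and the travel times in minutes are invented for the example and do not come from any real timetable system.

```python
def regression_diff(old_results, new_results):
    """Sort queries by whether the new version got better, worse, or stayed the same.

    Both arguments map a query label to a travel time in minutes
    (hypothetical data; lower is better).
    """
    report = {"improved": [], "worse": [], "unchanged": []}
    for query, old_minutes in old_results.items():
        new_minutes = new_results[query]
        if new_minutes < old_minutes:
            report["improved"].append(query)
        elif new_minutes > old_minutes:
            report["worse"].append(query)
        else:
            report["unchanged"].append(query)
    return report


# Invented sample results of an old and a new version.
OLD = {"A->D": 60, "A->B": 10, "C->D": 12}
NEW = {"A->D": 45, "A->B": 10, "C->D": 14}

REPORT = regression_diff(OLD, NEW)
```

Such a report only says how the new version compares to the old one. Even a result marked as improved may still be far from the best possible connection, which is exactly the limitation described above.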

How do other industries solve such problems? The buzzword is “diverse redundancy”.

Redundancy normally means installing more than one component so that one can take over a task when another, which was supposed to perform it, fails. In this case several identical copies of the same component exist, e.g. redundant network segments or network interfaces in fault-tolerant computers. But what happens if a system fails because of a design error? Then the backup system will fail as well! This is not a comfortable thought for critical systems.

This is why diverse redundancy (Wikipedia) is preferred in critical systems. The main criterion is that every component is developed separately and independently (e.g. on different hardware or with different software), so that the same errors are unlikely to be repeated. Ultimately, the system has to be built twice ... with all the consequences, including higher development costs.

In 2001, large parts of the existing timetable information system were changed for a large customer. Quality assurance by regression tests alone was not considered sufficient. In addition, it was better for further development in the long run to be able to compare the production system with another, independently developed system.
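The idea of comparing a production system with an independently developed one can be illustrated with a toy example. The sketch below is not the algorithm of any real timetable system: the tiny network, the greedy heuristic standing in for the "production" side, and all function names are assumptions made purely for illustration. The reference side uses Dijkstra's algorithm, which is guaranteed to find the shortest travel time on such a graph with non-negative edge weights.

```python
import heapq


def greedy_travel_time(graph, source, target):
    """Stand-in heuristic: always follow the cheapest outgoing edge."""
    total, node, visited = 0, source, {source}
    while node != target:
        options = [(cost, nxt) for nxt, cost in graph[node] if nxt not in visited]
        if not options:
            return None  # dead end, the heuristic gives up
        cost, node = min(options)
        visited.add(node)
        total += cost
    return total


def optimal_travel_time(graph, source, target):
    """Independent reference: Dijkstra's algorithm, optimal by construction."""
    best = {source: 0}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # outdated queue entry
        for nxt, cost in graph[node]:
            candidate = dist + cost
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                heapq.heappush(queue, (candidate, nxt))
    return None  # target not reachable


def find_suboptimal(graph, queries):
    """List every query where the heuristic differs from the reference."""
    findings = []
    for source, target in queries:
        produced = greedy_travel_time(graph, source, target)
        optimum = optimal_travel_time(graph, source, target)
        if produced != optimum:
            findings.append((source, target, produced, optimum))
    return findings


# Invented network: node -> list of (neighbour, minutes).
GRAPH = {
    "A": [("B", 10), ("C", 15)],
    "B": [("D", 50)],
    "C": [("D", 10)],
    "D": [],
}
QUERIES = [("A", "D"), ("A", "B"), ("C", "D")]

FINDINGS = find_suboptimal(GRAPH, QUERIES)
```

Here the heuristic answers the query from A to D with 60 minutes (via B), while the reference finds 25 minutes (via C). Because the two implementations share no code and no design decisions, a discrepancy like this points to a concrete weakness that a comparison between two versions of the same heuristic could not reveal.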

Luckily, the team at the Chair of Prof. Karsten Weihe (then in Bonn) had experience in this field, and for that reason the cost of a system built purely for quality assurance was acceptable. The idea was born, and in mid-2002 the first fully functional predecessor of MOTIS was developed. One requirement was that the algorithmic approach must guarantee the optimality of the results at all times.

Although MOTIS has been considerably enhanced since then, it still plays its role in quality assurance, with guaranteed optimality, of course.