As we are curious about engine and game performance beyond the “bad” frames-per-second measurement, I started to integrate a benchmark “tool” into the engine. This “tool” only registers the start and the end of a specified function block, together with a short description of what the block does.
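To make the idea concrete, here is a minimal sketch of how such a start/end registration could look in C++. It is not the engine's actual code: `BenchmarkRecord`, `BenchmarkLog`, `ScopedBenchmark` and `updatePhysics` are all hypothetical names, and the mutex-protected vector is simply the easiest thing to show, not necessarily the fastest.

```cpp
#include <chrono>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical record: one measured block with its description,
// the thread it ran on, and its start and end time.
struct BenchmarkRecord {
    std::string description;
    std::thread::id thread;
    std::chrono::steady_clock::time_point start;
    std::chrono::steady_clock::time_point end;
};

// Hypothetical global sink that collects records from all threads.
class BenchmarkLog {
public:
    static BenchmarkLog& instance() {
        static BenchmarkLog log;
        return log;
    }

    void add(BenchmarkRecord record) {
        std::lock_guard<std::mutex> lock(mutex_);
        records_.push_back(std::move(record));
    }

    // Returns a copy so a viewer can read it without holding the lock.
    std::vector<BenchmarkRecord> snapshot() {
        std::lock_guard<std::mutex> lock(mutex_);
        return records_;
    }

private:
    std::mutex mutex_;
    std::vector<BenchmarkRecord> records_;
};

// Registers the start in the constructor and the end in the destructor,
// so the measured block is simply the enclosing scope.
class ScopedBenchmark {
public:
    explicit ScopedBenchmark(std::string description)
        : description_(std::move(description)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedBenchmark() {
        BenchmarkLog::instance().add({std::move(description_),
                                      std::this_thread::get_id(),
                                      start_,
                                      std::chrono::steady_clock::now()});
    }

    ScopedBenchmark(const ScopedBenchmark&) = delete;
    ScopedBenchmark& operator=(const ScopedBenchmark&) = delete;

private:
    std::string description_;
    std::chrono::steady_clock::time_point start_;
};

// Example of a measured function (hypothetical, stands in for real work).
void updatePhysics() {
    ScopedBenchmark bench("physics: step world");
    std::this_thread::sleep_for(std::chrono::milliseconds(2));
}
```

The scope-based approach means a function only needs one extra line to be measured, and early returns or exceptions still register the end of the block.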
With a separate benchmark viewer we can analyze the engine performance per function and per thread, and it shows the actions on a nice time line. What we see are blocks, aligned in time per thread. A large block means something took a ton of time, a small block usually means just a few milliseconds. Because the engine schedules update, physics, render and content-loading calls on different threads, we can now see what actually happens inside. This also gives us more power for better optimizations and better debugging.
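The per-thread layout is easy to picture in code as well. The following is only a sketch of how a viewer might arrange recorded blocks into one lane per thread, not how our viewer is actually written. `TimelineBlock`, `Lanes`, `buildLanes` and `slowestBlock` are hypothetical names, and I assume here that each block already carries a numeric thread index and times in milliseconds.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical block as a viewer might load it from a recording.
struct TimelineBlock {
    std::string description;
    std::uint32_t threadIndex;  // assumed: threads are numbered 0, 1, 2, ...
    double startMs;
    double endMs;

    double durationMs() const { return endMs - startMs; }
};

// One lane per thread, each lane holding its blocks in time order.
using Lanes = std::map<std::uint32_t, std::vector<TimelineBlock>>;

Lanes buildLanes(const std::vector<TimelineBlock>& blocks) {
    Lanes lanes;
    for (const TimelineBlock& block : blocks) {
        lanes[block.threadIndex].push_back(block);
    }
    // Sort every lane by start time so blocks line up along the time axis.
    for (auto& lane : lanes) {
        std::sort(lane.second.begin(), lane.second.end(),
                  [](const TimelineBlock& a, const TimelineBlock& b) {
                      return a.startMs < b.startMs;
                  });
    }
    return lanes;
}

// Finds the block with the longest duration, i.e. the widest one on screen.
// Returns nullptr when there are no blocks.
const TimelineBlock* slowestBlock(const std::vector<TimelineBlock>& blocks) {
    auto it = std::max_element(blocks.begin(), blocks.end(),
                               [](const TimelineBlock& a, const TimelineBlock& b) {
                                   return a.durationMs() < b.durationMs();
                               });
    return it == blocks.end() ? nullptr : &*it;
}
```

Drawing is then just a matter of walking each lane and mapping `startMs` and `durationMs()` to a horizontal position and width, which is why the expensive calls stand out as the widest blocks.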