These changes add more useful profiling statistics to the frame-level CSV file, which is generated by --csv log.csv --log-level debug. The idea is to be able to identify systemic bottlenecks on new architectures