You have not said which processor is on the board (far more important than the brand of board!). If the processor includes an ETM and you have a ULINK-Pro or other trace-capable debugger, then uVision can non-intrusively profile the executing code directly at the instruction cycle level.
Similarly, if you run the code in the uVision simulator rather than on real hardware, you can get cycle-accurate profiling and timing without the need for hardware trace support.
Even without the trace capability, uVision's "stopwatch" feature can time execution between two breakpoints directly. The stopwatch is in the status bar at the bottom of the IDE. You do need to set the clock frequency in the debugger trace configuration to get "real time" from the stopwatch.
A simple approach that requires no special debug or simulator capability is to use an available timer peripheral (or, in the case of Cortex-M devices, the SysTick timer) to timestamp the start and end of execution of a code section. If you have no available timing resource, you could toggle a GPIO pin and monitor it on an oscilloscope. These methods have some software overhead that is not present in hardware or simulator trace, which may make them unsuitable for very short code sections.
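As a rough sketch of the timer-timestamp approach, the helpers below compute elapsed ticks from two readings of a 24-bit down-counter such as SysTick, and convert them to microseconds. The function names `ticks_elapsed` and `ticks_to_us` are made up for illustration, and the code assumes the counter is free-running with a reload value of 0xFFFFFF and wraps at most once between the two readings. `SysTick->VAL` and `SystemCoreClock` in the usage comment are the usual CMSIS names; check them against your device headers.

```c
#include <assert.h>
#include <stdint.h>

/* SysTick is a 24-bit down-counter, so only the low 24 bits are valid. */
#define TIMER_MASK 0x00FFFFFFu

/* Elapsed ticks between two readings of a 24-bit down-counter.
 * Assumes reload = 0xFFFFFF and at most one wrap between readings. */
static uint32_t ticks_elapsed(uint32_t start, uint32_t end)
{
    /* The counter counts down, so start - end is the elapsed count;
     * masking to 24 bits handles a single wrap through zero. */
    return (start - end) & TIMER_MASK;
}

/* Convert a tick count to whole microseconds for a given timer clock. */
static uint32_t ticks_to_us(uint32_t ticks, uint32_t clock_hz)
{
    assert(clock_hz != 0u);
    /* 64-bit intermediate avoids overflow of ticks * 1000000. */
    return (uint32_t)(((uint64_t)ticks * 1000000u) / clock_hz);
}

/* Hypothetical use on target, with SysTick already enabled:
 *
 *     uint32_t t0 = SysTick->VAL;
 *     code_under_test();
 *     uint32_t t1 = SysTick->VAL;
 *     uint32_t us = ticks_to_us(ticks_elapsed(t0, t1), SystemCoreClock);
 *
 * Taking two back-to-back readings with nothing in between gives an
 * estimate of the measurement overhead to subtract from the result.
 */
```

For very short sections the overhead of the two reads can be comparable to the code being measured, which is where trace or the simulator becomes the better tool.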