I want to count the clock cycles taken by a specific function in my C code, which will be compiled and run on a BeagleBone Black. I have no idea how to do this. Searching the web, I found these instructions:

Clock Read method on Arndale board:

Step-1: Insert a kernel module to enable user-space access to the PMU counters. Untar the attached file “arndale_clockread.tar.bz2”, which contains a Makefile and enableccnt.c. In the Makefile, set “KERNELDIR” to your kernel source directory (e.g. /usr/src/linux-kernel-version), then run:

linaro@linaro-server:~/enableccnt$ make

The above command should produce enableccnt.ko, the kernel module that enables user-space access to the PMU counters. Then run:

linaro@linaro-server:~/enableccnt$ sudo insmod enableccnt.ko

The following command should show that the enableccnt module has been inserted into the running kernel:

linaro@linaro-server:~/enableccnt$ lsmod

Step-2: Read the counter from user-space applications. Once the kernel module is set up, the following function can be used to read the counter:

#include <sys/time.h>

static int enabled; /* set once the counters have been programmed */

static void readticks(unsigned int *result)
{
  struct timeval t;
  unsigned int cc;
  if (!enabled) {
    // program the performance-counter control register
    asm volatile("mcr p15, 0, %0, c9, c12, 0" :: "r"(17));
    // enable all counters
    asm volatile("mcr p15, 0, %0, c9, c12, 1" :: "r"(0x8000000f));
    // clear the counter overflow flags
    asm volatile("mcr p15, 0, %0, c9, c12, 3" :: "r"(0x8000000f));
    enabled = 1;
  }
  // read the cycle counter value
  asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(cc));
  gettimeofday(&t,(struct timezone *) 0);
  result[0] = cc;
  result[1] = t.tv_usec;
  result[2] = t.tv_sec;
}

I believe these instructions should work on any ARMv7 platform, so I followed them and changed the kernel source directory. This is what my Makefile looks like:

KERNELDIR := /usr/src/linux-headers-3.8.13-bone70

obj-m := enableccnt.o
CROSS=arm-linux-gnueabihf-

all:
        CC=arm-cortex_a15-linux-gnueabihf-gcc $(MAKE) ARCH=arm -C $(KERNELDIR) M=`pwd`  CROSS_COMPILE=$(CROSS) -I/lib/arm-linux-gnueabihf/lib

Now, when I run make, I get this error, which complains that arm-linux-gnueabihf-ar was not found:

CC=arm-cortex_a08-linux-gnueabihf-gcc make ARCH=arm -C /usr/src/linux-headers-3.8.13-bone70 M=`pwd`  CROSS_COMPILE=arm-linux-gnueabihf- -I/lib/arm-linux-gnueabihf/
make[1]: Entering directory `/usr/src/linux-headers-3.8.13-bone70'
  LD      /root/crypto_project/Arndale_enableccnt/built-in.o
/bin/sh: 1: arm-linux-gnueabihf-ar: not found
make[2]: *** [/root/crypto_project/Arndale_enableccnt/built-in.o] Error 127
make[1]: *** [_module_/root/crypto_project/Arndale_enableccnt] Error 2
make[1]: Leaving directory `/usr/src/linux-headers-3.8.13-bone70'
make: *** [all] Error 2

I tried to install arm-linux-gnueabihf-ar, but that did not work, so I have no idea what to do next.

EDIT1- As suggested in the comments, I added my toolchain path to the PATH environment variable:

export PATH=/path/to/mytoolchain/bin:$PATH

Now the previous error is gone. However, I get this syntax error, which I think is related to the kernel header files:

CC=arm-cortex_a15-linux-gnueabihf-gcc make ARCH=arm -C /usr/src/linux-headers-3.8.13-bone70 M=`pwd`  CROSS_COMPILE=arm-linux-gnueabihf- -I/lib/arm-linux-gnueabihf/bin
/root/gcc-linaro-arm-linux-gnueabihf-4.7-2012.11-20121123_linux/bin/arm-linux-gnueabihf-gcc: 1: /root/gcc-linaro-arm-linux-gnueabihf-4.7-2012.11-20121123_linux/bin/arm-linux-gnueabihf-gcc: Syntax error: "(" unexpected
make[1]: Entering directory `/usr/src/linux-headers-3.8.13-bone70'
  LD      /root/crypto_project/Arndale_enableccnt/built-in.o
/root/gcc-linaro-arm-linux-gnueabihf-4.7-2012.11-20121123_linux/bin/arm-linux-gnueabihf-ar: 1: /root/gcc-linaro-arm-linux-gnueabihf-4.7-2012.11-20121123_linux/bin/arm-linux-gnueabihf-ar: Syntax error: "(" unexpected
make[2]: *** [/root/crypto_project/Arndale_enableccnt/built-in.o] Error 2
make[1]: *** [_module_/root/crypto_project/Arndale_enableccnt] Error 2
make[1]: Leaving directory `/usr/src/linux-headers-3.8.13-bone70'
make: *** [all] Error 2

The only reasonable solution I can think of is to download the kernel source code with its header files and run make again. Does anyone have an idea how to resolve this issue?

You want to measure the FLOPS of your processor. Have a look at how the Linux kernel computes it at boot time. - Claudio
After installing toolchain, you need to configure your environment. First of all, do export PATH=/path/to/your/toolchain/bin:$PATH. Then export CROSS_COMPILE=arm-linux-gnueabihf-. Now you can try to build your module with ARCH=arm (like you do). - Sam Protsenko
@Claudio I don't know if I understand your comment correctly or not, but all I want to compute is clock cycles count for a specific function inside my C code which is going to be compiled with gcc on ARM Cortex-A8 microprocessor. - A23149577
@SamProtsenko Thank you for your comment. I'll give it a try and let you know how it goes. - A23149577
@AJeneral Look at this answer, particularly "Toolchain" section. Basically you need to do the same steps, only for ARM toolchain (rather than MIPS one). - Sam Protsenko

1 Answer

As there can be many obstacles along the way, below is a complete guide to building that kernel module and the user-space application.

Toolchain

First of all, you need to download and install two toolchains:

  1. Toolchain for building kernel (and kernel modules): bare-metal (EABI) toolchain
  2. Toolchain for building user-space application: GNU/Linux toolchain

I recommend the Linaro ARM toolchains, as they are free, reliable and well optimized for ARM. Here you can choose the desired toolchains (in the "Linaro Toolchain" section). The BeagleBone Black is little-endian by default (like most ARMv7 processors), so download the following two archives:

  1. linaro-toolchain-binaries (little-endian) Bare Metal
  2. linaro-toolchain-binaries (little-endian) Linux

Once downloaded, extract those archives into the /opt directory.

Kernel sources

First of all, you need to find out exactly which kernel sources were used to build the kernel that was flashed to your board. You can try to figure that out (from your board revision) from here. Alternatively, you can build your own kernel and flash it to your board, so that you know exactly which kernel version is in use.

Either way, you need to download the kernel sources that correspond to the kernel on your board. Those sources will be used later to build the kernel module. If the kernel version does not match, loading the module will fail with a "version magic" mismatch ("Invalid module format") or a similar error.
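To see which release the board is actually running (the same string that uname -r prints), a small helper along these lines can be compiled and run on the board. This is only a sketch: kernel_release() and release_matches() are names made up for this example, and a matching release string is necessary but not sufficient, because the module's version magic also encodes some build options.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Copy the running kernel's release string (what `uname -r` prints)
 * into buf. Returns 0 on success, -1 on failure.
 * Hypothetical helper, not part of the module or the test program. */
static int kernel_release(char *buf, size_t len)
{
    struct utsname u;

    assert(buf != NULL && len > 0);

    if (uname(&u) != 0)
        return -1;

    snprintf(buf, len, "%s", u.release);
    return 0;
}

/* Return 1 if the release the module was built for is the one
 * currently running, 0 otherwise. Hypothetical helper. */
static int release_matches(const char *running, const char *built_for)
{
    return strcmp(running, built_for) == 0;
}
```

On the board, the string returned by kernel_release() can be compared with the release at the start of the vermagic field that modinfo prints for perfcnt_enable.ko.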

I will use the stable kernel sources from kernel.org just for reference (they should be sufficient at least to build the module).

Build kernel

Run the following commands in your terminal to configure the shell environment (bare-metal toolchain) for building the kernel:

$ export PATH=/opt/gcc-linaro-5.1-2015.08-x86_64_arm-eabi/bin:$PATH
$ export CROSS_COMPILE=arm-eabi-
$ export ARCH=arm

Configure the kernel using the defconfig for your board (from arch/arm/configs/). I will use omap2plus_defconfig as an example:

$ make omap2plus_defconfig

Now either build the whole kernel:

$ make -j4

or prepare only the kernel files needed to build an external module:

$ make prepare
$ make modules_prepare

In the second case Module.symvers is not generated, so the module will be built without symbol version information and you will probably need to use the "force" option when loading it. Building the whole kernel is therefore the preferred option.

Kernel module

NOTE: the code used below is from this answer.

First you need to enable user-space access to the ARM performance counter (details are here). This can only be done from kernel space. Here is the module code and Makefile you can use to do so:

perfcnt_enable.c:

#include <linux/module.h>

static int __init perfcnt_enable_init(void)
{

    /* Enable user-mode access to the performance counter */
    asm ("mcr p15, 0, %0, C9, C14, 0\n\t" :: "r"(1));

    /* Disable counter overflow interrupts (just in case) */
    asm ("mcr p15, 0, %0, C9, C14, 2\n\t" :: "r"(0x8000000f));

    pr_debug("### perfcnt_enable module is loaded\n");
    return 0;
}

static void __exit perfcnt_enable_exit(void)
{
}

module_init(perfcnt_enable_init);
module_exit(perfcnt_enable_exit);

MODULE_AUTHOR("Sam Protsenko");
MODULE_DESCRIPTION("Module for enabling performance counter on ARMv7");
MODULE_LICENSE("GPL");

Makefile:

ifneq ($(KERNELRELEASE),)

# kbuild part of makefile

CFLAGS_perfcnt_enable.o := -DDEBUG
obj-m := perfcnt_enable.o

else

# normal makefile

KDIR ?= /lib/modules/$(shell uname -r)/build

module:
	$(MAKE) -C $(KDIR) M=$(PWD) modules

clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean

.PHONY: module clean

endif

Build kernel module

Using the shell environment configured in the previous step, export one more environment variable:

$ export KDIR=/path/to/your/kernel/sources/dir

Now just run:

$ make

The module is now built (the perfcnt_enable.ko file).

User-space application

Once the ARM performance counter is enabled from kernel space (by the kernel module), you can read its value in a user-space application. Here is an example of such an application.

perfcnt_test.c:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static unsigned int get_cyclecount(void)
{
    unsigned int value;

    /* Read CCNT Register */
    asm volatile ("mrc p15, 0, %0, c9, c13, 0\t\n": "=r"(value));

    return value;
}

static void init_perfcounters(int32_t do_reset, int32_t enable_divider)
{
    /* In general enable all counters (including cycle counter) */
    int32_t value = 1;

    /* Perform reset */
    if (do_reset) {
        value |= 2; /* reset all counters to zero */
        value |= 4; /* reset cycle counter to zero */
    }

    if (enable_divider)
        value |= 8; /* enable "by 64" divider for CCNT */

    value |= 16; /* enable export of events (PMCR X bit) */

    /* Program the performance-counter control-register */
    asm volatile ("mcr p15, 0, %0, c9, c12, 0\t\n" :: "r"(value));

    /* Enable all counters */
    asm volatile ("mcr p15, 0, %0, c9, c12, 1\t\n" :: "r"(0x8000000f));

    /* Clear overflows */
    asm volatile ("mcr p15, 0, %0, c9, c12, 3\t\n" :: "r"(0x8000000f));
}

int main(void)
{
    unsigned int overhead;
    unsigned int t;

    /* Init counters */
    init_perfcounters(1, 0);

    /* Measure the counting overhead */
    overhead = get_cyclecount();
    overhead = get_cyclecount() - overhead;

    /* Measure ticks for some operation */
    t = get_cyclecount();
    sleep(1);
    t = get_cyclecount() - t;

    printf("function took exactly %u cycles (including function call)\n",
            t - overhead);

    return EXIT_SUCCESS;
}
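The magic numbers OR-ed into value in init_perfcounters() are bits of the ARMv7 performance monitor control register (PMCR). As a rough sketch, the same value can be built from named constants; the macro and function names below are made up for this example and do not come from any header. The second helper scales a counter delta back to CPU cycles when the "by 64" divider is on.

```c
#include <assert.h>
#include <stdint.h>

/* PMCR bits on ARMv7-A (macro names are local to this sketch) */
#define PMCR_ENABLE        (1u << 0) /* E: enable all counters */
#define PMCR_RESET_EVENTS  (1u << 1) /* P: reset the event counters */
#define PMCR_RESET_CYCLES  (1u << 2) /* C: reset the cycle counter */
#define PMCR_DIV64         (1u << 3) /* D: cycle counter ticks every 64 cycles */
#define PMCR_EXPORT        (1u << 4) /* X: export events */

/* Build the value that init_perfcounters() writes to PMCR. */
static uint32_t pmcr_value(int do_reset, int enable_divider)
{
    uint32_t value = PMCR_ENABLE | PMCR_EXPORT;

    if (do_reset)
        value |= PMCR_RESET_EVENTS | PMCR_RESET_CYCLES;
    if (enable_divider)
        value |= PMCR_DIV64;

    /* Only the five low bits are ever set here. */
    assert((value & ~0x1Fu) == 0);
    return value;
}

/* Convert a cycle-counter delta to CPU cycles, taking the divider
 * into account. 64-bit result so the scaling cannot overflow. */
static uint64_t ccnt_to_cycles(uint32_t ccnt_delta, int divider_enabled)
{
    return divider_enabled ? (uint64_t)ccnt_delta * 64u : ccnt_delta;
}
```

With no reset and no divider this gives 17, the same constant the readticks() function in the question writes to the control register.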

Makefile:

CC = gcc
APP = perfcnt_test
SOURCES = perfcnt_test.c
CFLAGS = -Wall -O2 -static

default:
	$(CROSS_COMPILE)$(CC) $(CFLAGS) $(SOURCES) -o $(APP)

clean:
	-rm -f $(APP)

.PHONY: default clean

Notice that I added the -static option in case you are using Android or a similar system. If your distro has a regular libc, you can probably remove that flag to reduce the size of the resulting binary.

Build user-space application

Prepare shell environment (Linux toolchain):

$ export PATH=/opt/gcc-linaro-5.1-2015.08-x86_64_arm-linux-gnueabihf/bin:$PATH
$ export CROSS_COMPILE=arm-linux-gnueabihf-

Build the application:

$ make

The output binary is perfcnt_test.

Testing

  1. Upload both kernel module and user-space application to your board.
  2. Load the module:

    # insmod perfcnt_enable.ko
    
  3. Run the application:

    # ./perfcnt_test