2
votes

I'm developing code for the NXP LPC1788 microcontroller, and part of my work is making the product based on it USB-compatible. Most of the legwork is done and in general communication over USB works almost as well as over CAN.

However, one of the issues I've been encountering is that, when producing a constant output of USB messages from the microcontroller that are sent pretty close together, some of these messages are occasionally dropped.

I'm using a custom driver I wrote based on WinUSB to receive the messages on the PC side, and I originally suspected that the problem was on the receiving end of things. However, using USBLyzer I'm now confident that the problem is on the sending side - the USBLyzer logs perfectly match logs produced from what I get out of WinUsb_ReadPipe().

The LPC1788 uses USB 2.0 Full Speed protocol and I've confirmed that information is being sent and received at around 12 MHz using a probe, which is what it should be.

The device is configured to make use of two endpoints: logical endpoint 2 IN and logical endpoint 2 OUT. Both of these are configured for bulk transfer with a maximum packet size of 64 bytes.

I would think that messages are being sent around 500-600 microseconds apart at the least (I've introduced an artificial 500 us delay in the thread and message transmission should take a lot less time than that). This is about what I got last week; I can't check now as my debugging tools are acting up.

This is the USB initialisation code for the microcontroller:

void USBInit()
{  
  // Configure USB pins.
  PINSEL_ConfigPin(0, 29, 1); // USB_D+1
  PINSEL_ConfigPin(0, 30, 1); // USB_D-1
  PINSEL_ConfigPin(1, 18, 1); // USB_UP_LED1
  PINSEL_ConfigPin(2,  9, 1); // USB_CONNECT1
  PINSEL_ConfigPin(1, 30, 2); // USB_VBUS 

  // Turn on power and clock
  CLKPWR_ConfigPPWR(CLKPWR_PCONP_PCUSB, ENABLE);

  PINSEL_SetPinMode(1, 30, PINSEL_BASICMODE_PLAINOUT);

  // Set DEV_CLK_EN and AHB_CLK_EN.
  LPC_USB->USBClkCtrl |= 0x12;

  // Wait until change is reflected in clock status register.
  while((LPC_USB->USBClkSt & 0x12) != 0x12);

  // Enable NVIC USB interrupts.
  NVIC_EnableIRQ(USB_IRQn);

  // Reset the USB.
  USBReset();

  // Set device address to 0x0 and enable device & connection.
  USBSetAddress(0);
}

This is the code used by the microcontroller to send messages via USB:

uint32_t USB_Send(uint32_t endpoint, uint8_t *pData, uint32_t count)
{
  // Convert into a form that can be sent successfully using USB.  
  uint8_t data[USB_MAX_PACKET_SIZE];

  for(int i=0; i < count; i++)
  {
    data[i*2] = hex[(pData[i] >> 4)];
    data[(i*2)+1] = hex[(pData[i] & 0xF)];
  }

  return USBWriteEndpoint(endpoint, data, count*2);
}

uint32_t USBWriteEndpoint(uint32_t endpoint, uint8_t *pData, uint32_t count)
{
  uint32_t i;

  LPC_USB->Ctrl = ((endpoint & 0xF) << 2) | CTRL_WR_EN;

  LPC_USB->TxPLen = count;

  for(i=0; i < (count+3)/4; i++)
  {
    LPC_USB->TxData = *((__packed uint32_t *)pData);
    pData += 4;
  }

  LPC_USB->Ctrl = 0;
  USBValidateEndpoint(endpoint);
  return count;
}

void USBValidateEndpoint(uint32_t endpoint)
{
  writeSIEEndpointCommand(endpoint, CMD_VALID_BUF);
}

void writeSIECommandData(uint32_t cmd, uint32_t data)
{
  LPC_USB->DevIntClr = CCEMTY_INT;
  LPC_USB->CmdCode = cmd;
  while((LPC_USB->DevIntSt & CCEMTY_INT) == 0);
  LPC_USB->DevIntClr = CCEMTY_INT;
  LPC_USB->CmdCode = data;
  while((LPC_USB->DevIntSt & CCEMTY_INT) == 0);
}

EDIT

To give an idea of what's happening, this is a log file produced from my USB driver's receive function (the one from USBLyzer is practically identical):

0000030D000D
0000010D002D0004001B0024
0000000D0FFF001600310016
0000010D002D0004001F0028
0000020D00280028001B002D
0000030D0009
0000000D0FFF001600310016
0000010D002D0004001F0028
0000020D00280028001B002D
0000030D0009
0000010D002D0004001F0028
0000020D00280028001B002D
0000030D0009
0000010D002D0004001F0028
0000020D00280028001B002D
0000030D0009
0000010D002D0004001F0028
0000020D00280028001B002D
0000030D0009
0000000D0FFF001600310016
0000010D002D0004001F0028
0000020D00280028001B002D
0000030D0009
0000010D002D0004001F0028

I should be receiving messages in the following cycle:

0000000D...   
0000010D...
0000020D...
0000030D...

You can see from this log that some of the messages in the cycle are getting skipped.

EDIT 2

Included below are excerpts from the raw and filtered capture logs produced by USBLyzer. The raw logs mostly consist of cancelled requests because my driver is poll-driven and uses a timeout.

Raw logs:

USBlyzer Report

Capture List

Type    Seq Time    Elapsed Duration    Request Request Details Raw Data    I/O C:I:E   Device Object   Device Name Driver Name IRP Status
START   0001    11:09:15.413                                                
URB 0002    11:09:18.484    3.071197 s      Bulk or Interrupt Transfer  10 bytes data   30 30 30 30 30 30 31 46...  out 01:00:02    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA801142FAB0h   
00000000  30 30 30 30 30 30 31 46 30 31                    0000001F01      
URB 0003    11:09:18.484    3.071212 s      Bulk or Interrupt Transfer  64 bytes buffer     in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   
URB 0004-0002   11:09:18.484    3.071371 s  174 us  Bulk or Interrupt Transfer  10 bytes buffer     out 01:00:02    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA801142FAB0h   Success (Success)
URB 0005-0003   11:09:18.485    3.071586 s  374 us  Bulk or Interrupt Transfer          in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Cancelled (Canceled)
URB 0006    11:09:18.485    3.071608 s      Bulk or Interrupt Transfer  64 bytes buffer     in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   
URB 0007-0006   11:09:18.486    3.072582 s  974 us  Bulk or Interrupt Transfer          in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Cancelled (Canceled)
URB 0008    11:09:18.486    3.072603 s      Bulk or Interrupt Transfer  64 bytes buffer     in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   
URB 0009-0008   11:09:18.487    3.073598 s  996 us  Bulk or Interrupt Transfer          in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Cancelled (Canceled)
URB 0010    11:09:18.487    3.073630 s      Bulk or Interrupt Transfer  64 bytes buffer     in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   
URB 0011-0010   11:09:18.488    3.074601 s  970 us  Bulk or Interrupt Transfer          in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Cancelled (Canceled)
[...]
URB 2504-2501   11:09:19.734    4.320666 s  161 us  Bulk or Interrupt Transfer  14 bytes buffer     out 01:00:02    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA800CF662D0h   Success (Success)
URB 2505-2503   11:09:19.734    4.320785 s  192 us  Bulk or Interrupt Transfer  24 bytes data   30 30 30 30 30 30 30 44...  in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Success (Success)
00000000  30 30 30 30 30 30 30 44 30 46 46 46 30 30 31 36  0000000D0FFF0016
00000010  30 30 33 31 30 30 31 42                          0031001B        

Filtered logs:

USBlyzer Report

Capture List

Type    Seq Time    Elapsed Duration    Request Request Details Raw Data    I/O C:I:E   Device Object   Device Name Driver Name IRP Status
URB 0004-0002   11:09:18.484    3.071371 s  174 us  Bulk or Interrupt Transfer  10 bytes buffer     out 01:00:02    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA801142FAB0h   Success (Success)
URB 2504-2501   11:09:19.734    4.320666 s  161 us  Bulk or Interrupt Transfer  14 bytes buffer     out 01:00:02    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA800CF662D0h   Success (Success)
URB 2505-2503   11:09:19.734    4.320785 s  192 us  Bulk or Interrupt Transfer  24 bytes data   30 30 30 30 30 30 30 44...  in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Success (Success)
00000000  30 30 30 30 30 30 30 44 30 46 46 46 30 30 31 36  0000000D0FFF0016
00000010  30 30 33 31 30 30 31 42                          0031001B        
URB 2507-2506   11:09:19.734    4.321309 s  459 us  Bulk or Interrupt Transfer  24 bytes data   30 30 30 30 30 31 30 44...  in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Success (Success)
00000000  30 30 30 30 30 31 30 44 30 30 32 44 30 30 30 34  0000010D002D0004
00000010  30 30 31 46 30 30 32 44                          001F002D        
URB 2511-2510   11:09:19.735    4.321931 s  311 us  Bulk or Interrupt Transfer  24 bytes data   30 30 30 30 30 32 30 44...  in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Success (Success)
00000000  30 30 30 30 30 32 30 44 30 30 32 38 30 30 32 38  0000020D00280028
00000010  30 30 31 42 30 30 33 31                          001B0031        
URB 2513-2512   11:09:19.735    4.322306 s  332 us  Bulk or Interrupt Transfer  12 bytes data   30 30 30 30 30 33 30 44...  in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Success (Success)
00000000  30 30 30 30 30 33 30 44 30 30 30 44              0000030D000D    
URB 2725-2724   11:09:19.840    4.426662 s  89 us   Bulk or Interrupt Transfer  24 bytes data   30 30 30 30 30 30 30 44...  in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Success (Success)
00000000  30 30 30 30 30 30 30 44 30 46 46 46 30 30 31 36  0000000D0FFF0016
00000010  30 30 33 31 30 30 31 42                          0031001B        
URB 2727-2726   11:09:19.840    4.427183 s  471 us  Bulk or Interrupt Transfer  24 bytes data   30 30 30 30 30 31 30 44...  in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Success (Success)
00000000  30 30 30 30 30 31 30 44 30 30 32 44 30 30 30 34  0000010D002D0004
00000010  30 30 31 46 30 30 32 44                          001F002D        
URB 2731-2730   11:09:19.841    4.427803 s  209 us  Bulk or Interrupt Transfer  24 bytes data   30 30 30 30 30 32 30 44...  in  01:00:82    FFFFFA800FC98440h   USBPDO-11   usbhub  FFFFFA8010E965A0h   Success (Success)
00000000  30 30 30 30 30 32 30 44 30 30 32 38 30 30 32 38  0000020D00280028
00000010  30 30 31 42 30 30 33 31                          001B0031        
3

3 Answers

0
votes

Do you have a hardware USB analyzer available? I'm not sure how USBLyzer works but I presume it is still using the Windows USB subsystem at the lowest levels. My experience with Windows USB subsystem has been that it is extremely bug-riddled - one particular example was when transferring data that is an exact multiple of the max frame size, it doesn't work. I'm not saying this is your exact problem - we had different symptoms from what you are reporting - but in your shoes I would look at a) changing the max amount of data your transmitting end will send in one frame so it's not an exact multiple of the frame size, and b) getting a hardware USB analyzer to see what is actually on the wire.

0
votes

The problem might be the USB_IRQHandler interrupting your write function. That would leave the USB unexpected in a different state, thus the write fails.

You can temporary disable the IRQ as a workaround:

uint32_t USBWriteEndpoint(uint32_t endpoint, uint8_t *pData, uint32_t count)
{
  uint32_t i;
  NVIC_DisableIRQ(USB_IRQn);    // USB IRQ handlaer must not run ...
  LPC_USB->Ctrl = ((endpoint & 0xF) << 2) | CTRL_WR_EN;

  LPC_USB->TxPLen = count;

  for(i=0; i < (count+3)/4; i++)
  {
    LPC_USB->TxData = *((__packed uint32_t *)pData);
    pData += 4;
  }

  LPC_USB->Ctrl = 0;
  USBValidateEndpoint(endpoint);
  NVIC_EnableIRQ(USB_IRQn); // ... until we are here. Enable USB IRQ again
  return count;
}
0
votes

I think I've managed to make some significant improvements in the way that the microcontroller sends USB messages.

On page 400 of the LPC178x/7x user manual, I saw a description for how to properly handle sending data from the device to the host using a bulk IN endpoint. It was pretty lucky that I eventually stumbled across it since it was within the DMA section of the chapter (and I'm not using DMA so until now I've ignored it).

Based on what I read there, I added the following method:

// Params:      endpoint - the logical endpoint number.
// Returns:     TRUE if at least one write buffer is free,
//              FALSE otherwise.
// Description: Checks that the IN endpoint has a free write buffer.
uint8_t USBCheckInEndpointFree(uint32_t endpoint)
{
  uint16_t data;
  uint32_t physicalEndpoint = getEndpointPhysicalAddress(endpoint);

  writeSIECommand(CMD_SEL_EP(physicalEndpoint));
  data = readSIECommandData(DAT_SEL_EP(physicalEndpoint));

  return (data & 0x1) == 0;
}

I changed USBWriteEndpoint to the following:

uint32_t USBWriteEndpoint(uint32_t endpoint, uint8_t *pData, uint32_t count)
{
  uint32_t i = 0;
  NVIC_DisableIRQ(USB_IRQn);

  if((endpoint & 0xF) != 0)
  {
    while(getSendMessageFlag(endpoint & 0xF) != 0);
    setSendMessageFlag(endpoint & 0xF);
  }

  while(!USBCheckInEndpointFree(endpoint))
  {
    uint32_t physicalEndpoint = getEndpointPhysicalAddress(endpoint);

    writeSIECommand(CMD_SEL_EP(physicalEndpoint));
    ITM_EVENT32_WITH_PC(3, readSIECommandData(DAT_SEL_EP(physicalEndpoint)));
  }

  LPC_USB->Ctrl = ((endpoint & 0xF) << 2) | CTRL_WR_EN;

  LPC_USB->TxPLen = count;

  for(i=0; i < (count+3)/4; i++)
  {
    LPC_USB->TxData = *((__packed uint32_t *)(pData+i*4));
    //pData += 4;
  }

  ITM_EVENT32_WITH_PC(4, (pData[4] << 24) | (pData[5] << 16) | (pData[6] << 8) | pData[7]);

  LPC_USB->Ctrl = 0;
  USBValidateEndpoint(endpoint);

  NVIC_EnableIRQ(USB_IRQn);
  return count;
}

The ITM_EVENT32_WITH_PC macro is used for debugging.

While sending USB messages at a high rate, I noticed this pattern in the event timeline:

In that image, the lines within the third channel show where the thread is caught within the while loop because none of the write buffers of the endpoint I'm using are free. The lines in the fourth channel are where a message is written to one of the endpoint's write buffers.

The LPC1788 uses double-buffering for bulk and isochronous endpoints, so what you're seeing in that image is the following:

  1. The microcontroller tries to send a USB message to the host. Both write buffers are free, so it picks one and uses that.
  2. The micro tries to send a second message. It uses the remaining free write buffer.
  3. The micro tries to send a third message. Both buffers are in use so it has to wait. That produces the series of lines in the third channel while the micro polls the endpoint status.
  4. Eventually one of the buffers (probably the first one used) becomes free, and the message is written to that.
  5. The micro tries to send a fourth message. Both buffers are in use and the micro apparently has to wait a fair while before one of them becomes free.
  6. Eventually one of the buffers becomes free and it writes the message to that.

Before adding in that additional check, I got behaviour like this:

Clearly without the check, the endpoint's buffers are getting overwritten!

To check how well this change resolved the problem, I created a Python script with a simple algorithm that I chose to evaluate performance by comparing how well the output from the USB logs (generated by one of my USB drivers on the PC end) conforms to a perfect cycle (which is what I'd ideally want).

I did three runs of my program with the additional check and three runs of my program without it, each time leaving it running long enough for 1000 USB messages to be received and recorded in a log file by the DLL.

The three logs with the additional check were USBLogGood1.txt ... USBLogGood3.txt. The three without the check were USBLogBad1.txt ... USBLogBad3.txt.

The Python code is given below:

# Checks how well the USB is performing by checking how
# closely the stream of USB messages returned by the DLL
# matches a perfect cyclical pattern.

from statistics import *

cycle = [1,2,3,4]
sampleSize = 1000

class Globals:
    totalCount = 0
    errorCount = 0

usbLogFile = "usbLog.txt"

usbGoodLogFiles = ["usbLogGood1.txt",
                   "usbLogGood2.txt",
                   "usbLogGood3.txt"]

usbBadLogFiles = ["usbLogBad1.txt",
                  "usbLogBad2.txt",
                  "usbLogBad3.txt"]

# Switch between sets of log files.
usbLogFiles = usbBadLogFiles

# Read log file.
def readLog(logFile):
    with open(logFile) as fIn:
        return fIn.readlines()

# Converts raw log data into a list of cycle values.
def processLog(log):
    data = []

    for line in log:
        data.append(processLogLine(line))

    return data

# Converts raw log line into a cycle value.
def processLogLine(logLine):
    l = logLine.replace("Message ", "")
    return int(l[5],16)+1

# Counts distance in one direction, so the distance
# will be high if i2 is behind i1.
def getListDistance(val1, val2):
    cycleLen = len(cycle)
    i1 = cycle.index(val1)
    i2 = cycle.index(val2)

    if i1 <= i2:
        return i2 - i1
    else:
        return (cycleLen - i1) + i2

def getNextValueInCycle(val):
    cycleLen = len(cycle)
    i = cycle.index(val)
    if i < cycleLen-1:
        return cycle[i+1]
    else:
        return cycle[0]

def checkCycleValue(expected, value):
    Globals.totalCount += 1

    if value != expected:
        Globals.errorCount += getListDistance(expected, value)

    expected = getNextValueInCycle(value)
    return expected

def getPerformance():
    return 1-float(Globals.errorCount)/float(Globals.totalCount)

def printPerformance():
    print("Sampled %d values. USB performance: %0.1f%%"
          % (Globals.totalCount, getPerformance()*100))

# Read log file and evaluate performance.
def evaluatePerformance(logFile):
    Globals.totalCount = 0
    Globals.errorCount = 0

    log = readLog(logFile)
    data = processLog(log)
    if not data:
        print("No data available")
        return

    if len(data) < sampleSize:
        print("Not enough data available to meet requirements")
        return
    else:
        data = data[:sampleSize]

    expected = data[0]
    for value in data:
        expected = checkCycleValue(expected, value)

    return getPerformance()

def printAggregatePerformanceData(logFiles, performances):
    performances = [100*p for p in performances]

    for f, p in zip(logFiles, performances):
        print("%s: %0.2f%%" % (f, p))

    print("\nAverage performance: %0.2f%%" % mean(performances))
    print("Standard deviation: %0.2f" % stdev(performances))

def main():
    performances = []
    for logFile in usbLogFiles:
        performances.append(evaluatePerformance(logFile))

    printAggregatePerformanceData(usbLogFiles, performances)

if __name__ == "__main__":
    main()

With the good set of logs, I got the following output:

usbLogGood1.txt: 93.70%
usbLogGood2.txt: 92.50%
usbLogGood3.txt: 92.60%

Average performance: 92.93%
Standard deviation: 0.67

For the bad set, I got this:

usbLogBad1.txt: 16.60%
usbLogBad2.txt: 13.80%
usbLogBad3.txt: 14.10%

Average performance: 14.83%
Standard deviation: 1.54

By adding in the additional check to make sure a write buffer is free, I managed to increase the 'performance' of the USB by around 78.1% (100% means a perfect cycle of USB messages: 1,2,3,4,1,2,3,4,1,2,3,4...).

In addition to this, I found that the message throughput about doubled when I put the check in, despite the delays associated with waiting in while loops.

92.93% still isn't perfect, but the manual talks about checking for ACKS from the host as well as just free write endpoints. I tried to do that earlier (without apparent success), but this was before I tried this check. Hopefully, if I implement both together I can get the USB performance to rival that of CAN.

Edit: The wait for ACK thing didn't work, but if I force a 1ms delay in between sending messages, I can get a performance of ~99.99...%.

This is not an ideal solution since 1ms is quite a long time to be delaying for, so I'm leaving the question unresolved for now.

EDIT

I quite firmly believe at this point that the issue is mainly with the PC-side driver I wrote. It doesn't read fast enough.