How to decode and visualize DICOM curve data in Python 3?

Question

I am trying to visualize a DICOM file with Python 3 and pydicom which should contain a black 100x100 image with some curves drawn in it. The pixel data is extracted from tag (7fe0,0010) and when printed shows b'\x00\x00\x00...'. This I can easily convert to a 100x100 numpy array.

However, the curve data in (5000,3000) shows me b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xc0H@\x00\x00\x00\x00\x00\xc0X@\x00\x00\x00\x00\x00\xc0H@' which I am not able to convert to x,y coordinates in my 100x100 pixel image. In the DICOM file it says

curve dimensions: 2
number of points: 2
type of data: poly
data value representation: 3
curve label: horizontal axis
curve data: 32 elements

The main question is: How do I decode the coordinates required for retracing the curve within my 100x100 image? My main concern is the fact that there should be 32 elements, but only 26 hex values in the output. Also, I have no clue how to deal with the \xc0H@ and \xc0X@. When I print those, it yields 192 72 64 and 192 88 64. How does Python decode these 2 hex codes to 6 numbers? And what do these numbers represent?

EDIT: Apparently data value representation 3 means the data is represented as a floating point double. On the other hand, there should be two points in the data, so each point is represented by 16 elements? I don't see how these two statements are compatible. What is interesting is that the first \xc0H@ translates to 3 numbers as mentioned before, and by doing so complete the first 16 elements of the curve data. How can I convert this into a point in my 2D image?

Curve data has been retired in DICOM since 2004, so you have to check the DICOM standard from 2004 or earlier to find the relevant information. Somebody else may know where to find it... — MrBean Bremen
Most older versions of the DICOM Standard are still maintained on the NEMA site. For 2004: dicom.nema.org/dicom/2004 — kritzel_sw
I did find the DICOM standard from 2004 and I have read it, but my problem lies in the interpretation of the curve data. — SMey1908

MrBean Bremen · Accepted Answer · 2020-07-24T08:11:36

Curve data has been retired in DICOM since 2004, so you will find the relevant information in the DICOM standard from 2004 (thanks to @kritzel_sw for the link).

As you already found out, Data Value Representation 3 means that the data entries are in double format, and with a Type of Data of polygon, you have x/y tuples in your data. As a double value is saved in 8 bytes, there are 16 bytes per point -- in your case (32 bytes of data) 2 points overall.
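To make the arithmetic concrete, here is a minimal sketch that decodes the byte string from the question without touching a file. The names `decode_curve` and `DVR_FORMATS` are hypothetical and not part of pydicom. The mapping of Data Value Representation codes to types (0 = unsigned short, 1 = signed short, 2 = float, 3 = double, 4 = signed long) is an assumption taken from the retired Curve module and should be checked against the 2004 standard; little-endian byte order is assumed as well.

```python
import struct

# Assumed mapping of Data Value Representation codes to struct format
# characters (retired Curve module; verify against the 2004 standard).
DVR_FORMATS = {0: "H", 1: "h", 2: "f", 3: "d", 4: "i"}


def decode_curve(data, dvr, dimensions=2, little_endian=True):
    """Hypothetical helper: split raw curve bytes into coordinate tuples."""
    byte_order = "<" if little_endian else ">"
    # One struct record per point, e.g. '<dd' for 2D points stored as doubles.
    fmt = byte_order + DVR_FORMATS[dvr] * dimensions
    return list(struct.iter_unpack(fmt, data))


# The 32 bytes shown in the question.
sample = (
    b"\x00\x00\x00\x00\x00\x00\x00\x00"
    b"\x00\x00\x00\x00\x00\xc0H@"
    b"\x00\x00\x00\x00\x00\xc0X@"
    b"\x00\x00\x00\x00\x00\xc0H@"
)

bytes_per_point = struct.calcsize("<dd")  # 2 doubles of 8 bytes each
point_count = len(sample) // bytes_per_point
points = decode_curve(sample, dvr=3)

print(bytes_per_point, point_count)  # 16 2
print(points)                        # [(0.0, 49.5), (99.0, 49.5)]
```

`struct.iter_unpack` raises `struct.error` if the data length is not a multiple of the record size, which is a useful sanity check that the assumed type and dimension count fit the data.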

Pydicom does not (and probably will not) directly support the retired Curve module (though support for the Waveform module, the current equivalent, is being added now), so you have to decode the data yourself. You can do something like this (given double numbers):

from struct import unpack
from pydicom import dcmread

ds = dcmread(filename)  # filename: path to your DICOM file
# Curve Data (5000,3000) is returned as raw bytes
data = ds[0x50003000].value

# unpack('<d') unpacks 8 little-endian bytes into a double
numbers = [unpack('<d', data[i:i+8])[0] for i in range(0, len(data), 8)]
# pair consecutive values into (x, y) tuples
# I'm sure there is a nicer way for this...
coords = [(numbers[i], numbers[i+1]) for i in range(0, len(numbers), 2)]

In your example data, this will return:

[(0.0, 49.5), (99.0, 49.5)]

i.e. the x/y coordinates (0.0, 49.5) and (99.0, 49.5), which correspond to a horizontal line in the middle of your image.
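To retrace the curve in the image, the decoded points can be rasterized onto the pixel grid. The following is a rough sketch only, using a plain list of lists in place of the numpy array. The `draw_polyline` helper is hypothetical, and it assumes that x is the column index, y is the row index, the curve coordinates are already in pixel units, and fractional values are rounded half up. None of these assumptions is confirmed by the file itself.

```python
import math


def draw_polyline(image, coords, value=255):
    """Hypothetical helper: mark a polyline in a row-major list-of-lists image."""
    height, width = len(image), len(image[0])
    # Walk over consecutive point pairs and sample each segment densely.
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        steps = max(1, math.ceil(max(abs(x1 - x0), abs(y1 - y0))))
        for step in range(steps + 1):
            t = step / steps
            # Round half up to the nearest pixel index (assumed convention).
            col = math.floor(x0 + t * (x1 - x0) + 0.5)
            row = math.floor(y0 + t * (y1 - y0) + 0.5)
            if 0 <= row < height and 0 <= col < width:
                image[row][col] = value
    return image


image = [[0] * 100 for _ in range(100)]  # black 100x100 image
draw_polyline(image, [(0.0, 49.5), (99.0, 49.5)])

lit_rows = [r for r, line in enumerate(image) if any(line)]
lit_pixels = sum(1 for v in image[50] if v)
print(lit_rows, lit_pixels)  # [50] 100
```

With a different rounding convention the y value 49.5 would land on row 49 instead of row 50, so the choice matters when the overlay has to line up exactly with other image content.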

As to the mismatch of 26 hex elements vs 32 bytes: Python's representation of a byte string uses \x hex notation only for bytes that are not printable ASCII characters; the rest are shown as the corresponding ASCII characters. So, for example, this part of your byte string: \x00\xc0H@ is 4 bytes long and could also be written as \x00\xc0\x48\x40 in hex notation, because H is 0x48 (72) and @ is 0x40 (64). That is also why printing \xc0H@ byte by byte yields 192 72 64: it is three bytes, not two hex codes.
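This can be checked directly in an interpreter. The short example below uses only built-in `bytes` behaviour; the variable name `chunk` is just for illustration.

```python
chunk = b"\x00\xc0H@"

length = len(chunk)        # number of bytes, not number of \x escapes
values = list(chunk)       # iterating over bytes yields integers
hex_form = chunk.hex()     # every byte as two hex digits
same = chunk == b"\x00\xc0\x48\x40"

print(length)    # 4
print(values)    # [0, 192, 72, 64]
print(hex_form)  # 00c04840
print(same)      # True
```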
