
I have pcap files that are continuously generated for me. I want to continuously feed them to an "ever-running" tshark/wireshark. Here is what I have tried (OSX):

mkfifo tsharkin
tail -f -c +0 tsharkin | tshark -l -i - > tsharkout 2>stderr &
cat file1.pcap > tsharkin

The above works fine, I get expected output from file1.pcap in tsharkout

cat file2.pcap > tsharkin

The above does not work: I get nothing in tsharkout, but stderr shows "1 packet dropped" and "3 packets captured".

cat file2.pcap > tsharkin

Trying again makes the tail/tshark processes stop/crash

I tried again, this time with file2.pcap first and then file1.pcap. Now file2.pcap is processed just fine, and file1.pcap makes the tail/tshark processes stop/crash. So I conclude that nothing is wrong with the two pcap files; it seems tshark simply does not like having more than one pcap file thrown at it.

Just to test it, I tried merging file1.pcap and file2.pcap with mergecap first, and feeding the result into tshark:

mergecap -F pcap -w file1_2.pcap file1.pcap file2.pcap
cat file1_2.pcap > tsharkin

This works fine, I get expected output from both file1.pcap and file2.pcap in tsharkout

The problem is that my pcap files arrive over time, so I cannot merge them all before feeding them to tshark. I need to be able to feed the pcap files, as they arrive, to an "ever-running" tshark. How can I do that?

Why not wrap this code in a Node server and spawn a new tshark every time a file is received? The high-level algorithm: - Receive each pcap in a newly generated pipe (possibly named by timestamp). - Make the Node app poll the directory for newly created named pipes. - Once a pcap has been received on a named pipe, spawn a child process running tshark. - Write the results to a common directory. - Himanshu97
@Himanshu97 That will not do. Two consecutive pcap files may contain data belonging to the same session. E.g. an HTML document may have been transferred over the wire in several IP packets. The first half of those IP packets could be in file1.pcap while the second half could be in file2.pcap. I want that HTML document reconstructed, and that will only happen if file1.pcap and file2.pcap are fed into the same tshark instance. - Per Steffensen
In that case you can simply write the file into your Unix pipe like this: "cat <file.pcap> > tsharkin", and make tshark always read from the pipe. What happens is that each file's content is appended to the pipe, and since a named pipe works as a FIFO, it should work. - Himanshu97
@Himanshu97 Thanks, but I am not sure how that approach differs from the one I described in the original question, which I also explained does not work. - Per Steffensen

1 Answer


TL;DR. libpcap files have a header. You need to remove it for the second and subsequent capture files:

tail -c +25 file2.pcap > tsharkin

My output, when feeding tshark the same file twice:

1   0.000000 10.0.0.1 → 10.0.0.2 TLSv1.2 246 Application Data
2   0.058816 10.0.0.2 → 10.0.0.1 TCP 66 443 → 58616 [ACK] Seq=1 Ack=181 Win=1701 Len=0 TSval=3578216450 TSecr=5878499

3   0.000000 10.0.0.1 → 10.0.0.2 TLSv1.2 246 [TCP Spurious Retransmission] , Application Data
4   0.058816 10.0.0.2 → 10.0.0.1 TCP 66 443 → 58616 [ACK] Seq=1 Ack=181 Win=1701 Len=0 TSval=3578216450 TSecr=5878499

Details

As explained in the documentation for the libpcap format, libpcap files have a 24-byte global header. This global header is directly followed by the packets, each preceded by its own per-packet header, without padding.
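The first 4 bytes of that global header are a magic number, which gives a quick way to check what kind of file you are about to feed to tshark. The sketch below is only an illustration: `pcap_kind` is a hypothetical helper name, and it assumes `head -c` and `od` behave as they do on macOS and Linux.

```shell
# Hypothetical helper: classify a capture file by its first 4 bytes.
pcap_kind() {
    # Dump the first 4 bytes as hex and strip all whitespace.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \t\n')
    case "$magic" in
        d4c3b2a1|a1b2c3d4) echo pcap ;;             # classic libpcap, either byte order
        4d3cb2a1|a1b23c4d) echo pcap-nanosecond ;;  # nanosecond-resolution variant
        0a0d0d0a)          echo pcapng ;;           # pcapng section header block
        *)                 echo unknown ;;
    esac
}
```

Only files reported as pcap (with the same byte order) have the 24-byte global header discussed here; a pcapng file has a different structure, so stripping 24 bytes from it would not work.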

Therefore, when feeding libpcap files to tshark, the first file works fine because tshark is expecting a global header. It isn't, however, expecting one at the start of any subsequent file you feed it. This probably causes tshark to interpret the beginning of the second file (which is actually a global header, again) as a malformed packet and to drop it ("1 packet dropped"). I don't know why tshark stops with the third file though.

If you're sure that all your capture files will have the same format (beware of pcapng files) and the same header, you can then safely remove the global header of the second file (and subsequent files) before sending them to your named pipe. One way to do this is to use the tail syntax you already used in your question:

tail -c +startoffset file

The global header being 24 bytes long, we want to start reading capture files at the 25th byte.
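For files that arrive over time, the same idea can be wrapped in a small filter. This is a minimal sketch, not a tested production setup: `pcap_stream` is a hypothetical name, and it assumes that some other process prints one capture file path per line on its standard input, in arrival order.

```shell
# Hypothetical helper: read capture file paths from stdin (one per line)
# and write a single continuous libpcap stream to stdout.
pcap_stream() {
    first=1
    while IFS= read -r f; do
        [ -n "$f" ] || continue      # skip empty lines
        if [ "$first" -eq 1 ]; then
            cat "$f"                 # first file: keep its global header
            first=0
        else
            tail -c +25 "$f"         # later files: start at byte 25, skipping the header
        fi
    done
}
```

With a (hypothetical) command watch_for_pcaps that prints each new file name as it appears, this could be used as: watch_for_pcaps | pcap_stream | tshark -l -i - > tsharkout. Because the pipeline stays open for as long as the watcher runs, the named pipe and the tail -f are not needed in this variant.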

Note on the same header. If all your capture files are retrieved the same way, they probably have the same global header. In particular, the link-layer header type (e.g., Ethernet) needs to be the same. That global header also contains the magic number, the version of the format (e.g., 2.4), the timezone offset, the timestamp accuracy, and the maximum capture length for packets.
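If you want to check that assumption before stripping a header, you can compare the first 24 bytes of two files. The sketch below uses a hypothetical helper name, `same_pcap_header`, and is deliberately conservative: it requires the headers to be byte-for-byte identical, which is stricter than what tshark may actually need.

```shell
# Hypothetical helper: succeed (exit status 0) only if both files start
# with byte-identical 24-byte global headers.
same_pcap_header() {
    a=$(head -c 24 "$1" | od -An -tx1)
    b=$(head -c 24 "$2" | od -An -tx1)
    [ "$a" = "$b" ]
}
```

For example: same_pcap_header file1.pcap file2.pcap && tail -c +25 file2.pcap > tsharkin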