This Bash tip is useful for extracting all HTTP requests and responses from a PCAP trace.
First, use this command to generate the pcap file:
# tcpdump -s 0 -w trace.pcap
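The -s 0 option captures whole packets rather than only the first bytes of each (older tcpdump releases truncated packets to 68 bytes by default; recent ones already capture up to 262144 bytes). With the -w trace.pcap parameter, the raw captured data is written to the trace.pcap file.
Then run the following script to split the HTTP traffic into one file per TCP stream: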
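# Fields to extract for each HTTP packet, in the column order used by awk below.
FIELDS=(
    tcp.stream
    http.request.method http.request.uri http.request.version
    http.request.line
    http.response.version http.response.code http.response.phrase
    http.response.line
    http.file_data
)
# Each field is prefixed with "-e" plus $IFS, so the unquoted expansion
# word-splits into "-e field" argument pairs.
tshark -r trace.pcap -T fields -Y http ${FIELDS[@]/#/-e$IFS} |
awk -v FS=$'\t' '
{
    # One output file per TCP stream: 0.http, 1.http, ...
    output = $1 ".http";
    # Requests fill columns 2 to 5, responses fill columns 6 to 9.
    n = $2 ? 2 : 6
    # Truncate the file the first time a stream is seen, append afterwards.
    if (OUTPUTS[output]) printf("") >> output;
    else {printf("") > output; OUTPUTS[output] = 1; }
    # Request or status line, then the headers, one per line.
    printf("%s %s %s\n", $n, $(n+1), $(n+2)) >> output;
    printf("%s\n", gensub("(\\\\r\\\\n,?)+", "\n", "g", $(n+3))) >> output;
    if (substr($10,1,1) == "<") {
        # XML body: pretty-print it with xmlstarlet, which appends to the same file.
        fflush(output);
        close(output);
        xmlstarlet = "xmlstarlet fo - >> "output;
        printf("%s\n", gensub("\\\\n", "\n", "g", $10)) | xmlstarlet;
        close(xmlstarlet);
        printf("") >> output;
    }
    else
        printf("%s\n", $10) >> output;
    printf("\n--\n\n") >> output;
    close(output);
}
'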
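The ${FIELDS[@]/#/-e$IFS} expansion passed to tshark is the least obvious part of the script. The sketch below shows what it expands to and compares it with a more explicit loop. It assumes the default IFS (space, tab, newline); the shortened field list and the args_trick and args_loop names are only illustrative.

```shell
#!/usr/bin/env bash
# Shortened, illustrative field list (the script above uses ten fields).
FIELDS=(tcp.stream http.request.method http.file_data)

# The trick from the script: prefix every element with "-e" followed by $IFS
# (space, tab, newline), then let the unquoted expansion word-split.
args_trick=( ${FIELDS[@]/#/-e$IFS} )

# A more explicit equivalent that does not rely on word splitting.
args_loop=()
for field in "${FIELDS[@]}"; do
    args_loop+=( -e "$field" )
done

# Print both argument lists, one argument per line, for comparison.
printf '[%s]\n' "${args_trick[@]}"
echo "--"
printf '[%s]\n' "${args_loop[@]}"
```

Both arrays hold the same "-e field" pairs. The loop version is longer, but it keeps working if IFS has been changed, and it can be passed to tshark as "${args_loop[@]}".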
This script uses:
- tshark, the Wireshark CLI (https://www.wireshark.org/docs/man-pages/tshark.html)
- xmlstarlet, a command-line tool for working with XML (http://xmlstar.sourceforge.net/)
- awk, the famous text-processing language; the gensub() calls require GNU awk (https://en.wikipedia.org/wiki/AWK)
- and just a little Bash (https://en.wikipedia.org/wiki/Bash_(Unix_shell))
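Before running the script, it can help to check that these tools are installed. The following is a minimal sketch; the missing_tools helper is hypothetical, and the gawk command name is an assumption, since on some systems GNU awk is only available as awk.

```shell
#!/usr/bin/env bash
# Print the name of every given command that is not found on the PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Report anything the capture and extraction steps would need.
missing=$(missing_tools tcpdump tshark xmlstarlet gawk)
if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing" >&2
fi
```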
With a sample capture downloaded from https://www.cloudshark.org/captures/74a6deb7aa4e, the result is:
$ ls -l *.http
-rw-rw-r-- 1 john dev 78407 Jan 26 13:55 0.http
-rw-rw-r-- 1 john dev  9089 Jan 26 13:55 1.http
-rw-rw-r-- 1 john dev  8307 Jan 26 13:55 2.http
-rw-rw-r-- 1 john dev 82888 Jan 26 13:55 3.http
-rw-rw-r-- 1 john dev 91637 Jan 26 13:55 4.http
-rw-rw-r-- 1 john dev  6733 Jan 26 13:55 5.http
$ cat 5.http
POST /ReportingWebService/ReportingWebService.asmx HTTP/1.1
Accept: text/xml
Content-Type: text/xml
SOAPAction: "http://www.microsoft.com/SoftwareDistribution/ReportEventBatch"
User-Agent: Windows-Update-Agent
Host: asupdt
Content-Length: 4318
Connection: Keep-Alive
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/">
<soap:Body>
<ReportEventBatch xmlns="http://www.microsoft.com/SoftwareDistribution">
<cookie>
<Expiration>2012-06-26T21:26:32.995Z</Expiration>
...
</ReportEventBatch>
</soap:Body>
</soap:Envelope>
--
HTTP/1.1 200 OK
Cache-Control: private, max-age=0
Content-Type: text/xml; charset=utf-8
Server: Microsoft-IIS/7.0
X-AspNet-Version: 2.0.50727
X-Powered-By: ASP.NET
Date: Tue, 26 Jun 2012 20:32:05 GMT
Content-Length: 406
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<ReportEventBatchResponse xmlns="http://www.microsoft.com/SoftwareDistribution">
<ReportEventBatchResult>true</ReportEventBatchResult>
</ReportEventBatchResponse>
</soap:Body>
</soap:Envelope>
It is now easy to use grep, wc and sort on the data:
$ grep "500 Error" *.http | wc -l
3
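The same idea extends to a quick summary of all response codes. The sketch below assumes each response starts with a status line such as "HTTP/1.1 200 OK" at the beginning of a line, as in the sample output above; the status_histogram name is hypothetical.

```shell
#!/usr/bin/env bash
# Count HTTP status codes across the given .http files, most frequent first.
status_histogram() {
    # Keep only status lines, e.g. "HTTP/1.1 200 OK" (-h drops file names).
    grep -h '^HTTP/[0-9.]* [0-9][0-9][0-9]' "$@" |
        # The status code is the second space-separated field.
        awk '{ count[$2]++ } END { for (c in count) print count[c], c }' |
        # Sort by count (descending), then by status code (ascending).
        sort -k1,1nr -k2,2n
}

# Usage: status_histogram *.http
```

Each output line is a count followed by a status code. A body line that happens to start with "HTTP/" and a three-digit number would be counted too, so treat the result as a rough overview.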
Now, try to do the same with REST requests!