Today I had to do some advanced `curl`-ing and I thought I would share what I did to get the most out of it. Essentially, I had a huge CSV of values that I wanted to go through and get the status code for each one. One option was to bust out Ruby or Elixir and write some quick software to accomplish the task, but bash has once again come out on top with a one-liner that seems ideal.
To start with, dealing with CSVs can be a pain; however, most of the time `cut` has your back. Let's say you only want the second column of a CSV. To do that, use the following command…
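Something along these lines, with `values.csv` standing in for whatever file you're working with:

```sh
# split each line on commas and keep only the second field
cut -d , -f 2 values.csv
```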
This splits each line of the input file on the delimiter `,` and takes the second field, denoted by `-f 2`. If you wanted to grab, say, the first and third fields, it would look like this…
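Again with a placeholder file name:

```sh
# comma-delimited, keep the first and third fields
cut -d , -f 1,3 values.csv
```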
In my case I only wanted the first field, which contains the URLs that I wanted to check. Here is the command that was used, which I'll break down below:
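It looked something like this; the file name is a placeholder and the exact `curl` flags may differ from what I actually ran, but `-s` keeps curl quiet, `-o /dev/null` throws away the response body, and `-w` prints the status code followed by the URL:

```sh
# take the first field (the URLs) and check them ten at a time
cut -d , -f 1 values.csv |
  xargs -n 1 -P 10 -I URL \
    curl -s -o /dev/null -w "%{http_code} URL\n" URL >> results.txt
```

(Newer versions of `xargs` will point out that `-I` already implies one line per command, so the `-n 1` is mostly belt-and-braces.)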
This takes the URLs from the CSV, runs ten `curl`s at a time, and writes the HTTP status code and the URL that was tried to a `results.txt` file. I don't want to go into too much detail on the `curl` part of this call, but I would like to explain more about what `xargs` is doing. The `-n` part says to run each line as an argument to a command. The `-P` part is how many items you want to run in parallel; in our case, ten at a time. The `-I URL` part takes our single parameter and substitutes the line from the list anywhere it finds `URL`.
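If that substitution sounds abstract, here's a tiny throwaway example of `-I` on its own (the URLs are made up):

```sh
# every occurrence of URL in the command is replaced by the input line
printf 'https://example.com\nhttps://example.org\n' |
  xargs -I URL echo "would check URL"
```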
Lastly, the output of all of these calls is aggregated into a `results.txt` file with the following format:
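Each line is a status code followed by the URL that was hit, along these lines (made-up URLs again):

```
200 https://example.com
301 https://example.org
404 https://example.com/missing
```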
If you have what seems like a complicated task, look again. It may be as simple as the one-liner I found! A big thanks to Lee Jones for showing me that `xargs` has a parallel option. I haven't looked into it yet; however, it looks like the GNU `parallel` command has a bit more firepower for this kind of processing.
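For the curious, I think the rough `parallel` equivalent would be something like this (untested on my end, so treat it as a sketch):

```sh
# same idea: first field in, ten jobs at a time, status code plus URL out
cut -d , -f 1 values.csv |
  parallel -j 10 'curl -s -o /dev/null -w "%{http_code} {}\n" {}' >> results.txt
```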