Tuesday, 22 April 2014

Quiet wget to stdout, removing html tags

Here is a quick note on how to download a file with wget and print it to STDOUT (-q suppresses wget's own messages, -O - writes the document to STDOUT instead of a file):
wget -q -O - http://www.google.com
Some other useful parameters for when cookies (e.g. from a login) need to be saved and reused on later requests:
--save-cookies cookies.txt
--load-cookies cookies.txt
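As a rough sketch of how the two cookie options fit together: one request logs in and writes the cookie file, later requests read it back. The function names, the form field names (user, password) and the example.com URLs are made up for illustration; a real site will use its own login form. --keep-session-cookies is added on the assumption that the login cookie has no expiry date, since wget does not save such cookies otherwise.

```shell
# File used to store cookies between requests; the name is arbitrary.
COOKIE_JAR="cookies.txt"

# Log in once and save the cookies the server sets.
# $1 = login URL, $2 = user name, $3 = password.
# The form field names "user" and "password" are hypothetical.
login() {
    wget -q -O /dev/null \
        --save-cookies "$COOKIE_JAR" --keep-session-cookies \
        --post-data "user=$2&password=$3" \
        "$1"
}

# Fetch a page quietly to STDOUT, sending the saved cookies along.
fetch_with_cookies() {
    wget -q -O - --load-cookies "$COOKIE_JAR" "$1"
}

# Usage (hypothetical URLs):
#   login https://example.com/login alice secret
#   fetch_with_cookies https://example.com/private
```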
For debugging purposes, it is sometimes useful to remove HTML markup by piping the output through sed:
sed -e 's/<[^>]*>//g'
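Putting the pieces together, the sed expression can be wrapped in a small helper and placed at the end of the wget pipe. This is only a sketch: strip_tags is a name chosen here, and because sed works line by line, a tag that spans several lines will not be removed, and entities such as &amp; are left as they are.

```shell
# Remove anything that looks like an HTML tag from STDIN.
strip_tags() {
    sed -e 's/<[^>]*>//g'
}

# With wget (needs network access):
#   wget -q -O - http://www.google.com | strip_tags

# Offline check with a fixed input:
printf '<p>Hello <b>world</b></p>\n' | strip_tags
# prints: Hello world
```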