Excerpt |
---|
I wanted to have a diagram showing the number of hits for a particular search term in Google, Yahoo, Bing and other search engines, and how it changes over time. Naturally, this should be done as a Linux cron job executed at regular intervals. |
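As a sketch, a crontab entry for such a job could look like the following. The script name and paths are hypothetical placeholders for whatever wrapper you build around the commands in this article:

```shell
# hypothetical crontab entry: run once a day at 06:00 and append the
# current hit count to a CSV file for later plotting
0 6 * * * /usr/local/bin/hitcount.sh >> /var/log/hitcount.csv
```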
The first attempt:
No Format |
---|
wget 'http://www.google.de/search?hl=en&q="my+query"'
--2011-02-03 09:29:54-- http://www.google.de/search?hl=en&q=%22my+query%22
Resolving www.google.de... 74.125.79.147, 74.125.79.99, 74.125.79.104
Connecting to www.google.de|74.125.79.147|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2011-02-03 09:29:55 ERROR 403: Forbidden.
|
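As the log shows, wget percent-encodes the double quotes in the query to %22, while the single quotes merely keep the shell from interpreting the & and " characters. The same encoding can be reproduced with standard tools; this is a minimal sketch that handles only quotes and spaces, not general URL encoding:

```shell
# encode a raw query the way the wget log above shows it:
# " becomes %22 and spaces become +
raw='"my query"'
encoded=$(printf '%s' "$raw" | sed -e 's/"/%22/g' -e 's/ /+/g')
echo "$encoded"   # %22my+query%22
```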
...
Google rejects the plain wget request, presumably because of its default User-Agent string. It seems we have to use a real web browser here.
No Format |
---|
lynx 'http://www.google.de/search?hl=en&q="my+query"'
|
works better, but we get a lot of requests for cookies. So, let's accept all cookies by default:
No Format |
---|
lynx -accept_all_cookies 'http://www.google.de/search?hl=en&q="my+query"'
|
Finally, we would like to dump the whole page to stdout instead of running lynx interactively. We then find the interesting bit of information with some extra grep commands:
No Format |
---|
lynx -accept_all_cookies -dump 'http://www.google.de/search?hl=en&q="my+query"' | grep About | grep results
About 3,660,000 results (0.12 seconds)
|
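For plotting, only the bare number is of interest, not the whole sentence. A small post-processing sketch, assuming the English-language output format shown above (a different locale setting would change the wording and break the pattern):

```shell
# strip everything but the first comma-grouped number from the result line
line='About 3,660,000 results (0.12 seconds)'
count=$(printf '%s\n' "$line" | grep -o '[0-9][0-9,]*' | head -n 1 | tr -d ',')
echo "$count"   # 3660000
```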
...
See also: the actual results of this search script (sorry, the page is available in German only).
Related articles
Content by Label |
---|
showLabels | false |
---|
spaces | HOST |
---|
showSpace | false |
---|
sort | modified |
---|
reverse | true |
---|
type | page |
---|
cql | label in ("google","cmd") and type = "page" |
---|
labels | kb-how-to-article |
---|
|