So, I cannot stand the [brouhaha over Google News and fair use](http://www.nytimes.com/2009/04/08/technology/internet/08google.html?ref=technology).
It offends me on several levels, not the least of which is common sense. So for the dim, dull and dense out there, I decided at lunch to do a quick and dirty [content analysis](http://en.wikipedia.org/wiki/Content_analysis).
**Sampled from Google News homepage – 4/8/2009 10:15 a.m.**
* **Stories with excerpts:** 32
* **Total excerpt word count:** 979
* **Average excerpt length:** 30.59 words
* **Number of sources:** 111
* **Number of links:** 190
* **Average number of links per source:** 1.71
* **Median number of links:** 1
* **Max links:** 14
* **Min links:** 1
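As a quick sanity check, the two averages can be recomputed from the totals in the list. This is a minimal sketch that assumes the excerpt average is total words divided by stories with excerpts, and the link average is total links divided by number of sources. The constant and function names are made up for illustration.

```python
# Totals copied from the list above.
STORIES_WITH_EXCERPTS = 32
TOTAL_EXCERPT_WORDS = 979
SOURCES = 111
LINKS = 190


def average(total, count):
    """Return total / count rounded to two decimal places."""
    return round(total / count, 2)


# Assumption: both averages are plain ratios of the reported totals.
avg_excerpt_words = average(TOTAL_EXCERPT_WORDS, STORIES_WITH_EXCERPTS)
avg_links_per_source = average(LINKS, SOURCES)

print(avg_excerpt_words, avg_links_per_source)
```

Both ratios round to the figures reported in the list, 30.59 and 1.71.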
Please feel free to **check my work.** The screen capture I used for my analysis is on [Flickr](http://www.flickr.com/photos/cmheisel/3424166471/) and the [data is in a Google spreadsheet](http://spreadsheets.google.com/pub?key=p0-6UxWupDZ8J9KwpMQcACw&gid=0).
If you look on the [second page of data](http://spreadsheets.google.com/pub?key=p0-6UxWupDZ8J9KwpMQcACw&gid=2) you’ll see my “and what do these ungrateful news agencies get out of the deal” calculation. The short answer is more than **2 million unique visitors** seeing their headlines, all at the “cost” of 30 words (most of which included the byline).
Long and short of it: these are [links](http://en.wikipedia.org/wiki/Hyperlink), and they are what make the Web go round. If you don’t like it, please ban Google and any other user agent you think is unfairly ~~stealing~~ promoting your content.
I’d provide a link but I wouldn’t want to unfairly ~~steal~~ promote another site, so just go to ~~Google~~ your **card catalog** and look up **robots.txt**.
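For anyone who skips the card catalog, here is a rough sketch of what such a ban might look like, checked with Python's standard `urllib.robotparser`. The rules, the `allowed` helper and the example.com URL are all made up for illustration, and `Googlebot` is assumed to be the user-agent token you want to turn away.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: turn Googlebot away, let everyone else in.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("Googlebot", "http://example.com/story.html"))
print(allowed("SomeOtherBot", "http://example.com/story.html"))
```

A well-behaved crawler reads these rules before fetching anything, so two lines are all it takes to opt out.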
You go, Chris!
Thank heaven someone is smart enough to put his foot down and stop that spinning merry-go-round.
Congratulations. How did you import the data and summarize the main results to make your graphics? Did you use any specific software? I am very interested in knowing how you did it.
Kortazar, I just used the normal OS X screen capture utility to grab a shot of the Google News homepage at a particular moment in time.
The rest was just me counting the old-fashioned way and using Google Docs to store the data.
If I wanted to keep this up, I’d write a screen scraper in Python to gather the data, crunch it in Python, and display it with Django.
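The gather-and-crunch half of that idea could be sketched roughly as below, using only the standard library. This is not code I actually ran for the analysis. The sample HTML is made up, real Google News markup would need its own handling, each link's hostname is assumed to identify its source, and the Django display step is left out. `LinkCollector` and `summarize_links` are hypothetical names.

```python
from collections import Counter
from html.parser import HTMLParser
from statistics import mean, median
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every absolute <a> link in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep only links that carry a hostname, i.e. point at a source.
        if href and urlparse(href).netloc:
            self.hrefs.append(href)


def summarize_links(html):
    """Count links per source (hostname) and summarize the counts."""
    collector = LinkCollector()
    collector.feed(html)
    per_source = Counter(urlparse(h).netloc for h in collector.hrefs)
    counts = list(per_source.values())
    return {
        "sources": len(counts),
        "links": sum(counts),
        "average": round(mean(counts), 2),
        "median": median(counts),
        "max": max(counts),
        "min": min(counts),
    }


# Made-up sample page standing in for a saved copy of the homepage.
SAMPLE_HTML = """
<a href="http://news.example.com/a">Story A</a>
<a href="http://news.example.com/b">Story B</a>
<a href="http://news.example.com/c">Story C</a>
<a href="http://paper.example.org/x">Story X</a>
<a href="/local/page">Internal link, skipped</a>
"""

summary = summarize_links(SAMPLE_HTML)
print(summary)
```

The resulting dictionary has the same shape as the link stats in the post, so it could be handed to a Django template more or less as is.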
Very interesting. :-O Could you write this program in Python and integrate it into Django? How much would you charge to do it? I want a program to automate an analysis much like the one you did. My email: luis.cortazar at gmail.com. Greetings.