The visualization explores the evolution of Charles Darwin's theory of, uh, evolution. It began as a less-defined 150,000-word text in the first edition and grew and developed to a 190,000-word theory in the sixth edition.
Watch where the updates in the text occur over time. Chunks are removed, chunks are added, and words are changed. Blocks are color-coded by edition. Roll over blocks to see the text underneath.
Much of this is well-known by those of us who have worked with dataviz for the past decade or two, but his ending conclusions are solid and worth reviewing.
Key quote from Jeffrey Veen: "We need to create tools to help people manipulate THEIR data."
Good examples of how to use large data sets to find and tell stories and, if desired, to answer YOUR questions about the data.
Video: Designing for Big Data
This is a 20-minute talk I gave at the Web2.0 Expo in San Francisco a couple weeks ago. In it, I describe two trends: how we're shifting as a culture from consumers to participants, and how technology has enabled massive amounts of data to be recorded, stored, and analyzed. Putting those things together has resulted in some fascinating innovations that echo data visualization work that's been happening for centuries.
I've given this talk a few times now, but this particular delivery went really well. Having only 20 minutes forced me to stay focused, and the large audience was very engaged. I'll be giving an extended version of this talk in June at the UX London conference, with a deeper look at how we integrated design and research while I was at Google.
Nathan, over at Flowingdata.com, posts this interesting data visualization from the Baylor College of Medicine. No, it probably doesn't give a science writer a story in itself, but the concept of taking a complex data set and illustrating that data with the right tool -- in this case, Circos -- could generate some interesting reporting vectors. For example, could Circos show us something about traffic patterns? Ambulance or fire department response times? We're not sure, but we hope someone could probe this a bit.
The thing about cancer cells is that they suck. Their DNA is all screwy. They've got chunks of DNA ripped out and reinserted into different places, which is just plain bad news for the cells in our body that play nice. You know, kind of like life. Researchers at the Baylor College of Medicine in Houston have compared the DNA of a certain type of breast cancer cell to a normal cell and mapped the differences (and similarities) with the above visualization.
The graphic summarizes their results. Round the outer ring are shown the 23 chromosomes of the human genome. The lines in blue, in the third ring, show internal rearrangements, in which a stretch of DNA has been moved from one site to another within the same chromosome. The red lines, in the bull's eye, designate switches of DNA from one chromosome to another.
Some design would benefit the graphic so that your eyes don't bounce around when you look at the technicolor genome, but it's interesting nevertheless.
Check out the Flare Visualization Toolkit or Circos if you're interested in implementing a similar visualization with the above network technique.
It’s human nature: Elections and disinformation go hand-in-hand. We idealize the competition of ideas and the process of debate while we listen to the whisper campaigns telling us of the skeletons in the other candidate’s closet. Or, we can learn from serious journalism to tap into the growing number of digital tools at hand and see what is really going on ... more»
What have we here? Cooperation between two academic departments in the same university? Largely unheard of in most schools, but it has happened with positive results in Hong Kong.
Power Distribution of the Four Political Camps, Seeing the 2007 District Council Election Results with Maps
The Department of Geography and the Journalism and Media Studies Centre of The University of Hong Kong (HKU) announced today (November 23) an analysis of results of the 2007 District Council Election of four political camps from the spatial perspective.
Dr. P.C. Lai, Associate Professor of the Department of Geography, and her team applied the Geographic Information System (GIS) to analyze results of the District Council Election. The GIS technology was used to explore the power re-distribution of the four political camps or affiliations - pro-government, pro-democrat, moderate (Liberal Party) and independent candidates - of the said election. [more]
We've long been intrigued with Benford's Law and its potential for Analytic Journalism. Today we ran across a new post by Charley Kyd that explains both the Law and presents some clear formulas for its application.
Benford's Law addresses an amazing characteristic of data. Not only does his formula help to identify fraud, it could help you to improve your budgets and forecasts.
Unless you're a public accountant, you probably haven't experimented with Benford's Law.
Auditors sometimes use this fascinating statistical insight to uncover fraudulent accounting data. But it might reveal a useful strategy for investing in the stock market. And it might help you to improve the accuracy of your budgets and forecasts.
This article will explain Benford's Law, show you how to calculate it with Excel, and suggest ways that you could put it to good use.
From a hands-on-Excel point of view, the article describes new uses for the SUMPRODUCT function and discusses the use of local and global range names. [Read more...]
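For readers who would rather not open Excel, the first-digit check can be sketched in a few lines of Python. This is our own minimal illustration, not Kyd's spreadsheet method. It assumes the standard statement of Benford's Law, that a leading digit d turns up with probability log10(1 + 1/d). The function names and the sample data (the first 100 powers of 2, a sequence known to follow the law closely) are ours.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading nonzero digit of a nonzero number."""
    # Scientific notation always puts the leading digit first: 146000 -> '1.46...e+05'
    return int(f"{abs(x):.15e}"[0])


def first_digit_shares(values):
    """Observed share of each leading digit 1-9, ignoring zeros."""
    digits = [first_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Illustrative data only (an assumption, not from the article): 2**0 .. 2**99
sample = [2 ** n for n in range(100)]
expected = benford_expected()
observed = first_digit_shares(sample)

for d in range(1, 10):
    print(f"digit {d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

To try it on a real data set, pass a column of figures (expense claims, vote counts, invoice totals) to first_digit_shares and compare the result with benford_expected. A large gap is a reason to ask questions, not proof of fraud, and the law only suits figures that span several orders of magnitude.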
"Unveiling the Beauty of Statistics
Posted: 11 Jul 2007 03:01 AM CDT
By Jesse Robbins
I presented last week at the OECD World Forum in Istanbul along with Professor Hans Rosling, Mike Arrington, John Gage and teams from MappingWorlds, Swivel (disclosure: I am an adviser to Swivel) and Many Eyes. We were the "Web2.0 Delegation" and it was an incredible experience.
The Istanbul Declaration signed at the conference calls for governments to make their statistical data freely available online as a "public good." The declaration also calls for new measures of happiness and well-being, going ... more»
This weekend, friend-of-the-IAJ Joe Traub sent the following to the editor of the New York Times. Here's the story Joe is talking about: "White House...."
To the Editor:
The headline on page 1 on May 26 states "White House Said to Debate '08 Cut in Troops by 50%." The article reports a possible reduction to 100,000 troops from 146,000. That's 31.5%, not 50%. NPR's Morning Edition picked up the story from the NYT and also reported 50% erroneously.
Joseph F. Traub
The writer is a Professor of Computer Science at Columbia University.
The headline error is bad enough (it's only in the hed, not in the story) -- and should be a huge embarrassment to the NYT. But the error gets compounded because while the Times no longer sets the agenda for the national discussion, it is still thought of (by most?) as the paper of record. Consequently, as other colleagues have pointed out, the reduction percentage gets picked up by other journalists who don't bother to do the math (or who cannot do the math).
See, for example: * CBS News -- "Troop Retreat In '08?" -- (This video has a shot of the NYT story even though the percentage is not mentioned. Could it be that the TV folks don't think viewers can do the arithmetic?) (NB: We could not yet find on the NPR site the transcript of the radio story that picked up the 50 percent error. But run a Google search with "cut in Troops by 50%" and note the huge number of bloggers who also went with the story without doing the math.)
Colleague Steve Doig has queried the reporter of the piece, David Sanger, asking if the mistake is that of the NYT or the White House. No answer yet received, but Doig later commented: "Sanger's story did talk about reducing brigades from 20 to 10. That's how they'll justify the '50% reduction' headline, I guess, despite the clear reference higher up to cutting 146,000 troops to 100,000."
Either way, it is a serious blunder of a fundamental sort on an issue most grave. It should have been caught, but then most journalists are WORD people and only word people, we guess.
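The arithmetic at issue takes only a few lines to check. Here is a minimal sketch in Python, using the troop figures cited in the letter and the brigade figures from Doig's comment. The function name percent_reduction is ours, purely for illustration.

```python
def percent_reduction(old, new):
    """Percentage drop from an earlier value to a later one."""
    if old == 0:
        raise ValueError("the earlier value must be nonzero")
    return (old - new) / old * 100


# Figures as reported in the story
troops = percent_reduction(146_000, 100_000)
brigades = percent_reduction(20, 10)

print(f"Troops:   {troops:.1f}% reduction")    # 31.5%
print(f"Brigades: {brigades:.1f}% reduction")  # 50.0%
```

The two results show how a headline writer could reach 50 percent by counting brigades while the story itself counts troops.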
We would also point out the illogical construction that the NYT uses consistently in relaying statistical change over time. To wit: "... could lower troop levels by the midst of the 2008 presidential election to roughly 100,000, from about 146,000..." We wince.
English is read from left to right. Most English calendars and horizontal timelines are read from left to right. When writing about statistical change, the same convention should be followed: oldest dates and data precede newest or future dates and data. Therefore, this would best be written: "...could lower troop levels from about 146,000 to roughly 100,000 by the midst of the 2008 presidential election."
No story? Then check out Swivel, a web site rich with data -- and the display of data -- that you didn't know about and which is pregnant with possibilities for a good news feature. And often a news feature that could be localized.
Here, for example, is a posting from the SECRECY REPORT CARD 2005 illustrating the changing trends in the classification and de-classification of U.S. government data. (You can probably guess the direction of the curves.)
The number of classified documents is steadily increasing, while the number of pages being declassified is dwindling. These data were uploaded by mcroydon.
Paul Parker, of the Providence (Rhode Island) Journal, is the Quick and an impressive list of folks on the state's voter registration rolls are the Dead this week. Below is a note Parker posted to the NICAR-L listserv. The great thing about this is the recipe Parker provides for an analytic journalists' cookbook. Said he:
Nothing new or innovative, but we ran a dead voters story today, and it's getting tons of buzz. I would recommend -- no, URGE -- everyone on the list do the same for your area.
Here's the link:
http://www.projo.com/extra/election/content/deadvoters9_11-09-06_DN2P2GR.33b46ef.html