Politico, Slate, and story versioning — or: the only Web constant is change

Last month, the hardworking gang at Politico got into a dustup with critics after an editor made a change in an already-posted story. The story was about the Rolling Stone/General McChrystal affair; the change removed a phrase that described how beat reporting works; the phrase had drawn considerable attention, and so did its disappearance.

I’m not going to add to the volume of commentary on that affair. I’m interested here in the larger issue of the mutability of online content, and how responsible news organizations deal with it.

A story posted at Slate yesterday sheds considerable light on this issue, in the course of making a few stumbles of its own. (The story includes quotes from a recent post I wrote about best practices in online corrections.) It’s remarkable that, after 15 years of Web publishing experience, we haven’t gotten better at handling changes to news published online. Before this post is done, I will offer a straightforward, concrete proposal for doing so.

Any news organization that strives to present a version of reality to its readers or users must come to grips with the fact that reality is always changing. Print publications have always taken daily, weekly or monthly snapshots of that reality, and everyone understands the relationship between the publication date and the information published under it. Radio and TV offer a closer-to-live reflection of the ever-changing news reality, but until the Web’s arrival their content was so fleeting that the new update pretty much obliterated the old version of any story.

The Web changes all of this. It is both up-to-the-minute and timeless — ephemeral and archival. This offers newsrooms a fundamentally different opportunity for presenting timely story updates while honoring and preserving the record of previous versions. Sadly, not a single news organization I’m aware of has yet taken advantage of this opportunity.

Instead, what we have is a big mess, with publications tripping over the distinction between revisions and corrections, and readers left to harbor suspicions of deception.

Politico got into trouble because, during the course of what it apparently viewed as a routine update, it changed a passage that (some of) its readers saw as significant. Worse, it provided no notice of the change.

Nothing presses the public’s Orwell alarm faster than altering a published or posted text without copping to the revision. If the text is the subject of criticism, even worse.

But of course, on the Web, everything is the subject of criticism. Editors and reporters rarely keep up with the full extent of the arguments over their work. Therefore, they should assume that some reader somewhere is likely to care about any change they might make. Be careful with the changes!

The problem is that changes are necessary and desirable. Changes are how you keep up with reality. We have to allow stories, especially breaking or running stories, to evolve. But journalism, thus far, has offered us only two models for doing so. The dominant one is the old wire-service model: There is one story that represents the latest reality, and it’s regularly updated to reflect new developments. Traditionally, wire services shared these stories with their newsroom customers. Newsrooms would either grab the latest version at deadline and print it, or use the running reports as the basis for broadcast updates.

Unfortunately, this approach is a disaster online. At best, it leaves the public confused about which version of the story is canonical: Where do I find the story I was reading a few minutes ago? If it’s different, how do I know what has changed? At worst, it saps readers’ trust, as Politico found.

At Salon, we learned some of these lessons during the Florida recount in 2000, and later after 9/11, as we struggled to perform the wire-service dance in a medium that’s ill-designed for it. What we should have done then, and what many smart sites have done since in similar situations, is embrace the second journalistic model for presenting changing stories: the blog format. It’s useful because the newest information is always accessible but it doesn’t obliterate the older stuff. But the blog structure falls down exactly where the wire-service model excels: in offering readers an up-to-date, one-stop overview of a big story.

Back to the Slate piece, now. Intrigued by Politico’s admission, in the course of the McChrystal ruckus, that it often edits already posted stories without noting the changes, Slate’s Jeremy Singer-Vine undertook an investigation into Politico’s practices. He wrote a script to scrape the text of Politico’s stories at regular intervals after publication; this data would show how many Politico stories changed, and exactly how they changed.
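I haven’t seen Singer-Vine’s script, so what follows is only a minimal sketch of the general idea, written in Python: keep a snapshot of a story’s text each time it is checked, and store a new version only when the text has actually changed. The class name StoryArchive, the timestamps and the sample text are all invented for illustration, and the step that actually downloads the story is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class StoryArchive:
    """Hypothetical archive: every distinct version of one story, oldest first."""

    # Each entry is a (timestamp, text) pair.
    versions: list[tuple[str, str]] = field(default_factory=list)

    def record(self, timestamp: str, text: str) -> bool:
        """Store `text` if it differs from the latest snapshot.

        Returns True when a new version was stored, False when nothing changed.
        """
        if self.versions and self.versions[-1][1] == text:
            return False
        self.versions.append((timestamp, text))
        return True


# Invented sample data; a real script would fetch the text at each interval.
archive = StoryArchive()
archive.record("2010-07-01T09:00", "Original text.")
archive.record("2010-07-01T09:30", "Original text.")  # unchanged, not stored
archive.record("2010-07-01T10:00", "Edited text.")

for timestamp, text in archive.versions:
    print(timestamp, text)
```

Because unchanged snapshots are skipped, the archive holds exactly one entry per edit, which is the raw material for answering how many stories changed and how.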

The results (what I’d describe as a modest amount of mucking about, none of it hugely significant) are less interesting than what happened once Singer-Vine contacted Politico about the changes. It seems that the moment Politico’s editors realized that Slate was calling them on this practice, they scrambled to come clean. Most of the stories cited in Singer-Vine’s study now sport notices from Politico explaining that they’ve been changed. Singer-Vine, in turn, had to add a bunch of corrections to his own copy, since he’d originally taken Politico to task for failing to come clean.

Slate — unlike Politico, whose editor apparently doesn’t believe in the value of an explicit correction policy — has long had forthright corrections practices; I think of it as one of the good guys in this realm. But in the course of making multiple small corrections to Singer-Vine’s piece, the magazine has now admitted that “we do not notify readers about minor corrections that we ourselves catch within 24 hours of publication.” Which I think means that Slate changes stories after they’re published without notifying readers — exactly what it is accusing Politico of.

Most editors will correctly argue that the average reader doesn’t want to know every time they fix a typo. On the other hand, editors can never know when a change they consider insubstantial — like Politico’s removal of the line about beat reporters — might seem underhanded to someone.

In sum: The Web lets us correct and update at will; it also ensures that readers will question any change that isn’t acknowledged. But acknowledging every little change can lead to grotesque results. Singer-Vine’s piece illustrates this nicely. It is now striped with rows of correction notices, like the fat marbling a steak, and contains Escher-like sentences such as this: “A previous version of this article also incorrectly stated that Politico had originally incorrectly stated that Howard Kurtz published the first report of sexual-harassment allegations against Al Gore.”

Something definitely went amiss with Slate’s little experiment — but the project also points the way out of this mess. What Singer-Vine’s script did to Politico’s stories was what every software project does as its developers write code: it built an archive of successive versions of a text, with changes — “diffs” — noted from one to the next. This is called versioning. Most software developers use it continuously in all their work. Versioning is common in the culture of computing because it’s the sort of thing computers do cheaply and well. You can see it at work on the “view history” tab of every page in Wikipedia.
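For readers who have never seen a diff, here is a small sketch using Python’s standard difflib module. It compares two versions of a text and marks removed lines with a minus sign and added lines with a plus sign. The function name diff_versions and the two sample versions are invented for illustration; they are not taken from any real story or from any newsroom’s software.

```python
import difflib


def diff_versions(old: str, new: str) -> list[str]:
    """Return unified-diff lines describing how `old` became `new`."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="earlier version",
            tofile="later version",
            lineterm="",
        )
    )


# Invented sample text, not quoted from any real story.
v1 = "The general met reporters on Tuesday.\nAides declined to comment."
v2 = "The general met reporters on Tuesday.\nAides later issued a statement."

for line in diff_versions(v1, v2):
    print(line)
```

A line-by-line diff like this is the form programmers are used to. A newsroom would more likely want a word-level comparison, which is the kind of red-and-green display Wikipedia’s history pages show.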

Why not adopt this technique for every story we publish? Let readers see the older versions of stories. Let them see the diffs. Toss no text down the memory hole, and trigger no Orwell alarms.

Versioning should be the model for how we present the evolution of news stories on the Web. In fact, it makes so much sense that, even though right now no one is using it, I’m convinced it will become the norm over the next decade.

Today it might seem like overkill, but that’s how all new Web phenomena present themselves to us. It might sound like a lot of work, but once it’s incorporated into a newsroom’s content management software, it’s probably going to save time presently wasted on posting jury-rigged correction notices. It can be presented unobtrusively, so that the vast majority of readers who don’t care will never need to see it, while the bloggers, pundits and critics who do care can feast. (Blog software could do this too: WordPress already stores each revision of a post as a separate version; someone could write a plugin that lets visitors access all versions of any post that have been created since its first publication.)

In software development, versioning is most useful as a practical tool for “rolling back” to an earlier version of code after some new addition has gone awry. In journalism, versioning can be valuable as a foundation for trust. It’s a smart way to solve the dilemma that Politico and Slate and everyone else face in trying to keep information up to date and correct small errors without seeming to be playing fast and loose with the public record.

Public versioning for every news story: it’s time! Otherwise, we’re going to be wasting a lot of time struggling to pinion dynamic information on static pages, and accusing one another of tampering with history.

LINK UPDATE: Regret the Error’s Craig Silverman takes Politico to task for its editor’s casual dismissal of the value of a “black and white policy” on corrections. He also points to an example of a journalist who’s using versioning right now, although from what I can see, the example, David McCandless, is labeling his work by version number but not exposing previous versions.
