Wordyard

Hand-forged posts since 2002

Scott Rosenberg


In the context of web context: How to check out any Web page

September 14, 2010 by Scott Rosenberg 20 Comments

One of the great fears about the Web as it becomes our primary source of news is the notion that it rips stories from their moorings and delivers them to us context-free. We’re adrift! In a flood of soundbites! Borne upon a river of bits! Or something like that.

I’ve never understood this argument. As I tried to suggest in my Defense of Links posts, the convention of the link, properly used, provides more valuable context than most printed texts have ever been able to offer.

But links aren’t the only bearers of digital context. Every piece of information you receive online emits a welter of useful signals that can help you appraise it.

The techniques described here first filled my quiver in the ’90s, when I worked as Salon’s technology editor. We’d receive story tips and ideas, some of them pretty far out, and we’d scratch our heads and think, “Can this be for real?” I began applying an informal set of tests and checks to try to prevent us from being manipulated, pranked, or turned into a conduit for bad information. This was our way of trying to take the “discipline of verification” at the heart of the journalism we’d always practiced and apply it to the new medium. We knew we’d never be perfect. But there were scammers, hoaxsters and nuts out there, and we were damn sure not going to be pushovers for them.

Though some of the details have changed in the intervening years, the basic principles for evaluating an unknown source remain relevant, I think.

  • What’s the top-level domain? Is the page in question on a spammy top-level domain like “.info”? That’s not always a bad sign, but it raises your alert level a bit.
  • Look the domain name up with whois. Is the registration info available or hidden? Again, lots of domain owners hide their info for privacy reasons. But sometimes the absence of a public contact at the domain level is a sign that people would rather you not look into what they’re doing.
  • How old or new is the registration? If the site just suddenly appeared out of nowhere, that can be another indication of mischief afoot.
  • Look up the site in the Internet Archive. Did it used to be something else? How has it changed over the years? Did it once reveal information that it now hides?
  • Look at the source code. Is there anything unusual or suspicious that you can see when you “view source”? (If you’re not up to this, technically, ask a friend who is.)
  • Check out the ads. Do they seem to be the main purpose of the site? Do they relate to the content or not?
  • Does the site tell you who runs it — in an about page, or a footer, or anywhere else? Is someone taking responsibility for what’s being published? If so, obviously you can begin this whole investigation again with that person or company’s name, if you need to dig deeper.
  • Is there a feedback option? Email address, contact form, public comments — any kind of feedback loop suggests there’s someone responsible at home.
  • What shape are the comments in? If they’re full of spam it may mean that nobody’s home. If people are posting critical comments and no one ever replies, that could also mean that the site owner has gone AWOL. (He might also be shy or uninterested in tangling with people.)
  • Is the content original and unique? Grab a chunk of text (a sentence or so), put it in quotes, and plug it into Google to see whether there are multiple versions of the text you’re reading. If so, which appears to be the original? Keep in mind that the original author might or might not be responsible for these multiple versions.
  • Does the article make reference to many specific sources or just a few? And are the references linked? More is usually a good sign, unless they appear to be assembled by script rather than by a human hand.
  • Links in are as important a clue as links out. If your hunt for links in turns up a ton of references from dubious sites, your article may be part of a Google-gaming effort. If you see lots of inbound links from sites that seem reputable to you, that’s a better sign.
  • Google the URL. Google the domain. Google the company name. Poke around if you have any doubts or questions. Then, of course, remember that every single question we’ve been applying here can be asked about every page Google points you to, as well.
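
A few of these checks can be scripted. What follows is a minimal Python sketch, not a definitive tool: the watch-list of top-level domains, the function names, and the assumed "Creation Date:" line are all illustrative choices, and real WHOIS output varies a great deal from one registry to the next. The Internet Archive address follows the Wayback Machine's calendar-view URL pattern.

```python
import re
from datetime import date, datetime
from urllib.parse import quote_plus, urlparse

# Hypothetical watch-list; which TLDs deserve extra suspicion is a judgment call.
WATCHED_TLDS = {"info"}


def hostname(url):
    """Return the lower-cased host part of a URL ('' if there is none)."""
    return (urlparse(url).hostname or "").lower()


def top_level_domain(url):
    """Return the last dot-separated label of the host, e.g. 'info'."""
    return hostname(url).rsplit(".", 1)[-1]


def wayback_calendar_url(url):
    """Address of the Internet Archive's capture calendar for this host."""
    return "https://web.archive.org/web/*/" + hostname(url)


def exact_phrase_search_url(sentence):
    """Google search URL for the sentence wrapped in quotation marks."""
    return "https://www.google.com/search?q=" + quote_plus('"' + sentence + '"')


def registration_age_days(whois_text, today=None):
    """Days since the 'Creation Date' found in WHOIS text, or None if absent.

    Assumes a line such as 'Creation Date: 2010-09-01T12:00:00Z'; many
    registries label or format this field differently.
    """
    match = re.search(r"Creation Date:\s*(\d{4}-\d{2}-\d{2})", whois_text)
    if match is None:
        return None
    created = datetime.strptime(match.group(1), "%Y-%m-%d").date()
    return ((today or date.today()) - created).days
```

For a page at a made-up address like http://example.info/story, `top_level_domain` returns "info", which is on the sample watch-list. That is a reason to look closer, not a conclusion.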

Once you’ve done some or all of this work, it may be time to actually try to contact the author or site owner with your questions. If there’s no way to do so, that’s another bad sign. If there is, but they don’t answer, it might be a problem — or they might just be really swamped!

Software developers use the term “code smell” to describe the signals they catch from a chunk of program code that something might be off. What I’m trying to describe here is a rough equivalent for online journalism: Call it “Web smell.”

No one of these tests, typically, is conclusive in itself. But together they constitute a kind of sniff test for the quality of any given piece of Web-borne information.
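
One way to picture how the tests combine is as a simple tally. The sketch below is purely illustrative: the signal names and weights are invented for this example rather than drawn from any measured data, and the total is a prompt for further digging, not a verdict.

```python
# Hypothetical weights: positive values raise suspicion, negative ones lower it.
SIGNAL_WEIGHTS = {
    "spammy_tld": 1,
    "hidden_whois": 1,
    "recent_registration": 2,
    "no_about_or_contact": 2,
    "spam_filled_comments": 1,
    "text_duplicated_elsewhere": 2,
    "reputable_inbound_links": -2,
}


def web_smell(observed):
    """Add up the weights of the signals observed on a page.

    `observed` is an iterable of signal names. Unknown names are rejected
    so that a typo cannot silently drop a signal.
    """
    total = 0
    for name in observed:
        if name not in SIGNAL_WEIGHTS:
            raise KeyError(f"unknown signal: {name}")
        total += SIGNAL_WEIGHTS[name]
    return total
```

A lone hidden registration scores just 1 here, in keeping with the point that no single test settles anything. It takes several signals together to push the total up.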

There are probably many more tests that I’m not remembering — or that I never knew in the first place. If you know of some, do post them in the comments.

BONUS LINK: Craig Kanalley’s “How to verify a tweet” assembles a similar set of tests for tweets.

FOLLOWUP: Craig Silverman’s “How To Lose Your Gut” (at Columbia Journalism Review) has some more tips.

Filed Under: Media, Technology

Comments

  1. Kevin Marks

    September 14, 2010 at 7:15 am

    Isn’t step zero to check snopes.com? That delegates the research to someone who does check sources and link to them.

  2. Scott Rosenberg

    September 14, 2010 at 7:23 am

    Snopes is great. A priceless public resource. I was thinking more about new stories, issues and controversies that haven’t yet made the rounds in that fashion.


Trackbacks

  1. Quick Links | A Blog Around The Clock says:
    September 14, 2010 at 5:59 pm

    […] In the context of web context: How to check out any Web page […]

  2. » BC 1800 Research Session Research Resources from John@Steacie: For CS, Engineering, Natural Science, STS and other students at York University, Toronto says:
    September 15, 2010 at 1:42 pm

    […] In the context of web context: How to check out any Web page — interesting post by journalist Scott Rosenberg with some tips & tricks on using web sources […]

  3. dougpete says:
    September 15, 2010 at 10:04 pm

    What is Twitter and how do I use it? | Parents as Partners
    Many of you come to read this blog may not have ventured into the world of Twitter. I encourage you to sign up and join in the conversations in which I share my passion for education. These instructions will help you set up a Twitter account and explain how to use Twitter.
    (tags: twitter parents web2.0)

    6 Online Riddle Games to Keep You Glued to Your Computer
    If you’re in the mood to give your mind a little exercise, online riddles are a great place to do just that. One of the most popular genre of online riddle games consists of multiple levels, containing clues in the source code, in photos, and just about everywhere you can look.
    (tags: riddle online)

    WTF Is The Semantic Web? (Infographic)
    What is the semantic web?
    (tags: semantic web infographic)

    Your Internet Needs – Yes, There’s a Hierarchy(Image)
    You already know that I’m a huge fan of the folks at Flowtown. Once again I am delighted with this image that I safely assume is based on Maslow’s Hierarchy of Need
    (tags: internet needs)

    yolink adds CC license support to its browser plugin – Creative Commons
    yolink, “a next-generation search technology,” has added CC license support to its updated browser plugin. yolink’s browser plugin allows you to quickly scan your search results by specific key terms, effectively simplifying your more complex or advanced searches.
    (tags: yolink support creative commons)

    Ants spiral of death – how ants kill themselves | buZzhunt.co.uk
    We all know that ants is one of the most interesting and mysterious species on our planet. But did you know that ants are quite suicidal?
    (tags: ants death spiral)

    Tech Learning TL Advisor Blog and Ed Tech Ticker Blogs from TL Blog Staff – TechLearning.com
    I am an advocate for Project Based Learning in the classroom. True Project Based Learning is a process that puts the student at the center of their learning. In this post I wish to share with you some of the top sites I have found to be useful on the internet that promote true PBL
    (tags: learning projectbasedlearning PBL blog tech)

    In the context of web context: How to check out any Web page — Scott Rosenberg’s Wordyard
    Though some of the details have changed in the intervening years, the basic principles for evaluating an unknown source remain relevant, I think.
    (tags: web website icsxx)

    SCC ENGLISH: All Shakespeare’s Sonnets via Wordle
    All Shakespeare’s Sonnets via Wordle
    (tags: english shakespeare wordle)

    Pex for fun – from Microsoft Research
    This puzzle is an interactive Coding Duel. Can you write code that matches a secret implementation?
    (tags: fun pex research icsxx)

    Funny English – Misplaced and Dangling Modifiers
    How to identify misplaced and dangling modifiers, errors that can make your work sound ludicrous and unreliable.
    (tags: english funny misplaced modifers)


  4. Daily Bookmark Post 09/16/2010 » Thoughts from a tech specialist… says:
    September 16, 2010 at 2:30 am

    […] In the context of web context: How to check out any Web page — Scott Rosenberg’s Wordyard […]

  5. Around the Web: Academic administration, Facebook privacy, Neil Gaiman on cities and more : Confessions of a Science Librarian says:
    September 16, 2010 at 7:18 am

    […] In the context of web context: How to check out any Web page […]

  6. This Week in Review: J-schools as R&D labs, a big news consumption shift, and what becomes of RSS » Nieman Journalism Lab says:
    September 17, 2010 at 7:04 am

    […] Finally, a wonderful web literacy tool from Scott Rosenberg: A step-by-step guide to gauge the credibility of anything on the web. Read it, save it, use it. This entry was […]

  7. links for 2010-09-18 says:
    September 18, 2010 at 6:02 am

    […] In the context of web context: How to check out any Web page Kevin: Scott Rosenberg has an excellent guide to how to dig in a website and find out who owns it. It's a great primer for web journalists in some basic investigative techniques to you know who's behind what you're reading online. (tags: context literacy internet journalism 2010 training) […]

  8. State Library of Ohio Blog » Blog Archive » Check Your Facts says:
    December 10, 2010 at 5:39 am

    […] important to verify the source of the information you are reading.  This great blog post, “In the context of web context: How to check out any Web page” offers some simple tips to ensure that the websites you are using are credible […]

  9. SERPD says:
    December 27, 2010 at 10:04 pm

    In the context of web context: How to check out any Web page — Scott Rosenberg's Wordyard…

    One of the great fears about the Web as it becomes our primary source of news is the notion that it rips stories from their moorings and delivers them to us context-free. We’re adrift! In a flood of……

  10. Social and Ethical cont | rogjam3's Blog says:
    April 10, 2011 at 12:09 pm

    […] http://www.wordyard.com/2010/09/14/in-the-context-of-web-context-how-to-check-out-any-web-page  Check out the reliability of any website […]

  11. Stephen Judd says:
    August 20, 2012 at 11:01 am


    If you are interested in learning a bit more about “Assessing the reliability of online information”, join Kristen Mastel and Stephen Judd for a free eXtension webinar on Tuesday, August 21, 2012, at 2 PM EDT. The webinar will also be conducted on the DoD/DCO Adobe network on Wednesday, August 22, 2012, at 2 PM EDT to facilitate participation by military family service professionals.

    When the information we sought was contained in books and journals that had authors, editors, proofreaders, and fact-checkers, we had a sense of comfort that the material was reliable. (I admit that this is an arguable point.) However, with online publishing, we are left wondering who the author is, where the information came from, and if it’s true.
    Assessing the reliability of online information is a critical skill for each of us to develop and hone. Using or citing inaccurate online information can be embarrassing, expensive, and perhaps dangerous. Consider someone trying to fix an appliance, based on information they got from a random webpage – if the instructions aren’t right, the result could be further damage to the appliance, injury, etc.
    C.R.A.A.P.
    The Meriam Library at California State University, Chico, developed the C.R.A.A.P. test to give users a set of questions to ask when assessing information sources and their accuracy. CRAAP is an acronym that stands for currency, relevance, authority, accuracy, and purpose. By applying the questions in these categories to the source in question, a user can decide for themselves whether a source is reliable or not.
    Some example questions are:
    Is the information current?
    Who is the author or publisher? What are their qualifications?
    Is the information supported by evidence?
    Can you verify the information from another source?
    What is the purpose of the information? to inform? teach? sell? entertain? persuade?
    In his excellent book, Net Smart: How to thrive online (2012, MIT Press), Howard Rheingold uses the term crap detection to discuss how to decide if online information you find is true. Rheingold says, “Don’t refuse to believe, refuse to start out believing. Continue to pursue your investigation after you find an answer. Chase the story rather than just accepting the first evidence you encounter.” In other words, be skeptical.
    Rheingold links to a blog post (In the context of web context: How to check out any Web page) by Scott Rosenberg, co-founder of salon.com, that offers some practical tips for beginning to assess web pages. Understanding who operates a site, how long it’s existed, whether the content is unique or not, and who links to it are all important components of figuring out how reliable the site is.
    How confident?
    You may not need to ask these questions every time you visit a new website. Instead, how much time and effort you choose to spend digging into the reliability of the information will be dependent on your purpose.
    What you plan to do with the information should guide how rigorously you need to verify its accuracy. If you’re just curious about something and won’t be making decisions based on the information you find, then you might be more casual about verifying its accuracy. However, if you plan to stake time, money, reputation, health, etc. on the information, then you should take the time to assess the information’s validity.
    Role for online networks
    Given the vast amount of information accessible to us, having a filter or guide can be valuable. Online networks can serve this purpose if we intentionally cultivate our networks to include people who are knowledgeable in areas that we aren’t, who share diverse interests, and whose judgement we trust. Curation is a term now applied to the intentional act of collecting and sharing information and links in an online environment. Using our online networks to connect with curators is one way to apply an initial test to information. Take Howard Rheingold as an example: if you are interested in this subject, you might use Rheingold’s curated links on crap detection, which he maintains on scoop.it, as a jumping-off point. I trust that he has done at least an initial vetting of these sources, so I’m more comfortable with their reliability.
    Search engines, such as Google and Bing, sometimes include information in search results that indicates whether others in your networks have “liked” or “+1’ed” a page. This implicit endorsement by your connections may influence how reliable you believe a site is. Of course, you need to take into account the person, their expertise, and the ambiguity of what it means to “like” or “+1” a page.
    Ultimately, it’s up to you
    It’s your reputation, time, money, health, or well-being that’s at stake when you make decisions or publish based on information you discover online. How carefully you vet that information and its source is up to you.

    Author: Stephen Judd (+Stephen Judd, @sjudd)
     
    This article was originally published Monday, August 20, 2012, on the Military Families Learning Network blog.
     

     This work is licensed under a Creative Commons Attribution 3.0 Unported License.
     

  12. Open Learning is different from Open Education? #oped12 #vsmooc12 | Creating an Open Classroom says:
    September 16, 2012 at 4:30 pm

    […] Identifying Bias http://wordyard.com/2010/09/14/in-the-context-of-web-context-how-to-check-out-any-web-page. […]

  13. Open Learning is different from Open Education? #oped12 #vsmooc12 | Él éxito en los negocios y en la vida says:
    September 18, 2012 at 9:09 am

    […] Identifying Bias http://wordyard.com/2010/09/14/in-the-context-of-web-context-how-to-check-out-any-web-page. […]

  14. How to Lose Your Gut – The journalist’s guide to gutless online verification | DDRRNT says:
    December 11, 2012 at 8:24 pm

    […] Rosenberg this week that offered advice on “How to check out any Web page.” Below is a selection of his […]

  15. The Internet if Full of Lies | Communication Theory and Practice says:
    September 24, 2013 at 11:14 am

    […] “In the Context of Web Context: How to check out any Web Page” Scott Rosenberg […]

  16. Mo Pelzel says:
    February 29, 2016 at 4:36 am

    Ernest Hemingway. (n.d.). AZQuotes.com. Retrieved February 29, 2016, from AZQuotes.com Web site: http://www.azquotes.com/quote/1390439
    “You can’t believe everything you read” is one of those aphorisms that we learn early on. “Caveat lector” was already a maxim long before the advent of the digital age and the world wide web. In that earlier era, though, barriers to publication were significant, and much print material went through gatekeeping procedures in order to ensure credibility, accuracy, and reliability. For academic work in particular, the editorial and peer-review processes of publishing companies, journals, and professional societies imparted authority to published works. The fact that a book or resource was available in libraries gave you some confidence that the author’s claims and positions had been vetted as credible and reliable.
    Today, peer-review and editorial oversight remain important. But the Web has made it possible for anyone to publish anything, and search engines provide immediate and unfiltered access to the abundance of online information. As Clay Shirky has noted, our paradigm now is “publish, then filter.” Now, more than ever, the reader (or viewer or listener) is responsible for assessing the claims of authors and evaluating documents for reliability and accuracy. Of course, faculty help guide their students toward critical thinking and the development of a discerning judgment about sources and materials. A big part of a prof’s job is to convey to students a sense for what’s reputable and what’s not in his or her field. Still, even reputable sources such as scientific journals and mainstream news outlets can reflect biases and misrepresentations of data, or even fall victim to outright fraud.
    Technologist Howard Rheingold has argued that “crap detection” is one of the core literacies we need to cultivate. What does that look like when it comes to searching the web and evaluating what we find? I think we can approach this in several steps:
    Clarifying in our mind exactly what we are searching for. Are we after a specific bit of information, or a more general introduction to or background on a topic?
    Composing our search query with the terms and filters that best match what we are looking for
    Knowing how to read and interpret the search engine result page
    Knowing how to assess a given site, page, document, or resource that the search returns to us.
    “Search literacy” is an important basis for crap detection. One of the best resources I have found to improve search literacy is this free self-directed course, Power Searching with Google. In a series of short videos, “search anthropologist” Dan Russell explains basic and advanced search techniques and concepts. Obviously, you should think carefully about the appropriate terms of your query; results could be biased due to poor or imprecise phrasing. Russell emphasizes that “every word matters” and that “word order is important.” Focus on the keywords and terms that are most essential to your topic, and disregard auxiliary words and phrases that you might use in normal conversation. The terms, structure, and filters that you put in your search query will determine what you get back. Use quotes around a phrase to return pages that only have that exact phrase, and use the minus sign in front of terms you want to exclude from your results. Use the site: and filetype: operators to narrow the search to specific web domains and types of documents if that is useful.
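    These operators can be combined in a single query. As a minimal sketch, the hypothetical Python helper below assembles such a query string. The function and parameter names are assumptions made for illustration and are not part of any Google API.

```python
def build_query(keywords, exact_phrase=None, exclude=(), site=None, filetype=None):
    """Assemble a search string from keywords and common search operators."""
    parts = list(keywords)
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')         # quotes: match the exact phrase
    parts.extend(f"-{term}" for term in exclude)  # minus sign: exclude a term
    if site:
        parts.append(f"site:{site}")              # restrict to one web domain
    if filetype:
        parts.append(f"filetype:{filetype}")      # restrict to one document type
    return " ".join(parts)
```

    Keeping the keywords as a list preserves their order, which fits Russell's point that word order matters.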
    Search engine results are returned in “rank order” according to how well a webpage matches your query. But rank order is not the same as credibility or authoritativeness. Thus, the highest results of a search are not necessarily the most credible or useful…they are simply those that best match the query that you entered. It is not the search engine’s job to assess the accuracy of facts or the soundness of arguments that might be found on a web page. That’s why it is so important to formulate your query as appropriately as possible, and to be aware of how specific terms might influence the results. For example, searching for information about “Falkland Islands” would not produce the same results as a search about “Islas Malvinas.”
    As you examine search results, consider visiting several sites to cross-check information and assess reliability. As with analog sources, you obviously want to ask some very basic questions:
    Who is the author of this information? What person or organization is behind this document and site?
    What evidence is presented for the author’s competence with the subject matter?
    What do other people say about the author?
    What are the author’s sources? Are there citations and references to support the claims and arguments?
    Are there feedback options, so that visitors can ask questions and engage in discussion?
    What are the outbound links from the page? Conversely, what are the inbound links, that is, what other pages are pointing to this page?
    Rheingold gives the example of a search for information about Martin Luther King, Jr. Among the top results for this query from most search engines is the site titled “Martin Luther King, Jr.: A True Historical Examination.”  The URL for this site, http://martinlutherking.org, looks valid enough, but upon closer examination the site is revealed to be a front for a white supremacist organization called Stormfront.
    An important tool for checking the background of a website or domain is the internet protocol WHOIS. If you are suspicious about the legitimacy of a site, use this command to reveal information about who owns and operates a given domain on the web, where the site is hosted, contact information, etc. Just enter a domain to see who’s behind a site. For example, here’s some background info about martinlutherking.org:

    You can examine a website’s outbound links to see how it references other sources and documents. Hyperlinks on a site may be internal or external. “Internal” links point to other pages within the same domain, while “external” links point to pages outside of the domain. It’s the external, or “outbound,” links that will give you a sense of how the resource or page is situated within the larger web presentation of a topic. Conversely, “inbound” links can also be very telling as to a site’s legitimacy. But that information is somewhat harder to get at. Google used to have a link: operator as part of its search toolbox, but that seems to have been deprecated. The best resource I have found to discover incoming links to a site is this backlink checker. Here are the top results for a backlink check on http://martinlutherking.org:
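
    Sorting a page's links into internal and external ones is easy to automate. The sketch below uses only Python's standard library. It treats "same domain" as an exact hostname match, which is a simplifying assumption (www.example.com and example.com would count as different hosts), and the names LinkCollector and split_links are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_links(page_url, html):
    """Return (internal, external) lists of absolute http(s) link URLs."""
    collector = LinkCollector()
    collector.feed(html)
    page_host = urlparse(page_url).hostname
    internal, external = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue                        # skip mailto:, javascript:, etc.
        if parsed.hostname == page_host:
            internal.append(absolute)
        else:
            external.append(absolute)
    return internal, external
```

    The external list is the one to read through when judging how a page situates itself within the wider web.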

    These are just a few things to keep in mind as you fact-check and analyze information on the web. Join us this week at our workshops for further conversation and demonstration.
    Resources
    Austin College Library website section, “Evaluating Information“
    Neil Postman, “Bullshit and the Art of Crap Detection” (speech, 1969)
    Howard Rheingold, “Crap Detection 101” (essay, 2009); “A Guide to Crap Detection Resources“; Net Smart. How to Thrive Online (MIT Press, 2012)
    Scott Rosenberg, “In the Context of Web Context: How to Check Out Any Web Page” (post, 2010)
    Walter Quattrociocchi, “How does misinformation spread online?” (post, 2016)
    Power Searching with Google short course

