Archive | June 2017

Geohero or geochump?

A comment at a meeting GG attended got him thinking about the popular view of scientists.  The comment was that scientists in the 19th century were heroes for Americans because they helped open up the West, while in the 20th century they were more thorns in the side of growth.  Of course, this is so oversimplified it collapses quickly: John Wesley Powell, a hero for his explorations of the Colorado River, was viewed with great disdain when he closed claims for public lands. And post-WWII America fell in love with science in many ways. But still, when are scientists lauded and when are they scorned? An interesting pair of cases in the late 1860s and 1870s may shed light on this.

In both cases a scientist running a geological survey became aware of claims of major mineral finds within the area of his survey.  In both cases, the scientist claimed that these finds were incorrect. In both cases, the finds were not economic. Yet in one case, the scientist in question, Clarence King, was lauded, became first director of the USGS, and was viewed as one of the best and brightest America had to offer.  The other, Josiah Whitney, lost his survey and spent years grousing about the outcome. Why the difference?

Book v Paper

Earth scientists today write papers.  Historians write books (well, they write papers, too, but it seems like that is kind of the installment plan for a book). Having completed a book, GG finds it a little frustrating in an odd way.

Professional papers are, in a way, a conversation. You get enough stuff together to say “Hey, this looks interesting.” Somebody else might then have some other observations and say “No, look, the story is different.”  And you are paying attention because that first paper was just the beginning of a research project.  So your next paper might have your new observations and an attempt to come to grips with those other observations that came up in the interim.  And so on.

A book, on the other hand, is kind of the last word. Unless you are writing a popular first-year textbook, publishers are not terribly interested in revised editions of books. And authors aren’t all that thrilled with the prospect of revisiting the whole of a book. In a way, this means that the conversation and continual revisiting of issues on a topic don’t happen. So there really should be a mindset in writing a book that, well, it is going to be sitting out there a long time without correction.

And so in writing about ongoing research, GG left the door open about what might come down the pike, knowing full well the give-and-take of geoscience research.

But it kind of hurts when you, as a book author, realize there was an oversight.  And there is nothing to do about it but wince. For GG, it was the recent discovery of a book, Golden Rules by Mark Kanazawa, that made him wince. It was published in 2015, which left plenty of time for its lessons on the creation of prior appropriation water law to be incorporated in GG’s manuscript chapter on hydraulic mining. And a quick skim (GG is reading it now) suggests there were many lessons.

Does it really change the basic picture in The Mountains that Remade America? Probably not, particularly as the chapter in question focused more on the environmental damage of hydraulic mining. But gosh, it would have been better with this in it.

The sad realization is that this is probably the first of many oversights to be recognized. Who knew being finished writing a book could invoke regret? [Well, other than book authors].

When you have a hammer…

…all the world is a nail.  And the currently popular hammers are things like Twitter and Instagram and Tinder.  While some have long advocated the first two as important tools for scientists, the last has been used as a model for scanning through preprints.  Lots and lots of preprints. The Science story on this says “A web application inspired by the dating app Tinder lets you make snap judgments about preprints—papers published online before peer review—simply by swiping left, right, up, or down.”

Nothing says “science” like “snap judgment”.

While GG lambasted an effort to capture social media-ish solutions as a means of post-publication peer review, how about tools to let you find what cutting edge science is appearing? That Science report on social media linked above says that is what social media is good for.  Um, really?

GG studies the Sierra Nevada.  Try going to Twitter and searching on #SierraNevada.  Bet you didn’t think there were that many people so fascinated with taking pictures of beer bottles. Add, say, #science. Chaff winnowed some, but very little wheat. Add #tectonics. Crickets.

The idea of this new app (Papr) is that if only you were able to see lots and lots of stuff quickly, you’d find some gems to explore. Really?  Students complain bitterly about a firehose approach in the classroom, and the solution here is, um, a firehose? (To be fair, it appears the app developers are not necessarily expecting great things here).

Forget that.  What we want and need are tools to reduce chaff, not accelerate it.

What we need is something akin to Amazon’s suggestions tool.  Imagine visiting the preprint store to get a couple of papers you know you want.  One maybe is on a topic you care about–say, the Sierra Nevada.  Another maybe deals with a technique, say full waveform tomography.  A third uses some unusual statistical tests. You download these and the preprint store suggests a few other preprints based on the full text content of the papers you got. Why that instead of keywords? Keywords have a way of being too picky. You might call work “tectonics” and GG might call it “geodynamics” and thus the keyword searches might pass by each other. But if the text is still talking about changes in elevation, changes in lithospheric structure–those are less likely to get overlooked.  If this tool is smart enough to recognize quasi-synonyms and phrases, all the better.
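
A minimal sketch of the full-text idea, assuming a plain bag-of-words comparison with cosine similarity. The function names (`term_counts`, `cosine_similarity`, `suggest`) are hypothetical, and a real tool would at least drop common words and handle the quasi-synonyms mentioned above; this only shows how text overlap, rather than matching keywords, can drive a suggestion.

```python
import math
import re
from collections import Counter

def term_counts(text):
    """Lowercase the text and count word occurrences (a crude bag of words)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 means no shared words)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def suggest(downloaded, candidates, top_n=3):
    """Rank candidate preprints by similarity to the combined text of the downloads.

    downloaded: list of full-text strings the reader already fetched.
    candidates: dict mapping a title to that preprint's full text.
    """
    profile = term_counts(" ".join(downloaded))
    scored = [(cosine_similarity(profile, term_counts(text)), title)
              for title, text in candidates.items()]
    ranked = sorted(scored, reverse=True)[:top_n]
    # Drop candidates that share no words at all with the reader's profile.
    return [title for score, title in ranked if score > 0]
```

Here a paper tagged “geodynamics” would still surface for a reader whose downloads are tagged “tectonics”, so long as both talk about elevation and lithospheric structure in the body text.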

Such a tool grows more powerful the more you work with it. While on that first try, you will also get recommendations on papers overlapping in non-interesting ways (say, applications of the techniques in paper 1, the geographic area under study in paper 2, and the measurement types in paper 3), the more you interact with this, the better it gets.

Here’s the sad thing: the tools to make something like this have been around for decades.  The best spam filters (like SpamSieve) use a form of Bayesian filtering based on message content in addition to black- and whitelists. Earth science got much of its literature into a single “preprint store” long ago in GeoScienceWorld. And yet here we are, swiping left again and again and again….
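
To make the spam-filter comparison concrete, here is a toy naive Bayes classifier that learns “keep” versus “skip” from a reader’s feedback on preprints. It is only an illustration of content-based Bayesian filtering in general; the class name `InterestFilter` and its methods are hypothetical, and nothing here is meant to describe how SpamSieve or any real product is implemented.

```python
import math
import re
from collections import Counter

class InterestFilter:
    """Toy naive Bayes filter: learns 'keep' vs 'skip' from reader feedback."""

    def __init__(self):
        self.words = {"keep": Counter(), "skip": Counter()}
        self.docs = {"keep": 0, "skip": 0}

    @staticmethod
    def tokens(text):
        return re.findall(r"[a-z]+", text.lower())

    def train(self, text, label):
        """Record one piece of feedback; label must be 'keep' or 'skip'."""
        self.words[label].update(self.tokens(text))
        self.docs[label] += 1

    def log_score(self, text, label):
        """Log of (smoothed prior) times (smoothed word likelihoods) for a label."""
        vocab = len(set(self.words["keep"]) | set(self.words["skip"]))
        total = sum(self.words[label].values())
        score = math.log((self.docs[label] + 1) / (sum(self.docs.values()) + 2))
        for w in self.tokens(text):
            # Add-one smoothing so unseen words do not zero out the product.
            score += math.log((self.words[label][w] + 1) / (total + vocab + 1))
        return score

    def wants(self, text):
        """True if the text looks more like past 'keep' items than 'skip' items."""
        return self.log_score(text, "keep") > self.log_score(text, "skip")
```

Each call to `train` nudges the word statistics, which is the sense in which such a tool improves the more it is used.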

Citation Statistics Smackdown

Sorry, it isn’t that dramatic.  But in updating various web tools, GG noticed dramatic differences in his supposed citation counts between Google Scholar and Web of Science. In the past he has assumed the difference was because Google was capturing junk citations, but today decided to actually look at what is going on in detail.  Which may or may not interest you, dear reader….

The raw starting point for Web of Science is here, and for Google is here. At the very top, GG’s h index is 21 with Web of Science, 27 with Google (a significant difference for those who love those things, just a numerical quirk for others). The most highly cited paper has 252 citations from WoS but a staggering 338 in Google. Although this is tedious to work through, there is clearly a lot of fodder for comparison, so let’s dive in.

An oddity of Google’s citation listing comes into focus quickly: sorting on date only yields the last 15 papers.

Google overestimates citations in at least one situation: it repeats citations from papers in the Chinese Journal of Geophysics, linking to both the English language version and the original Chinese html version of the papers. Another goofy thing is that Google will mess up from time to time and assign a citation from a preceding paper in Nature to the article that starts on the same page as the citation. For instance, Google has an immunology paper citing the Zandt et al. tectonics paper. Google does end up with some number of duplicated citations: several preprints are counted along with the actual publication. Also some Chinese and possibly Russian papers are counted twice, once as original-language versions, once as English versions.

Mostly, however, the difference is in theses and books, items Web of Science explicitly does not track. Since some theses contain papers published elsewhere, some of these are duplicates. More embarrassingly, there are some term papers on the web that are taken as citable materials.

What is the balance, though?

Of the 331 references identified overall, only 5 in Web of Science were not in Google.  Two were chapters in the Treatise on Geochemistry, two others were in GSA Special Paper 456, and the last was a G^3 article. So of the remaining 326, 247 were in WoS and so 79 more are in Google. Since 338-326=12, there are 12 outright duplicate entries in Google; what of the 79 other additional entries?

Five did not cite the Zandt et al. paper at all; these were outright mistakes.  Combined with the 12 duplicate entries, 17 of the 338, or about 5%, of the Google citations are simply wrong. The duplicates are sometimes multiple language versions of the same paper, or a preprint showing up as a separate item.

  • Theses: 28
  • Books: 16 (including 8 from GSA Memoir 212, which WoS should have had)
  • Foreign language (Chinese and Russian): 12 (Some of which might be duplicates or not even cite the paper at all)
  • “News” Journals (GSA Today, Eos): 6
  • Real journals missed by WoS: 6 (which, if you add the 8 from GSA Memoir 212, are 14 references that WoS should have had).
  • Miscellaneous: 6. A term paper was in there, a meeting abstract, an in press paper.
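
The bookkeeping above is easy to lose track of, so here is a short check of the arithmetic using only the numbers quoted in this post (the variable names are just labels chosen for this sketch). It confirms that the five mistaken entries plus the six categories in the list account for all 79 Google-only items.

```python
# Figures quoted in the post for the most highly cited paper.
google_total = 338
wos_total = 252
unique_refs = 331   # distinct citing items found across both services
wos_only = 5        # in Web of Science but missing from Google

in_google = unique_refs - wos_only      # distinct items Google found
in_both = wos_total - wos_only          # items found by both services
google_only = in_google - in_both       # extra items only Google found
duplicates = google_total - in_google   # Google entries counted more than once

google_only_breakdown = {
    "did not cite the paper": 5,
    "theses": 28,
    "books": 16,
    "foreign language": 12,
    "news journals": 6,
    "journals missed by WoS": 6,
    "miscellaneous": 6,
}

wrong = duplicates + google_only_breakdown["did not cite the paper"]

print(in_google, in_both, google_only, duplicates)          # 326 247 79 12
print(sum(google_only_breakdown.values()) == google_only)   # True
print(f"{wrong} of {google_total} = {wrong / google_total:.1%}")  # 17 of 338 = 5.0%
```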

Which do you take to be more accurate? The 252 in WoS should clearly be at least 258 and probably over 260 with the GSA volumes that are supposed to be counted these days.  The 6 GSA Today+EOS science articles probably deserve inclusion, though the EOS articles are shakier. On the other side, the 338 reported by Google should be no higher than 320 (338 – 17 – 6 + 5). Theses are something interesting in this count, as they represent some kind of original research, but these days most thesis work worth anything is published.  If you take that view we are down to 292, 26 above the 266 WoS probably should have had.

This leaves as seriously gray at least 8 books, 12 foreign language papers, and the 6 news journals. So arguably the uncertainty on a citation count is in the 10-20% range.  If we say the correct number is 279 +/-13, the 252 of WoS is 27 low and Google is 59 high.

What does this mean, aside from apparently we can’t even count integers? Perhaps a first-cut approach to approximating a “true” measure of citations would be to go a third of the way from the WoS number to the Google number (true = WoS + (Google - WoS)/3, or equivalently true = (2/3)WoS + (1/3)Google).
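
As a quick worked check, the sketch below reruns the adjustments from the preceding paragraphs and then applies the one-third rule to the raw counts. The function name `estimated_citations` is made up for this example, and the individual adjustments are copied from the text as written rather than independently justified.

```python
def estimated_citations(wos, google):
    """First-cut estimate: a third of the way from the WoS count to the Google count."""
    return wos + (google - wos) / 3

wos, google = 252, 338

# Adjusted bounds, following the post's own arithmetic.
wos_fixed = wos + 6 + 8                 # 6 missed journal papers, 8 from GSA Memoir 212
google_fixed = google - 17 - 6 + 5      # "338 - 17 - 6 + 5" as written in the post
google_no_theses = google_fixed - 28    # also set aside the 28 theses

midpoint = (wos_fixed + google_no_theses) / 2
half_range = (google_no_theses - wos_fixed) / 2

print(wos_fixed, google_fixed, google_no_theses)   # 266 320 292
print(midpoint, half_range)                        # 279.0 13.0
print(round(estimated_citations(wos, google), 1))  # 280.7
```

The rule of thumb lands within a couple of citations of the 279 midpoint, which is well inside the stated uncertainty.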

Citation citations

Neat article in Slate on how damaging bad citation practices are (using as key example the origins of the increase in opioid prescriptions that has now led to this abuse crisis) (thanks, RetractionWatch, for the pointer). Go read it.  GG’s already sounded off a few times on the mystery of lousy citation practices here and here, so we’ll just leave it at that.

An Alternate Fact History

While those of us in the sciences bemoan factual illiteracy, it might be worth recalling that widespread distribution of factually-challenged material is hardly new.

What is a bit distressing are the kinds of things such behavior leads to.

A few choice snippets:

The Mexican-American War was largely a creation of the Polk Administration, which desired to separate California from Mexico while absorbing the independent nation of Texas. In essence, by claiming a southern border for Texas at the Rio Grande that was well south of that understood by Mexico, Polk was able to claim Americans had been attacked in the U.S. This was challenged by, among others, Rep. Abraham Lincoln (W-IL). U.S. Grant, in his autobiography, lambasted the war as “one of the most unjust ever waged by a stronger against a weaker nation.” And yet the nation, guided by Polk’s sophistry, initiated a war for territory.

Arguably the Civil War represented a nadir in sharing of facts, from a fiercely partisan press across the country to southern states that intercepted and destroyed Northern newspapers. Lincoln’s relatively mild stance on slavery (that it could not be allowed to extend into the territories) was twisted by southern papers into extraordinary claims, such as that he would force interracial marriage. Perhaps the most damaging “alternative fact” came after the war, when the myth that the war was over states’ rights instead of slavery made a glorification of the Confederacy possible and popular.

The later “yellow press” of the late 19th century has often been fingered as causing the Spanish-American War, largely through exaggeration but also through the creation of fictional facts that induced Americans to enter into war.

McCarthyism was, at its heart, the creation of fictional crimes by real people, allegations that were utterly unsubstantiated.

Certainly more recently we’ve seen alternative facts play out, as the tobacco industry made up stories to counter evidence that smoking was a health hazard.  Similar but less obvious efforts were made by the paint industry to slow the banning of lead-based paints and by the fossil-fuel industry to discredit concerns about global warming. These efforts went beyond straightforward advocacy for their industries to try to discredit scientific evidence about their industries’ products.

The most damaging alternative fact was the German right-wing myth that Germany did not lose World War I on the battlefield, but that the military was “stabbed in the back” by a new republican government backed by Jews. This helped fuel the rise of the Nazis in Germany (which, it is worth recalling, had considerable electoral success before usurping the republican government). It also fed a hatred that spawned the Holocaust.

Note that episodes of hysteria in the face of ignorance don’t really count.  So things like quarantining doctors returning from Ebola outbreaks or parents not inoculating their kids aren’t a response to fake news (which is a false story put out to achieve some agenda) per se. Such episodes do exploit the same emotional reactions that fake news is often designed to evoke.

And so alternative facts/fake news are hardly a recent invention, and recent invocations about crowd sizes or job creation numbers are certainly some of the most innocuous applications of such misdirection. But it is clear that the creation of, and widespread belief in, false stories are tied to a lot of human misery. We all need to be on guard against emotionally satisfying (but untrue) stories that lead us to false beliefs and to actions that are immoral or counterproductive.

Uniquely Rockies

…baseball, that is. (We all need a diversion).

All Major League teams have some sort of oddities, but the Rockies seem a bit out there…. Consider that:

  • The Rockies are the only team to play in their own timezone (MDT). [Arizona plays in MST, which is the same as PDT during the MLB season]
  • The Rockies are well known for playing at the highest altitude of any MLB team (a row of purple seats marks a mile above sea level).
  • Despite being the third-youngest expansion club, they have the oldest stadium of all expansion teams in the National League (Coors Field opened in 1995. The Big A in Anaheim for the Angels is the oldest stadium for an expansion franchise).  Coors is, amazingly, the third oldest NL ballpark (only Wrigley and Dodger Stadium are older; for some reason, AL parks last longer).
  • The team is the only one named for a geographical feature (and one of few named for an inanimate object, the others being socks of various colors)
  • An individual player is called, rather awkwardly and ungrammatically, a Rockie. (Is an individual member of the Red Sox a Red Sock? No.)
  • The Rockies’ home park set the record for most home runs in a season in 1999.
  • This led to the Rockies being the only club to store baseballs in a humidor [though Arizona is apparently going to do the same soon].
  • They are the only NL West team to have never won the division. Only their 1993 expansion-mates, the Marlins, share the distinction of not ever winning a division, though the Marlins won the World Series as a wild card team.  Pittsburgh joins them as the only other team not to have won a division since the creation of 6 MLB divisions in 1994.
  • Although nobody keeps track of games cancelled by snow, concern for that led Coors to have the first heated infield in MLB. (Odds are good that the latest snow-outs have been at Rockies games, though this year they somehow were out of town for several late snowstorms).
  • Is Coors the only ballpark to actually contain a microbrewery? (Milwaukee’s doesn’t).
  • The Rockies might be the club most distant from college baseball. Only two college teams (neither in Denver) are D1 in Colorado (Univ. Northern Colorado and Air Force Academy).

That’s enough distraction for today.