An Accidental Academic (Librarian)

Aug 15

What has Amazon beat on all-you-can-read e-books? Your public library

The Wall Street Journal discovers that the public library has e-books. Will wonders never cease.

Where We Came From, State by State

A chart of how Americans have moved between states since 1900.

Aug 14


Aug 11

Anatomy of a Research Tempest

On June 17, 2014, an article appeared in the Proceedings of the National Academy of Sciences and almost immediately garnered mainstream and popular news site coverage. It’s not unusual for PNAS-published research to get media attention; the site is open access and the research being published is often sexy in an attention-getting way. In this particular case, what was sexy about it was that the research involved Facebook doing the sort of thing people automatically assume that Facebook is always doing.

Everyone loves to hate Facebook, but so many people use it that it’s become the default platform for interacting with others online. Other sites use it to handle user logins, commenting, and networking, not to mention marketing. At the same time, the site’s obscure algorithms and perpetual-beta approach to usability provide useful hooks for the mass discomfort over how much personal information we’re giving up to a site that uses that information to make money. (As the saying goes, if you’re not being charged for it, the product is you.) Everyone complains about Facebook, but everyone uses it; it’s troublesome but it’s convenient, and as is so often the case, convenience trumps just about everything else.

Thus, the reaction to the news that Facebook researchers had intentionally manipulated some users’ news feeds to see whether what they saw on the site impacted their moods—as read from their subsequent posts on the network—was fairly predictable: a sort of unsurprised outrage that both acknowledged the ethical hinkiness of the manipulation, and classified it as business as usual. Of course Facebook was manipulating everyone’s news feeds. It’s the sort of thing that Facebook does.

More interesting was the view that the research study and the coverage of it provided into research processes that are to most people fairly obscure. It’s possibly the first time that many social media users ever encountered concepts like informed consent, institutional review boards, and various social science research methodologies. One of the issues raised at the outset concerned whether agreeing to Facebook Terms of Service constituted informed consent to the study; while the ToS does state that user data may be used for research purposes, the consensus emerged that this did not include having what information a user was exposed to manipulated in the name of research. What stories a Facebook user sees vary widely and are only partially under the user’s control, so users had no way of knowing this manipulation was happening at the time.

A related question was whether the authors of the study, two of whom were affiliated with Cornell University, had secured the approval of Cornell’s Institutional Review Board to conduct the study. IRBs are supposed to prevent abuses of human subjects in the course of research; accordingly, a researcher working for an institution that has an IRB submits a description of the study to the board for review. However, in this instance, the researchers were using already-existing data provided by Facebook; therefore, Cornell concluded that IRB approval was not necessary.

Then the question emerged almost immediately as to whether the research had any value. The study’s methodology (a form of textual analysis previously used on much longer samples than the average Facebook post) was called into question, as was the whole idea of whether Facebook status updates really communicate any useful information as to the poster’s emotional state anyway.

What interests me about this event and events like it is the successive waves of information and understanding as word about the study and its implications spread. First out of the gate was the OMG! reaction as people communicated their initial shock over Facebook’s shenanigans. This reaction gained traction quickly, for reasons already outlined: Facebook is very widely used, but much of its user base is cynical about the platform’s methods and motives. (Unsurprisingly, a great deal of this early reaction propagated through Facebook itself.)

The more analytical opinions took longer and emerged in other locales: initially, chiefly in the comments on the PNAS article itself, where a debate quickly brewed over the study’s research methods, whether the Facebook ToS constituted informed consent, and whether the study had (and needed) IRB approval. Commentary addressing these questions emerged on AV Club and The Atlantic, as well as on the subject-specialist site PsychCentral. Some of these commentaries also spread via Facebook, Twitter, and other social media, but overall this wave was smaller and drew less of a reaction.

Facebook’s user base does not appear to have diminished at all.

It’s a frustrating phenomenon if you’re in the business of getting people to seek out the best information on a topic, and to thoughtfully consider what they read. The most troubling element here isn’t that emotion might be contagious across social media—the study is, at best, inconclusive. The most troubling element isn’t even that this research was conducted, though the circumstances under which it was conducted, approved, and published are problematic. The most troubling part for me is how little people seem to care, beyond that initial OMG! reaction, to find out what really happened. It’s a side effect, perhaps, of the drinking-from-the-firehose experience of living in the Information Age, though I question whether people were really any more thoughtful about what they read in the past—certainly they found their sources more trustworthy, but whether they actually were is a different question. There’s always some new incident, news item, or outrage coming along, and by the time the analytical writing comes out, people have moved on to the next thing. Some sources have responded to this by speeding up their own analysis—The Atlantic’s coverage is quite thorough, and came out relatively quickly. Too, discovering what was going on with this particular research hardly required a great deal of investigative journalism; all that was really needed was a close reading of the study and a few follow-up questions.

There’s been a fair amount of research on the emotional effects of social media. As with most human endeavors, the results are a mixed bag. This, however, seems to be the first study that attempted to manipulate people’s responses, rather than simply observing what they did.

It won’t be the last.

Aug 06

How rumors spread via sloppy citation practices @insidehighered

Most of the times a faculty member has asked me for assistance in tracking down a citation, the source in which they found it referenced it incorrectly.

Jul 28

Misrepresenting Fair Use

It happened again: another online writer’s stuff got swiped and reused in another venue, without so much as a hey-do-you-mind. When the writer—rather understandably—complained, the re-user got his ire up, claiming fair use, and also that the writer was overreacting and ought to be flattered that someone liked his work enough to rebroadcast it to a wider audience.

Both are pretty common arguments, and they’re equally annoying because they rest on faulty assumptions: that a use of someone’s work is fair if you say it is, and that anyone making their work freely available will be pleased if it gets picked up and re-shared elsewhere. The particulars are as follows: a blogger wrote a post about the possibility of going back to war in the Middle East. The post went viral, getting reshared all over Facebook, and eventually got reposted on a couple of other sites. It also got picked up by a podcaster who read the entire post, verbatim, with a minimal introduction to the effect that he thought the post was really cool.

The blogger was rather displeased, especially since some of these re-posts were without attribution or so much as a link back to his original work. And he let it be known, particularly to the podcaster who had read his essay. The podcaster fired back that his use of the essay was fair, since he wasn’t profiting off of it.

There is an ongoing conversation over how material gets used and re-used online, where entire books and movies are copied and rebroadcast with minimal fuss, even when protected by DRM—if you know how to crack it. But the claim of fair use specifically is one that is often misunderstood and often misused—sometimes unwittingly, sometimes as an excuse. Since the only way to definitively test whether a use is fair is in court and most people don’t want to go to that kind of trouble, uses that are not fair are often let slide with little action beyond vociferous protest; conversely, uses that are fair are often nonetheless withdrawn under threat from the copyright holder, if said holder has sufficient influence and deep pockets.

So what is fair use, anyway? It’s basically an attempt to insert some wiggle room into the labyrinthine rules concerning copyright and what it’s okay to do with copyrighted material, making allowance for the need to make use of such material in order to comment on or criticize it, include it within an academic study, use it in the classroom, and other occasions where it’s unlikely that the user is directly interfering with the copyright holder’s rights.

There are four factors to apply when considering whether your use is fair; however, much of the misunderstanding of fair use seems to arise from users not applying them, being incompletely aware of them, or oversimplifying them (“my use is educational, therefore it’s fair,” which is not strictly true; besides, what some people attempt to justify as “educational” can be rather entertaining).

The first and main factor is the purpose and character of the use: whether it is limited and transformative, generally understood as commentary on, criticism of, or parody of the original work. Examples would be quoting from a book as part of a review, or writing a parody of an original song, though parodying the entire song is unlikely to be considered fair (for this reason, Weird Al Yankovic gets permission from copyright holders of the songs he parodies, rather than relying on a fair use argument).

In this case, the podcaster read the entire work aloud and issued the recording as part of his podcast, with a minimal introduction. This is neither limited, nor transformative. A better choice would have been to quote from the essay, with commentary, and tell listeners where they could read the whole thing—or seek permission from the author to replicate the work, something the author would have been willing to grant upon request.

The second factor is the nature of the work. It’s easier to make a fair use claim for factual information than for creative work. In this case, the essay was an opinion piece, not a simple reporting of fact. In other words, the intent of the author was to express his experience and interpretation of events, not to be an encyclopedia. This blogger has a distinctive voice and a large following, but even if he had neither, the nature of the piece itself is interpretive and creative.

The third factor is the amount and substantiality of the portion taken. “Amount” is fairly obvious (a commonly cited rule of thumb is no more than ten percent, though the law itself sets no fixed number), but what’s meant by substantiality? This refers to the “heart” of the work, which can be a bit nebulous, but basically can be interpreted as whatever is most distinctive or memorable about that work. In this case, the podcaster took the entire thing.

The fourth factor is whether the market for the work is affected. In this particular case, the podcaster initially asserted that the use was fair because the original author had made his work available for free in the first place; therefore, the market for the work, and therefore the author’s income, were not affected. This is true enough. But with so much content being made freely available and easily copied, there’s some question in my mind as to whether potential loss of income ought to be the sole consideration where this factor is concerned. At the very least, re-broadcasting someone’s work without a link back to the original is rude, even though people do it all the time. The podcaster also claimed that he’d made no profit off of the writer’s work, which is patently untrue: while the podcast can be accessed for free if you listen according to its broadcast schedule, a downloadable subscription costs money.

So it’s pretty obvious that this particular instance isn’t really fair, whatever the podcaster says. But of course the only way this can really be tested is by going to court, and as I mentioned previously, in most cases like these people don’t bother. (In this case, after considerable bluster and insult, the podcaster backed down.)

This kind of thoughtful consideration of whether a use is fair seems almost quaint, in these days when you can find the entirety of a newly published book online for free almost immediately, torrent new movies and games, and so on. It pretty much goes against the automatic impulse to share something cool as soon as you stumble upon it. But it takes a special kind of obnoxiousness to engage in that kind of sharing, whatever your motivation, and then bogusly claim that you’re engaging in fair use when called on it. The fifth, unspoken factor in fair use is whether you’re being a dick about it. Like the man says, don’t be a dick.

Jul 09

When the reader cannot come to the library, the library must come to the reader.

Jul 03

Superpowers granted upon MLIS.

Jul 01

My version of that flow chart that’s been going around.

Jun 29