Let the Great Cross-Referencing Begin: Google Book Search as Plagiarism Detector

The Google Book Search Library Project promises to be, among other things, the greatest plagiarism detector ever created.

So why are the Association of American Publishers and the Authors Guild suing Google over its plan to digitize millions of books?

In the case of the AAP, it's probably because they understand that copyright law really exists to subsidize distributors, not writers or readers. They're just looking out for their own interests. Or at least they think they are: it's much more likely that Google search results will improve book sales than hurt them. In any case, one has to pause at the spectacle of a publishers' association coming out against readers being able to locate the books they're looking for more efficiently than ever before.

But what's more interesting, if not exactly unexpected, is that the Authors Guild is reacting in the same way. Here's what the Guild's president, Nick Taylor, had to say:

    "This is a plain and brazen violation of copyright law. It's not up to Google or anyone other than the authors, the rightful owners of these copyrights, to decide whether and how their works will be copied."

How odd. Mostly, authors are not the owners of the copyrights in their work — publishers are. And even in those cases where the author retains copyright, she has usually signed a contract granting exclusive printing and distribution rights to a particular publisher. Nick Taylor's comment might make sense in some idealistic world where authors typically retain control of their work, but for the authors he represents, the world is rarely like that.

Meanwhile, the Authors Guild ignores an amazing possibility opened up by Google's project: we will be able to detect plagiarism with a thoroughness hitherto unthinkable. Google is the world's premier search engine; they have made billions of dollars matching snippets of text together and displaying the results. After digitizing these texts, the natural thing to do is to start looking for ways to cross-reference them. For legitimate citations, the effect of this will be mere convenience: instead of trudging to the library or bookstore, you can click on a link. But for cases of plagiarism, the effect will be a revolution: whereas in the past, discovering plagiarism required that the same person read both books, it will now be possible to flag potential instances of unattributed copying automatically!
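To make the idea of automatic flagging concrete, here is a minimal sketch of one common approach: break each text into overlapping runs of words ("shingles") and look for runs that two texts share. Nothing here reflects how Google actually cross-references books; the function names, the five-word window, and the scoring are all assumptions chosen for illustration.

```python
import re
from typing import Set, Tuple

Shingle = Tuple[str, ...]


def shingles(text: str, n: int = 5) -> Set[Shingle]:
    """Return the set of n-word runs in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_passages(a: str, b: str, n: int = 5) -> Set[Shingle]:
    """Return the n-word runs that appear in both texts."""
    return shingles(a, n) & shingles(b, n)


def overlap_ratio(a: str, b: str, n: int = 5) -> float:
    """Fraction of the shorter text's n-word runs that also occur in the other.

    Hypothetical scoring rule: 0.0 means no shared runs, 1.0 means every run
    in the shorter text also appears in the longer one.
    """
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0  # a text shorter than n words cannot be compared
    return len(sa & sb) / min(len(sa), len(sb))
```

A high ratio only flags a pair of texts for a human to look at. Properly attributed quotation produces exactly the same matches, so a sketch like this can point to copying but cannot tell citation from plagiarism on its own.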

So why isn't the Authors Guild cheering Google on?

A clue can be found in the Guild's self-description, as given at the end of their press release about the Google lawsuit:

    "The Authors Guild is the nation's largest and oldest society of published authors and the leading writers' advocate for fair compensation, effective copyright protection, and free expression."

There's a subtle bit of cognitive slippage going on there. They start out stating (accurately) that they are the largest society for published authors. But then they go on to claim that they are the leading writers' advocate for fair compensation, effective copyright protection, and free expression. Where did that slide from representing published authors to representing all authors happen? Anyone who writes is a writer; and thanks to the Internet, any writer who wants to be published can be, by simply making their work available on the Web. This is not wordplay; it is a fundamentally important fact of modern information distribution, as many popular bloggers have learned. The Authors Guild does not represent most authors anymore, if it ever did. It represents a tiny minority of authors: those whose works have been found fit for distribution by a certain kind of publisher, the kind that makes a massive initial investment in a print run and then depends on strict monopoly control of the copyright to recover that investment.

Tellingly, the Guild's identifying statement doesn't contain a word about plagiarism, a threat faced by all authors. While texts may be shareable resources, reputation and credit are not: plagiarism is a concern for all writers, no matter how their work is distributed. Yet the Guild's omission isn't limited to that one press release. A search for the word "plagiarism" across their entire web site returns only this:

    Search word: plagiarism
    0 results found.

Perhaps the Guild thinks that the phrase "effective copyright protection" includes plagiarism, but as we have noted elsewhere, copyright "protection" is really not about plagiarism: one can permit limitless attributed copying without approving of or permitting plagiarism. The two are separate, and the Authors Guild, of all organizations, should know this.

The Authors Guild's heart is in the right place; the problem is just that they've bought the industry myth: that authors' interests are always the same as publishers'. If the AG really wants to look out for the interests of all authors, not just the small percentage with successful monopoly-based publishing arrangements, they'll knock on Google's door and ask how they can help. Instead, they're suing for copyright violation, even though what Google is doing is both well within the bounds of so-called "fair use" and enormously beneficial to the Guild's members.

The Great Cross-Referencing has begun. Let us hope the Authors Guild sees the light and allows it to continue.

[Postscript: When I first wrote this article, I wasn't aware that Amazon had already been doing in-book searching for some time. This means that Amazon could do automated plagiarism detection as well, and perhaps there are other organizations in the same position. But note that Amazon is not the target of publishing industry lawsuits, probably because Amazon negotiated with publishers for access to book text, rather than just scanning it in the way Google did.]

My Two Cents

I have been reading the posts above and I thought I'd share my own personal experience. Digital plagiarism isn't the next big thing at all. It is already here and it has been here for a long while. 12 months ago I was hit with someone stealing my digital art, not just a little of it but a vast amount of my work. At the time I couldn't do anything but wave bye-bye to my work. I did, however, research on the internet and I found a site that advertised a huge list of online marketing companies. So I decided to contact one. They have since helped me protect as much of my work as possible, and they also carry out a monthly scan of the internet. Brand management, they call it, or online reputation management. Anyway, whatever it is, they scan the internet and find out if anyone else is using my digital art, copy, etc. So as I say, it is here now and has been for a long time. The question is: are you going to protect your brand like I did?
Great website, by the way.

Authors are publishers

In the age of the internet the author is now a publisher as well. This means there needs to be a social media monitoring service or tool to keep track of it all for effective brand management.

[Editorial note: was that a paid link? We couldn't even tell...]

It's the Google approach

When Google first announced their project, I thought they were only scanning in out-of-copyright books, similar to Project Gutenberg. But it seems that was not fully the case. Google kind of tried to impose the idea on writers and publishers.

I love the idea. I hope to see the project succeed. We will see.

John Tasher

[link deleted]

Google - monopoly?

It is clear that Google is acting like an absolute monopoly in many respects. The same goes for plagiarism. I will not be surprised if Google creates its own Turnitin in the near future.
For my students' assignments I use http://www.plagiarismscanner.com - it works fine for me and I know that they are not saving any scanned documents in their databases. It is basically a browser that compares your doc to what is available online.

Hope it will be helpful

On the internet, most of the content we read is copied from somewhere, and the real writers don't get credit for their writing. So I hope that with this search people can find more of the sources and trust them.

Google

I really don't understand why they are fighting this thing myself. Can't they understand that Google will only help to improve their sales, thereby making everyone that much happier?!

[commercial link deleted]

Is using Copyscape really

Does using Copyscape really prevent plagiarism? We know that it takes time to index a published article, so what if someone copies my article and publishes it too? Because mine is not indexed yet, the copy would pass Copyscape. How will plagiarism detection deal with that?

-Jan

Re: Is using Copyscape really

See here for an answer (it's a comment I wrote a while ago in a conversation about a different article).

Time for change

The Google Book Search Library seems to be a very promising project. It will be the largest of its kind and it will hopefully bring a lot of changes in the world of copyrights. I am hoping it will...

DJ

Google vs Amazon

It's understandable that the Authors Guild doesn't hassle Amazon. After all, Amazon is selling the books, creating profit for the writers. Google's BSLP, on the other hand, only creates profit for Google itself, using material that Google does not own in any way.

So, even though Amazon and Google might be doing the same things, it's still in a different context.

BSLP might turn out to have very positive effects for everybody, including the writers. But right now it's probably impossible to tell for certain.

M. Nowak
Nowak blog

Re: Google vs Amazon

But Google's BS can create profits for writers, by leading to increased sales of books, by making them easier to find. That's Google's whole argument!

(Not that profits should be the deciding factor in deciding who can share information, of course. But even granting that dubious proposition, Google's service benefits writers.)

See Tim O'Reilly's New York Times Op-Ed piece on this: Search and Rescue. He explains in detail why Google BSLP should be welcomed by writers.

That's the worst that can happen to an artist!

I had a few problems with my digital art getting stolen, and people just have no respect for us artists!

nancy

Plagiarism in art

Interesting article. I'm personally affected by plagiarism of my painted artwork. Another local artist has become quite skilled at copying my work and then trying to compete against me by selling it at the same venues - usually for a fraction of the price, presumably because he isn't registered to pay taxes.

As a current painter and aspiring writer (albeit one who needs to go back to school to learn proper grammar!!) I'd be interested to see how this one plays out.

Richard Buckley - Artist
Abstract Canvas Art

We as well deal with the

We as well deal with the same issues on our sites. Often our texts and graphics are used blatantly without our permission as well. It really is a shame, but unfortunately seems to be the norm in this digital age we live in.

Frank
zohaisx

I was wondering can you not

I was wondering can you not sue them for copyright infringement even though it's in a digital form?

Blatant stealing

I own a Photoshop-related web site and I've seen many other sites just blatantly rip off my graphics. Usually other people report the graphics that were stolen from me, which I'm really thankful for.

Gagla

Other factors

Remember also that an author will fear plagiarism accusations against him, even if innocent. The matter is not always so clear cut. Google might therefore represent a threat.

But the bigger issue of why the publishers are resisting Google is more interesting. It is really quite simple. Google, by better informing the reading public, is diluting the marketing power of the publishers. The reader, by being better informed, can make choices independently of the publisher's marketing machine.

Re: Other factors

Yes -- one of the effects of loosening copyright is a much more collaborative notion of authorship. Things that today might be considered plagiarism, or at least lack of originality, might tomorrow be considered simply building on the works of others. The important thing is to have laws that enforce proportional attribution, rather than laws to restrict copying, in such a world.

I totally agree about the dilution of publishing's marketing power, very good point.

Thanks for the great site!

This will help me to polish up my own arguments against the stupid copyright system.

I have had a bit to say about it here :

http://forum.onlineopinion.com.au/thread.asp?article=141

---------
http://www.candobetter.org/james
