Note because I know non-AWers are checking out this thread: I am very pro-self-publishing, I am choosing to self-publish my debut novel myself, and I think self-publishing is all that and the cat's meow. But I also think good trade publishers are awesome, and that the right choice will be very individual to each author and his or her book, and I feel very, very strongly about authors being well-informed about the choices.
----------------------------------------
Edits: The unreliability of the sources I posted in one part were pointed out downthread. Also, I realized upon rereading that I elided some of the mathematical language in this post. I'm sure it's already clear what I meant, but I'm footnoting anyway because I'm pedantic with myself like that.
----------------------------------------
Before the other thread got locked, some people asked me why I thought the math/logic in this report was so poor. I really don't have a lot of time to spend on this so I'll just highlight a few things.
First, I'm highly skeptical that they've managed to crack Amazon's ranking -> sales function (how many sales a given Amazon ranking equals). My own research suggests that this often fluctuates wildly and may be skewed towards eBooks, as Amazon has a vested interest in pushing Kindle products above paper books. Amazon keeps this algorithm a closely guarded secret. Because of this, I'm highly skeptical of the raw data the study is based on.
Even if we accept the raw data as accurate, a quick Google search tells me that online sales (not ebook sales, online sales of any type) account for less than half of all book sales in the U.S. (as of 2012) and that Amazon accounts for half of all online sales (as of 2013). [edit: The unreliability of these sources was pointed out downthread. See footnote #1. In general, take this point to be a point about Amazon being a portion of the market to a degree we just don't know (unless someone can direct me to better sources!)] This means that even rounding up generously, Amazon accounts for only a quarter (forty percent?) of book sales in the U.S. Now, this is not quite a fair comparison, as it appears this applies to all book sales and not just genre book sales, but although I can't conclude that genre sales aren't concentrated at Amazon, I think it's also fallacious to assume that they are without more information. If genre sales are distributed evenly across sales channels, these data are only 25% (40%? Not the whole portion, in any case) of the picture.
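For anyone who wants to check the arithmetic behind the 25% and 40% figures, here is a minimal sketch. It takes the two quoted shares at face value (rounded up to one half each, even though the sources were flagged as unreliable), and the 40% line follows the adjustment described in footnote [1]. The variable names and the baseline of 100 units are just for illustration.

```python
# Shares quoted in the post, rounded up generously. The sources behind
# them were flagged as unreliable, so treat these as placeholders.
online_share_of_all_sales = 0.5   # "less than half" of U.S. book sales
amazon_share_of_online = 0.5      # "half of all online sales"

amazon_share = online_share_of_all_sales * amazon_share_of_online
print(f"Amazon share at face value: {amazon_share:.0%}")

# Footnote [1] adjustment (an assumption, not a measured figure): suppose
# Amazon also sells as many uncounted self-published books as counted
# trade books, so its volume doubles and the total market grows with it.
counted_market = 100.0                               # arbitrary baseline units
amazon_counted = counted_market * amazon_share       # Amazon's counted sales
amazon_adjusted = 2 * amazon_counted                 # counted + uncounted
adjusted_market = counted_market + amazon_counted    # market incl. uncounted
adjusted_share = amazon_adjusted / adjusted_market
print(f"Amazon share after the footnote [1] adjustment: {adjusted_share:.0%}")
```

Note that this only adjusts Amazon. As footnote [1] says, other retailers would need the same treatment, which is part of why the real number is unknown.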
And it's the 25% (40%?) of the market in which self-publishers tend to make the majority of their sales. I don't think it's controversial (unless someone corrects me) to say that it's far more likely that the other 75% (60%?) of the market is skewed toward trade published books, particularly the more than 50% of book sales that are still made offline. Looking at 25% (or 40%, or whatever it might be) of the market might be interesting, but I don't think it provides nearly enough of a picture to extrapolate the rest of it.
More significantly, however, the data are only from 7,000 books on the best seller lists. There are at least 12 million books on Amazon. That means we're looking at roughly five hundredths of one percent of the books there -- and since they're from best seller lists, that .05% is drawn entirely from the top books.
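The sample-size fraction is a one-line calculation. This sketch uses exactly 12 million as the catalog size; since the post says "at least" 12 million, the true fraction can only be this size or smaller.

```python
sample_size = 7_000          # books in the report's best seller data
catalog_size = 12_000_000    # "at least 12 million books on Amazon"

# Fraction of the catalog covered by the sample. A larger catalog
# would make this fraction smaller, never bigger.
sample_fraction = sample_size / catalog_size
print(f"Sample covers {sample_fraction:.3%} of the catalog")
```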
Why is this significant? Because we're only looking at data from books that are already doing well, which tells us nothing about how likely a particular method of publishing is to get it there.
Assuming the raw data are correct -- which show roughly half of these 7,000 books as being self-published and roughly half being trade published -- I think the only conclusion that can be drawn here is that it's possible for self-published books to do well within Amazon's marketplace. Which is nice (yay choices!), and might have been big news a decade ago, but I think anybody who doubted that it was possible for self-published books to do quite well hasn't been reading much about publishing lately.
But saying something is possible does not say anything either way about the likelihood. And looking at people who have succeeded at X after the fact does not tell you anything about what to do to achieve X unless you also know the breakdowns of people who do not achieve X. For instance, if you look at a bunch of lottery winners and see that (say) 75% of them were poor before winning (I'm making this up), it does not help your chances to give away all your money, because you're ignoring that it might be true that 75% or more of lottery tickets are bought by lower income brackets. If a poor person and a rich person each buy a ticket, they're equally likely to win; the poor person does not have a 75% chance.[2] And you wouldn't know this from looking at just the winners; you'd have to look at the breakdowns of both winners and nonwinners and see that both had breakdowns of 75% poor people and thus the state of being poor gave no actual advantage in the lottery.
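The lottery example can be worked through with made-up numbers. In this sketch the ticket counts and the number of winners are entirely hypothetical, chosen only so that 75% of tickets are bought by poor people; the one real assumption is that every ticket has the same chance of winning.

```python
from fractions import Fraction


def lottery_breakdown(poor_tickets, rich_tickets, total_winners):
    """Expected winner breakdown when every ticket is equally likely to win."""
    total_tickets = poor_tickets + rich_tickets
    p_win_per_ticket = Fraction(total_winners, total_tickets)
    expected_poor_winners = poor_tickets * p_win_per_ticket
    expected_rich_winners = rich_tickets * p_win_per_ticket
    return {
        # What you see if you look only at the winners.
        "share_of_winners_poor": expected_poor_winners / total_winners,
        # What actually matters to someone buying a ticket.
        "p_win_given_poor": expected_poor_winners / poor_tickets,
        "p_win_given_rich": expected_rich_winners / rich_tickets,
    }


# Hypothetical numbers: 75% of tickets are bought by poor people.
result = lottery_breakdown(poor_tickets=750_000, rich_tickets=250_000,
                           total_winners=100)
for name, value in result.items():
    print(f"{name}: {float(value):.4%}")
```

The share of winners who are poor simply mirrors the share of tickets they bought, while the per-ticket chance is identical for both groups. Looking at the winners alone can't distinguish those two things.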
How does this relate here? Well, this link (which is well worth reading; thanks AWer Gravity!) suggests 8 million of those 12 million Amazon books are self-published. If all books are equal (which they're not, but just as a demonstration), and half the 7,000 top sellers on Amazon are self-published, you have a .04% chance of selling that well if you self-publish but a .09% chance of selling that well if you trade publish[3] -- twice as high. Of course, both percentages are ludicrously small, and don't mean much at all for two reasons: (1) the far more interesting question for aspiring authors is not how to become a best seller, which is quite an achievement in either category, but which avenue is the best business decision for that particular book, and (2) all books are not created equal. Point (2) suggests that the higher probability of trade published books becoming best sellers means that trade published books on average are either more popular or have better marketability or both, which I don't think is terribly surprising, since self-published books have a much longer tail of poor quality that trade published books do not (note that this says nothing about the comparability of the top self-published books to the top trade published books, only the averages), but other than that I don't feel like there are really any useful conclusions to draw here. What we really want to know is, for us likely-to-be-non-bestsellers, what is the better path to choose? And these data simply don't give us any information on that, unfortunately.
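Here is a minimal sketch of the .04% versus .09% demonstration, using only the figures quoted above and the same deliberately unrealistic assumption that all books are equal. The function name is just for illustration, and the caveats in footnote [3] still apply.

```python
def bestseller_rate(bestsellers_in_group, books_in_group):
    """Naive chance that a book in the group is on the best seller lists,
    assuming every book in the group is equally likely to be there."""
    return bestsellers_in_group / books_in_group


total_books = 12_000_000      # "at least 12 million books on Amazon"
self_published = 8_000_000    # figure from the linked article
trade_published = total_books - self_published
top_sellers = 7_000           # split roughly half and half in the report

self_rate = bestseller_rate(top_sellers / 2, self_published)
trade_rate = bestseller_rate(top_sellers / 2, trade_published)

print(f"self-published:  {self_rate:.2%}")
print(f"trade published: {trade_rate:.2%}")
print(f"trade / self ratio: {trade_rate / self_rate:.1f}")
```

The ratio comes entirely from the catalog sizes: the same number of best sellers spread over half as many books gives twice the rate.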
The earnings conclusions strike me as equally skewed. First of all, we have the same problem here, in that we're looking at books that have already been successful. If we condition on a book being successful, of course it's going to have higher royalties on Amazon if it's self-published. Because you get higher royalties for that. But it's ludicrous to condition on a book being successful when trying to make business decisions! We don't want to know which would make us more money assuming we can magically make our books sell brilliantly; we want to know which path will make our books sell better in the first place. A lot of trade published books would not make the sales they do without the support of a trade publisher, so it doesn't make any sense to say a trade-published book selling X copies would make more self-publishing because it would still sell X copies at higher royalties, since it's quite possible it would sell far fewer than X. How to publish is a very personal per-book/per-author decision, I think, and would involve a lot of factors no study can tell a person.
Regarding earnings, it's been mentioned elsewhere that there are potentially quite a few pieces of trade published earnings that aren't taken into account by looking at Amazon sales (plus there's that other 75% (60%?) of the market), whereas it's far more likely that we're looking at the majority of a self-publisher's earnings when we look at sales on Amazon. Stacking the majority of a successful self-publisher's income against a portion of a successful trade publisher's income does not strike me as terribly useful data. Additionally, the numbers make no accounting for the start-up costs a self-publisher incurs. The reason royalties are lower when one trade publishes is that the trade publisher is making a substantial financial investment and needs to earn that money back (hopefully with some profit, as the publisher is a business partner). When a self-publisher publishes, she receives higher royalties, but these royalties must also pay back whatever initial investment she made in cover art, editing, time spent marketing, etc. This is not accounted for.
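To make the start-up-cost point concrete, here is a rough sketch of how an up-front investment changes the comparison. Every number in it (the price, both royalty rates, the up-front cost, the copy counts) is a made-up placeholder and not a figure from the report or from any real contract. The function names are also just for illustration.

```python
import math


def net_earnings(copies_sold, price, royalty_rate, upfront_cost=0.0):
    """Royalties earned minus whatever the author paid out of pocket."""
    return copies_sold * price * royalty_rate - upfront_cost


def break_even_copies(upfront_cost, price, royalty_rate):
    """Smallest whole number of copies needed to recover the up-front cost."""
    return math.ceil(upfront_cost / (price * royalty_rate))


# Hypothetical placeholders only, not data from the report.
price = 5.00
self_pub_rate = 0.70      # assumed self-publishing royalty rate
trade_rate = 0.25         # assumed trade royalty rate
upfront_cost = 2_000.00   # assumed spend on cover art, editing, etc.

print("copies to break even:",
      break_even_copies(upfront_cost, price, self_pub_rate))

for copies in (100, 1_000, 10_000):
    self_net = net_earnings(copies, price, self_pub_rate, upfront_cost)
    trade_net = net_earnings(copies, price, trade_rate)
    print(f"{copies:>6} copies: self {self_net:>10.2f}, trade {trade_net:>10.2f}")
```

This still assumes both paths sell the same number of copies, which is exactly the assumption I'm objecting to. The sketch only shows that even under that assumption, the higher royalty rate does not settle the question at low sales volumes.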
I've seen Howey repeat the idea several times that it's better to self-publish because if you're successful, you'll earn more, and if you're less successful or unsuccessful, well, the trade publishers wouldn't have taken you anyway. One of the main reasons this bothers me intensely is that it completely discounts the impact a trade publisher has in whether a book is successful or not. There are few manuscripts that trade publishers put out exactly as they are submitted; the creative support of trade publishers can substantially improve the creative content of a book. The trade publishing process may involve a more professional presentation than the self-publisher can afford. And the trade publisher may be able to put a marketing and distribution push behind the book that the self-publisher does not have access to. IMHO, it's disingenuous to imply that a book will always have the same chances whether one self- or trade publishes and thus one should opt for the higher royalty rate. It also ignores the fact that serious self-publishers are often looking at a substantial financial investment before earning money back, and therefore are taking a large financial risk -- it's entirely possible for a less successful or unsuccessful self-published author to end up in the red, which would not happen if the author opted for trade publishing, regardless of how well the book did. One thing publishers do is assume that risk.
I'm not trying to suggest that these circumstances are always true, just that it's substantially more complicated than, "it's always better to self-publish because if you're successful you'll make more and if you're not you wouldn't have been published anyway so everything is gravy" (paraphrased). Self- versus trade is a complicated decision, with risks involved no matter which way you go.
There are other statistical issues I see with the article -- the extrapolation, many of the other incidental conclusions drawn, etc. -- but I've spent far too long on this already. So there are some broad strokes.
The tl;dr version: In my opinion, the data tell us little other than that it's possible for self-published books to compete quite well in the Amazon marketplace, which I venture to say we already knew. I don't see any conclusions that can be drawn about which path is better or more lucrative for a particular aspiring author and his book. I'm not saying that the data contradict self-publishing as the best choice: they just don't tell us anything either way, other than that self-publishing is a viable choice. Which, again, I think we knew already.
(But maybe that'll be useful to someone!)
----------------------------------------
Final note: I know next to nothing about publishing. But I do know a lot of math. I'm not trying to say here that anyone's beliefs about self- or trade publishing are right or wrong, only that I don't see the data supporting anyone's beliefs either way. There's just not a complete enough picture. I recognize that it may be next to impossible ever to get a complete enough picture -- I'd dearly like one just as much as everyone else would -- but looking at a lot of data doesn't mean that one is looking at enough data, no matter how cool it would be (and it would!) to get some substantive numbers.
Footnotes:
[1] If anyone can find better sources, feel free to point me. Like I said, publishing is not my area of expertise, only math on existing data! I've edited the post to read "maybe 40%" in the sense that we might make a wild-ass guess that Amazon sells as many self-published books as it does trade published ones, since roughly half the 7,000 data points in the survey were self-published and roughly half were trade published. This half-and-half idea is not actually a conclusion we can draw from this, or at least I don't see the math to do it, but calling it half-and-half is consistent with the data. If Amazon sells 100% again as many books that are self-published as it does ISBN-listed trade published books, that would give it roughly 40% of the market instead of 25%. Eeeexcept not necessarily, because if we're adjusting Amazon, don't we have to adjust other retailers? I note that though these adjustments would give Amazon a smaller market share, they would increase the market share of self-publishers in the non-Amazon portion, since SPed books are the ones perhaps not being counted when people calculate market share. So what does this all mean? I feel reasonably confident in saying WE DON'T KNOW. Which was basically my point with this whole post. We just don't know enough to say . . . anything! (I'm happy if someone else can come along with math I didn't see how to apply and find something, but I can't see any conclusions to be drawn here.)
[2] This should have more properly read, "The poor person does not have three times the probability of the rich person of winning." Ironically, I think it's probably more intuitively understandable to non-math people as it's written, and math people almost certainly knew what I meant, but like I said, I'm pedantic about these things.
[3] These percentages are not actually the random chances of something in 4 or 8 million landing on a certain 7,000 list, since the 7,000 list in question changes over time, so will encompass more than 7,000 books. But as noted there are a lot bigger problems with trying to use numbers here, and I was more doing that for demonstration purposes. Note also here that just the fact that the data give us a roughly half-and-half breakdown but there may be twice as many SPed books on Amazon as trade published means that someone could use these same data to say that you're twice as likely to sell to any given level if you trade publish. This argument would be wrong, of course, for all the same problems listed in this paragraph, but it's an example of people being able to use numbers to suit any purpose they want as long as they twist them the right way. As I said in the post, as far as I can tell, these data don't support self-publishing or trade publishing over the other; they just don't really tell us anything.