Sunday 18 July 2010

FanFiction.Net Member Statistics

The research team is proud to present the first figures from our user-related queries. This post answers many questions, including the following:

-How many writers are there on FFN?
-How long will you stay on FFN?
-How many stories do they write?
-How many users are deleted from FFN for infringement of ToS?
-How quickly does FFN grow?
-How many readers should you expect for a story?

First, though, we must present the methodology. The study consisted of generating 1100 random user account IDs in the range from 1 to 2,400,000 (source data at the bottom). This allows us to generate representative, unbiased results at a 95.34% confidence level with a 3% margin of error. The list was generated on the 29th of June 2010. We therefore included all accounts that were registered, enabled and fully functional at that point, without restrictions on story creation or profile/review posting.
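
As a rough check of the sampling figures above, the confidence level that goes with a 3% margin of error can be recomputed for a sample of 1100. This is only a sketch: it assumes a simple random sample, the worst-case proportion p = 0.5 and a normal approximation, and the function name is ours rather than part of the original study.

```python
import math

def confidence_for_margin(n, margin, p=0.5):
    """Confidence level at which a sample of size n yields the given margin of error.

    Assumes simple random sampling and the normal approximation for a proportion.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    z = margin / standard_error
    # Two-sided coverage of a standard normal variable within +/- z.
    return math.erf(z / math.sqrt(2))

confidence = confidence_for_margin(n=1100, margin=0.03)
print(f"{confidence:.2%}")
```

Under these assumptions the result comes out at roughly 95.3%, in line with the figure quoted in the methodology.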

Now, the definitions. You will see the following criteria used in this post:

Empty account: any account that does not host stories uploaded by the owner. In layman's terms, there are no stories posted in this account. There may be favourites. Here and here are examples of accounts dubbed 'empty'. Conversely, this is not an empty account.

Active/alive account: any account that has shown signs of life in the past six months, counting from January 1, 2010. A sign of life is any of the following: updating or posting a story, updating the profile, adding a favourite story, or reviewing a favourite story. For example, these two accounts are called 'active' or 'alive' in this post. In the case of the second example, please check the favourites. As long as at least one criterion is met, the account is active. Those who joined FanFiction.Net in 2010 are counted as active by default, as a grace period for creating a story.

Inactive/dead account: any account that does not meet the active/alive criterion above. Here are two examples.

Deleted account: any user ID that shows the following or a similar message: "User does not exist or is no longer an active member."

Main Part

You probably recall from our last research that FFN has ~3,300,000 stories (number rounded up to accommodate growth since the previous post), which is 53% of all posted material, with the other 47% deleted. Keep this in mind for a moment.

In the sample of 1100, we discovered 742 empty accounts, which means that, extrapolating from the sample, only 32.5% of all FanFiction.Net users have stories posted. How does that translate into general numbers? In a population of 2,400,000 members, 781,000 have stories (4.2 stories per account with at least one story, or 1.375 stories per member overall), while the remaining 1,619,000 do not participate in adding content. Two thirds of all members are pure readers, or so it may seem. If that were correct, we could say that one writer has three dedicated readers on average, if we assume writers themselves read. However, it's not that simple.
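
The arithmetic behind these figures can be sketched in a few lines. The constants come from the post itself; the helper name `extrapolate` is made up for illustration, and the sketch assumes the sample share carries over to the whole population unchanged.

```python
SAMPLE_SIZE = 1100
POPULATION = 2_400_000      # highest account ID covered by the sample
TOTAL_STORIES = 3_300_000   # approximate story count quoted in the post

def extrapolate(sample_count, sample_size=SAMPLE_SIZE, population=POPULATION):
    """Scale a count observed in the sample up to the whole population."""
    share = sample_count / sample_size
    return share, share * population

empty_in_sample = 742
share_with_stories, accounts_with_stories = extrapolate(SAMPLE_SIZE - empty_in_sample)

stories_per_writer = TOTAL_STORIES / accounts_with_stories
stories_per_member = TOTAL_STORIES / POPULATION

print(f"{share_with_stories:.1%} of accounts, about {accounts_with_stories:,.0f}, have stories")
print(f"{stories_per_writer:.1f} stories per writer, {stories_per_member:.3f} per member")
```

The same helper applies to any other sample count in this post, such as the number of inactive accounts.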

Some accounts are plain dead. How many? In the sample of 1100, 855 accounts were inactive and showed no signs of life in 2010. What does that mean for FFN? 78% of all accounts on FanFiction.Net are dead. Less than a fourth, 22% or 528,000 accounts, is currently at your disposal, which is less than the number of accounts with stories on them.

The fun part begins now. How many writers are active? Who could you expect updates from? We take the overlap of 'active' and 'not empty'. In the sample of 1100, 130 accounts showed signs of life and had stories. That translates into 12% of all accounts on FFN having at least one published story and being actively engaged in fandom activity. 88% of members on FFN are currently not shaping any fandom. As for those who do, there are 283,000 of them. We have found that there are 5259 fandoms on FFN, which would mean 54 people keep an average fandom alive over the course of 6 months.

On average, no more than 54 people appear in a fandom over six months. How many new people is that per day? 0.3 of a person drops into an average fandom each day. An average fandom has 681 stories. A median fandom, the one in the middle, which removes the enormous influence of HP with its 0.5 million stories, has 16. That was a bit of extra information, and we now return to users.

One aspect of FFN particularly interested the research team: the number of account deletions by the administrator. The figure we obtained was 0.73%. That's less than 1 in 100. However, let us convert that into raw numbers: 17,500. We add an arbitrary 3000 to that number because accounts from 1 to 3000 are unavailable, and the account number generator did not account for it. What do we get? Since September 1998, FanFiction.Net has deleted over 20,500 users for infringement. That is 0.85% of all users. On average, 4.75 accounts are deleted per day, a very modest number because we disregard deletions that are impossible to document and test easily, like those attributed to policy changes (for instance, when MSTs were deemed unwelcome).
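
For readers who want to retrace the deletion estimate, here is a minimal sketch. The start date of 1 September 1998 is our assumption (the post only says "September 1998"), and the end date is the day the sample was generated.

```python
from datetime import date

POPULATION = 2_400_000
deleted_share = 0.0073     # share of deleted IDs found in the sample
unreachable_ids = 3000     # IDs 1 to 3000, added on top as described in the text

estimated_deleted = deleted_share * POPULATION + unreachable_ids
share_of_all = estimated_deleted / POPULATION

# Assumption: count days from 1 September 1998 up to the sampling date.
days_online = (date(2010, 6, 29) - date(1998, 9, 1)).days
deleted_per_day = estimated_deleted / days_online

print(f"{estimated_deleted:,.0f} deleted accounts ({share_of_all:.2%} of all users)")
print(f"{deleted_per_day:.2f} deletions per day over {days_online} days")
```

With these inputs the daily figure lands close to the 4.75 quoted above; a different start day within September 1998 shifts it only marginally.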

Who would that be? Blacklisted people: spammers, trolls, plagiarists and other infringers. The administrators missed a few who try to use FFN as an advertising venue, here and here.

By now, you know how many accounts there are in total. It's time to break them into a time series and give you an understanding of how quickly FFN grows.

The table below tackles this issue. We need to explain the columns for complete clarity:

Total: the last account ID created in the year (in other words, the cumulative number of accounts created up to December 31 of that year: accounts made this year plus all accounts made in previous years)
Change: the number of accounts created in the year in question
Growth%: the accounts created in the year as a percentage of the previous year's Total.
CChange: chained value of change. The ratio of Change (this year to last) divided by the ratio of Total. It shows how much quicker (above 1) or slower (below 1) the site grew this year compared with the previous one, i.e. its acceleration.
Middle: the date by which half of the annual growth is reached, i.e. 50% of the accounts created in that year already exist by this date.

Year - Total - Change - Growth% - CChange - Middle
1999* - 6,749 - ... - ... - ... - ...
2000 - 33,090 - 26,620 - 411.4 - ... - ...
2001 - 147,200 - 114,110 - 344.8 - 0.19 - ...
2002 - 318,900 - 171,700 - 116.6 - 0.16 - ...
2003 - 512,000 - 193,100 - 60.6 - 0.32 - June 22
2004 - 733,000 - 221,000 - 43.2 - 0.5 - June 13
2005 - 959,000 - 226,000 - 30.8 - 0.55 - June 29
2006 - 1,188,200 - 229,200 - 23.9 - 0.63 - June 21
2007 - 1,458,900 - 270,700 - 22.8 - 0.78 - June 17
2008 - 1,788,000 - 329,100 - 22.6 - 0.81 - June 3
2009 - 2,238,000 - 450,000 - 25.2 - 0.89 - May 31
2010** - 2,680,000 - 442,000 - 19.8 - 0.66 - July 21

*Accounts created in 1998 added. It is impossible to tell when exactly a person joined before 2000-01-07.
**estimated, based on the first 6 months.
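
The Change and Growth% columns can be recomputed from the Total column alone. The sketch below does so for 2003 to 2009, assuming that Growth% is Change divided by the previous year's Total, which matches the tabulated values for those years. The function name is illustrative, and CChange is not covered here.

```python
# Year-end totals (last account ID created in the year), copied from the table.
totals = {
    2003: 512_000,
    2004: 733_000,
    2005: 959_000,
    2006: 1_188_200,
    2007: 1_458_900,
    2008: 1_788_000,
    2009: 2_238_000,
}

def yearly_growth(totals):
    """Return {year: (change, growth_percent)} for every year that has a predecessor."""
    result = {}
    years = sorted(totals)
    for previous, current in zip(years, years[1:]):
        change = totals[current] - totals[previous]
        growth_percent = 100 * change / totals[previous]
        result[current] = (change, round(growth_percent, 1))
    return result

for year, (change, growth) in yearly_growth(totals).items():
    print(year, change, growth)
```

The output shows the pattern discussed in the text: Change rises in raw numbers while Growth% shrinks, because each year's new accounts are measured against an ever larger base.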

Before we begin analysing the data, we should explain our 2010 estimate. We calculated it according to seasonality, not a plain average. Based on our calculations, by June 21 the site has received 50% of its annual account growth. This means that slightly more accounts are created in the first half of the year than in the following six months. Site-wide, there is no reason to assume that 'big' events like the release of a movie or a new popular book create significant fluctuations. Years before 2002 were not included due to volatility while the site was still young.

Now, let's carry on with the examination. As you can see in the Total column, the site is growing every year. Naturally. The Change column shows that an increasing number of people joined the site each year up to 2010, with the period from 2004 to 2006 being stable in terms of Change. Things become trickier with Growth% and CChange. Some of you may be confused why a site that is growing more and more in raw numbers seems to score poorly in the last two columns. The explanation is as follows: as the site grows, it needs a larger number of new accounts to sustain the same rate. A simple example: a site with 1000 accounts gets 1000 more this year, which makes 2000 accounts, a growth of 100%. If the site gains another 1000 next year, that 1000 will be relatively smaller (50% versus 100%) than the first. The same is happening to FFN, as it gains a similar number of accounts that weigh less and less.

The rate of acceleration or slowing down is most visible in CChange. Not a single value is higher than 1, which means the site never grew faster than the year before. Conversely, the closer the value is to zero, the less momentum the site gained compared with the previous year. From 2000 to 2009, CChange was moving closer to 1, a sustainable equilibrium point, but the year 2010 returns us to the levels of 2006.

In layman's terms, imagine two racing cars. One of them is the site, and the other is the value 1: how the site did last year. The other car is a ghost of the kind found in time-trial modes, which repeats the race as you drove it before. The ghost reaches the finish line first every time because your car never reaches the value of 1. You lose one race. Next time, the ghost repeats your run from the race you lost. And again. This means that with every race the ghost is slower, repeating your losses. You keep losing, though. As you do, you notice that while at first you lost by a long shot, after several runs you still lose, but 1 is a lot closer.

If it weren't for 2010, a great gap in a seemingly smooth continuity, we could have drawn the obvious conclusion that FFN will, eventually, grow faster, and that its growth will be bigger both in volume and in the share that volume takes of the whole (your car will start a winning streak).

Regression analysis showed that there is a polynomial relationship between time and growth. Fitted linearly, the relationship is positive, and a linear trendline with R^2=0.825 would claim that the site will reach CChange=1 in 2012.

A polynomial trend fits better, with R^2=0.9 for the parabola. It means that the function you will see below 'catches' 90% of all the variation in our growth measure (CChange), and best describes fluctuations in growth on FFN. In other words, 90% of all growth fluctuations are explained by time in the function below.

y = -0.0094x^2 + 0.218x - 0.4813

y - CChange value

x - number of years since 1998 (0, 1, 2, et cetera)

Basically, this function allows us to calculate the future of FFN. What is it? Well, according to this, the CChange value will be 0 when the site reaches 21 years of age, that is, by the year 2019. This is the scenario we follow if the site does not gain momentum by 2012. If we employed descriptive statistics, any CChange above 0.779 or under 0.3 would be considered anomalous (the rule of three standard errors). Removing those values gives us a more pessimistic, yet less accurate, picture of these events. Reaching 1 would take three years longer in the linear model, and negative CChange would also be reached sooner in the more reliable polynomial models. Our choice of extrapolation is based on the principle of numeric accuracy, provided other factors remain static. Surely, clever website management and an increased interest in fan fiction as a concept are bound to change the end result. It does, however, suggest that the site administration would do well to avoid the trend described in this exercise.
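
The zero crossing quoted above can be checked directly from the fitted coefficients. This is a small sketch using only the parabola as printed in the post; the function names are ours.

```python
import math

# Coefficients of the fitted parabola from the post: y = A*x^2 + B*x + C,
# where x is the number of years since 1998 and y is CChange.
A, B, C = -0.0094, 0.218, -0.4813

def cchange(x):
    """Fitted CChange value x years after 1998."""
    return A * x**2 + B * x + C

def zero_crossings(a, b, c):
    """Both real roots of a*x^2 + b*x + c = 0, smallest first."""
    root = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b - root) / (2 * a), (-b + root) / (2 * a)))

first_zero, last_zero = zero_crossings(A, B, C)
peak_x = -B / (2 * A)      # vertex of the parabola
peak_y = cchange(peak_x)

print(f"CChange returns to 0 about {last_zero:.1f} years after 1998")
print(f"Fitted peak of {peak_y:.2f} at x = {peak_x:.1f}")
```

The later root falls a little under 21 years after 1998, consistent with the date given in the text. Note that the fitted curve itself tops out below 1, so on this fit alone CChange=1 is never reached; that outcome would require the momentum the post talks about.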

As the final part of this piece of research, we would like to return to a number we have shown you before: 12%, the share of accounts that have stories and currently participate in fandom. Another 10% are active readers and do not have any stories posted. This is a general number, though, and we are sure you are more curious to know where you stand among your peers rather than across the whole site.

Below is a table with the following columns:
Year: year of joining.
Full: the probability (%) that your account is still active and has stories if you joined in the given year
Empty: the probability (%) that your account is still active but has no stories if you joined in the given year
Full stays: the probability (%) that, if you have stayed active until July 2010, you have stories posted

We start from the year 2002, when initial FFN volatility abated. Empty in 2010 is skipped.

Year - Full - Empty - Full stays
2002 - 6.4 - 2.5 - 71.4
2003 - 8.5 - 1.1 - 88.9
2004 - 3.7 - 1.9 - 66.7
2005 - 5.7 - 2.3 - 71.4
2006 - 9.1 - 2.0 - 81.8
2007 - 9.1 - 5.8 - 61.1
2008 - 16.2 - 2.8 - 85.2
2009 - 18.6 - 21.3 - 46.6
2010 - 28.4 - ... - ...

Interestingly, you are more likely to stay on FFN for over a year if you have stories and are a writer than if you are just a reader. However, within the first year, you have a roughly equal chance of staying on FFN, writer or reader alike. Regardless, if you join FFN, chances are you will not write a story and you will not be on the site longer than six months.

Even if you have written a story, it is most probable that you will not be on the site longer than six months. This is a generous time period, and it could be that six months appears as the most probable activity lifespan only because it is the starting point, and nothing shorter is measured in this study.

We have worked on a regression to give you an easy way to calculate your prospects of staying on FFN. A fifth-degree polynomial function had the biggest R^2, at 0.99. Amusingly, its probability would go down to negative 1700% very quickly after 8 years, so we had to switch to a simpler parabolic function with R^2=0.96.

y = 0.0218x^2 - 0.2603x + 0.961

x - the number of years you have stayed or intend to stay on FFN (works for values up to 10 years).

y - % that you will stay.

According to the given function, it is least likely that you will stay on FFN for 6 years. So yes, staying 7 or 8 years comes out as more likely. We attribute this to some form of fandom patriotism the earliest members have shown towards the site. A more precise function would have to include account deletions, which should, in reality, lower active account rates (remember the first 3000 accounts?) and the possibility of staying much longer than 8 years. In any case, the function above is presented for your amusement. A more informative variant is below.
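
The claim about 6 years being the low point can be verified by evaluating the parabola. In this sketch we assume that y is expressed as a fraction of 1 (so 0.961 at x = 0 reads as about 96%), which the post does not state outright; the function name is ours.

```python
# Coefficients of the retention parabola from the post: y = A*x^2 + B*x + C
A, B, C = 0.0218, -0.2603, 0.961

def stay_likelihood(years):
    """Fitted likelihood of still being active after the given number of years.

    Assumption: the result is a fraction of 1, not a percentage.
    """
    return A * years**2 + B * years + C

# A parabola that opens upwards has its minimum at the vertex.
least_likely_years = -B / (2 * A)

values = {years: round(stay_likelihood(years), 3) for years in range(0, 9)}
for years, value in values.items():
    print(years, value)
print(f"Minimum at about {least_likely_years:.2f} years")
```

The curve bottoms out just short of 6 years and rises again for 7 and 8, which is the artefact the text attributes to the loyalty of the earliest members.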

We understand that it might be difficult to picture the practical difference between values like 6% and 9%, which dominate the previous table. For this reason, we have rescaled the numbers so that 28.4%=1. This way, you will see more clearly how many active accounts die away, and how many stay active.

8 years 23%
7 years 30%
6 years 13%
5 years 20%
4 years 32%
3 years 32%
2 years 57%
1 year 65%
0 years 100%
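
The rescaling is simple to reproduce: each value in the Full column is divided by the 2010 value of 28.4. A short sketch follows, with the Full column copied from the earlier table and a function name of our own choosing.

```python
# "Full" column by year of joining: percent of accounts still active with stories.
full_by_year = {
    2002: 6.4, 2003: 8.5, 2004: 3.7, 2005: 5.7, 2006: 9.1,
    2007: 9.1, 2008: 16.2, 2009: 18.6, 2010: 28.4,
}
BASELINE_YEAR = 2010

def relative_retention(full_by_year, baseline_year=BASELINE_YEAR):
    """Map years spent on the site to retention as a percentage of the baseline year."""
    baseline = full_by_year[baseline_year]
    return {
        baseline_year - year: round(100 * value / baseline)
        for year, value in full_by_year.items()
    }

retention = relative_retention(full_by_year)
for years_on_site in sorted(retention, reverse=True):
    print(f"{years_on_site} years {retention[years_on_site]}%")
```

Dividing consecutive entries (for example the 2-year value by the 1-year value) gives the share that survives each additional year.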

The process can be taken further if you want to see what percentage of that 65%, and so on, dies in the following years.

Active fanfic participating accounts (those that make up 12% of the site, remember) lose 35% of their numbers in the first year. The second most rapid drop comes at 3 years, but people who stay 3 years are prone to staying 4. The last accurate data point that agrees with the trend (the more time passes, the fewer people stay) is 6 years. Only 1/8 of the people who are active writers right after joining remain so. 7/8 chip off along the way. As such, the number of permanent contributors (who stay on the site for years) increases as FFN grows. There is only one 'but': the increase is largely consumed by users abandoning their accounts.

Those who have spent less than 6 months on the site account for 6.5% of all accounts, or 29.5% of the 22% that are active in any way. Another 7.3% (33.1% of the active accounts) comes from those who have spent more than a year. As such, it is reasonable to say that almost two thirds of the site's active population consists of inexperienced account owners, rated 'fans' in forums. So-called 'fanatics' make up a third of the active population, a third that spans from 1998 to the beginning of 2009. On the one hand, it is peculiar that the number of active newbies (writers or just readers) is almost equal to that of 'fanatics'. On the other, it should make quality control out of the question. Why does it not even out? A question we leave in your hands, dear readers.

Conclusion

Unless FFN manages to speed up its growth, the 12% that currently shape the fandoms will not be enough, especially because ~5 accounts are deleted every day. The site needs to replace more than 35% of active users every year, and 2010 so far looks the most challenging yet. More dedication, fellow fans. May the concept of fan fiction prosper.

Added: here is a list of user accounts in our sample.

Question: What about people who just go to forums, aren't they active?
Answer: They do not make use of the site's core service as a fan fiction archive. If you don't write or read stories, you are considered inactive. The only way a forum goer could be included as active (provided they have no stories or favourites) is if they updated their profile this year.

15 comments:

  1. The idea that FFN statistically will be at a standstill by 2017 is absolutely ridiculous. Read what you wrote here again:
    "If it weren't for 2010, a great gaping anomaly, we could have made an obvious conclusion that FFN will, eventually, grow faster, and its growth will be bigger both in volume and ratio that volume takes in the whole (your car will start a winning streak)."
    If it weren't for 2010, a year that we're only halfway through so far, we may reach the conclusion that FFN will continue to grow closer to 1. A single anomaly in a trend doesn't void the trend in the same way that a single bad episode in a normally great TV show doesn't mean that the whole show will soon become bad.

  2. We have assessed your comment and redefined 2010 as not anomalous, with the years 2008 and 2009 being so, however. The 2017 estimate with its provisions remains strong. Only considerable growth by 2012, as expected in less precise linear modeling would set changes to this definition.

  3. I have a couple of suggestions for future research.
    --How many members of ff.net self-identify as female?
    --How many members of ff.net who have posted at least one story (ie, how many *writers*) on ff.net are female?
    --Of those members/writers on ff.net who self-identify as male, what are their age statistics? Is the average age of male members similar to the average age of female members, or do the numbers differ significantly?
    I ask these questions b/c I'm a lit student studying topics like fan fiction, popular fiction, and Mary-Sues, and I'm particularly interested in how gender interacts with these topics.

  4. The raw data is useful, the trend fitting is suspect. If I were going to fit a trend, I'd look for a classic growth curve which applies to many more markets than just websites: a ramp in your growth rate, a period of high growth and then a decline in your growth rate as the market matures.

    FFN is over ten years old, old enough to be considered mature. A fall-off in growth rate would be expected, and that is what I think you are seeing. 2010 is probably not anomalous; it is the beginning of a decline in growth.

    Nipping and tucking at the edges is probably not going to change this trend. There seem to be two issues:

    1) Saturation of existing market
    2) Retention of users

    For #1, the normal thing to do is look at existing markets:

    * Can you tap other geographical regions better (some data to show English-speaking locality would be good).
    * Can you tap other languages better (interrelated with previous of course).
    * Can you serve adjacent markets:
    o: does original content make sense to add?
    o: will expanding the TOS to include adult material help (based on overall internet usage traffic analysis we are all familiar with, the answer is most likely yes)
    * advertising campaigns to expand membership. This site seems a good candidate for banner ads.

    Keep in mind that adjacent markets may be best served by sister-sites so that the original brand is not diluted.

    For retention, this is harder, but knowing a poster to the site, it isn't that easy to post and manage your stories. As an occasional reader, I don't find the search engines or the layout up to current standards.

    Also, for retention, something that keeps the site actively in the users attention (without crossing that fine line into spam), better rss-like notifications for favorite others, a periodic news letter with best of the site and news (i.e., advertisement) on expansion areas such as sister sites.

    I'd like to see more quantitative data like:
    -age demographics
    -region demographics
    -break down of stories posted by rating
    -break down of stories read by rating

    That could give you more information than trying to fit suspect trend curves.

  5. My question would be if im visitor whom just reads without an account why dont they base activity on that? Surly they have more non account visits that could keep the site alive longer.

  6. Haha.. this is fun. Maybe you are serious or not, but you do have a point about the time many new "authors" stay on the site. Me? I'm kind of an anomaly, as I've been active on the site for about ten years now.

    Dr Facer.

  7. You are a self righteous person. Statistics numbers doesnt always predict correctly, and it's someone like you who could made some authors turn their back on FFnet, after what you did to them. Computer program still made mistakes, and I heard that those authors of FFnet was not impressive with what you've done.
    Especially after some of those deleted stories were technically innocent of the charges you set upon.
    I dont mind if you do something to those plagiarist, but some stories that had been deleted had some potential to growth. Just because it's not as impressive to you at the beginning, doesnt meant that they couldnt learn from their mistakes. That's what reviews are for! They would tell the writer whether they like the stories or not.
    And, what's wrong with some stroking of the authors ego? Humans likes to hear others approval to motivate them. Without ones, more often than not they tend to think that it's no use to continue what they do.

    And what would be of FFnet if because of you, some if not most of their loyal members decided to reroute their works elsewhere?? I've seen many who had re-considered to post their work because of that program of yours. Then the end of FFnet could be attributed to you too, dont you think?

    Since you seems like to critics so much, I wonder if you would reply to this one kindly.

  8. 16 September 2010 02:53, I don't understand how your comment is relevant to the post. You should PM Lord Kelvin at FFN in case you just wish for dialogue.

  9. I've been reading FanFiction since July 2008, and I created an account at the very beginning of November of that year. While I realize that your points about the lack of new members are valid, I also think that you're failing to realize how dedicated some of us are to these stories, and the importance of this. Yes, it's a shame that there may at some point be a minimal amount of contributors, but at the end of the day, the site should be about the love of the topics, rather than the number of people reading these stories. We'll still be there as long as our favorite stories are.

    That having been said, your statistics were definitely interesting, and it is a sad fact that many stories have been deserted or put on hiatus.

    Also, if Ashley is still on here: In my experience on the site, there are numerous stories with Mary Sues, however they're not popular and are generally treated with disdain. I personally am a female in my teenage years, and I would have to say the gender and age of members really do vary between topics and genres.

  10. To Anonymous (comment directly above):
    You would be amazed at the popularity of Mary-Sues, not just in fanfiction but in original fiction. Just because Mary-Sues are disdained by a critical portion of the public doesn't mean they aren't popular. If they weren't popular (at least with writers) there wouldn't be so many out there! But what I'm mostly interested in is the way that readers self-insert into main characters, and Mary-Sue is obviously a site for that phenomenon.

    While gender and age do vary, statistics I have found other places have shown that fanfiction is overwhelmingly written and read by women. The reason I want clear statistics is to see if this is true, and what the actual numbers might be. Statistics about age and gender would clear up WHICH topics, genres etc were popular with WHICH genders/ages.
    ~Ashley

  11. Demographics would be informative to get a hold of, but FFN does not individually collect such information unlike its service providers, Google included. From experience, I know surveys are hopeless on a site as varied as FFN, so it's difficult to suggest an efficient method of gathering information for free. Be sure that as soon as it becomes available, though, it will appear here.

  12. Dear (other) Anonymous: They were not impressed. Not "they were not impressive."

    I have served the purpose of the typical Grammar Nazi and now depart.

  13. The thing that I think is the best case scenario for the site, but would completely negate this research, would be the emergence of another media hyped fandom source (like Harry Potter or Twilight) that would help to create a new source for material for writers and readers. I think that there needs to be more insight into the function of the mega-fandoms and how they effect FFNet and how each time one of those fan-sources are created effect the growth of the site. With the Harry Potter books long completed and the last movie looming, and the same situation being true for Twilight, I could understand how participation in the site may be slowing. However, it would be interesting to see how media booms help to bolster the site, and how interest drops once a series has been completed for some time.

  14. Wow. I know this was posted a long time ago, but it is still an interesting read! Thank you! Next month, I will have officially been on FFnet for a year!

  15. wow, so I was assigned an assignment for my English class, we have to write about a hobby and I consider FFN a hobby, I found this by pure chance and I found it completely intriguing. I think FFN could continue to grow as it is one of the greatest Fiction sites in my belief, mainly because it's the only fiction site I've found that notifies you when a story you follow or an author you follow has updated. I am glad for this as it will be a great help to my essay, that is due tomorrow.
