Are there any statistics on the danbooru database like average number of tags per file, total number of tags, and correlations between specified tags?

I'm making a tag-based file system at the moment, and it occurred to me that the average depth of a file in a traditional file system (between 7 and 11 directories, based on my own system) won't be the same as the average number of tags per file. People will tag a file with several attributes of roughly equal importance, whereas in a hierarchical structure they have to choose between those attributes at each level.

Looking here on danbooru, I get the feeling I'm correct in that conclusion.


No, I hadn't seen that. It's very interesting, but since my file system is meant for personal use, I can't impose such guidelines on tags, so I probably won't include tiers of tags in the design of my database.

My question is only the first sentence; I included the rest to explain why I'm interested. I would like to email an admin with my questions, but I haven't found any contact information yet.

That's all? I'm really encouraged by that. I was worried I would see a huge slowdown with the amount of data danbooru handles.

By correlation, I mean statistical measures of correlation, but also simple things like how likely it is that one file will have both of two given tags. Those may or may not have implications for how the database is structured, but I find them interesting nonetheless.
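To make it concrete, here is a rough Python sketch of the kind of numbers I mean. Everything in it is made up for illustration: the `files` dictionary, the tag names, and the function names `tag_stats` and `cooccurrence` are hypothetical and have nothing to do with danbooru's actual schema. It computes the average number of tags per file, the number of distinct tags, the fraction of files carrying both of two tags, and the phi coefficient (the Pearson correlation of two yes/no variables) as one possible statistical measure.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical sample data: file name -> set of tags.
files = {
    "a.jpg": {"cat", "outdoors"},
    "b.jpg": {"cat", "indoors"},
    "c.jpg": {"dog", "outdoors"},
    "d.jpg": {"cat", "outdoors", "sleeping"},
}


def tag_stats(files):
    """Return (file count, average tags per file, per-tag counts, per-pair counts)."""
    n = len(files)
    if n == 0:
        raise ValueError("need at least one file")
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in files.values():
        tag_counts.update(tags)
        # Sort so that each unordered pair is always stored under the same key.
        pair_counts.update(combinations(sorted(tags), 2))
    avg = sum(len(tags) for tags in files.values()) / n
    return n, avg, tag_counts, pair_counts


def cooccurrence(n, tag_counts, pair_counts, a, b):
    """Return (fraction of files with both tags, phi coefficient) for tags a and b."""
    p_both = pair_counts[tuple(sorted((a, b)))] / n
    p_a = tag_counts[a] / n
    p_b = tag_counts[b] / n
    denom = sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    # Phi is undefined when a tag is on every file or on none; report 0.0 then.
    phi = (p_both - p_a * p_b) / denom if denom else 0.0
    return p_both, phi


n, avg, tag_counts, pair_counts = tag_stats(files)
print("files:", n, "distinct tags:", len(tag_counts), "average tags per file:", avg)
print("cat + outdoors:", cooccurrence(n, tag_counts, pair_counts, "cat", "outdoors"))
```

Counting every pair grows quadratically with the number of tags on a file, so on a real database it would probably make more sense to count only the pairs someone actually asks about.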

Thanks both of you.

I'm not sure, but I think you can just download the tag database. I know there are plugins and scripts for Firefox that keep local copies; I'm thinking of things like the tag auto-complete plugin.