[personal profile] dracodraconis
http://www.bookblog.net/gender/genie.html

from the web site:
Inspired by an article in The New York Times Magazine, the Gender Genie uses a simplified version of an algorithm developed by Moshe Koppel, Bar-Ilan University in Israel, and Shlomo Argamon, Illinois Institute of Technology, to predict the gender of an author. Read more at nature.com.

Simply type or paste your text in the box below. Click submit for the results.


Apparently it works best with texts of more than 500 words. Here are the results of some of your more recent posts of more than 500 words.

[livejournal.com profile] ancalagon_tb: Female Score: 675 / Male Score: 1089 / Conclusion: Male

[livejournal.com profile] wonderleafy: Female Score: 611 / Male Score: 913 / Conclusion: Male (umm... no, bad turtle)

[livejournal.com profile] d2leddy: Female Score: 792 / Male Score: 741 / Conclusion: Female (not doing so hot, here)

[livejournal.com profile] _luaineach: Female Score: 747 / Male Score: 456 / Conclusion: Female

[livejournal.com profile] ironphoenix: Female Score: 752 / Male Score: 704 / Conclusion: Female (I'm sure his wife will be surprised)

[livejournal.com profile] ms_danson: Female Score: 1892 / Male Score: 1908 / Conclusion: Male (Definitely not)

I'll try a few others later. Right now, cookies to bake.

(no subject)

Date: 2006-12-30 03:37 pm (UTC)
From: [identity profile] d2leddy.livejournal.com
I wonder what assumptions the algorithm makes about the data. For example, that men use adjectives more coarsely than women, who tend to be more specific. Assumptions like that, perhaps?

Or I could just read the recent large posts of those you sampled and see if I detect commonalities.

(no subject)

Date: 2006-12-31 11:48 am (UTC)
From: [identity profile] dracodraconis.livejournal.com
According to the original paper (http://www.cs.biu.ac.il/~koppel/papers/male-female-text-final.pdf):

The short (less than 50) list of features which our algorithm identified as being most collectively useful for distinguishing male-authored texts from female-authored texts was very suggestive. This list included a large number of determiners {a, the, that, these} and quantifiers {one, two, more, some} as male indicators. Moreover, the parts of speech DT0 (BNC: a determiner which typically occurs either as the first word in a noun phrase or as the head of a noun phrase), AT0 (BNC: a determiner which typically begins a noun phrase but cannot appear as its head), and CRD (cardinal numbers) are all strong male indicators. Conversely, the pronouns {I, you, she, her, their, myself, yourself, herself} are all strong female indicators.
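
To make the idea concrete, here is a rough sketch of a word-counting scorer built only from the word lists in that quote. It is not the Gender Genie's actual code: the real tool weights each word differently, and those weights aren't given here, so the equal weighting and the names toy_scores and toy_conclusion are my own assumptions.

```python
import re

# Word lists come from the passage quoted above. Giving every word the same
# weight is an assumption for illustration; the real tool uses per-word weights.
MALE_WORDS = {"a", "the", "that", "these", "one", "two", "more", "some"}
FEMALE_WORDS = {"i", "you", "she", "her", "their", "myself", "yourself", "herself"}


def toy_scores(text):
    """Return (female_score, male_score) by counting indicator words."""
    words = re.findall(r"[a-z']+", text.lower())
    female = sum(1 for w in words if w in FEMALE_WORDS)
    male = sum(1 for w in words if w in MALE_WORDS)
    return female, male


def toy_conclusion(text):
    """Return 'Female', 'Male', or 'Tie' depending on which count is larger."""
    female, male = toy_scores(text)
    if female == male:
        return "Tie"
    return "Female" if female > male else "Male"
```

Calling toy_conclusion on a pronoun-heavy sentence leans "Female" and on a determiner-heavy one leans "Male", which is about all a sketch this crude can show.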

(no subject)

Date: 2006-12-30 04:05 pm (UTC)
From: [identity profile] ancalagon-tb.livejournal.com
I think this definitely requires scoring a few posts and averaging the results. My last post was somewhat technical in nature, so it might look more "male" than some others.

Anyway, I am delighted that an algorithm has re-affirmed my masculinity. All my doubts and fears are gone now, I am a man - Hear me roar! :P

(no subject)

Date: 2006-12-30 04:47 pm (UTC)
From: [identity profile] d2leddy.livejournal.com
I had to check to make sure I was. Maybe we should send pics to Bar-Ilan University.

(no subject)

Date: 2006-12-31 04:33 pm (UTC)
From: [identity profile] dracodraconis.livejournal.com
I'm sure the researchers at Bar-Ilan would love to receive photos of you proving your gender.

(no subject)

Date: 2006-12-31 11:53 am (UTC)
From: [identity profile] dracodraconis.livejournal.com
You might also want to examine this over document type, since the weightings for each speech category change with this variable. Try a fiction and a non-fiction piece, then a blog entry. I'd use a univariate ANOVA and within-element repetition of at least 30 before comparison with a Tukey test at 5% for statistical validity, but that would be taking the test FAR too seriously.
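
For anyone who does want to take it that seriously, here is a minimal sketch of the ANOVA half of that comparison in plain Python. It only computes the one-way F statistic; the Tukey follow-up test is not implemented. The function name and the sample numbers are placeholders I made up, not real Gender Genie output, and the groups hold 3 values each instead of the 30 suggested above to keep it short.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA over a list of score lists."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups (document types)
    n = len(all_values)    # total number of observations
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up placeholder values, one list per document type (NOT real results)
example_groups = {
    "fiction": [1.0, 2.0, 3.0],
    "non-fiction": [2.0, 3.0, 4.0],
    "blog": [3.0, 4.0, 5.0],
}
f_stat = one_way_anova_f(list(example_groups.values()))
```

The F statistic would still have to be compared against an F distribution with (k - 1, n - k) degrees of freedom to get a p-value, which this sketch leaves out.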

(no subject)

Date: 2006-12-30 04:22 pm (UTC)
From: [identity profile] wonderleafy.livejournal.com
Neat!

While the algorithm got me wrong, I do think I tend to write more like a guy--in fact, I strive for it. Masculine writing is more commercially accessible, I've noticed. Or, at least, more men seem to make more money off their writing, and reach larger audiences. The big difference I've noticed is that women focus a lot more on what's personal and emotional to them. If my sentences begin too often with the word "I," I become agitated. Why would Joe Random care about "I"?

The NYT article mentions that particular difference, I see now. I wonder what else the algorithm measured. Happen to be a Nature subscriber?

(no subject)

Date: 2006-12-31 12:02 pm (UTC)
From: [identity profile] dracodraconis.livejournal.com
Not being a NYT subscriber, I didn't even have that article to read, just the original paper that I'm only now getting a chance to go through. Unfortunately, I also don't (yet) have a Nature subscription. I generally go to CISTI (National science library, situated next to the building where I work) when I want to find things like this.

Apparently the use of personal pronouns is a strong indicator of a female writer. My wife's ([livejournal.com profile] ms_danson) writing also showed up as male. The difference was also mentioned by the authors in the original paper (see an earlier response to this post).

(no subject)

Date: 2006-12-30 04:51 pm (UTC)
From: [identity profile] samhaine.livejournal.com
Apparently, when I write big blocks of text (my RPGnet columns), my writing is overwhelmingly male. When I write short blocks and lists of text (bullet points and similar), I score female just as often as male.

Weird.

(no subject)

Date: 2006-12-31 12:03 pm (UTC)
From: [identity profile] dracodraconis.livejournal.com
Your bullet points are trans-gender?

(no subject)

Date: 2006-12-31 05:08 pm (UTC)
From: [identity profile] ironphoenix.livejournal.com
Ratio of female:male is interesting. d2leddy, ms_danson and I all have ratios between 0.9 and 1.1, which suggests a fairly neutral (neuter?) writing style, if this "algorithm" (really a heuristic) is to be believed. Wonderleafy's result is interesting, especially in light of her comments above.
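
A quick sketch of that ratio check, using the scores listed in the post. The 0.9 to 1.1 band is the one mentioned above; the variable and function names are just mine.

```python
# (female_score, male_score) pairs as listed in the post
scores = {
    "ancalagon_tb": (675, 1089),
    "wonderleafy": (611, 913),
    "d2leddy": (792, 741),
    "_luaineach": (747, 456),
    "ironphoenix": (752, 704),
    "ms_danson": (1892, 1908),
}


def female_male_ratio(female, male):
    """Return the female:male score ratio."""
    return female / male


ratios = {name: female_male_ratio(f, m) for name, (f, m) in scores.items()}

# Authors whose ratio falls in the "fairly neutral" band
near_neutral = sorted(name for name, r in ratios.items() if 0.9 <= r <= 1.1)

for name, r in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {r:.2f}")
```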
