On Blogspot's link creation algorithm

Blogspot uses an automated algorithm based on the title to come up with direct links to posts.  The algorithm seems to remove common words like "the" and "a" so that the link to my last post (title:  Blog like a guy: use "the" and avoid "and") is:

http://not-that-sane.blogspot.com/2007/04/blog-like-guy-use-and-avoid-and.html

The algorithm didn't recognize that in this case, the word "the" is important.  Since this is sort of my line of work, I started thinking about how to improve Blogspot's algorithm -- a rule like "if a word is enclosed in quotes, it should always be included" should do it.  Am I overthinking this -- does anyone really care about what the link is?
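Here is a rough sketch of what I mean. To be clear, this is my guess, not Blogspot's actual code: the stop-word list (`STOP_WORDS`), the function name `slugify` and the `keep_quoted` switch are all made up for illustration. The guess does reproduce the link above, and turning on the quoted-word rule keeps the "the".

```python
import re

# Assumed stop-word list; Blogspot's real list is unknown to me.
STOP_WORDS = {"a", "an", "the"}


def slugify(title, keep_quoted=False):
    """Build a hyphenated link slug from a post title (hypothetical helper)."""
    words = []
    # Each match is either a double-quoted phrase (first group)
    # or a bare word (second group).
    for quoted, bare in re.findall(r'"([^"]*)"|([A-Za-z0-9]+)', title):
        if quoted:
            for w in re.findall(r"[A-Za-z0-9]+", quoted.lower()):
                # Proposed rule: a quoted word is always included.
                if keep_quoted or w not in STOP_WORDS:
                    words.append(w)
        elif bare:
            w = bare.lower()
            if w not in STOP_WORDS:
                words.append(w)
    return "-".join(words)


title = 'Blog like a guy: use "the" and avoid "and"'
print(slugify(title))                    # stop words dropped everywhere
print(slugify(title, keep_quoted=True))  # quoted words survive
```

The first call drops "a" and "the" wherever they occur, which is what Blogspot appears to do. The second keeps anything inside double quotes, so the quoted "the" makes it into the link.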

Blog Like A Guy -- use "The" and avoid "And"

Based on an algorithm developed by three Israeli scientists and an American, there is a test out on the web where you can supply a blog entry (or other types of writing) and have it analyzed for whether the author is male or female. I supplied the longest blog entry I have so far and the test returned 5:2 odds that I am male.

What was interesting to me was that the test was not based on content. Instead, it was simply based on word frequency. Apparently "the" is indicative of a male author while the word "and" is indicative of a female author. I suppose that goes to well-known stereotypes: egotistic male authors (making assertions with definite articles) and verbose female authors (can't resist getting one more thing in). The word pair "with" and "around" (female and male respectively) is similarly suggestive of female-male stereotypes: social skills vs. spatial reasoning.

Incidentally, my blog entry used 26 "the"'s while using only 10 "and"'s -- this disparity pretty much explained the 5:2 odds of my being male.
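For the curious, the arithmetic is easy to check. The sketch below is only my back-of-the-envelope version, not the real test, which presumably weighs many more words than these two; the names `marker_counts` and `naive_odds` are invented for this example.

```python
import re
from collections import Counter


def marker_counts(text, male_word="the", female_word="and"):
    """Count how often the two marker words occur in a text."""
    # Lowercase and split on anything that is not a letter,
    # so a quoted "the"'s still counts as one "the".
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return counts[male_word], counts[female_word]


def naive_odds(male_count, female_count):
    """Ratio of male-marker to female-marker counts (assumed scoring)."""
    return male_count / female_count


print(marker_counts("the cat and the dog saw the bird"))
print(naive_odds(26, 10))  # the counts from my blog entry
print(naive_odds(5, 2))    # the odds the test reported
```

The ratio of my counts, 26 to 10, comes to 2.6, which is close to the 5:2 (that is, 2.5) the test gave me.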

A childhood memory

I flew into Washington DC this morning and ended up with a very talkative Somali driver. Somewhere in our conversation, I mentioned that I'd grown up in Liberia.

In the African head-shaking sing-song voice familiar to me from childhood, he replied "typpp-ick-al Indian people ... you everywhere".

Use Less Stuff

The Oklahoma Sustainability Network is holding its annual conference on the ground floor of our building.

One of the booths has a big sign on it that reads "Use less stuff".

And what's in that booth? A bunch of cards and buttons with the slogan "Use less stuff" on them. The kind of stuff that people simply throw away.

Does the irony escape them?

Patents, order and fairness

Who do you think should own a patent: the person who first made the invention or the person who got to the patent office first? In most of the world, it is the company/person who files first that owns the patent. In the US, inventors can prevent a patent from being issued to the first filer if they can show they made the discovery first. This fairness doctrine would go away if a recently proposed bill in the Senate passes.

Why do foreign governments choose "first-filer" rather than "true-inventor"? And why are companies in favor of this? Because using "first-filer" makes the patent system easier to enforce (for governments) and more predictable (for corporations). But it doesn't pass the smell test. It's the kind of choice that a system that favors governance over fairness would make. As in the Russian speaker who sniffed about the Virginia Tech massacre:
"This is a tragedy and we express our condolences [but] the situation where a country dictates rules of behavior to other countries, but cannot keep its own people in order, does raise questions."
Keeping people in order, indeed. Talk about preferring governance to fairness -- a system that prefers order would have thrown Cho Seung Hui in a mental institution. A fair system, which is where I'd rather live, worries about potentially locking up hundreds of innocents.

And besides, I'm not sure how much we should look at other countries for their defense of intellectual property. Microsoft is happy that only 30% of new PCs carry pirated operating systems in China (down from 90%). But a recent report suggests that only 244 genuine copies of Vista have been sold in all of China so far. How to reconcile the two? It's possible that the non-pirated operating systems are mostly Linux, which folks wipe out and replace with pirated copies of Vista.

Talking of China, have you heard about the Chinese companies that make copies of GM and Volkswagen cars and sell them for less (The Economist estimates 50% less)? Since the real Volkswagen and GM build in China in factories that have huge economies of scale, how can a Chinese company make a copy of the car for less than it takes Volkswagen and GM to build the car in China? Who's providing the subsidy, and what's the motive?

Cry wolf

Of course it had to happen.

This morning, our building (a combined US government/university facility) was locked down briefly. Reason? Someone called in that there was a suspicious fellow on the university lawn carrying what appeared to be a weapon. A few minutes later came another email lifting the lockdown. Appears that the "weapon" may have been a yoga mat. Who would carry a yoga mat in Oklahoma in broad daylight?

But this just points to the raw deal that the Virginia Tech campus police and president are getting. The simplest explanation for a young couple found dead in a dorm room is a domestic squabble, not an out-of-control gunman. It's natural to want to have someone (other than the shooter) to blame, but Virginia Tech and OU are of similar sizes. They are 25,000-people organizations spread over thousands of acres and hundreds of buildings, with no real entrances or exits -- as big and indefensible as a small town, in other words. Surely, no one closes a town down every time there's a murder ...

Almost had me ...

Quick: who works longer overall? Men or women?

If you add up the time spent working for money ("market work") and the time spent at home doing work that needs to be done (cooking, washing, etc.), it turns out that men and women spend about the same amount of time working ... in rich countries at least.

http://www.nber.org/papers/w13000

Don't know if there was a reporting bias there (did any respondents exaggerate, or shy away from revealing housework?) -- the paper's too full of social science jargon to tell.

A's invite

A couple of years ago, A. had a barbeque party in a state park.  A's invite was surreal, with some phrases that could easily be misinterpreted.

D. couldn't find the party, so he asked a parked cop for help, showing him the invite so that he could read the address.  The cop read more than just the address.  He read the whole invite. "Follow me," he said, and so it was that D. brought the police to A's party.

So, yesterday, A. sends out an invite to a barbeque. Except that it's at his home this year.  "New home," he says, although they've lived there a year now. Perhaps with cops firmly in mind, one line of the invite reads "If you're under 21, no beer for you sucka!"

And the funniest thing about all this?  About 10 minutes after A. sends out the flyer, he sends out another email.  Apparently his wife had reminded him that he'd got the address wrong.  Not just a typo -- his invite sported a 4-digit number completely different from their actual 3-digit one.

Any doubts about A's profession?

On the beauty of equations

It's not often that you see a newspaper article about mathematicians but the Swiss embassy is doing stuff around Euler's 300th birthday. So, of course the Washington Post would notice ...

The statement in the article about a poll of mathematicians (math enthusiasts?) that named e^(i*pi) + 1 = 0 the "most beautiful equation" caught me by surprise.

I much prefer the more general form: e^(i*theta) = cos(theta) + i*sin(theta) because it neatly ties together geometry and calculus.
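In case the connection between the two isn't obvious, the famous equation is just the general form with theta set to pi. Written out (using only the standard values cos(pi) = -1 and sin(pi) = 0):

```latex
\begin{align*}
e^{i\theta} &= \cos\theta + i\sin\theta \\
e^{i\pi} &= \cos\pi + i\sin\pi = -1 + i \cdot 0 = -1 \\
e^{i\pi} + 1 &= 0
\end{align*}
```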

So why is the simpler form considered so beautiful? Apparently because it contains five significant numbers: e, pi, i, 0 and 1 and three significant operations: addition, multiplication and exponentiation.

Looks like my view of beauty is closely tied to function. The majority of people polled seem to have plumped for sheer quantity.

Friends' blogs

I'm adding links to friends' blogs (see right). Do visit -- even at 3 links, they're already quite diverse.

And if you have a blog, let me know.

Naxalite Ganga


One of my B.Tech hostel mates emailed out a scanned photograph from those days. Except for the fellow who is now in government, no one even cracks a smile. That, and the war paint on a few faces, reminded me of police photographs of Naxalites that periodically appear in Indian newspapers. (Naxalites are part of a violent Maoist insurgency fighting for land redistribution, mainly in tribal areas.)

I went to Google Images and did a search for "naxalite gang" and found this article and the accompanying photograph.

The Finder-Seeker Tornado

It's spring in Oklahoma -- last week, there was a tornado in the metro area -- and my son's preschool has been doing tornado safety drills.

"When a tornado comes," he lectured to me from the backseat, "we need to go to a room with no windows."

"That's right," I said.

"Our laundry room ... or the big closet," he continued.

"Umm, hmm."

"That's so when the tornado comes, it can't see us."


Easy!

I was at a corporate shindig a few weeks ago. Apparently, the Staples "Easy" button is big in that world. Speaker after speaker wanted a system where their users could push the "Easy" button to get something done automatically (and faster).

How much easier is it to publish an article to a blog now? Well, as easy as typing up an email. I'm emailing this to the blog. Would have been nice to have a decade ago, eh?  Pass the Easy button!


A few years later

As some of you know, I was writing a weekly blog before the word "blog" even existed. I kept it up for nearly 3 years:

http://www.geocities.com/Tokyo/Bridge/1771/Sane/archive.html

but gave it up because it had simply gotten too hard to keep up with. But ... I've missed the communication, from friends and strangers, that the "blog" brought about. The software to write, upload and get feedback has gotten easier in the meantime, so I thought I'd try again.

I'm keeping the title I came up with 10 years ago, although reading some of the older columns, the voice is unrecognizable. Everyone's writing style, I'm sure, changes over the years. Mine certainly has.

UPDATE: What's this blog about anyway, and what's with the title? It's a scientist's take on an irrational world: irrationality and innumeracy in politics, history, science or day-to-day life.

p.s. Yes, I'm as surprised as you are that GeoCities is still around, let alone still hosting all these pages.

p.s.2 [Oct 2009]: GeoCities is no more, but my posts have been archived on the Internet Archive's Wayback Machine.