Wednesday, March 28, 2007

Did you mean: correctly-spelled words?

I think one of the most fascinating things I've worked on since I started programming for Tribune Interactive is a search feature that most people take for granted nowadays. You might notice it when you type something into Google and get no results. Before, you would just have to scratch your head, wondering why you couldn't find any webpages. But now, search engines are so smart, they'll tell you what you actually meant to search for!

It's actually not as difficult as it sounds. For our purposes, there's a great tool out there called Aspell that will look words up in the dictionary and come back with a list of spelling suggestions. Since we're developing in Rails, I use the Raspell gem. You still have to have Aspell installed, since the gem is nothing more than hooks to the C code, and you have to install your dictionaries if you want it to work. You could also create your own dictionary if you need to check against proper names that aren't in the standard one. I'll get to that later maybe...probably not, though.

There's a surprising amount of depth to figuring out what people want to search for. Search engines are obviously big business, so it's worth quite a bit to keep users happy. One report I read said that people misspell words in searches about 15-20% of the time. Google posted how many queries they had over a three-month period of users looking for Britney Spears, and how many of each spelling variation there were. Which raises a good point: "Britney" usually doesn't come up in the English dictionary. How, then, does Google know that "brentley spears" is actually "Britney Spears"?

The answer: I have no idea. Google's quite a bit smarter than I am. But there are some strategies to start with.

First, it's important to recognize that the words are spelled incorrectly. Two of the most common spelling errors are switching two letters around, and leaving off the last letter. The Damerau-Levenshtein distance algorithm can tell you "how far" one word is from another, that is, how many changes (adding, dropping, or replacing a letter, or swapping two neighboring letters) you have to make in one word to turn it into the second. It's a pretty nice tool. "Wierd"? How about "weird". "Theif"? How about "thief".
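To make the "how far" idea concrete, here's a minimal sketch in plain Ruby. It implements the restricted ("optimal string alignment") variant of Damerau-Levenshtein, which I'm assuming is close enough for short search terms. The method name damerau_levenshtein is my own for illustration; it isn't part of Aspell or Raspell.

```ruby
# Restricted Damerau-Levenshtein ("optimal string alignment") distance:
# counts inserted, deleted, and replaced letters, plus swaps of two
# neighboring letters. The method name is hypothetical, not a library call.
def damerau_levenshtein(a, b)
  # d[i][j] = edits needed to turn the first i letters of a
  # into the first j letters of b
  d = Array.new(a.length + 1) { |i| [i] + [0] * b.length }
  (0..b.length).each { |j| d[0][j] = j }

  (1..a.length).each do |i|
    (1..b.length).each do |j|
      cost = a[i - 1] == b[j - 1] ? 0 : 1
      d[i][j] = [
        d[i - 1][j] + 1,        # drop a letter from a
        d[i][j - 1] + 1,        # add a letter to a
        d[i - 1][j - 1] + cost  # replace (or keep) a letter
      ].min

      # two neighboring letters switched around
      if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]
        d[i][j] = [d[i][j], d[i - 2][j - 2] + 1].min
      end
    end
  end

  d[a.length][b.length]
end

puts damerau_levenshtein("wierd", "weird")  # 1
puts damerau_levenshtein("theif", "thief")  # 1
```

Both misspellings are a single letter swap away from the right word, which is why they're easy to catch.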

While that's good for catching a lot of errors, there's still more. Consider "Caesar", one of the most famous figures in history, yet a man whose name is misspelled pretty often. "Ceasar" is a pretty common one. But if we run our distance algorithm on "Ceasar", could anything else come back? How about "Cesar"?

Both have an "edit distance" of 1 from "Ceasar": swap two letters and you get "Caesar", drop one and you get "Cesar". It's tough to determine which word the user is actually looking for. If the search query is just a single word, you might simply have to go with a good guess. Maybe more people search for "Caesar" than "Cesar", so we'll choose option 1.

But if the user is looking for something more specific, we might be able to point them in the right direction. If the entry is "ceasar salad", they probably want "caesar salad", and if the entry is "ceasar chavez", they probably want "cesar chavez". Now, if you're Google or Yahoo, and you have access to trillions of search queries, it's probably easy to figure out which phrase they mean. However, I am neither, but I still catch flak when somebody says "How come my misspelling didn't return the 'did you mean' that I wanted?" While the answer is probably that my suggestion is much "closer" to what you typed than the suggestion you were hoping for, the ultimate goal is that the user is happy, so I just grumble and say that I'll get to work on fixing it.

So where do I start? My thought is to create a probability model. For example, if I look at "ceasar", and see that both "caesar" and "cesar" are good matches, and the second word is "salad", I'll assign a high priority to "caesar" being before salad, and perhaps a lesser priority for it coming after, as people like to search for things out of order. Perhaps the same if they're searching for "augustus", "brutus", "crossing the rubicon", "Little Caesar's", etc. While this is probably rather inefficient for Google, it will suit my needs fine.
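Here's a rough sketch of what that could look like in Ruby. Every number below is made up for illustration (real weights would have to come from actual search logs), and the method name best_candidate is hypothetical. To keep it short, it also ignores whether the hint word comes before or after the misspelling, so it's simpler than the before/after weighting described above.

```ruby
# Fallback guess for each candidate when no other word helps
# (the single-word case). Numbers are invented for illustration.
priors = { "caesar" => 0.6, "cesar" => 0.4 }

# How strongly a neighboring word hints at each candidate.
# Also invented; word order is ignored in this sketch.
hints = {
  "caesar" => { "salad" => 0.9, "augustus" => 0.8, "brutus" => 0.8, "rubicon" => 0.7 },
  "cesar"  => { "chavez" => 0.9 }
}

# Pick the candidate whose fallback score plus context hints is highest.
def best_candidate(candidates, other_words, priors:, hints:)
  candidates.max_by do |candidate|
    word_hints = hints.fetch(candidate, {})
    priors.fetch(candidate, 0.0) + other_words.sum { |w| word_hints.fetch(w, 0.0) }
  end
end

candidates = ["caesar", "cesar"]
puts best_candidate(candidates, ["salad"], priors: priors, hints: hints)   # caesar
puts best_candidate(candidates, ["chavez"], priors: priors, hints: hints)  # cesar
puts best_candidate(candidates, [], priors: priors, hints: hints)          # caesar
```

With no other words to go on, the fallback score decides, which is the "good guess" case from earlier.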

There's a lot more that you can do with this. If you're trying to set up a search engine on your website (and who isn't these days?), and you're worried about people leaving your site in disgust because they can't find what they're looking for, you could go get a PhD in lexical analysis. But if you're only kinda worried, or curious as to how Google knows what you're thinking, hopefully this will get you on your feet.

Hapi saercheeng!

Monday, March 19, 2007

A lesson in wiring money

Friday afternoon, I got a call...

Chris: Hey Alex, I'm on my way to the Mall of America.
Me: That's awful.
Chris: Well, anyway, grandpa's credit card stopped working, and I don't have my debit card...so could you wire me some money?
Me: (grumble grumble) How much?
Chris: I was thinking like, maybe, 300 dollars or something.
Me: !!!


There was more to the conversation, like me asking how he was so unprepared, but that's beside the point. And honestly, $300? I know it's the Mall of America, but there's not really a whole lot you can buy there that you can't get somewhere else (cheaper). So I tell him that I'll see what I can do. I didn't realize how dire the situation was until I got a call at 6:45 IN THE MORNING about it.

Chris: Hey, sorry to wake you.
Me: @%$##%@!!!
Chris: So, Katie's car broke down, we're stuck in this place called Albert Lea...she's taking it to a place to get it fixed...so, could you wire me that money?
Me: Chris, if you don't pay me back, I will take it out of your hide. Or your XBox 360.

You see, my brother is not the best with money. I need some assurance that I will get paid back. So, after I hung up, I decided to try Western Union's website. I put in all of my personal information, the recipient information, and hit confirm. Wells Fargo has this thing where I have to enter in my WF password right before the transaction, as a way to protect against fraud. So I did, and it says that my transaction can't be processed. I try a second time, same thing. I call Chris back and tell him that there's something wrong with the website, and that I'll wake up later and go find a place to do it physically.

So around 10:00 or so, I get another call:

Chris: So, um, I think we might be stranded here until Monday, because Katie had to get her car towed to Minneapolis. So...I think I might need a bit more money to last me until then.
Me: How much is a bit more?
Chris: Like...400 or 500?

Me: !?!?!

Honestly, $500?? I could eat sushi for every meal for a whole weekend and stay in a huge hotel room and barely reach half of that. I wasn't gonna worry about it though, as long as he was going to pay me back. So I head to the TCF bank at the Jewel nearby. I call Western Union on their neat little phone, get my confirmation number, and take it to the teller. "Hi, I'd like to send $450, please" (since the charge on that was $44), and I handed over my debit card. "Sorry, we only take cash." Now, I don't know about you, but I certainly do not walk around anywhere with $500 in my pocket, much less in Chicago. So I head over to the TCF ATM, and punch in that I want $500. "Sorry, you are over your daily limit." I try $400. The same. I finally get it to let me take $200 out, and it says "Transaction failed: please take receipt." Huh? I try again, and get the same error.

Frustrated, I leave the place, and try the sketchy-looking "Money Exchange" located across the street. I see a sign when I enter, with a Western Union logo and the phrase "We accept major debit cards!" written on the top. I walk to the teller. "Hi, I'd like to wire some money with my debit card." "We only do cash." "You...don't take any debit cards?" "We only do cash." There's an even sketchier-looking ATM there, so I try that. Not only does it not give me any cash, but the receipt printer is jammed. I decide to go home and call Wells Fargo to see if somebody's taken all of my money.

So it turns out, my mistake was made at around 7 a.m. I talk to one of their customer service people, and they inform me that a freeze has been put on my account. I was relieved, since that meant nobody else could steal my money, in theory. I ask why, and they inform me that they've decided to freeze accounts that use Western Union's online service, since there were so many fraud claims coming through. I ask to up my daily ATM limit to $500, and I thank the lady on the other line. Chris has called me about 10 times now, asking if I've sent the money yet. I don't know what his hurry is; perhaps he's on the wrong end of a drug deal and would like to keep all 10 fingers. I stop for lunch, and head to Jewel. I still had to take the money out in 200s, swallowing their $2.00 fee every time. It's now 3:45 p.m. Weary, I stumble up to the counter, do the phone thing again, and hand over $500 to the teller. I get $6 in change. As I walk out, I call my brother:

Me: Ok Chris, 9 hours later, I have sent your money. I hope you realize that I took my entire day off to do this, and you had better pay me or else I will -
Chris: Ok thanks a lot bye.

So, there are some valuable lessons to be learned here. One, if you're planning to head out on a long trip, it's wise to bring extra money. Two, if you screw up lesson one, you aren't allowed to call somebody at 7 a.m. about it unless it's an emergency. The Mall of America is not an emergency. Three, if you're going to use Western Union, don't. It's not worth the hassle.

Thursday, March 15, 2007

The best idea ever?

Alright, I've decided to start a blog, because I have such good ideas![/sarcasm] Anyway, I've had this for a bit and never wrote in it, and a friend suggested I blog something I was talking about. I'll throw in chat snippets so things don't seem completely random.

me: I challenge you to an academic decathlon
me: billy madison style
Jonathan: what's that?


I've always thought that the most noble pursuit in life was the ability to do anything, or the ability to adapt quickly. If you had a broad well of knowledge, you'd be able to pick up something in a fraction of the time that it would take if you had no idea what you were doing. There's certainly nothing wrong with specializing, but I always viewed it as the whole "giving a man a fish" vs. "teaching a man to fish" thing. Specialists can do some things very well. Sometimes I regret taking the generalist path, because I feel like I could have been a really great pianist had I stuck with it every day. I mean, I would practice three days out of the semester and be as good as a lot of people, and I felt bad that a lot of people would kill to be that good. I think a bigger problem was that I always got bored. If I knew how to pick up the piano quickly, was my time better spent learning how to become amazing at the piano, or being decent at the piano and decent at something else, maybe the guitar? I decided on the latter, with the rationalization that I could always go back to the piano if I wanted to, but I'd never be able to get anywhere with the guitar if I didn't try it first. It's like having one ability vs. the option to have two.

me: you know what would be really interesting
me: having some sort of commune of people that excelled at various things
Jonathan: and they all specialize?
me: and everybody taught everybody else their best traits
Jonathan: interesting.
me: and it would be like a commune of superhumans
Jonathan: I'd live there.


I came up with this idea completely at random. I always thought that it'd be even easier to pick up a broad skill set if I had other people helping me along. Not that I'd just want to leech off of the skills of others. I also thought I'd really enjoy teaching people what I knew as well. A lot of people have really envied my talents, and I feel bad that I don't put them to great use, so why not let other people enjoy them?

me: the teacher would probably be more involved, because they aren't trying to teach different levels all day...plus, they are getting skills out of it (which I would argue is more valuable than a salary), and there's no pressure of failure, since it's not exactly a graded effort
me: it's a lot easier for a teacher to adapt when everybody has to start from the beginning...obviously people will progress at different paces, but it is something that they can do as they feel comfortable, and not have to feel suffocated by sticking to a syllabus
Jonathan: that is true.
me: the idea is that you teach concepts over specializations, to which the individuals would be better equipped to specialize on their own
me: for example, I think learning a broad range of music fundamentals and exposing myself to a broader range of instruments and musical styles allows me to pick up about any instrument in a fraction of the time that I might learn each one on their own separately
me: I feel the same principles are applied to schools, but more time is spent on specializations, and not enough on broadening one's knowledge of the general concept
me: we expect everyone to pick up math in the same order as everyone else in the same time frame, and those who can't conform to that are relegated to the lower math tracks
Jonathan: I think you should blog that.


I think this is the main problem with schools. There's too much pressure to conform to a certain teaching standard. One of the most famous examples is the popular (if apocryphal) story that Einstein failed math. I don't have anything against my teachers. I just feel that I would have done a lot better had I gone at my own pace. I sometimes lament the fact that I was stuck in the confines of school, even though I'm certainly happy I got to make a ton of friends. I think I'd love to do some sort of group homeschooling...it works a lot like my commune idea, but it's a lot easier to expose my kids to different fields than it is my peers. Too bad I have that thing called a job.

[upon suggesting other members]
me: that would raise a tricky issue in doing this
me: making sure everybody was cool with everybody else
[...]
me: you're 2/2 on people to which I'd have a rocky time with
Jonathan: it'd be a good experience


Thinking about this idea in depth made me realize quite a bit about what kinds of people I connect with. I really don't think a lot of people agree with me, and certainly not to the point that they'd want to spend their free nights learning about Chinese or piano playing or programming or swing dancing or cooking. But it feels like there's a special brain wavelength, reserved for those people that think like I do. I feel like I can connect with them in a way that's absent from others (not to diminish my friendships with others, it's certainly not the case). In addition, I realized that in going through with my commune of philomaths, I don't know how I'd be able to deal with people who I wasn't on good terms with. It'd be hard to concentrate on so much when my mind wasn't focused, and I'd probably be in danger of losing interest.

Not to mention the other risks involved. It's a ballsy idea, indeed. But I'm still in my youth. Better to set these frameworks now, than when I'm 50. Nobody would take it seriously by then.

And how often does one come up with "maybe the best idea ever"?