Using grep to solve another Cartalk puzzler (Shallow Thoughts)

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Tue, 15 Mar 2011

Using grep to solve another Cartalk puzzler

It's another episode of "How to use Linux to figure out CarTalk puzzlers"! This time you don't even need any programming.

Last week's puzzler was A Seven-Letter Vacation Curiosity. Basically, one couple hiking in Northern California and another couple carousing in Florida both see something described by a seven-letter word containing all five vowels -- but the two things they saw were very different. What's the word?

That's an easy one to solve using basic Linux command-line skills -- assuming the word is in the standard dictionary. If it's some esoteric word, all bets are off. But let's try it and see. It's a good beginning exercise in regular expressions and how to use the command line.

There's a handy word list in /usr/share/dict/words, one word per line. Depending on what packages you have installed, you may have bigger dictionaries handy, but you can usually count on /usr/share/dict/words being there on a desktop Linux system (a minimal install may need a word-list package added first). Some older Unix systems may have it in /usr/dict/words instead.
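If you're not sure which of those two locations your system uses, a quick check along these lines should tell you. This is only a sketch: the variable names `wordlist` and `candidate` are my own, and it looks at just the two paths mentioned above.

```shell
# Start empty so a value left over in the environment can't fool us.
wordlist=""

# Try the usual Linux location first, then the older Unix one.
for candidate in /usr/share/dict/words /usr/dict/words; do
    if [ -r "$candidate" ]; then
        wordlist=$candidate
        break
    fi
done

# Print the path that was found, or a note if neither file is readable.
echo "${wordlist:-no word list found}"
```

If it prints a path, that's the file to use in the grep commands that follow.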

We need a way to choose all seven-letter words. That's easy. In a regular expression, . (a dot) matches any single character, letters included. So ....... (seven dots) matches any seven characters in a row.

(There's a more direct way to do that: the expression .\{7\} will also match 7 letters, and is really a better way. But personally, I find it harder both to remember and to type than the seven dots. Still, if you ever need to match 43 characters, or 114, it's good to know the "right" syntax.)

Fine, but if you grep ....... /usr/share/dict/words you get a list of words with seven or more letters. See why? It's because grep prints any line where it finds a match -- and a word with nine letters certainly contains seven letters within it.
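You can see both points on a tiny list without scrolling through the whole dictionary. In this sketch the four sample words are made up for illustration and stand in for /usr/share/dict/words; it runs the seven dots and the .\{7\} spelling side by side, with no anchors yet.

```shell
# Made-up sample standing in for the real word list.
sample=$(printf '%s\n' cat bicycle elephant harmony)

# Seven dots, unanchored: any line with at least seven characters matches.
dots=$(printf '%s\n' "$sample" | grep '.......')

# The repeat-count spelling of the same pattern.
counted=$(printf '%s\n' "$sample" | grep '.\{7\}')

echo "$dots"
echo "---"
echo "$counted"
```

Both commands should print the same three words, and the eight-letter elephant is among them, which is exactly the problem.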

The pattern you need to search for is '^.......$' -- the up-caret ^ matches the beginning of a line, and the dollar sign $ matches the end. Put single quotes around the pattern so the shell won't try to interpret the caret or dollar sign as special characters. (When in doubt, it's always safest to put single quotes around grep patterns.)
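Here's the anchored pattern on a small made-up list. One caveat worth knowing: a dot matches any character, not just letters, so a seven-character entry like Aaron's gets through too. The second command is a stricter variant of my own, not something the puzzler needs, that uses the POSIX class [[:alpha:]] to insist on letters only.

```shell
# Made-up sample; the last entry has an apostrophe in it.
sample=$(printf '%s\n' cat bicycle elephant harmony "Aaron's")

# Anchored: exactly seven characters of any kind, apostrophes included.
seven_chars=$(printf '%s\n' "$sample" | grep '^.......$')

# Stricter variant (an addition, not from the puzzler): seven letters only.
seven_letters=$(printf '%s\n' "$sample" | grep '^[[:alpha:]]\{7\}$')

echo "$seven_chars"
echo "---"
echo "$seven_letters"
```

The first list should include Aaron's and the second should not; cat and elephant are gone from both.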

So now we can view all seven-letter words:
grep '^.......$' /usr/share/dict/words

How do we choose only the ones that contain all the letters a e i o and u?

That's easy enough to build up with a pipeline: the pipe character | sends the output of one grep into another grep. grep '^.......$' /usr/share/dict/words | grep a sends that list of 7-letter words through a second grep, so you only see words containing an a.

Now tack a grep for each of the other letters on the end, the same way:
grep '^.......$' /usr/share/dict/words | grep a | grep e | grep i | grep o | grep u

Voilà! I won't spoil the puzzler, but there are two words that match, and one of them is obviously the answer.
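If you want to reuse the pipeline, it can be wrapped in a shell function. This is a sketch under a few assumptions: the function name all_vowel_sevens is made up, the sample list is made up so that nothing here gives away the puzzler, and I've added -i to each vowel grep so that a capitalized entry still counts. The pipeline as written above is case-sensitive, so a word starting with a capital A wouldn't count as containing an a.

```shell
# Hypothetical helper: print the seven-character lines of a word list
# that contain each of a, e, i, o and u, ignoring case.
all_vowel_sevens() {
    grep '^.......$' "$1" |
        grep -i a | grep -i e | grep -i i | grep -i o | grep -i u
}

# Made-up sample list so the sketch runs without a system dictionary.
wordlist=$(mktemp)
printf '%s\n' Aurelio bicycle education harmony > "$wordlist"

matches=$(all_vowel_sevens "$wordlist")
echo "$matches"

rm -f "$wordlist"
```

On the sample, only Aurelio survives: education has all five vowels but nine letters, and the other two are each missing a vowel. To run it for real, call all_vowel_sevens /usr/share/dict/words instead.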

The power of the Unix command line to the rescue!

[ 11:00 Mar 15, 2011 | linux/cmdline ]
