[philiptellis] /bb|[^b]{2}/
Never stop Grokking

Monday, January 24, 2005

End of line backslash on blogger

If a blogger post has a line that ends with a backslash, blogger will delete the backslash and the following newline character to merge two lines.

line1 \
line2

shows up as:

line1 line2

after posting, and in further edits.

They seem to be parsing the input as if it were a unix command line or something like that.

The solution is to put a space after the backslash.
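I don't know what Blogger actually does internally, but the symptom matches how a shell's read builtin treats input when it isn't given -r: a backslash at the end of a line is a continuation, so the backslash and the newline both vanish. A minimal sketch (merge_demo is just a name I made up for the illustration):

```shell
# Hypothetical demo: without -r, read treats a trailing backslash
# as a line continuation and joins the next line onto this one.
merge_demo() {
    printf 'line1 \\\nline2\n' | {
        read line
        printf '%s\n' "$line"
    }
}

merge_demo   # prints: line1 line2
```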

Saturday, January 15, 2005

You've got mail! - loud and clear

You've got mail, announces the cheerful voice at AOL.
Even people who don't use AOL as their ISP will have heard it, at least in advertisements and in the movie.

AOL's program doesn't tell you anything more than that though. Who's the mail from, what's it about, nothing. To do that, one needs to parse a mailbox for the sender and subject, and then use a TTS tool to say it out loud.

Today I installed festival. It's a pretty cool TTS tool — runs on various unixes, which means prolly MacOSX as well.

I played around with festival for a few minutes while additional voices downloaded, and then hacked up this:


#!/bin/sh
# Assumed lock path; the original definition of $lock is not shown.
lock=/tmp/mail-announce.lock

[ -e "$lock" ] && exit
touch "$lock"

awk ' /^From / {from_start=1;sub_start=1}
/^From:/ && from_start==1 {print; from_start=0}
/^Subject:/ && sub_start==1 {print; sub_start=0}' /var/mail/philip | \
tail -n2 | \
sed -e "1iYou've got mail" -e 's/:/ /;s/ R[eE]://g;s/$/./' | \
festival --tts

rm -f "$lock"

Attached it to my Inbox Monitor, to run every time the mailbox size increased, and now I have a (rather drab) British voice announcing my new mail, along with who it came from, and what it's about.

Yes, the script could do with improvements. I'm currently too lazy to figure out why case-insensitive matches aren't working with sed or why I can't use alternation in my regexes, but hey, it's past 2:30am.
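For what it's worth, here is a sketch of both features, assuming GNU sed. There, -E switches to extended regular expressions, where a bare | means alternation (in basic regexes a bare | is a literal character and alternation is spelled \|), and the I flag on the s command makes the match case-insensitive. I can't say whether either of those was the actual problem on the night. The function name and the fwd alternative are made up for the example:

```shell
# Assumes GNU sed. -E enables extended regexes (so | is alternation);
# the I flag makes the match case-insensitive; g repeats it per line.
strip_reply_prefixes() {
    sed -E 's/ (re|fwd?)://Ig'
}

printf 'Subject: RE: Fwd: lunch\n' | strip_reply_prefixes   # prints: Subject: lunch
```

This would stand in for the s/ R[eE]://g part of the script, and it also catches "re:" and "Fw:".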

Comments and suggestions welcome.

Oh yeah, I planned on using a single lock file across users, because:
a. The audio device would be busy anyway
b. Parsing large mailfiles takes a lot of time and is disk intensive. I don't want more than one of these to run at a time.
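One caveat with the test-then-touch lock: two copies started at the same moment can both pass the [ -e ] check before either creates the file. A common workaround, sketched below, is to use mkdir as the lock, since creating a directory either succeeds or fails in one step. The lock path and function names are assumptions for the example, not what my script uses:

```shell
# Hypothetical lock directory shared by all users; override with LOCKDIR.
lockdir="${LOCKDIR:-/tmp/mail-announce.lock.d}"

# mkdir is atomic: exactly one caller succeeds while the directory exists.
acquire_lock() {
    mkdir "$lockdir" 2>/dev/null
}

# Remove the lock; never fails, so it is safe to call from a trap.
release_lock() {
    rmdir "$lockdir" 2>/dev/null || true
}

if acquire_lock; then
    trap release_lock EXIT          # clean up when the script ends
    echo "lock acquired; announce mail here"
else
    echo "another announcer is running; skipping"
fi
```

A stale directory left behind by a crash still has to be removed by hand, same as a stale lock file.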

Festival was having trouble with Indian names, and with the names of some of the mailing lists I'm on, so I added some entries to its lexicon. Unfortunately, I couldn't figure out how to get my lexicon selected at startup. My .festivalrc did everything except select it; I think Festival selected the default lexicon after selecting mine.

The only solution was to convert the script above into one that outputs a Festival script (Scheme) rather than plain text.

This is what I came up with:


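#!/bin/sh
# Assumed lock path; the original definition of $lock is not shown.
lock=/tmp/mail-announce.lock

[ -e "$lock" ] && exit
touch "$lock"

[ -z "$msg" ] && msg="You've got mail!"

# --assign is the long form of -v in gawk.
awk --assign msg="$msg" ' /^From / {from_start=1;sub_start=1}
/^From:/ && from_start==1 {from=$0; from_start=0}
/^Subject:/ && sub_start==1 {subject=$0; sub_start=0}
END {printf("%s\n%s\n%s\n", msg, from, subject);}
' /var/mail/philip | \
sed -e '1i(lex.select "philip")
s/:/ /;
s/ R[eE]://g;
/^From/s/ </, </;
# The next two commands are a reconstruction: escape double quotes,
# then wrap each line in a SayText call.
s/"/\\"/g;
s/.*/(SayText "&")/;
' | \
festival --pipe

rm -f "$lock"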

and this is what my .festivalrc file looks like:
(lex.create 'philip)

(lex.set.phoneset 'mrpa)
(lex.set.lts.method 'oald_lts_function)
(lex.set.compile.file "/usr/local/share/festival/lib/dicts/oald/oald-0.4.out")

(lex.add.entry '("sachin" n ((( s a ) 0) (( ch i n ) 1))))
(lex.add.entry '("vinayak" n (((v ii) 0) ((n ai) 1) ((@ k) 1) )))
(lex.add.entry '("amarendra" n (((a m) 0) ((@) 0) ((r ei) 1) ((n d r @nil) )))
(lex.add.entry '("vijay" n ((( v ii ) 0) (( ch ei ) 1))))
(lex.add.entry '("ilug-bom" n (((ai ) 1) ((l @ g ) 1) ((b o m) 0) )))
(lex.add.entry '("linuxers" n (((l i) 0) ((n @ k s @ r z ) 1) )))

Interestingly, it reads out mm.ilug-bom as millimetres dot i-lug-bom.

The other changes in the script allow you to customise your lead-in message, and also ensure that From is read out before Subject.

Festival has an email mode, but modes only work when reading from a file or using the (tts 'filename mode) syntax. Since my input comes from stdin, there's no way to specify it.

Update 2:

Inspired by jace, I decided to try using procmail for this. The only change to the script is that /var/mail/philip is no longer in there. It reads from standard input. My procmail recipe looks like this:
:1 c

and I put it at the end of .procmailrc.

I haven't yet been inundated with a deluge of emails, so I don't know how it will work with bulk downloads. This of course runs after mails are sorted into folders, so only those that still make it to my inbox get reported.
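With procmail piping in one message at a time, the parsing step only has to look at a single set of headers. A rough sketch of that step on its own, reading standard input and stopping at the blank line that ends the headers; the function name and the sample message are made up for the illustration:

```shell
# Print the From: and Subject: headers of one message read from stdin.
announce_headers() {
    awk '
        /^$/ { exit }                                  # blank line: headers are over
        /^From:/    && from == ""    { from = $0 }     # keep the first From: header
        /^Subject:/ && subject == "" { subject = $0 }  # keep the first Subject: header
        END { printf("%s\n%s\n", from, subject) }      # From always before Subject
    '
}

# Hypothetical sample message; the body line must not be picked up.
printf 'From: Someone <someone@example.org>\nSubject: Re: procmail\n\nFrom: not a header\n' |
    announce_headers
```

The output would then go through the same sed and festival stages as before.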

Friday, January 14, 2005


Sigdashes are a (de facto) standard way of specifying where your mail ends and your signature starts. They're pretty cool, because smart mailers and newsreaders can do funky things when they notice sigdashes.

For example, many mail clients will strip off old signatures when replying to mails. This is a Good Thing, because, hey, just one signature per mail ya?

Many mail clients, like mutt, can display signatures in a different colour or font.
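The stripping trick is easy to sketch: find the first line that is exactly dash, dash, space, and drop it along with everything after it. This is only an illustration of the idea with a made-up function name, not how any particular mail client implements it:

```shell
# Delete everything from the first sigdashes line ("-- ") to the end.
# A line with just "--" and no trailing space does not count.
strip_signature() {
    sed '/^-- $/,$d'
}

printf 'Hello\n--\nstill body\n-- \nPhilip\n' | strip_signature
# prints:
# Hello
# --
# still body
```

A message without sigdashes passes through unchanged, which is exactly why it pays to include them.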

So, what /are/ sigdashes?

The characters "dash dash space" on a line by themselves are collectively known as sigdashes. They look like this (without the quotes):
"-- "

Configuring your mail client to use sigdashes:

Pine:
Setup | Config
- Composer Preferences | Enable Sigdashes
- Reply Preferences | Strip From sigdashes in reply

Mutt: (sigdashes on by default)
In .muttrc, add
set sig_dashes=yes

Unless it's set to "no" in /etc/Muttrc or ~/.muttrc, you do not need to do anything.

Thunderbird (via TagZilla):
In the TagZilla | Formatting screen, set Tagline Prefix to (without quotes)
"\n-- \n"

Thunderbird (no TagZilla) / Evolution / Web based mail:
Include the sigdashes line as the first line of your signature file/text.

Kmail / Outlook Express:
(No idea)

Go forth and spread the good news.