I use this setup for student classes. Since I own a domain name, every email address at that domain is mine to assign, and all of them are aliased to the one account where email for the domain is received. I tell a class of students that a listserv and website are available for them at, for example,
economics@domain.zone
and
http://domain.zone/economics/
The set of files below will work for any number of simultaneous classes or listservs. It will require a single recipe in the normal ".procmailrc" file at the account to call another procmail script, "serv-main", to complete the processes of subscription, email verification, cleanup, redistribution, and archiving. Since writing these notes there have been a few changes, not noted here.
Subdirectories are created and populated with the first email received.
"Serv-main" is one of four scripts. "Serv-main" is a procmail script; the other three are shell scripts, which are called from "serv-main". The 4 scripts are modular, so they can be used without modification for any number of differently named listservs at the same domain.
A few environmental variables are set in the ".procmailrc" file before initiating the "serv-main" script. These are (others can be added):
# variables for "serv" files
X_FROM=`formail -x "From:"`
X_SUBJECT=`formail -x "Subject:"`
TODAY=`date +%y-%m-%d`
A last environmental variable to be set is the value of $SERV; this will happen when a recipe in the ".procmailrc" file recognizes an email addressed to the listserv.
The following inclusion in the ".procmailrc" file will then start up a set of files which will handle the email addressed to 'economics' (any other email addresses could be added). This should probably be preceded by spooler delivery of email which already includes the "X-Loop:" header.
# -- deliver X-Loop mail to spooler
:0w
* X-Loop
${SPOOLER}
# -- economics listserv
:0w
* ^To:.*economics@domain.zone
* ! X-Loop
{
PW="whatever"
SERV="economics"
INCLUDERC=${HOME}/scripts/serv-main
}
Here we are using, as an example, 'economics@domain.zone' -- as if that were the name of the class I was teaching. The $SERV environmental variable will thus be set to "economics". This recipe calls for the 'inclusion' of the file "serv-main" detailed below. This recipe also requires a password of "whatever" -- see the notes below.
"serv-main" is a procmail script, located in the subdirectory "scripts/", but it could be anywhere. A transcription of "serv-main" follows (numbered notes below):
# VERBOSE can be set here (note 1)
VERBOSE=no

# A separate logfile is used (note 2)
LOGFILE=$HOME/log.${SERV}.${TODAY}

# The logfile entries separation (note 3)
LOG="
"

# set variables for "serv" files
X_FROM=`formail -x "From:"`
X_SUBJECT=`formail -x "Subject:"`
X_TO=`formail -x "To:"`

# set site variable
SITE=${HOME}/public_html/${SERV}

# make directory and files (note 3a)
:0 cw:serv-dir.lock
* ! ? test -d ${SITE}
| ${HOME}/scripts/serv-dir

# Bounce if not using password (note 3b)
:0hw:no-pw.lock
* PW ?? (.)
*$ ! ^Subject:.*$PW
| (formail -r -I "X-Loop: no pw" \
 -I "From: errors@domain.zone" \
 -I "Reply-To: help@domain.zone"; \
 echo -e "email refused, password required \n"; \
 ) | $SENDMAIL -oi -t

# Save incoming email unfiltered (note 4)
:0wc
${SERV}-asis.${TODAY}

# LOOKUP extracts the email address (note 5)
LOOKUP=`echo ${X_FROM} \
 | sed -e 's/.*<\(.*@.*\)>.*/\1/' \
 | sed -e 's/ *\(.*@.*\) .*/\1/' \
 | sed -e 's/'\"'//g' \
 | sed -e 's/'\''//g' \
 | tr A-Z a-z`

# INLIST checks to see if LOOKUP is a user (note 6)
INLIST=`cat ${HOME}/lists/${SERV} | grep -is ${LOOKUP}`

# Handle subscriptions (note 7)
:0wc
* ! INLIST ?? (.)
{
  :0w:serv-subscribe.lock
  | ${HOME}/scripts/serv-subscribe

  # Notification is returned
  :0hw:serv-help.lock
  | (formail -I "X-Loop: subscribe" \
   -I "To: ${LOOKUP}" \
   -I "Reply-To: help@domain.zone" \
   -I "From: ${SERV}@domain.zone" \
   -I "Precedence: junk"; \
   echo " ${LOOKUP} subscribed"; \
   echo " "; \
   cat ${HOME}/scripts/serv-help; \
   ) | $SENDMAIL -oi -t
}

# Refuse email if size exceeds 8K (note 8)
:0hw:toobig.lock
* > 8000
| (formail -r \
 -I "X-Loop: ${SERV} refusal" \
 -I "Reply-To: help@domain.zone" \
 -I "From: errors@domain.zone" \
 -I "Precedence: junk"; \
 echo -e "email refused, too big - ${SIZE} bytes"; \
 ) | $SENDMAIL -oi -t

# Tag-line added to each outgoing email (note 9)
TAG="
==================================
Replies to ${SERV}@domain.zone
Limit email to 8 K bytes.
Website at http://domain.zone/${SERV}
==================================
"

# clean up multipart- e2t extract (note 10)
:0 w
* ^Content-Type:.*multipart
{
  :0 fhbw:serv-e2t.lock
  | ${HOME}/scripts/serv-e2t
}

# clean up quoted/printable
:0w
* ^Content-Type:.*text/plain
* ^Content-Transfer-Encoding:.*quoted-printable
{
  :0 fbw:serv-quoted.lock
  | ${HOME}/scripts/serv-quoted
}

# clean up base64 plain text encoding (UTF-8)
:0w
* ^Content-Type:.*text/plain
* ^Content-Transfer-Encoding:.*base64
{
  :0 fbw:serv-decode64.lock
  | ${HOME}/scripts/serv-decode64
}

# html text - lynx dump - no delivery
:0 w
* ^Content-Type:.*text/html
{
  :0 fbw:serv-html.lock
  | ${HOME}/scripts/serv-html
}

# Posting script - tests and rewrites (note 10a)
:0 w
* INLIST ?? (.)
{
  # Rewrite headers of outgoing email (note 11)
  :0 fhw:rewrite.lock
  | formail \
   -I "X-Loop: ${SERV} post" \
   -I "From: ${X_FROM}" \
   -I "Reply-To: ${SERV}@domain.zone" \
   -I "To: ${SERV}@domain.zone" \
   -I "Content-Type: TEXT/PLAIN; charset=US-ASCII" \
   -I "Cc:" \
   -I "Precedence: junk";

  # Cleanup for redistribution (note 12)
  :0 fbw:serv-cleanup.lock
  | ${HOME}/scripts/serv-cleanup

  # Add tag line (note 13)
  :0 fbw
  | cat - ; echo "${TAG}"

  # Un-archived copy deliver to cat list (note 14)
  :0 cw
  ! `cat ${HOME}/lists/${SERV}`

  # Deliver very last copy to archive (note 15)
  :0bw:serv-archive.lock
  | ${HOME}/scripts/serv-archive
}

# Fallthrough to spooler (note 16)
:0
${SPOOLER}
Note 1: Verbose logging writes a lot of information to the log file. Setting VERBOSE here does not affect verbose logging of the other recipes, since that is controlled by the ".procmailrc" file.
Note 2: A separate log file is not needed, but might be of use if the normal logfile gets a lot of writes during the day. Notice that a new logfile is started each day through the use of $TODAY.
Note 3: This is just a nicety that adds a blank line between log entries. You could also write a visible marker like
-=-=-= new log entry =-=-=-
or use the available environmental variables to indicate which email is being handled, since $X_FROM and $X_SUBJECT are available at this point.
Note 3a: The website directory and the required files do not have to be created by hand. As long as the 'name' is listed in the main ".procmailrc" file, the script serv-dir will make the directory, name it, and add the base files. It is a very short script, as follows:
mkdir ${SITE}
touch ${SITE}/list
touch ${SITE}/next
echo "1" > ${SITE}/next
cp ${HOME}/scripts/serv-help ${SITE}/README.TXT
chmod -R a+rx ${SITE}
Note 3b: A password is generally not needed, but is recommended if the listserv is public. On one listserv which has generated 2500 postings over 4 years, we have had only 3 attempts at spam. The reply "password required" always shuts them up. If a password is used, set it in the ".procmailrc" recipe for that listserv. Since this is an environmental variable, it has to be reset (to a blank) in every additional listserv recipe. That will be as simple as
PW=
Note 4: Saving the unfiltered email is not needed, but strongly suggested, especially when you are dealing with users new to the listserv world. The copies are saved in a separate file for each day. This will include emails which fall through the script.
Note 5: The "From:" header might take any number of forms ...
The first two sed commands extract the bare email address; the next two remove any single or double quote marks as a matter of security; the last line changes the address to lower case.
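The extraction steps described above can be tried outside procmail with a small shell function. This is only a sketch: the function name `extract_address` and the sample addresses are made up for illustration, and the sed expressions assume the "@" is written without surrounding spaces.

```shell
#!/bin/sh
# extract_address FROM: reduce a "From:" value to a bare, lower-case address.
# Hypothetical wrapper for experimenting; not one of the original scripts.
extract_address() {
    printf '%s\n' "$1" \
    | sed -e 's/.*<\(.*@.*\)>.*/\1/' \
    | sed -e 's/ *\(.*@.*\) .*/\1/' \
    | sed -e 's/"//g' \
    | sed -e "s/'//g" \
    | tr 'A-Z' 'a-z'
}

# Try a few common header forms (sample data only)
extract_address 'Jane Doe <Jane.Doe@Example.ORG>'
extract_address '"J. Doe" <jdoe@example.org>'
extract_address 'JDoe@Example.org'
```

Feeding it the header forms that actually show up in the "asis" file is a quick way to see which ones the chain handles before trusting it with subscriptions.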
Note 6: The value of $LOOKUP is compared line by line with the values of the list of users. This list has the same name as $SERV, and in this example would be 'economics' in the subdirectory "lists/" - but could be anywhere convenient. If the email address is already included, then $INLIST takes a value, otherwise it does not. This is used for new subscriptions, and could also be used for the redistribution section.
Note: I originally used "egrep" instead of "grep". Egrep doesn't work properly.
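A plain grep treats the address as a regular expression and matches it anywhere in a line, so a partial address would also count as "in the list". If that matters, a stricter lookup could be sketched as follows. The helper name `in_list` and the sample list are assumptions for illustration; the grep options used are standard ones.

```shell
#!/bin/sh
# in_list ADDRESS LISTFILE: succeed only if ADDRESS is a whole line of LISTFILE.
# Hypothetical helper; not one of the original scripts.
in_list() {
    # -F: fixed string, not a regular expression
    # -x: the whole line must match; -i: ignore case; -q: print nothing
    grep -Fxiq -- "$1" "$2" 2>/dev/null
}

# Usage with a throwaway list file (sample data only)
list=$(mktemp)
printf '%s\n' 'ann@example.org' 'bob@example.org' > "$list"
in_list 'bob@example.org' "$list" && echo "bob is subscribed"
in_list 'ob@example.org' "$list" || echo "ob is not subscribed"
rm -f "$list"
```

In a procmail recipe the exit status of such a helper could be tested with a "?" condition instead of checking whether a variable has a value.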
Note 7: Subscriptions are handled with this recipe. Read "* ! INLIST ?? (.)" as "if $INLIST does not have an assigned value", meaning the $LOOKUP email address was not found in the list of subscribers. If $INLIST has no value, the steps needed to add a subscriber are taken. These include...
Note 8: The limit is set here at 8K, which ought to be large enough for just about anything except some bloated HTML-format email generated by Outlook. A "Reply-To:" header reads "help@domain.zone" and the "From:" header is set to "errors@domain.zone". Note that the email body will specify the size of the email which was received.
Note 9: The tag line will not be added until after "serv-cleanup" has been called. The tag line uses the equal sign as the score. This allows "serv-cleanup" to remove the tags from previous emails. The "serv-archive" script removes it again. $SERV is of course substituted with the value (name) of the current listserv.
Note 10: If you do not disallow HTML-ized email, or multipart email, then the bulk of these will pass unmolested. The archiving script, however, will have problems with HTML-ized and multipart texts, so some measures can be taken to clean up weird email.
Additionally, filtering emails will radically reduce their size and make them universally readable.
What we have in the above is four separate recipes using four additional body filters, staggered so as to catch even recursively, stupidly formatted email and reduce it to plain text. Saves a lot of grief.
I will not present the filters here, since these consist of a Perl script (serv-e2t, 4686 bytes), a Lynx dump (serv-html, 224 bytes), and two Perl MIME decoders (serv-quoted, 200 bytes, and serv-decode64, 172 bytes). If you need to know, ask me. The set has worked consistently on a dozen listservs for years.
However, because an extensive Perl library is not always available on some systems, you would be advised to visit [http://raf.org/textmail/] and check out the file 'textmail'. Textmail is larger, 49,964 bytes, and is a Perl script, but does not require any special Perl modules. It can be installed as a filter.
Note: the download is a self-contained script, but a browser might fold the long lines, so for a download use ..
lynx -source http://raf.org/textmail/textmail > /usr/local/bin/textmail
.. (or place it where you want it, like /usr/local/share, or whatever is in use on your operating system). Add permission to execute as required. Access a manpage with "textmail -m".
Note that in all cases the "Content-Type:" header has to be rewritten after application of these filters in order to keep the mail reader of the recipient from screwing up. The headers to include are
"Content-Type: TEXT/PLAIN; charset=US-ASCII"
and
"Content-Transfer-Encoding: 7bit"
Note 10a: This recipe checks (again) whether the "From:" address is in the user list for the listserv. That condition is not really needed.
Note 11: This is a "filter", that is, changes are made to the email, but it is not delivered. The changes consist of rewriting some of the headers. Notice that by inserting a "Cc:" header the original "Cc:" is overwritten. This protects against spurious deliveries.
The "From:" header takes the value of the email address of the sender; the "Reply-To:" header is of course the listserv. The "To:" header is also overwritten to remove multiple addresses. The "Content-Type:" header is discussed above, with the filters.
Note 12: Cleanup of the email is accomplished with a shell script. See "serv-cleanup", below.
Note 13: The TAG is added after a cat of the complete email body.
Note 14: The email is now delivered to the cat of the SERV list. The number of addresses is limited by the allowed size of the command line, and is probably good for about 80 or 90 addresses. After that a normal list-alias will need to be instituted. A copy is made for a secondary delivery to the archive script.
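If the list outgrows the command line before a real list-alias is set up, one stopgap is to split the address list into batches. The following is a minimal sketch; `batch_addresses` is a hypothetical helper and the batch size is an assumption that would need tuning for the system in use.

```shell
#!/bin/sh
# batch_addresses LISTFILE N: print the addresses in LISTFILE, N per line.
# Hypothetical helper; not one of the original scripts.
batch_addresses() {
    # with no command given, xargs runs echo once for every N addresses
    xargs -n "$2" < "$1"
}

# Usage with a throwaway list file (sample data only)
list=$(mktemp)
printf '%s\n' a@example.org b@example.org c@example.org > "$list"
batch_addresses "$list" 2
rm -f "$list"
```

Each output line could then serve as the recipient arguments for its own sendmail call, so no single command line has to carry the whole list.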
Note 15: See "serv-archive", below, for details.
Note 16: Everything else falls through to the spooler.
"Serv-subscribe" is a shell script which adds the bare email address from the passed $LOOKUP environmental variable to the list of users. A copy of the email could be passed along to the rest of the "serv-main" script - so that everyone gets notification of the new user. Add a "c" to the first line to accomplish that (shown in the script above).
Alternately, the first email could be suppressed, by removing the "c" from the first line of the recipe (note 7 in the main script). However, forwarding the first email to the rest of the users might start some dialog. "Serv-subscribe" goes like this...
#!/bin/sh
# script/serv-subscribe
# LOOKUP is passed as an e.v.
cp ${HOME}/lists/${SERV} ${HOME}/lists/foo
echo "${LOOKUP}" >> ${HOME}/lists/foo
cat ${HOME}/lists/foo | sort | uniq > ${HOME}/lists/${SERV}
If, as the owner, you want to be notified of new subscriptions, add the following two lines to the "serv-subscribe" script
echo -e "${SERV} new subscribe: ${X_FROM}" > ${HOME}/update
mail -s "${SERV} list update" me@localhost < ${HOME}/update
Not shown is the ability to unsubscribe.
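For completeness, an unsubscribe step could mirror "serv-subscribe". What follows is only a sketch under stated assumptions: the function name `serv_unsubscribe` is hypothetical, and how it gets triggered (for instance a recipe matching a Subject line of "unsubscribe") is left open.

```shell
#!/bin/sh
# serv_unsubscribe ADDRESS LISTFILE: remove ADDRESS from LISTFILE.
# Hypothetical counterpart to serv-subscribe; not part of the original set.
serv_unsubscribe() {
    tmp="$2.tmp"
    # keep every line that is not exactly the address (case ignored);
    # grep exits non-zero when no lines remain, which is harmless here
    grep -Fxiv -- "$1" "$2" > "$tmp"
    mv "$tmp" "$2"
}

# Usage with a throwaway list file (sample data only)
list=$(mktemp)
printf '%s\n' ann@example.org bob@example.org > "$list"
serv_unsubscribe 'bob@example.org' "$list"
cat "$list"
rm -f "$list"
```

Inside procmail the address would come from $LOOKUP and the file would be the list named after $SERV in "lists/".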
"Serv-cleanup" is a shell script which acts only as a filter. The concern here is to reduce redistributed email to a reasonable size by removing all email-quoted text, tag lines, and duplicate blank lines. Not all of these need to be removed, and some can be rewritten. If you don't care about the look of the archive, leave the left hooks in place, and watch text blow.
#!/bin/sh
# serv-cleanup -- filter only
# following establishes a new-line as an e.v. for sed
NL="
"
# email body gets filtered:
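cat - \
| sed -e '/^>/d' \
| sed -e '/--- *Original/,$d' \
| sed -e '/===/,$d' \
| sed -e '/___/,$d' \
| sed -e 's/^[[:space:]]*$//g' \
| sed -e 's/[[:cntrl:]]/ /g' \
| sed -e '/^$/{'"$NL"'N'"$NL"'/^\n$/D'"$NL"'}'
Filtering is accomplished as follows:
sed -e '/^>/d'
deletes email-quoted lines, that is, lines starting with a left hook (">").
sed -e '/--- *Original/,$d'
deletes everything from a line containing "---Original" or "--- Original" to the end of the email. (Plain sed does not treat "(...)|(...)" as an alternation, so the two spellings are matched with an optional space instead.)
sed -e '/===/,$d'
deletes everything from the first line containing "===" (the listserv tag line) to the end.
sed -e '/___/,$d'
deletes everything from the first line containing "___" to the end.
sed -e 's/^[[:space:]]*$//g'
empties lines which hold only white space.
sed -e 's/[[:cntrl:]]/ /g'
replaces each control character with a space.
sed -e '/^$/{'"$NL"'N'"$NL"'/^\n$/D'"$NL"'}'
collapses a run of blank lines into a single blank line.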
Upper ASCII (8-bit) characters can also be removed with..
| sed -e 's/[^[:graph:]]/ /g'\
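To see what the cleanup does to a message, a few of the steps can be run on a made-up body. This is a sketch only: `clean_body` is a hypothetical name, the sample text is invented, and just four of the filter steps are included to keep the example short.

```shell
#!/bin/sh
# a newline held in a variable, for use inside the sed script
NL="
"

# clean_body: read a message body on stdin, write the trimmed body to stdout.
# Hypothetical wrapper around some of the serv-cleanup steps.
clean_body() {
    sed -e '/^>/d' \
    | sed -e '/===/,$d' \
    | sed -e 's/^[[:space:]]*$//' \
    | sed -e '/^$/{'"$NL"'N'"$NL"'/^\n$/D'"$NL"'}'
}

# Sample body: two blank lines, a quoted line, and an old tag at the end
printf '%s\n' 'Thanks.' '' '' '> old quoted line' 'See you Monday.' \
    '=====' 'old tag' | clean_body
```

The quoted line and everything from the tag onward are dropped, and the doubled blank line shrinks to one.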
"Serv-archive" is a shell script. It will require a few comments. First, you need not set up a directory at your web pages. The initial email will do that, populate the directory with the needed files, and set permissions (for which, see above). Permissions for the directory are 'drwxr-xr-x'. The files need '-rw-r--r--' permission.
The students will not get a default index file to look at, but will get a directory listing (generally the default for an Apache server, but ask the admins if this is not so).
In the 'economics' directory will be a list of files..
Index of /economics
Icon Name Last modified Size Description
__________________________________________________________________________
[DIR] Parent Directory -
[DIR] bin/ 12-Jul-2003 04:29 -
[TXT] July.htm 30-Jul-2003 14:08 12K
[TXT] June.htm 30-Jun-2003 23:29 9.5K
[ ] calendar 26-Jun-2003 01:28 470
[TXT] handouts 29-Jun-2003 21:02 711
[ ] next 30-Jul-2003 14:08 3
[TXT] notes.htm 26-Jun-2003 01:28 453
[TXT] syllabus.htm 26-Jun-2003 03:23 8.9K
__________________________________________________________________________
README.TXT text follows this line....
To explain this set of files...
We will assume that the "economics" directory, as with all the other web files, is located in the "public_html" subdirectory of the user account. Here is the script which does the archiving (lettered notes below)...
#!/bin/sh
# serv-archive -- generic version for procmail use

# Set env variables (note a)
ARCHIVE=${HOME}/public_html/${SERV}/`date +%B`.htm
NL="
"

# Increment the post number (note b)
echo "`cat ${HOME}/public_html/${SERV}/next` 1 + p" | dc \
 > ${HOME}/figure
cp ${HOME}/figure ${HOME}/public_html/${SERV}/next
NEXT=`cat ${HOME}/public_html/${SERV}/next`

# Extract a name from the email address (note c)
NAME=`echo ${X_FROM} \
 | sed -e's/^<\(.*\)@.*>$/(\1)/' \
 | sed -e's/<.*@.*>//' \
 | sed -e's/^\(.*\)@.*$/(\1)/' \
 | sed -e 's/^.*@.* //' \
 | sed -e 's/(\(.*[ ]+.*\))/\1/' \
 | sed -e 's/"//g'`

# Clean up the Subject line (note d)
SUBJ=`echo ${X_SUBJECT} \
 | sed -e 's/[[:cntrl:]]/ /g' \
 | sed -e 's/[\/><]/ /g' \
 | sed -e' s/^[[:space:]]*//' \
 | sed -e' s/[[:space:]]*$//' \
 | sed -e' s/'\"'//g' \
 | sed -e's/'\''//g'`

# Htmlize the header (note e)
echo -e "\n</UL><P><A HREF=\"#${NEXT}\">[down]</A>" >> ${ARCHIVE}
echo -e "<P><HR><UL>" >>${ARCHIVE}
echo -e "<LI>received: `date`" >> ${ARCHIVE}
echo -e "<LI>${NAME} writes:" >> ${ARCHIVE}
echo -e "<LI><I>${SUBJ}</I> \n</UL><P>" >> ${ARCHIVE}

# Cleanup for archive (note f)
cat - \
 | sed -e '/===/,$d' \
 | sed -e '/___/,$d' \
 | sed -e 's/@/ at /g' \
 | sed -e 's/[[:cntrl:]]/ /g' \
 | sed -e 's/[^[:graph:]]/ /g' \
 | sed -e 's/^[[:space:]]*$//g' \
 | sed -e '/^$/{'"$NL"'N'"$NL"'/^\n$/D'"$NL"'}' \
 | sed -e 's/^$/<\/UL><P>/g' \
 | sed -e '/<P>$/{'"$NL"'N'"$NL"'s/\n//'"$NL"'}' \
 | sed -e 's/<P>[[:space:]]*"/<P><UL>"/' \
 | sed -e 's/^[[:space:]]*-/<BR> -/g' \
 | sed -e 's/http:\/\/[[:graph:]\.\/]*/<A HREF="&">[&]<\/A> /g' >> ${ARCHIVE}

# Add to existing archive web files (note g)
echo -e "</UL><P><A NAME=\"${NEXT}\"></A>" >> ${ARCHIVE}

# Make readable (note h)
chmod a+r ${ARCHIVE}
Note a: The variable $ARCHIVE is set to keep down the length of the lines of the script. $ARCHIVE is the growing archive of posts for the current month. A new $ARCHIVE is written when the month changes. $NL is a new-line used by sed.
Note b: Every archived email is numbered. This is done for record keeping and to allow jumping down from one email post to the next, and it aids searching if the posts get lengthy. Additionally, a script can be used to remove any post by its "number", since the number occurs in an HREF and a NAME anchor which enclose the post, for example...
#!/bin/sh
# remove $1 $2 (Month_file, Post_number)
cp $1 foo
sed -e '/\"#'"$2"'\"/,/\"'"$2"'\"/d' foo > $1
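On systems without dc, the post counter kept in the "next" file could be incremented with shell arithmetic instead. This is a sketch, not the method used in the archive script; `next_post_number` is a hypothetical helper and the file used here is a throwaway.

```shell
#!/bin/sh
# next_post_number FILE: add 1 to the number stored in FILE,
# write it back, and print the new value.
# Hypothetical helper; the original script uses dc and a scratch file.
next_post_number() {
    n=$(cat "$1" 2>/dev/null)
    # a missing or empty file counts as 0
    n=$(( ${n:-0} + 1 ))
    echo "$n" > "$1"
    echo "$n"
}

# Usage with a throwaway counter file
counter=$(mktemp)
echo 1 > "$counter"
next_post_number "$counter"
rm -f "$counter"
```

This also does away with the intermediate "figure" file, since the new value is written straight back.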
Note c: The $NAME variable is used in writing the header for a post in the form "${NAME} writes:... etc"
The email address "name" is either the readable name supplied with the email address of the sender or, if there is no plain name, the account name. The following will work with any form of a legitimate email address in use (tested on over 100,000 addresses). Extracting $NAME involves... (line by line):
sed -e's/^<\(.*\)@.*>$/(\1)/'
sed -e's/<.*@.*>//'
sed -e's/^\(.*\)@.*$/(\1)/'
sed -e 's/^.*@.* //'
sed -e 's/(\(.*[ ]+.*\))/\1/'
sed -e 's/"//g'
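The name extraction can be exercised on its own by wrapping the same sed chain in a function. A minimal sketch: `display_name` and the sample "From:" values are made up for illustration.

```shell
#!/bin/sh
# display_name FROM: apply the sed chain above to a "From:" value.
# Hypothetical wrapper for experimenting; not one of the original scripts.
display_name() {
    printf '%s\n' "$1" \
    | sed -e 's/^<\(.*\)@.*>$/(\1)/' \
    | sed -e 's/<.*@.*>//' \
    | sed -e 's/^\(.*\)@.*$/(\1)/' \
    | sed -e 's/^.*@.* //' \
    | sed -e 's/(\(.*[ ]+.*\))/\1/' \
    | sed -e 's/"//g'
}

# A plain name is kept; a bare address falls back to the account name
display_name '"Jane Doe" <jane@example.org>'
display_name '<jane@example.org>'
display_name 'jane@example.org'
```

The first form keeps the readable name (with a trailing space left over from the removed address), while the two bare forms fall back to the account name in parentheses.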
Note d: The Subject line is cleaned of weird characters, leading and trailing spaces, slashes and hooks, and protected against hacking. If a Subject line password is used, this would be a good place to remove it. The sed commands, line by line:
sed -e 's/[[:cntrl:]]/ /g'
sed -e 's/[\/<>]/ /g'
sed -e' s/^[[:space:]]*//' sed -e' s/[[:space:]]*$//'
sed -e' s/'\"'//g' sed -e' s/'\''//g'
Note e: Posts are separated with a score, and preceded with an HREF anchor. The text will be followed by a NAME anchor with the same $NEXT value. The header is a dot list of "when", "who", and "what".
The current local machine date and time are used, because senders are amazingly unreliable in getting the time right on their own machines. "Who" consists of a name from the "From:" header or the account name. The Subject is a cleaned up version of the original "Subject:" header.
Notice that every P tag is preceded by a /UL tag. This is because the UL tag is used to indent paragraphs, but there is no way to close the UL section.
All of the header is written to $ARCHIVE.
Note f: Quoted text, tag lines, and duplicate lines are removed. In addition the text is HTML-ized as follows..
The result of the cleanup is added to $ARCHIVE. Step by step as follows...
sed -e '/===/,$d' sed -e '/___/,$d'
sed -e 's/@/ at /g'
sed -e 's/[[:cntrl:]]/ /g' sed -e 's/[^[:graph:]]/ /g' sed -e 's/^[[:space:]]*$//g'
sed -e '/^$/{'"$NL"'N'"$NL"'/^\n$/D'"$NL"'}'
sed -e 's/^$/<\/UL><P>/g'
sed -e '/<P>$/{'"$NL"'N'"$NL"'s/\n//'"$NL"'}'
sed -e 's/<P>[[:space:]]*"/<P><UL>"/'
sed -e 's/^[[:space:]]*-/<BR> -/g'
sed -e 's/http:\/\/[[:graph:]]*/<A HREF="&">[&]<\/A> /g'
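Two of the archive steps, masking the "@" and wrapping URLs in links, can be checked in isolation. The sketch below uses a hypothetical function name, `htmlize_line`, and an invented line of text.

```shell
#!/bin/sh
# htmlize_line: read text on stdin, replace "@" with " at " and
# wrap each http:// URL in an anchor tag.
# Hypothetical wrapper around two of the sed steps listed above.
htmlize_line() {
    sed -e 's/@/ at /g' \
    | sed -e 's/http:\/\/[[:graph:]]*/<A HREF="&">[&]<\/A> /g'
}

# Sample text only
echo 'mail jane@example.org or see http://example.org/notes today' \
    | htmlize_line
```

The URL match stops at the first blank, which is why the README asks posters to end links with a blank.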
Note g: A NAME anchor is added at the end of the text, that is, it is added to $ARCHIVE.
Note h: Public read permission needs to be set on these files.
Following is a typical text...
Directions for use: send email to "economics at domain.zone"
(1) These emails will be redistributed to everyone on the list, (see the list) and are archived here.
(2) Your first email will add your address to the list.
(3) No links point to this website - it is invisible to the world. If the site is intruded on by an outsider we can add passwords, and disable the auto-subscribe feature.
- Email has to be smaller than 8K, or will be refused.
- Clean up your mail before replying, and sign your name.
- A "Reply-to:" header is added, it will respond to everyone.
- The "From:" header is retained - it is the person posting.
In order to have the emails readable on the web page, the text is HTML-ized. Since all empty space is ignored by browsers,
- Skip a line between paragraphs
- Start each item of a list with a hyphen (ascii 45) at the left
- To indent a paragraph start with a quote (ascii 34) at the left
- start links (URLs) with 'http://' (and end with a blank)
- If you have further questions send email to "help".