On 2002-08-04, 11:56:37 (+0100), Robert de Bath wrote:
> > The result is faster processing, as no content decoding/encoding takes
> > place, and no risk of content trashing, due to buggy decoders/encoders.
... and scanning with third-party virus scanners won't work
(unless they know how to decode MIME, which some do), and
disinfection by said scanners definitely won't work.
It would be possible to implement this without breaking things, if
you doubled the I/O and disk-space usage of the program - saved
the encoded attachment to disk, decoded, scanned and then
reinserted the original content if scanning found no problems. But
that wouldn't be much faster... it would be much slower in many
cases.
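To make the trade-off above concrete, here is a rough Python sketch of the "keep the original, scan a decoded copy" approach. It is not Anomy code: the function name and the `scanner` callback (anything that takes a file path and returns True when the file is clean) are assumptions made purely for illustration, and the part is assumed to be base64-encoded.

```python
import base64
import os
import tempfile
from typing import Callable, Optional


def scan_preserving_original(
    encoded_part: bytes, scanner: Callable[[str], bool]
) -> Optional[bytes]:
    """Return the untouched encoded bytes if the decoded copy scans clean.

    Returns None when the (hypothetical) scanner flags the decoded file.
    """
    # The extra work: a decoded copy is written to disk while the encoded
    # original is still being held, so I/O and space roughly double.
    decoded = base64.b64decode(encoded_part)
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(decoded)
        clean = scanner(path)
    finally:
        os.remove(path)
    # Reinsert the original bytes verbatim; nothing is re-encoded.
    return encoded_part if clean else None
```

Note that the sketch holds both the encoded and the decoded copy at once, which is exactly the doubled resource usage described above.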
Frankly, if this is the behavior you want, then you would probably
be better off using one of the other open-source mail scanners. I
don't mean this in a bad way - different programs are designed in
different ways, and choosing the program designed to support the
behavior you want makes a lot more sense than trying to redesign a
program designed to work in an entirely different way. At least
one of them has support for my HTML defanging code.
> I think this is a _very_ good idea, if Anomy isn't interested in the
> contents of an attachment it shouldn't encode/decode it. Then it would
> even be able to pass messages with unknown content types like the 'x-yenc'
> or 'x-base251' that may appear soon.
You're assuming (incorrectly) that the headers properly reflect what
the contents of an attachment are. Then what happens if someone
figures out a way to get Outlook to execute a binary disguised as
an image/jpeg .jpg attachment? Some common viruses do exactly that.
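As an illustration of why the headers alone can't be trusted, here is a small Python sketch that compares a declared content type with the first bytes of the decoded payload. It is not part of Anomy; the signature table is deliberately tiny, and the MIME label chosen for Windows executables is only an assumption for the example.

```python
from typing import Optional

# A tiny, illustrative signature table (real detectors know many more).
MAGIC = {
    b"\xff\xd8\xff": "image/jpeg",         # JPEG start-of-image marker
    b"MZ": "application/x-msdownload",     # DOS/Windows executable header
}


def sniff_type(payload: bytes) -> Optional[str]:
    """Guess a type from the leading bytes, or None if unrecognized."""
    for magic, mime in MAGIC.items():
        if payload.startswith(magic):
            return mime
    return None


def header_matches_content(declared: str, payload: bytes) -> bool:
    """True only if the sniffed type agrees with the declared header."""
    sniffed = sniff_type(payload)
    return sniffed is not None and sniffed == declared.lower()
```

A check like this only works on the decoded payload, so the attachment has to be decoded even when its declared type looks harmless.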
> It may even allow me to up the maximum size of message that I allow
> Anomy to check.
I place no limits on this myself - I spent a lot of time getting
Anomy's memory/disk usage as independent of message size as
possible so I wouldn't have to, and this work is the root of the
problems you're discussing above.
If you aren't using any virus scanners (which require a temporary
file to work with), then Anomy can scan infinitely large messages
without ever touching the disk or eating up more than a fixed
amount of memory. When using virus scanners, Anomy's disk usage
is dictated by the size of the largest attachment - and the memory
usage stays almost constant.
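The general idea of size-independent memory usage can be sketched in Python as a streaming base64 decoder that handles one line at a time. This is not how Anomy itself is written (the function name and structure are assumptions for illustration); it also assumes the input lines are of bounded length and contain no whitespace in the middle of a line.

```python
import base64
from typing import BinaryIO


def stream_decode_base64(src: BinaryIO, dst: BinaryIO) -> int:
    """Decode base64 from src to dst line by line; return bytes written.

    Memory use depends on the longest input line, not on the total size.
    """
    carry = b""  # leftover characters that don't yet form a 4-char group
    total = 0
    for line in src:
        data = carry + line.strip()
        # Base64 decodes in groups of 4 characters; keep the remainder.
        usable = len(data) - (len(data) % 4)
        chunk = base64.b64decode(data[:usable])
        dst.write(chunk)
        total += len(chunk)
        carry = data[usable:]
    if carry:
        raise ValueError("truncated base64 input")
    return total
```

Here `dst` could be a pipe, a socket, or a temporary file handed to a virus scanner; in the last case disk usage tracks the size of the attachment being decoded while memory stays roughly flat.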
This makes Anomy unique: most other scanning solutions are
heavily dependent on temporary files and require at least twice the
size of the scanned message for scratch space.
The price you pay for that scalability, though, is the inflexibility
discussed above. You have to choose between minimizing disk/memory
usage and flexibility in encoding/decoding/scanning. Can't have
both... at least not within a generic tool like Anomy.
I've done specialized sendmail-based installations of Anomy which
don't rewrite messages unless Anomy actually finds something which
needed to be changed and don't eat up much more disk space or
memory than "standard" mail delivery - but that was done by tweaking
the sendmail delivery process and writing wrappers around Anomy,
not by modifying Anomy itself.
Note: if you are directing content to Anomy via procmail then you
will want to beware of the memory usage behavior of procmail's
filter feature - I've had scanning machines max out their memory
not because of Anomy but because of procmail. So it does make
sense to limit how big a message you let procmail filter, but
that's not Anomy's fault. :)
Bjarni R. Einarsson PGP: 02764305, B7A3AB89
firstname.lastname@example.org -><- http://bre.klaki.net/
Check out my open-source email sanitizer: http://mailtools.anomy.net/
Spammers, please send plenty of email to: email@example.com