Difference between revisions of "2009 hardware purchase proposal"

From Open Bioinformatics Foundation

Revision as of 20:43, 14 December 2009

Proposal to purchase new machines for OBF

Justification for total replacement of existing servers

Total Server Refresh - Move to 100% hypervisor-based virtualization

  • Existing servers date back to 2004 and are based on 32-bit Pentium 4 processors
  • Existing servers run CentOS 4.8; the current CentOS release is 5.4
  • No spare server capacity for new service and server requests from OBF community
  • Only two people (Chris D and Jason S) have full remote control, including remote-power reboot ability
    • We cannot easily or securely provide full remote control to other volunteers with the existing infrastructure
  • We need to move to 64-bit x86_64 hardware and take advantage of CPU-level support for virtualization
  • Virtualizing our servers and services greatly reduces operational burden, makes our IT infrastructure more "portable" should future situations demand it, and solves many of our existing problems with granting high levels of remote admin access (including console and remote power control) to members of our sysadmin team
  • New 64-bit hardware with CPU-level virtualization support would allow OBF to:
    • Run many more servers and services as needed
    • Provide higher levels of performance
    • Provide higher levels of redundancy, safety and portability
    • Distribute administrative powers much more widely
    • Consume less datacenter space
    • Consume less datacenter electricity

Justification for new Email Scanning Appliance

New Purchase - Mail scanning appliance

This proposal is to solve a problem that has long been somewhat invisible to the OBF community - the extremely large amount of work required to handle the massive volume of email traffic that our lists receive, particularly in dealing with and moderating emails that are clearly spam but have gotten through our own anti-spam and anti-virus filters. These messages end up in the moderator queues of all our mailing lists (dozens per day, per email list) and represent the single largest operational and administrative burden for OBF volunteers.

The fact is that in 2009 the free and open source methods for anti-spam and anti-virus cannot keep up with the current methods used by the bad guys. The most common source of spam these days is compromised PCs that send email in small volumes, via seemingly legitimate accounts and at rotating intervals. These PC-based botnets are much harder to block via greylisting and blocklists than the previous generation of spammers, who preferred sending large volumes of email through a smaller collection of hosts.

Current OBF methods of anti-spam and anti-virus include:

  • Greylisting in effect on all inbound email from unknown senders
  • All inbound/outbound email scanned by ClamAV for viral payloads
  • Email that passes the ClamAV test gets routed through MIMEDefang for additional scrutiny
  • Email that passes ClamAV and MIMEDefang gets processed by SpamAssassin
  • A high SA score causes the email to be discarded automatically
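
The layered checks above can be sketched as a single decision function. This is only an illustration of the ordering described in the list, not the actual OBF mail configuration: the `Message` fields are stand-ins for the verdicts of the real tools, the discard threshold is a hypothetical value (the list only says "a high SA score"), and what happens to mail that fails ClamAV or MIMEDefang is an assumption.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the proposal does not state the real value.
SA_DISCARD_THRESHOLD = 10.0


@dataclass
class Message:
    sender: str
    known_sender: bool   # has this sender already passed greylisting?
    has_virus: bool      # stand-in for a ClamAV verdict
    mimedefang_ok: bool  # stand-in for the MIMEDefang checks
    sa_score: float      # stand-in for a SpamAssassin score


def filter_message(msg: Message) -> str:
    """Return the fate of one inbound message under the layered checks."""
    if not msg.known_sender:
        # Greylisting: unknown senders are asked to retry later.
        return "tempfail"
    if msg.has_virus:
        # Assumption: infected mail is rejected outright.
        return "reject-virus"
    if not msg.mimedefang_ok:
        # Assumption: mail failing MIMEDefang is rejected outright.
        return "reject-mimedefang"
    if msg.sa_score >= SA_DISCARD_THRESHOLD:
        return "discard"
    # Everything else reaches the list, or its moderator queue.
    return "deliver"
```

Any spam that retries after the greylisting delay, carries no known virus, and scores below the threshold falls through to "deliver", which is how it ends up in the moderator queues.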


Even with the above methods in place, a huge amount of spam still gets through and clogs up the moderator queues of our very active mailing lists.

Bad effects of the spam deluge:

  • Overworked volunteer list administrators
  • Legitimate emails being lost or deleted in bulk moderator cleanup attempts
  • Once-open mailing lists have had to become more closed and harder to reach
  • Overall our mailing list community is not as open/accessible as it has been in the past, due to the anti-spam defenses we've had to put in place
  • Our community is not as open as it could or should be

Commercial Solutions Exist

For 4 years, BioTeam Inc. (the company that donates datacenter space to OBF) has been routing its own corporate email through a hosted/outsourced email cleaning service operated by MailFoundry Inc. (http://www.mailfoundry.com). The hosted service charges a flat monthly fee per email address protected. Over the years BioTeam has consistently seen upwards of 90% of total email volume being spam, malware-laden or otherwise unwanted. With MailFoundry scanning, that 90% of "bad" email is caught and isolated before it reaches the mailserver. False positives do exist, but experience shows that they occur on the order of 3-4 times per year.

However, the hosted or outsourced services offered by MailFoundry and competitors like Postini, Google etc. will not work for the OBF because they typically charge fees per email address and per domain. Given that we run dozens of mailing lists via dozens of domains and that each mailing list has 6-7 related addresses ("-admin", "-help", etc.), our research has shown that we can't make use of a hosted solution that charges by domain or email address in any financially reasonable way.
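
To make the scaling problem concrete, here is a rough back-of-the-envelope sketch. The list count and the per-address fee below are hypothetical placeholders chosen only to show how the numbers multiply; they are not MailFoundry's prices or OBF's actual list count. The 7 addresses per list comes from the "6-7 related addresses" mentioned above.

```python
def protected_addresses(num_lists: int, addresses_per_list: int) -> int:
    """Every list contributes all of its helper addresses (-admin, -help, ...)."""
    return num_lists * addresses_per_list


def monthly_cost(num_lists: int, addresses_per_list: int,
                 fee_per_address: float) -> float:
    """Flat per-address pricing: cost grows with lists times addresses."""
    return protected_addresses(num_lists, addresses_per_list) * fee_per_address


# Hypothetical inputs: 24 lists ("dozens"), 7 addresses each, $2.00 per address.
addresses = protected_addresses(24, 7)
per_month = monthly_cost(24, 7, 2.00)
print(addresses)       # 168
print(per_month)       # 336.0
print(per_month * 12)  # 4032.0
```

Per-domain charges would come on top of this, and the total rises every time a new list is created, whereas an appliance is a one-time purchase regardless of how many addresses sit behind it.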

Proposed purchases

Proposals exist to purchase two main infrastructure devices:

Server Quotes

Deployment and upgrade plan

  • Decommission old machines, which helps free up donated rackspace from BioTeam
  • Set up worldwide mirrors for backup purposes (rsync scripts from the repository?)
    • Our 2 most valuable components are source code and mailing lists (archives and membership). Losing these or being down is a HUGE problem. Can we ensure these are protected and redundantly preserved?
  • How can we balance security, an all-volunteer sysadmin team, and moderate latency in response to issues (due to the all-volunteer nature)?
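
As a starting point for the mirroring question above, a mirror job could be little more than a wrapper around rsync. The sketch below only builds the command line; the source path, mirror host, and destination path are hypothetical placeholders, and nothing here reflects an existing OBF script.

```python
import shlex
from typing import List


def build_rsync_command(source_dir: str, mirror_host: str, mirror_path: str,
                        dry_run: bool = False) -> List[str]:
    """Build an rsync invocation that pushes source_dir to a remote mirror."""
    # -a preserves permissions, times and symlinks; -z compresses in transit.
    cmd = ["rsync", "-az", "--delete"]
    if dry_run:
        # Show what would change without touching the mirror.
        cmd.append("--dry-run")
    # A trailing slash makes rsync copy the directory contents, not the directory.
    src = source_dir.rstrip("/") + "/"
    cmd += [src, f"{mirror_host}:{mirror_path}"]
    return cmd


# Hypothetical paths and host, for illustration only.
cmd = build_rsync_command("/srv/lists/archives", "mirror.example.org",
                          "/backup/obf/archives", dry_run=True)
print(shlex.join(cmd))
```

Note that `--delete` also propagates deletions to the mirror, so a pure mirror does not protect against accidental removal at the source; a real backup plan would likely drop that flag or keep dated snapshots as well.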

Community requests

  • Latest and greatest source code tools - Git/Mercurial; previously Trac was also requested.
    • This has been problematic to support because we currently don't allow HTTP access to the dev machine
    • Can we set up NFS + httpd on a separate machine with a mirrored FS (read-only), NFS (read-write), or some other system?
  • How do we keep wikis up to date with software - better wikifarm support?