Email Archiving with MailArchiva

By Admin · Mar 15, 2026 · Updated Apr 23, 2026

Email archiving supports compliance, legal discovery, and the preservation of organizational knowledge. MailArchiva is a dedicated email archiving solution that captures, indexes, and stores all email passing through your mail server. This guide covers installation, a Postfix-only alternative, integration with Postfix through journaling or a milter, search and retrieval, retention policies, storage planning, and legal hold.

Why Archive Email?

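  • Compliance: regulations such as HIPAA, SOX, and SEC Rule 17a-4 impose record retention requirements that commonly extend to email; GDPR, by contrast, limits how long personal data may be kept, so retention periods need to be defined and justified
  • Legal discovery: search and export relevant email quickly during litigation or investigations
  • Business continuity: recover accidentally deleted emails
  • Knowledge management: searchable organizational memory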

MailArchiva Installation

# Download MailArchiva
wget https://www.mailarchiva.com/downloads/mailarchiva-server-latest.deb

# Install
sudo dpkg -i mailarchiva-server-latest.deb
sudo apt-get install -f  # Fix dependencies if needed

# Start the service
sudo systemctl enable --now mailarchiva

# Access web interface
# https://your-server:8443
# Default credentials: admin / admin (change these immediately after first login)

Alternative: Open-Source Archiving with Postfix

For a free alternative, you can use Postfix's always_bcc feature combined with a dedicated archive mailbox:

# /etc/postfix/main.cf
always_bcc = archive@example.com

# This sends a copy of every email (inbound and outbound) to the archive address
# Combine with a search tool like Apache Solr or Elasticsearch for indexing

Building a Custom Archive Pipeline

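#!/usr/bin/env python3
# archive-pipe.py: Postfix content_filter (pipe transport) for archiving
import sys
import email
import json
import hashlib
import subprocess
from datetime import datetime, timezone
from pathlib import Path

EX_TEMPFAIL = 75  # sysexits.h: tells Postfix to defer the message and retry later

# Read email from stdin
raw = sys.stdin.buffer.read()
msg = email.message_from_bytes(raw)
now = datetime.now(timezone.utc)  # one UTC timestamp for both metadata and directory


def header(name):
    # Missing headers become "" and Header objects become str,
    # which keeps the metadata JSON-serializable
    value = msg.get(name)
    return "" if value is None else str(value)


# Extract metadata
metadata = {
    "message_id": header("Message-ID"),
    "from": header("From"),
    "to": header("To"),
    "cc": header("Cc"),
    "subject": header("Subject"),
    "date": header("Date"),
    "archived_at": now.isoformat(),
    "size": len(raw),
    "hash": hashlib.sha256(raw).hexdigest(),
}

# Store the raw email; on failure, defer instead of delivering unarchived mail
try:
    archive_dir = Path("/archive/mail") / now.strftime("%Y/%m/%d")
    archive_dir.mkdir(parents=True, exist_ok=True)
    (archive_dir / f"{metadata['hash']}.eml").write_bytes(raw)
    (archive_dir / f"{metadata['hash']}.json").write_text(json.dumps(metadata, indent=2))
except OSError as exc:
    print(f"archive write failed: {exc}", file=sys.stderr)
    sys.exit(EX_TEMPFAIL)

# Re-inject into Postfix for delivery.
# The sender and recipient arguments are passed in from the master.cf pipe entry.
result = subprocess.run(["/usr/sbin/sendmail", "-G", "-i"] + sys.argv[1:], input=raw)
sys.exit(0 if result.returncode == 0 else EX_TEMPFAIL)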
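
Because the script above names each stored file after the SHA-256 digest of the raw message, the archive can later be checked for corruption or tampering by recomputing the digests. The following is a minimal sketch under that assumption; `verify_archive` is an illustrative helper name, not part of Postfix or MailArchiva.

```python
import hashlib
from pathlib import Path


def verify_archive(root):
    """Return the .eml files under root whose content no longer matches their name.

    Assumes the layout written by archive-pipe.py: <root>/YYYY/MM/DD/<sha256>.eml
    """
    corrupted = []
    for eml in sorted(Path(root).rglob("*.eml")):
        digest = hashlib.sha256(eml.read_bytes()).hexdigest()
        if digest != eml.stem:
            corrupted.append(eml)
    return corrupted
```

Calling `verify_archive("/archive/mail")` from a scheduled job and alerting on a non-empty result would be one way to use it.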

MailArchiva Configuration

Journal-Based Archiving (Recommended)

Configure your mail server to journal (BCC) all email to MailArchiva:

# Postfix: journal all mail to MailArchiva
# /etc/postfix/main.cf
always_bcc = journal@archive.example.com

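# Configure MailArchiva to receive on a dedicated port
# In MailArchiva admin → Archive → SMTP Listener
# Set port: 2525
# Set allowed hosts: your-mail-server-ip

# Postfix delivers to port 25 by default, so route the journal domain to the listener.
# Assuming MailArchiva runs on archive.example.com:
# /etc/postfix/transport (then run: postmap /etc/postfix/transport)
#   archive.example.com    smtp:[archive.example.com]:2525
# /etc/postfix/main.cf
#   transport_maps = hash:/etc/postfix/transport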

Milter-Based Archiving

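# MailArchiva can act as a milter
# /etc/postfix/main.cf
# MailArchiva milter port (main.cf does not allow a trailing comment on a setting line)
smtpd_milters = inet:localhost:8891
# accept = keep delivering mail if the milter is unreachable; such mail is not archived
milter_default_action = accept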

Search and Retrieval

MailArchiva provides full-text search across all archived emails:

  • Search operators: from:, to:, subject:, body:, date:, has:attachment
  • Boolean operators: AND, OR, NOT, parentheses for grouping
  • Date ranges: date:[2025-01-01 TO 2025-03-15]
  • Wildcard search: invoice* matches invoice, invoices, invoicing

# Example searches
from:ceo@company.com AND subject:confidential
to:finance@company.com AND has:attachment AND date:[2025-01-01 TO *]
(from:vendor1.com OR from:vendor2.com) AND body:"purchase order"
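
When the same kinds of searches are run repeatedly, the query strings can be assembled programmatically. The sketch below only builds a string from the operators listed above. It does not call any MailArchiva API, and `build_query` and its parameters are hypothetical names chosen for illustration.

```python
def build_query(sender=None, recipient=None, subject=None, body=None,
                has_attachment=False, date_from=None, date_to=None):
    """Join the supplied criteria with AND, using the operators listed above."""
    def quote(value):
        # Quote multi-word values so they are searched as one phrase
        return f'"{value}"' if " " in value else value

    terms = []
    if sender:
        terms.append(f"from:{quote(sender)}")
    if recipient:
        terms.append(f"to:{quote(recipient)}")
    if subject:
        terms.append(f"subject:{quote(subject)}")
    if body:
        terms.append(f"body:{quote(body)}")
    if has_attachment:
        terms.append("has:attachment")
    if date_from or date_to:
        # An open end of the range is written as *
        terms.append(f"date:[{date_from or '*'} TO {date_to or '*'}]")
    return " AND ".join(terms)
```

For example, `build_query(recipient="finance@company.com", has_attachment=True, date_from="2025-01-01")` reproduces the second example search above.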

Retention Policies

# Configure in MailArchiva admin → Policies → Retention
# Example policies:
# - General email: retain 7 years
# - Financial email: retain 10 years
# - Legal hold: retain indefinitely
# - Internal newsletters: retain 1 year

# Rules can be based on:
# - Sender/recipient domains
# - Subject line patterns
# - Date ranges
# - Custom headers
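
To make the effect of such policies concrete, the sketch below computes when a message would become eligible for deletion under the example retention periods above. The category labels and the `expiry_date` helper are assumptions made for illustration. MailArchiva evaluates its own policies internally, and this is not its implementation.

```python
from datetime import date

# Retention periods in years, taken from the example policies above.
# None means "retain indefinitely" (legal hold).
RETENTION_YEARS = {
    "general": 7,
    "financial": 10,
    "legal_hold": None,
    "newsletter": 1,
}


def expiry_date(received, category):
    """Return the date a message becomes eligible for deletion, or None if never."""
    years = RETENTION_YEARS[category]
    if years is None:
        return None
    try:
        return received.replace(year=received.year + years)
    except ValueError:
        # Feb 29 received date with a non-leap target year: fall back to Feb 28
        return received.replace(year=received.year + years, day=28)
```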

Storage Management

# MailArchiva stores emails in volumes
# Each volume is a directory containing indexed email data

# Estimate storage needs:
# Average email size: 75KB
# 100 users × 50 emails/day × 75KB = 375MB/day ≈ 137GB/year

# Use tiered storage:
# - Hot storage (SSD): recent 6 months
# - Cold storage (HDD/S3): older archives
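
The estimate above can be turned into a small calculator for other user counts and volumes. This is a rough sketch using decimal units (1 MB = 1000 KB). It ignores index overhead, compression, and deduplication, all of which change the real figure, and the function name `estimate_storage` is illustrative.

```python
def estimate_storage(users, emails_per_user_per_day, avg_email_kb, buffer=0.0):
    """Return (MB per day, GB per year), each scaled by an optional safety buffer."""
    mb_per_day = users * emails_per_user_per_day * avg_email_kb / 1000
    gb_per_year = mb_per_day * 365 / 1000
    return mb_per_day * (1 + buffer), gb_per_year * (1 + buffer)
```

`estimate_storage(100, 50, 75)` gives the 375 MB/day and roughly 137 GB/year quoted above; passing `buffer=0.2` adds a 20% capacity margin.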

Legal Hold and Export

# Legal hold: prevent deletion of emails matching criteria
# MailArchiva admin → Legal Hold → Create Hold
# Define scope: custodians, date range, keywords
# Held emails are protected from retention policy deletion

# Export for legal discovery:
# 1. Run search with relevant criteria
# 2. Select results and choose export format
# 3. Formats: PST, EML, PDF, MBOX
# 4. Export includes metadata, headers, and attachments
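
MailArchiva exports are driven from its admin interface as described above. For the custom file-based archive built earlier in this guide, an MBOX export can be approximated with Python's standard `mailbox` module. This is a sketch under that assumption, and `export_to_mbox` is a hypothetical helper name.

```python
import mailbox
from pathlib import Path


def export_to_mbox(eml_paths, mbox_path):
    """Append each .eml file to an mbox file and return the number exported."""
    box = mailbox.mbox(str(mbox_path))  # created if it does not exist
    count = 0
    try:
        for path in eml_paths:
            box.add(Path(path).read_bytes())
            count += 1
    finally:
        box.close()  # close() flushes pending writes to disk
    return count
```

The list of paths would typically come from whatever search step selected the relevant messages, for example a filter over the `.json` metadata files.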

Best Practices

  • Use journal-based archiving to capture all mail without impacting mail flow
  • Store archives on separate storage from your mail server for resilience
  • Implement retention policies from day one — retroactive compliance is difficult
  • Test search and export regularly to ensure the archive is functional
  • Encrypt archive storage at rest for sensitive email data
  • Set up monitoring to alert if archiving stops or falls behind
  • Plan storage capacity based on your organization's email volume plus a 20% buffer
  • Document your archiving policy for compliance audits
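
The monitoring recommendation above can be approximated for a file-based archive, such as the custom pipeline earlier in this guide, by checking how recently a message was last written. This is a minimal sketch: `archive_is_stale` is an illustrative name, and an equivalent check for MailArchiva would depend on its own storage layout, which is not shown here.

```python
import time
from pathlib import Path


def archive_is_stale(root, max_age_seconds, now=None):
    """Return True if no .eml file under root was modified within max_age_seconds."""
    now = time.time() if now is None else now
    newest = max((p.stat().st_mtime for p in Path(root).rglob("*.eml")), default=None)
    # An archive with no messages at all is also treated as stale
    return newest is None or now - newest > max_age_seconds
```

A cron job could call `archive_is_stale("/archive/mail", 3600)` and raise an alert when it returns True; the right threshold depends on how much mail the server normally handles.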