Module 4 (M4) implements secure storage and retrieval of analyzed email data using MongoDB Atlas. It integrates with the M3 signature verification system so that stored records can be checked for integrity and authenticity.
- mongo_client.py - MongoDB connection management
  - Singleton client pattern with connection pooling
  - Secure connection handling with TLS/SSL
  - Automatic connection validation and retry logic
  - Production-grade timeout and error handling
- storage.py - Database operations layer
  - Insert verified analyses with signature checking
  - Load and verify stored analyses
  - Automatic index management (gmail_id, processed_at, risk_label)
  - Optional TTL (Time-To-Live) index for automatic cleanup
- app.py (History Tab) - User interface
  - View all stored analyses with signature verification status
  - Filter by risk level (SAFE, REVIEW, HIGH_RISK)
  - Display analysis details with visual indicators
  - Real-time signature verification
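The singleton pattern listed for mongo_client.py could look roughly like the sketch below. It is an illustration under stated assumptions, not the module's actual code: `get_client`, `close_client`, and the injectable `factory` argument are hypothetical names. In a real setup the factory would presumably be `pymongo.MongoClient`, which maintains its own connection pool.

```python
import threading
from typing import Any, Callable, Optional

_client: Optional[Any] = None      # process-wide cached client
_client_lock = threading.Lock()    # guards creation across threads


def get_client(uri: str, factory: Callable[[str], Any], force_new: bool = False) -> Any:
    """Return the cached client, creating it on first use (hypothetical helper)."""
    global _client
    if not uri.startswith(("mongodb://", "mongodb+srv://")):
        raise ValueError("Invalid MongoDB URI scheme")
    with _client_lock:
        if _client is None or force_new:
            _client = factory(uri)  # e.g. pymongo.MongoClient in a real module
        return _client


def close_client() -> None:
    """Drop the cached client, calling its close() method if it has one."""
    global _client
    with _client_lock:
        if _client is not None and hasattr(_client, "close"):
            _client.close()
        _client = None
```

Passing the factory in keeps the sketch testable without a database; repeated calls with the same process return the same object unless `force_new=True` is given.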
- MongoDB URI stored in encrypted secrets.enc file
- Loaded at runtime via M3 decryption
- Never hardcoded in source code
- Every analysis signed with HMAC-SHA256 before storage
- Signatures verified on retrieval
- Invalid signatures clearly marked in UI
- Tamper detection and alerting
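To make the sign-then-verify flow concrete, here is a minimal sketch of HMAC-SHA256 signing with Python's standard library. The real M3 signing functions are not shown in this document, so `sign_document`, `verify_document`, and the canonical JSON serialization are assumptions; only the base64 encoding is taken from the schema's signature field.

```python
import base64
import hashlib
import hmac
import json


def _canonical_bytes(doc: dict) -> bytes:
    """Serialize every field except 'signature' in a stable key order."""
    payload = {k: v for k, v in doc.items() if k != "signature"}
    # default=str turns datetimes and other non-JSON values into strings
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"), default=str)
    return text.encode("utf-8")


def sign_document(doc: dict, secret: bytes) -> str:
    """Return a base64-encoded HMAC-SHA256 over the canonical document."""
    digest = hmac.new(secret, _canonical_bytes(doc), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_document(doc: dict, secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_document(doc, secret).encode("ascii")
    stored = str(doc.get("signature", "")).encode("utf-8")
    return hmac.compare_digest(expected, stored)
```

One design point to keep in mind: MongoDB stores datetimes with millisecond precision, so a real implementation needs to normalize timestamps before signing, otherwise a document can fail verification after a round trip even though nobody tampered with it.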
- Only analyses from current app session are stored
- Prevents backdating or stale data insertion
- Uses app_start_time as session marker
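The session rule above amounts to a timestamp comparison. The sketch below assumes the check is "processed_at is not earlier than app_start_time"; `belongs_to_session` is a hypothetical helper name, and the real storage layer may apply the rule differently.

```python
from datetime import datetime, timedelta


def belongs_to_session(doc: dict, app_start_time: datetime) -> bool:
    """True if the document was processed at or after the session start."""
    processed_at = doc.get("processed_at")
    if not isinstance(processed_at, datetime):
        return False  # missing or malformed timestamps never qualify
    return processed_at >= app_start_time


app_start_time = datetime(2025, 11, 1, 10, 0, 0)
fresh = {"processed_at": app_start_time + timedelta(minutes=5)}
stale = {"processed_at": app_start_time - timedelta(days=1)}

print(belongs_to_session(fresh, app_start_time))  # True
print(belongs_to_session(stale, app_start_time))  # False
```

Both timestamps should be either naive or timezone-aware; Python raises TypeError when the two kinds are compared.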
- TLS/SSL enforced for all MongoDB connections
- Strict certificate validation
- Connection timeout protection
- Retry logic for transient failures
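Retry logic for transient failures is commonly written as a small wrapper with exponential backoff. The following is a generic sketch, not the module's implementation: `with_retry` and its parameters are hypothetical, and with PyMongo the `transient` tuple would likely hold exceptions such as `pymongo.errors.AutoReconnect`.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retry(
    operation: Callable[[], T],
    transient: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying transient errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except transient:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
    raise ValueError("attempts must be at least 1")
```

Errors outside the `transient` tuple (for example a ValueError from a malformed URI) propagate immediately, which matches the idea that configuration problems should fail fast rather than be retried.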
{
"_id": ObjectId("..."),
"gmail_id": "18ab123...", // Unique email ID
"sender": "sender@example.com",
"subject": "Email Subject",
"date": "Sat, 01 Nov 2025 10:00:00",
"risk_score": 85.5, // Final merged score (0-100)
"risk_label": "HIGH_RISK", // SAFE, REVIEW, HIGH_RISK
"ai_risk_score": 90.0, // AI component score
"heuristic_risk_score": 75.0, // Heuristic component score
"intents": ["phishing", "urgency"], // Detected intents
"indicators": [...], // Heuristic indicators
"ai_summary": "This email...", // AI-generated summary
"recommendations": [...], // Action recommendations
"processed_at": ISODate("..."), // Timestamp
"mock_mode": false, // AI mode flag
"signature": "base64encodedhmac..." // HMAC-SHA256 signature
}
- gmail_id (Unique) - Prevents duplicate entries
- processed_at (Descending) - Efficient time-based queries
- risk_label (Ascending) - Fast filtering by risk level
- processed_at TTL (Optional) - Auto-delete after 7 days
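The four indexes above can be expressed as data and applied to a collection. This is a sketch under assumptions: `build_index_specs` and `ensure_indexes` are hypothetical names, and the collection is any object exposing PyMongo's `create_index(keys, **options)` method.

```python
from typing import Any, Dict, List, Optional, Tuple

ASCENDING, DESCENDING = 1, -1  # same values as pymongo.ASCENDING / pymongo.DESCENDING

IndexSpec = Tuple[List[Tuple[str, int]], Dict[str, Any]]


def build_index_specs(ttl_days: Optional[int] = None) -> List[IndexSpec]:
    """Describe the indexes as (keys, options) pairs."""
    specs: List[IndexSpec] = [
        ([("gmail_id", ASCENDING)], {"unique": True}),
        ([("processed_at", DESCENDING)], {}),
        ([("risk_label", ASCENDING)], {}),
    ]
    if ttl_days is not None:
        # A TTL index is single-field; MongoDB deletes a document once its
        # processed_at value is older than expireAfterSeconds.
        ttl_seconds = ttl_days * 24 * 3600
        specs.append(([("processed_at", ASCENDING)], {"expireAfterSeconds": ttl_seconds}))
    return specs


def ensure_indexes(collection: Any, ttl_days: Optional[int] = None) -> None:
    """Create each index; create_index is a no-op if it already exists."""
    for keys, options in build_index_specs(ttl_days):
        collection.create_index(keys, **options)
```

With `ttl_days=7` the TTL option comes out as 604800 seconds. The TTL index uses ascending order here so that its key pattern differs from the descending query index on the same field.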
- Create a free MongoDB Atlas account at https://www.mongodb.com/cloud/atlas
- Create a new cluster (M0 Free Tier works for development)
- Configure network access:
  - Add your IP address to IP Access List
  - Or use 0.0.0.0/0 to allow all addresses (development only)
- Create database user with read/write permissions
- Get your connection string (mongodb+srv://...)
Run the secrets encryption script and add MONGO_URI:
python encrypt_secrets.py
When prompted, add:
MONGO_URI=mongodb+srv://username:password@cluster.mongodb.net/phishguard?retryWrites=true&w=majority
SIGNING_SECRET=your-signing-secret-key
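Before encrypting the connection string it can help to sanity-check its shape without echoing the password. The helper below is a hypothetical sketch built on the standard library; it is a rough check, not a full implementation of the MongoDB connection string format.

```python
from urllib.parse import parse_qs, urlsplit


def describe_mongo_uri(uri: str) -> dict:
    """Return the non-secret parts of a MongoDB URI, or raise ValueError."""
    parts = urlsplit(uri)
    if parts.scheme not in ("mongodb", "mongodb+srv"):
        raise ValueError(f"Unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URI has no host")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "database": parts.path.lstrip("/") or None,
        # Report only whether credentials exist, never their values
        "has_credentials": parts.username is not None and parts.password is not None,
        "options": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


info = describe_mongo_uri(
    "mongodb+srv://username:password@cluster.mongodb.net/phishguard?retryWrites=true&w=majority"
)
print(info["host"], info["database"], info["options"])
```

Passwords containing characters such as `@` or `:` need to be percent-encoded in the URI, otherwise the host is parsed incorrectly.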
pip install pymongo cryptography
Or from pyproject.toml:
pip install -e .
- Load Secrets (Security Tab)
  - Enter your master passphrase
  - Click "Load Secrets"
  - MongoDB integration auto-enables
- Scan Emails (Email Analysis Tab)
  - Analyses automatically saved to MongoDB
  - Signature added to each document
- View History (History Tab)
  - See all stored analyses
  - Filter by risk level
  - Check signature verification status
  - Expand for detailed view
from datetime import datetime
from getpass import getpass

from crypto_simple import load_encrypted
from signing import sign
from storage import insert_analysis, load_analyses

# Record the session start before any analysis is produced; only analyses
# from the current session are stored, so processed_at must not precede it
app_start_time = datetime.now()

# Load secrets (prompt for the master passphrase instead of hardcoding it)
passphrase = getpass('Master passphrase: ')
secrets = load_encrypted('secrets.enc', passphrase)
mongo_uri = secrets['MONGO_URI']
signing_secret = secrets['SIGNING_SECRET'].encode('utf-8')

# Build the analysis document
analysis_doc = {
    'gmail_id': '18ab123...',
    'sender': 'test@example.com',
    'subject': 'Test Email',
    'risk_score': 85,
    'risk_label': 'HIGH_RISK',
    'processed_at': datetime.now(),
    # ... other fields
}

# Sign and insert
analysis_doc['signature'] = sign(analysis_doc, signing_secret)
success = insert_analysis(
    analysis_doc,
    app_start_time=app_start_time,
    mongo_uri=mongo_uri,
    signing_secret=signing_secret
)

# Load analyses
analyses = load_analyses(
    mongo_uri=mongo_uri,
    signing_secret=signing_secret,
    filter_by={'risk_label': 'HIGH_RISK'},
    limit=50
)

# Check signatures
for analysis in analyses:
    if analysis['signature_valid']:
        print(f"✅ {analysis['subject']}")
    else:
        print(f"⚠️ {analysis['subject']} - INVALID SIGNATURE")

Get or create MongoDB client with connection pooling.
Parameters:
- mongo_uri: MongoDB Atlas connection string
- force_new: Force new connection (default: False)
Returns: MongoClient instance
Raises:
- ConnectionFailure: Connection failed
- ValueError: Invalid URI format
Close global MongoDB connection gracefully.
Test connection without caching client.
Insert signed analysis into MongoDB.
Parameters:
- analysis_doc: Analysis document with all fields
- app_start_time: Session start time for filtering
- mongo_uri: MongoDB connection URI
- signing_secret: Secret for signature verification
Returns: True if inserted, False otherwise
Load and verify analyses from MongoDB.
Parameters:
- mongo_uri: MongoDB connection URI
- signing_secret: Secret for verification
- filter_by: MongoDB query filter (optional)
- limit: Max documents to return (optional)
- skip: Skip N documents (pagination)
Returns: List of verified analysis documents with signature_valid field
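The limit and skip parameters map naturally onto page numbers. A small sketch, where `page_to_skip_limit` is a hypothetical helper and not part of storage.py:

```python
def page_to_skip_limit(page: int, page_size: int = 50) -> dict:
    """Translate a 1-based page number into skip/limit keyword arguments."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"skip": (page - 1) * page_size, "limit": page_size}


# Page 3 with 50 items per page skips the first 100 documents
print(page_to_skip_limit(3))  # {'skip': 100, 'limit': 50}

# Intended use with the loader described above:
# analyses = load_analyses(mongo_uri=mongo_uri, signing_secret=signing_secret,
#                          **page_to_skip_limit(3))
```

Skip-based paging is simple but gets slower as the offset grows, because the server still walks past the skipped documents.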
Retrieve specific analysis by Gmail ID.
Count analyses matching filter.
Delete analyses older than specified days.
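Deleting by age usually means computing a cutoff timestamp and filtering on processed_at. The sketch below only builds the filter; `older_than_filter` is a hypothetical name, and the result is the kind of query document that PyMongo's `Collection.delete_many` accepts.

```python
from datetime import datetime, timedelta


def older_than_filter(days: int, now: datetime) -> dict:
    """Build a MongoDB filter matching documents processed before the cutoff."""
    if days < 0:
        raise ValueError("days must not be negative")
    cutoff = now - timedelta(days=days)
    return {"processed_at": {"$lt": cutoff}}


# Documents processed before 1 Nov 2025 match when "now" is 8 Nov 2025
query = older_than_filter(7, now=datetime(2025, 11, 8))
print(query)
```

The `now` value should follow the same clock convention (local or UTC, naive or aware) as the stored processed_at values, otherwise the cutoff is shifted by the timezone offset.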
- Automatic retry for transient failures
- Clear error messages for configuration issues
- Graceful fallback when MongoDB unavailable
- Invalid signatures logged but not rejected
- UI clearly indicates verification status
- Tampered records marked with warnings
- Required fields checked before insertion
- Duplicate gmail_id prevented by unique index
- Type validation for datetime fields
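A pre-insert validation step along these lines might look as follows. The exact set of required fields is not stated in this document, so `REQUIRED_FIELDS` is an assumption drawn from the schema; `validation_errors` is a hypothetical helper.

```python
from datetime import datetime
from typing import List

# Assumed required fields, based on the document schema (not confirmed)
REQUIRED_FIELDS = (
    "gmail_id", "sender", "subject", "risk_score",
    "risk_label", "processed_at", "signature",
)
VALID_LABELS = {"SAFE", "REVIEW", "HIGH_RISK"}


def validation_errors(doc: dict) -> List[str]:
    """Return a list of problems; an empty list means the document passes."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in doc]
    if "processed_at" in doc and not isinstance(doc["processed_at"], datetime):
        errors.append("processed_at must be a datetime")
    if "risk_label" in doc and doc["risk_label"] not in VALID_LABELS:
        errors.append(f"unknown risk_label: {doc['risk_label']!r}")
    return errors
```

Returning all problems at once, rather than raising on the first, makes the "clear error messages" goal easier to meet when a document has several issues.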
- ✅ Always use encrypted secrets (never hardcode)
- ✅ Enable MongoDB Atlas IP allowlisting
- ✅ Rotate signing secrets periodically
- ✅ Use strong database passwords
- ✅ Enable MongoDB encryption at rest
- ✅ Use connection pooling (handled automatically)
- ✅ Limit query results with pagination
- ✅ Use indexes for filtering (auto-created)
- ✅ Consider TTL index for automatic cleanup
- ✅ Monitor connection pool metrics
- ✅ Backup MongoDB regularly via Atlas
- ✅ Monitor signature verification failures
- ✅ Set up alerts for connection issues
- ✅ Review old data retention policy
- ✅ Test restore procedures
- Check MONGO_URI in secrets.enc
- Verify network access (Atlas IP Access List)
- Confirm database user credentials
- Check firewall settings
- Ensure SIGNING_SECRET is consistent
- Check for data tampering
- Verify secret encoding (bytes vs string)
- Confirm signing process is correct
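The bytes-versus-string point is easy to reproduce with the standard library alone. This snippet is illustrative and independent of the project's signing module; the message and secrets are made-up values.

```python
import hashlib
import hmac

message = b'{"gmail_id":"18ab123"}'

# hmac.new() requires a bytes key; passing a str raises TypeError
str_key_rejected = False
try:
    hmac.new("my-secret", message, hashlib.sha256)
except TypeError:
    str_key_rejected = True
print(str_key_rejected)  # True

# The same text encoded differently yields different key bytes,
# and therefore a different signature
sig_utf8 = hmac.new("sécret".encode("utf-8"), message, hashlib.sha256).hexdigest()
sig_latin1 = hmac.new("sécret".encode("latin-1"), message, hashlib.sha256).hexdigest()
print(sig_utf8 == sig_latin1)  # False

# Stray whitespace (e.g. a trailing newline from a file) also changes the key
sig_clean = hmac.new(b"my-secret", message, hashlib.sha256).hexdigest()
sig_newline = hmac.new(b"my-secret\n", message, hashlib.sha256).hexdigest()
print(sig_clean == sig_newline)  # False
```

If verification fails for every record, a mismatch of this kind between the signing and verifying code paths is a more likely cause than tampering.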
- Verify secrets are loaded in Security tab
- Ensure emails were scanned after loading secrets
- Check MongoDB Atlas for stored documents
- Verify collection name matches
- Gmail IDs are unique by design
- Re-scanning same emails will be skipped
- Check logs for insertion attempts
- Receives parsed email data
- Uses gmail_id as unique identifier
- Stores AI risk scores and insights
- Preserves intents and recommendations
- Uses signing functions for integrity
- Loads MongoDB URI from encrypted secrets
- Verifies signatures on retrieval
- Real-time analysis streaming
- Advanced query interface
- Analysis export (CSV, JSON bulk)
- Data retention policy UI
- Analytics dashboard
- Multi-tenant support
- Encrypted fields in MongoDB
- v0.2.0 (M4) - Initial MongoDB integration
  - Secure storage with signature verification
  - History tab with filtering
  - Connection pooling and error handling
For issues or questions:
- Check MongoDB Atlas status
- Verify connection string format
- Review logs for detailed errors
- Ensure all M1-M3 modules are working correctly
Module Status: ✅ Production Ready
Last Updated: November 2, 2025
Next Module: TBD