This guide walks through setting up live reduction on a dedicated server for automated, daemon-based processing.
- Linux server with systemd (RHEL 9+, Ubuntu, etc.)
- Network access to instrument DAS
- `snsdata` user account (or equivalent service account) with access to the files used by the proc and post-proc scripts
- Sudo/admin access for installation, configuration, and interaction with the systemd services
```mermaid
flowchart TD
    Step1["Step 1: Install RPM"]
    Step2["Step 2: Set up snsdata user"]
    Step3["Step 3: Configure Mantid environment"]
    Step4["Step 4: Create directories"]
    Step5["Step 5: Configure network access"]
    Step6["Step 6: Create /etc/livereduce.conf"]
    Step7["Step 7: Write processing scripts"]
    Step8["Step 8: Test with fake server"]
    Step9["Step 9: Enable and start service"]
    Step10["Step 10: Verify operation"]
    Step1 --> Step2
    Step2 --> Step3
    Step3 --> Step4
    Step4 --> Step5
    Step5 --> Step6
    Step6 --> Step7
    Step7 --> Step8
    Step8 --> Step9
    Step9 --> Step10
```
For SNS systems using DNF:
```shell
sudo dnf install python-livereduce
```

Or manually from a built RPM:

```shell
sudo rpm -ivh python-livereduce-1.17-1.noarch.rpm
```

This installs:

- `/usr/bin/livereduce.sh` - Service wrapper script
- `/usr/bin/livereduce.py` - Main daemon
- `/usr/lib/systemd/system/livereduce.service` - Systemd unit file
- `/usr/bin/livereduce_watchdog.sh` - Watchdog service wrapper
- `/usr/lib/systemd/system/livereduce_watchdog.service` - Watchdog systemd unit file
```shell
# The RPM's %pre script will warn if this user doesn't exist
sudo useradd -r -g users -G hfiradmin snsdata
```

Create or edit `/etc/mantid.local.properties` (optional if the instrument is defined in `/etc/livereduce.conf`):

```
default.facility=SNS
default.instrument=POWGEN
```

Create `/etc/livereduce.conf` with a minimal configuration:
```json
{
    "instrument": "POWGEN",
    "CONDA_ENV": "mantid"
}
```

See Configuration Reference for all options.
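Because the file is strict JSON, it is worth sanity-checking an edit before the service picks it up. The sketch below is a hypothetical helper, not the daemon's actual loader, and the default values shown are assumptions for illustration:

```python
import json

# Illustrative defaults only; the daemon's real defaults may differ.
DEFAULTS = {"instrument": None, "CONDA_ENV": "mantid"}

def load_config(text):
    """Parse a livereduce-style config; the file must be strict JSON."""
    config = dict(DEFAULTS)
    config.update(json.loads(text))
    if not config["instrument"]:
        raise ValueError("'instrument' must be set")
    return config

minimal = '{"instrument": "POWGEN", "CONDA_ENV": "mantid"}'
print(load_config(minimal))  # {'instrument': 'POWGEN', 'CONDA_ENV': 'mantid'}
```

A malformed file (for example one containing comments, which JSON forbids) fails at `json.loads` rather than silently misconfiguring the service.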
The service uses pixi to manage the Mantid environment. Ensure pixi is installed and the environment is configured:
```shell
# Install pixi if not already installed
curl -fsSL https://pixi.sh/install.sh | bash

# The livereduce.sh script will use pixi to run Mantid
# Verify pixi is available
which pixi
```

The `CONDA_ENV` setting in `/etc/livereduce.conf` specifies which Mantid environment to use (typically `mantid` for stable or `mantid-nightly` for development).
```shell
# Default location for SNS instruments
sudo mkdir -p /SNS/POWGEN/shared/livereduce
sudo chown snsdata:users /SNS/POWGEN/shared/livereduce
sudo chmod 775 /SNS/POWGEN/shared/livereduce
```

Copy your instrument-specific scripts:

```shell
cp reduce_POWGEN_live_proc.py /SNS/POWGEN/shared/livereduce/
cp reduce_POWGEN_live_post_proc.py /SNS/POWGEN/shared/livereduce/
```

See Processing Scripts for how to write these.
```shell
sudo systemctl enable livereduce
sudo systemctl start livereduce
systemctl status livereduce
```

Verify operation by watching the log:

```shell
tail -f /var/log/SNS_applications/livereduce.log
```

Look for:

- "StartLiveData" with configuration details
- Connection messages
- Processing script detection
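As a rough programmatic check, you can scan a log excerpt for those markers. This is a hypothetical helper; the marker strings follow the checklist above, and the exact wording in your log may differ:

```python
# Marker strings follow the verification checklist; adjust to your log's wording.
EXPECTED = ("StartLiveData", "Connection", "Processing script")

def missing_markers(log_text):
    """Return the expected startup markers not present in the log text."""
    return [marker for marker in EXPECTED if marker not in log_text]

sample = "INFO StartLiveData started\nINFO Connection established\n"
print(missing_markers(sample))  # ['Processing script']
```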
The watchdog monitors the main service and restarts it if unresponsive:
```shell
sudo dnf install python-livereduce-watchdog
sudo systemctl enable livereduce_watchdog
sudo systemctl start livereduce_watchdog
```

The server must be able to connect to:
Required:
- Instrument DAS (typically `bl<N>a-dassrv1.sns.gov` at SNS)
- Shared file systems (for writing output files)
Optional (depending on post-processing):
- Web services (if publishing results)
- Databases (if storing metadata)
- Kafka brokers (if using Kafka listeners)
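Before enabling the service, it can help to confirm these endpoints are reachable. The snippet below is a generic TCP reachability probe, not part of livereduce itself; substitute your own host and port (for example the DAS hostname and listener port mentioned in this guide):

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False

# Example (substitute your beamline number for <N>):
# can_connect("bl<N>a-dassrv1.sns.gov", 31415)
```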
Depending on listener type, you may need to adjust firewall rules:
TCP listeners:

```shell
# Allow incoming connections on listener port
sudo firewall-cmd --add-port=31415/tcp --permanent
sudo firewall-cmd --reload
```

Kafka listeners:

```shell
# Allow connections to Kafka brokers (ports 9092, 9093, etc.)
sudo firewall-cmd --add-rich-rule='rule family="ipv4" source address="kafka-broker.facility.gov" accept' --permanent
sudo firewall-cmd --reload
```

The daemon automatically detects script changes via inotify:
1. Test scripts locally (see Processing Scripts)
2. Copy to production:
```shell
scp reduce_INSTR_live_proc.py snsdata@beamline-server:/SNS/INSTR/shared/livereduce/
scp reduce_INSTR_live_post_proc.py snsdata@beamline-server:/SNS/INSTR/shared/livereduce/
```

3. Automatic detection: The daemon uses inotify to watch for file changes:
- Script modified (md5sum changed): Restarts processing automatically
- Script deleted: Restarts without that script
- Script created: Restarts with new script
4. Verify deployment:
```shell
# Check the log for restart message
sudo journalctl -u livereduce -n 50

# Look for:
# "Processing script "/path/to/script" changed - restarting StartLiveData"
```

5. Monitor for errors:

```shell
tail -f /var/log/SNS_applications/livereduce.log
```

Note: Modifying `/etc/livereduce.conf` causes the service to exit. Systemd will restart it with the new configuration after a short delay.
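The script change detection described in step 3 can be modeled in a few lines: hash the script, compare against the last known checksum, and restart when they differ. This is a simplified sketch of the behavior, not the daemon's actual code:

```python
import hashlib
import os

def script_md5(path):
    """md5 hex digest of a script, or None if the file doesn't exist."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()

def needs_restart(path, last_md5):
    """A changed, created, or deleted script all change the checksum."""
    return script_md5(path) != last_md5
```

In the real daemon, the comparison is triggered by inotify events rather than polling, which is why script changes are picked up immediately.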
```shell
# 1. Edit configuration
sudo vim /etc/livereduce.conf

# 2. Service will automatically restart
# Monitor logs to verify
tail -f /var/log/SNS_applications/livereduce.log
```

Basic service management:

```shell
# Start the service
sudo systemctl start livereduce

# Stop the service
sudo systemctl stop livereduce

# Restart the service
sudo systemctl restart livereduce

# Check status
systemctl status livereduce
sudo systemctl status livereduce  # Shows more log lines

# Enable at boot
sudo systemctl enable livereduce

# Disable at boot
sudo systemctl disable livereduce
```

Viewing logs:

```shell
# Service log file (readable by anyone)
tail -f /var/log/SNS_applications/livereduce.log

# Systemd journal (requires sudo for full history)
sudo journalctl -u livereduce -f

# Last 100 lines
sudo journalctl -u livereduce -n 100

# Since specific time
sudo journalctl -u livereduce --since "2026-01-21 10:00:00"
```

Checking processes:

```shell
# Quick status check
systemctl status livereduce

# See all processes owned by snsdata
ps -u snsdata -o pid,etime,stat,command

# Process tree
pstree -p $(pgrep -f livereduce.py)

# Files the process has open
sudo lsof -p $(pgrep -f livereduce.py)
```

The watchdog is a separate, independent service that monitors the main daemon.
- Checks the modification time of `/var/log/SNS_applications/livereduce.log`
- If there are no updates for `threshold` seconds (default 300), restarts the main service
- Logs the last 20 lines of the main log before restarting
- Prevents repeated restarts within the same inactivity window
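The core staleness check can be sketched as follows. This is a simplified model of the watchdog's logic with an invented function name; the real service goes on to restart `livereduce` via systemd when the check fires:

```python
import os
import tempfile
import time

def log_is_stale(log_path, threshold, now=None):
    """True if log_path was not modified within the last `threshold` seconds."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(log_path)
    except OSError:
        return True  # treat a missing log as stale
    return (now - mtime) > threshold

# A freshly written log is fresh; simulate 400 s of silence to see it go stale.
with tempfile.NamedTemporaryFile() as log:
    print(log_is_stale(log.name, threshold=300))                         # False
    print(log_is_stale(log.name, threshold=300, now=time.time() + 400))  # True
```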
```shell
# Watchdog operations are completely independent
sudo systemctl start livereduce_watchdog
sudo systemctl stop livereduce_watchdog
sudo systemctl restart livereduce_watchdog
systemctl status livereduce_watchdog
```

Important:
- Stopping watchdog doesn't affect main service
- Main service continues running unsupervised
- Watchdog and main service must be managed separately
Configure in `/etc/livereduce.conf`. Note that the file is strict JSON, which does not allow comments:

```json
{
    "watchdog": {
        "interval": 60,
        "threshold": 300
    }
}
```

Here `interval` is how often the watchdog checks (every 60 seconds above) and `threshold` is how many seconds of log inactivity trigger a restart (300 seconds above).

```shell
# View watchdog log
tail -f /var/log/SNS_applications/livereduce_watchdog.log

# Watchdog journal
sudo journalctl -u livereduce_watchdog -f
```

Enable for:
- Production operation
- Unattended running
- Known issues with service stalling
- Automatic recovery from hangs
Disable for:
- Maintenance on main service
- Testing script changes interactively
- Investigating why restarts happen
- Watchdog too aggressive for workload
Before deploying to production:
- Tested processing scripts with fake data server
- Verified scripts with realistic data rates
- Checked memory usage under load
- Configured an appropriate `system_mem_limit_perc`
- Network connectivity to DAS verified
- Output directory permissions correct
- Log rotation configured
- Watchdog enabled and configured
- Service enabled at boot
- Monitoring/alerting set up
- Documentation for instrument scientists
The daemon immediately cancels StartLiveData and MonitorLiveData and restarts them when one of the processing scripts changes (detected via md5sum) or is removed. This keeps the daemon resilient to script changes.
If the configuration file changes, the process exits and systemd restarts it. This allows configuration changes, such as selecting a different Mantid version, to take effect.
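A minimal model of that exit-and-restart behavior, assuming a simple mtime comparison (the `config_changed` helper and polling loop are invented for illustration; the daemon's actual mechanism may differ):

```python
import os
import sys
import time

def config_changed(config_path, baseline_mtime):
    """True once the config file's modification time differs from the baseline."""
    return os.path.getmtime(config_path) != baseline_mtime

def run_until_config_changes(config_path, poll_seconds=5.0):
    """Exit cleanly when the config changes; systemd then restarts the
    daemon, which re-reads the file (e.g. picking up a new Mantid version)."""
    baseline = os.path.getmtime(config_path)
    while not config_changed(config_path, baseline):
        time.sleep(poll_seconds)
    sys.exit(0)
```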
This project uses a simplified semantic-versioning scheme: version numbers are in the format MAJOR.MINOR.
To create a new version, start by updating the version field in pyproject.toml.
For example, to create version 1.20:
```toml
[project]
version = "1.20"
```

Then create a git tag and release with the same version number. This can be done with the GitHub CLI tool `gh`:

```shell
git checkout main
git fetch --all --prune
git fetch --prune --prune-tags origin
git rebase -v origin/main
gh release create v1.20 --title "v1.20" --target main --generate-notes
```

The command `gh release create` creates both the tag `v1.20` and a GitHub release accessible at
https://github.com/mantidproject/livereduce/releases/tag/v1.20
This package uses a hand-written spec file for releasing on RPM-based systems.
To build, run the script `rpmbuild.sh` in an environment containing the RPM building framework.
Host ndav.sns.gov has all necessary dependencies.

```shell
./rpmbuild.sh
```

Packages generated by this script:

- `$HOME/rpmbuild/RPMS/noarch/python-livereduce-<version>-1.el9.noarch.rpm`
- `$HOME/rpmbuild/RPMS/noarch/python-livereduce-watchdog-<version>-1.el9.noarch.rpm`
- `$HOME/rpmbuild/SRPMS/python-livereduce-<version>-1.el9.src.rpm`
The source package src.rpm contains both livereduce and livereduce-watchdog services.
The last step requires manually uploading these packages to the GitHub release page created in the previous step.
You will need to edit the release page and click "Attach binaries by dropping them here or selecting them"
to upload the three files.
After that, notify Peter Peterson who will copy these files onto snspackages.ornl.gov
for distribution on SNS systems.
Note: For RPM development and testing, visit the RPM testing guide.
This repository is configured to use pre-commit. Set up with pixi:
```shell
pixi install
pixi shell
pre-commit install
```

More information about testing can be found in test/README.md.
- Linux system (tested on Fedora/RHEL/CentOS)
- Python 3.9 or later
- Pixi for environment management
- Git for version control
- Fork and clone the repository:

```shell
git clone https://github.com/YOUR_USERNAME/livereduce.git
cd livereduce
```

- Set up development environment:

```shell
pixi install
pixi shell
```

This installs:

- Mantid framework
- Python dependencies (pyinotify, psutil)
- Development tools (pre-commit, hatchling)

- Install pre-commit hooks:

```shell
pre-commit install
```

Pre-commit runs linting and formatting checks before each commit.
- Create a feature branch:

```shell
git checkout -b feature/your-feature-name
```

Use descriptive branch names:

- `feature/add-kafka-listener`
- `fix/memory-leak-in-monitor`
- `docs/update-configuration-guide`
- Make your changes following the code style guidelines below.

- Test your changes (see Testing section).

- Commit with clear messages:

```shell
git add .
git commit -m "Add support for Kafka event streaming

- Implement KafkaListener class
- Add configuration options for broker URLs
- Update documentation with Kafka examples"
```

Good commit messages:
- Start with a verb (Add, Fix, Update, Remove)
- Use present tense
- Include "why" context in the body
- Reference issues when applicable
The project uses automated formatting tools:
- Ruff for Python linting and formatting
- Pre-commit for automated checks
Configuration is in ruff.toml and .pre-commit-config.yaml.
Run checks manually:
```shell
# Run all pre-commit checks
pre-commit run --all-files

# Run ruff directly
pixi run ruff check scripts/ test/
pixi run ruff format scripts/ test/
```

Python style guidelines:
- Follow PEP 8
- Use type hints where reasonable
- Keep functions focused and testable
- Add docstrings for public APIs
- Use meaningful variable names
- Push your branch:

```shell
git push origin feature/your-feature-name
```

- Create a Pull Request:
- Go to https://github.com/mantidproject/livereduce
- Click "New Pull Request"
- Select your fork and branch
- Fill out the PR description with:
- Summary of changes and motivation
- Type of change (bug fix, feature, etc.)
- Testing performed
- Related issues
- Respond to review feedback:
- Address all reviewer comments
- Push additional commits to the same branch
- Request re-review when ready
1. Additional Data Listeners
Implement support for new data acquisition systems.
2. Memory Management Improvements
Enhance memory monitoring and recovery:
- Better prediction of memory needs
- Smarter workspace cleanup
- Event data compression strategies
3. Error Recovery
Improve resilience to transient failures:
- Automatic reconnection to DAS
- Better handling of network interruptions
- Recovery from corrupted data chunks
4. Performance Optimization
Profile and optimize hot paths:
- Reduce latency between chunks
- Optimize workspace operations
- Minimize memory allocations
Always welcome:
- Fix typos or unclear explanations
- Add examples for specific instruments
- Improve troubleshooting guides
- Add diagrams or visualizations
Found a bug? Please report it with:
- Description: What happened vs. what you expected
- Steps to reproduce: Exact sequence to trigger the bug
- Environment: OS, Mantid version, LiveReduce version, configuration
- Logs: Relevant log excerpts
- Impact: How severe is the issue?
(For maintainers)
- Update version in `pyproject.toml`
- Update changelog with notable changes
- Tag release:

```shell
git tag -a v1.18 -m "Release version 1.18"
git push origin v1.18
```

- CI builds and tests automatically
- Build RPM for distribution:

```shell
./rpmbuild.sh
```
- Architecture - System design and components
- Processing Scripts - Writing processing scripts
- Configuration Reference - All configuration options
- Troubleshooting - Fixing problems