Process management
XMTP SDKs have a few guardrails in place to prevent crashes. However, it's difficult to guarantee 100% uptime for long-running processes. For that reason, we recommend using a process manager like PM2 to ensure proper restart behavior and logging.
Installation
Install PM2 as a dependency:
npm i pm2
Ecosystem config file
Create ecosystem.config.cjs with the following structure:
const path = require("path");
const projectRoot = path.resolve(__dirname, "../../..");
module.exports = {
  apps: [
    {
      name: "<bot-name>",
      script: "node_modules/.bin/tsx",
      args: "src/index.ts",
      cwd: projectRoot,
      autorestart: true,
      max_memory_restart: "1G",
      error_file: "./logs/pm2-<bot-name>-error.log",
      out_file: "./logs/pm2-<bot-name>-out.log",
      restart_delay: 4000,
      min_uptime: 1000,
      unstable_restarts: 10000, // CRITICAL: Prevents PM2 from stopping restarts
      env: {
        NODE_ENV: "production",
      },
    },
  ],
};
Critical config settings
- unstable_restarts: 10000 - REQUIRED. Without this, PM2 will stop restarting if it detects the process as "unstable" (crashing too quickly). This allows PM2 to keep restarting even during rapid crash cycles.
- min_uptime: 1000 - Process must run at least 1 second to be considered stable.
- restart_delay: 4000 - Wait 4 seconds before restarting (prevents rapid restart loops).
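To make the interaction between min_uptime and restart_delay concrete, here is a minimal sketch that classifies a list of run durations. It is a simplified model of the behavior described above, not PM2's internal logic. The helper names isUnstableRun and summarizeRuns and the sample durations are hypothetical.

```javascript
// Simplified model of the settings described above; NOT PM2's internal logic.
// Values mirror the example ecosystem config.
const settings = { min_uptime: 1000, restart_delay: 4000 };

// A run counts as "unstable" when the process exits before min_uptime elapses.
function isUnstableRun(uptimeMs, minUptimeMs = settings.min_uptime) {
  return uptimeMs < minUptimeMs;
}

// Summarize observed run durations (hypothetical helper, illustration only).
function summarizeRuns(uptimesMs, { min_uptime, restart_delay } = settings) {
  const unstable = uptimesMs.filter((ms) => isUnstableRun(ms, min_uptime)).length;
  return {
    unstable,
    stable: uptimesMs.length - unstable,
    // Each exit is followed by one restart_delay wait before the next start.
    totalDelayMs: uptimesMs.length * restart_delay,
  };
}

// Hypothetical sample: three quick crashes and one long-lived run.
console.log(summarizeRuns([200, 850, 30000, 999]));
```

In this model, only the 30-second run counts as stable. The other three exit before the 1-second threshold, which is the situation the unstable_restarts setting is meant to cover.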
Agent code pattern
1. Restart logging (FIRST THING)
Add restart logging as the very first line of code, before any async operations:
// Immediate synchronous log - FIRST THING that runs
console.log(
  `[RESTART] <Bot Name> bot starting - PID: ${process.pid} at ${new Date().toISOString()}`,
);
This log appears immediately when PM2 restarts the process, making it easy to track restarts in logs.
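Because every start emits a line in the same shape, restarts can be counted and timed from the log output. The following is a minimal sketch, assuming the log text has already been read into a string. The parseRestarts helper and the sample log lines are hypothetical and only illustrate the idea.

```javascript
// Matches the "[RESTART] ... PID: <pid> at <ISO timestamp>" format shown above.
const RESTART_LINE = /\[RESTART\].*PID: (\d+) at (\S+)/;

// Extract the PID and start time of every restart entry in a block of log text.
function parseRestarts(logText) {
  const restarts = [];
  for (const line of logText.split("\n")) {
    const match = RESTART_LINE.exec(line);
    if (match) {
      restarts.push({ pid: Number(match[1]), at: new Date(match[2]) });
    }
  }
  return restarts;
}

// Hypothetical sample log content, for illustration only.
const sample = [
  "[RESTART] Demo bot starting - PID: 4121 at 2025-01-01T10:00:00.000Z",
  "some other output",
  "[RESTART] Demo bot starting - PID: 4190 at 2025-01-01T10:00:07.500Z",
].join("\n");

const restarts = parseRestarts(sample);
console.log(`${restarts.length} restarts found`);
// Gap between the two starts, in milliseconds.
console.log(restarts[1].at - restarts[0].at);
```

Short gaps between consecutive entries suggest the process is crashing soon after startup.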
2. Error handlers
Add this error handler after agent creation but before agent.start():
// Handle agent-level unhandled errors
agent.on("unhandledError", (error) => {
  console.error("<Bot Name> bot fatal error:", error);
  if (error instanceof Error) {
    console.error("Error stack:", error.stack);
  }
  console.error("Exiting process - PM2 will restart");
  process.exit(1);
});
Note: Process-level handlers (uncaughtException, unhandledRejection) are typically commented out, as agent.on("unhandledError") handles most cases.
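If you do want process-level handlers as a fallback, they might look like the sketch below. It uses the standard Node.js uncaughtException and unhandledRejection process events. The handleFatal and registerProcessHandlers helpers are hypothetical names, and the injectable exit parameter exists only so the sketch can be exercised without ending the process.

```javascript
// Log a fatal error and exit non-zero so the process manager restarts the process.
// `exit` is injectable for illustration; by default it calls process.exit.
function handleFatal(label, error, exit = (code) => process.exit(code)) {
  console.error(`${label}:`, error);
  if (error instanceof Error) {
    console.error("Error stack:", error.stack);
  }
  console.error("Exiting process - PM2 will restart");
  exit(1);
}

// Register handlers for the two standard Node.js process-level error events.
function registerProcessHandlers(proc = process, exit = (code) => process.exit(code)) {
  proc.on("uncaughtException", (error) =>
    handleFatal("Uncaught exception", error, exit),
  );
  proc.on("unhandledRejection", (reason) =>
    handleFatal("Unhandled rejection", reason, exit),
  );
}

registerProcessHandlers();
```

Exiting with a non-zero code mirrors the agent-level handler, so both paths leave the restart decision to PM2.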
Troubleshooting
PM2 shows "waiting restart" but never restarts
Symptom: PM2 status shows "waiting restart" but no new process spawns.
Solution: Add unstable_restarts: 10000 to ecosystem config. PM2 is blocking restarts because it thinks the process is unstable.
No restart logs appearing
Symptom: Process crashes but [RESTART] logs never appear.
Solution:
- Verify the [RESTART] log is the very first line of code
- Check that the PM2 config has unstable_restarts: 10000
- Check PM2 logs: pm2 logs <bot-name> --raw
Process restarts too quickly
Symptom: Process restarts in a tight loop.
Solution: Increase restart_delay to give the process time to initialize before allowing another restart.
Application exits early
When PM2 runs in daemon mode, pm2 start exits right after launching the processes. In container or PaaS environments, this can make the platform think your app has finished and should shut down.
To keep PM2 in the foreground, use pm2-runtime instead of pm2 start:
pm2-runtime ecosystem.config.cjs
This keeps the PM2 process running in the foreground, which is typically required by PaaS platforms such as Render or Heroku.

