If you are building an AI agent that sends email from its own address, the first outbound message is the easy part. The reply is harder. Without reply handling, it lands like any other new message: the agent has no way to connect it to the thread it started, no context for what was proposed, and no state to drive the next step. Every conversation the agent opens stops at the first exchange.
Reply handling only works if the mailbox belongs to the agent rather than being borrowed from a human account, where replies mix with unrelated email. Nylas Agent Accounts give your agent its own inbox, its own thread model, and a message.created webhook scoped to what the agent owns. Here is how to wire up the reply side.
Before you start, you’ll need an Agent Account on a registered domain and a webhook endpoint Nylas can reach. The quickstart sets both up in about 5 minutes.
Detect the reply
Every message.created webhook payload includes a thread_id. When your agent sent the original outbound message, Nylas assigned it a thread ID. Your state store maps that thread ID to whatever context the agent is tracking for the conversation. When the recipient replies, their mail client automatically sets the In-Reply-To and References headers. Nylas reads those headers and groups the inbound reply under the same thread_id as the original outbound. You do not need to parse email headers yourself. The Threads API already did that work.
The check in your handler is a single lookup: does this thread_id exist in your state store? One note on field casing: webhook payloads arrive as raw JSON in snake_case (grant_id, thread_id), while values the SDK returns are camelCase (threadId, messageIds). Seeing both in one handler is expected, not a typo.
app.post("/webhooks/nylas", async (req, res) => {
  // Verify X-Nylas-Signature before processing (see the webhooks guide,
  // https://developer.nylas.com/docs/v3/notifications/).
  res.status(200).end(); // Acknowledge immediately.

  const event = req.body;
  if (event.type !== "message.created") return;

  const msg = event.data.object;
  if (msg.grant_id !== AGENT_GRANT_ID) return;

  // Outbound messages also fire message.created. Skip them.
  if (msg.from?.[0]?.email === AGENT_EMAIL) return;

  // Is this a reply to a conversation the agent is tracking?
  const context = await db.getThreadContext(msg.thread_id);
  if (context) {
    await handleReply(msg, context);
  } else {
    await handleNewMessage(msg);
  }
});
The self-message filter is not optional. When your agent sends a reply via the API, message.created fires for that outbound message too. Without the filter, the agent reads its own sent messages as inbound and the loop runs against itself.
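The handler above relies on a db object that this guide does not define. As a minimal sketch, assuming an in-memory Map is enough for local testing, the state store could look like the following. The names threadContexts, getThreadContext, and updateThreadContext are illustrative here, not part of the Nylas SDK, and a real deployment would want a durable store.

```javascript
// Hypothetical in-memory stand-in for the `db` object used in this guide.
// Swap the Map for a durable store (a SQL table, Redis, etc.) in production.
const threadContexts = new Map();

const db = {
  // Returns the stored context for a thread, or null if the agent
  // is not tracking that thread.
  async getThreadContext(threadId) {
    return threadContexts.get(threadId) ?? null;
  },

  // Creates or replaces the context for a thread.
  async updateThreadContext(threadId, context) {
    threadContexts.set(threadId, { ...context, threadId });
  },
};
```

When the agent sends the first outbound message, it would record the thread ID from the send response with a call such as db.updateThreadContext(threadId, { step: "awaiting_confirmation" }), so the lookup in the webhook handler finds it when the reply arrives.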
Fetch the conversation history
The message.created webhook payload carries summary fields: subject, sender, a short snippet. It does not carry the full message body. Before the agent decides how to respond, fetch the complete message from the API. Fetch the full thread alongside it so the agent has the entire conversation history, not just the latest reply.
async function handleReply(msg, context) {
  // Fetch the full body. The webhook payload only carries a snippet.
  const fullMessage = await nylas.messages.find({
    identifier: AGENT_GRANT_ID,
    messageId: msg.id,
  });

  // Fetch the thread for the full conversation chain.
  const thread = await nylas.threads.find({
    identifier: AGENT_GRANT_ID,
    threadId: msg.thread_id,
  });

  // Reconstruct the conversation history the agent needs.
  const history = await buildConversationHistory(thread.data.messageIds);

  await routeReply(fullMessage.data, history, context);
}
An LLM deciding how to respond to “sounds good, let’s do Thursday” needs to know what was proposed in the messages before it. Without the thread history, every reply looks like a message with no conversation behind it.
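The buildConversationHistory helper is left undefined in this guide. One possible shape, sketched under assumptions: the message fetcher is injected as a hypothetical fetchMessage(id) function (in the handler it could wrap nylas.messages.find and return the .data object), and each message is assumed to expose from, date, and body fields. The makeHistoryBuilder factory and the "agent" / "contact" role labels are illustrative names, not SDK features.

```javascript
// Hypothetical factory. `fetchMessage(id)` is injected so the sketch does not
// depend on a specific SDK call; `agentEmail` is the agent's own address.
function makeHistoryBuilder(fetchMessage, agentEmail) {
  return async function buildConversationHistory(messageIds) {
    // Fetch every message in the thread in parallel.
    const messages = await Promise.all(messageIds.map((id) => fetchMessage(id)));

    return messages
      .slice()
      .sort((a, b) => a.date - b.date) // oldest first
      .map((m) => ({
        // Label who wrote each message so an LLM prompt can tell turns apart.
        role: m.from?.[0]?.email === agentEmail ? "agent" : "contact",
        from: m.from?.[0]?.email ?? "unknown",
        date: m.date,
        body: m.body ?? "",
      }));
  };
}
```

Sorting by date keeps the history in reading order even if the thread's message ID list is not chronological.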
Route based on agent state
The context you stored when the agent sent the original message tells you where in the workflow this reply lands. A reply to a scheduling proposal needs different handling than a reply that reopens a closed support ticket. The routing logic is a switch on whatever step your agent tracks.
async function routeReply(message, history, context) {
  switch (context.step) {
    case "awaiting_confirmation":
      // Agent proposed something and is waiting for a yes or no.
      await handleConfirmation(message, history, context);
      break;
    case "awaiting_info":
      // Agent asked a question and needs the answer before proceeding.
      await handleInfoResponse(message, history, context);
      break;
    case "closed":
      // Conversation was resolved but the person wrote back.
      await handleReopenedThread(message, history, context);
      break;
    default:
      // Unknown state: log and route to a human.
      await escalateToHuman(message, context);
  }
}
The step values are whatever your agent defines. A scheduling agent might use awaiting_time_selection, awaiting_rsvp, booked. A support agent might use open, awaiting_customer, escalated. The pattern is the same regardless of the use case.
Reply in-thread
When the agent responds, pass replyToMessageId to the send call. This tells Nylas to set the correct In-Reply-To and References headers on the outbound message. The recipient’s mail client threads the reply into the existing conversation. Without it, the agent’s response lands as a disconnected new email in the recipient’s inbox.
async function sendReply(originalMessage, body, context) {
  const sent = await nylas.messages.send({
    identifier: AGENT_GRANT_ID,
    requestBody: {
      replyToMessageId: originalMessage.id,
      to: originalMessage.from,
      subject: `Re: ${originalMessage.subject}`,
      body: body,
    },
  });

  // Update the conversation state so the next reply routes correctly.
  await db.updateThreadContext(originalMessage.threadId, {
    ...context,
    step: "awaiting_reply",
    lastSentAt: Date.now(),
    lastSentMessageId: sent.data.id,
  });
}
Store the ID of the sent message (lastSentMessageId in the code above) alongside the thread context. If the next reply arrives before the agent has finished processing this one, the stored message ID lets you confirm a reply was already sent and avoid duplicates.
What to watch in production
Multiple replies can arrive on the same thread before the agent has finished processing the first. A recipient might send a correction ten seconds after their initial message, or two people on a CC thread might both reply. Process each message independently and check whether the agent has already responded since the last inbound before generating another reply. A short cooldown of 30 to 60 seconds before the agent responds lets you batch consecutive messages into one reply rather than responding to each individually.
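The two checks described above can be sketched as a pair of small helpers. This is an illustration under assumptions, not a prescribed implementation: hasRepliedSince, scheduleReply, and COOLDOWN_MS are hypothetical names, both timestamps are assumed to be epoch milliseconds (convert first if your message dates are in seconds), and the timers live in process memory, so they are lost on restart.

```javascript
const COOLDOWN_MS = 45_000; // assumption: any value in the 30 to 60 second range

// True if the agent already sent a reply at or after the time this inbound
// message arrived. `context.lastSentAt` is the value stored by sendReply.
function hasRepliedSince(context, inboundAtMs) {
  return typeof context.lastSentAt === "number" && context.lastSentAt >= inboundAtMs;
}

// Per-thread debounce: each new inbound message restarts the timer, so a
// burst of messages on one thread results in a single call to `run`.
const pendingTimers = new Map();

function scheduleReply(threadId, run, delayMs = COOLDOWN_MS) {
  clearTimeout(pendingTimers.get(threadId)); // no-op if nothing is pending
  const timer = setTimeout(() => {
    pendingTimers.delete(threadId);
    run();
  }, delayMs);
  pendingTimers.set(threadId, timer);
}
```

In a multi-worker deployment the same idea would need a shared scheduler or a delayed job queue, since an in-process timer only debounces messages that reach the same worker.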
The webhook payload body is not always the full message. If an inbound message exceeds roughly 1 MB, the webhook type becomes message.created.truncated and the body is omitted entirely. Handle this type in your webhook router and always fetch the body from the API rather than relying on the payload.
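A small type check keeps the truncated variant from being dropped by a strict comparison against "message.created". The sketch below assumes only the two event types named above; isInboundMessageEvent is a hypothetical helper name.

```javascript
// Both event types describe a newly created message. The truncated variant
// omits the body, so the handler fetches the message from the API either way.
const MESSAGE_CREATED_TYPES = new Set([
  "message.created",
  "message.created.truncated",
]);

function isInboundMessageEvent(event) {
  return MESSAGE_CREATED_TYPES.has(event?.type);
}
```

In the webhook handler, the early return would then read if (!isInboundMessageEvent(event)) return; in place of the single-type comparison.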
Deduplication is required. Webhook redelivery and concurrent workers will both re-trigger the handler, and without dedup the agent sends duplicate replies. The guide "Prevent duplicate agent replies" walks through how to close both gaps.
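As a minimal illustration of the idea, assuming a single process, a handler can claim each message ID before doing any work and skip IDs it has already seen. The claimMessage helper and the in-memory Set are hypothetical; with several workers the claim has to be an atomic operation in a shared store (for example a unique-key insert), which this sketch does not provide.

```javascript
// Hypothetical single-process dedup: remember which message IDs were handled.
const claimedMessageIds = new Set();

// Returns true the first time an ID is seen and false on every repeat,
// so redelivered webhooks for the same message are skipped.
function claimMessage(messageId) {
  if (claimedMessageIds.has(messageId)) return false;
  claimedMessageIds.add(messageId);
  return true;
}
```

The handler would call it right after the self-message filter: if (!claimMessage(msg.id)) return;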
Know when your reply actually lands. Subscribe to message.send_success, message.send_failed, and message.bounce_detected alongside message.created. send_success confirms the reply went out; send_failed and bounce_detected let the agent react to a bad address instead of reading silence as the other side still thinking.
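One way to fold these delivery events into the same state machine is a pure function that maps an event type to the next step. This is a sketch under assumptions: nextStepForDeliveryEvent is a hypothetical helper, and "delivery_failed" is an illustrative step value your router would need to handle (for example by escalating to a human).

```javascript
// Maps a delivery webhook type to the conversation's next step.
// "delivery_failed" is a hypothetical step name, not a Nylas value.
function nextStepForDeliveryEvent(eventType, currentStep) {
  switch (eventType) {
    case "message.send_failed":
    case "message.bounce_detected":
      // The reply did not reach the recipient; stop waiting on them.
      return "delivery_failed";
    case "message.send_success":
    default:
      // Delivered, or an unrelated event: keep the current step.
      return currentStep;
  }
}
```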