The intricacies of integrating with IMAP

Explore IMAP integration insights to elevate your app’s email capabilities. Learn challenges, system design tradeoffs, and crucial decisions for a seamless integration.

Integrating with IMAP

Having IMAP support will level up your application’s email capabilities, extending your customer reach and empowering your users to sync their emails with your products seamlessly. However, while IMAP integration offers important accessibility and efficiency advantages, setting it up can be a head-scratcher, even for the most experienced developers. 

If you’re considering building IMAP support for your app, we’ve got some insider insights to share. We’ve been down that road, tackling the hurdles to set up IMAP provider support for our email API (that includes messages, threads, folders, drafts, and more). We’ll also share information on system design tradeoffs you may face when integrating with old, legacy technology in the modern era.

As you read about the complexity of integrating with IMAP and begin to decide on your next steps, keep these two questions in mind:

  • Are you willing or able to offer a less complete feature set to your IMAP end-users?
  • Is your team willing or able to invest significant time in building an architecture vastly different from your interactions with modern email providers not on IMAP (Gmail, Microsoft Graph, etc.)?

One can imagine the answers to these questions living on a spectrum from Minimal Feature Set & Minimal Time Investment to Complete Feature Parity & Maximum Time Investment:

[Figure: IMAP integration spectrum, from minimal feature set and time investment to complete feature parity and maximum time investment]

Knowing where you are on this spectrum is the key to determining if it’s worth building your IMAP integrations yourself or leaving this “heavy lifting” to experts in the field like Nylas.

What is IMAP?

IMAP, or Internet Message Access Protocol, is an email retrieval protocol that allows users to access their email messages from any device as long as they have an email client that supports IMAP. IMAP messages are stored on the mail server, and users can view, delete, and move them as needed.

The first published IMAP specification, RFC 1064, was released in 1988. IMAP has been updated several times since then: RFC 3501 (IMAP4rev1), published in 2003, is still the most widely implemented revision, and RFC 9051 (IMAP4rev2) followed in 2021. Still to date (this post was authored in August 2023), IMAP is a mature protocol well-supported by some of the most popular email clients and providers of the modern era, such as Apple iCloud Mail and Yahoo Mail.

Take a look at “Everything you need to know about IMAP”, written by our co-founder and CTO Christine Spang, if you’d like to develop your background knowledge of IMAP further!

Core challenges of an IMAP integration

From our experience working with IMAP, we know you’ll likely encounter a range of obstacles, from speed to provider-specific nuances and more. 

I originally intended to order these from “mildly annoying” to “this is impossible,” but soon realized that all these obstacles are deeply interlinked and will influence a team’s design choices. So we’ll share these in no particular order and dive into potential workarounds and their associated tradeoffs.

Latency

If you’ve ever tried integrating with IMAP in the past, you’re probably already familiar with just how slow these providers can be. To borrow a phrase from one of our engineering leaders, Chitresh Deshpande, “Finishing an operation in seconds is the equivalent of light speed in IMAP.”

Here’s an example of the boilerplate code in Go needed just to print the flags (\Drafts, \Trash, etc.) of a mailbox:

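// Fragment of a function that returns error. The client package is assumed
// to be github.com/emersion/go-imap/client (go-imap v1).

// Set up a connection with the IMAP host server(s)
c, err := client.DialTLS("imap.mail.me.com:993", nil) // nil config keeps certificate verification on
if err != nil {
	return fmt.Errorf("couldn't dial tls for imap client: %v", err)
}
defer c.Logout() // log out even if a later step fails

// Log in with the end-user's credentials
if err := c.Login("nylas-username", "nylas-password"); err != nil {
	return fmt.Errorf("couldn't login as user: %v", err)
}

// Select the desired mailbox (INBOX is used here as an example) and print its flags
mbox, err := c.Select("INBOX", true)
if err != nil {
	return fmt.Errorf("couldn't select mailbox: %v", err)
}
fmt.Println("Flags:", mbox.Flags)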

This round trip of dialing in, logging in, logging out, etc., is required regardless of operation and makes directly communicating with the provider expensive. Below is a visualization of some benchmark testing conducted by one of our Staff Engineers, Zhi Qu, that seeks to quantify the latency of various IMAP commands precisely. If we assume that logging out takes about the same time as logging in, then simply performing the boilerplate operations above adds roughly two to two and a half seconds to your round-trip latency.

[Figure: latency benchmark of IMAP commands]

At this point, you may wonder, “Why not just open a long-lasting connection?” While possible, this solution is generally not recommended, and we’ll cover the nuances of why later.

Lastly, since IMAP is simply a specification, you are also at the whim of the IMAP server implementation’s performance. Another team’s technical choices (although beyond your control) could detrimentally impact your product or service.

A potential workaround is to leverage a modern storage layer between your end users and the IMAP provider and perform operations against that. We will discuss this approach’s tradeoffs at length in the section on “Search” below.

Identification

To perform operations like starring, marking as unread, etc., that are common to most email clients, both the user and developer must be able to uniquely identify a particular message or folder. To newcomers building with IMAP, it may be a surprise that this is not particularly straightforward.

For instance, folders lack any numeric identifier at all. The only ID available is the folder name, and the user can rename the folder at any time. Fortunately, there cannot be two folders with the same name simultaneously.

For message management, the IMAP protocol uses three types of identification: the UID, Sequence ID, and Message-ID. When performing updates on your end-user’s email data, it’s of utmost importance that your system is leveraging the correct type of identification for the task at hand. Otherwise, you may end up manipulating the wrong messages!

Message UID 

The unique identifier (UID) is a number assigned to each email message on an IMAP server. It’s important to note that UIDs are unique only within a particular mailbox. So, if a user has multiple mailboxes, the same UID might appear in each, but it would refer to different messages.

While UIDs are designed to remain consistent throughout a message’s life in a specific folder, there is one caveat: they can change if the folder’s UIDValidity number changes. When the UIDValidity of a folder changes, all previously known UIDs within that folder are invalidated, and each message in that folder is assigned a new UID. When and how often the UIDValidity changes is out of your control and entirely dependent on the provider.
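A sync engine usually guards against this by remembering the UIDValidity it last saw for each folder and comparing it with the value the server reports when the folder is selected. Below is a minimal sketch of that check. The FolderCache type and its fields are hypothetical names chosen for illustration, not part of any IMAP library.

```go
package main

import "fmt"

// FolderCache is a hypothetical local record of what we last synced for one folder.
type FolderCache struct {
	UIDValidity uint32
	// UIDs maps a message UID to our own internal message key.
	UIDs map[uint32]string
}

// Reconcile compares the UIDValidity reported by the server with the cached
// one. If they differ, every cached UID is stale, so the mapping is dropped
// and the folder must be synced again. It returns true when that happened.
func (c *FolderCache) Reconcile(serverUIDValidity uint32) bool {
	if c.UIDValidity == serverUIDValidity {
		return false
	}
	c.UIDValidity = serverUIDValidity
	c.UIDs = make(map[uint32]string)
	return true
}

func main() {
	cache := &FolderCache{UIDValidity: 1001, UIDs: map[uint32]string{42: "msg-a"}}

	reset := cache.Reconcile(1001)
	fmt.Println(reset, len(cache.UIDs)) // false 1

	reset = cache.Reconcile(1002)
	fmt.Println(reset, len(cache.UIDs)) // true 0
}
```

The important design choice is to run this check before trusting any stored UID, since an operation issued with a stale UID could land on a different message.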

Message sequence ID

Similar to the UID, the sequence ID (the IMAP specification calls it the message sequence number) is not a permanent identifier. It represents the relative position of a message within a mailbox. For instance, if a message is the 10th email in a mailbox, its Sequence ID would be 10. However, if an email before it is deleted and expunged, its Sequence ID will change to 9. This dynamic nature makes Sequence IDs less reliable for long-term operations, but they can be useful for short-term tasks or when working with a specific snapshot of a mailbox.
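To make the shifting concrete, here is a small simulation. It models a mailbox as an ordered list of UIDs, which is a simplification for illustration only (the mailbox type and its methods are made up, and no server is involved).

```go
package main

import "fmt"

// mailbox models a folder as an ordered list of UIDs.
// A message's sequence number is its 1-based position in that list.
type mailbox struct {
	uids []uint32
}

// seqOf returns the current sequence number for a UID, or 0 if it is absent.
func (m *mailbox) seqOf(uid uint32) int {
	for i, u := range m.uids {
		if u == uid {
			return i + 1
		}
	}
	return 0
}

// expunge removes the message with the given UID. Every later message
// moves up one position, but keeps its UID.
func (m *mailbox) expunge(uid uint32) {
	for i, u := range m.uids {
		if u == uid {
			m.uids = append(m.uids[:i], m.uids[i+1:]...)
			return
		}
	}
}

func main() {
	box := &mailbox{uids: []uint32{101, 102, 105, 110}}
	fmt.Println(box.seqOf(110)) // 4

	box.expunge(102)
	fmt.Println(box.seqOf(110)) // 3: same message, same UID, new sequence number
}
```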

Message-ID

Lastly, the Message-ID is typically a header generated by the provider that sent the email, but it is not guaranteed to be present as it is outside the scope of the IMAP spec. It is useful because the message retains its header regardless of which server hosts it and stays consistent even if a message is duplicated across multiple mailboxes. Unfortunately, performing operations on messages via Message-ID takes additional time since you must first execute a search to retrieve the message with this particular header, and only then can you use the message’s UID/SeqID to perform CRUD operations.
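The two-step flow looks roughly like this on the wire. The sketch below only builds the command text (without the tag that an IMAP client prepends) and does not talk to a server. The helper names are hypothetical, while UID SEARCH HEADER and UID STORE are standard IMAP commands.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIMAP wraps s in an IMAP quoted string, escaping backslashes and double quotes.
func quoteIMAP(s string) string {
	r := strings.NewReplacer(`\`, `\\`, `"`, `\"`)
	return `"` + r.Replace(s) + `"`
}

// searchByMessageID returns the command text for step one: ask the server
// for the UIDs, in the selected folder, of messages whose Message-ID header contains id.
func searchByMessageID(id string) string {
	return "UID SEARCH HEADER Message-ID " + quoteIMAP(id)
}

// storeFlag returns the command text for step two: add a flag to the
// message identified by the UID that the search returned.
func storeFlag(uid uint32, flag string) string {
	return fmt.Sprintf("UID STORE %d +FLAGS (%s)", uid, flag)
}

func main() {
	fmt.Println(searchByMessageID("<abc123@example.com>"))
	fmt.Println(storeFlag(42, `\Flagged`))
}
```

Both commands are scoped to the currently selected folder, so the lookup has to be repeated per folder unless you keep your own Message-ID to UID index.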

Search

The majority of historical email API call volumes we observe on our platform come from “GET /messages” requests with various levels of filters. These searches and queries are typically across all of an end user’s messages. Unfortunately, you can’t do this out of the box in IMAP because all operations are scoped to a specific folder. This leaves you with two options:

Query provider directly

First, list all of a user’s mailboxes and then execute the same search query multiple times. As you may suspect from our discussion above, this path will subject your end-users or stakeholders to high latencies and would be less than suitable for most front-end applications.

Even if you opt to leverage some sort of parallelism or multithreading, you’ll either run into the issue of opening too many connections with the IMAP provider (we’ll discuss this more in-depth below) or you’ll still experience less than desirable latencies because searching with the IMAP protocol directly is quite slow.
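If you do go down this path, a bounded fan-out is one way to keep parallelism below the provider’s connection limit. The following is a sketch under assumptions: SearchFunc stands in for whatever code opens a connection and searches one folder, and the names and the limit are illustrative.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// SearchFunc is a hypothetical hook: it runs one search against one folder
// (opening its own connection) and returns the matching UIDs.
type SearchFunc func(folder, query string) ([]uint32, error)

// Hit identifies a match. UIDs only mean something together with their folder.
type Hit struct {
	Folder string
	UID    uint32
}

// searchAll runs the same query in every folder, never running more than
// maxConns searches at once, and returns the hits sorted by folder and UID.
func searchAll(folders []string, query string, maxConns int, search SearchFunc) ([]Hit, error) {
	if maxConns < 1 {
		maxConns = 1
	}
	sem := make(chan struct{}, maxConns) // counting semaphore
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		hits     []Hit
		firstErr error
	)
	for _, folder := range folders {
		wg.Add(1)
		go func(folder string) {
			defer wg.Done()
			sem <- struct{}{}        // take a connection slot
			defer func() { <-sem }() // give it back
			uids, err := search(folder, query)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = fmt.Errorf("search %q: %w", folder, err)
				}
				return
			}
			for _, uid := range uids {
				hits = append(hits, Hit{Folder: folder, UID: uid})
			}
		}(folder)
	}
	wg.Wait()
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].Folder != hits[j].Folder {
			return hits[i].Folder < hits[j].Folder
		}
		return hits[i].UID < hits[j].UID
	})
	return hits, firstErr
}

func main() {
	fake := func(folder, query string) ([]uint32, error) {
		if folder == "INBOX" {
			return []uint32{7, 3}, nil
		}
		return nil, nil
	}
	hits, err := searchAll([]string{"INBOX", "Archive"}, "UNSEEN", 2, fake)
	fmt.Println(hits, err)
}
```

Even with the fan-out capped, total latency is still bounded below by the slowest folder search, which is why this option rarely suits interactive front ends.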

Store user data

Implement some sort of intermediate storage layer for yourself where you store message metadata, headers, and bodies. You can leverage blob storage optimized for raw HTML/text payloads such as message bodies and any other database or cache of your choice for the metadata and headers (subject, to, cc, etc.).

This will improve your per-operation latency by an order of magnitude and give you the freedom to compose complex queries across all folders. But this comes with the tradeoff of added complexity since you will need to sync the state of the IMAP provider with your data storage frequently. If your sync operations are too slow, your customers lose out on perhaps the most important quality of email as part of the modern communication stack: the fact that it is real-time.

[Figure: IMAP integration and latency]

Another complication is that you’ll also need to ensure that your end-user data management complies with GDPR, CCPA, and other privacy laws applicable to where your customers reside. Failure to encrypt messages both in transit and at rest, adopt organization-wide security policies aligned on protecting customer data, and make that data easily accessible or destroyable to customers could be as damaging as losing product-market fit.

Provider nuances

Despite being around for over 30 years, even the most popular email providers don’t exactly follow the IMAP protocol to perfection. Generally speaking, it’s quite likely that any particular IMAP provider may not implement newer IMAP commands and extensions (UIDPLUS, ESEARCH, etc.). Therefore, it’s important to be cautious and test directly with each specific provider you intend to integrate with. This will help you catch such deviations before a launch. The alternative is to avoid using the coolest new IMAP features when possible.

Here are a few provider-specific nuances our platform team has encountered while debugging IMAP-related issues:

Yahoo

  • The inbox folder should be named “INBOX” exactly. Yahoo does not follow this convention and instead names it “Inbox”.
  • By the IMAP standard, email providers are obliged to support “NOT” as a search criterion (e.g. SEARCH NOT SEEN, SEARCH NOT FROM “example@nylas.com”). However, if you try to use this filter with Yahoo, their servers will always return an empty search result.

iCloud

  • Draft folders should be marked with the special-use flag “\Drafts” (defined in RFC 6154), but iCloud does not follow this convention. The iCloud drafts folder does not have any folder flags.
  • In its web/mobile user interface, iCloud displays the system folders “Sent” and “Trash.” But when listed from the iCloud IMAP servers directly, they are named “Sent Messages” and “Deleted Messages” respectively.

OK user nylas.test@icloud.com logged in
LIST "" "*"
* LIST () "/" "Education 3.0"
* LIST () "/" "Archive"
* LIST (\Trash) "/" "Deleted Messages"
* LIST () "/" "Junk"
* LIST (\Sent) "/" "Sent Messages"
* LIST (\Noinferiors) "/" "INBOX"
* LIST () "/" "Drafts"
* LIST () "/" "alt-j"
OK LIST completed (took 11 ms)
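One way to absorb these differences is to resolve each folder to a role in two steps: trust the special-use flag when the server sends one, and fall back to a table of known folder names otherwise. The sketch below assumes the flags and the name have already been parsed out of the LIST response. The role names and the fallback table are our own illustrative choices, seeded only with the examples mentioned above.

```go
package main

import (
	"fmt"
	"strings"
)

// roleByFlag maps lowercased special-use flags (RFC 6154) to a role.
var roleByFlag = map[string]string{
	`\drafts`:  "drafts",
	`\sent`:    "sent",
	`\trash`:   "trash",
	`\junk`:    "spam",
	`\archive`: "archive",
}

// roleByName is a fallback for servers that omit flags. Keys are lowercased
// folder names; this table is an assumption and needs extending per provider.
var roleByName = map[string]string{
	"inbox":            "inbox",
	"drafts":           "drafts",
	"sent":             "sent",
	"sent messages":    "sent",
	"trash":            "trash",
	"deleted messages": "trash",
	"junk":             "spam",
	"archive":          "archive",
}

// folderRole returns the role of a folder, or "" for a plain user folder.
func folderRole(flags []string, name string) string {
	for _, f := range flags {
		if role, ok := roleByFlag[strings.ToLower(f)]; ok {
			return role
		}
	}
	return roleByName[strings.ToLower(name)]
}

func main() {
	fmt.Println(folderRole([]string{`\Trash`}, "Deleted Messages")) // trash
	fmt.Println(folderRole(nil, "Drafts"))                          // drafts
	fmt.Println(folderRole(nil, "Inbox"))                           // inbox
}
```

Matching names case-insensitively also covers the Yahoo “Inbox” case, at the cost of occasionally misclassifying a user folder that happens to share a system folder’s name.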

Connection limits & timeouts

When building with IMAP, you’ll encounter several challenges related to connection limits and timeouts, especially with providers like Yahoo.

For instance, Yahoo’s IMAP server is known to automatically shut down a connection just 30 seconds after a user logs in. This can be particularly problematic if you’re executing a slow operation, such as loading raw MIME data. Before you even receive the result, the server might close your connection.

As discussed above, if you opt to build a sync engine, note that maintaining a continuous connection to Yahoo’s IMAP server for a specific account can lead to temporary bans. Specifically, Yahoo may ban your connection for five minutes every hour if it detects prolonged activity. These timeouts and limits can introduce a myriad of unexpected edge cases that could disrupt your system’s smooth functioning, which is why we recommend closing connections whenever possible.

Even outside of syncing, opening direct connections to providers like Yahoo on behalf of your customers can inadvertently limit their own access to their email accounts. Yahoo enforces a concurrent connection limit of approximately 10 per account, and every connection your system holds counts against it.

To navigate these challenges, your system must maintain a real-time awareness of how many connections are open for a particular email address. Moreover, your system should be designed to gracefully handle spontaneously closed connections as they arise. It needs the capability to detect such closures, manage them efficiently, reopen connections, and retry operations to ensure uninterrupted service.
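A minimal sketch of those two responsibilities, counting open connections per account and retrying when the server drops one, might look like this. All names here are hypothetical. A production version would share the counter across processes and back off between retries.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	// ErrConnClosed is what an operation returns when the server closed the connection.
	ErrConnClosed = errors.New("connection closed by server")
	// ErrConnLimit is returned when the account has no free connection slot.
	ErrConnLimit = errors.New("per-account connection limit reached")
)

// ConnTracker counts open connections per email address.
type ConnTracker struct {
	mu   sync.Mutex
	max  int
	open map[string]int
}

func NewConnTracker(max int) *ConnTracker {
	return &ConnTracker{max: max, open: make(map[string]int)}
}

// Acquire reserves a slot for the account and reports whether one was free.
func (t *ConnTracker) Acquire(account string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.open[account] >= t.max {
		return false
	}
	t.open[account]++
	return true
}

// Release frees a slot previously reserved with Acquire.
func (t *ConnTracker) Release(account string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.open[account] > 0 {
		t.open[account]--
	}
}

// RunWithRetry runs op inside one tracked slot and retries, on a fresh slot,
// only when op reports that the server closed the connection.
func (t *ConnTracker) RunWithRetry(account string, attempts int, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if !t.Acquire(account) {
			return ErrConnLimit
		}
		err = op()
		t.Release(account)
		if !errors.Is(err, ErrConnClosed) {
			return err
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	tracker := NewConnTracker(10)
	calls := 0
	err := tracker.RunWithRetry("user@yahoo.com", 3, func() error {
		calls++
		if calls == 1 {
			return ErrConnClosed // simulate the server dropping the first connection
		}
		return nil
	})
	fmt.Println(calls, err) // 2 <nil>
}
```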

Email threading

Our threads endpoint is among our most frequented and sees tens of millions of API calls daily. This isn’t surprising, given that most email user interfaces today present messages in threaded views to make conversations easier to follow and manage. While threading is a fundamental feature for most modern email clients, not all IMAP providers offer this capability out of the box.

Those implementing threading typically follow the Internet Message Format (IMF) specification, RFC 5322. This standard dictates that any message sent across the internet must adhere to a specific format. This includes fields like Message-ID, Date, Content-Type, and more. The References header within the IMF is a goldmine for email developers integrating with IMAP providers.

Learn more: What is email threading?

The References header typically contains a sequence of Message-IDs in which the first Message-ID can be considered both the root of the conversation and the “thread ID”. If a message is not part of a thread (is standalone), there will be no parent, and thus the “thread ID” is simply the message’s own Message-ID. Additionally, there’s an In-Reply-To header, a separate field holding the Message-ID of the message being replied to, which could aid in displaying threaded views.
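As a rough sketch of that rule, the function below derives a thread ID from a raw message using Go’s standard net/mail package and the header names defined in RFC 5322 (References and In-Reply-To). The function name and the fallback order are our own choices for illustration.

```go
package main

import (
	"fmt"
	"net/mail"
	"strings"
)

// threadID derives a thread identifier from a raw RFC 5322 message:
//  1. the first Message-ID listed in References (the root of the conversation),
//  2. otherwise the In-Reply-To value (the direct parent, the best ancestor known),
//  3. otherwise the message's own Message-ID (a standalone message).
//
// Angle brackets are stripped. The result is "" if none of the headers exist.
func threadID(raw string) (string, error) {
	msg, err := mail.ReadMessage(strings.NewReader(raw))
	if err != nil {
		return "", fmt.Errorf("parse message: %w", err)
	}
	for _, name := range []string{"References", "In-Reply-To", "Message-ID"} {
		if ids := strings.Fields(msg.Header.Get(name)); len(ids) > 0 {
			return strings.Trim(ids[0], "<>"), nil
		}
	}
	return "", nil
}

func main() {
	reply := "Message-ID: <c@example.com>\r\n" +
		"References: <a@example.com> <b@example.com>\r\n" +
		"In-Reply-To: <b@example.com>\r\n" +
		"\r\n" +
		"Sounds good!\r\n"
	id, err := threadID(reply)
	fmt.Println(id, err) // a@example.com <nil>
}
```

Grouping stored messages by this value gives a threaded view without asking the provider for one, as long as senders populate the headers.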

If you’re dealing with providers that don’t strictly adhere to the IMF, you’ll have to take a more fundamental, algorithmic approach to implement threading for your customers. For instance, if the subject line, group of participants, and other parameters are the same or similar, you can thread messages together. This basic tenet is the core of Gmail’s threading algorithm and has been proven to work incredibly well at scale. Beware if you take this approach, as it’s possible for multiple unrelated emails with the same subject line to be incorrectly threaded together (imagine the same friend wishing you “Happy Birthday!” every year). Another popular algorithm is the JWZ algorithm, designed by Jamie Zawinski and used by Netscape.
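To illustrate the subject-and-participants heuristic, here is a deliberately simple grouping key. It is not Gmail’s actual algorithm (which is not public) and it is not JWZ. The prefix list and the key format are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// replyPrefixes lists common reply and forward markers (an incomplete, English-only assumption).
var replyPrefixes = []string{"re:", "fwd:", "fw:"}

// normalizeSubject strips leading reply/forward markers and lowercases the rest.
func normalizeSubject(s string) string {
	s = strings.TrimSpace(s)
	for {
		stripped := false
		for _, p := range replyPrefixes {
			if len(s) >= len(p) && strings.EqualFold(s[:len(p)], p) {
				s = strings.TrimSpace(s[len(p):])
				stripped = true
				break
			}
		}
		if !stripped {
			return strings.ToLower(s)
		}
	}
}

// threadKey builds a grouping key from the normalized subject and the
// participant addresses (order and case of the addresses do not matter).
func threadKey(subject string, participants []string) string {
	ps := make([]string, 0, len(participants))
	for _, p := range participants {
		ps = append(ps, strings.ToLower(strings.TrimSpace(p)))
	}
	sort.Strings(ps)
	return normalizeSubject(subject) + "|" + strings.Join(ps, ",")
}

func main() {
	a := threadKey("Lunch on Friday?", []string{"ana@example.com", "bo@example.com"})
	b := threadKey("RE: Fwd: lunch on friday?", []string{"Bo@example.com", "ana@example.com"})
	fmt.Println(a == b) // true
}
```

A key like this reproduces the “Happy Birthday!” problem described above, so real implementations usually add a time window or prefer header-based threading whenever the headers are present.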

(The lack of) Email webhooks

In today’s digital landscape, real-time notifications are crucial for maintaining a seamless user experience. However, when it comes to IMAP providers, there’s a glaring gap: the absence of webhook support.

If you aim to achieve this, you’ll need to revisit your sync engine implementation from earlier. Upgrade it to calculate diffs between newly fetched results and what is currently in your data store, and generate webhook events from those diffs. In doing so, you’ll encounter the same issues of building an IMAP sync engine as discussed earlier (connection limits and timeouts, compliance with GDPR and other privacy laws, etc.) and some new ones.
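The diffing step can be sketched as a comparison between two snapshots of one folder, keyed by UID. The event names below are hypothetical, and the sketch assumes both snapshots were taken under the same UIDValidity (if it changed, the UIDs are not comparable and a full re-sync is needed instead).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Event is a hypothetical webhook payload: what happened to which UID.
type Event struct {
	Type string
	UID  uint32
}

// flagsKey turns a flag list into an order-independent comparison key.
func flagsKey(flags []string) string {
	c := append([]string(nil), flags...)
	sort.Strings(c)
	return strings.Join(c, " ")
}

// diffSnapshots compares what the data store holds for a folder (stored)
// with what was just fetched from the provider (fetched). Both map a UID
// to that message's flags. Events are returned sorted by UID.
func diffSnapshots(stored, fetched map[uint32][]string) []Event {
	var events []Event
	for uid, flags := range fetched {
		old, known := stored[uid]
		switch {
		case !known:
			events = append(events, Event{Type: "message.created", UID: uid})
		case flagsKey(old) != flagsKey(flags):
			events = append(events, Event{Type: "message.updated", UID: uid})
		}
	}
	for uid := range stored {
		if _, still := fetched[uid]; !still {
			events = append(events, Event{Type: "message.deleted", UID: uid})
		}
	}
	sort.Slice(events, func(i, j int) bool { return events[i].UID < events[j].UID })
	return events
}

func main() {
	stored := map[uint32][]string{1: {`\Seen`}, 2: {}, 3: {`\Seen`}}
	fetched := map[uint32][]string{1: {`\Seen`}, 3: {`\Seen`, `\Flagged`}, 4: {}}
	for _, e := range diffSnapshots(stored, fetched) {
		fmt.Println(e.Type, e.UID)
	}
}
```

Because the diff only sees two points in time, a message that was flagged and unflagged between syncs produces no event at all, which is one reason sync frequency matters so much here.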

The first issue relates to the robustness of your webhook system. IMAP providers don’t offer a change-log or a “historyID” like Gmail (Google) or Graph (Microsoft) from which to read the latest updates and/or transactions. Changes are instantaneous and irreversible. So, for instance, if you lose a connection and manage to reconnect later, any changes that occurred during the downtime are lost forever and not replayable.

Secondly, while IMAP does provide a mechanism to set up idle connections to listen to specific mailboxes, it’s far from perfect. At best, IMAP might inform you that a change occurred, represented by a sequence ID, but it won’t specify the nature of that change. This ambiguity can stem from various reasons:

  • The UIDValidity might have been altered.
  • The folder’s name could have been modified.
  • A myriad of operations could have been performed on any number of messages.

In such scenarios, your only recourse is a complete re-sync.

No other email API provider in the market offers real-time webhook support for IMAP providers, except for Nylas. Nylas allows end-users to replay webhooks if their systems are down during an incident.

Conclusion

Undoubtedly, many engineers/readers of this article will scoff at the above challenges and still decide to build the integration themselves.

However, if you:

  • strongly answered “no” to any of the questions posited in the introduction
  • have placed yourself on the right end of the spectrum shown earlier (max feature set + time investment)
  • are not equipped with a team to handle the various backend & security challenges discussed above

I encourage you to check out how the Nylas Email API could help your team today! Check out one of our quickstart guides and have a working email integration in under 15 minutes.
