Monitoring Application Email Delivery

Email Delivery Services

If you have ever used Mailgun or a similar email delivery service such as SendGrid with a shared IP pool, you will have noticed that some emails are occasionally blocked even though your own spam rate is very low or zero. This usually happens because another sender on the same shared IP has been abusing the service by sending spam, and the IP has been blacklisted. Mailgun and SendGrid do actively trace and resolve these issues, but in the meantime some of your emails never reach your users. If all you send are transactional emails that have high value to the users, this will affect their confidence in your service.

The easiest way to tackle this is to opt for a dedicated IP from the service provider. You are then in control of your IP reputation and can avoid such issues as long as you send legitimate emails.

But a dedicated IP does not make sense for every team. You might be starting a new project, or have a very limited number of emails to deliver even on an established product. In that case, it might be better to roll out your own email monitoring system, especially since it is easy to manage and maintain.

Email Monitoring

The main use case of our monitoring solution is to track email failures and retry the failed email (the retry might go out from a different IP in the shared pool and thereby avoid the blacklisting). If we experience repeated failures, we want to be notified so that we can intervene manually.

While all of this can be done in any framework, I recently needed it in an Elixir/Phoenix app, so that is what I will describe. I will assume Bamboo as the email library, but the approach should also work with Swoosh or another library.

Email Schema

The first step is to create a schema to track all emails that you are queueing from the application. Let’s call it MyApp.Communication.Email. Since this step will probably be different for a lot of apps, I will just give a very brief list of the fields I have considered for the schema.

schema "emails" do
  field :attempt_at, :utc_datetime
  field :attempt_count, :integer
  field :mailgun_id, :string
  field :mailgun_url, :string
  field :delivered_recipients, {:array, :string}

  embeds_one :last_delivery_status, DeliveryStatus, on_replace: :delete do
    field :code, :integer
    field :message, :string
    field :attempt_no, :integer
    field :description, :string
  end

  timestamps()
end

Feel free to add or remove the fields as per your needs.

Tracking Queued Emails

The next step is to add your own functions to the Mailer that track the email after it has been handed over to Mailgun.

@doc """
Delivers an email immediately and tracks its delivery.
"""
def deliver_now_with_tracking!(email) do
  case deliver_now(email, response: true) do
    {:error, error} ->
      raise error
    {:ok, _email, response} ->
      track_email(response)
      :ok
  end
end

@doc """
Queues an email for delivery and tracks its delivery.
"""
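def deliver_later_with_tracking(nil), do: {:error, "no email"}
def deliver_later_with_tracking(email) do
  # Deliver synchronously in the test environment, in a supervised task otherwise.
  # Caveat: Mix is not available at runtime inside a release, so in a released
  # app this check has to be resolved at compile time or read from config.
  if Mix.env() == :test do
    deliver_now_with_tracking!(email)
  else
    Task.Supervisor.start_child(Bamboo.TaskSupervisor, fn ->
      deliver_now_with_tracking!(email)
    end)
  end
end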

This should be fairly self-explanatory. In addition to Bamboo’s deliver_now! and deliver_later, we now have two more functions that also track the email. The only thing of note is that we pass response: true to Bamboo’s deliver_now to obtain the raw response from Mailgun. Mailgun responds with JSON in the format {"id": "<message_id>"}. Here is our tracking code, which just creates a new Email entry in the database:

defp track_email({:ok, %{"id" => id}}), do: MyApp.Communication.create_email(%{
  mailgun_id: id |> String.trim_leading("<") |> String.trim_trailing(">"),
  attempt_count: 1,
  attempt_at: Timex.now()
})
defp track_email(%{body: body, status_code: 200}), do: track_email(Jason.decode(body))
defp track_email(%Bamboo.Email{}) do
  if Mix.env() == :test, do: track_email(%{body: "{\"id\": \"<ID@mx.domain.ch>\"}", status_code: 200})
end
defp track_email(response), do: Logger.debug("Failed to track email. Response not recognized - #{inspect(response)}")
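
To see what ends up in mailgun_id, here is a small standalone sketch of the ID normalization from the first clause. The module name MessageId is hypothetical and exists only for illustration; it uses the same two String calls as the tracking code.

```elixir
defmodule MessageId do
  @moduledoc "Hypothetical helper, for illustration only."

  # Strips the angle brackets that wrap the ID in Mailgun's response,
  # so "<ID@mx.domain.ch>" is stored as "ID@mx.domain.ch".
  def normalize(id) when is_binary(id) do
    id
    |> String.trim_leading("<")
    |> String.trim_trailing(">")
  end
end

IO.inspect(MessageId.normalize("<ID@mx.domain.ch>"))
# prints "ID@mx.domain.ch"
```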

Receiving Mailgun Notifications

The next step is to create a controller that receives notifications from Mailgun through a webhook. I will leave out the router configuration since it varies from app to app. Just keep in mind that you need a pipeline that DOES NOT contain protect_from_forgery, because the requests originate from Mailgun and do not carry a CSRF token.

Mailgun notifications are formatted like this:

{
  "signature": {"token": "xx", "timestamp": 123, "signature": "xx"},
  "event-data": {
    "event": "delivered|failed",
    "message": {"headers": {"message-id": "xx"}},
    "recipient": "xx",
    "delivery-status": {},
    "storage": {"url": "xx"}
  }
}
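
To make the shape concrete, here is a minimal sketch of pulling the fields we care about out of a notification with pattern matching. The names WebhookPayload and extract/1 are hypothetical, and the sketch assumes the JSON body has already been decoded into a map with string keys.

```elixir
defmodule WebhookPayload do
  @moduledoc "Hypothetical helper, for illustration only."

  # Extracts the signature parts and the event from a decoded notification.
  # Returns {:ok, fields}, or :error when the payload has an unexpected shape.
  def extract(%{
        "signature" => %{"token" => token, "timestamp" => timestamp, "signature" => signature},
        "event-data" =>
          %{
            "event" => event,
            "message" => %{"headers" => %{"message-id" => message_id}}
          } = event_data
      }) do
    {:ok,
     %{
       token: token,
       timestamp: timestamp,
       signature: signature,
       event: event,
       message_id: message_id,
       # The full event map is kept so it can be handed to the event handler.
       event_data: event_data
     }}
  end

  def extract(_other), do: :error
end
```

The extracted message_id can then be used to look up the tracked Email by its mailgun_id, assuming both are stored in the same form.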

Inside the controller, there are two main aspects to responding to Mailgun notifications.

  1. Verify the signature: Here is how you can compute the signature in your app. Compare the result with the signature that Mailgun sends in the JSON payload.
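# Mailgun signs timestamp <> token. :crypto.mac/4 replaces :crypto.hmac/3, which was removed in OTP 24.
defp compute_signature(token, timestamp),
    do: :crypto.mac(:hmac, :sha256, mailgun_webhook_signing_key(), timestamp <> token) |> Base.encode16(case: :lower)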
  2. Handle the event: This is where things get interesting. We are only going to track the delivered and failed events as these are the most interesting ones for us.
defp handle_event(%{"event" => "delivered", "recipient" => recipient}, %Email{} = email), do: Communication.update_email(email, %{recipient: recipient})
defp handle_event(%{"event" => "failed", "delivery-status" => status, "storage" => %{"url" => url}, "recipient" => recipient} = event, %Email{} = email) do
  if retryable?(status, email) do
    attrs = %{
      last_delivery_status: status,
      attempt_at: Timex.now() |> Timex.shift(minutes: 4 |> :math.pow(email.attempt_count) |> floor()),
      mailgun_url: url,
      to: recipient
    }
    Communication.retry_email(email, attrs)
  else
    title = MyAppWeb.Gettext.dgettext("emails", "Email Delivery Abandoned")
    MyAppWeb.AdminEmail.email_alert(title, title, event) |> MyAppWeb.Mailer.deliver_later()
    {:ok, email}
  end
end
defp handle_event(_event, _email), do: {:error, :not_acceptable}

Here, if the email is delivered, we just add the recipient to our tracked email. If delivery failed, we check whether this is one of the emails that we want to retry (a simple rule could be to retry all emails with a status code between 500 and 554 and fewer than 5 attempts).

If it is, we retry it with an exponentially growing delay. Otherwise we abandon it and send an alert email to the admin so that they can intervene manually.
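
The two controller steps can be sketched together as plain functions. Everything here is an assumption for illustration: the module name WebhookSketch, the function names, and passing attempt_count directly instead of the Email struct. The signature covers the timestamp followed by the token, which is the order Mailgun's documentation describes, and the comparison runs in constant time (in a Phoenix app, Plug.Crypto.secure_compare/2 serves the same purpose).

```elixir
defmodule WebhookSketch do
  @moduledoc "Hypothetical sketch, for illustration only."

  @max_attempts 5

  # Step 1: recompute the HMAC-SHA256 over timestamp <> token and compare it
  # with the hex signature from the payload.
  def valid_signature?(signing_key, timestamp, token, signature) do
    expected =
      :crypto.mac(:hmac, :sha256, signing_key, to_string(timestamp) <> token)
      |> Base.encode16(case: :lower)

    secure_equal?(expected, signature)
  end

  # Step 2a: the simple rule from the text. Retry when the status code is
  # between 500 and 554 and fewer than 5 attempts have been made.
  def retryable?(%{"code" => code}, attempt_count)
      when is_integer(code) and code in 500..554 and attempt_count < @max_attempts,
      do: true

  def retryable?(_status, _attempt_count), do: false

  # Step 2b: minutes to wait before the next attempt, 4^attempt_count,
  # the same formula used for attempt_at in handle_event.
  def backoff_minutes(attempt_count) when is_integer(attempt_count) and attempt_count >= 0 do
    floor(:math.pow(4, attempt_count))
  end

  # Compares two binaries without stopping at the first differing byte.
  defp secure_equal?(a, b)
       when is_binary(a) and is_binary(b) and byte_size(a) == byte_size(b) do
    a
    |> :binary.bin_to_list()
    |> Enum.zip(:binary.bin_to_list(b))
    |> Enum.reduce(0, fn {x, y}, acc -> Bitwise.bor(acc, Bitwise.bxor(x, y)) end)
    |> Kernel.==(0)
  end

  defp secure_equal?(_a, _b), do: false
end
```

With these numbers, the first retry is scheduled 4 minutes out, the second 16, the third 64 and the fourth 256, after which the email is abandoned.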

Now, to our final step: how do we retry the email? Mailgun makes it really simple. Each email has a unique storage URL (this is the mailgun_url that we tracked on our email schema). We can POST to that URL with some form data to re-enqueue the email with new details. Here is a simple function on the mailer that does this:

@doc """
Schedules an email on Mailgun. Requires:
  * url - Mailgun Storage URL (like https://storage.eu.mailgun.net/v3/domains/mg.domain.ch/messages/BASE64KEY)
  * datetime - UTC Date Time to schedule
  * to - Recipient Address
Returns the raw response from Mailgun.

## Examples:
    iex> schedule_mailgun_email!(%{api_key: "foo"}, %{url: "xyz", datetime: DateTime.utc_now(), to: "some@example.com"})
    %{body: "{\"id\": \"<ID@mx.domain.ch>\"}"}
"""
def schedule_mailgun_email!(config, %{url: url, datetime: datetime, to: to}) do
  auth_key = Base.encode64("api:" <> config.api_key)
  HTTPoison.post!(
    url,
    {:form, [{"to", to}, {"o:deliverytime", datetime |> Timex.to_unix()}]},
    [Authorization: "Basic #{auth_key}"],
    [ssl: [versions: [:"tlsv1.2"]], follow_redirect: true]
  )
end
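
If you want to test the request without hitting Mailgun, one option is to build its parts in a pure function first. This is only a sketch: RetryRequest and build/3 are hypothetical names, and the delivery time is sent as a Unix timestamp string, mirroring the function above.

```elixir
defmodule RetryRequest do
  @moduledoc "Hypothetical helper, for illustration only."

  # Builds the headers and form fields for the POST without sending anything.
  def build(api_key, to, %DateTime{} = deliver_at) do
    %{
      # HTTP Basic auth with the user "api" and the API key as the password.
      headers: [{"Authorization", "Basic " <> Base.encode64("api:" <> api_key)}],
      form: [
        {"to", to},
        {"o:deliverytime", Integer.to_string(DateTime.to_unix(deliver_at))}
      ]
    }
  end
end
```

The returned headers and form list can then be passed on to the HTTP client of your choice.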

With this system in place, you can configure Mailgun to send webhook notifications to your app (from the Dashboard -> Sending -> Webhook), and failed emails will be tracked and retried automatically.

Published 2 May 2021

I build mobile and web applications. Full Stack, Rails, React, Typescript, Kotlin, Swift
Pulkit Goyal on Twitter