Processing forms on AMP pages with Amazon API Gateway and AWS Lambda

Learn how to implement and process forms on your AMP pages using Python and Amazon services for serverless applications.

AMP forms with API Gateway and Lambda

What is AMP?

AMP, or Accelerated Mobile Pages, is Google’s attempt to ease the frustration of mobile users when they open big and heavy websites on their devices. The three main components of AMP are:

  1. AMP HTML: a subset of HTML5 components for reliable performance and faster page loads.
  2. AMP JS: a library which implements best practices to render pages faster, e.g., loading external resources asynchronously, loading fonts before anything else, etc.
  3. AMP Cache: a set of proxy-based CDNs that fetch your AMP pages, cache them along with all the necessary resources and serve them to the user via HTTP/2. The best-known example of such a cache is, of course, Google AMP Cache.

However, all of these performance improvements come at a cost. No external stylesheets (except for white-listed font providers), no inline styles (which is not a good idea anyway). Scripts are allowed only as a part of AMP components or in iframes and only if they are non-blocking.

Why is this important?

Currently, AMP is seen as a mobile solution mainly for news articles and blog posts, and there is no evidence that implementing it influences search rankings. However, Google has already started to distinguish AMP pages in search results, and it is quite possible that at some point it will start ranking AMP pages higher. Other search engines also see potential in the technology. For example, Bing has supported it since September 2016.

Simple page

Let’s start with a minimal AMP page:

<!doctype html>
<html amp>
  <head>
    <meta charset="utf-8">
    <script async src="https://cdn.ampproject.org/v0.js"></script>
    <title>Hello AMP world</title>
    <link rel="canonical" href="hello-world.html">
    <meta name="viewport" content="width=device-width,minimum-scale=1,initial-scale=1">
    <style amp-boilerplate>body{-webkit-animation:-amp-start 8s steps(1,end) 0s 1 normal both;-moz-animation:-amp-start 8s steps(1,end) 0s 1 normal both;-ms-animation:-amp-start 8s steps(1,end) 0s 1 normal both;animation:-amp-start 8s steps(1,end) 0s 1 normal both}@-webkit-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-moz-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-ms-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-o-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}</style><noscript><style amp-boilerplate>body{-webkit-animation:none;-moz-animation:none;-ms-animation:none;animation:none}</style></noscript>
  </head>
  <body>
    <h1>Hello AMP World!</h1>
  </body>
</html>

There is not much going on yet, but let’s see how this one differs from a regular HTML page. To convert a page to AMP, you need to replace the <html> tag with <html ⚡> (or <html amp>, if you don’t like those fancy Unicode symbols) and include the standard boilerplate code that loads the AMP script, sets the viewport and applies basic styling to the page.

All your custom CSS code must be located in a single <style amp-custom> tag in the head of the document.

Let’s say we now want to add a simple Mailchimp form to collect subscribers’ emails. Easy, you might think: go to Mailchimp, grab the embedded form code and add it to your page. Well, no. Regular forms with a POST action are among the things that AMP blocks. The good news is that you can use POST forms with action-xhr, which sends the request asynchronously; the bad news is that it expects certain CORS headers that Mailchimp does not provide:

Google Chrome Console

This means you need a custom backend that will process the form input and send it to Mailchimp. If your site is only a collection of static files created, for example, with Jekyll and hosted on GitHub Pages or Amazon S3, then you face a choice: either host a server that both serves your pages and processes form data, or create a backend that only passes your data to Mailchimp and returns the correct response. In this article, we will explore the second approach.

POST Form

Let’s start by adding a form to the page. The first step is to load the amp-form component in the head. This is required according to the AMP specs:

The amp-form extension MUST be loaded if you’re using <form>, otherwise, your document will be invalid!

<script async 
        custom-element="amp-form" 
        src="https://cdn.ampproject.org/v0/amp-form-0.1.js">
</script> 

Now, we can add the form itself:

<form method="post"
      action-xhr="https://mybackend.com"
      target="_top">
    <input type="email" 
           placeholder="email address" 
           name="email"/>
</form>

The only difference from a regular form tag is that action-xhr is required when method is POST. You can also add divs with the submit-success and submit-error attributes, which become visible depending on the response status code (200 for submit-success, everything else for submit-error). Combined with the amp-mustache template component, this is a very convenient way to present status information to the user. Let’s look at the resulting code:

<!doctype html>
<html amp>
  <head>
    <meta charset="utf-8">
    <script async src="https://cdn.ampproject.org/v0.js"></script>
    <script async custom-element="amp-form" src="https://cdn.ampproject.org/v0/amp-form-0.1.js"></script>
    <script async custom-template="amp-mustache" src="https://cdn.ampproject.org/v0/amp-mustache-0.1.js"></script>
    <title>Hello AMP world</title>
    <link rel="canonical" href="hello-world.html">
    <meta name="viewport" content="width=device-width,minimum-scale=1,initial-scale=1">
    <style amp-boilerplate>body{-webkit-animation:-amp-start 8s steps(1,end) 0s 1 normal both;-moz-animation:-amp-start 8s steps(1,end) 0s 1 normal both;-ms-animation:-amp-start 8s steps(1,end) 0s 1 normal both;animation:-amp-start 8s steps(1,end) 0s 1 normal both}@-webkit-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-moz-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-ms-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@-o-keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}@keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}</style><noscript><style amp-boilerplate>body{-webkit-animation:none;-moz-animation:none;-ms-animation:none;animation:none}</style></noscript>
  </head>
  <body>
    <h1>Hello AMP World!</h1>
    <form method="post"
      action-xhr="https://mybackend.com"
      target="_top">
        <input type="email" placeholder="email address" name="email"/>
        <div submit-success>
            <template type="amp-mustache">
                Thanks! Check {{email}} to confirm your subscription to the newsletter.
            </template>
        </div>
        <div submit-error>
            <template type="amp-mustache">
                {{errorMessage}}
            </template>
        </div>
    </form>
  </body>
</html>

The page will submit the form asynchronously. If the response has status code 200, it will show the div with the submit-success attribute and substitute {{email}} with the email field from the response body. If the response has any other code (i.e. an error occurred), it will show the submit-error div and substitute {{errorMessage}} with the errorMessage field from the response body.

Backend

To process the request we will use a combination of Amazon API Gateway and a Lambda function written in Python. They allow you to run code without setting up any servers or infrastructure and let you get your backend working in minutes.

The important thing to notice is that action-xhr submits data as multipart/form-data, which API Gateway does not support natively, so the Lambda function has to parse it. The API Gateway setup is rather easy: we need it to work in “proxy” mode, that is, pass all the request data to Lambda and return all the response data from Lambda unchanged. This is necessary not only because API Gateway doesn’t support multipart form data, but also because amp-form requires response CORS headers that depend on the query string in the request. More on this later.

Lambda Function

Let’s create a function that will process the form data. Log in to your AWS account, select the Lambda service and click “Create a function”.

AWS Lambda Create Function

In Blueprints, select “Author from scratch”.

AWS Lambda From Scratch

In the “Basic Information” section, choose a name for your function and configure an execution role for Lambda. Roles deserve an article of their own, so if you are not sure how to configure one, for our current purposes you can select “Create a new role”, fill in a name and add “Simple Microservice permissions” to “Policy templates”.

AWS Lambda Basic Information

Click “Create function” and on the next screen set “Runtime” to Python 3.6. Leave the function code as it is for now.

AWS Lambda Runtime

Let’s now look at the default function code:

def lambda_handler(event, context):
    # TODO implement
    return "Hello from Lambda"

When a Lambda function is used with API Gateway in proxy mode, the event dictionary contains information about the request in the following form:

{
    "resource": "/",
    "path": "/",
    "httpMethod": "POST",
    "headers": {
        ...
    },
    "queryStringParameters": {
        ...
    },
    "pathParameters": "",
    "stageVariables": "",
    "requestContext": {
        ...
    },
    "body": "...",
    "isBase64Encoded": true/false
}

All the fields are pretty self-explanatory, and we are interested only in headers, queryStringParameters, and body. To understand how exactly we need to parse the request let’s see how the request is sent:

Headers:

accept:application/json
content-length:157
content-type:multipart/form-data; boundary=----WebKitFormBoundaryYoEf5GGRCRaKj2oT
origin:http://127.0.0.1:8000


QueryString parameters:

__amp_source_origin:http://127.0.0.1:8000


Request payload:

------WebKitFormBoundaryYoEf5GGRCRaKj2oT
Content-Disposition: form-data; name="email"

test@example.com
------WebKitFormBoundaryYoEf5GGRCRaKj2oT--

Nothing really interesting here: it’s a standard multipart/form-data request. To parse it we can use the good old cgi module. With it we can parse the content-type header to get the boundary and then parse the body itself:

from cgi import parse_header, parse_multipart
from io import BytesIO
 
def lambda_handler(event, context):
    # request headers as passed through by API Gateway in proxy mode
    headers = event["headers"]
    c_type, c_data = parse_header(headers["content-type"])
    
    # parse_multipart requires boundary to be a byte string
    c_data["boundary"] = bytes(c_data["boundary"], "utf-8")
    
    body_file = BytesIO(bytes(event["body"], "utf-8"))
    form_data = parse_multipart(body_file, c_data)

Now form_data contains a dictionary with all the data submitted through the form. Each value in the dictionary is a list of byte strings. In our case form_data will look like this:

{'email': [b'test@example.com']}
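Note that the cgi module was deprecated in Python 3.11 and removed in Python 3.13. On newer runtimes, one possible substitute is the standard email package, which can parse the same multipart body. The following is only a sketch under that assumption: body_bytes and parse_form are hypothetical helper names, not part of any AWS or AMP library, and the sketch also decodes the body when the event marks it as base64-encoded.

```python
import base64
from email import message_from_bytes


def body_bytes(event):
    """Return the raw request body, decoding base64 when the event says so."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        return base64.b64decode(body)
    return body.encode("utf-8")


def parse_form(content_type, body):
    """Parse a multipart/form-data body into {field name: [values]}."""
    # Prepend the Content-Type header so the email parser sees a MIME message.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    fields = {}
    if not message.is_multipart():
        return fields
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        if name is None:
            continue  # not a form field
        value = part.get_payload(decode=True).decode("utf-8")
        fields.setdefault(name, []).append(value)
    return fields
```

Unlike the byte strings shown above, the values returned by this sketch are regular strings, so for the sample request the result would be a dictionary with the key email and a one-element list containing the address.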

Great, now we know how to parse the request and get the necessary values. But how do we send the response back? Let’s figure out what we need to send first.

CORS Headers for AMP

AMP is very specific about what it expects to happen on your server. There is a pretty long description of what you should and shouldn’t do to meet the security requirements. For the purposes of this article we will concentrate only on one aspect of it, the headers it expects in the response:

  • Access-Control-Allow-Origin: <origin>. The header must be set to the same value as the Origin header in the request. When you implement checks of the allowed Origin, keep in mind that requests will originate not only from your domain but also from AMP cache domains.
  • AMP-Access-Control-Allow-Source-Origin: <source-origin>. The header must contain the value passed in the query string parameter __amp_source_origin.
  • Access-Control-Expose-Headers: AMP-Access-Control-Allow-Source-Origin. The sole purpose of this header is to let the AMP runtime read AMP-Access-Control-Allow-Source-Origin from the response.

Plus standard CORS headers:

  • Access-Control-Allow-Credentials: true
  • Access-Control-Allow-Headers: Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token
  • Access-Control-Allow-Methods: POST

And finally, the Content-Type header. As the accept header in the sample request shows, amp-form expects application/json.
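The origin check mentioned in the list above could be sketched as follows. This is an illustration, not a complete implementation: PUBLISHER_ORIGIN, cache_origin and is_allowed_origin are hypothetical names, the publisher domain is a placeholder, and the subdomain conversion is the simplified form of the AMP cache URL format (dashes doubled, dots turned into dashes). The full format has additional rules, for example for internationalised domain names, and other AMP caches use other suffixes.

```python
from urllib.parse import urlsplit

# Hypothetical publisher origin; replace it with your own site.
PUBLISHER_ORIGIN = "https://www.example.com"


def cache_origin(publisher_origin, cache_suffix="cdn.ampproject.org"):
    """Approximate the AMP cache origin for a publisher origin (simplified)."""
    host = urlsplit(publisher_origin).hostname
    # Double existing dashes first, then turn dots into dashes.
    subdomain = host.replace("-", "--").replace(".", "-")
    return "https://{}.{}".format(subdomain, cache_suffix)


def is_allowed_origin(origin, publisher_origin=PUBLISHER_ORIGIN):
    """Accept the publisher's own origin and its Google AMP Cache origin."""
    allowed = {publisher_origin, cache_origin(publisher_origin)}
    return origin in allowed
```

A handler could call is_allowed_origin with the request's Origin header and return an error status when it yields False, rather than echoing back whatever origin it receives.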

Create response in Lambda

First of all, we need to get the correct value for the headers that depend on the request data:

headers = event["headers"]
origin = headers.get("origin", "")
amp_source = event["queryStringParameters"].get("__amp_source_origin", "")
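The lookups above assume a lowercase origin header name and a request that carries a query string. In proxy mode, header names can arrive in whatever case the client sent them, and queryStringParameters is null when the URL has no query string, so a slightly more defensive version might look like the sketch below. The name request_origins is a hypothetical helper introduced for illustration.

```python
def request_origins(event):
    """Return (origin, amp_source_origin) from an API Gateway proxy event."""
    # Header names may arrive as "Origin" or "origin", so compare in lowercase.
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    # queryStringParameters is None when the URL has no query string.
    query = event.get("queryStringParameters") or {}
    return headers.get("origin", ""), query.get("__amp_source_origin", "")
```

Both values fall back to an empty string, which matches the defaults used in the snippet above.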

Now we can create the response. API Gateway expects output from a Lambda function to be a dictionary with the following keys:

  • statusCode — HTTP status code according to RFC7231
  • body — stringified JSON, body of the response
  • headers — a dictionary containing response headers

To wrap it up, here is how your code for returning response might look:

out = {}
out["statusCode"] = 200 # in case of error, send 4xx or 5xx code
out["body"] = json.dumps({"email": target_email}) # or json.dumps({"errorMessage": error_description}) in case of error
out["headers"] = {
    "Content-Type": "application/json",
    "Access-Control-Allow-Headers": "Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token",
    "Access-Control-Allow-Methods": "POST",
    "Access-Control-Allow-Origin": origin,
    "Access-Control-Allow-Credentials": "true",
    "Access-Control-Expose-Headers": "AMP-Access-Control-Allow-Source-Origin",
    "AMP-Access-Control-Allow-Source-Origin": amp_source
}
return out

Now let’s put it all together:

from cgi import parse_header, parse_multipart
from io import BytesIO
import json
 
def lambda_handler(event, context):
    # get all the necessary data from the request
    headers = event["headers"]
    c_type, c_data = parse_header(headers["content-type"])
    
    # parse_multipart requires boundary to be a byte string
    c_data["boundary"] = bytes(c_data["boundary"], "utf-8") 
    
    body_file = BytesIO(bytes(event["body"], "utf-8"))
    form_data = parse_multipart(body_file, c_data)
 
    origin = headers.get("origin", "")
    amp_source = event["queryStringParameters"].get("__amp_source_origin", "")
 
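    # here you get specific fields from form_data, do something useful 
    # (like calling the Mailchimp API), handle errors, etc.
    # parse_multipart returns byte strings on Python 3.6 and str on 3.7+
    raw_email = form_data["email"][0]
    target_email = raw_email.decode("utf-8") if isinstance(raw_email, bytes) else raw_email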
     
    # create a response
    out = {}
    out["statusCode"] = 200 # in case of an error, send 4xx or 5xx code
    
    # or json.dumps({"errorMessage": error_description}) in case of an error
    out["body"] = json.dumps({"email": target_email}) 
    
    out["headers"] = {
        "Content-Type": "application/json",
        "Access-Control-Allow-Headers": "Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token",
        "Access-Control-Allow-Methods": "POST",
        "Access-Control-Allow-Origin":  origin,
        "Access-Control-Allow-Credentials": "true",
        "Access-Control-Expose-Headers": "AMP-Access-Control-Allow-Source-Origin",
        "AMP-Access-Control-Allow-Source-Origin": amp_source
    }
    return out
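The handler above only shows the success path. One way to keep the success and error responses consistent is to build both through a single helper, so the CORS headers are never forgotten on errors. The sketch below assumes the same header values as the handler; make_response, success and failure are hypothetical names, and 400 is just one reasonable default for a rejected form.

```python
import json

# Static headers copied from the handler above.
BASE_HEADERS = {
    "Content-Type": "application/json",
    "Access-Control-Allow-Headers": "Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token",
    "Access-Control-Allow-Methods": "POST",
    "Access-Control-Allow-Credentials": "true",
    "Access-Control-Expose-Headers": "AMP-Access-Control-Allow-Source-Origin",
}


def make_response(status_code, payload, origin, amp_source):
    """Build a Lambda proxy response with the headers amp-form expects."""
    headers = dict(BASE_HEADERS)
    headers["Access-Control-Allow-Origin"] = origin
    headers["AMP-Access-Control-Allow-Source-Origin"] = amp_source
    return {
        "statusCode": status_code,
        "body": json.dumps(payload),
        "headers": headers,
    }


def success(email, origin, amp_source):
    """Response rendered by the submit-success template ({{email}})."""
    return make_response(200, {"email": email}, origin, amp_source)


def failure(message, origin, amp_source, status_code=400):
    """Response rendered by the submit-error template ({{errorMessage}})."""
    return make_response(status_code, {"errorMessage": message}, origin, amp_source)
```

With these helpers, the end of the handler could reduce to returning success(target_email, origin, amp_source), or failure("Invalid email address", origin, amp_source) when validation fails.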

API Gateway

The last part is to configure the API Gateway service to proxy requests to your Lambda function. Go to the API Gateway service in the AWS console and click “Get Started”. Select “New API” and give it a descriptive name.

AWS API Gateway Create New API

Now click “Actions”, then “Create Method”. Select POST in the drop-down and click the checkmark to confirm. Set “Integration type” to “Lambda function”, check “Use Lambda Proxy integration”, select the region where your function is located and enter the function name.

AWS API Gateway Add POST Method

Click “Save” to save the settings and then “OK” to create a trigger for the function. Now select “Actions” → “Deploy API”, select “New Stage”, give it a name (for example, “dev”, “test” or “prod”) and press “Deploy”.

AWS API Gateway Deploy API

You will see the stage URL that you now need to paste into the action-xhr parameter in your form in the HTML code.

AWS API Gateway Stage URL

Conclusion

The current trend of providing mobile users with a light and fast web experience is likely to stay, and so is AMP. One can argue that AMP is too strict and too Google-oriented, but more and more publishers are adopting the technology to keep their SEO in good shape. As the technology matures, more ready-made components will appear. For now, however, developers need a way of overcoming the limitations without hurting performance. Creating such middlemen to “proxy” requests to services that do not provide AMP components and don’t support the required CORS settings is an important step in keeping your AMP pages interactive and useful.

Although the AMP documentation is good and straightforward, we feel that it lacks implementation examples. The purpose of this article is to give an overview and a starting point for developers who wish to implement a backend for their AMP pages on AWS infrastructure in the easiest way.