Google App Engine: What Is It Good For?

Monday Mar 22nd 2010 by Mark Watson

Google App Engine offers Java and Python developers automatic scaling and potential cost savings -- if they properly design the applications to run on it.

As a developer, I'm enthusiastic about cloud computing platforms because they let me spend more time writing web applications and services and less time dealing with scalability and deployment issues. In particular, Google App Engine offers automatic scaling and potential cost savings if you design the applications to run on it with the proper discipline.

In this article, I provide an overview of the Google App Engine platform for developers. Along the way, I offer some tips for writing scalable and efficient Google App Engine applications.

Google App Engine Overview

I use Google App Engine for several of my own projects but I have not yet used it on any customer projects. Google engineers use Google App Engine to develop and deploy both internal and public web applications. As you will see, designing applications to run on Google App Engine takes some discipline.

The Datastore and App Efficiency and Scalability

The non-relational datastore for Google App Engine is based on Google's Bigtable system for storing and retrieving structured data. Bigtable can store petabyte-sized data collections, and Google uses Bigtable internally for web indexing and as data storage for user-facing applications such as Google Docs and Google Finance. Bigtable is built on top of the distributed Google File System (GFS). As a developer using Google App Engine, you can also create very large datastores.

The datastore uses a structured data model, and the unit of storage for this model is called an entity. The datastore is hierarchical, which provides a way to cluster data or to manage "contains" type relationships. The way this works is fairly simple: each entity has a (primary) key and an entity group. For a top-level entity, the entity group will simply be the (primary) key. For example, if I have a kind of entity (think of this as being a type or a class) called a Magazine, I might have an entity representing an issue of this magazine identified with a key value of /Magazine:programingillustrated0101 and the entity group value would be the same as the key. I might have another entity that is an article of kind Article that might have an entity group of /Magazine:programingillustrated0101 and a key of /Magazine:programingillustrated0101/Article:10234518. Thus, you know that this article belongs to this issue of the magazine.
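The relationship between keys and entity groups can be sketched in a few lines of plain Python. This is only a model of the idea, not the App Engine API: the helper names make_key, entity_group, parent, and format_key are hypothetical, and a key is represented simply as a tuple of (kind, name) pairs.

```python
# Hypothetical stand-in for datastore keys: a key is a path of (kind, name) pairs.
def make_key(*path):
    """Build a key from alternating kind/name values."""
    if not path or len(path) % 2 != 0:
        raise ValueError("path must be a non-empty sequence of kind/name pairs")
    return tuple(zip(path[0::2], path[1::2]))


def entity_group(key):
    """The entity group is the root (first) element of the key path."""
    return key[:1]


def parent(key):
    """Return the key of the containing entity, or None for a top-level entity."""
    return key[:-1] or None


def format_key(key):
    """Render a key in the /Kind:name/Kind:name notation used in the text."""
    return "".join("/%s:%s" % (kind, name) for kind, name in key)


issue = make_key("Magazine", "programingillustrated0101")
article = make_key("Magazine", "programingillustrated0101", "Article", "10234518")

print(format_key(article))
print(entity_group(article) == issue)  # the article lives in the issue's group
```

For a top-level entity such as the magazine issue, entity_group returns the key itself, which matches the description above.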

Entity groups also define those entities that can be updated atomically in a transaction. There is no schema for entities; you might have two entities of kind Article that have different properties. As an example, a second article might have an additional property relatedarticle that the first article does not have. The datastore also naturally supports multiple values of any property.
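To make the schemaless model concrete, here is a minimal sketch that uses plain Python dictionaries as stand-ins for two entities of kind Article. The titles, the tags property, and the property_matches helper are illustrative assumptions rather than part of the App Engine API.

```python
# Plain dicts stand in for datastore entities; this is not the App Engine API.
first_article = {
    "kind": "Article",
    "title": "Getting Started",           # hypothetical title
    "tags": ["python", "appengine"],      # one property holding multiple values
}
second_article = {
    "kind": "Article",
    "title": "Going Further",             # hypothetical title
    "tags": ["java"],
    "relatedarticle": "Getting Started",  # property the first entity lacks
}


def property_matches(entity, name, value):
    """True if the property equals value, or if any of its multiple values does."""
    if name not in entity:
        return False
    stored = entity[name]
    if isinstance(stored, list):
        return value in stored
    return stored == value


articles = [first_article, second_article]
python_articles = [a for a in articles if property_matches(a, "tags", "python")]
with_related = [a for a in articles if "relatedarticle" in a]
```

Both dictionaries share a kind but not a set of properties, and a filter on a multi-valued property matches when any one of its values matches.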

The primary technique for making your Google App Engine applications efficient and scalable is to rely on the datastore, rather than your application code, to sort and filter data. The next most important technique is caching the data used to answer HTTP requests so that it can be reused until it becomes "stale."

Java and Python Language Support

The two supported languages for Google App Engine are Java and Python. (You can read my article on implementing text indexing and search for an example of a Java application using Google App Engine.) I do not often use Python for Google App Engine development, but the APIs for accessing the datastore are very easy to use. Here is a short example of creating a Python class that is persisted to the datastore:

from google.appengine.api import users
from google.appengine.ext import db

class Article(db.Model):
    author = db.UserProperty()
    title = db.StringProperty(multiline=False)
    content = db.StringProperty(multiline=True)
    date = db.DateTimeProperty(auto_now_add=True)

# Fetch the ten most recent articles.
query = Article.all().order('-date')
articles = query.fetch(10)

# Fetch the articles written by the signed-in user.
my_articles = db.GqlQuery("SELECT * FROM Article WHERE author = :1", users.get_current_user())

The datastore will automatically create indexes for your structured data models.

Although Ruby is not an officially supported language on Google App Engine, a large community supports JRuby deployments on App Engine. Google App Engine also has DataMapper support.

Authentication Support

One of the compelling features of Google App Engine is the very simple integration with Google's single sign-on feature, which you probably use every day (sign on to Gmail and you are also signed on to Google Docs, etc.). In the previous Python code snippet, the data model for Article used a field author whose value is a user property. Getting the information for a logged-in user is simple:

from google.appengine.api import users
current_user = users.get_current_user()

If a user is not logged in, it is simple to create a URL link to Google's single sign-on login page:

url = users.create_login_url(self.request.uri)

The same example in Java would look like this:

import com.google.appengine.api.users.UserService;
import com.google.appengine.api.users.UserServiceFactory;
  // doGet servlet method:
  public void doGet(HttpServletRequest request, HttpServletResponse response)
                                                            throws IOException {
    UserService userService = UserServiceFactory.getUserService();
    // URL to return to after the user signs in:
    String thisURL = request.getRequestURI();
    String login_URL_link = userService.createLoginURL(thisURL);
    // etc.
  }

Using Memcache

Caching data for common HTTP requests, and reusing the cached data for a period that makes sense for the application, is standard practice. You probably already use a solution like memcached when deploying Java, Ruby on Rails, or other web applications on your own servers. The Google App Engine client library lets your web applications add key/value pairs to Memcache with a specified timeout period. In use, you check whether a key is in Memcache. If it is, use the cached value. Otherwise, compute the value and add the key/value pair to Memcache with an application-appropriate timeout period.
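The check-then-compute pattern described above can be sketched without the App Engine SDK. In this minimal example, TimedCache is a hypothetical in-process stand-in for a Memcache client, and cached is an illustrative helper. On App Engine itself the get and set steps would go through the memcache module in google.appengine.api instead.

```python
import time


class TimedCache:
    """Hypothetical in-process stand-in for a Memcache client."""

    def __init__(self, clock=time.time):
        self._items = {}     # key -> (value, expiry timestamp)
        self._clock = clock  # injectable so expiry can be exercised in tests

    def get(self, key):
        """Return the cached value, or None if it is missing or has expired."""
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]
            return None
        return value

    def set(self, key, value, timeout_seconds):
        """Store a value that expires timeout_seconds from now."""
        self._items[key] = (value, self._clock() + timeout_seconds)


def cached(cache, key, compute, timeout_seconds):
    """Use the cached value if present; otherwise compute and store it.

    A computed value of None is treated as a miss and is recomputed each time.
    """
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, timeout_seconds)
    return value
```

A request handler would call something like cached(cache, "front_page", render_front_page, 60), where render_front_page is whatever expensive function builds the response. The timeout is the application's decision about how stale the data may become.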

Using Task Queues

You cannot start and manage long-running processes on the Google App Engine platform. You must perform application-specific calculations in response to HTTP requests, and these calculations should be quick to avoid timeouts. Originally, you had to use cron services to send your application HTTP requests at desired times or intervals. The new (and still experimental) Task Queue support is nicer than cron because you can programmatically add tasks to a queue and specify how many tasks per second are taken from the queue and processed. These tasks are still processed as HTTP requests, though.
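The following sketch models the idea in plain Python so the moving parts are visible. It is not the Task Queue API: the TaskQueueSketch class, its method names, and the /tasks/reindex URL are hypothetical, and the fixed per-second drain is a simplification of how a real queue paces its work.

```python
from collections import deque


class TaskQueueSketch:
    """Hypothetical model: tasks are (url, params) pairs drained at a fixed rate."""

    def __init__(self, tasks_per_second):
        self.tasks_per_second = tasks_per_second
        self._pending = deque()

    def add(self, url, params=None):
        """Programmatically enqueue a task addressed to one of the app's URLs."""
        self._pending.append((url, dict(params or {})))

    def run_one_second(self, handle_request):
        """Dispatch at most tasks_per_second tasks to an HTTP-style handler."""
        dispatched = 0
        while self._pending and dispatched < self.tasks_per_second:
            url, params = self._pending.popleft()
            handle_request(url, params)  # each task arrives as a short request
            dispatched += 1
        return dispatched


queue = TaskQueueSketch(tasks_per_second=2)
for article_number in range(5):
    queue.add("/tasks/reindex", {"article": article_number})

handled = []
counts = [queue.run_one_second(lambda url, params: handled.append(params["article"]))
          for _ in range(4)]
```

The point of the design is that a large job is split into many small tasks, each of which fits comfortably within the request time limit.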

Using URL Fetch

Your Google App Engine applications can fetch information from external services using the URL Fetch service. Any fetch operations should execute quickly to avoid timeouts. Returned data payloads are also restricted to 1 megabyte of data.
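If your application also needs to behave sensibly when a remote service returns more data than expected, one option is to enforce a cap yourself while reading. The sketch below is plain Python rather than the URL Fetch API: read_capped and ResponseTooLargeError are hypothetical names, and reading "1 megabyte" as 1024 * 1024 bytes is an assumption.

```python
import io

MAX_RESPONSE_BYTES = 1024 * 1024  # assumption: "1 megabyte" read as 2**20 bytes


class ResponseTooLargeError(Exception):
    """Raised when a payload exceeds the configured limit."""


def read_capped(stream, limit=MAX_RESPONSE_BYTES, chunk_size=64 * 1024):
    """Read a binary file-like object, refusing payloads larger than limit."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > limit:
            raise ResponseTooLargeError("payload exceeds %d bytes" % limit)
        chunks.append(chunk)
    return b"".join(chunks)


# Usage with an in-memory stream standing in for a fetched response body.
body = read_capped(io.BytesIO(b"small payload"))
```

Reading in chunks means an oversized payload is rejected as soon as the limit is crossed instead of after the whole body has been buffered.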

Development Tools

The most convenient tools for Google App Engine development are those for the Python language. For example, it is relatively easy in Python to import/export databases to the Google App Engine datastore for any of your applications. The support for this in Java is basically non-existent.

For Python, I use the Python App Engine SDK, keep the local test app server running, and edit Python code and HTML templates with a simple text editor such as Emacs, TextMate, or gedit. The test app server notices changed files automatically, so the development process is agile and interactive.

For Java, I usually use Eclipse with the Google-supplied plugins, which can create new projects, manage projects for local testing, and upload applications to Google App Engine. I also like IntelliJ, which has very good Google App Engine support.

I find local development using either Java or Python to be fun and productive because the tool support is very good.

Costs for Using Google App Engine

The reason that Google App Engine is a low-cost solution for startups is that the free daily quota of resources is very generous. For efficiently written applications, you can support many users before you need to enable billing. I recommend that you carefully review the quotas and billing rates that you may incur if your application attracts a very large number of users, which is the type of "problem" that you are probably hoping for!

To take full advantage of Google's relatively low hosting costs, apply the tips in this article to design web applications that can run within Google App Engine's limited runtime environment.

About the author

Mark Watson is a consultant living in the mountains of Central Arizona with his wife Carol and a very feisty Meyer's parrot. He specializes in web applications, text mining, and artificial intelligence. He is the author of 16 books and writes both a technology blog and an artificial intelligence blog at
