Implementing OAuth using Google's Python Client Library
#######################################################

:date: 2011-07-04 13:00
:tags: Python, OAuth, Google, GData
:slug: implementing-oauth-using-google-s-python-client-library
:author: Brian Neal

My Django_ powered website allows users to submit events for a site calendar
that is built upon Google Calendar.  After an admin approves events, I use
Google's `Python Client Library`_ to add, delete, or update events on the Google
calendar associated with my personal Google account. I wrote this application a
few years ago, and it used the ClientLogin_ method for authentication. I
recently decided to upgrade this to the OAuth_ authentication method. The
ClientLogin method isn't very secure and it doesn't play well with Google's
`two-step verification`_. After hearing about a friend who had his Gmail account
compromised and all his email deleted, I decided it was long past due to get
two-step verification on my account. But first I needed to upgrade my web
application to OAuth.

In this post I'll boil down the code I used to implement the elaborate OAuth
dance. It really isn't that much code, but the Google documentation is somewhat
confusing and scattered across a bewildering number of documents. I found at
least one error in the documentation that I will point out. Although I am using
Django, I will omit details specific to Django where I can.

In addition to switching from ClientLogin to OAuth, I also upgraded to version
2.0 of the Google Data API. This had more implications for my calendar-specific
code, and perhaps I can go over that in a future post.

Getting started and registering with Google
===========================================

To understand the basics of OAuth, I suggest you read `OAuth 1.0 for Web
Applications`_. I decided to go for maximum security and use RSA-SHA1 signing on
all my requests to Google. This requires that I verify my domain and then
`register my application`_ with Google, which includes uploading a security
certificate. Google provides documentation that describes how you can `create a
self-signing private key and certificate`_ using OpenSSL.

Fetching a Request Token and authorizing access
===============================================

To perform the first part of the OAuth dance, you must ask Google for a request
token. When you make this request, you state the "scope" of your future work by
listing the Google resources you are going to access. In our case, these are the
calendar resources. You also provide a "consumer key" that Google assigned to
you when you registered your application. This allows Google to retrieve the
security certificate you uploaded when you registered. This is very
important because this request is going to be signed with your private key.
Fortunately, the Python library takes care of all the signing details; you simply
must provide your private key in PEM format. And finally, you provide a
"callback URL" that Google will send your browser to after you (or your users)
have manually authorized this request.
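To make "signing" a little more concrete, here is a rough sketch of the string an OAuth 1.0 client signs, as I understand it from RFC 5849. It uses only the standard library and is independent of Google's library; the function names, the endpoint URL, and the parameter values are illustrative assumptions, and a real request carries more parameters than shown.

```python
from urllib.parse import quote


def percent_encode(value):
    """Percent-encode a value, leaving only letters, digits, '-', '.', '_', '~'."""
    return quote(str(value), safe="~")


def signature_base_string(method, base_url, params):
    """Join the method, the URL, and the sorted parameters into one string."""
    # Encode every name and value first, then sort the encoded pairs.
    encoded = sorted(
        (percent_encode(k), percent_encode(v)) for k, v in params.items()
    )
    normalized = "&".join("{}={}".format(k, v) for k, v in encoded)
    return "&".join(
        [method.upper(), percent_encode(base_url), percent_encode(normalized)]
    )


# Illustrative values only; a real request also carries a nonce, a timestamp,
# the signature method, and so on.
example = signature_base_string(
    "get",
    "https://www.google.com/accounts/OAuthGetRequestToken",
    {
        "scope": "https://www.google.com/calendar/feeds/",
        "oauth_consumer_key": "example.com",
        "oauth_callback": "http://example.com/some/url/to/callback",
    },
)
```

With RSA-SHA1, it is a string like ``example`` that gets signed with the private key, which is why Google needs the matching certificate to verify the request.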

Once you have received the request token from Google, you have to squirrel it
away somewhere, then redirect your (or your user's) browser to a Google
authorization page. Once the user has authorized your application, Google sends
the browser to the callback URL to continue the process. Here I show the
distilled code I used that asks for a request token, then sends the user to the
authorization page.

.. sourcecode:: python

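    import gdata.gauth
    from gdata.calendar_resource.client import CalendarResourceClient

    # Django-specific imports used further down:
    from django.conf import settings
    from django.http import HttpResponseRedirect

    USER_AGENT = 'mydomain-myapp-v1'  # my made up user agent string
    # Session key for the request token; the name itself is arbitrary:
    REQ_TOKEN_SESSION_KEY = 'google-oauth-request-token'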

    client = CalendarResourceClient(None, source=USER_AGENT)

    # obtain my private key that I saved previously on the filesystem:
    with open(settings.GOOGLE_OAUTH_PRIVATE_KEY_PATH, 'r') as f:
        rsa_key = f.read()

    # Ask for a request token:
    # scopes - a list of scope strings that the request token is for. See
    # http://code.google.com/apis/gdata/faq.html#AuthScopes
    # callback_url - URL to send the user after authorizing our app

    scopes = ['https://www.google.com/calendar/feeds/']
    callback_url = 'http://example.com/some/url/to/callback'

    request_token = client.GetOAuthToken(
            scopes,
            callback_url,
            settings.GOOGLE_OAUTH_CONSUMER_KEY, # from the registration process
            rsa_private_key=rsa_key)

    # Before redirecting, save the request token somewhere; here I place it in
    # the session (this line is Django specific):
    request.session[REQ_TOKEN_SESSION_KEY] = request_token

    # Generate the authorization URL.
    # Despite the documentation, don't do this:
    #    auth_url = request_token.generate_authorization_url(domain=None)
    # Do this instead if you are not using a Google Apps domain:
    auth_url = request_token.generate_authorization_url()

    # Now redirect the user somehow to the auth_url; here is how you might do
    # it in Django:
    return HttpResponseRedirect(auth_url)

A couple of notes on the above:

* You don't have to use ``CalendarResourceClient``; it just made the most sense
  for me since I am doing calendar stuff later on. Any class that inherits from
  ``gdata.client.GDClient`` will work. You might be able to use that class
  directly. Google uses ``gdata.docs.client.DocsClient`` in their examples.
* I chose to store my private key in a file rather than the database. If you do
  so, it's probably a good idea to make the file readable only to the user your
  webserver runs your application as.
* After getting the request token you must save it somehow. You can save it in
  the session, the database, or perhaps a file. Since this is only temporary, I
  chose to save it in the session. The code I have here is Django specific.
* When generating the authorization URL, don't pass in ``domain=None`` if you
  aren't using a Google Apps domain, even though the documentation says to. This
  appears to be an error in the documentation. Just omit the argument and let it
  use the default value of ``"default"`` (see the source code).
* After using the request token to generate the authorization URL, redirect the
  browser to it.
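For the note about the private key file, here is a minimal sketch of how the permissions could be tightened and checked from Python. It assumes a POSIX system, and the helper names ``restrict_to_owner`` and ``is_owner_only`` are made up for illustration. The demonstration uses a temporary file rather than a real key.

```python
import os
import stat
import tempfile


def restrict_to_owner(path):
    """Set the file mode to 0600: read/write for the owner, nothing for others."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def is_owner_only(path):
    """Return True if neither group nor others have any permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


# Demonstration on a throwaway file (not a real key):
fd, demo_path = tempfile.mkstemp()
os.close(fd)
os.chmod(demo_path, 0o644)          # start out group/world readable
before = is_owner_only(demo_path)
restrict_to_owner(demo_path)
after = is_owner_only(demo_path)
os.remove(demo_path)
```

A check like ``is_owner_only`` could run at startup so that a misconfigured deployment is noticed early. The file would also need to be owned by the user the web server runs as.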

Extracting and upgrading to an Access Token
===========================================

The user will then be taken to a Google authorization page. The page will show the
user what parts of their Google account your application is trying to access
using the information you provided in the ``scopes`` parameter. If the user
accepts, Google will then redirect the browser to your callback URL where we can
complete the process.

The code running at our callback URL must retrieve the request token that we
saved earlier, and combine that with certain ``GET`` parameters Google attached
to our callback URL. This is all done for us by the Python library. We then send
this new token back to Google to upgrade it to an actual access token. If this
succeeds, we can then save this new access token in our database for use in
subsequent Google API operations. The access token is a Python object, so you
can serialize it using the pickle module, or use routines provided by Google
(shown below).
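For the curious, the ``GET`` parameters in question can be inspected with the standard library. In OAuth 1.0a the callback carries ``oauth_token`` and ``oauth_verifier``; that these are exactly the ones Google's library reads is my assumption. The helper name and the URL values below are made up.

```python
from urllib.parse import parse_qs, urlparse


def callback_params(callback_url):
    """Return (oauth_token, oauth_verifier); a missing parameter becomes None."""
    query = parse_qs(urlparse(callback_url).query)
    token = query.get("oauth_token", [None])[0]
    verifier = query.get("oauth_verifier", [None])[0]
    return token, verifier


# Made-up values, for illustration only.
url = ("http://example.com/some/url/to/callback"
       "?oauth_verifier=abc123&oauth_token=4%2Fxyz")
token, verifier = callback_params(url)
```

This is only for understanding or debugging; in the real flow the library does the extraction itself.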

.. sourcecode:: python

    # Code running at our callback URL:
    # Retrieve the request token we saved earlier in our session
    saved_token = request.session[REQ_TOKEN_SESSION_KEY]

    # Authorize it by combining it with GET parameters received from Google
    request_token = gdata.gauth.AuthorizeRequestToken(saved_token,
                        request.build_absolute_uri())

    # Upgrade it to an access token
    client = CalendarResourceClient(None, source=USER_AGENT)
    access_token = client.GetAccessToken(request_token)

    # Now save access_token somewhere, e.g. a database. So first serialize it:
    access_token_str = gdata.gauth.TokenToBlob(access_token)

    # Save to database (details omitted)

Some notes on the above code:

* Once called back, our code must retrieve the request token we saved in our
  session. The code shown is specific to Django.
* We then combine this saved request token with certain ``GET`` parameters that
  Google added to our callback URL. The ``AuthorizeRequestToken`` function takes care of
  those details for us. The second argument to that function requires the full URL
  including ``GET`` parameters as a string. Here I populate that argument by
  using a Django-specific method of retrieving that information.
* Finally, you upgrade your token to an access token by making one last call to
  Google. You should now save a serialized version of this access token in your
  database for future use.
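I left out the database details above because they depend on your setup. As a rough sketch of the idea, here is one way to store and retrieve a serialized token string using the standard ``sqlite3`` module. The table name, column names, and helper functions are hypothetical, and a placeholder string stands in for the real serialized token.

```python
import sqlite3


def _ensure_table(conn):
    """Create the (hypothetical) token table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS oauth_tokens ("
        "name TEXT PRIMARY KEY, token_blob TEXT NOT NULL)"
    )


def save_token(conn, name, token_blob):
    """Insert or replace the serialized token stored under `name`."""
    _ensure_table(conn)
    conn.execute(
        "INSERT OR REPLACE INTO oauth_tokens (name, token_blob) VALUES (?, ?)",
        (name, token_blob),
    )
    conn.commit()


def load_token(conn, name):
    """Return the stored string for `name`, or None if nothing is stored."""
    _ensure_table(conn)
    row = conn.execute(
        "SELECT token_blob FROM oauth_tokens WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
# "placeholder-blob" stands in for the string returned by TokenToBlob().
save_token(conn, "calendar", "placeholder-blob")
stored = load_token(conn, "calendar")
```

In a Django project a small model with a text field would do the same job. Either way, treat the stored token like a password, since it grants access to the account.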

Using your shiny new Access Token
=================================

Once you have saved your access token, you won't have to do this crazy dance
again until the token expires or the user revokes your application's
access to the Google account. To use it in a calendar operation, for example,
you simply retrieve it from your database, deserialize it, and then use it to
create a ``CalendarClient``.

.. sourcecode:: python
  
    import gdata.gauth
    from gdata.calendar.client import CalendarClient

    # retrieve access token from the database:
    access_token_str = ...
    access_token = gdata.gauth.TokenFromBlob(access_token_str)

    client = CalendarClient(source=USER_AGENT, auth_token=access_token)

    # now use client to make calendar operations...

Conclusion
==========

The main reason I wrote this blog post is that I wanted to show a concrete
example of using RSA-SHA1 and version 2.0 of the Google API together. All of the
information I have presented is in the Google documentation, but it is spread
across several documents and jumbled up with example code for version 1.0 and
HMAC-SHA1.  Do not be afraid to look at the source code for the Python client
library.  Despite Google's strange habit of ignoring PEP-8_ and using
LongJavaLikeMethodNames, the code is logical and easy to read. Their library is
built up in layers, and you may have to dip down a few levels to find out what
is going on, but it is fairly straightforward to read if you combine it with
their online documentation.

I hope someone finds this useful. Your feedback is welcome.


.. _Django: http://djangoproject.com
.. _Python Client Library: http://code.google.com/apis/calendar/data/2.0/developers_guide_python.html
.. _ClientLogin: http://code.google.com/apis/calendar/data/2.0/developers_guide_python.html#AuthClientLogin
.. _OAuth: http://code.google.com/apis/gdata/docs/auth/oauth.html
.. _two-step verification: http://googleblog.blogspot.com/2011/02/advanced-sign-in-security-for-your.html
.. _OAuth 1.0 for Web Applications: http://code.google.com/apis/accounts/docs/OAuth.html
.. _register my application: http://code.google.com/apis/accounts/docs/RegistrationForWebAppsAuto.html
.. _create a self-signing private key and certificate: http://code.google.com/apis/gdata/docs/auth/oauth.html#GeneratingKeyCert
.. _PEP-8: http://www.python.org/dev/peps/pep-0008/