A better "Who's Online" with Redis & Python
###########################################

:date: 2011-04-25 12:00
:tags: Redis, Python
:slug: a-better-who-s-online-with-redis-python
:author: Brian Neal
:summary: Still trying to find a better "who's online" function. I ran with this method for a while, but later found a way to improve upon it.

**Updated on December 17, 2011:** I found a better solution. Head on over to
the `new post`_ to check it out.


Who's What?
-----------

My website, like many others, has a "who's online" feature. It displays the
names of authenticated users that have been seen over the course of the last ten
minutes or so. It may seem a minor feature at first, but I find it really does a lot to
"humanize" the site and make it seem more like a community gathering place.

My first implementation of this feature used the MySQL database to update a
per-user timestamp whenever a request from an authenticated user arrived.
Actually, this seemed excessive to me, so I used a strategy involving an "online"
cookie that has a five-minute expiration time. Whenever I see an authenticated
user without the online cookie, I update their timestamp and then hand them back
a cookie that will expire in five minutes. In this way I don't have to hit the
database on every single request.

This approach worked fine, but some aspects of it didn't sit right with me:

* It seems like overkill to use the database to store temporary, trivial information like
  this. It doesn't feel like a good use of a full-featured relational database
  management system (RDBMS).
* I am writing to the database during a GET request. Ideally, all GET requests should
  be idempotent. Of course, if this is strictly followed, it would be
  impossible to create a "who's online" feature in the first place. You'd have
  to require the user to POST data periodically. However, writing to an RDBMS
  during a GET request is something I feel guilty about and try to avoid when I
  can.


Redis
-----

Enter Redis_. I discovered Redis recently, and it is pure, white-hot
awesomeness. What is Redis? It's one of those projects that gets slapped with
the "NoSQL" label. And while I'm still trying to figure that buzzword out, Redis makes
sense to me when described as a lightweight data structure server.
Memcached_ can store key-value pairs very fast, where the value is always a string.
Redis goes one step further and stores not only strings, but data
structures like lists, sets, and hashes. For a great overview of what Redis is
and what you can do with it, check out `Simon Willison's Redis tutorial`_.

Another reason why I like Redis is that it is easy to install and deploy.
It is straight C code without any dependencies. Thus you can build it from
source just about anywhere. Your Linux distro may have a package for it, but it
is just as easy to grab the latest tarball and build it yourself.

I've really come to appreciate Redis for being such a small and lightweight
tool. At the same time, it is very powerful and effective for handling those
tasks that a traditional RDBMS is not good at.

For working with Redis in Python, you'll need to grab Andy McCurdy's redis-py_
client library. It can be installed with pip:

.. sourcecode:: sh

    $ sudo pip install redis


Who's Online with Redis
-----------------------

Now that we are going to use Redis, how do we implement a "who's online"
feature? The first step is to get familiar with the `Redis API`_.

One approach to the "who's online" problem is to add a user name to a set
whenever we see a request from that user. That's fine but how do we know when
they have stopped browsing the site? We have to periodically clean out the
set in order to time people out. A cron job, for example, could delete the
set every five minutes.

A small problem with deleting the set is that people will abruptly disappear
from the site every five minutes. In order to give more gradual behavior we
could utilize two sets, a "current" set and an "old" set. As users are seen, we
add their names to the current set. Every five minutes or so (season to taste),
we simply overwrite the old set with the contents of the current set, then clear
out the current set. At any given time, the set of who's online is the union
of these two sets.

This approach doesn't give exact results of course, but it is perfectly fine for my site.

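To make the rotation concrete, here is a minimal in-memory sketch that uses
plain Python sets in place of Redis. The ``TwoSetTracker`` class and its method
names are hypothetical, made up purely for illustration; the Redis-backed
functions appear later in this post.

.. sourcecode:: python

    class TwoSetTracker(object):
        """Toy model of the "current" and "old" sets (no Redis involved)."""

        def __init__(self):
            self.current = set()
            self.old = set()

        def seen(self, username):
            # Record that a user made a request during this interval.
            self.current.add(username)

        def tick(self):
            # Overwrite the old set with the current one, then start afresh.
            self.old = self.current
            self.current = set()

        def online(self):
            # Who's online is the union of the two sets.
            return self.current | self.old


    tracker = TwoSetTracker()
    tracker.seen("alice")
    tracker.tick()                    # alice moves to the old set
    tracker.seen("bob")
    print(sorted(tracker.online()))   # ['alice', 'bob']
    tracker.tick()                    # alice ages out; bob moves to the old set
    print(sorted(tracker.online()))   # ['bob']

With this scheme a user stays listed for between one and two intervals after
their last request.
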
Looking over the Redis API, we see that we'll be making use of the following
commands:

* SADD_ for adding members to the current set.
* RENAME_ for moving the current set to the old, which also destroys the
  current set, all in one step.
* SUNION_ for performing a union on the current and old sets to produce the set
  of who's online.

And that's it! With these three primitives we have everything we need. This is
because of the following useful Redis behaviors:

* Performing a ``SADD`` against a set that doesn't exist creates the set and is
  not an error.
* Performing a ``SUNION`` with sets that don't exist is fine; they are simply
  treated as empty sets.

The one caveat involves the ``RENAME`` command. If the key you wish to rename
does not exist, Redis reports an error, which the Python Redis client raises as
an exception.

Experimenting with algorithms and ideas is quite easy with Redis. You can either
use the Python Redis client in a Python interactive interpreter shell, or you can
use the command-line client that comes with Redis. Either way you can quickly
try out commands and refine your approach.


Implementation
--------------

My website is powered by Django_, but I am not going to show any Django-specific
code here. Instead I'll show just the pure Python parts, and hopefully you can
adapt them to whatever framework, if any, you are using.

I created a Python module to hold this functionality:
``whos_online.py``. Throughout this module I use a lot of exception handling,
mainly because if the Redis server has crashed (or if I forgot to start it, say
in development) I don't want my website to be unusable. If Redis is unavailable,
I simply log an error and drive on. Note that in my limited experience Redis is
very stable and has not crashed on me once, but it is good to be defensive.

The first important function used throughout this module is a function to obtain
a connection to the Redis server:

.. sourcecode:: python

    import logging
    import redis

    logger = logging.getLogger(__name__)

    def _get_connection():
        """
        Create and return a Redis connection. Returns None on failure.
        """
        try:
            conn = redis.Redis(host=HOST, port=PORT, db=DB)
            return conn
        except redis.RedisError as e:
            logger.error(e)

        return None

The ``HOST``, ``PORT``, and ``DB`` constants can come from a
configuration file or they could be module-level constants. In my case they are set in my
Django ``settings.py`` file. Once we have this connection object, we are free to
use the Redis API exposed via the Python Redis client.

To update the current set whenever we see a user, I call this function:

.. sourcecode:: python

    # Redis key names:
    USER_CURRENT_KEY = "wo_user_current"
    USER_OLD_KEY = "wo_user_old"

    def report_user(username):
        """
        Call this function when a user has been seen. The username will be added to
        the current set.
        """
        conn = _get_connection()
        if conn:
            try:
                conn.sadd(USER_CURRENT_KEY, username)
            except redis.RedisError as e:
                logger.error(e)

If you are using Django, a good spot to call this function is from a piece
of `custom middleware`_. I kept my "5 minute cookie" algorithm to avoid doing this on
every request, although it is probably unnecessary on my low-traffic site.

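The cookie check itself needs no Django. The sketch below is one hypothetical
way to structure it, not the exact code from my site: ``maybe_report`` and the
cookie name ``online`` are assumptions made for illustration, ``cookies`` stands
for whatever dict-like object your framework provides for request cookies, and
``report`` is a callable such as ``report_user``.

.. sourcecode:: python

    ONLINE_COOKIE = "online"      # hypothetical cookie name
    ONLINE_COOKIE_AGE = 5 * 60    # seconds; the five minute expiration


    def maybe_report(username, cookies, report):
        """
        Report the user unless the "online" cookie is still present.

        Returns a (name, value, max_age) tuple describing the cookie to set
        on the response, or None if the user was reported recently.
        """
        if ONLINE_COOKIE in cookies:
            return None
        report(username)
        return (ONLINE_COOKIE, "1", ONLINE_COOKIE_AGE)

In Django middleware, the returned tuple would map onto
``response.set_cookie(name, value, max_age=max_age)``.
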
Periodically you need to "age out" the sets by destroying the old set, moving
the current set to the old set, and then emptying the current set.

.. sourcecode:: python

    def tick():
        """
        Call this function to "age out" the old set by renaming the current set
        to the old.
        """
        conn = _get_connection()
        if conn:
            # An exception may be raised if the current key doesn't exist; if that
            # happens we have to delete the old set because no one is online.
            try:
                conn.rename(USER_CURRENT_KEY, USER_OLD_KEY)
            except redis.ResponseError:
                try:
                    del conn[USER_OLD_KEY]
                except redis.RedisError as e:
                    logger.error(e)
            except redis.RedisError as e:
                logger.error(e)

As mentioned previously, if no one is on your site, eventually your current set
will cease to exist as it is renamed and not populated further. If you attempt to
rename a non-existent key, the Python Redis client raises a ``ResponseError`` exception.
If this occurs we just manually delete the old set. In a bit of Pythonic cleverness,
the Python Redis client lets you use the ``del`` statement on a key to perform this operation.

The ``tick()`` function can be called periodically by a cron job, for example. If you are using Django,
you could create a `custom management command`_ that calls ``tick()`` and schedule cron
to execute it. Alternatively, you could use something like Celery_ to schedule a
job to do the same. (As an aside, Redis can be used as a back-end for Celery, something that I hope
to explore in the near future.)

Finally, you need a way to obtain the current "who's online" set, which again is
a union of the current and old sets.

.. sourcecode:: python

    def get_users_online():
        """
        Returns a set of user names which is the union of the current and old
        sets.
        """
        conn = _get_connection()
        if conn:
            try:
                # Note that keys that do not exist are considered empty sets
                return conn.sunion([USER_CURRENT_KEY, USER_OLD_KEY])
            except redis.RedisError as e:
                logger.error(e)

        return set()

In my Django application, I call this function from a `custom inclusion template tag`_.


Conclusion
----------

I hope this blog post gives you some idea of the usefulness of Redis. I expanded
on this example to also keep track of non-authenticated "guest" users. I simply added
another pair of sets to track IP addresses.

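As a rough sketch of that extension, the reporting and lookup functions could
take the connection and a pair of key names as arguments, so that users and
guests share the same code. The ``wo_guest_*`` key names and the function names
below are hypothetical. ``conn`` is assumed to be a redis-py client like the one
returned by ``_get_connection()``, and error handling is omitted for brevity.

.. sourcecode:: python

    # Key pairs: (current set, old set). The guest names are made up.
    USER_KEYS = ("wo_user_current", "wo_user_old")
    GUEST_KEYS = ("wo_guest_current", "wo_guest_old")


    def report(conn, keys, member):
        """Add a user name or IP address to the current set of a pair."""
        conn.sadd(keys[0], member)


    def get_online(conn, keys):
        """Return the union of the current and old sets of a pair."""
        return conn.sunion(list(keys))

A call such as ``report(conn, GUEST_KEYS, ip_address)`` would then be made for
anonymous requests, and the periodic job would rename both pairs.
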
If you are like me, you are probably already thinking about shifting some functions that you
awkwardly jammed onto a traditional database to Redis and other "NoSQL"
technologies.

.. _Redis: http://redis.io/
.. _Memcached: http://memcached.org/
.. _Simon Willison's Redis tutorial: http://simonwillison.net/static/2010/redis-tutorial/
.. _redis-py: https://github.com/andymccurdy/redis-py
.. _Django: http://djangoproject.com
.. _Redis API: http://redis.io/commands
.. _SADD: http://redis.io/commands/sadd
.. _RENAME: http://redis.io/commands/rename
.. _SUNION: http://redis.io/commands/sunion
.. _custom middleware: http://docs.djangoproject.com/en/1.3/topics/http/middleware/
.. _custom management command: http://docs.djangoproject.com/en/1.3/howto/custom-management-commands/
.. _Celery: http://celeryproject.org/
.. _custom inclusion template tag: http://docs.djangoproject.com/en/1.3/howto/custom-template-tags/#inclusion-tags
.. _new post: http://deathofagremmie.com/2011/12/17/who-s-online-with-redis-python-a-slight-return/