comparison content/Coding/002-redis-whos-online.rst @ 4:7ce6393e6d30
Adding converted blog posts from old blog.
author | Brian Neal <bgneal@gmail.com> |
---|---|
date | Thu, 30 Jan 2014 21:45:03 -0600 |
parents | |
children | 49bebfa6f9d3 |
3:c3115da3ff73 | 4:7ce6393e6d30 |
---|---|
1 A better "Who's Online" with Redis & Python | |
2 ########################################### | |
3 | |
4 :date: 2011-04-25 12:00 | |
5 :tags: Redis, Python | |
6 :slug: a-better-who-s-online-with-redis-python | |
7 :author: Brian Neal | |
8 | |
9 **Updated on December 17, 2011:** I found a better solution. Head on over to | |
10 the `new post`_ to check it out. | |
11 | |
12 | |
13 Who's What? | |
14 ----------- | |
15 | |
16 My website, like many others, has a "who's online" feature. It displays the | |
17 names of authenticated users that have been seen over the course of the last ten | |
18 minutes or so. It may seem a minor feature at first, but I find it really does a lot to | |
19 "humanize" the site and make it seem more like a community gathering place. | |
20 | |
21 My first implementation of this feature used the MySQL database to update a | |
22 per-user timestamp whenever a request from an authenticated user arrived. | |
23 Actually, this seemed excessive to me, so I used a strategy involving an "online" | |
24 cookie that has a five minute expiration time. Whenever I see an authenticated | |
25 user without the online cookie I update their timestamp and then hand them back | |
26 a cookie that will expire in five minutes. In this way I don't have to hit the | |
27 database on every single request. | |
28 | |
29 This approach worked fine, but some aspects of it didn't sit right with me: | |
30 | |
31 * It seems like overkill to use the database to store temporary, trivial information like | |
32 this. It doesn't feel like a good use of a full-featured relational database | |
33 management system (RDBMS). | |
34 * I am writing to the database during a GET request. Ideally, GET requests should | |
35 be safe, meaning free of side effects. Of course, if this were strictly followed, it would be | |
36 impossible to create a "who's online" feature in the first place. You'd have | |
37 to require the user to POST data periodically. However, writing to an RDBMS | |
38 during a GET request is something I feel guilty about and try to avoid when I | |
39 can. | |
40 | |
41 | |
42 Redis | |
43 ----- | |
44 | |
45 Enter Redis_. I discovered Redis recently, and it is pure, white-hot | |
46 awesomeness. What is Redis? It's one of those projects that gets slapped with | |
47 the "NoSQL" label. And while I'm still trying to figure that buzzword out, Redis makes | |
48 sense to me when described as a lightweight data structure server. | |
49 Memcached_ can store key-value pairs very fast, where the value is always a string. | |
50 Redis goes one step further and stores not only strings, but data | |
51 structures like lists, sets, and hashes. For a great overview of what Redis is | |
52 and what you can do with it, check out `Simon Willison's Redis tutorial`_. | |
53 | |
54 Another reason why I like Redis is that it is easy to install and deploy. | |
55 It is straight C code without any dependencies. Thus you can build it from | |
56 source just about anywhere. Your Linux distro may have a package for it, but it | |
57 is just as easy to grab the latest tarball and build it yourself. | |
58 | |
59 I've really come to appreciate Redis for being such a small and lightweight | |
60 tool. At the same time, it is very powerful and effective for filling those | |
61 tasks that a traditional RDBMS is not good at. | |
62 | |
63 For working with Redis in Python, you'll need to grab Andy McCurdy's redis-py_ | |
64 client library. It can be installed with a single command: | |
65 | |
66 .. sourcecode:: sh | |
67 | |
68     $ sudo pip install redis | |
69 | |
70 | |
71 Who's Online with Redis | |
72 ----------------------- | |
73 | |
74 Now that we are going to use Redis, how do we implement a "who's online" | |
75 feature? The first step is to get familiar with the `Redis API`_. | |
76 | |
77 One approach to the "who's online" problem is to add a user name to a set | |
78 whenever we see a request from that user. That's fine but how do we know when | |
79 they have stopped browsing the site? We have to periodically clean out the | |
80 set in order to time people out. A cron job, for example, could delete the | |
81 set every five minutes. | |
82 | |
83 A small problem with deleting the set is that people will abruptly disappear | |
84 from the site every five minutes. In order to give more gradual behavior we | |
85 could utilize two sets, a "current" set and an "old" set. As users are seen, we | |
86 add their names to the current set. Every five minutes or so (season to taste), | |
87 we simply overwrite the old set with the contents of the current set, then clear | |
88 out the current set. At any given time, the set of who's online is the union | |
89 of these two sets. | |
90 | |
91 This approach doesn't give exact results of course, but it is perfectly fine for my site. | |
92 | |
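To make the aging behavior concrete, here is a minimal sketch of the two-set idea using plain Python sets instead of Redis. The ``TwoSetTracker`` class and its method names are made up for illustration; they are not part of the module shown later in this post.

```python
class TwoSetTracker:
    """In-memory model of the current/old set scheme (no Redis involved)."""

    def __init__(self):
        self.current = set()
        self.old = set()

    def report(self, username):
        # A user was seen: remember them in the current set.
        self.current.add(username)

    def tick(self):
        # Age out: the current set becomes the old set, and a fresh,
        # empty current set takes its place.
        self.old = self.current
        self.current = set()

    def online(self):
        # Who's online is the union of both sets.
        return self.current | self.old


tracker = TwoSetTracker()
tracker.report("alice")
tracker.tick()
tracker.report("bob")
assert tracker.online() == {"alice", "bob"}

tracker.tick()  # alice has not been seen since the previous tick
assert tracker.online() == {"bob"}
```

With this scheme a user remains listed for between one and two tick intervals after they were last seen, so a five minute tick gives a window of five to ten minutes.
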
93 Looking over the Redis API, we see that we'll be making use of the following | |
94 commands: | |
95 | |
96 * SADD_ for adding members to the current set. | |
97 * RENAME_ for moving the current set onto the old key, which replaces the old set | |
98 and removes the current key all in one step. | |
99 * SUNION_ for performing a union on the current and old sets to produce the set | |
100 of who's online. | |
101 | |
102 And that's it! With these three primitives we have everything we need. This is | |
103 because of the following useful Redis behaviors: | |
104 | |
105 * Performing a ``SADD`` against a set that doesn't exist creates the set and is | |
106 not an error. | |
107 * Performing a ``SUNION`` with sets that don't exist is fine; they are simply | |
108 treated as empty sets. | |
109 | |
110 The one caveat involves the ``RENAME`` command. If the key you wish to rename | |
111 does not exist, Redis reports an error, and the Python Redis client raises it as | |
112 an exception. | |
113 | |
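If you don't have a Redis server handy, the three behaviors above can be mimicked with a toy stand-in. This is only a sketch: ``MiniStore`` and ``NoSuchKey`` are hypothetical names, and the class copies just the semantics described here, not the real client API.

```python
class NoSuchKey(Exception):
    """Stands in for the error reported when the key to rename is missing."""


class MiniStore:
    """Toy stand-in that mimics only the behaviors described above."""

    def __init__(self):
        self.keys = {}

    def sadd(self, key, member):
        # Adding to a missing key creates the set on first use.
        self.keys.setdefault(key, set()).add(member)

    def sunion(self, *names):
        # Missing keys are treated as empty sets.
        result = set()
        for name in names:
            result |= self.keys.get(name, set())
        return result

    def rename(self, src, dst):
        # Renaming a missing key is an error; otherwise the destination
        # is overwritten and the source key disappears.
        if src not in self.keys:
            raise NoSuchKey(src)
        self.keys[dst] = self.keys.pop(src)


store = MiniStore()
store.sadd("current", "alice")
store.rename("current", "old")
print(store.sunion("current", "old"))

try:
    store.rename("current", "old")  # "current" no longer exists
except NoSuchKey:
    print("nothing to rename")
```

The second ``rename`` fails because nobody was added to the current set after the first one, which is exactly the situation the caveat describes.
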
114 Experimenting with algorithms and ideas is quite easy with Redis. You can either | |
115 use the Python Redis client in a Python interactive interpreter shell, or you can | |
116 use the command-line client that comes with Redis. Either way you can quickly | |
117 try out commands and refine your approach. | |
118 | |
119 | |
120 Implementation | |
121 -------------- | |
122 | |
123 My website is powered by Django_, but I am not going to show any Django-specific | |
124 code here. Instead I'll show just the pure Python parts, and hopefully you can | |
125 adapt it to whatever framework, if any, you are using. | |
126 | |
127 I created a Python module to hold this functionality: | |
128 ``whos_online.py``. Throughout this module I use a lot of exception handling, | |
129 mainly because if the Redis server has crashed (or if I forgot to start it, say | |
130 in development) I don't want my website to be unusable. If Redis is unavailable, | |
131 I simply log an error and drive on. Note that in my limited experience Redis is | |
132 very stable and has not crashed on me once, but it is good to be defensive. | |
133 | |
134 The first important function used throughout this module is a function to obtain | |
135 a connection to the Redis server: | |
136 | |
137 .. sourcecode:: python | |
138 | |
139     import logging | |
140     import redis | |
141 | |
142     logger = logging.getLogger(__name__) | |
143 | |
144     def _get_connection(): | |
145         """ | |
146         Create and return a Redis connection. Returns None on failure. | |
147         """ | |
148         try: | |
149             conn = redis.Redis(host=HOST, port=PORT, db=DB) | |
150             return conn | |
151         except redis.RedisError as e: | |
152             logger.error(e) | |
153 | |
154         return None | |
155 | |
156 The ``HOST``, ``PORT``, and ``DB`` constants can come from a | |
157 configuration file or they could be module-level constants. In my case they are set in my | |
158 Django ``settings.py`` file. Once we have this connection object, we are free to | |
159 use the Redis API exposed via the Python Redis client. | |
160 | |
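If you are not using Django, one possible way to supply these constants is to read them from environment variables and fall back to defaults. This is just a sketch; the ``WHOS_ONLINE_*`` variable names are invented for this example.

```python
import os

# Hypothetical environment variable names; pick whatever suits your deployment.
# The fallbacks are the usual values for a local Redis server (6379 is the
# default Redis port).
HOST = os.environ.get("WHOS_ONLINE_REDIS_HOST", "localhost")
PORT = int(os.environ.get("WHOS_ONLINE_REDIS_PORT", "6379"))
DB = int(os.environ.get("WHOS_ONLINE_REDIS_DB", "0"))
```
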
161 To update the current set whenever we see a user, I call this function: | |
162 | |
163 .. sourcecode:: python | |
164 | |
165     # Redis key names: | |
166     USER_CURRENT_KEY = "wo_user_current" | |
167     USER_OLD_KEY = "wo_user_old" | |
168 | |
169     def report_user(username): | |
170         """ | |
171         Call this function when a user has been seen. The username will be added to | |
172         the current set. | |
173         """ | |
174         conn = _get_connection() | |
175         if conn: | |
176             try: | |
177                 conn.sadd(USER_CURRENT_KEY, username) | |
178             except redis.RedisError as e: | |
179                 logger.error(e) | |
180 | |
181 If you are using Django, a good spot to call this function is from a piece | |
182 of `custom middleware`_. I kept my "5 minute cookie" algorithm to avoid doing this on | |
183 every request, although it is probably unnecessary on my low-traffic site. | |
184 | |
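The cookie check itself can be sketched without any framework code. In the snippet below, ``maybe_report``, the cookie name, and the tuple it returns are all assumptions made for illustration; in a real application the middleware would read the request's cookies and set the returned cookie on the response.

```python
ONLINE_COOKIE = "online"        # hypothetical cookie name
ONLINE_COOKIE_MAX_AGE = 5 * 60  # five minutes, in seconds


def maybe_report(username, request_cookies, report):
    """
    Report an authenticated user unless the browser still holds the online
    cookie. Returns a (name, value, max_age) tuple describing the cookie to
    set on the response, or None if nothing needs to be done.
    """
    if username is None or ONLINE_COOKIE in request_cookies:
        return None
    report(username)
    return (ONLINE_COOKIE, "1", ONLINE_COOKIE_MAX_AGE)


# Usage: pass report_user (or any callable) as the reporting function.
seen = []
cookie = maybe_report("alice", {}, seen.append)
assert cookie == ("online", "1", 300)
assert maybe_report("alice", {"online": "1"}, seen.append) is None
assert seen == ["alice"]
```

Passing the reporting function in as an argument keeps the cookie logic easy to test on its own.
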
185 Periodically you need to "age out" the sets by destroying the old set, moving | |
186 the current set to the old set, and then emptying the current set. | |
187 | |
188 .. sourcecode:: python | |
189 | |
190     def tick(): | |
191         """ | |
192         Call this function to "age out" the old set by renaming the current set | |
193         to the old. | |
194         """ | |
195         conn = _get_connection() | |
196         if conn: | |
197             # An exception may be raised if the current key doesn't exist; if that | |
198             # happens we have to delete the old set because no one is online. | |
199             try: | |
200                 conn.rename(USER_CURRENT_KEY, USER_OLD_KEY) | |
201             except redis.ResponseError: | |
202                 try: | |
203                     del conn[USER_OLD_KEY] | |
204                 except redis.RedisError as e: | |
205                     logger.error(e) | |
206             except redis.RedisError as e: | |
207                 logger.error(e) | |
208 | |
209 As mentioned previously, if no one is on your site, eventually your current set | |
210 will cease to exist as it is renamed and not populated further. If you attempt to | |
211 rename a non-existent key, the Python Redis client raises a ``ResponseError`` exception. | |
212 If this occurs we just manually delete the old set. In a bit of Pythonic cleverness, | |
213 the Python Redis client lets you use the ``del`` statement for this operation. | |
214 | |
215 The ``tick()`` function can be called periodically by a cron job, for example. If you are using Django, | |
216 you could create a `custom management command`_ that calls ``tick()`` and schedule cron | |
217 to execute it. Alternatively, you could use something like Celery_ to schedule a | |
218 job to do the same. (As an aside, Redis can be used as a back-end for Celery, something that I hope | |
219 to explore in the near future.) | |
220 | |
221 Finally, you need a way to obtain the current "who's online" set, which again is | |
222 a union of the current and old sets. | |
223 | |
224 .. sourcecode:: python | |
225 | |
226     def get_users_online(): | |
227         """ | |
228         Returns a set of user names which is the union of the current and old | |
229         sets. | |
230         """ | |
231         conn = _get_connection() | |
232         if conn: | |
233             try: | |
234                 # Note that keys that do not exist are considered empty sets | |
235                 return conn.sunion([USER_CURRENT_KEY, USER_OLD_KEY]) | |
236             except redis.RedisError as e: | |
237                 logger.error(e) | |
238 | |
239         return set() | |
240 | |
241 In my Django application, I call this function from a `custom inclusion template tag`_. | |
243 | |
244 | |
245 Conclusion | |
246 ---------- | |
247 | |
248 I hope this blog post gives you some idea of the usefulness of Redis. I expanded | |
249 on this example to also keep track of non-authenticated "guest" users. I simply added | |
250 another pair of sets to track IP addresses. | |
251 | |
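A rough sketch of what the guest tracking might look like follows. The key names and function names are hypothetical, and error handling is left out for brevity. The functions take the connection as an argument, so any object with ``sadd`` and ``sunion`` methods will do.

```python
# Hypothetical key names for the guest sets.
GUEST_CURRENT_KEY = "wo_guest_current"
GUEST_OLD_KEY = "wo_guest_old"


def report_guest(conn, ip_address):
    """Add a guest's IP address to the current guest set."""
    conn.sadd(GUEST_CURRENT_KEY, ip_address)


def get_guests_online(conn):
    """Return the union of the current and old guest sets."""
    return conn.sunion([GUEST_CURRENT_KEY, GUEST_OLD_KEY])
```

Aging out the guest sets would presumably mirror ``tick()``, with the guest keys renamed on the same schedule as the user keys.
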
252 If you are like me, you are probably already thinking about shifting some functions that you | |
253 awkwardly jammed onto a traditional database to Redis and other "NoSQL" | |
254 technologies. | |
255 | |
256 .. _Redis: http://redis.io/ | |
257 .. _Memcached: http://memcached.org/ | |
258 .. _Simon Willison's Redis tutorial: http://simonwillison.net/static/2010/redis-tutorial/ | |
259 .. _redis-py: https://github.com/andymccurdy/redis-py | |
260 .. _Django: http://djangoproject.com | |
261 .. _Redis API: http://redis.io/commands | |
262 .. _SADD: http://redis.io/commands/sadd | |
263 .. _RENAME: http://redis.io/commands/rename | |
264 .. _SUNION: http://redis.io/commands/sunion | |
265 .. _custom middleware: http://docs.djangoproject.com/en/1.3/topics/http/middleware/ | |
266 .. _custom management command: http://docs.djangoproject.com/en/1.3/howto/custom-management-commands/ | |
267 .. _Celery: http://celeryproject.org/ | |
268 .. _custom inclusion template tag: http://docs.djangoproject.com/en/1.3/howto/custom-template-tags/#inclusion-tags | |
269 .. _new post: http://deathofagremmie.com/2011/12/17/who-s-online-with-redis-python-a-slight-return/ |