17 Aug 2008
26825
hits

MySQL asynchronous balancing with HTTP+JSON

It is almost the same. It solves the same problem, it does the same work, it looks quite similar, but there is a difference though. One of them is way faster than its predecessor. :-)

With great joy, allow my to introduce our brand new DB-Slayer-like module for Cherokee!

In case you did not know the original implementation, it is a MySQL to JSON over HTTP bridge written by the New York Times folks. So, before I say anything more, I would like to thank Derek Gottfrid for such an inspiring piece of software.

So, what's up with the Cherokee module? Why the heck did we re-implement DBSlayer as a Cherokee module when the original version was already Free Software and was based on the well-known Apache library?

It's basically the same reason why we are working on Cherokee: there are alternative ways of facing the same problems, and it is our believe we have taken a very good decisions along the way that makes Cherokee a fit better on many environments.

On this case, I have written the handler_dbslayer because I wanted to prove myself how big was the difference between both architectures. The results? I'm simply delighted, they are kind of what I expected. Let me show it to you:

The benchmark consisted on 50,000 requests with 50 concurrent threads. Each one of them sending a really basic query to the same MySQL server. The request sent to DBSlayer and to Cherokee were slightly different because of a design decision I made. We send the SQL query as the body of a POST call instead of encoded in the request, and the options are sent as HTTP headers instead of encoded with JSON and re-encoded as part of the request:

This was the request to DBSlayer:

GET /db?%7B%22SQL%22:%22SELECT%20*%20FROM%20example;%22%7D HTTP/1.1
User-Agent: DBSlayer-tester/0.1
Host: localhost:9090
Connection: Keep-Alive
Accept: */*

And this one to Cherokee + handler_dbslayer:

POST / HTTP/1.1
User-Agent: DBSlayer-tester/0.1
Host: localhost:9999
Content-Length: 22
Connection: Keep-Alive
Accept: */*
X-Beautify: 0

SELECT * FROM example;

Check out the results!

It took Cherokee 2.40 seconds to reply all the requests (4,186 reqs/sec), while it DBSlayer needed 8.01 seconds to do the same work (1,255 reqs/sec).

However, things do not end there. As long as the Cherokee's handler_dbslayer has been implement as a response module ('handler' in the project slang) it can take advantage of the rest of the server features. For instance, you could configure it to use any of the load balancing modules that we are already shipping, so you could split the load among a number of MySQL servers.

But, wait; it goes on. There are more interesting enhancements that handler_dbslayer inherits. What about encoding the JSON formated result with, for instance, gzip before sending it back to the application server? It would only require you a click on the administration interface, isn't it cool? Another good point would be to use Keep-alive, Chunked Transfer encoding or Pipelining. It does sound like a really big advantage, doesn't it? :-)

I will write something more about how this handler works and performs in the following days.

Comments

sharms on Sun Aug 17 22:38:34 2008
852


Just wanted to let you know I never really looked into the project before, but after reading your posts and the website, it looks very cool, will be giving it a try
bkor on Mon Aug 18 00:42:24 2008
853


What did you use to make the chart?
penyaskito on Mon Aug 18 01:37:09 2008
854


Sounds really good, but... I'm a little afraid about it. At first, I have to notice that my knowledge about Cherokee is that I read your articles here. My fears come from the security perspective... Does this allow us to make any SQL query to the server from an HTTP request? What are the threats that we should take care of? How does this manage SQL Injections? Sorry if these questions seems stupid and it is documented elsewhere. A pointer would be great ;)
Derek Gottfrid on Mon Aug 18 03:03:14 2008
855


This is awesome. I can't wait to see how this develops. I am excited to see somebody think there was enough value behind the ideas in dbslayer to reimplement. I know nothing about cherokee but it looks very impressive from the outside.
Ilya Grigorik on Mon Aug 18 03:30:24 2008
856


Alvaro, this looks fantastic. I sure hope you're planning on making the source code available - I'd love to play with it. If you need beta testers.. you know who to ping. ;-)
James Henstridge on Mon Aug 18 05:12:26 2008
857


penyaskito: presumably you'd protect a dbslayer daemon the same as you would the database servers themselves. Giving unrestricted access to it over the internet would usually be a bad thing. This sort of system looks like it'd only really be useful for read only databases (or ones that change infrequently enough to be treated as such). If you are doing significant changes to the DB or require transactional isolation, I guess this layer wouldn't be too useful.
Oli on Mon Aug 18 10:17:44 2008
858


I'd also love to know what you used to make the bar graph. It's awesome.
Darwin OS on Mon Aug 18 10:49:17 2008
859


The Chart were made with iWork (Pages, Numbers or Keynote) and the Diagram with Omnigraffle. Am I right Alvaro?
Alvaro Lopez Ortega on Mon Aug 18 10:51:35 2008
860


Darwin OS, you are indeed.
Richard on Mon Aug 18 11:55:01 2008
861


Isn't the point of putting the SQL query in the URL, as in the original dbslayer design, that you can stick a standard caching http proxy in front of the dbslayer server?
Tobias on Mon Aug 18 13:55:46 2008
862


The chart is really nice, I'd like to know how you made it. (The rest nice too, but MySQL isn't my topic.)
Taher Shihadeh on Mon Aug 18 15:43:20 2008
863


I just wanted to say something: Wow! No, really. Wow!!!! Great idea. Better yet implementation. I'm thrilled by the possibilities :)
penyaskito on Mon Aug 18 21:22:47 2008
864


Thanks, James, that makes sense :-)
Sylm on Tue Aug 19 08:19:01 2008
865


Charts was made with Pages or Numbers
bisho on Tue Aug 19 11:39:37 2008
867


Very interesting! As Richard said, GET makes sense for SELECT clauses, as it could be easily cached using web standard cache proxies. A very interesting option could be the support for specifying a cache timeout on each query: GET /?q=SELECT...&cache=1800 And cherokee will return that specified cache timeout on the response. Thus if you have a regular squid in from of the setup, it will obey the cache requested from the application. POST should be reserved for non-cacheable queries like INSERTS/UPDATES/DELETES (or even using the rest of HTTP verbs, as a good REST implementation). The implementation currently supports queries that modify the state of the DB -by specifying a master mysql server or resending the same query to all dbs-, or just data consults???
Xaprb on Mon Aug 25 14:30:35 2008
881


Edward Tufte would disapprove of the chart ;-)
Eric Larson on Mon Aug 25 21:39:11 2008
882


bisho brings up a good point regarding caching. My guess is that you would actually be querying a server dedicated to serving JSON that would handle the caching. For instance, the backend, when it needed to make a request, would make a POST request to the DBSlayer and then cache the results. Requests for the JSON would then be handled by the querying server out of its cache. I would imagine this kind of setup would almost be required anyway in order to keep everything secure. With that said, if you were planning on implementing the security aspect via HTTP (a reasonable idea IMHO), then using GET for queries does make sense. POST could then be used to always perform the query, no matter what has been (potentially) cached by the database. Cool stuff!
Mark Matthews on Tue Aug 26 01:35:44 2008
883


I've been pondering doing something similar, but following REST principles, so really a query should be in most cases a GET, which would mean that the JSON "encoder" application could apply appropriate etags, expires headers, etc. which would allow "any old" HTTP cache to cache the content...Putting *any* query in POST kind of negates the benefits of the REST architecture, doesn't it?

Leave a comment

Name:

Comment:


Loading PhotoLog..


Search

 

Twitter


Facebook

Alvaro Lopez Ortega's Profile
Alvaro Lopez Ortega's Facebook Profile