Exciting times
Big announcement time.
I’m excited to say, it’s time for my next big move. The time has come for me to pursue new professional challenges. I'm thrilled to announce that I'm joining Canonical as an Engineering Manager.
I leave Octality with a heavy heart. It has definitely been a once-in-a-lifetime experience. I will always carry with me the outstanding experiences and insightful lessons of the last three and a half years.
When I first thought about the opportunity of joining Canonical, I was immediately attracted by the idea of working at such an innovative company. Canonical has so much potential and such a global reach. Over the past few years it has done decisive work improving the "Linux experience", and thus bringing F/OSS to a whole lot of people.

I am delighted to be welcomed to this first-class team. It's great to have the chance to get my fingers back in the pie with the latest Open Source technology, especially since it is with the already stellar team at Canonical.
This will be an awesome ride. Woo hoo!
Exciting times.
Front-Line Cache Configuration
This post will introduce how to configure Front-Line Cache, a new form of data caching offered in Cherokee 1.2.98.
Front-Line Cache provides the functionality of a regular Proxy-cache server, with one very big difference: it is not an independent second server which all the Web server traffic must also go through. Instead, Front-Line Cache is built into the Cherokee Web Server. This new architecture brings the same benefits as an independent Proxy-cache server, but it does not add any latency to the response (caused by the network-based communication between two different pieces of software), nor does it compete for system resources (caused by the duplication of resources between Web and Proxy servers: memory, pool of descriptors, sockets, etc.).
To try Front-Line Cache, install Cherokee 1.2.98 (or greater). As usual, new packages for Ubuntu and MacOS X are made available a few minutes after the source tarball release.
How to Enable Front-Line Cache
Enabling Front-Line Cache is a pretty straightforward process that takes, literally, a minute. The whole configuration process is performed via Cherokee-admin, so you don’t have to deal with the complexity of plain ol’ configuration files.
There are two ways of enabling Front-Line Cache on a virtual server:

Front-Line Cache Wizard
- Wizard: A wizard is provided to auto-configure the caching of a selected virtual server. This method will add a rule or two to your virtual host to enable the caching support and its administrative interface (accessed by standard PURGE requests).
It is a fairly straightforward process:
- Click vServer on Top menu
- Select the Virtual Server
- Behavior tab
- Click Rule Management
- Click on the "+" button to add new rules
- Select the Tasks section
- Select the Content Caching wizard
The wizard provides two configuration options that cover most of the use cases of the cache mechanism:
- Store Cacheable Dynamic Responses: This option will cache all the responses that are explicitly marked as cacheable, that is, responses including the Cache-Control or Expires headers. If you are trying to speed up a Web application, this is the way to go.
- Store Encoded Responses from Static files: Cherokee will store a copy of the encoded (GZip or Deflate) version of static files, so when they are subsequently requested, the server does not need to compress them again.
Additionally, the wizard allows you to configure an administrative interface from which entries in the cache can be purged remotely. We have implemented a PURGE interface, for the sake of consistency with other Proxy-cache projects.
- Manual: It's also possible to enable the caching capabilities of Cherokee in a finer-grained, per-rule way.
Starting in Cherokee 1.2.98, a new "Caching" tab has been added to the Rule management interface. It allows you to define whether or not the content may be cached, and which responses are subject to this process.
The Responses to Cache option is especially relevant in this section. It defines whether the explicitly cacheable responses are the only responses that can be cached, or whether the server should apply a much more aggressive caching policy by storing all the responses except the ones that are explicitly forbidden to cache.
There are a few reasons why a response is considered to be non-cacheable:
- The object was requested over HTTPS
- The object required authentication
- The headers of the response forbid its caching. Any of the following headers would forbid caching:
  - Cache-Control: with a "no-cache", "no-store", "must-revalidate" or "proxy-revalidate" directive
  - Expires: with a time in the past
  - Pragma: with a "no-cache" value
- The headers of the response set a cookie for the client. (Cherokee allows you to ignore certain cookies, so they are not taken into account while evaluating the cacheability of a response. This provides an override so that content that sets cookies can still be made cacheable.)
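To make the rules above a bit more tangible, here is a rough Python sketch of that kind of cacheability check. It is not Cherokee's actual code: the function name, its parameters, and the choice to match the disregarded-cookie regular expressions against the cookie name are all assumptions made for illustration.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Cache-Control directives that, per the list above, forbid caching.
FORBIDDING_DIRECTIVES = {"no-cache", "no-store", "must-revalidate", "proxy-revalidate"}


def is_cacheable(headers, https=False, authenticated=False,
                 disregarded_cookies=(), now=None):
    """Hypothetical helper: apply the non-cacheability rules to a response.

    headers is a list of (name, value) tuples; disregarded_cookies is a
    list of regular expressions (assumed to be matched on cookie names).
    """
    if https or authenticated:
        return False
    now = now or datetime.now(timezone.utc)

    for name, value in headers:
        lname = name.lower()
        if lname == "cache-control":
            directives = {d.strip().split("=", 1)[0].lower() for d in value.split(",")}
            if directives & FORBIDDING_DIRECTIVES:
                return False
        elif lname == "pragma" and "no-cache" in value.lower():
            return False
        elif lname == "expires":
            try:
                expires = parsedate_to_datetime(value)
            except (TypeError, ValueError):
                return False  # assumption: an unparsable date counts as expired
            if expires.tzinfo is None:
                expires = expires.replace(tzinfo=timezone.utc)
            if expires <= now:
                return False
        elif lname == "set-cookie":
            cookie_name = value.split("=", 1)[0].strip()
            if not any(re.search(p, cookie_name) for p in disregarded_cookies):
                return False
    return True
```

For instance, `is_cacheable([("Set-Cookie", "sid=abc; Path=/")])` returns False, while passing `disregarded_cookies=[".*"]` makes the same response cacheable.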
How to Test That Front-Line Cache is Working
After applying the cache configuration changes to Cherokee you should confirm that cache is indeed working as expected. We do this by using curl, a common Linux tool to look at the HTTP Headers.
curl -v http://www.yourdomain.com/ | head -40
The presence of an X-Cache value indicates that Cherokee's Front-Line Cache is active. The first request for a URL should return: X-Cache: MISS from www.yourdomain.com. Subsequent requests where Cache is functioning should return: X-Cache: HIT from www.yourdomain.com
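If you would rather script that check than read the headers by eye, something along these lines could work. It is only a sketch: the helper name is made up, and it assumes the header has the "X-Cache: HIT from host" shape described above, fed in as the text that curl -v prints.

```python
def x_cache_status(raw_headers):
    """Hypothetical helper: return 'HIT', 'MISS' or None from header text.

    Accepts plain header lines as well as curl -v style lines that are
    prefixed with '< '.
    """
    for line in raw_headers.splitlines():
        line = line.lstrip("< ").strip()
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "x-cache":
            parts = value.split()
            return parts[0].upper() if parts else None
    return None  # no X-Cache header: the cache is probably not active


sample = (
    "< HTTP/1.1 200 OK\n"
    "< Content-Type: text/html\n"
    "< X-Cache: MISS from www.yourdomain.com\n"
)
print(x_cache_status(sample))
```

A None result would suggest the rule handling that URL does not have caching enabled at all.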
Further Debugging
The most common reason why backend content fails to be cacheable is the presence of cookies.
Again, by using curl to look at the underlying HTTP data, we can detect cookies that are impacting cacheability:
curl -v http://www.yourdomain.com/ | head -40
If you see a "Set-Cookie: value, expires=" type entry via curl, then that data cannot be cached. You have two choices when this occurs: either stop the backend server from sending cookies, which might be complicated or even out of your control, or override the values via Cherokee's Front-Line Cache features.
To make a response that sets cookies cacheable within Cherokee, we have provided a configuration option under caching called Disregarded Cookies. This option allows Cherokee to ignore some or all cookies, depending on your configuration preferences.
- Click vServer on Top menu
- Select the Virtual Server
- Behavior tab
- Click the Behavior Rule where cache is enabled
- Scroll down the Content Expiration screen until you see Disregarded Cookies
Disregarded Cookies entries are defined by regular expressions.
In the extreme case where you want to disregard all cookies and make a document cacheable, add a New Regular Expression with the value: .*
WARNING: Setting Cherokee to ignore cookies in this manner could expose personal data where cookies are used to relate data to a user or to show a user's personal information, log-in area, etc. Use with caution on non-personalized pages only.
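Given that warning, a narrower expression is usually the safer choice. The Python sketch below shows how different patterns would behave on a couple of made-up cookies. Both the cookie names and the helper are hypothetical, and it assumes the expressions are matched against the cookie name, which may differ from what Cherokee does internally.

```python
import re


def blocking_cookies(set_cookie_values, disregarded_patterns):
    """Hypothetical helper: names of cookies that still prevent caching."""
    compiled = [re.compile(p) for p in disregarded_patterns]
    blocking = []
    for value in set_cookie_values:
        name = value.split("=", 1)[0].strip()
        if not any(rx.search(name) for rx in compiled):
            blocking.append(name)
    return blocking


# Made-up Set-Cookie values: one analytics cookie, one session cookie.
cookies = ["__utma=1; Path=/", "sessionid=abc; HttpOnly"]

print(blocking_cookies(cookies, []))           # nothing is disregarded
print(blocking_cookies(cookies, [r"^__utm"]))  # only analytics cookies are disregarded
print(blocking_cookies(cookies, [".*"]))       # every cookie is disregarded
```

With the narrow pattern the session cookie keeps the personalized response out of the cache, which is exactly what you want on pages with a log-in area.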
Cherokee 1.2.98, New Features
Cherokee 1.2.98 was released yesterday, as the first of two beta versions of Cherokee 1.3.
There are two major improvements scheduled for Cherokee 1.3. The first improvement is the Front-Line Cache mechanism I wrote about a few days ago. Basically, Cherokee stopped being a Web Server to become a Web Server with Web Caching capabilities. We have been aware that one of the most common deployments of Cherokee required a reverse proxy (usually Varnish or Squid) in front of it, so dynamic responses could be cached. There is nothing terribly wrong with that architecture, but it is not the optimal solution either. Front-Line Cache provides the same functionality as the tandem (Web server + Proxy-cache), while removing the latency introduced by the communication between the Web server and the Proxy-cache server. This ultimately uses the server resources much more efficiently and reduces page generation and user load times. Web servers and Proxy-caches are very similar pieces of software, both requiring memory, a large pool of file descriptors, sockets, configuration files, long-term documentation, maintenance, etc.
The cache server plus web server "duplication" will no longer be necessary. Starting in Cherokee 1.2.98 we provide a global caching mechanism so the content generated by the server can be cached. It does not matter whether it's a response from PHP, Python, Ruby, a proxied response from a back-end server, or simply content built within the Web server itself (a static file that was compressed with GZip, an SSI rendered file, etc). All of them can be handled by the Front-Line Cache technology shipped in Cherokee 1.2.98 and onward. In testing, Front-Line Cache has been shown to boost server performance and lower page load time by up to 80%.
The second major feature in Cherokee 1.2.98 is Cherokee Distribution. This feature already has its own post, so you can learn more about it by reading that here. Simply put, it's a radical change to how the Cherokee Market operates. There were many voices in the Cherokee community asking for a more open way of running the marketplace (originally Octality was the only entity empowered to operate it). Even though it was a fair petition by Cherokee's Community, it was not an easy decision to make. While you are reading this, the process of rebranding the Cherokee Market to Cherokee Distribution is still ongoing, so when we reach Cherokee 1.3.0 within the next few weeks, all the package repositories will be managed and maintained by community members (including ISVs).
I'd like to finish by clarifying that both features (Front-Line Cache and Cherokee Distribution) are still in beta. Both the current 1.2.98 and the upcoming 1.2.99 are beta versions of our planned Cherokee 1.3.0 release.
I'd like to encourage you to test these new features and provide feedback (II). Let me know how these features work in your environment, with your website or data.
Front-Line Cache
There are many reasons why a Web infrastructure can perform poorly or even degrade its performance over time. Among the most common you can find misconfigurations, infrastructures based on good ol' servers that drain the hardware resources, incorrect provisioning policies, and a whole lot of human errors (Amazon and VMWare's Cloud Foundry outages are perfect examples of the last group).
Fortunately, there are quite a few ways to improve the global performance of Web infrastructures as well. Some of them rely on changing some of the software pieces involved, while others require a design change of the Web infrastructure itself.
It's been a few months since I had an idea about how to improve the global performance of many of the Web deployments I was working with: an idea directly related to one of the most common architectures that medium and large websites use nowadays.
These days, it's very common to find a Proxy-cache server right in front of the Web servers of any organization. Its purpose is to store local copies of the dynamically generated responses, so a given request is only processed by the Web Server a single time, and thus some of the most time-expensive operations are skipped. Despite the simplicity of the architecture, it is very effective. Actually, if you are interested in Cherokee, odds are you knew about this a long, long time ago.
Due to the effectiveness of the scheme, it is common to deploy Cherokee along with either Varnish or Squid in order to improve the performance of the website. There are pros and cons to this scheme, though. On the one hand, it decreases the system(s) load, and improves response time whenever a cache hit happens. On the other hand, it introduces some latency to the system; bear in mind that the communication between the Proxy-cache server and the Web server takes time, and that's ultimately a latency increase that the Web site user will suffer.
From an architectural point of view, the scheme wasn't optimal either. Actually, there is a very simple question I'd like you to answer: Why would you want to keep two separate servers working on the same service? Is there something you get from it? It is neither safer, nor faster.. so, why would you do it then?
The answer is clear to me. You have been doing it, simply because you had no other choice. There was no way to enjoy the advantages of the scheme without paying the price of deploying a sub-optimal architecture.
Today, I'm delighted to introduce Front-Line Cache, a brand new take on early caching of Web content. Front-Line Cache is a new mechanism in Cherokee that implements the cache functionality you'd expect from a Proxy-cache server, but within the Web Server itself.
Advantages? Many. Please, allow me to list a few:
- Resource optimization: A Front-Line Cache enabled version of Cherokee uses far fewer resources (CPU, memory and disk access) than the tandem of a Proxy-cache server and a Web server.
- You get all the benefits of using a Proxy-cache, but you don't pay the price of increased latency. Since there is no communication between the two software pieces, there is no additional delay.
- As with the rest of the pieces of the Cherokee stack, there is no need to deal with text based, error prone, configuration files. You will enjoy a nice configuration interface where you used to have to deal with two different configuration files, with two different, non-standard grammars.
- You will enjoy a unified log of the transactions of your Web infrastructure where you used to have two different files, one from each of the two different servers.
- Enjoy the same set of tools and goodies of Cherokee, including: Live monitoring of traffic, Remote administration and tweaking, Live server status reports, etc.
Please, check the following captures as an example of the performance improvements that can be achieved by using Cherokee's Front-Line Cache. Both images were captured running MediaWiki under Cherokee trunk (the upcoming Cherokee 1.2.3 version) and php-fpm 5.3. Besides the lower load on the server, the response time was also significantly lower: 204ms using Front-Line Cache, against 1280ms without it. Or, put another way, delivering the response took barely 16% of the time that it would have required without Front-Line Cache.
Having said all this, I'd also like to clarify that the Front-Line Cache mechanism is still an experimental feature that will be shipped with Cherokee 1.2.3 for the first time. So, even though it's been stable through all our testing cycles, I'd recommend using it with caution until its 'Experimental' status is removed.
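For the record, the 16% figure is just the ratio of the two response times quoted above. A quick Python check (the function name is only for illustration):

```python
def time_ratio(cached_ms, uncached_ms):
    """Fraction of the uncached response time that the cached response takes."""
    return cached_ms / uncached_ms


ratio = time_ratio(204, 1280)
print(f"{ratio:.1%} of the original response time")  # 15.9%
print(f"{1 / ratio:.1f}x faster")                    # 6.3x
```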
An improved distribution model for Web Apps
It's been a few weeks since Octality launched the Cherokee Market, a marketplace where ISVs and developers can distribute and sell their Web applications.
During these past few weeks many things have happened, and I must admit some of them happened unexpectedly:
- A whole lot of people have signed up in the Cherokee Market, many more than my most optimistic estimate. My main concern has been ensuring that our servers scaled to handle all these users properly. Other than that, I must confess I have quite enjoyed seeing so many people sign up.
- We received all sorts of feedback about the market. Either people loved the service or they had proposals on how they thought it should be improved. Again, the amount of feedback surpassed by far what I thought we could get. Actually, I'd like to take this opportunity to thank everybody who, in one way or another, took the time to drop us a line.
Actually, all that feedback has played a very important role in how the market has evolved. Believe me when I say that it isn't easy to change your medium-term plans based on user feedback, although I believe that we are doing the right thing by being flexible and rethinking some of the aspects of the market. For sure, it will provide a much better outcome for everyone involved.
During the last few weeks, we have been working on an evolved version of the Cherokee Market. I do know it's something unusual to do after having launched it two months ago, but again, it's better to be open to evolve the service than try to stick to your initial approach when you know it can be improved.
Check out this scheme. It represents how the original Cherokee Market works:
As you can see, Octality plays a predominant role. Basically, it is the gatekeeper that allows Web apps to get into the Cherokee Market. I don't think it is a crucial fact, basically because most people are already used to this scheme (think of Google and the Android Market, or Apple and the AppStore; it's the same). However, as I pointed out before, even though the scheme was good enough, we believe things could be done a little better.
Actually, a much more classical approach could work better in this case. What if we removed Octality from the scheme? What if developers could get their Web applications into the market without the need of someone approving and handling them? Wouldn't it be much more appealing for users to install applications from an extensive bazaar rather than from a little, posh boutique? This fundamental change would make the market less bureaucratic, faster and much more accessible for developers and ISVs.
This major change in the workflow will bring a number of positive side-effects. For instance, it'd also be quite interesting to be able to replicate the market, wouldn't it? Allow me to give an example here: I know of a few ISPs interested in setting up an internal Cherokee Market, so they could offer the service locally to their clients.
Believe me when I say that this change represents a huge advance from the original scheme. In the upcoming version of the marketplace, Octality disappears from the scheme, while an independent community takes over the web apps repository. It's open to anyone to join. They are the people who pack and maintain the web app packages along with the repository and its mirrors, with no interference from Octality whatsoever.
So, if you have a web app that you'd like to make dead easy to install, you are more than welcome to join the community. You will receive an account on the package source code repository, with which you can upload the new web app packages. Your application will be ready to be installed by any of the tens of thousands of Cherokee Web Servers around the world a few minutes after you commit your package.
We are currently polishing the last few rough edges of all the new infrastructure. As soon as it's ready we will make the proper announcement.
Linux Format benchmarks Cherokee
I have written about the performance of Cherokee a few times already. Cherokee is quite fast and efficient, I suppose you are aware of that by now. However, I wanted to write these few lines to let you know about an article titled 'Cherokee: Fast' that the Linux Format magazine published last month (March 2011, issue 142, pages 96-99).
The article includes an independent benchmark between Apache 2.2, Cherokee 1.0.15, Lighttpd 1.4.26 and Nginx 0.7.65. Truth be told, even though it is a very well written article, you shouldn't expect an extensive, in-depth benchmark. Still, the benchmark results are fairly representative IMHO.
All in all, Cherokee was the fastest one, outperforming the rest of the servers. Check out one of the resulting graphs as an example:
The article also mentions Cherokee's administration and monitoring graphical interface and some of the many advantages of not having to deal with plain text configuration files any longer.
By the way, I do know that publishing this reference will probably raise the same questions and comments as the other posts I wrote on the performance of Cherokee.. so, as a preemptive measure I will answer a couple of the most usual questions in advance:
- Yes, there are more web servers. Obviously I did not write the article, so I couldn't tell you for sure why they were left out. I suppose they were not included either because of issues with their license, the project's maturity status, or simply because of their lack of user base.
- Yes, of course it's possible to write a tiny program to serve static content faster than any of the benchmarked servers. Bear in mind, though, that the article tested fully functional Web Servers able to run and interact with the wide variety of the technology on a regular Web server box: PHP, Java, Python, Ruby, MySQL and LDAP servers, Audio/Video streaming, and so on and so forth.
So, having said that, I'd just like to finish this quick post by letting you know that there are a couple of compelling new features in Cherokee that I'll try to merge into our Trunk branch within the next few weeks. It's pretty exciting because both of them are new concepts that have not been implemented in a Web Server before... You know, at the end of the day, the performance of Cherokee is not something we have focused on, but one of the consequences of thinking outside the box.
Cherokee Market: The 1st marketplace for Web Apps
It's been a couple of days since we launched the Cherokee Market. I'm glad I have finally found a few minutes to write about it. Oh, boy.. these last few months have been intense!!
I will begin by introducing what we have done. The Cherokee Market is a marketplace for Web applications. It's basically a distribution channel where Web developers can distribute and sell their applications, while users can deploy them seamlessly.
Watch this short video to see what I'm talking about. Two minutes are enough to perform the Cherokee Web Server installation, and to access the Cherokee Market to install Drupal 7. It's interesting to notice that the user is never asked for any superfluous installation details:
The technology we have developed to create this service is quite interesting, although in my honest opinion, that is not the best thing about the Market. If you ask me, the very best thing about the Cherokee Market is the vast number of people who will benefit from it. There are basically two groups of people who are starting to enjoy its benefits:
Web Developers: I'm including in this category both companies and independent developers. Cherokee Market represents a new distribution channel for them with two main advantages. First of all, it expands their user base. Every single Cherokee Web Server user is their potential user as well. The application gains visibility, and increases its user base. Second, as a direct consequence of the previous point, monetizing the product becomes easier, and a bigger user base translates into an increase in sales.
The Cherokee Market is already open for Web Developers and Companies to join. We are already working with a few companies to bring their products to the market. Do not hesitate to join if you want to distribute/monetize your Web applications.
Web Infrastructure Owners: There are a number of reasons why people like Cherokee. I'd split them into two groups. First, it's how it works: it's fast, very fast, and it requires little memory to run. It also supports all the modern Web technologies, so there are no restrictions on what you can or cannot run under Cherokee.
Then, we have how the user interaction with the server is designed. Let me get this straight: It's 2011, it's about time to stop editing complex, error prone, plain text configuration files, don't you think? Now that we all are running fancy desktops full of powerful applications.. what sense does it make to have to open a terminal window, become root, edit a text file, and type a new command to reload the service in order to make a little change to how your web server behaves? I’d say none, none at all.

Behavior rule configuration
Traffic monitoring
The first step was to improve how users interacted with their Web Server. Now, we have taken the approach even further, and we are changing how users interact with their Web infrastructure as a whole. It's possible to configure your Web Server behavior, Load balancing policies, and even to browse and install applications from within the same graphical user interface.
The best thing, though, is the simplicity of the process. Odds are you have had to deal with the installation of some Web application that took way longer than you expected. It's a common situation: missing Python, Ruby, PHP modules, missing build dependencies, invalid versions of interpreters you already have, etc. Not to mention the fact that you might have to perform some operations by hand: create databases, add users, groups or services to your system, etc.
So, as a condensed summary: The Cherokee Market allows users to deploy applications seamlessly, while we ensure they are deployed safely. They can forget about misconfiguration issues, or poorly performing apps.
I must confess I am deeply proud of what we have achieved here. Besides all the technology we have had to develop in order to bring this project to life, we have created something that I understand is much more important: a way in which hundreds of companies and developers can distribute and sell their Web applications, providing an additional (previously proven successful) business model to all of them.
Streaming WebM (VP8) One Day Later
Yesterday, I left the office and headed back home a couple of hours after Google announced the VP8 liberation (GMT+2). At that moment I was trying to forecast the whole lot of changes that the WebM project would bring to the web. The VP8 release under a BSD license with a patent rights grant is a huge step towards an open and modern WWW. It solves one of the biggest problems the web is facing nowadays. Actually, before I go any further, I'd like to thank Google for freeing VP8 and creating the WebM project.
It was today when I thought about starting to test WebM. It looks good and it does sound really promising. However, I wanted to give it a try in order to see how good it actually is. It might be because I've been working so many years with Open Source software, but the first thing I did was to clone their code repository and check the source code. Everything looked alright on that front, so I went ahead to the next stage: use the VP8 code.
Wouldn't it be pretty cool to support WebM streaming over HTTP?
After getting my hands dirty for a couple of hours I came up with a Cherokee Web Server with WebM streaming capabilities. It's very basic stuff, but it does work alright. Basically it can read WebM encoded files and stream them, taking care of performing an initial content boost (so browser cache is filled up right away) and an optional streaming bitrate increase factor.
For my test I used Chromium 6.0.411.0 (47774), Opera 10.54 (21874) and, of course, Cherokee Web Server 1.0.1. For the record, this is the trivial HTML code I used to embed WebM encoded video in an HTML5 test page:
<html>
  <body>
    <video controls="controls">
      <source src="BBC.webm" type='video/webm; codecs="vorbis,vp8"'>
      <p><a href="BBC.webm">Download the video</a>.</p>
    </video>
  </body>
</html>
Check out the result!
First of all.. It was trivial to configure, wasn't it?! Cherokee is always configured in the same way; the days when you had to open a terminal, become root and edit a text file by hand are long gone. Hurray!!
Now, what does Cherokee do to stream the WebM video? The first thing it does internally is to figure out the bitrate of the main data stream (basically, audio + video). In this case, it was not as simple as I was expecting because of a couple of issues with libvpx-vp8 (related to VPX_CODEC_INCAPABLE). Anyway, once Cherokee figures out a few details about the video, it can start streaming the content.
The following graph shows how the content is delivered by the server. At the beginning it pushes as much content as possible for a very short period of time. The intention is to get the client's browser cache filled up with information so the video can be played right away. After a couple of seconds Cherokee decreases the throughput so it matches the real video bitrate. There is an additional parameter that allows you to define an increment constant, though. In this case, Cherokee was configured to deliver an additional 10% over the bare minimum required rate.
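Just to illustrate the idea, the delivery policy described above could be sketched in Python roughly like this. It is not the actual Cherokee code: the function name, the two-second boost window and the use of infinity for "as fast as possible" are assumptions; only the 10% increment comes from the configuration mentioned above.

```python
import math


def target_send_rate(elapsed_s, bitrate_bps, boost_seconds=2.0, increase_factor=0.10):
    """Hypothetical throttle: bits per second to send at a given moment.

    During the initial boost window the stream is unthrottled; afterwards
    it is capped at the media bitrate plus a configurable increment.
    """
    if elapsed_s < boost_seconds:
        return math.inf  # initial boost: push as much content as possible
    return bitrate_bps * (1.0 + increase_factor)


# Example with a made-up 1 Mbit/s video.
for t in (0.5, 1.5, 3.0, 10.0):
    print(t, target_send_rate(t, 1_000_000))
```

The increment keeps the client's buffer slowly growing, so a short network hiccup does not stall playback.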
And, that's it. This is how WebM streaming is performed!
Cherokee Summit Big Success
This is the first chance I have had to write something since the Cherokee Summit finished a couple of days ago. I have been trying to make up my mind about what to write regarding the event, and I must say I have failed to do so. I have too many things to talk about. There were too many interesting conversations, too many people giving talks about successful Cherokee deployments, discussions about the Cherokee 2.0 roadmap and our upcoming marketing efforts. There were those awesome community open sessions where we discussed every single subject the Cherokee community proposed (with streaming and bridges to IRC and Twitter). I could talk for hours about all those things.
However, do you know what impressed me the most? Energy. The amazing amount of energy that the attendees brought to the conference. It was both totally amazing and exhausting! It isn't easy to gather together a group of high-profile IT people (developers, data-center gurus, specialists for government IT departments, entrepreneurs, etc), but when you do, and they are motivated about a project, the outcome is really unbelievable.
The Cherokee Summit has been a huge success. However, I must confess I could not even imagine it'd be such an amazing experience. Seriously. It surpassed my highest expectations in almost every way.
So now, after having enjoyed such a great experience, it's time to focus again. We have a whole lot of things to do, features to implement, bugs to fix.. and a Community to build. At the end of the day, that's the most important thing: the Community around the project.
We are currently uploading some pictures of the event. Hopefully we'll get the videos of all the talks on the website soon. Most likely it'll take us a few days though.
Collaboration Summit & Cherokee Summit
It's been a little over two weeks since the Linux Foundation Collaboration Summit took place. As always, it was a pretty interesting event well worth attending.
The content of the summit was enlightening in so many ways. First, because of the quality of the speakers. Not all events have so many recognizable and highly involved speakers as the Collaboration Summit, and that is a huge plus for the event. Secondly, because of the propitious environment to meet people. I got to meet a whole lot of people during the three days of the conference, and of course to greet many old friends.
So, having said that, it is time to look to the future, and more specifically to the upcoming events and challenges. The first one will be the long-awaited Cherokee Summit 2010. I have no words to describe how excited I am about this event. It will be the first users and developers conference around the Cherokee Project ever!
So far, there are around 80 registered attendees, so it will be a fairly modest conference. However, it represents an important milestone for the Cherokee project: we will be releasing Cherokee 1.0, and most importantly, it will mobilize a crowd of experts on High Performance and Scalable web around Cherokee. In fact, we will take advantage of the occasion, and we have scheduled an open session to discuss the Cherokee 2.0 roadmap.
Cherokee Summit will also be the perfect opportunity to start a conversation about the local communities. So far the Polish, German, Chinese and Hispanic Cherokee communities have popped up. However, there is no coordination between them and the main project, and that's something I think we ought to improve from now on. That will be the topic of the open session of the second day.
All in all, it will have plenty of interesting attendees, talks, chats, giveaways, etc. :-) There are still a few empty spaces, so do not hesitate to register if you'd like to attend!
















