Jan 25, 2015

Future of CherryPy: bright and shiny?

First off, I really hope so, though at the same time I see the reasons for it. Moreover, I have been trying to contribute back to the community by sharing knowledge that could hopefully smooth the rough edges I’ve met on my way of learning and employing CherryPy [1]. And when CherryPy is calling, I have something to say.

This article is my reply to the holiday post [2] by Sylvain Hellegouarch in the CherryPy user group about the project’s current state and future, which recalls an older, big discussion [3] about the status of CherryPy development.

Acquisition

My CherryPy experience started at the beginning of 2011. I had a chance to jump ship to Python web application development and I didn’t miss it. At that moment I had years of web development experience, so I was looking for a certain kind of software. If I started this search today, most of the front-page titles would be the same. The same “you don’t have to be the best to be popular” still applies to the domain of Python applications.

I spent a couple of weeks reading the web: reviews, comparisons, benchmarks [4] and StackOverflow [5]. And I have to admit that there are paths to CherryPy for newcomers who look for a stable, fast, well-designed and pythonic HTTP framework with its own Zen [6] (like Python itself [7]), one that doesn’t dictate application design or what tools to use, but rather provides structure, testability and primitives that help code scale.

Scope

Where does CherryPy stand? I think it stands perfectly as a Python HTTP application server. It allows a developer to write genuine Python code, use the best from the Cheese Shop, and design applications in whatever way desired (within the restrictions of the threaded execution model).

Self-contained packages

I agree that a stable pure-Python HTTP server is one of the major features of CherryPy. Basically, in most cases one can just forget about WSGI and use good old HTTP, with bare CherryPy or behind a reverse proxy like nginx. Fewer intermediaries mean fewer problems, as it simplifies development and deployment. However, I don’t think it’s an additive feature. If you extract the web server into a stand-alone package, be it named Cheroot or something else, it won’t be as valuable for application development. If there are users of other frameworks who host their WSGI applications with CherryPy, that’s a deployment problem for them and for those frameworks’ authors. This usage is ultimately a byproduct of CherryPy’s WSGI compliance and stability, and let it stay so.

On the other hand, as the questions raised in the post are mostly about the cost of CherryPy maintenance, if increased modularity helps to reduce it, then it surely makes sense. But the extraction also has its cost, and if the code of the target functionality is tightly coupled with the rest of the codebase, then the benefits of extraction should clearly outweigh that cost. At the least, inner-package refactoring towards more encapsulated components is possible with almost the same benefits.

Implementation support

Regarding supported Python implementations, the official website says: “Python 2.5+, 3.1+, PyPy, Jython and Android”. The documentation says: “CPython, IronPython, Jython and PyPy” and “CherryPy supports Python 2.3 through to 3.4”. There’s no need to discuss the official distribution, CPython, but its versions are in question. There’s a lot of compatibility code for old Python versions in CherryPy, which can be shrunk by lifting the bar closer to current ones. I don’t think the move is for the 3.x series, but for CherryPy 4, CPython 2.7 and 3.3+ are fine. This is a real way to make the codebase cleaner, smaller and fun to deal with again.

PyPy is constantly gaining ground and is supported by a growing number of libraries. However, I can’t think of many real use-cases (unless we count a long task which should run in the background, but is run synchronously as an effort tradeoff, assuming it can benefit from the JIT), nor do I know if there’s CPU-bound hot CherryPy code. Mostly I feel it’s a cool-factor, but with possible real use-cases. So I think it’s a good idea to keep it. The problem with supporting PyPy is its garbage collectors. They aren’t based on reference counting, thus files, sockets and everything else that closes itself on CPython when it passes out of scope with zero references won’t deterministically do so on PyPy. This places stricter demands on code to keep it leak-free. Python 3’s ResourceWarning can help with it. I left a bug report [8] on the subject once when I was porting a CherryPy-based library to Python 3.
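To illustrate the point about deterministic cleanup, here is a minimal sketch using only the standard library. The function names are mine, made up for illustration, and it assumes Python 3, where ResourceWarning exists. It is not taken from CherryPy’s code.

```python
import os
import tempfile
import warnings

# ResourceWarning is ignored by default; make it visible while developing.
warnings.simplefilter("default", ResourceWarning)


def read_relying_on_refcount(path):
    # On CPython the file is closed as soon as the last reference goes away.
    # On PyPy it stays open until the garbage collector gets around to it.
    return open(path).read()


def read_deterministically(path):
    # The with-block closes the file on exit on every implementation.
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"hello")
    os.close(fd)
    try:
        print(read_deterministically(path))
    finally:
        os.remove(path)
```

The second form is the one that behaves the same on CPython and PyPy. The first one is the kind of code that ResourceWarning is meant to point out.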

The rest is more questionable. CherryPy running on Android is clearly a cool-factor. SL4A with Python isn’t straightforward to install, and even though there’s a complete CPython 2.6, it barely has a single real use-case. For those who write HTML5 applications, PhoneGap and the like already provide a file API, SQLite and the other important things we use the server side for. There’s Kivy with established deployment. I feel the same way about Jython and IronPython support. I have no idea if anyone is actually using CherryPy on them, but having them officially supported complicates things: testing, debugging, developing the framework. I think these highly specific environments should be covered by the enthusiasts who deal with them, if there are any out there. If there’s a bug, say for Jython, it should be treated as an enhancement and fixed only if the fix doesn’t impede the design, performance and maintainability of the primary implementations’ code. I think guarantees and corresponding measures are needed, so it’s clear what is officially supported by CherryPy, and what it merely may run on.

To depend or not to

For an application it is perfectly fine to depend on as many libraries as needed, to delegate as much out-of-domain functionality as possible away from its codebase. For a library, on the other hand, it’s not as good an idea. There’s a functional difference between a library and a framework, but for the current form of CherryPy these terms feel interchangeable.

For a library, having no dependencies is a feature, as the general “fewer moving parts, fewer issues” rule applies. The issues may be broken, abandoned or license-changed 3rd-party dependencies. They may be incompatible changes, or C modules which complicate installation and use on PyPy and other implementations and platforms. Thus, as CherryPy has grown its own pure-Python internals which are fast and stable, I think there’s no point in abandoning them.

Hype, fuss and choices

The last, but not least, thing I want to note about the scope of CherryPy is its technology choices. WSGI is a good example here. From its inception it was likely a cool thing in Python. WSGI here and WSGI there, standards, uniformity and stuff. And then, a little later… oops, a synchronous protocol. No WebSockets, no SSE, no other near-real-time techniques. This choice affects, for example, things like django-sse [9] and its users [10] (though this community is the strongest in seeing everything as a nail). But it also affects PaaS providers like OpenShift [11], and probably others.

That is to say, it’s okay to be compliant (e.g. WSGI-compliant), but it’s not okay to be a pliable bigot who praises the next cool X at every turn. In that sense I like how CherryPy stands. Simple is simple in a threaded server. Real-time stuff needed? Sylvain’s ws4py [12] to the rescue. Simple background processing? There you go with cherrypy.process.plugins.BackgroundTask. So I want to keep seeing CherryPy making conscious and pragmatic technology choices.

2014 saw a new breed of asynchronous library in the Python 3.4+ batteries, asyncio. It is nice to see co-operative execution in Python find more expressive language constructs and tooling, but apart from its niche uses I suspect it’ll be the next asynchronicity-everywhere craze, which will lead to rewrites, new cool stuff and more “pressure” on users of “old” software. At least, there’s no evidence that it can be faster on generic load, as CherryPy is neck and neck with Tornado [4]. There’s a good note on socket performance [13] in the Python documentation which also applies generally to such a design choice.

There’s no question that the fastest sockets code uses non-blocking sockets and select to multiplex them. You can put together something that will saturate a LAN connection without putting any strain on the CPU. The trouble is that an app written this way can’t do much of anything else - it needs to be ready to shuffle bytes around at all times.

Finally, remember that even though blocking sockets are somewhat slower than non-blocking, in many cases they are the “right” solution. After all, if your app is driven by the data it receives over a socket, there’s not much sense in complicating the logic just so your app can wait on select instead of recv.

Audience

Naturally, appreciation comes from the target audience. For a newbie in HTTP (which is a complex protocol, by the way) and in application development in general, Django may be a blessed gift because it helps her meet her goals by answering as many development questions as possible. At the same time, to an experienced developer it may seem a worthless bulk of bloatware that solves all its tasks equally poorly (like all generic solutions, though). The same newbie won’t be able to use CherryPy effectively because it won’t answer those several dozen questions for her.

If CherryPy stays CherryPy, there can’t and shouldn’t be answers for many of them. How to design my application? What application directory layout to use? What persistence layer and storage to employ? How do I render a template in CherryPy? It’s all up to you, developer. From my experience of reading others’ opinions on the web, and particularly in the user group, there is a unanimous opinion that CherryPy’s barrier to entry is high, but once you’ve wrapped your head around its concepts you’re enlightened and empowered.

These points lead to two measures for increasing the CherryPy user base, and hopefully contributors at some proportional rate. The first is targeting a more specific audience. Examining the main page [1], it is at least not bad. It shows that simple is easy, and gives clues that complex is possible. It refers to any other object-oriented Python program, which indicates a good API. But it’s a biased opinion, because I look at things in reverse. Is it really that comprehensible for someone who is making the decision at the moment? Is there some real-world CherryPy code that is widely considered worth exposing, that is done the “right” way and promotes best CherryPy practices? The second is a means of lowering and alleviating the barrier to entry; see the Improvement section.

A note about code hosting, which is supposed to affect the audience. I agree with Konstantin Molchanov that magic pixies… wait no, the best friend of Ruby and Rails, awesome to the backbone, the octocat won’t make CherryPy popular by a wave of its tentacle. There’s no outreach program, nor is there traffic charity for newcomers on Github. It’s a popular code hosting service which makes it easy to contribute to a Git repository, in case there are people willing to. But if there are no new contributors, I don’t see how it can help.

It may even work in the opposite direction, because if there’s a Python developer willing to contribute to a Python project, the developer may expect a Python tool for it, which Git is not. I think it’s a deluded and naive idea to turn away from Mercurial to get more Python users. At the same time, I cannot disagree with Nic Young that there are a few developer services (mostly for quality assurance) which integrate only with Github, which are free for open source projects and may be useful for CherryPy (e.g. TravisCI is not Git-only, but Github-only because it’s tied to its API), albeit none of them lacks alternatives.

Part taken

A year after I commenced my CherryPy affair, one web application project was in production and I possessed a certain portion of experience. Specifically, deployment struggle experience. Having HTTP-only deployment was very helpful, because I could reuse the knowledge I had, but the Linux part was scarce at best and poorly covered, and I was groping for valuable pieces of knowledge all over the web. Given a mint Debian box, how do I make my application continuously deployable? How do I make it a first-class daemon and run it as www-data? How do I put it behind nginx? How do I monitor its memory and CPU usage? How do I restart it if it has crashed? How do I rotate its logs?

By that time I already had the answers, and they were put together and published as a tutorial/skeleton project [14]. I felt it was a necessary contribution to make. Obviously, it hasn’t received high-volume traffic, but qualitatively there are constant visits from all over the world (at least from the countries where people do IT), which indicates that CherryPy still attracts new users.

In mid-2013 I wrote another tutorial/skeleton project [15], and even though its goal is maintainable JavaScript website-like application design, its server side is a CherryPy application with a real-world template layout for Jinja2 and a corresponding CherryPy tool. I found time to write documentation for it only at the end of 2014, so it is also an on-topic thing.

In the spring of 2014 I started to participate in StackOverflow’s CherryPy tag [5]. It was another pay-back kind of thing, because the site had been helpful to me, and its CherryPy part specifically. But the latter sadly had a lot of unanswered questions, making it feel a little abandoned. In fact many questions are quite interesting, dealing with the performance, design and execution model of CherryPy, so answering them requires a good grasp of the documentation and the codebase. Besides making other users’ CherryPy experience better, it is a fun and educational activity. It sheds light on how people really use CherryPy, how they start, where the remaining rough edges are and what room there is for improvement.

Improvement

Inherently I’m an application guy. I pick out cog-wheels, tune them and make them spin together. I don’t often work on a library, nor do I have much time for it, but because I really value CherryPy I want to have a way to do it, in case I have the time and an idea for an improvement or a fix. For instance, I want to add a privilege drop plugin to cherryd and make other minor improvements to it.

When you file a bug report, you generally expect someone, a core contributor or so, to tell you whether it is valid or not. Then it may be re-prioritized, assigned to someone or put on a shelf to await its day. Without this initial input it’s a little confusing. “Okay, no one replied. It must be meant to work the other way”, you may think, at best.

Understanding the flow of one’s patch to the CherryPy codebase is also important. The official site says: “fork CherryPy on BitBucket here and submit pull-request with your modifications”. Both when it’s a bug and when it’s an enhancement? Whom and where to ask a flow question, say about branching, or about the eligibility of a change? There’s CONTRIBUTING.txt, which appeared not long ago, and it points to Jason Coombs’ post [17] about writing a perfect pull request. It is at least something, but I think it can be clearer and more CherryPy-specific. Also I think it should be in the documentation, linked from the website’s main page.

Quality assurance

I think things like bug #1298 [16] should never happen to a project which is maintained in a responsible way and puts effort into quality assurance. Okay, I see tox.ini in the root of the project. I clone it, and run tox. Duh, all environments have failed. Besides, not all the claimed environments are in the file. Moreover, it’s some kind of “back in Soviet Russia” story. Why the hell do unit tests ask me questions!? Are they examining me instead? More surprises? ShiningPanda is dead [18], and so is the link to it from the CherryPy website under the online tests section (continuous integration is kind of the standard term for the thing, as online tests refers more to IQ tests, psychology, et cetera).

To put it honestly, this is a reckless way of maintaining a project. Reckless towards users whose applications will break with the next release published this way. And interpreting the result takes no more time than checking whether the build page is green or red. No, I’m not promoting those cool badges one can see all around. I’m talking about a responsive flow where every contributor runs the tests on all supported Python implementations (with Tox) before pushing her changes. If one is too lazy to set up all the implementations, but has itching fingers to push some changes anyway, a notification like “Hey, you’ve just broken CherryPy. Please fix ASAP” would be helpful. I think it makes sense for everyone who has write permission to the repository to receive such a notification. At the very least, the one who publishes CherryPy to the Cheese Shop should have a rigid requirement to check the CI page.

The same concern, with lesser strain, applies to the ReadTheDocs builds, which have all failed for the last 3 months [19].

This is a problem which needs to be solved as soon as possible. In 2014 I tried the CI service called Drone.io, which supports the three major code hosting services. Well, it turned out to be usable (build page example [20]). I can assist or set it up for CherryPy, once the flow and role questions are answered.

Barrier to entry

Hopefully, one of the biggest barriers I faced myself seems to have passed. I mean the documentation link erosion, where almost every single CherryPy-related link on the web led to a 404 page. I think ReadTheDocs adoption is a huge improvement, especially considering that the new documentation, while clearly a better one, lacks some details that were covered in version 3.3. Having persistent, versioned and indexable documentation is very helpful.

I think it should be stated more or less directly that if one wants a framework to make application design decisions, one is better off using something else, not CherryPy. Okay, here’s Joe, who knows enough about Python. He’s conscious and is willing to make application design decisions and take responsibility for them. But he also has a deadline. Joe doesn’t expect his first real-world CherryPy application to be shiny from an experienced CherryPy developer’s point of view, just functional from the end user’s. What’s the best way to help Joe create a functional prototype quickly, and let him gradually improve the design and codebase of his application thereafter, along with his CherryPy knowledge?

Let’s look at CherryPy knowledge as a tree. Each branch represents a piece of functionality. Say, a CherryPy tool branch. As it goes from the root, there are nodes like “Copy-paste this decorator to turn your handler result into JSON”, “What are other tools capable of?”, “What is a CherryPy tool, and how to write one?”, “Tool hook-points, types, priorities”, etc. The same goes for other branches, from basic to advanced. And what Joe will benefit from the most is a breadth-first traversal of the tree. Better if the first few levels are illustrated with runnable snippets. This way it’s possible to give an overview, tunable blocks of code and directions for improvement. For the latter it is important to spur creativity without enforcing design decisions, in the way of: start with this, improve by yourself.
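As an idea of what a first-level, copy-paste node could look like, here is a plain-Python sketch of a decorator that turns a handler’s result into JSON. It is an illustration only: the Api class and its status handler are hypothetical, and the decorator is a stand-in written with the standard library. CherryPy itself ships a built-in tool for this job, cherrypy.tools.json_out, which also sets the response content type.

```python
import functools
import json


def json_out(handler):
    """Serialize the wrapped handler's return value to a JSON string."""
    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        return json.dumps(handler(*args, **kwargs))
    return wrapper


class Api(object):
    # Hypothetical page handler; in CherryPy it would also be exposed.
    @json_out
    def status(self):
        return {"ok": True}


if __name__ == "__main__":
    print(Api().status())
```

The deeper nodes of the same branch would then explain why a real tool is preferable to such a decorator, and how hook-points and priorities come into play.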

I’ve just glanced over the latest documentation again, and surprise, it narrates in a very similar fashion! Sylvain has done a great job. But what catches my eye is that there are two separate tutorial series now: one in the source code [21], another in the documentation [22]. Even though both serve the same purpose, the former is much easier to maintain and to cover with tests, which brings the bonus of a better regression test suite. If the latter’s textual flexibility can be sacrificed for a lower maintenance cost, I think it’s possible to generate the tutorial section from module docstrings and code, and merge the two.

However, both series deal with the simplest cases and there’s a big gap between them and real-world application requirements. Here I would also like a snippet to build on directly or to get ideas from, to get a prototype faster, but with real-world tools. Like safe database access in a threaded environment, URL routing and generation, multi-application design and so on. Sylvain’s CherryPy recipes [23], referred to in the documentation several times, may answer this to some extent, but in general I think it’s still an open question. So is whether it should be maintained in a centralized fashion within CherryPy, or whether it’s more suitable for an external source, like a post on somebody’s blog or StackOverflow.
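For the first item, safe database access in a threaded environment, one common approach is a connection per thread. The following is a minimal sketch of that idea with the standard library only. The get_connection helper is a name I made up, SQLite is just a convenient stand-in for a real database, and none of this is CherryPy API.

```python
import sqlite3
import threading

# Each thread sees its own attributes on this object.
_local = threading.local()


def get_connection(path):
    """Return the calling thread's own connection, opening it on first use.

    Assumption for this sketch: every call in a thread uses the same path.
    """
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(path)
        _local.conn = conn
    return conn


if __name__ == "__main__":
    conn = get_connection(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")
    print(conn.execute("SELECT x FROM t").fetchall())
```

Since a threaded server reuses its worker threads across requests, each worker would open its connection once and keep using it, and no connection is ever shared between threads.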

Contribution packages

I think Eric Larson’s suggestion about cherrypy-* packages makes sense. For example, there’s a package called cherrys [24] which is a good example. It’s a Redis adapter for CherryPy session handling. What’s important is that it has a clearly defined scope and doesn’t affect application design. Accordingly, it might have been named cherrypy-redis-session.

A package that is an implementation of a CherryPy interface is surely appropriate for packaging and sharing. Sharing a CherryPy application with a clearly defined scope, like Dowser [25], is also beneficial.

On the contrary, specific tools or plugins that affect application design shouldn’t be packaged. For instance, Sylvain’s recipes are all better left as recipes. Let’s look at the Jinja2 recipe [26]. Besides the fact that having a class per file isn’t pythonic, and that splitting a thing into a plugin and a tool isn’t always rewarding, it makes certain assumptions about how template file names are defined for the URL. Jinja2 is a flexible template engine with a notion of template inheritance. Real-world projects have dozens of templates and tend to organize them by role and use. Having some convention-over-configuration rule for template names is very handy. So I want to say it’s usually better to have a small and effective application-specific CherryPy tool or plugin rather than trying to invent something overly broad and generic.

Other changes

Bitbucket pages should have either up-to-date supplementary information or none at all. The menu’s community link [27] leads to the wiki home page, whose version section lists 3.2.5 as the latest version. The menu’s development link [28], which points to the repository overview page, refers to Python 2.3, whereas the site’s version range starts with 2.5+. The reference to python setup.py install as the installation command, which requires a manual download, ought to be changed to the normal pip install cherrypy.

I think, minor changes on the website are needed:

  • no need for a download link in the menu,
  • add an install section after features which says pip install cherrypy,
  • the community link is better off referring to the user group,
  • rename online tests, change the service link, use a better sentence that emphasizes reliability.

About the _cp module prefix. There are some public classes like cherrypy._cptools.HandlerTool, which may be used in user code directly or as superclasses. This isn’t correct naming according to the underscore convention. So it’s a good idea to change it some day. Though as a PyDev user I can’t recall any issues with it.

Wrap-up

I can conclude that there are two things that I see as problematic: quality assurance for the various environments CherryPy claims to support, and the lack of contribution guidance, specifically in the form of a bug tracker supervisor. The rest feels normal, if that makes sense, and, as is usually the case, needs small constant improvement and polishing.


[1] http://cherrypy.org/
[2] https://groups.google.com/forum/#!topic/cherrypy-users/lT1cxovGyy8
[3] https://groups.google.com/forum/#!topic/cherrypy-users/MjrIZ_jljRQ
[4] http://nichol.as/benchmark-of-python-web-servers
[5] http://stackoverflow.com/questions/tagged/cherrypy
[6] https://bitbucket.org/cherrypy/cherrypy/wiki/ZenOfCherryPy
[7] https://www.python.org/dev/peps/pep-0020/
[8] https://bitbucket.org/cherrypy/cherrypy/issue/1331
[9] https://github.com/niwibe/django-sse
[10] https://lincolnloop.com/blog/architecting-realtime-applications/
[11] http://stackoverflow.com/q/25845161/2072035
[12] https://ws4py.readthedocs.org/
[13] https://docs.python.org/2/howto/sockets.html#performance
[14] https://bitbucket.org/saaj/cherrypy-webapp-skeleton
[15] https://bitbucket.org/saaj/qooxdoo-website-skeleton
[16] https://bitbucket.org/cherrypy/cherrypy/issue/1298
[17] http://blog.jaraco.com/2014/04/how-to-write-perfect-pull-request.html
[18] http://shiningpanda.com/shiningpanda-ci-clap-de-fin.html
[19] https://readthedocs.org/builds/cherrypy
[20] https://drone.io/saaj/qooxdoo-cherrypy-json-rpc
[21] https://bitbucket.org/cherrypy/cherrypy/src/tip/cherrypy/tutorial/?at=default
[22] http://cherrypy.readthedocs.org/en/latest/tutorials.html
[23] https://bitbucket.org/Lawouach/cherrypy-recipes
[24] https://pypi.python.org/pypi/cherrys
[25] http://www.aminus.net/wiki/Dowser
[26] https://bitbucket.org/Lawouach/cherrypy-recipes/src/tip/web/templating/jinja2_templating/?at=default
[27] https://bitbucket.org/cherrypy/cherrypy/wiki/Home
[28] https://bitbucket.org/cherrypy/cherrypy/overview