Searching for Tao

The Case Against Drupal

I've been doing Drupal development on and off for nearly three years now and it's always been frustrating. I'm a pretty vocal and animated kind of person too, so my co-workers soon came to know me as the anti-Drupal guy, which can be pretty rough when your employer has chosen to standardise on the platform. Now that I'm finally out of the Drupal world, I wanted to write a little about the platform, specifically speaking to its weaknesses and failures.

My hope here is twofold: (a) that this post serve as a means of communicating to the thousands of frustrated developers out there that they're not alone in their pain, and (b) that perhaps some of this post will help development shops choose Drupal where it is appropriate and other technologies where it is not.

For the Drupal fan(girl|boy)s, I ask only that you try to read this with an open and constructive mind. While I may rant and curse about Drupal in my Twitter feed, I've tried very hard to make this an unemotional, hopefully useful post about something I've spent a lot of time thinking about and working with.

Drupal Centricism

Drupal Ideology

It seems to be a mantra within the community: "You don't even need to write code". The Drupal ideology is user-centric, choosing ease-of-use over performance at every turn. There's nothing wrong with this of course, so long as your goal is to let unskilled people make websites. However, if your priority is a performant application capable of handling a lot of traffic, you're going to have a number of problems.

Some examples of prioritising user-focus over performance:

  • Silent failures are the bane of any developer's existence. It's important to know when a variable isn't defined, or that writing a record to the database failed, or that a file didn't upload properly. Drupal suppresses such messages by default, and as a result nearly every contrib module in the community is so riddled with errors and warnings that development with these messages enabled is near impossible.
  • Views, the de facto standard way to store and retrieve data from your database, stores its queries in the database, so that in order to perform a query against the database, you must first fetch the query from the database. Similar inefficiencies can be found in other "standard" modules like CCK and Panels.
  • Drupal relies almost entirely on caching in order to function at all. Without caching, a technique usually reserved for high to extreme traffic situations, Drupal can't handle even a small number of concurrent visitors. Indeed, some projects I've seen have taken more than 10 minutes to load a single page, even in development where there was only one connection in use.

Drupal Magic

It's a term celebrated by many in the community. The idea being that Drupal does a mountain of work for you, so you don't have to worry about it. The only problem is that when you're trying to build a finely-tuned application, most of this magic either gets in the way, or even works against you. You get 80% of the way there with Drupal and its contrib modules, and then spend three months fighting the whole application, undoing the damage it's done, just to get what you need out of your website.

The hook-dependent system requires and fosters this anti-pattern. Re-using code often means unpredictable, site-wide changes. A property is written in module X, overwritten in module Y, and altogether removed in module Z, and there's no way to be certain that these functions will execute in a predictable order.
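
To make that concrete, here's a toy sketch of the pattern in Python. It is not Drupal's API: HookRegistry, module_x, module_y, module_z and render() are all made-up names for illustration. Three "modules" touch the same property, and the final result depends entirely on the order in which they happen to run:

```python
class HookRegistry:
    """Toy hook system: every registered callback edits one shared dict."""

    def __init__(self):
        self.callbacks = []

    def register(self, callback):
        self.callbacks.append(callback)

    def invoke(self, data):
        # Callbacks run in registration order, each mutating the same data
        for callback in self.callbacks:
            callback(data)
        return data


def module_x(page):
    page["title"] = "Set by X"          # writes the property


def module_y(page):
    page["title"] = "Overwritten by Y"  # overwrites it


def module_z(page):
    page.pop("title", None)             # removes it altogether


def render(order):
    """Run the hypothetical modules in the given order on an empty page."""
    registry = HookRegistry()
    for module in order:
        registry.register(module)
    return registry.invoke({})


print(render([module_x, module_y, module_z]))  # {}
print(render([module_z, module_x, module_y]))  # {'title': 'Overwritten by Y'}
print(render([module_y, module_z, module_x]))  # {'title': 'Set by X'}
```

Nothing in any single module tells you which of those three results you'll get; you have to know about all of them, and about the order they run in.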

This problem is notably worse when it comes to new developers on a project, since they will undoubtedly not be privy to the magic that is running under the hood, and will have a difficult time discovering it on their own. To those who will answer this with "the project simply needs better documentation", I respectfully suggest that a good code base is easy to understand, and doesn't require a manual that is usually out of date.

To work with Drupal Magic is to attempt to produce useful code against an unordered, uncontrolled, grep-to-find-what-is-going-on-dependent architecture.

Drupal Community

For all the victories in community engagement Drupal has achieved (a massive, diverse and engaged membership), it's the glaring failures that make the whole project a miserable situation for developers. I've already mentioned the standardising on inefficient modules, but I haven't talked about the mountains of really horribly written code yet. Drupal Core, for what it does, is pretty efficient, but too many contrib modules are written by inexperienced developers, or are simply incapable of scaling to enterprise-level capacity. The result of this is that non-developers (managers, sometimes even clients) will point to the functionality of module X and insist: "don't redesign the wheel, just use that", and you spend the next three weeks trying to work around the poor design of said module, eventually being forced to write garbage that talks to garbage.

Often the perceived strength of the community is Drupal's greatest weakness. Drupal is promoted based on its theoretically infinite feature set, but the reality is that in order to use every one of those contrib modules in your site, the memory footprint will be massive, the stability suspect, and the performance abysmal. And gods help you if you try this on a site with millions of users or a similar number of content nodes.

Drupal Establishment

None of this is a problem however if Drupal is used where its features and shortcomings are both understood and accepted as the nature of the platform. Drupal is a great tool in some situations and a horrible burden in others. Sadly, this has not yet sunk in with many of the decision-makers in the web development community. Drupal is being used and promoted as a solution hammer, with every potential development project, a Drupal-shaped nail.

This has a number of negative outcomes, the most dangerous of which is a lack of skill diversity in developers. Companies that insist on Drupal-centric development are in fact promoting ignorance of alternatives that might do a better job and that hurts everyone. Unless developers at these companies take it upon themselves to spend time outside of their 8-12 hour work day to write code for a different platform or language, this Drupal dependency will force their non-Drupal skills to atrophy, limiting their ability to produce good code in the future.

Conclusion

I'm finally at the end of my admittedly unenthusiastic involvement in the Drupal community. Whether the Drupal shops out there read this isn't really up to me, but I hope that this manages to help some people re-evaluate their devotion to the platform. Comments are welcome, so long as they're constructive (I moderate everything), but I'm not going to get into a shouting match on the Internet. If you think I'm wrong, we can talk about it in 5 years.

An Output Highlighter

I wrote something like this some time ago, but this version is much better, if only because it's in Python. Basically, it's a script that highlights standard input based on arguments passed to it.

But how is that useful? Well imagine that you've dumped the contents of a file to standard output, maybe even piped it through grep, and/or sed etc. Oftentimes you're still left with a lot of text and it's hard to find what you're looking for. If only there was a way to highlight arbitrary portions of the text with some colour...

Here's what you do:

$ cat somefile | highlight.py some strings

You'll be presented with the same body of text, but with the word "some" highlighted everywhere in light blue and "strings" highlighted in light green. The script can support up to nine arguments which will show up in different colours. I hope someone finds it useful.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import re
import sys

colours = [
    "\033[1;34m", # light blue
    "\033[1;32m", # light green
    "\033[1;36m", # light cyan
    "\033[1;31m", # light red
    "\033[1;33m", # yellow
    "\033[0;32m", # green
    "\033[0;36m", # cyan
    "\033[0;33m", # brown
    "\033[1;35m", # pink
    "\033[0m"     # none
]

args = sys.argv[1:]

# Strip out arguments exceeding the maximum
if len(args) > 9:
    print("\n%sWARNING: This script only allows for a maximum of 9 arguments.%s\n\n" % (colours[4], colours[9]), file=sys.stderr)
    args = args[:9]

for line in sys.stdin:
    # Each argument is treated as a regular expression and gets its own colour
    for colour, arg in enumerate(args):
        line = re.sub(
            r"(%s)" % arg,
            r"%s\g<1>%s" % (colours[colour], colours[9]),
            line
        )
    try:
        print(line.rstrip("\n"))
    except BrokenPipeError:
        # The reader (e.g. `head`) closed the pipe early, so stop quietly
        break
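
One limitation of the loop above is that it runs one substitution per argument, so a later argument such as "m" or "1" can match inside the escape codes inserted for an earlier one. A minimal sketch of one way around that is to combine all the arguments into a single pattern and colour each match in one pass. The highlight() helper and the shortened COLOURS table here are assumptions for illustration, not part of the script:

```python
import re

RESET = "\033[0m"
COLOURS = [
    "\033[1;34m",  # light blue
    "\033[1;32m",  # light green
    "\033[1;36m",  # light cyan
]


def highlight(line, patterns):
    """Colour every match of patterns[i] with COLOURS[i] in a single pass."""
    # One named group per pattern, so we can tell which pattern matched
    combined = re.compile(
        "|".join("(?P<h%d>%s)" % (i, p) for i, p in enumerate(patterns))
    )

    def paint(match):
        for i in range(len(patterns)):
            if match.group("h%d" % i) is not None:
                return COLOURS[i] + match.group(0) + RESET
        return match.group(0)

    return combined.sub(paint, line)


print(highlight("some strings here", ["some", "strings"]))
```

Because the text is only scanned once, the escape codes are never fed back through the patterns. The trade-off is that when two patterns could match at the same position, the one given first wins.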

Your History in 140 Characters

Wil Wheaton posted a request to Twitter today for an easy way to fetch all of one's tweets and store them locally. Someone might want to do that for a personal archive, or to port their data over to a Free implementation like Laconica. Whatever your reasoning, here's a quick and dirty way to do it:

for i in {1..999}; do
  curl -s "http://twitter.com/statuses/user_timeline.xml?screen_name=your_screen_name&count=200&page=$i" | grep '<text>' | sed -e 's/^ *<text>\(.*\)<\/text>/\1/'
  sleep 2
done

Just hit "ctrl-c" when you hit your first post ever.

php.py

I wrote something rather fun today and I thought that I'd share it here. It's a Python module that you can use to interact with PHP products. Specifically, it's a reproduction of PHP's http_build_query() and parse_ini_file() functions that act as PHP does according to PHP's own way of doing things.

This means that if you've written an API server (as we have) in PHP that makes use of things like the above, you can interact with it using Python as your scripting language with little effort.

Examples:

from php import parse_ini_file

config = parse_ini_file("/path/to/config.ini")
print(config["sectionName"]["keyName"])

This would give you the value for keyName in the section called sectionName in your config.ini file.

from php import http_build_query

somedata = {
  "keyname": "valuename",
  "otherkey": 123,
  "anotherkey": [1,2,3,{"seven": "eight"}]
}
print(http_build_query(somedata))

This would give you:

otherkey=123&keyname=valuename&anotherkey[1]=2&anotherkey[0]=1&anotherkey[3][seven]=eight&anotherkey[2]=3&

The code was fun to write, and I'm guessing that it'll be useful to others so I'm posting it here. If you do end up using it, lemme know by posting a comment here eh?

You can download it here: php.py.

When I mentioned this to some other coworkers, they pointed out that I'm not the only one trying to get some of PHP's odd functionality into Python. Another developer has mimicked PHP's serialize() functions in the form of a Python module. I wonder if there are any other cases where this kind of stuff might be useful.
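
For anyone curious how the bracketed keys in the http_build_query() output above come about, here's a minimal sketch of the flattening idea. This is not the code in php.py: flatten() and php_style_query() are hypothetical names, and the sketch only assumes that nested dicts and lists become key[subkey] pairs:

```python
from urllib.parse import quote_plus


def flatten(value, prefix):
    """Yield (name, scalar) pairs, nesting keys PHP-style as prefix[key]."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, (list, tuple)):
        items = enumerate(value)
    else:
        yield prefix, value
        return
    for key, item in items:
        yield from flatten(item, "%s[%s]" % (prefix, key))


def php_style_query(data):
    """Build a query string from a dict, keeping the dict's own key order."""
    pairs = []
    for key, value in data.items():
        for name, scalar in flatten(value, str(key)):
            # Brackets are left unescaped here purely for readability
            pairs.append(
                "%s=%s" % (quote_plus(name, safe="[]"), quote_plus(str(scalar)))
            )
    return "&".join(pairs)


somedata = {
    "keyname": "valuename",
    "otherkey": 123,
    "anotherkey": [1, 2, 3, {"seven": "eight"}],
}
print(php_style_query(somedata))
```

The pairs come out in a different order from the output shown earlier because current Python dicts preserve insertion order. Note too that PHP itself percent-encodes the brackets, so drop the safe="[]" argument if you need byte-for-byte compatibility.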

PHP is Not as Untyped as You May Believe

I stumbled upon an ugly PHP bug today and thought that I would share. While PHP is supposed to be an untyped language, this isn't always the case. The following code snippet for example does not do what you might expect:

switch ($output->status)
{
  case 0: $output->status = 'fail'; break;
  case 1: $output->status = 'ok';   break;
  case 2: $output->status = 'stub'; break;
}

With this code, passing in a string such as "ok" sets $output->status to "fail". This comes down to PHP's type juggling rather than a lack of it: switch uses loose comparison (==), and when a string is compared against an integer, PHP versions before 8.0 convert the string to a number first, so a non-numeric string like "ok" becomes 0 and matches the first case. If however you change the cases to strings:

switch ($output->status)
{
  case '0': $output->status = 'fail'; break;
  case '1': $output->status = 'ok';   break;
  case '2': $output->status = 'stub'; break;
}

Everything works as expected. Pretty lame if you ask me, but there it is.

The Five Stages of Refactoring

Big thanks to Corey for brightening my day with this one:

  1. Disbelief
    • "Who wrote this!?"
  2. Anger
    • "I'm not cleaning this up!"
  3. Bargaining
    • "Okay, we'll fix up this module if you promise we'll just rewrite everything else."
  4. Depression
    • "This is never going to get any better."
  5. Acceptance
    • "I'll just create a wrapper..."

My Cascading Geocoder

I just read a nifty post on monkeycycle about how to geocode a spreadsheet with free tools from Google and Yahoo. It occurred to me that this is probably the kind of thing people go looking for, so I thought I'd post my shiny new bit of code here.

I call it a cascading geocoder. The idea is that most of the time a single geocoding service is pretty good, but sometimes it goes down, and other times it can't understand the address. For the project I'm working on, that wasn't acceptable, so I wrote some code that attempts to geocode an address first with Google and then, if that fails, falls back to geocoder.ca's engine.

It's fully object oriented and very clean. It's also GPL. Download it here if you're interested ^_^
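
If you just want the gist of the cascade without downloading anything, it can be sketched in a few lines of Python. This is an illustration of the idea, not the code linked above: CascadingGeocoder, GeocodingError and the two stand-in lookup functions are hypothetical, and the stand-ins return canned values instead of calling Google or geocoder.ca:

```python
class GeocodingError(Exception):
    """Raised when no geocoder in the chain can resolve an address."""


class CascadingGeocoder:
    def __init__(self, geocoders):
        # geocoders: (name, callable) pairs, tried in the order given
        self.geocoders = list(geocoders)

    def geocode(self, address):
        failures = []
        for name, lookup in self.geocoders:
            try:
                result = lookup(address)
            except Exception as error:
                # The service is down or erroring: note it and move on
                failures.append("%s: %s" % (name, error))
                continue
            if result is not None:
                return name, result
            failures.append("%s: no match" % name)
        raise GeocodingError("; ".join(failures))


def fake_google(address):
    """Stand-in for a primary service that happens to be down."""
    raise ConnectionError("service unavailable")


def fake_geocoder_ca(address):
    """Stand-in for a fallback service that knows exactly one address."""
    known = {"123 Example St": (49.0, -123.0)}
    return known.get(address)


geocoder = CascadingGeocoder([
    ("google", fake_google),
    ("geocoder.ca", fake_geocoder_ca),
])
print(geocoder.geocode("123 Example St"))
```

Returning the name of the service that answered alongside the coordinates makes it easy to see later how often the fallback is actually being used.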

Hug a Developer

Found via The Blomsa Code, Margaret sent me this little gem. I should mention that the embed code the thing gave me appears to include only <embed> data and not Exploder's <object> tags, so I can't be sure it'll work in that browser.

Programmer as Grasshopper

I've been assigned a junior-level programmer here at the office, to teach her how to write code for the server I've been labouring on for the past six months. The system is my brainchild, my baby, and it's with a mix of relief and apprehension that I'm taking on this new apprentice for this project.

And then I saw this at the top of her first class file:

<?php



	/**
	*
	*   Author: Coworker's Name (coworker@donatgroup.com)
	*  Licence: GPL-3 "Information wants to be free"

Granted, she forgot to capitalise "Free" but it's a pretty good start ;-)

Python Talks to Questionable Content

Yeah, I know it's late, but I got obsessed and couldn't turn away. Now I have a really slick Python script that pulls down the latest Questionable Content strip and adds it to my collection to read later:

#!/usr/bin/env python3

import os, re, sys
import urllib.request

import pycurl

destination = "/path/to/qc/repository"

class QC:

  def __init__(self):

    self.contents = b''
    self.baseurl  = 'http://questionablecontent.net/'


  def callback(self, buf):

    # pycurl delivers the response body as chunks of bytes
    self.contents = self.contents + buf


  def end(self):

    c = pycurl.Curl()
    c.setopt(c.URL, self.baseurl)
    c.setopt(c.WRITEFUNCTION, self.callback)
    c.perform()
    c.close()

    # Pull the latest strip number out of the front page's comic image URL
    page = self.contents.decode('utf-8', 'replace')
    match = re.search(r'.*/comics/(\d+)\.png', page)
    if match is None:
      print("")
      print("  The regular expression no longer works.")
      print("  You might want to check into that.")
      print("")
      sys.exit(1)
    return int(match.group(1))


  def fetch(self, n):

    # Download strip number n and write it into the collection
    strip = urllib.request.build_opener().open(self.baseurl + 'comics/' + str(n) + ".png").read()

    f = "%s%s%04d%s" % (destination, "/", n, ".png")

    with open(f, "wb") as fout:
      fout.write(strip)


# Main -----------------------------------------------------------------------

qc = QC()

# Fetch every strip, from the first to the latest, that isn't on disk yet
for i in range(1, qc.end() + 1):
  f = "%s%s%04d%s" % (destination, "/", i, ".png")
  if not os.path.exists(f):
    print("Fetching strip #" + str(i))
    qc.fetch(i)
