Blog

July 30, 2017 19:23 +0000  |  Software 1

I'm just writing down my thoughts here in the hopes that Someone Smarter Than Me might be able to shed some light on the idea, or perhaps even work with me to make it happen.

I'm reading more and more about how fake news stories are circulating, and how technology has developed to the point where we can literally create images, audio, and video of events that never happened but appear as though they did. The effort so far seems to be in the area of somehow detecting a fake by searching for evidence of tampering, but this to me feels wrong-headed: it's expensive, slow, and will always be a step behind the fakes.

Why instead do we not simply sign each file on a sub-channel so it can be easily proven to be legit from the source?

For example, the BBC does a story about a politician and includes with it a picture of her doing something interesting. This picture is then circulated around the web with a few bits of information hidden inside the EXIF data:

  • The original source organisation (BBC)
  • The signature of the image based on the BBC's private key
  • The original URL of the image (maybe?)

The image is then re-shared onto Facebook, where they've got simple software that:

  • Reads the original file and authenticates its origin against the BBC's public key
  • Resizes the image for its own purposes
  • Appends a second signature using Facebook's private key
  • Posts the image into the user's timeline with a "Verified BBC image, resized original from Facebook" caption

If the image is re-shared onto Twitter, or Google+, or Diaspora, these services will only be able to know that the image came from Facebook, but theoretically this still means more than not knowing the origin at all.

The goal is to create a means of authenticating the original source -- or at least a source more credible than "Jim's computer" -- and perhaps even the chain of modifications to said source. There's also no reason this couldn't be applied to all kinds of media.
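
To make the idea concrete, here's a minimal sketch in Python, assuming the "cryptography" package; the helper names and the EXIF-embedding step are mine, purely illustrative, not any real BBC or Facebook API:

# A sketch of the signing idea using the "cryptography" package.
# Embedding the signature and source name into the EXIF data is left
# as a comment; the point is the sign/verify round trip.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric import ed25519

def sign_image(image_bytes, private_key):
    # The publisher (e.g. the BBC) signs the raw image bytes.  The
    # signature, plus a "source" field, would be stashed in the EXIF data.
    return private_key.sign(image_bytes)

def verify_image(image_bytes, signature, public_key):
    # Anyone with the publisher's public key can confirm the image is
    # byte-for-byte what the publisher signed.
    try:
        public_key.verify(signature, image_bytes)
        return True
    except InvalidSignature:
        return False

# The re-sharing chain: Facebook would verify the BBC's signature,
# resize the image, and then sign the *new* bytes with its own key.
bbc_key = ed25519.Ed25519PrivateKey.generate()
image = b"...image bytes..."
signature = sign_image(image, bbc_key)
assert verify_image(image, signature, bbc_key.public_key())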

Maybe this technology already exists, though a cursory search didn't turn up anything for me. Anyone have any bright ideas?

January 22, 2014 17:46 +0000  |  Employment Software Web Development 0

Every once in a while I hear people speaking with authority about what exactly agile software development is, and the funny thing is, they usually conflict with other statements with similar authority about agile. Often, this is coupled with negative comments about how agile is impractical because X, which is frustrating, because some of my most productive years were spent in a fully agile office environment.

So I thought that I'd write something about agile as well, if for no other reason than to hopefully point people in the direction of what I know to be a very efficient and practical means of getting stuff done. I don't want to claim that this is the One True Way of agile development though, as I'm not interested in having the kind of conversation where we re-classify everything for the sake of giving it a name. My team lead at the time, Mike Gauthier, called this system agile, and that's good enough for me.

Talk Less, Code More

The goal behind agile is to have developers spend time doing what they love -- rolling code -- and to keep them out of meetings they want no part of to begin with. Instead, developers have only three responsibilities over and above writing code throughout the sprint. I'll cover these in more detail below:

  • A morning stand-up meeting: every day, 10 minutes
  • Sprint meeting: 1 hour
    • 30 minutes to recap the last sprint
    • 30 minutes to prepare the next one
  • Any additional initiative taken to talk to the client about what they want

Note what isn't in that list:

  • Requirements meetings
  • Proposals
  • Logging hours
  • Documentation

The idea behind agile is essentially: "Here's a task, go!". The key to making this work is to keep the tasks simple and concise, so that the result of the sprint is incremental. Read: easy to deploy, with no surprises.

The rapid pace of an agile project means that the usual slow processes of planning meetings and wiki documentation become an exercise in futility: the job is done before it's planned, and it's changed not long after it's documented.

Stand Up

It sounds like a pointless process, but it's probably the most powerful part of an agile system. The morning "stand up" meeting, or "scrum", is exactly what it sounds like: the entire team stands up in a corner of the room to answer three questions each:

  1. What'd you do yesterday?
  2. What're you expecting to do today?
  3. What happened yesterday that prevented you from doing what you needed to do?

Each developer should talk for no more than a few minutes, answering these questions point blank. It's the opportunity for the team lead to address whatever problems were mentioned (after the meeting), and for other developers to find out that their colleagues are waiting for them to finish something.

Note that this meeting is not for design discussions, or gripes etc. Rather, the purpose is to be a quick update on what's going on -- which is why you're supposed to stand up through the whole thing. The minute someone starts to look like they need to sit, that's your cue that the meeting has gone on too long.

Sprints

Think of sprints as a deploy schedule, but short and seemingly insignificant in what they produce. While a typical software deploy schedule may last months or even years, consisting of massive upgrade paths and a long complex list of changes, sprints are typically 1-2 weeks long. You write the code, and it's live in a few days.

The big difference from other methods is that sprints are incremental, so while new features roll out bit by bit, bugs are fixed weekly, without having to maintain multiple branches for extended intervals.

Keeping the sprint short ensures 4 things:

  • The tasks are always short-term and easy to comprehend both for developers and clients
  • Clients see progress on a regular, predictable schedule
  • Releases are predictable, and easy to break new features into
  • Your team has a concrete and easy to understand goal to work toward

Code Debt

But what about those elaborate project charts with tasks designated to different developers, all colour coded by week, accounting for availability?

Gone. All of it. Throw it out. You now have a binder full of post-its, or if you're feeling all 21st century about it, a Jira task list. This bundle of tasks is your code debt, and it should not be organised, as priorities are expected to change from sprint to sprint. At most, the PM should keep a loose tally of priorities, so as to make the sprint planning meetings go more smoothly.

Chipping Away at that Debt

At the start of every sprint, you hold a meeting in which the project manager talks to the developers about what's most pressing in terms of bug fixes and new features. Importantly, this is a two-way conversation: the PM representing the needs of the client, and the developers representing their own limitations and the quality/maintainability of the code.

This sprint planning meeting is where you take stuff out of your code debt, break it into bite-sized chunks, and assign it to the current sprint. You need to keep the tasks small and easy to achieve in under 4 hours. If a task takes longer than that, it needs to be broken down. This has several big benefits:

  • Big jobs can be spread around, potentially finishing them faster
  • Knowledge sharing is easier, as everyone has the opportunity to work on smaller portions of a greater whole
  • It's an easy way to make big jobs suddenly feel possible
  • Finishing a task gives the developers a sense of accomplishment
  • Incremental change gives the client a sense that something is being done

No Ticket, No Work

Now that your sprint planning meeting has broken up a portion of your code debt into tasks, the team is presented with a whiteboard with a simple grid layout:

+--------+--------------+-----------+------------+---------------+
|  Todo  |  Developers  |  Working  |  Finished  |  QA Complete  |
+--------+--------------+-----------+------------+---------------+
|        |  Daniel      |           |            |               |
|        +--------------+-----------+------------+---------------+
|        |  Aileen      |           |            |               |
|        +--------------+-----------+------------+---------------+
|        |  Charlie     |           |            |               |
|        +--------------+-----------+------------+---------------+
|        |  Aisha       |           |            |               |
+--------+--------------+-----------+------------+---------------+

That Todo column is where you put the amorphous blob of post-it notes, each one representing one of the aforementioned bite-sized tasks for this sprint. Note that while in this column, they aren't actually assigned to anyone; they're simply waiting for someone to take one and stick it into the Working column on their own row.

Now, say that there are 30 tasks to complete before the end of the sprint. Aileen sits down at her desk and as she has nothing to do yet, she looks at the board and grabs the post-it about fixing a bug in email notifications. She moves the post-it from the Todo column into the Working column on her row, and opens her editor.

When the job's done, she moves it to Finished, at which point the QA team can take a look, and when they're happy with the job, they move it to QA Complete. If, however, her change broke something, or if it's simply unsatisfactory, they move the post-it all the way back to the Todo column, where Charlie might grab it later that day, since Aileen has moved on to another ticket about the statistics engine.

In practice, developers will often gravitate toward tasks they're familiar with, and they'll often leave tickets that have been bounced back by QA for the original developer, and this can be OK. However, if one developer becomes a dominant force on a particular component, (s)he might be forbidden from working on it for a while, to make sure that the other developers have a chance to spend some time learning how that software works.

The most important part of this is that developers aren't supposed to do any work unless there's a ticket for them. This keeps people on-task toward completing the sprint on-time and as expected. If there's other work that deserves attention, this is best brought up at the next sprint planning meeting.
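
If it helps to see those rules as code, here's a toy model of the board and its transitions; a Python sketch of the process described above, not any real tool's API:

# A toy model of the sprint board.  Columns and rules as described
# above; every name here is invented for illustration.
class Ticket:

    def __init__(self, title):
        self.title = title
        self.column = "todo"
        self.owner = None

    def grab(self, developer):
        # A developer pulls an unclaimed ticket from Todo into the
        # Working column on their own row.
        assert self.column == "todo", "only unclaimed tickets can be grabbed"
        self.column, self.owner = "working", developer

    def finish(self):
        self.column = "finished"

    def qa(self, passed):
        # QA either signs off or bounces the ticket all the way back
        # to Todo, unassigned, for anyone to pick up.
        if passed:
            self.column = "qa-complete"
        else:
            self.column, self.owner = "todo", None

ticket = Ticket("Fix bug in email notifications")
ticket.grab("Aileen")
ticket.finish()
ticket.qa(passed=False)  # bounced: back in Todo for Charlie to grab later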

Spikes

It's about at this point that people start with comments like "What if the server goes down? Are we expected to wait until the next sprint to fix it?" Obviously not. Emergencies or "directives from on high" are things that can't wait, and by their nature they can't be part of the sprint plan. They're also rare, so breaking a working system to accommodate them would be a little absurd.

The solution is what's called a "spike": a task injected into the Todo list, typically flagged to be done as soon as possible. Its presence in a sprint taints the sprint, so that it can be pointed to in the event of an overrun:

The server went down on Friday and Aisha had to burn half her day fixing it. As a result, we only finished 33 of our 36 tickets this sprint.

This is the sort of thing talked about in the post-sprint meeting, and if more action is needed (either to fully correct the problem or to avoid future cases) these tasks are added to the next sprint.

So, How'd it Go?

There's one other meeting of consequence. At the end of every sprint, you meet to talk about how the sprint fared: what went well, what didn't. In those 30 minutes, you talk about how awesome the QA team was, and how much it sucked when that module we thought would save us work turned out to create more than it saved. It's important to use this time to blow off steam, celebrate the accomplishments of the previous sprint, and take some time to figure out what could have gone better. It facilitates knowledge sharing more than anything else, and allows the PM and team lead to make better decisions in the future.

Documentation

The one thing people freak out about most when I talk about this method is the lack of documentation. They conjure up nightmare scenarios where one of the developers is hit by a bus and "no one knows how their stuff works", or point out that new developers won't have anywhere to start. Both of these are non-issues though, so long as you stick to the process and don't write terrible code.

If any member of the team doesn't know how a component works enough to get in there and complete a task, then it's time to get that person working on one of those tasks. Knowledge transfer happens best through doing, which means making sure that every member of the team has her fingerprints on every part. To put it in real terms, if Daniel gets hit by a bus, the project can go on because Aileen, Charlie, and Aisha have all spent some time poking at the payment engine. Not one of them wrote the whole thing, but the understanding is there.

Of course this can only happen if the code is readable and adheres to established standards. Variable names should be in the common language of the team and be whole words, methods should be given names that explain what they do, and class names should make sense as singular objects. If the code can't be understood by someone who's never seen it before, then it's broken by design. Making sure that everyone has an opportunity to interact with this code is the best way to ensure its readability.
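
To put that in code terms, the difference is roughly this (a contrived Python example of my own, not from any project mentioned here):

# Broken by design: you have to read the body to guess what it does
def proc(d, f):
    return [x for x in d if x[f]]

# Readable: whole words, and the name explains the behaviour
def active_subscribers(subscribers):
    """Return only the subscribers whose accounts are active."""
    return [s for s in subscribers if s["is_active"]]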

Be Rigid

Probably the hardest part of agile software development is sticking to the process. As simple as it is, it's just too easy to fix a bug that someone found that isn't in the sprint, or add a simple feature that the client mentioned earlier that day. If agile is going to work, this can't be allowed to happen, and a lot of people have a hard time with this.

What you have to remember is that while the process feels pointlessly rigid, it's there to protect the team and ensure that the client gets exactly what was promised on the schedule that was promised. Adding in bug fixes can potentially derail the schedule, or introduce bugs that shouldn't have been there in the first place. It teaches the client that she can have whatever she wants whenever she wants, and as it's not part of the agreed sprint, she may try to get away with not paying for it.

From the developer side, it's important to remember that we like lists. If we can look at the list of stuff to do and know that that's all that's ever going to be there for the whole sprint, it introduces a sense of calm that comes from knowing exactly what's expected.

To this end, it's important to reward a team that manages to complete its sprint ahead of schedule. If they get everything finished by Thursday, let them take Friday off. The project is exactly as far along as you expected, so why not? Similarly, if the team is routinely late in completing the sprint, overtime is justified since the entire team helped write the sprint schedule during the planning meeting.

Conclusions

What makes agile work is having a simple and concise plan to follow that has been agreed upon by all parties. I've worked at companies that implement this system without involving the developers, so the schedule is imposed by people who have no knowledge of what actually needs to be done. I've also worked at companies where the developers run the schedule, which is to say, there's barely any schedule at all, and the results are products that "mostly work", according to whatever the developer at the time thought was appropriate. As with so many other things, the key is openness, honesty, and inclusion in the process for all sides.

Agile is a system that everyone understands and agrees to, but doesn't get in the way of actually getting stuff done. It protects all parties involved from undue stress, and unexpected results, and I can honestly say that it was (at least for me) the best system to work with.

November 15, 2010 22:19 +0000  |  Drupal Programming Software 14

I've been doing Drupal development on and off for nearly three years now and it's always been frustrating. I'm a pretty vocal and animated kind of person too, so my co-workers soon came to know me as the anti-Drupal guy, which can be pretty rough when your employer has chosen to standardise on the platform. Now that I'm finally out of the Drupal world, I wanted to write a little about the platform, specifically speaking to its weaknesses and failures.

My hope here is twofold: (a) that this post serve as a means of communicating to the thousands of frustrated developers out there that they're not alone in their pain, and (b) that perhaps some of this post will help development shops choose Drupal where appropriate and other technologies where it is not.

For the Drupal fan(girl|boy)s, I ask only that you try to read this with an open and constructive mind. While I may rant and curse about Drupal in my Twitter feed, I've tried very hard to make this an unemotional, hopefully useful post about something I've spent a lot of time thinking about and working with.

Drupal Centricism

Drupal Ideology

It seems to be a mantra within the community: "You don't even need to write code". The Drupal ideology is user-centric, choosing ease-of-use over performance at every turn. There's nothing wrong with this of course, so long as your goal is to let unskilled people make websites. However if your priority is a performant application capable of handling a lot of traffic, you're going to have a number of problems.

Some examples of prioritising user-focus over performance:

  • Silent failures are the bane of any developer's existence. It's important to know when a variable isn't defined, or that writing a record to the database failed, or that a file didn't upload properly. Drupal suppresses such messages by default, and as a result nearly every contrib module in the community is so riddled with errors and warnings that development with these messages enabled is nearly impossible.
  • Views, the de-facto standard way to store and retrieve data from your database, stores its queries in the database, so that in order to run a query, you must first fetch the query itself from the database. Similar inefficiencies can be found in other "standard" modules like CCK and Panels.
  • Drupal relies almost entirely on caching in order to function at all. Without caching, a technique usually reserved for high to extreme traffic situations, Drupal can't handle even a small number of concurrent visitors. Indeed, some projects I've seen have taken more than 10 minutes to load a single page, even in development where there was only one connection in use.

Drupal Magic

It's a term celebrated by many in the community. The idea being that Drupal does a mountain of work for you, so you don't have to worry about it. The only problem is that when you're trying to build a finely-tuned application, most of this magic either gets in the way, or even works against you. You get 80% of the way there with Drupal and its contrib modules, and then spend three months fighting the whole application, undoing the damage it's done, just to get what you need out of your website.

The hook-dependent system requires and fosters this anti-pattern. Re-using code often means unpredictable, site-wide changes. A property is written in module X, overwritten in module Y, and altogether removed in module Z, and there's no way to be certain that these functions will execute in a predictable order.

This problem is notably worse when it comes to new developers on a project, since they will undoubtedly not be privy to the magic that is running under the hood, and will have a difficult time discovering it on their own. To those who will answer this with "the project simply needs better documentation", I respectfully suggest that a good code base is easy to understand, and doesn't require a manual that is usually out of date.

To work with Drupal Magic is to attempt to produce useful code against an unordered, uncontrolled, grep-to-find-what-is-going-on-dependent architecture.

Drupal Community

For all the victories in community engagement Drupal has achieved (a massive, diverse and engaged membership), it's the glaring failures that make the whole project a miserable situation for developers. I've already mentioned the standardising on inefficient modules, but I haven't talked about the mountains of really horribly written code yet. Drupal Core, for what it does, is pretty efficient, but too many contrib modules are written by inexperienced developers, or are simply incapable of scaling to enterprise-level capacity. The result is that non-developers (managers, sometimes even clients) will point to the functionality of module X and insist: "don't reinvent the wheel, just use that", and you spend the next three weeks trying to work around the poor design of said module, eventually being forced to write garbage that talks to garbage.

Often the perceived strength of the community is Drupal's greatest weakness. Drupal is promoted based on its theoretically infinite feature set, but the reality is that in order to use every one of those contrib modules in your site, the memory footprint will be massive, the stability suspect, and the performance abysmal. And gods help you if you try this on a site with millions of users or a similar number of content nodes.

Drupal Establishment

None of this is a problem however if Drupal is used where its features and shortcomings are both understood and accepted as the nature of the platform. Drupal is a great tool in some situations and a horrible burden in others. Sadly, this has not yet sunk in with many of the decision-makers in the web development community. Drupal is being used and promoted as a solution hammer, with every potential development project, a Drupal-shaped nail.

This has a number of negative outcomes, the most dangerous of which is a lack of skill diversity in developers. Companies that insist on Drupal-centric development are in fact promoting ignorance of alternatives that might do a better job and that hurts everyone. Unless developers at these companies take it upon themselves to spend time outside of their 8-12 hour work day to write code for a different platform or language, this Drupal dependency will force their non-Drupal skills to atrophy, limiting their ability to produce good code in the future.

Conclusion

I'm finally at the end of my admittedly unenthusiastic involvement in the Drupal community. Whether the Drupal shops out there read this isn't really up to me, but I hope that this manages to help some people re-evaluate their devotion to the platform. Comments are welcome, so long as they're constructive (I moderate everything), but I'm not going to get into a shouting match on the Internet. If you think I'm wrong, we can talk about it in 5 years.

October 04, 2010 01:41 +0000  |  Blogger Django Python Software 8

I haz a new site! I've been hacking at this for a few months now in my free time and it's finally in a position where I can replace the old one. Some of the features of the old site aren't here though, in fact this one is rather limited by comparison (no search, no snapshots, etc.) but the underlying code is the usual cleaner, better, faster, more extendable etc. so the site will grow beyond the old one eventually.

So, fun facts about this new version:

  • Written in Python, based on Django.
  • 317133 lines of code
  • Fun libraries used:
    • Flot (for the résumé skillset charts)
  • Neat stuff I added:
    • A new, hideous design!
    • A hierarchical tagging system
    • A custom image resizing library; I couldn't find an existing one out there that suited my needs.
    • The Konami Code. Try it, it's fun :-)
  • Stuff that's coming:
    • Search
    • Mobile image upload (snapshots)
    • The image gallery will be up as soon as the shots are done uploading.

Anyway, if you feel so inclined, please poke around and look for problems. I'll fix them as soon as I can.

January 03, 2010 12:07 +0000  |  Django Facebook Python Software TheChange.com Web Development 2

This is going to be a rather technical post, coupled with a smattering of rants about Facebook, so those of you uninterested in such things might just wanna skip this one.

As part of my work on my new company, I'm building a synchroniser for status updates between Twitter, Facebook, and our site. Eventually, it'll probably include additional services like Flickr, but for now, I'm just focusing on these two external systems.

A Special Case

Reading this far, you might think that this isn't really all that difficult for either Twitter or Facebook. After all, both have rather well-documented and heavily used APIs for pushing and pulling data to and from a user's stream, so why bother writing about it? Well for those with my special requirements, I found that Facebook has constructed a tiny, private hell, one in which I was trapped for four days over the Christmas break. In an effort to save others from this pain, I'm posting my experiences here. If you have questions regarding this setup, or feel that I've missed something, feel free to comment here and I'll see what I can do for you.

So, let's start with my special requirements. The first stumbling block was the fact that my project is using Python, something not officially supported by Facebook. Instead, they've left the job to the community, which has produced two separate libraries with different interfaces and feature sets.

Second, I wasn't trying to synchronise the user streams. Instead, I needed push/pull rights for the stream on a Facebook Page, like those created for companies, politicians, famous people, or products. Facebook claims full support for this, but in reality it's quite obvious that these features have been crowbarred into the overall design, leaving gaping holes in the integration path.

What Not to Do

  • Don't expect Facebook to do the right/smart thing. Everything in Facebookland can be done in one of 3 or 4 ways and none of them do exactly what you want. You must accept this.
  • Don't try to hack Facebook into submission. It doesn't work. Facebook isn't doing that thing that makes sense because they forgot or didn't care to do it in the first place. Accept it and deal. If you try to compose elaborate tricks to force Facebook's hand, you'll only burn 8 hours, forget to eat or sleep in the process and it still won't work.

What to Do

Step 1: Your basic Facebook App

If you don't know how to create and set up a basic canvas page in Django, this post is not for you. Go read up on that and come back when you're ready.

You need a simple app, so for starters, get yourself a standard "Hello World" canvas page that requires a login. You can probably do this in minifb, but PyFacebook makes this easy since it comes with handy Django method decorators:

# views.py
from django.http import HttpResponse, HttpResponseRedirect
import facebook

@facebook.djangofb.require_login()
def fbCanvas(request):
    return HttpResponse("Hello World")

Step 2: Ask the User to Grant Permissions

This will force the user to add your application before proceeding, which is all fine and good, but that doesn't give you access to much of anything you want, so we'll change the view to use a template that asks the user to click on a link to continue:

# views.py
from django.shortcuts import render_to_response
from django.template import RequestContext
import facebook

@facebook.djangofb.require_login()
def fbCanvas(request):
    return render_to_response(
        "social/canvas.fbml",
        {},
        context_instance=RequestContext(request)
    )

Note what I mentioned above: we're asking the user to click on a link rather than issuing a redirect. I fought with Facebook for a good few hours to get this to happen without user input, and it worked... sometimes. My advice is to just go with the user-clickable link. That way seems fool-proof (so far).

Here's our template:

<!-- canvas.fbml -->
<fb:header>
    <p>To enable the synchronisation, you'll need to grant us permission to read/write to your Facebook stream.  To do that, just <a href="http://www.facebook.com/connect/prompt_permissions.php?api_key=de33669a10a4219daecf0436ce829a2e&v=1.0&next=http://apps.facebook.com/myappname/granted/%3fxxRESULTTOKENxx&display=popup&ext_perm=read_stream,publish_stream,offline_access&enable_profile_selector=1">click here</a>.</p>
</fb:header>

See that big URL? It's option #5 (of 6) for granting extended permissions to a Facebook App for a user. It's the easiest to use and hasn't broken for me yet (numbers 1, 2, 3 and 4 all regularly complained about silly things like the app not being installed when this was not the case, but your mileage may vary). Basically, the user will be directed to a page asking her to grant read_stream, publish_stream, and offline_access to your app on whichever pages or users she selects from the list of pages she administers. Details for modifying this URL can be found in the Facebook Developer Wiki.

Step 3: Understanding Facebook's Hackery

So you see how in the previous section, adding enable_profile_selector=1 to the URL will tell Facebook to ask the user to specify the pages to which she'd like to grant these shiny new permissions? Well, that's nifty and all, but they don't tell you which pages the user selected.

When the permission questions are finished, Facebook does a POST to the URL specified in next=. The post will include a bunch of cool stuff, including the all-important infinite session key and the id of the user doing all of this, but it doesn't tell you anything about the choices made. You don't even know what page ids were in the list, let alone which ones were selected to have what permissions. Nice job there, Facebook.

Step 4: The Workaround

My workaround for this isn't pretty, and worse, it depends on a reasonably intelligent end-user (not always a healthy assumption), but after four days of cursing Facebook for their API crowbarring, I could come up with nothing better. Basically, when the user returns to us from the permissioning steps, we capture that infinite session id, do a lookup for a complete list of pages our user maintains, and then bounce them out of Facebook back to our site to complete the process by asking them to tell us what they just told Facebook. I'll start with the page defined in next=:

# views.py
from django.shortcuts import render_to_response
from django.template import RequestContext

import facebook

@facebook.djangofb.require_login()
def fbGranted(request):

    from cPickle import dumps as pickle
    from urllib  import quote as encode

    from myproject.myapp.models import FbGetPageLookup

    return render_to_response(
        "social/granted.fbml",
        {
            "redirect": "http://mysite.com/social/facebook/link/?session=%s&pages=%s" % (
                request.POST.get("fb_sig_session_key"),
                encode(pickle(FbGetPageLookup(request.facebook, request.POST["fb_sig_user"])))
            )
        },
        context_instance=RequestContext(request)
    )

# models.py
def FbGetPageLookup(fb, uid):
    return fb.fql.query("""
        SELECT
            page_id,
            name
        FROM
            page
        WHERE
            page_id IN (
                SELECT
                    page_id
                FROM
                    page_admin
                WHERE
                    uid = %s
            )
    """ % uid)

The above code will fetch a list of page ids from Facebook using FQL and, coupling it with the shiny new infinite session key, bounce the user out of Facebook and back to your site, where you'll use that info to re-ask the user which page(s) to link to Facebook.

Step 5: Capture That page_id

How you capture and store the page id is up to you. For me, I had to create a list of organisations we're storing locally and let the user compare that list of organisations to the list of Facebook Pages and make the links appropriately. Your process will probably be different. Regardless of how you do it, just make sure that for every page you wish to synchronise with Facebook, you have a session_key and a page_id.
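
For what it's worth, the link table can be as simple as this; a hypothetical Django model, with field names that are mine rather than anything Facebook prescribes:

# models.py -- a hypothetical model for storing the link
from django.db import models

class FacebookPageLink(models.Model):
    organisation = models.ForeignKey("Organisation")  # your local record
    page_id      = models.CharField(max_length=32)    # the Facebook Page
    session_key  = models.CharField(max_length=128)   # the infinite session key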

Step 6: Push & Pull

Because connectivity with Facebook (and Twitter) is notoriously flaky, I don't recommend doing your synchronisation in real-time unless your use-case demands it. Instead, run the code via cron, or better yet as a daemon operating on a queue, depending on the amount of data you're playing with. However you do it, the calls are the same:

import facebook

from django.conf import settings

# Setup your connection
fb = facebook.Facebook(settings.FACEBOOK_API_KEY, settings.FACEBOOK_SECRET_KEY)
infinitesessionkey = "your infinite session key from facebook"
pageid             = "the page id the user picked"
message            = "the status update you want to push"

# To push to Facebook:
fb(
    method="stream_publish",
    args={
        "session_key": infinitesessionkey,
        "message":     message,
        "target_id":   "NULL",
        "uid":         pageid
    }
)

# To pull from Facebook:
fb(
    method="stream_get",
    args={
        "session_key": infinitesessionkey,
        "source_ids": pageid
    }
)["posts"]

Conclusion

And that's it. It looks pretty complicated, and... well it is. For the most part, Facebook's documentation is pretty thorough, it's just that certain features like this page_id thing appear to have fallen off their radar. I'm sure that they'll change it in a few months though, which will make my brain hurt again :-(

November 13, 2009 17:51 +0000  |  Programming Python Software 0

I wrote something like this some time ago, but this version is much better, if only because it's in Python. Basically, it's a script that highlights standard input based on arguments passed to it.

But how is that useful? Well imagine that you've dumped the contents of a file to standard output, maybe even piped it through grep, and/or sed etc. Oftentimes you're still left with a lot of text and it's hard to find what you're looking for. If only there was a way to highlight arbitrary portions of the text with some colour...

Here's what you do:

$ cat somefile | highlight.py some strings

You'll be presented with the same body of text, but with the word "some" highlighted everywhere in light blue and "strings" highlighted in light green. The script can support up to nine arguments which will show up in different colours. I hope someone finds it useful.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import re
import sys

colours = [
    "\033[1;34m", # light blue
    "\033[1;32m", # light green
    "\033[1;36m", # light cyan
    "\033[1;31m", # light red
    "\033[1;33m", # yellow
    "\033[0;32m", # green
    "\033[0;36m", # cyan
    "\033[0;33m", # brown
    "\033[1;35m", # pink
    "\033[0m"     # none
]

args = sys.argv[1:]

# Strip out arguments exceeding the maximum of nine (the tenth
# "colour" is the reset code)
if len(args) > 9:
    print("\n%sWARNING: This script only allows for a maximum of 9 arguments.%s\n\n" % (colours[4], colours[9]), file=sys.stderr)
    args = args[0:9]

while True:
    line = sys.stdin.readline()
    if line == '':
        # readline() returns an empty string at EOF
        break
    # Wrap every match of each argument in its colour, then reset
    for colour, arg in enumerate(args):
        line = re.sub(
            r"(%s)" % arg,
            r"%s\g<1>%s" % (colours[colour], colours[9]),
            line
        )
    try:
        print(line.rstrip("\n"))
    except BrokenPipeError:
        # Exit quietly if the reader (say, head) closes the pipe early
        break

July 17, 2009 00:01 +0000  |  Programming Software Twitter 0

Wil Wheaton posted to Twitter today a request for an easy way to fetch all of one's tweets and store them locally. Someone might want to do that if they want a personal archive, or if they're interested in porting their data over to a Free implementation like Laconica. Whatever your reasoning, here's a quick and dirty way to do it:

for i in {1..999}; do
  curl -s "http://twitter.com/statuses/user_timeline.xml?screen_name=your_screen_name&count=200&page=$i" | grep '<text>' | sed -e 's/^ *<text>\(.*\)<\/text>/\1/'
  sleep 2
done

Just hit "ctrl-c" when you hit your first post ever.

June 15, 2009 19:07 +0000  |  Activism Drupal Free Software Linux PHP Software Technology Work [at] Play 0

I attended my first ever OpenWeb conference yesterday and, as per company policy, I have to report on and share what I learnt, so what better way to do so than to make a blog post for all to read?

General

OpenWeb is awesome. It's a conference where people from all over the world come to talk about Open design and communication and hopefully, learn to build a better web in the process. Attendees include programmers, entrepreneurs, designers, activists and politicians all with shared goals and differing skillsets. I shook hands with Evan Prodromou, the founder of identi.ca and WikiTravel, heard talks from the guys who write Firefox and Thunderbird as well as the newly-elected representative for the Pirate Party in the European Parliament, Rickard Falkvinge. All kinds of awesome I tell you.

Rickard Falkvinge: Keynote - On the Pirate Party

Founder of the Pirate Party in Sweden and now a representative in the European Parliament (thanks to proportional representation), Falkvinge was a passionate and eloquent speaker who covered the history of copyright, the present fight for greater control of so-called intellectual property and more importantly the far-reaching and very misunderstood effects of some of the legislation being passed to "protect" copyright holders while eliminating privacy rights for the public.

The talk was very in depth and difficult to cover in a single post so I encourage you to ask me about it in person some time. For the impatient though, I'll try to summarise:

The copyright debate isn't about downloading music; that's just a byproduct of the evolution of technology. As the printing press gave the public greater access to information, so has the Internet managed to disperse that information further. The problem is that now that the changing landscape has rendered certain business models ineffective, these businesses are fighting to change our laws to preserve said models rather than change with the times. Ranging from frustratingly shortsighted attempts to ban technologies that further file sharing (legal or otherwise) to warrantless wiretapping of every Internet connection (and, by extension, every phone call) of every free citizen, many of these changes are very, very scary.

"All of this has happened before, and it will happen again" he said. Every time a technological advancement creates serious change for citizen empowerment in society, the dominant forces in that society mobilise to crush it. The Catholic church, gatekeepers of the lion's share of human knowledge at the time actively worked to ban the printing press. They succeeded (if you can believe it) in France in 1535. This time, it's the media companies and they're willing to do anything, including associating file sharing with child pornography and terrorism to do it. Falkvinge's Pirate party is becoming the beachhead in the fight for copyright reform. Now the party with the largest youth delegation (30%!) in Sweden, they are working to get the crucial 4% of the seats in Parliament they need to hold the balance of power and they need your help. He'd like you to send the party 5€ or 10€ per month and I'm already on board.

Angie Byron: Keynote - Women in Open Source

Those of you who know me know that I can get pretty hostile when it comes to treating women like a special class of people (be it in a positive or negative light), so I was somewhat skeptical about this one. Thankfully, I was happy to hear Byron cover a number of issues with the Free software community, ranging from blatant sexism (CouchDB guys... seriously?) to basic barriers to entry for anyone new to a project. There were a lot of really helpful recommendations for people wanting to engage 100% of the community rather than just one half or the other.

Blake Mizerany: Sinatra

Sinatra is a Ruby framework that went in the opposite direction of things like my beloved Django or Ruby's Rails. Rather than hide the nuts and bolts of HTTP from the developer, Sinatra puts it right out there for you. Where traditional frameworks tend to muddle GET, POST, PUT, and DELETE into one input stream, this framework structures your whole program into blocks a lot like this:

  require 'rubygems'
  require 'sinatra'
  get '/hi' do
    "Hello World!"
  end

That little snippet up there handles the routing and display for a simple Hello World program. Sinatra's strength is that it's simple and elegant. It lets you get at the real power at the heart of HTTP, which is really handy, but from what I could tell in the presentation, there's not a lot available outside of that. Database management is done separately, there's no ORM layer, etc. It's very good at what it does, but not at everything, which (at least in my book) makes it awesome.

Ben Galbraith and Dion Almaer: Mozilla Labs

These are the guys who make the Cool New Stuff that comes out of Mozilla. You know those guys, they write a nifty web browser called "Firefox", I'm sure you've heard of them.

Mozilla Labs is where the smart nerds get together to build and experiment with toys that will (hopefully) eventually make it into a finished product. Sometimes that product is an add-on or plug-in, other times it's an entirely new project. It's all about how useful something is to the public. And as always, the code is Free. You may have even heard of Ubiquity, an extension to Firefox that promises to reshape how we use a web browser... they're working on that.

This time through, they were demoing Bespin, a code editor in your web browser. Imagine opening a web browser, going to a page and doing your development there: no need for a local environment, but without the usual disadvantages of aggravating lag or difficult, text-only interface. Now imagine that you can share that development space with someone else in real time and that you can be doing this from your mobile device on a beach somewhere. Yeah, it's that awesome.

We watched as they demoed the crazy power that is the <canvas /> tag by creating a simple text editor, in Javascript right there in front of us... with about 15 lines of code. Really, really impressive.

David Ascher: Open Messaging on the Open Internet

Ascher's talk on Open Messaging was something I was really interested in, since I've been actively searching for information on federated social networking for a while now. The presentation was divided into two parts: half covering the history of email and its slow deprecation in favour of a number of different technologies, as well as how people are using it in ways never intended for the architecture. Major problems with the protocol itself were touched on, as well as an explanation of how some of the alternatives out there are also flawed.

He then went on to talk about Mozilla Thunderbird 3 and the variety of cool stuff that's happening with it. "Your mail client knows a lot about you," he says, "but until now, we haven't really done a lot with it." Some of the new features for Thunderbird 3 include conversation tracking (like you see in Gmail), helping you keep track of what kinds of email you spend the most time on, who you communicate with most, etc., and even statistical charts about what time of day you use mail, what kind of mail you send, to whom, and how often. It's very neat stuff. Add to this the fact that they've completely rewritten the plug-in support, so new extensions to Thunderbird mean that your mail client will be as useful as you want it to be.

Evan Prodromou: Open Source Microblogging with Laconica

Up until this talk (and with the exception of Falkvinge's keynote), I'd been interested, but not excited about OpenWeb. Prodromou's coverage of Laconica changed all of that.

Founder of WikiTravel and one of the developers on WikiMedia (the software behind Wikipedia), Prodromou has built a federated microblogging platform called Laconica. Think Twitter, but with the ability for an individual to retain ownership of his/her posts and even handle distribution -- with little or no need for technical knowledge required. Here, I made you a diagram to explain:

[Diagram: Federated Laconica vs. Monolithic Twitter]

Here's how it is: whereas Twitter is a single central source of information, controlled by a single entity (in this case, a corporation), Laconica distributes the load to any number of separate servers owned by different people that all know how to communicate. Where you might be on a server in Toronto, hosted by NetFirms, I could be using a Laconica service hosted by Dreamhost in Honolulu. My posts go to my server, yours go to yours, and when my Twitter client wants to fetch your posts, it talks to NetFirms and vice versa.

The advantages are clear:

  1. Infinite scalability: Twitter's monolithic model necessitates crazy amounts of funding, and they still don't have a profit model to account for those costs. Laconica, on the other hand, means that the load is distributed across potentially millions of hosts (much like the rest of the web).
  2. You control your identity, not a private corporation.

The future is where it gets really exciting though. By retaining ownership of your identity and data, you can start to attach a variety of other data types to the protocol. For the moment, Laconica only supports twitter-like messages, but they're already expanding into file-sharing as well. You'll be able to attach images, video and music files, upload them to your server and share them with whomever is following you. After that, I expect that they'll expand further to include Flickr-like photo streams, Facebook-like friendships and LiveJournal-like blog posts. These old, expensive monolithic systems are going away. In the future we'll have one identity, in one place, that we control that manages all of the data we want to share with others.

Really, really cool stuff.

I went home that night and signed up as a developer on Laconica. I've downloaded the source and will experiment with it this week before I take on anything on the "to do" list. I intend on focusing on expanding the feature set to include stuff that will deprecate the monolithic models mentioned above... should be fun :-)

Drupal Oops

I closed out the evening with some socialising in the hallway and some ranting about how-very-awesome Laconica was to my coworker Ronn, who showed up late in the day. He wandered off in search of my other colleagues and I followed after finishing a recap with Karen Quinn Fung, a fellow transit and Free software fan. Unfortunately though, I wasn't really paying attention to where Ronn was going; I just followed out of curiosity. It turns out that I had stumbled into a Drupal social, where I was almost immediately asked "so, how do you use Drupal and how much do you love it?" by the social organiser. James gave me a horrified "what the hell are you doing here" look and, searching for words, I said something to the effect of "Um, well, I was pretty much just dropping in here looking for my co-workers... oh here they are! -- I like Drupal because it makes it easy for people to make websites, but I don't really use it because it gets in my way. I prefer simple, elegant solutions, and working around something just to get it to work is too aggravating." Considering the company, my response was pretty well received. I backed out quietly at the earliest opportunity :-)

So that was OpenWeb, well half of it anyway. I only got a pass for the Thursday. I can't recommend it enough though. Really interesting talks and really interesting people all over the place. I'll have to make sure that I go again next year.

October 10, 2008 20:19 +0000  |  Geek Stuff Software 1

I wrote a fun bit of code for doing Facebook apps and thought that I would share.

One of the big problems with Facebook's app system is styling text. You can't include an external file because they won't let you. This leads a lot of designers to write their code directly into the style="" attribute in the HTML. This can get ugly fast, and is the reason external .css files exist in the first place.

To remedy this, you can create the usual .css file externally and then call this helper function to get the job done for you. If you have to use paths etc. in it, you can even pass a key/value dictionary to it to swap out keywords, so that a string like this:

	background-image: url('http://domain.tld/path/to/some/image.png');

Can look like this:

	background-image: url('[[images]]/image.png');

The call for this type of thing would be just:

	print '
		<style type="text/css">
			'. Helper::css('myStyle', array('images' => 'http://domain.tld/path/to/some')) .'
		</style>
	';

Here's the source if you're interested:

<?php

    /**
    *
    * Helper functions for the view
    *
    * \author Daniel Quinn (corporate at danielquinn.org)
    *
    */

    class Helper
    {

        /**
        *
        * Simple templating engine for cascading style sheets since
        * Facebook doesn't like the idea of including external .css
        * files.  Instead, we keep the files separate then call this
        * method with a series of key/value replacers if need be.
        *
        * \param  file  Full path for the css to include
        * \param  vars  A dictionary lookup of key value pairs to be
        *               replaced in \a file.
        *
        */
        public static function css($file, $vars = array())
        {

            $in = file_get_contents("$file.css");

            $src = array();
            $dst = array();

            foreach ($vars as $k => $v)
            {
                $src[] = '[['. $k .']]';
                $dst[] = $v;
            }

            return str_replace($src, $dst, $in);

        }

    }

?>

Not revolutionary, I know. But maybe it'll help somebody out there.

September 24, 2008 16:01 +0000  |  Maps PHP Programming Software 3

I just read a nifty post on monkeycycle about how to geocode a spreadsheet with free tools from Google and Yahoo, and it occurred to me that this is probably the kind of thing people go looking for, so I thought that I'd post my latest shiny new bit of code here.

I call it a cascading geocoder. The idea is that most of the time, a single geocoding service is pretty good, but sometimes it goes down, and other times it can't understand the address. For the purposes of the project I'm working on, this wasn't acceptable, so I wrote some code that attempts to geocode an address first with Google and, if that fails, falls back to geocoder.ca's engine.

It's fully object-oriented and very clean. It's also GPL. Download it here if you're interested ^_^
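
If you just want the pattern rather than the download, the cascade boils down to something like this Python sketch; the two backend functions are stand-ins for the real Google and geocoder.ca calls:

# The cascade pattern in miniature.  The backends are stand-ins: wire
# them up to the real Google / geocoder.ca requests yourself.

def geocode_with_google(address):
    raise NotImplementedError  # call Google's geocoding service here

def geocode_with_geocoder_ca(address):
    raise NotImplementedError  # call geocoder.ca here

BACKENDS = [geocode_with_google, geocode_with_geocoder_ca]

def geocode(address):
    # Try each service in order; return the first (lat, lng) we get.
    for backend in BACKENDS:
        try:
            return backend(address)
        except Exception:
            continue  # service down or address not understood: fall through
    return None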