Blog

January 07, 2022 22:53 +0000  |  Free Software Linux Majel 1

For the last two years I've been working on Majel, a project that allows you to control your computer with your voice. The first incarnation was released back in March, but was dependent on Mycroft, so I've been working to rewrite it to be independent. The end goal: to be able to release it as an image file for the Raspberry Pi so people can just download it, burn it onto an SD card, pop it into their Pi and have it Just Work™.

The development process has been surprisingly easy, with only a few hiccups around audio processing in Python. The new architecture has proven to be really solid and I'm excited to share that in detail at a later date -- but that's not what this post is about. This post is about packaging for the Raspberry Pi and what a nightmare it's been for me.

For the purposes of this post, you just need to know that I wrote a Python application that interfaces with GNOME, Firefox, Chromium, Skype, and other desktop tools to do cool stuff.

What follows is a series of things I now understand that came at the cost of a lot of time and hair-pulling. If you're thinking about going down this road for your own project, my hope is that sharing my experiences here will help save you frustration in the future.

The CPU Architecture

Anyone who knows anything about the Raspberry Pi project can tell you that these little devices don't run the same kind of CPU you're probably used to. Where most computers we use today (not including phones) use x86 processors (typically built by Intel or AMD), the Raspberry Pi uses ARM chips. If your knowledge of the situation (like mine) ended there, then I'm about to save you some pain.

Just like the x86 ecosystem (which consists of i386, i686, x86_64 and other "sub-architectures"), the ARM family includes a wide variety of architectures which you need to build explicitly for when you're making your own stuff. For example, if you've got a Python wheel labelled aarch64 it will only run on 64-bit ARM systems, while one labelled armv7l will run on 32-bit ARM systems.

The Raspberry Pi's hardware 4 can run both but the default "Raspberry Pi OS" is 32-bit and exclusively runs armv7l binaries. If you want to use aarch64, you must install an OS other than the default.

Python Support

In the Python world, the vast majority of packages on PyPI are "pure-python" (ie. they will run on any system already running Python). However there's a lot of packages out there that're bundled with some compiled code (usually C or C++). These packages must be compiled exclusively for your architecture in order to run and if your architecture isn't supported with a pre-existing build, you either have to build it yourself (painful, especially on a Pi) or you're shit out of luck.

For example, the popular cryptography library is not pure python and therefore must be compiled for the architecture it's running on. Thankfully, that project's maintainers support a variety of platforms but note that armv7l isn't one of them.

In fact, finding a package on PyPI with support for armv7l is actually quite rare. Instead, Raspberry Pi users have a special "hack" on their system (one of many I discovered in my travels), an additional Python repo, enabled by default: piwheels.org.

If you're running Raspberry Pi OS, you'll find that nearly all of your not-pure-python packages are not coming from PyPI at all, but are rather coming from piwheels.org: a repository of .whl files, built exclusively for the Raspberry Pi. This is pretty great, though it was definitely a surprise.

If however you're not using Raspberry Pi OS and are instead using an aarch64-based OS like Manjaro, then there's no piwheels.org for you. Instead, you have to hope that the package you need has pre-built support for your architecture. Thankfully, aarch64 is much more common in PyPI, but it's not everywhere. The vosk package for example has armv7l packages but not aarch64 ones.

Finally, Poetry has an annoying bug/limitation in it that means you can't configure your pypackage.toml file work across architectures. Your poetry.lock file will only store hashes for one architecture at a time, so if you run poetry update on an x86_64 machine, the resulting poetry.lock will be entirely different from one generated from an aarch64 machine. As this undermines the whole idea of a consistent, distributable, versioned lock file, it's rather disappointing.

The Operating Systems

So now that we know a bit about the limitations of Python in different operating systems running on the same hardware, let's talk about those systems in more detail

Raspberry Pi OS (formerly Raspbian)

Raspberry Pi OS is Debian-based, but critically it is not your typical Debian system. In an effort to make using the Pi easy for everyone from children to seasoned professionals, the Raspberry Pi Foundation has applied a lot of tweaks and hacks to standard Debian which can catch you off guard if you're not ready for them.

Old & Busted Software

Like any Debian system, everything is old-as-fuck as the maintainers prefer stability over modern features. If you're using your Pi to control humidity in a greenhouse, this is probably a good idea, but if you're hoping to take advantage of modern graphical user interfaces, you're going to have a bad time.

For example, the current version of GNOME available for the Pi is two versions behind the GNOME project's release schedule and it's a good bet that that gap will grow with time. As for Firefox, the most recent version you can get is Mozilla's "extended support release" (ESR) which is a nice way of saying "we promise to support this version for years and years but it won't be meaningfully updated during that time*.

What's more, simply installing GNOME on a standard Raspberry Pi OS image absolutely will not work because there's something called pi-package installed by default that claims to have installed an inferior version of gnome-settings and that conflicts with the would-be-installed version. You must instead use a "Lite" version of the image (the one that doesn't come with X or LXDE) and then install GNOME from there.

Special Configuration Pattern

Configuration of the Pi is done with a program called raspi-config which is installed by default, but if you're using a Pi 4, most of the options you can select in this tool will fail to apply.

As best I can tell, Bluetooth is entirely broken from the start. None of the usual patterns I would expect to get it working (like running systemctl start bluetooth and opening the Bluetooh UI) resulted in success. This is not a hardware problem, but a software one. I can only assume that there's some special Raspian way to do this.

Non-standard Re-packaging

Chromium is a first-class citizen in Piworld, installed by default on the standard image, but strangely listed as chromium-browser rather than the usual chromium. You can even get Widevine support in it (so you can watch encrypted video on Netflix & Prime) simply by running apt install libwidevinecdm0. This deviates from what you see on a typical Chromium install, since modern versions of Chromium allow you to download Widevine support automatically. I can only assume that this is a special concession for the armv7l architecture.

Widevine support in Firefox appears to be impossible.

Kodi has been compiled to exclusively run without an X server or Wayland present. Undoubtedly this is to allow Pi users to just install Kodi and run it without the overhead of a UI they aren't using, but if you want that standard overhead, you're SOL.

Building Your Own Image

If your goal, like mine, is to distribute your app as a Raspberry Pi image, then you'll want to look into pi-gen, an automated system that let's you build a Pi image on an x86-based machine. It's impressively simple, but critically only runs on Debian-based systems. If you're running Fedora, Arch, or some other system, they have a Docker-based runner, but I couldn't get it to work. To get it working on my Arch system, I started a Debian VM and and it worked beautifully... after consuming a whopping 42GB of disk space!

My original idea what to build Majel automatically via Gitlab's CI. With a disk footprint like that however, I'm afraid I'm going to have to rethink that idea.

Manjaro

After running afoul of all of the above, I started looking into alternative base images to work with. Thankfully, Raspberry Pi's excellent Pi Imager (available on FlatHub) makes the burning of alternative images super-easy, and I found Manjaro Linux (a flavour of Arch Linux) to be a really good starting point for my project. In fact, there's a GNOME variant available so you can burn an image that boots into GNOME shell!

As an Arch-derivative, it runs really close to the bleeding edge, so installing a modern version of Firefox and Kodi was super-easy. There were a few surprises though.

It's Not the Same Architecture

While Raspberry Pi OS is running on armv7l, Manjaro builds all of its packages for aarch64. That means that piwheels.org is out of the question, and that there's still going to be some Python packages that aren't published to PyPI with support for Manjaro on a Pi (looking at you vosk).

Wayland is the Default

It's a "new hotness" sort of OS, which means that the default UI server isn't Xorg, but Wayland. For most people, this is probably ok, but for me, since my project relies heavily on Xorg (Majel uses pyautogui which can't do Wayland) this was a problem. Thankfully, you can switch to using Xorg simply by installing xorg-server and uncommenting the WaylandEnable=false line in /etc/gdm/custom.conf.

Widevine is... Problematic

While getting Widevine support in Raspberry Pi OS is easy, getting it working in Manjaro is pretty sketchy. Sure you can install modern versions of both Chromium and Firefox and they work great, but Widevine isn't there, and it won't autodownload, even in Chromium.

Instead, you have to install this crazy/amazing package called chromium-docker from the AUR. The installation process builds a local Docker image of Ubuntu wherein you install Chromium and you can take advantage of the aforementioned libwidevinecdm0. Running it from that point forward involves starting the Docker container and running Chromium from inside it. That's just... bananas.

Packaging is Tricky

The easiest way to make my project installable on Arch-based systems is to contribute an AUR package, but writing one that will install properly on both aarch64 and x86_64 systems was surprisingly not straightforward.

All the docs you read will tell you that there's one variable you set for package sources, conveniently called source=(). What took far too long to find was that you can actually suffix this variable name with the name of the architecture: source_aarch64=() and source_x86_64=(). You then do the same for the sha512sums=() variables and finally, you write some sketchy if/else Bash in your package() function to check if ${CARCH} is equal to aarch64 or x86_64 etc. Have a look at what I had to do for the vosk library if you're curious.

Creating Your Own Image Looks Easy

Manjaro has all of their OS builds available on GitHub, so from the outside it looks like making your own build should be easy. I haven't tried it yet though, so I can't comment.

Everything Else

With the exception of the above, working with Manjaro on the Raspberry Pi is delightful. Getting my Flic button paired with the Pi via Bluetooth was 100% painless and straightforward, and the OS in general has all sorts of nice creature comforts built into it, like zsh by default, a pretty drop-in replacement for cat, and a nice set of custom icons.

Ubuntu

Finally, there's Ubuntu, which admittedly I actively dislike. The whole proprietary Snap system, the ugly re-skinning of GNOME, the dependence on Debian unstable under the hood so everything is both old and broken... Ubuntu is everything I don't want in Linux under one roof. It's also hugely popular though, and likely the only place I'll be able to get Widevine easy out-of-the-box.

The first time I installed it, it locked up the mouse and keyboard for minutes at a time during the initial setup phase. As I write this, I'm still waiting for the initial boot to finish and the mouse is frozen on the screen. I'm not confident that my desire to see this work will be strong enough to overcome my contempt for this distro.

In General

The Pi is marketed as a tiny computer that you can leverage to do anything your heart desires provided you have the time, patience, and are comfortable with a low-power device doing the lifting. The question is though: is something as complicated as a voice-activated desktop automation system that plays streaming video even possible on hardware as limited as a Raspberry Pi?

It turns out, it's totally doable. In Raspberry Pi OS, I managed to bring up simultaneous instances of Firefox and Chromium and play "The Witcher" on Netflix by way of voice command. All processing, even the speech-to-text handling was done on-device and the performance was admirable.

The only caveat I will mention is that streaming video at full screen will absolutely not work at 4K resolution. In fact, I didn't get anything resembling a good framerate until I bumped the resolution all the way down to 1280x720. For my purposes though, this is completely reasonable: this is basically a very smart television after all and the quality of stream I get from Amazon Prime is abysmal anyway.

Conclusion

As long as this post is, it isn't even the end of my development process. I still have to give Ubuntu a fair shake and decide which of the above will be the reference platform for Majel. It'll install just fine on x86-based systems, but as the Pi is what I always envisioned for it, I want to get this part right before I officially "release" the new Mycroft-free version 2.0. Hopefully that'll be sometime in the spring, as I only have a few hours a night to work on it.

Until then, maybe the above will be useful to someone. If it was, please leave a comment! If it wasn't and you have questions, feel free to ask :-)

May 11, 2021 22:51 +0000  |  Free Software Majel 0

I'm going to try to do a better job of recording my development process here, and to that end, I wanna talk about my latest Big Shiny Idea.

My Majel project was mostly well-received, but the most common piece of feedback I got was how hard it was to install. The dependency on Mycroft really kicked the project in the nuts because Mycroft isn't really user-friendly and the company behind it doesn't appear to be prioritising that aspect because they intend to make their money selling physical devices.

On top of that, Mycroft isn't exactly ideal for this sort of thing, since it's operating on the same premise as Alexa & Google Home: it's actively listening for commands which means there's a constant battle between being able to hear said commands and you know, playing music. I'm just tired of saying "Hey Mycroft... HEY MYCROFT" and getting nothing because it can't tell the difference between my voice and the one on the tv.

So I have a new plan, with a new architecture, and dreams of usability. Fun stuff!

At the moment in my head there are at least 3 components: the "orchestrator", the cloud service, and the remote.

The Orchestrator

This is basically what Majel already is, but I'm going to refactor it to leverage pyautogui and control your whole desktop rather than just your browser.

The Remote

No more yelling at the screen. You have an app on your phone consisting of a two buttons: listen and stop, along with the possibility of a few presets you use often. These buttons send messages to the orchestrator to tell it what to do.

There's no reason this has to be just a mobile app though. It can be a tap on your watch, or an IOT button if you like.

The Cloud Service

Unfortunately, after a lot of digging, I've found that there's no (easy) way that you can have a web app or mobile app talk to an unencrypted websocket. There's restrictions built into most platforms that block that sort of thing, so you have to encrypt the traffic, and the easiest way for non-nerds to do that is to use a 3rd-party service. This is that service: a dumb relay between remotes and orchestrators that itself cannot impersonate a remote (for obvious security reasons). This cloud service could be self-hosted of course, but for new users, or people who just don't care about that sort of thing, remotes & orchestrators will connect to majel.danielquinn.org.

So that's the idea. I'm still in the planning phase, but I've got a lot of energy behind me on this for the moment. We'll have to see where that leads. For now, here's the diagram I worked out tonight: