be Groovie

About

Ben Bangert is a San Francisco Bay Area programmer, best known for his open-source work creating and contributing to Python libraries such as Pylons, Beaker, and Routes.
He currently works at Mozilla.

Argo Tunnel in Kubernetes

Note

Part [1 2 3] of a series of more. I don’t know how much more yet as this is primarily written to document my setup so I can refer to it later when I wonder why/how I did something.

To get the Argo Tunnel working in Kubernetes, we first need to install helm on the computer we run kubectl from. I use a Debian-based system for this, so these commands are the appropriate ones for a recent Debian that has snap.

Install helm:

$ sudo snap install helm --classic

Configure a service account with cluster-admin role by sticking this into a rbac-config.yaml file:

apiVersion: v1
kind: ServiceAccount
metadata:
    name: tiller
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
    name: tiller
roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system

Apply the configuration to the cluster:

$ kubectl create -f rbac-config.yaml

Initialize helm:

$ helm init --service-account tiller --history-max 200

Now we’re ready to add the cloudflared ingress controller:

$ helm repo add cloudflare https://cloudflare.github.io/helm-charts
$ helm repo update
$ helm install --name anydomain --namespace default \
    --set rbac.create=true \
    --set controller.ingressClass=argo-tunnel \
    --set controller.logLevel=6 \
    cloudflare/argo-tunnel

Install cloudflared from here.

Then run cloudflared login and put the cert in ~/.cloudflared/cert.pem.

Create a secret for this domain:

$ kubectl create secret generic DOMAIN --from-file="$HOME/.cloudflared/cert.pem"

The rest of the Argo instructions for an Ingress definition should all work fine now.

Lan Containers in Kubernetes with Rancher

Note

Part [1 2 3] of a series of more. I don’t know how much more yet as this is primarily written to document my setup so I can refer to it later when I wonder why/how I did something.

When running containers that I want available on my LAN, it’s handy to expose them under their own LAN IP. To do this I set my DHCP server to stop allocating addresses past .189, reserving the remaining IP addresses for container use.

To allocate LAN IP addresses for containers in Kubernetes I use MetalLB. I installed it using kubectl (Rancher makes it easy to download the kube config file).

Then I set up a ConfigMap YAML:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
      - name: default
        protocol: layer2
        addresses:
        - 192.168.2.190-192.168.2.254

And applied with:

$ kubectl apply -f metallb_config.yaml
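A quick sanity check on the address plan can be handy before applying the pool. The sketch below uses only Python's standard ipaddress module to count the addresses in the MetalLB range and confirm that none of them fall at or below the last DHCP-assigned address. The helper name pool_addresses is made up for illustration; the .189 cutoff and the 192.168.2.190-192.168.2.254 range are the ones from this setup.

```python
import ipaddress


def pool_addresses(first, last):
    """Return every address in the inclusive range first..last."""
    start = ipaddress.IPv4Address(first)
    end = ipaddress.IPv4Address(last)
    if end < start:
        raise ValueError("range end precedes range start")
    return [ipaddress.IPv4Address(n) for n in range(int(start), int(end) + 1)]


# Last address the DHCP server may hand out (the ".189" cutoff).
dhcp_last = ipaddress.IPv4Address("192.168.2.189")

pool = pool_addresses("192.168.2.190", "192.168.2.254")

# Any pool address at or below the DHCP cutoff would be a conflict.
overlap = [addr for addr in pool if addr <= dhcp_last]

print(len(pool), "addresses available for containers")
print("overlap with DHCP:", overlap)
```

An empty overlap list means the DHCP server and MetalLB will never hand out the same address.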

Unifi Controller

I’m changing the IP of where the existing Unifi controller runs, which comes with an interesting drawback. All the existing Unifi devices need to start reporting to the new controller.

First, log in to the existing controller and download a full backup. Then set up the new controller; my unifi-controller.yaml looks like this:

I created a namespace for unifi in the Rancher UI, where I’ll run my Unifi containers.

To spin up the container on its new IP:

$ kubectl -n unifi apply -f unifi-controller.yaml

This can take a few minutes to come up, as it has to play with ARP responses and such. Once the new Unifi controller was up, I restored the backup and logged in. None of the devices will show up... yet!

On the old Unifi controller, I went to Settings -> Site and clicked the Export Site button. You can download a backup file here, but the more important step is to migrate the devices to the new controller by providing the new IP for them to inform. Once this step is done, the devices should show up in the new controller shortly.

The old controller can then be shut down.

Unifi Video Controller

I used this Kubernetes file to set up my Unifi video controller:

I’m using the /mnt/data directory that is the second SATA SSD drive for these containers to store their data on.

After restoring this from the backup I made of my old unifi video controller, I logged in, selected the camera, and under Manage I changed its reporting IP to the new container IP.

A Smaller, Faster, and More Efficient Home Server

Note

Part [1 2 3] of a series of more. I don’t know how much more yet as this is primarily written to document my setup so I can refer to it later when I wonder why/how I did something.

A few years ago I built a home NAS and virtualization server. While I moved past SmartOS after a year or two to FreeNAS, the hard drives are aging and I realized it isn’t very efficient compared to what’s available now. Since my prior post, the hard drives were replaced with six 4TB drives in a RAID-Z2. This ended up giving me vastly more space than I had any need for, and as I now have a 1Gbps fiber line I don’t feel the need to store 16 TB locally.
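For reference, the 16 TB figure follows from the drive count: a double-parity pool gives up roughly two drives' worth of space. The sketch below is a back-of-the-envelope estimate only; it ignores filesystem overhead and the TB vs. TiB difference, and the function name is made up for illustration.

```python
def raidz_usable_tb(drive_count, drive_tb, parity_drives):
    """Approximate usable space: parity consumes parity_drives drives' worth."""
    if drive_count <= parity_drives:
        raise ValueError("need more drives than parity drives")
    return (drive_count - parity_drives) * drive_tb


# Six 4 TB drives with double parity.
usable = raidz_usable_tb(6, 4, 2)
print(usable, "TB usable (approximate)")
```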

Given my Internet connectivity and increased focus in my spare time on containerized applications using Kubernetes, I figured it might be a good time to ‘downsize’ in power consumption while increasing my virtualization capabilities. The new server configuration I ended up with is a fraction of the size (half-length 1U vs full-length 2U) and power (40W idle vs 190W idle), while having 4x the cores and over 8x the memory capacity. Seven years has definitely helped on what you get for the same price!
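To put the idle power difference in perspective, here is a rough estimate of the yearly energy saved, using the idle figures quoted above (190 W for the old server, 40 W for the new one). It assumes both machines sit at idle around the clock, which is a simplification; real usage will vary.

```python
HOURS_PER_YEAR = 24 * 365


def annual_kwh(watts):
    """Energy used in a year by a load drawing `watts` continuously."""
    return watts * HOURS_PER_YEAR / 1000


old_idle_watts = 190
new_idle_watts = 40

# Energy saved per year by the lower idle draw.
saved_kwh = annual_kwh(old_idle_watts - new_idle_watts)
print(saved_kwh, "kWh saved per year at idle")
```

Multiply the result by a local electricity rate to get a yearly cost difference.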

System Specs

It took a while before Epyc embedded processors and vendors’ 1U systems started to hit the market, but the 8-core/16-thread Epyc 3251 in a nice half-length 1U is now widely available. Serve The Home did a great review of this system which helped convince me to purchase one.

Here are specs for this build:

  • CPU/Motherboard: AMD Epyc Embedded 3251 (8-core / 16-thread) on Supermicro M11SDV-8C-LN4F
  • Case: Supermicro AS 5019D-FTN4
  • RAM: Supermicro DDR4-2666 64 GB ECC LRDIMM x2 (128 GB total)
  • NVMe Drive: Samsung 970 Evo Plus 1TB
  • SATA Drive: Sandisk Ultra 3D 2TB
My 2019 Home virtualization server

Overall cost: ~$2100

This is a bit more than the last build, but is 100% SSD storage without any redundancy. I’m relying on cloud backups and accepting that I will have some downtime if a part fails. I consider this an acceptable trade-off to keep costs lower with the hope that once an SSD has proven itself for a few months it should last much longer than a spinning platter drive as I don’t anticipate heavy read/write loads that would wear out the drives.

Note

In the event an SSD fails, I’m only a 10-minute drive from a Best Buy where I got both the SSDs. In my experience this is a more likely failure scenario than the PSU, CPU, memory, or motherboard failing (I already ran a memtest suite against the memory before installing Proxmox, which comes with memtest on its live ISO).

If I really go nuts with VM’s and containers I still have 2 DIMM slots free for another 128 GB of memory. I’ve found that for personal use containers usually run into RAM pressure much earlier than CPU pressure.

OS Choice

I’ve been reading a lot of Serve the Home, and their glowing review of Proxmox VE convinced me to give it a try. I’ve been enjoying it so far and it’s easy to get started with and makes running KVM a breeze.

ProxMox VE it is!

I grabbed the ISO for installing ProxMox 6.0, which is based on Debian buster. Using the ISO directly from my computer was rather easy from the Supermicro IPMI Java interface. While the ikvm HTML5 interface is more convenient, the Java-based console makes it a breeze to attach local .iso files as a CD/DVD drive to the server.

I’m running the community edition, which requires you to edit the /etc/apt/sources.list to include the non-enterprise deb:

deb http://download.proxmox.com/debian/pve buster pve-no-subscription

Additional steps:

  1. From the shell in the web UI:

    apt-get update
    apt-get dist-upgrade -y
    reboot
  2. Add the additional drive as a new lvm-thin pool

Note

I have set all the VM hard drives under the default lvm-thin pool which is on the NVMe SSD for performance. The SATA SSD is to be used for persistent data volumes for containers.

AWS Storage Appliance

The first thing I wanted to try was utilizing hybrid storage in AWS with their Storage Appliance. Unfortunately AWS only provides a VMware image. I found a few articles online that indicated this was rather easy to convert to a raw disk image for use in Proxmox, and got it working rather quickly.

  1. Download the AWS Storage Gateway zipfile

  2. Unzip the zipfile (resulting in an .ova file)

  3. tar xf AWS-Appliance-2019-07-24-1563990364.ova (Filename dependent on time it was d/l)

  4. qemu-img info AWS-Appliance-2019-07-24-1563990364-disk1.vmdk and record the virtual size to use when provisioning a proxmox KVM

  5. Provision a Proxmox VM with the given size disk, using an IDE disk emulation target. I gave my VM 16GB of memory, as I wasn’t sure how much it would want.

  6. Determine the location of the LVM disk used by the new VM (something like /dev/pve/vm-100-disk-0).

  7. Convert the VMware disk image to raw:

    qemu-img convert -f vmdk -O raw AWS-Appliance-2019-07-24-1563990364-disk1.vmdk /dev/pve/vm-100-disk-0
  8. Edit the VM hardware to add the LVM-thin drive resource. I added a 150GB hard drive as another IDE resource per AWS recommendations for local cache size.

  9. Start the VM.

  10. Look at the console in proxmox to determine the IP, and change it as desired for a static IP.

  11. Finish setup in the AWS Console for the Storage Gateway; your computer will need to be able to talk directly to the VM running the appliance. You will be asked to set a cache drive; select the additional 150GB drive.
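Step 4 above (recording the virtual size) can also be scripted. The sketch below relies on qemu-img's JSON output mode (qemu-img info --output=json), which reports the size in bytes under the "virtual-size" key, and rounds it up to whole GiB for sizing the VM disk. The function names and the sample JSON are made up for illustration; the sample is not real appliance output.

```python
import json
import subprocess


def parse_virtual_size(info_json):
    """Extract the virtual size in bytes from `qemu-img info --output=json` text."""
    return json.loads(info_json)["virtual-size"]


def virtual_size_gib(size_bytes):
    """Round a byte count up to whole GiB."""
    gib = 1024 ** 3
    return -(-size_bytes // gib)  # ceiling division


def image_virtual_size(path):
    """Run qemu-img against a disk image and return its virtual size in bytes."""
    result = subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        check=True, capture_output=True, text=True,
    )
    return parse_virtual_size(result.stdout)


# Hand-written sample so the parsing can be tried without qemu-img installed.
sample = '{"virtual-size": 85899345920, "format": "vmdk"}'
size = parse_virtual_size(sample)
print(virtual_size_gib(size), "GiB")
```

With the real image, image_virtual_size("AWS-Appliance-2019-07-24-1563990364-disk1.vmdk") would return the byte count to provision for.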

Pros

  • Fast access to frequently accessed files that fit within the cache
  • Everything backed by S3 reliability
  • As much storage as you want to pay for
  • It’s fun to see that your SMB share has 7.99 Exabytes free

Cons

  • Unavailable when the Internet is out
  • Slower access than a NAS
  • SMB requires an Active Directory server for user based permissions or a single guest account with read/write access to all SMB shares.
  • NFS shares have similarly odd restrictions

RancherOS

I followed these directions to install RancherOS under ProxMox VE. Reproduced here with a fix to the cloud-config.yml as the example didn’t validate.

  1. Download RancherOS ISO
  2. Upload the iso to (local)pve
  3. Set up a VM with the RancherOS ISO as CD. Give it at least 3GB RAM to start; Rancher Server failed with low RAM
  4. Boot
  5. From Console change password
    • sudo bash
    • passwd rancher
  6. SSH to rancher@
  7. prepare your ssh keys with putty gen or local ssh key-gen
    • vi cloud-config.yml
  8. paste the cloud config edited with your settings, make sure the data is pasted correctly, and add your key on a single line
  9. press Esc, then type :wq to save
#cloud-config

    rancher:
      network:
        interfaces:
          eth0:
            address: 10.68.69.92/24
            gateway: 10.68.69.1
            mtu: 1500
            dhcp: false
        dns:
          nameservers:
          - 1.1.1.1
          - 8.8.4.4

    ssh_authorized_keys:
      - ssh-rsa <YOUR KEY>
  • sudo ros config validate -i cloud-config.yml
  • sudo ros install -c cloud-config.yml -d /dev/sda
  10. Remove CD Image from VM, and then reboot.
  11. SSH back into RancherOS (rancher@) using your new ssh private key
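Since a mangled paste is the most likely way for the cloud-config to fail validation, one option is to generate the file instead of editing it by hand. This is a minimal sketch with a made-up helper name (render_cloud_config); it produces the same structure as the config above and refuses an SSH key that spans more than one line.

```python
def render_cloud_config(address, gateway, nameservers, ssh_key):
    """Build a RancherOS cloud-config with a static eth0 address and one SSH key."""
    key = ssh_key.strip()
    if "\n" in key:
        raise ValueError("the SSH public key must be on a single line")
    lines = [
        "#cloud-config",
        "rancher:",
        "  network:",
        "    interfaces:",
        "      eth0:",
        "        address: %s" % address,
        "        gateway: %s" % gateway,
        "        mtu: 1500",
        "        dhcp: false",
        "    dns:",
        "      nameservers:",
    ]
    lines += ["      - %s" % ns for ns in nameservers]
    lines += ["ssh_authorized_keys:", "  - %s" % key]
    return "\n".join(lines) + "\n"


config = render_cloud_config(
    "10.68.69.92/24", "10.68.69.1", ["1.1.1.1", "8.8.4.4"], "ssh-rsa <YOUR KEY>"
)
print(config)
```

The output can be written to cloud-config.yml and still checked with sudo ros config validate before installing.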

Rancher

With RancherOS running happily, it’s time to install Rancher on the VM. This is relatively easy; from the RancherOS VM shell, just run:

sudo docker run -d --restart=unless-stopped -p 8080:80 -p 8443:443 -v rancher:/var/lib/rancher rancher/rancher:stable

Mapping ports 80/443 to different local ports avoids interference from the ingress proxy, which will be running on this same node.

Once Rancher is available on port 8443:

  1. Add a cluster, of custom type.
  2. Name it, and hit next.
  3. Select all three node options (etcd, Control Plane, Worker)
  4. Copy the command shown and run it in the RancherOS shell.
  5. Click Done in the Rancher UX.
  6. The cluster will become available.

Setup the SATA SSD

I want to use the SATA SSD for persistent volumes for the containers:

  1. Add a hard drive in Proxmox VE to RancherOS VM

  2. Choose a sufficient size (I chose 400 GB)

  3. Start the RancherOS VM (or restart it)

  4. Verify additional hard drive appears in fdisk -l

  5. Partition the hard drive with fdisk /dev/sdb

  6. Choose new partition, primary, select default start/end values

  7. Format the partition with mkfs.ext4 /dev/sdb1

  8. Set it to load at start in RancherOS:

    ros config set mounts '[["/dev/sdb1","/mnt/data","ext4",""]]'
  9. Reboot and verify /mnt/data is a volume mount.

Fin

That’s it for a first day of configuring things. Next up I’ll need to set up MetalLB so that the Kubernetes containers I start with Rancher get LAN IPs rather than shuttling everything through the default nginx ingress.

Trying out SmartOS and OpenIndiana

After building my new server capable of running SmartOS, it was time to give it a spin!

If you’ve only built desktop machines, it’s hard to express how awesome IPMI KVM is. No longer do you need to grab another keyboard / video monitor / mouse (the KVM), you just plug in the IPMI Ethernet port on the motherboard to your switch and hit the web-server it’s running. It then lets you remotely access the machine as if you had it hooked up directly. You can get into the BIOS, boot from ISO’s on your local machine, hard reset, power down, power up, etc. It’s very slick and means I can stick the computer in the rack without needing to go near it to do everything that used to require a portable set of additional physical hardware.

Note

This post assumes some basic knowledge of OS virtualization. In this case QEMU, KVM (which was ported by Joyent to run on SmartOS), and Zones. I generally refer to them as VM’s and will differentiate when I add a Zone vs. a KVM instance.

First Go at SmartOS

Installation is ridiculously easy: there is none. You download SmartOS, put it on a USB stick or CD-ROM, and boot the computer from it. I was feeling especially lazy and used the motherboard’s IPMI KVM interface to remotely mount the ISO image directly from my Mac.

Once SmartOS booted, it asked me to set up the main ZFS pool, and it was done. SmartOS runs a lot like a VMware ESXi hypervisor, with the assumption that the machine will only be booting VM’s. So the entire ZFS pool is just for your VM’s, which I appreciate greatly. After playing with it a little bit, it almost felt... too easy.

I had really allocated at least a week or two of my spare time to fiddle around with the OS before I wanted it to just work, and having it running so quickly was almost disappointing.

The only bit that was slightly annoying was that retaining settings in the GZ (Global Zone) is kind of a pain. You have to drop in a service file (which is XML, joy!) on a path which SmartOS will then load and run on startup. This was mildly annoying, and some folks on the IRC channel suggested I give OpenIndiana a spin, which is aimed more at a home server / desktop scenario. There was also a suggestion that I give Sophos UTM a spin instead of pfsense for the firewall / router VM.

OpenIndiana

Since OpenIndiana has SmartOS’s QEMU/KVM functionality (needed to run other OS’s like Linux/BSD/Windows under an illumos based distro), it seemed worth giving a go. It actually installs itself on the system unlike SmartOS, so I figured it’d take a little more space. No big deal. Until I installed it.

Then I saw that the ZFS boot pool can’t have disks in it larger than 2TB (well, it can, but it only lets you use 2TB of the space). Doh. After chatting with some IRC folks again, I learned it’s common to use two small disks in a mirror as a ZFS boot pool and then have the much larger storage pool. Luckily I had a 250GB drive around so I could give this a spin, though I was bummed to have to use one of my drive bays just for a boot disk.

Installation went smoothly, but upon trying to fire up a KVM instance I was struck by how clunky it is in comparison to SmartOS. Again, this difference comes down to SmartOS optimizing the heck out of its major use-case.... virtualizing in the data-center. In SmartOS there’s a handy imgadm tool to manage available images, and vmadm to manage VM’s. These don’t seem to exist for OpenIndiana (maybe as an add-on package?), so you have to use the less friendly QEMU/KVM tools directly.

Then the KVM failed to start. Apparently the QEMU/KVM support in OpenIndiana (at least for my Sandy Bridge based motherboard) has been broken in the latest 3 OpenIndiana releases for the past 5 months. There’s a work-around to install a specific set of packages, but to claim QEMU/KVM support with such a glaring bug in a fairly prominent motherboard chip-set isn’t a good first start.

My first try to install the specific packages failed as my server kernel-panicked halfway through the QEMU/KVM package installation. Upon restarting, the package index was apparently corrupted. The only way to fix it is to re-install OpenIndiana... or rollback the boot environment (a feature utilizing ZFS thus including snapshots). Boot environments and the beadm tool to manage them are a bit beyond the scope of this entry, but the short version is that it let me roll-back the boot file-system including the package index to a non-mangled state (Very cool!).

With QEMU / KVM finally installed and working, I installed and configured Sophos UTM in a KVM and was off and running. Except it seemed to run abysmally slow... oh well, I was about to go on vacation anyways. I set the KVM to load at boot-time and restarted.

Upon loading the KVM at boot, the machine halted. This issue is apparently related to the broken QEMU / KVM packages. It was about time for my vacation, and I had now played with an OS with some rather rough edges in my spare time for a week. So I powered it off, took out the boot drive, and went on my vacation.

Back to SmartOS

When I got back from my vacation, I was no longer in the mood to deal with failures in the OS distribution. I rather like the OpenIndiana community, but now I just wanted my server to work. SmartOS fit the bill, and didn’t require boot drives which was greatly appreciated. It also has a working QEMU / KVM, since it’s rather important to Joyent. :)

In just a day, I went from a blank slate to a smoothly running SmartOS machine. As before, installation was dead simple, and my main ZFS pool zones (named as such by SmartOS) was ready for VM’s. Before I added a VM I figured I should have an easy way to access the ZFS file-system. I turned on NFS for the file-systems I wanted to access and gave my computer’s IP write privilege and the rest of the LAN read-only. This is insanely easy in ZFS:

zfs set sharenfs=rw=MYIP,ro=192.168.2.0 zones/media/Audio

To say the least, I love ZFS. Every other file-system / volume manager feels like a relic of the past in comparison. Mounting NFS file-systems on OSX used to suck, but now it’s a breeze. They work fast and reliably (thus far at least).

Setting Up the Router KVM

First, I needed my router / firewall KVM. I have a DSL connection, so I figured I’d wire that into one NIC, and have the other NIC on the motherboard go to the LAN. SmartOS virtualizes these so that each VM gets its own Virtual NIC (VNIC); this is part of the Solaris feature-set called Crossbow. Setting up the new KVM instance for Sophos UTM was simple: I gave it a VNIC on the physical interface connected to the DSL modem and another on the physical interface connected to my switch.

Besides the fact that the VM was working without any of the issues I had in OpenIndiana, I noticed it was much faster as well. Unfortunately for some reason it wasn’t actually routing my traffic. It took me about an hour (and clearing the head while walking the dog) to see that I was missing several important VNIC config options, such as dhcp_server, allow_ip_spoofing, allow_dhcp_spoofing, and allow_restricted_traffic.

These settings are needed for a VM that intends to act as a router so that it can move the packets and NAT them as appropriate across the VNICs. Once I set those everything ran smoothly.
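As a rough illustration, the four options can be collected into a JSON payload of the kind vmadm update accepts, where an update_nics list identifies each NIC by its MAC address. This is a sketch under assumptions: the MAC addresses are placeholders, the helper name is made up, and whether every flag belongs on every NIC depends on the setup (a DHCP server flag on the WAN side may not be wanted, for example).

```python
import json

# The four VNIC options named above.
ROUTER_NIC_FLAGS = (
    "dhcp_server",
    "allow_ip_spoofing",
    "allow_dhcp_spoofing",
    "allow_restricted_traffic",
)


def router_nic_update(macs):
    """Build an update payload enabling the router-related flags on each NIC."""
    nics = []
    for mac in macs:
        nic = {"mac": mac}
        for flag in ROUTER_NIC_FLAGS:
            nic[flag] = True
        nics.append(nic)
    return {"update_nics": nics}


# Placeholder MAC addresses; substitute those of the router VM's VNICs.
payload = router_nic_update(["02:00:00:00:00:01", "02:00:00:00:00:02"])
print(json.dumps(payload, indent=2))
```

The printed JSON could be saved to a file and passed to vmadm update with the VM's UUID.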

So far, this only took me about 3 hours and was rather simple so I decided to keep going and get a nice network backup for the two OSX machines in the house.

Setting Up Network Backups

After some research I found out the latest version of netatalk would work quite nicely for network Time Machine backups. I created a zones/tmbackups ZFS file-system, and two nested file-systems under that for my wife’s MacBook and my own Mac Mini. Then I told ZFS that zones/tmbackups should have compression enabled (Time Machine doesn’t actually compress its backups, transparent ZFS file compression FTW!) and I set quotas on each nested file-system to prevent Time Machine from expanding forever.
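The layout described here boils down to a handful of zfs commands. The sketch below only builds and prints them rather than running anything; the nested file-system names and the 500G quotas are hypothetical, since the actual values aren't given. Compression is set once on the parent because child file-systems inherit the property.

```python
def time_machine_zfs_commands(parent, quotas):
    """Return zfs commands for a compressed parent with quota-limited children."""
    commands = [
        ["zfs", "create", parent],
        ["zfs", "set", "compression=on", parent],
    ]
    for name, quota in quotas.items():
        child = "%s/%s" % (parent, name)
        commands.append(["zfs", "create", child])
        commands.append(["zfs", "set", "quota=%s" % quota, child])
    return commands


# Hypothetical child names and quota sizes.
commands = time_machine_zfs_commands(
    "zones/tmbackups", {"macbook": "500G", "macmini": "500G"}
)
for cmd in commands:
    print(" ".join(cmd))
```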

Next I created a Zone with a SmartOS Standard dataset. Technically, the KVM instances run in a Zone for additional resource constraints and security, while I wanted to use just a plain Zone for the network backups. This was mainly because I wanted to make the zones/tmbackups file-system directly available to it without having to NFS mount it into a KVM.

If you’ve ever compiled anything from source in Solaris, you’re probably thinking about how many days I spent to get netatalk running in a Zone right now. Thankfully Joyent has done an awesome job bringing a lot of the common GNU compiler toolchain to SmartOS. It only took me about an hour to get netatalk running and recognized by both macs as a valid network Time Machine backup volume.

Unfortunately I can’t remember how exactly I set it up, but here are the pages that gave me the guidance I needed:

I’ve heard that netatalk 3.x is faster, and will likely upgrade that one of these days.

Setting Up the Media Server KVM

One of the physical machines I wanted to get rid of was the home theater PC I had built a few years back. It was rarely used, not very energy efficient, and XBMC was nowhere near spouse-friendly enough for my wife. We have an AppleTV and Roku, and I figured I’d give Plex a try on the Roku since the UI was so simple.

I set up a KVM instance and installed Ubuntu 12.04 server on it. Then I added the Plex repos and installed their Media Server packages. I fired it up, pointed Plex at my Video folders, and it was ready to go. The Roku interface is slick and makes it a breeze to navigate. Being based on XBMC means that it can play all the same media and trans-codes it as necessary for the other network devices that want to play it.

At first Plex ran into CPU problems in the KVM... which I quickly realized was because I hadn’t changed the default resource constraints. The poor thing only had a single virtual CPU... after giving it a few more it easily had enough CPU allocated to do the video trans-coding.

While KVM runs CPU-bound tasks at bare-metal speed, disk I/O is virtualized. To reduce this problem I have Plex writing its trans-coded files to the ZFS file-system directly via an NFS mount. The media folders are also NFS mounted into the Media Server KVM.

I threw some other useful apps onto this KVM that I was running on the home theater PC and left it alone.

SmartOS Rocks

I now have a nice little home SmartOS server setup running that does a great job taking on jobs previously done by 2 other pieces of hardware. I still need to setup a base Ubuntu image to use for other development KVM’s, which I’ll blog about when I get that going. Despite being intended for the data-center, SmartOS works great for a home NAS / Media Server / Router system. I’m sure I’ll be even happier as I start to ramp up my use of development VM’s.

OpenIndiana is a small community taking on a big job. It’s a great community and people are very friendly. But you should expect to be hacking on things very early on if you use it, rather than playing with the other components. The SmartOS community is doing great too, and there’s more than a few forks that add some additional home-centric type functionality. So far I haven’t needed any of those enough to get me to try them out.

Anything else I should blog about regarding SmartOS or the rest of my setup?

Building A SmartOS Server

I’ve been reading about SmartOS for awhile now and have wanted to build a home server that would let me run VM’s with ZFS for the main file-system. Getting rid of my home theater PC and wireless router (which has been annoying me with its flakiness for months) was also a goal. Running something like pfsense in a VM would give me more options and theoretically be more stable than the fairly crappy software that seems to plague home consumer-grade wireless routers.

So after a month or so of research in my spare time, it seemed like SmartOS was going to be the best bet. Even though it’s generally intended for use in the datacenter, it had all the features I wanted (which I’ll blog about separately in my next post). Now I just needed a parts list that had already been verified to work with SmartOS, which is a bit pickier on hardware than the linux/BSD distributions.

Equipment

Here’s what I ended up with:

  • CPU: Intel Xeon E3-1230 V2
  • Motherboard: SUPERMICRO MBD-X9SCL-F-O
  • Case: NORCO RPC-2212 Black 2U Rackmount Server Case with 12 Hot-Swappable SATA/SAS Drive Bays
  • HBA: LSI Internal SATA/SAS 9211-8i (Hooks up to 2 of the back-plane connectors in the case for 8 drives)
  • RAM: 16GB ECC (The 8 GB unbuffered sticks were unfortunately not around at the time or I would’ve gotten two of those to begin with)

I already had a 2TB and 3TB drive, so I bought one more of each so that I could run a ZFS storage pool with 2 vdev mirrors as Constantin Gonzalez blogs about regarding RAID vs. mirrors.

In retrospect, and after reading a bit more, I think I would’ve gotten one of the larger Norco 4U cases. Not because I need or want 20+ hot-swap bays, but because you can easily use a ‘desktop’ grade 80+ Titanium rated power supply. Finding a 2U 80+ PSU is difficult, and an 80+ Titanium rated one that puts all its power out on a single 5v rail is almost impossible. With the savings from a good desktop-grade PSU, the Norco 4U case would have cost about the same as the 2U case I got with its more expensive PSU.

I also bought a rack to put the server in along with my other home networking gear, so that it’d all be nicely packed away in a corner of the garage. Here’s a photo of the completed setup:

My home-server rack

I have one of the cheaper Cisco SG300-10 switches which conveniently came with rack-mounts, and monoprice had a very affordable patch panel and blank plates to make it look tidy.

Overall cost: ~$2200

That includes the nice Tripp Lite SR12UB 12U Rack Enclosure which I’ve found handy to lock to ensure my toddler doesn’t yank out hard drives (he figured out how to pull out the hot-swap drive in all of 20 seconds when I was assembling it). Not that I let him run around the garage, but keeping everything locked is handy just in case.

OS Choice

When I was assembling and preparing to install SmartOS, some people on IRC mentioned that OpenIndiana might be a better choice for a home server. Suffice it to say it didn’t work out well, while SmartOS has been flawless now and running smoothly for the past two months.

My next post will have a lot more details on my OpenIndiana experience as well as how I have the SmartOS box setup.

New Blog Software Again!

I’ve been using tumblr for awhile and while it’s useful when posting random stuff I should’ve posted to Facebook instead (images, links, videos, etc.), writing my text in HTML was just icky. Markdown isn’t a huge improvement and I was really itching to write all my posts in reST as I already know it quite well from writing my docs using Sphinx.

I considered using blogofile, but it seems to be abandoned and it’s not trivial to add normal reST style code highlighting. Then I saw tinkerer, which is basically just a few extensions on top of Sphinx... perfect!

The Migration

For anyone considering migrating from tumblr, here’s my simple dump script that pulled all the posts I cared about out of tumblr and dropped them into directories for tinkerer:

import os
import re
import subprocess

import requests


date_regex = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')


def pull_posts(blog, api_key):
    """Fetch the posts JSON for a blog from the tumblr v2 API."""
    api_url = 'http://api.tumblr.com/v2/blog/%s/posts?api_key=%s&limit=2000'
    call_url = api_url % (blog, api_key)
    r = requests.get(call_url)
    posts = r.json()
    return posts


def html2rst(html):
    """Convert an HTML string to reST by piping it through pandoc."""
    p = subprocess.Popen(['pandoc', '--from=html', '--to=rst'],
                         stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    return p.communicate(html.encode('utf-8'))[0].decode('utf-8')


def dump_posts(posts):
    """Write each text post to YEAR/MONTH/DAY/slug.rst and return the paths."""
    post_links = []
    for post in posts:
        # Only text posts are migrated; links, videos, etc. are skipped.
        if post['type'] not in ['text']:
            continue
        d = date_regex.match(post['date']).groupdict()
        os.makedirs('%s/%s/%s' % (d['year'], d['month'], d['day']), exist_ok=True)
        slug = post['post_url'].split('/')[-1].replace('-', '_')
        link = '%s/%s/%s/%s' % (d['year'], d['month'], d['day'], slug)
        post_links.append(link)
        bar = '=' * len(post['title'])
        with open('%s.rst' % link, 'w', encoding='utf-8') as f:
            print(post)
            # reST title with a matching underline.
            f.writelines([post['title'], '\n', bar, '\n\n'])
            # Swap curly apostrophes for plain ones before converting.
            body = html2rst(post['body'].replace('’', "'"))
            f.writelines([body, '\n\n'])
            # tinkerer metadata directives.
            f.writelines([
                '.. author:: default\n', '.. categories:: %s\n' % ', '.join(post['tags']),
                '.. comments::\n', '   :url: %s' % post['post_url']
            ])
    return post_links
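One caveat worth noting: as far as I know the tumblr v2 API caps limit at 20 posts per request and pages through results with an offset parameter, so a single call with limit=2000 may return only the first page. Below is a minimal sketch of an offset-based paging loop. The fetch_page callable is a stand-in for whatever performs the real request, and the fake data exists only so the loop can be run offline.

```python
def iter_all_posts(fetch_page, page_size=20):
    """Yield every post, requesting pages until one comes back short."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        for post in page:
            yield post
        if len(page) < page_size:
            break
        offset += len(page)


# Stand-in fetcher over fake data (not real tumblr responses).
fake_posts = [{"id": n, "type": "text"} for n in range(45)]


def fake_fetch(offset, limit):
    return fake_posts[offset:offset + limit]


all_posts = list(iter_all_posts(fake_fetch))
print(len(all_posts), "posts collected")
```

A real fetch_page would add offset and limit to the request URL and return the post list from the response.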

It’s quite nice that I get all the Sphinx extensions for use in my blog, and there’s no more mental context switching to write a blog post vs. writing project documentation for my open-source projects.

Dropping a graphviz diagram into my blog also became trivial.

digraph foo {
    "awesome" -> "sphinx";
    "awesome" -> "tinkerer";
}

The Bad News

It’s not perfect; tinkerer is still very beta. But I can wrap my head around it, and it’s easy to extend. I’ve already made a little modification in my own fork which allows me to specify URLs for the comments to ensure I get the right disqus threads on the old blog posts I ported. This wasn’t a flawless process: the main reactions and comments thing on the main pages looks a bit off for the legacy posts... but at least the comments and such show up fine once you click in, so I’ll live with it.

There are no category-specific RSS feeds at the moment, so I’ll need to hack that in so that I can get relisted on the Python aggregators. I also will likely update the theme; right now I’m just using ‘minimal’, which isn’t bad.

Since this is more for just posts, I dropped the other tumblr things like links and videos to retain just content. I don’t think this is too negative but some might want all the types tumblr supports.

Overall, I’m quite happy with it thus far. We’ll see how I feel in a few months, and hopefully I’ll be blogging more since there’s less friction involved since I get to use the Sphinx tools I’m quite familiar with.

Notes on the Pylons & repoze.bfg Merger

Some folks might not have time to follow the Pylons-discuss mail list, so this might be news to them, but I’m thrilled to announce that the Pylons and repoze.bfg web frameworks are merging. If this is the first you’ve heard about it, don’t worry, it was only announced a week ago now on the Pylons mail list.

In the time since the announcement, I’ve heard a lot of varying feedback. Some people took a look at Pyramid (the core package that will be equivalent to ‘Pylons 2.0’) and were quick to respond, usually with a knee-jerk response. I think some of this was due to a miscommunication, and partly because there was so much already done. When other frameworks have merged in other languages, such as Rails merging with Merb, the announcement was just that. There was no code at the time to show, just a promise that when it was ready, it would be awesome.

This merger, in contrast, already had a starting foundation for a huge chunk of the core features. As a result, people assumed that what we had was already ‘finished’, or close to it. The polish of much of the documentation made it feel odd that there was no “Porting Pylons 1.0 to Pyramid” guide done. In reality, Pyramid is definitely not done; there is still quite a bit of work left before Pyramid will meet the expectations that many Pylons users have. There are still refinements to be done to Pyramid, and additional packages that Pylons users will most likely always use with it for the feature-set they’re accustomed to.

I’ve summed up a few thoughts on when Pylons users should port to Pyramid to try and help manage expectations better in the future. I’ll make more announcements when packages are ready to ease the transition and a “Porting Guide” is ready.

What is Pylons?

Many Pylons users don’t realize which features they enjoy come from the package ‘pylons’ vs. the other packages that Pylons depends on. Contrary to popular belief, the majority of features present in Pylons actually come from other packages. This mistaken belief that most of the features come from the pylons package led some to think that because a lot of my future development time will be spent on adding features/packages around pyramid, Pylons is somehow dead. This is not the case.

First, Pylons the web framework is mainly a small (~ 1000 LoC) glue layer between Paste, PasteScript, PasteDeploy, WebOb, WebError, Routes, WebHelpers, Mako, and SQLAlchemy. Some people usually end up swapping out Mako/SQLAlchemy but by and large this is the common ‘Pylons Stack’. Most of the new features in Pylons over the past several years actually came from additions to WebHelpers, WebError, or Routes. All of these packages continue to get the same development as they have, so no ‘death’ is occurring.

Second, for the past 6 months or more, there’s been very little in the way of patches submitted, bugs reported, or feature requests. In many ways Pylons is ‘done’ regarding adding more features to the core package itself. As I announced on the Pylons-discuss mail list, the Pylons code-base hit some design issues. The extensibility features I heard requested from quite a few users (and needed myself) couldn’t be retro-fitted into the existing design. I encourage anyone curious to read my prior entry on sub-classing for extensibility as a preview of some future blog posts. I’ll be writing more about design patterns in Python that handle extensibility, which many popular Python web frameworks are also struggling with.

The Future

I’m very excited about the future for the Pylons Project, which is the new over-arching organization that will be developing Python web framework technologies. The core will be Pyramid, with additional features and functionality building around that. We’re already quickly expanding the developer team with some long-time contributors and having a combined team has definitely helped us progress rapidly.

One of my main goals is to encourage and ease contributions from the community. To that extent I’ve been filling in the contributing section for the Pylons Project as much as possible. I believe this is an area that will quickly set us apart from other projects as we emphasize a higher standard of Python development.

Django did a good job setting the bar high with its documentation of how to contribute to Django, which deserves a lot of credit for clearly defining community policies. It’s missing a portion we considered extremely valuable, and which core developers generally get very picky about when accepting patches… how to test your code. The Pylons Project adapted the rather thorough testing dogma noted by Tres Seaver, which I personally can’t recommend highly enough when it comes to writing unit tests. It’d be nice to see more posts expand on exactly how to test your code. Many developers (including myself) can write code that passes 100% test coverage… but is it brittle test code? Prone to failure if some overly clever macro it uses fails? A well written set of examples on designing unit tests to avoid common gotchas is definitely something anyone contributing (and developers in general) should be familiar with.

For those wanting a gentler introduction to Pyramid (the docs are very verbose and detailed, not at all opinionated), I’ll be blogging more about new features and how to utilize them. Please be patient, I think a lot of people are going to be excited at what’s in store.

Why Extending Through Subclassing (a framework’s classes) is a Bad Idea

Ok, I’ll admit it, overly ambitious blog post. So I’ll refine it a little now: this is intended mainly as my thoughts on why, as a tool developer (one who makes tools/frameworks that other programmers then use), it’s a bad idea to implement extensible objects via developer subclassing. This is actually how the web framework I wrote - Pylons - provides its extensibility to developers and lets them change how the framework functions.

Please excuse the short and possibly incomplete description, this is mainly a quick post to illustrate a tweet I recently made.

First, some background…

One of the things that Pylons 1.0 and prior is missing is a way to easily extend a Pylons project. While it can be done, it’s very ad-hoc, kludgy, and generally not very well thought-out. Or thought-out at all really. What was somewhat thought-out was how a developer was supposed to extend and customize the framework.

In a Pylons project, the project creates a PylonsApp WSGI object, and all the project’s controllers subclass WSGIController. This seemed to work quite well, and indeed many users happily imported PylonsApp, subclassed it to extend/override methods they needed for customization, or changed how their WSGIController subclass worked to change how individual actions would be called.

Everything seemed just fine…. until…

Improving Pylons

When I had some free time a little while back, I set about looking into how to extend and improve Pylons to make up for where it was lacking, extensibility. I quickly realized that I’d need to change rather drastically how Pylons dispatch worked, and how controller methods were called to make them more easily extendable. But then with a certain feeling of dread, the subclassing issue nipped me. All my implementations of PylonsApp and WSGIController were effectively frozen.

Since every single developer using Pylons sub-classes WSGIController, and to a much lesser extent, PylonsApp, any change to any of the main methods would result in immediate breakage of every single Pylons user’s app that happened to customize them (the very reason subclassing was used!). This meant that I couldn’t very well change the implementation of the actual classes to fix their design, because that would just cause complete breakage. Ugh!
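To make the trap concrete, here is a minimal sketch. The class and method names (`FrameworkControllerV1`, `_perform_call`, `_dispatch`) are hypothetical stand-ins, not the actual Pylons internals. It shows the same user subclass running against two versions of a framework base class, where the second version reorganizes an internal method the user happened to override.

```python
# Hypothetical "version 1" of a framework base class. Every method is
# effectively public API, because users may override any of them.
class FrameworkControllerV1:
    def __call__(self, action):
        return self._perform_call(action)

    def _perform_call(self, action):
        return getattr(self, action)()


# Hypothetical "version 2": the maintainer cleans up the design and the
# internal step is now called _dispatch; _perform_call is gone.
class FrameworkControllerV2:
    def __call__(self, action):
        return self._dispatch(action)

    def _dispatch(self, action):
        return getattr(self, action)()


def make_user_controller(base):
    """Build identical user code against a given framework version."""
    class UserController(base):
        # The user's customization: wrap every action call.
        def _perform_call(self, action):
            return "[wrapped] " + super()._perform_call(action)

        def index(self):
            return "hello"

    return UserController


v1 = make_user_controller(FrameworkControllerV1)()
v2 = make_user_controller(FrameworkControllerV2)()

print(v1("index"))  # the override runs
print(v2("index"))  # the override is silently ignored
```

In this sketch the upgrade does not even raise an error: the user's override simply stops being called, which is arguably worse than a crash. Avoiding that means never touching `_perform_call` again, which is what "frozen" means here.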

So after looking into it more, I’ve ended up with this short list of the obvious bad reasons this shouldn’t be done. BTW, in Pylons 2, controllers don’t subclass anything, and customization is all with hooks into the framework, no subclassing in sight!

Short List of Why It’s Bad

From a framework maintainer’s point of view…

  1. Implementations of the classes are effectively frozen, because all the class methods are the API.
  2. Correcting design flaws or implementation flaws is much more difficult, if not impossible, without major breakage due to point #1.
  3. Heavily sub-classed/large hierarchy classes can have performance penalties.

From a developer ‘extending’ the classes point of view…

  1. Figuring out how to unit-test is more difficult as the full implementation is not in your own code… it’s in the framework code.
  2. When using mix-ins and other classes that also subclass the framework, strange conflicts or overrides occur that aren’t at all obvious or easy to debug/troubleshoot.

I think there were a few more reasons I came across as well, but I can’t recall them at the moment. In short, I’m now of the rather firm opinion that the only classes you should ever subclass are your own classes. Preferably in the same package, or nearby.
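For contrast, here is a rough sketch of what customization through hooks can look like. Everything in it (`Framework`, `before_call`, `PlainController`) is a made-up illustration of the general pattern, not the hook API that Pylons 2 actually provides.

```python
# A hypothetical framework that exposes named hooks instead of base classes.
class Framework:
    def __init__(self):
        self._before_call = []  # callables run before every action

    def before_call(self, func):
        """Register a hook; usable as a decorator."""
        self._before_call.append(func)
        return func

    def call(self, controller, action):
        # The internals here can change freely, as long as the registered
        # hooks are still fired with the same arguments.
        for hook in self._before_call:
            hook(controller, action)
        return getattr(controller, action)()


class PlainController:  # note: no framework base class
    def index(self):
        return "hello"


framework = Framework()
calls = []


@framework.before_call
def log_action(controller, action):
    # User customization lives in the user's own function.
    calls.append(action)


result = framework.call(PlainController(), "index")
print(result, calls)
```

The only API the framework has to keep stable in this sketch is the hook name and the arguments passed to it. The controller is a plain class that can be unit-tested without importing the framework at all.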

Pylons 0.10 and 1.0 Beta 1 Released

Without further ado,

I’m pleased to announce that Pylons 0.10b1 and 1.0b1 are now out. I have not put them on Cheeseshop to ensure they’re not downloaded accidentally.

Upgrading / Installing

I have updated upgrading instructions here: http://pylonshq.com/docs/en/1.0/upgrading/

The instructions to install from scratch on Pylons 1.0b1: http://pylonshq.com/docs/en/1.0/gettingstarted/#installing

The upgrading page covers the important upgrading instructions that Mike Orr touched briefly on before.

Note that these are beta releases, intended for us to discover remaining issues and continue updating any other documentation where applicable. Very little has actually changed in Pylons since 0.9.7, apart from 1.0 dropping all of the legacy functionality and a few explicit clean-ups.

Updates

Routes, Beaker, and WebHelpers however have been seeing quite a bit of updates through the life of Pylons 0.9.7 so no one should think that the developers working on Pylons and its related parts have been hanging out doing nothing. :)

Since Pylons 0.9.7 was released on February 23, 2009, almost one year ago now:

  • Routes 1.11 was released, and 1.12 with some great updates will be out shortly
  • Beaker has gone from 1.2.2 -> 1.5 with 3 major updates substantially increasing its ease of use and reliability
  • WebHelpers is now at 1.0b4 with major updates, core functions rewritten, and new docs up
  • SQLAlchemy has gone from 0.4 to 0.5 (with 0.6 in beta)

I believe this speaks a great deal about the benefits of keeping the core Pylons functionality separate from other parts, as a variety of bug fixes and features can be improved without requiring new Pylons releases to quickly address bug reports.

How to Help!

To bring Pylons to 1.0, many docs likely need very small changes. Also, it would be great to take care of reference docs where people have commented about problems/tips. Helping is fairly easy, especially if you’re familiar with reStructuredText.

First: Clone the Pylons repository on Bitbucket: http://bitbucket.org/bbangert/pylons/

Then: Edit the documentation files under pylons/docs/en/ to read as appropriate, commit the fix, and push it to bitbucket.

Finally: Issue a pull request on bitbucket so that we’ll know your fix is ready. Ideally you should include a note in it about what your fix remedies.

Bug Reports

Did your upgrade not go according to plan? Was there something missing that you needed to do from the upgrading docs?

Let us know by filing a bug report (mark component as documentation, and milestone as 0.10): http://pylonshq.com/project/pylonshq/newticket

You’ll need to login to file a bug report, or feel free to reply to this announcement with the issue.

Thanks (in alphabetical order) to Mike Bayer, Ian Bicking, Mike Burrows, Graham Higgins, Phil Jenvey, Mike Orr, and anyone else I missed for all their hard work on making Pylons and its various components what they are today.

Deploying Python web apps with toppcloud

Ian Bicking recently released a rather interesting package called toppcloud that aims to tackle what I see as a growing need for those of us deploying Python webapps.

I was actually interested in easing my own deployment woes before I saw Ian’s announcement about his package, and was halfway through a rather hefty amount of research on automating server deployments with tools like Chef and Puppet, but toppcloud is a bit different. It not only tackles provisioning a new system on the fly from the ‘cloud’ (using libcloud), but it also handles easy Python (and now PHP) web application deployments.

With such a tantalizing set of goals, I couldn’t really resist getting my feet wet. Boy am I glad I did. PylonsHQ is now running with toppcloud, and I won’t be surprised when more people get it running for them. When many shared hosting providers are $5-10 a month, it’s rather nice to pay $10/mth for an automatically configured VPS, with one-command deployments. Though of course, unless you go to quite a bit of work yourself, most shared hosting providers don’t have one-command deployments for you.

Before I continue on to describe how I setup Kai - the source code behind PylonsHQ - I should provide a few caveats about using toppcloud:

  • toppcloud is alpha software; there are no releases of it yet, so you will be checking it out from source code
  • currently, only Rackspace Cloud is known to be working, though since it uses libcloud, in theory, any cloud providers that it supports should be usable
  • toppcloud is changing rapidly, be ready to keep up on the commit log to see what’s changing
  • there are no unit tests, most likely because it’s very tedious to make the rather significant amount of mock objects required to test the various local/remote commands, and because it’s changing so fast that the tests would probably be obsolete in a week

toppcloud philosophy

I’m half guessing here, based on talking with Ian and reading the docs myself, but the philosophy of toppcloud is around providing a common deployment platform, ala Google App Engine, except of course with Postgres or other ‘services’ that you can request to use. At the moment the only services that toppcloud comes with are CouchDB, Files (to store/serve files for your app on the filesystem), and Postgres w/postgis extensions. For those diving into the source code, it shouldn’t be too hard to see how to create additional services and I hope to see more get added as people get more interested.

Therefore, toppcloud is not expected to be everything to everyone; it is expected to be at least an 80+% solution to deploying web apps in a Google App Engine style ease-of-deployment process. toppcloud itself then ensures that whatever service you asked for is set up and ready for use on the server when your app is deployed.

If that sounds like something you’re dying to try out, you’re in luck!

Setting up a Pylons App

I’m not going to mention how to check out the toppcloud source, except to mention the directions are in the /docs/index.txt file in the toppcloud source (which should be read in its entirety!). Once you have toppcloud installed on your computer, getting from zero to a running website on a VPS is remarkably quick:

  1. $ toppcloud create-node --image-id=14362 baseimage

    Wait until the email arrives indicating that the server is up and ready

  2. $ toppcloud setup-node baseimage

    Server is now all setup to run web apps!

  3. $ toppcloud init myapp

    Create an app to deploy to our server

  4. $ source myapp/bin/activate

    Activate the virtualenv used for this app

  5. $ pip install Pylons
  6. $ cd myapp/src
  7. $ paster create -t pylons awesomeapp

    Create our Pylons app, or check out an existing one from your VCS here

  8. $ cd ../..
  9. $ ln -s myapp/src/awesomeapp/awesomeapp/public/ myapp/static

    toppcloud will make things in the static directory available without hitting the webapp

  10. Configure myapp/app.ini similarly to (taken from pylonshq site):

    [production]
    app_name = pylonshq
    runner = src/kai/production.ini
    version = 1
    update_fetch = /sync_app
    service.couchdb =
    service.files =
    default_host = pylonshq.com

    In this case, the Pylons site needs CouchDB setup, and the files service (all Pylons sites should use the files service to store cached files, templates, etc.)

  11. $ toppcloud update myapp/

    Note that we’re one directory above myapp, and we have a trailing slash on myapp, this is needed because an rsync is done to copy it to the server, don’t leave off the trailing slash! (This will prolly be fixed at some point)

That’s it! 11 easy steps to go from zero to a running deployed website.

I should also note that you will need to make a production.ini file, and that it should have a few important changes in it. All the references to %(here)s should be changed to %(CONFIG_FILES)s since that’s the persistent location that files can be stored for a toppcloud app between deployments. Other configuration information provided by services (Couch supplies the db/host, Postgres has its host/user/pass info) can be accessed via CONFIG_ vars as well. The services docs have some more info.
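As a rough illustration of how the `%(NAME)s` substitution behaves, the sketch below uses Python's standard `configparser` with a default value for `CONFIG_FILES`. The path and the option names in the sample ini text are made up for the example, and this is not necessarily how toppcloud itself supplies the value when it loads your production.ini.

```python
import configparser

# CONFIG_FILES is provided by the deployment environment in a real setup;
# the path below is an invented value for illustration only.
defaults = {"CONFIG_FILES": "/var/lib/myapp"}

# Hypothetical fragment of a production.ini after replacing %(here)s.
ini_text = """
[app:main]
cache_dir = %(CONFIG_FILES)s/data
beaker.session.data_dir = %(CONFIG_FILES)s/data/sessions
"""

parser = configparser.ConfigParser(defaults=defaults)
parser.read_string(ini_text)

# Values are interpolated when they are read back.
cache_dir = parser.get("app:main", "cache_dir")
session_dir = parser.get("app:main", "beaker.session.data_dir")
print(cache_dir)
print(session_dir)
```

The point is simply that any path built from `%(CONFIG_FILES)s` follows the persistent directory wherever the deployment puts it, while a path built from `%(here)s` would follow the location of the ini file itself.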

More apps can be added to a host as desired, until the RAM runs out of course. At the moment toppcloud is using a process pool of 5 with 1 thread under mod_wsgi for each application. This can use a bit of RAM if you have multiple heavy Pylons processes; hopefully there will shortly be a way to ask toppcloud to use a single process with multiple threads, which will help cut the RAM profile a bit.

Using Django and PHP

Unfortunately I haven’t actually tried this myself, but there’s nothing preventing it. You’ll need to change the app.ini so that instead of using ‘src/kai/production.ini’ as the runner, it uses a Python file in the directory, say main.py that then loads the Django app as a WSGI application and returns it. Sort of like this. Note that the config vars Django needs for its database should then be present in os.environ when setting up the settings.py that Django uses.
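Since I haven't run this myself, treat the following as an untested-in-production sketch of the general shape of such a main.py. It uses only the standard library: a stand-in WSGI callable takes the place of the Django handler, `CONFIG_PG_HOST` is a hypothetical variable name (check the services docs for the real ones), and the assumption that the runner looks for a module-level `application` is mine.

```python
import os


def make_application(environ_vars=os.environ):
    """Return a WSGI callable configured from environment variables."""
    # CONFIG_PG_HOST is a hypothetical name used for illustration.
    db_host = environ_vars.get("CONFIG_PG_HOST", "localhost")

    def application(environ, start_response):
        # A real main.py would hand the request to the Django WSGI
        # handler here instead of building a response by hand.
        body = ("database host: %s" % db_host).encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return application


# Module-level WSGI callable for the runner to pick up (assumed convention).
application = make_application()
```

Reading the settings inside a factory function keeps the environment lookup in one place, which also makes it easy to pass in a plain dict when trying the module out locally.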

If you’re looking through the toppcloud source code by now, you may have also noticed there’s an example app that uses PHP. There’s nothing holding back toppcloud from setting up mod_passenger and deploying Ruby apps at some point either should someone wish to add that feature.

dumb pipes

There’s been numerous mentions on various blogs and in the news about how much the cell phone carriers hate the concept of being nothing more than “dumb pipes” for wireless Internet and phone use. That means of course, that they’d no longer be competing on what phones you could use but instead solely on service quality and price… I think the same type of transition is in store for shared hosting providers and some of the boutique app deployment shops like heroku.

It’ll take awhile, toppcloud is very rough right now. But when you can d/l a nice little open-source package, run it, choose your choice of cloud provider (based on price + quality!), have it automatically setup for you to deploy your apps to, then deploy apps with a single command… you’ve already done the vast majority of what a service like heroku does, except you could still modify toppcloud if there was something lacking you really needed. And it’s a reason like that, that toppcloud exists to begin with.

Oh, and thanks Ian for writing this before I wasted more time making it myself.