Planet linux.conf.au
Celebrating the wonderful linux.conf.au 2015 conference...

April 27, 2015

Announcing GovCloud support on AWS

Today we are happy to announce CoreOS Linux now supports Amazon Web Services GovCloud (US). AWS GovCloud is an isolated AWS Region for US government agencies and customers to move sensitive workloads into the AWS cloud by addressing their specific regulatory and compliance requirements. With this, automatic updates are now stable and available to all government agencies using the cloud.

CoreOS Linux customers will benefit from the security provided by FedRAMP support, a US government program that provides a standardized approach to security assessment, authorization and continuous monitoring for cloud products and services.

For more details, see the documentation on Running CoreOS on EC2.

New gst-rpicamsrc features

I’ve pushed some new changes to my Raspberry Pi camera GStreamer wrapper, at https://github.com/thaytan/gst-rpicamsrc/

These bring the GStreamer element up to date with new features added to raspivid since I first started the project, such as adding text annotations to the video, support for the 2nd camera on the compute module, intra-refresh and others.

You can now dynamically update any of the properties where the firmware supports it. So you can implement digital zoom by adjusting the region-of-interest (roi) properties on the fly, update the annotation, or change video effects and colour balance, for example.
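A quick way to explore the new properties and try the dynamic zoom is via gst-launch-1.0. This is a rough sketch only; the property names below are from memory, so check gst-inspect-1.0 rpicamsrc for the authoritative list:

# List the element's properties first:
gst-inspect-1.0 rpicamsrc

# Record a "digitally zoomed" clip by cropping to the centre quarter of the
# sensor via the roi properties (names assumed; adjust per gst-inspect-1.0):
gst-launch-1.0 -e rpicamsrc bitrate=2000000 roi-x=0.25 roi-y=0.25 roi-w=0.5 roi-h=0.5 \
    ! video/x-h264,width=1280,height=720,framerate=30/1 \
    ! h264parse ! matroskamux ! filesink location=zoomed.mkv

In an application you would set the same properties on the element at runtime (for example with g_object_set) to zoom or change annotations while recording.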

The timestamps produced are now based on the internal STC of the Raspberry Pi, so the audio video sync is tighter. Although it was never terrible, it’s now more correct and slightly less jittery.

The one major feature I haven’t enabled as yet is stereoscopic handling. Stereoscopic capture requires 2 cameras attached to a Raspberry Pi Compute Module, so at the moment I have no way to test it works.

I’m also working on GStreamer stereoscopic handling in general (which is coming along). I look forward to releasing some of that code soon.

 

Sahana Nepal Earthquake SitRep 1

As you are probably aware, a 7.8 magnitude earthquake struck Nepal on 25th April, causing 2,288 deaths and injuring over 5,500 people [1]. Sahana is already being used in Nepal by both the Nepal Red Cross Society and the National Emergency Operation Center [Read the Rest...]

April 26, 2015

Anti-Systemd People

For the Technical People

This post isn’t really about technology. I’ll cover the technology briefly; skip to the next section if you aren’t interested in Linux programming or system administration.

I’ve been using the Systemd init system for a long time; I first tested it in 2010 [1]. I use Systemd on most of my systems that run Debian/Wheezy (which means most of the Linux systems I run which aren’t embedded systems). Currently the only systems where I’m not running Systemd are some systems on which I don’t have console access; while Systemd works reasonably well, it wasn’t a standard init system for Debian/Wheezy, so I don’t run it everywhere. That said, I haven’t had any problems with Systemd in Wheezy, so I might have been too paranoid.

I recently wrote a blog post about systemd, just some basic information on how to use it and why it’s not a big deal [2]. I’ve been playing with Systemd for almost 5 years and using it in production for almost 2 years and it’s performed well. The most serious bug I’ve found in systemd is Bug #774153 which causes a Wheezy->Jessie upgrade to hang until you run “systemctl daemon-reexec” [3].

I know that some people have had problems with systemd, but any piece of significant software will cause problems for some people; there are bugs in all software that is complex enough to be useful. However, the fact that it has worked so well for me on so many systems suggests that it’s not going to cause huge problems, and it should be covered by the routine testing that is needed for a significant deployment of any new version of a distribution.

I’ve been using Debian for a long time. The transitions from libc4 to libc5 and then libc6 were complex but didn’t break much. The use of devfs in Debian caused some issues and then the removal of devfs caused other issues. The introduction of udev probably caused problems for some people too. Doing major updates to Debian systems isn’t something that is new or which will necessarily cause significant problems. I don’t think that the change to systemd by default compares to changing from a.out binaries to ELF binaries (which required replacing all shared objects and executables).

The Social Issue of the Default Init

Recently the Debian technical committee determined that Systemd was the best choice for the default init system in Debian/Jessie (the next release of Debian which will come out soon). Decisions about which programs should be in the default install are made periodically and it’s usually not a big deal. Even when the choice is between options that directly involve the user (such as the KDE and GNOME desktop environments) it’s not really a big deal because you can just install a non-default option.

One of the strengths of Debian has always been the fact that any Debian Developer (DD) can just add any new package to the archive if they maintain it to a suitable technical standard and if copyright and all other relevant laws are respected. Any DD who doesn’t like any of the current init systems can just package a new one and upload it. Obviously the default option will get more testing, so the non-default options will need more testing by the maintainer. This is particularly difficult for programs that have significant interaction with other parts of the system; I’ve had difficulties with this over the course of 14 years of SE Linux development, but I’ve also found that it’s not an impossible problem to solve.

It’s generally accepted that making demands of other people’s volunteer work is a bad thing, which to some extent is a reasonable position. There is a problem when this is taken to extremes: Debian has over 1000 developers who have to work together, so sometimes it’s a question of who gets to do the extra work to make the parts of the distribution fit together. The issue of who gets to do the work is often based on what parts are the defaults or most commonly used options. For my work on SE Linux I often have to do a lot of extra work because it’s not part of the default install, and I have to make my requests for changes to other packages as small and simple as possible.

So part of the decision to make Systemd be the default init is essentially a decision to impose slightly more development effort on the people who maintain SysVInit if they are to provide the same level of support – of course given the lack of overall development on SysVInit the level of support provided may decrease. It also means slightly less development effort for the people who maintain Systemd as developers of daemon packages MUST make them work with it. Another part of this issue is the fact that DDs who maintain daemon packages need to maintain init.d scripts (for SysVInit) and systemd scripts, presumably most DDs will have a preference for one init system and do less testing for the other one. Therefore the choice of systemd as the default means that slightly less developer effort will go into init.d scripts. On average this will slightly increase the amount of sysadmin effort that will be required to run systems with SysVInit as the scripts will on average be less well tested. This isn’t going to be a problem in the short term as the current scripts are working reasonably well, but over the course of years bugs may creep in and a proposed solution to this is to have SysVInit scripts generated from systemd config files.

We did have a long debate within Debian about the issue of default init systems and many Debian Developers disagree about this. But there is a big difference between volunteers debating about their work and external people who don’t contribute but believe that they are entitled to tell us what to do. Especially when the non-contributors abuse the people who do the work.

The Crowd Reaction

In a world filled with reasonable people who aren’t assholes there wouldn’t be any more reaction to this than there has been to decisions such as which desktop environment should be the default (which has caused some debate but nothing serious). The issue of which desktop environment (or which version of a desktop environment) to support has a significant effect on users that can’t be avoided; I could understand people being a little upset about that. But the init system isn’t something that most users will notice – apart from the boot time.

For some reason the men in the Linux community who hate women the most seem to have taken a dislike to systemd. I understand that being “conservative” might mean not wanting changes to software as well as not wanting changes to inequality in society but even so this surprised me. My last blog post about systemd has probably set a personal record for the amount of misogynistic and homophobic abuse I received in the comments. More gender and sexuality related abuse than I usually receive when posting about the issues of gender and sexuality in the context of the FOSS community! For the record this doesn’t bother me, when I get such abuse I’m just going to write more about the topic in question.

While the issue of which init system to use by default in Debian was being discussed we had a lot of hostility from unimportant people who for some reason thought that they might get their way by being abusive and threatening people. As expected that didn’t give the result they desired, but it did result in a small trend towards people who are less concerned about the reactions of users taking on development work related to init systems.

The next thing that they did was to announce a “fork” of Debian. Forking software means maintaining a separate version due to a serious disagreement about how it should be maintained. Doing that requires a significant amount of work in compiling all the source code and testing the results. The sensible option would be to just maintain a separate repository of modified packages, as has been done many times before. One of the most well known repositories was the Debian Multimedia repository; it was controversial due to flouting legal issues (the developer produced code that was legal where they lived) and due to confusion among users. But it demonstrated that you can make a repository containing many modified packages. In my work on SE Linux I’ve always had a repository of packages containing changes that haven’t been accepted into Debian, which included changes to SysVInit in about 2001.

The latest news on the fork-Debian front seems to be the call for donations [4]. Apparently most of the money that was spent went to accounting fees and buying a laptop for a developer. The amount of money involved is fairly small; Forbes has an article about how awful people can use “controversy” to get crowd-funding windfalls [5].

MikeeUSA is an evil person who hates systemd [6]. This isn’t any sort of evidence that systemd is great (I’m sure that evil people make reasonable choices about software on occasion). But it is a significant factor in support for non-systemd variants of Debian (and other Linux distributions). Decent people don’t want to be associated with people like MikeeUSA, the fact that the anti-systemd people seem happy to associate with him isn’t going to help their cause.

Conclusion

Forking Debian is not the correct technical solution to any problem you might have with a few packages. Filing bug reports and possibly forking those packages in an external repository is the right thing to do.

Sending homophobic and sexist abuse is going to make you as popular as the GamerGate and GodHatesAmerica.com people. It’s not going to convince anyone to change their mind about technical decisions.

Abusing volunteers who might consider donating some of their time to projects that you like is generally a bad idea. If you abuse them enough you might get them to volunteer less of their time, but the most likely result is that they just don’t volunteer on anything associated with you.

Abusing people who write technical blog posts isn’t going to convince them that they made an error. Abuse is evidence of the absence of technical errors.

April 24, 2015

rkt 0.5.4, featuring repository authentication, port forwarding and more

Since the last rkt release a few weeks ago, development has continued apace, and today we're happy to announce rkt v0.5.4. This release includes a number of new features and improvements across the board, including authentication for image fetching, per-application arguments, running from pod manifests, and port forwarding support – check below the break for more details.

rkt, a container runtime for application containers, is under heavy development but making rapid progress towards a 1.0 release. Earlier this week, VMware announced support for rkt and the emerging App Container (appc) specification. appc is an open specification defining how applications can be run in containers, and rkt is the first implementation of the spec. With increasing industry commitment and involvement in appc, it is quickly fulfilling its promise of becoming a standard of how applications should be deployed in containers.

VMware released a short demo about how its new Project Photon works with rkt via Vagrant and VMware Fusion.

Read on below for more about the latest features in rkt 0.5.4.

Authentication for image fetching

rkt now supports HTTP Basic and OAuth Bearer Token authentication when retrieving remote images from HTTP endpoints and Docker registries. To facilitate this, we've introduced a flexible configuration system, allowing vendors to ship default configurations and then systems administrators to supplement or override configuration locally. Configuration is fully versioned to support forwards and backwards compatibility – check out the rkt documentation for more details.

Here's a simple example of fetching an image from a private Docker registry (note that Docker registries support only Basic authentication):

$ sudo cat /etc/rkt/auth.d/myuser.json 
{
    "rktKind": "dockerAuth",
    "rktVersion": "v1",
    "registries": ["quay.io"],
    "credentials": {
        "user": "myuser",
        "password": "sekr3tstuff"
    }
}
$ sudo ./rkt --insecure-skip-verify fetch docker://quay.io/myuser/privateapp
rkt: fetching image from docker://quay.io/myuser/privateapp
Downloading layer: cf2616975b4a3cba083ca99bc3f0bf25f5f528c3c52be1596b30f60b0b1c37ff
Downloading layer: 6ce2e90b0bc7224de3db1f0d646fe8e2c4dd37f1793928287f6074bc451a57ea
....

Per-application arguments and image signature verification for local images

The flag parsing in rkt run has been reworked to support per-app flags when running a pod with multiple images. Furthermore, in keeping with our philosophy of "secure by default", rkt will now attempt signature verification even when referencing local image files (during rkt fetch or rkt run commands). In this case, rkt expects to find the signature file alongside the referenced image – for example:

 $ rkt run imgs/pauser.aci
     error opening signature file: open /home/coreos/rkt/imgs/pauser.aci.asc: no such file or directory
 $ gpg2 --armor --detach-sign imgs/pauser.aci
 $ rkt run imgs/pauser.aci
     rkt: signature verified:
       Irma Bot (ACI Signing Key)
     ^]^]^]Container rootfs terminated by signal KILL.

Specific signatures can be provided with the --signature flag, which also applies per-app in the case of multiple references. In this example, we import two local images into the rkt CAS, specifying image signatures for both:

     $ rkt fetch   \
        imgs/pauser.aci --signature ./present.asc  \
        imgs/bash.aci --signature foo.asc
      rkt: signature verified:
        Joe Packager (CoreOS)
sha512-b680fd853abeba1a310a344e9fbf8ac9
sha512-ae78000a3d38fae4009699bf7494b293

Running from pod manifests

In previous versions of rkt, the arguments passed to rkt run (or rkt prepare) would be used to internally generate a Pod Manifest which is executed by later stages of rkt. This release introduces a new flag, --pod-manifest, to both rkt prepare and rkt run, to supply a pre-created pod manifest to rkt.

A pod manifest completely defines the execution environment of the pod to be run, such as volume mounts, port mappings, isolators, etc. This allows users complete control over all of these parameters in a well-defined way, without the need of a complicated rkt command-line invocation. For example, when integrating rkt as a container runtime for a cluster orchestration system like Kubernetes, the system can now programmatically generate a pod manifest instead of feeding a complicated series of arguments to the rkt CLI.

In this first implementation — and following the prescriptions of the upstream appc spec — the pod manifest is treated as the definitive record of the desired execution state: anything specified in the app fields will override what is in the original image, such as exec parameters, volumes mounts, port mappings, etc. This allows the operator to completely control what will be executed by rkt. Since the pod manifest is treated as a complete source of truth — and expected to be generated by orchestration tools with complete knowledge of the execution environment – --pod-manifest is initially considered mutually exclusive with other flags, such as --volumes and --port. See rkt run --help for more details.
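As a minimal sketch of the new workflow (the manifest file name is arbitrary, and the manifest itself would normally be generated by your orchestration tooling rather than written by hand):

# pod.json is an appc pod manifest describing the apps, volumes and ports to run
sudo rkt prepare --pod-manifest=./pod.json
sudo rkt run --pod-manifest=./pod.json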

Port forwarding

rkt now supports forwarding ports from the host to pods when using private networking.

As a simple example, given an app with the following ports entry in its Image Manifest:

{
    "name": "http",
    "port": 80,
    "protocol": "tcp"
}

the following rkt run command can be used to forward traffic from the host's TCP port 8888 to port 80 inside the pod:

rkt run --private-net --port=http:8888 myapp.aci

Whenever possible, it is more convenient to use an SDN solution like flannel to assign routable IPs to rkt pods. However, when such an option is not available, or for "edge" apps that require straddling both SDN and external networks (such as a load balancer), port forwarding can be used to expose select ports to the pod.
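As a quick sanity check, with the pod from the example above running under --private-net, the app should be reachable through the host's forwarded port:

curl http://127.0.0.1:8888/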

Testing, forward-compatibility, and more

There's plenty more under the hood in this release, including an extensive functional test harness, a new database schema migration process, and various internal improvements to the codebase. As we've talked about previously, rkt is a young project and we aren't yet able to guarantee API/ABI stability between releases, but forward-compatibility is a top priority for the forthcoming 0.6 release, and these changes are important steps towards this goal.

For full details of all the changes in this release, check out the release on GitHub.

Get involved!

We're on a journey to create an efficient, secure and composable application container runtime for production environments, and we want you to join us. Take part in the discussion through the rkt-dev mailing list or GitHub issues — and for those eager to get stuck in, contribute directly to the project. Are you doing interesting things with rkt or appc and want to share it with the world? Contact our marketing team at press@coreos.com.

CAP on a Map project kickoff in the Maldives

A workshop and set of meetings (April 15 & 16, 2015) took place in the capital city Malé in the Maldives as part of the CAP on a Map project kickoff. The project aims to improve [Read the Rest...]

April 23, 2015

Verification Challenge 5: Uses of RCU

This is another self-directed verification challenge, this time to validate uses of RCU instead of validating the RCU implementations as in earlier posts. As you can see from Verification Challenge 4, the logic expression corresponding even to the simplest Linux-kernel RCU implementation is quite large, weighing in at tens of thousands of variables and hundreds of thousands of clauses. It is therefore worthwhile to look into the possibility of a trivial model of RCU that could be used for verification.



Because logic expressions do not care about cache locality, memory contention, energy efficiency, CPU hotplug, and a host of other complications that a Linux-kernel implementation must deal with, we can start with extreme simplicity. For example:



static int rcu_read_nesting_global;

static void rcu_read_lock(void)
{
  (void)__sync_fetch_and_add(&rcu_read_nesting_global, 2);
}

static void rcu_read_unlock(void)
{
  (void)__sync_fetch_and_add(&rcu_read_nesting_global, -2);
}

static inline void assert_no_rcu_read_lock(void)
{
  BUG_ON(rcu_read_nesting_global >= 2);
}

static void synchronize_rcu(void)
{
  if (__sync_fetch_and_xor(&rcu_read_nesting_global, 1) < 2)
    return;
  SET_NOASSERT();
  return;
}




The idea is to reject any execution in which synchronize_rcu() does not wait for all readers to be done. As before, SET_NOASSERT() sets a variable that suppresses all future assertions.



Please note that this model of RCU has some shortcomings:





  1. There is no diagnosis of rcu_read_lock()/rcu_read_unlock() misnesting. (A later version of the model provides limited diagnosis, but under #ifdef CBMC_PROVE_RCU.)

  2. The heavyweight operations in rcu_read_lock() and rcu_read_unlock() result in artificial ordering constraints. Even in TSO systems such as x86 or s390, a store in a prior RCU read-side critical section might be reordered with loads in later critical sections, but this model will act as if such reordering was prohibited.

  3. Although synchronize_rcu() is permitted to complete once all pre-existing readers are done, in this model it will instead wait until a point in time at which there are absolutely no readers, whether pre-existing or new. Therefore, this model's idea of an RCU grace period is even heavier weight than in real life.





Nevertheless, this approach will allow us to find at least some RCU-usage bugs, and it fits in well with cbmc's default fully-ordered settings. For example, we can use it to verify a variant of the simple litmus test used previously:



int r_x;
int r_y;

int x;
int y;

void *thread_reader(void *arg)
{
  rcu_read_lock();
  r_x = x;
#ifdef FORCE_FAILURE_READER
  rcu_read_unlock();
  rcu_read_lock();
#endif
  r_y = y;
  rcu_read_unlock();
  return NULL;
}

void *thread_update(void *arg)
{
  x = 1;
#ifndef FORCE_FAILURE_GP
  synchronize_rcu();
#endif
  y = 1;
  return NULL;
}

int main(int argc, char *argv[])
{
  pthread_t tr;

  if (pthread_create(&tr, NULL, thread_reader, NULL))
    abort();
  (void)thread_update(NULL);
  if (pthread_join(tr, NULL))
    abort();

  BUG_ON(r_y != 0 && r_x != 1);
  return 0;
}




This model has only 3,032 variables and 8,844 clauses, more than an order of magnitude smaller than for the Tiny RCU verification. Verification takes about half a second, which is almost two orders of magnitude faster than the 30-second verification time for Tiny RCU. In addition, the model successfully flags several injected errors. We have therefore succeeded in producing a simpler and faster model approximating RCU, and that can handle multi-threaded litmus tests.
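For readers who want to reproduce this, here is a sketch of how the litmus test might be driven through cbmc; the source file name is hypothetical, and the -D macros simply toggle the #ifdef blocks in the listing above:

cbmc rcu_litmus.c                          # expect: VERIFICATION SUCCESSFUL
cbmc -DFORCE_FAILURE_READER rcu_litmus.c   # expect: VERIFICATION FAILED
cbmc -DFORCE_FAILURE_GP rcu_litmus.c       # expect: VERIFICATION FAILED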



A natural next step would be to move to litmus tests involving linked lists. Unfortunately, there appear to be problems with cbmc's handling of pointers in multithreaded situations. On the other hand, cbmc's multithreaded support is quite new, so hopefully there will be fixes for these problems in the near future. After fixes appear, I will give the linked-list litmus tests another try.



In the meantime, the full source code for these models may be found here.

Dockerising Puppet

Learn how to use Puppet to manage Docker containers. This post contains complementary technical details to the talk on the 23rd of April at the Puppet Camp in Sydney.

Manageacloud is a company that specialises in cloud automation. We are about to launch our next product, the multi-cloud orchestration platform. Please contact us if you want to know more.

 

Summary

The goal is to manage the configuration of Docker containers using existing puppet modules and Puppet Enterprise. We will use the example of a Wordpress application and two different approaches:

  • Fat containers: treating the container as a virtual machine
  • Microservices: one process per container, as originally recommended by Docker

 

Docker Workflow

 

 

1 - Dockerfile

Dockerfile is the "source code" of the container image:

  • It uses imperative programming, which means we need to specify every command, tailored to the target distribution, to achieve the desired state.
  • It is very similar to bash; if you know bash, you know how to use a Dockerfile.
  • In large and complex architectures, the goal of the Dockerfile is to hook in a configuration management system like puppet to install the required software and configure the container.

For example, this is a Dockerfile that will create a container image with Apache2 installed in Ubuntu:

FROM ubuntu
MAINTAINER Ruben Rubio Rey <ruben@manageacloud.com>
RUN apt-get update
RUN apt-get install -y apache2

 

2 - Container Image

The container image is generated from the Dockerfile using docker build:

docker build -t <image_name> <directory_path_to_Dockerfile>
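For instance, building and smoke-testing the Apache image from the Dockerfile above might look like this (the tag name and host port are arbitrary examples):

docker build -t apache2-demo .
docker run -d -p 8080:80 apache2-demo /usr/sbin/apache2ctl -D FOREGROUND
curl -I http://localhost:8080/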

 

3 - Registry

An analogy for the Registry is that it works like a git repository. It allows you to push and pull container images. Container images can have different versions.

The Registry is the central point to distribute Docker containers. It does not matter if you use Kubernetes, CoreOS Fleet, Docker Swarm, Mesos or you are just orchestrating in a Docker host.

For example, if you are the DevOps person within your organization, you may decide that the developers (who are already developing under Linux) will use containers instead of virtual machines for the development environment. The DevOps person is responsible for creating the Dockerfile, building the container image and pushing it to the registry. All developers within your organization can then pull the latest version of the development environment from the registry and use it.
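The developer-side workflow is then just a pull and a run. A sketch, with a hypothetical registry address and image name:

docker pull registry.example.com:5000/devenv:latest
docker run -it registry.example.com:5000/devenv:latest /bin/bash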

 

4 - Development Environment

Docker containers can be used in a development environment. You can make developers more comfortable with the transition to containers by using the controversial "Fat Containers" approach.

 

5 - Production Environment

You can orchestrate Docker containers in production for two different purposes:

  • Docker Host: Using containers as a way to distribute the configuration. This post focuses on using containers in Docker Hosts.
  • Cluster Management: Mesos, Kubernetes, Docker Swarm and CoreOS Fleet are used to manage containerised applications in clustered environments. This aims to create a layer on top of the different available virtual machines, allowing you to manage all resources as one unified whole. These technologies are very likely to evolve significantly over the next 12 months.

 

Fat Containers vs Microservices

When you are creating containers, there are two different approaches:

  • Microservices: running one single process per container.
  • Fat containers: running many processes and services in a container. In fact, you are treating the container as a virtual machine.

The problem with the microservices approach is that Linux is not really designed for microservices. If you have several processes running in a container, and one of those processes becomes detached from its parent, it is the responsibility of the init process to recycle those resources. If those resources are not recycled, the process becomes a zombie.

Some Linux applications are not designed for single process systems either:

  • Many Linux applications are designed to have a crontab daemon to run periodical tasks.
  • Many Linux applications write vital information directly to the syslog. If the syslog daemon is not running, you might never notice those messages.

In order to use multiple processes in a container, you need to use an init process or similar. There are base images with init processes built in, for example for Ubuntu and Debian.

What to use? My advice is to be pragmatic; no one size fits all. Your goal is to solve business problems without creating technical debt. If fat containers better suit your business needs, use them. However, if microservices fit better, use that approach instead. Ideally, you should know how to use both, and analyse the case in point to decide what is best for your company. There are no technical reasons to use one over the other.

 

 

Managing Docker Containers with Puppet

When we use Puppet (or any other configuration management system) to manage Docker containers, there are two sets of tasks: container creation and container orchestration.

 

Container Creation

  1. The Dockerfile installs the puppet client and invokes the puppet master to retrieve the container's configuration
  2. The new image is pushed to the registry

 

Container Orchestration

  1. The Docker host's puppet agent invokes the puppet master to get the configuration
  2. The puppet agent identifies a set of containers. Those containers must be pulled from the Docker registry
  3. The puppet agent pulls, configures and starts the Docker containers in the Docker host

 

Puppet Master Configuration

For this configuration, we are assuming that Puppet Master is running in a private network, where all the clients are secure. This allows us to use the configuration setting autosign = true in the master's puppet.conf.
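A minimal sketch of that setting, assuming a Puppet Enterprise layout on the master (restart the master service afterwards, and only do this on a trusted private network):

sudo puppet config set autosign true --section master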

 

Docker Registry

The Docker registry is like a "git repository" for containers. You can push and pull container images, and images can have different versions. You can use a provider for the Docker registry or you can install one yourself. For this example we will use the garethr/docker module from the Puppet Forge to create our docker-registry puppet manifest:

class docker-registry {

    include 'docker'

    docker::run { 'local-registry':
        # Name of the image in Docker Hub
        image           => 'registry',
        # We are mapping a port from the Docker host to the container.
        # If you don't do that you cannot access
        # the services available in the container.
        ports           => ['5000:5000'],
        # We send the configuration parameters that are required to configure an insecure version of a local registry
        env             => ['SETTINGS_FLAVOR=dev', 'STORAGE_PATH=/var/docker-registry/local-registry'],
        # Containers are stateless. If you modify the filesystem
        # you are creating a new container.
        # If we want to push container images, we need a
        # persistent layer somewhere.
        # For this case, in order to have a persistent layer,
        # we are mapping a folder on the host to a folder in the container.
        volumes         => ['/var/docker-registry:/var/docker-registry'],
    }
}

Please note that this installs an insecure Docker registry for testing purposes only.

 

Fat Containers Approach

For this example, I am using a fat container, as I am targeting the development environment for the developers within my organization. Fat containers work very much like virtual machines, so the learning curve will be close to zero. If the developers are already using Linux, containers will remove the overhead of the hypervisor and their computers will immediately be faster.

This fat container will contain the following services:

  • Provided by the base image:
    • init
    • syslog
    • crontab
    • ssh
  • Provided by Puppet:
    • mysql
    • apache2 (along with Wordpress codebase)

The following Dockerfile creates the Wordpress fat container image. This is its content:

FROM phusion/baseimage
MAINTAINER Ruben Rubio Rey "ruben.rubio@manageacloud.com"

# Activate AU mirrors
COPY files/sources.list.au /etc/apt/sources.list

# Install puppet client using Puppet Enterprise
RUN curl -k https://puppet.manageacloud.com.au:8140/packages/current/install.bash | bash

# Configure puppet client (just removed the last line for the "certname")
COPY files/puppet.conf /etc/puppetlabs/puppet/puppet.conf

# Apply puppet changes. Note the certname: we are using "wordpress-image-"
# and three random characters.
#  - "wordpress-image-" allows Puppet Enterprise
#    to identify which classes must be applied
#  - The three random characters are used to
#    avoid conflict with the node certificates
RUN puppet agent --debug --verbose --no-daemonize --onetime --certname wordpress-image-`date +%s | sha256sum | head -c 3; echo `

# Enable SSH - as this is meant to be a development environment,
# SSH might be useful to the developer.
# This is needed for phusion/baseimage only.
RUN rm -f /etc/service/sshd/down

# Change root password - even if we use key authentication,
# knowing the root's password is useful for developers.
RUN echo "root:mypassword" | chpasswd

# We enable the services that puppet is installing
COPY files/init /etc/my_init.d/10_init_services
RUN chmod +x /etc/my_init.d/10_init_services

When we are building the Docker container, it will request the configuration from the Puppet Master using the certname "wordpress-image-XXX", where XXX is three random characters.

Puppet master returns the following manifest:

class wordpress-all-in-one {

  # Problems using the official mysql module from Puppet Forge:
  # if you try to install mysql using package {"mysql": ensure => installed }
  # it crashes. It tries to do something with the init process,
  # and this container does not have a
  # fully featured init process. "mysql-noinit" installs
  # mysql without any init dependency.
  # Note that although we cannot use the mysql Puppet Forge
  # module to install the software, we can use
  # its types to create the database, create the user
  # and grant permissions.
  include "mysql-noinit"

  # Fix unsatisfied requirements in the Wordpress class.
  # The hunner/wordpress module assumes that
  # wget is installed in the system. However,
  # containers by default have minimal software
  # installed.
  package {"wget": ensure => latest}

  # hunner/wordpress,
  # removing any task related to
  # the database (it would crash when
  # checking if the mysql package is installed)
  class { 'wordpress':
    install_dir    => '/var/www/wordpress',
    db_user        => 'wp_user',
    db_password    => 'password',
    create_db      => false,
    create_db_user => false
  }

  # Ad-hoc apache configuration:
  # installs apache, php and adds the
  # wordpress.conf virtual host
  include "apache-wordpress"
}

Build the container image:

docker build -t puppet_wordpress_all_in_one /path/to/Dockerfile_folder/



Push the image to the registry

docker tag puppet_wordpress_all_in_one registry.manageacloud.com.au:5000/puppet_wordpress_all_in_one
docker push registry.manageacloud.com.au:5000/puppet_wordpress_all_in_one

Orchestrate the container

To orchestrate the fat container in a Docker host:

class container-wordpress-all-in-one {

    class { 'docker':
        extra_parameters => ['--insecure-registry registry.manageacloud.com.au:5000']
    }

    docker::run { 'wordpress-all-in-one':
        # image is fetched from the Registry
        image => 'registry.manageacloud.com.au:5000/puppet_wordpress_all_in_one',
        # The fat container is mapping the port 80 from the docker host to
        # the container's port 80
        ports => ['80:80'],
    }
}

Microservices Approach

Now we are going to reuse as much of the existing code as possible, using the microservices approach. For this approach we will have two containers: a DB container running MySQL and a WEB container running Apache2.

 

1 - MySQL (DB) Microservice Container

As usual, we use the Dockerfile to build the Docker image.

The Dockerfiles are very similar; I will highlight the changes.

# This time we are using the Docker Official image Ubuntu (no init process)
FROM ubuntu
MAINTAINER Ruben Rubio Rey "ruben.rubio@manageacloud.com"

# Activate AU mirrors
COPY files/sources.list.au /etc/apt/sources.list

# This base image does not have curl installed
RUN apt-get update && apt-get install -y curl

# Install puppet client
RUN curl -k https://puppet.manageacloud.com.au:8140/packages/current/install.bash | bash

# Configure puppet client
COPY files/puppet.conf /etc/puppetlabs/puppet/puppet.conf

# Apply puppet changes. We change the certname
# so Puppet Master knows what configuration to retrieve.
RUN puppet agent --debug --verbose --no-daemonize --onetime --certname ms-mysql-image-`date +%s | sha256sum | head -c 3; echo `

# Expose MySQL to the Docker network.
# We are notifying the Docker network that there is a container
# that has a service and other containers might need it.
EXPOSE 3306

The class returned by Puppet Master is wordpress-mysql-ms. You will notice that this class is exactly the same as the fat container's class, but anything that is not related to the database is commented out.

class wordpress-mysql-ms {

    # Install MySQL
    include "mysql-noinit"

    # Unsatisfied requirements in wordpress class
    # package {"wget": ensure => latest}

    # Puppet forge wordpress class, removing mysql
    # class { 'wordpress':
    #   install_dir => '/var/www/wordpress',
    #   db_user => 'wp_user',
    #   db_password => 'password',
    #}

    # Apache configuration not needed
    # include "apache-wordpress"
}

Build the container

docker build -t puppet_ms_mysql .

Push the container to the registry

docker tag puppet_ms_mysql registry.manageacloud.com.au:5000/puppet_ms_mysql
sudo docker push registry.manageacloud.com.au:5000/puppet_ms_mysql

 

2 - Apache (WEB) Microservice Container

Once more, we use a Dockerfile to build the image. The file is exactly the same as the MySQL one, except for the certname and a few lines at the end.

FROM ubuntu
MAINTAINER Ruben Rubio Rey "ruben.rubio@manageacloud.com"

# Activate AU mirrors
COPY files/sources.list.au /etc/apt/sources.list

# Install CURL
RUN apt-get update && apt-get install -y curl

# Install puppet client
RUN curl -k https://puppet.manageacloud.com.au:8140/packages/current/install.bash | bash

# Configure puppet client
COPY files/puppet.conf /etc/puppetlabs/puppet/puppet.conf

# Apply puppet changes
RUN puppet agent --debug --verbose --no-daemonize --onetime --certname ms-apache-image-`date +%s | sha256sum | head -c 3; echo `

# Apply patch to link containers.
# We have to tell Wordpress where the
# mysql service is running,
# using a system environment variable
# (explanation in the next section).
# If we were using Puppet for microservices
# we would update the Wordpress module
# to set this environment variable.
# In this case, I am exposing the changes so
# it is easier to see what is changing.
RUN apt-get install patch -y
COPY files/wp-config.patch /var/www/wordpress/wp-config.patch
RUN cd /var/www/wordpress && patch wp-config.php < wp-config.patch

# We configure PHP to read system environment variables
COPY files/90-env.ini /etc/php5/apache2/conf.d/90-env.ini

The class returned by Puppet Master is wordpress-apache-ms. You will notice that it is very similar to wordpress-mysql-ms and to the wordpress-all-in-one class used by the fat container. The difference is that everything related to mysql is commented out and everything related to wordpress and apache is executed.

class wordpress-apache-ms {

    # MySQL won't be installed here
    # include "mysql-noinit"

    # Unsatisfied requirements in wordpress class
    package {"wget": ensure => latest}

    # Puppet forge wordpress class, removing mysql
    class { 'wordpress':
        install_dir    => '/var/www/wordpress',
        db_user        => 'wp_user',
        db_password    => 'password',
        create_db      => false,
        create_db_user => false
    }

    # Ad-hoc apache configuration
    include "apache-wordpress"
}

 

3 - Orchestrating Web and DB Microservice

The Puppet class that orchestrates both microservices is called container-wordpress-ms:

class container-wordpress-ms {

    # Make sure that Docker is installed
    # and that it can get images from our insecure registry
    class { 'docker':
        extra_parameters => ['--insecure-registry registry.manageacloud.com.au:5000']
    }

    # Container DB will run MySQL
    docker::run { 'db':
        # The image is taken from the registry
        image    => 'registry.manageacloud.com.au:5000/puppet_ms_mysql',
        command  => '/usr/sbin/mysqld --bind-address=0.0.0.0',
        use_name => true
    }

    # Container WEB will run Apache
    docker::run { 'web':
        # The image is taken from the registry
        image    => 'registry.manageacloud.com.au:5000/puppet_ms_apache',
        command  => '/usr/sbin/apache2ctl -D FOREGROUND',
        # We are mapping a port between the Docker host and the Apache container.
        ports    => ['80:80'],
        # We link the WEB container to the DB container. This allows WEB to access the
        # services exposed by the DB container (in this case 3306).
        links    => ['db:db'],
        use_name => true,
        # We need the DB container up and running before running WEB.
        depends  => ['db'],
    }
}

 

APPENDIX I: Linking containers

When we are linking containers in the microservices approach, we are performing the following tasks:

 

Starting "db" container:

This will start puppet_ms_mysql as a container named db. Please note that puppet_ms_mysql exposes port 3306, which tells Docker that this container has a service that might be useful to other containers.

docker run --name db -d puppet_ms_mysql /usr/sbin/mysqld --bind-address=0.0.0.0

 

Starting "web" container

Now we want to start the container puppet_ms_apache, named web.

If we link the containers and execute the command env, the following environment variables are created in the web container:

docker run --name web -p 1800:80 --link db:db puppet_ms_apache env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=8d48e28094e3
DB_PORT=tcp://172.17.0.2:3306
DB_PORT_3306_TCP=tcp://172.17.0.2:3306
DB_PORT_3306_TCP_ADDR=172.17.0.2
DB_PORT_3306_TCP_PORT=3306
DB_PORT_3306_TCP_PROTO=tcp
DB_NAME=/web/db
HOME=/root

These variables point out where the mysql database is. Thus, the application should use the environment variable DB_PORT_3306_TCP_ADDR to connect to the database.

  • DB is the name of the container we are linking to
  • 3306 is the port exposed in the Dockerfile of the db container
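A quick way to confirm the link from the Docker host is to read the injected variables inside the running web container (a small sketch; docker exec requires a reasonably recent Docker):

docker exec web sh -c 'echo "$DB_PORT_3306_TCP_ADDR:$DB_PORT_3306_TCP_PORT"'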

 

APPENDIX II: Docker Compose

When working with microservices, you want to avoid long commands. Docker Compose makes the management of long Docker commands a lot easier. For example, this is how the Microservices approach would look with Docker Compose:

file docker-compose.yml

web:
  image: puppet_ms_apache
  command: /usr/sbin/apache2ctl -D FOREGROUND
  links:
   - db:db
  ports:
   - "80:80"

db:
  image: puppet_ms_mysql
  command: /usr/sbin/mysqld --bind-address=0.0.0.0

 

and you can start both containers with the command docker-compose up.

April 20, 2015

VMware Ships rkt and Supports App Container Spec

Today VMware shipped rkt, the application container runtime, and made it available to VMware customers in Project Photon. VMware also announced their support of the App Container spec, of which rkt is the first implementation.

“VMware is happy to provide rkt to offer our customers application container choice. rkt is the first implementation of the App Container spec (appc), and we look forward to contributing to the appc community to advance security and portability between platforms.”

— Kit Colbert, vice president and CTO, Cloud-Native Apps, VMware

We are thrilled to welcome VMware into the appc and rkt communities. The appc specification was created to establish an industry standard for how applications should be deployed in containers, with a focus on portability, composability, and security. rkt is a project originated by CoreOS to provide a production-ready Linux implementation of the specification.

VMware's extensive experience with running applications at scale in enterprise environments will be incredibly valuable as we work together with the community towards a 1.0 release of the appc specification and the rkt project.

Join us on our mission to create a secure, composable and standards-based container runtime. We welcome your involvement and contributions to rkt and appc:

April 16, 2015

etcd 2.0 in CoreOS Alpha Image

Today we are pleased to announce that the first CoreOS image to include an etcd v2.0 release is now available in the CoreOS alpha channel. etcd v2.0 marks a milestone in the evolution of etcd and brings many new features and improvements over etcd 0.4, including:

  • Reconfiguration protocol improvements: guards against accidental misconfiguration
  • New raft implementation: provides improved cluster stability
  • On-disk safety improvements: utilizes CRC checksums and append-only log behavior

etcd is an open source, distributed, consistent key-value store. It is a core component of CoreOS software that facilitates safe automatic updates, coordinates work scheduled to hosts, and sets up overlay networking for containers. Check out the etcd v2.0 announcement for more details on etcd and the new features.

We’ve been using etcd v2.0 in production behind discovery.etcd.io and quay.io for a few months now and it has proven to be stable in these use cases. All existing applications that use the etcd API should work against this new version of etcd. We have tested etcd v2.0 with applications like fleet, locksmith and flannel. The user facing API to etcd should provide the same features it had in the past; if you find issues please report them on GitHub.

Setup Using cloud-init

If you want to dive right in and try out bootstrapping a new cluster, the cloud-init docs have full details on all of the parameters. To support the new features of etcd v2.0, such as multiple listen addresses and proxy modes, a new cloud-init section named etcd2 is used. With a few lines of configuration and a new discovery token, you can take etcd v2.0 for a spin on your cluster.
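For example, a fresh discovery token for a three-node cluster can be requested from the public discovery service and then dropped into the etcd2 section of your cloud-config:

curl -w "\n" 'https://discovery.etcd.io/new?size=3'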

IANA Ports

With the release of etcd2, we’ve taken the opportunity to begin the transition to our IANA-assigned port numbers: 2379 and 2380. For backward compatibility, etcd2 is configured to listen on both the new and old port numbers (4001 and 7001) by default, but this can always be further restricted as desired.
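If you want to opt in to the IANA ports only, a sketch of the relevant etcd2 flags is shown below (addresses are examples; on CoreOS these are typically set through the etcd2 section of cloud-config rather than passed by hand):

etcd2 --name infra0 \
  --listen-client-urls http://0.0.0.0:2379 \
  --advertise-client-urls http://10.0.0.10:2379 \
  --listen-peer-urls http://0.0.0.0:2380 \
  --initial-advertise-peer-urls http://10.0.0.10:2380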

Migration and Changes

Existing clusters running etcd 0.4 will not automatically migrate to etcd v2.0. As there are semantic changes in how etcd clusters are managed between the two versions, we have decided to include both. There are documented methods to migrate to etcd v2.0 and you may do this at your own pace. We encourage users to use etcd v2.0 for all new clusters to take advantage of the large number of stability and performance improvements over the older series.

In this process, we have had to break backward compatibility in two cases in order to support this change:

  1. Starting fleet.service without explicitly starting etcd.service or etcd2.service will no longer work. If you are using fleet and need a local etcd endpoint, you will need to also start etcd.service or etcd2.service.

  2. Starting flannel.service without explicitly starting etcd.service or etcd2.service will no longer work. If you are using flannel and need a local etcd endpoint, you will need to also start etcd.service or etcd2.service.

We have discouraged the use of this implicit dependency in our documentation, but you should check whether you will be affected: make sure that etcd.service or etcd2.service is enabled or started in your cloud-config.
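On a running host, a quick way to check and to make the dependency explicit (a sketch, using the unit names referenced above):

systemctl is-enabled etcd2.service fleet.service
sudo systemctl enable etcd2.service
sudo systemctl start etcd2.service fleet.service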

Looking Forward

As we look forward to etcd v2.1.0 and beyond, there are a number of exciting things shaping up inside of etcd. In the near future, new features such as the authorization and authentication API will make it safer to operate multiple applications on a single cluster. The team has also been operating ongoing test environments that introduce regular partitions and crashes, and making practical benchmarks available. In the last few days there has also been an active discussion on how to evolve the etcd APIs to better support the applications using etcd for coordination and scheduling today.

We welcome your involvement in the development of etcd - via the etcd-dev discussion mailing list, GitHub issues, or contributing directly to the project.

April 14, 2015

CoreOS on ARM64

This is a guest post from CoreOS contributor, Geoff Levand, Linux Architect, Huawei America Software Lab. He has started work on an ARM64 port of CoreOS. Here is the current state of the project, followed by how you can help.

Recent patches that I've contributed to CoreOS have added basic support for a new target board named arm64-usr. There is currently a single generic ARM64 little endian Linux profile. This profile should work with any ARM64 platform currently supported by the mainline Linux kernel, such as the ARM V8 Foundation Model, the ARM FVP_VE Fast Model, the ARM FVP_BASE Fast Model, and recent qemu-system-aarch64. I hope to add other profiles to support an ARM64 big endian build, and also to get the eight-core HiSilicon 6220 based HiKey developer board supported.

ARM64 porting work is still in progress, so please consider what is done so far as experimental. Some initial work I did along with Michael Marineau of CoreOS was to clean up parts of the CoreOS build system to simplify the way architectures are defined, and also to make the generic build infrastructure completely architecture agnostic. The resulting system should make it quite straight forward to add additional architecture support to CoreOS.

The ARM64 architecture is a relatively new one, so many upstream software packages have either only recently been updated to support ARM64, or have not been yet. Much of my CoreOS porting work so far has been going through the packages which don't build and figuring out how to get them to build. Sometimes a package can be updated to the latest upstream, sometimes a package keyword can be set, sometimes a modification to the ebuild in coreos-overlay will work, and other times a combination of these is needed. This process is still ongoing, and some difficult packages still lie ahead. The resulting arm64-usr build is experimental and all the work to bring it up will need testing and review in the future.

There is still a lot of work to be done. Many more packages need to be brought up, and as I mentioned, this involves working at a low level with the package ebuild files and the CoreOS build system. At another level, all the CoreOS features will need to be exercised and verified as needed to bring up the stability and confidence of the port. There are going to be multi-arch clusters, so ARM64 and x86_64 nodes are going to need to work together -- it sounds pretty cool. Someone will need to get in there and make that happen. If you have any interest in the ARM64 port, I encourage you to get involved and help out.

For general info about the port you can look at my Github site. For those who would like to investigate more, or even help with the effort, see my CoreOS ARM64 HOWTO document.

Continue the discussion with Geoff at CoreOS Fest and on freenode in #coreos as geoff-

April 13, 2015

Counting Down to CoreOS Fest on May 4 and 5

As we count down to the inaugural CoreOS Fest in just three weeks, we are thrilled to announce additional speakers and the agenda! CoreOS Fest will be May 4-5 at The Village at 969 Market Street in San Francisco and we hope you will join us.

CoreOS Fest is a two-day event about the tools and best practices used to build modern infrastructure stacks. CoreOS Fest connects people from all levels of the community with future-thinking industry veterans to learn how to build distributed systems that support application containers. This May’s festival is brought to you by our premier sponsor Intel, and additional sponsors Sysdig, Chef, Mesosphere, Metaswitch Networks and Giant Swarm.

CoreOS Fest will include speakers from Google, Intel, Salesforce Data.com, HP, and more, including:

  • Brendan Burns, software engineer at Google and founder of Kubernetes, will provide a technical overview of Kubernetes

  • Diego Ongaro, creator of Raft, will discuss the Raft Consensus Algorithm

  • Lennart Poettering, creator of systemd, will talk about systemd at the Core of the OS

  • Nicholas Weaver, director of SDI-X at Intel, will demonstrate how we can optimize container architectures for the next level of scale

  • Prakash Rudraraju, manager of technical operations at Salesforce Data.com, will join Brian Harrington, principal architect at CoreOS, for a fireside chat on how Salesforce Data.com is thinking about distributed systems and application containers

  • Yazz Atlas, HPCS principal engineer with Hewlett-Packard Advanced Technology Group, will give a presentation on automated MySQL Cluster failover using Galera Cluster on CoreOS Linux

  • Loris Degioanni, CEO and founder of Sysdig and co-creator of Wireshark, will present the dark art of container monitoring

  • Gabriel Monroy, CTO at OpDemand/Deis, will discuss lessons learned from building platforms on top of CoreOS

  • Spencer Kimball, founder of Cockroach Labs, will talk about CockroachDB

  • Chris Winslett, product manager at Compose.io, will present etcd based Postgres SQL HA Cluster

  • Timo Derstappen, co-founder of Giant Swarm, will present Containers on the Autobahn

More speakers will be added at https://coreos.com/fest/.

As a part of today's schedule announcement, we are offering 10 percent off the regular ticket price until tomorrow, April 14, at 10 a.m. PT. Use this link to reserve your 10 percent off ticket. Tickets are selling fast so get them before we sell out!

Once again, CoreOS Fest thanks its top level sponsor Intel and additional sponsors, including Sysdig, Chef, Mesosphere, Metaswitch Networks and Giant Swarm. If you’re interested in participating at CoreOS Fest as a sponsor, contact fest@coreos.com.

For more CoreOS Fest news, follow along @coreoslinux or #CoreOSFest

April 08, 2015

Upcoming CoreOS Events in April

Supplied with fresh CoreOS t-shirts and half our weight in airport Cinnabons, we’ve made sure that you’ll be seeing a lot of us this April.


Wednesday, April 8, 2015 at 10:15 a.m. EDT - Philadelphia, PA

Don’t miss Kelsey Hightower (@kelseyhightower), developer advocate and toolsmith at CoreOS, kick off our April events by speaking at ETE Conference. He’ll be discussing managing containers at scale with CoreOS and Kubernetes.


Thursday, April 16, 2015 at 7:00 p.m. CET - Amsterdam, Netherlands

Kelsey Hightower will be giving an introduction to fleet, CoreOS and building large reliable systems at the Docker Randstad Meetup.


Thursday, April 16, 2015 at 6:00 p.m. PDT - San Francisco, CA

Brian Harrington will be giving an overview of CoreOS at CloudCamp. This is an unconference dedicated to all things containers.


Friday, April 17, 2015 - San Francisco, CA

Joined by a few of our very own, CoreOS CTO Brandon Philips (@BrandonPhilips) will be speaking at Container Camp. This event focuses on the latest developments in software virtualization. Get your tickets here.


Tuesday April 21 - Saturday, April 25, 2015 - Berlin, Germany

This year we’ll be attending the Open Source Data Center Conference (OSDC), where Kelsey Hightower will be talking about building distributed systems with CoreOS.


Wednesday, April 22 at 6:30p.m. CET - Berlin, Germany

If you’re in Berlin, be sure to check out Kelsey Hightower talk about managing containers at scale with CoreOS and Kubernetes.


In case you missed it

In case you missed it, check out Chris Winslett from Compose.io talking about an etcd-based PostgreSQL HA Cluster.

CoreOS Fest

Don’t forget that CoreOS Fest is happening the following month on May 4 and 5! We’ve released a tentative schedule and our first round of speakers. Keep checking back for more updates as the event gets closer.

April 07, 2015

Sahana Participates for GCI 2014

The Sahana Software Foundation has actively taken part in the Google Code-In programme since its inception in 2010, and 2014's programme was no exception as Sahana was once again among the 12 open source organizations selected to mentor students for Code-In. [Read the Rest...]

April 06, 2015

Announcing Tectonic: The Commercial Kubernetes Platform

CoreOS Tech Stack + Kubernetes

Our technology is often characterized as “Google’s infrastructure for everyone else.” Today we are excited to make this idea a reality by announcing Tectonic, a commercial Kubernetes platform. Tectonic provides the combined power of the CoreOS portfolio and the Kubernetes project to any cloud or on-premise environment.

Why we are building Tectonic

Our users want to securely run containers at scale in a distributed environment. We help companies do this by building open source tools which allow teams to create this type of infrastructure. With Tectonic, we now have an option for companies that want a preassembled and enterprise-ready distribution of these tools, allowing them to quickly see the benefits of modern container infrastructure.

What is Tectonic?

Tectonic is a platform combining Kubernetes and the CoreOS stack. Tectonic pre-packages all of the components required to build Google-style infrastructure and adds additional commercial features, such as a management console for workflows and dashboards, an integrated registry to build and share Linux containers, and additional tools to automate deployment and customize rolling updates.

Tectonic is available today to a select number of early customers. Head over to tectonic.com to sign up for the waitlist if your company is interested in participating.

What is Kubernetes?

Kubernetes is an open source project introduced by Google to help organizations run their infrastructure in a similar manner to the internal infrastructure that runs Google Search, Gmail, and other Google services. The concepts and workflows in Kubernetes are designed to help engineers focus on their application instead of infrastructure and build for high availability of services. With the Kubernetes APIs, users can manage application infrastructure - such as load balancing, service discovery, and rollout of new versions - in a way that is consistent and fault-tolerant.
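
As a rough sketch of that workflow, the kubectl command-line client can be used to submit and inspect application resources; the manifest name below is hypothetical and exact flags may differ between Kubernetes releases:

# submit a (hypothetical) pod or replication controller manifest to the cluster
$ kubectl create -f frontend.yaml

# inspect what the cluster is running and how it is exposed
$ kubectl get pods
$ kubectl get services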

Tectonic and CoreOS

Tectonic is a commercial product, and with this release, we have decided to launch our commercial products under a new brand, separate from the CoreOS name. We want our open source components - like etcd, rkt, flannel, and CoreOS Linux - to always be freely available for everyone under their respective open source licenses. We think open source development works best when it is community-supported infrastructure that we all share and build with few direct commercial motives. To that end, we want to keep CoreOS focused on building completely open source components.

To get access to an early release of Tectonic or to learn more, visit tectonic.com. To contribute and learn more about our open source projects visit coreos.com.

Google Ventures Funding

In addition to introducing Tectonic, today we are announcing an investment in CoreOS, Inc. led by Google Ventures. It is great to have the support and backing of Google Ventures as we bring the Kubernetes platform to market. The investment will help us accelerate our efforts to secure the backend of the Internet and deliver Google-like infrastructure to everyone else.

FAQ

Q: What does this change about CoreOS Linux and other open source projects like rkt, etcd, fleet, flannel, etc?

A: Nothing: development will continue, and we want to see all of the open source projects continue to thrive as independent components. CoreOS Linux will remain the same carefully maintained, open source, and container-focused OS it has always been. Tectonic uses many of these projects internally - including rkt, etcd, flannel, and fleet - and runs on top of the same CoreOS Linux operating system as any other application would.

Q: I am using Apache Mesos, Deis, or another application on top of CoreOS Linux: does anything change for me?

A: No, this announcement doesn't change anything about the CoreOS Linux project or software. Tectonic is simply another container-delivered application that runs on top of CoreOS Linux.

Q: What does this change for existing Enterprise Registry, Managed Linux, or Quay.io customers?

A: Everything will remain the same for existing customers. All of these components are utilized in the Tectonic stack and we continue to offer support, fix bugs and add features to these products.


Follow @TectonicStack on Twitter

Go to Tectonic.com to join an early release or to stay up to date on Tectonic news

Visit us in person at CoreOS Fest in San Francisco May 4-5, to learn more about CoreOS, Tectonic and all things distributed systems

April 01, 2015

Announcing rkt v0.5, featuring pods, overlayfs, and more

rkt is a new container runtime for applications, intended to meet the most demanding production requirements of security, efficiency and composability. rkt is also an implementation of the emerging Application Container (appc) specification, an open specification defining how applications can be run in containers. Today we are announcing the next major release of rkt, v0.5, with a number of new features that bring us closer to these goals, and want to give an update on the upcoming roadmap for the rkt project.

appc v0.5 - introducing pods

This release of rkt updates to the latest version of the appc spec, which introduces pods. Pods encapsulate a group of Application Container Images and describe their runtime environment, serving as a first-class unit for application container execution.

Pods are a concept recently popularised by Google's Kubernetes project. The idea emerged from the recognition of a powerful, pervasive pattern in deploying applications in containers, particularly at scale. The key insight is that, while one of the main value propositions of containers is for applications to run in isolated and self-contained environments, it is often useful to co-locate certain "helper" applications within a container. These applications have an intimate knowledge of each other - they are designed and developed to work co-operatively - and hence can share the container environment without conflict, yet still be isolated from interfering with other application containers on the same system.

A classic example of a pod is service discovery using the sidekick model, wherein the main application process serves traffic, and the sidekick process uses its knowledge of the pod environment to register the application in the discovery service. The pod links together the lifecycle of the two processes and ensures they can be jointly deployed and constrained in the cluster.

Another simple example is a database co-located with a backup worker. In this case, the backup worker could be isolated from interfering with the database's work - through memory, I/O and CPU limits applied to the process - but when the database process is shut down the backup process will terminate too. By making the backup worker an independent application container, and making pods the unit of deployment, we can reuse the worker for backing up data from a variety of applications: SQL databases, file stores or simple log files.

This is the power that pods provide: they encapsulate a self-contained, deployable unit that still provides granularity (for example, per-process isolators) and facilitates advanced use cases. Bringing pods to rkt enables it to natively model a huge variety of application use cases, and integrate tightly with cluster-level orchestration systems like Kubernetes.
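
As a minimal sketch of how this surfaces in rkt, passing several images to a single rkt run invocation launches them together as one pod; the image names here are purely illustrative:

# run an application and its service-discovery sidekick as one pod
$ sudo rkt run example.com/webapp:1.0.0 example.com/registrator:1.0.0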

For more information on pods, including the technical definition, check out the appc spec or the Kubernetes documentation.

overlayfs support

On modern Linux systems, rkt now uses overlayfs by default when running application containers. This provides immense benefits to performance and efficiency: start times for large containers will be much faster, and multiple pods using the same images will consume less disk space and can share page cache entries.

If overlayfs is not supported on the host operating system, rkt gracefully degrades back to the previous behaviour of extracting each image at runtime - this behaviour can also be triggered with the new --no-overlay flag to rkt run.

Another improvement behind the scenes is the introduction of a tree cache for rkt's local image storage. When storing ACIs in its local database (for example, after pulling them from a remote repository using rkt fetch), rkt will now store the expanded root filesystem of the image on disk. This means that when pods that reference this image are subsequently started (via rkt run), the pod filesystem can be created almost instantaneously in the case of overlayfs - or, without overlayfs, by using a simple copy instead of needing to expand the image again from its compressed format.
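
A short sketch of the flow described above, using the etcd image as an example:

# fetch once: the ACI is downloaded, verified, and its expanded rootfs is stored in the tree cache
$ rkt fetch coreos.com/etcd:v2.0.4

# run with overlayfs (the default on supported kernels): pod filesystem setup is nearly instantaneous
$ sudo rkt run coreos.com/etcd:v2.0.4

# opt out and fall back to the previous extract-at-runtime behaviour
$ sudo rkt run --no-overlay coreos.com/etcd:v2.0.4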

To facilitate simultaneous use of the tree store by multiple rkt invocations, file-based locking has been added to ensure images that are in use cannot be removed. Future versions of rkt will expose more advanced capabilities to manage images in the store.

stage1 from source

When executing application containers, rkt uses a modular approach (described in the architecture documentation) to support swappable, alternative execution environments. The default stage1 that we develop with rkt itself is based on systemd, but alternative implementations can leverage different technologies like KVM-based virtual machines to execute applications.

In earlier versions of rkt, the pre-bundled stage1 was assembled from a copy of the CoreOS Linux distribution image. We have been working hard to decouple this process to make it easier to package rkt for different operating systems and in different build environments. In rkt 0.5, the default stage1 is now constructed from source code, and over the next few releases we will make it easier to build alternative stage1 images by documenting and stabilizing the ABI.

"Rocket", "rocket", "rkt"?

This release also sees us standardizing on a single name for all areas of the project - the command-line tool, filesystem names and Unix groups, and the title of the project itself. Instead of "rocket", "Rocket", or "rock't", we now simply use "rkt".

rkt logo

Looking forward

rkt is a young project and the last few months have seen rapid changes to the codebase. As we look towards rkt 0.6 and beyond, we will be focusing on making it possible to depend on rkt to roll-forward from version to version without breaking working setups. There are several areas that are needed to make this happen, including reaching the initial stable version (1.0) of the appc spec, implementing functional testing, stabilizing the on-disk formats, and implementing schema upgrades for the store. We realize that stability is vital for people considering using rkt in production environments, and this will be a priority in the next few releases. The goal is to make it possible for a user that was happily using rkt 0.6 to upgrade to rkt 0.7 without having to remove their downloaded ACIs or configuration files.

We welcome your involvement in the development of rkt - via the rkt-dev discussion mailing list, GitHub issues, or contributing directly to the project.

March 27, 2015

CoreOS Fest 2015 First Round of Speakers Announced

As you might already know, we’re launching our first ever CoreOS Fest this May 4th and 5th in San Francisco! We’ve been hard at work making sure that this event is two days filled with all things distributed, and all things awesome.

In addition to many CoreOS project leads taking the stage, we are excited to announce a sneak peek at some of our community speakers. Join us at CoreOS Fest and you’ll hear from some of the most influential people in distributed systems today: Brendan Burns, one of the founders of Kubernetes; Diego Ongaro, the creator of Raft; Gabriel Monroy, the creator of Deis; Spencer Kimball, CEO of Cockroach Labs; Loris Degioanni, CEO of Sysdig; and many more!

We are still accepting submissions for speakers through March 31st, so we encourage you to submit your talk in our Call for Papers portal.

While the schedule will be live in the coming weeks, here's a high level overview:

We’ll kick off day one at 9 AM PDT (with registration and breakfast beforehand) with a single track of speakers, followed by lunch, then afternoon panels and breakouts. You’ll have lots of opportunities to connect and talk with fellow attendees, especially at an evening reception on the first day. Day two will include breakfast, single-track talks, lunch, panels and more.

Confirmed Speakers

See more about our first round of speakers:


Brendan Burns
Software Engineer at Google and a founder of the Kubernetes project

Brendan works in the Google Cloud Platform, leading engineering efforts to make the Google Cloud Platform the best place to run containers. He also has managed several other cloud teams including the Managed VMs team, and Cloud DNS. Prior to Cloud, he was a lead engineer in Google’s web search infrastructure, building backends that powered social and personal search. Prior to working at Google, he was a professor at Union College in Schenectady, NY. He received a PhD in Computer Science from the University of Massachusetts Amherst, and a BA in Computer Science and Studio Art from Williams College.


Diego Ongaro
Creator of Raft

Diego recently completed his doctorate with John Ousterhout at Stanford. During his doctorate, he worked on RAMCloud (a 5-10 microsecond RTT key-value store), Raft, and LogCabin (a coordination service built with Raft). He’s lately been continuing development on LogCabin as an independent contractor.


Gabriel Monroy
CTO of OpDemand and creator of Deis

Gabriel Monroy is CTO at OpDemand and the creator of Deis, the leading CoreOS-based PaaS. As an early contributor to Docker and CoreOS, Gabriel has deep experience putting containers into production and frequently advises organizations on PaaS, container automation and distributed systems. Gabriel spoke recently at QConSF on cluster scheduling and deploying containers at scale.


Spencer Kimball

Spencer is CEO of Cockroach Labs. After helping to re-architect and re-implement Square's items catalog service, Spencer was convinced that the industry needed more capable database software. He began work on the design and implementation of Cockroach as an open source project and moved to work on it full time at Square mid-2014. Spencer managed the acquisition of Viewfinder by Square as CEO and before that, shared the roles of co-CTO and co-founder. Previously, he worked at Google on systems and web application infrastructure, most recently helping to build Colossus, Google’s exascale distributed file system, and on Java infrastructure, including the open-sourced Google Servlet Engine.


Loris Degioanni
CEO of Sysdig

Loris is the creator and CEO of Sysdig, a popular open source troubleshooting tool for Linux environments. He is a pioneer in the field of network analysis through his work on WinPcap and Wireshark: open source tools with millions of users worldwide. Loris was previously a senior director of technology at Riverbed, and co-founder/CTO at CACE Technologies, the company behind Wireshark. Loris holds a PhD in computer engineering from Politecnico di Torino, Italy.


Excited? Stay tuned for more announcements and join us at CoreOS Fest 2015.

Buy your early bird ticket by March 31st: https://coreos.com/fest/

Submit a speaking abstract by March 31st: CFP Portal

Become a sponsor, email us for more details.

March 20, 2015

What makes a cluster a cluster?

“What makes a cluster a cluster?” - Ask that question of 10 different engineers and you’ll get 10 different answers. Some look at it from a hardware perspective, some see it as a particular set of cloud technologies, and some say it’s the protocols exchanging information on the network.

With this ever-growing field of distributed systems technologies, it is helpful to compare the goals, roles and differences of some of these new projects based on their functionality. In this post we propose a conceptual description of the cluster at large, while showing some examples of emerging distributed systems technologies.

Layers of abstraction

The tech community has long agreed on what a network looks like. We’ve largely come to agree, in principle, on the OSI (Open Systems Interconnection) model (and in practice, on its close cousin, the TCP/IP model).

A key aspect of this model is the separation of concerns, with well-defined responsibilities and dependence between components: every layer depends on the layer below it and provides useful network functionality (connection, retry, packetization) to the layer above it. At the top, finally, are web sessions and applications of all sorts running and abstracting communication.

So, as an exercise to try to answer “What makes a cluster a cluster?” let’s apply the same sort of thinking to layers of abstraction in terms of execution of code on a group of machines, instead of communication between these machines.

Here’s a snapshot of the OSI model, applied to containers and clustering:

OSI Applied to Clustering

Let’s take a look from the bottom up.

Level 1, Hardware

The hardware layer is where it all begins. In a modern environment, this may mean physical (bare metal) or virtualized hardware – abstraction knows no bounds – but for our purposes, we define hardware as the CPU, RAM, disk and network equipment that is rented or bought in discrete units.

Examples: bare metal, virtual machines, cloud

Level 2, OS/Machine ABI

The OS layer is where we define how software executes on the hardware: the OS gives us the Application Binary Interface (ABI) by which we agree on a common language that our userland applications speak to the OS (system calls, device drivers, and so on). We also set up a network stack so that these machines can communicate amongst each other. This layer therefore provides our lowest level complete execution environment for applications.

Now, traditionally, we stop here, and run our final application on top of this as a third pseudo-layer of the OS and various user-space packages. We provision individual machines with slightly different software stacks (a database server, an app server) and there’s our server rack.

Over the lifetime of servers and software, however, the permutations and histories of individual machine configurations start to become unwieldy. As an industry, we are learning that managing this complexity becomes costly or infeasible over time, even at moderate scale (e.g. 3+ machines).

This is often where people start to talk about containers, as containers treat the entire OS userland as one hermetic application package that can be managed as an independent unit. Because of this abstraction, we can conceptually shift containers up the stack, as long as they’re above layer 2. We’ll revisit containers in layer 6.

Examples: kernel + {systemd, cgroups/namespaces, jails, zones}

Level 3, Cluster Consensus

To begin to mitigate the complexity of managing individual servers, we need to start thinking about machines in some greater, collective sense: this is our first notion of a cluster. We want to write software that scales across these individual servers and shares work effortlessly.

However, as we add more servers to the picture, we now introduce many more points of failure: networks partition, machines crash and disks fail. How can we build systems in the face of greater uncertainty? What we’d like is some way of creating a uniform set of data and data primitives, as needed by distributed systems. Much like in multiprocessor programming, we need the equivalent of locks, message passing, shared memory and atomicity across this group of machines.

This is an interesting and vibrant field of algorithmic research: a first stop for the curious reader should be the works of Leslie Lamport, particularly his earlier writing on ordering and reliability of distributed systems. His later work describes Paxos, the preeminent consensus protocol; the other major protocol, as provided by many projects in this category, is Raft.

Why is this called consensus? The machines need to ‘agree’ on the same history and order of events in order to make the guarantees we’d like for the primitives described. Locks cannot be taken twice, for example, even if some subset of messages disappears or arrives out of order, or member machines crash for unknown reasons.

These algorithms build data structures to form a coherent, consistent, and fault-tolerant whole.

Examples: etcd, ZooKeeper, consul
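
To make this concrete, here is a hedged sketch using etcd's command-line client: an atomic "create if absent" operation is enough to build a crude cluster-wide lock (the key and value names are arbitrary):

# atomic acquire: mk succeeds only if the key does not already exist
$ etcdctl mk /locks/db-migration worker-1

# a second worker running the same command fails to create the key, so only one holds the lock

# release the lock once the critical work is done
$ etcdctl rm /locks/db-migration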

Level 4, Cluster Resources

With this perspective of a unified cluster, we can now talk about cluster resources. Having abstracted the primitives of individual machines, we use this higher level view to create and interact with the complete set of resources that we have at our disposal. Thus we can consider in aggregate the CPUs, RAM, disk and networking as available to any process in the cluster, as provided by the physical layers underneath.

Viewing the cluster as one large machine, all devices (CPU, RAM, disk, networking) become abstract. Containers already benefit from this: they depend on these resources being abstracted on their behalf - network bridges, for example - so that they can use the abstractions higher up the stack while running on any of the underlying hardware.

In some sense, this layer is the equivalent of the hardware layer of the now-primordial notion of the cluster. It may not be as celebrated as the layers above it, but this layer is where some important innovation takes place. Showing a cool auto-scaling webapp demo is nice, but it requires groundwork like carving up the cluster IP space or deciding where a block device is attached to a host.

Examples: flannel, remote block storage, weave
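
As one example, flannel carves per-host subnets out of a cluster-wide address range that is written into the consensus layer below it; a minimal sketch, assuming flannel's conventional etcd key:

# every flannel daemon reads this document and allocates its host's subnet from the range
$ etcdctl set /coreos.com/network/config '{ "Network": "10.1.0.0/16" }'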

Level 5, Cluster Orchestration and Scheduling

Cluster orchestration, then, starts to look a lot like an OS kernel atop these cluster-level resources and the tools given by consistency – symmetry with the layers below again. It’s the purview of the orchestration platform to divide and share cluster resources, schedule applications to run, manage permissions, set up interfaces into and out of the cluster, and at the end of the day, find an ABI-compatible environment for the userland. With increased scale comes new challenges: from finding the right machines to providing the best experience to users of the cluster.

Any software that will run on the cluster must ultimately execute on a physical CPU on a particular server. How the application code gets there and what abstractions it sees is controlled by the orchestration layer. This is similar to how WiFi simulates a copper wire to existing network stacks, with a controllable abstraction through access points, signal strength, meshes, encryption and more.

Examples: fleet, Mesos, Kubernetes
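
A small sketch of this layer in action with fleet (the unit name is hypothetical): the scheduler picks a machine in the cluster and hands the unit to that machine's systemd.

# submit and start a systemd unit somewhere in the cluster
$ fleetctl start hello.service

# see which machine each unit was scheduled onto
$ fleetctl list-units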

Level 6, Containers

This brings us back to containers, in which, as described earlier, the entire userland is bundled together and treated as a single application unit.

If you’ve followed the whole stack up to this point, you’ll see why containers sit at level 6, instead of at level 2 or 3. It’s because the layers of abstraction below this point all depend on each other to build up to the point where a single-serving userland can safely abstract whether it’s running as one process on a local machine or as something scheduled on the cluster as a whole.

Containers are actually simple that way; they depend on everything else to provide the appropriate execution environment. They carry userland data and expect specific OS details to be presented to them.

Examples: Rocket, Docker, systemd-nspawn
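
Because the layers below provide the execution environment, the runtime itself is largely interchangeable from the application's point of view; a quick sketch with two of the runtimes listed above:

# run an image with Docker
$ docker run busybox echo "hello from a container"

# run an ACI with rkt (Rocket)
$ sudo rkt run coreos.com/etcd:v2.0.4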

Level 7, Application

Containers are currently getting a lot of attention in the industry because they can separate the OS and software dependencies from the hardware. By abstracting these details, we can create consistent execution environments across a fleet of machines and let the traditional POSIX userland continue to work, fairly seamlessly, no matter where you take it. If the intention is to share the containers, then choice is important, as is agreeing upon a sharable standard. Containers are exciting; they start us down the road of a lot of open source work in the realm of true distributed systems, backwards-compatible with the code we already write – our Application.

Closing Thoughts

For any of the layers of the cluster, there are (and will continue to be) multiple implementations. Some will combine layers, some will break them into sub-pieces – but this was true of networking in the past as well (do you remember IPX? Or AppleTalk?).

As we continue to work deeply on the internals of every layer, we also sometimes want to take a step back to look at the overall picture and consider the greater audience of people who are interested and starting to work on clusters of their own. We want to introduce this concept as a guideline, with a symmetric way of thinking about a cluster and its components. We’d love your thoughts on what defines a cluster as more than a mass of hardware.

March 13, 2015

Announcing rkt and App Container 0.4.1

Today we are announcing rkt v0.4.1. rkt is a new app container runtime and implementation of the App Container (appc) spec. This milestone release includes new features like private networking, an enhanced container lifecycle, and unprivileged image fetching, all of which get us closer to our goals of a production-ready container runtime that is composable, secure, and fast.

Private Networking

This release includes our first iteration of the rkt networking subsystem. As an example, let's run etcd in a private network:

# Run an etcd container in a private network
$ rkt run --private-net coreos.com/etcd:v2.0.4

By using the --private-net flag, the etcd container will run with its own network stack decoupled from the host. This includes a private lo loopback device and an eth0 device with an IP in the 172.16.28.0/24 address range. By default, rkt creates a veth pair, with one end becoming eth0 in the container and the other placed on the host. rkt will also set up an IP masquerade rule (NAT) to allow the container to speak to the outside world.

This can be demonstrated by being able to reach etcd on its version endpoint from the host:

$ curl 172.16.28.9:2379/version
{"releaseVersion":"2.0.4","internalVersion":"2"}

The networking configuration in rkt is designed to be highly pluggable to facilitate a variety of networking topologies and infrastructures. In this release, we have included plugins for veth, bridge, and macvlan, and more are under active development. See the rkt network docs for details.

If you are interested in building new network plugins, please take a look at the current specification and get involved by reaching out on GitHub or the mailing list. We would also like to extend a thank you to everyone who has spent time giving valuable feedback on the spec so far.

Unprivileged Fetches

It is good practice to download files over the Internet only as unprivileged users. With this release of rkt, it is possible to set up a rkt Unix group, and give users in that group the ability to download and verify container images. For example, let's give the core user permission to use rkt to retrieve images and verify their signature:

$ sudo groupadd rkt
$ sudo usermod -a -G rkt core
$ sudo rkt install
$ rkt fetch coreos.com/etcd:v2.0.5
rkt: searching for app image coreos.com/etcd:v2.0.5
rkt: fetching image from https://github.com/coreos/etcd/releases/download/v2.0.5/etcd-v2.0.5-linux-amd64.aci
Downloading ACI: [==========                                   ] 897 KB/3.76 MB
Downloading signature from https://github.com/coreos/etcd/releases/download/v2.0.5/etcd-v2.0.5-linux-amd64.aci.asc
rkt: signature verified:                                       ] 0 B/819 B
  CoreOS ACI Builder <release@coreos.com>
sha512-295a78d35f7ac5cc919e349837afca6d

The new rkt install subcommand is a simple helper to quickly set up all of the rkt directory permissions. These steps could easily be scripted outside of rkt for a more complex setup or a custom group name; for example, distributions that package rkt in their native formats would configure directory permissions at the time the package is installed.

Note that the image we’ve fetched will still need to be run with sudo, as Linux doesn't yet make it possible to do many of the operations necessary to start a container without root privileges. But at this stage, you can trust that the image comes from an author you have already trusted via rkt trust.

Other Features

rkt prepare is a new command that can be used to set up a container without immediately running it. This gives users the ability to allocate a container ID and do filesystem setup before launching any processes. In this way, a container can be prepared ahead of time, so that when rkt run-prepared is subsequently invoked, the process startup happens immediately with few additional steps. Being able to pre-allocate a unique container ID also facilitates better integration with higher-level orchestration systems.
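
A minimal sketch of that two-step flow (the UUID below is hypothetical; in practice you would use the container ID that rkt prepare reports):

# allocate a container ID and do filesystem setup, without starting any processes
$ sudo rkt prepare coreos.com/etcd:v2.0.4

# later, launch the prepared container with few additional steps
$ sudo rkt run-prepared c9fad0e6-8a4f-4b3e-9d2a-0d8a3c1f2b45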

rkt run can now append additional command line flags and environment variables for all apps, as well as optionally have containers inherit the environment from the parent process. For full details see the command line documentation.

The image store now uses a ql database to track metadata about images in the store. This is used to keep track of URLs, labels, and other metadata of images stored inside rkt's local store. Note that if you are upgrading from a previous rkt release on a system, you may need to remove /var/lib/rkt. We understand people are already beginning to rely on rkt and over the next few releases will focus heavily on introducing stable APIs. But until we are closer to a 1.0 release, expect that there will be more regular changes.

For more details about this 0.4.1 release and pre-compiled standalone rkt Linux binaries see the release page.

Updates to App Container spec

Finally, this change updates rkt to the latest version of the appc spec, v0.4.1. Recent changes to the spec include reworked isolators, new OS-specific requirements, and greater explicitness around image signing and encryption. You can refer to a list of some major changes and additions here.

Join us on the mission to create a secure, composable and standards based container runtime, and get involved in hacking on rkt or App Container here:

rkt:

Help Wanted, Mailing list

App Container:

Help Wanted, Mailing list

March 12, 2015

rkt Now Available in CoreOS Alpha Channel

Our CoreOS Alpha channel is designed to strike a balance between offering early access to new versions of software and serving as the release candidate for the Beta and Stable channels. Due to its release-candidate nature, we must be conservative in upgrading critical system components (e.g. systemd and etcd), but in order to get new technologies (like fleet and flannel) into the hands of users for testing we must occasionally include pre-production versions of these components in Alpha.

Today, we are adding rkt, a container runtime built on top of the App Container spec, to make it easier for users to try it and give us feedback.

rkt will join systemd-nspawn and Docker as container runtimes that are available to CoreOS users. Keep in mind that rkt is still pre-1.0 and that you should not rely on flags or the data in /var/lib/rkt to work between versions. Specifically, next week v0.4.1 will land in Alpha which is incompatible with images and containers created by previous versions of rkt. Besides the addition of /usr/bin/rkt to the image, nothing major has changed and no additional daemons will run by default.

Release Cadence

We have adopted a regular weekly schedule for Alpha releases, rolling out a new version every Thursday. Every other week we release a Beta, taking the best of the previous two Alpha versions and promoting it bit-for-bit. Similarly, once every four weeks we promote the best of the previous two Beta releases to Stable.

Give it a spin

If you want to spin up a CoreOS Alpha machine and get started, check out the documentation for v0.3.2. We look forward to having you involved in rkt development via the rkt-dev discussion mailing list, GitHub issues, or contributing directly to the project. We have made great progress so far, but there is still much to build!

Confessions of a Recovering Proprietary Programmer, Part XV

So the Linux kernel now has a Documentation/CodeOfConflict file. As one of the people who provided an Acked-by for this file, I thought I should set down what went through my mind while reading it. Taking it one piece at a time:



The Linux kernel development effort is a very personal process compared to “traditional” ways of developing software. Your code and ideas behind it will be carefully reviewed, often resulting in critique and criticism. The review will almost always require improvements to the code before it can be included in the kernel. Know that this happens because everyone involved wants to see the best possible solution for the overall success of Linux. This development process has been proven to create the most robust operating system kernel ever, and we do not want to do anything to cause the quality of submission and eventual result to ever decrease.



In a perfect world, this would go without saying, give or take the “most robust” chest-beating. But I am probably not the only person to have noticed that the world is not always perfect. Sadly, it is probably necessary to remind some people that “job one” for the Linux kernel community is the health and well-being of the Linux kernel itself, and not their own pet project, whatever that might be.



On the other hand, I was also heartened by what does not appear in the above paragraph. There is no assertion that the Linux kernel community's processes are perfect, which is all to the good, because delusions of perfection all too often prevent progress in mature projects. In fact, in this imperfect world, there is nothing so good that it cannot be made better. On the other hand, there also is nothing so bad that it cannot be made worse, so random wholesale changes should be tested somewhere before being applied globally to a project as important as the Linux kernel. I was therefore quite happy to read the last part of this paragraph: “we do not want to do anything to cause the quality of submission and eventual result to ever decrease.”



If however, anyone feels personally abused, threatened, or otherwise uncomfortable due to this process, that is not acceptable.



That sentence is of course critically important, but must be interpreted carefully. For example, it is all too possible that someone might feel abused, threatened, and uncomfortable by the mere fact of a patch being rejected, even if that rejection was both civil and absolutely necessary for the continued robust operation of the Linux kernel. Or someone might claim to feel that way, if they felt that doing so would get their patch accepted. (If this sounds impossible to you, be thankful, but also please understand that the range of human behavior is extremely wide.) In addition, I certainly feel uncomfortable when someone points out a stupid mistake in one of my patches, but that discomfort is my problem, and furthermore encourages me to improve, which is a good thing. For but one example, this discomfort is exactly what motivated me to write the rcutorture test suite. Therefore, although I hope that we all know what is intended by the words “abused”, “threatened”, and “uncomfortable” in that sentence, the fact is that it will never be possible to fully codify the difference between constructive and destructive behavior.



Therefore, the resolution process is quite important:



If so, please contact the Linux Foundation's Technical Advisory Board at <tab@lists.linux-foundation.org>, or the individual members, and they will work to resolve the issue to the best of their ability. For more information on who is on the Technical Advisory Board and what their role is, please see:



http://www.linuxfoundation.org/programs/advisory-councils/tab



There can be no perfect resolution process, but this one seems to be squarely in the “good enough” category. The timeframes are long enough that people will not be rewarded by complaining to the LF TAB instead of fixing their patches. The composition of the LF TAB, although not perfect, is diverse, consisting of both men and women from multiple countries. The LF TAB appears to be able to manage the inevitable differences of opinion, based on the fact that not all members provided their Acked-by for this Code of Conflict. And finally, the LF TAB is an elected body that has oversight via the LF, so there are feedback mechanisms. Again, this is not perfect, but it is good enough that I am willing to overlook my concerns about the first sentence in the paragraph.



On to the final paragraph:



As a reviewer of code, please strive to keep things civil and focused on the technical issues involved. We are all humans, and frustrations can be high on both sides of the process. Try to keep in mind the immortal words of Bill and Ted, “Be excellent to each other.”



And once again, in a perfect world it would not be necessary to say this. Sadly, we are human beings rather than angels, and so it does appear to be necessary. Then again, if we were all angels, this would be a very boring world.



Or at least that is what I keep telling myself!

March 11, 2015

The First CoreOS Fest

CoreOS Fest 2015

Get ready, CoreOS Fest, our celebration of everything distributed, is right around the corner! Our first CoreOS Fest is happening May 4 and 5, 2015 in San Francisco. You’ll learn more about application containers, container orchestration, clustering, devops security, new Linux, Go and more.

Join us for this two-day event as we talk about the newest in distributed systems technologies and together talk about securing the Internet. Be part of discussions shaping modern infrastructure stacks, hear from peers on how they are using these technologies today and get inspired to learn new ways to speed up your application development process.

Take a journey with us (in space and time) and help contribute to the next generation of infrastructure. The early bird tickets are available until March 31st and are only $199, so snatch one up now before they are gone. After March 31st, tickets will be available for $349. See you in May.

Submit an Abstract

Grab An Early Bird Ticket

If you are interested in sponsoring the event, reach out to fest@coreos.com and we would be happy to send you the prospectus.

March 10, 2015

Py3progress updated

Another year down!

I've updated the py3progress site with the whole of 2014, and what we have so far in 2015. I'll post a review of the last year later, as I have done before.

March 09, 2015

Verification Challenge 4: Tiny RCU

The first and second verification challenges were directed to people working on verification tools, and the third challenge was directed at developers. Perhaps you are thinking that it is high time that I stop picking on others and instead direct a challenge at myself. If so, this is the challenge you were looking for!



The challenge is to take the v3.19 Linux kernel code implementing Tiny RCU, unmodified, and use some formal-verification tool to prove that its grace periods are correctly implemented.



This requires a tool that can handle multiple threads. Yes, Tiny RCU runs only on a single CPU, but the proof will require at least two threads. The basic idea is to have one thread update a variable, wait for a grace period, then update a second variable, while another thread accesses both variables within an RCU read-side critical section, and a third parent thread verifies that this critical section did not span a grace period, like this:



 1 int x;
 2 int y;
 3 int r1;
 4 int r2;
 5
 6 void rcu_reader(void)
 7 {
 8   rcu_read_lock();
 9   r1 = x; 
10   r2 = y; 
11   rcu_read_unlock();
12 }
13
14 void *thread_update(void *arg)
15 {
16   x = 1; 
17   synchronize_rcu();
18   y = 1; 
19 }
20
21 . . .
22
23 assert(r2 == 0 || r1 == 1);




Of course, rcu_reader()'s RCU read-side critical section is not allowed to span thread_update()'s grace period, which is provided by synchronize_rcu(). Therefore, rcu_reader() must execute entirely before the end of the grace period (in which case r2 must be zero, keeping in mind C's default initialization to zero), or it must execute entirely after the beginning of the grace period (in which case r1 must be one).



There are a few technical problems to solve:





  1. The Tiny RCU code #includes numerous “interesting” files. I supplied empty files as needed and used “-I .” to focus the C preprocessor's attention on the current directory.

  2. Tiny RCU uses a number of equally interesting Linux-kernel primitives. I stubbed most of these out in fake.h, but copied a number of definitions from the Linux kernel, including IS_ENABLED, barrier(), and bool.

  3. Tiny RCU runs on a single CPU, so the two threads shown above must act as if this was the case. I used pthread_mutex_lock() to provide the needed mutual exclusion, keeping in mind that Tiny RCU is available only with CONFIG_PREEMPT=n. The thread that holds the lock is running on the sole CPU.

  4. The synchronize_rcu() function can block. I modeled this by having it drop the lock and then re-acquire it.

  5. The dyntick-idle subsystem assumes that the boot CPU is born non-idle, but in this case the system starts out idle. After a surprisingly long period of confusion, I handled this by having main() invoke rcu_idle_enter() before spawning the two threads. The confusion eventually proved beneficial, but more on that later.





The first step is to get the code to build and run normally. You can omit this step if you want, but given that compilers usually generate better diagnostics than do the formal-verification tools, it is best to make full use of the compilers.



I first tried goto-cc, goto-instrument, and satabs [Slide 44 of PDF] and impara [Slide 52 of PDF], but both tools objected strenuously to my code. My copies of these two tools are a bit dated, so it is possible that these problems have since been fixed. However, I decided to download version 5 of cbmc, which is said to have gained multithreading support.



After converting my code to a logic expression with no fewer than 109,811 variables and 457,344 clauses, cbmc -I . -DRUN fake.c took a bit more than ten seconds to announce VERIFICATION SUCCESSFUL. But should I trust it? After all, I might have a bug in my scaffolding or there might be a bug in cbmc.



The usual way to check for this is to inject a bug and see if cbmc catches it. I chose to break up the RCU read-side critical section as follows:



 1 void rcu_reader(void)
 2 {
 3   rcu_read_lock();
 4   r1 = x; 
 5   rcu_read_unlock();
 6   cond_resched();
 7   rcu_read_lock();
 8   r2 = y; 
 9   rcu_read_unlock();
10 }




Why not remove thread_update()'s call to synchronize_rcu()? Take a look at Tiny RCU's implementation of synchronize_rcu() to see why not!



With this change enabled via #ifdef statements, “cbmc -I . -DRUN -DFORCE_FAILURE fake.c” took almost 20 seconds to find a counter-example in a logic expression with 185,627 variables and 815,691 clauses. Needless to say, I am glad that I didn't have to manipulate this logic expression by hand!



Because cbmc catches an injected bug and verifies the original code, we have some reason to hope that the VERIFICATION SUCCESSFUL was in fact legitimate. As far as I know, this is the first mechanical proof of the grace-period property of a Linux-kernel RCU implementation, though admittedly of a rather trivial implementation. On the other hand, a mechanical proof of some properties of the dyntick-idle counters came along for the ride, courtesy of the WARN_ON_ONCE() statements in the Linux-kernel source code. (Previously, researchers at Oxford mechanically validated the relationship between rcu_dereference() and rcu_assign_pointer(), taking the whole of Tree RCU as input, and researchers at MPI-SWS formally validated userspace RCU's grace-period guarantee—manually.)



As noted earlier, I had confused myself into thinking that cbmc did not handle pthread_mutex_lock(). I verified that cbmc handles the gcc atomic builtins, but it turns out to be impractical to build a lock for cbmc's use from atomics. The problem stems from the “b” for “bounded” in “cbmc”, which means cbmc cannot analyze the unbounded spin loops used in locking primitives.



However, cbmc does do the equivalent of a full state-space search, which means it will automatically model all possible combinations of lock-acquisition delays even in the absence of a spin loop. This suggests something like the following:



 1 if (__sync_fetch_and_add(&cpu_lock, 1))
 2   exit();




The idea is to exclude from consideration any executions where the lock cannot be immediately acquired, again relying on the fact that cbmc automatically models all possible combinations of delays that the spin loop might have otherwise produced, but without the need for an actual spin loop. This actually works, but my mis-modeling of dynticks fooled me into thinking that it did not. I therefore made lock-acquisition failure set a global variable and added this global variable to all assertions. When this failed, I had sufficient motivation to think, which caused me to find my dynticks mistake. Fixing this mistake fixed all three versions (locking, exit(), and flag).



The exit() and flag approaches result in exactly the same number of variables and clauses, which turns out to be quite a bit fewer than the locking approach:



                              exit()/flag                                   locking
Verification                  69,050 variables, 287,548 clauses (output)    109,811 variables, 457,344 clauses (output)
Verification Forced Failure   113,947 variables, 501,366 clauses (output)   185,627 variables, 815,691 clauses (output)




So locking increases the size of the logic expressions by quite a bit, but interestingly enough does not have much effect on verification time. Nevertheless, these three approaches show a few of the tricks that can be used to accomplish real work using formal verification.



The GPL-licensed source for the Tiny RCU validation may be found here. C-preprocessor macros select the various options, with -DRUN being necessary for both real runs and cbmc verification (as opposed to goto-cc or impara verification), -DCBMC forcing the atomic-and-flag substitute for locking, and -DFORCE_FAILURE forcing the failure case. For example, to run the failure case using the atomic-and-flag approach, use:



cbmc -I . -DRUN -DCBMC -DFORCE_FAILURE fake.c




Possible next steps include verifying dynticks and interrupts, dynticks and NMIs, and of course use of call_rcu() in place of synchronize_rcu(). If you try these out, please let me know how it goes!

CoreOS on VMware vSphere and VMware vCloud Air

At CoreOS, we want to make the world successful with containers on all computing platforms. Today, we are taking one step closer to that goal by announcing, with VMware, that CoreOS is fully supported and integrated with both VMware vSphere 5.5 and VMware vCloud Air. Enterprises that have been evaluating containers but needed a fully supported environment now have what they need to get started.

We’ve worked closely with VMware in enabling CoreOS to run on vSphere 5.5 (see the technical preview of CoreOS on vSphere 5.5). This collaboration extends the security, consistency, and reliability advantages of CoreOS to users of vSphere. Developers can focus on their applications while operations teams get the control they need. We encourage you to read more from VMware here:

CoreOS Now Supported on VMware vSphere 5.5 and VMware vCloud Air.

As a sysadmin you’ve gotta be thinking, what does this mean for me?

Many people have been running CoreOS on VMware for a while now, but something was missing: performance and full integration with the VMware management APIs. Today that all changes. CoreOS is now shipping open-vm-tools, the open source implementation of VMware Tools, which enables better performance and management of CoreOS VMs running in all VMware environments.

Let's take a quick moment to explore some of the things that are now possible.

Taking CoreOS for a spin with VMware Fusion

The following tutorial will walk you through downloading an official CoreOS VMware image and configuring it using a cloud config drive. Once configured, a CoreOS instance will be launched and managed using the vmrun command line tool that ships with VMware Fusion.

To make the following commands easier to run set the following vmrun alias in your shell:

alias vmrun='/Applications/VMware\ Fusion.app/Contents/Library/vmrun'

Download a CoreOS VMware Image

First things first, download a CoreOS VMware image and save it to your local machine:

$ mkdir coreos-vmware
$ cd coreos-vmware
$ wget http://alpha.release.core-os.net/amd64-usr/current/coreos_production_vmware.vmx
$ wget http://alpha.release.core-os.net/amd64-usr/current/coreos_production_vmware_image.vmdk.bz2

Decompress the VMware disk image:

$ bzip2 -d coreos_production_vmware_image.vmdk.bz2

Configuring a CoreOS VM with a config-drive

By default CoreOS VMware images do not have any users configured, which means you won’t be able to log in to your VM after it boots. Also, many of the vmrun guest OS commands require a valid CoreOS username and password.

A config-drive is the best way to configure a CoreOS instance running on VMware. Before you can create a config-drive, you’ll need some user data. For this tutorial you will use a CoreOS cloud-config file as user data to configure users and set the hostname.

Generate the password hash for the core and root users

Before creating the cloud-config file, generate a password hash for the core and root users:

$ openssl passwd -1
Password:
Verifying - Password:
$1$LEfVXsiG$lhcyOrkJq02jWnEhF93IR/

Enter vmware at both password prompts.

Create a cloud config file

Now we are ready to create a cloud-config file:

edit cloud-config.yaml

#cloud-config

hostname: vmware-guest
users:
  - name: core
    passwd: $1$LEfVXsiG$lhcyOrkJq02jWnEhF93IR/
    groups:
      - sudo
      - docker
  - name: root
    passwd: $1$LEfVXsiG$lhcyOrkJq02jWnEhF93IR/

Create a config-drive

With your cloud-config file in place you can use it to create a config drive. The easiest way to create a config-drive is to generate an ISO using a cloud-config file and attach it to a VM.

$ mkdir -p /tmp/new-drive/openstack/latest
$ cp cloud-config.yaml /tmp/new-drive/openstack/latest/user_data
$ hdiutil makehybrid -iso -joliet -joliet-volume-name "config-2" -o ~/cloudconfig.iso /tmp/new-drive
$ rm -r /tmp/new-drive

At this point you should have a config-drive named cloudconfig.iso in your home directory.

Attaching a config-drive to a VM

Before booting the CoreOS VM the config-drive must be attached to the VM. Do this by appending the following lines to the coreos_production_vmware.vmx config file:

ide0:0.present = "TRUE"
ide0:0.autodetect = "TRUE"
ide0:0.deviceType = "cdrom-image"
ide0:0.fileName = "/Users/kelseyhightower/cloudconfig.iso"

At this point you are ready to launch the CoreOS VM:

vmrun start coreos_production_vmware.vmx

CoreOS on VMware

Running commands

With the CoreOS VM up and running use the vmrun command line tool to interact with it. Let's start by checking the status of vmware-tools in the VM:

$ vmrun checkToolsState coreos_production_vmware.vmx

Grab the VM’s IP address with the getGuestIPAddress command:

$ vmrun getGuestIPAddress coreos_production_vmware.vmx

Full VMware integration also means you can now run guest OS commands. For example, you can list the running processes using the listProcessesInGuest command:

$ vmrun -gu core -gp vmware listProcessesInGuest coreos_production_vmware.vmx
Process list: 63
pid=1, owner=root, cmd=/usr/lib/systemd/systemd --switched-root --system --deserialize 21
pid=2, owner=root, cmd=kthreadd
pid=3, owner=root, cmd=ksoftirqd/0
pid=4, owner=root, cmd=kworker/0:0
pid=5, owner=root, cmd=kworker/0:0H
pid=6, owner=root, cmd=kworker/u2:0
...

Finally, you can now run arbitrary commands and scripts using VMware management tools. For example, use the runProgramInGuest command to initiate a graceful shutdown:

$ vmrun -gu root -gp vmware runProgramInGuest coreos_production_vmware.vmx /usr/sbin/shutdown now

CoreOS on VMware

We have only scratched the surface regarding the number of things you can do with the new VMware powered CoreOS images. Check out the “Using vmrun to Control Virtual Machines” e-book for more details.

CoreOS and VMware going forward

We look forward to continuing on the journey to secure the backend of the Internet by working on all types of platforms in the cloud or behind the firewall. We are continuing to work with VMware so that CoreOS is also supported on the recently announced vSphere 6. If you have any questions in the meantime, you can find us on IRC as you get started. Feedback can also be provided at the VMware / CoreOS community forum.

March Update

It’s been a busy start to the year with lots going on in the Sahana community. There have been some great voluntary contributions over the past months. Tom Baker has been making some great progress continuing his work developing a Sahana [Read the Rest...]

March 08, 2015

Technocracy: a short look at the impact of technology on modern political and power structures

Below is an essay I wrote for some study that I thought might be fun to share. If you like this, please see the other blog posts tagged as Gov 2.0. Please note, this is a personal essay and not representative of anyone else :)

In recent centuries we have seen a dramatic change in the world brought about by the rise of and proliferation of modern democracies. This shift in governance structures gives the common individual a specific role in the power structure, and differs sharply from more traditional top down power structures. This change has instilled in many of the world’s population some common assumptions about the roles, responsibilities and rights of citizens and their governing bodies. Though there will always exist a natural tension between those in power and those governed, modern governments are generally expected to be a benevolent and accountable mechanism that balances this tension for the good of the society as a whole.

In recent decades the Internet has rapidly further evolved the expectations and individual capacity of people around the globe through, for the first time in history, the mass distribution of the traditional bastions of power. With a third of the world online and countries starting to enshrine access to the Internet as a human right, individuals have more power than ever before to influence and shape their lives and the lives of people around them. It is easier than ever for people to congregate, albeit virtually, according to common interests and goals, regardless of their location, beliefs, language, culture or other age old barriers to collaboration. This is having a direct and dramatic impact on governments and traditional power structures everywhere, and is both extending and challenging the principles and foundations of democracy.

This short paper outlines how the Internet has empowered individuals in an unprecedented and prolific way, and how this has changed and continues to change the balance of power in societies around the world, including how governments and democracies work.

Democracy and equality

The concept of an individual having any implicit rights or equality isn’t new, let alone the idea that an individual in a society should have some say over the ruling of the society. Indeed the idea of democracy itself has been around since the ancient Greeks in 500 BCE. The basis for modern democracies lies with the Parliament of England in the 11th century at a time when the laws of the Crown largely relied upon the support of the clergy and nobility, and the Great Council was formed for consultation and to gain consent from power brokers. In subsequent centuries, great concerns about leadership and taxes effectively led to a strongly increased role in administrative power and oversight by the parliament rather than the Crown.

The practical basis for modern government structures with elected officials had emerged by the 17th century. This idea was already established in England, but also took root in the United States. This was closely followed by multiple suffrage movements in the 19th and 20th centuries, which expanded the right to participate in modern democracies from (typically) adult white property owners to almost all adults in those societies.

It is quite astounding to consider the dramatic change from very hierarchical, largely unaccountable and highly centralised power systems to democratic ones in which those in power are expected to be held to account. This shift from top down power to distributed, representative and accountable power is an important step in understanding modern expectations.

Democracy itself is sustainable only when the key principle of equality is deeply ingrained in the population at large. This principle has been largely infused into Western culture and democracies, independent of religion, including in largely secular and multicultural democracies such as Australia. This is important because an assumption of equality underpins stability in a system that puts into the hands of its citizens the ability to make a decision. If one component of the society feels another doesn’t have an equal right to a vote, then outcomes other than their own are not accepted as legitimate. This has been an ongoing challenge in some parts of the world more than others.

In many ways there is a huge gap between the fearful sentiments of Thomas Hobbes, who preferred a complete and powerful authority to keep the supposed ‘brutish nature’ of mankind at bay, and the aspirations of John Locke who felt that even governments should be held to account and the role of the government was to secure the natural rights of the individual to life, liberty and property. Yet both of these men and indeed, many political theorists over many years, have started from a premise that all men are equal – either equally capable of taking from and harming others, or equal with regards to their individual rights.

Arguably, the Western notion of individual rights is rooted in religion. The Christian idea that all men are created equal under a deity presents an interesting contrast to traditional power structures that assume one person, family or group has more rights than the rest, although ironically various churches have not treated all people equally either. Christianity has deeply influenced many political thinkers and the forming of modern democracies, many of which look very similar to the mixed regime system described by Saint Thomas Aquinas in his Summa Theologiae essays:

Some, indeed, say that the best constitution is a combination of all existing forms, and they praise the Lacedemonian because it is made up of oligarchy, monarchy, and democracy, the king forming the monarchy, and the council of elders the oligarchy, while the democratic element is represented by the Ephors: for the Ephors are selected from the people.

The assumption of equality has been enshrined in key influential documents including the United States Declaration of Independence, 1776:

We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness.

More recently in the 20th Century, the Universal Declaration of Human Rights goes even further to define and enshrine equality and rights, marking them as important for the entire society:

Whereas recognition of the inherent dignity and of the equal and inalienable rights of all members of the human family is the foundation of freedom, justice and peace in the world… (first sentence of the Preamble to the Universal Declaration of Human Rights)

All human beings are born free and equal in dignity and rights. (Article 1 of the United Nations Universal Declaration of Human Rights)

The evolution of the concepts of equality and “rights” is important to understand as they provide the basis for how the Internet is having such a disruptive impact on traditional power structures, whilst also being a natural extension of an evolution in human thinking that has been hundreds of years in the making.

Great expectations

Although only a third of the world is online, in many countries this means the vast bulk of the population. In Australia over 88% of households are online as of 2012. Constant online access starts to drive a series of new expectations and behaviours in a community, especially one where equality has already been so deeply ingrained as a basic principle.

Over time a series of Internet-based instincts and perspectives have become mainstream, arguably driven by the very nature of the technology and the tools that we use online. For example, the Internet was developed to “route around damage”, which means the network can withstand technical interruption by finding another hardware or software path. Where damage is interpreted in a social sense, such as censorship or locking away access to knowledge, individuals instinctively seek and develop a workaround, and you see something quite profound: a society has emerged that doesn’t blindly accept limitations put upon it. This is quite a challenge for traditional power structures.

The Internet has become both an extension and an enabler of equality and power by massively distributing both to ordinary people around the world. How have power and equality been distributed? When you consider what constitutes power, four elements come to mind: publishing, communications, monitoring and enforcement.

Publishing – in times gone past the ideas that spread beyond a small geographical area either travelled by word of mouth via trade routes, or made it into a book. Only the wealthy could afford to print and distribute the written word, so publishing and dissemination of information was a power limited to a small number of people. Today the spreading of ideas is extremely easy, cheap and can be done anonymously. Anyone can start a blog, use social media, and the proliferation of information creation and dissemination is unprecedented. How does this change society? Firstly there is an assumption that an individual can tell their story to a global audience, which means an official story is easily challenged not only by the intended audience, but by the people about whom the story is written. Individuals online expect both to have their say, and to find multiple perspectives that they can weigh up, and determine for themselves what is most credible. This presents significant challenges to traditional powers such as governments in establishing an authoritative voice unless they can establish trust with the citizens they serve.

Communications – individuals have always had some method to communicate with individuals in other communities and countries, but up until recent decades these methods have been quite expensive, slow and oftentimes controlled. This has meant that historically, people have tended to form social and professional relationships with those close by, largely out of convenience. The Internet has made it easy to communicate, collaborate and coordinate with individuals and groups all around the world, in real time. This has made massive and global civil responses and movements possible, which has challenged traditional and geographically defined powers substantially. It has also made it significantly harder for governments to predict and control information flow and relationships within the society, and harder to support the best interests of citizens, given that what is good for a geographically defined nation state doesn’t always align with what is good for an online and trans-nationally focused citizen.

Monitoring – traditional power structures have always had ways to monitor the masses. Monitoring helps maintain rule of law through assisting in the enforcement of laws, and is often upheld through self-reporting as those affected by broken laws will report issues to hold offenders to account. In just the last 50 years, modern technologies like CCTV have made monitoring of the people a trivial task, where video cameras can record what is happening 24 hours a day. Foucault spoke of the panopticon gaol design as a metaphor for a modern surveillance state, where everyone is constantly watched on camera. The panopticon was a gaol design wherein detainees could not tell if they were being observed by gaolers or not, enabling, in principle, fewer gaolers to control a large number of prisoners. In the same way prisoners would theoretically behave better under observation, Foucault was concerned that omnipresent surveillance would lead to all individuals being more conservative and limited in themselves if they knew they could be watched at any time. The Internet has turned this model on its head. Although governments can more easily monitor citizens than ever before, individuals can also monitor each other and indeed, monitor governments for misbehaviour. This has led to individuals, governments, companies and other entities all being held to account publicly, sometimes violently or unfairly so.

Enforcement – enforcement of laws is a key role of a power structure, ensuring that the rules of a society are maintained for the benefit of stability and prosperity. Enforcement can take many forms including physical (gaol, punishment) or psychological (pressure, public humiliation). Power structures have many ways of enforcing the rules of a society on individuals, but the Internet gives individuals substantial enforcement tools of their own. Power used to be who had the biggest sword, or gun, or police force. Now that major powers and indeed, economies, rely so heavily upon the Internet, there is a power in the ability to disrupt communications. In taking down a government or corporate website or online service, an individual or small group of individuals can have an impact far greater than in the past on power structures in their society, and can do so anonymously. This becomes quite profound as citizen groups can emerge with their own philosophical premise and the tools to monitor and enforce their perspective.

Property – property has always been a strong basis of law and order and still plays an important part in democracy, though perspectives towards property are arguably starting to shift. Copyright was invented to protect the “intellectual property” of a person against copying at a time when copying was quite a physical business, and when the mode of distributing information was very expensive. Now, digital information is so easy to copy that it has created a change in expectations and a struggle for traditional models of intellectual property. New models of copyright have emerged that explicitly support copying (copyleft) and some have been successful, such as with the Open Source software industry or with remix music culture. 3D printing will change the game again, as in the near future we will see the massive distribution of the ability to copy physical goods, not just virtual ones. This is already creating havoc with those who seek to protect traditional approaches to property, but it also presents an extraordinary opportunity for mankind to have greater distribution of physical wealth, not just virtual wealth – particularly if you consider the current use of 3D printing to create transplant organs, or the potential of 3D printing combined with some form of nanotechnology that could reassemble matter into food or other essential living items. That is starting to step into science fiction, but we should consider the broader potential of these new technologies before we decide to arbitrarily limit them based on traditional views of copyright, as we are already starting to see.

By massively distributing publishing, communications, monitoring and enforcement, and with the coming potential massive distribution of property, technology and the Internet have created an ad hoc, self-determined and grassroots power base that challenges traditional power structures and governments.

With great power…

Individuals online find themselves more empowered and self-determined than ever before, regardless of the socio-political nature of their circumstances. They can share and seek information directly from other individuals, bypassing traditional gatekeepers of knowledge. They can coordinate with like-minded citizens both nationally and internationally and establish communities of interest that transcend geo-politics. They can monitor elected officials, bureaucrats, companies and other individuals, and even hold them all to account.

To leverage these opportunities fully requires a reasonable amount of technical literacy. As such, many technologists are on the front line, playing a special role in supporting, challenging and sometimes overthrowing modern power structures. As technical literacy permeates mainstream culture, more individuals are able to leverage these disrupters, but technologist activists are often the most effective at disrupting power through the use of technology and the Internet.

Of course, whilst the Internet is a threat to traditional centralised power structures, it also presents an unprecedented opportunity to leverage the skills, knowledge and efforts of an entire society in the running of government, for the benefit of all. Citizen engagement in democracy and government beyond the ballot box presents the ability to co-develop, or co-design the future of the society, including the services and rules that support stability and prosperity. Arguably, citizen buy-in and support is now an important part of the stability of a society and success of a policy.

Disrupting the status quo

The combination of improved capacity for self-determination by individuals, along with the increasingly pervasive assumptions of equality and rights, has led to many examples of traditional power structures being held to account, challenged, and in some cases, overthrown.

Governments are able to be held more strongly to account than ever before. The Open Australia Foundation is a small group of technologists in Australia who create tools to improve transparency and citizen engagement in the Australian democracy. They created Open Australia, a site that made the public parliamentary record more accessible to individuals through making it searchable, subscribable and easy to browse and comment on. They also have projects such as Planning Alerts which notifies citizens of planned development in their area, Election Leaflets where citizens upload political pamphlets for public record and accountability, and Right to Know, a site to assist the general public in pursuing information and public records from the government under Freedom of Information. These are all projects that monitor, engage and inform citizens about government.

Wikileaks is a website and organisation that provides a way for individuals to anonymously leak sensitive information, often classified government information. Key examples include video and documents from the Iraq and Afghanistan wars, about the Guantanamo Bay detention camp, United States diplomatic cables and millions of emails from Syrian political and corporate figures. Some of the information revealed by Wikileaks has had quite dramatic consequences, with the media and citizens around the world responding to the information. Arguably, many of the Arab Spring uprisings throughout the Middle East from December 2010 were provoked by the release of the US diplomatic cables by Wikileaks, as it demonstrated very clearly the level of corruption in many countries. The Internet also played a vital part in many of these uprisings, some of which saw governments deposed, as social media tools such as Twitter and Facebook provided the mechanism for massive coordination of protests, but importantly also provided a way to get citizen coverage of the protests and police/army brutality, creating a global audience, commentary and pressure on the governments, and support for the protesters.

Citizen journalism is an interesting challenge to governments because the route to communicate with the general public has traditionally been through the media. For many years the media presented a reasonably predictable mechanism for governments to communicate an official statement and shape the public narrative. But the Internet has enabled any individual to publish online to a global audience, and this has resulted in a much more robust exchange of ideas and a less clear cut public narrative about any particular issue, sometimes directly challenging official statements. A particularly interesting case of this was the Salam Pax blog during the 2003 Iraq invasion by the United States. Official news from the US would largely talk about the success of the campaign to overthrow Saddam Hussein. The Salam Pax blog provided the view of a 29 year old educated Iraqi architect living in Baghdad and experiencing the invasion as a citizen, which contrasted quite significantly at times with official US Government reports. This type of contrast will continue to be a challenge to governments.

On the flip side, the Internet has also provided new ways for governments themselves to support and engage citizens. There has been the growth of a global open government movement, where governments themselves try to improve transparency, public engagement and services delivery using the Internet. Open data is a good example of this, with governments going above and beyond traditional freedom of information obligations to proactively release raw data online for public scrutiny. Digital services allow citizens to interact with their government online rather than the inconvenience of having to physically attend a shopfront. Many governments around the world are making public commitments to improving the transparency, engagement and services for their citizens. We now also see more politicians and bureaucrats engaging directly with citizens online through the use of social media, blogs and sophisticated public consultation tools. Governments have become, in short, more engaged, more responsive and more accountable to more people than ever before.

Conclusion

Only in recent centuries have power structures emerged with a specific role for common individual citizens. The relationship between individuals and power structures has long been about the balance between what the power could enforce and what the population would accept. With the emergence of power structures that support and enshrine the principles of equality and human rights, individuals around the world have come to expect the capacity to determine their own future. The growth and proliferation of democracy has been a key shift in how individuals relate to power and governance structures.

New technologies and the Internet have gone on to massively distribute the traditionally centralised powers of publishing, communications, monitoring and enforcement (with property on the way). This distribution of power through the means of technology has seen democracy evolve into something of a technocracy, a system which has effectively tipped the balance of power from institutions to individuals.

References

Hobbes, T. The Leviathan, ed. by R. Tuck, Cambridge University Press, 1991.

Aquinas, T. Sum. Theol. i-ii. 105. 1, trans. A. C. Pegis, Whether the old law enjoined fitting precepts concerning rulers?

Uzgalis, William, “John Locke”, The Stanford Encyclopedia of Philosophy (Fall 2012 Edition), Edward N. Zalta (ed.), http://plato.stanford.edu/archives/fall2012/entries/locke/.

See additional useful references linked throughout the essay.

March 05, 2015

Managing CoreOS Logs with Logentries

Today Logentries announced a CoreOS integration, so CoreOS users can get a deeper understanding of their CoreOS environments. The new integration enables CoreOS users to easily send logs using the Journal logging system, part of CoreOS’ Systemd process manager, directly into Logentries for real-time monitoring, alerting, and data visualization. This is the first CoreOS log management integration.

To learn more about centralizing logs from CoreOS clusters, read the post by Trevor Parsons, co-founder and chief scientist at Logentries. Or get started by following the documentation here.

March 03, 2015

Upcoming CoreOS Events in March

March brings a variety of events – including a keynote from Alex Polvi (@polvi), CEO of CoreOS, at Rackspace Solve. Read on for more details on the team’s whereabouts this month.

In case you missed it, Alex keynoted at The Linux Foundation Collab Summit last month. See the replay.


Tuesday, March 3, 2015 at 6 p.m. EST – Montreal, QC

The month kicks off with Jake Moshenko (@JacobMoshenko), product manager for the Quay.io container registry at CoreOS, at Big Data Montreal. Jake will discuss Rocket and how CoreOS and Quay.io fit into the development lifecycle.


Wednesday, March 4, 2015 at 1:30 p.m. PST – San Francisco, CA

Join us at Rackspace Solve to see Alex Polvi (@polvi), CEO of CoreOS, speak about Container Technology: Applications and Implications. Registration is free and there will be talks from Wikimedia, Walmart Labs, DigitalFilmTree, Tinder and more.


Tuesday, March 10, 2015 at 7 p.m. GMT – London, England

If you find yourself in London, be sure to stop by the CoreOS London meetup. They are currently confirming speakers for the event. If you are interested in speaking be sure to submit on github.


Monday, March 16, 2015 at 7 p.m. PDT – San Francisco, CA

Join the CoreOS San Francisco March Meetup at Imgur (@imgur). On the agenda: Rocket, appc spec and etcd. Chris Winslett from Compose.io (@composeio) will also explain how Compose.io uses etcd as its “repository of truth.”


Friday, March 27, 2015 at 6:45 p.m. CDT – Pflugerville, TX

Brian “Redbeard” Harrington (@brianredbeard), is an opening speaker at Container Days Austin. Container Days provides a forum for all interested in the technical, process and production ramifications of adopting container style virtualization. Get your tickets here.


AirPair Writing Competition

On another note, this month CoreOS has joined the AirPair $100K writing competition. For more details, please see the contest site: https://www.airpair.com/100k-writing-competition.

If you have a CoreOS, etcd or Rocket implementation, tutorial or use case, now is a great time to share. How do you apply CoreOS for automatic server updates? Do you have any stories about your implementation of etcd to keep an application up when a server needs maintenance or goes down? Any exciting ways you have applied Rocket, the first container runtime based on the Application Container specification?

If you are interested in writing about your experiences with CoreOS, etcd or Rocket, email press@coreos.com and we will give you support to make your post a success. More details about the competition are here.

March 02, 2015

Verification Challenge 3: cbmc

The first and second verification challenges were directed to people working on verification tools, but this one is instead directed at developers.



It turns out that there are a number of verification tools that have seen heavy use. For example, I have written several times about Promela and spin (here, here, and here), which I have used from time to time over the past 20 years. However, this tool requires that you translate your code to Promela, which is not conducive to use of Promela for regression tests.



For those of us working in the Linux kernel, it would be nice to have a verification tool that operated directly on C source code. And there are tools that do just that, for example, the C Bounded Model Checker (cbmc). This tool, which is included in a number of Linux distributions, converts a C-language input file into a (possibly quite large) logic expression. This expression is constructed so that if any combination of variables causes the logic expression to evaluate to true, then (and only then) one of the assertions can be triggered. This logic expression is then passed to a SAT solver, and if this SAT solver finds a solution, then there is a set of inputs that can trigger the assertion. The cbmc tool is also capable of checking for array-bounds errors and some classes of pointer misuse.
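
As a quick, stand-alone illustration of what “operating directly on C source code” means (this example is mine, not part of the original challenge), consider the following file, call it sort2.c. It relies on the common cbmc convention that a declared-but-undefined nondet_*() function returns an unconstrained value of its return type, so a plain run of cbmc sort2.c should check the assertion for every possible pair of int values:

#include <assert.h>

int nondet_int(void);  /* no body: cbmc treats the return value as unconstrained */

/* Swap *a and *b if they are out of order. */
static void sort2(int *a, int *b)
{
  if (*a > *b) {
    int t = *a;

    *a = *b;
    *b = t;
  }
}

int main(void)
{
  int x = nondet_int();
  int y = nondet_int();

  sort2(&x, &y);
  assert(x <= y);  /* checked for all possible x and y */
  return 0;
}

If all goes well, cbmc should report VERIFICATION SUCCESSFUL here, and deliberately breaking sort2() should instead produce VERIFICATION FAILED along with a counterexample trace.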



Current versions of cbmc can handle some useful tasks. For example, suppose it was necessary to reverse the sense of the if condition in the following code fragment from Linux-kernel RCU:



if (rnp->exp_tasks != NULL ||
    (rnp->gp_tasks != NULL &&
     rnp->boost_tasks == NULL &&
     rnp->qsmask == 0 &&
     ULONG_CMP_GE(jiffies, rnp->boost_time))) {
  if (rnp->exp_tasks == NULL)
    rnp->boost_tasks = rnp->gp_tasks;
  /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
  t = rnp->boost_kthread_task;
  if (t)
    rcu_wake_cond(t, rnp->boost_kthread_status);
} else {
  rcu_initiate_boost_trace(rnp);
  /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
}




This is a simple application of De Morgan's law, but an error-prone one, particularly if carried out in a distracting environment. Of course, to test a validation tool, it is best to feed it buggy code to see if it detects those known bugs. And applying De Morgan's law in a distracting environment is an excellent way to create bugs, as you can see below:



if (rnp->exp_tasks == NULL &&
    (rnp->gp_tasks == NULL ||
     rnp->boost_tasks != NULL ||
     rnp->qsmask != 0 &&
     ULONG_CMP_LT(jiffies, rnp->boost_time))) {
  rcu_initiate_boost_trace(rnp);
  /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
} else {
  if (rnp->exp_tasks == NULL)
    rnp->boost_tasks = rnp->gp_tasks;
  /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
  t = rnp->boost_kthread_task;
  if (t)
    rcu_wake_cond(t, rnp->boost_kthread_status);
}




Of course, a full exhaustive test is infeasible, but structured testing would result in a manageable number of test cases. However, we can use cbmc to do the equivalent of a full exhaustive test, despite the fact that the number of combinations is on the order of two raised to the power 1,000. The approach is to create task_struct and rcu_node structures that contain only those fields that are used by this code fragment, but that also contain flags that indicate which functions were called and what their arguments were. This allows us to wrapper both the old and the new versions of the code fragment in their respective functions, and call them in sequence on different instances of identically initialized task_struct and rcu_node structures. These two calls are followed by an assertion that checks that the return value and the corresponding fields of the structures are identical.



This approach results in checkiftrans-1.c (raw C code here). Lines 5-8 show the abbreviated task_struct structure and lines 13-22 show the abbreviated rcu_node structure. Lines 10, 11, 24, and 25 show the instances. Lines 27-31 record a call to rcu_wake_cond() and lines 33-36 record a call to rcu_initiate_boost_trace().



Lines 38-49 initialize a task_struct/rcu_node structure pair. The rather unconventional use of the argv[] array works because cbmc assumes that this array contains random numbers. The old if statement is wrappered by do_old_if() on lines 51-71, while the new if statement is wrappered by do_new_if() on lines 73-93. The assertion is in check() on lines 95-107, and finally the main program is on lines 109-118.



Running cbmc checkiftrans-1.c gives this output, which prominently features VERIFICATION FAILED at the end of the file. On lines 4, 5, 12 and 13 of the file are complaints that neither ULONG_CMP_GE() nor ULONG_CMP_LT() is defined. Lacking definitions for these two functions, cbmc seems to treat them as random-number generators, which could of course cause the two versions of the if statement to yield different results. This is easily fixed by adding the required definitions:



#define ULONG_MAX         (~0UL)
#define ULONG_CMP_GE(a, b)  (ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b)  (ULONG_MAX / 2 < (a) - (b))




This results in checkiftrans-2.c (raw C code here). However, running cbmc checkiftrans-2.c gives this output, which still prominently features VERIFICATION FAILED at the end of the file. At least there are no longer any complaints about undefined functions!



It turns out that cbmc provides a counterexample in the form of a traceback. This traceback clearly shows that the two instances executed different code paths, and a closer examination of the two representations of the if statement shows that I forgot to convert one of the && operators to a ||—that is, the “rnp->qsmask != 0 &&” on line 84 should instead be “rnp->qsmask != 0 ||”. Making this change results in checkiftrans-3.c (raw C code here). The inverted if statement is now as follows:



if (rnp->exp_tasks == NULL &&
    (rnp->gp_tasks == NULL ||
     rnp->boost_tasks != NULL ||
     rnp->qsmask != 0 ||
     ULONG_CMP_LT(jiffies, rnp->boost_time))) {
  rcu_initiate_boost_trace(rnp);
  /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
} else {
  if (rnp->exp_tasks == NULL)
    rnp->boost_tasks = rnp->gp_tasks;
  /* raw_spin_unlock_irqrestore(&rnp->lock, flags); */
  t = rnp->boost_kthread_task;
  if (t)
    rcu_wake_cond(t, rnp->boost_kthread_status);
}




This time, running cbmc checkiftrans-3.c produces this output, which prominently features VERIFICATION SUCCESSFUL at the end of the file. Furthermore, this verification consumed only about 100 milliseconds on my aging laptop. And, even better, because it refused to verify the buggy version, we have at least some reason to believe it!



Of course, one can argue that doing such work carefully and in a quiet environment would eliminate the need for such verification, and 30 years ago I might have emphatically agreed with this argument. I have since learned that ideal work environments are not always as feasible as we might like to think, especially if there are small children (to say nothing of adult-sized children) in the vicinity. Besides which, human beings do make mistakes, even when working in ideal circumstances, and if we are to have reliable software, we need some way of catching these mistakes.



The canonical pattern for using cbmc in this way is as follows:



retref = funcref(...);
retnew = funcnew(...);
assert(retref == retnew && ...);




The ... sequences represent any needed arguments to the calls and any needed comparisons of side effects within the assertion.
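
For a concrete (and entirely hypothetical, not taken from the RCU example above) instance of this pattern, suppose funcref() were a straightforward min() and funcnew() a proposed branchless rewrite. A minimal harness might look like the following, again leaning on the nondet_*() convention for unconstrained inputs:

#include <assert.h>

int nondet_int(void);  /* unconstrained value, per the usual cbmc convention */

/* Reference version. */
static int min_ref(int x, int y)
{
  return x < y ? x : y;
}

/* Proposed branchless rewrite to be checked against the reference. */
static int min_new(int x, int y)
{
  return y ^ ((x ^ y) & -(x < y));
}

int main(void)
{
  int x = nondet_int();
  int y = nondet_int();

  /* The canonical pattern: run both versions on identical inputs and
   * assert that the results (and any side effects, here none) agree. */
  assert(min_ref(x, y) == min_new(x, y));
  return 0;
}

Running cbmc on this file should report VERIFICATION SUCCESSFUL, and perturbing min_new() should instead yield a counterexample, just as in the checkiftrans example above.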



Of course, there are limitations:





  1. The “b” in cbmc stands for “bounded.” In particular, cbmc handles neither infinite loops nor infinite recursion. The --unwind and --depth arguments to cbmc allow you to control how much looping and recursion is analyzed. See the manual for more information, and see the small loop example following this list.

  2. The SAT solvers used by cbmc have improved greatly over the past 25 years. In fact, where a 100-variable problem was at the edge of what could be handled in the 1990s, most ca-2015 solvers can handle more than a million variables. However, the NP-complete nature of SAT does occasionally make its presence known, for example, programs that reduce to a proof involving the pigeonhole principle are not handled well as of early 2015.

  3. Handling of concurrency is available in later versions of cbmc, but is not as mature as is the handling of single-threaded code.
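
To make the first limitation a bit more concrete, here is a small (again, hypothetical) example containing a loop that cbmc must unwind. The loop body executes at most 8 * sizeof(unsigned) times, so on a system with 32-bit unsigned something like cbmc --unwind 33 --unwinding-assertions count.c should suffice; the exact flags and their behavior are described in the cbmc manual and may vary between versions:

#include <assert.h>

unsigned nondet_uint(void);  /* unconstrained value, per the usual cbmc convention */

int main(void)
{
  unsigned x = nondet_uint();
  int bits = 0;

  /* Count the one-bits in x: at most 8 * sizeof(unsigned) iterations. */
  while (x) {
    bits += x & 1;
    x >>= 1;
  }
  assert(bits >= 0 && bits <= (int)(8 * sizeof(unsigned)));
  return 0;
}

Choosing too small an unwinding bound without --unwinding-assertions can silently ignore the later iterations, which is exactly the sort of thing that the “bounded” in cbmc’s name is warning about.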





All that aside, everything has its limitations, and cbmc's ease of use is quite impressive. I expect to continue to use it from time to time, and strongly recommend that you give it a try!

February 24, 2015

The Day Is My Enemy

Looking forward to see the live performance at the Future Music Festival 2015 :-)

February 23, 2015

Sahana This Week

It’s a busy week for Sahana around the world! Fran Boon, the Technical Lead for the Sahana software project, is delivering a SahanaCamp training workshop for the Civil Society Disaster Platform, a coalition of disaster management organizations in Turkey. This workshop [Read the Rest...]

February 21, 2015

Confessions of a Recovering Proprietary Programmer, Part XIV

Although junk mail, puppies, and patches often are unwelcome, there are exceptions. For example, if someone has been wanting a particular breed of dog for some time, that person might be willing to accept a puppy, even if that means giving it shots, housebreaking it, teaching it the difference between furniture and food, doing bottlefeeding, watching over it day and night, and even putting up with some sleepless nights.



Similarly, if a patch fixes a difficult and elusive bug, the maintainer might be willing to apply the patch by hand, fix build errors and warnings, fix a few bugs in the patch itself, run a full set of tests, fix any style problems, and even accept the risk that the bug might have unexpected side effects, some of which might result in some sleepless nights. This in fact is one of the reasons for the common advice given to open-source newbies: start by fixing bugs.



Other good advice for new contributors can be found here:





  1. Greg Kroah-Hartman's HOWTO do Linux kernel development – take 2 (2005)

  2. Jonathan Corbet's How to Participate in the Linux Community (2008)

  3. Greg Kroah-Hartman's Write and Submit your first Linux kernel Patch (2010)

  4. My How to make a positive difference in a FOSS project (2012)

  5. Daniel Lezcano's What do we mean by working upstream: A long-term contributor’s view





This list is mostly about contributing to the Linux kernel, but most other projects have similar pages giving good new-contributor advice.

February 20, 2015

Kickstart new developers using Docker – Linux.conf.au 2015

One of the talks I gave at Linux.conf.au this year was a quick-start guide to using Docker.

The slides begin with building Apache from source on your local host, using their documentation, and then show how much simpler it is if, instead of documentation, the project provides a Dockerfile. I quickly gloss over making a slim production container from that large development container – see my other talk, which I’ll blog about a little later.

The second example is using a Dockerfile to create and execute a test environment, so everyone can replicate identical test results.

Finally, I end with a quick example of fig (Docker Compose), and running GUI applications in containers.

the Slides

February 19, 2015

Using Sahana to Support Volunteer Technical Communities

There are a lot of similarities between traditional disaster management organizations and volunteer technical communities such as Sahana’s – especially when you look at our operations from an information management perspective. We collaborate on projects with partner organizations, often breaking the [Read the Rest...]

February 13, 2015

App Container and Docker

A core principle of the App Container (appc) specification is that it is open: multiple implementations of the spec should exist and be developed independently. Even though the spec is young and pre-1.0, it has already seen a number of implementations.

With this in mind, over the last few weeks we have been working on ways to make appc interoperable with the Docker v1 Image format. As we discovered, the two formats are sufficiently compatible that Docker v1 Images can easily be run alongside appc images (ACIs). Today we want to describe two different demonstrations of this interoperability, and start a conversation about closer integration between the Docker and appc communities.

rkt Running Docker Images

rkt is an App Container implementation that fully implements the current state of the spec. This means it can download, verify and run App Container Images (ACIs). And now, along with ACI support, the latest release of rkt, v0.3.2, can download and run container images directly from the Docker Hub or any other Docker Registry:

$ rkt --insecure-skip-verify run docker://redis docker://tenstartups/redis-commander
rkt: fetching image from docker://redis
rkt: warning: signature verification has been disabled
Downloading layer: 511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158
…
      _.-``    `.  `_.  ''-._           Redis 2.8.19 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in stand alone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 3
...
[3] 12 Feb 09:09:19.071 # Server started, Redis version 2.8.19
# redis will be  running on 127.0.0.1:6379 and redis-commander on 127.0.0.1:8081

Docker Running App Container Images

At the same time as adding Docker support to rkt, we have also opened a pull-request that enables Docker to run appc images (ACIs). This is a simple functional PR that includes many of the essential features of the image spec. Docker API operations such as image list, run image by appc image ID and more work as expected and integrate with the native Docker experience. As a simple example, downloading and running an etcd ACI works seamlessly with the addition of this patchset:

$ docker pull --format aci coreos.com/etcd:v2.0.0
$ docker run --format aci coreos.com/etcd
2015/02/12 11:21:05 no data-dir provided, using default data-dir ./default.etcd
2015/02/12 11:21:05 etcd: listening for peers on http://localhost:2380
2015/02/12 11:21:05 etcd: listening for peers on http://localhost:7001

For more details, check out the PR itself.

Docker and App Container: Looking forward

We think App Container represents the next logical iteration in what a container image format, runtime engine, and discovery protocol should look like. App Container is young but we want to continue to get wider community feedback and see the spec evolve into something that can work for a number of runtimes.

Before appc spec reaches 1.0 (stable) status, we would like feedback from the Docker community on what might need to be modified in the spec in order for it to be supported natively in Docker. To gather feedback and start the discussion, we have put up a proposal to add appc support to Docker.

We are looking forward to getting additional feedback from the Docker community on this proposal. Working together, we can create a better appc spec for everyone to use, and over time, work towards a shared standard.

Join us on a mission to create a secure, composable, and standards-based container runtime. If you are interested in hacking on rkt or App Container we encourage you to get involved:

rkt

Help Wanted

Mailing list

App Container

Help Wanted

Mailing list

If you want more background on the appc spec, we encourage you to read our first blog post about the App Container spec and Rocket. Also read more in a recent Q&A with OpenSource.com.

February 11, 2015

Confessions of a Recovering Proprietary Programmer, Part XIII

True confession: I was once a serial junk mailer. Not mere email spam, but physical bulk-postage-rate flyers, advertising a series of non-technical conferences. It was of course for a good cause, and one of the most difficult things about that task was convincing that cause's leaders that this flyer was in fact junk mail. They firmly believed that anyone with even a scrap of compassion would of course read the flyer from beginning to end, feeling the full emotional impact of each and every lovingly crafted word. They reluctantly came around to my view, which was that we had at most 300 milliseconds to catch the recipient's attention, that being the amount of time that the recipient might glance at the flyer on its way into the trash. Or at least I think that they came around to my view. All I really know is that they stopped disputing the point.



But junk mail for worthy causes is not the only thing that can be less welcome than its sender might like.



For example, Jim Wasko noticed a sign at a daycare center that read: “If you are late picking up your child and have not called us in advance, we will give him/her an espresso and a puppy. Have a great day.”



Which goes to show that although puppies are cute and lovable, and although their mother no doubt went to a lot of trouble to bring them into this world, they are, just like junk mail, not universally welcome. And this should not be too surprising, given the questions that come to mind when contemplating a free puppy. Has it had its shots? Is it housebroken? Has it learned that furniture is not food? Has it been spayed/neutered? Is it able to eat normal dogfood, or does it still require bottlefeeding? Is it willing to entertain itself for long periods? And, last, but most definitely not least, is it willing to let you sleep through the night?



Nevertheless, people are often surprised and bitterly disappointed when their offers of free puppies are rejected.



Other people are just as surprised and disappointed when their offers of free patches are rejected. After all, they put a lot of work into their patches, and they might even get into trouble if the patch isn't eventually accepted.



But it turns out that patches are a lot like junk mail and puppies. They are greatly valued by those who produce them, but often viewed with great suspicion by the maintainers receiving them. You see, the thought of accepting a free patch also raises questions. Does the patch apply cleanly? Does it build without errors and warnings? Does it run at all? Does it pass regression tests? Has it been tested with the commonly used combination of configuration parameters? Does the patch have good code style? Is the patch maintainable? Does the patch provide a straightforward and robust solution to whatever problem it is trying to solve? In short, will this patch allow the maintainer to sleep through the night?



I am extremely fortunate in that most of the RCU patches that I receive are “good puppies.” However, not everyone is so lucky, and I occasionally hear from patch submitters whose patches were not well received. They often have a long list of reasons why their patches should have been accepted, including:



  1. I put a lot of work into that patch, so it should have been accepted! Unfortunately, hard work on your part does not guarantee a perception of value on the maintainer's part.

  2. The maintainer's job is to accept patches. Maybe not, your maintainer might well be an unpaid volunteer.

  3. But my maintainer is paid to maintain! True, but he is probably not being paid to do your job.

  4. I am not asking him to do my job, but rather his/her job, which is to accept patches! The maintainer's job is not to accept any and all patches, but instead to accept good patches that further the project's mission.

  5. I really don't like your attitude! I put a lot of work into making this be a very good patch! It should have been accepted! Really? Did you make sure it applied cleanly? Did you follow the project's coding conventions? Did you make sure that it passed regression tests? Did you test it on the full set of platforms supported by the project? Does it avoid problems discussed on the project's mailing list? Did you promptly update your patch based on any feedback you might have received? Is your code maintainable? Is your code aligned with the project's development directions? Do you have a good reputation with the community? Do you have a track record of supporting your submissions? In other words, will your patch allow the maintainer to sleep through the night?

  6. But I don't have time to do all that! Then the maintainer doesn't have time to accept your patch. And most especially doesn't have time to deal with all the problems that your patch is likely to cause.



As a recovering proprietary programmer, I can assure you that things work a bit differently in the open-source world, so some adjustment is required. But participation in an open-source project can be very rewarding and worthwhile!

February 06, 2015

Announcing rkt and App Container v0.3.1

Today we're announcing the next release of Rocket and the App Container (appc) spec, v0.3.1.

rkt Updates

This release of rkt includes new user-facing features and some important changes under the hood which further make progress towards our goals of security and composability.

First, the rkt CLI has a couple of new commands:

  • rkt trust can be used to easily add keys to the public keystore for ACI signatures (introduced in the previous release). This supports retrieving public keys directly from a URL or using discovery to locate public keys - a simple example of the latter is rkt trust --prefix coreos.com/etcd. See the commit for other examples.

  • rkt list is a simple tool to list the containers on the system. It leverages the same file-based locking as rkt status and rkt gc to ensure safety during concurrent invocations of rkt.

As mentioned, v0.3.1 includes two significant changes to how rkt is built internally.

  • Instead of embedding the (default) stage1 using go-bindata, rkt now consumes a stage1 in the form of an actual ACI, containing a rootfs and stage1 init/enter binaries, via the --stage1-image flag. This makes it much more straightforward to use an alternative stage1 image with rkt and facilitates packaging for other distributions like Fedora.

  • rkt now vendors a copy of appc/spec instead of depending on HEAD. This means that rkt can be built in a self-contained and reproducible way and that master will no longer break in response to changes to the spec. It also makes explicit the specific version of the spec against which a particular release of rkt is compiled.

As a consequence of these two changes, it is now possible to use the standard Go workflow to build the rkt CLI (e.g. go get github.com/coreos/rocket/rkt). Note however that this does not implicitly build a stage1, so that will still need to be done using the included ./build script, or some other way for those desiring to use a different stage1.

App Container Updates

This week saw a number of interesting projects emerge that implement the App Container Spec. Please note, all of these are very early and actively seeking more contributors.

Nose Cone, an independent App Container Runtime

Nose Cone is an appc runtime that is built on top of the libappc C++ library that was released a few weeks ago. This project is only a few days old but you can find it up on GitHub. It makes no use of rkt, but implements the App Container specification. It is great to see this level of experimentation around the appc spec: having multiple, alternative runtimes with different goals is an important part of building a robust specification.

Tools for building ACIs

A few tools have emerged since last week for building App Container Images. All of these are very early and could use your contributions to help get them production ready.

docker2aci

A Dockerfile and the "docker build" command is a very convenient way to build an image, and many people already have existing infrastructure and pipelines around Docker images. To take advantage of this, the docker2aci tool and library takes an existing Docker image and generates an equivalent ACI. This means the container can now be run in any implementation of the appc spec.

$ docker2aci quay.io/lafolle/redis
Downloading layer: 511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158
...
Generated ACI(s):
lafolle-redis-latest.aci
$ rkt run lafolle-redis-latest.aci
[3] 04 Feb 03:56:31.186 # Server started, Redis version 2.8.8

goaci

While a Dockerfile is a very convenient way to build, it should not be the only way to create a container image. With the new experimental goaci tool, it is possible to build a minimal golang ACI without the need for any additional build environment. Example:

$ goaci github.com/coreos/etcd
Wrote etcd.aci
$ actool -debug validate etcd.aci
etcd.aci: valid app container image

Quay support

Finally, we have added experimental support for App Container Images to Quay.io, our hosted container registry. Test it out by pulling any public image using rkt:

$ rkt trust --prefix quay.io
Prefix: "quay.io"
Key: "https://quay.io/aci-signing-key"
GPG key fingerprint is: BFF3 13CD AA56 0B16 A898  7B8F 72AB F5F6 799D 33BC
    Quay.io ACI Converter (ACI conversion signing key) <support@quay.io>
Are you sure you want to trust this key (yes/no)? yes
$ rkt run quay.io/philips/golang-outyet
$ curl 127.0.0.1:8080

While these tools are very young, they are an important milestone towards our goals with appc. We are on a path to being able to create images with multiple, independent tools (from Docker conversion to native language tools), and have multiple ways to run them (with runtimes like rkt and Nose Cone). This is just the beginning, but a great early example of the power of open standards.

Join us on a mission to create a secure, composable, and standards-based container runtime. If you are interested in hacking on rkt or App Container we encourage you to get involved:

There is still much to do - onward!

February 05, 2015

Improving Coastal Resilience through Multi-agency Situational Awareness

Under a well-developed disaster management system, the Disaster Management Organization of a Country should be aware of and should map every significant emergency incident or risk in the country. Disseminating such information among multiple agencies with disparate systems can be [Read the Rest...]

February 03, 2015

Upcoming CoreOS Events in February

We’ve just come from FOSDEM ‘15 in Belgium and have an exciting rest of the month planned. We’ll be in Europe and the United States in February, and you can even catch Alex Polvi, CEO of CoreOS, keynoting at two events – TurboFest West (February 13) and Linux Collab Summit (February 18). Read more to see where we’ll be and meet us.

Also, thank you to all who hosted and attended our events last month. For questions or comments, contact us at press@coreos.com or tweet to us @CoreOSlinux.

Europe

See slides from the Config Management Camp 2015 talk by Kelsey Hightower (@kelseyhightower), developer advocate at CoreOS. He presented in Belgium on February 2 about Managing Containers at Scale with CoreOS and Kubernetes.


Tuesday, February 3 at 7 p.m. CET – Munich, Germany

Learn about CoreOS and Rocket at the Munich CoreOS meetup led by Brian Harrington/Redbeard (@brianredbeard), principal architect at CoreOS, and Jonathan Boulle (@baronboulle), senior engineer at CoreOS.


Tuesday, February 3 at 7 p.m. GMT – London, United Kingdom

See the first Kubernetes London meetup with Craig Box, solutions engineer for Google Cloud Platform, and Kelsey Hightower (@kelseyhightower), developer advocate at CoreOS. Attendees will be guided through the first steps with Kubernetes and Kelsey will discuss managing containers at scale with CoreOS and Kubernetes.


Thursday, February 5 at 7:00 p.m. CET – Frankfurt, Germany

Check out the DevOps Frankfurt meetup, where we will give a rundown on CoreOS and Rocket from Redbeard (@brianredbeard), principal architect at CoreOS, and Jonathan Boulle (@baronboulle), senior engineer at CoreOS.


Monday, February 9 at 7:00 p.m. CET – Berlin, Germany

Meet Jonathan Boulle (@baronboulle), senior engineer at CoreOS, at the CoreOS Berlin meetup to learn about Rocket and the App Container spec.

United States

Wednesday, February 4 at 6:00 p.m. – New York, New York

Come to our February CoreOS New York City meetup at Work-Bench, 110 Fifth Avenue on the 5th floor, where our team will discuss our new container runtime, Rocket, as well as new Quay.io features. In addition, Nathan Smith, head of engineering at Wink, www.wink.com, will walk us through how they are using CoreOS.


Monday, February 9 at 6:30 p.m. EST – New York, New York

The CTO School meetup will host an evening on Docker and the Linux container ecosystem. See Jake Moshenko (@JacobMoshenko), product manager at CoreOS, and Borja Burgos-Galindo, CEO & co-founder of Tutum, for an intro to containers and an overview on the ecosystem, followed by a presentation from Tom Leach and Travis Thieman of Gamechanger.


Friday, February 13 – San Francisco, California

See Alex Polvi, CEO of CoreOS, keynote at TurboFest West, a program of cloud and virtualization thought leadership discussions hosted by VMTurbo. Register for more details.


Tuesday, February 17 at 5:30 p.m. CST – Kansas City, Missouri

Redbeard (@brianredbeard), principal architect at CoreOS, will be kickin’ it with the Cloud KC meetup to go over CoreOS and Rocket. Thanks to C2FO for hosting this event.


Wednesday, February 18 at 10:00 a.m. PST – Santa Rosa, California

Alex Polvi, CEO of CoreOS, will present a keynote on Containers and the Changing Server Landscape at the Linux Collab Summit. See more about what Alex will discuss in a Q&A with Linux.com and tweet to us to meet at the event if you’ll be there.


Thursday, February 19 at 7:00 p.m. CST – Carrollton, Texas

Come to the Linux Containers & Virtualization meetup to meet Redbeard (@brianredbeard), principal architect at CoreOS, and learn about Rocket and the App Container spec.


February 19-February 22 – Los Angeles, California

Meet Jonathan Boulle (@baronboulle), senior engineer at CoreOS, at the SCALE 13x, the SoCal Linux Expo. Jon will present a session on Rocket and the App Container spec on Saturday, February 21 at 3:00 p.m. PT in the Carmel room.


More events will be added, so check back for updates here and at our community page!

In case you missed it, watch a webinar with Kelsey Hightower, developer advocate at CoreOS, and Matt Williams, DevOps evangelist at Datadog, on Managing CoreOS Container Performance for Production Workloads.

February 01, 2015

Parallel Programming: January 2015 Update

This release of Is Parallel Programming Hard, And, If So, What Can You Do About It? features a new chapter on SMP real-time programming, an updated formal-verification chapter, removal of a couple of the less-relevant appendices, several new cartoons (along with some refurbishing of old cartoons), and other updates, including contributions from Bill Pemberton, Borislav Petkov, Chris Rorvick, Namhyung Kim, Patrick Marlier, and Zygmunt Bazyli Krynicki.



As always, git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git will be updated in real time.

January 29, 2015

Verification Challenge 2: RCU NO_HZ_FULL_SYSIDLE

I suppose that I might as well get the “But what about Verification Challenge 1?” question out of the way to start with. You will find Verification Challenge 1 here.



Now, Verification Challenge 1 was about locating a known bug. Verification Challenge 2 takes a different approach: The goal is instead to find a bug if there is one, or to prove that there is no bug. This challenge involves the Linux kernel's NO_HZ_FULL_SYSIDLE functionality, which is supposed to determine whether or not all non-housekeeping CPUs are idle. The normal Linux-kernel review process located an unexpected bug (which was allegedly fixed), so it seemed worthwhile to apply some formal verification. Unfortunately, all of the tools that I tried failed. Not simply failed to verify, but failed to run correctly at all—though I have heard a rumor that one of the tools was fixed, and thus advanced to the “failed to verify” state, where “failed to verify” apparently meant that the tool consumed all available CPU and memory without deigning to express an opinion as to the correctness of the code.



So I fell back to 20-year-old habits and converted my C code to a Promela model and used spin to do a full-state-space verification. After some back and forth, this model did claim verification, and correctly refused to verify bug-injected perturbations of the model. Mathieu Desnoyers created a separate Promela model that made more deft use of temporal logic, and this model also claimed verification and refused to verify bug-injected perturbations. So maybe I can trust them. Or maybe not.



Unfortunately, regardless of whether or not I can trust these models, they are not appropriate for regression testing. How large a change to the C code requires a corresponding change to the Promela models? And how can I be sure that this corresponding change was actually the correct change? And so on: These kinds of questions just keep coming.



It would therefore be nice to be able to validate straight from the C code. So if you have a favorite verification tool, why not see what it can make of NO_HZ_FULL_SYSIDLE? The relevant fragments of the C code, along with both Promela models, can be found here. See the README file for a description of the files, and you know where to find me for any questions that you might have.



If you do give it a try, please let me know how it goes!

January 28, 2015

etcd 2.0 Release - First Major Stable Release

Today etcd hit v2.0.0, our first major stable release. Since the release candidate in mid-December, the team has been hard at work stabilizing the release. You can find the new binaries on GitHub.

For a quick overview, etcd is an open source, distributed, consistent key-value store for shared configuration, service discovery, and scheduler coordination. By using etcd, applications can ensure that even in the face of individual servers failing, the application will continue to work. etcd is a core component of CoreOS software that facilitates safe automatic updates, coordinating work being scheduled to hosts, and setting up overlay networking for containers.
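
As a quick illustration (the key name below is made up), basic reads and writes through etcd’s v2 HTTP API are just a couple of curl calls:

curl -L http://127.0.0.1:2379/v2/keys/message -XPUT -d value="hello"
curl -L http://127.0.0.1:2379/v2/keys/message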

New Updates

The etcd team has been hard at work to improve the ease-of-use and stability of the project. Some of the highlights compared to the last official release, etcd 0.4.6, include:

  • Internal etcd protocol improvements to guard against accidental misconfiguration
  • etcdctl backup was added to make recovering from cluster failure easier
  • etcdctl member list/add/remove commands for easily managing a cluster
  • On-disk datastore safety improvements with CRC checksums and append-only behavior
  • An improved Raft consensus implementation already used in other projects like CockroachDB
  • More rigorous and faster-running tests of the underlying Raft implementation, covering all state machine states and cases explained in the original Raft white paper, in 1.5 seconds
  • Additional administrator focused documentation explaining common scenarios
  • Official IANA-assigned ports for etcd: TCP 2379/2380

The major goal has been to make etcd more usable and stable with all of these changes. Over the hundreds of pull requests merged to make this release, many other improvements and bug fixes have been made. Thank you to the 150 contributors who have helped etcd get where it is today and provided those bug fixes, pull requests and more.
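
As a rough sketch of the new administrative commands mentioned in the highlights above (the member name, peer URL and paths are hypothetical):

etcdctl member list
etcdctl member add infra3 http://10.0.1.13:2380
etcdctl backup --data-dir /var/lib/etcd --backup-dir /var/backups/etcd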

Who uses etcd?

Many projects use etcd - Google’s Kubernetes, Pivotal’s Cloud Foundry, Mailgun and now Apache Mesos and Mesosphere DCOS too. In addition to these projects, there are more than 500 projects on GitHub using etcd. The feedback from these application developers continues to be an important part of the development cycle; thank you for being involved.

Direct quotes from people using etcd:

"We evaluated a number of persistent stores, yet etcd’s HTTP API and strong Go client support was the best fit for Cloud Foundry," said Onsi Fakhouri, engineering manager at Pivotal. "Anyone currently running a recent version of Cloud Foundry is running etcd. We are big fans of etcd and are excited to see the rapid progress behind the key-value store."

"etcd is an important part of configuration management and service discovery in our infrastructure," said Sasha Klizhentas, lead engineer at Mailgun. "Our services use etcd for dynamic load-balancing, leader election and canary deployment patterns. etcd’s simple HTTP API helps make our infrastructure reliable and distributed."

"Shared configuration and shared state are two very tricky domains for distributed systems developers as services no longer run on one machine but are coordinated across an entire datacenter," said Benjamin Hindman, chief architect at Mesosphere and chair of Apache Mesos. "Apache Mesos and Mesosphere’s Datacenter Operating System (DCOS) will soon have a standard plugin to support etcd. Users and customers have asked for etcd support, and we’re delivering it as an option."

Get Involved and Get Started

After nearly two years of diligent work, we are eager to hear your continued feedback on etcd. We will continue to work to make etcd a fundamental building block for Google-like infrastructure that users can take off the shelf, build upon and rely on.

Brandon Philips speaking about etcd 2.0

CoreOS CTO Brandon Philips speaking about etcd 2.0 at the CoreOS San Francisco meetup:

Update on CVE-2015-0235, GHOST

The glibc vulnerability, CVE-2015-0235, known as “GHOST”, has been patched on CoreOS. If automatic updates are enabled (default configuration), your server should already be patched.

If automatic updates are disabled, you can force an update by running update_engine_client -check_for_update.

Currently, the auto-update mechanism only applies to the base CoreOS, not inside your containers. If your container was built from an older ubuntu base, for example, you’ll need to update the container and get the patch from ubuntu.
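
For a Debian- or Ubuntu-based container image, that usually means rebuilding the image on an updated base, or upgrading glibc inside the container along these lines:

apt-get update && apt-get install --only-upgrade libc6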

If you have any questions or concerns, please join us in #coreos on freenode IRC.

SE Linux Play Machine Over Tor

I work on SE Linux to improve security for all computer users. I think that my work has gone reasonably well in that regard in terms of directly improving security of computers and helping developers find and fix certain types of security flaws in apps. But a large part of the security problems we have at the moment are related to subversion of Internet infrastructure. The Tor project is a significant step towards addressing such problems. So to achieve my goals in improving computer security I have to support the Tor project. So I decided to put my latest SE Linux Play Machine online as a Tor hidden service. There is no real need for it to be hidden (for the record it’s in my bedroom), but it’s a learning experience for me and for everyone who logs in.

A Play Machine is what I call a system with root as the guest account with only SE Linux to restrict access.

Running a Hidden Service

A Hidden Service in Tor is just a cryptographically protected address that forwards to a regular TCP port. It’s not difficult to set up and the Tor project has good documentation [1]. For Debian the file to edit is /etc/tor/torrc.

I added the following 3 lines to my torrc to create a hidden service for SSH. I forwarded port 80 for test purposes because web browsers are easier to configure for SOCKS proxying than ssh.

HiddenServiceDir /var/lib/tor/hidden_service/

HiddenServicePort 22 192.168.0.2:22

HiddenServicePort 80 192.168.0.2:22

Generally when setting up a hidden service you want to avoid using an IP address that gives anything away. So it’s a good idea to run a hidden service on a virtual machine that is well isolated from any public network. My Play machine is hidden in that manner not for secrecy but to prevent it being used for attacking other systems.

SSH over Tor

Howtoforge has a good article on setting up SSH with Tor [2]. That has everything you need for setting up Tor for a regular ssh connection, but the tor-resolve program only works for connecting to services on the public Internet. By design the .onion addresses used by Hidden Services have no mapping to anything that resembles an IP address and tor-resolve breaks on them. I believe that the fact that tor-resolve breaks things in this situation is a bug, I have filed Debian bug report #776454 requesting that tor-resolve allow such things to just work [3].

Host *.onion

ProxyCommand connect -5 -S localhost:9050 %h %p

I use the above ssh configuration (which can go in ~/.ssh/config or /etc/ssh/ssh_config) to tell the ssh client how to deal with .onion addresses. I also had to install the connect-proxy package which provides the connect program.

ssh root@zp7zwyd5t3aju57m.onion

The authenticity of host ‘zp7zwyd5t3aju57m.onion ()’ can’t be established.

ECDSA key fingerprint is 3c:17:2f:7b:e2:f6:c0:c2:66:f5:c9:ab:4e:02:45:74.

Are you sure you want to continue connecting (yes/no)?

I now get the above message when I connect; the ssh developers have dealt with connecting via a proxy that doesn’t have an IP address.

Also see the general information page about my Play Machine, that information page has the root password [4].

January 23, 2015

rkt and App Container 0.2.0 Release

This week both rkt and the App Container (appc) spec have reached 0.2.0. Since our launch of the projects in December, both have been moving very quickly with a healthy community emerging. rkt now has cryptographic signing by default and a community is emerging around independent implementations of the appc spec. Read on for details on the updates.

rkt 0.2.0

Development on rkt has continued rapidly over the past few weeks, and today we are releasing v0.2.0. This important milestone release brings a lot of new features and improvements that enable securely verified image retrieval and tools for container introspection and lifecycle management.

Notably, this release introduces several important new subcommands:

  • rkt enter, to enter the namespace of an app within a container
  • rkt status, to check the status of a container and applications within it
  • rkt gc, to garbage collect old containers no longer in use

In keeping with rkt's goals of being simple and composable, we've taken care to implement these lifecycle-related subcommands without introducing additional daemons or databases. rkt achieves this by taking advantage of existing file-system and kernel semantics like advisory file-locking, atomic renames, and implicit closing (and unlocking) of open files at process exit.
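
A rough sketch of how these subcommands fit together (the UUID placeholder stands for whatever container ID rkt reported; exact flags may differ between releases):

rkt enter $UUID
rkt status $UUID
rkt gc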

v0.2.0 also marks the arrival of automatic signature validation: when retrieving an image during rkt fetch or rkt run, Rocket will verify its signature by default. Kelsey Hightower has written up an overview guide explaining this functionality. This signature verification is backed by a flexible system for storing public keys, which will soon be even easier to use with a new rkt trust subcommand. This is a small but important step towards our goal of rkt being as secure as possible by default.

Here's an example of the key validation in action when retrieving the latest etcd release (in this case the CoreOS ACI signing key has previously been trusted using the process above):

$ rkt fetch coreos.com/etcd:v2.0.0-rc.1
rkt: searching for app image coreos.com/etcd:v2.0.0-rc.1
rkt: fetching image from https://github.com/coreos/etcd/releases/download/v2.0.0-rc.1/etcd-v2.0.0-rc.1-linux-amd64.aci
Downloading aci: [=============================                ] 2.31 MB/3.58 MB
Downloading signature from https://github.com/coreos/etcd/releases/download/v2.0.0-rc.1/etcd-v2.0.0-rc.1-linux-amd64.sig
rkt: signature verified: 
  CoreOS ACI Builder <release@coreos.com>

App Container 0.2.0

The appc spec continues to evolve but is now stabilizing. Some of the major changes are highlighted in the announcement email that went out earlier this week.

This last week has also seen the emergence of two different implementations of the spec: jetpack (a FreeBSD/Jails-based executor) and libappc (a C++ library for working with app containers). The authors of both projects have provided extremely helpful feedback and pull requests to the spec, and it is great to see these early implementations develop!

Jetpack, App Container for FreeBSD

Jetpack is an implementation of the App Container Specification for FreeBSD. It uses jails as an isolation mechanism, and ZFS for layered storage. Jetpack is a great test of the cross platform portability of appc.

libappc, C++ library for App Container

libappc is a C++ library for doing things with app containers. The goal of the library is to be a flexible toolkit: manifest parsing and creation, pluggable discovery, image creation/extraction/caching, thin-provisioned file systems, etc.

Get involved

If you are interested in contributing to any of these projects, please get involved! A great place to start is issues in the Help Wanted label on GitHub. You can also reach out with questions and feedback on the Rocket and appc mailing lists:

rkt

App Container

In the SF Bay Area or NYC next week? Come to the meetups in each area to hear more about these changes and the future of rocket and appc. RSVP to the CoreOS NYC meetup and SF meetup to learn more.

Lastly, thank you to the community of contributors emerging around Rocket and App Container:

Alan LaMielle, Alban Crequy, Alex Polvi, Ankush Agarwal, Antoine Roy-Gobeil, azu, beadon, Brandon Philips, Brian Ketelsen, Brian Waldon, Burcu Dogan, Caleb Spare, Charles Aylward, Daniel Farrell, Dan Lipsitt, deepak1556, Derek, Emil Hessman, Eugene Yakubovich, Filippo Giunchedi, Ghislain Guiot, gprggr, Hector Fernandez, Iago López Galeiras, James Bayer, Jimmy Zelinskie, Johan Bergström, Jonathan Boulle, Josh Braegger, Kelsey Hightower, Keunwoo Lee, Krzesimir Nowak, Levi Gross, Maciej Pasternacki, Mark Kropf, Mark Lamourine, Matt Blair, Matt Boersma, Máximo Cuadros Ortiz, Meaglith Ma, PatrickJS, Pekka Enberg, Peter Bourgon, Rahul, Robo, Rob Szumski, Rohit Jnagal, sbevington, Shaun Jackman, Simone Gotti, Simon Thulbourn, virtualswede, Vito Caputo, Vivek Sekhar, Xiang Li

January 20, 2015

Meet us for our January 2015 events

CoreOS CTO Brandon Philips speaking at Linux Conf AU

January has been packed with meetups and events across the globe. So far, we’ve been to India, Switzerland, France, England and New Zealand.

Check out a CoreOS tutorial from Brandon Philips (@brandonphilips) at Linux Conf New Zealand.

Our team has been on a fantastic tour meeting CoreOS contributors and friends around the world. A special thank you to the organizers of those meetups and to all those who came out to the meetups and made us feel at home. Come join us at the following events this month:

Tuesday, January 27 at 11 a.m. PST – Online

Join us for a webinar on Managing CoreOS Container Performance for Production Workloads. Kelsey Hightower (@kelseyhightower) from CoreOS and Matt Williams from Datadog will discuss trends in container usage and show how container performance can be monitored, especially as the container deployments grow. Register here.


Tuesday, January 27 at 6 p.m. EST – New York, NY

Come to our January New York City meetup at Work-Bench, 110 Fifth Avenue on the 5th floor, where our team will discuss our new container runtime, Rocket, as well as new Quay.io features. In addition, Nathan Smith, head of engineering at Wink, www.wink.com, will walk us through how they are using CoreOS. Register here.


Tuesday, January 27 at 6 p.m. PST – San Francisco, CA

Our January San Francisco meetup is not-to-miss! We’ll discuss news and updates on etcd, Rocket and AppC. Register here.


Thursday, January 29 at 7 p.m. CET – Barcelona, Spain

Meet Brian Harrington, better known as Redbeard (@brianredbeard), for CoreOS: An Overview, at itnig. Dedicated VMs and configuration management tools are being replaced by containerization and new service management technologies like systemd. This meetup will give an overview of CoreOS, including etcd, schedulers (mesos, kubernetes, etc.), and containers (nspawn, docker, rocket). Understand how to use these new technologies to build performant, reliable, large distributed systems. Register here.


Saturday, January 31-Sunday, February 1 – Brussels, Belgium

Our team is attending FOSDEM ’15 to connect with developers and the open source community. See our talks and meet the team at our dev booth throughout the event.

  • Redbeard (@brianredbeard) will discuss How CoreOS is built, modified, and updated on Saturday at 1 p.m. CET.
  • Jon Boulle (@baronboulle) from our engineering team will discuss all things Go at CoreOS on Sunday at 9:05 a.m. CET.
  • Kelsey Hightower (@kelseyhightower), developer advocate at CoreOS, will give a talk on Rocket and the App Container Spec at 11:40 a.m. CET.

A special shout out to the organizers of those meetups - Fintan Ryan, Ranganathan Balashanmugam, Muharem Hrnjadovic, Frédéric Ménez, Richard Paul, ­Piotr Zurek, Patrick Heneise, Benjamin Reitzammer, Sunday Ogwu, Tom Martin, Chris Kuhl and Johann Romefort.

If you are interested in hosting an event of your own or inviting someone from CoreOS to speak, reach out to us at press@coreos.com.

Sahana @ linux.conf.au

Last week I was able to attend linux.conf.au which was being hosted in my home town of Auckland. This was a great chance to spend time with people from the open source community from New Zealand, Australia and around the [Read the Rest...]

‘Sup With The Tablet?

As I mentioned on Twitter last week, I’m very happy SUSE was able to support linux.conf.au 2015 with a keynote giveaway on Wednesday morning and sponsorship of the post-conference Beer O’Clock at Catalyst:

For those who were in attendance, I thought a little explanation of the keynote gift (a Samsung Galaxy Tab 4 8″) might be in order, especially given the winner came up to me during the post-conference drinks and asked “what’s up with the tablet?”

To put this in perspective, I’m in engineering at SUSE (I’ve spent a lot of time working on high availability, distributed storage and cloud software), and while it’s fair to say I represent the company in some sense simply by existing, I do not (and cannot) actually speak on behalf of my employer. Nevertheless, it fell to me to purchase a gift for us to provide to one lucky delegate sensible enough to arrive on time for Wednesday’s keynote.

I like to think we have a distinct engineering culture at SUSE. In particular, we run a hackweek once or twice a year where everyone has a full week to work on something entirely of their own choosing, provided it’s related to Free and Open Source Software. In that spirit (and given that we don’t make hardware ourselves) I thought it would be nice to be able to donate an Android tablet which the winner would either be able to hack on directly, or would be able to use in the course of hacking something else. So I’m not aware of any particular relationship between my employer and that tablet, but as it says on the back of the hackweek t-shirt I was wearing at the time:

Some things have to be done just because they are possible.

Not because they make sense.

 

January 16, 2015

Linux.conf.au 2015 – Day 5 – Session 3

NoOps with Ansible and Puppet – Monty Taylor

  • NoOps
    • didn’t know it was a contentious term
    • “devs can code and let a service deploy, manage and scale their code”
    • I want to change the system by landing commits. don’t want to “do ops”
    • if I have to use my root access it is a bug
  • Cloud Native
    • Ephemeral Compute
    • Data services
    • Design your applications to be resilient via scale out
    • Cloud scale out, forget HA for one system, forget long-lived system, shared-nothing for everything. Cloud provides the hard scale-out/HA/9s stuff
    • Great for new applications
  • OpenStack Infra
    • Tooling, automation, and CI for the openstack project
    • 2000 devs
    • every commit is fully tested.
    • each test runs on a single use cloud slave
    • 1.7 million test jobs in the last 6 months. 18 TB of log data
    • all runs in HP and rackspace public clouds
  • Create Servers manually at 1st
  • Step 1 – Puppet
    • extra hipster because it is in ruby
    • If you like ruby it is awesome. If you don’t, it is less awesome
    • collaboration from non-root users
    • code review
    • problem that it blows up when you try and install the same thing in two different places
    • 3 ways to run: masterless puppet apply; master + puppet agent daemon; master + puppet agent without the daemon
  • Secret stuff that you don’t want in your puppet git repo
    • hiera
  • Step 2 – Ansible for orchestration
    • Control the puppet agent so it runs nicely, on schedule, and on the correct hosts first
    • Open source system management tool
    • Sequence of steps not description of state like puppet
    • ad-hoc operation. run random commands (see the sketch after these notes)
    • easy to slowly grow over time till it takes over from puppet
    • yaml syntax of config files
  • Step 3 – Ansible for cloud management
  • Ansible config currently mixed in with puppet under – http://git.openstack.org/cgit/openstack-infra/system-config/
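
As a rough illustration of the ad-hoc style mentioned above (the inventory group name is hypothetical and this is not from the talk), a single Ansible command can drive puppet agent runs across a set of hosts:

ansible web -m command -a "puppet agent --test --noop" --forks 5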

 

Conference Closing

  • Steve Walsh wins Rusty Wrench award
  • Preview of Linux.conf.au 2016 in Geelong
    • Much flatter than Auckland
    • Deakin University – Waterfront Campus
    • Waurn Ponds student accommodation 15 minutes with shuttles
    • Feb 8th – 12th 2016
    • CFP 1st of June 2015
    • Theme “life is better with linux”
    • 4 keynotes confirmed or in final stages of discussion, 2 female, 2 male
    • NFS keytags
    • lcabythebay.org.au
  • Announcement for Linux.conf.au 2017 will be in Hobart

 


Linux.conf.au 2015 – Day 5 – Session 2

When Everything Falls Apart: Stories of Version Control System Scaling – Ben Kero

  • Sysadmin at Mozilla looking after VCS
  • Primarily covering mercurial
  • Background
    • Primarily mercurial
    • 3445 repos (1223 unique)
    • 32 million commits
    • 2TB+ transfer per day
    • 1000+ clones per day
    • Biggest customer = ourselves
    • tested platforms > 12
  • Also use  git (a lot) and a bit of:  subversion, CVS, Bazaar, RCS
  • 2 * ssh servers, 10 machines mirror http traffic behind load balancer
  • 1st story – know what you are hosting
    • Big git repo 1.7G somebody asked to move off github
    • Turned out to be mozilla git mirror, so important to move
    • plenty of spare resources
    • But high load straight away
    • turned out to be mercurial->git converter, huge load
    • Ran garbage collection – took several hours
    • tweaked some other settings
  • 2nd story
    • 2003 . “Try” CI system
    • Simple CI system (before the term existed or they were common)
    • flicks off to build server, sends status back to dev
    • mercurial had history being immutable up until v2.1 and mozilla was stuck on old version
    • ended up with 29,000 branches in repo
    • Around 10,000 heads some operations just start to fail
    • Wait times for pushes over 45 minutes. Manual fixes for this
    • process was “hg serve” only, just freezing up, with no debug info
    • had to attach debugging; it was trying to update the cache.
    • cache got nuked by cached push, long process to rebuild it.
    • mercurial bug 4255 in process of being looked at, no fix yet
  • The new system
    • More web-scalable to replace the old system
    • Closer to the pull-request model
    • multi-homing
    • leverage mercurial bundles
    • stores bundles in scalable object store
    • hopefully minimal retooling from other groups (lots of weird systems supported)
  • Planet release engineering @ mozilla

SL[AUO]B: Kernel memory allocator design and philosophy – Christopher Lameter

  • NOTE: I don’t do kernel stuff so much of this is over my head.
  • Role of the allocator
    • page allocator only works in full page size (4k) and is fairly slow
    • slab allocator for smaller allocation
    • SLAB is one of the “slab allocators”
  • kmem_cache, NUMA aware, etc
  • History
    • SLOB: K&R 1991-1999 . compact
    • SLAB: Solaris 199-2008 . cache friendly, benchmark friendly
    • SLUB: 2008-today , simple and instruction costs count, better debugging, defrag, execution time friendly
  • 2013 – work to split out common code for allocators
  • SLOB
    • manages the list of free objects within the space of the free objects
    • have to traverse list to find object of sufficient size
    • rapid fragmentation of memory
  • SLAB
    • queues per cpu and per node to track cache hotness
    • queues for each remote node
    • complete data structures
    • cold object expiration every 2 seconds on each CPU
    • large systems with LOTS of CPUs have huge amount of memory trapped, spending lots of time cleaning cache
  • SLUB
    • A lot less queuing
    • Pages associated with per-cpu. increased locality
    • page based policies and interleave
    • de-fragmentation on multiple levels
    • current default in the kernel
  • slabinfo tool for SLUB. tune, modify, query, control objects and settings
  • can be asked to go into debug mode even when debugging not enabled with rest of the kernel
  • Comparing
    • SLUB faster (SLAB good for benchmarks)
    • SLOB slow
    • SLOB less memory overhead for small/simple systems (only, doesn’t handle lots of reallocations that fragment)
  • Roadmap
    • More common framework
    • Various other speedups and features

 


Linux.conf.au 2015 – Day 5 – Session 1

How to get one of those Open Source jobs – Mark Atwood

  • Warns talk might still have some US-centric stuff in it
  • “Open Source Job” – most important word is “Job”
    • The Open Source bit means you are a bit more transferable than a closed-source programmer
    • Don’t have to move to major tech city
  • Communication skills
    • Have to learn to Write clearly in English
    • Have to learn how to speak, including in meetings and giving some talks
    • Reachable – Have a public email address
    • Don’t be a jerk, reputation very important
  • Technical skills
    • Learn how to program
    • Start with python and javascript
    • Learn other languages eg Scala, Erlang, Clojure, C, C++
    • How to use debugger and IDE
    • Learn to use git well
    • Learn how to test code (especially to work with CI testers like jenkins)
    • Idea: Do lots of simple practice problems in programming using a specific technique or language
  • Relationships & Peers
    • Work with people remote and nearby
    • stackoverflow
    • Don’t be a jerk
  • Work
    • Have to “do the work” then “get the job”
    • Start by fixing bugs on a project
    • Your skills will improve and others will see you have those skills
  • Collaborate
    • Many projects use IRC
    • Most projects have bug tracker
    • Learn how to use the non-basic stuff in git
    • Peer programming
  • Reputation
    • Portfolio vs resume
    • github account is your portfolio
    • Need to be on social media, at least a little bit, must be reachable
  • Getting the Job
    • If you have a good enough rep the jobs will seek you out
    • Keywords on github and linkedin will attract recruiters
    • People will suggest that you apply
    • Conferences like linux.conf.au
    • Remember to counter-offer the offer letter
    • Once you are working for them, work out what is job related and the company might have a claim on. Make sure you list in your agreement any projects you are already working on
  • Health
    • Don’t work longer than 40h a week regularly
    • 60h weeks can only be sustained for a couple of weeks
    • Don’t just eat junk-food
    • Don’t work for jerks
  • Money
    • Startups – bad for your health. Do not kill yourself for a nickel, have real equity
  • Keep Learning
  • 3 books to read
    • Oh, the Places You’ll Go! – Dr Seuss
    • Getting Things Done – David Allen
    • How to fail at almost everything and still win big – Scott Adams

 

Pettycoin: Towards 1.0 – Rusty Russell

  • Problem is that bitcoin mining is expensive, which places a lower limit on transaction fees
  • Took 6 months off to mostly work on pettycoin
  • Petty coin
    • Simple
    • gateway to bitcoin
    • small amounts
    • partial knowledge, don’t need to know everything
    • fast block times
  • Altcoins – bitcoin like things that are not bitcoin
    • 2 million posts to altcoin announce forum
    • lots of noise to talk to people
  • review
    • Paper released saying how it should have been done
    • hash functions
    • bitcoin blocks
    • Bitcoin transactions
  • Sidechain
    • alternative chains that use real bitcoins
    • Lots of wasted work? – bitcoin miners can mine other chains at the same time
    • too fast to keep notes
    • Compact SPV Proofs (reduce length of block headers to go all the way back)

 


January 15, 2015

Gender diversity in linux.conf.au speakers

My first linux.conf.au was 2003 and it was absolutely fantastic and I’ve been to every one since. Since I like this radical idea of equality and the LCA2015 organizers said there were 20% female speakers this year, I thought I’d look through the history.

So, since there isn’t M or F on the conference program, I have to guess. This probably means I get things wrong and have a bias. But, heck, I’ll have a go and this is my best guess (and mostly excludes miniconfs as I don’t have programmes for them)

  • 2003: 34 speakers: 5.8% women.
  • 2004: 46 speakers: 4.3% women.
  • 2005: 44 speakers: 4.5% women
  • 2006: 66 speakers: 0% women (somebody please correct me, there’s some non gender specific names without gender pronouns in bios)
  • 2007: 173 speakers: 12.1% women (and an order of magnitude more than previously). Includes miniconfs

    (didn’t have just a list of speakers, so this is numbers of talks and talks given by… plus some talks had multiple presenters)
  • 2008: 72 speakers: 16.6% women
  • 2009: 177 speakers (includes miniconfs): 12.4% women
  • 2010: 207 speakers (includes miniconfs): 14.5% women
  • 2011: 194 speakers (includes miniconfs): 14.4% women
  • 2012: (for some reason site isn’t responding…)
  • 2013: 188 speakers (includes most miniconfs), 14.4% women
  • 2014: 162 speakers (some miniconfs included): 19.1% women
  • 2015: As announced at the opening: 20% women.

Or, in graph form:

Sources:

  • the historical schedules up on linux.org.au.
  • my brain guessing the gender of names. This is no doubt sometimes flawed.

Update/correction: lca2012 had around 20% women speakers at main conference (organizers gave numbers at opening) and 2006 had 3 at sysadmin miniconf and 1 in main conference.

Linux.conf.au 2015 – Day 5 – Keynote/Panel

  • Everybody sang Happy Birthday to Bdale
  • Bdale said he has a new house and FreedomBox 0.3 release this week
  • Rusty also on the panel
  • Questions:
    • Why is Linus so mean
    • Unified Storage/Memory machines – from HP
    • Young people getting into community
    • systemd ( I asked this)
    • Year of the Linux Desktop
    • Documentation & training material
    • Predict the security problems in the next 12 months
    • Does NZ and Australia need a joint space agency
    • Will you be remembered more for Linux or Git?


Linux.conf.au 2015 – Day 4 – Session 3

Drupal8 outta the box – Donna Benjamin

  • I went to the first half of this but wanted to catch the talk below so I missed the 2nd part

 

Connecting Containers: Building a PaaS with Docker and Kubernetes – Katie Miller

  • co-presented with Steve Pousty
  • Plugs their OpenShift book; they are re-architecting the whole thing relative to what is in the book
  • Platform as a service
    • dev tooling, runtime, OS , App server, middleware.
    • everything except the application itself
    • Openshift is an example
  • Reasons to rebuild
    • New tech
    • Lessons learned from old deploy
  • Stack
    • Atomic + Docker + Kubernetes
  • Atomic
    • Red Hat’s answer to CoreOS
    • RPM-OSTree – atomic update to the OS
    • Minimal System
    • Fast boot, container management, good kernel
  • Containers
    • Docker
    • Nice way of specifying everything
    • Pros – portable, easy to create, fast boot
    • Cons – host centric, no reporting
    • Wins – BYOP ( each container brings all its dependencies ) , Standard way to make containers , Big eco-system
  • Kubernetes
    • system for managing containerized apps across multiple hosts
    • declarative model
    • open source by google
    • pod + service + label + replication controller
    • cluster = N*nodes + master(s) + etcd
    • Wins: Runtime and operation management + management related containers as a unit, container communication, available, scalable, automated, across multiple hosts
  • Rebuilding Openshift
    • Kubernetes provides container runtime
    • Openshift provides devops and team enviroment
  • Concepts
    • application = multiple pods linked together (front + back + db ) managed as a unit, scaled independently
    • config
    • template
    • build config = source + build -> image
    • deployment = image and settings for it
  • This is OpenShift v3 – things have been moving very fast so some docs are out of date
  • Slides http://containers.codemiller.com


Linux.conf.au 2015 – Day 4 – Session 2

Tunnels and Bridges: A drive through OpenStack Networking – Mark McClain

  • Challenges with the cloud
    • High density multi-tenancy
    • On demand provisioning
    • Need to place / move workloads
  • SDN , L2 fabric, network virtualisation Overlay tunneling
  • The Basics
    • The user sees the API, doesn’t matter too much what is behind
    • Neutron = Virtual subnet + L2 virtual network + virtual port
    • Nova = Server + interface on the server
  • Design Goals
    • Unified API
    • Small Core: Networks + Subnets + Ports (example commands after these notes)
    • Pluggable open architecture
  • Features
    • Overlapping IPs
    • Configuration DHCP/Metadata
    • Floating IPs
    • Security Groups ( Like AWS style groups ) . Ingress/egress rules, IPv6 . VMs with multiple VIFS
  • Deployment
    • Database + Neutron Server + Message Queue
    • L2 Agent , L3 agent + DHCP Agent
  • Server
    • Core
    • Plugin types =  Proxy (proxy to backend) or direct control (logic inside plugin)
    • ML2 – Modular Layer 2 plugin
  • Plugin extensions
    • Add to REST API
    • dhcp, l3, quota, security group, metering, allowed addresses
  • L2 Agent
    • Runs on a hypervisor
    • Watch and notify when devices have been added/removed
  • L3 agent – static routing only for now
  • Load balancing as a service, based on haproxy
  • VPN as a service , based on openswan, replicates AWS VPC.
  • What is new in Juno?
    • IPv6
    • based on radvd
    • Advised to go dual-stack
  • Look ahead to Kilo
    • Paying down technical debt
    • IPv6 prefix delegation, metadata service
    • IPAM – hook into external systems
    • Facilitate dynamic routing
    • Enabling NFV Applications
  • See Cloud Administrators Guide
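
As a rough sketch of what that small core API looks like from the command line (names and addresses are made up; assumes the Juno-era neutron CLI):

neutron net-create demo-net
neutron subnet-create demo-net 10.0.0.0/24 --name demo-subnet
neutron router-create demo-router
neutron router-interface-add demo-router demo-subnet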

 

Crypto Won’t Save You Either – Peter Gutmann

  • US Govt has capabilities against common encryption protocols
  • BULLRUN
  • Example Games consoles
    • Signed executables
    • encrypted storage
    • Full media and memory encryption
    • All of these have been hacked
  • Example – Replaced signature checking code
  • Example – Hacked “secure” kernel to attack the application code
  • Example – Modify firmware to load over the checking code
  • Example – Recover key from firmware image
  • Example – Spoof on-air update
  • LOTS of examples
  • Nobody noticed a bunch of DKIM keys were bad, because all attackers had bypassed the encryption rather than trying to beat the crypto
  • No. of times crypto broken: 0, bypassed: all the rest
  • National Security Letters – The Legalised form of rubber-hose cryptanalysis
  • Any well design crypto is NSA-proof
  • The security holes are sitting right next to the crypto

 


Linux.conf.au 2015 – Day 4 – Session 1

8 writers in under 8 months: from zero to a docs team in no time flat – Lana Brindley

  • Co Presenting with Alexandra Settle
  • 8 months ago only 1 documentation person at Rackspace
  • Hired a couple people
  • Horrible documentation suite
  • Hired some more
  • 4 in Australia, 4 in the US
  • Building a team fast without a terrible culture
    • Management by MEME – everybody had a meme created for them when they started
    • Not all work and No play. But we still get a lot of work done
    • Use tech to overcome geography
    • Treat people as humans not robots
    • Always stay flexible. Couch time, Gym time
  • Finding the right people
    • Work your network, the job is probably not going to be advertised on LinkedIn, bad for diversity
    • Find great people, and work out how to hire them
    • If you do want a job, network
  • Toolchains and Systems
    • Have a vision and work towards it
    • acknowledge imperfection. If you can’t fix, ack and just move forward anyway
  • You can’t maintain crazy growth forever. You have to level off.
  • Pair US person with AU person for projects
  • Writers should attend Docs summit and encouraged to attend at least one Openstack summit

 


January 14, 2015

Linux.conf.au 2015 – Day 4 – Keynotes

Cooper Lees – Facebook

  • Open Source at facebook
  • Increase in pull requests, not just pushing out stuff or throwing over the wall anymore
  • Focussing on full life-cycle of opensource
  • Big Projects: react , hhvm , asyncdisplaykit , presto
  • Working on other projects and sending to upstream
  • code.facebook.com  github.com/facebook
  • Network Switches and Open Compute
    • Datacentre in NZ using open compute designs
  • Open source Switch
    • Top of rack switch
    • Want to be the open compute of network switches
    • Installer, OS, API to talk to asic that runs ports
    • Switches = Servers. running chef
  • Wedge
    • 16-32 of 40GE ports
    • Internal facebook design
    • 1st building block for disaggregated switching technology
    • Contributed to OCP project
    • Micro Server + Switchports

Carol Smith – Google

  • Works in Google Open Source office
  • Google Summer of code
    • Real world experience
    • Contacts and references
  • 11th year of the program
  • 8600 participated over last 10 years
  • Not enough people in office to do southern hemisphere programme. There is “Google code-in” though

Mark McLoughlin – Red Hat

  • Open Source and the datacenter
  • iaas, paas, microservices, etc
  • The big guys are leading (amazon, google). They are building on open source
  • Telcos
    • Squeezed and scrambling
    • Not so “special” anymore
    • Need to be agile and responsive
    • Telecom datacentre – filled with big, expensive, proprietary boxes
    • opposite of agile
  • OPNFV reference architecture
  • OpenStack, Open vswitch, etc
  • Why Open Source? – collaboration and coopetition , diversity drives innovation , sustainability

 

There was a Q&A. Mostly questions about diversity at the companies and grumbles about having to move to the US/Sydney for people to work for them


Linux.conf.au – Day 3 – Lightning talks

 

  • Clinton Roy + Tom Eastman – Python Conference Australia 2015 + Kiwi PyCon 2015
    • Brisbane , late July 2015
    • Similar Structure to LCA
    • Christchurch – Septemberish
    • kiwi.pycon.org
  • Daniel Bryan – Comms for Camps
    • Australia’s detention camps for boat people
    • Please contact if you can offer technical help
  • Phil Ingram – Beernomics
    • Doing stuff for people in return for beer
    • Windows reinstall = a Keg
    • Beercoin
  • Patrick Shuff – Open sourcing proxygen
    • C++ http framework. Built own webserver
    • Features they need, monitoring, fast, easy to add new features
    • github -> /facebook/proxygen
  • Nicolás Erdödy – Multicore World 2015 & the SKA.
    • Multicore World – 17-18 Feb 2015 Wellington
  • Paul Foxworthy – Open Source Industry Australia (OSIA)
    • Industry Body
    • Govt will consult with industry bodies but won’t listen to individual companies
    • Please join
  • Francois Marier – apt-get remove –purge skype
    • Web RTC
    • Now usable to replace skype
    • Works in firefox and chrome. Click link, no account, video conversation
    • Firefox Hello
  • Tobin Harding – Central Coast LUG
    • Update on Central Coast of NSW LUG
    • About 6 people regularly
  • Mark Smith – Failing Gracefully At 10,000ft
    • Private pilot
    • Aircrafts have 400+ page handbooks
    • Things will fail…
    • Have procedures…
    • Before the engine is on fire
    • test
    • The most important task is to fly the plane
  • Tim Serong – A very short song about memory management
    • 1 verse song
  • Angela Brett – Working at CERN and why you should do it
    • Really Really awesome
    • Basically I applied; lots of fellowships
    • Meet someone famous
    • Lectures online from famous people
  • Donna Benjamin – The D8 Chook Raffle
    • $125k fund to get Drupal8 out
    • Raffle. google it
  • Matthew Cengia/maia sauren – What is the Open Knowledge Foundation?
    • au.okfn.org
    • Open govt / data / tech / journalism / etc
    • govHack
    • Open Knowledge Brisbane Meetup Govt
  • Florian Forster – noping
    • Pretty graphs and output on command line ping
    • http://noping.cc
  • Jan Schmidt – Supporting 3D movies in GStreamer
    • A brief overview of it all
  • Justin Clacherty ORP – An open hardware, open software router
    • PowerPC 1-2G RAM
    • Package based updates
    • Signed packages
    • ORP1.com


Linux.conf.au 2015 – Day 2 – Session 2 – Sysadmin Miniconf

Mass automatic roll out of Linux with Windows as a VM guest – Steven Sykes

  • Was late and missed the start of the talk

etcd: distributed locking and service discovery – Brandon Philips

  • /etc distributed
  • open source, failure tolerant, durable, watchable, exposed via http, runtime configurable
  • API – get/put/del  basics plus some extras
  • Applications
    • Locksmith, distributed locks used when machines update
    • Vulcan http load balancer
  • Leader Election
    • TTL and atomic operations
    • Magical stuff explained faster than I can type it (rough sketch after these notes).
    • Just one leader cluster-wide
  • Aims for consistency ahead of raw performance
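
A rough sketch of the TTL plus atomic-create pattern against the etcd v2 HTTP API (key name and node value are made up): the create succeeds for exactly one contender, the winner keeps refreshing the TTL, and everyone else retries once the key expires.

curl 'http://127.0.0.1:2379/v2/keys/leader?prevExist=false' -XPUT -d value=node1 -d ttl=30
curl 'http://127.0.0.1:2379/v2/keys/leader?prevExist=true&prevValue=node1' -XPUT -d value=node1 -d ttl=30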

 

Linux at the University – Randy Appleton

  • No numbers on how many students use Linux
  • Upper Peninsula of Michigan
  • 3 schools
  • Michigan Tech
    • research, 7k students, 200CS Students, Sysadmin Majors in biz school
    • Linux used in Sysadmin courses, one of two main subjects
    • Research uses Linux “a lot”
    • Inactive LUG
    • Scripting languages. Python, perl etc
  • Northern Michigan
    • 9k students, 140 CS Majors
    • Growing CIS program
    • No Phd Programs
    • Required for sophomore and senior network programming course
    • Optional Linux sysadmin course
    • Inactive LUG
    • Sysadmin course: One teacher, app of the week (Apache, nfs, email ), shell scripting at end, big project at the end
    • No problem picking distributions, No problem picking topics, huge problem with disparate incoming knowledge
    • Kernel hacking. Difficult to do, difficult to teach, best students do great. Hard to teach the others
  • Lake Superior State
    • 2600 students
    • 70 CS Majors
    • One professor teaches Sysadmin and PHP/MySQL
    • No LUG
    • Not a lot of research
  • What is missing
    • Big power Universities
    • High Schools – None really
    • Community college – None really
  • Usage for projects
    • Sometimes, not for video games
  • Usage for infrastructure
    • Web sites, ALL
    • Beowulf Clusters
    • Databases – Mostly
  • Obstacles
    • Not in High Schools
    • Not on laptops, not supported by Uni
    • Need to attract liberal studies students
    • Is Sysadmin a core concept – not academic enough
  • What would make it better
    • Servers but not desktops
    • Not a edu distribution
    • Easier than Eclipse, better than Visual Studio

Untangling the strings: Scaling Puppet with inotify – Steven McDonald

  • Around 1000 nodes at site
  • Lots of small changes, specific to one node that we want to happen quickly
  • Historically restarting the puppet master after each update
  • Problem is the master gets slow as you scale up
  • 1300 manifests, takes at least a minute to read them at each startup
  • Puppet internal caching very coarse, per environment basis (and they have only one prod one)
  • Multiple environments doesn’t work well at site
  • Ideas – tell puppet exactly what files have changed with each rollout (via git, inotify). But puppet doesn’t support this
  • I missed the explanation of exactly how puppet parses the change. I think it is “import” which is getting removed in the future
  • Inotify seemed to be more portable and simpler
  • Speed up of up to 5 minutes for nodes with complex catalogs, 70 seconds off average agent run
  • implementation doesn’t support the future parser, re-opening the class in a separate file is not supported
  • Available on github. Doesn’t work with current ruby-inotify ( in current master branch )

 

 

Linux.conf.au – Day 2 – Session 1 – Sysadmin Miniconf

Configuration Management – A love Story – Javier Turegano

  • June 2008 – Devs want to deploy fast
  • June 2009 – git -> jenkins -> Puppet master
  • But things got pretty complicated and hard to maintain
  • Remove puppet master, puppet noop, but only happens now and then lots of changes but a couple of errors
  • Now doing manual changes
  • June 2010 – Things turned into a mess.
  • June 2011 – Devs want prod-like development
  • Cloud! Tooling! Chef! – each dev have their own environment
  • June 2012 – dev environments for all working in ec2
  • dev no longer prod-like. cloud vs datacentre, puppet vs chef , debian vs centos, etc
  • June 2013 – More into cloud, teams re-arranged
  • Build EC2 images and deploy out of jenkins. Either as AMI or as rpm
  • Each team fairly separate, doing thing different ways. Had guilds to share skills and procedures and experience
  • June 2014 – Cloudformation, Ansible used by some groups, random

Healthy Operations – Phil Ingram

  • Acquia – Enterprise Drupal as a service. GovCMS Australian Federal Government. 1/4 are remote
  • Went from working in office to working from home
  • Every week had phone call with boss
  • Talk about things other than work, ask how people are going, talk to people.
  • Not sleeping, waking up at night, not exercising, quick to anger and negative thinking, inability to concentrate
  • Hadn’t taken more than 1 week off work, had let exercise slip, hobbies were computer stuff
  • In general being in Ops there is not as much of an option to take time off. Things stay broken until fixed
  • Unable to learn via Osmosis, Timing of handing over between shifts
  • People do not understand that computers are run by people not robots
  • Methods: Turn work off at the end of the day, Rubber Ducking, exercise

Developments in PCP (Performance Co-Pilot) : Nathan Scott

  • See my slides from yesterday for intro to PCP
  • Stuff in last 12 months
    • Included in supported in RHEL 6.6 and RHEL 7
    • Regular stable releases
    • Better out of the box experience
    • Tackling some long-standing problems
  • JSON access – pmwebd , interactive web charts ( Graphite, grafana )
  • zero-install look-inside containers
  • Docker support but written to allow use by others
  • Collectors
    • Lots of new kernel metrics additions
    • New applications from web devs (memcached, DNS, web )
    • DB server additions
    • Python PMDA interfaces
  • Monitor work
    • Reporting tools
    • Web tools, GUIs
  • Also improving ease of setup
  • Getting historical data from sar, iostat
  • www.pcp.io

Security options for container implementations – Jay Coles

  • What doesn’t work: rlimits, quotas, blacklisting via ACLs
  • Capabilities: Big list that containers probably shouldn’t have
  • Cgroups – Accounting, Limiting resource usage, tracking of processes, preventing/allowing device access
  • AppArmor vs SELinux – Use at least one, SELinux a little more featured

Linux.conf.au 2015 – Day 3 – Session 2

EQNZ – crisis response, open source style – Brenda Wallace

  • Started with a Trigger warning and “fucker”
  • First thing posted – “I am okay” , one tweet, one facebook
  • State of Social Media
    • Social media not as common, SMS king, not many smartphones
    • Google Buzz, twitter, Facebook
    • Multiple hashtags
  • Questions people asked on social media
  • Official info was under strain, websites down due to bad generators
  • Crisis Commons
  • Skype
    • Free
    • Multi-platform
    • Txt based
    • Battery Drain very bad
    • Bad internet in Chch hard to use, no mobile, message replay for minutes on join
  • Things pop up within an hour
    • Pirate Pad
    • Couch apps
    • Wikis
    • WordPress installs
  • Short code 4000 for non-urgent help live by 5pm
    • Volunteers processing the queue
  • All telcos agree to coordinate their social media effort
  • Civil defence didn’t have a site ready and refused offers, people decided to do it independently
  • Ushahidi instance setup
    • Google setup people finder app
    • Moved into EC2 cluster
    • hackfest, including added mobile
    • Some other Ushahidis, in the end newspaper sites embedded them
  • Council
    • chc council wordpress for info
    • Very slow and bad UI
    • Hit very hard, old information from the previous earthquake
    • staff under extreme pressure
  • Civil Defence
    • Official info only
    • Falls over
    • Caught by DDOS against another govt site
  • Our reliability
    • Never went down
    • contact and reassured some authorities
    • After 24h . 78k page impressions
  • Skype
    • 100+ chatting. limitations
    • IRC used by some but not common enough for many
    • Gap for something common. cross platform, easy to use
  • Hashtag
    • twitter to SMS notifications to add stuff to website
  • Maps were a new thing
    • None of the authorities knew them
  • Council and DHB websites did not work on mobile and were not updating
  • Government
    • Govt officers didn’t talk – except NZ Geospacial office
    • Meeting that some people attended
  • Wrap up after 3 weeks
    • Redirected website
    • Anonymous copy of database
  • Pragmatic
    • Used closed source where we had to (eg skype)
    • But easier with open source, could quickly modify
    • Closed source people could install webserver, use git, etc. Hard to use contributions
  • Burned Bridges
    • Better jobs with Gov agencies
  • These days
    • Tablets
    • Would use EC2 again
    • phones have low power mode
    • more open street maps

 

collectd in dynamic environments – Florian Forster

  • Started collectd in 2005
  • Dynamic environments – Number and location of machines change frequently – VM or job management system
  • NOTE: I use collectd so my notes are a little sparse here cause I knew most of it already
  • Collects timeseries data, does one thing well. collectd.org
  • agent runs on each host, plugins mostly in C for lots of things, or the exec plugin to run random stuff.
  • Read Plugins to get metrics from system metrics, applications, other weird stuff
  • Write plugins – Graphite, RRD, Riemann, MongoDB
  • Virtual machine Metrics
    • libvirt plugin
    • Various metrics, cpu, memory, swap, disk ops/bytes, network
    • GenericJMX plugin – connects to JVM. memory and garbage collection, threads
  • Network plugin
    • sends and receives metric
    • Efficient binary protocol. 50-100 byte UDP multicast/unicast packets (config sketch after these notes)
    • crypto available
    • send, receive, forward packets
  • Aggregation
    • Often more useful for alerting
  • Aggregation plugin
    • Subscribes to metric
    • aggregates and forwards
    • Limitation: no state, eg median, mean are missing
    • only metrics with one value
    • can be aggregated at any level
    • eg instead of each CPU then total usage of all your CPUS
  • Riemann
    • Lots of filters and functions
    • can aggregate, many options
  • Bosun
    • Monitoring and alert language
  • Storage
    • Graphite
    • OpenTSDB based on hadoop
    • InfluxDB – understand collectd protocol native (and graphite).
    • Vaultaire ( no collectd integration but… )
  • New Dashboard – facette.io
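
As a config sketch, these are the sort of lines that go into /etc/collectd/collectd.conf to ship metrics between hosts (the multicast address and port are collectd’s documented defaults; the plugin choice is illustrative):

LoadPlugin cpu
LoadPlugin memory
LoadPlugin network
<Plugin network>
  Server "239.192.74.66" "25826"
</Plugin>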


Linux.conf.au 2015 – Day 3 – Keynote

Bob Young

  • Warns that some stories might not be 100% true
  • ”  Liked about Early Linux – Nobody was very nice to each other but everybody was very respectful of the Intel Microprocessor “
  • CEO of Redhat 1992 – 2000
  • Various stories, hard to take notes from
  • One person said they walked out of the Keynote when they heard the quote “it was a complete meritocracy” re the early days of Linux.
  • Others didn’t other parts of the talk. General tone and some statements similar to the one above.
  • “SuSE User Loser” provoked laughs and a SUSE lizard being thrown at the speaker
  • Reasons the publishing industry rejects books: 1. no good; 2. market not big enough; 3. They already publish one on the subject.

Linux.conf.au 2015 – Day 3 – Session 1

CoreOS: an introduction – Brandon Philips

  • Reference to the “Datacenter as a Computer” paper
  • Intro to containers
  • cAdvisor – API of what resources are used by a container
  • Rocket
    • Multiple implementations of container spec , rocket is just one implementation
  • Operating system is able to make less promises to applications
  • Kernel API is really stable
  • Making updates easy
    • Based on ChromeOS
    • Update one partition with OS version. Then flip over to that.
    • Keep another partition/version ready to fail back if needed
    • Safer to update the OS separately from the app
    • Just around 100MB in size. Kernel, very base OS, systemd
  • etcd
    • Key value store over http (see my notes from yesterday)
    • multiple, leader election etc
    • Individual server less critical since data across multiple hosts
  • Scheduling stuff to servers
    • fleet – very simple, kinda systemd looking
    • fleetctl start foo.service   – sends it off to some machine (example after these notes)
    • mesos, kubernetes, swarm are other alternative schedulers
  • Co-ordination
    • locksmith
  • Service discover
    • skydns, discoverd, confd
    • Export location of application to DNS or http API
    • Need proxies to forward request to the right place (for apps not able to query service discovery directly)
  • It is all pretty much a new way of thinking about problems
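
A rough sketch of the fleet workflow (assumes a systemd unit file called hello.service has already been written; the name is made up):

fleetctl submit hello.service
fleetctl start hello.service
fleetctl list-units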

 

Why you should consider using btrfs, real COW snapshots and file level incremental server OS upgrades like Google does. – Marc Merlin

  • Worked at netapp, hooked on snapshots, lvm snapshots never worked too well , also lvm partitions not too good
  • Switched laptop to btrfs 3 years ago
  • Why you should consider btrfs
    • Copy on Write
    • Snapshots
    • cp --reflink=always
    • metadata is redundant and checksummed, data checksummed too
    • btrfs underlying filesystem [for now]
    • RAID 0, 1, 5, 6 built in
    • file compression is also built in
    • online background scrub (partial fsck)
    • block level filesystem diff backups(instead of a slow rsync)
    • convert directly from ext3 (fails sometimes)
  • Why not use ZFS instead
    • ZFS more mature than btrfs
    • Same features plus more
    • Bad license. Oracle not interested in relicensing. Either hard to do or they prefer btrfs
    • Netapp sued sun for infringing patents with ZFS. Might be a factor
    • Hard to ship a project with it due to license conditions
  • Is it safe now?
    • Use new kernels. 3.14.x works okay
    • You have to manually balance sometimes
    • snapshots, raid 0 , raid 1 mostly stable
    • Send/receive mostly works reliably
  • Missing
    • btrfs fsck incomplete, but mostly not needed
    • file encryption not supported yet
    • dedup experimental
  • Who use it
    • openSUSE 13.2 ships with it by default
  • File System recovery
    • Good entry on the btrfs wiki
    • btrfs scrub, run weekly
    • Plan for recovery though, keep backups, not as mature as ext4/ext3 yet, prepare beforehand
    • btrfs-tools are in the Ubuntu initrd
  • Encryption
    • Recommends setup encryption on md raid device if using raid
  • Partitions
    • Not needed anymore
    • Just create storage pools, under them create sub volumes which can be mounted
    • boot: root=/dev/sda1  rootflags=subvol=root
  • Snapshots
    • Works using subvolumes
    • Read only or read-write
    • noatime is strongly recommended
    • Can sneakily fill up your disk “btrfs fi show” tells you real situation. Hard to tell what snapshots to delete to reclaim space
  • Compression
    • Mount option
    • lzo fast, zlib slower but better
    • if change option then files changed from then on use new option
  • Turn off COW for big files with lots of random writes in the middle. eg DBs and virtual disk images
  • Send/receive
    • rsync very slow to scan many files before copy
    • initial copy, then only the diffs. diff is computed instantly
    • backup up ssd to hard drive hourly. very fast
  • You can make the metadata of the file system a different raid level than the data
  • Talk slides here. Lots of command examples (a few illustrative ones below)
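
The commands below are not from the talk slides, just a hedged sketch of the subvolume/snapshot/send workflow on a scratch device (device names and mount points are made up):

mkfs.btrfs /dev/sdb
mount -o noatime,compress=lzo /dev/sdb /mnt/pool
btrfs subvolume create /mnt/pool/root
btrfs subvolume snapshot -r /mnt/pool/root /mnt/pool/root.snap
btrfs send /mnt/pool/root.snap | btrfs receive /backup
btrfs scrub start /mnt/pool
btrfs fi show /mnt/pool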

 

January 13, 2015

Linux.conf.au 2015 – Day 2 – Session 3 – Sysadmin

Alerting Husbandry – Julien Goodwin

  • Obsolete alerts
    • New staff members won’t have context to know what is obsolete and should have been removed (or ignored)
  • Unactionable alerts – It is managed by another team but thought you’d like to be woken up
  • SLA Alerts – can I do something about that?
  • Bad thresholds ( server with 32 cores had load of 4 , that is not load ), Disk space alerts either too much or not enough margin
  • Thresholds only redo after complete monitoring rebuilds
  • Hair trigger alerts ( once at 51ms not 50ms )
  • Not impacting redundancy ( only one of 8 web servers is down )
  • Spamming alerts, things is down for the 2925379857 time. Even if important you’ve stopped caring
  • Alerts for something nobody cares about, eg test servers
  • Most of earlier items end up in “don’t care” bucket
  • Emails bad, within a few weeks the entire team will have a filter to ignore it.
  • Undocumented alerts – If it is broken, what am I supposed to do about it?
  • Document the actions to take in a “playbook”
  • Alert acceptance practice: only on-callers should be accepting alerts
  • Need a way to silence it
  • Production by Fiat

 

 

Managing microservices effectively – Daniel Hall

  • Step one – write your own apps
  • keep state outside apps
  • not nanoservices, not milliservices
  • Each should be replaceable, independently deployable, and have a single capability
  • Think about dependencies, especially circular ones
  • Packaging
    • small
    • multiple versions on same machine
    • in dev and prod
    • maybe use docker, have local registry
    • Small performance hit compared to VMs
    • Docker is a little immature
  • Step 3 deployment
    • Fast in and out
    • Minimal human interaction
    • Recovery from failures
    • Less overhead requires less overhead
    • We use Mesos and Marathon (see the example after this list)
    • Marathon handles the switch from old app to new, task failure and recovery
    • Early on the hype cycle
  • Extra credit: scheduling
    • Chronos within Mesos
    • A bit newish
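
As a rough, hedged illustration of the Mesos/Marathon model described above, a Dockerised microservice can be submitted to Marathon's REST API; the app id, image and Marathon hostname below are placeholders:

    # Describe the app as JSON
    cat > hello-service.json <<'EOF'
    {
      "id": "/hello-service",
      "container": {
        "type": "DOCKER",
        "docker": { "image": "registry.example.com/hello:1.0" }
      },
      "cpus": 0.1,
      "mem": 128,
      "instances": 2
    }
    EOF

    # Submit it; Marathon schedules tasks onto Mesos slaves and restarts them on failure
    curl -X POST -H "Content-Type: application/json" \
         -d @hello-service.json http://marathon.example.com:8080/v2/apps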

 

Corralling logs with ELK – Mark Walkom

  • You don’t want to be your boss’s grep
  • Cluster Elasticsearch; there is a single master at any point in time
  • Sizing is best determined with a single machine: see how much it can handle. Keep the Java heap under 31GB
  • Lots of plugins and clients
  • APIs return JSON; ?pretty makes the output look nicer. The “_cat/*” APIs are more command-line friendly (see the example after this list)
  • A new node scales the cluster, which auto-balances and grows automatically
  • Logstash: lots of filters, handles just about any format, easy to set up
  • Kibana – graphical front end for Elasticsearch
  • Curator, logstash-forwarder, grokdebugger
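
For example, assuming Elasticsearch is listening on localhost:9200, the JSON and _cat APIs mentioned above look like this:

    # Human-friendly, column-oriented output from the _cat APIs
    curl 'localhost:9200/_cat/nodes?v'
    curl 'localhost:9200/_cat/indices?v'

    # Regular JSON APIs, pretty-printed with ?pretty
    curl 'localhost:9200/_cluster/health?pretty'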

FAI — the universal deployment tool – Thomas Lange

  • From power off to applications running
  • It is all about installing software packages
  • Central administration and control
  • no master or golden image
  • can be expanded by hooks
  • plan your installation and FAI installs the plan
  • Boot up diskless client via PXE/tftp
  • creates partitions, file systems, installs, reboots
  • Groups hosts by classes; multiple classes per host, etc.
  • Classes can be defined by executables writing class names to standard output; they can be shell scripts and can pass variables (see the sketch after this list)
  • partitioning, can handle LVM, RAID
  • Project started in 1999
  • Supports Debian-based distributions including Ubuntu
  • Supports bare metal, VM, chroot, LiveCD, Golden image
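
A minimal sketch of a class script, assuming FAI's convention that executables in the config space's class/ directory print class names to standard output (the hostname patterns here are invented for illustration):

    #!/bin/sh
    # class/50-host-classes: assign FAI classes based on hostname
    case "$HOSTNAME" in
        web*) echo "DEBIAN WEBSERVER" ;;
        db*)  echo "DEBIAN DBSERVER" ;;
        *)    echo "DEBIAN DEFAULT" ;;
    esac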

 

Documentation made complicated – Eric Burgueno

  • Incomplete, out of date, inconsistent
  • Tools – Word, LibreOffice -> SharePoint
  • SharePoint = let’s put this stuff over here so nobody will ever read it again
  • txt, markdown, html; need to track changes
  • Files can be put in version control.
  • Mediawiki
  • Wiki – uncontrolled proliferation of pages, duplicate pages
  • Why can’t documentation be mixed in with the configuration management?
  • Documentation snippets
    • Same everywhere (mostly)
    • Reusable
  • Transclusion in MediaWiki (include one page inside another)
  • Modern versions of MediaWiki have parser functions that can display different content depending on a condition
  • awesomewiki.co

Systemd Notes

A few months ago I gave a lecture about systemd for the Linux Users of Victoria. Here are some of my notes reformatted as a blog post:

Scripts in /etc/init.d can still be used, they work the same way as they do under sysvinit for the user. You type the same commands to start and stop daemons.

To get a result similar to changing runlevel use the “systemctl isolate” command. Runlevels were never really supported in Debian (unlike Red Hat where they were used for starting and stopping the X server) so for Debian users there’s no change here.
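
For example, the rough equivalents of switching to runlevel 3 or single-user mode are:

    systemctl isolate multi-user.target
    systemctl isolate rescue.target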

The command systemctl with no params shows a list of loaded services and highlights failed units.

The command “journalctl -u UNIT-PATTERN” shows journal entries for the unit(s) in question. The pattern uses wildcards not regexs.
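
For example (the unit pattern here is just an illustration):

    journalctl -u "apache*"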

The systemd journal includes the stdout and stderr of all daemons. This solves the problem of daemons that don’t log all errors to syslog and leave the sysadmin wondering why they don’t work.

The command “systemctl status UNIT” gives the status and last log entries for the unit in question.
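
For example:

    systemctl status ssh.service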

A program can use ioctl(fd, TIOCSTI, …) to push characters into a tty buffer. If the sysadmin runs an untrusted program with the same controlling tty then it can cause the sysadmin shell to run hostile commands. The system call setsid() to create a new terminal session is one solution but managing which daemons can be started with it is difficult. The way that systemd manages start/stop of all daemons solves this. I am glad to be rid of the run_init program we used to use on SE Linux systems to deal with this.

Systemd has a mechanism to ask for passwords for SSL keys and encrypted filesystems etc. There have been problems with that in the past but I think they are all fixed now. While there is some difficulty during development the end result of having one consistent way of managing this will be better than having multiple daemons doing it in different ways.

The commands “systemctl enable” and “systemctl disable” enable/disable daemon start at boot which is easier than the SysVinit alternative of update-rc.d in Debian.
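
For example, on Debian the old and new ways of enabling a daemon at boot compare roughly like this:

    systemctl enable apache2.service     # instead of: update-rc.d apache2 defaults
    systemctl disable apache2.service    # instead of: update-rc.d apache2 disable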

Systemd has built in seat management, which is not more complex than consolekit which it replaces. Consolekit was installed automatically without controversy so I don’t think there should be controversy about systemd replacing consolekit.

Systemd improves performance by parallel start and autofs style fsck.

The command systemd-cgtop shows resource use for cgroups it creates.

The command “systemd-analyze blame” shows what delayed the boot process and “systemd-analyze critical-chain” shows the critical path in boot delays.
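
For example:

    systemd-analyze blame
    systemd-analyze critical-chain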

Systemd also has security features such as service-private /tmp and restricting service access to directory trees.

Conclusion

For basic use things just work, you don’t need to learn anything new to use systemd.

It provides significant benefits for boot speed and potentially security.

It doesn’t seem more complex than other alternative solutions to the same problems.

https://wiki.debian.org/systemd

http://freedesktop.org/wiki/Software/systemd/Optimizations/

http://0pointer.de/blog/projects/security.html

January 12, 2015

Linux.conf.au – Day 2 – Keynote by Eben Moglen

Last spoke 10 years ago at linux.conf.au in Canberra

Things have improved in the last ten years

  • $10s of billions of value have been lost in software patent war
  • But things have been so bad that some help was acquired, so the worst laws have been pushed back a little
  • “Fear of God” in industry was enough to push open Patent pools
  • Judges determined that Patent law was getting pathological, 3 wins in Supreme court
  • Likelihood worst patent laws will be applied against free software devs has decreased
  • “The Nature of the problem has altered because the world has altered”

The Next 10 years

  • Most important Patent system will be China’s
  • Lack of rule of law in China will cause problems in environment of patents
  • Too risky for somebody to try and stop a free software project. We have “our own baseball bat” to swing back at them

The last 10 years

  • Changes in society are more important than changes in software
  • 21st century vs 20th century social organisations
    • Less need for hierarchy and secrecy
    • Transparency, Participation, non-hierarchical interaction
  • Open Source invented that organisational structure
  • Technology we made has taken over the creation of software
  • “Where is BitKeeper now?” – Eben Moglen
  • Even Microsoft recognises that our way of making software won
  • Long term, the organisational structure change everywhere will be more important than just its application in software
  • If there has been good news about politics = “we did it”, bad news = “we tried”

Our common Values

  • “Bridge entire environment between vi and emacs”

Snowden

  • Without PGP and free software then things could have been worse
  • The world would be a far more despotic place if PGP was driven underground back in 1993. Imagine today’s Net without HTTPS or SSH!
  • “We now live in the world we are afraid of”
  • “What stands between them and us is our inventions”
  • “Freedom itself depends on how we make use of the technologies we are creating.” – Eben Moglen
  • “You can’t trust what you can’t read”
  • Big power in the world is committed against the first law of robotics; they want technology to work for them
  • From a guy on Twitter – “You can’t trust what you can’t read.” True, but if OpenSSL teaches us anything, you can’t necessarily trust what you can
  • Attitudes in under-18s are a lot more positive towards him than those who are older (not just cause he looks like Harry Potter)
  • The GNU Project is 30 years old, almost the same age as Snowden

Opportunity

  • We can’t control the net, but we have an opportunity to prevent others from controlling it
  • Opportunity to prevent failure of freedom
  • Society is changing, demographics under control
  • But 1.6 billion people live in China, America is committed to spying, consumer companies are committed to collecting consumer information
  • Collecting everything is not the way we want the net to work
  • We are playing for keeps now.

 

 

linux.conf.au 2015

I'm at linux.conf.au this week learning lots. I'll have my Begg Digital hat on (when outside).

Linux.conf.au 2015 – Day 1 – Session 2 – Containers

AWS OpsWorks Orchestration War Stories – Andrew Boag

  • Autoscaling too slow since running build-from-scratch every time
  • Communications dependencies
  • Full stack rebuild in 20-40 minutes to use data currently in production
  • A bit longer in a different region
  • Great for load testing
  • If we were doing it again
    • AMI-based would be better
    • OpsWorks is not suitable for all AWS stacks
    • A golden master is more flexible
  • Auto-Scaling
    • Not every AMI instance is Good to Go upon provisioning
    • Not a magic bullet, you can’t broadly under-provision
    • Needs to be thoroughly load-tested
  • Tips
    • Dual factor authentication
    • No single person / credentials should be able to delete all cloud-hosted copies of your data
  • Looked at CloudFormation at the start; it seemed to be more work
  • Fallen out of love with OpsWorks
  • Nice distinction by Andrew Boag: he doesn’t talk about “lock-in” to cloud providers, but about “cost to exit”.   – Quote from Paul

 

Slim Application Containers from Source – Sven Dowideit

  • Choose a base image and make a local version (so all your stuff uses the same one)
  • I’d pick debian (a little smaller) unless you can make do with busybox or scratch
  • Do I need these files? (check through the Dockerfile) e.g. remove doc files, manpages, timezones
  • Then build, export and import, and it comes out all clean with just one layer (see the sketch after this list)
  • If all your images use the same base, it is only on the disk once
  • Use related images with all your tools, related to deployment image but with the extra dev, debug, network tools
  • Version the dev images
  • Minimise to 2 layers
    • look at docker-squash
    • Get rid of all the source code from your image; just end up with what is needed, not junk hidden in layers
  • Static micro-container nginx
    • Build as container
    • export as tar , reimport
    • It crashes :(
    • Use inotifywait to find what extra files (like shared libraries) it needs
    • Create new tarball with those extra files and “docker import” again
    • Just 21MB instead of 1.4GB with all the build fragments and random system stuff
    • Use docker build as last stage rather than docker import and you can run nginx from docker command line
    • Make 2 tar files, one for each image, one in libs/etc, second is nginx
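
A hedged sketch of the build/export/import squash described above; the image and container names are placeholders:

    # Build the image normally (lots of layers, build tools, source, etc.)
    docker build -t nginx-build .

    # Create a container from it, export its flattened filesystem and
    # re-import it as a clean single-layer image
    docker create --name flatten nginx-build
    docker export flatten | docker import - nginx-slim
    docker rm flatten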

 

Containers and PCP (Performance Co-Pilot) -  Nathan Scott

  • Been around for 20+ years, 11 years open source; not a big mindshare
  • What is PCP?
    • Toolkit, System level analysis, live and historical, Extensible, distributed
    • A pmcd daemon runs on each server, plus agents (PMDAs) for various functions (a bit like the collectd model)
    • pmlogger, pmchart, pmie, etc talk (pull or poll) to pmcd to get data
  • With Containers
    • Use --container= to grab info from inside a container/namespace (see the example after this list)
    • Lots of work is still needed; metrics inside containers are limited compared to the native OS
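
For example, something along these lines (the container name and metric are placeholders, and the exact flag spelling may differ between PCP versions):

    # Fetch a metric as seen from inside a named container
    pminfo --fetch --container=webapp1 network.interface.in.bytes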

 

The Challenges of Containerizing your Datacenter – Daniel Hall

  • Goals at LIFX
    • Apps all stateless, easy to dockerize
    • Using mesos, zookeeper, marathon, chronos
    • Databases and other stuff outside that cloud
  • Mesos slave launches docker containers
  • Docker Security
    • chroot < Docker < KVM
    • Running untrusted Docker containers is a BAD IDEA
    • Don’t run apps as root inside container
    • Use a recent kernel
    • Run as little as possible in container
    • Single static app if possible
    • Run SELinux on the host
  • Finding things
    • Lots of microservices; Marathon/Mesos moves things all over the place
    • Whole machines going up and down
    • Marathon comes with a tool that pushes its state into HAProxy; it works fairly well: apps talk to localhost on each machine and HAProxy forwards the traffic
    • Use custom script for this
  • Collecting Logs
    • Not a good solution
    • can mount /dev/log but don’t restart syslog
    • Mesos collects stdout/stderr, which is hard to work with and has no timestamps
    • Centralized logs
    • rsyslog logs to 127.0.0.1 -> HAProxy -> central machine
    • Sometimes needs to queue/drop if things take a little while to start
    • rsyslog -> logstash
    • elasticsearch on mesos
    • nginx tasks running kibana
  • Troubleshooting
    • Similar to the service discovery problem
    • Easier to get into a container than getting out
    • Find a container in marathon
    • Use docker exec to run a shell; it doesn’t work so well on really thin containers (see the example after this list)
    • So debugging tools can work from outside; pprof or jconsole can connect to an exposed port/pid of the container
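
For example, to get a shell inside a running container as described above (the container name is a placeholder):

    docker ps                            # find the container
    docker exec -it my-service /bin/sh   # run a shell inside it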

Linux.conf.au 2015 – Day 1 – Session 1 – Containers

Clouds, Containers, and Orchestration Miniconf

 

Cloud Management and ManageIQ – John Mark Walker

  • Who needs management – Needs something to tie it all together
  • New Technology -> Adoption -> Proliferation -> chaos -> Control -> New Technology
  • Many technologies follow this, flies under the radar, becomes a problem to control, management tools created, management tools follow the same pattern
  • Large number of customers using hybrid cloud environment ( 70% )
  • Huge potential complexity, lots of requirements, multiple vendors/systems to interact with
  • ManageIQ
    • Many vendor managed open source products fail – open core, runt products
    • Better way – give more leeway to upstream developers
    • Article about taking it opensource on opensource.com. Took around a year from when decision was made
    • Lots of work to create a good open source project that will grow
    • Release named after Chess Grandmasters
    • Rails App

 

LXD: The Container-Based Hypervisor That Isn’t -  Tycho Andersen

  • Part of Openstack
  • Based on LXC , container based hypervisor
  • Secure by default: user namespaces, cgroups, Apparmor, etc
  • A REST API
  • A daemon that does hypervisor-y things (see the sketch after this list)
  • A framework for maintaining container based applications
  • It Isn’t
    • No network configuration
    • No storage management – But storage aware
    • Not an application container tool
    • handwavy difference between it and docker, I’m sure it makes sense to some people. Something about running an init/systemd rather than the app directly.
  • Features
    • Snapshotting – e.g. for something that is slow to start, snapshot it just after it starts and deploy it in that state
    • Injection – add files into the container for app to work on.
    • Migration – designed to go fairly fast with low downtime
  • Image
    • Public and private images
    • can be published
  • Roadmap
    • MVP 0.1 released late January 2015
    • container management only
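
As a hedged sketch of what driving it looks like from the command line, assuming the lxc client that ships with LXD (the image alias and container name are placeholders, and the CLI was still settling at the 0.1 stage):

    lxc launch ubuntu:14.04 web1      # start a container from a published image
    lxc snapshot web1 booted          # snapshot it once it has started
    lxc exec web1 -- /bin/bash        # run a command inside it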

 

Rocket and the App Container Spec – Brandon Philips

  • Single binary – rkt – runs everywhere, systemd not required
  • rkt fetch – downloads and discovers images ( can run as non-root user )
  • bash -> rkt -> application
  • upstart -> rkt -> application
  • rkt run coreos.com/etcd-v2.3.1 (see the example after this list)
  • Multiple processes in a container are common. They can be run from the command line or specified in the JSON file of the spec.
  • Steps in launch
    • Stage 0 – downloads the image and checks it
    • Stage 1 – Exec as root, setup namespaces and cgroups, run systemd container
    • Stage 2 – runs actual app in container. Things like policy to restart the app
    • rocket-gc garbage collects stuff and runs periodically; there is no management daemon
  • App Container spec is work in progress
    • images, files, compressed, meta-data, dependencies on other images
    • Runtime: restarts processes, runs multiple processes, runs extra processes under specified conditions
    • metadata server
    • Intended to be built with test suite to verify
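
Putting the commands from the talk together into a minimal sketch (the image name is copied verbatim from the notes above; exact image naming may differ in practice, and rkt run typically needs root):

    rkt fetch coreos.com/etcd-v2.3.1   # discover and download the image (can run as non-root)
    rkt run coreos.com/etcd-v2.3.1     # stage 0/1/2 launch as described above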

January 11, 2015

January 10, 2015

DevOps Automation Services

Since we launched in 2014, we have assisted numerous companies, opensource projects and individuals, in learning, experimenting and using automation tools that nowadays define operations. Many things are changing in this area.

We have helped many people to achieve their automation goals, and we are happy to see how their operational costs are reduced and how productivity is increased.

Do you need help with DevOps and automation ? Don't hesitate to contact us at sales@manageacloud.com. You can also find more information at https://manageacloud.com/operations

Stay tuned! Very soon we will release a new set of tools that will make your life in operations even easier.

January 09, 2015

The new citizenship: digital citizenship

Recently I was invited to give a TEDx talk at a Canberra event for women speakers. It was a good opportunity to have some fun with some ideas I’ve been playing with for a while around the concept of being a citizen in the era of the Internet, and what that means for individuals and traditional power structures in society, including government. A snipped transcript below. Enjoy and comments welcome :) I’ve put a few links that might be of interest throughout and the slides are in the video for reference.

Video is at http://www.youtube.com/embed/iqjM_HU0WSw

Digital Citizenship

I want to talk to you about digital citizenship and how, not only the geek will inherit the earth but, indeed, we already have. All the peoples just don’t know it yet.

Powerful individuals

We are in the most exciting of times. People are connected from birth and are engaged across the world. We are more powerful as individuals than ever before. We have, particularly in communities and societies like Australia, we have a population that has all of our basic needs taken care of. So we have got time to kill. And we’ve got resources. Time and resources gives a greater opportunity for introspection which has led over the last hundred years in particular, to enormous progress. To the establishment of the concept of individual rights and strange ideas like the concept that animals might actually have feelings and perhaps maybe shouldn’t be treated awfully or just as a food source.

We’ve had these huge, enormous revolutions and evolutions of thought and perspective for a long, long time but it’s been growing exponentially. It’s a combination of the growth in democracy, the rise of the concept of individual rights, and the concept of individuals being able to participate in the macro forces that shape their world.

But it’s also a combination of technology and the explosion in what an individual can achieve both as a individual but also en mass collaborating dynamically across the globe. It’s the fact that many of us are kind of fat, content and happy and now wanting to make a bit of a difference, which is quite exciting. So what we’ve got is a massive and unprecedented distribution of power.

Distributed power

We’ve got the distribution of publishing. The ability to publish whatever you want, whether you do it through formal mechanisms or anonymously. You can distribute to a global audience with fewer barriers to entry than ever before. We have the distribution of the ability to communicate with whomever you please. The ability to monitor, which has traditionally been a top down thing for ensuring laws are followed and taxes are paid. But now people can monitor sideways, they can monitor up. They can monitor their governments. They can monitor companies. There is the distribution of enforcement. This gets a little tricky because if anyone can enforce then anyone can enforce anything, and you start to get a little bit of concern there, but it is an interesting time. Finally, with the advent of 3D printing starting to get mainstream, we’re seeing the massive distribution of property.

And if you think about these five concepts – publishing, communications, monitoring, enforcement and property – these five power bases have traditionally been centralised. We usually look at the industrial revolution and the broadcast age as two majors periods in history but arguably they’re both actually part of the same era. Because both of them are about the centralised creation of stuff – whether it’s physical or information – by a small number of people that could afford to do so, and then distributed to the rest of the population.

The idea that anyone can create any of these things and distribute it to anyone else, or indeed for their own purposes is a whole new thing and very exciting. And what that means is that the relationship between people and governments and industry has changed quite fundamentally. Traditional institutions and bastions of any sort of power are struggling with this and are finding it rather scary but it is creating an imperative to change. It is also creating new questions about legitimacy and power relations between people, companies and governments.

Individuals however, are thriving in this environment. There’s always arguments about trolls and about whether the power’s being used trivially. The fact is the Internet isn’t all unicorns or all doom. It is something different, it is something exciting and it is something that is empowering people in a way that’s unprecedented and often unexpected.

The term singularity is one of those fluffy things that’s been touted around by futurists but it does have a fairly specific meaning which is kind of handy. The concept of the distance between things getting smaller. Whether that’s the distance between you and your publisher, you and your food, you and your network or you and your device. The concept of approaching the singularity is about reducing those distances between. Now, of course the internet has reduced the distance between people quite significantly and I put to you that we’re in a period of a “democratic singularity” because the distance between people and power has dramatically reduced.

People are in many ways now as powerful as a lot of the institutions which frame and shape their lives. So to paraphrase, and slightly turn on its head, the quote by William Gibson: the future is here and it is already widely distributed. So we’ve approached the democratic singularity and it’s starting to make democracy a lot more participatory, a lot more democratic.

Changing expectations

So, what does this mean in reality? What does this actually translate to for us as people, as a society, as a “global village”, to quote Marshall McLuhan? There are quite massive changing expectations of individuals. I see a lot of people focused on the shift in power from the West to the East. But I believe the more interesting shift is the shift in power from institutions to individuals.

That is the more fascinating shift not just because individuals have power but because it is changing our expectations as a society. And when you start to get a massive change of expectations across an entire community of people, that starts to change behaviors, change economics, change social patterns, change social norms.

What are those changing expectations? Well, the internet teaches us a lot of things. The foundation technical principles of the internet are effectively shaping the social characteristics of this new society. This distributed society or “Society 5″ if you will.

Some of the expectations are the ability to access what you want. The ability to talk to whom you want. The ability to cross reference. When I was a kid, if you did an essay on anything you had to go look at Encyclopedia Britannica. It was a single source of truth. The concept that you could get multiple perspectives, some of which might be skewed by the way, but still the concept of getting the context of those different perspectives and a little comparison, was hard and alien for the average person. Now you can often talk to someone who is there right now, let alone find myriad sources to help inform your view. You can get a point of comparison against traditionally official sources like a government source or media report. People online start to intuitively understand that the world’s actually a lot more gray than we are generally taught in school and such. Learning that the world is gray is great because you start to say, “you know what? You could be right and I could be right and that doesn’t make either perspective necessarily invalid, and that isn’t a terrible thing.” It doesn’t have to be mutually exclusive or a zero sum game, or a single view of history. We can both have a perspective and be mutually respectful in a lot of cases and actually have a more diverse and interesting world as a result.

Changing expectations are helping many people overcome barriers that traditionally stopped them from being socially successful: economically, reputationally, etc. People are more empowered to basically be a superhero which is kinda cool. Online communities can be one of the most exciting and powerful places to be because it starts to transcend limitations and make it possible for people to excel in a way that perhaps traditionally they weren’t able to. So, it’s very exciting. 

Individual power also brings a lot of responsibility. We’ve got all these power structures but at the end of the day there’s usually a techie implementing the big red button so the role of geeks in this world is very important. We are the ones who enable technology to be used for any agenda. Everything is basically based on technology, right? So everything is reliant upon technology. Well, this means we are exactly as free as the tools that we use. 

Technical freedom

If the tool that you’re using for social networking only allows you to talk to people in the same geographic area as you then you’re limited. If the email tool you’re using only allows you to send to someone who has another secure network then you’re only as free as that tool. Tech literacy becomes an enabler or an inhibitor, and it defines an individual’s privacy. Because you might say to yourself, oh you know, I will never tell anyone where I am at a particular point in time because I don’t want someone to rob my house while I’m out on holiday. But you’ll still put a photo up that you’re in Argentina right now, because that’s fun, so now people know. Technical literacy for the masses is really important but largely, at this point, confined to the geeks. So hacker ethos ends up being a really important part of this.

For those that don’t know, hacker is not a rude word. It’s not a bad word. It’s the concept of having a creative and clever approach to technology and applying tech in cool and exciting ways. It helps people scratch an itch, test their skills, solve tricky problems collaboratively. Hacker ethos is a very important thing because you start to say freedom, including technical freedom is actually very, very important. It’s very high on the list. And with this ethos, technologists know that to implement and facilitate technologies that actually hobble our fellow citizens kind of screws them over.

Geeks will always be the most free in a digital society because we will always know how to route around the damage. Again, going back to the technical construct of the internet. But fundamentally we have a role to play to actually be leaders and pioneers in this society and to help lead the masses into a better future.

Danger!

There’s also a lot of other sorts of dangers. Tools don’t discriminate. The same tools that can lead a wonderful social revolution or empower individuals to tell their stories is the same technology that can be used by criminals or those with a nefarious agenda. This is an important reason to remember we shouldn’t lock down the internet because someone can use it for a bad reason in the same way we don’t ban cars just because someone used a vehicle to rob a bank. The idea of hobbling technology because it’s used in a bad way is a highly frustrating one.

Another danger is “privilege cringe”. In communities like Australia we’re sort of taught to say, well, you’ve got privilege because you’ve been brought up in a safe stable environment, you’ve got an education, you’ve got enough money, you’ve got a sense of being able to go out and conquer the world. But you’ve got to hide that because you should be embarrassed of your opportunities when so many others have so little. I suggest to you all that you in this room, and pretty much anyone that would probably come and watch a TED event or go to a TED talk or watch it online, is the sort of person who is probably reasonably privileged in a lot of ways and you can use your privilege to influence the world in a powerful and positive way.

You’ve got access to the internet which makes you part of the third of the world that has access. So use your privilege for the power of good! This is the point. We are more powerful than ever before so if you’re not using your power for the power of good, if you’re not actually contributing to making the world a better place, what are you doing?

Hipsters are a major danger. Billy Bragg made the perfect quote which is, cynicism is the perfect enemy of progress. There is nothing more frustrating than actually making progress and having people tear you down because you haven’t done it exactly so.

Another danger is misdirection. We have a lot of people in Australia who want to do good. That’s very exciting and really cool. But Australians tend to say, I’m going to go to another country and feed some poor people and that’ll make me feel good, that’ll be doing some good and that’ll be great. Me personally, that would really not be good for people because I don’t cook very well. Deciding how you can actually contribute to making the world a better place is like finding a lever. You need to identify what you are good at, what real differences you can make when you apply your skills very specifically. Where do you push to get a major change rather than contributing to maintaining the status quo? How do you rewrite the rules? How do you actually help those people that need help all around the world, including here in Australia, in a way that actually helps them sustainably? Enthused misdirection is, I guess, what I’m getting at.

And of course, one of the most frustrating dangers is hyperbole. It is literally destroying us. Figuratively speaking ;)

So there’s a lot of dangers, there’s a lot of issues but there is a lot of opportunities and a lot of capacities to do awesome. How many people here have been to a TED talk of some sort before? So keep your hand up if, after that, you went out and did something world changing. OK. So now you’re gonna do that, yeah? Right. So next time we do this all of those hands will stay up.

Progress

I’ll make a couple of last points. My terrible little diagram here maps the concept that if you look at the last 5,000 years, the quality of life for individuals in many societies has been down here fairly low for a long time. In millennia past, kings come and go, people get killed, properties taken. All sorts of things happen and individuals were very much at the behest of the powers of the day, but you just keep plowing your fields and try to be all right. But it has slowly improved over a long time, and the collective epiphany of the individual starts to happen: the idea of having rights, the idea that things could be better and that the people could contribute to their own future, and democracy starts to kick off. The many suffrage movements addressing gender, ethnicity and other biases, with more and more individuals in societies starting to be granted more equal recognition and rights.

The last hundred years, boom! It has soared up here somewhere. And I’m not tall enough to actually make the point, right? This is so exciting! So where are we going to go next?

How do we contribute to the future if we’re not involved in shaping the future? If we aren’t involved, then other powerful individuals are going to shape it for us. And this, this is the thing I’ve really learned by working in government, by working in the Minister’s office, by working in the public service. I specifically went to work for a politician – even though I’m very strongly apolitical – to work in the government and in the public service because I wanted to understand the executive, legislative, and administrative arms of the entity that shapes our lives so much. I feel like I have a fairly good understanding of that now and there’s a lot of people who influence your lives every day.

Tipping point

Have we really hit this tipping point? You know, is it, is it really any different today than it was yesterday? Well, we’ve had this exponential progress, we’ve got a third of the world online, we’ve got these super human powerful individuals in a large chunk of different societies around the world. I argue that we have hit and passed the tipping point but the realisation hasn’t hit everyone yet.

So, the question is for you to figure out your super power. How do you best contribute it to making the world a better place?

Powers and kryptonite

For me, going and working in a soup kitchen will not help anybody. I could possibly design a robot that creates super delicious and nutritional food to actually feed people. But me doing it myself would actually probably give them food poisoning and wouldn’t help anyone. You need to figure out your specific super powers so you can deploy them to some effect. Figure out how you can contribute to the world. Also figure out your kryptonite.

What biases do you have in place? What weaknesses do you have? What things will actually get in the way of you trying to do what you’re doing? I quite often see people apply critical analysis and critical thinking tools without any self-awareness and the problem is that we are super clever beings and we can rationalize anything we want if, emotionally, we like it or dislike it.

So try and have both self-awareness and critical analysis and now you’ve got a very powerful way to do some good. So I’m going to just finish with a quote.

JFDI

What better place than here? What better time than now? All hell can’t stop us now — RATM

The future is being determined whether you like it or not. But it’s not really being determined by the traditional players in a lot of ways. The power’s been distributed. It’s not just the politicians or the scholars or the researchers or corporates. It’s being invented right here, right now. You are contributing to that future either passively or actively. So you may as well get up and be active about it.

We’re heading towards this and we’ve possibly even hit the tipping point of a digital singularity and a democratic singularity. So, what are you going to do about it? I invite you to share with me in creating the future together.

Thank you very much.

You might also be interested in my blog post on Creating Open Government for a Digital Society and I think the old nugget of noblesse oblige applies here very well.

Antarctica Adventure!

Recently I adventured to Antarctica. It’s not every day you get to say that and it has always been a dream of mine to travel to the south pole (or close to it!) and to see the glaciers, penguins, whales and birds that inhabit such a remote environment. There is something liberating and awesome (in the full sense of the word) in going somewhere where very few humans have traveled. Especially for someone like me who is spends so much time online.


Being Australian and unused to real cold, I think I was also attracted to exploring a truly cold place. The problem with travelling to Antarctica is, as it turns out, the 48-60 hours of torment you need to go through to get there and to get back. The Drake Passage is the strip of open sea between the bottom of South America and the Peninsula of the Antarctic continent. It is by far the most direct way by ship to get to Antarctica and the port town of Ushuaia is well set up to support intrepid travelers in this venture. We took off from Ushuaia on a calm Wednesday afternoon and within a few hours, were into the dreaded Drake. I found whilst ever I was lying down I was ok but walking around was torture! So I ended up staying in bed about 40 hours by which time it had calmed down significantly. See my little video of the more calm but still awful parts :) And that was apparently a calm crossing! Ah well, turns out I don’t have sea legs. At least I wasn’t actually sick and I certainly caught up with a few months of sleep deprivation so arguably, it was the perfect enforced rest!

Now the adventure begins! We were accompanied by a number of stunning and enormous birds, including Cape Pestrels and a number of Albatrosses. Then we came across a Blue Whale which is apparently quite a rare thing to see in the Drake. It gave us a little show and then went on its way. We entered the Gerlache Strait and saw our first ice which was quite exciting, but by the end of the trip these early views were just breadcrumbs! We landed at Cuverville Island which was stunning! I had taken the snowshoeing option and so with 12 other adventurous travellers, we started up the snow covered hill to get some better views. We saw a large colony of Gentoo penguins which was fun, they are quite curious and cute creatures. We had to be careful to not block any “penguin highways” so often was giving way to scores of them as we explored. We saw a Leopard Seal in the water, which managed to catch one unfortunate penguin for lunch.

We then landed at Neko Harbour, our first step onto the actual Antarctic continent! Again, more stunning views and Gentoo penguins. We had the good fortune to also have time that day to land at Port Lockroy, an old British station in Antarctica and the southernmost post office in the world. I sent a bunch of postcards to friends and family on the 23rd December, I guess we’ll see how long they take to make the trip. We got to see a number of Snowy Sheathbill birds, which are a bit of a scavenger. It eats everything, including penguin poo, which is truly horrible. Although their eating habits are awful, they are quite beautiful and I was lucky enough to score a really good shot of one mid flight.

The next day we traveled down the Lemaire Channel to Petermann Island where we saw more Gentoo penguins, but also Adélie penguins, which are terribly cute! Again we did some snowshoeing which was excellent. I took some time to just sit and drink in the remoteness and the pristine environment that is Antarctica. It was humbling and wonderful to remember how truly small we all are and the magnificence of this world on which we reside. We saw some Minke Whales in the water beside the ship.

In the afternoon we broke through a few kilometres of ice and took the small boats (zodiacs) a short distance, then walked a half kilometre over ocean ice to land at Vernadsky Base, a Ukrainian scientific post. The dozen or so scientists there hadn’t seen any other humans for 8 months and were very pleased to see us :) All of them were men and when I asked why there weren’t any women scientists there I had a one word answer from our young Ukrainian guide: politics. Interesting… At any rate it was fascinating and it looks like they do some incredible science down there. There was also a small Elephant Seal who crawled up to the bar to say hi. They also have the southernmost bar in the world, and there we were treated to home-made sugar-based vodka, which was actually pretty good. So good in fact that one of the guests from our ship drank a dozen shots, then exchanged her bra for some snowmobile moonlighting around the base. It was quite hilarious and our poor expedition leader dealt with it very diplomatically.

To cap off a fantastic day, the catering crew put on a BBQ on the deck of the Ocean Nova which was a cold but excellent affair. The mulled wine and hot apple dessert went down particularly well against the cold! We did a trivia night which was great fun, and our team, “The Rise of the Gentoo” won! There was much celebration though the sweet victory was snatched from us when they found a score card for a team that hadn’t been marked. Ah well, all is fair in love and war! I had only one question for our expedition leader, would we see any Orca? Orca are a new favourite animal of mine. They are brilliant, social and strategic animals. Well worth looking into.

The next morning we were woken particularly early as there were some Orca in the water! I was first on deck, in my pyjamas and I have to admit I squealed quite a lot, much to the amusement of our new American friends. At one point I saw all five Orca come to the surface and I could only watch in awe. They really are stunning animals. I learned from the on board whale expert that Orca have some particularly unique hunting techniques. Often they come across a seal or two on a small iceberg surrounded by water and so they swim up to it in formation and then dive and hit their tails simultaneously, creating a small tidal wave that washes the seal off into the water ready for the taking. Very clever animals. Then they always share the spoils of a hunt amongst the pod, and often will simply daze a victim to teach young Orca how to hunt before dealing a death blow. Apparently Orca have been known to kill much larger animals including Humpback Whales.

Anyway, the rest of day we did some zodiac trips (the small courier boats) around Paradise Harbour which was bitterly cold, and then around the Melchior Islands in Dallman Bay which was spectacular. One of the birds down here is the Antarctic Cormorant, closely related to the Cormorants in Australia. They look quite similar :) We got to see number of them nesting. Going back through the Drake I had to confine myself to my room again, which meant I missed seeing Humpback Whales. This was unfortunate but I really did struggle to travel around the ship when in the Drake without getting very ill.

On a final note, I traveled with the Antarctica XXI, which has a caring and wonderful crew. The crew includes scientists, naturalists, biologists and others who genuinely love Antarctica. As a result we had a number of amazing lectures throughout the trip about the wildlife and ecosystem of Antarctica. Learning about Krill, ice flow, climate change and the migratory patterns of the whales was awesome. I wish I had been able to attend more talks but I couldn’t get up during most of the Drake :/ The rest of the crew who looked after navigation, feeding us, cleaning and all the other operations were just amazing. A huge thank you to you all for making this voyage the trip of a lifetime!

One thing I didn’t anticipate was the land sickness! 24 hours after getting off the boat and I still feel the sway of the ocean! All of my photos, plus a couple of group photos and a video or two are up on my flickr account in the Antarctica 2013 set at http://www.flickr.com/photos/piawaugh/sets/72157638364999506/ You can also see photos from Buenos Aires if you are interested at http://www.flickr.com/photos/piawaugh/sets/72157638573728155/

A special thank you also to Jamie, our expedition leader, who delivered an incredible itinerary under some quite trying circumstances, and all the expedition crew! You guys totally rock :)

I met some amazing new friends on the trip, and got to spend some quality time with existing friends. You don’t go on adventures like this without meeting other people of a similar adventurous mindset, which is always wonderful.

For everyone else, I highly highly recommend you check out the Antarctica XXI (Ocean Nova) trips if you are interested in going to Antarctica or the Arctic.

For all my linux.conf.au friends, yes I did scope out Antarctica for a potential future conference, but given the only LUGs there are Gentoos, I think we should all spare ourselves the pain ;)

Below are links to some additional reading about the places we visited as provided by the Antarctic XX1 crew, the list of animals that were sighted throughout the journey and some other bits and pieces that might be of interest. Below are also some excellent quotes about Antarctica that were on the ship intranet that I just had to post to give you a flavour of what we experienced :)

  • AXXI_Logbook_SE2-1314 (PDF) Log book for the trip. Includes animals we saw, where we went and some details of our activities. Lovely work by the Antarctica XXI crew :)
  • Daily-program (PDF) – our daily program for the journey
  • Info-landings (PDF) – information about the landing sites we went to

The church says the earth is flat, but I know that it is round, for I have seen the shadow on the moon, and I have more faith in a shadow than in the church. — Ferdinando Magallanes

We were the only pulsating creatures in a dead world of ice. — Frederick Albert Cook

Below the 40th latitude there is no law; below the 50th no god; below the 60th no common sense and below the 70th no intelligence whatsoever. — Kim Stanley Robinson

I have never heard or felt or seen a wind like this. I wondered why it did not carry away the earth. — Cherry-Garrard

Great God ! this is an awful place. — Robert Falcon Scott, referring to the South Pole

Human effort is not futile, but Man fights against the giant forces of Nature in the spirit of humility. — Ernest Shackleton

Had we lived I should have had a tale to tell of the hardihood, endurance and courage of my companions …. These rough notes and our dead bodies must tell the tale. — Robert Falcon Scott

People do not decide to become extraordinary. They decide to accomplish extraordinary things. — Edmund Hillary

Superhuman effort isn’t worth a damn unless it achieves results. — Ernest Shackleton Adventure is just bad planning. — Roald Amundsen

For scientific leadership, give me Scott; for swift and efficient travel, Amundsen; but when you are in a hopeless situation, when there seems to be no way out, get on your knees and pray for Shackleton. — Sir Raymond Priestley

The imperatives for changing how we do government

Below are some of the interesting imperatives I have observed as key drivers for changing how governments do things, especially in Australia. I thought it might be of interest for some of you :) Particularly those trying to understand “digital government”, and why technology is now so vital for government services delivery:

  • Changing public expectations – public expectations have fundamentally changed, not just with technology and everyone being connected to each other via ubiquitous mobile computing, but our basic assumptions and instincts are changing, such as the innate assumption of routing around damage, where damage might be technical or social. I’ve gone into my observations in some depth in a blog post called Online Culture – Part 1: Unicorns and Doom (2011).
  • Tipping point of digital engagement with government – in 2009 Australia had more citizens engaging with government  online than through any other means. This digital tipping point creates a strong business case to move to digitally delivered services, as a digital approach enables more citizens to self serve online and frees up expensive human resources for our more vulnerable, complex or disengaged members of the community.
  • Fiscal constraints over a number of years have largely led to IT Departments having done more for less for years, with limited investment in doing things differently, and effectively a legacy technology millstone. New investment is needed but no one has money for it, and IT Departments have in many cases, resorted to being focused on maintenance rather than project work (an upgrade of a system that maintains the status quo is still maintenance in my books). Systems have reached a difficult point where the fat has been trimmed and trimmed, but the demands have grown. In order to scale government services to growing needs in a way that enables more citizens to self service, new approaches are necessary, and the capability to aggregate services and information (through open APIs and open data) as well as user-centric design underpins this capability.
  • Disconnect between business and IT – there has been for some time a growing problem of business units disengaging with IT. As cheap cloud services have started to appear, many parts of government (esp Comms and HR) have more recently started to just avoid IT altogether and do their own thing. On one hand this enables some more innovative approaches, but it also leads directly to a problem in whole of government consistency, reliability, standards and generally a distribution of services which is the exact opposite of a citizen centric approach. It’s important that we figure out how to get IT re-engaged in the business, policy and strategic development of government such that these approaches are more informed and implementable, and such that governments use, develop, fund and prioritise technology in alignment with a broader vision.
  • Highly connected and mobile community and workforce – the opportunities (and risks) are immense, and it is important that governments take an informed and sustainable approach to this space. For instance, in developing public facing mobile services, a mobile optimised web services approach is more inclusive, cost efficient and sustainable than native applications development, but by making secure system APIs and open data available, the government can also facilitate public and private competition and innovation in services delivery.
  • New opportunities for high speed Internet are obviously a big deal in Australia (and also New Zealand) at the moment with the new infrastructure being rolled out (FTTP in both countries), and setting up to better support and engaging with citizens digitally now, before mainstream adoption, is rather important and urgent.
  • Impact of politics and media on policy – the public service is generally motivated to have an evidence-based approach to policy, and where this approach is developed in a transparent and iterative way, in collaboration with the broader society, it means government can engage directly with citizens rather than through the prism of politics or the media, each which have their own motivations and imperatives.
  • Prioritisation of ICT spending – it is difficult to ensure the government investment and prioritisation of ICT projects aligns with the strategic goals of the organisation and government, especially where the goals are not clearly articulated.
  • Communications and trust – with anyone able to publish pretty much anything, it is incumbent on governments to be a part of the public narrative as custodians of a lot of information and research. By doing this in a transparent and apolitical way, the public service can be a valued and trusted source.
  • The expensive overhead of replication of effort across governments – consolidating where possible is vital to improve efficiencies, but also to put in place the mechanisms to support whole of government approaches.
  • Skills – a high technical literacy directly supports the capacity to innovate across government and across the society in every sector. As such this should be prioritised in our education systems, way above and well beyond “office productivity” tools.

Note: I originally had some of this in another blog post about open data and digital government in NZ, buried some way down. Have republished with some updated ideas.

Embrace your inner geek: speech to launch QUT OSS community

This was a speech I gave in Brisbane to launch the QUT OSS group. It talks about FOSS, hacker culture, open government/data, and why we all need to embrace our inner geek :)

Welcome to the beginning of something magnificent. I have had the luck, privilege and honour to be involved in some pretty awesome things over the 15 or so years I’ve been in the tech sector, and I can honestly say it has been my involvement in the free and Open Source software community that has been one of the biggest contributors.

It has connected me to amazing and inspiring geeks and communities nationally and internationally, it has given me an appreciation of the fact that we are exactly as free as the tools we use and the skills we possess, it has given me a sense of great responsibility as part of the pioneer warrior class of our age, and it has given me the instincts and tools to do great things and route around issues that get in the way of awesomeness.

As such I am really excited to be part of launching this new student focused Open Source group at QUT, especially one with academic and industry backing, so congratulations to QUT, Red Hat, Microsoft and Tech One.

It’s also worth mentioning that Open Source skills are in high demand, both nationally and internationally, and something like 2/3 of Open Source developers are doing so in some professional capacity.

So thanks in advance for having me, and I should say up front that I am here in a voluntary capacity and not to represent my employer or any other organisation.

Who am I? Many things: martial artist, musician, public servant, recently recovered ministerial adviser, but most of all, I am a proud and reasonably successful geek.

Geek Culture

So firstly, why does being a geek make me so proud? Because technology underpins everything we do in modern society. It underpins industry, progress, government, democracy, a more empowered, equitable and meritocratic society. Basically technology supports and enhances everything I care about, so being part of that sector means I can play some small part in making the world a better place.

It is the geeks of this world that create and forge the world we live in today. I like to go to non-geek events and tell people who usually take us completely for granted, “we made the Internet, you’re welcome”, just to try to embed a broader appreciation for tech literacy and creativity.

Geeks are the pioneers of the modern age. We are carving out the future one bit at a time, and leading the charge for mainstream culture. As such we have, I believe, a great responsibility to ensure our powers are used to improve life for all people, but that is another lecture entirely.

Geek culture is one of the driving forces of innovation and progress today, and it is organisations that embrace technology as an enabler and strategic benefit that are able to rapidly adapt to emerging opportunities and challenges.

FOSS culture is drawn very strongly from the hacker culture of the 60′s and 70′s. Unfortunately the term hacker has been stolen by the media and spooks to imply bad or illegal behaviours, which we would refer to as black hat hacking or cracking. But true hacker culture is all about being creative and clever with technology, building cool stuff, showing off one’s skills, scratching an itch.

Hacker culture led to free software culture in the 80′s and 90′s, also known as Open Source in business speak, which also led to a broader free culture movement in the 90′s and 00′s with Creative Commons, Wikipedia and other online cultural commons. And now we are seeing a strong emergence of open government and open science movements which is very exciting.

Open Source

A lot of people are aware of the enormity of Wikipedia. Even though Open Source well predates Wikipedia, it ends up being a good tool to articulate to the general population the importance of Open Source.

Wikipedia is a globally crowdsourced phenomenon that, love it or hate it, has made knowledge more accessible than ever before. I personally believe that the greatest success of Wikipedia is in demonstrating that truth is perception, and the “truth” held in the pages of Wikipedia ends up, ideally anyway, being the most credible middle ground of perspectives available. The discussion pages of any page give a wonderful insight into any contradicting perspectives or controversies and it teaches us the importance of taking everything with a grain of salt.

Open Source is the software equivalent of Wikipedia. There are literally hundreds of thousands, if not millions, of Open Source software projects in the world, and you use thousands of the most mature and useful ones every day without even knowing it. Open Source operating systems like Linux or MINIX power your cars, devices, phones, telephone exchanges and the majority of servers and supercomputers in the world. Open Source web tools like WordPress, Drupal or indeed MediaWiki (the software behind Wikipedia) power an enormous number of the websites you go to every day. Even Google heavily uses Open Source software to build the world’s most reliable infrastructure. If Google.com doesn’t work, you generally check your own network reliability first.

Open Source is all about people working together to scratch a mutual itch, sharing in the development and maintenance of software that is developed in an open and collaborative way. You can build on top of existing Open Source software platforms as a technical foundation for innovation, or employ Open Source development methodologies to better innovate internally. I’m still terrified by the number of organisations I see that don’t use basic code revision systems and instead email around zip files!

Open Source means you can leverage expertise far beyond what you could ever hope to hire, and you build your business around services. The IT sector used to be all about services before the proprietary lowest common denominator approach to software emerged in the 80s.

But we have seen the IT sector largely swing heavily back to services, except in the case of niche software markets, and companies compete on quality of services and whole solution delivery rather than specific products. Services companies that leverage Open Source often find their cost of delivery lower, particularly in the age of “cloud” software as a service, where customers want to access software functionality as a utility based on usage.

Open Source can help improve quality and cost effectiveness of technology solutions as it creates greater competition at the services level.

The Open Source movement has given us an enormous collective repository of stable, useful, innovative, responsive and secure software solutions. I must emphasise secure because many eyes reviewing code means a better chance of identifying and fixing issues. Security through obscurity is a myth and it always frustrates me when people buy into the line that Open Source is somehow less secure than proprietary solutions because you can see the code.

If you want to know about government use of Open Source, check out the Open Source policy on the Department of Finance and Deregulation website. It’s a pretty good policy not only because it encourages procurement processes to consider Open Source equally, but because it encourages government agencies to contribute to and get involved in the Open Source community.

Open Government

It has been fascinating to see a lot of Open Source geeks taking their instincts and skills with them into other avenues. And to see non-technical and non-Open Source people converging on the same basic principles of openness and collaboration for mutual gain from completely different avenues.

For me, the most exciting recent evolution of hacker ethos is the Open Government movement.

Open Government has always been associated with parliamentary and bureaucratic transparency, such as Freedom of Information and Hansard.

I currently work primarily on the nexus where open government meets technology. Where we start to look at what government means in a digital age where citizens are more empowered than ever before, where globalisation challenges sovereignty, where the need to adapt and evolve in the public service is vital to provide iterative, personalised and timely responses to new challenges and opportunities both locally and globally.

There are three key pillars of what we like to call “Government 2.0”. A stupid term I know, but bear with me:

  1. Participatory governance – this is about engaging the broader public in the decision making processes of government, both to leverage the skills, expertise and knowledge of the population for better policy outcomes, and to give citizens a way to engage directly with decisions and programs that affect their everyday lives. Many people think about democratic engagement as political engagement, but I contend that the public service has a big role to play in engaging citizens directly in co-developing the future together.
  2. Citizen centricity – this is about designing government services with the citizen at the centre of the design. Imagine if you will, and I know many in the room are somewhat technical, imagine government as an API, where you can easily aggregate information and services thematically or in a deeply personalised way for citizens, regardless of changes to the structure or machinery of government (see the sketch after this list). Imagine being able to change your address in one location, and have one place to ask questions or get the services you need. This is the vision of my.gov.au, and indeed there are several initiatives that deliver on this vision, including the Canberra Connect service in the ACT, which is worth looking at. In the ACT you can go into any Canberra Connect location for all your Territory/Local government needs, and they then interface with all the systems of that government behind the scenes in a way that is seamless to the citizen. It is vital that governments and agencies start to realise that citizens don’t care about the structures of government, and neither should they have to. It is up to us all to start thinking about how we do government in a whole of government way to best serve the public.
  3. Open and transparent government – this covers both parliamentary transparency and opening up government data and APIs. Open data also opens up opportunities for greater analysis, policy development, mobile service delivery, public transparency and trust, economic development through new services and products being developed in the private sector, and much more.
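To make the “government as an API” idea in point two a little more concrete, here is a minimal sketch of how a citizen-centric front end might aggregate services from several agencies behind a single view. The endpoints, field names and token handling are entirely hypothetical illustrations, not real my.gov.au or agency services:

```python
# Hypothetical sketch only: these endpoints and payloads do not exist.
# The point is that each agency keeps its own systems, and a citizen-facing
# layer aggregates them through APIs rather than exposing government structure.
import requests

AGENCY_ENDPOINTS = {
    "health": "https://api.example.gov.au/health/v1/citizens/{cid}",
    "tax": "https://api.example.gov.au/tax/v1/citizens/{cid}",
    "licensing": "https://api.example.gov.au/licensing/v1/citizens/{cid}",
}

def citizen_dashboard(citizen_id: str, token: str) -> dict:
    """Aggregate a citizen's records from several (hypothetical) agency APIs."""
    headers = {"Authorization": f"Bearer {token}"}
    dashboard = {}
    for agency, url_template in AGENCY_ENDPOINTS.items():
        resp = requests.get(url_template.format(cid=citizen_id),
                            headers=headers, timeout=10)
        # The front end only sees JSON; machinery-of-government changes behind
        # the APIs don't have to break the citizen's single view.
        dashboard[agency] = resp.json() if resp.ok else {"error": resp.status_code}
    return dashboard
```

The design point is that the aggregation layer, not the citizen, absorbs the complexity of which agency owns which system.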

Open Data

Open data is very much my personal focus at the moment. I’m now in charge of data.gov.au, which we are in the process of migrating to an excellent Open Source data repository called CKAN, which will be up soon. There is currently a beta up for people to play with.
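For those who want to poke at it, CKAN exposes a simple JSON “action” API, so once the migration is done you should be able to search the catalogue programmatically along these lines. This is a rough sketch only; the base URL is an assumption about the final data.gov.au deployment and the query is just an example:

```python
# Rough sketch: query a CKAN catalogue via its standard action API.
# The base URL below is an assumption about the final data.gov.au deployment.
import requests

CKAN_BASE = "https://data.gov.au/api/3/action"

def search_datasets(query: str, rows: int = 5) -> list:
    """Return matching dataset titles using CKAN's package_search action."""
    resp = requests.get(f"{CKAN_BASE}/package_search",
                        params={"q": query, "rows": rows}, timeout=10)
    resp.raise_for_status()
    return [dataset["title"] for dataset in resp.json()["result"]["results"]]

if __name__ == "__main__":
    print(search_datasets("public transport"))
```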

I am also the head cat herder for a volunteer-run project called GovHack, which ran just a week ago, where we had 1000 participants from 8 cities, including here in Brisbane, all working with government data to build 130 new hacks including mashups, data visualisations, mobile and other applications, interactive websites and more. GovHack clearly shows the benefits to society when you open up government data for public use, particularly if it is available in a machine readable way and under a very permissive copyright licence such as Creative Commons.

I would highly recommend you check out my blog posts about open data around the world from when I went to a conference in Helsinki last year and got to meet luminaries in this space including Hans Rosling, Dr Tim Hubbard and Rufus Pollock. I also did some work with the New Zealand Government looking at NZ open data practice and policy which might be useful, where we were also able to identify some major imperatives for changing how governments work.

The exciting thing is how keen government agencies in Federal, State, Territory and Local governments are to open up their data! To engage meaningfully with citizens. And to evolve their service delivery to be more personalised and effective for everyone. We are truly living in a very exciting time for technologists, democracy and the broader society.

Though to be fair, governments don’t really have much choice. Citizens are more empowered than ever before, and governments have to adapt, delivering responsive, iterative and personalised services and policy, or risk losing relevance. We have seen the massive distribution of every traditional bastion of power, from publishing to communications, monitoring and enforcement, and even property is about to shift dramatically with the leaps in 3D printing and nanotechnologies. Ultimately governments are under a lot of pressure to adapt the way we do things, and it is a wonderful thing.

The Federal Australian Government already has in place several policies that directly support opening up government data.

Australia has also recently signed up to the Open Government Partnership, an international consortium of over 65 governments, which will be a very exciting step for open data and other aspects of open government.

At the State and Territory level, there is also a lot of movement around open data. Queensland and the ACT launched their new open data platforms late last year with some good success. NSW and South Australia have launched new platforms in the last few weeks with hundreds of new data sets. Western Australia and Victoria have been publishing some great data for some time, and everyone is looking at how they can do so better!

Many local governments have been very active in trying to open up data, and a huge shout out to the Gold Coast City Council here in Queensland who have been working very hard and doing great things in this space!

It is worth noting that the NSW government currently has a big open data policy consultation happening, which closes on the 17th of June and is well worth looking into and contributing to.

Embracing geekiness

One of my biggest bugbears is when people say “I’m sorry, the software can’t do that”. It is the learned helplessness of the tech illiterate that is our biggest challenge for innovating and being globally competitive, and as countries like Australia are overwhelmingly well off, with the vast majority of our citizens living high quality lives, it is this learned helplessness that is becoming the difference between the haves and have nots: the empowered and the disempowered.

Teaching everyone to embrace their inner geek isn’t just about improving productivity, efficiency, innovation and competitiveness, it is about empowering our people to be safer, smarter, more collaborative and more empowered citizens in a digital world.

If everyone learnt and experienced even the tiniest amount of programming, we would all have embedded that wonderful instinct that says “the software can do whatever we can imagine”.

Open Source communities and ethos give us a clear vision as to how we can overcome every traditional barrier to collaboration to make awesome stuff in a sustainable way. They teach us that enlightened self interest in the age of the Internet translates directly to open and mutually beneficial collaboration.

We can all stand on the shoulders of giants that have come before, and become the giants that support the next generation of pioneers. We can all contribute to making this world just a bit more awesome.

So get out there, embrace your inner geek and join the open movement. Be it Open Source, open government or open knowledge, and whatever your particular skills, you can help shape the future for us all.

Thank you for coming today, thank you to Jim for inviting me to be a part of this launch, and good luck to you all in your endeavours with this new project. I look forward to working with you to create the future of our society, together.

So you want to change the world?

Recently I spoke at BarCamp Canberra about my tips and tricks to changing the world. I thought it might be useful to get people thinking about how they can best contribute to the world, according to their skills and passions.

Completely coincidentally, my most excellent boss did a talk a few sessions ahead of me which was the American Civil War version of the same thing :) I highly recommend it. John Sheridan – Lincoln, Lee and ICT: Lessons from the Civil War.

So you want to change the world?

Here are the tactics I use to some success. I heartily recommend you find what works for you. Then you will have no excuse but to join me in implementing Operation World Awesomeness.

The Short Version:

No wasted movement.

The Long Version:

1) Pick your battles: there are a million things you could do. What do you most care about? What can you maintain constructive and positive energy about, even in the face of towering adversaries and significant challenges? What do you think you can make a difference in? There is a subtle difference between choosing to knock down a mountain with your forehead, and renting a bulldozer. If you find yourself expending enormous energy on something, but not making a difference, you need to be comfortable enough to change tactics.

2) Work to your strengths: everyone is good at something. If you choose to contribute to your battle in a way that doesn’t work to your strengths, whatever they are, then you are wasting energy. You are not contributing in the best way you can. You need to really know yourself, understand what you can and can’t do, then do what you can do well, and supplement your army with the skills of others. Everyone has a part to play and a meaningful way to contribute. FWIW, I work to know myself through my martial arts training, which provides a useful cognitive and physical toolkit to engage in the world with clarity. Find what works for you. As Sun Tzu said: know yourself.

3) Identify success: figure out what success actually looks like, otherwise you have neither a measure of progress nor a measure of completion. I’ve seen too many activists get caught up in a battle and continue fighting well beyond the battle being won, or indeed keep hitting their heads against a battle that can’t be won. It’s important to continually monitor and measure, hold yourself to account, and ensure you are making progress. If not, change tactics.

4) Reconnaissance: do your research. Whatever your area of interest there is likely a body of work that has come before you that you can build upon. Learn about the environment you are working in, the politics, the various motivations and interests at play, the history and structure of your particular battlefield. Find levers in the system that you can press for maximum effect, rather than just straining against the weight of a mountain. Identify the various moving parts of the system and you have the best chance to have a constructive and positive influence.

5) Networks & Mentors: identify all the players in your field. Who is involved, influential, constructive, destructive, effective, etc.? It is important to understand the motivations at play so you can engage meaningfully and collaboratively, and build a mutually beneficial network in the pursuit of awesomeness. Strong mentors are a vital asset and they will teach you how to navigate the rapids and make things happen. A strong network of allies is also vital to keep you on track, accountable, and true to your own purpose. People usually strive to meet the expectations of those around them, so surround yourself with high expectations. Knowing your network also helps you identify issues and opportunities early.

6) Sustainability: have you put in place a succession plan? How will your legacy continue on without you? It’s important if your work is to continue on that it not be utterly reliant upon one individual. You need to share your vision, passion and success. Glory shared is glory sustained, so bring others on board, encourage and support them to succeed. Always give recognition and thanks to people who do great stuff.

7) Patience: remember the long game. Nothing changes overnight. It always takes a lot of work and persistence, and remembering the long game will help during those times when it doesn’t feel like you are making progress. Again, your network is vital as it will help you maintain your strength, confidence and patience :) Speaking of which, a huge thanks to Geoff Mason for reminding me of this one on the day.

8) Shifting power: it is worth noting that we are living in the most exciting of times. Truly. Individuals are more empowered than ever before to do great things. The Internet has created a mechanism for the mass distribution of power, putting into the hands of all people (all those online anyway) the tools to:

  1. publish and access knowledge;
  2. communicate and collaborate with people all around the world;
  3. monitor and hold others to account including companies, governments and individuals;
  4. act as enforcers for whatever code or law they uphold. This is of course quite controversial but fascinating nonetheless; and
  5. finally, with the advances in 3D printing and nanotechnology, we are on the cusp of all people having unprecedented access to property.

Side note: Poverty and hunger, we shall overcome you yet! Then we just urgently need to prioritise education of all the people. But that is a post for another day :) Check out my blog post on Unicorns and Doom, which goes into my thoughts on how online culture is fundamentally changing society.

This last aspect is particularly fascinating as it changes the game from one between the haves and the have nots, to one between those with and those without skills and knowledge. We are moving from a material wealth differentiation in society towards an intellectual wealth differentiation. Arguably we always had the latter, but the former has long been a bastion for law, structures, power and hierarchies. And it is all changing.

“What better place than here, what better time than now?” — RATM

I am so thankful – the gap is sorted

I will be doing a longer blog post about the incredible adventure it was to bring Sir Tim Berners-Lee and Rosemary Leith to Australia 10 days ago, but tonight I have had something just amazing happen that I wanted to briefly reflect upon.

I feel humbled, amazed and extremely extremely thankful to be part of such an incredible community in Australia and New Zealand, and a lot of people have stood up and supported me with something I felt very uncomfortable having to deal with.

Basically, a large sponsor pulled out from the TBL Down Under Tour (which I was the coordinator for, supported by the incredible and hard working Jan Bryson) just a few weeks before the start, leaving us with a substantial hole in the budget. I managed to find sponsorship to cover most of the gap, but was left $20k short (for expenses only) and just decided to figure it out myself. Friends rallied around and suggested the crowdsourcing approach which I was hesitant to do, but eventually was convinced it wouldn’t be a bad thing.

We started crowdsourcing less than two days ago and raised around $6k ($4,800 on GoGetFunding and $1,200 from Jeff’s earlier effort). This was incredible, especially the wonderfully supportive and positive comments that people left. Honestly, it was amazing. And then, much to my surprise and shock, Linux Australia offered to contribute the rest of the $20k. Silvia is closing the crowdsourcing site as I write this, and I’m thankful to her for setting it up in the first place.

I am truly speechless. And humbled. And….

It is worth noting that, stress and exhaustion aside, and though I put over 350 hours of my own time into this project, for me it has been completely worth it. It has brought many subjects dear to my heart into the mainstream public narrative and media, including open government, open data, open source, net neutrality, data retention and indeed, the importance of geeks. I think such a step forward in public narrative will help us take a few more steps towards the future where Geeks Rule Over Kings ;) (my lca2013 talk)

It was also truly a pleasure to hang out with Tim and Rosemary who are extremely lovely people, clever and very interesting to chat to.

For the haters :) No I am not suffering from cultural cringe. No I am not needing an external voice to validate perspectives locally. There is only one TBL and if he was Australian I’d still have done what I did :P

More to come in the wrap up post on the weekend, but thank you again to all the individuals who contributed, and especially to Linux Australia for offering to fill the gap. There are definitely lessons learnt from this experience which I’ll outline later, but if I was an optimist before, this gives me such a sense of confidence, strength and support to continue to do my best to serve my community and the broader society as best I can.

And I promise I won’t burn out in the meantime ;)

Po is looking forward to spending more time with his human. We all made sacrifices :) (old photo courtesy of Mary Gardiner)

My NZ Open Data and Digital Government Adventure

On a recent trip to New Zealand I spent three action packed days working with Keitha Booth and Alison Stringer looking at open data. These two have an incredible amount of knowledge and experience to share, and it was an absolute pleasure to work with them, albeit briefly. They arranged meetings with about 3000* individuals from across different parts of the NZ government to talk about everything from open data, ICT policy, the role of government in a digital era, iterative policy, public engagement and the components that make up a feasible strategy for all of the above.

It’s important to note, I did this trip in a personal capacity only, and was sure to be clear I was not representing the Australian government in any official sense. I saw it as a bit of a public servant cultural exchange, which I think is probably a good idea even between agencies let alone governments ;)

I got to hear about some of the key NZ Government data projects, including data.govt.nz, data.linz.govt.nz, the statistical data service, some additional geospatial and linked data work, some NZ government planning and efforts around innovation and finding more efficient ways to do tech, and much more. I also found myself in various conversations with extremely clever people about science and government communications, public engagement, rockets, circus and more.

It was awesome, inspiring, informative and exhausting. But this blog post aims to capture the key ideas from the visit. I’d love your feedback on the ideas/frameworks below, and I’ll extrapolate on some of these ideas in followup posts.

I’m also looking forward to working more collaboratively with my colleagues in New Zealand, as well as from across all three spheres of government in Australia. I’d like to set up a way for government people in the open data and open government space across Australia/New Zealand to freely share information and technologies (in code), identify opportunities to collaborate, share their policies and planning for feedback and ideas, and generally work together for more awesome outcomes all round. Any suggestions for how best to do this? :) GovDex? A new thing? Will continue public discussions on the Gov 2.0 mailing list, but I think it’ll be also useful to connect govvies privately whilst encouraging individuals and agencies to promote their work publicly.

This blog post is a collaboration with the wonderful Alison Stringer, in a personal capacity only. Enjoy!

* 3000 may be a wee stretch :)

Table of Contents

Open Data

  • Strategic/Policy Building Blocks
  • Technical Building Blocks
  • References

Digital and Open Government

  • Some imperatives for changing how we do government
  • Policy/strategic components

Open data

Strategic/policy building blocks

Below are some basic building blocks we have found to be needed for an open data strategy to be sustainable and effective in gaining value for both the government and the broader community including industry, academia and civil society. It is based on the experiences in NZ, Aus and discussions with open data colleagues around the world. Would love your feedback, and I’ll expand this out to a broader post in the coming weeks.

  • Policy – open as the default, specifically encouraging and supporting a proactive and automated disclosure of government information in an appropriate, secure and sustainable way. Ideally, each policy should be managed as an iterative and live document that responds to changing trends, opportunities and challenges:
    • Copyright and licensing – providing clear guidance that government information can be legally used. Using simple, permissive and known/trusted licences is important to avoid confusion.
    • Procurement – procurement policy creates a useful and efficient lever to establish proactive “business as usual” disclosure of information assets, by requiring new systems to support such functionality and publishing in open data formats from the start. This also means the security and privacy of data can be built into the system.
    • Proactive publishing – a policy of proactive disclosure helps avoid the inefficiencies of retrospective data publishing. It is important to also review existing assets and require an implementation plan from all parts of government on how they will open up their information assets, and then measure, monitor and report on the progress.
  • Legislation – ensuring any legislative blockers to publishing data are sorted; for instance, in some jurisdictions civil servants are personally liable if someone takes umbrage at the publication of something. Indeed there may be some issues here that are perceptions as opposed to reality. A review of any relevant legislation and a plan to fix any blockers to publishing information assets is recommended.
  • Leadership/permission – this is vital, especially in early days whilst open data is still being integrated as business as usual. It should be as senior as possible.
  • Resourcing – it is very hard to find new money in governments in the current fiscal environment. However, we do have people. Resourcing the technical aspects of an open data project would only need a couple of people and a little infrastructure that can both host and point to data and data services (the UK open data platform runs on less than £460K per year, including the costs of three staff). But there needs to be a policy of distributed publishing: in the UK there are ~760 registered publishers of data throughout government. It would be useful to have at least one data publisher per department (probably working on this as part of their job only, alongside the current senior agency data champion role) who spends a day or two a week just seeking out and publishing data for their department, and identifying opportunities to automate data publishing with the data.govt.nz team.
  • Value realisation – including:
    • Improved policy development across government through better and early access to data and tools to use data
    • Knowledge transfer across government, especially given so many senior public servants are retiring in the coming years
    • Improved communication of complex issues to the public, better public engagement and exploration of data – especially with data visualisation tools
    • Monitoring, reporting, measuring clear outcomes (productivity savings, commercialisation, new business or products/projects, innovation in government, improved efficiency in Freedom of Information responses, efficiencies in not replicating data or reports, effectiveness and metrics around projects, programs and portfolios)
    • Application of data in developing citizen centric services and information
    • Supporting and facilitating commercialisation opportunities
  • Agency collaboration – the importance of agency collaboration can not be overstated. Especially on sharing/using/reusing data, on sharing knowledge and skills, on public engagement and communications. Also on working together where projects or policy areas might be mutually beneficial and on public engagement such that there is a consistent and effective dialogue with citizens. This shouldn’t be a bottlenecked approach, but rather a distributed network of individuals at different levels and in different functions.
  • Technology – need to have the right bits in place, or the best policy/vision won’t go anywhere :) See below for an extrapolation on the technical building blocks.
  • Public engagement – a public communications and engagement strategy is vital to build and support a community of interest and innovation around government data.

Technical building blocks

Below are some potential technical building blocks for supporting a whole of government(s) approach to information management, proactive publishing and collaboration. Let me know what you think I’m missing :)

Please note, I am not in any way suggesting this should be a functional scope for a single tool. On the contrary, I would suggest for each functional requirement the best of breed tool be found and that there be a modular approach such that you can replace components as they are upgraded or as better alternatives arise. There is no reason why a clever frontend tool couldn’t talk to a number of backend services.

  • Copyright and licensing management – if an appropriately permissive copyright licence is applied to data/content at the point of creation, and stored in the metadata, it saves on the cost of administration down the track. The Australian Government default licence has been determined as Creative Commons BY, so agencies and departments should use that, regardless of whether the data/content is ever published publicly. The New Zealand government recommends CC-BY as the default for data and information published for re-use.
  • An effective data publishing platform(s) (see Craig Thomler’s useful post about different generations of open data platforms) that supports the publishing, indexing and federation of data sources/services including:
    • Geospatial data – one of the pivotal data sets required for achieving citizen centric services, and in bringing the various other datasets together for analysis and policy development.
    • Real time data – eg, buses, weather, sensor networks
    • Statistical data – eg census and surveys, where raw access to data is only possible through an API that gives a minimum number of results so as to make individual identification difficult
    • Tabular data – such as spreadsheets or databases of records in structured format
  • Identity management – for publishers at the very least.
  • Linked data and metadata system(s) – particularly where such data can be automatically inferred or drawn from other systems.
  • Change control – the ability to push or take updates to datasets, or multiple files in a dataset, including iterative updates from public or private sources in a verifiable way.
  • Automation tools for publishing and updating datasets, including, where possible, proactive system-to-system publishing from the source system (see the sketch after this list).
  • Data analysis and visualisation tools – both to make it easier to communicate data, but also to help people (in government and the public) analyse and interact with any number of published datasets more effectively. This is far more efficient for government than each department trying to source their own data visualisation and analysis tools.
  • Reporting tools – that clearly demonstrate status, progress, trends and value of open data and open government on an ongoing basis. Ideally this would also feed into a governance process to iteratively improve the relevant policies on an ongoing basis.
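As one illustration of what that system-to-system publishing (flagged in the automation bullet above) might look like in practice, here is a minimal sketch using CKAN’s standard action API (package_create and resource_create). The base URL, API key, organisation and dataset details are placeholders for illustration, not a real catalogue:

```python
# Minimal sketch of automated, system-to-system publishing to a CKAN catalogue.
# Base URL, API key, organisation and dataset fields are placeholders only.
import requests

CKAN_BASE = "https://catalogue.example.govt.nz/api/3/action"
HEADERS = {"Authorization": "REPLACE_WITH_PUBLISHER_API_KEY"}

def publish_dataset(name: str, title: str, csv_url: str) -> None:
    """Create a dataset and attach a CSV resource via CKAN's action API."""
    requests.post(f"{CKAN_BASE}/package_create", headers=HEADERS, json={
        "name": name,                    # URL-friendly identifier
        "title": title,
        "owner_org": "example-agency",   # placeholder publishing organisation
        "license_id": "cc-by",           # permissive licence applied at creation time
    }, timeout=10).raise_for_status()

    requests.post(f"{CKAN_BASE}/resource_create", headers=HEADERS, json={
        "package_id": name,
        "url": csv_url,   # the source system publishes the file; CKAN points to it
        "format": "CSV",
    }, timeout=10).raise_for_status()

if __name__ == "__main__":
    publish_dataset("bus-timetables", "Bus timetables (daily snapshot)",
                    "https://transport.example.govt.nz/exports/bus-timetables.csv")
```

The idea is that a source system can push new or updated datasets on a schedule, so publication becomes business as usual rather than a retrospective manual task.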

Some open data references

Digital and Open Government

Although I was primarily in New Zealand to discuss open data, I ended up entering into a number of discussions about the broader aspects of digital and open government, which is entirely appropriate and a natural evolution. I was reminded of the three pillars of open government that we often discuss in Australia which roughly translate to:

  • Transparency
  • Participation
  • Citizen centricity

There is a good speech by my old boss, Minister Kate Lundy, which explains these in some detail.

I got into a couple of discussions which went into the concept of public engagement at length. I highly recommend those people check out the Public Sphere consultation methodology that I developed with Minister Kate Lundy, which is purposefully modular so that you can adapt it to any community and how they best communicate, digitally or otherwise. It is also focused on getting evidence-based, peer-reviewed, contextually analysed and genuinely useful outcomes. It got an international award from the World eDemocracy Forum, which was great to see. Particularly check out how we applied computer forensics tools to help figure out if a consultation is being gamed by any individual or group.

When I consider digital government, I find myself standing back in the first instance to consider the general role of government in a digital society. I think this is an important starting point as our understanding is broadly out of date. New Zealand has definitions in the State Sector Act 1988, but they aren’t necessarily very relevant to 2013, let alone an open and transparent digital government.

Some imperatives for changing how we do government

Below are some of the interesting imperatives I have identified as key drivers for changing how we do government:

  • Changing public expectations – public expectations have fundamentally changed, not just with technology and everyone being connected to each other via ubiquitous mobile computing, but our basic assumptions and instincts are changing, such as the innate assumption of routing around damage, where damage might be technical or social. I’ve gone into my observations in some depth in a blog post called Online Culture – Part 1: Unicorns and Doom (2011).
  • Tipping point of digital engagement with government – in 2009 Australia had more citizens engaging with government  online than through any other means. This digital tipping point creates a strong business case to move to digitally delivered services, as a digital approach enables more citizens to self serve online and frees up expensive human resources for our more vulnerable, complex or disengaged members of the community.
  • Fiscal constraints over a number of years have largely meant IT departments have done more with less for years, with limited investment in doing things differently, and effectively a legacy technology millstone. New investment is needed but no one has money for it, and IT departments have in many cases resorted to focusing on maintenance rather than project work (an upgrade of a system that maintains the status quo is still maintenance in my books). Systems have reached a difficult point where the fat has been trimmed and trimmed, but the demands have grown. In order to scale government services to growing needs in a way that enables more citizens to self serve, new approaches are necessary, and the ability to aggregate services and information (through open APIs and open data), as well as user-centric design, underpins this.
  • Disconnect between business and IT – there has been for some time a growing problem of business units disengaging with IT. As cheap cloud services have started to appear, many parts of government (esp Comms and HR) have more recently started to just avoid IT altogether and do their own thing. On one hand this enables some more innovative approaches, but it also leads directly to a problem in whole of government consistency, reliability, standards and generally a distribution of services which is the exact opposite of a citizen centric approach. It’s important that we figure out how to get IT re-engaged in the business, policy and strategic development of government such that these approaches are more informed and implementable, and such that governments use, develop, fund and prioritise technology in alignment with a broader vision.
  • Highly connected and mobile community and workforce – the opportunities (and risks) are immense, and it is important that governments take an informed and sustainable approach to this space. For instance, in developing public facing mobile services, a mobile optimised web services approach is more inclusive, cost efficient and sustainable than native applications development, but by making secure system APIs and open data available, the government can also facilitate public and private competition and innovation in services delivery.
  • New opportunities for high speed Internet are obviously a big deal in Australia and New Zealand at the moment with the new infrastructure being rolled out (FTTP in both countries), and setting up to better support and engage with citizens digitally now, before mainstream adoption, is rather important and urgent.
  • Impact of politics and media on policy – the public service is generally expected to take an evidence-based approach to policy, and where this approach is developed in a transparent and iterative way, in collaboration with the broader society, it means government can engage directly with citizens rather than through the prism of politics or the media, each of which has its own motivations and imperatives.
  • Prioritisation of ICT spending – it is difficult to ensure the government investment and prioritisation of ICT projects aligns with the strategic goals of the organisation and government, especially where the goals are not clearly articulated.
  • Communications and value realisation – with anyone able to publish pretty much anything, it is incumbent on governments to be a part of the public narrative as custodians of a lot of information and research. By doing this in a transparent and apolitical way, the public service can be a valued and trusted source.
  • The expensive overhead of replication of effort across governments – consolidating where possible is vital to improve efficiencies, but also to put in place the mechanisms to support whole of government approaches.
  • Skills – a high technical literacy directly supports the capacity to innovate across government and across the society in every sector. As such this should be prioritised in our education systems, way above and well beyond “office productivity” tools.

Policy/strategic components

  • Strategic approach to information policy – many people looking at information policy tend to look deeply at one or a small number of areas, but it is only in looking at all of the information created by government, and how we can share, link, re-use, and analyse that we will gain the significant policy, service delivery and social/economic benefits and opportunities. When one considers geospatial, tabular, real time and statistical (census and survey) data, and then the application of metadata and linked data, it gets rather complicated. But we need to be able to interface effectively with these different data types.
  • Facilitating public and private innovation – taking a “government as a platform” approach, including open data and open APIs, such that industry and civil society can innovate on top of government systems and information assets, creating new value and services to the community.
  • Sector and R&D investment – it is vital that government ensures that investment in digital industries, internal innovation and indeed R&D more broadly aligns with the strategic vision. This means understanding how to measure and monitor digital innovation more effectively and not through the lens of traditional approaches that may not be relevant, such as the number of patents and other IP metrics. The New Zealand and Australian business and research communities need to make the most of their governments’ leadership in Open Government. The Open Government Partnership network might provide a way to build upon and export this expertise.
  • Exports – by creating local capacity in the arena of improved and citizen-centric services delivery, Australia and New Zealand set themselves up nicely for exporting services and products to Asia Pacific, particularly given the rapid uptake of countries in the region to join the Open Government Partnership which requires signatories to develop plans around topics such as open data, citizen centricity and parliamentary transparency, all of which we are quite skilled in.
  • Distributed skunkworks for government – developing the communities/spaces/tools across government to encourage and leverage the skills and enthusiasm of clever geeks both internally (internal hackdays, communities of practice) and externally (eg – GovHack). No one can afford new resources, but allocating a small amount of time from the existing workforce who are motivated to do great things is a cost efficient and empowering way to create a distributed skunkworks. And as people speak to each other about common problems and common solutions we should see less duplication of these solutions and improved efficiency across agencies.
  • Iterative policy – rethinking how policy is developed, implemented, measured and governed to take a more iterative and agile approach that a) leverages the skills and expertise of the broader community for more evidence based and peer reviewed policy outcomes and b) is capable of responding effectively and in a timely manner to new challenges and opportunities as they arise. It would also be useful to build better internal intelligence systems for an improved understanding of the status of projects, and improved strategic planning for success.
  • An Information Commissioner for New Zealand – an option for a policy lead on information management to work closely with departments to have a consolidated, consistent, effective and overall strategic approach to the management, sharing and benefits realisation of government information. This would also build the profile of Open Government in New Zealand and hopefully be the permanent solution to current resourcing challenges. The Office of the Australian Information Commissioner, and similar roles at State level, include the functions of Information Commissioner, Privacy Commissioner and Freedom of Information Commissioner, and these combined give a holistic approach to government information policy that ideally balances open information and privacy. In New Zealand it could be a role that builds on recent information policies, such as NZGOAL, which is designed, amongst other things, to replace bespoke content licences. Bespoke licences create an unnecessary liability issue for departments.
  • Citizen centricity – the increasing importance of consolidating government service and information delivery, putting citizens (and business) at the centre of the design. This is achieved through open mechanisms (eg, APIs) to interface with government systems and information such that they can be managed in a distributed and secure way, but aggregated in a thematic way.
  • Shared infrastructure and services – the shared services being taken up by some parts of the New Zealand Government is very encouraging to see, particularly when such an approach has been very successful in the ACT and SA state governments in Australia, and with several shared infrastructure and services projects at a national level in Australia including the AGIMO network and online services, and the NECTAR examples (free cloud stack tools for researchers). Shared services create the capacity for a consistent and consolidated approach, as well as enable the foundations of citizen centric design in a practical sense.

Some additional reading and thoughts

Digital literacy and ICT skills – should be embedded into curriculum and encouraged across the board. I did a paper on this as a contribution to the National Australian Curriculum consultation in 2010 with Senator Kate Lundy which identified three areas of ICT competency: 1) Productivity skills, 2) Online engagement skills, & 3) Automation skills as key skills for all citizens. It’s also worth looking at the NSW Digital Citizenship courseware. It’s worth noting that public libraries are a low cost and effective way to deliver digital services, information and skills to the broader community and minimise the issue of the digital divide.

Media data – often when talking about open data, media is completely forgotten: video, audio, arts, etc. The GLAM sector (galleries, libraries, archives and museums) is all over this and should be part of the conversation about how to manage this kind of content across whole of government.

Just a few additional links for those interested, somewhat related to some of the things I discussed this last week.

Getting started in the Australian Public Service

I worked for Senator Kate Lundy from April 2009 till January 2012. It was a fascinating experience learning how the executive and legislative arms of government work, and working closely with Kate, who is extremely knowledgeable and passionate about good policy and tech. As someone who is very interested in the interrelation between governments, society, the private sector and technology, I could not have asked for a better place to learn.

But last October (2011) I decided I really wanted to take the next step and expand my experience to better understand the public service: how policy moves between the political sphere and the administrative arm of government, how policy is implemented in practice, and the impact on, and engagement with, the general public.

I sat back and considered where I would ideally like to work if I could choose. I wanted to get an insight into different departments and public sector cultures across the whole government. I wanted to work in tech policy, and open government stuff if at all possible. I wanted to be in a position where I might be able to make a difference, and where I could look at government in a holistic way. I think a whole of government approach is vital to serving the public in a coherent and consistent way, as is serious public engagement and transparency.

So I came up with my top three places to work that would satisfy these criteria. My top option happened to have a job going, which I applied for, and by November I was informed I was their first choice. This was remarkable and I was very excited to get started, but I also wanted to tie up a few things in Kate’s office. So we arranged a starting date of January 31st 2012.

What is the job you ask? You’ll have to wait till the end of the post ;)

Unfortunately for me, I was already 6 months into a Top Secret Positive Vetting (TSPV) process (what you need in a Ministerial office in order to work with any classified information), and that process had to be completed, even though I needed a lower level of clearance for the new job. I was informed back in October that it should be done by Christmas.

So I blogged on my last day with Kate about what I had learned and indicated that I was entering the public service to get a better understanding of the administrative arm of government. There was some amusing speculation, and it has probably been the worst kept secret around Canberra for the last year :)

Of course, I thought I would be able to update my “Moving On” blog post within a few weeks or so. It ended up taking another 10 months for my clearance to finalise. TSPV does take a while, and I’m a little more complicated a case than the average bear given my travel and online profile :)

As it turns out, the 10 months presented some useful opportunities. During the last year I did a bunch of contracting work looking largely at tech policy, some website development, and I ended up working for the ACT Government for the last 5 months.

In the ACT Government I worked in a policy role under Mick Chisnall, the Executive Director of the ACT Government Information Office. That was a fantastic learning experience and I’d like to thank Mick for being such a great person to work with and learn from. I worked on open government policy, open data policy and projects (including the dataACT launch, and some initial work for the Canberra Digital Community Connect project), looked at tech policies around mobile, cloud, real time data, accessibility and much more. I also helped write some fascinating papers around the role of government in a digital city. Again, I feel very fortunate to have had the opportunity to work with excellent people with vision. A huge thanks to Mick Chisnall, Andrew Cappie-Wood, Pam Davoren, Christopher Norman, Kerry Webb, James Watson, Greg Tankard, Gavin Tapp and all the people I had the opportunity to work with. I learnt a lot, much of which will be useful in my new role.

It also showed me that the hype around “shared services” being supposedly terrible doesn’t quite map to reality. For sure, some states have had significant challenges, but in some states it works reasonably well (nothing is perfect) and presents some pretty useful opportunities for whole of government service delivery.

Anyway, so my new job is at AGIMO as Divisional Coordinator for the Agency Services Division, working directly to John Sheridan who has long been quite an active and engaged voice in the Australian Gov 2.0 scene. I started a week and a half ago and am really enjoying it already. I think there are some great opportunities for me through this job to usefully serve the public and the broader public service. I look forward to making my mark and contributing to the pursuit of good tech in government. I’m also taking the role of Media Coordinator for AGIMO, and supporting John in his role.

I’ve met loads of brilliant people working in the public service across Australia, and I’m looking forward to learning a lot. I’m also keen to take a very collaborative approach (no surprises there), so I’m looking at ways to better enable people to work together across the APS and indeed, across all government jurisdictions in Australia. There is a lot to be gained by collaboration between the Federal, States/Territories and Local spheres of government, particularly when you can get the implementers and policy developers working together rather than just those up the stack.

So, if you are in government (any sphere) and want to talk open government, open data, tech policy, iterative policy development, public engagement, or all the things, please get in touch. I’m hoping to set up an open data working group to bring together the people in various governments doing great work across the country and I’ll be continuing to participate in the Gov 2.0 community, now from within the tent :)

Collaborative innovation in the public service: Game of Thrones style

I recently gave a speech about “collaborative innovation” in the public service, and I thought I’d post it here for those interested :)

The short version was that governments everywhere, or more specifically, public services everywhere are unlikely to get more money to do the same work, and are struggling to deliver and to transform how they do things under the pressure of rapidly changing citizen expectations. The speech used Game of Thrones as a bit of a metaphor for the public service, and basically challenged public servants (the audience), whatever their level, to take personal responsibility for change, to innovate (in the true sense of the word), to collaborate, to lead, to put the citizen first and to engage beyond the confines of their desk, business unit, department or jurisdiction to co-develop better ways of doing things. It basically said that the public service needs to work better across the silos.

The long version is below, on YouTube or you can check out the full transcript:

The first thing I guess I wanted to talk about was pressure number one on government. I’m still new to government. I’ve been working in I guess the public service, be it federal or state, only for a couple of years. Prior to that I was an adviser in a politician’s office, but don’t hold that against me, I’m strictly apolitical. Prior to that I was in the industry for 10 years and I’ve been involved in non-profits, I’ve been involved in communities, I’ve been involved in online communities for 15 years. I sort of got a bit of an idea what’s going on when it comes to online communities and online engagement. It’s interesting for me to see a lot of these things done they’ve become very popular and very interesting.

My background is systems administration, which a lot of people would think is very boring, but it’s been a very useful skill for me because in everything I’ve done, I’ve tried to figure out what all the moving parts are, what the inputs are, where the configurations files are; how to tweak those configurations to get the better outputs. The entire thing has been building up my knowledge of the whole system, how the societal-wide system, if you like, operates.

One of the main pressures I’ve noticed on government of course is around resources. Everyone has to do more with less. In some cases, some of those pressures are around fatigued systems that haven’t had investment for 20 years. Fatigued people who have been trying to do more with less for many years. Some of that is around assumptions. There’s a lot of assumptions about what it takes to innovate. I’ve had people say, “Oh yeah, we can totally do an online survey that’ll cost you $4 million.” “Oh my, really? Okay. I’m going to just use Survey Monkey, that’s cool.” There are a lot of perceptions that I would suggest are a little out of date.

It was a very opportunistic and a very wonderful thing that I worked in the ACT Government prior to coming into the federal government. A lot of people in the federal government look down on working in other jurisdictions, but it was very useful because when you see what some of the state territory and local governments do with the tiny fraction of the funding that the federal government has, it’s really quite humbling to start to say, “Well why do we have these assumptions that a project is going to cost a billion dollars?”

I think our perceptions about what’s possible today are a little bit out of whack. Some of those resource problems are also self-imposed limitations: our assumptions, our expectations and such. So the first major pressure that we’re dealing with is around resources, both the real issue and, I would argue, a slight issue of perception. This is the only gory one (slide), so turn away from it if you like, I should have said that before, sorry.

The second pressure is around changing expectations. Citizens now, because of the Internet, are more powerful than ever before. This is a real challenge for entities such as government or large traditional power brokers, shall we say. Having citizens that can solve their own problems, that can make their own applications pulling data from wherever they like, that can screen scrape what we put online, is a very different situation to Game of Thrones land or Medieval times, even up to only 100 years ago; the role of a citizen was more about being a subject, and they were basically subject to whatever you wanted. A citizen today is able to engage, and if you’re not responsive to them, if government isn’t agile and doesn’t actually fill that role, then that void gets picked up by other people. So the changing expectations of the public that we serve, in an internet society, are a major pressure, when fundamentally government can’t in a lot of cases innovate quickly enough, particularly in isolation, to solve the new challenges of today and to adapt and grab on to the new opportunities of today.

We (public servants) need to collaborate. We need to collaborate across government. We need to collaborate across jurisdictions and we need to collaborate across society and I would argue the world. These are things that are very, very foreign concepts to a lot of people in the public service. One of the reasons I chose this topic today was because when I undertook to kick off Data.gov.au again, which is just about to hit its first anniversary and I recommend that you come along on the 17th of July, but when I kicked that off, the first thing I did was say, “Well who else is doing stuff? What are they doing? How’s that working? What’s the best practice?” When I chatted to other jurisdictions in Australia, when I chatted to other countries, I sat down and grilled for a couple of hours the Data.gov.uk guys to find out exactly how they do it, how it’s resourced, what their model was. It was fabulous because it really helped us create a strategy which has really worked and it’s continuing to work in Australia.

A lot of these problems and pressures are relatively new, we can’t use old methods to solve these problems. So to quote another Game of Thrones-ism,  if we look back, we are lost.

The third pressure, and it’s not too gory, this one. The third pressure is upper management. They don’t always get what we’re trying to do. Let’s be honest, right? I’m very lucky I work for a very innovative, collaborative person who delegates responsibilities down … Audience Member: And still has his head. Pia Waugh: … and still has his head. Well actually it’s the other way around. Upper management is Joffrey Baratheon; but I guess you could say it that way, too. In engaging with upper management, a lot of the time, and this has been touched on by several speakers earlier today, a lot of the time they have risks to manage, they have to maintain reputation, and when you say we can’t do it that way, if you can’t give a solution that will solve the problem, then what do you expect to happen? We need to engage with upper management to understand what their concerns are, what their risks are, and help mitigate those risks. If we can’t do that then it is in a lot of cases to our detriment that our projects are not going to be able to get up.

We need to figure out what the agendas are, we need to be able to align what we’re trying to do effectively and we need to be able to help provide those solutions and engage more constructively, I would suggest, with upper management.

Okay, but the biggest issue, the biggest issue I believe, is around what I call systemic silos. So this is how people see government: it’s remote, it’s very hard to get to, it’s one entity. It’s a bit crumbling, a bit off in the realm, it’s out of touch with people, it’s off in the clouds and it’s untouchable. It’s very hard to get to; there’s a winding, dangerous road you might fall off. Most importantly, it’s one entity. When people have a good or bad experience with your department, they just see that as government. We are all judged by the best and the worst examples of all of these, and yet we’re all motivated to work independently of each other in order to meet fairly arbitrary goals in some cases. In terms of how government sees people, they’re these trouble-making people that are climbing up to try and destroy us. They’re a threat, they’re outsiders, they don’t get it. If only we could teach them how government works and then this would all be okay.

Well, it’s not their job; I mean half of the people in government don’t know how government works. By the time you take MOG changes into account, by the time you take changes of functions, changes of management, changes of different approaches, different cultures throughout the public service – the amount of times someone has said to me, “The public service can’t innovate.” I’m like, “Well, the public service is myriad organisations with myriad cultures.” It’s not one entity and yet people see us as one entity. It’s not, I think, the job of the citizen to understand the complexities of government, but rather the job of the government to abstract the complexities of government to get a better engagement and service for citizens. That’s our job, which means if you’re not collaborating and looking across government, then you’re not actually doing your job, in my opinion. But again, I’m still possibly seen as one of these troublemakers, that’s okay.

This is how government sees government (map of the Realm): a whole map of fiefdoms, of castles to defend, of armies that are beating at your door, people trying to take your food – and this is just one department. We don’t have this concept of: that flag has these skills that we could use; these people are doing this project; here’s this fantastic thing happening over there that we could chat to. We’re not doing that enough across departments, across jurisdictions, let alone internationally, and there are some fantastic opportunities to actually tap into some of those skills. In my opinion, this massive barrier to doing the work of the public service better is systemic silos. So what’s the solution?

The solution is we need to share. We’re all taught as children to share the cookie, and yet as we get into primary school and high school we’re told to hide our cookie. Keep it away. Oh, you don’t want to share the cookie because there’s only one cookie and if you gave any of it away you wouldn’t have any cookie left. Well, there are only so many potatoes in this metaphor, and if we don’t share those potatoes then someone’s going to starve – and probably the person who’s going to starve is right now delivering a service that, if they’re not there to deliver it, we’re going to have to figure out how to deliver with the one potato that we have. So I feel we have to collaborate, and sharing those resources is, I think, a very important step forward.

Innovative collaboration. Innovative collaboration is a totally made up term, as a lot of things are I guess. It’s the concept of actually forging strategic partnerships. I’ve actually had a number of projects now. I didn’t have a lot of funding for Data.gov.au. I don’t need a lot of funding for Data.gov.au because, fundamentally, a lot of agencies want to publish data because they see it now to be in their best interest. It helps them improve their policy outcomes, helps them improve their services, helps them improve efficiency in their organisations. Now that we’ve sort of hit that tipping point of agencies wanting to do this stuff increasingly – it’s not completely proliferated yet, but I’m working on it – I’ve got a number of agencies that say, “Well, we’d love to open data but we just need a data model registry.” “Oh, cool. Do you have one?” “Yes, we do but we don’t have anywhere to host it.” “Okay, how about I host it for you. You develop it and I’ll host it. Rock!” I’ve got five of those projects happening right now where I’ve aligned the motivation and the goals of what we’re doing with the motivation and goals of five other departments, and we actually have some fantastic outcomes coming out that meet the needs of all the players involved, plus create an improved whole of government service.

I think this idea of having a shared load, pooling our resources, pooling our skills, getting a better outcome for everyone, is a very important way of thinking. It also gives you better outcomes in terms of dealing, again, with upper management. If you start from the premise that most people do – well, we’ve only got this number of people and this amount of money and therefore we’re only going to be able to get this outcome – then in a year’s time you’ll be told, “That’s fine, just do it with 20% less.” If you say our engagement with this agency is going to help us get more resilience in a project and more expertise on a project and, by the way, upper management, it means we’re splitting the cost with someone else, that starts to help the conversation. You can start to leverage resources across multiple departments, across society and across the world.

Here’s a little how-to, just a couple of ideas; I’m going to go into this in a little bit more detail. In the first case, research. So I’m a child of the internet. I’m a little bit unique for my age bracket in that my mom was a geek, so I have been using computers since I was four, 30 years ago. A lot of people my age got their first taste of computing and the internet when they got to university or, at best, maybe high school, whereas I was playing with computers very young. In fact, there’s a wonderful photo if you want to check it out, of my mom and me sitting and looking at the computer, very black and white, and there’s this beautiful image of this mother with a tiny child at the computer. What I tell people is that it’s a cute photo, but actually my mom had spent three days programming that system and when her back was turned for just five minutes, I completely broke it. The picture is actually of her fixing my first breaking of a system. I guess I could have had a career in testing, but anyway, I got in big trouble.

One of the things about being a child of the internet, or someone who’s really adopted the internet into the way that I think, is that my work space is not limited to the desk area that I have. I don’t start with a project and sort of go, okay, what’s on my computer, who’s in my immediate team, who’s in my area, my business area. I start with what’s happening in the world. The idea of research is not just to say what’s happening elsewhere so that we can integrate it into what we are going to do, but to start to see the whole world as your work space, or as your playground, or as your sandpit, whichever metaphor you prefer. In this way, you can start to get into a collaborative mindset automatically, as opposed to by force.

Research is very important. You need to establish something. You need to actually do something. This is an important one, that’s why I’ve got it in bold: you need to demonstrate that success and you need to wrap up. I think a lot of times people get very caught up with establishing a community and then maintaining that community for the sake of maintaining the community. What are the outcomes? You need to identify fairly quickly: is this going to have an outcome, or is this an ongoing community which is not necessarily outcome driven? Part of this is around, again, understanding how the system works and how you can actually work in the system. Some of that research is about understanding projects and skills; I’ll jump into that in a little bit. So what already exists? If I had a mammoth (slide), I’d totally do cool stuff. What exists out there? What are the people and skills that are out there? What are the motivations that exist in those people that are already out there? How can I align with those? What are the projects that are already doing cool stuff? What are the agendas and priorities and, I guess, systemic motivations that are out there? What tech exists?

And this is why I always contend – and I always slip it into a talk somewhere, so I’ll slip it in here – you need to have a geek involved somewhere. How many people here would consider yourselves geeks? Not many. You need to have people that have technical literacy in order to make sure that your great idea, your shiny vision, your shiny policy can actually be implemented. If you don’t have a techie person, then you don’t have the person who has a very, very good skill at identifying opportunities and risks. You can say, “Well, we’ll just go to our IT department and they’ll give us a quote of how much it costs to do a survey.” Well, in that case – okay, not necessarily our case – it was $4 million. So you need to have techie people who will help you keep your finger on the pulse of what’s possible, what’s probable and how it’s going to possibly work. You don’t need to be that person, but I highly recommend you have the different skills in the room.

This is where – and I said this on Twitter – I do actually recommend Malcolm Gladwell’s ‘The Tipping Point’, not because he’s the most brilliant author in the world, but because he has a concept in there that’s very important. Maybe I’ll save you reading it now: it’s the idea of having three skills – connectedness, so the connector; the maven, your researcher sort of person; and your sales person. One person might have all or none of those skills, but a project needs to have all of those skills represented in some format for the project to go from nothing to being successful or massively distributed. It’s a very interesting concept. It’s been very beneficial to a lot of projects I’ve been involved in. I’ve run a lot of volunteer projects, the biggest of which is happening this weekend: GovHack. Having 1,300 participants in an 11-city event with only volunteer organisers is a fairly big deal, and part of the reason we can do that is because we align natural motivation with the common vision – and we get geeks involved, obviously.

What already exists? Identifying the opportunities, identifying what’s out there, treating the world like a basket of goodies that you can draw from. Secondly, you want to form an A team. Communities are great and communities are important. Communities establish an ongoing presence which you can engage with, draw from, get support from, and all those kinds of things. This kind of community is very, very important, but innovative collaboration is about building a team to do something – a project team. You want to have your A-list. You want to have a wide variety of skills. You want to have doers. You want to establish the common and different needs of the individuals involved, and they might be across departments or across governments or from society. Establishing what the people involved have in common, what they want to get out of it, and then what’s different, is important to making sure that when you go to announce this, everyone’s needs are taken care of and it doesn’t put someone offside or whatever. You need to understand the dynamics of your group very, very well and you need to have the right people in the room. You want to plan realistic outcomes and milestones. These need to be tangible.

This is where I get just super pragmatic and I apologise, but if you’re building a team to build the project report to build the team, maybe you’ve lost your way just slightly. If the return on investment or the business case that you’re writing takes 10 times the amount of time of the project itself, maybe you could do a little optimisation. So just sort of sit back and ask: what is the scale of what we’re trying to do, what are the tangible outcomes, and what is actually necessary for this? This comes back to the concept of, again, managing and mapping risk to projects. If the risk is very, very, very low, then maybe the amount of time and effort that goes into building the enormous structure of governance around it can be somewhat minimised. Taking an engaged, proactive approach to the risk is, I think, very important in this kind of thing, as is making sure that the outcomes are actually achievable and tangible. This is also important because if you have tangible outcomes then you can demonstrate tangible outcomes. You need to also avoid scope creep.

I had a project recently that didn’t end up happening. It was a very interesting lesson to me, though: something simple was asked and I came up with a way to do it in four weeks. Brilliant! Then the scope started to creep significantly, and then it became this and this and then this, and then we want to have an elephant with bells on it. Well, you can have the elephant with bells if you do it this way in six months. So how about you have that as a second project? Anyway, so basically try to hold your ground. Often enough when people ask for something, they don’t know what they’re asking for. We need to be the people that are on the front line saying, “What you want to achieve fundamentally, you’re not going to achieve the way that you’re trying to achieve it. So how about we think about what the actual end goal that we all want is and how to achieve that? And by the way, I’m the technical expert and you should believe me and if you don’t, ask another technical expert but for God’s sake, don’t leave it to someone who doesn’t know how to implement this, please.”

You want to plan your goals. You want to ensure – and this is another important bit – that there is actually someone responsible for each bit, otherwise your planning committee will get together in another four weeks or eight weeks and will say, “So, how is action A going? Oh, nothing’s happened. Okay, how’s action B going?” You need to actually make sure that there are nominated responsibilities, and they again should align to those individuals’ natural motivations and systemic motivations.

My next bit: don’t reinvent the wheel. I find a lot of projects where someone has gone and completely recreated something. The amount of times someone has said, “Well, that’s a really good piece of software but let’s rewrite it in another language.” In technical land this is very common, but I see it happen from a process perspective, I see it happen from a policy perspective. Again, going back to see what’s available is very important, but I’ll just throw in another thing here: the idea of taking responsibility is a very scary thing, apparently, in the public service. Let’s go back to the wheel. If your wheel is perfect – you’ve developed it, you’ve designed it, you’ve spent six years getting it to this point and it’s shiny and it’s beautiful and it works – but it’s not connected to a car, what’s the point, seriously?

You want to make sure that what you’re doing needs to actually contribute to something bigger, needs to actually be part of the engine, because if your wheel or if your cog is perfectly defined but the engine as a whole doesn’t work, then there’s a problem there and sometimes that’s out of your control. Quite often what’s missing is someone actually looking end to end and saying, “Well, the reason there’s a problem is because there’s actually a spanner, just here.” If we remove that spanner and I know it’s not my job to remove that spanner, but if someone removed that spanner the whole thing would work. Sometimes it’s very scary for some people to do and I understand that, but you need to understand what you’re doing and how it fits into the bigger picture and how the bigger picture is or isn’t working, I would suggest.

Monitoring. Obviously, measuring and monitoring success in Game of Thrones was a lot more messy than it is for us. They had to deal with birds, they had to feed them, they had to deal with what they fed them. Measuring and monitoring your project is a lot easier in a lot of cases. There are a lot of ways to automate it. There are a lot of ways to come up with it at the beginning. How do we define success? If you don’t define it then you don’t know if you’ve got there. These things are all kind of obvious, but I remember having a real epiphany moment with a very senior person from another department. I was talking to him about the challenges I was having with a project and I said, “Well, if you’re doing this great thing, then why aren’t you shouting it from the rooftop? This is wonderful. It’s very innovative, it’s very clever. You’ve solved a really great problem.” Then he looked at me and said, “Well Pia, you know success is just as bad as failure, don’t you?” It really struck me, and then I realised that any sort of success or failure is seen as attention, and for some people the moment something attracts attention it becomes scary. I put to you that having success, having defensible projects, having evidence that actually underpins why what you’re doing is important, is probably one of the most important things that you can do today to make sure that you continue getting funding, resources and all these kinds of things. Measuring, monitoring and reporting are more important now than ever and, luckily and coincidentally, easier now than ever. There are a lot of ways that we can automate this stuff. There are a lot of ways that we can put in place these mechanisms from the start of a project. There are a lot of ways we can use technology to help. We need to define success, and we need to defend and promote the outcomes of those projects.

Share the glory. If it’s you sitting on the throne then everyone starts to get a little antsy. I like to say that shared glory is the key to a sustainable success. I’ve had a number of projects – and I don’t think I’ve told John this – but I’ve had a couple of things where I’ve collaborated with someone and then I’ve let them announce their part of it first, because that’s a good way to build a great relationship. It doesn’t really matter to me if I announce it now or in a week’s time. It helps share the success, it helps share the glory. It means everyone is a little bit more onside and it builds trust. The point that was made earlier today about trust is a very important one, and the way that you build trust is by having integrity, following through on what you’re doing and sharing the glory a little. Sharing the glory is a very important part because if everyone feels like they’re getting out of the collaboration what they need to justify their work, to justify to their bosses, to justify their investment of time, then that’s a very good thing for everyone.

Everything great starts small. This goes to the point of doing pilots, doing demos. How many of you have heard the term release early, release often? Not many. It’s a technology sector idea, but the idea is that rather than taking, in big terms, four years to scope something out and then get $100 million and then implement it – yeah, I know, right? – you actually start to do smaller, modular projects, and if it fails straight away, then at least you haven’t spent four years and $100 million failing. The other part of release early, release often is fail early, fail often, which sounds very scary in the public sector, but it’s a very important thing because from failure and from early releases you get lessons. You can iteratively improve projects or policies or outcomes that you’re doing if you’re continually getting out there and actually testing with people and demoing and doing pilots. It’s a very, very useful thing to realise that sometimes even the tiniest baby step is still a step, and for yourselves as individuals, we don’t always get the big success that we hope for, so you need to make sure that you have a continuous success loop in your own environment and for yourself, to make sure that you maintain your own sense of moving forward, I guess – so even small steps are very important steps. Audience Member: Fail early, fail often to succeed sooner. Pia Waugh: That’s probably a better sentence.

There are a lot of lessons that we can learn from other sectors and from other industries, from both the corporate and community sectors, that don’t always necessarily translate in the first instance; but they’re tried and true in those sectors. Understanding why they work and why they do, or in some cases don’t, map to our sector is, I think, very important.

Finally, this is the last thing I want to leave you with. The amount of times that I hear someone say, “Oh, we can’t possibly do that. We need to have good leadership. Leadership is what will take us over the line.” We are the leaders of this sector. We are the future of the public service, and so there’s a point about needing to start acting like it as well – not just you, all of us. You lead through doing. You establish change through being the change you want to see, to quote another great guy. When you realise that a large proportion of the SES are actually retiring in the next five to ten years, realising that we are all the future of the public service means that we can be those leaders. Now if you go to your boss and say, “I want to do this great, cool thing and it’s going to be great and I’m going to go and work with all these other people. I’m going to spend lots of your money,” yeah, they’re probably going to get a little nervous. If you say to them, “here’s why this is going to be good for you, I want to make you look good, I want to achieve something great that’s going to help our work, it’s going to help our area, it’s going to help our department, it’s going to help our Minister, it aligns with all of these things,” you’re going to have a better chance of getting it through. There are a lot of ways that you can demonstrate leadership just at our level, just by working with people directly.

So I spoke before about how the first thing I did was go and research what everyone else was doing. I followed that up by establishing an informal forum, a series of informal get togethers. One of those informal get togethers is a cross-jurisdictional meeting with open data people from other jurisdictions. What that means is every two months I meet with the people who are in charge of the open data policies and practice from most of the states and territories, from a bunch of local governments, and from a few other departments at the federal level, just to talk about what we’re all doing. I made it very clear from the start: this is not formal, this is not mandatory, it’s not top down, it’s not the feds trying to tell you what to do – which is an unfortunate although often accurate picture that the other jurisdictions have of us, which is unfortunate because there’s so much we can learn from them. By just setting that up and getting the tone of it right, everyone is sharing policy, sharing outcomes, sharing projects, starting to share code, starting to share functionality, and we’ve got to a point, only I guess eight months into the establishment of that group, where we’ve really started to get some great benefits for everyone and it’s bringing everyone’s baseline up.

There’s a lot of leadership to be had at every level, and identifying what you can do in your job today is very important, rather than waiting for permission. I remember – and I’m going to tell a little story that I hope John doesn’t mind – I remember when I started in my job, I got a week into the job and I said to John, “So, I’ve been here a week, I really don’t know if this is what you wanted from me. Are you happy with how I’m going?” He said, “Well Pia, don’t change what you’re doing, but I just want to give you a bit of feedback. I’ve never been in a meeting before with outsiders, with vendors or whatever, and had an EL speak before.” I said, “Oh, what’s wrong with your department? What’s wrong with ELs?” Because certainly by a particular level you have expertise, you have knowledge, you have something to contribute, so why wouldn’t you be encouraging people of all levels, but certainly of senior levels, to be actually speaking and engaging in meetings? It was a really interesting thought experiment and discussion to be had about the culture.

The amount of people that have said to me, just quietly, “Hey, we’d love to do that but we don’t want to get any criticism.” Well, criticism comes in two forms. It’s either constructive or unconstructive. Now it can be given negatively, it can be given positively, it can be given in a little bottle in the sea, but it only comes in those two forms. If it’s constructive, even if yelled at you online, if it’s something to learn from, take that, roll with it. If it’s unconstructive, you can ignore it safely. It’s about having self-knowledge, a certain amount of clarity, and comfort with the idea that you can improve – and that sometimes, in fact in a lot of cases, other people will be the mechanism for you to improve. Conflict is not a bad thing. Conflict is actually a very healthy thing in a lot of ways, if you engage with it. It’s really up to us how we engage with conflict or with criticism.

This is again where I’m going to be a slight outsider, but it’s very, very hard – not that I’ve seen this directly, but everything I hear is that it’s very, very hard – to get rid of someone in the public service. So I put to you: why would you not be brave? Seriously. You can’t have it both ways. You can’t say, “Oh, I’m so scared about criticism, I’m so scared blah, blah, blah,” while at the same time being that difficult to fire – why not be brave? We can do great things and it’s up to us as individuals to not wait for permission to do great things. We can all do great things at lots and lots of different levels. Yes, there will be bad bosses and yes, there will be good bosses, but if you continually pin your ability to shine on those external factors and wait, then you’ll be waiting a long time. Anyway, it’s just my opinion.

So be the leader, be the leader that you want to see. That’s I guess what I wanted to talk about with collaborative innovation.

Essays: Improving the Public Policy Cycle Model

I don’t have nearly enough time to blog these days, but I am doing a bunch of writing for university. I decided I would publish a selection of the essays that people might (hopefully) find interesting :) Please note, my academic writing is pretty awful, but hopefully some of the ideas, research and references are useful.

For this essay, I had the most fun in developing my own alternative public policy model at the end of the essay. Would love to hear your thoughts. Enjoy and comments welcome!

Question: Critically assess the accuracy of and relevance to Australian public policy of the Bridgman and Davis policy cycle model.

The public policy cycle developed by Peter Bridgman and Glyn Davis is both relevant to Australian public policy and simultaneously not an accurate representation of developing policy in practice. This essay outlines some of the ways the policy cycle model both assists and distracts from quality policy development in Australia, and provides an alternative model as a thought experiment, based on the author’s policy experience and on reflecting on the research conducted around the applicability of Bridgman and Davis’ policy cycle model.

Background

In 1998 Peter Bridgman and Glyn Davis released the first edition of The Australian Policy Handbook, a guide developed to assist public servants to understand and develop sound public policy. The book includes a policy cycle model, developed by Bridgman and Davis, which portrays a number of cyclic, logical steps for developing and iteratively improving public policy. This policy model has attracted much analysis, scrutiny, criticism and debate since it was first developed, and it continues to be taught as a useful tool in the kit of any public servant. The most recent edition of the Handbook is the fifth, released in 2012, which adds Catherine Althaus as a co-author; she joined Bridgman and Davis on the fourth edition in 2007.

The policy cycle model

The policy cycle model presented in the Handbook is below:

(Image: the Bridgman and Davis policy cycle model)

The model consists of eight steps in a circle that is meant to encourage an ongoing, cyclic and iterative approach to developing and improving policy over time with the benefit of cumulative inputs and experience. The eight steps of the policy cycle are:

  1. Issue identification – a new issue emerges through some mechanism.

  2. Policy analysis – research and analysis of the policy problem to establish sufficient information to make decisions about the policy.

  3. Policy instrument development – the identification of which instruments of government are appropriate to implement the policy. Could include legislation, programs, regulation, etc.

  4. Consultation (which permeates the entire process) – garnering of external and independent expertise and information to inform the policy development.

  5. Coordination – once a policy position is prepared it needs to be coordinated through the mechanisms and machinations of government. This could include engagement with the financial, Cabinet and parliamentary processes.

  6. Decision – a decision is made by the appropriate person or body, often a Minister or the Cabinet.

  7. Implementation – once approved the policy then needs to be implemented.

  8. Evaluation – an important process to measure, monitor and evaluate the policy implementation.

In the first instance it is worth reflecting on the stages of the model, which imply that the entire policy process is centrally managed and coordinated by the policy makers – which is rarely true – and thus give very little indication of who is involved, where policies originate, what the external factors and pressures are, or how policies go from a concept to being acted upon. Even just to develop a position, resources must be allocated, and the development of a policy is thus prioritised above the development of some other policy competing for resourcing. Bridgman and Davis establish very little in helping the policy practitioner or entrepreneur to understand the broader picture, which is vital in the development and successful implementation of a policy.

The policy cycle model is relevant to Australian public policy in two key ways: 1) it presents a useful reference model for identifying various potential parts of policy development; and 2) it is instructive for policy entrepreneurs in understanding the expectations and approach taken by their peers in the public service, given that the Bridgman and Davis model has been taught to public servants for a number of years. In the first instance the model presents a basic framework that policy makers can use to go about the thinking of and planning for their policy development. In practice, some stages may be skipped, reversed or compressed depending upon the context, or a completely different approach altogether may be taken, but the model gives a starting point in the absence of anything formally imposed.

Bridgman and Davis themselves paint a picture of vast complexity in policy making whilst holding up their model as both an explanatory and prescriptive approach, albeit with some caveats. This is problematic because public policy development almost never follows a cleanly structured process. Many criticisms of the policy cycle model question its accuracy as a descriptive model given it doesn’t map to the experiences of policy makers. This draws into question the relevance of the model as a prescriptive approach as it is too linear and simplistic to represent even a basic policy development process. Dr Cosmo Howard conducted many interviews with senior public servants in Australia and found that the policy cycle model developed by Bridgman and Davis didn’t broadly match the experiences of policy makers. Although they did identify various aspects of the model that did play a part in their policy development work to varying degrees, the model was seen as too linear, too structured, and generally not reflective of the at times quite different approaches from policy to policy (Howard, 2005). The model was however seen as a good starting point to plan and think about individual policy development processes.

Howard also discovered that political engagement changed throughout the process and from policy to policy depending on government priorities, making a consistent approach to policy development quite difficult to articulate. The common need for policy makers to respond to political demands and tight timelines often leads to an inability to follow a structured policy development process resulting in rushed or pre-canned policies that lack due process or public consultation (Howard, 2005). In this way the policy cycle model as presented does not prepare policy-makers in any pragmatic way for the pressures to respond to the realities of policy making in the public service. Colebatch (2005) also criticised the model as having “not much concern to demonstrate that these prescriptions are derived from practice, or that following them will lead to better outcomes”. Fundamentally, Bridgman and Davis don’t present much evidence to support their policy cycle model or to support the notion that implementation of the model will bring about better policy outcomes.

Policy development is often heavily influenced by political players and agendas, which is not captured in the Bridgman and Davis policy cycle model. Some policies are effectively handed over to the public service to develop and implement, but often policies have strong political involvement, with the outcomes of policy development ultimately given to the respective Minister for consideration, who may also take the policy to Cabinet for final ratification. This means even the most evidence based, logical, widely consulted and highly researched policy position can be overturned entirely at the behest of the government of the day (Howard, 2005). The policy cycle model neither captures this nor prepares public servants for how to manage this process. Arguably, the most important aspects of successful policy entrepreneurship lie outside the policy development cycle entirely, in the mapping and navigation of the treacherous waters of stakeholder and public management, myriad political and other agendas, and other policy areas competing for prioritisation and limited resources.

The changing role of the public in the 21st century is not captured by the policy cycle model. The proliferation of digital information and communications creates new challenges and opportunities for modern policy makers. They must now compete for influence and attention in an ever expanding and contestable market of experts, perspectives and potential policies (Howard, 2005), which is a real challenge for policy makers used to being the single trusted source of knowledge for decision makers. This has moved policy development and influence away from the traditional Machiavellian bureaucratic approach of an internal, specialised, tightly controlled monopoly on advice, towards a more transparent and inclusive though more complex approach to policy making. Although Bridgman and Davis go part of the way to reflecting this post-Machiavellian approach to policy by explicitly including consultation and the role of various external actors in policy making, they still maintain the Machiavellian role of the public servant at the centre of the policy making process.

The model does not clearly articulate the need for public buy-in and communication of the policy throughout the cycle, from development to implementation. There are a number of recent examples of policies that have been developed and implemented well by any traditional public service standards, but that the general public have seen as complete failures due to a lack of, or a negative, public narrative around the policies. Key examples include the Building the Education Revolution policy and the insulation scheme. In the case of both, the policy implementation largely met the policy goals and independent analysis showed the policies to be quite successful through quantitative and qualitative assessment. However, both policies were announced very publicly and politically prior to implementation and then had little to no public narrative throughout implementation, leaving the public narrative around both to be determined by media reporting on issues and by the Government Opposition, who were motivated to undermine the policies. The policy cycle model, in focusing on consultation, ignores the necessity of a public engagement and communication strategy throughout the entire process.

The Internet also presents significant opportunities for policy makers to get better policy outcomes through public and transparent policy development. The model does not reflect how to strengthen a policy position in an open environment of competing ideas and expertise (aka, the Internet), though it is arguably one of the greatest opportunities to establish evidence-based, peer reviewed policy positions with a broad range of expertise, experience and public buy-in from experts, stakeholders and those who might be affected by a policy. This establishes a public record for consideration by government. A Minister or the Cabinet has the right to deviate from these publicly developed policy recommendations as our democratically elected representatives, but it increases the accountability and transparency of the political decision making regarding policy development, thus improving the likelihood of an evidence-based rather than purely political outcome. History has shown that transparency in decision making tends to improve outcomes as it aligns the motivations of those involved to pursue what they can defend publicly. Currently the lack of transparency at the political end of policy decision making has led to a number of examples where policy makers are asked to rationalise policy decisions rather than investigate the best possible policy approach (Howard, 2005). Within the public service there is a joke about developing policy-based evidence rather than the generally desired public service approach of developing evidence-based policy.

Although there are clearly issues with any policy cycle model in practice, due to the myriad factors involved and the at times quite complex landscape of influences, Bridgman and Davis, by constantly referencing throughout their book the importance of “good process” to “help create better policy” (Bridgman & Davis, 2012), imply their model is a “good process” and subtly encourage a check-box style, formally structured and iterative approach to policy development. The policy cycle in practice becomes impractical and inappropriate for much policy development (Everett, 2003). Essentially, it gives new and inexperienced policy makers a false sense of confidence in a model put forward as descriptive which is at best just a useful point of reference. In a book review of the 5th edition of the Handbook, Kevin Rozzoli supports this by criticising the policy cycle model as being too generic and academic rather than practical, and compares it to the relatively pragmatic policy guide by Eugene Bardach (2012).

Bridgman and Davis do concede that their policy cycle model is not an accurate portrayal of policy practice, calling it “an ideal type from which every reality must curve away” (Bridgman & Davis, 2012). However, they still teach it as a prescriptive and normative model from which policy developers can begin. This unfortunately provides policy developers with an imperfect model that can’t be implemented in practice and little guidance to tell when it is implemented well or how to successfully “curve away”. At best, the model establishes some useful ideas that policy makers should consider, but as a normative model it rapidly loses traction as every implementation of the model inevitably will “curve away”.

The model also embeds in the minds of public servants some subtle assumptions about policy development that are questionable such as: the role of the public service as a source of policy; the idea that good policy will be naturally adopted; a simplistic view of implementation when that is arguably the most tricky aspect of policy-making; a top down approach to policy that doesn’t explicitly engage or value input from administrators, implementers or stakeholders throughout the entire process; and very little assistance including no framework in the model for the process of healthy termination or finalisation of policies. Bridgman and Davis effectively promote the virtues of a centralised policy approach whereby the public service controls the process, inputs and outputs of public policy development. However, this perspective is somewhat self serving according to Colebatch, as it supports a central agency agenda approach. The model reinforces a perspective that policy makers control the process and consult where necessary as opposed to being just part of a necessarily diverse ecosystem where they must engage with experts, implementers, the political agenda, the general public and more to create robust policy positions that might be adopted and successfully implemented. The model and handbook as a whole reinforce the somewhat dated and Machiavellian idea of policy making as a standalone profession, with policy makers the trusted source of policies. Although Bridgman and Davis emphasise that consultation should happen throughout the process, modern policy development requires ongoing input and indeed co-design from independent experts, policy implementers and those affected by the policy. This is implied but the model offers no pragmatic way to do policy engagement in this way. Without these three perspectives built into any policy proposal, the outcomes are unlikely to be informed, pragmatic, measurable, implementable or easily accepted by the target communities.

The final problem with the Bridgman and Davis public policy development model is that by focusing so completely on the policy development process and not looking at implementation nor in considering the engagement of policy implementers in the policy development process, the policy is unlikely to be pragmatic or take implementation opportunities and issues into account. Basically, the policy cycle model encourages policy makers to focus on a policy itself, iterative and cyclic though it may be, as an outcome rather than practical outcomes that support the policy goals. The means is mistaken for the ends. This approach artificially delineates policy development from implementation and the motivations of those involved in each are not necessarily aligned.

The context of the model in the handbook is also somewhat misleading, which affects the accuracy and relevance of the model. The book oversimplifies the roles of various actors in policy development, placing policy responsibility clearly in the domain of Cabinet, Ministers, the Department of Prime Minister & Cabinet and senior departmental officers (Bridgman and Davis, 2012, Figure 2.1). Arguably, this conflicts with the supposed point of the book, to support even quite junior or inexperienced public servants throughout a government administration to develop policy. It does not match reality in practice, thus confusing students at best or establishing misplaced confidence in outcomes derived from policies developed according to the Handbook at worst.

(Image: spheres of government)

An alternative model

Part of the reason the Bridgman and Davis policy cycle model has had such traction is because it was created in the absence of much in the way of pragmatic advice to policy makers and thus has been useful in filling a need, regardless of how effective it has been in doing so. The authors have, however, not significantly revisited the model since it was developed in 1998. This would be quite useful given new technologies have established both new mechanisms for public engagement and new public expectations to co-develop, or at least have a say about, the policies that shape their lives.

From my own experience, policy entrepreneurship in modern Australia requires a highly pragmatic approach that takes into account the various new technologies, influences, motivations, agendas, competing interests, external factors and policy actors involved. This means researching in the first instance the landscape and then shaping the policy development process accordingly to maximise the quality and potential adoptability of the policy position developed. As a bit of a thought experiment, below is my attempt at a more usefully descriptive and thus potentially more useful prescriptive policy model. I have included the main aspects involved in policy development, but have included a number of additional factors that might be useful to policy makers and policy entrepreneurs looking to successfully develop and implement new and iterative policies.

(Image: proposed alternative policy model)

It is also important to identify the inherent motivations of the various actors involved in the pursuit, development of and implementation of a policy. In this way it is possible to align motivations with policy goals or vice versa to get the best and most sustainable policy outcomes. Where these motivations conflict or leave gaps in achieving the policy goals, it is unlikely a policy will be successfully implemented or sustainable in the medium to long term. This process of proactively identifying motivations and effectively dealing with them is missing from the policy cycle model.

Conclusion

The Bridgman and Davis policy cycle model is demonstrably inaccurate and yet is held up by its authors as a reasonable descriptive and prescriptive normative approach to policy development. Evidence is lacking for both the model’s accuracy and any tangible benefits in applying the model to a policy development process, and research into policy development across the public service continually deviates from and often directly contradicts the model. Although Bridgman and Davis concede policy development in practice will deviate from their model, there is very little useful guidance as to how to implement or deviate from the model effectively. The model is also inaccurate in that it overly simplifies policy development, leaving policy practitioners to learn for themselves about external factors, the various policy actors involved throughout the process, the changing nature of public and political expectations and myriad other realities that affect modern policy development and implementation in the Australian public service.

Regardless of the policy cycle model’s inaccuracy, it has existed and been taught for nearly sixteen years. It has shaped the perspectives and processes of countless public servants and thus is relevant in the Australian public service in so far as it has been used as a normative model or starting point for countless policy developments and provides a common understanding and lexicon for engaging with these policy makers.

The model is therefore both inaccurate and relevant to policy entrepreneurs in the Australian public service today. I believe a review and rewrite of the model would greatly improve the advice and guidance available for policy makers and policy entrepreneurs within the Australian public service and beyond.

References

(Please note, as is the usual case with academic references, most of these are not publicly freely available at all. Sorry. It is an ongoing bug bear of mine and many others).

Althaus, C, Bridgman, P and Davis, G. 2012, The Australian Policy Handbook. Sydney, Allen and Unwin, 5th ed.

Bridgman, P and Davis, G. 2004, The Australian Policy Handbook. Sydney, Allen and Unwin, 3rd ed.

Bardach, E. 2012, A practical guide for policy analysis: the eightfold path to more effective problem solving, 4th Edition. New York. Chatham House Publishers.

Everett, S. 2003, The Policy Cycle: Democratic Process or Rational Paradigm Revisited?, The Australian Journal of Public Administration, 62(2) 65-70

Howard, C. 2005, The Policy Cycle: a Model of Post-Machiavellian Policy Making?, The Australian Journal of Public Administration, Vol. 64, No. 3, pp3-13.

Rozzoli, K. 2013, Book Review of The Australian Policy Handbook: Fifth Edition., Australasian Parliamentary Review, Autumn 2013, Vol 28, No. 1.

Attending Linux.conf.au 2015

Really excited to note that I’m going to be attending Linux.conf.au 2015 and running the Cloud, Containers, and Orchestration mini-conf. Will be issuing the CfP for that shortly, but just wanted to give a shout (and create the category feed for LCA planet…) about heading to New Zealand next January. Extremely psyched to be going to LCA once again!

linux.conf.au: day 4

Another successful day of Linux geeking has passed, this week is going surprisingly quickly…

Some of the day’s highlights:

  • James Bottomley spoke on the current state of Linux UEFI support and demonstrated the tools and processes to install and manage keys and hashes for the installed software. Would have been interesting to have Matthew Garrett at LCA this year to present his somewhat different solution in comparison.
  • Avi Miller from Oracle did an interesting presentation on a new Linux feature called “Transcendent Memory”, which is a solution to the memory ballooning problems for virtualised environments. Essentially it works by giving the kernel the option to request more memory from another host, which could be the VM host, or even another host entirely connected via 10GigE or Infiniband, and having the kernel request and release memory when required. To make it even more exciting, memory doesn’t have to be just RAM, SSDs are also usable, meaning you could add a couple of memory hosts to your Xen (and soon KVM) environments and stack them with RAM and SSD to then be provided to all your other guests as a memory ballooning space. It’s a very cool concept and one I intend to review further in future.
  • To wrap up the day, Michael Schwern presented on the 2038 bug – the problem where 32-bit computers are unable to keep time past January 2038 and reset back to 1901, due to the limits of a signed 32-bit time counter (see wikipedia, and the quick sketch below). Time is something that always appears very simple, yet is extremely complex to do right once you consider timezones and other weirdness like leap years/seconds.
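
For a concrete feel for the wraparound, here’s a tiny Python sketch of my own (not from Michael’s talk) showing why a signed 32-bit counter of seconds since 1970 runs out in January 2038 and lands back in December 1901:

    # A signed 32-bit time_t counts seconds from the 1970 epoch, so it tops
    # out at 2**31 - 1 and then flips negative when it overflows.
    import struct
    from datetime import datetime, timedelta, timezone

    EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
    INT32_MAX = 2**31 - 1

    print(EPOCH + timedelta(seconds=INT32_MAX))    # 2038-01-19 03:14:07+00:00

    # One second later the value no longer fits; reinterpreting the
    # overflowed counter as signed shows where a 32-bit system ends up.
    wrapped = struct.unpack("<i", struct.pack("<I", (INT32_MAX + 1) & 0xFFFFFFFF))[0]
    print(wrapped)                                 # -2147483648
    print(EPOCH + timedelta(seconds=wrapped))      # 1901-12-13 20:45:52+00:00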

The end of time is here! Always trust announcements by a guy wearing a cardboard and robes.

The conference presentations finished up with a surprise talk from Simon Hackett and Robert Llewellyn from Red Dwarf,  which was somewhat entertaining, but not highly relevant for me – personally I’d rather have heard more from Simon Hackett on the history and future expectations for the ISP industry in Australia than having them debate their electric cars.

Thursday was the evening of the Penguin Dinner, the (usually) formal dinner held at each LCA. This year, rather than the usual sit down 3-course dinner, the conference decided to do a BBQ-style event up at the Observatory on Mount Stromlo.

The Penguin Dinner is always a little pricey at $80, but for a night out, good food, drinks and spending time with friends, it’s usually a fun and enjoyable event. Sadly this year had a few issues that kind of spoilt it, at least for me personally, with some major failings on the food and transport which led to me spending only 2 hours up the mountain and feeling quite hungry.

At the same time, LCA is a volunteer organised conference and I must thank them for making the effort, even if it was quite a failure this year – I don’t necessarily know all the behind the scenes factors, although the conflicting/poor communications really didn’t put me in the best mood that night.

Next year there is a professional events coordinator being hired to help with the event, so hopefully this adds value in their experience handling logistics and catering to avoid a repeat of the issue.

On the plus side, for the limited time I spent up the mountain, I got some neat photographs (I *really* need to borrow Lisa’s DSLR rather than using my cellphone for this stuff) and spent some good time discussing life with friends lying on the grass looking at the stars after the sun went down.

Part of the old burnt-out observatory

Sun setting along the ridge.

What is it with geeks and blue LEDs? ;-)

The other perk from the penguin dinner was the AWESOME shirts they gave everyone in the conference as a surprise. Lisa took this photo when I got back to Sydney since she loves it [1] so much.

Paaaartay!

[1] She hates it.

linux.conf.au: day 3

Having reached mid-week, my morning wakeup is getting increasingly difficult from late nights, thankfully there were large amounts of deep fried potato and coffee readily available.

Breakfast of champions – just add cheese and it would be a meal.

Coffee Coffee Coffee Coffee Coffee Coffee Coffee Coffee Coffee

The day had some interesting talks, most of the value I got was out of the web development space:

  • Andy Fitzsimon did an interesting presentation on design and how to approach designing applications or websites and the terminologies that developers use.
  • Sarah Sharp presented on “vampire mice” – essentially a lot of USB devices don’t correctly obey the USB power suspend options; the result is that by enabling USB suspend for all your devices and disconnecting those that don’t obey, considerable power can be saved – one audience member found he could save 4W by sleeping all his USB devices. I also discovered that newer versions of Powertop now provide the ability to select particular USB devices for power-save mode (there’s a rough sketch of the sysfs side of this after this list).
  • There was a really good talk by Joel Stanley, probably one of the most interesting talks that day, on how they designed and built some hardware for doing digital radio transmissions using a radio circuit connected into an Android phone and the challenges encountered of doing hardware integration with Android.
  • We had an update on IPv6 adoption by Geoff Huston – sadly as expected, we’re dangerously low on IPv4 space, yet IPv6 adoption isn’t taking place particularly quickly either, with Internode still being the only major AU ISP with dual stacked addressing for consumers. On a side note, it’s really awesome to see a former keynote presenter come back as a regular presenter and give a talk; that kind of community engagement really adds to my respect for them.
  • My friend Adam Harvey did another awesome web development talk, this time presenting on some of the new CSS3 techniques including animation and transitions with some demonstrations on how these can work.
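
Since the USB power talk got me poking around in sysfs afterwards, here’s a rough Python sketch of my own (not Sarah’s code) of what enabling autosuspend looks like under the hood – roughly the knob that powertop toggles for you. It needs root to actually write the setting, and uses the kernel’s standard power/control files under /sys/bus/usb:

    # Enable USB runtime autosuspend for every device that exposes the knob.
    import glob
    import os

    for control in glob.glob("/sys/bus/usb/devices/*/power/control"):
        device = os.path.dirname(os.path.dirname(control))
        try:
            with open(os.path.join(device, "product")) as f:
                name = f.read().strip()
        except OSError:
            name = os.path.basename(device)
        with open(control) as f:
            current = f.read().strip()
        print(name, current)
        if current != "auto":
            # "auto" allows runtime suspend, "on" keeps the device awake.
            with open(control, "w") as f:
                f.write("auto")

If a device misbehaves once suspended (the vampire mice of the talk title), writing "on" back to its power/control file restores the old behaviour.
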
Open source radio receiver with Android phone coupled.

users: delighted, presenter: smug :-P

Spot the possum!

With all the talks this week, I’m feeling particularly motivated to do some more development this week, starting with writing some new proper landing pages for some of my projects.

Playing with new HTML5/CSS3 effects having been inspired to upskill my web development skills.

linux.conf.au: day 2

The second day of linux.conf.au has been and gone, was another day of interesting miniconf talks and many geeky discussions with old and new friends.

Jethro: Booted, with the power of coffee!

The keynote was a really good talk by Radia Perlman about how engineers approach developing network protocols, with an interesting look at the history of STP and its designed replacement, TRILL. Great to see a really technical female keynote speaker at LCA this year, particularly one as passionate about her topic as Radia.

The conference WiFi is still pretty unhappy this year; I've been suffering pretty bad latency and packet loss (30-50%) for most of the past few days – when I've been able to find an AP at all, as they seem to be located only around the lecture rooms. Yesterday afternoon it seemed to start improving, however, so it may be that the networking team have beaten the university APs into submission.

No internet makes sad Jethro sad. :'(

Of course, some of the projectors decided not to play nicely, which seems pretty much business as usual when it comes to projectors actually functioning… it appears that the projector in question would complain about the higher refresh rates provided by DVI and HDMI connected devices, but functioned correctly with VGA.

Someone did an interesting talk a couple of LCAs ago on this issue; apparently many projectors lie about what their true capabilities are and request resolutions and refresh rates from the computer that are higher than what they can actually support, which really messes with any modern operating system's auto-detection.

Lending my VGA enabled Thinkpad to @lgnome whilst @chrisjrn observes.

A startled @colmiga approaches!

Geeks listening intently to concurrent programming.

@lgnome pushing some crazy new drugs to all the kiddies

A few of my friends were delivering talks today, so I spent my time between the Browser miniconf and Open Programming miniconf, picked up some interesting new technologies and techniques to look at:

  • Adam Harvey’s PHP talks were great as usual, always good to get an update on the latest developments in the PHP world.
  • Francois Marier from Mozilla NZ presented on Content Security Policy, a technique I wasn’t aware of until now. Essentially it allows you to set a header defining which sites should be trusted as sources of CSS, Javascript and image content, allowing a well developed site to be locked down to prevent many forms of XSS (cross site scripting).
  • Francois also spoke briefly about HTTP Strict Transport Security, a header which SSL websites can use to address the long standing problem of users being intercepted by a bad proxy and served up a hacked HTTP-only version of the website. Essentially this header tells your browser that your site should only ever be accessed via HTTPS – anything that then directs your browser to HTTP will result in a security block, protecting the user, since your browser has been told from its previous interaction that the site should only ever be SSL. It's not perfect, but it's a great step forwards: as long as the first connection is made on a trusted, non-intercepted link, it makes this kind of downgrade man-in-the-middle attack far harder (see the example headers in the sketch just after this list).
  • Daniel Nadasi from Google presented on AngularJS, a modern Javascript framework suitable for building complex applications with features designed to reduce the complexity of developing the required Javascript.
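To make the two headers above concrete, here's a tiny sketch of setting them from a Python WSGI app – the policy values are illustrative defaults I've picked for the example, not recommendations from either talk:

<code>
# Hypothetical WSGI middleware adding the two headers discussed above.
# The policy values are examples only.
def security_headers(app):
    def wrapped(environ, start_response):
        def add_headers(status, headers, exc_info=None):
            headers = list(headers) + [
                # Only allow scripts, styles and images from our own origin.
                ("Content-Security-Policy", "default-src 'self'"),
                # Tell the browser to only ever use HTTPS for the next year.
                ("Strict-Transport-Security", "max-age=31536000; includeSubDomains"),
            ]
            return start_response(status, headers, exc_info)
        return app(environ, add_headers)
    return wrapped

def hello(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]

application = security_headers(hello)
</code>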

After that, dinner at one of the (many!) Asian restaurants in the area, followed by some delicious beer at the Wig and Pen.

Either I’ve already had too many beers, or there’s a giant stone parcel in my way.

Onwards to delicious geekiness!

Delicious hand pulled pale ale.

The beetroot beer is an interesting idea. But some ideas should just not be attempted. :-/

Native Australian night life! This little fellow was very up close and friendly.

Linux.conf.au native wildlife. ;-)

Another great day, looking forwards to Wednesday and the rest of the week. :-)

linux.conf.au: day 1

First proper day of linux.conf.au today, starting with breakfast and the quest of several hundred geeks to find and consume coffee.

Some of us went a bit overboard to get their exact daily coffee fix….

After acquiring coffee, we started the day with a keynote by the well known Bdale Garbee, talking about a number of (somewhat controversial) thoughts and reflections on Linux and the open source ecosystem in regards to the uptake by commercial companies.

Keynote venue.

Bdale raised some really good points, particularly how GNU/Linux isn't a sellable idea to OEM vendors on cost – many vendors pay nothing for Microsoft licensing, or even make a profit due to the amount of preloaded crapware they ship with the computers. Vendors are unlikely to ship GNU/Linux unless there is sufficient consumer demand, or a feature set compelling enough to justify it.

My take on the talk was that Bdale was advocating that we aren't going to win the desktop on mass popularity – instead of trying to build a desktop for the average joe, we should build desktops that meet our own needs as power users.

It's an interesting approach – some of the more recent endeavours by desktop developers have led to environments that newer users like, but power users hate (eg GNOME 3). As a power user, I share this view: I'd rather we develop a really good power user OS than an OS designed for the simplest user. Having said that, the nice thing about open source is that developers can target different audiences and share each other's work.

Bdale goes on to state that the year of the Linux desktop isn't relevant, it's something we're probably never going to win – but we have won the year of Linux on the mobile, which is going to replace conventional workstations more and more for the average user and become the dominant device used.

It's something I personally believe as well; I already have some friends who *only* own a phone or tablet, instead of a desktop or laptop, and use it for all their communications. In this space, Android/Linux is selling extremely well.

And although it's not the conventional GNU/Linux space we know and love, and it still has its share of problems, a future where Android/Linux is the dominant device OS is much more promising than the current Windows/MacOS duopoly.

The rest of the day had a mix of miniconf talks – there wasn’t anything particularly special for me, but there were some good highlights during the day:

  • Sheeri Cabral did a great talk on what it means to be a senior sysadmin, stating that a proper senior sysadmin knows how to solve problems by experience (not guesswork), works to continuously automate themselves out of a job with better tools, and works to impart knowledge onto others.
  • Andrew Bartlett did a brief update on Samba 4 (the Linux CIFS/SMB implementation) – it's production ready now and includes proper Active Directory support. The trade-off is that in order to implement AD, you can't use an external LDAP directory or Kerberos server when running Samba 4 in AD server mode.
  • Nick Clifford did an entertaining presentation on the experiences and suffering from working with SNMP, turns out that both vendor and open source SNMP implementations are generally quite poor quality.
  • Several interesting debates over the issues with our current monitoring systems (Nagios, Icinga, Munin, etc) and how we can fix them and scale better – no clear “this is the solution” responses, but some good food for thought.

Overall it was a good first day, followed up by some casual drinks and chats with friends – thankfully we even managed to find an open liquor store in Canberra on a public holiday.

Poor @lgnome expresses his pain at yet another closed liquor store before we located an open location.


 

 

linux.conf.au: day 0

It's time for the most important week of the year – linux.conf.au – which is being held in Canberra this year. I'm actually going to try and blog each day this year, unlike last year, which still has all my photos sitting in the “to be blogged” folder. :-)

Ended up taking the bus down from Sydney to Canberra – at only around $60 and a 3 hour trip, it made more sense to take the bus down, rather than go through the hassle of getting to and from the airports and all the security hassles of flying.

Ended up having several other linux.conf.au friends on the bus, which makes for an interesting trip – and having a bus with WiFi and power was certainly handy.

I am geek, hear me roar!

Horrifying wail of the Aucklander!

The road trip down to Canberra wasn't particularly scenic – most of the route is just dry Australian bush and motorways. Generally it seems inter-city road trips in AU tend not to be wildly scenic, unlike most of the ones I take in NZ.

Canberra itself is interesting; my initial thought on entering the city was that it's kind of a cross between Rotorua and post-quake Christchurch – most of the city is low rise 5-10 storey buildings and low density sprawl, and it's extremely quiet with both the university and parliament on break. In fact many have already commented it would be a great place to film a zombie movie, simply due to its eerily deserted nature.

Considering it's a designed city, I do wonder why they chose such a sprawling design; IMHO it would have been far better to have a small, high density CBD which would be easily walkable, with massive park lands around it. Canberra also made the mistake of not putting in light rail, instead relying on buses and cars as primary transport.

Neat fountain in town

The Aussies can never make fun of us Kiwis and sheep again… at least we don’t have THIS in our capital city O_o

Impressively large transmission tower for such a small city.

One nice side of Canberra is that with the sprawl, there tends to be a lot of greenery (or what passes for greenery in the aussie heat!) around the town and campus, including a bit of wildlife – so far I've seen rabbits, cockatoos, and lizards, which makes a nice change from Sydney's wildlife viewing of giant rats running over concrete pavements.

Sqwark!

The evening was spent tracking down the best pub options nearby, and we were fortunate enough to discover the Wig and Pen, a local British-style brewery/pub, with about 10 of their own beers on hand pulled taps. I'm told that when the conference was here in Canberra in 2005, the attendees drank the pub dry – twice. Hopefully they have more beer in stock this year.

First beer casualty from the conference – laptop being stood vertically to drain, whilst charging a cellphone.

Normally every year the conference provides a swag bag, typically the bag is pretty good and there’s usually a few good bits in there, as well as spammy items like brochures, branded cheap gadgets (USB speakers, reading lights, etc).

This year they’ve cut down hugely on the swag volume, my bag simply had some bathroom supplies (yes, that means there’s no excuse for the geeks to wash this week), a water bottle, some sunblock and the conference t-shirt. I’m a huge fan of this reduction in waste and hope that other conferences continue on with this theme.

Arrrrrr there be some swag me mateys!

The conference accommodation isn’t the best this year – it’s clean and functional, but I’m really not a huge fan of the older shared dorm styles with communal bathroom facilities, particularly the showers with their coffin-style claustrophobic feel.

The plus side of course, is that the accommodation is always cheap and your evenings are filled with awesome conversations and chats with other geeks.

Looking forward to the actual talks – there are going to be lots of interesting cloud and mobile talks this year, as well as the usual kernel, programming and sysadmin streams. :-)

linux.conf.au 2013 plans

It's nearing that important time of year when the NZ-AU open source flock congregates for the time honoured tradition of linux.conf.au. I've said plenty about this conference in the past, and I'm going to make an effort to write a lot more about it this year.

There's a bit of concern this year that there might not be a team ready to take up the mantle for 2014; unfortunately linux.conf.au is a victim of its own success – as each year has grown bigger and better, it's at the stage where a lot of volunteers consider it too daunting to take on themselves. Hopefully a team has managed to put together a credible bid for 2014, as it would be sad to lose this amazing conference.

As I'm now living in Sydney, I can actually get to this year's conference via a business class coach service, which is way cheaper than flying and really just as fast once you take into account the hassles of getting to the airport, going through security and flying. Avoiding the security theatre is a good enough reason for me really – I travel a lot, but I actually really hate all the messing about.

If you’re attending the conference and departing from Sydney (or flying into Sydney from NZ to then transfer to Canberra), I’d also suggest this bus service – feel free to join me on my booked bus if you want a chat buddy:

  • Depart Sydney, Sunday 27th Jan at 11:45 on bus GX273.
  • Depart Canberra, Saturday 2nd Feb at 14:00 on bus GX284.

The bus has WiFi and power and extra leg room, so should be pretty good if you want to laptop the whole way in style – for about $35 each way.

Leosticks are a gateway drug

At linux.conf.au earlier this year, the guys behind Freetronics gave every attendee a free Leostick, an Arduino-compatible board.

As I predicted at the time, this quickly became the gateway drug – having been given an awesome little 8-bit processor that runs off the USB port and can drive all sorts of digital and analogue input/output, it was inevitable that I would want to actually acquire some hardware to connect to it!

Beware kids, this is what crack looks like.

My background in actual electronics hasn't been great; my parents kindly got me a Dick Smith starter kit when I was much younger (remember back in the day when DSE actually sold components! Now I feel old :-/) but I never quite managed to grasp all the concepts, and a few attempts since then haven't been that successful.

Part of the issue for me is that I learn by doing and by having good resources to refer to – back then that wasn't so easy, however with internet connectivity and thousands of companies selling components to consumers, offering tutorials and circuit design information, it's never been easier.

Interestingly I found it hard to get a really good “you're a complete novice with no clue about any of this” guide, but the Arduino learning resources are very good at detailing how their digital circuits work, and with a bit of wikipediaing they got me on the right track so far.

Also not having the right tools and components for the job is an issue, so I made a decision to get a proper range of components, tools, hookup wire and some Arduino units to make a few fun projects to learn how to make this stuff work.

I settled on 3 main projects:

  1. Temperature monitoring inside my home server – this is a whitebox machine so doesn’t have too many sensors in good locations, I’d like to be able to monitor some of the major disk bays, fans, motherboard, etc.
  2. Out-of-band serial management and watchdog restart of my home server. This is more complex & ambitious, but all the components are there – with an RS232 to TTL conversion circuit I can read the server's serial port from the Arduino, and use the Arduino and a transistor to control the reset header on the motherboard to power-restart it if my slightly flaky CPU crashes again.
  3. Android controlled projects. This is a great one, since I have an abundance of older model Android phones available and would like a project that allows me to improve my C coding (Arduino) and to learn Java/Dalvik (Android). This ticks both boxes. ATM considering adding an Android phone to the Arduino server monitoring solution, or maybe hooking it into my car and using the Android phone as the display.

These cover a few main areas – to learn how to talk with one wire sensor devices, to learn how to use transistors to act as switches, to learn different forms of serial communication and to learn some new programming languages.

Having next to no current electronic parts (soldering iron, breadboard and my general PC tools were about it) I went down the path of ordering a full set of different bits to make sure I had a good selection of tools and parts to make most circuits I want.

Ended up sourcing most of my electronic components (resistor packs, prototyping boards, hookup wire, general capacitors & ICs) from Mindkits in NZ, who also import a lot of Sparkfun stuff, giving them a pretty awesome range.

Whilst the Arduinos I ordered supply 5V and 3.3V, I grabbed a separate USB-powered supply kit for projects needing their own feed – much easier running off USB (of which I have an abundance of ports around) than adding yet-another-wallwart transformer. I haven’t tackled it yet, but I’m sure my soldering skills will be horrific and naturally worth blogging about in future to scare any competent electronics geek.

I also grabbed two Dallas 1-wire temperature sensors, which whilst expensive compared to the analog options are so damn simple to work with and can be daisy chained. Freetronics sell a breakout board model all pre-assembled, but they’re pricey and they’re so simple you can just wire the sensors straight back to your Arduino circuit anyway.
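As a taste of where project #1 is heading, here's a hypothetical host-side sketch – it assumes an Arduino sketch (not written yet) that polls the DS18B20 sensors and prints one "sensor-id:temperature" line per reading over USB serial, and it uses pyserial on the server side:

<code>
# Hypothetical host-side monitor for project #1. Assumes the Arduino prints
# lines like "28-000004a3c1ff:34.5" over its USB serial port.
import serial  # pip install pyserial

PORT = "/dev/ttyACM0"   # typical for a Leostick/Leonardo; adjust to suit
WARN_AT = 45.0          # degrees C, an arbitrary threshold for this example

link = serial.Serial(PORT, 9600, timeout=5)
while True:
    line = link.readline().decode("ascii", "ignore").strip()
    if ":" not in line:
        continue
    sensor_id, _, reading = line.partition(":")
    try:
        temp = float(reading)
    except ValueError:
        continue                      # ignore garbled readings
    flag = " <-- WARNING" if temp >= WARN_AT else ""
    print("%s %.1fC%s" % (sensor_id, temp, flag))
</code>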

Next I decided to order some regular size Arduinos from Freetronics – if I start wanting to make my own shields (expansion boards for the Arduinos), I’d need a regular sized unit rather than the ultrasmall Leostick.

Ended up getting the classic Arduino Eleven/Uno and one of the Arduino USB Droids, which provide a USB Host port so they can be used with Android phones to write software that can interface with hardware.

After a bit of time, all my bits have arrived from AU and the US and now I'm all ready to go – planning to blog my progress as I get on with my electronics discovery – hopefully before long I'll have some neat circuit designs up on here. :-)

Once I actually have a clue what I’m doing, I’ll probably go and prepare a useful resource on learning from scratch, to cover all the gaps that I found hard to fill, since learning this stuff opens up so many exciting projects once you get past the initial barrier.

Arduino Uno/Eleven making an LED blink. HIGH TECH STUFF ;-)

Push a button to make the LED blink! Sure you can do this with just a battery, switch and LED, but using a whole CPU to read the button state and switch on the LED is much geekier! ;-)

1-wire temperature sensors. Notably with a few more than one wire. ;-)

I’ll keep posting my adventures as I get further into the development of different designs, I expect this is going to become a fun new hobby that ties into my other two main interests – computers and things with blinky lights. :-)

AirNZ 747, yay!

For my trip to linux.conf.au in Melbourne/Ballarat I had to reschedule my flights to depart from Auckland instead of Wellington, since I had booked them before my lovely lady dragged me up to Auckland to live with her.

It's the first time I've ever flown out of Auckland International Airport and to my delight, I was booked on an Air New Zealand 747. This is the very first time I've flown on one, and with AirNZ phasing out the 747s in favor of 777s, I'm glad to have been able to fly on one before they get phased out entirely.

OMG PLANE! WITH AN UPSTAIRS!

I’d also like to add just for @thatjohn, that I got some awesome perks on the flight over, including a smile from a cute attendant and a FREE PEN! \m/

 

linux.conf.au 2014

I've just returned from my annual pilgrimage to linux.conf.au, which was held in Perth this year. It's the first time I've been over to Western Australia – it's a whole 5 hour flight from Sydney, longer than it takes to fly to New Zealand.

Perth’s climate is a very dry heat compared to Sydney, so although it was actually hotter than Sydney for most of the week, it didn’t feel quite as unpleasant – other than the final day which hit 45 degrees and was baking hot…

It’s also a very clean/tidy city, the well maintained nature was very noticeable with the city and gardens being immaculately trimmed – not sure if it’s always been like this, or if it’s a side effect of the mining wealth in the economy allowing the local government to afford it more effectively.

The towering metropolis of mining wealth.

As usual, the conference ran for 5 full days and featured 4-5 concurrent streams of talks during the week. The quality was generally high as always, although I feel that content selection has shifted away from a lot of deep dive technical talks to more high level talks, and that OpenStack (whilst awesome) is taking up far too much of the conference and really deserves its own dedicated conference now.

I’ve prepared my personal shortlist of the talks I enjoyed most of all for anyone who wants to spend a bit of time watching some of the recorded sessions.

 

Interesting New(ish) Software

  1. RatticDB – A web-based password storage system written in Python by friends in Melbourne. I've been trialling it, and since then it's been growing in popularity and awareness, as well as getting security audits (and fixes) [video] [project homepage].
  2. MARS Light – This is an insanely awesome replacement for DRBD designed to address the issues of DRBD when replicating over slower, long WAN links. Like DRBD, MARS Light does block-level replication, so it's ideal for entire datacenter and VM replication. [video] [project homepage].
  3. Pettycoin – Proposal/design for an adjacent network to Bitcoin designed for microtransactions. It’s currently under development, but is an interesting idea. [video] [project homepage].
  4. Lua code in Mediawiki – the Mediawiki developers have added the ability for Wikipedia editors to write Lua code that is executed server side, which is pretty insanely awesome when you think about how normally nobody wants to allow the untrusted public to remotely execute code on their systems. The developers have taken Lua and created a “safe” version that runs inside PHP with restrictions to make this possible. [video] [project homepage].
  5. OpenShift – RedHat did a demonstration on their hosted (and open source) PAAS platform, OpenShift. It's a solution I've been looking at before; if you're a developer who doesn't care about infrastructure management, it looks very attractive. [video] [project homepage].

 

Evolution of Linux

  1. D-Bus in the Kernel – Lennart Poettering (of PulseAudio and systemd fame) presented the efforts he's been involved in to fix D-Bus's shortcomings and move it into the kernel itself, making D-Bus a proper high speed IPC solution for the Linux kernel. [video]
  2. The Six Stages of SystemD – Presentation by an engineer who has been moving systems to SystemD and the process he went through and his thoughts/experience with SystemD. Really showcases the value that moving to SystemD will bring to GNU/Linux distributions. [video]
  3. Development Tools & The UNIX Philosophy – Excellent talk by a Python developer on how we should stop accepting command-line only tools as being the “right” or “proper” UNIX-style tools. Some tools (eg debuggers) are just better suited for graphical interfaces, and that it still meets the UNIX philosophy of having one tool doing one thing well. I really like the argument he makes and have to agree, in some cases GUIs are just more suitable for some tasks. [video]

 

Walkthroughs and Warstories

  1. TCP Tuning for the Web – presented by one of the co-founders of Fastly showing the various techniques they use to improve the performance of TCP connections and handle issues such as DDOS attacks. Excellent talk by a very smart networking engineer. [video]
  2. Massive Scaling of Graphite – very interesting talk on the massive scaling issues involved to collect statistics with Graphite and some impressive and scary stats on the lifespans and abuse that SSDs will tolerate (which is nowhere near as much as they should!). [video]
  3. Maintaining Internal Forks – One of the FreeBSD developers spoke on how his company maintains an internal fork of FreeBSD (with various modifications for their storage product) and the challenges of keeping it synced with the current releases. Lots of common problems, such as pain of handling new upstream releases and re-merging changes. [video]
  4. Reverse engineering firmware – Matthew Garrett dug deep into vendor firmware configuration tools and explained how to reverse engineer their calls with various tools such as strace, IO and memory mapping tools. Well worth a watch purely for the fact that Matthew Garrett is an amazing speaker. [video]
  5. Android, The positronic brain – Interesting session on how to build native applications for Android devices, such as cross compiling daemons and how the internal structure of Android is laid out. [video]
  6. Rapid OpenStack Deployment – Double-length Tutorial/presentation on how to build OpenStack clusters. Very useful if you’re looking at building one. [video]
  7. Debian on AWS – Interesting talk on how the Debian project is using Amazon AWS for various serving projects and how they’re handling AMI builds. [video]
  8. A Web Page in Seven Syscalls – Excellent walk through on Varnish by one of the developers. Nothing too new for anyone who’s been using it, but a good explanation of how it works and what it’s used for. [video]

 

Other Cool Stuff

  1. Deploying software updates to ArduSat in orbit by Jonathan Oxer – Launching Arduino powered satellites into orbit and updating them remotely to allow them to be used for educational and research purposes. What could possibly be more awesome than this? [video].
  2. HTTP/2.0 and you – Discussion of the emerging HTTP/2.0 standard. Interesting and important stuff for anyone working in the online space. [video]
  3. OpenStreetMap – Very interesting talk from the director of OpenStreetMap Team about how OpenStreetMap is used around disaster prone areas and getting the local community to assist with generating maps, which are being used by humanitarian teams to help with the disaster relief efforts. [video]
  4. Linux File Systems, Where did they come from? – A great look at the history and development cycles of the different filesystems in the Linux kernel – comparing ext1/2/3/4, XFS, ReiserFS, Btrfs and others. [video]
  5. A pseudo-random talk on entropy – Good explanation of the importance of entropy on Linux systems, but much more low level and about what tools there are for helping with it. Some cross-over with my own previous writings on this topic. [video]

Naturally there have been many other excellent talks – the above is just a selection of the ones that I got the most out from during the conference. Take a look at the full schedule to find other talks that might interest, almost all sessions got recorded during the conference.

linux.conf.au: day 5

Final day of linux.conf.au – I’m about a week behind schedule in posting, but that’s about how long it takes to catch up on life following a week at LCA. ;-)

uuuurgggh need more sleep

I like that guy’s idea!

Friday's conference keynote was delivered by Tim Berners-Lee, who is widely known as “the inventor of the world wide web”, but is more accurately described as the developer of HTML, the markup language behind all websites. Certainly TBL was an influential player in the internet's creation and evolution, but the networking and IP layers of the internet were already being developed by others and are arguably more important than HTML itself – calling anyone the inventor of the internet is wrong for such a collaborative effort.

His talk was enjoyable, although very much a case of preaching to the choir – there wasn’t a lot that would really surprise any linux.conf.au attendee. What *was* more interesting than his talk content, is the aftermath….

TBL was in Australia and New Zealand for just over 1 week, where he gave several talks at different venues, including linux.conf.au as part of the “TBL Down Under Tour“. It turns out that the 1 week tour cost the organisers/sponsors around $200,000 in charges for TBL to speak at these events, a figure I personally consider outrageous for someone to charge non-profits for a speaking event.

I can understand high demand speakers charging to ensure that they have comfortable travel arrangements and even to compensate for lost earnings, but even at an expensive consultant’s charge rate of $1,500 per day, that’s no more than $30,000 for a 1 week trip.

I could understand charging a little more if it’s an expensive commercial conference such as $2k per ticket per day corporate affairs, but I would rather have a passionate technologist who comes for the chance to impart ideas and knowledge at a geeky conference, than someone there to make a profit any day –  the $20-40k that Linux Australia contributed would have paid several airfares for some well deserving hackers to come to AU to present.

So whilst I applaud the organisers and particularly Pia Waugh for the effort spent making this happen, I have to state that I don't think it was worth it, and seeing the amount TBL charged for this visit to a non-profit entity actually really sours my opinion of the man.

I just hope that seeing a well known figure talking about open data and internet freedom at some of the more public events leads to more positive work in that space in NZ and AU and goes towards making up for this cost.

Outside the conference hall.

Friday had its share of interesting talks:

  • Stewart Smith spoke a bit about SQL databases, with a focus around MySQL & the varieties being used in cloud and hosted environments. Read his latest blog post for some amusing hacks to execute on databases.
  • I ended up frequenting a few Linux graphical environment related talks, including David Airlie talking about improvements coming up in the X.org server, as well as Daniel Stone explaining the Wayland project and architecture.
  • Whilst I missed Keith Packard’s talk due to a scheduling clash, he was there heckling during both of the above talks. (Top tip – when presenting at LCAs, if one of the main developers of the software being discussed is in the audience, expect LOTS of heckles). ;-)
  • Francois Marier presented on Persona (developed by Mozilla), a single sign on system for the internet with a federated, decentralised design. Whilst I do have some issues with parts of its design, overall it's pretty awesome and it fixes a lot of problems that plagued other attempts like OpenID. I expect I'll cover Persona more in a future blog post, since I want to set up a Persona server myself and test it out more, and I'll detail more about the good and the bad of this proposed solution.

Sadly it turns out Friday is the last day of the conference, so I had to finish it up with the obligatory beer and chat with friends, before we all headed off for another year. ;-)

They're taking the hobbits to Isengard! Or maybe just back to the dorms via the stream.


A dodgy looking character with a wire running into a large duffle bag…

Hopefully not a road-side bomber.

The fuel that powers IT

Incoming!

OpenStack infrastructure swift logs and performance

Turns out I’m not very good at blogging very often. However I thought I would put what I’ve been working on for the last few days here out of interest.

For a while the OpenStack Infrastructure team have wanted to move away from storing logs on disk to something more cloudy – namely, swift. I’ve been working on this on and off for a while and we’re nearly there.

For the last few weeks the openstack-infra/project-config repository has been uploading its CI test logs to swift as well as storing them on disk. This has given us the opportunity to compare the last few weeks of data and see what kind of effects we can expect as we move assets into object storage.

  • I should add a disclaimer/warning, before you read, that my methods here will likely make statisticians cringe horribly. For the moment though I’m just getting an indication for how things compare.

The set up

Fetching files from object storage is nothing particularly new or special (CDNs have been doing it for ages). However, for our usage we want to serve logs with os-loganalyze, giving the opportunity to hyperlink to timestamp anchors or filter by log severity.

First though we need to get the logs into swift somehow. This is done by having the job upload its own logs. Rather than using (or writing) a Jenkins publisher we use a bash script to grab the job's own console log (pulled from the Jenkins web ui) and then upload it to swift using credentials supplied to the job as environment variables (see my zuul-swift contributions).

This does, however, mean part of the logs are missing. For example the fetching and upload processes write to Jenkins’ console log but because it has already been fetched these entries are missing. Therefore this wants to be the very last thing you do in a job. I did see somebody do something similar where they keep the download process running in a fork so that they can fetch the full log but we’ll look at that another time.
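For a rough idea of what that upload step boils down to, here's a hypothetical Python equivalent using python-swiftclient – the real jobs use a bash script, and the environment variable names below are made up for the example:

<code>
# Rough Python equivalent of the upload step; the real jobs use a bash script
# and the environment variable names here are illustrative only.
import os

import requests                       # to pull the console log from the Jenkins web ui
from swiftclient import client as swift  # pip install python-swiftclient

console = requests.get(os.environ["JOB_CONSOLE_URL"]).text

conn = swift.Connection(
    authurl=os.environ["SWIFT_AUTH_URL"],
    user=os.environ["SWIFT_USER"],
    key=os.environ["SWIFT_KEY"],
    auth_version="2.0",
)
# Store the log under a per-job prefix so it can be found again later.
conn.put_object("logs", os.environ["LOG_PATH"] + "/console.html",
                contents=console, content_type="text/html")
</code>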

When a request comes into logs.openstack.org, it is handled like so:

  1. apache vhost matches the server
  2. if the request ends in .txt.gz, console.html or console.html.gz rewrite the url to prepend /htmlify/
  3. if the requested filename is a file or folder on disk, serve it up with apache as per normal
  4. otherwise rewrite the requested file to prepend /htmlify/ anyway

os-loganalyze is set up as a WSGIScriptAlias at /htmlify/. This means all files that aren't on disk are sent to os-loganalyze (or if the file is on disk but matches a file we want to mark up it is also sent to os-loganalyze). os-loganalyze then does the following:

  1. Checks the requested file path is legitimate (or throws a 400 error)
  2. Checks if the file is on disk
  3. Checks if the file is stored in swift
  4. If the file is found, markup (such as anchors) is optionally added and the request is served
    1. When serving from swift the file is fetched via the swiftclient by os-loganalyze in chunks and streamed to the user on the fly. Obviously fetching from swift will have larger network consequences.
  5. If no file is found, 404 is returned

If the file exists both on disk and in swift then step #2 can be skipped by passing ?source=swift as a parameter (thus only attempting to serve from swift). In our case the files exist both on disk and in swift since we want to compare the performance so this feature is necessary.
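As a simplified sketch of that lookup order (not the actual os-loganalyze code), the logic is roughly:

<code>
# Simplified sketch of the serving order described above, not the real
# os-loganalyze code. `query` is a parse_qs-style dict of the query string.
import os

CHUNK = 64 * 1024

def open_log(path, query, swift_conn, container="logs"):
    if query.get("source") != ["swift"] and os.path.isfile(path):
        return open(path, "rb")        # step 2: serve straight from disk
    # step 3: fall back to swift, fetched in chunks and streamed out
    headers, body = swift_conn.get_object(container, path.lstrip("/"),
                                          resp_chunk_size=CHUNK)
    return body                        # an iterator of chunks
</code>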

So now that we have the logs uploaded into swift and stored on disk we can get into some more interesting comparisons.

Testing performance process

My first attempt at this was simply to fetch the files from disk and then from swift and compare the results. A crude little python script did this for me: http://paste.openstack.org/show/122630/

The script fetches a copy of the log from disk and then from swift (both through os-loganalyze and therefore marked-up) and times the results (a stripped-down version of the timing loop is sketched just after the list below). It does this in two scenarios:

  1. Repeatedly fetching the same file (to get a good average)
  2. Fetching a list of recent logs from gerrit (using the gerrit api) and timing those
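Here's what the timing loop looks like in stripped-down form – the real script (linked above) also measures the time taken to send the request itself, whereas this sketch only splits response and transfer:

<code>
# A stripped-down version of the timing loop; the real script also records
# the "request sent" phase separately.
import time

import requests  # assumed available on the test machine

def fetch(url, source=None):
    params = {"source": source} if source else {}
    start = time.time()
    resp = requests.get(url, params=params, stream=True)
    responded = time.time()                     # headers received
    size = sum(len(chunk) for chunk in resp.iter_content(64 * 1024))
    finished = time.time()
    return {"response": responded - start,
            "transfer": finished - responded,
            "size_kb": size / 1024.0}

url = "http://logs.openstack.org/some/job/console.html"   # example path only
for source in (None, "swift"):                  # disk first, then force swift
    runs = [fetch(url, source) for _ in range(10)]
    avg = sum(r["response"] for r in runs) / len(runs)
    print("%-5s average response: %.1f ms" % (source or "disk", avg * 1000))
</code>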

I then ran this in two environments.

  1. On my local network the other side of the world to the logserver
  2. On 5 parallel servers in the same DC as the logserver

Running on my home computer likely introduced a lot of errors due to my limited bandwidth, noisy network and large network latency. To help eliminate these errors I also tested it on 5 performance servers in the Rackspace cloud next to the log server itself. In this case I used ansible to orchestrate the test nodes thus running the benchmarks in parallel. I did this since in real world use there will often be many parallel requests at once affecting performance.

The following metrics are measured for both disk and swift:

  1. request sent – time taken to send the http request from my test computer
  2. response – time taken for a response from the server to arrive at the test computer
  3. transfer – time taken to transfer the file
  4. size – filesize of the requested file

The total time can be found by adding the first 3 metrics together.

 

Results

Home computer, sequential requests of one file

 

The complementary colours are the same metric and the darker line represents swift’s performance (over the lighter disk performance line). The vertical lines over the plots are the error bars while the fetched filesize is the column graph down the bottom. Note that the transfer and file size metrics use the right axis for scale while the rest use the left.

As you would expect, the requests for both disk and swift files are more or less comparable. We see a more noticeable difference on the responses though, with swift being slower. This is because disk is checked first, and if the file isn't found on disk then a connection is sent to swift to check there. Clearly this is going to be slower.

The transfer times are erratic and varied. We can't draw much from these, so let's dig deeper.

The total time from request to transfer can be seen by adding the times together. I didn’t do this as when requesting files of different sizes (in the next scenario) there is nothing worth comparing (as the file sizes are different). Arguably we could compare them anyway as the log sizes for identical jobs are similar but I didn’t think it was interesting.

The file sizes are there for interest sake but as expected they never change in this case.

You might notice that the end of the graph is much noisier. That is because I’ve applied some rudimentary data filtering.

Metric            | Std dev (disk) | Std dev (swift) | Mean (disk) | Mean (swift)
request sent (ms) | 54.89516183    | 43.71917948     | 283.9594368 | 282.5074598
response (ms)     | 56.74750291    | 194.7547117     | 373.7328851 | 531.8043908
transfer (ms)     | 849.8545127    | 838.9172066     | 5091.536092 | 5122.686897
size (KB)         | 7.121600095    | 7.311125275     | 1219.804598 | 1220.735632

 

I know it’s argued as poor practice to remove outliers using twice the standard deviation, but I did it anyway to see how it would look. I only did one pass at this even though I calculated new standard deviations.

 

Metric            | Std dev (disk) | Std dev (swift) | Mean (disk) | Mean (swift)
request sent (ms) | 13.88664039    | 14.84054789     | 274.9291111 | 276.2813889
response (ms)     | 44.0860569     | 115.5299781     | 364.6289583 | 503.9393472
transfer (ms)     | 541.3912899    | 515.4364601     | 5008.439028 | 5013.627083
size (KB)         | 7.038111654    | 6.98399691      | 1220.013889 | 1220.888889

 

I then moved the outliers to the end of the results list instead of removing them completely and used the newly calculated standard deviation (ie without the outliers) as the error margin.
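For the curious, the outlier handling amounts to something like this (a sketch, not the exact code I used):

<code>
# A sketch of the outlier handling: flag anything more than two standard
# deviations from the mean, then recompute the spread without those points
# and use that as the error margin.
import numpy as np

def split_outliers(samples):
    data = np.asarray(samples, dtype=float)
    keep = np.abs(data - data.mean()) <= 2 * data.std()
    kept, outliers = data[keep], data[~keep]
    return kept, outliers, kept.mean(), kept.std()

samples = [284.0, 281.2, 282.4, 283.9, 279.8, 282.1, 285.3, 281.5, 280.9, 530.0]
kept, outliers, mean, margin = split_outliers(samples)
print("mean %.1f ms +/- %.1f ms (%d outlier(s) set aside)"
      % (mean, margin, len(outliers)))
</code>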

Then to get a better indication of what are average times I plotted the histograms of each of these metrics.

Here we can see a similar request time.

 

Here it is quite clear that swift is slower at actually responding.

 

Interestingly both disk and swift sources have a similar total transfer time. This is perhaps an indication of my network limitation in downloading the files.

 

Home computer, sequential requests of recent logs

Next from my home computer I fetched a bunch of files in sequence from recent job runs.

 

 

Again I calculated the standard deviation and average to move the outliers to the end and get smaller error margins.

First pass
Metric            | Std dev (disk) | Std dev (swift) | Mean (disk) | Mean (swift)
request sent (ms) | 54.89516183    | 43.71917948     | 283.9594368 | 282.5074598
response (ms)     | 194.7547117    | 56.74750291     | 531.8043908 | 373.7328851
transfer (ms)     | 849.8545127    | 838.9172066     | 5091.536092 | 5122.686897
size (KB)         | 7.121600095    | 7.311125275     | 1219.804598 | 1220.735632

Second pass without outliers
Metric            | Std dev (disk) | Std dev (swift) | Mean (disk) | Mean (swift)
request sent (ms) | 13.88664039    | 14.84054789     | 274.9291111 | 276.2813889
response (ms)     | 115.5299781    | 44.0860569      | 503.9393472 | 364.6289583
transfer (ms)     | 541.3912899    | 515.4364601     | 5008.439028 | 5013.627083
size (KB)         | 7.038111654    | 6.98399691      | 1220.013889 | 1220.888889

 

What we are probably seeing here with the large number of slower requests is network congestion in my house. Since the script requests disk, swift, disk, swift and so on, this evens out, causing latency in both sources as seen.

 

Swift is very much slower here.

 

Although comparable in transfer times. Again this is likely due to my network limitation.

 

The size histograms don’t really add much here.

 

Rackspace Cloud, parallel requests of same log

Now to reduce latency and other network effects I tested fetching the same log over again in 5 parallel streams. Granted, it may have been interesting to see a machine close to the log server do a bunch of sequential requests for the one file (with little other noise), but I didn't do it at the time unfortunately. Also we need to keep in mind that others may be accessing the log server, and therefore any request, in both my testing and normal use, is going to have competing load.

 

I collected a much larger amount of data here making it harder to visualise through all the noise and error margins etc. (Sadly I couldn’t find a way of linking to a larger google spreadsheet graph). The histograms below give a much better picture of what is going on. However out of interest I created a rolling average graph. This graph won’t mean much in reality but hopefully will show which is faster on average (disk or swift).

 

You can see now that we’re closer to the server that swift is noticeably slower. This is confirmed by the averages:

 

First pass
Metric            | Std dev (disk) | Std dev (swift) | Mean (disk) | Mean (swift)
request sent (ms) | 32.42528982    | 9.749368282     | 4.87337544  | 4.05191168
response (ms)     | 245.3197219    | 781.8807534     | 39.51898688 | 245.0792916
transfer (ms)     | 1082.253253    | 2737.059103     | 1553.098063 | 4167.07851
size (KB)         | 0              | 0               | 1226        | 1232

Second pass without outliers
Metric            | Std dev (disk) | Std dev (swift) | Mean (disk) | Mean (swift)
request sent (ms) | 1.375875503    | 0.8390193564    | 3.487575109 | 3.418433003
response (ms)     | 28.38377158    | 191.4744331     | 7.550682037 | 96.65978872
transfer (ms)     | 878.6703183    | 2132.654898     | 1389.405618 | 3660.501404
size (KB)         | 0              | 0               | 1226        | 1232

 

Even once outliers are removed we’re still seeing a large latency from swift’s response.

The standard deviation in the requests has now gotten very small. We've clearly made a difference moving closer to the logserver.

 

Very nice and close.

 

Here we can see that for roughly half the requests the response time was the same for swift as for the disk. It’s the other half of the requests bringing things down.

 

The transfer for swift is consistently slower.

 

Rackspace Cloud, parallel requests of recent logs

Finally I ran just over a thousand requests in 5 parallel streams from computers near the logserver for recent logs.

 

Again the graph is too crowded to see what is happening so I took a rolling average.

 

 

First pass
Metric            | Std dev (disk) | Std dev (swift) | Mean (disk) | Mean (swift)
request sent (ms) | 0.7227904332   | 0.8900549012    | 3.515711867 | 3.56191383
response (ms)     | 434.8600827    | 909.095546      | 145.5941102 | 189.947818
transfer (ms)     | 1913.9587      | 2132.992773     | 2427.776165 | 2875.289455
size (KB)         | 6.341238774    | 7.659678352     | 1219.940039 | 1221.384913

Second pass without outliers
Metric            | Std dev (disk) | Std dev (swift) | Mean (disk) | Mean (swift)
request sent (ms) | 0.4798803247   | 0.4966553679    | 3.379718381 | 3.405770445
response (ms)     | 109.6540634    | 171.1102999     | 70.31323922 | 86.16522485
transfer (ms)     | 1348.939342    | 1440.2851       | 2016.900047 | 2426.312363
size (KB)         | 6.137625464    | 7.565931993     | 1220.318912 | 1221.881335

 

The averages here are much more reasonable than when we continually tried to request the same file. Perhaps we're hitting limitations with swift's serving abilities.

 

I'm not sure why we have a sinc-like function here. A network expert may be able to tell you more. As far as I know this isn't important to our analysis, other than the fact that both disk and swift match.

 

Here we can now see swift keeping a lot closer to disk results than when we only requested the one file in parallel. Swift is still, unsurprisingly, slower overall.

 

Swift still loses out on transfers but again does a much better job of keeping up.

 

Error sources

I haven’t accounted for any of the following swift intricacies (in terms of caches etc) for:

  • Fetching random objects
  • Fetching the same object over and over
  • Fetching in parallel multiple different objects
  • Fetching the same object in parallel

I also haven’t done anything to account for things like file system caching, network profiling, noisy neighbours etc etc.

os-loganalyze tries to keep authenticated with swift, however

  • This can timeout (causes delays while reconnecting, possibly accounting for some spikes?)
  • This isn’t thread safe (are we hitting those edge cases?)

We could possibly explore getting longer authentication tokens or having os-loganalyze pull from an unauthenticated CDN to add the markup and then serve. I haven’t explored those here though.

os-loganalyze also handles all of the requests not just from my testing but also from anybody looking at OpenStack CI logs. In addition to this it also needs to deflate the gzip stream if required. As such there is potentially a large unknown (to me) load on the log server.

In other words, there are plenty of sources of errors. However I just wanted to get a feel for the general responsiveness compared to fetching from disk. Both sources had noise in their results so it should be expected in the real world when downloading logs that it’ll never be consistent.

Conclusions

As you would expect the request times are pretty much the same for both disk and swift (as mentioned earlier) especially when sitting next to the log server.

The response times vary but looking at the averages and the histograms these are rarely large. Even in the case where requesting the same file over and over in parallel caused responses to go slow these were only in the magnitude of 100ms.

The response time is the important one as it indicates how soon a download will start for the user. The total time to stream the contents of the whole log is seemingly less important if the user is able to start reading the file.

One thing that wasn’t tested was streaming of different file sizes. All of the files were roughly the same size (being logs of the same job). For example, what if the asset was a few gigabytes in size, would swift have any significant differences there? In general swift was slower to stream the file but only by a few hundred milliseconds for a megabyte. It’s hard to say (without further testing) if this would be noticeable on large files where there are many other factors contributing to the variance.

Whether or not these latencies are an issue is relative to how the user is using/consuming the logs. For example, if they are just looking at the logs in their web browser on occasion they probably aren’t going to notice a large difference. However if the logs are being fetched and scraped by a bot then it may see a decrease in performance.

Overall I’ll leave deciding on whether or not these latencies are acceptable as an exercise for the reader.

Third party testing with Turbo-Hipster

Why is this hipster voting on my code?!

Soon you are going to see a new robot barista leaving comments on Nova code reviews. He is obsessed with espresso, that band you haven’t heard of yet, and easing the life of OpenStack operators.

Doing a large OpenStack deployment has always been hard when it came to database migrations. Running a migration requires downtime, and when you have giant datasets that downtime could be hours. To help catch these issues Turbo-Hipster (http://josh.people.rcbops.com/2013/09/building-a-zuul-worker/) will now run your patchset’s migrations against copies of real databases. This will give you valuable feedback on the success of the patch, and how long it might take to migrate.

Depending on the results, Turbo-Hipster will add a review to your patchset that looks something like this:

Example turbo-hipster post

What should I do if Turbo-Hipster fails?

That depends on why it has failed. Here are some scenarios and steps you can take for different errors:

FAILURE – Did not find the end of a migration after a start

  • If you look at the log you should find that a migration began but never finished. Hopefully there'll be a traceback for you to follow through to get some hints about why it failed.

WARNING – Migration %s took too long

  • In this case your migration took a long time to run against one of our test datasets. You should reconsider what operations your migration is performing and see if there are any optimisations you can make, or if each step is really necessary. If there is no way to speed up your migration you can email us at rcbau@rcbops.com for an exception.

FAILURE – Final schema version does not match expectation

  • Somewhere along the line the migrations stopped and did not reach the expected version. The datasets start at previous releases and have to upgrade all the way through. If you see this, inspect the log for tracebacks or other hints about the failure.

FAILURE – Could not setup seed database. FAILURE – Could not find seed database.

  • These two are internal errors. If you see either of these, contact us at rcbau@rcbops.com to let us know so we can fix and rerun the tests for you.

FAILURE – Could not import required module.

  • This error probably shouldn’t happen as Jenkins should catch it in the unit tests before Turbo-Hipster launches. If you see this, please contact us at rcbau@rcbops.com and let us know.

If you receive an error that you think is a false positive, leave a comment on the review with the sole contents of recheck migrations.

If you see any false positives or have any questions or problems please contact us on rcbau@rcbops.com

LinuxCon Europe

After travelling very close to literally the other side of the world[0], I'm in Edinburgh for LinuxCon EU, recovering from jetlag and getting ready to attend. I'm very much looking forward to my first LinuxCon, meeting new people and learning lots :-).

If you’re around and would like to catch up drop me a comment here. Otherwise I’ll see you at the conference!

[0] http://goo.gl/maps/JeJO2

New Blog

Welcome to my new blog.

You can find my old one here: http://josh.opentechnologysolutions.com/blog/joshua-hesketh

I intend on back-porting those posts into this one in due course. For now though I'm going to start posting about my adventures in OpenStack!

Introducing turbo-hipster for testing nova db migrations

Zuul is the continuous integration utility used by OpenStack to gate patchsets against tests. It takes care of communicating with gerrit (the code review system) and the test workers – usually Jenkins. You can read more about how the systems tie together on the OpenStack Project Infrastructure page.

The nice thing is that zuul doesn’t require you to use Jenkins. Anybody can provide a worker to zuul using the gearman protocol (which is a simple job server). Enter turbo-hipster*.

“Turbo-hipster is a CI worker with pluggable tasks initially designed to test OpenStack’s database migrations against copies of real databases.”

This will hopefully catch scenarios where changes to the database schema may not work due to outliers in real datasets and also help find where a migration may take an unreasonable amount of time against a large database.

In zuul's layout configuration we are able to specify which jobs should be run against which projects in which pipelines. For example, for nova we want to run tests when a patchset is created, but we don't (necessarily) need to run tests against it once it is merged etc. So in zuul we specify a new gate (aka job) to test nova against real databases.

turbo-hipster then listens for jobs created on that gate using the gearman protocol. Once it receives a patchset from zuul it creates a virtual environment and tests the upgrades. It then compiles and sends back the results.
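To illustrate what a non-Jenkins worker looks like, here's a toy gearman worker using the python-gearman library – turbo-hipster's real implementation is more involved, and the registered function name here is just a plausible example:

<code>
# A toy non-Jenkins zuul worker using the python-gearman library.
# turbo-hipster's real implementation differs; the registered function
# name below is only an example.
import json

import gearman  # pip install gearman

def run_migration_test(worker, job):
    params = json.loads(job.data)      # job parameters arrive as a JSON blob
    # ... check out params["ZUUL_REF"], run the migrations, collect timings ...
    return json.dumps({"result": "SUCCESS", "url": "http://example.com/logs/"})

worker = gearman.GearmanWorker(["zuul.example.com:4730"])
worker.register_task("build:gate-real-db-upgrade_nova_mysql", run_migration_test)
worker.work()   # block forever, handling jobs as they are dispatched
</code>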

At the moment turbo-hipster is still under heavy development, but I hope to have it reporting results back to gerrit patchsets soon as part of zuul's report summary. For the moment I have a separate zuul instance running to test new nova patches and email the results back to me. Here is an example result report:

<code>
Build succeeded.

- http://thw01.rcbops.com/logviewer/?q=/results/47/47162/9/check/gate-real-db-upgrade_nova_mysql/c4bc35c/index.html : SUCCESS in 13m 31s
</code>

Turbo Hipster Meme

*The name was randomly generated and does not necessarily contain meaning.

Open and Free Internet

The last week has been an interesting nexus of Open and Free.

On Saturday I attended the Firefox OS App day in Wellington. I had heard about Firefox OS some time ago under its project name Boot2Gecko (b2g). At the time I had thought that it was an intriguing idea, but wouldn't be very powerful. I was certainly wrong. Firefox OS is fairly mature and looking like it will be very powerful. Check out arewemobileyet.com for an idea of where they are heading (for example WebUSB!) It appeared to work well on the developer phones (re-flashed Android phones; the same Linux kernel is used).

All the applications on Firefox OS are web applications. In particular, they are Open Web Apps, using HTML5, CSS and Javascript. Even the phone dialer is an HTML5/JS app! Mozilla showed off a framework for building apps called mortar that takes care of the basic UI consistent with the standard apps, but you could use any html5/css/js tools or frameworks. Unless you use some of the newer (and higher security required) APIs, the apps also work in a normal web browser.

I wasn't able to stick around to see what people developed, but it was very interesting.

Last night I watched the live stream of Sir Tim Berners-Lee, the inventor of the World Wide Web, giving a public lecture in Wellington (I missed out on a ticket) on "The Open Internet and World Wide Web". He covered the many forms of openness and freedom, including open standards, open source software, open access, open data, and the open Internet. One key point from the lecture was that native apps (on iOS or Android, for example) take you off the Web, and therefore away from the core of social discourse. This is significant and is increasingly happening. I will tweet a link when the lecture is available to view online.

These events dovetail nicely and fits with my general strategy of focusing on web apps that work nicely on phones, tablets, and computers.

First Post

I've updated this site over the last few weeks to help manage it going forward.

The key piece is this blog, which I hope to update somewhat frequently. I haven't enabled comments, so reply by twitter or email.