Archive for the ‘Programming’ Category

A git post-receive that filters out already seen commits.

December 7th, 2010

The problem

We use the trac TimingAndEstimationPlugin to record a lot of our time, and git has lately been our preferred source control system. As we’ve been using more topic branches in git, this poses a bit of a headache: patches can show up multiple times.

We’ve traditionally had the post-receive hook only select patches that update the master branch. This mostly works, but sometimes ticket comments and hours don’t appear for days/weeks until that branch gets merged back. What I really want is for a patch to be posted onwards only when it is new to this repo. As such we can develop in a topic branch, push that topic up to the central repo so it is backed up and hours/comments are recorded for the project manager to see, and then merge it into the mainline at some point later and have all of this work as expected.

The only thing this doesn’t support is rebase… but we all know that’s dangerous already, right? [1]

git rev-list to the rescue

During post-receive we want to find commits that are now reachable from one of the refs, but were not previously reachable.

It turns out git-rev-list can give us exactly that, but crafting the call to it is a bit tricky. When you call git rev-list CommitA ^CommitB it will give you back the set difference, i.e. commits reachable from A that are not reachable from B. So we just need to ask git what’s reachable now, that wasn’t before this receive.
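To make the set difference concrete, here is a toy model in Python. It is only a sketch: the commit names and the `history` graph are made up for illustration, and the real work is of course done by git itself.

```python
def reachable(parents, tip):
    """Return every commit reachable from tip by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen


def rev_list_difference(parents, include, exclude):
    """Model of `git rev-list include ^exclude` as a plain set difference."""
    return reachable(parents, include) - reachable(parents, exclude)


# Hypothetical history: c1 <- c2 <- c3 (master) and c2 <- t1 <- t2 (topic).
history = {"c1": [], "c2": ["c1"], "c3": ["c2"], "t1": ["c2"], "t2": ["t1"]}

# Commits on the topic branch that master has not seen yet.
new_on_topic = rev_list_difference(history, "t2", "c3")
```

Here `new_on_topic` holds only the two topic commits; the shared ancestors `c1` and `c2` drop out because they are also reachable from `c3`.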

Algorithm

The post-receive runs after the database has been updated but you are given a log of what has been updated.

  1. Find the set, OriginalRefs, of refs reachable in the database now.
  2. For each ref that was updated: remove it from OriginalRefs, exclude its old value, and include its new value.
  3. Exclude all refs left in OriginalRefs.

There’s a bit of trickiness in watching for new or deleted branch names but that’s pretty much it. We feed these to rev-list over stdin and read the list of commits on stdout.
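The steps above can be sketched roughly as follows. This is not our actual hook, just a minimal illustration: the function names are invented, `current_refs` is assumed to be a name-to-sha mapping gathered elsewhere (for example from `git for-each-ref`), and the all-zero id assumes 40-character SHA-1 object names.

```python
import subprocess

# git reports an all-zero id when a ref is created or deleted
# (assumption: 40-character SHA-1 object names).
ZERO = "0" * 40


def rev_list_input(updates, current_refs):
    """Build the lines to feed to `git rev-list --stdin`.

    updates: (old, new, refname) tuples as read from post-receive stdin.
    current_refs: {refname: sha} for the repository as it is now.
    """
    untouched = dict(current_refs)          # step 1: OriginalRefs
    include, exclude = [], []
    for old, new, refname in updates:       # step 2
        untouched.pop(refname, None)
        if old != ZERO:                     # ref existed before this push
            exclude.append(old)
        if new != ZERO:                     # ref was not deleted
            include.append(new)
    exclude.extend(untouched.values())      # step 3
    if not include:                         # only deletions: nothing new
        return []
    return include + ["^" + sha for sha in exclude]


def new_commits(updates, current_refs):
    """Ask git which commits became reachable with this push."""
    lines = rev_list_input(updates, current_refs)
    if not lines:
        return []
    result = subprocess.run(
        ["git", "rev-list", "--stdin"],
        input="\n".join(lines) + "\n",
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()
```

A newly created branch has no old value to exclude, and a deleted branch has no new value to include, which is where the trickiness with new or deleted branch names shows up.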

We implemented it in python and run this regularly. Speed is pretty good even for repositories that have several hundred named refs in them (mostly publish tags); a couple seconds of overhead perhaps[2], but compared to the ssh push it doesn’t feel like the process is slower.

[1] The day after we deployed this, we ran into a case where someone had the rebase-on-pull flag set, which caused problems.
[2] This also includes time to load a trac environment and make a database call in there for every new commit we’ve found; this could be improved but is good enough. I suspect the time spent in git rev-list is even less.

Using a local emacs+tramp as your EDITOR on remote servers with SSH and emacsclient.

June 22nd, 2010

Scenario

I ssh into a bunch of different servers all the time. I would like to set my EDITOR variable on those servers to invoke emacs on my local machine using emacsclient.

This is complicated somewhat because my machine is behind NAT, so my host is not remotely addressable. If it were, I could tell emacs server to listen publicly, and maybe even get the server to use a consistent auth key and port.

The setup

  • A wrapper around ssh to send the configuration to the remote server whenever I start a session there. This script will also add the remote port forward so that emacsclient on the remote host can get back to my machine.
  • A script on the remote hosts to invoke emacsclient with the right parameters.
  • export EDITOR

The Details

ssh wrapper

Stick this in a script on the local machine and invoke it instead of ssh.

editor script

Put this on the remote machines, marked executable, and export the EDITOR variable to point at it:

EDIT (6/23/10 10:40AM EST): Updated the editor script to include some error checking.

A reverse proxy that caches and deflates.

August 8th, 2008

One of the projects we’ve been working on recently has a lot of dynamic, data-driven content that doesn’t change much. Knowing this, we made sure the backend was setting cache control headers appropriately and then tried to turn caching on in apache using mod_disk_cache. So far so good. Pages were speedier; cpu load was lighter; we were happy. Then tritchey started insulting us.

I didn’t take it personally; I already knew these were good ideas and I saw this as an opportunity. Surely we could match his feat: I should only need to add one line to the config and enable a module. Ah, the foolishness of youth.

I turned on mod_deflate, cleared the cache, and hit refresh. Hrm, nothing happened. Firebug[1] reported that the page was definitely being delivered zipped, but the cache was never being filled. It would successfully zip and cache the static javascript and css files, but the html coming through the reverse proxy was never put in the disk cache and required a re-request to the backend every time.

I saw some references to AddOutputFilterByType not working with reverse proxies because it couldn’t correctly identify the content type, but that seemed to have been fixed and deprecated at the same time. Suggestions were to use mod_filter instead, but the docs on mod_deflate still pointed to the old style and there weren’t many good examples of this type of setup with mod_filter (good example coming below). Someone else claimed that apache was stripping off the cache control headers the backend needed, but my logging showed that wasn’t my problem. I’d largely given up and just left it at the caching without the deflate, since that gave me better performance.

At the same time we were trying to bring up a new server for this to be hosted on, using Ubuntu JEOS (8.04) rather than the Debian Etch we had been using. Once we got everything deployed on the new box, I decided to give the caching another try. It works great!  The difference for us seems to primarily reside somewhere between the Apache 2.2.3 we had been using on Debian Etch and the 2.2.8 that ships with Hardy Heron. To me this somewhat justified our decision to give Ubuntu a shot over Debian. I’ve been a fan of Debian for a while but have found quite a few cases where I hit a bug that had been fixed in a newer version that wasn’t in stable. Backports were sometimes available; more frequently the fix would only be available in testing, which through libc6 would require everything to be upgraded at once.

Here’s an example of the config:

    # we are actually using mod_rewrite to implement it
    RewriteEngine On
    # don't be a proxy, just allow the reverse proxy
    ProxyRequests Off

    #see if a static file exists in the webroot first and serve it from there.
    RewriteCond /var/www/gainesville-green.com/current/www/%{REQUEST_FILENAME} -f
    RewriteRule ^(.+) /var/www/gainesville-green.com/current/www/$1 [L]

    #if not forward it to the lisp process listening locally
    RewriteRule ^/(.*)$ http://127.0.0.1:3434/$1 [P]

    #set up caching, enabled for the entire site.
    CacheRoot /var/cache/apache2/mod_disk_cache/gainesville-green.com
    CacheEnable disk /

    #Declare a filter named gzipping
    #The 2nd parameter is type of filter. I believe this is saying
    # that the filter operates on the content body, as opposed to
    # the url or some other part.
    FilterDeclare gzipping CONTENT_SET
    #in filter gzipping use deflate when content type equals text/html
    FilterProvider gzipping deflate Content-Type text/html
    FilterProvider gzipping deflate Content-Type text/css
    #'$' here is substring match, match both text/javascript application/x-javascript
    FilterProvider gzipping deflate Content-Type $javascript
    #insert the filter into the chain, by default at the end.
    FilterChain gzipping

[1] I’m using the beta, which has a lot of nice improvements to an already great extension. Also, Clear Cache Button is a nice Firefox extension that aided in the testing here.

mod_ldap LDAPVerifyServerCert simple bind failed

August 8th, 2008

We’ve been working for a long time to resolve an error in our ldap setup. Whenever we tried to use the LDAPVerifyServerCert option to verify that the ldap server we were talking to was the correct one, it didn’t work. It always failed with the unhelpful error:

[LDAP: ldap_simple_bind_s() failed][Can't contact LDAP server]

We had set the appropriate CA cert with the LDAPTrustedGlobalCert option. We could use openssl s_client to verify the certificate chain.  We couldn’t figure out why it didn’t work; it was always just simple bind failed.

I finally found it today: the certificate file needs to be readable by others. Apparently the apache process reads that file separately from the rest of the config. SSL certificates for mod_ssl appear to be fine with only root having read access, but not the LDAPTrustedGlobalCert.  It would’ve been nice if the log message had said something like “Permission denied reading …” or even just “Couldn’t read certificate.”  Unfortunately it falls back to the most generic error there.
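A quick way to check for this condition is to look at the "other" read bit on the file. The snippet below is just a sketch in Python; the function name is made up, and it only inspects the file itself, not whether the directories leading to it can be traversed.

```python
import os
import stat


def readable_by_others(path):
    """True if the file's mode grants read permission to 'others'."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)
```

Calling `readable_by_others()` on the file named in LDAPTrustedGlobalCert should return True; if it returns False, something like `chmod o+r` on that file is the likely fix.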

Learning Common Lisp

September 20th, 2007

Peter Seibel wrote a Common Lisp tutorial that is the best introduction for Common Lisp that I’ve seen. I’ve poked around at a few different books, but that one was the one I actually learned Common Lisp from. It is clearly written giving you what you need to know without getting bogged down. I also find myself revisiting it as a reference from time to time even though I now use the Common Lisp Hyperspec much more. Common Lisp rocks.

Even if you aren’t looking to learn Common Lisp, I would recommend Object Reorientation: Generic Functions for a non-Java-like look at how object orientation can be done nicely. Multimethods rock.