
Presentations, May 2010 and July 2012, on Queueing

Just a quick note to point out a couple of presentations on queueing. I’ve recently given the second (which admittedly has a lot in common with the original, and not just the web-based slides).

Either way, you are welcome to view them online – the slides, their original HTML source, and some example code are all at http://github.com/alister


Recently….

It’s been one of those quiet spots around here for a while, so here’s the catch-up on what has been happening while I was not posting.

I’ve recently finished a short contract with an agency, Transform (part of the Engine group), working with a couple of government departments. The Office of the Public Guardian receives, checks and stores Lasting Powers of Attorney – a legal document you write while still mentally competent, to say what you would like to happen should the worst occur, and who you want to act for you. The simpler cases aren’t actually very complicated, but there is a lot of work to get the form completed – and the same information can have to be written out in triplicate across two or three different forms.

The project was to work with (and at the offices of) the new Government Digital Service (GDS), who are building http://Gov.uk, and I helped write the first-draft (but otherwise basically complete) prototype to put the form online. If nothing else, it allows someone to step through and only have to enter each piece of information once. One other developer and I, with a project manager and many others from the OPG and GDS, took nearly all of the 37 pages of duplicated paper forms and created a PHP/Zend_Form-based system that, in the end, produced PDFs ready to be checked and then signed by everyone involved.

It was an interesting project – and it will be a valuable service, making the forms easier to fill in and, eventually, easier to process from the back office too. It’s not quite what I would normally do – I’m far more infrastructure and back-end oriented, and not so used to building a large, complex flowing form – so I elected to move on at the end of the prototype rather than continue with the alpha/beta phases. With a little luck, it will go live later this year, after some extensive user-testing to make this important legal document as useful and easy as possible to fill in. The end result, in a few years, should be many times more people making an LPA for themselves, often as part of something as routine as making a will or buying a new house.

Rest assured, there’s been plenty of relaxing since I finished that project a couple of weeks ago (especially since I spent a good portion of the time working while distinctly under the weather).

Puppet and Github

In the last couple of weeks, I’ve been a software-developing machine. I’ve also been looking for my next contract role, hopefully something starting just after Easter – though, as I write this, exactly what that will be is still up in the air.

First, I’ve been putting together a development VM – currently based on a beta of Ubuntu 12.04. There are a few things that go into making it.

Puppet config https://github.com/alister/puppet-ab
I’ve been occasionally tweaking this since before Christmas, when I took some time to do a deeper dive into Puppet after using it last year. Many of the modules I use are actually pulled in from other GitHub-based projects, especially a number by https://github.com/saz (Steffen Zieger).

Puppet-dropbox: (forked from https://github.com/cwarden/puppet-dropbox) Dropbox is a very useful tool on any desktop for copying files around your own machines, and shared folders give easy access to others working on the same project – I found it nearly invaluable in my last contract. It was also good to be able to improve the upstream code – a small fix to allow Ubuntu to also be set as a target for the installation.

In the end, I have elected to use it to install just the basic command-line tool (rather than the full client), which can then be used to install the main client if required. That saves having to store the username and password in the repository, and it is also better from a security standpoint not to have copies of your files on every machine where the Puppet manifests might be run.

Dotfiles is more of a meta-project – many people have a repo by that name, and a large number of them are hand-rolled. I forked one of the more common bases, by Ryan Bates (https://github.com/ryanb/dotfiles), who also runs the excellent http://railscasts.com/ (which is not all about Ruby on Rails). I have yet to find a good way to integrate it with another shell-oriented project, Oh-My-Zsh (https://github.com/robbyrussell/oh-my-zsh), which is an excellent improvement over the standard Bash shell I had been using for more than a dozen years.

The common thread between both of these is to take a basic machine with Git, an SSH key and Puppet installed, and bring it quickly up to a full-spec development desktop/server. It’s a continuing project, but a valuable one, and not just as a learning tool.

There are two other projects that I’ve been working on.

guard-puppet-lint: Guard (see the Railscasts episode http://railscasts.com/episodes/264-guard) is a Ruby-based project that watches for file changes in a directory hierarchy. There are a lot of plug-ins for it (https://rubygems.org/search?utf8=%E2%9C%93&query=guard-), including PHP-oriented ones for PHPUnit and PHP_CodeSniffer. The project itself can be downloaded from https://rubygems.org/gems/guard-puppet-lint.

As the name suggests, this small Ruby gem adds a slightly easier way to run puppet-lint through Guard. As my first released Ruby code there’s not much to it – in fact, it’s really just a hack of guard-shell that runs puppet-lint on the changed manifests. It does make things slightly cleaner though, so I’m happy enough. I’m also very pleased to have had a couple of (very minor) issues raised – literally one word missing from the readme file, and a single character to reduce the number of false-positive files that might be processed. There are some ideas I can add to make it even more useful, but that can wait for a little while; besides, I have to figure out how to better use Guard in the first place to be able to do so.
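A Guardfile entry for it would look something like this – a sketch based on the usual Guard plug-in conventions, so check the gem’s readme for the exact name to register:

```ruby
# Watch all Puppet manifests in the project and run puppet-lint
# against any that change.
guard 'puppet-lint' do
  watch(%r{.*\.pp$})
end
```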

My final, and latest, project is https://github.com/alister/QR-Generator-PHP – a refactoring of the QR code library at https://github.com/edent/QR-Generator-PHP.

The code itself works fine, but only as a URL destination. One of the ideas that came up while working with the Office of the Public Guardian on their new LPA form was to put QR codes onto the final output PDF pages, to help automatically verify that all the pages produced have been received at the back office – and to link the paper form to a digital version stored in the database.

It’s a classic refactoring though – taking a piece of code and, without changing the end results, making it possible to use in a slightly different, but useful, context. Eventually, the qr.php webpage would be a thin wrapper around the class – and the class itself could be used from backend code to, for example, generate an image that can be placed into a PDF.
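As a sketch of the pattern only – the class and method names below are hypothetical, not the actual API of QR-Generator-PHP:

```php
<?php
// Hypothetical shape of the refactored library: the procedural body of
// qr.php moves into a class, and the page becomes a thin wrapper.
class QrCode
{
    private $text;
    private $size;

    public function __construct($text, $size = 4)
    {
        $this->text = $text;
        $this->size = $size;
    }

    // The real class would build the QR matrix and return PNG image
    // data; it is stubbed here so the wrapper shape stays clear.
    public function toPng()
    {
        return sprintf('[png of %s at size %d]', $this->text, $this->size);
    }
}

// qr.php then just reads request parameters and emits the image, while
// backend code can reuse the same class, e.g. for an image in a PDF:
$qr  = new QrCode('lpa-form:1234');
$png = $qr->toPng();
```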


Hire quickly: Addendum, recruiters

Recruiters: here are the rules.

  1. The first recruiter to tell me the company name, and then send the job-spec gets to forward my details – if I think it’s interesting.
  2. No company name, or spec, no chance
  3. If you send my details without my OK, you lose. And I tell the company you are a loser (chances are, they are too).
  4. Sending me all the information I could want to know is good – but doing it on spec (and probably en masse as well) does not mean you get to claim the bounty.
  5. Let us know what is happening. That especially includes feedback from the employers.

All of the above have happened to me.

About that last point on feedback? There’s one recruiter on my list (it’s not a good list) because he didn’t bother telling me what the employer was saying about me – a different recruiter found out and let me know. The second one still has a chance to place me; the first, not so much. Ironically, the comments were about some rants I posted on my LinkedIn profile. It’s also a potential employer I no longer care about working for.

The strangest story happened to me about 15 years ago. I was working with a small recruiter, spending a couple of days tweaking a CV so that it was just perfect for a potential job (this was when I still wrote CVs; this year, it’s all on websites to read, not MS Word documents). Then I got a phone call from one of the largest recruitment companies in town – they had sent my CV (without my knowledge) and the employer was interested in talking to me. WTF? That was so not good. It was even worse for the small recruiter though – it turns out he knew the rogue recruiter. He was married to her.

Finally, when you do contact a candidate with a potential role, make sure you send them your details – and the job’s. Just a quick email with the what and the where. Without it, they will not know how to get back to you. I know you love to talk on the phone (and it avoids that pesky audit-trail), and you might make wonderful notes in your recruitment system for yourself, but when we developers are looking, we can get a dozen phone calls a day from different recruiters – quite likely on our mobile phones to boot. Most of the time we don’t get the chance to write it all down, so you should, and drop us a note about it. Otherwise we can’t get back to you, even if it was interesting. So it’s to your own benefit to keep us in the loop.


Hire quickly, because your competitors will.

If you aren’t taking hiring seriously, other people can – and do – hire the people you need.

I’ve been guilty of it before – leaving it a couple of days, or even a week, before getting back to someone who sent in their CV. Of course, most of the time it didn’t matter: the person wasn’t going to get hired because they were just not good enough (the generally poor quality of developers is a different rant).

A couple of times I have been bitten hard when hiring though, such as being introduced to a sysadmin on a Thursday night, following up late Friday afternoon and finding out on Monday when I chased him up, that he had just accepted an offer.

So, what to do? Well, to be honest, all you can do is be swift about things. Check all CVs that come in within a couple of hours at most; for those that show promise, get back to them and arrange the next step as quickly as you can (probably a quick chat on the phone?), and pencil in a potential time – in your own calendar, if not theirs – to sit down with them properly.

Please though, after you’ve had the interview, get back to them quickly. Occasionally I’ll have left them with a little thing to do (some code to write, or something to get back to me on); it’s a good idea to drop a quick email confirming that after they step out the door. A couple of times when I was looking for a new job, I emailed back that afternoon, or before lunchtime the following day, to follow up with some code. Both times I was starting that role inside two weeks.

When I’ve been interviewing, I’ve even offered someone a job before they left the interview. It was obvious that the guy was a good developer – just searching for him online found a number of posts he’d made to relevant mailing lists. A few years later I’d moved on myself and he was freelancing, so on my suggestion he was interviewed again, and promptly hired again.

There has been a cut-throat market for developers for the last few years, and that’s not likely to change. Really good people will always have a choice if they want it. You, as an employer, need to be worth working for:

  • Interesting project(s)
  • Enough money for that not to be an issue – though salaries for the best devs are rising fast
  • Working conditions that don’t get in the way

A future post will touch more on my ‘perfect wish-list’ of working environments.


Jailbreaking your Kindle, and putting new ‘screensaver’ images

A quick fun post for those of you with an Amazon Kindle – some instructions on how to a) jailbreak your reader (trivially easy), and then b) put your own wallpapers on there, so you get a more interesting ‘screensaver’.

It’s really easy – no more than 20 minutes and a couple of reboots/software updates. Most of the time is literally waiting for the reader to restart after you’ve placed a file in its base directory.

All of the instructions are here: http://wiki.mobileread.com/wiki/Kindle_Screen_Saver_Hack_for_all_2.x_and_3.x_Kindles

A few notes to make it easier to understand:

  1. The current version is around 3.3. You can double-check that you have this (or later) by pressing ‘Menu’ and going down to ‘Settings’. The version is on the bottom line: “Page 1 of 3 Version Kindle 3.3 (61680021)”
  2. You will likely need the version marked “*-3.2.1_install.bin”
  3. Once the jailbreak has been installed, you can install the screensaver hack
  4. The zip file has a ‘src’ folder; you’ll need to copy the src/linkss/ folder to your Kindle as /linkss/
  5. Finally, you can add suitable wallpapers into the /linkss/screensavers/ directory. They should be greyscale .png or .jpg files

(Image: a Kindle showing a “screensaver” that reads “Non functional demo unit”.)

A quick search for “kindle wallpaper” or “kindle screensaver”, especially if you restrict it to 600×800 pixel images, will bring up a lot of possibilities. On the right is one I made myself (I’d heard of this particular joke wallpaper before), which you are welcome to use.


Booze at tech meetups

I was out last night at The Big Xmas Bash, near Silicon Roundabout. It was a fun night out, meeting various people – tech, business and recruiters. Oh, the shame though – I was wearing the same T-shirt as someone else – and yes, I have indeed replaced people with small shell scripts.

Now, to the main part of what this post is about – the rant. It’s not aimed at last night’s event alone though; it’s about alcohol at tech meetups in general. Look, guys: you generally end up buying too much anyway, and all too often it’s also to the exclusion of those who may prefer not to get inebriated.

As an example, the Hacker News meetups will get dozens of pizzas (which are, admittedly, all eaten – there are usually 150+ people attending), but also a couple of stacks of trays of cans and bottles, 24 cans to a tray – several hundred cans at least. It’s just as well they aren’t all drunk on the night; many of the event-goers would be unconscious by the end. At least they also add a few trays of soft drinks, lemonade and cola.

If you want a couple of drinks to help lubricate the social aspect of an evening out, I’ve got no problem with that at all. I don’t drink much myself though – I prefer to save my brain cells for doing interesting things, like, oh, writing code?

For other events, how about adding some more soft drinks to replace some of the alcohol? Last night, the choice was booze or fizzy water – that was all.

Thankfully, people don’t generally get blotto at the various meetups – at least not that I’ve seen – but I expect there have been one or two who have swerved their way home.

Do you have a comment about alcohol being served at the various meetups? Would you like more, less, or do you think that organisers and sponsors are getting it right? I would love to start a conversation here about the good and bad of it.


Deployment with Capistrano – the Gotchas

Capistrano makes deployment of code easy. If you need to run a number of additional steps as well, then the fact that they can be scripted and run automatically is a huge win.

If you’ve only got a single machine (or maybe two), then you could certainly write your own quite simple, entirely workable system – I described something just like that in a previous post: “SVN checkouts vs exports for live versions”. That was written and used before I was deploying to multiple machines, however, and had to be run from the command line of the machine itself. It was OK even when I had a couple of machines to deploy to – I just opened an SSH session to both and ran the command on each at the same time. When I attended the London Devops roundtable on deployment I even advocated that as a valid deployment mechanism. But at the same time as I was saying that (and it’s in the video), I was also writing Chef cookbooks and a Capistrano script to build, and then deploy, code to at least four different machines at once.

A number of people have already written about how to set up Capistrano to deploy PHP code. I’ll not repeat their work; instead, I’ll tell you about some of the problems you might come across afterwards.

cap shell is a wonderful thing, until it bites you

The Capistrano shell will let you run a simple command, or an internal task, on one or as many machines as you want. This can be useful when you are trying things out – and if you are in any way unsure where a command can be run, you can practice it. Just do:

cap> with web uptime
cap> on host.example.com uptime

Those two commands just show how long a machine has been up, and the current load average. Easy and safe – and as they run, they show the list of machines they succeed on.

There are some other useful commands you can try:

## show the currently live REVISION file on each machine
cap> cat /mnt/html/deployed/current/REVISION
## This file is created as each new/updated checkout is done.
## change your path to the ./current/ path as appropriate

Since you should be deploying the same codebase to all your live machines at once (or staging, or QA/test), the versions (or Git SHA-1s) should be the same as well.

Finally, in the ‘useful’ list is cap deploy:cleanup – this will remove old deployments. Keeping a few around is useful, but they can take up a lot of space. As cap --explain deploy:cleanup says:

Clean up old releases. By default, the last 5 releases are kept on each server (though you can change this with the keep_releases variable). All other deployed revisions are removed from the servers. By default, this will use sudo to clean up the old releases, but if sudo is not available for your environment, set the :use_sudo variable to false instead.

If you want to change the default to something other than 5, that can be set with the line “set :keep_releases, 10” in deploy.rb.

A few gotchas

When cap shell checks the source repo version

I’ve found that the latest version available in the main source-code repository is apparently only checked when the Capistrano shell is first run. This can be useful if you want to deploy to a limited set of machines, run a test, and then deploy to all the machines (you end up with the same version checked out into the same-named ‘releases/’ directory). But if you are sitting at the cap> prompt doing multiple !deploy commands, you won’t get new versions of code that have since been committed to the repository. Exit the shell, and re-run it, to solve this.

You checked out a new version, but you can’t see it

Be wary if you are logged into the machine and sitting somewhere inside the ./current/ directory. Because the symlink is changed underneath you to point at a new directory (the newest subdirectory in releases/), if you do not do a cd . to refresh your location within the real directory tree, you will still be in an old copy of the code. The cd makes sure you are in the latest place on disk, via the (now changed) symlink.
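You can demonstrate the effect with a throwaway directory tree (the r1/r2 names are illustrative):

```shell
#!/bin/sh
# Demonstrate why you need `cd .` after a deploy retargets the symlink.
base=$(mktemp -d)
mkdir -p "$base/releases/r1" "$base/releases/r2"
ln -s releases/r1 "$base/current"

cd "$base/current"
before=$(pwd -P)                    # .../releases/r1

ln -sfn releases/r2 "$base/current" # a new deploy retargets 'current'
stale=$(pwd -P)                     # still .../releases/r1 - old code!

cd .                                # re-resolve through the symlink
after=$(pwd -P)                     # now .../releases/r2

cd /
rm -rf "$base"
```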

Rolling back

Capistrano has the ability to remove the currently live version and change the ‘current’ symlink back to the previous location. Should the worst happen and a website deployment fail, this can help when ‘rolling forward’ – a fast fix, check-in and redeploy – may not be easily possible.

# to roll back to a previous deployment:
cap> !deploy:rollback

If you have rolled back the webservers (PHP/app servers), you will have to restart PHP-FPM (or maybe Apache) on them, as they do not necessarily pick up the (old) version of code that is now being run. The same would also be true if you have set APC to cache the byte-code and not check the time-stamps of files in case they change; I’ve found that PHP-FPM also has this issue.


Back from the coalface

I’ve been pretty busy for the last couple of years – first at Binweevils and, in 2011, PeerIndex – hence the utter lack of posts. But, as the note on my personal CV site says, I’m taking some time off before looking for my next role. That gives me the opportunity to write more about PHP scaling, and about the development tools I’ve been using over the last couple of years that have been piquing my curiosity.

So, it is my plan to investigate other languages such as Python and Ruby, and tools like Puppet and Node.js. Rest assured, I’ll keep up with the state of the art in PHP, and such technologies as MongoDB, though!

There are also a number of planned posts right here: more on Beanstalkd (and other queues), deployment with Capistrano, graphing and logging (including how to mark a Capistrano deployment on a graph!) and a few other things, including rants.


Doing the work elsewhere – Adding a job to the queue

I’ve previously shown you why you may want to put some tasks through a queueing system, what sort of jobs you could define, plus how to keep a worker process running for as long as you would like (while still being mindful of problems that happen).

In this post, I’ll show you how to put the messages into the queue, and we’ll also make a start on reading them back out.

For PHP, there are two BeanstalkD client libraries available.

Although I’ve previously used the first in live code, I prefer the second, Pheanstalk, for this article. It is more regularly worked on and uses object orientation to the fullest; plus, it’s got a test suite (based on SimpleTest, which is included in the download).

Using it, according to its example, is simple.

The ‘pheanstalk_init.php’ file adds an autoloader, though you may find it advantageous to move the main class-file hierarchy out of the download and into its own directory, so that an existing auto-loader (for example, the Zend Framework one) can find it.

The object orientation lends itself well to an (optional) ‘fluent’ programming style, where each call returns the object so further calls can be chained:
$pheanstalk->useTube('testtube')->put("job payload goes here\n");

So putting simple data into the queue is, well, simple (as it should be). There are advantages in wrapping this simplicity in our own class though. Some examples:

  • We want to put the same job into the queue multiple times – for example, a call to check some data in 1, 10 and 20 seconds’ time.
  • Adding a new default priority – or, with multiple classes, a small range of defaults.
  • Adding other (meta) information about the job, such as when it was queued and how important it is. Some tasks might be urgent but not important – i.e. run them now if we have the opportunity, but they don’t have to be run at all.

Each may be simple enough to handle with a simple loop, but it can be advantageous to push that down into a class – especially for the final idea.
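The first idea – queueing the same check at 1, 10 and 20 seconds – can be sketched like this. FakeQueue here is a stand-in for Pheanstalk, whose put() takes data, priority, delay and TTR; the class and payload names are illustrative:

```php
<?php
// Stand-in for Pheanstalk so the sketch is self-contained; the real
// object would talk to beanstalkd over a socket.
class FakeQueue
{
    public $jobs = array();

    public function put($data, $priority = 1024, $delay = 0, $ttr = 60)
    {
        $this->jobs[] = array('data' => $data, 'delay' => $delay);
    }
}

$queue = new FakeQueue();

// Queue the same check three times, at increasing delays.
foreach (array(1, 10, 20) as $delay) {
    $queue->put('check-data:1234', 1024, $delay);
}
// $queue->jobs now holds three copies, delayed 1, 10 and 20 seconds.
```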

How to store the meta-information, then? It should be a text-friendly but concise format, and quick to parse. Here, JSON (or the related YAML) fits the bill quite nicely.

Processing it at the other end, after it has been fetched by the worker, is a simple matter of running json_decode() and extracting the ['task'] element from the results before running it.
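A minimal sketch of such a wrapper – the key and task names are my own illustrations, not from the original code. The string produced is what would be put() onto, and later reserved from, the tube:

```php
<?php
// Wrap a task plus meta-information (when queued, how important) in
// a single JSON string.
function makeJob($task, array $params = array(), $priority = 1024)
{
    return json_encode(array(
        'task'     => $task,
        'params'   => $params,
        'queued'   => time(),
        'priority' => $priority,
    ));
}

// Producer side: this payload is what goes onto the queue.
$payload = makeJob('email.send', array('to' => 'someone@example.com'));

// Worker side: decode, pull out the task name, then dispatch on it.
$job  = json_decode($payload, true);
$task = $job['task'];   // 'email.send'
```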


Doing the work elsewhere – Sidebar – running the worker

I’m taking a slight diversion now, to show you how the main worker process runs. There are two parts to it – the actual worker, written in PHP, and the script that keeps running it.

For testing the return from the worker, we’ll just return a random number. To avoid the normally used exit values, I’ve picked a few numbers for our controls, up around the 100 range. By default, a die() or exit will return 0, so we can’t act on that – though we will use it as a fall-back, treated as a generic error. Ideally we won’t get one: instead, we want the code in the workers to run as planned and then have the worker execute a planned restart, which we restart immediately. We may also want the worker process to specifically stop, so we’ll have an exit code for that too. For any codes we don’t understand, we’ll slow the system down with a sleep() to avoid a runaway restart loop.
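A sketch of the control values – the exact numbers are assumptions, as the post only says they are “up around the 100 range”:

```php
<?php
// Illustrative exit codes for the worker (assumed values).
define('WORKER_RESTART', 100);  // planned restart: restart immediately
define('WORKER_STOP',    101);  // stop the worker loop entirely

// Map an exit code to what the controlling script should do. Any
// unknown code - including the 0 from a bare die()/exit - is a
// generic error: pause, then restart.
function actionForExitCode($code)
{
    switch ($code) {
        case WORKER_RESTART: return 'restart';
        case WORKER_STOP:    return 'stop';
        default:             return 'pause-then-restart';
    }
}

// For testing, the worker itself would just pick one at random, e.g.:
//   exit(array_rand(array_flip(array(WORKER_RESTART, WORKER_STOP, 0))));
```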

The actual script that is run from the command line is a pretty simple Bash script – all it has to do is loop until it gets a particular exit value back.


So, if it’s an exit value we know, we either:

  1. pause, then restart
  2. immediately restart
  3. exit the loop.

If it’s any other value, we pause and restart.

The Bash command exec $0 $@ will re-run the current script ($0) with the original arguments ($@) – but exec replaces the current process with the specified command. Normally, when the shell encounters a command, it forks off a child process to actually execute it; using the exec builtin, the shell does not fork, and the exec’ed command replaces the shell.
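The loop described above can be sketched like this, with the worker stubbed out so the script is self-contained (the exit codes and file names are illustrative):

```shell
#!/bin/sh
# Sketch of a runBeanstalkd-worker.sh restart loop.
# Codes (assumed): 100 = restart now, 101 = stop, anything else =
# pause then restart. The real run_worker would be: php worker.php
runs=0
run_worker() {
    runs=$((runs + 1))
    # Stub: pretend the worker asked for two restarts, then a stop.
    [ "$runs" -ge 3 ] && return 101 || return 100
}

while true; do
    run_worker
    case $? in
        100) continue ;;   # planned restart: straight back round
        101) break ;;      # worker asked to stop: exit the loop
        *)   sleep 1 ;;    # unknown code: pause, then restart
    esac
done
echo "worker ran $runs times"
```

The post’s version restarts with exec $0 $@ rather than a plain while loop, but the exit-code handling is the same idea.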

Save both the PHP and Bash scripts, and then you can start things with ‘sh runBeanstalkd-worker.sh’. Run it a few times to see a lot of (deliberate) errors that cause the Bash script to pause before restarting, restart immediately, and finally exit.

With this Bash script in place, we can now run the worker as many times as we need – and it will keep running until we specifically tell it to exit. Just as usefully, we can exit the PHP worker and have it execute a planned restart – clearing any memory or resource overhead the script may have accumulated.

Next time, we’ll put some simple tasks into the queue.

© PHP Scaling