Nuclear Trains

Keir and I are beginning to roll out “Nuclear Trains”, a project we’ve been working on, very on and off, for almost a year. It uses data from Real Time Trains to tweet whenever a train carrying nuclear waste sets off from a power plant, passes through a station (we chose which stations were worth tweeting about) or arrives at its destination.

I got the idea after discovering Real Time Trains. Whilst waiting on the platform one evening, I was using RTT to look up the freight trains that passed through Highbury & Islington. I noticed that there was a slot in the timetable for a train from a place called Sizewell C.E.G.B. to a place called Crewe Coal Sidings (DRS). I immediately recognised the name Sizewell as the power plant (with CEGB standing for Central Electricity Generating Board) and the letters DRS as standing for Direct Rail Services – the train operating company owned by the Nuclear Decommissioning Authority. A quick search of the freight trains that went to and from Crewe Coal Sidings (DRS) showed the network of nuclear waste trains in this country. Keir put together a script (available on GitHub) that checks Real Time Trains every morning and sees which trains are running that day. It then tweets when they’re timetabled to pass through a station.

So far the friends we’ve shown it to are supportive. We’re a bit nervous about being accused of being terrorists or something, so I wrote a reasonably comprehensive FAQ, which refers back to my old Tycho’s Nose blog post about Operation Smash Hit.

The output of the script looks a lot like this:

Which I think is rather neat. We’ve even coded a special tweet for the trains from Wylfa power plant that pass through Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch, so that we can fit the station name in.
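To give a feel for what a “special tweet” for a very long station name might involve, here is a minimal sketch. It is not Keir’s script: the function name, the tweet wording and the 140-character limit (which I’m assuming was the cap at the time) are all illustrative assumptions.

```python
# Assumed limit: tweets are taken to be capped at 140 characters here.
TWEET_LIMIT = 140

# Hypothetical wordings, longest first. Neither is the bot's real tweet text.
TEMPLATES = [
    "A nuclear waste train from {o} is due to pass through {s} at {t} on its way to {d}.",
    "Nuclear train from {o} due through {s} at {t}.",
]


def compose_tweet(origin, station, time_str, destination):
    """Return the first (longest) wording that fits within TWEET_LIMIT."""
    for template in TEMPLATES:
        tweet = template.format(o=origin, s=station, t=time_str, d=destination)
        if len(tweet) <= TWEET_LIMIT:
            return tweet
    raise ValueError("No wording fits in a tweet for station: " + station)


print(compose_tweet(
    "Wylfa",
    "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch",
    "14:32",
    "Crewe Coal Sidings (DRS)",
))
```

With the long Welsh station name the first wording is too long, so the sketch falls back to the shorter one. Ordinary station names get the full wording.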

In terms of putting it out there, I’ll show it to friends who have big Twitter followings and see if they want to tweet about it. I’ve also wondered about emailing a link to Gizmodo/Motherboard because it would be neat if it got some coverage. 99.9% of the work has been done by Keir, so it would be good for him to get some recognition for it, and it would also be cool if people who know how to do things like make interactive, real-time updating maps could get involved.

What I’m looking forward to most is when Sizewell B next decides to ship some fuel to Sellafield and a train passes through Highbury & Islington, so I can go and watch it.

SelenaBot Boxing Day update

Merry Christmas.


Well, on Christmas Eve, the day after I wrote and blogged about SelenaBot, I tried to run it again and got a bunch of “connection error” exceptions. Having run this by some friends, we suspect that Facebook’s servers detected that Instagram was getting scraped, got angry about it and started blocking me.

Changing my IP address – by coming back to my parents’ house for Christmas – seems to have worked: it runs smoothly again, although much more slowly, as 1Gbps internet isn’t a thing in the countryside. I’m hoping that a more subtle scraper that I’m working on, which will only download images taken in the previous 24 hours, will be much kinder on the Instagram server and not get blocked. Otherwise I might have to teach my code how to change its IP address midway through operation…

Evening update:

Well, things have gone very successfully, and SelenaBot will now only download files that were taken on the calendar day that SelenaBot is run. I need to alter this so it actually covers the preceding 24 hours, but I’m not sure how to do that yet. This means that SelenaBot only scrapes around 150 pictures as opposed to 1,200, and the requests come in at more random time intervals (as opposed to one after the other). So far this hasn’t resulted in any blocks. However, all my scraping at this IP address has been done within one 24-hour period, so it will be interesting to see if I suddenly start getting “server no reply” exceptions tomorrow (i.e. does Facebook analyse all its traffic once every 24 hours and then block offending requesters from the next day’s traffic?).
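One possible way to get a rolling 24-hour window, rather than “today’s calendar day”, is to compare each picture’s Unix timestamp against a cutoff of “now minus 86,400 seconds”. The sketch below assumes each picture is a dictionary with its timestamp under a key called "date"; that key name, the function names and the pause lengths are assumptions for illustration, not taken from SelenaBot.

```python
import random
import time

DAY_SECONDS = 24 * 60 * 60


def taken_in_last_24_hours(nodes, now=None):
    """Keep only the pictures whose timestamp falls in the preceding 24 hours.

    Assumes each node is a dict with a Unix timestamp (in seconds) under
    the hypothetical key "date".
    """
    if now is None:
        now = time.time()
    cutoff = now - DAY_SECONDS
    return [node for node in nodes if cutoff <= node["date"] <= now]


def polite_pause(min_seconds=1.0, max_seconds=5.0):
    """Sleep for a random length of time so requests are not evenly spaced."""
    delay = random.uniform(min_seconds, max_seconds)
    time.sleep(delay)
    return delay
```

Calling polite_pause() between downloads is one simple way to space requests out at irregular intervals. Whether that is enough to avoid being blocked is not something this sketch can promise.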

Another interesting thing that happened between this morning and this afternoon is that the account “repostapp” suddenly disappeared in the middle of the day (as in, it worked this morning and stopped working this afternoon), which derailed the whole of SelenaBot, in much the same way Justin Bieber did the first time I ran the code. To stop this from happening in the future, as and when other accounts get deleted or taken down, I’ve taught myself exception handling, so now when an account disappears the code can carry on running.
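The general shape of that kind of exception handling is a try/except inside the loop over accounts, so one failure skips a single account rather than stopping the run. This is a sketch of the pattern only: download_all and the download_account argument are hypothetical names, and catching the broad Exception class is a simplification because I don’t know which exception a vanished account actually raises.

```python
def download_all(usernames, download_account):
    """Try every account in turn and return the list of accounts that failed.

    download_account is whatever function fetches one account's pictures
    (a hypothetical stand-in here).
    """
    failed = []
    for username in usernames:
        try:
            download_account(username)
        except Exception as error:  # broad on purpose: the real error type is unknown
            print("Skipping {}: {!r}".format(username, error))
            failed.append(username)
    return failed
```

Returning the failed usernames makes it easy to see afterwards which accounts have gone missing, instead of the errors being silently swallowed.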


One of my New Year’s resolutions for 2016 was to learn to code a bit, and with just over a week to go I feel like I’ve been able to produce an idea for a piece of code, structure it, and get it to work. From scratch. By myself.

I started learning Python using Automate the Boring Stuff, a website I found so useful I bought the book to say thank you to the author. I’ve learnt all the basics from section one and started coming up with exercises myself to practise with, which is how SelenaBot came about.

I’ve had an idea for a Science Engagement stunt/thing (which, if it works, I’ll devote future blog posts to as it develops) that requires access to lots of Instagram posts, preferably by celebrities. So I’ve put together a piece of code that automatically accesses celebrity Instagrams and downloads their most recent 12 pictures, and I’m going to talk us through it here.

To start with, we import all the libraries we’ll need: Requests, for requesting webpages; BeautifulSoup, to read the information on the webpages; JSON, because once you read the information on the webpage you discover it’s an unholy pile of tangled-up JavaScript that takes a whole evening to pick apart; and a list of the top 100 Instagram accounts, which I made separately (and will talk about later).

As alluded to above, the real killer for using BeautifulSoup to scrape Instagram is that, when you run a request for an Instagram page, instead of getting the website, you get a horrible pile of JavaScript.

When you access using a web browser
When you access using the Python Requests module

I suspect that this gross mess is because using BeautifulSoup is not the most intelligent way to scrape Instagram, but I’m new here so that’s what we’re using.

Line 13 of the code pulls all the JavaScript out of everything else (mostly CSS and a bit of HTML) and pops it in a list called pageJS.
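To show the idea of separating the script blocks from the rest of a page without needing anything installed, here is a small sketch using Python’s built-in html.parser module. To be clear, this is a standard-library stand-in and not the BeautifulSoup call SelenaBot uses; the class and function names are made up for the example.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Collects the text inside every <script> tag, in page order."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")  # start a new entry for this script block

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data  # script text may arrive in several chunks


def collect_scripts(html):
    """Return a list with the contents of each <script> block in the HTML."""
    parser = ScriptCollector()
    parser.feed(html)
    parser.close()
    return parser.scripts
```

The result is a plain list of strings, one per script block, so picking out a particular block by its position works the same way as indexing into pageJS.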

pageJS, and specifically pageJS[6] – the bit of code that contains the information that allows us to access the pictures – happens to be the ugliest thing you’ve ever damn well seen, and line 14 of SelenaBot took me the best part of an evening to write.

Here’s the full content of pageJS[6]:

And here’s line 14 again:

allPics = json.loads(str(pageJS[6])[52:-10])['entry_data']['ProfilePage'][0]['user']['media']['nodes']

After stripping off the beginning and end characters to turn it into something that the JSON module can interpret, we can see that all the information we need (image URLs, timestamps and more) sits under the dictionary key “nodes”. To get to “nodes”, however, we have to pick our way through another dictionary nested in a third dictionary, nested in a fourth, nested in a list, which is nested in two further dictionaries. This took me a long time, painstakingly going through a text file I made of pageJS[6], and I really rather suspect there is a tool that would’ve got me there a lot quicker. But whatever, we can access it and we’ve called it allPics. From here on out it’s plain sailing: the for loop iterates through allPics and saves each picture to the hard drive, giving it a file name of the user plus the Unix timestamp.
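Counting exactly 52 characters off the front and 10 off the back works until the page wrapper changes by a single character. A slightly less fragile variation, sketched below, is to cut from the first “{” to the last “}” and then follow the same chain of keys as line 14. The sample text is a made-up stand-in rather than a real Instagram page, and the "date" field name, the .jpg extension and the function names are assumptions for illustration.

```python
import json


def extract_nodes(script_text):
    """Cut the JSON object out of the script text and walk down to 'nodes'."""
    start = script_text.index("{")      # first opening brace
    end = script_text.rindex("}") + 1   # one past the last closing brace
    data = json.loads(script_text[start:end])
    # Same chain of keys as line 14 of SelenaBot.
    return data["entry_data"]["ProfilePage"][0]["user"]["media"]["nodes"]


def filename_for(username, node):
    """Build a file name from the user plus the Unix timestamp.

    Assumes the timestamp lives under the hypothetical key "date".
    """
    return "{}{}.jpg".format(username, node["date"])


# A tiny made-up example with the same nesting as described above.
fake_page_data = {
    "entry_data": {"ProfilePage": [{"user": {"media": {"nodes": [
        {"date": 1482500000},
        {"date": 1482586400},
    ]}}}]}
}
sample = ('<script type="text/javascript">window._sharedData = '
          + json.dumps(fake_page_data) + ';</script>')

for node in extract_nodes(sample):
    print(filename_for("selenagomez", node))
```

If the page ever stops containing a JSON object at all, index or json.loads will raise an error, which at least fails loudly rather than slicing the wrong characters.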

The final two lines of code tell SelenaBot to open up a list of the top 100 most followed celebrity accounts and then download each one’s 12 most recent images, and off it goes.

Automated downloading of the pics of the Ellen Show and Cara Delevingne: for all my eyebrow-inspo and mum-meme needs
And the final result!

Just briefly, the top100gram module is another, smaller scraper I wrote that pulls a list of the top 100 most followed Instagram accounts off a not particularly reliable-looking website called SocialBlade. SocialBlade hasn’t been updated since Justin Bieber quit and then rejoined Instagram, which meant the list was passing an incorrect username into SelenaBot. The poetic irony of having a piece of code named after Selena Gomez react violently and stop working at the mention of Justin Bieber’s name was not lost on me, and it also forced me to write my favourite three lines of code yet.

if 'justinbieber' in top100: 

It’s not too late to say sorry.


Things In The Countryside That Are Uglier than Solar Panels

This morning, environment minister and former Shell employee Liz Truss decided to bash solar panels.

From her perspective, it is a sensible thing to do. Solar Panels are a new thing. Traditional Tory voters generally don’t like new things, and since the aim of the pre-election game is to bash things that traditional Tory voters don’t like, it’s time to get out the Solar Panel flogging stick.


Liquid Nitrogen Ice Bucket Challenge

For a full disclaimer and safety notice please see the original post.

I now do lots of science blogging over at Tycho’s Nose, an excellent blog put together by Gilead, Keir and myself. Whilst the site’s content usually consists of long, thought-out pieces on interesting and varied topics, I was challenged to do the blooming ice bucket challenge and figured that I would only do it if I was teaching people stuff.

So yeah, combined gas law for the win. The experiment works beautifully, and the moment where I pull in the whiteboard isn’t overly cheesy (although still very cheesy). But hell, people have messaged me to say they enjoyed learning the gas law, so why the hell not.


Personal Genome Project Party

[Most important update: a big old full statement and apology from the PGP here]
[Update: someone from PGP has commented to point out that this is PGP UK only and the international ones are run separately]

Last night the Personal Genome Project (PGP UK) sent an email out to everyone who’d registered an interest in taking part. Two hours later, one volunteer replied to PGP UK with a query about the booking form. Unfortunately for the UCL-administered mailing list, the thousands of people who received the first email also received the second one.


Does Microsoft Excel know more about Nicki Minaj than I do?

For a bit of a laugh, I frequently challenge my friends to topics on QuizUp that I know we both know literally nothing about. My main topic of choice used to be “Nicki Minaj”, but recently I’ve come to suspect that my younger brother is secretly studying Minaj facts so he can beat me at it.* Anyway, there have been a couple of times I’ve scored so badly in these joke QuizUp games that I’ve wondered:

Would I actually have been better off just jabbing at the screen at random?
