So, is http://joule.marnanel.org working for you? If it isn't, tell me. If it is, tell your friends.
A few reports are coming in of Twitter/identi.ca support reporting that valid accounts don't exist. I'll be investigating.
Fixing Joule
Dec. 12th, 2009 09:46 pm
We need to:
- get coffee. DONE.
- back everything up. DONE. It's 2.1Gb.
- find out how many users are on the raisin system. DONE. There are 18760 (about 5% of the total, but the most regular users are on it).
- convert them back to currant. DONE
- roll back the comparator to use currant. DONE
- turn everything on again and hope it works. DONE...
- Finish the next chapter of my book and send it off to the editor (tonight, I expect)
- Convert myriadcolours.com to use spiritbutter
- Wind back the raisin comparator changes in Joule
- Bring Joule back online. Honestly. Happy Christmas.
further midday updates
Oct. 7th, 2009 12:13 pm
- IT IS RIORDON'S BIRTHDAY. Happy birthday to the kid who is officially the most awesome kid in the world.
- The nicest Joule comment ever.
- Two reviewers in Canada want physical copies of Borrowable: one says they will probably review it and one might review it. One other reviewer doesn't want it because it's self-published. I will therefore be ordering more author copies when my paycheque comes through.
- The Launchpad people will allow Shavian translations only if we first fix the bugs in Launchpad which are holding it up. Arc does not seem to think this will be a major difficulty.
- Cambridge University Library "would be delighted" to add Borrowable to their collections.
- I have just received a (free) review copy of Writing Children's Books For Dummies.
- I have finished A Tale of Two Guinea-Pigs and thoroughly enjoyed it. A review follows, later today.
- Did I mention there was a quiz on the Borrowable site now? And the start of a recipe collection?
Status of the sites
Sep. 22nd, 2009 10:20 am
Introduction. I've had a few people ask me what happened to *.marnanel.org, which has been down for several days now. The short answer is that there were software memory problems in each case; the long answer differs for each application.
joule.marnanel.org: I'm putting this first because I expect most people reading this will want to know about Joule.
One of the main parts of Joule is the comparator, which compares the old state of your friends list with the new. The old comparator, "currant", had worked fine for around a year, but could only compare about 1000 records a second. Several months ago I introduced Twitter support to Joule, and because I wondered whether people with millions of followers might like to use it, I rewrote the comparator entirely, producing "raisin". This was a bad move for two reasons:
Firstly, although it worked, if you have millions of followers you also have hundreds of friendings and unfriendings every day, more than anyone would want to wade through. So Joule is rather useless for such people.
Secondly, the new comparator compared everything in memory rather than in the database. Although it worked with large datasets in testing, when it was put into production there were some scalability issues. Eventually it allocated so much memory that it crashed the server Joule was running on, which isn't acceptable because it's used for many other things and by many other people.
I thought I had fixed both of these problems by adding a check that a user didn't have more than a few thousand followers. However, this turned out not to be enough, and the server eventually crashed again. The server admins asked me, quite reasonably, not to run Joule in its present state on that server.
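To make the memory trade-off concrete, here is a minimal Python sketch of an in-memory comparator with a follower cap. It is not Joule's actual code: the function name, the `limit` parameter and the cap of 5000 are all assumptions for illustration (the post only says "a few thousand").

```python
MAX_FOLLOWERS = 5000  # hypothetical cap; the post only says "a few thousand"


def compare_lists(old, new, limit=MAX_FOLLOWERS):
    """Return (added, removed) between two snapshots of a friends list.

    Both snapshots are copied into sets, so memory use grows with the
    size of the lists. The cap refuses very large lists up front.
    """
    old_set, new_set = set(old), set(new)
    if len(old_set) > limit or len(new_set) > limit:
        raise ValueError("list too large to compare in memory")
    added = sorted(new_set - old_set)      # in new but not in old
    removed = sorted(old_set - new_set)    # in old but not in new
    return added, removed
```

Note that a cap like this only bounds each individual comparison; it says nothing about how many comparisons run at once, which may be one reason a per-user check alone was not enough.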
Ways forward from here.
I don't know. Out of all the sites, this is the toughest one to bring back. Unfortunately, it's also by far the most used.
I could revert to "currant", although that would be fiddly because the database structure is rather different and I'd need to write something to convert back to the old format.
I suppose I could run Joule in its present state with "raisin", on a dedicated server.
I could convert Joule to CGI so that it was easier to profile.
The Yarrow sites: marnanel.org and rgtp.thurman.org.uk. These are less of a problem because they're CGI, not in-process. The problem with them is that they occasionally load the entire index into memory while creating a cache of it; this is easily fixed.
The Shavian wiki. This was also a memory hog for a quite different reason: it cached all the transliterations while rendering. I could easily turn this off, but it would become very slow. Ways forward: I would actually like to do the rendering in a separate script, so I wasn't writing it in MediaWiki and therefore PHP any more; I'd like it if it could render text taken from other sources than its own wiki, such as simple.wikipedia.org and en.wikisource.org; I'd like it if it could take pronunciations from another, less perfect source such as CMUdict where they weren't already supplied, except in wiki-building mode, and if it could make some attempt at automatic disambiguation. This is a medium-sized rewrite, but this may be a good excuse for it.
More Joule: UI changes
Jul. 2nd, 2009 10:32 am
Briefly, since I'm busy:
Since it was a fairly trivial fix, I implemented a two-stage system in Joule last night, as suggested here: there's just a username box on the front page, and it takes you to an intermediate page where you pick the service. Most people bookmark their chart page and don't use the controls, and I was aiming to make the controls simpler even at the possible cost of a few extra seconds for a first-time user. I'm asking for feedback for or against this idea. I've only heard one reply so far and they didn't like it. What do you think?
Things that need doing on Joule
Jul. 1st, 2009 07:23 pm
Some things that could be done to Joule, mainly for my own reference. Not in order. I've shown the amount of work needed; I haven't ascribed an importance to any of these (though I wouldn't mind hearing your opinions).
- Joule is case-sensitive. None of the systems it serves data from are case-sensitive. This is silly. This will probably require downtime to fix, because effective duplicates will need to be removed from the database. Medium
- The translation system needs a radical overhaul. I have several ideas. In particular, the English text should be placed within the templates, as with gettext, and not within a magic .po file; and ?lang=fr etc should be pages, not redirects, for the benefit of search engines. Complex
- Controls overhaul. Easy
- Look into OpenSocial so we can chart Blogger and MySpace. Medium
- There should be a table of messages of the day. The HTML pages should show the most recent, and the RSS feeds should show whichever was the most recent on the relevant day. This will let us put interesting messages about new features into RSS feeds, which is the only way to contact most of our users. Medium
- joulestats is stable and can be run from cron: done. Also, fix joulestats's messages for users with zillions of followers; they're less helpful than they could be. Easy
- Page view per day so that massive charts become at least slightly useful. Medium
- Add an extra column showing the total number of followers on each day, for the same reason. This needs a current count to be returned from the XS and then we just add and subtract as we go down the line. Easy
- The FAQ needs to be broken out into separate pages. Easy
- Dreamwidth support, when this enhancement is finished. Easy
- Most of the Twitter and identi.ca work needs to be done in a superclass rather than duplicating code. Easy
- I would like a way to draw line graphs of number of followers over time. (This is blocked by "controls overhaul".) Complex
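The "total number of followers on each day" item above amounts to a running total worked backwards from today's count. A minimal Python sketch, assuming the chart lists days newest first and that each day's added and removed counts are already known (the function and parameter names are hypothetical, not Joule's):

```python
def follower_totals(current_count, daily_changes):
    """Work out the follower total at the end of each day.

    current_count: number of followers right now.
    daily_changes: list of (added, removed) counts, newest day first.
    Returns a list of totals in the same order, newest day first.
    """
    totals = []
    total = current_count
    for added, removed in daily_changes:
        totals.append(total)
        # Undo this day's changes to get the total at the end of the day before.
        total = total - added + removed
    return totals
```

For example, with 100 followers now and daily changes of (3, 1), (0, 2) and (5, 0), the totals going back in time are 100, 98 and 100.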
Woke up at a good time, around seven. Promptly and stupidly decided to go back to sleep to see what the end of the dream was; it turned out to be a nightmare. Woke up again at about eleven and went to the gym. Continued the run of stupid mistakes by forgetting to get lunch for Rio. Sharon came by and brought her lunch instead. I hate getting up late. :(
Later, went to the diner for dinner. Talked to Alex about a shelving project he's working on.
Did a little tidying, but not very much. But I've got some way towards Inbox Zero: I'm now down to four emails.
Today I learned that cd - changes to the directory you were in before the current one.
Fin gave me an old notebook of zirs to use as a logbook. It's lovely.
It occurs to me that the simple system I built a while ago which mostly allows Ubuntu to come up in Shavian would also work to get Deseret, Unifon and Tengwar. I wonder whether there's much of a market for Ubuntu in Tengwar. Possibly good Slashdot fodder, anyway.
Joule-for-Dreamwidth is edging closer. I also need to implement a per-day view with a paging system to get around this problem.
Five days until GCDS starts.
Liveblogging a Joule fix
Jun. 27th, 2009 06:31 pm
Ryan Tucker reported a bug in Joule. When a user has more than 5,000 followers, on some days Joule will throw a database error about a duplicate key. This is mysterious, since the keys come from a hash and should be unique. I thought I'd try liveblogging fixing it, in case anyone wanted to watch. Times are EST.
- 18:30: Can we replicate it in staging? I don't want to bring the real Joule down while I look for a fix.
- 18:39: Yes, astronautics on Twitter has >5,000 followers and causes Joule to exhibit the bug.
- 18:44: Okay, Joule is instrumented so it will dump the old and new lists to a file, plus what it thinks the changes are.
- 18:45: It failed again and I have the log file. Good! I hate when you set up debugging and it suddenly starts working.
- 18:48: Well, it's not because there are duplicates in the old or the new lists, so it must be a comparison error.
- 19:01: Fascinating. The new version of the comparison code is reporting one of the userids as both added and removed, which the DB constraints obviously won't allow. This didn't come up in testing...
- 19:24: It seems that when you have two users A and B, where A's name is a prefix of B's, and A unfriends you, the system gets confused and reports B as having both friended and unfriended you. Fixing now.
- 19:31: I think I have a solution. Taking out all the instrumentation to test it.
- 19:36: Tests pass. So do old tests. Will write a regression test in a few minutes.
- 19:37: The moment of truth... yes! it works on staging. Rolling out to production.
- 19:52: Fix checked in and in production. The remaining problem here is that astronautics had 3064 follows and 2421 unfollows, and Joule is fixed so that it shows "Hiccup" if you have more than 100 on the same day (for three reasons; I could tell you, but does anyone care?) Suggestions for working around this one are welcome.
- We have to do a separate lookup in Twitter for every userid we haven't seen before, to get the icon and username. For 5000 changes in a day, that slows page load times a lot. This is still a problem.
- There is an old pre-Twitter assumption that 100 follows or unfollows means either that Joule broke, or that LJ broke when it sent us the names. Clearly this is outdated.
- There isn't enough space in the chart for more than a few hundred names a day without making the page insanely long.
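The prefix case found at 19:24 makes a natural regression test. The post doesn't say how the comparison code is written, so the following Python sketch is only an assumption: it walks two sorted lists of names once and compares whole names, so that "ann" and "anna" are never confused. The function name is hypothetical.

```python
def diff_sorted(old, new):
    """Walk two sorted lists of names once; return (added, removed).

    Names are compared as whole strings, so a name that is a prefix
    of another is treated as a different name.
    """
    added, removed = [], []
    i = j = 0
    while i < len(old) and j < len(new):
        if old[i] == new[j]:
            # Present in both lists: no change.
            i += 1
            j += 1
        elif old[i] < new[j]:
            # Only in the old list: this user has gone.
            removed.append(old[i])
            i += 1
        else:
            # Only in the new list: this user has arrived.
            added.append(new[j])
            j += 1
    removed.extend(old[i:])  # whatever is left in old was removed
    added.extend(new[j:])    # whatever is left in new was added
    return added, removed


# Regression case: "ann" is a prefix of "anna", and "ann" unfriends you.
added, removed = diff_sorted(["ann", "anna", "bob"], ["anna", "bob"])
```

A useful invariant to check in such a test is that no name ever appears in both `added` and `removed`, since that is exactly what the database constraint rejected.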
Bottled annoyance
May. 5th, 2009 09:51 pm
It's been raining for days. Rio (whose website is now a little out of date) says we should put the rain into jars and call it "bottled annoyance".
Speaking of Rio, she's been learning the trumpet for a few months now. Tonight we went to a concert her school were putting on. There was a high school jazz band, too, and now she's decided she wants to be a jazz trumpeter. She's asking for trumpet jazz CDs, and Fin is asking whether you have any recommendations. All this makes me want to pick up the bass again. Perhaps I need to take lessons.
We had to take Rothko to the vet. He'll be fine. The other cats are missing him rather.
I didn't get much done this weekend; I've been feeling kind of out of sorts recently. I did manage to spend an hour or so on Sunday adding Digg support to Joule, and later I added support for Doug Ewell's spiky rune-like Ewellic alphabet to the Shavian wiki here. Which is your favourite of the scripts we have so far? (You'll need IE, Safari, or Firefox 3.5 to see them without downloading fonts.)
telling people about Joule
May. 3rd, 2009 02:03 pm
I'm trying to think of ways to tell more people about Joule, rather than just blogging about it myself.
- People can tell their friends. This is the best way, and happens quite a bit, but I can't do much about it either way. Of course, all the other ways feed back into this way too.
- http://twitter.pbworks.com/Mashups might be worth editing, but I'm not sure I should add my own stuff there in case I look like a spammer.
- How do you get listed on http://laconi.ca/trac/wiki/Apps anyway?
- No chance of getting linked from Wikipedia, but it would certainly bring people over.
- Running a few Google ads might work, but might be expensive. Still, for a few days it might be worthwhile.
- I could email various blogs and tell them about it; they might be interested.
- I wish Joule was interesting enough that someone would write an article about it (from a "tool with a six-year history" angle, perhaps, rather than a "just a friends-list tracker" angle). But maybe it isn't.