Freesteel Blog » “Add your comment” considered harmful

“Add your comment” considered harmful

Saturday, June 21st, 2008 at 12:49 am

Amazingly, although I have been ranting about this issue in various private and public group emails for a year now, the matter never seems to have hit the blog. I’d pretty much given up attempting to persuade the TheyWorkForYou.com crew to make a move in this direction, so I must have forgotten about it.

It is an observable fact that when an area of software gets done to the point of being declared tolerably okay, it becomes totally static till the end of time. That’s why you have to finish things properly, or leave them in such a bad state that you cannot avoid fixing it later, if you don’t want to get stuck in a hole. It’s also why start-ups in the software industry can sometimes whip the established products with shockingly little effort, when in a sensible world they shouldn’t even be able to catch up.

However, what’s changed is that recently a small group have taken a block-copy of the TheyWorkForYou.com code and called it OpenAustralia.org for the purpose of republishing the Hansard transcripts of the Parliament of Australia. I’ve been winding them up over here recently.

While I’m very much in favour of reusing good code, what this means is that they’ve copied all the static flaws of the TheyWorkForYou.com project, when it would be preferable if they were a small start-up that whipped this established system by being a lot smarter. As the Australians are Ruby programmers, you’d have expected them to borrow from the code-base of TheyWorkForYou NewZealand instead.

My unsupported attempts at producing a system with many of the features I want to see (and without any stupid “Add your comment” links) can be witnessed at undemocracy.com.

I wrote the parser for undemocracy.com (from PDFs) over several months, but the webpage suffered a serious setback because I tried to get someone else to build it on top of Drupal, a content management system he was a very keen fan of. It didn’t work. I don’t know if any of these numerous big fancy CMSs are up to the job. It’s odd. You’d think there’d be something standard out there by now for noting down minutes of meetings and subcommittees, or court transcripts, or even historical plays, which could be adapted for this purpose. But it seems not.

In the end, after a really long hard weekend building it from the ground up in raw Python, I got something ready in time to show to people at hackday last year. (Note: I’m staying up to catch a bus to the equivalent event this year, which leaves at 3:15am.)

One of the important ideas is to parse everything into a standard HTML form, rather than this made-up XML nonsense which I thought was a good idea at the time. After all, why have a line like:

<major-heading id="uk.org.publicwhip/debate/2003-06-26.1220.0">

when the semantically equivalent

<div class="major-heading" id="uk.org.publicwhip/debate/2003-06-26.1220.0">

is readable in a standard HTML browser. Consequently, while the parsed files of ParlParse which feed into TheyWorkForYou.com found here are not a lot of use on their own, the undemocracy.com files here are quite serviceable with the addition of a trivial bit of CSS. In fact, you can design it so that this makes the job nearly done. All you need are some batch generated indexing pages and scripts to slice out the individual debates, and this is essentially what you get. No need to load all the paragraphs and speeches into a SQL database, only to print them all back out in the same order without any gaps — that’s just a long way round to get back to where you started.
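As a rough illustration of the conversion and slicing described above, here is a minimal Python sketch. It is not the actual undemocracy.com or ParlParse code. Only the `major-heading` element comes from the example in this post; the other tag names, and both function names, are assumptions made up for the sketch. It needs Python 3.7 or later, because it splits on a zero-width match.

```python
import re
from typing import List

# Only "major-heading" appears in the post; the other names are assumed.
CUSTOM_TAGS = ("major-heading", "minor-heading", "speech")
_NAMES = "|".join(re.escape(tag) for tag in CUSTOM_TAGS)


def to_html_div(text: str) -> str:
    """Rewrite <tag ...> as <div class="tag" ...> and </tag> as </div>."""
    text = re.sub(rf"<({_NAMES})\b", r'<div class="\1"', text)
    return re.sub(rf"</({_NAMES})>", "</div>", text)


def slice_debates(html: str) -> List[str]:
    """Cut one day's converted file into chunks, one per major heading."""
    chunks = re.split(r'(?=<div class="major-heading")', html)
    return [chunk for chunk in chunks if chunk.strip()]


line = '<major-heading id="uk.org.publicwhip/debate/2003-06-26.1220.0">'
print(to_html_div(line))
```

The slices can be written straight out as static pages, with the class names picked up by the CSS, so no database sits in between.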

What’s the problem with “Add your comment” then?

Well, as you can see, there are a billion times more empty comment slots than ones people have written in, and it’s always going to be that way.

The spread of locations where comments can be made is desperately uneven — there’s one per speech, whether it’s a substantial multi-page oration covering dozens of points, or a one word interjection. You might call this a minor implementation quibble that could be fixed by changing the unit to the paragraph, but the fact that this has not been done in the past three years is a hint that comments are not really being used.

What are comments on debate speeches anyway? When you read a good debate, you have one person putting points to another person who responds to them in a process known as an intervention. Isn’t the intervention a comment on the first person’s speech? Is his response to that intervention a comment on the intervener’s comment, or a second comment in a pair of two comments on the point he was making at the time of the intervention? And how does a third person outside fit their comment in later? I know of no forum software where anyone but the editors can insert comments between two comments made earlier, such as with this example. Maybe a debate is a single comment thread on its own in the first place.

The most effective place comments can be used is to explain the back-story, such as with this example over the interjection: “So weak!” which witnesses had reported (across the front page of newspapers) as being “So what!”

These are extremely rare. Generally the quantity of data flowing out of the debates is such that, given the choice, people would prefer to skim down the comment column, hoping that others will have picked out the interesting speeches. They won’t have done. Someone has got to read it. And if it’s got no comments yet, no one will think it’s interesting.

Another use of comments is to point out contradictions or related speeches. Some person gives a speech in 2004 completely contradicting his speech from 2001, so you point to his 2001 speech from a comment on his 2004 speech. Then you add a similar comment to the 2001 speech pointing back to the 2004 speech, just in case someone finds that one instead of this one. It gets prohibitively annoying once you get up to three related speeches in a cluster.

As well as being very sparse in the data, comments are generally not worked over, because who is going to read them anyway? People who write good articles about speeches or disclosures in Parliament do it on their own blogs and professional newspaper articles. These are not going to appear on your TheyWorkForYou comment stream. It’s easy to find them for issues today on the Liberty website, here, here, and with a non-deeplink to TheyWorkForYou, and no link to PublicWhip (which should be to this one) here.

The ever-popular TheRegister often has snippets about events in Parliament, particularly when they report the latest publicly funded IT debacle. (I posted my favourite exchange two years ago). Here is a recent TheRegister posting, with a link to a video that doesn’t work, especially at a time when TheyWorkForYou.com have been dealing with this issue. It just goes to show that most of the people who should know about it are going to be completely oblivious to the hard technical problems that you’re solving, because they don’t even know that what they’re doing now isn’t working.

Without trying very hard, it’s easy to find other respected places like Greenpeace, Oxfam, and the BBC who generate Parliamentary commentary, but will have nothing to do with your site.

So, comments hosted on a TheyWorkForYou system are going to go nowhere.

What’s the alternative?

Host only track-backs.

These are easy to harvest using the referrer in your incoming HTTP request, and turn out a handy live feed that is more dynamic than the “most recent comments” table, because it takes no effort to bump things up to the top of the list. Also, since the data is not intrinsic to your system, it makes it easier to develop the software because it doesn’t need to remain compatible with a huge blob of user generated data in a database.
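Here is a minimal sketch of that harvesting in Python, assuming the web server hands each request's Referer header, the speech being viewed, and a timestamp to some hook. The `TrackbackFeed` class and its method names are hypothetical, and the example URLs are placeholders. A real deployment would also need the spam filtering discussed next.

```python
from urllib.parse import urlsplit


class TrackbackFeed:
    """Collects referring pages per speech and lists the most recent first."""

    def __init__(self, own_host):
        self.own_host = own_host
        # (referrer_url, speech_id) -> [hit_count, last_seen_timestamp]
        self.entries = {}

    def record(self, referrer, speech_id, timestamp):
        """Note one incoming request; returns False if it is not a track-back."""
        host = urlsplit(referrer or "").hostname
        if not host or host == self.own_host:
            return False  # no referrer, junk, or an internal click
        entry = self.entries.setdefault((referrer, speech_id), [0, timestamp])
        entry[0] += 1
        entry[1] = max(entry[1], timestamp)
        return True

    def latest(self, limit=10):
        """Most recently clicked track-backs as (url, speech_id, hits)."""
        ranked = sorted(self.entries.items(), key=lambda kv: kv[1][1], reverse=True)
        return [(url, speech, hits) for (url, speech), (hits, _) in ranked[:limit]]


feed = TrackbackFeed("www.theyworkforyou.com")
feed.record("http://blog.example.org/post1", "2003-06-26.1220.0", 100)
feed.record("http://news.example.com/a", "2003-06-26.1220.0", 105)
feed.record("http://blog.example.org/post1", "2003-06-26.1220.0", 110)
print(feed.latest())
```

Every fresh click moves a link back to the top of the list, which is why the feed stays livelier than a table of recent comments. The hit counts also give a crude measure of which sources people actually read.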

You don’t need to respond to every blog; that would just allow the spam in. In fact, blogs and other publishers are kind of like individual users in this case, so when you ban one, all their messages can disappear. Wikipedia (of which more later) is a quality source as well. Say what you like about it, but it is astonishingly spam free in this day and age.
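One way to picture the white-list is as a filter applied at display time, so that banning a publisher is a single set operation. The sketch below is a hypothetical illustration: the function names and host names are made up, and nothing here is taken from TheyWorkForYou or undemocracy.com.

```python
from urllib.parse import urlsplit


def source_of(url):
    """Host name of the page that linked to us, lower-cased ('' if none)."""
    return (urlsplit(url).hostname or "").lower()


def visible_backlinks(backlinks, whitelist):
    """Keep only back-links whose publisher is on the white-list."""
    return [url for url in backlinks if source_of(url) in whitelist]


harvested = [
    "http://en.wikipedia.org/wiki/Some_article",
    "http://blog.example.org/post1",
    "http://spam.example.net/pills",
]
whitelist = {"en.wikipedia.org", "blog.example.org"}
print(visible_backlinks(harvested, whitelist))

# Banning a publisher makes all of its links vanish in one go.
whitelist.discard("blog.example.org")
print(visible_backlinks(harvested, whitelist))
```

Because the harvested links are never edited, un-banning a source brings its links straight back.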

In fact, if you designated an open blog, or a forum, which you lightly integrated, it’s possible to recreate the whole execrable “Add your comment” feature in its entirety by making every link go to a new thread whose first sentence contains the link back to the speech. The comment is made there, and the back-link appears in its place as if you added your comment. The designated open forum software can then operate its own user-login environment so you don’t have to program it.

So that proves track-backs can do everything the comment system does, and more. What’s better is that one article or blog post can point to many speeches at once. So you can say, “Hey, my MP said this in a speech in 2004, but he said the opposite thing in that speech in 2001. I think it’s because of this entry in his register of interests.”

Now that’s three places that will all back-link to the same article. No more having to point one thing to another and to another and to itself again. Also, if someone else commented about that same register of interests, citing all the other MPs who had the same commercial interest, you can connect through from your post, to the register, back to his article, and then back forwards to another MP whom he says has the same interests.

This is a sort of zig-zag effect that could happen when a site automatically exchanges links with the outside world.
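The zig-zag can be sketched as a two-way index. In this hypothetical Python example the article URLs and the speech and register identifiers are placeholders, and the function names are made up. Each article lists the items it links to, and the back-link index is derived from that.

```python
from collections import defaultdict


def build_backlinks(articles):
    """Invert {article: [targets]} into {target: {articles linking to it}}."""
    backlinks = defaultdict(set)
    for article, targets in articles.items():
        for target in targets:
            backlinks[target].add(article)
    return backlinks


def related_targets(target, articles, backlinks):
    """Everything reachable by one zig-zag: target -> article -> other targets."""
    related = set()
    for article in backlinks.get(target, ()):
        related.update(articles[article])
    related.discard(target)
    return related


articles = {
    "blog.example.org/my-mp": ["speech-2004", "speech-2001", "register-mp1"],
    "blog.example.org/interests": ["register-mp1", "register-mp2"],
}
backlinks = build_backlinks(articles)
print(sorted(related_targets("speech-2004", articles, backlinks)))
print(sorted(related_targets("register-mp1", articles, backlinks)))
```

With comments, tying together n related speeches takes a pointer on each speech to every other, about n(n-1) links in all. With a single outside article it takes n back-links, all harvested automatically.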

Okay, now this isn’t happening yet anywhere. But the technical implementation is not very hard. I can’t do it for TheyWorkForYou, because that system is already established, and you can’t make a rival one, and no one working on it sees this as a very pressing issue. Therefore, it won’t happen.

The undemocracy.com site, on the other hand, has limited resources and audience, but it makes the first step. Take a look at this link and click on the grey [link to this] block. This opens out into a block containing a copyable reference like so:

<ref>{{ UN document |docid=A-62-PV.15 |body=General Assembly |type=Verbotim Report |session=62 |meeting=15 |page=32 |anchor=pg032-bk02 |date=[[2 October]] [[2007]] |speakername=Mr. Gutiérrez Reinel | speakernation=Peru |accessdate=2008-06-21 }}</ref>

which is going to fit quite nicely here into a Wikipedia article.
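Generating such a reference is little more than string formatting. The Python sketch below is not the actual undemocracy.com code. The function name is hypothetical, and it assumes the fields arrive as a dictionary already in the order they should be printed.

```python
def un_document_ref(fields):
    """Render a dict of fields as a <ref>{{ UN document ... }}</ref> citation."""
    params = " ".join(f"|{name}={value}" for name, value in fields.items())
    return "<ref>{{ UN document " + params + " }}</ref>"


print(un_document_ref({
    "docid": "A-62-PV.15",
    "body": "General Assembly",
    "session": 62,
    "meeting": 15,
    "page": 32,
    "anchor": "pg032-bk02",
}))
```

Dictionaries keep insertion order in Python 3.7 and later, so the parameters come out in the order given.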

So that’s the first step: generate these high-quality links, which will give people the support to add them into, for example, Wikipedia articles. I’ve made up a few links, like in here, for the TheyWorkForYou system, including the necessary {{UK Parliament|}} template, but they don’t auto-generate them.

So much for that.

Like I say, I’ve given up trying to get this to happen. But if it did, the next step would be to harvest these incoming links, and then make the back-links to the blogs. Now you will have a reason for people like Oxfam and Greenpeace and the BBC to link to you, because they’ll get something back: you will give people who read the Parliamentary transcripts a way back to their sites as good as a Google ad (which these organizations all pay for). And they don’t even have to associate themselves with you overtly by posting their original comments on your pages.

So, that’s the plan, as I’ve been expressing it for some time. It’s waiting for someone who’s cool, with more influence than me, to take it up as the basis for a robust general CMS that’s suitable for Parliamentary informatics. Then we can all load our different parsed data into one good flashy well-designed system so I don’t have to keep hearing about how crap I am at designing web-pages anymore, and it’ll be compatible across all continents. Wouldn’t that be great, eh?

6 Comments

  • 1. Matthew replies at 21st June 2008, 1:35 pm :

    “After all, why have a line like: when the semantically equivalent is readable in a standard HTML browser. ”

    You could even use the more semantic h1 or h2? 🙂

    “I can’t do it for TheyWorkForYou, because that system is already established, and you can’t make a rival one, and no one working on it sees this as a very pressing issue. Therefore, it won’t happen.”

    Actually, TheyWorkForYou had trackbacks back when it launched and for quite some time after. It got spammed to death and so I switched it off. If you want to submit a patch, rather than just moaning that no-one can be bothered to do it because they don’t feel about it as strongly as you do or have other things they’d rather be doing, you are of course free to do so.

    How do you notice links that no-one clicks? I mean, it’s a bit awkward asking people to click a link they add to a web page to “activate” it. Plus we get quite a bit of spam from people expressly trying to use HTTP_REFERER – how would you stop that?

  • 2. Julian replies at 23rd June 2008, 9:46 am :

    I don’t understand why using a white-list isn’t so entirely obvious that it’s an insult to the intelligence to mention it? Anyway, if no one reads their blog and clicks their link, why does it matter if it doesn’t show up? With the referrer you get a time-based hit-count so you know who’s high quality and can encourage them. For example, if they’re a newspaper. (BTW, the best target is TheRegister.)

    Given all I have contributed codewise for free and for the sum total recognition of eleven ASCII characters on the about page, plus the fact that I am probably still the only power user of the system (name someone else who has created as many high quality deep-links) — I think it explains a lot how my suggestions never get implemented in terms of moving the system towards one day eventually having a measurable political effect.

    I am entitled to moan, and I am going to do so more and more, because I can then hold the offer of something which you will desperately need: for me to shut-up. The amount of influence I get otherwise has been approximately zero. So it ain’t going to get any lower.

    The massive collaboration that has gone on with the video matching demonstrates that the limitation is with the imagination, not time.

    Or is it just me. I bet if anyone else told you to replace the sections like[1]:

    Division number 226

    See full list of votes (From The Public Whip)

    by at the very least in-lining from the table:

    http://www.publicwhip.org.uk/twfydivision.php?date=2008-06-20&number=226

    it would have happened.

    Same goes for all the other simple to implement concepts I can’t be bothered to list here just to hear more waste-of-time excuses that can’t be the real reason.

  • 3. Freesteel… replies at 23rd June 2008, 10:45 am :

    […] I say I was horribly grumpy right […]

  • 4. Matthew replies at 23rd June 2008, 12:56 pm :

    “I don’t understand why using a white-list isn’t so entirely obvious that it’s an insult to the intelligence to mention it?”

    Who would maintain a list? Okay, so you start with Wikipedia or whatever, but presumably you want the person you mention who wants to link to three different speeches (on their blog, I am guessing) to be allowed, so they have to email and get approved. Or perhaps simply have them register their website/blog in their user account and that whitelists it – I literally just thought of that then, up to you whether you believe me or not. That would of course involve patches to the user account system, not trivial.

    “Anyway, if no one reads their blog and clicks their link, why does it matter if it doesn’t show up?”

    I’d have thought you’d want to encourage traffic both ways, so people could find other people’s commentary (as yet unread by anyone else, presumably) through TheyWorkForYou. My mistake.

    “I think it explains a lot how my suggestions never get implemented in terms of moving the system towards one day eventually having a measurable political effect.”

    Sorry, I can’t parse that past paranoia. Are you accusing me of deliberately acting against you? What would be my reasoning for that?

    “The massive collaboration that has gone on with the video matching demonstrates that the limitation is with the imagination, not time.”

    Well, my time isn’t limitless, I don’t know about yours. And much of the time, I have other stuff I’d prefer to be doing.

    “Or is it just me. I bet if anyone else told you to replace the sections like: [snip] it would have happened.”

    Even though you go on to say this is “simple to implement”, it isn’t. I can’t slow down page loading times enough to do an external HTTP request for every division in a particular debate, so it would have to be some cron job inserting some stuff into some database somewhere, none of which is “simple”. Might be just as easy to write something to parse the elements directly in TheyWorkForYou’s xml2db.pl loader. If it is as simple as you say it is, I say again, write a patch. 🙂

    “concepts I can’t be bothered to list here just to hear more waste-of-time excuses that can’t be the real reason.”

    I am genuinely interested in what you think the real reason that I’m apparently not saying is. As I have no idea.

    P.S. The two comments on your latest blog post are both spam.

  • 5. Julian replies at 23rd June 2008, 2:15 pm :

    According to a search of my emails, I was posting in detail about this very feature, including the use of white-lists in the way you have just realized is obvious, in an email to the TheyWorkForYou private list entitled “Get theyworkforyou to support blogs and wikipedia.” on 15/12/2006 at 19:34.

    I don’t think it’s paranoid to recognize that a remarkable quantity of my attempted contributions fall straight through the cracks sight-unseen. It’s just natural human behavior in the main — some individuals have the charisma such that people hang on their every word (whether or not they have something to say), and others don’t.

    It’s just life.

    The bit that’s irritating is that many important developments don’t get considered, because they happen to be said by someone with the “wrong stuff” in the “wrong way”, so that most human minds by reflex come up with the answer “No”, and then spend the next 10 microseconds grabbing for whatever lame excuse comes to hand immediately, and then forget about it.

    That’s the psychological chain of events in the order that they occur.

    It never reaches the stage of:

    “Ah, that might be quite an important thing to do when we get time — I wonder if we think about it for a few moments there might be a slick way of achieving the result?”

    — Does it?

    You just get a list of: oh that bit’s too hard and might do some slowing down, and this bit’s not as simple as he says, and then there’s one tiny, temporary flaw in an otherwise good idea that means we shouldn’t try it, and anyway it’s only Julian who wants this feature, so he can piss-off and do it himself.

    You know. Whatever.

