UN parsing roundup

Thursday, March 13th, 2008 at 12:23 pm

Uh-oh. An article in the Guardian showed up today with my name in it.

[Update: The words misattributed to me in the article were by Stefan Megdalinski (note spelling). The UN does in fact televise itself, because this is easy to do, but it’s like CCTV footage: it needs editing for highlights. Contributions are needed specifically to redo the website (including its features), which I hacked together in a hurry. This would give me the time and inclination to get back to the parser.]

I completed the “laborious” work of getting the General Assembly meeting 75 and meeting 76 of Session 62 scraped and parsed this morning. Unlike the Security Council reports, which come on-line within hours, the General Assembly transcripts are always months late; these ones represent the afternoon and morning sessions for 17-18 December 2007.

It’s taken an unusual couple of hours to push it through onto the web-page, because I have a very temperamental parser that is able to pick out the most extraordinarily obscure and completely invisible problems in this pair of very complicated days involving 34 recorded votes.

Problems included:

Strange characters

For example, the highlight on page 2 of A-62-PV.76 looks like “A/C.2/62/INF/1”, doesn’t it? But go to the third page of the corresponding PDF file and try to copy-and-paste that symbol, and you’ll find it’s not a “C”. In some word-processors it shows up as a slightly different “C”-shaped object, while in others it just comes out as a “?” or nothing. A Unicode detective could track down where this blemish came from. Maybe it’s a symbol available on the Khmer version of Word which, idiotically, inserted itself into a word-completion, and then the symbol was copy-and-pasted from one email to the next by secretaries in an unbroken chain until it wound up in this document. It is almost certain that I am the only person in the world to find it a problem, because everywhere else this reference is dereferenced, it’s done by the human eye.
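
As a rough sketch of the detective work (not the code the parser actually uses), a few lines of Python can list every non-ASCII character in a document symbol together with its Unicode name. The `LOOKALIKES` table and both function names are made up for illustration, and the sample string assumes the offending character is the Cyrillic capital letter Es, as suggested in the comments below.

```python
import unicodedata

# Hypothetical repair table: lookalike letters mapped to the Latin letter they imitate.
LOOKALIKES = {"\u0421": "C"}  # CYRILLIC CAPITAL LETTER ES


def find_non_ascii(text):
    """Return (index, character, Unicode name) for every non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


def normalise_symbol(text):
    """Replace known lookalike characters so the document symbol can be matched."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in text)


# Assumed sample: the symbol as it might come out of the PDF, with a Cyrillic Es.
suspect = "A/\u0421.2/62/INF/1"
for index, ch, name in find_non_ascii(suspect):
    print(index, repr(ch), name)
print(normalise_symbol(suspect) == "A/C.2/62/INF/1")
```

Reporting the character name first, and only then repairing it through an explicit table, keeps the fix from silently swallowing a character nobody has looked at yet.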

Now, wouldn’t it be much more convenient if there were hyperlinks within the on-line versions of the documents themselves?

Unexplained misspellings

Right in the middle of this vote, I get “Marwill Islands”. Obviously this is meant to be the “Marshall Islands”. But before you think about how explicable this typo is in relation to the layout of the QWERTY keyboard, ask why these country names would be typed in anyway. The whole voting procedure is conducted electronically using buttons and a big board full of lights (see the transcript of a cock-up involving that system from 12 December 1995), so how hard would it be for the system to email an electronic page of the votes, all laid out properly, in which country names are either never misspelt, or always misspelt the same way every single time?

Either the procedure in the UN is to retype the entire list of votes for the transcripts (a job which could take days of unnecessary work), or someone has got to explain how mistakes like this can happen.
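
Until the transcripts come with machine-generated names, one hedge is to match each name against a list of known member states and accept the closest candidate. The sketch below uses `difflib` from the Python standard library; the short `KNOWN_COUNTRIES` list, the function name and the 0.8 cutoff are assumptions for illustration, not what the parser really does.

```python
import difflib

# Illustrative subset only; a real list would hold every member state.
KNOWN_COUNTRIES = [
    "Marshall Islands",
    "Solomon Islands",
    "Mauritius",
    "Maldives",
    "Nauru",
]


def canonical_country(name, known=KNOWN_COUNTRIES, cutoff=0.8):
    """Return the known country closest to `name`, or None if nothing is close enough."""
    if name in known:
        return name
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(canonical_country("Marwill Islands"))
```

Returning `None` rather than a best guess means an unrecognisable name stops the run and gets looked at by a human, which seems safer than recording a vote against the wrong country.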

Indentation problems

It’s important to be strict about where the new paragraph lies because when it says “A recorded vote was taken”, those are not the final words at the end of the previous person’s speech. It’s a signal to engage the vote parser. In the PDF file you sometimes get an actual line indentation (it starts on pixel 504 rather than 468), and sometimes the line starts on pixel 468 and they add 2 spaces to indent it! Looks the same to the eye, but not to the computer.
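
A minimal sketch of a test that treats both kinds of indent alike, assuming each line extracted from the PDF arrives as a hypothetical `(x, text)` pair, where `x` is the left position of the line. The two margins come from the figures above; the tolerance and the function names are invented for this example.

```python
LEFT_MARGIN = 468    # x position of an ordinary line
INDENT_MARGIN = 504  # x position of a genuinely indented line


def starts_paragraph(x, text, tolerance=2):
    """True if a line extracted from the PDF opens a new paragraph."""
    # Real indent: the line itself is shifted to the right.
    if abs(x - INDENT_MARGIN) <= tolerance:
        return True
    # Fake indent: the line sits at the left margin but is padded with spaces.
    return abs(x - LEFT_MARGIN) <= tolerance and text.startswith("  ")


def engages_vote_parser(x, text):
    """True only when the vote marker opens a paragraph of its own."""
    return starts_paragraph(x, text) and text.strip().startswith(
        "A recorded vote was taken"
    )


print(engages_vote_parser(504, "A recorded vote was taken."))
print(engages_vote_parser(468, "  A recorded vote was taken."))
print(engages_vote_parser(468, "A recorded vote was taken."))
```

The last call is the case to guard against: the same words at the left margin with no padding are treated as the tail of somebody's speech, not as a signal.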

Voting corrections

Oh, and finally when you have a hard day of voting like this, there are dozens of cock-ups, which manifest as:

[Subsequently, the delegations of Bolivia, Burkina Faso and the Sudan advised the Secretariat that they had intended to vote in favour; the delegation of the United States of America advised the Secretariat that it indented to vote against.]

In spite of dozens of variations of these words, and country names containing several words (sometimes with an “and” in the name, just to stop the nth version of the software from working), this “advice to the Secretariat” is so common that it can’t be handled by hand and has needed a special program.
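
To give a flavour of what such a program has to cope with, here is a hedged sketch in Python, not the real one. It assumes the corrections follow the pattern of the example quoted above, tolerates the “indented” spelling found there, and shields a hypothetical list of country names containing “and” before splitting the list. All names in the code are invented for illustration.

```python
import re

# Illustrative subset of names containing "and"; a full list is assumed to exist.
NAMES_WITH_AND = ["Trinidad and Tobago", "Bosnia and Herzegovina", "Antigua and Barbuda"]

# One clause per group of delegations; "had" is optional and "indented" is tolerated.
CLAUSE = re.compile(
    r"the delegations? of (?P<countries>.+?) advised the Secretariat that "
    r"(?:it|they) (?:had )?in(?:tend|dent)ed to (?:vote )?"
    r"(?P<intent>in favour|against|abstain)"
)


def split_countries(listing):
    """Split 'A, B and the C' into ['A', 'B', 'C'], keeping names like 'X and Y' whole."""
    for name in NAMES_WITH_AND:
        listing = listing.replace(name, name.replace(" and ", " & "))
    parts = re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", listing)
    cleaned = []
    for part in parts:
        part = re.sub(r"^the\s+", "", part.strip())  # "the Sudan" -> "Sudan"
        if part:
            cleaned.append(part.replace(" & ", " and "))
    return cleaned


def parse_corrections(text):
    """Return (country, intended vote) pairs found in a correction note."""
    return [
        (country, match.group("intent"))
        for match in CLAUSE.finditer(text)
        for country in split_countries(match.group("countries"))
    ]


SAMPLE = (
    "[Subsequently, the delegations of Bolivia, Burkina Faso and the Sudan "
    "advised the Secretariat that they had intended to vote in favour; "
    "the delegation of the United States of America advised the Secretariat "
    "that it indented to vote against.]"
)
for country, intent in parse_corrections(SAMPLE):
    print(country, "->", intent)
```

Shielding the awkward names before splitting is only one way to do it; it relies on the shield list being complete, so any country missing from it will still be cut in two at its “and”.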


So, what does this mean? Lots of votes means lots of information. What’s happening is that the resolutions discussed in the Special Committees are all coming to the floor of the General Assembly in a stream to receive their general votes.

Session 62 Meeting 75 begins with agreements by consensus on Assistance on mine action, Effects of atomic radiation, and International cooperation in the peaceful uses of outer space — these generally schedule what the United Nations is going to do on these topics (usually conduct more studies and write more reports).

That last one was something about the Registration Convention. Where are all the space law nerds when you need them? I had to start that Wikipedia page myself. Don’t you think it’s cool that every space object will now have its own web page?

The next resolution took a vote to extend the mandate of the United Nations Relief and Works Agency for Palestine Refugees in the Near East. Nauru, an island state in the Pacific with probably the least interest in this issue of anywhere in the world outside of Antarctica, voted against. Sometimes you wonder whether these smallest states haven’t outsourced their Ambassadorial services to some random New York law firm.

And so it goes on for another dozen votes, with Pacific island states taking a suspicious interest in the Palestinian question. You can look through that yourself.

The meeting for 18 December 2007 does more refugees and tonnes more resolutions about women, all adopted by consensus.

But as usual, when it comes to Children’s Rights, that’s just a step too far for the United States, and they are the only country in the world to vote against. After all, young people are there to be ripped off and forced into debt before they grow old and wise enough to spot this unmitigated, miserable, cruel, mindless trap we, the older generations, lay for them embodied within the entire financial system, because we know that they come into this world penniless and with a naive sense of justice and fair-play that can be abused starting with their first bank-loan.

However, when it comes to the “Use of mercenaries”, everyone in Europe is against restricting that. (Note, the actual text of that resolution can be found here.)

The objectors to a “Moratorium on the use of the death penalty” have a non-European flavour.

“Periodic and genuine elections” has 13 abstainers on the “fifth preambular paragraph”. I think they mean this one.

And so on with a lot of other divided votes on issues throughout the day. But I ran out of time for this today long ago.


  • 1. Lee Maguire replies at 13th March 2008, 4:01 pm :

    I believe you’ll find the correct spelling is “Mugdalinski”.

  • 2. Matthew replies at 14th March 2008, 8:45 am :

    The strange character is the Cyrillic capital letter Es, which looks just like a C. is very interesting, here’s the Wikipedia article on the US’s position:

  • 3. Julian replies at 14th March 2008, 12:54 pm :

    As I thought, it’s due to the religious home-schooling movement (“a child is for life, not just for christmas”). It’s a really strange country.

    There’s got to be a project that makes the treaties that are signed or unsigned easy to look up, as well as make clear their purpose — to improve the quality of governance on issues that tend to fall below the radar screen for most governments.

    I mean, there’s no Convention on the Adequate Payment of Parliamentarians, because we don’t need it — the issue always seems to look after itself!

  • 4. Will Cox replies at 19th March 2008, 1:06 pm :

    Julian, I love what you’re doing with this undemocracy project: very cool.

    With regard to the vote on the convention on the rights of the child, it’s not so much the “religious home-schooling movement” but a particular view, somewhat inconsistently applied, of the relationship of the individual and the various layers of government (of which the U.N. is the least important and most annoying). Essentially, it’s none of the government’s business, so butt out.

    Note that many of the countries which did agree to the convention have a horrible record with regard to children. I am sure that Sudan, for example, is a paragon of virtue. Hopefully being a signatory is not an entirely hypocritical act.

  • 5. Julian replies at 19th March 2008, 4:10 pm :

    Thanks for the compliment.

    I singled out the home-schooling movement in particular because they were the only objector who actually seemed to have bothered to read the Convention, before asserting that it was none of the government’s business.

    The United Nations is not a government body; it is merely a permanent forum for establishing international norms such as, for example, the Optional Protocol to the Convention on the Rights of the Child on the Sale of Children, Child Prostitution and Child Pornography. (See that slight nuanced interpretation on the meaning of child pornography.)

    I don’t suppose too many reasonable people are willing to defend the view that it’s no one’s business when people sell children into prostitution — an infringement against the rights of the child if ever there was one. And, indeed, the United States has even signed it — maybe because anyone reading only the title knew they had to support it.

    There’s a theory that the real objection lies with Article 37 which forbids applying the death penalty to children, seeing as the United States has the highest documented rate for executions of child offenders. I don’t know where you stand on this issue, but clearly it’s something unusual in American culture that would need to be faced up to before it signed up.

    While it is true that many governments do seriously mistreat people in their custody, it makes a big difference if this is done within the law or outside the law, because if there is a law there is a second chance.

    I do know that if I were a child I would be happier to grow up in a United States that was willing to sign and implement this Convention than in one that wasn’t. The same applies for growing up in a Sudan that does sign or does not sign the Convention. One stands a better chance with a hypocrite who will be embarrassed if he is exposed, than with someone who doesn’t care if everyone sees he’s evil.

    I don’t see why it’s worth making the comparison between a United States that doesn’t sign the Convention, with a Sudan that does. That’s like arguing that a pan lid is no use because you can boil water better in a saucepan without a lid, than in a chocolate teapot with a lid.

  • 6. 4iP’s New £500k Sl… replies at 5th April 2012, 9:45 pm :

    […] Started by TheyWorkForYou, PublicWhip and UNDemocracy civic hacker Julian Todd, it’s in alpha right now but, due for live in February, it will be a […]
