UN parsing roundup
Thursday, March 13th, 2008 at 12:23 pm
Uh-oh. An article in the Guardian showed up today with my name in it.
[Update: The words I misattributed to myself in the article were by Stefan Megdalinski (note spelling). The UN does in fact televise itself, because this is easy to do, but it's like CCTV footage -- it needs editing for highlights. Contributions are needed specifically to redo the website (including its features), which I hacked in a hurry. This would give me time and inclination to get back to the parser.]
I completed the “laborious” work of getting the General Assembly meeting 75 and meeting 76 of Session 62 scraped and parsed this morning. Unlike the Security Council reports, which come on-line within hours, the General Assembly transcripts are always months late; these ones represent the afternoon and morning sessions for 17-18 December 2007.
It’s taken an unusual couple of hours to push it through onto the web-page, because I have a very temperamental parser that is able to pick out the most extraordinarily obscure and completely invisible problems in this pair of very complicated days involving 34 recorded votes.
For example, the highlight on page 2 of A-62-PV.76 looks like “A/C.2/62/INF/1”, doesn’t it? But go into the third page of the corresponding PDF file and try to copy-and-paste that symbol, and you’ll find it’s not a “C”. In some word-processors it shows up as a slightly different “C”-shaped object, while in others it just gets a “?” or nothing. A Unicode detective could track down where this blemish came from. Maybe it’s a symbol available on the Khmer version of Word which, idiotically, inserted itself into a word-completion, and then the symbol was copy-and-pasted from one email to the next by secretaries in an unbroken chain until it wound up in this document. It is almost certain that I am the only person in the world to find it a problem, because everywhere else this reference is dereferenced it’s done by the human eye.
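For anyone wanting to play Unicode detective, here is a minimal sketch of how such a lookalike could be flagged. The function name is made up for illustration, and since the actual character in the PDF is not identified here, the example uses a Cyrillic capital ES purely as a stand-in.

```python
import unicodedata

def suspicious_chars(text):
    """Return (index, char, code point, Unicode name) for every non-ASCII character."""
    found = []
    for i, ch in enumerate(text):
        if ord(ch) > 127:
            name = unicodedata.name(ch, "UNKNOWN")
            found.append((i, ch, f"U+{ord(ch):04X}", name))
    return found

# Stand-in example (an assumption, not the real culprit): the character after
# the first slash is CYRILLIC CAPITAL LETTER ES, which looks like a Latin "C".
symbol = "A/\u0421.2/62/INF/1"
for hit in suspicious_chars(symbol):
    print(hit)
```

Running something like this over every extracted document symbol before trying to resolve it would at least say which character is the odd one out.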
Now, wouldn’t it be much more convenient if there were hyperlinks within the on-line versions of the documents themselves?
Right in the middle of this vote, I get “Marwill Islands”. Obviously this is meant to be the “Marshall Islands”. But before you think about how explicable this typo is in relation to the layout on the qwerty keyboard, ask why these country names would be typed in anyway. The whole voting procedure is conducted electronically using buttons and a big board full of lights (see the transcript of a cock-up involving that system from 12 December 1995), so how hard would it be for the system to email an electronic page of the votes, all laid out properly, in which country names are either never misspelt, or always misspelt the same way every single time?
Either the procedure in the UN is to retype the entire list of votes for the transcripts (a job which could take days of unnecessary work), or someone has got to explain how mistakes like this can happen.
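On the receiving end, a typo like “Marwill Islands” can be caught by fuzzy-matching each name against a canonical list. This is only a sketch using Python's standard `difflib`; the short list, the function name and the 0.75 cutoff are assumptions for illustration, not what the parser actually does.

```python
import difflib

# A tiny hypothetical subset; a real list would hold every member state.
CANONICAL = ["Marshall Islands", "Solomon Islands", "Mauritius", "Nauru", "Sudan"]

def canonical_country(name, cutoff=0.75):
    """Return the closest canonical name, or None if nothing is close enough."""
    matches = difflib.get_close_matches(name, CANONICAL, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(canonical_country("Marwill Islands"))  # closest entry is "Marshall Islands"
print(canonical_country("Atlantis"))         # nothing close enough, so None
```

The cutoff is a trade-off: too low and unrelated island states start matching each other, too high and the stranger typos slip through to be fixed by hand.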
It’s important to be strict about where the new paragraph lies because when it says “A recorded vote was taken”, those are not the final words at the end of the previous person’s speech. It’s a signal to engage the vote parser. In the PDF file you sometimes get an actual line indentation (it starts on pixel 504 rather than 468), or the line starts on pixel 468 and they add 2 spaces to indent it! Looks the same to the eye, but not to the computer.
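The two kinds of indentation can be folded into one check. The sketch below assumes each extracted line arrives as an x position plus its text, which is a guess at the representation; the 468 and 504 values are the ones quoted above, and the function names are invented.

```python
LEFT_MARGIN = 468   # x position of an unindented line
INDENTED_X = 504    # x position of a line with a real indentation

def starts_paragraph(x, text):
    """True if the line is indented, either by position or by two leading spaces."""
    if x >= INDENTED_X:
        return True
    return x == LEFT_MARGIN and text.startswith("  ")

def is_vote_signal(x, text):
    """True if this line opens a new paragraph announcing a recorded vote."""
    return starts_paragraph(x, text) and text.strip().startswith("A recorded vote was taken")

print(is_vote_signal(504, "A recorded vote was taken."))    # real indentation
print(is_vote_signal(468, "  A recorded vote was taken."))  # fake indentation with spaces
print(is_vote_signal(468, "A recorded vote was taken."))    # tail end of someone's speech
```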
Oh, and finally when you have a hard day of voting like this, there are dozens of cock-ups, which manifest as:
[Subsequently, the delegations of Bolivia, Burkina Faso and the Sudan advised the Secretariat that they had intended to vote in favour; the delegation of the United States of America advised the Secretariat that it indented to vote against.]
In spite of dozens of variations of these words, and countries whose names contain several words (sometimes with an “and” in the name, just to stop the nth version of the software from working), this “advice to the Secretariat” is so common that it can’t be done by hand and has needed a special program.
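To give a flavour of what that special program has to cope with, here is a rough sketch rather than the real thing. The regular expression, the short list of names containing “and”, and the function names are all assumptions; it tolerates the “indented” spelling and the missing “had” seen in the example above, and it drops a leading “the” from country names as a simplification.

```python
import re

# Hypothetical subset of country names that contain "and".
NAMES_WITH_AND = ["Trinidad and Tobago", "Antigua and Barbuda", "Bosnia and Herzegovina"]

CLAUSE = re.compile(
    r"the delegations? of (?P<countries>.+?) advised the Secretariat that "
    r"(?:it|they) (?:had )?in[dt]en[dt]ed to (?P<intent>vote in favour|vote against|abstain)"
)

def split_countries(text):
    """Split 'A, B and C' into names, keeping names like 'Trinidad and Tobago' whole."""
    # Shield the "and" inside known names so it is not treated as a separator.
    for name in NAMES_WITH_AND:
        text = text.replace(name, name.replace(" and ", " & "))
    names = []
    for part in re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", text):
        part = part.strip().replace(" & ", " and ")
        if part:
            names.append(re.sub(r"^the\s+", "", part))  # simplification: drop "the"
    return names

def parse_advice(line):
    """Return (country, intended vote) pairs from a 'Subsequently, ...' note."""
    results = []
    for m in CLAUSE.finditer(line):
        for country in split_countries(m.group("countries")):
            results.append((country, m.group("intent")))
    return results

note = ("[Subsequently, the delegations of Bolivia, Burkina Faso and the Sudan "
        "advised the Secretariat that they had intended to vote in favour; "
        "the delegation of the United States of America advised the Secretariat "
        "that it indented to vote against.]")
for pair in parse_advice(note):
    print(pair)
```

Shielding the known names first is the crude part: any country with an “and” that is missing from the list will still be split in two, which is exactly the kind of thing that breaks the nth version.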
So, what does this mean? Lots of votes means lots of information. What’s happening is that the resolutions discussed in the Special Committees are all coming to the floor of the General Assembly in a stream to receive their general votes.
Session 62 Meeting 75 begins with agreements by consensus on Assistance on mine action, Effects of atomic radiation, and International cooperation in the peaceful uses of outer space — these generally schedule what the United Nations is going to do on these topics (usually conduct more studies and write more reports).
That last one was something about the Registration Convention. Where are all the space law nerds when you need them? I had to start that Wikipedia page myself. Don’t you think it’s cool that every space object will now have its own web page?
The next resolution took a vote to extend the mandate of the United Nations Relief and Works Agency for Palestine Refugees in the Near East. Nauru, an island state in the Pacific with probably the least interest in this issue of anywhere in the world outside of Antarctica, voted against. Sometimes you wonder whether these smallest states haven’t outsourced their Ambassadorial services to some random New York law firm.
And so it goes on for another dozen votes, with Pacific island states taking a suspicious interest in the Palestinian question. You can look through that yourself.
The meeting for 18 December 2007 does more refugees and tonnes more resolutions about women, all adopted by consensus.
But as usual, when it comes to Children’s Rights, that’s just a step too far for the United States, and they are the only country in the world to vote against. After all, young people are there to be ripped off and forced into debt before they grow old and wise enough to spot this unmitigated, miserable, cruel, mindless trap we, the older generations, lay for them, embodied within the entire financial system, because we know that they come into this world penniless and with a naive sense of justice and fair-play that can be abused starting with their first bank-loan.
The objectors to a “Moratorium on the use of the death penalty” have a non-European flavour.
And so on with a lot of other divided votes on issues throughout the day. But I ran out of time for this today long ago.