
Volunteer to

Friday, March 7th, 2008 at 6:11 pm

I received a message through the emailbag:

…I’m interested in learning more and doing what I can to volunteer at

Probably the easiest thing to do is find a Wikipedia article that needs updating with citations to the official documents. For a good simple example, look at UNMIN.

A big project would be to work on AMISOM, because there was a United Nations Security Council Resolution passed that extended its mandate a week ago. (Read the meeting.)

Less challenging projects can be found under Category:United Nations observances to bring them all up to the standard of International Year of the Potato, by locating the meetings and General Assembly Resolutions that authorized them.

Those are just two ideas off the top of my head. It depends on what sort of UN activities you are interested in. I’ve designed it so that Wikipedia is the natural way to index the documents, and that’s where I expect there will eventually be a complete timeline of its interventions and of the documents that support them. This is always going to be better than a search engine for leading people (e.g. journalists) directly to the sources.

Please post any questions about the usability as you go along to the comments in this blog post. Nothing is too stupid or trivial, especially if it helps lead to some necessary clarification.


  • 1. Tom L replies at 13th March 2008, 1:59 pm :

    Done (at least before the gnome blight gets to it). The base quality of the stub article is bad, containing some confusing rubbish. I don’t relish taking on a re-write from scratch, but at least the UN doc stuff is now sorted and a legible timeline of UN involvement put in.

    I’ve learned a few things about undemocracy doing this task:

    a) I hadn’t kenned the PDF highlighting (and automated ref generation) until after I’d put in the base-level citations on the AMISOM page. If the SC resolutions are eventually parsed, will those citations be related to blocks/paragraphs in the resulting plain text?

    b) SC Resolutions don’t have a date or topic in the title, so finding the right one on this page can turn into a frustrating clickfest. Not sure how to solve this, since you usually have to read an SC res to find out what it’s about anyhow.

    c) Concerning the pages with SC resolutions, the “references to this document” trail is very helpful, but it might be worth carrying a little data from those pages (eg. the SC topic keyword) into the index page for a particular SC resolution. A potentially useful link is also missing in the reference list: draft resolutions that become approved resolutions. In SC debates there are usually a few paras where the draft is introduced, a vote occurs, and the draft is given a SCR number.

    d) The Security Council meetings, by topic page is in reverse chronological order, but the SC resolutions page lists items in chronological order.

    e) Both the SC and GA regularly allow people like el presidentes, foreign ministers, senior UN staffers and other worthies to address them. Would be interesting to know more about these people, through a link to their Wikipedia page.

    f) Where the President of the SC speaks, in verbatim transcripts, the flag of the office holder is shown, but not the name of the speaker. Sure that’s a small hack to add.

    Anyhow, these are first thoughts …

  • 2. Julian replies at 13th March 2008, 2:59 pm :

    Looks lovely. Thank you so much.

    As you can see, though the system is not great, it has attained a level of usability where you can get a clear idea of its potential and use it as a basis for discussing how to do it properly.

    Most of these points are definitely on the list of things to do, and would be quite easy to implement if there were a database behind it. However, I’ve hacked it as a nest of flat ASCII files all over the place, which makes improvements rather tricky. This was only ever intended as a prototype because I was experiencing too much difficulty getting across any of these ideas on paper.

    Future of Parsing.

    Transcripts, by their nature, are very simple documents (they contain titles, spoken words, stage directions, votes, and nothing else). Much can be gained through processing them to the very limit of what you can imagine, and then slicing and dicing them in all kinds of different ways.

    Documents such as Resolutions are very much more complex and varied. Their layout is usually very important (they sometimes contain tables). Parsing them fully doesn’t get you ahead, because the best you can do is rebuild them back into the same form they first arrived. So I don’t think any highlightings which are based on page pixels are going to go out of date.

    Document information extraction, on the other hand, is very worthwhile. This would include search terms, hyperlinks to other documents (see Job 1), date matching, and title detecting.

    However, without a database in which to upload this information where it would be displayed on a website, I’ve not had the motivation to start work on this part of the project.

  • 3. Tom L replies at 13th March 2008, 6:39 pm :

    The potential is very clear – I won’t say it was fUN, but it was certainly a useful thing to do.

    Like with any legal document, it’s quite common to find that Resolutions back-reference specific paragraphs from prior resolutions, and I can say it would be useful to be able to follow these chains through. You’d save lawyers a lot of tedium in piecing them together.

  • 4. Julian replies at 13th March 2008, 6:57 pm :

    They tend to link to the documents by their code. References to paragraphs or pages or anything more granular are extremely rare. The big low-hanging fruit is that Job 1 about inserting links into their PDFs.

    As a connoisseur of government-legal documents, I’d argue that the functioning of the UN is more than an order of magnitude clearer than, say, the Westminster Parliament, which creates nothing less than an explosion of difficult cross-references.

    One difference in culture, for example, is that the UN document will quote the paragraph from the earlier document, and include a reference to it.

    In Westminster they don’t want to waste paper and are duty-bound to be as awkward as possible, so they include a difficult reference, but don’t even hint at what it says.

    I think it’s something to do with the Anglo-legal culture — genuinely predictable misunderstandings due to a gratuitously difficult word-game are *your* fault, no matter how obvious it was that you would not have signed that contract had it been clarified at the time.

  • 5. Tom L replies at 13th March 2008, 7:45 pm :

    I agree that UN language is less obscure, and that UNSC resolutions are helpfully repetitive and more reliably interlinked than English gov-legal stuff. There is a finer grain than just the overall document citation though. See this page from Resolution 1801 on Somalia, for example. The top level citation is qualified with a paragraph reference. My experience is that these types of qualifying citation are more common than you say. And that’s why we need them all in a database, so we can find out.

  • 6. Julian replies at 13th March 2008, 8:19 pm :

    Ah- dang! You win!

    I am so pleased to have been beaten this way!

    Note how it is “Paragraph” as opposed to “Operative Paragraph”, which is the term for the numbered ones when they are discussing General Assembly Resolutions.

    Anyways, in my very informed judgment, I’d recommend avoiding parsing the resolutions. While I can commit to getting all meeting transcripts fully parsed sight unseen, I know there are going to be some fiendish cases in the Resolutions, which sometimes contain annexes. And as for non-resolution documents (letters and reports): no way.

    For sure: adding links and anchors into the PDF documents in some form should definitely be considered. But I don’t think parsing into text, reslicing, and regenerating with different layouts (as is done with the speeches) is going to be a wise move.

    We have to consider a scheme which gets it done, but doesn’t go the whole hog, so that it doesn’t need to be completely reliable — as the transcript parsing has to be.

    I’ve tried to design a system where the first person who follows up a link to another place is able to leave a forward link in the place he started from for the next person to use, so that this work needs to be done only once. It’s something that, for the moment, could more easily be solved in places like:

    where you have cross-references like “pursuant to the answer of 1 September 2004, Official Report, columns 683–4W”. But I’d need to find enough other people who saw the point of this before going further. You’ve got to think deep into the way we use the browser and stuff like its button feature to crack it in an efficient way.

  • 7. Tom L replies at 14th March 2008, 8:07 am :

    Aye, well, it’s not like SC Resolutions are large documents, unlike transcripts. The solution you’ve come up with – highlight and custom url – works well, so ignore my dim heckling.
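
The question in comments 3 to 6, how often a resolution cites a specific paragraph of an earlier one rather than just its number, is the kind of thing a light extraction pass could count without fully parsing the documents. Below is a minimal Python sketch of that idea, not the site's actual code. It assumes citations are worded like "resolution 1801 (2008)", optionally preceded by "paragraph N of"; the names `Citation`, `find_citations` and `qualified_share` are hypothetical, and the sample sentence is invented for illustration, not quoted from any real resolution.

```python
import re
from typing import List, NamedTuple, Optional


class Citation(NamedTuple):
    """One back-reference found in a resolution's text (hypothetical structure)."""
    resolution: int           # resolution number, e.g. 1801
    year: int                 # year given in brackets after the number
    paragraph: Optional[int]  # paragraph qualifier, or None if the citation is unqualified


# Assumed wording: "[paragraph N of] resolution NNNN (YYYY)".
# Real documents may phrase citations in other ways that this pattern will miss.
CITATION_RE = re.compile(
    r"(?:paragraph\s+(?P<para>\d+)\s+of\s+)?"
    r"resolution\s+(?P<num>\d+)\s*\((?P<year>\d{4})\)",
    re.IGNORECASE,
)


def find_citations(text: str) -> List[Citation]:
    """Return every resolution citation in text, in order of appearance."""
    found = []
    for match in CITATION_RE.finditer(text):
        para = match.group("para")
        found.append(
            Citation(
                resolution=int(match.group("num")),
                year=int(match.group("year")),
                paragraph=int(para) if para is not None else None,
            )
        )
    return found


def qualified_share(citations: List[Citation]) -> float:
    """Fraction of citations that carry a paragraph qualifier (0.0 if there are none)."""
    if not citations:
        return 0.0
    qualified = sum(1 for c in citations if c.paragraph is not None)
    return qualified / len(citations)


# Invented sample text with deliberately fictitious resolution numbers.
sample = (
    "Recalling its resolution 9001 (2099), and acting pursuant to "
    "paragraph 9 of resolution 9001 (2099), the Council decides to remain seized."
)

if __name__ == "__main__":
    citations = find_citations(sample)
    for c in citations:
        print(c)
    print("share with paragraph qualifier:", qualified_share(citations))
```

Run over the plain text of every resolution and loaded into a database, output like this would settle how common paragraph-qualified citations really are, while leaving the page-pixel PDF highlighting untouched.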
