Freedom of information and metadata
Date: Wed, 14 May 2008 09:50:27 +0000 (UTC)
It would be interesting to hear your comments and opinions on the following.
Essentially, I'm struggling with the problem that when we want to get information out of the government, it can be difficult because we don't know if the information exists or if it's in the right format.
My tentative solution is to turn the idea of having a database full of information on its head. Instead, we would have a database describing information we don't have but would like to have.
The problem is that there seems to be a distinct obstacle to holding governments accountable using Freedom of Information requests.
This was highlighted yesterday, when the EU Ombudsman announced a consultation to see what people think about databases as documents.
The public currently has the legal right of access to some EU documents. However, in 2005 a Danish journalist requested access to a database about agricultural subsidies. His application was refused, on the grounds that a database is not a document.
But thinking of information in terms of "documents" seems old-fashioned. For example, this newsgroup message could be regarded as a document, but it would be better to treat it as content (the words I'm writing) plus metadata (extra information such as the date I wrote it, the message ID, the newsgroups line, and so on).
Nowadays, I think all information can be treated as having these two parts: content and metadata. The content is what we humans are interested in, and the metadata allows it to be organised and found.
Combining these two parts allows the creation of totally new "documents". For example, it could give the Danish journalist a "document" listing all agricultural subsidies paid to farmers within 50 miles of Aarhus in 2007. The content would be names and payment amounts, and the metadata used to create the list would be the date, the region, and so on.
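To make the Aarhus example concrete, here is a rough Python sketch of a "document" being assembled on the fly by selecting on metadata and returning only content. Everything in it is an assumption made for illustration: the Payment fields, the sample rows, the approximate coordinates for Aarhus and the function names are invented, and I have no idea how the real subsidy database is structured.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Payment:
    # Content: what a human reader is interested in.
    recipient: str
    amount_eur: float
    # Metadata: what allows the record to be organised and found.
    year: int
    lat: float
    lon: float


def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, using the haversine formula."""
    earth_radius_miles = 3958.8
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_miles * asin(sqrt(a))


AARHUS = (56.1629, 10.2039)  # approximate coordinates, assumed


def make_document(payments, year, centre, radius_miles):
    """Filter on metadata (year, location); return content (name, amount)."""
    return [
        (p.recipient, p.amount_eur)
        for p in payments
        if p.year == year
        and miles_between(p.lat, p.lon, *centre) <= radius_miles
    ]


# Invented sample rows, for illustration only.
PAYMENTS = [
    Payment("Farm A", 12000.0, 2007, 56.20, 10.10),  # near Aarhus
    Payment("Farm B", 8000.0, 2006, 56.20, 10.10),   # near Aarhus, wrong year
    Payment("Farm C", 15000.0, 2007, 55.68, 12.57),  # Copenhagen area
]

document = make_document(PAYMENTS, 2007, AARHUS, 50)
```

The list in `document` never existed as a stored "document"; it only comes into being when the question is asked, which is exactly the case the refusal turned on.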
However, this presents problems to people seeking freedom of information.
Firstly, as we've already seen, the EU doesn't regard documents made on the fly as documents. They say "we do not have this information", which really means "we have this information, but it's not compiled into a format we regard as information".
Secondly, researchers can't be sure in advance how the content will look or what it will reveal, so they're very often forced to make broad requests. In the UK, broad requests are likely to result in refusals on the grounds of expense (though I'm not sure how other countries handle this).
Thirdly, one applicant might be researching information identical or similar to what another applicant is seeking. A second journalist might be researching agricultural subsidies in France, for example, or even trying to discover the same information about Aarhus as the first journalist.
So (if you've read this far) I'm wondering what are the conceptual advantages and disadvantages of creating a metadata of information that doesn't exist :-). By this I mean a database of potential "documents" not yet in existence, but which a little bit of database manipulation could conjure into existence.
For example, the Danish journalist would make a record in this database describing the information he wants in a structured way. He would send it to the EU and they'd simply run it through their computers and send the answer back the next day. Even better, he'd do it online for himself.
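As a rough sketch of what such a record might look like, here is some Python. All of it is hypothetical: the InformationRequest fields, the dataset name, the operator set and the sample rows are my own assumptions, not any real EU system or schema. The fingerprint is there because of the third problem above: two applicants who describe the same information in a different order should end up with the same record.

```python
import hashlib
import json
import operator
from dataclasses import dataclass

# Comparison operators a request is allowed to use (an assumed, minimal set).
OPERATORS = {"==": operator.eq, "<=": operator.le, ">=": operator.ge}


@dataclass(frozen=True)
class InformationRequest:
    dataset: str    # hypothetical dataset name
    fields: tuple   # content columns the applicant wants back
    filters: tuple  # (metadata field, operator, value) triples

    def fingerprint(self):
        """Order-independent ID, so identical requests can be matched."""
        canonical = {
            "dataset": self.dataset.strip().lower(),
            "fields": sorted(self.fields),
            "filters": sorted(json.dumps(list(f)) for f in self.filters),
        }
        text = json.dumps(canonical, sort_keys=True)
        return hashlib.sha256(text.encode("utf-8")).hexdigest()


def run(request, rows):
    """Keep rows passing every filter, then keep only the requested fields."""
    selected = [
        row for row in rows
        if all(OPERATORS[op](row[field], value)
               for field, op, value in request.filters)
    ]
    return [{name: row[name] for name in request.fields} for row in selected]


# Invented sample rows, for illustration only.
ROWS = [
    {"recipient": "Farm A", "amount_eur": 12000.0, "year": 2007, "region": "Aarhus"},
    {"recipient": "Farm B", "amount_eur": 8000.0, "year": 2006, "region": "Aarhus"},
]

first = InformationRequest(
    "agricultural_subsidies",
    ("recipient", "amount_eur"),
    (("year", "==", 2007), ("region", "==", "Aarhus")),
)
# A second journalist asks for the same thing, written in a different order.
second = InformationRequest(
    "Agricultural_Subsidies",
    ("amount_eur", "recipient"),
    (("region", "==", "Aarhus"), ("year", "==", 2007)),
)

answer = run(first, ROWS)
same_request = first.fingerprint() == second.fingerprint()
```

Because the record is narrow and explicit, it also sidesteps the broad-request problem: the applicant states exactly which fields and filters are wanted instead of asking for everything.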
Would such a metadata of non-existent information work? Would it provide a solution to any of the problems described above? What would the metadata requirements be? In the abstract, have people worked on this concept before, and if so what results have they achieved?
Received on Wed May 14 2008 - 11:50:27 CEST