iHub By Conrad Akunga / November 12, 2012
The Trouble With Open Data
Unless you have been living under the proverbial rock, you must have come across the term OpenData, either professionally or personally. However, as with many novel concepts penetrating the local ICT lexis, much more heat than light has been generated about OpenData.
The premise is simple. Government is the largest collector and custodian of data. This data is usually sitting ignored in silos across government bodies. Suppose this raw data were exposed to citizens, so that innovative applications could be built to dissect, view and extrapolate additional intelligence from it?
The ICT Board has long been a stellar champion of this concept, the genesis of which was laid out on an ICT policy discussion mailing list, KICTANET, culminating in the development and launch of the portal, http://www.opendata.go.ke.
Since the launch of the portal, there have been several initiatives to attempt to make use of the data and derive new intelligence.
However, the question arises – have we really fully exploited the concept of OpenData? I would postulate that no, we have not come close to harnessing it to the full. And here is why.
Seeing The Trees, Missing the Forest
For some reason, whether by accident or by design, the definition of OpenData seems to have been somehow restricted to government data. This is an unnecessary and arbitrary restriction.
Businesses can also harness this data and dissect, transform and combine it to yield additional intelligence. Supermarkets for example collect a lot of information every time a customer is at the till. Consider what business intelligence can be derived if this data is cross-referenced with anonymized customer information.
A forward thinking business can mine this information and the resulting intelligence can guide it in various ways:
- Customer segmentation
- Product inventory level optimization
- Product matching
- Purchase patterns
The business does not need to invest in its own capacity for this. It can simply upload its data publicly and incentivize developers and statisticians to mine this information and provide new insights.
Dependence On Government Benevolence
One of the first questions that arose after the launch of the OpenData portal was: how does one request data that is not currently there?
The official answer was to write a “request” for what datasets one requires and await feedback.
This, unfortunately, is not good enough. Not nearly good enough. Because this means someone decides whether or not to accede to a request for data. This in effect means the data on the portal is subject to somebody’s benevolence. This is neither scalable, sustainable nor transparent.
For instance, during the Olympics Kenya sent an absurdly large delegation of officials to London. I wrote a request for the following:
1. Who exactly traveled to London at taxpayer expense?
2. How much was spent on travel, accommodation, per diems and other expenses for each of these?
As of now I am yet to receive anything other than a promise to write to the relevant parties to request this information. This I find wanting.
If we are committed to true transparency and openness of data, we should be operating on two simple premises:
1. The public has a right to details of any and all public expenditure, except of course for security spending and a few other exceptions. After all, it is not government money. It is public money.
2. This data should be continuously and automatically uploaded. It should not await a request.
With these two premises in place, it will be possible for the public to aggressively audit government on various levels – expenditure, income, personnel, projects, etc.
Imagine being able to quickly answer questions such as:
- What is the total expenditure on tea and refreshments across all ministries?
- What is the total expenditure on travel across government, broken down by air, road, rail and sea?
- What percentage of government expenditure is recurrent versus developmental across government bodies?
- What is the rate of hiring across government by quarter?
- How many airline tickets does government purchase per quarter, and from which airlines?
- What are all the aggregated line item expenditures, broken down across ministries?
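As a rough illustration of how questions like these might be answered if line-item expenditure were published in machine-readable form, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the records, the field names (ministry, category, kind, amount_kes) and the figures are invented, and do not describe any dataset actually on the portal.

```python
from collections import defaultdict

# Hypothetical line-item records. Every field name and figure here is
# invented for illustration; no real government data is represented.
RECORDS = [
    {"ministry": "Ministry A", "category": "tea_and_refreshments",
     "kind": "recurrent", "amount_kes": 120_000},
    {"ministry": "Ministry A", "category": "travel_air",
     "kind": "recurrent", "amount_kes": 900_000},
    {"ministry": "Ministry B", "category": "tea_and_refreshments",
     "kind": "recurrent", "amount_kes": 80_000},
    {"ministry": "Ministry B", "category": "road_construction",
     "kind": "development", "amount_kes": 2_900_000},
]


def total_by(records, key):
    """Sum amounts grouped by the given field, e.g. 'ministry'."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["amount_kes"]
    return dict(totals)


def total_for_category(records, category):
    """Total spend on one category across all ministries."""
    return sum(r["amount_kes"] for r in records if r["category"] == category)


def recurrent_share(records):
    """Percentage of total expenditure that is recurrent."""
    total = sum(r["amount_kes"] for r in records)
    recurrent = sum(r["amount_kes"] for r in records if r["kind"] == "recurrent")
    return 100.0 * recurrent / total if total else 0.0


if __name__ == "__main__":
    print(total_for_category(RECORDS, "tea_and_refreshments"))
    print(total_by(RECORDS, "ministry"))
    print(recurrent_share(RECORDS))
```

The point of the sketch is only that, once the raw line items are published continuously, each question reduces to a few lines of grouping and summing that any citizen with basic tools could run.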
Leading By Example
The most obvious dataset I would expect government to release is the complete raw results of the last census, anonymized of course, so that the data is no longer personally identifiable.
This would be a very powerful tool for
- Policy makers
- Actuarial scientists
- Computer scientists
The potential of mining all that information – across all axes – demographic, health, agriculture, religion, employment, etc. and deriving new insights is breathtaking.
Policy bodies, government institutions, businesses, health practitioners and many other stakeholders would be ready and willing to fund initiatives to develop applications and tools to derive this intelligence from the raw data.
This dataset is still not forthcoming.
OpenData datasets are, by definition, outputs. The question then arises, what are the inputs?
One of the biggest problems government faces is automation. Many of us know only too well that not all government institutions are making full use of ICT. One of the best examples is the Kenya Police Service.
Those of us who have visited a police station will know that most still use manual quire books to keep records such as the occurrence book.
How many of us have had the experience of a relative or a friend who has failed to return home, and been forced to look for them? You must physically visit every police station and read through the occurrence book.
Imagine police stations automate, and the OB is online. You can simply search by ID number or name of your friend or relative and the tool will tell you
- What police station they are in
- Why they have been arrested
- What you need to do next
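To make the idea concrete, here is a minimal Python sketch of what such a lookup might look like. It is purely hypothetical: the OBEntry fields, the sample entries and the search_ob function are invented for illustration and do not describe any existing police system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OBEntry:
    """One hypothetical occurrence book entry (all fields are assumptions)."""
    id_number: str
    name: str
    station: str
    reason: str
    next_step: str


# Invented sample entries; names, stations and details are fictional.
ENTRIES = [
    OBEntry("12345678", "Jane Doe", "Station X",
            "Held for questioning", "Visit the station with identification"),
    OBEntry("87654321", "John Doe", "Station Y",
            "Traffic offence", "Pay the fine at the station"),
]


def search_ob(entries, query):
    """Return entries whose ID number equals the query or whose name contains it."""
    q = query.strip().lower()
    if not q:
        return []
    return [e for e in entries if e.id_number == q or q in e.name.lower()]


if __name__ == "__main__":
    for entry in search_ob(ENTRIES, "jane"):
        print(entry.station, "|", entry.reason, "|", entry.next_step)
```

A real system would of course need access controls and privacy safeguards; the sketch only shows that the search itself is trivial once the records are digital.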
Beyond this, data on crime and misdemeanours can also be published and mined to glean new insights, in addition to informing citizens.
If government can fix the fundamentals of automation across government, OpenData becomes that much easier to realize.
Conrad Akunga has worked in the software industry for over 10 years. He is a co-founder of Innova Limited, a software company specializing in the development of software and tools for the finance and investments industry. He is also the co-founder of Mzalendo.com, a civic education and governance watchdog portal, and sits on the Board Of Advisors of the Nairobi iHub. He is also a philosopher, writer and all-round good guy.
Upin Vasani at 11:07:27AM Monday, November 12, 2012
Identifying and highlighting the problem is good! What is now required is to ask what the prospective solutions are and how these can be meaningfully addressed.
bankelele at 11:16:22AM Monday, November 12, 2012
Going by this article in the Sunday Nation, the concept of ‘open data’ has not been sold to many government departments – who still hoard information they collect with taxpayers’ money – but which they will only release in doses or formats that they see fit.
Michael Pedersen at 12:05:41PM Monday, November 12, 2012
100% agree the “request” for datasets mechanism is far from good enough.
I am still waiting for a dataset about registered companies that I requested on the launch day of the OpenData portal.
ICT Board actually did follow up on the request and tried to avail it (since that particular dataset has already been digitized so it would technically be “easy”), but the government institution in charge did not feel there was any “reason” to avail this dataset – and the whole thing stopped there.
What good is it if the individual government department can just deny availing a dataset?
Steven at 12:56:26PM Monday, November 12, 2012
Conrad, while the pressure for OpenData is certainly top-down, you rightfully point out that the egov fundamentals are not in place. That certainly impacts, evidently, the success of OpenData since a lot of the bottlenecks are still at that very fundamental level, including control and (mis)management of information that should be easily publicly available and accessible, basis for OpenData. I’m sure, I hope, that OpenData has invested some effort and resources in this regard, otherwise it’ll continue being a nice repository but with low value content.
Wilfred Oluoch at 19:34:26PM Monday, November 12, 2012
I have often wondered how committed Government really is to this OpenData initiative when we still have an acute culture of “hiding information”. I have quite often done work with government as a software consultant and come across artificial barriers to information (or data sets) that should be public. The push has to come from top-down. We need regulations that force ministries to publish on OpenData. Perhaps part of performance contracts.
Adamsy at 23:48:56PM Monday, November 12, 2012
Nice one! Very thought provoking!!
Gabriella Razzano at 14:59:22PM Tuesday, November 13, 2012
This is precisely why, at the Open Democracy Advice Centre, we believe fundamentally that the call for open data cannot replace the call for access to information laws that are well implemented. The focus on open government data has in some ways quietened voices on access to information laws – but it should be no surprise that many of our African governments prefer the call for open government data (and proactive disclosure of information) if it means they have greater control over what information is, or isn’t, released. The two calls must be continually voiced together – we still only have around 10 African countries with access to information laws and, I believe, these laws must exist; especially when we want a big stick for beating government with to release more ‘controversial’ forms of information. For more, see http://www.opendemocracy.org.za and also the African Platform on Access to Information at http://www.africanplatform.org/.
Michael Bauer at 16:28:31PM Tuesday, November 13, 2012
There are a few comments I’d like to make:
1. you state “For some reason, whether by accident or by design, the definition of OpenData seems to have been somehow restricted to government data. This is an unnecessary and arbitrary restriction.” – this is factually incorrect. Open Data is all Data. (Open Government Data is a subcategory) – there are already private entities and research projects opening their data as open data (e.g: measurementlab.net). This is not a restriction at all – rather a misunderstanding of the full implications.
2. You are right: Open Data is only one side of transparency – the other has to be freedom of information legislation – to not depend on government benevolence.