View the visualization here.
Last week, one of the cooler projects I have worked on since I started with Open Data Kosovo finally got released to the public. The project was to visualize data collected by a survey called Kosovo Mosaic. This survey is run every three years across all 38 municipalities in Kosovo and asks citizens, amongst other things, how satisfied they are with a range of services the municipality and government provide. 2015 was the fifth installment of the survey.
Our job was to work with the Kosovo Mosaic data and come up with a way to visualize it so that people who are not into spreadsheets and coding can interact with it and get something from it. Our solution, using D3.js and Highcharts, was to provide users with a (hopefully) easy-to-understand interface to explore the data and reach their own interesting conclusions.
Obviously the data presented in this visualization will appeal mostly to those who have some connection to Kosovo. However, even for those that do not, I thought you may be interested in seeing how we approached the problem and whether it works for you. If you do have any thoughts, things you like, things we could have done better, please feel free to leave a comment.
You can access the interactive visualization by clicking the picture below:
Cross posted from OpenDataKosovo.org:
Continuing our series on Gender Inequality and Corruption in Kosovo, in Part IV we are going to build on Part III and use our understanding of the participation rate to compare the participation rate in Kosovo across a range of countries, as well as look at the reasons for non-participation (“inactivity”). If you don’t understand what a participation rate is (SPOILER: it is not the same as the unemployment rate), or just want to make sure you get the full picture, please go back and read Part III.
Click on the chart below to interact with the data!
Sunburst chart created by Festina Ismali
Comparing participation rates across countries provides insight into broad demographic trends and the specific employment situation in a country relative to other countries. For most high income nations, the participation rate tends to be around 60%. That is, 6 out of every 10 people of working age are actively engaged in the employment market (whether they currently have a job or not). While that may sound low, this accounts for parents who stay home to raise children, students, retirees and discouraged workers.
Once we leave high income countries, there is a much larger range of participation rates. Many very poor low income nations in Asia and Africa have extremely high participation rates of well over 80%. This is driven by pure necessity as, in many cases, there is simply no option for one partner to stay home, retire, or even for young people to continue studying.
Conversely, we also see many countries with very low participation rates of just over 40%. In some cases, these countries are involved in ongoing conflicts or are post-conflict countries (Syria, Iraq and Afghanistan all had participation rates below 50% in 2013). But in other cases, the cause is harder to identify.
Unfortunately, Kosovo is one of these harder to understand cases. In 2013, Kosovo had the second lowest participation rate of any country in the World Bank database, at 40.5%. In 2014 that number picked up slightly to 41.6%, but that was still low enough to keep Kosovo in the bottom 10, based on 2013 figures. Notably, Kosovo’s low participation rate has actually decreased substantially over the past decade (see Chart 1). In 2002, the participation rate stood at 52.8%. If that participation rate applied today, there would be an extra 134,600 people in the labour force – an increase of 26.9%.
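The arithmetic behind the last two figures can be checked with a short sketch. It assumes the working age population is held fixed, so the labour force scales in proportion to the participation rate. The working age population it backs out is only implied by the quoted numbers; it is not a figure stated in the survey.

```python
# Participation rates quoted in the text (percent of working age population).
rate_2002 = 52.8
rate_2014 = 41.6

# With the working age population held fixed, the labour force scales in
# proportion to the participation rate, so the population cancels out.
relative_increase = rate_2002 / rate_2014 - 1

# Extra labour force members quoted in the text, in thousands.
extra_workers = 134.6

# Implied (not stated in the source) working age population, in thousands:
# the extra workers divided by the gap between the two rates.
implied_working_age_pop = extra_workers / ((rate_2002 - rate_2014) / 100)

print(f"Relative increase in labour force: {relative_increase:.1%}")
print(f"Implied working age population: {implied_working_age_pop:.0f} thousand")
```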
Looking at Chart 1, another data point that immediately stands out is the low participation rate for women. In fact, with a participation rate for women of 21.1% in 2013, Kosovo has one of the lowest participation rates for women in the world. In terms of the rankings, Kosovo places between Saudi Arabia (20.2%) and Lebanon (23.3%). Looking around the region, Kosovo is also a significant outlier (see Chart 2).
Previously, in Part III, we mentioned that there were some more detailed criteria for determining whether a person is considered ‘employed’ in Kosovo. Specifically, there is one particular criterion that may partially explain Kosovo’s notably lower participation rate compared to its neighbors (and everyone else).
In the 2014 Kosovo Labour Force Survey, a specific methodological difference with Albania is highlighted. In Kosovo, people who work on a family run farm are not considered employed if the produce of the farm is not considered an “important source of consumption” (let’s call these people ‘family farm workers’). In contrast, these same people in Albania are classified as employed. From the 2014 Kosovo Labour Force Survey Results paper (emphasis mine):
“It is important to note that when respondents answer code 5B, that they do some agricultural activity but it is not an important contribution, this is not counted as employed. In 2014 69% of this group were categorized as inactive and 31% as unemployed. An important contribution is a subjective term and could depend on overall household income.”
The key takeaway here is that there is a significant population of family farm workers that are currently being classified as inactive, when in fact they are working. This at least partially explains the low participation rate in Kosovo.
Unfortunately, the paper does not provide enough information to be able to determine how many people are family farm workers. As such, we are unable to quantify exactly how much impact adding family farm workers back into the labour force would have on the headline participation rate.
Even if we could, though, this would not be fully correct either (welcome to the surprisingly complex world of labour market statistics). Many family farm workers probably do not consider themselves employed – working 1 hour a week on a family farm is a pretty low bar after all. The fact that 31% of them qualified as unemployed, meaning they actively sought other work, reveals that this is not a homogeneous group of full time farm workers being incorrectly classified.
Methodological anomalies aside, there is also a concerning trend in the data – the participation rate for women in Kosovo has been declining for much of the past decade. Despite the improving economy and significant international development assistance, the participation rate for women fell from over 34.5% in 2002 to 21.4% in 2014. There is some good news – the fall appears to have bottomed out, with 2013 and 2014 both recording higher participation rates for women than the low point in 2012 (17.8%!).
This slight uptick in recent years could be the impact of numerous initiatives to get women into the workforce in Kosovo. These range from the prioritization of grants for projects that provide jobs for women, to supporting women in registering property in their own names to help provide collateral for loans. There has also been a push by Kosovo’s first and current female President to boost participation among women. Several more years of data will be required to determine whether this is the beginning of a more substantial trend or simply noise in the data.
In the meantime, let’s get a better understanding of the current labour market by looking at a break down (see Table 1), provided in the 2014 Kosovo Labour Force Survey, of the inactive population sorted by reason for not participating.
| Reason for not participating | (A) Men (1,000s) | (B) Women (1,000s) | (C1) = (B) minus (A) (1,000s) | (C2) = (C1) as % of total |
| --- | --- | --- | --- | --- |
| Looking after children or incapacitated adults | 0.1 | 14.3 | 14.2 | 5.8% |
| Own illness or disability | 13.3 | 8.6 | -4.7 | -1.9% |
| Other personal or family responsibilities | 13.5 | 233.4 | 219.9 | 90.2% |
| In education or training | 104.7 | 97.3 | -7.4 | -3.0% |
| Believes that no work is available | 49.5 | 78.9 | 29.4 | 12.1% |
| Waiting to go back to work (laid-off people) | 0.8 | 0.5 | -0.3 | -0.1% |
| No reason given | 1.9 | 3.4 | 1.5 | 0.6% |
Looking at the breakdown, there is one category in particular in which there was a large discrepancy between the sexes – ‘Other personal or family responsibilities’. In this category, 233,400 were women, amounting to 38.8% of the total population of working age women. By contrast, only 13,500 were men, amounting to 2.2% of the total population of working age men. The table also shows the calculated difference between the number of inactive women and men (see column C1). Looking at these calculated differences, we see that for the total calculated difference across all categories (243,800 – see ‘Total’ row in column C1), 219,900, or over 90%, arose from this category. This breakdown is also shown in Chart 3 below.
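As a rough check, the two difference columns can be recomputed from the men and women figures. This is only a sketch: the total gap of 243.8 thousand is taken from the text rather than summed, because the rows reproduced here do not add up to it on their own (some rows of the original table appear to be missing).

```python
# Inactive population by reason, in thousands, as (men, women) from Table 1.
inactive = {
    "Looking after children or incapacitated adults": (0.1, 14.3),
    "Own illness or disability": (13.3, 8.6),
    "Other personal or family responsibilities": (13.5, 233.4),
    "In education or training": (104.7, 97.3),
    "Believes that no work is available": (49.5, 78.9),
    "Waiting to go back to work (laid-off people)": (0.8, 0.5),
    "No reason given": (1.9, 3.4),
}

# Total gap across all categories as quoted in the text (thousands).
# Assumption: used as given, not derived from the rows above.
TOTAL_GAP = 243.8


def gender_gap(men, women):
    """Return (women minus men in thousands, that gap as % of the total gap)."""
    gap = round(women - men, 1)
    share = round(gap / TOTAL_GAP * 100, 1)
    return gap, share


gaps = {reason: gender_gap(m, w) for reason, (m, w) in inactive.items()}

for reason, (gap, share) in gaps.items():
    print(f"{reason}: {gap:+.1f}k ({share:+.1f}% of total gap)")
```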
Going back to the family farm workers discussed earlier, we expect that those classified as inactive would be included in the ‘Other personal or family responsibilities’ category. However, if a significant number of women in this category were family farm workers and this was a full time role, we would also expect to see large numbers of men in the same category. The fact that we do not suggests that many men who are family farm workers also have other more formal jobs and lends support to the decision to exclude family farm workers from the employed population.
The other category where we see a meaningful gap between the sexes is the ‘Believes that no work is available’ category. As mentioned earlier, these are the people that are considered discouraged workers (i.e. those that would take a job, but are no longer actively looking). Why would significantly more women be discouraged than men? Typically, discouraged workers are the end product of long and unsuccessful searches for employment. At times of high unemployment, it will often be the case that the number of discouraged workers will also increase. Seeing that women are more likely to be discouraged than men suggests they are having a more difficult time finding employment.
To confirm this hypothesis, we need to look at unemployment rates. This will be the focus of the next piece in this series – Part V.
 People who would like a job but who haven’t actively sought work in the past 4 weeks
 Code 5b text: “Worked (at least one hour) on a farm owned or rented by you or a member of your household (even unpaid) whether in cultivating crops or in other farm maintenance tasks, or you have cared for livestock belonging to you or a member of your household (if the whole production is only for own consumption and this production does not constitute an important contribution to the total consumption of the household.”
 Employed are considered all the persons who have worked even for one hour with a respective salary or profit during the reference week.
 There is no mention of when the current methodology was implemented, but it is possible that the large drop in participation rate between 2009 and 2012 was due to a change.
Cross Posted from OpenDataKosovo.org:
Continuing our series on Gender Inequality and Corruption in Kosovo (see Part I and Part II), in Part III and the next few parts, we are going to take a detailed look at the problems women face in the labour market in Kosovo.
To do this, we will be using information from several sources, including data on participation rates, by gender, from the Gender Statistics database at the World Bank, and a range of labour market statistics from various Kosovo Labour Force Surveys, released by the Kosovo Agency of Statistics.
Before diving into the statistics, let’s first visualize and explain some of the high level concepts in labour market statistics.
At the highest level, the section of the population that is relevant when looking at labour market statistics is people who are of working age and are able to work. In Kosovo, this population includes all people aged 15 to 64 and is known as the ‘working age population’.
At the next level, the working age population can be broken down into two main subgroups – those that are considered in the labour force (i.e. ‘participating’) and those that are ‘inactive’. It is important to note that someone who is ‘inactive’ is not the same as someone who is ‘unemployed’. In Kosovo, to be considered ‘actively looking for work’ (and therefore be classified in the labour force) the following criteria must be met. The person must be:
If either of the above criteria is not met, the person is classified as inactive.
Once the population is classified as either in the labour force or inactive, it is possible to calculate the participation rate, one of the key labour market statistics. The participation rate measures the labour force population (people employed and/or actively looking for work) as a percentage of the working age population.
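The definition above can be written as a small function. The figures below are hypothetical round numbers chosen for illustration; they are not taken from the Kosovo Labour Force Survey.

```python
def participation_rate(labour_force, working_age_population):
    """Labour force as a percentage of the working age population."""
    if working_age_population <= 0:
        raise ValueError("working age population must be positive")
    return 100 * labour_force / working_age_population


# Hypothetical figures in thousands, for illustration only.
employed = 300
unemployed = 100
inactive = 600

# The labour force is everyone employed or actively looking for work.
labour_force = employed + unemployed
# The working age population is the labour force plus the inactive.
working_age_population = labour_force + inactive

rate = participation_rate(labour_force, working_age_population)
print(f"Participation rate: {rate:.1f}%")
```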
In Kosovo, the participation rates in 2014 were as follows:
Unlike the unemployment rate, described below, the participation rate tends to be a stable and reliable measure, as it is not affected by short-term fluctuations and the business cycle.
Analyzing the population further, the ‘labour force’ can be subdivided into two populations – those that are employed and those that are unemployed. In most cases it is obvious whether someone is employed or not, but in some situations it may not be so clear (e.g. when a person is working for the family business in an unpaid capacity). To handle these scenarios, the agency tasked with compiling the labour market statistics in each country typically has a specific definition (or definitions) of what qualifies as employment. In Kosovo, to be classified as ‘employed’ a person must meet the following high-level criteria:
“People who during the reference week performed some work for wage or salary, or profit or family gain, in cash or in kind or were temporarily absent from their jobs.”
In addition, the Kosovo Agency of Statistics includes some more detailed criteria in its methodology that clarify when work done on family owned farms qualifies as employment. This will become important later.
Having separated the employed from the unemployed, it is now possible to calculate the unemployment rate. To do this, we divide the number of unemployed people by the total number of people in the labour force.
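A minimal sketch of that calculation follows, again with hypothetical numbers rather than survey figures. The point it illustrates is that the inactive population never enters the calculation, which is why the unemployment rate and the participation rate can tell quite different stories.

```python
def unemployment_rate(unemployed, employed):
    """Unemployed people as a percentage of the labour force."""
    labour_force = employed + unemployed
    if labour_force <= 0:
        raise ValueError("labour force must be positive")
    return 100 * unemployed / labour_force


# Hypothetical figures in thousands, for illustration only.
employed = 300
unemployed = 100

# Inactive people are outside the labour force, so they are not an input.
rate = unemployment_rate(unemployed, employed)
print(f"Unemployment rate: {rate:.1f}%")
```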
In Kosovo, the unemployment rates in 2014 were as follows:
The unemployment rate is useful as a more immediate indicator of conditions in the economy. The obvious information it provides is how many people without a job are currently looking for employment. But, in addition, it also provides information about how much spare capacity an economy has, the risk that inflation may pick up, whether structural issues are keeping people out of work and so on.
In the next article, we will take a look at how the participation rate (for both males and females) in Kosovo compares across the region and internationally. In the meantime, please feel free to play around with the interactive visualization below, which shows the entire working age population of Kosovo broken down into its various subgroups.
Click on the chart below to interact with the data!
Sunburst chart created by Festina Ismali
Cross posted from OpenDataKosovo.org:
Previously in Part I of this series, we looked at corruption in Kosovo from the perspective of Kosovo civil servants, as documented in a United Nations Development Programme (UNDP) report entitled Gender Equality Related Corruption Risks and Vulnerabilities in Civil Service in Kosovo.
In Part II we are now going to look at global corruption perception statistics compiled by Transparency International to consider how Kosovo compares internationally.
Transparency International is an organization that works to reduce corruption through increasing the transparency of Governments around the world. Arguably Transparency International’s most well known contribution is the Corruption Perceptions Index (CPI), an index measuring “the perceived levels of public sector corruption worldwide”. In 2014 the CPI was calculated by aggregating 12 indices and data sources collected from 11 different independent institutions specializing in governance and business climate analysis over the past 24 months. The 2014 CPI covered 175 countries, including Kosovo.
In addition to the CPI, Transparency International does its own survey and data collection in the form of the Global Corruption Barometer (GCB survey). The GCB survey focuses on the public’s opinion of corruption within their own country, and in 2013 (the latest edition of the GCB available at the time of writing) collected the opinions of over 114,000 people across 107 countries – including Kosovo.
So what did these two reports show?
In the CPI, Kosovo performs poorly, placing 110th out of 175 countries with a score of 33 out of 100 (unchanged from 2013). To give some perspective, Kosovo finished equal 110th with 4 other countries – Albania, Ecuador, Ethiopia, and Malawi. This placed it behind Argentina (107th), Mexico (103rd), China (100th), India (85th) and Greece (69th), countries that are often associated with high levels of corruption. Finally, this was the lowest ranking for any country in the Balkans region (tied with Albania).
The GCB survey, however, shows that the people in Kosovo have a different perception of corruption in several areas to that reported in the CPI. Based on the responses to question 6 (see Chart 1) and question 7 (see Chart 2) of the GCB survey, people in Kosovo are somewhat more optimistic about the levels of corruption in their country than the low rating on the CPI might indicate. Kosovo scores well in several areas:
What does all this mean? Why does Kosovo perform so poorly on the CPI, and on some GCB survey questions, but on other questions the perceived level of corruption of people in Kosovo is comparable to some developed nations?
One of the issues when looking at the results of the GCB survey is that the responses to most of these questions are subjective. What constitutes corruption or extreme corruption varies by country and culture based on what people are used to living with. What someone in South Asia or sub-Saharan Africa considers standard practice and harmless may be considered unbelievably corrupt by people in other parts of the world.
These different standards are really highlighted when we compare the percentage of people believing an institution is corrupt with the number of people reporting to have paid a bribe to that institution, using questions 6 and 7 of the GCB survey. There are four institutions that appear as options for both questions, allowing us to make a direct comparison:
In the comparison (see Chart 3), we find numerous examples where the percentage of people that reported paying bribes was higher than the percentage of people who believed the institution was corrupt. The implication of this finding is that significant numbers of people in these countries believe that paying a bribe is not a sign of corruption.
Kosovo and most developed nations were examples of the opposite case – they generally reported relatively high numbers of people who believed the four comparable institutions were corrupt, and relatively low percentages of people reporting bribes being paid. Bribery, of course, is not the only form of corruption, and this result could simply be an indicator that different forms of corruption are more prevalent in these countries. But it could also be an indicator that people in some countries are particularly cynical about the fidelity of their institutions.
To get a better sense of how concerned people really are about corruption, let’s now take a look at some of the responses to other questions in the survey.
One of the questions asked on the survey that could potentially reveal some further information was question 10 – “Are you willing to get involved in the fight against corruption?” Respondents were then provided with a range of activities, both active and passive, and were requested to indicate whether they would be willing to participate.
At a high level, the responses to this question appear to show an inverse correlation between the CPI score for a country and how willing people in that country were to do something active to fight corruption. In other words, the higher the percentage of people willing to do something active to fight corruption, the lower the CPI score for that country (i.e. a higher level of corruption).
Using a statistical model (such as regression), we can check whether this relationship is real and how strong it is. However, to do this, we need to account for countries with regimes that punish dissent and crack down on protests and/or organizations that might try to combat corruption. In these countries, you would expect to have a low percentage of people willing to take action against corruption despite corruption being high.
To account for this, we need to have some sort of indicator of how worried people are about speaking out in their country. The best piece of information that we have from the GCB survey that can serve this purpose was the question asking if the respondent would be willing to report corruption.
Using these two pieces of information, we can try to test the following hypotheses:
Based on these hypotheses, we also expect that there would be no (or very few) cases where there is a high percentage of people willing to take action against corruption and a low percentage of people willing to report corruption.
Using our two pieces of information described above, and with the assumption that the CPI is the most accurate indicator of the true level of corruption within a country, we can build a model to predict CPI for each country and test our hypotheses. The formula for this model will be as follows:
Yi = β0 + β1Xi1 + β2Xi2 + εi
where:
Yi = the actual value of CPI for country i
β0 = a constant (the intercept)
Xi1 = the percentage of people willing to do something active to fight corruption in country i
β1 = the coefficient applied to Xi1
Xi2 = the percentage of people willing to report an incident of corruption in country i
β2 = the coefficient applied to Xi2
εi = the residual or error
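To make the terms above concrete, here is a minimal sketch of fitting a two-variable linear model by ordinary least squares using the normal equations. It is not the code used for this analysis, and the data are synthetic: the values of x1, x2 and the "true" coefficients are made up (their signs are arbitrary), and y is generated from them without noise so that the fit should recover them.

```python
def solve_linear_system(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations apply to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(x1, x2, y):
    """Return (b0, b1, b2) minimising the sum of squared residuals."""
    # Design matrix rows: a constant term plus the two explanatory variables.
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic data, NOT the GCB survey or CPI figures.
x1 = [10, 20, 35, 50, 65, 80]   # % willing to take action (made up)
x2 = [30, 25, 60, 45, 90, 84]   # % willing to report (made up)
true_b0, true_b1, true_b2 = 40.0, -0.5, 0.4
y = [true_b0 + true_b1 * a + true_b2 * b for a, b in zip(x1, x2)]

b0, b1, b2 = fit_ols(x1, x2, y)
print(f"b0={b0:.2f}, b1={b1:.2f}, b2={b2:.2f}")
```

With real data the residuals would not be zero, and a statistics library would normally be used to also obtain standard errors and goodness-of-fit measures.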
Using ordinary least squares (OLS) and the data for the 101 countries for which the CPI and the two variables (X1 and X2) described above are provided, the results of the model are as follows:
The first thing to note is that the coefficients support the three hypotheses we mentioned above:
Aside from providing support for our hypotheses, the other thing this model reveals is the countries that are not very well explained by this model. Chart 4 shows the CPI predicted by the model as compared to the actual CPI value for 2014.
At a high level, we can split the chart into two parts:
Starting with the first group – countries that were more corrupt than the model predicted – these cases appear to fall into two categories:
Contrasting with the above cases, we can also see there are countries above and to the left of the line in Chart 4. This represents countries that were less corrupt than the model predicted. In these cases the responses to the two questions were indicative of a country with a higher level of corruption than actually existed. The following were two interesting cases:
Unlike the above examples, Kosovo appeared fairly typical for the model. Let’s now take a deeper look into the results of the model for Kosovo.
For Kosovo, the model was able to fairly accurately predict the CPI using the two variables described. Kosovo has both a high percentage of people willing to do something active to fight corruption (80%) and a high percentage of people willing to report corruption (84%). As a result, the model predicted a high level of corruption in Kosovo, a CPI of 35, which was just above the actual CPI value of 33.
However, aside from demonstrating the accuracy of the model in this case, these high values reveal important information about the people of Kosovo. They reveal that Kosovars do believe corruption is an issue, and that they are willing to do something about it.
Overall, there are positives and negatives for Kosovo that can be taken from the Transparency International data. On the negative side, the CPI highlights that corruption is a significant issue in Kosovo. Even in a region with consistently low CPI scores (the best performer is Slovenia with a score of 58) Kosovo is a significant underperformer. The most disappointing aspect of this underperformance is that Kosovo has had the significant advantage of 15 years of assistance from various international agencies in setting up infrastructure for good governance.
That said, there is a big positive that comes from the GCB survey data, and it is also potentially an important clue as to the best way forward for Kosovo and the international organizations involved in the region. That positive is that the people of Kosovo appear to be aware of the issues of corruption in their country, and more importantly, they are very willing to take an active role to fight it. Compared to Albania, a country with the same CPI as Kosovo, almost twice the percentage of survey respondents stated they were willing to do something active to fight corruption in Kosovo (80% vs. 44%), and significantly more people said they were willing to report corruption (84% vs. 51%).
What this suggests is that, if harnessed effectively, anti-corruption efforts in Kosovo could be very popular, and therefore powerful. But the right strategies have to be implemented and publicized to garner public support.
Somewhat unsurprisingly, we believe a key strategy has to be raising awareness of how data can be used to reduce corruption and bring about change. This can apply equally to data that is currently collected by government agencies but isn’t publicly released, or new datasets that the public can assist in collecting. With the right data and right analysis, these datasets can help to improve governance in numerous ways including:
Using this open data approach also helps reduce reliance on the bravery of individual whistleblowers. Although whistleblowers are often vital in helping to identify incidents and even patterns of corruption, the fact is that, even in developed nations, they will always risk retaliation and other subtler forms of retribution (reduced career prospects, being ostracized by their peers and generally being perceived as untrustworthy).
Overall, what the results of the Transparency International data show us is that, with better coordination and targeting of anti-corruption efforts, there is the potential to actively involve large numbers of Kosovars. If that can be achieved and funneled into meaningful strategies, the future of Kosovo could be very bright indeed.
Have any suggestions for ways data could be used to fight corruption? Disagree completely? Feel free to leave your thoughts in the comments!
 Defined by Transparency International ‘… as “the abuse of entrusted power for private gain”. Corruption can be classified as grand, petty and political, depending on the amounts of money lost and the sector where it occurs.’
 The methodology for compiling the CPI is reviewed on a yearly basis with data sources added and removed as needed.
 “To what extent do you see the following categories in this country affected by corruption?” – responses of “corrupt” or “extremely corrupt” recorded as a positive response.
 “In your contact or contacts with the institutions have you or anyone living in your household paid a bribe in any form in the past 12 months?“
 “Over the past 2 years, how has the level of corruption in this country changed?”
 “To what extent is this country’s government run by a few big entities acting in their own best interests?”
 “How effective do you think your government’s actions are in the fight against corruption?”
 By their own admission, Transparency International’s CPI is not a perfect measure of corruption. Corruption by its nature is hidden and so there is no objective measure of the true level of corruption. However, the CPI is currently the most respected measure of corruption available and so we make the assumption that it is also the most accurate for the purposes of constructing this model.
 Taken as the average of the percentage of people who said they would take part in a peaceful protest and the percentage of people who said they would join an organization that works to reduce corruption as an active member
For those that don’t know, over the past couple of months I have been spending time working with a tech startup/NGO here in Pristina called Open Data Kosovo. The main aim of the organization is to encourage and facilitate the release of data and other information by the government of Kosovo (and related bodies) in order to increase transparency and reduce corruption. So far they have been fantastically successful, getting both national and international media attention, which is all the more impressive when you consider they are only now coming to the end of their first year of existence.
One of the main things I have been working on since joining is putting together some analysis of the various datasets they have been publishing online to see what conclusions can be provided to the public that might help create a more informed discussion of the issues. The first piece has now been published on the Open Data Kosovo website and we are excited to see what kind of feedback we get. If you want to take a look, please click the link below: