38.6 million people lived in Kenya in 2009 according to its national census. The population could reach 50 million by 2020, but the predictions depend on the data source and method used. In this article we look at some of the population data sources and their predictions.
1. Kenya Population and Housing Census
The Kenya National Bureau of Statistics (KNBS) has conducted a national census every 10 years since 1969. The latest census was in 2009 and put the total population at 38,610,097, up from 10,942,705 in 1969.
The 2009 census results were disputed and the estimates for Garissa, Mandera and Wajir were reduced by 40%. Revenue allocation to these counties has now been reduced, which demonstrates the need for accountability and accuracy in conducting a national census.
Population data from the 2009 census has been disseminated as aggregated statistics down to the level of sub-counties. Data is disseminated as tables, and commonly visualized as charts. Map visualizations and open spatial data are published on the Kenya Open Data site. However, the site was experiencing technical hitches at the time of writing.
In 2015 KNBS published a County Statistical Abstract for each of the 47 counties. Population estimates from 2010 to 2014 are included for most of the counties, and the few missing data can be filled through simple data analysis.
2. United Nations DESA / Population Division
The UN Population Division takes great care in adjusting official national census results that are affected by errors in reporting, lack of timeliness, or incomplete coverage. Their 2017 Revision of World Population Prospects is worth a visit, but the data are aggregated to the national level, which limits their use for sub-national studies.
Countrymeters uses the UN DESA / Population Division data in their Kenya Population clock, which shows various population indicators in real time using statistics on birth, mortality, and migration. It is definitely fascinating to see the population grow as you watch!
3. Gridded Population of the World
The Gridded Population of the World (GPW) models the distribution of human population on a continuous global surface. Version 4, the latest release, uses the results of the 2010 round of censuses at the most detailed spatial level available as the primary data source.
Raster datasets are available for 2000, 2005, 2010, 2015 and 2020 as estimates of both population count and population density. UN-adjusted versions of these datasets are available to match UN country totals.
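To make the idea of a UN-adjusted dataset concrete, here is a minimal sketch of one simple way to match a country total: rescale every cell by the same factor. This is an assumption for illustration only. The article does not describe how GPW actually performs the adjustment, and the function name and cell values below are hypothetical toy data.

```python
def adjust_to_total(cells, target_total):
    """Rescale cell counts so that they sum to target_total.

    Assumption: a single uniform factor is applied to every cell.
    """
    current_total = sum(cells)
    if current_total == 0:
        raise ValueError("cannot rescale a surface with no population")
    factor = target_total / current_total
    return [value * factor for value in cells]


# Toy values, not real data: four cells that sum to 400 people,
# adjusted to a hypothetical country total of 440.
cells = [120.0, 80.0, 0.0, 200.0]
adjusted = adjust_to_total(cells, 440.0)
```

A uniform factor changes the total but keeps the relative distribution between cells, so empty cells stay empty.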
GPW uses a weighted-area algorithm to proportionally allocate population data to a grid with a spatial resolution of 30 arc seconds (approximately 1 km at the equator). The main determinant in the algorithm is land use, and water areas are masked out as uninhabited.
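The proportional allocation described above can be sketched in a few lines. This is a simplified illustration, not the GPW implementation: it assumes the weight of each cell is simply its land area inside the administrative unit, with water cells given a land area of zero. The function name and the numbers are hypothetical.

```python
def allocate_by_area(unit_population, cell_land_areas):
    """Spread one unit's population over its cells in proportion to land area."""
    total_area = sum(cell_land_areas)
    if total_area == 0:
        raise ValueError("unit has no land area to allocate population to")
    return [unit_population * area / total_area for area in cell_land_areas]


# Toy example: a unit of 1,000 people covering four cells.
# The third cell is water, so its land area is 0 and it receives nobody.
cell_land_areas = [1.0, 0.5, 0.0, 0.5]  # km2 of land per cell (hypothetical)
counts = allocate_by_area(1000, cell_land_areas)
```

The cell counts always add up to the population of the unit, which is why the approach preserves census totals.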
GPW offers a consistent global dataset over a twenty-year period and uses a simple model that produces realistic results. However, the simplicity of the model and the low spatial resolution of the output data limit the types of applications for which GPW can be used.
4. WorldPop Project
The WorldPop project was initiated in 2013 to unite the continent-focused AfriPop, AsiaPop and AmeriPop projects. Its aim is to produce detailed and freely-available population distribution and composition maps for the whole of Central and South America, Africa, and Asia.
Data is disseminated as country datasets, which for Kenya include other socio-economic indicators such as poverty, literacy, maternal and newborn health, births, and pregnancies. Population estimates are available for 2010 and 2015 at a spatial resolution of 3 arc seconds (approximately 100 m at the equator). UN-adjusted versions that match UN country totals are included in the dataset.
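The "approximately 100 m" and "approximately 1 km" figures follow from the size of one arc second at the equator. The short calculation below is a sketch that assumes an equatorial circumference of about 40,075 km. The constant and function names are my own.

```python
# Assumption: Earth's equatorial circumference is roughly 40,075 km.
EQUATORIAL_CIRCUMFERENCE_M = 40_075_017
ARCSECONDS_PER_CIRCLE = 360 * 3600


def arcsec_to_metres(arcsec):
    """Ground distance covered by an angle in arc seconds along the equator."""
    return EQUATORIAL_CIRCUMFERENCE_M / ARCSECONDS_PER_CIRCLE * arcsec


worldpop_cell = arcsec_to_metres(3)   # a little under 100 m
gpw_cell = arcsec_to_metres(30)       # a little under 1 km
```

Away from the equator the east-west size of a cell shrinks, so these values are upper bounds for the cell width.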
Like the Gridded Population of the World, WorldPop uses a weighted-area model that proportionally allocates national census data to a raster surface. The weights come from a random forest (RF) model, which has 30 variables and uses 17 datasets. In addition to land cover and water areas, the datasets include night-time lights and estimated net primary productivity derived from satellite imagery, and national datasets on roads, schools, and health facilities.
WorldPop has developed a sophisticated model with a good fit and disseminates data at a high spatial resolution. However, considerable time and effort need to be invested in data preparation and the dynamics of some of the datasets aren’t well understood. For example, do people settle along major roads, or are roads built in areas with high population density? This lack of understanding makes it difficult to make future predictions.
Data Analysis & Visualization
Population numbers are subject to constant change: as we speak, babies are being born and other people are dying. To compare population estimates we need a common time reference, and for this article I have picked 1 July of 2010 and 2015.
1. Kenya National Population
In 2009 KNBS estimated Kenya’s population at 38,610,097. The population estimates for Garissa, Wajir and Mandera were then reduced by 40%, so the 2010 estimate for Kenya dropped marginally to 38,482,952. Based on the 2014 estimates from the County Statistical Abstract and a constant growth rate, Kenya’s population according to KNBS would have risen to 44,170,249 in 2015. A summary of the KNBS population estimates is presented in the table below.
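A constant growth rate can be worked through with the two KNBS figures quoted above. The sketch below derives the annual rate implied by the 2010 and 2015 estimates and extends it to 2020. It illustrates the method only: the 2014 county figures actually used for the 2015 estimate are not reproduced here, and the function names are my own.

```python
def annual_growth_rate(p_start, p_end, years):
    """Constant annual rate r such that p_start * (1 + r) ** years == p_end."""
    return (p_end / p_start) ** (1 / years) - 1


def project(p_start, rate, years):
    """Population after `years` of compound growth at a constant annual rate."""
    return p_start * (1 + rate) ** years


# KNBS-based estimates for 2010 and 2015, as quoted in the text.
rate = annual_growth_rate(38_482_952, 44_170_249, 5)   # about 2.8% per year

# Assumption: the same rate simply continues for another five years.
p2020 = project(44_170_249, rate, 5)                   # roughly 50.7 million
```

Under that assumption the KNBS-based figure passes 50 million by 2020, in line with the expectation stated at the start of the article.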
The 2017 Revision of World Population Prospects contains annual total population estimates from 1950 to 2015 for virtually every country. The 2010 and 2015 population estimates for Kenya are summarized in the table below.
To calculate the national and county population estimates from the GPW and WorldPop datasets I used zonal statistics to sum the population counts of all raster cells that fall within each county. The unadjusted and UN-adjusted population estimates for 2010 and 2015 derived from GPW and WorldPop are presented in the tables below.
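For readers unfamiliar with zonal statistics, the sketch below shows the idea on a tiny grid: one raster holds population counts, a second raster of the same shape holds the county each cell belongs to, and the counts are summed per county. It is a pure-Python illustration with made-up values. The article does not say which GIS tool was used, and the function name is hypothetical.

```python
from collections import defaultdict


def zonal_sum(population, zones):
    """Sum population cell values per zone.

    `population` and `zones` are 2D lists of the same shape.
    Cells with a missing zone or a missing value (None) are skipped.
    """
    totals = defaultdict(float)
    for pop_row, zone_row in zip(population, zones):
        for value, zone in zip(pop_row, zone_row):
            if zone is None or value is None:
                continue
            totals[zone] += value
    return dict(totals)


# Toy rasters, not real data: two counties "A" and "B" on a 2 x 3 grid.
population = [[10.0, 20.0, 5.0],
              [0.0, 15.0, 30.0]]
zones = [["A", "A", "B"],
         ["A", "B", "B"]]

county_totals = zonal_sum(population, zones)
national_total = sum(county_totals.values())
```

The national total is simply the sum of the county totals, so the same pass over the raster gives both levels.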
2. Kenya County Population
Conventional census mapping uses aggregated statistics to assign graduated colors to administrative areas, such as counties or sub-counties. The map visualization below shows population density and total population count for each county in 2015.
The advantage of this approach is that it reveals patterns and the distribution of population at a national level, but it doesn’t reveal the distribution of the population within a county.
For delivery of government services or allocation of resources within a county, high-resolution mapping of population and its distribution is required. Weighted-area modeling leverages data analytics and the growing repository of remote sensing data to produce population surfaces as shown in the figures below.
Population estimates using KNBS, GPW and the WorldPop Project as a source have been generated for each of the 47 counties. A sample of 10 counties is presented in the table below. What is interesting to note is that each county’s share of the national total depends on the data source and method. For example, in comparison with KNBS, WorldPop predicts increased growth in urban centers, while GPW predicts reduced growth.
We set out to find out how many people live in Kenya, but ended up with different answers. It appears that KNBS underestimates population in comparison to other data sources. In comparison to the UN country totals the GPW model underestimated Kenya’s total population in 2010 and 2015. The WorldPop model interestingly underestimated population in 2010 and overestimated population in 2015 in comparison to the UN country total.
Predicting absolute population counts might remain a challenge for years to come, but the modeling approach to population distribution provides valuable insight into where people live and don’t live. The emergence of new data sources (e.g. call locations) and advanced analytics (e.g. big data) could keep population modeling at the forefront of research and application development for years to come.