Geospatial is Where in Data Science

Statement or question? Data science is in the news as a discipline that can solve business problems through big data. The fast-growing geospatial industry has been handling large volumes of spatial data for decades. Let’s look at both and find out how they can sustain each other’s growth.

About Data Science

Data science forms part of a continuum of data-related disciplines and solutions that have data at their core. We can begin to define data science by comparing it with a few of these illustrious contemporaries.

  • Artificial Intelligence – AI will have a disruptive impact on businesses in years to come. However, AI focuses on the development of machine-based algorithms, while data science is concerned with solving business problems in a human context.
  • Big Data – The distinctions between big data and data science are dissipating, but big data deals specifically with massive volumes of structured and unstructured data. Data science, in comparison, deals with any data, big or small.
  • Business Intelligence – BI allows business executives to analyze sales data in various dimensions. In doing so, BI exploration typically raises new questions, while data science could give definite answers to such questions.

Wikipedia states that data science, or data-driven science, is an interdisciplinary field about scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured. This definition implies that data science uses the steps of the scientific method in a process-like manner while leveraging scientific knowledge and theories.

Data is accumulated at an unprecedented rate, and companies that can derive information and insight from their data in support of decision-making have an edge over their competitors. The use of “data-driven” as an adjective for nouns like marketing and journalism is an indication that the value of data science is increasingly understood and appreciated.

In 2012 Harvard Business Review named data scientist the “sexiest job of the 21st century”. Alex Castrounis mentions it in his article “What is Data Science, and What Does a Data Scientist Do?” and goes on to discuss the data scientist’s secret sauce and what this “sexy” person does on a daily basis.

The article and other references suggest that a data scientist needs at least basic skills in the following four broad areas of expertise.

  • Statistics and mathematics – A good understanding of statistics and calculus is needed to work with quantitative data and answer questions with statistical significance.
  • Programming and computer science – Computing and programming skills are needed for data management, analysis, and visualizations.
  • Communication skills – A data scientist needs good verbal and written communication skills to engage with stakeholders and to present findings in a convincing manner.
  • Business and domain expertise – An understanding of company goals and objectives within its larger industry is needed to solve problems in a business context.

The responsibilities of a data scientist depend on the organizational context. A person working in a large team could handle specialized responsibilities (e.g. programming, visualization), while a data scientist in a small organization could manage data science projects single-handedly. The following are typical tasks that a data scientist can be expected to perform.

Data management

  • Collect, acquire, connect to, and organize data in a data repository.
  • Get to understand data meanings, relationships, and accuracy.
  • Clean, transform, enrich, and prepare data for subsequent analysis.
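
As an illustration of the cleaning and preparation step, here is a minimal Python sketch. The record layout (a "name" and an "amount" field), the sample rows, and the clean_records helper are all assumptions made up for this example; real projects will have their own rules for what counts as a valid record.

```python
def clean_records(rows):
    """Trim and normalize names, parse amounts, and drop invalid or duplicate rows.

    The field names "name" and "amount" are hypothetical.
    """
    seen = set()
    cleaned = []
    for row in rows:
        # Normalize whitespace and capitalization so duplicates can be spotted.
        name = (row.get("name") or "").strip().title()
        try:
            amount = float(row.get("amount"))
        except (TypeError, ValueError):
            continue  # amount is missing or not a number
        if not name:
            continue  # a record without a name cannot be used
        key = (name, amount)
        if key in seen:
            continue  # exact duplicate after normalization
        seen.add(key)
        cleaned.append({"name": name, "amount": amount})
    return cleaned


# Invented sample data: one duplicate, one bad amount, one missing name.
raw_rows = [
    {"name": "  acme ltd ", "amount": "1200.50"},
    {"name": "Acme Ltd", "amount": "1200.5"},
    {"name": "Beta Co", "amount": "n/a"},
    {"name": "", "amount": "300"},
    {"name": "gamma inc", "amount": " 450 "},
]

cleaned_rows = clean_records(raw_rows)
for record in cleaned_rows:
    print(record)
```

Only two of the five sample rows survive, which is typical: much of data management is deciding what to discard and why.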

Data analysis

  • Perform descriptive (what happened?) analytics with descriptive statistics and basic data analysis tools.
  • Perform diagnostic (why did it happen?) analytics with regression and hypothesis testing to discover correlations and dependencies.
  • Perform predictive (what will happen?) analytics with programming, computational methods, and algorithms for forecast simulations.
  • Perform prescriptive (what should we do?) analytics with the integration of predictive modeling, big data analytics, artificial intelligence, and company data.
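
The first three types of analytics can be sketched in a few lines of Python using only the standard library. The figures below (monthly ad spend and sales) are invented for illustration, and a straight-line fit is just one simple choice of model, not a recommendation for real forecasting.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical monthly figures, invented for this example.
ad_spend = [10.0, 12.0, 15.0, 18.0, 20.0]
sales = [26.0, 28.0, 36.0, 40.0, 45.0]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


# Descriptive: what happened?
print("mean sales:", mean(sales), "std dev:", round(stdev(sales), 2))

# Diagnostic: how strongly do sales move with ad spend?
r = pearson(ad_spend, sales)
print("correlation:", round(r, 3))

# Predictive: what might sales be at a spend level not yet tried?
intercept, slope = fit_line(ad_spend, sales)
forecast = intercept + slope * 22.0
print("forecast at spend 22:", round(forecast, 1))
```

A strong correlation on five data points proves very little, of course; the point is only to show how the questions "what happened", "why", and "what next" map onto different calculations.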

Data visualization

  • Present data and derived information as compelling visualizations in the form of tables, charts, and maps.
  • Monitor and evaluate KPIs and supporting indicators in real time through executive and operational dashboards.
  • Create awareness about business problems and recommended solutions through visual narratives and storytelling.

About Geospatial

Geographic Information System (GIS) software came to the fore in the 1980s with the advent of the PC. The software has evolved along with IT and now embraces web services and cloud computing. Somewhat unfortunately, the term GIS has stuck, and it leads businesses to question the need for a dedicated system that manages geographic data.

Recent years have seen a consolidation of GIS into the broader geospatial industry. This has resulted in a small number of big players that offer a broader portfolio spanning GIS, surveying, remote sensing, and emerging technologies. The geospatial industry is growing fast and is increasingly moving towards industry solutions that solve specific business problems.

Both large and small players in the geospatial industry are exploring niche markets that can sustain their growth. They keenly follow new technology trends like Big Data and Artificial Intelligence and the opportunities that these could provide.

To demystify the meaning of geospatial or spatial technologies, let’s have a closer look at the key constituent technologies and their outlook.

  • Mapping & Surveying – One might think that every inch of our planet has been mapped, but our world keeps changing. Evolving technologies like GNSS, LiDAR and laser scanning now allow larger volumes of data to be collected at higher speeds and accuracies.
  • GIS – Not the newest kid on the block, but the GIS industry has latched on to web, mobile and cloud computing. GIS offers increasingly powerful capabilities for geodata management, analytics, and visualization through easier-to-use systems, tools, and apps.
  • Remote Sensing – This discipline deals with the acquisition, processing, analysis, and interpretation of imagery from satellite, aerial and terrestrial sensor systems. It deals with massive volumes of data and can be regarded as the big data arm of geospatial.
  • Location Technologies – Location-based services leveraging location-enabled devices have been around for quite a while, but they have grown into a vibrant industry that delivers location data and location intelligence. Call it the new “sexy” in geospatial if you like.

A geospatial professional can be viewed as a modern-day geographer who explores the world with digital technologies, but what do they look like? Let’s explore some of the common archetypes:

  1. Surveyor – Loves being outdoors and handles river crossings, wild animals, and irate citizens to collect measurements with a tripod and other survey instruments. Surveyors make rudimentary maps as proof of work done, but you had better get hold of the data.
  2. Analyst – The geospatial analyst stares at maps to interpret colors and tell you what’s near and far. They most frequently use query, buffer, and clip, since this helps to keep geo-statistics, raster algebra and other complicated algorithms at bay.
  3. Map maker – Produces online maps in large numbers by collecting, digitizing, bashing, publishing, and sharing geographic data. Some maps hurt the eye, some use attractive colors, but few carry a message that supports decision-making.
  4. App builder – Is considered tech-savvy for having discovered how to use smart technology that anyone could use. App builders happily build location-based mobile and web apps that already exist, or stuff that nobody needs or has asked for.

The above is a bit of a parody derived from my experiences in the local geospatial industry. It does, however, point towards four key areas of expertise for today’s geospatial practitioner:

  1. Spatial data collection and acquisition, and geodatabase management.
  2. Spatial data analysis, geo-statistics, and spatial-temporal analytics.
  3. Cartographic map production and data visualization in 2D and 3D.
  4. Development of automated workflows and end-user applications.
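
To give a flavor of the second area, here is a minimal Python sketch of a buffer query: find all points within a given distance of a location. It uses the haversine great-circle formula on a spherical Earth, which is a simplification; production GIS tools work with proper projections and geometries. The store and customer coordinates are invented for illustration, and the function names are my own.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_buffer(center, points, radius_km):
    """Return the names of the points lying within radius_km of center."""
    lat0, lon0 = center
    return [
        name
        for name, (lat, lon) in points.items()
        if haversine_km(lat0, lon0, lat, lon) <= radius_km
    ]


# Hypothetical store and customer locations (latitude, longitude).
store = (-1.2864, 36.8172)
customers = {
    "A": (-1.2921, 36.8219),
    "B": (-1.3000, 36.8500),
    "C": (-1.1000, 37.0000),
}

nearby = within_buffer(store, customers, radius_km=5.0)
print("customers within 5 km:", nearby)
```

Even a query this simple answers a business question that a table of addresses cannot: who is actually near the store?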

Conclusions

Geospatial technologies manage the “Where” and can be leveraged in data science for their unique capabilities in managing, analyzing, and visualizing the locational aspects of data. Data science is already leveraging location-based services and map visualizations, but there is greater opportunity for applying geo-statistics and spatial-temporal analytics in decision-making.

The geospatial industry can attribute a large part of its recent growth to the advent of data science and related disciplines like big data, AI, machine learning, and IoT. It can grow further if it adopts the scientific rigor of data science in problem-solving to offer insights and solutions to many of the world’s pressing problems.

Both data science and geospatial practitioners need to widen their skill sets to serve organizations and individuals that seek to leverage all manner of data in their decision-making. Crossover professionals with expert skills in both data science and geospatial science will be in demand in an increasingly integrated and competitive world.

About Author:

Willy Simons came to Kenya from The Netherlands in 1994. He is a serial entrepreneur and co-founder of Oakar Services, Esri Eastern Africa and Spatiality. He blogs about business, geospatial technology and cloud computing.