Brief guide to big data in shipping

Big data has become the tech buzzword of the decade, not just in maritime, but in every major industry. Despite the hype, many people don’t fully understand what big data is, how it works, or how it can be implemented to help organisations reach their goals. To help, we have prepared a jargon-free guide to the fundamentals of big data and its potential applications in shipping.

Before jumping into a definition of big data, let’s first explore data. In its simplest form, data is just a collection of information that can be collated for analysis. Shipping has always been a data-driven industry; Lloyd’s Coffee House was popular in the late 17th century precisely because of its ability to provide shipping data to the traders who frequented the premises. Data doesn’t have to be digital, but in the latter half of the twentieth century, the rise of computing made the two synonymous.

Lloyd’s Coffee House was popular in the 17th century because it was a source of industry data. Credit: Lloyd’s of London

How does big data compare to regular data analysis?

We have been working with data for decades, if not centuries, so when does regular data become big data? The volume of data the world is generating is growing exponentially, as is the variety of that data and the speed at which we can capture and process it. When the volume, velocity, and variety of data being generated, captured and processed outstrip the capabilities of regular database tools, we enter the world of big data.

Traditional databases have limitations. When they are asked to process too much data from too many different sources, their performance degrades. Popular big data tools like Hadoop scale the volume, velocity, and variety of data they can process by distributing the work across multiple computers, so capacity grows as more machines are added. This has two major advantages: first, it allows far greater flexibility in what data is stored, and second, it greatly widens the possibilities for how that data can be used.
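
To make the idea of distributing work more concrete, here is a minimal single-machine sketch in Python. It is not Hadoop code: worker threads stand in for separate computers, the sample records are invented, and the function names (`split_into_chunks`, `count_chunk`, `merge_counts`) are hypothetical. It only illustrates the general pattern of splitting data, processing each piece independently, and combining the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample records: one vessel type per observed ship.
records = [
    "tanker", "bulk carrier", "container ship", "bulk carrier",
    "tanker", "bulk carrier", "container ship", "bulk carrier",
]


def split_into_chunks(items, chunk_count):
    """Split the items into roughly equal chunks, one per worker."""
    chunk_size = max(1, -(-len(items) // chunk_count))  # ceiling division
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def count_chunk(chunk):
    """Map step: each worker counts only the records in its own chunk."""
    return Counter(chunk)


def merge_counts(partial_counts):
    """Reduce step: combine the partial counts into one overall total."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


chunks = split_into_chunks(records, chunk_count=3)

# Threads stand in for separate machines in this sketch.
with ThreadPoolExecutor(max_workers=3) as pool:
    partial_counts = list(pool.map(count_chunk, chunks))

totals = merge_counts(partial_counts)
print(totals["bulk carrier"])  # 4
```

Because each chunk is counted independently, adding more workers (or, in a real cluster, more machines) lets the same approach handle more data without changing the logic.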

In the majority of cases, for data to be useful in analysis, it needs to be structured. Most databases rely on a system of categories, values, and relationships between pieces of information known as a data model. If you have ever used a spreadsheet you should have a basic understanding of how structured data works. For example, a simple dataset called Ships could look like this:

  • Tankers: 4;
  • Container Ships: 6;
  • Bulk Carriers: 12;

It’s easy for me to analyse that dataset: I know that there are 22 ships. I know there are twice as many bulk carriers as there are container ships. I know that tankers are the smallest category of ship. More importantly, however, if the data model for “Ships” has been defined, it is easy for a computer to analyse this dataset too. Computers can easily process and make use of structured data. A basic computer can perform the same analysis as I can, but it can do it much faster and with much larger datasets. The shipping industry is full of structured data: IMO numbers, latitude and longitude positions, AIS information, and MMSI numbers are all examples of data that has been defined and is therefore easy to process.
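
As a rough illustration, the same analysis of the “Ships” dataset can be written in a few lines of Python. The variable names are assumptions made for this sketch; the figures are the ones from the list above.

```python
# The "Ships" dataset, stored as a mapping of category -> number of ships.
ships = {
    "Tankers": 4,
    "Container Ships": 6,
    "Bulk Carriers": 12,
}

# Total number of ships across all categories.
total_ships = sum(ships.values())

# The category with the fewest ships.
smallest_category = min(ships, key=ships.get)

# How many bulk carriers there are for every container ship.
bulk_to_container_ratio = ships["Bulk Carriers"] / ships["Container Ships"]

print(total_ships)              # 22
print(smallest_category)        # Tankers
print(bulk_to_container_ratio)  # 2.0
```

The code works only because the structure is known in advance: every entry is a category name paired with a count.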

A key differentiator between regular data and big data is the ability of computers to both store and process unstructured data. Unstructured data is any information that sits outside of a pre-defined data model. This could be anything from a bridge VDR sound recording, to a CCTV video feed, to a stack of photocopies of bills of lading.

The world has a growing number of data centres to store massive amounts of structured and unstructured data.

If data sits outside of a defined model, it is difficult for a computer to understand what it means and categorise or value it accordingly. To overcome this, big data systems use the processing power at their disposal to assess millions of data points, spot familiar patterns, and group them together. Natural language processing (NLP) is a great example of such a technique: by processing vast amounts of written and spoken text, it becomes possible for computers to recognise and record patterns, and eventually to interpret human language. This is the technology that powers tools like Google Translate and Amazon’s Alexa. At this point we are venturing into the realm of artificial intelligence, and it is worth noting that the two fields are closely intertwined.
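
The following toy sketch in Python hints at how patterns can be pulled out of unstructured text. It is far simpler than real NLP and is not how any of the tools named above work; the reports, the stop-word list, and the `tokenize` helper are all invented for illustration. It counts how many free-text reports each word appears in, then lists the words that recur.

```python
import re
from collections import Counter

# Hypothetical free-text reports, invented for illustration only.
reports = [
    "Crew member slipped on wet deck near the gangway.",
    "Wet deck reported after rain, no injury.",
    "Mooring line parted during berthing.",
    "Mooring line showed wear before berthing.",
]

# A small, assumed list of words too common to be informative.
STOP_WORDS = {"on", "the", "near", "after", "no", "during", "before", "a"}


def tokenize(text):
    """Lowercase the text and return its words, minus the stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


# Count how many reports each word appears in (once per report).
document_frequency = Counter()
for report in reports:
    document_frequency.update(set(tokenize(report)))

# Words seen in more than one report hint at a recurring theme.
recurring_terms = sorted(
    word for word, count in document_frequency.items() if count > 1
)
print(recurring_terms)  # ['berthing', 'deck', 'line', 'mooring', 'wet']
```

Even this crude count turns a pile of free text into something structured: a short list of themes that could be tracked over time.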

As well as the ability to capture, store and process vast amounts of structured and unstructured data, another key advantage of big data is the flexibility it creates for using the data to generate business value. While traditional data analysis depends on data being stored and queried in a specific and structured way, big data tools make it possible to quickly and flexibly query data that comes in many different forms. They also make it possible to run complex queries and search massive datasets in real time, making the technology valuable for use cases such as cybersecurity and optimisation problems.

How does big data impact shipping?

How does big data impact shipping? In almost any way imaginable. Ships produce millions of data points, even on short voyages. Most of it is lost to the ocean waves instead of being captured and put to use. As the cost of capturing and processing data continues to fall, using it to drive more business decisions becomes increasingly viable. The data produced by the world’s shipping fleet can be used to improve both operational and commercial outcomes for the industry.

Data analysis can be used to improve safety by predicting seafarer or port operator behaviour, making it easier to stop dangerous situations developing. With the right data, vessel and port equipment performance can also be analysed, predicted, and continuously optimised. Freight rates, commodity prices, and even trade flows can likewise be analysed and predicted in real time.

A word of warning, though: creating a model to predict human behaviours or price movements is relatively easy when compared to turning that insight into action. Real organisational change can be difficult to achieve, and while big data can improve organisational outcomes in almost any way imaginable, it requires a thorough understanding of current workflows, decision-making processes, and how to enact change.

Current big data applications in shipping

As well as the many in-house data and analytics projects under way across the industry, there is a growing number of solution providers in the space leveraging the technology.


Transmetrics leverages the power of big data and artificial intelligence to help companies optimise their transport networks and ship fewer empty containers. The team offers a combination of data cleansing, demand forecasting and predictive optimisation to help clients, including DHL and CMA CGM, improve operational efficiency and reduce costs.


CoVadem gathers water depth data from a network of barge operators on European inland waterways. The system uses a “sailing network” of barges that continuously contribute depth data from sensors as they go about their normal business. The data is processed using proprietary algorithms and is used to predict water depths across the inland waterway network, and provide real-time insight into available fairways and their condition.


LogComex captures millions of data points to track the flows of cargo around the world. It offers real-time trade data on cargo movements at ports and airports, and helps clients who move freight to track their own shipments automatically without the need for emails, phone calls, or a complex installation process.


HiLo uses data to change how the maritime industry assesses and reacts to risk. The not-for-profit initiative anonymously collates near-miss data from the fleets of ships operated by its clients. The data is used to create a comprehensive risk profile of a ship or fleet, which can be used to inform corrective action onboard and minimise the chance of major incidents occurring.


While big data in maritime trade is still an early-stage sector, many of the largest industry players, including major carriers, forwarders, and brokers, have bet heavily on building big data capability. It is a serious undertaking, requiring considerable investment in technology, skills development, and change management across the organisation. But unlike some of the more hyped technology areas like autonomous ships and blockchain, the business case usually stacks up well for businesses operating at a large scale. The saying “data is the new oil” is so overused that it has become a cliché, but it is absolutely true. For organisations looking for a competitive advantage in an increasingly commoditised world, investing in big data capability is a great way to find it.