Data-driven decision making and learning

Duncan Simpson discusses how data projects can be improved and how data could be used more cleverly in the future. He also talks about the evolution of Valpak's data collection processes, which has led to the development of the Insight Platform - a tool which can benefit businesses by helping them to assess their environmental impacts.

We have become used to terms like “big data”, algorithm, machine learning and AI (artificial intelligence). There is no doubt that these things have started to play a greater role in our lives. For better and sometimes for worse, we are learning how the use of vast amounts of data could improve our world.

However, several things would greatly improve data projects:

  • A solid understanding of what you want to achieve with the data
  • Data quality
  • Detailing the potential benefits
  • Context, teaching and experience input
  • Collaboration
  • Review

Let’s go through these and consider them in order.

A solid understanding of what you want to achieve with the data

This is the first major decision that must be made. It is easy to say but hard to do. The danger is collecting vast amounts of random data that you don’t need or want, or discarding data you may need in the future. For example, if the question relates to the recyclability of plastic, it is important to record that the packaging material is plastic, but the polymer type or the colour may be more helpful. This introduces the concept of levels, or granularity, of data. Clearly defining the purpose of data collection is key.
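As a rough illustration of granularity, the sketch below records packaging at three levels (material, polymer, colour) and checks whether a record is detailed enough for a given question. The field names and the mapping from questions to required fields are assumptions made for this example, not a description of any real template.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PackagingRecord:
    # Coarsest level: the broad material, e.g. "plastic".
    material: str
    # Finer levels, left as None when they were not captured.
    polymer: Optional[str] = None
    colour: Optional[str] = None


# Hypothetical mapping from a question to the fields needed to answer it.
REQUIRED_FIELDS = {
    "plastic_tonnage": ["material"],
    "plastic_recyclability": ["material", "polymer", "colour"],
}


def missing_fields(record: PackagingRecord, question: str) -> List[str]:
    """Return the fields the question needs that the record does not hold."""
    return [f for f in REQUIRED_FIELDS[question] if getattr(record, f) is None]


coarse = PackagingRecord(material="plastic")
detailed = PackagingRecord(material="plastic", polymer="PET", colour="clear")
```

Here the coarse record is enough to count plastic tonnage, but it cannot answer a recyclability question until the polymer and colour are captured.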

Data quality

The next step should be to ensure data quality. Quality data can lead to better decision making and better answers. This includes agreeing uniform terms which everyone understands and can adhere to. This is important not just for your team but also for third parties inputting data. Often the language we use is not the same as the language others use. Explaining terms or providing a glossary of terms and definitions can help. Restricting what can be entered into data fields can be used productively as well. Waste, for example, is an area where free-text fields let people describe what they are disposing of in many different ways. Often the term “general mixed waste” is used, but it masks the many resources that may lie within this definition.

On the other hand, EWC (European Waste Catalogue) codes provide us with some detailed choices of labels for waste types, such as 15-01-02, plastic packaging. However, this still covers a wide range of possibilities, from films to bottles to drums and more. Understanding the source of the data and how it is collected can also be very helpful in determining its quality. Often those with a vested interest in getting the data correct, or in getting value from it, will be more reliable than someone who believes there is no value in providing accurate data.
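The idea of restricting what can be entered into a data field can be sketched in a few lines. The list of allowed waste types below is purely hypothetical; in practice it would be agreed with everyone who supplies data and published alongside a glossary.

```python
# Hypothetical controlled vocabulary, for illustration only.
ALLOWED_WASTE_TYPES = {
    "plastic film",
    "plastic bottles",
    "plastic drums",
    "cardboard",
    "glass",
}


def validate_waste_type(entry: str) -> str:
    """Normalise a free-text entry and reject anything outside the agreed list."""
    # Trim, lower-case and collapse repeated spaces before comparing.
    cleaned = " ".join(entry.strip().lower().split())
    if cleaned not in ALLOWED_WASTE_TYPES:
        raise ValueError(
            f"'{entry}' is not an agreed waste type; "
            f"choose one of {sorted(ALLOWED_WASTE_TYPES)}"
        )
    return cleaned
```

With this in place, an entry such as "  Plastic  Film " is accepted and tidied up, while "general mixed waste" is rejected and the person entering it is asked to pick a more specific term.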

Detailing the potential benefits

A key to securing good data may be detailing the potential benefits or positive impacts the data could deliver in the future. If these are clear, a person or company is more likely to invest in data collection and processing.

Context, teaching and experience input

When utilising AI or machine learning, it is essential that you use expert knowledge and experience to check the results or assumptions made by the software. It is important that you teach the software context, parameters, rules and boundaries. If you don’t, and instead blindly follow its outputs, they could well be wrong.

If we are going to try to map the flows of resources and waste through our economy, then understanding which streams you would expect to be the largest is important. Knowing who the largest handlers of materials are at present, or the typical size of a UK-based plastic film sorting plant compared to a bottle plant, is key. These at least give you a view of what you should expect to see and a chance to spot any anomalies.
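One simple way to put expert expectations to work is to compare reported figures against a plausible range and flag anything outside it for a person to review. The sketch below assumes hypothetical plant types and tonnage ranges; the numbers are placeholders, not real industry figures, and real bounds would come from people who know the sector.

```python
# Hypothetical expected annual throughput ranges in tonnes, by plant type.
# Placeholder values for illustration only.
EXPECTED_TONNES = {
    "film_sorting": (1_000, 20_000),
    "bottle_sorting": (5_000, 100_000),
}


def flag_anomalies(records):
    """Return records whose tonnage falls outside the expected range."""
    flagged = []
    for rec in records:
        low, high = EXPECTED_TONNES[rec["plant_type"]]
        if not low <= rec["tonnes"] <= high:
            flagged.append(rec)
    return flagged


# Made-up reported figures used to exercise the check.
reported = [
    {"site": "A", "plant_type": "film_sorting", "tonnes": 8_000},
    {"site": "B", "plant_type": "film_sorting", "tonnes": 250_000},
    {"site": "C", "plant_type": "bottle_sorting", "tonnes": 40_000},
]
```

Calling flag_anomalies(reported) singles out site B, whose figure sits far above the assumed range for a film sorting plant. A flag is a prompt for an expert to look more closely, not proof that the data is wrong.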


Collaboration

There is probably no better way to build this understanding than to collaborate. Our worlds have become so interdependent and interwoven with others that the data and knowledge needed to solve problems such as carbon reduction, circular economy models or cutting out waste crime do not lie with one owner.

This data and knowledge also come in various forms on various systems, which may not understand or speak to one another. Collaboration is a way of ensuring you consider all the options available. Picking the right partners takes time. Working together in a trusted and productive relationship with partners will give the project the best chance of success.


Review

Finally, review. No one gets these things right the first time. When Valpak first started capturing data related to packaging, we did not start with a goal of producing a footprint of plastic consumption or a carbon measure of packaging use in a business.

Over time we have added to and redefined the quantity and types of data we collect. We have refined the templates for collecting data from clients and their suppliers so that we are as clear as possible about the type of data we want and the format we want it in.

We have collected data ourselves directly from packed products when we believe that this is the quickest, most accurate and most expedient way to capture information.

We have worked with suppliers and retailers to try to get the best result for everyone in a form which is insightful and reliable.

The more we are part of the client project team, the better the results. Over time we have reviewed and adjusted our processes based on results, client feedback, increased challenges and an insatiable desire to learn more.

I am sure our Insight Platform will go through many redesigns and changes in the future. I am also convinced we can work more with others.

Our data is at the small end of big data, but it is huge to us, and we are sure its benefits will be huge for mapping carbon flows, identifying circular economy opportunities and supporting measures that have a direct, positive impact on climate change.
