Harnessing big data for agricultural excellence

Part 1: Understanding big data in agriculture

Author

Marie-Hélène Burle

Content from the webinar slides for easier browsing.

Who we are

Simon Fraser University

SFU hosts the Cedar supercomputer, a cluster of 100,400 CPU cores and 1,352 GPUs that will soon be replaced by an even larger cluster.


SFU also works with the Digital Research Alliance of Canada to offer researchers large amounts of computing power to solve challenging data and technology problems, as well as training to optimize their solutions.

SFU’s Big Data Hub

Since 2016, Simon Fraser University’s Big Data Hub has been offering workshops, events, and consulting services to researchers and industry partners to help them stay on top of the fast-evolving data landscape.

BC Centre for Agritech Innovation

Since 2022, SFU BCCAI has been helping small and medium enterprises in the farming industry embrace technology-driven solutions through:

  • agritech projects
  • training & upskilling
  • agritech network

Goals for this workshop

Session 1

Today.

A (hopefully) friendly lecture to:

  • Demystify big data.
  • Demonstrate the critical importance of big data in agriculture and farming.

Session 2

Tomorrow at 11am in the Mount Baker Room.

An interactive workshop to:

  • Brainstorm on how big data can benefit your operation.
  • Help you make the transition to smart farming.

What is big data?

The 3 “V”s: Volume

Before

Farmers took measurements (e.g. of soil moisture) by hand, which produced low volumes of data.

Now

Internet of Things (IoT) devices (e.g. hundreds of soil moisture sensors) collect large volumes of data.
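To get a feel for the volumes involved, here is a rough back-of-the-envelope calculation. The number of sensors, the reading interval, and the size of each reading are made-up assumptions for illustration, not figures from a real farm.

```python
# Rough estimate of the data produced by a soil moisture sensor network.
# All numbers below are hypothetical, chosen only for illustration.
SENSORS = 300            # assumed number of sensors in the fields
READINGS_PER_HOUR = 6    # assumed: one reading every 10 minutes
BYTES_PER_READING = 50   # assumed size of one record (timestamp, sensor id, value)

readings_per_day = SENSORS * READINGS_PER_HOUR * 24
readings_per_year = readings_per_day * 365
megabytes_per_year = readings_per_year * BYTES_PER_READING / 1_000_000

print(f"Readings per day:  {readings_per_day:,}")
print(f"Readings per year: {readings_per_year:,}")
print(f"Storage per year:  {megabytes_per_year:,.0f} MB")
```

Under these assumptions, a single type of sensor already produces tens of thousands of readings a day, far more than anyone could record by hand.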

The 3 “V”s: Variety

Before

There was a limited set of data a producer could collect.

Now

Many different types of data are now available (e.g. satellite images, market data gathered from internet browsing…).

The 3 “V”s: Velocity

Before

A farmer could only gather so much data, even with a lot of employees.

Now

Data is generated in real time and accumulates at high speed.

Why is it important?

Why has big data become so essential?

All this data is key to the development of artificial intelligence (AI).

So …

What is AI?

AI

Very loosely, you can think of neural networks (the most powerful form of AI) as an attempt to create a computer model that mimics the brain:

Biological neurons

Neural network

In traditional computing, a programmer writes code that gives a computer detailed instructions on what to do.

These instructions are called a program.

Some action


With neural networks, instead of writing a program, a programmer writes a model, then feeds it lots of data and the model changes little by little over time.

The model “learns” thanks to this data.

Simplilearn has a video explaining how neural networks work in 5 min:


This learning is nothing magical: some numbers in the model get tweaked a tiny bit, with each new piece of data, to make the model a little bit better.

From xkcd.com

Basically, we start with a model, train it with data, and we end up with a trained model that can be used as a traditional computer program.
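The idea of tweaking numbers a tiny bit with each new piece of data can be sketched with a toy model that has a single adjustable number. This is a minimal illustration, not how a real neural network is built: the data, the learning rate, and the function names `predict` and `train` are all arbitrary choices made for this example.

```python
# A toy "model" with a single adjustable number (its weight).
# The goal is to learn the rule y = 3 * x from examples alone.
# The data, learning rate, and number of passes are arbitrary choices.

def predict(weight, x):
    """The model: multiply the input by the current weight."""
    return weight * x

def train(examples, weight=0.0, learning_rate=0.01, passes=200):
    """Nudge the weight a little after each example to reduce the error."""
    for _ in range(passes):
        for x, y in examples:
            error = predict(weight, x) - y
            # The gradient of the squared error with respect to the weight
            # is 2 * error * x; take a small step in the opposite direction.
            weight -= learning_rate * 2 * error * x
    return weight

# Inputs paired with the right answers.
examples = [(1, 3), (2, 6), (3, 9), (4, 12)]
trained_weight = train(examples)

print(f"Learned weight: {trained_weight:.3f}")
print(f"Prediction for x = 10: {predict(trained_weight, 10):.2f}")
```

Nobody wrote "multiply by 3" in the code. The weight starts at 0 and ends up very close to 3 purely because of the examples. After that, `predict` with the trained weight can be called like any ordinary function.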

That trained model can be used to get predictions, generate art or speech, identify objects in images, or detect spam in emails…

The only difference from traditional computing is that we don’t write the program ourselves. Instead, we write a starting point (the untrained model), then train it with A LOT of data and let it get better by itself.


To get a very good model at the end—one that can write human language like ChatGPT or voice assistants for instance—you really need A LOT OF DATA.

AI: an example

Imagine that you want a program able to detect tomatoes in pictures.

This could be very useful to get real-time data on your upcoming crop so that you can plan accordingly (hiring staff, setting prices, looking for markets).

For humans, this is straightforward.

Yet this is practically impossible to achieve with traditional programming because too many factors vary from one picture to the next (location of the tomatoes in the image, quality of the picture, colour of the tomatoes…).
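To illustrate why hand-written rules break down, here is a deliberately naive sketch. The colour thresholds and the tiny four-pixel "images" are hypothetical, invented only to show the problem.

```python
# A hand-written rule for spotting tomato pixels, with hypothetical thresholds.
# Each "image" is a tiny list of (red, green, blue) pixel values from 0 to 255.

def looks_like_tomato(pixel):
    """Hard-coded rule: a tomato pixel is very red, with little green or blue."""
    red, green, blue = pixel
    return red > 200 and green < 80 and blue < 80

def count_tomato_pixels(image):
    """Count the pixels that satisfy the hard-coded rule."""
    return sum(looks_like_tomato(pixel) for pixel in image)

# Two tomato pixels and two leaf pixels, photographed in bright sunlight.
sunny_image = [(220, 40, 30), (230, 50, 45), (40, 160, 50), (35, 150, 45)]

# The same scene at dusk: every colour is darker.
dusk_image = [(120, 25, 20), (125, 30, 25), (20, 85, 25), (18, 80, 22)]

print("Tomato pixels found in sunlight:", count_tomato_pixels(sunny_image))
print("Tomato pixels found at dusk:    ", count_tomato_pixels(dusk_image))
```

The rule finds both tomato pixels in sunlight and none at dusk, even though the scene is the same. Patching the rule for dim light would break it for something else (unripe tomatoes, red soil, blurry pictures), and the list of special cases never ends.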


However, we can feed a neural network a very large number of images with and without tomatoes, along with labels that give the number of tomatoes in each image. This trains it to recognize tomatoes in images that it has never seen.

With each image/label pair (e.g. “Picture 34, label: 56 tomatoes”), the model gets a little better.


We don’t write the program to do this. We write the starting model, then let it adjust by itself based on the data.

It is a form of learning by experience, which is exactly what happens to us as we grow up. It is a form of programming that is much closer to how brains work than traditional programming.

Why now?

The idea is not new, but it is only recently that we have had enough computing power, internet connectivity, and storage capacity to implement it.

Big data and AI in agriculture

Smart farming

We have already talked about data collection through the Internet of Things (e.g. moisture sensors).

Many AI tools are involved in improving all domains of agriculture, from irrigation management to supply chain and demand forecasting.

The benefits are huge for both farmers (increased yields, reduced costs, better planning) and the environment (optimization of resources and pesticide use).

Decision making

Before

Farmers had to make decisions as best they could based on their experience and their limited data.

Now

Farmers can use powerful models to make informed decisions in real time. This can be followed by the automation of some actions (e.g. watering).
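The step from a prediction to an automated action can be sketched as follows. Everything here is a hypothetical stand-in: `forecast_next_moisture` simply extends the latest trend and takes the place of a real trained model, and the 25% threshold and sensor readings are invented for the example.

```python
# Sketch: turn a model's prediction into an automated action.
# The forecast function, threshold, and readings are hypothetical stand-ins.

def forecast_next_moisture(recent_readings):
    """Stand-in for a trained model: extend the most recent trend by one step."""
    last, previous = recent_readings[-1], recent_readings[-2]
    return last + (last - previous)

def should_water(recent_readings, threshold=25.0):
    """Water if soil moisture (in %) is predicted to drop below the threshold."""
    return forecast_next_moisture(recent_readings) < threshold

drying_field = [34.0, 31.0, 28.0, 26.0]   # moisture falling steadily
stable_field = [33.0, 32.5, 33.0, 33.5]   # moisture holding steady

for name, readings in [("drying field", drying_field), ("stable field", stable_field)]:
    action = "start irrigation" if should_water(readings) else "no action"
    print(f"{name}: {action}")
```

In a real setup, the forecast would come from a model trained on much richer data (weather, soil type, crop stage), and the action would be sent to an irrigation controller instead of being printed.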

Livestock monitoring

A case study

Livestock successfully monitored remotely via sound sensors and algorithms for background noise filtering.

Animal welfare and efficiency improvements.

Markets and supply chains

This next section looks at a review of market analysis and supply chain optimization.

A review

A systematic literature review of peer-reviewed articles and conference papers published between 2014 and 2024 showed large improvements in demand forecasting accuracy and supply chain optimization.

Real-time data analysis helped with predictive maintenance and with managing market volatility, resource constraints, and climate variability.

Challenges

There are challenges to the implementation of such transformative methods.

  • Infrastructure development.
  • Skill gaps among agricultural professionals.

We are here to help!

Come to session 2 tomorrow!

Session 2

Join us tomorrow at 11am in the Mount Baker Room for our 2nd session:

Diagnosing and implementing big data solutions.

We will have an interactive workshop to:

  • Brainstorm on how big data can benefit your operation.
  • Help you make the transition to smart farming.

If you are unable to attend, you can find the slides here. Note, however, that the session is an interactive clinic and most of the material will be covered in the activity itself.

Resources

Getting in touch

Understanding neural networks

To go a bit further than the video mentioned earlier, 3Blue1Brown by Grant Sanderson has a series of 4 videos on neural networks that is fun and easy to watch, and that does an excellent job of introducing how a simple neural network works:

Literature

Open-access preprints:

Arxiv Sanity Preserver by Andrej Karpathy
ML papers in the computer science category on arXiv
ML papers in the stats category on arXiv
Distill ML research online journal

Acknowledgements

Carson Li (BCCAI) suggested an outline for this talk.

Ian Chan (BCCAI) provided copious feedback.