Lazy evaluation

Author

Marie-Hélène Burle

When it comes to high-performance computing, one of the strengths of Polars is that it supports lazy evaluation. With the lazy API, operations on a LazyFrame return immediately: rather than computing anything, Polars records each step in a query plan. When you finally ask for the result, Polars optimizes that plan as a whole (for instance by applying filters and column selections as early as possible) before running it, much as a compiler optimizes a program before it is executed.

If you want to speed up your code, use lazy execution whenever possible.

Reading in data to a LazyFrame

Ideally, you want to use the lazy API from the start, when you read in the data.

In the previous examples, we used polars.read_csv to read our data. This returns a Polars DataFrame:

import polars as pl

url = "https://cdn.jsdelivr.net/npm/vega-datasets/data/disasters.csv"

df = pl.read_csv(url)
type(df)
polars.dataframe.frame.DataFrame

Instead, you can use polars.scan_csv to create a LazyFrame:

df_lazy = pl.scan_csv(url)
type(df_lazy)
polars.lazyframe.frame.LazyFrame

There are scan functions for several of the other file formats Polars can read, including polars.scan_parquet, polars.scan_ipc, and polars.scan_ndjson.

Converting to a LazyFrame

If you already have a DataFrame, you can create a LazyFrame from it with the polars.DataFrame.lazy method:

df_lazy = df.lazy()

Getting the results

To get results from a LazyFrame, you use polars.LazyFrame.collect.

This won’t work because a LazyFrame has no attribute shape:

df_lazy.filter(pl.col("Year") == 2001).shape
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[17], line 1
----> 1 df_lazy.filter(pl.col("Year") == 2001).shape

AttributeError: 'LazyFrame' object has no attribute 'shape'

You need to collect the result first:

df_lazy.filter(pl.col("Year") == 2001).collect().shape
(9, 3)

collect runs the query and returns the result as a DataFrame. Because the query is optimized first, Polars only needs to process the rows and columns that the query actually requires:

type(df_lazy.filter(pl.col("Year") == 2001).collect())
polars.dataframe.frame.DataFrame

When the LazyFrame comes from a scan function, this can let you work with data too big to fit in memory, as long as the result you collect is small enough.