Marie-Hélène Burle
April 16, 2024
Library for Python developed by Google
Key data structure: Array
Composition, transformation, and differentiation of numerical programs
Compilation for CPUs, GPUs, and TPUs
NumPy-like and lower-level APIs
Requires strict functional programming
Summarized from a blog post by Chris Rackauckas
Install from pip wheels:
python -m pip install jax --no-index
Windows: GPU support only via WSL
NumPy is easy-going:
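For instance (one possible illustration, since the original example isn't shown), NumPy happily accepts a Python list where JAX requires an array:

import numpy as np
import jax.numpy as jnp

np.sum([1, 2, 3])    # works: returns 6
jnp.sum([1, 2, 3])   # raises a TypeError: jax.numpy functions only accept arrays and scalars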
NumPy will error if you index out of bounds:
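A sketch of the contrast (reusing the imports above; the values are assumptions):

np.arange(5)[10]     # IndexError: index 10 is out of bounds for axis 0 with size 5
jnp.arange(5)[10]    # no error: JAX clamps the index and returns the last element (4)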
Traditional pseudorandom number generators are based on a nondeterministic state obtained from the OS
This is slow and problematic for parallel execution
JAX relies on an explicitly-set random state called a key:
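A sketch of creating such a key (the seed 18 is inferred from the output below):

from jax import random

initial_key = random.PRNGKey(18)
print(initial_key)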
[ 0 18]
Each key can only be used for one random function call, but it can be split into new keys (after the split, initial_key can't be used anymore):
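A possible split producing the two keys printed below (the names new_key1 and new_key2 are assumptions):

new_key1, new_key2 = random.split(initial_key)
print(new_key1)
print(new_key2)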
[4197003906 1654466292]
[1685972163 1654824463]
We need to keep one key to split whenever we need new keys, and we can use the other one
To make sure we don’t reuse a key by accident, it is best to overwrite the initial key with one of the new ones
Here are easier names:
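A sketch of the usual convention (starting from a fresh key; the seed is arbitrary):

key = random.PRNGKey(0)
key, subkey = random.split(key)   # key is overwritten (kept for future splits); subkey is used now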
We can now use subkey to generate a random array:
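For example (the shape is an assumption):

x = random.normal(subkey, (3, 3))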
JAX uses asynchronous dispatch
Instead of waiting for a computation to complete before control returns to Python, the computation is dispatched to an accelerator and a future is created
To get proper timings, we need to make sure the future is resolved by using the block_until_ready() method
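A timing sketch (array size and names are assumptions):

import time
import jax.numpy as jnp
from jax import random

key = random.PRNGKey(0)
x = random.normal(key, (3000, 3000))

start = time.perf_counter()
jnp.dot(x, x).block_until_ready()   # wait until the computation has actually completed
print(f"{time.perf_counter() - start:.4f} s")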
import jax.numpy as jnp
from jax import jit, random

key = random.PRNGKey(8)
key, subkey1, subkey2 = random.split(key, 3)
a = random.normal(subkey1, (500, 500))
b = random.normal(subkey2, (500, 500))

def sum_squared_error(a, b):
    return jnp.sum((a - b) ** 2)
Our function could simply be used as:
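Presumably a direct call:

sum_squared_error(a, b)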
Our code will run faster if we create a JIT compiled version and use that instead:
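A sketch of the compiled version:

sum_squared_error_jit = jit(sum_squared_error)
sum_squared_error_jit(a, b)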
Alternatively, this can be written as:
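Most likely the one-line form:

jit(sum_squared_error)(a, b)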
Or with the @jit decorator:
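A sketch of the decorated version:

@jit
def sum_squared_error(a, b):
    return jnp.sum((a - b) ** 2)

JIT compilation fails, however, when Python control flow depends on the value of a traced argument. A sketch reproducing the error shown below (using the cond_func example defined further down):

def cond_func(x):
    if x < 0.0:
        return x ** 2.0
    else:
        return x ** 3.0

jit(cond_func)(2.0)   # raises jax.errors.TracerBoolConversionError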
jax.errors.TracerBoolConversionError: Attempted boolean conversion of traced array with shape bool[]
JIT compilation uses tracing of the code based on shape and dtype so that the same compiled code can be reused for new values with the same characteristics
Tracer objects are not real values but abstract representations that are more general
Here, an abstract general value does not work as JAX wouldn't know which branch to take
One solution is to tell jit() to exclude the problematic arguments from tracing, specifying them by position:
def cond_func(x):
    if x < 0.0:
        return x ** 2.0
    else:
        return x ** 3.0

cond_func_jit = jit(cond_func, static_argnums=(0,))

print(cond_func_jit(2.0))
print(cond_func_jit(-2.0))
8.0
4.0
Alternatively, we can tell jit() to exclude the problematic arguments from tracing, specifying them by name:
def cond_func(x):
    if x < 0.0:
        return x ** 2.0
    else:
        return x ** 3.0

cond_func_jit_alt = jit(cond_func, static_argnames="x")

print(cond_func_jit_alt(2.0))
print(cond_func_jit_alt(-2.0))
8.0
4.0
Another solution is to use one of the structured control flow primitives such as lax.cond:
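A sketch consistent with the outputs below (the original code isn't shown; lax.cond selects a branch based on a traced predicate):

from jax import lax

def cond_func_lax(x):
    return lax.cond(x[0] < 0.0,
                    lambda v: v ** 2.0,   # branch used when the predicate is True
                    lambda v: v ** 3.0,   # branch used when the predicate is False
                    x)

cond_func_lax_jit = jit(cond_func_lax)
cond_func_lax_jit(jnp.array([2.0]))
cond_func_lax_jit(jnp.array([-2.0]))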
Array([8.], dtype=float32)
Array([4.], dtype=float32)
Other control flow primitives:
lax.while_loop
lax.fori_loop
lax.scan
Other pseudo-dynamic control flow functions:
lax.select (NumPy API: jnp.where and jnp.select)
lax.switch (NumPy API: jnp.piecewise)
Similarly, you can mark problematic operations as static so that they don't get traced during JIT compilation:
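A sketch reproducing the error shown below, following the example in the JAX documentation (the original code isn't shown):

import jax.numpy as jnp
from jax import jit

@jit
def f(x):
    return x.reshape(jnp.array(x.shape).prod())   # the result of prod() is traced here

x = jnp.ones((2, 3))
f(x)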
TypeError: Shapes must be 1D sequences of concrete values of integer type, got [Traced<ShapedArray(int32[])>with<DynamicJaxprTrace(level=1/0)>]
The problem here is that the result of prod() is a traced (abstract) value, unknown at compilation time, while shapes must be concrete integer values
One solution is to use the NumPy version of prod():
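A sketch of the fix (np.prod runs at trace time on the concrete, static shape):

import numpy as np

@jit
def f(x):
    return x.reshape((np.prod(x.shape),))   # the new shape is a concrete integer

print(f(jnp.ones((2, 3))))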
[1. 1. 1. 1. 1. 1.]
import jax
import jax.numpy as jnp

x = jnp.array([1., 4., 3.])
y = jnp.array([8., 1., 2.])

def f(x, y):
    return 2 * x**2 + y

jax.make_jaxpr(f)(x, y)   # inspect the jaxpr (intermediate representation) obtained by tracing f
{ lambda ; a:f32[3] b:f32[3]. let
c:f32[3] = integer_pow[y=2] a
d:f32[3] = mul 2.0 c
e:f32[3] = add d b
in (e,) }
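The next example uses an impure function. A sketch consistent with the discussion and the outputs below (the exact definition isn't shown):

a = jnp.ones(3)      # global variable

@jit
def f(x):
    return x + a     # f reads a from the global environment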
f uses the variable a from the global environment
The output does not solely depend on the inputs: not a pure function
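Presumably, running it a first time (this triggers tracing):

print(a)
print(f(jnp.ones(3)))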
[1. 1. 1.]
[2. 2. 2.]
Things seem ok here because this is the first run (tracing)
Now, let’s change the value of a to an array of zeros:
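For instance:

a = jnp.zeros(3)
print(a)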
[0. 0. 0.]
And rerun the same code:
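That is:

print(f(jnp.ones(3)))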
[2. 2. 2.]
Our cached compiled program is run and we get a wrong result
The new value for a will only take effect if we re-trigger tracing by changing the shape and/or dtype of x:
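A sketch consistent with the outputs below (a is also given the new shape so that it broadcasts with x):

a = jnp.zeros(4)
print(a)
print(f(jnp.ones(4)))   # the new shape of x forces retracing, which picks up the new value of a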
[0. 0. 0. 0.]
[1. 1. 1. 1.]
Passing to f() an argument of a different shape forced retracing
Side effects: anything besides the returned output
Examples: printing, modifying a global variable
The side effects will happen during tracing, but not on subsequent runs. You cannot rely on side effects in your code
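A sketch consistent with the outputs below (the function body is an assumption; the printed message matches the output):

@jit
def f(x):
    print("Calculating sum")
    return x + x

print(f(jnp.arange(3)))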
Calculating sum
[0 2 4]
Printing happened here because this is the first run
Let’s rerun the function:
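That is, calling it again with the same input:

print(f(jnp.arange(3)))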
[0 2 4]
This time, no printing
Considering the function f:
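The definition isn't shown; a hypothetical f consistent with the outputs below:

from jax import grad, jit

def f(x):
    return x ** 2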
We can create a new function dfdx that computes the gradient of f w.r.t. x:
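Presumably:

dfdx = grad(f)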
dfdx returns the derivatives:
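For instance, evaluated at 2.0 (the value is inferred from the output below, assuming the hypothetical f above):

print(dfdx(2.0))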
4.0
Transformations can be composed:
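For example, combining jit and grad (consistent with the two outputs below):

print(jit(grad(f))(2.0))
print(grad(jit(f))(2.0))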
4.0
4.0
Other autodiff methods:
jax.vjp
jax.jvp
With a single variable, the grad function calls can be nested:
d2fdx = grad(dfdx) # function to compute 2nd order derivatives
d3fdx = grad(d2fdx) # function to compute 3rd order derivatives
...
With several variables:
jax.jacfwd for forward mode
jax.jacrev for reverse mode
JAX has a nested container structure (the pytree), extremely useful for DNNs
Other transformations allow running computations in parallel across batches of arrays (see the sketch after this list):
jax.vmap
jax.pmap
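A minimal sketch of jax.vmap (names and shapes are assumptions):

import jax.numpy as jnp
from jax import vmap, random

def dot(v1, v2):
    return jnp.dot(v1, v2)         # works on a single pair of vectors

batched_dot = vmap(dot)            # vectorized over a leading batch dimension

key = random.PRNGKey(0)
k1, k2 = random.split(key)
a = random.normal(k1, (10, 3))     # batch of 10 vectors of length 3
b = random.normal(k2, (10, 3))
print(batched_dot(a, b).shape)     # (10,)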
jax.numpy is a high-level NumPy-like API wrapped around jax.lax
jax.lax is a more efficient lower-level API, itself wrapped around XLA