Pyro is a deep probabilistic programming framework built on PyTorch. TensorFlow has its own PPL branch with an Edward flavor, and there is the inevitable PyMC3 as well, but Pyro feels very natural and its API more direct than either of those.
The example below is simplistic but shows what I mean by ‘natural’.
import torch
import pyro
import pyro.distributions as dist
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sea

sea.set()
%matplotlib inline
You get immediate results with Pyro, or should one say ‘eager execution’:
normal = dist.Normal(0, 1)
x = normal.sample()
print(f"sample: {x}")
print(f"log probability: {normal.log_prob(x)}")
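Because everything executes eagerly on top of PyTorch, you can also differentiate through a log probability on the spot. A minimal sketch (plain PyTorch autograd; the ‘loc’ variable is my own addition, not part of the original example):

# hypothetical sketch: gradients flow through log_prob immediately
loc = torch.tensor(0.0, requires_grad=True)
normal = dist.Normal(loc, 1.0)
x = normal.sample()
# log_prob is an ordinary differentiable PyTorch expression in loc
normal.log_prob(x).backward()
print(f"d log p(x) / d loc: {loc.grad}")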
Defining a variable is different than in PyMC3. A normally distributed variable ‘size’ would be:
x = pyro.sample("size", normal)
print(x)
By combining probabilistic variables you can create a simplistic weather model like so:
def weather():
    # flip a biased coin: cloudy with probability 0.3
    cloudy = pyro.sample('cloudy', dist.Bernoulli(0.3))
    cloudy = 'cloudy' if cloudy.item() == 1.0 else 'sunny'
    # the temperature distribution depends on the sampled weather state
    mean_temp = {'cloudy': 55.0, 'sunny': 75.0}[cloudy]
    scale_temp = {'cloudy': 10.0, 'sunny': 15.0}[cloudy]
    temp = pyro.sample('temp', dist.Normal(mean_temp, scale_temp))
    return cloudy, temp.item()
The logic is straightforward and you can sample a few states:
for _ in range(3):
    print(weather())
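The string names passed to pyro.sample are not cosmetic: Pyro records each one as a site in an execution trace, which is what inference later operates on. A minimal sketch to peek at those sites, using pyro.poutine (part of standard Pyro):

import pyro.poutine as poutine

# record one execution of the weather model and list the named sample sites
trace = poutine.trace(weather).get_trace()
for name, site in trace.nodes.items():
    if site["type"] == "sample":
        print(name, site["value"])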
Let’s model the sales of ice-cream as a function of the weather:
def ice_cream_sales():
    cloudy, temp = weather()
    expected_sales = 200. if cloudy == 'sunny' and temp > 80.0 else 50.
    ice_cream = pyro.sample('ice_cream', dist.Normal(expected_sales, 10.0))
    return ice_cream
Unlike in other PPL frameworks, you can get an idea of the result directly from this definition:
print(f"Possible sales of ice-cream: {ice_cream_sales():.2f}")
The key idea behind PPL is to have a marginal distribution of the relevant variable. Marginal means, in simple terms, getting a probability distribution with all the inner dependencies integrated out.
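To make ‘integrated out’ concrete, here is a hypothetical brute-force version (no Pyro inference machinery, just rerunning the generative story): averaging over many runs washes out the weather variables, leaving an estimate of the marginal expectation of sales.

# hypothetical brute-force marginalization: averaging over runs
# integrates out 'cloudy' and 'temp'
n = 1000
mean_sales = sum(ice_cream_sales().item() for _ in range(n)) / n
print(f"Monte Carlo estimate of expected sales: {mean_sales:.2f}")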
To do this in Pyro you first define the posterior:
posterior = pyro.infer.Importance(ice_cream_sales, num_samples=100)
and feed it to the EmpiricalMarginal:
marginal = pyro.infer.EmpiricalMarginal(posterior.run())
To see it in a histogram use something like this:
plt.hist([marginal().item() for _ in range(1000)], range=(5.0, 100.0))
plt.title("P(sales)")
plt.xlabel("sales")
plt.ylabel("#");
Considering the weather model and how the sales depend on it, the most likely value is around 50.
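That peak is easy to sanity-check: high sales require a sunny day hotter than 80 degrees, and the rest of the time expected sales fall back to 50. A hypothetical check (not part of the original example) that estimates how often the high-sales condition occurs by sampling the weather directly:

# hypothetical sanity check: estimate P(sunny and temp > 80)
n = 10000
hot_sunny = 0
for _ in range(n):
    cloudy, temp = weather()
    if cloudy == 'sunny' and temp > 80.0:
        hot_sunny += 1
print(f"P(high expected sales) is roughly {hot_sunny / n:.3f}")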
Both the way the story is assembled and how concrete the result is strike me as fantastically clear. Bayesian reasoning and PPL can be abstract at first, but this example brings it all together for me.
If you look at a linear regression in PyMC3 versus Pyro, you get something like the following.
%matplotlib inline
import pymc3 as pm
import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib.pyplot as plt
import seaborn as sns

palette = 'muted'
sns.set_palette(palette)
sns.set_color_codes(palette)
np.set_printoptions(precision=2)
pd.set_option('display.precision', 2)
np.random.seed(1)

N = 100
alfa_real = 2.5
beta_real = 0.9
eps_real = np.random.normal(0, 0.5, size=N)

x = np.random.normal(10, 1, N)
y_real = alfa_real + beta_real * x
y = y_real + eps_real

plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.plot(x, y, 'b.')
plt.xlabel('$x$', fontsize=16)
plt.ylabel('$y$', fontsize=16, rotation=0)
plt.plot(x, y_real, 'k')
plt.subplot(1, 2, 2)
sns.kdeplot(y)
plt.xlabel('$y$', fontsize=16)
plt.tight_layout()

with pm.Model() as model:
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=1)
    epsilon = pm.HalfCauchy('epsilon', 5)
    mu = pm.Deterministic('mu', alpha + beta * x)
    y_pred = pm.Normal('y_pred', mu=mu, sd=epsilon, observed=y)
    start = pm.find_MAP()
    step = pm.Metropolis()
    trace = pm.sample(10000, step, start)

pm.traceplot(trace)
The same in Pyro looks like this:
import torch
import pyro
from pyro.distributions import Normal, HalfCauchy
from pyro.infer.mcmc import NUTS, MCMC
from pyro.infer import EmpiricalMarginal
import seaborn as sns
import matplotlib.pyplot as plt

def model(X, y):
    b_est = pyro.sample("beta", Normal(0, 1))
    a_est = pyro.sample("alpha", Normal(0, 1))
    s2_est = pyro.sample("variance", HalfCauchy(1))
    y_hat = pyro.sample("yhat", Normal(loc=b_est * X + a_est, scale=s2_est), obs=y)
    return y_hat

X = torch.tensor(range(100)).float()
b = torch.tensor(3.0).float()
a = torch.tensor(2.0).float()
noise = torch.randn((100,)).float()
y = X * b + a + noise

nuts_kernel = NUTS(model, adapt_step_size=True)
mcmc_run = MCMC(nuts_kernel, num_samples=1000, warmup_steps=400).run(X, y)

posterior_beta = EmpiricalMarginal(mcmc_run, "beta")
posterior_alpha = EmpiricalMarginal(mcmc_run, "alpha")
print(posterior_beta.mean)
print(posterior_alpha.mean)

fig, ax = plt.subplots(1, 2, figsize=(15, 5))
sns.distplot(posterior_beta.get_samples_and_weights()[0], ax=ax[0])
sns.distplot(posterior_alpha.get_samples_and_weights()[0], ax=ax[1])
fig.show()
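To go one step beyond the posterior means, here is a hypothetical summary (assuming the same 0.x-era EmpiricalMarginal API used above) that pulls a rough 90% credible interval for the slope out of the empirical samples:

# hypothetical summary: a rough 90% credible interval for beta,
# computed from the empirical posterior samples by rank statistics
samples = posterior_beta.get_samples_and_weights()[0].squeeze()
k_low = max(1, int(0.05 * len(samples)))
k_high = int(0.95 * len(samples))
lower = samples.kthvalue(k_low)[0].item()
upper = samples.kthvalue(k_high)[0].item()
print(f"90% credible interval for beta: [{lower:.2f}, {upper:.2f}]")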