The rejection sampling algorithm, and how guess-and-check is a special case.
Author
Jacob Louis Hoover
Published
August 29, 2022
1 Definitions
Rejection sampling1 is an algorithm for obtaining samples from some target density (which is hard to sample from directly), using a proposal distribution, which can be sampled from. The algorithm defines a scheme for uniformly sampling from the area under the target density.
Another algorithm, which I’ll call the simple guess-and-check algorithm, is sometimes also referred to as rejection sampling2. This algorithm obtains samples from a conditional distribution by sampling from the unconditioned (joint) distribution: it draws samples iteratively and rejects them until one is drawn which satisfies a predefined condition.
2 As in e.g., Freer, Mansinghka, and Roy (2010), and probably other places
These two algorithms have slightly different goals and requirements, and are described explicitly below. The second is in fact a special case of the first. But because the setup is so different, the relationship between them is perhaps not obvious at first glance.
Rejection sampling algorithm
Goal: to sample from an arbitrary distribution z\sim \pi(Z).
Requirements:
you must be able to evaluate \pi^\ast(z) \propto \pi(z) (that is, you can score \pi up to a normalizing constant)
you have some proposal distribution q from which you can sample, whose support contains the support of \pi.
Algorithm:
1. choose k such that \forall z,\ kq(z) \ge \pi^\ast(z)
2. sample z_0\sim q
3. sample u\sim \operatorname{Unif}([0,kq(z_0)])
4. if u\le \pi^\ast(z_0) then accept, else start over from 2
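As a minimal sketch of these steps (with hypothetical helper names, not the implementation used later in this post), the loop version in Julia might look like the following, assuming a scorer `πstar` for \pi^\ast, a proposal distribution `Q` that can be sampled and scored, and a precomputed constant `k`:

```julia
using Distributions

# Hypothetical helper: draw one sample by looping until acceptance.
function rejection_sample_one(πstar, Q::Distribution, k)
    while true
        z0 = rand(Q)                           # step 2: z0 ~ q
        u  = rand(Uniform(0, k * pdf(Q, z0)))  # step 3: u ~ Unif([0, k·q(z0)])
        u <= πstar(z0) && return z0            # step 4: accept, else start over from step 2
    end
end
```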
Guess-and-check algorithm
Goal: to sample subject to a condition. That is, to sample z\sim p(Z\mid f(Z)=\texttt{True}), where Z is some random variable and f(Z) is a deterministic predicate of Z which must be true for the sample to be accepted.
Requirements:
you must be able to sample from the unconditioned distribution p(Z)
you must be able to evaluate f(z) for all z, to check the condition is met
Algorithm:
1. sample z_0\sim p(Z)
2. if f(z_0)=\texttt{True} then accept, else start over from 1
3. return z_0
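A minimal sketch in the same spirit (again with hypothetical names): given a function `sample_p` that draws from p(Z) and a predicate `f`, the loop is simply:

```julia
# Hypothetical helper: guess from p(Z) and check the predicate f.
function guess_and_check(sample_p, f)
    while true
        z0 = sample_p()      # step 1: z0 ~ p(Z)
        f(z0) && return z0   # steps 2–3: accept and return z0, else start over from step 1
    end
end

# e.g., sample a standard normal conditioned on being greater than 1:
# guess_and_check(randn, z -> z > 1)
```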
2 Demonstrations
Let’s look at some examples. First I’ll load some packages and define a couple of plotting functions to use below.
using Distributions, Plots, StatsPlots, LaTeXStrings
import IntervalUnionArithmetic: interval, ∪
import StatsBase: fit, Histogram
import LinearAlgebra: normalize

"""
Plot the target, proposal, and scaled proposal for the rejection sampling setup.
Optionally also plot the acceptance probability.
"""
function plot_rejection_sampling_setup(;
        target_density, proposal_distribution::Distribution,
        title="Rejection sampling setup",
        subtitle="Target, proposal, and scaled proposal",
        xmin=-15, xmax=115, size=(600, 300), xresolution=1000, fmt=:svg,
        plot_accept_prob=false)
    x = xmin:(xmax-xmin)/xresolution:xmax
    p(x) = target_density(x)
    q(x) = pdf(proposal_distribution, x)
    # Define the weight function
    w = x -> p(x)/q(x)
    # k is the constant by which to multiply q so it upperbounds p
    k = maximum(w.(xmin:xmax))
    p1 = plot(x, [p.(x) q.(x) k*q.(x)],
        xlabel=L"x",
        label=[L"target: $\pi^{\ast}(x)$" L"proposal: $q(x)$" L"$k\cdot q(x)$"],
        ls=[:dash :solid :dot], color=[:green :purple :purple],
        linewidth=[4 2 4], alpha=[0.4 0.5 0.5], fmt=fmt)
    plot!(p1, x, [(0 .* x) p.(x)], fillrange=[p.(x) (k*q.(x))],
        lw=0, fillalpha=[0.20 0.15],
        lab=["accept region" "reject region"],
        c=[:darkgreen :darkred], legend_position=:topleft)
    if plot_accept_prob
        plot!(p1, title=subtitle)
        p2 = plot(x, (x -> p(x)/(k*q(x))).(x),
            xlabel=L"x", label=nothing,
            title=L"accept probability: $p(x)/k\cdot q(x)$",
            ls=:solid, color=:black, linewidth=1, alpha=0.75, fmt=fmt)
        plot(p1, p2, layout=grid(2, 1, heights=[2/3, 1/3]),
            plot_title=title, plot_titlelocation=:left, titlelocation=:left,
            plot_titlefontsize=11, titlefontsize=9,
            size=(size[1], (3/2)*size[2]))
    else
        plot(p1, plot_title=title*": "*subtitle,
            plot_titlelocation=:left, titlelocation=:left,
            plot_titlefontsize=11, titlefontsize=9, size=size)
    end
end;

"Run the rejection sampling algorithm, and plot the rejection sampling estimate"
function plot_rejection_sampling_estimate(;
        N::Integer, target_density, proposal_distribution::Distribution,
        title="Rejection sampling estimate of target density",
        bins=100, histogram_scaling_factor=1.0,
        xmin=-15, xmax=115, size=(600, 300), xresolution=1000, fmt=:svg)
    x = xmin:(xmax-xmin)/xresolution:xmax
    q(x) = pdf(proposal_distribution, x)
    # Define the weight function as the ratio of target to proposal densities
    w(x) = target_density(x)/q(x)
    # k is the constant by which to multiply q so it upperbounds p
    k = maximum(w.(xmin:xmax))
    samples = rejection_sample(
        N = N,
        target_density = target_density,
        proposal_distribution = proposal_distribution)
    N_success = length(samples)
    pct_success = round(N_success/N, digits=2)
    plot(x, [target_density.(x) q.(x)],
        xlabel=L"x", legend=:topleft,
        label=[L"target: $\pi^{\ast}(x)$" L"proposal: $q(x)$"],
        ls=[:dash :solid], color=[:green :purple],
        linewidth=[4 2], alpha=[0.4 0.5],
        plot_title=title, plot_titlelocation=:left, titlelocation=:left,
        plot_titlefontsize=11, titlefontsize=9,
        title="accepted $N_success/$N"*" = $pct_success%",
        size=size, fmt=fmt)
    h = fit(Histogram{Float64}, samples, nbins=bins) |> normalize
    h.weights .*= histogram_scaling_factor
    plot!(h; α=0.20, label="estimate", lw=0, color=:darkgreen)
    # makes a scaled version of this:
    # histogram!(samples, normalize=true, α=0.2, label="estimate", bins=bins, lw=0)
end;
2.1 Rejection sampling
Let’s define a concrete proposal distribution and target density to use as an example,
# Define a proposal distribution Q
Q = Normal(55, 30)  # the proposal density is q(x) = pdf(Q, x)

# Define a target density
# note, this example is in fact a pdf (which is normalized and we can sample from), but it needn't be
P = MixtureModel([Normal(20, 10), Chisq(60), Normal(92, 4)], [.3, .67, .03])
πstar(x) = pdf(P, x);  # target density
and look at a plot of these:
plot_rejection_sampling_setup(target_density = πstar, proposal_distribution = Q, plot_accept_prob=true)
the proposal distribution Q is a normal distribution \mathcal{N}(\mu=55, \sigma=30), which we can sample from.
the target density \pi^\ast is something more complicated (in this example, we can of course sample from the mixture distribution used to define it, but the point is that we don’t need to be able to; in principle, the density might not correspond to anything we know how to sample from).
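As a small side calculation (not part of the original setup code): since \pi^\ast here happens to be a normalized pdf, the overall acceptance probability of the sampler is 1/k, where k=\max_x \pi^\ast(x)/q(x). We can estimate it numerically on a grid:

```julia
# Estimate k and the implied acceptance rate (≈ 1/k, since πstar is normalized).
xs = -15:0.1:115                             # grid covering the region plotted above
k  = maximum(πstar.(xs) ./ pdf.(Q, xs))
println("k ≈ ", round(k, digits=2), ";  acceptance rate ≈ ", round(1/k, digits=2))
```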
Simulation
Here is some code which implements the rejection sampling algorithm.
"""Run the rejection sampling algorithmto estimate `target_density` by `N` times from distribution `proposal_distribution`.Note, to prevent infinite looping, this version of the algorithm doesn't startover if the sample is not accepted, so only n ≤ `N` accepted samples are returned."""functionrejection_sample(; N::Integer, target_density, proposal_distribution::Distribution, xmin=-10, xmax=110)p(x) =target_density(x)q(x) =pdf(proposal_distribution, x)w(x) =p(x)/q(x)# k is the constant by which to multiply q so it upperbounds p k =maximum(w.(xmin:xmax)) xs =rand(proposal_distribution, N) us =rand(Uniform(0,1), N)whether_accept(x, u) =target_density(x)/(k*q(x)) > u samples = xs[whether_accept.(xs,us)]end;
To see the estimate this produces, let’s make a histogram of the accepted samples obtained from N draws from the proposal:
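For example, a call along the following lines produces such a histogram (N = 2500 is an assumed value here, chosen so the result is directly comparable to the worse-proposal run below):

```julia
# Example call; N = 2500 is an assumption, chosen to match the comparison below.
plot_rejection_sampling_estimate(N = 2500, target_density = πstar, proposal_distribution = Q)
```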
The proposal above is relatively good (it’s similar enough to the target, so a healthy proportion of the samples were accepted). What if the proposal were a worse fit to the target?
# Define a "bad" proposal distribution (badly matched to the target)Q_bad =MixtureModel([Normal(60,50), Normal(37,4), Normal(90,3)], [.4,.3,.3])plot_rejection_sampling_estimate(N =2500, target_density = πstar, proposal_distribution = Q_bad)
With a worse proposal, a smaller fraction of the draws is accepted, so it takes more samples from the proposal to get a comparably good estimate.
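To make this concrete (a side check in the same spirit as the acceptance-rate estimate above, not from the original post), we can compute the bound k for the bad proposal; a larger k means a lower acceptance rate, and hence more proposal draws per accepted sample:

```julia
# Acceptance rate with the badly matched proposal (≈ 1/k_bad, since πstar is normalized).
grid  = -15:0.1:115
k_bad = maximum(πstar.(grid) ./ pdf.(Q_bad, grid))
println("acceptance rate with Q_bad ≈ ", round(1/k_bad, digits=3))
```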
2.2 The guess-and-check algorithm, a special case
Guessing repeatedly (from a prior distribution) and only accepting when some condition on the sample is met is a special case of the rejection sampling algorithm, though this isn’t immediately obvious from the way it is usually described. To see how, let’s construct a target density that is equal to the proposal density wherever the condition is met, and zero elsewhere (we’ll be guessing from the proposal distribution Q, and accepting iff the condition is met).
The condition I’ll use for this example is just whether the sampled real number falls in a specified pair of intervals.
# Make a target density that equals pdf(Q) only on chosen intervals, and zero elsewhere
intervals = interval(12, 25) ∪ interval(50, 70)  # the intervals to accept on
condition(x) = x ∈ intervals
special_πstar(x) = condition(x) ? pdf(Q, x) : 0
# Note, this density is _not_ normalized.
# If we wanted to normalize this, we could calculate the normalizing constant
Z = sum(cdf(Q, i.hi) - cdf(Q, i.lo) for i in intervals.v)
# and we could scale the density to make it a pdf, if we wanted
# special_π_normalized(x) = special_πstar(x) * 1/Z

plot_rejection_sampling_setup(
    target_density = special_πstar,
    proposal_distribution = Q,
    title = "Rejection sampling setup",
    subtitle = "case when target = proposal where nonzero",
    plot_accept_prob = true)
To see precisely how the simple guess-and-check scheme is a special case of the rejection sampling algorithm (see Figure 4, to compare with Figure 1), make the following identifications:
let \pi(z) \coloneqq p(z\mid f(z)=\texttt{True})
let \pi^\ast(z) \coloneqq \scriptsize\begin{cases}p(z)&\text{if }f(z)=\texttt{True}\\0&\text{else}\end{cases}
let q(z) \coloneqq p(z)
Letting k=1, it holds that \forall z,\ kq(z)\ge \pi^\ast(z). In fact, we have that q(z)=\pi^\ast(z) for all z in the support of \pi^\ast.
This means that step 3 of the rejection sampling algorithm (sampling from the uniform distribution) is unnecessary: we are guaranteed that u\le \pi^\ast(z_0) whenever z_0 is in the support of \pi^\ast. So we just need to check whether \pi^\ast(z_0) > 0 (that is, check whether f(z_0)=\texttt{True}).
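For the running example, this boils down to something like the following sketch (using the `Q` and `condition` defined above): guess from Q and keep exactly the draws where the condition holds, with no uniform draw needed.

```julia
# Guess-and-check for this example: draw from Q, keep the draws satisfying `condition`.
n_guesses = 2500                            # arbitrary number of guesses, for illustration
guesses   = rand(Q, n_guesses)
accepted  = guesses[condition.(guesses)]
# `accepted` holds samples from Q conditioned on the intervals,
# i.e. from the normalized version of special_πstar.
```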
Note
One important difference between these two definitions of rejection sampling is that in the special case we don’t actually need to know how to evaluate/score the density \pi^\ast. This matters in practice: if we have a proposal process we can obtain samples from (and can check whether they satisfy a condition) but have no way of obtaining scores, guess-and-check is still possible, while the rejection sampling algorithm in general is not.
Bishop, Christopher M. 2006. Pattern Recognition and Machine Learning. 1st ed. Information Science and Statistics. New York, USA: Springer. https://link.springer.com/book/9780387310732.
Chopin, Nicolas, and Omiros Papaspiliopoulos. 2020. An Introduction to Sequential Monte Carlo. Springer International Publishing. https://doi.org/10.1007/978-3-030-47845-2.