Loglikelihood API Reference

Full SFS Loglikelihood

demestats.loglik.sfs_loglik.sfs_loglik(afs, esfs, sequence_length=None, theta=None, folded=False)[source]

This function evaluates the multinomial or Poisson log-likelihood of an observed site frequency spectrum (AFS) given an expected spectrum (ESFS).

By default, the sequence length and mutation rate (theta) are None, indicating that the multinomial likelihood will be used. To use the Poisson likelihood, one must provide BOTH the sequence length and mutation rate (theta).

Parameters:
  • afs (array_like) – Observed allele frequency spectrum

  • esfs (array_like) – Expected allele frequency spectrum. Must be the same shape as afs

  • sequence_length (int, optional) – Total number of sites in the sequence. Required if theta is given

  • theta (float, optional) – Population-scaled mutation rate. If provided, a sequence length must also be provided and the Poisson likelihood is used; otherwise a multinomial likelihood is assumed.

  • folded (boolean, optional) – A boolean indicating whether you are whether with folded SFS or not

Returns:

Log-likelihood of the observed spectrum given the expected spectrum.

Return type:

float

Notes

In tskit, given a tree sequence, to obtain the afs one can use the function

afs = tree_sequence.allele_frequency_spectrum(*options)

To obtain the esfs, with momi3 one must first initialize an ExpectedSFS object with a demes demographic model and a dictionary of the number of samples used per population. Then one would input a dictionary of parameter values into the Expected SFS object:

ESFS_obj = demestats.sfs.ExpectedSFS(demes_model.to_demes(), num_samples=samples_per_population)
params = {param_key: value}
esfs = ESFS_obj(params)

multinomial_loglik_value = sfs_loglik(afs, esfs)
poisson_loglik_value = sfs_loglik(afs, esfs, sequence_length=1e8, theta=1e-8)

To compute the gradient, one can use jax.grad or jax.value_and_grad. All loglikelihood functions are compatible with jax.

Please refer to the tutorial for a specific example, the above provided codes are just outlines of how to call on the functions.

Projected SFS Loglikelihood

demestats.loglik.sfs_loglik.projection_sfs_loglik(esfs_obj, params, proj_dict, einsum_str, input_arrays, sequence_length=None, theta=None, folded=False)[source]

This function evaluates the projected multinomial or Poisson log-likelihood of an observed site frequency spectrum (AFS) given an expected spectrum (ESFS) via Einstein summation.

By default, the sequence length and mutation rate (theta) are None, indicating that the multinomial likelihood will be used. To use the Poisson likelihood, one must provide BOTH the sequence length and mutation rate (theta).

Parameters:
  • esfs_obj (array_like) – An demestats.sfs.ExpectedSFS object

  • params (dict) – a dictionary of model parameters and their values

  • proj_dict (dict) – Dictionary of arrays that represent projection vectors

  • einsum_str (string) – Einstein summation string for projection

  • input_arrays (array_like) – Input arrays for einsum operation, it must contain the original afs

  • sequence_length (int, optional) – Total number of sites in the sequence. Required if theta is given

  • theta (float, optional) – Population-scaled mutation rate. If provided, a sequence length must also be provided and the Poisson likelihood is used, otherwise a multinomial likelihood is assumed.

Returns:

Log-likelihood of the projected observed spectrum given the projected expected spectrum.

Return type:

float

Notes

proj_dict contains the random projection vectors that define the low-dimensional subspace for approximating the full expected SFS, einsum_str is a string specifying the Einstein summation for tensor operations, and input_arrays are preprocessed arrays that serve as inputs to the jax.numpy.einsum call, optimized for JAX’s just-in-time compilation

Example:

proj_dict, einsum_str, input_arrays = prepare_projection(afs, afs_samples, sequence_length, num_projections, seed)
esfs_obj = ExpectedSFS(demo.to_demes(), num_samples=afs_samples)
params = {param_key: val}
projection_sfs_loglik(esfs_obj, params, proj_dict, einsum_str, input_arrays, sequence_length=None, theta=None)

Internally this function will call on demestats.sfs.ExpectedSFS.tensor_prod, which performs the projection operations on the site frequency spectrum.

Please refer to the tutorial for a specific example, the above provided codes are just outlines of how to call on the functions.

See also

demestats.sfs.ExpectedSFS, demestats.sfs.ExpectedSFS.tensor_prod, demestats.sfs.sfs_loglik.prepare_projection

ICR/CCR Loglikelihood

demestats.loglik.icr_loglik.icr_loglik(time, sample_config, params, icr_call, deme_names)[source]

Compute the log-likelihood contribution from an ICR evaluation at the given times and sampling configuration.

Parameters:
  • time (ArrayLike) – One or more time points at which the ICR quantities are evaluated.

  • sample_config (array of int) – A array giving the number of sampled haploids for each deme, ordered consistently with deme_names.

  • params (dict) – A dictionary of parameter values passed through to icr_call.

  • icr_call (Callable) – A callable object, typically an ICRCurve instance or compatible function, that accepts params, t, and num_samples and returns ICR-related quantities.

  • deme_names (array of str) – The ordered deme names corresponding to entries in sample_config.

Returns:

The total log-likelihood, computed as the sum of log(result["c"]) + result["log_s"] over the given time points.

Return type:

Scalar

Notes

This function converts the positional sampling configuration into the deme-name mapping expected by icr_call, evaluates the ICR quantities, and combines the returned components into a scalar log-likelihood.

The callable icr_call is an ICR or CCR object that is expected to return a dictionary containing the entries "c" and "log_s". You may also pass in their respective mean-field objects, Any function that returns ‘c’ and ‘log_s’ will work.

::

icr_exact = ICRCurve(demo=g, k=2)

ll = icr_loglik(

time=jnp.array([10.0, 100.0, 1000.0]), sample_config=[2, 0], params={}, icr_call=icr_exact, deme_names=[“P0”, “P1”],

)

See also

demestats.iicr.IICRCurve, demestats.iicr.IICRCurve.__call__, demestats.iicr.IICRMeanFieldCurve, demestats.iicr.IICRMeanFieldCurve.__call__, demestats.iicr.CCRCurve, demestats.iicr.CCRCurve.__call__, demestats.iicr.CCRMeanFieldCurve, demestats.iicr.CCRMeanFieldCurve.__call__