API Reference

This page contains the API reference documentation for PNMF.

Main Interface

class PNMF.PNMF(n_components=10, loadings_mode='projected', mode='expanded', training_mode='standard', E=3, max_iter=200, tol=0.0001, learning_rate=0.01, optimizer='Adam', random_state=None, verbose=False, device='auto', init='random', scheduler='one_cycle', scheduler_patience=200, scheduler_factor=0.8, scheduler_pct_start=0.3, scheduler_div_factor=25.0, scheduler_final_div_factor=10000.0, min_lr=1e-05, batch_size=None, y_batch_size=None, shuffle=True, spatial=False, local=False, kernel='Matern32', multigroup=False, num_inducing=3000, lengthscale=1.0, sigma=1.0, group_diff_param=10.0, jitter=1e-05, train_lengthscale=False, cholesky_mode='exp', diagonal_only=False, inducing_allocation='proportional', K=50, precompute_knn=True, neighbors='knn', scale_ll_D=True, scale_kl_NM=True)[source]

Bases: object

Probabilistic Non-negative Matrix Factorization with variational inference.

This class provides a scikit-learn compatible interface for variational PNMF using Poisson factorization with ELBO optimization.

The model factorizes a non-negative matrix X into:

X ≈ exp(F) @ W.T

where F is the latent factor matrix (sample-specific, sampled from a variational Gaussian distribution) and W is the loading matrix (learned).

Note: For sklearn API compatibility, fit_transform returns exp(F) (called the “transformed data”) and components_ stores W.T (called the “components”).

Parameters:
  • n_components (int, default=10) – Number of latent components (rank of factorization).

  • loadings_mode ({'softplus', 'exp', 'projected'}, default='projected') – Method for enforcing non-negativity on W: - ‘softplus’: Use softplus transformation - ‘exp’: Use exponential transformation - ‘projected’: Use projected gradient descent (clamp after each step)

  • mode ({'simple', 'expanded', 'lower-bound'}, default='expanded') – ELBO computation mode: - ‘simple’: Use torch.distributions.Poisson.log_prob() directly - ‘expanded’: Use hybrid Monte Carlo + analytic expectation (default) - ‘lower-bound’: Use Jensen’s lower bound (fully analytic, no MC sampling)

  • training_mode ({'standard', 'natural'}, default='standard') – Training mode for variational parameters: - ‘standard’: Standard gradient descent with Adam/other optimizer - ‘natural’: Natural gradient descent with dual optimizers (NGD for variational, Adam for W)

  • E (int, default=10) – Number of Monte Carlo samples for ELBO estimation.

  • max_iter (int, default=200) – Maximum number of iterations.

  • tol (float, default=1e-4) – Tolerance for convergence.

  • learning_rate (float, default=0.01) – Learning rate for the optimizer.

  • optimizer ({'Adam', 'AdamW', 'NAdam', 'SGD', 'RMSprop'}, default='Adam') – Optimizer to use for training (applies to W parameters in natural mode).

  • random_state (int, default=None) – Random seed for reproducibility.

  • verbose (bool, default=False) – Whether to print progress messages.

  • device ({'cpu', 'cuda', 'mps', 'auto'}, default='auto') – Device to use for computation. ‘auto’ will select mps (Apple Silicon), cuda (NVIDIA), or cpu in that order based on availability.

  • init ({'random', 'nndsvd', 'nndsvda', 'nndsvdar', 'k-means', None}, default='random') –

    Initialization method for W and exp(F): - ‘random’: Non-negative random matrices, scaled with sqrt(X.mean() / n_components) (default). - ‘nndsvd’: Nonnegative Double SVD (better for sparseness). - ‘nndsvda’: NNDSVD with zeros filled with average of X (better for dense data). - ‘nndsvdar’: NNDSVD with zeros filled with small random values (faster dense). - ‘k-means’: K-means clustering based initialization. - None: Auto-select ‘nndsvda’ if n_components <= min(n_samples, n_features),

    otherwise ‘random’.

  • scheduler ({'one_cycle', 'plateau', None}, default='one_cycle') – Learning rate scheduler: - ‘one_cycle’: OneCycleLR with warmup then cosine annealing (default) - ‘plateau’: ReduceLROnPlateau, reduces LR when ELBO plateaus - None: No scheduler (constant learning rate)

  • scheduler_patience (int, default=200) – Number of iterations with no improvement before reducing LR. Only used when scheduler=’plateau’.

  • scheduler_factor (float, default=0.8) – Factor by which to reduce LR (new_lr = old_lr * factor). Only used when scheduler=’plateau’.

  • scheduler_pct_start (float, default=0.3) – Fraction of total iterations spent in warmup phase (LR ramps up from learning_rate/div_factor to learning_rate). Only used when scheduler=’one_cycle’.

  • scheduler_div_factor (float, default=25.0) – Determines initial LR: initial_lr = learning_rate / div_factor. Only used when scheduler=’one_cycle’.

  • scheduler_final_div_factor (float, default=1e4) – Determines final LR: min_lr = initial_lr / final_div_factor. Only used when scheduler=’one_cycle’.

  • min_lr (float, default=1e-5) – Minimum learning rate. Only used when scheduler=’plateau’.

  • batch_size (int, default=None) – Size of mini-batches for samples (N dimension). If None, uses full batch. Enable mini-batch training for large datasets.

  • y_batch_size (int, default=None) – Size of mini-batches for features (M/D dimension). If None, uses all features. Enable feature batching for very wide datasets.

  • shuffle (bool, default=True) – Whether to shuffle sample indices between iterations (for mini-batch mode).

  • Parameters (LCGP-Specific)

  • --------------------

  • spatial (bool, default=False) – Enable spatial GP prior instead of independent Gaussian prior. Requires coordinates to be provided in fit().

  • local (bool, default=False) – Use locally conditioned GP (LCGP) instead of SVGP when spatial=True. LCGP uses all points as inducing with low-rank+diagonal covariance. Only used when spatial=True.

  • kernel (str, default='Matern32') – Kernel function for spatial GP. Currently only ‘Matern32’ is supported.

  • multigroup (bool, default=False) – Use multi-group GP (MGGP) with group-aware spatial smoothing. When False, uses standard GP with batched_Matern32 kernel (no groups needed).

  • num_inducing (int, default=3000) – Number of inducing points for SVGP approximation.

  • lengthscale (float, default=1.0) – Kernel lengthscale for spatial correlation.

  • sigma (float, default=1.0) – Kernel output scale (amplitude).

  • group_diff_param (float, default=10.0) – Group difference parameter for MGGP. Higher values = stronger group separation.

  • jitter (float, default=1e-5) – Jitter term for numerical stability in Cholesky decomposition.

  • train_lengthscale (bool, default=False) – Whether to train the kernel lengthscale (currently not supported).

  • cholesky_mode (str, default='exp') – Cholesky diagonal constraint mode (‘exp’, ‘softplus’).

  • diagonal_only (bool, default=False) – Use diagonal-only variational covariance for inducing points.

  • inducing_allocation (str, default='proportional') –

    How to distribute inducing points across groups (SVGP only): - ‘proportional’: Allocate points proportionally to group sizes (default). - ‘equal’: Allocate equal points to each group. - ‘derived’: Run K-means on all data for optimal spatial coverage,

    then use KNN (k=5, distance-weighted) to classify centroids to groups. Falls back to proportional for groups with no assigned points.

  • Parameters

  • ------------------------

  • K (int, default=50) – Number of nearest neighbors for LCGP local conditioning. Only used when local=True.

  • precompute_knn (bool, default=True) – Whether to precompute KNN indices at initialization for LCGP. Only used when local=True.

components_

The basis matrix W (transposed for sklearn compatibility).

Type:

ndarray of shape (n_components, n_features)

n_components_

The number of components.

Type:

int

n_features_in_

Number of features seen during fit.

Type:

int

elbo_

Final ELBO value.

Type:

float

n_iter_

Actual number of iterations performed.

Type:

int

Examples

>>> import numpy as np
>>> from PNMF import PNMF
>>> X = np.random.rand(100, 50)  # 100 samples, 50 features
>>> model = PNMF(n_components=5, random_state=42)
>>> transformed = model.fit_transform(X)  # exp(F): (100, 5) transformed data
>>> components = model.components_        # W.T: (5, 50) components
>>> X_reconstructed = model.inverse_transform(transformed)

Mini-batch training for large datasets:

>>> X_large = np.random.rand(10000, 500)
>>> model = PNMF(n_components=10, batch_size=1000, shuffle=True)
>>> model.fit(X_large)
__init__(n_components=10, loadings_mode='projected', mode='expanded', training_mode='standard', E=3, max_iter=200, tol=0.0001, learning_rate=0.01, optimizer='Adam', random_state=None, verbose=False, device='auto', init='random', scheduler='one_cycle', scheduler_patience=200, scheduler_factor=0.8, scheduler_pct_start=0.3, scheduler_div_factor=25.0, scheduler_final_div_factor=10000.0, min_lr=1e-05, batch_size=None, y_batch_size=None, shuffle=True, spatial=False, local=False, kernel='Matern32', multigroup=False, num_inducing=3000, lengthscale=1.0, sigma=1.0, group_diff_param=10.0, jitter=1e-05, train_lengthscale=False, cholesky_mode='exp', diagonal_only=False, inducing_allocation='proportional', K=50, precompute_knn=True, neighbors='knn', scale_ll_D=True, scale_kl_NM=True)[source]
fit(X, y=None, coordinates=None, groups=None, return_history=False, callback=None, callback_interval=100)[source]

Fit the PNMF model to data X using variational inference.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input data matrix (non-negative).

  • y (Ignored) – Not used, present for scikit-learn compatibility.

  • coordinates (array-like of shape (n_samples, 2), optional) – Spatial coordinates for each sample. Required when spatial=True.

  • groups (array-like of shape (n_samples,), optional) – Integer group assignments for each sample. Required when spatial=True and multigroup=True.

  • return_history (bool, default=False) – If True, returns a tuple (history, self) where history is a list of ELBO values during training.

Returns:

self – Returns the instance itself (or (history, self) if return_history=True).

Return type:

object

transform(X, coordinates=None, groups=None)[source]

Transform X using the fitted model.

For non-spatial models, uses NNLS multiplicative updates to find exp(F). For spatial models, uses GP predictive equations at new coordinates.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input data matrix.

  • coordinates (array-like of shape (n_samples, 2), optional) – Spatial coordinates for new samples. Required when spatial=True.

  • groups (array-like of shape (n_samples,), optional) – Group assignments for new samples. Required when spatial=True and multigroup=True.

Returns:

transformed – Transformed data (exp(F) in our model notation).

Return type:

ndarray of shape (n_samples, n_components)

fit_transform(X, coordinates=None, groups=None, **kwargs)[source]

Fit the model and transform X.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input data matrix.

  • coordinates (array-like of shape (n_samples, 2), optional) – Spatial coordinates. Required when spatial=True.

  • groups (array-like of shape (n_samples,), optional) – Group assignments. Required when spatial=True and multigroup=True.

  • **kwargs (Additional arguments to pass to fit())

Returns:

transformed – Transformed data (exp(F) in our model notation).

Return type:

ndarray of shape (n_samples, n_components)

inverse_transform(transformed)[source]

Transform data back to its original space.

Reconstructs X from exp(F) using: X ≈ exp(F) @ W.T

Parameters:

transformed (array-like of shape (n_samples, n_components)) – Transformed data (exp(F) in our model notation).

Returns:

X_reconstructed – Reconstructed data in original space.

Return type:

ndarray of shape (n_samples, n_features)

Initialization

PNMF.initialization.initialize_factors(X, n_components, init=None, random_state=None, scale_log_mu=True, target_std=1.0)[source]

Initialize W and exp(F) matrices for PNMF.

This function provides sklearn-compatible initialization for PNMF. The returned matrices can be used to initialize the model parameters.

Parameters:
  • X (ndarray or torch.Tensor of shape (n_samples, n_features)) – Non-negative input data matrix.

  • n_components (int) – Number of latent components.

  • init ({None, 'random', 'nndsvd', 'nndsvda', 'nndsvdar', 'k-means'}, default=None) –

    Initialization method: - None: Auto-select ‘nndsvda’ if n_components <= min(n_samples, n_features),

    otherwise ‘random’.

    • ’random’: Non-negative random matrices, scaled with sqrt(X.mean() / n_components).

    • ’nndsvd’: Nonnegative Double SVD (better for sparseness).

    • ’nndsvda’: NNDSVD with zeros filled with average of X (better for dense data).

    • ’nndsvdar’: NNDSVD with zeros filled with small random values (faster dense).

    • ’k-means’: K-means clustering based initialization.

  • random_state (int or None) – Random seed for reproducibility.

  • scale_log_mu (bool, default=True) – If True, scales exp_F so that log(exp_F) has approximately N(0, 1) distribution. This ensures the variational mean μ is on a consistent scale across different datasets and initialization methods.

  • target_std (float, default=1.0) – Target standard deviation for log(exp_F) when scale_log_mu=True.

Return type:

Tuple[ndarray, ndarray]

Returns:

  • W (ndarray of shape (n_features, n_components)) – Loadings matrix initialization (for W parameter).

  • exp_F (ndarray of shape (n_samples, n_components)) – Expected latent factors initialization (for exp(F) where F is in log-space).

Examples

>>> import numpy as np
>>> from PNMF.initialization import initialize_factors
>>> X = np.random.poisson(5, size=(100, 50)).astype(np.float32)
>>> W, exp_F = initialize_factors(X, n_components=5, init='nndsvda', random_state=42)
>>> print(W.shape)   # (50, 5)
>>> print(exp_F.shape)  # (100, 5)

Notes

For PNMF, the model structure is:

X ≈ W @ exp(F).T

where: - W is (n_features, n_components) - loadings - F is the latent factor matrix (n_samples, n_components) in log-space - exp(F) is the expected value of the latent factors

The variational distribution q(F) is Gaussian with mean μ and scale σ. To initialize from exp_F:

μ = log(exp_F) (or log(exp_F + eps) if exp_F has zeros) σ can be initialized to a small value (e.g., 0.1)

Transform and Utility Functions

Factor Extraction

PNMF.transforms.log_factors(model, coordinates=None, groups=None, return_tensor=False)[source]

Get log-space latent factors (μ from q(F) = Normal(μ, σ²)).

Parameters:
  • model (PNMF) – A fitted PNMF model.

  • coordinates (array-like, optional) – Spatial coordinates. For spatial models, uses stored training coordinates if not provided.

  • groups (array-like, optional) – Group assignments. For spatial models, uses stored training groups if not provided.

  • return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.

Returns:

F_log – Latent factors in log-space (μ from the variational distribution).

Return type:

ndarray or Tensor of shape (n_samples, n_components)

Examples

>>> from PNMF import PNMF
>>> from PNMF.transforms import log_factors
>>> model = PNMF(n_components=5).fit(X)
>>> F_log = log_factors(model)  # Shape: (n_samples, n_components)
PNMF.transforms.factor_uncertainty(model, return_variance=False, coordinates=None, groups=None, return_tensor=False)[source]

Get uncertainty in latent factors (σ or σ² from q(F) = Normal(μ, σ²)).

Parameters:
  • model (PNMF) – A fitted PNMF model.

  • return_variance (bool, default=False) – If True, return variance (σ²). If False, return standard deviation (σ).

  • coordinates (array-like, optional) – Spatial coordinates (for spatial models).

  • groups (array-like, optional) – Group assignments (for spatial models).

  • return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.

Returns:

uncertainty – Standard deviation (σ) or variance (σ²) of the variational distribution.

Return type:

ndarray or Tensor of shape (n_samples, n_components)

Examples

>>> from PNMF import PNMF
>>> from PNMF.transforms import factor_uncertainty
>>> model = PNMF(n_components=5).fit(X)
>>> F_std = factor_uncertainty(model)  # σ, shape: (n_samples, n_components)
>>> F_var = factor_uncertainty(model, return_variance=True)  # σ²
PNMF.transforms.factor_samples(model, n_samples=100, return_exp=False, coordinates=None, groups=None, return_tensor=False)[source]

Sample latent factors from the variational posterior q(F).

Parameters:
  • model (PNMF) – A fitted PNMF model.

  • n_samples (int, default=100) – Number of samples to draw.

  • return_exp (bool, default=False) – If True, return exp(F) samples. If False, return F samples.

  • coordinates (array-like, optional) – Spatial coordinates (for spatial models).

  • groups (array-like, optional) – Group assignments (for spatial models).

  • return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.

Returns:

samples – Samples from the variational posterior.

Return type:

ndarray or Tensor of shape (n_samples, n_data_samples, n_components)

Notes

Uses the reparameterization trick for sampling.

Examples

>>> from PNMF import PNMF
>>> from PNMF.transforms import factor_samples
>>> model = PNMF(n_components=5).fit(X)
>>> samples = factor_samples(model, n_samples=100)  # (100, n_data, 5)
>>> exp_samples = factor_samples(model, n_samples=50, return_exp=True)

Model Accessors

PNMF.transforms.get_loadings(model, return_tensor=False)[source]

Get the loadings matrix W from a fitted PNMF model.

Parameters:
  • model (PNMF) – A fitted PNMF model.

  • return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.

Returns:

W – The loadings matrix W.

Return type:

ndarray or Tensor of shape (n_features, n_components)

Notes

This is equivalent to model.components_.T but provides a consistent interface and works with both fitted and unfitted (but initialized) models.

Examples

>>> from PNMF import PNMF
>>> from PNMF.transforms import get_loadings
>>> model = PNMF(n_components=5).fit(X)
>>> W = get_loadings(model)  # Shape: (n_features, n_components)
PNMF.transforms.get_prior(model)[source]

Get the GaussianPrior object from a fitted PNMF model.

Parameters:

model (PNMF) – A fitted PNMF model.

Returns:

prior – The GaussianPrior object containing the variational distribution.

Return type:

GaussianPrior

Notes

This provides access to the full prior for advanced users who want to directly manipulate or inspect the variational distribution.

Examples

>>> from PNMF import PNMF
>>> from PNMF.transforms import get_prior
>>> model = PNMF(n_components=5).fit(X)
>>> prior = get_prior(model)
>>> qF, pF = prior()  # Get distributions directly

Conditional Inference

PNMF.transforms.transform_F(X, W, n_components=None, mode='expanded', E=3, max_iter=200, tol=0.0001, learning_rate=0.01, verbose=False, device='auto', return_prior=False)[source]

Learn new latent factors F conditioned on fixed loadings W using variational inference.

This is a Bayesian alternative to NNLS-based transform. It learns a full variational distribution q(F) for the new data given the fixed W.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – New data to transform.

  • W (array-like of shape (n_features, n_components)) – Fixed loadings matrix.

  • n_components (int, optional) – Number of components. Inferred from W if not specified.

  • mode ({'simple', 'expanded', 'lower-bound'}, default='expanded') – ELBO computation mode.

  • E (int, default=3) – Number of Monte Carlo samples for ELBO estimation.

  • max_iter (int, default=200) – Maximum number of iterations.

  • tol (float, default=1e-4) – Tolerance for convergence.

  • learning_rate (float, default=0.01) – Learning rate for the optimizer.

  • verbose (bool, default=False) – Whether to print progress messages.

  • device ({'cpu', 'cuda', 'mps', 'auto'}, default='auto') – Device to use for computation.

  • return_prior (bool, default=False) – If True, return the GaussianPrior object instead of exp(F).

Return type:

Union[ndarray, GaussianPrior]

Returns:

  • F_exp (ndarray of shape (n_samples, n_components)) – Transformed data E[exp(F)]. Returned if return_prior=False.

  • prior (GaussianPrior) – The fitted GaussianPrior object. Returned if return_prior=True.

Examples

>>> from PNMF import PNMF
>>> from PNMF.transforms import transform_F, get_loadings
>>> # Fit original model
>>> model = PNMF(n_components=5).fit(X_train)
>>> W = get_loadings(model)
>>> # Transform new data with full VI
>>> F_new = transform_F(X_new, W)  # Shape: (n_new, 5)
>>> # Or get the full prior for uncertainty quantification
>>> prior_new = transform_F(X_new, W, return_prior=True)
PNMF.transforms.transform_W(X, F, W_init=None, n_components=None, max_iter=100, tol=0.0001, verbose=False, device='auto')[source]

Learn new loadings W conditioned on fixed latent factors F.

Uses multiplicative updates (NNLS-style) for fast convergence while displaying the Poisson negative log-likelihood on the progress bar.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Data to fit.

  • F (array-like of shape (n_samples, n_components)) – Fixed latent factors in exp-space (exp(F) from the model).

  • W_init (array-like of shape (n_features, n_components), optional) – Initial loadings W. If None, uses random initialization.

  • n_components (int, optional) – Number of components. Inferred from F if not specified.

  • max_iter (int, default=100) – Maximum number of iterations.

  • tol (float, default=1e-4) – Tolerance for convergence.

  • verbose (bool, default=False) – Whether to print progress messages.

  • device ({'cpu', 'cuda', 'mps', 'auto'}, default='auto') – Device to use for computation.

Returns:

W – The learned loadings matrix.

Return type:

ndarray of shape (n_features, n_components)

Notes

This function uses multiplicative updates for non-negative matrix factorization, which is equivalent to coordinate descent on the Poisson log-likelihood.

The update rule for W is:

W = W * (X @ H) / (W @ (H.T @ H) + eps)

where H = F (the fixed factors in exp-space).

For uncertainty quantification in F, consider using factor_samples() and running this function multiple times with different F samples.

Examples

>>> from PNMF import PNMF
>>> from PNMF.transforms import transform_W, get_factors
>>> # Fit original model
>>> model = PNMF(n_components=5).fit(X_train)
>>> F_exp = get_factors(model)  # exp-space factors
>>> # Learn new W for different data with same factors
>>> W_new = transform_W(X_new, F_exp)  # Shape: (n_features, 5)

Prior Utilities

PNMF.transforms.log_factors_from_prior(prior, return_tensor=False)[source]

Get log-space latent factors from a GaussianPrior object.

Parameters:
  • prior (GaussianPrior) – A GaussianPrior object.

  • return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.

Returns:

F_log – Latent factors in log-space.

Return type:

ndarray or Tensor of shape (n_samples, n_components)

Examples

>>> from PNMF.transforms import transform_F, log_factors_from_prior
>>> prior = transform_F(X_new, W, return_prior=True)
>>> F_log = log_factors_from_prior(prior)
PNMF.transforms.factors_from_prior(prior, use_mgf=True, return_tensor=False)[source]

Get exp-space latent factors from a GaussianPrior object.

Parameters:
  • prior (GaussianPrior) – A GaussianPrior object.

  • use_mgf (bool, default=True) – If True, compute E[exp(F)] = exp(μ + σ²/2). If False, compute exp(μ) directly.

  • return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.

Returns:

F_exp – Latent factors in exp-space.

Return type:

ndarray or Tensor of shape (n_samples, n_components)

Examples

>>> from PNMF.transforms import transform_F, factors_from_prior
>>> prior = transform_F(X_new, W, return_prior=True)
>>> F_exp = factors_from_prior(prior)
PNMF.transforms.uncertainty_from_prior(prior, return_variance=False, return_tensor=False)[source]

Get uncertainty from a GaussianPrior object.

Parameters:
  • prior (GaussianPrior) – A GaussianPrior object.

  • return_variance (bool, default=False) – If True, return variance (σ²). If False, return standard deviation (σ).

  • return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.

Returns:

uncertainty – Standard deviation or variance.

Return type:

ndarray or Tensor of shape (n_samples, n_components)

Examples

>>> from PNMF.transforms import transform_F, uncertainty_from_prior
>>> prior = transform_F(X_new, W, return_prior=True)
>>> F_std = uncertainty_from_prior(prior)

PyTorch Models

class PNMF.models.PoissonFactorization(prior, y, L=10, loadings_mode='softplus', mode='expanded')[source]

Bases: Module

Poisson Factorization base model with variational inference.

This model uses Poisson factorization with variational inference to learn non-negative factor matrices. The latent factors F are sampled from a Gaussian variational distribution, and the loadings W are positive parameters.

Parameters:
  • prior – A GaussianPrior object providing variational and prior distributions

  • y – Input data tensor of shape (D, N) where D is features, N is samples

  • L (default: 10) – Number of latent components (default: 10)

  • loadings_mode (default: 'softplus') – Mode for enforcing positivity on W (‘softplus’, ‘exp’, or ‘projected’)

  • mode (default: 'expanded') – ELBO computation mode (‘simple’, ‘expanded’, or ‘lower-bound’) - ‘simple’: Use torch.distributions.Poisson.log_prob() directly - ‘expanded’: Use hybrid Monte Carlo + analytic expectation (default) - ‘lower-bound’: Use Jensen’s lower bound (fully analytic, no MC sampling)

prior

GaussianPrior for variational inference

W

PositiveParameter loadings matrix of shape (D, L)

loadings_mode

Positivity constraint mode

D

Number of features

N

Number of samples

L

Number of latent components

Example

>>> import torch
>>> from PNMF.priors import GaussianPrior
>>> from PNMF.models import PoissonFactorization
>>> y = torch.randn(100, 50)
>>> prior = GaussianPrior(y, L=10)
>>> model = PoissonFactorization(prior, y, L=10)
>>> rate, qF, pF = model(E=10)
__init__(prior, y, L=10, loadings_mode='softplus', mode='expanded')[source]

Initialize internal Module state, shared by both nn.Module and ScriptModule.

get_rate(prior_samples, idy=None)[source]

Compute the Poisson rate from prior samples.

Parameters:
  • prior_samples – Samples from the prior of shape (E, L, N) where E is number of samples, L is components, N is samples

  • idy (default: None) – Feature indices for batching (D dimension), None for full features

Returns:

Rate matrix of shape (E, D, N) or (E, D_batch, batch_size)

Return type:

Z

forward(idx=None, idy=None, E=10, X=None, coordinates=None, groups=None, spatial=False)[source]

Forward pass: compute variational distributions and log-likelihood terms.

Supports both full-batch and mini-batch training. When idx and idy are None, performs full-batch forward pass. Otherwise, computes on the specified batch indices.

Parameters:
  • idx (default: None) – Sample indices (for N dimension), None for full samples

  • idy (default: None) – Feature indices (for D dimension), None for full features

  • E (default: 10) – Number of Monte Carlo samples (ignored for lower-bound mode)

  • X (default: None) – Input data (D, N) or (D_batch, N_batch). Required for memory-efficient MC accumulation. If None and MC is needed, falls back to full-tensor mode.

  • coordinates (default: None) – Spatial coordinates (N_batch, 2) for GP prior. Required when spatial=True.

  • groups (default: None) – Group assignments (N_batch,) for MGGP prior. Required when spatial=True and using multi-group GP.

  • spatial (default: False) – Whether to use spatial GP forward pass.

Returns:

terms: dict from compute_log_likelihood_terms()

qF: Variational posterior distribution pF: Prior distribution

For spatial:

terms: dict from compute_log_likelihood_terms() qF: Variational posterior distribution from GP predictive qU: Inducing point variational distribution pU: Inducing point prior distribution (None for whitened)

Return type:

For non-spatial

project_parameters()[source]

Apply projection to ensure non-negativity (for projected gradient mode).

Priors

class PNMF.priors.GaussianPrior(y, L=10, scale_pf=1.0, use_natural_gradients=False)[source]

Bases: Module

Gaussian prior for latent factors in variational inference.

This class represents a variational distribution qF over latent factors F, along with a prior distribution pF (standard normal). The variational distribution can be parameterized either in standard form (mean, scale) or in natural parameter form (theta1, theta2) for natural gradient descent.

Parameters:
  • y – Input data tensor of shape (D, N) where D is features, N is samples

  • L (default: 10) – Number of latent components (default: 10)

  • scale_pf (default: 1.0) – Scale parameter for the prior distribution (default: 1.0)

  • use_natural_gradients (default: False) – Use natural parameterization (default: False)

mean

Variational mean parameter of shape (L, N) [standard mode]

scale

PositiveParameter for scale of shape (L, N) [standard mode]

theta1

Natural parameter θ₁ = μ/s² of shape (L, N) [natural mode]

theta2

Natural parameter θ₂ = -1/(2s²) of shape (L, N) [natural mode]

scale_pf

Fixed scale for the prior distribution

use_natural_gradients

Whether natural parameterization is used

Example

>>> import torch
>>> from PNMF.priors import GaussianPrior
>>> y = torch.randn(100, 50)  # 100 features, 50 samples
>>> prior = GaussianPrior(y, L=10)  # Standard mode
>>> qF, pF = prior()
>>> F = qF.rsample((5,))  # Sample 5 times using reparameterization
>>> # Natural gradient mode
>>> prior_nat = GaussianPrior(y, L=10, use_natural_gradients=True)
>>> qF, pF = prior_nat()
__init__(y, L=10, scale_pf=1.0, use_natural_gradients=False)[source]

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward()[source]

Get the variational and prior distributions.

Returns:

Variational posterior distribution Normal(mean, scale) pF: Prior distribution Normal(0, scale_pf)

Return type:

qF

forward_batched(idx)[source]

Get distributions for a batch of samples.

Parameters:

idx – Indices of samples to include in the batch

Returns:

Variational distribution for the batch pF: Prior distribution for the batch

Return type:

qF

parameters()[source]

Return parameters for optimization (excludes the prior hyperparameter).

natural_parameters()[source]

Return natural parameters (theta1, theta2) for NGD optimizer.

Custom Modules

class PNMF.custom_modules.ConstrainedParameter(shape, mode)[source]

Bases: Module

Base class for parameters with constraints.

Subclasses implement _to_constrained() and _to_unconstrained() to define the bijective mapping between raw (unconstrained) and constrained spaces.

The .data property returns the constrained value, and the underlying unconstrained parameter is stored in ._raw (an nn.Parameter).

Parameters:
  • shape (Union[int, Tuple[int, ...]]) – Shape of the constrained parameter.

  • mode (str) – Constraint mode (subclass-specific).

__init__(shape, mode)[source]

Initialize internal Module state, shared by both nn.Module and ScriptModule.

property shape: Tuple[int, ...]

Shape of the constrained parameter.

property data: Tensor

Returns the constrained value.

property requires_grad: bool

Returns requires_grad status of the underlying parameter.

property device: device

Returns the device of the underlying parameter.

property dtype: dtype

Returns the dtype of the underlying parameter.

property raw: Parameter

Direct access to the underlying unconstrained parameter.

freeze()[source]

Freeze the parameter (disable gradient computation).

unfreeze()[source]

Unfreeze the parameter (enable gradient computation).

project()[source]

Project parameters to satisfy constraints. Override for projected modes.

__getitem__(idx)[source]

Allow subscripting to index into the constrained value.

dim()[source]
size(dim=None)[source]
numel()[source]
detach()[source]

Returns a detached copy of the constrained value.

clone()[source]

Returns a cloned copy of the constrained value.

cpu()[source]

Move to CPU.

cuda(device=None)[source]

Move to CUDA.

to(*args, **kwargs)[source]

Move to device/dtype.

float()[source]

Convert to float32.

double()[source]

Convert to float64.

half()[source]

Convert to float16.

contiguous()[source]

Returns a contiguous copy of the constrained value.

numpy()[source]

Returns numpy array of the constrained value.

property grad

Returns the gradient of the underlying parameter.

property is_cuda

Check if on CUDA.

property is_leaf

Check if leaf tensor.

class PNMF.custom_modules.PositiveParameter(shape, mode='softplus', elbo_mode='expanded')[source]

Bases: ConstrainedParameter

Parameter constrained to be positive.

Supports any shape - batching works automatically.

Parameters:
  • shape (Union[int, Tuple[int, ...]]) – int or tuple for the parameter shape.

  • mode (str, default: 'softplus') – ‘softplus’, ‘exp’, ‘projected’, or ‘multiplicative’ for ensuring positivity.

  • elbo_mode (str, default: 'expanded') – ELBO computation mode for multiplicative updates. Only used when mode=’multiplicative’. One of ‘lower-bound’, ‘expanded’, or ‘simple’.

The mode determines how positivity is enforced:
  • ‘softplus’: Uses softplus(x) = log(1 + exp(x)) transformation

  • ‘exp’: Uses exp(x) transformation

  • ‘projected’: Uses projected gradient descent (clamps to >= 0 after each step)

  • ‘multiplicative’: Uses multiplicative updates (no gradient-based optimization).

    Requires calling multiplicative_update() manually after computing update terms.

For multiplicative mode, the update rule is:

W = W * numerator / denominator

The numerator and denominator depend on the elbo_mode:
  • ‘lower-bound’: Fully analytic using Jensen’s bound

  • ‘expanded’: Hybrid MC + analytic expectation

  • ‘simple’: Full Monte Carlo

__init__(shape, mode='softplus', elbo_mode='expanded')[source]

Initialize internal Module state, shared by both nn.Module and ScriptModule.

project()[source]

Project parameters to satisfy constraints.

For ‘projected’ mode, this clamps values to be >= 0. Call this after optimizer.step() when using projected gradients.

multiplicative_update(X, terms, idy=None, eps=1e-08)[source]

Perform multiplicative update for W using precomputed terms.

Reuses all tensors already computed by compute_log_likelihood_terms(): - exp_mu, exp_mu_sigma (always available) - rate_mean (lower-bound: W @ exp(μ), already computed) - exp_F_samples (MC modes: collected during the loop)

For MC modes, rate_e = W @ exp_F_e is computed per-sample in a loop under no_grad, avoiding the (E, D, N) batch matmul entirely.

Update rule: W_jl <- W_jl * numerator_jl / denominator_jl

Parameters:
  • X (Tensor) – Data tensor of shape (D, N) or (D_batch, N).

  • terms (dict) – dict from compute_log_likelihood_terms(return_samples=True). Required keys depend on elbo_mode: - All modes: ‘exp_mu’, ‘exp_mu_sigma’ - ‘lower-bound’: ‘rate_mean’ - ‘expanded’/’simple’: ‘exp_F_samples’

  • idy (Optional[Tensor], default: None) – Optional feature indices for batched updates. Shape: (D_batch,).

  • eps (float, default: 1e-08) – Small constant for numerical stability.