API Reference
This page contains the API reference documentation for PNMF.
Main Interface
- class PNMF.PNMF(n_components=10, loadings_mode='projected', mode='expanded', training_mode='standard', E=3, max_iter=200, tol=0.0001, learning_rate=0.01, optimizer='Adam', random_state=None, verbose=False, device='auto', init='random', scheduler='one_cycle', scheduler_patience=200, scheduler_factor=0.8, scheduler_pct_start=0.3, scheduler_div_factor=25.0, scheduler_final_div_factor=10000.0, min_lr=1e-05, batch_size=None, y_batch_size=None, shuffle=True, spatial=False, local=False, kernel='Matern32', multigroup=False, num_inducing=3000, lengthscale=1.0, sigma=1.0, group_diff_param=10.0, jitter=1e-05, train_lengthscale=False, cholesky_mode='exp', diagonal_only=False, inducing_allocation='proportional', K=50, precompute_knn=True, neighbors='knn', scale_ll_D=True, scale_kl_NM=True)[source]
Bases:
objectProbabilistic Non-negative Matrix Factorization with variational inference.
This class provides a scikit-learn compatible interface for variational PNMF using Poisson factorization with ELBO optimization.
- The model factorizes a non-negative matrix X into:
X ≈ exp(F) @ W.T
where F is the latent factor matrix (sample-specific, sampled from a variational Gaussian distribution) and W is the loading matrix (learned).
Note: For sklearn API compatibility, fit_transform returns exp(F) (called the “transformed data”) and components_ stores W.T (called the “components”).
- Parameters:
n_components (int, default=10) – Number of latent components (rank of factorization).
loadings_mode ({'softplus', 'exp', 'projected'}, default='projected') – Method for enforcing non-negativity on W: - ‘softplus’: Use softplus transformation - ‘exp’: Use exponential transformation - ‘projected’: Use projected gradient descent (clamp after each step)
mode ({'simple', 'expanded', 'lower-bound'}, default='expanded') – ELBO computation mode: - ‘simple’: Use torch.distributions.Poisson.log_prob() directly - ‘expanded’: Use hybrid Monte Carlo + analytic expectation (default) - ‘lower-bound’: Use Jensen’s lower bound (fully analytic, no MC sampling)
training_mode ({'standard', 'natural'}, default='standard') – Training mode for variational parameters: - ‘standard’: Standard gradient descent with Adam/other optimizer - ‘natural’: Natural gradient descent with dual optimizers (NGD for variational, Adam for W)
E (int, default=10) – Number of Monte Carlo samples for ELBO estimation.
max_iter (int, default=200) – Maximum number of iterations.
tol (float, default=1e-4) – Tolerance for convergence.
learning_rate (float, default=0.01) – Learning rate for the optimizer.
optimizer ({'Adam', 'AdamW', 'NAdam', 'SGD', 'RMSprop'}, default='Adam') – Optimizer to use for training (applies to W parameters in natural mode).
random_state (int, default=None) – Random seed for reproducibility.
verbose (bool, default=False) – Whether to print progress messages.
device ({'cpu', 'cuda', 'mps', 'auto'}, default='auto') – Device to use for computation. ‘auto’ will select mps (Apple Silicon), cuda (NVIDIA), or cpu in that order based on availability.
init ({'random', 'nndsvd', 'nndsvda', 'nndsvdar', 'k-means', None}, default='random') –
Initialization method for W and exp(F): - ‘random’: Non-negative random matrices, scaled with sqrt(X.mean() / n_components) (default). - ‘nndsvd’: Nonnegative Double SVD (better for sparseness). - ‘nndsvda’: NNDSVD with zeros filled with average of X (better for dense data). - ‘nndsvdar’: NNDSVD with zeros filled with small random values (faster dense). - ‘k-means’: K-means clustering based initialization. - None: Auto-select ‘nndsvda’ if n_components <= min(n_samples, n_features),
otherwise ‘random’.
scheduler ({'one_cycle', 'plateau', None}, default='one_cycle') – Learning rate scheduler: - ‘one_cycle’: OneCycleLR with warmup then cosine annealing (default) - ‘plateau’: ReduceLROnPlateau, reduces LR when ELBO plateaus - None: No scheduler (constant learning rate)
scheduler_patience (int, default=200) – Number of iterations with no improvement before reducing LR. Only used when scheduler=’plateau’.
scheduler_factor (float, default=0.8) – Factor by which to reduce LR (new_lr = old_lr * factor). Only used when scheduler=’plateau’.
scheduler_pct_start (float, default=0.3) – Fraction of total iterations spent in warmup phase (LR ramps up from learning_rate/div_factor to learning_rate). Only used when scheduler=’one_cycle’.
scheduler_div_factor (float, default=25.0) – Determines initial LR: initial_lr = learning_rate / div_factor. Only used when scheduler=’one_cycle’.
scheduler_final_div_factor (float, default=1e4) – Determines final LR: min_lr = initial_lr / final_div_factor. Only used when scheduler=’one_cycle’.
min_lr (float, default=1e-5) – Minimum learning rate. Only used when scheduler=’plateau’.
batch_size (int, default=None) – Size of mini-batches for samples (N dimension). If None, uses full batch. Enable mini-batch training for large datasets.
y_batch_size (int, default=None) – Size of mini-batches for features (M/D dimension). If None, uses all features. Enable feature batching for very wide datasets.
shuffle (bool, default=True) – Whether to shuffle sample indices between iterations (for mini-batch mode).
Parameters (LCGP-Specific)
--------------------
spatial (bool, default=False) – Enable spatial GP prior instead of independent Gaussian prior. Requires coordinates to be provided in fit().
local (bool, default=False) – Use locally conditioned GP (LCGP) instead of SVGP when spatial=True. LCGP uses all points as inducing with low-rank+diagonal covariance. Only used when spatial=True.
kernel (str, default='Matern32') – Kernel function for spatial GP. Currently only ‘Matern32’ is supported.
multigroup (bool, default=False) – Use multi-group GP (MGGP) with group-aware spatial smoothing. When False, uses standard GP with batched_Matern32 kernel (no groups needed).
num_inducing (int, default=3000) – Number of inducing points for SVGP approximation.
lengthscale (float, default=1.0) – Kernel lengthscale for spatial correlation.
sigma (float, default=1.0) – Kernel output scale (amplitude).
group_diff_param (float, default=10.0) – Group difference parameter for MGGP. Higher values = stronger group separation.
jitter (float, default=1e-5) – Jitter term for numerical stability in Cholesky decomposition.
train_lengthscale (bool, default=False) – Whether to train the kernel lengthscale (currently not supported).
cholesky_mode (str, default='exp') – Cholesky diagonal constraint mode (‘exp’, ‘softplus’).
diagonal_only (bool, default=False) – Use diagonal-only variational covariance for inducing points.
inducing_allocation (str, default='proportional') –
How to distribute inducing points across groups (SVGP only): - ‘proportional’: Allocate points proportionally to group sizes (default). - ‘equal’: Allocate equal points to each group. - ‘derived’: Run K-means on all data for optimal spatial coverage,
then use KNN (k=5, distance-weighted) to classify centroids to groups. Falls back to proportional for groups with no assigned points.
Parameters
------------------------
K (int, default=50) – Number of nearest neighbors for LCGP local conditioning. Only used when local=True.
precompute_knn (bool, default=True) – Whether to precompute KNN indices at initialization for LCGP. Only used when local=True.
- components_
The basis matrix W (transposed for sklearn compatibility).
- Type:
ndarray of shape (n_components, n_features)
Examples
>>> import numpy as np >>> from PNMF import PNMF >>> X = np.random.rand(100, 50) # 100 samples, 50 features >>> model = PNMF(n_components=5, random_state=42) >>> transformed = model.fit_transform(X) # exp(F): (100, 5) transformed data >>> components = model.components_ # W.T: (5, 50) components >>> X_reconstructed = model.inverse_transform(transformed)
Mini-batch training for large datasets:
>>> X_large = np.random.rand(10000, 500) >>> model = PNMF(n_components=10, batch_size=1000, shuffle=True) >>> model.fit(X_large)
- __init__(n_components=10, loadings_mode='projected', mode='expanded', training_mode='standard', E=3, max_iter=200, tol=0.0001, learning_rate=0.01, optimizer='Adam', random_state=None, verbose=False, device='auto', init='random', scheduler='one_cycle', scheduler_patience=200, scheduler_factor=0.8, scheduler_pct_start=0.3, scheduler_div_factor=25.0, scheduler_final_div_factor=10000.0, min_lr=1e-05, batch_size=None, y_batch_size=None, shuffle=True, spatial=False, local=False, kernel='Matern32', multigroup=False, num_inducing=3000, lengthscale=1.0, sigma=1.0, group_diff_param=10.0, jitter=1e-05, train_lengthscale=False, cholesky_mode='exp', diagonal_only=False, inducing_allocation='proportional', K=50, precompute_knn=True, neighbors='knn', scale_ll_D=True, scale_kl_NM=True)[source]
- fit(X, y=None, coordinates=None, groups=None, return_history=False, callback=None, callback_interval=100)[source]
Fit the PNMF model to data X using variational inference.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input data matrix (non-negative).
y (Ignored) – Not used, present for scikit-learn compatibility.
coordinates (array-like of shape (n_samples, 2), optional) – Spatial coordinates for each sample. Required when spatial=True.
groups (array-like of shape (n_samples,), optional) – Integer group assignments for each sample. Required when spatial=True and multigroup=True.
return_history (bool, default=False) – If True, returns a tuple (history, self) where history is a list of ELBO values during training.
- Returns:
self – Returns the instance itself (or (history, self) if return_history=True).
- Return type:
- transform(X, coordinates=None, groups=None)[source]
Transform X using the fitted model.
For non-spatial models, uses NNLS multiplicative updates to find exp(F). For spatial models, uses GP predictive equations at new coordinates.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input data matrix.
coordinates (array-like of shape (n_samples, 2), optional) – Spatial coordinates for new samples. Required when spatial=True.
groups (array-like of shape (n_samples,), optional) – Group assignments for new samples. Required when spatial=True and multigroup=True.
- Returns:
transformed – Transformed data (exp(F) in our model notation).
- Return type:
ndarray of shape (n_samples, n_components)
- fit_transform(X, coordinates=None, groups=None, **kwargs)[source]
Fit the model and transform X.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input data matrix.
coordinates (array-like of shape (n_samples, 2), optional) – Spatial coordinates. Required when spatial=True.
groups (array-like of shape (n_samples,), optional) – Group assignments. Required when spatial=True and multigroup=True.
**kwargs (Additional arguments to pass to fit())
- Returns:
transformed – Transformed data (exp(F) in our model notation).
- Return type:
ndarray of shape (n_samples, n_components)
- inverse_transform(transformed)[source]
Transform data back to its original space.
Reconstructs X from exp(F) using: X ≈ exp(F) @ W.T
- Parameters:
transformed (array-like of shape (n_samples, n_components)) – Transformed data (exp(F) in our model notation).
- Returns:
X_reconstructed – Reconstructed data in original space.
- Return type:
ndarray of shape (n_samples, n_features)
Initialization
- PNMF.initialization.initialize_factors(X, n_components, init=None, random_state=None, scale_log_mu=True, target_std=1.0)[source]
Initialize W and exp(F) matrices for PNMF.
This function provides sklearn-compatible initialization for PNMF. The returned matrices can be used to initialize the model parameters.
- Parameters:
X (ndarray or torch.Tensor of shape (n_samples, n_features)) – Non-negative input data matrix.
n_components (int) – Number of latent components.
init ({None, 'random', 'nndsvd', 'nndsvda', 'nndsvdar', 'k-means'}, default=None) –
Initialization method: - None: Auto-select ‘nndsvda’ if n_components <= min(n_samples, n_features),
otherwise ‘random’.
’random’: Non-negative random matrices, scaled with sqrt(X.mean() / n_components).
’nndsvd’: Nonnegative Double SVD (better for sparseness).
’nndsvda’: NNDSVD with zeros filled with average of X (better for dense data).
’nndsvdar’: NNDSVD with zeros filled with small random values (faster dense).
’k-means’: K-means clustering based initialization.
random_state (int or None) – Random seed for reproducibility.
scale_log_mu (bool, default=True) – If True, scales exp_F so that log(exp_F) has approximately N(0, 1) distribution. This ensures the variational mean μ is on a consistent scale across different datasets and initialization methods.
target_std (float, default=1.0) – Target standard deviation for log(exp_F) when scale_log_mu=True.
- Return type:
- Returns:
W (ndarray of shape (n_features, n_components)) – Loadings matrix initialization (for W parameter).
exp_F (ndarray of shape (n_samples, n_components)) – Expected latent factors initialization (for exp(F) where F is in log-space).
Examples
>>> import numpy as np >>> from PNMF.initialization import initialize_factors >>> X = np.random.poisson(5, size=(100, 50)).astype(np.float32) >>> W, exp_F = initialize_factors(X, n_components=5, init='nndsvda', random_state=42) >>> print(W.shape) # (50, 5) >>> print(exp_F.shape) # (100, 5)
Notes
- For PNMF, the model structure is:
X ≈ W @ exp(F).T
where: - W is (n_features, n_components) - loadings - F is the latent factor matrix (n_samples, n_components) in log-space - exp(F) is the expected value of the latent factors
The variational distribution q(F) is Gaussian with mean μ and scale σ. To initialize from exp_F:
μ = log(exp_F) (or log(exp_F + eps) if exp_F has zeros) σ can be initialized to a small value (e.g., 0.1)
Transform and Utility Functions
Factor Extraction
- PNMF.transforms.log_factors(model, coordinates=None, groups=None, return_tensor=False)[source]
Get log-space latent factors (μ from q(F) = Normal(μ, σ²)).
- Parameters:
model (PNMF) – A fitted PNMF model.
coordinates (array-like, optional) – Spatial coordinates. For spatial models, uses stored training coordinates if not provided.
groups (array-like, optional) – Group assignments. For spatial models, uses stored training groups if not provided.
return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.
- Returns:
F_log – Latent factors in log-space (μ from the variational distribution).
- Return type:
ndarray or Tensor of shape (n_samples, n_components)
Examples
>>> from PNMF import PNMF >>> from PNMF.transforms import log_factors >>> model = PNMF(n_components=5).fit(X) >>> F_log = log_factors(model) # Shape: (n_samples, n_components)
- PNMF.transforms.factor_uncertainty(model, return_variance=False, coordinates=None, groups=None, return_tensor=False)[source]
Get uncertainty in latent factors (σ or σ² from q(F) = Normal(μ, σ²)).
- Parameters:
model (PNMF) – A fitted PNMF model.
return_variance (bool, default=False) – If True, return variance (σ²). If False, return standard deviation (σ).
coordinates (array-like, optional) – Spatial coordinates (for spatial models).
groups (array-like, optional) – Group assignments (for spatial models).
return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.
- Returns:
uncertainty – Standard deviation (σ) or variance (σ²) of the variational distribution.
- Return type:
ndarray or Tensor of shape (n_samples, n_components)
Examples
>>> from PNMF import PNMF >>> from PNMF.transforms import factor_uncertainty >>> model = PNMF(n_components=5).fit(X) >>> F_std = factor_uncertainty(model) # σ, shape: (n_samples, n_components) >>> F_var = factor_uncertainty(model, return_variance=True) # σ²
- PNMF.transforms.factor_samples(model, n_samples=100, return_exp=False, coordinates=None, groups=None, return_tensor=False)[source]
Sample latent factors from the variational posterior q(F).
- Parameters:
model (PNMF) – A fitted PNMF model.
n_samples (int, default=100) – Number of samples to draw.
return_exp (bool, default=False) – If True, return exp(F) samples. If False, return F samples.
coordinates (array-like, optional) – Spatial coordinates (for spatial models).
groups (array-like, optional) – Group assignments (for spatial models).
return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.
- Returns:
samples – Samples from the variational posterior.
- Return type:
ndarray or Tensor of shape (n_samples, n_data_samples, n_components)
Notes
Uses the reparameterization trick for sampling.
Examples
>>> from PNMF import PNMF >>> from PNMF.transforms import factor_samples >>> model = PNMF(n_components=5).fit(X) >>> samples = factor_samples(model, n_samples=100) # (100, n_data, 5) >>> exp_samples = factor_samples(model, n_samples=50, return_exp=True)
Model Accessors
- PNMF.transforms.get_loadings(model, return_tensor=False)[source]
Get the loadings matrix W from a fitted PNMF model.
- Parameters:
- Returns:
W – The loadings matrix W.
- Return type:
ndarray or Tensor of shape (n_features, n_components)
Notes
This is equivalent to model.components_.T but provides a consistent interface and works with both fitted and unfitted (but initialized) models.
Examples
>>> from PNMF import PNMF >>> from PNMF.transforms import get_loadings >>> model = PNMF(n_components=5).fit(X) >>> W = get_loadings(model) # Shape: (n_features, n_components)
- PNMF.transforms.get_prior(model)[source]
Get the GaussianPrior object from a fitted PNMF model.
- Parameters:
model (PNMF) – A fitted PNMF model.
- Returns:
prior – The GaussianPrior object containing the variational distribution.
- Return type:
Notes
This provides access to the full prior for advanced users who want to directly manipulate or inspect the variational distribution.
Examples
>>> from PNMF import PNMF >>> from PNMF.transforms import get_prior >>> model = PNMF(n_components=5).fit(X) >>> prior = get_prior(model) >>> qF, pF = prior() # Get distributions directly
Conditional Inference
- PNMF.transforms.transform_F(X, W, n_components=None, mode='expanded', E=3, max_iter=200, tol=0.0001, learning_rate=0.01, verbose=False, device='auto', return_prior=False)[source]
Learn new latent factors F conditioned on fixed loadings W using variational inference.
This is a Bayesian alternative to NNLS-based transform. It learns a full variational distribution q(F) for the new data given the fixed W.
- Parameters:
X (array-like of shape (n_samples, n_features)) – New data to transform.
W (array-like of shape (n_features, n_components)) – Fixed loadings matrix.
n_components (int, optional) – Number of components. Inferred from W if not specified.
mode ({'simple', 'expanded', 'lower-bound'}, default='expanded') – ELBO computation mode.
E (int, default=3) – Number of Monte Carlo samples for ELBO estimation.
max_iter (int, default=200) – Maximum number of iterations.
tol (float, default=1e-4) – Tolerance for convergence.
learning_rate (float, default=0.01) – Learning rate for the optimizer.
verbose (bool, default=False) – Whether to print progress messages.
device ({'cpu', 'cuda', 'mps', 'auto'}, default='auto') – Device to use for computation.
return_prior (bool, default=False) – If True, return the GaussianPrior object instead of exp(F).
- Return type:
Union[ndarray,GaussianPrior]- Returns:
F_exp (ndarray of shape (n_samples, n_components)) – Transformed data E[exp(F)]. Returned if return_prior=False.
prior (GaussianPrior) – The fitted GaussianPrior object. Returned if return_prior=True.
Examples
>>> from PNMF import PNMF >>> from PNMF.transforms import transform_F, get_loadings >>> # Fit original model >>> model = PNMF(n_components=5).fit(X_train) >>> W = get_loadings(model) >>> # Transform new data with full VI >>> F_new = transform_F(X_new, W) # Shape: (n_new, 5) >>> # Or get the full prior for uncertainty quantification >>> prior_new = transform_F(X_new, W, return_prior=True)
- PNMF.transforms.transform_W(X, F, W_init=None, n_components=None, max_iter=100, tol=0.0001, verbose=False, device='auto')[source]
Learn new loadings W conditioned on fixed latent factors F.
Uses multiplicative updates (NNLS-style) for fast convergence while displaying the Poisson negative log-likelihood on the progress bar.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Data to fit.
F (array-like of shape (n_samples, n_components)) – Fixed latent factors in exp-space (exp(F) from the model).
W_init (array-like of shape (n_features, n_components), optional) – Initial loadings W. If None, uses random initialization.
n_components (int, optional) – Number of components. Inferred from F if not specified.
max_iter (int, default=100) – Maximum number of iterations.
tol (float, default=1e-4) – Tolerance for convergence.
verbose (bool, default=False) – Whether to print progress messages.
device ({'cpu', 'cuda', 'mps', 'auto'}, default='auto') – Device to use for computation.
- Returns:
W – The learned loadings matrix.
- Return type:
ndarray of shape (n_features, n_components)
Notes
This function uses multiplicative updates for non-negative matrix factorization, which is equivalent to coordinate descent on the Poisson log-likelihood.
- The update rule for W is:
W = W * (X @ H) / (W @ (H.T @ H) + eps)
where H = F (the fixed factors in exp-space).
For uncertainty quantification in F, consider using factor_samples() and running this function multiple times with different F samples.
Examples
>>> from PNMF import PNMF >>> from PNMF.transforms import transform_W, get_factors >>> # Fit original model >>> model = PNMF(n_components=5).fit(X_train) >>> F_exp = get_factors(model) # exp-space factors >>> # Learn new W for different data with same factors >>> W_new = transform_W(X_new, F_exp) # Shape: (n_features, 5)
Prior Utilities
- PNMF.transforms.log_factors_from_prior(prior, return_tensor=False)[source]
Get log-space latent factors from a GaussianPrior object.
- Parameters:
prior (GaussianPrior) – A GaussianPrior object.
return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.
- Returns:
F_log – Latent factors in log-space.
- Return type:
ndarray or Tensor of shape (n_samples, n_components)
Examples
>>> from PNMF.transforms import transform_F, log_factors_from_prior >>> prior = transform_F(X_new, W, return_prior=True) >>> F_log = log_factors_from_prior(prior)
- PNMF.transforms.factors_from_prior(prior, use_mgf=True, return_tensor=False)[source]
Get exp-space latent factors from a GaussianPrior object.
- Parameters:
prior (GaussianPrior) – A GaussianPrior object.
use_mgf (bool, default=True) – If True, compute E[exp(F)] = exp(μ + σ²/2). If False, compute exp(μ) directly.
return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.
- Returns:
F_exp – Latent factors in exp-space.
- Return type:
ndarray or Tensor of shape (n_samples, n_components)
Examples
>>> from PNMF.transforms import transform_F, factors_from_prior >>> prior = transform_F(X_new, W, return_prior=True) >>> F_exp = factors_from_prior(prior)
- PNMF.transforms.uncertainty_from_prior(prior, return_variance=False, return_tensor=False)[source]
Get uncertainty from a GaussianPrior object.
- Parameters:
prior (GaussianPrior) – A GaussianPrior object.
return_variance (bool, default=False) – If True, return variance (σ²). If False, return standard deviation (σ).
return_tensor (bool, default=False) – If True, return a torch.Tensor instead of numpy array.
- Returns:
uncertainty – Standard deviation or variance.
- Return type:
ndarray or Tensor of shape (n_samples, n_components)
Examples
>>> from PNMF.transforms import transform_F, uncertainty_from_prior >>> prior = transform_F(X_new, W, return_prior=True) >>> F_std = uncertainty_from_prior(prior)
PyTorch Models
- class PNMF.models.PoissonFactorization(prior, y, L=10, loadings_mode='softplus', mode='expanded')[source]
Bases:
ModulePoisson Factorization base model with variational inference.
This model uses Poisson factorization with variational inference to learn non-negative factor matrices. The latent factors F are sampled from a Gaussian variational distribution, and the loadings W are positive parameters.
- Parameters:
prior – A GaussianPrior object providing variational and prior distributions
y – Input data tensor of shape (D, N) where D is features, N is samples
L (default:
10) – Number of latent components (default: 10)loadings_mode (default:
'softplus') – Mode for enforcing positivity on W (‘softplus’, ‘exp’, or ‘projected’)mode (default:
'expanded') – ELBO computation mode (‘simple’, ‘expanded’, or ‘lower-bound’) - ‘simple’: Use torch.distributions.Poisson.log_prob() directly - ‘expanded’: Use hybrid Monte Carlo + analytic expectation (default) - ‘lower-bound’: Use Jensen’s lower bound (fully analytic, no MC sampling)
- prior
GaussianPrior for variational inference
- W
PositiveParameter loadings matrix of shape (D, L)
- loadings_mode
Positivity constraint mode
- D
Number of features
- N
Number of samples
- L
Number of latent components
Example
>>> import torch >>> from PNMF.priors import GaussianPrior >>> from PNMF.models import PoissonFactorization >>> y = torch.randn(100, 50) >>> prior = GaussianPrior(y, L=10) >>> model = PoissonFactorization(prior, y, L=10) >>> rate, qF, pF = model(E=10)
- __init__(prior, y, L=10, loadings_mode='softplus', mode='expanded')[source]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- get_rate(prior_samples, idy=None)[source]
Compute the Poisson rate from prior samples.
- Parameters:
prior_samples – Samples from the prior of shape (E, L, N) where E is number of samples, L is components, N is samples
idy (default:
None) – Feature indices for batching (D dimension), None for full features
- Returns:
Rate matrix of shape (E, D, N) or (E, D_batch, batch_size)
- Return type:
Z
- forward(idx=None, idy=None, E=10, X=None, coordinates=None, groups=None, spatial=False)[source]
Forward pass: compute variational distributions and log-likelihood terms.
Supports both full-batch and mini-batch training. When idx and idy are None, performs full-batch forward pass. Otherwise, computes on the specified batch indices.
- Parameters:
idx (default:
None) – Sample indices (for N dimension), None for full samplesidy (default:
None) – Feature indices (for D dimension), None for full featuresE (default:
10) – Number of Monte Carlo samples (ignored for lower-bound mode)X (default:
None) – Input data (D, N) or (D_batch, N_batch). Required for memory-efficient MC accumulation. If None and MC is needed, falls back to full-tensor mode.coordinates (default:
None) – Spatial coordinates (N_batch, 2) for GP prior. Required when spatial=True.groups (default:
None) – Group assignments (N_batch,) for MGGP prior. Required when spatial=True and using multi-group GP.spatial (default:
False) – Whether to use spatial GP forward pass.
- Returns:
- terms: dict from compute_log_likelihood_terms()
qF: Variational posterior distribution pF: Prior distribution
- For spatial:
terms: dict from compute_log_likelihood_terms() qF: Variational posterior distribution from GP predictive qU: Inducing point variational distribution pU: Inducing point prior distribution (None for whitened)
- Return type:
For non-spatial
Priors
- class PNMF.priors.GaussianPrior(y, L=10, scale_pf=1.0, use_natural_gradients=False)[source]
Bases:
ModuleGaussian prior for latent factors in variational inference.
This class represents a variational distribution qF over latent factors F, along with a prior distribution pF (standard normal). The variational distribution can be parameterized either in standard form (mean, scale) or in natural parameter form (theta1, theta2) for natural gradient descent.
- Parameters:
y – Input data tensor of shape (D, N) where D is features, N is samples
L (default:
10) – Number of latent components (default: 10)scale_pf (default:
1.0) – Scale parameter for the prior distribution (default: 1.0)use_natural_gradients (default:
False) – Use natural parameterization (default: False)
- mean
Variational mean parameter of shape (L, N) [standard mode]
- scale
PositiveParameter for scale of shape (L, N) [standard mode]
- theta1
Natural parameter θ₁ = μ/s² of shape (L, N) [natural mode]
- theta2
Natural parameter θ₂ = -1/(2s²) of shape (L, N) [natural mode]
- scale_pf
Fixed scale for the prior distribution
- use_natural_gradients
Whether natural parameterization is used
Example
>>> import torch >>> from PNMF.priors import GaussianPrior >>> y = torch.randn(100, 50) # 100 features, 50 samples >>> prior = GaussianPrior(y, L=10) # Standard mode >>> qF, pF = prior() >>> F = qF.rsample((5,)) # Sample 5 times using reparameterization
>>> # Natural gradient mode >>> prior_nat = GaussianPrior(y, L=10, use_natural_gradients=True) >>> qF, pF = prior_nat()
- __init__(y, L=10, scale_pf=1.0, use_natural_gradients=False)[source]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward()[source]
Get the variational and prior distributions.
- Returns:
Variational posterior distribution Normal(mean, scale) pF: Prior distribution Normal(0, scale_pf)
- Return type:
qF
Custom Modules
- class PNMF.custom_modules.ConstrainedParameter(shape, mode)[source]
Bases:
ModuleBase class for parameters with constraints.
Subclasses implement _to_constrained() and _to_unconstrained() to define the bijective mapping between raw (unconstrained) and constrained spaces.
The .data property returns the constrained value, and the underlying unconstrained parameter is stored in ._raw (an nn.Parameter).
- Parameters:
- __init__(shape, mode)[source]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- property grad
Returns the gradient of the underlying parameter.
- property is_cuda
Check if on CUDA.
- property is_leaf
Check if leaf tensor.
- class PNMF.custom_modules.PositiveParameter(shape, mode='softplus', elbo_mode='expanded')[source]
Bases:
ConstrainedParameterParameter constrained to be positive.
Supports any shape - batching works automatically.
- Parameters:
shape (
Union[int,Tuple[int,...]]) – int or tuple for the parameter shape.mode (
str, default:'softplus') – ‘softplus’, ‘exp’, ‘projected’, or ‘multiplicative’ for ensuring positivity.elbo_mode (
str, default:'expanded') – ELBO computation mode for multiplicative updates. Only used when mode=’multiplicative’. One of ‘lower-bound’, ‘expanded’, or ‘simple’.
- The mode determines how positivity is enforced:
‘softplus’: Uses softplus(x) = log(1 + exp(x)) transformation
‘exp’: Uses exp(x) transformation
‘projected’: Uses projected gradient descent (clamps to >= 0 after each step)
- ‘multiplicative’: Uses multiplicative updates (no gradient-based optimization).
Requires calling multiplicative_update() manually after computing update terms.
- For multiplicative mode, the update rule is:
W = W * numerator / denominator
- The numerator and denominator depend on the elbo_mode:
‘lower-bound’: Fully analytic using Jensen’s bound
‘expanded’: Hybrid MC + analytic expectation
‘simple’: Full Monte Carlo
- __init__(shape, mode='softplus', elbo_mode='expanded')[source]
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- project()[source]
Project parameters to satisfy constraints.
For ‘projected’ mode, this clamps values to be >= 0. Call this after optimizer.step() when using projected gradients.
- multiplicative_update(X, terms, idy=None, eps=1e-08)[source]
Perform multiplicative update for W using precomputed terms.
Reuses all tensors already computed by compute_log_likelihood_terms(): - exp_mu, exp_mu_sigma (always available) - rate_mean (lower-bound: W @ exp(μ), already computed) - exp_F_samples (MC modes: collected during the loop)
For MC modes, rate_e = W @ exp_F_e is computed per-sample in a loop under no_grad, avoiding the (E, D, N) batch matmul entirely.
Update rule: W_jl <- W_jl * numerator_jl / denominator_jl
- Parameters:
X (
Tensor) – Data tensor of shape (D, N) or (D_batch, N).terms (
dict) – dict from compute_log_likelihood_terms(return_samples=True). Required keys depend on elbo_mode: - All modes: ‘exp_mu’, ‘exp_mu_sigma’ - ‘lower-bound’: ‘rate_mean’ - ‘expanded’/’simple’: ‘exp_F_samples’idy (
Optional[Tensor], default:None) – Optional feature indices for batched updates. Shape: (D_batch,).eps (
float, default:1e-08) – Small constant for numerical stability.