PlnPCAcollection

Bases: object

A collection of PlnPCA models indexed by rank: the value at key q is the PlnPCA object with rank q.

Examples

>>> from pyPLNmodels import PlnPCAcollection, get_real_count_data, get_simulation_parameters, sample_pln
>>> endog, labels = get_real_count_data(return_labels = True)
>>> data = {"endog": endog}
>>> plnpcas = PlnPCAcollection.from_formula("endog ~ 1", data = data, ranks = [5,8, 12])
>>> plnpcas.fit()
>>> print(plnpcas)
>>> plnpcas.show()
>>> print(plnpcas.best_model())
>>> print(plnpcas[5])

>>> plnparam = get_simulation_parameters(n_samples =100, dim = 60, nb_cov = 2, rank = 8)
>>> endog = sample_pln(plnparam)
>>> data = {"endog":endog, "cov": plnparam.exog, "offsets": plnparam.offsets}
>>> plnpcas = PlnPCAcollection.from_formula("endog ~ 0 + cov", data = data, ranks = [5,8,12])
>>> plnpcas.fit()
>>> print(plnpcas)
>>> plnpcas.show()

See also

PlnPCA, from_formula()

property batch_size: Tensor

Property representing the batch_size.

Returns:: The batch_size.
Return type:: torch.Tensor

property best_AIC_model_rank: int

Property representing the rank of the best model according to the AIC criterion.

Returns:: The rank of the best model.
Return type:: int

property best_BIC_model_rank: int

Property representing the rank of the best model according to the BIC criterion.

Returns:: The rank of the best model.
Return type:: int

best_model(criterion: str = 'AIC') → Any

Get the best model according to the specified criterion.

Parameters:: criterion (str, optional) – The criterion to use (‘AIC’ or ‘BIC’), by default ‘AIC’.
Returns:: The best model.
Return type:: Any
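
The selection rule can be illustrated without the library. The sketch below is not the pyPLNmodels implementation: it assumes one criterion value per rank and that the lowest value wins. The helper name best_rank and all score values are hypothetical.

```python
from typing import Dict


def best_rank(scores: Dict[int, float]) -> int:
    """Return the rank with the smallest criterion value (assumed convention)."""
    if not scores:
        raise ValueError("scores must contain at least one rank")
    return min(scores, key=scores.get)


# Hypothetical criterion values keyed by rank, for illustration only.
aic_by_rank = {5: 1250.0, 8: 1190.5, 12: 1203.2}
bic_by_rank = {5: 1310.0, 8: 1325.7, 12: 1402.9}

criterion = "AIC"
scores = aic_by_rank if criterion == "AIC" else bic_by_rank
print(best_rank(scores))
```

With these made-up values the two criteria disagree, which is why best_model() exposes the criterion as a parameter.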

property coef: Dict[int, Tensor]

Property representing the coefficients.

Returns:: The coefficients.
Return type:: Dict[int, torch.Tensor]

property components: Dict[int, Tensor]

Property representing the components.

Returns:: The components.
Return type:: Dict[int, torch.Tensor]

property dim: int

Property representing the dimension.

Returns:: The dimension.
Return type:: int

property endog: Tensor

Property representing the endog.

Returns:: The endog.
Return type:: torch.Tensor

property exog: Tensor

Property representing the exog.

Returns:: The exog.
Return type:: torch.Tensor

fit(nb_max_iteration: int = 50000, *, lr: float = 0.01, tol: float = 0.001, do_smart_init: bool = True, verbose: bool = False, batch_size: int | None = None)

Fit each model in the PlnPCAcollection.

Parameters:

nb_max_iteration (int, optional) – The maximum number of iterations, by default 50000.
lr (float, optional(keyword-only)) – The learning rate, by default 0.01.
tol (float, optional(keyword-only)) – The tolerance, by default 0.001.
do_smart_init (bool, optional(keyword-only)) – Whether to do smart initialization, by default True.
verbose (bool, optional(keyword-only)) – Whether to print verbose output, by default False.
batch_size (int, optional(keyword-only)) – The batch size used when optimizing the ELBO. If None, full-batch gradient descent is performed (i.e. batch_size = n_samples).

Raises:

ValueError – If batch_size is greater than the number of samples, or is not an int.
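
The batch_size rules described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the library's code: the helper names resolve_batch_size and iter_batches are hypothetical, and the check for non-positive values is an extra guard that the documentation does not mention.

```python
from typing import Iterator, List, Optional


def resolve_batch_size(batch_size: Optional[int], n_samples: int) -> int:
    """Apply the documented rules: None means full batch, otherwise validate."""
    if batch_size is None:
        return n_samples  # full-batch gradient descent
    if not isinstance(batch_size, int) or isinstance(batch_size, bool):
        raise ValueError("batch_size must be an int or None")
    if batch_size < 1:
        # Extra guard for this sketch only (not stated in the documentation).
        raise ValueError("batch_size must be positive")
    if batch_size > n_samples:
        raise ValueError("batch_size cannot exceed the number of samples")
    return batch_size


def iter_batches(n_samples: int, batch_size: Optional[int]) -> Iterator[List[int]]:
    """Yield lists of sample indices, one list per mini-batch."""
    size = resolve_batch_size(batch_size, n_samples)
    for start in range(0, n_samples, size):
        yield list(range(start, min(start + size, n_samples)))


# 10 samples in batches of 4: the last batch holds the remaining 2 samples.
print([len(batch) for batch in iter_batches(10, 4)])
```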

classmethod from_formula(formula: str, data: Dict[str, Tensor | ndarray | DataFrame], *, offsets_formula: str = 'logsum', ranks: Iterable[int] = range(3, 5), dict_of_dict_initialization: dict | None = None, take_log_offsets: bool = False) → PlnPCAcollection

Create an instance of PlnPCAcollection from a formula.

Parameters:

formula (str) – The formula.
data (dict) – The data dictionary. Each value can be a torch.Tensor, np.ndarray or pd.DataFrame.
offsets_formula (str, optional(keyword-only)) – The formula for offsets, by default “logsum”. Overridden if data[“offsets”] is not None.
ranks (Iterable[int], optional(keyword-only)) – The ranks of the PlnPCA models to include in the collection, by default range(3, 5).
dict_of_dict_initialization (dict, optional(keyword-only)) – The dictionary of initialization, by default None.
take_log_offsets (bool, optional(keyword-only)) – Whether to take the logarithm of offsets, by default False.

Returns:

The created PlnPCAcollection instance.

Return type:

PlnPCAcollection

Examples

>>> from pyPLNmodels import PlnPCAcollection, get_real_count_data
>>> endog = get_real_count_data()
>>> data = {"endog": endog}
>>> pca_col = PlnPCAcollection.from_formula("endog ~ 1", data = data, ranks = [5,6])
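
The default offsets_formula = "logsum" is not spelled out above. The sketch below shows one plausible reading, stated here as an assumption rather than as the library's definition: each sample's offset is the log of its total count, repeated across all dimensions. The helper name logsum_offsets and the toy counts are hypothetical.

```python
import math
from typing import List


def logsum_offsets(endog: List[List[int]]) -> List[List[float]]:
    """Assumed 'logsum' rule: log of each row's total count, repeated per column."""
    offsets = []
    for row in endog:
        total = sum(row)
        if total <= 0:
            raise ValueError("each sample needs a positive total count")
        offsets.append([math.log(total)] * len(row))
    return offsets


# Toy count matrix: 2 samples, 3 dimensions.
counts = [[1, 0, 3], [2, 5, 3]]
for row in logsum_offsets(counts):
    print([round(value, 3) for value in row])
```

Under this reading, supplying data["offsets"] explicitly would replace these computed values, as the parameter description states.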