pyproteonet.imputation.r.pca_methods.impute_pca_method

pyproteonet.imputation.r.pca_methods.impute_pca_method(dataset: Dataset, molecule: str, column: str, method: Literal['svdPca', 'ppca', 'bpca', 'svdImpute'] = 'bpca', n_pcs: int | None = None, result_column: str | None = None, molecules_as_variables: bool = False, only_transform_missing: bool = True)

Apply a principal component analysis (PCA) based imputation method as implemented by the pcaMethods R package. Available methods are svdPca, ppca, and bpca, as well as svdImpute, which is not a PCA method but a simple imputation method based on singular value decomposition. See https://bioconductor.org/packages/release/bioc/html/pcaMethods.html for more details.
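To make the idea of low-rank imputation concrete, the following is a minimal NumPy sketch of an iterative SVD-based fill. It is not the pcaMethods implementation and not what this function runs internally. The helper name `svd_impute_sketch` and its `n_iter` and `tol` arguments are hypothetical. The `n_pcs` and `only_transform_missing` arguments only mirror the meaning of the parameters of the same name in the signature above.

```python
import numpy as np


def svd_impute_sketch(x, n_pcs, only_transform_missing=True, n_iter=100, tol=1e-9):
    """Illustrative low-rank imputation of NaNs in a 2D array.

    Hypothetical helper: NOT the pcaMethods algorithm, only a sketch of the idea.
    """
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    # Start by filling every missing entry with the mean of its column.
    col_means = np.nanmean(x, axis=0)
    filled = np.where(missing, col_means, x)
    recon = filled
    for _ in range(n_iter):
        # Rank-n_pcs reconstruction of the column-centered matrix.
        center = filled.mean(axis=0)
        u, s, vt = np.linalg.svd(filled - center, full_matrices=False)
        recon = (u[:, :n_pcs] * s[:n_pcs]) @ vt[:n_pcs] + center
        # Observed entries keep their values; only missing ones are updated.
        updated = np.where(missing, recon, x)
        converged = np.max(np.abs(updated - filled)) < tol
        filled = updated
        if converged:
            break
    # Either keep observed values, or return the full low-rank reconstruction.
    return filled if only_transform_missing else recon
```

In this sketch, rows play the role of observations and columns the role of variables. Passing `x.T` and transposing the result back corresponds loosely to what `molecules_as_variables` describes.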

Args:

dataset (Dataset): Dataset to impute.
molecule (str): Molecule type to impute (e.g. protein, peptide etc.).
column (str): Name of the value column to impute.
method (Literal['svdPca', 'ppca', 'bpca', 'svdImpute'], optional): Imputation method to use. Defaults to "bpca".
n_pcs (Optional[int], optional): Number of principal components to use. If not given set to number_samples-1. Defaults to None.
result_column (Optional[str], optional): If given, name of the value column to store the imputed values in. Defaults to None.
molecules_as_variables (bool, optional): Whether to transpose the input matrix before imputation (treating molecules instead of samples as variables for the PCA). Defaults to False.
only_transform_missing (bool, optional): Whether to only predict missing values. Otherwise all values are replaced with their PCA reconstruction. Defaults to True.

Returns:

pd.Series: The imputed values.
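Example:

A minimal usage sketch based only on the signature documented above. The wrapper name `impute_protein_abundance`, its `impute_fn` argument, and the names "protein", "abundance", and "abundance_imputed" are hypothetical placeholders; substitute the molecule and column names present in your own dataset. The import is deferred into the function because the real call requires pyproteonet together with a working R installation of pcaMethods.

```python
def impute_protein_abundance(dataset, impute_fn=None):
    """Hypothetical wrapper: run BPCA imputation on a 'protein' value column."""
    if impute_fn is None:
        # Deferred import so this sketch can be loaded without pyproteonet installed.
        from pyproteonet.imputation.r.pca_methods import impute_pca_method as impute_fn
    return impute_fn(
        dataset=dataset,
        molecule="protein",                 # assumed molecule type
        column="abundance",                 # assumed name of the column with missing values
        method="bpca",                      # documented default
        n_pcs=None,                         # documented fallback: number_samples-1
        result_column="abundance_imputed",  # assumed name for the stored result
        molecules_as_variables=False,
        only_transform_missing=True,        # keep observed values, fill only the gaps
    )
```

According to the Returns entry above, the call yields a pd.Series of imputed values; when `result_column` is given, the values are also stored under that column name.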