pyproteonet.metrics.differential_expression.evaluate_des
- pyproteonet.metrics.differential_expression.evaluate_des(dataset: Dataset, molecule: str, columns: str | List[str], numerator_samples: List[str], denominator_samples: List[str], gt_fc: Series, min_fc: float = 1.5, max_pvalue: float = 0.05, is_log: bool = False, absolute_metrics: bool = False) → DataFrame
- Compares the molecules detected as differentially expressed against the known ground truth differential expression, as defined by a ground truth fold change.
Evaluation is done with respect to Precision, Recall, Specificity, Accuracy, FP Rate and F1 Score (or corresponding absolute metrics if absolute_metrics is True).
- Parameters:
dataset (Dataset) – The dataset to find differentially expressed molecules in.
molecule (str) – The molecule type to find differentially expressed molecules for.
columns (Union[str, List[str]]) – The value column(s) containing the abundance values of potentially differentially expressed molecules.
numerator_samples (List[str]) – List of sample names to use as numerator when computing the fold change.
denominator_samples (List[str]) – List of sample names to use as denominator when computing the fold change.
gt_fc (pd.Series) – Ground truth fold change for each molecule.
min_fc (float, optional) – Minimum fold change required for a molecule to be considered differentially expressed. Applies to both increases and decreases in abundance (e.g. a minimum fold change of 2 means that fold changes of both 2 and 0.5 are considered potentially differentially expressed). Defaults to 1.5.
max_pvalue (float, optional) – P-value to use as the significance threshold. Defaults to 0.05.
is_log (bool, optional) – Whether the column values are logarithmized. Defaults to False.
absolute_metrics (bool, optional) – Whether to return absolute numbers of correctly/incorrectly found DEs or whether to use relative metrics (Precision, Recall …). Defaults to False.
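The interaction of min_fc and max_pvalue described above can be illustrated with a minimal sketch. This is not the library's implementation: the helper name passes_thresholds is hypothetical, the fold change is assumed to be non-logarithmized, and the inclusive comparisons (>= and <=) are assumptions based on the min_fc description.

```python
def passes_thresholds(fold_change, p_value, min_fc=1.5, max_pvalue=0.05):
    """Hypothetical helper: decide whether one molecule counts as
    differentially expressed under a symmetric fold-change threshold.

    Assumes fold_change is a plain ratio (not log-transformed) and that
    both thresholds are inclusive; the real function may differ.
    """
    # Symmetric check: an increase by min_fc or a decrease to 1 / min_fc.
    large_enough = fold_change >= min_fc or fold_change <= 1.0 / min_fc
    # Significance check against the p-value threshold.
    significant = p_value <= max_pvalue
    return large_enough and significant


# With min_fc=2, fold changes of 2 and 0.5 both qualify if significant.
up = passes_thresholds(2.0, 0.01, min_fc=2)
down = passes_thresholds(0.5, 0.01, min_fc=2)
small = passes_thresholds(1.2, 0.01, min_fc=2)
not_significant = passes_thresholds(3.0, 0.2, min_fc=2)
```

For logarithmized values (is_log=True) the same idea would presumably apply to the absolute log fold change, but the log base used by the library is not stated here.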
- Returns:
The evaluation results according to the calculated metrics.
- Return type:
pd.DataFrame
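As a rough illustration of the relative metrics listed above, the following sketch computes them from a list of ground truth labels and a list of predictions using their standard confusion-matrix definitions. The function name de_metrics and the dictionary layout are hypothetical and do not reflect the structure of the DataFrame actually returned by evaluate_des.

```python
def de_metrics(truth, predicted):
    """Hypothetical helper: standard confusion-matrix metrics for
    boolean ground truth and predicted differential-expression labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "Precision": precision,
        "Recall": recall,
        "Specificity": ratio(tn, tn + fp),
        "Accuracy": ratio(tp + tn, len(pairs)),
        "FP Rate": ratio(fp, fp + tn),
        "F1 Score": ratio(2 * precision * recall, precision + recall),
    }


# Five molecules: 2 true positives, 1 false negative,
# 1 false positive, 1 true negative.
truth = [True, True, False, False, True]
predicted = [True, False, True, False, True]
metrics = de_metrics(truth, predicted)
```

The absolute counts (tp, fp, tn, fn) in this sketch roughly correspond to what absolute_metrics=True is described as returning, although the exact column names are not specified here.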