pyproteonet.io.io.read_multiple_mapped_dataframes
- pyproteonet.io.io.read_multiple_mapped_dataframes(dfs: Dict[str, DataFrame], sample_columns: List[str], molecule_columns: List[str] | Dict[str, List[str]] = [], mappings: List[Tuple[Tuple[str, str], Tuple[str, str]]] | Dict[str, Tuple[Tuple[str, str], Tuple[str, str]]] = [], mapping_sep=',', value_name='abundance') → Dataset
Reads multiple mapped dataframes, each containing a mapping column that holds lists of ids of the mapped molecules.
- Parameters:
dfs (Dict[str, pd.DataFrame]) – A dictionary of dataframes, where the keys represent the molecule names and the values represent the dataframes.
sample_columns (List[str]) – A list of column names representing the samples.
molecule_columns (Union[List[str], Dict[str, List[str]]], optional) – The column names representing the non-sample-specific molecule columns (e.g. sequence). It can be a list if the same columns are used for all molecules, or a dictionary if different columns are used for different molecules. Defaults to an empty list.
mappings (Union[List[Tuple[Tuple[str, str], Tuple[str, str]]], Dict[str, Tuple[Tuple[str, str], Tuple[str, str]]]], optional) – The mappings between molecules. It can be a list of tuples or a dictionary from mapping name to tuple. Each tuple describes one mapping as a pair of tuples, each of which contains the key of a dataframe and the name of its mapping column (ids from both mapping columns are matched to find the mapped molecules). Defaults to an empty list.
mapping_sep (str, optional) – The separator used in the mapping columns to separate ids. Defaults to “,”.
value_name (str, optional) – The name of the created value column. Defaults to “abundance”.
- Returns:
A Dataset object.
- Return type: