oktoberfest.pl.plot_score_distribution

oktoberfest.pl.plot_score_distribution(target, decoy, level, filename)

Generate a histogram of the score distribution for targets and decoys.

Parameters:
  • target (DataFrame) – target output from mokapot or Percolator

  • decoy (DataFrame) – decoy output from mokapot or Percolator

  • level (str) – the level on which to produce the comparison; either “peptide” or “psm”

  • filename (Union[str, Path]) – the path where the plot is stored

Example:

>>> from oktoberfest import plotting as pl
>>> import pandas as pd
>>> # Required columns: PSMId, score, q-value and peptide
>>> target_df = pd.DataFrame({"PSMId": ["F1-15-TAIASPEK-1-5","F1-5-HARPQTTLR-2-6","F2-14-RVYDPASPQRR-2-5",
...                             "F1-12-FSTQDHAAAAIAK-2-2","F2-63-ISDPTSPLRTR-2-9","F1-16-ADHPLRTR-1-5"],
...                             "score": [-0.1,-0.5,-0.5,0.7,0.4,0.7],
...                             "q-value": [0.005,0.008,0.002,0.006,0.004,0.001],
...                             "peptide": ["TAIASPEK","HARPQTTLR","RVYDPASPQRR",
...                             "FSTQDHAAAAIAK","ISDPTSPLRTR","ADHPLRTR"]})
>>> decoy_df = pd.DataFrame({"PSMId": ["F1-11-KLYNANYIK-3-7","F2-59-LGLTKLQLH-3-9","F1-24-EFAVEVLK-2-4"],
...                         "score": [-0.1,-0.5,-0.5],
...                         "q-value": [0.006,0.004,0.003],
...                         "peptide": ["KLYNANYIK","LGLTKLQLH","EFAVEVLK"]})
>>> pl.plot_score_distribution(target=target_df,
...                             decoy=decoy_df,
...                             level="psm",
...                             filename="./tests/doctests/output/score_distribution_plot.svg")
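The example notes four required columns (PSMId, score, q-value, peptide) and the parameter list allows two values for level. This page does not say whether plot_score_distribution checks these itself, so a caller may want a small pre-flight check before plotting. The sketch below is one possible way to do that; the names REQUIRED_COLUMNS, VALID_LEVELS, missing_columns and check_inputs are hypothetical and not part of oktoberfest.

```python
# Hypothetical pre-flight helpers; these are NOT part of oktoberfest.
# Column names and level values are taken from the documentation above.
REQUIRED_COLUMNS = ("PSMId", "score", "q-value", "peptide")
VALID_LEVELS = ("peptide", "psm")


def missing_columns(columns):
    """Return the required column names absent from `columns`, in a fixed order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def check_inputs(target_columns, decoy_columns, level):
    """Raise ValueError if `level` is not recognised or a required column is missing."""
    if level not in VALID_LEVELS:
        raise ValueError(f"level must be one of {VALID_LEVELS}, got {level!r}")
    for label, columns in (("target", target_columns), ("decoy", decoy_columns)):
        missing = missing_columns(columns)
        if missing:
            raise ValueError(f"{label} table is missing columns: {missing}")
```

With the data frames from the example, this could be called as check_inputs(target_df.columns, decoy_df.columns, "psm") before the call to pl.plot_score_distribution. The helpers take plain iterables of column names, so they do not depend on pandas themselves.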