oktoberfest.pp.merge_spectra_and_peptides

oktoberfest.pp.merge_spectra_and_peptides(spectra, search)

Merge peptides with spectra.

This function takes spectra and peptides from a search and merges them based on the RAW file identifier and the scan number, using the “RAW_FILE” and “SCAN_NUMBER” columns of the two dataframes provided.

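Parameters:

spectra (DataFrame) – Spectra to merge; must contain the “RAW_FILE” and “SCAN_NUMBER” columns
search (DataFrame) – Peptides from a search; must contain the “RAW_FILE” and “SCAN_NUMBER” columns

Return type: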

DataFrame

Returns:

Dataframe containing the matched pairs of peptides and spectra (PSMs)

Example:

>>> from oktoberfest import preprocessing as pp
>>> import numpy as np
>>> import pandas as pd
>>> search_results = pd.DataFrame({"RAW_FILE": ["File1","File2"],
...                                 "SCAN_NUMBER": [5123,4012],
...                                 "MODIFIED_SEQUENCE": ["AAAC[UNIMOD:4]RFVQ","RM[UNIMOD:35]PC[UNIMOD:4]HKPYL"],
...                                 "PRECURSOR_CHARGE": [1,2],
...                                 "SCAN_EVENT_NUMBER": [4,10],
...                                 "MASS": [1000.41,1589.1],
...                                 "SCORE": [3.64,5.45],
...                                 "REVERSE": [False,False],
...                                 "SEQUENCE": ["AAACRFVQ","RMPCHKPYL"],
...                                 "PEPTIDE_LENGTH": [8,9]})
>>> spectra = pd.DataFrame({"RAW_FILE": ["File1","File2"],
...                         "SCAN_NUMBER": [5123,4012],
...                         "INTENSITIES": [np.random.rand(174),np.random.rand(174)],
...                         "MZ": [np.random.rand(174),np.random.rand(174)],
...                         "MZ_RANGE": ["100.0-385.0","100.0-402.0"],
...                         "RETENTION_TIME": [59.1, 110.42],
...                         "MASS_ANALYZER": ["FTMS","FTMS"],
...                         "FRAGMENTATION": ["HCD","HCD"],
...                         "COLLISION_ENERGY": [27,27],
...                         "INSTRUMENT_TYPES": ["Q Exactive Plus","Q Exactive Plus"]})
>>> psms = pp.merge_spectra_and_peptides(spectra=spectra, search=search_results)
>>> print(psms)
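
For intuition, the key-based matching described above can be approximated with plain pandas. This is a minimal sketch, not the library's implementation: it assumes an inner join on “RAW_FILE” and “SCAN_NUMBER” (the join type is not stated on this page), and the name psms_sketch and the reduced toy columns are purely illustrative.

```python
import pandas as pd

# Toy inputs; only the two key columns are taken from the description above.
spectra = pd.DataFrame({
    "RAW_FILE": ["File1", "File2", "File3"],
    "SCAN_NUMBER": [5123, 4012, 77],
    "RETENTION_TIME": [59.1, 110.42, 12.0],
})
search = pd.DataFrame({
    "RAW_FILE": ["File1", "File2"],
    "SCAN_NUMBER": [5123, 4012],
    "SEQUENCE": ["AAACRFVQ", "RMPCHKPYL"],
})

# Assumption: an inner join, so a spectrum without a matching
# identification (File3, scan 77) does not appear in the result.
psms_sketch = spectra.merge(search, on=["RAW_FILE", "SCAN_NUMBER"], how="inner")
print(psms_sketch)
```

Under this assumption, each row of the result pairs one spectrum with the peptide identified for the same raw file and scan number, and the key columns appear only once.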