oktoberfest.pp.merge_spectra_and_peptides
- oktoberfest.pp.merge_spectra_and_peptides(spectra, search)
Merge peptides with spectra.
This function takes spectra and peptides from a search and merges them on the RAW file identifier and the scan number, using the “RAW_FILE” and “SCAN_NUMBER” columns of the two dataframes provided.
- Parameters:
spectra (DataFrame) – MS2 spectra of a mass spectrometry run
search (DataFrame) – Peptides from a search for the given spectra in internal format (see Custom search results)
- Return type:
DataFrame
- Returns:
Dataframe containing the matched pairs of peptides and spectra (PSMs)
- Example:
>>> from oktoberfest import preprocessing as pp
>>> import numpy as np
>>> import pandas as pd
>>> search_results = pd.DataFrame({"RAW_FILE": ["File1","File2"],
...                                "SCAN_NUMBER": [5123,4012],
...                                "MODIFIED_SEQUENCE": ["AAAC[UNIMOD:4]RFVQ","RM[UNIMOD:35]PC[UNIMOD:4]HKPYL"],
...                                "PRECURSOR_CHARGE": [1,2],
...                                "SCAN_EVENT_NUMBER": [4,10],
...                                "MASS": [1000.41,1589.1],
...                                "SCORE": [3.64,5.45],
...                                "REVERSE": [False,False],
...                                "SEQUENCE": ["AAACRFVQ","RMPCHKPYL"],
...                                "PEPTIDE_LENGTH": [8,9]})
>>> spectra = pd.DataFrame({"RAW_FILE": ["File1","File2"],
...                         "SCAN_NUMBER": [5123,4012],
...                         "INTENSITIES": [np.random.rand(174),np.random.rand(174)],
...                         "MZ": [np.random.rand(174),np.random.rand(174)],
...                         "MZ_RANGE": ["100.0-385.0","100.0-402.0"],
...                         "RETENTION_TIME": [59.1, 110.42],
...                         "MASS_ANALYZER": ["FTMS","FTMS"],
...                         "FRAGMENTATION": ["HCD","HCD"],
...                         "COLLISION_ENERGY": [27,27],
...                         "INSTRUMENT_TYPES": ["Q Exactive Plus","Q Exactive Plus"]})
>>> psms = pp.merge_spectra_and_peptides(spectra=spectra, search=search_results)
>>> print(psms)
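The merge on “RAW_FILE” and “SCAN_NUMBER” described above can be illustrated with plain pandas. This is a minimal sketch, not Oktoberfest's implementation: the helper name `merge_on_file_and_scan` is hypothetical, and the use of an inner join (dropping spectra that have no matching peptide) is an assumption, since this page does not state the join type.

```python
import pandas as pd


def merge_on_file_and_scan(spectra: pd.DataFrame, search: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in showing a two-key merge of search results and spectra."""
    # Assumption: an inner join, so only rows present in both frames survive.
    return search.merge(spectra, on=["RAW_FILE", "SCAN_NUMBER"], how="inner")


# Three spectra, but only two of them were identified by the search.
spectra = pd.DataFrame({
    "RAW_FILE": ["File1", "File2", "File3"],
    "SCAN_NUMBER": [5123, 4012, 77],
    "RETENTION_TIME": [59.1, 110.42, 12.0],
})
search = pd.DataFrame({
    "RAW_FILE": ["File1", "File2"],
    "SCAN_NUMBER": [5123, 4012],
    "SEQUENCE": ["AAACRFVQ", "RMPCHKPYL"],
})

psms = merge_on_file_and_scan(spectra, search)
print(psms)
```

Under the inner-join assumption, the unmatched spectrum from "File3" does not appear in the result, and each remaining row carries the columns of both input frames.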