Custom search results
If the search engine that produced your results is not directly supported by Oktoberfest, you can manually transform its output into the internal Prosit file format. To use this format, set the following parameter in your config file:
"search_results_type": "internal",
Internal file format specification
Oktoberfest expects a CSV-formatted file in which each row represents one PSM. The basic file format is specified below. Additional columns can be added; they are forwarded to Percolator / mokapot if they are listed in the “add_feature_cols” option.
| Column Header | Explanation |
|---|---|
| RAW_FILE | Name of the RAW or mzML file associated with the PSM, without the file extension |
| SCAN_NUMBER | Sequential number of an individual scan during the mass spectrometry run, derived from the RAW file |
| MODIFIED_SEQUENCE | Peptide sequence including modifications in UNIMOD format |
| PRECURSOR_CHARGE | Charge state of the precursor ion |
| SCAN_EVENT_NUMBER | Optional number of the specific scan event within the acquisition cycle; it depends on the search engine used and might not be present |
| MASS | Monoisotopic mass of the peptide including modifications |
| SCORE | Search engine derived score. The highest score is considered “best”, so scores from a search engine that does not follow this rule must be transformed manually |
| REVERSE | Whether the PSM is a decoy (“True”) or a target (“False”) |
| SEQUENCE | Unmodified peptide sequence |
| PEPTIDE_LENGTH | Length of the unmodified peptide sequence |
| <optional cols> | Optional additional feature columns, forwarded directly to Percolator / mokapot if listed in the config file under the “add_feature_cols” option. These columns must contain only valid floating point numbers. |
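
As an illustration of the manual transformation, the following is a minimal sketch of how PSMs from an unsupported search engine could be mapped onto the columns above. It is not part of Oktoberfest. The input keys (`raw_file`, `scan_number`, `is_decoy`, and so on) and the helper names are hypothetical, and the sketch assumes that modifications are already written as `[UNIMOD:n]` tags. Negating the score is only one possible way to make the highest score the best one; another monotone transformation may suit your search engine better.

```python
import csv
import io
import re

# Column order taken from the specification table above.
INTERNAL_COLUMNS = [
    "RAW_FILE", "SCAN_NUMBER", "MODIFIED_SEQUENCE", "PRECURSOR_CHARGE",
    "SCAN_EVENT_NUMBER", "MASS", "SCORE", "REVERSE", "SEQUENCE", "PEPTIDE_LENGTH",
]

UNIMOD_TAG = re.compile(r"\[UNIMOD:\d+\]")


def strip_modifications(modified_sequence):
    """Drop [UNIMOD:n] tags (and any terminal '-' separators) to get the bare sequence."""
    return UNIMOD_TAG.sub("", modified_sequence).replace("-", "")


def to_internal_row(psm, lower_is_better=False):
    """Map one PSM dict (hypothetical input keys) onto the internal columns."""
    sequence = strip_modifications(psm["modified_sequence"])
    score = float(psm["score"])
    return {
        "RAW_FILE": psm["raw_file"],
        "SCAN_NUMBER": int(psm["scan_number"]),
        "MODIFIED_SEQUENCE": psm["modified_sequence"],
        "PRECURSOR_CHARGE": int(psm["charge"]),
        "SCAN_EVENT_NUMBER": psm["scan_event_number"],
        "MASS": float(psm["mass"]),
        # The highest score must be the best one, so flip the sign if needed.
        "SCORE": -score if lower_is_better else score,
        "REVERSE": "True" if psm["is_decoy"] else "False",
        "SEQUENCE": sequence,
        "PEPTIDE_LENGTH": len(sequence),
    }


def write_internal_csv(psms, handle, lower_is_better=False):
    """Write the PSMs as CSV with the internal header row."""
    writer = csv.DictWriter(handle, fieldnames=INTERNAL_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for psm in psms:
        writer.writerow(to_internal_row(psm, lower_is_better))


# Illustrative PSM; the values mirror one row of the example in this document.
example_psm = {
    "raw_file": "GN20170722_SK_HLA_G0103_R1_01",
    "scan_number": 23561,
    "modified_sequence": "AMILGKM[UNIMOD:35]VL",
    "charge": 2,
    "scan_event_number": 6,
    "mass": 990.56059,
    "score": 47.574,
    "is_decoy": False,
}

buffer = io.StringIO()
write_internal_csv([example_psm], buffer)
print(buffer.getvalue())
```

Deriving SEQUENCE and PEPTIDE_LENGTH from MODIFIED_SEQUENCE in one place keeps the three columns consistent with each other.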
Example
RAW_FILE,SCAN_NUMBER,MODIFIED_SEQUENCE,PRECURSOR_CHARGE,SCAN_EVENT_NUMBER,MASS,SCORE,REVERSE,SEQUENCE,PEPTIDE_LENGTH
GN20170722_SK_HLA_G0103_R1_02,51825,AAAAPPWAC[UNIMOD:4]FAAV,2,2,1301.6227,3.9695,True,AAAAPPWACFAAV,13
GN20170722_SK_HLA_G0103_R1_01,23561,AMILGKM[UNIMOD:35]VL,2,6,990.56059,47.574,False,AMILGKMVL,9
GN20170722_SK_HLA_G0103_R2_01,30017,APAKPGGP,1,9,693.38097,2.9045,False,APAKPGGP,8
GN20170722_SK_HLA_G0103_R2_02,27809,APIC[UNIMOD:4]SEAYSHC[UNIMOD:4]C[UNIMOD:4]DC[UNIMOD:4]F,6,10,1875.6685,0.0,True,APICSEAYSHCCDCF,15
GN20170722_SK_HLA_G0103_R1_01,13509,KM[UNIMOD:35]PAC[UNIMOD:4]NIM[UNIMOD:35]L,2,11,1108.5079,57.047,False,KMPACNIML,9
GN20170722_SK_HLA_G0103_R2_01,29784,KNHGRARW,3,2,1023.5475,2.6794,False,KNHGRARW,8
GN20170722_SK_HLA_G0103_R1_02,26938,KNHKKSHK,3,5,1005.5832,5.4817,True,KNHKKSHK,8
GN20170722_SK_HLA_G0103_R1_02,31215,KNHKKSHK,3,10,1005.5832,4.4157,True,KNHKKSHK,8
GN20170722_SK_HLA_G0103_R2_02,25018,KNHKKSHK,3,4,1005.5832,6.8953,True,KNHKKSHK,8
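
Before handing a hand-made file to Oktoberfest, it can help to check it against the specification. The sketch below is one possible sanity check, not an Oktoberfest feature: the function name `check_internal_csv` is hypothetical, and it assumes that SCAN_EVENT_NUMBER may be left out entirely because the specification describes it as optional.

```python
import csv
import io
import re

# Columns described as mandatory in the specification; SCAN_EVENT_NUMBER is
# treated as optional here (an assumption based on its description).
REQUIRED_COLUMNS = {
    "RAW_FILE", "SCAN_NUMBER", "MODIFIED_SEQUENCE", "PRECURSOR_CHARGE",
    "MASS", "SCORE", "REVERSE", "SEQUENCE", "PEPTIDE_LENGTH",
}
UNIMOD_TAG = re.compile(r"\[UNIMOD:\d+\]")


def check_internal_csv(handle):
    """Return a list of human-readable problems found in an internal-format CSV."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        bare = UNIMOD_TAG.sub("", row["MODIFIED_SEQUENCE"]).replace("-", "")
        if row["REVERSE"] not in ("True", "False"):
            problems.append(f"line {line_no}: REVERSE must be True or False")
        if row["SEQUENCE"] != bare:
            problems.append(f"line {line_no}: SEQUENCE does not match MODIFIED_SEQUENCE")
        if row["PEPTIDE_LENGTH"].strip() != str(len(row["SEQUENCE"])):
            problems.append(f"line {line_no}: PEPTIDE_LENGTH does not match SEQUENCE")
        try:
            float(row["SCORE"])
        except ValueError:
            problems.append(f"line {line_no}: SCORE is not a number")
    return problems


# Two rows copied from the example above.
good = io.StringIO(
    "RAW_FILE,SCAN_NUMBER,MODIFIED_SEQUENCE,PRECURSOR_CHARGE,SCAN_EVENT_NUMBER,"
    "MASS,SCORE,REVERSE,SEQUENCE,PEPTIDE_LENGTH\n"
    "GN20170722_SK_HLA_G0103_R1_02,51825,AAAAPPWAC[UNIMOD:4]FAAV,2,2,1301.6227,"
    "3.9695,True,AAAAPPWACFAAV,13\n"
    "GN20170722_SK_HLA_G0103_R1_01,23561,AMILGKM[UNIMOD:35]VL,2,6,990.56059,"
    "47.574,False,AMILGKMVL,9\n"
)
print(check_internal_csv(good))
```

For a real file, the same function can be called on an open file handle, for example `with open(path, newline="") as handle: check_internal_csv(handle)`, where `path` is whatever location you saved the CSV to.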