Running Oktoberfest

This documentation provides an overview of the high-level job API and how to execute jobs for collision energy calibration, spectral library generation and rescoring.

Executing a job

Oktoberfest can be run in three different ways. The only required input is a configuration file that sets up the job (examples are provided further down, and the full documentation is on the next page).

The command for executing a job from the terminal:

python -m oktoberfest --config_path <path/to/config.json>

The command for executing a job within Python:

from oktoberfest.runner import run_job
run_job("<path/to/config.json>")

If you instead want to run Oktoberfest using the Docker image, run:

DATA=path/to/data/dir make run_oktoberfest

Note

When running with Docker, DATA must contain the spectra, the search results matching the search_results_type specified in the config, and a config.json file with the configuration. The results will be written to <DATA>/<output>/results/percolator.

A. Collision Energy Calibration

This task estimates the optimal normalised collision energy (NCE) based on a given search result. Oktoberfest will:

1. Select the 1000 highest-scoring target PSMs.
2. Perform peptide property prediction for NCE 18 to 49 in steps of one.
3. Calculate the spectral angle between predicted and experimentally observed fragment intensities for each NCE and report the best NCE, i.e. the one that reaches the highest spectral angle.
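The last step can be illustrated with a minimal sketch. It assumes the normalised spectral angle commonly used for comparing predicted and observed intensities (1 minus twice the arccosine of the cosine similarity, divided by pi) and aggregates per NCE by the mean. Both the aggregation and the helper names `spectral_angle` and `best_nce` are assumptions made for illustration, not Oktoberfest's actual implementation.

```python
import math
from statistics import mean


def spectral_angle(predicted, observed):
    """Normalised spectral angle in [0, 1]; 1 means identical intensity patterns."""
    dot = sum(p * o for p, o in zip(predicted, observed))
    norm = math.sqrt(sum(p * p for p in predicted)) * math.sqrt(sum(o * o for o in observed))
    if norm == 0:
        return 0.0  # no signal in one of the vectors
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding errors
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


def best_nce(angles_by_nce):
    """Return the NCE whose PSMs reach the highest mean spectral angle (assumed aggregation)."""
    return max(angles_by_nce, key=lambda nce: mean(angles_by_nce[nce]))


# Made-up spectral angles for three candidate NCEs, two PSMs each.
example = {28: [0.80, 0.90], 30: [0.85, 0.95], 32: [0.70, 0.80]}
print(best_nce(example))
```

In a real calibration, each list would hold one spectral angle per selected PSM, computed from the prediction at that NCE.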

Note

Sequences with amino acid U or O are not supported. Modifications other than “M(ox)” are not supported.

Each C is treated as Cysteine with carbamidomethylation (fixed modification).
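To check a peptide list against these restrictions before starting a job, a small filter along the following lines may help. This is a sketch only: the function name `is_supported` is hypothetical, and it assumes modifications are written in parentheses directly after the residue, as in “M(ox)”.

```python
import re


def is_supported(sequence):
    """Return True if the sequence has no U/O and no modification other than M(ox)."""
    # Drop the one supported modification, keeping the methionine itself.
    stripped = sequence.replace("M(ox)", "M")
    # Anything left that is not an uppercase letter is an unsupported modification.
    if not re.fullmatch(r"[A-Z]+", stripped):
        return False
    # Selenocysteine (U) and pyrrolysine (O) are not supported.
    return "U" not in stripped and "O" not in stripped


peptides = ["PEPTIDEK", "PEPTM(ox)IDEK", "PEPUTIDEK", "PEPS(ph)TIDEK"]
print([p for p in peptides if is_supported(p)])
```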

Example config file:

{
    "type": "CollisionEnergyCalibration",
    "tag": "",
    "output": "./out",
    "inputs": {
        "search_results": "./msms.txt",
        "search_results_type": "Maxquant",
        "spectra": "./",
        "spectra_type": "raw",
        "instrument_type": "QE"
    },
    "models": {
        "intensity": "Prosit_2020_intensity_HCD",
        "irt": "Prosit_2019_irt"
    },
    "prediction_server": "koina.wilhelmlab.org:443",
    "numThreads": 1,
    "regressionMethod": "spline",
    "ssl": true,
    "thermoExe": "ThermoRawFileParser.exe",
    "massTolerance": 20,
    "unitMassTolerance": "ppm",
    "ce_alignment_options": {
        "ce_range": [19,50],
        "use_ransac_model": false
    }
}

The example config can be loaded and viewed using

import oktoberfest as ok
import json
config = ok.utils.example_configs.CECALIB
print(json.dumps(config, indent=4))  # pretty-print the config as JSON
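A config can also be adapted in Python and written to disk before a job is started. The sketch below uses a plain dictionary containing a few keys from the example config instead of importing Oktoberfest, so it is not a complete config. The helper name `write_config` and the search results path are placeholders chosen for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_config(config, path):
    """Write a config dictionary as JSON and return the file path (hypothetical helper)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as handle:
        json.dump(config, handle, indent=4)
    return path


# A few keys copied from the example config; a real config needs the remaining keys too.
config = {
    "type": "CollisionEnergyCalibration",
    "output": "./out",
    "inputs": {"search_results": "./msms.txt", "search_results_type": "Maxquant"},
}
config["inputs"]["search_results"] = "./my_search/msms.txt"  # placeholder path

config_path = write_config(config, Path(tempfile.mkdtemp()) / "config.json")
print(config_path.read_text())
```

The resulting file path can then be passed to run_job or to the --config_path option.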

B. Spectral Library Generation

This task generates a spectral library either by digesting a given FASTA file or by predicting a list of peptides given in a CSV file. You need to provide a collision energy (CE) for prediction (see above). Oktoberfest will:

1. Digest the FASTA file using the given protease and other parameters, and create a peptides.csv file from the result.
2. Predict all spectra at the given collision energy.

If a CSV file with peptides is provided, Oktoberfest skips the digestion step and predicts all spectra directly.
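To make the digestion step concrete, here is a minimal sketch of a full tryptic in-silico digest. It assumes cleavage after every K or R (ignoring the proline rule), allows a configurable number of missed cleavages, and filters by peptide length. It does not generate decoys and is not Oktoberfest's own digestion code; the function name `digest` and its parameters are hypothetical, with defaults mirroring missedCleavages, minLength and maxLength in the example config.

```python
def digest(protein, missed_cleavages=2, min_length=7, max_length=60, cleave_after="KR"):
    """Return sorted unique peptides from a simplified full enzymatic digest."""
    # Split the protein into fully cleaved fragments.
    fragments = []
    start = 0
    for i, residue in enumerate(protein):
        if residue in cleave_after:
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder

    # Join up to missed_cleavages + 1 neighbouring fragments into peptides.
    peptides = set()
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptide = "".join(fragments[i:j + 1])
            if min_length <= len(peptide) <= max_length:
                peptides.add(peptide)
    return sorted(peptides)


# Toy sequence; min_length is lowered so the short peptides are kept.
print(digest("AAAKGGGRCCC", missed_cleavages=1, min_length=1))
```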

Note

Sequences with amino acid U or O are not supported. Modifications other than “M(ox)” are not supported.

Each C is treated as Cysteine with carbamidomethylation (fixed modification).

Example config file:

{
    "type": "SpectralLibraryGeneration",
    "tag": "",
    "output": "./out",
    "inputs": {
        "library_input": "uniprot.fasta",
        "library_input_type": "fasta",
        "instrument_type": "QE"
    },
    "models": {
        "intensity": "Prosit_2020_intensity_HCD",
        "irt": "Prosit_2019_irt"
    },
    "spectralLibraryOptions": {
        "fragmentation": "HCD",
        "collisionEnergy": 30,
        "precursorCharge": [2,3],
        "minIntensity": 5e-4,
        "nrOx": 1,
        "batchsize": 10000,
        "format": "msp"
    },
    "fastaDigestOptions": {
        "digestion": "full",
        "missedCleavages": 2,
        "minLength": 7,
        "maxLength": 60,
        "enzyme": "trypsin",
        "specialAas": "KR",
        "db": "concat"
    },
    "prediction_server": "koina.wilhelmlab.org:443",
    "numThreads": 1,
    "ssl": true
}

The example config can be loaded and viewed using

import oktoberfest as ok
import json
config = ok.utils.example_configs.LIBGEN
print(json.dumps(config, indent=4))  # pretty-print the config as JSON
