Optimization of spectral network parameters
Dereplication is the process of using information about the chemical structures of previously characterized compounds to interpret mass spectra of compounds in an experimental sample. Identifying a peptidic natural product (PNP) from its known variants is called variable dereplication, and it relies on a data structure called a spectral network: a graph in which two spectra are connected if their similarity score exceeds a given threshold.
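As a minimal sketch of the thresholded graph described above, the following Python fragment builds a spectral network from a small set of spectra. The cosine score over fixed-length intensity vectors is only a stand-in assumption (practical spectral networks typically use alignment scores that tolerate mass shifts), and the names cosine_score, build_spectral_network, and the toy spectra are hypothetical.

```python
from itertools import combinations


def cosine_score(a, b):
    """Toy similarity (assumption): cosine of two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_spectral_network(spectra, threshold):
    """Return adjacency sets; two spectra are linked if their score exceeds threshold."""
    network = {name: set() for name in spectra}
    for u, v in combinations(spectra, 2):
        if cosine_score(spectra[u], spectra[v]) > threshold:
            network[u].add(v)
            network[v].add(u)
    return network


# Hypothetical binned spectra: s1 and s2 are near-identical, s3 is unrelated.
spectra = {
    "s1": [1.0, 0.0, 2.0, 0.0],
    "s2": [1.0, 0.1, 1.8, 0.0],
    "s3": [0.0, 3.0, 0.0, 1.0],
}
network = build_spectral_network(spectra, threshold=0.8)
print(network)
```

Raising the threshold removes edges and lowering it adds them, which is why the choice of threshold determines the shape of the network.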
In this work, we develop a method for optimizing this threshold. We map the spectral network onto the peptide network (the graph in which two PNPs are connected if they differ in a single amino acid) and then compute a similarity score between the two graphs using personalized random walks with restart.
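To illustrate the building block named above, here is a hedged sketch of a personalized random walk with restart on an undirected graph, computed by power iteration. It shows only the per-seed proximity vector; how such vectors are aggregated into a graph-to-graph similarity score is not specified in this text and is not attempted here. The function name rwr, the restart probability of 0.15, the handling of isolated nodes, and the toy peptide network are all assumptions.

```python
def rwr(adjacency, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that, at each step,
    jumps back to `seed` with probability `restart` and otherwise moves
    to a uniformly chosen neighbour of its current node."""
    nodes = list(adjacency)
    p = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(max_iter):
        nxt = {n: (restart if n == seed else 0.0) for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if neighbours:
                share = (1.0 - restart) * p[u] / len(neighbours)
                for v in neighbours:
                    nxt[v] += share
            else:
                # Assumption: a walker on an isolated node returns to the seed.
                nxt[seed] += (1.0 - restart) * p[u]
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical peptide network: a chain of PNPs, each differing from the
# next in a single amino acid.
peptide_network = {
    "pnp_a": {"pnp_b"},
    "pnp_b": {"pnp_a", "pnp_c"},
    "pnp_c": {"pnp_b"},
}
scores = rwr(peptide_network, seed="pnp_a")
print(scores)
```

Running the same walk from the same seed on the spectral network built at different thresholds would give proximity vectors that can be compared with those of the peptide network; the comparison rule itself would have to come from the method's full description.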