El-MAVEN | FAQs

General
Compound Database
Blank Samples
Peak Grouping
Curating Peaks
Isotope Detection
Untargeted Data Analysis
Baseline Calculation
Options Dialog

General
Which file formats are compatible with El-MAVEN?

● El-MAVEN supports .mzXML, .mzML, .mzroll, .mzPeaks and netCDF formats as inputs. To convert vendor-specific raw formats to one of these formats, use MSConvert.

● You can store El-MAVEN output as .csv, .pdf, .json, .png and .mzroll or push the data to Polly for analyzing, sharing or storing.
What settings should I use to convert my raw files?

El-MAVEN accepts .mzXML and .mzML files as input. To convert vendor-specific raw formats to one of these formats, use MSConvert.

The settings used in MSConvert should be as follows:

● Output Format: mzXML/mzML

● Binary encoding precision: according to system

● Write index: Checked

● TPP compatibility: Checked

● Use zlib compression: Unchecked

● Package in gzip: Unchecked
How can I reduce the data size input in El-MAVEN?

Centroiding the raw data during conversion reduces the data size.

Another way to reduce the size of data input in El-MAVEN is as follows:

● Add the "Threshold Peak Filter" to the list of filters in MSConvert

● Add "Value" as 10000 and convert your raw files

These settings reduce the converted file size significantly.

NOTE: "Value" settings may vary for different types of data
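
To make the effect of the threshold filter concrete, here is a minimal sketch of what such a filter does to a single scan. The function name, the (m/z, intensity) tuple layout and the "greater than or equal" comparison are assumptions made for illustration; this is not MSConvert's implementation.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity); hypothetical layout for this sketch


def threshold_filter(peaks: List[Peak], min_intensity: float = 10000.0) -> List[Peak]:
    """Keep only data points whose intensity is at least min_intensity."""
    return [(mz, intensity) for mz, intensity in peaks if intensity >= min_intensity]


# Made-up scan with four data points
scan = [(100.05, 250.0), (132.10, 48000.0), (175.12, 9999.0), (203.22, 10000.0)]
filtered = threshold_filter(scan)
print(len(scan), "points before,", len(filtered), "points after")
```

Every point dropped here is a point that never reaches the converted file, which is why the file size shrinks. It is also why the "Value" has to be chosen per data type: a threshold that is too high removes real low-intensity signal.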
Can El-MAVEN store my data?

Yes. You can save your work as a .mzroll project. Loading it back will restore the session. We also offer an integration to Polly, a cloud-based platform where you can store all your data as well as the parameters and peaks you curated.
Should I save the project file (.mzroll) every time I reload it?

No. The project file is updated automatically whenever you make changes in El-MAVEN, so it does not have to be saved again.
Which alignment method should I use in El-MAVEN? How are the three methods different?

OBI-warp is the latest addition to El-MAVEN's list of alignment methods. It uses dynamic time warping, which performs a global fit on the retention time drifts. It has empirically performed better than the other two algorithms and is the recommended method for alignment.

In case OBI-warp does not perform well for a particular data set, you can try:

● Poly Fit: The oldest alignment algorithm in El-MAVEN. It corrects retention time drift by fitting a polynomial function to the drift.

● Loess Fit: Fits the data piece by piece, which gives a closer fit to the data and hence works better than Poly Fit in most cases.
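
The idea behind a polynomial drift correction can be sketched with the simplest possible case, a degree-1 polynomial (a straight line). Everything below is an illustrative assumption: the anchor peaks, the function names and the choice of a linear model are not taken from El-MAVEN, whose Poly Fit may use a different degree and a different way of picking anchor peaks.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def align_rts(sample_rts: List[float], slope: float, intercept: float) -> List[float]:
    """Map retention times of the drifting sample onto the reference time scale."""
    return [slope * rt + intercept for rt in sample_rts]


# Hypothetical anchor peaks: RT in the drifting sample vs. RT in the reference (minutes)
observed = [1.0, 2.0, 3.0, 4.0]
reference = [1.1, 2.1, 3.1, 4.1]

slope, intercept = fit_line(observed, reference)
corrected = align_rts([2.5], slope, intercept)
print(corrected)
```

A single global function like this cannot follow drift that changes direction along the run, which is the situation where a piecewise method such as Loess Fit tends to fit more closely.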
How can I observe the total ion current for my data?

Click on the widget in the EIC widget bar to display the total ion current for your data.
Can I view the isotope distribution for my samples in El-MAVEN?

Yes. You can view the percentage of each isotope present in your sample by clicking on the isotope plot widget on the EIC widget bar.

Clicking this icon opens the isotope plot. Once you have uploaded your samples in El-MAVEN, you can hover over each sample to view its isotopic distribution.
What type of LC columns have El-MAVEN's algorithms been tested on?

El-MAVEN has been tested on HILIC columns. More importantly, El-MAVEN is neutral to the kind of column used, so data from any column can be processed through El-MAVEN.

Compound Database
What type of compound database does El-MAVEN use?

The compound database is built differently for different kinds of data. Some of the essential columns are the compound name and either the compound formula (necessary for labeled analysis) or the m/z. For targeted analysis, you can also add an RT column. You can also refer to these documents to understand more about the compound DBs for MS and MS-MS data.
I do not see my database file in El-MAVEN. Why?

The compound database needs to be specifically added to El-MAVEN. Given below are the steps to add a database to El-MAVEN:

● Step 1: Select the "Compounds" tab on the left panel

● Step 2: Select "Open" and browse to your compound database file

Blank Samples
Some of my samples are black in colour. Why is that?

If the sample file name contains the word "blank", El-MAVEN automatically marks the sample as a blank and sets its background colour to black. If these samples are not blanks, you can select them in the samples widget and click on the "Blank" icon to unmark them.
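
The naming rule above can be checked before loading files. The sketch below is only an approximation of that rule: the function name is hypothetical, and treating the match as case-insensitive is an assumption, since the text does not say how El-MAVEN handles case.

```python
from pathlib import Path


def looks_like_blank(sample_path: str) -> bool:
    """Return True if the file name (not the folder path) contains 'blank'.

    Case-insensitive matching is an assumption made for this sketch.
    """
    return "blank" in Path(sample_path).name.lower()


for path in ["runs/Blank_01.mzXML", "runs/sample_A.mzML", "runs/liver_blanked.mzML"]:
    print(path, "->", looks_like_blank(path))
```

The third file name shows how a real sample can be picked up by accident, which is the case where unmarking it with the "Blank" icon is needed.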
How to mark samples as blanks in El-MAVEN?

Samples can be marked as blanks by selecting the sample and clicking the "Blank" icon. The sample gets greyed out once it is marked as a blank.

Peak Grouping
My data is not getting grouped properly. What should I do?

Samples can fail to group correctly for a few reasons.

First, check the alignment of your samples in the EIC. Misalignment is commonly observed because of drift in retention time. If the samples seem to be misaligned, you should perform alignment through the following steps:

● Step 1: Click on “Align” in the top menu

● Step 2: Select an alignment algorithm (El-MAVEN provides Loess Fit, Poly Fit, and OBI-warp alignment methods)

● Step 3: Click on “Align” in the popup

If the grouping issue persists, you can adjust the peak grouping parameters in the “Options” menu.

In El-MAVEN, a peak grouping score is calculated to decide whether peaks should be grouped together or not. The “Peak Grouping” tab in the “Options” menu gives the equation and the parameters used to calculate the score. You can tweak these parameters to improve grouping of your data.

If the issue still persists, you can increase the EIC Smoothing Window in the “Peak Detection” tab of the “Options” menu. Smoothing the data points helps increase the signal-to-noise ratio. El-MAVEN offers three algorithms for EIC smoothing: Savitzky-Golay, Gaussian and Moving Average.

Check Documentation for details.
Why am I not getting enough groups in my data?

One possible reason is that the compound database is very small, so not all groups in the samples are detected.

Another possible reason is that the group filtering parameters are set too high and filter out even the good groups. To change these parameters, perform the following steps:

● Go to “Peaks” in the top menu

● Select “Group Filtering” tab

● Lower “Minimum Peak Intensity” values
What does the parameter "minimum peak width" signify?

Peak width is the number of scans that a peak is spread over. A group is filtered out if none of its peaks has a width above this threshold. Use this option to filter out spurious signals.
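
The filtering rule can be written down in a few lines. This is a sketch under assumptions: groups are represented as a plain mapping from a group name to the widths (in scans) of its peaks, and a width equal to the threshold is treated as passing. Neither detail is confirmed by El-MAVEN's documentation.

```python
from typing import Dict, List


def filter_groups_by_width(groups: Dict[str, List[int]], min_width: int) -> Dict[str, List[int]]:
    """Keep a group if at least one of its peaks spans min_width scans or more."""
    return {
        name: widths
        for name, widths in groups.items()
        if any(width >= min_width for width in widths)
    }


# Hypothetical groups: peak widths in scans, one value per sample
groups = {
    "glutamate": [7, 9, 8],
    "spike_noise": [1, 2, 1],
    "citrate": [2, 6, 3],
}
kept = filter_groups_by_width(groups, min_width=5)
print(sorted(kept))
```

Note that "citrate" survives because a single wide peak is enough, while "spike_noise" is dropped because all of its peaks are narrow.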
What does the parameter "minimum good peak/group" signify?

Minimum Good Peak/Group signifies the number of good peaks that must be present in a group for the group to be accepted as a good group.
How does El-MAVEN detect the best group in any m/z?

El-MAVEN does this on the basis of group rank. Group rank is calculated using intensity and quality score. Quality score, in turn, is calculated using 9 different metrics of a peak. You can adjust the score calculation from the "Group Rank" tab in the "Options" dialog.

Curating Peaks
How to automatically detect peaks in El-MAVEN?

● Step 1: Select “Peaks” on the top menu

● Step 2: Select "Automatic Features Detection" or "Compound Database Search"

● Step 3: Select "Find Peaks"

The peak table shows the list of groups detected. Automatic Curation of Peaks selects high-quality groups with high intensities.
How can we manually curate peaks in El-MAVEN?

● To curate manually using the compound DB widget, iterate over all the compounds in the compound DB.

● For each compound, El-MAVEN shows the highest ranked group for that m/z. You can then choose a group or reject it. There are two ways to do this.

● In the first workflow, double click on the peak group of your choice. This moves the RT line to the median of the group and also adds the metabolite to the bookmarks table.
● In the second workflow, "Shift"+Drag over the peak you want to add to the bookmarks table.
Why can’t I see the peaks in my data?

Peaks might not be displayed for a sample for several reasons.

● Conversion issue: The raw files from the experiment were not converted properly to the required .mzXML or .mzML format. Conversion using the 32-bit version of MSConvert sometimes causes issues. Check Documentation for the correct parameters to use during conversion.

● Incorrect ppm values: The ppm value may be much higher or lower than what was used in the experiment, so the peaks do not get detected. Select a proper ppm value in the top left menu, or in the Compound Database Search settings during peak detection.

● Incorrect polarity settings: Sometimes peaks are not detected because of incorrect polarity settings. This can be changed by selecting “Options” in the top menu and changing “Polarity/Ionization Mode” under the “Instrumentation” tab

● Unit mismatch: El-MAVEN works with the monoisotopic mass unit, whereas mass spectrometry machines give an output in the atomic mass unit. Due to this unit mismatch, we have to raise the ppm value to negate the mismatch and detect peaks.
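
Since the ppm setting comes up in two of the reasons above, it helps to see how a ppm tolerance translates into an m/z window. The sketch below uses the standard definition of parts per million; the function names are hypothetical and nothing here is El-MAVEN code.

```python
from typing import Tuple


def ppm_window(mz: float, ppm: float) -> Tuple[float, float]:
    """Return the (low, high) m/z bounds for a tolerance given in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def within_ppm(observed_mz: float, expected_mz: float, ppm: float) -> bool:
    """True if observed_mz lies inside the ppm window around expected_mz."""
    return abs(observed_mz - expected_mz) <= expected_mz * ppm / 1e6


low, high = ppm_window(500.0, 10.0)
print(f"10 ppm at m/z 500 spans {low:.4f} to {high:.4f}")
print(within_ppm(500.004, 500.0, 10.0), within_ppm(500.006, 500.0, 10.0))
```

Because the window scales with m/z, a ppm value that is too small misses peaks with a slight mass error, while one that is too large pulls in neighbouring signals.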
What parameters can I change to get good peaks for LC-MS/MS data?

LC-MS/MS data generally has lower peak intensities than LC-MS or GC-MS data, so intensity cutoffs in the range of 1000 to 10000 are typically appropriate. In addition, use model_QQQ instead of the generic model for better quality scores.
What parameters can I change to get good peaks for GC-MS data?

GC-MS data often requires a higher ppm range than LC-MS data. We therefore suggest using high ppm ranges (on the order of 100) to detect the correct peaks in GC-MS data.
What are good peaks?

A good peak can be defined to have the following properties:

● Gaussian Shape

● Perfect grouping

● Narrow Retention Time

● Good Sample Intensities

● Low Blank Intensities

● A similar trend between observable standards and other samples
What are bad peaks?

Bad Peaks can be defined to show the following properties:

● The Peaks do not have Gaussian Shape

● Peaks are not grouping well

● Standard samples have intensities very high as compared to other samples

● The Samples show intensities lower or roughly equal to the blank samples indicating noisy peaks

● Peaks show very low intensity
What does the background color in the bookmark/peak table behind the good/bad marked peaks signify?

This feature uses the group quality score to indicate whether a group has been correctly marked ‘good’ or ‘bad’. The darker the shade of red, the worse the curation, and the more the group should be considered a bad peak.

For example, a peak curated for UTP that is shown with a dark red background is a bad group and should ideally be marked as bad.

Isotope Detection
Why are isotopologues/labels not detected in my data?

Isotopologues/labels may not be detected in your data for the following reasons:

● Report Isotopic Peaks is disabled: If Report Isotopic Peaks is disabled, during peak detection, the labels are not detected. Make sure the “Report Isotopic Peaks” field is selected after you have selected "Peaks" on the top menu

● High Peak Filtering Settings: If minimum signal baseline difference and minimum peak quality are too high, the labels are not detected. This can be changed by going to “Options” in the top menu, selecting the “Peak Filtering” tab and tweaking the parameters.

● Select Labels: Labels are not detected if the isotopic labels are not selected. Go to “Options”, select “Isotope Detection” and select the labels you want to detect in your samples.

● High Isotope-Parent Peak Correlation value: This parameter sets the minimum threshold for isotope-parent peak correlation, which is a measure of how often the two appear together. If the threshold is set too high, the labels are not detected. To change this,

1. Go to “Options”

2. Select “Isotope Detection” tab

3. Change the “Minimum Isotope-Parent Peak Correlation” parameter

● Narrow scan range: If the number of scans of the parent within which the isotope has to be detected is set very low, the labels do not get detected. To change this,

1. Go to “Options”

2. Select “Isotope Detection” tab

3. Change the “Isotope is within [X] scans of parent” parameter

Untargeted Data Analysis
Can we perform untargeted data analysis in El-MAVEN? How?

Yes. Untargeted data analysis can be done in El-MAVEN through automated peak detection. For Untargeted Analysis perform the following steps:

● Step 1: Upload samples

● Step 2: Go to "Peaks" on the top menu

● Step 3: Select the tab "Feature Detection Selection"

● Step 4: Check "Automated Feature Detection" and set the parameters as required.

● Step 5: Click on "Find Peaks"

Baseline Calculation
How is the baseline calculated in El-MAVEN?

If you set “Drop top x% intensities from chromatogram” to 80%:

● El-MAVEN will sort the intensity vector from the lowest to the highest intensity

● The intensity at the (100 - 80) = 20% mark of the sorted vector will be set as the baseline cut-off value

● Baseline intensity is the same as signal intensity for all values below the cut-off. The baseline for higher signal intensity is set at the cut-off value.

● Baseline smoothing is done using a Gaussian smoothing algorithm.

Basically, the top x% of the intensity points are dropped. If your data has a high noise level, set this parameter to a lower percentage.
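
The steps above can be sketched as a short function. This is a minimal illustration, not El-MAVEN's code: the function name is hypothetical, the exact rule for picking the index of the cut-off in the sorted vector is an assumption, and the final Gaussian smoothing step is left out to keep the example short.

```python
from typing import List


def estimate_baseline(intensities: List[float], drop_top_percent: float = 80.0) -> List[float]:
    """Clip a chromatogram at the intensity found at the (100 - x)% mark."""
    if not intensities:
        return []
    ordered = sorted(intensities)  # lowest to highest
    # Position of the (100 - x)% mark; the rounding rule is an assumption.
    index = int(len(ordered) * (100.0 - drop_top_percent) / 100.0)
    index = min(index, len(ordered) - 1)
    cutoff = ordered[index]
    # Below the cut-off the baseline follows the signal, above it the baseline is the cut-off.
    return [min(value, cutoff) for value in intensities]


# Made-up chromatogram with ten intensity points
eic = [5.0, 1.0, 9.0, 3.0, 7.0, 2.0, 8.0, 4.0, 10.0, 6.0]
baseline = estimate_baseline(eic, drop_top_percent=80.0)
print(baseline)
```

With ten points and x = 80, the cut-off is taken from position 2 of the sorted vector, so only the two lowest points keep their own intensity and everything else is flattened to the cut-off.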

Options Dialog
Do settings in "Options" dialog impact automated peak detection?

Yes. All the parameters within “Options” are global parameters and affect every workflow. You can read more about the parameters in our detailed UI documentation.
What is the default Ionization Mode in El-MAVEN? Can I change it?

El-MAVEN auto-detects the Ionization Mode of your data by default. However, you can select it manually by:

● Going to "Options" in the top menu

● Selecting “Instrumentation” tab

● Selecting the correct “Ionization Mode” from the drop-down
What is the default Ionization Type in El-MAVEN? Can I change it?

The default Ionization Type in El-MAVEN is ESI (Electrospray Ionization). This can be changed as follows:

● Go to "Options" in the top menu

● Select the “Instrumentation” tab

● Select the correct “Ionization Type” from the drop-down
Can I select a particular MS level in El-MAVEN?

Yes. To select an MS level filter:

● Go to "Options" in the top menu

● Go to the "File Import" tab

● Select "Scan Filter MS Level" from the drop-down
How do I process polarity switching data in El-MAVEN?

El-MAVEN allows a user to extract only positive or negative scans from mzXML files. This can be done before uploading the data by the following steps:

● Go to "Options" in the top menu

● Click on "File Import"

● Click on "Scan Filter Polarity" and select the polarity you want to process
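
Conceptually, the scan filters act like a simple predicate applied to each scan while the file is read. The sketch below illustrates that idea with a hypothetical `Scan` record and function name; it does not reflect El-MAVEN's internal data structures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scan:
    number: int
    ms_level: int
    polarity: str  # "+" or "-"; hypothetical encoding for this sketch


def filter_scans(
    scans: List[Scan],
    polarity: Optional[str] = None,
    ms_level: Optional[int] = None,
) -> List[Scan]:
    """Keep scans matching the requested polarity and MS level (None means no filter)."""
    return [
        scan
        for scan in scans
        if (polarity is None or scan.polarity == polarity)
        and (ms_level is None or scan.ms_level == ms_level)
    ]


# Made-up polarity switching run: polarity alternates from scan to scan
run = [Scan(1, 1, "+"), Scan(2, 1, "-"), Scan(3, 2, "+"), Scan(4, 1, "+")]
positive_ms1 = filter_scans(run, polarity="+", ms_level=1)
print([scan.number for scan in positive_ms1])
```

Since the filter is applied on import, it has to be set before the samples are loaded, as noted above.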
Can I remove low abundance peaks in El-MAVEN?

Yes. Low abundance peaks can be removed in El-MAVEN in two ways:

1. Selecting Minimum Intensity Filter in El-MAVEN

The Minimum Intensity Filter in El-MAVEN keeps only the peaks above the selected minimum intensity. Setting a high minimum intensity removes all the peaks below that intensity and hence removes the low abundance peaks. This can be done as follows,

● Go to "Options" in the top menu

● Select the "File Import" tab

● Set the "Scan Filter: Minimum Intensity" field

2. Centroiding in El-MAVEN

Selecting "Centroid Scan" in the "File Import" tab of the "Options" dialog also reduces the amount of data.

● Go to "Options"

● Click on "File Import"

● Click on "Centroid Scan"
What does the "EIC Smoothing Algorithm" signify?

The "EIC Smoothing Algorithm" parameter selects the algorithm used to smooth the data points of the EIC, which helps increase the signal-to-noise ratio.

El-MAVEN offers three smoothing algorithms:

1. Savitzky-Golay: It preserves the original shape and features of the signal better than most other filters

2. Gaussian: It reduces noise by averaging over the neighborhood of each point, with the central point having a higher weight, while still preserving sharp edges

3. Moving Average: It takes the simple average of all points within the window. The resulting signal does not look natural, so this is the least preferred method for smoothing

El-MAVEN has Savitzky-Golay as its default EIC Smoothing Algorithm. You can select any other algorithm by the following steps:

● Go to "Options" in the top menu

● Click on "Peak detection" tab

● Select the algorithm from the drop-down
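
To show how two of these smoothers differ, here is a small pure-Python sketch of a moving average and a Gaussian-weighted average over a window of scans. The function names, the handling of the signal edges and the default sigma are assumptions made for this illustration; Savitzky-Golay is omitted because it needs a polynomial fit per window.

```python
import math
from typing import List, Optional


def moving_average(signal: List[float], window: int) -> List[float]:
    """Replace each point by the plain mean of the points in its window."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def gaussian_smooth(signal: List[float], window: int, sigma: Optional[float] = None) -> List[float]:
    """Weighted mean per window, with weights falling off from the central point."""
    half = window // 2
    if sigma is None:
        sigma = max(half / 2.0, 1e-9)  # assumed default, chosen for this sketch
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        weights = [math.exp(-0.5 * ((j - i) / sigma) ** 2) for j in range(lo, hi)]
        weighted_sum = sum(w * signal[j] for w, j in zip(weights, range(lo, hi)))
        smoothed.append(weighted_sum / sum(weights))
    return smoothed


# A single sharp spike in an otherwise flat trace
trace = [0.0, 0.0, 10.0, 0.0, 0.0]
ma = moving_average(trace, window=3)
ga = gaussian_smooth(trace, window=3)
print([round(v, 2) for v in ma])
print([round(v, 2) for v in ga])
```

In this example the moving average flattens the spike more than the Gaussian version does, because every point in the window counts equally. That is one way to see why the moving average is described as the least preferred option.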
What does the "EIC Smoothing Window" signify?

The "EIC Smoothing Window" sets the number of scans over which the smoothing algorithm is applied.

This parameter can be changed as follows:

● Go to "Options" in the top menu

● Select the "Peak Detection" Tab

● Select the value for "EIC Smoothing Window"
How does "Max Retention Time Difference Between Peaks" affect my data?

This sets a limit on the RT difference between peaks in a group. Increasing this value when alignment fails will center the peaks satisfactorily.

This value can be changed by the following steps:

● Go to "Options" in the top menu

● Select the "Peak Detection" Tab

● Select the value for "Max Retention Time Difference Between Peaks"
What does the "Minimum Signal Baseline Difference" parameter signify?

The "Minimum Signal Baseline Difference" parameter sets the minimum difference between intensity and baseline that is required for a signal to be treated as valid. To change this value,

● Go to "Options" in the top menu

● Select the "Peak Filtering" tab

● Select the "Minimum Signal Baseline Difference" value
How do the parameters in the Isotope Detection tab affect data?

Samples are Labeled?

● Check the labels that are present in the samples or that have to be detected.

The parameters in Filter Isotopic Peaks section are as follows:

● Minimum Isotope-Parent Correlation- Sets the minimum threshold for isotope-parent peak correlation. This correlation is a measure of how often they appear together.

● Isotope is within [X] scans of parent- Sets the maximum scan difference between isotopic and parent peaks. This is a measure of how closely they appear together on the RT scale.

● Maximum % Error to Natural Abundance- Sets the maximum natural abundance error expected. Natural abundance of an isotope is the expected ratio of the amount of isotope over the amount of parent molecule in nature. The error is the difference between observed and natural abundance as a fraction of natural abundance.

● Correct for Natural C13 Isotope Abundance- Check the box to correct for natural C13 abundance.

To change these parameters,

● Go to "Options" in the top menu

● Select "Isotope Detection" tab

● Change the parameters
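
The natural abundance error described above is a relative difference, which a short calculation makes concrete. The function name and the example ratios are hypothetical, and taking the absolute value of the difference is an assumption, since the text only says "difference".

```python
def abundance_error_percent(observed_ratio: float, natural_ratio: float) -> float:
    """Difference between observed and natural abundance as a percentage of natural abundance."""
    if natural_ratio <= 0:
        raise ValueError("natural_ratio must be positive")
    return abs(observed_ratio - natural_ratio) / natural_ratio * 100.0


# Made-up isotope/parent ratios
error = abundance_error_percent(observed_ratio=0.06, natural_ratio=0.05)
print(f"error relative to natural abundance: {error:.1f}%")
```

An isotopic peak would then be compared against the "Maximum % Error to Natural Abundance" setting using a value like this one.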
What do the parameters in Peak Grouping tab signify?

The Peak Grouping tab has the parameters that calculate the Peak Grouping Score which determines if the peaks should be grouped together or not.

The score depends on the following 3 parameters and their weights:

a) RT difference or DistX- Difference in RT between the peaks under comparison. Closer peaks are assigned a higher score.

b) Intensity difference or DistY- Difference in intensity between peaks under comparison. Smaller difference accounts for a higher score.

c) Overlap- Fraction of RT overlap between the peaks under comparison. Greater overlap accounts for a higher score.

● Consider Overlap- Uncheck this box to calculate grouping score without overlap.

● Sliders are provided to adjust the weights attached to each of the three parameters.

To change the "Peak Grouping" parameters,

● Go to "Options" in the top menu

● Click on "Peak Grouping" tab

● Change the parameters
What do the parameters in Group Ranking tab signify?

Group rank is one of the parameters for group filtering. Peaks are ranked according to their quality.

The score depends on the following 3 parameters and their respective weights A, B and C:

i) Q or Group Quality- Maximum peak quality of a group. Peaks are assigned a quality score by a machine learning algorithm in El-MAVEN. Better quality leads to a higher rank.

ii) I or Group Intensity- Maximum intensity of a group. Better intensity leads to a higher rank.

iii) dRT or RT difference- Difference between expected RT and group mean RT.

● Consider Retention Time- Check the box to use retention time while group rank calculation.

● Quality Weight- Adjust the slider to set weight for group quality in group rank calculation.

● Intensity Weight- Adjust the slider to set weight for group intensity in group rank calculation.

● dRT Weight- Adjust the slider to set weight for RT difference in group rank calculation. The slider is disabled if Consider Retention Time is unchecked.

To change the group rank parameters,

● Go to "Options" in the top menu

● Click on "Group Rank" tab

● Change the parameters
