El-Maven | FAQs
  • General

  • Which file formats are compatible with El-MAVEN?

    ● El-MAVEN supports .mzXML, .mzML, .mzroll, .mzPeaks and netCDF formats as inputs. To convert instrument-specific raw files to one of these formats, use MSConvert.

    ● You can store El-MAVEN output as .csv, .pdf, .json, .png and .mzroll or push the data to Polly for analyzing, sharing or storing.

  • What settings should I use to convert my raw files?

    El-MAVEN supports the .mzXML and .mzML formats as input. To convert instrument-specific raw files to one of these formats, use MSConvert.

    The settings used in MSConvert should be as follows:

    ● Output Format: mzXML/mzML

    ● Binary encoding precision: according to system

    ● Write index: Checked

    ● TPP compatibility: Checked

    ● Use zlib compression: Unchecked

    ● Package in gzip: Unchecked
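
    If you convert many files, the same settings can be approximated with the msconvert command line tool from ProteoWizard. The sketch below only builds the argument list. The helper name `build_msconvert_command` and the file names are hypothetical, and the "TPP compatibility" checkbox is not represented here. By default msconvert writes an index and applies no zlib or gzip compression, so those options need no extra flags.

```python
from typing import List

def build_msconvert_command(raw_file: str, out_dir: str,
                            fmt: str = "mzXML", precision: int = 64) -> List[str]:
    """Return an msconvert argument list mirroring the settings listed above."""
    if fmt not in ("mzXML", "mzML"):
        raise ValueError("El-MAVEN expects mzXML or mzML output")
    if precision not in (32, 64):
        raise ValueError("precision must be 32 or 64")
    return [
        "msconvert", raw_file,
        "--" + fmt,           # output format
        "--%d" % precision,   # binary encoding precision
        "-o", out_dir,        # output directory
        # No --noindex, --zlib or --gzip: the index is written and
        # compression stays off, as in the settings listed above.
    ]

cmd = build_msconvert_command("sample01.raw", "converted")
print(" ".join(cmd))
```

    On a machine where msconvert is on the PATH, the list can be passed to `subprocess.run(cmd, check=True)`.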

  • How can I reduce the data size input in El-MAVEN?

    Centroiding the data during raw file conversion reduces the data size.

    Another way to reduce the size of the data input to El-MAVEN is as follows:

    ● Add the "Threshold Peak Filter"

    ● Set "Value" to 10000 and convert your raw files

    These settings reduce the converted file size significantly.

    NOTE: "Value" settings may vary for different types of data
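
    To see why a threshold shrinks the converted files, the sketch below applies an absolute intensity cut-off to a single scan. It assumes the filter keeps peaks at or above "Value"; the function name and the example scan are hypothetical.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)

def apply_intensity_threshold(peaks: List[Peak], value: float = 10000.0) -> List[Peak]:
    """Keep only the peaks whose intensity is at least `value`."""
    return [(mz, intensity) for mz, intensity in peaks if intensity >= value]

# Made-up scan for illustration only.
scan = [(85.03, 1200.0), (133.01, 54000.0), (146.05, 9800.0), (175.02, 310000.0)]
kept = apply_intensity_threshold(scan)
print(len(scan), "peaks before,", len(kept), "peaks after")
```

    Every peak that is dropped is one less m/z and intensity pair written to the converted file.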

  • Can El-MAVEN store my data?

    Yes. You can save your work as a .mzroll project. Loading it back restores the session. We also offer an integration with Polly, a cloud-based platform where you can store all your data as well as the parameters and peaks you curated.

  • Do I need to save the project file (.mzroll) again every time I reload it?

    No. The project file is updated automatically whenever you make changes in El-MAVEN. It does not have to be saved again.

  • Which alignment method should I use in El-MAVEN? How are the three methods different?

    OBI-warp is the latest addition to El-MAVEN's list of alignment methods. It uses dynamic time warping, which is a global fit on the retention time drifts. It has empirically performed better than the other two algorithms and is the recommended method for alignment.

    In case OBI-warp does not perform well for a particular data set, you can try:

    ● Poly Fit: the oldest alignment algorithm in El-MAVEN. It corrects retention time drift by fitting a polynomial function to the drift.

    ● Loess Fit: fits the data piece by piece, which follows the data more closely and hence works better than Poly Fit in most cases.
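
    The idea behind a polynomial fit can be sketched with the simplest case, a straight line (a degree-1 polynomial). This is an illustration of the general technique only, not El-MAVEN's implementation; the anchor retention times are made up.

```python
from typing import List, Tuple

def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept (a degree-1 polynomial)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical anchor peaks: RT in one sample vs. RT in a reference sample (minutes).
sample_rt = [1.0, 2.0, 3.0, 4.0]
reference_rt = [1.1, 2.1, 3.1, 4.1]  # constant 0.1 min drift

slope, intercept = fit_line(sample_rt, reference_rt)
aligned_rt = [slope * rt + intercept for rt in sample_rt]
print([round(rt, 3) for rt in aligned_rt])
```

    A higher-degree polynomial or a piecewise (loess-style) fit follows the same pattern: learn a mapping from observed to reference RT, then apply it to every scan of the sample.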

  • How can I observe the total ion current for my data?

    Click on the total ion current icon in the EIC widget bar to display the total ion current for your data.

  • Can I view the isotope distribution for my samples in El-MAVEN?

    Yes. You can view the percentage of each isotope present in your sample by clicking on the isotope plot widget on the EIC widget bar.

    After you have uploaded your samples in El-MAVEN, clicking this icon opens the isotope plot, where you can hover over each sample to view its isotopic distribution.

  • What types of LC columns have the El-MAVEN algorithms been tested on?

    El-MAVEN has been tested on HILIC columns. More importantly, El-MAVEN is neutral to the kind of column used, so data acquired on any column can be processed through El-MAVEN.

  • Compound Database

  • What type of compound database does El-MAVEN use?

    The compound database is built differently for different kinds of data. The essential columns include the compound name and either the compound formula (necessary for labeled analysis) or the m/z. For targeted analysis, you can also add an RT column. Refer to the documentation to learn more about compound databases for MS and MS/MS data.
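
    As a rough illustration, a small compound database with a name, a formula and an RT column could be written as a CSV file like this. The header names `compound`, `formula` and `rt` are assumptions made for this sketch, and the RT values are made up; check the compound database documentation for the exact headers El-MAVEN expects.

```python
import csv
import io

# Header names are assumptions for illustration; RT values are made up.
rows = [
    {"compound": "Glucose", "formula": "C6H12O6", "rt": 7.2},
    {"compound": "Citric acid", "formula": "C6H8O7", "rt": 9.5},
]

def to_csv(records, columns):
    """Serialise the records to CSV text with the given column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

database_csv = to_csv(rows, ["compound", "formula", "rt"])
print(database_csv)
```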

  • I do not see my database file in El-MAVEN. Why?

    The compound database needs to be specifically added to El-MAVEN. Given below are the steps to add a database to El-MAVEN:

    ● Step 1: Select Compounds tab on the left panel

    ● Step 2: Select Open and browse to your compound database file

  • Blank Samples

  • Some of my samples are black in colour. Why is that?

    If a sample file name contains the word "blank", El-MAVEN automatically marks the sample as a blank and sets its background colour to black. If these samples are not blanks, you can select them in the samples widget and click on the "Blank" icon to unmark them.

  • How do I mark samples as blanks in El-MAVEN?

    Samples can be marked as blanks by selecting the sample and clicking the "Blank" icon. The sample gets greyed out once it is marked as a blank.

  • Peak Grouping

  • My data is not getting grouped properly. What should I do?

    Peaks may not get grouped correctly for a couple of reasons.

    First, check the alignment of your samples in the EIC. Misalignment is common because of drift in retention time. If the samples seem to be misaligned, perform alignment through the following steps:

    ● Step 1: Click on “Align” in the top menu

    ● Step 2: Select an alignment algorithm (El-MAVEN provides Loess Fit, Poly Fit, and OBI-warp alignment methods)

    ● Step 3: Click on “Align” in the popup

    If the grouping issue persists, you can adjust the peak grouping parameters in the “Options” menu.

    In El-MAVEN, a peak grouping score is calculated to decide whether peaks should be grouped together. The “Peak Grouping” tab in the “Options” menu gives the equation and the parameters used to calculate the score. You can tweak these parameters to improve the grouping of your data.

    If the issue still persists, you can increase the EIC Smoothing Window in the “Peak Detection” tab of the “Options” menu. Smoothing the data points helps increase the signal-to-noise ratio. El-MAVEN offers three EIC smoothing algorithms: Savitzky-Golay, Gaussian and Moving Average.

    Check the documentation for details.

  • Why am I not getting enough groups in my data?

    One possible reason is that the compound database is very small, so not all groups in your samples are detected.

    Another possible reason is that the group filtering parameters are set so high that even good groups are filtered out. To change these parameters, perform the following steps:

    ● Go to “Peaks” in the top menu

    ● Select the “Group Filtering” tab

    ● Lower the “Minimum Peak Intensity” value

  • What does the parameter "minimum peak width" signify?

    Peak width is equal to the number of scans that a peak is spread over. Groups with no peak widths above this threshold are filtered out. Spurious signals can be filtered out using this option.
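
    The filter described above can be sketched as follows. The data layout (a group name mapped to the widths of its peaks, in scans) is hypothetical, and "above" is read here as strictly greater than the threshold; the actual boundary handling in El-MAVEN may differ.

```python
from typing import Dict, List

def filter_groups_by_peak_width(groups: Dict[str, List[int]],
                                min_width: int) -> Dict[str, List[int]]:
    """Keep groups in which at least one peak is wider than `min_width` scans."""
    return {
        name: widths
        for name, widths in groups.items()
        if any(width > min_width for width in widths)
    }

# Made-up peak widths (number of scans) for three groups.
groups = {"group_1": [3, 4, 2], "group_2": [8, 2, 1], "group_3": [5, 5]}
kept_groups = filter_groups_by_peak_width(groups, min_width=4)
print(sorted(kept_groups))
```

    Here group_1 is dropped because none of its peaks spans more than 4 scans, which is the kind of narrow, spurious signal this parameter is meant to remove.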

  • What does the parameter "minimum good peak/group" signify?

    Minimum Good Peak/Group signifies the minimum number of good peaks that must be present in a group for it to be accepted as a good group.

  • How does El-MAVEN detect the best group for any m/z?

    El-MAVEN does this on the basis of group rank. Group rank is calculated using intensity and quality score. Quality score, in turn, is calculated using 9 different metrics of a peak. You can adjust the score calculation from the "Group Rank" tab in the "Options" dialog.

  • Curating Peaks

  • How to automatically detect peaks in El-MAVEN?

    ● Step 1: Select “Peaks” on the top menu

    ● Step 2: Select "Automatic Features Detection" or "Compound Database Search"

    ● Step 3: Select "Find Peaks"

    The peak table shows the list of groups detected. Automatic Curation of Peaks selects high-quality groups with high intensities.

  • How can I manually curate peaks in El-MAVEN?

    ● To curate manually using the compound DB widget, iterate over all the compounds in the compound DB.

    ● Once a compound is selected, El-MAVEN shows the highest-ranked group for that m/z. You can now choose a group or reject it. There are two ways to do this.

    ● In the first workflow, double-click on the peak group of your choice. This moves the RT line to the median of the group and also adds the metabolite to the bookmarks table.

    ● In the second workflow, "Shift"+drag over the peak you want to add to the bookmarks table.

  • Why can’t I see the peaks in my data?

    Peaks might not be displayed for a sample for several reasons.

    ● Conversion issue: The raw files from the experiment were not converted properly to the required .mzXML or .mzML format. Conversion with the 32-bit version of MSConvert sometimes causes problems. Check the documentation for the correct conversion parameters.

    ● Incorrect ppm values: The ppm value may be much higher or lower than what was used in the experiment, which prevents the peaks from being detected. Select a proper ppm value in the top left menu or during peak detection through Compound Database Search.

    ● Incorrect polarity settings: Sometimes peaks are not detected because of incorrect polarity settings. This can be changed by selecting “Options” in the top menu and changing “Polarity/Ionization Mode” under the “Instrumentation” tab.

    ● Unit mismatch: ElMaven works with the monoisotopic mass unit, whereas mass spectrometry machines give an output in the atomic mass unit. Due to this unit mismatch, we have to raise the ppm value to negate the mismatch and detect peaks.
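
    To judge whether a ppm value is reasonable, it helps to convert it into an absolute m/z window: the tolerance equals m/z × ppm / 1,000,000 on either side of the expected m/z. A minimal sketch (the function names are illustrative, not part of El-MAVEN):

```python
from typing import Tuple

def ppm_window(mz: float, ppm: float) -> Tuple[float, float]:
    """Return the (low, high) m/z bounds for a tolerance given in ppm."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta

def within_ppm(observed_mz: float, expected_mz: float, ppm: float) -> bool:
    """True if the observed m/z falls inside the ppm window of the expected m/z."""
    low, high = ppm_window(expected_mz, ppm)
    return low <= observed_mz <= high

low, high = ppm_window(500.0, 10.0)
print(round(low, 4), round(high, 4))
print(within_ppm(500.004, 500.0, 10.0), within_ppm(500.02, 500.0, 10.0))
```

    At m/z 500, a 10 ppm setting corresponds to a window of only ±0.005, so a peak that is off by 0.02 is missed unless the ppm value is raised.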

  • What parameters can I change to get good peaks for LC-MS/MS data?

    LC-MS/MS data generally has lower peak intensities than LC-MS or GC-MS data, so intensity cutoffs in the range of 1000 to 10000 are typically appropriate. In addition, use model_QQQ instead of the generic model for better quality scores.

  • What parameters can I change to get good peaks for GC-MS data?

    GC-MS data often requires a higher ppm range than LC-MS data. We therefore suggest using high ppm ranges (on the order of 100) to detect the correct peaks in GC-MS data.

  • What are good peaks?

    A good peak can be defined to have the following properties:

    ● Gaussian Shape

    ● Perfect grouping

    ● Narrow Retention Time

    ● Good Sample Intensities

    ● Low Blank Intensities

    ● A similar trend between observable standards and other samples

  • What are bad peaks?

    Bad Peaks can be defined to show the following properties:

    ● The Peaks do not have Gaussian Shape

    ● Peaks are not grouping well

    ● Standard samples have intensities very high as compared to other samples

    ● The Samples show intensities lower or roughly equal to the blank samples indicating noisy peaks

    ● Peaks show very low intensity

  • What does the background color in the bookmark/peak table behind the good/bad marked peaks signify?

    This feature uses the group quality score to indicate whether a group has been correctly marked ‘good’ or ‘bad’. The darker the shade of red, the worse the curation, and the more likely the group should be considered a bad peak.

    In the example below, the peak curated for UTP is a bad group and should ideally be marked as bad.

    Bookmark Peak Table
  • Isotope Detection

  • Why are isotopologues/labels not detected in my data?

    Isotopologues/labels may not be detected for the following reasons:

    ● Report Isotopic Peaks is disabled: If “Report Isotopic Peaks” is disabled during peak detection, the labels are not detected. Make sure the “Report Isotopic Peaks” field is selected after you have selected "Peaks" on the top menu.

    ● High Peak Filtering Settings: If the minimum signal baseline difference and minimum peak quality are too high, the labels are not detected. This can be changed by going to “Options” in the top menu, selecting the “Peak Filtering” tab and tweaking the parameters.

    ● Select Labels: Labels are not detected if the isotopic labels are not selected. Go to “Options”, select “Isotope Detection” and select the labels you want to detect in your samples.

    ● High Isotope-Parent Peak Correlation value: If the minimum threshold for isotope-parent peak correlation is set too high, the labels are not detected. This correlation is a measure of how often they appear together. To change this,

    1. Go to “Options”

    2. Select the “Isotope Detection” tab

    3. Change the “Minimum Isotope-Parent Peak Correlation” parameter

    ● Narrow scan range: If the number of scans of the parent within which the isotope has to be detected is set very low, the labels do not get detected. To change this,

    1. Go to “Options”

    2. Select the “Isotope Detection” tab

    3. Change the “Isotope is within [X] scans of parent” parameter

  • Untargeted Data Analysis

  • Can we perform untargeted data analysis in El-MAVEN? How?

    Yes. Untargeted data analysis can be done in El-MAVEN through automated peak detection. For Untargeted Analysis perform the following steps:

    ● Step 1: Upload samples

    ● Step 2: Go to "Peaks" on the top menu

    ● Step 3: Select the "Feature Detection Selection" tab

    ● Step 4: Check "Automated Feature Detection". Set the parameters here as required.

    ● Step 5: Click on "Find Peaks"

  • Baseline Calculation

  • How is the baseline calculated in El-MAVEN?

    If you set “Drop top x% intensities from chromatogram” to 80%:

    ● El-MAVEN will sort the intensity vector from lowest to the highest intensity

    ● Intensity at the (100-80) = 20% mark will be set as the baseline cut-off value

    ● Baseline intensity is the same as signal intensity for all values below the cut-off. The baseline for higher signal intensity is set at the cut-off value.

    ● Baseline smoothing is done using Gaussian smoothing algorithm.

    Basically, the top x% of the intensity points are dropped. If your data has a high noise level, set this parameter to a lower percentage.
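
    The cut-off and clamping steps described above can be sketched as follows. This is a simplified illustration, not El-MAVEN's code: the way the (100 - x)% mark is rounded to an index is an assumption, the example chromatogram is made up, and the final Gaussian smoothing of the baseline is left out.

```python
from typing import List

def baseline_drop_top(intensities: List[float], drop_top_percent: float) -> List[float]:
    """Clamp the signal at the cut-off left after dropping the top x% of points."""
    if not intensities:
        return []
    ordered = sorted(intensities)  # lowest to highest intensity
    keep_fraction = (100.0 - drop_top_percent) / 100.0
    # Index of the (100 - x)% mark in the sorted vector, kept inside bounds.
    index = min(len(ordered) - 1, max(0, int(keep_fraction * len(ordered))))
    cutoff = ordered[index]
    # Below the cut-off the baseline follows the signal; above it, it stays at the cut-off.
    return [min(value, cutoff) for value in intensities]

# Made-up chromatogram with one peak in the middle.
eic = [5.0, 7.0, 6.0, 40.0, 90.0, 45.0, 8.0, 6.0, 5.0, 7.0]
baseline = baseline_drop_top(eic, 80.0)
print(baseline)
```

    With 80% dropped, the cut-off lands on a low intensity value, so the peak in the middle of the trace does not lift the baseline.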

  • Options Dialog

  • Do settings in "Options" dialog impact automated peak detection?

    Yes. All the parameters within “Options” are global parameters and affect every workflow. You can read more about the parameters in our detailed UI documentation.

  • What is the default Ionization Mode in El-MAVEN? Can I change it?

    El-MAVEN auto-detects the Ionization Mode of your data by default. However, you can select it manually by:

    ● Going to "Options" in the top menu

    ● Selecting the “Instrumentation” tab

    ● Selecting the correct “Ionization Mode” from the drop-down

  • What is the default Ionization Type in El-MAVEN? Can I change it?

    The default Ionization Type in El-MAVEN is ESI (Electrospray Ionization). This can be changed as follows:

    ● Go to "Options" in the top menu

    ● Select the “Instrumentation” tab

    ● Select the correct “Ionization Type” from the drop-down

  • Can I select a particular MS level in El-MAVEN?

    Yes. To select an MS level filter:

    ● Go to "Options" in the top menu

    ● Go to the "File Import" tab

    ● Select "Scan Filter MS Level" from the drop-down

  • How do I process polarity-switching data in El-MAVEN?

    El-MAVEN allows you to extract only the positive or only the negative scans from mzXML files. This can be done before uploading the data by the following steps:

    ● Go to "Options" in the top menu

    ● Click on "File Import"

    ● Click on "Scan Filter Polarity" and select the polarity you want to process

  • Can I remove low abundance peaks in El-MAVEN?

    Low abundance peaks can be removed in El-MAVEN in two ways:

    1. Selecting Minimum Intensity Filter in El-MAVEN

    The Minimum Intensity Filter keeps only the peaks above the selected minimum intensity. Selecting a high minimum intensity removes all the peaks below that intensity and hence removes the low abundance peaks. This can be done as follows:

    ● Go to "Options" in the top menu

    ● Select the "File Import" tab

    ● Set the "Scan Filter: Minimum Intensity" field

    2. Centroiding in El-MAVEN

    Selecting "Centroid Scan" in the "File Import" tab of the "Options" dialog also reduces the amount of data.

    ● Go to "Options"

    ● Click on "File Import"

    ● Click on "Centroid Scan"

  • What does the "EIC Smoothing Algorithm" signify?

    The "EIC Smoothing Algorithm" parameter selects the algorithm used to smooth the data points, which helps increase the signal-to-noise ratio.

    El-MAVEN offers three smoothing algorithms:

    1. Savitzky-Golay: It preserves the original shape and features of the signal better than most other filters.

    2. Gaussian: It reduces noise by averaging over the neighborhood, giving the central point a higher weight, while still preserving sharp edges.

    3. Moving Average: It takes the simple average of neighboring points over time. The resulting signal does not look natural, so this is the least preferred smoothing method.
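
    The difference between the Moving Average and Gaussian approaches comes down to the weights used inside the window. The sketch below is a generic illustration, not El-MAVEN's implementation: the function names, the edge handling (edge points are repeated) and the example trace are assumptions, and Savitzky-Golay is omitted because it requires a local polynomial fit.

```python
import math
from typing import List

def moving_average_weights(window: int) -> List[float]:
    """Equal weight for every scan in the window."""
    return [1.0] * window

def gaussian_weights(window: int, sigma: float = 1.0) -> List[float]:
    """Bell-shaped weights, largest at the centre of the window."""
    half = window // 2
    return [math.exp(-0.5 * ((k - half) / sigma) ** 2) for k in range(window)]

def smooth(signal: List[float], weights: List[float]) -> List[float]:
    """Weighted average over a centred window; edge points are repeated."""
    if len(weights) % 2 == 0:
        raise ValueError("window must contain an odd number of scans")
    half = len(weights) // 2
    total = sum(weights)
    last = len(signal) - 1
    smoothed = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(weights):
            j = min(last, max(0, i + k - half))
            acc += weight * signal[j]
        smoothed.append(acc / total)
    return smoothed

eic = [0.0, 0.0, 10.0, 0.0, 0.0]  # a single sharp spike
averaged = smooth(eic, moving_average_weights(3))
gaussian = smooth(eic, gaussian_weights(3))
print(averaged)
print(gaussian)
```

    Because the Gaussian window weights the central scan more heavily, the spike keeps more of its height than with the plain moving average.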

    El-MAVEN has Savitzky-Golay as its default EIC Smoothing Algorithm. You can select any other algorithm by the following steps:

    ● Go to "Options" in the top menu

    ● Click on the "Peak Detection" tab

    ● Select the algorithm from the drop-down

  • What does the "EIC Smoothing Window" signify?

    The "EIC Smoothing Window" sets the number of scans over which the smoothing algorithm is applied.

    This parameter can be changed as follows:

    ● Go to "Options" in the top menu

    ● Select the "Peak Detection" tab

    ● Select the value for "EIC Smoothing Window"

  • How does "Max Retention Time Difference Between Peaks" affect my data?

    This sets a limit on the RT difference between peaks in a group. Increasing this value when alignment fails will center the peaks satisfactorily.

    This value can be changed by the following steps:

    ● Go to "Options" in the top menu

    ● Select the "Peak Detection" tab

    ● Select the value for "Max Retention Time Difference Between Peaks"

  • What does the "Minimum Signal Baseline Difference" parameter signify?

    The "Minimum Signal Baseline Difference" parameter sets the minimum difference between intensity and baseline required for a signal to be considered valid. To change this value,

    ● Go to "Options" in the top menu

    ● Select the "Peak Filtering" tab

    ● Select the "Minimum Signal Baseline Difference" value

  • How do the parameters in the Isotope Detection tab affect my data?

    Samples are Labeled?

    ● Check the labels that are present in the samples or that have to be detected.

    The parameters in the Filter Isotopic Peaks section are as follows:

    ● Minimum Isotope-Parent Correlation: Sets the minimum threshold for isotope-parent peak correlation. This correlation is a measure of how often they appear together.

    ● Isotope is within [X] scans of parent: Sets the maximum scan difference between isotopic and parent peaks. This is a measure of how closely they appear together on the RT scale.

    ● Maximum % Error to Natural Abundance: Sets the maximum natural abundance error expected. Natural abundance of an isotope is the expected ratio of the amount of isotope over the amount of parent molecule in nature. The error is the difference between observed and natural abundance as a fraction of natural abundance.

    ● Correct for Natural C13 Isotope Abundance: Check the box to correct for natural C13 abundance.
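
    As a worked example of the natural abundance error defined above, the sketch below computes the error as a percentage and compares it with a maximum. The function names and the ratios are illustrative, and taking the absolute value of the difference is an assumption.

```python
def abundance_error_percent(observed_ratio: float, natural_ratio: float) -> float:
    """Difference between observed and natural abundance, as a % of natural abundance."""
    if natural_ratio <= 0:
        raise ValueError("natural_ratio must be positive")
    return abs(observed_ratio - natural_ratio) / natural_ratio * 100.0

def passes_abundance_filter(observed_ratio: float, natural_ratio: float,
                            max_error_percent: float) -> bool:
    """True if the error does not exceed the configured maximum."""
    return abundance_error_percent(observed_ratio, natural_ratio) <= max_error_percent

# Made-up isotope/parent ratios: expected 0.10 in nature, 0.12 observed.
error = abundance_error_percent(0.12, 0.10)
print(round(error, 1))
print(passes_abundance_filter(0.12, 0.10, 25.0), passes_abundance_filter(0.12, 0.10, 10.0))
```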

    To change these parameters,

    ● Go to "Options" in the top menu

    ● Select "Isotope Detection" tab

    ● Change the parameters

  • What do the parameters in Peak Grouping tab signify?

    The Peak Grouping tab has the parameters that calculate the Peak Grouping Score which determines if the peaks should be grouped together or not.

    The score depends on the following 3 parameters and their weights:

    a) RT difference or DistX: Difference in RT between the peaks under comparison. Closer peaks are assigned a higher score.

    b) Intensity difference or DistY: Difference in intensity between the peaks under comparison. A smaller difference gives a higher score.

    c) Overlap: Fraction of RT overlap between the peaks under comparison. Greater overlap gives a higher score.

    ● Consider Overlap: Uncheck this box to calculate the grouping score without overlap.

    ● Sliders are provided to adjust the weights attached to each of the three parameters.

    To change the "Peak Grouping" parameters,

    ● Go to "Options" in the top menu

    ● Click on "Peak Grouping" tab

    ● Change the parameters

  • What do the parameters in Group Ranking tab signify?

    Group rank is one of the parameters for group filtering. Peaks are ranked according to their quality.

    The score depends on the following 3 parameters and their respective weights A, B and C:

    i) Q or Group Quality: Maximum peak quality of a group. Peaks are assigned a quality score by a machine learning algorithm in El-MAVEN. Better quality leads to a higher rank.

    ii) I or Group Intensity: Maximum intensity of a group. Higher intensity leads to a higher rank.

    iii) dRT or RT difference: Difference between expected RT and group mean RT.

    ● Consider Retention Time: Check the box to use retention time in the group rank calculation.

    ● Quality Weight: Adjust the slider to set the weight for group quality in the group rank calculation.

    ● Intensity Weight: Adjust the slider to set the weight for group intensity in the group rank calculation.

    ● dRT Weight: Adjust the slider to set the weight for RT difference in the group rank calculation. The slider is disabled if Consider Retention Time is unchecked.

    To change the group rank parameters,

    ● Go to "Options" in the top menu

    ● Click on "Group Rank" tab

    ● Change the parameters
