JUMP Shiny

Welcome to JUMP Shiny

JUMP Shiny is a comprehensive web platform designed for mass spectrometry-based proteomics analysis. It offers a wide range of analytical tools, including experimental design, exploratory analysis, batch normalization, differential expression, and pathway enrichment analysis. The platform provides intuitive visualizations and functionalities that facilitate in-depth exploration of proteomics data. The Tutorial tab guides you through the user interface, the available functionalities, and the detailed steps to perform each analysis.

Contact Information

Aijun Zhang: azhang16@uthsc.edu
Xusheng Wang: xwang39@uthsc.edu

Wang Lab@2025: Lab Website

Acknowledgement

Part of the code was adapted from project TCC-GUI under MIT license. We truly appreciate and respect their contributions.

  • Experiment Design
  • Exploratory Analysis
  • Batch Normalization
  • Differential Expression
  • Enrichment pathway analysis

Experiment Design

Experiment Design offers tools for organizing sample processing sequences in proteomics experiments. By using block randomization, it assigns samples to different batches, minimizing batch effects and reducing the risk of introducing confounders that could bias data interpretation.

The Experiment Design algorithm comprises three key procedures:

  • Generating a batch design matrix based on the distribution of the first explanatory variable, while considering the specified number and size of batches.
  • Allocating samples to each batch, factoring in the distribution of the second explanatory variable.
  • Optimizing the batch design scenario for the third and subsequent variables.
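
The first procedure above can be illustrated with a minimal sketch. This is not JUMP Shiny's implementation; the function name `block_randomize`, the `Group` factor, and the round-robin dealing are assumptions chosen only to show how block randomization spreads the levels of one factor evenly across batches.

```python
import random
from collections import defaultdict

def block_randomize(samples, factor, n_batches, seed=0):
    """Assign samples to batches so each level of `factor` is spread evenly.

    `samples` is a list of dicts with a "SampleID" key; `factor` names the
    first explanatory variable. All names here are illustrative.
    """
    rng = random.Random(seed)
    by_level = defaultdict(list)
    for s in samples:
        by_level[s[factor]].append(s["SampleID"])

    batches = [[] for _ in range(n_batches)]
    slot = 0
    for level in sorted(by_level):
        ids = by_level[level]
        rng.shuffle(ids)  # randomize the order within each level (block)
        for sample_id in ids:
            batches[slot % n_batches].append(sample_id)
            slot += 1  # keep dealing round-robin so batch sizes stay even
    return batches

# Hypothetical input: 12 samples, odd-numbered ones are cases.
samples = [
    {"SampleID": f"S{i}", "Group": "case" if i % 2 else "control"}
    for i in range(1, 13)
]
batches = block_randomize(samples, "Group", n_batches=3)
for i, b in enumerate(batches, 1):
    print(f"Batch {i}: {b}")
```

With 6 cases and 6 controls dealt into 3 batches, every batch receives 2 of each. Handling the second and later factors, as described in the other two procedures, is not covered by this sketch.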

Steps for Experiment Design

  1. Navigate to Experiment Design Tab

    Click on the Experiment Design tab in the left sidebar of this page.

    Experiment Design Tab

  2. Upload Sample Information

    At the top left, click [Browse...] to upload your sample information table, which should be in .csv format.

    Upload sample info

    Ensure your file follows the correct format shown below. The first column should be SampleID, with the factors to be considered starting from the second column. You can also download the example file by clicking [Download example sample information].

    The [Sample Information Table] and the [Factor Distributions] plot, which shows the distribution of each factor (maximum: 3 plots), will be displayed after a successful upload. Note that factors should be categorical variables; continuous variables should be grouped before uploading (such as AgeGroup in the table below).
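
As an illustration of grouping a continuous variable before upload, the sketch below bins a hypothetical Age column into an AgeGroup column and writes a table whose first column is SampleID. The cut-offs, the Sex column, and the sample values are made up for the example; choose bins that suit your own study.

```python
import csv
import io

def age_group(age):
    """Bin a continuous age into a categorical label (cut-offs are arbitrary)."""
    if age < 40:
        return "<40"
    if age < 60:
        return "40-59"
    return ">=60"

# Hypothetical raw sample sheet with a continuous Age column.
raw = io.StringIO(
    "SampleID,Sex,Age\n"
    "S1,F,35\n"
    "S2,M,52\n"
    "S3,F,67\n"
)
rows = list(csv.DictReader(raw))

# Replace the continuous Age column with a categorical AgeGroup column.
for row in rows:
    row["AgeGroup"] = age_group(float(row.pop("Age")))

out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=["SampleID", "Sex", "AgeGroup"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```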

    example sample info

  3. Experiment Settings

    Choose your experiment type as either [Label-free] or [TMT-labeling].

    Input the number of samples in a batch. For a [TMT-labeling] experiment, this could be 10, 11, 16, or 18. A warning is shown if the number is greater than 18. For [Label-free], there is no limit on the batch size.

    Input the number of IR (Internal Reference) samples used in a batch. This typically would be 0, 1 or 2.
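
To get a feel for how these two settings interact, the following sketch estimates the number of batches needed. It assumes that the IR samples occupy slots within the batch size entered above, which the text does not state explicitly; the function `n_batches` is purely illustrative.

```python
import math

def n_batches(n_samples, batch_size, n_ir):
    """Batches needed when each batch reserves `n_ir` slots for IR samples."""
    slots = batch_size - n_ir  # slots left for study samples (assumption)
    if slots <= 0:
        raise ValueError("batch size must exceed the number of IR samples")
    return math.ceil(n_samples / slots)

print(n_batches(n_samples=45, batch_size=16, n_ir=1))  # 45 / 15 -> 3
print(n_batches(n_samples=46, batch_size=16, n_ir=1))  # 46 / 15 -> 4
```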

    The [Optimization level] is the number of times that the block randomization program will be run to find the best result. You can leave it as the default value.
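
The idea of running the randomization several times and keeping the best result can be sketched as follows. The imbalance score and the plain shuffle used here are assumptions for illustration; JUMP Shiny's actual scoring criterion is not described in this tutorial.

```python
import random
from collections import Counter

def imbalance(batches, labels):
    """Sum over levels of (max - min) count across batches; 0 is perfectly even."""
    levels = set(labels.values())
    counts = [Counter(labels[s] for s in b) for b in batches]
    return sum(
        max(c[lv] for c in counts) - min(c[lv] for c in counts) for lv in levels
    )

def best_of(sample_ids, labels, n_batches, n_runs, seed=0):
    """Shuffle `n_runs` times and keep the split with the lowest imbalance."""
    rng = random.Random(seed)
    best, best_score = None, None
    for _ in range(n_runs):
        ids = list(sample_ids)
        rng.shuffle(ids)
        batches = [ids[i::n_batches] for i in range(n_batches)]
        score = imbalance(batches, labels)
        if best_score is None or score < best_score:
            best, best_score = batches, score
    return best, best_score

# Hypothetical factor: every third sample is "F", the rest are "M".
labels = {f"S{i}": ("F" if i % 3 == 0 else "M") for i in range(1, 13)}
ids = list(labels)
_, score_1 = best_of(ids, labels, n_batches=2, n_runs=1)
_, score_50 = best_of(ids, labels, n_batches=2, n_runs=50)
print(score_1, score_50)
```

Because both calls start from the same seed, the 50-run result can never score worse than the single run. More runs only cost time, which is why the default is usually sufficient.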

    Note that when there are more than two factors, an equal distribution of the third and subsequent factors across batches cannot be guaranteed. In such cases, prioritize the factors and place the two most important ones as the first two factor columns.

    experiment settings

  4. Run Experiment Design

    Click the [Run experiment design] button. When the program finishes running, it generates the [Batch and channel assignation] table and a [Batch design matrix] for each factor.

    The [Batch and channel assignation] contains the information provided by the user alongside the batch information generated by the program. For TMT-labeling experiments, the table contains an additional column specifying the assigned channel for each sample.

    Click [Download all results] to download all results as an .xlsx file.

    experiment result1

    experiment result2

Exploratory Analysis

Exploratory Analysis provides a user-friendly interface for uploading and visualizing datasets. It summarizes key dataset characteristics, helping users understand data distribution and identify underlying patterns. This section provides effective quality control of your data.


Steps for Exploratory Analysis

  1. Navigate to Exploratory Analysis Tab

    Click on the Exploratory Analysis tab in the left sidebar of this page.

    Data Import Tab

  2. Upload Data

    Upload a proteomics dataset in tab-delimited text or .csv format. Please ensure your file follows the correct format shown below. For a large dataset (over 30 MB), please use JUMP Shiny locally.

    You can also click [Download example input expression table] to see an example.

    The input data should begin with three annotation columns: Protein Accession Number (e.g., UniProt), Gene Name, and Protein Description, followed by one column per sample. JUMP Shiny supports raw abundance values as well as log2-transformed values.
    Note: if your data is already log2-transformed, please choose the log2 option.
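
The difference between the two input scales can be sketched as below. The helper `to_log2` and its treatment of zero or missing values are assumptions for illustration, not a description of how JUMP Shiny handles them.

```python
import math

def to_log2(values, already_log2=False):
    """Return log2 abundances; zero or missing raw values become None."""
    if already_log2:
        return list(values)  # nothing to do, avoid transforming twice
    return [math.log2(v) if v is not None and v > 0 else None for v in values]

raw = [1024.0, 2048.0, 0.0, None]
print(to_log2(raw))                               # [10.0, 11.0, None, None]
print(to_log2([10.0, 11.0], already_log2=True))   # [10.0, 11.0]
```

Choosing the wrong option would either leave raw values untransformed or apply the logarithm twice, so it is worth checking the value range of your table before uploading.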

    An example of input data is shown below:

    Jumpshiny

    If successfully uploaded, the input data will be displayed in the Protein expression table panel (see below).

    Import Data


JUMP Shiny Format:

If your data contains Accession number, Gene Name, Description, and samples, use jumpshiny to upload. The first column is required.


JUMPq Result Format:

If the sample columns in your data start at the 24th column, use jumpq to upload. Please remove the header rows of the jumpq data.

Jumpq


JUMPq Batch Result Format:

If your data contains batch info, use jump_batch to upload.

Jumpq Batch

  3. Group Assignment

    After loading the dataset, input your grouping in the [Meta information] panel.

    Group Assignment

    Group Information File
    The Group Information File is required for the data analysis. It should adhere to the following structure:

    • Sample Name Column: The first column should contain your sample names. These names must exactly match the corresponding column names in your input expression table. Only the columns specified in this file will be included in the analysis, so ensure that they are correct, complete, and matched.

    • Grouping Name Column: After the sample name column, you can include one or more grouping columns. Each grouping column can represent different categories or factors relevant to your analysis (e.g., “control” vs. “treatment,” “male” vs. “female,” etc.). You can add as many grouping columns as necessary to capture all relevant grouping factors.

    Note: Headers are required in this file to clearly identify each column. The header of the first column must be named Sample or sample.
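
Before uploading, the rules above can be checked with a short script. The function `check_group_file` and the example column names are hypothetical; the sketch only tests the requirements listed in this section.

```python
def check_group_file(group_header, group_samples, expression_columns):
    """Return a list of problems found; an empty list means the files line up."""
    problems = []
    if group_header[0] not in ("Sample", "sample"):
        problems.append("first column header must be 'Sample' or 'sample'")
    if len(group_header) < 2:
        problems.append("at least one grouping column is required")
    missing = [s for s in group_samples if s not in expression_columns]
    if missing:
        problems.append(f"samples not found in expression table: {missing}")
    return problems

# Hypothetical expression table header and two candidate group files.
expr_cols = ["Accession", "Gene", "Description", "ctrl_1", "ctrl_2", "trt_1"]
print(check_group_file(["Sample", "Group"], ["ctrl_1", "trt_1"], expr_cols))  # []
print(check_group_file(["Name", "Group"], ["ctrl_1", "trt_9"], expr_cols))
```

The second call reports two problems: the wrong header name and a sample name (`trt_9`) that has no matching column in the expression table.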

    Below is an example of a Group Information File:

    Group Info

  4. Confirm and Analyze

    Click the [Assign group information] button and wait for the [Summary] section to display additional information about your dataset. You can download and save the plots in .svg format for further analysis or publication. All plots can be zoomed in and out for a closer examination of the data.

    Intensity Distribution Plot

    By clicking the [Intensity Distribution] tab, you can view box plots for all uploaded samples, with each group highlighted in different colors. You can filter out proteins with low intensity and customize the title, X-axis, and Y-axis labels as needed.

    Distribution

    PCA Plot

    The PCA Plot visualizes the distribution of selected groups based on Principal Component Analysis (PCA). This plot helps identify patterns and trends in your dataset by reducing its dimensionality and highlighting the differences and similarities between groups. Both 2D and 3D PCA plots are included. Each point in the plot represents a sample, and the position of the points indicates their relative similarity or difference based on the principal components. The axes represent the first two (2D) or three (3D) principal components, which capture the most variance in the data. This visualization can be useful for identifying outliers, clusters, and potential relationships between groups.

    You can define the number of top variable proteins included in the PCA. The results may vary depending on the number of proteins selected. Additionally, you can toggle the buttons to display or hide labels on the plot.
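
Selecting the top variable proteins can be sketched as ranking proteins by their variance across samples. The protein IDs and values below are made up, and whether JUMP Shiny ranks by variance or by another dispersion measure is an assumption.

```python
from statistics import variance

def top_variable(expression, n_top):
    """Return the `n_top` protein IDs with the largest variance across samples."""
    ranked = sorted(expression, key=lambda p: variance(expression[p]), reverse=True)
    return ranked[:n_top]

# Hypothetical log2 abundances: protein ID -> values across four samples.
expression = {
    "P1": [10.0, 10.1, 9.9, 10.0],
    "P2": [8.0, 12.0, 7.5, 12.5],
    "P3": [11.0, 11.5, 10.5, 11.0],
}
print(top_variable(expression, n_top=2))  # ['P2', 'P3']
```

Proteins that barely change across samples, such as P1 here, contribute little to the principal components, which is why the plot can shift as the number of selected proteins changes.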

    PCA Plot

    Sample Correlation

    The sample correlation heatmap visualizes the correlation within and between groups. Hierarchical clustering organizes samples and features into a tree, known as a dendrogram, based on their similarity or dissimilarity. The heatmap uses color gradients to represent the intensity of the correlation, with closely related samples or groups appearing closer together on the dendrogram. This visualization can help identify clusters of similar samples, reveal patterns in the data, and highlight differences between groups. It’s a valuable tool for understanding the relationships and structure within your dataset.

    Similarly, you can select the number of proteins to include in the cluster analysis. The percentage indicates the ratio of the selected top variable proteins to the total number of proteins in the dataset. You can also choose from various agglomeration and distance methods to customize the clustering process. These options allow you to refine the analysis and tailor the clustering approach based on the characteristics of your data and your specific research needs.
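
The quantity behind such a heatmap can be sketched with a plain Pearson correlation between samples. Pearson correlation and the `1 - correlation` distance are assumptions used for illustration; the distance and agglomeration methods available in JUMP Shiny are chosen in the interface.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical log2 abundances of four proteins in three samples.
samples = {
    "ctrl_1": [10.0, 8.0, 11.0, 9.0],
    "ctrl_2": [10.2, 8.1, 11.1, 9.3],
    "trt_1": [9.0, 11.0, 8.5, 10.5],
}
names = list(samples)
corr = {a: {b: pearson(samples[a], samples[b]) for b in names} for a in names}
# 1 - correlation is one common distance to feed into hierarchical clustering.
dist = {a: {b: 1 - corr[a][b] for b in names} for a in names}
for a in names:
    print(a, [round(corr[a][b], 2) for b in names])
```

In this toy data the two control samples correlate strongly with each other and negatively with the treated sample, so a clustering step would join the controls first.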

    Heatmap

    Group Selection

    Navigate to the [Group selection] panel to explore different ways of grouping your data for visualization. You can select variables or categories to group your data, such as experimental conditions, genes, or sexes. This flexibility allows you to customize the visualization and highlight specific aspects of your data for more detailed analysis. Use the options in the panel to easily switch between different groupings and gain insights from various perspectives.

    Group Selection

Batch Normalization

Batch Normalization aims to correct unwanted technical variation in protein expression data arising from experimental batch effects. By applying normalization techniques, you can ensure that the observed differences in protein expression are due to biological variation rather than technical artifacts. This process offers several key benefits:

  • Improving the Accuracy of Differential Expression: Normalization helps in accurately identifying true biological differences by eliminating technical noise.
  • Reducing the Impact of Batch Effects: It minimizes the influence of variations introduced during different experimental batches, leading to more consistent and reliable data.
  • Enhancing the Reproducibility of Results: By standardizing the data, normalization ensures that the results are reproducible across different experiments and studies.

Steps for Batch Normalization

  1. Navigate to Batch Normalization Tab

    Click on the Batch Normalization tab in the left sidebar of this page.

    Batch Normalization Tab

  2. Select Normalization Method

    Choose the appropriate normalization method based on your data:

    Internal: If your data has an internal reference, such as TMT data, you can normalize the data based on an internal sample.

    Linear: If your data doesn’t have an internal sample, select linear normalization. Linear normalization adjusts your dataset based on overall trends, bringing all samples to a common scale and correcting for systematic technical variations.

    Internal+Linear: If your data has an internal reference, you can apply internal normalization first and then linear normalization for better results.
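
The two ideas can be sketched on the log2 scale as follows. Both functions are illustrative assumptions: the internal method is shown as subtracting each batch's reference value, and median centering stands in for linear normalization, whose exact formula is not given in this tutorial.

```python
from statistics import median

def internal_normalize(log2_values, batch_of, ir_of_batch):
    """Subtract each batch's internal-reference value from its samples (log2)."""
    return {
        s: v - log2_values[ir_of_batch[batch_of[s]]]
        for s, v in log2_values.items()
    }

def median_center(log2_by_sample):
    """Shift every sample so its median log2 abundance is zero."""
    out = {}
    for s, vals in log2_by_sample.items():
        m = median(vals)
        out[s] = [v - m for v in vals]
    return out

# One hypothetical protein measured in two batches, each with an IR sample.
log2_values = {"b1_IR": 10.0, "b1_s1": 10.5, "b2_IR": 12.0, "b2_s1": 12.5}
batch_of = {"b1_IR": "b1", "b1_s1": "b1", "b2_IR": "b2", "b2_s1": "b2"}
ir_of_batch = {"b1": "b1_IR", "b2": "b2_IR"}
normalized = internal_normalize(log2_values, batch_of, ir_of_batch)
print(normalized)

# Two hypothetical samples on different overall scales.
centered = median_center({"s1": [9.0, 10.0, 11.0], "s2": [10.0, 11.0, 12.0]})
print(centered)
```

After internal normalization the two study samples have the same value (0.5) even though their batches differed by 2 log2 units, and after centering the two samples share a common scale.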

    Normalization Method

  3. Selecting Batch Group Information

    To perform batch normalization, use the dropdown menus to select the batch identifier column and the column that marks the internal reference samples, both as provided in the sample information file.

    Once you have selected the appropriate batch and internal reference column, click the [Run Batch Normalization] button to initiate the normalization process.

    Select your batch group column in the dropdown menu.
    batchgroup

    • Internal Method: Format: Include one more Info column. Make sure to specify the internal samples.
      Internal Method

    • Linear Method: Format: No internal reference column is needed.
      Linear Method

  4. Normalization Results

    After normalization, the Data Table will appear on the right side of the page. As in Exploratory Analysis, the Intensity Distribution, PCA Plot, and Sample Correlation views are generated so you can assess the effectiveness of the normalization.

    batch distribution

    batch distribution

    batch distribution

  5. Proceed to Differential Expression

    You can now proceed directly to Differential Expression analysis.

Differential Expression

Differential Expression is a method used to identify proteins that show significant differences in expression levels between groups or conditions. By comparing the expression levels across different groups, such as diseased versus healthy samples, researchers can pinpoint specific molecules that are upregulated or downregulated, providing insights into the biological processes involved.
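
As a minimal illustration of comparing two groups, the sketch below computes a log2 fold change for one protein. The values are made up, and this tutorial excerpt does not state which statistical test JUMP Shiny applies, so no significance test is shown.

```python
from statistics import mean

def log2_fold_change(log2_case, log2_control):
    """Difference of group means on the log2 scale (case minus control)."""
    return mean(log2_case) - mean(log2_control)

# Hypothetical log2 abundances of one protein in two groups.
diseased = [12.0, 12.5, 11.5]
healthy = [10.0, 10.5, 9.5]
lfc = log2_fold_change(diseased, healthy)
print(lfc)  # 2.0, i.e. a four-fold increase
```

A positive value indicates upregulation in the first group and a negative value indicates downregulation.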


Steps for Differential Expression

  1. Navigate to Differential Expression Tab

    Click on the Differential Expression tab in the left sidebar of this page.