Setting Up MiCall for Local Analysis

MiCall is a pipeline for processing FASTQ data from an Illumina MiSeq. This guide covers running MiCall locally on your own computer, using Docker for portability and ease of installation, to analyze a folder of MiSeq samples. Local mode is intended for hands-on analysis of a single run folder, as opposed to fully automated server deployments (such as those using the MiCall watcher).


Prerequisites

Before you begin, make sure you have the following:

  • A computer running a modern operating system that Docker supports (Linux, macOS, or Windows).
  • Docker, installed from Docker’s official installation page.
  • Recommended minimum resources: at least 1 GB of RAM and a multi-core CPU.
  • The MiCall Docker image, which is recommended because it bundles all dependencies for portability.
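
As a quick way to confirm the Docker prerequisite, the following sketch checks whether the docker program is on your PATH and prints its version. The function names have_command and check_prerequisites are invented for this example and are not part of MiCall or Docker.

```shell
# Hypothetical helpers for this guide; not part of MiCall or Docker.

# Succeeds if the named program can be found on the PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Prints the Docker version, or an error message if Docker is missing.
check_prerequisites() {
    if have_command docker; then
        docker --version
    else
        echo "Docker not found on PATH; install it before continuing." >&2
        return 1
    fi
}
```

Calling check_prerequisites in your terminal should print a version line if Docker is installed. It does not check whether the Docker daemon is actually running.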

Preparing Your Input Data

MiCall processes a run folder that contains the following:

  • FASTQ files for forward and reverse reads (optionally compressed as .fastq.gz).
  • A Sample Sheet (for proper pairing of files), if available.
  • Additional quality files generated by the MiSeq, which MiCall will use for quality filtering.

Before running MiCall locally:

  1. Collect your FASTQ files in a dedicated folder (e.g., /path/to/MiSeq_run).
  2. Confirm that your folder structure resembles what MiCall expects. For example, if a sample sheet is available, it should be placed at <run_folder>/SampleSheet.csv.
  3. If you’d like to test the pipeline first, you can download one of our example input datasets from GitHub (Example Inputs).
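
The checks in steps 1 and 2 can be sketched as a small shell function. This is only an illustration: check_run_folder is a hypothetical helper, not a MiCall command, and it assumes that forward and reverse reads are distinguished by _R1 and _R2 in the file name, as in the example directory structure in Appendix B.

```shell
# Hypothetical pre-flight check; not part of MiCall.
# Assumes read pairs are named *_R1*.fastq(.gz) and *_R2*.fastq(.gz).
check_run_folder() {
    run_folder=$1
    status=0
    found=0
    # The sample sheet is optional, so its absence is only a note.
    if [ ! -f "$run_folder/SampleSheet.csv" ]; then
        echo "note: no SampleSheet.csv in $run_folder"
    fi
    for r1 in "$run_folder"/*_R1*.fastq "$run_folder"/*_R1*.fastq.gz; do
        [ -e "$r1" ] || continue   # skip glob patterns that matched nothing
        found=1
        name=${r1##*/}
        # Swap the last "_R1" in the file name for "_R2".
        r2=${name%_R1*}_R2${name##*_R1}
        if [ ! -e "$run_folder/$r2" ]; then
            echo "missing mate: $r2"
            status=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no forward-read FASTQ files found in $run_folder"
        status=1
    fi
    return "$status"
}
```

Running check_run_folder /path/to/MiSeq_run returns a nonzero status if any forward-read file lacks a matching reverse-read file, or if no FASTQ files are found at all.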

Pulling the MiCall Docker Image

MiCall is available on Docker Hub. To pull the latest stable image (non-dev version), run the following command in your terminal:

docker pull cfelab/micall

Running MiCall Locally Using Docker

For local analysis, you will run MiCall on your local run folder. The most commonly used mode is the “folder” mode, which processes an entire run folder. When running MiCall from Docker, you need to bind mount your local input folder (containing your FASTQ files, sample sheet, and other data) and make sure the output folder is also reachable from inside the container. In the example below, the output folder is a subfolder of the run folder, so a single bind mount covers both.

Example Command

Assuming your MiSeq run folder is the current directory and you want to write results to the ./results subfolder, use the following command:

docker run --rm -v .:/data cfelab/micall folder --project HIVB --skip trim.censor --denovo --keep_scratch . results

In this command:

  • The --rm option cleans up the container after execution.
  • The -v .:/data option bind mounts the current directory (the run folder) at /data inside the container.
  • Results are written to the results subfolder of the run folder, which appears as ./results on the host because it lies inside the bind mount.
  • The “folder” subcommand instructs MiCall to process all samples found in the run folder.
  • The --project option asks MiCall to trim the primers that BCCFE uses for sequencing HIV-B.
  • The --skip trim.censor option tells MiCall to bypass the quality filtering of FASTQ reads.
  • The --denovo option puts MiCall in de novo assembly mode.
  • The --keep_scratch option prevents MiCall from deleting intermediate files (they can be useful for troubleshooting).
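
Some Docker versions expect an absolute host path in a bind mount, so the relative form used above may not work everywhere. The sketch below is a hypothetical wrapper (run_micall_folder is not part of MiCall) that resolves the run folder to an absolute path and then issues the same command with the same options. The DRY_RUN variable is also an invention for this example: when it is set, the command is printed instead of executed.

```shell
# Hypothetical wrapper around the example command; not part of MiCall.
run_micall_folder() {
    # Resolve the run folder to an absolute path for the bind mount.
    run_folder=$(cd "$1" && pwd) || return 1
    # Build the same command line as in the example above.
    set -- docker run --rm -v "$run_folder:/data" cfelab/micall folder \
        --project HIVB --skip trim.censor --denovo --keep_scratch . results
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"    # print the command for review
    else
        "$@"         # run it
    fi
}
```

With DRY_RUN set to any non-empty value, run_micall_folder /path/to/MiSeq_run prints the full docker command so that you can check the mount path before starting a long analysis.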

Overview of the Local Analysis Pipeline

When you run MiCall locally, the pipeline executes the following key steps:

  1. Quality Assessment and Filtering:
    • Reads the phiX quality data and produces a quality summary (quality.csv) and a bad cycles file.
    • Trims the FASTQ reads to remove low-quality regions and adapter sequences.
  2. Mapping/De Novo Assembly:
    • Maps the reads against a set of reference sequences or, if enabled, performs de novo assembly.
    • Optionally uses remapping techniques to refine the consensus sequences.
  3. Consensus Generation and Coverage Mapping:
    • Produces consensus sequences and aligned FASTQ/SAM files, and generates coverage maps that display the read depth across each consensus region.
  4. Resistance Reporting (if applicable):
    • For HIV or HCV resistance typing, as selected, additional steps generate resistance calls based on rules defined elsewhere.

Key output files include but are not limited to:

  • aligned.csv, amino.csv, nuc.csv
  • consensus sequences (conseq.csv, conseq_all.csv)
  • genome_coverage maps
  • resistance reports (if run for HIV/HCV)
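
After a run finishes, a quick look at which CSV files were produced, and roughly how many rows each contains, can show whether the pipeline generated any data. The following is a minimal sketch: summarize_results is a hypothetical helper, and it makes no assumption about where under the results folder MiCall places each file, so it simply searches recursively. The row count is approximate because it counts lines, not parsed CSV records.

```shell
# Hypothetical helper; not part of MiCall.
# Lists every CSV file under a folder with its approximate data row count.
summarize_results() {
    find "$1" -type f -name '*.csv' | sort | while IFS= read -r csv; do
        # Count lines, then subtract one for the header line.
        lines=$(($(wc -l < "$csv")))
        if [ "$lines" -gt 0 ]; then
            rows=$((lines - 1))
        else
            rows=0
        fi
        echo "${csv##*/}: $rows data rows"
    done
}
```

For example, summarize_results ./results prints one line per CSV file. A file that reports 0 data rows contains at most a header, which may be worth investigating.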

Customizing Your Run

MiCall provides several command-line options to tailor runs to your needs. For example:

  • Skipping Steps: You can skip over certain steps (e.g., quality censoring) by using the --skip parameter.
  • Enabling de novo Mode: Use the --denovo flag if you want to perform de novo assembly instead of mapping to known references.
  • Specifying Primers: Use the --project_code option to indicate the primer set (e.g., HCV, HIVB, or SARSCOV2) to trim.

Run the following command to see all available options:

docker run --rm cfelab/micall --help

Troubleshooting and FAQs

If you encounter issues while setting up or running MiCall locally, consider the following tips:

  • Error: “No such file or directory”:
    • Verify your bind mount paths are correct and accessible by Docker.
    • Check that the run folder contains the required files (FASTQ files, SampleSheet.csv, etc.).
  • Incorrect Output or Empty Results:
    • Ensure that your input files are in the correct format.
    • Review the logs output by MiCall in the container’s standard output for hints.
  • Logging and Debugging:
    • Use the --debug_remap flag to generate additional debug files if you suspect mapping issues.
    • Check the scratch folder for intermediate outputs (it is only kept if you ran MiCall with --keep_scratch).
  • Low Performance or Resource Errors:
    • Check the available RAM and CPU resources on your machine.
    • Consider closing other applications or increasing system resources if necessary.
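
To see what resources Docker itself has available, you can query docker info, which reports the number of CPUs (NCPU) and total memory in bytes (MemTotal). The sketch below wraps those two queries. The function names are made up for this example. On Docker Desktop, the reported values may reflect the limits configured for Docker rather than the whole machine.

```shell
# Hypothetical helpers; not part of MiCall.

# Converts a byte count to whole GiB, rounding down.
bytes_to_gib() {
    echo $(( $1 / 1073741824 ))
}

# Reports the CPU count and memory that the Docker daemon can use.
report_docker_resources() {
    cpus=$(docker info --format '{{.NCPU}}') || return 1
    mem=$(docker info --format '{{.MemTotal}}') || return 1
    echo "Docker sees $cpus CPUs and about $(bytes_to_gib "$mem") GiB of RAM"
}
```

If the reported memory is below the recommended minimum from the Prerequisites section, raising Docker's resource limits is a reasonable first step.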

Summary and Next Steps

In this guide, you have learned how to set up MiCall for local analysis using Docker. By following these instructions, you can process a single run folder on your local computer. Once you’ve successfully analyzed your data locally, you may consider exploring further aspects of MiCall, such as resistance interpretation or custom de novo assembly settings.


Appendices

A. Glossary of Terms

  • run_folder — The directory containing the input MiSeq files.
  • results_folder — The target directory where MiCall writes the outputs.
  • scratch — A temporary folder for intermediate files during processing.

B. Example Directory Structure

A suggested structure for your local analysis might look like this:

  /your_project_directory/
    SampleSheet.csv
    1234A_R1.fastq.gz
    1234A_R2.fastq.gz
    ./results/
      (respective output files will be written here)