MiCall is a pipeline for processing FASTQ data from an Illumina MiSeq. This guide focuses on running MiCall locally on your personal computer, using Docker for portability and ease of installation, in order to analyze a folder of MiSeq samples. This local mode is intended for hands-on analysis of a single run folder, as opposed to the fully automated server deployments (such as those using MiCall watcher).
When you want to process a few FASTQ files or MiSeq run folders outside of a fully automated setup, the best solution is to run MiCall locally using Docker. Docker bundles all of MiCall’s dependencies in a portable container, removing the hassle of installing multiple packages. Before diving in, please ensure your machine meets the prerequisites described below.
Prerequisites and Docker Setup
- A modern operating system (Linux, macOS, or Windows with Docker support).
- Docker installed on your computer. Follow the official instructions on the Docker website. If you have successfully run a container before (for example, the “hello-world” image), you are ready for MiCall.
- Recommended minimum resources: at least 1 GB of RAM and a multi-core CPU.
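The Docker prerequisite above can be checked from a terminal before pulling any image. The following is a minimal sketch, assuming a POSIX shell; have_cmd and preflight are hypothetical helper names, not part of MiCall or Docker.

```shell
# Preflight sketch: confirm that required tools are on PATH.
# have_cmd and preflight are illustrative names only.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

preflight() {
    # Prints one line per tool; returns non-zero if any tool is missing.
    missing=0
    for tool in "$@"; do
        if have_cmd "$tool"; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

A typical use would be `preflight docker && docker run --rm hello-world`, which only starts the test container when the docker client is found.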
MiCall is hosted on Docker Hub under the repository “cfelab/micall”. To pull the latest stable (non-dev) image, open your terminal and run:
docker pull cfelab/micall
It is a good idea to check the Docker Hub tags to select a version of MiCall that suits your needs. Once you have pulled an image, you can confirm that it runs by printing its help message, for example:
docker run cfelab/micall --help
Preparing Your Input Data
MiCall expects the input to be organized as a MiSeq run folder. This folder should contain:
- FASTQ files for forward and reverse reads (optionally compressed as .fastq.gz).
- A Sample Sheet (typically named SampleSheet.csv) if available, which helps with pairing and metadata extraction.
- Additional quality files (such as the InterOp folder) generated by the MiSeq.
For a hands-on trial, you might want to download an example dataset from the project’s GitHub (Example Inputs) and arrange the files in a directory similar to:
/your_project_directory/
    SampleSheet.csv
    1234A_R1.fastq.gz
    1234A_R2.fastq.gz
    ...
    results/    <-- (an empty folder to capture outputs)
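Before starting a run, it can help to confirm that every forward-read file has a matching reverse-read file. The sketch below assumes the _R1/_R2 naming shown in the layout above; find_unpaired is a hypothetical helper name, and real MiSeq file names may be longer while keeping the same convention.

```shell
# Sketch: list forward-read files that have no matching reverse-read file.
# Assumes file names contain _R1 and _R2 as in the example layout.
find_unpaired() {
    run_dir=$1
    for r1 in "$run_dir"/*_R1*.fastq "$run_dir"/*_R1*.fastq.gz; do
        [ -e "$r1" ] || continue          # skip patterns that matched nothing
        name=${r1##*/}                    # strip the directory part
        # Swap the last "_R1" in the name for "_R2" to get the mate's name.
        mate="${name%_R1*}_R2${name##*_R1}"
        [ -e "$run_dir/$mate" ] || echo "$name"
    done
}
```

Calling `find_unpaired /your_project_directory` prints nothing when all pairs are complete, and otherwise prints one unpaired file name per line.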
Running MiCall Locally Using Docker
For local analysis, MiCall’s “folder” mode processes an entire run folder. When running via Docker, you need to bind mount the run folder into the container. In this example the results folder is a subfolder of the run folder, so a single mount covers both input and output. For instance, if your current working directory is the run folder and you wish to store outputs in a ./results folder, run:
docker run --rm -v .:/data cfelab/micall folder --project HIVB --skip trim.censor --denovo --keep_scratch . results
In this command:
- The --rm option removes the container when it finishes, helping to keep your system tidy.
- The local run folder is mounted inside the container as /data, ensuring MiCall has access to your FASTQ files, SampleSheet, and other resources.
- The “folder” subcommand tells MiCall to process all available samples.
- Options such as --project (here set to HIVB) specify the primer set to use (for example, for HIV tests), while --skip trim.censor instructs MiCall to bypass the quality filtering step. Quality filtering is recommended in production, but it requires the InterOp data, which this short guide does not assume you have. That is why the step is skipped here.
- The --denovo flag activates de novo assembly mode, and --keep_scratch preserves intermediate files to aid in troubleshooting if needed.
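If your Docker version does not accept a relative host path such as . in the -v option, the same command can be written with an absolute path. The sketch below only prints the command for review and does not run anything; micall_folder_cmd is a hypothetical helper name, and the MiCall options are exactly the ones used above.

```shell
# Sketch: print the docker command from this guide with an absolute host path.
# Nothing is executed here; the command is printed so it can be reviewed first.
micall_folder_cmd() {
    # Resolve the run folder to an absolute path; fail if it does not exist.
    abs_dir=$(CDPATH= cd "$1" 2>/dev/null && pwd) || return 1
    printf 'docker run --rm -v %s:/data cfelab/micall folder --project HIVB --skip trim.censor --denovo --keep_scratch . results\n' "$abs_dir"
}
```

Running `micall_folder_cmd .` from inside the run folder prints the full command, which you can then paste into the terminal. Paths that contain spaces would need quoting before use.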
A few additional Docker tips:
- Passing the environment variable COLUMNS (for example, using -e COLUMNS=$COLUMNS) can improve help message formatting.
- If you forget to clean up, you may list containers with docker ps -a and remove them individually (or run docker container prune -f).
Overview of the Local Analysis Pipeline
When you run MiCall locally, the pipeline performs several key steps:
- Quality Assessment and Filtering: MiCall first reads file-specific quality information (for example, phiX error rates from the InterOp files) and produces a quality summary. It then trims the FASTQ reads to remove low-quality bases and adapter sequences.
- Mapping and De Novo Assembly: The pipeline maps reads against a set of reference sequences or, if requested, employs de novo assembly tools to construct contigs from scratch. For HIV resistance testing, for instance, a remapping mode may be used to refine consensus sequences.
- Consensus Generation and Coverage Mapping: Based on read alignments, consensus sequences are computed and coverage maps are generated. These visualizations illustrate read depth variations across genomic regions.
- Resistance Reporting (when applicable): If running resistance analysis for HIV or HCV, additional modules generate resistance calls according to predefined rule sets.
Key output files include aligned data (aligned.csv), nucleotide and amino acid counts (nuc.csv, amino.csv), consensus sequences (conseq.csv), genome coverage maps, and resistance reports if the analysis is configured for that purpose.
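After a run, a quick way to see which of the CSV files named above were produced is to search the results folder for them. This is a minimal sketch: report_outputs is a hypothetical helper name, and because the exact layout of the results folder is an assumption here, the search is recursive.

```shell
# Sketch: report which of the output files named in this guide exist
# anywhere below a results folder (layout assumed unknown, so search deep).
report_outputs() {
    results_dir=$1
    for name in aligned.csv nuc.csv amino.csv conseq.csv; do
        # Keep only the first match, if any.
        hit=$(find "$results_dir" -type f -name "$name" 2>/dev/null | head -n 1)
        if [ -n "$hit" ]; then
            echo "found: $name"
        else
            echo "missing: $name"
        fi
    done
}
```

For example, `report_outputs results` prints one found or missing line per file. A missing file is not necessarily an error, since which files appear depends on the options used.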
Customizing and Troubleshooting Your Run
MiCall offers a range of command-line options to fine-tune your analysis. For example, you can adjust parameters to skip certain steps, change the quality cutoff thresholds, or select different primer sets (using the --project option). To see the full list of options, simply run:
docker run --rm cfelab/micall --help
If you experience issues — such as error messages like “No such file or directory,” empty results, or performance problems — ensure that:
- Your bind mount paths are correct and accessible by Docker.
- Your input folder contains all the necessary files in the expected format.
- System resources (RAM, CPU) are sufficient. Close other applications if needed.
- For mapping issues, you might add the --debug_remap flag to gain additional insight into the process.
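The first two checks in this list can be scripted so that they run before the folder is mounted into the container. The following is a sketch under stated assumptions: check_run_folder is a hypothetical helper name, and the file and folder names it looks for are the ones mentioned in this guide.

```shell
# Sketch: basic checks on a run folder before mounting it into the container.
# Returns non-zero if the folder is unreadable or holds no FASTQ files.
check_run_folder() {
    run_dir=$1
    if [ ! -d "$run_dir" ] || [ ! -r "$run_dir" ]; then
        echo "error: $run_dir is not a readable directory"
        return 1
    fi
    # Count FASTQ files, compressed or not, directly inside the folder.
    n=0
    for f in "$run_dir"/*.fastq "$run_dir"/*.fastq.gz; do
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "fastq files: $n"
    [ -f "$run_dir/SampleSheet.csv" ] || echo "warning: no SampleSheet.csv"
    [ -d "$run_dir/InterOp" ] || echo "note: no InterOp folder"
    [ "$n" -gt 0 ]
}
```

A missing InterOp folder is only reported as a note, because the example command in this guide skips the step that needs it.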
Summary
By following this guide, you can run MiCall locally using Docker, whether you are processing MiSeq run folders for research or experimenting with de novo assembly and resistance scoring. Once you are comfortable with local runs, you can explore more advanced features, such as custom assembly settings and detailed troubleshooting tools, to inspect and customize each step of MiCall’s pipeline.
Happy analyzing!