version control of software and data

Kive is an accessible computing framework for the version control of bioinformatic pipelines, along with their input and output datasets.


What does Kive do?

We developed Kive as a Django application. Django is a Python framework for building web applications.

Kive is built on a PostgreSQL relational database. For every version of each pipeline component and data set, the database records a digital “fingerprint” (MD5 checksum), the location in the filesystem, and the relations to other components and data sets.
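As an illustration of what such a fingerprint is, here is a minimal sketch that computes the MD5 checksum of a file with Python's standard `hashlib` module. The function name `md5_fingerprint` and the chunk size are our own choices for this example, not names taken from Kive's code base.

```python
import hashlib


def md5_fingerprint(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    Hypothetical helper for illustration only. The file is read in
    chunks so that large data sets do not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with identical contents produce the same digest, so a stored checksum can be compared against a freshly computed one to detect whether a script or data set has changed.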

Kive fully automates the execution of a pipeline version on a data set: it distributes jobs across computing resources (such as a computing cluster) and records every intermediate step in the database. When a subsequent pipeline version can re-use an intermediate step, Kive loads the recorded result instead of recomputing it, which minimizes computing time.
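The idea of re-using intermediate steps can be sketched as a cache keyed on checksums. This is a simplified illustration under our own assumptions, not Kive's actual implementation: the class `StepCache`, its `run` method, and the in-memory dictionary (standing in for the database) are all hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


class StepCache:
    """Hypothetical in-memory stand-in for recorded intermediate results."""

    def __init__(self):
        self._results = {}
        self.executions = 0  # counts how often a step was actually run

    def run(self, step_source: bytes, func, input_data: bytes) -> bytes:
        # A step is identified by the checksum of its code together with
        # the checksum of its input. If both match a recorded run, the
        # stored output is returned instead of executing the step again.
        key = (fingerprint(step_source), fingerprint(input_data))
        if key not in self._results:
            self.executions += 1
            self._results[key] = func(input_data)
        return self._results[key]


cache = StepCache()
first = cache.run(b"uppercase v1", bytes.upper, b"acgt")
second = cache.run(b"uppercase v1", bytes.upper, b"acgt")  # served from the cache
```

In this sketch, changing either the step's source or its input changes the key, so only steps that are affected by a new pipeline version are executed again.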

Read/write privileges to pipelines and data sets in Kive are specific to users and groups.
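A user-and-group privilege check of this kind might look like the following sketch. The class `AccessControl` and its field names are hypothetical and chosen for this example; Kive's real models may be organized differently.

```python
from dataclasses import dataclass, field


@dataclass
class AccessControl:
    """Hypothetical record of who may access a pipeline or data set."""

    owner: str
    users_allowed: set = field(default_factory=set)
    groups_allowed: set = field(default_factory=set)

    def can_access(self, user: str, user_groups) -> bool:
        # The owner and explicitly listed users always have access.
        if user == self.owner or user in self.users_allowed:
            return True
        # Otherwise access requires membership in at least one allowed group.
        return bool(self.groups_allowed & set(user_groups))


dataset_access = AccessControl(
    owner="alice",
    users_allowed={"bob"},
    groups_allowed={"sequencing-lab"},
)
```

Here `dataset_access.can_access("carol", {"sequencing-lab"})` is true because of group membership, while a user who is neither listed nor in an allowed group is refused.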

Kive also features a web-based graphical user interface, including a point-and-click toolkit for assembling and running pipelines that is implemented in HTML5 Canvas and JavaScript.

We used Kive to track versions of pipelines being developed in-house for processing and interpreting raw data sets from an Illumina MiSeq. The MiSeq pipeline comprises 8 scripts written in Python, Ruby, and R. For more information, read about how we fixed a problem with bad cycles in our example application.

Client requirements

The following browsers are supported:

| Browser | Basic Support | Bulk Upload Feature |
|---|---|---|
| Google Chrome | version 4 | version 5 |
| Firefox | version 3.6 | version 3.6 |
| Safari | version 3.1 | version 7 |
| Internet Explorer | version 9 | version 10 |


You can upload data, launch pipelines, and update pipelines through Kive’s API. You can also use our Python library to script calls to the API.

What are we working on?

You can see active tasks on our project board, or look at the current milestone’s burndown.