It is possible to write Python programs that use the ensj library to access Ensembl databases. This is achieved by running the program through the Jython interpreter.
Python programs can import the ensj classes directly and create their own drivers for later use.
An alternative is to import ensembl.py (source, latest versions),
which provides a facade for ensj and automatically loads drivers from
predefined config files. To use ensembl.py, add the line from ensembl
import * to your program and ensure the module is available on
your module search path.
Here is a simple example that fetches all the genes from a genomic
location and prints their accession IDs and how many exons they have.
Note that because the example uses import ensembl rather than from
ensembl import *, we need to qualify the driver: we use ensembl.human
rather than human. It is also possible to write from ensembl
import human and then call human.ga.fetch(...) if you don't want to
use the ensembl prefix.
#!/usr/bin/env jython
import ensembl
# Using the current human data release
# find the number of exons for all of the genes in the region 20m
# to 21m bases on chromosome 22.
genes = ensembl.human.ga.fetch(ensembl.Location("chromosome:22:20m-21m"))
for gene in genes:
    print gene.accessionID, gene.exons.size()
You can make the script executable on Unix by running chmod +x
myscript.py. Alternatively, pass it to the jython interpreter
directly: jython myscript.py
Here are some other examples:
What is Jython?
Jython is a Java implementation of the Python interpreter. It compiles Python scripts into Java bytecode for execution on a JVM. It supports Python language constructs and standard modules, and also allows access to Java types. This means we can write Python programs that use ensj's Java classes.
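As a small illustration of accessing Java types from Python code, the sketch below imports java.util.ArrayList when it is run under Jython and falls back to an ordinary Python list elsewhere. The function names and the gene labels are made up for this example and have nothing to do with ensj itself.

```python
import sys


def on_jython():
    """Report whether this script is running on Jython (sys.platform starts with 'java')."""
    return sys.platform.startswith("java")


def make_name_list(names):
    """Return the names in a java.util.ArrayList on Jython, or a Python list otherwise."""
    try:
        # The java package can only be imported when running on the JVM via Jython.
        from java.util import ArrayList
    except ImportError:
        return list(names)
    java_list = ArrayList()
    for name in names:
        java_list.add(name)
    return java_list


# Illustrative labels only; not fetched from any database.
labels = make_name_list(["geneA", "geneB"])
print("running on Jython: %s" % on_jython())
print("number of labels: %d" % len(labels))
```

The same import mechanism is what lets a Jython script reach the ensj classes, provided the relevant jar files are on the Java classpath.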