gget

Use this skill when a task needs quick bioinformatics lookups across genomic reference databases through the gget CLI or Python package.

When to Use

Finding Ensembl IDs, gene metadata, transcript details, or sequences.
Running quick BLAST or BLAT lookups without building a full local pipeline.
Fetching reference genome links and annotations from Ensembl.
Querying protein structure, pathway, cancer, expression, or disease-association modules through a single interface.
Creating a reproducible first-pass evidence log before moving to heavier tools such as Biopython, Snakemake, Nextflow, BLAST+, or database-specific clients.

Use a dedicated workflow instead of gget when the task requires regulated clinical interpretation, high-throughput production pipelines, or fine-grained control over database versions and local indexes.

Installation

Use a clean Python environment.

python -m venv .venv
. .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install --upgrade gget
gget --help

If uv is available:

uv venv
. .venv/bin/activate
uv pip install gget

Before relying on an older environment, upgrade gget and re-check the module docs. The upstream databases queried by gget change over time.
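One way to confirm which gget version an environment actually has is to read the installed package metadata. The sketch below uses only the Python standard library; the helper name `installed_version` is illustrative and is not part of gget.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("gget")
    if version is None:
        print("gget is not installed in this environment")
    else:
        print(f"gget {version}")
```

Comparing the reported version with the latest upstream release is still a manual step here; the sketch only reports what is installed locally.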

Basic Patterns

CLI shape:

gget <module> [arguments] [options]

Python shape:

import gget

result = gget.search(["BRCA1"], species="human")
print(result)

Common workflow:

Identify the species, assembly, gene ID type, and database needed.
Check the current module documentation for arguments.
Run a small query first.
Save output with an explicit filename and date.
Record module name, version, arguments, and database assumptions.
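The last two workflow steps can be sketched in a few lines of standard-library Python. The helpers `dated_filename` and `build_command` are hypothetical names introduced for illustration; the only gget-specific assumption is the `gget <module> [arguments]` shape and the `-o` output option used elsewhere in this document.

```python
import shlex
from datetime import date
from typing import List, Optional


def dated_filename(module: str, label: str, ext: str,
                   day: Optional[date] = None) -> str:
    """Build an explicit output name such as 2026-05-11_search_brca1.json."""
    day = day or date.today()
    return f"{day.isoformat()}_{module}_{label}.{ext}"


def build_command(module: str, args: List[str], out_file: str) -> List[str]:
    """Assemble the argv list for: gget <module> [arguments] -o <out_file>."""
    return ["gget", module, *args, "-o", out_file]


out = dated_filename("search", "brca1", "json", date(2026, 5, 11))
cmd = build_command("search", ["-s", "human", "brca1"], out)

# shlex.join produces a copy-pasteable command string for the log.
print(shlex.join(cmd))
# prints: gget search -s human brca1 -o 2026-05-11_search_brca1.json
```

Keeping the command as a list means the same value can be logged as text and, if desired, passed to `subprocess.run` without re-parsing.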

Common Modules

Use current upstream docs for exact arguments. These modules are common first choices:

gget search: find Ensembl IDs from search terms.
gget info: retrieve metadata for Ensembl IDs, combining information from Ensembl, UniProt, and NCBI.
gget seq: fetch nucleotide or amino-acid sequences.
gget ref: retrieve reference genome download links.
gget blast: run a quick BLAST query.
gget blat: locate a sequence against supported genome assemblies.
gget muscle: run multiple sequence alignment.
gget diamond: run local sequence alignment against reference sequences.
gget alphafold and gget pdb: inspect protein-structure references.
gget enrichr, gget opentargets, gget archs4, gget bgee, gget cbio, and gget cosmic: explore enrichment, target, expression, cancer, and disease association data.

Do not assume every module supports every Python version or dependency set. Some optional scientific dependencies have narrower version support than the core package.
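A quick pre-flight check can show whether an environment is likely to support a given module before a query fails midway. The sketch below is a minimal standard-library example; `missing_modules` is a hypothetical helper, and the package list is a placeholder to be replaced with whatever the module's current documentation names.

```python
import importlib.util
import sys
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    # find_spec locates a module without importing it; pass top-level names only.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", ".".join(str(part) for part in sys.version_info[:3]))
    # Placeholder list: replace with the packages named in the module's docs.
    wanted = ["gget"]
    print("missing:", missing_modules(wanted))
```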

Quick Examples

Find genes:

gget search -s human brca1 dna repair -o brca1-search.json

Fetch gene metadata:

gget info ENSG00000012048 -o brca1-info.json

Fetch a sequence:

gget seq ENSG00000012048 -o brca1-seq.fa

Run a small BLAST query:

gget blast "MEEPQSDPSVEPPLSQETFSDLWKLLPEN" -l 10 -o blast-results.json

Python example:

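import gget

# Search Ensembl for genes matching any of the search terms (tabular result by default).
genes = gget.search(["BRCA1", "DNA repair"], species="human")
# Fetch metadata for the BRCA1 Ensembl gene ID.
info = gget.info(["ENSG00000012048"])
# Fetch the nucleotide sequence for the same ID.
sequence = gget.seq("ENSG00000012048")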

Reproducibility Log

For scientific outputs, include enough metadata to replay the query.

| Date | gget version | Module | Query | Species/assembly | Output | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| 2026-05-11 | output of `gget --version` | search | `BRCA1 DNA repair` | human | `brca1-search.json` | Docs checked before run |

Also record:

Python version and environment manager.
Any optional dependency installed through gget setup.
Database-specific identifiers returned by the query.
Whether output is JSON, CSV, FASTA, or a DataFrame export.
Any failures that were resolved by upgrading gget.
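Rows for the log table can be generated rather than typed by hand, which keeps the columns consistent between runs. This is a minimal sketch with a hypothetical helper, `log_row`; the version string in the example is a placeholder, not a real gget release.

```python
import platform
from datetime import date
from typing import Optional


def log_row(gget_version: str, module: str, query: str, species: str,
            output: str, notes: str = "", day: Optional[date] = None) -> str:
    """Format one markdown row matching the reproducibility table columns."""
    day = day or date.today()
    cells = [day.isoformat(), gget_version, module,
             f"`{query}`", species, f"`{output}`", notes]
    # Escape pipes so a cell value cannot break the table layout.
    return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"


row = log_row("X.Y.Z", "search", "BRCA1 DNA repair", "human",
              "brca1-search.json", "Docs checked before run",
              day=date(2026, 5, 11))
print(row)

# The interpreter version belongs in the log as well.
print(f"Python {platform.python_version()}")
```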

Review Checklist

Did you upgrade or verify the installed gget version?
Did you check the current upstream module docs before using arguments?
Is the species or assembly explicit?
Are identifiers preserved exactly, including Ensembl/UniProt prefixes?
Is the result labeled as database output rather than clinical interpretation?
Is the query reproducible from the saved command or Python snippet?
Are optional dependencies installed in an isolated environment?
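The identifier check in this list can be partly automated. The sketch below tests whether a string has the usual shape of an Ensembl stable ID: `ENS`, an optional species prefix, a feature-type code, eleven digits, and an optional version suffix. The function name is illustrative, and the pattern is an assumption about the common format only. Some species in Ensembl use identifiers from other databases, so a `False` result is a prompt to re-check rather than proof of an error.

```python
import re

# "ENS" + optional species letters + feature code + 11 digits + optional ".version"
ENSEMBL_ID = re.compile(r"ENS[A-Z]*?(?:FM|GT|E|G|P|R|T)\d{11}(?:\.\d+)?")


def looks_like_ensembl_id(identifier: str) -> bool:
    """Return True if the string matches the common Ensembl stable ID shape."""
    # fullmatch rejects stray whitespace, lowercase letters, and truncated digits.
    return ENSEMBL_ID.fullmatch(identifier) is not None


for candidate in ["ENSG00000012048", "ensg00000012048", "ENSG0000012048"]:
    print(candidate, looks_like_ensembl_id(candidate))
```

Only the first candidate passes; the second has been lowercased and the third has lost a digit, which are the kinds of silent edits the checklist item is meant to catch.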

References

Overall Score

88/100 (Grade A, Excellent)

Safety: 88
Quality: 90
Clarity: 88
Completeness: 85

Summary

The gget skill provides structured guidance for using the gget CLI and Python package to query genomic reference databases (Ensembl, UniProt, PDB, etc.) for gene metadata, sequences, BLAST results, and enrichment data. It covers installation, common module patterns, reproducibility logging, and clear scope boundaries separating quick lookups from production pipelines.

Detected Capabilities

Python package installation and management
CLI execution (gget commands)
Network requests to genomic databases (Ensembl, UniProt, PDB, etc.)
File output writing (JSON, FASTA, CSV)
Python scripting with gget API
Virtual environment setup and management

Trigger Keywords

Phrases that MCP clients use to match this skill to user intent.

genomic database lookup
gene metadata search
ensembl query
blast sequence search
genome reference fetch
bioinformatics evidence log

Risk Signals

INFO: Network requests to external genomic databases (Ensembl, UniProt, PDB). Location: throughout skill; Common Modules and examples sections.
INFO: Python package installation from pip. Location: Installation section.
INFO: File writes to local project directory (.json, .fa, .csv outputs). Location: Quick Examples and Reproducibility Log sections.

Referenced Domains

External domains referenced in skill content, detected by static analysis.

doi.org
github.com
pachterlab.github.io

Use Cases

Look up gene identifiers and metadata from Ensembl
Retrieve genomic sequences (nucleotide or protein) by ID
Run quick BLAST or BLAT searches without local infrastructure
Fetch reference genome download links and annotations
Query protein structures, pathways, cancer, expression, and disease associations
Create a reproducible evidence log for initial bioinformatics investigations before committing to heavier pipelines

Quality Notes

Excellent scope definition: clearly separates use cases (quick lookups) from non-use cases (clinical interpretation, production pipelines)
Strong reproducibility emphasis with explicit metadata logging template (version, date, arguments, database state)
Detailed review checklist helps users validate queries before relying on results
Clear installation instructions with alternative package managers (pip and uv) and virtual environment isolation
Comprehensive module list with brief descriptions helps agents select appropriate tools
Good edge case coverage: warns about version-specific dependencies, database changes over time, and need to verify current docs
References section includes both tool docs and peer-reviewed publication
Clear demarcation between CLI and Python patterns
Security-conscious: recommends clean environments and explicit recording of assumptions

Model: claude-haiku-4-5-20251001
Analyzed: May 15, 2026
