JASPAR familial binding sites

Description

This UCSC Genome Browser track hub displays genome-wide predicted binding sites for each familial binding profile in the latest release of the JASPAR database.

Genomes

A list of genomes for which we provide predictions is available here.

Methods

We adapted the methodology from (Vierstra et al. Nature 2020) to generate familial binding profiles from the clusters found by RSAT matrix-clustering (Castro-Mondragon et al. NAR 2017), run with the following parameters: `-lth w 5 -lth cor 0.8 -lth Ncor 0.6 -hclust_method average -calc sum -metric_build_tree Ncor -label_in_tree name -return json`. With such stringent similarity thresholds, motifs representing TF dimers are not grouped with their monomeric components. This is intentional: we explicitly aimed to provide distinct familial binding profiles for motifs derived from dimeric and monomeric TF binding. Except for the motif clustering step, we used the default parameters from (Vierstra et al. Nature 2020). As a final step, we removed the familial binding sites whose start coordinate was lower than 0 or whose end coordinate was greater than the chromosome size. Such sites are artifacts of expanding the original binding sites to fit the familial profile size.
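The final filtering step can be illustrated with a minimal Python sketch. It is not the code from the repository; it assumes sites are held as BED-style tuples (chromosome, start, end, name) and that chromosome sizes are available as a dictionary. The function name `drop_out_of_bounds`, the example names, and the example coordinates are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Assumed representation: (chrom, start, end, name) with BED-style coordinates.
Site = Tuple[str, int, int, str]


def drop_out_of_bounds(sites: Iterable[Site],
                       chrom_sizes: Dict[str, int]) -> Iterator[Site]:
    """Yield only the sites that lie entirely within their chromosome."""
    for chrom, start, end, name in sites:
        # A chromosome missing from chrom_sizes raises KeyError (a choice
        # made for this sketch, not something the source specifies).
        size = chrom_sizes[chrom]
        # Discard sites whose expansion ran past either chromosome end.
        if start < 0 or end > size:
            continue
        yield (chrom, start, end, name)


# Illustrative data only: a 1000 bp chromosome and four expanded sites.
chrom_sizes = {"chr1": 1000}
sites = [
    ("chr1", -3, 12, "cluster_1"),      # start below 0: removed
    ("chr1", 10, 25, "cluster_1"),      # inside the chromosome: kept
    ("chr1", 990, 1004, "cluster_2"),   # end beyond chromosome size: removed
    ("chr1", 985, 1000, "cluster_2"),   # end equal to chromosome size: kept
]
kept = list(drop_out_of_bounds(sites, chrom_sizes))
```

A site whose end equals the chromosome size is kept, since the rule above removes only sites whose end is greater than the chromosome size.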

The code for the familial binding profiles computation is freely available at https://bitbucket.org/CBGR/jaspar_familial_profiles_construction/.

Data Availability

All data is freely available.

Display Conventions and Configuration

Boxes represent predicted binding sites for each of the familial binding profiles in JASPAR.

Each familial profile is named after the number of the cluster it belongs to. In addition, each familial profile is displayed in a specific RGB color according to the following table: