Bacterial one-hybrid (B1H) and two-hybrid (B2H) systems; Bacterial
one-hybrid and two-hybrid systems are quick and sensitive tools for studying a
DNA-binding protein of interest. The two-hybrid approach was introduced in 1989
by Fields and Song, and the bacterial one-hybrid system has undergone numerous
modifications since its inception in 2005. One-hybrid and two-hybrid systems
provide a simple and efficient method for identifying and characterizing novel
protein-DNA interactions and can be implemented in either a bacterial or a
yeast selection system. However, a bacterial selection system offers advantages
over the corresponding yeast system (Meng et al. 2006). These systems are
highly sensitive because sequence-specific DNA-protein recognition takes place
while the proteins are in a dynamic, native state. They have three components:
a transcription factor expression vector, a library of DNA binding sites, and
the bacterial host (Vidal et al. 1996).
Any DNA-binding regulatory protein or domain is expressed as a fusion to the
α-subunit of RNA polymerase. Two reporter genes, HIS3 and URA3, lie downstream
of a promoter, and restriction sites upstream of the promoter allow a random
oligonucleotide library to be inserted. When the DNA-binding domain (referred
to as the bait) recognizes a target site, it recruits RNA polymerase to the
promoter, and transcription of both reporter genes starts. The HIS3 and URA3
reporters provide, respectively, positive and negative selection of the
bacterial strain (Meng and Wolfe 2006). Culturing the bacteria on minimal
medium containing 3-amino-triazole (3-AT), a competitive inhibitor of the HIS3
gene product, selects for an active promoter. Conversely, growing the bacteria
on medium containing 5-fluoro-orotic acid (5-FOA) leads to production of a
toxic compound and thus selects against an active URA3 promoter. In practice, a
plasmid carrying the sequences under examination is designed so that the
reporter confers resistance to a compound in a specific medium, and this
plasmid is transformed into the bacteria. A second plasmid carrying the gene
for the protein under examination is transformed into the same bacteria. If the
expressed protein binds the target sequence on the first plasmid, RNA
polymerase is recruited and the resistance gene is transcribed. Such isolates
can grow on the selective medium containing the compound. Plasmids are then
extracted from the surviving colonies and sequenced to determine the binding
site. The recovered sequences can be analyzed with motif-searching tools and
used to scan the genome for other regions with binding potential (Meng et al.
2005).
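
The motif-searching step can be illustrated with a minimal sketch. It assumes
that the sequenced inserts have already been trimmed to aligned sites of equal
length, builds a log-odds position weight matrix from them, and scans one
strand of a sequence for windows that score above a cutoff. The site sequences,
the genomic fragment, the pseudocount, the uniform background and the threshold
are all hypothetical values chosen for illustration; they are not taken from
the cited studies.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for i in range(length):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[i].upper()] += 1
        total = sum(counts.values())
        # log2 of observed frequency over a uniform background frequency
        pwm.append({b: math.log2((counts[b] / total) / background) for b in "ACGT"})
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, window, score) for each forward-strand window >= threshold."""
    width = len(pwm)
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits

# Hypothetical sites recovered from surviving colonies (illustrative only).
selected_sites = ["TGACTCA", "TGACTCA", "TGAGTCA", "TGACTCA", "TGACGCA"]
pwm = build_pwm(selected_sites)

# Hypothetical genomic fragment to search.
genome_fragment = "AAGCTTGACTCAGGCCTTTGAGTCATT"
hits = scan(genome_fragment, pwm, threshold=7.0)
for start, window, score in hits:
    print(start, window, score)
```

A practical search would also scan the reverse complement and would calibrate
the threshold against a background model of the genome under study.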

The one-hybrid system is used to determine and characterize DNA-protein binding
sites, whereas the two-hybrid versions can assess not only protein–protein but
also protein–DNA interactions (Chien et al. 1991; Field et al. 1993; Miller and
Stagljar 2004; Karimova et al. 2017; Lin and Lai 2017). The two-hybrid
mechanism is based on the ability of a separated DNA-binding domain (DBD;
referred to as the bait) and activation domain (AD; referred to as the prey) to
reconstitute an active transcription factor complex when interacting proteins
are fused to these domains. On its own, the prey has no affinity for the
promoter elements, so it is not targeted to the promoter and does not activate
transcription of the reporter gene. When the bait and prey proteins interact, a
functional transcription factor is reconstituted at the promoter upstream of
the reporter gene and activates transcription (Joung et al. 2000; Gieseche and
Joung 2007; Maple and Moller 2007).
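
The reconstitution logic described above reduces to a simple AND condition,
which the following toy model makes explicit. It is only a qualitative sketch:
the function and variable names are hypothetical, and real reporter output is
graded rather than strictly on or off.

```python
from itertools import product

def reporter_transcribed(bait_binds_promoter: bool, prey_binds_bait: bool) -> bool:
    """Reporter is on only when the DBD fusion (bait) occupies the promoter
    AND the AD fusion (prey) is brought there by the bait-prey interaction."""
    return bait_binds_promoter and prey_binds_bait

# Enumerate all four combinations of the two conditions.
truth_table = {
    (binds, interacts): reporter_transcribed(binds, interacts)
    for binds, interacts in product([False, True], repeat=2)
}

for (binds, interacts), active in truth_table.items():
    print(f"bait on promoter={binds!s:5}  bait-prey interaction={interacts!s:5}  reporter={active}")
```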

Chromatin immunoprecipitation-based methods; Chromatin immunoprecipitation
(ChIP) is one of the most powerful experimental techniques for characterizing
the binding of proteins to DNA, whether direct or indirect (Milne et al. 2009).
The first ChIP assay was used by Gilmour and Lis in 1984 as a laboratory
technique for revealing the interaction of RNA polymerase with target DNA
regions in E. coli and Drosophila (Carey et al. 2012). ChIP has since been
adopted as a potent technique for analyzing and characterizing histone
modifications as well as transcription factor binding sites and protein–DNA
interactions occurring in vivo (Gade and Kalvakolanu 2012). In this method,
proteins are cross-linked to the DNA regions they occupy using formaldehyde,
and the treated chromatin is extracted from the cells. The chromatin is then
sheared into short fragments (200 to 1000 base pairs) by enzymatic digestion or
sonication. The DNA-protein complexes of interest are precipitated using an
antibody specific to the protein (Collas 2010). The immunoprecipitated DNA is
released by reversing the cross-links, commonly with acid or elevated
temperature, and then purified (Milne et al. 2009). The DNA is then identified
by methods including, but not limited to, Southern blotting, conventional PCR,
quantitative PCR, hybridization to arrays, or cloning and sequencing. Many
powerful methods based on ChIP have been developed for analyzing and
characterizing protein-binding DNA sites, e.g. chromatin immunoprecipitation
with DNA microarray, referred to as ChIP-chip (Lieb et al. 2001); ChIP-serial
analysis of gene expression, referred to as ChIP-SAGE; chromatin
immunoprecipitation-sequencing, referred to as ChIP-seq; chromatin
immunoprecipitation-paired-end tags, referred to as ChIP-PET (Wu et al. 2013);
and ChIP-exo (Pugh 2012). The Abcam website contains a variety of optimized
ChIP protocols (Carey et al. 2009). One problem with ChIP methods is that
antibodies, even highly specific ones, can cross-react with other nuclear
proteins.
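
When the immunoprecipitated DNA is quantified by quantitative PCR, enrichment
is often reported as a percentage of the input chromatin. The sketch below
shows that arithmetic under two stated assumptions: the PCR doubles the product
every cycle (100% efficiency), and a known fraction of the chromatin was set
aside as input. The percent-input convention, the Ct values and the 1% input
fraction are illustrative assumptions and do not come from the protocols cited
above.

```python
import math

def percent_input(ct_ip, ct_input, input_fraction):
    """Percent of input recovered in the IP, assuming 100% PCR efficiency.

    ct_ip          -- Ct of the immunoprecipitated sample
    ct_input       -- Ct of the diluted input sample
    input_fraction -- fraction of chromatin saved as input (0.01 for 1%)
    """
    dilution_factor = 1.0 / input_fraction
    # Shift the input Ct to what undiluted (100%) input would have given.
    adjusted_input_ct = ct_input - math.log2(dilution_factor)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)

# Hypothetical Ct values for a specific antibody and a mock (control) IP.
target = percent_input(ct_ip=27.0, ct_input=25.0, input_fraction=0.01)
mock = percent_input(ct_ip=31.0, ct_input=25.0, input_fraction=0.01)
fold_over_mock = target / mock

print(f"specific IP: {target:.4f}% of input")
print(f"mock IP:     {mock:.4f}% of input")
print(f"fold enrichment over mock: {fold_over_mock:.1f}")
```

Comparing the specific antibody with a mock immunoprecipitation in this way
gives a rough check on the cross-reactivity problem noted above, although it
cannot rule it out.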


Databases containing DNA-protein interaction information

Recently developed experimental technologies provide efficient and
comprehensive methods not only for in vitro identification of specific
protein-nucleic acid interactions but also for understanding the molecular
recognition process that mediates them. Many protein-nucleic acid complexes
have been solved at high resolution by X-ray crystallography and are publicly
available to researchers in the Protein Data Bank (PDB). These experimentally
determined structures are a valuable source of information on the binding modes
and specificity of protein-nucleic acid binding. Consequently, many
physicochemical and structural features of protein-DNA interaction have been
discovered in the last few years. To facilitate data access, retrieval, and
site navigation, and to make the data available for analysis and prediction,
several databases of DNA-protein complexes and associated software have been
developed. These databases cover different aspects of the interaction. Among
them, the AANT database (Hoffman et al. 2004), which is freely available,
extracts all nucleic acid-protein interactions at the residue level from
structures deposited in the PDB. It offers categories for experimentally
determined protein-nucleic acid interactions and provides a graphical interface
and statistical information on the interactions between nucleic acids and
proteins.
The AANT database is updated weekly. The ProNuc database stored a small set of
protein-DNA binding motifs with 3D structural information; it is now inactive
and no longer accessible. The NPIDB database (Spirin et al. 2007) contains
structural information on DNA/RNA-protein interactions derived from the PDB.
The data are stored as files in PDB format and include the hydrogen bonds and
hydrophobic interactions between proteins and nucleic acids. NPIDB is updated
automatically every week. It provides several web-based tools for analyzing
hydrophobic features and potential hydrogen bonds in DNA/RNA-protein
interactions and for visualizing binding-site structures. NPIDB was upgraded in
2013 with a new web interface, new tools for calculating intermolecular
DNA/RNA-protein interactions, a classification by SCOP families, and data on
conserved water molecules at the DNA-protein interface (Kirsanov et al. 2013).
The BIPA database provides several physicochemical properties of nucleic
acid-protein interaction sites. For each site, the size, shape, residue
propensity, secondary structure composition, and intermolecular interactions
are calculated and stored (Lee and Blundell 2009). Additional annotations are
provided for each nucleic acid-protein complex, such as structure-based
multiple sequence alignments of DNA-binding protein families and annotations of
the local nucleic acid-protein environments, which influence the acceptability
of mutations at a given position in a protein family (Lee and Blundell 2009).
The 3D-Footprint database contains structure-based specificities and sequence
logos (footprints) for all DNA-binding proteins deposited in the PDB. Its
architecture allows users to browse DNA-binding proteins by name, find proteins
that recognize a similar DNA motif, and run BLAST searches for similar
DNA-binding proteins with interface residues highlighted in the resulting
alignments (Contreras-Moreira 2010). 3D-Footprint is updated and curated weekly
against the PDB complexes, and it clusters DNA-binding proteins by structural
similarity. The DNA-protein interface is represented graphically and as a
footprint diagram (Contreras-Moreira 2010). The DBBP database stores
structurally analyzed hydrogen bonds in DNA/RNA-protein complexes extracted
from the PDB and offers information on hydrogen-bond interactions between
proteins and nucleic acids at various levels, from the residue to the
atom level (Park et al.,
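
To make the idea of residue-level and atom-level contact extraction concrete,
the following sketch reads ATOM records in the fixed-column PDB format and
reports protein atoms that lie within a distance cutoff of a nucleic acid atom.
It is a simplified illustration and not the procedure used by any of the
databases above: it uses distance alone (a real hydrogen-bond search would also
check donor and acceptor types and geometry), the 3.5 Å cutoff is an assumed
value, and the four atoms are made-up coordinates written by a small helper.

```python
import math

NUCLEOTIDES = {"DA", "DC", "DG", "DT", "A", "C", "G", "U"}

def parse_atoms(pdb_text):
    """Parse ATOM records using the fixed column layout of the PDB format."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        atoms.append({
            "atom": line[12:16].strip(),
            "res": line[17:20].strip(),
            "chain": line[21],
            "resseq": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms

def interface_contacts(atoms, cutoff=3.5):
    """List protein-atom / nucleic-acid-atom pairs closer than `cutoff` angstroms."""
    protein = [a for a in atoms if a["res"] not in NUCLEOTIDES]
    nucleic = [a for a in atoms if a["res"] in NUCLEOTIDES]
    contacts = []
    for p in protein:
        for n in nucleic:
            distance = math.dist(p["xyz"], n["xyz"])
            if distance <= cutoff:
                contacts.append((p["chain"], p["res"], p["resseq"], p["atom"],
                                 n["chain"], n["res"], n["resseq"], n["atom"],
                                 round(distance, 2)))
    return contacts

def atom_line(serial, name, res, chain, resseq, x, y, z):
    """Helper for the toy data: write one ATOM record with correct columns."""
    return (f"ATOM  {serial:5d} {name:<4} {res:>3} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

# Made-up coordinates: one arginine atom near a guanine atom, one lysine far away.
toy_pdb = "\n".join([
    atom_line(1, "NH1", "ARG", "A", 10, 0.0, 0.0, 0.0),
    atom_line(2, "NZ", "LYS", "A", 20, 10.0, 10.0, 10.0),
    atom_line(3, "O6", "DG", "B", 5, 2.9, 0.0, 0.0),
    atom_line(4, "O4", "DT", "B", 6, 0.0, 8.0, 0.0),
])

contacts = interface_contacts(parse_atoms(toy_pdb))
for contact in contacts:
    print(contact)
```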

The TRANSFAC database provides the largest collection of experimentally derived
data on cis-acting DNA-transcription factor interactions, which were entered
manually (Wingender et al. 1996). TRANSFAC data are organized in several
tables: FACTOR (information on transcription factors), SITES (genomic or
artificial binding sites), GENE (regulated genes), MATRIX (nucleotide
distribution matrices derived from collections of binding sites), CLASS
(grouping of transcription factors according to their DNA-binding domains),
CELL (cell lines and other factor sources), and CONS (consensus descriptions).
It also links to entries in several databases, including EMBL, SwissProt,
PROSITE, and the Transcription Regulatory Region Database (TRRD). The tables
are linked to the corresponding entries in the other tables and databases.
TRANSFAC was revised twice, in 2003 and 2006, on the basis of new laboratory
data (Matys et al. 2006).
In the latest revision of the database, in 2006, some new tables and new fields
for several tables were introduced (Table ). New links and additional
information were added to other tables, along with new links to databases such
as UniGene, Ensembl, and EntrezGene. Moreover, a complementary database called
TRANSCompel, which contains information on composite elements, was attached to
TRANSFAC (Matys et al. 2006). TRANSCompel consists of two tables: COMPEL
(general information about the composite elements) and EVIDENCE (a brief list
of the experimental evidence confirming physical and functional interactions
between the corresponding TFs). Whereas TRANSFAC focuses on single binding
sites, TRANSCompel adds information on two or more protein binding sites near
the primary site, in terms of both synergistic and antagonistic effects (Matys
et al. 2010).
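
The relationship between a collection of binding sites, a nucleotide
distribution matrix and a consensus description can be sketched as follows. The
example counts bases per position across aligned sites and then collapses each
column to an IUPAC symbol with a simple threshold rule. The thresholds (a
single base above 50%, or two bases together reaching 75%) and the site
sequences are assumptions made for illustration; they are not TRANSFAC's own
rules or data.

```python
# IUPAC symbols for single bases and two-base ambiguities.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def count_matrix(sites):
    """Count A, C, G and T at each position of aligned, equal-length sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for i in range(length):
        column = {base: 0 for base in "ACGT"}
        for site in sites:
            column[site[i].upper()] += 1
        matrix.append(column)
    return matrix

def consensus(matrix, single=0.5, double=0.75):
    """Collapse each column to one IUPAC symbol using simple frequency cutoffs."""
    symbols = []
    for column in matrix:
        total = sum(column.values())
        ranked = sorted(column.items(), key=lambda item: item[1], reverse=True)
        (base1, count1), (base2, count2) = ranked[0], ranked[1]
        if count1 / total > single:
            symbols.append(base1)                            # one dominant base
        elif (count1 + count2) / total >= double:
            symbols.append(IUPAC[frozenset(base1 + base2)])  # two-base ambiguity
        else:
            symbols.append("N")                              # no clear preference
    return "".join(symbols)

# Hypothetical aligned binding sites for one factor.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTCA"]
matrix = count_matrix(sites)
site_consensus = consensus(matrix)
print(site_consensus)
```

The count matrix keeps the full distribution at each position, whereas the
consensus string is a lossy summary of it, which is one reason both kinds of
record are useful side by side.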