Bacterial one-hybrid (B1H) and two-hybrid (B2H) systems; Bacterial one-hybrid and two-hybrid systems are quick and sensitive methods for characterizing a DNA-binding protein of interest. The hybrid approach was introduced in 1989 by Fields and Song, and the bacterial one-hybrid system has undergone numerous modifications since its inception in 2005. One-hybrid and two-hybrid systems provide a simple and efficient way to identify and characterize novel protein-DNA interactions, and they can be implemented not only in bacterial selection systems but also in yeast selection systems.
However, a bacterial selection system offers advantages over the corresponding system in yeast (Meng et al. 2006). These systems are highly sensitive because sequence-specific DNA-protein recognition occurs while the proteins are in a dynamic, native state. They have three parts: a transcription factor expression vector, a library of DNA binding sites, and the bacterial host (Vidal et al. 1996). Any DNA-binding regulatory protein or domain is expressed as a fusion to the α-subunit of RNA polymerase. Two reporter genes, HIS3 and URA3, lie downstream of the promoter, and restriction sites upstream of the promoter allow a randomized oligonucleotide library to be inserted.
When the DNA-binding domain (referred to as the bait) recognizes its target site, it recruits RNA polymerase to the promoter, and transcription of both reporter genes starts. The HIS3 and URA3 reporters provide positive and negative selection, respectively, in the bacterial strain (Meng and Wolfe 2006). Growing the bacteria on minimal medium containing 3-amino-triazole (3-AT), a competitive inhibitor of the HIS3 gene product, selects for cells with an active promoter. Conversely, growing the bacteria on medium containing 5-fluoro-orotic acid (5-FOA) leads to the production of a toxic compound in cells expressing URA3, and therefore selects against an active promoter. Plasmids carrying the sequences under examination are designed so that binding drives a gene conferring resistance to a compound in a specific medium. The designed plasmid is then transformed into the bacteria.
A second plasmid, carrying the gene for the protein under examination, is transformed into the bacteria together with the plasmid described above. If the expressed protein binds the target sequence on the first plasmid, RNA polymerase is recruited and transcription of the resistance gene is activated. Isolates with this property can grow on the selective medium containing the compound. Plasmids are extracted from the surviving colonies and sequenced to determine the binding site. These sequences can then be analyzed with motif-searching tools to scan the genome for other regions with binding potential (Meng et al. 2005).
The one-hybrid system is used to determine and characterize DNA-protein binding sites, whereas the two-hybrid versions can assess not only protein–protein but also protein–DNA interactions (Chien et al. 1991; Field et al. 1993; Miller and Stagljar 2004; Karimova et al. 2017; Lin and Lai 2017).
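As a rough illustration of the analysis step that follows sequencing, the binding sites recovered from surviving colonies can be summarized as a position frequency matrix and a consensus sequence. The sketch below assumes the sites are already aligned and of equal length; the sequences and the function names are invented for illustration and do not come from any of the cited studies.

```python
from collections import Counter

BASES = "ACGT"

def position_frequency_matrix(sites):
    """Count each base at each position of equal-length, pre-aligned sites."""
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for column in zip(*sites):
        counts = Counter(base.upper() for base in column)
        matrix.append({base: counts.get(base, 0) for base in BASES})
    return matrix

def consensus(matrix):
    """Most frequent base per position (ties resolved in A, C, G, T order)."""
    return "".join(max(BASES, key=lambda b: column[b]) for column in matrix)

# Hypothetical selected sites, invented purely for illustration.
selected_sites = ["TGACGTCA", "TGACGTAA", "TGATGTCA", "TGACGCCA"]
pfm = position_frequency_matrix(selected_sites)
print(consensus(pfm))
```

In practice the recovered sequences would first have to be aligned by a motif-discovery tool; the counting step shown here only applies once that alignment exists.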
The mechanism of the two-hybrid system is based on the ability of a separated DNA-binding domain (DBD; referred to as the bait) and an activation domain (AD; referred to as the prey) to reconstitute an active transcription factor complex when interacting proteins are fused to these domains. On its own, the prey has no affinity for the promoter elements, so it is not targeted to the promoter and does not activate transcription of the reporter gene. When the bait and prey proteins interact with each other, a functional transcription factor is reconstituted at the promoter upstream of the reporter gene and activates transcription (Joung et al. 2000; Giesecke and Joung 2007; Maple and Moller 2007).
Chromatin immunoprecipitation based methods; Chromatin immunoprecipitation (ChIP) is one of the most powerful experimental techniques for characterizing both direct and indirect binding of proteins to DNA (Milne et al. 2009). The first ChIP assay was used by Gilmour and Lis in 1984 as a laboratory technique for revealing the interaction of RNA polymerase with its DNA target regions in E. coli and Drosophila (Carey et al. 2012). ChIP has since been adopted as a potent technique for analyzing and characterizing not only histone modifications but also transcription factor binding sites and protein–DNA interactions occurring in vivo (Gade and Kalvakolanu 2012). In this method, proteins are cross-linked to the DNA regions they bind using formaldehyde, and the treated chromatin is extracted from the cells. The chromatin is then sheared into short fragments (200 to 1000 base pairs) by enzymatic digestion or sonication. The DNA-protein complexes of interest are precipitated using an antibody specific to the protein (Collas 2010).
The immunoprecipitated DNA is released by reversing the cross-links, commonly with acid or elevated temperature, and then purified (Milne et al. 2009). The DNA is then identified by different methods including, but not limited to, Southern blotting, conventional PCR, quantitative PCR, hybridization to arrays, or cloning and sequencing. Many powerful ChIP-based methods have been developed for analyzing and characterizing protein binding sites on DNA, e.g. chromatin immunoprecipitation with DNA microarrays, referred to as ChIP-chip (Lieb et al. 2001), ChIP with serial analysis of gene expression, referred to as ChIP-SAGE, chromatin immunoprecipitation sequencing, referred to as ChIP-seq, chromatin immunoprecipitation with paired-end tags, referred to as ChIP-PET (Wu et al. 2013), and ChIP-exo (Pugh 2012). The Abcam website (http://www.abcam.com/) contains a variety of optimized ChIP protocols (Carey et al. 2009). One problem with ChIP methods is that antibodies, even highly specific ones, have the potential to cross-react with other nuclear proteins.
Databases containing DNA-protein interaction information; Recently developed experimental technologies provide efficient and comprehensive methods not only for the in vitro identification of specific protein-nucleic acid interactions but also for understanding the molecular recognition process that mediates them.
Recently, many protein-nucleic acid complexes have been studied at high resolution by X-ray crystallography, and their structures are publicly available to researchers in the Protein Data Bank (PDB). These experimentally determined structures are a valuable source of information on the binding modes and the specificity of protein/nucleic acid binding. Consequently, many physicochemical and structural features of protein-DNA interactions have been discovered during the last few years.
To facilitate data access, retrieval, and navigation, and to make the available data usable for analysis and prediction, several databases for DNA-protein complexes and associated software have been developed. Many databases covering different aspects of these interactions are now available. Among them, the AANT database (Hoffman et al. 2004), freely available at http://aant.icmb.utexas.edu/, extracts all nucleic acid-protein interactions at the residue level from structures deposited in the PDB. It offers categories for experimentally determined protein-nucleic acid interactions and provides a graphical interface and statistical information on the interactions between nucleic acids and proteins. The AANT database is updated weekly. The ProNuc database stored a small set of protein-DNA binding motifs with 3D structural information. The ProNuc database is now inactive and no longer accessible.
The NPIDB database (http://npidb.belozersky.msu.ru/; Spirin et al. 2007) contains structural information on DNA/RNA-protein interactions derived from the PDB. The data are deposited as files in PDB format and include the hydrogen bonds and hydrophobic interactions between proteins and nucleic acids. The NPIDB database is updated automatically every week. NPIDB provides several web-based tools for analyzing hydrophobic features and potential hydrogen bonds in DNA/RNA-protein interactions and for visualizing binding site structures.
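To make the idea of extracting potential hydrogen bonds from a PDB-format file concrete, the following minimal sketch lists nitrogen/oxygen atom pairs between protein and nucleic acid residues that lie within a distance cutoff. It is a simplification under stated assumptions: only the donor-acceptor distance is checked (no angles or hydrogen positions), the 3.5 Å cutoff is a common rule of thumb rather than a value taken from NPIDB, and the function names and coordinates are hypothetical.

```python
import math

# Residue names treated as nucleotides (DNA and RNA); anything else counts as protein.
NUCLEOTIDES = {"DA", "DC", "DG", "DT", "A", "C", "G", "U"}

def parse_atoms(pdb_lines):
    """Read ATOM records using the fixed-column PDB layout (HETATM is skipped)."""
    atoms = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        atoms.append({
            "name": line[12:16].strip(),
            "res": line[17:20].strip(),
            "chain": line[21],
            "resseq": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms

def candidate_hbonds(atoms, cutoff=3.5):
    """Return (protein_atom, nucleic_atom, distance) for N/O pairs within cutoff."""
    # Assumption: the element is inferred from the first letter of the atom name.
    polar = [a for a in atoms if a["name"][:1] in ("N", "O")]
    nucleic = [a for a in polar if a["res"] in NUCLEOTIDES]
    protein = [a for a in polar if a["res"] not in NUCLEOTIDES]
    pairs = []
    for p in protein:
        for n in nucleic:
            distance = math.dist(p["xyz"], n["xyz"])
            if distance <= cutoff:
                pairs.append((p, n, round(distance, 2)))
    return pairs

def atom_line(serial, name, res, chain, resseq, x, y, z):
    """Format one ATOM record (hypothetical helper used only for the demo data)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

# Invented coordinates: an arginine side chain next to a guanine base.
demo = [
    atom_line(1, "NH1", "ARG", "A", 10, 0.0, 0.0, 0.0),
    atom_line(2, "CZ", "ARG", "A", 10, 1.0, 0.0, 0.0),
    atom_line(3, "O6", "DG", "B", 5, 2.9, 0.0, 0.0),
    atom_line(4, "N7", "DG", "B", 5, 0.0, 5.0, 0.0),
]
pairs = candidate_hbonds(parse_atoms(demo))
for p, n, d in pairs:
    print(p["res"], p["name"], "-", n["res"], n["name"], d)
```

A real analysis would also apply geometric criteria such as donor-hydrogen-acceptor angles and would handle alternate locations, water molecules, and modified residues, all of which this sketch ignores.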
NPIDB was upgraded in 2013 with a new web interface, new tools for calculating intermolecular DNA/RNA-protein interactions, a classification of SCOP families, and data on conserved water molecules at the DNA-protein interface (Kirsanov et al. 2013).
The BIPA database (http://www-cryst.bioc.cam.ac.uk/bipa) provides several physicochemical properties of nucleic acid/protein interaction sites. Size, shape, residue propensity, secondary structure composition, and intermolecular interactions are calculated and stored in the database (Lee and Blundell 2009). Additional annotations are provided for each nucleic acid/protein complex, such as structure-based multiple sequence alignments of DNA-binding protein families and annotations of the local nucleic acid/protein environments that influence the acceptability of mutations at a given position in a protein family (Lee and Blundell 2009).
The 3D-Footprint database (http://floresta.eead.csic.es/3dfootprint) contains structure-based specificities and sequence logos (footprints) for all DNA-binding proteins deposited in the PDB. The database architecture allows users to browse DNA-binding proteins by name, find proteins that recognize a similar DNA motif, and run BLAST searches for similar DNA-binding proteins with the interface residues highlighted in the resulting alignments (Contreras-Moreira 2010). 3D-Footprint is updated and curated weekly with new PDB complexes. It clusters DNA-binding proteins by structural similarity and represents the DNA-protein interface both graphically and as a footprint diagram (Contreras-Moreira 2010).
The DBBP database (http://bclab.inha.ac.kr/dbbp) stores structurally analyzed hydrogen bonds in DNA/RNA-protein complexes extracted from the PDB and offers information about hydrogen-bonding interactions between proteins and nucleic acids at various levels, from the residue to the atom level (Park et al. 2014).
The TRANSFAC database (now accessible at http://www.gene-regulation.com/pub/databases.html) provides one of the largest collections of experimentally derived data on interactions between transcription factors and cis-acting DNA elements, all of which were entered manually (Wingender et al. 1996). TRANSFAC data are organized in several tables: FACTOR (information on transcription factors), SITES (information on genomic and artificial binding sites), GENE (information on regulated genes), MATRIX (nucleotide distribution matrices derived from collections of binding sites), CLASS (grouping of transcription factors according to their DNA-binding domains), CELL (cell lines and other sources of factors), and CONS (consensus descriptions). It also provides links to entries in several other databases, including EMBL, SwissProt, PROSITE, and the Transcription Regulatory Region Database (TRRD).
These tables are linked to the corresponding entries in the other tables and databases. TRANSFAC has been revised twice, in 2003 and 2006, based on new laboratory data (Matys et al. 2006).
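A nucleotide distribution matrix of the kind stored in a MATRIX table is typically used to score candidate binding sites in a sequence. The sketch below converts a count matrix into log-odds scores and slides it along a sequence. The counts are invented and are not a TRANSFAC entry, the uniform background and the pseudocount of 1 are assumptions, and the scoring is a generic log-odds scheme rather than the one used by any particular TRANSFAC tool.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2(frequency / background) scores."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def scan(sequence, pwm):
    """Score every forward-strand window; windows with non-ACGT letters are skipped."""
    width = len(pwm)
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits

# Hypothetical count matrix for a 4-bp site (8 sites, all reading TATA).
counts = [
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 8, "C": 0, "G": 0, "T": 0},
]
pwm = log_odds_matrix(counts)
best = max(scan("GGCTATAGG", pwm), key=lambda hit: hit[2])
print(best[0], best[1], round(best[2], 2))
```

Only the forward strand is scanned here; a fuller search would also score the reverse complement and apply a score threshold chosen for the matrix in question.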
In the latest revision of this database, in 2006, some new tables were introduced and new fields were added to several existing tables (Table ). New links and additional information were added to other tables, along with new links to databases such as UniGene, Ensembl, and Entrez Gene. Moreover, a complementary database called TRANSCompel, containing information on composite elements, was annexed to TRANSFAC (Matys et al. 2006). TRANSCompel consists of two tables: the COMPEL table (general information about the composite elements) and the EVIDENCE table (a brief list of the experimental evidence confirming physical and functional interactions between the corresponding TFs). Whereas TRANSFAC focuses on individual binding sites, TRANSCompel adds information on two or more protein binding sites located close to one another, covering both synergistic and antagonistic effects (Matys et al.