Computational Genomics and Structural Bioinformatics
Dr Wang, John Junwen (王俊文)
Assistant Professor, Department of Biochemistry
BEng (Huazhong Agric.); MSc (Penn, Jiangnan); PhD (UW-Seattle)
- Contact
- Email:
- Tel: (852) 2831 5075; Fax: (852) 2855 1254
- Office: L1-05E, Human Research Institute, 5 Sassoon Road, Hong Kong
- Awards and Distinctions
- Outstanding Young Researcher Award (2011-12)
Publications, Achievements, and Grants are available at: HKU Scholars Hub, and Google Scholar
Wang Lab Webpage: http://jjwanglab.org/
Web Servers: ChIP-Array, EpiRegNet
Software: FastPval, co-evo, NRProF, FaSD
Positions Available:
I am always looking for people to join my group, either as a Research Assistant (RA) or postgraduate student in Hong Kong, or as an RA or exchange student in Shenzhen through HKU-SIRI. Candidates should hold a master's degree or above and be highly motivated, hardworking and smart, preferably with excellent algorithm/programming skills and basic knowledge of biology, or with in-depth training in biomedical sciences and some programming skills. BS holders with first-class honours (GPA > 86/100 or 3.5/4.0) from top-ranking universities, who can compete for HKPF or UPF, will also be considered. Please also refer to here.
Research Description:
We employ computational and biological approaches to study the relationships among biological sequence, structure and function. We focus on three areas:
Computational and transcriptional genomics: Defining core promoters and their surrounding transcription factor binding sites (TFBS) is a crucial step toward understanding gene regulation. We have developed computational models to detect core promoters in the human genome. We have also defined DNA sequence motifs associated with the core promoter and explored their relations to known genetic networks. Recent studies showed that many genes have multiple promoters. We discovered that among these promoters, most 5' promoters are more likely to be located within a CpG island. We are exploring this finding both computationally and biochemically. Computationally, we investigate the structural and functional variations among different promoters with regard to their TFBS composition, CpG islands and promoter specificity. The computational findings are verified biochemically by DNA mutagenesis (i.e., introducing insertions and deletions that disrupt the TFBS), with the aim of demonstrating the correlation between the presence of a TFBS and promoter function. We are also developing computational methods to discover the genetic and epigenetic signatures of human/mouse embryonic stem cell differentiation.
Structural bioinformatics: The primary sequence of a protein determines its secondary and higher-order structures. However, the rules governing this determination are still poorly understood. We are interested in correlations between protein sequence and structure that can better define these rules. We have developed statistical methods to explore these correlations, and are using these methods to improve protein sequence alignments. We plan to develop computational tools to improve prediction of higher-order protein structure, which in turn will help us to construct protein-protein interaction networks. We are also interested in studying the evolutionary relationships between transcription factors and their binding sites. We are developing HMM-based algorithms to model protein-DNA and protein-RNA interactions.
Genome variation and diseases: Single nucleotide polymorphisms (SNPs) and copy number variations (CNVs) are powerful tools for studying genetic diseases, such as cancers of the breast, colon and lung. There are more than 10 million SNPs in the human genome, but only a fraction have been associated with diseases. Discovering new disease-associated SNPs will improve prediction, prevention and therapy of these diseases. We have developed algorithms to detect SNPs that lie within the binding sites of transcription factors, or within putative microRNA targets. These SNPs are likely to alter normal gene regulation and cause disease. We are developing new probabilistic models to improve detection of disease-associated SNPs and CNVs. In addition, we are developing analysis pipelines for The Cancer Genome Atlas (TCGA) project.
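As a rough illustration of the idea of locating SNPs inside transcription factor binding sites, the sketch below checks which SNP positions fall within a set of binding-site intervals on a single chromosome. It is not the lab's algorithm: the function names (merge_intervals, snps_in_sites), the half-open coordinate convention and the example coordinates are all assumptions made for illustration.

```python
from bisect import bisect_right
from typing import List, Tuple

# Assumption: intervals are half-open [start, end) on one chromosome.
Interval = Tuple[int, int]


def merge_intervals(sites: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(sites):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def snps_in_sites(snp_positions: List[int], sites: List[Interval]) -> List[int]:
    """Return the SNP positions that fall inside any binding-site interval."""
    merged = merge_intervals(sites)
    starts = [start for start, _ in merged]
    hits: List[int] = []
    for pos in snp_positions:
        # Index of the last merged interval that starts at or before pos.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < merged[i][1]:
            hits.append(pos)
    return hits


# Hypothetical coordinates, for illustration only.
example_sites = [(100, 112), (108, 120), (500, 510)]
example_snps = [99, 100, 119, 120, 505, 510]
print(snps_in_sites(example_snps, example_sites))  # prints [100, 119, 505]
```

Merging the intervals first lets each SNP be tested with a single binary search, which keeps the lookup fast when there are many SNPs and many sites. A real analysis would also need to handle multiple chromosomes and assess whether the variant actually changes binding.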
Current Lab Members:
- Dr. Junwen John Wang; PI since March, 2008
- Mr. Weixin Jacky Wang, PhD student since Oct., 2009; BSc., ZJU
- Mr. Hari Krishna Yalamanchili, PhD student since Jan., 2010; BSc., JUIT; MSc., IIIT, India
- Ms. Jing Qin, RA since March, 2010; PhD student since Aug., 2010; BSc., ZJU; MPhil., CUHK
- Ms. Yan Wang, RA since March, 2010; PhD student since Aug., 2010; BSc., PKU
- Mr. Feng Xu, RA since Sept., 2010; PhD student since Sept., 2011; BSc., NEFU; MSc., NEFU
- Mr. Mulin Jun Li, RA since March, 2010; PhD student since June, 2012; BSc., USTA; MSc., USTC
- Mr. Panwen Wang, RA since Oct, 2010; PhD student since Dec, 2011; BSc., WHU; MSc., BUT
- Mr. Xiaorong Liu, RA since Feb., 2011; BSc., HNNU; MSc., CSU
Past Lab Members:
- Mr. Shu Yang, MPhil student (Sept. 2008~July, 2011); now at UBC, Canada
- Dr. Kalpana Agrawal, part time RA (Nov. 2008~June, 2010);
- Mr. Xinran Li, undergraduate FYP (Aug. 2008~July, 2009); now at UMich, USA
- Mr. Zhanyong Wang, RA (Mar. 2009-July, 2009); now at UCLA, USA
- Mr. Po Lo Paul Chan, undergraduate FYP (Sept. 2009~May, 2010)
- Mr. Leung Hing Lok, undergraduate project student (Sept. 2009~May, 2010)
- Ms. Pony Chan, undergraduate FYP (Sept. 2010~May, 2011)
- Mr. Ocean Wong, undergraduate FYP (Sept. 2010~May, 2011)
Past Exchange Students/Summer Interns:
- Mr. Xueya Zhou (May, 2011), from Tsinghua University, China
- Ms. Ee Lyn Lim (Sept., 2010~Sept., 2010), from University of Oxford, UK
- Mr. Long Chan (July, 2010~Aug, 2010), from Carleton College, USA
- Mr. Kevin Mao (July, 2010~Aug, 2010), from Royal College of Surgeons in Ireland
- Ms. Tina Yuen (July, 2010~Aug, 2010), from Royal College of Surgeons in Ireland
- Ms. Vijitra Luang-In (July, 2010~Aug, 2010), from Imperial College London, UK
- Ms. Ruijuan Li (May, 2010), from Tsinghua University, China
- Mr. Yugang Hu (July, 2010), from NIBS, China
- Ms. Grace Yip (July, 2009~Aug, 2009), from Imperial College London, UK
Selected Publications (name in bold: lab member, *Corresponding author):
- Pan X, Papasani M, Hao Y, Calamito M, Wei F, Quinn III W, Wang JW, Hodawadekar S, Zaprazna K, Liu H, Shi Y, Allman D, Cancro M, Basu A, Atchison ML* (2013) YY1 Controls Igk Repertoire and B Cell Development, and Localizes with Condensin on the Igk Locus. EMBO J, doi:10.1038/emboj.2013.66. link
- Lei S, Li H, Xu J, Gao X, Wang JW, Ng KFJ, Lau WB, Rodrigues B, Irwin MG*, Xia ZY* (2013) Hyperglycemia-induced PKCβ2 Activation Induces Diastolic Cardiac Dysfunction in Diabetic Rats by Impairing Caveolin-3 Expression and Akt/eNOS Signaling. Diabetes, doi:10.2337/db12-1391. link
- Wang W, Xu F, Wang JW* (2013) Assessment of Mapping and SNP-detection Algorithms for Next Generation Sequencing Data for Cancer Genomics. Application of Next Generation Sequencing in Cancer Research, Invited book chapter, 2013, Springer, in press
- Lan Q*, Hsiung, CA, Matsuo K, Hong YC, ..., Wang JW, ..., Rothman, N (2012) Genome-wide association analysis identifies new lung cancer susceptibility loci in never-smoking women in Asia. Nat Genet, 44(12):1330-1335.
- Xu F†, Wang W†, Wang P, Li MJ, Sham PC, and Wang JW* (2012) A fast and accurate SNP detection algorithm for next-generation sequencing data. Nat Commun, doi:10.1038/ncomms2256.
- Li MJ, Sham PC, and Wang JW* (2012) Genetic variants representation, annotation and prioritization in the post-GWAS era. Cell Research, 22(10):1505-1508.
- Yalamanchili HK, Xiao QW, and Wang JW* (2012) A Neural Response Algorithm for Protein Function Prediction. BMC Systems Biology, 6(S1):S19.
- Zhang G, Zhou B, Wang W, Zhang M, Zhao Y, Wang Z, Yang L, Zhai J, Feng CG, Wang JW*, and Chen X* (2012) A functional Single-Nucleotide Polymorphism in interleukin-6 promoter is associated with susceptibility to Tuberculosis. The Journal of Infectious Diseases, 205:1697-1704.
- Li MJ, Wang P, Liu X, Lim EL, Wang Z, Yeager M, Wong MP, Sham PC, Chanock S, and Wang JW* (2012) GWASdb: a database for human genetic variants identified by genome wide association studies. Nucleic Acids Research, 40(1):D1047-54.
- Wang JW* (2012) A database of genetic variants in microRNA genes and their putative functional roles in gene regulation. Human Mutation, 33(1):vii.
- Wang LY, Wang PW, Li MJ, Qin J, Wang XO, Zhang MQ, and Wang JW* (2011) EpiRegNet: constructing epigenetic regulatory networks from high throughput gene expression data for human. Epigenetics, 6(12):1505-12.
- Yang S, Yalamanchili HK, Li X, Yao KM, Sham PC, Zhang MQ, and Wang JW* (2011) Correlated evolution of transcription factors and their binding sites. Bioinformatics, 27(21):2972-2978.
- Yalamanchili HK, Xiao QW, and Wang JW* (2011) NRProF: Neural Response Based Protein Function Prediction Algorithm. IEEE International Conference on Systems Biology, 33-40.
- Wu HJ, Wu W, Sun HY, Qin GW, Wang HB, Wang PW, Yalamanchili HK, Wang JW, Tse HF, Lau CP, Vanhoutte PM, and Li GR (2011) Acacetin causes a frequency- and use-dependent blockade of hKv1.5 channels by binding to the S6 domain. Journal of Molecular and Cellular Cardiology, 51(6):966-973.
- Zhang Y, Liao S, Yang M, Liang X, Poon MW, Wong CY, Wang JW, Zhou Z, Cheong SK, Lee CN, Tse HF, and Lian Q (2012) Improved Cell Survival and Paracrine Capacity of Human Embryonic Stem Cells-Derived Mesenchymal Stem Cells Promote Therapeutic Potential for Pulmonary Arterial Hypertension. Cell Transplantation, 21(10):2225-2239.
- Wang W, Wei Z, Lam T-W, and Wang JW* (2011) Next generation sequencing has lower sequence coverage and poorer SNP-detection capability in the regulatory regions. Scientific Reports, 1:55.
- Zhang G, Chen X, Chan L, Zhang M, Zhu B, Wang L, Zhu X, Zhang J, Zhou B, and Wang JW* (2011) An SNP selection strategy identified IL-22 associating with susceptibility to tuberculosis in Chinese. Scientific Reports, 1:20.
- Qin J, Li MJ, Wang P, Zhang MQ, and Wang JW* (2011) ChIP-Array: combinatory analysis of ChIP-seq/chip and microarray gene expression data to discover direct/indirect targets of a transcription factor. Nucleic Acids Research, 39:W430-436.

- Li MJ, Sham PC, and Wang JW* (2010) FastPval: a fast and memory efficient program to calculate very low p-values from empirical distribution. Bioinformatics, 26(22):2897-99.
- Wei F, Zaprazna K, Wang JW, and Atchison ML (2009) PU.1 Can Recruit BCL6 to DNA To Repress Gene Expression in Germinal Center B Cells. Molecular and Cellular Biology, 29(17):4612-4622.
- Tseng H, Chou W, Wang J, Zhang X, Zhang S, and Schultz RM (2008). Mouse ribosomal RNA genes contain multiple differentially regulated variants. PLoS One 3(3):e1843.

- Wang J*, Ungar LH, Tseng H, and Hannenhalli S (2007) MetaProm: a neural network based meta-predictor for alternative human promoter prediction. BMC Genomics 8, 374. (*Corresponding author)
- Zhang S, Wang J, and Tseng H (2007) Basonuclin regulates a subset of ribosomal RNA genes in HaCaT cells. PLoS ONE 2, e902.
- Hunter DJ, Kraft P, Jacobs KB, Cox DG, Yeager M, Hankinson SE, Wacholder S, Wang Z, Welch R, Hutchinson A, Wang J, Yu K, Chatterjee N, Orr N, Willett WC, Colditz GA, Ziegler RG, Berg CD, Buys SS, McCarty CA, Feigelson HS, Calle EE, Thun MJ, Hayes RB, Tucker M, Gerhard DS, Fraumeni JF Jr, Hoover RN, Thomas G, and Chanock SJ (2007) A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nat Genet 39, 870-874.