National Precision Medicine (NPM) programme

Building a National Precision Medicine (NPM) programme – SG10K_Health

Background
The SG10K_Health project is a multi-institutional initiative developing an ‘all-of-Singapore’ coordinated approach to precision medicine (PM). PM is a fast-emerging field that seeks to improve treatment and prevent disease by considering individual and population variation in genes, environment and lifestyle patterns. Through PM, patients may receive more accurate diagnoses and customised treatments that achieve maximal benefit while minimising side effects, and health care costs may fall as ineffective treatments are avoided. At present, Asian populations are severely under-represented in public genotypic databases. The lack of large-scale control databases of Asian-specific genetic variation linked to clinical phenotypes is a significant barrier to the conduct of PM in Asia, because such databases are needed to avoid misdiagnosis and overtreatment caused by the mistaken identification of pathogenic variants. The Singaporean population consists of three major ethnic groups, Chinese, Malay, and Indian, which together represent over 80% of the genetic variation in Asia. The presence of these three major ethnic groups in a single country thus offers a unique opportunity for Singapore, despite its small size, to contribute to global efforts in PM, complementing other large-scale efforts.

Objectives
The SG10K_Health project aims to empower biomedical and genetic studies of Asian-centric diseases by: 1) building local infrastructure and deep capabilities to generate, analyse and store human genetic data at population scale, in a safe, secure and rapid manner, 2) generating a diverse and rich control dataset of Asian populations for genetic association studies of diseases, and 3) developing advanced analytical tools for genetic variant interpretation to derive disease risk predictions and identify clinically actionable variants. Notably, statistical estimates indicate that a genomic data set of 10,000 individuals will be sufficient to capture essentially all common alleles (i.e. those with more than 1% allele frequency) in our population. Currently, the SG10K_Health data is linked to research traits (e.g. height, weight, blood pressure); in the future it will be linked to clinical records, subject to participant consent.
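
The intuition behind the 10,000-genome figure can be illustrated with a simple sampling calculation. The sketch below is not the project's own statistical estimate; it assumes unrelated diploid individuals whose chromosomes are sampled independently, and the function name is hypothetical. Under those assumptions, an allele at population frequency f is seen at least once among n individuals with probability 1 - (1 - f)^(2n).

```python
def prob_allele_observed(allele_freq: float, n_individuals: int) -> float:
    """Probability of seeing an allele at least once in a cohort.

    Assumes independent sampling of 2 chromosomes per individual,
    which is a simplification of any real cohort design.
    """
    n_chromosomes = 2 * n_individuals
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes


# Compare cohort sizes for an allele at the 1% "common" threshold.
for cohort_size in (100, 1_000, 10_000):
    print(cohort_size, prob_allele_observed(0.01, cohort_size))
```

At 10,000 individuals the chance of missing a 1% allele is vanishingly small under these assumptions, which is consistent with the claim that such a cohort captures essentially all common alleles.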

Genomic Web Services
We have established various web services to enable users to query the SG10K_Health dataset, including allele frequencies, protein-drug interactions, and polygenic risk scores. The web services can be accessed through the links below:

  • SNPdrug3D (Coming soon)
  • Polygenic Risk Score (Coming soon)
Publications & Press Release
Image credit: Nature Genetics

Key highlights:
  • Sequenced whole genomes of 10,323 healthy consented individuals
  • Discovered over 179 million variants, 43% of which are novel (not detected in dbSNP151)
  • Generated insights into population structure and identified clinically relevant variants and Asian specific structural variations

Publications:
  • Joanna Hui Juan Tan, Zhihui Li, Mar Gonzalez Porta, et al. A Catalogue of Structural Variation across Ancestrally Diverse Asian Genomes. Nature Communications doi: 10.1038/s41467-024-53620-8
  • Eleanor Wong, Nicolas Bertin, Maxime Hebrard, et al. Nature Genetics doi: 10.1038/s41588-022-01274-x (19 Jan 2023)
  • Sock Hoai Chan, Yasmin Bylstra, Jing Xian Teo, et al. Nature Communications doi: 10.1038/s41467-022-34116-9 (05 Nov 2022)
  • Jing Xian Teo, Sonia Davila, Chengxi Yang, et al. Communications Biology doi: 10.1038/s42003-019-0605-1 (04 Oct 2019)
  • Yasmin Bylstra, Sonia Davila, Weng Khong Lim, et al. npj Genomic Medicine doi: 10.1038/s41525-019-0085-8 (07 Jun 2019)

Press Release

Whole Genome Sequencing (WGS) Quality Control (QC) Standards approved as an official GA4GH product

The Genome Institute of Singapore (GIS), A*STAR, has led the development of the newly approved Whole Genome Sequencing (WGS) Quality Control (QC) Standards, now recognised as an official product of the Global Alliance for Genomics and Health (GA4GH). Developed within the GA4GH community, this framework provides a unified approach for assessing WGS data quality across global genomics initiatives. Product development was led by Maxime Hebrard, Justin Jeyakani and Nicolas Bertin from GIS, A*STAR, working closely with the GA4GH LSG community under the guidance of Work Stream Manager Reggan Thomas (EMBL’s European Bioinformatics Institute).

Programme investigators
Lead investigators: Prof Patrick Tan (GIS) and Prof Tai E Shyong (NUHS)
Co-investigators: Prof John Chambers (LKCMedicine), Dr Neerja Karnani (SICS), Prof Liu Jian Jun (GIS), Dr Shyam Prabhakar (GIS), Dr Birgit Eisenhaber (BII), Dr Chandra Verma (BII), Dr Sebastian Maurer-Stroh (BII), Dr Rick Goh (IHPC), Dr Sonia Davila (Duke-NUS), Dr Pavitra Krishnaswamy (I2R), Dr Sim Xueling (NUS), Dr Marie Loh (LKCMedicine), Prof Cheng Ching-Yu (SERI) and Dr Leong Khai Pang (TTSH).

Contact Us
For more information on the SG10K_Health programme and web services, please reach out to us at contact_npco@a-star.edu.sg

FAQ

Q. How does each of the genomic web services work?

CHORUS Variant browser

The CHORUS variant browser provides access to variants from the SG10K_Health dataset of 10,000 whole genomes. It builds on the gnomAD/MacArthur lab infrastructure and extends the gnomAD user interface and APIs to serve as Singapore’s population reference.

How it works

  • Query by gene, transcript, variant ID or genomic region
  • View allele frequencies stratified by gender and ethnicity
  • Access variant-level metrics and functional annotations (e.g. synonymous/missense, HGVS, SIFT/PolyPhen)
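
As an illustration of what a stratified allele frequency means, the sketch below pools allele counts within each ethnicity or gender before dividing. It is not the CHORUS implementation or API; the counts are made up and the names `EXAMPLE_COUNTS`, `allele_frequency` and `frequencies_by` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical counts for one variant: (ethnicity, gender, AC, AN).
# AC = alternate allele count, AN = number of called alleles.
EXAMPLE_COUNTS: List[Tuple[str, str, int, int]] = [
    ("Chinese", "female", 12, 4000),
    ("Chinese", "male", 8, 4000),
    ("Malay", "female", 3, 1000),
    ("Malay", "male", 5, 1000),
    ("Indian", "female", 0, 1000),
    ("Indian", "male", 1, 1000),
]


def allele_frequency(ac: int, an: int) -> Optional[float]:
    """AF = AC / AN; returns None when no alleles were called."""
    return ac / an if an > 0 else None


def frequencies_by(
    counts: List[Tuple[str, str, int, int]], field: str
) -> Dict[str, Optional[float]]:
    """Pool AC and AN within each ethnicity or gender, then divide."""
    index = {"ethnicity": 0, "gender": 1}[field]
    pooled = defaultdict(lambda: [0, 0])
    for row in counts:
        pooled[row[index]][0] += row[2]  # accumulate AC
        pooled[row[index]][1] += row[3]  # accumulate AN
    return {key: allele_frequency(ac, an) for key, (ac, an) in pooled.items()}


print(frequencies_by(EXAMPLE_COUNTS, "ethnicity"))
print(frequencies_by(EXAMPLE_COUNTS, "gender"))
```

Pooling counts rather than averaging per-stratum frequencies keeps the result correct when strata differ in size.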

Downloads available 

  • SG10K_Health aggregated allele frequencies
  • SG10K_Health aggregated structural variants (see “A Catalogue of Structural Variation across Ancestrally Diverse Asian Genomes”, DOI: 10.1038/s41467-024-53620-8) 

BEACON

Beacon reports whether specific variants are present in the SG10K_Health dataset of 10,000 whole genomes. It implements the Global Alliance for Genomics and Health Beacon standard and uses a GraphQL layer over an Elasticsearch backend.

How it works

  • Users query a variant by chromosome, position, reference allele and alternate allele.
  • The beacon returns a simple “yes” or “no” indicating whether the variant is found. 
  • When available, aggregated allele frequencies by gender and ethnicity are also displayed. 
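
The yes/no lookup described above can be sketched with an in-memory set. This is only a toy stand-in for the idea, not the SG10K_Health Beacon, its GraphQL schema or its Elasticsearch index; the class `ToyBeacon`, the helper `normalise` and the example variants are all hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class VariantKey(NamedTuple):
    chromosome: str
    position: int
    reference: str
    alternate: str


def normalise(chromosome: str, position: int, reference: str, alternate: str) -> VariantKey:
    """Put a query into one canonical form (no 'chr' prefix, upper-case alleles)."""
    chrom = chromosome.strip()
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    return VariantKey(chrom.upper(), int(position), reference.upper(), alternate.upper())


class ToyBeacon:
    """Answers only whether a variant is present, never who carries it."""

    def __init__(self, variants: Iterable[Tuple[str, int, str, str]]) -> None:
        self._variants = {normalise(*v) for v in variants}

    def exists(self, chromosome: str, position: int, reference: str, alternate: str) -> bool:
        return normalise(chromosome, position, reference, alternate) in self._variants


# Made-up variants for illustration only.
beacon = ToyBeacon([("chr1", 12345, "A", "G"), ("X", 500, "C", "T")])
print(beacon.exists("1", 12345, "a", "g"))
print(beacon.exists("1", 12345, "A", "T"))
```

Normalising the query before the lookup means that "chr1" and "1", or lower-case and upper-case alleles, resolve to the same answer.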

SNPdrug3D

SNPdrug3D is a web tool to explore how SNPs affect protein sequence and structure, with a focus on drug binding and pharmacogenetic impact.

How it works

  • Search by coordinate, gene, protein or drug
  • See if SNPs hit functional regions or drug-binding pockets
  • Visualise effects using sequence and interactive 3D structure viewers
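
Checking whether a SNP falls in a functional region comes down to mapping it to a residue and testing that residue against annotated ranges. The sketch below shows that idea only; it is not how SNPdrug3D is implemented, and the region names, residue ranges and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical annotation for one protein: region name -> inclusive residue ranges.
REGIONS: Dict[str, List[Tuple[int, int]]] = {
    "kinase domain": [(50, 300)],
    "drug-binding pocket": [(88, 95), (140, 146)],
}


def residue_from_cds_position(cds_position: int) -> int:
    """Map a 1-based coding-sequence position to its 1-based residue number."""
    if cds_position < 1:
        raise ValueError("cds_position must be 1 or greater")
    return (cds_position - 1) // 3 + 1


def regions_hit(residue: int, regions: Dict[str, List[Tuple[int, int]]]) -> List[str]:
    """Return the sorted names of all regions whose ranges contain the residue."""
    return sorted(
        name
        for name, spans in regions.items()
        if any(start <= residue <= end for start, end in spans)
    )


residue = residue_from_cds_position(268)
print(residue, regions_hit(residue, REGIONS))
```

A residue can sit in several regions at once, so the function returns a list rather than a single label.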

Polygenic Risk Scores (PRS)

The PRS web service is an intuitive tool for exploring polygenic risk scores in the SG10K_Health cohort. 

How it works

  • Visualise PRS distributions and their associations with available phenotypes, without accessing individual-level data
  • Run analyses on cohort subsets by specifying age, gender and ethnicity criteria
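
A polygenic risk score is commonly computed as a weighted sum of effect-allele dosages. The sketch below scores a small made-up cohort and returns only aggregate statistics for a subset chosen by age, gender and ethnicity. It is an assumption-laden illustration, not the PRS web service itself: the weights, variant names, participants and function names are all hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional

# Hypothetical effect sizes per variant; real scores use published weights.
WEIGHTS: Dict[str, float] = {"variant_1": 0.2, "variant_2": -0.1, "variant_3": 0.5}

# Made-up participants: covariates plus effect-allele dosages (0, 1 or 2).
PARTICIPANTS: List[dict] = [
    {"age": 45, "gender": "female", "ethnicity": "Chinese",
     "dosages": {"variant_1": 2, "variant_2": 0, "variant_3": 1}},
    {"age": 60, "gender": "male", "ethnicity": "Malay",
     "dosages": {"variant_1": 1, "variant_2": 2, "variant_3": 0}},
    {"age": 52, "gender": "female", "ethnicity": "Indian",
     "dosages": {"variant_1": 0, "variant_2": 1, "variant_3": 2}},
    {"age": 38, "gender": "female", "ethnicity": "Chinese",
     "dosages": {"variant_1": 1, "variant_2": 1, "variant_3": 1}},
]


def polygenic_risk_score(dosages: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of dosages; a missing genotype counts as dosage 0 (a simplification)."""
    return sum(weight * dosages.get(variant, 0) for variant, weight in weights.items())


def summarise_prs(participants: List[dict], weights: Dict[str, float],
                  min_age: int = 0, max_age: int = 200,
                  gender: Optional[str] = None,
                  ethnicity: Optional[str] = None) -> dict:
    """Return count, mean and standard deviation of PRS for the selected subset."""
    scores = [
        polygenic_risk_score(p["dosages"], weights)
        for p in participants
        if min_age <= p["age"] <= max_age
        and gender in (None, p["gender"])
        and ethnicity in (None, p["ethnicity"])
    ]
    if not scores:
        return {"n": 0, "mean": None, "sd": None}
    return {"n": len(scores), "mean": mean(scores), "sd": pstdev(scores)}


print(summarise_prs(PARTICIPANTS, WEIGHTS, gender="female"))
print(summarise_prs(PARTICIPANTS, WEIGHTS, min_age=40, max_age=65))
```

Returning only the subset size and summary statistics mirrors the point that distributions can be explored without exposing individual-level data.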

Infographics