Pharmacogenomics

Extracting CYP2C19 Star Alleles from 23andMe — Plavix (Clopidogrel) Response Prediction in Python

Python tutorial: parse 23andMe raw data for CYP2C19 *2, *3, and *17 variants, predict your clopidogrel (Plavix) response phenotype, and understand why ~20% of Koreans get suboptimal antiplatelet protection on standard dosing. Includes the SNP-to-star-allele lookup, phenotype interpretation, and where DTC data falls short.


The Stent + Plavix Population That Should Care

Roughly 100,000 Koreans receive a coronary stent (PCI) every year. Almost all leave the hospital on dual antiplatelet therapy: aspirin + clopidogrel (Plavix). The two drugs reduce in-stent thrombosis risk by ~80%.

Except — about 15-20% of Koreans are CYP2C19 poor metabolizers and clopidogrel doesn't work properly for them. The drug is a prodrug; it has to be activated by the CYP2C19 enzyme into its active metabolite. If your CYP2C19 is impaired, you get suboptimal antiplatelet effect even on the standard 75 mg/day, and your in-stent thrombosis risk rises 1.5-2× vs normal metabolizers.

This matters because:

  • A simple genetic test would catch this risk
  • Alternative antiplatelets exist (prasugrel, ticagrelor) that don't require CYP2C19 activation
  • Most Korean PCI patients never get tested

This tutorial shows how to extract the CYP2C19 information from a 23andMe raw data file. It's the practical sibling of "Reading 23andMe Raw Data for CYP2D6 Star Alleles in Python", but for CYP2C19, the gene that matters for Plavix.

CYP2C19 Quick Background

  • Gene location: chromosome 10q23.33
  • What it does: oxidative drug metabolism (CYP450 family)
  • Major substrates: clopidogrel, proton pump inhibitors (omeprazole, esomeprazole), some SSRIs (citalopram, escitalopram), some antidepressants and antifungals
  • Key alleles (PharmVar definitions):
| Allele | Function | Defining SNP (rsid) | Variant base |
| --- | --- | --- | --- |
| *1 | Normal | (reference) | |
| *2 | No function | rs4244285 | A (681G>A) |
| *3 | No function | rs4986893 | A (636G>A) |
| *17 | Increased function | rs12248560 | T (-806C>T) |

For Plavix specifically, you need to know whether you carry *2 or *3 (no function) or *17 (increased function, the rapid-metabolizer flip side).

Population frequencies

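| Population | *2 carrier % | *3 carrier % | *17 carrier % | Poor metabolizers |
| --- | --- | --- | --- | --- |
| Korean | 25-30 | 6-10 | 5-10 | 15-20% |
| Japanese | 25-30 | 7-10 | 5-10 | 15-20% |
| Chinese | 25-30 | 5-8 | 5-10 | 14-18% |
| European | 12-15 | <1 | 22-28 | 2-5% |
| African American | 14-18 | <1 | 18-22 | 3-6% |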

Korea has a roughly 5× higher poor metabolizer rate than European populations. This is one of the cleanest examples of why pharmacogenomic guidance derived from Western trials may underestimate the importance of testing in Korean clinical practice.
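The link between the allele columns and the poor metabolizer column can be checked with a little arithmetic. This is a rough sketch under two assumptions that are not stated in the table: the *2 and *3 percentages are read as allele frequencies (the table labels them carrier %), and genotypes follow Hardy-Weinberg proportions. The function name `hardy_weinberg` is illustrative only.

```python
def hardy_weinberg(q):
    """Genotype fractions for a combined no-function allele frequency q.

    Assumes Hardy-Weinberg proportions; q is the summed frequency of
    the no-function alleles (*2 and *3 here).
    """
    p = 1.0 - q
    return {
        "no_copies": p * p,       # no *2 or *3 on either chromosome
        "one_copy": 2 * p * q,    # one no-function allele
        "two_copies": q * q,      # two no-function alleles: poor metabolizer
    }


# Upper ends of the Korean ranges, read as allele frequencies (assumption).
korean = hardy_weinberg(0.30 + 0.10)
# Upper end of the European *2 range; *3 (<1%) treated as negligible (assumption).
european = hardy_weinberg(0.15)

print(f"Korean two-copy fraction:   {korean['two_copies']:.1%}")
print(f"European two-copy fraction: {european['two_copies']:.1%}")
```

With these inputs the two-copy fraction comes out near 16% for the Korean row and a little over 2% for the European row, which is at least in line with the poor metabolizer column.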

Python Tutorial

Load 23andMe Raw Data

import pandas as pd

snps = pd.read_csv(
    'genome_Firstname_Lastname_Full_v5_Full_20260101230000.txt',
    sep='\t',
    comment='#',
    names=['rsid', 'chromosome', 'position', 'genotype'],
    dtype={'rsid': str, 'chromosome': str, 'position': int, 'genotype': str},
)
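For readers who want to see the file layout without pandas, the same tab-separated format can be read with the standard library. This is a minimal sketch: `SAMPLE` is made-up text that only mimics the layout (comment lines starting with `#`, then four tab-separated columns), the rows are illustrative, and `parse_raw` is a hypothetical helper name.

```python
import csv
import io

# Made-up sample in the 23andMe raw-data layout (illustrative rows only).
SAMPLE = (
    "# example comment line; real files start with several of these\n"
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs4244285\t10\t96541616\tAG\n"
    "rs4986893\t10\t96540410\tGG\n"
    "rs12248560\t10\t96521657\t--\n"
)


def parse_raw(text):
    """Return a list of row dicts, skipping blank and '#' comment lines."""
    data_lines = (
        line for line in io.StringIO(text)
        if line.strip() and not line.startswith("#")
    )
    rows = []
    for rsid, chrom, pos, genotype in csv.reader(data_lines, delimiter="\t"):
        rows.append({
            "rsid": rsid,
            "chromosome": chrom,      # kept as text: X, Y and MT are not numbers
            "position": int(pos),
            "genotype": genotype,
        })
    return rows


rows = parse_raw(SAMPLE)
# '--' marks a no-call: the chip produced no genotype for that SNP.
no_calls = [r["rsid"] for r in rows if r["genotype"] == "--"]
print(len(rows), "rows;", "no-calls:", no_calls)
```

Keeping the chromosome as a string mirrors the `dtype` choice in the pandas call, and listing no-calls up front shows which SNPs cannot be interpreted later.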

Locate CYP2C19 Region

CYP2C19 lives on chromosome 10, GRCh37 coordinates ~96,522,000-96,613,000.

cyp2c19_snps = snps[
    (snps['chromosome'] == '10') &
    (snps['position'].between(96_520_000, 96_615_000))
]
print(f"CYP2C19 region SNPs detected: {len(cyp2c19_snps)}")
print(cyp2c19_snps)

A typical 23andMe v5 chip returns 8-15 SNPs in this region. Most are intronic; you care about three.
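Because the coordinate window only makes sense on GRCh37, it can be worth confirming the build before filtering by position. The sketch below assumes the file's comment header mentions the assembly in a form like "build 37"; that wording is an assumption about the header, and `detect_build` is a hypothetical helper, so check it against your own file.

```python
import re


def detect_build(lines):
    """Return the build number mentioned in the leading '#' comment lines, or None.

    Assumes the header contains text like 'build 37' (an assumption about
    the file format, not a documented guarantee).
    """
    for line in lines:
        if not line.startswith("#"):
            break  # header is over once the data rows begin
        match = re.search(r"build\s+(\d+)", line, flags=re.IGNORECASE)
        if match:
            return int(match.group(1))
    return None


# Made-up header lines for illustration.
example = [
    "# example header: reference human assembly build 37\n",
    "# rsid\tchromosome\tposition\tgenotype\n",
    "rs4244285\t10\t96541616\tAG\n",
]
build = detect_build(example)
if build != 37:
    print("Warning: coordinates in this tutorial assume GRCh37 (build 37)")
else:
    print("Build 37 detected; the GRCh37 window applies")
```

If the build cannot be detected, filtering by rsid alone (as the star-allele lookup does) avoids depending on coordinates at all.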

Check Star Alleles

cyp2c19_lookup = {
    'rs4244285':  {'allele': '*2',  'variant_base': 'A', 'effect': 'no function'},
    'rs4986893':  {'allele': '*3',  'variant_base': 'A', 'effect': 'no function'},
    'rs12248560': {'allele': '*17', 'variant_base': 'T', 'effect': 'increased function'},
}

def check_cyp2c19(snps_df, lookup):
    results = []
    for rsid, info in lookup.items():
        row = snps_df[snps_df['rsid'] == rsid]
        if len(row) == 0:
            results.append({
                'rsid': rsid, 'allele': info['allele'], 'effect': info['effect'],
                'genotype': 'NOT_TESTED', 'carries': '?',
            })
            continue
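        gt = row.iloc[0]['genotype']
        if not isinstance(gt, str) or '-' in gt:
            # '--' is a no-call: the variant status is unknown, not absent
            carries = '?'
        else:
            cnt = gt.count(info['variant_base'])
            carries = {0: 'no', 1: 'heterozygous (1 copy)', 2: 'homozygous (2 copies)'}.get(cnt, '?')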
        results.append({
            'rsid': rsid, 'allele': info['allele'], 'effect': info['effect'],
            'genotype': gt, 'carries': carries,
        })
    return pd.DataFrame(results)

stars = check_cyp2c19(cyp2c19_snps, cyp2c19_lookup)
print(stars)
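Counting the variant base only works if the genotype string uses the bases the lookup expects. As a hedged sanity check, the sketch below compares each genotype against the reference and variant bases from the allele table above (681G>A, 636G>A, -806C>T). It assumes the file reports these SNPs on the same strand as that table; `EXPECTED_BASES` and `genotype_status` are illustrative names.

```python
# Reference and variant base per SNP, taken from the allele table above.
EXPECTED_BASES = {
    "rs4244285": ("G", "A"),   # *2
    "rs4986893": ("G", "A"),   # *3
    "rs12248560": ("C", "T"),  # *17
}


def genotype_status(rsid, genotype):
    """Classify a genotype string as 'ok', 'no-call' or 'unexpected bases'."""
    if not genotype or "-" in genotype:
        return "no-call"
    ref, alt = EXPECTED_BASES[rsid]
    if set(genotype) <= {ref, alt}:
        return "ok"
    # Anything else (for example C/T where G/A is expected) may indicate a
    # strand difference or a different allele; do not count it as "no variant".
    return "unexpected bases"


for rsid, gt in [("rs4244285", "AG"), ("rs4986893", "GG"), ("rs12248560", "--")]:
    print(rsid, gt, genotype_status(rsid, gt))
```

Anything other than "ok" is a reason to treat that SNP as unknown rather than as a negative result.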

Predict Phenotype

Combine the alleles into a phenotype following CPIC guidelines:

def cyp2c19_phenotype(stars_df):
    has = {row['allele']: row['carries'] for _, row in stars_df.iterrows()}
    copies = {'no': 0, 'heterozygous (1 copy)': 1, 'homozygous (2 copies)': 2}

    # A missing or no-call SNP ('?') means the genotype is unknown, not variant-free,
    # so refuse to call a phenotype rather than default to Normal.
    calls = [has.get(allele, '?') for allele in ('*2', '*3', '*17')]
    if any(call not in copies for call in calls):
        return "Indeterminate", "A defining SNP is missing or no-call; confirm with a clinical test"
    s2, s3, s17 = (copies[call] for call in calls)

    # Allele-function counting (CPIC-style, simplified). Phase is unknown:
    # one *2 plus one *3 is assumed to sit on opposite chromosomes.
    no_func_alleles = s2 + s3
    increased = s17

    if no_func_alleles >= 2:
        return "Poor Metabolizer (PM)", "Clopidogrel: avoid; use prasugrel or ticagrelor"
    if no_func_alleles == 1:
        # CPIC groups *2/*17 and *3/*17 with *1/*2 and *1/*3 as intermediate
        return "Intermediate Metabolizer (IM)", "Clopidogrel: reduced efficacy; consider alternatives"
    if increased >= 1:
        return "Rapid/Ultrarapid Metabolizer (RM/UM)", "Clopidogrel: standard dose; SSRI/PPI may need dose review"
    return "Normal Metabolizer (NM)", "Standard clopidogrel dose appropriate"

phenotype, clinical_note = cyp2c19_phenotype(stars)
print(f"Phenotype: {phenotype}")
print(f"Clinical implication: {clinical_note}")

Sample output for a hypothetical Korean user:

   rsid          allele  effect              genotype  carries
0  rs4244285     *2      no function         GA        heterozygous (1 copy)
1  rs4986893     *3      no function         GG        no
2  rs12248560    *17     increased function  CC        no

Phenotype: Intermediate Metabolizer (IM)
Clinical implication: Clopidogrel: reduced efficacy; consider alternatives

This person carries one *2 allele — they're an intermediate metabolizer. If they were prescribed Plavix after a stent, they're at elevated thrombosis risk vs a normal metabolizer.
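To connect the per-SNP counts to star-allele notation such as *1/*2, the counts can be turned into a diplotype string. This is a simplified sketch: it assumes only *2, *3 and *17 are considered, that each variant copy sits on a separate chromosome (23andMe data is unphased), and that any chromosome without a tested variant is called *1. The function name `diplotype` is illustrative.

```python
def diplotype(n2, n3, n17):
    """Build a diplotype string from variant copy counts (0, 1 or 2 each).

    Returns None when the counts cannot be explained by two chromosomes
    carrying at most one of the three variants each.
    """
    alleles = ["*2"] * n2 + ["*3"] * n3 + ["*17"] * n17
    if len(alleles) > 2:
        return None  # e.g. *2 homozygous plus a *3 copy: needs phased or clinical data
    # Chromosomes with none of the tested variants default to *1 (assumption:
    # untested rare alleles are ignored).
    alleles += ["*1"] * (2 - len(alleles))
    alleles.sort(key=lambda a: int(a[1:]))  # numeric order: *1, *2, *3, *17
    return "/".join(alleles)


print(diplotype(1, 0, 0))  # the heterozygous *2 example above
print(diplotype(1, 0, 1))
print(diplotype(0, 0, 2))
```

The heterozygous *2 example from the sample output maps to *1/*2, and the resulting strings can be looked up directly in a genotype-to-phenotype table.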

Phenotype Reference Table

| Genotype | Phenotype | Approx. KR frequency | Clopidogrel guidance |
| --- | --- | --- | --- |
| *1/*1 | Normal Metabolizer (NM) | 35-45% | Standard 75 mg/day |
| *1/*2 or *1/*3 | Intermediate (IM) | 30-35% | Suboptimal; consider alternative |
| *2/*2, *2/*3, *3/*3 | Poor Metabolizer (PM) | 15-20% | Avoid clopidogrel; use prasugrel or ticagrelor |
| *1/*17 | Rapid (RM) | 5-8% | Standard dose; monitor for bleeding |
| *17/*17 | Ultrarapid (UM) | <2% | Standard dose; consider lower in bleeding-risk patients |
| *2/*17 or *3/*17 | Intermediate (IM) under CPIC; response varies | 2-5% | Case by case |

Clinical Context — Why This Matters Beyond Theory

CPIC guideline (Lee et al., 2022)

  • Strong recommendation: PM and IM patients should receive prasugrel or ticagrelor instead of clopidogrel for acute coronary syndrome
  • Moderate recommendation: same for elective PCI

Real-world evidence

A 2020 Lancet meta-analysis showed CYP2C19 PMs receiving clopidogrel after PCI had:

  • 1.5-1.8× higher rate of recurrent stent thrombosis
  • 1.4× higher major adverse cardiovascular events

The Korea-specific gap

Despite the elevated PM rate in Koreans, CYP2C19 testing is not routinely performed before clopidogrel prescription in most Korean hospitals. Reasons:

  • Insurance reimbursement is conditional (covered in defined post-stent contexts at major centers; not for all clopidogrel users)
  • Test turnaround can exceed acute-phase decision window
  • Clinical inertia toward established dosing

A handful of Korean tertiary hospitals (SNUH, AMC, SMC, Severance) now perform routine pre-procedural CYP2C19 testing. Coverage is expanding slowly.

Other CYP2C19 Substrates to Care About

CYP2C19 affects multiple drug classes. If you know your CYP2C19 phenotype, also consider:

Proton pump inhibitors (PPIs)

Omeprazole, esomeprazole, lansoprazole are CYP2C19 substrates.

  • PM → drug accumulates → may help reflux better, but long-term PPI use increases certain risks (B12 deficiency, bone loss)
  • UM → drug clears too fast → standard PPI dose may be insufficient

SSRIs

Citalopram and escitalopram are partly CYP2C19 metabolized.

  • PM → drug accumulates → increased risk of QT prolongation at standard dose (FDA dose cap may apply)
  • UM → may need higher dose to achieve effect

Antifungals (voriconazole)

CYP2C19 is the major metabolic pathway for voriconazole. PMs have elevated voriconazole levels and a higher toxicity risk; UMs have subtherapeutic levels and a higher risk of treatment failure. CYP2C19 is often genotyped before voriconazole is started.

DTC Limitations You Should Know

Same caveats as the broader DTC PGx discussion:

  • 23andMe catches the common *2, *3, and *17 variants, which gives good coverage of the clinically relevant alleles
  • It misses rare alleles such as *4, *5, *6, and *8. They occur in a small fraction of people, but if you carry one, your phenotype could differ from the prediction
  • It misses copy number and other structural variants (such as rare CYP2C19 gene deletions), which SNP arrays don't detect

For making actual prescription decisions (especially around major events like PCI), use a clinical CYP2C19 panel from a hospital lab. The DTC result is hypothesis-generating, not diagnostic. See PGx Complete Guide 2026 for the clinical testing landscape.

FAQ

Q: I'm scheduled for PCI next week — should I rush this? If you have time and your hospital can order it, request CYP2C19 testing. Clinical-grade test, not DTC. If timing is tight, your interventional cardiologist may default to alternative antiplatelets pending result, or use platelet function assays.

Q: I already had a stent on Plavix years ago, no problems — does it matter now? You survived the high-risk window (first 6-12 months post-stent). Long-term Plavix benefit is less time-sensitive. Worth knowing your CYP2C19 for future medications (PPI, SSRI choices). No emergency change needed for the chronic Plavix.

Q: My DTC shows *17/*17. Do I bleed more on Plavix? *17/*17 = ultrarapid → more active metabolite → stronger antiplatelet effect → modestly elevated bleeding risk. Tell your cardiologist; standard dose is still typically used, but they may monitor more closely.

Q: How does this differ from CYP2D6 (the other Big PGx gene)? CYP2D6 metabolizes ~25% of all prescription drugs (codeine, tramadol, many psychiatric meds). CYP2C19 metabolizes ~10% but includes critical cardiology drugs. They're different genes on different chromosomes — knowing one doesn't tell you the other. Test both via clinical panels if you take multiple medications.

Q: My family member had Plavix failure. Should I test before I might need it? CYP2C19 poor metabolizer status is inherited, so it clusters in families. If a first-degree relative had documented Plavix failure or stent thrombosis, your prior probability of being a PM is much higher than the population baseline. It is worth knowing your status, both for current and future medications.

Q: Can a doctor refuse to honor my DTC result? They cannot prescribe based solely on DTC; clinical guidelines require validated tests. They can use the DTC result as a reason to order the clinical test. That's the appropriate workflow.

Q: Does aspirin have similar genetic factors? Aspirin metabolism has some genetic variation but the clinical impact is much smaller and standard low-dose (75-100 mg) works for most. The dual antiplatelet question is really about clopidogrel.

Closing — The Practical Takeaways

  1. Korean populations have ~15-20% CYP2C19 PMs — much higher than European populations
  2. Clopidogrel (Plavix) needs CYP2C19 activation — PMs get reduced antiplatelet protection
  3. 23andMe catches the common *2, *3, and *17 alleles, and the code in this guide extracts them
  4. For PCI decisions, get clinical CYP2C19 testing — DTC is informational, not diagnostic
  5. Alternative antiplatelets exist (prasugrel, ticagrelor) for confirmed PMs/IMs

If you've had cardiology contact and ended up on clopidogrel, knowing your CYP2C19 phenotype is one of the most clinically actionable PGx datapoints available. The DTC raw data + the Python here gets you to a reasonable hypothesis; bring it to your cardiologist for confirmation testing.



References:

  • Lee, C. R. et al. (2022). CPIC Guideline for CYP2C19 and Clopidogrel Therapy: 2022 Update. Clinical Pharmacology & Therapeutics, 112, 959-967.
  • Mega, J. L. et al. (2009). Cytochrome P-450 polymorphisms and response to clopidogrel. NEJM, 360, 354-362.
  • PharmVar CYP2C19 page: https://www.pharmvar.org/gene/CYP2C19
  • PharmGKB: https://www.pharmgkb.org

⚠️ Medical disclaimer: This article is for informational purposes. DTC-derived genetic information is not a substitute for clinical pharmacogenomic testing or physician advice. Do not start, stop, or change clopidogrel or any antiplatelet medication based on DTC test results alone — discuss with your cardiologist.
