Why Single-Variant Pharmacogenomics Is Misleading: Star Alleles, Diplotypes, and What DTC Arrays Miss

TL;DR (Quick Answer)

Is a pharmacogenomics result from a single variant trustworthy? Usually not on its own. Drug response is determined by your diplotype (the pair of star alleles you carry), not by any one SNP, and consumer arrays can't reliably assemble that diplotype.

PGx is haplotype-level. A star allele like CYP2D6 *4 is a defined combination of variants; your phenotype comes from the diplotype (e.g. *1/*4) and its activity score, not a single flag.
Arrays miss three things — variants that aren't on the chip, the phase needed to assemble a haplotype, and structural variation (deletions, duplications) that single-position genotyping can't see at all.
Confidence is per-gene, not global. Some genes (CYP2C19, CYP2C9/VKORC1, TPMT) are reasonably callable from array data for common alleles; CYP2D6 is the worst case and routinely mis-called.
The silent killer — genome build. 23andMe raw data is GRCh37; most current PGx coordinates are GRCh38. A bad liftover or strand flip produces a confident wrong allele.

What "Star Allele" and "Diplotype" Actually Mean

A star allele is a named haplotype: a specific set of variants inherited together on one copy of a pharmacogene (CYP2D6 *4, CYP2C19 *2, and so on). Because you have two copies of each gene, your genotype is a diplotype — the pair of star alleles, like *1/*4. That diplotype maps to a metabolizer phenotype (poor, intermediate, normal, ultrarapid) — via an activity score for genes like CYP2D6 and CYP2C9, or directly from functional diplotype tables for others like CYP2C19 — and that phenotype is what drives a dosing decision.
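To make the diplotype-to-phenotype step concrete, here is a minimal sketch of activity-score arithmetic for CYP2D6. The allele values are a small illustrative subset and the cut-offs follow the CPIC/DPWG consensus ranges as we understand them; the function names are hypothetical, and a real tool should load the current CPIC tables rather than hard-code anything.

```python
import re

# Illustrative subset of CYP2D6 allele activity values (assumption: verify
# against the current CPIC allele functionality table before relying on them).
ACTIVITY_VALUE = {"*1": 1.0, "*2": 1.0, "*4": 0.0, "*5": 0.0, "*10": 0.25}


def allele_activity(allele: str) -> float:
    """Return the activity value of one allele, e.g. '*4' or '*1x2'."""
    match = re.fullmatch(r"(\*\d+)(?:x(\d+))?", allele)
    if match is None or match.group(1) not in ACTIVITY_VALUE:
        raise ValueError(f"allele not in illustrative table: {allele}")
    copies = int(match.group(2) or 1)  # 'x2' means two copies on one chromosome
    return ACTIVITY_VALUE[match.group(1)] * copies


def cyp2d6_phenotype(diplotype: str):
    """Map a diplotype such as '*1/*4' to (activity score, phenotype)."""
    first, second = diplotype.split("/")
    score = allele_activity(first) + allele_activity(second)
    if score == 0:
        phenotype = "poor"
    elif score < 1.25:
        phenotype = "intermediate"
    elif score <= 2.25:
        phenotype = "normal"
    else:
        phenotype = "ultrarapid"
    return score, phenotype


for example in ("*1/*1", "*1/*4", "*4/*4", "*1x2/*1"):
    print(example, cyp2d6_phenotype(example))
```

Note that the input is always the full diplotype, including copy number. Nothing in this calculation can be driven from a single SNP.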

The reference definitions live with PharmVar, the drug–gene evidence with PharmGKB, the prescribing guidance with CPIC, and the production-grade diplotype caller is PharmCAT. The important structural point: CPIC recommendations are keyed to the diplotype and phenotype, never to an individual variant. That is the whole reason single-variant annotation is the wrong unit.

Why a single variant tells you almost nothing

Say a report flags rs3892097, one of the variants that defines CYP2D6 *4. On its own, that flag can't tell you:

whether the other chromosome is functional, which is what decides between an intermediate and a poor metabolizer;
whether a rare decreased-function allele the chip never tested is also present;
whether there's a whole-gene deletion or duplication changing the dose entirely.

You can't get from "this SNP is present" to a phenotype without the rest of the diplotype. A tool that prints "CYP2D6 variant detected" next to a drug is technically true and practically misleading, because the reader takes it as an answer when it's a fragment.

Why consumer arrays can't reliably call diplotypes

Three independent problems stack up, and they're the reason this matters for any tool taking 23andMe or AncestryDNA input.

Incomplete coverage. Across a pharmacogene, star-allele definitions span dozens of positions, and a consumer chip genotypes only a subset of them. If a defining variant isn't on the array, the allele it defines is simply unobservable, and the tool will quietly default you to *1 (normal), which is a false negative, not an "unknown."
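A sketch of how a tool could keep "not tested" separate from "tested and absent" is below. The definition table is deliberately abbreviated to three CYP2C19 alleles with their plus-strand variant bases, and the names DEFINING_VARIANTS and allele_status are hypothetical; real definitions should come from PharmVar.

```python
# Abbreviated, illustrative definition table: allele -> {rsID: variant base}.
# Assumption: bases are given on the plus strand, matching the input genotypes.
DEFINING_VARIANTS = {
    "CYP2C19*2": {"rs4244285": "A"},
    "CYP2C19*3": {"rs4986893": "A"},
    "CYP2C19*17": {"rs12248560": "T"},
}

NO_CALLS = {"--", ""}  # "--" is how 23andMe raw files mark a no-call


def allele_status(allele, genotypes):
    """Classify one allele as 'present', 'absent', or 'unobserved'."""
    site_has_variant = []
    for rsid, variant_base in DEFINING_VARIANTS[allele].items():
        call = genotypes.get(rsid)
        if call is None or call in NO_CALLS:
            # Missing from the chip or a failed call: we cannot say anything.
            return "unobserved"
        site_has_variant.append(variant_base in call)
    return "present" if all(site_has_variant) else "absent"


# Example: *2 site is heterozygous, *3 is a no-call, *17 is not on the chip.
genotypes = {"rs4244285": "AG", "rs4986893": "--"}
statuses = {a: allele_status(a, genotypes) for a in DEFINING_VARIANTS}
print(statuses)

# Only claim the reference allele when every tested allele was actually ruled out.
reference_callable = all(s == "absent" for s in statuses.values())
print("can default to *1:", reference_callable)
```

The design point is the third state. A pipeline that only knows "present" and "absent" has no way to avoid turning a missing probe into a normal result.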

No phase. An array gives you genotypes, not haplotypes. If you're heterozygous at two positions, the array can't tell you whether those variants sit on the same chromosome or opposite ones, and that distinction changes the star-allele call. This bites even in "easy" genes: TPMT *3A is itself a two-SNP haplotype (rs1800460 + rs1142345), so the same unphased genotype can read as *3A/*1 or as *3B/*3C, which are different diplotypes. Tools infer phase statistically, which is a guess, not a measurement.
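The TPMT ambiguity can be shown by brute force. This sketch enumerates every pair of haplotypes consistent with unphased alternate-allele counts at the two sites; the function name and the count-based input format are assumptions made for illustration.

```python
from itertools import product

# Haplotype = (variant at rs1800460?, variant at rs1142345?) -> star allele.
HAPLOTYPE_NAME = {(0, 0): "*1", (1, 1): "*3A", (1, 0): "*3B", (0, 1): "*3C"}


def possible_diplotypes(alt_rs1800460, alt_rs1142345):
    """List diplotypes consistent with unphased alt-allele counts (0, 1, or 2)."""
    haplotypes = list(product((0, 1), repeat=2))
    found = set()
    for h1, h2 in product(haplotypes, repeat=2):
        # Keep the pair only if its per-site totals match the observed genotype.
        if h1[0] + h2[0] == alt_rs1800460 and h1[1] + h2[1] == alt_rs1142345:
            names = sorted((HAPLOTYPE_NAME[h1], HAPLOTYPE_NAME[h2]))
            found.add("/".join(names))
    return sorted(found)


# Heterozygous at both sites: two different diplotypes fit the same data.
print(possible_diplotypes(1, 1))
# Heterozygous at one site only: the call is unambiguous.
print(possible_diplotypes(1, 0))
```

For the double heterozygote, one candidate keeps a functional *1 copy and the other carries two no-function alleles, so the choice between them is not cosmetic. A tool that picks one by population frequency should report that it did so.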

Structural variation is invisible. This is the big one for CYP2D6, which has whole-gene deletions (*5), duplications and multiplications (the path to ultrarapid metabolism), and CYP2D6–CYP2D7 hybrids. A 23andMe raw file gives you one genotype call per position with no probe-intensity signal. Array hardware can in principle support copy-number calling from intensity, but 23andMe doesn't expose that, so *5 and gene duplications are undetectable from the raw data file, full stop. We worked through exactly this in the 23andMe CYP2D6 star-allele tutorial: you can catch the common SNP-defined alleles and you cannot catch the deletion.

This is also why PharmCAT is built around sequencing/VCF input and is cautious about array-derived data. If your tool runs PharmCAT on a converted 23andMe file, the diplotypes it returns carry all three of these limitations, even though the output looks just as confident as a clinical result.

Confidence is per-gene, and you should say so

"Can you do PGx from a 23andMe file" doesn't have one answer; it has one per gene.

Genes whose clinically common alleles are defined by a few well-covered SNPs are reasonably callable. CYP2C19 (*2 = rs4244285, *3, *17) drives clopidogrel response and is usually tractable — that's the basis of the CYP2C19 / Plavix prediction walkthrough. CYP2C9 with VKORC1 (warfarin) and TPMT (thiopurines) are often callable for the common variants too. But even here, "normal metabolizer" can be a false negative, because a rare or structural allele the chip didn't test would also read as normal.

CYP2D6 is the gene people most want and the one arrays handle worst, because of the structural variation above. A report that hands you a confident CYP2D6 phenotype from array data is overselling what the data can support.

This isn't a fringe worry. The consumer-facing PGx reporting from the array vendors themselves tests only a limited, curated variant set and carries an explicit warning against using it to make treatment decisions. When the company that generated the data won't call the phenotype outright, a third-party tool that does is claiming more than the chip can support.

So the honest design is a per-gene confidence label, not a global one: report a diplotype only where the chip actually covers the defining variants, and flag the genes (CYP2D6 first) where array data isn't enough, rather than defaulting them to normal.
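One possible shape for that label is sketched below. The class, the function, the three confidence levels, and the choice to cap CYP2D6 at "low" are all assumptions for illustration, not an established scheme.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption for this sketch: genes where structural variation is common enough
# that array genotypes alone should never produce a confident call.
STRUCTURAL_VARIANT_GENES = {"CYP2D6"}


@dataclass
class GeneReport:
    gene: str
    diplotype: Optional[str]  # None means "not called", never "normal"
    confidence: str
    note: str


def label_gene(gene, diplotype, sites_required, sites_observed):
    """Attach a per-gene confidence label to an array-derived diplotype."""
    missing = set(sites_required) - set(sites_observed)
    if missing:
        # Withhold the diplotype instead of defaulting to *1.
        note = f"{len(missing)} defining site(s) not genotyped"
        return GeneReport(gene, None, "insufficient", note)
    if gene in STRUCTURAL_VARIANT_GENES:
        note = "copy number and hybrid alleles not assessed from array data"
        return GeneReport(gene, diplotype, "low", note)
    note = "common alleles covered; rare and structural alleles not excluded"
    return GeneReport(gene, diplotype, "moderate", note)


print(label_gene("CYP2C19", "*1/*2", {"rs4244285", "rs4986893"}, {"rs4244285"}))
print(label_gene("CYP2D6", "*1/*4", {"rs3892097"}, {"rs3892097"}))
```

Even the best case here is labeled "moderate" rather than "high", which reflects the point made earlier in this section: an array cannot exclude alleles it never tested.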

The unglamorous gotcha: genome build

None of the above matters if the coordinates are wrong to begin with. 23andMe raw data is reported on GRCh37; most current PGx resources and VCF-based pipelines are on GRCh38. If you don't lift over correctly, or if you don't verify strand (23andMe's raw-file header states that calls are oriented to the plus strand of the reference, but array manifests, other vendors' exports, and intermediate conversion tools do not always agree), you can assign the wrong allele with full confidence. Build and strand normalization is where DTC annotation pipelines quietly produce wrong calls, so test against a known sample before you trust a single output. The general accuracy limits are covered in DTC genetic testing: accuracy and limits.
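A cheap sanity check for the strand half of the problem is to compare each genotype with the reference and alternate alleles you expect at that site. The sketch below is a hypothetical helper, not part of any library, and it assumes the expected alleles are given on the plus strand of the same build as the input.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def strand_status(genotype, ref, alt):
    """Compare a genotype string like 'AG' with the expected ref/alt alleles."""
    observed = set(genotype)
    if not genotype or not observed <= set(COMPLEMENT):
        return "no_call"  # e.g. '--' or an indel code
    if COMPLEMENT[ref] == alt:
        # A/T and C/G sites look the same on both strands, so a flip is invisible.
        return "ambiguous"
    expected = {ref, alt}
    if observed <= expected:
        return "ok"
    if {COMPLEMENT[base] for base in observed} <= expected:
        return "flipped"
    # Neither strand fits: suspect a wrong build, coordinate, or a multi-allelic site.
    return "mismatch"


print(strand_status("AG", "G", "A"))  # matches the expected alleles
print(strand_status("CT", "G", "A"))  # matches only after complementing
print(strand_status("AT", "A", "T"))  # palindromic site
```

Running a check like this over every PGx site before calling alleles turns silent strand and build errors into visible ones, and the "ambiguous" sites are the ones worth validating against a known sample.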

If you're building a PGx tool

Call diplotypes with PharmCAT on VCF/sequencing input; treat array-derived input as lower-confidence and say so.
Report per-gene confidence. Don't let an unobserved allele silently become *1.
Key recommendations to the diplotype and phenotype via CPIC and PharmGKB; don't invent interpretations from individual variants.
Normalize build and strand first, and validate against a reference sample.
For the bigger picture of where DTC data is strong and weak, the Pharmacogenomics Complete Guide is the pillar this sits under.

FAQ

Can my 23andMe data give me an accurate pharmacogenomics result? For some genes and their common star alleles, yes; for CYP2D6 and any structural variant, no. Treat a DTC PGx result as a screen, not a diagnosis, and confirm anything actionable with a clinical panel.

Why do two PGx tools give me different results from the same file? Usually phasing assumptions, which star alleles each tool tries to call, and how they handle build/strand. Single-variant annotators and true diplotype callers can disagree on the same data because they're answering different questions.

Is PharmCAT accurate on 23andMe data? PharmCAT is the standard diplotype caller, but it's designed for sequencing/VCF. On array-derived input it inherits the array's gaps (missing variants, no phase, no CNV), so its output is only as complete as the chip underneath it.

What's the single most common DTC pharmacogenomics mistake? Reading a single-variant flag as a phenotype, and defaulting an unobserved allele to normal. Both make a confident result out of incomplete data.

This is general educational content about how pharmacogenomic annotation works, not medical advice. Discuss any medication decision with a clinician or pharmacist who can order and interpret a validated PGx test.

Related posts

Reading 23andMe Raw Data for CYP2D6 Star Alleles in Python — Why DTC Often Misses *5 Deletion

Extracting CYP2C19 Star Alleles from 23andMe — Plavix (Clopidogrel) Response Prediction in Python

Pharmacogenomics (PGx) Complete Guide: How Genes Determine Drug Response (2026)

ClinVar Variant Pathogenicity Lookup in Python — Programmatic Access for Hereditary Disease Screening (2026)