Call, annotate, and filter variants
VCF workflows produce VCF files, the format most downstream variant tools read. Choose a caller based on variant type, sequencing technology, runtime, and accuracy needs.
Variant calling is where small test regions are especially important. A command can be syntactically correct but still fail because the BAM is unsorted, the CRAM cannot find its reference, the contig names do not match, or the caller needs more memory than expected. A fast mitochondrial or small-gene test catches those problems before a whole-genome run consumes hours.
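One of the failure modes above, mismatched contig names, can be checked before any caller runs. The sketch below is illustrative only and is not part of wgsextract: it assumes you have already dumped the alignment header to text (for example with `samtools view -H`) and have the reference's `.fai` index (as written by `samtools faidx`). The function names and the sample inputs are hypothetical.

```python
def sq_names(sam_header_text):
    """Return contig names from the @SQ lines of a SAM/BAM/CRAM header."""
    names = []
    for line in sam_header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def fai_names(fai_text):
    """Return contig names from a FASTA index (.fai); the name is column 1."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def missing_contigs(sam_header_text, fai_text):
    """Contigs named in the alignment header but absent from the reference."""
    reference = set(fai_names(fai_text))
    return [name for name in sq_names(sam_header_text) if name not in reference]


# Hypothetical inputs: the alignment uses "chrM" but the reference uses "MT".
header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@SQ\tSN:chrM\tLN:16569\n"
)
fai = "chr1\t248956422\t112\t60\t61\nMT\t16569\t253105708\t60\t61\n"

print(missing_contigs(header, fai))  # ['chrM']
```

An empty list means every contig in the alignment header has a same-named entry in the reference index. It does not prove the sequences are identical, only that the names line up.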
Annotation is a separate step from calling. A caller says what appears to differ from the reference. Annotation adds context such as gene names, transcript consequences, population frequencies, or clinical database matches. Filtering then narrows the result to records that match a quality threshold, region, gene, consequence, or custom filter expression.
pixi run wgsextract --input sample.bam --ref /refs/hs38.fa vcf snp --region chrM
pixi run wgsextract --input sample.bam --ref /refs/hs38.fa vcf snp
pixi run wgsextract --input sample.bam --ref /refs/hs38.fa vcf indel
pixi run wgsextract --input sample.bam --ref /refs/hs38.fa vcf sv
pixi run wgsextract vep run --vcf-input calls.vcf.gz
pixi run wgsextract vcf filter --vcf-input calls.vep.vcf.gz --gene BRCA1
pixi run wgsextract vcf filter --vcf-input calls.vcf.gz --expr 'QUAL>30'
SNP and InDel calls
SNP and InDel workflows target small sequence changes. They are often the first full-genome variant calls people run, but they still need a matching reference, sorted and indexed alignments, and enough coverage.
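The sorted and indexed requirement can be sanity-checked from the header text and the file name alone. This is a minimal sketch under stated assumptions, not wgsextract behavior: it reads the `SO` tag on the `@HD` header line, which only records what the file declares about itself, and it lists the index file names conventionally placed next to a BAM or CRAM. Both helper names are hypothetical.

```python
def sort_order(sam_header_text):
    """Return the SO value from the @HD line, or None if none is declared."""
    for line in sam_header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


def index_candidates(alignment_path):
    """List conventional index file names for a BAM or CRAM path."""
    if alignment_path.endswith(".cram"):
        return [alignment_path + ".crai"]
    if alignment_path.endswith(".bam"):
        stem = alignment_path[: -len(".bam")]
        return [alignment_path + ".bai", stem + ".bai", alignment_path + ".csi"]
    return []


# Hypothetical header declaring coordinate order.
header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chrM\tLN:16569\n"

print(sort_order(header) == "coordinate")
print(index_candidates("sample.bam"))
```

To turn the second helper into a real check, test each candidate with `os.path.exists`. A header that declares `coordinate` is a good sign but not a guarantee, since the tag is written by whichever tool produced the file.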
Structural and copy-number calls
SV and CNV workflows look for larger changes. These can be sensitive to sequencing technology, read length, depth, and caller assumptions, so compare outputs from different callers or sequencing runs cautiously.
Annotation and filtering
Annotation tools such as VEP add biological context. Filtering is useful for exploration, but a filtered list is not a diagnosis and should not be used as medical advice.
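To make the idea of a quality filter concrete, here is a small sketch of what a threshold such as `QUAL>30` means on uncompressed VCF text. It is not how wgsextract evaluates `--expr`, which the text above does not specify. The function name and sample records are hypothetical, and dropping records with a missing QUAL (`.`) is an assumption made for this example.

```python
def filter_by_qual(vcf_lines, min_qual=30.0):
    """Yield header lines unchanged, plus records whose QUAL exceeds min_qual."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep meta-information and the column header
            continue
        fields = line.rstrip("\n").split("\t")
        qual = fields[5]  # QUAL is the sixth tab-separated VCF column
        if qual == ".":
            continue  # assumption: records with missing QUAL are dropped
        if float(qual) > min_qual:
            yield line


# Hypothetical records for illustration only.
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrM\t73\t.\tA\tG\t58.2\tPASS\t.",
    "chrM\t150\t.\tC\tT\t12.0\tPASS\t.",
    "chrM\t263\t.\tA\tG\t.\t.\t.",
]

kept = [line for line in filter_by_qual(vcf) if not line.startswith("#")]
print(len(kept))  # 1
```

Only the record at position 73 survives: the second record falls below the threshold and the third has no QUAL value. Real filter tools may treat missing values differently, so check the behavior of whichever tool you use.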