Glossary

Key Points

Data101: From raw data to individual samples files	Raw data should always be checked with FastQC. Assigning reads to specific samples is called demultiplexing. process_radtags is the built-in demultiplexing tool of Stacks, and it includes some basic quality control.
De-novo assembly without a reference genome	-M is the main parameter to optimise when identifying variants de novo with Stacks. Optimisation can often be performed on a subset of the data. SLURM scripts are the way to harness the cluster’s potential by running jobs.
Assembly with a reference genome	Reference genomes, even of poor quality or from a related species, are great for SNP identification. Reference-based SNP calling takes away the guesswork about distances between and within loci by mapping reads to individual locations within the genome.
Population genetics analyses	SNP filtering is about balancing signal against noise. The populations program in Stacks handles SNP filtering. Principal component analysis (PCA) and Structure are easy tools for visualising your samples.
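
To make the idea of demultiplexing concrete, here is a minimal Python sketch that assigns reads to samples by an inline barcode at the start of each sequence. It is not how process_radtags is implemented, and it does exact barcode matching only, with none of the quality control mentioned above. The barcodes, sample names, reads and the `demultiplex` function are all hypothetical, invented for illustration.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample map; real barcodes come from your own library prep sheet.
BARCODES = {"ACGT": "sample_A", "TGCA": "sample_B"}


def demultiplex(reads, barcodes):
    """Group (name, sequence, quality) reads by the barcode at the start of each sequence."""
    # Assumes every barcode has the same length.
    barcode_length = len(next(iter(barcodes)))
    by_sample = defaultdict(list)
    for name, sequence, quality in reads:
        sample = barcodes.get(sequence[:barcode_length], "unassigned")
        if sample != "unassigned":
            # Trim the barcode off the read and off its quality string.
            sequence = sequence[barcode_length:]
            quality = quality[barcode_length:]
        by_sample[sample].append((name, sequence, quality))
    return dict(by_sample)


# Made-up reads: two carry a known barcode, one does not.
reads = [
    ("read1", "ACGTTTGACC", "IIIIIIIIII"),
    ("read2", "TGCAGGATCC", "IIIIIIIIII"),
    ("read3", "NNNNGGATCC", "##########"),
]
result = demultiplex(reads, BARCODES)
for sample, sample_reads in result.items():
    print(sample, [name for name, _, _ in sample_reads])
```

Reads whose first bases match no known barcode are kept under "unassigned" rather than dropped, so you can see how much of the raw data could not be attributed to a sample.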
