In 2013 we changed our quality control (QC) methods for PlasmoGEM vectors, in order to fulfill our goal of releasing only vectors of the highest possible quality. Initially, PlasmoGEM vectors were tested using a combination of PCR and NotI restriction digest QC steps. More recently, we have started to sequence the full genomic DNA insert of all vectors in large batches using Illumina technology. This lets us compare the complete sequence of each vector with the expected nucleotide sequence, which allows us to detect the vast majority of potential problems. We now use this as our primary QC for all new vectors, and new vectors will only be released once they have passed QC. Please note that some existing constructs are currently being re-examined with this method and may be removed from our database if they fail the new QC procedure.
We sequence our constructs to a high depth and the results are therefore very reliable, but there are limits to the technology. For example, we may not be able to map the short reads we obtain to some of the very AT-rich and/or repetitive regions of the genome, which may lead to an incorrect QC failure. We have also noticed cases of known loci with higher-than-usual sequence variability, which may likewise lead to an incorrect QC failure by sequencing.
Large deletions, as well as SNPs and indels in coding sequences, are the most serious types of quality issue, because they may introduce unintended mutations in tagged or neighboring genes, which in turn may lead to an incorrect interpretation of your phenotype. If we cannot provide a sequence-perfect vector, we will tolerate SNPs and small indels in non-coding regions.
More information about this is summarised in the next section.
The sequence of the homology region has been verified. However, we occasionally find small indels or single base changes in non-coding genomic sequence in the homology arms, and such vectors still pass QC. The majority of these mutations are single base insertions or deletions in long homopolymeric tracts of A/T nucleotides. Many of these will have originated in E. coli, but others may pre-exist in the parasite clone from which the gDNA library was created. Small variations in non-coding regions of low complexity are difficult to avoid entirely, and the vast majority will neither affect vector function nor influence downstream phenotypic analysis of the resulting transgenic lines. These constructs pass QC so that they can be used where no sequence-perfect vector is available. Please note that the sequence of the selection cassette in the final vectors is not verified, but we do not expect this to be a problem.
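The pass/fail rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the PlasmoGEM pipeline: the `Variant` record, the function names and the `SMALL_INDEL_MAX_BP` cutoff are all hypothetical, and the text does not say how long an indel must be to count as "large".

```python
from dataclasses import dataclass

# Hypothetical cutoff: the text does not define where "small" ends and "large" begins.
SMALL_INDEL_MAX_BP = 10


@dataclass
class Variant:
    kind: str        # "snp", "insertion" or "deletion"
    length: int      # number of bases affected
    in_coding: bool  # True if the variant overlaps a coding sequence


def variant_is_tolerated(v: Variant) -> bool:
    """Apply the rule stated in the text to a single difference from the expected sequence."""
    if v.in_coding:
        # SNPs and indels in coding sequences are never tolerated.
        return False
    if v.kind == "snp":
        # SNPs in non-coding regions are tolerated.
        return True
    # Non-coding indels are tolerated only while they are small (assumed cutoff).
    return v.length <= SMALL_INDEL_MAX_BP


def vector_passes_qc(variants: list[Variant]) -> bool:
    """A vector passes if every observed variant is tolerated (or there are none)."""
    return all(variant_is_tolerated(v) for v in variants)


if __name__ == "__main__":
    homopolymer_slip = Variant(kind="deletion", length=1, in_coding=False)
    coding_snp = Variant(kind="snp", length=1, in_coding=True)
    print(vector_passes_qc([homopolymer_slip]))              # tolerated non-coding indel
    print(vector_passes_qc([homopolymer_slip, coding_snp]))  # coding SNP fails the vector
```

A single-base deletion in a non-coding A/T homopolymer tract passes under this sketch, while any coding-sequence change or large deletion fails the whole vector, mirroring the distinction drawn in the text.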
Detection of any one of the below feature(s)