When providing two or more sequence lists as input for Taxonomic Profiling, the three optional outputs - ‘Reads matching the reference database’, ‘Reads matching the host genome’, and ‘Unclassified reads’ - contain only reads from the first input sequence list. The issue also affects the sequence list that can be generated by clicking "Extract Reads from Selection" on the abundance table.
The abundance table results and the taxonomic profiling report are not affected by this issue.
If the affected sequence lists are used for downstream analysis, the results will be based on only a subset of the intended reads and consequently may differ from those that would have been obtained if the full set of reads had been used.
These template workflows use the output "Viral reads" (reads matching the reference database) as input for downstream read mapping, variant detection and consensus sequence generation. If two or more sequence lists are provided as workflow input, the results may be skewed.
In addition, the output ‘Reads matching the host genome’ is used as input for the analysis step ‘Map Reads to Human Control Genes’. If this input contains only a subset of the intended reads, the presence of reads from human control genes will be underestimated.
This template workflow outputs “Cleaned reads” (reads matching the reference database), “Background reads” (reads matching the host genome), and “Unmapped reads”. If two or more sequence lists are provided as workflow input, these outputs will contain only a subset of the intended reads. If the outputs are used for downstream analysis, the results may be skewed.
This issue affects CLC Microbial Genomics Module 24.0 and CLC Microbial Genomics Server Extension 24.0.
This was addressed in CLC Microbial Genomics Module 24.0.1 and CLC Microbial Genomics Server Extension 24.0.1.