AnalyzeCovariates produces invalid CSV when read groups contain commas, resulting in a crash
open | Created 2019-03-01 | Last updated 2019-03-01 | Posted by bartgrantham


bug


Bug Report

Affected tool(s) or class(es)

GATK 4.1.0.0 AnalyzeCovariates

Affected version(s)

  • Latest public release version [version?]
  • Latest master branch as of [date of test?]

Description

The CSV produced by AnalyzeCovariates is invalid: commas inside field values are not escaped or quoted, so the R script fails when it reads the file.
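
To illustrate the failure mode, here is a minimal Python sketch (not GATK code). The column names and the read group value `sample1,lane2` are made up for illustration and are not the actual AnalyzeCovariates schema. The point is only that joining fields with commas, without quoting, yields a data row with more fields than the header.

```python
import csv
import io

# Hypothetical columns and values, not the actual AnalyzeCovariates schema.
header = ["ReadGroup", "EventType", "Observations"]
row = ["sample1,lane2", "M", "100"]  # read group name contains a comma

# Naive serialization: join with commas, no quoting or escaping.
naive_csv = ",".join(header) + "\n" + ",".join(row) + "\n"

# Parse it back the way a standard CSV reader would.
header_fields, data_fields = list(csv.reader(io.StringIO(naive_csv)))

# The data row now has one more field than the header.
print(len(header_fields), len(data_fields))  # prints: 3 4
```

A row that is wider than the header is the kind of mismatch that the "more columns than column names" error reports.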

Steps to reproduce

This happens whenever a read group in the BAM contains a comma.

Expected behavior

It should produce valid CSV files and then produce the plots correctly.
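
One conventional way to get valid output is to wrap any field containing the delimiter in double quotes, as described in RFC 4180. The following is a minimal Python sketch using the standard `csv` module. The columns and values are hypothetical, and this shows the expected output format rather than how the GATK writer is or should be implemented.

```python
import csv
import io

# Hypothetical columns and values, not the actual AnalyzeCovariates schema.
header = ["ReadGroup", "EventType", "Observations"]
row = ["sample1,lane2", "M", "100"]  # read group name contains a comma

# QUOTE_MINIMAL quotes only the fields that need it (here, the read group).
buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
writer.writerow(header)
writer.writerow(row)
quoted_csv = buf.getvalue()

# Reading the quoted output back recovers the original three fields.
round_tripped = list(csv.reader(io.StringIO(quoted_csv)))
print(quoted_csv, end="")
```

With quoting in place, the data row is written as `"sample1,lane2",M,100`, and a reader that honors double quotes sees the same number of fields as the header.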

Actual behavior

Commas in read group names result in malformed (unescaped) CSV whose fields cannot be parsed correctly. This results in the following R script error:

Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
  more columns than column names
