The other day I decided to disassemble the bathroom doorknob. Efforts included chipping away layers of paint and recruiting some muscle to remove the screws the chipping had revealed. When I levered out the latch system and took it apart, I noticed two things. First, parts of it had a beautiful copper color. Second, the internal spring was broken into three parts. The latter explained the sticky latch you had to jiggle to get the door to stay closed, and why the door had started to pop open randomly as if possessed.

I made the fix with a new spring.

It may be a compulsion to want to fix broken things. I think it stems from the same curiosity that makes you want to take things apart to understand how they work [1]. When I joined the GATK team some 3.5 years ago as a technical writer, this compulsion surfaced and drove the effort behind the pieces I wrote. Below, at the end of this post, is a sampling of my most viewed articles.

I would like to thank you for allowing me to serve you these past years. I have learned much in the process. The knowledge I have gained in genomics comes not only from these writing projects but also just as much from answering your questions on the forum. By holding to answering just one forum question a day, I am proud to have earned over 250 likes and points aplenty for the forum’s five-star ranking.

My last day with the DSP Communications Team is April 1, which is today [2]. Rest assured, my teammates and our wonderful methods developers will continue to take excellent care of you.

Looking back, 2018 was a busy year. Geraldine asked me to help out at the July Cambridge UK workshop and also at the December Taiwan workshop [3]. Each workshop brings with it a torrent of activity creating and updating materials. It is always insightful and rewarding to interact firsthand with researchers, to hear about sticking points and to see reactions to the tutorials I develop and write.

Since returning from the December workshop, I have been submarined, pouring effort into finalizing the gCNV tutorial in time for my departure. I hope you find it useful. This tutorial has been the most challenging to develop so far, in that exploring the results involved more creative solutions than usual, as you will see in the tutorial’s companion Jupyter Notebook reports here and here [4].

Before I start searching for a new job, this month I will spend some time visiting friends and family and remembering my Ph.D. advisor at his memorial. If you would like to lend your support, I would love to have your endorsement on LinkedIn [5]. If you need to get in touch with me, please ping me on GitHub, in the broadinstitute/gatk repository. My handle is @sooheelee and I will be checking in intermittently.

It has been a privilege.

Yours truly,

Soo Hee


Footnotes

[1] This curiosity should not be surprising in someone who once walked the life of a Ph.D. biochemist. And it should be expected from someone whose folks include a plant pathologist (Dad studied in North Dakota) and a WWII pilot turned aeronautical engineer (Mr. Cummings served in the Army Air Corps; he is turning 95 this May and I will be seeing him for his birthday). Each of my families tells me I’m molded from the same clay as my fathers.
[2] No, this is not an April Fools' joke.
[3] There were two Taiwan workshops in 2018. The video footage of the December 2018 Taiwan workshop is not posted anywhere else, and so here is the link: https://drive.google.com/drive/folders/1-uMoz-ui5IteriKngee7Vic9AWAcnfcL.
[4] I have become a fan of pandas the software but also the animal.
[5] Connect with me, and, if you feel like it, please endorse my skills in Genomics.


A sampling of my most popular articles grouped by year and sorted by number of views

| Year | Views | Article # | Title |
|------|-------|-----------|-------|
| 2015 | 26.1K | 6484 | (How to) Generate an unmapped BAM from FASTQ or aligned BAM |
| | 17.2K | 6483 | (How to) Map and clean up short read sequence data efficiently |
| 2016 | 17.9K | 6747 | (How to) Mark duplicates with MarkDuplicates or MarkDuplicatesWithMateCigar |
| | 8.2K | 7857 | Reference Genome Components |
| | 7.8K | 8017 | (How to) Map reads to a reference with alternate contigs like GRCh38 |
| | 4.9K | 7847 | Changing workflows around calling SNPs and indels |
| | 4.4K | 7156 | (howto) Perform local realignment around indels |
| | 3.5K | 7899 | Reference implementation: PairedEndSingleSampleWf pipeline |
| | 2.2K | 6926 | Spanning or overlapping deletions (* allele) |
| | 2.0K | 8180 | 9 Takeaways to help you get started with GRCh38 |
| | 1.9K | 7859 | (How to) Simulate reads using a reference genome ALT contig |
| | 1.1K | 7019 | Sam flags down a boat |
| 2017 | 22.7K | 9143* | (How to) Call somatic copy number variants using GATK4 CNV |
| | 2.9K | 9183* | (How to) Call somatic SNVs and indels using MuTect2 |
| | 2.2K | 10172 | (How to) Run the GATK4 Docker locally and take a look inside |
| | 1.7K | 10911 | Differences between GATK3 MuTect2 and GATK4 Mutect2 |
| | 1.1K | 10060 | (How to) Run FlagStatSpark on a cloud Spark cluster |
| 2018 | 18.5K | 11136 | (How to) Call somatic mutations using GATK4 Mutect2 |
| | 3.0K | 11682 | (How to part I) Sensitively detect copy ratio alterations and allelic segments |
| | 2.6K | 11127 | Somatic calling is NOT simply a difference between two callsets |
| | 2.0K | 11683 | (How to part II) Sensitively detect copy ratio alterations and allelic segments |
| | 938 | 12350 | (How to) Filter on genotype using VariantFiltration |
| | 740 | 11315 | Off-label workflow to simply call differences in two samples |
| | ~ | 23216 | (How to) Filter variants either with VQSR or by hard-filtering |
| 2019 | ~ | 11684 | (How to) Call common and rare germline copy number variants |
| | ~ | 11685 | (Notebook) Concordance of NA19017 chr20 gCNV calls |
| | ~ | 11686 | (Notebook) Correlate gCNV callset metrics and annotations |
| | ~ | 11687 | After gCNV calling considerations |

*Uses older versions of tools that have been replaced. ~Published in the last three months.




Mon 1 Apr 2019

SkyWarrior on 1 Apr 2019


Thanks for all the help you provided and good luck and best for your new ventures.

shlee on 1 Apr 2019


Thanks @SkyWarrior. And thank you for all your contributions on the forum.

manolis on 1 Apr 2019


Thanks for all the help @shlee !!! Good luck for everything and everywhere !!!

jin0008 on 1 Apr 2019


Thanks a lot @shlee. from South Korea.

blueskypy on 1 Apr 2019


@shlee, I feel sad to hear this! I'll be missing the quality of your answers, and missing the confidence of your reply after pinging you! Best wishes and good luck!



