NAT (NGS Analysis Training) returns in February 2020!

December 7, 2019 by yui102

https://sites.psu.edu/yuka/nat2020/

More info to follow! (Updates will be posted on the NAT2020 page above!)


Dr. Shinoka’s seminar hosted by Stem Cell and Regenerative Biology Program (SCRBP)

March 18, 2018 by yui102

NEW on 3/22/18: we will be broadcasting Dr. Shinoka’s seminar by Zoom!

Join URL: https://psu.zoom.us/j/955257613

Quoting the original invitation from Irina (SCRBP’s organizer)…

Dear Colleagues,

We are pleased to announce our next speaker, Dr. Toshiharu Shinoka, a pediatric thoracic and cardiac surgeon who works in the field of regenerative medicine.
Dr. Toshiharu Shinoka is a Professor of Surgery and Co-Director of the Cardiovascular Tissue Engineering Program at the Heart Center of Nationwide Children’s Hospital in Columbus, Ohio.

The title of Dr. Shinoka’s presentation is “History and Current Status of Vascular Tissue Engineering in Pediatric Heart Surgery”. The lecture will take place on Friday, March 23, at noon, room C3621.

Dr. Shinoka’s research interests are focused on developing vascular grafts that have the ability to grow and remodel. These grafts are composed of a woven fabric and seeded with autologous bone marrow mononuclear cells. Dr. Shinoka and colleagues were among the first scientists to begin a human clinical trial evaluating vascular grafts in patients with univentricular physiology. In his lecture, Dr. Shinoka will review the long-term results of patients who underwent implantation of tissue-engineered vascular grafts and discuss the feasibility of tissue-engineered vascular grafts in pediatric cardiovascular surgery.

We look forward to seeing you there!

Dr. Shinoka is my longtime friend from Yale University. He is not only an extraordinary cardiothoracic surgeon, but also runs groundbreaking research in the field of vascular tissue engineering. Thank you sooooooooo much, Irina, SCRBP, JFDP (Junior Faculty Development Program) and Four Diamonds for supporting this super exciting event! Can’t wait to see you at the seminar!!!

(In the two videos below, you can see how Dr. Shinoka and his team are contributing to, changing, and touching our lives!)

Kind regards,

Yuka

My EIBDG (big data genomics education) project got funded by AFF!

August 15, 2017 by yui102

My EIBDG project won funding support from AFF (Association of Faculty and Friends at Milton S. Hershey Medical Center and Penn State College of Medicine)!!! Thank you very much, AFF, for your generous support!!!

#Working on scheduling new training classes for fall 2017! Details will be announced on the website below, on Twitter, and by a mass email to the Hershey audience!

https://sites.psu.edu/yuka/sample-page/eibdg-education-initiative-in-big-data-genomics/

2 more new conference abstracts!

September 22, 2016 by yui102

Mohammad, S., Imamura Kawasawa, Y., Ishii, S., Son, A.I., Li, P., Liu, J., Wang, J., Quezado, Z.M.N., Imamura, F., Torii, M., Hashimoto-Torii, K. Epigenetic changes associated with prenatal activation of heat shock signaling as novel therapeutic targets for cognitive deficits in Fetal Alcohol Spectrum Disorder. VIIIth International Symposium on Heat Shock Proteins in Biology & Medicine (2016)

Ishii, S., Torii, M., Son, A.I., Imamura Kawasawa, Y., Rajendraprasad, M., Morozov, Y.M., Fujimoto, M., Brennand, K., Nakai, A., Mezger, V., Gage, F., Rakic, P., Hashimoto-Torii, K. Involvement of heterogeneous activation of Heat Shock Factor 1 in the formation of focal cortical dysplasia elicited by prenatal environmental challenges. VIIIth International Symposium on Heat Shock Proteins in Biology & Medicine (2016)

Recent Publications and Conference Abstracts got updated!

September 19, 2016 by yui102

Publication!

Postula M, Janicki PK, Milanowski L, Pordzik J, Eyileten C, Karlinski M, Wylezol P, Solarska M, Czlonkowka A, Kurkowska-Jastrzebka I, Sugino S, Imamura Y, Mirowska-Guzel D. Association of frequent genetic variants in platelet activation pathway genes with large-vessel ischemic stroke in Polish population. 2016 Aug 17:1-8. [Epub ahead of print]

Conference Abstract!

Berg, F. He, E. Bixler, J. Fernandez-Mendoza, Y. Imamura, D. Liu and D. Liao. Sleep disordered breathing is associated with DNA methylation of obesity-related genes in adolescents. Journal of Sleep Research. European Sleep Research Society, JSR 25 (Suppl. 1), Pg 29 (2016)

How to process too long paired end reads?

April 27, 2016 by yui102

I’ve got an interesting question about how to process very long paired-end RNA-seq reads before TopHat alignment:

“We have a set of paired-end RNA-Seq raw data with Adaptors and Barcode in both Read1 and Read2 sequences. Also, because each read sequence is 150bp long and the library insert size is only 180bp, it is possible that Read1 and Read2 have an overlapped region. How can I deal with this kind of datasets to remove the Adaptors, Barcodes and the overlapped sequences?”

Similar questions have been asked before, so I am posting my answer on the blog to share with everyone!

Below is my answer:

“I hope the links below will help you:
http://thegenomefactory.blogspot.com/2013/08/paired-end-read-confusion-library.html

http://thegenomefactory.blogspot.com.au/2012/11/tools-to-merge-overlapping-paired-end.html

If your paired reads have overlap, tophat won’t map the pairs in a straightforward manner.
There are a few options here:

(1) Stitch R1 and R2 together to make a longer read and treat it as if it were a single molecule. This is what we do to make Illumina reads as long as 454 reads in silico (2x300bp reads turn into reads of up to 600bp).
… However, in your case, even if you stitch R1 and R2, the merged read is only 30bp longer than the original reads, so I would not take this option.

(2) You can trim both R1 and R2 to 75bp and run tophat with the -r 0 option. You may also want to check the insert size computationally (http://picard.sourceforge.net/command-line-overview.shtml#CollectInsertSizeMetrics), as the actual insert length may deviate from the 180bp the core reported. You could then trim R1 and R2 to a series of lengths down to 50bp each, run tophat with -r values from 0 to 50, and see which combination maps best.

(3) You can align R1 and R2 separately. If the Picard run above shows deviation in the insert length and potential adapter read-through, you will need to trim adapters (https://www.biostars.org/p/63044/). If the libraries were made strand-specific, you will have to align them separately (http://seqanswers.com/forums/showthread.php?t=64806).

Hope these will help!

Thank you so much!
Yuka”
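
For readers who want to try option (2), here is a minimal sketch rather than a tested pipeline. It assumes FASTQ files with exactly 4 lines per record. The function names trim_fastq and tophat_sweep, the file names, the index name genome_index, and the -r values 0, 25 and 50 are hypothetical placeholders. The tophat commands are only printed, not run, so they can be checked first.

```shell
#!/usr/bin/env bash
# Sketch of option (2): hard-trim reads to a fixed length, then list one
# tophat command per candidate -r (mate inner distance) value.
# All file names and the index name are placeholders (assumptions).

# Trim every sequence line and quality line of a 4-line-per-record FASTQ
# stream to its first N characters (reads stdin, writes stdout).
trim_fastq() {
  local len=$1
  awk -v n="$len" 'NR % 2 == 0 { print substr($0, 1, n); next } { print }'
}

# Print (do not run) one tophat command per inner-distance value.
# Usage: tophat_sweep INDEX R1.fastq R2.fastq DIST [DIST ...]
tophat_sweep() {
  local index=$1 r1=$2 r2=$3
  shift 3
  local dist
  for dist in "$@"; do
    echo "tophat -r ${dist} -o tophat_r${dist} ${index} ${r1} ${r2}"
  done
}

# Example trimming calls (commented out because the input files are placeholders):
# trim_fastq 75 < sample_R1.fastq > sample_R1.75bp.fastq
# trim_fastq 75 < sample_R2.fastq > sample_R2.75bp.fastq

# List the commands for three candidate -r values.
tophat_sweep genome_index sample_R1.75bp.fastq sample_R2.75bp.fastq 0 25 50
```

Each printed command writes to its own output directory (-o), so the mapping rates of the runs can be compared side by side afterwards.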

How to use SRA toolkit

April 7, 2016 by yui102

If you’d like to use publicly available NGS data, you may want to learn how to use the SRA toolkit. A downloaded .sra file can be converted to a .fastq file.

Fyi… what is SRA?

http://www.ncbi.nlm.nih.gov/sra

http://www.ncbi.nlm.nih.gov/books/NBK158900/

Though the pages above provide comprehensive information, my customer wanted to know ‘exactly how’ to use the SRA toolkit, so I did it myself and summarized the workflow in the scripts below (run in the Mac Terminal) and the pdf file.

Hope this helps! and if you have any troubles please feel free to contact me! 🙂

#install sra toolkit
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install wget
brew tap homebrew/science
brew install sratoolkit

#download individual sra file
wget ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByStudy/sra/SRP/SRP009/SRP009459/SRR384905/SRR384905.sra #change me#

#if you would like to download a series of sra files, do something like this
sra_list=({384905..384962}) #change me#
base_url=ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByStudy/sra/SRP/SRP009/SRP009459 #change me#

for sra_id in "${sra_list[@]}"
do
  #fetch one run, then wait 10 minutes (600 seconds) before the next download
  wget "${base_url}/SRR${sra_id}/SRR${sra_id}.sra"
  sleep 600
done

#convert sra to fastq (--split-files writes the reads of each pair to separate files)
fastq-dump --split-files ./SRR384905.sra #change me#

#if you would like to convert a series of sra files to fastq files, do something like this
sra_list=({384905..384962}) #change me#
for sra_id in "${sra_list[@]}"
do
  fastq-dump --split-files "./SRR${sra_id}.sra"
done

#above is summarized in below pdf file

link to pdf file “SRA_toolkit”
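
If a long conversion loop gets interrupted, it helps to be able to restart it without redoing finished runs. Below is a minimal sketch of that idea, not part of the original workflow. The function names srr_range and convert_sra and the DRY_RUN switch are hypothetical, and the sketch assumes that fastq-dump --split-files names its first output file ACCESSION_1.fastq in the current directory. By default the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch: convert a range of SRR accessions, skipping runs already converted.
# Assumes the .sra files sit in the current directory, as in the scripts above.

# Print SRR accessions for a numeric range, one per line.
srr_range() {
  local first=$1 last=$2
  local id=$first
  while [ "$id" -le "$last" ]; do
    echo "SRR${id}"
    id=$((id + 1))
  done
}

# Convert one accession unless its first FASTQ output already exists.
# With DRY_RUN=1 (the default here) the command is printed instead of run.
convert_sra() {
  local acc=$1
  if [ -e "${acc}_1.fastq" ]; then
    echo "skip ${acc}"
    return 0
  fi
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "fastq-dump --split-files ./${acc}.sra"
  else
    fastq-dump --split-files "./${acc}.sra"
  fi
}

# Example: a short range; change the numbers to match your own study.
for acc in $(srr_range 384905 384907); do
  convert_sra "$acc"
done
```

Setting DRY_RUN=0 before the loop runs the conversions for real once the printed commands look right.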

Our RRBS rocks!

March 21, 2016 by yui102

Just created a blurb for our RRBS applications! If you are limited by the initial amount of DNA and/or your sequencing budget, this would be the way to go! Details on the cost will be updated in a few months!

As Director of Penn State Hershey Genome Sciences Facility (http://www.pennstatehershey.org/web/core/gene-expression-analysis-overview, https://sites.psu.edu/yuka/sample-page/), my mission is to support “from start to end” genomic research. I believe our core is one of the few that can conduct fast and comprehensive bioinformatics analyses at competitive cost. To keep up with the fast-paced and constantly evolving NGS methodologies, we strive to install new instruments, develop new methods and increase our computational power. One of the cutting-edge technologies we have recently developed, for which a manuscript is in preparation, is a genome-wide Reduced Representation Bisulfite Sequencing (RRBS), a robust and cost-effective alternative to conventional Methyl-seq technology to determine DNA methylation throughout the genome. This method generates a specific, reduced representation of the genome consisting of DNA fragments enriched for CpG dinucleotides. Our method requires only 5 ng of DNA as input, which is about 10 times less than what other genomics cores or commercially available kits require, and is thus well suited for characterizing precious samples. The development of this higher-throughput genome-wide methylation analysis has enabled multiple Penn State Hershey investigators to carry out DNA methylation analyses successfully, as shown in the references below. Given that all of my staff have been well trained in executing “from start to end” RRBS workflows, which typically involve DNA extraction, quality control, library preparation, HiSeq operation and bioinformatics analyses, I am confident we are a one-of-a-kind, powerful genomics core that can conduct soup-to-nuts genomics projects to meet each investigator’s needs.

Sun, Y., El-Bayoumy, K., Imamura, Y., Salzberg, A., Aliaga, C., Gowdahalli, K., Amin, S., Chen, K. Genome-wide analysis of DNA methylation induced by environmental carcinogen dibenzo[def,p]chrysene in ovarian tissues of mice. American Association for Cancer Research, (2016).

Sun, Y., Chen, K., Imamura Kawasawa, Y., Salzberg, A., Aliaga, C., Gowdahalli, K., Amin, S., Stoner, G., El-Bayoumy, K. The effects of the environmental carcinogen dibenzo[a,l]pyrene on genome-wide methylation and the impact of dietary black raspberry in mouse oral tissues. Cancer Research 75 (15 Supplement), 2955-2955 (2015).

Berg, A., Imamura Kawasawa, Y., Salzberg, A., Bixler, E.O., He, F., Liao, D. Abstract P260: Obesity is associated with DNA Methylation in Population-Based Adolescents. Circulation 131 (Suppl 1), AP260-AP260 (2015).

TruSeq adapter trimming practice

March 9, 2016 by yui102

One day I got a question from my customer, something like this:
—————————————————————-
For the adapter sequences, you gave me the following info:
AdapterRead1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
AdapterRead2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT

Are these from the 5’ or 3’ end? To which end of the sequence do you
attach the adapters? What is the difference between Read1 and Read2?
Do they go on different ends of the sequence?
—————————————————————-
And here is my reply:
—————————————————————-
I hope the links below will help!
http://bitesizebio.com/13542/what-everyone-should-know-about-rna-seq/
(the cartoon in the middle)
http://seqanswers.com/forums/archive/index.php/t-17521.html
(example cutadapt trimming scripts)
In case of paired end sequencing, trim Read1 adapter from Read1 fastq, and Read2 adapter from Read2 fastq.

In our default CASAVA bcl2fastq workflow (except for small RNA-seq data), we mask the adapter sequences. So if there is adapter sequence at the 3′ end of your sequencing reads (this happens if your insert was shorter than your sequencing length), you will see ‘NNNNN’ instead of the adapter sequence. We can also leave the adapter sequences as they are and let you trim them using, e.g., the method above, so just let us know if that’s your preference.

Thank you!

Yuka
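
To make the reply above concrete, here is a minimal sketch, not our production script. The first function counts reads whose 3′ end has been masked with N’s, and the second prints a paired-end cutadapt command for the two adapter sequences quoted in the question (-a for Read1, -A for Read2). The function names, the file names, and the "at least 5 N’s" threshold are hypothetical, and FASTQ files with 4 lines per record are assumed.

```shell
#!/usr/bin/env bash
# Sketch: (a) count N-masked reads, (b) print a paired-end cutadapt command.
# File names and the 5-N threshold are placeholders (assumptions).

ADAPTER_R1=AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
ADAPTER_R2=AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT

# Count reads in a 4-line-per-record FASTQ stream (stdin) whose sequence
# ends in at least 5 N's, i.e. reads that look adapter-masked at the 3' end.
count_masked_reads() {
  awk 'NR % 4 == 2 && /NNNNN$/ { n++ } END { print n + 0 }'
}

# Print (do not run) a cutadapt command for unmasked paired-end data:
# -a trims the 3' adapter from Read1, -A trims the 3' adapter from Read2,
# -o and -p name the trimmed Read1 and Read2 output files.
cutadapt_cmd() {
  local r1=$1 r2=$2
  echo "cutadapt -a ${ADAPTER_R1} -A ${ADAPTER_R2} -o trimmed_${r1} -p trimmed_${r2} ${r1} ${r2}"
}

# Example: show the command for one sample.
cutadapt_cmd sample_R1.fastq sample_R2.fastq
```

Running count_masked_reads on each FASTQ file (for example, count_masked_reads < sample_R1.fastq) gives a quick idea of how many inserts were shorter than the read length.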

Clarity LIMS is live!

February 25, 2016 by yui102

Our new Genologics Clarity LIMS has finally gone live!!!

http://www.genologics.com/clarity-lims/

PSU COM’s log in site:
https://claritylims.med.psu.edu/clarity/

We are testing our first HiSeq run on the system today! The official announcement and user training will be announced soon!

Yuka Imamura Kawasawa

yimamura@pennstatehealth.psu.edu