You might have been so excited to see your fastq reads coming straight out of the MinION that you did not (initially) realize that the data you started to analyze are NOT all the data there is! If your computer wasn't able to keep up with the base-calling on the fly, plenty of the data will be sitting in your skip folder. In our recent run, this was as much as 90% of it!
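To get a feeling for how much is still waiting, you can count the fast5 files. Here is a minimal bash sketch; the helper names `count_fast5` and `skip_percentage` are made up for illustration, and it assumes that the skip folder sits inside the fast5 folder (as in our run) and that each fast5 file holds a single read, so the file count approximates the read count:

```shell
# count_fast5 DIR: print the number of .fast5 files anywhere below DIR.
count_fast5() {
    find "$1" -type f -name '*.fast5' | wc -l | tr -d ' '
}

# skip_percentage SKIP_DIR FAST5_ROOT: print the percentage (rounded down)
# of fast5 files under FAST5_ROOT that are located in SKIP_DIR.
# Assumption: SKIP_DIR is a subfolder of FAST5_ROOT.
skip_percentage() {
    local skipped total
    skipped=$(count_fast5 "$1")
    total=$(count_fast5 "$2")
    if [ "$total" -eq 0 ]; then
        echo 0
    else
        echo $(( 100 * skipped / total ))
    fi
}
```

You would call it as `skip_percentage /path/to/fast5/skip /path/to/fast5`, with the paths replaced by your own run folder.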
One way to fix this is to catch up with the base-calling using albacore on a Linux machine. In my case, I achieved it using the following commands:
PYTHON_VERSION=3.5
conda create --prefix /biomonika/conda/nanopore_basecalling python=${PYTHON_VERSION} anaconda
# an environment created with --prefix is activated by its path
source activate /biomonika/conda/nanopore_basecalling
conda update --all
wget https://mirror.oxfordnanoportal.com/software/analysis/ont_albacore-2.2.7-cp35-cp35m-manylinux1_x86_64.whl
pip install ont_albacore-2.2.*.whl
So what does our loop for submitting jobs look like? Each numbered subfolder of skip (0 through 48 in our run) gets its own job:
for i in `seq 0 48`; do
    echo $i
    sbatch basecalling_job.sh /biomonika/nanopore_run/data/reads/20180426_2154_/fast5/skip/${i} ${i}
done
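The range 0 to 48 is specific to our run. If you don't want to hard-code it, the subfolder numbers can be read from the skip folder itself. Below is a bash sketch of that idea, not the loop we actually ran; `list_numbered_subdirs`, `submit_all` and the `DRY_RUN` variable are hypothetical names introduced here:

```shell
# list_numbered_subdirs DIR: print the all-digit subfolder names of DIR,
# one per line, in numeric order.
list_numbered_subdirs() {
    local d name
    for d in "$1"/*/; do
        [ -d "$d" ] || continue          # glob matched nothing
        name=$(basename "$d")
        case "$name" in
            ''|*[!0-9]*) continue ;;     # skip names that are not purely numeric
        esac
        echo "$name"
    done | sort -n
}

# submit_all SKIP_DIR: submit one job per numbered subfolder of SKIP_DIR.
# With DRY_RUN=1 the sbatch commands are only printed, not executed.
submit_all() {
    local skip_dir=$1 i
    for i in $(list_numbered_subdirs "$skip_dir"); do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sbatch basecalling_job.sh ${skip_dir}/${i} ${i}"
        else
            sbatch basecalling_job.sh "${skip_dir}/${i}" "${i}"
        fi
    done
}
```

Running `DRY_RUN=1 submit_all /path/to/fast5/skip` first lets you check the commands before anything lands in the queue.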
And here are the contents of basecalling_job.sh:
#!/bin/bash
#SBATCH --job-name=biomonika
#SBATCH --output=biomonika-%j.out
#SBATCH --error=biomonika-%j.err
#SBATCH --mem-per-cpu=2G
#SBATCH --ntasks=63
#SBATCH --cpus-per-task=1

set -x
set -e

input_dir=$1
output_dir=$2

echo "input dir: " ${input_dir}
echo "output dir: " ${output_dir}

which conda
conda info
source activate /galaxy/home/biomonika/conda/nanopore_basecalling

read_fast5_basecaller.py --flowcell FLO-MIN106 --kit SQK-LSK108 --barcoding --output_format fastq --input ${input_dir} --save_path ${output_dir} --worker_threads 63
echo "Done."
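Because the script runs with `set -e` and prints "Done." as its very last step, a `biomonika-*.out` log without that line probably belongs to a job that failed or is still running. A small bash sketch for spotting such logs, assuming they are all in one directory (the function name `unfinished_logs` is made up for this example):

```shell
# unfinished_logs DIR: print every biomonika-*.out file in DIR that does not
# contain a line consisting exactly of "Done."
unfinished_logs() {
    local f
    for f in "$1"/biomonika-*.out; do
        [ -e "$f" ] || continue                  # no logs in DIR
        grep -qx 'Done\.' "$f" || echo "$f"      # -x: match the whole line
    done
}
```

Calling `unfinished_logs .` in the directory you submitted from lists the jobs worth a second look; the matching `.err` file (which also holds the `set -x` trace) should tell you where things stopped.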
As usual, please comment if you have suggestions or advice. Happy base-calling!