Overview
The generatervis package provides functions to create empty .fastq files, generate random reads, fill raw .fastq files with random reads, plot .fastq sequences, convert .fastq files to .bam files, convert .bam files to .vcf files, and create metadata files for patient IDs.
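To make the file format concrete, the following base-R sketch builds two random reads and writes them as FASTQ records (header, sequence, separator, quality string). It is an illustration only and does not use or reproduce generatervis internals: the helpers `random_read()` and `fastq_record()`, the read IDs, and the output file name are all hypothetical, and the constant quality string is a placeholder assumption.

```r
# Illustrative base-R sketch only; NOT the generatervis implementation.
set.seed(1)  # make the random example reproducible

# Hypothetical helper: draw one random nucleotide string of a given length.
random_read <- function(read_length) {
  bases <- sample(c("A", "C", "G", "T"), read_length, replace = TRUE)
  paste(bases, collapse = "")
}

# Hypothetical helper: build the four lines of a single FASTQ record.
fastq_record <- function(read_id, sequence) {
  c(
    paste0("@", read_id),          # header line
    sequence,                      # nucleotide sequence
    "+",                           # separator line
    strrep("I", nchar(sequence))   # placeholder qualities ("I" is Q40 in Phred+33)
  )
}

# Two reads of length 8, matching the n and read_length used in this README.
reads <- vapply(seq_len(2), function(i) random_read(8), character(1))
read_ids <- paste0("patient_123_read", seq_along(reads))  # assumed ID scheme
records <- unlist(Map(fastq_record, read_ids, reads), use.names = FALSE)

# Write to a scratch file (assumed name) so the package's own output is untouched.
fastq_path <- file.path(tempdir(), "patient_123_sketch.fastq")
writeLines(records, fastq_path)
```

Each record occupies four lines, so two reads give an eight-line file.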
⬇️ Installing generatervis
You can install the development version of generatervis from GitHub with:
# install.packages("pak")
pak::pak("Clinical-Informatics-Collaborative/generatervis")
#>
#> → Will update 1 package.
#> → The package (0 B) is cached.
#> + generatervis 0.1.0 → 0.1.0 👷♀️🔧 (GitHub: 38a905c)
#> ℹ No downloads are needed, 1 pkg is cached
#> ✔ Got generatervis 0.1.0 (source) (45.75 kB)
#> ℹ Packaging generatervis 0.1.0
#> ✔ Packaged generatervis 0.1.0 (689ms)
#> ℹ Building generatervis 0.1.0
#> ✔ Built generatervis 0.1.0 (523ms)
#> ✔ Installed generatervis 0.1.0 (github::Clinical-Informatics-Collaborative/generatervis@38a905c) (15ms)
#> ✔ 1 pkg: upd 1, dld 1 (NA B) [8s]
Usage
Create an empty raw .fastq file for the specified patient.
patient_id <- "patient_123"
generatervis::create_empty_fastq(patient_id)
Generate a random sample of reads for a Whole Genome Sequencing (WGS) dataset for the specified patient ID.
n <- 2
generatervis::rreads(patient_id, n)
Populate the .fastq file with the random reads.
output_dir <- tempdir()
read_length <- 8
generatervis::fill_fastq(patient_id, output_dir, n, read_length)
(Optional) Plot the nucleotide sequences in the .fastq file in a grid format.
generatervis::fastq_plot(patient_id, output_dir, n, read_length)
Convert the (raw) .fastq file to a (processed) .bam file using a dummy .sam format.
fastq_file <- file.path(output_dir, paste0(patient_id, ".fastq"))
generatervis::fill_fastq(patient_id, output_dir, n, read_length)
generatervis::fastq_to_bam(fastq_file, patient_id, output_dir, sam_file = paste0(output_dir, "/", patient_id, ".sam"), reference = "chr1")
To create the corresponding .bam file, use the samtools command-line tool.
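The samtools step mentioned above could be driven from R along these lines. This is a sketch under assumptions, not part of generatervis: it assumes samtools is installed and on the PATH, that the .sam file sits at the path passed as `sam_file` above, and that the .bam file should be written next to it under the patient ID (the `bam_file` name is a guess). Whether samtools accepts the dummy .sam file depends on its contents, which is not verified here.

```r
# Sketch only: assumes samtools is installed and on the PATH.
patient_id <- "patient_123"
output_dir <- tempdir()
sam_file <- file.path(output_dir, paste0(patient_id, ".sam"))
bam_file <- file.path(output_dir, paste0(patient_id, ".bam"))  # assumed output name

# Sys.which() returns "" when the executable cannot be found.
samtools <- unname(Sys.which("samtools"))

if (nzchar(samtools) && file.exists(sam_file)) {
  # `samtools view -b` emits BAM; `-o` names the output file.
  status <- system2(
    samtools,
    c("view", "-b", "-o", shQuote(bam_file), shQuote(sam_file))
  )
  if (status != 0) {
    warning("samtools exited with status ", status)
  }
} else {
  message("samtools or the .sam file was not found; skipping the conversion.")
}
```

Checking for the executable and the input file first keeps the snippet safe to run on machines without samtools.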
Convert the (processed) .bam file to a (summarised) .vcf file format.
generatervis::bam_to_vcf(patient_id, output_dir, vcf_file = paste0(output_dir, "/", patient_id, ".vcf"))
Create the metadata files to upload to data_storage_repository.
generatervis::create_metadata(patient_id, output_dir)
When these metadata .txt files are ready, they can be uploaded to data_storage_repository by forking the repository and creating a pull request.
Documentation
You can find detailed documentation and tutorials at the package website: https://clinical-informatics-collaborative.github.io/generatervis/
- Reference manual: Full list of functions with detailed descriptions.
For in-session help:
# View documentation for a specific function
?create_empty_fastq
Acknowledgments
This package was written as part of the volunteer programme by the Research Computing Program, Walter and Eliza Hall Institute of Medical Research, mentored by Rowland Mosbergen.