metadata:
  sample_annotation: /path/to/samples.csv
  output_dir: /path/to/output/folder
sample_name, protocol, organism, data_source
frog_0h, RNA-seq, frog, /path/to/frog0.gz
frog_1h, RNA-seq, frog, /path/to/frog1.gz
frog_2h, RNA-seq, frog, /path/to/frog2.gz
frog_3h, RNA-seq, frog, /path/to/frog3.gz
| sample_name | t | protocol | organism | data_source |
| ------------- | ---- | :-------------: | -------- | ---------------------- |
| frog_0h | 0 | RNA-seq | frog | /path/to/frog0.gz |
| frog_1h | 1 | RNA-seq | frog | /path/to/frog1.gz |
| frog_2h | 2 | RNA-seq | frog | /path/to/frog2.gz |
| frog_3h | 3 | RNA-seq | frog | /path/to/frog3.gz |
Using a derived attribute:
| sample_name | t | protocol | organism | data_source |
| ------------- | ---- | :-------------: | -------- | ---------------------- |
| frog_0h | 0 | RNA-seq | frog | my_samples |
| frog_1h | 1 | RNA-seq | frog | my_samples |
| frog_2h | 2 | RNA-seq | frog | my_samples |
| frog_3h | 3 | RNA-seq | frog | my_samples |
| crab_0h | 0 | RNA-seq | crab | your_samples |
| crab_3h | 3 | RNA-seq | crab | your_samples |
Project config file:
derived_columns: [data_source]
data_sources:
  my_samples: "/path/to/my/samples/{organism}_{t}h.gz"
  your_samples: "/path/to/your/samples/{organism}_{t}h.gz"
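A minimal sketch of how a derived column might be resolved, assuming each template is filled in from the sample's own attributes with Python's `str.format`. The function `resolve_derived` and the dict-based samples are hypothetical illustrations, not peppy's API.

```python
# Hypothetical illustration; not peppy's internal implementation.
data_sources = {
    "my_samples": "/path/to/my/samples/{organism}_{t}h.gz",
    "your_samples": "/path/to/your/samples/{organism}_{t}h.gz",
}
derived_columns = ["data_source"]


def resolve_derived(sample, derived_columns, data_sources):
    """Return a copy of `sample` with each derived column expanded to a path."""
    resolved = dict(sample)
    for column in derived_columns:
        template = data_sources.get(resolved[column])
        if template is not None:
            # Placeholders such as {organism} and {t} come from the sample itself.
            resolved[column] = template.format(**sample)
    return resolved


frog = {"sample_name": "frog_1h", "t": 1, "protocol": "RNA-seq",
        "organism": "frog", "data_source": "my_samples"}
crab = {"sample_name": "crab_3h", "t": 3, "protocol": "RNA-seq",
        "organism": "crab", "data_source": "your_samples"}

for sample in (frog, crab):
    print(resolve_derived(sample, derived_columns, data_sources)["data_source"])
```

Under these assumptions the two samples resolve to `/path/to/my/samples/frog_1h.gz` and `/path/to/your/samples/crab_3h.gz`, so the annotation table only needs the short key.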
| sample_name | protocol | organism |
| ------------- | :-------------: | -------- |
| human_1 | RNA-seq | human |
| human_2 | RNA-seq | human |
| human_3 | RNA-seq | human |
| mouse_1 | RNA-seq | mouse |
| sample_name | protocol | organism | genome |
| ------------- | :-------------: | -------- | ------ |
| human_1 | RNA-seq | human | hg38 |
| human_2 | RNA-seq | human | hg38 |
| human_3 | RNA-seq | human | hg38 |
| mouse_1 | RNA-seq | mouse | mm10 |
Project config file:
implied_columns:
  organism:
    human:
      genome: hg38
    mouse:
      genome: mm10
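A minimal sketch of what the implied-columns lookup amounts to, assuming samples are plain dicts. The helper `apply_implied` is a hypothetical name for illustration, not part of peppy.

```python
# Hypothetical illustration; not peppy's internal implementation.
implied_columns = {
    "organism": {
        "human": {"genome": "hg38"},
        "mouse": {"genome": "mm10"},
    }
}


def apply_implied(sample, implied_columns):
    """Return a copy of `sample` with the attributes implied by its values added."""
    enriched = dict(sample)
    for column, value_map in implied_columns.items():
        # Look up the sample's value for this column; add nothing if unmapped.
        enriched.update(value_map.get(sample.get(column), {}))
    return enriched


samples = [
    {"sample_name": "human_1", "protocol": "RNA-seq", "organism": "human"},
    {"sample_name": "mouse_1", "protocol": "RNA-seq", "organism": "mouse"},
]
enriched = [apply_implied(s, implied_columns) for s in samples]
for s in enriched:
    print(s["sample_name"], s["genome"])
```

Here `human_1` gains `genome = hg38` and `mouse_1` gains `genome = mm10`, without a `genome` column in the annotation sheet.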
subprojects:
  diverse:
    metadata:
      sample_annotation: psa_rrbs_diverse.csv
  cancer:
    metadata:
      sample_annotation: psa_rrbs_intracancer.csv
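One way to picture a subproject is as a set of overrides layered onto the main config. The sketch below assumes a recursive merge in which only the keys named by the subproject change; `deep_update` and `activate_subproject` are hypothetical helpers, and the base `metadata` values are the placeholder paths used earlier.

```python
import copy

# Hypothetical illustration of subproject overrides; not peppy's implementation.
base_config = {
    "metadata": {
        "sample_annotation": "/path/to/samples.csv",
        "output_dir": "/path/to/output/folder",
    },
    "subprojects": {
        "diverse": {"metadata": {"sample_annotation": "psa_rrbs_diverse.csv"}},
        "cancer": {"metadata": {"sample_annotation": "psa_rrbs_intracancer.csv"}},
    },
}


def deep_update(target, overrides):
    """Recursively copy `overrides` into `target`, leaving other keys alone."""
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            deep_update(target[key], value)
        else:
            target[key] = value
    return target


def activate_subproject(config, name):
    """Return a new config with the named subproject's entries applied."""
    active = copy.deepcopy(config)
    overrides = active.pop("subprojects", {}).get(name)
    if overrides is None:
        raise KeyError(f"unknown subproject: {name}")
    return deep_update(active, overrides)


cancer = activate_subproject(base_config, "cancer")
print(cancer["metadata"])
```

With this merge, the `cancer` subproject swaps in its own annotation sheet while `output_dir` stays as defined in the main config.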
import peppy

# Load the project from its PEP configuration file
prj = peppy.Project("pep_config.yaml")
samples = prj.samples
for sample in samples:
    print(sample.name)
    # do further analysis on each sample
library("pepr")
prj = pepr::Project("pep_config.yaml")
samples = pepr::pepSamples(prj)
for (sample in samples) {
  message(pepr::sampleName(sample))
  # do further analysis on each sample
}
protocol_mappings:
  RNA-seq: rna-seq
pipelines:
  rna-seq:
    name: RNA-seq_pipeline
    path: path/to/rna-seq.py
    arguments:
      "--option1": sample_attribute
      "--option2": sample_attribute2
looper run project_config.yaml
protocol_mappings:
  RRBS: rrbs
  WGBS: wgbs
  EG: wgbs.py
  SMART-seq: rnaBitSeq -f; rnaTopHat -f
  ATAC-SEQ: atacseq
  DNase-seq: atacseq
  CHIP-SEQ: chipseq
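A minimal sketch of how such a mapping could be read, assuming that `;` separates multiple pipelines for one protocol and that anything after the pipeline key is an extra flag. The function `pipelines_for` is hypothetical, and the lookup here is an exact string match, which may differ from how looper matches protocol names.

```python
import shlex

# Hypothetical illustration; not looper's internal implementation.
protocol_mappings = {
    "RRBS": "rrbs",
    "WGBS": "wgbs",
    "EG": "wgbs.py",
    "SMART-seq": "rnaBitSeq -f; rnaTopHat -f",
    "ATAC-SEQ": "atacseq",
    "DNase-seq": "atacseq",
    "CHIP-SEQ": "chipseq",
}


def pipelines_for(protocol, mappings):
    """Return (pipeline_key, extra_flags) pairs for one sample protocol."""
    entry = mappings.get(protocol, "")
    jobs = []
    for part in entry.split(";"):
        tokens = shlex.split(part)
        if tokens:
            jobs.append((tokens[0], tokens[1:]))
    return jobs


print(pipelines_for("SMART-seq", protocol_mappings))
print(pipelines_for("DNase-seq", protocol_mappings))
```

Under these assumptions a `SMART-seq` sample is sent to two pipelines, while `ATAC-SEQ` and `DNase-seq` samples share one.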
pipeline_key:
  name: pipeline_name
  arguments:
    "--option": value
  resources:
    default:
      file_size: "0"
      cores: "2"
      mem: "6000"
      time: "01:00:00"
    large_input:
      file_size: "2000"
      cores: "4"
      mem: "12000"
      time: "08:00:00"
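The resource packages can be read as size thresholds. The sketch below assumes the package chosen is the one with the largest `file_size` that does not exceed the sample's input size, measured in the same units as `file_size`; `choose_resources` is a hypothetical helper, not looper's API.

```python
# Hypothetical illustration of threshold-based selection; not looper's code.
resources = {
    "default": {"file_size": "0", "cores": "2", "mem": "6000", "time": "01:00:00"},
    "large_input": {"file_size": "2000", "cores": "4", "mem": "12000", "time": "08:00:00"},
}


def choose_resources(input_size, resources):
    """Pick the package with the largest file_size threshold <= input_size."""
    eligible = [
        (float(package["file_size"]), name)
        for name, package in resources.items()
        if float(package["file_size"]) <= input_size
    ]
    if not eligible:
        raise ValueError("no resource package covers this input size")
    _, name = max(eligible)
    return name, resources[name]


for size in (150, 2500):
    name, package = choose_resources(size, resources)
    print(size, name, package["cores"], package["mem"], package["time"])
```

With the values above, an input of size 150 falls under `default` and an input of size 2500 falls under `large_input`.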
compute:
  slurm:
    submission_template: templates/slurm_template.sub
    submission_command: sbatch
  localhost:
    submission_template: templates/localhost_template.sub
    submission_command: sh
looper run project_config.yaml --compute localhost
looper check project_config.yaml
looper summarize project_config.yaml
geofetch and looper build PEP projects and connect them to pipelines