The default path for downloading SRA data

The SRA (Sequence Read Archive) is a public repository of DNA sequence data. When you run sam-dump or fastq-dump from the sratoolkit, it first uses prefetch to download a “temporary” .sra file, which it then converts to either sam or fastq format. By default, sratoolkit downloads .sra files to a subfolder of your home folder ($HOME/ncbi/public/sra). This is a problem for two reasons: 1) your home folder may have a space quota, and 2) the downloaded files won’t be useful to others in the group, since they sit in your personal space. It’s better to tell sratoolkit to use a shared filesystem.

The advertised way to change the default path is an interactive interface launched with vdb-config -i, which is not ideal for scripted or remote setups. Luckily, all this interface does is add a setting to a config file that sratoolkit reads, so we can bypass it completely and edit the config file directly. Here’s how to change your default data storage path, where $DATA is the shared data directory:

mkdir -p "$HOME/.ncbi"
echo "/repository/user/main/public/root = \"$DATA\"" > "$HOME/.ncbi/user-settings.mkfg"

Now, the huge .sra files will be stored in our shared, huge filesystem instead of in your home directory.
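Note that the one-liner above replaces the whole settings file, so any other settings already stored in it are lost. If you want to keep them, something like the following sketch may help. It is not part of sratoolkit: set_sra_root is a hypothetical helper name, and the sketch assumes the config file holds one "key = value" setting per line, as the one-liner suggests.

```shell
# Hypothetical helper (not part of sratoolkit): set the download root in a
# settings file while keeping any other lines already stored there.
# Assumes one "key = value" setting per line.
set_sra_root() {
    cfg="$1"    # path to the settings file
    root="$2"   # directory that should hold the downloaded .sra files
    key="/repository/user/main/public/root"

    mkdir -p "$(dirname "$cfg")"
    touch "$cfg"
    # Copy every line except an earlier definition of the same key.
    # grep exits non-zero when nothing is left, hence the "|| true".
    grep -v "^$key[[:space:]]*=" "$cfg" > "$cfg.tmp" || true
    # Append the new definition and swap the file into place.
    printf '%s = "%s"\n' "$key" "$root" >> "$cfg.tmp"
    mv "$cfg.tmp" "$cfg"
}
```

To use it with the paths from this post, you would call: set_sra_root "$HOME/.ncbi/user-settings.mkfg" "$DATA"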

A second thing to keep in mind: these .sra files aren’t really temporary. Nothing deletes them automatically, so they accumulate and take up more and more space as you download from SRA, until you remove them yourself. Once you’ve converted to sam, bam, or fastq format, the .sra files are no longer needed and can be purged.

We can delete all such .sra files that have not been accessed in the past year like this:

find "$DATA/sra" -depth -type f -name '*.sra' -atime +365 -delete
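Since -delete is irreversible, it may be worth previewing what would be removed first. The sketch below runs the same kind of search but only prints the matches. list_stale_sra is a hypothetical helper name, and restricting the match to names ending in .sra is an assumption about how the files are named. Also keep in mind that -atime relies on recorded access times: on a filesystem mounted with noatime they are not updated, so files may look older than they really are.

```shell
# Hypothetical helper: list .sra files not accessed in more than N days,
# without deleting anything.
list_stale_sra() {
    dir="$1"            # directory to search, e.g. "$DATA/sra"
    days="${2:-365}"    # age threshold in days (defaults to one year)
    find "$dir" -type f -name '*.sra' -atime +"$days" -print
}
```

For example, list_stale_sra "$DATA/sra" 365 prints the candidates for the purge above; if the list looks right, run the find command with -delete.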

Source: Thanks to piet in this question.