Installing Software and Configuration

Updated on 03/05/2019

You need access to a Unix-based operating system to use RELION. Mac and Linux systems are mostly ready to interact with the FarmShare2 computer cluster. However, if you would like to stick with Windows, please configure your computer according to the instructions below to install the Ubuntu subsystem for Windows and an X server. Mac users (newer macOS only), please follow this link to install X11 if you do not have it already.

Ubuntu on Windows
Windows 10 64 bit
Settings>System>About
The build should be at least 14393; otherwise, check for updates
Settings > Update and Security > For developers
Select Developer mode
Start Menu > Search for "Turn Windows features on or off"
Enable "Windows Subsystem for Linux"
Install "Ubuntu" from the Microsoft Store
Run Ubuntu and follow the instructions
If there is a Windows Subsystem error, reboot Windows.
Install the Windows X server VcXsrv from this link
To start the X server when your computer boots, place a shortcut to VcXsrv in %appdata%\Microsoft\Windows\Start Menu\Programs\Startup
Open your Ubuntu terminal and configure .bashrc to use the local X server
$ echo "export DISPLAY=localhost:0.0" >> ~/.bashrc
Source the .bashrc file
$ source ~/.bashrc

We will use the FarmShare2 computer cluster to process your data, but you can also use your own laptop or desktop computer if it has a modern GPU. Computers without GPUs can also work, although that is not covered in this tutorial. If you would like to use your own machine, please check the RELION, MotionCor2, and Gctf instructions to install this software.

Commands are written after the "$" sign; do not include the "$" sign when typing them.

Let's first log in to a "login node" on FarmShare2 using your Linux/Mac/Ubuntu-on-Windows terminal

$ ssh -Y sunetid@rice.stanford.edu

Create a Project Directory
$ mkdir Project_T20S

Copy the necessary programs compiled for FarmShare2 and the data set.
$ cp /home/alpays/public/software.tar.gz ~/.
$ cp /home/alpays/public/dataset.tar.gz ~/Project_T20S

Untar these files
$ cd
$ tar -xvzf software.tar.gz
Remove the tar file
$ rm software.tar.gz
You will now have RELION, Gctf, and MotionCor2, along with a SLURM submission script.

$ cd Project_T20S
$ tar -xvzf dataset.tar.gz
$ rm dataset.tar.gz

Configure .bashrc for our programs
$ cd
$ cp ~/.bashrc  backup_bashrc
$ echo "ml use $HOME/software/alpayslua" >> ~/.bashrc

Check that the "ml use .../software/alpayslua" line (with $HOME expanded to your home directory, since echo expands it) was added at the bottom of the file
$ vi ~/.bashrc
Move the cursor to the end of the file and to the end of the last line, press "i" to enter insert mode, press Return, and add the following lines starting with export.

export RELION_QSUB_EXTRA_COUNT=3
export RELION_QSUB_EXTRA1="Memory (ex: 64G)"
export RELION_QSUB_EXTRA1_DEFAULT="64G"
export RELION_QSUB_EXTRA2="Number of GPU"
export RELION_QSUB_EXTRA2_DEFAULT="0"
export RELION_QSUB_EXTRA3="Wall Time (max: 48:00:00)"
export RELION_QSUB_EXTRA3_DEFAULT="24:00:00"
export RELION_PDFVIEWER_EXECUTABLE="$HOME/software/alpaysprograms/xpdf/4.0/xpdf"
export RELION_MOTIONCOR2_EXECUTABLE="MotionCor2"
export RELION_GCTF_EXECUTABLE="Gctf-v1.06_sm_20_cu8.0_x86_64"
export RELION_QSUB_TEMPLATE="$HOME/software/scripts/relion/farmshare-relion.sbatch"
Press Esc, then type ":wq" and press Return to save and quit.
Source your .bashrc file
$ source ~/.bashrc
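If you prefer not to edit the file in vi, the same export lines can be appended non-interactively (a sketch of an equivalent shortcut; the quoted heredoc keeps $HOME unexpanded so the paths resolve at login time):

```shell
# Append the RELION queue/executable settings to ~/.bashrc in one step.
# The quoted 'EOF' prevents $HOME from being expanded while writing.
cat >> ~/.bashrc <<'EOF'
export RELION_QSUB_EXTRA_COUNT=3
export RELION_QSUB_EXTRA1="Memory (ex: 64G)"
export RELION_QSUB_EXTRA1_DEFAULT="64G"
export RELION_QSUB_EXTRA2="Number of GPU"
export RELION_QSUB_EXTRA2_DEFAULT="0"
export RELION_QSUB_EXTRA3="Wall Time (max: 48:00:00)"
export RELION_QSUB_EXTRA3_DEFAULT="24:00:00"
export RELION_PDFVIEWER_EXECUTABLE="$HOME/software/alpaysprograms/xpdf/4.0/xpdf"
export RELION_MOTIONCOR2_EXECUTABLE="MotionCor2"
export RELION_GCTF_EXECUTABLE="Gctf-v1.06_sm_20_cu8.0_x86_64"
export RELION_QSUB_TEMPLATE="$HOME/software/scripts/relion/farmshare-relion.sbatch"
EOF
```

After appending, run `source ~/.bashrc` as above so the settings take effect in your current shell.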
Congratulations, everything is set up to process your first dataset on FarmShare2.
FarmShare2 and SLURM Information

Please familiarize yourself with FarmShare2 using the links below

Some of the most commonly used SLURM commands:
srun, sbatch, squeue, scancel, sinfo, and scontrol

Below are the available compute nodes on FarmShare2.
$ sinfo -e -o "%.10R %.8D %.10m %.5c %7z %8G %110f"

PARTITION  NODES  MEMORY  CPUS  S:C:T  GRES      AVAIL_FEATURES
normal         3  128821    32  2:8:2  gpu:tesl  CPU_GEN:HSW,CPU_SKU:E5-2640v3,GPU_GEN:KPL,GPU_SKU:TESLA_K40
normal         7  128821    32  2:8:2  gpu:tesl  CPU_GEN:HSW,CPU_SKU:E5-2640v3,GPU_GEN:KPL,GPU_SKU:TESLA_K40
normal         2  773830    32  2:8:2  (null)    CPU_GEN:HSW,CPU_SKU:E5-2640v3
normal        10  128883    32  2:8:2  (null)    CPU_GEN:IVB,CPU_SKU:E5-2650v2
bigmem         2  773830    32  2:8:2  (null)    CPU_GEN:HSW,CPU_SKU:E5-2640v3
gpu            3  128821    32  2:8:2  gpu:tesl  CPU_GEN:HSW,CPU_SKU:E5-2640v3,GPU_GEN:KPL,GPU_SKU:TESLA_K40
gpu            7  128821    32  2:8:2  gpu:tesl  CPU_GEN:HSW,CPU_SKU:E5-2640v3,GPU_GEN:KPL,GPU_SKU:TESLA_K40

You will mostly use the "gpu" and "normal" partitions to run your jobs. Your home directory has a 48 GB quota, which you may exceed up to 64 GB for up to one week.
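To check how much of that quota you are currently using (a quick sketch; `du` can take a while on large home directories, and the exact figure will differ per account):

```shell
# Summarize the total disk usage of your home directory, human-readable.
du -sh "$HOME"
```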

You have more scratch space in the directory below.
/farmshare/user_data/yoursunetid

You will mostly submit batch jobs to the cluster through the RELION GUI using the provided SLURM script.

Before starting RELION, you should load all the necessary software and libraries every time you open a new terminal.
$ ml cuda/8.0 openmpi/3.0.0 MotionCor2 Gctf relion    #this command loads all the necessary software

You should always run RELION in the project directory, and your micrographs should be in a separate directory under the project directory.

You can run the RELION GUI and light processes on the login node. However, you need to run computationally intensive jobs, and jobs requiring special resources such as GPUs or high memory, on the compute nodes. You will submit these jobs to the compute nodes through the RELION GUI and a SLURM submission script. SLURM is the scheduler for the FarmShare2 cluster.

We will also use other programs (MotionCor2 and Gctf), wrapped by RELION, for some steps.

Please refer to the RELION tutorial for further information on RELION-related parameters.

After finishing these steps, you should start from the "Getting Ready" step each time you want to run RELION.

Getting Ready

Open a terminal and connect to FarmShare2
$ ssh -Y sunetid@rice.stanford.edu

Purge all the modules by typing
$ ml purge

Load relion by typing
$ ml relion

The other programs can be loaded now, but this is not necessary; they will always be loaded by the submission script for your batch jobs.
$ ml cuda/8.0 openmpi/3.0.0 MotionCor2 Gctf relion

Go to the project directory
$ cd Project_T20S

Start RELION by typing
$ relion

Below are instructions for the values you should use. If a value is not mentioned, use the default or consult the RELION tutorial. You can also email me at alpays@stanford.edu.

You will also need visualization software for three-dimensional EM densities. I recommend UCSF Chimera; PyMOL should work too.
To copy files from FarmShare to your local computer, you can use the scp or rsync commands. I also suggest using the sshfs command to mount your FarmShare home directory onto a local directory.
Linux systems should already have sshfs installed. On a Mac, install FUSE and SSHFS from the osxfuse site; on Windows, please follow the instructions at this link.
Create a folder named farmshare on your local computer, then use the following command to connect it to your home folder on the FarmShare cluster.
$ sshfs yoursunetid@rice.stanford.edu:/home/yoursunetid/ pathtofarmsharefolderonyourlocalcomputer/
This connection is kept as long as your internet connection lasts (or until the key expires). You can re-run the same command to re-establish the link, and disconnect manually by running fusermount -u (Linux) or umount (macOS) on the mount point.

Import

Input files: empiar_10025_subset/14sep*.tif
Node type: 2D micrograph movies (*.mrcs, *.tiff)

RUN

MotionCor2

The purpose of this step is to correct for beam-induced sample motion recorded in dose-fractionated movie stacks.

Input movies STAR file: Import/job001/movies.star    #the job number may differ; select the STAR file of your imported movies
First frame: 1
Last frame: -1
Pixel size of the dataset: 0.6575
Voltage: 300
Dose per frame (e/A2) : 1.4
Pre-exposure (e/A2): 0
Do dose-weighting?: Yes
Save non-dose weighted?: No

Bfactor: 150
Number of patches: 5 x 5
Group frames: 1
Binning factor: 1
Gain-reference image: empiar_10025_subset/norm-amibox05-0.mrc #Select the .mrc file in the dataset
Gain rotation: No rotation (0)
Gain flip: Flip upside down (1)
Defect file:
Use RELION's own implementation: No
MotionCor2 executable: MotionCor2
Which GPUs to use: 0:1
Other MotionCor2 arguments: -InTiff

Number of MPI procs: 2
Number of threads: 1
Submit to queue: Yes
Queue name: gpu
Submit queue command: sbatch
Memory: 32G
Number of GPU: 2
Wall Time: 01:00:00
Standard submission script: /home/yoursunetid/software/scripts/relion/farmshare-relion.sbatch

RUN

You can close the RELION GUI and check whether your job is running with the following command
$ squeue -u yoursunetid

You can restart RELION by typing (if you closed your terminal, start again from the "Getting Ready" step)
$ relion
You can also open an additional FarmShare2 terminal and run other commands, such as monitoring your SLURM jobs, while RELION is running.

You can inspect your results by selecting logfile.pdf from the "Display" dropdown menu after clicking the MotionCor2 job.
You can also inspect individual integrated image files from Relion-GUI/File/Display/MotionCorr/job_name/folder_name/blabla.mrc (scale 0.1, sigma 3, lowpass 10)

If you think you made a mistake and want to cancel your SLURM batch job, use the following command

$ scancel jobid                   #the jobid can be found with: squeue -u yoursunetid

CTF Estimation

Input micrographs: Your call (select the STAR file of the motion-correction job)
Use micrograph without dose-weighting?: No
Spherical aberration (mm): 2.7
Voltage (kV): 300
Amplitude contrast: 0.1
Magnified pixel size (Angstrom): 0.6575
Amount of astigmatism (A): 100

FFT box size (pix): 512
Minimum resolution (A): 30
Maximum resolution (A): 2
Minimum defocus value (A): 5000
Maximum defocus value (A): 50000
Defocus step size (A): 500
Estimate phase shifts?: No

Use CTFFIND-4.1?: No

Use Gctf instead?: Yes
Gctf executable: Gctf-v1.06_sm_20_cu8.0_x86_64
Ignore `Searches` parameters?: No
Perform equi-phase averaging?: Yes
Other Gctf options: 
Which GPUs to use: 0:1

Number of MPI procs: 2
Submit to queue?: Yes
Queue name: gpu
Submit queue command: sbatch
Memory: 32G
Number of GPU: 2
Wall time: 00:30:00
Standard submission script: /home/yoursunetid/software/scripts/relion/farmshare-relion.sbatch
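As a rough sanity check on the defocus search settings above (assuming a simple exhaustive 1D search at the given step size, which is how the parameters are commonly interpreted), the range and step imply about 90 trial defocus values per micrograph:

```shell
# Number of defocus values scanned between the minimum (5000 A) and
# maximum (50000 A) settings at a 500 A step.
awk 'BEGIN { print (50000 - 5000) / 500 }'
```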

RUN

Manual picking

You can manually pick about 300 to 1000 particles.

Input micrographs: Your call (it should be the STAR file of the CTF estimation job)

Particle diameter (A): Your call. (This is easy to test: enter a value and examine it during manual picking. You can then close the pop-up picking windows, change any parameter, click the manual-picking job under "Finished jobs", and click "Continue" to start picking again with the new parameters.)
Sigma contrast: 3
White value: 0
Black value: 0

Lowpass filter (A): 20
Highpass filter (A): -1
Pixel size (A): -1

RUN

After running this job, you can click the finished/unfinished job, change parameters such as the particle diameter, and click "Continue". The particle-size parameter in this step is only for visualization, but it is good practice to understand the size of the molecule in both angstroms and pixels. (You can reuse this value when you extract your particles; your particle should cover about 2/3 of the box size during extraction.)
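As a sketch of that angstrom-to-pixel conversion (assuming this dataset's 0.6575 A/px pixel size; the helper name is made up for illustration), a box size in angstroms maps to pixels like so:

```shell
# Hypothetical helper (not part of RELION): convert a box size in
# angstroms to the nearest even pixel count at or above it, since
# RELION requires an even box size.
box_pixels() {
    awk -v ang="$1" -v apix="$2" 'BEGIN {
        px = ang / apix
        even = int(px)
        if (even % 2 == 1 || even < px) even += 2 - (even % 2)
        print even
    }'
}
box_pixels 120 0.6575   # a 120 A box at this dataset's pixel size
```

For example, an 80 A particle with a 120 A box comes out to 184 pixels at 0.6575 A/px.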

Particle Extraction

This is a CPU-based job and does not require many resources, since we have only a couple of micrographs and very few selected particles.

Micrograph STAR file: Your call (this should be the STAR file of the CTF estimation job)
Input coordinates: Your call (it should be the coordinate files of your manual-picking job)
OR re-extract refined particles: No
Manually set pixel size?: No

Particle box size (pix): Your call (if your particle size is 80 A, I would use a 100 A mask and a 120 A box during extraction; the extraction value is in pixels, so you have to convert)
Invert contrast?: Yes
Normalize particles?: Yes
Diameter/white/black: -1/-1/-1
Rescale particles: equal to the particle box size, or half that size or less; your call

Extract helical segments?: No
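To see what the rescale option above costs you in resolution (a sketch assuming this dataset's 0.6575 A/px pixel size and an illustrative 184-pixel box), halving the box doubles the effective pixel size and pushes the Nyquist limit out accordingly:

```shell
# Effective pixel size (A/px) and Nyquist resolution limit (A) after
# rescaling a 184-pixel box recorded at 0.6575 A/px down to 92 pixels.
awk -v apix=0.6575 -v box=184 -v rescaled=92 'BEGIN {
    new_apix = apix * box / rescaled   # A per pixel after rescaling
    printf "%.4f %.3f\n", new_apix, 2 * new_apix
}'
```

Here the rescaled data can support at best about 2.63 A, which is still beyond the ~3 A this dataset can reach.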

You can probably run this job on the login node.
Number of MPI procs: 1 (if you run it on the login node) or more (6-8) if you want to submit it to the cluster
Submit to queue: No (if you run it on the login node) or Yes if you want to submit it to the cluster
If the previous answer is No

RUN

If the previous answer is YES
Submit to queue: Yes
Queue name: normal
Submit queue command: sbatch
Memory: 32G
Number of GPU: 2
Wall Time: 01:00:00
Standard submission script: /home/yoursunetid/software/scripts/relion/farmshare-relion.sbatch

RUN

2D classification

Input images STAR file: Your call

CTF Tab --> Yes, No, No

Number of classes: Your call (decide this number according to your number of particles; about 5 classes for 500 to 1000 particles)
Regularisation parameter T: 2
Number of iterations: 25
Use fast subsets: No
Mask diameter (A): Your call (this can be determined from manual picking; the circular mask should be slightly bigger than your particle. If your particle size is 80 A, I would use a 100 A mask and a 120 A box during extraction; the extraction value is in pixels, so you have to convert)
Mask individual particles with zeros?: Yes
Limit resolution: -1

Perform image alignment?: Yes
In-plane angular sampling: 6 (you may change this)
Offset search range (pix): 5 (you may change this)
Offset search step (pix): 1 (you may change this)

Classify 2D helical segments?: No

Use parallel disc I/O?: Yes
Number of pooled particles: 3
Pre-read all particles into RAM?: No (You can change this to Yes if you have enough ram)
Combine iterations through disc?: No
Use GPU acceleration? Yes
Which GPUs to use:

Number of MPI procs: 3
Number of threads: 1
Submit to queue: Yes
Queue name: gpu
Submit queue command: sbatch
Memory: 32G
Number of GPU: 2
Wall Time: 01:00:00
Standard submission script: /home/yoursunetid/software/scripts/relion/farmshare-relion.sbatch

RUN

Rest of the Processing

Please follow the rest of the processing steps from the RELION tutorial, but keep in mind the parameters specific to this dataset. I will update this page if I get a lot of questions about a particular step. For CPU-based jobs, you can run light processing directly from the GUI without submitting a batch job, or submit to the cluster using the "normal" queue name. Please submit GPU-based jobs to the cluster using the "gpu" queue name. Your job will be scheduled faster if you keep the time parameter short, but it should be long enough for the job to finish. The 2D classification, InitialModel, 3D classification, and 3D refinement jobs with all the particles from all 20 micrographs are computationally heavy but should not take more than 8 hours (keep the time parameter at 24 hours for these jobs).

For 2D classification, 3D classification, and 3D refinement, keep MPI at 3 and GPU at 2. For auto-picking, MotionCor2, and Gctf, keep both MPI and GPU at 2. Do not use threads.

You can reach up to 3 A with this dataset. If you can do better, you should consider doing more of this. Have fun.

The general steps of a RELION pipeline are as follows (these steps can change depending on the project).

Import --> MotionCorr --> CtfFind --> Manual Picking --> Particle Extraction --> 2D Classification --> Subset Selection of 2D Classes for Autopick --> Auto-picking --> Particle Extraction --> 2D Classification --> Subset Selection of 2D Classification to select for "good" particles --> (2D Class + Subset)xN --> Initial Model --> 3D Classification --> Subset Selection of 3D Classes --> (3D Class + Subset)xN to select "good" classes --> 3D Refinement --> Other steps (Post processing, Ctf refine, Polishing, etc ...)

Please email alpays@stanford.edu if you encounter any problem, so I can update this document.