Library preparation for single-cell omics is the process of converting DNA or RNA samples into a format compatible with NGS platforms. While the typical workflows differ according to the molecule of interest (RNA or DNA), some steps are common to all single-cell sequencing strategies.
After cell isolation, DNA or RNA needs to be accessed by the library prep reagents: this step is typically performed by lysing the cell and nuclear membranes, while some protocols simply make them permeable to the enzymatic catalysts. Once the molecular material is accessible, it must be amplified to generate enough material for subsequent reactions and for detection by sequencing instruments. The starting quantities are tiny: a human cell contains about 6 pg of DNA, and an E. coli cell has less than 6 femtograms! Another crucial step in all single-cell omics workflows is the introduction of so-called cell barcodes, i.e., the addition of short nucleic acid sequences, different for each individual cell, allowing the bioinformatics-driven re-attribution of sequencing reads to the exact cell they originate from. Finally, sequencing adapters must be added to the barcoded DNA so that it can be read by the sequencing platform.
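To make the role of cell barcodes concrete, here is a minimal sketch of how pooled reads might be re-attributed to their cell of origin. It assumes, purely for illustration, that each read starts with an 8-nucleotide barcode and that one mismatch is tolerated; the barcode sequences, cell names, and function names are hypothetical and do not correspond to any specific kit or pipeline.

```python
from collections import defaultdict

# Hypothetical 8-nt cell barcodes; real kits ship their own barcode lists.
BARCODE_TO_CELL = {
    "ACGTACGT": "cell_1",
    "TTGCAAGC": "cell_2",
    "GGATCCTA": "cell_3",
}
BARCODE_LEN = 8


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_cell(read: str, max_mismatches: int = 1):
    """Return the cell whose barcode uniquely matches the read prefix, else None."""
    prefix = read[:BARCODE_LEN]
    if len(prefix) < BARCODE_LEN:
        return None  # read too short to carry a full barcode
    hits = [
        cell
        for barcode, cell in BARCODE_TO_CELL.items()
        if hamming(prefix, barcode) <= max_mismatches
    ]
    # Ambiguous or unmatched prefixes are rejected rather than guessed.
    return hits[0] if len(hits) == 1 else None


def demultiplex(reads):
    """Group insert sequences (barcode removed) by cell of origin."""
    by_cell = defaultdict(list)
    for read in reads:
        cell = assign_cell(read)
        if cell is not None:
            by_cell[cell].append(read[BARCODE_LEN:])
    return dict(by_cell)


reads = [
    "ACGTACGTGGGCATTA",  # exact match to cell_1
    "TTGCAAGCCCATGAAT",  # exact match to cell_2
    "ACGTACGAATTTGGCC",  # one mismatch in the barcode, still cell_1
    "CCCCCCCCAAAATTTT",  # no barcode within tolerance, discarded
]
print(demultiplex(reads))
```

Production pipelines add further steps, such as quality filtering and barcode whitelists, but the principle is the same: the barcode read alongside each fragment is the only link back to the cell it came from.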
Variations and adaptations of these core library prep principles are numerous and depend primarily on the molecular phenomenon of interest.
Single-cell RNA-seq (scRNA-seq)
Single-cell transcriptomics studies the variability of cell transcriptomes at the individual cell level. It is particularly relevant to cancer research, where it can help researchers understand the heterogeneity of cell populations within tumors [5]. There are several approaches to library prep in scRNA-seq, each with unique advantages.
Targeting the Transcript
scRNA-seq analysis may focus on different regions of the transcript molecules. Full-length scRNA-seq, mostly used for variant exploration, captures the entire messenger RNA (mRNA) length, preserving complete transcript sequences. 3’ and 5’ scRNA-seq each focus on one end of the mRNA molecule. The former is used to explore differential gene expression, since just a few dozen base pairs at the 3’ end of a transcript are enough to identify a gene in well-characterized targets such as the human genome. The latter focuses on post-transcriptional modifications, which mainly occur at the 5’ end of messenger molecules. Other scRNA-seq library preparation methods aim at sequencing the full transcriptome of each cell, including non-messenger RNAs, particularly regulatory RNAs. Finally, targeted scRNA-seq enhances sensitivity and cost-efficiency by focusing on specific gene groups through hybridization probes or amplicon-based methods.
Protocol Selection
Overall, the choice of scRNA-seq library prep protocol is highly dependent on the research question and model of interest. Most scRNA-seq methods implement a reverse-transcription (RT) step, which is necessary to convert RNA into PCR- and NGS-compatible DNA, relatively early in the workflow, if not as the first step. This allows the introduction of cell barcodes early in the protocol by adding them onto the RT primers. This way, the libraries of multiple single cells can be quickly pooled into one sample for easier and quicker subsequent handling. cDNA can then be fragmented using different methods depending on the transcript regions of interest. It can also be amplified, typically by PCR, before or during the addition of more classical sequencing adapters. As with any molecular biology workflow, scRNA-seq has biases and limitations, including potential PCR biases, which grow with the number of PCR cycles needed to yield enough material for sequencing.
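Why does PCR bias worsen with cycle number? A toy model helps, assuming two templates that start in equal amounts and differ only in a constant per-cycle amplification efficiency (1.0 meaning perfect doubling). The function name and the efficiency values below are illustrative assumptions, not measurements from any protocol.

```python
def fold_bias(eff_a: float, eff_b: float, cycles: int) -> float:
    """Ratio of template A to template B copies after `cycles` PCR cycles.

    Both templates start at the same copy number. Each cycle multiplies a
    template by (1 + efficiency), so the ratio between the two compounds
    cycle after cycle.
    """
    return ((1.0 + eff_a) / (1.0 + eff_b)) ** cycles


# Assumed efficiencies: template A amplifies slightly better than template B.
for n in (10, 20, 30):
    print(f"{n} cycles: A is over-represented {fold_bias(0.95, 0.90, n):.2f}-fold")
```

In this simplified model, even a small efficiency difference compounds with every cycle, which is one reason to keep the cycle count as low as the required library yield allows.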
Single-Cell Whole Genome Sequencing (scWGS)
scWGS refers to the sequencing of the total DNA material of individual cells and has led to insights such as tracking the progression of melanoma tumors during anti-programmed cell death protein 1 therapy [5]. scWGS library prep requires the amplification and fragmentation of all of the DNA material in a cell.
Amplification Method
Amplification can be performed using a variety of methods, such as high-throughput in vitro transcription (followed by RT) or rolling circle amplification, but the most popular is Multiple Displacement Amplification (MDA). While MDA, catalyzed by a bacteriophage-derived DNA polymerase, ensures the highest amplification yield, it also introduces substantial amplification biases, as certain regions of the genome may become overrepresented or underrepresented compared to others (e.g., loss of GC-rich regions [6]).
Fragmentation and Barcoding
Subsequently, cutting the amplified DNA into NGS-compatible fragments and adding the cell barcodes and sequencing adapters involves either fragmentation (generally enzymatic) followed by adapter ligation, or tagmentation, which achieves both steps in a single reaction thanks to engineered transposases.
All of these steps involve very expensive enzymatic reagents, and their efficiency is tied to the ability of each enzyme to “meet” its target molecule in the reagent mix. Miniaturization is therefore of significant interest for both cost and efficiency.
The Precision Microdispensing Advantage
Precision microdispensing provides several advantages for single-cell sequencing library preparation. One advantage is minimizing reagent waste by delivering very small, and very precise, volumes, preventing excess usage and drastically reducing reagent cost, especially when using instruments with low dead volume. Moreover, miniaturized workflows often prove more efficient for single-cell sequencing. DNA content from a single cell falling into a classical 50 µL microplate-based assay would be like pouring a glass of fruit juice into the ocean: it would be very diluted. Why is this an issue? Because most enzymatic reactions, and sequencing itself, require a minimum target-molecule concentration or mass to yield sufficient product, or even to start at all. Library miniaturization increases the chance that substrate and enzyme meet and then react efficiently.
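The dilution argument can be put into rough numbers. The sketch below estimates the DNA mass of one diploid human cell from its genome size, then compares the resulting concentration in a 50 µL reaction and in a nanowell-scale reaction. The 650 g/mol average base-pair mass, the 6.2 Gbp diploid genome size, the 4.6 Mbp E. coli genome size, and the 200 nL nanowell volume are all assumptions chosen for illustration.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # assumed average molar mass of one DNA base pair


def genome_mass_pg(base_pairs: float) -> float:
    """Approximate mass in picograms of double-stranded DNA of a given length."""
    grams = base_pairs * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e12


def concentration_pg_per_ul(mass_pg: float, volume_ul: float) -> float:
    """DNA concentration in pg/µL for a given mass and reaction volume."""
    return mass_pg / volume_ul


HUMAN_DIPLOID_BP = 6.2e9  # assumed: about 2 x 3.1 Gbp
ECOLI_BP = 4.6e6          # assumed: about 4.6 Mbp

human_pg = genome_mass_pg(HUMAN_DIPLOID_BP)
ecoli_fg = genome_mass_pg(ECOLI_BP) * 1000.0  # picograms to femtograms

plate = concentration_pg_per_ul(human_pg, 50.0)    # classical 50 µL assay
nanowell = concentration_pg_per_ul(human_pg, 0.2)  # hypothetical 200 nL reaction

print(f"Human cell DNA: {human_pg:.1f} pg; E. coli cell DNA: {ecoli_fg:.1f} fg")
print(f"50 µL reaction: {plate:.3f} pg/µL")
print(f"200 nL reaction: {nanowell:.1f} pg/µL ({nanowell / plate:.0f}x more concentrated)")
```

Under these assumptions, the same single-cell genome is 250 times more concentrated in the nanoliter-scale reaction, simply because the volume is 250 times smaller.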
Maximizing Efficiency While Minimizing Bias
Precision dispensing maximizes efficiency and minimizes amplification bias in certain contexts. For example, researchers used the cellenONE system to develop a scalable method for single-cell whole-genome sequencing called Direct Library Preparation (DLP+) that allowed single-cell WGS to be scaled to hundreds of thousands of genomes. The cellenONE sorts single cells, excluding debris and multiplets, and dispenses them into aluminum chips containing more than 5,000 nanowells prefilled with lysis buffer and individual cell barcodes.
After cell lysis, DNA is directly tagmented by transposases, which cut the DNA into controlled-size fragments and introduce PCR primer binding sites. PCR is then performed directly in the nanowells, introducing cell barcodes and generating sequencing-ready libraries in a few hundred nanoliters! This method was further improved by incorporating images of the isolated cells generated by the cellenONE, which provided deeper insights into cellular aneuploidy (Fig. 2). This approach enabled the analysis of rare cell populations, the replication status of individual cells, and other important cellular characteristics.