WO2025096338A1 - Systems and methods for morphometric analysis - Google Patents



Publication number
WO2025096338A1
Authority
WO
WIPO (PCT)
Prior art keywords
cell
images
morphometric
model
cells
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/053235
Other languages
English (en)
Inventor
Ryan C. CARELLI
Cristian L. LUENGO HENDRIKS
Senzeyu ZHANG
Kevin B. JACOBS
Amirali Kia
Mahyar Salek
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Deepcell Inc
Original Assignee
Deepcell Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from NL2036275A (NL2036275B1)
Priority claimed from NL2036929A (NL2036929B1)
Application filed by Deepcell Inc
Publication of WO2025096338A1
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/469 Contour-based spatial representations, e.g. vector-coding
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/69 Microscopic objects, e.g. biological cells or cellular parts
    • G06V20/698 Matching; Classification
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis

Definitions

  • Analysis of a cell can be accomplished by examining, for example, one or more images of the cell that is tagged (e.g., stained with a polypeptide, such as an antibody, against a target protein of interest within the cell; with a polynucleotide against a target gene of interest within the cell; with probes to analyze gene expression profile of the cell via polymerase chain reaction; or with a small molecule substrate that is modified by the target protein) or sequencing data of the cell (e.g., gene fragment analysis, whole-genome sequencing, whole-exome sequencing, RNA-seq, etc.).
  • Such methods can be used to identify cell type (e.g., stem cell or differentiated cell) or cell state (e.g., healthy or disease state). Such methods can require treatment of the cell (e.g., antibody staining, cell lysis or sequencing etc.) that can be time-consuming and/or costly.
  • Example 1 A method of processing, comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and the set of ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image, the cell morphometric features being orthogonal to the plurality of DL embeddings; and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features.
  • DL Deep Learning
  • ML machine learning
  • Example 2 The method of example 1, wherein the generating the plurality of morphometric predictive embeddings comprises: generating, using the computer vision model and the set of the cell morphometric features, a plurality of morphometric embeddings; and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings to a plurality of predictive multidimensional vectors comprising predictive probabilities morphometric features.
  • Example 3 The method of example 1, wherein the DL model is trained using a loss function comprising invariance, variance, covariance, morphometric decorrelation terms, or combinations thereof.
  • Example 4 The method of example 1, further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using at least a correlation coefficient greater than approximately 0.9.
  • Example 5 The method of example 1, further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using at least a correlation coefficient ranging between approximately 0.85 and approximately 0.95.
  • Example 6 The method of example 1, wherein the set of cell morphometric features comprises a plurality of blob features.
  • Example 7 The method of example 1, wherein the generating the plurality of morphometric predictive embeddings is in a high throughput setting.
  • Example 8 The method of example 1, wherein the plurality of DL embeddings comprises cell morphology information independent from a fixed set of rule-based morphometric features.
  • Example 9 The method of example 1, wherein the plurality of DL embeddings comprises cell morphology data orthogonal to a fixed set of rule-based morphometric features.
  • Example 10 The method of example 1, further comprising: separating, using the DL model, a plurality of de-correlated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions.
  • Example 11 The method of example 1, wherein the DL model is a self-supervised machine learning (SSL) system, the method further comprising: de-correlating, using the DL model, the set of morphometric features from the DL embeddings such that the DL model is trained to acquire information not covered using the computer vision model.
  • SSL self-supervised machine learning
  • Example 12 The method of example 1, wherein the cell image comprises a label free image.
  • Example 13 The method of example 1, further comprising: hosting the DL model and the computer vision model in a cloud computing environment.
  • Example 14 The method of example 1, wherein the method is performed in a cloud computing environment.
  • Example 15 The method of example 1, further comprising: generating an instruction to sort a cell of the cell image based on the plurality of morphometric predictive embeddings.
  • Example 16 The method of example 1, further comprising: sorting, using the plurality of morphometric predictive embeddings, a cell of the cell image; and feeding data from the sorting back to the DL model in order to train the DL model for future generating of the plurality of DL embeddings.
  • Example 17 A method for assessing image data, comprising: extracting, using a Deep Learning (DL) model and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted using a computer vision model; and generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other.
  • Example 18 The method of example 17, wherein the DL model is trained using a loss function comprising invariance, variance, covariance, morphometric decorrelation terms, or a combination thereof.
  • Example 19 The method of example 17, further comprising: generating, using the
  • Example 20 The method of example 19, wherein the generating the plurality of morphometric predictive embeddings comprises: generating, using the computer vision model and at least the set of the cell morphometric features, a plurality of morphometric embeddings; and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings to a plurality of predictive multidimensional vectors comprising predictive probabilities morphometric features.
  • Example 21 The method of example 19, further comprising: predicting, using the
  • Example 22 The method of example 17, wherein the plurality of DL embeddings comprise cell morphology information orthogonal to the set of morphometric features, and wherein the set of morphometric features are determined using at least a fixed set of rules.
  • Example 23 The method of example 17, further comprising: separating, using the
  • Example 24 The method of example 17, wherein the DL model is a self-supervised machine learning (SSL) system, the method comprising: de-correlating, using the DL model, the set of morphometric features from the plurality of DL embeddings so that the DL model is trained to acquire information not covered using the computer vision model.
  • Example 25 The method of example 17, wherein the image data comprises a label free image of each cell of the plurality of cells.
  • Example 26 The method of example 17, further comprising: hosting the DL model and the computer vision model in a cloud computing environment.
  • Example 27 The method of example 17, further comprising: generating an instruction to sort the plurality of cells using the plurality of DL embeddings.
  • Example 28 The method of example 17, further comprising: sorting, using the plurality of morphometric predictive embeddings, a cell of the cell image; and
  • Example 29 A system for analyzing image data, the system comprising: at least one processor to execute instructions to perform operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and using at least the ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image that are orthogonal to the plurality of DL embeddings; and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features.
  • Example 30 The system of Claim 29, wherein the set of cell morphometric features comprises a plurality of blob features.
  • Example 31 The system of Claim 29, wherein the generating the plurality of morphometric predictive embeddings is in a high throughput setting.
  • Example 32 The system of Claim 29, wherein the plurality of DL embeddings comprise cell morphology data orthogonal to a fixed set of rule based morphometric features.
  • Example 33 The system of Claim 29, the operations further comprising: separating, using the DL model, a plurality of de-correlated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions.
  • Example 34 The system of Claim 29, wherein the DL model is a self-supervised machine learning (SSL) system, the operations further comprising: de-correlating the set of morphometric features from the plurality of DL embeddings so that the DL model is trained to acquire information not covered using the computer vision model.
  • Example 35 The system of Claim 29, wherein the at least one processor is in a cloud computing environment.
  • Example 36 A non-transitory computer-readable medium storing instructions that, when executed by a processor, cause the processor to perform operations for analyzing image data of a cell image, the operations comprising: extracting, using a trained Deep Learning (DL) model and from image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted by a trained computer vision model; generating, using the trained DL model and the ML-based features, a plurality of DL embeddings orthogonal to each other; and extracting, using the trained computer vision model, a set of cell morphometric features from the cell image that are orthogonal to the plurality of DL embeddings.
  • Example 37 A cloud-based computing system, the system comprising: at least one cloud-based processor to execute the instructions to perform operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and using at least the set of ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image, the cell morphometric features being orthogonal to the plurality of DL embeddings; and generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features.
  • Example 38 The system of Claim 37, wherein the generating the plurality of morphometric predictive embeddings comprises: generating, using the computer vision model and at least the set of the cell morphometric features, a plurality of morphometric embeddings; and concatenating, using the DL model, the plurality of morphometric embeddings and the plurality of DL embeddings to a plurality of predictive multidimensional vectors comprising predictive probabilities morphometric features.
  • Example 39 The system of Claim 37, wherein the DL model is trained using a loss function comprising one or more invariance, variance, covariance, and morphometric decorrelation terms.
  • Example 40 The system of Claim 37, the operations further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using at least a correlation coefficient greater than approximately 0.9.
  • Example 41 The system of Claim 37, the operations further comprising: predicting, using the DL model and the plurality of morphometric predictive embeddings, a plurality of morphometric blob features using at least a correlation coefficient ranging between approximately 0.85 and approximately 0.95.
  • Example 42 The system of Claim 37, the operations further comprising: separating, using the DL model, a plurality of de-correlated DL embeddings and the plurality of morphometric predictive embeddings into different dimensions.
  • Example 43 The system of Claim 37, wherein the DL model is a self-supervised machine learning (SSL) system, the operations further comprising: de-correlating, using the DL model, the set of morphometric features from the DL embeddings such that the DL model is trained to acquire information not covered using the computer vision model.
  • Example 44 A method for cell sorting, comprising: transporting a cell suspended in a fluid through a flow channel, wherein the flow channel is in fluid communication with a plurality of sub-channels; capturing one or more images of the cell as the cell is transported through the flow channel; extracting, by a Deep Learning (DL) model, a set of machine learning (ML)-based features from the one or more images; generating, using the DL model and the set of ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the one or more images, the cell morphometric features being orthogonal to the plurality of DL embeddings; generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features; and sorting the cell to a selected sub-channel of the plurality of sub-channels using the plurality of
  • Example 45 A method for cell sorting, comprising: transporting a cell suspended in a fluid through a flow channel, wherein the flow channel is in fluid communication with a plurality of sub-channels; capturing one or more images of the cell as the cell is transported through the flow channel; extracting, by a Deep Learning (DL) model, a set of machine learning (ML)-based features from the one or more images; generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other; and sorting the cell to a selected sub-channel of the plurality of sub-channels using the plurality of DL embeddings.
  • Example 46 A cell sorting system, comprising: a flow channel configured to transport a cell through the flow channel; an imaging device configured to capture one or more images of the cell as the cell is transported through the flow channel; and a processor to execute instructions to perform operations comprising: extracting, using a Deep Learning (DL) model, a set of machine learning (ML)-based features from a cell image; generating, using the DL model and using at least the ML-based features, a plurality of DL embeddings orthogonal to each other; extracting, using a computer vision model, a set of cell morphometric features from the cell image that are orthogonal to the plurality of DL embeddings; generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features; and enabling sorting of the cell using at least one of the plurality of morphometric predictive embeddings, the plurality of DL embeddings, and the set of cell
  • Example 47 A cell sorting system, comprising: a flow channel configured to transport a cell through the flow channel; an imaging device configured to capture one or more images of the cell as the cell is transported through the flow channel; and a processor to execute instructions to perform operations comprising: extracting, using a Deep Learning (DL) model and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted using a computer vision model; generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other; and enabling sorting of the cell using at least one of the plurality of DL embeddings and the set of cell morphometric features.
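Examples 3, 18, and 39 recite a loss function comprising invariance, variance, covariance, and morphometric decorrelation terms, but the excerpt gives no formulas. The following is a minimal NumPy sketch of what such terms could look like, assuming a formulation in the style of variance-invariance-covariance regularization; the function names `loss_terms` and `total_loss`, the unit weights, and the exact form of the decorrelation penalty are illustrative assumptions, not the applicant's method.

```python
import numpy as np


def loss_terms(z_a, z_b, morph, eps=1e-4):
    """Hypothetical loss terms for one batch.

    z_a, z_b: (N, D) DL embeddings of two augmented views of the same cells.
    morph:    (N, M) rule-based morphometric features for the same cells.
    For brevity the variance and covariance terms use only z_a.
    """
    n, d = z_a.shape

    # Invariance: the two views of a cell should map to nearby embeddings.
    invariance = float(np.mean((z_a - z_b) ** 2))

    # Variance: hinge penalty when a dimension's standard deviation drops below 1.
    std = np.sqrt(z_a.var(axis=0) + eps)
    variance = float(np.mean(np.maximum(0.0, 1.0 - std)))

    # Covariance: penalize off-diagonal covariance so dimensions decorrelate
    # from each other.
    z_centered = z_a - z_a.mean(axis=0)
    cov = z_centered.T @ z_centered / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    covariance = float((off_diag ** 2).sum() / d)

    # Morphometric decorrelation (assumed form): penalize cross-correlation
    # between standardized embeddings and standardized morphometric features.
    z_std = z_centered / std
    m_std = (morph - morph.mean(axis=0)) / np.sqrt(morph.var(axis=0) + eps)
    cross_corr = z_std.T @ m_std / n
    decorrelation = float(np.mean(cross_corr ** 2))

    return {
        "invariance": invariance,
        "variance": variance,
        "covariance": covariance,
        "decorrelation": decorrelation,
    }


def total_loss(terms, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four terms; the weights are placeholders."""
    keys = ("invariance", "variance", "covariance", "decorrelation")
    return sum(w * terms[k] for w, k in zip(weights, keys))
```

In this sketch the decorrelation term is large when the embeddings simply reproduce the morphometric features and small when they are statistically unrelated to them, which is one way to read the stated goal of training the DL model "to acquire information not covered using the computer vision model."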
  • FIG. 1 illustrates an example workflow of extracting features associated with cell morphology from cell images using the human foundation model, in accordance with some examples of the present disclosure.
  • FIG. 2A illustrates an example interaction between a microfluidics platform, the human foundation model, and the data suite, in accordance with some examples of the present disclosure.
  • FIG. 2B illustrates an example workflow from high-throughput imaging to cell characterization, classification and sorting based on cell morphology analysis, in accordance with some examples of the present disclosure.
  • FIG. 3 schematically illustrates an example method for classifying a cell.
  • FIG. 4 schematically illustrates, in one example, different ways of representing analysis data of image data of cells.
  • FIG. 5 schematically illustrates, in one example, different representations of analysis of image data of a population of cells.
  • FIG. 9 illustrates an example training architecture of the human foundation model.
  • FIG. 10 illustrates another example of the training architecture of the human foundation model shown in FIG. 9.
  • FIGs. 11A and 11B show examples of morphometric features of cellular images.
  • FIG. 14 with views (a) to (e) schematically illustrates operations that can be performed in an example method.
  • FIG. 18 illustrates, in one example, how a convolution layer may identify features in an input image.
  • FIG. 19 illustrates, in one example, how a transpose convolution layer may operate.
  • FIGS. 20A-20E illustrate, in one example, morphometric characteristics which may be used as dimensions in feature space for cell images.
  • FIG. 21 illustrates, in one example, a method which can be used to train a neural network to approximate the process of recreating an original input after it has been degraded through the introduction of noise.
  • FIG. 22 illustrates, in one example, a method which may be used to create new images using a neural network trained in a manner such as shown in FIG. 21.
  • FIG. 23 illustrates, in one example, an image creation process which is guided by additional information such as a complete or partial embedding.
  • FIG. 24 depicts, in one example, a process which may be used in exploring a feature space.
  • FIG. 26 depicts, in one example, hardware optimized for rendering and/or parallel operations.
  • FIG. 27 depicts an example imaging system.
  • the word “comprise,” and variations such as “comprises” and “comprising” means various components may be co-jointly employed in the methods and articles (e.g., compositions and apparatuses including device and methods).
  • the term “comprising” will be understood to imply the inclusion of any stated elements or acts but not the exclusion of any other elements or acts.
  • any of the apparatuses and methods described herein are inclusive, but all or a sub-set of the components and/or acts may alternatively be exclusive and may be expressed as “consisting of” or alternatively “consisting essentially of” the various components, acts, sub-components, or sub-acts.
  • references to “one example” are not intended to be interpreted as excluding the existence of additional examples that also incorporate the recited features.
  • the use of “including,” “comprising,” “having,” or “in which,” and variations thereof, herein is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.
  • morphology or “morphological characteristic” or “morphological feature” of a cell as used herein generally refers to the form, structure, and/or configuration of the cell.
  • morphometric feature is intended to refer to a quantitative representation of a morphological feature of a cell, and in some cases the terms “morphological feature” and “morphometric feature” are used interchangeably herein.
  • the morphology of a cell can comprise one or more aspects of a cell’s appearance, such as, for example, shape, size, arrangement, form, structure, pattern(s) of one or more internal and/or external parts of the cell, or shade (e.g., color, greyscale, etc.).
  • Non-limiting examples of a shape of a cell can include, but are not limited to, circular, elliptic, shmoo-like, dumbbell, star-like, flat, scale-like, columnar, invaginated, having one or more concavely formed walls, having one or more convexly formed walls, prolongated, having appendices, having cilia, having angle(s), having corner(s), etc.
  • a morphological feature of a cell can be visible with treatment of a cell (e.g., small molecule or antibody staining). In another example, the morphological feature of the cell need not require any treatment to be visualized in an image or video.
  • unstructured or “unsorted,” as used interchangeably herein, generally refers to a mixture of cells (e.g., an initial mixture of cells) that is not substantially sorted (or rearranged) into separate partitions.
  • An unstructured population of cells can comprise at least two types of cells that can be distinguished by exhibiting different properties (e.g., one or more physical properties, such as one or more different morphological characteristics as disclosed herein).
  • the unstructured population of cells can be a random (or randomized) mixture of the at least two types of cells.
  • the cells as disclosed herein can be viable cells.
  • a viable cell can be a cell that is not undergoing necrosis or a cell that is not in an early or late apoptotic state.
  • Assays for determining cell viability can include, e.g., propidium iodide (PI) staining, which can be detected by flow cytometry.
  • the cells need not be viable (e.g., fixed cells).
  • a “viable cell” as disclosed herein can be characterized by exhibiting one or more characteristics (e.g., morphology, one or more gene expression profiles, etc.) that are substantially unaltered (or not substantially impacted) by any operation or process of the methods disclosed herein (e.g., partitioning).
  • a characteristic of a viable cell can be a gene transcript accumulation rate, which can be characterized by a change in transcript levels of a same gene (e.g., a same endogenous gene) between mother and daughter cells over the time between cell divisions, as ascertained by single cell sequencing, polymerase chain reaction (PCR), etc.
  • PCR polymerase chain reaction
  • high throughput when referring to a platform, system, model, and the like, means that such a platform, system, model, etc., is capable of generating an embedding for at least one image within a desired time, such as but not limited to approximately 5 ms to 30 ms.
  • a high-throughput setting can also include components to process approximately 10,000 frames/sec and/or approximately 1000 images/sec while being configured to correct per-pixel variation in background offset, camera gain, and/or illumination for the processed frames.
  • “high-throughput” systems can require relatively low-latency.
  • Relative terms such as “about,” “substantially,” or “approximately” are used to include small variations with specific numerical values (e.g., +/- x%), as well as including the situation of no variation (+/-0%).
  • the numerical value x is less than or equal to 10 - e.g., less than or equal to 5, to 2, to 1, or smaller.
  • the term “real time” or “real-time,” as used interchangeably herein, generally refers to an event (e.g., an operation, a process, a method, a technique, a computation, a calculation, an analysis, an optimization, etc.) that is performed using recently obtained (e.g., collected or received) data.
  • the event may include, but is not limited to, analysis of one or more images of a cell to classify the cell, updating one or more deep learning algorithms (e.g., neural networks) for classification and sorting, controlling actuation of one or more valves at a sorting bifurcation, etc.
  • a real time event can be performed almost immediately or within a short enough time span, such as within at least about 0.0001 ms, at least about 0.0005 ms, at least about 0.001 ms, at least about 0.005 ms, at least about 0.01 ms, at least about 0.05 ms, at least about 0.1 ms, at least about 0.5 ms, at least about 1 ms, at least about 5 ms, at least about 0.01 seconds, at least about 0.05 seconds, at least about 0.1 seconds, at least about 0.5 seconds, at least about 1 second, or more.
  • a real time event can be performed almost immediately or within a short enough time span, such as within at most about 1 second, at most about 0.5 seconds, at most about 0.1 seconds, at most about 0.05 seconds, at most about 0.01 seconds, at most about 5 ms, at most about 1 ms, at most about 0.5 ms, at most about 0.1 ms, at most about 0.05 ms, at most about 0.01 ms, at most about 0.005 ms, at most about 0.001 ms, at most about 0.0005 ms, at most about 0.0001 ms, or less.
  • any of the operations of a computer processor as provided herein can be performed (e.g., automatically performed) in real-time.
  • an “encoder” refers to a type of deep learning model that transforms or “encodes” an image into a vector.
  • the device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
  • the terms “upwardly,” “downwardly,” “vertical,” “horizontal,” and the like are used herein for the purpose of explanation only unless specifically indicated otherwise.
  • terms such as “outef ’ and “inner” are used herein for purposes of description andare notintendedto indicate or imply relative importance or significance.
  • a feature or element When a feature or element is herein referred to as being “on” or “over” another feature or element, itmay be directly on or overthe other feature or element; or intervening features and/or elements may also be present. In other words, when a feature or element is herein referred to as being “on” or “over” another feature or element, it may be indirectly on or over the other feature or element. In contrast, when a feature or element is referred to as being “directly on” or “directly over” another feature or element, there are no intervening features or elements present.
  • the terms “about” or “approximately” for any numerical values or ranges indicate a suitable dimensional tolerance, or other form of reasonable expected range, that allows the part or collection of components to function for its intended purpose as described herein. More specifically, “about” or “approximately” may refer to the range of values that are within ⁇ 10% of the recited value, including ⁇ 0 (e.g., “about 100” may refer to the range of values from 90 to 110, including 90, 110, 100, and all other values within the range of 90 and 110). Any numerical values given herein include about or approximately that value unless the context indicates otherwise.
  • the term “substantially” is also utilized herein to represent the degree by which a quantitative representation may vary from a stated reference without resulting in a change in the basic function of the subject matter at issue.
  • The term “substantially” shall therefore be understood to include a range of conditions or results that provide a functional equivalent to an explicitly stated condition or result. For instance, if a task is “substantially complete,” the result of the task having been substantially completed is functionally equivalent to the result that may have been achieved if the task had been perfectly completed.
  • a component that is “substantially straight” or “substantially flat,” an apparatus including a component that is “substantially straight” or “substantially flat” may provide a result or effect that is functionally equivalent to a result or effect that may be achieved by the same apparatus including the same component in a perfectly straight or perfectly flat configuration.
  • the range implied by the term “substantially” includes the perfect result that is within that range.
  • the term “substantially complete” shall be read as including “perfectly complete” while also including a range of completeness that is functionally equivalent to perfectly complete.
  • terms such as “substantially straight” and “substantially flat” shall be read as including “perfectly straight” and “perfectly flat,” respectively; while also including a range of straightness or flatness that is functionally equivalent to perfectly straight or flat, respectively.
  • the term “substantially” may indicate a suitable dimensional tolerance, or other form of reasonable expected range, that allows a part or collection of components to function for its intended purpose as described herein.
  • the term “perpendicular” shall be understood to include arrangements where one element (e.g., surface, feature, component, axis, etc.) defines an angle of 90 degrees with another element (e.g., surface, feature, component, axis, etc.).
  • the term “perpendicular” shall also be understood to include arrangements where one element (e.g., surface, feature, component, axis, etc.) defines an angle of approximately 90 degrees with another element (e.g., surface, feature, component, axis, etc.).
  • although the terms “first” and “second” may be used herein to describe various features/elements (including acts), these features/elements are not limited by these terms, unless the context indicates otherwise. These terms are used to distinguish one feature/element from another feature/element, and unless specifically pointed out, do not denote a certain order. Thus, a first feature/element discussed below may be termed a second feature/element, and similarly, a second feature/element discussed below may be termed a first feature/element without departing from the teachings of the present disclosure.
  • the terms “first,” “second,” and “third,” etc. are thus used merely as labels, and are not intended to impose numerical requirements on their objects.
  • As used herein, the terms “system,” “apparatus,” and “device” may be read as being interchangeable with each other. A system, apparatus, and device may each include a plurality of components having various kinds of structural and/or functional relationships with each other.
  • the term “fluid” shall be understood to include liquids and gases, including pneumatic pressure.
  • the term “fluid communication” shall be understood to include the communication of liquids and the communication of gases, including pneumatic pressure.
  • morphometric feature of a cell generally refers to the form, structure, and/or configuration of the cell.
  • the morphometric features of a cell may comprise one or more aspects of a cell’s appearance, such as, for example, shape, size, arrangement, form, structure, pattern(s) of one or more internal and/or external parts of the cell, or shade (e.g., color, greyscale, etc.).
  • Non-limiting examples of a shape of a cell may include, but are not limited to, circular, elliptic, dumbbell, star-like, flat, scale-like, columnar, invaginated, having one or more concavely formed walls, having one or more convexly formed walls, prolongated, having appendices, having cilia, having angle(s), having corner(s), etc.
  • a morphometric feature of a cell may be visible with treatment of a cell (e.g., small molecule or antibody staining). In other examples, the morphometric feature of the cell may not and need not require any treatment to be visualized in an image or video.
  • the terms “unstructured” or “unsorted,” as used interchangeably herein, generally refer to a mixture of cells (e.g., an initial mixture of cells) that is not substantially sorted (or rearranged) into separate partitions.
  • An unstructured population of cells may comprise at least two types of cells that can be distinguished by exhibiting different properties (e.g., one or more physical properties, such as one or more different morphological characteristics as disclosed herein).
  • the unstructured population of cells may be a random (or randomized) mixture of the at least two types of cells.
  • the cells as disclosed herein may be viable cells.
  • a viable cell, as disclosed herein, may be a cell that is not undergoing necrosis or a cell that is not in an early or late apoptotic state. In other examples, the cells may not and need not be viable (e.g., fixed cells).
  • the term “resilient” refers to a material property where the material has shape memory and stiffness such that it is structurally biased toward a neutral shape or structural arrangement.
  • a resilient member may have a resilient bias toward a neutral shape or structural arrangement where the resilient member is straight along a central longitudinal axis. That same resilient member may be deformed relative to the neutral shape or structural arrangement, such as by being bent away from that central longitudinal axis, in response to a force (e.g., when a force is imparted on the resilient member, where the force has a directional component that is transverse to the central longitudinal axis).
  • While the resilient member is being deformed relative to the neutral shape in response to the force, the resilient member may be under stress whereby the resilient property of the material of the resilient member generates a force in a direction that is opposite to the force that is causing the deformation of the resilient member.
  • the resilient property of the material of the resilient member may impart a mechanical bias urging the resilient member back toward the neutral shape or structural arrangement.
  • the resilient bias of the material of the resilient member may cause the resilient member to return to (or at least toward) the neutral shape or structural arrangement. While the foregoing example provides a straight configuration as a neutral shape or structural arrangement, other examples of resilient members may have other kinds of neutral shapes or structural arrangements.
  • Morphology is an important cell property associated with identity, state, and function, but in some instances it is characterized crudely in a few standard dimensions such as diameter, perimeter, or area, or with subjective qualitative descriptions.
  • the present disclosure provides a method of processing that includes using a machine learning encoder to extract a set of ML-based features from a cell image, using a computer vision encoder to extract a set of cell morphometric features from the cell image, and using the set of ML-based features and the set of cell morphometric features to generate a feature vector that represents morphology of the cell.
  • the feature vector can be used in a variety of practical applications, e.g., in a manner such as described in the nonlimiting examples provided herein.
  • tumors are composed of heterogeneous assortments of cells with distinct genetic and phenotypic characteristics that may drive therapeutic resistance, immune evasion, and disease progression.
  • the advent of single cell technologies has enabled deep profiling of individual cells within a tumor microenvironment, leading to a better understanding of tumor biology and subsequently more effective cancer treatment strategies. While profiling technologies such as flow cytometry and single cell sequencing yield insight on tumor composition, cells are sometimes no longer amenable to additional downstream studies after being subjected to antibody staining or destructive analytical processes such as cell lysis.
  • Current sorting methods such as fluorescence-activated cell sorting (FACS) rely on a limited set of biomarkers, which cannot cover the full extent of, or be readily available for, all distinct cell properties. Additionally, dependence on antibodies, dyes/stains, and biomarkers to denote cell identity may inadvertently create sampling bias by depleting biomarker-negative but potentially biologically interesting cell populations.
  • Cell morphology information has historically been used for cell and disease characterization but has been difficult to objectively and reproducibly quantify. Cell morphology in many instances is studied qualitatively through microscopes, an approach that can be inherently slow and difficult to scale, and that relies on human interpretation.
  • the present disclosure provides multi-dimensional morphology analysis (e.g., profiling) enabled by machine learning and computer vision morphometrics.
  • the present disclosure has the benefit of enabling higher resolution and biological insight while reducing labor-intensive cell processing manipulations.
  • the multi-dimensional morphology profiling and sorting of unlabeled single cells using machine learning, advanced imaging, and microfluidics can be used to assess population heterogeneity beyond biomarkers.
  • the present disclosure provides a method for cell morphology analysis.
  • the method may combine deep learning and computer vision methods to extract features from cell images.
  • a deep learning model used in the method may provide quantitative descriptions of cell features using one or more neural networks.
  • a computer vision model used in the method may provide a quantitative assessment of cell and biological features using discrete image analysis algorithms. The method as described herein may allow for extracting and interpreting cell morphology features with a multidimensional, unbounded, and quantitative assessment.
  • the present disclosure provides a system for cell morphology analysis.
  • the system may comprise a benchtop single-cell imaging and sorting system for high-dimensional morphology analysis.
  • the system may combine label-free imaging, deep learning, computer vision morphometrics, and gentle cell sorting to leverage multidimensional single cell morphology as a quantitative readout.
  • the system may capture high-resolution brightfield cell images, from which features (e.g., dimensional embedding vectors) can be extracted representing the morphology of the cells.
  • the system and method may combine label-free imaging, deep learning, computer vision morphometrics, and gentle cell sorting to harness multi-dimensional single cell morphology as a quantitative biological readout.
  • the systems and methods disclosed herein have a variety of potential uses.
  • the combination of deep learning and computer vision morphometrics may allow cell characterization and sorting based on multi-dimensional morphometric and deep learning derived features, which can be used to identify and enrich cancer cells in heterogeneous populations.
  • Quantitative multi-dimensional morphology information at single cell level may provide additional information to resolve cancer heterogeneity.
  • the system and method may extract pigmentation features from the morphology profiling and, based on those features, assess melanoma cells.
  • Cell populations characterized by specific morphological profiles can have distinct molecular profiles, and morphologically distinct cells (e.g., normal vs. tumor) can be distinguished from each other.
  • the system and method may provide a relatively fast workflow for cell morphology analysis. For example, it may only take a few hours from preparing cell samples to generating publishable figures representing cell morphology.
  • the systems and methods can be used in a variety of applications including but not limited to cancer research, developmental biology, cell and gene therapies, and drug and functional screening.
  • the systems and methods as described herein can be used in drug and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) perturbation screening, using cell morphology as a novel biomarker for the screening.
  • the systems and methods as described herein can be used in sample-level profiling including but not limited to heterogeneous sample evaluation and characterization, disease detection and enrichment, and sample clean-up.
  • the systems and methods as described herein can be used in cell-level phenotyping, including cell health status, cell state characterization, and multi-omic integration.
  • the present disclosure may provide a human foundation model (“HFM”) for cell morphology analysis (e.g., profiling).
  • the model may combine a deep learning model and a computer vision model and extract cell features from cell images.
  • the deep learning model may process cell images as input and provide quantitative descriptions of cell features.
  • the deep learning model may extract deep learning features that are information-rich metrics of cell morphology with powerful discriminative capabilities.
  • the deep learning features may not be human-interpretable.
  • the computer vision model may process cell images as input and provide morphometric features that are human-interpretable, quantitative metrics of cell morphology including cell size, shape, texture, and intensity.
  • the morphometrics can be computationally generated using discrete computer vision algorithms.
  • the deep learning model may overcome the limitation of the computer vision model by imputing the most computationally intensive morphometrics into the human foundation model.
  • the human foundation model as described herein may provide both accuracy and interpretability in real-time feature extraction, cell classification and sorting.
  • the human foundation model may also have strong generalization capabilities that enable hypothesis-free sample exploration and efficient generation of application-specific models.
  • FIG. 1 illustrates an example workflow of extracting features associated with cell morphology from cell images using the human foundation model, in accordance with some examples of the present disclosure.
  • the human foundation model may process cell images 110 and generate features therefrom.
  • cells that are under analysis can be unstained, and the cell images 110 can be brightfield cell images.
  • the human foundation model may comprise a deep learning model 120 and a computer vision model 130.
  • the deep learning model 120 may comprise a deep learning encoder, for example, a convolutional neural network.
  • the deep learning model 120 may process cell images 110 as input and extract artificial intelligence (AI) features 140 therefrom.
  • AI features 140 may comprise deep learning features 160, e.g., features that are extracted using a deep learning algorithm, such as a convolutional neural network, with other nonlimiting examples being provided elsewhere herein.
  • the dimensions of the deep learning features can be in a range of between about 1 and about 10, between about 1 and about 20, between about 1 and about 50, between about 1 and about 80, between about 1 and about 100, between about 1 and about 200, between about 1 and about 500, between about 1 and about 800, between about 1 and about 1,000, between about 1 and about 2,000, between about 1 and about 5,000, between about 1 and about 8,000, between about 1 and about 10,000, between about 1 and about 20,000, between about 1 and about 50,000, between about 1 and about 80,000, or between about 1 and about 100,000, or any value between any of the aforementioned numbers.
  • the number may be at least about 1 feature - e.g., at least about 5, at least about 10, at least about 50, at least about 100, at least about 500, at least about 1,000, at least about 5,000, at least about 10,000, at least about 50,000, at least about 100,000, or more, features.
  • the number may be up to about 100,000 features - e.g., up to about 50,000 features, up to about 10,000, up to about 5,000, up to about 1,000, up to about 500, up to about 100, up to about 50, up to about 10, up to about 5, up to about 1, or fewer, features.
  • Other suitable numbers are also possible.
  • the deep learning model 120 may extract between about 5 and about 1000 deep learning features, e.g., between about 10 and about 500 deep learning features, e.g., between about 50 and about 100 deep learning features, from each cell image.
  • in a data set comprising a plurality of deep learning features of the cell(s), each feature can be referred to as a dimension (e.g., a deep learning dimension). Any range of dimensions of the deep learning features can be contemplated, for example from 1 through any number greater than about 100,000.
  • the deep learning model 120 generates about 64-dimensional deep learning features 160.
  • the computer vision model 130 may comprise a computer vision encoder including human-constructed algorithms, which in some cases can be referred to as “rule-based morphometrics.”
  • the computer vision model 130 may process cell images 110 as input and extract cell features 150 therefrom.
  • the cell features 150 may comprise cell position, cell shape, pixel intensity, texture, focus, or combinations thereof.
  • the cell features 150 may comprise morphometric features 170. Nonlimiting examples of morphometric features 170 are provided below in Table 1.
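A few rule-based measurements of the kind listed above (position, shape, pixel intensity) can be sketched with NumPy. This is a minimal illustration under stated assumptions: the function name `simple_morphometrics`, the chosen measurements, and the synthetic disk image are hypothetical, and a binary segmentation mask of the cell is assumed to be available. It is not the set of morphometric features of Table 1.

```python
import numpy as np


def simple_morphometrics(image: np.ndarray, mask: np.ndarray) -> dict:
    """Compute a few illustrative measurements from an image and a binary cell mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no foreground pixels")
    area = int(ys.size)                       # size: number of foreground pixels
    height = int(ys.max() - ys.min() + 1)     # bounding-box extent along rows
    width = int(xs.max() - xs.min() + 1)      # bounding-box extent along columns
    pixels = image[mask.astype(bool)]         # intensities inside the cell
    return {
        "area_px": area,
        "centroid_yx": (float(ys.mean()), float(xs.mean())),            # position
        "equivalent_diameter_px": float(np.sqrt(4.0 * area / np.pi)),   # size
        "bbox_aspect_ratio": width / height,                            # shape
        "mean_intensity": float(pixels.mean()),                         # intensity
        "intensity_std": float(pixels.std()),                           # crude texture proxy
    }


# Synthetic example: a bright disk of radius 10 centered in a 64x64 image.
yy, xx = np.mgrid[0:64, 0:64]
mask = (yy - 32) ** 2 + (xx - 32) ** 2 <= 10 ** 2
image = np.where(mask, 0.8, 0.2)
features = simple_morphometrics(image, mask)
```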
  • the dimensions of the morphometric features 170 can be in a range of between about 1 and about 10, between about 1 and about 20, between about 1 and about 50, between about 1 and about 80, between about 1 and about 100, between about 1 and about 200, between about 1 and about 500, between about 1 and about 800, between about 1 and about 1,000, between about 1 and about 2,000, between about 1 and about 5,000, between about 1 and about 8,000, between about 1 and about 10,000, between about 1 and about 20,000, between about 1 and about 50,000, between about 1 and about 80,000, or between about 1 and about 100,000, or any value between any of the aforementioned numbers.
  • the cell features 150 may include any suitable number of morphometric features 170, for example, at least about 1 feature, at least about 5 features, at least about 10 features, at least about 50 features, at least about 100 features, at least about 500 features, at least about 1,000 features, at least about 5,000 features, at least about 10,000 features, at least about 50,000 features, or at least about 100,000 features.
  • in a data set comprising a plurality of computer vision features of the cell(s), each feature can be referred to as a dimension (e.g., computer vision-based dimension). Any range of dimensions of the morphometric features can be contemplated, for example from 1 through any number greater than 100,000.
  • the computer vision model 130 may extract between about 5 and about 1000 morphometric features, e.g., between about 10 and about 500 morphometric features, e.g., between about 50 and about 100 features, and any values in between any of the aforementioned ranges, from each cell image. As illustrated in FIG. 1, in one nonlimiting example, the computer vision model generates about 51-dimensional morphometric features 170.
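One way the two feature sets of the nonlimiting FIG. 1 example (about 64 deep learning dimensions and about 51 morphometric dimensions) could be combined into a single per-cell feature vector is sketched below. Plain concatenation is an assumption made for illustration; the disclosure does not state how the vector is assembled, and `build_feature_vector` is a hypothetical name. Random numbers stand in for real encoder outputs.

```python
import numpy as np

DEEP_DIM = 64     # deep learning features 160 in the FIG. 1 example
MORPHO_DIM = 51   # morphometric features 170 in the FIG. 1 example


def build_feature_vector(deep_features, morphometric_features) -> np.ndarray:
    """Concatenate both feature sets into one vector (assumed combination rule)."""
    deep = np.asarray(deep_features, dtype=np.float32)
    morpho = np.asarray(morphometric_features, dtype=np.float32)
    if deep.shape != (DEEP_DIM,) or morpho.shape != (MORPHO_DIM,):
        raise ValueError("unexpected feature dimensions")
    return np.concatenate([deep, morpho])


# Placeholder values; real inputs would come from the two encoders.
rng = np.random.default_rng(0)
vector = build_feature_vector(rng.normal(size=DEEP_DIM), rng.normal(size=MORPHO_DIM))
```

Under this assumption each cell image is represented by a 115-dimensional vector.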
  • the human foundation model may generate one or more morphology maps based on deep learning features 160, morphometric features 170, or combinations thereof and in some examples based on a plurality of deep learning features and a plurality of morphometric features (e.g., based on a multi-dimensional vector that represents morphology of a cell).
  • a cell morphology map can be a visual (e.g., graphical) representation of one or more clusters of datapoints.
  • the cell morphology map can be a 1-dimensional (1D) representation (e.g., based on one morphological property as one parameter or dimension) or a multi-dimensional representation, such as a 2-dimensional (2D) representation (e.g., based on two morphological properties as two parameters or dimensions), a 3-dimensional (3D) representation (e.g., based on three morphological properties as three parameters or dimensions), a 4-dimensional (4D) representation, etc.
  • one morphological property of a plurality of morphological properties used for plotting the cell morphology map can be represented as a non-axial parameter (e.g., non-x, y, or z axis), such as distinguishable colors (e.g., heatmap), numbers, letters (e.g., texts of one or more languages), and/or symbols (e.g., a square, oval, triangle, etc.).
  • a heatmap can be used as colorimetric scale to represent the classifier prediction percentages for each cell against a cell class, cell type, or cell state.
  • the cell morphology map can be generated based on one or more morphological features (e.g., characteristics, profiles, fingerprints, etc.) from the processed image data.
  • Non-limiting examples of one or more morphological properties of a cell, as disclosed herein, that can be extracted from one or more images of the cell may include, but are not limited to (a) shape, curvature, size (e.g., diameter, length, width, circumference), area, volume, texture, thickness, roundness, etc. of the cell or one or more components of the cell (e.g., cell membrane, nucleus, mitochondria, etc.), (b) number or positioning of one or more contents (e.g., nucleus, mitochondria, etc.), and (c) optical characteristics of a region of the image(s) (e.g., unique groups of pixels within the image(s)) that correspond to the cell or a portion thereof (e.g., light emission, transmission, reflectance, absorbance, fluorescence, luminescence, etc.).
  • One or more dimensions of the cell morphology map can be represented by various approaches (e.g., dimensionality reduction approaches), such as, for example, principal component analysis (PCA), multidimensional scaling (MDS), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP).
  • UMAP can be a machine learning technique for dimension reduction.
  • UMAP can be constructed from a theoretical framework based in Riemannian geometry and algebraic topology.
  • UMAP can be utilized for a practical scalable algorithm that applies to real world data, such as morphological properties of one or more cells.
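As a minimal sketch of how a 2D morphology map could be obtained from per-cell feature vectors, the example below applies PCA, one of the approaches named above, using only NumPy. PCA is used here in place of UMAP solely to keep the example self-contained; the function name `pca_project`, the 115-dimensional input size, and the random data are assumptions for illustration.

```python
import numpy as np


def pca_project(features: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project row-wise feature vectors onto their top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Placeholder data: 200 cells, each with a 115-dimensional feature vector.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 115))
coords = pca_project(features)   # one (x, y) point per cell for plotting
```

Each row of `coords` can then be plotted as one datapoint of the map, with a further property shown as color if desired.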
  • the deep learning model of the human foundation model can be trained using a plurality of cell images from different types of biological samples and thus, be able to detect differences in cell morphology without labeled training data.
  • the deep learning model 120 of the human foundation model may be trained using any suitable number of images of cells, for example between about 1 and about 200, between about 1 and about 500, between about 1 and about 800, between about 1 and about 1,000, between about 1 and about 2,000, between about 1 and about 5,000, between about 1 and about 8,000, between about 1 and about 10,000, between about 1 and about 20,000, between about 1 and about 50,000, between about 1 and about 80,000, between about 1 and about 100,000, between about 1 and about 200,000, between about 1 and about 500,000, between about 1 and about 800,000, between about 1 and about 1,000,000, between about 1 and about 2,000,000, between about 1 and about 5,000,000, between about 1 and about 8,000,000, or between about 1 and about 10,000,000 images of cells.
  • the deep learning model 120 of the human foundation model is trained using a training dataset that includes at least about 10,000 images of cells - e.g., at least about 100,000 images of cells, at least about 1,000,000 images of cells, at least about 5,000,000 images of cells, at least about 10,000,000 images of cells, at least about 100,000,000 images of cells, at least about 1 billion, or more, images of cells.
  • the deep learning model 120 can be trained using between about 5,000,000 and about 1 billion images of cells.
  • the training set may include, or may consist essentially of (and in some examples may consist of), images of cells that are not physically stained and that are not computationally labeled in any manner.
  • the deep learning model 120 learns to recognize features from the cell images in a self-supervised manner.
  • the human foundation model may comprise parameters in a range of between about 1 and about 1,000, between about 1 and about 2,000, between about 1 and about 5,000, between about 1 and about 8,000, between about 1 and about 10,000, between about 1 and about 20,000, between about 1 and about 50,000, between about 1 and about 80,000, between about 1 and about 100,000, between about 1 and about 200,000, between about 1 and about 500,000, between about 1 and about 800,000, between about 1 and about 1,000,000, between about 1 and about 2,000,000, between about 1 and about 5,000,000, between about 1 and about 8,000,000, between about 1 and about 10,000,000, between about 1 and about 20,000,000, between about 1 and about 50,000,000, between about 1 and about 80,000,000, between about 1 and about 100,000,000, or between about 1 and about 500,000,000.
  • a neural network may include millions or billions of floating-point numbers connected by mathematical operations. These numbers in some instances can be called “parameters” or “weights”.
  • parameters are adjusted ("trained") to transform an image of a cell into a vector (for example classification probabilities, or feature vector, depending on the use -case of the neural network).
  • a neural network for computer vision applications such as provided herein may have a number of parameters ranging from 1 million to upwards of 10 billion.
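To make the notion of “parameters” concrete, the sketch below counts the trainable weights and biases of standard 2D convolution and fully connected layers. The three-layer stack is a hypothetical toy encoder chosen only for the arithmetic; it is not the architecture of the disclosed model.

```python
def conv2d_param_count(in_channels: int, out_channels: int,
                       kernel_size: int, bias: bool = True) -> int:
    """Parameters of a square-kernel 2D convolution: one kernel per (in, out) pair, plus biases."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def dense_param_count(in_features: int, out_features: int, bias: bool = True) -> int:
    """Parameters of a fully connected layer: a weight matrix, plus biases."""
    return in_features * out_features + (out_features if bias else 0)


# Hypothetical toy encoder: grayscale input, two 3x3 convolutions, 64-dimensional output.
layer_counts = [
    conv2d_param_count(1, 32, 3),    # 3*3*1*32 + 32  = 320
    conv2d_param_count(32, 64, 3),   # 3*3*32*64 + 64 = 18,496
    dense_param_count(64, 64),       # 64*64 + 64     = 4,160
]
total_parameters = sum(layer_counts)  # 22,976
```

Production-scale vision networks reach the millions-to-billions range mentioned above by stacking many wider layers of these kinds.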
  • the deep learning model (e.g., backbone model) of the human foundation model, which extracts image features, can be based on a convolutional neural network architecture, a vision transformer architecture, or both.
  • the training process may apply a self-supervised learning approach that learns image features without labels and generates deep learning embeddings (vectors) that are orthogonal to each other and orthogonal to morphometric features.
  • embeddings that are “orthogonal” can be perpendicular to another embedding vector or set of embedding vectors. For example, vectors are considered to be orthogonal to each other if they are at right angles in n-dimensional space, where n is the size or number of elements in each vector.
  • “orthogonal” embeddings can have a covariance of about 0 and can be perfectly or completely orthogonal (e.g., have exactly a covariance of 0) or be substantially orthogonal with a covariance that is greater than but close to 0.
  • “orthogonal” embeddings include features that are “independent” of one another, meaning that the presence or absence of one feature does not affect the presence or absence of any of the other features. For example, a vector is orthogonal to another vector if their dot product is zero.
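The two orthogonality criteria described above (a dot product of zero, and a covariance of about 0) can be checked numerically as sketched below. The helper name `are_orthogonal`, the tolerance, and the example values are assumptions for illustration only.

```python
import numpy as np


def are_orthogonal(u: np.ndarray, v: np.ndarray, tol: float = 1e-9) -> bool:
    """Two vectors are treated as orthogonal if their dot product is zero within tol."""
    return abs(float(np.dot(u, v))) <= tol


u = np.array([1.0, 0.0, 2.0])
v = np.array([0.0, 3.0, 0.0])
w = np.array([1.0, 1.0, 0.0])

uv_orthogonal = are_orthogonal(u, v)   # dot product is 0
uw_orthogonal = are_orthogonal(u, w)   # dot product is 1

# Covariance view: values of two features measured across four cells.
feature_a = np.array([1.0, 2.0, 3.0, 4.0])
feature_b = np.array([1.0, -1.0, -1.0, 1.0])
covariance_ab = float(np.cov(feature_a, feature_b)[0, 1])   # 0 for these values
```

A covariance that is small but nonzero would correspond to the “substantially orthogonal” case.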
  • analysis of imaging data as disclosed herein can be performed using artificial intelligence, such as one or more machine learning algorithms.
  • the machine learning model (e.g., a metamodel) can be trained by using a learning model and applying learning algorithms (e.g., machine learning algorithms) on a training dataset (e.g., a dataset comprising unlabeled cell images).
  • a training algorithm may build a machine learning model capable of assigning features within images of cells into one category or the other, e.g., to make the model a non-probabilistic machine learning model.
  • the machine learning model can be used to create a new category to assign new examples/cases into the new category.
  • a machine learning model can be the actual trained model that is generated based on the training model.
  • the machine learning algorithm as disclosed herein can be configured to extract one or more morphological features of a cell from the image data of the cell.
  • the machine learning algorithm may form a new data set based on the extracted morphological features, and the new data set need not contain the original image data of the cell.
  • replicas of the original images in the image data can be stored in a database disclosed herein, e.g., prior to using any of the new images for training, e.g., to keep the integrity of the images of the image data.
  • processed images of the original images in the image data can be stored in a database disclosed herein during or subsequent to the classifier training.
  • any of the newly extracted morphological features as disclosed herein can be utilized as new molecular markers for a cell or population of cells of interest to the user.
  • a cell analysis platform as disclosed herein can be operatively coupled to one or more databases comprising non-morphological data of cells processed (e.g., genomics data, transcriptomics data, proteomics data, metabolomics data), a selected population of cells exhibiting the newly extracted morphological feature(s) can be further analyzed by their non-morphological properties to identify proteins or genes of interest that are common in the selected population of cells but not in other cells, thereby determining such proteins or genes of interest to be new molecular markers that can be used to identify such selected population of cells.
  • Non-limiting examples of machine learning algorithms for training a machine learning model may include supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, self-learning (also referred to as self-supervised learning), feature learning, anomaly detection, association rules, etc.
  • a machine learning model can be trained by using one or more learning models on such training dataset.
  • Non-limiting examples of learning models may include artificial neural networks (e.g., convolutional neural networks, U-net architecture neural network, etc.), backpropagation, boosting, decision trees, support vector machines, regression analysis, Bayesian networks, genetic algorithms, kernel estimators, conditional random field, random forest, ensembles of machine learning models, minimum complexity machines (MCM), probably approximately correct learning (PACT), etc.
  • the neural networks are designed by the modification of neural networks such as AlexNet, VGGNet, GoogLeNet, ResNet (residual networks), DenseNet, and Inception networks.
  • the enhanced neural networks are designed by modification of ResNet (e.g. ResNet 18, ResNet 34, ResNet 50, ResNet 101, and ResNet 152) or inception networks.
  • the modification comprises a series of network surgery operations that are carried out mainly to improve performance, including inference time and/or inference accuracy.
  • the neural network can be used together with a Vision Transformer.
  • Vision Transformers and their use in encoding images are described in Dosovitskiy et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” International Conference on Learning Representations (ICLR) (2021) (21 pages available at arxiv.org/abs/2010.11929), the entire contents of which are incorporated by reference herein.
  • the machine learning algorithm as disclosed herein may utilize one or more clustering algorithms to determine that objects (e.g., features) in the same cluster can be more similar (in one or more morphological features) to each other than those in other clusters.
  • clustering algorithms may include, but are not limited to, connectivity models (e.g., hierarchical clustering), centroid models (e.g.
  • the machine learning algorithm may utilize a plurality of models, e.g., in equal weights or in different weights.
  • the graph-based models may include graph-based clustering algorithms that use modularity, e.g., such as described in the following references, the entire contents of each of which are incorporated by reference herein: Blondel et al., “Fast unfolding of communities in large networks,” Journal of Statistical Mechanics: Theory and Experiment 2008: P10008 (2008); and Traag et al., “From Louvain to Leiden: guaranteeing well-connected communities,” Scientific Reports 9: 5233, 12 pages, (2019).
  • unsupervised and self-supervised approaches can be used to expedite labeling of image data of cells (extract features from cells).
  • an embedding for a cell image can be generated.
  • the embedding can be a representation of the image in a space with fewer dimensions than the original image data.
  • Such embeddings can be used to cluster images that are similar to one another.
  • the labeler can be configured to batch-label the cells and increase the throughput as compared to manually labeling one or more cells.
  • additional meta information e.g., additional non-morphological information
  • the sample e.g., what disease is known or associated with the patient who provided the sample
  • embedding generation may use a neural net trained on predefined cell types.
  • an intermediate layer of the neural net that is trained on predetermined image data (e.g., image data of known cell types and/or states) can be used.
  • embedding generation may use neural nets trained for different tasks.
  • an intermediate layer of the neural net that is trained for a different task e.g., a neural net that is trained on a canonical dataset such as ImageNet.
  • this can allow the system to focus on features that matter for image classification (e.g., edges and curves) while removing a bias that may otherwise be introduced in labeling the image data.
  • autoencoders can be used for embedding generation.
  • autoencoders can be used, in which the input and the output can be substantially the same image and the squeeze layer can be used to extract the embeddings.
  • the squeeze layer may force the model to learn a smaller representation of the image, which smaller representation may have sufficient information to recreate the image (e.g., as the output).
  • an expanding training data set can be used for clustering-based labeling of image data or cells.
  • one or more revisions of labeling e.g., manual relabeling
  • manual relabeling can be intractable on a large scale and ineffective when done on a random subset of the data.
  • similar embedding-based clustering can be used to identify labeled images that may cluster with members of other classes. Such examples are likely to be enriched for incorrect or ambiguous labels, which can be removed (e.g., automatically or manually).
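As an illustration of how embedding-based review of labels might work, the following is a minimal sketch under stated assumptions, not the disclosed implementation. It swaps in a simple k-nearest-neighbor vote for the clustering step: an image is flagged when its label disagrees with the majority label of its nearest neighbors in embedding space. The function name `flag_suspect_labels`, the parameter `k`, and the toy two-dimensional embeddings are hypothetical.

```python
import math
from collections import Counter


def flag_suspect_labels(embeddings, labels, k=3):
    """Return indices whose label differs from the majority label of
    their k nearest neighbours in embedding space (hypothetical helper)."""
    suspects = []
    for i, (vec, label) in enumerate(zip(embeddings, labels)):
        # Euclidean distance from sample i to every other sample.
        dists = sorted(
            (math.dist(vec, other), j)
            for j, other in enumerate(embeddings)
            if j != i
        )
        neighbour_labels = [labels[j] for _, j in dists[:k]]
        majority, _ = Counter(neighbour_labels).most_common(1)[0]
        if majority != label:
            suspects.append(i)
    return suspects


# Toy data: two well-separated groups; sample 3 sits in the "A" group
# but carries a "B" label, so it is the one expected to be flagged.
embeddings = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (0.1, 0.2),
              (5.0, 5.1), (5.1, 5.0), (5.2, 5.1), (5.1, 5.2)]
labels = ["A", "A", "A", "B", "B", "B", "B", "B"]
suspects = flag_suspect_labels(embeddings, labels)
print(suspects)
```

Flagged indices could then be removed automatically or routed to a human reviewer, which confines manual relabeling to the examples most likely to be wrong.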
  • adaptive image augmentation can be used.
  • (1) one or more images with artifacts can be identified, and (2) such images identified with artifacts can be added to the training pipeline (e.g., for training the model).
  • Identifying the image(s) with artifacts may comprise: (1a) while imaging cells, cropping one or more additional sections of the image frame, which section(s) are expected to contain just the background without any cell; (1b) checking the background image for any change in one or more characteristics (e.g., optical characteristics, such as brightness); and (1c) flagging/labeling one or more images that have such change in the characteristic(s).
  • Adding the identified images to the training pipeline may comprise: (2a) adding the one or more images that have been flagged/labeled as augmentation by first calculating an average feature of the changed characteristic(s) (e.g., the background median color); (2b) creating a delta image by subtracting the average feature from the image data (e.g., subtracting the median from each pixel of the image); and (2c) adding the delta image to the training pipeline.
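The adaptive augmentation steps above can be sketched as follows. This is a minimal illustration that assumes grayscale images stored as nested lists; the function names, the `reference_brightness` value, and the `tolerance` threshold are hypothetical and not taken from the disclosure.

```python
from statistics import median


def background_changed(background_crop, reference_brightness, tolerance=5.0):
    """Flag a frame whose cell-free background crop has drifted from the
    expected brightness by more than `tolerance` (assumed threshold)."""
    pixels = [p for row in background_crop for p in row]
    return abs(median(pixels) - reference_brightness) > tolerance


def delta_image(image, background_crop):
    """Subtract the background median from every pixel of the image."""
    bg = median(p for row in background_crop for p in row)
    return [[p - bg for p in row] for row in image]


# Toy frame: the background crop is brighter than the assumed reference of 100.
background_crop = [[120, 122], [121, 123]]   # median is 121.5
cell_image = [[121.5, 130.0], [100.0, 121.5]]

training_pipeline = []
if background_changed(background_crop, reference_brightness=100.0):
    # The delta image is what gets added to the training pipeline.
    training_pipeline.append(delta_image(cell_image, background_crop))
print(training_pipeline)
```

Subtracting the background median leaves only the deviation from the drifted background, so the artifact itself, rather than the whole frame, becomes the augmentation signal.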
  • the model(s) can be validated (e.g., for the ability to demonstrate accurate cell classification performance).
  • validation metrics may include, but are not limited to, threshold metrics (e.g., accuracy, F-measure, Kappa, Macro-Average Accuracy, Mean-Class-Weighted Accuracy, Optimized Precision, Adjusted Geometric Mean, Balanced Accuracy, etc.), the ranking methods and metrics (e.g., receiver operating characteristics (ROC) analysis or “ROC area under the curve (ROC AUC)”), and the probabilistic metrics (e.g., root-mean-squared error).
  • the model(s) can be determined to be balanced or accurate when the ROC AUC is greater than about 0.5, e.g., greater than about 0.55, greater than about 0.6, greater than about 0.65, greater than about 0.7, greater than about 0.75, greater than about 0.8, greater than about 0.85, greater than about 0.9, greater than about 0.91, greater than about 0.92, greater than about 0.93, greater than about 0.94, greater than about 0.95, greater than about 0.96, greater than about 0.97, greater than about 0.98, greater than about 0.99, or higher.
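For reference, the ROC AUC used in such a validation check can be computed directly from classifier scores. The sketch below uses the standard pairwise interpretation: the AUC is the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as one half. The function names and the example threshold of 0.85 are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def passes_validation(auc, threshold=0.85):
    """Hypothetical acceptance rule: the model passes if AUC exceeds the threshold."""
    return auc > threshold


labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)   # 8 of the 9 positive/negative pairs are ordered correctly
print(auc, passes_validation(auc))
```

The pairwise form is quadratic in the number of examples; it is shown here for clarity, and a rank-based computation would be preferable for large validation sets.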
  • the output of the machine learning encoder may include, may consist of, or may consist essentially of, at least one multidimensional vector (which may also be referred to herein as an embedding). Elements of the vector(s) for a given image may correspond to the values of respective features that the machine learning encoder extracted from that image.
  • Table 1 below describes example machine learning dimensions (for example, deep learning dimensions), which correspond to different features that the machine learning encoder extracts from images.
  • the machine learning encoder extracts n ML-based features from each image (where n is a positive integer), and outputs an array of length n, which array can be considered to be an n-dimensional vector.
  • the output of the deep learning encoder may have the format: [V1, V2, ..., Vn], where the subscripts 1...n correspond to the respective deep learning dimension numbers and the letter V represents the value of the feature in that image that the deep learning encoder calculated.
  • n can be in any suitable range, e.g., can be between about 5 and about 1000 - e.g., between about 10 and about 500, between about 50 and about 100, or range in between any of the aforementioned values. In the nonlimiting example shown in Table 1, n is equal to 64.
  • the ML-based features are not human-interpretable.
  • the elements of the vector generated by the machine learning encoder may have numeric values, such as [0.1 4 2.3 ... 10], that correspond to the quantitative “amount” of certain features that the machine learning encoder has identified as being present or not in a given image.
  • analysis of imaging data as disclosed herein can be performed using artificial intelligence, such as one or more machine learning algorithms.
  • one or more machine learning models can be used to automatically sort or categorize particles (e.g., cells) in the imaging data into one or more classes (e.g., one or more physical characteristics or morphological features, as used interchangeably herein).
  • cell imaging data can be analyzed using the machine learning algorithm(s) to classify (e.g., sort) a cell (e.g., a single cell) in a cell image or video.
  • cell imaging data can be analyzed using the machine learning algorithm(s) to determine a focus score of a cell (e.g., a single cell) in a cell image or video.
  • cell imaging data can be analyzed using the machine learning algorithm(s) to determine a relative distance between (i) a first plane of cells exhibiting first similar physical characteristic(s) and (ii) a second plane of cells exhibiting second similar physical characteristic(s), which first and second planes denote fluid streams flowing substantially parallel to each other in a channel.
  • one or more cell morphology maps as disclosed herein can be used to train one or more machine learning models (e.g., at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, or more machine learning models) as disclosed herein.
  • Each machine learning model can be trained to analyze one or more images of a cell (e.g., to extract one or more morphological features of the cell) and categorize (or classify) the cell into one or more determined classes or categories of a cell (e.g., based on a type or state of the cell).
  • the machine learning model can be trained to create a new category to categorize (or classify) the cell into the new category, e.g., when determining that the cell is morphologically distinct from any pre-existing categories of other cells.
  • the entire process of cell focusing as disclosed herein e.g., partitioning of cells into one or more planar currents flowing through the channel
  • de novo AI-mediated analysis of each cell e.g., using analysis of one or more images of each cell using machine learning algorithm. This can be a complete AI or a full AI approach for cell sorting and analysis.
  • a hybrid approach can be utilized, wherein AI-mediated analysis may analyze cells and one or more heterologous markers that are co-partitioned with the cells (e.g., into the same planar current flowing through the channel), confirm or determine the co-partitioning, after which a more conventional approach (e.g., imaging to detect presence of the heterologous markers, such as fluorescent imaging) can be utilized to sort a subsequent population of cells and the heterologous markers that are co-partitioned into the same planar current.
  • the machine learning model (e.g., a metamodel) can be trained by using a learning model and applying learning algorithms (e.g., machine learning algorithms) on a training dataset (e.g., a dataset comprising examples of specific classes).
  • a training algorithm may build a machine learning model capable of assigning new examples/cases (e.g., new datapoints of a cell or a group of cells) into one category or the other, e.g., to make the model a non-probabilistic machine learning model.
  • the machine learning model can be capable of creating a new category to assign new examples/cases into the new category.
  • a machine learning model can be the actual trained model that is generated based on the training model.
  • the machine learning algorithm as disclosed herein can be configured to extract one or more morphological features of a cell from the image data of the cell.
  • the machine learning algorithm may form a new data set based on the extracted morphological features, and the new data set need not contain the original image data of the cell.
  • replicas of the original images in the image data can be stored in a database disclosed herein, e.g., prior to using any of the new images for training, e.g., to keep the integrity of the images of the image data.
  • processed images of the original images in the image data can be stored in a database disclosed herein during or subsequent to the classifier training.
  • any of the newly extracted morphological features as disclosed herein can be utilized as new molecular markers for a cell or population of cells of interest to the user.
  • cell analysis platform as disclosed herein can be operatively coupled to one or more databases comprising non-morphological data of cells processed (e.g., genomics data, transcriptomics data, proteomics data, metabolomics data), a selected population of cells exhibiting the newly extracted morphological feature(s) can be further analyzed by their non-morphological properties to identify proteins or genes of interest that are common in the selected population of cells but not in other cells, thereby determining such proteins or genes of interest to be new molecular markers that can be used to identify such selected population of cells.
  • a machine learning model can be trained by applying machine learning algorithms on at least a portion of one or more cell morphology maps as disclosed herein as a training dataset.
  • machine learning algorithms for training a machine learning model may include supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, self-learning, feature learning, anomaly detection, association rules, etc.
  • a machine learning model can be trained by using one or more learning models on such training dataset.
  • Non-limiting examples of learning models may include artificial neural networks (e.g., convolutional neural networks, U-net architecture neural network, etc.), backpropagation, boosting, decision trees, support vector machines, regression analysis, Bayesian networks, genetic algorithms, kernel estimators, conditional random field, random forest, ensembles of machine learning models, minimum complexity machines (MCM), probably approximately correct learning (PACT), etc.
  • the neural networks are designed by the modification of neural networks such as AlexNet, VGGNet, GoogLeNet, ResNet (residual networks), DenseNet, and Inception networks.
  • the enhanced neural networks are designed by modification of ResNet (e.g. ResNet 18, ResNet 34, ResNet 50, ResNet 101, and ResNet 152) or inception networks.
  • the modification comprises a series of network surgery operations that are carried out mainly to improve performance, including inference time and/or inference accuracy.
  • the machine learning algorithm as disclosed herein may utilize one or more clustering algorithms to determine that objects (e.g., cells) in the same cluster can be more similar (in one or more morphological features) to each other than those in other clusters.
  • clustering algorithms may include, but are not limited to, connectivity models (e.g., hierarchical clustering), centroid models (e.g., K-means algorithm), distribution models (e.g., expectation-maximization algorithm), density models (e.g., density-based spatial clustering of applications with noise (DBSCAN), ordering points to identify the clustering structure (OPTICS)), subspace models (e.g., biclustering), group models, graph-based models (e.g., highly connected subgraphs (HCS) clustering algorithms), single graph models, and neural models (e.g., using unsupervised neural network).
  • the machine learning algorithm may utilize a plurality of models, e.g., in equal weights or in different weights.
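As one concrete instance of the centroid models named above, the following is a minimal K-means sketch operating on embedding vectors. It is illustrative only: the initialization (using the first k points as starting centroids) is a simplifying assumption made for reproducibility, and the function name and toy data are hypothetical.

```python
import math


def k_means(points, k, iterations=20):
    """Basic K-means: alternate between assigning each point to its nearest
    centroid and moving each centroid to the mean of its members."""
    # Naive deterministic initialisation (assumption): first k points.
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignment, centroids


# Toy embeddings forming two morphologically distinct groups.
points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
assignment, centroids = k_means(points, k=2)
print(assignment)
```

In practice a randomized or k-means++ style initialization with several restarts is usually preferred, since the result of K-means depends on the starting centroids.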
  • unsupervised and self-supervised approaches can be used to expedite labeling of image data of cells.
  • an embedding for a cell image can be generated.
  • the embedding can be a representation of the image in a space with fewer dimensions than the original image data.
  • Such embeddings can be used to cluster images that are similar to one another.
  • the labeler can be configured to batch-label the cells and increase the throughput as compared to manually labeling one or more cells.
  • additional meta information e.g., additional non-morphological information
  • the sample e.g., what disease is known or associated with the patient who provided the sample
  • embedding generation may use a neural net trained on predefined cell types.
  • an intermediate layer of the neural net that is trained on predetermined image data (e.g., image data of known cell types and/or states) can be used. By providing enough diversity in image data/sample data to the trained model, this method may provide an accurate way to cluster future cells.
  • embedding generation may use neural nets trained for different tasks.
  • an intermediate layer of the neural net that is trained for a different task e.g., a neural net that is trained on a canonical dataset such as ImageNet.
  • this may allow the system to focus on features that matter for image classification (e.g., edges and curves) while removing a bias that may otherwise be introduced in labeling the image data.
  • autoencoders can be used for embedding generation.
  • autoencoders can be used, in which the input and the output can be substantially the same image and the squeeze layer can be used to extract the embeddings.
  • the squeeze layer may force the model to learn a smaller representation of the image, which smaller representation may have sufficient information to recreate the image (e.g., as the output).
  • an expanding training data set can be used for clustering-based labeling of image data or cells.
  • one or more revisions of labeling e.g., manual relabeling
  • Such manual relabeling can be intractable on a large scale and ineffective when done on a random subset of the data.
  • similar embedding-based clustering can be used to identify labeled images that may cluster with members of other classes. Such examples are likely to be enriched for incorrect or ambiguous labels, which can be removed (e.g., automatically or manually).
  • adaptive image augmentation can be used.
  • (1) one or more images with artifacts can be identified, and (2) such images identified with artifacts can be added to the training pipeline (e.g., for training the model).
  • Identifying the image(s) with artifacts may comprise: (1a) while imaging cells, cropping one or more additional sections of the image frame, which section(s) are expected to contain just the background without any cell; (1b) checking the background image for any change in one or more characteristics (e.g., optical characteristics, such as brightness); and (1c) flagging/labeling one or more images that have such change in the characteristic(s).
  • Adding the identified images to the training pipeline may comprise: (2a) adding the one or more images that have been flagged/labeled as augmentation by first calculating an average feature of the changed characteristic(s) (e.g., the background median color); (2b) creating a delta image by subtracting the average feature from the image data (e.g., subtracting the median from each pixel of the image); and (2c) adding the delta image to the training pipeline.
  • because the ML-based features are identified using machine learning, the features may not be human-interpretable.
  • the elements of the vector generated using the machine learning encoder may have numeric values, such as [0.1 4 2.3 ...
  • the computer vision model of the human foundation model may include a set of rules to identify cell morphometric features within an image, and to encode those features into a multidimensional vector.
  • the rules can be human defined, and may correspond to features that can be understood by a human.
  • the output of the computer vision encoder may include, may consist of, or may consist essentially of, at least one multidimensional vector (which may also be referred to herein as an embedding). Elements of the vector(s) for a given image may correspond to the values of respective features that the computer vision encoder extracted from that image. Because the features are human defined, the features can be human-interpretable. Table 2 below describes example computer vision dimensions (morphometric features), which correspond to different features that the computer vision encoder may extract from images.
  • Example morphometric features generated using the human foundation model denotes metrics that may also be referred to as blobs or granules.
  • morphometric features can be categorized into different groups.
  • cell morphometric features can be selected from the group consisting of position features, cell shape features, pixel intensity features, texture features, and focus features.
  • position features can be selected from the group consisting of: centroid X axis and centroid Y axis, where Table 2 provides respective descriptions for such features.
  • cell shape features can be selected from the group consisting of: area, perimeter, maximum caliper distance, minimum caliper distance, maximum radius, minimum radius, long ellipse axis, short ellipse axis, ellipse elongation, ellipse similarity, roundness, circle similarity, and convex shape, where Table 2 provides respective example descriptions for such features.
  • pixel intensity features are selected from the group consisting of: mean pixel intensity, standard deviation of pixel intensity, pixel intensity 25th percentile, pixel intensity 75th percentile, positive fraction, and negative fraction, where Table 2 provides respective example descriptions for such features.
  • texture features can be selected from the group consisting of: small set of connected bright pixels, integral; small set of connected dark pixels, integral; large set of connected bright pixels, integral; large set of connected dark pixels, integral; image moments; local binary patterns - center; local binary patterns - periphery; image sharpness; image focus; ring width; and ring intensity.
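  • the computer vision encoder extracts m morphometric features from each image (where m is a positive integer), and outputs an array of length m, which array can be considered to be an m-dimensional vector.
  • the output of the computer vision encoder may have the format: [W1, W2, ..., Wm], where the subscripts 1...m correspond to the respective computer vision dimension numbers and the letter W represents the value of the feature in that image that the computer vision encoder calculated.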
  • m can be in any suitable range, e.g., can be between about 5 and about 1000, e.g., between about 10 and about 500, e.g., between about 50 and about 100. In the nonlimiting example shown in Table 2, m is equal to 51.
  • the morphometric features represent features that are visible by both human and computer vision
  • the features can be human-interpretable.
  • the elements of the vector generated using the computer vision encoder may have numeric values, such as [5 0.8 1.4 ...
  • the computer vision encoder can be implemented using any suitable combination of hardware and software.
  • the system component which is implementing the HFM may include a processor and a non-volatile computer-readable medium that includes instructions for causing the processor to respectively process cell images using a computer vision encoder.
  • the computer vision encoder can be configured to quantify the characteristics (e.g., to measure dimensions or intensities) of different features within respective cell images, and to output a vector the dimensions (elements) of which correspond to the measured values of those respective characteristics.
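To make the rule-based encoder concrete, the sketch below measures a handful of the named morphometric features (area, centroid X axis, centroid Y axis, mean pixel intensity, standard deviation of pixel intensity) from a grayscale image and a binary cell mask. It is a simplified illustration, not the disclosed encoder: the segmentation mask is assumed to be given, the nested-list image format and the function name are hypothetical, and the remaining features listed for Table 2 are omitted.

```python
from statistics import mean, pstdev


def morphometric_vector(image, mask):
    """Measure a few human-defined features over the masked cell region and
    return them as a dict of named values."""
    pixels = [
        (x, y, image[y][x])
        for y, row in enumerate(mask)
        for x, inside in enumerate(row)
        if inside
    ]
    if not pixels:
        raise ValueError("mask contains no cell pixels")
    xs, ys, values = zip(*pixels)
    return {
        "area": len(pixels),                 # number of pixels in the cell
        "centroid_x": mean(xs),              # position features
        "centroid_y": mean(ys),
        "mean_intensity": mean(values),      # pixel intensity features
        "std_intensity": pstdev(values),
    }


image = [[10, 10, 10],
         [10, 50, 70],
         [10, 60, 80]]
mask = [[0, 0, 0],
        [0, 1, 1],
        [0, 1, 1]]
features = morphometric_vector(image, mask)
vector = list(features.values())   # fixed element order gives the output vector
print(features)
```

Because each element is produced by an explicit, named rule, every dimension of the resulting vector can be read and checked by a person, unlike the ML-based dimensions.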
  • the set of ML-based features extracted using the machine learning encoder and the set of cell morphometric features extracted using the computer vision encoder can be used to respectively encode the set of ML-based features and the set of cell morphometric features into a plurality of multi-dimensional vectors that represent morphology of a cell in a cell image.
  • the multidimensional vectors may have n+m dimensions, where n and m are positive integers.
  • each dimension of the n+m dimensions can be an element of that multi-dimensional vector, e.g., a numeric value.
  • the ML-based features and the cell morphometric features can be concatenated to generate a multi-dimensional vector having the format: [V1, V2, ..., Vn, W1, W2, ..., Wm], where, similarly as above, the subscripts 1...n correspond to the respective deep learning dimension numbers, the letter V represents the value of the feature in that image that the deep learning encoder calculated, the subscripts 1...m correspond to the respective computer vision dimension numbers, and the letter W represents the value of the feature in that image that the computer vision encoder calculated.
  • the value of m can be the same as the value of n, in which example the plurality of multi-dimensional vectors extracted using the machine learning encoder and the computer vision encoder may include a same number of each of the ML-based features and the cell morphological features.
  • the value of m can be different than the value of n, in which example the plurality of multi-dimensional vectors extracted using the machine learning encoder and the computer vision encoder may include a different number of each of the ML-based features and the cell morphological features.
  • an array of length (n+m), or a vector of length (n+m) can be interpreted as being a plurality of multi-dimensional vectors, each such vector having one or more elements.
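The concatenation described above is straightforward to express in code. The sketch below uses the example sizes from Tables 1 and 2 (n = 64 ML-based features and m = 51 morphometric features); the function name and the placeholder values are hypothetical.

```python
def combine_features(ml_features, morphometric_features):
    """Concatenate the n ML-based values V and the m morphometric values W
    into a single vector of length n + m, keeping V first."""
    return list(ml_features) + list(morphometric_features)


n, m = 64, 51
ml_features = [0.0] * n             # placeholder V values (assumed data)
morphometric_features = [1.0] * m   # placeholder W values (assumed data)

combined = combine_features(ml_features, morphometric_features)
print(len(combined))
```

Keeping a fixed element order means that a given index always refers to the same feature across images, which is what allows the combined vectors to be compared or clustered.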
  • the ML-based features can be orthogonal to one another as explained more particularly in FIGs. 9-10.
  • the ML-based features may all be different than one another, and may all be uncorrelated to one another.
  • ML-based feature V1 can be different than, and uncorrelated to, each of ML-based features V2...Vn.
  • the ML-based features can be orthogonal to the cell morphological features.
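One way to check that feature dimensions are uncorrelated is to compute the Pearson correlation between every pair of dimensions across a set of embeddings. The sketch below is an assumption about how such a check could be done and is not taken from the disclosure; the function names and the toy embeddings are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def max_abs_correlation(vectors):
    """Largest absolute correlation between any two feature dimensions,
    where each row of `vectors` is the embedding of one image."""
    columns = list(zip(*vectors))
    return max(
        abs(pearson(columns[i], columns[j]))
        for i in range(len(columns))
        for j in range(i + 1, len(columns))
    )


# Toy embeddings whose two dimensions vary independently of each other.
embeddings = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
print(max_abs_correlation(embeddings))
```

A value near zero indicates that no pair of dimensions carries redundant linear information; the same check can be run between the ML-based columns and the morphometric columns of the combined vector.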
  • Cell morphology can be highly indicative of a cell’s phenotype and function, but it is also highly dynamic and complex. Traditional analysis of cell morphology by human eyes has significant limitations. Other methods of assessing and characterizing cell morphology are also limited to imaging or sorting with cell labels.
  • the system as described herein may provide imaging of single cells and label-free sorting in one platform.
  • the system may directly capture high-resolution brightfield images of cells in real time.
  • the system may also enable cell sorting based on their morphology without involving any cell labels.
  • the cells may remain viable and minimally perturbed after the sorting process.
  • the system may allow collection of sorted cells for downstream analysis, for example, single-cell RNA sequencing.
  • the system may comprise, or be compatible with, the human foundation model for high-dimensional morphological feature analysis.
  • the system may comprise or be compatible with a data suite that may allow users to store, visualize, and analyze images and high-dimensional data.
  • the system may enable the end-to-end process including cell imaging, morphology analysis, sorting, and classification.
  • the system may comprise a microfluidic platform. When cells flow through the microfluidic platform, the system may capture high-resolution brightfield images of each individual cell. The images can be processed by the human foundation model for extracting high-dimensional features corresponding to the cells. The system may sort the cells in different categories, based on the distinct morphological features.
  • FIG. 2A illustrates system 100 which includes, and illustrates the interaction between, a microfluidics platform 20 (e.g., corresponding to any microfluidics platform of this disclosure), the human foundation model 60 (e.g., a deep learning model), and a data suite 40, in accordance with some examples of the present disclosure.
  • the system 100 for cell morphology analysis may include microfluidics platform 20, which may include or be compatible with the human foundation model 60 and the data suite 40. Example interactions between the microfluidics platform 20, the human foundation model 60, and the data suite 40 will be described in further detail elsewhere herein, including below in accordance with FIG. 2B.
  • brightfield images of single cells are analyzed in real-time by model 60 to generate quantitative AI embeddings that can include reproducible high-dimensional descriptions of cell morphology.
  • morphologically distinct cell groups can be sorted in real-time by system 100 for downstream analysis, including up to approximately 6 populations per run.
  • FIG. 2B illustrates an example workflow from high-throughput imaging to cell characterization, classification and sorting based on cell morphology analysis, in accordance with some examples of the present disclosure.
  • the system for cell morphology profiling as described herein may include a benchtop microfluidics platform that captures high-resolution brightfield images of single cells and sorts cells in a label-free manner.
  • the microfluidics platform may comprise or be compatible with the human foundation model and a data suite that may allow users to store, visualize, and analyze images and high-dimensional data.
  • the workflow 200 can be streamlined, starting from preparing and loading cells onto the microfluidics platform (operation 210).
  • samples from established human cell lines or dissociated tissue biopsies in single cell suspension are loaded onto a microfluidic chip.
  • the preparation of samples may comprise dissociation of cells into a single-cell suspension and loading the suspension onto the microfluidics platform.
  • the system may capture images of the cells, and the human foundation model may characterize the cells in real time as they flow through the microfluidic chip (operation 220).
  • images of single cells are captured and analyzed in real-time by the human foundation model to generate multidimensional quantitative morphological profiles (operation 230).
  • the human foundation model may process the images of the cells and generate high-dimensional features reflecting the cell morphology.
  • the images and extracted features can be stored in the data suite (operation 240).
  • the human foundation model also visualizes the cell morphology data by, for example, generating user-defined cell clusters based on cell types (also operation 240).
  • the data suite may also provide in-depth data analysis, including selecting cell populations of interest to sort on the microfluidics platform.
  • the system may recover sorted cells in a plurality of collection wells (operation 250), which can be used for downstream analyses.
  • the collected morphology data (referred to as embeddings) can be further analyzed as a unique modality, and users can continuously train customized models for specific applications.
  • FIG. 3 illustrates an example system for cell morphology analysis, in accordance with some examples of the present disclosure.
  • the system 300 may comprise a benchtop microfluidics platform 310 that captures high-resolution brightfield images of single cells and sorts cells in a label-free manner.
  • System 300 also may include data suite 330, which can be implemented using (and integrated with) microfluidics platform 310, or can be implemented using a separate device.
  • Tables 3 and 4, provided further below, list example parameters, specifications, and components of system 300.
  • FIG. 3 schematically illustrates an example method for classifying a cell.
  • the method can comprise processing image data 310 comprising tag-free images/videos of single cells (e.g., image data 310 consisting of tag-free images/videos of single cells).
  • Various clustering analysis models 320 as disclosed herein can be used to process the image data 310 to extract one or more morphological properties of the cells from the image data 310, and generate a cell morphology map 330A based on the extracted one or more morphological properties.
  • the cell morphology map 330A can be generated based on two morphological properties as dimension 1 and dimension 2.
  • the cell morphology map 330A can comprise one or more clusters (e.g., clusters A, B, and C) of datapoints, each datapoint representing an individual cell from the image data 310.
  • the cell morphology map 330A and the clusters A-C therein can be used to train classifier(s) 350.
  • a new image 340 of a new cell can be obtained and processed by the trained classifier(s) 350 to automatically extract and analyze one or more morphological features from the cellular image 340 and plot it as a datapoint on the cell morphology map 330A.
  • the classifier(s) 350 can automatically classify the new cell.
  • the classifier(s) 350 can determine a probability that the cell in the new image data 340 belongs to cluster C (e.g., the likelihood for the cell in the new image data 340 to share one or more commonalities and/or characteristics with cluster C more than with other clusters A/B).
  • the classifier(s) 350 can determine and report that the cell in the new image data 340 has a 95% probability of belonging to cluster C, 1% probability of belonging to cluster B, and 4% probability of belonging to cluster A, solely based on analysis of the tag-free image 340 and one or more morphological features of the cell extracted therefrom.
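The per-cluster probability report described above can be sketched as follows. This is a minimal illustration only: it assumes, hypothetically, that each cluster is summarized by a centroid on a 2D map and that probabilities are obtained from a softmax over negative squared distances. The disclosure does not specify how classifier(s) 350 compute probabilities, and the centroid values and the `temperature` parameter are invented for the example.

```python
import math

# Hypothetical cluster centroids on a 2D cell morphology map (dimension 1, dimension 2).
CENTROIDS = {"A": (0.0, 0.0), "B": (4.0, 0.0), "C": (2.0, 3.0)}


def cluster_probabilities(point, centroids=CENTROIDS, temperature=1.0):
    """Return {cluster: probability} from a softmax over negative squared distances."""
    scores = {
        name: -((point[0] - cx) ** 2 + (point[1] - cy) ** 2) / temperature
        for name, (cx, cy) in centroids.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def classify(point, centroids=CENTROIDS):
    """Return the most probable cluster together with the full probability report."""
    probs = cluster_probabilities(point, centroids)
    return max(probs, key=probs.get), probs
```

Calling `classify((2.0, 2.5))` returns cluster "C" as the most probable assignment, along with probabilities for all three clusters that sum to one.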
  • An image and/or video (e.g., a plurality of images and/or videos) of one or more cells as disclosed herein (e.g., that of image data 310 in FIG. 3) can be captured while the cell(s) is suspended in a fluid (e.g., an aqueous liquid, such as a buffer) and/or while the cell(s) is moving (e.g., transported across a microfluidic channel).
  • the cell need not be suspended in a gel-like or solid-like medium.
  • the fluid can comprise a liquid that is heterologous to the cell(s)’s natural environment.
  • cells from a subject’s blood can be suspended in a fluid that comprises (i) at least a portion of the blood and (ii) a buffer that is heterologous to the blood.
  • the cell(s) may not be immobilized (e.g., embedded in a solid tissue or affixed to a microscope slide, such as a glass slide, for histology) or adhered to a substrate.
  • the cell(s) can be isolated from the natural environment or niche (e.g., a part of the tissue the cell(s) would be in if not retrieved from a subject by human intervention) when the image and/or video of the cell(s) is captured.
  • the image and/or video need not be from histological imaging.
  • the cell(s) need not be sliced or sectioned prior to obtaining the image and/or video of the cell, and, as such, the cell(s) may remain substantially intact as a whole during capturing of the image and/or video.
  • each cell image can be annotated with the extracted one or more morphological features and/or with information that the cell image belongs to a particular cluster (e.g., a probability).
  • the cell morphology map can be a visual (e.g., graphical) representation of one or more clusters of datapoints.
  • the cell morphology map can be a 1-dimensional (1D) representation (e.g., based on one morphological property as one parameter or dimension) or a multi-dimensional representation, such as a 2-dimensional (2D) representation (e.g., based on two morphological properties as two parameters or dimensions), a 3-dimensional (3D) representation (e.g., based on three morphological properties as three parameters or dimensions), a 4-dimensional (4D) representation, etc.
  • one morphological property of a plurality of morphological properties used for plotting the cell morphology map can be represented as a non-axial parameter (e.g., non-x, y, or z axis), such as distinguishable colors (e.g., heatmap), numbers, letters (e.g., texts of one or more languages), and/or symbols (e.g., a square, oval, triangle, etc.).
  • a heatmap can be used as a colorimetric scale to represent the classifier prediction percentages for each cell against a cell class, cell type, or cell state.
  • the cell morphology map can be generated based on one or more morphological features (e.g., characteristics, profiles, fingerprints, etc.) from the processed image data.
  • Non-limiting examples of one or more morphological properties of a cell, as disclosed herein, that can be extracted from one or more images of the cell can include, but are not limited to, (i) shape, curvature, size (e.g., diameter, length, width, circumference), area, volume, texture, thickness, roundness, etc. of the cell or one or more components of the cell (e.g., cell membrane, nucleus, mitochondria, etc.), (ii) number or positioning of one or more contents (e.g., nucleus, mitochondria, etc.), and (iii) optical characteristics of a region of the image(s) (e.g., unique groups of pixels within the image(s)) that correspond to the cell or a portion thereof (e.g., light emission, transmission, reflectance, absorbance, fluorescence, luminescence, etc.).
  • Non-limiting examples of clustering as disclosed herein can be hard clustering (e.g., determining whether a cell belongs to a cluster or not), soft clustering (e.g., determining a likelihood that a cell belongs to each cluster to a certain degree), strict partitioning clustering (e.g., determining whether each cell belongs to exactly one cluster), strict partitioning clustering with outliers (e.g., determining whether a cell can also belong to no cluster), overlapping clustering (e.g., determining whether a cell can belong to more than one cluster), hierarchical clustering (e.g., determining whether cells that belong to a child cluster can also belong to a parent cluster), and subspace clustering (e.g., determining whether clusters are not expected to overlap).
  • Cell clustering and/or generation of the cell morphology map can be based on a single morphological property of the cells.
  • cell clustering and/or generation of the cell morphology map can be based on a plurality of different morphological properties of the cells.
  • the plurality of different morphological properties of the cells can have the same weight or different weights.
  • a weight can be a value indicative of the importance or influence of each morphological property relative to one another in training the classifier or using the classifier to (i) generate one or more cell clusters, (ii) generate the cell morphology map, or (iii) analyze a new cellular image to classify the cellular image as disclosed herein.
  • cell clustering can be performed by having 50% weight on cell shape, 40% weight on cell area, and 10% weight on texture (e.g., roughness) of the cell membrane.
  • the classifier as disclosed herein can be configured to adjust the weights of the plurality of different morphological properties of the cells during analysis of new cellular image data, thereby yielding an optimal cell clustering and cell morphology map.
  • the plurality of different morphological properties with different weights can be utilized during the same analysis operation for cell clustering and/or generation of the cell morphology map.
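The weighting scheme described above can be sketched as a weighted distance used when assigning a cell to a cluster. This is an illustrative sketch, not the disclosed implementation: it assumes each morphological property has already been normalized to a comparable range (here 0 to 1), and the feature names, centroid values, and functions `weighted_distance` and `nearest_cluster` are hypothetical.

```python
import math

# Weights from the example in the text: 50% shape, 40% area, 10% membrane texture.
WEIGHTS = {"shape": 0.5, "area": 0.4, "texture": 0.1}


def weighted_distance(cell_a, cell_b, weights=WEIGHTS):
    """Weighted Euclidean distance between two dicts of normalized feature values."""
    return math.sqrt(
        sum(w * (cell_a[name] - cell_b[name]) ** 2 for name, w in weights.items())
    )


def nearest_cluster(cell, centroids, weights=WEIGHTS):
    """Assign a cell to the centroid with the smallest weighted distance."""
    return min(centroids, key=lambda c: weighted_distance(cell, centroids[c], weights))


# Hypothetical centroids and a cell whose shape resembles one cluster
# while its texture resembles the other.
centroids = {
    "round": {"shape": 0.9, "area": 0.5, "texture": 0.2},
    "elongated": {"shape": 0.2, "area": 0.5, "texture": 0.8},
}
cell = {"shape": 0.8, "area": 0.5, "texture": 0.9}
```

With the shape-heavy weights the example cell is assigned to "round"; swapping the shape and texture weights moves it to "elongated", which shows how the relative weights influence the resulting clusters.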
  • the plurality of different morphological properties can be analyzed hierarchically.
  • a first morphological property can be used as a parameter to analyze image data of a plurality of cells to generate an initial set of clusters.
  • a second and different morphological property can be used as a second parameter to (i) modify the initial set of clusters (e.g., optimize arrangement among the initial set of clusters, re-group some clusters of the initial set of clusters, etc.) and/or (ii) generate a plurality of sub-clusters within a cluster of the initial set of clusters.
  • a first morphological property can be used as a parameter to analyze image data of a plurality of cells to generate an initial set of clusters, to generate a 1D cell morphology map.
  • a second morphological property can be used as a parameter to further analyze the clusters of the 1D cell morphology map, to modify the clusters and generate a 2D cell morphology map (e.g., a first axis parameter based on the first morphological property and a second axis parameter based on the second morphological property).
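The hierarchical use of two morphological properties can be sketched as a two-stage procedure: cluster on the first property, then sub-cluster each resulting cluster on the second. The sketch below assumes a deliberately simple, hypothetical 1D clustering rule (start a new cluster wherever sorted values are separated by more than a chosen gap); the disclosure does not prescribe this rule, and the property names and gap thresholds are illustrative.

```python
def split_by_gaps(values, gap):
    """Group indices of 1D values into clusters, splitting where neighbors differ by > gap."""
    if not values:
        return []
    order = sorted(range(len(values)), key=lambda i: values[i])
    clusters, current = [], [order[0]]
    for prev, nxt in zip(order, order[1:]):
        if values[nxt] - values[prev] > gap:
            clusters.append(current)
            current = []
        current.append(nxt)
    clusters.append(current)
    return clusters


def hierarchical_clusters(cells, first, second, gap_first, gap_second):
    """Cluster on `first`, then sub-cluster each cluster on `second`.

    Returns a list of clusters; each cluster is a list of sub-clusters,
    and each sub-cluster is a list of indices into `cells`.
    """
    result = []
    for cluster in split_by_gaps([c[first] for c in cells], gap_first):
        subs = split_by_gaps([cells[i][second] for i in cluster], gap_second)
        # Map positions within the cluster back to indices into `cells`.
        result.append([[cluster[j] for j in sub] for sub in subs])
    return result


cells = [
    {"area": 10, "roundness": 0.9},
    {"area": 11, "roundness": 0.2},
    {"area": 30, "roundness": 0.5},
    {"area": 12, "roundness": 0.85},
]
```

Here `hierarchical_clusters(cells, "area", "roundness", 5, 0.3)` first separates the large cell from the three small ones by area, then splits the small cells into two sub-clusters by roundness.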
  • an initial set of clusters can be generated based on an initial morphological feature that is extracted from the image data, and one or more clusters of the initial set of clusters can comprise a plurality of sub-clusters based on second morphological features or sub-features of the initial morphological feature.
  • the initial morphological feature can be cell type, such as stem cells (or not), and the sub-features can be different types of stem cells (e.g., embryonic stem cells, induced pluripotent stem cells, mesenchymal stem cells, muscle stem cells, etc.).
  • the initial feature can be cancer cells (or not), and the sub-feature can be different types of cancer cells (e.g., sarcoma cells, leukemia cells, lymphoma cells, multiple myeloma cells, melanoma cells, etc.).
  • the initial feature can be cancer cells (or not), and the sub-feature can be different stages of the cancer cell (e.g., quiescent, proliferative, apoptotic, etc.).
  • Each datapoint can represent an individual cell or a collection of a plurality of cells (e.g., at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 cells, or more).
  • Each datapoint can represent an individual image (e.g., of a single cell or a plurality of cells) or a collection of a plurality of images (e.g., at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 images of the same single cell or different cells, or more).
  • the cell morphology map can comprise at least about 1, or at least about 2, or at least about 3, or at least about 4, or at least about 5, or at least about 6, or at least about 7, or at least about 8, or at least about 9, or at least about 10, or at least about 15, or at least about 20, or at least about 30, or at least about 40, or at least about 50, or at least about 60, or at least about 70, or at least about 80, or at least about 90, or at least about 100, or at least about 150, or at least about 200, or at least about 300, or at least about 400, or at least about 500 clusters, or more.
  • Each cluster as disclosed herein can comprise a plurality of sub-clusters, e.g., at least about 2, or at least about 3, or at least about 4, or at least about 5, or at least about 6, or at least about 7, or at least about 8, or at least about 9, or at least about 10, or at least about 15, or at least about 20, or at least about 30, or at least about 40, or at least about 50, or at least about 60, or at least about 70, or at least about 80, or at least about 90, or at least about 100, or at least about 150, or at least about 200, or at least about 300, or at least about 400, or at least about 500 sub-clusters.
  • a cluster (or sub-cluster) can comprise datapoints representing cells of the same type/state.
  • a cluster (or sub-cluster) can comprise datapoints representing cells of different types/states.
  • a cluster can comprise at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, at least about 15, at least about 20, at least about 30, at least about 40, at least about 50, at least about 60, at least about 70, at least about 80, at least about 90, at least about 100, at least about 150, at least about 200, at least about 300, at least about 400, at least about 500, at least about 1,000, at least about 2,000, at least about 3,000, at least about 4,000, at least about 5,000, at least about 10,000, at least about 50,000, or at least about 100,000 datapoints.
  • a cluster (or sub-cluster) as disclosed herein can be represented with a boundary (e.g., a solid line or a dashed line).
  • a cluster or sub-cluster need not be represented with a boundary, and can be distinguishable from other cluster(s) or sub-cluster(s) based on their proximity to one another.
  • a cluster (or sub-cluster) or data comprising information about the cluster can be annotated based on one or more annotation schema (e.g., predefined annotation schema).
  • Such annotation can be manual (e.g., by a user of the method or system disclosed herein) or automatic (e.g., by any of the machine learning algorithms disclosed herein).
  • the annotation of the clustering can be related to the one or more morphological properties of the cells that have been analyzed (e.g., cell shape, cell area, optical characteristic(s), etc.) to generate the cluster or assign one or more datapoints to the cluster.
  • the annotation of the clustering can be related to information that has not been used or analyzed to generate the cluster or assign one or more datapoints to the cluster (e.g., genomics, transcriptomics, or proteomics, etc.).
  • the annotation can be utilized to add additional “layers” of information to each cluster.
  • an interactive annotation tool can be provided that permits one or more users to modify any process of the method described herein.
  • the interactive annotation tool can allow a user to curate, verify, edit, and/or annotate the morphologically distinct clusters.
  • the interactive annotation tool can process the image data, extract one or more morphological features from the image data, and allow the user to select one or more of the extracted morphological features to be used as a basis to generate the clusters and/or the cell morphology map.
  • the interactive annotation tool can allow the user to annotate each cluster and/or the cell morphology map using (i) a predefined annotation schema or (ii) a new, user-defined annotation schema.
  • the interactive annotation tool can allow the user to assign different weights to different morphological features for the clustering and/or map plotting.
  • the interactive annotation tool can allow the user to select which imaging data (or which cells) are to be used and/or which imaging data (or which cells, cell clumps, artifacts, or debris) are to be discarded, for the clustering and/or map plotting.
  • a user can manually identify incorrectly clustered cells, or the machine learning algorithm can provide a probability or correlation value of cells within each cluster and identify any outlier (e.g., a datapoint that would change the outcome of the probability/correlation value of the cluster(s) by a certain percentage value).
  • the user can choose to move the outliers using the interactive annotation tool to further tune the cell morphology map, e.g., to yield a “higher resolution” map.
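The outlier criterion described above (a datapoint whose presence changes a cluster-level value by more than a certain percentage) can be sketched with a leave-one-out check. The sketch is an assumption-laden illustration: it uses mean distance to the centroid as a stand-in for the cluster's "probability/correlation value", which the disclosure does not define, and the `pct` threshold and function names are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def cohesion(points):
    """Mean distance of points to their centroid (lower means a tighter cluster)."""
    c = centroid(points)
    return sum(math.dist(p, c) for p in points) / len(points)


def find_outliers(points, pct=0.25):
    """Return indices whose removal changes the cluster's cohesion by more than `pct`."""
    base = cohesion(points)
    flagged = []
    for i in range(len(points)):
        rest = points[:i] + points[i + 1:]
        if base > 0 and abs(cohesion(rest) - base) / base > pct:
            flagged.append(i)
    return flagged


# Four tightly grouped datapoints and one distant datapoint.
cluster = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
```

For the example cluster, `find_outliers(cluster)` flags only the distant datapoint at index 4, which a user could then move or discard in an annotation tool.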
  • One or more cell morphology maps as disclosed herein can be used to train one or more classifiers (e.g., at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, or more classifiers) as disclosed herein.
  • Each classifier can be trained to analyze one or more images of a cell (e.g., to extract one or more morphological features of the cell) and categorize (or classify) the cell into one or more determined classes or categories of a cell (e.g., based on a type or state of the cell).
  • the classifier can be trained to create a new category to categorize (or classify) the cell into the new category, e.g., when determining that the cell is morphologically distinct from any pre-existing categories of other cells.
  • the machine learning algorithm as disclosed herein can be configured to extract one or more morphological features of a cell from the image data of the cell.
  • the machine learning algorithm can form a new data set based on the extracted morphological features, and the new data set need not contain the original image data of the cell.
  • replicas of the original images in the image data can be stored in a database disclosed herein, e.g., prior to using any of the new images for training, e.g., to keep the integrity of the images of the image data.
  • processed images of the original images in the image data can be stored in a database disclosed herein during or subsequent to the classifier training.
  • any of the newly extracted morphological features as disclosed herein can be utilized as new molecular markers for a cell or population of cells of interest to the user.
  • the cell analysis platform as disclosed herein can be operatively coupled to one or more databases comprising non-morphological data of cells processed (e.g., genomics data, transcriptomics data, proteomics data, metabolomics data), and a selected population of cells exhibiting the newly extracted morphological feature(s) can be further analyzed by their non-morphological properties to identify proteins or genes of interest that are common in the selected population of cells but not in other cells, thereby determining such proteins or genes of interest to be new molecular markers that can be used to identify such selected population of cells.
  • a classifier can be trained by applying machine learning algorithms on at least a portion of one or more cell morphology maps as disclosed herein as a training dataset.
  • machine learning algorithms for training a classifier can include supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, self-learning, feature learning, anomaly detection, association rules, etc.
  • a classifier can be trained by using one or more learning models on such training dataset.
  • Non-limiting examples of learning models can include artificial neural networks (e.g., convolutional neural networks, U-net architecture neural network, etc.), backpropagation, boosting, decision trees, support vector machines, regression analysis, Bayesian networks, genetic algorithms, kernel estimators, conditional random field, random forest, ensembles of classifiers, minimum complexity machines (MCM), probably approximately correct (PAC) learning, etc.
  • the neural networks are designed by the modification of neural networks such as AlexNet, VGGNet, GoogLeNet, ResNet (residual networks), DenseNet, and Inception networks.
  • the enhanced neural networks are designed by modification of ResNet (e.g., ResNet 18, ResNet 34, ResNet 50, ResNet 101, and ResNet 152) or Inception networks.
  • the modification comprises a series of network surgery operations that are mainly carried out to improve performance, including inference time and/or inference accuracy.
  • the machine learning algorithm as disclosed herein can utilize one or more clustering algorithms to determine that objects in the same cluster can be more similar (in one or more morphological features) to each other than those in other clusters.
  • the clustering algorithms can include, but are not limited to, connectivity models (e.g., hierarchical clustering), centroid models (e.g., K-means algorithm), distribution models (e.g., expectation-maximization algorithm), density models (e.g., density-based spatial clustering of applications with noise (DBSCAN), ordering points to identify the clustering structure (OPTICS)), subspace models (e.g., biclustering), group models, graph-based models (e.g., highly connected subgraphs (HCS) clustering algorithms), single graph models, and neural models (e.g., using unsupervised neural network).
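As one concrete instance of the centroid models listed above, the K-means algorithm can be sketched in a few lines. This is a minimal illustration under stated assumptions: datapoints are tuples of morphological feature values, initial centroids are supplied by the caller (a hypothetical choice that keeps the example deterministic), and a fixed number of iterations is run rather than testing for convergence.

```python
import math


def kmeans(points, initial_centroids, iterations=10):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point takes the index of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if no point was assigned to it
                centroids[k] = tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(len(members[0]))
                )
    return labels, centroids


# Two visually separable groups of hypothetical 2D morphology datapoints.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
labels, centroids = kmeans(points, initial_centroids=[(0, 0), (10, 10)])
```

In practice the initial centroids are usually chosen randomly or with a seeding heuristic; they are fixed here only so the result is reproducible.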
  • the machine learning algorithm can utilize a plurality of models, e.g., in equal weights or in different weights.
  • unsupervised and self-supervised approaches can be used to expedite labeling of image data of cells.
  • an embedding for a cell image can be generated.
  • the embedding can be a representation of the image in a space with fewer dimensions than the original image data.
  • Such embeddings can be used to cluster images that are similar to one another.
  • the labeler can be configured to batch-label the cells and increase the throughput as compared to manually labeling one or more cells.
  • additional meta information (e.g., additional non-morphological information) about the sample (e.g., what disease is known or associated with the patient who provided the sample) can be used for labeling of the image data.
  • embedding generation can use a neural net trained on predefined cell types.
  • an intermediate layer of the neural net that is trained on predetermined image data (e.g., image data of known cell types and/or states) can be used.
  • embedding generation can use neural nets trained for different tasks.
  • an intermediate layer of a neural net that is trained for a different task (e.g., a neural net that is trained on a canonical dataset such as ImageNet) can be used.
  • this can allow a focus on features that matter for image classification (e.g., edges and curves) while removing a bias that may otherwise be introduced in labeling the image data.
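Using an intermediate layer as an embedding can be sketched with a toy two-layer network. Everything numeric here is hypothetical: the `TinyNet` class, its hand-picked weights, and the four-pixel "image" exist only to show that the embedding is the hidden-layer activation rather than the final output. A real system would use a trained network such as those named elsewhere in this disclosure.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in vector]


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class TinyNet:
    """Toy network: input -> hidden layer (the embedding) -> class scores."""

    def __init__(self, w1, b1, w2, b2):
        self.w1, self.b1, self.w2, self.b2 = w1, b1, w2, b2

    def embed(self, pixels):
        """Return the intermediate-layer activations, used as the embedding."""
        return relu(dense(pixels, self.w1, self.b1))

    def predict(self, pixels):
        """Return the final-layer class scores computed from the embedding."""
        return dense(self.embed(pixels), self.w2, self.b2)


# Hypothetical weights: 4 input pixels -> 2 hidden units -> 2 class scores.
net = TinyNet(
    w1=[[1, 0, 0, 0], [0, 1, 0, -1]], b1=[0, 0],
    w2=[[1, 0], [0, 1]], b2=[0, 0],
)
embedding = net.embed([0.5, 0.2, 0.1, 0.4])
```

The embedding has two values for a four-value input, illustrating the reduced-dimension representation that can then be clustered.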
  • autoencoders can be used for embedding generation.
  • autoencoders can be used, in which the input and the output can be substantially the same image and the squeeze layer can be used to extract the embeddings.
  • the squeeze layer can force the model to learn a smaller representation of the image, which smaller representation may have sufficient information to recreate the image (e.g., as the output).
  • an expanding training data set can be used for clustering-based labeling of image data of cells.
  • one or more revisions of labeling (e.g., manual relabeling) may be needed as the training data set expands.
  • Such manual relabeling can be intractable on a large scale and ineffective when done on a random subset of the data.
  • similar embedding-based clustering can be used to identify labeled images that may cluster with members of other classes. Such examples are likely to be enriched for incorrect or ambiguous labels, which can be removed (e.g., automatically or manually).
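Finding labeled images that cluster with members of other classes can be sketched with a nearest-neighbor vote in embedding space. This is one possible realization, not the disclosed method: the choice of k-nearest-neighbor voting, the value of `k`, and the example embeddings and labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def suspicious_labels(embeddings, labels, k=3):
    """Return indices whose label differs from the majority label of their k nearest neighbors."""
    flagged = []
    for i, emb in enumerate(embeddings):
        # Rank every other datapoint by distance to the current embedding.
        neighbors = sorted(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: math.dist(emb, embeddings[j]),
        )
        votes = Counter(labels[j] for j in neighbors[:k])
        majority, _ = votes.most_common(1)[0]
        if majority != labels[i]:
            flagged.append(i)
    return flagged


# Two groups of hypothetical 2D embeddings; index 3 carries a label ("B")
# that disagrees with the group it sits in.
embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
]
labels = ["T", "T", "T", "B", "B", "B", "B", "B"]
```

Here `suspicious_labels(embeddings, labels)` returns only index 3, which could then be reviewed manually or removed automatically.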
  • adaptive image augmentation can be used.
  • (1) one or more images with artifacts can be identified, and (2) such images identified with artifacts can be added to the training pipeline (e.g., for training the model/classifier).
  • Identifying the image(s) with artifacts can comprise: (1a) while imaging cells, cropping one or more additional sections of the image frame, which section(s) are expected to contain just the background without any cell; (1b) checking the background image for any change in one or more characteristics (e.g., optical characteristics, such as brightness); and (1c) flagging/labeling one or more images that have such change in the characteristic(s).
  • Adding the identified images to the training pipeline can comprise: (2a) preparing the one or more flagged/labeled images for use as augmentation by first calculating an average feature of the changed characteristic(s) (e.g., the background median color); (2b) creating a delta image by subtracting the average feature from the image data (e.g., subtracting the median for each pixel of the image); and (2c) adding the delta image to the training pipeline.
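The artifact-flagging and delta-image steps above can be sketched on small grayscale arrays. The sketch assumes images are nested lists of pixel intensities and that the background crop is a flat list of intensities from a cell-free region; the `tolerance` threshold, the reference median, and the function names are hypothetical.

```python
from statistics import median


def background_changed(background_crop, reference_median, tolerance=5.0):
    """Flag a frame whose cell-free background crop drifts from the reference brightness."""
    return abs(median(background_crop) - reference_median) > tolerance


def delta_image(image, background_crop):
    """Subtract the background median from every pixel of the image."""
    level = median(background_crop)
    return [[pixel - level for pixel in row] for row in image]


def collect_augmentations(frames, reference_median, tolerance=5.0):
    """Return delta images for every (image, background_crop) pair that is flagged."""
    return [
        delta_image(image, crop)
        for image, crop in frames
        if background_changed(crop, reference_median, tolerance)
    ]


# Hypothetical frame: a 2x2 image plus a crop of background-only pixels.
crop = [200, 202, 198, 201, 199]
image = [[200, 150], [120, 205]]
```

With a reference background median of 180, this frame is flagged and its delta image is `[[0, -50], [-80, 5]]`, which would then be passed to the training pipeline.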
  • One or more dimensions of the cell morphology map can be represented by various approaches (e.g., dimensionality reduction approaches), such as, for example, principal component analysis (PCA), multidimensional scaling (MDS), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP).
  • UMAP can be a machine learning technique for dimension reduction.
  • UMAP can be constructed from a theoretical framework based in Riemannian geometry and algebraic topology.
  • UMAP can be utilized for a practical scalable algorithm that applies to real world data, such as morphological properties of one or more cells.
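Of the dimensionality reduction approaches listed above, PCA is the simplest to sketch without external libraries. The example below finds only the first principal component by power iteration on the sample covariance matrix; this is an illustrative simplification (library implementations typically use a singular value decomposition), and the function name and data are hypothetical. Power iteration can fail if the starting vector happens to be orthogonal to the leading component.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (unit direction, projections) for the leading principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit starting vector
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # data has no variance along the current direction
        v = [x / norm for x in w]
    projections = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projections


# Hypothetical two-feature data lying exactly on the line y = 2x.
rows = [(1, 2), (2, 4), (3, 6), (4, 8)]
direction, projections = first_principal_component(rows)
```

The projections give each datapoint a single coordinate, which could serve as one dimension of a cell morphology map.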
  • the cell morphology map as disclosed herein can comprise an ontology of the one or more morphological features.
  • the ontology can be an alternative medium to represent a relationship among various datapoints (e.g., each representing a cell) analyzed from an image data.
  • an ontology can be a data structure of information, in which nodes can be linked by edges. An edge can be used to define a relationship between two nodes.
  • a cell morphology map can comprise a cluster comprising sub-clusters, and the relationship between the cluster and the sub-clusters can be represented in a node/edge ontology (e.g., an edge can be used to describe the relationship as a subclass of, genus of, part of, stem cell of, differentiated from, progeny of, diseased state of, targets, recruits, interacts with, same tissue, different tissue, etc.).
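The node/edge structure described above can be sketched as a small labeled graph. The `Ontology` class and its method names are hypothetical, and the example nodes are drawn from the stem cell sub-types mentioned earlier; the relation strings follow the edge examples given in the text.

```python
from collections import defaultdict


class Ontology:
    """Minimal labeled graph: nodes linked by edges that name a relationship."""

    def __init__(self):
        self._edges = defaultdict(list)  # subject -> [(relation, object), ...]

    def add(self, subject, relation, obj):
        """Add an edge stating that `subject` has `relation` to `obj`."""
        self._edges[subject].append((relation, obj))

    def related(self, subject, relation):
        """Return nodes directly linked from `subject` by `relation`."""
        return [o for r, o in self._edges.get(subject, []) if r == relation]

    def ancestors(self, node, relation="is a subclass of"):
        """Follow `relation` edges transitively, nearest ancestors first."""
        seen, queue = [], list(self.related(node, relation))
        while queue:
            current = queue.pop(0)
            if current not in seen:
                seen.append(current)
                queue.extend(self.related(current, relation))
        return seen


ontology = Ontology()
ontology.add("embryonic stem cell", "is a subclass of", "stem cell")
ontology.add("mesenchymal stem cell", "is a subclass of", "stem cell")
ontology.add("stem cell", "is a subclass of", "cell")
```

Querying `ontology.ancestors("embryonic stem cell")` walks from a sub-cluster up through its parent clusters.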
  • one-to-one morphology to genomics mapping can be utilized. An image of a single cell or images of multiple “similar looking” cells can be mapped to its/their molecular profile(s) (e.g., genomics, proteomics, transcriptomics, etc.).
  • classifier-based barcoding can be performed.
  • Each sorting event (e.g., positive classifier) can be associated with a unique barcode (e.g., nucleic acid or small molecule barcode).
  • the exact barcode(s) used for that individual classifier positive event can be recorded and tracked.
  • the cells can be lysed and molecularly analyzed together with the barcode(s).
  • the result of the molecular analysis can then be mapped (e.g., one-to-one) to the image(s) of the individual (or ensemble of) sorted cell(s) captured while the cell(s) are flowing in the flow channel.
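The barcode bookkeeping described above amounts to a join between two records that share a barcode. The sketch below is illustrative only: the barcode strings, image identifiers, and profile contents are hypothetical placeholders, and the function names are not taken from the disclosure.

```python
def record_sort_event(sort_log, barcode, image_ids):
    """Record which cell image(s) were sorted under a given unique barcode."""
    if barcode in sort_log:
        raise ValueError(f"barcode {barcode!r} was already used")
    sort_log[barcode] = list(image_ids)


def map_profiles_to_images(sort_log, molecular_results):
    """Join molecular profiles to cell images through the shared barcode."""
    mapping = {}
    for barcode, profile in molecular_results.items():
        if barcode in sort_log:  # skip barcodes with no recorded sorting event
            mapping[barcode] = {"images": sort_log[barcode], "profile": profile}
    return mapping


# Hypothetical sorting events recorded while cells flow through the channel.
sort_log = {}
record_sort_event(sort_log, "BC-001", ["img_0042"])
record_sort_event(sort_log, "BC-002", ["img_0107", "img_0108"])

# Hypothetical molecular readout obtained after lysis, keyed by barcode.
molecular_results = {"BC-001": {"gene_x": 12}, "BC-002": {"gene_x": 3}}
mapped = map_profiles_to_images(sort_log, molecular_results)
```

Each entry of `mapped` pairs the image(s) of the sorted cell(s) with the molecular profile read out under the same barcode.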
  • class-based sorting can be utilized.
  • FIG. 4 schematically illustrates different ways of representing analysis data of image data of cells.
  • Tag-free image data 410 of cells (e.g., circular cells and square cells) having different nuclei (e.g., small nucleus and large nucleus) can be analyzed.
  • any of the classifier(s) disclosed herein can be used to analyze and plot the image data 410 into a cell morphology map 420, comprising four distinguishable clusters: cluster A (circular cell, small nucleus), cluster B (circular cell, large nucleus), cluster C (square cell, small nucleus), and cluster D (square cell, large nucleus).
  • the classifier(s) can also represent the analysis in a cell morphological ontology 430, in which a top node (“cell shape”) can be connected to two sub-nodes (“circular cell” and “rectangular cell”) using an edge (“is a subclass of”) to define the relationship between the nodes.
  • Each sub-node can also be connected to its own sub-nodes (“small nucleus” and “large nucleus”) using an edge (“is a part of”) to define their relationships.
  • the sub-nodes (e.g., “small nucleus” and “large nucleus”)
  • the cell morphology map or cell morphological ontology as disclosed herein can be further annotated with one or more non-morphological data of each cell.
  • the ontology 430 from FIG. 4 can be further annotated with information about the cells that may not be extractable from the image data used to classify the cells (e.g., molecular profiles obtained using molecular barcodes, as disclosed herein).
  • Non-limiting examples of such non-morphological data can be from additional treatment and/or analysis, including, but not limited to, cell culture (e.g., proliferation, differentiation, etc.), cell permeabilization and fixation, cell staining by a probe, mass cytometry, multiplexed ion beam imaging (MIBI), confocal imaging, nucleic acid (e.g., DNA, RNA) or protein extraction, polymerase chain reaction (PCR), target nucleic acid enrichment, sequencing, sequence mapping, etc.
  • Examples of the probe used for cell staining may include, but are not limited to, a fluorescent probe (e.g., for staining chromosomes such as X, Y, 13, 18 and 21 in fetal cells), a chromogenic probe, a direct immunoagent, an indirect immunoagent (e.g., unlabeled primary antibody coupled to a secondary enzyme), a quantum dot, a fluorescent nucleic acid stain (such as DAPI, ethidium bromide, SYBR Green, SYBR Gold, Sybr blue, RiboGreen, PicoGreen, YoPro-1, YoPro-2, YoPro-3, YOYO, OliGreen, acridine orange, thiazole orange, propidium iodide, or Hoechst), another probe that emits a photon, or a radioactive probe.
  • the instrument(s) for the additional analysis may comprise computer-executable logic that performs karyotyping, in situ hybridization (ISH) (e.g., fluorescence in situ hybridization (FISH), chromogenic in situ hybridization (CISH), nanogold in situ hybridization (NISH)), restriction fragment length polymorphism (RFLP) analysis, polymerase chain reaction (PCR) techniques, flow cytometry, electron microscopy, or quantum dot analysis, or that detects single nucleotide polymorphisms (SNPs) or levels of RNA.
  • Analysis of the image data can be performed (e.g., automatically) within less than about 1 hour, e.g., less than about 50 minutes, or less than about 40 minutes, or less than about 30 minutes, or less than about 25 minutes, or less than about 20 minutes, or less than about 15 minutes, or less than about 10 minutes, or less than about 9 minutes, or less than about 8 minutes, or less than about 7 minutes, or less than about 6 minutes, or less than about 5 minutes, or less than about 4 minutes, or less than about 3 minutes, or less than about 2 minutes, or less than about 1 minute, or less than about 50 seconds, or less than about 40 seconds, or less than about 30 seconds, or less than about 20 seconds, or less than about 10 seconds, or less than about 5 seconds, or about 1 second or less. In some examples, such analysis can be performed in real-time.
  • One or more morphological features utilized for generating the clusters or the cell morphology map, as disclosed herein, can be selected automatically (e.g., by one or more machine learning algorithms) or, alternatively, selected manually by a user using a user interface (e.g., graphical user interface (GUI)).
  • the GUI can show visualization of, for example, (i) the one or more morphological parameters extracted from the image data (e.g., represented as images, words, symbols, predefined codes, etc.), (ii) the cell morphology map comprising one or more clusters, or (iii) the cell morphological ontology.
  • the user can select, using the GUI, which morphological parameter(s) are to be used to generate the clusters and the cell morphological map prior to actual generation of the clusters and the cell morphological map.
  • the user can, upon seeing or receiving a report about the generated clusters and the cell morphological map, retroactively modify the types of morphological parameter(s) to use, thereby to (i) modify the clustering or the cell morphological mapping and/or (ii) create new cluster(s) or new cell morphological map(s).
  • the user can select one or more regions to be excluded or included for further analysis or further processing of the cells (e.g., sorting in the future or in real-time).
  • a microfluidic system as disclosed herein can be utilized to capture image(s) of each cell from a population of cells, and any of the methods disclosed herein can be utilized to analyze such image data to generate a cell morphology map comprising clusters representing the population of cells.
  • the user can select one or more clusters or sub-clusters to be sorted, and the input can be provided to the microfluidic system to sort at least a portion of the cells into one or more sub-channels of the microfluidic system (e.g., in real-time) accordingly.
  • the user can select one or more clusters or sub-clusters to be excluded during sorting (e.g., to get rid of artifacts, debris, or dead cells), and the input can be provided to the microfluidic system to sort at least a portion of the cells into one or more sub-channels of the microfluidic system (e.g., in real-time) accordingly without such artifacts, debris, or dead cells.
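  • As an illustration only, the cluster-based selection and exclusion described above could be turned into per-cell routing decisions along the following lines. This is a minimal sketch: the function name `route_cells`, the cluster labels, and the convention that sub-channel 0 is a waste channel are all hypothetical and are not taken from the disclosure.

```python
def route_cells(cell_clusters, keep_clusters, exclude_clusters, waste_channel=0):
    """Return one sub-channel index per cell (hypothetical convention).

    Selected clusters get their own sub-channel (1, 2, ...); excluded
    clusters and clusters that were not selected go to the waste channel.
    """
    channel_of = {name: i + 1 for i, name in enumerate(sorted(keep_clusters))}
    routes = []
    for cluster in cell_clusters:
        if cluster in exclude_clusters:
            routes.append(waste_channel)
        else:
            routes.append(channel_of.get(cluster, waste_channel))
    return routes


# Cluster label assigned to each imaged cell, in flow order (made-up data).
routes = route_cells(
    ["A", "debris", "B", "C", "A"],
    keep_clusters={"A", "B"},
    exclude_clusters={"debris"},
)
print(routes)
```

  In a real-time setting the same lookup would presumably be applied one cell at a time as each image is classified, rather than to a whole list.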
  • FIG. 6 schematically illustrates a method for a user to interact (e.g., using a GUI) with any one of the methods disclosed herein.
  • Image data 610 of a plurality of cells can be processed, using any one of the methods disclosed herein, to generate a cell morphology map 620A that represents the plurality of cells as datapoints in different clusters A, B, C, and D.
  • the cell morphology map 620A can be displayed to the user using the GUI 630.
  • the user can select each cluster or a datapoint within each cluster to visualize one or more images 650a, b, c, or d of the cells classified into the cluster.
  • the user can draw a box 640 (e.g., using any user-defined shape and/or size) around one or more datapoints or around a cluster.
  • the user can draw a box 640 around a cluster of “debris” datapoints, to, e.g., remove the selected cluster and generate a new cell morphology map 620B.
  • the user input can be used to update cell classifying algorithms, mapping algorithms, cell flowing mechanism (e.g., velocity of cells, positioning of the cells within a flow channel, adjusting imaging focal length/plane of one or more sensors/cameras of an imaging module (also referred to as an imaging device herein) that captures one or more images/videos of cells flowing through the cartridge, etc.), cell sorting mechanisms in the flow channel, cell sorting instructions in the flow channel, etc.
  • the classifier can be trained to identify one or more common morphological features within the selected datapoints (e.g., features that distinguish the selected datapoints from the unselected data).
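  • The box selection described for FIG. 6 can be sketched as a simple filter over two-dimensional datapoints. This is an assumption-laden illustration: it uses an axis-aligned rectangle, whereas the disclosure allows any user-defined shape, and the names `in_box` and `remove_selected` and the coordinates are hypothetical.

```python
def in_box(point, box):
    """True if a 2D datapoint lies inside an axis-aligned box."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def remove_selected(points, box):
    """Split datapoints into those kept and those removed by the box."""
    kept = [p for p in points if not in_box(p, box)]
    removed = [p for p in points if in_box(p, box)]
    return kept, removed


# Made-up map coordinates; the two points near (5, 5) play the role of debris.
map_620a = [(0.1, 0.2), (0.3, 0.1), (5.0, 5.2), (5.1, 4.9), (9.0, 0.5)]
box_640 = (4.5, 4.5, 5.5, 5.5)
map_620b, debris = remove_selected(map_620a, box_640)
print(map_620b, debris)
```

  The removed datapoints could also serve as labeled examples when a classifier is trained on the features that distinguish the selection from the rest.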
  • the present disclosure also describes a cell analysis platform, e.g., for analyzing or classifying a cell.
  • the cell analysis platform can be a product of any one of the methods disclosed herein.
  • the cell analysis platform can be used as a basis to execute any one of the methods disclosed herein.
  • the cell analysis platform can be used to process image data comprising tag-free images of single cells to generate a new cell morphology map of various cell clusters.
  • the cell analysis platform can be used to process image data comprising tag-free images of single cells to compare the cell to predetermined (e.g., pre-analyzed) images of known cells or cell morphology map(s), such that the single cells from the image data can be classified, e.g., for cell sorting.
  • FIG. 7 illustrates an example cell analysis platform (e.g., machine learning/artificial intelligence platform) for analyzing image data of one or more cells.
  • the cell analysis platform 700 can comprise a cell morphology atlas (CMA) 705.
  • the CMA 705 can comprise a database 710 having a plurality of annotated single cell images that are grouped into morphologically-distinct clusters (e.g., represented as text, as cell morphology map(s), or as cell morphological ontology(ies)) corresponding to a plurality of classifications (e.g., predefined cell classes).
  • the CMA 705 can comprise a modeling unit comprising one or more models (e.g., modeling library 720 comprising, such as, one or more machine learning algorithms disclosed herein) that are trained and validated using datasets from the CMA 705, to process image data comprising images/videos of one or more cells to identify different cell types and/or states based at least on morphological features.
  • the CMA 705 can comprise an analysis module 730 comprising one or more classifiers as disclosed herein.
  • the classifier(s) can use one or more of the models from the modeling library 720 to, e.g., (1) classify one or more images taken from a sample, (2) assess a quality or state of the sample based on the one or more images, and/or (3) map one or more datapoints representing such one or more images onto a cell morphology map (or cell morphological ontology) using a mapping module 740.
  • the CMA 705 can be operatively coupled to one or more additional databases 770 to receive the image data comprising the images/videos of one or more cells.
  • the image data from the database 770 can be obtained from an imaging module 792 of a cartridge 790, which can also be operatively coupled to the CMA 705.
  • the cartridge can direct flow of a sample comprising or suspected of comprising a target cell, and capture one or more images of contents (e.g., cells) within the sample by the imaging module 792. Any image data obtained by the imaging module 792 can be transmitted directly to the CMA 705 and/or to the new image database 770.
  • the CMA 705 can be operatively coupled to one or more additional databases 780 comprising non-morphological data of any of the cells (e.g., genomics, transcriptomics, or proteomics, etc.), e.g., to further annotate any of the datapoints, clusters, maps, ontologies, or images, as disclosed herein.
  • the CMA 705 can be operatively coupled to a user device 750 (e.g., a computer or a mobile device comprising a display) comprising a GUI 760 for the user to receive information from and/or to provide input (e.g., instructions to modify or assist any portion of the method disclosed herein).
  • any classification made by the CMA and/or the user can be provided as an input to the sorting module 794 of the cartridge 790.
  • the sorting module can determine, for example, (i) when to activate one or more sorting mechanisms at the sorting junction of the cartridge 790 to sort one or more cells of interest, and/or (ii) which sub-channel of a plurality of sub-channels to direct each single cell to for sorting.
  • the sorted cells can be collected for further analysis, e.g., downstream molecular assessment and/or profiling, such as genomics, transcriptomics, proteomics, metabolomics, etc.
  • any of the methods or platforms disclosed herein can be used as a tool that permits a user to train one or more models (e.g., from the modeling library) for cell clustering and/or cell classification.
  • a user may provide an initial image dataset of a sample to the platform, and the platform may process the initial set of image data. Based on the processing, the platform can determine a number of labels and/or an amount of data that the user needs to train the one or more models. In some examples, the platform can determine that the initial set of image data is insufficient to provide an accurate cell classification or cell morphology map.
  • the platform can plot an initial cell morphology map and recommend to the user the number of labels and/or the amount of data needed for enhanced processing, classification, and/or sorting, based on proximity (or separability), correlation, or commonality of the datapoints in the map (e.g., whether there are no distinguishable clusters within the map, whether the clusters within the map are too close to each other, etc.).
  • the platform can allow the user to select a different model (e.g., clustering model) or classifier, or different combinations of models or classifiers, to reanalyze the initial set of image data.
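  • One way to picture the separability check described above is a ratio of between-cluster distance to within-cluster spread. The sketch below is an assumption, not the disclosed method: the score definition, the `min_ratio` cutoff of 2.0, and the function names are hypothetical and chosen only to make the idea concrete.

```python
import math
from statistics import mean


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(mean(axis) for axis in zip(*points))


def separability(clusters):
    """Smallest centroid-to-centroid distance divided by the largest
    mean point-to-centroid distance (a hypothetical score)."""
    cents = {name: centroid(pts) for name, pts in clusters.items()}
    spread = max(
        mean(math.dist(p, cents[name]) for p in pts)
        for name, pts in clusters.items()
    )
    names = list(cents)
    gap = min(
        math.dist(cents[a], cents[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )
    return gap / spread if spread > 0 else math.inf


def needs_more_data(clusters, min_ratio=2.0):
    """Flag maps with no distinguishable clusters or clusters too close together."""
    return len(clusters) < 2 or separability(clusters) < min_ratio


well_separated = {
    "A": [(0, 0), (0, 1), (1, 0), (1, 1)],
    "B": [(10, 10), (10, 11), (11, 10), (11, 11)],
}
overlapping = {
    "A": [(0, 0), (0, 1), (1, 0), (1, 1)],
    "B": [(0.5, 0.5), (0.5, 1.5), (1.5, 0.5), (1.5, 1.5)],
}
print(needs_more_data(well_separated), needs_more_data(overlapping))
```

  A platform could translate such a flag into a recommendation for additional labels or images; how many to recommend is not specified in the text and is left out here.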
  • any of the methods or platforms disclosed herein can be used to determine quality or state of the image(s) of the cell, that of the cell, or that of a sample comprising the cell.
  • the quality or state of the cell can be determined at a single cell level.
  • the quality or state of the cell can be determined at an aggregate level (e.g., as a whole sample, or as a portion of the sample).
  • the quality or state can be determined and reported based on, e.g., a number system (e.g., a number scale from about 1 to about 10, a percentage scale from about 1% to about 100%), a symbolic system, or a color system.
  • the quality or state can be indicative of a preparation or priming condition of the sample (e.g., whether the sample has a sufficient number of cells, whether the sample has too many artifacts, too much debris, etc.) or indicative of a viability of the sample (e.g., whether the sample has an amount of “dead” cells above a predetermined threshold).
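  • The aggregate quality checks described above (sufficient cell count, amount of debris, fraction of dead cells against a predetermined threshold) can be sketched as follows. The label names and the three threshold values are hypothetical placeholders; the disclosure does not state specific values.

```python
from collections import Counter


def sample_quality(labels, min_cells=100, max_dead_fraction=0.2,
                   max_debris_fraction=0.3):
    """Summarize per-image labels ('live', 'dead', 'debris') into sample-level flags."""
    counts = Counter(labels)
    total = len(labels)
    cells = counts["live"] + counts["dead"]
    dead_fraction = counts["dead"] / cells if cells else 0.0
    debris_fraction = counts["debris"] / total if total else 0.0
    return {
        "enough_cells": cells >= min_cells,
        "viable": dead_fraction <= max_dead_fraction,
        "clean": debris_fraction <= max_debris_fraction,
        "dead_fraction": dead_fraction,
        "debris_fraction": debris_fraction,
    }


# Made-up classification results for 250 images.
labels = ["live"] * 170 + ["dead"] * 30 + ["debris"] * 50
report = sample_quality(labels)
print(report)
```

  The same fractions could be mapped onto a number, symbol, or color scale for reporting; that mapping is omitted because the text leaves it open.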
  • Any of the methods or platforms disclosed herein can be used to sort cells in silico (e.g., prior to actual sorting of the cells using a microfluidic channel).
  • the in silico sorting can be, e.g., to discriminate among and/or between multiple different cell types (e.g., different types of cancer cells, different types of immune cells, etc.), cell states, or cell qualities.
  • the methods and platforms disclosed herein can utilize pre-determined morphological properties (e.g., provided in the platform) for the discrimination.
  • new morphological properties can be abstracted (e.g., generated) based on the input data for the discrimination.
  • new model(s) and/or classifier(s) can be trained or generated to process the image data.
  • the newly abstracted morphological properties can be used to discriminate among and/or between, e.g., multiple different cell types, cell states, or cell qualities that are known.
  • the newly abstracted morphological properties can be used to create a new class (or classifications) to sort the cells (e.g., in silico or via the microfluidic system).
  • the newly abstracted morphological properties as disclosed herein may enhance accuracy or sensitivity of cell sorting (e.g., in silico or via the microfluidic system).
  • the actual cell sorting of the cells (e.g., via the microfluidic system or cartridge) based on the in silico sorting can be performed within less than about 1 hour, about 50 minutes, about 40 minutes, about 30 minutes, about 25 minutes, about 20 minutes, about 15 minutes, about 10 minutes, about 9 minutes, about 8 minutes, about 7 minutes, about 6 minutes, about 5 minutes, about 4 minutes, about 3 minutes, about 2 minutes, about 1 minute, about 50 seconds, about 40 seconds, about 30 seconds, about 20 seconds, about 10 seconds, about 5 seconds, about 1 second, or less.
  • the in silico sorting and the actual sorting can occur in real-time.
  • the model(s) and/or classifier(s) can be validated (e.g., for the ability to demonstrate accurate cell classification performance).
  • validation metrics can include, but are not limited to, threshold metrics (e.g., accuracy, F-measure, Kappa, Macro-Average Accuracy, Mean-Class-Weighted Accuracy, Optimized Precision, Adjusted Geometric Mean, Balanced Accuracy, etc.), ranking methods and metrics (e.g., receiver operating characteristics (ROC) analysis or “ROC area under the curve (ROC AUC)”), and probabilistic metrics (e.g., root-mean-squared error).
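  • For concreteness, three of the threshold metrics named above (accuracy, balanced accuracy, and Cohen's kappa) can be computed from true and predicted class labels as in the following sketch. The formulas are the standard definitions; the function name `threshold_metrics` and the example labels are hypothetical.

```python
from collections import Counter


def threshold_metrics(y_true, y_pred):
    """Accuracy, balanced accuracy (mean per-class recall), and Cohen's kappa."""
    n = len(y_true)
    classes = sorted(set(y_true) | set(y_pred))
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(hits.values()) / n
    # Per-class recall, skipping classes that never occur in y_true.
    recalls = [hits[c] / true_counts[c] for c in classes if true_counts[c]]
    balanced_accuracy = sum(recalls) / len(recalls)
    # Agreement expected by chance, from the marginal label frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in classes) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 1.0
    return {"accuracy": accuracy,
            "balanced_accuracy": balanced_accuracy,
            "kappa": kappa}


# Made-up labels: T = T cell, B = B cell.
y_true = ["T", "T", "T", "T", "B", "B", "B", "B"]
y_pred = ["T", "T", "T", "B", "B", "B", "B", "T"]
metrics = threshold_metrics(y_true, y_pred)
print(metrics)
```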
  • the image(s) of the cell(s) can be obtained when the cell(s) are prepared and diluted in a sample (e.g., a buffer sample).
  • the cell(s) can be diluted, e.g., in comparison to real-life concentrations of the cell in the tissue (e.g., solid tissue, blood, serum, spinal fluid, urine, etc.) to a dilution concentration.
  • the methods or platforms disclosed herein can be compatible with a sample (e.g., a biological sample or derivative thereof) that is diluted by a factor of about 500 to about 1,000,000.
  • the methods or platforms disclosed herein can be compatible with a sample that is diluted by a factor of at least about 500.
  • the methods or platforms disclosed herein can be compatible with a sample that is diluted by a factor of at most about 1,000,000.
  • the methods or platforms disclosed herein can be compatible with a sample that is diluted by a factor of about 500 to about 1,000, about 500 to about 2,000, about 500 to about 5,000, about 500 to about 10,000, about 500 to about 20,000, about 500 to about 50,000, about 500 to about 100,000, about 500 to about 200,000, about 500 to about 500,000, about 500 to about 1,000,000, about 1,000 to about 2,000, about 1,000 to about 5,000, about 1,000 to about 10,000, about 1,000 to about 20,000, about 1,000 to about 50,000, about 1,000 to about 100,000, about 1,000 to about 200,000, about 1,000 to about 500,000, about 1,000 to about 1,000,000, about 2,000 to about 5,000, about 2,000 to about 10,000, about 2,000 to about 20,000, about 2,000 to about 50,000, about 2,000 to about 100,000, about 2,000 to about 200,000, about 2,000 to about 500,000, about 2,000 to about 1,000,000, about 5,000
  • the methods or platforms disclosed herein can be compatible with a sample that is diluted by a factor of about 500, about 1,000, about 2,000, about 5,000, about 10,000, about 20,000, about 50,000, about 100,000, about 200,000, about 500,000, or about 1,000,000.
  • the classifier can generate a prediction probability (e.g., based on the morphological clustering and analysis) that an individual cell or a cluster of cells belongs to a cell class (e.g., within a predetermined cell class provided in the CMA as disclosed herein), e.g., via a reporting module.
  • the reporting module can communicate with the user via a GUI as disclosed herein.
  • the classifier can generate a prediction vector that an individual cell or a cluster of cells belongs to a plurality of cell classes (e.g., a plurality of all of predetermined cell classes from the CMA as disclosed herein).
  • the vector can be 1D (e.g., a single row of different cell classes), 2D (e.g., two dimensions, such as tissue origin vs. cell type), 3D, etc.
  • the classifier can generate a report showing a composition of the sample, e.g., a distribution of one or more cell types, each cell type indicated with a relative proportion within the sample.
  • Each cell of the sample can also be annotated with a most probable cell type and one or more less probable cell types.
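  • The reporting steps above (a per-cell prediction vector, a most probable type with less probable alternatives, and a sample composition) can be illustrated with a short sketch. The class names, probabilities, and the helper names `annotate` and `composition` are hypothetical; the sketch assumes a 1D prediction vector represented as a mapping from class name to probability.

```python
from collections import Counter


def annotate(prediction, top_k=2):
    """Return the most probable class and up to top_k less probable classes."""
    ranked = sorted(prediction.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[0][0], [name for name, _ in ranked[1:top_k + 1]]


def composition(predictions):
    """Relative proportion of each most-probable class within the sample."""
    counts = Counter(annotate(p)[0] for p in predictions)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


# Made-up prediction vectors for four cells.
cells = [
    {"T cell": 0.7, "B cell": 0.2, "monocyte": 0.1},
    {"T cell": 0.1, "B cell": 0.6, "monocyte": 0.3},
    {"T cell": 0.8, "B cell": 0.1, "monocyte": 0.1},
    {"T cell": 0.5, "B cell": 0.3, "monocyte": 0.2},
]
best, alternatives = annotate(cells[0])
report = composition(cells)
print(best, alternatives, report)
```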
  • Any one of the methods and platforms disclosed herein can be capable of processing image data of one or more cells to generate one or more morphometric maps of the one or more cells.
  • Non-limiting examples of morphometric models that can be utilized to analyze one or more images of single cells (or cell clusters) can include, e.g., simple morphometrics (e.g., based on lengths, widths, masses, angles, ratios, areas, etc.), landmark-based geometric morphometrics (e.g., spatial information, intersections, etc.
  • the morphometric map(s) can be multi-dimensional (e.g., 2D, 3D, etc.). The morphometric map(s) can be reported to the user via the GUI.
  • any of the methods or platforms disclosed herein can be used to process, analyze, classify, and/or compare two or more samples (e.g., at least about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, or more test samples).
  • the two or more samples can each be analyzed to determine a morphological profile (e.g., a cell morphology map) of each sample.
  • the morphological profiles of the two or more samples can be compared to identify a disease state of a patient’s sample in comparison to a healthy cohort’s sample or a sample of image data representative of a disease of interest.
  • the morphological profiles of the two or more samples can be compared to monitor a progress of a condition of a subject, e.g., comparing first image data of a first set of cells from a subject before a treatment (e.g., a test drug candidate, chemotherapy, surgical resection of solid tumors, etc.) and second image data of a second set of cells from the subject after the treatment.
  • the second set of cells can be obtained from the subject at least about 1 week, at least about 2 weeks, at least about 3 weeks, at least about 4 weeks, at least about 2 months, or at least about 3 months subsequent to obtaining the first set of cells from the subject.
  • the morphological profiles of the two or more samples can be compared to monitor effects of two or more different treatment options (e.g., different test drugs) in two or more different cohorts (e.g., human subjects, animal subjects, or cells being tested in vitro/ex vivo).
  • the systems and methods disclosed herein can be utilized (e.g., using sorting or enrichment of a cell type of interest or a cell exhibiting a characteristic of interest) to select a drug and/or a therapy that yields a desired effect (e.g., a therapeutic effect greater than or equal to a threshold value).
  • Any of the platforms disclosed herein can provide an inline end-to-end pipeline solution for continuous labeling and/or sorting of multiple different cell types and/or states based at least in part on (e.g., based solely on) morphological analysis of imaging data provided.
  • a modeling library used by the platform can be scalable for large amounts of data, extensible (e.g., one or more models or classifiers can be modified), and/or generalizable (e.g., more resistant to data perturbations, such as artifacts, debris, random objects in the background, or image/video distortions, between samples). Any model of the modeling library can be removed or updated with a new model automatically by the machine learning algorithms or artificial intelligence, or by the user.
  • any of the methods and platforms disclosed herein can adjust one or more parameters of the microfluidic system as disclosed herein.
  • an imaging module e.g., sensors, cameras
  • the image data can be processed and analyzed (e.g., in real-time) by the methods and platforms of the present disclosure to train a model (e.g., machine learning model) to determine whether or not to adjust one or more parameters of the microfluidic system.
  • the model(s) can determine that the cells are flowing too fast or too slow, and send an instruction to the microfluidic system to adjust (i) the velocity of the cells (e.g., by adjusting the velocity of the fluid medium carrying the cells) and/or (ii) the image recording rate of a camera that is capturing images/videos of cells flowing through the flow channel.
  • the model(s) can determine that the cells are in-focus or out-of-focus in the images/videos, and send an instruction to the microfluidic system to (i) adjust a positioning of the cells within the cartridge (e.g., move the cell towards or away from the center of the flow channel via, for example, hydrodynamic focusing and/or inertial focusing) and/or (ii) adjust a focal length/plane of the camera that is capturing images/videos of cells flowing through the flow channel. Adjusting the focal length/plane can be performed for the same cell that has been analyzed (e.g., adjusting focal length/plane of a camera that is downstream) or a subsequent cell.
  • Adjusting the focal length/plane can enhance clarity or reduce blurriness in the images.
  • the focal length/plane can be adjusted based on a classified type or state of the cell.
  • adjusting the focal length/plane can allow enhanced focusing/clarity on all parts of the cell.
  • adjusting the focal length/plane can allow enhanced focusing/clarity on different portions (but not all parts) of the cell.
  • out-of-focus images can be usable for any of the methods disclosed herein to extract morphological feature(s) of the cell that otherwise may not be abstracted from in-focus images, or vice versa.
  • instructing the imaging module to capture both in-focus and out-of-focus images of the cells can enhance accuracy of any of the analysis of cells disclosed herein.
  • the model(s) can send an instruction to the microfluidic system to modify the flow and adjust an angle of the cell relative to the camera, to adjust focus on different portions of the cell or a subsequent cell.
  • Different portions as disclosed herein can comprise an upper portion, a mid portion, a lower portion, membrane, nucleus, mitochondria, etc. of the cell.
  • bi-directional out-of-focus (OOF) images of cells, e.g., one or more first images that are OOF in a first direction, and one or more second images that are OOF in a second direction that is different (such as opposite) from the first direction.
  • images that are OOF in two opposite directions can be called “bright OOF” image(s) and “dark OOF” image(s), which can be obtained by changing the z-focus bi-directionally.
  • a classifier as disclosed herein can be trained with image data comprising both bright OOF image(s) and dark OOF image(s).
  • the trained classifiers can be used to run inferences (e.g., in real-time) on new image data of cells to classify each image as a bright OOF image, a dark OOF image, or, optionally, an image that is not OOF (e.g., not OOF relative to the bright/dark OOF images).
  • the classifier can also measure a percentage of bright OOF image, a percentage of dark OOF image, or a percentage of both bright and dark OOF images within the image data.
  • the classifier can determine that the imaging device (e.g., by the microfluidic system as disclosed herein) may not be imaging cells at the right focal length/plane.
  • the classifier can instruct the user, via GUI of a user device, to adjust the imaging device’s focal length/plane.
  • the classifier can determine, based on analysis of the image data comprising OOF images, direction and degree of adjustment of focal length/plane that can be required to adjust the imaging device, to yield a reduced amount of OOF imaging.
  • the classifier and the microfluidic device can be operatively coupled to a machine learning/artificial intelligence controller, such that the focal length/plane of the imaging device can be adjusted automatically upon determination of the classifier.
  • a threshold (e.g., a predetermined threshold) of a percentage of OOF images can be about 0.1 % to about 20 %.
  • a threshold (e.g., a predetermined threshold) of a percentage of OOF images can be at least about 0.1 %.
  • a threshold (e.g., a predetermined threshold) of a percentage of OOF images can be at most about 20 %.
  • a threshold (e.g., a predetermined threshold) of a percentage of OOF images can be about 0.1 % to about 0.5 %, about 0.1 % to about 1 %, about 0.1 % to about 2 %, about 0.1 % to about 4 %, about 0.1 % to about 6 %, about 0.1 % to about 8 %, about 0.1 % to about 10 %, about 0.
  • a threshold (e.g., a predetermined threshold) of a percentage of OOF images can be at least about 0.1 %, or at least about 0.5 %, or at least about 1 %, or at least about 2 %, or at least about 4 %, or at least about 6 %, or at least about 8 %, or at least about 10 %, or at least about 15 %, or at least about 20 %.
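  • The OOF monitoring described above can be sketched as a percentage calculation followed by a threshold comparison. The label names, the 10 % threshold (one of the values listed above), and the way the dominant OOF direction is reported are assumptions for illustration; the disclosure does not specify how a direction maps onto a physical z-adjustment.

```python
from collections import Counter


def focus_check(labels, threshold_pct=10.0):
    """Report bright/dark OOF percentages and whether refocusing is suggested."""
    if not labels:
        raise ValueError("no classified images")
    counts = Counter(labels)
    total = len(labels)
    bright_pct = 100.0 * counts["bright_oof"] / total
    dark_pct = 100.0 * counts["dark_oof"] / total
    oof_pct = bright_pct + dark_pct

    if oof_pct <= threshold_pct:
        dominant = None            # within tolerance, no adjustment suggested
    elif bright_pct > dark_pct:
        dominant = "bright"        # most OOF images are bright OOF
    elif dark_pct > bright_pct:
        dominant = "dark"          # most OOF images are dark OOF
    else:
        dominant = "mixed"         # no clear direction
    return {"bright_pct": bright_pct, "dark_pct": dark_pct,
            "oof_pct": oof_pct, "dominant": dominant}


# Made-up per-image classifier outputs for 100 images.
labels = ["in_focus"] * 80 + ["bright_oof"] * 15 + ["dark_oof"] * 5
status = focus_check(labels)
print(status)
```

  A controller could use the dominant direction to choose which way to move the focal plane, and the excess over the threshold as a rough indication of how far.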
  • the model(s) can determine that images of different modalities are needed for any of the analysis disclosed herein.
  • Images of varying modalities can comprise a bright field image, a dark field image, a fluorescent image (e.g., of cells stained with a dye), an in-focus image, an out-of-focus image, a greyscale image, a monochrome image, a multi-chrome image, etc.
  • any of the models or classifiers disclosed herein can be trained on a set of image data that is annotated with one imaging modality.
  • the models/classifiers can be trained on a set of image data that is annotated with a plurality of different imaging modalities (e.g., about 2, about 3, about 4, about 5, or more different imaging modalities).
  • Any of the models/classifiers disclosed herein can be trained on a set of image data that is annotated with a spatial coordinate indicative of a position or location within the flow channel.
  • Any of the models/classifiers disclosed herein can be trained on a set of image data that is annotated with a timestamp, such that a set of images can be processed based on the time they are taken.
  • An image of the image data can be processed using various image processing methods, such as horizontal or vertical image flips, orthogonal rotation, Gaussian noise, contrast variation, or noise introduction to mimic microscopic particles or pixel-level aberrations.
  • One or more of the processing methods can be used to generate replicas of the image or analyze the image.
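As a rough sketch of how such processing methods could generate replicas of an image, the functions below implement flips, a 90° orthogonal rotation, additive Gaussian noise, and contrast scaling on a greyscale image stored as a list of rows. The 8-bit pixel range, the function names, and the list-of-lists representation are assumptions made to keep the example dependency-free; a production pipeline would more likely use an array library.

```python
import random


def flip_horizontal(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Reverse the order of the rows (copies each row)."""
    return [row[:] for row in img[::-1]]


def rotate_90_clockwise(img):
    """Orthogonal rotation by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def _clip8(value):
    """Round and clip to the assumed 8-bit range 0..255."""
    return min(255, max(0, round(value)))


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [[_clip8(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def scale_contrast(img, factor):
    """Scale each pixel's deviation from the mean intensity by factor."""
    n_pixels = sum(len(row) for row in img)
    mean = sum(sum(row) for row in img) / n_pixels
    return [[_clip8(mean + factor * (p - mean)) for p in row] for row in img]


image = [[10, 20], [30, 40]]
rng = random.Random(0)  # seeded so the replicas are reproducible
replicas = [
    flip_horizontal(image),
    flip_vertical(image),
    rotate_90_clockwise(image),
    add_gaussian_noise(image, sigma=2.0, rng=rng),
    scale_contrast(image, factor=2.0),
]
```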
  • the image can be processed into a lower-resolution image or a lower-dimension image (e.g., by using one or more deconvolution algorithms).
  • processing an image or video from image data can comprise identifying, accounting for, and/or excluding one or more artifacts from the image/video, either automatically or manually by a user.
  • the artifact(s) can be fed into any of the models or classifiers, to train image processing or image analysis.
  • the artifact(s) can be accounted for when classifying the type or state of one or more cells in the image/video.
  • the artifact(s) can be excluded from any determination of the type or state of the cell(s) in the image/video.
  • the artifact(s) can be removed in silico by any of the models/classifiers disclosed herein, and any new replica or modified variant of the image/video excluding the artifact(s) can be stored in a database as disclosed herein.
  • the artifact(s) can be, for example, from debris (e.g., dead cells, dust, etc.), optical conditions during capturing the image/video of the cells (e.g., lighting variability, over-saturation, under-exposure, degradation of the light source, etc.), external factors (e.g., vibrations, misalignment of the microfluidic chip relative to the lighting or optical sensor/camera, power surges/fluctuations, etc.), and changes to the microfluidic system (e.g., deformation/shrinkage/expansion of the microfluidic channel or the microfluidic chip as a whole).
  • the artifacts can be known.
  • the artifacts can be unknown, and the models or classifiers disclosed herein can be configured to define one or more parameters of a new artifact, such that the new artifact can be identified, accounted for, and/or excluded in image processing and analysis.
  • a plurality of artifacts disclosed herein can be identified, accounted for, and/or excluded during image/video processing or analysis.
  • the plurality of artifacts can be weighted the same (e.g., determined to have the same degree of influence on the image/video processing or analysis) or can have different weights (e.g., determined to have different degrees of influence on the image/video processing or analysis). Weight assignments to the plurality of artifacts can be instructed manually by the user or determined automatically by the models/classifiers disclosed herein.
  • one or more reference images or videos of the flow channel can be stored in a database and used as a frame of reference to help identify, account for, and/or exclude any artifact.
  • the reference image(s)/video(s) can be obtained before use of the microfluidics system.
  • the reference image(s)/video(s) can be obtained during the use of the microfluidics system.
  • the reference image(s)/video(s) can be obtained periodically during the use of the microfluidics system, such as, each time the optical sensor/camera captures at least about 5, at least about 10, at least about 20, at least about 50, at least about 100, at least about 200, at least about 500, at least about 1,000, at least about 2,000, at least about 5,000, at least about 10,000, at least about 20,000, at least about 50,000, or at least about 100,000 images.
  • the reference image(s)/video(s) can be obtained periodically during the use of the microfluidics system, such as, each time the microfluidics system passes at least about 5, at least about 10, at least about 20, at least about 50, at least about 100, at least about 200, at least about 500, at least about 1,000, at least about 2,000, at least about 5,000, at least about 10,000, at least about 20,000, at least about 50,000, or at least about 100,000 cells.
  • the reference image(s)/video(s) can be obtained at landmark periods during the use of the microfluidics system, such as, when the optical sensor/camera captures at least about 5, at least about 10, at least about 20, at least about 50, at least about 100, at least about 200, at least about 500, at least about 1,000, at least 2,000, at least about 5,000, at least about 10,000, at least about 20,000, at least about 50,000, or at least about 100,000 images.
  • the reference image(s)/video(s) can be obtained at landmark periods during the use of the microfluidics system, such as, when the microfluidics system passes at least about 5, at least about 10, at least about 20, at least about 50, at least about 100, at least about 200, at least about 500, at least about 1,000, at least about 2,000, at least about 5,000, at least about 10,000, at least about 20,000, at least about 50,000, or at least about 100,000 images.
  • the method and the platform as disclosed herein can be utilized to process (e.g., modify, analyze, classify) the image data at a rate of about 1,000 images/second to about 100,000,000 images/second.
  • the rate of image data processing can be at least about 1,000 images/second.
  • the rate of image data processing can be at most about 100,000,000 images/second.
  • the rate of image data processing can be about 1,000 images/second, about 5,000 images/second, about 10,000 images/second, about 50,000 images/second, about 100,000 images/second, about 500,000 images/second, about 1,000,000 images/second, about 5,000,000 images/second, about 10,000,000 images/second, about 50,000,000 images/second, or about 100,000,000 images/second.
  • the method and the platform as disclosed herein can be utilized to process (e.g., modify, analyze, classify) the image data at a rate of about 1,000 cells/second to about 100,000,000 cells/second.
  • the rate of image data processing can be at least about 1,000 cells/second.
  • the rate of image data processing can be at most about 100,000,000 cells/second.
  • the rate of image data processing can be about 1,000 cells/second to about 5,000 cells/second, about 1,000 cells/second to about 10,000 cells/second, about 1,000 cells/second to about 50,000 cells/second, about 1,000 cells/second to about 100,000 cells/second, about 1,000 cells/second to about 500,000 cells/second, about 1,000 cells/second to about 1,000,000 cells/second, about 1,000 cells/second to about 5,000,000 cells/second, about 1,000 cells/second to about 10,000,000 cells/second, about 1,000 cells/second to about 50,000,000 cells/second, about 1,000 cells/second to about 100,000,000 cells/second, about 5,000 cells/second to about 10,000 cells/second, about 5,000 cells/second to about 50,000 cells/second, about 5,000 cells/second to about 100,000 cells/second, about 5,000 cells/second to about 500,000 cells/second, about 5,000 cells/second to about 1,000,000 cells/second, about 5,000 cells/second to about 5,000,000 cells/second, about 5,000 cells/second to about 10,000,000 cells/second, about 5,000 cells/second to about 50,000,000 cells/second, about 5,000 cells/
  • the rate of image data processing can be about 1,000 cells/second, about 5,000 cells/second, about 10,000 cells/second, about 50,000 cells/second, about 100,000 cells/second, about 500,000 cells/second, about 1,000,000 cells/second, about 5,000,000 cells/second, about 10,000,000 cells/second, about 50,000,000 cells/second, or about 100,000,000 cells/second.
  • the method and the platform as disclosed herein can be utilized to process (e.g., modify, analyze, classify) the image data at a rate of about 1,000 datapoints/second to about 100,000,000 datapoints/second.
  • The rate of image data processing can be at least about 1,000 datapoints/second.
  • the rate of image data processing can be at most about 100,000,000 datapoints/second.
  • the rate of image data processing can be about 1,000 datapoints/second to about 5,000 datapoints/second, about 1,000 datapoints/second to about 10,000 datapoints/second, about 1,000 datapoints/second to about 50,000 datapoints/second, about 1,000 datapoints/second to about 100,000 datapoints/second, about 1,000 datapoints/second to about 500,000 datapoints/second, about 1,000 datapoints/second to about 1,000,000 datapoints/second, about 1,000 datapoints/second to about 5,000,000 datapoints/second, about 1,000 datapoints/second to about 10,000,000 datapoints/second, about 1,000 datapoints/second to about 50,000,000 datapoints/second, about 1,000 datapoints/second to about 100,000,000 datapoints/second, about 5,000 datapoints/second to about 10,000 datapoints/second, about 5,000 datapoints/second to about 50,000 datapoints/second, about 5,000 datapoints/second to about 100,000 datapoints/second, about 5,000 datapoints/second to about 500,000 datapoints/second, about 5,000 datapoints/second to
  • the rate of image data processing can be about 1,000 datapoints/second, about 5,000 datapoints/second, about 10,000 datapoints/second, about 50,000 datapoints/second, about 100,000 datapoints/second, about 500,000 datapoints/second, about 1,000,000 datapoints/second, about 5,000,000 datapoints/second, about 10,000,000 datapoints/second, about 50,000,000 datapoints/second, or about 100,000,000 datapoints/second.
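To put the disclosed rates in perspective, the sketch below converts a processing rate into an average time budget per image, cell, or datapoint, and estimates how many items must be processed concurrently to sustain that rate. The concurrency estimate uses Little's law (items in flight = throughput × latency); the function names and the example latency values are hypothetical and not taken from the disclosure.

```python
def per_item_budget_ns(rate_per_second):
    """Average time available per image/cell/datapoint, in nanoseconds."""
    if rate_per_second <= 0:
        raise ValueError("rate_per_second must be positive")
    return 1_000_000_000 / rate_per_second


def min_parallel_workers(rate_per_second, latency_us):
    """Smallest number of concurrent workers that sustains the rate.

    latency_us is an assumed per-item processing latency in microseconds.
    Integer arithmetic avoids floating-point rounding in the ceiling.
    """
    if rate_per_second <= 0 or latency_us <= 0:
        raise ValueError("rate and latency must be positive")
    in_flight_scaled = rate_per_second * latency_us  # items x 1e6
    return max(1, -(-in_flight_scaled // 1_000_000))  # ceiling division


# About 1,000 items/second leaves 1 ms per item on average;
# about 100,000,000 items/second leaves only 10 ns per item.
low_end_ns = per_item_budget_ns(1_000)
high_end_ns = per_item_budget_ns(100_000_000)

# With a hypothetical 2 ms per-item latency, 10,000 items/second
# requires at least 20 items to be processed concurrently.
workers = min_parallel_workers(10_000, latency_us=2_000)
```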
  • the online crowdsourcing platform can comprise any of the databases disclosed herein.
  • the database can store a plurality of single cell images that are grouped into morphologically-distinct clusters corresponding to a plurality of cell classes (e.g., predetermined cell types or states).
  • the online crowdsourcing platform can comprise one or more models or classifiers as disclosed herein (e.g., a modeling library comprising one or more machine learning models/classifiers as disclosed herein).
  • the online crowdsourcing platform can comprise a web portal for a community of users to share contents, e.g., (1) upload, download, search, curate, annotate, or edit one or more existing images or new images into the database, (2) train or validate the one or more model(s)/classifier(s) using datasets from the database, and/or (3) upload new models into the modeling library.
  • the online crowdsourcing platform can allow users to buy, sell, share, or exchange the model(s)/classifier(s) with one another.
  • the web portal can be configured to generate incentives for the users to update the database with new annotated cell images, model(s), and/or classifier(s). Incentives can be monetary. Incentives can be additional access to the global CMA, model(s), and/or classifier(s). In some examples, the web portal can be configured to generate incentives for the users to download, use, and review (e.g., rate or leave comments) any of the annotated cell images, model(s), and/or classifier(s) from, e.g., other users.
  • a global cell morphology atlas (CMA) can be generated by collecting (i) annotated cell images, (ii) cell morphology maps or ontologies, (iii) models, and/or (iv) classifiers from the users using the web portal.
  • the global CMA can then be shared with the users via the web portal. All users can have access to the global CMA.
  • specifically defined users can have access to specifically defined portions of the global CMA.
  • cancer centers can have access to the “cancer cells” portion of the global CMA, e.g., using a subscription-based service.
  • global models or classifiers can be generated based on the annotated cell images, model(s), and/or classifiers that are collected from the users using the web portal.
  • FIG. 8A shows a schematic illustration of the cell sorting system, as disclosed herein, with a cartridge design (e.g., a microfluidic design), with further details illustrated in FIG. 8B.
  • the cell sorting system can be operatively coupled to a machine learning or artificial intelligence controller.
  • The ML/AI controller can be configured to perform any of the methods disclosed herein.
  • Such an ML/AI controller can be operatively coupled to any of the platforms disclosed herein.
  • a sample 802 is prepared and injected by a pump 804 (e.g., a syringe pump) into a cartridge 805, or flow-through device.
  • the cartridge 805 is a microfluidic device.
  • any of a number of perfusion systems can be used such as (but not limited to) gravity feeds, peristalsis, or any of a number of pressure systems.
  • the sample is prepared by fixation and staining.
  • the sample comprises live cells.
  • Examples of the pump or other suitable flow unit may be, but are not limited to, a syringe pump, a vacuum pump, an actuator (e.g., linear, pneumatic, hydraulic, etc.), a compressor, or any other suitable device to exert pressure (positive, negative, alternating thereof, etc.) to a fluid that may or may not comprise one or more particles (e.g., one or more cells to be classified, sorted, and/or analyzed).
  • the pump or other suitable flow unit may be configured to raise, compress, move, and/or transfer fluid into or away from the microfluidic channel.
  • the pump or other suitable flow unit may be configured to deliver positive pressure, alternating positive pressure and vacuum pressure, negative pressure, alternating negative pressure and vacuum pressure, and/or only vacuum pressure.
  • the cartridge of the present disclosure may comprise (or otherwise be in operable communication with) at least about 1 - e.g., at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about
  • the flow cell may comprise at most about
  • Each pump or other suitable flow unit can be in fluid communication with at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, or more sources of fluid.
  • Each flow unit may be in fluid communication with at most about 10, at most about 9, at most about 8, at most about 7, at most about 6, at most about 5, at most about 4, at most about 3, at most about 2, or at most about 1 fluid.
  • the fluid may contain the particles (e.g., cells). In another example, the fluid may be particle-free.
  • the pump or other suitable flow unit may be configured to maintain, increase, and/or decrease a flow velocity of the fluid within the microfluidic channel of the flow unit.
  • the pump or other suitable flow unit may be configured to maintain, increase, and/or decrease a flow velocity (e.g., downstream of the microfluidic channel) of the particles.
  • the pump or other suitable flow unit may be configured to accelerate or decelerate a flow velocity of the fluid within the microfluidic channel of the flow unit, thereby accelerating or decelerating a flow velocity of the particles.
  • the fluid can be liquid or gas (e.g., air, argon, nitrogen, etc.).
  • the liquid can be an aqueous solution (e.g., water, buffer, saline, etc.).
  • the liquid can be oil.
  • only one or more aqueous solutions can be directed through the microfluidic channels.
  • only one or more oils can be directed through the microfluidic channels.
  • both aqueous solution(s) and oil(s) can be directed through the microfluidic channels.
  • (i) the aqueous solution may form droplets (e.g., emulsions containing the particles) that are suspended in the oil, or (ii) the oil may form droplets (e.g., emulsions containing the particles) that are suspended in the aqueous solution.
  • any perfusion system, including but not limited to peristalsis systems and gravity feeds, appropriate to a given classification and/or sorting system can be utilized.
  • the cartridge 805 can be implemented as a fluidic device that focuses cells from the sample into a single streamline that is imaged continuously.
  • the cell line is illuminated by a light source 806 (e.g., a lamp, such as an arc lamp) and an optical system 810 that directs light onto an imaging region 838 of the cartridge 805.
  • An objective lens system 812 magnifies the cells by directing light toward the sensor of a high-speed camera system 814.
  • a 10x, 20x, 40x, 60x, 80x, 100x, or 200x objective is used to magnify the cells.
  • a 10x objective is used to magnify the cells.
  • a 20x objective is used to magnify the cells.
  • a 40x objective is used to magnify the cells.
  • a 60x objective is used to magnify the cells.
  • an 80x objective is used to magnify the cells.
  • a 100x objective is used to magnify the cells.
  • a 200x objective is used to magnify the cells.
  • a 10x to a 200x objective is used to magnify the cells, for example a 10x-20x, a 10x-40x, a 10x-60x, a 10x-80x, or a 10x-100x objective is used to magnify the cells.
  • the specific magnification utilized can vary greatly and is largely dependent upon the requirements of a given imaging system and cell types of interest.
  • one or more imaging devices can be used to capture images of the cell.
  • the imaging device is a high-speed camera.
  • the imaging device is a high-speed camera with a micro-second exposure time.
  • the exposure time is about 1 millisecond.
  • the exposure time is between about 1 millisecond (ms) and about 0.75 millisecond.
  • the exposure time is between about 1 ms and about 0.50 ms.
  • the exposure time is between about 1 ms and about 0.25 ms.
  • the exposure time is between about 0.75 ms and about 0.50 ms.
  • the exposure time is between about 0.75 ms and about 0.25 ms.
  • the exposure time is between about 0.50 ms and about 0.25 ms. In some instances, the exposure time is between about 0.25 ms and about 0.1 ms. In some instances, the exposure time is between about 0.1 ms and about 0.01 ms. In some instances, the exposure time is between about 0.1 ms and about 0.001 ms. In some instances, the exposure time is between about 0.1 ms and about 1 microsecond (µs). In some examples, the exposure time is between about 1 µs and about 0.1 µs. In some examples, the exposure time is between about 1 µs and about 0.01 µs. In some examples, the exposure time is between about 0.1 µs and about 0.01 µs.
  • the exposure time is between about 1 µs and about 0.001 µs. In some examples, the exposure time is between about 0.1 µs and about 0.001 µs. In some examples, the exposure time is between about 0.01 µs and about 0.001 µs.
  • the cartridge 805 may comprise at least about 1 - e.g., at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, or more, imaging devices (e.g., the high-speed camera system 814) on or adjacent to the imaging region 838.
  • the cartridge may comprise at most about 10 - e.g., at most about 9, at most about 8, at most about 7, at most about 6, at most about 5, at most about 4, at most about 3, at most about 2, or at most about 1 imaging device on or adjacent to the imaging region 838.
  • the cartridge 805 may comprise a plurality of imaging devices.
  • Each of the plurality of imaging devices may use light from a same light source. In another example, each of the plurality of imaging devices may use light from different light sources.
  • the plurality of imaging devices can be configured in parallel and/or in series with respect to one another.
  • the plurality of imaging devices can be configured on one or more sides (e.g., two adjacent sides or two opposite sides) of the cartridge 805.
  • the plurality of imaging devices can be configured to view the imaging region 838 along a same axis or different axes with respect to (i) a length of the cartridge 805 (e.g., a length of a straight channel of the cartridge 805) or (ii) a direction of migration of one or more particles (e.g., one or more cells) in the cartridge 805.
  • One or more imaging devices of the present disclosure can be stationary while imaging one or more cells, e.g., at the imaging region 838.
  • one or more imaging devices may move with respect to the flow channel (e.g., along the length of the flow channel, towards and/or away from the flow channel, tangentially about the circumference of the flow channel, etc.) while imaging the one or more cells.
  • the one or more imaging devices can be operatively coupled to one or more actuators, such as, for example, a stepper actuator, linear actuator, hydraulic actuator, pneumatic actuator, electric actuator, magnetic actuator, and mechanical actuator (e.g., rack and pinion, chains, etc.).
  • the cartridge 805 may comprise at least about 1 - e.g., at least about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, or more, imaging regions (e.g., the imaging region 838). In some examples, the cartridge 805 may comprise at most about 10 - e.g., at most about 9, about 8, about 7, about 6, about 5, about 4, about 3, about 2, or about 1 imaging region. In some examples, the cartridge 805 may comprise a plurality of imaging regions, and the plurality of imaging regions can be configured in parallel and/or in series with respect to one another. The plurality of imaging regions may or may not be in fluid communication with each other.
  • a first imaging region and a second imaging region can be configured in parallel, such that a first fluid that passes through the first imaging region does not pass through a second imaging region.
  • a first imaging region and a second imaging region can be configured in series, such that a first fluid that passes through the first imaging region also passes through the second imaging region.
  • the imaging device(s) (e.g., the high-speed camera) of the imaging system can comprise an electromagnetic radiation sensor (e.g., IR sensor, color sensor, etc.) that detects at least a portion of the electromagnetic radiation that is reflected by and/or transmitted from the cartridge or any content (e.g., the cell) in the cartridge.
  • the imaging device can be in operative communication with one or more sources (e.g., at least about 1, about 2, about 3, about 4, about 5, or more) of the electromagnetic radiation.
  • the electromagnetic radiation can comprise one or more wavelengths from the electromagnetic spectrum including, but not limited to, x-rays (about 0.1 nanometers (nm) to about 10.0 nm; or about 10^18 Hertz (Hz) to about 10^16 Hz), ultraviolet (UV) rays (about 10.0 nm to about 380 nm; or about 8×10^16 Hz to about 10^15 Hz), visible light (about 380 nm to about 750 nm; or about 8×10^14 Hz to about 4×10^14 Hz), infrared (IR) light (about 750 nm to about 0.1 centimeters (cm); or about 4×10^14 Hz to about 5×10^11 Hz), and microwaves (about 0.1 cm to about 100 cm; or about 10^8 Hz to about 5×10^11 Hz).
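The wavelength and frequency figures quoted for each band are related by f = c / λ, where c is the speed of light in vacuum. The short sketch below applies that relation to the visible-light limits; the function name is illustrative, and the quoted band limits in the text are approximate, so computed values agree with them only to within rounding.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact by SI definition


def wavelength_nm_to_hz(wavelength_nm):
    """Convert a vacuum wavelength in nanometers to a frequency in hertz."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength_nm must be positive")
    return SPEED_OF_LIGHT_M_PER_S / (wavelength_nm * 1e-9)


# Visible-light limits: about 380 nm and about 750 nm.
f_violet = wavelength_nm_to_hz(380)  # roughly 7.9e14 Hz
f_red = wavelength_nm_to_hz(750)     # roughly 4.0e14 Hz
```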
  • the source(s) of the electromagnetic radiation can be ambient light, and thus the cell sorting system may not have an additional source of the electromagnetic radiation.
  • the imaging device(s) can be configured to take a two-dimensional image (e.g., one or more pixels) of the cell and/or a three-dimensional image (e.g., one or more voxels) of the cell.
  • the exposure times can differ across different systems and can largely be dependent upon the requirements of a given application or the limitations of a given system such as but not limited to flow rates. Images are acquired and can be analyzed using an image analysis algorithm.
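One way to see why exposure time depends on flow rate is to estimate motion blur: a cell moving at speed v travels a distance v × t during an exposure of duration t. The sketch below is a back-of-the-envelope illustration only; the cell speed of 0.1 m/s and the 0.5 µm blur limit are hypothetical values, not figures from the disclosure.

```python
def motion_blur_um(speed_m_per_s, exposure_s):
    """Distance a cell travels during one exposure, in micrometers."""
    return speed_m_per_s * exposure_s * 1e6


def max_exposure_s(speed_m_per_s, blur_limit_um):
    """Longest exposure that keeps motion blur within blur_limit_um."""
    if speed_m_per_s <= 0:
        raise ValueError("speed_m_per_s must be positive")
    return (blur_limit_um * 1e-6) / speed_m_per_s


# Hypothetical example: a cell moving at 0.1 m/s imaged with a 1 µs
# exposure smears by about 0.1 µm; a 1 ms exposure would smear it by
# about 100 µm, which is larger than a typical cell.
blur_short = motion_blur_um(0.1, 1e-6)
blur_long = motion_blur_um(0.1, 1e-3)
limit = max_exposure_s(0.1, blur_limit_um=0.5)  # about 5 µs
```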
  • the images are acquired and analyzed post-capture. In some examples, the images are acquired and analyzed in real-time continuously. Using object tracking software, single cells can be detected and tracked while in the field of view of the camera.
  • the cartridge 805 causes the cells to rotate as they are imaged, and multiple images of each cell are provided to a computing system 816 for analysis.
  • the multiple images comprise images from a plurality of cell angles.
  • the flow rate and channel dimensions can be determined to obtain multiple images of the same cell from a plurality of different angles (i.e., a plurality of cell angles). A degree of rotation between an angle to the next angle can be uniform or non-uniform. In some examples, a full 360° view of the cell is captured. In some examples, 4 images are provided in which the cell rotates 90° between successive frames. In some examples, 8 images are provided in which the cell rotates 45° between successive frames. In some examples, 24 images are provided in which the cell rotates 15° between successive frames.
  • At least three or more images are provided in which the cell rotates at a first angle between a first frame and a second frame, and the cell rotates at a second angle between the second frame and a third frame, wherein the first and second angles are different.
  • less than the full 360° view of the cell can be captured, and a resulting plurality of images of the same cell can be sufficient to classify the cell (e.g., determine a specific type of the cell).
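For uniform rotation, the angle between successive frames is simply the angular coverage divided by the number of images, which reproduces the examples above (4 images at 90°, 8 at 45°, 24 at 15°). The helper names below are illustrative; the last function checks whether a sequence of cell angles is uniformly spaced, covering the non-uniform case as well.

```python
def uniform_step_degrees(n_images, coverage_degrees=360.0):
    """Rotation between successive frames for evenly spaced views."""
    if n_images < 1:
        raise ValueError("n_images must be at least 1")
    return coverage_degrees / n_images


def view_angles(n_images, coverage_degrees=360.0):
    """Cell angle of each frame, starting at 0 degrees."""
    step = uniform_step_degrees(n_images, coverage_degrees)
    return [i * step for i in range(n_images)]


def steps_are_uniform(angles, tol=1e-9):
    """True if every successive pair of angles differs by the same step."""
    steps = [b - a for a, b in zip(angles, angles[1:])]
    return all(abs(s - steps[0]) <= tol for s in steps)


four_views = view_angles(4)            # [0.0, 90.0, 180.0, 270.0]
partial = view_angles(3, 180.0)        # less than a full 360° view
non_uniform = [0.0, 30.0, 90.0]        # first and second angles differ
```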
  • the cell can have a plurality of sides.
  • the plurality of sides of the cell can be defined with respect to a direction of the transport (flow) of the cell through the channel.
  • the cell can comprise a top side, a bottom side that is opposite the top side, a front side (e.g., the side towards the direction of the flow of the cell), a rear side opposite the front side, a left side, and/or a right side opposite the left side.
  • the image of the cell can comprise a plurality of images captured from the plurality of angles, wherein the plurality of images comprise: (1) an image captured from the top side of the cell, (2) an image captured from the bottom side of the cell, (3) an image captured from the front side of the cell, (4) an image captured from the rear side of the cell, (5) an image captured from the left side of the cell, and/or (6) an image captured from the right side of the cell.
  • a two-dimensional “hologram” of a cell can be generated by superimposing the multiple images of the individual cell.
  • the “hologram” can be analyzed to automatically classify characteristics of the cell based upon features including but not limited to the morphological features of the cell.
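A minimal sketch of superimposing multiple images of one cell is shown below. It assumes, purely for illustration, that superimposition is a pixel-wise mean of equally sized, already aligned greyscale frames; the disclosure does not specify the combination rule, and the function name and list-of-rows image representation are likewise assumptions.

```python
def superimpose(images):
    """Combine equally sized greyscale frames by pixel-wise averaging.

    Assumes the frames are already registered to one another; the mean
    is one possible combination rule, chosen here for simplicity.
    """
    if not images:
        raise ValueError("at least one image is required")
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must have the same shape")
    n = len(images)
    return [
        [sum(img[r][c] for img in images) / n for c in range(width)]
        for r in range(height)
    ]


# Two 1x2 frames of the same cell at different cell angles.
frames = [[[0, 2]], [[4, 6]]]
hologram = superimpose(frames)
```

The resulting two-dimensional array could then be passed to a classifier in place of, or alongside, the individual frames.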
  • At least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 images are captured for each cell.
  • about 5 or more images are captured for each cell.
  • from about 5 to about 10 images are captured for each cell.
  • 10 or more images are captured for each cell.
  • from about 10 to about 20 images are captured for each cell.
  • about 20 or more images are captured for each cell.
  • from about 20 to about 50 images are captured for each cell.
  • about 50 or more images are captured for each cell.
  • from about 50 to about 100 images are captured for each cell.
  • At least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, at least about 15, at least about 20, at least about 30, at least about 40, at least about 50, or more images may be captured for each cell at a plurality of different angles.
  • at most 50, at most about 40, at most about 30, at most about 20, at most about 15, at most about 10, at most about 9, at most about 8, at most about 7, at most about 6, at most about 5, at most about 4, at most about 3, or at most about 2 images can be captured for each cell at a plurality of different angles.
  • the imaging device is moved so as to capture multiple images of the cell from a plurality of angles.
  • the images are captured at an angle between 0 and 90 degrees to the horizontal axis.
  • the images are captured at an angle between 90 and 180 degrees to the horizontal axis.
  • the images are captured at an angle between 180 and 270 degrees to the horizontal axis.
  • the images are captured at an angle between 270 and 360 degrees to the horizontal axis.
  • multiple imaging devices (e.g., multiple cameras) are used, wherein each device captures an image of the cell from a specific cell angle.
  • At least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 cameras are used. In some examples, more than about 10 cameras are used, wherein each camera images the cell from a specific cell angle.
  • the cartridge has different regions to focus, order, and/or rotate cells. Although the focusing regions, ordering regions, and cell rotating regions are discussed as affecting the sample in a specific sequence, a person having ordinary skill in the art would appreciate that the various regions can be arranged differently, where the focusing, ordering, and/or rotating of the cells in the sample can be performed in any order. Regions within a microfluidic device implemented in accordance with an example of the disclosure are illustrated in FIG. 8B. Cartridge 805 may include a filtration region 830 to prevent channel clogging by aggregates/debris or dust particles.
  • Cells pass through a focusing region 832 that focuses the cells into a single streamline of cells that are then spaced by an ordering region 834.
  • the focusing region utilizes “inertial focusing” to form the single streamline of cells.
  • the focusing region utilizes “hydrodynamic focusing” to focus the cells into the single streamline of cells.
  • rotation can be imparted upon the cells by a rotation region 836.
  • the optionally spinning cells can then pass through an imaging region 838 in which the cells are illuminated for imaging prior to exiting the cartridge.
  • the rotation region 836 can be a part (e.g., a beginning portion, a middle portion, and/or an end portion with respect to a migration of a cell within the cartridge) of the imaging region 838.
  • the imaging region 838 can be a part of the rotation region 836.
  • a single cell is imaged in a field of view of the imaging device, e.g. camera.
  • multiple cells are imaged in the same field of view of the imaging device.
  • at least about 1, at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 cells are imaged in the same field of view of the imaging device.
  • up to about 100 cells are imaged in the same field of view of the imaging device.
  • about 10 to about 100 cells are imaged in the field of view, for example, about 10 to 20 cells, about 10 to about 30 cells, about 10 to about 40 cells, about 10 to about 50 cells, about 10 to about 60 cells, about 10 to about 80 cells, about 10 to about 90 cells, about 20 to about 30 cells, about 20 to about 40 cells, about 20 to about 50 cells, about 20 to about 60 cells, about 20 to about 70 cells, about 20 to about 80 cells, about 20 to about 90 cells, about 30 to about 40 cells, about 40 to about 50 cells, about 40 to about 60 cells, about 40 to about 70 cells, about 40 to about 80 cells, about 40 to about 90 cells, about 50 to about 60 cells, about 50 to about 70 cells, about 50 to about 80 cells, about 50 to about 90 cells, about 60 to about 70 cells, about 60 to about 80 cells, about 60 to about 90 cells, about 70 to about 80 cells, about 70 to about 90 cells, or about 90 to about 100 cells are imaged in the same field of view of the imaging device.
  • only a single cell can be allowed to be transported across a cross-section of the flow channel perpendicular to the axis of the flow channel.
  • a plurality of cells (e.g., at least about 2, about 3, about 4, about 5, or more cells; at most about 5, about 4, about 3, about 2, or about 1 cell)
  • the imaging device or the processor operatively linked to the imaging device
  • the imaging system can include, among other things, a camera, an objective lens system and a light source.
  • cartridges similar to those described above can be fabricated using standard 2D microfluidic fabrication techniques, requiring minimal fabrication time and cost.
  • classification and/or sorting systems can be implemented in any of a variety of ways appropriate to the requirements of specific applications in accordance with various examples of the disclosure. Specific elements of microfluidic devices that can be utilized in classification and/or sorting systems in accordance with some examples of the disclosure are discussed further below.
  • the microfluidic system can comprise a microfluidic chip (e.g., comprising one or more microfluidic channels for flowing cells) operatively coupled to an imaging device (e.g., one or more cameras).
  • a microfluidic device can comprise the imaging device, and the chip can be inserted into the device, to align the imaging device to an imaging region of a channel of the chip.
  • the chip can comprise one or more positioning identifiers (e.g., pattern(s), such as numbers, letters, symbols, or other drawings) that can be imaged to determine the positioning of the chip (and thus the imaging region of the channel of the chip) relative to the device as a whole or relative to the imaging device.
  • image-based alignment e.g., auto-alignment
  • one or more images of the chip can be captured upon its coupling to the device, and the image(s) can be analyzed by any of the methods disclosed herein (e.g., using any model or classifier disclosed herein) to determine a degree or score of chip alignment.
  • the positioning identifier(s) can be a “guide” to navigate the stage holding the chip within the device to move within the device towards a correct position relative to the imaging unit.
  • rule-based image processing can be used to navigate the stage to a precise range of location or a precise location relative to the image unit.
  • machine learning/artificial intelligence methods as disclosed herein can be modified or trained to identify the pattern on the chip and navigate the stage to the precise imaging location for the image unit, to increase resilience.
  • machine learning/artificial intelligence methods as disclosed herein can be modified or trained to implement reinforcement-learning-based alignment and focusing.
  • the alignment process for the chip to the instrument or the image unit can involve moving the stage holding the chip in, e.g., either X or Y axis and/or moving the imaging plane on the Z axis.
  • (i) the chip can start at an X, Y, and Z position (e.g., randomly selected), (ii) based on one or more image(s) of the chip and/or the stage holding the chip, a model can determine a movement vector for the stage and a movement for the imaging plane, (iii) depending on whether such movement vector may take the chip closer to the optimum X, Y, and Z position relative to the image unit, an error term can be determined as a loss for the model, and (iv) the magnitude of the error can be either constant or proportional to how far the current X, Y, and Z position is from an optimal X, Y, and Z position (e.g., one that can be predetermined).
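The error term in steps (iii) and (iv) can be sketched as below. This is a minimal illustration, not the disclosed implementation: it assumes positions and movements are plain (x, y, z) tuples in arbitrary units, and the function name `alignment_loss`, the `proportional` flag, and the `constant_penalty` value are all hypothetical.

```python
import math

def alignment_loss(current, move, optimum, proportional=True, constant_penalty=1.0):
    """Error term for one proposed stage / imaging-plane movement.

    current, move, optimum: (x, y, z) tuples (illustrative units).
    Returns 0.0 when the move brings the chip closer to the optimum.
    """
    proposed = tuple(c + m for c, m in zip(current, move))
    dist_before = math.dist(current, optimum)
    dist_after = math.dist(proposed, optimum)
    if dist_after < dist_before or dist_after == 0.0:
        return 0.0  # the movement helps (or the chip is already at the optimum)
    # Otherwise penalize: either a constant error, or an error proportional
    # to how far the current position is from the optimum.
    return dist_before if proportional else constant_penalty
```

For example, with the optimum at (2, 0, 0) and the chip at the origin, a move of (-1, 0, 0) is penalized by the current distance (2.0) in the proportional variant, or by the fixed penalty in the constant variant.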
  • One or more flow channels of the cartridge of the present disclosure may have various shapes and sizes.
  • at least a portion of the flow channel e.g., the focusing region 832, the ordering region 834, the rotation region 836, the imaging region 838, connecting region therebetween, etc.
  • the system of the present disclosure comprises straight channels with rectangular or square cross-sections.
  • the system of the present disclosure comprises straight channels with round cross-sections.
  • the system comprises straight channels with half-ellipsoid cross-sections.
  • the system comprises spiral channels. In some examples, the system comprises round channels with rectangular cross-sections. In some examples, the system comprises round channels with round cross-sections. In some examples, the system comprises round channels with half-ellipsoid cross-sections. In some examples, the system comprises channels that are expanding and contracting in width with rectangular cross-sections. In some examples, the system comprises channels that are expanding and contracting in width with round cross-sections. In some examples, the system comprises channels that are expanding and contracting in width with half-ellipsoid cross-sections.
  • the flow channel can comprise one or more walls that are formed to focus one or more cells into a streamline.
  • the focusing region receives a flow of randomly arranged cells using an upstream section.
  • the cells flow into a region of contracted and expanded sections in which the randomly arranged cells are focused into a single streamline of cells.
  • the focusing can be driven by the action of inertial lift forces (wall effect and shear gradient forces) acting on cells.
  • the cell sorting system can be configured to focus the cell at a width and/or a height within the flow channel along an axis of the flow channel.
  • the cell can be focused to a center or off the center of the cross-section of the flow channel.
  • the cell can be focused to a side (e.g., a wall) of the cross-section of the flow channel.
  • a focused position of the cell within the cross-section of the channel can be uniform or non-uniform as the cell is transported through the channel.
  • Architecture of the microfluidic channels of the cartridge of the present disclosure can be controlled (e.g., modified, optimized, etc.) to modulate cell flow along the microfluidic channels.
  • Examples of the cell flow may include (i) cell focusing (e.g., into a single streamline) and (ii) rotation of the one or more cells as the cell(s) are migrating (e.g., within the single streamline) down the length of the microfluidic channels.
  • microfluidic channels can be configured to impart rotation on ordered cells in accordance with a number of examples of the disclosure.
  • the plurality of buffers can be co-flown at a same position along the length of the cell rotation region, or sequentially at different positions along the length of the cell rotation region. In some examples, the plurality of buffers can be the same or different.
  • the cell rotation region of the microfluidic channel is fabricated using a two-layer fabrication process so that the axis of rotation is perpendicular to the axis of cell downstream migration and parallel to cell lateral migration.
  • Cells can be imaged in at least a portion of the cell rotating region, while the cells are tumbling and/or rotating as they migrate downstream.
  • the cells can be imaged in an imaging region that is adjacent to or downstream of the cell rotating region.
  • the cells can be flowing in a single streamline within a flow channel, and the cells can be imaged as the cells are rotating within the single streamline.
  • a rotational speed of the cells can be constant or varied along the length of the imaging region.
  • the design comprises hydrodynamic focusing with two inlets, wherein only one side flow channel is used and cells are focused near the channel wall.
  • the hydrodynamic focusing comprises side flows that do not contain any cells and a middle inlet that contains cells. The ratio of the flow rate on the side channel to the flow rate on the main channel determines the width of cell focusing region.
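How the side-to-main flow-rate ratio sets the focused width can be illustrated with a first-order estimate. The sketch below assumes a uniform velocity profile across the channel, so the cell-carrying stream occupies a fraction of the width equal to its share of the total flow. That simplification, the function name `focused_width`, and its parameter names are assumptions made for illustration, not values or formulas from the disclosure.

```python
def focused_width(channel_width_um, sample_flow, side_flow_total):
    """First-order estimate of the focused sample-stream width.

    Assumes a uniform velocity profile, so width scales with the sample
    stream's share of the total volumetric flow. Flow rates may be in any
    unit as long as both use the same one.
    """
    if sample_flow <= 0 or side_flow_total < 0:
        raise ValueError("sample flow must be positive and side flow non-negative")
    return channel_width_um * sample_flow / (sample_flow + side_flow_total)
```

Under these assumptions, a 100 µm wide channel with a side-to-sample flow ratio of 9:1 gives a focused stream about 10 µm wide; raising the side flow narrows the stream further.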
  • the design is a combination of the above. In all examples, the design is integrable with the bifurcation and sorting mechanisms disclosed herein.
  • the hydrodynamic-based z focusing system is used in conjunction with inertia-based z focusing.
  • a variety of techniques can be utilized to classify images of cells captured by classification and/or sorting systems in accordance with various examples of the disclosure.
  • the image captures are saved for future analysis/classification either manually or by image analysis software. Any suitable image analysis software can be used for image analysis.
  • image analysis is performed using OpenCV.
  • analysis and classification is performed in real time.
  • the system and methods of the present disclosure comprise collecting a plurality of images of objects in the flow.
  • the plurality of images comprises at least 20 images of cells.
  • the plurality of images comprises at least about 19, about 18, about 17, about 16, about 15, about 14, about 13, about 12, about 11, about 10, about 9, about 8, about 7, about 6, about 5, about 4, about 3, or about 2 images of cells.
  • the plurality of images comprises images from multiple cell angles.
  • the plurality of images, comprising images from multiple cell angles, helps derive extra features from the particle which would be hidden if the particle were imaged from a single point of view.
  • the systems and methods of the disclosure comprise imaging a single particle in a particular field of view of the camera.
  • the same instrument that performs imaging operations can also perform sorting operations.
  • the system and methods of the present disclosure image multiple particles in the same field of view of camera. Imaging multiple particles in the same field of view of the camera can provide additional advantages, for example it will increase the throughput of the system by batching the data collection and transmission of multiple particles.
  • At least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, at least about 20, at least about 30, at least about 40, at least about 50, at least about 60, at least about 70, at least about 80, at least about 90, at least about 100, or more particles are imaged in the same field of view of the camera. In some instances, about 100 to about 200 particles are imaged in the same field of view of the camera.
  • At most about 100, at most about 90, at most about 80, at most about 70, at most about 60, at most about 50, at most about 40, at most about 30, at most about 20, at most about 10, at most about 9, at most about 8, at most about 7, at most about 6, at most about 5, at most about 4, at most about 3, or at most about 2 particles are imaged in the same field of view of the camera.
  • the number of the particles (e.g., cells) that are imaged in the same field of view may not be changed throughout the operation of the cartridge.
  • the number of the particles (e.g., cells) that are imaged in the same field of view can be changed in real-time throughout the operation of the cartridge, e.g., to increase speed of the classification and/or sorting process without negatively affecting quality or accuracy of the classification and/or sorting process.
  • the imaging region may be downstream of the focusing region and the ordering region.
  • the imaging region may not be part of the focusing region and the ordering region.
  • the focusing region may not comprise or be operatively coupled to any imaging device that is configured to capture one or more images to be used for particle analysis (e.g., cell classification).
  • the systems and the methods of the present disclosure actively sort a stream of particles.
  • sort or sorting refers to physically separating particles, e.g., cells, with one or more desired characteristics.
  • the desired characteristic(s) can comprise a feature of the cell(s) analyzed and/or obtained from the image(s) of the cell.
  • Examples of the morphometric feature of the cell(s) can comprise a size, shape, volume, electromagnetic radiation absorbance and/or transmittance (e.g., fluorescence intensity, luminescence intensity, etc.), or viability (e.g., when live cells are used).
  • the flow channel can branch into a plurality of channels, and the cell sorting system can be configured to sort the cell by directing the cell to a selected channel of the plurality of channels based on the analyzed image of the cell.
  • the analyzed image can be indicative of one or more features of the cell, wherein the feature(s) are used as parameters of cell sorting.
  • one or more channels of the plurality of channels can have a plurality of sub-channels, and the plurality of sub-channels can be used to further sort the cells that have been sorted once.
  • Cell sorting may comprise isolating one or more target cells from a population of cells.
  • the cell sorting accuracy of the cartridge provided herein can be at least about 80 %, at least about 81 %, at least about 82 %, at least about 83 %, at least about 84 %, at least about 85 %, at least about 86 %, at least about 87 %, at least about 88 %, at least about 89 %, at least about 90 %, at least about 91 %, at least about 92 %, at least about 93 %, at least about 94 %, at least about 95 %, at least about 96 %, at least about 97 %, at least about 98 %, at least about 99 %, or more (e.g., about 99.9% or about 100%).
  • the cell sorting accuracy of the cartridge provided herein may be at most about 100 %, at most about 99 %, at most about 98 %, at most about 97 %, at most about 96 %, at most about 95 %, at most about 94 %, at most about 93 %, at most about 92 %, at most about 91 %, at most about 90 %, at most about 89 %, at most about 88 %, at most about 87 %, at most about 86 %, at most about 85 %, at most about 84 %, at most about 83 %, at most about 82 %, at most about 81 %, or at most about 80 %, or less.
  • cell sorting may be performed at a rate of at least about 1 cell/second, at least about 5 cells/second, at least about 10 cells/second, at least about 50 cells/second, at least about 100 cells/second, at least about 500 cells/second, at least about 1,000 cells/second, at least about 5,000 cells/second, at least about 10,000 cells/second, at least about 50,000 cells/second, or more.
  • cell sorting may be performed at a rate of at most about 50,000 cells/second, at most about 10,000 cells/second, at most about 5,000 cells/second, at most about 1,000 cells/second, at most about 500 cells/second, at most about 100 cells/second, at most about 50 cells/second, at most about 10 cells/second, at most about 5 cells/second, or at most about 1 cell/second, or less.
  • the systems and methods disclosed herein use an active sorting mechanism.
  • the active sorting is independent from analysis and decision making platforms and methods.
  • the sorting is performed by a sorter, which receives a signal from the decision making unit (e.g. a classifier), or any other external unit, and then sorts cells as they arrive at the bifurcation.
  • the term bifurcation as used herein refers to the termination of the flow channel into two or more channels, such that cells with the one or more desired characteristics are sorted or directed towards one of the two or more channels and cells without the one or more desired characteristics are directed towards the remaining channels.
  • the flow channel terminates into at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, at least about 10, or more channels. In some examples, the flow channel terminates into at most about 10, at most about 9, at most about 8, at most about 7, at most about 6, at most about 5, at most about 4, at most about 3, or at most about 2 channels. In some examples, the flow channel terminates in two channels and cells with one or more desired characteristics are directed towards one of the two channels (the positive channel), while cells without the one or more desired characteristics are directed towards the other channel (the negative channel).
  • the flow channel terminates in three channels and cells with a first desired characteristic are directed to one of the three channels, cells with a second desired characteristic are directed to another of the three channels, and cells without the first desired characteristic and the second desired characteristic are directed to the remaining of the three channels.
  • the sorting is performed by a sorter.
  • the sorter may function by predicting the exact time at which the particle will arrive at the bifurcation. To predict the time of particle arrival, the sorter can use any applicable method.
  • the sorter predicts the time of arrival of the particle by using (i) velocity of particles (e.g., downstream velocity of a particle along the length of the microfluidic channel) that are upstream of the bifurcation and (ii) the distance between velocity measurement/calculation location and the bifurcation.
  • the sorter predicts the time of arrival of the particles by using a constant delay time as an input.
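The two arrival-time strategies described above (velocity plus distance, or a constant delay) can be sketched as follows. This is an illustrative sketch only: the function names, the choice of micrometers and seconds, and the two-point velocity estimate are assumptions, and the disclosure states that any applicable prediction method can be used.

```python
def estimate_velocity(positions_um, timestamps_s):
    """Average downstream velocity from the first and last tracked samples."""
    if len(positions_um) != len(timestamps_s) or len(positions_um) < 2:
        raise ValueError("need at least two matched position/time samples")
    elapsed = timestamps_s[-1] - timestamps_s[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    return (positions_um[-1] - positions_um[0]) / elapsed

def predict_arrival_time(measured_at_s, distance_um,
                         velocity_um_per_s=None, constant_delay_s=None):
    """Predicted time at which the particle reaches the bifurcation.

    distance_um is the distance from the velocity-measurement location to
    the bifurcation. If constant_delay_s is given, it is used directly.
    """
    if constant_delay_s is not None:
        return measured_at_s + constant_delay_s
    if velocity_um_per_s is None or velocity_um_per_s <= 0:
        raise ValueError("a positive velocity is required without a constant delay")
    return measured_at_s + distance_um / velocity_um_per_s
```

A sorter could, for instance, call `estimate_velocity` on positions tracked across a few frames and then schedule valve actuation for the time returned by `predict_arrival_time`.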
  • the sorter may measure the velocity of a particle (e.g., a cell) at least about 1, at least about 2, at least about 3, at least about 4, or at least about 5, or more times. In some examples, prior to the cell’s arrival at the bifurcation, the sorter may measure the velocity of the particle at most about 5, at most about 4, at most about 3, at most about 2, or at most about 1 time. In some examples, the sorter may use at least about 1, at least about 2, at least about 3, at least about 4, or at least about 5, or more sensors. In some examples, the sorter may use at most about 5, at most about 4, at most about 3, at most about 2, or at most about 1 sensor.
  • Examples of the sensor(s) include an imaging device (e.g., a camera such as a high-speed camera), a one- or multi-point light (e.g., laser) detector, etc.
  • the sorter may use any one of the imaging devices (e.g., the high-speed camera system 814) disposed at or adjacent to the imaging region 838.
  • the same imaging device(s) can be used to capture one or more images of a cell as the cell is rotating and migrating within the channel, and the one or more images can be analyzed to (i) classify the cell and (ii) measure a rotational and/or lateral velocity of the cell within the channel and predict the cell’s arrival time at the bifurcation.
  • the sorter may use one or more sensors that are different than the imaging devices of the imaging region 838.
  • the sorter may measure the velocity of the particle (i) upstream of the imaging region 838, (ii) at the imaging region 838, and/or (iii) downstream of the imaging region 838.
  • the sorter may comprise or be operatively coupled to a processor, such as a computer processor.
  • Such processor can be the processor 816 that is operatively coupled to the imaging device 814 or a different processor.
  • the processor can be configured to calculate the velocity of a particle (rotational and/or downstream velocity of the particle) and predict the time of arrival of the particle at the bifurcation.
  • the processor can be operatively coupled to one or more valves of the bifurcation.
  • the processor can be configured to direct the valve(s) to open and close any channel in fluid communication with the bifurcation.
  • the processor can be configured to predict and measure when operation of the valve(s) (e.g., opening or closing) is completed.
  • the sorter may comprise a self-included unit (e.g., comprising the sensors, such as the imaging device(s)) which is capable of (i) predicting the time of arrival of the particles and/or (ii) detecting the particle as it arrives at the bifurcation.
  • the order in which the particles arrive at the bifurcation, as detected by the self-included unit, can be matched to the order of the received signal from the decision making unit (e.g., a classifier).
  • controlled particles are used to align and update the order as necessary.
  • the decision making unit may classify a first cell, a second cell, and a third cell, respectively, and the sorter may confirm that the first cell, the second cell, and the third cell are sorted, respectively in the same order. If the order is confirmed, the classification and sorting mechanisms (or deep learning algorithms) may remain the same. If the order is different between the classifying and the sorting, then the classification and/or sorting mechanisms (or deep learning algorithms) can be updated or optimized, either manually or automatically.
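The order check between the decision making unit and the sorter can be sketched as a comparison of two sequences. The sketch assumes each particle is reduced to a simple label (for example "cell" or "bead", where beads are the controlled particles) and that the sorter can report the same labels on arrival. The function name `first_order_mismatch` and this label encoding are hypothetical.

```python
def first_order_mismatch(classified, detected):
    """Index of the first position where the two orders disagree.

    classified: labels in the order emitted by the classifier.
    detected:   labels in the order observed at the bifurcation.
    Returns None when the orders match exactly.
    """
    for index, (expected, observed) in enumerate(zip(classified, detected)):
        if expected != observed:
            return index
    if len(classified) != len(detected):
        # One sequence ended early: the mismatch starts where it ran out.
        return min(len(classified), len(detected))
    return None
```

A result of `None` corresponds to the confirmed-order case, in which the classification and sorting mechanisms remain unchanged; any other result marks the point from which realignment or an update would be considered.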
  • the controlled particles can be cells (e.g., live or dead cells). In some examples, the controlled particles can be special calibration beads (e.g., plastic beads, metallic beads, magnetic beads, etc.).
  • the calibration beads used are polystyrene beads with size ranging from about 1 µm to about 50 µm. In some examples the calibration beads used are polystyrene beads with size of at least about 1 µm. In some examples the calibration beads used are polystyrene beads with size of at most about 50 µm.
  • the calibration beads used are polystyrene beads with size ranging from about 1 µm to about 3 µm, about 1 µm to about 5 µm, about 1 µm to about 6 µm, about 1 µm to about 10 µm, about 1 µm to about 15 µm, about 1 µm to about 20 µm, about 1 µm to about 25 µm, about 1 µm to about 30 µm, about 1 µm to about 35 µm, about 1 µm to about 40 µm, about 1 µm to about 50 µm, about 3 µm to about 5 µm, about 3 µm to about 6 µm, about 3 µm to about 10 µm, about 3 µm to about 15 µm, about 3 µm to about 20 µm, about 3 µm to about 25 µm, about 3 µm to about 30 µm, about 3 µm to about 35 µm, about 3 µm to about 35
  • the calibration beads used are polystyrene beads with size of about 1 µm, about 3 µm, about 5 µm, about 6 µm, about 10 µm, about 15 µm, about 20 µm, about 25 µm, about 30 µm, about 35 µm, about 40 µm, or about 50 µm.
  • the sorter may modify any operation (e.g., cell focusing, cell rotation, controlling cell velocity, cell classification algorithms, valve actuation processes, etc.) of the cartridge.
  • the validation by the sorter can be used for closed-loop and real-time update of any operation of the cartridge.
  • the systems, methods, and platforms disclosed herein can dynamically adjust a delay time (e.g., a constant delay time) based on imaging of the cell(s) or based on tracking of the cell(s) with light (e.g., laser).
  • the delay time e.g., time at which the cells arrive at the bifurcation
  • a feedback loop can be designed that can constantly read such changes and adjust the delay time accordingly.
  • the delay time can be adjusted for each cell/particle.
  • the delay time can be calculated separately for each individual cell, based on, e.g., its velocity, lateral position in the channel, and/or time of arrival at specific locations along the channel (e.g., using tracking based on lasers or other methods).
  • the calculated delay time can then be applied to the individual cell/particle (e.g., if the cell is a positive cell or a target cell, the sorting can be performed according to its specific delay time or a predetermined delay time).
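A feedback loop that keeps adjusting the delay time can be sketched with an exponential moving average over observed delays. The smoothing approach, the class name `DelayTracker`, and the `smoothing` parameter are assumptions chosen for illustration; the disclosure does not specify how the adjustment is computed.

```python
class DelayTracker:
    """Running estimate of the delay between imaging and bifurcation arrival."""

    def __init__(self, initial_delay_s, smoothing=0.2):
        if not 0.0 < smoothing <= 1.0:
            raise ValueError("smoothing must be in (0, 1]")
        self.delay_s = initial_delay_s
        self.smoothing = smoothing

    def update(self, observed_delay_s):
        """Move the estimate a fraction of the way toward the latest observation."""
        self.delay_s += self.smoothing * (observed_delay_s - self.delay_s)
        return self.delay_s
```

Each observed delay (for example, from laser-based tracking of a cell) nudges the estimate, so gradual drifts in flow conditions are followed without reacting fully to a single noisy measurement. A per-cell variant would instead compute the delay from that cell's own velocity and position.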
  • the sorters used in the systems and methods disclosed herein are self-learning cell sorting systems or intelligent cell sorting systems, as disclosed herein.
  • sorting systems can continuously learn based on the outcome of sorting. For example, a sample of cells is sorted, the sorted cells are analyzed, and the results of this analysis are fed back to the classifier. In some examples, the cells that are sorted as “positive” (i.e., target cells or cells of interest) can be analyzed and validated. In some examples, the cells that are sorted as “negative” (i.e., non-target cells or cells not of interest) can be analyzed and validated. In some examples, both positive and negative cells can be validated. Such validation of sorted cells (e.g., based on secondary imaging and classification) can be used for closed-loop and real-time update of the primary cell classification algorithms.
  • the systems and methods of the present disclosure comprise one or more reservoirs designed to collect the particles after the particles have been sorted.
  • the number of cells to be sorted is about 1 cell to about 1,000,000 cells. In some examples, the number of cells to be sorted is at least about 1 cell. In some examples, the number of cells to be sorted is at most about 1,000,000 cells.
  • the number of cells to be sorted is about 1 cell to about 100 cells, about 1 cell to about 500 cells, about 1 cell to about 1,000 cells, about 1 cell to about 5,000 cells, about 1 cell to about 10,000 cells, about 1 cell to about 50,000 cells, about 1 cell to about 100,000 cells, about 1 cell to about 500,000 cells, about 1 cell to about 1,000,000 cells, about 100 cells to about 500 cells, about 100 cells to about 1,000 cells, about 100 cells to about 5,000 cells, about 100 cells to about 10,000 cells, about 100 cells to about 50,000 cells, about 100 cells to about 100,000 cells, about 100 cells to about 500,000 cells, about 100 cells to about 1,000,000 cells, about 500 cells to about 1,000 cells, about 500 cells to about 5,000 cells, about 500 cells to about 10,000 cells, about 500 cells to about 50,000 cells, about 500 cells to about 100,000 cells, about 500 cells to about 500,000 cells, about 500 cells to about 1,000,000 cells, about 1,000 cells to about 5,000 cells, about 1,000 cells to about 10,000 cells, about 1,000 cells to about 50,000 cells, about 1,000 cells to about 100,000 cells, about 1,000 cells to about
  • system and methods of the present disclosure comprise a combination of techniques, wherein a graphics processing unit (GPU) and a digital signal processor (DSP) are used to run artificial intelligence (AI) algorithms and apply classification results in real time to the system.
  • system and methods of the present disclosure comprise a hybrid method for real-time cell sorting.
  • the system and methods of the present disclosure comprise a feedback loop (e.g., an automatic feedback loop).
  • the system and methods can be configured to (i) monitor the vital signals and (ii) finetune one or more parameters of the system and methods based on the signals being read.
  • a processor (e.g., an ML/AI processor as disclosed herein)
  • target values for one or more selected parameters (e.g., flow rate, cell rate, etc.)
  • other signals that reflect (e.g., automatically reflect) the quality of the run can be utilized in the feedback loop.
  • the feedback loop can receive (e.g., in real-time) values of the parameters/signals disclosed herein and, based on the predetermined target values and/or one or more general mandates (e.g., the fewer the out-of-focus cells, the better), the feedback loop can facilitate adjustments (e.g., adjustments to pressure systems, illumination, stage, etc.).
  • the feedback loop can be designed to monitor and/or handle degenerate scenarios, in which the microfluidic system is not responsive or malfunctioning (e.g., outputting a value read that is out of range of acceptable reads).
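One step of such a feedback loop, including the degenerate case of an out-of-range reading, can be sketched as below. The proportional correction, the function name `feedback_step`, the `gain` value, and the "ok"/"fault" status strings are all illustrative assumptions rather than details from the disclosure.

```python
def feedback_step(reading, target, valid_range, gain=0.5):
    """Compute one adjustment for a monitored parameter (e.g., flow rate).

    Returns (adjustment, status). An out-of-range or missing reading is
    treated as a degenerate scenario: no adjustment is made and the status
    is "fault" so that a supervising routine can intervene.
    """
    low, high = valid_range
    if reading is None or not (low <= reading <= high):
        return 0.0, "fault"
    # Proportional correction toward the predetermined target value.
    return gain * (target - reading), "ok"
```

The returned adjustment would then be mapped onto an actuator such as a pressure controller, illumination setting, or stage position.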
  • the system and methods of the present disclosure can adjust a cell classification threshold based on expected true positive rate for a sample type.
  • the expected true positive rate can come from statistics gathered in one or more previous runs from the same or other patients with similar conditions. Such an approach can help neutralize run-to-run variations (e.g., illumination, chip fabrication variation, etc.) that would impact imaging and hence any inference therefrom.
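One way to adjust a classification threshold from an expected rate is to pick the score cutoff that flags the expected fraction of cells in the current run. The sketch below interprets the expected rate as the expected fraction of positive cells for the sample type, which is an assumption. The function name and the rank-based cutoff are likewise hypothetical.

```python
def threshold_for_expected_rate(scores, expected_positive_fraction):
    """Score cutoff such that roughly the expected fraction of cells is flagged.

    scores: per-cell classifier scores from the current run (higher = more
    likely positive). Cells with score >= the returned value are positive.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 < expected_positive_fraction <= 1.0:
        raise ValueError("expected_positive_fraction must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    count = max(1, round(expected_positive_fraction * len(ranked)))
    return ranked[count - 1]
```

Because the cutoff is derived from the ranking of scores within the run, a uniform shift in scores caused by illumination or chip variation does not change which cells are selected.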
  • the particles (e.g., cells) analyzed by the systems and methods disclosed herein are comprised in a sample.
  • the sample can be a biological sample obtained from a subject (e.g., a human or any animal).
  • an animal can be of any applicable type, including, but not limited to, mammals or non-mammals.
  • the animal can be a veterinary animal, a livestock animal, a pet, etc.
  • the animal can be a laboratory animal specifically selected to have certain characteristics similar to a human (e.g., rat, dog, pig, monkey, or the like).
  • the subject can be any applicable human patient, for example.
  • the biological sample comprises a biopsy sample from a subject.
  • the biological sample comprises a tissue sample from a subject.
  • the biological sample comprises liquid biopsy from a subject.
  • the biological sample can be a solid biological sample, e.g., a tumor sample.
  • a sample from a subject can comprise at least about 1%, at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or at least about 100% tumor cells from a tumor.
  • the sample can be a liquid biological sample.
  • the liquid biological sample can be a blood sample (e.g., whole blood, plasma, or serum). A whole blood sample can be subjected to separation of non-cellular components (e.g., plasma, serum) and cellular components by use of a Ficoll reagent.
  • the liquid biological sample can be a urine sample.
  • the liquid biological sample can be a perilymph sample.
  • the liquid biological sample can be a fecal sample.
  • the liquid biological sample can be saliva.
  • the liquid biological sample can be semen.
  • the liquid biological sample can be amniotic fluid.
  • the liquid biological sample can be cerebrospinal fluid. In some examples, the liquid biological sample can be bile. In some examples, the liquid biological sample can be sweat. In some examples, the liquid biological sample can be tears. In some examples, the liquid biological sample can be sputum. In some examples, the liquid biological sample can be synovial fluid. In some examples, the liquid biological sample can be vomit.
  • samples can be collected over a period of time and the samples can be compared to each other or with a standard sample using the systems and methods disclosed herein.
  • the standard sample is a comparable sample obtained from a different subject, for example a different subject that is known to be healthy or a different subject that is known to be unhealthy. Samples can be collected over regular time intervals, or can be collected intermittently over irregular time intervals.
  • FIG. 9 illustrates an example training architecture 900 for aspects of the systems disclosed herein, such as the human foundation model previously illustrated in FIG. 1.
  • Architecture 900 can be completely symmetric, completely asymmetric, partially symmetric, or partially asymmetric, or a combination of the foregoing, and include a multi-layered (e.g., 18 layers deep) convolutional neural network. It is understood that after training, the system (e.g., system 100) can predict morphometric characteristics of cellular images, including but not limited to cell class, cell type, cell state, other morphometric features such as blobs, related probabilities, and related accuracy identifiers.
  • morphometric characteristics of cellular images including but not limited to cell class, cell type, cell state, other morphometric features such as blobs, related probabilities, and related accuracy identifiers.
  • Architecture 900 can be used in a high-throughput setting, with images 912 captured by a camera (e.g., an ultra-high-speed bright-field camera) as cell suspensions flow through a channel in the microfluidics chip.
  • Architecture 900 can include an augmentation module 940 configured to crop collected ultra-high-speed bright-field images 912 of cells as they pass through an imaging zone (e.g., an imaging zone of a microfluidic chip, such as the captured images of FIGs. 8A-8B).
  • Augmentation module 940 can implement one or more augmentation methods to generate batches 942a, b of altered replicas of the images 912. Augmentation techniques of module 940 include, but are not limited to, horizontal and vertical flips of images, orthogonal rotation, translation, Gaussian noise, contrast variation, and the like.
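  • as a non-limiting illustration, the augmentation techniques listed above can be sketched as follows. This is a minimal sketch rather than the implementation of module 940: images are assumed to be square grayscale arrays stored as lists of rows, and the function names, probabilities, shift range, noise level, and contrast range are all hypothetical.

```python
import random

def hflip(img):
    """Horizontal flip: mirror every row."""
    return [row[::-1] for row in img]

def vflip(img):
    """Vertical flip: reverse the row order."""
    return [row[:] for row in img[::-1]]

def rot90(img):
    """Orthogonal rotation: 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

def translate(img, dx, dy, fill=0.0):
    """Shift the image by (dx, dy) pixels, filling exposed pixels."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel."""
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in img]

def scale_contrast(img, factor):
    """Stretch or compress pixel values around the image mean."""
    count = sum(len(row) for row in img)
    mean = sum(sum(row) for row in img) / count
    return [[mean + factor * (p - mean) for p in row] for row in img]

def augment(img, rng):
    """Apply a random combination of the transformations above.
    All probabilities and ranges here are assumed values."""
    out = img
    if rng.random() < 0.5:
        out = hflip(out)
    if rng.random() < 0.5:
        out = vflip(out)
    for _ in range(rng.randrange(4)):
        out = rot90(out)
    out = translate(out, rng.randint(-1, 1), rng.randint(-1, 1))
    out = scale_contrast(out, rng.uniform(0.8, 1.2))
    return add_gaussian_noise(out, 0.01, rng)

def make_batches(images, seed=0):
    """Return two batches of altered replicas, one replica per image each."""
    rng = random.Random(seed)
    batch_a = [augment(im, rng) for im in images]
    batch_b = [augment(im, rng) for im in images]
    return batch_a, batch_b
```

  • in this sketch, the two returned batches play the role of batches 942a and 942b: each contains one independently altered replica of every input image.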
  • Batches 942a, b can be used to train a deep learning (DL) encoder 950.
  • batches 942a, b of altered replicas of the images 912 can be introduced along with images 912 into DL encoder 950 to generate augmented embeddings 952a, b.
  • Encoder 950 can be trained using a self-supervised learning (SSL) method that learns image features without labels and relies at least on preserving information of its embeddings, including embeddings 952a, 952b, as well as concatenated deep learning predictive embeddings 964 (discussed more particularly below).
  • DL encoder 950 can be a ResNet-based encoder trained using a plurality of unlabeled cell images from different types of samples to detect differences in cell morphology without labeled training data.
  • encoder 950 learns image features without labels and with orthogonal morphometric features to improve model performance and interpretability.
  • Encoder 950 may include a plurality of convolution layers that use, for example, edge detectors to detect a plurality of edge components of images 912 and batches 942a, b of altered replicas of the images 912. Encoder 950 can also use shape detectors to detect shape components of images 912 and batches 942a, b of altered replicas of the images 912 (e.g., a particular type of cell ridge). Augmented embeddings 952a, b from deep learning encoder 950 can be used to determine deep learning interpretations of captured images 912 performed in real time (e.g., approximately 150 ms latency).
  • encoder 950 can encode features of batches 942a, b into multi-dimensional vectors.
  • encoder 950 can extract a 64-dimensional feature vector for each altered image of batches 942a, b and images 912.
  • Encoder 950 can be trained with a loss function that utilizes maximum likelihood-based invariance between augmented images (such as mean squared error or categorical cross entropy), as well as variance, covariance, and morphometric decorrelation terms.
  • the variance and covariance terms used herein can include estimates of variance and covariance between feature dimensions by calculating the variance and covariance directly for batches of images (e.g., of hundreds to thousands of images).
  • encoder 950 can be iteratively optimized until the DL model converges, with statistical quality (e.g., covariance) calculated using the loss function.
  • encoder 950 can include a backbone, such as a ResNet-50 backbone, trained with the herein described invariance, variance, covariance, and morphometric decorrelation terms.
  • the loss function uses an invariance term that learns invariance to vector transformations and is regularized with a variance term that prevents norm collapse.
  • the invariance term is determined using the mean square distance between embedding vectors (e.g., vectors of embeddings 952a, 952b).
  • the loss function also uses a covariance term that prevents informational collapse by decorrelating the different dimensions of the vectors of embeddings 952a, 952b.
  • the variance loss constrains the variance term of the vectors of embeddings 952a, 952b along each dimension independently.
  • a distance between vector pairs of embeddings 952a, b of the augmented images of batches 942a, b of the same cell is minimized (e.g., Euclidean distance) and variance of each embedding 952a, b over a training batch is maintained above a threshold.
  • the threshold is a hyperparameter set to the value that gives the best performance on downstream tasks.
  • variance is optimized to be approximately 1.
  • variance can be optimized to be any value strictly greater than 0, i.e., in the range (0, infinity).
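  • as a non-limiting illustration, the invariance, variance, covariance, and morphometric decorrelation terms described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed loss function: the hinge form of the variance term, the cross-covariance form of the morphometric decorrelation term, the term weights, and all function names are hypothetical. Embeddings are lists of vectors, one vector per image in a batch.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def _column_means(z):
    return [_mean([row[j] for row in z]) for j in range(len(z[0]))]

def invariance_term(za, zb):
    """Mean squared distance between paired embedding vectors
    (e.g., embeddings 952a and 952b of the same cell)."""
    return _mean([sum((a - b) ** 2 for a, b in zip(u, v))
                  for u, v in zip(za, zb)])

def variance_term(z, threshold=1.0, eps=1e-4):
    """Penalize each dimension whose standard deviation over the batch
    falls below the threshold (assumed hinge form)."""
    n, d = len(z), len(z[0])
    mus = _column_means(z)
    total = 0.0
    for j in range(d):
        var = sum((row[j] - mus[j]) ** 2 for row in z) / (n - 1)
        total += max(0.0, threshold - math.sqrt(var + eps))
    return total / d

def covariance_term(z):
    """Sum of squared off-diagonal covariances, which decorrelates the
    different embedding dimensions."""
    n, d = len(z), len(z[0])
    mus = _column_means(z)
    total = 0.0
    for i in range(d):
        for j in range(d):
            if i != j:
                cov = sum((r[i] - mus[i]) * (r[j] - mus[j]) for r in z) / (n - 1)
                total += cov ** 2
    return total / d

def morphometric_decorrelation_term(z, m):
    """Mean squared cross-covariance between embedding dimensions and
    rule-based morphometric features (assumed form)."""
    n, dz, dm = len(z), len(z[0]), len(m[0])
    zmu, mmu = _column_means(z), _column_means(m)
    total = 0.0
    for i in range(dz):
        for j in range(dm):
            c = sum((z[k][i] - zmu[i]) * (m[k][j] - mmu[j]) for k in range(n)) / (n - 1)
            total += c ** 2
    return total / (dz * dm)

def total_loss(za, zb, m, w_inv=25.0, w_var=25.0, w_cov=1.0, w_morph=1.0):
    """Weighted sum of the four terms; the weights are assumed values."""
    return (w_inv * invariance_term(za, zb)
            + w_var * (variance_term(za) + variance_term(zb))
            + w_cov * (covariance_term(za) + covariance_term(zb))
            + w_morph * (morphometric_decorrelation_term(za, m)
                         + morphometric_decorrelation_term(zb, m)))
```

  • in this sketch, a batch whose embeddings have collapsed to a single point is penalized by the variance term, while a batch whose embedding dimensions track a morphometric feature is penalized by the morphometric decorrelation term.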
  • Architecture 900 can also include a computer vision encoder 960 that can be self-supervised and can include human-constructed algorithms, which in some cases can be referred to as the previously-described “rule-based morphometrics.” See Table 1.
  • Encoder 960 may process captured images 912 as input and extract morphometric cell features into a plurality of morphometric vectors 962 (e.g., dimensional morphometric features encoded into 95-dimensional vectors representing the cell morphology).
  • the multi-dimension vectors 962 can include cell position, cell shape, pixel intensity, pixel count, cell size, texture, focus, or combinations thereof.
  • encoder 960 can extract 99 dimensional embedding vectors representing cell morphology from high resolution images 912.
  • Example depictions of certain contemplated morphometric cell features are shown in FIGs. 11A to 11B, where FIG. 11A illustrates representative images showing features that include cell shape and size (e.g., convex hull, max/min radius, max Feret diameter, min Feret diameter, long/short axis, etc.).
  • FIG. 11B shows representative images showing features that include pixel intensity and texture (e.g., small white “blobs”, small black “blobs”, large white “blobs”, large black “blobs”, etc.).
  • blobs relate to cellular structures like granules, vesicles, and the like.
  • blobs can be understood as connected set(s) of pixels that are either substantially or entirely dark or substantially or entirely bright.
  • blobs can be understood as region(s) in a respective image that differ in properties (e.g., brightness, color, etc.) relative to surrounding region(s).
  • outputs of encoders 950, 960 can be analyzed together and concatenated as decorrelated concatenated morphometric predictive embeddings 964.
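  • as a non-limiting illustration, treating blobs as connected sets of bright or dark pixels can be sketched as follows. This is a minimal sketch, not the rule-based algorithm of encoder 960: the 4-connectivity, the brightness thresholds, the size cutoff between small and large blobs, and the function names are all assumptions.

```python
from collections import deque

def find_blobs(img, predicate):
    """Return blobs as lists of (row, col) coordinates. A blob is a
    4-connected set of pixels whose values all satisfy `predicate`."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or not predicate(img[r][c]):
                continue
            # Breadth-first flood fill from this seed pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and predicate(img[ny][nx])):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs

def blob_features(img, bright=200, dark=50, large=4):
    """Count small/large white and black blobs in an 8-bit image.
    `bright`, `dark`, and `large` are assumed thresholds."""
    white = find_blobs(img, lambda p: p >= bright)
    black = find_blobs(img, lambda p: p <= dark)
    return {
        "small_white": sum(len(b) < large for b in white),
        "large_white": sum(len(b) >= large for b in white),
        "small_black": sum(len(b) < large for b in black),
        "large_black": sum(len(b) >= large for b in black),
    }
```

  • in this sketch, the four counts could serve as entries of a morphometric vector alongside shape and size features.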
  • Embeddings 964 can be generated adopting a probabilistic approach and/or using deep learning features of encoder 950 (e.g., using conditional batch normalization) concatenated with computer vision morphometric feature embeddings 962 from encoder 960 into different dimensions.
  • Embeddings 964 can be predictive multidimensional vectors that include predictive features related to individual cells, clusters of cells, morphometric features, and related probabilities.
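  • as a non-limiting illustration, concatenating deep learning embeddings with computer vision morphometric vectors can be sketched as follows. This is a minimal sketch: the per-feature standardization step and the function names are assumptions, and the 64 and 95 dimensions are taken from the examples above.

```python
import math

def standardize(rows):
    """Z-score each feature across the batch so that features on
    different scales are comparable (assumed preprocessing step)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n)
            for c, m in zip(cols, means)]
    return [[(x - m) / s if s > 0 else 0.0
             for x, m, s in zip(row, means, stds)]
            for row in rows]

def concatenate_embeddings(dl_embeddings, cv_vectors):
    """Join each cell's deep learning embedding with its morphometric
    vector into one longer vector, one vector per cell."""
    dl = standardize(dl_embeddings)
    cv = standardize(cv_vectors)
    return [a + b for a, b in zip(dl, cv)]

# Hypothetical batch of three cells: 64 DL features and 95 morphometric features.
dl_batch = [[float(i + j) for j in range(64)] for i in range(3)]
cv_batch = [[float(i * j) for j in range(95)] for i in range(3)]
combined = concatenate_embeddings(dl_batch, cv_batch)
```

  • in this sketch, each cell ends up with a single 159-dimensional vector in which the first 64 entries come from the deep learning encoder and the remaining 95 from the computer vision encoder.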
  • embeddings 952a, 952b are capable of being visualized.
  • Projector 970 is configured to reduce the dimensionality to projected embeddings 972a, 972b and map representations of embeddings 952a, 952b.
  • the previously discussed criterion including the loss functions with invariance, variance, covariance, and morphometric decorrelation terms can be applied on projected embeddings 972a, 972b.
  • Views (a) to (f) of FIG. 13 schematically illustrate an example system for classifying and sorting one or more cells.
  • the platform as disclosed herein can allow for the input and flow of cells in suspension with confinement along a single lateral trajectory to obtain a narrow band of focus across the z-axis (views (a) to (f) of FIG. 13).
  • View (a) of FIG. 13 shows the microfluidic chip and the inputs and output of the sorter platform according to one example of the present disclosure.
  • Cells in suspension and sheath fluid are inputted, along with run parameters entered by the user: target cell type(s) and a cap on the number of cells to sort, if sorting is of interest.
  • upon run completion, the system generates reports of the sample composition (number and types of all of the processed cells) and the parameters of the run, including: length of run, number of analyzed cells, quality of imaging, quality of the sample. If the sorting option is selected, it outputs isolated cells in a reservoir on the chip as well as a report of the number of sorted cells, purity of the collected cells and yield of the sort. Referring to view (b) of FIG. 13, a combination of hydrodynamic focusing and inertial focusing is used to focus the cells on a single z plane and a single lateral trajectory. Referring to views (c) and (d) of FIG. 13, the diagram shows the interplay between different components of the software (view (c) of FIG.
  • the classifier is blown up in view (e) of FIG. 13, depicting the process of image collection, and automated real-time assessment of single cells in flow.
  • individual cell images are cropped using an automated object detection module, and the cropped images are then run through a deep neural network model trained on the relevant cells (e.g., DL encoder 950).
  • the model can generate deep learning embeddings (e.g., embeddings 952a, 952b) and deep learning predictive embeddings 964, as well as a prediction vector over the available cell classes, and an inference can be made according to a selection rule (e.g., argmax).
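  • as a non-limiting illustration, turning a prediction vector into an inferred class with an argmax selection rule can be sketched as follows. This is a minimal sketch: it assumes the model outputs raw scores (logits) that are converted to probabilities with a softmax, and the function names and example scores are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def infer_class(logits, class_names):
    """Return the class with the highest probability (argmax rule)
    together with the full prediction vector."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs

# Hypothetical scores for one cropped cell image.
label, probs = infer_class([0.1, 2.0, -1.0], ["WBC", "NSCLC", "debris"])
```

  • in this sketch, `label` is the inferred class and `probs` is the prediction vector over the available cell classes.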
  • the model may also infer the z focusing plane of the image.
  • the percentage of debris and cell clumps may also be predicted by the neural network model as a proxy for “sample quality”.
  • View (f) of FIG. 13 shows the performance of sorting. In this figure, the tradeoff between purity and yield is shown in three different modes, for profiling as sorting of 130,000 [A], 500,000 [B] or 1,000,000 [C] cells within one hour.
  • the platform can collect ultra high-speed bright-field images of cells as they pass through the imaging zone of the microfluidic chip (views (a) and (b) of FIG. 13).
  • an automated object detection module can be incorporated to crop each image centered around the cell, before feeding the cropped images into a deep convolutional neural network (CNN) based on Inception architecture, which is trained on images of relevant cell types.
  • the CNN can be trained to assess the focus of each image (in Z plane) and identify debris and cell clusters, thus providing information to assess sample quality (view (e) of FIG. 13).
  • a feedback loop can be engineered so that the CNN inferred cell type can be used in real time to regulate pneumatic valves for sorting a cell into either the positive reservoir (cell collection reservoir) for a targeted category of interest or a waste outlet (FIG. 13 A). Sorted cells in the reservoir may then be retrieved for downstream processing and molecular analysis.
  • the feedback loop can be engineered so that the generated deep learning embeddings (e.g., embeddings 952a, 952b, 964, etc.) can be used in real time to regulate pneumatic valves for sorting a cell into either a cell collection reservoir or a waste outlet (FIG. 13 A).
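  • as a non-limiting illustration, the real-time routing decision in such a feedback loop can be sketched as follows. This is a minimal sketch, not the disclosed valve controller: the confidence threshold, the function name, and the returned labels are hypothetical, while the target cell types and the cap on the number of sorted cells correspond to the run parameters described above.

```python
def route_cell(probs, class_names, target_classes, sorted_count,
               max_sorted, min_confidence=0.9):
    """Decide whether a cell goes to the collection reservoir or to waste.

    probs          -- prediction vector over the cell classes
    target_classes -- set of class names the user wants to collect
    sorted_count   -- number of cells collected so far in this run
    max_sorted     -- user-entered cap on the number of cells to sort
    min_confidence -- assumed threshold; raising it would be expected to
                      favor purity over yield
    """
    best = max(range(len(probs)), key=probs.__getitem__)
    is_target = class_names[best] in target_classes
    if is_target and probs[best] >= min_confidence and sorted_count < max_sorted:
        return "collect"
    return "waste"

classes = ["WBC", "NSCLC", "debris"]
decision = route_cell([0.03, 0.95, 0.02], classes, {"NSCLC"},
                      sorted_count=10, max_sorted=1000)
```

  • in this sketch, the returned decision would be translated into a valve actuation by hardware-specific code that is outside the scope of the example.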
  • FIG. 14 schematically illustrates operations that can be performed in an example method.
  • View (a) of FIG. 14 shows that high-resolution images of single cells in flow are stored.
  • AI Assisted Image Annotation (AIAIA) is used to cluster individual cell images into morphologically similar groups of cells.
  • AIAIA is used to cluster individual cell images into groups of cells using deep learning embeddings (e.g., embeddings 952a, 952b, 964).
  • a user uses the labeling tool to adjust and batch-label the cell clusters.
  • one AML cell can be mis-clustered into a group of WBC cells and an image showing a cell clump (debris) can be mis-clustered in a NSCLC cell group.
  • These errors are corrected by the “Expert clean-up” operation of view (b).
  • the annotated cells are then integrated into a Cell Morphology Atlas (CMA).
  • the CMA is used to generate both training and validation sets of the next generation of the models.
  • In view (e) of FIG. 14, during a sorting experiment, the pre-trained model shown in view (d) of FIG. 14 is used to infer the cell type (class) in real time.
  • the enriched cells are retrieved from the device.
  • the retrieved cells are further processed for molecular profiling.
  • the platform can be run in multiple different modes.
  • the collected images of a sample can be fed to the AIAIA, configured to use unsupervised learning to group cells into sub-clusters.
  • the sub-clusters can be morphologically distinct sub-clusters.
  • the cells can be grouped using deep learning embeddings (e.g., embeddings 952a, 952b, 964).
  • a user can clean up the sub-clusters by removing cells that are incorrectly clustered and annotate each cluster based on a predefined annotation schema.
  • the annotated cell images are then integrated into the Cell Morphology Atlas (CMA), a growing database of expert-annotated images of single cells.
  • the CMA is broken down into training and validation sets and is used to train and evaluate CNN models aimed at identifying cell types, cell states, morphometric features, and/or the like.
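  • as a non-limiting illustration, breaking annotated images into training and validation sets can be sketched as follows. This is a minimal sketch: the record layout, the per-label (stratified) split, the 20% validation fraction, and the function name are all assumptions.

```python
import random

def split_atlas(records, validation_fraction=0.2, seed=0):
    """Split annotated records into training and validation sets,
    keeping roughly the same fraction of every label in each set."""
    by_label = {}
    for rec in records:
        by_label.setdefault(rec["label"], []).append(rec)

    rng = random.Random(seed)
    train, validation = [], []
    for label in sorted(by_label):
        items = by_label[label][:]
        rng.shuffle(items)
        n_val = round(len(items) * validation_fraction)
        if len(items) > 1:
            n_val = max(1, n_val)  # keep at least one validation example
        validation.extend(items[:n_val])
        train.extend(items[n_val:])
    return train, validation

# Hypothetical atlas with two annotated classes.
atlas = ([{"id": i, "label": "WBC"} for i in range(10)]
         + [{"id": 10 + i, "label": "AML"} for i in range(5)])
train_set, validation_set = split_atlas(atlas)
```

  • splitting per label is one way to keep rare cell classes represented in the validation set; other splitting schemes can also be used.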
  • the collected images are fed into models that had been previously trained using the CMA, and a report is generated demonstrating the composition of the sample of interest.
  • a UMAP visualization is used to depict the morphometric map of all the single cells within the sample.
  • a set of prediction probabilities is also generated showing the classifier prediction of each individual cell within the sample belonging to every predefined cell class within the CMA.
  • the collected images are passed to the CNN in real-time and a decision is made on the fly to assign each single cell to one of the predefined classes within the CMA.
  • the collected images are passed to the CNN in real-time and the decision is made on the fly to assign each single cell using deep learning embeddings (e.g., embeddings 952a, 952b, 964) within the CMA.
  • the target cells are then sorted in real-time and are outputted for downstream molecular assessment.
  • FIG. 15 shows a computer system that is programmed or otherwise configured to implement methods provided herein.
  • the present disclosure provides computer systems that are programmed to implement methods of the disclosure.
  • FIG. 15 shows a computer system 1501 that is programmed or otherwise configured to capture and/or analyze one or more images of the cell.
  • the computer system 1501 can regulate various examples of components of the cell sorting system of the present disclosure, such as, for example, the pump, the valve, and the imaging device.
  • the computer system 1501 can be an electronic device of a user or a computer system that is remotely located with respect to the electronic device.
  • the electronic device can be a mobile electronic device.
  • the computer system 1501 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 1505, which can be a single core or multi core processor, or a plurality of processors for parallel processing.
  • the computer system 1501 also includes memory or memory location 1510 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 1515 (e.g., hard disk), communication interface 1520 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 1525, such as cache, other memory, data storage and/or electronic display adapters.
  • the memory 1510, storage unit 1515, interface 1520 and peripheral devices 1525 are in communication with the CPU 1505 through a communication bus (solid lines), such as a motherboard.
  • the storage unit 1515 can be a data storage unit (or data repository) for storing data.
  • the computer system 1501 can be operatively coupled to a computer network (“network”) 1530 with the aid of the communication interface 1520.
  • the network 1530 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet.
  • the network 1530 in some cases is a telecommunication and/or data network.
  • the network 1530 can include one or more computer servers, which can enable distributed computing, such as cloud computing.
  • the network 1530, in some cases with the aid of the computer system 1501, can implement a peer-to-peer network, which may enable devices coupled to the computer system 1501 to behave as a client or a server.
  • the CPU 1505 can execute a sequence of machine-readable instructions, which can be embodied in a program or software.
  • the instructions can be stored in a memory location, such as the memory 1510.
  • the instructions can be directed to the CPU 1505, which can subsequently program or otherwise configure the CPU 1505 to implement methods of the present disclosure. Examples of operations performed by the CPU 1505 can include fetch, decode, execute, and writeback.
  • the CPU 1505 can be part of a circuit, such as an integrated circuit.
  • One or more other components of the system 1501 can be included in the circuit.
  • the circuit is an application specific integrated circuit (ASIC).
  • the storage unit 1515 can store files, such as drivers, libraries and saved programs.
  • the storage unit 1515 can store user data, e.g., user preferences and user programs.
  • the computer system 1501 in some cases can include one or more additional data storage units that are external to the computer system 1501, such as located on a remote server that is in communication with the computer system 1501 through an intranet or the Internet.
  • the computer system 1501 can communicate with one or more remote computer systems through the network 1530.
  • the computer system 1501 can communicate with a remote computer system of a user.
  • remote computer systems include personal computers (e.g., portable PC), slate or tablets, telephones, smartphones (e.g., Apple® iPhone, Android-enabled device, Blackberry®), or personal digital assistants.
  • the user can access the computer system 1501 using the network 1530.
  • Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 1501, such as, for example, on the memory 1510 or electronic storage unit 1515.
  • the machine executable or machine readable code can be provided in the form of software.
  • the code can be executed by the processor 1505.
  • the code can be retrieved from the storage unit 1515 and stored on the memory 1510 for ready access by the processor 1505.
  • the electronic storage unit 1515 can be precluded, and machine-executable instructions are stored on memory 1510.
  • the code can be pre-compiled and configured for use with a machine having a processor adapted to execute the code, or can be compiled during runtime.
  • the code can be supplied in a programming language that can be selected to enable the code to execute in a pre-compiled or as-compiled fashion.
  • Examples of the systems and methods provided herein, such as the computer system 1501, can be embodied in programming.
  • Various examples of the technology can be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium.
  • Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk.
  • “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server.
  • another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links.
  • a machine readable medium, such as computer-executable code, may take many forms, including a tangible storage medium, a carrier-wave medium, or a physical transmission medium.
  • Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as can be used to implement the databases, etc. shown in the drawings.
  • Volatile storage media include dynamic memory, such as main memory of such a computer platform.
  • Tangible transmission media include coaxial cables; copper wire and fiber optics, including the wires that comprise a bus within a computer system.
  • Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications.
  • Common forms of computer-readable media therefore include for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data.
  • Many of these forms of computer readable media can be involved in carrying one or more sequences of one or more instructions to a processor for execution.
  • the computer system 1501 can include or be in communication with an electronic display 1535 that comprises a user interface (UI) 1540 for providing, for example, the one or more images of the cell that is transported through the channel of the cell sorting system.
  • the computer system 1501 can be configured to provide a live feedback of the images.
  • UIs include, without limitation, a graphical user interface (GUI) and web-based user interface.
  • Methods and systems of the present disclosure can be implemented by way of one or more algorithms. An algorithm can be implemented by way of software upon execution by the central processing unit 1505. The algorithm can include, for example, the human foundation model.
  • FIG. 16 illustrates an example flow of operations in a method of processing images.
  • Method 1600 illustrated in FIG. 16 includes extracting, using a Deep Learning (DL) model (e.g., DL encoder 950), a set of machine learning (ML)-based features from a cell image (e.g., images 912 and/or augmented images thereof such as in batches 942a, 942b ) (operation 1610).
  • any suitable component(s) may include a processor and a non-transitory computer readable medium storing the machine learning model and related encoder, and instructions for causing the processor to perform operations.
  • the processor can be included with a cloud-based computing environment and/or within a microfluidics platform (e.g., platform 20, 310).
  • Nonlimiting examples of machine learning encoders are provided elsewhere herein.
  • cells in the one or more cell images are unstained.
  • the one or more cell images are brightfield cell images in some examples, though it will be appreciated that other types of cell images readily can be provided to the machine learning encoder and computer vision encoder.
  • Method 1600 illustrated in FIG. 16 also may include generating, using the DL model and the set of ML-based features, a plurality of DL embeddings orthogonal to each other (operation 1620).
  • Method 1600 illustrated in FIG. 16 also may include extracting, using a computer vision model, a set of cell morphometric features from the cell image, the cell morphometric features being orthogonal to the plurality of DL embeddings (operation 1630).
  • cell morphometric features include cell position, cell shape, pixel intensity, texture, focus, or any combination thereof. These and other nonlimiting examples of cell morphological features are described in Table 2.
  • Method 1600 illustrated in FIG. 16 also may include generating a plurality of morphometric predictive embeddings using the set of ML-based features and the set of cell morphometric features (operation 1640).
  • FIG. 17 illustrates an example flow of operations in a method of processing images.
  • Method 1700 illustrated in FIG. 17 includes extracting, using a Deep Learning (DL) model (e.g., DL encoder 950) and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine-learning (ML)-based features and a set of cell morphometric features extracted using a computer vision model (operation 1710).
  • any suitable component(s) may include a processor and a non-transitory computer readable medium storing the machine learning model and related encoder, and instructions for causing the processor to perform operations.
  • the processor can be included with a cloud-based computing environment and/or within a microfluidics platform (e.g., platform 20, 310).
  • Nonlimiting examples of machine learning encoders, such as deep learning encoders (for example, convolutional neural networks), are provided elsewhere herein.
  • cells in the one or more cell images are unstained.
  • the one or more cell images are brightfield cell images in some examples, though it will be appreciated that other types of cell images readily can be provided to the machine learning encoder and computer vision encoder.
  • Method 1700 illustrated in FIG. 17 also may include generating, using the DL model and using the set of ML-based features, a plurality of DL embeddings orthogonal to each other (operation 1720).
  • Cell images may be represented as points in a space (referred to herein as “feature space”) defined by the information included in the image.
  • each cell image (and, indeed, any image) encoded in the specified format may be represented as a unique location in the feature space being considered.
  • dimensions may be identified through use of an encoder comprising a neural network with a plurality of convolution layers, which may identify features of an input image and generate a reduced dimensionality output image, and a plurality of transpose convolution layers, which may generate an increased dimensionality output image based on features included in an input image. Illustrations of potential operations which may be performed by the layers of such a network are discussed below in the context of FIGS. 18 and 19.
  • a convolution layer may identify features in an input image 101, and generate a reduced dimensionality output image 102 by convolving the input image 101 with a convolution filter 103.
  • FIG. 18 illustrates how convolution of a square input image with n+1 pixel long sides may be used to generate a square output image with n-1 pixel long sides through convolving the square input image with a 3x3 convolution filter.
  • the input and output images may each be represented by layers in the network. The weights on the connections between those layers may then define the convolution operation. For example, given a 3x3 convolution filter such as shown in FIG. 18, a pixel at location (a,b) in the output image may have a weight for the pixel at location (a-1,b-1) of the input image equal to the top left value in the convolution filter, a weight for the pixel at location (a,b-1) of the input image equal to the top center value in the convolution filter, and may continue in this pattern for the remainder of the square extending from (a-1,b-1) to (a+1,b+1) in the input image, while having weight values of zero elsewhere.
  • the particular values in the filter may then define the feature of the input image which may be captured in a convolution layer’s output image. For example, a 3x3 filter with a value of 8 at its center and -1 elsewhere such as shown in Table 3, below, may capture the edges from the input image.
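  • as a non-limiting illustration, convolving an image with the 3x3 edge filter described above (8 at the center, -1 elsewhere) can be sketched as follows. This is a minimal sketch: images are lists of rows, the filter is applied without flipping and without padding, and the function and variable names are hypothetical.

```python
def convolve_valid(image, kernel):
    """Slide the kernel over every position where it fits entirely
    inside the image and return the weighted sums."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# 3x3 filter with 8 at its center and -1 elsewhere.
EDGE_FILTER = [[-1, -1, -1],
               [-1,  8, -1],
               [-1, -1, -1]]

# 5x5 image (n + 1 = 5) with a vertical step between columns 1 and 2.
step_image = [[0, 0, 10, 10, 10] for _ in range(5)]
edges = convolve_valid(step_image, EDGE_FILTER)
```

  • in this sketch, the 5x5 input yields a 3x3 output (n - 1 = 3), and the output is nonzero only in the columns adjacent to the step, which is the sense in which the filter captures edges.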
  • Multiple such convolution layers may be arranged in a series (i.e., the output image of one layer may serve as the input image to the next) to further refine the features and reduce the dimensionality of the ultimate output image created by the convolution layers.
  • a transpose convolution layer may function in a manner similar to that described above for a convolution layer, with a filter 201 being used to create an output image 202 from an input image 203.
  • the input image 203 may be augmented by adding zeros around its periphery and between its elements, with the filter 201 only being applied to the expanded image 204 obtained using this augmentation.
  • with a series of transpose convolution layers, the first taking the output 102 of the last convolution layer as input 203, it is possible to obtain a final output image which has the same resolution as the original input image to the first convolution layer.
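  • as a non-limiting illustration, a transpose convolution that adds zeros between the elements and around the periphery of the input image before applying the filter can be sketched as follows. This is a minimal sketch: one zero between neighboring pixels and a one-pixel border are assumed so that a 3x3 input yields a 5x5 output, and the function names are hypothetical.

```python
def convolve_valid(image, kernel):
    """Apply the kernel at every position where it fits inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def expand(image, stride=2, pad=1):
    """Insert (stride - 1) zeros between neighboring pixels and add
    `pad` rows and columns of zeros around the periphery."""
    h, w = len(image), len(image[0])
    eh = (h - 1) * stride + 1 + 2 * pad
    ew = (w - 1) * stride + 1 + 2 * pad
    out = [[0] * ew for _ in range(eh)]
    for r in range(h):
        for c in range(w):
            out[pad + r * stride][pad + c * stride] = image[r][c]
    return out

def transpose_convolve(image, kernel, stride=2, pad=1):
    """Apply the filter to the expanded image only."""
    return convolve_valid(expand(image, stride, pad), kernel)

ones_input = [[1] * 3 for _ in range(3)]
ones_filter = [[1] * 3 for _ in range(3)]
expanded = expand(ones_input)                          # 7x7 expanded image
upsampled = transpose_convolve(ones_input, ones_filter)  # 5x5 output image
```

  • in this sketch, the 3x3 input is expanded to 7x7 and the 3x3 filter then produces a 5x5 output, so stacking such layers increases resolution step by step.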
  • This final output image can then be compared with the original image provided as input to the first convolution layer, and used with a loss function (e.g., mean squared error, binary cross entropy) to allow a neural network comprising convolution and transpose convolution layers to be trained using unlabeled images of the type to be represented (e.g., cell images).
  • the pixels in the smallest output image i.e., the output image created by the final convolution layer which is provided as input to the first transpose convolution layer
  • the convolution layers can be used as an encoder by treating the values of the pixels in the smallest output image as the embedding for that image.
  • FIGS. 20A-20E illustrate morphometric features which may be extracted from cell images and used as dimensions in a feature space, either in addition to, or as alternatives to, dimensions derived using machine learning such as described above.
  • FIGS. 18 and 19 illustrate various items as having particular dimensions (e.g., 3x3 convolution filter in FIG. 18; 3x3 transpose convolution filter, 3x3 input image, and 5x5 output image in FIG. 19).
  • those dimensions are used only for illustration, and other dimensions may also be used.
  • convolution and transpose convolution filters having dimensions other than 3x3 may be used, and transpose convolution layers may have input images with sizes other than 3x3 and output images with sizes other than 5x5.
  • while the output images of FIGS. 18 and 19 may be obtained by applying filters on a pixel-by-pixel basis (e.g., a filter may be convolved with one 3x3 area, then moved one pixel and convolved with another 3x3 area, etc.), it is also possible that filters may be applied on a different basis in some scenarios (e.g., a filter may be convolved with one 3x3 area, then moved two pixels and convolved with another 3x3 area, etc.).
  • the amount of zeros added between pixels in, and/or around the periphery of, an input image to create an expanded image 204 may be more than the single rows and columns of zeros illustrated in FIG. 19.
  • may be used to increase or decrease the size of images as appropriate in some cases.
  • approaches such as pooling, downsampling or upsampling may be the only operations which may be used to reduce size.
  • a convolution layer may include augmenting an input image by adding zero padding so that the output provided by applying a convolution filter may have the same dimension as the input image, and then the size of that output image may be reduced by application of a pooling operation.
  • Other suitable approaches may be employed. Accordingly, the above variations, like the discussions of FIGS. 18 and 19 themselves, are to be illustrative only, and not limiting.
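The size relationships discussed above can be checked with a short sketch. It is illustrative only and makes several assumptions: images are square lists of rows, the function names (`convolve`, `enhance`, `transpose_convolve`) are hypothetical, and the transpose convolution is modeled as an ordinary convolution over an image enhanced with a single row and column of zeros between pixels and around the periphery.

```python
def convolve(image, kernel, stride=1):
    """Slide a square kernel over a square image (lists of rows), no padding."""
    k = len(kernel)
    size = (len(image) - k) // stride + 1
    output = []
    for row in range(size):
        output_row = []
        for col in range(size):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    total += image[row * stride + i][col * stride + j] * kernel[i][j]
            output_row.append(total)
        output.append(output_row)
    return output


def enhance(image, between=1, border=1):
    """Insert zeros between pixels and around the periphery of a square image."""
    n = len(image)
    step = between + 1
    size = (n - 1) * step + 1 + 2 * border
    enhanced = [[0.0] * size for _ in range(size)]
    for row in range(n):
        for col in range(n):
            enhanced[border + row * step][border + col * step] = image[row][col]
    return enhanced


def transpose_convolve(image, kernel, between=1, border=1):
    """Model a transpose convolution as a convolution over the enhanced image."""
    return convolve(enhance(image, between, border), kernel)


ones3 = [[1.0] * 3 for _ in range(3)]
ones5 = [[1.0] * 5 for _ in range(5)]
small = convolve(ones5, ones3)               # one-pixel steps: 5x5 -> 3x3
strided = convolve(ones5, ones3, stride=2)   # two-pixel steps: 5x5 -> 2x2
large = transpose_convolve(ones3, ones3)     # 3x3 input -> 5x5 output
```

Under these assumptions, a 3x3 input and a 3x3 filter yield a 5x5 output, and changing `stride`, `between`, or `border` changes the output size accordingly.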
  • the loss function may be used when using neural networks to identify the dimensions of a feature space in which cell images may be represented. While, as noted above, the use of convolution and transpose convolution layers can allow such a neural network to be trained on unlabeled data based on how well it can reconstruct a cell image, in some cases labeled data may also be used in the training process. For example, a loss function may be implemented to combine the loss based on how well the neural network can reconstruct a cell image (referred to as “reconstruction loss”), with how well the dimensions in a particular feature space function in allowing a cell image to be classified.
  • the values of pixels at the feature space dimensions may be used as inputs to a dense network with output nodes corresponding to labels of cell images with ground truth (e.g., provided by a human annotator) classifications, and the loss from comparing the ground truth classification with the classifications provided by the dense network may be combined with the reconstruction loss when training the convolution layers of the network.
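One way such a combination may be expressed is as a weighted sum of the two losses. The following is a minimal sketch, not a definitive implementation: the function names, the `weight` parameter, and the choice of mean squared error and cross entropy for the two terms are assumptions made for illustration.

```python
import math


def reconstruction_loss(original, reconstructed):
    """Mean squared error between flattened original and reconstructed images."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def classification_loss(class_probabilities, true_index):
    """Cross entropy between the dense network's output and the ground truth label."""
    return -math.log(max(class_probabilities[true_index], 1e-12))


def combined_loss(original, reconstructed, class_probabilities, true_index, weight=1.0):
    """Reconstruction loss plus a weighted classification loss (weight is assumed)."""
    return (reconstruction_loss(original, reconstructed)
            + weight * classification_loss(class_probabilities, true_index))


# A perfect reconstruction with an undecided two-class prediction.
example = combined_loss([0.0, 1.0], [0.0, 1.0], [0.5, 0.5], true_index=0)
```

In this example the reconstruction term is zero, so the combined loss comes entirely from the classification term.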
  • a cell image by providing an embedding to a deep learning model (e.g., a series of transpose convolution layers such as described in the context of FIG. 19) designed to translate embeddings into images (such a series of layers referred to herein as a “decoder”).
  • other approaches for generating cell images are also possible.
  • a neural network can be trained to generate cell images from random input images (e.g., a picture of static where the values of the pixels in the picture follow a Gaussian distribution) by, for each of a set of training images, progressively adding more noise to that image until an obscured image of effectively random pixels is produced, and training the network to approximate the (actually intractable) process of recreating the original input.
  • A method which may be used to perform this type of training is illustrated in FIG. 21, discussed below.
  • a method for training a neural network to approximate the process of denoising cell images can start in block 401 with obtaining an input image. This may be done, for example, by retrieving a cell image from a set of training data and passing it to a function which may perform a training process such as shown.
  • an output image can be obtained by adding noise (e.g., gaussian white noise) to the input image in block 402.
  • This can be done, for example, by using an equation such as equation 1, below, in which x_t is the output image and x_{t-1} is the input image, to progressively add more noise to the image on each iteration of the process shown in FIG. 21 by increasing β from a relatively low value initially (e.g., 0.00085) to a relatively higher value (e.g., 0.012) as the image is more completely obscured.
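Equation 1 is not reproduced in this section. As an assumption, the sketch below uses a form common in the diffusion-model literature, in which each output pixel is sqrt(1 - b) times the input pixel plus sqrt(b) times Gaussian noise, where b is the noise parameter that the passage increases from 0.00085 to 0.012. The linear schedule and the function names are likewise hypothetical.

```python
import math
import random


def noise_schedule(steps, start=0.00085, end=0.012):
    """Linearly increase the noise parameter from `start` to `end` (assumed schedule)."""
    if steps == 1:
        return [start]
    return [start + (end - start) * t / (steps - 1) for t in range(steps)]


def add_noise(pixels, noise_level, rng):
    """One noising step under the assumed form of equation 1."""
    keep = math.sqrt(1.0 - noise_level)
    scale = math.sqrt(noise_level)
    return [keep * p + scale * rng.gauss(0.0, 1.0) for p in pixels]


rng = random.Random(0)
levels = noise_schedule(20)          # e.g., 20 iterations of FIG. 21
image = [0.5, 0.5, 0.5, 0.5]         # a flattened toy image
history = [image]
for level in levels:
    # The output of one iteration is treated as the input of the next.
    history.append(add_noise(history[-1], level, rng))
```

Each entry in `history` after the first may serve as a noised output image paired with the preceding entry as the denoising target.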
  • the neural network can be trained to generate the input image from the output image (which, in this example, may be done by training the neural network to approximately denoise the output image) in block 403. This can be done in a manner similar to that described in the context of FIGS. 18 and 19 for training a decoder to generate an output image based on an embedding.
  • the neural network may be provided with the output image and its loss may be based on the neural network’s ability to generate a new image that matches the input image as it appeared before noise was added in block 402.
  • a decision may be made in block 404 of whether to perform further iterations of the process of FIG. 21.
  • This decision may be implemented by, for example, checking if a predefined number of iterations (e.g., between 20 and 25 iterations) had already taken place.
  • blocks 401-403 may be repeated, with the output image obtained on the previous iteration being treated as the input image for the further iteration of blocks 401-403. Otherwise, if the decision in block 404 is that no more iterations are needed, the process for FIG. 21 may be treated as done in block 405, and it may then be repeated for the next image in the training data (e.g., an additional cell image from a database of cell images).
  • a process such as shown in FIG. 22 may be used.
  • an input may be received. This may be done, for example, in an initial iteration of the process of FIG. 22 by providing a random input to a neural network trained in a manner such as described above in the context of FIG. 21.
  • obtaining the input in block 501 may be performed by feeding an output obtained on a previous iteration back into the network as its new input data.
  • the input (whether on the initial iteration or otherwise) may then be de-noised in block 502 to provide a de-noising output.
  • a decision may be made as to whether further iterations are needed. This may be done, for example, by continuing to iterate the process of FIG. 22 for the same number of iterations as were used in training the neural network using the process of FIG. 21 (e.g., in implementations in which there was a fixed number of iterations per image in the training loop of FIG. 21). In some examples, other approaches to making the decision in block 503 may be used.
  • the number of denoising iterations may be a user definable parameter, and so the individual using the network to create a new image may provide the information for deciding if more iterations are needed. If more iterations are needed, then the process may repeat, with the output of the previous iteration being treated as the input for the next iteration. Otherwise, the process of FIG. 22 may terminate in block 504, with the final de-noising output (e.g., a new picture created based on the initial random input) treated as the ultimate output of the neural network.
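The loop of FIG. 22 may be sketched as follows. This is an illustration under stated assumptions: `toy_denoise` is a hypothetical stand-in for a trained network (it simply halves every value), and the fixed iteration count plays the role of the decision in block 503.

```python
import random


def generate(denoise, size, iterations, rng):
    """Repeatedly de-noise a random input, feeding each output back in as input."""
    current = [rng.gauss(0.0, 1.0) for _ in range(size)]   # block 501: random input
    for _ in range(iterations):                            # block 503: iteration count
        current = denoise(current)                         # block 502: de-noise
    return current                                         # block 504: final output


def toy_denoise(pixels):
    """Hypothetical placeholder for a trained de-noising network."""
    return [0.5 * p for p in pixels]


new_image = generate(toy_denoise, size=4, iterations=20, rng=random.Random(0))
```

The `iterations` argument may match the number used in training or may be supplied by the user, as described above.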
  • noise may be progressively added to an embedding for an image, and the neural network may be trained to recreate the embedding by removing the added noise.
  • rather than the neural network creating a new image, it may be used to create an embedding, and a decoder may then create the new image from the embedding.
  • a trained neural network may be provided with additional information, such as a complete or partial embedding, and this additional information can be used to guide the image creation process.
  • FIG. 23, discussed below, illustrates in one example a process in which the creation of a new image can be guided by an embedding or partial embedding (such embedding or partial embedding referred to as a “prompt” in the context of FIG. 23).
  • a prompt may be obtained in block 601. This may be done, for example, by a user specifying values for certain dimensions in feature space (e.g., values for roundness, ellipse elongation and convex shape, in a case where the feature space dimensions include morphometric parameters such as shown in FIG. 20B), though other ways are also possible and are discussed in more detail in section IV.
  • an attention output may be created from that prompt in block 602.
  • This may be done, for example, by projecting the feature space dimensions with values defined in the prompt onto a flattened representation of the neural network using a set of trainable projection matrices.
  • This attention output may then be added to the de-noising output (i.e., the result of applying de-noising in block 502, as described in the context of FIG. 22) to create a conditioned output in block 603.
  • This conditioned output may then be either treated as the final output of the neural network (if the decision in block 503 indicates that no further iterations are needed), or may be treated as input and itself be subjected to denoising (if the decision in block 503 indicates that further iterations are to take place) in the process of obtaining the final output of the network.
  • This final output i.e., an embedding created by progressive denoising as illustrated in FIG. 23
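The conditioning of FIG. 23 may be sketched as adding a projected prompt to each de-noising output. This is a simplified illustration: a single fixed matrix stands in for the set of trainable projection matrices, and all names (`project_prompt`, `conditioned_step`, `generate_conditioned`) are hypothetical.

```python
def project_prompt(prompt, projection):
    """Block 602: map prompt values to one additive term per output element."""
    return [sum(w * v for w, v in zip(row, prompt)) for row in projection]


def conditioned_step(denoise, current, attention_output):
    """Blocks 502 and 603: de-noise, then add the attention output."""
    denoised = denoise(current)
    return [d + a for d, a in zip(denoised, attention_output)]


def generate_conditioned(denoise, initial, prompt, projection, iterations):
    """Iterate conditioned de-noising, treating each output as the next input."""
    attention_output = project_prompt(prompt, projection)
    current = list(initial)
    for _ in range(iterations):
        current = conditioned_step(denoise, current, attention_output)
    return current


# A two-value prompt projected onto a three-element output (values are illustrative).
projection = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = generate_conditioned(lambda x: [0.0 for _ in x], [9.0, 9.0, 9.0],
                              [1.0, 2.0], projection, iterations=3)
```

With the placeholder de-noiser used here, the result reflects only the projected prompt, which makes the effect of the conditioning term easy to see.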
  • FIG. 24 depicts in one example a process in which the disclosed technology can be used in exploring the transition between different cell states, such as the transition between melanocytic and mesenchymal melanoma cells.
  • first and second states may be defined in block 701. This may be done in a variety of manners. For example, in some cases, a user may provide two cell images as input, and the first cell image may be treated as the first state, and the second cell image may be treated as defining the second state.
  • a user may specify cell states by name, such as using a drop down or text entry field allowing him or her to provide the names of the relevant states.
  • a computer which is being used to perform the method of FIG. 24 may compare the user’s entered text with a database of cell states, and then treat the cell states which most closely match the user’s input as the first and second states for purposes of the process of FIG. 24.
  • Other acts e.g., informing the user of the closest matches from the database of cell states and asking them to confirm that those states are really the states they intended
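Matching entered text against a database of cell states may be sketched with approximate string matching. The list `KNOWN_STATES` is a hypothetical stand-in for such a database, and the use of Python's `difflib` is an assumption rather than a method stated in this disclosure.

```python
import difflib

# Hypothetical database of cell state names.
KNOWN_STATES = ["melanocytic", "mesenchymal", "epithelial"]


def closest_state(user_text, known_states=KNOWN_STATES):
    """Return the known state name most similar to the user's entry, if any."""
    matches = difflib.get_close_matches(user_text.lower(), known_states,
                                        n=1, cutoff=0.0)
    return matches[0] if matches else None


first_state = closest_state("Melanocitic")
second_state = closest_state("mesenchimal")
```

The returned names could then be shown to the user for confirmation before being treated as the first and second states.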
  • first and second states e.g., melanocytic and mesenchymal melanoma cells
  • embeddings may be determined for each of those states in block 702.
  • the embedding definition of block 702 may be performed in a variety of manners. For example, in a case where the first and second states were defined by a user providing images for the first and second states, the embeddings for the first and second states may be determined by providing the user’s images as inputs to an encoder trained to generate embeddings for cell images.
  • each cell state in the database may have a predefined archetypal embedding, and so determining the embeddings for the first and second states in block 702 may be performed by retrieving the applicable already defined embeddings.
  • Other approaches are also possible, and may be implemented without undue experimentation based on this disclosure. Accordingly, the above description of how embeddings may be determined for the first and second states in a method such as shown in FIG. 24 is to be understood as being illustrative only, and is not to be treated as limiting.
  • those embeddings may be used to determine a set of transitional dimensions in block 703. This determination may be performed by identifying how the embeddings for the first and second state differed from each other, and treating the dimensions whose values in the embeddings differed most as the transitional dimensions.
  • the process of FIG. 24 may then proceed to a loop in which a set of prompts may be created by modifying the values of those transitional dimensions to illustrate the transition between the first and second states, beginning with setting the first embedding (or, in some versions, the transitional dimensions from the first embedding) as a current prompt. With the current prompt set, an additional prompt may be created by modifying the current prompt in block 705.
  • the differences between the first and second states for each of the transitional dimensions may be divided by the predefined number of transitional images, and the current prompt may be modified in block 705 by adding the result of that division to each of the transitional dimensions.
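Selecting transitional dimensions and stepping the current prompt by equal increments may be sketched as follows. The function names and the `count` parameter (how many of the most-different dimensions to keep) are assumptions made for illustration.

```python
def transitional_dimensions(first, second, count):
    """Indices of the `count` dimensions whose values differ most between states."""
    ranked = sorted(range(len(first)),
                    key=lambda d: abs(second[d] - first[d]), reverse=True)
    return sorted(ranked[:count])


def transition_prompts(first, second, dimensions, steps):
    """Create `steps` prompts by repeatedly adding (difference / steps) per dimension."""
    increments = {d: (second[d] - first[d]) / steps for d in dimensions}
    current = list(first)
    prompts = []
    for _ in range(steps):
        current = list(current)          # copy so earlier prompts are not altered
        for d in dimensions:
            current[d] += increments[d]  # block 705: modify the current prompt
        prompts.append(current)
    return prompts


first = [0.0, 0.0, 0.0]
second = [1.0, 0.1, 4.0]
dims = transitional_dimensions(first, second, count=2)
prompts = transition_prompts(first, second, dims, steps=4)
```

In this example only the two most-different dimensions are modified, so the remaining dimension keeps its value from the first embedding in every prompt.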
  • the dimensions in the latent space for cell images may be associated with modification increments (e.g., minimum changes for those dimensions which may be likely to be detectable to a human viewer) and those modification increments may be used to modify the current prompt in block 705.
  • this may be done by: (1) identifying the dimension where the ratio of the difference between the embeddings for the first and second states to that dimension’s modification increment may be identified, (2) defining a number of stages as equal to the number of times that dimension’s modification increment goes into the difference between its value in the embeddings for the first and second states; and (3) modifying each of the transitional dimensions by adding the result of dividing the difference between that dimension’s value in the embeddings for the first and second states by the number of stages.
  • images may be generated in block 708 for the first state’s embedding and each additional prompt, such as by providing the additional prompt to a decoder to use in creating the images, by using the additional prompt to condition the output of a machine learning model trained to approximate a de-noising process such as described in the context of FIG. 23, or by providing only the transitional dimensions to a machine learning model trained to perform such conditional de-noising.
  • images may, in block 709, be displayed to a user, such as via a user interface generated by a system which performed the method of FIG. 24.
  • This interface may display the images showing the transition from the first state to the second state, and may also include other information which may be of benefit to a user.
  • FIG. 25 depicts an interface which may be used to display images such as those which may be generated in block 708.
  • FIG. 25 illustrates a set of synthetic cell images which may be created using the described technology to illustrate how a melanoma cell may change when transitioning from the melanocytic to the mesenchymal state.
  • the interface of FIG. 25 also shows particular micro-events which take place during the transition, which micro-events may or may not correspond to individual dimensions in the latent space.
  • an interface that may be used to display generated images in block 709 may include columns showing membrane changes, elongation, dark circumferential features, and cytoplasmic features. To generate this type of interface, a system implemented based on FIG. 24 may be programmed with a set of visual characteristics, as well as how those characteristics may be determined from cell images (e.g., membrane changes may be identified based on changes in circularity of a cell image; dark circumferential features may be identified based on relatively dark portions around the perimeter of a cell image), and how those characteristics may be illustrated for a user (e.g., membrane changes may be illustrated through highlighting the perimeter of a cell; elongation may be illustrated using a line along the major axis of a cell). Then, when creating an interface such as shown in FIG. 25, the visual characteristics which change most significantly from the first to the second states may be identified and used to populate the interface for review by the user.
  • changes which may be illustrated in a series of synthetic images may be associated with various descriptive terms (e.g., membrane changes represented by decreased circularity may be associated with descriptors like “irregular”) and semantic dimensions for those changes may be identified by creating embeddings for the descriptive terms, as well as other terms associated with the images like “mesenchymal” (e.g., using software such as Word2Vec, available from Google, Inc. at https://code.google.com/archive/p/word2vec/). Those semantic dimensions may then be compared with subjects addressed in the literature, and references having the greatest overlap may be identified as suggesting potentially fruitful paths for future research based on the generated images.
  • the disclosed technology may also be used to support insights into natural phenomena which may not be possible purely from the observation of natural cells.
  • Another example of a potential use for synthetic image generation technology such as described herein is to increase the interpretability of otherwise opaque decisions which may be made by an artificial intelligence system.
  • an artificial intelligence system which had been trained to classify cell images by creating representations of those images in a feature space and then providing the values of the dimensions in a cell image’s feature space representation to a dense network as inputs. Regardless of their accuracy, the reasons for those classifications may not be readily understandable by human beings, since feature space dimensions identified by a machine learning model may not correspond to commonly recognized visual attributes such as circularity or elongation.
  • a system may be implemented which allows a user to manipulate one or more feature space dimensions of interest (e.g., feature space dimensions which led to a classification that the user may like to better understand), and the disclosed technology may be used to generate cell images showing the impact of those manipulations. Additionally, in some cases these generated cell images may be presented in a manner to make the impact of changes in particular feature space dimensions clearer.
  • a system implemented based on this disclosure may display a synthetic cell image with changes highlighted (e.g., by subtracting an image generated with the user’s modifications from a baseline image where those modifications were not present, and then presenting the pixels which differed between the two images in a contrasting color), thereby allowing the user to obtain a more intuitive visual understanding of the feature space dimensions used by the machine learning model.
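The highlighting described above may be sketched as a pixel-wise comparison followed by recoloring. The grayscale list-of-rows image format, the `threshold` parameter, and the choice of red as the contrasting color are assumptions made for illustration.

```python
def changed_pixels(baseline, modified, threshold=0.0):
    """Mark pixels whose values differ between the baseline and modified images."""
    return [[abs(m - b) > threshold for b, m in zip(base_row, mod_row)]
            for base_row, mod_row in zip(baseline, modified)]


def highlight(modified, mask, color=(1.0, 0.0, 0.0)):
    """Render changed pixels in a contrasting color and the rest in grayscale."""
    return [[color if changed else (value, value, value)
             for value, changed in zip(row, mask_row)]
            for row, mask_row in zip(modified, mask)]


baseline = [[0.2, 0.2], [0.2, 0.2]]
modified = [[0.2, 0.9], [0.2, 0.2]]
mask = changed_pixels(baseline, modified)
display = highlight(modified, mask)
```

A nonzero `threshold` may be useful where small numerical differences between generated images should not be highlighted.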
  • the disclosed technology may also be applied to allow exploration and/or elaboration of the feature space in which cell images are represented.
  • a user may be able to define a vector, such as by defining a trend a cell might undergo (e.g., becoming cancerous) or by defining a change in one or more feature space dimensions (e.g., increasing elongation), and the disclosed technology may be used to generate images illustrating changes in the cell’s appearance as it moves along that vector.
  • a person or entity who was responsible for training the machine learning models which generate representations in feature space based on cell images may use the disclosed technology to visualize cells generated with embedding values at the limits of what was present in the training data.
  • the user may identify if the quality of the underlying training data may be improved by replacing the currently available images with cleaner images in the areas where artifacts were observed.
  • the disclosed technology may be used to create an atlas, in which synthetic images may serve as references for cells in different disease states and be used for diagnostic applications.
  • a feature space for cell images may include dimensions corresponding to attributes of the imaged cells, such as protein production or gene expression.
  • a machine learning model (e.g., a neural network such as described above)
  • This type of training data may be obtained, for example, by capturing images of cultured cells and annotating the captured images with information corresponding to the relevant feature space dimensions, such as the protein production or gene expression of the imaged cells at the time the images were captured.
  • Similar approaches can also be applied to other types of information, such as temporal information (e.g., how long had elapsed since the cell was imaged in a default state, such as a healthy state without any adverse conditions), or treatment information (e.g., a type of substance, such as a drug or drug metabolite, the imaged cell had been exposed to).
  • Further variations, such as use of compound dimensions (e.g., the amount of time since a treatment with a particular type of drug had begun), are also possible and may be implemented by those of ordinary skill in the art without undue experimentation in light of this disclosure. Accordingly, the examples of dimensions in feature space described herein are to be understood as illustrative only, and not limiting.
  • Implementations of the disclosed technology which incorporate information such as time, protein production, treatments, etc. may be used to provide a variety of types of information which may be of use in particular situations.
  • a system implemented based on this disclosure may answer a question such as what a cell would look like after a specified period of time had passed, or after a specified period of time had passed given a particular treatment.
  • this type of question may be answered by using information provided by the user (e.g., amount of time which had elapsed, whether there was a treatment regimen in place and, if so, what the regimen consisted of) as conditioning information for generating a synthetic cell image which responded to the user’s question using techniques such as described above.
  • questions like what a cell would look like after a given period of time had passed under a given set of conditions could be answered by identifying a vector in feature space (e.g., as may be derived from training data such as described above) showing changes in the cell over time under the given conditions, using that vector to project a location in feature space from an initial condition (e.g., a cell image provided by the user) to the specified future time, and then converting that location into a synthetic image answering the user’s question using techniques such as described above.
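The projection step may be sketched as moving an initial embedding along a per-unit-time change vector. This assumes, purely for illustration, that change over time is linear in feature space; the name `project_state` and the example values are hypothetical.

```python
def project_state(initial, change_per_unit_time, elapsed):
    """Project an embedding forward along a change vector for `elapsed` time units."""
    return [x + rate * elapsed for x, rate in zip(initial, change_per_unit_time)]


initial_embedding = [1.0, 2.0]   # e.g., from encoding a user-provided cell image
change_vector = [0.5, -0.25]     # e.g., derived from annotated training data
future_embedding = project_state(initial_embedding, change_vector, elapsed=4.0)
```

The projected embedding could then be converted into a synthetic image, for example by a decoder or by conditioned de-noising.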
  • processors may then simultaneously apply the instruction from the instruction unit 902 to a set of different data items from memory 904.
  • a single instruction may be loaded and applied to multiple data items in many fewer cycles (e.g., load n data items, apply instruction to each of the n data items), than might be needed if rendering was performed using a conventional central processing unit (CPU) (e.g., load data item 1, apply instruction to data item 1, load data item 2, apply instruction to data item 2, . . . load data item n, apply instruction to data item n).
  • This can be useful in generating images, and in training and applying machine learning models such as described, because those activities involve applying the same operations to many different items (e.g., matrix multiplication with different data).
  • a version of the disclosed technology may include imaging hardware by which a user may capture one or more cell images, as well as software which may integrate that hardware with cell image generation technology such as described, and which may allow a user to review images captured by the imaging system and then select images to provide to the image generation system as inputs. For example, a user may be allowed to ask for images to be generated indicating what may happen if a cell became cancerous, or if it transitioned into another specified state, etc.
  • FIG. 27 shows an example of a cell analysis system 1000, which may be used to capture images of cells.
  • System 1000 of this example includes a pump 1010 that is to drive a sample cell-containing fluid from a reservoir 1012 into a cartridge 1020.
  • Cartridge 1020 may be provided as a modular component, such that cartridge 1020 may be readily replaced within system 1000 (e.g., to analyze different batches of sample cells, etc.).
  • the remaining components of system 1000 that do not get replaced each time cartridge 1020 is replaced may be collectively referred to as “the instrument.”
  • the instrument of system 1000 may include pump 1010; or pump 1010 may be considered as a separate component such that a different pump 1010 may be used when a different batch of sample cells is being analyzed.
  • the instrument of system 1000 may include reservoir 1012; or reservoir 1012 may be considered as a separate component such that a different reservoir 1012 may be used when a different batch of sample cells is being analyzed.
  • reservoir 1012 comprises a syringe barrel; and pump 1010 comprises a syringe pump.
  • pump 1010 may take any other suitable form, including but not limited to a gravity feed, a peristaltic pump, etc.
  • Reservoir 1012 may also take any other suitable form, including but not limited to a vial, tube, etc.
  • the sample in reservoir 1012 may be prepared by fixation and staining; and may contain viable cells.
  • the fluid in which the sample cells are contained may include an aqueous solution (e.g., water, buffer, saline, etc.), an oil, or any other suitable fluid.
  • Cartridge 1020 includes a flow channel 1022 fluidically coupled with pump 1010, such that pump 1010 is to drive the sample cell-containing fluid from reservoir 1012 through flow channel 1022.
  • Cartridge 1020 may comprise a structure through which fluid may flow; and through which cells in the fluid may be imaged.
  • a light source 1030 generates light for such imaging.
  • an optical assembly 1032 directs light from light source 1030 toward an imaging region of flow channel 1022.
  • light source 1030 comprises a source of incoherent white light, such as an arc lamp, etc.
  • light source 1030 may take any other suitable form, such as a laser.
  • Optical assembly 1032 may comprise any suitable number and/or arrangement of lenses and/or other elements as will be apparent to those skilled in the art in view of the teachings herein.
  • An objective lens assembly 1040 is positioned on the opposite side of the imaging region of flow channel 1022, magnifies the images of cells passing through the imaging region of flow channel 1022, and directs the magnified images to a camera 1042.
  • Objective lens assembly 1040 and camera 1042 thus cooperate to capture high resolution images of cells that pass through the imaging region of flow channel 1022, as illuminated by light source 1030 and optical assembly 1032.
  • objective lens assembly 1040 may provide magnification ranging from approximately 10x to approximately 200x. In other examples, objective lens assembly 1040 may provide any other suitable level of magnification.
  • camera 1042 may provide an exposure time ranging from approximately 0.001 µs to approximately 1 ms. In other examples, camera 1042 may provide any other suitable exposure time.
  • objective lens assembly 1040 and camera 1042 have an optical axis along the z-dimension. As described, images captured using the camera 1042 and lens assembly 1040 may be provided to a computer or other suitable hardware which can use them in image generation or for other purposes.
  • first and second states in block 701 may include a user providing first and second sets of pictures, and defining the first and second states based on those sets.
  • the initial embedding for the first state may be an embedding whose dimensions have values equal to the averages for those dimensions in the first set of images.
  • the determination in block 706 of whether the second state has been reached may be based on whether the most recently created additional prompt was within a convex hull defined by the embeddings of the second set of pictures in feature space.
  • an image generation process may generate an image for each prompt as soon as that prompt itself has been created, rather than waiting until the set of prompts is complete.
  • Example 1 Evaluation of the human foundation model using different cell lines.
  • Cancer cell lines A375 and Caov-3, and immune cell line Jurkat, were used to evaluate the classification performance of the human foundation model.
  • Polystyrene beads with a size of 6 micrometers (µm) were used as control.
  • the cell lines and polystyrene beads were imaged using the microfluidics platform (e.g., REM-I platform) as described herein and combined in silico to evaluate the performance of the human foundation model.
  • the human foundation model processed the images of the cell lines and polystyrene beads, and extracted deep learning and morphometric features. These features were standardized and projected into a lower dimensional principal components analysis (PCA) basis.
  • Table 1 (above) is a panel of deep learning derived features generated using the DL model of the human foundation model and used in the present examples.
  • Table 2 (above) is a panel of morphometric features generated using the computer vision model of the human foundation model and used in the present examples. It should be noted that the number and types of features listed in Tables 1 and 2 are provided only as examples, without limiting the scope of the present disclosure. More and/or different features can be included. This example illustrates generated and extracted features using the human foundation model and computer vision model, including features that can be referred to as blobs or granules.
  • Example 2 System for Cell Morphology Analysis.
  • Table 3 in this example lists parameters and specifications of an example system such as described with reference to FIG. 2A (e.g., REM-I) for cell morphology analysis as well as image analysis generally, cell sorting, and other operations described herein.
  • Table 5 lists the example components of one example system. in Table 4 denotes the specification is dependent on sample characteristics and/or sorting configurations.
  • a method comprising: generating a set of prompts, wherein each prompt of the set of prompts comprises a set of values corresponding to characteristics of an image type; and generating a set of images, wherein generating the set of images comprises, for each prompt from the set of prompts, generating an image corresponding to that prompt from at least providing that prompt to a trained generative artificial intelligence model; and displaying a user interface to present the set of images.
  • generating the set of prompts comprises generating the set of prompts in a sequence, wherein each prompt after an initial prompt in the sequence is generated by performing one or more modification acts, wherein the one or more modification acts comprise modifying one or more values of the set of values comprised by a preceding prompt in the sequence; and the user interface is to present the set of images in the sequence.
  • the method comprises identifying a set of visual characteristics which differ between the initial prompt in the sequence and one or more other prompts in the sequence; and the method comprises, for each of one or more images from the set of images, displaying, using the user interface, highlighted features corresponding to the set of visual characteristics as they appear in that image.
  • the method further comprises receiving user input data; the sequence of prompts establishes a transition path from a first state to a second state, wherein both the first state and the second state are determined using at least the user input data; the method further comprises identifying a subset of the set of values as significant values separating the first state from the second state; and for each prompt from the set of prompts after the initial prompt, the one or more modification acts are performed by modifying one or more of the subset of the set of values identified as significant values separating the first state from the second state.
  • the method comprises receiving user input data; and for each prompt after the initial prompt, the one or more modification acts are performed by identifying a location along a vector in a space defined by the characteristics of the image type, wherein: the location is more distant from a location for the initial prompt in the sequence than a location for the preceding prompt in the sequence; and the vector is determined using at least the user input data.
  • the set of values comprises a value for a bounded characteristic, wherein the bounded characteristic is a characteristic of the image type for which there is a logical upper bound on values for the bounded characteristic; and the set of prompts comprises at least one prompt in which the value corresponding to the bounded characteristic is greater than the logical upper bound on values for the bounded characteristic.
  • the user input data comprises an identification of a time period, and a context for change over time; and the vector is a direction in the space defined by the characteristics of the image type for changes over time given the context from the user input data.
  • generating the set of images comprises, for each prompt from the set of prompts: obtaining a derived prompt by providing that prompt as input to a latent diffusion model; and providing the derived prompt as input to a decoder.
  • the set of images comprises a set of subsets of images, wherein each subset of images corresponds to a condition; and generating the set of prompts comprises, for each subset of images, generating a set of prompts whose values are consistent with the condition corresponding to that subset of images.
  • the method further comprises capturing, using a cell analysis system, a plurality of images of cells from a sample; the method comprises receiving user input data which comprises a selected image from the plurality of images of cells; and the method further comprises generating at least one prompt from the set of prompts by providing the selected image from the plurality of cell images to an encoder.
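  • As a non-limiting illustration of generating a sequence of prompts along a vector, the following is a minimal sketch in which each prompt is a list of numeric values and each later prompt lies farther from the initial prompt than its predecessor. The characteristic names, the direction vector, and the fixed step size are hypothetical; the disclosure does not prescribe how the vector is derived from the user input data.

```python
import math

def prompts_along_vector(initial, direction, n_steps, step_size):
    """Return [initial, p1, ..., pn], each farther from initial than the last."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    return [
        [x + k * step_size * u for x, u in zip(initial, unit)]
        for k in range(n_steps + 1)
    ]

# Hypothetical characteristics: (area, circularity, granularity).
initial = [100.0, 0.9, 0.1]
# Hypothetical direction, e.g., derived from user input data.
direction = [30.0, -0.2, 0.4]
sequence = prompts_along_vector(initial, direction, n_steps=4, step_size=10.0)
distances = [math.dist(initial, p) for p in sequence]
```

  • The sketch deliberately applies no clamping, so a value may move past a logical bound on its characteristic, as contemplated for bounded characteristics above. Each prompt in sequence would then be provided to a trained generative model to produce the corresponding image.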
  • a method comprising: determining an initial prompt by, using an encoder, determining an embedding comprising, for each of one or more dimensions of a feature space for images of an image type, a real number value for that dimension which is consistent with a first state; and determining a set of transitional prompts, wherein each prompt from the set of transitional prompts has a preceding neighbor, which is the initial prompt for one of the prompts in the set of transitional prompts, and which is otherwise another prompt in the set of transitional prompts; and generating a set of images using at least, for the initial prompt, and for each transitional prompt from the set of transitional prompts, generation of an image using at least that prompt and a decoder trained to generate images of the image type based on embeddings in the feature space for images of the image type; and presenting the set of images to a user using a user interface.
  • the method further comprises identifying a set of visual characteristics which differ between the initial prompt and one or more other prompts from the set of transitional prompts; and the method further comprises, for one or more images from the set of images, displaying, using the user interface, highlighted features corresponding to the set of visual characteristics as the highlighted features appear in that image.
  • the set of dimensions in the feature space for images of the image type comprises one or more morphometric features; and the set of visual characteristics comprises at least one visual characteristic which is not a dimension from the set of dimensions in the feature space for images of the image type.
  • the method further comprises receiving user input data; the set of transitional prompts establishes a transition path from the first state to a second state, wherein both the first state and the second state are based on the user input data; the method further comprises identifying a subset of the set of dimensions in the feature space for images of the image type as significant dimensions having values separating the first state from the second state; and for each prompt from the set of transitional prompts, determining that prompt comprises modifying one or more of the subset of the set of dimensions identified as significant dimensions relative to that transitional prompt’s preceding neighbor.
  • the method further comprises receiving user input data; and determining the set of transitional prompts comprises, for each prompt from the set of transitional prompts, identifying a location along a vector in the feature space, wherein: the location is more distant from a location for the initial prompt in the feature space; and the vector is determined using at least the user input data.
  • the set of dimensions in the feature space comprises a bounded dimension, wherein the bounded dimension corresponds to a characteristic of the image type for which there is a logical upper bound on values for the characteristic; and the set of prompts comprises at least one prompt in which the value corresponding to the bounded dimension is greater than the logical upper bound on values for the characteristic.
  • the user input data comprises an identification of a time period, and a context for change over time; and the vector is a direction in feature space for changes over time given the context from the user input data.
  • generating the set of images comprises, for the initial prompt and each prompt from the set of transitional prompts: obtaining a derived prompt by providing that prompt as input to a latent diffusion model; and providing the derived prompt as input to a decoder.
  • the method of any of claims 25-37, wherein the set of images comprises a set of subsets of images, wherein each subset of images corresponds to a condition; and determining the first prompt and each prompt from the set of transitional prompts comprises, for each subset of images, generating a set of prompts whose values are consistent with the condition corresponding to that subset of images.
  • the method further comprises capturing, using a cell analysis system, a plurality of images of cells from a sample; the method further comprises receiving user input data which comprises a selected image from the plurality of images of cells; and the method further comprises determining the initial prompt by providing the selected image from the plurality of cell images to an encoder.
  • a system comprising: a display; and the non-transitory computer readable medium of example 44.
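  • As a non-limiting illustration of identifying significant dimensions separating a first state from a second state and modifying only those dimensions along a transition path, the following is a minimal sketch. The significance criterion used here (difference of group means, scaled by the standard deviation of the combined samples, compared against a threshold) is an assumption chosen for illustration; the disclosure does not specify a particular criterion, and all names and values are hypothetical.

```python
from statistics import mean, pstdev

def significant_dimensions(first_state, second_state, threshold=1.0):
    """Indices whose group means differ by more than `threshold` times
    the standard deviation of the combined samples (assumed criterion)."""
    dims = []
    for i in range(len(first_state[0])):
        a = [row[i] for row in first_state]
        b = [row[i] for row in second_state]
        spread = pstdev(a + b) or 1.0  # guard against zero spread
        if abs(mean(a) - mean(b)) / spread > threshold:
            dims.append(i)
    return dims

def transitional_prompts(start, target, dims, n_steps):
    """Move only the significant dimensions from start toward target."""
    prompts = []
    for k in range(1, n_steps + 1):
        prompt = list(start)
        for i in dims:
            prompt[i] = start[i] + (target[i] - start[i]) * k / n_steps
        prompts.append(prompt)
    return prompts

# Hypothetical embeddings for cells in each state (3 dimensions).
first_state = [[1.0, 5.0, 0.2], [1.2, 5.1, 0.3], [0.8, 4.9, 0.25]]
second_state = [[3.0, 5.0, 0.22], [3.2, 5.2, 0.28], [2.8, 4.8, 0.27]]
dims = significant_dimensions(first_state, second_state)
prompts = transitional_prompts([1.0, 5.0, 0.25], [3.0, 5.0, 0.26], dims, 4)
```

  • In this sketch only the first dimension separates the two states, so each transitional prompt differs from its preceding neighbor in that dimension alone while the remaining dimensions stay fixed at their starting values. Each prompt would then be passed to a decoder to generate the corresponding image.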

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Bioethics (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Computational Linguistics (AREA)
  • Public Health (AREA)
  • Epidemiology (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

Examples of the present disclosure relate to systems and methods for evaluating image data that include extracting, using a deep learning (DL) model and image data of a plurality of cells, a vector for a cell of the plurality of cells, the vector comprising a set of machine learning (ML)-based features and a set of cell morphometric features extracted using a computer vision model; and generating, using the DL model and the set of ML-based features, a plurality of DL embeddings that are orthogonal to each other.
PCT/US2024/053235 2023-10-30 2024-10-28 Systems and methods for morphometric analysis Pending WO2025096338A1 (fr)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US202363546385P 2023-10-30 2023-10-30
US63/546,385 2023-10-30
NL2036275 2023-11-15
NL2036275A NL2036275B1 (en) 2023-10-30 2023-11-15 Systems and methods for morphometric analysis
US202463618660P 2024-01-08 2024-01-08
US63/618,660 2024-01-08
NL2036929A NL2036929B1 (en) 2024-01-08 2024-01-30 Generative artificial intelligence technology for generating images
NL2036929 2024-01-30

Publications (1)

Publication Number Publication Date
WO2025096338A1 (fr) 2025-05-08

Family

ID=95580990

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/053235 Pending WO2025096338A1 (fr) 2023-10-30 2024-10-28 Systems and methods for morphometric analysis

Country Status (1)

Country Link
WO (1) WO2025096338A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN121213946A (zh) * 2025-10-22 2025-12-26 北京科技职业大学 Visual detection system for wheel-flange debris for railway axle counting

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210147922A1 (en) * 2018-04-18 2021-05-20 Altius Institute For Biomedical Sciences Methods for assessing specificity of cell engineering tools
US20220163513A1 (en) * 2020-11-23 2022-05-26 University Of Washington Visual cell sorting
WO2022178095A1 (fr) * 2021-02-19 2022-08-25 Deepcell, Inc. Systems and methods for cell analysis

Similar Documents

Publication Publication Date Title
US11901077B2 (en) Multiple instance learner for prognostic tissue pattern identification
EP4022500B1 (fr) Multiple instance learner for tissue image classification
US12229959B2 (en) Systems and methods for determining cell number count in automated stereology z-stack images
US20240153289A1 (en) Systems and methods for cell analysis
Wang et al. Deep learning approach to peripheral leukocyte recognition
Van Valen et al. Deep learning automates the quantitative analysis of individual cells in live-cell imaging experiments
Sommer et al. A deep learning and novelty detection framework for rapid phenotyping in high-content screening
Bakker et al. Morphologically constrained and data informed cell segmentation of budding yeast
CN117178302A (zh) Systems and methods for cell analysis
US20250218201A1 (en) Analyzing phenotypes of cells
Gupta et al. Simsearch: A human-in-the-loop learning framework for fast detection of regions of interest in microscopy images
AU2021344515A9 (en) Methods and systems for predicting neurodegenerative disease state
WO2025096338A1 (fr) Systems and methods for morphometric analysis
Aggarwal et al. Protein subcellular localization prediction by concatenation of convolutional blocks for deep features extraction from microscopic images
US20250191680A1 (en) Analyzing cell phenotypes
Mapstone et al. Machine learning approaches for image classification in developmental biology and clinical embryology
Eulenberg et al. Deep learning for imaging flow cytometry: cell cycle analysis of Jurkat cells
US20260104329A1 (en) Compositions, systems, and methods for multiple analyses of cells
NL2036275B1 (en) Systems and methods for morphometric analysis
Fishman et al. Segmenting nuclei in brightfield images with neural networks
WO2024238130A2 (fr) Systems and methods for cell morphology analysis
WO2025122842A1 (fr) Analyzing phenotypes of cells
Yuenyong et al. Detection of centroblast cells in H&E stained whole slide image based on object detection
Diouf et al. Combining deep learning and microfluidics for fast and noninvasive sorting of zebrafish embryo
Khatri et al. Deep learning based reconstruction of embryonic cell-division cycle from label-free microscopy time-series of evolutionarily diverse nematodes

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 24886659

Country of ref document: EP

Kind code of ref document: A1