WO2025199652A1 - Systems and methods for detecting breeding receptivity in livestock and other quadruped mammals

Systems and methods for detecting breeding receptivity in livestock and other quadruped mammals

Info

Publication number
WO2025199652A1
Authority
WO
WIPO (PCT)
Prior art keywords
animal
state
animals
image
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CA2025/050447
Other languages
English (en)
Inventor
Jeffrey SHMIGELSKY
Chen QIAO
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
One Cup Productions Ltd
Original Assignee
One Cup Productions Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by One Cup Productions Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61D VETERINARY INSTRUMENTS, IMPLEMENTS, TOOLS, OR METHODS
    • A61D17/00 Devices for indicating trouble during labour of animals; Methods or instruments for detecting pregnancy-related states of animals
    • A61D17/002 Devices for indicating trouble during labour of animals; Methods or instruments for detecting pregnancy-related states of animals for detecting period of heat of animals, i.e. for detecting oestrus
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61D VETERINARY INSTRUMENTS, IMPLEMENTS, TOOLS, OR METHODS
    • A61D17/00 Devices for indicating trouble during labour of animals; Methods or instruments for detecting pregnancy-related states of animals
    • A61D17/004 Devices for indicating trouble during labour of animals; Methods or instruments for detecting pregnancy-related states of animals for detecting mating action

Definitions

  • Animal management can involve animal assessments such as determining if an animal is receptive to be bred (i.e., in an estrus state). Detecting whether an animal is receptive to be bred (estrus detection) offers significant benefits in animal husbandry and agriculture. It allows for precise timing of mating or artificial insemination, optimizing breeding programs for improved genetic traits, and herd management. Efficient estrus detection can increase reproductive efficiency, resulting in higher conception rates, and in reduced costs that may be incurred as a result of unsuccessful breeding attempts. Identifying estrus can also aid in disease management, since changes in behavior and physiology can signal health issues in an animal.
  • a method for animal management comprising: receiving, at a computing device, one or more images captured by one or more imaging devices; processing the one or more images using one or more AI models for: detecting and locating two animals with a pair of overlapping bounding boxes in a given image of the one or more images; determining if a current state of a first animal of the two detected animals in the given image matches a mounting state and when the first animal is in the mounting state identifying a second animal of the two detected animals as being mounted by the first animal which is indicative that the second animal is in an estrus state; and generating a notification when the second animal is in the estrus state or the first animal is in the mounting state.
  • determining if the current state of the first detected animal matches the mounting state comprises: determining if the pair of overlapping bounding boxes corresponds to two adult animals using an age classification model, the age classification model being trained to generate an age classification output, the age classification output being one of: adult or non-adult; determining if the pair of overlapping bounding boxes corresponds to two animals in a standing position, using an activity classification model, the activity classification model being trained to generate an activity classification output, the activity classification output being one of: standing or non-standing; and in response to the pair of overlapping bounding boxes corresponding to two adult animals being in a standing position, applying a mounting state detection model to an image portion encapsulated by the bounding box for the first animal, to generate a mounting classification output for the detected animal associated with the image portion, the mounting classification output indicating whether the first animal in the image portion is mounting the second animal which is indicative that the second animal is in the estrus state and the first animal will be in the estrus state in the future.
  • determining if the current state of the first detected animal matches the mounting state comprises: determining if the pair of overlapping bounding boxes corresponds to two adult animals using an age classification model, the age classification model being trained to generate an age classification output, the age classification output being one of: adult or non-adult; determining if the pair of overlapping bounding boxes corresponds to two animals in a standing position, using an activity classification model, the activity classification model being trained to generate an activity classification output, the activity classification output being one of: standing or non-standing; in response to the pair of overlapping bounding boxes corresponding to two adult animals being in a standing position, applying a mounting state detection model to an image portion encapsulated by the bounding boxes of the first detected animal, to generate a mounting classification output for the first detected animal associated with the image portion, the mounting classification output indicating whether the first animal in the image portion is mounting the second animal; and if the second animal being mounted stays stationary for a predefined period of time after the mounting has finished, identifying the second animal being mounted as
  • the mounting state detection model is trained to: generate a mounting classification output distinguishing the mounting state from other non-mounting states; and generate a confidence value for the mounting classification output.
  • the method further comprises determining if the confidence value for the mounting classification output satisfies a predetermined threshold; and in response to determining the confidence value for the mounting classification output satisfies the predetermined threshold, generating the notification.
  • the notification comprises an annotated image including the animals matching the estrus state, the mounting state and/or the standing heat state.
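The gating order described in the mounting-state claims above (overlapping boxes, then adult, then standing, then the mounting classifier with a confidence threshold) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the classifiers are passed in as hypothetical callables that take a bounding box in place of an image portion, and the names `boxes_overlap`, `check_mounting`, `MountingResult` and the default threshold of 0.8 are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# (x_min, y_min, x_max, y_max) in pixels; this layout is an assumption.
Box = Tuple[float, float, float, float]


def boxes_overlap(a: Box, b: Box) -> bool:
    """Return True when two axis-aligned boxes share a non-zero area."""
    return min(a[2], b[2]) > max(a[0], b[0]) and min(a[3], b[3]) > max(a[1], b[1])


@dataclass
class MountingResult:
    is_mounting: bool   # mounting vs. any non-mounting state
    confidence: float   # confidence value for the classification output


def check_mounting(
    first_box: Box,
    second_box: Box,
    is_adult: Callable[[Box], bool],         # stand-in for the age classification model
    is_standing: Callable[[Box], bool],      # stand-in for the activity classification model
    mounting_model: Callable[[Box], MountingResult],  # stand-in for the mounting model
    threshold: float = 0.8,                  # hypothetical predetermined threshold
) -> Optional[str]:
    """Return a notification string, or None when any gate fails."""
    if not boxes_overlap(first_box, second_box):
        return None
    if not (is_adult(first_box) and is_adult(second_box)):
        return None
    if not (is_standing(first_box) and is_standing(second_box)):
        return None
    # The mounting model is applied only to the first animal's image portion.
    result = mounting_model(first_box)
    if result.is_mounting and result.confidence >= threshold:
        return (
            f"mounting detected (confidence {result.confidence:.2f}); "
            "second animal may be in estrus"
        )
    return None
```

Running the cheaper age and activity checks first means the mounting classifier is only evaluated on pairs that could plausibly show mounting behavior.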
  • a method for animal management comprising: receiving, at a computing device, one or more images captured by one or more imaging devices; processing the one or more images using one or more AI models for: detecting and locating two animals with a pair of overlapping bounding boxes in a given image of the one or more images; determining if a current state of a first animal of the two detected animals in the given image matches a chin-resting state and when the first animal is in the chin-resting state identifying a second animal of the two detected animals as receiving a chin of an animal near its rear end which is indicative that the second animal is in an estrus state; and generating a notification when the second animal is in the estrus state or the first animal is in the chin-resting state.
  • determining if the current state of the first detected animal matches the chin-resting state comprises: determining if the pair of overlapping bounding boxes corresponds to two adult animals using an age classification model, the age classification model being trained to generate an age classification output, the age classification output being one of: adult or non-adult; determining if the pair of overlapping bounding boxes corresponds to two animals in a standing position using an activity classification model, the activity classification model being trained to generate an activity classification output, the activity classification output being one of: standing or non-standing; in response to the pair of overlapping bounding boxes corresponding to two adult animals in a standing position, generating a merged bounding box, the merged bounding box encapsulating an image portion that includes the two standing adult animals; and generating a chin-resting classification output by using a chin-resting state classification model to analyze the merged bounding box, the chin-resting classification output indicating whether the first detected animal in the image portion is in a chin-resting state which is
  • determining if the current state of the first detected animal matches the chin-resting state comprises: determining if the pair of overlapping bounding boxes corresponds to two adult animals using an age classification model, the age classification model being trained to generate an age classification output; determining if the pair of overlapping bounding boxes corresponds to two animals in a standing position using an activity classification model, the activity classification model being trained to generate an activity classification output, the activity classification output being one of: standing or non-standing; in response to the pair of overlapping bounding boxes corresponding to two adult animals in a standing position, generating a merged bounding box, the merged bounding box encapsulating an image portion that includes the two standing adult animals; and generating a chin-resting classification output by using a chin-resting state classification model to analyze the merged bounding box, the chin-resting classification output indicating whether the first detected animal in the image portion is in a chin-resting state which is indicative of the second animal being in the estrus state.
  • the chin-resting state classification model is trained to: generate a chin-resting classification output distinguishing the chin-resting state from other non-chin-resting states; and generate a confidence value for the chin-resting classification output.
  • the method further comprises determining if the confidence value for the chin-resting classification output satisfies a predetermined threshold; and in response to determining the confidence value for the chin-resting classification output satisfies the predetermined threshold, generating the notification.
  • the notification comprises information related to the current state of each of the at least one detected animal, and one or more images or videos of each detected animal.
  • the notification comprises an annotated image including the at least one detected animal matching the estrus state and/or the chin-resting state.
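The merged bounding box used in the chin-resting claims above can be read as the smallest axis-aligned box that contains both animals' boxes. The sketch below assumes integer pixel boxes in `(x_min, y_min, x_max, y_max)` order and an image stored as a list of pixel rows; `merge_boxes` and `crop` are hypothetical helper names, not terms from the source.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), assumed layout
Image = List[List[int]]           # rows of pixel values, assumed layout


def merge_boxes(a: Box, b: Box) -> Box:
    """Smallest axis-aligned box that encapsulates both input boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def crop(image: Image, box: Box) -> Image:
    """Return the image portion inside the box (max edges are exclusive)."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]
```

The cropped portion would then be passed to the chin-resting state classification model, so the classifier sees both animals and their relative position in one input.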
  • a method for animal management comprising: receiving, at a computing device, a plurality of images captured by one or more imaging devices, the plurality of images corresponding to images sequentially obtained during a predefined time window; processing the plurality of images using one or more AI models for: detecting and locating two animals with overlapping bounding boxes in a given image of the one or more images; and determining if a current state of a first animal of the two detected animals in the given image matches a mounting state; when the first animal is in the mounting state, determining if a second animal of the two detected animals is in a standing heat state, wherein the second animal is in the standing heat state if the second animal remains stationary for at least several images sequentially obtained over a predefined period of time after the second animal is no longer being mounted; and providing a notification when the second animal is determined to be in the standing heat state.
  • determining if the current state of the first detected animal in the given image matches a mounting state comprises: determining if the pair of overlapping bounding boxes corresponds to two adult animals using an age classification model, the age classification model being trained to generate an age classification output, the age classification output being one of: adult or non-adult; determining if the pair of overlapping bounding boxes corresponds to two animals in a standing position, using an activity classification model, the activity classification model being trained to generate an activity classification; in response to the pair of overlapping bounding boxes corresponding to two adult animals being in a standing position, applying a mounting classification image classification model to an image portion encapsulated by the bounding box of the first detected animal to generate a mounting classification output for the first detected animal, the mounting classification output indicating whether the first detected animal in the image portion is mounting the second detected animal.
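The standing heat test described above is a temporal check: after mounting ends, the second animal must stay roughly in place for several consecutive images. A minimal sketch, assuming per-frame mounting flags and bounding-box centers for the second animal are already available; the function name `is_standing_heat`, the frame count `min_still_frames` and the pixel tolerance `max_shift` are hypothetical and are not values given in the source.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # center of the second animal's bounding box


def is_standing_heat(
    mounted_flags: List[bool],   # per frame: is the second animal being mounted?
    centers: List[Point],        # per frame: center of the second animal's box
    min_still_frames: int = 5,   # hypothetical "several images"
    max_shift: float = 10.0,     # hypothetical stationarity tolerance in pixels
) -> bool:
    """True if the animal stays put for min_still_frames frames after mounting ends."""
    if len(mounted_flags) != len(centers):
        raise ValueError("mounted_flags and centers must have the same length")
    if True not in mounted_flags:
        return False  # no mounting observed in this time window
    # Index of the last frame in which mounting was observed.
    last_mounted = len(mounted_flags) - 1 - mounted_flags[::-1].index(True)
    window = centers[last_mounted + 1 : last_mounted + 1 + min_still_frames]
    if len(window) < min_still_frames:
        return False  # not enough post-mounting frames to decide
    ref_x, ref_y = window[0]
    return all(
        abs(x - ref_x) <= max_shift and abs(y - ref_y) <= max_shift
        for x, y in window
    )
```

Comparing every frame against the first post-mounting position, rather than frame to frame, prevents slow drift from being counted as stationary.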
  • the method comprises providing a candidate estrus image to a multi-modal deep learning AI model that is trained to analyze the candidate estrus image to generate a user notification including the candidate estrus image, image description text and a detected event indication where the image description text includes text describing what is shown in the candidate estrus image and the detected event indication includes text that describes a detected estrus event.
  • the user notification further includes location and time information for indicating where and when the candidate estrus image was obtained.
  • determining the candidate estrus image is based on detecting estrus according to any of the methods described herein.
  • the multi-modal deep learning AI model is trained by providing to the multi-modal deep learning AI model: (a) an introduction description that is used to train the multi-modal deep learning AI model to perform certain functions, (b) an estrus detection signs description that is used to train the multi-modal deep learning AI model to learn estrus detection signs to look for in the candidate estrus image and (c) a reporting estrus signs description that is used to train the multi-modal deep learning AI model to learn estrus detection reporting signs to use when generating the image description text and the detected event indication in the user notification.
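One possible reading of the three descriptions above, which is an assumption here rather than something the source states, is that they are supplied to the multi-modal model as text instructions alongside the candidate image. The sketch below only assembles that text and defines a container for the notification fields the claims list; `build_instructions`, `UserNotification` and all field names are hypothetical, and no model call is shown.

```python
from dataclasses import dataclass


def build_instructions(introduction: str, estrus_signs: str, reporting: str) -> str:
    """Join the three descriptions, in the order (a), (b), (c), into one instruction text."""
    parts = [
        ("Introduction", introduction),
        ("Estrus signs to look for", estrus_signs),
        ("How to report", reporting),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in parts)


@dataclass
class UserNotification:
    image_path: str         # the candidate estrus image
    image_description: str  # text describing what the image shows
    detected_event: str     # text describing the detected estrus event
    location: str           # where the image was obtained
    captured_at: str        # when the image was obtained (format is an assumption)
```

Keeping the reporting description separate from the signs description makes it possible to change the wording of notifications without touching the detection guidance.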
  • a sex of the detected at least one animal is obtained using a sex classifier, and further processing of the given image is not performed when the at least one detected animal is male.
  • the notification indicates that the detected animal in a mounting state or in a chin-resting state will be experiencing estrus within the near future.
  • a computing device comprising a memory storing program instructions and a processor that is coupled to the memory to read and execute the program instructions which configure the processor to perform a method for animal management including detecting at least one animal that is experiencing estrus, wherein the method is defined according to any of the methods described herein.
  • a non-transitory computer readable medium storing thereon program instructions that, when executed by a processor of a computing device, configure the processor to perform a method for animal management including detecting at least one animal that is experiencing estrus, wherein the method is defined according to any of the methods described herein.
  • FIG. 1A is a schematic diagram showing an application of an animal assessment system, according to at least one example embodiment of this disclosure.
  • FIG. 1B is a schematic diagram showing another example application of the animal assessment system of FIG. 1A.
  • FIG. 2 is a schematic diagram illustrating an example embodiment of the hardware structure of an imaging device that may be used with one of the embodiments of the animal assessment systems described herein.
  • FIG. 3A is a schematic diagram illustrating an example embodiment of the hardware structure of the client computing devices and server computer of one of the animal assessment systems described herein.
  • FIG. 3B is a schematic diagram illustrating an example embodiment of a simplified software architecture of the client computing devices and server of at least one of the embodiments of the animal assessment system described herein.
  • FIG. 4A is a schematic diagram of a deep neural network (DNN) that may be used by at least one of the embodiments of the animal assessment system described herein.
  • FIG. 4B is a schematic diagram of a Jigsaw model that may be used by the animal assessment system of FIG. 1A.
  • FIG. 5 is a flowchart showing an example embodiment of an animal assessment method that may be executed by a server program of at least one of the embodiments of the animal assessment system described herein, for detecting a mounting state in at least one animal at the site using an Artificial Intelligence (AI) pipeline.
  • FIG. 6 is a flowchart showing another example embodiment of an animal assessment method that may be executed by a server program of at least one of the embodiments of the animal assessment system described herein, for detecting a mounting state in at least one animal at the site using an Artificial Intelligence (AI) pipeline.
  • FIGS. 7A-7C are example synthetic images that may be generated for training a mounting state detection model of at least one of the embodiments of the animal assessment system described herein.
  • FIG. 9A shows an example image that may be processed according to the teachings herein.
  • FIG. 9B shows an output of an age classifier applied to the image of FIG. 9A.
  • FIG. 10A shows an example image that may be processed according to the teachings herein.
  • FIG. 10B shows an output of an activity classifier applied to the image of FIG. 10A.
  • FIG. 11A shows an example image that may be processed according to the teachings herein.
  • FIGS. 11B-11C show the image of FIG. 11A processed for detection of a mounting state.
  • FIGS. 12A-12H show images processed by the animal assessment system, showing a mounting state and associated heatmaps showing the accuracy of a mounting state detection model of the animal assessment system.
  • FIG. 13 is a flowchart showing an example embodiment of an animal assessment method that may be executed by a server program of at least one of the embodiments of the animal assessment system described herein, for detecting a standing heat in at least one animal at the site using an Artificial Intelligence (AI) pipeline.
  • FIGS. 14A-14F show images processed by the animal assessment method of FIG. 13 for standing heat detection.
  • FIG. 15 is a flowchart showing an example embodiment of an animal assessment method that may be executed by a server program of at least one of the embodiments of the animal assessment system described herein, for detecting chin-resting in at least one detected animal at the site using an Artificial Intelligence (AI) pipeline.
  • FIG. 16 is a flowchart showing another example embodiment of an animal assessment method that may be executed by a server program of at least one of the embodiments of the animal assessment system described herein, for detecting chin-resting in at least one detected animal at the site using an Artificial Intelligence (AI) pipeline.
  • FIGS. 17A-17B are example synthetic images that may be generated for training a chin-resting detection model of at least one of the embodiments of the animal assessment system described herein.
  • FIG. 18A shows an example image that may be processed according to the teachings herein.
  • FIGS. 18B-18D show the image of FIG. 18A processed for detection of a chin-resting state.
  • FIGS. 19A-19H show example images processed by one of the embodiments of the animal assessment system, showing a chin-resting state and associated heatmaps showing the accuracy of a chin-resting state detection model of the animal assessment system.
  • FIGS. 20A-20B show example images processed according to the teachings herein with bounding boxes added thereto in which a mounting state (FIG. 20A) and a chin-resting state (FIG. 20B) are detected.
  • FIG. 21 is a flowchart showing another example embodiment of an animal assessment method that may be executed by a server program of at least one of the embodiments of the animal assessment system described herein, for detecting a state indicative of breeding receptivity (e.g., estrus) in at least one detected animal at the site using an Artificial Intelligence (AI) pipeline.
  • FIGS. 22A-22B show example images of user notifications provided by at least one embodiment of an animal assessment system described herein.
  • FIGS. 23A and 23B show example embodiments of a Graphical User Interface (GUI) that may be used by at least one embodiment of an animal assessment system described herein to provide animal assessment data for a site installation.
  • FIGS. 24A and 24B show example images of an example user notification provided by at least one embodiment of an animal assessment system described herein.
  • The terms coupled or coupling can have several different meanings depending on the context in which these terms are used.
  • the terms coupled or coupling can have a mechanical or electrical connotation.
  • the terms coupled or coupling can indicate that two elements or devices can be directly connected to one another or connected to one another through one or more intermediate elements or devices via an electrical signal, electrical connection, or a mechanical element, depending on the particular context.
  • X and/or Y is intended to mean X or Y or both, for example.
  • X, Y, and/or Z is intended to mean any operative combination of X, Y or Z such as X, Y, Z, X and Y, X and Z, Y and Z, as well as X, Y and Z.
  • At least a portion of the example embodiments of the systems or methods described in accordance with the teachings herein may be implemented as a combination of hardware or software.
  • a portion of the embodiments described herein may be implemented, at least in part, by using one or more computer programs, executing on one or more programmable devices comprising at least one processing element, and at least one data storage element (including volatile and non-volatile memory).
  • These devices may also have at least one input device (e.g., a touchscreen, and the like) and at least one output device (e.g., a display screen, a printer, a wireless radio, and the like) depending on the nature of the device.
  • some elements that are used to implement at least part of the embodiments described herein may be implemented via software that is written in a high-level procedural language such as object-oriented programming.
  • the program code may be written in JAVA, PYTHON, C, C++, JavaScript or any other suitable programming language and may comprise modules or classes, as is known to those skilled in object-oriented programming.
  • some of these elements implemented via software may be written in assembly language, machine language, or firmware as needed.
  • programs associated with the systems and methods of the embodiments described herein may be capable of being distributed in a computer program product comprising a computer readable medium that bears computer usable instructions, such as program code, for one or more processors.
  • the program code may be preinstalled and embedded during manufacture and/or may be later installed as an update for an already deployed computing system.
  • the medium may be provided in various forms, including non-transitory forms such as, but not limited to, one or more diskettes, compact disks, DVD, tapes, chips, and magnetic, optical and electronic storage.
  • the medium may be transitory in nature such as, but not limited to, wire-line transmissions, satellite transmissions, internet transmissions (e.g., downloads), media, digital and analog signals, and the like.
  • the computer useable instructions may also be in various formats, including compiled and non-compiled code.
  • any module, unit, component, server, computer, terminal or device described herein that executes software instructions may include or otherwise have access to computer readable media such as storage media, computer storage media, or data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape.
  • Computer storage media may include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information, and which can be accessed by an application, module, or both. Any such computer storage media may be part of the device or accessible or connectable thereto.
  • Detecting breeding receptivity (e.g., detecting estrus)
  • Accurate estrus detection allows farmers to time mating or artificial insemination when the animals are most fertile, increasing the likelihood of conception. This optimization reduces the time and resources required for each pregnancy, improving overall herd efficiency and profitability. Identifying estrus also aids in identifying potential reproductive problems, such as anestrus (lack of estrus), allowing for early intervention and veterinary care. Efficient estrus detection is essential for sustainable animal farming, ensuring consistent animal production and maintaining a healthy and economically viable herd.
  • Embodiments disclosed herein generally relate to a computer-vision based Artificial Intelligence (AI) system and method for tracking and managing animals.
  • the AI system uses a plurality of AI models arranged according to an AI pipeline structure so that the embodiments disclosed herein may uniquely and consistently perform animal assessments at a site such as a ranch or feedlot.
  • the disclosed embodiments allow at least some activities related to tracking and managing animals to be performed remotely (e.g., detection of behaviors, activities, and/or states), reducing human resources needed for tracking and managing animals.
  • At least one embodiment of the systems and methods disclosed herein may simultaneously monitor one or more animals in a herd for tracking and managing various aspects of the tracked animals such as, but not limited to, their health, activity, estrus behavior and/or nutrition, for example.
  • At least one embodiment of the systems and methods disclosed herein may continuously operate and/or proactively notify personnel at the site, such as ranch operators, for example, of any tracked animals that are exhibiting estrus indicators thereby allowing the ranch operator to artificially inseminate the animal at an optimized time.
  • embodiments disclosed herein may track and assess animals from various distances.
  • an animal may be assessed from up to 10 meters (m) away from an imaging device (for example, about 1 m, 2m, 3m, 4m, 5m, 6m, 7m, 8m, 9m, or 10m away) when the animal is still, and further tracked up to 20m, 30m, 40m, or 50m away from the imaging device with a visual lock.
  • the system disclosed herein may identify animals from further distances.
  • the system may include 25x optical-zoom cameras that are used to assess animals with sufficient confidence levels from more than 60m, more than 70m, more than 80m, more than 90m, or more than 100m away.
  • At least one embodiment of the systems and methods disclosed herein may assess animals in an image wherein the image may be obtained by an image acquisition device that is positioned from one or more angles relative to the animals.
  • the images can be still images captured from the image acquisition device or can be image frames extracted from a video captured from the image acquisition device.
  • At least one of the embodiments described herein may allow one or more operators at a site to communicate with a server of the system via their handheld devices, such as smartphones or tablets, for example, and view assessed animals, and optionally data about the assessed animals, based on images of the animals that may be captured from one or more angles relative to the animals.
  • At least one embodiment of the systems disclosed herein may process images of animals taken from one of several angles relative to the animals and assess animals in the images by detecting the animals in an image and processing the image of a detected animal to generate a plurality of sections.
  • the sections may then be used by one or more AI models to make a final prediction of the animal’s current state and also generate the level of confidence that the final prediction is accurate.
  • the one or more AI models are only applied to animals that satisfy specific trigger conditions for detecting a particular activity, such as estrus, for example.
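The section-based prediction described above can be illustrated with a small sketch. The source does not say how sections are formed or how their outputs are combined, so the regular grid split and the averaging of per-section probabilities below are assumptions, as are the names `split_into_sections` and `predict_state`. The per-section model is a hypothetical callable returning a probability per state.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]  # rows of pixel values, assumed layout


def split_into_sections(image: Image, rows: int, cols: int) -> List[Image]:
    """Split an image into a rows x cols grid of sections (assumed scheme)."""
    height, width = len(image), len(image[0])
    sections = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            sections.append([row[x0:x1] for row in image[y0:y1]])
    return sections


def predict_state(
    sections: List[Image],
    section_model: Callable[[Image], Dict[str, float]],  # hypothetical per-section model
) -> Tuple[str, float]:
    """Average per-section probabilities; return the top state and its confidence."""
    totals: Dict[str, float] = {}
    for section in sections:
        for state, prob in section_model(section).items():
            totals[state] = totals.get(state, 0.0) + prob
    averaged = {state: total / len(sections) for state, total in totals.items()}
    best = max(averaged, key=averaged.get)
    return best, averaged[best]
```

In a pipeline with trigger conditions, `predict_state` would only be called for detections that already passed the earlier gates.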
  • At least one embodiment of the systems and methods disclosed herein may be trained with images and/or videos to learn certain characteristics of animals over time.
  • the collected temporal data may be used for learning how to accurately determine when an adult animal is displaying mounting behavior.
  • at least one embodiment of the systems and methods disclosed herein may use one or more AI models that are trained in detecting estrus behavior by using positive and negative training images and/or videos of animals, where positive training images/videos show animals experiencing estrus behavior and negative training images/videos show animals not experiencing estrus behavior.
  • it may be challenging to assemble a sufficient number of training images/videos for uncommon estrus events (e.g., training images of chin-resting or mounting from various angles with a variety of different animals).
  • the logic memory 174 also comprises a working memory area (174W) that is generally mapped to high-speed, and in some implementations, volatile, physical memory such as RAM, generally for application programs 164 to temporarily store data (including data received from other devices, data generated by the application programs 164, the machine-executable code or instruction of the application programs 164, and/or the like) during program execution.
  • a given application program 164 may load data from the storage memory area 174S into the working memory area 174W and may store data generated during its execution into the working memory area 174W.
  • a given application program 164 may also store some data into the storage memory area 174S, as required or in response to a user’s command.
  • At least one embodiment of the animal assessment system 100 is configured to perform an assessment of at least one animal at the site 102.
  • the various embodiments of the animal assessment system 100 described herein generally use a multi-layer artificial intelligence (AI) pipeline organized into several layers or levels, each of which has AI models and a unique overarching purpose.
  • the AI models may be implemented using various machine learning algorithms depending on the functionality of the AI models, since some machine learning algorithms are better suited than others at performing certain functions.
  • the AI models may be implemented as computer-executable instructions or code in the form of software code or firmware code stored in the logic memory 174, which may be executed by one or more processors of the processor module 142.
  • FIG. 4A shows a schematic diagram of an example of a DNN AI model 400, which generally comprises an input layer 402, a plurality of hidden layers including a first hidden layer 404 and a final hidden layer 408, and an output layer 412 cascaded in series.
  • Each hidden layer comprises a plurality of nodes.
  • first hidden layer 404 includes nodes 406a, 406b, 406c, 406d, etc.
  • hidden layer 408 includes nodes 410a, 410b, 410c, 410d, etc., with each node in a given hidden layer connecting to a plurality of nodes in a subsequent layer.
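As a rough illustration of the cascaded structure described for DNN model 400, the following sketch passes an input vector through two hidden layers and an output layer. The layer sizes, the ReLU activation, the softmax output and the random weights are all assumptions made for the example; none of them is taken from FIG. 4A.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer: every input feeds every node of the layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out, rng):
    """Hypothetical random initialisation; a trained model would load weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, layers):
    """Run the hidden layers in series, then the output layer."""
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))

rng = random.Random(0)
sizes = [4, 8, 8, 2]  # input, two hidden layers, output (assumed sizes)
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
probs = forward([0.2, 0.1, 0.7, 0.4], layers)
print(probs)
```

The two output values can be read as class probabilities, for example for a binary state classification.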
  • a jigsaw model may be used for fine-grained visual classification (FGVC) and may be configured for identifying subclasses of a given object category.
  • the feature extraction layers of the jigsaw model can be implemented using any feature extractor known to those skilled in the art, including but not limited to ResNet.
  • the jigsaw model can be used in combination with any neural network, such as a CNN, DNN, or ANN (Artificial Neural Network).
  • the jigsaw model 450 employs a neural network to learn features progressively, from a finer granular level to a coarser granularity.
  • a SAM optimizer may be used to improve the training of JigSaw models.
  • each of blocks 452, 454, 456 and 458 corresponds to a feature extraction step that includes feature extraction layers
  • block 460 corresponds to a convolution block and block 462 to a classification block.
  • Each feature extraction block 452, 454, 456 and 458 is configured for extracting and learning features at a different level of granularity.
  • feature extraction blocks 452 and 454 correspond to shallower layers and are configured to learn fine-grained information
  • feature extraction blocks 456 and 458 correspond to deeper layers and are accordingly configured to learn more abstract and coarse-grained information. This progressive learning allows the jigsaw model 450 to more accurately perform image classification.
  • although jigsaw model 450 is shown with four feature extraction blocks/stages, it will be understood that a jigsaw model architecture can have any number of feature extraction stages, wherein each successive feature extraction stage learns at a coarser granularity level.
  • a Jigsaw model processes an image by cutting it into variously sized patches, which are then shuffled to form a puzzle with pieces of differing scales.
  • the network's architecture typically includes convolutional layers that pick up on local visual cues, enhanced with layers such as recurrent or permutation-invariant ones to manage the shuffled layout, where Sharpness Aware Minimization (SAM) normalization helps the network focus on the most pertinent features.
  • the training process involves coaxing the network to accurately predict the original arrangement of the patches, thereby imparting a nuanced comprehension of spatial relationships and contextual information within the image.
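The patch shuffling described above can be sketched as follows. This is a simplified illustration that assumes a square grayscale image stored as nested lists and a single granularity with equally sized patches; the function name `jigsaw_shuffle` is hypothetical, and the multi-scale patches mentioned in the text would require repeating the step with several grid sizes.

```python
import random

def jigsaw_shuffle(image, n, rng):
    """Split a square 2D image into an n x n grid of patches and shuffle them.

    Returns the shuffled image and the permutation used, where perm[i] is the
    index of the original patch placed at grid position i (row-major order).
    """
    size = len(image)
    if size % n != 0:
        raise ValueError("image side must be divisible by n")
    p = size // n  # patch side length in pixels

    # Extract patches in row-major order.
    patches = []
    for pr in range(n):
        for pc in range(n):
            patches.append([row[pc * p:(pc + 1) * p]
                            for row in image[pr * p:(pr + 1) * p]])

    perm = list(range(n * n))
    rng.shuffle(perm)

    # Paste each source patch at its new grid position.
    out = [[None] * size for _ in range(size)]
    for dest, src in enumerate(perm):
        dr, dc = divmod(dest, n)
        for r in range(p):
            for c in range(p):
                out[dr * p + r][dc * p + c] = patches[src][r][c]
    return out, perm

image = [[r * 8 + c for c in range(8)] for r in range(8)]
shuffled, perm = jigsaw_shuffle(image, 2, random.Random(1))
```

In a training setup of this kind, `perm` would serve as the target that the network is asked to recover from `shuffled`.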
  • the embedding vector is a high-dimensional representation that encodes the positional and contextual information about the image's patches as processed by the self-attention mechanisms within the Transformer layers.
  • Preprocessing for ViT involves data augmentation strategies such as random horizontal flipping, color jittering for dynamic color changes, random conversion to grayscale, pixel-wise normalization, resizing to a fixed resolution, application of Gaussian blur for smoothness, and solarization for photonegative effects.
  • the student network, which is a ViT or ResNet, attempts to predict the teacher network's output.
  • DINO does not rely on predefined labels but encourages the student to learn invariant features that are robust to the changes introduced by the augmentations.
  • the student gradually learns more generalized features of the images, effectively performing a form of knowledge distillation without labels. This results in the model being capable of capturing the essence of the visual patterns present in the dataset.
  • the Vision Transformer's architecture, with its self-attention mechanisms, is particularly well-suited for this approach as it can focus on various parts of an image and understand the global context, making it an excellent backbone for DINO.
  • the representations learned through DINO can then be utilized for a variety of downstream tasks.
  • the model can be fine-tuned on a labeled dataset for tasks like image classification, object detection, or other tasks that require understanding of visual content.
  • a classification head such as a fully connected layer
  • This hybrid approach leverages the unsupervised learning capabilities of DINO to handle complex, high-dimensional data and the structured approach of supervised learning for specific problem-solving, harnessing the power of ViT to achieve state-of-the-art performance on various vision tasks.
  • the image classification models (using a JigSaw classifier, a DINO/FCN model, ViT, etc.) can be trained for 300 epochs with a learning rate of 5e-4.
  • Learning rate decay techniques may be evaluated to determine the optimal learning rate decay (e.g., optimal in the sense of the most accurate training), including time-based decay, step decay, exponential decay or adaptive learning rate methods such as Adagrad, Adadelta, or RMSprop. Training may be stopped early or continue for longer, based on the time taken for the loss graph to converge.
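To make the named decay schedules concrete, the sketch below computes the learning rate per epoch for time-based, step and exponential decay, starting from the 5e-4 rate mentioned above. The decay constants (`k`, `drop`, `every`) are illustrative assumptions and would normally be tuned; the adaptive methods (Adagrad, Adadelta, RMSprop) are optimizers rather than schedules and are not shown.

```python
import math

def time_based_decay(lr0, epoch, k=0.01):
    """lr = lr0 / (1 + k * epoch)"""
    return lr0 / (1.0 + k * epoch)

def step_decay(lr0, epoch, drop=0.5, every=50):
    """Multiply the rate by `drop` once every `every` epochs."""
    return lr0 * drop ** (epoch // every)

def exponential_decay(lr0, epoch, k=0.01):
    """lr = lr0 * exp(-k * epoch)"""
    return lr0 * math.exp(-k * epoch)

LR0 = 5e-4  # initial learning rate from the text
for epoch in (0, 100, 299):
    print(epoch,
          time_based_decay(LR0, epoch),
          step_decay(LR0, epoch),
          exponential_decay(LR0, epoch))
```

Comparing the resulting loss curves for each schedule is one way to carry out the evaluation the text describes.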
  • data augmentations may be applied to the training data, to obtain more training data by applying transformations including but not limited to random horizontal flips and switching, rotation, random crops to upper and lower body, random scale and rotation, color jittering, noise injection, and normalization.
  • deployment into an AI pipeline may involve certain processing.
  • the processing may include scaling input bounding box content to the required input dimensions of a given AI classification model.
  • the processing may reduce the output of the AI classification model to a binary classification and/or convert the output confidence scores to a tuple.
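The deployment processing described above can be sketched in two small helpers: one that scales cropped bounding box content to a model's input size, and one that reduces per-class scores to a binary label with a confidence value. Nearest-neighbour resizing, the label names, the 0.5 threshold and the assumption that scores sum to 1 are choices made for this example only.

```python
def resize_nearest(pixels, out_h, out_w):
    """Nearest-neighbour resize of a 2D pixel grid to out_h x out_w."""
    in_h, in_w = len(pixels), len(pixels[0])
    return [[pixels[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)]
            for r in range(out_h)]

def to_binary(scores, positive="mounting", threshold=0.5):
    """Reduce class scores to a (label, confidence) tuple.

    Assumes the scores are probabilities that sum to 1, so the confidence
    of the negative label is 1 - score(positive).
    """
    conf = scores.get(positive, 0.0)
    if conf >= threshold:
        return (positive, conf)
    return ("not_" + positive, 1.0 - conf)

crop = [[1, 2],
        [3, 4]]
model_input = resize_nearest(crop, 4, 4)
result = to_binary({"mounting": 0.8, "standing": 0.15, "lying": 0.05})
print(model_input, result)
```

A production pipeline would more likely use an interpolating resize from an image library; the nearest-neighbour version is used here only to keep the sketch dependency-free.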
  • a dataset of real and synthetic images that covers both positive and negative cases may be used.
  • synthetic training images may be generated as described in PCT International patent application serial no. PCT/CA2024/050321 titled “System and Method for Generation of Simulated Computer Vision Training Data Using a Virtual Reality Engine”, filed March 15, 2024.
  • the images can be obtained for various angles.
  • a virtual camera can be spun around a virtual animal from different heights, angles and positions to create a varied training set.
  • Real world data is used to balance out the synthetic dataset with different lighting conditions, camera models, fields of view, and lens distortion.
  • a training image may consist of an animal, and a bounding box around the animal and optionally a body part of the animal.
  • 10,000 training images may be generated, but more may be needed in some cases depending on the accuracy and robustness of the given AI classifier model.
  • diffusion models may be applied to images included in the training dataset to generate additional training images having greater variance and/or to augment labelled real-world data.
  • Greater variance may be generated by changing the appearance of the animal and/or the environment.
  • the additional training images may be generated by modifying animal fur patterns or colors, changing the background, modifying the weather, adjusting the time of day or lighting, changing the animal breed, etc., as explained earlier.
  • Training may be done following a standard training approach, using loss graphs and accuracy to measure the model’s performance.
  • the training data may be divided into a training set of 70%, a validation set of 20% and a testing set of 10%.
  • the training set can include any suitable combination of real-world and synthetic training data.
  • the training set can include 90% synthetic data and 10% real-world data.
  • other combinations of synthetic and real-world data may be used.
  • the testing set will only consist of real-world data, and not include any synthetic data.
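One possible way to realise the 70/20/10 split while keeping the testing set free of synthetic data is sketched below. The function name `split_dataset`, the tuple-based image records and the choice to mix the remaining real images with synthetic ones for training and validation are assumptions for illustration.

```python
import random

def split_dataset(real, synthetic, rng, train_pct=70, val_pct=20):
    """Split into train/validation/test; the test set uses real images only."""
    real, synthetic = list(real), list(synthetic)
    rng.shuffle(real)
    rng.shuffle(synthetic)

    total = len(real) + len(synthetic)
    n_train = total * train_pct // 100
    n_val = total * val_pct // 100
    n_test = total - n_train - n_val
    if n_test > len(real):
        raise ValueError("not enough real-world images to fill the test set")

    # Reserve real images for testing, then pool everything else.
    test = real[:n_test]
    pool = real[n_test:] + synthetic
    rng.shuffle(pool)
    return pool[:n_train], pool[n_train:n_train + n_val], test

real = [("real", i) for i in range(300)]
synthetic = [("synthetic", i) for i in range(700)]
train_set, val_set, test_set = split_dataset(real, synthetic, random.Random(0))
print(len(train_set), len(val_set), len(test_set))
```

Controlling the synthetic-to-real ratio inside the training set (for example 90% to 10%) would need an additional sampling step that this sketch does not include.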
  • FIG. 5 shows a flowchart of an example embodiment of an animal assessment method 500 that may be implemented using one or more server programs 164 operating on one or more processors of the processor module 142 in at least one embodiment of the system 100 for assessing a state of at least one animal at the site 102 using an AI pipeline.
  • Method 500 may be used for determining if an animal is in a mounting state.
  • a cow or cattle
  • the methods described herein may be applied to other types of animals such as any livestock and other four-legged mammals, as all mammals exhibit forms of mounting behavior.
  • the method 500 may start automatically (e.g., periodically), manually under a user’s command, or when one or more imaging devices 108 send a request to the server program 164 for transmitting captured (and encoded) images thereto.
  • the server program 164 commands the imaging devices 108 to capture images or video clips of at least one animal or an entire herd of animals (collectively denoted “images” hereinafter for ease of description) at the site 102 from one or more viewing angles such as front, side, top, rear, and the like, and starts to receive the captured images or video clips from one or more of the imaging devices 108 (step 502).
  • the images may be captured from one or more angles and over various distances depending on the orientation and camera angles of stationary imaging devices or the flightpath/movement and camera angle for mobile imaging devices.
  • the camera angle of some imaging devices may also be controlled by one of the application programs 164 to obtain images of animals from one or more desired angles.
  • the images may be processed, for example, by automatically adjusting image acquisition parameters such as exposure, white balance, contrast, brightness, hue, and/or the like, digital zooming, digital stabilization, and the like.
  • functions that are available through the Vision Programming Interface (VPI) in the DeepStream SDK may be used for the above processing operations such as performing color balancing on images.
  • the images may be processed, for example, using AI pipeline 3300 described in United States Patent No. 11,910,784. Software built into OpenCV or other libraries made available by Nvidia or other companies may also be used.
  • the images may additionally be resized, for example, if the images used when training the machine learning algorithms described herein have a predetermined, standardized size. In such cases, the images may be resized to match the size of the training images.
  • the RGB values of the images may additionally be normalized to transform the pixel intensity values to a standard scale.
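A minimal sketch of the RGB normalization step follows. It scales 8-bit channel values to the range 0 to 1 and then standardizes each channel. The mean and standard deviation values are the commonly used ImageNet statistics and are an assumption here; the source does not state which statistics are used.

```python
# Assumed per-channel statistics (widely used ImageNet values).
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Map an 8-bit (R, G, B) pixel to standardized floating-point values."""
    return tuple((v / 255.0 - m) / s for v, m, s in zip(rgb, mean, std))

def normalize_image(image, mean=MEAN, std=STD):
    """Normalize every pixel of an image stored as rows of (R, G, B) tuples."""
    return [[normalize_pixel(px, mean, std) for px in row] for row in image]

image = [[(0, 0, 0), (255, 255, 255)],
         [(128, 64, 32), (10, 200, 90)]]
normalized = normalize_image(image)
print(normalized[0][1])
```

If the classification models are trained with different statistics, the same values should be used at inference time.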
  • some of the image processing is not needed such as when training uses images obtained using various image acquisition parameters and the machine learning algorithms described herein do not require images to be pre-processed.
  • image processing can involve dewarping an image such as, for example, when an image is obtained using a 360-degree camera.
  • the server program 164 uses one or more machine learning algorithms to process the images captured at step 502 and act as an object detector to detect, locate and optionally identify animals in one or more of the acquired images.
  • Various machine learning algorithms such as, but not limited to, CNN or DNN based algorithms, may be used to detect animals from the acquired images.
  • a single-pass bounding box algorithm may be used as the object detector and may be implemented using a Single Shot detector (SSD) algorithm, a DetectNet or DetectNet_V2 model, and/or the like.
  • a two-stage network may be used such as FasterRCNN, YOLOX or some other appropriate object detection and/or identification model.
  • a two-stage network may be used such as FasterRCNN or some other appropriate object detection model.
  • a mask model, such as MaskRCNN, that combines a mask with a detector may also be used.
  • some image pre-processing may be performed, for example de-warping images that are obtained using a 360-degree camera.
  • images from acquired video may be scaled to 720p for the AI models used in layer 0 and layer 1 of the AI pipelines described herein.
  • a combination of machine learning algorithms can be used to detect animals in the acquired image(s).
  • a first machine learning algorithm may be used to detect whether an object in an image is an animal, and a second machine learning algorithm may be used to identify the type of animal.
  • the use of more than one machine learning algorithm for detecting and/or identifying animals in acquired image(s) can improve the accuracy of the object detector. For example, one or more of the techniques described in United States Patent No. 11,910,784 may be used.
  • the object detector detects one or more animals in a current image that is processed and also defines a bounding box to indicate the locations of the one or more detected animals in the image.
  • the bounding box may be overlaid on the image to encapsulate or surround the detected animal and may be saved as a new image in at least one embodiment. For example, these images may be used for training in the future.
  • the image may be saved in one of two forms: 1) the entire image with the coordinates of the bounding box or 2) only the pixels for the animal defined by the bounding box.
  • at least one animal in the image, including animals partially obscured or turned away from the camera, is detected and associated with a respective bounding box in at least one embodiment of the system 100.
  • FIG. 8 shows an example image 342 in which three animals 104A, 104B, and 104C are detected, for which respective bounding boxes 344A, 344B, and 344C are generated and defined thereabout.
  • the coordinates of certain points (e.g., x1, x2, y1, y2)
  • a combination of coordinates and dimensions of the bounding box (e.g., x, y, height, width)
  • the animal ID which is uniquely assigned to a given animal and may be done as described in United States Patent No.
  • the server program 164 can determine the size of a bounding box (e.g., based on the coordinates or dimensions of the bounding box) and discard or not perform assessments on images where detected animals in those images are bounded by bounding boxes having dimensions below a predetermined threshold since bounding boxes that are too small may lead to inaccurate assessments.
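The size filter described above might look like the following sketch. The `Box` type, the corner-coordinate convention and the 64-pixel threshold are hypothetical; the source only states that boxes below a predetermined threshold are skipped.

```python
from typing import List, NamedTuple

class Box(NamedTuple):
    """Axis-aligned bounding box given by two corners, in pixels."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def width(self) -> float:
        return self.x2 - self.x1

    @property
    def height(self) -> float:
        return self.y2 - self.y1

MIN_SIDE = 64  # hypothetical threshold in pixels

def large_enough(box: Box, min_side: float = MIN_SIDE) -> bool:
    """True if both dimensions of the box meet the minimum size."""
    return box.width >= min_side and box.height >= min_side

def filter_boxes(boxes: List[Box], min_side: float = MIN_SIDE) -> List[Box]:
    """Keep only the detections large enough to be assessed reliably."""
    return [b for b in boxes if large_enough(b, min_side)]

boxes = [Box(10, 10, 200, 150), Box(300, 40, 330, 60)]
kept = filter_boxes(boxes)
print(kept)
```

A threshold on box area rather than on each side would be an equally plausible reading of the text.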
  • the bounding boxes may be generated such that at least one of the bounding boxes is a rectangular box, a square box or an oriented bounding box.
  • the use of oriented bounding boxes may result in bounding boxes that are smaller but still encapsulate the detected animal since the oriented bounding box is rotated and may not include as many pixels compared to a non-oriented bounding box.
  • the use of oriented bounding boxes may also reduce the amount of other items that are encapsulated by the oriented bounding box resulting in more accurate analysis of the image pixels encapsulated by the oriented bounding box.
  • One or more detected animals may then be identified and assessed at step 610, as will be described in further detail.
  • masks may be generated for the one or more detected animals by identifying each image pixel associated with a detected animal.
  • a mask may be generated, for example, by converting every image pixel not associated with a detected animal to black (e.g., a zero pixel value).
  • Mask images may improve the accuracy of models used to perform assessments, for example, image classification models used to detect one or more states of an animal. Generating the mask images may require additional computing resources and/or processing time.
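The masking step can be illustrated with a small sketch that blacks out every pixel not associated with a detected animal. It assumes a grayscale image and a per-pixel boolean mask (for example, one produced by a segmentation model); both representations are simplifications chosen for this example.

```python
def apply_mask(image, animal_mask, fill=0):
    """Keep pixels where animal_mask is True and set all others to `fill`."""
    return [[px if keep else fill
             for px, keep in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, animal_mask)]

image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
animal_mask = [[False, True, False],
               [True, True, True],
               [False, True, False]]
masked = apply_mask(image, animal_mask)
print(masked)
```

For color images the same idea applies with `fill` set to a black pixel value such as `(0, 0, 0)`.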
  • the server program 164 uses one or more machine learning algorithms to process the images captured at step 602 to determine if one or more of animals in a given image matches a mounting state.
  • the server program 164, for example, uses a mounting state detection model, which is an AI model.
  • the mounting state detection model can be a machine learning algorithm that is part of the AI pipeline of AI models and that is trained to generate a mounting classification output identifying whether an animal is in a mounting state or in a not mounting state.
  • the machine learning algorithms can process images to identify all animals in a given image that are in a mounting state.
  • a mounting state for a first animal (e.g., a first cow) can be indicative of an estrus state for a second animal (e.g., a second cow) that is being mounted by the first animal. Accordingly, results of assessments determining whether an animal is in an estrus state or a mounting state can be used for animal management, since the animal in the estrus state may be ready for artificial insemination and the animal in the mounting state may enter the estrus state within the foreseeable future, such as about 12 hours after the mounting detection occurred.
  • the mounting state detection model may be any suitable image classification model trained to generate classification output for an imaged animal such as, but not limited to, a CNN, DNN, jigsaw or ViT AI model.
  • the classification output can correspond to a mounting state or a not mounting state from which it can be determined that another animal is or is not in the estrus state.
  • the animal bounding box may be scaled to 224x224 pixels or 256x256 pixels to ensure uniformity.
  • the learning rate may be configured to complement the prepared data, to pursue optimal convergence.
  • a higher initial learning rate may be set, often between 0.0005 and 0.05 for DINO, and slightly lower for Jigsaw models, with both employing a cosine decay schedule for gradual reduction over 100 to 300 epochs.
  • a warm-up period is included where the learning rate incrementally reaches the target value to stabilize early training dynamics.
  • Adaptive methods like Adam or SGD with momentum may be used, particularly with Jigsaw, adjusting the learning rate per parameter.
  • This strategy may be fine-tuned through hyperparameter optimization, tailored to the model’s needs and the nuances of the dataset, post standard resizing, normalization, and augmentation procedures in pre-processing.
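The warm-up followed by cosine decay described above can be sketched as a single schedule function. The linear warm-up shape, the 10-epoch warm-up length and the default peak rate of 0.0005 are assumptions; the source specifies only that the rate incrementally reaches a target value and then follows a cosine decay over 100 to 300 epochs.

```python
import math

def lr_at(epoch, total_epochs=300, warmup_epochs=10, peak_lr=5e-4, min_lr=0.0):
    """Learning rate for a zero-based epoch index.

    Linear warm-up to peak_lr over warmup_epochs, then cosine decay
    towards min_lr over the remaining epochs.
    """
    if epoch < warmup_epochs:
        return peak_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

schedule = [lr_at(e) for e in range(300)]
print(schedule[0], schedule[9], schedule[150], schedule[299])
```

The per-epoch values in `schedule` could be fed to whichever optimizer is chosen for training.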
  • SAM may be used during the fine-tuning phase when the model is being adapted to a specific supervised task like classification.
  • SAM may be employed after the self-supervised pre-training phase, where the model has learned to assemble the jigsaw puzzles and thus has gained an understanding of the data's structure.
  • performing mounting state detection may involve several processing steps and using several AI models, such as is described in the example embodiment shown in FIG. 6.
  • the server program 164 may assess the bounding boxes defined at step 504. For example, in such embodiments, the server program 164 may identify whether at least two animals are present in an image and if so whether there are one or more pairs of overlapping bounding boxes (e.g., at least a portion of the image is enclosed by both bounding boxes). The image pixels enclosed by each bounding box can be determined based on the coordinates of the corners/edges of the bounding box. For example, the server program 164 may determine that bounding boxes 1120 and 1122 (in FIG. 11B) corresponding to detected animals 1124 and 1126 respectively are overlapping bounding boxes.
  • the server program 164 can assess each pair of bounding boxes in an image. For example, in an image containing three bounding boxes B1, B2 and B3, corresponding to three animals, the server program 164 can evaluate each pair of bounding boxes, e.g., B1B2, B1B3 and B2B3.
  • the server program 164 can perform mounting detection on the image portions corresponding to the bounding boxes only.
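The pairwise overlap check described above can be sketched as follows. Boxes are assumed to be axis-aligned tuples of the form `(x1, y1, x2, y2)`; boxes that merely touch along an edge are treated as not overlapping, which is a choice made for this example.

```python
from itertools import combinations

def boxes_overlap(a, b):
    """True if two axis-aligned boxes (x1, y1, x2, y2) share any image area."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2

def overlapping_pairs(boxes):
    """Evaluate every pair of boxes and return the index pairs that overlap."""
    return [(i, j)
            for i, j in combinations(range(len(boxes)), 2)
            if boxes_overlap(boxes[i], boxes[j])]

# Three detections B1, B2, B3: pairs B1B2, B1B3 and B2B3 are all evaluated.
boxes = [(0, 0, 100, 80), (60, 20, 180, 110), (300, 300, 360, 350)]
pairs = overlapping_pairs(boxes)
print(pairs)
```

Only the pairs returned here would need to be passed on to the later classification steps, which keeps the number of model invocations down.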
  • FIG. 6 shows a flowchart of an example embodiment of an animal assessment method 600 for determining if a current state of a detected animal in a given image matches a mounting state.
  • Method 600 may be implemented using one or more server programs 164 operating on one or more processors of the processor module 142 in at least one embodiment of the system 100.
  • method 600 may be executed at step 506 of method 500.
  • method 600 may be executed for each pair of animals detected and located in a given image at step 504.
  • Method 500 may be executed for each image received at step 502 that passes filtering criteria (e.g., the size of the bounding boxes is larger than a threshold size in order to obtain more accurate results).
  • the method 600 may be executed as a stand-alone process independent of method 500.
  • the method 600 may be executed automatically (e.g., periodically) to perform animal assessments for captured images/videos.
  • method 600 may be executed in response to an input identifying one or more animals for assessment.
  • the method 600 may be executed to monitor mounting behavior of animals that have been identified and shown to be receptive to mounting and/or have shown other signs of estrus.
  • the IDs of these identified animals may be stored in a database.
  • the method 500 may be executed for images/videos tagged as including the identified animal(s).
  • when method 600 is executed independently of method 500, captured images/videos are received at step 602.
  • the server program 164 uses a machine learning algorithm as an object detector to detect one or more animals in a received image and define a bounding box to indicate the location of the detected animal as described previously.
  • the bounding boxes can be determined as described with reference to method 500. If method 600 is executed at step 508 of method 500, the animal is detected and the bounding box is defined at step 504 of method 500.
  • the server program 164 determines one or more pairs of overlapping bounding boxes, i.e., pairs for which at least a portion of the image is enclosed by both bounding boxes.
  • the server program 164 can determine pairs of overlapping bounding boxes as described with reference to method 500.
  • the server program 164 can then perform age classification on the animals located within the overlapping bounding boxes.
  • the age classification may be performed for all animals included in a received image (e.g., the age classification may be conducted at step 602). Alternatively, the age classification may only be performed for animals corresponding to at least one pair of overlapping bounding boxes.
  • the age classification model may also generate a confidence value for the generated age classification output.
  • age classification can be performed only on image portions enclosed by bounding boxes. By performing age classification only on a subset of detected animals, the speed and efficiency of the system 100 can be increased and fewer computing resources expended.
  • the server program 164 can use an age classification model, which is an AI model.
  • the age classification model can be a machine learning algorithm that is part of the AI pipeline of AI models that is trained to generate an age classification output such as, for example, the AI pipeline described in United States Patent No. 11,910,784.
  • the age classification model may be any suitable image classification model trained to generate classification output for an imaged animal.
  • the age classification model can output a binary classification (i.e., adult, not adult) or a multi-class classification (e.g., an age range of the animal).
  • the age classification model may be any suitable image classification model trained to generate an adult or non-adult classification output for an imaged animal.
  • an age classifier can be implemented using a computer vision model such as a Jigsaw model or DINO/FCN.
  • the extracted features for the age classification may include, for example, a size dimension (e.g., height) of the animal.
  • the age classification model can, for example, classify animals 952 and 954 of image 900 (FIGS. 9A-9B) as adult animals.
  • An effective age classification model for animals may be obtained by collecting and preparing a diverse dataset with images labeled by age, followed by preprocessing to standardize and augment the data.
  • the model architecture, which may be a DINO/FCN or Jigsaw model, may be customized for age classification by adjusting the output layer for binary or multi-class classification.
  • Features relevant to age, such as size and morphological characteristics, may be used for training the model, utilizing strategies like transfer learning and regularization to prevent overfitting and enhance performance. Since estrus only occurs in (female) adult animals, determining whether a detected animal is an adult animal can allow the server program 164 to only identify a mounting state in adult animals, and avoid using computing resources for detecting animals being mounted that are not sexually mature.
  • the training data used to train the age classification model may include real and/or synthetic images corresponding to adult and non-adult animals as described previously.
  • Each training image may correspond to a single class, for example, an adult class or non-adult class.
  • the model’s effectiveness may be evaluated through k-fold cross-validation and metrics like accuracy and F1-score, with a focus on generalization. Integration into the AI pipeline and deployment may include generating confidence scores for each prediction to indicate reliability.
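The evaluation approach mentioned above can be sketched with a plain k-fold index generator and an F1-score helper. The interleaved fold assignment and the label encoding (1 for adult, 0 for non-adult) are assumptions for illustration; in practice the data would usually be shuffled before folds are assigned.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for held_out in range(k):
        train = [idx for f, fold in enumerate(folds) if f != held_out
                 for idx in fold]
        yield train, folds[held_out]

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

splits = list(k_fold_indices(10, 5))
score = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
print(len(splits), score)
```

Averaging the per-fold F1-scores gives a single figure that is less sensitive to how any one split happened to fall.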
  • the identification of the detected animal may be used to determine the age by using information, such as age data, on the identified animal that is stored in a database.
  • the server program 164 determines if both animals in the bounding boxes are adult animals based on the results of the age classification model for both animals. If both animals are adult animals, method 600 proceeds to step 608. If the server program 164 determines that one or both animals in the bounding boxes are not adult animals, the method 600 proceeds to step 616 and processes the next received image that needs processing.
  • the server program 164 can perform standing detection using an activity classification model.
  • the activity classification model can be a machine learning algorithm that is part of the AI pipeline of AI models that is trained to generate an activity classification output identifying whether an animal is in a standing position or a non-standing position.
  • the activity classification may be performed for all detected adult animals included in a received image (e.g., the age classification may be conducted at step 604). Alternatively, the activity classification may only be performed for animals corresponding to at least one pair of overlapping bounding boxes. In at least one embodiment, the server program 164 only performs standing detection on animals that have been determined to be adult animals at step 606.
  • although step 608 is shown as being subsequent to step 604, in some cases, step 608 may be performed prior to step 604. In such cases, in at least one embodiment, the server program 164 only performs age classification on animals that have been determined to be in a standing position. In various embodiments, activity classification can be performed only on image portions enclosed by bounding boxes.
  • the activity classification model may be any suitable image classification AI model trained to generate classification output for an imaged animal.
  • the classification output can correspond to an activity of the animal, including standing, having one or more legs not in contact with the ground, and non-standing activities such as, but not limited to, lying down.
  • the activity classification model may use a Vision Transformer (ViT) approach, where preprocessing involves resizing images to a standard dimension (e.g., 224x224 pixels or 256x256 pixels) to ensure uniformity, which is used for the ViT's patch-based processing. Images are also normalized to maintain consistent scale across the dataset.
  • Various parameters that may be set for the ViT include the patch size, such as 16x16 pixels, the transformer's depth indicating the number of encoding layers, and the number of attention heads, which aids in capturing global image dependencies.
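To show how the patch size relates to the input resolutions mentioned earlier, the short sketch below computes the patch grid for 224x224 and 256x256 images with 16x16 patches. The function name `patch_grid` is hypothetical, and square images and patches are assumed.

```python
def patch_grid(image_size, patch_size):
    """Return (patches per side, total patches) for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image size must be a multiple of the patch size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side

for size in (224, 256):
    per_side, total = patch_grid(size, 16)
    print(size, per_side, total)
```

The total patch count determines the length of the token sequence the transformer layers attend over, so a larger input or a smaller patch size increases compute cost.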
  • the learning rate is carefully chosen, incorporating a warm-up period and a subsequent decay schedule to enhance training stability and efficacy.
  • data augmentation techniques such as random cropping, flipping, and color jittering are applied to improve the model's ability to generalize across diverse data.
  • the training data may include real and/or synthetic images corresponding to standing and non-standing animals.
  • the training may generally be performed as described earlier.
  • advanced techniques may be incorporated such as He or Xavier initialization for optimal weight settings, batch normalization to enhance stability, and dropout for preventing overfitting.
  • Addressing class imbalances through class weight balancing supports unbiased learning, and employing transfer learning by using pretrained ViT models can expedite convergence and enhance performance.
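Class weight balancing can be sketched with the common inverse-frequency heuristic, where each class weight is the number of samples divided by the product of the number of classes and that class's count. The heuristic and the label names are assumptions; the source does not specify how weights are computed.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in counts.items()}

# Hypothetical imbalanced training labels.
labels = ["standing"] * 75 + ["non_standing"] * 25
weights = balanced_class_weights(labels)
print(weights)
```

The resulting weights would typically be passed to a weighted loss function so that errors on the rarer class count for more during training.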
  • Dynamic learning rate adjustments with schedulers like ReduceLROnPlateau can promote efficient learning, while integrating attention mechanisms within the ViT allows for selective focus on relevant image regions.
  • Comprehensive hyperparameter tuning, alongside diligent monitoring of the training process via validation curves, may be used for fine-tuning the model to achieve peak accuracy and generalization in activity classification tasks.
  • the extracted features for the activity classification may include, for example, angles between links and/or distance between key points on legs of an animal (e.g., indicating whether the legs are erect).
  • the key points may be determined using various techniques including those described in United States Patent No. 11,910,784.
  • the activity classification model can, for example, classify animals 1014 and 1016 of image 1000 (FIGS. 10A-10B) as standing animals and animals 1010 and 1012 as non-standing animals.
  • the activity classification model may also generate a confidence value for the generated activity classification output.
  • the server program 164 determines if both animals in the overlapping bounding boxes are standing. If both animals are standing, the method 600 proceeds to step 612. If one or both animals are in a non-standing state, the method 600 proceeds to step 616, where it is determined if there are any other overlapping bounding boxes that need to be processed.
  • the server program 164 selects an animal for which to perform mounting detection.
  • the server program 164 only performs mounting detection for the animal in an animal pair that is in a higher position relative to the other animal and/or relative to a reference point (e.g., the ground).
  • the server program 164 can determine which animal in an animal pair is in a higher position based on the coordinates of the highest/topmost corners/edges of the bounding box of each animal. For example, the server program 164 may determine that bounding box 1120 is higher than bounding box 1122 (in FIG. 11B).
  • the server program 164 may perform mounting detection for both animals. If mounting is detected, then the animal in the higher bounding box is considered to be the “mounter” and the lower one is considered to be the “mountee”. The cow in the lower bounding box that is the mountee is the one that is in estrus.
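Deciding which animal of a pair is the mounter and which is the mountee from the topmost bounding box edges can be sketched as below. The box layout `(x_min, y_min, x_max, y_max)` and the image coordinate convention (y grows downward, so a smaller `y_min` means a higher box) are assumptions, as is the function name.

```python
def split_mounter_mountee(box_a, box_b):
    """Return (mounter_box, mountee_box) for a pair of overlapping boxes.

    Boxes are (x_min, y_min, x_max, y_max) in image coordinates, where y
    increases downward, so the box with the smaller y_min is the higher one.
    Ties are resolved in favor of box_a (an arbitrary choice).
    """
    if box_a[1] <= box_b[1]:
        return box_a, box_b
    return box_b, box_a
```

This only assigns roles; whether mounting is actually occurring is still decided by the mounting state detection model.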
  • the server program 164 performs mounting detection using a mounting state detection model for the animal(s) selected at step 612.
  • the server program 164 uses one or more machine learning algorithms to process the images captured at step 602 to determine if one or more of the animals in a given image matches a mounting state.
  • the server program 164, for example, uses a mounting state detection model.
  • the mounting state detection model can be a machine learning algorithm that is part of the AI pipeline of AI models and that is trained to generate a mounting classification output identifying whether an animal is in a mounting state or in a non-mounting state.
  • the mounting detection may only be performed for animals corresponding to at least one pair of overlapping bounding boxes.
  • the server program 164 only performs mounting detection on animals that have been determined to be adult animals at step 606 and/or that have been determined to be in a standing position at step 610. Since mounting requires an animal to be (i) in a standing position, (ii) to be an adult animal, and (iii) to be in close proximity with another animal, performing mounting detection only on animals that have been determined to be (i) adult animals, (ii) in a standing position and (iii) in an overlapping bounding box can allow the mounting detection to only be performed on a subset of animals and/or only on images that include adult animals in a standing position in an overlapping box, reducing computing resources required by the system 100.
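The three gating conditions (adult, standing, overlapping bounding box) can be sketched as a pre-filter that runs before the more expensive mounting model. The `Detection` record and its field names are hypothetical; they stand in for the outputs of the object detector, the age classifier and the activity classifier.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Detection:
    """Hypothetical per-animal record produced by the earlier pipeline stages."""
    animal_id: str
    box: tuple  # (x_min, y_min, x_max, y_max)
    is_adult: bool
    is_standing: bool


def boxes_overlap(a, b):
    """True if two axis-aligned boxes share a region of non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def mounting_candidates(detections):
    """Pairs of animal IDs that satisfy all three gating conditions."""
    pairs = []
    for d1, d2 in combinations(detections, 2):
        if not boxes_overlap(d1.box, d2.box):
            continue
        if d1.is_adult and d2.is_adult and d1.is_standing and d2.is_standing:
            pairs.append((d1.animal_id, d2.animal_id))
    return pairs
```

Only the pairs returned by `mounting_candidates` would be passed to the mounting state detection model, which is where the savings in computing resources come from.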
  • the mounting state detection model may be any suitable image classification model trained to generate classification output for an imaged animal.
  • the classification output can correspond to a mounting state or a non-mounting state.
  • the mounting state detection model may be implemented and trained as was described at step 506 of method 500 in FIG. 5.
  • the training data for training the mounting state detection model may include real and/or synthetic images corresponding to mounting and non-mounting animals.
  • FIGS. 7A-7C show example synthetic images 710, 720, 730 that may be included in the training data.
  • Synthetic images such as synthetic images 710, 720 and 730 can be generated using real images or using previously generated synthetic images and can correspond to real images or previously generated synthetic images in which patterns (e.g., colors, background, fur patterns, orientation, etc.) have been changed.
  • the training may be performed similarly to the general training that was described previously.
  • the images included in the training data may be pre-processed before being used for training the mounting state detection model.
  • the pre-processing may include, but is not limited to, resizing the images, applying color jitter to the images so that the brightness, contrast, hue, and/or saturation of the images can be varied, and/or normalizing the RGB values of the images, for example.
  • the extracted features for mounting detection may include, for example, a position of the shoulder portion and/or back portion of the animal.
  • FIGS. 12A-12H show example images of animals in a mounting state and associated heatmaps, showing features and/or areas of the image that provide highly useful information that contributes to the classification of animals in a mounting state.
  • As shown in FIGS. 12B, 12D, 12F and 12H, features related to the shoulder and back portion of an animal contribute highly to the detection of a mounting state.
  • the mounting state detection model can, for example, classify animal 1124 in image 1100c (FIG. 11C) as an animal in a mounting state.
  • the server program 164 can classify animal 2010 in image 2000 (FIG. 20A) as an animal in a mounting state.
  • the mounting state detection model may also generate a confidence value for the mounting state classification output.
  • the detection and timespan for standing heat may be used along with the detection of a mounting event to improve the accuracy of determining when the mountee animal is experiencing estrus. For example, this may be assessed by determining whether the lower animal (e.g., mountee animal) stays relatively still for a certain period of time.
  • the server program 164 may only perform mounting detection on an image portion enclosed by a bounding box.
  • the server program 164 determines if all pairs of overlapping bounding boxes have been processed. If all pairs of overlapping bounding boxes in an image have been processed, the server program 164 proceeds to the next image. If not all pairs of overlapping bounding boxes have been processed, the server program 164 returns to step 604 and repeats steps 604 to 616 until all overlapping bounding boxes have been processed.
  • If method 600 is implemented independently, the server program 164 may, at step 614, generate an output indicating that the current state of the detected animal matches a mounting state in response to detecting a mounting state. If method 600 is implemented at step 506, the output may be generated at step 508. The output may be generated as a user notification.
  • the user notification may include information related to the current state of the detected animal, and one or more images or videos of the detected animal.
  • the user notification may include location information for the animal and an image corresponding to the estrus state or the mounting state.
  • the server program 164 stores in memory (e.g., in memory module 146) the processed image in association with the detected state of the animal.
  • the stored data may be used for subsequent time-series or historical assessment of one or more animals.
  • the server program 164 generates an output indicating the assessment results. For example, in response to determining that at least one animal in a given image is in a mounting state, the server program 164 can generate a notification. In some cases, the server program 164 may only generate a notification if the confidence value of the mounting classification output meets a predetermined mounting confidence threshold. The notification can indicate that the current state of an animal matches an estrus state or a mounting state. In at least one embodiment, the notification can include identification data associated with the animal.
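Gating the notification on the mounting confidence threshold can be sketched as follows. The threshold value of 0.8, the state label strings and the dictionary layout are assumptions for illustration; the description does not give concrete values.

```python
def build_notification(animal_id, state, confidence, threshold=0.8):
    """Return a notification dict, or None if no notification should be sent.

    A notification is produced only for a mounting state whose confidence
    meets the (assumed) predetermined mounting confidence threshold.
    """
    if state != "mounting" or confidence < threshold:
        return None
    return {
        "animal_id": animal_id,
        "state": state,
        "confidence": confidence,
        "message": f"Animal {animal_id} matches a mounting state.",
    }
```

Raising the threshold reduces false alerts at the cost of possibly missing lower-confidence events, so the value would normally be tuned on validation data.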
  • the output may be generated as a user notification that is provided to one or more user devices.
  • the user devices can include, for example, personal computers, smartphones, tablet devices, workstations etc.
  • FIG. 22A shows a user notification 2204a provided at a display 2202a of a user smartphone, and FIG. 22B shows a user notification 2204b provided at a display 2202b of a user computer.
  • the user notification can be provided in any way that allows a user to be notified that an animal is in an estrus state or a mounting state.
  • the user notification 2204 may include information related to the current state of the detected animal, and one or more images or videos of the detected animal.
  • the one or more images or videos can be annotated images or videos, showing bounding boxes and/or labels indicating animals in a mounting state and identification data (e.g., identification number).
  • the user notification may include detected state information 2210, date and time information 2212, animal location information 2214, and an image/video 2216 associated with the detected state.
  • the server program 164 may provide additional data related to the animal assessment. For example, the server program 164 may provide an image 2220 of an animal in an estrus state or a mounting state after providing the related state alert.
  • the output at step 508 may be provided via a custom user interface.
  • FIGS. 23A and 23B show example embodiments of GUI 2300a and GUI 2300b respectively of an administration portal that is displayed on the monitor of a desktop computer.
  • GUI 2300 may display multiple windows including, for example, a menu 2302.
  • the menu 2302 may provide a user with various GUI viewing options and/or animal management/assessment options.
  • GUI 2300 may include various display portions, for example, an image/video portion 2404, a graph portion 2406, a calendar portion 2408 and a notification portion 2410.
  • the image/video portion 2404 may display maps, captured images/videos, and/or processed images/videos that may include one or more animals.
  • the graph portion 2406 may display one or more animal assessment results.
  • one or more outputs related to the animal assessments may be provided via the notification portion 2310 (for example, a notification 2312 related to estrus detection). In some embodiments, one or more outputs related to the animal assessments may be provided via the calendar portion 2308 (for example, a historical estrus detection indicator 2314).
  • the server program 164 may perform post-processing related to mounting states detected at step 506 (of method 500) and/or act 614 (of method 600).
  • the post-processing may be based, for example, on data stored in memory (e.g., in memory module 146) related to the detected states of the animals.
  • additional data e.g., animal identification data, animal location tracking data etc.
  • MM-LLMs multi-modal large language models
  • Confidence score cut-offs, used in conjunction with bounding box sizes, may be used as cut-offs for a mounting indication.
  • MM-LLMs may be used to take basic JSON outputted data and/or images and convert the notification into a human readable format.
  • Referring now to FIG. 13, shown therein is a flowchart showing an example embodiment of an animal assessment method 1300 that may be executed by a server program 164 of the animal assessment system of FIG. 1A, for detecting standing heat in at least one animal at the site using one or more Artificial Intelligence (AI) models, which may be in an AI pipeline.
  • Method 1300 may be used in combination with methods 500 and 600.
  • standing heat is a primary sign of estrus and corresponds to a period in the reproductive cycle of a female animal where the animal will allow itself to be mounted.
  • a standing heat state is characterized by an animal being continuously receptive to mounting (e.g., continuously being mounted) for a predefined period of time, for example, at least three seconds.
  • Standing heat can be detected using the teachings described herein by evaluating a series of images containing multiple images, or video frames, obtained chronologically.
  • the server program 164 can receive a video showing animals and extract frames chronologically at a predetermined time interval (e.g., one to a few frames per second) or extract frames occurring within a short period of time.
  • the server program 164 can receive timestamped images, captured within a short period of time (e.g., up to within 1 to a few seconds of each other).
  • consecutive frames or “consecutive images” refers to images obtained within a short period of time and that may not necessarily correspond to images captured immediately after one another (e.g., consecutive frames in a video may not correspond to frames extracted at the frame rate of the video and some frames may be omitted).
  • images I1, I2, I3, I4 and I5 obtained at t1, t2, t3, t4 and t5
  • images I1, I3 and I5 may be referred to as consecutive images, if the images are captured within a short period of time.
  • Obtaining and processing several images captured within a short period of time, on the order of about 3 to about 5 seconds, where the timestamps of consecutive images are not more than 1 second apart, can prevent standing heat from being detected in images that are captured within very short periods of time (e.g., within milliseconds, faster than an animal can move).
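The timing constraints on a series of consecutive images can be checked with a small helper such as the one below. The function name and the interpretation of the bounds (total span between 3 and 5 seconds, gaps of at most 1 second) are assumptions drawn from the figures quoted above.

```python
def is_valid_series(timestamps, max_gap_s=1.0, min_span_s=3.0, max_span_s=5.0):
    """Check that image timestamps (in seconds) form a usable series.

    The series must span roughly 3 to 5 seconds, and consecutive images must
    be at most 1 second apart. Bounds are illustrative assumptions.
    """
    ts = sorted(timestamps)
    if len(ts) < 2:
        return False
    gaps_ok = all(later - earlier <= max_gap_s for earlier, later in zip(ts, ts[1:]))
    span = ts[-1] - ts[0]
    return gaps_ok and min_span_s <= span <= max_span_s
```

A burst of frames captured within milliseconds fails the minimum span check, which is the situation the paragraph above is intended to exclude.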
  • the server program 164 receives newly captured images or images stored in a data store.
  • Step 1302 may be substantially similar to step 502 of method 500.
  • the server program 164 uses one or more machine learning algorithms to process the images captured at step 1302 and perform as an object detector to detect, locate and optionally identify animals in one or more of the acquired images.
  • Step 1304 may be substantially similar to step 504 of method 500 and the same AI models may preferably be used for object detection, although others may be used in other embodiments. Bounding boxes around the detected animals are also obtained when the animals are detected.
  • the server program 164 uses one or more machine learning algorithms to process the images captured at step 1302 to determine if one or more of the animals in a given image matches a mounting state.
  • Step 1306 may be substantially similar to step 506 of method 500.
  • Step 1306 can also include steps substantially similar to steps 604-612 of method 600, that is, the server program 164 can identify overlapping bounding boxes, perform age classification and/or activity classification, which may be done as was described for method 600, for example.
  • step 1306 may additionally involve identifying the animal being mounted, since a standing heat state is, as described, characterized by an animal being receptive to mounting for a certain period of time.
  • if the server program 164 determines that one or more animals in a given image match a mounting state, the server program 164 determines the corresponding animal(s) being mounted by the one or more animals in a mounting state by determining the lower bounding box for each mounted animal, and the animals being mounted are identified as being in the estrus state.
  • the server program 164 determines if at least one animal in a given image is in a standing heat state.
  • a standing heat state can be characterized by an animal being continuously receptive to mounting for a predefined time period, for example, at least about three seconds. Accordingly, a standing heat state can be characterized by an animal being mounted in each image in the series of images obtained over the predefined time period (e.g., a mounting time period).
  • for example, if the predetermined time period is set to be at least three seconds and frames are obtained at a rate of one frame per second, three consecutive frames containing the same animal being mounted indicate that the animal being mounted is in a standing heat state, which also indicates that the animal being mounted is in an estrus state.
  • each animal can be associated with identification data.
  • the server program 164 can use the identification data associated with each animal.
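The rule that the same animal must be mounted in every frame of the series can be sketched with a per-animal streak counter. The input format (one set of mounted animal IDs per consecutive frame, sampled at about one frame per second) and the function name are assumptions.

```python
def standing_heat_ids(frames, required_frames=3):
    """Return IDs of animals mounted in at least `required_frames` frames in a row.

    `frames` is a chronological list; each entry is the set of IDs of animals
    detected as being mounted in that frame.
    """
    streak = {}
    flagged = set()
    for mounted in frames:
        # Animals absent from this frame drop out, which resets their streak.
        streak = {animal_id: streak.get(animal_id, 0) + 1 for animal_id in mounted}
        flagged.update(a for a, count in streak.items() if count >= required_frames)
    return flagged
```

For example, `standing_heat_ids([{"A"}, {"A", "B"}, {"A"}])` flags only animal "A", because "B" was mounted in a single frame.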
  • the server program 164 may evaluate the movement of the animal being mounted. Since a standing heat state involves an animal being continuously receptive to mounting for at least a predetermined time period, a standing heat state may be characterized by minimal movement of the animal. In at least one embodiment, to evaluate the movement of the animal being mounted, the server program 164 can determine the overlap between the bounding boxes in each pair of consecutive images.
  • the server program 164 may calculate the intersection over union (IoU) (i.e., the ratio between the intersection and the union) of the bounding box of the animal determined to be mounted in image I1 obtained at t1 and the bounding box in image I2 obtained at t2.
  • the IoU measure is preferable to the intersection area, since the latter only indicates the size of the overlapping area between two bounding boxes while the IoU measure reflects the proportion of this overlap relative to the combined size of both bounding boxes. Accordingly, the IoU measure is unaffected by the objects' size, providing a consistent metric regardless of scale.
  • the server program 164 may calculate the intersection between and the union of bounding box 1410 in image 1400a (FIG.
  • Image 1400f (FIG. 14F) for example shows the bounding box 1460 of the current frame and the bounding box 1450 of frame (image) 1400e.
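The intersection over union of the mounted animal's bounding box in consecutive images, and its use as a measure of minimal movement, can be sketched as follows. The box layout `(x_min, y_min, x_max, y_max)`, the helper name `is_still` and the 0.7 threshold are assumptions; the description does not state a threshold value.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_still(boxes, min_iou=0.7):
    """True if every pair of consecutive boxes overlaps by at least min_iou."""
    return all(iou(prev, curr) >= min_iou for prev, curr in zip(boxes, boxes[1:]))
```

Because the ratio is normalized by the union, scaling both boxes by the same factor leaves the value unchanged, which is the scale independence noted above.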
  • the server program 164 can receive captured images. Similar to step 502 of method 500, the server program 164 commands the imaging devices 108 to capture images or video clips of at least one animal or an entire herd of animals. Step 1502 may be substantially similar to step 502 of method 500.
  • the server program 164 uses one or more machine learning algorithms to process the images captured at step 1502 to determine if one or more of the animals in a given image matches a chin-resting state.
  • the server program 164 can perform chin-resting detection using a chin-resting classification model.
  • the machine learning algorithms can process images to identify all animals in a given image that are in a chin-resting state.
  • the chin-resting classification model can be a machine learning algorithm that is part of the AI pipeline of AI models and that is trained to generate a chin-resting classification output identifying whether an animal is in a chin-resting state or in a non-chin-resting state.
  • the server program 164 can perform age classification on detected animals.
  • Age classification can be performed as described with reference to step 604 of method 600 and as shown in FIGS. 9A-9B.
  • the age classification may be performed for all animals included in a received image (e.g., the age classification may be conducted at step 1602). Alternatively, the age classification may only be performed for animals corresponding to at least one pair of overlapping bounding boxes. In various embodiments, age classification can be performed only on image portions enclosed by bounding boxes.
  • the server program 164 determines if both animals in a pair of overlapping bounding boxes are adults. If both animals are adults, the method 1600 proceeds to step 1608. If one or both animals are not adults, the method 1600 proceeds to step 1616 to process another pair of overlapping bounding boxes.
  • the server program 164 can perform standing detection using an activity classification model.
  • Activity classification can be performed as described with reference to step 608 of method 600 and as shown in FIGS. 10A-10B.
  • the server program 164 combines the overlapping bounding boxes.
  • the server program 164 generates a merged bounding box for the pair of overlapping bounding boxes and performs chin-resting detection on the merged bounding box.
  • the merged bounding box can be generated to include each portion of the image (i.e., every image pixel) that is enclosed by at least one of the two overlapping bounding boxes.
  • image 1800c (FIG. 18C)
  • the server program 164 can perform chin-resting detection only on the merged bounding box 1830 (also corresponding to image 1890d in FIG. 18D).
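One simple way to obtain a merged bounding box is to take the smallest axis-aligned rectangle that encloses both boxes, as sketched below. This is an assumption about how the merge could be done: the enclosing rectangle contains every pixel covered by either box, and may also include some background pixels covered by neither.

```python
def merge_boxes(a, b):
    """Smallest axis-aligned box enclosing boxes a and b.

    Boxes are (x_min, y_min, x_max, y_max).
    """
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))
```

The merged region is what would be cropped from the image and passed to the chin-resting detection model, so that both animals and their relative pose are visible in a single input.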
  • the server program 164 performs chin-resting detection using a chin-resting detection model for the animals in the merged bounding box generated at step 1612.
  • the server program 164 uses one or more machine learning algorithms to process the images captured at step 1602 to determine if one or more of the animals in a given image matches a chin-resting state.
  • the server program 164 can perform chin-resting detection using a chin-resting classification model.
  • the chin-resting classification model can be a machine learning algorithm that is part of the AI pipeline of AI models and that is trained to generate a chin-resting classification output identifying whether an animal is in a chin-resting state or not in a chin-resting state.
  • the chin-resting detection may only be performed for animals corresponding to at least one pair of overlapping bounding boxes.
  • the server program 164 only performs chin-resting detection on animals that have been determined to be adult animals at step 1604 and that have been determined to be in a standing position at step 1608. Since chin-resting indicating estrus requires an animal to be in a standing position, to be an adult animal, and to be in close proximity with another animal, performing chin-resting detection only on animals that have been determined to be adult animals, in a standing position and in an overlapping bounding box can allow the chin-resting detection to only be performed on a subset of animals and/or only on images that include adult animals in a standing position in an overlapping box, reducing computing resources required by the system 100.
  • the chin-resting detection model may be any suitable image classification model trained to generate classification output for an imaged animal.
  • the Jigsaw model, which is generally described earlier, may be used.
  • the classification output can correspond to a chin-resting state or a non-chin-resting state.
  • the chin-resting model uses the union of two animal bounding boxes, which includes the animals and the background, as shown by merged bounding box 1830 in FIG. 18C. Either a DINO/FCN or Jigsaw training methodology may be utilized.
  • For Jigsaw model training, a binary classification is modeled by preparing the dataset such that the data is cleanly segmented and appropriately labeled for the two classes.
  • a layer that is used for binary classification is a dense layer with a single output and a sigmoid activation function, providing a probability score indicating class membership.
  • Model training involves dividing the data into training (80%), validation (10%), and test sets (10%), with input sizes tailored to the input SAM data such as 256x256 pixels.
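The 80%/10%/10% division into training, validation and test sets can be sketched as below. The shuffling with a fixed seed and the function name are assumptions; any remainder left over from rounding goes to the test set in this sketch.

```python
import random


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items and split them into (train, val, test) lists."""
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

In practice, frames taken from the same video clip would usually be kept in the same subset so that near-duplicate images do not leak from training into validation or test data.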
  • the training data may include real and/or synthetic images corresponding to animals that are resting their chins on the backside of other animals and images that show animals that are not performing chin-resting. The training may generally be performed as described earlier.
  • the training data may include real and/or synthetic images corresponding to chin-resting and non-chin-resting animals.
  • FIGS. 17A-17B show example synthetic images 1710, 1720 that may be included in the training data.
  • Synthetic images such as synthetic images 1710, 1720 can be generated using real images or using previously generated synthetic images and can correspond to real images or previously generated synthetic images in which patterns (e.g., colors, background, fur patterns, etc.) have been changed.
  • the images included in the training data may also be pre-processed as described with reference to FIGS. 7A-7C.
  • the goal may be for a balance between precision and recall, and monitoring metrics such as F1 score and ROC-AUC may be used during validation to adjust hyperparameters like batch size (e.g., 32 or 64) and number of epochs (typically between 10 and 100, with early stopping applied to prevent overfitting).
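The F1 score used to balance precision and recall, together with a simple early stopping rule, can be sketched as follows. The patience value and both function names are assumptions for illustration.

```python
def f1_score(tp, fp, fn):
    """F1 score from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def should_stop(val_f1_history, patience=5):
    """True if validation F1 has not improved during the last `patience` epochs."""
    if len(val_f1_history) <= patience:
        return False
    best_before = max(val_f1_history[:-patience])
    return max(val_f1_history[-patience:]) <= best_before
```

Calling `should_stop` once per epoch with the running list of validation F1 values gives a stopping point somewhere inside the 10 to 100 epoch range mentioned above.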
  • a Sharpness-Aware Minimization (SAM) algorithm may be used to further refine the model weights. SAM seeks parameters that lie in neighborhoods having uniformly low loss, which can lead to improved generalization by favoring flatter minima of the loss landscape. SAM may be used in conjunction with a primary optimizer, such as Adam, by applying a dual-step update (a perturbation step followed by a descent step) to navigate the loss landscape more effectively.
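The dual-step SAM update can be illustrated on a toy problem. The sketch below is an assumption-laden simplification: it uses plain gradient descent as the base optimizer instead of Adam, operates on a list of floats rather than network weights, and the values of `lr` and `rho` are arbitrary.

```python
import math


def sam_step(weights, grad_fn, lr=0.1, rho=0.05):
    """One SAM update using plain gradient descent as the base optimizer.

    Step 1: move to the approximate worst-case point within radius rho.
    Step 2: apply the gradient computed at that point to the original weights.
    """
    grad = grad_fn(weights)
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0:
        return list(weights)
    perturbed = [w + rho * g / norm for w, g in zip(weights, grad)]
    grad_at_perturbed = grad_fn(perturbed)
    return [w - lr * g for w, g in zip(weights, grad_at_perturbed)]


def quadratic_loss(weights):
    """Toy loss used only to exercise the update."""
    return sum(w * w for w in weights)


def quadratic_grad(weights):
    """Gradient of the toy loss."""
    return [2 * w for w in weights]
```

Each update costs two gradient evaluations instead of one, which is the main overhead of SAM relative to the base optimizer.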
  • the extracted features for chin-resting detection may include, for example, a position and/or angle of an animal’s neck and/or chin, and/or a position of an animal’s chin relative to another animal’s head.
  • FIGS. 19A-19H show example images of animals in a chin-resting state and associated heatmaps, showing features contributing to the classification of animals in a chin-resting state.
  • As shown in FIGS. 19B, 19D, 19F and 19H, features related to the face, chin and neck portion of an animal contribute highly to the detection of a chin-resting state.
  • the server program 164 can, for example, classify, by performing chin-resting detection on the merged bounding box 1830, animal 1824 as an animal in a chin-resting state and animal 1124 in image 1100c (FIG. 11C) as an animal in a chin-resting state.
  • the other animals in each of those images that are receiving the chin on their rear end are identified as animals in an estrus state.
  • the server program 164 can classify animal 2052 in image 2050 (FIG. 20B) as an animal in a chin-resting state and animal 2054 in image 2050 as an animal not in a chin-resting state.
  • the chin-resting detection model may also generate a confidence value for the generated chin-resting classification output.
  • the server program 164 may only perform chin-resting detection on an image portion enclosed by a bounding box.
  • the server program 164 may, at step 1614, generate an output indicating that the current state of the detected animal matches a chin-resting state in response to detecting a chin-resting state. If method 1600 is implemented at step 1606, the output may be generated at step 1508. The output may be generated as a user notification.
  • the user notification may include information related to the current state of the detected animal, and one or more images or videos of the detected animal. For example, the user notification may include location information and an image corresponding to the chin-resting state and identifying the detected animal that is in the estrus state.
  • the server program 164 stores in memory (e.g., in memory module 146) the processed image in association with the detected state of the animal.
  • the stored data may be used for subsequent time-series or historical assessment of one or more animals.
  • the server program 164 generates an output indicating the assessment results. For example, in response to determining that at least one animal in a given image is in a chin-resting state, the server program 164 can generate a notification.
  • the notification can indicate that the current state of an animal matches an estrus state or a chin-resting state and may also include the location and identification of the animal that is in the estrus state or the chin-resting state.
  • Step 1508 may be substantially similar to step 508 of method 500.
  • Although FIGS. 22A and 22B only show user notifications for a mounting state, it will be understood that similar user notifications can be generated for chin-resting.
  • steps 1502-1506 can be repeated over a predetermined period of time to evaluate changes in chin-resting behaviors.
  • the server program 164 can calculate the number of chin-resting events occurring over a predetermined period of time and generate an output at step 1508 to indicate the number of chin-resting events detected. If the number of chin-resting state events exceeds a predetermined threshold over a predetermined time period, which may be referred to as a predetermined chin-resting count threshold and a predetermined chin-resting time period, respectively, then the server program 164 may generate an output indicating a high likelihood of breeding receptivity.
  • the server program 164 can calculate changes in the number of chin-resting events over a predetermined time period and generate an output at step 1508 based on the calculated changes.
  • An increase in chin-resting events for a given time period can indicate a high likelihood of breeding receptivity. For instance, if it is observed that the number of chin-resting events within a 24-hour period increases by more than 20% compared to the previous 24 hours, then a notification may be sent to personnel who run the animal facility. The 20% threshold may be adjusted to obtain better test results.
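The two triggers described above (an absolute chin-resting count threshold and a relative increase of more than 20% over the previous period) can be combined as sketched below. The count threshold of 10 and the handling of a previous count of zero are assumptions; only the 20% figure comes from the text.

```python
def chin_resting_alert(prev_count, curr_count, count_threshold=10, increase_threshold=0.20):
    """True if the current period suggests a high likelihood of breeding receptivity.

    An alert is raised when the current count exceeds the count threshold, or
    when it has grown by more than `increase_threshold` relative to the
    previous period.
    """
    exceeds_count = curr_count > count_threshold
    if prev_count > 0:
        increased = (curr_count - prev_count) / prev_count > increase_threshold
    else:
        # Assumption: any events after a period with none count as an increase.
        increased = curr_count > 0
    return exceeds_count or increased
```

For example, 13 events after a previous period of 10 is a 30% increase and raises an alert, while 11 events after 10 does not unless the count threshold is also exceeded.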
  • Referring now to FIG. 21, shown therein is a flowchart showing another example embodiment of an animal assessment method 2100 that may be executed by a server program of the animal assessment system of FIG. 1A, for detecting a state indicative of estrus in at least one detected animal at a monitoring site, such as a ranch for example, using one or more Artificial Intelligence (AI) models which may be part of an AI pipeline.
  • the server program 164 may proceed to step 2114 and perform chin-resting detection on the combined bounding box that includes the higher bounding box and the lower bounding box if the trigger condition for the chin-resting detection is met, namely that there is one adult animal in each of the overlapping bounding boxes and both animals are standing.
  • Mounting detection may be performed according to methods 500 and 600.
  • Chin-resting may be performed according to methods 1500 and 1600.
  • Referring now to FIGS. 24A and 24B, shown therein are images of example user notifications 2400 and 2450, respectively, provided by at least one embodiment of an animal assessment system described herein.
  • the various user notifications may include an image of the detected event as well as text that describes what is in the image and indicates the state of the animals in the image.
  • In FIG. 24A, a portion of an image of user notification 2400 is shown, including an image 2410, location and time information 2420, image description text 2430 and a detected event indication 2440.
  • the image 2410 was obtained from a plurality of images based on the analysis performed according to the teachings herein to determine that one of the cows in the image 2410 is experiencing estrus.
  • the image description text 2430 includes text describing what is in the image 2410 and the detected event indication 2440 includes text that describes the detected event.
  • In FIG. 24B, a portion of an image of user notification 2450 is shown, including an image 2460, image description text 2480 and a detected event indication 2490.
  • the image 2460 was obtained from a plurality of images based on the analysis performed according to the teachings herein to determine that one of the cows in the image 2460 is experiencing estrus.
  • the image description text 2480 includes text describing what is in the image 2460 and the detected event indication 2490 includes text that describes the detected event.
  • the image description text 2480 is: “The cow in the image is exhibiting mounting behavior, where it is placing its front legs on the backend of another cow.
  • the multi-modal AI model is an AI model that can analyze text (e.g., the first and second descriptions) and can analyze images (e.g., the input image) to determine whether an animal in the input image is experiencing estrus and generate text to describe what is shown in the input image.
  • text e.g., the first and second descriptions
  • images e.g., the input image
  • the multi-modal deep learning AI model is similar to a large language model but may process both input text and an input image and map them to the same embedding space.
  • the multi-modal deep learning AI model may be a commercially available model such as, but not limited to, GPT-4 Vision or DALL-E from OpenAI, or Gemini Pro or Gemini Ultra from Google, for example.
  • the multi-modal deep learning AI models can process and understand both textual and visual inputs, allowing them to perform tasks that involve understanding and generating content across different media types.
  • the multi-modal deep learning AI models are very computationally intensive and/or time-consuming.
  • the estrus detection models and methods described herein are used, which can be more computationally efficient and orders of magnitude faster, and which act as a filter that reduces a plurality of images to a smaller set of images that are more likely to include an animal experiencing estrus; these images can be referred to as candidate estrus images.
  • these one or more AI models may be a keypoint AI model, an image classification AI model, or an ensemble of AI models. Each of these models may be referred to as a unimodal AI model that operates on the plurality of images and on objects that may be visually related to the images, where these objects may be keypoints and/or bounding boxes.
  • the candidate estrus images are then provided to the more computationally intensive multi-modal deep learning AI model, which is used to further analyze each candidate estrus image to determine the type of estrus being experienced based on what is shown in the candidate estrus image and to provide text that describes the estrus state.
  • the candidate estrus image may include a bounding box around the two animals that are being assessed for estrus. This may enable the multi-modal model to focus its analysis on the detected animal(s) included in the bounding box and reduce analysis time and/or computational resources consumed.
  • the candidate estrus image may not include a bounding box. This may enable the multi-modal model to include the environment/context around the detected animal(s) during analysis.
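The filtering arrangement described above, where a fast unimodal model selects candidate estrus images and only those candidates are passed to the slower multi-modal model, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name, the callable signatures, and the default threshold are hypothetical, and both models are represented as plain callables rather than any particular library's API.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical signatures (assumptions, not from the application):
#   fast_model(image)       -> estrus confidence score in [0, 1]
#   multimodal_model(image) -> text describing the estrus state
FastModel = Callable[[object], float]
MultiModalModel = Callable[[object], str]


def two_stage_estrus_analysis(
    images: Iterable[object],
    fast_model: FastModel,
    multimodal_model: MultiModalModel,
    threshold: float = 0.8,  # assumed cut-off for a "candidate estrus image"
) -> List[Tuple[object, str]]:
    """Filter images with the cheap model, then describe only the candidates."""
    # Stage 1: the fast unimodal model acts as a filter over all images.
    candidates = [img for img in images if fast_model(img) >= threshold]
    # Stage 2: the expensive multi-modal model runs only on the candidates.
    return [(img, multimodal_model(img)) for img in candidates]
```

Because the multi-modal model is invoked once per candidate rather than once per captured image, its cost scales with the number of candidates instead of the size of the full image set.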
  • BETSY is a mounting watcher that helps in analyzing early signs of estrus in cows. It acts like the eyes of the rancher.
  • the confidence score that is determined by the AI models when classifying whether a detected animal in an image being analyzed is in the mounting, chin-resting or standing heat state may be used as a filter by comparing the confidence score to a confidence score threshold. This may aid with reducing the number of false positives. For example, alerts may be issued only if they meet a first confidence threshold (e.g., at least 80%, 85% or higher) for mounting and standing heat, and a second, higher confidence threshold for chin-resting.
  • mounting may be considered to be more of a typical behavior for cows in estrus compared to chin-resting since chin-resting may occur for cows that are not in estrus.
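The per-state confidence filtering described above can be sketched as a lookup of a threshold by state. This is a minimal sketch: the 0.80 value follows the "at least 80%" example in the text, while the 0.90 value for chin-resting is an assumption, since the text only says the second threshold is higher. The state labels and function name are hypothetical.

```python
# Thresholds per detected state. 0.80 follows the "at least 80%" example;
# 0.90 for chin-resting is an assumed value (the text only says "higher").
CONFIDENCE_THRESHOLDS = {
    "mounting": 0.80,
    "standing_heat": 0.80,
    "chin_resting": 0.90,
}


def should_alert(state: str, confidence: float) -> bool:
    """Return True if a detection is confident enough to trigger an alert."""
    threshold = CONFIDENCE_THRESHOLDS.get(state)
    if threshold is None:
        # States that are not estrus-related never trigger an alert.
        return False
    return confidence >= threshold
```

With these values, a detection scored at 0.85 would raise an alert for mounting but not for chin-resting, reflecting that chin-resting is treated as the weaker indicator.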
  • the mounting and standing heat classification models may be used as the primary detection methods, and the chin-resting classification model may be used as a supplementary detector. For example, the number of occurrences of detected chin-resting events may be counted and any increase in the frequency of chin-resting may be used to determine whether cows in a herd are in estrus.
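One way to use chin-resting as a supplementary signal is to count detected events per day and compare the latest count with a recent baseline. The sketch below is illustrative only: the daily granularity, the baseline window, the 1.5x ratio, and the function names are all assumptions and are not specified in the application.

```python
from collections import Counter
from datetime import date, datetime, timedelta
from typing import Iterable


def daily_counts(event_times: Iterable[datetime]) -> Counter:
    """Count detected chin-resting events per calendar day."""
    return Counter(t.date() for t in event_times)


def chin_resting_increased(
    event_times: Iterable[datetime],
    day: date,
    baseline_days: int = 3,  # assumed look-back window
    ratio: float = 1.5,      # assumed factor that counts as an "increase"
) -> bool:
    """Return True if the count on `day` exceeds the recent baseline by `ratio`."""
    counts = daily_counts(event_times)
    previous = [counts.get(day - timedelta(days=i), 0)
                for i in range(1, baseline_days + 1)]
    baseline = sum(previous) / baseline_days
    today = counts.get(day, 0)
    if baseline == 0:
        # With no prior events, any event at all is treated as an increase.
        return today > 0
    return today >= ratio * baseline
```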
  • each of the estrus detection methods described herein may be modified to include sex classification as a prefilter/pre-screening step before performing any of the steps for detecting mounting, chin-resting and/or standing heat, by checking that the animals in the pair of bounding boxes are female. This may be done by running a sex classifier, which is an AI model, on the detected animal or by using the identification of the detected animal to find information on the animal in a database, where the information includes sex data for the detected animal. This may reduce false positives since male animals may exhibit behaviours that may be mistaken for estrus-related behaviours.
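The sex pre-screening step described above can be sketched as a check that runs before any mounting, chin-resting or standing heat analysis. This is a hedged sketch: the dictionary layout for a detected animal, the `"female"` label, and both function names are hypothetical, and the sex classifier is represented as an arbitrary callable rather than a specific model.

```python
from typing import Callable, Dict, Optional, Sequence

# Hypothetical representation of a detected animal:
#   {"id": Optional[str], "crop": image data inside its bounding box}
Animal = Dict[str, object]


def resolve_sex(
    animal: Animal,
    sex_db: Dict[str, str],
    sex_classifier: Callable[[object], str],
) -> str:
    """Look up sex by animal ID in the database, else fall back to the classifier."""
    animal_id: Optional[object] = animal.get("id")
    if animal_id is not None and animal_id in sex_db:
        return sex_db[animal_id]
    return sex_classifier(animal["crop"])


def pair_is_female(
    pair: Sequence[Animal],
    sex_db: Dict[str, str],
    sex_classifier: Callable[[object], str],
) -> bool:
    """Pre-filter: continue with estrus detection only if both animals are female."""
    return all(resolve_sex(a, sex_db, sex_classifier) == "female" for a in pair)
```

Checking the database first avoids running the classifier for animals that have already been identified.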
  • any description herein of a process, method or step that may be executed by a server program should be understood as meaning that at least one processor or other computing device/electronics (e.g., an ASIC, dedicated hardware, etc.) is executing one or more programs, which may be stored on a server or another storage device, for performing the functions defined by the program.

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Veterinary Medicine (AREA)
  • Zoology (AREA)
  • Pregnancy & Childbirth (AREA)
  • Biophysics (AREA)
  • Wood Science & Technology (AREA)
  • Animal Husbandry (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Methods and systems are described for managing animals exhibiting estrus-related behavior by processing one or more images captured by one or more imaging devices using one or more AI models to: detect and locate two animals with a pair of overlapping bounding boxes in a given image of said images; determine whether the current state of a first animal of the two detected animals corresponds to a mounting state or a chin-resting state and, when the first animal is in the mounting state or the chin-resting state, identify a second animal of the two detected animals as being mounted by the first animal or receiving the chin of the first animal, which indicates that the second animal is in an estrus state; and generate a notification when the second animal is in the estrus state or when the first animal is in the mounting or chin-resting state.
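The first step of the abstract, locating two animals whose bounding boxes overlap, can be illustrated with a simple axis-aligned overlap test. This is a sketch under assumptions: the `(x_min, y_min, x_max, y_max)` box convention and the function names are hypothetical, and the application may use a different overlap criterion.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def boxes_overlap(a: Box, b: Box) -> bool:
    """Return True if two axis-aligned boxes share a region of non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def overlapping_pairs(boxes: Sequence[Box]) -> List[Tuple[int, int]]:
    """Return index pairs of detected animals whose bounding boxes overlap."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(boxes), 2)
        if boxes_overlap(a, b)
    ]
```

Only the pairs returned here would need to be passed on to the mounting or chin-resting state classification.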
PCT/CA2025/050447 2024-03-29 2025-03-28 Systems and methods for detecting reproductive receptivity in livestock and other quadruped mammals Pending WO2025199652A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202463572171P 2024-03-29 2024-03-29
US63/572,171 2024-03-29

Publications (1)

Publication Number Publication Date
WO2025199652A1 true WO2025199652A1 (fr) 2025-10-02

Family

ID=97219762

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2025/050447 Pending WO2025199652A1 (fr) 2024-03-29 2025-03-28 Systems and methods for detecting reproductive receptivity in livestock and other quadruped mammals

Country Status (1)

Country Link
WO (1) WO2025199652A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180035648A1 (en) * 2015-03-31 2018-02-08 Kyocera Corporation Mounting behavior detection system and detection method
WO2020031050A1 (fr) * 2018-08-04 2020-02-13 Kaur Kamalpavit System and device for managing the health and fertility of one or more dairy animals
CN111685060A (zh) * 2020-06-10 2020-09-22 彭东乔 Method for recognizing estrus behavior in ruminants based on artificial intelligence
US20220022427A1 (en) * 2020-04-27 2022-01-27 It Tech Co., Ltd. Ai-based livestock management system and livestock management method thereof
KR102527058B1 (ko) * 2022-12-01 2023-05-02 한국아이오티 주식회사 Device for detecting mounting behavior of cattle

Similar Documents

Publication Publication Date Title
US12193414B2 (en) Animal visual identification, tracking, monitoring and assessment systems and methods thereof
Marsot et al. An adaptive pig face recognition approach using Convolutional Neural Networks
CN115830490A (zh) Multi-target tracking and behavior statistics method for group-housed pigs
Gao et al. CNN-Bi-LSTM: A complex environment-oriented cattle behavior classification network based on the fusion of CNN and Bi-LSTM
Taiwo et al. Vision transformers for automated detection of pig interactions in groups
Biglari et al. A vision-based cattle recognition system using tensorflow for livestock water intake monitoring
KR102349851B1 (ko) System and method for providing a multi-object recognition service for companion animals using a camera
Bai et al. Recognition of the behaviors of dairy cows by an improved YOLO
Guo et al. Pigeon cleaning behavior detection algorithm based on light-weight network
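KR102882452B1 (ko) Method for identifying abnormal behavior of companion animals using image information
CN115761896A (zh) Method, system, device, and medium for recognizing abnormal behavior of pigs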
Yu et al. Holstein-Friesian re-identification using multiple cameras and self-supervision on a working farm
Yang et al. A long-term video tracking method for group-housed pigs
Liu et al. An Accurate and Lightweight Algorithm for Caged Chickens Detection based on Deep Learning.
Shao et al. Research on dynamic pig counting method based on improved YOLOv7 combined with DeepSORT
US20250338830A1 (en) Ai-based livestock management system and livestock management method thereof
Yang et al. A Computer Vision Pipeline for Individual-Level Behavior Analysis: Benchmarking on the Edinburgh Pig Dataset
Gong et al. Detection of group-housed pigs feeding behavior using deep learning and edge devices
Han et al. Social Behavior Atlas: A computational framework for tracking and mapping 3D close interactions of free-moving animals
Xue et al. Aggressive behavior recognition and welfare monitoring in yellow-feathered broilers using FCTR and wearable identity tags
WO2025199652A1 (fr) Systems and methods for detecting reproductive receptivity in livestock and other quadruped mammals
Hu et al. PB-STR: A spatiotemporal transformer network for multi-behavior recognition of pigs
Kapoor et al. Advancements in Animal Behaviour Monitoring and Livestock Management: A Review
Aji et al. YOLO-CE for Accurate Livestock Detection in Challenging Landfill Environments.
Hu et al. Recognizing and localizing chicken behaviors in videos based on spatiotemporal feature learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 25779168

Country of ref document: EP

Kind code of ref document: A1