PL4053695T3 - Systems, methods, and apparatuses for dot product operations

Systems, methods, and apparatuses for dot product operations

Info

Publication number
PL4053695T3
Authority
PL
Poland
Prior art keywords
apparatuses
systems
methods
production operations
dot production
Prior art date
Application number
PL22169888.9T
Other languages
English (en)
Inventor
Robert Valentine
Dan Baum
Zeev Sperber
Jesus Corbal
Elmoustapha OULD-AHMED-VALL
Bret L. Toll
Mark Charney
Menachem ADELMAN
Barukh ZIV
Alexander Heinecke
Simon Rubanovich
Original Assignee
Intel Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corporation
Publication of PL4053695T3

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0207 Addressing or allocation; Relocation with multidimensional access, e.g. row/column, matrix
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G06F9/30014 Arithmetic instructions with variable precision
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30032 Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G06F9/30038 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations using a mask
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G06F9/30043 LOAD or STORE instructions; Clear instruction
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30105 Register structure
    • G06F9/30109 Register structure having multiple operands in a single register
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30105 Register structure
    • G06F9/30112 Register structure comprising data of variable length
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/3012 Organisation of register space, e.g. banked or distributed register file
    • G06F9/30134 Register stacks; shift registers
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G06F9/30149 Instruction analysis, e.g. decoding, instruction word fields of variable length instructions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G06F9/3016 Decoding the operand specifier, e.g. specifier format
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181 Instruction operation extension or modification
    • G06F9/30185 Instruction operation extension or modification according to one or more bits in the instruction, e.g. prefix, sub-opcode
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181 Instruction operation extension or modification
    • G06F9/30196 Instruction operation extension or modification using decoder, e.g. decoder per instruction set, adaptable or programmable decoders
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3818 Decoding for concurrent execution
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3861 Recovery, e.g. branch miss-prediction, exception handling
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/45 Caching of specific data in cache memory
    • G06F2212/454 Vector or matrix data
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/45 Caching of specific data in cache memory
    • G06F2212/455 Image or video data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Advance Control (AREA)
  • Nonlinear Science (AREA)
  • Databases & Information Systems (AREA)
  • Algebra (AREA)
  • Complex Calculations (AREA)
  • Executing Machine-Instructions (AREA)
PL22169888.9T 2017-03-20 2017-07-01 Systems, methods, and apparatuses for dot product operations PL4053695T3 (pl)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US201762473732P 2017-03-20 2017-03-20

Publications (1)

Publication Number Publication Date
PL4053695T3 (pl) 2026-02-23

Family

ID=63584598

Family Applications (1)

Application Number Title Priority Date Filing Date
PL22169888.9T PL4053695T3 (pl) 2017-03-20 2017-07-01 Systems, methods, and apparatuses for dot product operations

Country Status (5)

Country Link
US (29) US11263008B2 (pl)
EP (12) EP4137940A1 (pl)
CN (10) CN117407644A (pl)
PL (1) PL4053695T3 (pl)
WO (12) WO2018174925A1 (pl)

Families Citing this family (52)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3971711A1 (en) 2016-10-20 2022-03-23 INTEL Corporation Systems, apparatuses, and methods for fused multiply add
WO2018174925A1 (en) 2017-03-20 2018-09-27 Intel Corporation Systems, methods, and apparatuses for dot production operations
WO2019009870A1 (en) 2017-07-01 2019-01-10 Intel Corporation Context save with variable save state size
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US10671349B2 (en) * 2017-07-24 2020-06-02 Tesla, Inc. Accelerated mathematical engine
US11157441B2 (en) 2017-07-24 2021-10-26 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
KR102477404B1 (ko) * 2017-08-31 2022-12-13 캠브리콘 테크놀로지스 코퍼레이션 리미티드 Chip device and related products
US11114138B2 (en) 2017-09-15 2021-09-07 Groq, Inc. Data structures with multiple read ports
US11868804B1 (en) 2019-11-18 2024-01-09 Groq, Inc. Processor instruction dispatch configuration
US11243880B1 (en) 2017-09-15 2022-02-08 Groq, Inc. Processor architecture
US11360934B1 (en) * 2017-09-15 2022-06-14 Groq, Inc. Tensor streaming processor architecture
US11170307B1 (en) 2017-09-21 2021-11-09 Groq, Inc. Predictive model compiler for generating a statically scheduled binary with known resource constraints
WO2019114842A1 (zh) 2017-12-14 2019-06-20 北京中科寒武纪科技有限公司 Integrated circuit chip device
US11023235B2 (en) 2017-12-29 2021-06-01 Intel Corporation Systems and methods to zero a tile register pair
US11669326B2 (en) * 2017-12-29 2023-06-06 Intel Corporation Systems, methods, and apparatuses for dot product operations
US11816483B2 (en) * 2017-12-29 2023-11-14 Intel Corporation Systems, methods, and apparatuses for matrix operations
US11789729B2 (en) 2017-12-29 2023-10-17 Intel Corporation Systems and methods for computing dot products of nibbles in two tile operands
US11093247B2 (en) 2017-12-29 2021-08-17 Intel Corporation Systems and methods to load a tile register pair
US11809869B2 (en) 2017-12-29 2023-11-07 Intel Corporation Systems and methods to store a tile register pair to memory
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11132233B2 (en) 2018-05-07 2021-09-28 Micron Technology, Inc. Thread priority management in a multi-threaded, self-scheduling processor
US10901745B2 (en) 2018-07-10 2021-01-26 International Business Machines Corporation Method and apparatus for processing storage instructions
US12340300B1 (en) 2018-09-14 2025-06-24 Groq, Inc. Streaming processor architecture
US11455370B2 (en) 2018-11-19 2022-09-27 Groq, Inc. Flattened input stream generation for convolution with expanded kernel
US10853067B2 (en) 2018-09-27 2020-12-01 Intel Corporation Computer processor for higher precision computations using a mixed-precision decomposition of operations
US10963256B2 (en) * 2018-09-28 2021-03-30 Intel Corporation Systems and methods for performing instructions to transform matrices into row-interleaved format
CN111061507A (zh) * 2018-10-16 2020-04-24 上海寒武纪信息科技有限公司 Operation method, device, computer equipment, and storage medium
US10963246B2 (en) 2018-11-09 2021-03-30 Intel Corporation Systems and methods for performing 16-bit floating-point matrix dot product instructions
CN111338974B (zh) * 2018-12-19 2025-05-16 超威半导体公司 Tiling algorithm for a matrix math instruction set
US11042372B2 (en) 2019-05-24 2021-06-22 Texas Instruments Incorporated Vector bit transpose
US11687341B2 (en) * 2019-08-29 2023-06-27 Intel Corporation Multi-variate strided read operations for accessing matrix operands
US11188618B2 (en) * 2019-09-05 2021-11-30 Intel Corporation Sparse matrix multiplication acceleration mechanism
CN110727412B (zh) * 2019-09-14 2022-01-07 无锡江南计算技术研究所 Mask-based low-power control method and device for hybrid floating-point multiplication
WO2021108090A1 (en) 2019-11-26 2021-06-03 Mythic, Inc. Systems and methods for implementing redundancy for tile-based intelligence processing computing architecture
US11392535B2 (en) 2019-11-26 2022-07-19 Groq, Inc. Loading operands and outputting results from a multi-dimensional array using only a single side
CN112668015B (zh) * 2019-12-12 2022-02-01 华控清交信息科技(北京)有限公司 Data processing method, device, and device for data processing
CN113094099A (zh) * 2019-12-23 2021-07-09 超威半导体(上海)有限公司 Matrix data broadcast architecture
US11714875B2 (en) 2019-12-28 2023-08-01 Intel Corporation Apparatuses, methods, and systems for instructions of a matrix operations accelerator
US20210334072A1 (en) * 2020-04-22 2021-10-28 Facebook, Inc. Mapping convolution to connected processing elements using distributed pipelined separable convolution operations
GB2596056B (en) * 2020-05-27 2022-06-08 Graphcore Ltd Exception register delay
US11593454B2 (en) 2020-06-02 2023-02-28 Intel Corporation Matrix operation optimization mechanism
US12112167B2 (en) 2020-06-27 2024-10-08 Intel Corporation Matrix data scatter and gather between rows and irregularly spaced memory locations
US20220051086A1 (en) * 2020-08-17 2022-02-17 Alibaba Group Holding Limited Vector accelerator for artificial intelligence and machine learning
US12112171B2 (en) * 2020-09-26 2024-10-08 Intel Corporation Loop support extensions
US12474928B2 (en) 2020-12-22 2025-11-18 Intel Corporation Processors, methods, systems, and instructions to select and store data elements from strided data element positions in a first dimension from three source two-dimensional arrays in a result two-dimensional array
US11561794B2 (en) 2021-05-26 2023-01-24 International Business Machines Corporation Evicting and restoring information using a single port of a logical register mapper and history buffer in a microprocessor comprising multiple main register file entries mapped to one accumulator register file entry
US12425047B2 (en) 2021-06-15 2025-09-23 Intel Corporation Methods and apparatus to perform weight and activation compression and decompression
US20230289398A1 (en) * 2022-03-10 2023-09-14 Nvidia Corporation Efficient Matrix Multiply and Add with a Group of Warps
US20240320292A1 (en) * 2023-03-23 2024-09-26 Arm Limited Matrix multiplication in a dynamically spatially and dynamically temporally dividable architecture
US20240320005A1 (en) * 2023-03-23 2024-09-26 Arm Limited Matrix multiplication in a dynamically spatially and dynamically temporally dividable architecture
CN119225815B (zh) * 2024-11-28 2025-03-11 英特尔(中国)研究中心有限公司 Processing device, processing method, and computer-readable storage medium

Family Cites Families (271)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US138524A (en) 1873-05-06 Improvement in water-indicators for steam-boilers
US2002824A (en) 1932-11-19 1935-05-28 Mayer Bruno Roll film camera
US3933224A (en) 1973-05-30 1976-01-20 Stockamollan Ab Fork lift truck
US4310879A (en) 1979-03-08 1982-01-12 Pandeya Arun K Parallel processor having central processor memory extension
US5142677A (en) 1989-05-04 1992-08-25 Texas Instruments Incorporated Context switching devices, systems and methods
US5247632A (en) 1989-01-23 1993-09-21 Eastman Kodak Company Virtual memory management arrangement for addressing multi-dimensional arrays in a digital data processing system
US5025407A (en) 1989-07-28 1991-06-18 Texas Instruments Incorporated Graphics floating point coprocessor having matrix capabilities
US5170370A (en) * 1989-11-17 1992-12-08 Cray Research, Inc. Vector bit-matrix multiply functional unit
US5263136A (en) 1991-04-30 1993-11-16 Optigraphics Corporation System for managing tiled images using multiple resolutions
JP2572522B2 (ja) 1992-05-12 1997-01-16 インターナショナル・ビジネス・マシーンズ・コーポレイション Computing device
US5475822A (en) 1993-11-15 1995-12-12 Motorola, Inc. Data processing system for resuming instruction execution after an interrupt and method therefor
JP2932963B2 (ja) 1994-01-21 1999-08-09 モトローラ・インコーポレイテッド Data processor having efficient bit-transfer capability and method therefor
US5426378A (en) 1994-04-20 1995-06-20 Xilinx, Inc. Programmable logic device which stores more than one configuration and means for switching configurations
US5761466A (en) 1994-05-09 1998-06-02 Lsi Logic Corporation Soft programmable single-cycle/pipelined micro-programmed control system
US5584027A (en) 1994-08-31 1996-12-10 Motorola Inc. Method and apparatus for finding induction variables for use in compiling computer instructions
US5513366A (en) 1994-09-28 1996-04-30 International Business Machines Corporation Method and system for dynamically reconfiguring a register file in a vector processor
US5887183A (en) 1995-01-04 1999-03-23 International Business Machines Corporation Method and system in a data processing system for loading and storing vectors in a plurality of modes
US7301541B2 (en) * 1995-08-16 2007-11-27 Microunity Systems Engineering, Inc. Programmable processor and method with wide operations
US6643765B1 (en) 1995-08-16 2003-11-04 Microunity Systems Engineering, Inc. Programmable processor with group floating point operations
CN102707922B (zh) 1995-08-31 2015-10-07 英特尔公司 Apparatus for controlling bit correction of shifted packed data
US6041403A (en) 1996-09-27 2000-03-21 Intel Corporation Method and apparatus for generating a microinstruction responsive to the specification of an operand, in addition to a microinstruction based on the opcode, of a macroinstruction
US5892962A (en) 1996-11-12 1999-04-06 Lucent Technologies Inc. FPGA-based processor
US6161219A (en) 1997-07-03 2000-12-12 The University Of Iowa Research Foundation System and method for providing checkpointing with precompile directives and supporting software to produce checkpoints, independent of environment constraints
US6393554B1 (en) * 1998-01-28 2002-05-21 Advanced Micro Devices, Inc. Method and apparatus for performing vector and scalar multiplication and calculating rounded products
US6418529B1 (en) 1998-03-31 2002-07-09 Intel Corporation Apparatus and method for performing intra-add operation
US6282634B1 (en) 1998-05-27 2001-08-28 Arm Limited Apparatus and method for processing data having a mixed vector/scalar register file
US6018799A (en) 1998-07-22 2000-01-25 Sun Microsystems, Inc. Method, apparatus and computer program product for optimizing registers in a stack using a register allocator
US6069489A (en) 1998-08-04 2000-05-30 Xilinx, Inc. FPGA having fast configuration memory data readback
EP2309383B1 (en) 1998-08-24 2012-05-09 MicroUnity Systems Engineering, Inc. A processor for and method of executing a single wide switch instruction using a wide operand
US6839728B2 (en) 1998-10-09 2005-01-04 Pts Corporation Efficient complex multiplication and fast fourier transform (FFT) implementation on the manarray architecture
US6282557B1 (en) 1998-12-08 2001-08-28 International Business Machines Corporation Low latency fused multiply-adder
FR2787233B1 (fr) 1998-12-11 2001-02-16 St Microelectronics Sa Method for verifying the integrity of the decoding circuits of a memory
US6487171B1 (en) 1999-05-19 2002-11-26 3Com Corporation Crossbar switching matrix with broadcast buffering
KR100331565B1 (ko) 1999-12-17 2002-04-06 윤종용 Matrix operation apparatus and digital signal processing apparatus having a matrix operation function
US20020032710A1 (en) 2000-03-08 2002-03-14 Ashley Saulsbury Processing architecture having a matrix-transpose capability
US6487524B1 (en) 2000-06-08 2002-11-26 Bbnt Solutions Llc Methods and apparatus for designing a system using the tensor convolution block toeplitz-preconditioned conjugate gradient (TCBT-PCG) method
US6647484B1 (en) 2000-09-19 2003-11-11 3 Dsp Corporation Transpose address mode in general purpose DSP processor
US20020112148A1 (en) * 2000-12-15 2002-08-15 Perry Wang System and method for executing predicated code out of order
GB2370380B (en) 2000-12-19 2003-12-31 Picochip Designs Ltd Processor architecture
GB0103472D0 (en) 2001-02-13 2001-03-28 Lsi Logic Corp Data processing system
US6901422B1 (en) 2001-03-21 2005-05-31 Apple Computer, Inc. Matrix multiplication in a vector processing system
US7016418B2 (en) 2001-08-07 2006-03-21 Ati Technologies, Inc. Tiled memory configuration for mapping video data and method thereof
US6683392B2 (en) 2001-08-20 2004-01-27 The Boeing Company Switch matrix
AU2002339867A1 (en) 2001-09-04 2003-03-18 Microunity Systems Engineering, Inc. System and method for performing multiplication
US7430578B2 (en) 2001-10-29 2008-09-30 Intel Corporation Method and apparatus for performing multiply-add operations on packed byte data
US7725521B2 (en) 2001-10-29 2010-05-25 Intel Corporation Method and apparatus for computing matrix transformations
CN1142484C (zh) * 2001-11-28 2004-03-17 中国人民解放军国防科学技术大学 Microprocessor vector processing method
US6877085B2 (en) * 2001-11-30 2005-04-05 Broadcom Corporation Mechanism for processing speclative LL and SC instructions in a pipelined processor
US6877020B1 (en) 2001-12-31 2005-04-05 Apple Computer, Inc. Method and apparatus for matrix transposition
US7251811B2 (en) * 2002-01-02 2007-07-31 Intel Corporation Controlling compatibility levels of binary translations between instruction set architectures
US7003542B2 (en) 2002-01-02 2006-02-21 Intel Corporation Apparatus and method for inverting a 4×4 matrix
US7315934B2 (en) 2002-03-06 2008-01-01 Matsushita Electric Industrial Co., Ltd. Data processor and program for processing a data matrix
US20030221089A1 (en) 2002-05-23 2003-11-27 Sun Microsystems, Inc. Microprocessor data manipulation matrix module
US7209939B2 (en) 2002-07-11 2007-04-24 Sun Microsystems, Inc. Precision improvement method for the Strassen/Winograd matrix multiplication method
BR0316042A (pt) 2002-11-06 2005-09-13 Procter & Gamble Kits containing a body compress and a releasably attachable thermal device
US7061495B1 (en) 2002-11-18 2006-06-13 Ati Technologies, Inc. Method and apparatus for rasterizer interpolation
US6944747B2 (en) 2002-12-09 2005-09-13 Gemtech Systems, Llc Apparatus and method for matrix data processing
US6873596B2 (en) * 2003-05-13 2005-03-29 Nokia Corporation Fourier-transform based linear equalization for CDMA downlink
US7610466B2 (en) 2003-09-05 2009-10-27 Freescale Semiconductor, Inc. Data processing system using independent memory and register operand size specifiers and method thereof
US7315932B2 (en) 2003-09-08 2008-01-01 Moyer William C Data processing system having instruction specifiers for SIMD register operands and method thereof
US7275148B2 (en) 2003-09-08 2007-09-25 Freescale Semiconductor, Inc. Data processing system using multiple addressing modes for SIMD operations and method thereof
US7107436B2 (en) 2003-09-08 2006-09-12 Freescale Semiconductor, Inc. Conditional next portion transferring of data stream to or from register based on subsequent instruction aspect
US7298925B2 (en) 2003-09-30 2007-11-20 International Business Machines Corporation Efficient scaling in transform domain
US7388999B2 (en) 2003-10-29 2008-06-17 Hewlett-Packard Development Company, L.P. Transformations for denoising images
US8374284B2 (en) 2004-02-12 2013-02-12 Apple, Inc. Universal decoder
US7873815B2 (en) * 2004-03-04 2011-01-18 Qualcomm Incorporated Digital signal processors with configurable dual-MAC and dual-ALU
GB0405283D0 (en) 2004-03-09 2004-04-21 Aspex Technology Ltd Multi-port memory for flexible and space efficient corner turning networks in associative processors
US7873812B1 (en) 2004-04-05 2011-01-18 Tibet MIMAR Method and system for efficient matrix multiplication in a SIMD processor architecture
CN1707426A (zh) * 2004-06-09 2005-12-14 上海华博科技(集团)有限公司 Operand distribution device based on a configurable multiplier matrix structure and distribution method thereof
US20050289208A1 (en) 2004-06-23 2005-12-29 Harrison John R Methods and apparatus for determining quotients
US7350055B2 (en) 2004-10-20 2008-03-25 Arm Limited Tightly coupled accelerator
US8719819B2 (en) 2005-06-30 2014-05-06 Intel Corporation Mechanism for instruction set based thread execution on a plurality of instruction sequencers
WO2006081093A2 (en) 2005-01-27 2006-08-03 Innovasic, Inc. Configurable application specific standard product with configurable i/o
US20060190517A1 (en) 2005-02-02 2006-08-24 Guerrero Miguel A Techniques for transposition of a matrix arranged in a memory as multiple items per word
ATE508549T1 (de) 2005-05-25 2011-05-15 Mitsubishi Electric Corp Coding matrix in a MIMO system
US8760994B2 (en) 2005-10-28 2014-06-24 Qualcomm Incorporated Unitary precoding based on randomized FFT matrices
KR100812225B1 (ko) * 2005-12-07 2008-03-13 한국전자통신연구원 Crossbar switch architecture suitable for a multiprocessor SoC platform
US20070156949A1 (en) 2005-12-30 2007-07-05 Rudelic John C Method and apparatus for single chip system boot
US20070186210A1 (en) 2006-02-06 2007-08-09 Via Technologies, Inc. Instruction set encoding in a dual-mode computer processing environment
CN101449256B (zh) 2006-04-12 2013-12-25 索夫特机械公司 Apparatus and method for processing an instruction matrix specifying parallel and dependent operations
US20070271325A1 (en) * 2006-05-08 2007-11-22 Nvidia Corporation Matrix multiply with reduced bandwidth requirements
US8089959B2 (en) 2006-05-30 2012-01-03 Ted Henryk Szymanski Method and apparatus to schedule packets through a crossbar switch with delay guarantees
US7792895B1 (en) 2006-06-16 2010-09-07 Nvidia Corporation Efficient matrix multiplication on a parallel processing device
US7506134B1 (en) 2006-06-16 2009-03-17 Nvidia Corporation Hardware resource based mapping of cooperative thread arrays (CTA) to result matrix tiles for efficient matrix multiplication in computing system comprising plurality of multiprocessors
US7912889B1 (en) 2006-06-16 2011-03-22 Nvidia Corporation Mapping the threads of a CTA to the elements of a tile for efficient matrix multiplication
US20080071851A1 (en) 2006-09-20 2008-03-20 Ronen Zohar Instruction and logic for performing a dot-product operation
GB0618921D0 (en) 2006-09-26 2006-11-08 Trw Ltd Matrix multiplication
US8122078B2 (en) 2006-10-06 2012-02-21 Calos Fund, LLC Processor with enhanced combined-arithmetic capability
US7797362B2 (en) 2007-02-23 2010-09-14 Texas Instruments Incorporated Parallel architecture for matrix transposition
GB2447494A (en) 2007-03-15 2008-09-17 Linear Algebra Technologies Lt A method and circuit for compressing data using a bitmap to identify the location of data values
US8538015B2 (en) * 2007-03-28 2013-09-17 Intel Corporation Flexible architecture and instruction for advanced encryption standard (AES)
US8392487B1 (en) 2007-03-29 2013-03-05 Compass Electro-Optical Systems Ltd Programmable matrix processor
US7917568B2 (en) * 2007-04-10 2011-03-29 Via Technologies, Inc. X87 fused multiply-add instruction
US7673120B2 (en) 2007-06-27 2010-03-02 Texas Instruments Incorporated Inter-cluster communication network and hierarchical register files for clustered VLIW processors
DE602008003456D1 (de) * 2007-07-02 2010-12-23 Technology From Ideas Ltd Generation of parity-check matrices
US8161271B2 (en) 2007-07-11 2012-04-17 International Business Machines Corporation Store misaligned vector with permute
US8051124B2 (en) 2007-07-19 2011-11-01 Itt Manufacturing Enterprises, Inc. High speed and efficient matrix multiplication hardware module
US8028015B2 (en) 2007-08-10 2011-09-27 Inside Contactless S.A. Method and system for large number multiplication
US8040349B1 (en) 2007-12-04 2011-10-18 Nvidia Corporation System and method for structuring an A-buffer
US9529592B2 (en) 2007-12-27 2016-12-27 Intel Corporation Vector mask memory access instructions to perform individual and sequential memory access operations if an exception occurs during a full width memory access operation
US8923510B2 (en) 2007-12-28 2014-12-30 Intel Corporation Method and apparatus for efficiently implementing the advanced encryption standard
US8631261B2 (en) 2007-12-31 2014-01-14 Intel Corporation Context state management for processor feature sets
US7925853B2 (en) 2008-01-04 2011-04-12 International Business Machines Corporation Method and apparatus for controlling memory array gating when a processor executes a low confidence branch instruction in an information handling system
US8068365B2 (en) 2008-02-04 2011-11-29 Mosaid Technologies Incorporated Non-volatile memory device having configurable page size
US8612723B2 (en) 2008-05-06 2013-12-17 L-3 Communications Integrated Systems, L.P. System and method for storing a sparse matrix
US8533251B2 (en) 2008-05-23 2013-09-10 International Business Machines Corporation Optimized corner turns for local storage and bandwidth reduction
US8060730B2 (en) 2008-05-30 2011-11-15 Freescale Semiconductor, Inc. Selective MISR data accumulation during exception processing
US8250130B2 (en) 2008-05-30 2012-08-21 International Business Machines Corporation Reducing bandwidth requirements for matrix multiplication
US8145880B1 (en) 2008-07-07 2012-03-27 Ovics Matrix processor data switch routing systems and methods
US7870365B1 (en) * 2008-07-07 2011-01-11 Ovics Matrix of processors with data stream instruction execution pipeline coupled to data switch linking to neighbor units by non-contentious command channel / data channel
US8626815B1 (en) 2008-07-14 2014-01-07 Altera Corporation Configuring a programmable integrated circuit device to perform matrix multiplication
US20100180100A1 (en) 2009-01-13 2010-07-15 Mavrix Technology, Inc. Matrix microprocessor and method of operation
US9003340B2 (en) 2009-01-30 2015-04-07 Active-Semi, Inc. Communicating configuration information across a programmable analog tile to another tile
US8577950B2 (en) 2009-08-17 2013-11-05 International Business Machines Corporation Matrix multiplication operations with data pre-conditioning in a high performance computing architecture
US8650240B2 (en) 2009-08-17 2014-02-11 International Business Machines Corporation Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture
US8352528B2 (en) 2009-09-20 2013-01-08 Mimar Tibet Apparatus for efficient DCT calculations in a SIMD programmable processor
US9519947B2 (en) 2009-09-25 2016-12-13 Nvidia Corporation Architecture and instructions for accessing multi-dimensional formatted surface memory
US8539201B2 (en) 2009-11-04 2013-09-17 International Business Machines Corporation Transposing array data on SIMD multi-core processor architectures
US8984043B2 (en) * 2009-12-23 2015-03-17 Intel Corporation Multiplying and adding matrices
KR101639574B1 (ko) 2009-12-30 2016-07-14 삼성전자주식회사 Display system providing adaptive bank addresses and address mapping method thereof
GB2476800A (en) * 2010-01-07 2011-07-13 Linear Algebra Technologies Ltd Sparse matrix vector multiplier using a bit map of non-zero elements to control scheduling of arithmetic operations
US9600281B2 (en) 2010-07-12 2017-03-21 International Business Machines Corporation Matrix multiplication operations using pair-wise load and splat operations
US8824222B2 (en) * 2010-08-13 2014-09-02 Rambus Inc. Fast-wake memory
US8478969B2 (en) 2010-09-24 2013-07-02 Intel Corporation Performing a multiply-multiply-accumulate instruction
US20120113133A1 (en) 2010-11-04 2012-05-10 Shpigelblat Shai System, device, and method for multiplying multi-dimensional data arrays
US8825988B2 (en) 2010-11-12 2014-09-02 Advanced Micro Devices, Inc. Matrix algorithm for scheduling operations
US9727471B2 (en) 2010-11-29 2017-08-08 Intel Corporation Method and apparatus for stream buffer management instructions
US8762655B2 (en) 2010-12-06 2014-06-24 International Business Machines Corporation Optimizing output vector data generation using a formatted matrix data structure
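CN102081513B (zh) 2011-01-24 2014-07-23 山东大学 Instruction optimization method for the column-mixing process in the AES encryption algorithm and instruction set processor thereof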
JP5782265B2 (ja) 2011-02-04 2015-09-24 International Business Machines Corporation Matrix calculation processing method, program, and system
US20120254588A1 (en) 2011-04-01 2012-10-04 Jesus Corbal San Adrian Systems, apparatuses, and methods for blending two source operands into a single destination using a writemask
US20120254592A1 (en) 2011-04-01 2012-10-04 Jesus Corbal San Adrian Systems, apparatuses, and methods for expanding a memory source into a destination register and compressing a source register into a destination memory location
EP2695054B1 (en) 2011-04-01 2018-08-15 Intel Corporation Vector friendly instruction format and execution thereof
US20120254591A1 (en) 2011-04-01 2012-10-04 Hughes Christopher J Systems, apparatuses, and methods for stride pattern gathering of data elements and stride pattern scattering of data elements
GB201105716D0 (en) * 2011-04-04 2011-05-18 Advanced Risc Mach Ltd Method of and apparatus for displaying windows on a display
US9984124B2 (en) 2011-05-11 2018-05-29 International Business Machines Corporation Data management in relational databases
US9503741B2 (en) 2011-06-08 2016-11-22 Vixs Systems, Inc. Video decoder with multi-format vector processor and methods for use therewith
US8838664B2 (en) * 2011-06-29 2014-09-16 Advanced Micro Devices, Inc. Methods and apparatus for compressing partial products during a fused multiply-and-accumulate (FMAC) operation on operands having a packed-single-precision format
US9398307B2 (en) 2011-07-11 2016-07-19 Sharp Kabushiki Kaisha Video decoder for tiles
GB2494903B (en) 2011-09-22 2017-12-27 Advanced Risc Mach Ltd Graphics processing systems
CN106293631B (zh) * 2011-09-26 2020-04-10 英特尔公司 Instruction and logic to provide vector scatter-op and gather-op functionality
GB2508312B (en) 2011-09-26 2020-04-22 Intel Corp Instruction and logic to provide vector load-op/store-op with stride functionality
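CN102360344B (zh) * 2011-10-10 2014-03-12 西安交通大学 Matrix processor, instruction set thereof, and embedded system
CN102411558B (zh) 2011-10-31 2015-05-13 中国人民解放军国防科学技术大学 Vectorized implementation method of large matrix multiplication for vector processors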
US9298621B2 (en) 2011-11-04 2016-03-29 Hewlett Packard Enterprise Development Lp Managing chip multi-processors through virtual domains
US8941884B1 (en) 2011-12-02 2015-01-27 Marvell International Ltd. Method and apparatus for dynamically generating a stochastic threshold table
US9960917B2 (en) 2011-12-22 2018-05-01 Intel Corporation Matrix multiply accumulate instruction
US8929539B2 (en) 2011-12-22 2015-01-06 Intel Corporation Instructions to perform Groestl hashing
WO2013095599A1 (en) 2011-12-23 2013-06-27 Intel Corporation Systems, apparatuses, and methods for performing a double blocked sum of absolute differences
US9792115B2 (en) 2011-12-23 2017-10-17 Intel Corporation Super multiply add (super MADD) instructions with three scalar terms
CN104040482B (zh) 2011-12-28 2018-02-16 英特尔公司 Systems, apparatuses, and methods for performing delta decoding on packed data elements
US20140195783A1 (en) 2011-12-29 2014-07-10 Krishnan Karthikeyan Dot product processors, methods, systems, and instructions
US9454371B2 (en) 2011-12-30 2016-09-27 Intel Corporation Micro-architecture for eliminating MOV operations
JP5840994B2 (ja) 2012-03-27 2016-01-06 富士通株式会社 Matrix operation device
US9606961B2 (en) * 2012-10-30 2017-03-28 Intel Corporation Instruction and logic to provide vector compress and rotate functionality
US20140149480A1 (en) 2012-11-28 2014-05-29 Nvidia Corporation System, method, and computer program product for transposing a matrix
US20140157287A1 (en) 2012-11-30 2014-06-05 Advanced Micro Devices, Inc Optimized Context Switching for Long-Running Processes
US9152827B2 (en) 2012-12-19 2015-10-06 The United States Of America As Represented By The Secretary Of The Air Force Apparatus for performing matrix vector multiplication approximation using crossbar arrays of resistive memory devices
US9442723B2 (en) 2012-12-28 2016-09-13 Intel Corporation Method and apparatus for integral image computation instructions
KR102092172B1 (ko) 2013-02-08 2020-04-14 소니 주식회사 Data processing device and data processing method
US9256433B2 (en) * 2013-03-15 2016-02-09 Intel Corporation Systems and methods for move elimination with bypass multiple instantiation table
CN103235724A (zh) * 2013-05-10 2013-08-07 中国人民解放军信息工程大学 Integrated translation method for multi-source binary code based on atomic operation semantic description
US10628156B2 (en) 2013-07-09 2020-04-21 Texas Instruments Incorporated Vector SIMD VLIW data path architecture
GB2516826B (en) 2013-07-23 2016-06-22 Canon Kk Method, device and computer program for encapsulating partitioned timed media data by creating tracks to be independently encapsulated in at least one media f
US9703708B2 (en) 2013-09-27 2017-07-11 Intel Corporation System and method for thread scheduling on reconfigurable processor cores
US9285997B2 (en) 2013-10-30 2016-03-15 Intel Corporation Independently selective tile group access with data structuring
US9898330B2 (en) 2013-11-11 2018-02-20 Intel Corporation Compacted context state management
CN105940381B (zh) * 2013-12-26 2019-11-15 英特尔公司 Memory controller and method performed by the memory controller
US9286216B2 (en) 2014-01-16 2016-03-15 Carnegie Mellon University 3DIC memory chips including computational logic-in-memory for performing accelerated data processing
US9557995B2 (en) 2014-02-07 2017-01-31 Arm Limited Data processing apparatus and method for performing segmented operations
JP6256088B2 (ja) 2014-02-20 2018-01-10 日本電気株式会社 Vector processor, information processing device, and overtaking control method
US9298540B2 (en) 2014-02-26 2016-03-29 Adobe Systems Incorporated Detection and restoration of erroneous data
FR3021428B1 (fr) 2014-05-23 2017-10-13 Kalray Bit-matrix multiplication using explicit registers
KR101753467B1 (ko) 2014-06-26 2017-07-03 인텔 코포레이션 Instructions and logic to provide general-purpose GF(256) SIMD cryptographic arithmetic functionality
US9785565B2 (en) * 2014-06-30 2017-10-10 Microunity Systems Engineering, Inc. System and methods for expandably wide processor instructions
US9891886B2 (en) * 2014-07-02 2018-02-13 Via Alliance Semiconductor Co., Ltd Split-path heuristic for performing a fused FMA operation
US9910670B2 (en) * 2014-07-09 2018-03-06 Intel Corporation Instruction set for eliminating misaligned memory accesses during processing of an array having misaligned data rows
US10223333B2 (en) 2014-08-29 2019-03-05 Nvidia Corporation Performing multi-convolution operations in a parallel processing system
CN106663027A (zh) 2014-09-03 2017-05-10 联发科技股份有限公司 Mode-switching handling method with less unnecessary register data access and related non-transitory machine-readable medium
EP3021282A1 (en) 2014-11-14 2016-05-18 Thomson Licensing Methods and apparatus for learning palette dictionaries for device-ready example-guided recolorization
US10255547B2 (en) 2014-12-04 2019-04-09 Nvidia Corporation Indirectly accessing sample data to perform multi-convolution operations in a parallel processing system
US20160179523A1 (en) 2014-12-23 2016-06-23 Intel Corporation Apparatus and method for vector broadcast and xorand logical instruction
US9996350B2 (en) 2014-12-27 2018-06-12 Intel Corporation Hardware apparatuses and methods to prefetch a multidimensional block of elements from a multidimensional array
US10296334B2 (en) 2014-12-27 2019-05-21 Intel Corporation Method and apparatus for performing a vector bit gather
US20160239706A1 (en) 2015-02-13 2016-08-18 Qualcomm Incorporated Convolution matrix multiply with callback for deep tiling for deep convolutional neural networks
US9886418B2 (en) 2015-04-28 2018-02-06 Intel Corporation Matrix operands for linear algebra operations
US9934153B2 (en) * 2015-06-30 2018-04-03 Nvidia Corporation Patch memory system
US10535114B2 (en) 2015-08-18 2020-01-14 Nvidia Corporation Controlling multi-pass rendering sequences in a cache tiling architecture
US10423411B2 (en) 2015-09-26 2019-09-24 Intel Corporation Data element comparison processors, methods, systems, and instructions
US11061672B2 (en) 2015-10-02 2021-07-13 Via Alliance Semiconductor Co., Ltd. Chained split execution of fused compound arithmetic operations
CN106485323B (zh) * 2015-10-08 2019-02-26 上海兆芯集成电路有限公司 Neural network unit with output buffer feedback for performing recurrent neural network computations
US10152421B2 (en) 2015-11-23 2018-12-11 Intel Corporation Instruction and logic for cache control operations
US9812180B2 (en) 2016-01-18 2017-11-07 Hare Krishna Verma Programmable logic accelerator in system on chip
US9875104B2 (en) 2016-02-03 2018-01-23 Google Llc Accessing data in multi-dimensional tensors
US10600475B2 (en) 2016-05-18 2020-03-24 Sitaram Yadavalli Method and apparatus for storing and accessing matrices and arrays by columns and rows in a processing unit
US20170337156A1 (en) * 2016-04-26 2017-11-23 Onnivation Llc Computing machine architecture for matrix and array processing
US10073815B2 (en) 2016-05-31 2018-09-11 Palo Alto Research Center Incorporated System and method for speeding up general matrix-matrix multiplication on the GPU
US10191744B2 (en) 2016-07-01 2019-01-29 Intel Corporation Apparatuses, methods, and systems for element sorting of vectors
US10275243B2 (en) 2016-07-02 2019-04-30 Intel Corporation Interruptible and restartable matrix multiplication instructions, processors, methods, and systems
US10067911B2 (en) * 2016-07-26 2018-09-04 Advanced Micro Devices, Inc. High performance inplace transpose operations
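CN106228238B (zh) * 2016-07-27 2019-03-22 中国科学技术大学苏州研究院 Method and system for accelerating deep learning algorithms on a field-programmable gate array platform
CN106445471B (zh) 2016-10-13 2018-06-01 北京百度网讯科技有限公司 Processor and method for performing matrix multiplication operations on a processor
US10146535B2 (en) 2016-10-20 2018-12-04 Intel Corporation Systems, apparatuses, and methods for chained fused multiply add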
US10846087B2 (en) 2016-12-30 2020-11-24 Intel Corporation Systems, apparatuses, and methods for broadcast arithmetic operations
US10949496B2 (en) 2016-12-30 2021-03-16 Intel Corporation Dimension shuffling using matrix processors
DE112016007566T5 (de) 2016-12-31 2019-09-26 Intel Corporation Systems, methods, and apparatuses for heterogeneous computing
US10817587B2 (en) 2017-02-28 2020-10-27 Texas Instruments Incorporated Reconfigurable matrix multiplier system and method
JP6912703B2 (ja) 2017-02-24 2021-08-04 富士通株式会社 演算方法、演算装置、演算プログラム及び演算システム
WO2018174925A1 (en) 2017-03-20 2018-09-27 Intel Corporation Systems, methods, and apparatuses for dot production operations
US10338919B2 (en) 2017-05-08 2019-07-02 Nvidia Corporation Generalized acceleration of matrix multiply accumulate operations
WO2018228703A1 (en) 2017-06-16 2018-12-20 Huawei Technologies Co., Ltd. Multiply accumulator array and processor device
US11580193B2 (en) * 2017-06-22 2023-02-14 Nec Corporation Computation device, computation method, and program
GB2563878B (en) 2017-06-28 2019-11-20 Advanced Risc Mach Ltd Register-based matrix multiplication
US11579881B2 (en) * 2017-06-29 2023-02-14 Intel Corporation Instructions for vector operations with constant values
US11418196B2 (en) * 2017-06-29 2022-08-16 Shenzhen Chipuller Chip Technology Co., Ltd Method and apparatus for dynamic routing using heterogeneous and disjoint networks
GB2564696B (en) * 2017-07-20 2020-02-05 Advanced Risc Mach Ltd Register-based complex number processing
US20190079903A1 (en) 2017-09-14 2019-03-14 Qualcomm Incorporated Providing matrix multiplication using vector registers in processor-based devices
US11138291B2 (en) 2017-09-26 2021-10-05 Oracle International Corporation Asymmetric allocation of SRAM and data layout for efficient matrix multiplication
FR3075410B1 (fr) 2017-12-14 2021-03-19 Safran Electronics & Defense Method for processing a signal comprising detection of disturbances caused by a lightning strike
US11023382B2 (en) 2017-12-22 2021-06-01 Intel Corporation Systems, methods, and apparatuses utilizing CPU storage with a memory reference
US11093247B2 (en) 2017-12-29 2021-08-17 Intel Corporation Systems and methods to load a tile register pair
US11669326B2 (en) 2017-12-29 2023-06-06 Intel Corporation Systems, methods, and apparatuses for dot product operations
US20190205137A1 (en) 2017-12-29 2019-07-04 Lawrence Meadows Methods and apparatus for multi-load and multi-store vector instructions
US11816483B2 (en) 2017-12-29 2023-11-14 Intel Corporation Systems, methods, and apparatuses for matrix operations
US11023235B2 (en) 2017-12-29 2021-06-01 Intel Corporation Systems and methods to zero a tile register pair
US11809869B2 (en) 2017-12-29 2023-11-07 Intel Corporation Systems and methods to store a tile register pair to memory
US11789729B2 (en) * 2017-12-29 2023-10-17 Intel Corporation Systems and methods for computing dot products of nibbles in two tile operands
US10572568B2 (en) * 2018-03-28 2020-02-25 Intel Corporation Accelerator for sparse-dense matrix multiplication
US10664287B2 (en) 2018-03-30 2020-05-26 Intel Corporation Systems and methods for implementing chained tile operations
US10649772B2 (en) 2018-03-30 2020-05-12 Intel Corporation Method and apparatus for efficient matrix transpose
US10620951B2 (en) 2018-06-22 2020-04-14 Intel Corporation Matrix multiplication acceleration of sparse matrices using column folding and squeezing
US20200050452A1 (en) 2018-08-11 2020-02-13 Intel Corporation Systems, apparatuses, and methods for generating an index by sort order and reordering elements based on sort order
US11579883B2 (en) 2018-09-14 2023-02-14 Intel Corporation Systems and methods for performing horizontal tile operations
US10970076B2 (en) 2018-09-14 2021-04-06 Intel Corporation Systems and methods for performing instructions specifying ternary tile logic operations
US20200097291A1 (en) 2018-09-24 2020-03-26 Intel Corporation Apparatus and method for tile gather and tile scatter
US10719323B2 (en) 2018-09-27 2020-07-21 Intel Corporation Systems and methods for performing matrix compress and decompress instructions
US10866786B2 (en) 2018-09-27 2020-12-15 Intel Corporation Systems and methods for performing instructions to transpose rectangular tiles
US10990396B2 (en) 2018-09-27 2021-04-27 Intel Corporation Systems for performing instructions to quickly convert and use tiles as 1D vectors
US10963256B2 (en) 2018-09-28 2021-03-30 Intel Corporation Systems and methods for performing instructions to transform matrices into row-interleaved format
US10896043B2 (en) 2018-09-28 2021-01-19 Intel Corporation Systems for performing instructions for fast element unpacking into 2-dimensional registers
US10963246B2 (en) 2018-11-09 2021-03-30 Intel Corporation Systems and methods for performing 16-bit floating-point matrix dot product instructions
US11284112B2 (en) * 2018-12-06 2022-03-22 Tencent America LLC Method and apparatus for a primary transform using an 8-bit transform core
US10929503B2 (en) 2018-12-21 2021-02-23 Intel Corporation Apparatus and method for a masked multiply instruction to support neural network pruning operations
US11294671B2 (en) 2018-12-26 2022-04-05 Intel Corporation Systems and methods for performing duplicate detection instructions on 2D data
US11886875B2 (en) 2018-12-26 2024-01-30 Intel Corporation Systems and methods for performing nibble-sized operations on matrix elements
US20200210517A1 (en) 2018-12-27 2020-07-02 Intel Corporation Systems and methods to accelerate multiplication of sparse matrices
US20200210188A1 (en) 2018-12-27 2020-07-02 Intel Corporation Systems and methods for performing matrix row- and column-wise permute instructions
US10922077B2 (en) * 2018-12-29 2021-02-16 Intel Corporation Apparatuses, methods, and systems for stencil configuration and computation instructions
US10942985B2 (en) * 2018-12-29 2021-03-09 Intel Corporation Apparatuses, methods, and systems for fast fourier transform configuration and computation instructions
US11269630B2 (en) 2019-03-29 2022-03-08 Intel Corporation Interleaved pipeline of floating-point adders
US11016731B2 (en) 2019-03-29 2021-05-25 Intel Corporation Using Fuzzy-Jbit location of floating-point multiply-accumulate results
US11175891B2 (en) 2019-03-30 2021-11-16 Intel Corporation Systems and methods to perform floating-point addition with selected rounding
US10990397B2 (en) 2019-03-30 2021-04-27 Intel Corporation Apparatuses, methods, and systems for transpose instructions of a matrix operations accelerator
US11334647B2 (en) 2019-06-29 2022-05-17 Intel Corporation Apparatuses, methods, and systems for enhanced matrix multiplier architecture
US20200026745A1 (en) 2019-09-27 2020-01-23 Intel Corporation Apparatuses, methods, and systems for instructions of a matrix operations accelerator
US11714875B2 (en) 2019-12-28 2023-08-01 Intel Corporation Apparatuses, methods, and systems for instructions of a matrix operations accelerator
US12112167B2 (en) 2020-06-27 2024-10-08 Intel Corporation Matrix data scatter and gather between rows and irregularly spaced memory locations
US11972230B2 (en) 2020-06-27 2024-04-30 Intel Corporation Matrix transpose and multiply
US20210406018A1 (en) 2020-06-27 2021-12-30 Intel Corporation Apparatuses, methods, and systems for instructions for moving data between tiles of a matrix operations accelerator and vector registers
US20210406012A1 (en) 2020-06-27 2021-12-30 Intel Corporation Loading and storing matrix data with datatype conversion
US11941395B2 (en) 2020-09-26 2024-03-26 Intel Corporation Apparatuses, methods, and systems for instructions for 16-bit floating-point matrix dot product instructions
US20220100513A1 (en) 2020-09-26 2022-03-31 Intel Corporation Apparatuses, methods, and systems for instructions for loading data and padding into a tile of a matrix operations accelerator
US20220197652A1 (en) 2020-12-22 2022-06-23 Intel Corporation Processors, methods, systems, and instructions to merge portions of two source two-dimensional arrays without explicit per-portion control
US12474928B2 (en) 2020-12-22 2025-11-18 Intel Corporation Processors, methods, systems, and instructions to select and store data elements from strided data element positions in a first dimension from three source two-dimensional arrays in a result two-dimensional array
US20220197974A1 (en) 2020-12-22 2022-06-23 Intel Corporation Processors, methods, systems, and instructions to select and store data elements from two source two-dimensional arrays indicated by permute control elements in a result two-dimensional array
US12001385B2 (en) 2020-12-24 2024-06-04 Intel Corporation Apparatuses, methods, and systems for instructions for loading a tile of a matrix operations accelerator
US12020028B2 (en) 2020-12-26 2024-06-25 Intel Corporation Apparatuses, methods, and systems for 8-bit floating-point matrix dot product instructions
US12353878B2 (en) 2021-06-26 2025-07-08 Intel Corporation Apparatuses, methods, and systems for instructions for matrix multiplication instructions
US20240045691A1 (en) 2022-08-03 2024-02-08 Intel Corporation Apparatuses, methods, and systems for 8-bit floating-point matrix dot product instructions
US20240103858A1 (en) * 2022-09-22 2024-03-28 Apple Inc. Instruction Support for Matrix Multiplication
US20240220323A1 (en) 2022-12-30 2024-07-04 Intel Corporation Apparatuses, methods, and systems for instructions for loading a tile of a matrix operations accelerator

Also Published As

Publication number Publication date
US20200233667A1 (en) 2020-07-23
EP4053695B1 (en) 2025-09-24
CN116150564A (zh) 2023-05-23
US11263008B2 (en) 2022-03-01
CN110337635B (zh) 2023-09-19
US20220300286A1 (en) 2022-09-22
CN110337635A (zh) 2019-10-15
US20200241873A1 (en) 2020-07-30
US20220291927A1 (en) 2022-09-15
CN114816530A (zh) 2022-07-29
EP3602279A1 (en) 2020-02-05
WO2018174932A1 (en) 2018-09-27
WO2018174926A1 (en) 2018-09-27
US20250117221A1 (en) 2025-04-10
US11714642B2 (en) 2023-08-01
US20220043652A1 (en) 2022-02-10
US20220171623A1 (en) 2022-06-02
US12314717B2 (en) 2025-05-27
EP4354303A2 (en) 2024-04-17
US10877756B2 (en) 2020-12-29
US20200249947A1 (en) 2020-08-06
US12282773B2 (en) 2025-04-22
US20240192954A1 (en) 2024-06-13
EP3602278A1 (en) 2020-02-05
US12536020B2 (en) 2026-01-27
EP4053695A1 (en) 2022-09-07
EP4336369A2 (en) 2024-03-13
EP4303724A1 (en) 2024-01-10
CN110494846A (zh) 2019-11-22
US11086623B2 (en) 2021-08-10
EP4216057A1 (en) 2023-07-26
CN117130661A (zh) 2023-11-28
US11163565B2 (en) 2021-11-02
WO2018174927A1 (en) 2018-09-27
EP4354303A3 (en) 2024-06-26
CN114461276A (zh) 2022-05-10
EP3602278B1 (en) 2022-09-28
EP3602279B1 (en) 2022-09-28
CN114461276B (zh) 2025-12-23
EP4137941A1 (en) 2023-02-22
US20220291926A1 (en) 2022-09-15
US20250117222A1 (en) 2025-04-10
US12182571B2 (en) 2024-12-31
US20210349720A1 (en) 2021-11-11
US20190339972A1 (en) 2019-11-07
US20200241877A1 (en) 2020-07-30
EP3602277A4 (en) 2021-01-13
US20240134644A1 (en) 2024-04-25
EP3602277A1 (en) 2020-02-05
EP4336369A3 (en) 2024-06-19
EP3602277B1 (en) 2022-08-03
US12147804B2 (en) 2024-11-19
US20220236989A1 (en) 2022-07-28
US11288068B2 (en) 2022-03-29
CN114816530B (zh) 2025-12-23
EP3602279A4 (en) 2021-03-31
US20200233665A1 (en) 2020-07-23
WO2018174928A1 (en) 2018-09-27
US11360770B2 (en) 2022-06-14
WO2018174935A1 (en) 2018-09-27
WO2018174930A1 (en) 2018-09-27
US12106100B2 (en) 2024-10-01
EP3602278A4 (en) 2021-03-24
US20240256276A1 (en) 2024-08-01
US11200055B2 (en) 2021-12-14
US20240111533A1 (en) 2024-04-04
CN117407644A (zh) 2024-01-16
US20200249949A1 (en) 2020-08-06
US12124847B2 (en) 2024-10-22
US11977886B2 (en) 2024-05-07
US20240320001A1 (en) 2024-09-26
US20190347100A1 (en) 2019-11-14
CN118034781A (zh) 2024-05-14
WO2018174934A1 (en) 2018-09-27
US20200065352A1 (en) 2020-02-27
CN110312992A (zh) 2019-10-08
WO2018174931A1 (en) 2018-09-27
EP4012555A1 (en) 2022-06-15
CN119861972A (zh) 2025-04-22
WO2018174925A1 (en) 2018-09-27
US11567765B2 (en) 2023-01-31
US20210132943A1 (en) 2021-05-06
US11847452B2 (en) 2023-12-19
EP4137940A1 (en) 2023-02-22
US20230236833A1 (en) 2023-07-27
US20220058021A1 (en) 2022-02-24
WO2018174933A1 (en) 2018-09-27
US11288069B2 (en) 2022-03-29
US12260213B2 (en) 2025-03-25
EP4553650A1 (en) 2025-05-14
WO2018174929A1 (en) 2018-09-27
US12039332B2 (en) 2024-07-16
WO2018174936A1 (en) 2018-09-27
US20250004716A1 (en) 2025-01-02
US20200233666A1 (en) 2020-07-23
US20190347310A1 (en) 2019-11-14
US11080048B2 (en) 2021-08-03

Similar Documents

Publication Publication Date Title
PL4053695T3 (pl) Systems, methods, and apparatuses for dot product operations
GB2546016B (en) Apparatuses, systems and methods for three-dimensional printing
GB2547155B (en) Formation characteristics determination apparatus, methods, and systems
GB2529509B (en) Adaptive beam forming devices, methods, and systems
SG11201912231VA (en) Systems and methods for blockchain-dependent operation sets
GB2540062B (en) Systems, apparatuses and methods for communication flow modification
EP3160220A4 (en) Agronomic system, methods and apparatuses
GB2584971B8 (en) Pressurizing masks, systems and methods
GB2568822B (en) Systems, apparatuses, and method for mapping a space
SG10201605019VA (en) Gas Supply System, Gas Supply Control Method And Gas Replacement Method
GB2587284B (en) Well ranging apparatus, methods, and systems
EP3230132A4 (en) Smartkey apparatuses, methods and systems
PL3713900T3 (pl) NPK-Si-humate fertilizer, method for its production, and use
GB201517729D0 (en) Data systems, devices and methods
EP3220701A4 (en) Multi-transceiver configuration method, multi-transceiver channel reuse method and apparatuses
EP3187600A4 (en) Stainless steel spring, and stainless-steel-spring production method
SG11202001704SA (en) Production device, system, and method
GB201610734D0 (en) Construction template, system and method
GB2540193B (en) Controller, system and method
EP3179653A4 (en) Precoding method, apparatus and system
SG11201803624QA (en) Media straightener, feeder and method
GB2548745B (en) Downhole electrode apparatus, systems, and methods
EP3266520A4 (en) Conjugated-diolefin-producing catalyst, and production method therefor
GB201506266D0 (en) Apparatus, systems and methods for oil and gas operations
GB201801864D0 (en) Ventilation unit, system and method