US5113453A - Character recognition method and apparatus - Google Patents

Character recognition method and apparatus

Info

Publication number
US5113453A
Authority
US
United States
Prior art keywords
character
primitive shapes
matrix
lists
shapes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
US07/523,257
Other languages
English (en)
Inventor
Jean-Claude Simon
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
France Telecom R&D SA
Original Assignee
Centre National dEtudes des Telecommunications CNET
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1986-12-12
Filing date
1990-05-14
Publication date
1992-05-12
Application filed by Centre National dEtudes des Telecommunications CNET

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Definitions

  • the invention relates to character recognition.
  • This category includes, in particular, devices for automatically reading postal addresses for automated sorting, devices for providing assistance to the blind, and digital inputting devices for banks.
  • Optical readers capable of reading a wide range of fonts. These machines are more advanced than the above machines in that they are capable of learning to read a new character font. This improvement in reading performance is offset by a nonnegligible learning time. Further, the quality required of documents being read is similar to that in the preceding case.
  • the aim of the present invention is to remedy the drawbacks of existing equipment.
  • One of the aims of the invention is to provide character recognition means which are "multi-font" and capable even of reading handwriting providing the hand-written characters are isolated (block letters) or are at least separable.
  • another aim of the invention is to make it possible to establish invariant representations of characters, i.e. representations which are independent of the particular character font being used.
  • the invention also seeks to provide character recognition means which are invariant with respect to the width of character strokes, character size, serifs, and to some extent of orientation in order to be able to recognize italic or slanting characters as well.
  • the invention seeks to provide character recognition means capable of being easily integrated into any system that requires such character recognition.
  • the invention also aims to provide means capable of being applied equally well to character images which are "black and white", i.e. in which optical intensity is expressed as an on or an off, and to images having a gray scale. This makes it possible to escape from the requirement for high quality in the original document being processed.
  • the present invention provides firstly a method of processing digital signals representative of image lines, and in particular of characters, the method being of the type in which:
  • each elementary shape or character is isolated, and is framed in a matrix having a predetermined configuration, in particular a rectangular configuration;
  • operation b) comprises:
  • operation b1) comprises:
  • each quantum being defined as a homogeneous block having a minimum number of pixels whose intensity is greater than a local threshold
  • step b11) comprises:
  • the so-called first level primitive shapes comprise:
  • closed loops can be detected by conventional means.
  • the other four types of primitive shape are detected by stroke subtraction means in accordance with the invention.
  • primitive shapes thus located in characteristic zones are encoded not only as to their nature, but also depending on the direction(s) of the stroke(s) ending at the characteristic zones.
  • the method further includes:
  • Scanning may take place using straight lines or curved lines passing through a fixed point (e.g. the center of gravity) or a point at infinity (translation).
  • a special characteristic of the invention consists in providing for the lists of primitive shapes processed in this way to be ordered partially, i.e. shapes which are encountered simultaneously during scanning are collected together in any order, in particular by means of parentheses.
  • each matrix is subjected to at least two predetermined scans, which are associated with respective partially ordered lists, said lists being compared with partially ordered lists contained in memory, with each character or shape to be recognized being associated with at least one such list.
  • the identification class or classes Ai is/are determined by interrogating the list(s) of the unknown character. For example: presence or absence of characteristics such as loops, crosses, etc.
  • the comparison may be accelerated by the fact that memory searching is performed using the presence of crosses or branching points in the list and the length of the list as basic search criteria.
  • the lists obtained are compared with the lists in the memory by calculating the distance between lists of characters to be compared within a class Ai.
  • the character to be identified is eroded in order to reduce its thickness (with the word "character" being used herein as a simple designation for any set of strokes in the matrix and contrasting with the background, but it is clear that a specific character has not yet been identified); and
  • steps b) to d) are performed again.
  • the invention also provides apparatus for implementing the above method, said apparatus comprising, in combination:
  • means for segmenting and framing characters said means being suitable for receiving image data from an imaging system
  • This apparatus is suitable for incorporating all of the characteristics mentioned above in method terms.
  • the apparatus may additionally include fining means suitable for reducing character thickness (under the same conditions as above) prior to reapplying the means for recognizing primitive shapes.
  • FIG. 1 is an overall block diagram of character recognition apparatus in accordance with the invention.
  • FIG. 2 is a diagram showing how pixels are identified relative to a background
  • FIGS. 3A and 3B are two diagrams showing the use of horizontal scanning and vertical scanning respectively for recognizing the character B;
  • FIG. 4 is an explanatory diagram showing the invariant structure of the character B
  • FIG. 5 is a flow chart showing the relationships between the various processing means in accordance with the invention.
  • FIGS. 6A, 6B, 7A, 7B, 8A, and 8B show various different situations relating to scan runs in the preceding scan line and in the current scan line.
  • reference 2 designates a camera connected to a memory 3 suitable for containing the recorded image of a document 1 having characters written thereon, which characters may be printed or handwritten.
  • the image memory is connected to segmentation and framing means M1.
  • This module M1 is connected to a module M2 for recognizing primitive shapes, which in turn transmits information to a module M3 for describing primitive shapes.
  • Character recognition module M4 is then invoked to compare the description of primitive shapes obtained by module M3 with standard or reference descriptions contained in a module M5. If the comparison is successful, i.e. if it satisfies certain proximity criteria, then a result is obtained, for example printed character B.
  • An imaging system such as that shown in FIG. 1 gives rise to a digital representation of an image in the form of digital samples which are nowadays called "pixels".
  • samples may be of the on/off type, in which case each pixel has only one bit.
  • they may take account of several different degrees of optical density in the document, i.e. several gray levels, in which case each pixel is associated with a plurality of bits.
  • the module M1 also has the function of segmenting characters, i.e. of isolating them one-by-one in respective rectangular matrices.
  • a printed character can be properly represented in a rectangular matrix of 20 pixels by 30 pixels, which corresponds to a total definition of 300 to 600 pixels per character. It should be observed that the character should not touch the edges of the matrix in question which are considered as having the background level of the image as obtained by the imaging system.
  • the module M1 may be made in any manner known to the person skilled in the art.
  • the background level may be defined, for example, by taking the average level of all the pixels in a character-framing matrix, since it may be observed that a character occupies only a small fraction of the area of the matrix in which it is contained.
  • an object (which may become a character) is represented by pixels having values that differ from the background value.
  • the following rule may be adopted, for example: where it intersects the object, no straight line (a vertical line or a horizontal line if the matrix is a square matrix) should have less than three (exceptionally two) pixels with a value that is different from the background value. Naturally, any line may have more such pixels.
  • FIG. 2 shows an example of a horizontal straight line scan giving two adjacent pixels having a maximum level m1 above the background level as shown at p1, followed by five adjacent pixels going up to a maximum level m2 above the background level and forming a set of pixels p2.
  • the pixel sets or runs p1 and p2 obtained when scanning a matrix along a line constitute the elements of an object which are useful for character recognition purposes.
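  • As a rough illustration of the two ideas above (a background level taken as the average of the framing matrix, and runs of significant pixels along a scan line), a minimal Python sketch follows. The function names `background_level` and `runs_in_line`, and the convention that ink has higher values than the background, are assumptions made for this example and are not taken from the patent.

```python
def background_level(matrix):
    """Estimate the background as the mean of all pixels in the framing matrix."""
    total = sum(sum(row) for row in matrix)
    count = sum(len(row) for row in matrix)
    return total / count


def runs_in_line(line, background):
    """Return inclusive (start, end) column indices of runs of adjacent
    pixels whose value exceeds the background level (assumed convention)."""
    runs = []
    start = None
    for j, value in enumerate(line):
        if value > background:
            if start is None:
                start = j            # a new run begins here
        elif start is not None:
            runs.append((start, j - 1))
            start = None
    if start is not None:            # run reaching the end of the line
        runs.append((start, len(line) - 1))
    return runs


# Made-up data loosely mirroring FIG. 2: a two-pixel run, then a five-pixel run.
matrix = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 5, 6, 0, 0, 3, 7, 9, 7, 4, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
runs = runs_in_line(matrix[1], background_level(matrix))
```

    Because a character covers only a small part of its matrix, the mean stays close to the true background, which is why a plain average can serve as the reference level in this sketch.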
  • module M2 which serves to recognize primitive shapes.
  • Proposals have already been made to use primitive shapes in the field of character recognition. This consists in defining character recognition independently of the variations that may be encountered concerning specific representations of characters, which variations relate in particular to the font, the degree of inking, to possible geometrical transformations, and to deformations.
  • the invention uses first level primitive shapes which are selected because of their invariance under such transformations. These first level primitive shapes are the following:
  • loops may be detected by the usual means.
  • primitive shapes a) to d) are detected in a novel manner.
  • the strokes recognized in this way are subtracted from the matrix of the object, thereby leaving the characteristic zones containing said primitive shapes.
  • the first step of the method in accordance with the invention consists in reading the image file recorded in the memory 3, with this read operation being referenced 50 in FIG. 5.
  • This procedure linesgiven() has the function of initializing global parameters from data read in the image file. It also provides the interface between the image data acquisition system and the character recognition system in accordance with the invention.
  • the source file may contain any number of sampled images at various different sizes.
  • the format of a sampled image is as follows:
  • nl--image: a four-byte integer defining the number of image lines;
  • nc--image: a four-byte integer defining the number of image columns.
  • the total number of pixels in the image is given by the product nl--image*nc--image. These pixels are stored line-by-line.
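  • The layout just described (two four-byte integers followed by the pixels, line by line) could be parsed along the following lines. This is only a sketch: the byte order (little-endian), the signedness of the integers, the one-byte-per-pixel storage, and the function name `read_sampled_image` are all assumptions, since the text does not specify them.

```python
import struct


def read_sampled_image(buf, offset=0, byteorder="<"):
    """Parse one sampled image from a bytes object.

    Assumed layout: number of lines, number of columns (four-byte integers),
    then lines * columns pixels of one byte each, stored line by line.
    Returns (matrix, next_offset) so that several images can be read in turn.
    """
    nl, nc = struct.unpack_from(byteorder + "ii", buf, offset)
    offset += 8
    pixels = buf[offset:offset + nl * nc]
    if len(pixels) != nl * nc:
        raise ValueError("truncated image data")
    matrix = [list(pixels[i * nc:(i + 1) * nc]) for i in range(nl)]
    return matrix, offset + nl * nc
```

    Returning the next offset reflects the statement that the source file may contain any number of sampled images of different sizes: the caller can loop until the buffer is exhausted.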
  • the procedure linesgiven() uses pointers image and trace as global variables.
  • the first pointer, image, points to MATOCTET structures for storing the image to be processed, while the second pointer trace points to a trace of the work performed by executing the method in accordance with the invention.
  • the image matrix is initialized for the present values of the loop variables i and j.
  • the next two lines compare each pixel of the image matrix with the background for the purpose of subsequently calculating the contrast at line AA13.
  • the effect of the procedure linesgiven() is to detect significant elements in the recorded image relative to the background of the image as expressed in a local manner.
  • the method in accordance with the invention then continues by implementing two scanning procedures, which may be vertical scanning and horizontal scanning in the case of a matrix which is rectangular.
  • the vertical scan uses a procedure called verticalsegment to scan the image line by line.
  • the horizontal scan uses a procedure called horizontalsegment to perform a horizontal column-by-column scan of the image.
  • the procedure horizontalsegment is not described in detail since it is symmetrical to the procedure verticalsegment with the roles of lines and columns being interchanged.
  • a run is constituted by a set of adjacent pixels which are taken to be significant, i.e. pixels which are distinguished from the local value of the image background.
  • the runs found contribute to forming subobjects referred to as "boxes" and which comprise sequences of adjoining runs without branching. It is seen below that when such a subobject or box is formed, the procedure verticalsegment calls a procedure strokespresent which has the function of detecting the presence of strokes within the previously-identified box or subobject.
  • connections between the boxes make it possible to detect loops, if any.
  • the global variable of this procedure is the number of image lines: nl--image.
  • firstpave which is a pointer to the head of the list of subobjects contained in the image
  • which is a vector containing run beginnings and run ends encountered on the current line.
  • g1, d1, g2, and d2 are the column numbers of the beginnings and the ends of successive runs encountered and COLMAX is the highest column number.
  • the procedure verticalsegment comprises a first step consisting in initializing the variables (line BB1).
  • Its second step consists in scanning the image matrix so as to mark the beginnings and the ends of the runs (the current runs) of the preceding line, and then to detect the runs on the new line, which is referred to as the "current" line (see BB5), and this is done by making use of another procedure horizon(i,buff).
  • d, e, and f can then be put at the beginnings of the "current" runs in the current line, and in the present case at the beginnings and at the ends of runs since runs both of data points and of background points are taken into consideration.
  • the WHILE loop beginning at BB7 ensures that all of the runs of the preceding line or of the current line are scanned.
  • the current run is the run beginning at index d and ending at index e. If e is less than the run beginning a situated on the preceding line (FIG. 7A), the current box which contains the run (a,b) of the preceding line is put to one side and a new box is begun which is created starting from the run (d,e) of the current line.
  • The procedure for creating a current box, create-a-box(), is described below.
  • the situation is such that the run (a,b) of the preceding line and the run (d,e) of the current line overlap and may together constitute a box, providing certain conditions are met.
  • the run (d,e) is then stacked onto the runs that have already been found in the current box, and the procedure moves on to the next run in the preceding line and the next run in the current line.
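  • The three situations discussed above for a run (a,b) of the preceding line and a run (d,e) of the current line can be summarized by a small classification function. This is a sketch only: the function name `compare_runs` and the returned strings are hypothetical, the case where (a,b) lies wholly before (d,e) is assumed by symmetry, and the additional conditions that an overlap must meet before the runs are merged into one box are not reproduced because the text does not state them.

```python
def compare_runs(preceding, current):
    """Classify run (a, b) of the preceding line against run (d, e) of the
    current line. Both runs are inclusive column intervals."""
    a, b = preceding
    d, e = current
    if e < a:
        # Current run ends before (a, b) starts: begin a new box from (d, e).
        return "new_box"
    if b < d:
        # Assumed symmetric case: (a, b) has no continuation on this line.
        return "close_box"
    # The intervals share at least one column: (d, e) is a candidate for
    # being stacked onto the box that contains (a, b).
    return "overlap"
```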
  • the above-mentioned procedure horizon() has the function of detecting runs situated on a single line.
  • This vector contains the beginnings and the ends of the runs detected on line i, and its structure is similar to that of the local variable buff
  • seuilmaxloc which defines an acceptation threshold for a local maximum representing the middle of a run.
  • the procedure horizon() is written out formally under subheading CC in the appendix.
  • the first step in this procedure horizon() naturally consists in setting up the column index j to 1, the run index to the value 0, and attributing a single state buffer
  • COLMAX for buffer, since no run has been encountered up to the maximum column number.
  • the WHILE loop at CC4 serves to increment the index j up to the end of the line and on each occasion it determines whether image
  • a run is searched for starting from line i and column j, and a new component is given to the vector buffer to mark the beginning of the run and another new component is given to the vector buffer to mark the end of the run.
  • the index j is merely incremented by unity.
  • the procedure searchregion is given in the appendix under the heading DD.
  • a threshold is set to one-half of the local maximum of image
  • the index of the beginning of the run is decremented so long as the value of j for the beginning of the run remains greater than the beginning of the line and the corresponding intensity value in IMAGE remains greater than the threshold.
  • the result is naturally the value jouv which corresponds to the real beginning of the run.
  • steps DD5 to DD7 perform the same operation in the other direction, going from the middle of the run in order to reach either the end of the line or else an image intensity value at index j which is less than the threshold, thus indicating the end of the run.
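  • The search described for the procedure searchregion (a threshold at one-half of the local maximum, then a walk to the left and to the right until the intensity no longer exceeds the threshold or the line ends) can be sketched as follows. The function name `search_region`, the zero-based column indexing, and the list-of-intensities input are assumptions made for this illustration; the appendix code itself is not reproduced in this text.

```python
def search_region(line, j_max):
    """Expand outwards from a local maximum at column j_max and return the
    inclusive (begin, end) columns of the run that surrounds it."""
    threshold = line[j_max] / 2      # one-half of the local maximum
    begin = j_max
    # Walk left while still inside the line and above the threshold.
    while begin > 0 and line[begin - 1] > threshold:
        begin -= 1
    end = j_max
    # Same operation in the other direction.
    while end < len(line) - 1 and line[end + 1] > threshold:
        end += 1
    return begin, end


# Made-up intensities with a local maximum of 9 at column 4.
begin, end = search_region([0, 1, 3, 7, 9, 7, 4, 1, 0], 4)
```

    Tying the threshold to the local maximum, rather than to a global value, is what lets the same procedure handle both faint and heavily inked strokes in a gray-scale image.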
  • Heading EE in the appendix gives the procedure create-a-box.
  • the first step consists in dynamically allocating a box structure. Given its simplicity, there is no need to define this structure here in detail.
  • the first run of the allocated structure is initialized to
  • a second step consists in inserting the newly created box to the left of the box pointed to by the pointer ppave, in the doubly-linked list of boxes that have already been found.
  • the boxes are thus classified by the order in which they appear on the preceding line, which makes it possible to associate the various runs encountered on the current line to said boxes.
  • the third step at EE6 consists in moving on to the next run in the current line.
  • Its first step consists in looking for one or more strokes in the found subobject ppave by detecting one or more sequences of quanta. This step takes place at FF1 using the procedure strokespresent which is described in greater detail below.
  • the second step consists initially in attributing the box following the box ppave in the list of boxes to the current box, in removing the box ppave from the list of boxes, and in releasing the structure which has been removed.
  • garbage is removed from the information contained in the list of boxes whenever garbage is found by the processing for discovering strokes.
  • a third step consists in moving on to the next run of the preceding line.
  • a quantum is a succession of at least three adjacent runs satisfying the following conditions:
  • n is the number of runs of the quantum
  • S is the total number of pixels in the quantum
  • contact takes the values 0, 1 or 2 depending on the number of links between the quantum and its environment.
  • ppave which is a pointer to the box within which a search is being made for a possible stroke
  • pp is the first run of the quantum
  • dp is the last run of the quantum.
  • contact1 which indicates the existence of contacts 0 or 1 at the bottom of the quantum (i.e. on the upstream end of the quantum);
  • contact is the number of contacts (0, 1 or 2) relating to the bottom and the top of the quantum (where the top is the downstream end of the quantum);
  • np is the number of runs in the quantum
  • somp which corresponds to S above, expressed as the total number of pixels contained in the quantum.
  • the first step of the procedure strokespresent consists, at GG1 to GG3, in zeroing the values of ppdq and dpdq respectively representative of the first run and the last run in the quantum, whereas the binary variable traitencours is set to false, indicating that no strokes have yet been found.
  • a second step then begins at GG4 with a WHILE operator whose execution condition is that the end of the box pointed to by ppave has not yet been reached.
  • GG16 marks the other term of the initial alternative, i.e. there is no current stroke.
  • line GG23 consists in marking the pixels of the runs going from ppdq to dpdq as elements of a characteristic zone in which primitive shapes may be found, and it is recalled that this constitutes one of the essential aspects of the invention.
  • Its first step (HH1 to HH3) consists in initializing the number of runs constituting the quantum to 1, in taking the current run as the first run, and naturally in giving the total number of pixels in the quantum, i.e. the variable somp, the number of pixels in this first run.
  • variable contact1 is set to 1 or 0, in lines HH7 and HH8 respectively. Otherwise, line HH10 sets the variable contact to 0.
  • a WHILE operator has an execution condition which is that either the number of runs in the quantum is not yet equal to 3, or else that condition 2 mentioned above in the definition of a quantum is satisfied, which is now written formally.
  • Lines HH12 to HH14 cause the procedure to return the message "no quantum upto end of box" if it has reached the end of the box.
  • the next run becomes the current run
  • the number of runs in the quantum, np, is incremented
  • the variable somp is incremented by the appropriate amount to represent the total number of pixels in the quantum by adding in the pixels of the current run.
  • line HH22 tests whether this is the first quantum of a stroke to be detected, and whether there is a top contact, line HH25 gives the variable contact the value contact1+1, taking account of this top contact. Otherwise, line HH26 gives the variable contact the value of contact1.
  • a stroke may be defined as a sequence of quanta which overlap partially. And, by definition, it is recalled that a quantum is a succession of at least three adjacent runs which satisfy the above-mentioned conditions.
  • Reference is now made to FIGS. 3A and 3B.
  • the pixels of the matrix trace belonging to vertical strokes are encoded by 0, and the pixels belonging to a character (i.e. which are different from the background) are encoded by a 2.
  • a second stage consists in adding 0 to pixels belonging to horizontal strokes and 1 to other pixels.
  • characteristic zones which are constituted by sets of adjacent pixels that do not belong to any stroke.
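  • A characteristic zone, understood as a set of adjacent character pixels that belong to no stroke, can be extracted with a standard connected-component pass. The sketch below works on boolean masks instead of the numeric codes of the matrix trace, which is a simplification; the function name `characteristic_zones`, the three-mask input, and the use of 4-connectivity for "adjacent" are assumptions and not details given in the text.

```python
def characteristic_zones(character, vertical, horizontal):
    """Group character pixels that belong to no stroke into zones.

    character, vertical, horizontal: same-sized matrices of booleans flagging
    character pixels, vertical-stroke pixels and horizontal-stroke pixels.
    Returns a list of zones, each a sorted list of (row, col) pixels.
    """
    rows, cols = len(character), len(character[0])
    free = [[bool(character[i][j]) and not vertical[i][j] and not horizontal[i][j]
             for j in range(cols)] for i in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    zones = []
    for i in range(rows):
        for j in range(cols):
            if not free[i][j] or seen[i][j]:
                continue
            # Flood fill from (i, j) to collect one zone.
            stack, zone = [(i, j)], []
            seen[i][j] = True
            while stack:
                r, c = stack.pop()
                zone.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and free[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            zones.append(sorted(zone))
    return zones
```

    In this simplified view, subtracting the strokes from the object leaves only the small islands where strokes end, meet or cross, which is where the primitive shapes are then looked for.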
  • the declaration of a characteristic zone is expressed in the C language under heading KK of the appendix, together with comments.
  • the characteristic zones can now be detected by a procedure detzone() which scans through the matrix trace.
  • the characteristic zones are arranged in a doubly-linked list, as described above.
  • triple branches are marked X, Y, Z, and W (in order to keep track of their differences concerning the orientations of the strokes touching said triple branches);
  • loops are marked with a B.
  • module M3 uses the respective positions of the zones having the characteristics of primitive shapes as supplied by module M2 in order to obtain invariant descriptions which are particularly well suited to recognizing characters.
  • the Applicant has observed that scanning by means of a moving straight line (or more generally by a moving curve) makes it possible to obtain a fully ordered list of the primitive elements, in which the order depends on the plane topology of these primitive elements or shapes.
  • a point may be selected (optionally within the matrix in which the character is inserted), and a radius is caused to pivot about said point.
  • An ordered list of the primitive shapes is obtained by noting the various primitive shapes encountered by said radius in the order in which they are intercepted.
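  • The pivoting-radius scan can be imitated by sorting the primitive shapes by the angle at which the radius meets them. The following sketch assumes that each shape is reduced to a single representative point, that the radius starts along the positive x axis and turns counterclockwise in mathematical axes, and that the name `radial_order` is free to choose; none of these choices comes from the patent.

```python
import math


def radial_order(shapes, center):
    """shapes: list of (label, (x, y)); center: (x, y) pivot point.
    Returns the labels in the order in which a pivoting radius meets them."""
    cx, cy = center

    def angle(item):
        _, (x, y) = item
        # Map atan2's (-pi, pi] range onto [0, 2*pi) so the sweep is monotonic.
        return math.atan2(y - cy, x - cx) % (2 * math.pi)

    return [label for label, _ in sorted(shapes, key=angle)]
```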
  • the three scans chosen would be parallel to the three lines defining the pixels, in a tessellation matrix (0, +60°, -60°).
  • partial ordering: if two characteristic zones are encountered simultaneously by the scan line, they are enclosed between parentheses, thereby indicating that their order is immaterial.
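  • To make the parenthesis convention concrete, the sketch below builds a partially ordered list from zones that have each been reduced to one coordinate along the scanning direction. The function name `partially_ordered_list`, the `tolerance` parameter used to decide what counts as "simultaneously", and the sorting of labels inside a group to obtain a canonical form are all assumptions for illustration.

```python
def partially_ordered_list(zones, tolerance=0):
    """zones: list of (label, position), where position is the coordinate at
    which the moving scan line meets the zone.
    Zones met within `tolerance` of the first zone of a group are treated as
    simultaneous and written between parentheses."""
    groups = []
    for label, pos in sorted(zones, key=lambda z: z[1]):
        if groups and pos - groups[-1][0] <= tolerance:
            groups[-1][1].append(label)
        else:
            groups.append((pos, [label]))
    parts = []
    for _, labels in groups:
        if len(labels) == 1:
            parts.append(labels[0])
        else:
            # Order inside parentheses is immaterial; sort for a stable result.
            parts.append("(" + " ".join(sorted(labels)) + ")")
    return " ".join(parts)


# Made-up zones: X and Y are met on the same scan line.
description = partially_ordered_list([("X", 2), ("B", 5), ("Y", 2), ("Z", 9)])
```

    Running the same function on coordinates taken along a second scanning direction gives the second list of the pair used for recognition.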
  • the present invention is not limited to using scans based on parallel straight lines, nor is it limited to using only two partially ordered lists, since the number of partially ordered lists may be increased in order to increase character recognition performance, if necessary.
  • the partially ordered lists may be obtained by scanning using curves such as circles of different sizes, deformable spirals, or other curves.
  • the module M4 performs character recognition per se.
  • the memory M5 contains standard descriptions of primitive shapes which are associated, for example, in pairs, with given characters.
  • access to the memory M5 which is advantageously organized as a data base is made faster by the fact that the descriptions it contains are classified by non-disjoint classes.
  • Access to a class is obtained by the hash-code process.
  • the interrogation vector is based on the presence or absence of a certain number of primitive shapes of the type (c), (d), and (e) defined above, i.e. cusps, crosses or branching points having three or more branches, and closed loops, respectively.
  • the type (d) comprising crosses or branches having at least three branches is particularly advantageous for searching purposes, in particular because the simultaneous use of two partially ordered lists makes this type easy to locate in the matrix.
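  • One way to picture the class lookup is a dictionary keyed by a presence/absence vector. The sketch below is hypothetical in most of its details: the label "C" for cusps is invented, crosses are left out because the text gives no label for them, a Python dictionary stands in for the hash-code access to memory M5, and the names `interrogation_vector` and `build_index` are not from the patent. Only the labels X, Y, Z, W for triple branches and B for loops follow the marking described earlier.

```python
from collections import defaultdict

BRANCH_LABELS = {"X", "Y", "Z", "W"}   # triple branches, as marked in the text
LOOP_LABEL = "B"                        # loops, as marked in the text
CUSP_LABEL = "C"                        # hypothetical label for cusps


def interrogation_vector(description):
    """description: flat list of primitive-shape labels.
    Returns a hashable (has_cusp, has_branch, has_loop) vector."""
    labels = set(description)
    return (CUSP_LABEL in labels,
            bool(labels & BRANCH_LABELS),
            LOOP_LABEL in labels)


def build_index(references):
    """references: list of (character, description) pairs.
    A character with several descriptions may land in several classes,
    which is how the classes can be non-disjoint."""
    index = defaultdict(list)
    for character, description in references:
        index[interrogation_vector(description)].append((character, description))
    return index
```

    At recognition time only the reference descriptions stored under the vector of the unknown character need to be compared, which is what makes the search faster than scanning the whole memory.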
  • the module M4 therefore compares an unknown description X coming from the module M3 with the standard descriptions Y coming from the memory M5.
  • the choice of standard descriptions is thus performed as explained above.
  • Comparison per se may use a conventional character string comparison technique, such as:
  • the advantage of the suboptimal maximum slope method is that it is of order n, whereas dynamic programming is of order n², and therefore becomes much less advantageous when the size of the data base is large.
  • the result of each comparison is a distance d(X,Y) between the unknown description X and a standard description Y.
  • the distance d will be greater than a threshold, in which case the module M4 rejects the unknown list.
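  • As an example of the dynamic-programming family of comparisons, the sketch below computes a Levenshtein (edit) distance between two lists of primitive-shape labels and rejects the unknown list when even the nearest reference is farther than a threshold. It is a generic illustration, not the patent's own distance: unit costs are assumed, the parenthesized groups of a partially ordered list are ignored, and the names `list_distance` and `recognize` are hypothetical.

```python
def list_distance(x, y):
    """Levenshtein distance between two label sequences, computed by dynamic
    programming in O(len(x) * len(y)) time."""
    previous = list(range(len(y) + 1))
    for i, xi in enumerate(x, start=1):
        current = [i]
        for j, yj in enumerate(y, start=1):
            cost = 0 if xi == yj else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def recognize(x, references, threshold):
    """references: list of (character, description) pairs.
    Returns the character of the nearest description, or None when the best
    distance exceeds the threshold (the unknown list is rejected)."""
    best = min(references, key=lambda ref: list_distance(x, ref[1]), default=None)
    if best is None or list_distance(x, best[1]) > threshold:
        return None
    return best[0]
```

    The quadratic cost of each comparison is the reason the text favors restricting the comparison to a single class Ai of the data base before any distance is computed.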
  • a test is then performed which may show up an anomaly in the primitive shape zones, generally representative of an abnormal thickening in the character to be identified.
  • the fining module M6 is then used.
  • This module M6 takes the matrix provided by the module M1 and fines it by an erosion method, which does not go right down to the skeleton of the character (thickness of 1 pixel), but which stops when a minimum thickness of two or three pixels is reached.
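  • The idea of eroding without going all the way to a one-pixel skeleton can be sketched with a plain binary erosion that stops early. Everything specific here is an assumption: the 4-neighbour structuring element, the crude thickness measure (shortest horizontal or vertical run), the binary input, and the names `erode`, `min_thickness` and `fine`. The patent does not describe its erosion method in this text.

```python
def erode(img):
    """One pass of binary erosion: a pixel survives only if it and its four
    neighbours are set (pixels outside the matrix count as background)."""
    rows, cols = len(img), len(img[0])

    def on(r, c):
        return 0 <= r < rows and 0 <= c < cols and bool(img[r][c])

    return [[1 if (on(r, c) and on(r - 1, c) and on(r + 1, c)
                   and on(r, c - 1) and on(r, c + 1)) else 0
             for c in range(cols)] for r in range(rows)]


def min_thickness(img):
    """Shortest horizontal or vertical run of set pixels (0 if none)."""
    lengths = []
    for lines in (img, list(zip(*img))):      # rows, then columns
        for line in lines:
            n = 0
            for v in list(line) + [0]:        # sentinel closes a trailing run
                if v:
                    n += 1
                elif n:
                    lengths.append(n)
                    n = 0
    return min(lengths, default=0)


def fine(img, minimum=2):
    """Erode repeatedly, stopping before the thinnest part of the shape would
    drop below `minimum` pixels (each pass removes one pixel per side)."""
    while min_thickness(img) - 2 >= minimum:
        img = erode(img)
    return img
```

    With `minimum=2`, a stroke six pixels wide ends at two pixels and a stroke five pixels wide ends at three, in line with the stated stopping thickness of two or three pixels.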
  • the new data is then subjected successively to the modules M2, M3, and M4, and in the vast majority of cases a character is then correctly recognized.
  • the invention has been applied essentially to average quality printed characters and it has given excellent results therewith.
  • the invention may also be applied to any line image, in particular to handwritten words which, unlike the characters mentioned above, include characters which are joined-up.
  • the characteristic zones detected by the procedure detzone() which scans through the matrix trace form a doubly-linked list.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)
US07/523,257 1986-12-12 1990-05-14 Character recognition method and apparatus Expired - Fee Related US5113453A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR8617436A FR2608295B1 (fr) 1986-12-12 1986-12-12 Method and device for character recognition
FR8617436 1986-12-12

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US07131890 Continuation 1987-12-11

Publications (1)

Publication Number Publication Date
US5113453A true US5113453A (en) 1992-05-12

Family

ID=9341848

Family Applications (1)

Application Number Title Priority Date Filing Date
US07/523,257 Expired - Fee Related US5113453A (en) 1986-12-12 1990-05-14 Character recognition method and apparatus

Country Status (5)

Country Link
US (1) US5113453A (de)
EP (1) EP0274944B1 (de)
JP (1) JPS6453282A (de)
DE (1) DE3777993D1 (de)
FR (1) FR2608295B1 (de)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5253307A (en) * 1991-07-30 1993-10-12 Xerox Corporation Image analysis to obtain typeface information
US5436983A (en) * 1988-08-10 1995-07-25 Caere Corporation Optical character recognition method and apparatus
US5515455A (en) * 1992-09-02 1996-05-07 The Research Foundation Of State University Of New York At Buffalo System for recognizing handwritten words of cursive script
US5784501A (en) * 1990-05-15 1998-07-21 Canon Kabushiki Kaisha Image processing method and apparatus
US5825923A (en) * 1996-09-05 1998-10-20 Faxtrieve, Inc. Method for performing character recognition on a pixel matrix
US6128409A (en) * 1991-11-12 2000-10-03 Texas Instruments Incorporated Systems and methods for handprint recognition acceleration
US6404909B2 (en) * 1998-07-16 2002-06-11 General Electric Company Method and apparatus for processing partial lines of scanned images
US20030086700A1 (en) * 1993-11-30 2003-05-08 Taku Yamagami Image pickup apparatus
US20040042664A1 (en) * 2002-09-04 2004-03-04 Lockheed Martin Corporation Method and computer program product for recognizing italicized text
US6934405B1 (en) * 1999-05-12 2005-08-23 Siemens Aktiengesellschaft Address reading method
WO2018178228A1 (en) 2017-03-30 2018-10-04 Myscript System and method for recognition of objects from ink elements
CN110674826A (zh) * 2019-10-09 2020-01-10 嘉兴学院 Character recognition method based on quantum entanglement

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5553162A (en) * 1991-09-23 1996-09-03 Eastman Kodak Company Method for detecting ink jet or dot matrix printing
EP0541299B1 (de) * 1991-11-04 2000-03-01 Canon Kabushiki Kaisha Apparatus and method for optical character recognition

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4712248A (en) * 1984-03-28 1987-12-08 Fuji Electric Company, Ltd. Method and apparatus for object identification
US4718090A (en) * 1986-04-14 1988-01-05 Cooper Jr James E Method and apparatus for high-speed processing of video images
US4771474A (en) * 1983-10-03 1988-09-13 Shaken Co., Ltd. Apparatus for processing character or pictorial image data

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1345032A (en) * 1971-01-06 1974-01-30 Int Computers Ltd Optical character recognition system
JPS58169295A (ja) * 1982-03-31 1983-10-05 Ricoh Co Ltd Segment extraction device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4771474A (en) * 1983-10-03 1988-09-13 Shaken Co., Ltd. Apparatus for processing character or pictorial image data
US4712248A (en) * 1984-03-28 1987-12-08 Fuji Electric Company, Ltd. Method and apparatus for object identification
US4718090A (en) * 1986-04-14 1988-01-05 Cooper Jr James E Method and apparatus for high-speed processing of video images

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Automatic Interpretation and Classification of Images, a NATO Advanced Study Institute, Edited by A. Grasselli, 1969, Academic Press, New York: J. R. Parks: "A Multi-Level System of Analysis for Mixedfont and Hand-Blocked Printed Characters Recognition", pp. 295-322. *
Patent Abstracts of Japan, vol. 8, No. 6, p. 247, Jan. 12, 1984. *
Pattern Recognition, vol. 3, 1971, pp. 345-361, Pergamon Press, GB, R. Narashimhan et al: "A syntax-aided recognition scheme for handprinted English letters". *
The Marconi Review, vol. 32, Nos. 172-175, Jan.-Dec. 1969, pp. 82-104, Chelmsford, Essex, GB; J. Thompson et al: "Experimental multifont page reader". *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5436983A (en) * 1988-08-10 1995-07-25 Caere Corporation Optical character recognition method and apparatus
US5784501A (en) * 1990-05-15 1998-07-21 Canon Kabushiki Kaisha Image processing method and apparatus
US5253307A (en) * 1991-07-30 1993-10-12 Xerox Corporation Image analysis to obtain typeface information
US6128409A (en) * 1991-11-12 2000-10-03 Texas Instruments Incorporated Systems and methods for handprint recognition acceleration
US5515455A (en) * 1992-09-02 1996-05-07 The Research Foundation Of State University Of New York At Buffalo System for recognizing handwritten words of cursive script
US7366404B2 (en) * 1993-11-30 2008-04-29 Canon Kabushiki Kaisha Image pickup apparatus
US20030086700A1 (en) * 1993-11-30 2003-05-08 Taku Yamagami Image pickup apparatus
US5825923A (en) * 1996-09-05 1998-10-20 Faxtrieve, Inc. Method for performing character recognition on a pixel matrix
US6404909B2 (en) * 1998-07-16 2002-06-11 General Electric Company Method and apparatus for processing partial lines of scanned images
US6442290B1 (en) * 1998-07-16 2002-08-27 Ge Medical Systems Global Technology Company, Llc Method and apparatus for processing partial lines of scanned images and communicating associated data over a network
US6934405B1 (en) * 1999-05-12 2005-08-23 Siemens Aktiengesellschaft Address reading method
US20040042664A1 (en) * 2002-09-04 2004-03-04 Lockheed Martin Corporation Method and computer program product for recognizing italicized text
US7095894B2 (en) * 2002-09-04 2006-08-22 Lockheed Martin Corporation Method and computer program product for recognizing italicized text
WO2018178228A1 (en) 2017-03-30 2018-10-04 Myscript System and method for recognition of objects from ink elements
US10579868B2 (en) 2017-03-30 2020-03-03 Myscript System and method for recognition of objects from ink elements
CN110674826A (zh) * 2019-10-09 2020-01-10 嘉兴学院 Character recognition method based on quantum entanglement
CN110674826B (zh) * 2019-10-09 2022-12-20 嘉兴学院 Character recognition method based on quantum entanglement

Also Published As

Publication number Publication date
FR2608295B1 (fr) 1989-03-31
EP0274944A3 (en) 1988-07-27
DE3777993D1 (de) 1992-05-07
JPS6453282A (en) 1989-03-01
FR2608295A1 (fr) 1988-06-17
EP0274944A2 (de) 1988-07-20
EP0274944B1 (de) 1992-04-01

Similar Documents

Publication Publication Date Title
JP4323328B2 (ja) System and method for identifying and extracting character strings from captured image data
Schurmann et al. Document analysis-from pixels to contents
US4817171A (en) Pattern recognition system
US5113453A (en) Character recognition method and apparatus
US5347595A (en) Preprocessing means for use in a pattern classification system
US6014450A (en) Method and apparatus for address block location
Casey et al. Intelligent forms processing system
US4773099A (en) Pattern classification means for use in a pattern recognition system
US5201011A (en) Method and apparatus for image hand markup detection using morphological techniques
US5818965A (en) Consolidation of equivalence classes of scanned symbols
CA1160347A (en) Method for recognizing a machine encoded character
US5077807A (en) Preprocessing means for use in a pattern classification system
US20090046938A1 (en) Character contour correction
US5033104A (en) Method for detecting character strings
US5394484A (en) Image recognition apparatus
JPH05307638A (ja) Method for converting a bitmap image document into coded data
Shashidhara et al. A review on text extraction techniques for degraded historical document images
Suen et al. Sorting and recognizing cheques and financial documents
IL98293A (en) A method for distinguishing between text and graphics
Mitchell et al. Newspaper document analysis featuring connected line segmentation
Lebourgeois et al. Document analysis in gray level and typography extraction using character pattern redundancies
Dhandra et al. Word level script identification in bilingual documents through discriminating features
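JP3268552B2 (ja) Region extraction method, address region extraction method, address region extraction device, and image processing device
JP3476595B2 (ja) Image region segmentation method and image binarization method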
Kurdy et al. Omnifont Arabic optical character recognition system

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
FP Lapsed due to failure to pay maintenance fee

Effective date: 20000512

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362