This is the fourth article in a series on text management and its influence in the corporate environment.

Although managing information via computers will certainly give corporations a competitive edge throughout the 1990s, much of the data to be retrieved exists only in printed form. Before this information can be harnessed, it must be transformed into a machine-readable format.

There are two ways to convert printed matter into machine-readable form: by manually rekeying data via a word processor or by tapping new optical-scanning technology.

Optical-scanning equipment captures a printed page into a bit-mapped image, which is then converted into ASCII using optical character recognition (OCR) software. The software converts the bit-mapped page image into ASCII characters by matching the patterns of the page image against patterns stored in the software.

Storage of a bit-mapped image requires up to 1M bytes of memory, compared with a page of ASCII text, which requires less than 3,000 bytes.

Current OCR software can recognize a variety of typefaces and font sizes, handle typeset text and flag unrecognizable characters.

Although optical-scanning systems have advanced over the last few years, OCR software is not yet 100 percent accurate, and there may be conversion errors and characters that it can't recognize.

Despite these limitations, though, OCR software provides considerable benefits, primarily in the area of data-searching capabilities.

Bit-mapped page images generated by optical scanners are not searchable based on text. For example, consider the image of a page that discusses pricing in a purchasing-system reference manual. To allow users to search for any text that refers to pricing, a key word would have to be attached to the page image.

If the image were processed by OCR software, on the other hand, the resulting file could be accessed by word searches.
This would let the user find the page by using a search request for [...] word on it.

A key requirement of optical scanning is the ability to manipulate [...] text. For example, a user might wish to use different fonts for different kinds of textual material or format a document differently for printing than for screen display.

A technology known as document-structure markup allows for this kind of flexibility. Markup is a scheme of tags that are interspersed throughout the document file. The tags convey information about the document's structure and appearance. Markup can indicate horizontal and vertical spacing, page breaks, lists, type fonts and point sizes.

Extenders: Market Thriving
Continued from Page 68

Systems Inc. projects its revenue growth for the year to double.

Once a little-known technology, DOS extenders have come to infiltrate some of the most popular PC software applications. Lotus Development Corp. tapped Rational Systems' 286 DOS extender for 1-2-3 3.0, and both Ashton-Tate and DataEase International Inc. turned to the Rational product for versions of their databases.

Even though support from commercial software developers is strong, many customers are eying Windows or OS/2 Presentation Manager for more long-term solutions to the DOS memory crunch.

Still, many of the corporate and vertical-market developers that make up 80 percent of Phar Lap's customer base are not in this camp, said Richard Smith, president of Phar Lap in Cambridge, Mass.

The majority of these vertical-market software developers do not have the time or experience to learn graphical-user-interface programming, Smith said.

"They know about finite element analysis or rendering, but often they're not systems-type or GUI-type programmers," Smith said.

What's more, users of these specialized applications often have no need for Windows or OS/2 because these extra layers tend to diminish performance, and application-switching capabilities are useless on a dedicated computer, users said.

Take, for example, Wasatch Computer Technology Inc., a developer of high-end graphics software, which has no plans to migrate its package (which includes the Phar Lap 386 DOS extender) over to Windows. Neither Windows nor OS/2 offers the 32-bit support that Wasatch requires to generate fast graphics, company officials said.

"We need 32-bit code so we can manage 16M-byte pieces of data with reasonable speed," said Mike Ware, president of Wasatch Computer in Salt Lake City. "Windows doesn't have what we want."

PC WEEK\APPLICATIONS
SEPTEMBER 17, 1990
Converting the Printed Word to Machine-Readable Text
John Avakian

Markup can also be used to mark [...] within text for easy identification.

In one scenario, a markup scheme for software reference manuals could indicate hardware implementations and version numbers, user interface sections and technical sections.

Markup can be procedural, describing how to format text on a page, or descriptive, describing what is being formatted. Using the descriptive approach, documents are not tied to a specific display medium such as the printed page.

For example, paragraphs might be marked with <PARA> at the beginning and </PARA> at the end. When the document is printed, the style guide used might indicate that <PARA> means to skip a line and include no indentation. However, if the document is being displayed on a screen, a different style guide could be used to indicate that <PARA> means to skip no lines and indent five spaces.

The best-known markup language is Standard Generalized Markup Language (SGML).
SGML tags are independent of any specific word-processing package, allowing for easy transfers between packages and the text collection.

Next week I will discuss how text is indexed and queried.

The concepts in this article are described in a new volume, Text Management, of The James Martin Report Series. For more information on this volume, call (800) 242-124 or (617) 639-1958. For information on seminars, contact Technology Transfer Institute, 741 10th St., Santa Monica, Calif. 90402, (213) 394-8305 (in the United States and Canada). In Europe, contact Savant, 2 New St., Carnforth, Lancs., LA5 9BX, United Kingdom, (0524) 734 505.

I/F Builder: New DDE Support
Continued from Page 68

processors and graphics packages, Gardner claimed.

This integration would come in handy, for example, in deriving the names and addresses for a direct-mail letter drafted and printed in a Windows word processor from a mainframe customer database, he explained.

The DDE support could also be harnessed to quickly generate and update a Windows spreadsheet with mainframe numerical data to create graphs and charts, he added.

"All of these functions can be integrated under the same interface so they appear to the user as one application," Gardner said.

One company turning to I/F Builder for these purposes is Information Sciences Inc. (InSci) in Montvale, N.J. InSci has used I/F Builder to create a PC-based Windows 3.0 front end to its mainframe Human Resource Management System, according to Laura Hills, InSci's vice president of product management. Called InSciVision, the Windows interface vastly reduced the number of function keys, codes and commands users need to know to navigate through the mainframe application.

"Instead of having to memorize a series of transaction codes, what users can do is click on an icon of a file folder that is labeled with the function they wish to perform," Hills said.
The interface, she said, speeds training, makes the system easier to use, and reduces keystroke errors and frustration. "This allows users to rely more on their knowledge of human resources than their ability to remember codes and transactions," she said.

Due by the end of this month, I/F Builder 2.1 will be priced at $17,500. A run-time version, called I/F Manager, is sold separately for $395 per workstation.

The software can be used to create Windows 3.0 interfaces for a variety of host mainframe systems, including the IBM 3080, 3090, 4300 and 9370 running the MVS, VM/SP or DOS/VSE operating systems.

Viewpoint Systems can be reached at (415) 578-1591.