by Frank Sweet Bachman data-structure diagrams consist of boxes and arrows. The boxes represent data entities or types of records. The ar- rows depict relationships between records. Each arrowhead marks the '"many' end of a relationship. | Bachman diagrams are useful to de- signers for two reasons. First, they pack so much information into a succinct, easily re- produced form. A few lines sketched on a blackboard or notepad replace tedious, eas- ily misunderstood explanations. Also, we can manipulate the symbols, like the terms of an eguation, to derive a detailed concep- tual database design from a high-level sum- mary. Diagram manipulations let us con- clude things about our design. We can test it for consistency and examine alternatives before spending days and dollars physically building it in our shop's database manage- ment system. The three basic diagram manipula- tions are these: two-headed arrows produce intersection entities, headless arrows will merge entities, and optional arrows will split entities. We'il cover these rules and others in the next few issues. First, though, consider the symbols themselves. Boxes model data entities, the things about which we'il store data. They are important because once implemented, they become different types of records in the database. Arrows model interrecord re- lationships. They are important for three reasons. First, they embody referential data integrity. In other words, a "vendor'"' box pointing to a "purchase order" box means that we must not store a new purchase or- der unless it is associated with a valid ven- dor. Neither should we erase a vendor as long as it has purchase orders associated with it. Second, arrows show important ac- cess paths (predefined J01Ns, in relational terms) that the finished application will use (for example, given a vendor, find its pur- chase orders). Finally, many database man- agement systems use disk address pointer chains or arrays to relate records to one an- We continue our exploration of database design in part five of this 14-part series, Process-Driven Data Design. TWO-HEADED other. Conceptual relationships are a start- ing point for defining these physical rela- tionships. Here are six sample Bachman dia- grams. The first three are not valid in a fin- ished conceptual database design. No arrowhead. Which of the two is the "many" end? We'll talk about that next time. Conceptually meaningless, al- though this can be implemented in some database management systems. This one is conceptually meaningful but theoretically impossible. This situation does not occur in real life. We will discuss it in a moment. The next three, though bizarre, do make sense. Ouite common actually, a bill-of- material structure. Weird but legitimate. I've seen only one like this. It belongs to a large-city res- cue sguad's mapping database. ARROW Also common in real lifeā€”each or- ganizational unit reports to one and only one other unit, but each may have several units reporting to it. It is not directly imple- mentable in most database packages. Let's examine the first diagram ma- nipulation: the two-headed arrow and the missing intersection. A two-headed arrow means an entity is missing from our design. There's a record out there that we must identify and include before translating con- ceptual into physical design. PURCHORO Hi The above figure tells us that the re- lationship between the two records is not simply one of header detail. Since an ar- rowhead is the many end of a one-to-many relationship, twin arrowheads don't tell us which is the many end. Say we wanted a part-number catalog master file as well as a file of purchase orders. Obviously, pur- chased parts and their purchase orders are related in some way, but where does the ar- rowhead go? One PO can include many dif- ferent parts, putting the arrowhead on the right. But wait, any one part can be includ- ed in many different purchase orders. This means that the arrowhead goes on the left. Our problem is caused by lack of an entity. The many-to-many situation indi- cates that we are missing a record. There is a box, a thing about which we need to keep data, that we have not yet identified. Think about it. The PURCHORD rec- ord carries data about a purchase order (in- dependent of what individual items are in the PO). The PARTNUM record holds a part's catalog data (regardless of any POs that exist). Where does "guantity ordered" go? Not in PARTNUM, because any one part could be ordered in different guantities on many POs. but not in PURCHORD either: a PO can include different guantities of many OCTOBER 15, 1935 137