CSc 538
DSM research highlights: a blog

What follows are highlights of some of the mental processes I worked through during Phase 1 of the project. The purpose of this log is not to show you exactly what you should have noticed in your research, but to give a concrete example of how the process works. It will help to have the DSM paper handy as you read through this.
I found the original 1985 paper by Copeland and Khoshafian, which I will hereafter refer to as CK85, in the ACM digital library.

CK85: page 1, 1st column

The second paragraph in the paper sets the tone. This will be a direct comparison with the N-ary storage model. The authors say they're not taking a "we are always better" tone, so I should be able to find both advantages and disadvantages straight away.


This example explains the entire concept. Each attribute is its own table. Based on the context, the authors seem to be using the word "surrogate" for "key". Ok.


CK85: page 1, 2nd column

The two copies, both using cluster indexing, are important. There will be a space issue with this model.


Section 2 appears to be listing all the advantages. Some may not be worth mentioning due to their limited applicability, like the multivalued attribute support in 2.1 (that means it's not even in 1NF!) and the directed graphs of 2.5. Some would make good examples though, like the heterogeneous records of 2.4, which eliminates nulls.


CK85: page 3, 1st column

This is now the second time something called an "inverted file" has been mentioned. Nothing in our class slides about it. Our textbook has a passage about "fully inverted files" on p. 486, which is a file that has a secondary index on every attribute. On a whim, I type "inverted file" into Google to see what I get. It leads to me to pages on document searching and information retrieval. This is probably related, but it's a dead end as far as finding a straightforward definition. I may have to look this up elsewhere if CK85 uses this term more.


Section 2.6 of CK85 lists differential file support as an advantage. I check out the Severance and Lohman reference, SL76, to make sure I know what they mean.


Severance and Lohman: page 2

SL76 gives a good analogy to an errata list that I can use (and cite, of course). Now that the definition of a differential file is confirmed, the way the DSM allows the "errata list" to contain just the changed attribute instead of the entire record makes sense as an advantage for the DSM.



CK85: page 4, 1st column

Inverted files again. I better figure this out before going on. Luckily, there's a reference this time. After reading the first two pages of the Cardenas reference on inverted files, it seems definitely to be an index:


Cardenas: page 2

Oh goody, a picture. The format of the entries appears to be a value for the entry, a pointer to the record, and then some length entry telling you how many pointers you've got. (And so there may be more than one pointer.) Using this I should be able to come up with a small example of my own:


Inverted file example

So this would be the inverted file index on the "Number" attribute of my sample table. Look familiar? It's slide 59 from the notes! It's one of the implementation options for a secondary index. This is confirmed by the textbook's definition of a fully inverted file, which has a lot of such indexes. Ok, so an inverted file is just a secondary index.


Section 3.1 lists the N-ary model as having up to a 4-to-1 advantage in space over DSM. Definitely a shortcoming to acknowledge. The rest of section 3 appears to show why the authors believe this isn't so much a concern.
The graphs in Section 5 can be pretty daunting, but decipherable after all the acronyms like "nca" and "njr" are defined. I can use a couple of the graphs as examples.

Conclusions

My own research into DSM eventually led me to consult three other papers about related subjects besides the original, as well as the textbook and other notes. Your mileage may vary. Remember, the point of this was to show you the process in action and to give you something with which to compare your own work. Here are the slides that I created as a result of my own research. Compare them to yours so you can see the level of detail I'm looking for. (I recommend downloading them instead of looking at them via the web since I left notes on many slides that will be helpful to you in understanding what I would be talking about with each slide.)
Back to Project page