Where does an attribute depend on one of the other attributes


I can now be sure that I've found all the cases where an attribute depends on one of the other attributes. But there can also be attributes that depend on two, or even more, attributes. In fact, I hope there are, because I'm still left with a few attributes that don't depend on any other attribute. If you ever run into this, it's a sure sign of one or more missing attributes on your shortlist-one of the hardest problems to overcome in data modeling. The method for finding multiattribute dependencies is the same as that for singleattribute dependencies-for every possible combination, create a sample with two rows that duplicate the columns to test and don't duplicate any other column. If at this point I hadn't found any dependency yet, I'd be facing an awful lot of combinations to test. Fortunately, I've already found some dependencies (which you'll find is almost always the case if you start using this method for your modeling), so I can rule out most of these combinations. At this point, if you haven't already done so, you should remove attributes that don't depend on the candidate key or that transitively depend on the primary key.

You'll have noticed that I already did so. Not moving these attributes to their own tables now will make this step unnecessarily complex. The key to reducing the number of possible combinations is to observe that at this point, you can only have three kinds of attributes in the table: a single-attribute candidate key (or more in the case of a mutual dependency), one or more attributes that depend on the candidate key, and one or more attributes that don't depend on the candidate key, or on any other attribute (as we tested all single-attribute dependencies). Because we already moved attributes that depend on an attribute other than the candidate key, these are the only three kinds of attributes we have to deal with. And that means that there are six possible kinds of combinations to consider: a candidate key and a dependent attribute; a candidate key and an independent attribute; a dependent attribute and an independent attribute; two independent attributes; two dependent attributes; or two candidate keys. Because alternate keys always have a mutual dependency, the last category is a special case of the one before it, so I won't cover it explicitly. This combination (as well as the combination of two candidate keys, as I already mentioned) can be omitted completely. I won't bother you with the mathematical proof, but instead will try to explain in language intended for mere mortals.

Given three attributes (A, B, and C), if there's a dependency from the combination of A and B to C, that would imply that for each possible combination of values for A and B, there can be at most one value of C. But if there's also a dependency of A to B, this means that for every value of A, there can be at most one value of B-in other words, there can be only one combination of A and B for every value of A; hence there can be only one value of C for every value of A. So it naturally follows that if B depends on A, then every attribute that depends on A will also depend on the combination of A and B, and every attribute that doesn't depend on A can't depend on the combination of A and B. For this combination, some testing is required. In fact, I'll test combination first, because it's the most common-and the sooner I find extra dependencies, the sooner I can start removing attributes from the table, cutting down on the number of other combinations to test. But, as before, it's not required to test all other attributes for dependency on a given combination of a candidate key and an independent attribute. Every attribute that depends on the candidate key will also appear to depend on any combination of the candidate key with any other attribute. This isn't a real dependency, so there's no need to test for it, or to conclude the existence of such a dependency. This means that in my example, I need to test the combinations of OrderNo and Product, OrderNo and Qty, and OrderNo and TotalPrice. And when testing the first combination (OrderNo and Product), I can omit the attributes CustomerID and OrderTotal, but I do need to test whether Qty or TotalPrice depend on the combination of OrderNo and Price. The domain expert rejected the sample order confirmation I based on this data. As reason for this rejection, she told me that obviously, the orders for 10 and 12 units of Gizmo should've been combined on a single line, as an order for 22 units of Gizmo, at a total price of $375.00.

This proves that Qty and TotalPrice both depend on the combination of OrderNo and Product. Second normal form requires me to create a new table with the attributes OrderNo and Product as key attributes, and Qty and Total- Price as dependent attributes. I'll have to continue testing in this new table for twoattribute dependencies for all remaining combinations of two attributes, but I don't have to repeat the single-attribute dependencies, because they've already been tested before the attributes were moved to their own table. For the orders table, I now have only the OrderNo, CustomerID, and OrderTotal as remaining attributes. This is another combination that should be included in the tests. Just as with a single dependent attribute, you'll have to test the key attribute (which will be dependent on the combination in case of a mutual dependency, in which case the combination is an alternate key) and the other dependent attributes (which will be dependent on the combination in case of a transitive dependency). In the case of my sample Orders table, I only have two dependent attributes left (CustomerID and OrderTotal), so there's only one combination to test. And the only other attribute is OrderID, the key. So I create the test population to check for a possible alternate key.

The domain expert saw no reason to reject this example, so there's obviously no dependency from CustomerID and OrderTotal to OrderNo. Because the Orders table used in my example has no independent columns anymore, I can obviously skip this combination. But if there still were two or more independent columns left, then I'd have to test each combination for a possible dependency of a candidate key or any other independent attribute upon this combination. This last possible combination is probably the least common-but there are cases where an attribute turns out to depend on a combination of a dependent and an independent attribute. Attributes that depend on the key attribute can't also depend on a combination of a dependent and an independent column (see the sidebar a few pages back for an explanation), so only candidate keys and other independent attributes need to be tested.

Legal Disclaimer

Our website is not responsible for the information contained by this article. Articleinput.com is a free articles resource thus practically any visitor can submit an article. However if you notice any copyrighted material, please contact us and we will remove the article(s) in discussion right away.

Note: This article was sent to us by: Christopher Baker at 01072010

Related Articles

1. Definitions and uses of functions in Python
This article assumes you're familiar with function definitions in at least one other computer language and with the concepts that correspond to...

2. Python functions handle variable numbers of arguments
Python functions can also be defined to handle variable numbers of arguments. You can do this two different ways. One way handles the relativel...

3. Lambda expressions and generator functions in Python
Short functions like those you just saw can also be defined using lambda expressions of the form. Lambda expressions are anonymous little funct...

4. How to create a basic program in Python
Up until now, you've been using the Python interpreter mainly in interactive mode. For production use, you'll want to create Python programs or...

5. How to make Python script execution in UNIX and Mac OS X and in Windows
If you're on UNIX, you can easily make a script directly executable. Note that if Python 3.x isn't your default version of Python, you may need...

6. The difference between scripts on Windows scripts on UNIX
The way you call scripts on Windows differs from the way scripts are called on Linux/ UNIX, and that difference can affect what kind of scripts...

7. Python applications are distributed as source files
You can distribute your scripts as source files (as .py files). You can also ship them as byte code (as .pyc or .pyo files). A byte code file w...

8. How to use the Python file system
Working with files involves one of two things: basic I/O and working with the filesystem (for example, naming, creating, moving, or referring t...

9. Client complaints about inconsistent query results
You've probably seen it before: a client has called the help desk and reported that a couple of queries have seemingly inconsistent results. Yo...