How to model the sales order


The sales order has to be one of the most standard and most downtrodden examples in the history of data modeling texts. Surely you don't need any advanced techniques to find the functional dependencies for this example; you've seen it so often that you could build the relevant tables in your sleep. And that makes this an ideal example scenario for this article. Using a simple and recognizable example enables you to check my questions and the answers provided by the fictitious domain expert, and to see for yourself that the procedure of asking questions and drawing conclusions outlined in this article will indeed result in the correct data model-a statement you'd have to take for granted if I'd chosen to use a more complex example. In real life, you won't follow all steps exactly as I outline them here, but instead immediately conclude functional dependencies in case you feel confident that you're sufficiently familiar with the domain, and use the detailed procedure in situations where you're in doubt or have no idea at all. But for illustration purposes, I'll presume that I (as the data modeler) am completely unfamiliar with the domain of sales orders, so I can't assume anything, but have to ask the domain expert everything I need to know.

So let's focus on WeTrade, Inc., a fictitious trading company, where I'm supposed to model the database for their sales orders. I've already acquired a sample order confirmation form, which I used to determine that the data model would include the attributes OrderNo, CustomerName, CustomerID, Product, Qty, Price, TotalPrice, and OrderTotal. The procedure for finding all functional dependencies is easier to perform if you first lay out all attributes in a simple tabular format, with the attribute names as column headers and some sample data below them. You need to start with at least one row of data, and you're well advised to keep the number of rows low. A functional dependency can be single-attribute (when a single attribute determines the dependent attribute) or multiattribute (when two or more attributes have to be combined to determine the dependent attribute). In the first step of the procedure, you'll only search for single-attribute dependencies. The definition of functional dependency implies that it's never allowed to have two rows of data with the same value for the determining attribute but different values for the dependent attribute. I'll leverage that knowledge by changing the sample data to include exactly those patterns and then asking the domain expert if the changed data would still be allowed. Finding single-attribute functional dependencies is done on an attribute-by-attribute basis. I'll start with OrderNo. I now have to either add a row or modify some existing data in my table so that I get a combination of two rows with the same value in the OrderNo column and different values in all other columns. We've already established that the best way to ask questions of the domain expert is to use concrete examples in a notation she's familiar with. I have the concrete example, so all I need to do now is to transform it back into the familiar notation-that of an order confirmation form. Note that there are sometimes different ways to represent the information from your table in a familiar format. For instance, in this case I used a single form with two customer names, two customer IDs, and two order totals. I could also have chosen to use two separate forms. In fact, because I'm completely unfamiliar with the domain of sales orders, I should try both versions to make sure that I won't conclude that the data isn't allowed if the domain expert rejects an example that's correct but incorrectly represented.

To save space, I won't include the two extra order confirmation forms I have to make to test the alternative representation. When I showed my altered examples to the domain expert and asked her if these could be valid order confirmations, she told me that there were many problems. It's not allowed to have two separate orders with the same order number, so this rules out the alternate representation I just described. But it's also not allowed to have more than a single customer name or more than a single customer number on an order confirmation form. And finally, the total price of the order for 12 doobles doesn't match the quantity and price, and neither of the order totals matches the sum of the individual total price values; all these numbers should match. The last statement illustrates an important pitfall when using examples to find functional dependencies (or other constraints). A domain expert could reject an example for reasons other than what you're trying to test. In this case, I wanted to know if more than one total price is allowed for a single order, but the domain expert rejects the example because a different business rule (total price = price * quantity) was violated. It's important that you always verify why the domain expert rejects your example before jumping to conclusions. Normally, the next step would be to correct the errors in the calculation of total price and order total and then consult the domain expert again. After all, the fact that she rejected the total price of the doobles because of a calculation error doesn't imply that she'd also reject it if the calculation were correct-so far I've failed to create a proper test to check whether a single order can encompass more than a single total price. But in this case I can already see from the original example, which is a correct example, that there can be multiple total prices for a single order, so there's no need to test this column. This doesn't fly for the order total, though.

And no matter how hard I try, I quickly find that there's no way to create an example with two order totals on a single order confirmation form that both match the sum of the individual total prices and yet are different. This shows that it's impossible to have more than a single order total on a single order confirmation form, which is what I needed to know. I can already deduct this myself, so there's no need to bother the domain expert with this. Because the domain expert objected to associating multiple customer names or multiple customer IDs to a single order number and I could deduct that it's impossible to associate multiple order totals with a single order number, I can now conclude that I've found three functional dependencies: CustomerName, CustomerID, and OrderTotal all are dependent on OrderNo. I obviously have to repeat the previous step for all attributes in the table. But I can save myself (and the domain expert) a lot of work if I take already-discovered functional dependencies into account. In this case, because I already know that CustomerName is dependent on OrderNo, I also know that it's impossible for attributes that don't depend on OrderNo to depend on CustomerName. After all, if for instance Product depended on CustomerName, then it would as a result of the OrderNo CustomerName dependency also transitively depend on OrderNo. And because I already established that Product doesn't depend on OrderNo, there's no way that it can depend on OrderNo. For this reason, I can exclude product, quantity, price, and total price from this round of testing. I do still need to test the other columns, though. So I once more change the data from the original example to make sure that I get two rows with the same customer name, but different order number, customer ID, and order total. You'll have noted that I left some cells empty.

This is done deliberately, to draw your focus to the value combinations that I need to test for. I'll next add in some values for the missing cells, taking particular care not to violate any of the already-identified functional dependencies or other business rules, and then I'll once more transform this to the familiar notation of an order confirmation form before consulting the domain expert. I won't include these steps here (nor for any of the following examples), to save space. The domain expert informed me that there was nothing wrong with this new example. This shows that it's great that I don't have any knowledge about this domain, for I might otherwise have assumed that the combination of two customer IDs with the same customer name wouldn't be allowed. Obviously, WeTrade, Inc., does business with different customers that share the same name. Because the modified example wasn't rejected, I can conclude that none of the attributes are functionally dependent on CustomerName.

Legal Disclaimer

Our website is not responsible for the information contained by this article. Articleinput.com is a free articles resource thus practically any visitor can submit an article. However if you notice any copyrighted material, please contact us and we will remove the article(s) in discussion right away.

Note: This article was sent to us by: Christopher Baker at 01072010

Related Articles

1. What can be used as a key in Python
Python permits more than just strings to be used in this manner. Any Python object that is immutable and hashable can be used as a key to a dic...

2. Definitions and uses of functions in Python
This article assumes you're familiar with function definitions in at least one other computer language and with the concepts that correspond to...

3. Python functions handle variable numbers of arguments
Python functions can also be defined to handle variable numbers of arguments. You can do this two different ways. One way handles the relativel...

4. Lambda expressions and generator functions in Python
Short functions like those you just saw can also be defined using lambda expressions of the form. Lambda expressions are anonymous little funct...

5. How to create a basic program in Python
Up until now, you've been using the Python interpreter mainly in interactive mode. For production use, you'll want to create Python programs or...

6. How to make Python script execution in UNIX and Mac OS X and in Windows
If you're on UNIX, you can easily make a script directly executable. Note that if Python 3.x isn't your default version of Python, you may need...

7. The difference between scripts on Windows scripts on UNIX
The way you call scripts on Windows differs from the way scripts are called on Linux/ UNIX, and that difference can affect what kind of scripts...

8. Python applications are distributed as source files
You can distribute your scripts as source files (as .py files). You can also ship them as byte code (as .pyc or .pyo files). A byte code file w...

9. How to use the Python file system
Working with files involves one of two things: basic I/O and working with the filesystem (for example, naming, creating, moving, or referring t...