Wednesday, March 7, 2012

Question on design with Identity columns

Normally when I have a "Many-toMany" or linkage table where the primary key consists of a foreign key from two different tables, I do not bother to make a separate identiy column instead.
Does anyone see a reason why an identity column would be more or less desireable ? For example
Table Person
PK - PersonID
Table Car
PK - CarID
Table PersonCar
PK (PersonID, CarID)
Or would it be better to make an Identity Column such as PersonCarID so then the table would look like the following:
Table PersonCar
PK - (PersonCarID)
FK - PersonID
FK - CarID
Create Unique Constraint on Person and CarID
Any feedback is appreciatedHave a "quick" read of...
http://forums.asp.net/1015568/ShowPost.aspx
|||

mru22 wrote:


Does anyone see a reason why an identity column would be more or less desireable ?


My opinion, which I am not willing to defend to death, mind you, isthat an additional identity column is preferable. I've learned tohate composite keys on tables. I prefer my UPDATE, DELETE, etc.statements to have a single variable in the WHERE condition. Whencommunicating a specific record to another developer, or making a noteto myself, it is a lot easier to be able to refer to a single value asa key. Database purists will certainly scoff, and perhaps evenrise up in outrage. But the additional identity column has servedme well so far.
|||

Another typical counter-identity worry is that as they auto-inc then all the new data 'bottlenecks' at the end of the indexes (when your identity is involved in an index). Although that can work for you depending upon your solution.

|||I'm leaning toward having the additiona identity field. If I do so I think I would make it non-clustered since I would never be doing any joins on i and it would never been used in any queries.
I have heard pros and cons for having the identity remain Clustered but I think it would best serve me to move it to one of the other columns which is less unique. Or maybe only have the PK as a clustered index and no other indexes on the table since there is only 3 columns total and the optimizer may prefer index scans vs. index seek anyway.|||

mru22 wrote:

I would never be doing any joins on i and it would never been used in any queries.


So then what will be using the field for?
|||

It would just be used to have the PK represent a single column vs. multi column. No other reason, that's why I was contemplating using it in the first place. Typically I have not when I have "many to many" tables.
So instead i would just have the two keys, one from each of the parent tables. With that being said, I notice that the OR Mappers seem to work more easily when you define a single column primary key and if I end up going that route then I may consider the single column pk.

|||

mru22 wrote:

With that being said, I notice that the OR Mappersseem to work more easily when you define a single column primary keyand if I end up going that route then I may consider the single columnpk.


In that case, you would indeed be using the field in queries, behindthe scenes. Adding a field which would never be used orreferenced of course would make no sense, that's why I questioned you.:-)
|||

mru22 wrote:

Normally when I have a "Many-toMany" or linkage table where the primary key consists of a foreign key from two different tables, I do not bother to make a separate identiy column instead.
Does anyone see a reason why an identity column would be more or less desireable ?


The fact that you're asking this question shows that you have a total lack of understanding on relational databases. You need to go to back to the fundamentals. Actually, I'm guessing you never started at the fundamentals and just dived right into SQL Server and started programming. That's OK, a lot of people do that.
If you are being paid to program, you have a professional obligation to do it right instead of making it up as you go along. I'm sure you wouldn't be too happy if all your electrician had a multimeter, wire trimmers, and eletrical tape but didn't know his Volts from his Watts.
Go pick up Date's INTRODUCTION TO DATABASE SYSTEMS. You can get an old edition (it's a text book and doesn't change too much) for $5 or so.|||

Sorry but I do have a complete understanding of databases. And no I did not just pick SQL server and start programming. Your insults are not only a waste of posting to this board by providing no true value to the discussion but that's ok because there will always be people like you.
As far as this thread why do you go ahead and explain which one you prefer and why or would you rather reply with another insult ?

|||

tmorton wrote:

mru22 wrote:

With that being said, I notice that the OR Mappers seem to work more easily when you define a single column primary key and if I end up going that route then I may consider the single column pk.


In that case, you would indeed be using the field in queries, behind the scenes. Adding a field which would never be used or referenced of course would make no sense, that's why I questioned you. :-)


I agree completely and again that is how I normally do it. But I have seen others add the identity column so I just wondered if there was some underlying reason for doing so. I think not in this case other than they wanted to have a one column key.|||I suspect it is as simple as wanting a one column key 'cause it makes using things like object dictionaries far easier when you've only got to worry about one value. I know I do that.|||

mru22 wrote:

Sorry but I do have a complete understanding of databases. And no I did not just pick SQL server and start programming. Your insults are not only a waste of posting to this board by providing no true value to the discussion but that's ok because there will always be people like you.
As far as this thread why do you go ahead and explain which one you prefer and why or would you rather reply with another insult ?


A "complete understanding of databases"? Perhaps you mean CODASYL databases? Or XML databases? Certainly not relational.
In the relational world, we don't do things like add artificial pointers to relations. In fact, that's a fundamental concept of relational databases: no pointers. The whole point of relational databases is to use values on their own to key and relate. Not meaningless pointers.
Your sample schema shows you don't get that, and therefore lack an understanding of that fundamental concept. CarID? What on earth is that? The standard that the rest of the world is VIN. It's right there, physically stamped on every car. In multiple places. It cannot be changed.
Ditto for PersonID. First off, the name "Person" is a very poor naming choice for an individual: companies have Employees and Customers/Clients, schools have Faculty and Students. No organization has generic "Persons." When you model what the actual data is, you find that they already have identifiers there for you. Employee_Num or SSN. Student_Num. Etc. No need to use an artificial pointer.
As I stated earlier, you do not understand these fundamental concepts of relational databases. If you did, you would never need to ask such a question. But what's worse is that you *think* you do.
I do not intend to be insulting. Your question is tantamount to someone asking on forums.carpentry.net: "I normally use a hacksaw when I'm building stuff for my customers ... but what tool do you guys think I should use? A hacksaw or a mitre box? I'm trying to pound in a nail ..."|||Alex makes some fair points, however, I'm sure he'll agree that even pure database design will conceed to the practical natures of a solution, for example de-normalised data. What I'm poining out is that there is *never* a abs. correct answer, the only true answer is, "it depends". However, I do think it would be well worth your time to invest in an introduction to relational database design, at least when you take the decision to use an artifical key you'll know the reasons why ;)
|||So you are saying in Relational Databases meaningless keys such as Identity Columns are bad? VIN For example includes characters thus cannot be an integer or numeric data type.
I would not want VIN as the PK. I typically avoid varchar or char datatypes as primary keys. And yes I typically use identity columns as Primary Keys
In your carID example I would Instead put VIN as unique and would add an Identity column as the PK. Furthermore for anytable that had the a reference to the Car table, It would store a smaller more efficient integer as opposed to varchar. I know that storage isn't much of an issue today as say 10-20 years ago but I still find that a more efficient design.
Plus what if the vin number was incorrect and had to be changed to another unique vin number? This way the change would not have to propogate through any tables that referenced car other than the car table. Same with Employee SSN. While that would be numeric by removing dashes I still prefer a meaningless key.
In your person example you indicated taht What kind of name is person ? Well Employee is a person, Student is a person. You would either have a type table to know which is which but what happens is a person can be both an employee and student and you need to differentiate ? You sure are not going to make two records, so either you have many-to-many table between Person and Person Type or you would have Person and then Employee and Student tables where the PK from the Person is also the PK in the Employee and student table. I would do this if there many attributes that they did not share in common otherwise having the many to many with types would make more sense if they shared all other attributes in common.
I understand that many do not like synthetic keys but I often prefer to keep my PK meaningless when Possible. I do understand the arguments against it but do not always agree especially when using non-integer values as the Primary key.

No comments:

Post a Comment