Experience in database design
A successful management system is [50% business + 50% software], and the 50% that is successful software is in turn [25% database + 25% program], so the quality of the database design is key. If enterprise data is the blood that keeps the business alive, then database design is the most important part of the application. There is plenty of material on database design, and university degree programs offer dedicated courses on it. However, as we have repeatedly stressed, no teacher, however good, can match the lessons of experience. So I have summed up the detours and lessons of my own years of work, and sought out accomplished database design professionals on the Internet to share their skills and experience. I selected 60 of the best tips and wrote them up in this article. For ease of reference, the content is divided into five parts:
Part 1-Before designing the database
This part lists 12 basic tips, including naming conventions and defining business requirements.
Part 2-Designing database tables
There are 24 guiding tips, covering the design of fields in tables and common problems to avoid.
Part 3-Selecting keys
How should you select keys? Here are 10 tips on the correct use of system-generated primary keys, and on when and how to index fields for the best performance.
Part 4-Ensuring data integrity
This part discusses how to keep the database clean and robust and how to minimize harmful data.
Part 5-Various tips
Many other tips do not fit into the four parts above. I hope they make your database development work easier.
Part 1-Before designing the database
Examine the existing environment
When designing a new database, you should carefully study not only the business requirements but also the existing systems. Most database projects are not built from scratch; there is usually an existing system in the organization that meets specific needs (though it may not be automated). Obviously the existing system is not perfect, otherwise you would not need to build a new one. But studying the old system can reveal subtle problems that might otherwise be overlooked. Generally speaking, examining the existing system can only help you.
Define a standard naming convention for objects
Be sure to define a naming convention for database objects. For tables, decide at the start of the project whether table names are plural or singular. Also define simple rules for table aliases: for example, if the table name is one word, the alias is the first four letters of that word; if the table name is two words, take the first two letters of each word to form a four-letter alias; if the name consists of three words, take one letter from each of the first two words and two letters from the last word, which still gives a four-letter alias, and so on. For a work table, prefix the table name with WORK_ followed by the name of the application that uses the table. Columns [fields] should follow a set of design rules for keys. For example, if the key is numeric, use _N as the suffix; if it is a character type, use the _C suffix. Use standard prefixes and suffixes for column [field] names as well. For example, if your tables have many "money" fields, add a _M suffix to each such column [field]. Likewise, date columns [fields] are best given a D_ prefix.
Check the naming conventions across table names, report names and query names. You may quickly become confused by the names of these different database elements. If you insist on giving these different components the same base names, at least use prefixes such as Table, Query or Report at the start of the object names to tell them apart.
If you use Microsoft Access, you can use tags such as qry, rpt, tbl and mod to identify objects (for example tbl_Employees). I also use tbl to identify tables when working with SQL Server, but I use sp_company (now sp_feft_) to identify stored procedures, because I sometimes keep several copies when I find a better way to handle something. When I moved to SQL Server 2000, I used udf_ (or a similar tag) to identify the functions I wrote.
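The four-letter alias rule described above is mechanical enough to automate. The following is a minimal sketch only: the function name table_alias is hypothetical, it assumes the words of a table name are separated by underscores or spaces, and it applies the three-word rule to any longer name as well, which the text leaves open with "and so on".

```python
import re


def table_alias(table_name: str) -> str:
    """Build a four-letter alias from a table name (hypothetical helper)."""
    # Assumption: words are separated by underscores or whitespace.
    words = [w for w in re.split(r"[_\s]+", table_name) if w]
    if not words:
        raise ValueError("table name must contain at least one word")
    if len(words) == 1:
        # One word: first four letters.
        alias = words[0][:4]
    elif len(words) == 2:
        # Two words: first two letters of each word.
        alias = words[0][:2] + words[1][:2]
    else:
        # Three words (assumed here to cover longer names too): one letter
        # from each of the first two words, two letters from the last word.
        alias = words[0][:1] + words[1][:1] + words[-1][:2]
    return alias.upper()


if __name__ == "__main__":
    for name in ("employees", "order_items", "customer_order_lines"):
        print(name, "->", table_alias(name))
```

Words shorter than the number of letters requested simply yield a shorter alias; a real convention would need a rule for that case and for alias collisions.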
If a worker wants to do a good job, he must first sharpen his tools
Use a good database design tool, such as PowerDesigner from Sybase, which supports languages such as PB, VB and Delphi and can connect through ODBC to more than 30 popular databases, including dBase, FoxPro, VFP and SQL Server. In later installments I will focus on the use of PowerDesigner.
Get The Data Model Resource Book
Anyone looking for sample models can read "The Data Model Resource Book" by Len Silverston, W. H. Inmon and Kent Graziano, a data modeling book well worth owning. It includes chapters covering a variety of data areas, such as people, organizations and work effort. You can also refer to other related books.
Think about the future, but don't forget the lessons of the past
I find it very useful to ask users how they expect requirements to change in the future. This serves two purposes: first, you learn where the application design should be more flexible and how to avoid performance bottlenecks; second, you know that when a requirement change arrives that was not identified in advance, users will be as surprised as you are.
Be sure to remember the experience and lessons of the past! We developers should also help each other by sharing our experience and insights. Even if users think they no longer need any support, we should educate them on this point. We have all faced the "if only we had done this ..." moment.
Do logical design before physical design
Complete the logical design before going deep into the physical design. With the large number of CASE tools now available, your design can reach a fairly high level of logical abstraction, and you can usually understand every aspect of the database design better as a whole.
Know your business
Do not add a single table to your ER (entity relationship) model until you are 100% sure that the system meets its needs from the customer's perspective (what, you don't have a model yet? Then please refer to tip 9). Knowing your business can save a lot of time in the later stages of development. Once you understand the business requirements, you can make many decisions yourself.
Once you think you have defined the business content, have a systematic conversation with your customers. Use the customer's terminology and explain what you have in mind and what you have heard. Express the relationship cardinality of the system in words such as may, will and must. This lets your customers correct your understanding before you move on to the ER design.
Create a data dictionary and an ER diagram
Be sure to take the time to create an ER diagram and a data dictionary. They should at least include the data type of each field and the primary and foreign keys of each table. Creating them is time-consuming, but it is essential if other developers are to understand the whole design. The earlier they are created, the more they help to avoid confusion later, so that anyone familiar with databases can see clearly how to get data out of this one.
The importance of up-to-date documentation cannot be overemphasized: the ER diagram shows the relationships between tables, while the data dictionary explains the purpose of each field and any aliases it may have. This is essential for documenting SQL expressions.
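Part of a data dictionary can be generated from the database itself, which helps keep it up to date. The sketch below is only an illustration: it assumes SQLite (through Python's standard sqlite3 module) rather than any product named in this article, and the customer and invoice tables and the build_data_dictionary function are hypothetical. It lists each field's data type and the primary and foreign keys of each table; the purpose of each field still has to be written by hand.

```python
import sqlite3


def build_data_dictionary(conn):
    """Return {table: {"columns": [...], "foreign_keys": [...]}} for user tables."""
    dictionary = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        columns = [
            {"name": name, "type": col_type, "primary_key": bool(pk)}
            for _cid, name, col_type, _notnull, _default, pk
            in conn.execute(f"PRAGMA table_info('{table}')")
        ]
        # PRAGMA foreign_key_list yields: id, seq, table, from, to, ...
        foreign_keys = [
            {"column": row[3], "references": f"{row[2]}.{row[4]}"}
            for row in conn.execute(f"PRAGMA foreign_key_list('{table}')")
        ]
        dictionary[table] = {"columns": columns, "foreign_keys": foreign_keys}
    return dictionary


# Hypothetical two-table schema used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE invoice (
        invoice_id  INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        total       NUMERIC
    );
""")
data_dictionary = build_data_dictionary(conn)

if __name__ == "__main__":
    for table, entry in data_dictionary.items():
        print(table, entry)
```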
Create schema diagrams
A diagram is worth a thousand words: developers should not only read and implement it, but also use it in conversations with users. A schema diagram improves the efficiency of collaboration, so major problems are very unlikely to arise in the early database design. The diagram does not have to be complicated; it can be as simple as a sketch on a piece of paper. Just make sure that the logical relationships it captures will pay off in the future.
Start with input and output
When defining database table and field requirements (inputs), first check the existing or planned reports, queries and views (outputs) to determine which tables and fields are needed to support them. A simple example: if a customer needs a report that sorts, groups and totals by postal code, make sure there is a separate postal code field rather than burying the postal code in the address field.
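The postal code example can be sketched as follows. This is an illustration under assumptions: it uses SQLite through Python's sqlite3 module, and the customer_order table, its columns and the sample rows are all hypothetical. Because the postal code has its own column, the report reduces to a plain GROUP BY with no string parsing of the address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        street      TEXT NOT NULL,
        postal_code TEXT NOT NULL,   -- kept apart from the street address
        amount      NUMERIC NOT NULL
    );
    INSERT INTO customer_order (street, postal_code, amount) VALUES
        ('1 Main St', '10001', 50),
        ('9 Oak Ave', '10001', 25),
        ('4 Elm Rd',  '94105', 40);
""")

# Sort, group and total by postal code, as the report requires.
totals = conn.execute("""
    SELECT postal_code, COUNT(*), SUM(amount)
    FROM customer_order
    GROUP BY postal_code
    ORDER BY postal_code
""").fetchall()

if __name__ == "__main__":
    for postal_code, order_count, total_amount in totals:
        print(postal_code, order_count, total_amount)
```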
Reporting tips
Find out how users usually report on data: in batches or online? Daily, weekly, monthly, quarterly or yearly? Consider creating summary tables if necessary. System-generated primary keys are hard to manage in reports. When users search by a secondary key in tables with system-generated primary keys, they often get back a lot of duplicate data. Such retrieval performs poorly and easily causes confusion.
Understand customer needs
This may seem obvious, but requirements come from customers (both internal and external). Do not rely on requirements that users have written down; the real requirements are in the customer's head. Ask customers to explain their requirements, and as development proceeds, keep checking with them that those requirements are still what the development is aiming at. It is an unchanging truth that "I don't know what I want until I see it", and this inevitably leads to a lot of rework, because the database fails to meet requirements that customers never wrote down. Worse still, your interpretation of their needs is yours alone, and it may be completely wrong.
The other four parts are to be continued ...
Experience in database design (2)
The previous part introduced the first 12 basic tips for designing a database, including naming conventions and defining business requirements (see Experience in database design [1]). This second part introduces 24 guiding tips for designing database tables, covering the design of fields in tables and common problems to avoid.
Part 2-Designing tables and fields
Check for changes
When I design a database, I consider which data fields may change in the future. Surnames are one example (in Western countries, for instance, a woman may take her husband's surname after marriage). So when building a system that stores customer information, I tend to store the last name in a separate table with fields such as start date and end date, so that changes to this data item can be tracked.
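A separate last-name table with start and end dates might look like the sketch below. It assumes SQLite through Python's sqlite3 module and ISO-formatted date strings; the table names, column names and the two helper functions are hypothetical and only illustrate the idea of tracking a changing field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        first_name  TEXT NOT NULL
    );
    CREATE TABLE customer_last_name (
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        last_name   TEXT NOT NULL,
        start_date  TEXT NOT NULL,   -- ISO date, inclusive
        end_date    TEXT,            -- ISO date, exclusive; NULL = current
        PRIMARY KEY (customer_id, start_date)
    );
""")


def change_last_name(conn, customer_id, new_last_name, effective_date):
    """Close the current name (if any) and open a new one."""
    conn.execute(
        "UPDATE customer_last_name SET end_date = ? "
        "WHERE customer_id = ? AND end_date IS NULL",
        (effective_date, customer_id),
    )
    conn.execute(
        "INSERT INTO customer_last_name (customer_id, last_name, start_date) "
        "VALUES (?, ?, ?)",
        (customer_id, new_last_name, effective_date),
    )


def last_name_on(conn, customer_id, on_date):
    """Return the last name that was valid on the given date, or None."""
    row = conn.execute(
        "SELECT last_name FROM customer_last_name "
        "WHERE customer_id = ? AND start_date <= ? "
        "AND (end_date IS NULL OR end_date > ?)",
        (customer_id, on_date, on_date),
    ).fetchone()
    return row[0] if row else None


conn.execute("INSERT INTO customer (customer_id, first_name) VALUES (1, 'Anna')")
change_last_name(conn, 1, "Smith", "2001-01-01")
change_last_name(conn, 1, "Jones", "2005-06-15")

if __name__ == "__main__":
    print(last_name_on(conn, 1, "2003-03-03"))
    print(last_name_on(conn, 1, "2010-01-01"))
```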
Use meaningful field names
I once worked on a project that included a program inherited from another programmer. She liked to name fields after the data prompts displayed on screen, which was not bad in itself, but unfortunately she also liked some strange naming schemes that combined Hungarian notation with control sequence numbers, such as cbo1, txt2, txt2_b and so on.
Unless you are using a system of abbreviated field names that only you will work with, describe the fields as clearly as possible. Of course, do not overdo it: Customer_Shipping_Address_Street_Line_1 is very descriptive, but no one wants to type such a long name. Where to draw the line is up to you.
Use prefixes in names
If many fields of the same type appear in multiple tables (such as FirstName), consider prefixing them with an abbreviation of the specific table (such as CusLastName) to help identify them.
Time-sensitive data should include a "Last Updated Date/Time" field. Timestamps are particularly useful for tracing the cause of data problems, reprocessing or reloading data by date, and purging old data.
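The uses of a "Last Updated" field mentioned above can be sketched as follows. This is an illustration under assumptions: SQLite through Python's sqlite3 module, a hypothetical product table, and timestamps passed in explicitly as ISO 8601 strings so the example is repeatable. In practice the timestamp would more likely come from a column default or a trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (
        product_id   INTEGER PRIMARY KEY,
        price        NUMERIC NOT NULL,
        last_updated TEXT NOT NULL    -- ISO 8601 timestamp of the latest change
    );
""")


def save_price(conn, product_id, price, now):
    """Insert or update a price and stamp the row with the time of the change."""
    updated = conn.execute(
        "UPDATE product SET price = ?, last_updated = ? WHERE product_id = ?",
        (price, now, product_id),
    ).rowcount
    if updated == 0:
        conn.execute(
            "INSERT INTO product (product_id, price, last_updated) VALUES (?, ?, ?)",
            (product_id, price, now),
        )


def changed_since(conn, cutoff):
    """IDs of rows touched at or after the cutoff (for reprocessing by date)."""
    rows = conn.execute(
        "SELECT product_id FROM product WHERE last_updated >= ? ORDER BY product_id",
        (cutoff,),
    )
    return [row[0] for row in rows]


def purge_older_than(conn, cutoff):
    """Delete rows not touched since the cutoff; return how many were removed."""
    return conn.execute(
        "DELETE FROM product WHERE last_updated < ?", (cutoff,)
    ).rowcount


save_price(conn, 1, 9.99, "2024-01-05T10:00:00")
save_price(conn, 2, 4.50, "2024-03-01T08:30:00")
save_price(conn, 1, 10.49, "2024-03-02T09:00:00")

if __name__ == "__main__":
    print(changed_since(conn, "2024-03-01T00:00:00"))
```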
Standardization and data-driven design
Standardizing data is convenient not only for yourself but also for others. For example, if your user interface needs to access external data sources (files, XML documents, other databases, etc.), store the corresponding connection and path information in a user interface support table. Likewise, if the user interface performs workflow-like tasks (sending mail, printing stationery, changing record status, etc.), the data that drives the workflow can also be stored in the database. Planning ahead always takes effort, but if these processes are data-driven rather than hard-coded, policy changes and maintenance become much easier. In fact, if a process is data-driven, you can hand considerable responsibility to the users, who then maintain their own workflows.
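A data-driven workflow of the kind described above can be sketched as a table of steps that the program looks up at run time. Everything here is hypothetical and for illustration only: the workflow_step table, the event names and the two actions, which merely append to a log instead of sending mail or updating records. SQLite through Python's sqlite3 module is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE workflow_step (
        event      TEXT NOT NULL,
        step_order INTEGER NOT NULL,
        action     TEXT NOT NULL,
        argument   TEXT NOT NULL,
        PRIMARY KEY (event, step_order)
    );
    INSERT INTO workflow_step VALUES
        ('order_shipped',   1, 'set_status', 'SHIPPED'),
        ('order_shipped',   2, 'send_mail',  'customer'),
        ('order_cancelled', 1, 'set_status', 'CANCELLED');
""")


def set_status(log, argument):
    # Stand-in for changing a record's status.
    log.append(f"status -> {argument}")


def send_mail(log, argument):
    # Stand-in for sending a notification mail.
    log.append(f"mail -> {argument}")


# The only hard-coded part: which action names the program knows how to run.
ACTIONS = {"set_status": set_status, "send_mail": send_mail}


def run_workflow(conn, event):
    """Run the steps stored for an event, in order, and return what was done."""
    log = []
    steps = conn.execute(
        "SELECT action, argument FROM workflow_step "
        "WHERE event = ? ORDER BY step_order",
        (event,),
    ).fetchall()
    for action, argument in steps:
        ACTIONS[action](log, argument)
    return log


if __name__ == "__main__":
    print(run_workflow(conn, "order_shipped"))
```

Changing the policy, for example mailing the sales team when an order is cancelled, then means inserting a row into workflow_step rather than editing and redeploying code.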
Do not overdo normalization
For those unfamiliar with the term, normalization ensures that the fields in a table hold the most basic elements, which helps to eliminate data redundancy in the database. There are several normal forms, but Third Normal Form (3NF) is generally considered the best balance between performance, scalability and data integrity. Put simply, 3NF requires:
* Each value in the table is represented only once.
* Each row in the table is uniquely identified (by a unique key).
* Non-key information that depends on other keys is not stored in the table.
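As a small illustration of these rules, the sketch below keeps the customer's city in the customer table only, instead of repeating it on every order. The tables and sample data are hypothetical, and SQLite through Python's sqlite3 module is assumed. Because the city is stored once, a single update is reflected in every order that joins to that customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The city depends on the customer, not on the order,
    -- so it is stored once, in the customer table.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      NUMERIC NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Acme Ltd', 'Leeds');
    INSERT INTO customer_order VALUES (100, 1, 250), (101, 1, 75);
""")

# The customer moves: one row changes, and no order row needs touching.
conn.execute("UPDATE customer SET city = 'York' WHERE customer_id = 1")

order_cities = conn.execute("""
    SELECT o.order_id, c.city
    FROM customer_order AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

if __name__ == "__main__":
    print(order_cities)
```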
data complying with 3NF standard.