It is too often that developers use the wrong data types in their databases. As queries to databases take a major amount of time in any application, it is essential that the database and its tables are as optimised as possible to ensure the best efficiency, especially in high-demand applications.
In MySQL, the most simple optimisation is to select the correct, most appropriate, and therefore most efficient data types. But many MySQL developers, and other database developers, sometimes fail to do this; non-MySQL developers - such as PHP programmers who use MySQL as just a necessity for their database-driven website - often fail miserably.
To help you understand the data types available, and hopefully use them correctly, I have written about some of the most common - the ones which you probably use every day:
SIGNED or UNSIGNED?
For people who don't know what signed or unsigned is, I'll give a brief explanation:
With signed integers, one bit is reserved so the numbers can be made negative. With unsigned, a bit is not reserved and numbers can only be positive. If you had a SIGNED INT, the value can be between -2,147,483,648 and 2,147,483,647. But, with an UNSIGNED INT, the value can range from 0 to 4,294,967,295.
If you will never use a negative number in a field, then you should use UNSIGNED. If you have an auto-incrementing primary key, then you'd want to use UNSIGNED because it allows for more rows.
Too big?
You should think carefully before creating fields. If you have a 'users' table on a small website, with UNSIGNED INT as the primary key, then ask yourself, "Will I ever get 4,294,967,295 users?" The most likely answer to this question is no. You've got to think about what the highest value is ever likely to be. Then, select the right numeric data type which can cover that amount. I've made a little numeric data type table with the minimum and maximum values:
| Type |
Minimum |
Maximum |
| SIGNED TINYINT |
-128 |
127 |
| UNSIGNED TINYINT |
0 |
255 |
| SIGNED SMALLINT |
-32,768 |
32,767 |
| UNSIGNED SMALLINT |
0 |
65,535 |
| SIGNED MEDIUMINT |
-8,388,608 |
8,388,607 |
| UNSIGNED MEDIUMINT |
0 |
16,777,215 |
| SIGNED INT |
-2,147,483,648 |
2,147,483,647 |
| UNSIGNED INT |
0 |
4,294,967,295 |
| SIGNED BIGINT |
-9,223,372,036,854,775,808 |
9,223,372,036,854,775,807 |
| UNSIGNED BIGINT |
0 |
18,446,744,073,709,551,615 |
What's the difference between CHAR and VARCHAR?
The difference between CHAR and VARCHAR is that CHAR is fixed length, while VARCHAR is variable length. This means that CHAR is always the same size and takes up the same amount of bytes, while VARCHAR varies. VARCHAR also adds an extra byte. When should you use CHAR?
To store MD5 passwords, which are always 32 characters long (when stored in hexadecimal form; it may be better to store them as 16 bytes in a BINARY field?).
To store same-length barcodes.
When should you use VARCHAR?
To store email addresses, which vary in size
To store peoples names, which vary in size
You'd usually want to use VARCHAR when storing most user input, because their input can vary in size.
Store as numbers
When possible, you should store data in numbers, because it can be searched and ordered more quickly. For example, IP address can be stored like this:
123.123.123.123
using VARCHAR(15). If you first translated it to an IPv4 Internet network address (in PHP, you can use the ip2long() function), you could store it as:
2071690107
using a SIGNED INT.
Choosing the right date/time type
In MySQL, there are 5 different date and time types:
DATE
TIME
DATETIME
TIMESTAMP
YEAR
DATE gives the date to the nearest day, TIME gives the time (but not the date) to the nearest second, DATETIME gives the date and time to the nearest second, TIMESTAMP gives the number of seconds since epoch, and YEAR gives the date to the nearest year.
YEAR only requires 1 byte of storage, so is the best option if you only need to know the year.
DATE and TIME are the next smallest type of date/time storage, requiring just 3 bytes.
TIMESTAMP requires 4 bytes.
DATETIME requires 8 bytes.
In almost any case, it is probably best to use TIMESTAMP, which gives the date and time to the nearest second.
__('Read the rest of this entry »')