|
There are limits to spreadsheets, that’s why databases work better.
Spreadsheets are quick and easy to create, but they have limits, sometimes these limits have consequences. In the UK, for example, over 15,000 people infected with COVID-19 were left out of infection counts and contact tracing efforts because the spreadsheet used to track infections ran out of space.
The BBC reported: developers picked an old file format to do this—known as XLS. As a consequence, each template could handle only about 65,000 rows of data rather than the one million-plus rows that Excel is actually capable of.
This example is extreme and your business may not be working on anything this wide-reaching. But spreadsheets have inherent limits, and that is worth thinking about – especially if you are using a spreadsheet as a sort of improvised database.
According to Microsoft, Excel limits the total number of rows in a spreadsheet to 1,048,576 and columns to 16,384. And there are a few other maximums to keep in mind, some of them weirdly specific.
Google Sheets also has limits: a spreadsheet can have five million cells, total, and you cannot import cells with more than 50,000 characters.
Most spreadsheets you create will never come close to these numbers. But some will, and it’s probably a good idea to think about switching to a database long before that for all sorts of reasons.
When you open a spreadsheet, you will see everything on the current worksheet. This is great until your dataset grows too much, and your computer grinds to a halt. You may have a bunch of formulas and scripts running, and unless you have a super-charged computer, it is likely to grind to a halt long before hitting the limits mentioned.
With Databases, it all happens behind the scenes as this is where the data is stored. There is no listening to your laptop fan’s whirring, as Databases only load your specific request. At the point that you struggle to valuable data from your spreadsheet this is the time you should probably set up a database.
Spreadsheets offer some filtering and querying, but there are limitations. Seeing data in a particular order, for example, typically requires making changes to the sheet itself.
However, Databases, allow for specific and fast querying and reporting without actually affecting the data. If you need to arrange data in different ways, on a regular basis, a database is a far better choice.
When a column is first added to a spreadsheet, you can type whatever you want wherever you want. Spreadsheets do not offer data validation by default, therefore leaving room for error.
For example, if you are recording someone’s age, unless you’ve anticipated this and set that column up with validation, you could type anything in the column, and nothing would stop you.
Databases, meanwhile, allow for data integrity; the system checks whether data you have entered makes, and flag potential issues; this is built into the setup itself. When you add a column to a database, you include the expected data type and limitations along with the column name. In the case of age, this means verifying the data is an actual number – avoiding lot of headaches in the future.
Spreadsheets are static meaning that every spreadsheet, and each worksheet, are more-or-less separate from one another. You can change this with formulas or by drawing information from other places, but that doesn’t change the fact that these are separate entities.
Databases are usually relational by default; if you were to update one piece of data in one area, it will automatically update everywhere else.
Please don’t presume spreadsheets are bad; they’re not! They just aren’t an alternative to an actual database. If the UK government had used a database instead of a spreadsheet to track COVID cases, they probably would not have run into the problems. Don’t let the same happen to your business.
Do you work with a lot of data and need to make regular connections? If so, it’s a good idea to look into setting up a database.
Date: February 22, 2021
Author: Morris