It was during a benchmark that I discovered just how wrong I was. In the early days of these databases, it was common for the vendors to benchmark for really large procurements to see who could do the work the fastest, the easiest, and with the most features.
The benchmark was between Informix, Sybase SQL Server, and Oracle. Oracle went first. Their technical people came on-site, read through the benchmark specs, and started setting it up. The first thing I noticed was that the technicians from Oracle were going to use a database table to record their timings, even though we were going to have many dozens of connections doing work, each of which would frequently need to insert and update data in this log table. Not only that, but they were going to read the log table during the benchmark as well! Being a nice guy, I pulled one of the Oracle technicians aside to ask him if they were crazy.
Why would they purposely introduce another point of contention into the system? Wouldn’t the benchmark processes all tend to serialize around their operations on this single table? Would they jam the benchmark by trying to read from this table as others were heavily modifying it? Why would they want to introduce all of these extra locks that they would need to manage? I had dozens of “Why would you even consider that?” type of questions. The technical folks from Oracle thought I was a little daft at that point.
That is until I pulled up a window into either Sybase SQL Server or Informix and showed them the effects of two people inserting into a table or someone trying to query a table with others inserting rows (the query returns zero rows per second). The differences between the way Oracle does it and the way almost every other database does it are phenomenal—they are night and day.
Needless to say, neither the Informix nor the SQL Server technicians were too keen on the database log table approach during their attempts. They preferred to record their timings to flat files in the operating system. The Oracle people left with a better understanding of exactly how to compete against Sybase SQL Server and Informix: just ask the audience “How many rows per second does your current database return when data is locked?” and take it from there.
The moral to this story is twofold. First, all databases are fundamentally different. Second, when designing an application for a new database platform, you must make no assumptions about how that database works. You must approach each new database as if you had never used a database before. Things you would do in one database are either not necessary or simply won’t work in another database.
In Oracle, you will learn that
•\ Transactions are what databases are all about. They are a good thing.
•\ You should defer committing until the correct moment. You should not do it quickly to avoid stressing the system, as it does not stress the system to have long or large transactions. The rule is commit when you must, and not before. Your transactions should only be as small or as large as your business logic dictates. (Interesting side note: I just wrote that rule for committing—“when you must, and not before”— on http://asktom.oracle.com just this morning, probably for the millionth time. Some things never change.)
•\ You should hold locks on data as long as you need to. They are tools for you to use, not things to be avoided. Locks are not a scarce resource. Conversely, you should hold locks on data only as long as you need to. Locks may not be scarce, but they can prevent other sessions from modifying information.
•\ There is no overhead involved with row-level locking in Oracle—none. Whether you have 1 row lock or 1,000,000 row locks, the number of resources dedicated to locking this information will be the same. Sure, you’ll do a lot more work modifying 1,000,000 rows rather than 1 row, but the number of resources needed to lock 1,000,000 rows is the same as for 1 row; it is a fixed constant.
•\ You should never escalate a lock (e.g., use a table lock instead of rowlocks) because it would be “better for the system.” In Oracle, it won’t be better for the system—it will save no resources. There are times to use table locks, such as in a batch process, when you know you will update the entire table and you do not want other sessions to lock rows on you. But you are not using a table lock to make it easier for the system by avoiding having to allocate row locks; you are using a table lock to ensure you can gain access to all of the resources your batch program needs in this case.
•\ Concurrency and consistency can be achieved simultaneously. You can get it fast and correct, every time. Readers of data are not blocked by writers of data. Writers of data are not blocked by readers of data. This is one of the fundamental differences between Oracle and most other relational databases.
As we cover the remaining components in this chapter and the next, I’ll reinforce these points.
Leave a Reply