This article is based on TPC-C 5.11
We provide a brief summary of TPC-C, though a reader is encouraged to read the TPC-C specification first. TPC-C simulates the database of a wholesale company. It includes a number of warehouses, each maintaining stocks for 100,000 items and covering 10 districts with 3,000 customers in each. It consists of nine tables (WAREHOUSE, DISTRICT, CUSTOMER, HISTORY, ORDER, NEW-ORDER, ORDER-LINE, STOCK, and ITEM) and five types of transactions:
The TPC-C specification further specifies the distribution of these five types of transactions: New-Order 45%, Payment 43%, Order-Status 4%, Delivery 4%, and Stock-Level 4%. Because of randomness in the workload, a small deviation from such distribution is allowed.
TPC-C adds a wait time, including a keying time and a think time, before each transaction to simulate the user’s behavior of typing keyboard and thinking before making a decision. Each warehouse has ten terminals, one for each district, which means each warehouse can have at most ten concurrent transactions. Note that vanilla TPC-C does not allow tuning of these parameters, except the number of warehouses. However, research prototypes often tune some of them and we analyze the reasons and effects of such tuning in this section.
A database transaction usually contains multiple SQL statements. There are two typical ways to store and run a multi-statement transaction: 1) The developer can predefine the whole transaction, including all its statements, as a "stored procedure" on the database side. A stored procedure looks like a function in most programming languages: it can take arguments from the caller; it can use IF or LOOP statements internally to control what SQL statements to run and how many iterations to run. A client, which is typically a web application, can call a stored procedure by giving the name and the arguments of the stored procedure, much like calling a function; 2) A client can send SQL statements one by one to the database through a network connection. This approach is often called "interactive transactions", because the client interacts with the database to determine what to do next. In this approach, the logic (e.g., what statements to run and how many iterations, etc) is performed at the client side.
The stored procedure mode has an obvious performance benefit. When a transaction contains multiple statements, a stored procedure needs only two network I/Os, one from the client to the database server to start the stored procedure and one from the database server to the client to send the reply. Interactive transactions, however, need two network I/Os for each SQL statement, which incurs a significantly higher network overhead. Furthermore, for a stored procedure, since all statements and logic are predefined, it may be possible to statically analyze its code to find optimization opportunities---this is widely used in research papers, but such static analysis is much harder, if not impossible, for interactive transactions.
The TPC-C specification does not require to use stored procedures or interactive transactions, although it presents a sample implementation in stored procedures in its appendix. For the performance benefit mentioned above, most industrial companies which submit their scores for TPC-C use stored procedures; for the static analysis opportunities mentioned above, many research works use stored procedures as well. However, multiple studies have shown that, in practice, database administrators are often reluctant to use stored procedures and many real-world applications use interactive transactions or a mix of stored procedures and interactive transactions.
TPC-C adds a wait time (i.e, keying time and think time) before each transaction to simulate a customer's effort to type keyboards and to think before making a decision. Wait time is required by the TPC-C standard, so if you want your TPC-C results to be accepted by TPC, you cannot remove wait time. However, many research works do remove wait time for the reason explained below:
The existence of wait time, combined with the restriction that each warehouse can have only ten concurrent users, means that the maximal throughput we can achieve on each warehouse is limited. To be precise, we can compute the maximal throughput per warehouse as follows: given the distribution of the five types of transactions and the average keying and think time for each type of transaction (i.e., Section 220.127.116.11 in the TPC-C specification version 5.11.0 ), we can compute the average wait time per transaction as (18+12) ×45%+ (3+12) ×43%+ (2+10) ×4%+ (2+5) ×4%+ (2+ 5) × 4% ≈ 21 seconds. This means the average throughput per user is about 1 21 ≈ 0.048 transactions/second. Considering we can have 10 concurrent users per warehouse, the maximal throughput we can achieve on a single warehouse is 0.48 transactions/second.
This limit is very low and there is no way to improve that, considering wait time is required by TPC-C. Therefore, to gain a high throughput, an experimenter has to use a large number of warehouses: the top TPC-C results often use millions of warehouses or more. As explained in the next section, such a large number of warehouses usually result in an I/O intensive and low-contention workload.
Research works, however, often have other goals. Some try to address the contentions in a multi-core in-memory database. Some try to reduce the contentions in distributed transactions; etc. For these goals, the vanilla TPC-C, with its wait time, is not suitable. Therefore, these works remove the wait time from TPC-C so that they can gain a high throughput with a small number of warehouses, which result in a memory intensive and high-contention workload. Of course, if a research work targets optimizing I/O performance, then keeping the wait time is a natural choice.
It is not the purpose of this article to determine whether wait time is realistic or not. At the very least, TPC-C numbers with wait time (i.e., numbers published on TPC-C webiste) and numbers without wait time (i.e., numbers in many research papers) are not comparable, since they are stress testing very different things.
The number of warehouses determines how much data the database needs to load, which has multiple kinds of effects on the behavior of the benchmark.
First, it determines whether the data can fit into DRAM or has to go to a storage device, which largely determines the bottleneck of your experiment. If most of your data accesses go to the storage devices, since TPC-C is dominated by random accesses, I/O is mostly likely to be the bottleneck of the experiment, unless your experiment uses a large number of I/O devices to parallelize the I/Os. The introduction of persistent memory as storage devices may change this conclusion but so far we don't have experimental results to confirm that. Some works use a hybrid mode in which data can fit into DRAM but updates need to persist to storage devices. In other words, reads go to DRAM and writes go to storage devices. In this hybrid mode, the database may log the updates rather than performing in-place updates, so that the I/O pattern becomes sequential. In this case, the I/Os may or may not be the bottleneck.
Second, the size of the data may affect the efficiency of multiple components. For example, more data will make caching less effective; more data will generate a larger index in the database, and depending on the implementation (e.g., hash or tree) of the index, a larger index may make data search slower; more data may even make hard-drive I/Os less efficient, since the hard drive needs to perform longer seeks. It's hard to enumerate all such effects and you may run expeirments with different number of warehouses to find out.
Finally, the number of warehouses affects the contention level of the experiment. In TPC-C, the contention level can be roughly defined as the number of concurrent transactions per warehouse, since only transactions touching the same warehouse may contend (for simpliciy, let's skip distributed transactions for now). The total number of transactions is usually determined by the number of worker threads at the server side or the number of customers at the client side. If we consider the total number of concurrent transactions as a constant number, the contention level is inversely proportional to the number of warehouses.
A cross-warehouse transaction will touch more than one warehouse (most likely two, but more is possible) and a local transaction will only touch one. Theremore, more cross-warehouse transactions will incur a higher level of contention, since a transaction touching multiple warehouses is more likely to contend with others than a transaction touching one warehouse. Furthermore, in a distributed database, more cross-warehouse transactions probably will incur more network I/Os to run protocols like two-phase commit (2PC).
As discussed above, TPC-C by default includes five types of transactions. Some works use two types (NewOrder and Payment) or just one type (NewOrder). There are multiple reasons for that. First, as shown in the distribution of these five types of transactions, NewOrder and Payment combined account for 87% of all transactions, so their performance can largely determine the performance of TPC-C. Second, NewOrder and Payment can trigger a high level of contention (i.e., on D_NEXT_O_ID and payment information), so they are good enough for testing concurrency control. Third, NewOrder and Payment do not include range queries and the other three types include. For example, Order-Status needs to find the newest order of a customer and Delivery needs to find the oldest order of a customer. Such difference has multiple implications: 1) Range queries usually require a tree like index to search efficiently. When range queries are not necessary, hash index might be the most efficient implementation; 2) Many research works rely on static analysis to estimate which rows a transaction is going to touch before executing the transaction. For NewOrder and Payment, such estimate can be done based on the argument values of the transaction (e.g. customer ID, item ID, etc). As one can imagine, range queries make such analysis harder. For example, how to know the oldest order while a transaction does not provide such information by itself?
Different systems vary significantly in terms of the network and disk I/Os they involve. On the one extreme, for a single-node in-memory database, if we run clients on the same machine or even in the same process of the database server, then there will be no network or disk I/Os. On the other extreme, if we have a distributed, replicated, and persistent database, and we run remote clients using interactive transactions, this case will incur many network and disk I/Os. These I/Os can affect both the throughput and latency of your system.
As a general principle, the maximal throughput of a system is determined by the slowest component in the system. Therefore, the software settings (e.g., persist data or not, replicate data or not, etc) and hardware settings (e.g., 10G Ethernet or 100G Infiniband, hard drive or SSD, etc) will interact to determine the bottleneck.
There is one exception to the above principle. If the workload is highly contended, then the maximal throughput of your hardware may not matter. Instead, the latency of your hardware may be more important. Imagine an extreme case that all transactions contend on a single row and thus only one can execute at a time. In this case, the maximal throughput of your system will be equal to 1/(transaction latency), and transaction latency will be largely affected by network and disk latencies.
First, keep in mind that the TPC-C specification only allows the tuning of the number of warehouses, so if you expect your TPC-C results to be accepted by the TPC website, don't tune other parameters.
In the following discussion, when counting the number of warehouses, I will use "large" to indicate that data does not fit into DRAM, use "medium" to indicate that data fit into DRAM, but the number of warehouses is much larger than the number of CPU cores or worker threads, and use "small" to indicate that the number of warehouses is smaller than or equal to the number of CPU cores or worker threads. The absolute value of course depends on the size of your DRAM and the memory consumption of your database implementation.