Just-in-Time Compilation (JIT)
This topic explains Just-in-Time (JIT) compilation and how to configure it in SynxDB.
What is JIT compilation
Just-in-Time (JIT) compilation transforms interpreted program evaluation into a native program at run time. For example, instead of using general-purpose code to evaluate arbitrary SQL expressions like WHERE a.col=3, JIT generates a function specific to that expression. The CPU executes this function natively, which speeds up execution. JIT compilation reduces the overhead of indirect jumps and branches common in generic interpreted code by generating native code with direct calls and constant folding.
SynxDB uses LLVM for JIT compilation. Unlike standard interpreted execution, which is always available, JIT is an optional execution mode.
The JIT workflow is designed with fault tolerance. If the JIT library fails to load on the segments (for example, if the dependency is not installed), the execution mode automatically falls back to non-JIT interpretation without interrupting the query.
User scenarios
JIT compilation primarily benefits long-running CPU-bound queries, such as analytical queries. For short queries, the overhead of JIT compilation often exceeds the time it saves.
By generating native code specific to the query and data layout, JIT removes much of the interpretation overhead. This speeds up query completion for complex workloads.
Principles of JIT compilation
The internal workflow of JIT has three stages:
Planner stage
This stage occurs in the SynxDB coordinator. The planner generates the plan tree of a query and estimates its cost.
The planner triggers JIT compilation if both of the following conditions are met:
The server configuration parameter jit is true.
The estimated query cost exceeds the value of jit_above_cost.
If jit_expressions is enabled, the planner suggests that the executor compile the expressions in JIT space. The planner makes other decisions based on costs:
If the estimated cost exceeds jit_inline_above_cost, the planner requests in-line compilation of the short functions and operators used in the query.
If the estimated cost exceeds jit_optimize_above_cost, the planner requests expensive optimizations to improve the generated code.
If jit_tuple_deforming is enabled, the planner requests a custom function to deform tuples of the target table.
When the plan is ready, the planner sends the plan trees and JIT flags to the executor.
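The inputs to these planner decisions can be inspected from a session. The following is a minimal sketch, not an official procedure: it assumes that SynxDB exposes the pg_settings view as PostgreSQL does, and it uses jit_explain_output only as a stand-in for any table of your own.

```sql
-- List the JIT-related configuration parameters and their current values.
-- Assumption: the pg_settings view is available, as in PostgreSQL.
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'jit%'
ORDER BY name;

-- Show the plan and its estimated cost without executing the query.
-- jit_explain_output is a stand-in table name.
EXPLAIN SELECT count(*) FROM jit_explain_output;
```

The second number in cost=... on the top plan node is the estimated total cost. Comparing it with jit_above_cost, jit_inline_above_cost, and jit_optimize_above_cost gives a rough idea of which JIT flags the planner is likely to set for that query.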
Executor initialization stage
This stage occurs in the SynxDB segments. SynxDB creates the expression evaluation steps. If using JIT, it rewrites the steps as functions in the JIT space. The planner decisions determine whether to trigger JIT compilation and which strategy to apply. However, SynxDB decides to use JIT at execution time only if jit is enabled and the JIT libraries load successfully. The executor ignores cached decisions if the configuration for jit or jit_expressions changes to false between the planner and execution stages, or if an error occurs.
In addition, the executor checks the developer configuration parameters for providers, bitcode dumping, profiling, and debugging support.
Executor run stage
This stage also occurs in the SynxDB segments. The segments execute the steps provided by the initialization stage. The functions in JIT space are compiled together, as one unit, before the first call.
JIT accelerated operations
Currently, the SynxDB JIT implementation accelerates expression evaluation and tuple deforming:
Expression evaluation: Evaluates WHERE clauses, target lists, aggregates, and projections. SynxDB accelerates this by generating code specific to each case.
Tuple deforming: Transforms an on-disk tuple into its in-memory representation. SynxDB accelerates this by creating a function specific to the table layout and the number of columns to extract.
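To see the effect of each accelerated operation separately, one approach is to keep JIT on and switch off one operation at a time. This sketch assumes that jit_expressions and jit_tuple_deforming can be changed at the session level, as in PostgreSQL, and uses jit_explain_output as a stand-in table name.

```sql
-- Force JIT so that even a cheap test query is compiled.
SET jit = on;
SET jit_above_cost = 0;

-- Keep JIT-compiled expression evaluation, but use the generic
-- (non-JIT) tuple deforming code.
SET jit_expressions = on;
SET jit_tuple_deforming = off;

-- Run the query and inspect the JIT section of the output.
EXPLAIN (ANALYZE, VERBOSE) SELECT count(*) FROM jit_explain_output;
```

With these settings, the Options line in the JIT section is expected to report Expressions true and Deforming false. Swapping the two settings isolates tuple deforming instead.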
In-line compilation (Inlining)
SynxDB allows defining new data types, functions, operators, and other database objects. Built-in objects use similar mechanisms. This extensibility incurs overhead, for example, due to function calls. To reduce this overhead, JIT uses in-line compilation to fit the bodies of small functions into the expressions that use them. This process optimizes away a significant percentage of the overhead. SynxDB uses pre-generated bitcode files installed with the server for built-in functions and operators to facilitate this inlining.
Optimization
LLVM supports optimizing generated code. Some optimizations are cheap enough to perform whenever JIT runs, while others benefit only longer-running queries.
How to use JIT compilation
Prerequisites
Note
To use JIT, first install the LLVM libraries on your system. SynxDB requires LLVM version 14.0.0 or lower; LLVM 14.0.0 is recommended. You can install the libraries by running the command yum install llvm-libs.
Configuration
JIT works with both GPORCA and the Postgres-based planner. Because GPORCA and the Postgres-based planner use different algorithms and calculate costs differently, tune the JIT thresholds according to your usage.
Enable JIT: Set jit to on.
Tune thresholds: Adjust jit_above_cost to determine when JIT triggers.
Check the values of these configuration parameters for both GPORCA and the Postgres-based planner, because the meaning of cost differs.
Because SynxDB with GPORCA might fall back to the Postgres-based planner for some operations, verify settings for both planners.
Setting the JIT cost parameters to 0 forces JIT compilation for all queries. This is useful for testing but slows down short queries.
Setting a JIT cost parameter to a negative value disables the feature that the parameter controls.
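The settings above might be combined as in the following sketch. The threshold values are illustrative placeholders rather than recommendations, and jit_explain_output is a stand-in table name.

```sql
-- Enable JIT and set the thresholds for this session.
-- The numeric values are illustrative only; tune them for your workload.
SET jit = on;
SET jit_above_cost = 100000;
SET jit_optimize_above_cost = 500000;

-- A negative value disables the feature that the parameter controls,
-- in this case in-line compilation.
SET jit_inline_above_cost = -1;

-- Compare the estimated cost of the same query under both planners,
-- because each planner calculates cost differently.
SET optimizer = on;   -- GPORCA
EXPLAIN SELECT count(*) FROM jit_explain_output;

SET optimizer = off;  -- Postgres-based planner
EXPLAIN SELECT count(*) FROM jit_explain_output;
```

If the two planners report very different costs for the same query, a single jit_above_cost value triggers JIT under one planner and not the other, which is why the thresholds need to be checked against both.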
Usage examples
To verify that JIT compilation is active and working correctly, or to force JIT compilation for testing purposes, you can temporarily lower the JIT cost thresholds.
Configure the session to force JIT compilation:
By setting the cost thresholds to 0, you ensure that JIT compilation is triggered even for simple queries that would normally be too fast to benefit from it.
SET jit = on;
SET jit_above_cost = 0;
SET jit_inline_above_cost = 0;
SET jit_optimize_above_cost = 0;
Run a query with EXPLAIN (ANALYZE, VERBOSE):
Use EXPLAIN (ANALYZE, VERBOSE) to execute the query and display detailed execution statistics, including JIT information. Alternatively, enable the configuration parameter gp_explain_jit.
The EXPLAIN output provides specific JIT metrics, such as:
Functions: The number of JIT functions created.
Timing: The average time spent in JIT tasks per slice.
Options: Which JIT strategies (Inlining, Optimization, Expressions, Deforming) were applied.
Example with GPORCA:
EXPLAIN (ANALYZE, VERBOSE) SELECT count(*) FROM jit_explain_output;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=0.00..431.00 rows=1 width=8) (actual time=...)
  -> Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..431.00 rows=1 width=8) (actual time=...)
        -> Partial Aggregate (cost=0.00..431.00 rows=1 width=8) (actual time=...)
              -> Seq Scan on jit_explain_output (cost=0.00..431.00 rows=...) (actual time=...)
Settings: jit = 'on', jit_above_cost = '0', jit_inline_above_cost = '0', jit_optimize_above_cost = '0'
Optimizer: GPORCA
Planning Time: 2.125 ms
JIT:
  Options: Inlining true, Optimization true, Expressions true, Deforming true.
  (slice0): Functions: 2.00. Timing: 1.137 ms total.
  (slice1): Functions: 1.00 avg x 3 workers. Timing: 0.830 ms avg.
Execution Time: 15.123 ms
(12 rows)
Example with Postgres-based planner:
SET optimizer = off;
EXPLAIN (ANALYZE, VERBOSE) SELECT count(*) FROM jit_explain_output;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=...) (actual time=...)
  -> Gather Motion 3:1 (slice1; segments: 3) (cost=...) (actual time=...)
        -> Partial Aggregate (cost=...) (actual time=...)
              -> Seq Scan on jit_explain_output (cost=...) (actual time=...)
Settings: jit = 'on', jit_above_cost = '0', jit_inline_above_cost = '0', jit_optimize_above_cost = '0', optimizer = 'off'
Optimizer: Postgres query optimizer
Planning Time: 0.158 ms
JIT:
  Options: Inlining true, Optimization true, Expressions true, Deforming true.
  (slice0): Functions: 2.00. Timing: 1.381 ms total.
  (slice1): Functions: 1.00 avg x 3 workers. Timing: 0.854 ms max.
Execution Time: 24.023 ms
(14 rows)
Note
If the JIT: section is missing from the output, or if the logs contain warnings about missing libraries (for example, could not load library "llvmjit.so"), the JIT module might not be installed or loaded correctly. In such cases, the system automatically falls back to standard interpretation to ensure the query completes successfully.
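Because thresholds of 0 slow down short queries, it is worth restoring the session after testing. A minimal sketch, assuming the parameters were changed with SET in the current session as in the examples above:

```sql
-- Restore the JIT parameters and the planner choice to their
-- configured defaults for this session.
RESET jit;
RESET jit_above_cost;
RESET jit_inline_above_cost;
RESET jit_optimize_above_cost;
RESET optimizer;

-- Confirm that the threshold is back to the configured value.
SHOW jit_above_cost;
```

RESET returns each parameter to the value it would have had if no SET had been issued in the session, so server-level or role-level settings remain in effect.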
Notes and limitations
Performance trade-off: JIT compilation incurs overhead. For short queries, this overhead might outweigh the execution speed gains. It is most effective for complex, CPU-intensive analytical queries.
No caching: Currently, JIT-compiled functions are not cached across queries. The compilation overhead applies to each query execution.
Fault tolerance: If the JIT library fails to load on the segments, execution falls back to non-JIT interpretation without interrupting the query.